TECHNIQUES IN PROTEIN CHEMISTRY
VIII
Edited by
Daniel R. Marshak
Osiris Therapeutics, Inc.
Baltimore, Maryland

ACADEMIC PRESS
San Diego New York Boston London Sydney Tokyo Toronto
This book is printed on acid-free paper. Copyright © 1997 by ACADEMIC PRESS All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.
Academic Press
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.apnet.com
United Kingdom Edition published by Academic Press Limited
24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/
Library of Congress Card Catalog Number: 94-230592
International Standard Book Number: 0-12-473557-6 (case)
International Standard Book Number: 0-12-473558-4 (comb)
PRINTED IN THE UNITED STATES OF AMERICA
97 98 99 00 01 02 EB 9 8 7 6 5 4 3 2 1
Contents
Foreword xvii
Preface xix
Acknowledgments xxi
Section I Primary Structural Analysis

Protein Sequencing Using Microreactors and Capillary Electrophoresis with Thermo-optical Absorbance Detection 3
Xing-fang Li, Hongji Ren, Ming Qi, Darren F. Lewis, Ian D. Ireland, Karen C. Waldron, and Norman J. Dovichi

Enhancement of Concentration Limits of Detection in Capillary Electrophoresis: Examples of On-Line Sample Preconcentration, Cleanup, and Microreactor Technology in Protein Characterization 15
Andy J. Tomlinson, Linda M. Benson, Norberto A. Guzman, and Stephen Naylor

Sequencing MHC Class I Peptides Using Membrane Preconcentration-Capillary Electrophoresis Tandem Mass Spectrometry (mPC-CE-MS/MS) 25
Andy J. Tomlinson, Stephen Jameson, and Stephen Naylor

Nano-electrospray Mass Spectrometry and Edman Sequencing of Peptides and Proteins Collected from Capillary Electrophoresis 37
Mark D. Bauer, Yiping Sun, and Feng Wang

Characterization of a Recombinant Hepatitis E Protein Vaccine Candidate by Mass Spectrometry and Sequencing Techniques 47
C. Patrick McAtee and Yifan Zhang

Comparison of the High Sensitivity and Standard Versions of Applied Biosystems Procise™ 494 N-Terminal Protein Sequencers Using Various Sequencing Supports 57
Anita E. Lavin, Lee Anne Merewether, Christi L. Clogston, and Michael F. Rohde

Evaluation of ABRF-96SEQ: A Sequence Assignment Exercise 69
Joseph Fernandez, Arie Admon, Karen De Jongh, Greg Grant, William Henzel, William S. Lane, Kathryn L. Stone, and Barbara Merrill

Internal Protein Sequencing of SDS-PAGE-Separated Proteins: Optimization of an In Gel Digest Protocol 79
Ken Williams, Mary LoPresti, and Kathy Stone

A Strategy to Obtain Internal Sequence Information from Blotted Proteins after Initial N-terminal Sequencing 91
Kuo-Liang Hsi, William E. Werner, Lynn R. Zieske, Chris H. Grimley, Steven A. O'Neill, Michael L. Kochersperger, Kent Yamada, and Pau-Miau Yuan

Internal Protein Sequencing of SDS PAGE-Separated Proteins: A Collaborative ABRF Study 99
Ken Williams, Ulf Hellman, Ryuji Kobayashi, William Lane, Sheenah Mische, and David Speicher
Section II Physical and Chemical Analysis

Chromatographic Determination of Extinction Coefficients of Non-Glycosylated Proteins Using Refractive Index (RI) and UV Absorbance (UV) Detectors: Applications for Studying Protein Interactions by Size Exclusion Chromatography with Light-Scattering, UV, and RI Detectors 113
Jie Wen, Tsutomu Arakawa, Jette Wypych, Keith E. Langley, Meredith G. Schwartz, and John S. Philo

Single Alkaline Phosphatase Molecule Assay by Capillary Electrophoresis Laser-Induced Fluorescence Detection 121
Douglas B. Craig, Edgar A. Arriaga, Jerome C. Y. Wong, Hui Lu, and Norman J. Dovichi

A New Centrifugal Device Used in Sample Clean-up and Concentration of Peptides 133
Donald G. Sheer, Elizabeth Kellard, William Kopaciewicz, Patrick Gearing, Jeff Wong, and Michael Klein

Sample Preparation Using Synthetic Membranes for the Study of Biopolymers by Matrix Assisted Laser Desorption/Ionization Mass Spectrometry 143
T. A. Worrall, J. A. Porter, R. J. Cotter, and A. S. Woods

Use of LC/MS Peptide Mapping for Characterization of Isoforms in 15N-Labeled Recombinant Human Leptin 155
Jennifer L. Liu, Tamer Eris, Scott L. Lauren, George W. Stearns, Keith R. Westcott, and Hsieng Lu

Hyphenated HPLC Methodology for the Resolution and Elucidation of Peptides from Proteolytic Digests 165
Randall T. Bishop, Vincent E. Turula, James A. de Haseth, and Robert D. Richer

Detecting and Identifying Active Compounds from a Combinatorial Library Using IAsys and Electrospray Mass Spectrometry 177
Bolong Cao, Jan Urban, Tomas Vaisar, Richard Y. W. Shen, and Michael Kahn

Amino Acid Analysis of Unusual and Complex Samples Based on 6-Aminoquinolyl-N-hydroxysuccinimidyl Carbamate Derivatization 185
Steven A. Cohen and Charlie van Wandelen

Development of a Method for Analysis of Free Amino Acids from Physiological Samples Using a 420A ABI/PE Amino Acid Analyzer 197
Klaus D. Linse, Sandie Smith, and Michelle Gadush

Quantitation and Identification of Proteins by Amino Acid Analysis: ABRF-96AAA Collaborative Trial 207
K. M. Schegg, N. D. Denslow, T. T. Andersen, Y. Bao, S. A. Cohen, A. M. Mahrenholz, and K. Mann
Section III Chemical Modification

Nonaqueous Chemical Modification of Lyophilized Proteins 219
Harvey Kaplan and Alpay Taralp

Reaction of HIV-1 NC p7 Zinc Fingers with Electrophilic Reagents 231
E. Chertova, B. P. Kane, L. V. Coren, D. G. Johnson, R. C. Sowder II, P. Nower, J. R. Casas-Finet, L. O. Arthur, and L. E. Henderson

The Identification and Isolation of Reactive Thiols in Ricin A-Chain and Blocked Ricin Using 2-(4'-Maleimidylanilino)naphthalene-6-sulfonic Acid 245
Mary E. Denton, Rita M. Steeves, and John M. Lambert

Inactivation of the Human Cytomegalovirus Protease by Diisopropylfluorophosphate 257
Thomas Hesson, Anthony Tsarbopoulos, S. Shane Taremi, Winifred W. Prosise, Nancy Butkiewicz, Bimalendu DasMahapatra, Michael Cable, Hung Van Le, and Patricia C. Weber

Studies on the Status of Arginine Residues in Phospholipase A2 from Naja naja atra (Taiwan cobra) Snake Venom 267
C. C. Yang, T. S. Yuo, and C. K. Chen

Selective Reduction of the Intermolecular Disulfide Bridge in Human Glial Cell Line-Derived Neurotrophic Factor Using Tris-(2-Carboxyethyl)Phosphine 277
John O. Hui, John Le, Viswanatham Katta, Michael F. Rohde, and Mitsuru Haniu

Effects of Surface Hydrophobicity on the Structural Properties of Insulin 289
Mark L. Brader, Rohn L. Millican, David N. Brems, Henry A. Havel, Aidas Kriauciunas, and Victor J. Chen

The Effects of in Vitro Methionine Oxidation on the Bioactivity and Structure of Human Keratinocyte Growth Factor 299
Christopher S. Spahr, Linda O. Narhi, James Speakman, Hsieng S. Lu, and Yueh-Rong Hsu
Section IV Posttranslational and Other Modifications

Effects of Enzyme Glycosylation on the Chemical Step of Catalysis, as Probed by Hydrogen Tunneling and Enthalpy of Activation 311
Amnon Kohen, Thorlakur Jonsson, and Judith P. Klinman

Profile Analysis of Oligosaccharides from Glycoproteins by PMP Labeling. Comparison of Chemical and Enzymatic Release Methods Using RP-HPLC and Mass Spectrometry 321
Hanspeter Michel, Yuemei Ma, Barbara DeBarbieri, and Yu-Ching E. Pan

Positive Identification of Glycosylation Sites in Proteins and Peptides Using a Modified Beckman LF 3600 N-Terminal Protein Sequencer 331
Xiaomei Lin, L. Wulf Carson, Saber M. A. Khan, Clark F. Ford, and Kristine M. Swiderek

Deamidation and Isoaspartate Formation during in Vitro Aging of a Recombinant Hepatitis E Vaccine Candidate 341
C. Patrick McAtee and Yifan Zhang

The Isolation and Characterization of Active Site Peptides in Lysyl Oxidase 351
Sophie X. Wang, Judith P. Klinman, Katalin F. Medzihradszky, and Alma L. Burlingame

Complement Activation in EDTA Blood/Plasma Samples May Be Caused by Coagulation Proteases 363
Philippe H. Pfeifer, Tony E. Hugli, Earl W. Davie, and Kazuo Fujikawa

Disulfide-Linked Human Stem Cell Factor Dimer: Method of Identification and Molecular Comparison to the Noncovalent Dimer 371
Hsieng S. Lu, Michael D. Jones, and Keith E. Langley

Autocatalytic Reduction of a Humanized Antibody 385
A. Ashok Kumar, John Kimura, and Jennifer Running Deer
Section V Interactions of Protein with Ligands

Oxygen and Ascorbate Mediated Modification of a Recombinant Hemoglobin 399
Bruce A. Kerwin, Edward Hess, Julie Lippincott, Ray Kaiser, and Izydor Apostol

Metal Activation and Regulation of E. coli RNase H 409
James L. Keck and Susan Marqusee

Crystal Structure of Avian Sarcoma Virus Integrase with Bound Essential Cations 417
Jerry Alexandratos, Grzegorz Bujacz, Mariusz Jaskolski, Alexander Wlodawer, George Merkel, Richard A. Katz, and Anna Maria Skalka

Multidimensional NMR Studies of an Exchangeable Apolipoprotein and Its Interactions with Lipids 427
Jianjun Wang, Daisy Sahoo, Dean Schieve, Stephane M. Gagne, Brian D. Sykes, and Robert O. Ryan

NMR Methods for Analysis of CRALBP Retinoid Binding 439
Linda A. Luck, Ronald A. Venters, James T. Kapron, Karen E. Roth, Seth A. Barrows, Sara G. Paradis, and John W. Crabb

A Novel Method for Measuring the Binding Properties of the Site-Directed Mutants of the Proteins That Bind Hydrophobic Ligands: Application to Cellular Retinoic Acid Binding Proteins 449
Honggao Yan, Lincong Wang, and Yue Li

A Strategy for Predicting the Ligand Binding Competence of Recombinant Orphan Nuclear Receptors Using Biophysical Characterization 457
Derril Willard, Bruce Wisely, Derek Parks, Martin Rink, William Holmes, Michael Milburn, and Thomas Consler
Section VI Protein-Protein Interactions

Detection of Intra-Cellular Protein-Protein Interactions: Penicillin Interactive Proteins and Morphogene Proteins 469
S. Bhardwaj and R. A. Day

Use of Synthetic Peptides in Mapping the Binding Sites for hsp70 in a Mitochondrial Protein 481
Antonio Artigues, Ana Iriarte, and Marino Martinez-Carrion

Interfacing Biomolecular Interaction Analysis with Mass Spectrometry and the Use of Bioreactive Mass Spectrometer Probe Tips in Protein Characterization 493
Randall W. Nelson, Jennifer R. Krone, David Dogruel, Kemmons Tubbs, Russ Granzow, and Osten Jansson

Transition-State Theory and Secondary Forces in Antigen-Antibody Complexes 505
Mark E. Mummert and Edward W. Voss, Jr.

Thermodynamic Investigation of Enzyme and Inhibitor Interactions with High Affinity 513
Yudu Cheng, Jacek Slon-Usakiewicz, Jing Wang, Enrico O. Purisima, and Yasuo Konishi

Development and Characterization of a Fab Fragment as a Surrogate for the IL-1 Receptor 523
Y. Cong, A. S. McColl, T. R. Hynes, R. C. Meckel, P. S. Mezes, C. L. Lane, S. E. Lee, D. J. Wasilko, K. F. Geoghegan, I. G. Otterness, and G. O. Daumy
Section VII Macromolecular Assemblies

Topology of Membrane Proteins in Native Membranes Using Matrix-Assisted Laser Desorption Ionization/Mass Spectrometry 533
Kamala Tyagarajan, John G. Forte, and R. Reid Townsend

Role of D-Ser46 in the P-type Calcium Channel Blocker, ω-Agatoxin-TK 543
Tomohiro Watanabe, Manabu Kuwada, Kumiko Y. Kumagaye, Kiichiro Nakajima, Yukio Nishizawa, and Naoki Asakawa

Involvement of Basic Amphiphilic α-Helical Domain in the Reversible Membrane Interaction of Amphitropic Proteins: Structural Studies by Mass Spectrometry, Circular Dichroism, and Nuclear Magnetic Resonance 555
Nobuhiro Hayashi, Mamoru Matsubara, Koiti Titani, and Hisaaki Taniguchi

One-Dimensional Diffusion of a Protein along a Single-Stranded Nucleic Acid 565
Bradley R. Kelemen and Ronald T. Raines

Metal-Dependent Structure and Self Association of the RAG1 Zinc-Binding Domain 573
Karla K. Rodgers and Karen G. Fleming

Localizing Flexibility within the Target Site of DNA-Bending Proteins 585
Anne Grove and E. Peter Geiduschek

Assembly of the Multifunctional EcoKI DNA Restriction Enzyme in Vitro 593
David T. F. Dryden, Laurie P. Cooper, and Noreen E. Murray
Section VIII Three Dimensional Structure

Strategies for NMR Assignment and Global Fold Determinations Using Perdeuterated Proteins 605
Ronald A. Venters, Hai M. Vu, Robert M. de Lorimier, and Leonard D. Spicer

1H-NMR Evidence for Two Buried ASN Side-Chains in the c-MYC-MAX Heterodimeric α-Helical Coiled-Coil 617
Pierre Lavigne, Matthew P. Crump, Stephane M. Gagne, Brian D. Sykes, Robert S. Hodges, and Cyril M. Kay

NMR Confirms the Presence of the Amino-Terminal Helix of Group II Phospholipase A2 in Solution 625
Roman Jerala, Paulo F. F. Almeida, Rodney L. Biltonen, and Gordon S. Rule

The Crystallographic Analysis of Glycosylation-Inhibiting Factor 633
Yoichi Kato, Takanori Muto, Hiroshi Watarai, Takafumi Tomura, Toshifumi Mikayama, and Ryota Kuroki

Structure of the D30N Active Site Mutant of FIV Proteinase Complexed with a Statine-Based Inhibitor 643
Celine Schalk-Hihi, Jacek Lubkowski, Alexander Zdanov, Alexander Wlodawer, Alla Gustchina, Gary S. Laco, and John H. Elder

A Homology-Based Model of Juvenile Hormone Esterase from the Crop Pest, Heliothis virescens 655
Beth Ann Thomas, W. Bret Church, and Bruce D. Hammock

Analysis of Linkers of Regular Secondary Structures in Proteins 667
V. Geetha and Peter J. Munson

Structural and Functional Roles of Tyrosine-50 of Yeast Guanylate Kinase 679
Yanling Zhang, Yue Li, and Honggao Yan
Section IX Dynamics and Folding

Flexibility of Serine Protease in Nonaqueous Solvent 693
Samuel Toba, David S. Hartsough, and Kenneth M. Merz, Jr.

Higher-Order Structure and Dynamics of FK506-Binding Protein Probed by Backbone Amide Hydrogen/Deuterium Exchange and Electrospray Ionization Fourier Transform Ion Cyclotron Resonance Mass Spectrometry 703
Zhongqi Zhang, Weiqun Li, Ming Li, Timothy M. Logan, Shenheng Guan, and Alan G. Marshall

Internal Dynamics of Human Ubiquitin Revealed by 13C-Relaxation Studies of Randomly Fractionally Labeled Protein 715
A. Joshua Wand, Jeffrey L. Urbauer, Robert P. McEvoy, and Ramona J. Bieber

Detection of Protein Unfolding and Fluctuations by Native State Hydrogen Exchange 727
Aaron K. Chamberlain, Tracy M. Handel, and Susan Marqusee

Laser Temperature Jump for the Study of Early Events in Protein Folding 735
Peggy A. Thompson

Biophysical and Structural Analysis of Human Acidic Fibroblast Growth Factor 745
Michael Blaber, Daniel H. Adamek, Aleksandar Popovic, and Sachiko I. Blaber

A Thermodynamic Analysis Discriminating Loop Backbone Conformations 755
Jean-Luc Pellequer and Shu-wen W. Chen

The Equilibrium Ensemble of Conformational States in Staphylococcal Nuclease 767
Vincent J. Hilser and Ernesto Freire

An Evaluation of Protein Secondary Structure Prediction Algorithms 783
Georgios Pappas, Jr., and Shankar Subramaniam
Section X Biological and Chemical Design

Designing Water Soluble β-Sheet Peptides with Compact Structure 797
Elena Ilyina, Vikram Roongta, and Kevin H. Mayo

Engineering Secondary Structure to Invert Coenzyme Specificity in Isopropylmalate Dehydrogenase 809
Ridong Chen, Ann F. Greer, Antony M. Dean, and James H. Hurley

A Method for Determining Domain Binding Sites in Proteins with Swapped Domains: Implications for βA3- and βB2-Crystallins 817
Yuri V. Sergeev and J. Fielding Hejtmancik

Complete Mutagenesis of the Gene Encoding TEM-1 β-Lactamase 827
Timothy Palzkill, Wanzhi Huang, and Joseph Petrosino

Characterization of Truncated Kirsten-Ras Purified from Baculovirus Infected Insect Cells Indicates Heterogeneity due to N-terminal Processing and Nucleotide Dissociation 837
Lisa M. Churgay, Nancy B. Rankl, John M. Richardson, Gerald W. Becker, and John E. Hale

Isolation and Characterization of Multiple-Methionine Mutants of T4 Lysozyme with Simplified Cores 851
Nadine C. Gassner, Walter A. Baase, Joel D. Lindstrom, Brian K. Shoichet, and Brian W. Matthews

Synthesis of Alzheimer's (1-42) Aβ-Amyloid Peptide with Preformed Fmoc-Aminoacyl Fluorides 865
Saskia C. F. Milton, R. C. de Lisle Milton, Steven A. Kates, and Charles Glabe

Analysis of Racemization during "Standard" Solid Phase Peptide Synthesis: A Multicenter Study 875
Ruth Hogue Angeletti, Lisa Bibbs, Lynda F. Bonewald, Gregg B. Fields, Jeffery W. Kelly, John S. McMurray, William T. Moore, and Susan T. Weintraub

Index 891
Foreword
Once again it is a great pleasure to thank Dan Marshak on behalf of the Protein Society for editing Techniques in Protein Chemistry. The volumes in this series provide "bench-top" references that will be of ongoing value to practicing protein scientists. This volume continues this outstanding tradition. Following an organizational strategy that was introduced last year, the articles have been arranged by concepts rather than by methodology. It is hoped that this format will serve to alert the reader to alternative approaches that may be available to address a given biological or biochemical problem. This compilation of articles has been selected from presentations at the Tenth Symposium of the Protein Society held in San Jose, August 3-7, 1996. I would like to join Dan in thanking the Associate Editors, Phil Andrews, Gerry Carlson, Steve Carr, Xiaodong Cheng, Lowell Ericsson, Sheenah Mische, Nick Pace, Len Spicer, and Ken Williams, as well as the former volume editors, Joe Villafranca, Tony Hugli, John Crabb, and Ruth Angeletti, for their help. This is the second volume edited by Dan Marshak. I am pleased to announce that Gerry Carlson has kindly agreed to take over this task for the next two years.

Brian W. Matthews
President
The Protein Society
Preface
Techniques in Protein Chemistry VIII is the latest volume in this successful series describing the most up-to-date methodologies for studying proteins. The contributions were selected from presentations at the Tenth Symposium of the Protein Society held in San Jose, California, in August 1996. The structure of this year's edition continues the new format of last year's volume. The ten sections of the book are organized by subject area to show the reader the techniques that are currently applied to particular problems in protein science. This reflects a current trend in the field, in which specific instruments and methodologies are used in several different arenas. For example, mass spectrometry is now used in protein sequencing, analysis of posttranslational modifications, analysis of chemical modifications, protein engineering, and studies of higher order protein structure. Even methods such as crystallography and nuclear magnetic resonance are used in determining protein-ligand interactions, protein-protein interactions, and macromolecular assemblies in addition to traditional three-dimensional protein structural analysis. I hope this format will be useful to a readership that is rapidly expanding its horizons concerning the application of various techniques to questions in protein science. The credit for reviewing the manuscripts is due to the associate editors: Phil Andrews, Gerry Carlson, Steve Carr, Xiaodong Cheng, Lowell Ericsson, Sheenah Mische, Nick Pace, Len Spicer, and Ken Williams. Their expertise in specific areas of protein science was the key to selecting contributions from the many excellent presentations. I have had the benefit of counsel from John Crabb and Ruth Angeletti, and look forward to next year's volume, which will be edited by Gerry Carlson. Finally, I thank my secretary, Debra Rizzieri, for her assistance. Protein science has become a fountainhead of new discoveries that fuel the engines of biology. The expansion of techniques that can be applied to proteins has allowed the creation of a vast set of tools for the practitioner. This volume is a celebration of the investigators who invent and apply new methods.

Daniel R. Marshak
Osiris Therapeutics, Inc. and The Johns Hopkins University School of Medicine
Acknowledgments
The Protein Society acknowledges with thanks the following organizations which, through their support of the Society's program goals, contributed in a meaningful way to the tenth annual symposium and thus to this volume.

Aviv Instruments, Inc.
Beckman Instruments, Inc.
BioMolecular Technologies, Inc.
BIOSYM/Molecular Simulations
Bristol-Myers Squibb
Finnigan MAT
Fisons Instruments
Hewlett-Packard Company
IntelliGenetics, Inc.
JASCO, Inc.
Kirin Brewery Co., Ltd.
Michrom BioResources, Inc.
Molecular Simulations, Inc.
Perkin-Elmer Corporation, Applied Biosystems Division
PerSeptive Biosystems, Inc.
Pharmacia Biosensor
Pharmacia Biotech, Inc.
Rainin Instrument Co., Inc.
Schering-Plough Research Institute
Shimadzu Scientific Instruments, Inc.
Supelco, Inc.
VYDAC
Waters Corporation
Wyatt Technology Corporation
ZymoGenetics
SECTION I Primary Structural Analysis
Protein Sequencing Using Microreactors and Capillary Electrophoresis with Thermo-optical Absorbance Detection

Xing-fang Li, Hongji Ren, Ming Qi, Darren F. Lewis, Ian D. Ireland, Karen C. Waldron, and Norman J. Dovichi
Department of Chemistry, University of Alberta, Edmonton, Alberta, Canada T6G 2G2
Abstract
A miniaturized protein and peptide microsequencer consisting of either a fused silica capillary reactor or a microreactor made of Teflon is described. The performance of the miniaturized sequencer was evaluated by sequencing 33 and 27 picomoles of myoglobin covalently attached to Sequelon-DITC. The products generated by the sequencer were analyzed using capillary electrophoresis with thermo-optical absorbance detection. This CE system provides reproducible migration times (RSD < 0.4%) and detection limits of less than 4 fmol.
I. Introduction
The primary amino acid sequence of a polypeptide is routinely determined using commercially available gas-liquid-phase sequencers [1, 2] and solid-phase sequencers [3, 4] based on the Edman degradation chemistry [5]. These instruments can routinely obtain the primary amino acid sequence from 10 to 100 pmol of polypeptide. However, the need for higher sequencing sensitivity remains; as Kent et al. [6] have pointed out, rare proteins may be present only at the 30-300 fmol level on 2D-polyacrylamide gels.

Although tandem MS has demonstrated rapid and sensitive determination of primary sequence without use of Edman degradation chemistry [7-9], this technique is typically limited to peptides smaller than 15 residues; larger peptides generate very complex mass spectra that are difficult to interpret. The Edman chemistry-based sequencing techniques are still essential to biological studies. Researchers have improved the sequencing sensitivity of the latter techniques by miniaturizing and modifying the sequencer components, from separation column to reaction cartridge. Reducing the HPLC column inner diameter from 4 to 2 mm improved sensitivity fourfold. The continuous flow reactor (CFR) described by Shively's group [10], consisting of concentric Teflon tubes, gave high-sensitivity sequence analysis of 5 pmol of protein adsorbed on polyvinylidene difluoride (PVDF) with polybrene. The Hewlett-Packard biphasic reaction column sequencer gave similar sequencing sensitivity [11]. However, these authors [10, 11] pointed out that routine use at this level was difficult. Regardless of configuration, miniaturizing the reaction cartridge volume permits the use of less reagent and thereby reduces the level of non-specific reactions that give rise to background noise in the chromatographic identification of sequencing products. The current technologies for protein sequencing are limited by the UV detection of PTH-amino acids [2]. As a result, we have developed an alternative technology for the separation and determination of minute amounts of PTH-amino acids: micellar electrokinetic capillary chromatography (MECC) with thermo-optical absorbance detection (TOAD) [12, 13]. This technology has been routinely used in our laboratory for over two years to identify the PTH-amino acids resulting from manual and semi-automated Edman degradation reactions.
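The fourfold figure quoted above for the change from a 4-mm to a 2-mm column is what one would expect if peak concentration at the detector scales inversely with the column cross-sectional area. The sketch below works through that arithmetic. It assumes the same injected mass, column length, and linear velocity, and a concentration-sensitive detector; the function name `sensitivity_gain` is illustrative and does not come from the source.

```python
def sensitivity_gain(d_old_mm: float, d_new_mm: float) -> float:
    """Approximate gain in peak concentration when the column i.d. is reduced.

    Assumption (not from the source): for a fixed injected mass, the peak
    concentration is inversely proportional to the column cross-section,
    so the gain is the ratio of the squared diameters.
    """
    if d_old_mm <= 0 or d_new_mm <= 0:
        raise ValueError("diameters must be positive")
    return (d_old_mm / d_new_mm) ** 2


# Going from a 4-mm to a 2-mm column under these assumptions.
print(sensitivity_gain(4.0, 2.0))
```

The same scaling is one reason capillary formats, with inner diameters of tens of micrometers, are attractive for mass-limited samples.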
In the first part of this report, we present the reproducibility of migration times for PTH-amino acids to demonstrate the reliability of this technique. Unfortunately, the MECC-TOAD system cannot be coupled directly to commercially available protein sequencers because of a volume incompatibility. Less than 10 nL of the sample solution is typically injected into the CE capillary to preserve the high separation efficiency, whereas up to 100 µL of solution is collected from existing commercial sequencers. Miniaturization of the sequencer is therefore essential to overcome this volume mismatch and to take advantage of the sensitive, fast, and efficient determination of PTHs using MECC-TOAD. Recently, we reported the design of a miniaturized sequencer consisting of a capillary-sized reaction chamber and a multiport valve delivery system to address the problem of volume compatibility with CE [14]. However, that sequencer could not reproducibly deliver sub-microliter volumes of reagent because of microliter dead volumes in the multiport valves. Gas-liquid phase Edman degradation of picomole levels of proteins adsorbed on Polybrene-coated silica beads or PVDF membranes was not successful with this sequencer. Adequate sequencing results were obtained only when proteins were covalently bound to solid supports, but the background peaks were large. Reducing the amounts of reagent used for Edman degradation reduced the background peaks. Unfortunately, the multiport valve and argon-pressurized delivery system described by Waldron et al. [14] precluded using less than 4 µL of reagent or solvent. Therefore, to reduce the reagent volumes further, we redesigned the miniaturized sequencer to eliminate the valves and the argon-pressurized delivery system. In this paper, we describe the design of a miniaturized sequencer in which syringe pumps are directly coupled via capillary tubing to the reaction chamber to deliver reagents and solvents for covalent polypeptide sequencing. This system is able to deliver less than 2 µL of each reagent. As a result, the background peaks are significantly reduced. Preliminary results for this microsequencer are presented.
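The volume mismatch described in the introduction can be made concrete with a short calculation of how much of a collected sample actually enters the capillary. This is a rough sketch only. The 10 nL injection is the upper bound quoted in the text, the 100 µL and 1 µL sample volumes are the commercial and miniaturized cases mentioned in this chapter, and the 10 pmol of PTH-amino acid is a hypothetical amount chosen for illustration.

```python
def fraction_injected(injected_nl: float, sample_ul: float) -> float:
    """Fraction of the collected sample that is injected into the capillary."""
    return injected_nl / (sample_ul * 1000.0)  # 1 µL = 1000 nL


def amount_injected_fmol(sample_pmol: float, injected_nl: float, sample_ul: float) -> float:
    """Amount of analyte injected, in fmol, for a sample holding sample_pmol."""
    return sample_pmol * 1000.0 * fraction_injected(injected_nl, sample_ul)


# Hypothetical 10 pmol of a PTH-amino acid, 10 nL injected (upper bound from the text).
for sample_ul in (100.0, 1.0):
    print(sample_ul, "µL sample:",
          amount_injected_fmol(10.0, 10.0, sample_ul), "fmol injected")
```

Under these assumptions, shrinking the collected volume from 100 µL to 1 µL raises the injected amount by a factor of 100 for the same cycle yield, which is the motivation for miniaturizing the sequencer rather than the detector.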
II. Experimental
A. Routine Analysis of PTH-Amino Acids Using Micellar Electrokinetic Capillary Chromatography with Thermo-optical Absorbance Detection (MECC-TOAD)
The instrument and conditions for the determination of PTH-amino acids by MECC-TOAD were reported in detail elsewhere [12, 13, 15]. The RSD values of the migration times were calculated after applying a two-marker correction method that will be reported in a separate paper [16]. The two markers used were DMPTU and DPU. The detection limits for the 19 PTH-amino acids, DMPTU, DPTU, and DPU were calculated as three times the standard deviation of the background signal. PTH-Y was used as the internal standard to calculate the sequencing yields.
B. Miniaturization of the Protein Sequencer
The design and construction of the fused silica capillary-based reaction chamber were described in detail elsewhere [15]. Another reaction chamber was constructed from a block of Teflon. The configuration of the Teflon reactor is shown in Figure 1. The central channel (0.762 mm i.d.) held the membrane-bound protein samples. The top and bottom of this channel were connected to the Ar source with 0.762-mm-o.d., 0.305-mm-i.d. Teflon tubing (Cole-Parmer). The other five small channels all had a diameter of 0.367 mm. These small channels were connected to syringe pumps by 367-µm-o.d., 100-µm-i.d. Teflon-coated fused silica capillaries (Polymicro). All connections were made snug. The reagents were delivered through the small channels directly to the central channel.
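The three-sigma criterion used for the detection limits in Section II.A can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a linear detector response through the origin, uses peak height as the signal, and takes the sample standard deviation of a peak-free baseline segment as the noise. All names and numbers are hypothetical.

```python
import statistics


def detection_limit(baseline, peak_height, concentration):
    """Concentration giving a signal of three times the baseline noise.

    baseline      -- detector readings from a peak-free region
    peak_height   -- height of the analyte peak above the baseline
    concentration -- analyte concentration that produced peak_height
    Assumes the response is linear and passes through the origin.
    """
    noise_sd = statistics.stdev(baseline)        # sample standard deviation
    sensitivity = peak_height / concentration    # signal per unit concentration
    return 3.0 * noise_sd / sensitivity


# Hypothetical readings (arbitrary detector units) and a 5 µM standard.
baseline = [0.0, 0.2, -0.2, 0.1, -0.1]
print(detection_limit(baseline, peak_height=10.0, concentration=5.0), "µM")
```

Multiplying the resulting concentration by the injected volume converts it to a mass detection limit, which is how concentration and mass limits of the kind reported in Table I relate to each other.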
Figure 1. Schematic of the Teflon microreactor. A, Teflon block; B, Teflon tubing (0.762 mm o.d., 0.305 mm i.d.); C, Teflon-coated fused silica capillary (367 µm o.d., 100 µm i.d.); D, protein-bound membrane; E, sample vial for collecting the products. Inlet labels in the schematic include WSH1, PITC, and WSH2.
C. Automated Protein Sequencing
The sequencing conditions used in this study were similar to those reported previously [15], with a few changes. The syringe pumps for reagent delivery, the order of delivery, the amounts of reagents, the reaction times, and the Ar drying steps were all controlled automatically by a Macintosh computer running a program developed in our laboratory with LabVIEW development software (National Instruments). Conversion of the ATZ-amino acids (extracted with TFA) to the PTH form was carried out off-line. After cleavage, the extract from each Edman degradation cycle was collected into a 200-µL vial, to which 25 µL of 25% aqueous TFA solution was added and mixed. The solution was heated at 67 °C for 10 min and then dried in a vacuum centrifuge. The residue in the vial was dissolved in 1 µL of internal standard (5.8 x 10-^ M PTH-tyrosine (PTH-Y) in 10% acetonitrile/90% water) and then analyzed by MECC-TOAD for identification of the PTH-amino acid.
III. Results and Discussion
A. Evaluation of MECC-TOAD as a Routine PTH-amino Acid Analyzer
Standard solutions containing the 19 PTH-amino acids, DMPTU, DPTU, and DPU, each at a concentration of 2.5 x 10"^ M, were analyzed routinely under the following conditions: 15 s hydrodynamic injection at a 4 cm height difference; a 40-cm-long, 50-µm-i.d., 185-µm-o.d. fused silica capillary preconditioned by gravity flow of the running buffer for over 24 h; 9 kV running voltage; and a running buffer composed of 10.7 mM sodium phosphate, 1.8 mM sodium borate, and 25 mM SDS. After the electropherograms were obtained, the migration times of the analytes were corrected using DMPTU and DPU as markers. RSD values of the migration times for the 22 analytes were calculated from ten electropherograms. When the ten electropherograms were obtained on the same day, the RSD values of the corrected migration times were below 0.4% for all 22 analytes. Even when the ten electropherograms were obtained over a period of three months, the RSD values were still below 0.6%, except for PTH-H and PTH-R, at 1% and 1.2%, respectively. These results demonstrate that PTH-amino acid residues resulting from Edman degradation can be reliably identified using MECC-TOAD. MECC-TOAD also provides high sensitivity. Typical performance of the instrument under the conditions described above is shown in Figure 2. The detection limits calculated from Figure 2 range from 0.5 to 1.7 µM, equivalent to the 1.4 to 4.6 fmol listed in Table I. In contrast, HPLC-UV analyzers had a mass detection limit of about 1 pmol and a concentration detection limit of 2 µM for an injection volume of 50 µL [24]. Unfortunately, the volume mismatch between MECC-TOAD and available sequencers has limited the use of this reproducible and highly sensitive technology. Therefore, miniaturization of the protein sequencer is essential.
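The reproducibility figures above rest on two steps: correcting each migration time against the two markers, then computing the relative standard deviation across runs. The chapter defers the actual correction method to a separate paper [16], so the sketch below substitutes a plain linear rescaling that maps the two marker times in each run onto fixed reference times. That choice is an assumption made for illustration, as are the function names and the run data.

```python
import statistics


def two_marker_correct(t, m1, m2, ref1, ref2):
    """Linearly rescale migration time t so that the markers observed at
    m1 and m2 land on the reference times ref1 and ref2.
    (Assumed form; the source does not specify its correction.)"""
    scale = (ref2 - ref1) / (m2 - m1)
    return ref1 + (t - m1) * scale


def rsd_percent(values):
    """Relative standard deviation in percent (sample SD over the mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical runs: (first marker, second marker, analyte) times in minutes.
runs = [(4.00, 12.00, 8.00), (4.10, 12.30, 8.25), (3.95, 11.85, 7.88)]
REF_FIRST, REF_SECOND = 4.00, 12.00  # hypothetical reference marker times

raw = [t for _, _, t in runs]
corrected = [two_marker_correct(t, m1, m2, REF_FIRST, REF_SECOND)
             for m1, m2, t in runs]
print("raw RSD %:", rsd_percent(raw))
print("corrected RSD %:", rsd_percent(corrected))
```

A two-point rescaling of this kind removes run-to-run drift that shifts or stretches the whole electropherogram, which is the type of variation a pair of bracketing markers can compensate for.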
Figure 2. Electropherogram of the PTH-amino acids (5 x 10-^ M) for calculation of the detection limits (conditions described in the text). x-axis: migration time (min).
Table I. Detection Limits (DL) of MECC-TOAD for Determination of PTH-Amino Acids

PTH-amino acids      Mass DL (fmol)   Concentration DL (µM)
W, K                 1.4              0.5
N                    1.8              0.7
L                    2.1              0.8
G                    2.5              0.9
H                    2.6              1
Q, A, P, V, M, F     2.7              1
Y                    3                1
E, R                 3.2              1.2
D, S                 4.0              4.6
I                    4.6              1.7
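The mass and concentration detection limits in Table I are linked by the injected volume, since mass equals concentration times volume. The sketch below back-calculates the injection volume implied by each row as a consistency check against the "less than 10 nL" stated in the introduction. The row values are read from Table I. The D, S row is left out because its concentration value as extracted (4.6 µM) lies outside the 0.5 to 1.7 µM range quoted in the text and may be a transcription artifact. The function name is illustrative.

```python
# (residues, mass DL in fmol, concentration DL in µM), read from Table I.
TABLE_I = [
    ("W, K", 1.4, 0.5),
    ("N", 1.8, 0.7),
    ("L", 2.1, 0.8),
    ("G", 2.5, 0.9),
    ("H", 2.6, 1.0),
    ("Q, A, P, V, M, F", 2.7, 1.0),
    ("Y", 3.0, 1.0),
    ("E, R", 3.2, 1.2),
    ("I", 4.6, 1.7),
]


def implied_injection_volume_nl(mass_dl_fmol, conc_dl_um):
    """Volume = mass / concentration.
    fmol / µM = 1e-15 mol / (1e-6 mol/L) = 1e-9 L, i.e. the result is in nL."""
    return mass_dl_fmol / conc_dl_um


volumes = {res: implied_injection_volume_nl(m, c) for res, m, c in TABLE_I}
for res, v in volumes.items():
    print(f"{res:18s} {v:.2f} nL")
```

Every row shown implies an injection of a few nanoliters, which agrees with the injection volume range given in the introduction.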
Protein Sequencing Using Microreactors
9
B. Protein Sequencing using the miniaturized sequencer The syringe pump-based capillary sequencer has been used for protein sequencing for over half-a-year in our laboratory. The typical performance of the sequencer is demonstrated by the sequencing results obtained from 33 pmol of Sequelon-DITC-myoglobin. Because the free amino group was covalently bound to the DITC-membrane, the residue from the first cycle was not expected to be detected reliably, therefore, it was not analyzed. The pseudo-initial yield from the second cycle was 76%, and the repetitive yield was 87%. The electropherograms of the products from the Edman degradation cycles are shown in Figure 3. Figure 3 demonstrates that the MECC-TOAD provides baseUne separation of all components generated from the sequencing reactions. Positive identification of the PTH residues resulting from the degradation cycles were easily made by comparing the migration times of the residues and the standards. Performance of the Teflon microreactor is demonstrated by sequencing 27 pmol of the same protein sample using similar conditions to those used in the above experiments. Twelve cycles were performed, the first seven cycles were done in the same day, and the latter five cycles were done the following day. Original electropherograms of cycles 2 to 12 are shown in Figure 4. All products from the twelve cycles were positively identified. The first seven cycles gave better yields and fewer background peaks because the former were done on the first day. This phenomenon was also observed in our previous studies [15]. Figure 4 also shows that the residue PTH-L from cycle 2 co-eluted with an impurity peak. This impurity peak and the other background peaks observed in cycle 2 were dramatically reduced in the following cycles, which suggested that the background peaks were due to incomplete cleaning of the new Teflon microreactor before use. 
The sequencing products and by-products obtained using the Teflon microreactor (Figure 4) are similar to those obtained with the capillary reaction chamber (Figure 3). This suggests that the epoxy glue used to connect the inlet capillaries to the capillary reaction chamber in the initial experiments (Figure 3) does not cause problems in identifying the sequencing products.
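The yield figures quoted in this section can be turned into a rough expectation of the signal at each cycle. The sketch below assumes the usual geometric model, in which the amount of PTH-amino acid recovered falls by the repetitive yield at every cycle. Treating cycle 2 as the reference cycle (because the pseudo-initial yield was measured there) is an assumption, and the function names are hypothetical; only the values 33 pmol, 76%, and 87% come from the text.

```python
def expected_pth_pmol(loaded_pmol, initial_yield, repetitive_yield,
                      cycle, reference_cycle=2):
    """Expected PTH-amino acid (pmol) at `cycle` under a geometric-decay model.

    Assumption: the initial yield applies at `reference_cycle`, and every
    later cycle retains the fraction `repetitive_yield` of the previous one.
    """
    if cycle < reference_cycle:
        raise ValueError("cycle precedes the reference cycle")
    steps = cycle - reference_cycle
    return loaded_pmol * initial_yield * repetitive_yield ** steps


def repetitive_yield_between(amount_early, amount_late, cycles_apart):
    """Repetitive yield implied by two recovered amounts `cycles_apart` cycles apart."""
    return (amount_late / amount_early) ** (1.0 / cycles_apart)


if __name__ == "__main__":
    # Values quoted in the text: 33 pmol loaded, 76% pseudo-initial yield,
    # 87% repetitive yield.
    for cycle in range(2, 13):
        amount = expected_pth_pmol(33.0, 0.76, 0.87, cycle)
        print(f"cycle {cycle:2d}: {amount:5.2f} pmol")
```

Under these assumptions roughly 25 pmol would be expected at cycle 2 and about 6 pmol at cycle 12, which illustrates why detection sensitivity becomes limiting as the number of cycles grows.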
Figure 3. Electropherograms showing the sequencing results from 33 pmol of Sequelon-DITC-myoglobin obtained with the capillary reactor. Panels show individual Edman cycles, with PTH standards (STD), DPTU, and unidentified peaks (U1-U3) labeled; x-axis: time (min).
Figure 4. Electropherograms showing the sequencing results from 27 pmol of Sequelon-DITC-myoglobin obtained with the Teflon microreactor (cycles 2 to 12). Panels are labeled by cycle number, with the DPTU peak marked.
IV. Conclusion
Edman chemistry has been used for protein and peptide sequencing for over 30 years. However, the outcome of a sequencing experiment depends heavily on the performance of the instrument. When the sequencer is miniaturized to capillary size, reproducible sequencing results are more difficult to achieve [14]. The new design, which uses syringe pumps for reagent delivery and direct connections without valves, has provided us with a new approach to miniaturizing the sequencer. The short flow path and very low dead volume were achieved by directly connecting the narrow-bore capillaries (100 µm i.d.) to the reaction chamber. This configuration minimized side reactions. The elimination of valves, as well as the use of a capillary-size reaction chamber and delivery lines, greatly simplified the construction of the sequencer. We have demonstrated the ability of this sequencer to sequence low pmol levels of proteins, even though the conversion of ATZ to PTH amino acids and the MECC-TOAD detection of PTH-amino acids were carried out off-line. To obtain sequencing sensitivity at the fmol level, on-line conversion and on-line detection of PTH-amino acids are necessary.
Acknowledgments
This project was supported by an operating grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada. Additional support was provided by SCIEX. XFL and KCW acknowledge NSERC Industrial Postdoctoral fellowships sponsored by SCIEX. NJD acknowledges a McCalla Professorship from the University of Alberta.
References
1. Hewick, R.M., Hunkapiller, M.W., Hood, L.E., and Dreyer, W.J. (1981) J. Biol. Chem. 256, 7990.
2. Tempst, P., Geromanos, S., Elicone, C., Erdjument-Bromage, H. (1994) Methods: A Companion to Methods in Enzymology 6, 248.
3. Laursen, R.A. (1971) Eur. J. Biochem. 20, 89.
4. Pappin, D.J.C., Coull, J., and Koester, H. (1990) Anal. Biochem. 187, 10.
5. Edman, P., and Begg, G. (1967) Eur. J. Biochem. 1, 80.
6. Kent, S., Hood, L., Aebersold, R., Teplow, D., Smith, L., Farnsworth, V., Cartier, P., Hines, W., Hughes, P., and Dodd, C. (1987) BioTechniques 5, 314.
7. Scoble, H.A., Vath, J.E., Yu, W., and Martin, S.A. In P. Matsudaira (Ed.), (1993) A Practical Guide to Protein and Peptide Purification for Microsequencing, Academic Press, Inc., San Diego, pp. 125.
8. Wilm, M., Shevchenko, A., Houthaeve, T., Breit, S., Schweigerer, L., Fotsis, T., and Mann, M. (1996) Nature 379, 466.
9. Figeys, D., Oostveen, I.V., Ducret, A., and Aebersold, R. (1996) Anal. Chem. 68, 1822.
10. Calaycay, J., Rusnak, M., and Shively, J.E. (1991) Anal. Biochem. 192, 23.
11. Granlund-Moyer, K., Miller, C.G., and Sahakian, J.A. (1994) 10th International Conference on Methods in Protein Structure Analysis, Snowbird, Utah, Sept. 8-13, Abstract LA3.
12. Waldron, K.C., and Dovichi, N.J. (1992) Anal. Chem. 64, 1396.
13. Chen, M., Waldron, K.C., Zhao, Y., and Dovichi, N.J. (1994) Electrophoresis 15, 1290.
14. Waldron, K.C., Li, X.-F., Chen, M., Ireland, I., Lewis, D., Carpenter, M., and Dovichi, N.J. (1996) Talanta, in press.
15. Li, X.-F., Waldron, K.C., Black, J., Lewis, D., Ireland, I., and Dovichi, N.J. (1996) Talanta, accepted.
16. Li, X.-F., Ren, H., Le, X.C., Ireland, I., Qi, M., and Dovichi, N.J. Unpublished results.
ENHANCEMENT OF CONCENTRATION LIMITS OF DETECTION IN CAPILLARY ELECTROPHORESIS: EXAMPLES OF ON-LINE SAMPLE PRECONCENTRATION, CLEANUP, AND MICROREACTOR TECHNOLOGY IN PROTEIN CHARACTERIZATION
Andy J. Tomlinson¹, Linda M. Benson¹, Norberto A. Guzman³, and Stephen Naylor¹,²
¹Biomedical Mass Spectrometry Facility and Department of Biochemistry and Molecular Biology, ²Department of Pharmacology and Clinical Pharmacology Unit, Mayo Clinic, Rochester, MN 55905
³The R.W. Johnson Pharmaceutical Research Institute, Raritan, NJ 08869
I. INTRODUCTION
It is paradoxical that one of the noted advantages of capillary electrophoresis (CE), namely the small volume of a conventional CE capillary, also leads in the majority of cases to a significant drawback of the technique. The total volume of the capillary is typically only ~1-2 µL, and this results in a very limited loading capacity for analyte solutions. Optimal analyte resolution and separation efficiency are usually obtained when the sample injection is <2% of the total capillary volume (1). Ultimately, this results in poor concentration limits of detection (CLOD) and leads to a major problem when attempting to analyze relatively dilute analyte mixtures, particularly those derived from biological sources. In attempts to overcome the poor CLOD of CE, several groups have developed a number of injection techniques, including analyte stacking, field amplification, and transient isotachophoresis (tITP), that facilitate the analysis of larger sample volumes (2-4). However, since these techniques are carried out within the conventional CE capillary, the maximum sample volume that can be
analyzed is predetermined by the total capillary volume. Hence, even in the most favorable cases such optimized injection techniques can normally tolerate the introduction of only <1-2 µL of sample without loss of CE performance. Another approach to circumvent poor CE CLOD is to undertake off-line sample pretreatment and analyte concentration. This should be avoided, if possible, particularly for dilute protein solutions, since losses to exposed surfaces (e.g., walls of microcentrifuge tubes, pipette tips, solid extraction phases, etc.) can be substantial. Furthermore, excessive handling of a concentrated solution of protein(s) can lead to denaturation, aggregation, precipitation, and, ultimately, poor analyte recovery. Therefore, minimal sample handling is advisable. In order to overcome the problem of limited sample loading, Guzman conceived and demonstrated the concept of on-line preconcentration with CE using a cartridge containing a bed of adsorptive phase (5). In the present work we describe the use of nonspecific on-line preconcentration-CE, on-line immunoaffinity-CE (IA-CE), and on-line microreactor enzyme digestion-CE for the analyses of proteins.
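The loading limit described in this introduction follows from simple geometry, and a short calculation makes the scale concrete. The sketch below treats the capillary as a cylinder and applies the <2% injection guideline quoted from reference (1). The capillary dimensions (50 µm i.d., 57 cm long) are illustrative assumptions rather than values given by the authors, and the function names are hypothetical.

```python
import math


def capillary_volume_nl(inner_diameter_um, length_cm):
    """Internal volume of a cylindrical capillary in nanolitres."""
    radius_cm = inner_diameter_um * 1e-4 / 2.0   # 1 um = 1e-4 cm
    volume_cm3 = math.pi * radius_cm ** 2 * length_cm
    return volume_cm3 * 1e6                      # 1 cm^3 = 1 mL = 1e6 nL


def max_injection_nl(inner_diameter_um, length_cm, fraction=0.02):
    """Largest injection plug if it must stay below `fraction` of the capillary volume."""
    return fraction * capillary_volume_nl(inner_diameter_um, length_cm)


if __name__ == "__main__":
    # Assumed geometry: 50 um i.d. x 57 cm (hypothetical, for illustration only).
    total = capillary_volume_nl(50.0, 57.0)
    plug = max_injection_nl(50.0, 57.0)
    print(f"capillary volume: {total / 1000:.2f} uL")
    print(f"2% injection plug: {plug:.1f} nL")
```

For the assumed geometry the capillary holds about 1.1 µL, in line with the ~1-2 µL figure in the text, so a 2% plug amounts to only about 22 nL of sample.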
II. MATERIALS AND METHODS
A. Membrane-Preconcentration-CE-MS (mPC-CE-MS)
The construction of the mPC-CE cartridge (Figure 1) as well as the configuration of the CE-MS interface have been described in detail elsewhere (6-9). Briefly, a piece of impregnated membrane (3M Corporation, MN, USA) is inserted into a short length (~1 cm) of Teflon tubing (300 µm i.d. x 1500 µm o.d.). Final cartridge assembly is achieved by inserting a fused silica capillary (50 µm i.d. x 365 µm o.d. x 1.5 cm long) into each end of the Teflon tube. The cartridge, after off-line conditioning (methanol followed by CE separation buffer), is inserted at the inlet end of the CE capillary. Analysis was carried out using a Beckman (CA, USA) P/ACE 2100 CE coupled via a Beckman CE-MS power supply to a Finnigan MAT (Bremen, Germany) ESI source on a MAT 900 mass spectrometer.
Figure 1. Schematic of an mPC-CE cartridge. Labeled parts: Teflon, fused silica or metal tubing; solvent resistant epoxy resin; polyethylene connecting sleeve; CE capillary; CE separation buffer inlet; polyimide coated fused silica capillary; connection to ESI-MS.
B. On-line immunoaffinity-CE
The construction of the immunoaffinity analyte concentration cartridges has been described in detail elsewhere (10) and is shown in Figure 2. They consisted of either (a) 5-14 polyimide-coated capillaries (25 µm i.d. x 150 µm o.d.) contained in rigid plastic tubing or (b) a solid glass rod containing laser-drilled holes (~25 µm i.d.). Anti-IgE antibodies were covalently bound to the surface of the microcapillaries. Serum was applied to the cartridge, and bound IgE was subsequently eluted using 75 mM HEPES/NaOH buffer (pH 7.2) containing 3 M MgCl2 and 25% ethylene glycol. The CE separation buffer consisted of 50 mM sodium tetraborate (pH 8.3).
Figure 2. Schematic of an IA-CE concentrator. Labeled parts: capillary wall; microparticle plus antibody; frit (porous glass); concentrated protein.
C. On-line microreactor enzyme digestion-CE
The construction and design of the microreactor chambers are described in detail elsewhere (11) and shown schematically in Figure 3.
Figure 3. Schematic of a microreactor enzyme digestion chamber coupled to CE. Labeled parts: first microreactor; protein; frit (porous glass); second microreactor; sleeve connector; FITC peptide.
III. RESULTS AND DISCUSSION
A. Membrane-Preconcentration-CE-MS (mPC-CE-MS)
The development of mPC-CE was undertaken to reduce the limitations encountered when using on-line solid phase preconcentration-CE (8,9,12). The membrane is installed in a specifically designed cartridge prepared from Teflon tubing (see Figure 1). The cartridge design conveniently allows disassembly of the mPC-CE capillary. This permits effective cleaning and conditioning of the CE capillary, as well as rapid off-line activation of the adsorptive membrane. Furthermore, the mPC-CE cartridge allows large sample volumes (>100 µL) to be loaded without compromising the analyte resolution or separation efficiency afforded by conventional CE methods (8,9). In addition to analyte preconcentration, mPC-CE technology can also be used to effect sample cleanup. This is particularly important for physiologically derived samples such as blood, bile, and urine, where the presence of high salt concentrations can dramatically affect analyte separations by CE. Furthermore, these matrix components can complicate, and even alter, electrophoretic stacking and focusing procedures, often precluding the use of these methods for preconcentration of biologically
derived samples within the CE capillary. In contrast, mPC-CE technology is relatively unaffected by such contaminants. Indeed, this approach ensures that these compounds are removed from the CE capillary prior to electrophoresis. Furthermore, when using an off-line sample loading strategy, the bidirectional flow through an mPC cartridge allows samples to be loaded with either reverse or forward flow. We utilize a back flow to load sample, followed by sample cleanup with a forward flow of a suitable solvent (typically an aqueous medium). This approach flushes sample-derived particulates from the mPC cartridge prior to its installation onto the CE capillary. Reproducibility of mPC-CE performance improves because the system clogs less, which avoids adverse effects on electroosmotic flow (EOF). The potential application of mPC-CE coupled to a mass spectrometer (mPC-CE-MS) in the clinical diagnosis of disease states is substantial. In part, this is because the technique can be utilized in the direct analysis of any physiologically derived body fluid. This is demonstrated by the direct mPC-CE-MS analysis of aqueous humor obtained from a patient undergoing eye surgery. The chemical composition of human aqueous humor is still poorly understood, mainly due to the limited sample amounts that can be collected. It has been suggested that the chemical content of this fluid may play a role in drainage of the human eye. In particular, the protein content of aqueous humor may contain important factors in this process. Hence, any method that can readily determine the protein content of aqueous humor would be of great benefit. In this specific case, we took 7 µL of human aqueous humor and pressure injected it directly, without any further sample pretreatment, onto a C-8 silica-based impregnated membrane. The membrane containing the aqueous humor analytes was subsequently washed for 10 minutes with separation buffer (1% acetic acid).
Analytes were eluted from the membrane with 80:20 MeOH:H2O and subjected to CE separation in a polybrene-coated capillary with final detection by ESI-MS. The mPC-CE-MS ion electropherogram is shown in Figure 4. A number of ion responses were observed, including singly charged species at m/z 758, 760, and 782. Further, two components were tentatively identified as human serum albumin (MH₅₀⁵⁰⁺ = 1338) and β-2 microglobulin (MH₁₁¹¹⁺ = 1067). Deconvolution of the ion series containing m/z 1067 revealed a molecular weight of 11,729 Da, corresponding to the oxidized form of β-2 microglobulin that contains a single cysteine bridge. Characterization of the other components present in the mixture is currently in progress.
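The relationship between a multiply protonated ESI ion and the molecular weight it implies can be sketched in a few lines. The example below assumes that each charge is carried by one added proton; it is not the deconvolution software used by the authors, and the function names are hypothetical. Only the nominal m/z values and charge states (1067 at 11+, 1338 at 50+) come from the text.

```python
PROTON_MASS = 1.00728  # Da, mass of a proton


def neutral_mass(mz, charge):
    """Neutral mass M of an [M + zH]z+ ion observed at `mz`."""
    return charge * (mz - PROTON_MASS)


def charge_from_adjacent_peaks(mz_high, mz_low):
    """Charge of the ion at `mz_high`, given the next member of the same
    series at `mz_low` (which carries one additional proton)."""
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


if __name__ == "__main__":
    # Nominal values from the text; the integer m/z limits the accuracy.
    print(f"m/z 1067, 11+ -> {neutral_mass(1067.0, 11):.0f} Da")
    print(f"m/z 1338, 50+ -> {neutral_mass(1338.0, 50):.0f} Da")
```

With the nominal m/z of 1067 this gives about 11,726 Da, close to the 11,729 Da reported from deconvolution of the full ion series; the small difference is expected because the peak position is quoted only to the nearest integer.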
Figure 4. mPC-CE-MS analysis of 7 µL of aqueous humor without any sample pretreatment. Ion traces: MH⁺ = 758; MH⁺ = 760; MH⁺ = 782; MH₁₁¹¹⁺ = 1067 (β-2-microglobulin); MH₅₀⁵⁰⁺ = 1338 (human serum albumin). x-axis: time (min).
B. On-line immunoaffinity-CE
Analyte concentrators that contain covalently bound antibodies are useful with CE for those applications that warrant detection of a specific analyte. The concept for such devices was first described for CZE by Guzman (5,13) and subsequently by Kennedy (14). In these studies, specific antibodies were covalently bound to either a solid phase, glass beads, multiple capillary bundles or, more recently, a piece of solid glass predrilled with a laser beam. A typical immunoaffinity analyte concentrator constructed from multiple capillary bundles is shown schematically in
Figure 2. The performance of a device of this construction was compared with that of a similar concentrator made from a solid piece of glass containing laser-drilled holes for the analysis of IgE in serum by CE. These investigations showed a broad peak response for IgE when it was analyzed using the concentrator made from multiple capillary bundles on-line with CE (Figure 5A). Furthermore, a second minor response was detected using this approach. A significant variability of analyte migration time was also observed. It was concluded that such variability of performance was due, at least in part, to a reduction of EOF. This reduction was progressive and was attributed to partial blocking of the cartridge during sequential analysis of serum samples. In contrast, the immunoaffinity analyte concentrator made from a single piece of glass with through holes yielded only a single peak (see Figure 5B). In addition, the migration of IgE in this system was substantially faster than was observed using the analyte concentrator made from multiple capillary bundles (Figure 5A). Furthermore, the peak profile was improved and IgE migration was more consistent using the single-piece immunoaffinity analyte concentrator. The major response from both of these studies was collected from the CE capillary using a purpose-built fraction collector (13). The fractions collected from several consecutive injections were pooled and shown to be IgE by biological assay. These examples demonstrate the high specificity of the immunoaffinity analyte concentrator, since only IgE was isolated from serum, with no detectable human serum albumin or other immunoglobulins such as IgA, IgG, or IgM.
Figure 5. (A) IA-CE of IgE in serum using multiple capillaries in bundles; (B) IA-CE of IgE in serum using a solid glass rod with laser-drilled holes. x-axis: migration time (min).
C. On-line microreactor enzyme digestion-CE
An attractive feature of microreactors (either chemical or enzymatic) prepared from a solid support on-line with CE is the potential for enhanced efficiency of these processes. This is often accompanied by shorter reaction times, consumption of smaller amounts of reagents and, perhaps most importantly, the ability to react, derivatize, or digest lower analyte concentrations than is possible by conventional solution chemistries. A recent improvement to on-line protein digestion methodology was the construction of an enzyme-modified analyte concentrator, as described by Guzman (11). Using this approach, Staphylococcus aureus V8 protease was covalently linked to a porous glass solid support in the analyte microreactor concentrator and constrained by glass frits (see Figure 3). Specific digestion of the α-subunit of prolyl-4-hydroxylase was demonstrated by comparing the electropherograms generated after the α-subunit had interacted with analyte microreactor concentrators containing covalently linked cytochrome c, bovine serum albumin, or Staphylococcus aureus V8 protease. Proteolytic digestion was observed only when the subunit was reacted in the microreactor containing the covalently bound Staphylococcus aureus V8 protease (Figure 6A). Efficient digestion of prolyl-4-hydroxylase by the V8 protease was achieved even at 30 °C for ten minutes. A further refinement of the on-line proteolytic digestion microreactor was also described by Guzman (11), in which a second analyte concentrator microreactor is coupled on-line with the first proteolytic microreactor and the CE capillary. This second reactor contains glass beads modified with fluorescein isothiocyanate (FITC), linked to immobilized anti-FITC antibodies. The purpose of this refined approach is to chemically derivatize the peptides produced by on-line protein digestion to enhance their UV absorbance and fluorescence characteristics.
This further alleviates the poor CLOD of conventional peptide analysis by CE through enhanced peptide detection capabilities. The two-reactor system is demonstrated by the on-line digestion of the α-subunit of prolyl-4-hydroxylase followed by consecutive FITC derivatization and, ultimately, CE separation of the generated peptides (see Figure 6B). In this example, the higher UV absorbance of FITC-derivatized peptides was clearly observed. Furthermore, the on-line generation of FITC-labeled peptides aided component resolution when compared with the electropherogram obtained from the analysis of the on-line generated but underivatized peptides (Figure 6A).
Figure 6. (A) On-line enzyme digestion of the prolyl-4-hydroxylase subunit by S. aureus V8 protease; CE separation of underivatized peptides monitored at 214 nm. (B) Same as 6A except that the peptides were derivatized on-line with FITC. x-axis: migration time (min).
IV. SUMMARY
It is clear that the use of mPC-CE and IA-CE affords a powerful approach for preconcentration and on-line sample cleanup of analyte mixtures prior to separation by CE. While these techniques continue to be refined, they overcome the current limitations of poor CLOD in conventional CE. Finally, the use of these devices as microreactors affords enhanced chemical derivatization or enzymatic reactions at lower analyte concentrations than is currently possible by conventional solution chemistries.
ACKNOWLEDGMENTS
We thank Mrs. Diana Ayerhart (Mayo Clinic) for her help in preparing this manuscript. We also thank Mayo Foundation, Beckman Instruments, and Finnigan MAT for their support.
REFERENCES
1. J. Cai and Z. El Rassi, J. Liq. Chromatogr. 16 (1993) 2007.
2. B. J. Wanders and F. M. Everaerts, in J. P. Landers (Editor), Handbook of Capillary Electrophoresis, CRC Press, Boca Raton, 1994, p. 111.
3. R. L. Chien and D. S. Burgi, Anal. Chem. 64 (1992) 489A.
4. P. Gebauer, W. Thormann and P. Bocek, J. Chromatogr. 608 (1992) 47.
5. N. A. Guzman, M. A. Trebilcock and J. P. Advis, J. Liq. Chromatogr. 14 (1991) 997.
6. A. J. Tomlinson, L. M. Benson and S. Naylor, J. Cap. Elec. 1 (1994) 127.
7. K. L. Johnson, A. J. Tomlinson and S. Naylor, Rapid Commun. Mass Spectrom. 10 (1996) 1159.
8. A. J. Tomlinson and S. Naylor, J. Liq. Chromatogr. 18 (1995) 3591.
9. A. J. Tomlinson and S. Naylor, J. Cap. Elec. 2 (1995) 225.
10. N. A. Guzman, J. Liq. Chromatogr. 18 (1995) 3751.
11. N. A. Guzman, in P. G. Righetti (Editor), Capillary electrophoresis: an analytical tool in biotechnology, CRC Press, Boca Raton, 1995.
12. A. J. Tomlinson, W. D. Braddock, L. M. Benson, R. P. Oda and S. Naylor, J. Chromatogr. B Biomed. Appl. 669 (1995) 67.
13. N. A. Guzman, C. L. Gonzalez, M. A. Trebilcock, L. Hernandez, C. M. Berck and J. P. Advis, in N. A. Guzman (Editor), Capillary Electrophoresis Technology, Marcel Dekker Inc., New York, 1993, p. 643.
14. L. J. Cole and R. T. Kennedy, Electrophoresis 16 (1995) 549.
SEQUENCING MHC CLASS I PEPTIDES USING MEMBRANE PRECONCENTRATION-CAPILLARY ELECTROPHORESIS TANDEM MASS SPECTROMETRY (mPC-CE-MS/MS)
Andy J. Tomlinson¹, Stephen Jameson³, and Stephen Naylor¹,²
¹Biomedical Mass Spectrometry Facility and Department of Biochemistry and Molecular Biology, ²Department of Pharmacology and Clinical Pharmacology Unit, Mayo Clinic, Rochester, MN 55905
³Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55415
I. INTRODUCTION
Major histocompatibility complex (MHC) proteins are essential components of the immune system (1). One specific role is to bind cellularly derived peptides (~8-10 amino acids; MHC class I peptides) and present them at the cell surface. These peptides are subsequently challenged by cytolytic T-lymphocytes (CTLs), which are programmed to differentiate between self and exogenous peptides. T-cell recognition of these latter peptides initiates a response that ultimately results in lysis and death of the infected cell. Hence, structural characterization of such peptides could potentially result in the development of therapeutic treatments for a number of infectious disease states such as viral cancers, AIDS, and autoimmune disease. However, the task of sequencing such peptides is difficult, since MHC class I proteins can bind and present 10,000-15,000 different cellularly derived peptides present at the sub-picomole to femtomole level (2,3). Hunt and coworkers have pioneered the development of methods to sequence MHC class I and class II peptides (2-6). Specifically, they utilize two-dimensional microcapillary HPLC-MS/MS to separate and sequence such peptides. In this work, we describe the use of a new orthogonal two-dimensional chromatography-MS/MS approach employing reversed-phase
HPLC followed by on-line membrane preconcentration capillary electrophoresis-MS/MS (mPC-CE-MS/MS) to separate and sequence MHC class I peptides.
II. MATERIALS AND METHODS
A. Isolation of MHC class I peptides
EL-4 cells (3 x 10^) were lysed with N,N-dimethyl-N-(3-sulfopropyl)-3-[[(3α,5β,7α,12α)-3,7,12-trihydroxy-24-oxocholan-24-yl]amino]-1-propanaminium hydroxide (CHAPS). The nuclei and membranes were pelleted, and the supernatant lysate was filtered to remove lipids. The lysate was sequentially passed over Sepharose columns containing (a) normal mouse serum and (b) Y-3, which is an anti-Kᵇ monoclonal antibody. Both columns were washed with 45 column volumes of progressively lower molarity salt solutions. The beads were then treated with acetic acid to release antigen-antibody complexes, and the complex was denatured by boiling in 10% acetic acid. The mixture was filtered through a 3 kDa pore-size membrane, and the filtrate containing MHC class I peptides was subjected to reversed-phase HPLC.
B. HPLC
Separations were performed on a Shimadzu HPLC instrument. A 50 µL aliquot was injected (in water:acetonitrile 98:2 v/v) via a Rheodyne injector (Cotati, CA) onto a Vydac analytical column (4.6 mm x 250 mm) containing C-18 packing material (300 Å, 5 µm). Separations were achieved using a mobile phase of (A) 0.06% TFA and (B) 0.052% TFA in CH3CN. A solvent gradient of 2% → 37.5% B (0-60 minutes), 37.5% → 75% B (60-90 minutes), and 75% → 98% B (90-105 minutes) was used at a flow rate of 500 µL/min. Fractions were collected based on their UV response at 214 nm.
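The three-segment gradient given in the HPLC method can be written as a small lookup, which is convenient when relating a fraction's collection time to its approximate solvent composition. The sketch below assumes each segment is linear, which is the usual reading of such a program but is not stated explicitly in the text. The function and variable names are hypothetical, and instrument delay volume is ignored.

```python
# (time in min, %B) breakpoints taken from the gradient described in the text.
GRADIENT = [(0.0, 2.0), (60.0, 37.5), (90.0, 75.0), (105.0, 98.0)]


def percent_b(t_min, program=GRADIENT):
    """Percent of solvent B at time `t_min`, assuming linear segments."""
    if t_min <= program[0][0]:
        return program[0][1]
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return program[-1][1]  # hold the final composition after the last step


def flow_b_ul_per_min(t_min, total_flow_ul_per_min=500.0):
    """Flow contributed by solvent B at the stated total flow rate."""
    return total_flow_ul_per_min * percent_b(t_min) / 100.0


if __name__ == "__main__":
    for t in (0, 30, 60, 75, 90, 105):
        print(f"{t:3d} min: {percent_b(t):5.2f} %B, "
              f"{flow_b_ul_per_min(t):6.1f} uL/min of B")
```

Because the first segment rises by only about 0.6% B per minute while the later segments are steeper, most of the resolving power of the program is spent on the early-eluting, more hydrophilic peptides.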
C. Membrane Preconcentration-Capillary Electrophoresis-Tandem Mass Spectrometry (mPC-CE-MS/MS)
The preconcentration cartridge used in these experiments was prepared from uncoated fused silica tubing pretreated with potassium
methoxide, methanol, and, finally, CE separation buffer. A piece of polymeric styrene divinyl benzene (SDB) membrane was cut using a 22 gauge blunt-tipped hypodermic needle. The membrane remained in the needle until insertion into the midpoint of a short length (~1 cm) of Teflon tubing (300 µm i.d. x 1500 µm o.d.). To install the membrane, the needle is placed over either end of the Teflon tube, and the membrane is carefully pushed into position with a small length of fused silica positioned inside the hypodermic needle. Provided this procedure is followed with adequate attention, the piece of membrane will hold its shape and completely fill the cross-sectional area of the Teflon tube. Final cartridge assembly is achieved by inserting a fused silica capillary (50 µm i.d. x 365 µm o.d. x 1.5 cm long) into each end of the Teflon tube. During this process, care is taken neither to compress the membrane nor to scrape the walls of the Teflon tubing, since either could result in blockage of the cartridge and low hydrodynamic flow. Also, for this final step of cartridge construction, provided the inside diameter of the Teflon and the outside diameter of the fused silica are similar, the cartridge will be leak-free with a push-fit connection. This approach removes the need to glue the fused silica in position. The push-fit cartridge is advantageous since, if the piece of membrane becomes heavily contaminated, it is easily replaced. Prior to installation, the membrane cartridge was activated by washing with MeOH and then CE separation buffer. The entire mPC-CE capillary was then conditioned under high pressure (20 psi) for ten minutes with CE separation buffer. All subsequent capillary treatments and sample loading, washing, and elution were also carried out under high pressure (20 psi). The method of analysis included a cleaning regime of methanol (0.2 min) and separation buffer (5 min) followed by a high pressure injection of the mixture to be analyzed.
The capillary was then washed with separation buffer for 5 minutes, and analytes were eluted from the packing material with 80:20 MeOH:H2O followed by a plug of CE separation buffer. CE separations were performed using a Beckman P/ACE 2100 coupled via a Beckman CE-MS power supply and interfaced to a Finnigan electrospray source. All analyses were carried out on a MAT 95Q (Bremen, Germany) mass spectrometer. The MAT 95Q is of BEQ1Q2 configuration (where B is the magnet, E is the electrostatic analyzer, Q1 is an rf-only octopole collision cell, and Q2 is a quadrupole mass filter). A Finnigan MAT ESI source was used; this device employs a spray needle that is floated at an ESI voltage of 3.4 kV referenced to an accelerating voltage of 4.8 kV. A heated metal capillary (~225 °C) completes the first stage of separation of the atmospheric (API) spray region. A skimmer is positioned beyond this capillary as a second stage of separation between the API region and the MS
vacuum. Ions that transfer into the MS ion source initially enter an octopole that aids focusing. The source was used in positive ion mode throughout, and the sample needle of the ESI source was replaced by the CE capillary, from which 2-3 mm of the polyimide coating had been removed from the MS end with hydrofluoric acid. A sheath liquid of isopropanol:water:acetic acid (60:40:1 v/v/v) at a flow rate of 2-3 µL/min was used to boost the flow through the ESI needle and serve as the counterelectrode for the CE capillary. Tandem-MS conditions consisted of xenon in the rf-only octopole collision cell at a gas pressure of 1.2 x 10"^ mbar. A collision energy of ~24 eV on the MH₂²⁺ precursor ion was used (see Figure 1).
III. RESULTS AND DISCUSSION
A. General strategy
The complexity of MHC class I peptide mixtures, as well as the similarity of their amino acid sequences, requires a number of factors to be considered for specific peptide structural characterization. In particular, it is important that cell lysis and the subsequent purification of the peptides utilize reagents that will not decrease the sensitivity limits of the MS/MS analysis. Furthermore, it is also important to develop a two-dimensional chromatography approach that exploits different physical properties of the peptides in the mixture. This affords the best opportunity to separate complex mixtures of structurally similar peptides. The strategy we have developed is as follows: (1) cell lysis with the zwitterionic detergent CHAPS; (2) immunoaffinity concentration of MHC class I proteins; (3) release of MHC class I peptides by treatment of the antibody-antigen complex with 10% acetic acid; (4) coarse fractionation of peptides by reversed-phase HPLC; (5) membrane preconcentration-CE-transient isotachophoresis-MS (mPC-tITP-CE-MS), where peptides are separated on the basis of charge/mass (CE) and subsequently mass/charge (MS); (6) mPC-tITP-CE-MS/MS to determine peptide sequence.
The initial steps (1-4) used to isolate MHC class I peptides are based on methods described by Hunt (2-5). However, we have noted that the use of a zwitterionic detergent to lyse cells has no deleterious effect on the CE-ESI-MS analysis of MHC class I peptides. This is not the case for either cationic or anionic compounds, which are difficult to remove from peptide mixtures even after multi-stage purification. Hence, they can still be present in the final MS analysis step, and this results in significant suppression of ESI-MS peptide ion current (7).
After immunoaffinity concentration of MHC proteins containing MHC class I peptides, and subsequent acetic acid release of the peptides, the latter were subjected to reversed-phase HPLC. In this first dimension of chromatography, the MHC class I peptides are separated based on their hydrophobic/hydrophilic properties. This initial chromatographic step affords a coarse fractionation of the complex peptide mixture into ~100 µL aliquots. Such fractions have been shown to contain a large number (~5-50) of peptides (3,8). Hence a complementary second stage of high resolution chromatography is necessary to ensure optimal resolution of individual peptides. A second, orthogonal dimension of separation using CE was applied to the HPLC fractions, since in CE analytes are separated (to a first approximation) on their charge-to-mass ratio (9). The potential of on-line CE-MS to effect the separation of MHC class I peptide mixtures has been reported previously (7). However, it has been noted that this approach can be problematic because of the limited sample volume loading capacity of conventional CE capillaries (10). Ultimately, this leads to poor concentration limits of detection (CLOD) and an inability to handle dilute, complex analyte mixtures. To overcome this problem we have developed a technology which we term membrane preconcentration-CE (mPC-CE) (9,11-16) or, in conjunction with MS, mPC-CE-MS. It consists of an impregnated adsorptive membrane in a Teflon cartridge installed at the inlet of a conventional CE capillary, as shown previously in Figure 1. This arrangement allows ready removal of the cartridge for CE capillary cleaning/conditioning and activation of the adsorptive membrane. Using this approach, it is possible to load solution volumes in excess of 100 µL onto the membrane on-line.
Furthermore, on-line sample cleanup prior to CE-MS analysis is possible, which is particularly important for in vivo derived samples such as cell culture lysates. It should be noted that efficient removal of MHC class I peptides from the adsorptive membrane requires an elution solvent containing some organic component (e.g., methanol:H2O, 80:20). Furthermore, optimal peptide recovery is achieved only when >50 nL of such a solvent mixture is used. This relatively large volume of elution solvent, along with the inefficient analyte stacking that occurs, results in some peak broadening and loss of analyte resolution. Therefore, moving boundary tITP conditions are used to focus analyte zones and also to aid in the dispersion of the organic elution solvent (16). This is carried out by eluting peptides from the membrane between zones of a leading stacking buffer (LSB), typically 0.1-5% NH4OH in water, and a trailing stacking buffer (TSB), typically 1% acetic acid in water or CE separation buffer, as shown schematically in Figure 2.
[Figure 2 diagram labels: polyethylene tubing; membrane; mPC-CE cartridge; CE capillary; trailing stacking buffer; elution solvent + peptide analytes; leading stacking buffer]
Figure 2 Schematic of tITP conditions for use with mPC-CE-MS analysis after MHC class I peptides have been eluted from the adsorptive SDB membrane into the CE capillary. Application of the CE voltage results in rapid migration of H⁺ and OH⁻ ions with concomitant focusing of the peptides in the high organic solvent zone.
B. Sequencing Kᵇ-derived MHC class I peptides

In the present study, a Kᵇ fraction of 3 x 10^ mouse-derived EL-4 cells was treated as described above. HPLC fractions of approximately 100 µL were collected based on their UV absorbance and subsequently subjected to mPC-CE-MS. For the latter analysis, the CH3CN solvent present in the HPLC fractions was removed and the resulting aqueous fractions (~40-70 µL) were diluted with CE separation buffer. This was done to maximize recovery of the MHC class I peptides present in each HPLC fraction. Subsequently an aliquot of this solution (50 µL) was loaded off-line onto the mPC-CE cartridge. This off-line loading and sample cleanup method (with CE separation buffer) was used because the flow rate in the on-line 25 µm i.d. mPC-CE capillary was only 120-150 nL/min, so that on-line loading of 50 µL of sample would take at least 5.5 hr. The flow rate in an mPC cartridge alone can be much higher off-line, since these devices can withstand relatively high pressures (~60 psi). Furthermore, system back pressure is also reduced, and up to 100 µL of sample can often be loaded off-line in <5 minutes, significantly reducing analysis time. Also, since the flow in an mPC cartridge is bidirectional, loading the sample with a reverse flow and then cleaning up in the forward direction flushes sample-derived particulate matter from the mPC cartridge prior to assembly of the mPC-CE capillary. This improves the reproducibility of mPC-CE-MS by reducing the tendency of the cartridge to clog. The mPC-CE-MS ion electropherogram revealed several major ion responses (Figure 3A). A minor response at a migration time of ~14 minutes afforded a doubly charged ion at m/z 503.6 (MH₂²⁺) (Figure 3B).
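The loading-time argument above is simple arithmetic, and a short sketch makes the comparison explicit. The function name is illustrative only; the volumes and flow rates are the ones quoted in the text, and the off-line flow rate is merely the lower bound implied by "100 µL in under 5 minutes".

```python
def loading_time_hours(volume_ul: float, flow_nl_per_min: float) -> float:
    """Time (h) needed to pass volume_ul microlitres at flow_nl_per_min nL/min."""
    minutes = volume_ul * 1000.0 / flow_nl_per_min  # 1 uL = 1000 nL
    return minutes / 60.0


# On-line loading of a 50 uL aliquot through the 25 um i.d. capillary.
for flow in (120.0, 150.0):
    hours = loading_time_hours(50.0, flow)
    print(f"on-line, {flow:.0f} nL/min: {hours:.1f} h")

# Off-line loading: 100 uL in under 5 min implies at least 20 uL/min.
off_line_flow_nl_per_min = 100.0 * 1000.0 / 5.0
print(f"off-line flow: at least {off_line_flow_nl_per_min / 1000.0:.0f} uL/min")
print(f"speed-up over 150 nL/min: about {off_line_flow_nl_per_min / 150.0:.0f}x")
```

At the quoted on-line flow rates a 50 µL aliquot needs between roughly 5.6 and 6.9 hr, which is why off-line loading of the cartridge is the practical choice.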
Figure 3 (A) Total ion current (TIC) of the mPC-tITP-CE-MS analysis of a ~50 µL aliquot of a diluted HPLC fraction of 3 x 10^ Kᵇ-derived EL-4 cells, plotted against time (min). (B) mPC-tITP-CE-MS ion electropherogram of the response in 3A marked with an asterisk (*), shown to be a doubly charged ion corresponding to MH₂²⁺ = 503.6.
Approximately 80 µL of the remaining diluted HPLC fraction was subsequently subjected to mPC-CE-MS/MS. The doubly charged precursor ion at m/z 503.6 was subjected to collision induced dissociation, and the resulting product ion spectrum (Figure 4) revealed a series of 'y' and 'b' ions. Interpretation of these ion series indicated a sequence of XSFKFDHX (where X is either I or L). The spectral data were also searched and interpreted using the Sequest database routine developed by Yates (17). The search revealed that the peptide was derived from F-actin and was of self origin. From this information, the peptide sequence was determined to be ISFKFDHL.
Figure 4 Product ion spectrum of the precursor ion MH₂²⁺ = 503.6 (ISFKFDHL) after mPC-tITP-CE-MS/MS analysis, plotted against m/z.
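As a cross-check on the assignment, the precursor m/z and the singly charged b and y ion ladders of ISFKFDHL can be computed from standard monoisotopic residue masses. The sketch below is not the Sequest routine used in the study; the function names are illustrative, and the mass table is the conventional monoisotopic one. It also shows why the termini could only be read as X from the spectrum: I and L have identical residue masses.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def neutral_mass(seq: str) -> float:
    """Monoisotopic mass of the neutral peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def mz(seq: str, charge: int) -> float:
    """m/z of the [M + zH]z+ ion."""
    return (neutral_mass(seq) + charge * PROTON) / charge


def b_ions(seq: str) -> list[float]:
    """Singly charged b ions b1 .. b(n-1), read from the N-terminus."""
    total, out = 0.0, []
    for aa in seq[:-1]:
        total += MONO[aa]
        out.append(total + PROTON)
    return out


def y_ions(seq: str) -> list[float]:
    """Singly charged y ions y1 .. y(n-1), read from the C-terminus."""
    total, out = WATER, []
    for aa in reversed(seq[1:]):
        total += MONO[aa]
        out.append(total + PROTON)
    return out


peptide = "ISFKFDHL"
print(f"[M+2H]2+ = {mz(peptide, 2):.2f}")
for i, (b, y) in enumerate(zip(b_ions(peptide), y_ions(peptide)), start=1):
    print(f"b{i} = {b:8.2f}   y{i} = {y:8.2f}")
```

The calculated doubly charged ion (m/z 503.77) agrees with the observed 503.6 to within a few tenths of an m/z unit.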
IV. CONCLUSIONS

In the present study, we describe a strategy for sequencing MHC class I peptides. We employ an orthogonal two-dimensional separation that consists of HPLC fractionation and on-line mPC-CE-MS. The use of mPC-CE-MS allows loading and on-line sample cleanup of >100 µL solutions containing analyte peptides, and utilizes an impregnated membrane adsorptive phase contained in a cartridge placed at the inlet of the CE capillary. We show that the mPC-CE cartridge has no adverse effects on overall CE-MS performance. Ultimately we use this approach to structurally characterize peptides derived from EL-4/Kᵇ immunoprecipitated MHC class I molecules and to determine a peptide sequence from MS/MS analysis of the Kᵇ-derived peptides. This approach can be achieved on as little as 5-50 femtomoles of peptide.

ACKNOWLEDGMENTS

We thank Mrs. Diana Ayerhart (Mayo Clinic) for her help in preparing this manuscript. We also thank Mayo Foundation, Beckman Instruments, and Finnigan MAT for their support.

REFERENCES

1. C.A. Janeway, Jr. (1993) Sci. Am. 269, 72.
2. D.F. Hunt, R.A. Henderson, J. Shabanowitz, K. Sakaguchi, H. Michel, N. Sevilir, A.L. Cox, E. Appella and V.H. Engelhard (1992) Science 255, 1261.
3. A.L. Cox, J. Skipper, Y. Chen, R.A. Henderson, T.L. Darrow, J. Shabanowitz, V.H. Engelhard, D.F. Hunt and C.L. Slingluff, Jr. (1994) Science 264, 716.
4. R.A. Henderson, H. Michel, K. Sakaguchi, J. Shabanowitz, E. Appella, D.F. Hunt and V.H. Engelhard (1992) Science 255, 1264.
5. A. Sette, S. Ceman, R.T. Kubo, K. Sakaguchi, E. Appella, D.F. Hunt, T.A. Davis, H. Michel, J. Shabanowitz, R. Rudersdorf, H.M. Grey and R. DeMars (1992) Science 258, 1801.
6. C.L. Slingluff, Jr., A.L. Cox, R.A. Henderson, D.F. Hunt and V.H. Engelhard (1993) J. Immunol. 150, 2955.
7. S. Naylor, S. Jameson and A.J. Tomlinson, 3rd International Symposium on Applied MS in the Health Sciences, Barcelona, Spain, July 9-13, 1995, pg. 162.
8. A.J. Tomlinson, R. Gallagher, P. Derrick, G. Butcher, S. Powis and S. Naylor, 44th ASMS Conference on MS and Allied Topics, Portland, OR, May 12-16, 1996.
9. N.A. Guzman (Editor) Capillary Electrophoresis Technology, Marcel Dekker Inc., New York, 1993, p. 857.
10. A.J. Tomlinson and S. Naylor (1995) J. Liq. Chromatogr. 18, 3591.
11. S. Naylor, A.J. Tomlinson, L.M. Benson, W.D. Braddock and R.P. Oda, Preseparation processor for use in capillary electrophoresis. US Patent Application 8/423,220: 1995.
12. A.J. Tomlinson, L.M. Benson, W.D. Braddock, R.P. Oda and S. Naylor (1995) J. High Resol. Chromatogr. 18, 381.
13. A.J. Tomlinson, L.M. Benson, R.P. Oda, W.D. Braddock, B.L. Riggs, J.A. Katzmann and S. Naylor (1995) J. Cap. Elec. 2, 97.
14. A.J. Tomlinson and S. Naylor (1995) J. High Resol. Chromatogr. 18, 384.
15. A.J. Tomlinson and S. Naylor (1995) J. Cap. Elec. 2, 225.
16. A.J. Tomlinson, N.A. Guzman and S. Naylor (1995) J. Cap. Elec. 2, 247.
17. J.R. Yates, J.K. Eng, A.L. McCormack and D. Schieltz (1995) Anal. Chem. 67, 1426.
Nano-electrospray Mass Spectrometry and Edman Sequencing of Peptides and Proteins Collected from Capillary Electrophoresis
Mark D. Bauer, Yiping Sun and Feng Wang
The Procter & Gamble Company, Miami Valley Laboratories, Cincinnati, OH 45253-8707

I. Introduction
Capillary electrophoresis (CE) is rapidly becoming an important complementary method to high performance liquid chromatography (HPLC). It provides several advantages, including high separation efficiency, small sample consumption and short analysis time. Because of the small sample volumes required, CE is becoming the analytical tool of choice for biological applications which are sample limited. Electrospray mass spectrometry (ES/MS) is a powerful technique for the structural characterization of biomolecules. Although on-line CE-MS has been demonstrated to be useful for the analysis of biomolecules, the interfacing of CE and ES/MS is nontrivial and much more complicated than LC-ES/MS interfacing (1-4). Some limitations of on-line CE-MS are: the capillary has to be long enough for the coupling of the CE to the MS; MS/MS sensitivity is relatively low due to the limited sample size loaded onto the capillary; and the CE buffers used have to be compatible with MS analysis. Off-line approaches, which combine CE fraction collection with desorption mass spectrometry (plasma desorption and matrix-assisted laser desorption ionization) for peptide and protein analyses, have been demonstrated (5-8). The advantage of off-line approaches is the possibility of independently optimizing both the CE separation and the mass spectral analysis. Nanoelectrospray (nES) is a new technique for characterizing biomolecules in small volumes (0.5-2 µl) at low picomole levels (9-11). In nES, signals from a single sample loading typically last more than 30 minutes, which permits optimization of instrument parameters and MS/MS sequencing with high sensitivity. Both nES/MS and nES/MS/MS data can be obtained from a single sample loading. These features make nES an attractive off-line technique for sequencing peptides collected from CE. In this paper, an off-line approach, which combines nES/MS analysis and Edman sequencing of peptide/protein fractions collected from CE, is presented. Automatic peak collection was accomplished using a computer-controlled Beckman P/ACE 5000 instrument (12). Several different samples containing 5-10 picomoles of material, including a peptide mixture (angiotensin-I, methionine enkephalin and substance-P), a tryptic digest of cytochrome-C and proteins such as myoglobin, insulin and lysozyme, were used to demonstrate this method.
II. Materials and Methods

A. Peptides and Proteins
Angiotensin-I, methionine enkephalin, substance-P, bovine trypsin, horse cytochrome-C, horse myoglobin, bovine insulin and egg-white lysozyme were purchased from Sigma Chemical Co. (St. Louis, MO) and used without further purification.

B. Enzymatic Digestion
Both trypsin and cytochrome-C were dissolved separately to 1 mg/ml in 100 mM NH4Ac, pH 8.1. 1 µl of the trypsin solution was added to 100 µl of the cytochrome-C solution. The reaction mixture was incubated at 37°C for 2 hours. The reaction was stopped by freezing.

C. Capillary Electrophoresis
A Beckman P/ACE 5000 (Schaumburg, IL) was used for all CE separations. The background electrolyte (BGE) was CH3CN/H2O/HCOOH (50:45:5). Washing solution was 1% NH4OH. All untreated fused-silica capillaries were obtained from Polymicro Technologies (Phoenix, AZ). Column dimensions were 97 cm x 75 µm. Samples ranging from 0.3 mg/ml to 2 mg/ml (each analyte) were pressure injected for 10 to 30 sec with the goal of loading 5-12 picomoles of each analyte. During separation, the voltage was maintained at 30 kV except during fraction collection, when the voltage was reduced to 7.5 kV. To collect a CE fraction, the following stepwise procedure was used on the Beckman P/ACE 5000 instrument: (1) turn off the voltage just before the peak enters the UV window, (2) switch to the collection vial, (3) turn on the voltage at 25% of the maximum until the peak elutes from the capillary, and (4) turn off the voltage, switch back to the original outlet vial and resume the run under the initial voltage conditions. The distance from the flow cell to the capillary outlet end is 7 cm. The fraction collection window is 5-10 minutes, depending on the width of the peak of interest. Fraction collection was accomplished by using the outlet carousel on the P/ACE 5000 to switch to a collection vial which contained 100 µl BGE. Each fraction was dried in vacuo. Samples were redissolved in 4 µl of CH3OH/H2O/HCOOH (50:50:0.1) and subjected to nES/MS and nES/MS/MS analysis and Edman sequencing. Protein samples collected from CE were subjected to Edman sequencing without drying the fractions.

D. Nano-electrospray Mass Spectrometry
All nES/MS and nES/MS/MS measurements were made on a Perkin-Elmer Sciex API-III triple quadrupole mass spectrometer (Thornhill, Canada) equipped with the nanoelectrospray source designed by Matthias Wilm and Matthias Mann at the European Molecular Biology Laboratory, Germany (9). The long signal duration allowed the instrumental parameters to be optimized for each peptide interrogated by nES/MS/MS. Argon was used as the collision gas. The collision gas thickness was 100-200 x 10^ atoms/cm².

E. Edman Sequencing
A PE-ABD 494 Procise protein sequencer was used for Edman sequence analysis of the collected CE fractions. Sample was transferred into a ProSpin (PE-ABD) pre-wetted with 20 µl of methanol. A gentle nitrogen stream was applied to reduce the concentration of acetonitrile. The volume was then brought up to 400 µl with water. The ProSpin unit was centrifuged to dryness at 5,600g. The PVDF disc was cut out and washed three times with water before being placed in a sequencing cartridge. A modified sequencing program based on NORMAL-BLOT was used.
III. Results and Discussion

Capillary electrophoresis is a rapid, reliable method for separating complex mixtures according to differences in the charge-to-mass ratio of each component. Fraction collection of the components of interest is necessary for further structural characterization by other analytical methods, e.g. mass spectrometry and Edman sequencing for peptides and proteins. Because of the need to maintain a high voltage across the capillary, fraction collection of analytes is relatively difficult by CE. One approach is to electrokinetically collect fractions into separate tubes by taking advantage of the automated outlet carousel available on the Beckman P/ACE 5000 CE system (12). First, it is necessary to establish reproducible electropherograms using solvent systems that are compatible with the desired subsequent MS analysis. Buffers such as Tris, phosphate or borate, though common for CE separations, are less compatible with electrospray MS. In addition, NaOH washes of the capillary introduce high levels of sodium into the system. Therefore, only volatile solvents and buffers were used in our initial experiments. The background electrolyte (BGE) contained only acetonitrile, water and formic acid, while 1% NH4OH was used as the capillary wash solution. These restrictions on the BGE composition limited the CE separation efficiency. Nonetheless, peptides and proteins could still be separated from one another under these conditions. The collected fraction was dried in vacuo in preparation for nES/MS analysis and Edman sequencing. Figure 1 depicts the nano-electrospray source. nES/MS offers several features that recommend it for use with fractions collected by CE. First, nES requires small loading volumes at low picomole levels. This minimizes the amount of dilution which the sample must undergo to be effectively introduced into the nES source. Second, the signal obtained by nES lasts for 30 minutes or longer.
This feature allows for the extensive interrogation of the sample by MS/MS on appropriately equipped instruments. As a result, all of the data can be generated from one CE run and one nES experiment in which both MS and MS/MS data are obtained. The nES needles, which are gold-coated, pulled glass capillaries, are generally treated as disposable so once loaded all of the data is acquired at one time. Sometimes a needle is defective or is damaged early in the nES experiment. This can result in the loss of the collected CE fraction.
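Two quantities in the CE procedure described in the Methods lend themselves to a rough numerical check: the amount of analyte delivered by a pressure injection, and the time a peak needs to travel from the detector window to the outlet once the voltage is reduced. The sketch below uses the Hagen-Poiseuille relation for the injected volume and assumes that migration velocity scales linearly with applied voltage. The injection pressure (0.5 psi) and the viscosity (1.0 mPa·s, water-like) are assumptions, not values given in the text, as is the molecular weight used for angiotensin-I; all function names are illustrative.

```python
import math

PSI_TO_PA = 6894.757


def injected_volume_nl(pressure_psi, seconds, id_um, length_cm, viscosity_pa_s):
    """Hydrodynamic injection volume (nL) from the Hagen-Poiseuille equation."""
    d = id_um * 1e-6            # inner diameter, m
    length = length_cm * 1e-2   # total capillary length, m
    dp = pressure_psi * PSI_TO_PA
    volume_m3 = dp * math.pi * d**4 * seconds / (128.0 * viscosity_pa_s * length)
    return volume_m3 * 1e12     # 1 m^3 = 1e12 nL


def injected_pmol(conc_mg_per_ml, mol_weight, volume_nl):
    """Amount injected (pmol) for a given concentration and injection volume."""
    molar = conc_mg_per_ml / mol_weight   # mg/ml equals g/L, so this is mol/L
    return molar * volume_nl * 1e3        # mol/L * nL = 1e-9 mol = 1e3 pmol


def time_to_outlet_min(t_detector_min, total_cm, detector_to_outlet_cm,
                       voltage_fraction):
    """Time (min) for a peak to go from detector to outlet at reduced voltage.

    Assumes velocity is proportional to the applied voltage.
    """
    to_detector_cm = total_cm - detector_to_outlet_cm
    full_voltage_time = t_detector_min * detector_to_outlet_cm / to_detector_cm
    return full_voltage_time / voltage_fraction


# Assumed: 0.5 psi injection, water-like viscosity, angiotensin-I MW ~1296.5.
vol = injected_volume_nl(0.5, 10.0, 75.0, 97.0, 1.0e-3)
print(f"10 s injection: {vol:.1f} nL")
print(f"0.3 mg/ml angiotensin-I: {injected_pmol(0.3, 1296.5, vol):.1f} pmol")

# Peak detected at 11.2 min at 30 kV, then collected at 25% of that voltage.
print(f"detector to outlet: {time_to_outlet_min(11.2, 97.0, 7.0, 0.25):.1f} min")
```

Under these assumptions a 10 s injection of a 0.3 mg/ml peptide solution delivers a few picomoles, and a peak needs several minutes at 7.5 kV to clear the last 7 cm of capillary, both in line with the loads and collection windows quoted in the Methods.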
[Figure 1 diagram labels: syringe for pressure; O rings; needle holder; sleeve; Au coated pulled capillary, 1-5 micron ID; conductive cement; interface plate, 100 V; orifice, 60-80 V; N2; to quadrupole MS]
Figure 1. Nano-electrospray source assembly. The sample capillaries are disposable. The capillary tip is touched against the interface to initiate flow. Voltage is then applied and the capillary is positioned in front of the skimmer cone to obtain signal.
A peptide mixture (angiotensin-I, substance-P and met-enkephalin) was used in our initial experiment. After reproducible CE runs were obtained, each peptide in the mixture was collected individually. Figure 2 shows the CE-UV electropherogram of the peptide mixture and the nES mass spectrum of peak "a" collected from a single CE run. The peaks at m/z 649.2 and 433.2 correspond to the doubly- and triply-charged ions of angiotensin-I, respectively. A fairly high background in the low-mass region was observed in every sample eluted from CE. To reduce the background in nES/MS analysis, the base wash with 1% NH4OH can be eliminated between CE runs. Figure 3 depicts the MS/MS spectra of the doubly- and triply-charged precursor ions of angiotensin-I (DRVYIHPFHL). These spectra were all obtained from a single sample loading. Although the doubly- and triply-charged ions of angiotensin-I showed relatively weak peaks in Figure 2, their corresponding MS/MS spectra showed good signal-to-noise ratios. Because the arginine residue is located near the N-terminus, the a- and b-series ions were predominant in both spectra (for nomenclature see ref 13). Two other peptides, substance-P and met-enkephalin, were also analyzed by nES/MS. Figure 4 shows the nES/MS and nES/MS/MS results for met-enkephalin (peak c) collected from the CE (see inset, Figure 2). Note that the C-terminal methionine was oxidized to the sulfoxide (m/z 590.2) during CE sample preparation. Oxidation of methionine to the sulfoxide was also observed in the substance-P CE fraction. The oxidation of the methionine-containing peptides could occur during the sample drying process. Figure 5 shows that the signal from met-enkephalin lasted for at least 26 minutes with no loss in signal intensity. All three peptides were also successfully sequenced by Edman degradation using half of the material from the same collected CE fractions.
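The assignment of m/z 590.2 to the sulfoxide of met-enkephalin can be checked with a short calculation of the singly protonated mass with and without one added oxygen. This is a sketch using standard monoisotopic masses; the function name and the small residue table are illustrative and cover only the residues of YGGFM.

```python
# Monoisotopic residue masses (Da) for the residues of met-enkephalin only.
MONO = {"Y": 163.06333, "G": 57.02146, "F": 147.06841, "M": 131.04049}
WATER = 18.01056
PROTON = 1.00728
OXYGEN = 15.99491  # mass added per methionine sulfoxide


def mh_plus(seq: str, n_oxidations: int = 0) -> float:
    """m/z of the singly protonated peptide with n_oxidations added oxygens."""
    residues = sum(MONO[aa] for aa in seq)
    return residues + WATER + PROTON + n_oxidations * OXYGEN


native = mh_plus("YGGFM")
oxidized = mh_plus("YGGFM", n_oxidations=1)
print(f"native MH+   = {native:.2f}")
print(f"oxidized MH+ = {oxidized:.2f}")
print(f"observed 590.2 differs from oxidized by {abs(oxidized - 590.2):.2f}")
```

The unmodified peptide would appear near m/z 574.2, so the observed ion at 590.2 corresponds to the addition of a single oxygen.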
Figure 2. Off-line CE-nES/MS of angiotensin-I: nES mass spectrum of the peak collected at 11.2 minutes. Inset: CE-UV electropherogram of the peptide mixture, with peak a (11.2 min) angiotensin-I, peak b (12.9 min) substance-P and peak c (18.2 min) met-enkephalin.

Figure 3. Off-line CE-nES/MS/MS of both the doubly- and triply-charged precursor ions of angiotensin-I (D-R-V-Y-I-H-P-F-H-L).

Figure 4. Off-line CE-nES/MS and nES/MS/MS of oxidized methionine enkephalin (Y-G-G-F-M; MH+ at m/z 590.2).

Figure 5. nES/MS spectra of 2 µl met-enkephalin collected from CE showing no loss of signal for 26 minutes (m/z 590.2; intensity 10,290,000 at t = 0 min and 10,360,000 at t = 26 min).
Once the standard peptide mixture had been successfully analyzed by off-line CE-nES/MS, the technique was attempted on a more complicated peptide mixture. More than 15 peptides were present in the 2-hour tryptic digest of cytochrome-C. CE separation of the peptides was achieved within 30 minutes. Peaks of interest were collected in 100 µl BGE, dried and redissolved in 4 µl of the nES solvent. Figure 6 shows the nES mass spectrum of the CE peak at 21.9 minutes (inset) from the tryptic digest. Only one peptide was detected in the CE fraction. Both the singly- and doubly-charged ions of the peptide (m/z 964.5 for the singly-charged ion) corresponding to residues 92-99 (EDLIAYLK) of cytochrome-C were observed. Figure 7 is the nES/MS/MS spectrum of the doubly-charged precursor ion at m/z 483.0 of the peptide (residues 92-99), yielding a complete series of "y" ions. Fragmentation of the doubly-charged ion was much easier than that of the singly-charged ion. Both the nES/MS and nES/MS/MS spectra of the peptide were obtained from a single loading of a 2-µl sample solution. Edman sequencing of the peak at 21.9 minutes, collected from a separate run, further verified the peptide sequence. Based on both MS and Edman sequencing data, there was no carryover between closely separated peaks.
Figure 6. nES/MS of the peak (21.9 min) collected from the CE separation of a tryptic digest of cytochrome-C (residues 92-99, E-D-L-I-A-Y-L-K; doubly-charged ion at m/z 483.0).

Figure 7. nES/MS/MS of the doubly-charged ion at m/z 482.9 (residues 92-99, E-D-L-I-A-Y-L-K).

Figure 8a. nES spectrum of myoglobin collected from CE. 12 pmoles of protein were loaded onto the CE.
Proteins such as insulin, myoglobin and lysozyme were also loaded onto the non-coated CE capillary and collected for nES/MS analysis and Edman sequencing. Because protein peaks are wider than peptide peaks on the non-coated column, the fraction collection window is relatively wide (about 10 minutes) under 7.5 kV. Figure 8a is the nES/MS spectrum of myoglobin collected from CE using about 6 picomoles of material. The multiply-charged state distribution produced in nES/MS for myoglobin was shifted to higher values compared with normal electrospray MS. Again, the measured mass was 16 Da higher than the calculated mass (16951.5 Da), corresponding to one oxygen added to the protein. Figure 8b is the Edman sequencing data from the myoglobin CE fraction using about 6 picomoles of protein. The samples collected from a CE run were usually not suitable for direct Edman sequencing. High background interference was often observed, which may result in a wrong sequence call. ProSpin can effectively eliminate small molecule contamination. Since the collected CE fraction contained 50% acetonitrile, the sample was partially pre-dried using a nitrogen stream before the centrifugation process. As Figure 8b shows, twelve N-terminal cycles of myoglobin were obtained from a single CE fraction with a quite clean background.
Figure 8b. Edman sequencing data of the CE fraction of myoglobin showing the N-terminal 12 cycles.
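The protein mass quoted above is obtained from a series of multiply-charged ions. A minimal sketch of that calculation follows: two adjacent peaks in the envelope fix the charge, and the charge and m/z together give the neutral mass. The charge states used here (15+ to 20+) are hypothetical, chosen only to illustrate the arithmetic, since the observed charge states are not listed in the text; the masses 16951.5 Da and +16 Da are the values quoted above.

```python
PROTON = 1.00728


def mz_series(mass: float, charges) -> dict[int, float]:
    """m/z of the [M + zH]z+ ions of a protein for each charge z."""
    return {z: (mass + z * PROTON) / z for z in charges}


def deconvolute(mz_lower_charge: float, mz_higher_charge: float):
    """Recover (z, M) from adjacent peaks carrying charges z and z + 1.

    mz_lower_charge is the peak at higher m/z (charge z);
    mz_higher_charge is its neighbour at lower m/z (charge z + 1).
    """
    z = round((mz_higher_charge - PROTON) / (mz_lower_charge - mz_higher_charge))
    mass = z * (mz_lower_charge - PROTON)
    return z, mass


calculated = 16951.5             # calculated myoglobin mass quoted in the text
oxidized = calculated + 16.0     # one added oxygen, as reported

# Hypothetical charge-state envelope for illustration only.
peaks = mz_series(oxidized, range(15, 21))
for z, value in peaks.items():
    print(f"{z}+ : m/z {value:.1f}")

z, mass = deconvolute(peaks[15], peaks[16])
print(f"charge {z}+, mass {mass:.1f} Da, shift {mass - calculated:+.1f} Da")
```

In practice the mass would be averaged over all adjacent pairs in the envelope rather than taken from a single pair.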
IV. Conclusion
An off-line approach that is simple and useful for peptide/protein sequencing using 5-10 picomoles of material has been demonstrated. Peptide and protein samples were first separated by capillary electrophoresis. Selected peaks were fraction collected and analyzed by both nano-electrospray mass spectrometry and Edman sequencing. A standard peptide mixture, a tryptic-digested protein and intact proteins were used to illustrate this method. Successful fraction collection of each component required reproducible electropherograms, the ability to automatically switch the outlet buffer vessel and the ability to maintain electrophoretic integrity while eluting a peak of interest into a small outlet buffer volume. Successful MS and MS/MS required the use of electrospray-compatible buffers in the initial CE separation, along with the nES source, which provided signals of sufficient duration to fully interrogate the ions of interest. Recently, Matthias Wilm et al. (10) reported a simple technique for the analysis of peptides isolated from polyacrylamide gel electrophoresis, using a perfusion sorbent for sample clean-up before nano-electrospray MS analysis. This approach might be very useful for sample clean-up of CE fractions, allowing the use of different CE buffers and different types of capillary columns. Work is under way using coated amine capillaries for better CE separation of proteins.
Acknowledgments

The authors gratefully acknowledge Dr. Thomas W. Keough and Dr. Kenny Morand for their help with the nano-electrospray source installation.
References

1. Cai, J. and Henion, J. (1995), J. Chromatography, 703, 667.
2. Pleasance, S., Thibault, P. and Kelly, J. (1992), J. Chromatography, 591, 325.
3. Locke, S.J. and Thibault, P. (1994), Anal. Chem., 66, 3436.
4. Sun, Y., Bauer, M.D. and Wang, F., "Analysis of Peptides and Proteins by On-line CE-ES/MS and Off-line CE-nES/MS", Proceedings of the 44th ASMS Conference on Mass Spectrometry and Allied Topics, Portland, OR (1996).
5. Herold, M. and Wu, S. (1994), LC-GC, 12, No. 7, 531.
6. Takigiku, R., Keough, T., Lacey, M.P. and Schneider, R.E. (1990), Rapid Commun. Mass Spectrom., 4(1), 24.
7. Keough, T., Takigiku, R., Lacey, M.P. and Purdon, M. (1992), Anal. Chem., 64, 1594.
8. Licklider, L., Kuhr, W.G., Lacey, M.P., Keough, T., Purdon, M.P. and Takigiku, R. (1995), Anal. Chem., 67, 4170.
9. Wilm, M.S. and Mann, M. (1994), International J. Mass Spectrom. and Ion Processes, 136, 167.
10. Wilm, M.S., Shevchenko, A., Houthaeve, T., Breit, S., Schweigerer, L., Fotsis, T. and Mann, M. (1996), Nature, 379, 466.
11. Shevchenko, A., Wilm, M.S., Vorm, O. and Mann, M. (1996), Anal. Chem., 68, 850.
12. Biehler, R. and Schwartz, H.E., Beckman Instruments technical bulletin, TIBC-105.
13. Roepstorff, P. and Fohlman, J., 1984, 11, 601.
CHARACTERIZATION OF A RECOMBINANT HEPATITIS E PROTEIN VACCINE CANDIDATE BY MASS SPECTROMETRY AND SEQUENCING TECHNIQUES
C. Patrick McAtee and Yifan Zhang
Genelabs Technologies, Redwood City, CA 94063
I. Introduction

A protein with an observed molecular weight of 62 kDa, derived from an open reading frame of the Hepatitis E virus, was expressed in a baculovirus expression vector and purified to homogeneity. The recombinant protein appeared to be a doublet by SDS-PAGE. Tryptic digestion in conjunction with mass spectrometry and sequence analysis indicated that the amino terminus was acetylated and that the internal sequences were in agreement with the predicted protein sequence. Reverse phase liquid chromatography coupled to electrospray MS (LC-MS) resolved the doublet protein into two major components of 56.5 and 58.1 kDa. Confirmation of the amino terminus of the molecule by LD-MS post source decay enabled us to tentatively assign the carboxyl terminus of each species. Sequencing of the intact protein by automated carboxyl terminal sequencing confirmed that the carboxyl terminus was truncated and that the sequence assignment predicted by LC-MS was correct.

II. Materials and Methods

A. Purification of the r62-kDa Protein
Purification of the r62-kDa protein was as described by McAtee et al. (1). The purified r62-kDa protein is shown in Figure 1.

Figure 1. 4-20% SDS-PAGE of Purified Recombinant 62-kDa. A. Lane 1, Molecular weight markers; lane 2, Purified final product. Molecular mass markers are Novex SeeBlue Pre-Stained Standards and range as follows (from top to bottom): Myosin, 250-kDa; BSA, 98-kDa; Glutamic dehydrogenase, 64-kDa; Alcohol dehydrogenase, 50-kDa; Carbonic anhydrase, 36-kDa; Myoglobin, 30-kDa; Lysozyme, 16-kDa; Aprotinin, 6-kDa; Insulin B chain, 4-kDa. B. Western blot of lane 2 from above. Samples were diluted 15 fold prior to SDS-PAGE and electrophoretic transfer.

B. In-gel Enzymatic Digestion
In-gel enzymatic digestion was carried out according to Williams and Stone (2), using perfusion in an approximately 1:5 (enzyme weight/substrate weight) ratio of modified trypsin (Promega) and digestion for 24 h at 37°C. The resulting peptides were reduced/carboxymethylated, extracted with 0.1% TFA, 60% CH3CN and then subjected to hydrolysis/amino acid analysis.

C. MALDI-TOF Mass Spectrometry
LD-MS was carried out on a 3 µl aliquot of tryptic peptide samples from RP-HPLC using a VG/Fisons TofSpec mass spectrometer that was operated in the positive-ion linear mode at an accelerating voltage of 25 kV. The instrument was equipped with a nitrogen laser (337 nm) and a 0.65 m linear flight tube. The data for peak 30.25 indicate a molecular mass of 1787.5 Daltons. The predicted molecular mass for this peptide (residues 1-17) is 1743.9 Daltons. This peptide failed to sequence by Edman degradation. A sequencing ladder consisting of residues 8-16 was generated by post source decay of the blocked peptide using a VG TofSpec SE LD-MS.

D. Protein/peptide Sequencing
Amino terminal sequencing was carried out on an Applied Biosystems 477 sequencer equipped with on-line HPLC for the identification of the resulting phenylthiohydantoin (PTH) amino acid derivatives. The 477 instrument was operated according to the manufacturer's recommendations, and 3 pmol PTH standards were routinely used. All sequences were searched via the BLAST Network Service operated by the National Center for Biotechnology Information. For automated C-terminal sequence analysis, protein samples were applied to Zitex membranes pretreated with isopropanol and inserted into inert Kel-F columns. The sequencer column was installed into a Hewlett Packard G1009A sequencer for chemical coupling and cyclization. The coupled and cyclized peptidylthiohydantoin product was cleaved to the C-terminal thiohydantoin-amino acid residue and the shortened peptide using an alkali salt of trimethylsilanolate (KOTMS). The derivatized sample was analyzed by an HP 1090 liquid chromatograph with filter photometric detection at 269 nm using a Hewlett Packard specialty (25 cm x 2.1 mm) reversed-phase PTH analytical HPLC column. A 39 min binary gradient (solvent A: phosphate buffer, pH 2.9; solvent B: acetonitrile) utilizing alkyl sulfonate as an ion pairing agent was developed. Thiohydantoin-amino acid standards at 100 pmol were used to standardize the analysis.

E. LC-MS (ES) Mass Spectrometry
r62-kDa protein and digests were chromatographed on a Vydac C18 reversed-phase microbore column (150 mm x 1 mm) using an ABI Model 41 OB dual syringe pumping system. The flow rate was maintained at 50 µl/min and elution was achieved using a linear gradient from 0.1% aqueous TFA to 0.1% TFA in acetonitrile. A Carlo Erba Phoenix 20 CU pump was used to deliver a mixture of methoxyethanol and isopropanol (1:1, v/v) at 50 µl/min, which was combined with the column eluent in a post column mixing chamber. An in-line flow splitter was used to restrict flow to the mass spectrometer to approximately 10 µl/min. Detection was performed immediately following elution from the column at 214 nm using an ABI 759A variable wavelength detector. Mass spectrometric detection was achieved following post column solvent addition and flow splitting by a VG BioQ triple quadrupole mass spectrometer. Spectra were recorded in the positive ion mode using electrospray ionization. Calibration of the instrument was performed in the range m/z 500-2000 by direct injection analysis of myoglobin. Spectra were recorded at 1.5 sec intervals, and a drying gas of nitrogen was used to aid evaporation of the solvent. The capillary voltage was maintained at approximately 4 kV with a source temperature of 60°C.
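Electrospray spectra of intact proteins consist of a series of multiply charged ions, and the molecular mass is recovered from adjacent peaks in that series. The sketch below illustrates the standard two-peak calculation only; it is not the algorithm of the instrument software used here. The peak positions are synthetic values generated from the 56548.5 Da mass reported in Figure 4, and the charge states (40 and 41) and function names are assumptions chosen for illustration.

```python
PROTON_MASS = 1.00728  # Da, mass of one proton


def mz_for_charge(neutral_mass, charge):
    """m/z of the [M + zH]z+ ion for a given neutral mass and charge."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_adjacent_peaks(mz_high, mz_low):
    """Estimate charge and neutral mass from two adjacent charge-state peaks.

    mz_high is assumed to carry charge z and mz_low charge z + 1.
    """
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    return z, z * (mz_high - PROTON_MASS)


# Synthetic peaks (assumed charge states, not measured data).
reference_mass = 56548.5
mz_40 = mz_for_charge(reference_mass, 40)
mz_41 = mz_for_charge(reference_mass, 41)

charge, mass = mass_from_adjacent_peaks(mz_40, mz_41)
print(charge, round(mass, 1))
```

Both synthetic peaks fall inside the m/z 500-2000 calibration range stated above, which is why charge states near 40 were chosen for the illustration.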
III. Results and Discussion
A. Tryptic Peptide Analysis/MALDI-TOF
A 2 pmole quantity of the 62-kDa protein was digested in situ with trypsin in an excised polyacrylamide gel slice. The resulting peptides were resolved by reversed-phase HPLC. Peaks detected by HPLC were selected for further analysis by sequencing, LC-MS, and MALDI-TOF. One peak with a retention time of approximately 30.25 failed to yield an interpretable sequence. On further examination, the mass observed by LD-MS was consistent with the N-terminal residues of the 62-kDa protein with the addition of an N-terminal acetyl group. Post source decay analysis revealed that this peptide was indeed the predicted amino terminal tryptic peptide (Figure 3). All other peptide peaks matched various internal sequences of the r62-kDa protein.
B. LC-MS and Carboxyl Terminal Sequence Analysis
In order to evaluate the nature of the 62-kDa protein doublet observed by SDS-PAGE, the purified r62-kDa protein was chromatographed on a Vydac C18 reversed-phase column with the eluting peak evaluated by electrospray mass spectrometry (LC-MS (ES)). The r62-kDa protein resolved into two primary components corresponding to 56.5 and 58.1 kDa, respectively. The predicted molecular mass of the r62-kDa protein using the coding sequence of residues 112 to 660 of the ORF-2 region is 59.1 kDa (Figure 4). These data suggested that a deletion occurred in the molecule, most likely at the amino or carboxyl terminus. The protein was found not to be glycosylated (data not shown) either by periodate oxidation or by GC-MS analysis. With the confirmation of the amino terminus, the ES-MS data suggested that the carboxyl terminus may
Peptide 30.25
Peptide 42.2b
AVAPAHDTPPVPDVDSRGAILRRQYNLSTSPLTSSVATGTNLVLYAAPLSPL Peptide 15.1
Peptide 38.3
LPLQDGTNTHIMATEASNYAQYRVARATIRYRPLVPNAVGGYAISISFWPQT Peptide 33.6
Peptide 22.5 Peptide 22.1
TTTPTSVDMNSITSTDVFliLVQPGIASELVIPSERLHYRNQGWR'sVETSGVA Peptide 23
EEEATSGLVMLCIHGSLVNSYTNTPYTGALGLLDFALELEFR'NLTPGNTNTR' Peptide 11.7
Peptide 25.2
Peptide 31.1a
VSRYSSTAR'HRLRR'GADGTAELTTTAATRFMKbLYFTSTNGVGEIGRGIALT Peptide 60.8
Peptide 42.2a
LFNLADTLLGGLLPTELISSAGGQLFYSRPVVSANGEPTVKLYTSVENAQQDK' Peptide 31.1b
Peptide 30.9a
GIAIPHDIDLGESFIVVIQDYDNQHEQDRPTPSPAPSRPFSVLR'ANDVLWLSL Peptide 26.9
TAAEYDQSTYGSSTGPVYVSDSVTLVNVATGAQAVARSLDWTKVTLDGRPLS Peptide 36.5
Peptide 62.8
Peptide 30.3
TIQQYSKTFFVLPLR'GKLSFWEAGTTKAGYPYNYNTTASDQLLVENAAGHRV IL6 epitope
AISTYTTSLGAGPVSISAVAVLAPHSALALLEDTLDYPARAHTFDDFCPECR PLGLQGCAFQSTVAELQRLKMKVGKTREL
↑ 56.54 kDa species    ↑ 58.16 kDa species
Figure 2. Sequence analysis of r62-kDa tryptic peptides. Tryptic peptide sequences are indicated in the figure. IUPAC nomenclature is used for amino acid abbreviation.
Figure 3. MALDI-TOF post source decay of r62-kDa amino terminal tryptic peptide. Reprinted from McAtee et al. "Purification and Characterization of a Recombinant Hepatitis E Protein Vaccine Candidate by Liquid Chromatography-Mass Spectrometry" with kind permission from Elsevier Science-NL, Sara Burgerhartstraat 25, 1055 KV Amsterdam, The Netherlands.
Figure 4. LC-MS Electrospray MS Analysis of r62-kDa. A. Positive ion ES-MS multiply charged spectra. B. Deconvoluted spectra (mass axis 55500-59000; labeled peaks at 56548.5 and 58161.4). Reprinted from McAtee et al. "Purification and Characterization of a Recombinant Hepatitis E Protein Vaccine Candidate by Liquid Chromatography-Mass Spectrometry" with kind permission from Elsevier Science-NL, Sara Burgerhartstraat 25, 1055 KV Amsterdam, The Netherlands.
be clipped between residues 551-552 and residues 536-537 (Figure 2). Automated carboxyl terminal sequencing was performed using intact r62-kDa protein to confirm the putative carboxyl terminal processing. The initial sequencing cycle gave rise to two very strong peaks corresponding to glutamine and lysine, neither of which is located at the predicted carboxyl terminus of the r62-kDa protein. The second cycle gave a very strong (> 200 pmole) leucine peak, indicating the presence of more than one leucine in the polypeptide mixture. The third cycle was somewhat ambiguous due to increasing background noise. However, arginine was clearly present in the third cycle along with either a glutamic acid or a glycine residue. Thus, the carboxyl sequencing data support the existence of a heterogeneous, truncated protein.
IV. Conclusion
We have constructed a baculovirus vector that directs the efficient expression of a recombinant 59.1-kDa protein encoded by the Hepatitis E virus ORF-2 region. This protein was purified by chromatographic means and found to be a doublet by Tricine SDS-PAGE. Tryptic peptide analysis revealed as many as 143 peaks by reversed-phase HPLC. Peaks were selected for LD-MS to determine structural integrity and potential post-translational modifications. One peak did not yield a sequence by Edman degradation. However, the molecular mass matched the predicted mass for the amino terminal tryptic peptide, taking into consideration the removal of the N-terminal methionine by a cellular aminopeptidase followed by the acylation of the adjacent alanine residue. Post source decay analysis by laser desorption mass spectrometry indicated that the peptide was the amino terminal tryptic peptide. LC-MS using electrospray MS established the true molecular masses of the 62-kDa doublet that was observed by SDS-PAGE. With the confirmation of the amino terminus in previous experiments, it was possible from ES-MS data to predict the putative carboxyl terminal processing steps that gave rise to the bimodally distributed '62-kDa' species. Automated carboxyl terminal sequencing validated the predicted carboxyl terminal processing of the protein and firmly established residues 551-552 and 536-537 as the carboxyl termini of the 58.1-kDa and 56.5-kDa proteins, respectively. In previous studies, we found that a 62-kDa HEV ORF-2 derived protein produced in baculovirus represented an improved antigen in comparison to bacterially expressed proteins in HEV diagnostic assays (3). The excellent immunogenic properties of this antigen were also apparent, as we were able to elicit protective immune responses in primates after heterologous challenge with HEV (4). These observations suggest that the baculovirus expressed protein may contain an immunologic structure that closely resembles the native virus capsid protein.
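The mass argument for the blocked amino terminal peptide can be checked with a short calculation. The sketch below is illustrative only: it assumes that residues 1-17 are the sequence AVAPAHDTPPVPDVDSR at the start of Figure 2, that standard average residue masses apply, and that the 1787.5 value observed by LD-MS is the singly protonated ion. None of these assumptions is stated explicitly in the text.

```python
# Average residue masses (Da) for the residues in this peptide only.
AVERAGE_RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "D": 115.0886, "H": 137.1411,
    "P": 97.1167, "S": 87.0782, "T": 101.1051, "V": 99.1326,
}
WATER = 18.0153   # Da, added once per peptide chain
ACETYL = 42.0367  # Da, net addition of an acetyl group (C2H2O)
PROTON = 1.00728  # Da


def average_mass(sequence):
    """Average neutral mass of an unmodified linear peptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER


peptide = "AVAPAHDTPPVPDVDSR"  # assumed residues 1-17 (from Figure 2)
unmodified = average_mass(peptide)
acetylated_mh = unmodified + ACETYL + PROTON  # acetylated [M+H]+

observed = 1787.5  # Da, LD-MS value for peak 30.25
error_da = observed - acetylated_mh

print(round(unmodified, 1), round(acetylated_mh, 1), round(error_da, 2))
```

Under these assumptions the unmodified mass reproduces the predicted 1743.9 Da, and the acetylated, protonated form lies within about 1 Da of the observed value.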
References
1. McAtee, C. P., Zhang, Y., Yarbough, P. O., Fuerst, T. R., Stone, K. L., Samander, S., and Williams, K. R. (1996) J. Chromatography B (in press).
2. Williams, K. R. and Stone, K. L. (1995) In Techniques in Protein Chemistry VI (Crabb, J. W., Ed.), pp. 143-152, Academic Press, San Diego.
3. McAtee, C. P., Zhang, Y., Yarbough, P. O., Bird, T., and Fuerst, T. R. (1996) Prot. Exp. Pur. (in press).
4. Fuerst, T. R., Yarbough, P. O., Zhang, Y., McAtee, C. P., Tam, A. W., McCaustland, K. A., Garcon, N., Spelbring, J., Carson, D., Myriam, F., Lifson, J. D., Slaoui, M., Prieels, J.-P., Margolis, H., and Krawczynski, K. (1996) In Enterically-Transmitted Hepatitis Viruses (Y. Buisson, P. Coursaget, and M. Kane, eds.), pp. 384-392, La Simarre, Joue-les-Tours (France).
5. Tam, A. W., Smith, M. M., Guerra, M. E., Huang, C. C., Bradley, D. W., Fry, K. E., and Reyes, G. R. (1991) Virology 185, 120-131.
Comparison of the High Sensitivity and Standard Versions of Applied Biosystems Procise™ 494 N-Terminal Protein Sequencers using Various Sequencing Supports
Anita E. Lavin, Lee Anne Merewether, Christi L. Clogston, and Michael F. Rohde
Amgen Inc., Amgen Center, Thousand Oaks, California 91320
INTRODUCTION
Protein sequencing via N-terminal Edman degradation continues to be a versatile and valuable tool for directly obtaining and confirming the primary amino acid sequence of proteins and peptides. In combination with other orthogonal methods, N-terminal sequencing provides key structural information in the characterization of recombinant proteins during all stages of the research and development process. Since the advent of automated N-terminal sequencing there has been a continual drive to improve the sensitivity of the technique. Spinning cup sequenators were designed to sequence 100 nmol of sample with a practical range between 10-50 nmol (1). Development of a sequenator using gas-liquid chemistry and a solid phase support, by Hood and Hunkapiller, significantly improved the sensitivity of automated Edman degradation, with a routine range of 0.05-5 nmol (2). Applied Biosystems developed the first commercially available sequencer (470A) based on their design. The 470A can routinely yield results at the 10-100 pmol level (3). During the last decade the sensitivity of commercially available N-terminal sequencers has continued to improve. Applied Biosystems' standard Procise™ 494 N-terminal sequencer (Procise™ 494HT) can now be routinely operated at less than 10 pmol. The current trend is to advance N-terminal Edman sequencing to the subpicomole level. The most significant limitation to achieving this goal, using the established instrumental configuration, is the detection of PTH-amino acids by conventional HPLC (4). Applied Biosystems has now developed a high sensitivity version of the Procise™ 494 N-terminal sequencer (Procise™ 494HS), which employs a capillary HPLC system with Micro-Syringe Pumps. The 494HS facilitates the detection of PTH-amino acids at the subpicomole level. Comparison of the signal enhancement and sensitivity of the 494HS and the 494HT is the primary objective of this investigation. Amgen Inc. is interested in validating the utility of high sensitivity sequencing because this technique is critical in our attempt to identify minute amounts of novel factors and potential therapeutics. An additional objective is to compare the repetitive and initial yields of various types of proteins using different sequencing supports. The sequencing supports utilized in this study include ProSorb™, ProBlott™, and BioBrene™-treated TFA-activated glass fiber filters.
MATERIALS AND METHODS
A. Instrumental
Protein sequencing was performed on two versions of the Applied Biosystems (Foster City, CA) Procise™ 494 protein sequencer. The high sensitivity version, the 494HS, had a capillary HPLC system with a 0.8 mm column, a 2.4 µl flowcell with a 6 mm path length, and a 50 µl loop, and was run at a flow rate of 40 µl/min. The injection volume is 55-65 percent of the volume in the flask. The 494HS utilized various types of guard columns to extend the life of the analytical column. The standard version, the 494HT, had an HPLC system with a 2.1 mm column, a 12 µl flowcell with an 8 mm path length, and an 80 µl loop, and was run at a flow rate of 325 µl/min. The injection volume is 70-80 percent of the volume in the flask.
B. Materials
Bovine β-lactoglobulin (BLG), ProSorb™ cartridges (PS), Mini ProBlott™ membranes (PB), trifluoroacetic acid (TFA) activated glass fiber filters (GF), BioBrene™, and sequencing reagents were purchased from Applied Biosystems. Human serum albumin (HSA) was purchased from Hewlett Packard (Palo Alto, CA). Recombinant human erythropoietin (EPO) and recombinant human granulocyte colony-stimulating factor (GCSF) were produced and purified as previously described (5-8). Pre-cast Tris-glycine gels were purchased from Novex (San Diego, CA).
C. Methods
The concentrations of EPO (6 pmol/µl) and GCSF (10 pmol/µl) were quantified by amino acid analysis. Quantitative loading of proteins onto the sequencing supports or into the SDS gels was done using calibrated pipettes. BLG (5 pmol/µl) was quantitatively diluted according to Applied Biosystems recommendations. HSA (1.11 pmol/µl) was used as received from Hewlett Packard. Proteins were loaded onto ProSorb™ cartridges using the following method. First, 10 µl of methanol was loaded onto the PVDF to wet the membrane and the excess was removed. Next, 100 µl of 0.1% TFA in 20% acetonitrile in water (S4B) was loaded, the protein samples were loaded into the TFA/S4B solution, and the wick was inserted. Once the solution had passed through the PVDF membrane, the support was dried with nitrogen. Then 5 µl of a diluted BioBrene™ solution was loaded onto the PVDF. The BioBrene™ solution contained 70% methanol, 20% BioBrene™ (75 µg/µl), and 10% of 1% TFA. Protein samples were run on 12% pre-cast, 1.0 mm, 10-well Tris-glycine SDS gels under non-reducing conditions. The samples were immediately electroblotted onto Mini ProBlott™ PVDF membrane using a semi-dry blotting apparatus. Following blotting, the membrane was washed with HPLC grade water, stained with 0.2% Coomassie R-250, and then air dried. Glass fiber filters were coated with BioBrene™ and run for three preconditioning cycles. The 494HS 6 mm filters were treated with 750 µg of BioBrene™ and the 494HT 9 mm filters were treated with 1000 µg of BioBrene™.
All the protein samples were sequenced for 15 residues. Each protein was run on all three of the sequencing support types on both of the 494 sequencers. Each protein was loaded in 10, 5, 2.5, and 1 pmol quantities. Additional samples of BLG were loaded onto the 494HS sequencer to determine the minimum amount of BLG that could be detected and sequenced for 15 cycles. BLG was loaded in 750, 500, 250, and 125 fmol quantities.
D. Data Analysis
The Applied Biosystems 610 software was used to calibrate the standard, analyze and call the sequence, and calculate the background corrected repetitive and initial yields. The 610 sequence calls were verified and adjusted manually when necessary. Both the repetitive and initial yields were entered as percent values. The percent repetitive yield was obtained directly from the 610 software. The percent initial yield was obtained by dividing the pmol amount indicated by the 610 software by the pmol amount loaded and multiplying by 100. The average and standard deviation were calculated for the repetitive and initial yields for each protein, on each support, using the values obtained for each quantity loaded.
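The initial yield and summary statistics described above amount to simple arithmetic, sketched here for illustration. The function name and the 0.6 pmol example are hypothetical. The four initial yields are the values listed for HSA on ProSorb in the 494HS results table, and the use of the sample (n - 1) standard deviation is an assumption that appears consistent with the tabulated summary values.

```python
from statistics import mean, stdev


def percent_initial_yield(pmol_indicated, pmol_loaded):
    """Initial yield as a percentage of the amount loaded."""
    return 100.0 * pmol_indicated / pmol_loaded


# Hypothetical run: 0.6 pmol indicated by the software for 1 pmol loaded.
example_yield = percent_initial_yield(0.6, 1.0)

# Initial yields (%) for HSA on ProSorb at 10, 5, 2.5, and 1 pmol (494HS).
hsa_prosorb_initial_yields = [29, 20, 32, 20]
average_yield = mean(hsa_prosorb_initial_yields)
yield_sd = stdev(hsa_prosorb_initial_yields)  # sample standard deviation

print(example_yield, round(average_yield), round(yield_sd))
```

Rounded to whole percentages, the average and standard deviation agree with the 25 and 6 reported for this protein and support.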
RESULTS AND DISCUSSION
Comparison of the 494HS and the 494HT was done from three approaches. First, the absolute and relative signal enhancement were determined. The absolute signal enhancement was determined by comparing the standards from each instrument. The relative signal enhancement was determined by comparing equivalent runs from each instrument. Then the limit of detection for each instrument was determined for BLG on BioBrene™-treated glass fiber filters. Comparisons were made based on the amount of standard in the flask or loaded onto the filter and not on the amount injected onto the column. Finally, the repetitive and initial yields for HSA, EPO, GCSF, and BLG were determined for each of the sequencing supports at various quantities.
A. Signal Enhancement
Determination of the absolute signal enhancement was done by comparing the ratio of the pmol/mAU for the respective PTH-amino acid (PTH-AA) standards. The PTH-AA standard was run routinely at the 1 pmol level on the 494HS and at the 4 pmol level on the 494HT. Figure 1 compares the PTH-AA standards from both the 494HS (top) and 494HT (bottom). Both the PTH-AA standards shown in Figure 1 are 2 mAU full-scale. Table I reports the pmol/mAU ratio for each PTH-AA in both the 494HS and 494HT standards shown in Figure 1. Over the entire range of PTH-AA the absolute signal enhancement for the 494HS is three-fold over the 494HT.
Figure 1: Reference standards from the Procise™ 494HS (top) 1 pmol and 494HT (bottom) 4 pmol. The standards are displayed at 2 mAU full-scale.

Table I: Absolute Signal Enhancement based on ratio of the observed pmol/mAU for the 494HT versus the 494HS

PTH-AA               494HS pmol/mAU   494HT pmol/mAU   Signal enhancement
D                    0.93             2.61             2.81
N                    1.14             3.13             2.75
S                    1.46             4.22             2.89
Q                    1.04             3.34             3.21
T                    1.28             3.90             3.05
G                    1.40             3.88             2.77
E                    0.95             2.76             2.91
H                    1.66             4.20             2.53
A                    1.26             3.87             3.07
R                    1.64             4.48             2.73
Y                    1.23             4.03             3.27
P                    1.39             5.31             3.82
M                    1.14             4.00             3.51
V                    1.19             4.18             3.51
W                    1.10             3.85             3.50
F                    1.28             4.63             3.62
I                    1.14             4.39             3.85
K                    1.23             2.89             2.35
L                    1.21             4.30             3.55
Average              1.25             3.89             3.14
Standard Deviation   0.20             0.69             0.44
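The signal enhancement column of Table I is the 494HT pmol/mAU value divided by the 494HS value for each PTH-amino acid. The short sketch below recomputes the ratios and their average from the tabulated values; the variable names are illustrative and the code is not part of the 610 software.

```python
from statistics import mean

residues = "D N S Q T G E H A R Y P M V W F I K L".split()

# pmol/mAU values from Table I, in the residue order above.
hs_pmol_per_mau = [0.93, 1.14, 1.46, 1.04, 1.28, 1.40, 0.95, 1.66, 1.26, 1.64,
                   1.23, 1.39, 1.14, 1.19, 1.10, 1.28, 1.14, 1.23, 1.21]
ht_pmol_per_mau = [2.61, 3.13, 4.22, 3.34, 3.90, 3.88, 2.76, 4.20, 3.87, 4.48,
                   4.03, 5.31, 4.00, 4.18, 3.85, 4.63, 4.39, 2.89, 4.30]

# A lower pmol/mAU means more signal per pmol, so HT / HS gives the
# fold enhancement of the 494HS over the 494HT.
enhancement = {
    aa: ht / hs
    for aa, hs, ht in zip(residues, hs_pmol_per_mau, ht_pmol_per_mau)
}
average_enhancement = mean(list(enhancement.values()))

print(round(enhancement["D"], 2), round(average_enhancement, 2))
```

The recomputed average agrees with the roughly three-fold enhancement reported in the table.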
Equivalent sequencing runs were compared to determine the relative signal enhancement. Figures 2 and 3 show two equivalent runs of EPO at the 1 pmol level on both the 494HS and the 494HT. The 494HS EPO run had a repetitive yield of 90% and an initial yield of 60%. The 494HT EPO run had a repetitive yield of 91% and an initial yield of 50%. The runs shown in figures 2 and 3 are 1.25 mAU full-scale. Table II reports the pmol/mAU ratio for the six PTH-amino acids shown in the EPO runs in figures 2 and 3. In this protein sequencing run the relative signal enhancement for the 494HS is three-fold over the 494HT.
Figure 2: The first six residues of 1 pmol of EPO sequenced on the Procise™ 494HS sequencer. The cycles are displayed at 1.25 mAU full-scale. The first six residues of EPO are APPRLI. Peak annotations in the figure: A = 750 fmol; P = 580 fmol.
Figure 3: The first six residues of 1 pmol of EPO sequenced on the Procise™ 494HT sequencer. The cycles are displayed at 1.25 mAU full-scale. The first six residues of EPO are APPRLI.
Table II: Relative signal enhancement based on ratio of the observed pmol/mAU for the 494HT versus the 494HS

PTH-AA               494HS pmol/mAU   494HT pmol/mAU   Signal enhancement
A                    1.47             2.88             1.96
P                    2.09             6.13             2.93
P                    3.03             7.69             2.54
R                    4.35             20.00            4.60
L                    2.39             7.14             2.99
I                    2.90             11.11            3.83
Average              2.71             9.16             3.14
Standard Deviation   0.99             5.94             0.94
B. Limit of Detection for BLG
The lowest amount of BLG that could be detected and sequenced for 15 residues was determined for the 494HS and 494HT. Figure 4 is a 125 fmol BLG run obtained from the 494HS. Figure 5 is a 1 pmol BLG run obtained from the 494HT. The 494HS 125 fmol BLG run had a repetitive yield of 95% and an initial yield of 64%. The 1 pmol 494HT BLG run had a repetitive yield of 97% and an initial yield of 50%. Both runs are shown at 1.25 mAU full-scale. Based on the comparison of the BLG runs, the limit of detection for the 494HS (125 fmol) is eight-fold lower than that for the 494HT (1 pmol).
Figure 4: The first six residues of 125 fmol of BLG sequenced on the Procise™ 494HS sequencer. The cycles are displayed at 1.25 mAU full-scale.
Figure 5: The first six residues of 1 pmol of BLG sequenced on the Procise™ 494HT sequencer. The cycles are displayed at 1.25 mAU full-scale.
C. Repetitive and Initial Yields
Results for the repetitive and initial yields obtained for the 494HS and 494HT from all the sequencing runs are summarized in Table III and Table IV, respectively. Over the entire spectrum of proteins and sequencing supports, the 494HS had an average repetitive yield of 91% with an average initial yield of 38%. For BLG the 494HS had an average repetitive yield of 94% and an average initial yield of 52%. The 494HT had an average repetitive yield of 91% with an average initial yield of 32%. For BLG the 494HT had an average repetitive yield of 95% and an average initial yield of 53%. Comparison of the repetitive yields for each protein on the different sequencing supports for the 494HS indicates that the best repetitive yield for
Table III: High Sensitivity Procise™ 494 Sequencer - Percent Repetitive Yield (RY) and Percent Initial Yield (YO) are reported for each individual analysis obtained

                          Prosorb      PVDF         Glass Fiber
Protein   Load (pmol)     RY    YO     RY    YO     RY    YO
HSA       10              98    29     99    12     90    52
HSA       5.0             93    20     94    20     92    38
HSA       2.5             92    32     91    16     90    40
HSA       1.0             97    20     89    20     95    30
          Average         95    25     93    17     92    40
          Std. Dev.        3     6      5     4      2     9
GCSF      10              98    17     87    24     90    80
GCSF      5.0             88    34     85    14     90    44
GCSF      2.5             86    28     84    12     88    68
GCSF      1.0             79    40     79    20     90    80
          Average         88    30     84    18     89    68
          Std. Dev.        8    10      3     6      1    17
EPO       10              95    38     86     9     93    60
EPO       5.0             92    50     85    10     93    60
EPO       2.5             91    36     80    12     90    60
EPO       1.0             91    20     83    30     90    60
          Average         92    36     83    15     91    60
          Std. Dev.        2    12      3    10      2     0
BLG       10              95    34     92    35      *     *
BLG       5.0             97    30     94    92      *     *
BLG       2.5             94    36     94    76      *     *
BLG       1.0             94    30     92    60      *     *
          Average         95    33     93    66     92    60
          Std. Dev.        2     3      1    24      4    29

* Only three glass fiber analyses are listed for BLG (RY 88, 94, 94; YO 43, 94, 44); the loads to which they correspond cannot be recovered from the source.
HSA was 95% using PS followed by 93% using PB and 92% using GF. The highest repetitive yield for GCSF was 89% using GF followed by 88% using PS and 84% using PB. The best repetitive yield for EPO was 92% using PS and GF followed by 83% using PB. The highest repetitive yield for BLG was 95% using PS followed by 93% using PB and GF. Comparison of the initial yields for each protein on the different sequencing supports for the 494HS indicates that the best initial yield for HSA was 40% using GF, followed by 25% using PS and 17% using PB. The highest initial yield for GCSF was 68% using GF, followed by 30% using PS and 18% using PB. The best initial yield for EPO was 60% using GF, followed by 36% using PS and 15% using PB. The highest initial yield for BLG was 66% using PB, followed by 58% using GF and 33% using PS. Comparison of the repetitive yields for each protein on the different sequencing supports for the 494HT indicates that the best repetitive yield for HSA was 96% using PS followed by 94% using PB and GF. The highest repetitive yield for GCSF was 90% using GF followed by 89% using PS and 83% using PB. The best repetitive yield for EPO was 91% using GF followed by 90% using PS and 82% using PB. The highest repetitive yield for BLG was 96% using GF followed by 95% using PS and PB.
Table IV: Standard Procise™ 494 Sequencer - Percent Repetitive Yield (RY) and Percent Initial Yield (YO) are reported for each individual analysis obtained

                          Prosorb      PVDF         Glass Fiber
Protein   Load (pmol)     RY    YO     RY    YO     RY    YO
HSA       10              96    13     98    32†    94    35
HSA       5.0             95     6     94    16†    96    28
HSA       2.5             95    16     90    20†    94    32
HSA       1.0             98    20     96     7†    92    20
          Average         96    14     94    19     94    29
          Std. Dev.        1     6      3    10      1     7
GCSF      10              87    38     78    23     92    23
GCSF      5.0             87    22     84    14     90    28
GCSF      2.5             89    12     83     8     88    32
GCSF      1.0             90    20    102     4     89    20
          Average         88    23     87    12     90    26
          Std. Dev.        2    11     11     8      2     5
EPO       10              87    16     83    17     93    49
EPO       5.0             91    32     84    16     94    66
EPO       2.5             92    28     87    12     91    44
EPO       1.0             91    30     75    20     91    50
          Average         90    27     82    16     92    52
          Std. Dev.        2     7      5     3      1    10
BLG       10              95    41     95    33     96    43
BLG       5.0             94    54     95    62     95    50
BLG       2.5             96    36     94    76     97    68
BLG       1.0             95    40     93    80     97    50
          Average         95    43     95    63     96    53
          Std. Dev.        1     8      1    21      1    11

† The four HSA PVDF initial yields are listed in the order in which they appear in the source; their assignment to individual loads is uncertain.
Comparison of the initial yields for each protein on the different sequencing supports for the 494HT indicates that the best initial yield for HSA was 29% using GF, followed by 19% using PB and 14% using PS. The highest initial yield for GCSF was 26% using GF, followed by 23% using PS and 11% using PB. The best initial yield for EPO was 47% using GF, followed by 27% using PS and 16% using PB. The highest initial yield for BLG was 63% using PB, followed by 53% using GF and 43% using PS.
CONCLUSION
Comparison of the signal enhancement and limit of detection indicated that the 494HS is significantly more sensitive than the 494HT. Several factors contribute to this increase in sensitivity. The most notable is the combination of the three-fold signal enhancement with the decrease in the detector noise and background noise from the sequencing chemistry. Chemical artifact peaks such as aniline, "Co-Q", and DPU are greatly reduced as a result of the customized cycles on the 494HS compared to the cycles on the 494HT. In addition to reducing the artifact peaks, the 494HS cycles significantly alleviate the baseline "smile" that has characterized capillary HPLC PTH-AA separations described previously (9).
The repetitive yields on the 494HS and 494HT are comparable over the entire range of proteins and supports. The initial yield on the 494HS is about 5% better than on the 494HT when all the proteins on the various supports are compared. With respect to BLG, the 494HS had about a 2% lower repetitive yield than the 494HT but a comparable initial yield. Overall the 494HS appears to have a slightly lower repetitive yield and an improved initial yield relative to the 494HT. Over the entire range of proteins sequenced, the best repetitive yields were seen using GF, with PS a close second and PB several percent lower. Initial yields for protein samples on GF were significantly higher than those on PS and PB, which were essentially equal. GF appears to be the best overall sequencing support, especially for small proteins and glycoproteins. This is most likely due to reduced washout from the protein being securely embedded in the BioBrene™ matrix. Overall the Procise™ 494HS is eight times more sensitive than the Procise™ 494HT, with a slightly lower repetitive yield and an improved initial yield. With respect to sequencing supports, glass fiber filters treated with BioBrene™ are the best sequencing support, especially for small proteins and glycoproteins.
ACKNOWLEDGMENTS
Our thanks go out to Steve O'Neill, Kent Yamada, and Applied Biosystems for their efforts in developing and supporting the 494HS. We also thank Scott Lauren for the amino acid analyses and Hsieng Lu for his on-going support.
REFERENCES
1) Walsh, K.A., Ericsson, L.H., Parmelee, D.C., & Titani, K. (1981) Ann. Rev. Biochem. 50, 261-281
2) Hewick, R.M., Hunkapiller, M.W., Hood, L.E., & Dreyer, W.J. (1981) J. Biol. Chem. 256, 7990-7997
3) LeGendre, N. & Matsudaira, P. (1988) BioTech. 6, 154-159
4) Blacher, R.W. & Wieser, J. (1993) Tech. Prot. Chem. IV, 427-433
5) Takeuchi, M., Inoue, N., Strickland, T.W., Kubota, M., Wada, M., Shimizu, R., Hoshi, S., Kozutsumi, H., Takasaki, S., & Kobata, A. (1989) Proc. Natl. Acad. Sci. USA 86, 7819-7822
6) Narhi, L.O., Arakawa, T., Aoki, K.H., Elmore, R., Rohde, M.F., Boone, T., & Strickland, T.W. (1991) J. Biol. Chem. 266, 23022-23026
7) Souza, L.M., Boone, T.C., Gabrilove, J., Lai, P.H., Zsebo, K.M., Murdock, D.C., Chazin, V.R., Bruszewski, J., Lu, H.S., Chen, K.K., Barendt, J., Platzer, E., Moore, M.A.S., Mertelsmann, R., & Welte, K. (1986) Science 232, 61-65
8) Lu, H.S., Boone, T.C., Souza, L.M., & Lai, P.H. (1989) Arch. Biochem. Biophys. 268, 81-92
9) Rohde, M.F., Clogston, C.L., Merewether, L.M., Derby, P., & Nugent, K.D. (1995) Tech. Prot. Chem. VI, 201-208
Evaluation of ABRF-96SEQ: A Sequence Assignment Exercise
Joseph Fernandez¹, Arie Admon², Karen De Jongh³, Greg Grant⁴, William Henzel⁵, William S. Lane⁶, Kathryn L. Stone⁷, and Barbara Merrill⁸
¹ Protein/DNA Technology Center, The Rockefeller University, New York, N.Y. 10021
² Department of Biology, Technion, Haifa 32000, Israel
³ ZymoGenetics, Seattle, WA 98102
⁴ Department of Molecular Biology and Pharmacology, Washington University School of Medicine, St. Louis, MO 63110
⁵ Genentech Inc., South San Francisco, CA 94080
⁶ Microchemistry Facility, Harvard University, Cambridge, MA 02138
⁷ W.M. Keck Foundation Biotechnology Resource Laboratory, Yale University School of Medicine, New Haven, CT 06510
⁸ Glaxo Wellcome, Research Triangle Park, NC 27709
I. INTRODUCTION
The Association of Biomolecular Resource Facilities (ABRF) Protein Sequence Research Committee was established in 1988 in order to provide individual laboratories with a means of self-evaluation. Each year test samples have been distributed, giving laboratories an opportunity to monitor their performance in areas such as sample handling, instrument operation/optimization, and data interpretation. In previous years these samples have focused on sensitivity of protein sequencing (1, 6), sample heterogeneity (2, 8), protein-bound peptides on PVDF membrane or in solution (3, 4), post-translational modifications (5), identification of cysteine and tryptophan (7), and length of sequence assignment (8). These previous studies found that many facilities have a low degree of accuracy in assigning positive correct calls, and have difficulty determining where a sequence ends. Such difficulties may arise from inadequate sample handling, sub-optimal instrument operation, or misinterpretation of the data obtained. ABRF-96SEQ was therefore designed to try to ascertain the source of these problems. The committee chose to distribute two sets of PTH chromatograms to serve as a sequence calling exercise: one contained 32 cycles of sequence data from a novel protein (dataset A) and the other was derived from a low-level complex peptide mixture (dataset B). This study also represented an excellent opportunity to examine the role of mass spectrometry in assisting the protein chemist in primary sequence analysis using Edman chemistry. This study examines participants' abilities to evaluate both straightforward and more complex sequence information, as well as to utilize mass spectrometry in interpreting their results.
II. MATERIALS AND METHODS
A. Selection and Preparation of ABRF-96SEQ Dataset A
The amino acid sequence of the protein used to obtain dataset A is shown in Figure 1. The initial yield was 20 pmole and each chromatogram contained a single, observable amino acid, thereby eliminating overcalling the sequence as a source of error. The data were obtained using a Hewlett-Packard G-1000 protein sequencer. In addition, dataset A contained no cysteine or tryptophan.

Dataset A:         1 QRHELLLGAG   11 SGPGAGQQQA   21 TPGALLQAGP   31 PR
Dataset B Major:   LKSWTCLKNF KICELKYQWL MR-end
Dataset B Minor:   YAEGDVHATS KPARR-end

Figure 1. Amino acid sequence of samples used to generate datasets A and B. Dataset A was from an unknown protein, and dataset B was from a mixture of two peptides present in a 5:1 ratio.
B. Selection and Preparation of ABRF-96SEQ Dataset B
Dataset B represented more challenging chromatograms that might arise from an HPLC purified peak obtained after enzymatic digestion of an intact protein. The sample was designed to possess a major and a minor component, a common occurrence in the analysis of HPLC purified peptides. The major peptide was a synthetic peptide that was reduced with DTT, derivatized with iodoacetamide to form carboxyamidomethyl cysteine (CAMC), and subsequently HPLC purified. The minor peptide was a commercially available peptide obtained from Sigma (Catalog # P2046). The amino acid sequences of the major and minor peptides used to obtain dataset B are shown in Figure 1. The two peptides were mixed (10 pmole major, 2 pmole minor), applied to a polybrene treated GF/C filter, and analyzed on an Applied Biosystems/Perkin Elmer Procise (Model 494) protein sequencer operated in the gas-phase mode. An aliquot of the mixture was mixed with an internal calibrant (bradykinin, 1061.2 Da) and analyzed on a Vestec BenchTop II Matrix-Assisted-Laser-Desorption-Ionization Time-of-Flight Mass Spectrometer (MALDI-TOF MS) operating in the linear mode (Figure 2). The observed masses for the major (2947.8 Da) and minor (1659.2 Da) peptides were in good agreement with their predicted masses (2946.7 and 1657.9 Da, respectively) and were also within the 0.1% accuracy of the instrument. Predicted masses were calculated using the SHERPA program (Table IV).
C. Distribution and Evaluation of ABRF-96SEQ Dataset A and B
The data package was distributed to 211 ABRF member laboratories that indicated they perform protein sequencing and included dataset A (32 PTH chromatograms, one PTH standard, one data sheet), dataset B (25 chromatograms, one PTH standard, one data sheet), MALDI-TOF MS of dataset B, a general cover letter, and a brief survey. Members were asked to evaluate the data in their usual manner, and return the data sheets to an independent third party who removed all identifying marks prior to forwarding the data to the committee. Participants were requested to define their sequence assignments as positive (call supported by unambiguous data) or tentative (call uncertain, but some evidence present). The committee then analyzed those assignments and defined them as correct, incorrect or over called (made positive or tentative calls beyond last amino acid).
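The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of the categories (positive or tentative calls scored as correct, incorrect, or overcalled), not the committee's actual procedure; the function name, the data layout, and the example response are hypothetical.

```python
from collections import Counter


def classify_calls(calls, true_sequence):
    """Tally one response against the known sequence.

    calls has one entry per cycle: None (no call) or a (residue, confidence)
    tuple, where confidence is "P" (positive) or "T" (tentative).
    Returns counts keyed by PC, TC, PI, TI, OC, and X (unassigned).
    """
    tally = Counter()
    for cycle, call in enumerate(calls, start=1):
        if call is None:
            # Only cycles within the real sequence count as unassigned.
            if cycle <= len(true_sequence):
                tally["X"] += 1
            continue
        residue, confidence = call
        if cycle > len(true_sequence):
            tally["OC"] += 1  # call made beyond the last amino acid
        elif residue == true_sequence[cycle - 1]:
            tally[confidence + "C"] += 1  # PC or TC
        else:
            tally[confidence + "I"] += 1  # PI or TI


    return tally


# Hypothetical response for the dataset B minor peptide (Figure 1).
minor = "YAEGDVHATSKPARR"
response = (
    [None, None, ("S", "T"), ("G", "P")]
    + [(r, "P") for r in "DVHATS"]
    + [None]
    + [(r, "P") for r in "PAR"]
    + [("R", "T"), ("R", "T")]
)
print(dict(classify_calls(response, minor)))
```

Keeping unassigned cycles (X) separate from incorrect calls mirrors the way the unassigned residues are reported apart from the accuracy figures.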
Evaluation of ABRF-96SEQ: A Sequence Assignment Exercise
Figure 2. Matrix-Assisted Laser Desorption Ionization Time-of-Flight Mass Spectrometric Analysis of the peptide mixture used to generate Dataset B. The masses of the peptide components were 1659.2 and 2947.8 daltons. The peak indicated by a "c" represents an internal calibrant.
III. Results and Discussion
ABRF-96SEQ offered a unique opportunity to compare the ability of participating laboratories to interpret the same data. There were 95 participants in the study, of which 98% perform sequence analysis, 54% use mass spectrometry routinely, and 64% provide an internal sequence analysis service. The total number of sequence reviewers was 157, or 1.65 per response, and the total years of experience were 1061.4, or 6.8 years per reviewer. Concerning everyday sequence calling, 37% report that only one person reviews the data while only 17% use more than one person; 72% say that a simple sequence is called by one person and that complex data are evaluated by more than one reviewer. The responding laboratories had 165 protein sequencers, distributed as follows: 133 Applied Biosystems/Perkin Elmer, 22 Hewlett-Packard, 8 Porton/Beckman, and 2 Milligen. Respondents reported that the most difficult amino acids to identify were Cys (64.5%), Trp (55%), and Ser (22.5%).
A. Dataset A
Table I summarizes the amino acid assignments that were returned for dataset A. All 3072 possible assignments for the 96 responses (one group submitted 2 datasets) were made for dataset A, of which 3066 (99.8%) were correct and only 6 (0.2%) were incorrect. There were 89 responses that were 100% correct, and 87 that called all 32 residues positive correct. This degree of accuracy is by far the best result among the ABRF protein sequence research committee studies, indicating that sequencer operators have little difficulty in assigning sequence data that is relatively high level, straightforward, and of a
Joseph Fernandez et al
defined length. However, six responses contained a calling error (5 positive and 1 tentative), with four of these occurring at cycle 31 where Pro was misassigned as Ser. While the Ser peak did increase in this cycle, it did not subsequently decrease in cycle 32 and did not have the same yield as Pro in cycle 31 or Arg in cycle 32. If the data indicate any uncertainty such as this, the committee feels the reviewer should assign that residue tentatively.

Table I: Summary of Sequence Assignments for ABRF-96SEQ Datasets A and B^a

                                Definition         Dataset A   Dataset B Major   Dataset B Minor
Total # Cycles Assigned         PC+TC+PI+TI+OC     3072        2116              1310
Average Cycles Assigned         Total # cycles/R   32          22.3              13.8
Total # Correct Assignments     PC+TC              3066        1990              1020
Total # Incorrect Assignments   PI+TI+OC           6           126               290
Total # Positive Assignments    PC+PI              3065        2047              1077
Total # Tentative Assignments   TC+TI              7           48                202
Average # Correct Assigned      (PC+TC)/R          31.9        20.9              10.7
Average # Positive Assigned     (PC+PI)/R          31.9        21.5              11.3
Average # Tentative Assigned    (TC+TI)/R          0.1         0.5               2.1
Average # Incorrect Assigned    (PI+TI+OC)/R       0.1         1.1               3.0
Accuracy of PC Assignments      PC/(PC+PI+OC)      0.998       0.958             0.876
Accuracy of TC Assignments      TC/(TC+TI)         0.857       0.583             0.376

a. Sequence assignments were categorized as positive correct (PC), tentative correct (TC), positive incorrect (PI), tentative incorrect (TI), or over called past the last amino acid (OC). The number of responses (R) was 96 for dataset A and 95 for dataset B. The number of unassigned residues was 0, 15, and 177 for dataset A, dataset B major, and dataset B minor respectively.
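The derived rows of Table I follow directly from the five category counts. The sketch below applies the tabulated definitions to dataset A. The individual counts are not listed in the table; they are inferred here from the reported totals (3066 correct, 3065 positive, 7 tentative) and from the statement that the six errors were 5 positive and 1 tentative, so they should be read as an assumption. The function name is illustrative.

```python
def summarize(pc, tc, pi, ti, oc, responses):
    """Apply the Table I definitions to the five category counts."""
    total = pc + tc + pi + ti + oc
    return {
        "total_cycles": total,
        "avg_cycles": total / responses,
        "total_correct": pc + tc,
        "total_incorrect": pi + ti + oc,
        "total_positive": pc + pi,
        "total_tentative": tc + ti,
        "avg_correct": (pc + tc) / responses,
        "avg_positive": (pc + pi) / responses,
        "avg_tentative": (tc + ti) / responses,
        "avg_incorrect": (pi + ti + oc) / responses,
        # Overcalls count against positive accuracy, as in the table.
        "positive_accuracy": pc / (pc + pi + oc),
        # Undefined when no tentative calls were made.
        "tentative_accuracy": tc / (tc + ti) if (tc + ti) else None,
    }


# Dataset A counts inferred from the text and table (assumption, see above).
dataset_a = summarize(pc=3060, tc=6, pi=5, ti=1, oc=0, responses=96)
for name, value in dataset_a.items():
    print(f"{name:20s} {value:.3f}" if isinstance(value, float) else f"{name:20s} {value}")
```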
B. Dataset B Major
The results from sequence assignment of dataset B major are summarized in Table I. There were a total of 2116 cycles reported with only 15 unassigned residues in the first 22 cycles, which was the peptide length. Of these, 1990 (94.7%) were correct and 126 (5.3%) were incorrect. The overall accuracy of positive calls was 95.8%, and there were 54/95 responses that were 100% positive correct, with 37 of these calling all 22 residues. Clearly, these statistics for dataset B major are comparable to, if not better than, those of previous ABRF studies, especially considering that the major sequence was only present at 10 pmol, which is lower than in previous studies (1-8). The average number of correct assignments was 21.5 and the average number of assigned cycles was 22.3. Some respondents assigned more residues than justified by the data, as shown by the number of positive calls (8) made beyond cycle 22. Again, caution should be used when the amino acid signal diminishes, and only tentative calls should be made. The assignments made for each residue of dataset B major are shown in Figure 3. There were 12 residues that 94/95 responses called positive correct. The lone error in these cycles was due to one respondent's sequence being off by one residue (i.e., C, L, K ... called at 5, 6, 7 ...). Since this respondent did very well on the minor sequence (13/15 correct), it is assumed that the error was due to simply writing the sequence out of order on the datasheet rather than misinterpreting the data. The most frequently misidentified residues were Leu-1 (47/95 positive correct), Ser-3 (67/95 positive correct), and Trp-4 (79/95 positive correct). At position 1, Ser, Gly, or Ala were frequently assigned; these
are all common free amino acids that could contaminate the first cycle of a sequencer run. The only possible way to definitively assign the first cycle as Leu-1 was by using the MALDI-TOF MS data (see section D), although there was one response that correctly identified all 22 residues without use of the MS data. The amino acids frequently misassigned for Ser-3 and Trp-4 were Glu (16/21) and Gly (11/11) respectively, which also happen to be the residues of the minor peptide at those positions. It appears that the difficulty with these cycles was due to the sample heterogeneity, since the background corrected yields of Ser-3 (0.3 pmol) and Trp-4 (1.3 pmol) were somewhat comparable to those of Glu (2.1 pmol) and Gly (0.5 pmol).
[Figure 3: bar chart of the assignments made at each residue; vertical axis 0 to 100; horizontal axis "Amino Acid Sequence"; legend OC, X, TI, PI, TC, PC.]
Figure 3. Sequence assignments at each position of ABRF-96SEQ Dataset B major. The correct sequence is presented at the bottom of the graph. Abbreviations used are described in Table I except X, which indicates that no amino acid was assigned at that residue.
C. Dataset B Minor
A summary of the results for ABRF-96SEQ dataset B minor is shown in Table I. There were 1310 cycles assigned and 177 unassigned residues. Of the assigned residues, 1020 (77.8%) were correct and 290 (22.1%) were incorrect. The positive accuracy was 87.6%, which is average compared with previous studies (1-8). However, it must be noted that this peptide was present at only 2 pmol and was a minor component of the sample, and thus did represent a challenging sequence calling exercise. The assignments made at each position of dataset B minor are shown in Figure 4. Residues Asp-5, Val-6, His-7, Ala-8, Thr-9, Ser-10, Pro-12, Ala-13, and Arg-14 were the amino acids most often identified as positive correct (85, 89, 86, 92, 88, 87, 89, 82, and 84 respectively). Residues Glu-3 (68/89 correct) and Gly-4 (65/88 correct) were frequently misidentified as Ser (13) and Trp (13), which were part of the major sequence. The most difficult residues to assign as correct were Tyr-1, Ala-2, Lys-11, and Arg-15 (6,
6, 8, and 9 respectively). These amino acids were difficult or impossible to definitively identify due to free amino acid background (residues 1 and 2), interference by the major component (residue 11), and termination of the sequence with identical amino acids at the last two residues (residue 15). In fact, while two facilities reported all 15 residues correctly, the committee feels that this was not possible with the supplied data.
[Figure 4: bar chart of the assignments made at each residue; horizontal axis "Amino Acid Sequence" (Y A E G D V H A T S K P A R R).]
Figure 4. Sequence assignments at each position of ABRF-96SEQ Dataset B minor. The correct sequence is presented at the bottom of the graph. Abbreviations used are described in Table I except X, which indicates that no amino acid was assigned at that residue.
D. Use of MALDI-TOF MS with ABRF-96SEQ
There was a great deal of variation in how the mass spectrum supplied with dataset B was used. In some cases it was clearly useful to confirm Edman sequence results, while in others it appeared to be used incorrectly. A total of 82/95 respondents used the MS data supplied with dataset B; however, only 44 of these correctly called all 22 residues. There were 23 responses that reported fewer than 21 amino acids correct, and 15 respondents that correctly identified 21 residues but were unable to identify the missing or incorrect amino acid. There were 6 responses that reported 21 residues as positive correct and one residue as positive incorrect even though the calculated mass did not agree with the observed mass of the major sequence (Figure 2). When the calculated mass of an assigned sequence does not agree with the mass spectrometry data (within the accuracy of the instrument, 0.1%), the source of the discrepancy should be determined and any unclear assignments should be tentative. The correct use of the MALDI-TOF MS data supplied should have been as follows. There are two species present in the sample, one being approximately 14-15 amino acids long, and the other being approximately 25 amino acids long, assuming an average residue mass of 115 daltons. It is obvious that the major sequence has a mass of 2947.8 daltons based on the length of the sequence data, even though it is not the major ion in the spectrum. The major sequence mass should be calculated by adding the residue masses of the assigned
amino acids plus 18 daltons (9-10). It should be noted that the mass of carboxyamidomethylated cysteine is 160.2 daltons rather than 103.1 for free cysteine. Since there are many ambiguities in the sequence assignment of the minor sequence, the mass spectrometry data is only useful for estimation of the number of amino acids in the minor peptide. The ratio of the peptides cannot be reliably estimated from the MALDI-TOF data. Table II is a list of useful websites that would assist facility personnel in better interpreting mass spectral data.

Table II: Useful Mass Spectrometry Sites on the World Wide Web

Home Page                             URL                                                                            Notes
Walsh Laboratory Home Page            http://128.95.12.16/WalshLab.html                                              1
EMBL Protein Peptide Group            http://mac-mann6.embl-heidelberg.de                                            2
Rockefeller/NYU Mass Spectrometry     http://chait-sgi.rockefeller.edu                                               2, 3
UCSF Mass Spectrometry Facility       http://rafael.ucsf.edu                                                         2
Murray's Mass Spectrometry resource   http://userwww.service.emory.edu/~kmurray/msres.html                           4
American Society for MS               http://www.trail.com/asms                                                      5
Mitchelhill's Delta mass v.2.1        http://www.medstv.unimelb.edu.au/WWWDOCS/SVIMRDOCS/MassSpec/deltamassV2.html   6
Biological mass spec/U. Washington    http://thompson.mbt.washington.edu                                             2
Pedro's Biomolecular Research Tools   http://www.public.iastate.edu/~pedro/research_tools.html                       4

1. Programs for peptide digest interpretation and calculation of mass from sequence.
2. Information about database searching and/or on-line database searching.
3. Library of matrixes.
4. Hyperlinks to other mass spectrometry, protein chemistry, or molecular biology sites.
5. Information regarding ASMS meeting and short courses.
6. List of mass shifts due to amino acid modification.

E. Comparison to Previous Studies
A comparative summary of the current study with previous ABRF-SEQ samples is shown in Table III. As can be seen, ABRF-96SEQA had virtually 100% accuracy of positive calls, indicating that a straightforward homogeneous sample with no difficult amino acids, clear chromatography, and a finite ending poses no problems for data interpretation. The accuracy was better than in previous studies in which native proteins (ABRF-94SEQ and ABRF-95SEQ) or a peptide conjugated to a protein (ABRF-90SEQ and ABRF-91SEQ) were studied. The heterogeneous ABRF-96SEQB was also reasonably well interpreted (96% positive accuracy) as compared to ABRF-89SEQ (~95% positive accuracy), which was also a heterogeneous sample. The accuracy of ABRF-96SEQB minor cannot be compared to ABRF-89SEQ minor, as the results of the minor component were not evaluated in that study. Since all respondents were assigning the same data, the high positive accuracy in the current study suggests that poor instrument operation, optimization, or sample handling could have been a factor in the lower sequence assignment accuracy of previous studies. One of the most notable improvements in ABRF-96SEQ was cysteine identification, with Cys-6 and Cys-13 both being assigned with an accuracy of 99%. Previous studies have shown difficulty with correct identification of cysteine (19-82%). Even when reduction and alkylation was encouraged, as in ABRF-94SEQ and ABRF-95SEQ (82%/59% and 63% respectively), the accuracy was not as good as that observed in the current study. Interestingly, ABRF-95SEQ had four respondents that
carboxyamidomethylated cysteine, but only two positively identified cysteine while the other two miscalled it as Glu. Also of interest is the accuracy of Trp-4 (83%) and Trp-19 (92%) in dataset B compared with other studies. In fact, only ABRF-89SEQ showed a higher accuracy for Trp (96%), which is probably attributable to the higher level of material supplied for that study (240 pmol). The higher accuracy for Trp-19 compared to Trp-4 in ABRF-96SEQ dataset B was probably due to partial confusion of Trp-4 with the minor sequence. The increase in the accuracy of assignment of these two problematic amino acids in ABRF-96SEQ is probably due to optimized PTH separation of Trp from sequencer artifacts, and separation of CAMC from Glu.

Table III: Comparison of Previous ABRF-SEQ Samples

Sample               Amount Distributed   Positive Accuracy   Cysteine Accuracy      Tryptophan Accuracy
STD-1                100 pmol             ~95%                C12 = 32%              W7 = 83%
ABRF-89SEQ           240/48 pmol          ~95%                No Cysteine            W3 = 96%
ABRF-90SEQ           30 pmol              83%                 No Cysteine            W6 = 31%
ABRF-91SEQ           80 pmol              83%                 C5 = 19%               W6 = 68%
ABRF-92SEQ           500 pmol             94%                 No Cysteine            No Tryptophan
ABRF-93SEQ           50 pmol              91%                 C5 = 53%               *
ABRF-94SEQ           50 pmol              95%                 C10 = 82%, C20 = 59%   *
ABRF-95SEQ           45 pmol              78%                 C15 = 63%              *
ABRF-96SEQA          40 pmol              100%                No Cysteine            No Tryptophan
ABRF-96SEQB Major    10/2 pmol            96%                 C6 = 99%, C13 = 99%    W4 = 83%, W19 = 92%

* Tryptophan accuracies for ABRF-93SEQ, ABRF-94SEQ, and ABRF-95SEQ, in the order printed: W2 = 70%, W7 = 71%, W9 = 86%, W23 = 58%, W19 = 65%, W20 = 61%.
Results from previous studies were taken from references 1-8.

F. Recommended/Consensus Calls for ABRF-96SEQ Dataset B
Table IV represents the consensus calls made by the 1996 ABRF-96SEQ protein sequence research committee as the best interpretation of dataset B using the supplied data. This is also an example of how data should be reported to an investigator who is generally only interested in the final sequence assignment. It should be explained that Leu-1 was assigned by a combination of the MS data and the PTH yield. Positive identification by either technique alone is uncertain, as Ser, Gly, and Ala are observed as PTH amino acids in cycle #1, and the MS alone results in a possible Ile, Asp, or Asn assignment. The high background should be noted as the reason that the minor sequence is not observable in cycles 1 and 2. Cysteine should be addressed as being identified as CAMC, since free cysteine was derivatized prior to the Edman chemistry. The investigator should be specifically told that an amino acid cannot be assigned at position 11 of the minor sequence. Finally, Arg-15 of the minor sequence can only be assigned tentatively, since carryover could explain the PTH yield, and MS data cannot assist in the assignment with three unassigned residues in the sequence. These recommendations are an example of how to reliably present data to an investigator.
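The mass check behind the Leu-1 assignment can be sketched as follows: sum the residue masses of the called sequence, add water, and compare with the observed mass at the 0.1% instrument accuracy. This is a minimal illustration, not the SHERPA calculation used by the committee. The average residue masses are standard textbook values (an assumption, since the chapter does not list its table), and the observed MALDI-TOF mass is compared directly with the neutral peptide mass, as the chapter does.

```python
# Average residue masses in daltons (standard values; assumed, not from the chapter).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153      # added once per peptide
CAM_SHIFT = 57.0513  # carboxyamidomethyl group on each alkylated Cys


def peptide_mass(sequence, cys_alkylated=True):
    """Average mass of a peptide: residue masses plus one water."""
    mass = sum(RESIDUE_MASS[r] for r in sequence) + WATER
    if cys_alkylated:
        mass += CAM_SHIFT * sequence.count("C")
    return mass


def within_tolerance(calculated, observed, fraction=0.001):
    """True if the two masses agree within the stated 0.1% accuracy."""
    return abs(calculated - observed) <= fraction * observed


OBSERVED_MAJOR = 2947.8                   # from Figure 2
CALLED_2_TO_22 = "KSWTCLKNFKICELKYQWLMR"  # major sequence, cycles 2-22

# Which residues at position 1 are consistent with the observed mass?
consistent = sorted(
    r for r in RESIDUE_MASS
    if within_tolerance(peptide_mass(r + CALLED_2_TO_22), OBSERVED_MAJOR)
)
print(consistent)
```

Under these assumptions the mass alone leaves Leu, Ile, Asn, and Asp as candidates for position 1 and excludes Ser, Gly, and Ala, which is why the PTH yield is still needed to settle on Leu.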
IV. Conclusion
The ability of most core laboratories to correctly interpret data from a homogeneous and relatively high-level sample that has no amino acids that are difficult to identify, is of a finite length, and has clear PTH chromatograms is excellent, as shown with dataset A (Table I). However, even under ideal conditions there were a few positive incorrect calls that should have been reported as tentative. The overall accuracy decreases when the sample is low-level and complex, as evidenced with dataset B (Table I, Figures 3 and 4). Generally, the committee feels there were more positive incorrect calls than there should have been, and urges all respondents that perform sequence analysis to be cautious in assigning positive calls, particularly at the end of the sequence.

Table IV: Consensus Calls for Dataset B^a

Residue #   Major   Minor     Comments
1           Leu               Leu-1 assigned by MS data and yield; high background
2           Lys               high background
3           Ser     Glu
4           Trp     Gly
5           Thr     Asp
6           Cys     Val       Cys identified as carboxyamidomethyl-cys (CAMC)
7           Leu     His
8           Lys     Ala
9           Asn     Thr
10          Phe     Ser
11          Lys               unable to assign minor sequence
12          Ile     Pro
13          Cys     Ala       Cys identified as carboxyamidomethyl-cys (CAMC)
14          Glu     Arg
15          Leu     (Arg)^b   consistent with peptide length estimated by MS
16          Lys     end
17          Tyr
18          Gln
19          Trp
20          Leu
21          Met
22          Arg
23          end

a. Consensus calls were agreed upon by the authors as the best possible sequence interpretation with the available data.
b. Arg at position 15 is assigned as tentative.
While those who used the MALDI-TOF MS data had a higher accuracy of positive assignments, this study indicated that there are many respondents who are not well versed in the proper use of mass spectrometry data. Respondents can use mass spectral data to confirm sequence assignments within approximately 0.1% accuracy of the measured mass, to estimate the length of a peptide, and to potentially confirm tentative amino acids. The mass spectral data should not be used to estimate the ratios of ions present. The web sites listed in Table II provide a ready resource to aid in this process. Based on the quality of assignments in datasets A and B, especially regarding Cys and Trp, it may be concluded that the reduced accuracy in previous studies was not due solely to poor data interpretation. Other potential sources of sub-optimal results include sub-optimal sample handling, inefficient instrumentation, or poor sequencer optimization.
Acknowledgments
The committee would like to thank Mike Cory (Glaxo Wellcome) for receiving data and forwarding it to the committee. The committee would also like to thank the ABRF business office for assisting in the dataset distribution, and Jeff Mathers (Rockefeller University) for synthesis of the major peptide in dataset B.
References
1. Niece, R.L., Williams, K.R., Wadsworth, C.L., Elliott, J., Stone, K.L., McMurray, W.J., Fowler, A., Atherton, D.A., Kutney, R., and Smith, A.J. (1989) in Techniques in Protein Chemistry (T.E. Hugli, ed.), Academic Press, San Diego, pp 89-101.
2. Speicher, D.W., Grant, G.A., Niece, R.L., Blacher, R.W., Fowler, A.V., and Williams, K.R. (1990) in Current Research in Protein Chemistry (J.J. Villafranca, ed.), Academic Press, San Diego, pp 159-166.
3. Yuksel, K.U., Grant, G.A., Mende-Muller, L.M., Niece, R.L., Williams, K.R., and Speicher, D.W. (1991) in Techniques in Protein Chemistry II (J.J. Villafranca, ed.), Academic Press, San Diego, pp 151-162.
4. Crimmins, D.L., Grant, G.A., Mende-Muller, L.M., Niece, R.L., Slaughter, C., Speicher, D.W., and Yuksel, K.U. (1992) in Techniques in Protein Chemistry III (R.H. Angeletti, ed.), Academic Press, San Diego, pp 35-45.
5. Mische, S.M., Yuksel, K.U., Mende-Muller, L.M., Matsudaira, P., Crimmins, D.L., and Andrews, P.C. (1993) in Techniques in Protein Chemistry IV (R.H. Angeletti, ed.), Academic Press, San Diego, pp 453-461.
6. Rush, J., Andrews, P.C., Crimmins, D.L., Gambee, J.E., Grant, G.A., Mische, S.M., and Speicher, D.W. (1994) in Techniques in Protein Chemistry V (J.W. Crabb, ed.), Academic Press, San Diego, pp 133-141.
7. Gambee, J.E., Andrews, P.C., De Jongh, K., Grant, G.A., Merrill, B., Mische, S.M., and Rush, J. (1995) in Techniques in Protein Chemistry VI (J.W. Crabb, ed.), Academic Press, San Diego, pp 209-217.
8. De Jongh, K.S., Fernandez, J., Gambee, J.E., Grant, G.A., Merrill, B., Stone, K.L., and Rush, J. (1996) in Techniques in Protein Chemistry VII (D. Marshak, ed.), Academic Press, San Diego, in press.
9. Biemann, K. (1990) in Methods in Enzymology Vol. 193 (J.A. McCloskey, ed.), Academic Press, San Diego, p 888.
10. Current Protocols in Protein Science (1995) (J.E. Coligan et al., ed.), John Wiley & Sons, Inc., front cover.
INTERNAL PROTEIN SEQUENCING OF SDS-PAGE-SEPARATED PROTEINS: OPTIMIZATION OF AN IN GEL DIGEST PROTOCOL Ken Williams, Mary LoPresti and Kathy Stone HHMI Biopolymer Laboratory/W.M. Keck Foundation Biotechnology Resource Laboratory, Yale University, New Haven, CT 06536
I. Introduction
Surveys of biotechnology core laboratories suggest that over the last 9 years there has been nearly a 10-fold increase in the sensitivity at which internal sequencing can be routinely carried out on "unknown" proteins. That is, in response to a question concerning the amount of protein required for internal sequencing, 16 respondents to a survey carried out in 1987 gave a median estimate of 400 pmol (1) whereas the median estimate given by 28 respondents to a survey carried out in 1996 (2) was only 50 pmol. To continue this trend, which we believe primarily reflects improved methodologies, we have evaluated an in gel digest protocol (3-6) so that critical steps in this procedure can be identified and optimized and so that realistic limits can be placed on the amount of protein required to maintain a success rate that approaches 100%.
II. Materials and Methods
A. Sample Preparation
With the exception of studies on bovine serum albumin (BSA) and human transferrin, all digests were carried out on Coomassie Blue-stained gel bands that had been excised from SDS polyacrylamide gels and submitted in Eppendorf tubes to the internal protein sequencing service of the HHMI Biopolymer Laboratory/W.M. Keck Foundation Biotechnology Resource Laboratory at Yale University (5). The BSA and transferrin samples were subjected to SDS-PAGE in the Keck Facility and were otherwise prepared as described (5). Proteins were quantified by subjecting 10-15% aliquots of all gel slices to hydrolysis and ion exchange amino acid analysis (5).
B. In Gel Enzymatic Cleavage of Proteins
1. Sample and blank gel pieces were cut into approximately 1 x 2 mm sections, placed into 1.5 ml Eppendorf tubes (which had been pre-washed with Buffer A (0.1% TFA, 60% CH3CN)), and then washed with 250 µl Buffer B (50% CH3CN, 200 mM NH4HCO3, pH 8) for 30 min at RT on a tilt table.
2. After removing the wash, sufficient Buffer B (usually about 100 µl) was added to cover the gel pieces, and the approximate total volume was estimated by comparing to Eppendorf tubes containing known volumes of water.
3. Sufficient 45 mM dithiothreitol (DTT) was then added to bring the final concentration to 1 mM before incubating the samples for 20 min at 37°C.
4. Twice the volume (as compared to DTT) of 100 mM methyl 4-nitrobenzene sulfonate (or an equal volume of iodoacetic acid or iodoacetamide) was added, followed by a 40 min incubation at 37°C.
5. After removing the supernate, the gel slices were washed at RT on a tilt table for 30 min and then twice more for 15 min with 250 µl Buffer B.
6. After removing the last wash, the gel pieces were brought to dryness in a Speedvac and then hydrated by adding 1 µl/mm³ (initial estimated gel volume) of a freshly prepared enzyme solution made by mixing one volume 0.1 mg/ml trypsin (Promega modified) or lysyl endopeptidase (Wako) with two volumes 200 mM NH4HCO3. If necessary, additional enzyme solution (0.0333 mg/ml) was added to totally immerse the gel pieces.
7. After incubating at 37°C for 24 hr, peptides were extracted with 100 µl (or a volume equal to the gel volume if that is larger) Buffer A for one hour at RT on a tilt table.
8. After repeating step 7, the combined extracts were dried in a Speedvac, dissolved in 20 µl 0.05% TFA, 25% CH3CN, and diluted with 90 µl 0.05% TFA prior to subjecting 100 µl to HPLC.
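The reagent volumes in steps 3, 4, and 6 follow from simple dilution arithmetic, sketched below. The function names and the example volumes (100 µl of covered gel, a 15 mm³ band) are illustrative choices, and the 1 µl per mm³ hydration ratio is read from step 6 as printed; none of this replaces the protocol text.

```python
def dtt_volume_ul(sample_volume_ul, stock_mM=45.0, final_mM=1.0):
    """Volume of DTT stock that brings the sample to the final concentration.

    Solves stock * v / (sample + v) = final for v.
    """
    return sample_volume_ul * final_mM / (stock_mM - final_mM)


def alkylating_volume_ul(dtt_ul, equal_volume=False):
    """Step 4: twice the DTT volume for methyl 4-nitrobenzene sulfonate,
    or an equal volume for iodoacetic acid or iodoacetamide."""
    return dtt_ul * (1.0 if equal_volume else 2.0)


def enzyme_working_conc(stock_mg_per_ml=0.1, enzyme_parts=1, buffer_parts=2):
    """Step 6: one volume enzyme stock mixed with two volumes buffer."""
    return stock_mg_per_ml * enzyme_parts / (enzyme_parts + buffer_parts)


def enzyme_volume_ul(gel_volume_mm3, ul_per_mm3=1.0):
    """Step 6: hydration volume scaled to the initial gel volume."""
    return gel_volume_mm3 * ul_per_mm3


# Example: about 100 µl of gel plus Buffer B, and a 15 mm^3 band.
dtt = dtt_volume_ul(100.0)
enzyme_ul = enzyme_volume_ul(15.0)
enzyme_ug = enzyme_ul * enzyme_working_conc()  # µl x mg/ml = µg
print(f"DTT stock: {dtt:.2f} µl; alkylating reagent: {alkylating_volume_ul(dtt):.2f} µl")
print(f"Enzyme solution: {enzyme_ul:.0f} µl containing {enzyme_ug:.2f} µg enzyme")
```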
C. Reverse Phase HPLC Separation of Enzymatic Digests
Digests were fractionated on an HP1090 HPLC equipped with an Isco Model 2150 Peak Separator and a 25 cm Vydac C-18 (5 micron particle size, 300 Å pore size) column equilibrated with 98% Buffer C (0.06% TFA) and 2% Buffer D (0.052% TFA, 80% CH3CN). The peptides were then eluted with the following gradient: 0-60 min (2-37% Buffer D), 60-90 min (37-75% Buffer D), and 90-105 min (75-98% Buffer D). In general, amounts of digests in the 5-250 pmol range were fractionated on 1.0 mm ID columns eluted at 50 µl/min, with larger amounts being separated on 2.1 mm ID columns eluted at 0.15 ml/min (see references 5 and 7 for additional details).
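The gradient above is piecewise linear, so the mobile phase composition at any time can be interpolated from the four breakpoints. The sketch below assumes linear segments between the stated endpoints and that Buffer C contributes no acetonitrile; the function names are illustrative.

```python
# (time in min, % Buffer D) breakpoints taken from the gradient description.
GRADIENT = [(0.0, 2.0), (60.0, 37.0), (90.0, 75.0), (105.0, 98.0)]


def percent_buffer_d(t_min):
    """Percent Buffer D at time t, by linear interpolation between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, d0), (t1, d1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return d0 + (d1 - d0) * (t_min - t0) / (t1 - t0)


def percent_acetonitrile(t_min, d_acetonitrile_fraction=0.80):
    """Percent CH3CN in the eluent; Buffer D is 80% CH3CN, Buffer C has none."""
    return d_acetonitrile_fraction * percent_buffer_d(t_min)


for t in (0, 30, 60, 75, 90, 105):
    print(f"{t:3d} min: {percent_buffer_d(t):5.1f}% D, {percent_acetonitrile(t):5.1f}% CH3CN")
```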
D. MALDI-MS and Peptide Sequencing
In general, ~3% aliquots of 6 of the most symmetrical, latest eluting HPLC absorbance peaks (not also present in the blank digest) were chosen for matrix assisted laser desorption ionization mass spectrometry (MALDI-MS) on
a VG/Micromass TofSpec SE instrument (8). Our experience is that MALDI-MS "screening" of peptide fractions readily detects most tryptic peptide mixtures and reagent artifact peaks prior to amino acid sequencing. Hence, we have found that using a major/minor MALDI-MS peak height response ratio of greater than 10 as an additional criterion of peptide purity significantly increases the fraction of peptides sequenced that provide usable data (8). In the case of those peptides selected (on the basis of the combined criteria of absorbance peak shape and MALDI-MS spectrum) for sequencing, the appropriate fraction was loaded directly onto an Applied Biosystems Model 470, 477 or HS-Procise instrument operated according to the manufacturer's recommended protocol and as described previously in more detail (5). Immediately following sequencing, all peptide sequences were searched via the "Blast" email server operated by the National Center for Biotechnology Information (9).
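The major/minor peak height criterion can be expressed as a simple screen. The sketch below is one plausible reading of it (ratio of the tallest peak to the second tallest, after artifact and calibrant peaks have been excluded); the function name and the example peak heights are hypothetical.

```python
def passes_purity_screen(peak_heights, min_ratio=10.0):
    """True if the major peak exceeds the next largest peak by more than min_ratio.

    peak_heights: heights of the peptide peaks in one MALDI-MS spectrum
    (matrix, calibrant, and reagent artifact peaks assumed already removed).
    """
    peaks = sorted(peak_heights, reverse=True)
    if not peaks:
        return False  # nothing detected, so nothing to sequence
    if len(peaks) == 1 or peaks[1] == 0:
        return True   # a single species
    return peaks[0] / peaks[1] > min_ratio


# Hypothetical fractions: one essentially pure, one a mixture.
print(passes_purity_screen([1200.0, 85.0]))   # ratio about 14
print(passes_purity_screen([1200.0, 400.0]))  # ratio 3
```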
III. Results
A. Optimization of an In Gel Digest Procedure
Difficulty in obtaining high sensitivity MALDI-MS spectra on in gel digests (for the purpose of peptide mass database searching prior to HPLC fractionation) carried out in the presence of Tween 20 (3-6) provided the impetus for determining if this detergent is indeed essential. Based on tryptic and lysyl endopeptidase digests of transferrin (25 pmol), Tween 20 (0.02%) did not have any significant impact on overall peptide yield as judged by the resulting absorbance profiles (Fig. 1). As a result, we have deleted this detergent from the protocol described in Materials and Methods and have since carried out over 30 successful digests in the absence of any detergent. As shown in Fig. 2 (Panels A-C), a 10-fold decrease in the 0.033 mg/ml final trypsin concentration recommended in the digest reduces the total peak height yield from a 25 pmol transferrin digest by about 2.5-fold. Since no significant increase in overall yield was noted between the 0.014 mg/ml trypsin concentration used in Fig. 2B and the 0.025 mg/ml concentration used in Fig. 2C, we believe the recommended 0.0333 mg/ml concentration provides at least a two-fold excess that might help ensure that even relatively resistant proteins will nonetheless be digested successfully. An important consideration in carrying out in gel digests is the background present in a control gel slice that should not contain protein. As shown in Fig. 2 (Panels D-F), even such a minor modification as carrying out reduction and cysteine modification before, rather than after, the digest can significantly lower the background. Although prior reduction and cysteine modification does not appear to generally improve overall peptide recovery from in gel digests (data not shown), it does permit cysteine residues to be identified during sequencing. Another alternative, which might improve the background even further, would be to carry out the reduction and cysteine modification prior to SDS-PAGE. In the case of samples submitted to biotechnology core laboratories, the latter is not always possible. Finally, Fig. 2F indicates that the addition of two extra washes removes a large artifact peak
[Figure 1: four HPLC absorbance chromatograms; panel title "Lack of Impact of Tween 20 on Tryptic/Lys-C Digests of 25 pmol Transferrin (1 x 250 mm col, 50 µl/min)"; horizontal axis "Time (min)".]
Figure 1. Reverse phase HPLC separation of tryptic (A, B) and lysyl endopeptidase (C, D) in gel digests of 25 pmol aliquots of human transferrin. Following SDS-PAGE, the gel was stained with Coomassie Blue and the bands of interest were then excised, digested and subjected to HPLC as described in Materials and Methods. The digests shown in the top two chromatograms were carried out as described in Materials and Methods while the digests shown in the bottom two chromatograms were carried out in the presence of 0.02% Tween 20. All four digests were chromatographed at 50 µl/min on a 1 x 250 mm Vydac C-18 column.
eluting at about 54 min and gives some additional improvement to the overall background.

Figure 2. Reverse phase HPLC of in gel digests of 25 pmol amounts of transferrin that were digested with increasing concentrations of trypsin (A, B, C) and of blank sections of SDS polyacrylamide gels (D, E, F). All three transferrin digests were carried out on Coomassie Blue stained gel bands as described in Materials and Methods except that the final trypsin concentration in the digest varied from 0.0033 mg/ml (A) to 0.014 mg/ml (B) to 0.025 mg/ml in panel C. The corresponding absorbance peak height sums were 17.8, 44.2 and 41.9 respectively (arbitrary units). Panels D, E, and F show HPLC chromatograms of in gel digests of blank sections of SDS polyacrylamide gels. In each case a section of Coomassie Blue stained gel corresponding to the size of a single band (approximately 15 mm³) was brought through the procedure detailed in Materials and Methods with the following changes. D, The two 15 min washes in Step 5 of this protocol were deleted and the reduction/modification with methyl 4-nitrobenzene sulfonate was carried out after the digest rather than before. E, The two 15 min washes in Step 5 were deleted. F, There were no changes from the procedure described in Materials and Methods.

Table I. Protein loss during in gel sample washing*

Range of Loss    Overall Loss (%)   n   MW (kD)   Amount (pmol)   Gel Thickness (mm)   Gel %Acryl.
Less than 10%    4.4 (1-9)          8   48.5      89.4 (49-500)   1.0                  12.0
From 10-25%      14.0 (10-18)       8   70.0      37.1 (16-99)    0.8                  10.0
From 26-56%      32.2 (26-56)       6   15.0      155 (32-860)    1.3                  13.8

*All data reported in terms of median values with ranges in parentheses.

Although additional washes of the gel slices prior to digestion can effectively reduce background, they also pose potential problems in terms of sample washout. To estimate the amount of sample lost during the 150 min required to bring a sample through step 5 (see Materials and Methods), we have compared the amount of protein estimated by hydrolysis/amino acid analysis of an aliquot of the submitted gel sample with that found in the combined washes. Since the range of loss was wide, extending from less than 1% to 56%, we have broken the data down in Table I by range of loss in the hope of identifying risk factor(s) for proteins that are likely to be subject to unusually high loss. As indicated in Table I, the variable that correlated most strongly with high loss was low molecular weight. Hence, the median molecular weight for those proteins that fell in the 26-56% loss category was 15 kD as compared to close to 50 kD for those proteins that fell in the less than 10% loss category. Interestingly, those proteins that suffered the highest losses also happened to be submitted in the largest relative amounts (Table I). Although additional data are needed and we expect there will be instances of protein dependent loss, the data in Table I seem to indicate that some caution should be exercised in the case of relatively low molecular weight proteins. Since loss of protein during washing is likely to be time dependent (with low molecular weight reagent artifacts diffusing out more quickly than, for instance, a 15,000 dalton protein), simply decreasing the washing times might well lead to increased recovery of low molecular weight proteins. As a further aside we note that losses of low
Ken Williams et al
84
i'
H^MUM.M^MU.MM^UM.nn^u
.^"M,M,y,,
111111 11 111 1111 I I 111111111111 111 I I 111 11111 111111 11 11 I I IJ 11 I I I 11
Time (min) Figure 3. Reverse phase HPLC of an in gel digest of 77 pmol of an unknown 135 kD protein. A. Chromatogram of the initial tryptic digest. Since there were few, if any probable peptide peaks not also present in the blank (panel B) and since amino acid analysis of the digested gel band indicated it still contained the protein, the gel band was washed twice with 150 /xl 0.1 M Tris/HCl, pH 8, 50% CH3CN and digested again as described in Materials and Methods. Panel C shows the chromatogram that was obtained from this re-digest.
molecular weight proteins almost surely also occur during staining - which is a variable we have not yet studied. Finally, another parameter likely to negatively impact on in gel digests is excessive Coomassie Blue. Since this dye binds primarily through lysine and arginine residues (10), it is not surprising that excess Coomassie Blue prevents trypsin digestion - either by masking trypsin cleavage sites on the substrate protein or by directly binding to trypsin. Although the initial wash with Buffer B (Step 1 in the protocol in Material and Methods) is effective at removing excess Coomassie Blue, additional washes may be needed in the case of very heavily stained samples. Fig. 3 provides an example of a heavily stained sample where the initial digest (top panel) failed due to excess Coomassie Blue. In this instance it was obvious the digest had failed because the gel pieces remained darkly stained at the end of the procedure - whereas normally they are clear. Since the fact that the digested/extracted gel pieces were still darkly stained indicated the protein had failed to digest, this sample was simply
Internal Sequencing of SDS-PAGE-Separated Proteins
85
brought through the procedure again and this time the digest succeeded (Fig. 3C). In addition to suggesting it is best to stain proteins destined for in gel digestion the minimal time needed to permit adequate visualization, Fig. 3 also illustrates that when an in gel digest does occasionally fail, the protein is almost invariably still localized within the gel matrix. Hence, in these instances the sample can be readily digested after more extensive washing or, if the protein was resistant to the first protease that was tried, a different protease could be tried the second time. In regards to the latter, we have found one instance of a protein that apparently failed to cleave with lysyl endopeptidase but then did cleave when the digest was repeated with trypsin.
B. Summary of Results from 191 In Gel Digests
The data in Table II provide an extensive overview of in gel digests that can be compared to the preliminary data previously summarized on 25 similar digests (5). The median amount of protein digested in the studies summarized in Table II was 100 pmol and the average number of peptides sequenced per protein was close to 2. This number is relatively low because 68.4% of the proteins summarized in Table II were identified based on searching protein databases with the first peptide sequence obtained. Almost invariably, in these instances additional confirmation of the identification was obtained on the basis of the apparent molecular weight of the protein and by matching observed and predicted peptide masses. By "screening" peptides destined for sequencing with MALDI-MS (8), we have been able to maintain an 80% success rate in sequencing peptides obtained from in gel digests. Approximately 10% of peptides subjected to sequencing fail to provide any data, either because they derive from the (usually) blocked NH2-terminus of the protein or perhaps because they were lost subsequent to HPLC collection, while the remaining 10% of peptides that fail to provide usable sequences prove to contain mixtures that were not detected by either HPLC absorbance peak shape or MALDI-MS screening (8).

Table II. Summary of results obtained from 191 in gel digests
Parameter
Amount of Protein Digested (pmol) Total 51-100 <50 101-200 >200 54 Number of proteins digested 28 44 65 191 87 62 Average mass of protein (kD) 60 64 59 Average amount digested (pmol) 32 77 140 311 146 271 Median amount digested (pmol) 29 78 138 100 0.22 Avg. density protein band 0.10 0.28 0.26 0.49 Number of peptides sequenced 62 113 145 89 409 2.2 2.1 2.1 Avg. # peptides sequenced/protein 2.2 2.0 77.4 77.0 80.4 % Peptides successfully sequenced 82.1 84.3 10.0 11.2 12.2 Average % initial seq. yield† 17.6 12.8 11.4 11.2 14.2 Avg. # residues sequenced/peptide 12.8 12.5 100 96.3 97.7 97.9 Overall digest success rate (%) 96.9 68.4 Overall % known proteins 88.0 56.8 62.3 79.1
†Based on the initial peptide sequencing yield divided by the estimated amount of protein digested, which is based on hydrolysis/amino acid analysis of the submitted gel slice.

It is important to note that the overall percent initial sequencing yields, which have been calculated based on the average initial peptide sequencing yield divided by the amount of protein digested, are usually near 12% - with the higher value of about 18% observed for the <50 pmol samples probably resulting from slight under-estimation of the amount of protein actually digested in this range (Table II). The problem in this regard is that as the actual amount of protein hydrolyzed in the aliquot of gel matrix is decreased, nonspecific losses due to adsorption and other factors become more important. As previously reported (5), the overall success rate of in gel digests (98%) is extremely high and is all the more remarkable in view of the fact that the data summarized in Table II derive exclusively from in gel samples prepared by more than 150 principal investigator-users of the internal sequencing service provided by the Keck Biotechnology Laboratory.
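The initial sequencing yield defined above (initial peptide yield divided by the amount of protein digested) lends itself to a short worked calculation. The following Python sketch is illustrative only: the function names are hypothetical, and the default of 12% is simply the typical value quoted in the text, not a constant of the method.

```python
def initial_yield_percent(peptide_pmol: float, protein_pmol: float) -> float:
    """Initial sequencing yield (%) = initial peptide yield / protein digested."""
    if protein_pmol <= 0:
        raise ValueError("protein_pmol must be positive")
    return 100.0 * peptide_pmol / protein_pmol


def expected_peptide_pmol(protein_pmol: float, yield_percent: float = 12.0) -> float:
    """Peptide amount expected at the sequencer for a given digest size.

    The 12% default is the typical value quoted in the text (an assumption
    for any other laboratory or sample).
    """
    return protein_pmol * yield_percent / 100.0


# Expected initial peptide yields for a few digest sizes (pmol).
for amount in (25, 50, 100):
    print(f"{amount} pmol digested -> {expected_peptide_pmol(amount):.1f} pmol peptide")
```

Under this assumption a 25 pmol digest would deliver only a few picomoles of each peptide to the sequencer, which is why the estimate of the starting amount matters so much.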
C. What Are the Limits of In Gel Digests?
The finding that several important parameters that characterize successful in gel digests (such as the initial sequencing yields, the fraction of peptides successfully sequenced and the overall digest success rate) do not significantly decline as the amount of protein digested extends below the 50 pmol range (Table II) suggests the limits of this approach have not yet been reached. Indeed, examination of the kinetics of in gel digestion confirms this hypothesis. Assuming trypsin is never saturated with substrate in an in gel digest, the rate of formation of tryptic peptides follows the second order reaction:

$$\frac{d[\mathrm{Peptides}]}{dT} = \frac{k_{cat}}{K_m}\,[\mathrm{Protein}]\,[\mathrm{Trypsin}]$$

Rearranging terms leads to the expression:

$$\frac{d[\mathrm{Peptides}]}{[\mathrm{Protein}]} = \frac{k_{cat}}{K_m}\,[\mathrm{Trypsin}]\,dT$$

The inescapable conclusion from this analysis is that if both the trypsin concentration and the time of digestion are constant, the fraction of protein digested will also be constant - regardless of how little substrate protein is present. That is, under these conditions there is no theoretical limit of sensitivity to digesting in gel protein samples. This supposition is supported by the data in Fig. 4, which also serves to illustrate the practical limits of in gel digests. By comparing the blank and sample HPLC profiles in Fig. 4 (Panels A-C), it is obvious that, utilizing the approaches described in this work, a 2.5 pmol digest of transferrin is beyond the practical limits of in gel digestion. However, the accompanying MALDI-MS spectra of 10% of each digest (Fig. 4, Panels D-F) indicate that while even the 2.5 pmol digest failed based on the HPLC profile, both the 2.5 pmol and the 250 fmol digests actually succeeded based on the MALDI-MS spectra and by matching observed with expected peptide masses. Clearly, successfully purifying and isolating peptides from in gel digests of 2.5 pmol or less of protein will require the use of capillary HPLC columns and flow rates that are well below the 50 µl/min rate used in Fig. 4.

Figure 4. Reverse phase HPLC and MALDI-MS of in gel digests of 2.5 pmol (A and D) and 250 fmol (B and E) transferrin with panels C and F corresponding to the blank control. In each case 90% of the digest was subjected to HPLC on a 1.0 mm ID Vydac C-18 column eluted at 50 µl/min and the remaining 10% subjected to MALDI-MS. Axes: time (min) for the HPLC panels; m/z (450 - 4,500) for the MALDI-MS panels.

In terms of trying to establish a practical limit of sensitivity for in gel digests, the chromatograms in Fig. 5 demonstrate that while a 25 pmol digest of transferrin is clearly reasonable (panel A), the increasing background in the 10 pmol digest of serum albumin (panel B) suggests the practical limit is not far below this level. Taken together, the data in Fig. 4 and 5 suggest the practical limit of in gel digestion is probably near the 5 pmol range - providing sufficient care is exercised in terms of running a "blank" digest and then using it, as well as perhaps MALDI-MS screening, to avoid trying to sequence absorbance peaks resulting from reagent artifacts and trypsin autolysis products.
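The kinetic argument in this section can be checked numerically. If trypsin is far from saturation and its concentration stays constant, integrating the second order rate law gives a digested fraction of 1 - exp(-(kcat/Km)[Trypsin]T), which contains no term for the starting amount of protein. The Python sketch below compares that closed form with a simple Euler integration for two very different starting amounts. The rate constant, trypsin concentration, and time used here are hypothetical illustration values, not measurements from this study.

```python
import math


def fraction_digested(kcat_over_km: float, trypsin_conc: float, time_s: float) -> float:
    """Closed-form fraction digested, assuming unsaturated trypsin at constant concentration."""
    return 1.0 - math.exp(-kcat_over_km * trypsin_conc * time_s)


def simulate_fraction(protein0: float, kcat_over_km: float, trypsin_conc: float,
                      time_s: float, steps: int = 10_000) -> float:
    """Euler integration of d[Peptides]/dT = (kcat/Km)[Protein][Trypsin]."""
    dt = time_s / steps
    protein = protein0
    for _ in range(steps):
        protein -= kcat_over_km * protein * trypsin_conc * dt
    return 1.0 - protein / protein0


# Hypothetical values chosen only for illustration.
K = 1.0e3        # kcat/Km in M^-1 s^-1 (assumed)
TRYPSIN = 6.0e-7  # trypsin concentration in M (assumed)
TIME = 3600.0     # digestion time in s (assumed)

for start in (2.5e-10, 2.5e-12):  # two starting protein concentrations (M)
    print(start, round(simulate_fraction(start, K, TRYPSIN, TIME), 4))
print("closed form:", round(fraction_digested(K, TRYPSIN, TIME), 4))
```

Both simulated starting amounts give the same digested fraction, consistent with the conclusion that sensitivity is limited by peptide isolation and detection rather than by the digestion itself.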
Conclusions
Based on the data in Table II, in gel digestion is a remarkably robust approach for obtaining internal peptide sequences from SDS-PAGE-separated proteins. Since all of the samples upon which the data in Table II are based were submitted by investigator-users of the Keck Biotechnology Resource Laboratory, in gel digestion apparently imposes few constraints in terms of the particular polyacrylamide gel system being used (i.e., the data in Table II include samples from gradient SDS polyacrylamide gels as well as from native and two dimensional gels where the second dimension was SDS-PAGE) and (within reason) the quality of the reagents used for making and running the gels. That is, we suspect the quality of the reagents used by the more than 150 investigators who prepared the samples summarized in Table II probably varied widely. In addition, since the majority of these samples were shipped long distances on dry ice, Coomassie Blue-stained gel bands are also quite stable when simply excised and placed in Eppendorf tubes (5).
Figure 5. Reverse phase HPLC of in gel tryptic digests of 25 pmol transferrin (A) and 10 pmol bovine serum albumin (B) and of the corresponding digests carried out on blank sections of gels (lower profiles shown in above two figures). In each instance 90% of the digest was subjected to HPLC on a 1.0 mm ID Vydac C-18 column eluted at 50 µl/min. The respective full scale deflections were 18.9 mV for panel A and 4.4 mV for panel B with 0.5 volt corresponding to an absorbance of 1.0 at 210 nm.
Perhaps the most significant problem encountered in carrying out large numbers of in gel digests in a core laboratory setting is over-estimation of the amount of submitted protein. To circumvent this problem, we subject a 10-15% aliquot of each submitted in gel sample to hydrolysis and high sensitivity ion exchange amino acid analysis. If this analysis indicates that less than the recommended amount of protein remains (i.e., currently we recommend a minimum of 25 pmol and that the density of the protein in the stained gel band exceed 0.05 µg/mm³), the submitting investigator is then firmly advised that it is in their best interest to purify additional protein that can either be pooled with the existing sample or (if the density of the sample in the gel band is too low due to it having been run in too many lanes - which causes problems in terms of efficient washing and extraction) that can be used to replace the existing sample. Upon learning that the probability of success and the quality of the resulting data would almost certainly be improved by submitting at least the recommended minimum amount of protein, almost invariably we find the submitting investigator willing to purify and submit additional protein. We believe the primary reason the median success rate for in gel digests reported by 16 respondents in a recent survey was only 78% (2), as opposed to the 98% success rate reported in Table II, was over-estimation of the amount of submitted protein. Although we estimate that hydrolysis and ion exchange amino acid analysis of an aliquot of each submitted gel band provides an approximately 10-fold more accurate estimate of the amount of protein remaining than can be
determined by estimating relative Coomassie Blue staining intensity, with some caution, the latter can be successfully used to improve the overall success rate of in gel digests. In this case it is important to realize that Coomassie Blue staining is completely reversible and that the intensity of staining depends upon the number of lysine and arginine residues (10), which accounts for the 2-3 fold range in Coomassie Blue staining intensity that we often see with standard proteins. Nonetheless, we believe that if a few different concentrations of several standard protein mixtures are run on the same gel as the sample, it is possible to use relative staining intensity to routinely estimate the amount of submitted protein within a 2-3 fold range. If the minimum recommended amount of submitted protein is then set sufficiently high (i.e., 50 pmol instead of 25 pmol) to accommodate this range, it should then be possible to approach an in gel digest success rate that is close to 100%. Although the three in gel digests that we have so far carried out below the 15 pmol range succeeded, as evidenced by the fact all 3 of these "unknown" proteins were identified by database searching of the first tryptic peptide sequence obtained, we currently recommend that a minimum of 25 pmol protein be submitted for in gel digestion. This is 5-fold above the least amount of an "unknown" protein (i.e., ~5 pmol as estimated by amino acid analysis) that we have so far attempted and succeeded in digesting and sequencing. In terms of the amount of protein routinely required for internal sequencing of SDS-PAGE-separated proteins, our data agree well with that determined by a survey of 26 core laboratories, which included 16 that carry out in gel and 10 that carry out in situ PVDF digests (2). 
In both cases the median amount of protein recommended for internal sequencing was about 60 pmol and the least amount of protein that had been successfully digested (and from which at least two, 10 residue peptides had been sequenced) was about 25 pmol, with the range on the latter figure extending down to 5 pmol. Being ever cognizant of the expected average initial sequencing yield from an in gel digest of 12% (Table II), we suggest that a laboratory that carefully estimates the amount of submitted protein, that can carry out "preparative" HPLC at flow rates near 50 µl/min on 1 mm ID columns, and that can routinely sequence in the low to sub-picomole range can routinely succeed with amounts of protein that extend down to the 25-50 pmol range. Finally, it is important to be mindful of the ever changing role played by protein chemistry in the elucidation of primary structures. Over the last 25-30 years that role has evolved from the rather plodding task of complete primary structure determination to the opportunistic uncovering of selected peptide sequences needed to serve as the basis for synthesizing oligonucleotide probes and primers. One has only to keep in mind the increase in the fraction of "unknown" proteins identified by database searching of internal peptide sequences (i.e., 46% of internal sequencing samples submitted to the Keck Biotechnology Laboratory in 1994 (5) as opposed to the current 68% (Table II)) and the ever nearing completion of the human and other genome projects to realize that one of the next "frontiers" for protein chemistry will almost surely be extremely rapid and high sensitivity protein identification (often involving the identification of trace amounts of proteins separated by two-dimensional polyacrylamide gels and whose concentrations have been shown to be altered
in response to external stimuli, cell cycle and other changes) and that mass spectrometry will almost certainly be at the forefront of this effort.
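The submission guidelines given in these Conclusions (a minimum of 25 pmol of protein and a band density above 0.05) can be written as a simple acceptance check. The Python sketch below is illustrative only. It assumes the density unit is µg of protein per mm³ of gel, converts pmol to µg from the molecular weight, and takes the band volume as an input. The function names and the example band volumes are hypothetical.

```python
MIN_PROTEIN_PMOL = 25.0       # recommended minimum amount from the text
MIN_DENSITY_UG_PER_MM3 = 0.05  # recommended minimum density; unit assumed to be µg/mm³


def band_density(protein_pmol: float, mw_kd: float, band_volume_mm3: float) -> float:
    """Protein density in the gel band (µg/mm³).

    1 pmol of a 1 kD protein weighs 0.001 µg, so µg = pmol * kD / 1000.
    """
    if band_volume_mm3 <= 0:
        raise ValueError("band_volume_mm3 must be positive")
    return protein_pmol * mw_kd / 1000.0 / band_volume_mm3


def meets_guidelines(protein_pmol: float, mw_kd: float, band_volume_mm3: float) -> bool:
    """True if the sample has at least 25 pmol and a density above 0.05 µg/mm³."""
    density = band_density(protein_pmol, mw_kd, band_volume_mm3)
    return protein_pmol >= MIN_PROTEIN_PMOL and density > MIN_DENSITY_UG_PER_MM3


# A 50 kD protein in a single band versus the same protein spread over many lanes.
print(meets_guidelines(25, 50, 15))    # single band of about 15 mm³ (hypothetical)
print(meets_guidelines(100, 50, 200))  # pooled from many lanes, 200 mm³ (hypothetical)
```

The second case illustrates the point made above: a sample can contain enough protein in total and still fall below the density guideline when it is run in too many lanes.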
Acknowledgments We especially thank Michael Laskowski (Purdue University) for bringing our attention to the kinetics of in gel digestion and Myron Crawford, Ray DeAngelis, Ed Papacoda and Nancy Williams, who are all members of the Protein Chemistry Section of the HHMI Biopolymer/Keck Biotechnology Laboratory, for their assistance with this study.
References
1. Williams, K.R., Niece, R.L., Atherton, D., Fowler, A.V., Kutny, R. and Smith, A.J. (1988) FASEB J. 2, 3124-3130.
2. Unpublished survey of 30 biotechnology core laboratories taken by K. Williams.
3. Rosenfeld, J., Capdevielle, J., Guillemot, J.C. and Ferrara, P. (1992) Anal. Biochem. 203, 173-179.
4. Hellman, U., Wernstedt, C., Góñez, J. and Heldin, C.-H. (1995) Anal. Biochem. 224, 451-455.
5. Williams, K.R. and Stone, K.L. (1995) In Techniques in Protein Chemistry VI (J.W. Crabb, ed.) 143-152.
6. Stone, K.L. and Williams, K.R. (1996) In The Protein Protocols Handbook (J.M. Walker, ed.) 415-425.
7. Stone, K.L. and Williams, K.R. (1996) In The Protein Protocols Handbook (J.M. Walker, ed.) 427-434.
8. Williams, K.R., Samandar, S.M., Stone, K.L., Saylor, M. and Rush, J. (1996) In The Protein Protocols Handbook (J.M. Walker, ed.) 541-555.
9. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) J. Mol. Biol. 215, 403-410.
10. Chial, H.J. and Splittgerber, A.G. (1993) Anal. Biochem. 213, 362-369.
A Strategy to Obtain Internal Sequence Information from Blotted Proteins after Initial N-terminal Sequencing
Kuo-Liang Hsi, William E. Werner, Lynn R. Zieske, Chris H. Grimley, Steven A. O'Neill, Michael L. Kochersperger, Kent Yamada and Pau-Miau Yuan
PE Applied Biosystems, Foster City, CA
I. Introduction
SDS-PAGE followed by electroblotting of protein samples onto PVDF type membrane is a commonly used approach to prepare protein samples for sequence analysis. However, if the protein of interest is N-terminally blocked, no sequence information can be obtained. Thus, generating internal peptide fragments to identify sequences from N-terminally blocked proteins, or to maximize sequence information from larger proteins, requires purification of additional protein sample. With the advent of high sensitivity sample preparation systems employing capillary HPLC, it has become feasible to explore the generation and purification of internal peptide fragments from modest amounts of protein (60 picomole) immobilized onto PVDF membrane that have previously been subjected to Edman degradation. Our initial investigations revealed that after proteins had been subjected to Edman chemistry, they were refractory to digestion by the enzymes trypsin, Lys-C and Glu-C. It was possible to generate internal fragments using chymotrypsin, but the subsequent peptide maps were contaminated by extensive auto-digestion products. Greater success was achieved when chemical cleavage methods were employed. Two proteins, carbonic anhydrase and transferrin, were chosen as models for this study. The following experiments demonstrate the generation, extraction and subsequent purification of internal fragments using both cyanogen bromide to cleave proteins at methionine, and incubation in formic acid at elevated temperature to cut between aspartic acid and proline.
II. Materials and methods
A. Chemicals
Carbonic anhydrase and transferrin were purchased from Sigma Chemical Co. (St. Louis, MO). Formic acid, cyanogen bromide, and 3-cyclohexylamino-1-propanesulfonic acid (CAPS) were purchased from Aldrich (Milwaukee, WI). Pre-cast 10-20% gradient Tris-tricine polyacrylamide gels were purchased from Novex (San Diego, CA). ProSorb™ Cartridges, ProBlott PVDF membrane and all the solvents and reagents used for HPLC were obtained from PE Applied Biosystems (Foster City, CA).
B. Protein sample preparation
Fig. 1 illustrates the general flowchart for sample preparation. Two proteins, carbonic anhydrase possessing a blocked N-terminus and transferrin possessing a free N-terminus, were used as models in this study. Both proteins were prepared as follows.
1. ProSorb Cartridge
Carbonic anhydrase and transferrin (60 picomole each) were dissolved separately in 50 µl of 0.1% TFA and loaded into the ProSorb Cartridge. After the solution had passed through the membrane, the membrane was washed with 100 µl 0.1% TFA.
2. SDS-PAGE and electroblotting
Carbonic anhydrase and transferrin (60 picomole each) were run on a 10-20% SDS-polyacrylamide Tris-Tricine gel (1) and then electrically transferred to PVDF membrane in CAPS buffer (2). The blotted proteins were stained with Coomassie Blue G-250 and the stained bands were excised for further study.
C. On-membrane cysteine modification
Cysteine residues in transferrin were reduced and alkylated in a manner similar to that used for solution samples (3). Modification of samples prepared with the ProSorb cartridge can be performed in the same ProSorb cartridge before membrane removal, while modifications were performed in an Eppendorf tube for the electroblotted samples. The membranes were incubated for 15 minutes at room temperature in a 0.25 M Tris/HCl and 6 M guanidine hydrochloride buffer containing 1 µl of mercaptoethanol, followed by the addition of 1 µl of 4-vinylpyridine for another 15 minutes. The membranes were washed thoroughly with 0.1% TFA afterwards.
D. Edman degradation treatment
Carbonic anhydrase and transferrin immobilized onto membranes either by ProSorb or by SDS-PAGE/electroblotting were subjected to 10-20 cycles of Edman degradation on an Applied Biosystems 473 Sequencer. The sequenced membranes were used directly for successive chemical fragmentation.
E. Chemical cleavage of sequenced proteins
Sequenced membranes were treated in 60 µl of 70% formic acid containing a 50- to 100-fold molar excess of crystalline cyanogen bromide and incubated for 2 hours at 70 °C in the dark. The solution was removed and the membrane was extracted twice with 60 µl each of 50% acetonitrile containing 10% TFA. Each extraction was conducted in a sonicator for 15 minutes. The extracts were pooled with the formic acid solution and the volume was reduced to a few µl with a Savant Speed-Vac.
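The two cleavage specificities used here (cyanogen bromide cleaving after methionine, and hot formic acid cleaving between aspartic acid and proline) can be sketched in a few lines of Python to predict which fragments a sequence should give. This is a minimal illustration that assumes complete cleavage at every site. The function name and the short test sequence are hypothetical and are not taken from carbonic anhydrase or transferrin.

```python
def cleavage_fragments(sequence: str) -> list[tuple[int, str]]:
    """Predict fragments from CNBr (after Met) and formic acid (between Asp and Pro).

    Returns (start_residue_number, fragment) pairs with 1-based numbering.
    Assumes complete cleavage. Chemical conversion of the C-terminal Met by
    CNBr is not modelled, so fragments are shown with the original residues.
    """
    seq = sequence.upper()
    cuts = set()
    for i, residue in enumerate(seq[:-1]):
        if residue == "M":                       # CNBr cleaves C-terminal to Met
            cuts.add(i + 1)
        if residue == "D" and seq[i + 1] == "P":  # acid-labile Asp-Pro bond
            cuts.add(i + 1)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [(start + 1, seq[start:end]) for start, end in zip(bounds, bounds[1:])]


# Hypothetical 14-residue sequence with two Met residues and one Asp-Pro pair.
for start, fragment in cleavage_fragments("AGMKLDPALKMYLG"):
    print(start, fragment)
```

Fragments that begin after a Met or with the Pro of an Asp-Pro pair correspond to the "(M)" and "(D)" prefixes used when reporting fragment sequences.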
Fig. 1. Sample preparation flowchart showing the general strategy of protein preparation and generation of chemical degradation fragments. Steps shown: SDS-PAGE and electroblotting, or immobilization on ProSorb; reduction and alkylation; 10-20 cycles of Edman degradation; chemical fragmentation (cyanogen bromide in 70% formic acid at 70 °C for 2 hours); extraction of fragments (50% acetonitrile, 10% TFA, 15 minutes with sonication, 2X); purification of fragments (ABI 173A microblotter, C4 column with PVDF collection); sequencing of purified fragments.
F. Peptide mapping and blotting
Cyanogen bromide digests of proteins were separated on an Applied Biosystems 173A capillary LC/MicroBlotter System (4-5). This newly designed system consists of a capillary LC for sample separation and a dynamic on-line microblotter for direct collection of the separated peptides onto a strip of PVDF membrane. This system is different from the static blotting system described by Hiroshi and Takao (6). Fig. 2 is a schematic diagram showing the working principles of the 173A MicroBlotter System. A C4 reversed-phase capillary column (0.5 x 150 mm) was employed for the separation of CNBr or formic acid-cleaved fragments. cLC conditions for the separation are described in the legends to Fig. 3 and 4. The separated and blotted peptides were used for direct sequence determination.
Fig. 2. A schematic diagram of PE Applied Biosystems' 173A capillary HPLC/MicroBlotter System illustrating the working principles for peptide separation and collection. Labeled components: pump; injector with sample loop; 0.5 mm x 15 cm protein/peptide analytical column; 30 µm ID capillary; Teflon sleeve; dynamic solenoid (50-60 hits/peak at 1 second/hit); collected fractions on PVDF strip (2-4 µl/peak, 2 mm spot); aligned chart recorder peaks.
G. Sequencing of CNBr/formic acid fragments
Sequencing of the generated CNBr fragments was performed on a PE Applied Biosystems Procise cLC Sequencer. 50 µg of polybrene in a 70% methanolic solution was loaded onto each excised membrane prior to sequencing. Sequencing was accomplished using the Gas-Phase cLC method.
III. Results and discussion
There were two advantages to performing cyanogen bromide digestions in 70% formic acid at an elevated temperature: first, the methionine specific cleavage occurred faster, and second, the cleavage between aspartic acid and proline pairs was catalyzed. This resulted in the generation of more peptide fragments for all samples tested in a relatively short time (1-2 hours). Fig. 3 presents the capillary LC separation and direct collection onto PVDF membrane of the peptide fragments generated from carbonic anhydrase. Peaks 2, 3, 4 and 5 from the cLC possessed sequences from carbonic anhydrase: either from Met cleavage (peaks 2, 3 and 4) or from formic acid induced cleavages between Asp-Pro amino acid pairs at elevated temperature (peaks 4 and 5). Although there was no difficulty in identifying the major sequences, some of the peaks contained two or more sequences (Table 1). We believe that partial digestion or sample aggregation during the chemical cleavage reaction may have contributed to these results. The analysis of peaks 1, 6, and 7 failed to yield any amino acid sequences. These peaks probably arose from the reagents (CNBr, for example), from by-products generated by the reaction conditions, or from peptides derived from the blocked N-terminus. Mass spectrometry analysis will be undertaken in the near future to understand the nature of these peaks.

Fig. 3. Mapping and blotting of carbonic anhydrase fragments on 173A. Chromatogram A: samples were prepared by SDS-PAGE/electroblotting. Chromatogram B: samples were prepared from ProSorb. cLC conditions: Column: 0.5 x 150 mm, C4, 5 µm; Solvent A: 0.1% TFA; Solvent B: 0.085% TFA/AcN; Elution gradient employed: B% = 5-45%/140 min; Flow rate: 5 µl/min; Detection: 210 nm/AUFS 0.1. Peaks labeled with an asterisk are dye markers used for locating peaks on the membrane. Peaks labeled with CB came from Coomassie Blue, which was carried over from the initial electroblotting step.

Fig. 4 reveals the capillary LC separation and collection of peptide fragments generated from CNBr degradation of transferrin. Five peaks were collected and subjected to sequence analysis. As with carbonic anhydrase, some of the fragments contained transferrin sequences and others appeared to be artifacts. Peaks 2, 3 and 4 yielded sequences from transferrin. All of them were generated from the cleavage at Met by CNBr.

Fig. 4. Mapping and blotting of transferrin fragments on 173A cLC/MicroBlotter System. Chromatogram A: samples were prepared from SDS-PAGE/electroblotting; Chromatogram B: samples were prepared from ProSorb. The cLC working conditions are the same as in Fig. 3. Peaks labeled with asterisk or CB are the same as described in Fig. 3.

Fig. 5 shows representative sequence data of carbonic anhydrase fragment 5 from the sample prepared by the SDS-PAGE/electroblotting approach. A major sequence of PALKPLALVYGEATS... (starting from residue 41 of the protein) could be identified without difficulty. Another sequence of LKFRTLNFNAEGEPE... was also found in this sample. Fig. 6 presents some representative cycles of fragment 2 of transferrin, also prepared by the SDS-PAGE/electroblotting approach. A single sequence of YLGYEYVTAIRNLRE... (starting from residue 314 of transferrin) with an initial yield around 1.5 picomole was unambiguously identified.

Fig. 5. Sequence analysis of carbonic anhydrase fragment 5. Sequencing was performed on an Applied Biosystems Procise™ cLC sequencer. Experimental conditions are described in the Materials and Methods section. Panels shown: Cycle 1 (P = 260 fmole, L = 430 fmole); Cycle 2 (A = 680 fmole, K = 0 fmole); Cycle 7 (A = 500 fmole, N = 100 fmole); Cycle 8 (L = 430 fmole, F = 200 fmole); Cycle 13 (A = 320 fmole, E = 120 fmole); Cycle 14 (T = 170 fmole, P = 110 fmole).

Table 1. Summary of Sequencing Data of Chemical Cleavage Fragments
Columns: Peak No.; Sequences Determined; Domains in the Protein; Sequencing Initial Yields (pmole)

Carbonic Anhydrase
Peak Nos.: 2, 3, 4, 5
Sequences and domains: (M)LANWRPAQPLKNRQV... (240-254); (M)LKFRTLNFNAEGEPE... (222-236); (D)PALKPLALVYGEATS... (41-55); (D)PALKPLALVYGEATS... (41-55); (D)PALKPLALVYGEATS... (41-55); (M)LKFRTLNFNAEGEPE... (222-236)
Initial yields: 3* (2)**; 3 (2); 2 (1); 3 (2.5); 3 (1)

Transferrin
Peak Nos.: 2, 3, 4
Sequences and domains: (M)YLGYEYVTAIRNLRE... (314-328); (M)GLLYNKINHCRFDEF... (465-479); (M)YLGYEYVTAI... (314-); (M)GLLYNKINHCRFDEF... (465-479); (M)SLDGGFVYIA... (390-)
Initial yields: 3.5 (1.5); 3 (1.5); 1.5 (1); 3 (2.5); 3 (1); 1 (1)

* Sequencing yields of the peptides prepared from ProSorb Cartridge.
** Sequencing yields of the peptides prepared from SDS-PAGE/electroblotting.

A summary of sequence analysis data of the generated fragments from both proteins is shown in Table 1. It can be seen that multiple fragments from the proteins were generated by this strategy. The average overall recovery yields based upon sequencing initial yields of generated fragments from these two protein models were approximately 5% (2-4 picomole from the proteins prepared by ProSorb cartridge and 1-2.5 picomole from the proteins prepared by SDS-PAGE/electroblotting techniques). The low recovery was probably attributable to a combination of factors: sample was lost during the initial Edman treatment, the digestions may not have gone to completion with this procedure, and the proteins were modified during the initial sequencing procedure and may have become more difficult to extract. Even though only low amounts of digested fragments were recovered, it was easy to sequence these fragments. In conclusion, the 173A MicroBlotter and the 494 Procise cLC high sensitivity sequencing system greatly simplified the generation of internal peptide fragments from protein samples that had previously been exposed to Edman chemistry.
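The recovery figure quoted above can be reproduced with simple arithmetic: recovered fragment amount divided by the 60 picomole of protein applied. The Python sketch below is a worked illustration under that assumption. The function name is hypothetical, and the ranges are the ones quoted in the text.

```python
LOADED_PMOL = 60.0  # amount of each protein applied, as stated in Materials and methods


def recovery_percent(recovered_pmol: float, loaded_pmol: float = LOADED_PMOL) -> float:
    """Overall recovery (%) based on the sequencing initial yield of a fragment."""
    if loaded_pmol <= 0:
        raise ValueError("loaded_pmol must be positive")
    return 100.0 * recovered_pmol / loaded_pmol


# Initial yield ranges (pmol) quoted in the text for the two preparation methods.
yield_ranges = {
    "ProSorb cartridge": (2.0, 4.0),
    "SDS-PAGE/electroblotting": (1.0, 2.5),
}

for method, (low, high) in yield_ranges.items():
    print(f"{method}: {recovery_percent(low):.1f}-{recovery_percent(high):.1f}%")
```

On these numbers the ProSorb samples bracket the quoted figure of about 5%, while the electroblotted samples fall somewhat below it.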
Fig. 6. Sequence analysis of transferrin fragment 2. Sequencer and the experimental conditions employed were the same as described in Fig. 5 and the Materials and Methods section. Panels shown: Cycle 9 (A = 600 fmole); Cycle 15 (E = 160 fmole).
References
1. Laemmli, U.K. (1970) Nature (London) 227, 680-685.
2. User Bulletin (1993) Number 58, Applied Biosystems.
3. Hawke, D.H. and Yuan, P-M. (1978) User Bulletin (Applied Biosystems) 28, 1-8.
4. Kochersperger, M.L., Hsi, Kuo-Liang, and Yuan, P-M. (1994) Protein Science Vol. 3, Suppl. 1, 98, 265-M.
5. Hsi, Kuo-Liang, Kochersperger, M.L., Werner, W.E., Ly, Hung, Sandell, S., and Yuan, P-M. (1995) Protein Science Vol. 4, Suppl. 2, 150, 540-M.
6. Murata Hiroshi and Toshifumi Takao (1993) Anal. Biochem. 210, 206-208.
INTERNAL PROTEIN SEQUENCING OF SDS PAGE-SEPARATED PROTEINS: A COLLABORATIVE ABRF STUDY
Ken Williams^1, Ulf Hellman^2, Ryuji Kobayashi^3, William Lane^4, Sheenah Mische^5, and David Speicher^6
^1HHMI Biopolymer Laboratory/W.M. Keck Foundation Biotechnology Resource Laboratory, Yale University, New Haven, CT 06536; ^2Ludwig Institute for Cancer Research, Uppsala, Sweden; ^3Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724; ^4Microchemistry Facility, Harvard University, Cambridge, MA 02138; ^5Protein/DNA Technology Center, Rockefeller University, New York, NY 10021; and ^6The Wistar Institute, Philadelphia, PA 19104
I. Introduction
Since many eukaryotic proteins have blocked NH2-termini (1) and SDS polyacrylamide gel electrophoresis (PAGE) appears to be the current method of choice for final purification of proteins destined for amino acid sequencing, internal sequencing of these samples represents an important core laboratory activity that had not yet been addressed in a collaborative Association of Biomolecular Resource Facilities (ABRF) study. The goals of this first such study were five-fold: 1) provide a mechanism for ABRF laboratories to anonymously compare their internal sequencing capabilities with other core laboratories, 2) provide a reasonable sample and well proven protocols to facilitate introduction of this technology into those laboratories that do not yet offer internal sequencing, 3) obtain data that may help determine the relative efficacy of internal sequencing from PVDF blots versus from in-gel samples, 4) determine if there are any significant commonalities among the best in-gel and PVDF digests to help optimize these protocols, and 5) compile data obtained by multiple laboratories on the same "unknown" sample that may help establish realistic expectations for internal sequencing.
II. Materials and Methods
A. Sample Preparation and Distribution
The 1996 ABRF internal sequencing samples consisted of three samples: 1) a 28 kD recombinant β-spectrin fragment; 2) the same β-spectrin fragment with an additional, unique, 15-residue tryptic peptide sequence inserted near its NH2-terminus, resulting in a mass of about 30 kD; and 3) an external peptide standard (450 pmol) that was provided dry in an Eppendorf tube and that had the same amino acid composition as the unique tryptic peptide insert but whose sequence was randomized. In the case of the two protein samples, 70 pmol of each had been subjected to SDS PAGE and were supplied either as Coomassie Blue stained gel slices or as a section of amido black stained PVDF membrane. In the case of the PVDF samples, an oversize piece of PVDF was included so that a section could be used as a digest control, and in the case of the gel samples, a blank section of gel was included for the same purpose in a separate Eppendorf tube. In response to a descriptive letter sent to 258 ABRF Directors, which provided the option of receiving either the PVDF or gel samples, or both sets of samples, 112 laboratories requested a total of 100 PVDF and 90 gel samples.
B. Protocol for the 1996 ABRF Internal Digest Study
Participants were requested to digest the two protein samples and the control with trypsin following either their own procedure or a representative procedure included with the samples. Since neither protein contained cysteine, modification of this amino acid was not required. Participants were then asked to subject the three digests and 22.5 pmol of the external standard to reverse phase HPLC and to forward the resulting chromatograms, along with a 3 page sample data sheet, to the Internal Protein Sequencing Committee. Anonymity of participants was ensured by having the data returned to the Committee via a disinterested third party, who numbered the data sets in order of receipt and removed all identifying marks. For those laboratories that wished to proceed further, it was suggested that the 30 kD digest be collected and that the unique, 15 residue tryptic peptide insert be further characterized by mass spectrometry and/or amino acid sequencing. This "target" peptide could be identified by its presence in the 30 kD digest and absence in the 28 kD digest, by its above average absorbance due to the presence of aromatic amino acids (see below), and by its elution close to the external standard peptide. A minor complication occurred during the week-long period required to prepare the 190 samples. Apparent partial proteolysis occurred near the COOH-terminal region of the 30 kD protein, which resulted in some cross-contamination of the 30 kD fragment into the 28 kD gel band. Based on NH2-terminal sequencing of selected samples, the ratio of the target peptide in the 30 kD versus 28 kD bands on SDS PAGE varied from about 4:1 to 2:1 instead of the target peptide being unique to the 30 kD sample.
C. Design of the Unique Peptide Insert
The NH2-terminus sequence of the recombinant 30 kD fragment was
NH2-G-S-P-K-N-Y-E-V-H-T-W-D-V-E-L-S-Q-F-K-G-S-V...
The primary concerns in choosing the sequence of the unique peptide insert were that a tryptic digestion of the 30 kD sample would release the target peptide (residues 5 to 19 of the sequence above, from the asparagine that follows the first lysine through the second lysine) in good yield and that it should not co-elute with other major peaks in the 30 kD chromatogram. Hence, the peptide was preceded and followed by lysine, proline was avoided after the lysines, and no acidic residues were included near either intended tryptic cleavage site. The 15 residue length was chosen to be within the range commonly seen for tryptic peptides. To ensure the target peptide was a major absorbance peak, one tryptophan and one tyrosine were included, and to avoid the necessity of reduction/cysteine modification, cysteine was not included. To avoid co-elution with other tryptic peptides derived from the 28 kD protein, the amino acid composition of the peptide was chosen so that it would elute near 30% acetonitrile based on published retention coefficients and a constant parameter that is a function of the particular column and HPLC system being used (2). A synthetic peptide analogue of the unique insert actually eluted at about 28% CH3CN both from a Vydac C-18 column on the system being tested (data not shown) and from a Zorbax C-18 column on an HPLC system located in a different laboratory (Fig. 1). In the latter instance the peptide insert eluted close to a minor peak eluting at about 54 min in the 28 kD chromatogram. As noted previously, the external peptide standard had the same amino acid composition as, but a different sequence from, the unique peptide insert. The external standard had the following sequence:
NH2-LEHNVEWQEDVSYTK-COOH
Somewhat surprisingly, the external standard usually eluted at a CH3CN concentration that was 5-8% less than that of the target peptide.
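The design rules above (cleavage after lysine, no proline after the lysines) can be checked with a minimal in-silico digest. The sketch below is illustrative only: it assumes the garbled residue at position 17 of the NH2-terminal sequence is glutamine (Q), uses standard average residue masses, and applies the common simplification that trypsin cleaves after K or R unless the next residue is P. The function names are hypothetical. Under these assumptions the released 15-residue peptide has an average mass that agrees with the predicted value of 1895.06 quoted in the Results.

```python
# Standard average residue masses (Da); water is added once per peptide.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153

# NH2-terminal residues of the 30 kD fragment as given in the text.
# Assumption: residue 17 is read as Q. The sequence continues beyond "GSV",
# so the last fragment returned below is incomplete.
N_TERMINAL = "GSPKNYEVHTWDVELSQFKGSV"


def tryptic_peptides(sequence):
    """Cleave after K or R unless the following residue is P (simplified rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def average_mass(peptide):
    """Average mass of a free peptide: sum of residue masses plus one water."""
    return sum(AVERAGE_RESIDUE_MASS[r] for r in peptide) + WATER


if __name__ == "__main__":
    for pep in tryptic_peptides(N_TERMINAL):
        print(f"{pep:>16s}  {len(pep):2d} residues  {average_mass(pep):9.2f} Da")
```

The second peptide printed is the 15-residue target; valine sits at its position 4, which is the residue used later for the sequencing yield.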
Figure 1. Reverse phase HPLC separation of the external peptide standard (50 pmol, top chromatogram) and an in situ PVDF digest of the 28 kD recombinant (bottom chromatogram). Following SDS PAGE of a mixture of 50 pmol of each protein and blotting onto PVDF, the 28 kD protein was digested with trypsin (3) and subjected to reverse phase HPLC on a Zorbax C18 column (1 x 150 mm) eluted at 37 °C at a flow rate of 75 µl/min. The column was equilibrated with 95% buffer A (0.06% TFA) and 5% buffer B (0.055% TFA in CH3CN) and was then brought to 33% and 60% buffer B with linear gradients extending to 63 and 95 min respectively.
D. Data Analysis
By reference to the external peptide standard it was possible to correct for differences in flow rates, flow cell path lengths, and other HPLC variables and thus to subject the chromatographic profiles to semi-quantitative analysis. The relative peak height for each 30 kD chromatogram was calculated from the sum of the measured peak heights of the 5 most intense peaks (avoiding obvious artifact peaks at the beginning and end of the profiles) relative to the external standard peak height. The number of peaks in a 30 kD chromatogram was defined as the number of peaks with >20% the peak height of the external peptide standard. Similarly, the number of background peaks was defined as the number of peaks in the blank digest with >20% the peak height of the external standard. A composite, relative chromatography score was calculated by adding together the relative peak height and the number of 30 kD peaks and then subtracting the number of background peaks from this sum. In each of these three categories, the ratio of the individual score to that of the best score was calculated prior to calculating the composite score. Hence, the composite scores can range between -1.0 (worst) and 2.0 (best). A qualitative assessment of chromatographic reproducibility was based on overlaying the 30 kD and 28 kD chromatograms to determine if it was reasonably possible to identify co-eluting peaks in these two chromatograms. The sequencing yield for the target peptide was based on the reported yield of valine at position 4 (Val4) in the sequence.

III. Results
As shown in Table I, 76% of the 39 laboratories that participated in this study routinely carry out in situ PVDF and/or in-gel digests, and trypsin (78%) and endoproteinase Lys-C (43%) are the two most frequently used enzymes. The most commonly cited protocols that were routinely used included those by Fernandez et al. (3), 31%, for PVDF, and those by Rosenfeld et al. (4), 15%, and Hellman et al. (5), 10%, for in-gel digests. The most commonly used HPLC columns were C18 (58%), and the most commonly used column diameters were 2 to 2.1 mm (67%), with lengths of 150-250 mm (62%). Although respondents indicated that a median of 60% of proteins submitted for internal sequencing ultimately prove to have already been sequenced, only 26% of the participants routinely use peptide mass database algorithms for protein identification (Table I). Since <40% of the participants routinely perform mass analysis of HPLC isolated peptides prior to sequence analysis (Table I), this suggests that the relatively low fraction of facilities that routinely use peptide mass searching primarily reflects lack of routine access to the necessary equipment. In view of data suggesting that about 80% of soluble proteins from ascites cells are N-α-acetylated (1), it is somewhat surprising that participants in this study estimate that only 25% of proteins received for sequencing are blocked (Table I). However, the very large range (from 2-80%) in responses to this question suggests that either some laboratories may receive a high proportion of proteins from prokaryotic sources, where the occurrence of blocked N-termini is very low (6), or there is considerable error in this estimate.

Table I. Summary of responses to selected sample submission questions

Question                                                                                        n    Response
Routinely perform in-gel or PVDF digestions?                                                    38   76%
Routinely use peptide mass database algorithms for protein identification?                      39   26%
Perform mass analysis of HPLC isolated peptides prior to sequence analysis?                     39   39%
Routinely provide database search as a service in your laboratory?                              35   86%
What percentage of the proteins you receive for sequence analysis are N-terminally blocked?     24   25% (2-80)*
What percentage of the proteins you receive for sequence analysis ultimately prove to have
  already been sequenced as evidenced by database searches?                                     28   60% (10-95)*

*Median value is given followed by the range.
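The scoring scheme described under Data Analysis can be sketched in a few lines. This is a minimal illustration, not the Committee's actual procedure: the function names and the example numbers are hypothetical, and it assumes that each category is normalized by the largest value observed in that category (for background peaks, the largest count), which is the reading that reproduces the stated range of -1.0 to 2.0.

```python
def relative_peak_height(peak_heights, standard_height, top_n=5):
    """Sum of the top_n most intense peaks divided by the external standard height."""
    strongest = sorted(peak_heights, reverse=True)[:top_n]
    return sum(strongest) / standard_height


def count_peaks(peak_heights, standard_height, threshold=0.20):
    """Number of peaks taller than 20% of the external standard peak."""
    return sum(1 for h in peak_heights if h > threshold * standard_height)


def composite_scores(labs):
    """Composite score per laboratory.

    labs maps a label to (relative_peak_height, n_30kD_peaks, n_background_peaks).
    Assumption: each term is divided by the largest value seen in its category.
    """
    max_height = max(v[0] for v in labs.values()) or 1.0   # avoid dividing by zero
    max_peaks = max(v[1] for v in labs.values()) or 1.0
    max_background = max(v[2] for v in labs.values()) or 1.0
    return {
        name: height / max_height + peaks / max_peaks - background / max_background
        for name, (height, peaks, background) in labs.items()
    }


if __name__ == "__main__":
    # Hypothetical data sets, for illustration only.
    example = {"lab_a": (4.0, 14, 0), "lab_b": (2.0, 7, 10), "lab_c": (0.0, 0, 10)}
    for name, score in composite_scores(example).items():
        print(name, round(score, 2))
```

A data set that is best in both positive categories and has no background peaks scores 2.0; one with no 30 kD peaks and the most background peaks scores -1.0.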
Figure 2. An above average in situ PVDF digest (laboratory #1). HPLC separation was at room temperature on a 1 x 250 mm Aquapore RP300 column at 150 µl/min. Initial conditions at 100% buffer A (0.1% TFA) were followed by linear gradients to 55% and 85% buffer B (0.08% TFA in 70% CH3CN) to 30 and then to 40 min respectively. (x-axis: Retention Time, min.)
Figure 3. An above average in situ gel digest (laboratory #7). HPLC separation (80% of digest was injected) was at 21 °C on a 2.1 x 100 mm Pharmacia µRPC (C2/C18) column at 100 µl/min. The column, equilibrated at 100% buffer A (0.065% TFA), was eluted isocratically for 20 min followed by linear gradients to 40% and 80% buffer B (0.05% TFA in CH3CN) to 180 and 190 min respectively. (x-axis: Retention Time, min.)
[Fig. 4 panel annotations, as printed: Enzyme: 5 µg Worthington (non-modified sequencing grade); Column ID: 4.6 mm; Flow Rate: 1 ml/min; traces labeled 30 kD.]
Figure 4. Examples of submitted HPLC chromatograms with either high background (A) or without significant 30 kD absorbance peaks (B). (A) This in situ PVDF data set (laboratory #27) was chromatographed at 30 °C on a Vydac C18 column (2.1 x 250 mm) eluted at 500 µl/min and monitored at 200 nm. The column, equilibrated with 100% buffer A (0.1% TFA), was brought to 70% and then to 100% buffer B (0.075% TFA in 70% CH3CN) with linear gradients extending to 60 and 70 min respectively. (B) This in situ PVDF data set (laboratory #16) was obtained at room temperature on a Vydac C18 column (4.6 x 150 mm) eluted at 1 ml/min and monitored at 215 nm. The column, equilibrated with 100% buffer A (0.1% TFA), was then brought to 70% buffer B (0.085% TFA in CH3CN) at 90 min.
Although the 39 participating laboratories submitted 27 PVDF and 30 in-gel data sets, only 20 PVDF and 22 in-gel data sets could be quantified based on the sequencing yields reported for the target peptide and/or on the submitted chromatograms. In the remaining cases, the target peptide was not sequenced, or the chromatograms could not be quantified due to off-scale peaks or the absence of a chromatogram corresponding to the external peptide standard and/or the blank digest. To identify factors that might account for the wide range in results obtained in this study (see Figures 2-4), several potentially important parameters are summarized in Table II for the 30% of the PVDF data sets that identified the most residues in the target peptide (i.e., the 6 above average data sets) versus the 30% of data sets that had the lowest composite chromatography scores (i.e., the 6 below average data sets). The most significant difference between these two groups would seem to be the apparent lack of a blocking detergent in 3 of the 6 below average data sets (Table II). In the absence of a detergent such as Triton X-100 it is likely that the added protease would be lost due to adsorption onto the PVDF membrane, thus accounting for the failure of these digests. Other factors that probably contributed to the relative success of the above average data sets were the more favorable HPLC conditions that were used (smaller column ID and lower flow rates), their slightly higher level of routine experience in carrying out in situ PVDF digests, and, particularly important, the fact that these laboratories routinely digest an approximately 3-fold lower range of protein than that for the below average data sets (Table II). In contrast to these parameters, since there was a 10-fold range in the total PVDF wash volume among the above average data sets, it is unlikely that the generally larger wash volumes used by these laboratories contributed significantly to their success.

Table II. Comparative data from above and below average PVDF digests*

                                                    6 Above Average Sets         6 Below Average Sets
Description                                         n    Median   Range          n    Median   Range
Relative peak height                                4    4.6      3.3-11         6    1.1      0.12-2.2
Number of 30 kD peaks                               4    14       10-15          6    2.5      0-8
Number of background peaks                          4    6.0      3.0-21         6    7.0      2.0-35
Composite chromatography score                      4    1.2      1.0-1.3        6    0.11     -0.9 to 0.6
Number of residues sequenced                        6    14       13-15          6    0        0
Val4 sequencing yield in target peptide (pmol)      5    3.0      1.8-5.7        6    -        -
Total PVDF wash volume (ml)                         6    0.5      0.2-2.0        6    0.2      0.05-0.6
Triton X-100 was used (%)                           6    100      -              6    50       -
Column ID (mm)                                      6    1.6      0.5-2.1        6    2.1      0.8-2.1
Column flow rate (µl/min)                           6    120      20-200         6    200      17-500
Routinely perform PVDF digests (%)                  6    83       -              6    67       -
Quantity of protein routinely digested
  Minimum (pmol)                                    3    20       10-25          6    59       20-270
  Maximum (pmol)                                    3    70       50-80          3    200      130-500

*See Materials and Methods for details concerning calculation of relative peak height and other data summarized above.
Table III summarizes a similar comparison of above and below average in-gel data sets. In this instance there is a stronger positive correlation between several factors that probably contributed to the success of the laboratories that submitted the above average data sets. These include their use of smaller ID columns, lower flow rates, and their significantly higher level of experience in carrying out in-gel digests at a level (i.e., 10-50 pmol) at or below that at which this study was carried out (i.e., in this study 70 pmol each of the 28 and 30 kD samples were subjected to SDS PAGE). Although the absence of Tween 20 or other detergent in the digest buffer and the use of trypsin from a particular vendor seemed to correlate with the above average data sets, the small sample size and the use of only a single protein require that additional studies be carried out to determine if this correlation is significant.

Table III. Comparative data from above and below average in-gel digests*

                                                    7 Above Average Sets         7 Below Average Sets
Description                                         n    Median   Range          n    Median   Range
Relative peak height                                3    4.1      3.2-4.6        7    1.6      0-2.1
Number of 30 kD peaks                               3    16       12-17          7    4.0      0-7.0
Number of background peaks                          3    0        0-7.0          7    12       4.0-16
Composite chromatography score                      3    1.6      1.0-1.6        7    0.13     -0.34 to 0.55
Number of residues sequenced                        7    11       7-15           7    0        0
Val4 sequencing yield in target peptide (pmol)      7    3.3      1.2-4.2        7    -        -
Detergent was used (%)                              7    14       -              7    71       -
Promega trypsin used for digest (%)                 7    100      -              7    29       -
Column ID (mm)                                      7    1.0      0.5-2.1        7    2.1      2.0-2.1
Column flow rate (µl/min)                           7    50       20-200         7    200      150-500
Routinely perform gel digests (%)                   7    100      -              7    33       -
Quantity of protein routinely digested
  Minimum (pmol)                                    5    20       10-50          5    100      50-10000
  Maximum (pmol)                                    6    100      20-200         2    570      130-1000

*See Materials and Methods for details concerning calculation of relative peak height and other data summarized above.

Success in this study required both that the sample be digested and that the digest then be fractionated via HPLC; hence, problems may occur during either or both of these procedures. Since a blocking detergent was apparently not included in the PVDF digest shown in Fig. 4A, this below average chromatogram may have resulted from a failed digest. The high background in this chromatogram may be due to use of a 10-fold higher amount of trypsin (5 µg as opposed to the 0.2-0.5 µg range used by the other laboratories that submitted PVDF digests) that had not been modified to minimize autolysis. In contrast, while there does not appear to be any obvious reason (based on the data provided in the accompanying sample sheets) why the digest shown in Fig. 4B failed, the conditions under which this HPLC separation was carried out were not optimum (flow rate of 1 ml/min on a 4.6 mm ID column). In comparison, the median flow rates used for the above average PVDF and in-gel data sets were 120 and 50 µl/min respectively on either 1.0 or 2.1 mm columns, with the latter conditions providing as much as a 20-fold possible increase in detection sensitivity (assuming similar flow cell path lengths, monitoring wavelengths and proportional peak volumes) compared to the chromatogram shown in Fig. 4B.

Finally, one of the goals of this study was to compare the relative effectiveness of in situ PVDF versus in-gel digests. Since similar numbers of both types of digests were submitted (27 PVDF and 30 in-gel), there does not appear to be a clear consensus in terms of the best way to proceed. Indeed, this supposition is supported by the data in Table IV where, with the possible exception of a small increase in the number of background peaks observed in the in-gel samples, no significant difference was seen in the quality of either the HPLC chromatograms or the accompanying sequencing data for the PVDF versus in-gel digests. With regard to sequencing yield, it is interesting to note that from the amount of protein loaded onto the gel (about 50 pmol after correcting for the 20-33% of the 30 kD band shifted into the 28 kD region; see Section II.B) and the overall median initial sequencing yield of 3.2 pmol (based on the 2.3 pmol median yield of Val4 corrected to cycle 1 using a repetitive yield of 90%), the median overall recovery is about 6.4% (range = 3% to 16%). Assuming an average coupling yield of 50%, this corresponds to an actual average recovery of about 12.8%, which is a figure that must be kept in mind in deciding the amount of protein that must be submitted for internal sequencing to ensure a reasonable probability of success. Overall, approximately 63% of the chromatograms appeared to be reproducible, and the median mass determined by 11 laboratories for the target peptide was 1895.60, which compares with the predicted (average) mass of 1895.06. The median mass error was ±0.028%.

Table IV. Comparative data from PVDF and in-gel digests*

                                                    PVDF                         Gel
Description                                         n    Median   Range          n    Median   Range
Relative peak height                                20   2.4      0-14           18   2.6      0-5.8
Number of 30 kD peaks                               20   8.0      0-15           18   8.5      0-17
Number of background peaks                          21   4.0      0-35           19   7.0      0-22
Number of residues sequenced                        9    13       6-15           7    13       7-15
Val4 sequencing yield in target peptide (pmol)      8    2.5      1.0-5.7        6    2.3      1.2-4.2

*See Materials and Methods for details concerning calculation of relative peak height and other data summarized above.
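The recovery arithmetic in the Results can be retraced step by step. The sketch below is an illustration under the assumptions stated in the text (70 pmol loaded, a 30 kD to 28 kD band ratio between about 4:1 and 2:1, a repetitive yield of 90%, a coupling yield of 50%); the function names are hypothetical. Small differences from the quoted 6.4% and 12.8% arise because the text rounds the initial yield to 3.2 pmol and the loaded amount to 50 pmol before dividing.

```python
def fraction_shifted(ratio_30_to_28):
    """Fraction of the 30 kD protein found in the 28 kD band for a given band ratio."""
    return 1.0 / (ratio_30_to_28 + 1.0)


def initial_yield(yield_at_cycle, cycle, repetitive_yield):
    """Back-extrapolate a PTH-amino acid yield at a given cycle to cycle 1."""
    return yield_at_cycle / repetitive_yield ** (cycle - 1)


def percent_recovery(initial_pmol, loaded_pmol, coupling_yield=1.0):
    """Overall recovery in percent, optionally corrected for coupling yield."""
    return 100.0 * initial_pmol / (loaded_pmol * coupling_yield)


def percent_mass_error(measured, predicted):
    """Signed mass error in percent of the predicted mass."""
    return 100.0 * (measured - predicted) / predicted


if __name__ == "__main__":
    # 4:1 and 2:1 ratios correspond to 20% and 33% of the band shifted.
    for ratio in (4.0, 2.0):
        lost = fraction_shifted(ratio)
        print(f"ratio {ratio:.0f}:1 -> {100 * lost:.0f}% shifted, "
              f"{70 * (1 - lost):.0f} pmol remaining")

    start = initial_yield(2.3, cycle=4, repetitive_yield=0.90)
    print(f"initial yield  {start:.2f} pmol")
    print(f"recovery       {percent_recovery(start, 50.0):.1f}%")
    print(f"with coupling  {percent_recovery(start, 50.0, coupling_yield=0.5):.1f}%")
    print(f"mass error     {percent_mass_error(1895.60, 1895.06):.3f}%")
```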
IV. Conclusions
Based on results obtained on the recombinant 30 kD protein, the submitted data sets argue persuasively that there is no significant difference in the overall effectiveness of PVDF and in-gel approaches to internal sequencing. Rather, the choice between these two approaches would seem to rest largely with personal preference and perhaps to some extent with other factors specific to the protein being studied (such as difficulties in obtaining near quantitative blotting efficiency or, particularly in the case of low molecular weight proteins, unusually high losses during washing of SDS polyacrylamide gel slices prior to in situ digestion). Clearly, Figures 1-3 and Tables II and III demonstrate that excellent results were obtained by several laboratories using either of these approaches. In this regard the PVDF and in-gel data sets submitted by laboratory #3 were particularly noteworthy in that in both instances all 15 residues in the target peptide were correctly identified. Possible reasons for less than optimal results range from apparent methodological errors and sub-optimal HPLC conditions discussed above to the potentially more interesting finding that the presence of a detergent may contribute to a less than optimum in-gel digest. In terms of the overall success rate in this study, 51% of the laboratories that participated (i.e., 20 out of a total of 39) either obtained 6 or more residues of sequence from the unique target peptide or sufficient other internal sequence to identify the parent protein as being derived from β spectrin. In one instance, this identification was also made based on peptide mass searching of 1.7% of the digest (Fig. 5). Since the cover letter that accompanied the samples clearly stated that going beyond the requested digest and analytical HPLC was optional, the actual fraction of participating laboratories that could have succeeded with this sample was surely above the 51% figure. In this regard, it is also interesting to note again that 24% of the participants in this study (Table I) do not routinely carry out in-gel or on-PVDF membrane digests. In fact, in some instances this study apparently represented the laboratory's first attempt at carrying out either of these digests, thus fulfilling one of the intended goals of this first ABRF collaborative research study devoted to internal sequencing of SDS PAGE-separated proteins.

[Fig. 5 spectrum annotation, as printed: ABRF 30 kD band, 0.5 µl of supernatant; mass axis from 1000 to 2600.]
Figure 5. MALDI-MS peptide mass map of 1.7% of the 30 kD in-gel digest as submitted by laboratory 21. The resulting database search matched the underlined masses to the β-chain of human spectrin.
Acknowledgements
This work was partially supported by DOE grant number DE-FG02-95ER61839 to John Crabb (W. Alton Jones Cell Science Center) on behalf of the ABRF. We especially thank the 39 laboratories that made the substantial commitment necessary to participate in this study. The assistance of Robert Tanis (Harvard Medical School) in coordinating data return and ensuring the anonymity of the participating laboratories is appreciated. We also thank Sandra Harper (Wistar Institute) for constructing the expression vector for the 30 kD protein as well as for expressing and purifying the two proteins used in the study. Several members of the authors' laboratories also contributed to the preparation and evaluation of the samples used in this study, especially: Kathy Stone (Yale University), Nora E. Poppito (Cold Spring Harbor Laboratory), Renee A. Robinson (Harvard University), Joseph Fernandez (Rockefeller University), and David Reim (Wistar Institute).
References
1. Brown, J.L. and Roberts, W.K. (1976) J. Biol. Chem. 251, 1009-1014.
2. Guo, D., Mant, C.T., Taneja, A.K., Parker, J.R., and Hodges, R.S. (1986) J. Chrom. 359, 499-517.
3. Fernandez, J., DeMott, M., Atherton, D., and Mische, S.M. (1992) Anal. Biochem. 201, 255-264.
4. Rosenfeld, J., Capdevielle, J., Guillemot, J.C., and Ferrara, P. (1992) Anal. Biochem. 203, 173-179.
5. Hellman, U., Wernstedt, C., Góñez, J., and Heldin, C.-H. (1995) Anal. Biochem. 224, 451-455.
6. Driessen, H.P.C., de Jong, W.W., Tesser, G.I., and Bloemendal, H. (1985) In Critical Reviews in Biochemistry (G.D. Fasman, ed.) 281-325.
SECTION II Physical and Chemical Analysis
Chromatographic Determination of Extinction Coefficients of Non-Glycosylated Proteins Using Refractive Index (RI) and UV Absorbance (UV) Detectors: Applications for Studying Protein Interactions by Size Exclusion Chromatography with Light-Scattering, UV, and RI Detectors
Jie Wen, Tsutomu Arakawa, Jette Wypych, Keith E. Langley, Meredith G. Schwartz, and John S. Philo
Amgen, Inc., Thousand Oaks, CA 91320.
I. Introduction
Because absorbance measurements are generally the easiest and most precise method for concentration determination, knowing the extinction coefficient, ε, of a protein is important for many biochemical and biophysical studies. In particular, in our work using size exclusion chromatography with on-line light-scattering, UV absorbance, and refractive index detectors (SEC-LS/UV/RI) to study the molecular weights of glycosylated proteins and protein-carbohydrate complexes, we have shown that when ε of the polypeptide is known, it is possible to combine the information from all 3 detectors to obtain the polypeptide molecular weight of the complex (1-5). However, the determination of experimental extinction coefficients by dry weight or amino acid analysis is tedious and requires great skill to achieve high accuracy. In many cases, an ε calculated from the amino acid composition (6) is sufficiently accurate, but for some proteins of interest, even the amino acid composition is not known (e.g., monoclonal antibodies). Therefore, for many reasons, it would be useful to have a convenient method to measure ε. Fortunately, for non-glycosylated proteins, we have found that the signals from refractive index (RI) and absorbance (UV) detectors provide a simple chromatographic method for ε determination with reasonable accuracy, and thus when SEC-LS/UV/RI experiments are done, the data for determining ε are available without extra effort. In this paper, we will first outline the method of using RI and UV detectors to determine the extinction coefficients of non-glycosylated proteins and present results for several commercial proteins. Then we will discuss applications of this method for studying protein interactions when using SEC-LS/UV/RI.
II. Methods and Materials
A. Determination of ε with UV and RI Detectors
When a protein is measured by UV and RI detectors, the following two basic equations express the relation of the parameters:
(UV) = K_{UV}\, c\, \varepsilon \qquad [1]
(RI) = K_{RI}\, c\, (dn/dc) \qquad [2]
where (UV) and (RI) are the intensities of the UV absorbance and refractive index signals, K_UV and K_RI are instrument constants, c is the protein concentration in mg/ml, ε is the extinction coefficient of the protein in ml/(mg·cm), and dn/dc is the refractive index increment of the protein. As seen from these two equations, when dn/dc is known we can use an RI detector to measure the concentration and obtain ε as follows:
\varepsilon \propto (dn/dc)\,[(UV)/(RI)] \qquad [3]
It appears that dn/dc differs little from protein to protein (7), and indeed it has long been common practice among those using refractometric optics in the analytical ultracentrifuge to simply assume that dn/dc is the same for all polypeptides. This view is also supported by an examination of literature values. For example, data for 17 proteins in water at visible wavelengths (8) give an average value of 0.186 ml/g with a standard deviation of 0.0024. Thus Eq. [3] can be further simplified to
\varepsilon \propto (UV)/(RI) \qquad [4]
and we still get an accuracy of 1-2%.
In practice, there are several approaches to applying Eq. [3] and Eq. [4] to obtain ε. For example, we could just use a spectrophotometer and a batch-mode refractometer for this purpose. In this paper, we will focus on the on-line chromatographic determination of extinction coefficients of non-glycosylated proteins and its applications in SEC-LS/UV/RI. There are several reasons for focusing on this on-line method. First, liquid chromatography with on-line UV and RI detectors, as well as SEC-LS/UV/RI, is far more common in today's laboratories than batch-mode refractometers. Second, in batch-mode refractometry it is essential to carefully dialyze the protein sample and use the dialysate as a reference, whereas in the on-line method the SEC column provides a rapid equilibration with the eluent, and the reference is obtained conveniently from the baseline reading before or after the protein peak. Third, when using SEC-LS/UV/RI to study the interaction of a non-glycosylated protein with carbohydrates or with a glycosylated protein, the UV and RI data for determining ε are normally acquired by the computer during the experiment and no extra effort is needed. Fourth, a batch-mode spectrophotometer or refractometer usually requires considerably greater amounts of protein than on-line UV and RI detectors. It should be emphasized that any SEC system with RI and UV detectors is sufficient for determining ε, and the light-scattering detector is required only when studying protein interactions.
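Equations [3] and [4] amount to a one-point calibration: a standard protein of known ε fixes the proportionality constant, and the (UV)/(RI) ratio of an unknown then gives its ε. The sketch below is a minimal illustration of that idea; the function names and all signal values are hypothetical, and the standard's ε of 0.667 ml/(mg·cm) is used only as a placeholder number. The default dn/dc of 0.186 ml/g is the average value quoted in the text.

```python
DNDC_DEFAULT = 0.186  # ml/g, average for polypeptides quoted in the text


def calibration_constant(eps_standard, uv_standard, ri_standard, dndc=DNDC_DEFAULT):
    """Proportionality constant k in Eq. [3], eps = k * (dn/dc) * (UV)/(RI)."""
    return eps_standard / (dndc * uv_standard / ri_standard)


def extinction_coefficient(k, uv, ri, dndc=DNDC_DEFAULT):
    """Extinction coefficient of an unknown from its UV and RI peak signals (Eq. [3]).

    Leaving dndc at the same value used for the standard reduces this to Eq. [4].
    """
    return k * dndc * uv / ri


if __name__ == "__main__":
    # Hypothetical peak signals in arbitrary detector units.
    k = calibration_constant(eps_standard=0.667, uv_standard=1000.0, ri_standard=500.0)
    eps_unknown = extinction_coefficient(k, uv=1500.0, ri=500.0)
    print(f"estimated extinction coefficient: {eps_unknown:.3f} ml/(mg cm)")
```

Because only the ratio of the two signals enters, the result does not depend on how much protein was injected, provided both signals are taken from the same peak and are within the linear range of the detectors.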
B. SEC-LS/UV/RI for Studying Protein Interactions

Details of applying SEC-LS/UV/RI to the study of protein interactions have been previously described (2). Only a brief outline is presented here. First, to study the interaction of a non-glycosylated protein with carbohydrates, such as the example presented in section IIIB, we use the following equation (2,4):

where M_p is the molecular weight of the polypeptide component of the protein, (UV), (LS), and (RI) are the intensities of the UV absorbance, light-scattering and refractive index signals, K_UV, K_LS, and K_RI are the instrument calibration constants, and ε_p is the polypeptide extinction coefficient of the protein in ml/(mg·cm). This equation shows that as long as the polypeptide extinction coefficient of a protein is known, the polypeptide molecular weight of the protein or its complex can be determined.

Second, for studying the interaction of a non-glycosylated protein with a glycoprotein, such as the example presented in section IIIC, we can still use Eq. [5]. In order to use this equation, we must be able to calculate the polypeptide extinction coefficient of the complex, ε_p. The ε_p of a complex with a known stoichiometry (A_m B_n) can be calculated using the following equation:

$\varepsilon_p = (m\,\varepsilon_A M_A + n\,\varepsilon_B M_B)\,/\,(m M_A + n M_B)$ [6]

where ε_A and ε_B, and M_A and M_B, are the polypeptide extinction coefficients and molecular weights of proteins A and B, respectively. After obtaining ε_p, we can calculate the polypeptide molecular weight by using Eq. [5] and a self-consistent method as described in reference 2. As seen, the extinction coefficient of each protein in the complex is required for determining the molecular weight.

We regularly use bovine albumin (BSA), chicken ovalbumin, and ribonuclease (RNase) to calibrate the light-scattering instrument (2). These protein standards can also be used to obtain the calibration constant for Eq. [4].
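The display form of Eq. [5] is not legible in this text. The short derivation below is a sketch of a form consistent with Eqs. [1] and [2], not a quotation of the authors' typeset equation. It assumes the standard detector relations for a conjugate: the UV signal responds only to the polypeptide (concentration c_p), while the RI and LS signals respond to the whole conjugate (concentration c, molecular weight M, refractive index increment dn/dc), and M_p = M c_p / c.

```latex
\begin{align*}
(\mathrm{UV}) &= K_{UV}\, c_p\, \varepsilon_p, \qquad
(\mathrm{RI}) = K_{RI}\, c\, (dn/dc), \qquad
(\mathrm{LS}) = K_{LS}\, c\, M\, (dn/dc)^2 \\[4pt]
\frac{(\mathrm{LS})(\mathrm{UV})}{(\mathrm{RI})^2}
  &= \frac{K_{LS} K_{UV}}{K_{RI}^2}\; \varepsilon_p\, \frac{M\, c_p}{c}
   = \frac{K_{LS} K_{UV}}{K_{RI}^2}\; \varepsilon_p\, M_p \\[4pt]
M_p &= \frac{K_{RI}^2}{K_{LS} K_{UV}}\;
       \frac{(\mathrm{LS})(\mathrm{UV})}{\varepsilon_p\, (\mathrm{RI})^2}
\end{align*}
```

Under these assumptions dn/dc cancels, which is why only ε_p and the three instrument constants are needed to obtain the polypeptide molecular weight.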
C. Materials

BSA monomer, ovalbumin (chicken), β-lactoglobulin (bovine milk), serum albumin (human), carbonic anhydrase (bovine), L-glutamic dehydrogenase (bovine liver), α-chymotrypsin (bovine), α-chymotrypsinogen A (bovine), immunoglobulin (bovine milk), pepsin, trypsin (bovine), and heparin were from Sigma. RNase and lysozyme (egg white) were from Calbiochem. The recombinant human basic fibroblast
Jie Wen et al
growth factor with cysteines 70 and 88 replaced with serine and recombinant human stem cell factor were expressed and purified from E. coli as previously described (4,5).
III. Results and Discussion

A. Tests on Commercial Proteins

Eight commercial proteins were injected separately onto a Superose 75 column (Pharmacia) to avoid possible overlapping peaks and improve the accuracy. The plot of (UV)/(RI) vs. extinction coefficient is shown in Fig. 1, and the equation ε = [(UV)/(RI)]/12.48 is obtained from a linear regression analysis forced through zero. From this equation, we can back-calculate the extinction coefficient of each protein. The results for these eight proteins are summarized in Table I. The average error is 3%, which is comparable with the results from other methods (6). One possible major source of error with this technique is the performance of the on-line UV detector. An on-line UV detector typically has a larger bandwidth and poorer wavelength accuracy than a spectrophotometer. This may be particularly important when the shape of the spectrum of a protein places the maximum extinction away from the measurement wavelength.

Figure 1. The plot of (UV)/(RI) vs. extinction coefficients of eight proteins. [Horizontal axis: extinction coefficient, ml/(mg·cm).]
Table I. Summary of eight commercial proteins' extinction coefficients from the (UV)/(RI) method

Protein                      ε from (UV)/(RI) method   ε from literature (ref. no.)   Error %
Bovine Albumin               0.669                     0.670 (9)                      -0.1
Ovalbumin (chicken)          0.720                     0.735 (9)                      -1.5
Ribonuclease                 0.660                     0.706 (9)                      -4.6
β-Lactoglobulin              0.963                     0.960 (10)                      0.3
Serum Albumin (human)        0.547                     0.531 (10)                      1.6
Lysozyme                     2.62                      2.59 (10)                       3.3
Carbonic Anhydrase           1.84                      1.90 (9)                       -6.1
L-Glutamic Dehydrogenase     0.991                     0.923 (6)                       6.8
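The calibration step described above, a linear regression forced through zero followed by back-calculation of ε, can be sketched in a few lines. This is a minimal illustration, not the authors' software: the function names are hypothetical, and the (UV)/(RI) readings are synthetic values generated from the published slope of 12.48, because the raw detector ratios are not given in the text.

```python
def fit_through_origin(eps_known, uv_ri_ratio):
    """Least-squares slope k for (UV)/(RI) = k * eps, with the intercept fixed at zero."""
    sxy = sum(e * r for e, r in zip(eps_known, uv_ri_ratio))
    sxx = sum(e * e for e in eps_known)
    return sxy / sxx


def extinction_from_ratio(uv_ri_ratio, k):
    """Back-calculate eps in ml/(mg cm) from a measured (UV)/(RI) ratio (Eq. [4])."""
    return uv_ri_ratio / k


# Literature eps values of the three routine standards (BSA, ovalbumin, RNase).
standards_eps = [0.670, 0.735, 0.706]
# Synthetic (UV)/(RI) ratios: assumed for illustration, NOT measured data.
standards_ratio = [12.48 * e for e in standards_eps]

k = fit_through_origin(standards_eps, standards_ratio)
eps_unknown = extinction_from_ratio(11.28, k)  # 11.28 is a hypothetical reading
```

Fixing the intercept at zero follows from Eq. [4]: a protein with no absorbance should give a (UV)/(RI) ratio of zero, so only the slope carries calibration information.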
In most of our SEC-LS/UV/RI studies, only three protein standards (BSA monomer, ovalbumin, and RNase) are used to calibrate the light-scattering instrument (2). Therefore, we can use these same three standards to obtain the (UV)/(RI) calibration constant of Eq. [4] without extra effort. To estimate the error of this approach, we derived a new calibration constant using only these three standards, and then calculated the ε of other proteins, which were compared with literature values
(Table II). The results show that this approach has an average error of 4%. These three standards were chosen not because they fall close to the calibration line in Fig. 1 but because we routinely use them for calibrating our light-scattering instrument.

Table II. Extinction coefficients obtained when only 3 proteins are used for calibration

Protein                          ε from 3-standard calibration   ε from literature (ref. no.)   Error %
β-Lactoglobulin                  0.992                           0.960 (10)                     3.3
Serum Albumin (human)            0.564                           0.531 (10)                     6.2
Lysozyme                         2.70                            2.59 (10)                      4.2
Carbonic Anhydrase               1.90                            1.90 (9)                       0
L-Glutamic Dehydrogenase         1.02                            0.923 (6)                      11
α-Chymotrypsin (bovine)          2.10                            2.00 (10)                      5.0
α-Chymotrypsinogen A (bovine)    2.17                            2.03 (6)                       6.9
Immunoglobulin (bovine milk)     1.48                            1.38 (10)                      7.2
Pepsin (dimer)                   1.56                            1.54 (10)                      1.3
Trypsin (bovine)                 1.66                            1.66 (6)                       0
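The error column of Table II is the relative deviation of each estimate from its literature value. A minimal sketch of that arithmetic, using the table's own numbers, is given below; the variable and function names are illustrative only.

```python
# (protein, eps from 3-standard calibration, eps from literature), taken from Table II
table_ii = [
    ("beta-Lactoglobulin", 0.992, 0.960),
    ("Serum Albumin (human)", 0.564, 0.531),
    ("Lysozyme", 2.70, 2.59),
    ("Carbonic Anhydrase", 1.90, 1.90),
    ("L-Glutamic Dehydrogenase", 1.02, 0.923),
    ("alpha-Chymotrypsin (bovine)", 2.10, 2.00),
    ("alpha-Chymotrypsinogen A (bovine)", 2.17, 2.03),
    ("Immunoglobulin (bovine milk)", 1.48, 1.38),
    ("Pepsin (dimer)", 1.56, 1.54),
    ("Trypsin (bovine)", 1.66, 1.66),
]


def percent_error(estimated, literature):
    """Signed relative deviation of the estimate from the literature value, in percent."""
    return 100.0 * (estimated - literature) / literature


errors = {name: percent_error(est, lit) for name, est, lit in table_ii}
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)
```

The mean absolute error computed this way falls between 4% and 5%, in line with the average error of about 4% quoted in the text.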
It should be mentioned that it is not absolutely necessary to use protein standards to calibrate the RI detector. However, the calibration of some RI detectors shifts with time. The protein standard calibration method can correct the RI intensity shift and may partly decrease the error from the large bandwidth of some on-line UV detectors.
B. The Interaction of bFGF with Heparin

Basic fibroblast growth factor (bFGF) is known as a potent mitogen and chemoattractant for endothelial cells, and it binds tightly to highly charged carbohydrates such as heparin or heparan sulfate. In our previous publication regarding the interaction of bFGF with high molecular weight heparin (HMWH) (4), the ε of bFGF was calculated from its amino acid composition (6). However, even if we did not know the amino acid composition of bFGF, we could still use the method described in this paper to obtain its ε and thereby determine the number of bFGF molecules in the complex. When studying protein interactions, as mentioned before, three protein standards are typically used to obtain the calibration constant for the light-scattering instrument, regardless of whether or not we need this information for determining ε. Assuming that neither the amino acid composition nor an experimental ε (such as data from dry weight) is available, we need to estimate the ε of bFGF from UV and RI data. First, we use those three protein standards to obtain a calibration curve similar to Fig. 1, and then obtain the ε of bFGF as 0.904 ml/(mg·cm) by Eq. [4]. From this extinction coefficient and Eq. [5], the polypeptide molecular weight of each complex can thus be calculated. The difference between the ε calculated from Gill's amino acid composition method [0.910 ml/(mg·cm)] and the ε determined by this UV and RI
technique is 1%. For comparison, the ε of bFGF determined by dry weight is 0.938 ml/(mg·cm) (Dr. Yashiko Nozaki, Duke University; personal communication). Therefore, when using Eq. [5] to calculate the polypeptide molecular weight of the bFGF and HMWH complex, the same 1% difference is expected for the molecular weights, because all other parameters, (LS), (UV), and (RI), are the same. More details and conclusions regarding the binding stoichiometry of bFGF and HMWH can be found in reference 4.

C. The Interaction of E. coli SCF with sKit
Stem cell factor (SCF) is a dimeric protein that stimulates hematopoietic progenitor cells in bone marrow. The interaction of SCF expressed in E. coli with its receptor, soluble Kit (sKit), was studied (5). The extinction coefficient of the SCF was obtained as 0.534 ml/(mg·cm) by using Gill's amino acid composition method and 0.585 ml/(mg·cm) by using the method described in this paper [note: 0.62 ml/(mg·cm) was reported in reference 11 by the amino acid analysis method]. In this complicated case, a self-consistent three-detector method described in reference 2 was used to determine the stoichiometry of the complex. The results of using these two extinction coefficients are summarized in Tables IIIA and IIIB. Both results indicate that the stem cell factor dimerizes its receptor, sKit.

Table IIIA. Determination of stoichiometry of sKit/SCF complex with an extinction coefficient [0.585 ml/(mg·cm)] calculated from the method described in this paper

Protein or mixture (assumed      ε, ml/(mg·cm)   Experimental MW   Theoretical MW         Correct assumption?
stoichiometry of sKit/SCF)
sKit                             1.19            55600             55815
E. coli SCF                      0.585           38700             37313 (as a dimer)
1 sKit : 1 SCF dimer             0.948           159000            93128                  No
2 sKit : 1 SCF dimer             1.04            146000            148943                 Yes
2 sKit : 2 SCF dimer             0.948           159000            186256                 No
Table IIIB. Determination of stoichiometry of sKit/SCF complex with an extinction coefficient [0.534 ml/(mg·cm)] calculated from the SCF amino acid composition

Protein or mixture (assumed      ε, ml/(mg·cm)   Experimental MW   Theoretical MW         Correct assumption?
stoichiometry of sKit/SCF)
sKit                             1.19            55600             55815
E. coli SCF                      0.534           38700             37313 (as a dimer)
1 sKit : 1 SCF dimer             0.927           163000            93128                  No
2 sKit : 1 SCF dimer             1.03            147000            148943                 Yes
2 sKit : 2 SCF dimer             0.927           163000            186256                 No
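The ε column for the assumed complexes in Tables IIIA and IIIB follows from Eq. [6], a mass-weighted average of the component extinction coefficients. The sketch below reproduces those entries from the tabulated component values; the function name is hypothetical, and treating the SCF dimer (37313) as component B is an assumption read off the tables.

```python
def complex_extinction(m, eps_a, mw_a, n, eps_b, mw_b):
    """Eq. [6]: polypeptide extinction coefficient of a complex A_m B_n, in ml/(mg cm)."""
    return (m * eps_a * mw_a + n * eps_b * mw_b) / (m * mw_a + n * mw_b)


SKIT_EPS, SKIT_MW = 1.19, 55815        # sKit, theoretical polypeptide MW
SCF_DIMER_MW = 37313                   # E. coli SCF, counted as a dimer

# Table IIIA uses eps = 0.585 for SCF (UV/RI method); Table IIIB uses 0.534 (composition).
eps_1to1_a = complex_extinction(1, SKIT_EPS, SKIT_MW, 1, 0.585, SCF_DIMER_MW)
eps_2to1_a = complex_extinction(2, SKIT_EPS, SKIT_MW, 1, 0.585, SCF_DIMER_MW)
eps_1to1_b = complex_extinction(1, SKIT_EPS, SKIT_MW, 1, 0.534, SCF_DIMER_MW)
eps_2to1_b = complex_extinction(2, SKIT_EPS, SKIT_MW, 1, 0.534, SCF_DIMER_MW)

# Theoretical MW of each assumed complex is the sum of its components.
mw_2to1 = 2 * SKIT_MW + 1 * SCF_DIMER_MW
```

Because Eq. [6] depends only on the ratio m:n, the 1:1 and 2:2 assumptions give the same ε_p and therefore the same experimental molecular weight, which is why those two rows differ only in their theoretical values.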
IV. Conclusions

The extinction coefficient of a non-glycosylated protein can be determined by UV and RI detectors. Eight commercial proteins were tested using this method and showed an average error of 3-4%. This method may be especially useful when applying SEC-LS/UV/RI to study the interaction of a non-glycosylated protein with carbohydrates or with a glycosylated protein. In two examples of such studies, the results show that the molecular weights calculated by using the extinction coefficient from the UV and RI method agree well with the molecular weights obtained by using other methods, suggesting that the UV and RI method is feasible for such studies.
References

1. Takagi, T. (1990) J. Chromatogr. 506, 409-416.
2. Wen, J., Arakawa, T., and Philo, J.S. (1996) Anal. Biochem. 240, 155-166.
3. Wen, J., Arakawa, T., Talvenheimo, J., Welcher, A., Horan, T., Kita, T., Tseng, J., Nicolson, M., and Philo, J.S. (1996) in Techniques in Protein Chemistry VII (Marshak, D.R., Ed.), pp. 23-31, Academic Press, San Diego.
4. Arakawa, T., Wen, J., and Philo, J.S. (1994) Arch. Biochem. Biophys. 308, 267-273.
5. Philo, J.S., Wen, J., Wypych, J., Schwartz, M.G., Mendiaz, E.A., and Langley, K.E. (1996) J. Biol. Chem. 271, 6895-6902.
6. Gill, S.C. and von Hippel, P.H. (1989) Anal. Biochem. 182, 319-326.
7. Perlmann, G.E., and Longsworth, L.G. (1948) J. Am. Chem. Soc. 70, 2719-2724.
8. Fasman, G.D. (1976) in CRC Handbook of Biochemistry and Molecular Biology, 3rd edition, Vol. II, pp. 372-382, CRC Press, Inc., Boca Raton.
9. Takagi, T. (1985) in Progress in HPLC (Parvez, H., Kato, Y., and Parvez, S., Eds.), VNU Science Press, Utrecht, 1, 27-41.
10. Fasman, G.D. (1976) in CRC Handbook of Biochemistry and Molecular Biology, 3rd edition, Vol. II, p. 383, CRC Press, Inc., Boca Raton.
11. Arakawa, T., Yphantis, D.A., Lary, J.W., Narhi, L.O., Lu, H.S., Prestrelski, S.J., Clogston, C.L., Zsebo, K.M., Mendiaz, E.A., Wypych, J., and Langley, K.E. (1991) J. Biol. Chem. 266, 18942-18948.
SINGLE ALKALINE PHOSPHATASE MOLECULE ASSAY BY CAPILLARY ELECTROPHORESIS LASER-INDUCED FLUORESCENCE DETECTION

Douglas B. Craig, Edgar A. Arriaga, Jerome C.Y. Wong, Hui Lu and Norman J. Dovichi
Department of Chemistry, University of Alberta, Edmonton, Alberta T6G 2G2, Canada

ABSTRACT

Single molecules of alkaline phosphatase were assayed using capillary electrophoresis laser-induced fluorescence detection. Multiple incubations of individual molecules were performed. Varying the temperature in the multiple incubation assay allowed the activation energy of catalysis to be determined at the single molecule level. Molecules are heterogeneous with respect to both activity and activation energy of catalysis. Partial thermal denaturation of alkaline phosphatase results from the total denaturation of a fraction of the molecules, with surviving molecules unaffected, rather than from a partial decrease in the activity of all the molecules.

I. INTRODUCTION

Chemical reactions are usually studied on a large ensemble of molecules. The development of very sensitive techniques has begun to allow the study of individual molecules, which avoids the masking of molecular properties by ensemble averaging. Highly fluorescent proteins, multiply labeled polymers and small molecules have been detected at the single molecule level by laser-induced fluorescence in thin films, in neat flowing liquid streams, in levitated droplets and after separation by capillary electrophoresis (1-5). Other characteristics, such as spectra, spring constants and excited state lifetimes, have been measured on single molecules (6-8). Individual myosin molecules have been detected through their binding of fluorescently labeled ATP (9). Individual molecules have also been detected at a microelectrode by electrogenerated chemiluminescence and through redox chemistry (10,11). Enzyme catalyzed reactions have also been studied at the single molecule level. 
Earlier work involved measurement of beta-galactosidase activity in droplets after a 10-15 h incubation (12). In a recent study, fluorescent product generated by individual molecules of lactate dehydrogenase after a 1 h incubation was detected using capillary electrophoresis (13). The activities of individual molecules were reproducible, but the activities of different molecules showed a 5-fold range. The differences in activity were suggested to reflect differences in conformation. In this paper we report the assay of individual molecules of the enzyme alkaline phosphatase (EC 3.1.3.1) by capillary electrophoresis (CE) utilizing laser-induced fluorescence detection. An expanded version of this paper has been published elsewhere (14). First we measure the activities of individual molecules. By using brief periods of CE between incubations to separate newly formed product from the enzyme, followed by subsequent incubations, we achieve multiple incubations of individual molecules, generating a kinetic plot. By varying the reaction temperature in each incubation, we calculate the activation energy of catalysis for individual alkaline phosphatase molecules. Finally, as an application of single molecule detection, we study the effect of thermal denaturation at the single molecule level.

II. EXPERIMENTAL

A. Reagents

AttoPhos, 2'-(2-benzothiazolyl)-6'-hydroxybenzothiazole phosphate, is a weakly fluorescent alkaline phosphatase substrate marketed by JBL Scientific (San Luis Obispo, CA) which is converted into the highly fluorescent product AttoFluor, 2'-(2-benzothiazolyl)-6'-hydroxybenzothiazole. Calf intestinal alkaline phosphatase was obtained from Life Technologies (Gaithersburg, MD). Boric acid and p-nitrophenyl phosphate were purchased from Sigma (St. Louis, MO) and chloroform, MgCl2 and diethanolamine from Fisher Scientific (Ottawa, ON).

B. Instrumentation

Details of the instrument have been previously published (15). The injection end of a fused silica capillary (10 µm i.d., 145 µm o.d., 72.5 cm long, Polymicro Technologies, Phoenix, AZ) was immersed into the sample or running buffer along with a platinum wire connected to a high voltage power supply (CZE1000R, Spellman, Plainview, NY). The detection end of the capillary, from which the polyimide had been removed by a gentle flame, was placed inside a 250 x 250 µm inner bore sheath flow cuvette. Molecules exiting the capillary are hydrodynamically focused post-capillary into a cone by buffer flowing within the cuvette. 
Eluting species are excited by the 457.9 nm line of a multiwavelength Ar ion laser (Innova 90-4, Coherent, Palo Alto, CA) focused 20 µm below the capillary end by a 6.3x, 0.20 NA microscope objective (Melles Griot, Nepean, ON). Fluorescence was collected at a right angle to the direction of excitation by a 60x, 0.70 NA microscope objective (Model 60x-LWD, Universe Kogaku, Japan) and selectively passed through a slit and a 580DF40 bandpass filter (Omega Optical, Brattleboro, VT) to an R1477 photomultiplier tube (PMT) (Hamamatsu, Middlesex, NJ). The analog PMT signal was collected at 10 Hz and digitized by a Macintosh IIsi via an NB-MIO-16XH-18 I/O board (National Instruments, Austin, TX). The same board controls the CE power supply. Optimum laser power for the detection of AttoFluor is 5 mW at 457.9 nm.
C. Assay protocols

1. Standardization assay

The amount of commercial alkaline phosphatase was estimated by monitoring the change in absorbance at 405 nm of approximately 0.01 units of enzyme in 1.5 ml of 1 M diethanolamine (pH 9.8) containing 0.5 mM MgCl2 and 10 mM p-nitrophenylphosphate. The absorbance coefficient of p-nitrophenol at this wavelength is 18450 L mol⁻¹ cm⁻¹. A molecular weight of 140 kDa and a specific activity of 2000 u/mg under these conditions (Life Technologies, data sheet) were used to calculate enzyme concentration.

2. Single molecule assay

Alkaline phosphatase was diluted to 1.9 × 10⁻¹⁵ and 9.5 × 10⁻¹⁶ M in 100 mM borate (pH 9.5) containing 1 mM AttoPhos. AttoPhos contains some AttoFluor as an impurity. The impurity concentration is reduced by double extraction of 10 mM AttoPhos in 100 mM borate (pH 9.5) with an equal volume of CHCl3. To avoid contamination with active exogenous enzyme, all buffers, vessels and pipet tips were autoclaved prior to use and dilutions were prepared in a clean air hood. The diluted enzyme was electrokinetically injected into the capillary for 3 min at an electric field of 400 V cm⁻¹ (injection end positive). After an incubation period of 14-30 min, product was driven past the detector at an electric field of 400 V cm⁻¹ (injection end positive). Band broadening was found to result in the production of an approximately 5 min plateau from a 3 min injection. Running and sheath buffer was 100 mM borate (pH 9.5). Blanks were identical but with the omission of alkaline phosphatase. Enzyme catalysis rates were calculated by comparison of peak areas to those of AttoFluor standards.

3. Multiple Incubation Assay

Alkaline phosphatase and AttoFluor have different electrophoretic mobilities. After an 8 min incubation period, a 400 V cm⁻¹ electric field is applied for 15 sec. This field moves enzyme molecules away from the product formed and into a fresh region of substrate. 
Three more incubations of 4, 2 and 1 min were performed, with intervening periods of CE.

4. Activation Energy Measurement

Three quarters of the capillary, starting from the injection end, was placed within a Plexiglas box. The temperature of the interior of the box was maintained by a thermostatically controlled heater and a circulating cooling bath. The temperature of the air flowing over the capillary was monitored with a thermometer. Air temperature equilibrates within 1 min following a 10-degree increase in temperature. Alkaline phosphatase (4.6 × 10⁻¹⁶ M) in 100 mM borate (pH 9.5) containing 1 mM AttoPhos was injected into the capillary for 240 sec at 400 V cm⁻¹. The enzyme was incubated for three 15 min periods at different temperatures
ranging from 13 to 38°C, with intervening 15 sec periods of 400 V cm⁻¹ electric field.

5. Thermal Denaturation Study

A control solution of 8 × 10⁻¹⁶ M alkaline phosphatase was prepared. A second solution of 8 × 10⁻¹⁰ M enzyme was heated at 64°C for 5 min and immediately diluted by 6 orders of magnitude. Both samples were assayed by the single molecule assay method.
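The arithmetic behind the standardization assay of section II.C.1 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: a 1 cm cuvette path length and the usual definition of one unit as 1 µmol of product per minute are assumed, the function names are hypothetical, and the absorbance slope of 0.123 per minute is a made-up reading chosen to correspond to about 0.01 units.

```python
EPS_PNP = 18450.0            # L mol^-1 cm^-1, p-nitrophenol at 405 nm (from the text)
SPECIFIC_ACTIVITY = 2000.0   # units per mg (from the text)
MOLECULAR_WEIGHT = 140e3     # g per mol (from the text)


def units_from_absorbance(dA_per_min, assay_volume_l, path_cm=1.0):
    """Enzyme units in the cuvette, assuming 1 unit = 1 micromole of product per minute."""
    molar_rate = dA_per_min / (EPS_PNP * path_cm)    # mol L^-1 min^-1 (Beer's law)
    return molar_rate * assay_volume_l * 1e6          # micromol min^-1


def enzyme_moles(units):
    """Moles of enzyme corresponding to a number of units."""
    mass_g = units / SPECIFIC_ACTIVITY * 1e-3          # mg converted to g
    return mass_g / MOLECULAR_WEIGHT


units = units_from_absorbance(0.123, 1.5e-3)   # hypothetical slope, 1.5 ml assay volume
moles = enzyme_moles(units)
```

Dividing the resulting moles by the volume of stock solution added to the cuvette gives the stock molarity, from which the serial dilutions to the single molecule range are planned.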
III. RESULTS AND DISCUSSION

A. Single molecule assay

Alkaline phosphatase catalyzes the conversion of the substrate, AttoPhos, into the product, AttoFluor. When excited at 457.9 nm, AttoPhos is weakly fluorescent and AttoFluor highly fluorescent. Injection of dilute concentrations of alkaline phosphatase results in the random distribution of enzyme molecules within the capillary. The number of molecules injected will be governed by Poisson statistics. Upon incubation, enzyme molecules will convert the substrate molecules in their immediate vicinity into product. This will result in a sphere of product surrounding the enzyme molecule that will produce a peak in the electropherogram. If few enough enzyme molecules are injected that there is sufficient average distance between individuals along the axis of the capillary, the electropherogram will show a series of peaks above a background signal, each of which represents the activity of a single molecule of alkaline phosphatase.

Filling the entire capillary with substrate will generate a background signal from the fluorescence of the AttoPhos, which is weakly fluorescent but at 1 mM produces a significant signal, and from any AttoFluor present as an impurity. Chloroform extraction of the substrate removes enough of the AttoFluor impurity that its signal is less than 10% of that of the substrate. In order to provide a low background, we fill only about 15% of the capillary with sample. CE separates the plateau formed by the product impurity from that formed by the substrate. Single enzyme molecule product peaks are observed sitting atop the plateau formed by this impurity alone, thus providing a lower and therefore less noisy background. Product will form in the capillary where both enzyme and substrate are present together. 
Since the substrate has a lower mobility than both product and enzyme, the plug of substrate injected will be shorter than that of the enzyme and product impurity. Thus single molecule product peaks will appear only on a portion of the plateau produced by the impurity. Figure 1 shows the incubation of 1.9 × 10⁻¹⁵ M and 9.5 × 10⁻¹⁶ M alkaline phosphatase. In the blank there is a plateau which is due to the presence of the
Figure 1. Single molecule assay. [Electropherograms; horizontal axis: migration time (min).] (A) Blank generated by a 3 min injection at 400 V cm⁻¹ of 1 mM AttoPhos in 100 mM borate (pH 9.5). Following an incubation of 19.25 min, the sample was swept past the detector at 400 V cm⁻¹. (B) 1.9 × 10⁻¹⁵ M alkaline phosphatase mixed with 1 mM AttoPhos in 100 mM borate (pH 9.5). Sample was treated as in (A) but with a 28.5 min incubation. (C) 9.5 × 10⁻¹⁶ M alkaline phosphatase mixed with 1 mM AttoPhos in 100 mM borate (pH 9.5). Sample was treated as in (A) but with an 18 min incubation.
AttoFluor impurity. The addition of alkaline phosphatase causes the production of peaks above this background. Based on the injection volume and the nominal enzyme concentration of 1.9 × 10⁻¹⁵ M, we expect on average approximately 11 molecules of enzyme to be injected. We observe 11.8±3.5 peaks per run (n=4). Peak areas were found by nonlinear regression, fitting one or two Gaussian peaks. Area is converted to reaction rate by comparison to the peak area of standard injections of AttoFluor, taking into account the incubation time. The mean reaction rate for 1.9 × 10⁻¹⁵ M is 108±70 s⁻¹ (n=47 molecules). Assay of 9.5 × 10⁻¹⁶ M alkaline phosphatase produces half the number of peaks, 5.3±4.4 per run (n=4), with similar activities, 124±97 s⁻¹. It is noteworthy that the later eluting peaks are wider than the earlier ones. Diffusion during the mobilization process leads to enhanced band broadening.

Figure 2 shows multiple incubations of a single molecule. A single molecule is captured within the capillary and is incubated for 8 min. Following this incubation it is moved by CE into fresh substrate and incubated for 4 min. This is followed by 2 and 1 min incubations. The solid line is the data and the dashed line the least-squares fit of 4 Gaussian peaks to the data. Since the enzyme moves faster than the product, the later produced peaks elute first. The earlier produced peaks are wider due to an increased amount of time for diffusion. Eight molecules were studied. Peak area increased linearly with incubation time, with an intercept of zero and an average linear correlation coefficient of 0.996. The mean activity was 190±78 s⁻¹. The distribution may be shifted towards more active molecules because less active individuals may not have generated detectable peaks during the short incubation period.

There are several lines of evidence that indicate that peaks are due to the activity of individual alkaline phosphatase molecules. 
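Two of the checks described above lend themselves to a short sketch: the scatter expected from Poisson counting, and the four-point kinetic plot of peak area against incubation time. The code is illustrative only; the peak areas are hypothetical numbers in arbitrary units, since the individual peak areas are not tabulated in the text.

```python
import math


def poisson_sd(mean_count):
    """Standard deviation of a Poisson-distributed count with the given mean."""
    return math.sqrt(mean_count)


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, with correlation coefficient r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# About 11 molecules are expected per injection; Poisson scatter is then about 3.3,
# comparable to the observed 11.8 +/- 3.5 peaks per run.
expected_sd = poisson_sd(11)

incubation_min = [1.0, 2.0, 4.0, 8.0]
peak_area = [2.1, 3.9, 8.2, 15.9]   # hypothetical areas, arbitrary units
slope, intercept, r = linear_fit(incubation_min, peak_area)
```

The slope of such a plot, once converted with an AttoFluor standard, is the turnover rate of that one molecule, and an intercept near zero indicates that no product was carried over between incubations.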
The average number of peaks observed is consistent with that expected. Decreasing the enzyme concentration by half decreases the number of peaks observed by 50% but does not affect average peak area within experimental error. The number of peaks observed follows Poisson statistics. From the multiple incubation experiment, peak area is proportional to incubation time and peak spacing is consistent.

The activity of individual molecules is different. This distribution of activity can have several causes. The broad distribution could simply reflect poor experimental precision. However, from the multiple incubation assay, peak area correlates strongly with incubation period. The relative precision in the reaction rate, estimated from the linear least-squares fit to the four-point kinetic plot, ranged from 2 to 10%. Enzymes could stick to the capillary wall, partially hiding the active site in some individuals. The multiple incubation assay shows that enzyme moves faster than the product, so the capacity factor for adsorption of the enzyme to the capillary wall must be small. Enzyme molecules could denature during the course of the assay. The multiple incubation assays show no evidence for a decrease in activity during subsequent incubations. Enzyme aggregates would cause heterogeneous activity. Zone electrophoresis provides no evidence for aggregation (16). Heterogeneity could arise from differing degrees of glycosylation or other post-translational modifications. Mammalian alkaline phosphatases are anchored to the exterior of the cytoplasmic membrane by a phosphatidylinositol glycan moiety (17). Calf and rat intestinal alkaline phosphatase both generate at least three closely migrating electrophoretic bands, each of which is composed of
Figure 2. Multiple incubation of a single molecule. (A) [Electropherogram; horizontal axis: migration time (min).] A single molecule of alkaline phosphatase was captured within the capillary. After an 8 min incubation the sample was subjected to a 15 sec pulse of 400 V cm⁻¹, moving the enzyme molecule away from the product and into fresh substrate. The process was repeated for a 4, 2 and 1 min incubation. Following the last incubation, the contents of the capillary were swept past the detector at 400 V cm⁻¹. The solid line represents the data and the dashed line the least-squares fit of 4 Gaussian peaks. (B) [Kinetic plot; horizontal axis: incubation time (min).] Peak area is shown by a cross and the straight line is the least-squares fit to the data.
individuals with differing levels of glycosylation (16,18). Variation in glycosylation causes differences in both K_m and V (19). It has also been suggested that differences in activity may result from differences in conformations of individual molecules (13).

B. Activation Energy Measurement

In these experiments molecules were captured within the capillary and incubated 3 times at varying temperatures, from 13 to 38°C. Eight molecules were studied. Figure 3 shows two such molecules. Peak area increases with temperature. The two molecules do not have identical activities. The dephosphorylation of substrates by alkaline phosphatase has been proposed to occur by the following mechanism (20):

$\mathrm{EH + ROP} \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} \mathrm{EH{\cdot}ROP} \underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} \mathrm{EH^{*}{\cdot}ROP} \xrightarrow{k_3} \mathrm{EP + ROH}, \qquad \mathrm{EP} \xrightarrow{k_4\ (\mathrm{slow})} \mathrm{EH + P}$

where EH is the free enzyme, ROP the substrate, EH·ROP the enzyme-substrate complex, EH*·ROP the activated intermediate, EP the phosphorylated enzyme and P inorganic phosphate. At the enzyme concentration used, substrate concentration does not decrease significantly over the course of the assay. The reaction can be shown to be zeroth order in substrate concentration. A plot of ln(peak area) vs 1/T yields a slope of -Ea/R, where Ea is the activation energy of catalysis. A linear least-squares fit to the data is used to estimate the activation energy. The data are linear with an average correlation coefficient of 0.994. The relative precision of the slope ranges from 3 to 21%, with an average relative precision of 9%. Activation energies ranged from 39 to 91 kJ mol⁻¹ with a mean of 53±16 kJ mol⁻¹ (N=8). Bulk assay of 3x10^ molecules gave an average activation energy of 50 kJ mol⁻¹. There was no correlation between activity and activation energy, r=-0.26, indicating that other and more dominant sources of heterogeneity exist.

C. Thermal Denaturation Study

Standard alkaline phosphatase assay has shown that heating of the enzyme for 5 min at 64°C results in a 50% loss of activity. This loss in activity could arise in two ways: total denaturation of half the molecules or partial denaturation of all the molecules. Partially denatured alkaline phosphatase was assayed and compared to control enzyme. With a 15 min incubation of 8 × 10⁻¹⁶ M control enzyme, 9.8±3.3 peaks were observed per assay (n=4) with activities of 138±107 s⁻¹. This is about twice the number observed in the single molecule assays; different enzyme lots were used for the two studies. Denatured enzyme produced 4.4±1.5 peaks/assay (n=4) with an average activity of 118±97 s⁻¹. At the 85% confidence limit, the means of the activities are identical and at the 75% confidence limit the standard deviations are identical. A 50% loss of the activity in bulk assay results in 45±9%
Figure 3. Multiple incubations at varying temperatures. (A) [Electropherogram; horizontal axis: migration time (min).] Two alkaline phosphatase molecules were captured within the capillary. Sample was incubated for three 15 min periods at 16, 24 and 30°C, with intervening 15 sec periods of 400 V cm⁻¹ high voltage. Following the last incubation, the contents of the capillary were swept past the detector at 400 V cm⁻¹. A set of 3 peaks is observed for each molecule. Peaks a and d correspond to incubations at 16°C, b and e to 24°C and c and f to 30°C. The solid line represents the data and the dashed line the least-squares fit. (B) [Arrhenius plot; horizontal axis: 1/T (K⁻¹).] The marker is the peak area determined at each temperature and the straight line is the least-squares fit to the data for one molecule.
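The Arrhenius analysis of Figure 3B can be sketched as a straight-line fit of ln(peak area) against 1/T, whose slope is -Ea/R. The code below is an illustration, not the authors' analysis: the peak areas are synthetic, generated from an assumed activation energy of 53 kJ mol⁻¹ at the three temperatures quoted in the caption, and the function name is hypothetical.

```python
import math

R_GAS = 8.314  # J mol^-1 K^-1


def activation_energy(temps_celsius, peak_areas):
    """Estimate Ea (J/mol) from the least-squares slope of ln(area) vs 1/T; slope = -Ea/R."""
    x = [1.0 / (t + 273.15) for t in temps_celsius]
    y = [math.log(a) for a in peak_areas]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return -(sxy / sxx) * R_GAS


# Synthetic peak areas for one molecule, generated from an assumed Ea of 53 kJ/mol.
temps = [16.0, 24.0, 30.0]
assumed_ea = 53e3
areas = [1e10 * math.exp(-assumed_ea / (R_GAS * (t + 273.15))) for t in temps]

ea_fit = activation_energy(temps, areas)
```

Because the incubation time is the same at every temperature, peak area is proportional to the rate constant, so the unknown proportionality factor shifts only the intercept and leaves the fitted Ea unchanged.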
of the number of peaks but does not affect the average activity of the surviving molecules.

IV. CONCLUSIONS

Measurement of enzyme activity at the single molecule level has several applications. As demonstrated in this paper, single molecule assays can be used to study microheterogeneity. We find that alkaline phosphatase is heterogeneous with respect to both activity and activation energy of catalysis. It is most likely that heterogeneity results from differences in glycosylation or other post-translational modifications. These results raise the question of whether microheterogeneity is fortuitous or has some biological role, and if so, what that role might be. A likely possibility is regulation of alkaline phosphatase, where differences in glycosylation result in different degrees of susceptibility to protease digestion. It will be interesting to explore the effect of protease denaturation at the single molecule level.

Activity does not correlate with activation energy. This indicates that microheterogeneity cannot be explained in terms of differences in the ability of individual molecules to reduce the activation energy. Thus other sources of heterogeneity must exist.

For the first time we study the effects of thermal denaturation at the single molecule level. Partial denaturation of a population of molecules results in the total loss of activity of a portion of the molecules, with the surviving molecules unaffected. There is no evidence for the conversion of active molecules to a conformation of lower activity. Thermal denaturation is a catastrophic phenomenon. Our data provide no evidence for the universality of catastrophic denaturation; it would be interesting to replicate these experiments with other enzymes.

ACKNOWLEDGEMENTS

We thank M. Palcic, O. Hindsgaul and B. Dunford of this department for useful discussions. The work was supported by an operating grant from the Natural Sciences and Engineering Research Council. D.B.C. 
acknowledges a postdoctoral fellowship from the Alberta Heritage Foundation for Medical Research. J.C.Y.W. acknowledges a predoctoral summer fellowship from the Alberta Heritage Foundation for Medical Research. N.J.D. acknowledges a McCalla Professorship from the University of Alberta. We also thank E. Fudd and Y. Sam for their contributions. REFERENCES 1. 2.
Hirschfeld, T. (1976). Appl. Op. 75, 2965. Nguyen, D.C., Keller, R.A., Jett, J.H. & Martin, J.C. (1987). Anal.
Alkaline Phosphatase Molecule Assay by Fluorescence Detection
3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20.
131
Chem. 59, 2158. Wilkerson, C.W., Gooddwin, P.M., Ambrose, W.P., Martin, J.C. & Keller, R.A. (1993). Appl. Phys. Lett. 62, 2030. Whitten, W.B., Ramsey, J.M., Arnold, S. & Bronk, B.V. (1991). Anal. Chem. 63, 1027. Chen, D.Y. & Dovichi, NJ. (1996). Anal. Chem. 68, 690. Ambrose, P.W., Goodwin, P.M., Martin, J.C. & Keller, R.A. (1994). Science 265, 361. Moemer, W.E. (1994). Science 267, 871. Perkins, T.T., Smith, D.E., Larson, R.G. & Chu, S. (1995). Science 268, 83. Funatsu, T., Harada, Y., Tokunaga, M., Saito, K. & Yanagida, T. (1995). Nature 374, 555. Collingson, M.M.& Wightman, R.M. (1995). Science 268, 1883. Fan, F.R. & Bard, A.J. (1995). Science 267, 871. Rotman, B. (1961). Proc. Natl. Acad. Sci. U.S.A. 47, 1. Xue, Q & Yeung, E. (1995). Nature 373, 681. Craig, D.B., Arriaga, E.A., Wong, J.C.Y., Lu, H. & Dovichi, N.J. (1996). J. Amer. Chem. Soc. 118, 5245. Craig, D.B., Wong, J.C.Y. & Dovichi, N.J. (1996). Anal. Chem. 68, 697. Engstrom, L. (1961). Biochim. Biophys. Acta 52, 36. Low, M.G.& Saltiel, A.R. (1988). Science 239, 268. Saini, P.K. & Done, J. (1972). Biochim. Biophys. Acta 258, 147. Varki, A. (1993). Glycobiology 2, 97. Price, N.C. & Stevens, L. (1984) In "Fundamentals of Enzymology", p. 147. Oxford University Press, NY.
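The Arrhenius analysis described for Figure 3B (peak area at 16, 24 and 30°C, fitted with a straight line) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes that the peak area from a fixed incubation time is proportional to the reaction rate, and the peak areas in the usage lines are hypothetical values, not data from the chapter.

```python
import math

GAS_CONSTANT = 8.314  # J mol^-1 K^-1


def activation_energy(temps_celsius, peak_areas):
    """Least-squares fit of ln(area) against 1/T; returns Ea in J/mol.

    Assumes peak area is proportional to the rate constant, so that
    ln(area) = const - Ea / (R * T).
    """
    xs = [1.0 / (t + 273.15) for t in temps_celsius]   # 1/T in K^-1
    ys = [math.log(a) for a in peak_areas]             # ln(area)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope * GAS_CONSTANT                       # slope = -Ea / R


# Hypothetical peak areas for one molecule at the three incubation temperatures.
example_temps = [16.0, 24.0, 30.0]
example_areas = [1.0, 1.8, 2.7]
example_ea_kj = activation_energy(example_temps, example_areas) / 1000.0
```

Running the same fit on each captured molecule separately gives one activation energy per molecule, which is the quantity compared with activity in the conclusions.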
A NEW CENTRIFUGAL DEVICE USED IN SAMPLE CLEAN-UP AND CONCENTRATION OF PEPTIDES

Donald G. Sheer, Elizabeth Kellard, and William Kopaciewicz, Amicon, Inc., Beverly, MA 01915
Patrick Gearing, Jeff Wong and Michael Klein, Protein Design Labs Inc., Mountain View, CA 94043

I. Introduction

The necessity to obtain samples free of contaminants such as buffers, salts and detergents has become the rate-limiting step in protein structural characterization (1,2). The isolation of low-abundance proteins or peptides in combination with high sensitivity analyses results in high background noise that compromises data quality as well as interpretation. For example, tris and glycine interfere with amino acid sequencing, Na⁺ and detergents complicate mass spectral analysis (3,4,5) and hydrolyzed protein matrix inhibits detector response in monosaccharide analysis (6,7).

Adsorptive membranes offer an attractive means for solving sample preparation problems. The ideal membrane would exhibit high binding efficiency toward analytes, high capacity, little or no affinity for contaminants and quantitative elution. In order to address these requirements, a strong cation exchange membrane with hydrophobic character has been incorporated into an Amicon Microcon™ centrifugal device. The resulting unit has been shown to have a broad range of selectivity in that it recognizes both free amines and hydrophobic residues.

This report presents the analytical sample preparation unit Microcon™-SCX and demonstrates its high binding efficiency for a variety of samples. It is ideal for concentrating peptides or oligonucleotides and for removing low molecular weight contaminants from a sample prior to analysis, e.g., mass spectrometry, sequencing, amino acid analysis, HPLC and carbohydrate analysis. The incorporated strong cation exchange membrane adsorbs virtually all amino acids, peptides or oligonucleotides through both ionic and hydrophobic mechanisms. Due to the flow characteristics of the membrane and the kinetics of analyte binding, efficient desorption is achieved with short spin times. A fixed angle rotor accommodates samples from 2 - 250 µg, while horizontal rotors accommodate as much as 400 µg.

In this paper, a variety of examples are used to demonstrate broad selectivity and high sample recovery as determined by HPLC analysis. Routine sample preparation applications with HPLC, peptide sequencing, amino acid analysis, mass spectrometry and carbohydrate analysis are also presented.
II. Material and Methods

A. Operation of Microcon-SCX

The membrane is wetted by adding 100-200 µl of methanol to the device and emptying it; the step is then repeated with DIW. Optimal binding occurs when samples contain < 0.1 M salt and are applied in volumes < 500 µl at a pH below the pI or pKa of the analyte. For unknown samples, the pH is routinely lowered to 3 with 5-10 µl of glacial acetic acid per 500 µl of sample, and the sample is loaded into Microcon-SCX with proper orientation in the rotor as shown in figure 1. A 30 sec centrifugation at 1,200 x g achieves complete binding, followed by an optional wash step using 500 µl of 10 mM HCl, 10% MeOH in DIW.
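The loading conditions above lend themselves to a simple pre-flight check. The sketch below is only an illustration of those stated limits; the `Sample` class, the function name and the messages are hypothetical, and treating 500 µl as still acceptable is an assumption based on the "per 500 µl of sample" wording.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_SALT_M = 0.1          # text: samples should contain < 0.1 M salt
MAX_VOLUME_UL = 500.0     # text: volumes up to about 500 µl (assumed inclusive)
UNKNOWN_SAMPLE_PH = 3.0   # text: unknown samples are routinely lowered to pH 3


@dataclass
class Sample:
    salt_molar: float
    volume_ul: float
    ph: float
    analyte_pi: Optional[float] = None  # None when the pI/pKa is unknown


def loading_problems(sample: Sample) -> List[str]:
    """Return a list of reasons the sample may bind poorly (empty if none)."""
    problems = []
    if sample.salt_molar >= MAX_SALT_M:
        problems.append("salt should be below 0.1 M")
    if sample.volume_ul > MAX_VOLUME_UL:
        problems.append("volume should not exceed 500 µl")
    if sample.analyte_pi is None:
        # Unknown analyte: fall back to the routine pH 3 adjustment.
        if sample.ph > UNKNOWN_SAMPLE_PH:
            problems.append("lower pH to 3 with glacial acetic acid")
    elif sample.ph >= sample.analyte_pi:
        problems.append("pH should be below the pI or pKa of the analyte")
    return problems
```

For example, `loading_problems(Sample(0.05, 250.0, 3.0))` returns an empty list, while a 600 µl sample in 0.2 M salt at pH 7 with a pI of 5 is flagged on all three counts.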
Figure 1. Microcon-SCX Orientation for Sample Binding and Recovery. In fixed angle type rotors, optimal recovery of sample in 50-100 µl requires a consistent orientation in the rotor during adsorption and desorption.

B. Sample Elution Procedure from Microcon-SCX

Following analyte binding and washing, a clean vial is placed into the unit and 25-50 µl of desorption reagent is added to the sample-bound membrane. The unit is spun at 14,000 x g for 15 sec, and the step is repeated with another 25-50 µl of desorption reagent. The salt-free eluted sample can be used directly for analyses, neutralized, or speed vacuum dried. As described in the following section, selection of desorbant will depend on both the application and the analysis performed.

C. Sample Elution Procedures for Selected Microcon-SCX Applications

1. HPLC Analysis: For high pH elution, use 1.4 N NH4OH / 50% MeOH in DIW as desorption reagent. Samples can be neutralized by adding an equal volume of 1.4 N HCl to the eluted sample; alternatively, the HCl can be placed into the collection vial during desorption in order to minimize sample exposure to high pH. For low pH elution, 3 N HCl / 50% MeOH / DIW can be used as desorbant to achieve comparable recovery. The volatile desorbant can be removed within 15 min by vacuum drying.

2. Sequence Analysis: Low pH elution with 3 N HCl / 50% MeOH / DIW as desorbant maintains compatibility with Edman chemistry. The recovered sample is spotted onto a glass fiber filter for sequencing. Higher concentrations of HCl in the desorbant and 50% IPA may be required for the recovery of strongly hydrophobic peptides.

3. Amino Acid Analysis: For high pH desorption, use 1.4 N NH4OH / 50% MeOH in DIW as desorbant, followed by an additional dry down step in 500 µl of DIW to remove residual ammonium ions. Alternatively, a more compatible desorbant, 3-6 N HCl in 50 % MeOH, can be used prior to hydrolysis.

4. Carbohydrate Analysis: Following glycoprotein hydrolysis, dilute the sample 1:1 with DIW and speed vacuum dry well to remove residual TFA. To achieve the required pH 5, add 100 µl of 20 mM sodium acetate (pH 5) and, if necessary, adjust the pH to between 4 and 5 with NaOH. The hydrolysate containing monosaccharides is transferred to a pre-wetted Microcon-SCX and centrifuged for 30 sec at high speed to remove free amino acids and unhydrolyzed peptides before HPAE-PAD analysis by electrochemical detection.

III. Results and Discussion

A variety of peptide standards, protein digests and complex mixtures treated with Microcon-SCX are presented to demonstrate the efficiency of analyte binding, elution and selectivity. Profile overlays and peak area integration of RP-HPLC runs of eluates recovered after Microcon-SCX treatment are shown in figures 2, 3 and 4 alongside untreated samples. The results demonstrate efficient membrane binding and broad selectivity, with recoveries typically above 80-90%. Analytes that bind to Microcon-SCX include dimers, trimers, small peptides, low pI peptides, hydrophobic peptides and glycopeptides. Peptides generated by a variety of enzymes, including glu c, asp n, V8, trypsin and endo lys c, have been shown to adsorb efficiently.
[Figure 2 chromatogram: overlaid traces of the HPLC peptide standard and the Microcon-SCX eluate; labeled peaks include Val-Tyr-Val, methionine enkephalin and leucine enkephalin.]

Figure 2. Reversed Phase Chromatography of HPLC Peptide Standard: Overlay of Before and After Microcon-SCX. Starting peptide mixture containing 45 µg in 250 µl of standard or eluted Microcon-SCX. Separation was performed on an Amicon C18-300-10sp (4.6 x 250 mm) using a 4 min hold at 15 % ACN, 0.25 % TFA in DIW followed by a linear gradient in 20 min from 15 % ACN to 33 % ACN at 1 ml/min. Approximately 80 % recovery of each peptide was determined by peak area integration ratios.
The ability of the cation exchange membrane to adsorb positively charged free amine groups, as in dimers and trimers, is apparent within a brief 15 sec centrifugation. The rapid kinetics of analyte binding is shown in figure 2 as a chromatogram overlay of control and treated peptide standard. These chromatograms remain nearly identical with sample loads from 1-250 µg. The cytochrome c tryptic peptide map shown in figure 3 was analyzed by RP-HPLC peak area integration following Microcon-SCX treatment. Designated peaks, expressed as % of control for 6 separate samples, showed that recoveries for all peaks were ≥ 90% ± 2%.
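The recovery figures quoted above are ratios of integrated peak areas, reported as a mean and spread over replicate samples. A minimal sketch of that arithmetic is given below; the function names are hypothetical, the peak areas in the usage line are invented for illustration, and the use of the sample standard deviation for the "±" value is an assumption, since the chapter does not say how the spread was computed.

```python
from statistics import mean, stdev


def percent_of_control(treated_area, control_area):
    """Recovery of one peak as a percentage of the untreated control area."""
    if control_area <= 0:
        raise ValueError("control peak area must be positive")
    return 100.0 * treated_area / control_area


def summarize(recoveries):
    """Mean and sample standard deviation of replicate recoveries (%)."""
    return mean(recoveries), stdev(recoveries)


# Hypothetical integrated areas for one peak in six treated samples,
# compared against a control area of 50.0 (arbitrary units).
replicates = [percent_of_control(a, 50.0) for a in (45.0, 46.0, 44.0, 45.5, 44.5, 45.0)]
average, spread = summarize(replicates)
```

Applying this to each designated peak separately gives a per-peak recovery, which is the form in which the cytochrome c results are stated.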
Figure 3. Reversed Phase Chromatography of Trypsinized Cytochrome c Before and After Adsorption to Microcon-SCX. Approximately 250 µg of digest was diluted to a total volume of 500 µl and either injected directly onto the column (top) or bound and eluted from Microcon-SCX (below) as described in Methods. Separation was performed with an Amicon C18-300-10sp (4.6 x 250 mm) using a linear gradient of 5 % ACN to 55 % ACN (0.1 % TFA in DIW) in 20 minutes at 1 ml/min.
A more complex digest containing peptides and glycopeptides, endo lys c digested human immunoglobulin heavy chain (hIgG-HC) (8), is shown as a direct HPLC chromatogram comparison before and after Microcon-SCX treatment (figure 4). Qualitatively, all peptides from the control appear in the SCX-treated sample, as further confirmed by the amino acid analysis data shown in figure 5. Accurate compositional analysis in the absence and presence of detergents demonstrates efficient analyte binding and detergent removal following Microcon-SCX treatment (11). The elevated levels of Asx and Glx observed from buffer blanks and treated samples returned to normal following a second dry down step to remove residual ammonium hydroxide from the desorbant, which reacted with PTC during derivatization.
Figure 4. Recovery of Endo Lys c Digested Immunoglobulin Heavy Chain Following Microcon-SCX. hIgG-HC was reduced, alkylated and digested with endo lys c (8), with approximately 35 µg used for control (left) and Microcon-SCX treatment (right) for analyses. Reverse phase HPLC was performed with an Amicon C18-100-10sp column (4.6 x 250 mm) using a 180 min linear gradient from 5 to 55 % ACN, 0.1 % TFA in DIW at 0.7 ml/min following a 10 min gradient from 0 to 5 % ACN. The ability of Microcon-SCX to bind all peptides and glycopeptides demonstrates broad selectivity and efficient binding, which occurs during a 30 sec centrifugation.
Figure 5. Comparison of Compositional Analysis of Endo Lys c Digested IgG Heavy Chain Before and After Adsorption to Microcon-SCX in the Presence of Detergents. Approximately 650 picomoles of endo lys c digested hIgG-HC in 0.2 M sodium phosphate was either prepared for hydrolysis (control) or treated with detergents as described. The bound digest was desorbed with 1.2 N ammonium hydroxide in 50% methanol, vacuum dried and hydrolyzed. Amino acid analysis was performed by OPA derivatization using a HAIsil 120 C18 5 micron column (Higgins Analytical, Inc.) (11). Greater than 95 % of the digest was recovered from Microcon-SCX treated samples (0.2 M phosphate, 0.1 % SDS and 0.1 % Tween). The high Asx and Glx values result from incomplete removal of ammonium from Microcon-SCX eluted samples. Normal values were achieved by repeating speed vacuum lyophilisation after adding 100 µl of DIW.
Figure 6. Identification of fragments by LC-MS of Endo Lys c Digested hIgG-HC. Upper panel: HPLC trace; lower panel: mass spectrum (base peak). Masses determined from MS spectrum data following Microcon-SCX desalting. Digests were injected into an LC-MS system consisting of an HP1090 plumbed to a Finnigan TSQ7000 MS with a Finnigan electrospray source. Approximately 45 µg of digest was loaded onto a Nucleosil 300-5 C18 column (0.46 x 25 cm; Macherey-Nagel, Duren, Germany) and eluted with a gradient of ACN in 0.1% TEA at 0.7 ml/min; the effluent was analyzed on-line for both UV absorbance (top) and mass without flow splitting (below) (8).

Comparison of Microcon-SCX treated and untreated LC-MS samples showed that the peptides and glycopeptides were identical. Figure 6 demonstrates the utility of Microcon-SCX in producing high quality LC-MS data. The upper LC scan was generated from endo lys c digested IgG-HC (8), following Microcon-SCX binding, washing and elution. On-line LC fractions were subjected to negative ion ES (8). Mass spectrum data in the lower scan was used to designate the HPLC fractionated peptides in the upper trace. LC-MS of the analytes recovered from Microcon-SCX showed that all predicted peptides and glycopeptides were recovered, with masses ranging from 447-6,191 daltons. Analytes bound to Microcon-SCX, washed at low pH, eluted in ammonium hydroxide and air-dried have given clean and accurate MALDI-TOF spectra for a variety of samples (data not shown).
[Figure 7 spectra: 10 pmol oligothymidylic acid d(pT)10, two panels; m/z axis from 400 to 1200.]

Figure 7. Negative Ion Electrospray Scan of Oligo-thymidylic Acid PCR Primer (10 mer) Before and After Microcon-SCX. (Top) ES scan of 10 picomoles of oligonucleotide in 20 mM sodium acetate, pH 5. (Lower) ES scan after 10 picomoles of primer was passed over Microcon-SCX, desorbed and vacuum dried. Samples were reconstituted in water and diluted to 10 mM TEA in 50 % IPA and infused at a rate of 2 ml/min (Micro Mass). The majority of [Na⁺] passed through the membrane and was removed following Microcon-SCX treatment. A 20 mM ammonium acetate or 10 mM HCl / 20% MeOH wash removed the remaining [Na⁺] from the sample.
Microcon-SCX was used to obtain oligo-DNA primers free of salt for DNA sequencing by MS and subsequent PCR. HPLC quantitation showed that Microcon-SCX recovered 75-90 % of most oligonucleotides at a pH of 2-3. To demonstrate the efficiency of salt removal from these samples, oligo-dT (19.24) was applied to Microcon-SCX in sodium acetate buffer and analyzed by ES-MS as shown in figure 7. The results demonstrate that upon analyte binding, the majority of [Na⁺] passes through the SCX membrane, as shown by a decrease in the ionized [Na⁺]-DNA forms of the treated sample. The remaining [Na⁺]-DNA forms were removed by washing the membrane with 500 µl of 10 mM HCl in 20% MeOH prior to analyte desorption.
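The [Na⁺]-DNA forms discussed above appear in a negative-ion spectrum as a ladder of peaks displaced from the fully protonated species. The sketch below shows the arithmetic only; it is an assumption-laden illustration in which each sodium replaces one hydrogen, and the neutral mass used in the usage lines is a hypothetical placeholder rather than a value taken from the chapter.

```python
PROTON_MASS = 1.007276                  # Da
SODIUM_MINUS_H = 22.989770 - 1.007825   # Da gained when Na replaces H


def mz_negative(neutral_mass, charge, n_sodium=0):
    """m/z of [M - (z+n)H + nNa] carrying z negative charges."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass - charge * PROTON_MASS
            + n_sodium * SODIUM_MINUS_H) / charge


# Hypothetical oligonucleotide mass, used only to show the adduct ladder.
hypothetical_mass = 3000.0
ladder = [mz_negative(hypothetical_mass, charge=3, n_sodium=n) for n in range(4)]
spacing = ladder[1] - ladder[0]   # equals SODIUM_MINUS_H / charge
```

Because adjacent sodium forms are separated by about 22 Da divided by the charge, a desalted sample is recognized by the collapse of this ladder onto its first member.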
[Figure 8 chromatograms: detector response (nC) versus time (minutes) for control (left) and Microcon-SCX treated (right) samples; peak labels in the left trace include Fuc (4.92 min), GalN (9.67 min) and GlcN (12.08 min).]

Figure 8. Removal of free amino acids with Microcon-SCX from Glycoprotein Hydrolysate for Monosaccharide Determination. hIgG-HC was hydrolyzed in 2 M TFA for 5 hours, vacuum dried and applied to Microcon-SCX in ammonium acetate buffer (pH 5.0). The filtrate containing monosaccharides was analyzed using a pellicular anion exchange resin (HPAE) with pulsed amperometric detection (PAD) (10). The left trace represents the control and the right trace shows that 90 % of the monosaccharides were recovered after Microcon-SCX treatment to remove interfering amino acids or peptides. The monosaccharides from left to right are fucose, galactosamine, glucosamine, mannose, and galactose.
The potential interference of amino acids and peptides with monosaccharide analysis has been well established (5,6). In order to demonstrate the utility of Microcon-SCX as a sample prep device for carbohydrate analysis, a glycoprotein hydrolysate of hIgG-HC is shown before and after SCX treatment in figure 8. Results from pulsed amperometric detection (PAD) with HPAE separation (10) showed that greater than 90% of the monosaccharides were recovered from Microcon-SCX following hydrolysis compared to the control samples. Amino acid analysis showed that approximately 75 % of free amino acids or peptides were removed by Microcon-SCX at pH 5. A pH greater than 4.5 was required in order to recover the amino sugars (galactosamine and glucosamine) in SCX filtrates.

IV. Conclusion

Microcon-SCX contains a hydrophobic strong cation exchange membrane that adsorbs a variety of low molecular weight analytes, so that salts and contaminants can be removed and a salt-free sample concentrated in a volatile buffer. The highly efficient binding of analytes at low pH is shown to occur within seconds during a brief centrifugation. Desorption of sample at either low or high pH in the presence of polar solvents can be achieved in volumes as low as 50 µl. Optimal desorption incorporates two consecutive 50 µl spins using 1.4 N ammonium hydroxide or 1.2 N hydrochloric acid in 50 % methanol. A variety of application-specific desorbants are shown to achieve excellent sample recoveries.
Quantitation by HPLC and amino acid analysis of peptide standards, protein digests and oligonucleotides shows recoveries in the range of 75 - 99%. Binding efficiency was unaltered by salt < 0.1 M (tris, phosphate and sodium chloride), chaotropic agents < 2 M (urea, guanidine) and detergents < 1 % (SDS, Triton and Tween). Efficient removal of salt and detergent by Microcon-SCX was demonstrated by HPLC, peptide sequencing, amino acid analysis and mass spectrometry. During adsorption, the majority of salts pass through the SCX membrane, and a low pH wash step removed most of the remaining salts with minimal sample loss. Protonated analytes appear to bind to the cation exchange surface much more efficiently than monovalent ions, which enhances salt and detergent removal. Ease of use, convenience and high turnover help maintain sample fidelity and demonstrate that Microcon-SCX can be used as a versatile sample preparation device.

References

1. Kirchner, M., Fernandez, J., Shakey, Q.A., Gharahdaghi, F. and Mische, S.H. (1996) in Techniques in Protein Chemistry VII (Marshak, D.R. Ed.) pp 287-298, Academic Press, New York.
2. Merewether, L.A., Clogston, C., Patterson, S., Lu, H. (1995) in Techniques in Protein Chemistry VI (Crabb, J. Ed.) pp 153-160, Academic Press, New York.
3. Vorm, O., Chait, B.T., and Roepstorff, P. (1993) Proc. 41st ASMS Conf., 654-655.
4. Pappin, D.J.C., Hojrup, P., and Bleasby, A.J. (1993) Current Biology 3, 327-332.
5. Swiderek, K.M., Klein, M.L., Hefta, S.A., and Shively, J.E. (1995) in Techniques in Protein Chemistry VI (Crabb, J. Ed.) pp 267-275, Academic Press, New York.
6. Rohrer, J., Thayer, J.T., Avdalovic, N., and Weitzhandler, M. (1995) Anal. Biochem. 170, pp 54-62.
7. Anumula, K.R. (1994) Anal. Biochem. 220, pp 275-283.
8. Kast, E., Pathmanabhan, N., Wong, J., O'Connor, B., and Klein, M.L. (1996) Presented at the Tenth Symposium of the Protein Society, San Jose, CA, Abstract #464-T.
9. Stone, K.L., LoPresti, M.B., Crawford, J.M., DeAngelis, R. and Williams, K.R. (1989) in "A Practical Guide to Protein and Peptide Purification for Microsequencing" (Matsudaira, P. Ed.) Academic Press.
10. Hardy, M.R., Townsend, R.R., and Lee, Y.C. (1988) Anal. Biochem. 170, pp 54-62.
11. Seitz, P., Godel, H., "Quantitation of Cysteine and Cystine"; Hewlett Packard Application Note # 12-5901-0775E (1991).
Sample Preparation Using Synthetic Membranes for the Study of Biopolymers By Matrix Assisted Laser Desorption/Ionization Mass Spectrometry

T.A. Worrall*, J.A. Porter+, R.J. Cotter*,#, A.S. Woods§
Departments of Biophysics and Biophysical Chemistry*, Oncology§, Pharmacology and Molecular Sciences#, Molecular Biology and Genetics+
The Johns Hopkins University School of Medicine, Baltimore, MD 21205

INTRODUCTION

Since its discovery in 1987, matrix assisted laser desorption/ionization time of flight mass spectrometry (MALDI-TOF MS) has become a common technique in the mass spectral analysis of biopolymers (1,2). Its ease of operation, theoretically unlimited mass range, and ability to acquire an entire mass spectrum without scanning make the technique an excellent method to analyze high mass biopolymers. Combining such advantages with the capability of analyzing sub-picomole quantities of biopolymers makes MALDI-TOF MS extremely useful for routine mass analysis. While many groups have demonstrated that MALDI-TOF can be used routinely to analyze synthetic peptides and proteins (3,4), analysis of peptides and proteins extracted from biological sources is not as routine.

Sample purity is one of the most important factors in the effective application of MALDI-TOF MS. The success of MALDI depends greatly on the crystallization process occurring within matrix-analyte mixtures during sample preparation. Moderate concentrations of salt, glycerol, and sugars detract significantly from the desorption/ionization efficiency of peptides and proteins, reducing the mass range and resolution of spectra. Common detergents such as PEG and TRITON, added during protein extraction, undergo the desorption/ionization process more efficiently than peptides and proteins, produce much greater signal intensity than biopolymers, and often suppress the detection of biopolymers altogether. Proteins and peptides are typically extracted from cells by using buffers and detergents, and commonly used salt-containing buffers such as sodium phosphate hinder effective crystallization of samples with matrix, making effective desorption/ionization more difficult, if not impossible. Purification of biopolymers by HPLC frequently results in 30% sample losses, and could add further contaminants to samples. Polyacrylamide gel electrophoresis not only introduces sodium and potassium contamination, but also reduces the recovered concentration of biological samples. Consequently, further purification techniques are needed to get a MALDI-grade "clean sample". Such techniques, however, are time consuming, result in loss of precious biological samples, and can introduce more contaminants which are incompatible with MALDI.

In recent years, several membranes have been found to be compatible with MALDI. Preliminary work has focused on eliminating minor impurities from low picomole concentrations of peptides and proteins. Mock et al. (5) were able to resolve 10 pmole quantities of peptides and proteins in the presence of small concentrations of contaminants by spotting samples on an electrosprayed nitrocellulose surface. Zaluzec et al. (6) were able to wash water soluble contaminants from samples immobilized on nylon membranes in a method analogous to washing samples immobilized on stainless steel surfaces. Both MALDI MS of samples electroblotted onto poly(vinylidene fluoride) (PVDF) and proteolysis of peptides directly on electroblotted membranes have been demonstrated (7,8,9,10). More recently, uncontaminated proteins desorbed directly from polyethylene membranes have produced high resolution, very reproducible spectra (11).
Yet while desorption/ionization has been demonstrated from such membranes, there has been no attempt to establish MALDI from membranes as a generally useful purification tool in protein biochemistry. In this paper, we report the use of several micro-porous synthetic membranes to prepare contaminated peptides and proteins for MALDI-TOF analysis. By spotting contaminated samples directly onto activated synthetic membranes, impurities such as salts, glycerol, and detergents can be washed from the sample, while biopolymers remain intact. Following addition of matrix, samples can be desorbed and ionized directly from the membrane. We report that MALDI-TOF spectra can be acquired from contaminated peptides, proteins, and enzymatic reactions without requiring chromatographic separation or purification. The technique is rugged and spectra are reproducible. It allows MALDI-TOF MS analysis of contaminated proteins and peptides, and of enzymatic reactions conducted directly on the membrane, with a minimum of preparation and with improved mass resolution.

MATERIAL AND METHODS

Several membranes were tested (the C8 and C18 extraction disks and the disposable IR card are 3M products purchased from Fisher Scientific; the 113-2 and 1222 membranes were a gift from 3M):

1. 113-2, a high density polyethylene membrane with a Gurley of 22 seconds (the Gurley is an indication of how porous a membrane is, and is measured by timing how long it takes to force a fixed volume of gas through a fixed area of membrane; the lower the Gurley, the more porous the film);
2. 1222, a polypropylene membrane with a Gurley of 5.3 seconds;
3. Type 61 disposable IR card, a polyethylene membrane;
4. Octadecyl (C18) extraction disks;
5. Octyl (C8) extraction disks.

I. MALDI of Pure Samples from Membranes

In order to determine the optimum sample preparation conditions, we initially tried several sample preparation methods in the absence of contaminants. The following peptide and protein mixture was prepared (peptides and protein were purchased from Sigma Chemicals, St. Louis, MO): 1 pmol/µl Parathyroid Hormone Fragment 39-68 (PHF), MW = 3285.7, 10 pmol/µl Pigeon Heart Cytochrome C (PHC), MW = 12,173, and 10 pmol/µl Bovine Serum Albumin (BSA), MW = 66,256. A saturated solution of sinapinic acid in 1:1 ethanol:water was used as matrix. Samples were prepared on each of the membranes using each of the following protocols:

a. 1 µl of sample solution was added to the membrane and allowed to dry at room temperature. 1 µl of a saturated solution of sinapinic acid in 1:1 acetonitrile:water was added.
b. Membranes were pretreated by depositing 2 µl MeOH, immediately followed by the addition of 1 µl of sample solution. The membrane was dried at room temperature, and 1 µl of saturated matrix solution was added.
c. Membranes were pretreated by depositing 2 µl MeOH, immediately followed by addition of 1 µl of the sample solution. The membrane was allowed to dry at room temperature. It was then washed with 3-6 ml 1:1 methanol:water. The membrane was again allowed to dry at room temperature, and 1 µl of saturated matrix solution was added.

The best results with all membranes were obtained by following protocol c. Peptide and protein solutions can take anywhere from 10 to 30 minutes to dry.

II. MALDI of Contaminated Samples from Membranes

To examine the ability of membranes to prepare samples with known contaminants, we contaminated the above peptide and protein solution with 5% glycerol and 500 mM NaCl. Glycerol and sodium contaminants are frequently present in biological samples, and they prevent effective crystallization of analyte samples with matrix on conventional stainless steel surfaces. Doped samples were prepared for MALDI-TOF analysis by saturating the membrane with MeOH, immediately followed by the addition of 1 µl of the sample. The membrane was washed 3 times with 3-6 ml 70% methanol in water and allowed to dry after each wash. Once dry, 1 µl of saturated matrix solution was added to the sample spot.

III. Endopeptidase Digest

1 µl peptide, 1 µl 25 mM ammonium bicarbonate pH 8.5, 1 µl Trypsin (1.0 µg/µl), and 1 µl 50 mM NaCl were added to the dry membrane. The solution was incubated at room temperature (RT) for 3 minutes, and 2 µl methanol were added to activate the membrane and stop the reaction. Once the membrane was dry, 1 µl matrix solution was added.

IV. Chemical Cleavage

Peptide mapping of the Sonic Hedgehog protein (SHH) was done via a cyanogen bromide (CNBr) digest, which cleaves at the carboxy-terminus of methionine residues (12). The SHH protein was in a
The SHH protein was in a
t~ r.
~l!sualu ! aA.II ela~
•(l!SUll u I aA~l~la ~
~l!SUalU[ a ~yela ~
~=.
!
I
! !
, I'~
1~
i'~I~
--,-I =t
-
-I
-~,
r.
i-iiii~iiiiiiil,ii
~
.~l!SUalU I a~.q¢la~!
~,--
u
I!
n
~
I=,,
°
~ ~.~-
I=,
~.~
° j,,,t
~.
=
"~"
, ~_~:~ ~ ,-i ¢ l,,=,l
.~-
.o
148
T. A. Worrall et al
solution containing 0.1% Triton X-100, as well as other buffers and salts. SHH was precipitated using 1 mL of a 10% TCA solution. It was then centrifuged for 5 minutes in a microfuge. The pellet was solubilized in 70% acetic acid. The CNBr reaction was done in a test tube, and 1 µl of the digest was deposited on the activated membrane. Because of the Triton X-100, buffers and salts in the starting solution, peptide mapping of SHH was ideally suited to purification on the membrane. The SHH protein was a gift from Dr. Philip Beachy (13).

V. Instrumentation

All mass spectra were acquired on a Kratos Kompact MALDI III time of flight mass spectrometer with a 337 nm N2 laser and a 20 kV extraction potential in the linear mode. Membranes were affixed to the Kratos sample slide with tape. Every spectrum was the average of 50 laser shots. Spectra were calibrated from external standards desorbed from the same membrane being tested.

RESULTS AND DISCUSSION

In general, MALDI of samples fixed to membranes resulted in no loss of mass resolution or mass range. Spectra were extremely reproducible, and could usually be acquired at a lower threshold laser intensity. Figure 1 shows a peptide and protein mixture desorbed from each of the 5 tested membranes. All membranes except the C18 extraction disk produced well resolved spectra. Higher masses were better resolved in samples fixed to polyethylene membranes, while lower masses were better resolved by fixing samples to the Type 61 disposable IR card. Doubly and triply charged ions formed more readily upon desorption/ionization from all the membranes tested than from stainless steel surfaces.

Improvement in mass resolution by MALDI of samples loaded on synthetic membranes was particularly apparent in the MALDI of contaminated samples. We systematically examined the ability to remove measured amounts of contaminants from peptide and protein samples by doping previously pure samples with glycerol and salts.
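When doubly and triply charged ions form readily, each analyte contributes several peaks to the spectrum, and it helps to know where to expect them. The sketch below computes those positions for the three components of the test mixture from their stated molecular weights. It assumes positive-ion mode with protonated species, which the text does not state explicitly, and the function and dictionary names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da

# Molecular weights as given in Material and Methods.
TEST_MIXTURE = {"PHF": 3285.7, "PHC": 12173.0, "BSA": 66256.0}


def mz_protonated(molecular_weight, charge):
    """m/z of [M + zH] carrying z positive charges (assumed ion type)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (molecular_weight + charge * PROTON_MASS) / charge


# Expected peak positions for singly, doubly and triply charged ions.
expected_peaks = {
    name: [mz_protonated(mw, z) for z in (1, 2, 3)]
    for name, mw in TEST_MIXTURE.items()
}
```

A doubly charged BSA ion, for instance, falls near m/z 33,129, well separated from the singly charged cytochrome c peak, so the extra charge states do not overlap the main analyte signals in this mixture.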
Samples doped with 5 % glycerol and 500 mM sodium were prepared for MALDI-MS analysis using the method described above. Figure 2a shows a spectrum of the contaminated mixture combined with matrix and loaded onto a stainless steel surface. The contamination, while small by biochemical standards, was significant
Sample Preparation of Biopolymers by MALDI-TOF MS
149
Matrix
k^
mass / charge
150.000
Fig 2a. Mixture Contaminated with 5% Glycerol and SOOmM NaCl desorbed from Stainless Steel Probe. PIIC+
. t 70-q tn
i
pur
^ eo-j a i * i C^ 30H
mass / charge
Fig 2b. Mixture Contaminated with 5% Glycerol and SOOmM NaCl desorbed from 1) Polyethylene Membrane 113-3; 2) Type 61 Disposable IR Card; 3) C8 Extraction Disk .
Fragment   AA residues (position)   Mol. wt.
F1         1-11                     1320.5
F2         13-20                    843.0
F3         22-29                    975.0
F4         30-38                    1075.1
F5         21-29                    1103.4
F6         1-20                     2273.6
F7         21-38                    2260.4
F8         21-40                    2288.5
Fig. 3. Tryptic digest of 10 pmol Growth Hormone Releasing Factor fragment 1-40, 1:1 in saturated α-cyano-4-hydroxycinnamic acid, desorbed from a. polyethylene membrane 113-2; b. polypropylene membrane 1222; c. stainless steel probe (x-axis: mass/charge).
enough to prevent effective crystallization of matrix with analyte. As a result, only matrix peaks were present upon MALDI of the contaminated sample from a stainless steel surface. Fig. 2b (1-3) shows the contaminated sample desorbed from the polyethylene membrane 113-3, the Type 61 disposable IR card, and the C8 membrane, respectively. By fixing the biopolymers to the membrane and washing away contaminants, we were able to acquire mass spectra of the previously undetectable sample without the need for chromatography or dialysis. In addition to improvements in spectra of glycerol and Na doped samples, we acquired spectra of a salt-contaminated tryptic digest. Fig. 3 shows spectra of the tryptic digest fragments desorbed from polyethylene membrane 113-2, polypropylene membrane 1222, and the stainless steel surface. No peptide fragments were detected in MALDI from the stainless steel surface. In contrast, well resolved spectra, including all major fragments, were acquired for samples added to and washed on the 113-2 polyethylene membrane. Once contaminated digests had been spotted on membranes and the contaminants washed away, analyte and matrix crystallized effectively and MALDI-TOF spectra of contaminated samples were readily acquired.
Fragments generated by the CNBr digest of the SHH protein:
Fragment   AA residues (position)   Mol. wt.
F1*        1-77                     8651.9
F2         78-93                    1773.2
F3         94-139                   5327.8
F4         140-176                  3947.5
F5         1-94                     10425.1
* Fragment 1 is not seen because the methionine is followed by a threonine residue. In such cases cleavage occurs in a very low yield.
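The cleavage rule behind this table (CNBr cuts on the C-terminal side of methionine, with very low yield when the next residue is threonine) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the `skip_before` parameter, and the toy sequence are hypothetical, and the toy sequence is not the SHH sequence.

```python
def cnbr_fragments(sequence, skip_before=("T",)):
    """Split a sequence C-terminal to each Met (M), as CNBr does.

    A Met followed by a residue listed in `skip_before` is treated as
    uncleaved, mimicking the very low yield at Met-Thr bonds.
    Returns (first, last, fragment) tuples with 1-based positions.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue == "M" and not is_last and sequence[i + 1] not in skip_before:
            fragments.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    # Whatever remains after the last cleavage site is the final fragment.
    fragments.append((start + 1, len(sequence), sequence[start:]))
    return fragments


# Hypothetical toy sequence for illustration only.
toy = "AAMGGMTKKMLL"
for first, last, frag in cnbr_fragments(toy):
    print(f"{first}-{last}: {frag}")
```

Passing `skip_before=()` gives the idealized digest in which every Met-X bond is cleaved, which is a convenient way to list the fragments that would be expected if the Met-Thr bond were cut.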
Fig. 4. Cyanogen bromide digest of Sonic Hedgehog protein (x-axis: mass/charge).
Finally, we demonstrated the effectiveness of membranes in the purification of biological samples by acquiring spectra of TRITON-containing bacterial Sonic Hedgehog (SHH) cyanogen bromide digests (Figure 4). Native hedgehog proteins are not soluble in the absence of detergent (14), yet detergents such as TRITON produce such intense signals that they suppress peptide and protein desorption/ionization, rendering peptides and proteins undetectable. We doped bacterial SHH with 1 % TRITON, the same concentration of TRITON necessary to solubilize native SHH, digested the mixture with CNBr, and spotted the protein digest directly onto the membrane. By spotting SHH samples on activated membranes and washing as described above, we were able to remove the TRITON X-100 from the sample spot and acquire mass spectra of the previously undetectable sample. Only fragment 1 was not seen, because methionine 77 is followed by a threonine residue, in which case cleavage occurs rarely (15).
Of the several membranes we examined, the polyethylene membranes were the most effective under all conditions. More specifically, the 113-2 membrane was most effective, followed by the IR membrane. The polypropylene membrane 1222 was the next most effective, followed by the C8 and C18 extraction disks. We suspect that the smaller pore sizes of polyethylene membranes result in retention of peptides and proteins at the membrane surface. Further work is needed to establish this, however.
While MALDI-TOF mass spectrometry has been a rapidly expanding technique in recent years, its use has been limited to detection of extremely pure biopolymers. Salt, glycerol, and detergents render peptide and protein samples difficult to detect, or completely undetectable. Such contaminants suppress analyte signal to various degrees.
By spotting samples directly onto activated membranes, however, contaminants are removed without the need for further purification and without loss of resolution and mass range. Spotting contaminated peptide and protein samples directly onto synthetic membranes will continue to expand the role of MALDI-MS in the biochemical laboratory.
REFERENCES
1. K. Tanaka, Y. Ido, S. Akita, in Proceedings of the Second Japan-China Joint Symposium on Mass Spectrometry; Matsuda, H.; Liang, X.T., Eds; Bando Press, Osaka (1987) pp. 185-188.
2. M. Karas, D. Bachmann, U. Bahr, F. Hillenkamp, Int. J. Mass Spectrom. Ion Processes, 1987, 78, 53-68.
3. M.R. Chevrier and R.J. Cotter, Rapid Commun. Mass Spectrom., 1991, 5, 611-617.
4. L. Keefe, C. Wolkow, A. Woods, M. Chevrier, R.J. Cotter, and E.L. Lattman, J. Appl. Cryst., 1992, 25, 8739-8743.
5. K.K. Mock, C.W. Sutton, and J.S. Cottrell, Rapid Commun. Mass Spectrom., 1992, 9, 1051-1055.
6. E.J. Zaluzec, D.A. Gage, J. Allison, and J. Throck Watson, J. Am. Soc. Mass Spectrom., 1994, 5, 230-237.
7. K. Strupat, M. Karas, F. Hillenkamp, C. Eckerskorn, F. Lottspeich, Anal. Chem., 1994, 66, 464-470.
8. M.M. Vestling, C. Fenselau, Anal. Chem., 1994, 66, 471-477.
9. D. Fabris, M.M. Vestling, M.M. Cordero, V.M. Doroshenko, R.J. Cotter and C. Fenselau, Rapid Commun. Mass Spectrom., 1995, 9, 1051-1055.
10. M.M. Vestling, C. Fenselau, Mass Spectrometry Reviews, 1995, 14, 169-178.
11. J.A. Blackledge, A.J. Alexander, Anal. Chem., 1995, 67, 843-848.
12. E. Gross and B. Witkop, J. Am. Chem. Soc., 1961, 83, 1510-1511.
13. J.A. Porter, D.P. von Kessler, S.C. Kessler, K.E. Young, J.J. Lee, K. Moses and P.A. Beachy, Nature, 1995, 374, 363-366.
14. J.A. Porter, S.C. Ekker, W. Park, D.P. von Kessler, K.E. Young, C. Chen, Y. Ma, A.S. Woods, R.J. Cotter, E.V. Koonin, P.A. Beachy, Cell, 1996, 86, 21-34.
15. W.A. Schroeder, J.B. Shelton and J.R. Shelton, Arch. Biochem. Biophys., 1969, 130, 551-556.
Use of LC/MS Peptide Mapping for Characterization of Isoforms in 15N-Labeled Recombinant Human Leptin
Jennifer L. Liu, Tamer Eris, Scott L. Lauren, George W. Steams, Keith R. Westcott, and Hsieng Lu
Department of Protein Structure, Department of Protein Chemistry, and Department of Process Science, Amgen Inc., Amgen Center, Thousand Oaks, CA 91320-1789
I. INTRODUCTION
The employment of high field heteronuclear multidimensional NMR for determining the tertiary structures of proteins requires isotopically labeled proteins. The proteins are typically uniformly labeled with 15N or double labeled with 15N and 13C. The production of such labeled proteins can be achieved by expression in recombinant bacterial cells using minimal media in which [15N]ammonium sulfate and [13C]glucose are the sole nitrogen and carbon sources. These specific "labeling" fermentation conditions, therefore, create stressed growth environments for production of proteins at high expression yield. It is known that misincorporation of norleucine in place of methionine is dramatically enhanced when cells are grown under stressed fermentation conditions (1,2). The mechanism of norleucine de novo synthesis and its incorporation into recombinant protein in bacterial cells have been established (2-5). Several groups have observed the occurrence of norleucine substitution for methionine in proteins both at the internal residues and at the amino terminus with no specific preference (6, 7). Techniques such as N-terminal sequencing, amino acid analysis, and mass spectrometry were employed for the characterization of norleucine incorporation. However, the resolution of protein species containing norleucine substitutions at various methionine residues presents an analytical challenge and has not been achieved previously. The quantitative measurement of each individual Met->Nle isoform was therefore not possible. Recombinant human leptin was recently cloned (8) and expressed in E. coli, and demonstrated to effectively regulate adiposity in mice through modulation of appetite and metabolism (9, 10). The molecule contains four methionine residues at positions 1, 54, 68, and 136.
In this paper, we report the separation and characterization of three norleucine-incorporated recombinant human leptins which were uniformly labeled with 15N isotope or double labeled with 15N and 13C isotopes. The extent of incorporation at each methionine residue can be determined by reverse-phase HPLC and amino acid analysis methods. Norleucine incorporation was observed to occur preferentially at the internal Met residues.
II. MATERIALS AND METHODS
TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press All rights of reproduction in any form reserved.
A. Material
All chemicals used were of analytical grade, except when otherwise indicated. Endoproteinase Asp-N was obtained from Boehringer Mannheim (Indianapolis, IN). 15N-labeled recombinant human leptin was expressed in E. coli in inclusion bodies using [15N]ammonium sulfate as the sole nitrogen source. The protein was allowed to fold and oxidize after solubilization of inclusion bodies, and purified by ion exchange chromatography to 95% purity as assessed by SDS-polyacrylamide gel electrophoresis as described (9, 11).
B. Separation of [15N]r-metHuLeptin isoforms by reverse-phase HPLC
The [15N]r-metHuLeptin isoforms were separated by applying 750 µg of a leptin preparation to a Vydac C4 reverse-phase semi-preparative column (10 x 250 mm) using a Hewlett Packard HPLC (Model 1090) system equipped with a diode array detector. The column was initially equilibrated with 47% mobile phase A (0.1% TFA) and 53% mobile phase B (90% acetonitrile in 0.1% TFA). A linear gradient from 53 to 55% mobile phase B was run over a period of 35 minutes at a flow rate of 2.0 mL/min.
C. Peptide mapping by reverse-phase HPLC/electrospray ionization mass spectrometry (LC/ESMS)
The solution of [15N]r-metHuLeptin (0.3 mg/mL) in PBS (0.1 M sodium phosphate, 0.1 M sodium chloride, pH 7.2) was incubated with endoproteinase Asp-N at an enzyme-to-substrate ratio of 1:75 (w/w) at 25 °C for 5 h. The digestion was terminated by adding 5 µL of 5% TFA to the reaction. Peptides were separated on a Vydac C4 reverse-phase analytical column (4.6 x 250 mm) using a Hewlett Packard HPLC (Model 1090) connected on-line to a PE-Sciex API-100 electrospray mass spectrometer. The column was initially equilibrated with 95% mobile phase A (0.1% TFA) and 5% mobile phase B (90% acetonitrile in 0.1% TFA). A linear gradient from 10 to 50% mobile phase B was run over a period of 85 minutes at a flow rate of 0.5 mL/min. The flow was split (9:1) after the UV cell, allowing 50 µL/min of the eluent to be analyzed by the electrospray mass spectrometer.
D. Amino acid analysis and N-terminal sequence determination
Acid hydrolysis of purified [15N]r-metHuLeptin samples (1-3 nmol) was performed using 6 N HCl, 0.1% 2-mercaptoethanol, and 0.65% phenol at 110 °C for 24 hr as described previously (12). The hydrolysates were dried, reconstituted, and injected into a Beckman 6300 amino acid analyzer for compositional analysis. Sequential Edman degradation was performed on a Hewlett Packard G1000A automated protein sequencer using sequencing programs recommended by the manufacturer (Hewlett Packard Inc., Mountain View, CA).
III. RESULTS
A. Assessment of 15N isotope incorporation
The degree of 15N incorporation was determined by electrospray mass spectrometry. The observed molecular mass (Mr) for the labeled r-metHuLeptin is 16,347 amu. The unlabeled recombinant leptin has an observed Mr of 16,157.5 amu. The relative molecular mass difference of approximately 190 amu corresponds to the substitution of 190 15N for 14N. Therefore, the 15N labeling efficiency was calculated as 99.9%.
B. Separation of isoforms by reverse phase HPLC
Figure 1 shows the reverse-phase HPLC profile of [15N]r-metHuLeptin. Eight different isoforms (forms A to H) were separated based on their hydrophobic interaction with the C4 bonded silica column. Molecular weights of the isoforms were determined by an on-line electrospray mass spectrometer connected to the HPLC. Table I lists the relative retention time, percent abundance, and molecular weight of each identified isoform. The native protein (form D) constitutes approximately 77% of the total protein based on HPLC area integration. Isoform A elutes at 18.37 min and has an observed Mr of 16113.0 amu. The relative molecular mass difference between form A and native [15N]r-metHuLeptin is 234 amu, which corresponds to removal of the N-terminal dipeptide, [15N]Met-Val, in form A. Isoform B elutes at 20.31 min and has an observed Mr of 16214.0 amu, which differs from native [15N]r-metHuLeptin by 133 amu, or a [15N]Met residue. Isoform C elutes at 21.08 min and has an observed Mr of 16364.0 amu. The additional 16 daltons of mass compared to native [15N]r-metHuLeptin suggests that isoform C is a monooxidized species. Three isoforms, F, G, and H, elute later than the native protein and each comprises approximately 4% of the total protein. The molecular masses observed for these isoforms were 18 daltons less than that of the native protein. Further characterization of the isoforms described above was performed to identify their structural differences from native [15N]r-metHuLeptin.
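The mass differences quoted above can be checked with a short calculation. The sketch below takes the observed masses from Table I and compares the shift for isoforms F, G, and H with the value expected when a side-chain sulfur is replaced by a CH2 group (Met->Nle). The atomic weights are standard values assumed here, not taken from the text, and the variable and function names are illustrative only. Because methionine and norleucine each contain one nitrogen, 15N labeling does not change the expected shift.

```python
# Observed masses (amu) copied from Table I.
OBSERVED = {
    "A": 16113.0, "B": 16214.0, "C": 16364.2, "D": 16347.0,
    "E": 16360.0, "F": 16328.5, "G": 16328.5, "H": 16328.5,
}

# Standard average atomic weights (assumed values, not from the text).
S, C, H = 32.06, 12.011, 1.008


def delta_from_native(observed, native="D"):
    """Mass of each isoform minus the mass of the native form, in amu."""
    return {name: round(mass - observed[native], 1)
            for name, mass in observed.items()}


# Met -> Nle swaps the thioether sulfur for a methylene group.
MET_TO_NLE = round(-(S - (C + 2 * H)), 2)

deltas = delta_from_native(OBSERVED)
for name in sorted(deltas):
    print(f"isoform {name}: {deltas[name]:+.1f} amu")
print(f"expected Met->Nle shift: {MET_TO_NLE:+.2f} amu")
```

The observed shift of -18.5 amu for F, G, and H lies close to the expected -18.03 amu, while isoform C is shifted by +17.2 amu, in the range expected for addition of one oxygen.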
Figure 1. Reverse phase HPLC of [15N]r-metHuLeptin (x-axis: minutes; peaks labeled A to H).
Table I. Identified [15N]r-metHuLeptin isoforms
Isoform  [15N]r-metHuLeptin form   Retention time(a) (min)  % abundance(b)  Observed mass(c) (amu)  Theoretical mass (amu)
A        des-Met-Val               18.37                    1.1             16113.0                 16114.5
B        des-Met                   20.31                    3.7             16214.0                 16214.6
C        monooxidized              21.08                    0.9             16364.2                 16362.7
D        native                    23.25                    77.6            16347.0                 16346.7
E        unidentified              26.27                    5.1             16360.0
F        Met68->Nle68(d)           29.62                    4.0             16328.5                 16328.7
G        Met54->Nle54(d)           30.56                    3.9             16328.5                 16328.7
H        Met136->Nle136(d)         32.43                    3.8             16328.5                 16328.7
(a) Protein isoforms eluted on reverse phase HPLC.
(b) Percent abundance determined based on area integration on reverse phase HPLC.
(c) Observed mass on Sciex API 100 Plus with ES source.
(d) Determined by structural characterization seen in Figure 2 and Table II.
C. Endoproteinase Asp-N peptide mapping and on-line LC/MS
Figure 2 shows endoproteinase Asp-N digested peptide maps of isoforms F, G, and H, together with that of the leptin standard. Table II lists the methionine-containing peptides, their molecular masses, and sequences. In the native protein peptide map, peptides A9, A4, and A5 each contain one methionine residue in the sequence. Peptide A9, DMLWQL, elutes at 51.5 min. Peptides A4 and A5 elute at 61 min and have primary sequences of DFIPGLHPILTLSKM and DQTLAVYQQILTSMPSRNVIQISN, respectively. The endoproteinase Asp-N peptide map of isoform F contains one new peptide, A5*, which elutes at 64 min. The molecular mass of A5* is 2736 amu, which differs from that of A5 by approximately 18 daltons. The isoform G peptide map contains a new peptide, A4*, which elutes at 64.5 min. The molecular mass of A4* is 1682 amu, which is again 18 daltons less than that of peptide A4. In the peptide map of isoform H, a new peptide, A9*, elutes at 56 min and has an MH+ of 795.4 amu. The molecular mass difference between peptides A9* and A9 is also 18 daltons. The observed molecular mass difference of 18 daltons for peptides A5*, A4*, and A9*, versus peptides A5, A4, and A9, is equivalent to the molecular mass difference between methionine and norleucine. The sequence alterations of peptides A5*, A4*, and A9* were determined by N-terminal sequence analysis.
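As a rough check on the peptide masses discussed above, the MH+ of a uniformly 15N-labeled peptide can be estimated by summing residue masses and adding one 15N/14N mass increment per nitrogen atom. The sketch below does this for peptide A9 (DMLWQL) and its norleucine variant. It assumes monoisotopic masses and 100% labeling, which the text does not state; the residue masses are standard values, and the one-letter code "X" for norleucine is an ad hoc choice for this example.

```python
# Monoisotopic residue masses (Da) and nitrogen counts per residue.
# Standard values, assumed here rather than taken from the text.
RESIDUES = {
    "D": (115.02694, 1),
    "M": (131.04049, 1),
    "L": (113.08406, 1),
    "W": (186.07931, 2),
    "Q": (128.05858, 2),
    "X": (113.08406, 1),  # ad hoc code for norleucine (isomeric with Leu)
}
WATER = 18.01056      # added once per peptide
PROTON = 1.00728      # charge carrier for MH+
N15_SHIFT = 0.99703   # mass difference between 15N and 14N


def mh_plus_15n(peptide):
    """Estimated MH+ of a fully 15N-labeled peptide (monoisotopic)."""
    mass = WATER + PROTON
    for aa in peptide:
        residue_mass, n_count = RESIDUES[aa]
        mass += residue_mass + n_count * N15_SHIFT
    return mass


print(f"A9  (DMLWQL): {mh_plus_15n('DMLWQL'):.2f}")
print(f"A9* (DXLWQL): {mh_plus_15n('DXLWQL'):.2f}")
```

Under these assumptions the estimate for A9 comes out at about 813.37, matching the theoretical value listed in Table II, and the estimate for the norleucine variant is about 795.41, close to the observed 795.4.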
Figure 2. Reverse phase HPLC of endoproteinase Asp-N digests of Met->Nle isoforms compared to unmodified [15N]r-metHuLeptin (x-axis: minutes).
Table II. [15N]r-metHuLeptin Asp-N peptide map
Peptide(a)  Fragment(b)  Observed mass(c,d) (theoretical mass)  Observed sequence(e)           Recovery(f) %
A9          136-141      813.4 (813.37)                         DMLWQL                         74
A9*         136-141      795.4 (795.37)                         D[Nle]LWQL                     26
A5          56-79        2753 (2752.1)                          DQTLAVYQQILTSMPSRNVIQISN       79
A5*         56-79        2736 (2735.1)                          DQTLAVYQQILTS[Nle]PSRNVIQISN   21
A4          41-55        1700 (1699.88)                         DFIPGLHPILTLSKM                78
A4*         41-55        1682 (1681.88)                         DFIPGLHPILTLSK[Nle]            22
(a) Peak assignment for peptide fragment eluted on reverse phase HPLC.
(b) Protein sequence fragment assigned based on mass and sequencing data.
(c) Observed mass and theoretical mass for MH+.
(d) Observed mass on Sciex API 100 Plus with ES source.
(e) [Nle] is norleucine.
(f) % Recovery = (peak area of AX x 100) / (peak area of AX + peak area of AX*), where AX is A9, A5, or A4.
D. Detection of norleucine by N-terminal sequence analysis
The presence of norleucine in the modified peptides was confirmed by comparing the PTH derivative of the unknown amino acid derived from Edman degradation with the PTH-norleucine standard. The PTH derivative of norleucine eluted at a later position than the DPTU standard (Figure 3). Figure 3(a) depicts the first three cycles of peptide A9*, Asp-Nle-Leu, on a Hewlett Packard G1000A automated protein sequencer, while peptide A9 showed Asp-Met-Leu for the first three cycles. Figure 3(b) depicts cycles 13-15 of peptide A5*, Ser-Nle-Pro, while peptide A5 showed Ser-Met-Pro for the corresponding cycles. A similar result was also observed using an Applied Biosystems gas-phase sequencer (data not shown). The N-terminal sequencing analysis of peptides A5*, A4*, and A9* confirms that isoforms F, G, and H contain a Met->Nle substitution at amino acid residues 68, 54, and 136, respectively. The N-terminal sequence analysis of isoforms A and B showed the initial cycles as Pro-Ile-Gln and Val-Pro-Ile, respectively. These results are consistent with the molecular mass data described earlier (Table I), indicating that these isoforms are N-terminally truncated.
Figure 3. N-terminal sequence analysis of norleucine-containing peptides digested from [15N]r-metHuLeptin. (a) Cycles 1-3 of peptide A9*, Asp-Nle-Leu; (b) cycles 13-15 of peptide A5*, Ser-Nle-Pro. The DPTU peak is marked in each cycle.
E. Amino acid analysis of Met->Nle isoforms
The number of methionine residues substituted by norleucine for isoforms F, G, and H was further confirmed by amino acid composition analysis. Each of the isolated Met->Nle isoforms was shown to have a composition consistent with the incorporation of one norleucine residue for one methionine residue. While no norleucine was found in the native r-metHuLeptin, [15N]r-metHuLeptin contains approximately 5% norleucine relative to methionine.
Table III. Amino acid composition of regular and norleucine-incorporated recombinant leptins
Amino acid  Leptin  [15N]Leptin  Isoform F and Isoform G(a)  Isoform H  Theoretical
Asx         14.2    14.3         14.3                        14.1       14
Thr         11.0    10.9         11.0                        11.0       11
Ser         15.7    15.7         16.1                        15.8       17
Glx         15.5    15.6         15.7                        15.6       15
Pro         7.0     7.2          7.2                         7.1        6
Gly         8.3     8.2          9.0                         8.7        8
Ala         5.1     5.1          5.2                         5.2        5
Cys         N/A     N/A          N/A                         N/A        2
Val         10.3    10.1         10.2                        10.3       11
Met(b)      3.4     3.1          2.4                         2.3        4
Ile         9.4     9.3          9.0                         9.2        10
Leu         23.8    23.6         22.8                        23.3       23
Nle         0       0.2          0.8                         0.9        0
Tyr         2.0     2.1          2.2                         2.1        2
Phe         2.1     1.9          1.9                         1.9        2
His         4.1     4.0          4.0                         4.0        4
Lys         7.0     7.2          7.4                         7.3        7
Trp         N/A     N/A          N/A                         N/A        2
Arg         4.0     4.0          4.0                         4.0        4
(a) Isoforms F (Met68->Nle68) and G (Met54->Nle54) were not separated by semipreparative reverse phase HPLC, and were analyzed together for amino acid content.
(b) The typical hydrolysis yield for methionine residues is approximately 80%.
IV. DISCUSSION
The misincorporation of norleucine for methionine is known to occur in bacteria when high level synthesis of recombinant proteins is induced in minimal medium fermentation (1,2). This misincorporation was detected in the production of 15N-labeled recombinant human leptin using minimal medium conditions, but is not present in the clinical samples produced using other fermentation conditions. The mechanism for the misincorporation is believed to involve de novo synthesized norleucine, which bypasses the leucine biosynthetic pathway and enters directly into the incorporation pathway by associating with tRNA(Met) in the acylation reaction. The level of incorporation as well as the distribution of norleucine among the methionine residues, however, has varied for different recombinant proteins (1, 5, 13-15). In the production of 15N-labeled leptin, a small fraction (5%) of norleucine was incorporated in the expressed protein. Among the four methionine residues (positions 1, 54, 68, and 136), the three internal residues were equally substituted by norleucine at a rate sixteen fold greater than the incorporation detected for the methionine at the amino terminus. The discrete substitution results in the generation of three isoforms, each containing a norleucine in place of one of the internal methionines. This observation differs from those for other recombinant proteins known to have misincorporation of norleucine for methionine.
Methionine is the longest unbranched nonpolar amino acid and has an unusually flexible side chain. Norleucine and methionine differ only in the substitution of a methylene group for a divalent sulfur atom. Although the side chains of norleucine and methionine have nearly identical volumes and surface areas, the methionine sulfur atom is more polar and less hydrophobic than the corresponding methylene group in norleucine. Therefore, a methionine-containing peptide would have a higher desolvation energy compared to the same peptide containing a norleucine substitution at the methionine position (16). The single-point substitution at the three internal methionine residues in recombinant human leptin converts the homogeneous protein into three closely related heterogeneous proteins. The local environmental changes caused by the misincorporation of norleucine at the three methionine residues are reflected in the elution profile on reverse phase chromatography.
The less buried norleucine residue has greater surface area accessible to interact with the solid phase of the chromatography, which results in the longer retention time of the norleucine-incorporated isoforms. The elution order of the three norleucine-containing isoforms, therefore, reveals information about the relative solvent accessibility of each of the three internal methionine residues. The single-point substitution of a naturally occurring amino acid by an analog provides a convenient tool for studying the effect of molecular alteration on the biological activity of proteins (3). Although sterically superimposable on methionine, norleucine is not a substrate for methionine adenosyltransferase. Therefore, it is not expected to follow the same metabolic function as methionine. On the other hand, norleucine lacks the sulfur atom, which is prone to oxidation upon exposure to oxidizing reagents such as free oxygen. The substitution of methionine by norleucine might diminish the need to engineer an oxidation-resistant protein.
ACKNOWLEDGEMENTS
The authors would like to thank Dr. Viswanathan Katta for helpful discussion, John Le for great assistance in the on-line HPLC/MS, and Dr. Michael Rohde and Tom Boone for support of this work.
REFERENCES
1. Lu, H. S., Tsai, L. B., Kenney, W. C., Lai, P.-H. (1988) Biochem. Biophys. Res. Commun. 156, 2, 807-813.
2. Tsai, L. B., Kenney, W. C., Curless, C. C., Klein, M. L., Lai, P.-H., Fenton, D. M., Altrock, B. W., Mann, M. B. (1988) Biochem. Biophys. Res. Commun. 156, 2, 733-739.
3. Barker, D. G., and Bruton, C. J. (1979) J. Mol. Biol. 133, 217-231.
4. Brown, J. (1973) Biochim. Biophys. Acta 294, 527-529.
5. Bogosian, G., Violand, B. N., Dorward-King, E. J., Wokerman, W. E., Jung, P. E., Kane, J. F. (1989) J. Biol. Chem. 264, 1, 531-539.
6. Kerwar, S. S., and Weissbach, H. (1970) Arch. Biochem. Biophys. 141, 525-532.
7. Trupin, J., Dickerman, H., Nirenberg, M., and Weissbach, H. (1966) Biochem. Biophys. Res. Commun. 24, 50-55.
8. Zhang, Y., Proenca, R., Maffei, M., Barone, M., Leopold, L., Friedman, J. M. (1994) Nature 372, 425-432.
9. Pelleymounter, M. A., Cullen, M. J., Baker, M. B., Hecht, Winters, D., Boone, T., Collins, F. (1995) Science 269, 540-543.
10. Halaas, J. L., Gajiwala, K. S., Maffei, M., Cohen, S. L., Chait, B. T., Rabinowitz, D., Lallone, R. L., Burley, S. K., Friedman, J. M. (1995) Science 269, 543-549.
11. Lu, H., Clogston, C., Merewether, L., Narhi, L., Boone, T. (1993) In Protein Folding: In Vivo and In Vitro (J. Cleland, Ed.) 526, chap. 15.
12. Lu, H. S., Lai, P. H. (1986) J. Chromatogr. 368, 215-231.
13. Forsberg, J., Palm, G., Ekebacke, A., Josephson, S., and Hartmanis, M. (1990) Biochem. J. 271, 357-363.
14. Gilles, A.-M., Marliere, P., Rose, T., Sarfati, R., Longin, R., Meier, A., Fermandjian, S., Monnot, M., Cohen, G. N., and Barzu, O. (1988) J. Biol. Chem. 263, 8204-8209.
15. Randhawa, Z. I., Witkowska, H. E., Cone, J., Wilkins, J. A., Hughes, P., Yamanishi, K., Yasuda, S., Masui, Y., Arthur, P., Kletke, C., Bitsch, F., and Shackleton, C. H. L. (1994) Biochemistry 33, 4352-4362.
16. Thomson, J., Ratnaparkhi, G. S., Varadarajan, R., Sturtevant, J. M., and Richards, F. M. (1994) Biochemistry 33, 8587-8593.
Hyphenated HPLC Methodology for the Resolution and Elucidation of Peptides from Proteolytic Digests
Randall T. Bishop, Vincent E. Turula,* and James A. de Haseth†
Department of Chemistry, University of Georgia, Athens, GA 30602-2556 USA
Robert D. Ricker Rockland Technologies, Inc. 538 First State Boulevard Newport, DE 19804 USA
* Present Address: Amvax Inc., 12103 Indian Creek Court, Beltsville, MD 20705.
† Author to whom correspondence is to be addressed.
I. Introduction
The use of proteolytic enzymes in the analysis of protein structure is well established, yet the identification and characterization of the resulting peptide fragments usually requires the generation of a peptide map through a mode of separation. Reversed phase chromatography is known to be a powerful tool in the analysis of complex biological mixtures, and has found great success in the resolution of peptide mixtures. (1) Common on-line detection techniques, however, such as UV and fluorescence detectors, suffer from low sensitivity or specificity, and therefore provide little structural detail about the separated peptides. (2) More structurally informative detection techniques are of great value in increasing the speed and efficiency with which structural information can be extracted. Mass spectrometric techniques that include continuous flow fast atom bombardment (FAB), electrospray ionization (ESI), and matrix assisted laser desorption (MALDI) have been applied successfully to protein structure investigations. (3) The hyphenation of electrospray ionization mass spectrometric techniques to liquid chromatography for use as an on-line detector has proven to be quite successful. Minimal sample requirements (pmol) and low flow rate restrictions (<400 µL/min) compare favorably with narrow- and microbore chromatographic parameters. Both singly and multiply charged ionic species allow for a very accurate determination of protein and peptide masses. (4) The combination of LC and MS produces a two dimensional separation in which peptide fragments need not be totally separated in the chromatographic stage in order to be individually detected in the mass spectrometric stage. (4)
The particle beam LC/FT-IR spectrometry interface can also be used for peptide and protein HPLC experiments to provide another degree of structural characterization that is not possible with other detection techniques. Infrared absorption is sensitive to both specific amino acid functionalities and secondary structure. (5, 6) Secondary structure information is contained in the amide I, II, and III absorption bands, which arise from delocalized vibrations of the peptide backbone. (7) The amide I band is recognized as the most structurally sensitive of the amide bands. The amide I band in proteins is intrinsically broad, as it is composed of multiple underlying absorption bands due to the presence of multiple secondary structure elements. Infrared analysis provides secondary structure details for proteins, while for peptides, residual secondary structure details and amino acid functionalities can be observed. The particle beam (PB) LC/FT-IR spectrometry interface is a low temperature and pressure solvent elimination apparatus which serves to restrict the conformational motions of a protein while in flight. (8, 12) The desolvated protein is deposited on an infrared transparent substrate and analyzed with the use of an FT-IR microscope.
The PB LC/FT-IR spectrometric technique is an off-line method in that the spectral analysis is conducted after chromatographic analysis. It has been demonstrated that desolvated proteins retain the conformation that they possessed prior to introduction into the PB interface. (8) The ability of the particle beam to determine the conformational state of chromatographically analyzed proteins has recently been demonstrated. (9, 10) As with the ESI interface, the low flow rates required with the use of narrow- or microbore HPLC columns are compatible with the PB interface. In this study, the utility of both LC/MS and PB LC/FT-IR for protein structure characterization is demonstrated in the analysis of a proteolytic digestion mixture of horse heart cytochrome C. Horse heart cytochrome C is composed of three major and two minor helical structures that are interconnected by polypeptide coils and folded into a globular shape around a heme pocket. There is little other regular secondary structure. (11)
II. Materials and Methods
A. Chemicals
Horse heart cytochrome C (CytC) of the highest purity was purchased from the Sigma Chemical Company (St. Louis, MO) and used as received. Tosylamido-2-phenyl-ethylchloromethyl ketone (TPCK) inhibited trypsin was purchased from the Pierce Chemical Company (Rockford, IL). Trifluoroacetic acid (TFA), ammonium bicarbonate (NH4HCO3), α-cyano-4-hydroxycinnamic acid, and 2-mercaptoethanol were purchased from Aldrich (Milwaukee, WI). Urea, calcium chloride, and acetonitrile were obtained from J.T. Baker (Phillipsburg, NJ). Water was deionized to 18 MΩ with a Barnstead NANO ultrapure water system.
B. Proteolytic Digest
1 mg of CytC was dissolved in 990 µL digestion buffer (0.1 M NH4HCO3/1 mM CaCl2) in a polypropylene tube and incubated at 85 °C for 10 minutes in a water bath to facilitate thermal unfolding. 10 µL of a 2 mg/mL TPCK inhibited trypsin stock solution (in 0.1 M NH4HCO3/1 mM CaCl2) was added, which resulted in a protein-enzyme ratio of 50:1. The CytC solution was then placed in a heating block set at 37 °C, where the reaction was allowed to proceed for 10 hours. The proteolytic digestion was halted by the addition of TFA (~100 µmol) to lower the pH (to ~2), followed by immediate submersion of the reaction vessel into a dry ice/acetone bath.
C. Chromatography
The HPLC system consisted of a Perkin-Elmer Series 200 quaternary solvent pumping system (Perkin-Elmer, Norwalk, CT), a Zorbax C8 guard column , a Zorbax 300SB-C3 2.1 x 15 cm sterically protected tri-isopropyl (C3) column (MAC-MOD Analytical Inc., Chadds Ford, PA) , a PerkinElmer 235C diode array detector, and a PE Nelson Model 1022 digital integrator (Perkin-Elmer, Norwalk, CT). Column load was 20 |ig of material equating to peptide amounts of micrograms to hundreds of nanograms. The column was thermostated at 60 °C. A flow rate of 0.25 mL/min was used throughout and peptide bonds were monitored at 215 nm. Mobile phases were A) 0.1% TFA, and B) acetonitrile, 0.07% in TFA. The gradient
program was 95% A to 35% A over 40 minutes, to 5% A over 5 minutes, and to 95% A over 10 minutes. Void volume was determined with uracil.
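The gradient program above can be written as a small lookup for the mobile phase composition at any time. The following is only a sketch: it assumes that the program starts at t = 0 and that every segment is linear, neither of which is stated explicitly in the text, and the function names are illustrative.

```python
from bisect import bisect_right

# Breakpoints (time in min, % mobile phase A) read from the gradient program.
# Assumption: the program starts at t = 0 and each segment is linear.
GRADIENT = [(0.0, 95.0), (40.0, 35.0), (45.0, 5.0), (55.0, 95.0)]


def percent_a(t_min, program=GRADIENT):
    """Return % A at time t_min by linear interpolation between breakpoints."""
    times = [t for t, _ in program]
    if t_min <= times[0]:
        return program[0][1]
    if t_min >= times[-1]:
        return program[-1][1]
    i = bisect_right(times, t_min) - 1
    (t0, a0), (t1, a1) = program[i], program[i + 1]
    return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)


def percent_b(t_min, program=GRADIENT):
    """% B is the balance of the two-solvent mixture."""
    return 100.0 - percent_a(t_min, program)


if __name__ == "__main__":
    for t in (0, 20, 40, 42.5, 45, 50, 55):
        print(f"{t:5.1f} min  A = {percent_a(t):5.1f}%  B = {percent_b(t):5.1f}%")
```

Under these assumptions the first segment changes by 1.5% per minute, so a peptide eluting at 20 minutes would see 35% B.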
D. Mass Spectrometry
LC/MS analysis was conducted on a Fisons VG Quattro II mass spectrometer with electrospray introduction capabilities (Fisons Instruments, Beverly, MA). A +3900 volt bias was placed on the discharge needle, the electron multiplier was held at -650 volts, and the source temperature was held at 150°C. A mass range of either 600 to 1200 or 1200 to 2100 Daltons was monitored in all analyses. The following voltages were maintained on the ion optics: HV lens, -500 V; cone, 50 V; skimmer, 1.5 V; RF lens, 200 V. Chromatographic conditions were identical to those described above.
MALDI analysis was conducted on an aliquot of the original digest mixture. Mass measurements were made with a Bruker MALDI/TOF MS instrument (Billerica, MA) equipped with a nitrogen laser (337 nm). Spectra were averaged from 100 to 150 laser pulses. α-Cyano-4-hydroxycinnamic acid was used as the MALDI matrix. Samples were prepared from the CytC digest solution and acetonitrile/water (50:50) that was 0.1% in TFA.
E. Particle Beam LC/FT-IR
Complete descriptions of the particle beam, its operation, its experimental setup, and its utility in protein structural studies have been given previously (8, 12). Relevant PB dimensions include a 25 µm diameter fused silica capillary for production of the aerosol spray, a 22 cm length desolvation chamber to remove solvent, a single stage momentum separator, and a nozzle-substrate distance of 5 mm. Particle beam deposits ranged in size from 20 µm to 100 µm in diameter, and averaged approximately 50 µm. Deposits were made onto a water-insoluble calcium fluoride (CaF2) window (25 mm dia. x 2 mm) from International Crystal Laboratories (Garfield, NJ). Infrared spectra were collected on a Perkin Elmer Spectrum 2000 infrared spectrometer equipped with an i-series IR microscope and controlled with Spectrum for Windows software (Perkin Elmer, Norwalk, CT). Spectra were coadded from 250 scans at a resolution of 8 cm⁻¹ with strong apodization. Reference spectra of CytC and individual amino acids were
acquired by the evaporation of solvent from a small amount of an analyte solution (5-10 µL) with the use of a vacuum desiccator to produce a film of material on a CaF2 disk. These films were spectroscopically analyzed with the IR microscope in a manner identical to the particle beam deposits. All spectral manipulations were carried out with GRAMS 386 software (Galactic Industries Corp., Salem, NH).

Figure 1. Chromatogram of the cytochrome C tryptic digestion mixture (absorbance, scale bar 0.2 AU, versus time in minutes). Lettered peaks were analyzed with the use of both PB LC/FT-IR spectrometry and LC/MS.
III. Results and Discussion
A. Mass Spectrometry
LC/MS analysis allowed for the detection of all component CytC tryptic digest fragments (14) within the mass range of 600 to 2100 Daltons, as illustrated in Table I. The tryptic digest chromatogram is illustrated in Figure 1. Chromatographic peaks are identified with their corresponding peptide fragment(s) in Table I. Only 5 peptide fragments of <600 MW, which contain a total of 16 residues, were not observed, which resulted in verification of 85% of the amino acid sequence. As can be seen
in the tryptic digest chromatogram in Figure 1, not all of the 21 predicted components were completely resolved under these chromatographic conditions; complete resolution is not a requirement for LC/MS detection, provided the peptide fragment masses differ by several mass units (14). The fragments that compose several chromatographic peaks (E, L, and N) were not identified as potential proteolytic cleavages of CytC by trypsin. These are assumed to be irregular cleavages. The latest eluting peak, P, in the digest chromatogram in Figure 1 corresponds to undigested CytC, as determined from the multiply charged ion series observed (4). The presence of native protein is due to the fact that the protein was not chemically denatured during proteolysis; rather, it was thermally denatured just prior to the addition of protease. Both the rate and route of proteolytic attack on a protein are dictated by higher order structure (15). To be sure, some refolding did occur, which would reduce the rate of proteolysis and account for the presence of residual native CytC.

Table I. Tryptic fragment letter, sequence number, residues, and calculated and observed masses for horse heart cytochrome C observed in LC/MS

Fragment  Sequence       Residues            Calculated  Observed
letter    no.                                mass        mass
A         74-79          YIPGTK              678.4       678.8
          73-79          KYIPGTK             806.5       806.8
B         9-13           IFVQK               634.4       634.7
C         56-60          GITWK               604.4       604.7
D         40-53          TGQAPGFTYTDANK      1470.7      736.2 (a), 1470.9
          39-53          KTGQAPGFTYTDANK     1598.8      800.3 (a), 1599.0
E         -              -                   -           -
F         61-73          EETLMGYTLENPKK      1551.8      776.8 (a)
          80-87          MIFAGIKK            907.5       907.9
G         28-39          TGPNLHGLFGRK        1296.7      648.8 (a)
          80-86          MIFAGIK             779.5       779.8
H         28-38          TGPNLHGLFGR         1168.6      1169.0
          91-99          EDLIAYLKK           1092.6      1093.0
          80-86          MIFAGIK             779.5       779.8
I         80-86          MIFAGIK             779.5       779.8
J         88-98          TEREDLIAYLK         1350.7      676.2 (a)
K         91-98          EDLIAYLK            964.5       964.9
L         -              -                   -           -
M         14-22 + heme   CAQCHTVEK + heme    1633.9      817.6 (a), 1634.8
N         -              -                   -           -
O         14-22          CAQCHTVEK           1018.4      1041.4 (b)
P         1-103          Native              12,300      Ion series

(a) Observed as MH2²⁺. (b) Observed as MNa⁺ adduct.
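The doubly charged entries in Table I and the assignment of peak P from its multiply charged ion series both rest on simple m/z arithmetic. The sketch below is illustrative rather than the authors' procedure: it assumes that the tabulated calculated masses are MH⁺ values and that all charges are carried by protons, and the function names are hypothetical. The charge assignment from two adjacent peaks is the standard relation for an electrospray ion series.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz(neutral_mass, z):
    """m/z of the [M + zH]z+ ion of a species with the given neutral mass."""
    return (neutral_mass + z * PROTON) / z


def mz_from_mh(mh, z):
    """Convert a singly protonated mass (MH+) to the m/z of [M + zH]z+."""
    return mz(mh - PROTON, z)


def charge_and_mass(mz_high, mz_low):
    """Assign charge from two adjacent peaks of one ion series.

    mz_high carries z charges and mz_low carries z + 1 charges.
    Returns (z, neutral mass).
    """
    z = round((mz_low - PROTON) / (mz_high - mz_low))
    return z, z * (mz_high - PROTON)


if __name__ == "__main__":
    # Calculated MH+ values for the two peptides of peak D (Table I).
    for mh in (1470.7, 1598.8):
        print(f"MH+ {mh:7.1f} -> [M+2H]2+ at m/z {mz_from_mh(mh, 2):.2f}")

    # Synthetic adjacent charge states for a 12,300 Da protein (assumed z = 10, 11).
    high, low = mz(12300.0, 10), mz(12300.0, 11)
    print(charge_and_mass(high, low))
```

For the peak D peptides the computed doubly charged values fall within about half a mass unit of the observed entries marked (a) in Table I.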
B. PB LC/FT-IR Analysis
Particle beam LC/FT-IR analysis of the CytC digestion mixture allowed a more detailed structural characterization of the protein fragments. The particle beam in its present configuration allows for the collection and spectroscopic interrogation of nanogram level injection quantities of material. If it is assumed that complete proteolysis of CytC occurred in the digest mixture, the peptide amounts in the injected volume (20 µL) ranged from 2.6 µg (1.6 nmol) for the largest fragment (seq. no. 9-22) to 980 ng (1.6 nmol) for the smallest fragment (seq. no. 56-60). Complete proteolysis did not occur, however, as native protein was recovered. The actual amount of material isolated is, therefore, presumably well below these respective amounts due to both the incomplete proteolysis and the throughput characteristics of the PB interface. The lettered infrared spectra in Figure 2 correspond to the chromatographic digest peaks in Figure 1.
The successful use of the PB interface in the analysis of chromatographic analytes which are similar in structure requires a reasonable degree of separation between components prior to PB desolvation and deposition. Coeluting analytes will be codeposited and the resulting IR spectrum will show absorbances due to both compounds. Additionally, chromatographic peaks of some smaller peptide fragments have IR spectra which show bands that correspond to peptide sidechain vibrations and
possible aggregation. In the IR spectra of the chromatographic peaks of larger fragments, the intensities of the individual sidechain vibrations are greatly reduced relative to the amide and N-H stretching vibrations related to the peptide backbone, and are consequently difficult or impossible to observe. The amide absorption bands in the IR spectra of larger fragments also show evidence of secondary structure of either a residual nature, intrinsic to the peptide in that environment, or incurred through aggregation with like polypeptides upon desorption. The LC/MS measurements do not show evidence of aggregate formation, although it is likely an aggregated peptide species would be unstable after the acquisition of multiple charges under ESI conditions. Nonetheless, non-covalent interactions have been preserved during ESI/MS in protein-ligand complexes (16).

The IR spectra indicate the possibility of aggregation in some fragments (C, J, N), as they exhibit an amide I β-sheet wavenumber below 1630 cm⁻¹ (17). CytC digestion peak C (seq. no. 56-60, GITWK) is eluted free of other peptide fragments as determined by LC/MS measurements, has a strong amide I absorption band at 1628 cm⁻¹, and exhibits a strong hydrophobic character in the form of a broad chromatographic peak (1). Aggregation is not totally unexpected of this peptide, in which 4 of 5 residues are either non-polar or uncharged, and the terminal lysine can be ion-paired with TFA. The presence of a high mole fraction of acetonitrile (a non-hydrogen bonding solvent), however, inhibits the formation of aggregates (17). If aggregation occurs between peptide fragments, it is most likely between only the most hydrophobic residues and at lower concentrations of organic modifier.

The most useful and powerful aspect of particle beam IR spectrometry is information on the higher order solution structure of the proteins and peptides under investigation (12). CytC digest peak D consists of two homologous peptide fragments which differ by a lysine residue on the amino terminus (see Table I). These fragments almost wholly contain the residues (49-54) involved in an α-helix in the native CytC structure. The IR spectrum that corresponds to these peptide fragments, D in Figure 2, has a strong amide I absorbance at 1654 cm⁻¹, which indicates the peptide has adopted a helical solution structure, presumably a residual from the native conformation. Conversely, CytC digest peaks J and K both contain peptide fragments from the carboxy terminal helix (residues 87-102) and do not exhibit α-helical structure in their respective IR spectra in Figure 2. The peptide of spectrum J appears to exhibit primarily β-sheet
characteristics with a band at 1631 cm⁻¹, possibly due to aggregation. The peptide of spectrum K appears to adopt a disordered structure, evident as a band at 1662 cm⁻¹. These peptide fragments, however, elute at a significantly higher organic modifier concentration than does the peptide fragment of peak D, which may dictate the preferred solution structure.

Figure 2. PB-infrared spectra of chromatographic digestion peaks (wavenumber axis, 3500 to 1000 cm⁻¹). Letters correspond to chromatographic peaks in Figure 1.

The identification of amino acid functionalities from the IR spectra of peptides that are composed of even a limited number of residues is extremely difficult, because IR absorbances related to the peptide bond (amide I, II, III, and the amide N-H stretch) often mask functional group absorptions. Identification of particular amino acids is likewise difficult because many amino acid sidechains exhibit the same or similar IR absorptions, e.g. aliphatic and amine moieties. The presence of thiol, phenyl, and carboxylic acid groups can be determined if the spectrum S/N ratio is sufficiently high and interfering absorbance bands are not present. For example, the weak S-H stretch of the cysteine thiol is evident in spectrum O of Figure 2 as a shoulder at 2560 cm⁻¹. The strong absorbance due to C-H out of plane deformation of para-substituted benzene in tyrosine can be seen in spectra A, H, J, and K between 845 and 834 cm⁻¹. A strong carboxylic acid C=O stretch can be seen at 1721 cm⁻¹ in the peptide spectrum O in Figure 2.
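As a cross-check on the injected peptide amounts quoted at the start of this section, the arithmetic can be sketched as follows. The sketch assumes complete proteolysis (as the text does for this estimate), a 20 µg column load, and the native mass of 12,300 from Table I; the function names are illustrative.

```python
CYTC_MASS = 12300.0     # g/mol, native CytC mass as listed in Table I
COLUMN_LOAD_UG = 20.0   # µg of digest loaded on the column


def digest_nmol(load_ug=COLUMN_LOAD_UG, protein_mass=CYTC_MASS):
    """nmol of protein (and of each tryptic fragment, if digestion is complete)."""
    load_ng = load_ug * 1e3          # µg -> ng
    return load_ng / protein_mass    # ng / (g/mol) = nmol


def fragment_ng(fragment_mass, load_ug=COLUMN_LOAD_UG, protein_mass=CYTC_MASS):
    """Mass in ng of one fragment, assuming one copy per protein molecule."""
    return digest_nmol(load_ug, protein_mass) * fragment_mass


if __name__ == "__main__":
    print(f"{digest_nmol():.2f} nmol of each fragment")
    # Smallest observed fragment, GITWK (seq. no. 56-60, calculated mass 604.4).
    print(f"GITWK: {fragment_ng(604.4):.0f} ng")
```

The result, about 1.6 nmol per fragment and roughly 980 ng of GITWK, agrees with the figures given in the text; actual deposited amounts would be lower because proteolysis was incomplete.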
Conclusions
Particle beam LC/FT-IR and LC/MS spectrometries provide both different and complementary information. LC/MS measurements provide fragment identity and sequence information through an accurate determination of molecular mass. PB LC/FT-IR spectrometry measurements provide information on the solution structure of the peptide that includes residual or chromatographically induced secondary structure and aggregation, as well as information on the presence of some amino acid functionalities. The PB technique is especially useful with larger peptides which possess formal secondary structure.
Acknowledgments
The authors wish to thank Dr. Michael Bartlett and T.G. Venkateschwaran, Department of Medicinal Chemistry, College of Pharmacy, University of Georgia, for assistance with the LC/MS analysis, and Dr. Dennis Phillips, Mass Spectrometry Facility, Department of Chemistry, University of Georgia, for assistance with the MALDI analysis.
References
1. Aguilar, M.I., and Hearn, M.T.W. (1991) In "HPLC of Proteins, Peptides, and Polynucleotides" (M.T.W. Hearn ed.), p. 247. VCH Publishers, Inc., New York, NY.
2. Copeland, R.A. (1994) "Methods for Protein Analysis", p. 161. Chapman & Hall, N.Y., USA.
3. Andersen, J.S., Svensson, B., and Roepstorff, P. (1996) Nature Biotechnology 14, 449.
4. Mann, M., Meng, C.K., and Fenn, J.B. (1989) Anal. Chem. 61, 1702.
5. Surewicz, W.K., and Mantsch, H.H. (1996) In "Spectroscopic Methods for Determining Protein Structure in Solution" (J.A. Havel ed.), p. 135. VCH Publishers, Inc., New York.
6. Byler, D.M., and Susi, H. (1986) Biopolymers 25, 469.
7. Susi, H. (1969) In "Structure and Stability of Biological Molecules" (S.N. Timasheff and G.D. Fasman eds.), p. 575. Marcel Dekker, N.Y.
8. Turula, V.E., and de Haseth, J.A. (1994) Appl. Spectrosc. 48, 1255.
9. Turula, V.E., and de Haseth, J.A. (1996) Anal. Chem. 68, 629.
10. Bishop, R.T., Turula, V.E., and de Haseth, J.A. "Study of Conformational Effects on Reversed-Phase Chromatography of Proteins with Particle Beam LC/FT-IR Spectrometry and Free Solution Capillary Electrophoresis" (1996) Anal. Chem., in press.
11. Bushnell, G.W., Louie, G.V., and Brayer, G.D. (1990) J. Mol. Biol. 214, 585.
12. Turula, V.E. (1995) "Dynamic Solution Conformation of Biopolymers by Particle Beam LC/FT-IR Spectrometry", Dissertation (Ph.D.), University of Georgia, Athens, GA.
13. Ricker, R.D., Sandoval, L.A., Permar, B.J., and Boyes, B.E. (1995) J. Pharm. Biomed. Anal. 14, 93.
14. Eshraghi, J., and Chowdhury, S.K. (1993) Anal. Chem. 65, 3528.
15. Flannery, A.V., Beynon, R.J., and Bond, J.S. (1989) In "Proteolytic Enzymes" (R.J. Beynon and J.S. Bond eds.), p. 145. IRL Press, New York.
16. Smith, R.D., and Light-Wahl, K.J. (1993) Biol. Mass Spectrom. 22, 493.
17. Fabian, H., Choo, L., Szendrei, G.I., Jackson, M., Halliday, W.C., Otvos, L., and Mantsch, H.H. (1993) Appl. Spectrosc. 47, 1513.
Detecting and Identifying Active Compounds from a Combinatorial Library Using IAsys and Electrospray Mass Spectrometry

Bolong Cao, Jan Urban, Tomas Vaisar, and Richard Y. W. Shen
Molecumetics Ltd., Bellevue, WA 98005-2199

Michael Kahn
Molecumetics Ltd., Bellevue, WA 98005-2199, and Department of Pathobiology, University of Washington, Seattle, WA 98105
I. Abstract
IAsys (a cuvette based evanescent optical biosensor; Affinity Sensors, Cambridge, UK) was utilized to detect and harvest ligands from a combinatorial peptide library containing 30 sequences. Monoclonal anti-β-endorphin antibody (3-E7) was covalently immobilized on an IAsys cuvette coated with carboxymethyl dextran. The library, X3-Gly-X2-X1-Leu-Lys-Gly-NH2, where X1 = Phe, Lys, Pro; X2 = Gly, Ser; X3 = Thr, Phe, Ala, Leu, Tyr, was added to the cuvette for detection and the active peptides were collected. Of the 30 sequences, four were selected from the pool in the IAsys cuvette and identified as YGGFLKG-NH2, FGGFLKG-NH2, YGSFLKG-NH2, and FGSFLKG-NH2 by electrospray mass spectrometry. YGGFLKG-NH2 and YGSFLKG-NH2 were found by ELISA to specifically bind the antibody. The advantage of this technique is that both detection and collection can be performed in one step, thereby increasing the efficiency of the screening process.
II. Introduction
Libraries of polypeptides have been shown to be valuable sources of novel molecules possessing a variety of useful biologic properties, some of which may serve as leads for potential vaccines and therapeutics. Construction of peptide libraries is generally accomplished either chemically, using solid-phase peptide technology, or as fusions with bacteriophage coat proteins (1,2). A major benefit of phage display of random peptides is the co-purification with the peptide of the gene encoding it, thereby allowing selection procedures in addition to screening. As part of our continuing program to develop small molecule mimics of peptides and proteins (3), we describe a screening technique that allows for not only detection, but also capture and analysis of potential lead molecules (both peptide and nonpeptide) from libraries. An experiment demonstrating the feasibility of this approach is described.
III. Materials and Method
A. Materials
PAL resin, fluorenylmethyloxycarbonyl (Fmoc) and butyloxycarbonyl (Boc) amino acids with standard side chain protection, and Boc-Tyr(tBu)-OH were from Advanced ChemTech (Louisville, KY, USA). Fmoc-Lys(Dde*) was from BioChem (San Diego, CA, USA). Beta-endorphin and anti-β-endorphin (3-E7) were from Boehringer Mannheim. The photocleavable biotinylated linker was prepared according to the literature procedure (4).
* Abbreviations used: Dde, N-[1-(4,4-dimethyl-2,6-dioxocyclohexadiene)]-ethyl; DIEA, diisopropylethylamine; DMF, N,N-dimethylformamide; EDC, 1-ethyl-3-(3'-dimethylaminopropyl)carbodiimide hydrochloride; HOBt, 1-hydroxybenzotriazole; NHS, N-hydroxysuccinimide; NMP, 1-methyl-2-pyrrolidinone; PAL, Peptide Amide Linker [5-(4-(9-fluorenylmethyloxycarbonyl) aminomethyl-3,5-dimethoxyphenoxy) valeric acid]; PyBOP, benzotriazole-1-yl-oxy-trispyrrolidino-phosphonium hexafluorophosphate; TFA, trifluoroacetic acid.
B. Method
1. Library synthesis
The enkephalin library was synthesized on 125 mg of PAL resin (5) (loading at 0.5 mmol/g) by the splitting-mixing protocol (6). The coupling of amino acids was performed using PyBOP, HOBt, DIEA (5:5:5:7 eq) in NMP/chloroform. The biotinylated photocleavable linker was coupled in NMP. The coupling time was 60 min for all steps. To deprotect the Fmoc group before each coupling, the resin was washed with DMF, treated with 25% piperidine/DMF for 2 and 10 min, and subsequently washed again with DMF. After completion of the couplings, the peptide resin was treated with 4 ml of 95% aqueous TFA for 100 min. The resulting TFA solution was concentrated to 0.5 ml by evaporating TFA in vacuo, and diethyl ether was added to precipitate the peptides.
2. Binding assay of the library to anti-β-endorphin
The binding assay of the peptide library to anti-β-endorphin was performed on an IAsys (Affinity Sensors, Cambridge, UK). IAsys is based on an optical evanescent sensor called a resonant mirror. The basic principle is that laser light is directed at a prism over a range of angles. At one unique angle, the light travels through the low refractive index layer (coupling layer), which is coupled to the prism, and propagates in the high refractive index layer (resonant layer), which is on top of the coupling layer. The immobilized samples on the surface of the resonant layer alter the refractive index of the resonant layer and consequently change the resonant angle. For details of the operating principles see the IAsys cuvette system user's guide (1993, Affinity Sensors). The immobilization of anti-β-endorphin to the carboxymethylated dextran (which is on the surface of the resonant layer) was via NHS/EDC chemistry. Prior to antibody coupling, the carboxymethylated dextran was activated twice with 0.4 M EDC/0.1 M NHS for 10 min. Anti-β-endorphin was coupled twice to the dextran layer in 10 mM sodium acetate buffer, pH = 5.0, at 25 µg/ml to ensure maximum loading. After coupling, the free activated carboxyl groups were blocked with 200 µl of 1 M ethanolamine for two minutes. Finally, the cuvette with immobilized anti-β-endorphin was washed twice with 20 mM HCl and twice with PBS/0.05% Tween 20 to eliminate the non-covalently bound antibody. The binding of the peptides to the antibody was carried out in PBS, pH = 7.4, at 25°C. After the baseline was established for 150 µl of PBS, 50 µl of 3 mg/ml (0.1 mg/ml/peptide) crude peptides in water was added to the cuvette and the binding was monitored. When equilibrium was achieved (approximately 10 min), the unbound peptides were flushed away,
and the cuvette was washed five times with 200 µl of PBS. The bound peptide(s) was/were harvested by washing the cuvette with 200 µl of 20 mM HCl. In order to ensure sufficient material to identify the active sequence(s) by electrospray mass spectrometry, the binding and harvesting were repeated 10-15 times. A typical binding curve is shown in Figure 1. The harvested solution was frozen in a dry ice/acetone bath and lyophilized overnight.

Figure 1. A typical binding curve of a 30 compound library to immobilized anti-β-endorphin antibody (3-E7), plotted against time in seconds. The arrow indicates the time at which the pool of the 30 peptide library was added to the IAsys cuvette. See text for experimental details.

3. Mass spectrometry
All mass spectra were obtained on a VG Quattro triple-quadrupole mass spectrometer (Micromass Inc., Altrincham, U.K.). Peptides were ionized with electrospray ionization under the following conditions: mobile phase, methanol/water (50/50 v/v); needle voltage, 2.8 kV; high voltage lens (counter electrode), 0.05 kV; and skimmer potential, -12 V. The flow rate of the mobile phase for the spectra of the whole library was 100 µl/min rather than 10 µl/min, which was used for the samples after binding. All samples were dissolved in methanol. Either 5 or 10 µl of sample was injected. Two injections were performed for each sample. Data were acquired in Multichannel Analysis
(MCA) mode, accumulating 5-7 scans over the ranges 300-1500 and 50-1000 at scan rates of 1200 mu/5 s and 950 mu/5 s for the before and after binding samples, respectively. The acquired data were smoothed twice using a Savitzky-Golay algorithm with peak width at half height set to 0.9 mu.

Table I. Theoretical and experimental (ESI-MS) masses of the peptides in the library. One letter codes for the N-terminal tetrapeptide moiety denote corresponding peptides.

Peptide       Theoretical (m/z)      Experimental (m/z)
sequence      [M+H]+    [M+2H]2+     [M+H]+    [M+2H]2+
AGGP          1160      580.5        1159.6    580.3
TGGP, AGSP    1190      595.5        1189.4    595.6
AGGK          1191      596.0        1190.6    595.6
LGGP          1202      601.5        1201.6    601.4
AGGF          1210      605.5        1209.7    605.7
TGSP          1220      610.5        1219.6    610.9
TGGK, AGSK    1221      611.0        1220.7    610.9
LGSP          1232      616.5        1231.6    616.9
LGGK          1233      617.0        1232.6    616.9
FGGP          1236      618.5        1235.7    618.2
TGGF, AGSF    1240      620.5        1239.7    620.4
TGSK          1251      626.0        1250.6    626.1
YGGP, LGGF    1252      626.5        1251.7    626.1
LGSK          1263      632.0        1262.6    631.9
FGSP          1266      633.5        1265.6    633.9
FGGK          1267      634.0        1266.7    633.9
TGSF          1270      635.5        1269.7    633.9
YGSP, LGSF    1282      641.5        1281.8    641.9
YGGK          1283      642.0        1282.7    641.9
FGGF          1286      643.5        1285.8    643.5
FGSK          1297      649.0        1296.7    648.9
YGGF          1302      651.5        1301.8    652.0
YGSK          1313      657.0        1312.7    656.9
FGSF          1316      658.5        1315.8    658.3
YGSF          1332      666.5        1331.7    666.8

4. HPLC analysis
All HPLC analyses were carried out on a Hitachi system model 6200A, equipped with an L-4500 diode array UV detector. A SynChropak reverse-phase C4 column (0.46 x 25 cm) was employed. Elution was accomplished with a linear gradient of 0-60% acetonitrile containing 0.1% TFA over a period of 30 min at a flow rate of 1.0 ml/min.
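The composition of the library and the theoretical m/z values of Table I can be cross-checked by enumeration. The sketch below is an illustration under stated assumptions, not the authors' calculation: because the mass of the biotinylated linker is not given, the AGGP entry ([M+H]+ = 1160) is used as an anchor and the other peptides are obtained from differences in nominal (integer) residue masses. The function names are hypothetical.

```python
from itertools import product

# Variable positions of the library X3-Gly-X2-X1-Leu-Lys-Gly-NH2.
X3 = ("T", "F", "A", "L", "Y")
X2 = ("G", "S")
X1 = ("F", "K", "P")

# Nominal (integer) residue masses of the amino acids that vary.
RESIDUE_NOMINAL = {"A": 71, "T": 101, "L": 113, "F": 147, "Y": 163,
                   "G": 57, "S": 87, "P": 97, "K": 128}

# Anchor taken from Table I; the linker mass itself is not needed.
ANCHOR_CODE, ANCHOR_MH = "AGGP", 1160


def library_codes():
    """Four-letter codes (X3, Gly, X2, X1) for every library member."""
    return [x3 + "G" + x2 + x1 for x3, x2, x1 in product(X3, X2, X1)]


def variable_mass(code):
    """Sum of nominal residue masses at the three variable positions."""
    return sum(RESIDUE_NOMINAL[code[i]] for i in (0, 2, 3))


def nominal_mh(code):
    """Nominal [M+H]+ relative to the AGGP anchor."""
    return ANCHOR_MH + variable_mass(code) - variable_mass(ANCHOR_CODE)


def nominal_m2h(code):
    """Nominal [M+2H]2+ derived from the nominal [M+H]+."""
    return (nominal_mh(code) + 1) / 2


def group_by_mass():
    """Map each nominal [M+H]+ to the codes that share it."""
    groups = {}
    for code in library_codes():
        groups.setdefault(nominal_mh(code), []).append(code)
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for mh, codes in group_by_mass().items():
        print(mh, (mh + 1) / 2, ", ".join(codes))
```

With these assumptions the 30 sequences collapse to 25 distinct nominal masses, five of which are shared by two isobaric peptides, which is why Table I has 25 rows.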
5. ELISA
Beta-endorphin was immobilized on a Costar polystyrene plate at 1 µg/ml in 50 mM carbonate buffer, pH = 9.6, at room temperature for 3 h. The plate was
blocked with 1% BSA in PBS containing 0.05% Tween 20 at room temperature for 2 h. Test compounds in blocking buffer were placed in the plate at different concentrations along with 20 ng/ml anti-β-endorphin. For detection, goat anti-mouse IgG conjugated with horseradish peroxidase and TMB was utilized.

Figure 2. Electrospray mass spectrum of the 30 member library (relative abundance in % versus m/z). Both doubly and singly protonated species are observed. Insert: expanded region of singly charged species. For conditions, see text.
IV. Results
A small library of 30 peptides was synthesized. They were all identified by electrospray mass spectrometry. The mass spectrum of the whole library is shown in Figure 2. Table I lists the theoretical and experimental m/z values of all possible sequences in the library. After binding to anti-β-endorphin, most peptides in the library were eliminated from the harvested samples. The mass spectrum (Figure 3) shows that the four remaining sequences are YGGFLKG-NH2, FGGFLKG-NH2, YGSFLKG-NH2, and FGSFLKG-NH2. To confirm this result, the four peptides were synthesized individually and purified by HPLC. Competitive ELISA against immobilized β-endorphin showed that of the four sequences, only two, YGGFLKG-NH2 and YGSFLKG-NH2, compete with β-endorphin for antibody binding. FGGFLKG-NH2 and FGSFLKG-NH2 apparently bind nonspecifically to 3-E7 or to the dextran matrix in the cuvette, as they did not block the binding between β-endorphin and anti-β-endorphin.

Figure 3. Electrospray mass spectrum of the selected compounds from the library (m/z range approximately 620 to 740). One letter code for the N-terminal tetrapeptide moiety represents corresponding peptides with the biotin on the lysine sidechain. Due to the presence of Na⁺ in the binding buffer, peptides are observed as doubly charged species with one proton and one Na⁺, e.g. [FGGF+H+Na]²⁺.
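Because the harvested peptides appear as mixed proton/sodium adducts, their expected positions differ from the [M+2H]2+ values of Table I. The following sketch estimates where the four selected peptides should fall; it assumes the nominal theoretical [M+H]+ values of Table I as input, so the results are only good to about one mass unit, and the function name is illustrative.

```python
PROTON = 1.00728       # mass of H+ in Da
SODIUM_ION = 22.98922  # mass of Na+ in Da (atomic mass minus one electron)

# Theoretical [M+H]+ values of the four selected peptides, from Table I.
SELECTED_MH = {"FGGF": 1286, "YGGF": 1302, "FGSF": 1316, "YGSF": 1332}


def mz_h_na(mh):
    """m/z of [M+H+Na]2+ given the singly protonated value [M+H]+."""
    neutral = mh - PROTON
    return (neutral + PROTON + SODIUM_ION) / 2


if __name__ == "__main__":
    for code, mh in SELECTED_MH.items():
        print(f"[{code}+H+Na]2+  m/z ~ {mz_h_na(mh):.1f}")
```

All four estimates lie inside the m/z window displayed in Figure 3, about 11 units above the corresponding doubly protonated values.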
V. Conclusion
We have demonstrated a rapid and facile technique for screening and selecting ligands from a limited library. This approach should prove valuable for the identification and optimization of lead molecules from pools.
References
1. Geysen, H.M., Rodda, S.J. and Mason, T.J. (1986) Molecular Immunology 23, 709-715.
2. Scott, J.K. and Smith, G.P. (1990) Science 248, 404-406.
3. Kahn, M. (1993) Synlett (Special Issue) 821-826.
4. Olejnik, J., Sonar, S., Krzymanska-Olejnik, E. and Rothschild, K.J. (1995) Proc. Natl. Acad. Sci. U.S.A. 92, 7590-7594.
5. Albericio, F., Kneib-Cordonier, N., Biancalana, S., Gera, L., Masada, R.I., Hudson, D. and Barany, G. (1990) J. Org. Chem. 55, 3730-3743.
6. Furka, A., Sebestyen, F., Asgedom, M. and Dibo, G. (1991) Int. J. Peptide Protein Res. 37, 487-493.
Amino Acid Analysis of Unusual and Complex Samples Based on 6-Aminoquinolyl-N-hydroxysuccinimidyl Carbamate Derivatization

Steven A. Cohen and Charlie van Wandelen
Waters Corporation, Milford, MA 01757
I. Introduction
The analysis of cell culture media and supernatants, as well as non-standard protein hydrolysates such as collagens and glycoproteins, has created the demand for techniques that accurately quantify additional amino acids not normally found in hydrolyzed samples. Methods of amino acid analysis (AAA) based on precolumn derivatization with 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC) have previously been shown to quantify hydrolyzed samples with a high degree of accuracy (1,2). The AQC-based method has also been shown to derivatize effectively in the presence of salts and lipids (3). Taking into account the above strengths, the excellent stability of the derivatives, and the unique fluorescence properties that allow for direct injection of the reaction mixture without cleanup, the AQC methodology represents an ideal choice for the analysis of complex samples.
Recent studies on separation optimization showed that accurate control of mobile phase pH was essential to successfully resolve a number of important non-hydrolysate amino acids. With good control of a complex gradient profile the system could resolve a mixture of amino acids including Asn, Gln, the cysteine derivatives carboxymethyl cysteine and pyridylethyl cysteine, and the hydroxylated amino acids hydroxyproline (Hyp) and hydroxylysine (Hyl), as well as the hydrolysate amino acids (4). However, the required precision in the control of eluent pH complicated transfer of the method between laboratories. The method also lacked the ability to separate Orn from the hydrolysate amino acids. The current study demonstrates the utility of quaternary HPLC gradient systems for facilitating methods development and simplifying routine eluent preparation with excellent pH control.
Retention time reproducibility is also enhanced with a new HPLC system, especially in the shallow region of the gradient profile. Analyses of collagen hydrolysates and cell culture supernatants are shown as representative applications.
II. Experimental
A. Materials
Cell culture media (Life Technologies, Grand Island, NY, USA) and cell culture supernatants (Repligen Inc., Cambridge, MA, USA) were provided for collaborative studies. Acetonitrile (MeCN), disodium ethylenediaminetetraacetic acid, phosphoric acid, sodium acetate trihydrate, and sodium azide were from Baker (Phillipsburg, NJ, USA); triethylamine (TEA) was from Aldrich (Milwaukee, WI, USA). Amino acid standards were from Pierce (Rockford, IL, USA) or Sigma (St. Louis, MO, USA). Collagen Type III was from Sigma. AccQ·Fluor™ reagents and AccQ·Tag™ Eluent A concentrate were from Waters Corp. (Milford, MA, USA).
Figure 1. Chromatographic separation of Mixture I (fluorescence response in mV versus time in minutes). The derivatized amino acid Mixture I (50 pmol each amino acid in 10 µl) was chromatographed using the conditions indicated. Detection was by fluorescence. The column temperature was 37°C. Eluent A = working eluent from eluent A concentrate, pH 5.05; eluent B = MeCN. Gradient: initial = 0% B, 0.5 = 1% B, 18 = 5% B, 19 = 9% B, 29.5 = 17% B, hold for 5.5 min, wash with 60% MeCN in water for 3 min, equilibrate with 100% A for 9 min before subsequent injection; step changes at 0.5 min and for the wash and equilibration steps, all other steps linear.

B. Preparation of Standards and Samples
Stock solutions (2.5 mM) of α-aminobutyric acid (Aab), γ-aminobutyric acid (Gaba), Asn, cysteic acid (Cya), Gln, Hyp, Hyl, Orn, taurine (Tau), and Trp were prepared in water. Mixture I consisted of 40 µl each of Hyl, Hyp and Pierce H mixed with 880 µl of water (0.1 mM per amino acid). Forty µl of each stock solution and Pierce H standard were mixed with water to a final volume of 1 ml to make Mixture II (0.1 mM per amino acid). A 10 µl aliquot of standard solution was buffered with 70 µl of borate buffer and derivatized with 20 µl of AQC reagent (3 mg/ml in MeCN).
Aliquots (100 µl) of media or cell free supernatants were mixed with 100 µl of 0.4 mM internal standard (α-aminobutyric acid) solution. Proteins were precipitated by adding 200 µl of MeCN. Samples were centrifuged at 16,000 x g for 5 min. Ten µl of the supernatants were then derivatized as above.
Collagen samples (100 µg in 50 µl water) were hydrolyzed using previously published vapor phase techniques (5). Hydrolysis was carried out at 114°C for 24 hr. Hydrolyzed samples were dissolved in 100 µl of borate buffer, and 5 µl was derivatized.
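The concentrations quoted for the standard mixtures and the internal standard follow from simple dilution arithmetic, sketched below. The sketch assumes that volumes are additive and that the Pierce H standard is 2.5 mM in each amino acid (which is what the stated 0.1 mM final concentration implies); the function name is illustrative.

```python
def diluted_conc(stock_conc, aliquot_ul, total_ul):
    """Concentration after diluting an aliquot into a total volume (same units as stock)."""
    return stock_conc * aliquot_ul / total_ul


# Mixture I: 40 µl each of Hyl, Hyp and Pierce H plus 880 µl of water.
mixture_1_total_ul = 3 * 40 + 880
mixture_1_mM = diluted_conc(2.5, 40, mixture_1_total_ul)

# Internal standard after protein precipitation:
# 100 µl sample + 100 µl of 0.4 mM internal standard + 200 µl MeCN.
internal_std_mM = diluted_conc(0.4, 100, 100 + 100 + 200)

# Derivatization of a standard: 10 µl standard + 70 µl borate + 20 µl AQC reagent.
derivatized_mM = diluted_conc(mixture_1_mM, 10, 10 + 70 + 20)
derivatized_pmol_per_ul = derivatized_mM * 1000  # 1 mM = 1000 pmol/µl

if __name__ == "__main__":
    print(f"Mixture I: {mixture_1_mM:.2f} mM per amino acid in {mixture_1_total_ul} µl")
    print(f"Internal standard in sample mix: {internal_std_mM:.2f} mM")
    print(f"Derivatized standard: {derivatized_pmol_per_ul:.0f} pmol/µl")
```

Under these assumptions the internal standard ends up at the same 0.1 mM level as the amino acids in the standard mixtures before derivatization.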
C. Chromatographic Instrumentation and Analysis
HPLC System (A) was a Waters Alliance™ system consisting of a 2690XE Separations Module and an M474 scanning fluorescence detector. System (B) consisted of a 625 LC system equipped with a column heater, a 715 UltraWisp™ with sample cooling option, and an M470 scanning fluorescence detector (all from Waters). A Millennium® 2010 Chromatography Manager was used for system control and results management. Separations were carried out using a 3.9 x 150 mm AccQ·Tag column with a 3.9 x 20 mm Sentry™ guard column (Nova-Pak® C18 bonded silica), both from Waters. Eluent concentrates for Mixture II protocols were prepared by dissolving 148 g of NaOAc in 1.0 L of water and adding 7.06 g of TEA. For the Mixture I separation, AccQ·Tag Eluent A concentrate (pH = 5.05) was used with pH adjusted as indicated. Concentrated eluents were titrated to the indicated pH using either a 50% phosphoric acid solution or 1.0 M NaOH. Working eluents were prepared by mixing 100 ml of the concentrate with 1.0 L of water. Fluorescence responses of AQC derivatives were measured using an excitation wavelength of 250 nm and an emission wavelength of 395 nm.
II. Results
A. Collagen Analysis
Conditions used for hydrolysate samples (1) did not provide adequate resolution for the collagen standard, Mixture I (Figure 1). Resolution of the reagent hydrolysis product 6-aminoquinoline (AMQ) and Hyp was poor. Hyl yielded two peaks because the commercial standard was provided as a mixture of diastereomers. Both Hyl peaks eluted very close to Val, which made increased resolution desirable. Resolution of Gly and His in collagen samples also needed improvement, as Gly is present in approximately 100-fold molar excess over His.
Steven A. Cohen and Charlie van Wandelen
[Figure 2: chromatograms, panels A and B; axes: detector response (mV) versus time (minutes); labeled peaks include AMQ, Hyp, Asp, Ser, His and NH3.]
Figure 2. Influence of eluent pH and gradient slope on resolution of hydrophilic amino acids in Mixture I. (A) The gradient was initial = 0%B, 20 = 5%B, 24 = 9%B, 34 = 17%B hold for 6 min, all other conditions were identical to those used in Figure 1, except Eluent pH = 4.95 ; (B) Gradient: Initial = 0%B, 25 = 5%B, 29 = 9%B, 39 = 17%B hold for 6 min all other conditions were identical to those used in Figure 1.
[Figure 3: chromatograms, panels A and B; axes: detector response (mV) versus time (minutes); labeled peaks include Hyp, Asp, Ser, Glu, Gly, His, NH3, Arg, Thr, Ala, Pro, Aab (Int. Std.), Leu, Hyl1, Hyl2, Lys and Phe.]
Figure 3. Separations using optimized conditions for Mixture I separation. Eluent B = eluent titrated to pH 6.80 with NaOH; eluent C = acetonitrile. The gradient profile is given in Table 1. The column temperature was 34°C. All other conditions were identical to those in Figure 2B. (A) Mixture I standard, (B) collagen hydrolysate sample, prepared as described in the methods section. Total run time was 54 min.
6-Aminoquinolyl-N-hydroxysuccinimidyl Carbamate Derivatization
To overcome the deficiencies of the normal hydrolysate separation, modifications were made in the eluent pH and the gradient slope, both significant influences on the relative retention of key peak pairs. Separations shown in Figure 2 illustrate these effects. Unfortunately, the lower pH system decreased the resolution of Gly and His. In contrast, decreasing the gradient steepness had the positive effect of improving both the resolution of AMQ and Hyp and that of Gly and His. As shown in Figure 2B, the higher pH eluent could effect baseline resolution of AMQ and Hyp if the shallower gradient slope was used. Resolution of Gly and His was further improved by reducing the column temperature from 37 °C to 34 °C (not shown). Subsequent optimization studies used the conditions described in Figure 2B for the gradient steps before 25 minutes. In contrast to the earlier eluting peaks, for which resolution is best at pH ~5, the separations in Figure 3 show that selectivity for the hydrophobic components, including the diderivatized amino acids (Cys, Hyl, and Lys), is favorable at higher pH. The dilemma in choosing an optimal mobile phase pH was overcome by adding a third mobile phase, a buffer similar in composition to Eluent A described in Figure 2, but significantly higher in pH. Thus, a total of four eluents, including two buffers at pH 5.05 and 6.80, MeCN and water, were used to optimize the separation. A typical gradient profile is shown in Table I. The initial mobile phase pH was 5.05, and the compositional change occurring at 28.5 min effected a rapid increase to pH 6.8. This resulted in large, advantageous selectivity changes for the diderivatized amino acids: Cys eluted after, instead of before, Tyr; the gap between Val and Met widened significantly; both Hyl peaks eluted between Met and Ile; and Lys eluted between Leu and Phe. The flow rate was also increased to reduce the total analysis time. The resulting separation is shown in Figure 3A.
A hydrolyzed, derivatized collagen sample analyzed under these conditions is shown in Figure 3B. Note the large excess of Gly relative to His, which necessitates the additional resolution provided by the optimization of the early gradient slope and the column temperature. Table II shows the compositional analysis of this sample, which agrees well with previously published data (6).

Table I. Gradient profile for the optimized separation

Time (min)   Flow (ml/min)   %A (pH 5.05)   %B (pH 6.80)   %C (MeCN)   %D (H2O)   Curve¹
Initial      1.00            100            0              0           0          *
25.0         1.00            95             0              5           0          6
28.0         1.00            92             0              8           0          6
28.5         1.50            0              92             8           0          6
36.5         1.50            0              88             12          0          6
43.0         1.50            0              80             20          0          6
43.5         1.50            0              0              60          40         11
45.0         1.00            100            0              0           0          11

1. Curve 6 is a linear segment; curve 11 is a step function.
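The gradient in Table I can be read as a small program: each row gives a target composition, and the curve number says how the pump gets there. The sketch below encodes the table and returns the eluent composition at an arbitrary time. It rests on two assumptions that are not stated in the text: "Initial" is taken as 0 min, and a curve 11 step is taken to hold the previous composition until the row time and then switch. The function name is hypothetical.

```python
from bisect import bisect_right

# Rows of Table I: (time_min, %A, %B, %C, %D, curve). "Initial" is assumed to be t = 0.
TABLE_I = [
    (0.0, 100, 0, 0, 0, None),
    (25.0, 95, 0, 5, 0, 6),
    (28.0, 92, 0, 8, 0, 6),
    (28.5, 0, 92, 8, 0, 6),
    (36.5, 0, 88, 12, 0, 6),
    (43.0, 0, 80, 20, 0, 6),
    (43.5, 0, 0, 60, 40, 11),
    (45.0, 100, 0, 0, 0, 11),
]

def composition_at(t, table=TABLE_I):
    """Return (%A, %B, %C, %D) at time t, in minutes."""
    times = [row[0] for row in table]
    if t <= times[0]:
        return tuple(float(x) for x in table[0][1:5])
    if t >= times[-1]:
        return tuple(float(x) for x in table[-1][1:5])
    i = bisect_right(times, t)          # index of the first row with time > t
    prev, nxt = table[i - 1], table[i]
    if nxt[5] == 11:
        # Assumed step behaviour: hold the previous composition until the row time.
        return tuple(float(x) for x in prev[1:5])
    # Curve 6: linear interpolation between the two rows.
    frac = (t - prev[0]) / (nxt[0] - prev[0])
    return tuple(p + frac * (n - p) for p, n in zip(prev[1:5], nxt[1:5]))

# Every programmed row should sum to 100%.
assert all(sum(row[1:5]) == 100 for row in TABLE_I)

print(composition_at(12.5))   # midway through the shallow initial MeCN ramp
print(composition_at(28.25))  # midway through the rapid switch from buffer A to buffer B
```

Evaluating the program halfway through the 28.0 to 28.5 min segment shows the two buffers blended equally, which illustrates how quickly the pH change is imposed.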
[Figure 4: chromatograms, panels A to D; axes: detector response (mV) versus time (minutes), region from 25 to 30 min; labeled peaks include Ser, Gly, AMQ, His, NH3, Thr, Arg, Ala, Gaba, Tau and Pro.]
Figure 4. Influence of pH on the resolution of hydrophilic amino acids in Mixture II. The separations were generated with eluent A and B at pH 5.70 and 5.90, respectively. The eluents were blended in the following ratios: (A) 100% A, 0% B, (B) 75% A, 25% B, (C) 50% A, 50% B, and (D) 0% A, 100% B. The gradient for all four experiments was that shown in Table III with the following modifications: i) step at 33.5 min was changed to 36.7 min, flow = 1.0 ml/min, A + B = 87 (A and B blended in the ratios specified above), C = 13, D = 0, ii) step at 33.8 min was changed to 37.0 min, flow = 1.3 ml/min, A = 0, B = 87, C = 13, D = 0, iii) step at 37.0 min was changed to 49 min, flow = 1.3 ml/min, A = 0, B = 0, C = 60, D = 40. The column temperature was 39 °C. Total run time was 60 min.
Table II. Collagen Compositional Analysis (Type 2 Collagen)

Amino Acid   Experimental Data¹   Data from Reference 6
Hyp          107                  97
Asp          44                   43
Ser          34                   25
Glu          85                   89
Gly          316                  333
His          3                    2
Arg          54                   50
Thr          26                   23
Ala          99                   103
Pro          113                  120
Tyr          3                    2
Val          19                   18
Cys          2                    ND²
Met          6                    10
Hyl          24³                  20
Ile          9                    9
Leu          26                   26
Lys          16                   15
Phe          13                   13

1. Data are expressed in residues per 1000 residues.
2. Not determined.
3. Value is the sum of the two Hyl isomers.
B. Optimization of Mixture II Resolution
Amino acid analysis of unhydrolyzed samples containing free amino acids may require the resolution and quantification of non-hydrolysate components such as Aaba, Asn, Gln, Hyp, Hyl, Tau, GABA and Orn. The separation of these amino acids plus the hydrolysate components was optimized using a multi-eluent system similar to that described for collagen samples. The basic procedure involved varying the ratio of two buffers differing only in pH, in both linear and step gradient modes, to effect small yet precise changes in eluent pH. This approach had significant practical benefits. First, small errors in eluent pH titration, typically due to difficulty in manually calibrating pH meters and measuring pH within 0.02 units, could be corrected with a minor adjustment in buffer ratio without reformulating the buffers. In addition, rapid changes in eluent pH could be implemented via a step gradient. One of the key regions of the Mixture II separation involves the four-component group Ser, Asn, AMQ and Gly. Previous work noted the sensitivity of AMQ retention to eluent pH (4). At pH 5.80 AMQ retention is greater than that of Asp, Glu, Hyp, Ser and Asn, whereas at pH 5.05 its retention is less than that of this polar group of amino acids. Figure 4 illustrates that control of pH within a range of 0.05 units can be essential for best resolution. The separation in this region is also influenced by the organic solvent concentration. The key peak pair affected by the gradient slope was Ser/Asn. The slope also controlled the position of AMQ between Asn and Gly. The best separation was effected with a very shallow gradient (Table III).

Table III. Gradient Table for Optimized Separations of Mixture II

Time (min)   Flow (ml/min)   %A (pH 5.70)   %B (pH 6.80)   %C (MeCN)   %D (H2O)   Curve¹
Initial      1.00            90.0           10.0           0.0         0.0        _
0.50         1.00            89.0           10.0           1.0         0.0        11
17.00        1.00            88.0           10.0           2.0         0.0        6
24.00        1.00            86.0           9.0            5.0         0.0        6
32.00        1.00            63.0           25.0           12.0        0.0        6
33.50        1.00            0.0            87.5           12.5        0.0        6
33.80        1.30            22.0           65.5           12.5        0.0        6
37.00        1.30            22.0           65.0           13.0        0.0        6
48.00        1.30            22.0           63.0           15.0        0.0        6
48.10        1.30            0.0            0.0            60.0        40.0       6
51.00        1.00            90.0           10.0           0.0         0.0        11
75.00²       1.00            0.0            0.0            60.0        40.0       11
110.00²      0.0             0.0            0.0            60.0        40.0       6

1. Curve 6 is a linear segment; curve 11 is a step function.
2. These steps provide an automated shutdown procedure.
[Figure 5: chromatograms, panels A to D; axes: detector response (mV) versus time (minutes), region from 40 to 45 min; labeled peaks include Hyl, Ile + Orn, Leu, Lys and Phe.]
Figure 5. Influence of pH on the resolution of hydrophobic amino acids in Mixture II. Conditions were the same as in Figure 4 A-D.
The quaternary system also simplified resolution of the hydrophobic amino acids. Key to this separation was the ability to rapidly modify eluent pH to influence the retention of the diderivatized amino acids relative to the retention of the monoderivatized analytes. As shown in Figure 5, increasing the pH in the range of 5.7 - 5.9 resulted in increased retention of the diderivatized amino acids relative to the monoderivatized ones. This selectivity was likely due to the extra quinoline tag, which would be protonated at the lower pH. The diderivatized components also exhibited different selectivity as a function of organic solvent gradient slope. Shallower gradients result in longer retention relative to the monoderivatized analytes. With careful control of the key chromatographic parameters of pH, gradient slope and flow rate, excellent resolution of Mixture II was accomplished (Figure 6B). The quaternary solvent system made it possible to simultaneously manipulate the organic solvent concentration and the eluent pH. The shallow initial slope was essential for the separation of Ser and Asn and, combined with the proper eluent pH, resolved AMQ from both Asn and Gly. In the middle of the separation the pH was increased to 6.8 to place Cys between Tyr and Val, and immediately after the elution of Cys, the pH was lowered to approximately 6.3 to place Orn between Ile and Leu. Despite the complexity of the gradient profile, reproducible separations can routinely be obtained (Table IV).
Table IV. Reproducibility for Standard Mixture II and Culture Media Sample

             Retention Time Reproducibility, Standard Mixture II           Amount Reproducibility, Ham's F-10
             System A (n=15)              System B (n=7)
Amino Acid   Ret. Time (min)   %RSD       Ret. Time (min)   %RSD           mg/L      %RSD (n=3)
Asp          10.58             0.62       11.45             0.87           27.46     OTT
Glu          14.21             0.56       17.70             0.88           52.20     0.35
Hypro        15.53             0.60       16.22             0.89           7.40      6.71
Ser          20.51             0.51       24.14             0.42           36.25     1.03
Asn          21.41             0.39       24.78             0.38           21.16     0.63
Gly          22.25             0.27       26.36             0.37           29.91     1.03
Gln          24.50             0.22       27.54             0.33           343.95    1.61
His          26.08             0.17       28.82             0.25           27.34     1.06
NH3          26.66             0.22       29.72             0.20           ND        ND
Thr          28.09             0.14       30.52             0.18           13.86     1.55
Arg          28.64             0.09       30.99             0.16           188.07    0.70
Ala          28.89             0.13       31.30             0.16           27.49     1.25
Gaba         29.80             0.11       ND                ND             ND        ND
Pro          31.11             0.09       33.32             0.11           33.73     1.78
Aaba         32.49             0.10       34.65             0.11           ND        ND
Tyr          34.47             0.06       36.90             0.12           24.37     1.68
Cys          34.87             0.06       37.52             0.15           29.25     0.70
Val          35.35             0.06       38.41             0.12           26.60     0.85
Met          36.26             0.06       39.68             0.13           16.13     15.36
Hylys1       37.11             0.04       ND                ND             ND        ND
Hylys2       38.06             0.04       ND                ND             ND        ND
Ile          40.39             0.04       45.36             0.24           20.65     8.02
Orn          41.14             0.04       46.06             0.31           ND        ND
Leu          41.47             0.04       46.85             0.25           78.51     1.23
Lys          43.39             0.06       49.51             0.29           88.43     4.86
Phe          44.62             0.05       51.48             0.23           37.53     0.58
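The %RSD figures in Table IV are relative standard deviations, that is, the sample standard deviation expressed as a percentage of the mean. A minimal sketch of the calculation follows; the replicate retention times are invented for illustration and are not data from this study.

```python
from statistics import mean, stdev

def percent_rsd(values):
    """Relative standard deviation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical replicate retention times (min) for one peak; not measured data.
replicates = [22.21, 22.25, 22.30, 22.24, 22.27]

print(round(percent_rsd(replicates), 2))
```

The sample (n - 1) standard deviation is used here; whether the original software used the sample or population form is not stated in the text.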
Figure 6. Separations for free amino acid mixtures. The eluents and gradient used are described in Table III and the text. Other conditions are described in the legend to Figure 4. (A) DMEM plus serum, (B) Standard Mixture II, and (C) non-derivatized insect cell media.
C. Analysis of Mixture II Type Samples
The study of cell culture supernatants and media was chosen to exemplify the utility of the optimized quaternary HPLC method. The rate at which amino acids are consumed by protein synthesis or other metabolic pathways can be quantified by performing amino acid analysis on supernatants of the protein-producing cell cultures. Optimization of target protein expression can then be achieved by feeding the culture concentrated supplements rich in those amino acids that are rapidly consumed.
The result shown in Figure 6A illustrates the method's ability to quantify the amino acids using the simple sample preparation protocol recommended. Complex cell culture media, including insect cell, Ham's, and DMEM, were also precisely analyzed without interference from non amino acid sample components (Table IV). A complete lack of fluorescence responses observed during chromatography of non-derivatized insect cell media (Figure 6C) underscores the highly selective detection using the fluorescence parameters described earlier. A series of experiments investigating changes in the amino acid concentrations of supernatants from a Vero cell culture were performed in our lab. Cells in the batch reactor were fed nutrient supplements several times during the 26 days of culture. Aliquots of supernatant were analyzed on 6 different days 1 hr prior to feedings. Results showing consistent decrease in several key amino acids (Figure 7) correlated well with previously published results (30).
[Figure 7: concentration versus days of culture; plotted amino acids: Met, Trp, Val, His, Tyr and Arg.]
Figure 7. Concentration profile of selected amino acids in Vero cell culture over time. The cell culture was sampled on days 4, 6, 8 (prior to feeding), 16 and 26. The culture was fed with initial media concentrate on days 4, 6 and 8, indicated by the arrows. At each time interval, an aliquot was deproteinized and derivatized according to the standard procedure. Amino acids were separated and quantified using HPLC System B and applying the optimized conditions for Mixture II.
III. Summary
Rugged, reproducible chromatographic systems for resolving some common mixtures of AQC-derivatized amino acids, such as those produced by the hydrolysis of collagen or free amino acids present in cell culture fluids, have been developed. Separation development is simplified through the use of a quaternary gradient system which enables the precise control of eluent pH and organic solvent gradient slope essential for optimum resolution and ease of buffer preparation. Excellent retention time reproducibility is routinely achieved with modern HPLC systems. Good quantitative results are obtained for a variety of samples even in the presence of potential interferences such as salts in the sample matrix.
References
1. S. A. Cohen and D. P. Michaud, Anal. Biochem., 211 (1993) 279-287.
2. D. J. Strydom and S. A. Cohen, Anal. Biochem., 222 (1994) 19-28.
3. S. A. Cohen, K. M. DeAntonis and D. P. Michaud, in J. W. Crabb (Editor), Techniques in Protein Chemistry IV, Academic Press, San Diego, 1993, 289-298.
4. S. A. Cohen and K. M. DeAntonis, J. Chromatogr., 661 (1994) 25-34.
5. S. A. Cohen, T. L. Tarvin and B. A. Bidlingmeyer, Am. Lab., August (1984) 49-59.
6. E. J. Miller, A. J. Narkates and M. A. Niemann, Anal. Biochem., 190 (1990) 92-97.
7. Eui-Cheol Jo, Hae-Joon Park, Jong-Myun Park, Kyong-Ho Kim, Biotechnol. Bioeng., 36 (1990) 717-722.
Development of a Method for Analysis of Free Amino Acids from Physiological Samples Using a 420A ABI/PE Amino Acid Analyzer
Klaus D. Linse, Sandie Smith, Michelle Gadush
Protein Microanalysis Facility, University of Texas at Austin, Austin, Texas 78712

I. Introduction
Amino acid analysis is a well established technique for the quantitation of free amino acids found in either hydrolysates or physiological fluids (1, 2, 3, 4). While such analyses have been neglected by bio-instrumentation manufacturers in recent years, we have seen a recurring interest and demand for the quantitation of free amino acids in physiological fluids, e.g., urine, blood serum, tissue cultures, as well as in hydrolysates. Lacking the funds necessary to purchase a new amino acid analyzer dedicated solely to physiological samples, we instead developed a separation protocol which allows for the separation and quantitation of up to 30 amino acids using our 420H system. While this separation protocol does not allow for the separation of all possible free amino acids found in physiological samples, we are able to identify and quantify those most commonly requested.
II. Materials and Methods
Chemicals and solvents used were of analytical and chromatographic grade, respectively. PITC, DIEA and K4EDTA were purchased from PE/ABI. Acetonitrile and methanol were obtained from VWR. Sulfosalicylic acid was from Fisher. Ultrapure water was used to make up all solutions. Amino acid standard H containing 17 amino acids and ammonia was purchased from Pierce. Additional amino acids were obtained from Sigma. Pierce H contains 1.25 µmoles/ml of L-cystine (C) and 2.5 µmoles/ml of the L amino acids arginine (R), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), tyrosine (Y), threonine (T), valine (V), alanine (A), aspartic acid (D), glutamic acid (E), glycine (G), proline (P), serine (S), and ammonia [as (NH4)2SO4]. Another 2.5 µmoles/ml amino acid standard containing α-amino butyric acid (αAba), asparagine (N), citrulline (Citr), glutamine (Q), hydroxylysine (HK1, HK2), hydroxyproline (Hp), ornithine (Orn), phosphoserine (Sp), taurine (Tau), and tryptophan (W) was prepared. This was combined with the Pierce H to create a final standard containing 27 amino acids. Additional 2.5 µmoles/ml stock solutions were made for γ-amino butyric acid (GABA) and phosphothreonine (Tp). All amino acid solutions were prepared in 0.01 N HCl and were stored either at 4°C (to be used within a few days) or at -20°C (for long-term storage). The final concentrations of amino acids in the solutions used for standard and calibration runs were either 50, 100 or 200 picomoles in 5 µl. (See Figure 1 for the structures of the PTC amino acids studied.)
A. Sample Treatment Prior to Analysis
Human urine, serum and rat brain tissue extracts were treated with sulfosalicylic acid to precipitate protein: 20 µl of 35% sulfosalicylic acid was added to 200 µl of each sample. These solutions were vortexed and allowed to sit at room temperature at least 20 minutes before proceeding. The samples were then spun in a microfuge for 2 minutes and the supernatants were collected. Collagen samples were hydrolyzed in 6 N HCl, 110°C for 24 hours. The hydrolysates were then dried down and resuspended in 250 µg/ml K4EDTA. Ant hemolymph was not pretreated before analysis. Samples were loaded on to the analyzer as follows: urine, 10 µl of a 1:2 dilution of the supernatant; serum, 10 µl of undiluted supernatant; rat brain extract, 10 µl of the undiluted supernatant; collagen hydrolysate, 1.2 µg in 15 µl; fire ant hemolymph, 4 µl of a 1:9 dilution.

B. Amino Acid Analysis
Separation protocols: System: 420A hydrolyzer and derivatizer utilizing a PTC-amino acid pre-column derivatization reaction on-line with a 130A analyzer system and a 920A data collection system (PE/ABI). Column: 2.1 x 250 mm ODS PTC column (PE/ABI). Tables I and II describe the conditions for hydrolysates and physiological samples, respectively.

Table I. Conditions for Standard PTC-analysis of Hydrolysates

Parameter      Condition
Flow rate      300 µl/min
Column temp.   34-35°C
Solvent A      50 mM sodium acetate, pH ~5.4, 2.5% acetonitrile (Use 16.7 ml 3 M sodium acetate pH 5.5 and make up to 1 L using ultrapure water and add 25 ml acetonitrile.)
Solvent B      70% v/v acetonitrile/water (ultrapure), 32 mM sodium acetate, ~pH 6.1 (Use 700 ml acetonitrile, add 10.5 ml 3 M sodium acetate pH 5.5 and make up to 1 L using ultrapure water.)
Gradient       Column equilibrated in 4% B; time 0, %B = 4; time 4, %B = 14; time 10, %B = 31; time 20, %B = 55; time 25, %B = 100; time 30, %B = 100; time 31, %B = 4.

Table II. Conditions for the Analysis of Free Amino Acids from Physiological Samples

Parameter      Condition
Flow rate      310 µl/min
Column temp.   31°C
Solvent A      75 mM sodium acetate pH 5.58 containing 2.5% acetonitrile (Use 25 ml 3 M sodium acetate pH 5.5 and make up to 1 L using ultrapure water, adjust pH with ammonium hydroxide, followed by the addition of 25 ml acetonitrile.)
Solvent B      70% v/v acetonitrile/water (ultrapure), 32 mM sodium acetate pH 6.2 (Use 700 ml acetonitrile, add 10.5 ml 3 M sodium acetate pH 5.5 and make up to 1 L using ultrapure water; check pH and adjust to pH 6.2 if necessary using either ammonia or acetic acid.)
Gradient       Column equilibrated in 4% B; time 0, %B = 4; time 4, %B = 11; time 6, %B = 13; time 10, %B = 32; time 20, %B = 52; time 25, %B = 100; time 30, %B = 100; time 31, %B = 4.
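The nominal buffer molarities in Tables I and II follow from simple dilution of the 3 M sodium acetate stock. The sketch below reproduces that arithmetic; the function name is illustrative, and volumes are assumed to be additive (contraction on mixing water and acetonitrile is ignored).

```python
def diluted_mM(stock_M, stock_ml, final_ml):
    """Concentration in mM after diluting stock_ml of stock to final_ml (C1*V1 = C2*V2)."""
    return stock_M * 1000.0 * stock_ml / final_ml

STOCK_M = 3.0  # 3 M sodium acetate, pH 5.5, as stated in Tables I and II

standard_A = diluted_mM(STOCK_M, 16.7, 1000.0)       # Table I solvent A, nominal 50 mM
physiological_A = diluted_mM(STOCK_M, 25.0, 1000.0)  # Table II solvent A, nominal 75 mM
solvent_B = diluted_mM(STOCK_M, 10.5, 1000.0)        # solvent B, nominal 32 mM

# Solvent A receives 25 ml acetonitrile after being made up to 1 L, so the
# acetonitrile fraction is 25 ml in 1025 ml (assuming additive volumes).
acn_percent = 100.0 * 25.0 / (1000.0 + 25.0)

print(round(standard_A, 1), round(physiological_A, 1), round(solvent_B, 1), round(acn_percent, 2))
```

The computed values (about 50.1, 75.0 and 31.5 mM, and about 2.4% acetonitrile) are consistent with the rounded figures quoted in the tables.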
Analysis of Free Amino Acids from Physiological Samples
[Figure 1: structural drawings grouped as nonpolar, acidic, basic and uncharged polar side chains; labeled structures include glycine, glutamic acid, aspartic acid, lysine, arginine, phenylalanine, asparagine, methionine, tyrosine, citrulline, ornithine, phosphoserine, α-aminobutyric acid, cis-4-hydroxyproline, taurine, 5-hydroxy-L-lysine, cysteine and tryptophan.]
Figure 1. Structures of PTC-amino acids used during this study.
[Figure 2: chromatograms, panels A and B; absorbance (AU) versus time (minutes, to 17.0 min); peaks labeled with the abbreviations given in the caption.]
Figure 2. Chromatographic separation of amino acids after derivatization with phenylisothiocyanate (PITC): A. Separation of 200 picomole standard amino acid mix H containing 18 amino acids. B. Separation of an extended amino acid mix containing 28 amino acids. The standard one-letter abbreviations are used for the usual amino acids. Nonstandard amino acids are Sp, phosphoserine; Hp, hydroxyproline; Citr, citrulline; Tau, taurine; αAba, α-amino butyric acid; HK1 & HK2, hydroxylysines; Orn, ornithine; *, artifacts from reagents.
[Figure 3: elution diagram; time axis in minutes (0 to 16); retention times as labeled in the two panels:]
A. (min): D 2.45, E 2.78, S 3.98, G 4.4, H 4.7, R 5.33, T 5.67, A 6.08, P 6.3, Y 8.95, V 9.93, M 10.31, C 11.0, I 12.2, L 12.43, Nl 12.8, F 13.23, K 14.4
B. (min): Sp 2.12, D 2.55, E 2.95, Hp 3.25, N 4.25, S 4.45, Q 4.7, G 4.95, H 5.53, Citr 5.87, Tau 6.25, R 6.53, T 6.83, A 7.4, P 7.7, αAba 9.32, Y 10.2, V 10.83, M 11.2, C 11.7, I 12.8, L 13.0, HK1 13.3, HK2 13.48, F 13.95, Orn 14.2, W 14.43, K 15.23
Figure 3. Elution pattern of the two PTC-amino acid separations showing the retention shifts for the indicated derivatized amino acids. A. Standard conditions. B. Physiological conditions.
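The retention shifts that Figure 3 displays graphically can also be tabulated numerically. The sketch below does this for the amino acids common to both methods. The retention times are transcribed from the labels printed in Figure 3; the pairing of labels with times in the crowded early region of panel B is an interpretation of the printed figure, so the values should be treated as approximate.

```python
# Retention times (min) transcribed from Figure 3; panel B pairings are an interpretation.
STANDARD = {
    "D": 2.45, "E": 2.78, "S": 3.98, "G": 4.4, "H": 4.7, "R": 5.33,
    "T": 5.67, "A": 6.08, "P": 6.3, "Y": 8.95, "V": 9.93, "M": 10.31,
    "C": 11.0, "I": 12.2, "L": 12.43, "Nl": 12.8, "F": 13.23, "K": 14.4,
}
PHYSIOLOGICAL = {
    "Sp": 2.12, "D": 2.55, "E": 2.95, "Hp": 3.25, "N": 4.25, "S": 4.45,
    "Q": 4.7, "G": 4.95, "H": 5.53, "Citr": 5.87, "Tau": 6.25, "R": 6.53,
    "T": 6.83, "A": 7.4, "P": 7.7, "aAba": 9.32, "Y": 10.2, "V": 10.83,
    "M": 11.2, "C": 11.7, "I": 12.8, "L": 13.0, "HK1": 13.3, "HK2": 13.48,
    "F": 13.95, "Orn": 14.2, "W": 14.43, "K": 15.23,
}

# Shift = physiological minus standard, for amino acids present in both panels.
shifts = {
    aa: round(PHYSIOLOGICAL[aa] - STANDARD[aa], 2)
    for aa in STANDARD
    if aa in PHYSIOLOGICAL
}
largest = max(shifts, key=shifts.get)

for aa, delta in shifts.items():
    print(f"{aa:>3}  {delta:+.2f} min")
print("largest shift:", largest)
```

With these transcribed values every shared amino acid elutes later under physiological conditions, and the mid-gradient group (R, T, A, P, Y) moves the most.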
[Figure 4: chromatograms, panels A to C; peaks labeled with one-letter amino acid abbreviations.]
Figure 4. Chromatographic separation of amino acids after derivatization with phenylisothiocyanate (PITC) using standard conditions: A. Separation of 200 picomole standard amino acid mix H (17 amino acids) and norleucine. B. Separation of a protein hydrolysate, bovine serum albumin (BSA). C. Separation of a peptide hydrolysate, bradykinin.
[Figure 5: chromatograms, panels A to C; absorbance (AU) versus time; peaks labeled with the abbreviations given in the caption.]
Figure 5. Chromatographic separation of amino acids after derivatization with phenylisothiocyanate (PITC) using physiological conditions: A. Separation of an extended amino acid mix containing 27 amino acids. The standard one-letter abbreviations are used for the usual amino acids. Nonstandard amino acids are Sp, phosphoserine; Hp, hydroxyproline; Citr, citrulline; Tau, taurine; αAba, α-amino butyric acid; HK1 & HK2, hydroxylysines; Orn, ornithine; *, artifacts from reagents. B. Separation of free amino acids found in human serum. C. Separation of free amino acids found in human urine.
[Figure 6: chromatograms, panels A to C; the taurine peak is labeled.]
Figure 6. Chromatographic separation of amino acids after derivatization with phenylisothiocyanate (PITC) using physiological conditions: A. Free amino acids found in rat brain tissue. B. Free amino acids in ant hemolymph. C. Bone collagen hydrolysate.
III. Results and Discussion
Due to an increased interest in analysis of physiological samples, we wanted to establish analyzer methods which would allow us to choose between our standard protocol for protein and peptide hydrolysates and a separate protocol for an expanded number of amino acids, to include the most important free amino acids found in physiological samples. A study of common analysis requirements in our facility indicated that only a limited number of the possible free physiological amino acids is needed for most unknown samples. These additional amino acids of interest are α-amino butyric acid, citrulline, γ-amino butyric acid (GABA), hydroxyproline, hydroxylysine, ornithine, taurine, and tryptophan. Other amino acids of interest to us are phosphoserine, phosphothreonine, phosphotyrosine and carboxy-amino acids since they are released from glycoprotein or glycopeptide hydrolysates. Figure 2 shows the separation of two PTC-amino acid standards, the shifts in retention times and conditions used for both methods. Figure 2A shows the separation of 200 picomoles of PTC-amino acid standards using our standard separation protocol. We use these conditions regularly for all routine analysis of protein and peptide hydrolysates. Figure 2B shows the separation of 27 amino acids at the 200 picomole level using our separation protocol for physiological samples. All amino acids separate well under these conditions and are eluted from the column in less than 16 minutes. The observed shifts in retention times are graphically displayed in Figure 3. The use of ultrapure chemicals, thorough cleaning of the analyzer slides and minimization of contaminants in the vicinity of the instrument enable detection at the 50 picomole level. Proper sample handling is critical. The major difference in separation conditions between the standard and the physiological method is the addition of an extra step at 6 minutes into the gradient.
This step decreases the steepness of the slope of the gradient development, thus allowing for the separation of citrulline, taurine and arginine. The separation of hydroxyproline from glutamic acid is achieved by increasing the pH of solvent A from 5.25 to 5.55. This increase in pH is also beneficial for the separation of ornithine from phenylalanine. The lower temperature of the physiological method resolves proline from phenylthiourea (PTU). An additional benefit of the physiological conditions is the baseline separation of valine and methionine. γ-Amino butyric acid elutes after PTU under the standard protocol. Phosphoserine can be detected using either method, although the physiological conditions result in sharper peaks. Phosphothreonine will be resolved by lowering the pH of solvent A, but hydroxyproline will then be lost by coelution with glutamic acid. In general, pH has the greatest impact on amino acid separation. Gradient and molarity, while important for individual amino acids, have less of an effect on the whole elution profile. As the column ages, it becomes necessary to increase the molarity of solvent A to continue to separate arginine and threonine. The addition of 3 M sodium acetate at pH 5.5 in 5 ml increments, as needed, will raise the molarity a sufficient amount. The versatility of our methods is shown in Figures 4, 5 and 6. Figure 4 shows standard analysis conditions used for hydrolysates. A standard, a protein and a peptide sample are illustrated. Figures 5 and 6 contain chromatograms using the physiological method. Free amino acids found in human serum (5B), human urine (5C), rat brain tissue (6A), ant hemolymph (6B), and bone collagen hydrolysates (6C) are shown. Note the large amount of glycine and taurine in rat brain tissue and the predominant glycine peak in bone collagen hydrolysate. To minimize cross-contamination, we routinely run cleaning cycles, which contain a hydrolysis cycle followed by a derivatizing cycle, after each analysis.
A 30 µl aliquot of 1 mg/ml K4EDTA in ultrapure water is spotted on to the analyzer frits just prior to running the derivatizer cycle.
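The effect of the extra gradient step can be made concrete by comparing segment slopes for the two programs. The time and %B pairs below are taken from the gradient rows of Tables I and II of this chapter; the function name is illustrative.

```python
# (time_min, %B) pairs from the gradient rows of Tables I and II.
STANDARD = [(0, 4), (4, 14), (10, 31), (20, 55), (25, 100), (30, 100), (31, 4)]
PHYSIOLOGICAL = [(0, 4), (4, 11), (6, 13), (10, 32), (20, 52), (25, 100), (30, 100), (31, 4)]

def segment_slopes(program):
    """Return {(t_start, t_end): slope in %B per minute} for consecutive gradient points."""
    return {
        (t0, t1): (b1 - b0) / (t1 - t0)
        for (t0, b0), (t1, b1) in zip(program, program[1:])
    }

for name, program in (("standard", STANDARD), ("physiological", PHYSIOLOGICAL)):
    print(name)
    for (t0, t1), slope in segment_slopes(program).items():
        print(f"  {t0:>2}-{t1:<2} min: {slope:+.2f} %B/min")
```

Between 4 and 6 min the physiological program rises at 1 %B/min, compared with roughly 2.8 %B/min over 4 to 10 min in the standard program, which matches the shallower slope described for the early part of the separation; the ramp then steepens between 6 and 10 min.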
IV. Conclusions
The two analyzer protocols described allow us to switch from standard settings to physiological settings within a few hours using the same column. The physiological separation method enables us to reproducibly analyze samples which contain nutritionally important amino acids, including taurine, in serum and organs, e.g., liver, heart, and brain. Taurine is a major free intracellular amino acid in animal tissue. Due to taurine's roles as a conjugator of bile acids and as a protector of cell membranes, it has become the focus of study for many investigators (2). Proper care needs to be taken to minimize cross-contamination. We therefore recommend routinely running cleaning cycles which contain a hydrolysis cycle followed by a derivatizing cycle after each analysis. In conclusion, we achieved excellent resolution of up to 30 amino acids, including most of the major plasma amino acids, within 16 minutes. The method has good reproducibility of both retention times and peak areas, allowing us to routinely analyze physiological samples.
References
1. Cohen, St., Tarvin, Th. and Bidlingmeyer, B. (1984). Analysis of amino acids using precolumn derivatization with phenylisothiocyanate. American Laboratory Aug. 1984.
2. Gaul, G. E. (1989). Pediatrics 83, 433-442.
3. Harihara, M., Naga, S. and VanNoord, T. (1993). J. Chromatography 621, 15-22.
4. Janssen, P., van Nispen, J., Melgers, P., van den Bogaart, H., Hamelinck, R. and Goverde, B. (1986). Chromatographia 22, 351-357.
Quantitation and Identification of Proteins by Amino Acid Analysis: ABRF-96AAA Collaborative Trial
K.M. Schegg¹, N.D. Denslow², T.T. Andersen³, Y. Bao⁴, S.A. Cohen⁵, A.M. Mahrenholz⁶, and K. Mann⁷
1. Dept. Biochemistry, Univ. Nevada, Reno NV 89557
2. Dept. Biochem. and Molec. Biol., Univ. Florida, Gainesville, FL 32610
3. Dept. Biochem. and Molec. Biol., Albany Medical College, Albany, NY 12208
4. Dept. Microbiology, Univ. Virginia Medical School, Charlottesville, VA 22908
5. Waters Corp., Milford, MA 01757
6. Dept. Biochemistry, Purdue Univ., West Lafayette, IN 47907
7. Max-Planck-Inst. Biochemie, 82152 Martinsried, Germany

I. Introduction
Amino acid analysis (AAA) has, for a number of years, been a valuable tool for identifying the amino acid composition of proteins and for accurately determining protein concentration. The Amino Acid Analysis Committee of the Association of Biomolecular Resource Facilities (ABRF) distributes a yearly test sample to member facilities and publishes the results (1-9), allowing participants to compare their performance with that of other laboratories. Each year, the study is designed to address particular challenges associated with AAA. This year's sample addressed two challenges: (1) to test how accurately laboratories are able to quantitate proteins using AAA, and (2) to assess the ability to use composition data to identify unknown proteins. Recently, spectacular success has been achieved using amino acid composition data submitted to computerized search programs linked to protein databases to identify proteins recovered from two-dimensional gel blots (10, 11). Additional information, such as species, molecular mass and pI, can also be submitted to some search programs to improve the probability of correct identification. In addition to promoting this new technique, our goal was to determine the quality of data required for successful identifications. The collaborative nature of this AAA study provided a unique opportunity for such an assessment. The data reported here, which were submitted by 71 facilities, reveal that most sites are capable of utilizing AAA to accurately quantitate protein concentrations and are able to identify a protein solely on the basis of the amino acid composition using Internet search programs such as ExPASy and Propsearch.
Abbreviations used: AAA, amino acid analysis; ABRF, Association of Biomolecular Resource Facilities; AQC, 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate; FMOC, N-(9-fluorenylmethoxycarbonyl); OPA, o-phthaldialdehyde; PITC, phenylisothiocyanate; tpis, triosephosphate isomerase.
TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press All rights of reproduction in any form reserved.
II. Materials and Methods
A. Sample Preparation and Analysis
Rabbit triosephosphate isomerase (tpis, Sigma) was dissolved in water to 0.1 mg/ml and 50 µl aliquots (nominally 5 µg) were distributed into 1.5 ml microfuge tubes and lyophilized. The tubes were mailed to member facilities with instructions that included the suggestion that the sample should be dissolved in water or 0.1% trifluoroacetic acid-20% acetonitrile and then analyzed by each laboratory's standard method. If the participant wished to gain additional information about the unknown, molecular mass could be determined by mass spectrometry or SDS electrophoresis stained with silver and pI could be determined using IEF gels. Facilities were instructed that, in order to aid in identification, a standard protein could be analyzed in parallel with the unknown. The amino acid composition of the unknown and known proteins, along with any additional information, was to be submitted to ExPASy (http://expasy.hcuge.ch/ch2d/aacompi.html) or Propsearch (http://www.embl-heidelberg.de/aaa.html), which were accessible via the Internet, or to any other search program.
B. Calculations
Each core laboratory was asked to send their results to an independent collaborator who entered the data into an Excel spreadsheet and removed any identifiers to keep the data anonymous. The labs were requested to report amino acid composition (total pmol of each amino acid) for the unknown sample and for a known sample of their choice to be used as a calibrant for the search programs. The labs were also asked to report the total µg of unknown protein in the tube and list the top five proteins identified by their search program(s). Data reduction was as described (4,9). Briefly, total pmol tpis were estimated on the basis of data from individual amino acids by dividing the total pmol of each amino acid in the sample by the known number of residues of that amino acid per molecule of tpis. These data were averaged. A corrected average was then calculated by excluding individual yield values differing by more than ±15% from the average obtained above. Composition (number of each amino acid per molecule) was obtained by dividing the pmol of each amino acid by the corrected pmol tpis for that analysis. Accuracy (internal error) of each residue was calculated as:
\[ \text{Error} = 100 \times \frac{\lvert \text{experimental composition value} - \text{true value} \rvert}{\text{true value}} \tag{1} \]
\[ \text{Average Error per analysis} = \frac{\sum \text{error of 16 amino acids}}{16} \tag{2} \]
The error and yield values from each participant were used to obtain overall averages across participants. A constellation of 16 amino acids, which excluded Cys and Trp, was used for all calculations.
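The data reduction described above can be sketched in a few lines of Python. This is a minimal illustration and not the committee's spreadsheet: the function name `reduce_analysis`, the reading of the ±15% exclusion as relative to the first average, and the input values are all assumptions made for the example. The residue counts are the theoretical values for rabbit tpis listed in Table II.

```python
# Residues per molecule of rabbit tpis for the 16-amino-acid constellation
# (theoretical values from Table II; Cys and Trp are excluded).
TPIS_RESIDUES = {
    "Ala": 27, "Arg": 8, "Asp": 21, "Glu": 27, "Gly": 24, "His": 4,
    "Ile": 15, "Leu": 15, "Lys": 21, "Met": 2, "Phe": 8, "Pro": 10,
    "Ser": 12, "Thr": 15, "Tyr": 4, "Val": 25,
}


def reduce_analysis(pmol, residues=TPIS_RESIDUES, cutoff=0.15):
    """Return (corrected pmol protein, composition, % error per residue, average % error).

    Assumption: "differing by more than 15%" is read as a relative
    deviation from the first (uncorrected) average.
    """
    # One protein estimate per amino acid: pmol found / residues per molecule.
    estimates = {aa: pmol[aa] / n for aa, n in residues.items()}
    mean = sum(estimates.values()) / len(estimates)

    # Corrected average: drop estimates more than `cutoff` away from the mean.
    kept = [v for v in estimates.values() if abs(v - mean) <= cutoff * mean]
    if not kept:
        raise ValueError("no estimate lies within the cutoff of the average")
    corrected = sum(kept) / len(kept)

    # Composition: residues per molecule implied by the corrected protein amount.
    composition = {aa: pmol[aa] / corrected for aa in residues}

    # Equation (1) for each residue, then equation (2) over the 16 residues.
    errors = {aa: 100.0 * abs(composition[aa] - n) / n for aa, n in residues.items()}
    average_error = sum(errors.values()) / len(errors)
    return corrected, composition, errors, average_error


# Hypothetical input: exactly 10 pmol of protein, with half of the Met lost.
example_pmol = {aa: 10.0 * n for aa, n in TPIS_RESIDUES.items()}
example_pmol["Met"] = 10.0

corrected, composition, errors, average_error = reduce_analysis(example_pmol)
print(corrected, composition["Met"], errors["Met"], average_error)
```

In this made-up case the low Met value is excluded from the corrected average, so the protein estimate stays at 10 pmol while Met alone contributes a 50% error to the per-analysis average.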
III. Results and Discussion
A. Participation
Seventy-one facilities returned data to the Amino Acid Analysis Committee. One facility returned two sets of data for a total of seventy-two data sets. This year, 40 of the 72 analyses (56%) were performed using pre-column, as opposed to post-column, techniques. This is a slight increase in the percentage of pre-column users over the previous 2 years, when 52% of respondents utilized pre-column methods (8,9). As in previous years, the most popular methods remain the pre-column PITC (40%) and the post-column ninhydrin (39%) methods.
B. Error and Yield
The average error for each data set is shown in Figure 1. The more accurate half of the analyses is shown on the top graph. The overall error for all analyses was 11.9 ± 9.8% with a range from 4.0 to 58.9%. The accuracy in this year's study was far better than that of last year's (1995) AAA study, which asked members to analyze a protein spotted onto a PVDF membrane (overall average error = 21.4%) (9). In the 1994 study, which involved analysis of a soluble protein similar to that in the current study, the overall average error was 10.9 ± 3.7% (8). It should be pointed out, however, that in 1994 the committee did not include results from laboratories with >30% average error. If we similarly exclude the 5 worst sets of data (average error 33 to 59%), the overall average error for 1996 improves to 9.6 ± 4.1%, which appears to be a slight improvement over 1994. The contrast between this year's study and that of the previous year reemphasizes the difficulty with accurate analyses of proteins on PVDF.
Total yield of protein determined by each laboratory is shown in Figure 2. We expected the average yield to be about 5 µg, assuming the weight recorded on the Sigma bottle was accurate and assuming small pipetting errors on our part in preparing the samples. Quantitation values, however, ranged from 0.9 to 10.6 µg, with an average of 6.8 ± 1.7 µg protein. The vast majority of results fell between 6 and 9 µg protein, and many values clustered around the median value of 7 µg.
Table I presents both average error and yield data in terms of the methodology used. The accuracy of analyses performed using pre-column and post-column methods is very similar, despite the fact that pre-column users tend to analyze significantly less of the total protein than post-column users. In this year's study the top site (smallest error) was a pre-column PITC site. The second best site used post-column OPA methodology (see Table II).
The average total protein yield for analyses using pre-column methods is lower than the average for post-column users, but the difference is not statistically significant. Both of the top two sites had higher than average yields, 7.9 and 8.4 µg, respectively.

Table I. Correlation of Method with Average Error and Yield
Method        N    Average Error (%)   Average Error (%)   Yield (µg)      Yield (µg)
                   Average ± SD        Range               Average ± SD    Range
Overall       72   11.9 ± 9.8          4.0 - 58.9          6.8 ± 1.7       0.9 - 10.6
Pre-Column    40   11.8 ± 10.0         4.0 - 58.9          6.6 ± 1.8       0.9 - 9.4
PITC          29   12.6 ± 11.7         4.0 - 58.9          6.6 ± 1.6       2.9 - 9.0
AQC           7    8.9 ± 2.0           5.5 - 11.2          7.1 ± 1.5       5.4 - 9.3
FMOC/OPA      4    10.9 ± 1.9          8.6 - 13.0          5.5 ± 3.5       0.9 - 9.4
Post-Column   32   12.1 ± 9.6          4.0 - 42.6          7.4 ± 1.5       4.7 - 10.6
Ninhydrin     28   11.5 ± 8.7          4.0 - 42.6          7.4 ± 1.4       4.7 - 10.2
OPA           3    15.1 ± 18.8         4.5 - 36.8          6.9 ± 1.3       6.0 - 8.4
Fluram        1    21.1                21.1                10.6            10.6
[Figure 1. Average % error for each data set, plotted by site; the more accurate half of the analyses is shown in the top graph.]
[Figure 2. Total yield of protein (µg) determined by each laboratory, plotted by site.]
Table II. Best Analysis of ABRF-96AAA by Pre- and Post-column Methods
Site/Technique               6899/PITC   2601/Post-OPA   Overall
Amino Acid          Theory   Rank = 1    Rank = 2        Average ± S.D.
Ala                 27       27.55       27.25           27.13 ± 3.65
Arg                 8        8.45        8.31            8.21 ± 1.68
Asp                 21       20.94       21.41           20.96 ± 2.00
Glu                 27       28.58       26.61           27.89 ± 2.47
Gly                 24       25.16       24.78           26.35 ± 3.42
His                 4        3.96        4.06            4.16 ± 1.41
Ile                 15       13.6        13.25           12.74 ± 1.42
Leu                 15       15.42       14.64           15.43 ± 1.00
Lys                 21       21.9        21.44           19.81 ± 3.72
Met                 2        1.96        2.17            1.97 ± 1.22
Phe                 8        8.32        8.2             7.88 ± 1.25
Pro                 10       9.56        9.96            9.80 ± 3.00
Ser                 12       10.82       12.41           11.99 ± 2.20
Thr                 15       14.55       15.54           14.72 ± 2.39
Tyr                 4        4.1         4.14            3.97 ± 0.92
Val                 25       24.5        21.68           22.24 ± 2.55
Total Yield (µg)             7.9         8.4             6.8 ± 1.7
Average % Error              3.98        4.03            11.9
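As a worked check, equations (1) and (2) can be applied directly to the two best analyses in Table II. The sketch below is illustrative only; the helper name `average_percent_error` is an assumption, and the compositions are copied from the table as printed (two decimal places at most).

```python
# Theoretical composition of rabbit tpis and the two best analyses (Table II).
AMINO_ACIDS = ["Ala", "Arg", "Asp", "Glu", "Gly", "His", "Ile", "Leu",
               "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Tyr", "Val"]
THEORY = dict(zip(AMINO_ACIDS,
                  [27, 8, 21, 27, 24, 4, 15, 15, 21, 2, 8, 10, 12, 15, 4, 25]))
SITE_6899_PITC = dict(zip(AMINO_ACIDS,
                          [27.55, 8.45, 20.94, 28.58, 25.16, 3.96, 13.6, 15.42,
                           21.9, 1.96, 8.32, 9.56, 10.82, 14.55, 4.1, 24.5]))
SITE_2601_POST_OPA = dict(zip(AMINO_ACIDS,
                              [27.25, 8.31, 21.41, 26.61, 24.78, 4.06, 13.25, 14.64,
                               21.44, 2.17, 8.2, 9.96, 12.41, 15.54, 4.14, 21.68]))


def average_percent_error(found, theory=THEORY):
    """Equation (1) for each residue, averaged over all residues (equation 2)."""
    errors = [100.0 * abs(found[aa] - n) / n for aa, n in theory.items()]
    return sum(errors) / len(errors)


print(f"6899/PITC:     {average_percent_error(SITE_6899_PITC):.2f}%")
print(f"2601/Post-OPA: {average_percent_error(SITE_2601_POST_OPA):.2f}%")
```

Recomputed this way, the averages come out at roughly 3.99% and 4.02%, close to the 3.98% and 4.03% listed in the table; differences of this size would be expected from rounding of the tabulated compositions.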
Table III. Correlations of Accuracy (% Error) of Determination for Individual Amino Acids with Analysis Method
Methods: Overall (n = 72), Pre-Column (n = 40), PITC (n = 29), AQC (n = 7), FMOC/OPA (n = 4), Post-Column (n = 32), Ninhydrin (n = 28), Post-OPA (n = 3), Fluram (n = 1)
Amino acids: Ala Arg Asp Glu Gly His Ile Leu Lys Met Phe Pro Ser Thr Tyr Val
Values (% error): 4.2 3.2 4.4 4.7 3.6 15 10 6.7 8.3 12.4 9.7 4 5.2 14.8 15.3 10 11.3 8.5 1.4 4.4 4.1 3.7 8.5 7.2 8.7 5.6 6.9 4.4 3.2 7 2.8 7.3 6.9 7.3 6.6 6.3 8.2 14.1 8.5 8 11.3 5 12.4 12.5 10.5 15.4 13.2 18.4 11.8 18.9 8 98.6 23.3 20.9 21.4 21.7 12.5 12.2 16.3 15.5 14.9 15.8 15.5 2.4 2.2 5.7 1.7 5.4 6 6 5.3 5.3 20.7 9.3 7.7 22.5 12.9 5.3 9.6 9.9 9.5 36.2 39.4 18.8 0.7 21.9 32.2 28.6 34.5 28.9 9.4 12.7 5.3 2.2 9.3 9.1 7.1 7.7 6.5 12 100.6 19.7 16.9 34.9 16 9.1 15.9 15.1 21.9 19.3 7.4 11.6 11.3 9.9 10.2 10.6 10.7 8.4 23.8 12.4 4.4 6.8 18.9 2.9 9.3 10 30.2 16.4 15.1 14.3 14 12.1 12.9 11 7.5 11.4 19.5 14.8 12 12.7 9.8 12.3 12.1 11.9
Average (methods in the order listed): 12 11.8 12.6 8.9 10.9 12.2 11.5 16.4 21.1
Table III shows the error for individual amino acids. In general, and as expected, the highest errors were correlated with amino acids that appeared infrequently in the protein (e.g., Met (2), His (4), and Tyr (4)). However, the error for Met is remarkably high (32%) and suggests destruction of Met during hydrolysis. The error for isoleucine was also quite high (15.5%) despite its relative abundance (15 residues) in the protein. This may again be a hydrolysis-related problem, since isoleucine bonds are difficult to hydrolyze, especially in certain sequence contexts such as Val-Ile bonds. Proline, with 10 residues, had an error of 16%. Analysis accuracy of the individual amino acids was, in general, fairly similar if one compares pre-column with post-column techniques. The errors for the common background amino acids alanine and glycine were slightly higher for pre-column than for post-column techniques (8.3 vs. 4.7% for Ala and 12.4 vs. 8.2% for Gly), possibly because background amino acids become an amplified problem when lower quantities of amino acids are analyzed.
C. Identification of the Unknown Protein Using Databanks
Ninety-three percent of laboratories submitted their data to either (or both) the ExPASy or Propsearch Internet sites. One laboratory used its own identification software and 4 did not attempt identification. Forty-one facilities reported the analysis of a calibration protein of their choice that was analyzed along with the experimental sample. Both the methods of analysis and search programs contain many variables, and it is necessary that all data be treated similarly to allow comparisons. To allow for comprehensive comparisons, the AAA committee recalculated the mol % and error data as described in Materials and Methods and submitted all data sets to both search programs. For the ExPASy site, data were submitted with and without the calibration protein to determine the benefit of a calibrant. The conclusions drawn below are from the committee's resubmissions. Both the Propsearch and ExPASy programs assign scores describing the fit of the submitted AAA data to the actual composition of a known protein. In the Propsearch algorithm, a distance score below 1.5 is considered good, one higher than 2.5 indicates unreliability, and those in between are in the "twilight zone" (11). Guidelines provided by the authors of the ExPASy program suggest that scores below 30 are considered good, but, in addition, the second ranked protein should score about a factor of 2 higher (10). However, the authors add the caveat that this depends on the presence of MW and pI data. When data were submitted to Propsearch, the program correctly selected tpis_rabit (i.e., tpis from rabbit) as the number one ranked protein for 44.4% of the samples. In the case of 45.8% of the samples, tpis from another species was identified as the number one choice, but in these cases tpis from rabbit, as well as from additional species, usually appeared among the other top 5 choices.
This is understandable in light of the close similarity in amino acid composition among the tpis molecules from various species. Thus about 90% of the sites were able to correctly identify the protein family (tpis). It is assumed that under normal circumstances an investigator would know the species from which the sample originates. For 9.7% of the facilities, tpis (of any origin) was not in the top 5 choices, and in many cases, not in the top 20 choices. In these cases, a variety of different proteins were chosen by the search programs, but the distance scores were greater than the cutoff point for a good match, and the scores were all similarly high. In general, as shown in Figure 3, incorrectly identified samples had high average error. Over all samples, the Propsearch distance score correlated fairly well (r = 0.883) with the average error of the analysis.
Analysis using the ExPASy site was more complex because data could be entered with or without accompanying calibration protein data. When data were entered without calibration proteins, only 30.6% of the entries were identified as tpis_rabit, 52.7% were identified as tpis from other species, and 16.7% were not identified as tpis of any species (Figure 3). The distance score assigned to rabbit tpis by ExPASy did not correlate as well with the average error (r = 0.475). Inclusion of calibration protein data improved the ranking of tpis_rabit in 14 cases (Figure 4a), caused no change in 16 cases, and decreased the ranking of tpis_rabit in 11 cases (Figure 4b). The ability of a calibration protein to aid in the correct identification of the unknown protein depended on the accurate analysis of the calibration protein. As illustrated in Table IV, those calibration proteins that either helped or made no change in the correct identification of the unknown had average errors for the analysis of the calibration protein itself of 8.3 ± 3.2% and 8.0 ± 3.0%, respectively. By contrast, calibration proteins that worsened the identification had average error values of 19.9 ± 13.9%. In some cases, laboratories misidentified the standard protein to be used as a calibrant; in others, results were poor for all reported data. Interestingly, the former problem, which can only be attributed to human error, seems to occur regularly in ABRF studies (see the 1996 report of the ABRF Peptide Synthesis committee), although at a low frequency.

Figure 3. Correlation of Scores on Propsearch or ExPASy with Average Error of Analysis. The score given by Propsearch (graph on left) or ExPASy (graph on right) for tpis_rabit was plotted against the average error for that analysis of the protein (x-axis: % average error). Calculated mol % data for the query protein were submitted to search programs without calibration standard (if any). Different symbols mark the protein ranked as number 1: rabbit tpis, tpis from species other than rabbit, or other protein.
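The score guidelines quoted above for the two search programs can be written down as simple rules. The following sketch only encodes the thresholds cited in the text (10, 11); the function names are hypothetical, and treating "about a factor of 2" as a hard cutoff of 2.0 is an assumption made for illustration, not part of either program.

```python
def propsearch_band(distance):
    """Classify a Propsearch distance score using the cited guideline (11)."""
    if distance < 1.5:
        return "good"
    if distance > 2.5:
        return "unreliable"
    return "twilight zone"


def expasy_looks_reliable(top_score, second_score, max_score=30.0, min_ratio=2.0):
    """Apply the cited ExPASy guideline (10): a top score below 30 and a
    second-ranked score about twice as high.

    Assumption: "about a factor of 2" is treated as >= 2.0 here; the authors
    also caution that the guideline depends on MW and pI data being supplied.
    """
    return top_score < max_score and second_score >= min_ratio * top_score


# Hypothetical scores, for illustration only.
print(propsearch_band(1.2))
print(propsearch_band(2.0))
print(expasy_looks_reliable(12.0, 31.0))
print(expasy_looks_reliable(12.0, 15.0))
```

Rules of this kind only flag whether a match is worth trusting; as the results above show, a top-ranked tpis from the wrong species can still score well, so knowledge of the source species remains necessary.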
D. Molecular Mass Determination
The Propsearch and ExPASy programs both allow entry of molecular mass to aid in identification. As part of this year's study, participants were asked to determine the molecular mass of the sample using mass spectrometry (MS) or other methods such as SDS electrophoresis or gel filtration. Sixty-five percent of the facilities did use MS to determine the molecular mass of the sample; the average mass obtained was 26,804 ± 631 (one outlying value of 6000 was excluded from this calculation). The molecular mass for rabbit tpis calculated from the sequence is 26,625. Inclusion of molecular mass in the ExPASy program did not seem to aid in correct identification of the protein. However, in the Propsearch program, molecular mass can be weighted, thus making this program more sensitive to inclusion of mass; an error of 5-10% in mass was determined to adversely affect the score given by Propsearch.
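To put the reported masses in the context of the 5-10% error that affected Propsearch scores, the relative deviation of the average MS mass from the sequence mass can be worked out as follows. This is simple arithmetic on the two values given above; the function name is an assumption.

```python
def percent_mass_error(measured, calculated):
    """Signed deviation of a measured mass from the calculated mass, in percent."""
    return 100.0 * (measured - calculated) / calculated


CALCULATED_TPIS_MASS = 26625.0   # from the sequence, as stated in the text
MEAN_MS_MASS = 26804.0           # average of the masses reported by MS

deviation = percent_mass_error(MEAN_MS_MASS, CALCULATED_TPIS_MASS)
print(f"{deviation:.2f}%")       # about 0.67%
```

The average MS value therefore lies well under 1% from the calculated mass, far below the 5-10% error range found to degrade the Propsearch score.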
Figure 4. ExPASy Scores for tpis_rabit When Data Were Submitted without or with Data from a Calibration Protein. Amino acid analysis data were submitted to the ExPASy site using the 16 residue Constellation 2 with (empty bars), and without (filled bars), known protein (calibrant) data furnished by the participants. The chart shows rank assigned to tpis_rabit (SwissProt data base) for selected sites (x-axis: site identification): (a) Sites (n = 14) where accompanying data improved rank of tpis_rabit. (b) Sites (n = 11) where including calibration data degraded the rank obtained for the query protein. For 16 sites (not shown), there was no change in rank of rabbit tpis with the inclusion of calibration data (see Table IV). Rank values above 10 are truncated.

Table IV. Correlation of Calibration Results and Analysis Error
Effect      N    Average Sample Error   Average Calibration Error
Improved    14   9.1 ± 3.5              8.3 ± 3.2
Unchanged   16   6.8 ± 2.5              8.0 ± 3.0
Worse       11   11.5 ± 8.1             19.9 ± 13.9

IV. Summary and Conclusions
Amino acid analysis is often touted as the most accurate method for determination of protein concentration. The data from this 1996 ABRF AAA study indicate that the vast majority of member facilities that participated in this study quantitate soluble protein well. The most striking aspect of this study, however, was the ability of the laboratories to identify the protein solely on its amino acid composition. The data from approximately 90% of the participants were sufficient for correct identification, if one knew the species of the protein's origin. Currently, identification of unknown proteins from AAA data is not frequently used for simple soluble proteins, such as triosephosphate isomerase. The technique is more commonly used to identify proteins that have been separated by two dimensional analysis on isoelectric focusing and SDS electrophoresis and then transferred to PVDF membranes. Such samples are usually present in low
quantities. The 1995 AAA study indicated that AAA from PVDF can be performed successfully, but with average errors twice that seen with soluble proteins. The question remains whether identification of proteins can be achieved by the majority of laboratories when analyses are performed on very small samples blotted onto PVDF. This question will be addressed in next year's study.
Acknowledgments
We would like to thank Dr. Ümit Yüksel (CryoLife, Inc.) for tabulating the data and maintaining the anonymity of the participants. This work was supported in part by NSF grant DE-FG02-95ER61839, to J.W. Crabb, on behalf of the ABRF.
References
1. Niece, R.L., Williams, K.R., Wadsworth, C.L., Elliot, J., Stone, K.L., McMurray, W.J., Fowler, A., Atherton, D., Kutny, R., and Smith, A. (1989) in "Techniques in Protein Chemistry" (T.E. Hugli, ed.) Academic Press, San Diego, pp 89-101.
2. Crabb, J.W., Ericsson, L.H., Atherton, D., Smith, A.J., and Kutny, R. (1990) in "Current Research Protein Chemistry" (J.J. Villafranca, ed.) Academic Press, San Diego, pp 49-61.
3. Ericsson, L.H., Atherton, D., Kutny, R., Smith, A.J., and Crabb, J.W. (1991) in "Methods of Protein Sequence Analysis 1990" (H. Jörnvall, J.-O. Höög, and A.M. Gustavsson, eds.) Birkhäuser Verlag, Basel, pp 143-150.
4. Tarr, G.E., Paxton, R.J., Pan, Y.-C.E., and Paxton, R.J. (1991) in "Techniques in Protein Chemistry II" Academic Press, San Diego, pp 139-150.
5. Strydom, D.J., Tarr, G.E., Pan, Y.-C.E., and Paxton, R.J. (1992) in "Techniques in Protein Chemistry III" (R.H. Angeletti, ed.) Academic Press, San Diego, pp 261-274.
6. Strydom, D.J., Andersen, T.T., Apostol, I., Fox, J.W., and Crabb, J.W. (1993) in "Techniques in Protein Chemistry IV" (R.H. Angeletti, ed.) Academic Press, San Diego, pp 279-288.
7. Yüksel, K.Ü., Andersen, T.T., Apostol, I., Fox, J.W., Crabb, J.W., Paxton, R.J., and Strydom, D.J. (1994) in "Techniques in Protein Chemistry V" (J.W. Crabb, ed.) Academic Press, San Diego, pp 231-240.
8. Yüksel, K.Ü., Andersen, T.T., Apostol, I., Fox, J.W., Paxton, R.J., and Strydom, D.J. (1995) in "Techniques in Protein Chemistry VI" (J.W. Crabb, ed.) Academic Press, San Diego, pp 185-192.
9. Mahrenholz, A.M., Denslow, N.D., Andersen, T.T., Schegg, K.M., Mann, K., Cohen, S.A., Fox, J.W., and Yüksel, K.Ü. (1996) in "Techniques in Protein Chemistry VII" (D. Marshak, ed.) Academic Press, San Diego, pp 323-330.
10. Wilkins, M.R., Pasquali, C., Appel, R.D., Ou, K., Golaz, O., Sanchez, J.-C., Yan, J.X., Gooley, A.A., Hughes, G., Humphrey-Smith, J., Williams, K.L., and Hochstrasser, D.F. (1996) Biotechnology 14, 61-65.
11. Hobohm, U., Houthaeve, T., and Sander, C. (1994) Anal. Biochem. 222, 202-209.
SECTION III Chemical Modification
Nonaqueous Chemical Modification of Lyophilized Proteins
Harvey Kaplan and Alpay Taralp
Department of Chemistry, University of Ottawa, Ottawa, Ontario K1N 6N5

I. INTRODUCTION
Chemical modification has been widely used to investigate structure-function relationships in native proteins. Comprehensive descriptions of techniques, reagents and strategies for the modification of proteins in aqueous environments are available in general reviews of the field (1-6). The aqueous environment restricts the choice and effectiveness of chemical modifying reagents because many are insoluble in water, react rapidly with water or form water-unstable derivatives with protein functional groups. Indeed, nonaqueous chemistry has been employed in amino and carboxyl terminal sequencing methodologies (7,8) and in the derivatization of peptides for applications in mass spectrometry (9,10). However, these applications did not focus on the native structure of the protein and the procedures were devised for denatured or fragmented proteins. Nevertheless, they show the potential advantages of a nonaqueous environment for the modification of proteins.
It is now well established that the catalytic properties of a wide variety of enzymes remain intact in organic solvents (11-13). These findings imply that proteins may also retain their native structures when lyophilized and dispersed in organic solvents. Evidence has been obtained that crystallized proteins have essentially the same structure in water and organic solvent (14,15). In the lyophilized state, proteins are also in a nonaqueous environment and it is expected that their physico-chemical properties will differ from those in solution, as the dynamic conformational equilibria that exist in solution will be absent. Some physico-chemical studies indicate that the structure of the lyophilized state is very similar to that in solution (16-18), while others indicate that there is some limited but reversible conformational change (19-24). There are likely to be differences among proteins in this regard, but the results of extensive studies with lyophilized enzymes in organic solvents provide strong evidence that most proteins in the lyophilized state retain the essential elements of their native structure (11-13). It is therefore expected that the reactivity of functional groups in lyophilized proteins will reflect their properties in solution and provide information on the solution structure. The present study reports that modification of lyophilized proteins in a nonaqueous environment has significant advantages over aqueous procedures.

II. MATERIALS AND METHODS
A. Proteins, reagents and amino acid derivatives
Proteins. Bovine insulin, α-chymotrypsin and ribonuclease were purchased from Sigma Chemical Company. Inactivated diisopropylphosphoryl (DIP)-α-chymotrypsin was prepared by incubation with diisopropylfluorophosphate (DFP) (25).
Chemicals and Solvents. H2N-Met(SMe)2 and [¹³C]iodomethane were from Sigma Chemical Company. [Acetic-1-¹⁴C]anhydride (9.20 mCi/mmol) was from NEN Research Products and [³H]acetic anhydride (6.94 Ci/mmol) was from Amersham Corp. All other chemicals, reagents and solvents were high purity preparations obtained from commercial sources.
¹³C-Methylated amino acids were prepared as follows: N-acetyl-L-tyrosine-amide, poly lysine HBr, Gly-Leu, Ile-NH2, Ala-Ala, Phe-Gly-Gly, histidine amide, and cystine dimethylester dihydrochloride were methylated with iodomethane to serve as standards for the assignment of peak resonances in methylated proteins. N-α-Amino acids (10 mg) were typically dissolved in 200 mM, pH 10 sodium metaborate buffer (1 ml), and methylated directly by the addition of a 1:1 v/v solution (20 µl) of [¹³C]iodomethane in acetonitrile, with rapid mixing of the sealed biphasic mixture at 37°C for 24 h. Cystine dimethyl ester, in particular, was first converted to the diamide at pH 9.5 using ammonia. Histidine amide was acetylated prior to reaction with iodomethane. Side-chain methylated amino acids were prepared from N-α-blocked starting materials in the same manner and, in the case of methylated poly lysine, this was followed by acid hydrolysis of the peptide bonds.
Aqueous methylation of proteins at pH 7.5 and pH 10. Insulin (20 mg), DIP-α-chymotrypsin (20 mg), α-chymotrypsin (20 mg) and ribonuclease (20 mg) were placed in screw-capped vials and dissolved in 200 mM sodium phosphate buffer (10 ml), pH 7.5, or 200 mM sodium metaborate buffer (10 ml), pH 10. A 1:1 (v/v) solution (250 µl) of [¹³C]iodomethane in acetonitrile was added, the vial was sealed tightly and the biphasic mixture shaken at 37°C for 24 h.

B. Modification of lyophilized proteins under nonaqueous conditions
Two experimental approaches for nonaqueous chemical modification of proteins lyophilized at specific pH values (LpH) were investigated. The first parallels the approach developed by Klibanov (27-29) for enzymatic reactions in organic solvents. A protein solution is adjusted to the desired pH value, lyophilized and dispersed in octane. The modifying reagent is added to the protein dispersion and the reaction mixture is stirred in a temperature-controlled oven. The modified protein can be isolated simply by filtration or centrifugation, washed with octane and residual organic solvent removed under vacuum. In a second approach, the reaction is carried out directly on the lyophilized protein. A glass reaction vessel with two compartments is employed. Protein solution is lyophilized in one compartment and then modifying reagent is added to the other compartment, which is immersed in liquid nitrogen. The reaction vessel is sealed under vacuum and placed in an oven. To terminate the reaction, the unreacted reagent is trapped out by placing the reagent compartment in liquid nitrogen and releasing the vacuum. The modified protein from either procedure is dissolved in an aqueous medium for analysis by NMR or other analytical procedures.
Methylation of proteins at LpH 7.5 and LpH 10 in octane. Proteins were lyophilized directly in the reaction vessels. Only in the case of insulin at LpH 7.5 was it necessary to lyophilize the protein from a large volume and transfer it to the reaction vessel. Insulin (20 mg) was lyophilized from a solution of 1 mM sodium phosphate buffer (40 ml), pH 7.5, or of 40 mM sodium metaborate buffer (1 ml), pH 10. Ribonuclease and α-chymotrypsin were lyophilized from a solution of 40 mM sodium phosphate buffer (1 ml), pH 7.5, or of 40 mM sodium metaborate buffer (1 ml), pH 10. Anhydrous octane (2 ml) was added to the protein and the medium was sonicated until the protein was finely dispersed, at which time [¹³C]iodomethane (100 µl) was added. To prevent the loss of iodomethane, the vessels were sealed. The protein dispersion was stirred 12 h in an oven at 75°C for the LpH 10 reactions and 24 h for the LpH 7.5 reactions. The tube was opened and the derivatized protein was centrifuged with two washes of octane and residual octane removed under vacuum.
In vacuo methylation of proteins at LpH 7.5 and LpH 10. The proteins were lyophilized as described above for the octane reaction in the two-compartment reaction vessel. [¹³C]Iodomethane (10 µl) was transferred into the reagent chamber, which was submersed in liquid nitrogen. The open end of the reaction vessel was fitted with a vacuum hose, the reaction vessel was sealed under vacuum and incubated at 75°C. To terminate the reaction, the reagent chamber was placed in liquid nitrogen, the vacuum seal was broken and the modified protein was removed.
In vacuo acetylation of α-chymotrypsin at LpH 9.0 with [³H]acetic anhydride. A solution of α-chymotrypsin (2.5 mg/ml) was adjusted to pH 9.0 with 1 N NaOH. Aliquots (1 ml) were lyophilized in the protein compartments of five reaction vessels. [³H]Acetic anhydride (10 µl, 1 mCi/mmol) was added to the reaction chambers of four reaction vessels. The fifth was used as a control to which no reagent was added. The reaction vessels were sealed under vacuum as described above and placed in an oven at 75°C. Reaction vessels were then removed at various time points and the reactions terminated and the protein isolated as described above.
Quantification of ³H incorporation into amino groups. The quantification procedure employed was that used in the competitive labeling technique (30) with [¹⁴C]acetic anhydride (25 mCi/mmol) used to prepare the ¹⁴C-labeled protein. Peptides containing the ³H/¹⁴C-acetylated α-amino groups and ε-amino groups were separated by high voltage paper electrophoresis (26). Aliquots of the ³H/¹⁴C-peptides were transferred into vials containing Aquasol-2 scintillation cocktail (5 ml) and the ³H/¹⁴C ratios were quantified on an LKB RackBeta liquid scintillation counter.
NMR Spectra. [¹³C]NMR analyses (1024 transients) were obtained using a Gemini 200 MHz spectrometer. Methylated protein samples were analyzed in an 8 M urea, 90% D2O solution of 100 mM sodium phosphate which gave a pH meter reading of 8. For the aqueous reactions, derivatized proteins were dialyzed against 10 mM HCl and lyophilized prior to the addition of 8 M urea. Acetonitrile (30 µl) was added to reference peak resonances.
223
Nonaqueous Chemical Modification of Lyophilized Proteins III. RESULTS AND DISCUSSION
The chemical modification of insuUn with [^^CJiodomethane was employed to compare the reaction of the lyophilized protein under nonaqueous conditions, in vacuo and in octane, with that of the solubilized protein under aqueous conditions. Reactions were carried out at pH 7.5 and 10 for the aqueous reaction and LpH 7.5 and 10 for the nonaqueous conditions. Peak resonances (Figure 1) corresponding to methylated derivatives of functional groups were assigned from the chemical shifts of methylated standard compounds (Table 1). The following similarities in the water, octane and in vacuo reactions were observed: a) the same functional groups are modified in the water, octane and in vacuo reactions, b) the same derivatives of the various functional groups are obtained, viz. amino groups are trimethylated to the quaternary state, histidine forms the
4S
40
.LL. 3S
30
25 I
to
ss
40
so
Jj 35
30
25 I
D 2 34
J
WXXJ^ 55
60
45
40
35
30
25 I
30
25 F
F 2
to
55
50
40
35
30
25 PPM 60
55
50
40
JJ 35
Figure 1. NMR Spectra of '^C-Methylated Insulin. A) Water, pH 7.5; B) Water, pH 10; C) In vacuo, LpH 7.5; D) In vacuo, LpH 10; E) Octane, LpH 7.5; F) Octane, LpH 10. Peak resonances correspond to: 1- Tyr(OMe); 2- MejN^-Gly; 3- Lys(8-^NMe3); 4- McaN^-Phe; 5- a-^NMe3(unidentified) 6-His(Im^Me2),
Harvey Kaplan and Alpay Taralp
dimethylimidazolium cation derivative and tyrosine forms the phenolic O-methyl derivative, and c) the phenolic hydroxyl of tyrosine does not react at pH 7.5 or LpH 7.5 but reacts at pH 10 and LpH 10. However, the octane and in vacuo reactions differed from the aqueous reaction in that the degree of methylation of the phenylalanine and glycine α-amino groups, and most significantly of the lysine ε-amino group, was considerably less at LpH 7.5 than at LpH 10. These differences are greater than is apparent in Figure 1 because the vertical scales used in Figures 1d and 1f are attenuated with respect to Figures 1c and 1e, in order that all the resonances be on scale. Insulin reacted under aqueous conditions (24 h, 37°C, pH 10) or under nonaqueous conditions in vacuo (24 h, 75°C, LpH 10) tested negative with Pauly's diazo reagent (31) in both cases; the nonaqueous sample was weakly ninhydrin positive and the aqueous sample ninhydrin negative, showing that the reactions proceeded to completion.

The aqueous reaction of DIP-α-chymotrypsin at pH 7.5 and pH 10 with [13C]iodomethane (Figures 2a and 2b) gave all the methylated derivatives observed with insulin. In addition, the dimethylsulfonium derivative of the methionine side-chain, which is not present in insulin, was observed. The nonaqueous reactions of α-chymotrypsin with [13C]iodomethane differ from the aqueous reaction (Figure 2b) in that no O-methyltyrosine is observed at LpH 10 (Figures 2d and 2f). Methylation of ribonuclease paralleled that of chymotrypsin: under aqueous conditions tyrosine was methylated at pH 10 but not at pH 7.5, and it was not methylated under nonaqueous conditions at LpH 10. α-Chymotrypsin has three amino termini, and three resonances are therefore expected in the chemical shift region of trimethylated α-amino groups, as observed in the nonaqueous reaction (Figure 2c, peaks 3, 4 and 5).
Peak 3 for the trimethylated cystine α-amino terminus is not visible in Figures 2a, 2b, 2d and 2f because of the very intense neighboring trimethylated lysine ε-amino resonance, but it is resolved in higher field spectrometers. It was necessary to attenuate the vertical scale in Figures 2b, 2d and 2f in order to accommodate the intense resonance at 53.69 ppm, which arises from the superimposed resonances of the 14 lysine residues in α-chymotrypsin at pH 10 and LpH 10. For this reason the peak intensities for the other resonances at pH 10 and LpH 10 appear weaker than they do in the spectra for reactions at pH 7.5 and LpH 7.5 (Figures 2a and 2c), but the degree of methylation of these groups is at least as great. When chymotrypsin was not inactivated with DFP, more than three multiple peak resonances were observed in the
aqueous reaction (Figure 2e), indicating that, unlike in the nonaqueous reaction, autolysis had occurred, generating additional α-amino groups. With aqueous modifications, it is usually necessary to dialyze proteins following reaction with a derivatizing reagent, and as a result small breakdown products are not observed. This is illustrated in Figure 2e, where a large portion of the protein has autolyzed and small methylated peptides have been removed by dialysis, requiring expansion of the vertical scale in order to observe the weak resonances of the remaining fragments. In contrast, for the nonaqueous reactions reported here the proteins need not be dialyzed, so resonances corresponding to low molecular weight products and minor side reactions can also be observed.
Table 1: Chemical Shifts (ppm) for [13C]Methyl Groups of Amino Acid Standards, Methylated Insulin and Methylated α-Chymotrypsin

[13C]Methyl standard        Standard        Insulin         α-Chymotrypsin
Ac-NH-Tyr(OMe)-NH2          56.12           56.00           56.18
Me3N+-Gly-Leu               55.06           55.07           -
H2N-Lys(ε-+NMe3)            53.66           53.62           53.69
[Me3N+-Cys(NH2)-S-]2        53.42           -               53.50
Me3N+-Phe-Gly-Gly           53.41           53.48           -
Me3N+-Ile-NH2               53.03           -               53.21
Me3N+-Ala-Ala               52.60           -               52.71
H2N-Lys(ε-+NHMe2)           43.42           -               43.37
Ac-NH-His(Im+Me2)-NH2       34.04, 36.46    34.10, 36.58    34.08, 36.59
H2N-Lys(ε-+NH2Me)           33.59           -               33.55
+SMe3                       27.5 (35)       -               27.54
H2N-Met(+SMeMe)             25.41           -               25.65
The reaction of iodomethane with amino groups proceeds to give the quaternary trimethylamino derivative. In contrast, reductive methylation yields at most the dimethylamino derivative (1-5). Although quaternization of amino groups is known to occur in vivo as a posttranslational modification of proteins (32), this modification, to our knowledge, has not been reported as an in vitro chemical modification for amino groups in native proteins. It provides a means for placing a permanent positive charge on the α- and ε-amino groups at all pH values. Similarly, the formation of a dimethylimidazolium cation derivative with
the side-chain of histidine was unexpected since this derivative has not been reported as either an in vitro chemical modification of a native protein or as an in vivo post-translational modification. Like the trimethylation of amino groups, dimethylation of the imidazole function also provides a means for placing a positive charge on the side-chain at all pH values.

[Figure 2: 13C NMR spectra, panels A-F; chemical shift axis in ppm]

Figure 2. NMR Spectra of α-Chymotrypsin and DIP-Chymotrypsin. A) DIP-Chymotrypsin, water, pH 7.5; B) DIP-Chymotrypsin, water, pH 10; C) α-Chymotrypsin, in vacuo, LpH 7.5; D) α-Chymotrypsin, in vacuo, LpH 10; E) α-Chymotrypsin, water, pH 7.5; F) α-Chymotrypsin, octane, LpH 10. Peak resonances correspond to: 1- Tyr(OMe); 2- Lys(ε-+NMe3); 3- Me3N+-Cys; 4- Me3N+-Ile; 5- Me3N+-Ala; 6- His(Im+Me2); 7- Met(+SMe2).

The tyrosine phenolic hydroxyl function was readily methylated to form a methyl ether under aqueous conditions at pH 10 with the proteins
used in this investigation and with model compounds. To our knowledge this modification reaction with iodomethane has also not been previously reported as an in vitro chemical modification of native proteins. There was, however, a notable difference in the reactivity of this group between insulin (Figures 1d and 1f) and both α-chymotrypsin (Figures 2d and 2f) and ribonuclease (spectrum not shown) in a nonaqueous environment, in that only in insulin was this function methylated at LpH 10. This suggests that the phenolic side-chains are buried in the major conformational states of α-chymotrypsin and ribonuclease in solution at pH 10. The reason these side-chains react in an aqueous, but not in a nonaqueous, environment is presumed to be the dynamic equilibrium that exists between the various conformational states in solution: although the tyrosine side-chains are exposed only in minor conformations, this can lead to substantial reaction over a period of time due to Le Chatelier's principle. In the nonaqueous state no comparable dynamic equilibria exist, so that the tyrosine side-chains never become exposed. Insulin, being a very small protein, is unfolded to a large extent in solution at pH 10, and at LpH 10, exposing some or all of its tyrosine side-chains.

The most significant difference between the aqueous and nonaqueous reactions was that for the nonaqueous reaction the relative degree of methylation of ε- and α-amino groups depended on the pH of lyophilization. At LpH 7.5, the degree of reaction of the ε-amino groups, relative to that of the α-amino groups, is clearly much less than at LpH 10, while the imidazole group, with a pKa value of approximately 6 to 7 (33), shows a much smaller difference (Figures 1c and 1d, 1e and 1f, and 2c and 2d). This can be explained on the basis of the pH memory effect (27), where ionizable groups "remember" the ionization state they had in the solution from which they were lyophilized.
In nonaqueous media there is no water present for a dynamic equilibrium to be established between the two ionization states, and therefore only the deprotonated species will be derivatized while the protonated form remains unmodified. This interpretation is consistent with the results obtained for the in vacuo reaction of the ε- and α-amino groups of lyophilized α-chymotrypsin at LpH 9 with acetic anhydride at 75°C (Table 2). It was found that after 60 h of reaction, 87% of the α-amino groups were acetylated but, in contrast, only 26% of the lysine ε-amino groups were acetylated. With pKa values of approximately 7 to 8, more than 50% of the α-amino groups would be expected to be derivatized at LpH 9, whereas with pKa values of approximately 10 to 11, less than 50% of the ε-amino groups would react (33).
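The "more than 50%" and "less than 50%" expectations follow from the Henderson-Hasselbalch relation applied at the pH of lyophilization. The sketch below is only an illustration of that arithmetic: it assumes that the deprotonated fraction in the lyophilized powder equals the solution value at LpH 9 (the pH memory picture) and uses the approximate pKa ranges quoted above. The function and variable names are hypothetical.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch fraction of an amino group in its unprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Approximate pKa ranges quoted in the text (33); LpH taken from Table 2.
LPH = 9.0
PKA_RANGES = {"alpha-amino": (7.0, 8.0), "epsilon-amino": (10.0, 11.0)}


def predicted_ranges(lph: float = LPH) -> dict:
    """Return {group: (lowest, highest)} deprotonated fraction at the given LpH."""
    ranges = {}
    for group, (pka_low, pka_high) in PKA_RANGES.items():
        # A higher pKa gives a smaller deprotonated fraction, so the bounds swap.
        ranges[group] = (
            fraction_deprotonated(pka_high, lph),
            fraction_deprotonated(pka_low, lph),
        )
    return ranges


if __name__ == "__main__":
    for group, (low, high) in predicted_ranges().items():
        print(f"{group}: {low:.3f} to {high:.3f} deprotonated at LpH {LPH}")
```

Under these assumptions the α-amino groups come out well above one half deprotonated and the ε-amino groups well below one half, which is the qualitative contrast the text draws; the estimate says nothing about reaction kinetics or local environment effects in the solid state.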
Clearly, the phenomenon of pH memory has the potential to be utilized as a means of achieving selective chemical modification of ionizable functional groups, even for those within the same class, by controlling the pH of lyophilization.

Table 2: Relative Incorporation* of Acetic Anhydride into Amino Groups of α-Chymotrypsin by Nonaqueous Derivatization

                 Reaction time in vacuo (LpH 9.0)
Amino group      1 h      8 h      24 h     60 h
α-amino          0.34     0.42     0.58     0.87
ε-amino          0.21     0.23     0.25     0.26

* Fraction relative to complete derivatization in water. T = 75°C

The results of the present study show that nonaqueous modification of lyophilized proteins in octane or in vacuo is feasible and practical. The increased temperature stability (27,34) of proteins in the lyophilized state permits the use of elevated temperatures to accelerate the reactions. Modification in organic solvent has the advantage that both volatile and nonvolatile reagents can be used. With the in vacuo procedure, stirring is not required to maintain the protein in a dispersed state, the reaction temperature is not limited by the boiling point of the solvent, recovery of unreacted reagent is much simpler, and no further manipulation of the modified protein is required. Regardless of which nonaqueous procedure is used, there are significant advantages over aqueous modification procedures. A case in point is the reaction with iodomethane, which, because of its low solubility and slow reaction in water, has been so infrequently used for protein modification that it is not usually included in catalogues of reagents for protein modification (1,3). While the results show that iodomethane does react with proteins in water to form the same derivatives as under nonaqueous conditions, the long reaction time and the rapid agitation required to disperse the iodomethane make it very difficult to prevent denaturation or hydrolytic breakdown. It is expected that the same will be true for all insoluble modifying reagents that require long reaction times. In contrast, the nonaqueous reaction, whether in organic solvent or in vacuo, is facile with such reagents, and the possibility of irreversible structural damage is greatly reduced. Another significant advantage is that the pH memory effect observed with lyophilized proteins can be exploited to improve the selectivity of modification, even
for groups within the same functional class by controlling the pH of lyophilization. The ability to use nonaqueous conditions opens the door to the use of novel protein modifying reagents and should provide the opportunity, as in the case of iodomethane, for the preparation of novel derivatives to explore structure-function relationships in proteins.
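One way to put a number on the selectivity discussed above is the ratio of α-amino to ε-amino incorporation at each time point in Table 2. The short sketch below simply divides the tabulated fractions; the ratio is an illustrative metric chosen here, not one defined in the study, and the names are hypothetical.

```python
# Relative incorporation values transcribed from Table 2 (in vacuo, LpH 9.0, 75 C).
TIMES_H = [1, 8, 24, 60]
ALPHA_AMINO = [0.34, 0.42, 0.58, 0.87]
EPSILON_AMINO = [0.21, 0.23, 0.25, 0.26]


def selectivity(alpha, epsilon):
    """Ratio of alpha-amino to epsilon-amino incorporation at each time point."""
    return [a / e for a, e in zip(alpha, epsilon)]


if __name__ == "__main__":
    for t, ratio in zip(TIMES_H, selectivity(ALPHA_AMINO, EPSILON_AMINO)):
        print(f"{t:>2} h: alpha/epsilon = {ratio:.2f}")
```

With the tabulated values the ratio rises steadily with reaction time, because α-amino incorporation keeps increasing while ε-amino incorporation stays nearly flat.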
REFERENCES

1. Lundblad, R. L. (1995) Techniques in Protein Modification, CRC Press.
2. Imoto, T. and Yamada, H. (1989) Chemical Modification: In Protein Function: A Practical Approach (Creighton, T. E. Ed.) Chapter 10, pp. 247-277, IRL Press.
3. Lundblad, R. L. and Noyes, C. M. (1984) Chemical Reagents for Protein Modification, Vol. 1 and Vol. 2, CRC Press.
4. Glazer, A. N., Delange, R. J. and Sigman, D. S. (1976) Chemical Modification of Proteins: In Laboratory Techniques in Biochemistry and Molecular Biology (Work, T. S. and Work, E. Eds.) Vol. 4, pp. 3-205, North-Holland Publishing Company.
5. Means, G. E. and Feeney, R. E. (1971) Chemical Modification of Proteins, Holden-Day.
6. Meth. Enzymol. (1972) Modification Reactions (Timasheff, S. N. and Hirs, C. H. W. Eds.) Vol. 25b, pp. 387-651, Academic Press.
7. Laursen, R. A. (1972) Meth. Enzymol. (Timasheff, S. N. and Hirs, C. H. W. Eds.) Vol. 25b, pp. 344-359, Academic Press.
8. Hawke, D. H. and Boyd, V. L. (1991) Studies On Carboxyl Terminal Degradations: In Techniques in Protein Chemistry II (Villafranca, J. J. Ed.) pp. 107-129, Academic Press.
9. Vath, J. E., Zollinger, M. and Biemann, K. (1988) Fresenius Z. Anal. Chem. 331, 248-252.
10. Knapp, D. R. (1990) Meth. Enzymol. (McCloskey, J. A. Ed.) Vol. 193, pp. 314-329, Academic Press.
11. Wescott, C. R. and Klibanov, A. M. (1994) Biochim. Biophys. Acta 1206, 1-9.
12. Klibanov, A. M. (1989) Trends Biochem. Sci. 14, 141-144.
13. Chen, C.-S. and Sih, C. J. (1989) Angew. Chem. Int. Ed. Engl. 28, 695-707.
14. Fitzpatrick, P. A., Ringe, D. and Klibanov, A. M. (1994) Biochem. Biophys. Res. Commun. 198, 67-681.
15. Fitzpatrick, P. A., Steinmetz, A. C. U., Ringe, D. and Klibanov, A. M. (1993) Proc. Natl. Acad. Sci. USA 90, 8653-8657.
16. Rupley, J. A. and Careri, G. (1991) Adv. Protein Chem. 41, 37-52.
17. Careri, G., Gratton, E., Yang, P. H. and Rupley, J. A. (1980) Nature 284, 572-574.
18. Schinkel, J. E., Downer, N. W. and Rupley, J. A. (1985) Biochemistry 24, 352-357.
19. Desai, U. R., Osterhout, J. J. and Klibanov, A. M. (1994) J. Am. Chem. Soc. 116, 9420-9422.
20. Poole, P. L. and Finney, J. L. (1983) Biopolymers 22, 255-260.
21. Prestrelski, S. J., Arakawa, T. and Carpenter, J. F. (1993) Arch. Biochem. Biophys. 303, 465-472.
22. Desai, U. R. and Klibanov, A. M. (1995) J. Am. Chem. Soc. 117, 3940-3945.
23. Prestrelski, S. J., Tedeschi, N., Arakawa, T. and Carpenter, J. F. (1993) Biophys. J. 65, 661-671.
24. Griebenow, K. and Klibanov, A. M. (1995) Proc. Natl. Acad. Sci. USA 92, 10969-10976.
25. Darbre, A. (1986) Practical Protein Chemistry (Darbre, A. Ed.) p. 141, John Wiley & Sons.
26. Kaplan, H. (1972) J. Mol. Biol. 72, 153-162.
27. Zaks, A. and Klibanov, A. M. (1988) J. Biol. Chem. 263, 3194-3201.
28. Klibanov, A. M. (1984) Chemtech 16, 354-369.
29. Broos, J., Visser, A. J. W. G., Engbersen, J. F. J., Verboom, W., van Hoek, A. and Reinhoudt, D. N. (1995) J. Am. Chem. Soc. 117, 12657-12663.
30. Young, N. M. and Kaplan, H. (1989) Chemical Characterization of Functional Groups in Proteins by Competitive Labelling: In Protein Function: A Practical Approach (Creighton, T. E. Ed.) Chapter 8, IRL Press.
31. Darbre, A. (1986) Practical Protein Chemistry (Darbre, A. Ed.) pp. 260-262, John Wiley & Sons.
32. Paik, W. K. and Kim, S. (1975) Adv. Enzymol. 42, 227-286.
33. Creighton, T. E. (1993) Proteins: Structures and Molecular Properties, p. 6 and references cited therein, W. H. Freeman.
34. Zaks, A. and Klibanov, A. M. (1984) Science 224, 1249-1251.
35. Breitmaier, E. and Voelter, W. (1987) Carbon-13 NMR Spectroscopy: High-Resolution Methods and Applications in Organic Chemistry and Biochemistry, 3rd ed., p. 234, VCH Publishers.
REACTION OF HIV-1 NC p7 ZINC FINGERS WITH ELECTROPHILIC REAGENTS
E. Chertova, B.P. Kane, L.V. Coren, D.G. Johnson, R.C. Sowder II, P. Nower, J.R. Casas-Finet, L.O. Arthur and L.E. Henderson
SAIC, NCI-FCRDC, Frederick, MD 21702
Introduction
All nucleocapsid (NC) proteins of oncoviral and lentiviral origin have highly conserved zinc finger structures that consist of 14 amino acids with 3 cysteines and a histidine arranged in a Cys-(X)2-Cys-(X)4-His-(X)4-Cys array (CCHC) (Henderson et al., 1981; Copeland et al., 1984). These structures coordinate Zn2+ with the His-imidazole and Cys-thiolate groups present in the finger (Chance, M.R. et al., 1992; Summers, M.F. et al., 1992). The retroviral NC protein is necessary for genomic RNA encapsidation (Gorelick et al., 1988; Meric et al., 1988; Meric and Goff, 1989; Aldovini and Young, 1990; Dupraz et al., 1990; Gorelick et al., 1990) and also plays an essential role in the initial infectious process (Gorelick et al., 1993; Gorelick et al., 1996). Recently, it has been shown that CCHC zinc finger peptides are susceptible to chemical attack by a wide variety of oxidizing agents (Rice et al., 1993; Henderson et al., 1995; Rice et al., 1995; Tummino et al., 1996). The metal-chelated sulfur thiolates in the CCHC zinc fingers of HIV-1 p7 are known to react with a variety of chemical groups, including maleimides, nitrosos, disulfoxides, thiocarbamoyl-disulfides, and other substituted disulfides, as well as oxidizing agents such as Cu2+, Fe3+ and Hg2+ ions
(Henderson et al., 1995). The reaction mechanisms for the thiuram disulfide class of oxidizing agents and the maleimide class of alkylating agents were examined in our laboratory and are presented below. Thiuram disulfides (see Fig. 2 for structures) were examined in detail, since a member of this class of compounds, tetraethylthiuram disulfide (Antabuse), is an FDA-approved drug for alcohol abuse therapy and has very low in vivo toxicity (oral LD50 in mice = 1.98 g/kg (Child and Crump, 1952)). These compounds have functional groups that can modify zinc fingers in NC protein and have anti-viral activity, but they are not necessarily specific for the virus. In order to initiate studies leading to the design of reagents with greater specificity for the viral NC protein, it is necessary to determine the mechanism of action for model compounds and in particular to determine the initial site of attack on the NC protein.

Materials and methods

Tetramethylthiuram disulfide, tetraethylthiuram disulfide, tetraisopropylthiuram disulfide, bis-(dibutylthiocarbamoyl) disulfide, and dicyclopentamethylenethiuram disulfide were purchased from Aldrich. Test compounds were dissolved as 1 mmol stocks in spectral-grade dimethyl sulfoxide (DMSO; B&J Brand) and were stored at -20 °C. β-Mercaptoethanol was from Sigma. Endoproteinase Arg-C, sequencing grade, was from Boehringer Mannheim Biochemica. Acetonitrile and H2O (UV grade) were obtained from EM Science. Trifluoroacetic acid (TFA, HPLC/Spectra grade) was from Pierce. Tris (ULTROL Grade) was from Calbiochem, La Jolla, CA.

HIV recombinant-NCp7. The HIV NC coding sequence was cloned into the pMal-c vector and expressed as a fusion protein in E. coli. After factor Xa cleavage, the 55-residue NC protein (Fig. 3) was purified by HPLC, complexed with two equivalents of Zn and stored as a lyophilized powder (NCp7).

Reactions of NCp7: 1. For modification with NEM, 200 µg of NCp7 was reacted with N-ethylmaleimide at a ratio of 1:6 (NCp7 : reagent) in 600 µl of 20 mM Tris-HCl at pH 7 for 3 min, 10 min and 30 min at 37 °C. HPLC was performed on a µ-Bondapack C18 reverse-phase column (3.9 x 300 mm). The gradient of buffer B was: 0-14%, 5 min; 14-24%, 15 min; 25-80%, 10 min; 80%, 5 min. Peaks were detected by an LKB 2140 Rapid Spectral Detector at 206, 260 and 280 nm. 2. For thiuram disulfides, 2.5 µg of protein were mixed with
reagent at a ratio of 1:6 in 20 µl of pH 7 Tris-HCl buffer at 37 °C for 10 min. The reaction products were separated by reverse-phase HPLC on an α-Chrom 5C18-300 (2.0 x 150 mm) column at 0.3 ml/min by the gradient of buffer B (0.05% TFA in acetonitrile): 0-15%, 5 min; 15-25%, 20 min; 25-80%, 10 min and 80%, 5 min. Peaks were monitored by a Shimadzu SPD-M10AV diode array detector.

Proteolytic digestion. The samples of either NCp7 or modified NCp7 were dissolved in 0.02 M Tris-HCl, pH 7, and Arg-C endoproteinase (50:1 w/w NC protein:protease) was added and incubated for 1 h at 37 °C. After digestion all samples were analyzed by reverse-phase HPLC on an α-Chrom 5C18-300 (2.0 x 150 mm) column at 0.3 ml/min. The gradient of buffer B was: 0-16%, 5 min; 16-25%, 20 min; 25-80%, 10 min; 80%, 5 min. Peaks were detected by UV absorption at 206 and 280 nm.

Reaction of NCp7 with tetraethylthiuram and NEM. 200 µg of NCp7 was reacted at 37 °C with a 6-fold excess of tetraethylthiuram disulfide in 1 ml at pH 7 for 1 min, followed by addition of 1.2 ml of 1 mmol of N-ethylmaleimide and incubation for 2 h at 37 °C. The reaction products were separated by reverse-phase HPLC on a Vydac 5C18-300 column (4.6 x 150 mm). The gradient of buffer B was: 0-16%, 5 min; 16-25%, 35 min; 25-80%, 10 min; 80%, 5 min. Peaks were detected by an LKB 2140 Rapid Spectral Detector at 206, 260 and 280 nm.

Fluorescence study. All fluorescence measurements were performed on a Shimadzu RF 5000U Spectrophotometer using a 10 nm slit width, with excitation set at 288 nm and emission set at 352 nm. Measurements were performed at room temperature in a 2 by 10 mm pathlength quartz cuvette (Uvonic Inc.). Both the 2nd finger p7 peptide and full length p7 were dissolved in 10 mM sodium phosphate buffer (pH 7) at 1 µM concentration; peptide and protein were coordinated with 1X or 2X Zn, respectively. The initial fluorescence of both the p7 peptide and NC was monitored for 2.5 minutes, after which different thiuram disulfides (at varying concentrations with respect to zinc finger) were added to the cuvette and the decrease in fluorescence was monitored over time.

The whole virus modification with the thiuram disulfides. HIV-1 (MN) (equivalent of 1 mg p24 CA) in 20 ml was treated at 37 °C with 50 mM test compounds for 1 h in sodium phosphate buffer, pH 7. Samples were centrifuged for 1 h at 17,000 g at 4 °C to pellet the virus and remove the drug. Samples for electrophoresis were run under non-reducing conditions. The gels (4-20%) were supplied by NOVEX (San Diego, California). Proteins were then transferred onto PVDF membranes (Towbin et al., 1979), stained with 0.5% (w/v) Ponceau S and detected by
immunoblot analysis using monospecific polyvalent rabbit antiserum prepared against purified viral NCp7 and ECL anti-rabbit IgG compounds. Crosslinking was visualized by Enhanced Chemiluminescence (ECL) following the manufacturer's instructions (Amersham, Arlington, IL).

Results

A. Studies with N-ethylmaleimide

N-ethylmaleimide (NEM) is an alkylating agent that reacts with cysteine thiols to generate a stable S-alkylated derivative (Cys-NEM) which can be identified by Edman degradation. Preliminary studies showed that NEM reacted with thiols in NCp7 and gave stable Cys-NEM derivatives. To determine if the thiols in the protein reacted with the reagent in any specific order, the reaction was conducted under conditions of limiting reagent concentration. At various times, the modified protein products were separated by HPLC and the location of Cys-NEM residues was determined by N-terminal Edman degradation. Figure 1 shows the HPLC separations of the NEM-modified protein after 3, 10 and 30 minutes of reaction with a
[Figure 1: HPLC traces; peaks labeled p7, p7 C36-M, and p7 C36-M C39-M C49-M]

Figure 1. HPLC separation of modified NCp7 after 3 min (top), 10 min (middle) and 30 min (bottom trace) of reaction with N-ethylmaleimide (6:1 molar ratio). The reaction products were separated by reverse-phase HPLC. After separation, fractions were characterized using an automated Applied Biosystems Inc. 477A Protein Sequencer.
1:1 molar ratio of total thiols to NEM (i.e., a ratio of 6 NEM to 1 NCp7). The major intermediate detected after 3 min of reaction time (Fig. 1; p7 C36-M) was analyzed and found to have Cys-NEM in position 36 and no other modified Cys residues. As the reaction progressed (Fig. 1; 10 and 30 min) the amount of p7C36-M remained constant, while the amount of unreacted p7 decreased and the amount of protein with three Cys-NEM residues at positions 36, 39 and 49, all located in the second zinc finger (p7C36, 39, 49-M), increased. These data are consistent with Cys-36 in the second zinc finger of HIV-1 being the most reactive with NEM. They are also consistent with p7C36-M being an intermediate in the reaction pathway leading to p7C36, 39, 49-M, and with the intermediates in the reaction path between these two states being short lived. Under the conditions of limiting reagent the amount of fully modified p7 (6 Cys-NEM residues) is negligible. At higher concentrations of NEM and/or longer reaction times all 6 Cys residues can be modified with NEM (data not shown). Thus, the data show that the second zinc finger is more reactive with NEM than the first zinc finger, but both can be made to react.

B. Interaction of the thiuram disulfides and HIV NCp7

Preliminary data had shown that NCp7 was also reactive with a wide variety of disulfides, including thiuram disulfides. The thiuram disulfides have a common reactive functional group, R2N-C(S)-S-S-(S)C-NR2, and differ by the nature of the R groups. The thiuram disulfides chosen for this study were tetramethylthiuram disulfide, tetraethylthiuram disulfide, tetraisopropylthiuram disulfide, tetrabutylthiuram disulfide, and dicyclopentamethylenethiuram disulfide (see Fig. 2 for structures). To investigate the influence of the R groups on the reactivity with NCp7 zinc fingers, the reaction rates for a thiuram series with a 2nd finger peptide (18 residues from position 34 to 51, Fig. 3) and with the NCp7 were determined by following the rate at which the fluorescence of tryptophan (Trp 37) was quenched. Table I presents the determined initial rates for the thiuram disulfide series reacting with the second zinc finger in the context of the whole protein and in the 18 residue peptide. The observed order of reaction rates for the series was tetramethyl > tetraethyl > tetraisopropyl = tetrabutyl. The rate of reaction of these compounds with the peptide or the complete protein decreased as the bulk and hydrophobic character of the side chain increased. The rate of reaction for each thiuram disulfide was greater for the full length p7 nucleocapsid protein than for the zinc finger peptide.

Table I. Relative Reaction Rates for Attack of Thiuram Disulfides by HIV-1 Full Length NCp7 Protein or Second Zinc Finger Peptide (ZnF2p7)

Compound                                  NCp7 Rate    ZnF2p7 Rate
Tetramethylthiuram Disulfide              1.00         0.98
Tetraethylthiuram Disulfide               0.59         0.63
Tetraisopropylthiuram Disulfide           0.20         0.39
Tetrabutylthiuram Disulfide               0.28         0.26
Dicyclopentamethylenethiuram Disulfide    3.59         0.68

To gain some insight into the reaction pathway, as was done for
[Figure 2: HPLC chromatograms, each shown with the structure of the reagent used: tetramethylthiuram disulfide, tetraethylthiuram disulfide, tetraisopropylthiuram disulfide, tetrabutylthiuram disulfide and dicyclopentamethylenethiuram disulfide]

Figure 2. HPLC-assay of NCp7 with thiuram disulfides. NCp7 was reacted with a 6-fold excess of various thiuram disulfides at pH 7.0 for 10 min at 37 °C. The products of the reaction were identified as follows: I - unreacted p7; II - 3(S-S) p7; the shaded peak is tetramethylthiuram disulfide; the other reagents eluted later in the chromatogram (not included).
NEM, the NCp7 was allowed to react for 10 min with a limiting amount of thiuram disulfide (1:6 ratio of protein to reagent) and the reaction products were separated by HPLC. The chromatogram in Fig. 2 shows the distribution of protein products from each reaction. To compare the reactivities of the thiuram disulfides with the protein, we estimated the relative amounts of unreacted protein remaining after 10 min (Fig. 2; peak I). The data reveal that the order of reactivity for the thiuram disulfides is tetramethyl > dicyclopentamethylene = tetraethyl > tetraisopropyl > tetrabutyl, in good agreement with the data from the Trp fluorescence quenching study. In other studies (data not shown) the reactions were driven to completion by increasing the reaction time. The final reaction product is fully oxidized p7 with three disulfide bonds and has the chromatographic mobility indicated in Fig. 2 (peak II). We confirmed the presence of three disulfide bonds by treating the oxidized p7 (Fig. 2; peak II) with 4-vinylpyridine followed by protein sequencing of the reaction product, and found it to be devoid of modified free thiols. The fully oxidized protein was also reduced with β-mercaptoethanol and shown to have HPLC elution behavior identical to unreacted p7. The presence of a disulfide bond linking the finger domains was confirmed by enzymatic digestion with Arg-C endoproteinase. Enzymatic digestion at Arg residues flanking the finger domains (arrows in Fig. 3) produces two large peptides that are easily separated by HPLC and distinguished from each other by the UV absorption of Trp 37 in the 2nd finger peptide. Before reduction with 2-mercaptoethanol the finger peptides eluted as a single peak, but they separated into two chromatographic species after reduction (data not shown). The results show that the reaction with thiuram disulfides is essentially an oxidation reaction and are consistent with known properties of this class of compound. Thus the thiuram disulfides induce disulfide bonds among the zinc finger thiolates, displace zinc, and alter the active conformation of the protein.

Figure 3. [NCp7 sequence, residues 1-55; arrows in the original figure mark the sites flanking the 1st finger peptide and the 2nd finger peptide]

1       10        20        30        40        50   55
MQRGNFRNQRKIIKCFNCGKEGHIAKNCRAPRKRGCWKCGKEGHQMKDCTERQAN
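The sequence in Fig. 3 can be used to check the two structural points made above: where the CCHC arrays lie, and which Arg-C fragments carry the cysteines and Trp 37. The sketch below is a simplified illustration only. It assumes an idealized Arg-C that cuts after every Arg residue (the real enzyme is more selective, and the actual cleavage sites are the ones marked in Fig. 3), and the helper names are hypothetical.

```python
import re

# NCp7 sequence as given in Fig. 3 (55 residues, numbered from Met 1).
NCP7 = "MQRGNFRNQRKIIKCFNCGKEGHIAKNCRAPRKRGCWKCGKEGHQMKDCTERQAN"

# Cys-(X)2-Cys-(X)4-His-(X)4-Cys array described in the Introduction.
CCHC = re.compile(r"C.{2}C.{4}H.{4}C")


def find_fingers(seq):
    """Return (start, end, motif) for each CCHC array, 1-based inclusive."""
    return [(m.start() + 1, m.end(), m.group()) for m in CCHC.finditer(seq)]


def argc_fragments(seq):
    """Idealized Arg-C digest: cut after every Arg. Returns (start, end, peptide)."""
    fragments, start = [], 0
    for i, residue in enumerate(seq):
        if residue == "R":
            fragments.append((start + 1, i + 1, seq[start:i + 1]))
            start = i + 1
    if start < len(seq):
        fragments.append((start + 1, len(seq), seq[start:]))
    return fragments


if __name__ == "__main__":
    for start, end, motif in find_fingers(NCP7):
        print(f"CCHC array {start}-{end}: {motif}")
    for start, end, peptide in argc_fragments(NCP7):
        if "C" in peptide:
            has_trp = "contains Trp" if "W" in peptide else "no Trp"
            print(f"Cys-containing fragment {start}-{end}: {peptide} ({has_trp})")
```

Under this simplification only two fragments contain cysteine, one per finger, and only the second contains tryptophan, which is why absorbance at 280 nm can tell the two finger peptides apart.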
C. The mechanism of tetraethylthiuram disulfide reaction with NCp7

To investigate transient intermediates and the reaction path we selected tetraethylthiuram disulfide as the model reagent. In Fig. 2 there are several peaks of modified protein eluting between peaks I and II that are transient intermediates in the reaction pathway leading to the fully oxidized protein. The protein and reagent were reacted for 1 min and the reaction products separated by HPLC as before. In this chromatogram (Fig. 4) peak 3-disulfide is fully oxidized protein and all other peaks are reaction intermediates. The two most prominent peaks of modified protein (peaks B and C) had greater absorbance at 280 nm (greater ratio of 280 nm/206 nm) than the 3-disulfide peak, indicating that peaks B and C contained a mixed disulfide between the protein and the reagent. The major transient intermediates observed in Fig. 4 thus seemed to contain mixed disulfides between the reagent and the protein (peaks B and C), but the final reaction product (peak 3-disulfide) does not. The protein with a mixed disulfide might readily undergo a rearrangement, liberating reduced reagent, to generate internal disulfide bonds. To prevent this from occurring, the free thiols of the modified protein were blocked by alkylation with NEM before separation. After 2 h in the excess NEM, the products were separated by HPLC as shown in Figure 5. Under these conditions, at least 70% of the initial protein reacted with tetraethylthiuram disulfide before addition of the NEM. Free thiols in the intermediates are most reactive with NEM, and disulfides (mixed or internal) are least reactive. Reactions between the NCp7 and tetraethylthiuram disulfide that initiate after the addition of NEM are minimized by the small amount of unreacted protein, the limiting amount of tetraethylthiuram disulfide in the reaction mixture and the rapid alkylation by the excess of NEM.
Any unreacted NCp7 with both zinc fingers should react with the excess NEM to give a final product with 6 Cys-NEM, but the modification would proceed more rapidly on the second zinc finger, as discussed for Fig. 1. In Fig. 5 the peak labeled "NCp7 / 6NEM" (mass-spectrometry data) is fully alkylated protein that results from the action of NEM on NCp7 protein remaining after 1 min of treatment with tetraethylthiuram disulfide. The area of the "NCp7 / 6NEM" peak relative to the other peaks in the chromatogram is consistent with the expectation that 30% of the initial protein remained unreacted after 1 min and that this protein was quickly modified by the excess NEM. Peak A was analyzed by mass spectrometry on MALDI-II-TOF (Shimadzu) and by Edman degradation. These results showed that most of the protein was modified by the addition of 4 NEM moieties on Cys residues 28, 36, 39 and 49. However, the analysis also indicated that the peak contained lesser
amounts of other modified forms of protein. Peak C (Fig. 5) had a greater 280 nm/206 nm absorbance ratio than peaks A, B and "NCp7/6NEM", suggesting that it contains protein with at least one mixed disulfide and is derived from the intermediate in peak C in Fig. 4.
Figure 4. HPLC separation of NEM-modified intermediates of NCp7 oxidized with tetraethylthiuram disulfide. Peak labels legible in the figure include Reagent, 2 Disulfides/2NEM, 3 Disulfides and A.
To further analyze the modified intermediates separated in Fig. 4, the protein isolated in peaks A, B and C was reduced with DTT and digested with Arg-C as previously discussed (Fig. 3). The resulting peptides were separated by HPLC as shown in Fig. 5, panels A, B and C. The separated peptides were characterized by their molecular mass, HPLC mobility and UV absorption. Peak A (Fig. 4) was analyzed in panel A (Fig. 5) and found to contain a mixture of at least three modified intermediates. The most abundant modified intermediate gave a 1st finger peptide with 1 Cys-NEM residue and a 2nd finger peptide with 3 Cys-NEM residues. The data are consistent with an oxidative intermediate with one disulfide bond in the first finger. The other two modified intermediates in peak A appear to be derived from higher oxidation products with two disulfide bonds per intermediate. The data suggest that both higher oxidation products have a disulfide bond linking the first and second finger domains and an additional disulfide bond. One intermediate has the additional disulfide in the 1st finger domain and the other has it in the 2nd finger domain. Peak B (Fig. 4) was analyzed and found (Fig. 5, panel B) to contain a mixture of two modified intermediates. Both modified intermediates had the same molecular mass and contained four Cys-NEM residues. One modified intermediate gave a 1st finger peptide with 3 Cys-NEM residues and a 2nd finger peptide with 1 Cys-NEM. The other modified intermediate gave a 1st finger peptide with 2 Cys-NEM residues and a 2nd finger peptide with 2 Cys-NEM residues.
The results are consistent with two intermediates
in the oxidative path, one with a disulfide in the 2nd finger domain and the other with a disulfide linking the 1st and 2nd finger domains. Peak C (Fig. 4) was analyzed in Fig. 5, panel C, and found to contain a 1st finger peptide without Cys-NEM and a 2nd finger peptide with three Cys-NEM residues. These results are consistent with an intermediate in the NCp7 thiuram oxidation pathway with all three Cys residues in the first finger protected from NEM by disulfide bonds and all three Cys residues in the 2nd finger as unprotected thiols. The data reveal that peak C (Fig. 4) contained an oxidative intermediate with the first finger domain modified by one internal disulfide and one mixed disulfide. The mixed disulfide was also indicated by the ratio of OD206/OD280 for peak C (Fig. 3).
Figure 5. HPLC separation of Arg-C peptides from NEM-modified NCp7. Panels and peak labels: Arg-C peptides from peak A (1st-1NEM, 2nd-3NEM, 1st-2NEM); Arg-C peptides from peak B (2nd-1NEM, 2nd-2NEM, 1st-2NEM, 1st-3NEM); Arg-C peptides from peak C (1st, 2nd-3NEM).
The data presented in Figs. 4 and 5 are consistent with a prominent reaction path that begins with an initial attack of tetraethylthiuram disulfide on the 1st zinc finger of NCp7. The earliest oxidation intermediate that accumulates as a detectable transient has one disulfide bond in the first finger linking Cys 15 to Cys 18 (major component of peak A, Fig. 5). This intermediate has a free thiol on Cys 28, which can react in the next step with tetraethylthiuram disulfide to form a mixed disulfide. The resulting intermediate has one internal disulfide and one mixed disulfide (the major component in Fig. 5 peak C). In subsequent steps this intermediate can form either intra- or
intermolecular disulfide bonds and react with additional reagent to give higher oxidation products. The data for peak B (Fig. 5) suggest that a less prominent reaction path begins with an initial attack on the 2nd zinc finger and leads to higher oxidation products. Taken together, the data indicate that tetraethylthiuram disulfide can attack both zinc fingers in NCp7 but reacts more readily with the 1st finger in the native protein, since peak A > peak B.
D. Inactivation of HIV-1 (MN) with the thiuram disulfides in vitro
In order to analyze the action of the thiuram disulfides on NC protein in whole virus, 1000x concentrated cell-free HIV-1(MN) was incubated with 50 mM of thiuram disulfides for 60 min. The virus was then pelleted by centrifugation to remove reagents and was analyzed by western blot analysis with antibody against p7 (Fig. 6). Under non-reducing conditions the NC protein of untreated virus is a mixture of monomers, dimers, trimers and tetramers (see Fig. 6, HIV-1 lane). Virus treated with thiuram disulfides showed NC antigen migrating above the 200 kDa marker, and in some cases the monomeric form of NCp7 was completely absent (tetraethylthiuram, tetraisopropylthiuram and dicyclopentamethylenethiuram disulfides). The results show that thiuram disulfides are capable of penetrating the viral membrane and attacking the NC protein in the viral core.
Figure 6. Non-reducing SDS-PAGE analysis of HIV-1 treated with drug: lane 1 - tetramethylthiuram disulfide, 2 - tetraethylthiuram disulfide, 3 - dicyclopentamethylenethiuram disulfide, 4 - tetraisopropylthiuram disulfide, and 5 - tetrabutylthiuram disulfide. Band labels: tetramer, trimer, dimer, p7NC.
Discussion
Thiuram disulfides react with NC p7 through a sulfhydryl-disulfide exchange involving Cys thiolates and the electrophilic disulfide bond of the thiuram. The reaction proceeds through a nucleophilic attack on the thiuram disulfide by the Cys thiolate. It is known that for low molecular weight thiols such a reaction forms predominantly a symmetrical disulfide, whereas with protein sulfhydryl groups a mixed disulfide is the major product. The reaction of thiuram disulfides with NC protein yields a mixed disulfide (derived from Cys thiols and the diethyldithiocarbamyl moiety) and a diethyldithiocarbamate ion as the primary products of the exchange process. A complex reaction pathway was observed and attributed to the presence of six Cys residues in close proximity in NC protein, which may form a heterogeneous disulfide bonding pattern. The side group had a pronounced effect on the rate of reaction. Thiuram disulfides carrying the branched isopropyl chain as well as the longer butyl groups reacted more slowly than the more compact methyl or ethyl groups. This behavior may result from the increased segmental flexibility of the longer branched chain; however, the bulkier dicyclopentamethylene group reacted fastest among the compounds investigated. We attribute this effect to a constrained mobility of the cyclic derivative that maximizes productive encounters between the Cys thiolate and the thiuram disulfide moiety. This agrees with computer modeling studies suggesting a wedgelike shape for the compound. The 1st zinc finger of HIV-1 NC protein is primarily the initial target in the reaction with tetraethylthiuram disulfide. In contrast, NEM reacted faster with the 2nd p7 zinc finger.
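The assignment of disulfides from Cys-NEM counts in the Results follows a simple bookkeeping rule: a Cys that did not take up NEM is taken to have been protected in a disulfide. The Python sketch below is only an illustration of that reasoning and is not the authors' procedure. The function name, the class name and the parsimony rule (pair protected Cys within a finger first, then across fingers, and treat any leftover Cys as a mixed disulfide with reagent) are assumptions introduced here.

```python
from dataclasses import dataclass

CYS_PER_FINGER = 3  # each NCp7 zinc finger contributes three Cys residues


@dataclass(frozen=True)
class DisulfideAssignment:
    intra_first: int   # disulfides within the 1st finger
    intra_second: int  # disulfides within the 2nd finger
    inter_finger: int  # disulfides linking the two fingers
    mixed: int         # leftover protected Cys, read as mixed disulfides with reagent


def assign_disulfides(nem_first: int, nem_second: int) -> DisulfideAssignment:
    """Most parsimonious disulfide pattern for the Cys-NEM counts of each finger peptide.

    Hypothetical helper: the pairing order is an assumption, not a measured result.
    """
    for count in (nem_first, nem_second):
        if not 0 <= count <= CYS_PER_FINGER:
            raise ValueError("Cys-NEM count per finger must be between 0 and 3")
    # Cys that did not react with NEM are assumed to have been in a disulfide.
    protected_first = CYS_PER_FINGER - nem_first
    protected_second = CYS_PER_FINGER - nem_second
    # Pair protected Cys within each finger first.
    intra_first, odd_first = divmod(protected_first, 2)
    intra_second, odd_second = divmod(protected_second, 2)
    # One unpaired Cys in each finger is read as an inter-finger disulfide.
    inter_finger = 1 if (odd_first and odd_second) else 0
    # Any Cys still unpaired must be bonded to something else, i.e. the reagent.
    mixed = odd_first + odd_second - 2 * inter_finger
    return DisulfideAssignment(intra_first, intra_second, inter_finger, mixed)


if __name__ == "__main__":
    observed = {
        "peak A, major component": (1, 3),
        "peak B, first component": (3, 1),
        "peak B, second component": (2, 2),
        "peak C": (0, 3),
    }
    for label, counts in observed.items():
        print(label, assign_disulfides(*counts))
```

Under these assumptions the four patterns reported for peaks A, B and C map to one disulfide in the 1st finger, one in the 2nd finger, one inter-finger disulfide, and one internal plus one mixed disulfide in the 1st finger, respectively. Counts alone cannot distinguish an inter-finger disulfide from two mixed disulfides; that distinction rests on the mass and absorbance data.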
While our studies of NC protein alone demonstrated little if any crosslinking, the results with HIV-1 virus showed extensive oligomerization. The mature virion contains a compact ribonucleoprotein complex formed by the genomic RNA and ca. 2,500 copies of the NC protein. Given the high concentration of NC in the viral particle, the formation of intermolecular disulfide bonds is therefore expected to be favored over intramolecular ones following virus treatment with thiuram disulfides. In agreement with this model, the reaction of tetraethylthiuram disulfide with concentrated samples of HIV-1 NC protein in vitro led to extensive p7 oligomerization. Such crosslinked macromolecular structures appear likely
to result in functional impairment of the NC domain. Indeed, it has been shown that treatment of retroviruses with thiuram and aromatic disulfides rendered them non-infectious (data not shown). These results are completely compatible with the known functions of the NC protein in the viral replication cycle and with published results describing the action of other oxidizing agents on whole HIV-1 (Rice et al., 1994, Rein et al., 1996). In conclusion, thiuram disulfides are examples of a class of compounds that oxidize retroviral NC proteins and hold promise in antiretroviral therapy.
References
Aldovini, A., and R.A. Young. (1990). J. Virology. 64, 1920.
Alexander, P., Z.M. Bacq, S.F. Cousens, M. Fox, A. Herve, and J. Lazar. (1955). Radiat. Res. 2, 392.
Bacq, Z.M., and A. Herve. (1953). Arch. Int. Physiol. 61, 433.
Bacq, Z.M., A. Herve, and P. Fisher. (1953). Bull. Acad. Roy. Med. Belg. 18, 226.
Chance, M.R., I. Sagi, M.D. Wirt, S.M. Frisbie, E. Scheuring, E. Chen, J.W. Bess, Jr., L.E. Henderson, L.O. Arthur, T.L. South, G. Perez-Alvardo, and M.F. Summers. (1992). Proc. Natl. Acad. Sci. USA. 89, 10041.
Child, G.P., and M. Grump. (1952). Acta Pharmacol. Toxicol. 8, 305.
Copeland, T.D., M.A. Morgan, and S. Oroszlan. (1984). Virology. 133, 137.
Dupraz, P., S. Oertl, C. Meric, P. Damay, and P.-F. Spahr. (1990). J. Virology. 64, 4978.
Gorelick, R.J., L.E. Henderson, J.P. Hanser, and A. Rein. (1988). Proc. Natl. Acad. Sci. USA. 85, 8420.
Gorelick, R.J., S.M. Nigida, J.W. Bess, L.O. Arthur, L.E. Henderson, and A. Rein. (1990). J. Virol. 46, 3207.
Gorelick, R.J., Chabot, D.J., Rein, A., Henderson, L.E. and Arthur, L.O. (1993). J. Virol. 67, 4027.
Gorelick, R.J., Chabot, D.J., Ott, D.E., Gagliardi, T.D., and Arthur, L.O. (1996). J. Virol. 70, 2593.
Henderson, L.E., T.D. Copeland, R.C. Sowder, II, G.W. Smythers, and S. Oroszlan. (1981). J. Biol. Chem. 256, 8400.
Henderson, L.E., Rice, W.G., and Arthur, L.O. (1995). US Patent Application USSN 08/312,331.
Lumper, L., and H. Zahn. (1965). Advan. Enzymol. 27, 199.
Meric, C., and S.P. Goff. (1989). J. Virol. 63, 1558.
Meric, C., E. Gouilloud, and P.-F. Spahr. (1988). J. Virol. 62, 3328.
Rein, A., D.E. Ott, J. Mirro, L.O. Arthur, W. Rice, and L.E. Henderson. (1996). J. Virol. 70, 4966.
Rice, W.G., Schaeffer, C.A., Harten, B., Villinger, F., South, T.L., Summers, M.F., Henderson, L.E., Bess, J.W. Jr., Arthur, L.O., McDougal, J.S., Orloff, S.L., Mendeleyev, J. and Kun, E. (1993). Nature 361, 473.
Rice, W.G., J.G. Supko, L. Malspeis, R.W. Buckheit, Jr., D. Clanton, M. Bu, L. Graham, C.A. Schaeffer, J.A. Turpin, J. Domagala, R. Gogliotti, J.P. Bader, S.M. Halliday, L. Coren, R.C. Sowder II, L.O. Arthur, and L.E. Henderson. (1995). Science. 270, 1194.
Summers, M.F., L.E. Henderson, M.R. Chance, J.W. Bess, Jr., T.L. South, P.R. Blake, I. Sagi, G. Perez-Alvardo, R.C. Sowder, II, D.R. Hare, and L.O. Arthur. (1992). Protein Science. 1, 563.
Towbin, H., T. Staehelin, and J. Gordon. (1979). Proc. Natl. Acad. Sci. USA. 76, 4350.
Tummino, P.J., J.D. Scholten, P.J. Harvey, T.P. Holler, L. Maloney, R. Gogliotti, J. Domagala, and D. Hupe. (1996). Proc. Natl. Acad. Sci. USA. 93, 969.
Acknowledgments
Research sponsored by the National Cancer Institute, Department of Health and Human Services (DHHS). The contents of this publication do not necessarily reflect the views or policies of the DHHS, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government.
The Identification and Isolation of Reactive Thiols in Ricin A-Chain and Blocked Ricin Using 2-(4'-Maleimidylanilino)naphthalene-6-sulfonic Acid
Mary E. Denton, Rita M. Steeves and John M. Lambert
ImmunoGen, Inc., Cambridge, MA 02139
I. Introduction
The identification of reactive or chemically modified residues of proteins is often extremely important for the characterization of proteins and their activity. Peptide mapping in conjunction with Edman sequencing and/or mass spectrometric analysis has been the method of choice to accomplish this characterization. However, this approach alone may not be sufficient or optimal for every situation, as was the case when trying to identify the affinity ligand attachment sites on the B-chain of blocked ricin (Lambert et al., 1991a). Ricin is a heterodimeric protein composed of a toxic A-chain, which is responsible for inhibiting cellular protein synthesis (Olsnes and Pihl, 1973), disulfide-linked to a B-chain, known to possess lectin activity (Baenziger and Fiete, 1979). In order to suppress the non-specific toxicity which arises from the interaction of the carbohydrate binding domains with cell-surface carbohydrates, the two carbohydrate binding pockets on the B-chain of ricin (Montfort et al., 1987) are covalently "blocked" using a modified glycopeptide containing an N-linked triantennary oligosaccharide, thus forming "blocked ricin" (Lambert et al., 1991a). Ricin thus modified has been incorporated as the effector portion of antigen-specific immunoconjugates currently in clinical trials (Lambert et al., 1991b; Grossbard et al., 1993). The glycopeptide is derived from a pronase digestion of the serum protein fetuin and is modified in two ways to form an affinity ligand for ricin B-chain (Lambert et al., 1991a). First, a dichlorotriazine group is linked to one terminal galactose moiety of the glycopeptide to provide a cross-linking group which can react with a nucleophilic residue on the B-chain once the carbohydrate has been bound. Second, a protected thiol is added to the peptide portion of the ligand. Thus, a free thiol can be easily generated for conjugation or labeling purposes.
Identification of the residues of ricin involved in the covalent linkage to the affinity ligand is important for complete characterization of the cytotoxic effector moiety and to ensure consistency of the immunoconjugate product. However, due to the intrinsic heterogeneity of the ligand, isolation of individual species of ligand-bound B-chain peptides by traditional peptide mapping has not been possible. In order to minimize the effect of the ligand heterogeneity, a different approach was conceived that exploits the protected thiol incorporated at the peptide "end" of the ligand. The thiol-specific probe 2-(4'-maleimidylanilino)naphthalene-6-sulfonic acid (MIANS¹) was used to label the ligand linked to B-chain in situ. MIANS was an attractive choice because its characteristic absorbance profile (Gupte and Lane, 1979; Andley et al., 1981) facilitates the identification of the peptides of interest. A monoclonal antibody recognizing MIANS was produced so that affinity chromatography could be used to isolate only those peptides of the B-chain that are cross-linked to the MIANS-labeled ligand. In order to test the specificity of the thiol labeling by MIANS and the general efficacy of this method to subsequently isolate and identify reactive thiols, the A-chain of ricin was used as a model protein. Although the A-chain contains two cysteine (Cys) residues, only the C-terminal Cys that forms the intermolecular disulfide linkage to the B-chain is accessible upon reduction of the native protein under the conditions employed (Montfort et al., 1987). If MIANS is indeed specific for reactive thiols and the anti-MIANS affinity column is effective, only MIANS-labeled peptides corresponding to the C-terminal sequence of ricin A-chain should be obtained (with a "blank" cycle at the Cys position upon automated Edman sequencing, due to its derivatization).
Here we report the results obtained using MIANS-labeling in conjunction with affinity chromatography to map free thiols in reduced, native ricin A-chain. We further describe the results obtained when utilizing this method to isolate ligand-bound ricin B-chain peptides.
II. Materials and Methods
A. Absorbance Measurements
Extinction coefficients used for ricin A-chain and blocked ricin B-chain at 280 nm for 0.1% solutions were 0.765 and 1.48, respectively (Olsnes and Pihl, 1973). The extinction coefficient for MIANS is 20,000 M⁻¹ cm⁻¹ at 322 nm (Gupte and Lane, 1979). The empirically determined contribution of MIANS to the absorbance of the labeled protein at 280 nm was 0.9 × A320.
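The absorbance values above can be combined into a MIANS-to-protein molar ratio. The Python sketch below shows one way to do that arithmetic under stated assumptions: a 1 cm path length, the 322 nm extinction coefficient applied to readings at 320 nm, and a protein molecular weight supplied by the user. The function name and the 30,000 g/mol figure in the example are hypothetical and are not taken from the chapter.

```python
def mians_per_protein(a280: float, a320: float, e01_280: float, protein_mw: float,
                      mians_eps: float = 20000.0, mians_280_factor: float = 0.9,
                      path_cm: float = 1.0) -> float:
    """Estimate moles of MIANS per mole of protein from two absorbance readings.

    a280, a320        absorbance of the labeled protein at 280 and 320 nm
    e01_280           absorbance of a 0.1% (1 mg/mL) protein solution at 280 nm
    protein_mw        protein molecular weight in g/mol (must be supplied by the user)
    mians_eps         MIANS extinction coefficient in M^-1 cm^-1
    mians_280_factor  MIANS contribution at 280 nm as a fraction of A320
    """
    # Remove the MIANS contribution from the 280 nm reading.
    corrected_a280 = a280 - mians_280_factor * a320
    if corrected_a280 <= 0:
        raise ValueError("corrected A280 is not positive; check the readings")
    # A 0.1% solution is 1 mg/mL, so this gives mg/mL, which equals g/L.
    protein_g_per_l = corrected_a280 / (e01_280 * path_cm)
    protein_molar = protein_g_per_l / protein_mw
    # Beer-Lambert law for the label.
    mians_molar = a320 / (mians_eps * path_cm)
    return mians_molar / protein_molar


if __name__ == "__main__":
    # Illustrative readings only; the molecular weight is an assumed round number.
    ratio = mians_per_protein(a280=0.4095, a320=0.2, e01_280=0.765, protein_mw=30000.0)
    print(f"MIANS per protein: {ratio:.2f}")
```

The correction step matters because MIANS absorbs at 280 nm as well; skipping it would overestimate the protein concentration and therefore underestimate the labeling ratio.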
¹Abbreviations: MIANS, 2-(4'-maleimidylanilino)naphthalene-6-sulfonic acid; PBS, phosphate buffered saline: 20 mM sodium phosphate containing 150 mM sodium chloride, pH 7.2; EDTA, ethylenediaminetetraacetic acid; DTT, dithiothreitol; GuHCl, guanidine hydrochloride; 2-ME, 2-mercaptoethanol; RB2L-MIANS, ricin B-chain covalently blocked by two affinity ligands and MIANS labeled; TFA, trifluoroacetic acid.
B. MIANS Labeling
Ricin A-chain (Inland Labs) at 3 mg/mL in PBS was reduced for 30 min with 30 mM DTT at 30°C. Excess DTT was removed by Sephadex G-25 gel filtration in 5 mM sodium acetate buffer, pH 4.7, containing 50 mM NaCl and 0.5 mM EDTA. The thiol content was assessed by Ellman's assay (Ellman, 1959). MIANS (Molecular Probes) was added at a ratio of 0.9 mole MIANS : 1.0 mole thiol. The reaction mixture was incubated at ambient temperature for 15 min after the pH was raised to 7.0 using 1 M Tris.HCl buffer, pH 7.4. The MIANS was quenched by addition of 1 mole equivalent of freshly prepared cysteine.HCl per mole MIANS and incubation for an additional 15 min. Any remaining thiol groups were alkylated by adding a 5-fold molar excess of iodoacetamide over total thiol, and after an additional 30 min the protein (MIANS-ricin A-chain) was dialyzed against 0.1 M Tris.HCl buffer, pH 8.5. Ricin containing two covalently attached ligands (blocked ricin) was produced according to Lambert et al. (1991a). Blocked ricin at 3 mg/mL was reduced with 4.5 mM DTT in PBS at pH 6.8 for 17 hours on ice (conditions that reduce only the ligand disulfide). The MIANS labeling, quenching and alkylation were performed as described above. The protein was dialyzed against 0.1 M Tris.HCl buffer, pH 7.7.
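The reagent ratios in this procedure translate directly into amounts once the thiol content is known from Ellman's assay. The Python sketch below illustrates the arithmetic under one stated assumption: "total thiol" for the iodoacetamide step is read as the measured protein thiol plus the cysteine added as quench. That reading, the function name and the class name are introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelingPlan:
    mians_nmol: float            # MIANS to add
    cysteine_quench_nmol: float  # cysteine.HCl to add as quench
    iodoacetamide_nmol: float    # iodoacetamide to add for final alkylation


def plan_labeling(thiol_nmol: float, mians_per_thiol: float = 0.9,
                  cysteine_per_mians: float = 1.0,
                  iodoacetamide_excess: float = 5.0) -> LabelingPlan:
    """Reagent amounts (nmol) for a measured thiol content (nmol).

    Assumption: total thiol = protein thiol + cysteine added as quench.
    """
    if thiol_nmol <= 0:
        raise ValueError("thiol content must be positive")
    mians = mians_per_thiol * thiol_nmol
    cysteine = cysteine_per_mians * mians
    total_thiol = thiol_nmol + cysteine
    iodoacetamide = iodoacetamide_excess * total_thiol
    return LabelingPlan(mians, cysteine, iodoacetamide)


if __name__ == "__main__":
    print(plan_labeling(100.0))
```

Using slightly less than one equivalent of MIANS keeps the probe limiting, so the cysteine quench and the iodoacetamide step only have to deal with leftover thiol rather than leftover label.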
C. Purification of RB2L-MIANS
The disulfide bond between the A and B chains of MIANS-labeled blocked ricin was reduced and the chains were separated using a modification of the method of Olsnes & Pihl (1973). Protein (approximately 10 mg) at 1 mg/mL in 0.1 M Tris.HCl buffer, pH 7.7, was reduced by incubating with 4.5% 2-ME for 20 hr at ambient temperature. After raising the pH to 8.5 with 0.1 M Tris base, the chains were separated by ion exchange chromatography on a 1.5 x 6.5 cm column of DE52 equilibrated in 0.1 M Tris.HCl buffer, pH 8.5, containing 0.1% 2-ME. After loading the protein, the column was washed with equilibration buffer (approximately 40 mL) followed by a wash with 0.1 M Tris.HCl buffer, pH 8.5 (40 mL). Ricin A-chain does not bind to the resin under these conditions. RB2L-MIANS was eluted with 0.1 M Tris.HCl buffer, pH 8.5, containing 1 M NaCl.
D. Enzymatic Digestion of Proteins
MIANS-Ricin A-chain and RB2L-MIANS were reduced and alkylated under denaturing conditions. To each protein solution, GuHCl and DTT were added to 6 M and 20 mM, respectively. Each protein was incubated at 37°C for 1 hr. MIANS-ricin A-chain was reduced for 5 hr, carboxymethylated with 100 mM iodoacetic acid for 30 min, and dialyzed against 0.1 M Tris.HCl buffer, pH 8.5, containing 2.0 M GuHCl. RB2L-MIANS was reduced overnight, carboxymethylated by addition of 100 mM iodoacetic acid for 1 hr, quenched with 2-ME and dialyzed against 0.1 M Tris.HCl buffer, pH 8.5. Enzymatic digestion was performed at a 1/20 (w/w) ratio of each enzyme (Endo Lys-C and chymotrypsin) to substrate. Endoproteinase Lys-C (Endo Lys-C, Wako) digestion was performed according to Riviere et al. (1991). The protein
was heated to 50°C for 30 min in a solution of 0.1 M Tris.HCl buffer, pH 8.5, containing 6 M GuHCl. After heating, the solution was diluted to 2 M GuHCl and enzyme was quickly added. Digestion was allowed to proceed for 6 hr at 37°C. At that time, chymotrypsin (sequencing grade, Boehringer Mannheim) was added and the incubation continued overnight. Digestion was stopped and the proteinases inactivated by adding the mixture rapidly to 3 volumes of boiling methanol (approximately 76 °C) for 3 min. Methanol was removed by rotary evaporation.
E. Affinity Purification
A murine monoclonal antibody, LG-85, specific for MIANS-labeled peptides and proteins was produced by the Hybridoma Development group at ImmunoGen, Inc. The antibody, an IgG1, was produced from the hybridoma grown as an ascites tumor in mice, and was purified from the ascites fluid by affinity chromatography over a Protein A-Sepharose column. The purified antibody was then used to prepare an LG-85-Protein G-Sepharose affinity column as follows. Antibody in PBS was added to resin packed in a column at a concentration of 2 mg LG-85/mL resin. After incubation for 90 min, the resin was washed with 0.2 M borate buffer, pH 9.0, and dimethylpimelimidate was added to a concentration of 20 mM. After cross-linking for 30 min at ambient temperature, the reaction was quenched by incubation for 30 min with 2 M ethanolamine.HCl, pH 7.8. The column was subsequently equilibrated in PBS. The peptide digests in 2 M GuHCl were diluted to 0.5 M GuHCl prior to loading onto the anti-MIANS column (LG-85-Protein G-Sepharose). After washing with 0.1 M Tris.HCl buffer, pH 8.5, followed by 0.1 M sodium citrate/sodium phosphate buffer, pH 2.9, the column was eluted with the citrate/phosphate buffer, pH 2.9, containing 4 M GuHCl. All fractions were evaluated for absorbance at 280 and 320 nm.
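Two steps in the Methods lower the GuHCl concentration by dilution: from 6 M to 2 M before adding enzyme, and from 2 M to 0.5 M before loading the affinity column. The volume of diluent follows from conservation of solute (C1·V1 = C2·V2). A minimal Python sketch of that calculation follows; the function name is hypothetical, and it assumes the diluent contains no GuHCl and that volumes are additive.

```python
def diluent_volume(sample_volume: float, c_initial: float, c_final: float) -> float:
    """Volume of GuHCl-free buffer to add so that c_initial drops to c_final.

    From C1 * V1 = C2 * (V1 + V_added); units of volume are whatever the caller uses.
    """
    if sample_volume < 0:
        raise ValueError("sample volume cannot be negative")
    if c_final <= 0 or c_initial < c_final:
        raise ValueError("need 0 < c_final <= c_initial")
    return sample_volume * (c_initial / c_final - 1.0)


if __name__ == "__main__":
    # Per 1 mL of sample: 6 M -> 2 M for digestion, 2 M -> 0.5 M for column loading.
    print(diluent_volume(1.0, 6.0, 2.0))
    print(diluent_volume(1.0, 2.0, 0.5))
```

In other words, the digestion step is a 3-fold dilution and the column loading step a 4-fold dilution of the respective starting solutions.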
F. Chromatography
Gel filtration chromatography using a Superdex Peptide HR 10/30 column (Pharmacia) was performed on a Hitachi System L-6200 Intelligent Pump equipped with an L-4200 UV/Vis Detector. Reverse Phase-HPLC (RP-HPLC) was performed on a Waters 625 LC System with a 991 photodiode array detector, using a (4.6 x 250 mm) Zorbax SB-300 C18 column (MacMOD).
G. Peptide Sequencing
Following affinity chromatography of the MIANS-ricin A-chain digest over the anti-MIANS column, the eluted fraction was run over the C18 RP-HPLC column utilizing gradient elution conditions. Individual peptides were collected for amino acid analysis (data not shown) and for sequencing. Following affinity chromatography of the RB2L-MIANS digest over the anti-MIANS column, the eluted fraction was desalted over the C18 RP-HPLC column.
All peptides of interest were analyzed for amino acid composition and/or for sequence by Dr. John Leszyk at the Worcester Foundation for Experimental Biology.
III. Results and Discussion
A. MIANS labeling
The reduction of ricin A-chain yielded 0.96 mole SH/mole protein. Following the labeling step, the ratio of MIANS/A-chain was 0.70 as determined by measurement of absorbance at 280 nm and 320 nm. The reduction of blocked ricin yielded 1.3 mole SH/mole protein. Following labeling and separation of the blocked B-chain, the ratio of MIANS/blocked B-chain was 0.97.
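The labeling ratios above can be related to the measured thiol content by simple division. The Python sketch below is a rough reading of the reported numbers and not an analysis from the chapter; the function name is hypothetical. Since MIANS was added at 0.9 mole per mole thiol, a fraction of about 0.9 is the most that could be expected under that stoichiometry.

```python
def fraction_of_thiol_labeled(mians_per_protein: float, thiol_per_protein: float) -> float:
    """Fraction of measured thiol that ended up carrying the MIANS label."""
    if thiol_per_protein <= 0:
        raise ValueError("thiol content must be positive")
    return mians_per_protein / thiol_per_protein


if __name__ == "__main__":
    # Values reported in the text (mole per mole protein).
    a_chain = fraction_of_thiol_labeled(0.70, 0.96)
    blocked = fraction_of_thiol_labeled(0.97, 1.3)
    print(f"ricin A-chain: {a_chain:.2f}")
    print(f"blocked ricin: {blocked:.2f}")
```

For blocked ricin the label was quantified on the isolated B-chain while the thiol was measured on the intact protein, so that comparison is only approximate.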
B. Peptide Mapping of MIANS-Ricin A-chain
Figure 1, Panel A shows the RP-HPLC chromatogram of the combined digest (Endo Lys-C followed by chymotrypsin) of MIANS-ricin A-chain detected at 214 and 320 nm. There are only 6 major peptides that are MIANS-labeled, as evidenced by the 320 nm profile. All of these peptides bind to and are eluted from the anti-MIANS column as shown in Panel B of Figure 1. Moreover, as the profile at 214 nm shows, there is no evidence of significant amounts of additional, unlabeled peptides sticking to the affinity column. Note also that the ratio of the peaks in the 320 nm chromatogram of Panel B is the same as that in the 320 nm chromatogram of Panel A, indicating that each of the MIANS-labeled peptides binds to the affinity column equivalently. Amino acid analysis was performed (data not shown) on the affinity purified material corresponding to peaks labeled 1 through 6 in the 320 nm chromatogram of Panel B. Peaks 1 through 5 have very similar compositions characterized by a strong proline (Pro) signal and the presence of Arg, Ser, Ala, and Glx. In addition to the strong Pro signal and the amino acid residues contained in Peaks 1 through 5, Peak 6 contained significant amounts of Phe, Tyr and Val. Neither Cys nor carboxymethyl-Cys was found in any of the amino acid analyses (as is consistent with MIANS-derivatization at this residue). No peptides that can be derived theoretically from an Endo Lys-C plus chymotrypsin combined digestion of the ricin A-chain, other than the C-terminal peptides, fit the determined amino acid composition profiles. Although the amino acid analysis data strongly suggested that all of the peptides isolated were derived from the proline-rich C-terminal portion of the ricin A-chain containing MIANS-Cys at position 283, it was necessary to perform automated sequencing on representative peptides to unambiguously identify the location of the MIANS modification and the composition of the peptides.
Two of the peptides from the affinity purified material, peaks 1 and 6, which are representative of the two amino acid analysis results, were sequenced. The sequences are shown in the 320 nm chromatogram of Panel B, Figure 1. Both peptides contain a blank cycle at the residue corresponding to the C-terminal Cys of the ricin A-chain, Cys 283, consistent with the likelihood that this residue is MIANS-labeled. The sequence of Peak 6 differs
from that of Peak 1 in that it contains two additional residues N-terminal to the sequence of Peak 1 and an additional residue, Phe, on its C-terminus. Neither the buried Cys nor any primary amine (the N-terminus or Lys side chains) has been labeled by MIANS. Taken together, these data indicate that the MIANS label is highly specific for accessible thiols. Using these digestion conditions, which had been optimized for the blocked ricin B-chain, has resulted in a heterogeneous peptide map, in part due to the incomplete chymotryptic cleavage at Tyr^^^ This is likely caused by interference by the MIANS label and, along with the C-terminal Phe residue in peak 6, accounts for the difference between the peaks 1 and 6. The heterogeneity within peaks 1-5 may be derived from either the presence or absence of the C-terminal Phe and/or from instability of the MIANS label. A likely product of the "breakdown" of the MIANS label is a form where the maleimide ring opens at one of the carbonyls. (See Summary and Conclusions below). Surprisingly, whatever instability may be associated with the MIANS label, all of the labeled peptides bind equivalently well to the affinity column.
C. Peptide mapping of the Ricin B-chain attachment sites
The RP-HPLC analysis of the combined digest of the ricin B-chain blocked with MIANS-labeled affinity ligand (RB2L-MIANS) is shown in Figure 2, Panel A. The 320 nm detected trace demonstrates the exceptional heterogeneity of the RB2L-MIANS peptides. Although there are some seemingly well-defined peaks in the chromatogram, attempts to sequence them directly have been unsuccessful. Panel B demonstrates the utility of the anti-MIANS column for the specific isolation of the RB2L-MIANS peptides. Here, the characteristic "double-humped" profile of the RB2L-MIANS peptides is apparent. Note that there is very little 214 nm (top trace, Panel B) absorbing material present. The relative absorbance of these peaks at 320 nm compared to 214 nm is much higher than for the MIANS-ricin A-chain peptides in Figure 1. Although the anti-MIANS column-eluted fraction is heterogeneous by RP-HPLC, the gel filtration chromatography analysis of the affinity purified RB2L-MIANS peptides shown in Figure 3 reveals only one major peak with a minor component eluting on the leading shoulder. A comparison with the elution profile of ligand itself from the same column demonstrates that the RB2L-MIANS peptides are significantly larger than the ligand alone (i.e., >2500 Da). [The minor peak at 20 min in the ligand profile corresponds to ligand dimer.] Given the rigorous digestion conditions (Endo Lys-C followed by chymotrypsin, in the presence of 2 M GuHCl), it is unlikely that this profile represents B-chain peptides without bound ligand. Indeed, sequencing of the anti-MIANS column-eluted fraction without further fractionation yielded two B-chain peptide sequences of <23 amino acids (manuscript in preparation), one sequence from each domain of the B-chain (Montfort et al., 1987).
These peptides, without bound ligand, would be expected to elute at retention times between 25 and 30 min on this gel filtration column on the basis of their molecular weights. Additional evidence that the peptides contained bound ligand comes from the fact that although both sequences contained one blank cycle determined by automated sequencing, neither of the
"missing" residues corresponded to a Cys. We conclude that the two B-chain peptides thus identified are covalently bound to ligand through the residues corresponding to the blank sequencing cycles, and that the ligand has been labeled with MIANS. By using the combined protocol of MIANS-labeling of the ligand and affinity chromatography using the anti-MIANS column, it has been possible to suppress the effect of the intrinsic heterogeneity of the ligand on the isolation of the ligand-bound B-chain peptides.
Figure 3. Gel filtration on Superdex Peptide column of anti-MIANS affinity column-purified peptides from RB2L-MIANS after enzymatic digestion, Panel A, and affinity ligand, Panel B. The column was run isocratically at 0.5 mL/min with 0.1 M Tris.HCl buffer, pH 8.5, containing 2 M GuHCl. The detection wavelengths were 320 nm for the digest (A) and 280 nm for the affinity ligand (B).
IV. Summary and Conclusions
We have demonstrated the use of the MIANS label in conjunction with affinity chromatography in two representative scenarios. In the first case, MIANS
was used to label an accessible thiol on the ricin A-chain. Following enzymatic digestion, the affinity chromatography step provided a means of isolating only the labeled thiol-containing peptides from the background of a peptide digest, making it possible to obtain clean samples for sequencing. The results indicated that MIANS was highly specific for thiols. In the second case, the ligand bound to ricin B-chain was labeled through its thiol with MIANS. Here, the heterogeneity of the ligand is responsible for the broad elution profile of the labeled ligand-containing B-chain peptides (from the enzymatic digest) observed by RP-HPLC. The use of the anti-MIANS affinity column has allowed us to isolate all B-chain peptides bound to MIANS-labeled ligand regardless of the inherent heterogeneity of the ligand. The gel filtration step using the Superdex Peptide column demonstrated that this fraction is relatively homogeneous in terms of size. One distinct advantage of this system, i.e., the use of the MIANS label in conjunction with the anti-MIANS affinity column, over a similar approach, e.g., using the biotin-avidin interaction, is that the MIANS label can be tracked at a wavelength, approximately 320 nm, which is distinct from the peptide or protein absorbance spectrum (Gupte and Lane, 1979; Andley et al., 1986). Another potential advantage is that the anti-MIANS column is reusable. The column has been routinely regenerated for multiple uses. Although the initial studies using ricin A-chain demonstrated that there may be some heterogeneity introduced along with the MIANS ligand, results from preliminary model compound studies (R. Singh, M. Denton and R. Steeves, ImmunoGen, Inc., unpublished data) suggest that it may be possible to drive the MIANS breakdown to a uniform species. This is currently being investigated. It is important to note, however, that the chemical changes that occur within the MIANS label have no effect on its binding to the anti-MIANS affinity column.
Acknowledgments
We are grateful to the Hybridoma Development Group for providing us with the anti-MIANS antibody. We also thank Dr. John Leszyk of the Worcester Foundation for performing the amino acid and sequencing analyses, and for many useful discussions.
References
Andley, U.P., Liang, J.N. and Chakrabarti, B. (1986). Biochemistry 2\\n52>-\%5%.
Baenziger, J.U. and Fiete, D. (1979). J. Biol. Chem. 254:9795-9799.
Blattler, W.A., Lambert, J.M., and Goldmacher, V.S. (1989). Cancer Cells 1(2):50-55.
Ellman, G.L. (1959). Arch. Biochem. Biophys. 83:70-77.
Grossbard, M.L., Lambert, J.M., Goldmacher, V.S., Spector, N.L., Kinsella, J., Eliseo, L., Coral, F., Taylor, J.A., Blattler, W.A., Epstein, C.L., and Nadler, L.M. (1993). J. Clin. Oncol. 11:726-737.
Gupte, S.S. and Lane, L.K. (1979). J. Biol. Chem. 254:10362-10367.
Lambert, J.M., McIntyre, F., Gauthier, M.N., Zullo, D., Rao, V., Steeves, R.M., Goldmacher, V.S., and Blattler, W.A. (1991). Biochemistry 30:3244-3247.
Lambert, J.M., Goldmacher, V.S., Collinson, A.R., Nadler, L.M., and Blattler, W.A. (1991). Can. Res. 51:6236-6242.
Montfort, W., Villafranca, J.E., Monzingo, A.F., Ernst, S.R., Katzin, B., Rutenber, E., Xuong, N.H., Hamlin, R., and Robertus, J.D. (1987). J. Biol. Chem. 262:5398-5403.
Olsnes, S. and Pihl, A. (1973). Biochemistry 12:3121-3126.
Riviere, L.R., Fleming, M., Elicone, C., and Tempst, P. (1991). In Techniques in Protein Chemistry II (Villafranca, J.J., Ed.) pp. 171-179, Academic Press, San Diego, CA.
Inactivation of the Human Cytomegalovirus Protease by Diisopropylfluorophosphate Thomas Hesson, Anthony Tsarbopoulos, S. Shane Taremi, Winifred W. Prosise, Nancy Butkiewicz, Bimalendu DasMahapatra, Michael Cable, Hung Van Le and Patricia C. Weber Departments of Structural Chemistry, Analytical Chemistry and Virology, Schering Plough Research Institute, Kenilworth, NJ 07033
I. Introduction
Human cytomegalovirus (CMV) encodes a 256 amino acid maturational protease, which cleaves the viral assembly protein precursor at the maturation site, releasing its 64 residue carboxy terminal sequence to yield mature assembly protein (1). The assembly protein, found in immature capsids, is thought to function as a scaffolding protein and is essential for virion maturation (2,3). CMV protease, the amino terminal portion of the open reading frame expressing the assembly protein precursor, also cleaves at a site closer to the amino terminus (the release site), releasing the 28 kDa active protease (4). Autoproteolysis can also occur (albeit at a much slower rate) within the active protease domains, notably between Ala143 and Ala144 and between Ala209 and Ser210 (5,6). CMV protease is a serine protease, and labeling with diisopropylfluorophosphate (DFP) has identified Ser132 as the active site serine (4,5). The structure of the CMV protease containing the diisopropylphosphorylserine at residue 132
(DIP-CMV protease) is likely to resemble that of the tetrahedral transition-state intermediate (7). The DIP-CMV protease would also not be susceptible to further autoproteolysis. For these reasons it would be useful to produce pure DIP-CMV protease to crystallize for structural studies. However, it had been found that concentrations of DFP sufficient to yield stoichiometric incorporation of inhibitor at the active site of CMV protease also resulted in substantial incorporation of DIP at a second site or sites (4). This heterogeneous incorporation would preclude crystallographic studies. We have therefore attempted to optimize the conditions for inactivation of CMV protease with DFP, to produce pure DIP-CMV protease with minimum second site incorporation.
II. Materials and Methods
Protease Modification with DFP: The purified double mutant (A143V/V207A) of human CMV protease was inactivated with the indicated concentrations of DFP at 22° C in Buffer A (25 mM Tris-Cl buffer, pH 7.8, 0.15 M NaCl, 10% glycerol and 10 mM DTT). The inactivation was initiated by the addition of 3 µl of isopropanol (control) or DFP stock in isopropanol to 200 µl aliquots of the protease (180-200 µM). The reactions were allowed to proceed for the indicated times, and were stopped by separating the free DFP from protease on 5 ml polyacrylamide desalting columns (Pierce; Speedy™; 6000 mwco) which were equilibrated and eluted with Buffer B (50 mM sodium Hepes buffer, pH 7.7, 0.15 M NaCl, 10% glycerol and 10 mM DTT). Approximately 0.25 ml fractions were collected, and protease essentially free of unreacted DFP was recovered in the void volume within 3 minutes of application. The void pools were dialyzed versus Buffer B at 4° C, and aliquots were frozen at -80° C. Protein concentrations were determined by the method of Bradford, using the Biorad kit and BSA fraction V as a standard. Because of the severe parasympathomimetic effects of DFP, all work with the inhibitor was done double gloved under a fume hood, and contaminated containers and surfaces were washed with 2N NaOH in 20% ethanol.
Protease assay: CMV protease activity was measured using the surface plasmon resonance technology of the BIAcore™ instrument (Pharmacia Biosensor). Briefly, this instrument was used to quantitate residual uncleaved biotinylated maturation site substrate which also contained a phosphotyrosine at its C-terminus: biotin-RGVVNASCRLApY. CMV protease was incubated with substrate at 22° C in 50 mM sodium Hepes buffer, pH 7.4, 25% glycerol and 1 mM DTT. The reaction was quenched with 1 volume of 50 mM sodium Hepes, pH 7.8, 0.15 M NaCl, 5 mM p-hydroxymercuribenzoic acid and 13 mM EDTA. The quenched solution was mixed with 0.75 volumes each of 50 mM sodium Hepes, pH 7.4, 1 M NaCl and 0.5 mg/ml streptavidin in water. The sample was then injected across a sensor chip to which antiphosphotyrosine monoclonal antibody had been coupled. A decrease in the streptavidin signal corresponded to a decrease in the concentration of uncleaved substrate.
Electrospray mass spectrometry: Electrospray (ES) (8) mass spectra were acquired on a Perkin-Elmer Sciex (Concord, Canada) API III triple-quadrupole mass spectrometer equipped with an atmospheric pressure nebulization-assisted electrospray (ion-spray) source (9). The ion-spray needle was held at a 4.5 kV potential with an orifice voltage of 100 V. Aliquots of control and DFP-treated CMV protease were dialyzed versus 4 changes of 0.4% ammonium bicarbonate, frozen, and then dried in a Savant SpeedVac™ concentrator with no heat applied. The dried samples were dissolved in 9:1 0.1% aqueous trifluoroacetic acid (TFA):acetonitrile to a concentration in the low µM range (10-20 pmol/µL). The samples were then infused into the ion-spray source of the mass spectrometer at a flow rate of 5 µL/min for the duration of several full-scan spectra. The final spectrum was an averaged sum of several scans (10-20) from m/z 500 to 2400 (MCA mode) at a scan rate of 3 s/scan. The average molecular weight (Mr) was derived from all the observed charge states with an accuracy of 2-3 Da from the expected mass.
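The derivation of an average Mr from a series of multiply charged ions can be illustrated with a short sketch. It is not the instrument software used by the authors; it assumes that every ion is a simple protonated species [M + zH]z+, and the proton mass constant and the function names are introduced here for illustration only.

```python
# Minimal sketch of charge-state arithmetic for an electrospray spectrum.
# Assumption: each peak is [M + zH]^z+ (protonation only, no adducts).
PROTON_MASS = 1.00728  # Da; standard value, not taken from the chapter


def mz_for_charge(mass, z):
    """m/z expected for a neutral mass carrying z protons."""
    return (mass + z * PROTON_MASS) / z


def charge_from_adjacent_peaks(mz_low, mz_high):
    """Charge of the peak at mz_high, given its neighbour at mz_low.

    The neighbour at lower m/z carries one more proton (z + 1).
    """
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


def mass_from_peak(mz, z):
    """Neutral mass implied by a single peak of known charge."""
    return z * (mz - PROTON_MASS)


def average_mass(peaks_descending_mz):
    """Average neutral mass from consecutive charge states.

    peaks_descending_mz: m/z values sorted from high to low, one per
    consecutive charge state.
    """
    z0 = charge_from_adjacent_peaks(peaks_descending_mz[1], peaks_descending_mz[0])
    masses = [mass_from_peak(mz, z0 + i) for i, mz in enumerate(peaks_descending_mz)]
    return sum(masses) / len(masses)


# Charge states +12 to +34 of a 27,910 Da protein (values quoted in the text).
envelope = [mz_for_charge(27910.0, z) for z in range(12, 35)]
print(min(envelope), max(envelope))  # both fall inside the m/z 500-2400 scan window
print(average_mass(envelope))
```

With real data the individual mass estimates scatter by a few Da, which is why the estimates from all observed charge states are averaged.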
III. Results and Discussion
Initial studies indicated that there was no loss of protease activity in the control samples, even when incubated for 23 hours at 22° C (data not shown). A 3.8 hour incubation was sufficient to completely inactivate 180 µM protease in the presence of 4.3 mM DFP, while a 2.4 fold excess of the inhibitor inactivated only 50% of the enzyme (Table I). This indicates that CMV protease is less reactive with DFP than trypsin or chymotrypsin (10), and requires an order of magnitude excess of the organophosphate in order to achieve complete inactivation.

Table I. Modification of the CMV Protease with DFP

[DFP]     mol DFP/        Incubation Time    Control Activity
(mM)      mol Protease    (hours)            (%)
0         0               3.0                90
0.084     0.46            3.2                90
0.43      2.4             3.4                52
4.3       24              3.8                Not Detectable
43        240             4.0                Not Detectable

CMV protease samples (180 µM) were incubated with various concentrations of DFP for the times indicated, and then assayed for protease activity as described in Materials and Methods.
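The molar ratios in Table I follow directly from the two concentrations, and the stock concentration needed for a given final DFP concentration follows from the 3 µl into 200 µl addition described in Materials and Methods. The sketch below is only an illustration of that arithmetic; the function names are hypothetical, and the stock calculation assumes that the volumes are additive.

```python
# Illustrative arithmetic for Table I (function names are hypothetical).


def molar_excess(dfp_mM, protease_uM):
    """mol DFP per mol protease for the given concentrations."""
    return dfp_mM * 1000.0 / protease_uM


def stock_needed_mM(final_mM, sample_ul=200.0, added_ul=3.0):
    """DFP stock concentration giving final_mM after addition to the sample.

    Assumes additive volumes (3 ul of stock into 200 ul of protease).
    """
    return final_mM * (sample_ul + added_ul) / added_ul


for dfp in (0.084, 0.43, 4.3, 43.0):
    ratio = molar_excess(dfp, 180.0)
    print(f"{dfp:>6} mM DFP, 180 uM protease: {ratio:.2f} mol DFP/mol protease")

print(f"stock for 4.3 mM final: {stock_needed_mM(4.3):.0f} mM")
```

The computed ratios agree with the tabulated values to within rounding.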
Since incubation of the protease with a large excess of DFP could nonspecifically phosphorylate tyrosine residues (10), the control and inactivated protease samples were analyzed by ES mass spectrometry (8). The ES mass spectrum of a fully active control sample of CMV protease is shown in Figure 1, where a series of multiply charged ions (from +12 to +34) provided an average Mr value of 27,910 (calculated Mr = 27,909). The deconvoluted spectrum (Fig. 1, inset) shows that only 30% of the protease was full length (observed
Mr = 28,040). The remaining 70% corresponds to the protease lacking the amino terminal methionine residue (desMet-protease; Mr = 27,910). Figure 2 compares the deconvoluted spectrum of a control protease sample (Panel A) with that of a protease sample (180 µM) which had been completely inactivated by a 3-hour incubation with 4 mM DFP (Panel B). As expected, no unsubstituted protease was detectable in the spectrum in Panel B.
Figure 1. Electrospray Mass Spectrometry of CMV Protease. Analysis of the CMV protease by ES MS as described in Materials and Methods yielded a mass spectrum containing an envelope of multiply charged ions ranging from the +12 to the +34 charge state, giving rise to an average Mr of 27,910, which is in excellent agreement with the calculated mass value (27,909) of CMV protease lacking the N-terminal methionine (desMet-protease). It should be noted that each multiply charged signal was accompanied by a satellite signal corresponding to a CMV protease variant having an N-terminal Met residue (full length protease). This is clearly shown in the deconvoluted mass spectrum (inset), where the signals at Mr 27,910 and 28,040 correspond to the desMet- and the full length forms of the CMV protease, respectively.
(Panel A: 0 mM DFP. Panel B: 4.0 mM DFP. Horizontal axis: Mr.)
Figure 2. Mass Spectra of CMV Protease Modified by DFP. The spectra above are the deconvoluted ES mass spectra of 180 µM CMV protease samples incubated for 3 hours with (Panel B) and without 4 mM DFP (Panel A). No activity was detectable in the sample incubated with DFP, and as expected, no unsubstituted protease was detectable in this spectrum. Both full length and desMet-protease containing 1 mole diisopropylphosphoryl (DIP) group per mole of protease (Mr = 28,205 and 28,074 respectively) were present, as well as a large amount of desMet-protease containing 2 moles of DIP per mole of protease (Mr = 28,238). This diphosphorylated species represents at least 10% of the inactivated desMet-protease. A small peak corresponding to the diphosphorylated full length protease is apparent as well. There was no evidence of a difference in DIP incorporation between the desMet- and full length CMV protease species. The desMet-protease is used for comparison because of the larger amount involved.
However, both full-length and desMet-protease containing 1 mole diisopropylphosphoryl (DIP) group per mole of protease (calculated Mr of 28,205 and 28,074, respectively) were present, as well as the desMet-protease containing 2 moles of
DIP per mole of protease (calculated Mr = 28,238). This diphosphorylated species represents at least 10% of the inactivated desMet-protease. In order to confine DIP incorporation to the active site of the protease, 200 µM protease was incubated for 2 hours with a series of DFP concentrations ranging from 0.3 to 2 mM. The deconvoluted ES mass spectra of the samples (Figures 2 and 3) and the activity data (Figure 4 and Table II) show a correlation between the appearance of the DIP-CMV species and protease inactivation, indicating that DIP-CMV protease has incorporated the diisopropylphosphoryl group at its active site serine.
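The assignment of deconvoluted peaks to mono- and diphosphorylated species is simple mass bookkeeping, sketched below. The 164 Da increment per DIP group is taken as the difference between the masses quoted in the text (28,074 and 27,910); the 3 Da tolerance mirrors the stated mass accuracy. The function and dictionary names are hypothetical.

```python
# Sketch: assign a deconvoluted mass to a protease form and DIP count.
# All masses are those quoted in the text; names are illustrative only.
BASE_MASSES = {"desMet": 27910.0, "full length": 28040.0}
DIP_INCREMENT = 28074.0 - 27910.0  # 164 Da per diisopropylphosphoryl group


def expected_mass(form, n_dip):
    """Expected Mr of a protease form carrying n_dip DIP groups."""
    return BASE_MASSES[form] + n_dip * DIP_INCREMENT


def assign(observed, max_dip=3, tolerance=3.0):
    """Return (form, n_dip) of the closest match within tolerance, else None."""
    best = None
    for form in BASE_MASSES:
        for n in range(max_dip + 1):
            error = abs(observed - expected_mass(form, n))
            if error <= tolerance and (best is None or error < best[0]):
                best = (error, form, n)
    return None if best is None else (best[1], best[2])


for mass in (27910, 28040, 28074, 28205, 28238):
    print(mass, assign(mass))
```

A peak that matches neither form within the tolerance is left unassigned rather than forced onto the nearest species.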
Figure 4. Effect of DFP Treatment on Protease Activity. Serial dilutions of CMV protease standard and DFP-treated protease samples (5-0.005 µM) were assayed for their cleavage activity of the biotinylated peptide substrate for 2 hours, as described in the legend of Table I. The amount of intact substrate remaining after quenching the reactions was monitored on the BIAcore.
A protease species with 2 DIP groups incorporated is also apparent in samples incubated with 1.26, 1.63 and 2.07 mM DFP, although even in the sample incubated with 2.07 mM DFP, the diphosphorylated components represent less than 5% of the inactivated protease. Based on Figure 3, the
residual unmodified protease present in the samples incubated with 1.63 and 2.07 mM DFP is also less than 2%. Consequently, incubation of 200 µM CMV protease at 22° C with 1.5 to 2 mM DFP was chosen as the optimum condition for producing diisopropylphosphoryl-CMV protease suitable for crystallography.

Table II. Summary of DFP Inhibition of CMV Protease

DFP Concentration    mol DFP/mol     Activity
(mM)                 Protease        (%)
0.33                 1.70            69
0.62                 3.10            36
1.26                 6.30            14
1.63                 8.50            6
2.07                 10.4            4
4.00                 22.0            0

Summary of CMV protease activity data presented in Figure 4 and the legend of Figure 2. The activities correlate well with the deconvoluted mass spectra presented in Figures 2 and 3.
IV. Conclusions
CMV protease was inactivated by DFP under conditions optimum for its activity and stability (pH 7.8, 10% glycerol and 10 mM DTT). The results of electrospray mass spectrometry and protease activity measurements indicated that a 4 hour incubation of 180 µM CMV protease with 4 mM DFP completely inactivated the protease, but that within 3 hours nearly 10% of the protease had incorporated a second diisopropylphosphoryl (DIP) moiety. After performing 2 hour inactivations under the same conditions, but with lower DFP concentrations, it was found that 1.5 to 2 mM DFP was optimum for producing 98% DIP-CMV protease, which contained less than 5% protein with the second site labeled.
Since the structure of the DFP-treated serine proteases resembles that of the tetrahedral transition-state intermediate, and the inactivated enzyme would not be susceptible to autoproteolysis, production of DIP-CMV protease would be useful for structure based drug design.
References
1. Burck, P. J., Berg, D. H., Luk, T. P., Sassmannshausen, L. M., Wakulchik, M., Smith, D. P., Hsiung, H. M., Becker, E. W., Gibson, W., and Villarreal, E. C. (1994). J. Virol. 68, 2937-2946.
2. Gibson, W. (1981). Virol. 111, 516-537.
3. Irmiere, A., and Gibson, W. (1983). Virol. 130, 118-133.
4. Stevens, J. T., Mapelli, C., Tsao, J., Hail, M., O'Boyle II, D., Weinheimer, S. P., and Diianni, C. L. (1994). Eur. J. Biochem. 226, 361-367.
5. Holwerda, B. C., Wittwer, A. J., Duffin, K. L., Smith, C., Toth, M. V., Carr, L. S., Wiegand, R. C., and Bryant, M. L. (1994). J. Biol. Chem. 269, 25911-25915.
6. O'Boyle II, D. R., Wager-Smith, K., Stevens III, J. T., and Weinheimer, S. P. (1995). J. Biol. Chem. 270, 4753-4758.
7. Stroud, R. M., Kay, L. M., and Dickerson, R. E. (1974). J. Mol. Biol. 83, 185-208.
8. Fenn, J. B., Mann, M., Meng, C. K., Wong, S. F., and Whitehouse, C. M. (1990). Mass Spectrom. Rev. 9, 37-70.
9. Covey, T. R., Bonner, R. P., Shushan, B. I., and Henion, J. D. (1988). Rapid Commun. Mass Spectrom. 2, 249-256.
10. Cohen, J. A., Oosterbaan, R. A., and Berends, F. (1967). Meth. Enzymol. 11, 686-702.
Studies on the Status of Arginine Residues in Phospholipase A2 from Naja naja atra (Taiwan cobra) Snake Venom
C. C. Yang, T. S. Yuo and C. Y. Chen
Institute of Life Sciences, National Tsing Hua University, Hsinchu, Taiwan 30043, ROC
I. Introduction
The enzyme phospholipase A2 (PLA2, EC 3.1.1.4) is widely distributed among various species in the animal kingdom, notably in the pancreatic tissues of mammals and venoms from snakes and bees. It specifically catalyzes the hydrolysis of the acyl-ester bond at the sn-2 position of 1,2-diacyl-3-sn-phosphoglycerides in the presence of Ca2+ (Dennis, 1983). The enzyme activity of PLA2 is strongly dependent on the status of the substrate. Although PLA2 enzymes hydrolyze monomeric substrates, their activities increase by several orders of magnitude when the substrates are in aggregated form. This result has led to the hypothesis that there is a specific interfacial recognition site at the N-terminal region with specific affinity for lipid-water interfaces (Pieterson et al., 1974; van Dam-Mieras et al., 1975). Several studies have revealed that the N-terminal region is important for the activity of PLA2 enzymes (Dijkstra et al., 1981; van Eijk et al., 1984; Haruki et al., 1986; Oda et al., 1986). The X-ray crystallographic analysis of PLA2 enzymes (Dijkstra et al., 1983; Brunie et al., 1985; Scott et al., 1990; White et al., 1990) showed that a hydrogen bonding network connects the active site to the N-terminal region. Renetseder et al. (1985) suggested that the hydrogen bonding network is an activation network which could stabilize the productive conformation of PLA2 enzymes for aggregated substrates. Yang and Chang
(1988) suggested that the N-terminal region mediates the effects of lipid binding to the active site through the hydrogen bonding network, and causes architectural changes at the active site which result in an active conformation. The structural features of PLA2 shown by chemical modification to be important for enzymatic activity are histidine (Volwerk et al., 1974; Yang and King, 1980), aspartic acid (Fleer et al., 1981; Yang et al., 1983), tryptophan (Yang and Chang, 1984; Chang et al., 1993), tyrosine (Yang et al., 1985; Yang and Lee, 1986; Soons et al., 1986), and lysine residues (Yang et al., 1982; Yang and Chang, 1989). Therefore, a study of the role of arginine residues might throw some light on the catalytic mechanism of PLA2 enzymes. PLA2 from N. naja atra (Taiwan cobra) snake venom is an acidic single chain polypeptide consisting of 119 amino acid residues and contains five Arg residues at positions 16, 30, 42, 94 and 117. In this study, the Arg residues of N. naja atra PLA2 were selectively modified with phenylglyoxal (PG), and the modified derivatives were separated by HPLC. Based on changes in physicochemical and biological properties after Arg modification, the possible role played by Arg residues in N. naja atra PLA2 is discussed.
II. Materials and Methods
1. Materials
PLA2 from Naja naja atra (Taiwan cobra) venom was isolated and purified as previously described (Yang et al., 1981). Phenylglyoxal was purchased from Aldrich Chemical Co., and 8-anilinonaphthalene sulfonate (ANS) was obtained from Pierce Chemical Co. The SynChropak RP-18 column (4.6 mm x 25 cm) was obtained from Synchrom. All other reagents were of analytical grade.
2. Chemical Modification of Arg Residues with Phenylglyoxal (PG)
PLA2 was modified with PG according to the procedure described by Takahashi (1968). One micromole of PLA2 in 1 ml of 0.1 M sodium borate buffer (pH 9.0) was incubated with 100-fold molar excess of PG. The reaction was allowed to proceed for 2 hr at 37°C, then quenched by the addition of a few drops of acetic acid. The modified proteins were immediately desalted by passing through a Sephadex G-25 column equilibrated with 0.1 M acetic acid and the protein fraction was lyophilized. The modified proteins were separated by HPLC on a SynChropak RP-18 column (4.6 mm x 25 cm), equilibrated with 0.1% TFA and eluted with a linear gradient of 28-35% acetonitrile for 35 min. Flow rate was 0.8 ml/min and the effluent was monitored at 280 nm.
3. Identification of the Modified Arg Residues
To determine the positions of the Arg residues modified in the sequence of PLA2, the PG derivatives were reduced and carboxymethylated (RCM) according to the procedure described by Crestfield et al. (1963). The RCM-proteins (1.4 mg) were dissolved in 0.2 M ammonium bicarbonate buffer at pH 7.8 and trypsin was added (30:1, w/w). Digestion was carried out at 37°C for 3 hr and the hydrolysates were lyophilized. The tryptic hydrolysates were separated by HPLC on a SynChropak RP-18 column (4.6 mm x 25 cm) equilibrated with 0.1% TFA and eluted by a linear gradient of 0-35% acetonitrile for 105 min.
4. Circular Dichroism (CD)
CD spectra of native and Arg-modified PLA2 were measured from 190 to 250 nm on a Jasco J-700 spectropolarimeter at a concentration of 0.285 mg/ml in 10 mM Tris buffer (pH 8.0) with a cell path-length of 0.1 cm. The estimated fraction [%] of the spectra were obtained by signal averaging five scans.
5. Other Tests
Determination of PLA2 activity and antigenicity, polyacrylamide-gel electrophoresis, amino acid analysis and sequence determination, Ca2+-induced difference spectroscopy and fluorescence measurement were performed in essentially the same manner as previously described (Yang and Chang, 1984, 1988; Chang and Yang, 1988).

Figure 1. Separation of Arg-modified PLA2 on a SynChropak RP-18 column. The column (4.6 mm x 25 cm) was equilibrated with 0.1% TFA and eluted with a linear gradient of 28-35% acetonitrile for 35 min. Flow rate was 0.8 ml/min and the effluent was monitored at 280 nm.
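The fluorescence titrations listed under Other Tests are analyzed in the Results as plots of fluorescence intensity F against F/[ANS], whose slope gives the dissociation constant. A minimal sketch of that linear fit follows. It assumes a single class of binding sites and that the free ANS concentration can be approximated by the total concentration; the data are synthetic (Kd set to 85 µM purely for illustration), and the function names are hypothetical.

```python
# Sketch of a Scatchard-type fit: F = Fmax - Kd * (F / [ANS]).
# Assumptions: one class of sites; free [ANS] ~ total [ANS]; synthetic data.


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_kd(ans_uM, fluorescence):
    """Return (Kd in uM, Fmax) from a plot of F versus F/[ANS]."""
    ratios = [f / c for f, c in zip(fluorescence, ans_uM)]
    slope, intercept = linear_fit(ratios, fluorescence)
    return -slope, intercept


# Synthetic titration over the 5-50 uM dye range used in the chapter.
true_kd, f_max = 85.0, 100.0
dye = [5.0, 10.0, 20.0, 30.0, 40.0, 50.0]
signal = [f_max * c / (true_kd + c) for c in dye]

kd, fmax = fit_kd(dye, signal)
print(f"Kd = {kd:.1f} uM, Fmax = {fmax:.1f}")
```

With measured data the points scatter about the line, and a curved plot would suggest more than one class of sites or a breakdown of the free-ligand approximation.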
III. Results and Discussion
A. Chemical Modification of PLA2 by Phenylglyoxal (PG)
PLA2 from N. naja atra snake venom was modified with 100-fold molar excess of PG at pH 9.0 for 2 hr. The modified proteins were separated by HPLC on a SynChropak RP-18 column and two major modified derivatives were isolated (Figure 1). Polyacrylamide gel electrophoresis revealed that both modified derivatives are homogeneous and move slower than the native PLA2. Amino acid analysis showed that only one Arg residue is modified in each of the PG-1 and PG-2 derivatives. In order to determine the positions of the Arg residues modified in the sequence of N. naja atra PLA2, PLA2 and the Arg-modified derivatives were digested with trypsin after reduction and carboxymethylation. The tryptic
Table I. Amino Acid Compositions of Tryptic Peptides of RCM-PLA2 and Arg-modified PLA2ᵃ
Asp Glu Gly Ser His Arg Thr Ala Pro Tyr Val Met l/2Cys-Cys He Leu Phe Lys Corresponding peptide Arg-modified
A
B
C
D
5.6 (6) 1.0(1) 0.6(1)
0.9(1) 0.8(1)
5.5 (6) 0.8(1) 0.7(1)
0.8(1) 0.9(1)
2.1(2) 1.1 (1) 2.8 (3) 1.1(1) 1.6(2)
0.2(1) 0.9(1) 1.0(1)
1.0 (1) 0.9(1) 1.1(1) 0.7(1) 0.6(1) 0.3(1) 0.7(1)
0.8(1) 2.7 (3) 1.0(1) 1.5 (2)
0.2 (1) 0.8(1) 0.8(1)
1-2(1) 1.0(1) 0.7(1) 0.2 (1) 0.7(1)
0.9(1)
0.6(1) 101-119
2.3 (2)
7-19
101-119
7-19
Arg-117
Arg-16
ᵃThe peptides obtained from Figure 2 were analyzed.
Figure 2. HPLC profiles of the tryptic digests of native (1), and Arg-modified derivatives PG-1 (2) and PG-2 (3). The tryptic hydrolysates were separated by HPLC on a SynChropak RP-18 column (4.6 mm x 25 cm) equilibrated with 0.1% TFA and eluted with a linear gradient of 0-35% acetonitrile for 105 min. Horizontal axis: Time (min).
NLYQFKNMIQCTVPSRSWWDFADYGCYCGRGGSGTPVDDL   40
DRCCQVHDNCYNEAEKISGCWPYFKTYSYECSQGTLTCKG   80
GNNACAAAVCDCDRLAAICFAGAPYNNNNYNIDLKARCQ   119

Figure 3. Amino acid sequence of N. naja atra PLA2 (Tsai et al., 1981; Pan et al., 1994) and tryptic peptides A, B, C and D.
hydrolysates were separated by HPLC on a SynChropak RP-18 column and the elution profiles are shown in Figure 2. The results of amino acid analysis show that peaks A and B in the native PLA2, which represent the peptide fragments at positions 101-119 and 7-19 (Table I), respectively, shifted to the respective peptides C and D on the chromatographic profiles of the modified derivatives PG-1 and PG-2 (Figure 2). Amino acid analysis and sequence determination of peaks C and D (Figure 3) revealed that they were derived from the peaks A and B in the native PLA2, but were missing an Arg residue at positions 117 and 16. These results unambiguously indicate that Arg-117 and Arg-16 are modified in derivatives PG-1 and PG-2, respectively.
B. Properties of Arg-modified Derivatives
As shown in Table II, a precipitous drop in enzymatic activity to 25.4% of that of the native PLA2 was observed when Arg-16 was modified. Modification of Arg-117 resulted in a decrease in enzymatic activity by only 12.1%. However, the antigenic activities of both modified derivatives remained unchanged. The Arg-modified derivatives also enhanced the emission intensity of ANS dramatically, and the emission intensity of the ANS-enzyme complex increased in parallel with increasing concentration of Ca2+ until the saturation level was reached. The dissociation constant of Ca2+ to ANS-enzyme complex (Kd) was

Table II. Properties of the Native and Arg-modified PLA2

                     Enzymatic activity (%)    Antigenic activity (%)    pI
Native PLA2          100                       100                       5.2
Arg-117 modified     87.9                      97                        5.0
Arg-16 modified      25.4                      95                        5.0
Table III. Physical Properties of Native and Arg-modified PLA2

                     Ca2+ binding    ANS binding (Kd, µM)
                     (Kd, mM)        with 10 mM Ca2+    without Ca2+
Native PLA2          0.53            85                 180
Arg-117 modified     0.50            70                 127
Arg-16 modified      0.71            58                 98
calculated by Scatchard plots derived from the degree of saturation of the emission intensity (Table III). In order to find out whether there is specific interaction between ANS and the enzymes, Scatchard plots were obtained by titration of 5 µM protein with increasing dye concentration from 5 to 50 µM in the presence or absence of Ca2+ (10 mM). A plot of the intensity of fluorescence [F] vs. [F]/[ANS] gave lines with slopes corresponding to the respective dissociation constant of the complex. As listed in Table III, the binding ability with ANS changed slightly after Arg modification. The facts that the Arg-modified derivatives are similar to native PLA2 in their affinity for Ca2+, and that the Scatchard plots revealed only one kind of Ca2+ binding site, indicate that the loss of the biological activity was not related to cofactor binding. CD spectra of Arg-modified derivatives are similar to that of PLA2, reflecting the fact that modification does not affect the secondary structure of PLA2. The antigenic activity of PLA2 remained unchanged after Arg modification at position 16 or 117. Therefore, the observed loss of biological activity is more likely due to the disappearance of an electrostatic interaction between the PLA2 and substrates than to a conformational change of the enzyme molecule. Scott et al. (1990) showed that the catalytic event occurred on a rigid internal surface that is well shielded from bulk solvent. In their discussion of the hydrophobic channel and the interfacial recognition surface, they suggested that the phospholipid molecule undergoing hydrolysis leaves the aggregate and reaches this catalytic surface by facilitated diffusion through a hydrophobic channel whose opening is in the interfacial binding surface. Inspection of the tertiary structure of N. naja atra PLA2 shows that the residues at positions 2, 3, 5, 6, 9, 18, 30 and 63 constitute the hydrophobic channel (Figure 4) and are involved in the interaction of the enzyme molecule with phospholipid substrates (White et al., 1990; Chang et al., 1993). Thus, it is conceivable that the modification of Arg-16 might directly distort the binding ability of PLA2 with substrate, viz. perturbs the phosphate portion of the substrate,
Figure 4. Tertiary structure of N. naja atra PLA2 showing the relative location of the Arg residues and the residues of which the "hydrophobic channel" is composed. The model is based on the result of White et al. (1990). The "hydrophobic channel" is formed by the residues at positions 2, 3, 5, 6, 9, 18, 30 and 63. Numbers in parentheses indicate the position of Arg residues at 16 and 117 in PLA2.
facilitates diffusion through the hydrophobic channel from the interfacial surface, and causes a drastic loss in enzymatic activity.
IV. Conclusions
Phospholipase A2 (PLA2) from Naja naja atra (Taiwan cobra) snake venom was subjected to arginine modification with phenylglyoxal (PG), and two major derivatives were separated by HPLC. The results of amino acid analysis and sequence determination revealed that Arg-117 and Arg-16 were modified by PG. Modification of Arg-117 and Arg-16 resulted in a decrease in enzymatic activity of PLA2 by 12.1% and 74.6%, respectively. The Arg-modified derivatives were similar to native PLA2 in their affinity for Ca2+, and the Scatchard plots revealed that the loss of the biological activity is not related to cofactor binding. Modification did not significantly affect the secondary structure of the PLA2 molecule as revealed by CD spectra. The antigenic activity of PLA2 also remained unchanged after modification. The results indicate that Arg-16 is important for the biological activities of PLA2 and the modification of Arg-16
might directly distort the binding ability of PLA2 with substrate, viz. perturbs the phosphate portion of the substrate, facilitates diffusion through the hydrophobic channel from the interfacial surface, and causes a drastic loss in enzymatic activity.
Acknowledgments
This work was supported by grant NSC 84-2311-B007-027 from the National Science Council, Republic of China. The authors would like to thank Miss F.S. Wu for her technical assistance.
References
Brunie, S., Bolin, J., Gewirth, D. and Sigler, P.B. (1985) J. Biol. Chem. 260, 9742.
Chang, L.S., Kuo, K.W. and Chang, C.C. (1993) Biochim. Biophys. Acta 1202, 216.
Chang, L.S. and Yang, C.C. (1988) J. Protein Chem. 7, 713.
Crestfield, S.A.M., Moore, S. and Stein, W.H. (1963) J. Biol. Chem. 238, 622.
Dennis, E.A. (1983) Phospholipases. In "The Enzymes" (Boyer, P.D., ed.), Vol. 16, p. 307. Academic Press, New York.
Dijkstra, B.W., Kalk, K.H., Hol, W.G.J. and Drenth, J. (1981) J. Mol. Biol. 147, 97.
Dijkstra, B.W., Renetseder, R., Kalk, K.H., Hol, W.G.J. and Drenth, J. (1983) J. Mol. Biol. 168, 163.
Fleer, E.A.M., Verheij, H.M. and de Haas, G.H. (1981) Eur. J. Biochem. 113, 283.
Haruki, H., Teshima, K., Samejima, Y., Kawauchi, S. and Ikeda, K. (1986) J. Biochem. 99, 99.
Pan, P.M., Yeh, M.S., Chang, W.C., Hung, C.C. and Chiou, S.H. (1994) Biochem. Biophys. Res. Commun. 199, 969.
Pieterson, W.A., Vidal, J.C., Volwerk, J.J. and de Haas, G.H. (1974) Biochemistry 13, 1455.
Renetseder, R., Brunie, S., Dijkstra, B.W., Drenth, J. and Sigler, P.B. (1985) J. Biol. Chem. 260, 11627.
Scott, D.L., White, S.P., Otwinowski, Z., Yuan, W., Gelb, M.H. and Sigler, P.B. (1990) Science 286, 1541.
Soons, K.R., Condrea, E., Yang, C.C. and Rosenberg, P. (1986) Toxicon 24, 679.
Tsai, I.H., Wu, S.H. and Lo, T.B. (1981) Toxicon 19, 141.
Takahashi, K. (1968) J. Biol. Chem. 23, 6171.
van Dam-Mieras, M.C.E., Slotboom, A.J., Pieterson, W.A. and de Haas, G.H. (1975) Biochemistry 14, 5387.
van Eijk, J.H., Verheij, H.M. and de Haas, G.H. (1984) Eur. J. Biochem. 140, 407.
Volwerk, J.J., Pieterson, W.A. and de Haas, G.H. (1974) Biochemistry 13, 1446.
White, S.P., Scott, D.L., Otwinowski, Z., Gelb, M.H. and Sigler, P.B. (1990) Science 286, 1560.
Yang, C.C. and Chang, L.S. (1984) J. Protein Chem. 3, 195.
Yang, C.C. and Chang, L.S. (1988) Toxicon 26, 721.
Yang, C.C. and Chang, L.S. (1989) Biochem. J. 262, 855.
Yang, C.C., Chen, S.F. and Fan, Y.C. (1983) Toxicon Suppl. 3, p. 509.
Yang, C.C., Huang, C.S. and Lee, H.J. (1985) J. Protein Chem. 4, 87.
Yang, C.C. and Lee, H.J. (1986) J. Protein Chem. 5, 15.
Yang, C.C. and King, K. (1980) Biochim. Biophys. Acta 614, 373.
Yang, C.C., King, K. and Sun, T.P. (1981) Toxicon 19, 645.
Yang, C.C., King, K., Sun, T.P. and Hseu, W.S. (1982) The Snake 14, 110.
Selective Reduction of the Intermolecular Disulfide Bridge in Human Glial Cell Line-Derived Neurotrophic Factor Using Tris-(2-Carboxyethyl)Phosphine
John O. Hui, John Le, Viswanatham Katta, Michael F. Rohde and Mitsuru Haniu Department of Protein Structure, 14-2-E Amgen Inc., Thousand Oaks, CA 91320
INTRODUCTION
The essential roles played by the neurotrophic growth factors in the development of the nervous system have been well documented (1,2). How the different factors interact with their target cells, and how such interactions eventually generate biological responses, are under active study (3,4). Glial cell line-derived neurotrophic factor (GDNF) is one of the more recently identified neurotrophic factors, first purified from the conditioned medium of a rat glial cell line (B49) by Lin and co-workers (5,6). Because it supports the growth of midbrain dopaminergic neurons in vitro, GDNF has been proposed to have therapeutic potential in the treatment of Parkinson's disease (7). Mature GDNF is a single polypeptide of 134 amino acid residues containing 7 cysteines and functions as a glycosylated, disulfide-linked dimer. Examination of the protein's primary structure suggests that GDNF is a distant member of the transforming growth factor-β (TGF-β) superfamily of growth factors (5). The recombinant protein has been expressed in Escherichia coli and is currently being developed as a candidate for human therapeutic use. Because the protein has to be refolded from inclusion bodies, it is important to identify the disulfide linkages of the final purified product. Because the relative positions of the 7 cysteines are conserved among all members of the TGF-β family, GDNF is expected to share a disulfide structure similar to that of the other members of this class of growth factors. The crystal structure of TGF-β2 has been solved to 1.8 Å resolution by Davies and associates (8), and the cystine linkages have been shown to form a "cystine knot", a threaded-ring conformation (9,10). If the disulfides in GDNF do form the cystine knot, the following intramolecular cystine bridges would be expected: Cys68 to Cys131 and Cys72 to Cys133. Through this ring of 8 amino acid residues would pass the third intramolecular disulfide bond, between Cys41 and Cys102. The 2 monomers would then cross-link via a disulfide bond between the two Cys101 residues. As can be expected from this complex structure, the protein was refractory to extensive proteolytic degradation under native conditions (11). In this investigation, we address the issue of disulfide assignment by subjecting GDNF to partial reduction with the chemical reagent tris-(2-carboxyethyl)phosphine (TCEP) in 0.17 M acetic acid at pH 2.5, and we report the structural characterization of the single reduced disulfide bond.
MATERIALS
GDNF was expressed in Escherichia coli, refolded and purified to apparent homogeneity by the Process Development Department at Amgen Inc. The protein retains the initiation methionine and its biological activity is essentially identical to that of the native des-Met species (unpublished data). Tris-(2-carboxyethyl)phosphine (TCEP) and sequencing grade trypsin were obtained from Boehringer Mannheim. Dithiothreitol (DTT), iodoacetamide (IAM), N-ethylmaleimide (NEM), and 4-vinylpyridine (4-VP) were purchased from Sigma. Sequanal grade guanidine hydrochloride and urea and spectral grade trifluoroacetic acid (TFA) were products of Pierce. All other reagents were of the highest quality commercially available. HPLC grade water and acetonitrile from Burdick and Jackson were used throughout.
METHODS
Partial reduction of GDNF
In a typical preparation, 2 µl of 0.5 M TCEP was added to 500 µg of GDNF (16.0 nmol) in 200 µl of 0.17 M acetic acid. Reduction was allowed to proceed at 45°C for 2 hours before the sample was chromatographed on a Vydac C4 column (0.46 x 25 cm) using a Hewlett Packard 1090 M liquid chromatograph. Solvent A was 0.1% TFA in water and solvent B was 0.1% TFA in 90% acetonitrile. The column was equilibrated in 10% solvent B and run at a flow rate of 0.7 ml per min. After sample application, the column was washed isocratically with 10% solvent B for 5 min. Protein was eluted with a linear gradient in 2 steps: 10 to 30% solvent B over 5 min, followed by 30 to 55% solvent B over 45 min. Protein elution was monitored by absorbance at 215 nm. The proteins were collected manually and concentrated under vacuum to a minimal volume prior to structural analysis.
Alkylation of the partially reduced GDNF with N-ethylmaleimide
To the partially reduced GDNF was added 100 µl of 0.2 M NEM followed by 100 µl of 6 M guanidine hydrochloride in 0.25 M Tris containing 1 mM EDTA at pH 8.5. The reaction was allowed to proceed at room temperature for 30 min in the dark before it was quenched by adjusting the pH to 2.0 with 25% TFA. The modified protein was purified by reverse phase chromatography as described in the previous section.
Complete reduction and pyridylethylation of the NEM modified GDNF
To the lyophilized, NEM modified protein was added 50 µl of 6 M guanidine hydrochloride in 0.25 M Tris containing 1 mM EDTA at pH 8.5. Reduction was initiated by the addition of 5 µl of 0.1 M DTT. The sample was incubated at 45°C for 1 hour and the remaining free sulfhydryls were alkylated with 2 µl of 4-vinylpyridine. The sample was left at room temperature for 30 min before being purified by HPLC as described.
Tryptic digestion of the modified GDNF
To identify the residue that was modified by NEM, 100 µg of the NEM treated, completely reduced and pyridylethylated GDNF was dissolved in 50 µl of 0.4 M ammonium bicarbonate containing 8 M urea. The solution was diluted with 150 µl of water and 2 µg of trypsin was added. Proteolysis was allowed to proceed at 37°C overnight and the peptides generated were analyzed by chromatography on a Vydac C18 column (0.46 x 25 cm) at a flow rate of 0.7 ml per min. Solvents A and B were identical to those described above. The column was equilibrated in 2% solvent B. After sample injection, the column was washed with 2% B for 5 min before a linear gradient of 2 to 55% B over 1 hour was applied. Peptide elution was monitored by absorbance at both 215 nm and 254 nm. The peptides were manually collected and dried prior to structural characterization.
Protein and peptide sequence analysis
Automated Edman degradation of protein and peptide samples was performed using an Applied Biosystems sequencer (Model 470A or 477A) or a Hewlett Packard G1005A sequencer. Each sequencer was fitted with an on-line HPLC analyzer for the identification of phenylthiohydantoin derivatives.
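The two reverse-phase gradients described above (the 2-step protein gradient under "Partial reduction of GDNF" and the single-step peptide gradient under "Tryptic digestion") can be written down as piecewise-linear programs. The sketch below is only an illustration of that description: the function name `percent_b` and the segment-list layout are hypothetical, time zero is taken as the start of the isocratic wash, and holding the final composition after the last segment is an assumption not stated in the text.

```python
def percent_b(t_min, segments):
    """Return % solvent B at time t_min (min after sample application).

    segments is a list of (duration_min, start_percent_B, end_percent_B)
    tuples applied one after another; each segment is linear.
    """
    elapsed = 0.0
    for duration, start, end in segments:
        if t_min <= elapsed + duration:
            frac = (t_min - elapsed) / duration
            return start + frac * (end - start)
        elapsed += duration
    # Assumption: hold the final composition once the program has finished.
    return segments[-1][2]


# Protein separation on the C4 column: 5 min wash at 10% B,
# 10 to 30% B over 5 min, then 30 to 55% B over 45 min.
PROTEIN_GRADIENT = [(5, 10, 10), (5, 10, 30), (45, 30, 55)]

# Peptide map on the C18 column: 5 min wash at 2% B, then 2 to 55% B over 60 min.
PEPTIDE_GRADIENT = [(5, 2, 2), (60, 2, 55)]

if __name__ == "__main__":
    for t in (0, 5, 10, 30, 55):
        print(t, "min:", round(percent_b(t, PROTEIN_GRADIENT), 1), "% B")
```

Expressing the program as data makes it easy to compare the composition at which a peak elutes between the protein and peptide runs.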
Mass spectrometry
The molecular mass of protein and peptide samples was determined by electrospray ionization mass spectrometry using a Perkin-Elmer Sciex API 100 mass spectrometer. The sample was introduced either by infusion or by on-line liquid chromatography/mass spectrometry (LC/MS) using a splitter. The data were obtained by scanning from 450 to 2000 Da with a scan time of 5 s and a step size of 0.25 Da with 1.0 ms dwell time per mass step. The molecular mass of the sample was obtained using the software provided by the instrument manufacturer.
SDS-PAGE
Laemmli (12) gels (14%) were run under non-reducing conditions and stained with Coomassie Blue for detection of proteins as described (13).
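The reagent excess used in the partial reduction can be checked with a few lines of arithmetic. The sketch below assumes the dimer mass of 30386.0 Da quoted in this chapter and 7 cysteines per monomer (14 per dimer); the function name and argument names are hypothetical. With these inputs the protein amount comes out near 16.5 nmol and the TCEP-to-cysteine ratio near 4.3, which agrees with the rounded figures of 16.0 nmol and 1 to 4 given in the text.

```python
def tcep_excess(protein_ug, dimer_mass_da, cys_per_monomer, tcep_ul, tcep_molar):
    """Return (protein nmol, cysteine nmol, TCEP nmol, TCEP:cysteine ratio)."""
    # ug divided by g/mol gives umol; multiply by 1000 for nmol.
    protein_nmol = protein_ug / dimer_mass_da * 1000.0
    # Two chains per dimer.
    cys_nmol = protein_nmol * 2 * cys_per_monomer
    # ul multiplied by mol/l gives umol; multiply by 1000 for nmol.
    tcep_nmol = tcep_ul * tcep_molar * 1000.0
    return protein_nmol, cys_nmol, tcep_nmol, tcep_nmol / cys_nmol


if __name__ == "__main__":
    protein, cys, tcep, ratio = tcep_excess(
        protein_ug=500.0, dimer_mass_da=30386.0, cys_per_monomer=7,
        tcep_ul=2.0, tcep_molar=0.5,
    )
    print(f"GDNF dimer: {protein:.1f} nmol, cysteine: {cys:.0f} nmol")
    print(f"TCEP: {tcep:.0f} nmol, TCEP per cysteine: {ratio:.1f}")
```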
RESULTS & DISCUSSION
To show that the Escherichia coli expressed GDNF has been properly refolded, we initiated a study to identify its disulfide bridges. However, establishing the disulfide bonds of a protein is often a labor-intensive task. The protein of interest is usually subjected to extensive proteolytic or chemical degradation, and peptides that are linked, ideally by a single cystine bridge, are purified and analyzed (14-16). In many instances the cysteines exist as a cluster, as in human chorionic gonadotropin (17), or in a "knotted" structure, as in GDNF; such proteins are refractory to proteolytic degradation and data from partial digestion are difficult to interpret. We therefore subjected GDNF to chemical reduction and asked: would any disulfide bond(s) be selectively reduced and thus identified by protein chemistry techniques? A major concern was the reactivity of the free sulfhydryl group. Once a free cysteine is generated, disulfide scrambling occurs rapidly at neutral or alkaline pH and may generate misleading data (18). TCEP has been well documented by Gray (19) to be active at acidic pH, so disulfide rearrangement should be minimized.
Partial reduction of GDNF
Reduction of GDNF with TCEP (the molar ratio of cysteines in the protein to reagent was 1 to 4) at 45°C for 2 hours resulted in 3 major peaks (Figure 1). Examination of each peak by SDS-PAGE under nonreducing conditions showed that both peak B (10% of the total protein) and peak C (30% of the total protein) migrated as monomer. Peak A was the remaining unmodified protein and peak C, because it migrated with
Figure 1 (A) Reverse phase HPLC separation of the partially reduced GDNF as described in the Methods section (x-axis: time in min). (B) Analysis of the 3 peaks observed in the chromatogram by nonreducing SDS-PAGE. Lane 1, molecular weight markers (from top to bottom): phosphorylase b, 97.4 kDa; bovine serum albumin, 66.2 kDa; ovalbumin, 45.0 kDa; carbonic anhydrase, 31.0 kDa; soybean trypsin inhibitor, 21.5 kDa; and lysozyme, 14.4 kDa. Lane 2, 5 µg of control GDNF; Lane 3, 2 µg of peak A; Lane 4, 5 µg of peak B; and Lane 5, 5 µg of peak C.
the same retention time on reverse-phase HPLC as a sample of GDNF that had been reduced with DTT in the presence of 6 M guanidine hydrochloride (data not shown), corresponded to the fully reduced protein. The identity of peak B was therefore of interest. Alkylation of the protein with NEM produced a predominant product with a molecular mass of 15319.0 Da (Figure 2). Based on the primary structure of GDNF and assuming all the cysteinyl residues were linked by disulfides, the dimeric protein should give a molecular mass of 30386.0 Da. If the dimeric GDNF were cross-linked by a single cystine bond, its specific reduction and alkylation with NEM should generate a component with a molecular mass of 15319.0 Da. Hence, our data are consistent with the model that the dimeric protein is cross-linked through a single disulfide bridge.
Figure 2 Reconstructed mass spectrum of the NEM modified GDNF (x-axis: mass, amu; the major component is at 15319).
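The expected mass of 15319.0 Da quoted above follows from simple bookkeeping: cleaving one interchain disulfide adds one hydrogen to each chain, and each freed thiol then takes up one NEM. The sketch below reproduces that arithmetic. The atomic and reagent masses are standard average values supplied here as assumptions (they are not given in the chapter), and the function name and its optional argument are hypothetical.

```python
H_MASS = 1.008     # average atomic mass of hydrogen, Da (assumed standard value)
NEM_MASS = 125.13  # average mass of N-ethylmaleimide, C6H7NO2, Da (assumed standard value)


def nem_monomer_mass(dimer_mass_da, extra_intrachain_reduced=0):
    """Expected mass of one chain after reduction of the single interchain
    disulfide and NEM alkylation of the freed cysteine.

    extra_intrachain_reduced counts additional intramolecular disulfides
    reduced and alkylated in the same chain (each adds 2 H and 2 NEM).
    """
    monomer = dimer_mass_da / 2.0 + H_MASS + NEM_MASS
    monomer += extra_intrachain_reduced * 2 * (H_MASS + NEM_MASS)
    return monomer


if __name__ == "__main__":
    print(f"interchain bond only: {nem_monomer_mass(30386.0):.1f} Da")
    print(f"plus one intrachain bond: {nem_monomer_mass(30386.0, 1):.1f} Da")
```

The second call shows why the observed mass argues against reduction of any additional disulfide: each extra reduced and alkylated bond would shift the product by roughly 252 Da.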
Identification of the intermolecular disulfide bridge
Because any nucleophilic amino acid side chain can react with NEM, it is crucial to establish which residue of GDNF was modified by the reagent. The modified protein was therefore purified by HPLC, completely reduced with DTT in the presence of 6 M guanidine hydrochloride, and the remaining cysteines differentially alkylated with 4-vinylpyridine prior to tryptic digestion. The chromatographic behavior of the phenylthiohydantoin derivative of pyridylethyl Cys (PTH-pyridylethyl Cys) has been well characterized and the derivative can be identified readily (20). The tryptic map of a control GDNF sample, in which the protein had been completely reduced and pyridylethylated under denaturing conditions, is shown in Figure 3A. Peak 1 contains 2 co-purifying peptides, which were analyzed by both automated Edman degradation and mass spectrometry. They were identified as L92VSDKVGQA-peC-peC-RPIAFDDDLSFLDDNLVY120 (molecular mass = 3443.8 Da; calculated = 3443.9 Da) and V97GQA-peC-peC-RPIAFDDDLSFLDDNLVY120 (molecular mass = 2901.3 Da; calculated = 2901.3 Da), where peC denotes a pyridylethyl Cys throughout this Discussion. The 2 peptides contained the same 2 cysteinyl residues (residues 101 and 102) and were generated by incomplete digestion of the Lys96-Val97 bond. The cleavage after Tyr120 suggests the presence of a contaminating chymotrypsin-like activity in the commercial trypsin employed in this study. However, even highly purified preparations of trypsin have been shown to cleave certain aromatic bonds (21). Examination of the NEM modified sample indicated that peak 1 shifted to a later retention time (designated as peak 2 in Figure 3B). N-terminal analysis again gave 2 amino acid sequences: L92VSDKVGQA-X-peC-RPIAFDDDLSFLDDNLVY120 (molecular mass = 3481.2 Da) and V97GQA-X-peC-RPIAFDDDLSFLDDNLVY120 (molecular mass = 2938.7 Da), where X denotes a modified amino acid that eluted from the PTH-analyzer with a retention time of 6.8 min, approximately 1.5 min after PTH-Glu (Figure 4). If X corresponds to NEM-modified Cys, the expected masses of the 2 peptides would be 3463.8 Da and 2921.2 Da, respectively. The 2 measured masses therefore exceed the expected values by approximately 18 Da, indicating that the succinimide ring of NEM had been hydrolyzed to the succinamic acid derivative, which occurred during tryptic digestion. The retention time of the modified amino acid observed on the PTH-analyzer is also consistent with this hypothesis. Maleimide SH-reagents have been documented to be unstable at neutral and alkaline pH values (22). Sequence analysis of the remaining peptides showed that Cys101 was the only residue modified by NEM, demonstrating that GDNF dimerizes through a single disulfide bridge at Cys101. The model based on the protein's homology with TGF-β2 was hence confirmed. The
Figure 3 Tryptic map of (A) control GDNF, and (B) NEM modified sample (x-axis: time in min). The identities of peaks 1 and 2 are described in the Results and Discussion.
Figure 4 Elution of the NEM modified Cys from the PTH-analyzer (absorbance in mAU against time in min). Cycles 4 to 6 of peak 2 (see Results and Discussion) are shown. Peak 2 shows the co-purification of 2 peptides; the expected sequences in cycles 4 to 6 were -Ala-(NEM modified Cys)-(pyridylethyl Cys)- and -Asp-Lys-Val-.
partial reduction method described in this communication may be applicable to the other members of the TGF-β family. To the best of our knowledge, this is the first report of the chromatographic identification of the PTH-derivative of NEM modified Cys.
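The peptide masses used in the assignment above can be re-derived from the sequences. The sketch below sums standard average residue masses and adds the alkylating groups; all numerical constants are standard values supplied here as assumptions, and the function and variable names are hypothetical. One inference should be flagged: the chapter's "calculated" values are reproduced only when a proton is added, so they appear to correspond to singly protonated species, which the text does not state explicitly.

```python
# Standard average residue masses (Da) for the amino acids in these peptides.
RESIDUE_MASS = {
    "A": 71.0788, "C": 103.1388, "D": 115.0886, "F": 147.1766,
    "G": 57.0519, "I": 113.1594, "K": 128.1741, "L": 113.1594,
    "N": 114.1038, "P": 97.1167, "Q": 128.1307, "R": 156.1875,
    "S": 87.0782, "V": 99.1326, "Y": 163.1760,
}
WATER = 18.0153
PROTON = 1.0073
PYRIDYLETHYL = 105.14  # added to Cys by 4-vinylpyridine (C7H7N)
NEM = 125.13           # added to Cys by N-ethylmaleimide (C6H7NO2)


def peptide_mh(sequence, n_pyridylethyl=0, n_nem=0, n_hydrolyzed_nem=0):
    """Average mass of the singly protonated peptide with alkylated cysteines.

    A hydrolyzed NEM adduct (succinamic acid form) carries one extra water.
    """
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    mass += n_pyridylethyl * PYRIDYLETHYL
    mass += n_nem * NEM
    mass += n_hydrolyzed_nem * (NEM + WATER)
    return mass + PROTON


LONG = "LVSDKVGQACCRPIAFDDDLSFLDDNLVY"  # residues 92 to 120
SHORT = "VGQACCRPIAFDDDLSFLDDNLVY"      # residues 97 to 120

if __name__ == "__main__":
    for name, seq in (("92-120", LONG), ("97-120", SHORT)):
        print(name,
              f"2 peC: {peptide_mh(seq, n_pyridylethyl=2):.1f}",
              f"NEM + peC: {peptide_mh(seq, n_pyridylethyl=1, n_nem=1):.1f}",
              f"hydrolyzed NEM + peC: "
              f"{peptide_mh(seq, n_pyridylethyl=1, n_hydrolyzed_nem=1):.1f}")
```

Under these assumptions the ring-opened forms come out within about 1 Da of the measured 3481.2 Da and 2938.7 Da, in line with the hydrolysis interpretation given in the text.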
ACKNOWLEDGMENTS The authors are grateful to Drs Robert Rush, Tony Polverino and Scott Patterson for their critical comments on the manuscript.
REFERENCES
1. Henderson, C.E. (1996) Current Opinion in Neurobiology 6, 64-70.
2. Thoenen, H. (1991) Trends Neurosci. 14, 165-170.
3. Isackson, P.J. (1995) Current Opinion in Neurobiology 5, 350-357.
4. Curtis, R. and DiStefano, P.S. (1994) Trends Cell Biol. 4, 383-386.
5. Lin, L.-F.H., Doherty, D.H., Lile, J.D., Bektesh, S. and Collins, F. (1993) Science 260, 1130-1132.
6. Lin, L.-F.H., Zhang, T.J., Collins, F. and Armes, L.G. (1994) J. Neurochem. 63, 758-768.
7. Gash, D.M. et al. (1996) Nature 380, 252-255.
8. Daopin, S., Li, M. and Davies, D.R. (1993) Proteins: Structure, Function, and Genetics 17, 176-192.
9. McDonald, N.Q. and Hendrickson, W.A. (1993) Cell 73, 421-424.
10. Massague, J. (1996) Cell 85, 947-950.
11. Haniu, M. et al. (1996) Biochemistry, manuscript submitted.
12. Laemmli, U.K. (1970) Nature 227, 680-685.
13. Hui, J.O., Tomasselli, A.G., Zurcher-Neely, H. and Heinrikson, R.L. (1990) J. Biol. Chem. 265, 21386-21389.
14. Hui, J.O., Le, J., Viswanatham, K., Rosenfeld, R., Rohde, M.F. and Haniu, M. (1996) J. Prot. Chem. 15, 351-358.
15. Violand, B.N., Schlittler, M.R., Duffin, K.L. and Smith, C.E. (1995) J. Prot. Chem. 14, 341-347.
16. Acklin, C., Stoney, K., Rosenfeld, R., Miller, J.A., Rohde, M.F. and Haniu, M. (1993) Int. J. Peptide Protein Res. 41, 548-552.
17. Mise, T. and Bahl, O.P. (1980) J. Biol. Chem. 255, 8516-8522.
18. Creighton, T.E., Zapun, A. and Darby, N.J. (1995) Trends in Biotech. 13, 18-23.
19. Gray, W.R. (1993) Protein Science 2, 1749-1755.
20. Fullmer, C.S. (1984) Anal. Biochem. 142, 336-339.
21. Maroux, S., Rovery, M. and Desnuelle, P. (1966) Biochim. Biophys. Acta 111, 147.
22. Gregory, J.D. (1955) J. Am. Chem. Soc. 77, 3922-3923.
EFFECTS OF SURFACE HYDROPHOBICITY ON THE STRUCTURAL PROPERTIES OF INSULIN
Mark L. Brader, Rohn L. Millican, David N. Brems*, Henry A. Havel, Aidas Kriauciunas* and Victor J. Chen**
Divisions of Pharmaceutical Sciences, Biopharmaceutical Development and Diabetes Research, Lilly Research Laboratories, Indianapolis, IN 46285
Introduction
A classical observation of protein structure is that the interiors of soluble globular proteins are composed mainly of hydrophobic amino acids. This structural arrangement has been compared to the interior of a micelle (Kauzmann 1959). The significance of internal hydrophobicity as a dominant factor in protein folding has recently been demonstrated in studies of the thermostability of T4 lysozyme mutants, in which specific residues were replaced with amino acids bearing lipophilic side chains of varying sizes (Mendel et al., 1992; Eriksson et al., 1992). These studies showed that, in general, protein conformational stability correlated positively with the degree of internal hydrophobicity. The studies on the T4 lysozyme mutants prompted us to consider the converse, namely the effects of increasing the surface hydrophobicity of a globular protein. We chose to address this by comparing the conformational stability of a model protein in the presence and absence of a hydrophobic group attached covalently to a specific surface residue. Human insulin (HI) was selected as the model protein because it is a relatively small polypeptide (Mr = 5808) for which there is a wealth of structural, chemical and biological information. In addition, the insulin molecule possesses a rich conformational chemistry distinguished by thoroughly characterized structural transitions and ligand binding processes (Derewenda et al., 1989; Birnbaum et al., 1996; Bloom et al., 1995; Choi et al., 1993; Bryant et al., 1992). The derivative bearing a surface hydrophobic group that is the subject of this study is Nε-palmitoyl-LysB29 human insulin (Pal-HI, Structure 1).
Authors to whom correspondence should be addressed. * Current address: Amgen Inc., Thousand Oaks, CA 91320
Structure 1. Nε-palmitoyl-LysB29 human insulin (Pal-HI)
Experimental Methods
N-hydroxysuccinimidyl palmitate was prepared by dropwise addition of palmitoyl chloride to a vigorously stirred suspension comprising one equivalent each of N-hydroxysuccinimide and triethylamine in dry tetrahydrofuran at room temperature. After removal of solids by filtration, the solvent was evaporated and the product (yield ~70%) was used without further treatment. Pal-HI was prepared by the addition of 1.2 equivalents of N-hydroxysuccinimidyl palmitate to a stirred solution of insulin in DMSO containing a 20-fold molar excess of tetramethylguanidine. After 30 minutes, the reaction was quenched with four volumes of 0.1 N HCl and the reaction mixture was chromatographed on a Vydac C4 reversed-phase column, eluted with a 0.1% trifluoroacetic acid-acetonitrile gradient. Pal-HI (yield ~40%) eluted at a solvent composition of about 40% acetonitrile. Static light scattering experiments were performed using a Brookhaven Instruments goniometer as described in Needham et al. (1995). A dn/dc value of 0.183 mL/g was employed. Absorption and circular dichroism spectroscopies were performed using Cary 3E and Aviv 62 DS instruments, respectively, under the conditions described in Brems et al. (1992). Guanidine-HCl induced denaturation experiments were conducted according to the method of Brems et al. (1990).
Results
A biophysical study of Pal-HI was performed to assess the effects of palmitoylation on the structure and self-association of this derivative. Circular dichroism (CD) is a convenient and sensitive technique for studying the conformational behavior and self-association of insulin. Insulin forms a hexamer that possesses two high-affinity binding sites for certain divalent transition metal ions (Brader & Dunn, 1991). The far-UV CD spectra of HI and Pal-HI recorded under identical conditions are presented in Figure 1. In the absence (Figure 1, Panel A) and presence (Figure 1, Panel B) of Zn(II), the far-UV CD spectra of HI and Pal-HI show only a slight difference, indicating that palmitoylation has not greatly altered the secondary structure of the parent insulin molecule. The small difference is likely attributable to minor secondary structural
Figure 1. Comparison of the far-UV CD spectra of Pal-HI and HI (Panels A and B; x-axis: wavelength, nm). All spectra were recorded with samples dissolved in 5 mM Tris-HCl, pH 7.5, and placed in a cuvette of 0.01 cm path length. Panel A shows Pal-HI (dotted line) and HI (solid line), both at 0.1 mg/mL. Panel B shows Zn(II)-Pal-HI hexamer (solid line), Zn(II)-HI hexamer (dotted line), Zn(II)-Pal-HI hexamer in 10 mM phenol (dashed-dotted line) and Zn(II)-HI hexamer in 10 mM phenol (dashed line), all at 1.7 mg/mL and with the mole ratio of Zn:protein at 0.33.
differences between HI and Pal-HI and to the fact that palmitoylation introduces an additional amide chromophore, which absorbs in the far-UV region. In contrast, Figure 2 shows that although the near-UV CD spectra have similar profiles, the peak magnitude for Pal-HI is only about half that for HI. This change in the near-UV CD reflects a perturbation in the chromophoric environments of the tyrosine residues. The decreased CD magnitude may simply arise from a greater flexibility of the tyrosine residues in Pal-HI. Two alternative explanations are also possible. First, since the magnitude of the near-UV CD of insulin has been shown to correlate with hexamer assembly (Strickland & Mercola, 1976), the data of Figure 2 could indicate that palmitoylation reduces self-association in the monomer-dimer-hexamer equilibrium. Second, the different near-UV CD of Pal-HI may arise from a different conformation of the aromatic side chains, induced by the palmitoyl group, that is not directly correlated with self-association. To investigate the self-association of Pal-HI in the presence and absence of Zn(II), a static light scattering study was performed. The data in Figure 3 show that metal-free Pal-HI undergoes a concentration-dependent self-association process. In the presence of Zn(II), Pal-HI exhibits a different
Figure 2. Effects of Zn(II) on the near-UV CD spectra of Pal-HI and HI (x-axis: wavelength, nm). The proteins are at 1.0 mg/mL dissolved in 5 mM Tris-HCl, pH 7.5. Spectra (a) and (b) are metal-free Pal-HI and HI, respectively; (c) and (d) are Pal-HI and HI, respectively, in the presence of a mole ratio of Zn(II):protein at 0.33, corresponding to 2 Zn(II) per hexamer.
Figure 3. Static light scattering study of the concentration-dependent self-association of Pal-HI in the presence of a 0.35 mol ratio of Zn(II):protein (solid symbols), and metal-free Pal-HI (open symbols) (x-axis: concentration, mg/mL). The solutions were prepared in 25 mM Tris-HCl buffer pH 7.4.
Figure 4. Titration of 1 mg/mL Pal-HI with Zn(II) in 25 mM Tris-HCl pH 8.0. The ellipticity at 275 nm is plotted as a function of the molar ratio of Zn(II) to Pal-HI hexamer.
concentration-dependent association profile indicative of higher levels of aggregation. The data of Figure 3 may be compared qualitatively with published self-association data for HI as measured by light scattering (Hvidt, 1991) and analytical ultracentrifugation (Brems et al., 1992). Although the data of Figure 3 do not distinguish the extent of hexamer formation from other aggregation states, the increased association relative to insulin suggests that the reduction in magnitude of the near-UV CD spectrum of Pal-HI (Figure 2) probably arises from a perturbation in the conformation of the tyrosine residues as a result of palmitoylation of LysB29, rather than from decreased self-association. This interpretation is supported by the spectra of Figure 2, which show that the near-UV CD of the Pal-HI Zn(II) hexamer is also considerably diminished. Another interesting example of the conformational behavior of the insulin molecule is exhibited by the transition-metal-substituted hexamer. Figure 4 shows that the near-UV CD signal for Pal-HI increases with increasing levels of Zn(II) and that this effect saturates at two Zn(II) ions per hexamer, identical to the Zn(II) binding of insulin. The structure of the zinc insulin hexamer has been characterized thoroughly by a series of X-ray crystal structures (Baker et al., 1988; Derewenda et al., 1989; Ciszak & Smith, 1994; Smith & Dodson, 1992a; Smith & Dodson, 1992b). The hexamer consists of three insulin dimers associated about a three-fold symmetry axis. Two high-affinity metal binding sites exist on the three-fold axis, each formed by three HisB10 residues. These chelation sites may accommodate metal ions such as Zn(II), Co(II), Co(III), Cu(I) and Cu(II) (Brader & Dunn, 1991). The hexameric complex exists in two interconvertible conformational states in which the subunits are designated as either T or R according to their conformation. The R6 hexamer is stabilized by noncovalent interactions with phenolic compounds, which bind to six hydrophobic pockets on the surface of the hexamer. With Co(II) ions coordinated at the HisB10 sites, the hexamer gives rise to characteristic Co(II) d-d transitions evident
Figure 5. Titration of Co(II)-hexamers of Pal-HI (solid symbols) and HI (open symbols) with phenol in the presence of 20 mM SCN-. Protein concentrations were 6 mg/mL in 50 mM Tris-HClO4, pH 8.0. Inset: visible absorption spectra (500 to 650 nm) of the Co(II)-hexamer complexes formed with phenol and SCN- (solid lines: Pal-HI; broken lines: HI).
in the visible absorption spectra. The visible spectra of Co(II) hexamer complexes have been shown to be extremely sensitive to the nature of the exogenous small molecule that serves as ligand to the metal at the HisB10 site. The visible absorption spectra of HI and Pal-HI recorded in the presence of Co(II) and small molecule ligands are presented in Figure 5. Under these conditions, HI exists as the Co(II)-R6-(Ligand) hexamer, in which the visible absorption spectrum arises from the pseudotetrahedral Co(II)His3L chromophore, where L is a small molecule ligand coordinated to the Co(II) center (Brader & Dunn, 1991). The similarity of the Pal-HI and HI spectra recorded under the same conditions indicates that the Co(II) centers in the two hexamers must be very similar with respect to ligand identity and geometry. This result indicates that the Pal-HI molecule forms a Co(II) hexamer and that the structural features required for formation of dimer and hexamer remain largely unperturbed by palmitoylation. In addition, these results show that the palmitoylated derivative retains the capacity to undergo the ligand-promoted T6 to R6 conformational transition and shows the same general profile for the phenol-binding isotherm. The conformational stabilities of HI and Pal-HI were compared by studying the respective guanidine-HCl induced denaturation profiles. Experiments were conducted at a protein concentration of 0.1 mg/mL in pH 7.4 buffer containing 20% ethanol. Under these solution conditions, both HI and Pal-HI were determined by ultracentrifugation analysis (data not shown) to exist in the monomeric state. The denaturation profiles of HI and Pal-HI, obtained by monitoring the circular dichroism at 224 nm as a function of guanidine-HCl concentration, are shown in Figure 6. The denaturation transitions begin at about 2.5 M guanidine-HCl and were shown to be completely reversible. From these data, the free energies of unfolding were calculated to be 6.0 and 4.5 kcal/mol for Pal-HI and HI, respectively.
Figure 6. Equilibrium denaturation isotherms for Pal-HI (solid symbols) and HI (open symbols) in pH 7.5 buffer containing 20% EtOH determined from the ellipticity at 224 nm. The fraction of protein unfolded is plotted as a function of guanidine hydrochloride concentration (M).
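To give a feel for what unfolding free energies of 6.0 and 4.5 kcal/mol imply, the sketch below uses a generic two-state model with linear extrapolation, in which the free energy of unfolding falls linearly with denaturant concentration. This is an illustration only and not necessarily the analysis of Brems et al. (1990) used in the chapter: the m-value of 1.2 kcal/mol/M is a hypothetical placeholder, the temperature of 25°C is assumed, and the function names are invented for the example.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal/(mol K)


def fraction_unfolded(denaturant_molar, dg_water, m_value, temp_k=298.15):
    """Two-state fraction unfolded with dG = dG(water) - m * [denaturant]."""
    dg = dg_water - m_value * denaturant_molar
    k_unfold = math.exp(-dg / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


def midpoint(dg_water, m_value):
    """Denaturant concentration at which half the protein is unfolded."""
    return dg_water / m_value


def stability_ratio(ddg_kcal, temp_k=298.15):
    """Factor by which the folding equilibrium constant changes for a given ddG."""
    return math.exp(ddg_kcal / (R_KCAL * temp_k))


if __name__ == "__main__":
    M_VALUE = 1.2  # hypothetical m-value, kcal/mol/M; not taken from the chapter
    for name, dg in (("Pal-HI", 6.0), ("HI", 4.5)):
        print(name, "midpoint:", round(midpoint(dg, M_VALUE), 2), "M",
              "| unfolded in water:", f"{fraction_unfolded(0.0, dg, M_VALUE):.1e}")
    print("equilibrium constant ratio for 1.5 kcal/mol:",
          round(stability_ratio(6.0 - 4.5), 1))
```

Whatever m-value applies, the 1.5 kcal/mol difference between the two proteins corresponds to roughly a 12- to 13-fold shift in the folding equilibrium constant at 25°C.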
Discussion The increased conformational stability of Pal-HI over HI is a surprising yet intriguing finding of this study. The introduction of a large hydrophobic group to the surface of a small protein like insulin could perturb its structure to the extent of altering the conformation. The present results show that the palmitoylation of HI has not significantly altered its basic structural properties. Pal-HI retains the ability to form Zn(II) and Co(II) hexamers, to form pseudotetrahedral Co(II)His3L centers, to undergo the T^ to R6 hexamer conformational transition, and for the metal-free monomer to adopt a secondary structure that is closely analogous to that of HI. Collectively, these results indicate that the Pal-HI molecule forms a hexamer that is highly analogous to that of HI with respect to structure, ligand binding properties and conformational flexibility. The static light scattering results suggest that the Zn(II)-Pal-HI hexamers have a greater tendency to aggregate. This behavior is consistent with a structural model of the Zn(II)Pal-HI hexamer in which the assembly of this hexamer is essentially identical to that of the Zn(II)-HI hexamer, but the palmitoyl groups are accommodated on the surface of the hexamer. In such a model, the palmitoyl groups would interact with hydrophobic residues on the hexamer surface. An arrangement of this type would be in accord with the perturbed near-UV CD observed for Zn(II)-Pal-HI and would explain the increased propensity for hexamer aggregation as being attributed to enhanced m/^r-hexamer hydrophobic interactions. The situation is less clear for the metal-free self-association process. Although the specific nature of the metal-independent self-association is not understood, it appears to be more complex than merely the apparent monomer-dimer-hexamer equilibrium in effect for the assembly of the hexamer. 
It is likely that the association process for Pal-HI involves new hydrophobic interactions in which the palmitoyl groups participate. The details of how the palmitoyl group affects the intra- and inter-hexamer contacts, both in the absence and presence of metal ions, remain to be elucidated. Because the polypeptide of Pal-HI is folded as for HI, it is of interest to determine how the hydrocarbon tail of the surface-residing palmitoyl moiety is accommodated in an environment that is extensively solvent exposed. The increase in conformational stability in Pal-HI, compared to HI, may be interpreted by proposing the existence of a surface hydrophobic pocket on the insulin molecule that is capable of intramolecularly binding the palmitoyl group attached to LysB29. As a result of this intramolecular interaction between the pocket and the fatty acyl chain, the structure of the polypeptide as a whole attains greater stability. The hydrophobic pocket is likely to comprise aromatic residues, which would be consistent with the observed perturbation of the near-UV CD spectrum. However, the structural alterations that purportedly accommodate the palmitoyl group are evidently local and do not greatly affect hexamerization, conformational flexibility or biological activity. It is interesting to note that in the mammalian cAMP-dependent protein kinase, which is naturally N-terminal myristylated, the hydrocarbon tail of the myristyl moiety also folds back into the polypeptide and is bound in a cavity created by noncontiguous hydrophobic residues (Zheng et al., 1993). Furthermore, it was found that the myristylated protein showed increased thermostability when compared with its deacylated form (Yonemoto et al., 1993). An alternative explanation for the increased stability observed for Pal-HI is a decreased entropy of the unfolded Pal-HI polypeptide. The acyl chain may interact with the hydrophobic side chains of the unfolded polypeptide, thereby decreasing its entropy.
An entropy reduction of the unfolded state would result in a decrease in the number of possible conformations thereby reducing the energy required to bring two chain elements together, with the overall effect of increasing the free energy of unfolding (Flory, 1956). For the present study of Pal-HI, LysB29 was chosen as the acylation site because of the simplicity of the conjugation chemistry. This residue resides in the C-terminus of the B-chain which is a region of the insulin molecule known to be extremely flexible (Weiss et al., 1989), and this region probably plays a minor role in the folding and unfolding process of unmodified insulin. A future direction of these studies is to investigate the effect of palmitoylation at a site corresponding to the more rigid and helical portion of the molecule with the aim of establishing how the hydrocarbon chain of the fatty acid would be accommodated and determining its impact on the overall structural integrity of the molecule.
References
Baker, E. N., Blundell, T. L., Cutfield, J. F., Cutfield, S. M., Dodson, E. J., Dodson, G. G., Hodgkin, D. C., Hubbard, R. E., Isaacs, N. W., Reynolds, C. D., Sakabe, K., Sakabe, N., & Vijayan, N. M. (1988) Phil. Trans. Roy. Soc. Ser. B 319, 369-456.
Birnbaum, D. T., Dodd, S. W., Saxberg, B. E. H., Varshavsky, A. D., & Beals, J. M. (1996) Biochemistry 35, 5366-5378.
Bloom, C. R., Choi, W. E., Brzovic, P. S., Ha, J. J., Huang, S.-T., Kaarsholm, N. C., & Dunn, M. F. (1995) J. Mol. Biol. 245, 324-330.
Brader, M. L., & Dunn, M. F. (1991) Trends Biochem. Sci. 16, 341-345.
Brems, N. B., Alter, L. A., Beckage, M. J., Chance, R. E., DiMarchi, D. D., Green, L. K., Long, H. B., Pekar, A. H., Shields, J. E., & Frank, B. H. (1992) Protein Engineering 5, 527-533.
Brems, D. N., Brown, P. L., Heckenlaible, L. A., & Frank, B. H. (1990) Biochemistry 29, 9289-9293.
Bryant, C., Strohl, M., Green, L. K., Long, H. B., Alter, L. A., Pekar, A. H., Chance, R. E., & Brems, D. N. (1992) Biochemistry 31, 5692-5698.
Choi, W. E., Brader, M. L., Aguilar, V., Kaarsholm, N. C., & Dunn, M. F. (1993) Biochemistry 32, 11638-11645.
Ciszak, E., & Smith, G. D. (1994) Biochemistry 33, 1512-1517.
Derewenda, U., Derewenda, Z., Dodson, E. J., Dodson, G. G., Reynolds, C. D., Smith, G. D., Sparks, C., & Swenson, D. (1989) Nature 338, 594-596.
Eriksson, A. E., Baase, W. A., Zhang, X.-J., Heinz, D. W., Blaber, M., Baldwin, E. P., & Matthews, B. W. (1992) Science 255, 178-183.
Flory, P. J. (1956) J. Am. Chem. Soc. 78, 5222-5232.
Hvidt, S. (1991) Biophys. Chem. 39, 205-213.
Kauzmann, W. (1959) Adv. Protein Chem. 14, 1-63.
Mendel, D., Ellman, J. A., Chang, Z., Veenstra, D. L., Kollman, P. A., & Schultz, P. G. (1992) Science 256, 1798-1802.
Needham, G. P., Pekar, A. H., & Havel, H. A. (1995) J. Pharm. Sci. 84, 437-442.
Smith, G. D., & Dodson, G. G. (1992a) Proteins: Struct. Func. Genet. 14, 401-408.
Smith, G. D., & Dodson, G. G. (1992b) Biopolymers 32, 1749-1756.
Strickland, E. H., & Mercola, D. (1976) Biochemistry 15, 3875-3884.
Weiss, M. A., Nguyen, D. T., Khait, I., Inouye, K., Frank, B. H., Beckage, M., O'Shea, E., Shoelson, S. E., Karplus, M., & Neuringer, L. J. (1989) Biochemistry 28, 9855-9873.
Yonemoto, W., McGlone, M. L., & Taylor, S. S. (1993) J. Biol. Chem. 268, 2348-2352.
Zheng, J., Knighton, D. R., Xuong, N.-H., Taylor, S. S., Sowadski, J. M., & Ten Eyck, L. F. (1993) Protein Sci. 2, 1559-1573.
The Effects of In Vitro Methionine Oxidation on the Bioactivity and Structure of Human Keratinocyte Growth Factor
Christopher S. Spahr, Linda O. Narhi, James Speakman, Hsieng S. Lu, and Yueh-Rong Hsu
Amgen Inc., Thousand Oaks, CA 91320
I. Introduction
When the crystal structure of a protein is not available, other techniques can be employed to identify the amino acids that are involved in its structure and function. Commonly used techniques include chemical cross-linking, site-specific chemical modifications, and mutagenesis. Chemical modifications of Met residues using oxidizing agents such as hydrogen peroxide, t-butyl hydroperoxide, chloramine T, and sodium periodate have been useful in identifying structure and function relationships in many proteins (1-7). Keratinocyte growth factor (KGF) is a member of the fibroblast growth factor (FGF) family. The molecule is expressed by stromal fibroblasts and is involved in the proliferation and differentiation of epithelial cells in a paracrine mode (8). E. coli-derived human KGF is biologically active (9) and may be clinically useful (10-12). In this study, hydrogen peroxide oxidation was performed on E. coli-derived human KGF at pH 5.0 to understand the structure and function of the protein. Reverse-phase high performance liquid chromatography (RP-HPLC), peptide mapping, protein sequencing, and mass spectrometry were used to separate and identify the different methionine-oxidized KGF species. Cation exchange HPLC was utilized to remove hydrogen peroxide from the different modified forms of the protein prior to mitogenic bioassay and circular dichroism (CD) analysis. Preferential oxidation of the methionine residues to methionine sulfoxide has enabled us to determine that Met 160 may play an important role in the biological function of KGF.
II. Materials and Methods
Materials
E. coli-derived human KGF contained a KGF polypeptide sequence (8) that started with Ser 24 as the N-terminus. The protein was produced and purified using methods similar to those described previously (9). Trifluoroacetic acid (TFA) was purchased from J.T. Baker. HPLC grade water and acetonitrile for RP-HPLC analysis were obtained from Burdick and Jackson. Urea was a product of Amresco. Sequencing grade trypsin was purchased from Boehringer Mannheim.
Oxidation of KGF with Hydrogen Peroxide
KGF was incubated with hydrogen peroxide in 50 mM sodium acetate, pH 5.0 at a concentration of 0.4 mg/ml under various conditions to generate the oxidized KGF species. Detailed conditions are described in the legend to Figure 1.
Reverse Phase HPLC
RP-HPLC was performed using a Vydac C4 column (4.6 mm X 250 mm) connected to a Hewlett Packard 1090 HPLC system. The HPLC was equipped with a diode array detector and a PC-based computer system for data processing. Solvent A was 0.1% TFA, while solvent B was composed of 0.1% TFA in 90% acetonitrile. The column was initially equilibrated at 3% B using a flow rate of 0.7 ml/min, with the absorbance monitored at 230 nm. The elution gradient consisted of isocratic conditions at 3% B for 5 minutes, followed by linear gradients to 40% B over 10 minutes, to 50% B over 20 minutes, and to 80% B over 5 minutes, and finally isocratic conditions at 80% B for 5 minutes.
Tryptic Peptide Mapping
The oxidized KGF species were vacuum lyophilized and then 100 µl of 20 mM Tris-HCl, pH 7.0/2 M urea was added to each sample. About 25 µg of protein was carboxyamidomethylated with iodoacetamide in the dark for 30 minutes at 25°C, using a 5:1 molar ratio of iodoacetamide to KGF. The carboxyamidomethylated KGF was digested with 0.625 µg trypsin (40:1 ratio of KGF to trypsin) for 24 hours at 25°C. The proteolytic digests were quenched by the addition of 150 µl of 0.1% TFA, then injected onto a Vydac C8 column (4.6 mm X 250 mm) connected to a Hewlett Packard 1090 HPLC system. The absorbance was monitored at 215 nm and a flow rate of 0.7 ml/min was used. The solvents used were identical to those described above. The digests were separated using isocratic conditions at 1% B for 5 minutes, then linear gradients to 36% B over 50 minutes, and to 70% B over 5 minutes, and finally isocratic conditions at 70% B for 5 minutes. The tryptic peptides collected from RP-HPLC were vacuum lyophilized, then resuspended in 0.1% TFA.
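The two gradient programs described for the C4 separation and the C8 peptide map are piecewise-linear in %B. As a minimal sketch, they can be written as a list of segments and evaluated at any time point. The function name `percent_b` and the tuple layout are illustrative choices, not part of any instrument software.

```python
def percent_b(t_min, start_b, segments):
    """Return %B at time t_min for a piecewise-linear gradient.

    segments is a list of (duration_min, end_percent_b) tuples applied in order,
    starting from start_b. After the last segment the final %B is held.
    """
    if t_min < 0:
        raise ValueError("time must be non-negative")
    current_b = start_b
    elapsed = 0.0
    for duration, end_b in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return current_b + fraction * (end_b - current_b)
        elapsed += duration
        current_b = end_b
    return current_b

# Gradient programs transcribed from the text: (starting %B, segments).
RP_HPLC_C4 = (3.0, [(5, 3.0), (10, 40.0), (20, 50.0), (5, 80.0), (5, 80.0)])
PEPTIDE_MAP_C8 = (1.0, [(5, 1.0), (50, 36.0), (5, 70.0), (5, 70.0)])

def run_time(program):
    """Total programmed run time in minutes."""
    return sum(duration for duration, _ in program[1])

print(percent_b(10, *RP_HPLC_C4), run_time(RP_HPLC_C4))
print(percent_b(30, *PEPTIDE_MAP_C8), run_time(PEPTIDE_MAP_C8))
```

Because solvent B is 90% acetonitrile, the acetonitrile content at any point is 0.9 times the %B value returned here.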
Protein Sequencing and Mass Spectrometry
Peptides were sequenced using either an Applied Biosystems Model 470 or 477 or a Hewlett Packard G1000A protein sequencer, each equipped with narrow bore RP-HPLC for on-line analysis of the PTH-amino acids. Mass analysis of peptides was performed using matrix-assisted laser desorption/ionization mass spectrometry on a KRATOS MALDI III with α-cyano-4-hydroxycinnamic acid as the matrix.
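To interpret such spectra, the observed MH+ of each peptide is compared with a value calculated from its sequence. The sketch below assumes monoisotopic residue masses, which appear consistent with the calculated values reported in Table I for the peptides checked here, and treats each methionine sulfoxide as the addition of one oxygen atom. The function name `mh_plus` is illustrative.

```python
# Monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # charge carrier for the singly protonated ion
OXYGEN = 15.99491  # mass shift for Met -> Met sulfoxide

def mh_plus(sequence, n_oxidized_met=0):
    """Calculated monoisotopic MH+ for a peptide written as 'G-T-Q-E-M-K'."""
    residues = sequence.replace("-", "")
    if n_oxidized_met > residues.count("M"):
        raise ValueError("more oxidized Met requested than Met residues present")
    mass = sum(MONO[r] for r in residues) + WATER + PROTON
    return mass + n_oxidized_met * OXYGEN

for seq in ("G-T-Q-E-M-K", "N-N-Y-N-I-M-E-I-R"):
    native = mh_plus(seq)
    oxidized = mh_plus(seq, n_oxidized_met=1)
    print(f"{seq}: MH+ {native:.1f}, sulfoxide {oxidized:.1f}")
```

An observed shift of about 16 mass units relative to the unmodified peptide is therefore what one would expect for a single methionine sulfoxide.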
Cation Exchange HPLC
Cation exchange HPLC was performed using a TosoHaas TSK gel SP-5PW column (7.5 mm X 75 mm) connected to a Hewlett Packard 1050 liquid chromatograph. Buffer A was 20 mM sodium phosphate, pH 7.0, while buffer B was 20 mM sodium phosphate, pH 7.0/0.5 M sodium sulfate. The column was initially equilibrated with 30% buffer A at a flow rate of 1 ml/min. The elution was monitored by absorbance at 230 nm. The protein was eluted using isocratic conditions at 30% B for 20 minutes, followed by a linear gradient to 100% B over 1 minute, and finally isocratic conditions at 100% B for 20 minutes. The hydrogen peroxide eluted in the void volume. The KGF samples collected from cation exchange HPLC were buffer exchanged into phosphate-buffered saline. The protein concentrations were determined by UV absorption at 280 nm, assuming an extinction coefficient of 1.5 for a 0.1% protein solution. The concentrations were then adjusted to 1 mg/ml.
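The concentration step above is a direct application of the Beer-Lambert law: a 0.1% (w/v) solution is 1 mg/ml, so an extinction coefficient of 1.5 for a 0.1% solution means 1 mg/ml gives an absorbance of 1.5. The sketch below assumes a 1 cm path length, which the text does not state; the function names and the example readings are hypothetical.

```python
def concentration_mg_per_ml(a280, ext_coeff_0p1pct=1.5, path_cm=1.0, dilution_factor=1.0):
    """Protein concentration from A280, given the absorbance of a 0.1% (1 mg/ml) solution.

    path_cm = 1.0 is an assumption; dilution_factor corrects for any dilution
    made before the reading.
    """
    return a280 * dilution_factor / (ext_coeff_0p1pct * path_cm)

def diluent_volume_ml(sample_ml, conc_mg_per_ml, target_mg_per_ml=1.0):
    """Volume of buffer to add so the sample reaches the target concentration."""
    if conc_mg_per_ml < target_mg_per_ml:
        raise ValueError("sample is already below the target concentration")
    return sample_ml * (conc_mg_per_ml / target_mg_per_ml - 1.0)

# Hypothetical reading: A280 = 3.0 for a 0.5 ml sample.
conc = concentration_mg_per_ml(3.0)
print(conc, diluent_volume_ml(0.5, conc))
```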
Circular Dichroism
The far UV CD spectra and conformational stability, as determined by thermal denaturation, were compared using a Jasco J720 spectropolarimeter. The far UV CD spectra were determined using a cuvette with a 0.02 cm pathlength. Thermal stability was determined by continuously monitoring the change in the signal at 231 nm with increasing temperature in a thermal cuvette with a 0.1 cm pathlength and a Peltier JTC-345 thermal control unit.
Balb/MK Cell Proliferation Bioassay
The in vitro mitogenic bioassay used to determine the biological activity of KGF was similar to that described by Rubin et al. (13). The assay measured the incorporation of [3H]-thymidine by Balb/MK epidermal keratinocytes.
Kinetics of Methionine Oxidation of KGF
The kinetics of oxidation of KGF were determined at room temperature with various time points. KGF was incubated with 0.5% hydrogen peroxide (v/v) in 50 mM sodium acetate, pH 5.0 at a concentration of 0.4 mg/ml. Aliquots at various incubation times were injected onto a Vydac C4 column (4.6 mm X 250 mm) connected to a Hewlett Packard 1090 using the conditions described in the cation exchange HPLC section.
III. Results and Discussion
Figure 1. RP-HPLC chromatograms of the different species (A-F) generated by incubating KGF with hydrogen peroxide in 50 mM sodium acetate, pH 5.0 at a concentration of 0.4 mg/ml under various conditions. Panel 1 - no hydrogen peroxide. Panel 2 - 0.5% hydrogen peroxide, 1 hour at 4°C. Panel 3 - 0.125% hydrogen peroxide, 8 hours at 0°C. Panel 4 - 0.5% hydrogen peroxide, 16 hours at 4°C. Panel 5 - 2% hydrogen peroxide, 24 hours at room temperature. [x-axis: Time (min.)]
Figure 2. Tryptic peptide maps of native KGF (A) and the individual oxidized KGF species (B-F) from Figure 1. Met-containing peptides were labelled g1, h1, i1, j1, k1, and l1, while the peptides containing the free sulfhydryl Cys 40 were labelled m1 and n1. Oxidized forms of the same peptides were labelled g2, h2, i2, j2, k2, l2, m2, and n2, respectively. Peptide i2 co-elutes with the peptide T-V-A-V-G-I-V-A-I-K, while peptide l2 co-elutes with the peptide E-L-I-L-E-N-H-Y-N-T-Y-A-S-A-K. [x-axis: Time (min.)]
Figure 1 shows RP-HPLC chromatograms of KGF samples oxidized by hydrogen peroxide under various conditions. KGF eluted as peak A in panel 1. Oxidized KGF eluted at earlier retention times (panels 2-5). As the temperature, the duration of incubation, or the
hydrogen peroxide concentration was increased, the modified protein eluted at earlier retention times than native KGF. The individual oxidized KGF species observed on RP-HPLC were labelled from B-F. Sufficient quantities of KGF species A-F were generated using the conditions described in the legend to Figure 1. These forms were collected from RP-HPLC, then subjected to tryptic peptide mapping. The tryptic peptide maps of species A-F are shown in Figure 2. Peptides containing Met residues derived from native KGF (species A) were labelled as g1, h1, i1, j1, k1, and l1. Peptides containing the free sulfhydryl Cys 40, now carboxyamidomethylated with iodoacetamide, were labelled as m1 and n1. In the peptide map of species B, a large portion of peptide g1 shifted to g2; in the map of species C, the majority of peptide h1 shifted to h2; in the map of species D, peptides g1 and h1 shifted to g2 and h2 respectively; in the map of species E, peptides g1, h1, and i1 shifted to g2, h2, and i2 respectively; and in the map of species F, peptides g1, h1, i1, j1, k1, l1, m1, and n1 shifted to g2, h2, i2, j2, k2, l2, m2, and n2 respectively. The mass of peptides g2, h2, i2, j2, k2, and l2 each increased by about 16 mass units as compared to the respective unoxidized peptides, consistent with the oxidation of the methionine residues to methionine sulfoxide (Table I).

Table I. Sequence and mass spectrometry analysis of methionine-containing peptides from the tryptic maps of the oxidized KGF species

Peptide  Observed Sequence                              Frag.      Obs. (MH+) Mass  Calc. (MH+) Mass  Mass Difference
g1       G-T-Q-E-M-K                                    (56-61)    693.7            693.3             0.4
h1       S-Y-D-Y-M-E-G-G-D-I-R                          (24-34)    1305.6           1305.5            0.1
i1       T-A-H-F-L-P-M-A-I-T                            (154-163)  1101.4           1101.6            -0.2
j1       W-T-H-N-G-G-E-M-F-V-A-L-N-Q-K                  (125-139)  1732.1           1731.8            0.3
k1       N-N-Y-N-I-M-E-I-R                              (62-70)    1166.3           1166.6            -0.3
l1       G-V-E-S-E-F-Y-L-A-M-N-K                        (81-92)    1387.2           1387.7            -0.5
m1       L-F-X-R (X = carboxyamidomethyl-Cys)           (38-41)    595.4            538.7             56.7
n1       R-L-F-X-R (X = carboxyamidomethyl-Cys)         (37-41)    751.5            694.9             56.6
g2       G-T-Q-E-M-K                                    (56-61)    709.7            693.3             16.4
h2       S-Y-D-Y-M-E-G-G-D-I-R                          (24-34)    1321.6           1305.5            16.1
i2       T-A-H-F-L-P-M-A-I-T                            (154-163)  1117.1           1101.6            15.5
j2       W-T-H-N-G-G-E-M-F-V-A-L-N-Q-K                  (125-139)  1747.9           1731.8            16.1
k2       N-N-Y-N-I-M-E-I-R                              (62-70)    1182.1           1166.6            15.5
l2       G-V-E-S-E-F-Y-L-A-M-N-K                        (81-92)    1403.4           1387.7            15.7
m2       L-F-X-R (X = cysteic acid)                     (38-41)    586.4            538.7             47.7
n2       R-L-F-X-R (X = cysteic acid)                   (37-41)    742.6            694.9             47.7

In summary, KGF species A was
unoxidized KGF; species B was identified to have most of Met 28 oxidized; species C had the majority of Met 60 oxidized; species D had Met 28 and 60 oxidized; and species E had Met 28, 60, and 160 oxidized. Methionine oxidation proceeded further, with no intermediate forms, from a species with Met 28, Met 60, and Met 160 oxidized (species E) to a species in which all six Met residues were oxidized (species F). The conditions required to oxidize all six Met residues also oxidized the free sulfhydryl Cys 40 to cysteic acid. Cys 40 has previously been suggested to reside in a solvent-inaccessible, buried environment (14). These data imply that the core of the protein opens up under these oxidizing conditions. As a result, the fully oxidized protein readily precipitated out of solution, and no further analysis was performed on it. In contrast, the conditions required to oxidize Met 28, Met 60, and Met 160 to methionine sulfoxide did not oxidize Cys 40 nor modify any other residues, as determined by peptide mapping. These oxidized KGF forms remained soluble and stable.
The kinetics of oxidation for each Met residue were determined by RP-HPLC. The peak integration of the various oxidized forms taken at different time points was plotted as a function of time (Figure 3). After 60 minutes, about 90% of Met 28 and Met 60 were oxidized, while about 35% of Met 160 and essentially none of the Met 67, Met 90, and Met 132 residues had been converted to sulfoxide derivatives. The results indicate that Met 28 and Met 60 oxidize at a very rapid rate, while Met 160 was oxidized at a slightly slower rate. These data also suggest that Met 28 and Met 60 are located in an exposed environment on the surface of the protein and that Met 160 may reside in a partially solvent-accessible environment. Met 67, Met 90, and Met 132 oxidize at substantially slower rates, suggesting that they reside in a buried environment that is relatively inaccessible to the oxidizing agent.
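The 60-minute values quoted above can be turned into rough rate estimates. The sketch below assumes pseudo-first-order kinetics (hydrogen peroxide in large excess over protein) and uses a single time point per residue, so the numbers are order-of-magnitude illustrations only, not fitted constants from the study. The function names are illustrative.

```python
import math

def rate_constant_per_min(fraction_oxidized, t_min):
    """Pseudo-first-order rate constant from one time point (assumed model).

    fraction remaining = exp(-k * t)  =>  k = -ln(1 - fraction_oxidized) / t
    """
    if not 0.0 <= fraction_oxidized < 1.0 or t_min <= 0:
        raise ValueError("need 0 <= fraction < 1 and t > 0")
    return -math.log(1.0 - fraction_oxidized) / t_min

def half_life_min(k_per_min):
    """Half-life for a first-order process."""
    return math.log(2.0) / k_per_min

# Approximate fractions oxidized after 60 minutes, as quoted in the text.
for label, fraction in (("Met 28 / Met 60", 0.90), ("Met 160", 0.35)):
    k = rate_constant_per_min(fraction, 60.0)
    print(f"{label}: k ~ {k:.4f} per min, half-life ~ {half_life_min(k):.0f} min")
```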
Circular Dichroism and Mitogenic Bioassay
Sufficient quantities of control KGF, KGF with Met 28 and Met 60 oxidized, and KGF with Met 28, Met 60, and Met 160 oxidized were generated as described in the legend to Figure 1. Cation exchange HPLC was utilized to rapidly remove the hydrogen peroxide from the protein samples prior to the Balb/MK mitogenic assay and CD analysis to avoid the RP-HPLC solvents that could potentially inactivate the protein. All samples were 90-95% homogeneous, except for the KGF species with Met 28 and Met 60 oxidized, which was approximately 85% homogeneous. From the analysis of these samples by CD, the far UV spectra of the KGF species with two methionine residues oxidized (Met 28 and 60) and the species with three methionine residues oxidized (Met 28, Met 60, and Met 160) were determined to be identical to that of native
KGF with regard to the positive feature at 231 nm, as well as the β-sheet signal and the negative drift from the disulfide signals (Figure 4A). The oxidation of Met 28, Met 60, and Met 160 did not alter the secondary structure of KGF, nor the gross structure as monitored by the 231 nm peak in the far UV CD region. The stability of the two methionine oxidized species and three methionine oxidized species in solution was compared to that of native KGF using thermal denaturation (Figure 4B). Thermal denaturation of KGF is an irreversible reaction, as KGF precipitates following heat-induced denaturation, so stability was assessed by comparing the temperature at the onset of protein unfolding. The onset of melting occurred at 49°C in all samples. The identical melting points indicate that oxidation of Met 28, Met 60, and Met 160 did not result in a significant decrease in the rigidity or thermostability of the molecule. In the mitogenic bioassay, the KGF control had maximal activity around 34,000 cpm (Figure 5). To achieve 40% maximal activity in the bioassay (about 13,600 cpm), the KGF control required about 2 ng/ml and the KGF species with Met 28 and Met 60 oxidized required 6 ng/ml. However, achieving this level of activity with KGF in which Met 28, Met 60, and Met 160 were oxidized required about 2 µg/ml. KGF with Met 28, Met 60, and Met 160 oxidized lost significant biological activity. Oxidation of Met 160 in KGF to methionine sulfoxide did not appear to affect the secondary structure or the stability of the molecule, yet it resulted in a severe loss in biological activity. These results imply that Met 160 may play an important role in the biological function of KGF.

Figure 3. The kinetics of oxidation for each Met residue of KGF. KGF was incubated with 0.5% hydrogen peroxide in 50 mM sodium acetate, pH 5.0 at a concentration of 0.4 mg/ml at room temperature. □ - Met 28 oxidized. △ - Met 60 oxidized. ● - Met 160 oxidized. ○ - Met 67, Met 90, and Met 132 oxidized. [x-axis: Time (min)]

Figure 4. (A) [left] Far UV CD spectra of the different KGF samples. 1 - KGF control without cation exchange HPLC step. 2 - KGF with Met 28 and Met 60 oxidized. 3 - KGF with Met 28, Met 60, and Met 160 oxidized. [x-axis: Wavelength (nm)] (B) [right] Thermal denaturation of the different KGF samples. The CD signal was monitored at 231 nm with increasing temperature. 1 - KGF control without cation exchange HPLC step. 2 - KGF with Met 28 and Met 60 oxidized. 3 - KGF with Met 28, Met 60, and Met 160 oxidized. [x-axis: Temperature (°C)]

Figure 5. Balb/MK mitogenic bioassay of the KGF samples. □ - KGF control without cation exchange HPLC step. △ - KGF control with cation exchange HPLC step. ● - KGF with Met 28 and Met 60 oxidized. ○ - KGF with Met 28, Met 60, and Met 160 oxidized. [x-axis: ng/ml]
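The bioassay comparison reduces to simple arithmetic on the concentrations needed to reach the same response (40% of the control maximum). The sketch below uses only the approximate values quoted in the text, converted to ng/ml; the variable and function names are illustrative.

```python
MAX_ACTIVITY_CPM = 34_000          # approximate maximal activity of the KGF control
TARGET_FRACTION = 0.40             # response level used for the comparison

# Approximate concentrations (ng/ml) required to reach the target response.
REQUIRED_NG_PER_ML = {
    "control": 2.0,
    "Met 28 + Met 60 oxidized": 6.0,
    "Met 28 + Met 60 + Met 160 oxidized": 2_000.0,   # about 2 ug/ml
}

def target_cpm(max_cpm=MAX_ACTIVITY_CPM, fraction=TARGET_FRACTION):
    """Response level corresponding to the chosen fraction of maximal activity."""
    return max_cpm * fraction

def fold_loss_in_potency(sample_ng_per_ml, control_ng_per_ml):
    """How many times more protein the sample needs than the control."""
    return sample_ng_per_ml / control_ng_per_ml

print(f"target response: {target_cpm():.0f} cpm")
control = REQUIRED_NG_PER_ML["control"]
for name, conc in REQUIRED_NG_PER_ML.items():
    print(f"{name}: {fold_loss_in_potency(conc, control):.0f}-fold")
```

On these figures, oxidizing Met 28 and Met 60 costs roughly 3-fold in potency, whereas additionally oxidizing Met 160 costs roughly 1000-fold.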
IV. Conclusion
In vitro hydrogen peroxide oxidation of the methionine residues of KGF to their sulfoxide derivatives demonstrated that Met 28, Met 60, and Met 160 are preferentially oxidized. Met 28 and Met 60 appear to be solvent-accessible and exposed on the surface of the protein, while Met 160 appears to be partially solvent-accessible. In contrast, Met 67, Met 90, and Met 132 oxidized much more slowly, which suggests they are located in a buried, less solvent-accessible environment. Oxidation of these buried residues results in the generation of a highly unstable form of KGF that readily forms aggregates. Met 160 has been identified as a residue that may be critical to the biological function of KGF. A severe loss in biological activity was observed upon conversion of this residue to methionine sulfoxide, yet no change in secondary structure or thermostability was seen.
References
1. Hsu, Y-R., Narhi, L.O., Spahr, C.S., Langley, K.E., and Lu, H.S. (1996). Protein Sci. 5:1165-1173.
2. Keck, R.G. (1996). Anal Biochem. 236:56-62.
3. Manning, M.C., Patel, K., and Borchardt, R.T. (1989). Pharmaceutical Res. 6:903-918.
4. Anantharamaiah, G.M., Hughes, T.A., Iqbal, M., Gawish, A., Neame, P.J., Medley, M.F., and Segrest, J.P. (1988). J Lipid Research. 29:309-318.
5. Oda, T., and Tokushige, M. (1988). J Biochem. 104:178-183.
6. Teh, L.C., Murphy, L.J., Huq, N.L., Sums, A.S., Friesen, H.G., Lazarus, L., and Chapman, G.E. (1987). J Biol Chem. 262:6472-6477.
7. De la Llosa, P., El Abed, A., and Roy, M. (1980). Can J Biochem. 58:745-748.
8. Finch, P.W., Rubin, J.S., Miki, T., Ron, D., and Aaronson, S.A. (1989). Science. 245:752-755.
9. Ron, D., Bottaro, D.P., Finch, P.W., Morris, D., Rubin, J.S., and Aaronson, S.A. (1993). J Biol Chem. 268:2984-2988.
10. Perkett, E.A. (1995). Current Opinion in Pediatrics. 7:242-249.
11. Ulich, T.R., Yi, E.S., Cardiff, R., Yin, S., Bikhazi, N., Biltz, R., Morris, C.F., and Pierce, G.F. (1994). Amer J Pathol. 144:862-868.
12. Staiano-Coico, L., Krueger, J.G., Rubin, J.S., D'limi, S., Vallat, V.P., Valentino, L., Fahey III, T., Hawes, A., Kingston, G., Madden, M.R., Mathwich, M., Gottlieb, G., and Aaronson, S.A. (1993). J Exp Med. 178:865-878.
13. Rubin, J.S., Osada, H., Finch, P.W., Taylor, W.G., Rudikoff, S., and Aaronson, S.A. (1988). Proc. Natl. Acad. Sci. USA 86:802-806.
14. Hsu, Y-R., Hsu, E. W-J., Katta, V., Brankow, D., Tseng, J., Hu, S., Morris, C.F., Kenney, C.W., and Lu, H.S. (1996). Biochem J. (in press).
SECTION IV Posttranslational and Other Modifications
Effects of Enzyme Glycosylation on the Chemical Step of Catalysis, as Probed by Hydrogen Tunneling and Enthalpy of Activation
Amnon Kohen, Thorlakur Jonsson* and Judith P. Klinman *
Department of Chemistry, University of California, Berkeley, CA 94720
I. Introduction
The effect of protein glycosylation on enzyme activity is poorly understood. Recent reports suggest that glycosylation does not change the enzyme conformation but does reduce the protein dynamic fluctuations (cf. Rudd et al., 1994). This effect may inhibit catalytic activity, although it is not clear which kinetic steps will be predominantly affected. Most activity assays cannot distinguish between effects on the chemical step and effects on internal enzyme-substrate rearrangements or product release. Hydrogen tunneling should be very sensitive to any change in the barrier crossing process for the chemical step. The contribution of quantum-mechanical tunneling to the hydrogen transfer step was demonstrated in the past with several enzymes (for review see Bahnson & Klinman, 1995) and most recently for lipoxygenase (Jonsson et al., 1996). In this study, we have investigated the effect of glycosylation on the tunneling contribution to the hydrogen transfer step in the glucose oxidase (GO)1 catalyzed reaction. Our results suggest that the degree of tunneling is affected by changes in the polysaccharide envelope on the surface of the protein. After establishing that the chemical step is largely rate limiting with [l-^H]-2-deoxyglucose as a substrate, enthalpies of activation have been measured with different glycoforms. A decrease in ΔH‡ is found to correlate with an increase in tunneling. Glucose oxidase (EC 1.1.3.4.) catalyzes the oxidation of glucose to gluconolactone and the subsequent reduction of oxygen to hydrogen peroxide in accordance with a ping-pong steady state kinetic mechanism (Pazur & Kleppe, 1964; Crueger & Crueger, 1990):
[Reaction scheme: glucose oxidase oxidizes the sugar to the corresponding lactone while FAD is reduced to FADH2; FADH2 is then reoxidized by O2, producing H2O2.]
* Current address: Kairos Scientific, Bldg. 62, 3350 Scott Boulevard, Santa Clara, CA 95054.
Author to whom correspondence should be addressed.
1 Abbreviations: GO, glucose oxidase; KIE, kinetic isotope effect; TLC, thin-layer chromatography; kD, kilodalton; dd, doublet-doublet; ax, axial; eq, equatorial; M.W., molecular weight.
The enzyme contains one very tightly, but noncovalently, bound FAD cofactor per monomer and is a homodimer with a molecular weight of 130-320 kD, depending on the extent of glycosylation. Previous reports suggest that the hydrogen transfer step is not fully rate limiting under steady state conditions for the oxidation of glucose (Bright & Gibson, 1967; Bright & Appleby, 1969; Kriechbaum et al., 1989). Consequently, anomerically protonated, deuterated and tritiated 2-deoxyglucose were used as substrates in the present study. Our results indicate that the protonated substrate is not fully rate limited by hydrogen transfer (see Results) but that deuterated and tritiated substrates can be used as direct probes of the chemical step of the sugar oxidation.

II. Materials and Methods
A. Materials
[1-3H]-2-Deoxyglucose (10 Ci/mmol), 2-deoxyglucose (grade III), glucose oxidase (EC 1.1.3.4.) from Aspergillus niger (VII-S, for synthesis only) and all salts and buffers (unless otherwise indicated) were from Sigma. [UL-14C]-2-Deoxyglucose (255 mCi/mmol) was from American Radiolabeled Chemicals, Inc. Bis-tris propane was from Calbiochem. D2O (99.9% and 99.996%) was from Cambridge Isotope Laboratories or Aldrich. Sodium amalgam (5%) was from Anachima. Glucose oxidase from Aspergillus niger (grade I, 211 U/mg), hexokinase, endoglycosidase-H and α-mannosidase were from Boehringer Mannheim. The recombinant glucose oxidase from Aspergillus niger expressed in yeast (pSGO-2 plasmid - Frederick et al., 1990) was a generous gift of Dr. Steven Rosenberg from Chiron Co (Emeryville, CA). The pH of all buffers was adjusted at the experimental temperature.

Synthesis of Anomerically Deuterated 2-Deoxyglucose
[1-2H]-2-Deoxyglucose was synthesized from 2-deoxyglucose by oxidation to 2-deoxygluconolactone catalyzed by GO, followed by sodium amalgam reduction in D2O to yield the deuterated product. 2-Deoxyglucose (1 g; 3.78 mmol) was dissolved in 1 mL of 0.2 M potassium phosphate buffer pH 5.6. GO (155 mg; 25000 units, type VII-S, containing ~1% catalase) was added and the reaction was stirred gently under oxygen until all the reactant was converted to product (as detected by TLC [7:1 ethyl acetate:methanol - stained by "Yellow spray": 12 g (NH4)6Mo7O24·4H2O and 0.5 g ceric ammonium nitrate in 300 ml 10% H2SO4, heated]). 25 mL of acetonitrile was added and the aqueous phase was re-extracted twice with acetonitrile. The organic phases were pooled and dried under reduced pressure and then lyophilized twice from 50 mL D2O (99.9%) to exchange all exchangeable hydrogens. The produced lactone, 0.894 g (3.38 mmol - 89.5% yield from previous step), was dissolved in 130 mL of D2O (99.996%) in a three-neck flask equipped with a mechanical stirrer and stirred under argon for 10 min. Reduction to [1-2H]-2-deoxyglucose was performed by a modification of the procedure of Isbell et al. (1962). The flask was cooled to -2 °C in an ice-salt bath and 10 g of NaHC2O4 (lyophilized twice from D2O) and 23 g of sodium amalgam (5%) were successively added. The reaction mixture was stirred under Ar and the temperature was raised to 4 °C. After all the lactone was converted to sugar (~1 h, by TLC detection), the pH was raised to 10 by addition of NaOD and 1 L of methanol was then added. The mercury and the precipitated salts were filtered through a fine sintered-glass filter and the volume was reduced to 5 mL. Methanol (500 mL) was added and more salts were filtered out. The product was passed through a mixed ion-exchanger (BioRad, TMD-8; H+/HCO3- form) until the conductivity was similar to that of distilled water (which served as eluent). The product was lyophilized and then crystallized twice from hot ethanol to yield 0.502 g (50.2% overall yield). No anomeric protons were detected by 1H NMR (assigned as dd peak at 5.23 ppm and dd peak at 4.78 ppm for the protonated sugar), which also revealed the loss of the vicinal anomeric J-coupling at the 2-deoxy protons (J1-2ax = 9.8 Hz at δ = 1.37 ppm, J1-2ax = 3.2 Hz at δ = 1.58 ppm, J1-2eq < 1 Hz at δ = 1.98 ppm and J1-2eq = 3.2 Hz at δ = 2.11 ppm as measured for the protonated sugar). Electrospray mass-spectrometry (MNa+ = 188.1) detected no 1H-contamination. Micro analysis found C 43.49% (calc. 43.68%) and H 7.54% (calc. 7.33%).
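The electrospray result can be checked with a short calculation. The sketch below assumes the molecular formulas C6H12O5 for 2-deoxyglucose and C6H11DO5 for the anomerically deuterated product, and computes monoisotopic m/z values for the sodium adducts. The function and constant names are illustrative.

```python
# Monoisotopic atomic masses (Da).
MONO = {"C": 12.0, "H": 1.00782503, "D": 2.01410178, "O": 15.99491462, "Na": 22.98976928}
ELECTRON = 0.00054858

def sodium_adduct_mz(formula):
    """m/z of the singly charged [M+Na]+ ion for a formula given as {element: count}."""
    neutral = sum(MONO[element] * count for element, count in formula.items())
    return neutral + MONO["Na"] - ELECTRON

PROTIO_2DG = {"C": 6, "H": 12, "O": 5}            # 2-deoxyglucose (assumed formula)
DEUTERIO_2DG = {"C": 6, "H": 11, "D": 1, "O": 5}  # [1-2H]-2-deoxyglucose (assumed formula)

for name, formula in (("protonated", PROTIO_2DG), ("deuterated", DEUTERIO_2DG)):
    print(f"{name}: MNa+ = {sodium_adduct_mz(formula):.1f}")
```

The one-unit difference between the two adducts is what allows residual 1H at the anomeric position to be detected by mass spectrometry.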
Synthesis of Anomerically Deuterated [UL-¹⁴C]-2-Deoxyglucose
[UL-¹⁴C, 1-²H]-2-Deoxyglucose was synthesized from [UL-¹⁴C]-2-deoxyglucose according to the strategy described above. All the purification steps were performed by HPLC (Phenomenex C18-NH2 column eluted with 87% acetonitrile in water). 65 μCi of [UL-¹⁴C]-2-deoxyglucose (250 mCi/mmol) was "diluted" with unlabelled 2-deoxyglucose to a specific activity of 8.4 mCi/mmol and dissolved in 170 μL of 0.1 M potassium phosphate, pH 7.2. 10 μL of catalase (5000 U) and 8 mg of GO (grade I) were added. The reaction was quenched after 14 min by adding 700 μL of acetonitrile, and the precipitated protein and salts were spun down (the pellet contained 9 μCi ¹⁴C). Unreacted reactant and the lactone were separated by HPLC (Altech NH2 column, eluted with 87% acetonitrile, 13% H2O at 2 mL/min) and pure lactone (~30 μCi) fractions were pooled and lyophilized. Exchangeable hydrogens were exchanged by lyophilizing twice from D2O (99.9%). The lactone was reduced in an apparatus similar to the one described above with 750 mg of NaHC2O4 (lyophilized twice from D2O) and 2.1 g of sodium amalgam (5%) in 10 mL of D2O (99.996%). After 45 min, 50 mL of acetonitrile was added. The mercury and the precipitated salts were removed by filtration through sintered glass. The product was purified by HPLC (same conditions as above), yielding 10.8 μCi of [UL-¹⁴C, 1-²H]-2-deoxyglucose (17% overall yield). The deuterium content of the radiolabelled product was not directly determined. In subsequent experiments no trend in D/T isotope effect values was detected with fractional conversion of substrate; such a trend has been seen previously with incompletely deuterated substrates (Cha et al., 1989).
Deglycosylation of Glucose Oxidase
Both wild type (M.W. 155±10 kD) and recombinant (M.W. 260±50 kD) glucose oxidase were deglycosylated enzymatically to the same glycoform (M.W. 136±3 kD).
Deglycosylation was conducted under non-denaturing conditions using a modification of the procedure described by Kalisz et al. (1990). Since deglycosylation of the recombinant enzyme was much faster, and the deglycosylated enzyme from both sources had the same M.W. and activity parameters (data not shown), the recombinant enzyme was used to produce the deglycosylated GO used below. In a typical experiment, 50 mg of GO was incubated for 24 h in 60 mL of 30 mM potassium phosphate buffer, pH 5.0, with 2.6 units of endoglycosidase H and 80 units of α-mannosidase. The reaction was monitored by SDS-PAGE until the GO showed a narrow (< 3 kD) band at 68 kD. The reaction mixture was concentrated by ultrafiltration (Amicon, YM30 membrane), and the buffer was replaced with 100 mM sodium acetate, pH 4.5. The mixture was loaded on a cation-exchange column (Mono-S, Pharmacia) and the GO eluted
with 100 mM sodium acetate, pH 4.5. The GO was concentrated by ultrafiltration to 3.4 mL in 0.1 M potassium phosphate, pH 7. The enzyme activity was measured by a continuous spectrophotometric assay (see Methods), the active site concentration was determined by FAD absorption at 452 nm (ε = 12.83 mM⁻¹ cm⁻¹) as described by Frederick et al. (1990), and the protein concentration was measured by the Bradford assay (BioRad reagent) using bovine serum albumin as standard, or by its absorption at 280 nm using a published factor of 1.67 O.D. per mg (Swoboda & Massey, 1965). The specific activity was 430 U/mg, and the overall yield of enzyme, based on active site measurement (452 nm absorption), was about 40%.
B. Methods
1. Competitive Kinetic Isotope Effect (KIE) Experiments
In these experiments a tritiated substrate was mixed with ¹⁴C-labeled protonated (for H/T KIE experiments) or deuterated (for D/T KIE experiments) substrate. The mixture was allowed to react in the presence of enzyme, under defined conditions of pH and temperature, and quenched at different fractional conversions. The quenched mixtures were analyzed by HPLC and liquid scintillation counting to determine the fractional conversion (f, determined from the ¹⁴C counting) and the tritium to ¹⁴C ratio in the products ([³H/¹⁴C]t and [³H/¹⁴C]∞ for the ratio at the time point and the infinity point, respectively). The L/T KIE's (also denoted by T(V/K)L or kL/kT) were calculated by equation 1 (Melander & Saunders, 1987):
$${}^{T}(V/K)_{L} = \frac{\ln(1-f)}{\ln\left\{1 - f\left[({}^{3}\mathrm{H}/{}^{14}\mathrm{C})_{t}\,/\,({}^{3}\mathrm{H}/{}^{14}\mathrm{C})_{\infty}\right]\right\}} \tag{1}$$
These experiments were carried out in 10 mM bis-tris propane, pH 9.0, as it was found that at this pH the KIE is largest, suggesting that the hydrogen transfer step is closest to being rate-limiting. Prior to kinetic experiments, tritiated and ¹⁴C-labeled substrates were copurified by the same HPLC procedure described above (see Materials). Typically, each experiment contained 1.35 μCi ³H and 0.13 μCi ¹⁴C (about 0.2 mM and 0.6 mM total substrate concentration for H/T and D/T KIE experiments, respectively). The reaction mixture (1.6 mL) was pre-equilibrated in a Neslab water bath (±0.1 °C) and three zero time points (t0) of 150 μL each were removed. The reaction was initiated by addition of enzyme and, in general, seven time points (t) were removed. After incubation overnight with more enzyme, an infinity time point (t∞) was taken. All samples were quenched with 1 μL of 70% HClO4. The sample pH was raised to 8.5 by addition of 2.5 μL of 6 M NaOH and all remaining substrate was converted to the 6-phospho derivative by addition of 4 mM MgCl2, 0.5 mM ATP, 90 mM Tris/HCl pH 8.5 and 30 units of hexokinase, followed by a one hour incubation.¹ The coupling reaction was quenched with 0.4 mM HgCl2 and the samples were stored at -70 °C until HPLC analysis. Before analysis, each sample was thawed and tetrabutylammonium hydrogen phosphate was added to a final concentration of 5 mM, after which precipitated salts were spun down. The HPLC analysis was performed with a C18 column (Phenomenex, UltraCarb) equilibrated with 5 mM tetrabutylammonium hydrogen phosphate, pH 6.8, and the sample was eluted with a 0 to 30% acetonitrile gradient.
¹ Derivatization was found necessary for efficient HPLC separation. The quantitative conversion of the unreacted substrate was guaranteed (and tested) by a large excess of hexokinase over initial GO and of ATP over 2-deoxyglucose. The loss of GO activity following acid quenching and neutralization was assayed by an oxygen electrode.
Tritiated water eluted in the dead volume (2 min), followed by 2-deoxygluconolactone (5 min), 2-deoxygluconic acid (8 min), and 6-phospho-2-deoxyglucose (15 min). Fractions were collected and analyzed using a liquid scintillation counter (Wallac). Since the substrate ³H/¹⁴C ratio at t0 was equal to the ³H/¹⁴C ratio in the products at t∞, an average of the three t0 samples was used as (³H/¹⁴C)∞ in equation 1.
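As a rough illustration of how equation 1 is applied to a single time point, the sketch below computes the L/T KIE from a fractional conversion and the two ³H/¹⁴C ratios. The function name and the numerical inputs are hypothetical and are not data from this study.

```python
import math

def competitive_kie(f, ratio_t, ratio_inf):
    """Equation 1: L/T KIE from fractional conversion and 3H/14C ratios.

    f         -- fractional conversion from 14C counting (0 < f < 1)
    ratio_t   -- 3H/14C ratio in the product at the sampled time point
    ratio_inf -- 3H/14C ratio at the infinity point (or the mean of the t0 samples)
    """
    if not 0.0 < f < 1.0:
        raise ValueError("fractional conversion must lie strictly between 0 and 1")
    return math.log(1.0 - f) / math.log(1.0 - f * ratio_t / ratio_inf)

# Hypothetical numbers, for illustration only (not measured values).
kie = competitive_kie(f=0.30, ratio_t=0.8, ratio_inf=10.0)
print(f"L/T KIE = {kie:.2f}")
```

When the product ratio equals the infinity ratio the function returns 1, i.e. no isotopic discrimination, which is a convenient sanity check on the counting data.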
2. Initial Velocity Measurements
Initial velocity measurements for enzyme activity (throughout the purification and before kinetic experiments) were performed by a continuous spectrophotometric assay as described by Lockridge et al. (1972) and modified by Frederick et al. (1990), with a Hewlett-Packard 8452A diode array spectrophotometer equipped with a thermostatted cell holder. Initial velocity measurements in kinetic experiments were carried out by following the oxygen consumption with an oxygen electrode (Yellow Springs Instruments; 1 mL chamber).
C. Data Analysis
1. KIE Analysis
Analysis of the temperature dependence of the KIE's was done by fitting them to equation 2 as described previously (Bahnson & Klinman, 1995 and references cited therein):
$$k_{L}/k_{T} = (A_{L}/A_{T})\,\exp\left[(E_{T}-E_{L})/RT\right] \tag{2}$$
where kL/kT is the L/T KIE, AL/AT is the KIE on the pre-exponential factors and ET-EL is the isotope effect on the enthalpy of activation. The fitted parameters enable comparison of the AL/AT values to theoretical limits from semiclassical models. The semiclassically calculated values range from 0.6 to 1.6 for AH/AT and from 0.9 to 1.2 for AD/AT (Schneider & Stern, 1972; Bell, 1980; Melander & Saunders, 1987). For proper error analysis, curve fitting was carried out as a least root mean square fit exponential regression of KIE vs 1/T (using the software KaleidaGraph). Each experimental point (as shown in Figure 1) was an average of two or three independent experiments with at least six time points and three zero time points per experiment.
2. Initial Velocity Data Analysis
KM and kcat and their standard errors were determined by a nonlinear least root mean square fit to equation 3 (using KaleidaGraph):
$$v/[E] = \frac{k_{cat}\,[S]}{K_{M}+[S]} \tag{3}$$
where v is the initial velocity, [E] is the enzyme concentration and [S] the substrate concentration.
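The fit to equation 2 can be sketched as follows. This is a simplified log-linear least-squares fit of ln(KIE) against 1/T, not the nonlinear exponential regression performed in KaleidaGraph, so it weights the points differently. The function name, temperatures and parameter values are hypothetical and serve only as a self-consistency check.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def fit_kie_arrhenius(temps_K, kies):
    """Straight-line least-squares fit of ln(KIE) vs 1/T.

    Returns (A_L/A_T, E_T - E_L in kcal/mol) in the sense of equation 2:
    the intercept is ln(A_L/A_T) and the slope is (E_T - E_L)/R.
    """
    x = [1.0 / t for t in temps_K]
    y = [math.log(k) for k in kies]
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), slope * R

# Synthetic check: generate KIE's from assumed parameters and recover them.
temps = [273.15, 283.15, 298.15, 308.15, 318.15]
true_A, true_dE = 1.3, 0.25  # hypothetical A_L/A_T and E_T - E_L (kcal/mol)
kies = [true_A * math.exp(true_dE / (R * t)) for t in temps]

A_ratio, dE = fit_kie_arrhenius(temps, kies)
print(f"A_L/A_T = {A_ratio:.3f}, E_T - E_L = {dE:.3f} kcal/mol")
```

Because the intercept is extrapolated far outside the measured 1/T range, small scatter in the KIE's translates into a comparatively large uncertainty in AL/AT, which is one reason a proper error analysis of the fit matters.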
III. RESULTS
A. Competitive KIE Temperature Dependence
The H/T and D/T KIE's for the three GO glycoforms were measured with the substrate 2-deoxyglucose (as described under Methods). The temperature was varied from 0 to 45 °C, and representative results for the wild type enzyme are presented in Figure 1.
Figure 1. H/T (•) and D/T (▲) KIE temperature dependence for the wild type GO. Data are presented as KIE Arrhenius plots against 1/T [K⁻¹] (see explanation and description under Methods). All the experiments were carried out in 10 mM bis-tris propane buffer, pH 9.0, as described under Methods.
The D/T KIE values were relatively larger than expected from the deuterium and tritium zero point energy differences. The H/T KIE values were large, but not as large as expected from semiclassical (no tunneling) calculations (Swain et al., 1958; Saunders, 1985):
$$k_{H}/k_{T} = (k_{D}/k_{T})^{3.26\text{--}3.34} \tag{4}$$
where kH/kT and kD/kT are the H/T and D/T KIE's, respectively. The observed trend is
$$k_{H}/k_{T} < (k_{D}/k_{T})^{3.26\text{--}3.34} \tag{5}$$
As discussed previously (Klinman 1995 and references cited therein), such a trend suggests that some kinetic complexity "masks" the intrinsic H/T KIE, as discussed below in more detail. Table I summarizes the KIE's on the Arrhenius pre-exponential factors (AH/AT and AD/AT) and their standard experimental errors calculated from a nonlinear least root mean square fit (see Data Analysis under Methods) for the different GO glycoforms. Additionally, the ratio of ln(H/T) to ln(D/T) (the exponent of equation 4) at 25 °C has been included. The different GO glycoforms are denoted by their molecular weight.
Table I. KIE's on the Arrhenius pre-exponential factor and the exponents relating H/T and D/T KIE at 25 °C

M.W. [kD]    AH/AT (a)        AD/AT (b)        exponent (c)
136          0.84 (±0.34)     1.47 (±0.09)     2.89 (±0.04)
155          1.28 (±0.18)     1.30 (±0.10)     2.93 (±0.05)
260          1.46 (±0.20)     0.89 (±0.04)     2.93 (±0.04)

a. The semiclassically calculated values range from 0.6 to 1.6 for AH/AT.
b. The semiclassically calculated values range from 0.9 to 1.2 for AD/AT.
c. See equation 4.
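The exponent column of Table I is the ratio ln(kH/kT)/ln(kD/kT) from equation 4. A minimal sketch of that calculation is given below; the function name and the KIE pair are hypothetical and are not values measured in this study.

```python
import math

# Semiclassical exponent range quoted with equation 4.
SEMICLASSICAL_RANGE = (3.26, 3.34)

def swain_schaad_exponent(kie_ht, kie_dt):
    """Exponent x in k_H/k_T = (k_D/k_T)**x, i.e. ln(H/T) / ln(D/T)."""
    return math.log(kie_ht) / math.log(kie_dt)

# Hypothetical KIE pair, for illustration only.
x = swain_schaad_exponent(kie_ht=7.0, kie_dt=1.95)
below = x < SEMICLASSICAL_RANGE[0]
print(f"exponent = {x:.2f}; below the semiclassical range: {below}")
```

An exponent below the semiclassical range, as in equation 5, is the signature the text attributes to kinetic complexity masking the intrinsic H/T KIE.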
Values for AH/AT are all within their semiclassical limits (see KIE Analysis under Methods). On the other hand, the AD/AT values for the three glycoforms differ significantly. The value for the heaviest GO glycoform (260 kD) is close to the lower semiclassical limit (0.9), while that for the wild type enzyme (155 kD) is slightly above the upper semiclassical limit (1.22) and the factor for the lightest, deglycosylated form (136 kD) is clearly and significantly above this limit. This observation is quite rare but was predicted and observed in the past when a substantial amount of tunneling was incorporated or expected. This issue is discussed in more detail elsewhere (Kohen et al., 1996).
B. Kinetic Complexity
When more than one kinetic step is rate limiting in an enzymatic turnover, the reaction is said to be kinetically complex. The role of kinetic complexity in glucose oxidase catalysis was studied by calculating the ratio between the ln(H/T) and ln(D/T) KIE's (Table I), which is the exponent term of equation 4. Through the temperature range studied, the distribution of these exponent values was rather narrow (2.82 to 2.98 and no more than 1.4% change for a single glycoform), suggesting a very small temperature dependence for the kinetic complexity. Moreover, the small trend in exponents with temperature can be shown to lead to H/T KIE's on the Arrhenius prefactors smaller than the intrinsic values for the lighter glycoforms and larger than the intrinsic value for the heavy one. This is a trend that further emphasizes the differences between them (Kohen et al., 1996). However, as the D/T KIE measurements are likely to be virtually free of kinetic complexity, the AD/AT values reported in Table I reflect the chemical step of the GO reaction.
C. Enthalpy of Activation for [1-²H]-2-Deoxyglucose
The enthalpy of activation for the oxidation of [1-²H]-2-deoxyglucose was determined with the three GO glycoforms.
When using anomerically deuterated 2-deoxyglucose, the chemical step is rate limiting, as discussed above. This substrate was synthesized on a gram scale (see Materials). Initial velocities for its oxidation catalyzed by the three GO glycoforms were determined at different temperatures by following the oxygen consumption using an oxygen electrode (see Methods). Determination of the enthalpy of activation requires measurements of kcat under full substrate saturation over a significant temperature range. The initial velocity measurements were performed under a pure oxygen atmosphere and with 0.5 M 2-deoxyglucose (> 8*KM) in 10 mM bis-tris propane buffer, pH 9, as described under
Methods. At high temperatures, saturation of the enzyme with oxygen under atmospheric pressure could not be achieved. However, since GO exhibits a ping-pong steady state kinetic mechanism (Swoboda & Massey, 1965; Bright & Gibson, 1967), the KM's of 2-deoxyglucose and oxygen were determined at 25 °C and 45 °C and a simple Michaelis factor was applied to correct the Arrhenius slope:
$$k_{cat} = \frac{V_{max,obs}\,(K_{M} + [S])}{[E]\,[S]} \tag{6}$$
where Vmax,obs is the observed Vmax, [S] the oxygen concentration, KM the measured Michaelis constant and [E] the enzyme concentration. The overall correction on the Arrhenius slope was less than a factor of 1.14 for all glycoforms and the trend among them was not affected. The enthalpies of activation thus determined are 8.1 (±0.36), 11.0 (±0.29) and 13.7 (±0.31) kcal/mole for the 136 kD, 155 kD and 260 kD glycoforms, respectively.
IV. DISCUSSION
The most significant results of this study are summarized in Table II.

Table II. D/T KIE's on the Arrhenius Pre-Exponential Factors and Enthalpies of Activation for [1-²H]-2-Deoxyglucose with Glucose Oxidase Glycoforms (a)

M.W. (kD)    AD/AT (b)          ΔHD‡ (kcal/mole)
260          0.896 (±0.041)     13.64 (±0.31)
155          1.302 (±0.100)     11.06 (±0.29)
136          1.462 (±0.089)     8.18 (±0.39)

a. All the experiments were carried out under the same buffer and pH conditions (see Methods).
b. The semiclassical calculated range is 0.9 to 1.22 (see text).
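The Michaelis correction of equation 6, used to obtain the kcat values behind the enthalpies in Table II, can be sketched as below. The function and argument names and the numerical inputs are hypothetical; only the form of the correction is taken from equation 6.

```python
def corrected_kcat(v_max_obs, enzyme_conc, k_m_o2, o2_conc):
    """Equation 6: extrapolate the observed maximal velocity to oxygen saturation.

    v_max_obs   -- observed velocity at saturating sugar, in M/s
    enzyme_conc -- active-site concentration [E], in M
    k_m_o2      -- measured Michaelis constant for oxygen, in M
    o2_conc     -- dissolved oxygen concentration [S], in M
    """
    return v_max_obs * (k_m_o2 + o2_conc) / (enzyme_conc * o2_conc)

# Hypothetical inputs, for illustration only.
kcat = corrected_kcat(v_max_obs=2.0e-6, enzyme_conc=1.0e-8,
                      k_m_o2=0.2e-3, o2_conc=1.0e-3)
correction = (0.2e-3 + 1.0e-3) / 1.0e-3  # the factor (K_M + [S]) / [S]
print(f"kcat = {kcat:.0f} s^-1 (correction factor {correction:.2f})")
```

The factor (KM + [S])/[S] approaches 1 as the oxygen concentration rises well above KM, so the correction matters mainly at the higher temperatures where oxygen saturation could not be reached.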
Two striking points arise from this table. First, the D/T KIE on the Arrhenius pre-exponential factor (AD/AT) for the heavy GO glycoform is close to the lower limit of the semiclassical range. On the other hand, this factor is slightly above the upper semiclassical limit for the 155 kD glycoform and well above this limit for the light glycoform. Second, there is a correlation between a larger pre-exponential factor and a lower enthalpy of activation. Larger than unity KIE's on the Arrhenius pre-exponential factors are an indication of extensive tunneling and are expected to be accompanied by a lowered enthalpy of activation (Jonsson et al., 1996). Therefore, it appears that the degree of glycosylation on the surface of the protein has changed the nature of the chemical step of the enzyme catalysis. Apparently, more tunneling is involved in catalysis when the size of the polysaccharide envelope is reduced. These effects are discussed in detail elsewhere (Kohen et al., 1996).
Acknowledgment
We thank Dr. Steven Rosenberg (Chiron Co., Emeryville, CA) for the generous gift of recombinant glucose oxidase.
References
Bahnson, B. J., & Klinman, J. P. (1995) Meth. Enzymol. 249, 373-397.
Bell, R. P. (1980) The Tunneling Effect in Chemistry, Chapman & Hall, London & New York.
Bright, H. J., & Gibson, Q. H. (1967) J. Biol. Chem. 242, 994-1003.
Bright, H. J., & Appleby, M. (1969) J. Biol. Chem. 244, 3625-3634.
Cha, Y., Murray, C. J., & Klinman, J. P. (1989) Science 243, 1325-1330.
Cruger, A., & Cruger, W. (1990) in Microbial Enzymes and Biotechnology (Fogarty, W. M., & Kelly, C. T., Eds.) Elsevier Applied Science, London, pp 177-227.
Frederick, K. R., Tung, J., Emerick, R. S., Masiarz, E. F., Chamberlain, S. H., Vasavada, A., Rosenberg, S., Chakraborty, S., Schopfer, L. M., & Massey, V. (1990) J. Biol. Chem. 265, 3793-3802.
Gibson, Q. H., Swoboda, B. E. P., & Massey, V. (1964) J. Biol. Chem. 239, 3927-3934.
Glickman, M. H., Wiseman, J. S., & Klinman, J. P. (1994) J. Am. Chem. Soc. 116, 793-794.
Hecht, H. J., Kalisz, H. M., Hendle, J., Schmid, R. D., & Schomburg, D. (1993) J. Mol. Biol. 229, 153-172.
Hwang, C. C., & Grissom, C. B. J. Am. Chem. Soc. 116, 795-796.
Isbell, H. S., Holt, N. B., & Frush, H. L. (1962) in Methods in Carbohydrate Chemistry (Whistler, R. L., & Wolfrom, M. L., Eds.) Academic Press, New York, pp 276-280.
Jonsson, T., Glickman, M., Sun, S., & Klinman, J. P. (1996) J. Am. Chem. Soc. in press.
Kalisz, H. M., Hecht, H. J., Schomburg, D., & Schmid, R. D. (1990) J. Mol. Biol. 213, 207-209.
Kalisz, H. M., Hecht, H. J., Schomburg, D., & Schmid, R. D. (1991) Biochim. Biophys. Acta 1080, 138-142.
Kohen, A., Jonsson, T., & Klinman, J. P. (1996) submission to Biochemistry.
Kriechbaum, M., Heilman H. J., Wientjes, F. J., Hahn, M., Jany, K. D., Gassen, H. G., Sharif, F., & Alaeddinoglu, G. (1989) FEBS Lett. 255, 63-66.
Lockridge, O., Massey, V., & Sullivan, P. A. (1972) J. Biol. Chem. 247, 8097-8106.
Melander, L., & Saunders, W. H. (1987) Reaction Rates of Isotopic Molecules, Krieger, R. E., Fl.
Muller, D. (1928) Biochem. Z. 199, 136-170.
Pazur, J. H., & Kleppe, K. (1964) Biochemistry 3, 578-583.
Rudd, P. M., Joao, H. C., Coghill, E., Fiten, P., Saunders, M. R., Opdenakker, G., & Dwek, R. A. (1994) Biochemistry 33, 17-22.
Saunders, W. H. (1985) J. Am. Chem. Soc. 107, 164-169.
Schneider, M. E., & Stern, M. J. (1972) J. Am. Chem. Soc. 94, 1517-1522.
Swain, C. G., Stivers, E. C., Reuwer, J. F., & Schaad, L. J. (1958) J. Am. Chem. Soc. 80, 5885-5893.
Swoboda, B. E. P., & Massey, V. (1965) J. Biol. Chem. 240, 2209-2215.
Profile Analysis of Oligosaccharides from Glycoproteins by PMP Labeling. Comparison of Chemical and Enzymatic Release Methods Using RP-HPLC and Mass Spectrometry
Hanspeter Michel, Yuemei Ma, Barbara DeBarbieri, and Yu-Ching E. Pan
Dept. of Analytical Research and Development, Hoffmann-La Roche Inc., Nutley, NJ 07110
I. Introduction
Characterization of carbohydrates in glycoproteins has become an important task in the field of biotechnology, since a number of protein biopharmaceuticals are glycosylated. To gain detailed structural information, it would be advantageous to have a routine profiling method that can provide suitable samples for further analysis. Chemical or enzymatic methods can be employed to release intact oligosaccharides (1). Hydrazine is a commonly used reagent in the chemical release of both N- and O-linked oligosaccharides (2,3). An instrument that performs automated hydrazinolysis, as well as purification and recovery of oligosaccharides, is commercially available. In the case of enzymatic treatment, peptide N-glycosidase F (PNGase F or N-glycanase) is the most commonly used enzyme; it can release virtually all known intact N-linked oligosaccharides. Among the various electrophoretic and chromatographic methods that give reliable profiles of the released oligosaccharides, reversed-phase (rp) HPLC offers an advantage: since the system usually uses volatile or salt-free buffers, the recovered samples are suitable for further analysis without additional manipulation. Unfortunately, oligosaccharides lack a hydrophobic domain as well as a chromophore for sensitive detection. In order to circumvent
this disadvantage, Honda et al. (4) have demonstrated that oligosaccharides labeled with 1-phenyl-3-methyl-5-pyrazolone (PMP) are suitable for rp-HPLC analysis with conventional UV detection. As a result, two analysis kits are now available commercially to perform this method routinely (5,6). One kit performs enzymatic release of N-linked oligosaccharides followed by PMP labeling, and the second kit provides chromatographic conditions for separation of these labeled oligosaccharides. With the use of fetuin as a model system, a study was initiated to examine whether the PMP-labeling and rp-HPLC approach is also suitable for analyzing chemically released oligosaccharides. The recovered PMP-oligosaccharide samples from rp-HPLC were also analyzed by MALDI TOF and LC ESI MS.
II. Materials and Methods
Bovine fetuin was from Sigma (St. Louis, MO). All chemicals and columns used for hydrazinolysis by automated chemical release were from Oxford GlycoSystems (Rosedale, NY). The N-linked Oligosaccharide Release and Labeling kit-PMP and Oligosaccharide HPLC kit-PMP were purchased from the Perkin-Elmer Corporation, Applied Biosystems Division (ABI, Foster City, CA). PNGase F was from New England BioLabs Inc. (Beverly, MA). Other chemicals and solvents were of the highest purity available.
A. Chemical release
Chemical release was performed with the use of an automated GlycoPrep 1000 instrument (Oxford GlycoSystems). Two release modes, namely N+O and O, were performed using the programmed methods. Fetuin (250 μg) was first dialyzed against 0.1% TFA, dried under vacuum and then subjected to hydrazinolysis. Fetuin was reacted with hydrazine under defined conditions of temperature and time: N+O mode, which was performed at 95 °C for 4 hr, released both N- and O-linked oligosaccharides, and O mode, which was performed at 60 °C for 5 hr, released O-linked oligosaccharides. The entire process, including solvent wash and column elution, took approximately 22 hr.
B. Enzymatic release
All the reagents included in the N-linked oligosaccharide release and labeling kit-PMP were used for the treatment. The enzymatic reaction of 100 μg of denatured fetuin sample was carried out at 37 °C for 2 hr using PNGase F from New England BioLabs.
C. PMP labeling of released oligosaccharides
PMP labeling was performed according to the instructions provided in the
release and labeling kit-PMP. Briefly, the oligosaccharide samples released by either chemical or enzymatic treatment were reacted with PMP reagent in a sodium hydroxide solution at 70 °C for 120 min, acidified and extracted with ethyl acetate to remove excess PMP reagent.
D. RP-HPLC analysis of PMP-labeled oligosaccharides
The HPLC system used for the analysis of PMP-oligosaccharides included a Waters WISP autosampler and an ABI Model 120 HPLC system. ABI's oligosaccharide HPLC kit-PMP was used for the separation of PMP-oligosaccharides. The kit included a C-18 reversed-phase column, buffer A (10% acetonitrile, 10% 1 M ammonium acetate, pH 5.5) and buffer B (25% acetonitrile, 10% 1 M ammonium acetate, pH 5.5). Elution of PMP-oligosaccharides was monitored by UV absorbance at 245 nm with the use of a PE Nelson Turbochrom data system (Cupertino, CA) for data acquisition and analysis.
E. Mass spectrometric analysis
MALDI TOF (7) mass spectra were recorded on a Bruker Reflex instrument (Billerica, MA). The rp-HPLC fractions containing PMP-labeled oligosaccharides were dried in a SpeedVac concentrator (Farmingdale, NY) and redissolved in water/acetonitrile (75/25, v/v). 2,5-Dihydroxybenzoic acid (DHB) was used as a matrix. Normally, 0.3 μL of a half-saturated solution of DHB in water/acetonitrile/trifluoroacetic acid (70/30/0.1, v/v) was mixed on the sample dish with 0.3 μL of the sample solution. Desorption and ionization were done with a nitrogen laser (337 nm) adjusted to minimum laser attenuation. Mass spectra (10 to 20) were accumulated in linear negative ionization mode with an acceleration voltage of 25 kV. Calibration was done externally with bovine insulin and angiotensin I. LC ESI (8) mass spectra were recorded on a Finnigan TSQ700 instrument (San Jose, CA) in positive ion mode. A sample aliquot was injected into a fused silica microcapillary column with an inside diameter of 100 μm. The microcapillary was filled at the end with 10 cm of a C-18 reversed-phase resin. PMP-labeled oligosaccharides were eluted at ca. 1 μL/min directly into the electrospray ionization source with a 10 min gradient of acetic acid in H2O (0.5%, v/v) to 80% acetonitrile. Determination of experimental molecular weights was done with the deconvolution software provided by the manufacturer.
III. Results and Discussion
The strategy used to perform the PMP oligosaccharide profile analysis, with the use of bovine fetuin as a model system, is given in Scheme 1. Bovine fetuin, a major glycoprotein in fetal calf serum, has been widely used as a model for the study of glycoprotein structure. This glycoprotein contains both N- and O-linked oligosaccharides and detailed structural information is available (9-11).
Scheme 1. Strategy for PMP oligosaccharide profile analysis of glycoproteins: glycoprotein → chemical release by GlycoPrep 1000 or enzymatic release by PNGase F → oligosaccharides labeled with PMP → reversed-phase HPLC.
Profile analysis: Typical profiles of PMP-oligosaccharides released from fetuin by the two chemical treatments, namely N+O and O modes, and by enzymatic treatment using PNGase F are shown in Fig. 1. Comparable profiles were obtained from both chemically and enzymatically released oligosaccharides. The results indicate that PMP labeling can also be used for analyzing chemically released oligosaccharides. Five major oligosaccharide peaks were detected in these profiles. An earlier eluting peak corresponding to free PMP reagent was also detected in all of the profiles. It is believed to be due to insufficient solvent extraction of excess PMP reagent. The identity of peaks 1, 3 and 4 in the PNGase F profile has been reported previously (5). They are the three types of N-linked oligosaccharides in bovine fetuin: peak 1, tetrasialyl triantennary; peak 3, trisialyl triantennary; and peak 4, bisialyl biantennary. Since the retention times of peaks 1, 3 and 4 in the N+O profile (Fig. 1A) matched well with those detected in the PNGase F profile (Fig. 1C), it is reasonable to assume that they are the same three N-linked oligosaccharides. To verify that this is indeed the case, mass spectrometric analysis was used to further characterize the recovered PMP-oligosaccharide samples (see below). Peaks 2
Figure 1. Comparison of PMP-oligosaccharide profiles generated by chemical and enzymatic treatments, plotted against time (min). (A) N+O mode, (B) O mode, and (C) PNGase F.
and 5, on the other hand, are likely to be the O-linked oligosaccharides, because they are detected exclusively in the profiles of the N+O and O modes. They did not appear in the PNGase F profile. It was reported previously that there are three types of O-linked oligosaccharides in fetuin: trisaccharide, tetrasaccharide and hexasaccharide (10,11). This information was useful for the identification of peaks 2 and 5 (see below).
Mass spectrometric analysis: While MALDI TOF mass spectrometry is widely used for the analysis of unlabeled oligosaccharides (12), LC ESI mass spectrometry is seldom used, because of the lack of a suitable chromatographic system. PMP-oligosaccharides, on the other hand, could be analyzed with MALDI TOF as well as with LC ESI mass spectrometry. Fig. 2 shows a comparison of mass spectra obtained from the PMP-oligosaccharide in HPLC peak 3. Within the accuracy of the method, MALDI TOF mass spectrometry showed no difference in the molecular weights between the chemically (Fig. 2A) and the enzymatically (Fig. 2B) released oligosaccharide. Identical molecular weights were also obtained with LC ESI mass spectrometry (Fig. 2C). The experimental average molecular weight of 3211 is in good agreement with the theoretical value of 3211.0. Molecular weights were also determined for the other two N-linked oligosaccharides (HPLC peaks 1 and 4), released either chemically or enzymatically. They also were in good agreement with the theoretical values (Table 1). Together with the intact PMP-oligosaccharide, a component at lower molecular weight was observed (marked in Fig. 2A and 2B with an asterisk). This component was formed through the loss of one PMP molecule (174) during sample storage in the HPLC buffer. Its appearance could be minimized by reducing the time between fraction collection and mass spectrometric analysis. This additional component was not seen in the LC ESI mass spectrum shown in Fig. 2C because the rp-HPLC separated it from the intact PMP-oligosaccharide. Shown in Fig. 3 is the result from the analysis of HPLC peak 2 (Fig. 1) with LC ESI mass spectrometry. A major component with a molecular weight of 1004.5 was found. This observed average molecular weight is in good agreement with the theoretical value for the PMP-labeled O-linked trisaccharide (10,11).
The identity of this trisaccharide was further confirmed by the presence of fragment peaks in the mass spectrum (marked in Fig. 3C with an asterisk). These fragments were produced through the ESI process and were a result of a loss of sialic acid and a hexose from the intact PMP-oligosaccharide. In the case of peak 5, the major component found in this sample has a molecular weight of 801.4 (data not shown). The origin of this oligosaccharide is not clear. However, the molecular weight indicated the presence of a component that consists of a PMP-labeled N-acetylneuraminic acid and a hexose. Such a structure has not been described as an O-linked oligosaccharide (10,11). Although preliminary collision induced dissociation (CID) analysis was
Figure 2. Mass spectrometric analysis of the PMP-labeled N-linked oligosaccharide (HPLC peak 3). MALDI TOF mass spectrum of (A) chemically, (B) enzymatically released oligosaccharide, and (C) deconvoluted LC ESI mass spectrum of enzymatically released oligosaccharide. Asterisks denote fragments due to the loss of one PMP molecule. Axes: relative abundance vs m/z; labeled peaks: (M - H) at 3210 (A), (M - H) at 3209 (B), and 3211 (C).
Table 1. Mass spectrometric results obtained from the analysis of the five PMP-oligosaccharide peaks (average molecular weights)

Peak    Oligosaccharide + 2PMP *     Theoretical + 2PMP    Measured (LC ESI MS and MALDI TOF MS; chemical and PNGase F release)
1       tetrasialyl triantennary     3502.2                3500, 3501, 3500
2       trisaccharide                1005.0                1004.5
3       trisialyl triantennary       3211.0                3211, 3210, 3211
4       bisialyl biantennary         2554.4                2555, 2555, 2554
5       partially known              -                     801.4

* The molecular weights reported here are for PMP-oligosaccharides which contain two PMP molecules per oligosaccharide.
Figure 3. LC ESI mass spectrometric analysis of the PMP-labeled O-linked oligosaccharide (HPLC peak 2) that was chemically released (O mode). (A) Ion currents of oligosaccharide ions (m/z = 1005.5), (B) total ion current, and (C) reconstructed mass spectrum. Asterisks denote fragments produced by the ESI process. Axes: relative abundance vs time (sec) in A and B, and vs m/z in C; the (M + H)+ ion is labeled at 1005.5.
in agreement, further experiments will be necessary to unambiguously establish the exact structure of this component.
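The theoretical values quoted for the PMP derivatives can be reproduced with a short calculation, sketched below. The monosaccharide compositions assigned to each peak are assumptions based on the usual compositions of these glycan classes and are not stated in the text; the residue and PMP masses are standard average masses, and the derivative is assumed to carry two PMP groups with loss of one water on condensation.

```python
# Average residue masses (Da): free monosaccharide minus one H2O.
RESIDUE = {"Hex": 162.1406, "HexNAc": 203.1925, "NeuAc": 291.2546}
PMP = 174.1992  # 1-phenyl-3-methyl-5-pyrazolone, C10H10N2O

def bis_pmp_mass(composition):
    """Average mass of an oligosaccharide carrying two PMP groups.

    Free glycan = sum(residues) + H2O; condensation with two PMP molecules
    releases one H2O, so the two water terms cancel.
    """
    return sum(RESIDUE[name] * n for name, n in composition.items()) + 2 * PMP

# Assumed compositions (hypothetical assignments for illustration).
glycans = {
    "tetrasialyl triantennary": {"Hex": 6, "HexNAc": 5, "NeuAc": 4},
    "trisaccharide":            {"Hex": 1, "HexNAc": 1, "NeuAc": 1},
    "trisialyl triantennary":   {"Hex": 6, "HexNAc": 5, "NeuAc": 3},
    "bisialyl biantennary":     {"Hex": 5, "HexNAc": 4, "NeuAc": 2},
    "NeuAc-hexose (peak 5?)":   {"Hex": 1, "NeuAc": 1},
}

for name, comp in glycans.items():
    print(f"{name}: {bis_pmp_mass(comp):.1f}")
```

Under these assumptions the first four masses agree with the theoretical column of Table 1, and the NeuAc-hexose composition comes out close to the 801.4 observed for peak 5, consistent with the tentative assignment in the text.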
IV. Concluding Remarks
With bovine fetuin as a model system, the use of PMP labeling in the routine rp-HPLC profile analysis of glycoproteins has been evaluated. Comparable profiles for oligosaccharides can be obtained regardless of whether they are released by automated chemical or manual enzymatic treatments. The PMP-oligosaccharide samples recovered from HPLC are suitable for MALDI TOF and LC ESI mass spectrometric analysis. The presence of all three N-linked and one of the O-linked oligosaccharides was confirmed.
Acknowledgments
We would like to thank Dr. H. Chokshi, K. Hollfelder and D. Ciolek for reviewing this manuscript.
References
1. Chaplin, M. F., and Kennedy (1994) Carbohydrate Analysis, A Practical Approach (IRL Press, Oxford).
2. Takasaki, M., Mizuochi, T., and Kobata, A. (1982) in Methods Enzymol. 83, 263-268.
3. Patel, T., Bruce, J., Merry, A., Bigge, C., Wormald, M., Jaques, A., and Parekh, R. (1993) Biochem. 32, 679-693.
4. Honda, S., Akao, E., Suzuki, S., Okuda, M., Kakehi, K., and Nakamura, J. (1989) Anal. Biochem. 180, 351-357.
5. Yuen, S. W., Fu, D., Zaidi, I., and O'Neill, R. (1994) ABI Poster Presentations in the Protein Society Annual Meeting.
6. Fu, D., and O'Neill, R. A. (1995) Anal. Biochem. 227, 377-384.
7. Hillenkamp, F., and Karas, M. (1990) Meth. Enzymol. 193, 280-295.
8. Hunt, D. F., Alexander, J. E., McCormack, A. L., Martino, P. A., Michel, H., Shabanowitz, J., Sherman, N., Moseley, M. A., Jorgenson, J. W., and Tomer, K. B. (1991) in Techniques in Protein Chemistry II (Villafranca, J. J., ed.), pp. 441-454. Academic Press, New York, USA.
9. Green, E., Adelt, G., Baenziger, J. U., Wilson, S., and Van Halbeek, H. (1988) J. Biol. Chem. 263, 18253-18268.
10. Nilsson, B., Norden, N. E., and Svensson, S. (1979) J. Biol. Chem. 254, 4545-4553.
11. Edge, A. S., and Spiro, R. G. (1987) J. Biol. Chem. 262, 16135-16141.
12. Harvey, D. J. (1994) American Laboratory, December, 22-28.
Positive Identification of Glycosylation Sites In Proteins And Peptides Using A Modified Beckman LF 3600 N-Terminal Protein Sequencer
Xiaomei Lin, U Wulf Carson, Saber M. A. Khan, Clark F. Ford and Kristine M. Swiderek
Beckman Research Institute of the City of Hope, Duarte, CA 91010; Beckman Instruments Inc., Fullerton, CA 92634; Department of Biochemistry and Biophysics; Department of Food Science and Human Nutrition, Iowa State University, Ames, Iowa 50011
Introduction
Posttranslational modifications such as the glycosylation of proteins and peptides can easily be overlooked during conventional Edman degradation. Usually, glycoamino acids appear as blank cycles during a sequencing run. Further sequence analysis is necessary after deglycosylation of the sample in order to assign the phenylthiohydantoin (PTH)-Ser or -Thr oligosaccharide (O-linked Sac) derivatives and the PTH-Asn (N-linked Sac). Also, a consensus sequence of Asn-X-Ser (Thr) is an indication of PTH-Asn (N-linked Sac) derivatives (Elhammer et al., 1993). This kind of identification is a time- and sample-consuming task. Positive identification of the glycosylation sites during N-terminal sequence analysis would therefore be a real advantage. The techniques developed for the LF 3600 N-terminal protein sequencer (Beckman Instr.) provide a fast and efficient way to positively identify Ser (O-linked Sac), Thr (O-linked Sac) and Asn (N-linked Sac) carbohydrate structures during N-terminal sequence analysis (Gooley et al., 1995). Liquid phase anhydrous trifluoroacetic acid (TFA) is used to extract glycosylated, polar amino acid derivatives from the reaction cartridge. These amino acids are then converted into PTH derivatives which can be positively identified during on-line chromatography. In addition, different forms of carbohydrate structures at a single amino acid site have different retention times and can be distinguished by further analysis (Gooley et al., 1995). After fraction collection of individual peaks during the chromatography of one sequencing cycle, the structure of oligosaccharides at one specific site can be further examined with subsequent analyses, such as mass spectral analysis, capillary electrophoresis (CE) or high pH anion exchange chromatography (HPAEC). Adding to the flexibility of the instrument, it can be switched back to regular mode for high sensitivity protein and peptide sequencing. This report describes the use of a modified Beckman LF3600 microsequencing system for the positive identification of glycosylation sites in different protein and peptide samples. The analysis of glycopeptides from carcinoembryonic antigen (CEA) will be presented. CEA is a heavily glycosylated, membrane-bound glycoprotein which has been intensively characterized as an important tumor marker for colonic cancer. The example of a glycoprotein is the starch binding domain (SBD) of the enzyme glucoamylase, an industrially important exoamylase used for saccharification of liquefied starch from various sources to produce glucose and finally fructose syrups (Belshaw et al., 1993). For both CEA and SBD, the identification of the glycosylation sites is an important contribution to understanding their functionality.
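The Asn-X-Ser (Thr) consensus mentioned above can be checked against a peptide sequence before sequencing. The following is a minimal sketch, not part of the described method: the function name is hypothetical, the exclusion of Pro at position X is a commonly stated convention that is assumed here rather than taken from this report, and the two test peptides are the CEA sequences given in the Figure 4 legend (without the glycosylation markers).

```python
import re

def find_sequons(peptide):
    """Return 1-based positions of Asn residues in an Asn-X-Ser/Thr motif.

    A lookahead is used so that overlapping motifs are all reported.
    Excluding Pro at position X is an assumption (common convention),
    not something stated in the text.
    """
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", peptide)]

# CEA peptide sequences as printed in the Figure 4 legend.
cea_peptides = {"Figure 4A": "SNNSKPVEDK", "Figure 4B": "FTCEPEAQNTTY"}

for label, peptide in cea_peptides.items():
    print(label, peptide, find_sequons(peptide))
```

For these two peptides the motif positions coincide with the positions reported as glycosylated (2 and 9), but a consensus match is only an indication; the positive identification still comes from the sequencing run.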
Materials and Methods
Sample Source
The protein asialoglycophorin (Sigma A 9791), which contains O-linked Ser and O-linked Thr carbohydrate structures, is used as a standard for glycoprotein sequencing. Glycophorin (Sigma G-9511) can also be used as a standard, but desialation needs to be done prior to coupling in order to remove sialic acid. α-Lactalbumin (Sigma L-6385) is used as a standard for high sensitivity sequencing as well as glycosequencing to compare the two sequencing modes. The glycopeptides analyzed are HPLC fractions of carcinoembryonic antigen (CEA) after digestion with trypsin and chymotrypsin (Swiderek et al., 1993). CEA was isolated and purified from liver metastases of colon tumors as described elsewhere (Pritchard et al., 1976). The enzyme glucoamylase, containing an engineered enterokinase recognition site, was expressed in and purified from yeast. Following digestion with enterokinase, the starch binding domain (SBD), which contains 109 amino acid residues, is separated by gel filtration and desalted using a spin column (Khan et al., 1996).
Coupling Reaction
One prerequisite of the glycosequencing technique is the covalent attachment of the glycoprotein to a solid support. For these studies Sequelon-AA membrane disks (PerSeptive Biosystems GEN920033) were used to attach the carboxyl terminal groups of protein or peptide samples. Sequelon-AA is a polyvinylidene difluoride (PVDF) membrane derivatized with aryl amine groups. To couple the sample, the membrane is prewet with 5 µl CH3CN, and the same volume of sample, containing approximately 200 pmole of protein, is applied to the disk and then dried. The coupling reagent is prepared by dissolving 1 mg of water soluble carbodiimide in 100 µl of coupling buffer (available in the Sequelon-AA membrane kit). After applying 5 µl of coupling reagent, the membrane is incubated for 30 minutes at room temperature. The disk is then washed twice with 50% methanol/H2O to remove excess reagent. After coupling, the membrane is stored at 4°C in an Eppendorf tube. When samples are attached via carboxyl groups, the protein or peptide must first be desialated prior to the coupling reaction. This is necessary to prevent the coupling of terminal carboxyl groups of sialated glycoamino acids to the support and to increase the yield of glycoamino acids extracted during the Edman reaction. The desialation procedure is carried out according to Gooley et al. (1995). A cap cut from a 1.5 ml microcentrifuge tube was used as a reaction chamber. After the protein is dissolved in 40 µl of 0.1 M TFA (Pierce), the sample is incubated at 80°C for 40 minutes. The sample is then dried in a vacuum centrifuge to remove TFA and resuspended in 20% CH3CN/H2O.
Glycosequencing Analysis
The samples are applied to a modified Beckman LF3600 DT protein sequencing system (Beckman Instruments Inc. #291106). Two cartridge block bottle positions are used for TFA delivery so that either gas or liquid phase sequencing can be applied. Thus, the instrument can easily be switched to the regular mode of high sensitivity protein sequence analysis. The delivery tubing and the programming of the sequencer were changed in order to deliver liquid TFA through the valve block. Kel-F valves were added to the second TFA delivery reservoir to prevent corrosion. PTH derivatives of both amino acids and glycoamino acids are chromatographed using a Gold HPLC system equipped with a diode array detector set at 268 nm. The PTH derivatives of amino acids and glycoamino acids are separated on a custom made HPLC column (250 x 2.1 mm, Hypersil) at 50°C with a flow rate of 0.2 ml/min. A new buffer system, with 25 mM ammonium formate, pH 4.0, as buffer A and 100% acetonitrile (Burdick & Jackson) as buffer B, is used to separate the different PTH-glycoamino acids as well as the PTH-amino acids. The stock solution for buffer A is made by adding 30% ammonium hydroxide (Aldrich #33,881-8) to 88% formic acid (Aldrich #39,938-8) to a final concentration of 250 mM ammonium formate, pH 4.0, and is stored refrigerated. The gradient used for the separation of the PTH derivatives follows the recommendation of the manufacturer.
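Buffer A (25 mM) is a tenfold dilution of the 250 mM stock. As a small worked example of the C1·V1 = C2·V2 relation, the sketch below computes the stock volume needed; the function name and the 1 litre batch size are illustrative assumptions, and only the two concentrations come from the text.

```python
def stock_volume_ml(stock_mM, final_mM, final_volume_ml):
    """Volume of stock needed for a dilution, from C1 * V1 = C2 * V2."""
    if final_mM > stock_mM:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_mM * final_volume_ml / stock_mM

# Concentrations from the text; the 1000 ml batch size is hypothetical.
final_volume_ml = 1000.0
needed = stock_volume_ml(stock_mM=250.0, final_mM=25.0, final_volume_ml=final_volume_ml)
print(f"{needed:.0f} ml stock + {final_volume_ml - needed:.0f} ml water")
```

The pH of the diluted buffer would still need to be confirmed after dilution; the calculation covers the concentration only.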
Results and Discussions
[Figure 1 plots: chromatograms by sequencing cycle; x-axis: time (min).]
Figure 1. (A) HPLC chromatogram of 20 pmole PTH-amino acids using the current glycosequencing method. 19 PTH-amino acids are separated by a modified reverse phase HPLC system. (B) Sequencing results of asialoglycophorin using the glycosequencing method. 200 pmole of sample were subjected to the analysis. The assigned PTH-amino acids of the 14 N-terminal residues of asialoglycophorin are LS*T*T*G/EVAMHT*T*T* S*S*S* (*representing the presence of a glycoamino acid).
[Figure 2: plot and legend not recoverable from the source.]
Figure 1A shows the chromatogram of 20 pmole PTH-standard using the new chromatography system. The chromatogram shows the separation of 19 PTH-amino acid derivatives (all except cysteine). In addition, the new buffer system expands the front region of the gradient for the separation of PTH-glycoamino acids. Figure 1B is a representative sequence analysis of the commercially available glycoprotein asialoglycophorin using the described glycosequencing method. 200 pmole of the sample are coupled to a Sequelon-AA membrane as described above prior to Edman sequencing. The first 14 cycles are displayed. Cycles 2, 3 and 4 show peaks of PTH derivatives of Ser (O-linked Sac) and Thr (O-linked Sac). The glycoamino acids are identified by a unique pair of peaks. For example, there are two resolved peaks of PTH-Ser (Sac) with retention times of 6.5 and 6.7 min in cycle 3, representing PTH-Ser (Sac) I and II. Two well separated peaks in cycle 4 represent PTH-Thr (Sac) I and II with retention times of 6.9 and 7.3 min. The peak pairs have been shown to represent stereo enantiomers and can be present in different ratios (Gooley et al., 1995). In addition, different retention times of the peak pairs can indicate different saccharide structures. The positive identification of glycosylation sites is still possible after 10 cycles: Thr (Sac) in cycles 11, 12 and 13 and Ser (Sac) in cycles 14, 15 and 16 are still well distinguishable. In contrast, no sequence assignments for cycles 2, 3 and 4 and 11-16 can be made using the conventional sequencing method (data not shown). A high background is observed in all cycles due to a high degree of impurity of the sample. SDS-PAGE analysis of this protein sample shows multiple bands on the gel (data not shown). To compare the glycosequencing procedure with the conventional high sensitivity sequence analysis, α-lactalbumin is sequenced in both modes.
Figure 2A and B show cycles 1-3 and cycles 12-14 from conventional and glycosequence analyses side by side. The features of each cycle appear to be similar for both runs, and the repetitive yields are approximately the same. However, for the conventional high sensitivity sequence analysis, the lowest sample amount required is about 10 pmole, as demonstrated in Figure 2C. The lowest amount that can be subjected to glycosequencing is about 50 pmole (Figure 2B). Also, the lower limit of glycoamino acid detection depends on the efficiency of the coupling reaction, which can vary for different protein and peptide samples. It should be noted that there is also an upper limit to the amount of sample that can be coupled to the membrane. For most of the samples, 200 pmole has proven to be an optimal amount for covalent coupling and further glycosequence analysis. Using the glycosequencing technique, we were able to identify the glycosylation sites of several samples submitted to us for analysis. Figure 3 shows the sequencing results of the SBD domain of glucoamylase. Previous GC-MS studies had shown that only mannose is present in this protein (Khan et al., 1996). About 1 nmole of sample is coupled to the membrane disk. During subsequent glycosequence analysis, peaks with retention times of 6.5 and 7.5 min in cycle 4 can be assigned to PTH-Thr (O-linked Sac). One of the retention times differs from those assigned for PTH-Thr (Sac) in asialoglycophorin, which indicates a different form of carbohydrate structure (Gooley et al., 1995). PTH-Thr (O-linked Sac) can also be identified in cycle 6. The carbohydrate structures linked to the Thr in this cycle are even more heterogeneous, which is indicated by the presence of three peaks with retention times of 6.5, 6.9 and 7.3 min.
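The comparison of retention times against a reference glycoprotein can be made explicit. The sketch below is illustrative only: the function name and the 0.05 min matching tolerance are assumptions, the reference pair (6.9 and 7.3 min) is the PTH-Thr (Sac) I and II pair reported for asialoglycophorin, and the observed values are the three cycle 6 peaks quoted in the text for SBD.

```python
# PTH-Thr (Sac) I and II retention times (min) reported for asialoglycophorin.
REFERENCE_THR_SAC = (6.9, 7.3)
# Matching window in minutes; a hypothetical value, not given in the text.
TOLERANCE_MIN = 0.05

def split_peaks(observed, reference, tolerance):
    """Separate observed retention times into those matching a reference peak
    (within the tolerance) and those that do not."""
    matched, unmatched = [], []
    for rt in observed:
        if any(abs(rt - ref) <= tolerance for ref in reference):
            matched.append(rt)
        else:
            unmatched.append(rt)
    return matched, unmatched

# SBD cycle 6 peaks as quoted in the text.
matched, unmatched = split_peaks([6.5, 6.9, 7.3], REFERENCE_THR_SAC, TOLERANCE_MIN)
print("match reference pair:", matched)
print("additional peaks:", unmatched)
```

A peak outside the reference window is what the text reads as evidence for a different carbohydrate structure; retention time alone does not identify that structure, which is why the collected fractions are analyzed further.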
[Figure 3 plots: chromatograms by sequencing cycle; x-axis: time (min).]
Figure 3. Sequence analysis of SBD domain using the glycosequencing method. Peaks in cycle 4 are PTH-Thr (Sac) with retention times of 6.95 and 7.40 min and those in cycle 6 are PTH-Thr (Sac) with retention times of 6.45, 6.95 and 7.25 min, respectively. The sequence of the first 15 amino acids in the peptide is SCTT*PT*AVAVTFDLT.
Examples of peptides containing PTH-Asn (N-linked Sac) have been observed during the sequence analysis of fractions from the tryptic/chymotryptic digest of carcinoembryonic antigen (CEA). Figure 4A shows the sequence analysis of a CEA peptide. The peak with a retention time of 7.0 min in cycle 2 is assigned as PTH-Asn (N-linked Sac). The sequence analysis of a different CEA peptide, as an example of Asn (N-linked Sac) appearing in later cycles, is shown in Figure 4B. Asn (N-linked Sac) is observed in cycle 9 with a retention time of 7.0 min. The identification of these glycosylation sites in CEA peptides is consistent with previous mass spectral analyses and deglycosylation experiments (Swiderek et al., 1993).
[Figure 4 plots: chromatograms by sequencing cycle; x-axis: time (min).]
Figure 4. Glycosequence analysis of different CEA peptides. (A) The analysis of a CEA peptide that contains Asn (Sac) N-linked carbohydrate structures in position 2 is shown: SN*NSKPVEDK. (B) The analysis of a CEA peptide that contains N-linked carbohydrate structures in position 9 is shown. Sequence: FTCEPEAQN*TTY.
Conclusions
Several examples using sequencing techniques developed for the positive identification of glycosylation sites of proteins and peptides are presented. The analysis is straightforward and less time- and sample-consuming than previously described sequencing techniques. The technique requires extra steps such as the covalent coupling of the protein or peptide samples prior to the sequence analysis. However, the sequencer modification and operation are easily applied on a routine basis. The HPLC system can be easily adjusted, resulting in good separations of PTH-amino acids and PTH-glycoamino acids. In addition, different carbohydrate structures attached to a single amino acid site can be distinguished. The different glycoforms collected from the HPLC can be further analyzed by LC/MS, GC/MS, high performance anion exchange chromatography (HPAEC) and capillary electrophoresis (CE) in order to obtain information regarding the structures and heterogeneity of the oligosaccharides.
References
Elhammer AP, Poorman RA, Brown E, Maggiora LL, Hoogerheide JG and Kezdy FJ (1993) J. Biol. Chem. 268, 10029-10038.
Gooley AA, Packer NH, Pisano A, Redmond JW, Williams KL (1995) Techniques in Protein Chemistry VI, 83-90.
Khan SMA & Ford CF (1996) in preparation.
Pritchard DG & Todd CW (1976) Cancer Res. 36, 4699-4701.
Reilly PJ (1979) Appl. Biochem. Bioeng. 2, 185-206.
Swiderek KM, Pearson CS and Shively JE (1993) Techniques in Protein Chemistry IV, 127-134.
DEAMIDATION AND ISOASPARTATE FORMATION DURING in Vitro AGING OF A RECOMBINANT HEPATITIS E VACCINE CANDIDATE C. Patrick McAtee and Yifan Zhang Genelabs Technologies, Redwood City, CA 94063
I. Introduction
Formation of isoaspartate via the deamidation of asparaginyl residues or isomerization of aspartyl residues constitutes a major source of instability in proteins and peptides (1). Isoaspartate arises through an intramolecular rearrangement that produces a succinimide (cyclic imide) intermediate (2). Spontaneous hydrolysis of the imide occurs with a half-life of several hours and generates an aspartyl residue linked to its C-flanking neighbor through the aspartate β-carbonyl (3). Isoaspartate-bearing proteins and peptides are specific substrates for a widely distributed protein methyltransferase, PIMT (EC 2.1.1.77), that uses S-adenosyl-L-methionine (AdoMet) as its methyl donor (4). Several laboratories have become interested in monitoring the accumulation of isoaspartate in purified proteins during in vitro aging at physiological pH and temperature, as deamidation and aspartate isomerization can lead to dramatic changes in the biological activity of proteins. Federal stability and purity regulations for protein based pharmaceutical products necessitate that the structure of parenteral molecules be clearly defined and consistent. Less well understood is the effect of deamidation on recombinant vaccine candidates. In evaluating a candidate vaccine for HEV (r62-kDa), it was found that the predicted amino acid sequence contained several potential deamidation sites (5, 6).

II. Materials and Methods
A. Measurement of Isoaspartate Formation
Isoaspartate formation was determined by methylation of isoaspartyl residues by PIMT using [³H] S-adenosyl-L-methionine ([³H]SAM) as the methyl donor (Isoquant Kit (Promega)). Quantitation of the [³H] methanol released allowed a precise determination of the isoaspartic content in the protein. For quantitative measurements of isoaspartyl content, r62-kDa or its fragments were methylated for 30 minutes at 30°C and pH 6.8 in the presence of PIMT. Concurrently, a standard peptide (delta sleep-inducing peptide, DSIP) containing one isoaspartic group was incubated with PIMT under the same conditions. Following incubation of the protein and standard for varying time periods, the reaction was stopped by addition of 0.4 M CAPS, pH 10; 5% SDS; 2.2% methanol; 0.01% m-cresol purple. PIMT activity was then quantitated by methanol diffusion into a scintillation cocktail followed by liquid scintillation counting.
Methylation reactions of trypsin digested peptide fragments were performed for 30 minutes at 30°C and were stopped by freezing at -70°C. The samples were thawed and immediately injected for RP-HPLC, followed by liquid scintillation counting of the collected fractions.
B. Tryptic Digestion
r62-kDa protein was adjusted to a concentration of 10 nmole in 50 µl in the following buffer: 8 M urea in 50 mM Tris-HCl, pH 8.0, 0.5 mM DTT. The reduced protein was S-carboxymethylated, digested with modified porcine trypsin (Promega) at a trypsin to substrate ratio of 1:100 (w/w), and incubated at 30°C for 2 hours. The reaction was stopped by the addition of 0.01 volumes of 0.1 M PMSF. The digests were then stored at -70°C until HPLC analysis was performed.
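The peptides expected from such a digest can be predicted from the sequence. The sketch below is an in silico illustration under stated assumptions, not the procedure used in this work: it applies the common rule that trypsin cleaves C-terminal to Lys or Arg except when the next residue is Pro (the Pro exception is an assumption), the function name is hypothetical, and the input is only the first 100 residues of the predicted r62-kDa sequence as printed in Figure 2, so the last fragment is artificially truncated.

```python
def tryptic_peptides(sequence):
    """Return (start, end, peptide) tuples with 1-based inclusive positions.

    Rule assumed here: cleave after K or R unless the next residue is P.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        cleave = residue in "KR" and not is_last and sequence[i + 1] != "P"
        if cleave or is_last:
            peptides.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    return peptides

# Residues 1-100 of the predicted r62-kDa sequence (Figure 2).
R62_N_TERMINUS = (
    "AVAPAHDTPPVPDVDSRGAILRRQYNLSTSPLTSSVATGTNLVLYAAPLS"
    "PLLPLQDGTNTHIMATEASNYAQYRVARATIRYRPLVPNAVGGYAISISF"
)

for start, end, peptide in tryptic_peptides(R62_N_TERMINUS):
    print(f"{start}-{end}: {peptide}")
```

With this rule the segment yields a fragment spanning residues 24-75, the same span as one of the methyl-accepting peptides listed in Table I. Missed cleavages, which occur in real digests, are not modeled.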
C. HPLC Analysis and Mass Spectrometry
r62-kDa digests were chromatographed on a Vydac C18 reverse phase column using an ABI Model 41 OB dual syringe pumping system. The flow rate was maintained at 50 µl/min and elution was achieved using a linear gradient from 0.1% aqueous TFA to 0.1% TFA in acetonitrile. A Carlo Erba Phoenix 20 CU pump was used to deliver a mixture of methoxyethanol and isopropanol (1:1, v/v) at 50 µl/min, which was combined with the column eluent in a post column mixing chamber. An in-line flow splitter was used to restrict flow to the mass spectrometer to approximately 10 µl/min. Detection was performed immediately following elution from the column at 214 nm using an ABI 759A variable wavelength detector. Mass spectrometric detection of tryptic digested r62-kDa protein was achieved, following post column solvent addition and flow splitting, with a VG BioQ triple quadrupole mass spectrometer. Spectra were recorded in the positive ion mode using electrospray ionization. Peptides of interest were repurified off-line using a phosphate buffer system previously described (7). Repurified peptides were desalted and analyzed by LC-MS (ES) and fast atom bombardment mass spectrometry (FAB-MS). FAB-MS was carried out on repurified peptides using a VG Analytical ZAB 2-SE high field mass spectrometer.
D. Protein/peptide sequencing
Conventional sequencing was carried out on an Applied Biosystems 477 sequencer equipped with on-line HPLC for the identification of the resulting phenylthiohydantoin (PTH) amino acid derivatives. The instrument was operated according to the manufacturer's recommendations, and 3 pmol PTH standards were routinely used.

III. Results and Discussion
A. Kinetics of Isoaspartate Formation During In Vitro Aging
To determine if r62-kDa accumulates isoaspartate during in vitro aging under mild conditions, a sample of the purified protein was incubated at 30°C for 5 days in a pH 7.5 Tris buffer. The aged r62-kDa was then assayed at a concentration of 120 pmoles for the methyl-accepting capacity in the presence of [methyl-³H] AdoMet and PIMT. A significant increase in methyl incorporation was detected in r62-kDa aged for as little as two days (Figure 1). Methylation of r62-kDa increased with time and reached a plateau of 1.4 moles of CH3/mol of r62-kDa under these conditions. Under identical conditions, control (not aged) r62-kDa could be methylated to only 0.04 mol/mol of r62-kDa.
[Figure 1 plot: axis label "pmoles IsoAsp/pmole r62-kDa", scale 0.4 to 1.4.]
Figure 1. Kinetics of Isoaspartate Formation During In Vitro Aging. To determine if r62-kDa accumulates isoaspartate during in vitro aging under mild conditions, a sample of the purified protein was incubated at 30°C for 5 days in a pH 7.5 Tris buffer.
B. Isoaspartate Containing Tryptic Peptides of Aged Recombinant 62-kDa
Samples of control and aged r62-kDa were reduced, alkylated, and then digested with trypsin as described above (Figure 2). Each digest was split into two portions. One portion was subjected to reversed-phase HPLC at low pH with UV detection to generate a peptide elution profile (Figure 3, left panels). The second portion was enzymatically methylated with [methyl-³H] AdoMet to radiolabel all of the isoaspartate bearing peptides. The labeled peptides were then chromatographed under the same conditions as the first portion (Figure 3, right panels). HPLC of the methylated aged sample indicated the presence of three major methyl-accepting peptides. The control sample yielded a methylation pattern qualitatively similar to that of the aged samples. As expected, however, the amount of methyl incorporation into the major peaks was considerably less in the control sample than in the aged sample.

  1-50   AVAPAHDTPPVPDVDSRGAILRRQYNLSTSPLTSSVATGTNLVLYAAPLS
 51-100  PLLPLQDGTNTHIMATEASNYAQYRVARATIRYRPLVPNAVGGYAISISF
101-150  WPQTTTTPTSVDMNSITSTDVRILVQPGIASELVIPSERLHYRNQGWRSV
151-200  ETSGVAEEEATSGLVMLCIHGSLVNSYTNTPYTGALGLLDFALELEFRNL
201-250  TPGNTNTRVSRYSSTARHRLRRGADGTAELTTTAATRFMKDLYFTSTNGV
251-300  GEIGRGIALTLFNLADTLLGGLLPTELISSAGGQLFYSRPVVSANGEPTV
301-350  KLYTSVENAQQDKGIAIPHDIDLGESRVVIQDYDNQHEQDRPTPSPAPSR
351-400  PFSVLRANDVLWLSLTAAEYDQSTYGSSTGPVYVSDSVTLVNVATGAQAV
401-450  ARSLDWTKVTLDGRPLDSTIQQYSKTFFVLPLRGKLSFWEAGTTKAGYPY
451-500  NYNTTASDQLLVENAAGHRVAISTYTTSLGAGPVSISAVAVLAPHSALAL
501-552  LEDTLDDYPARAHTFDDFCPECRPLGLQGCAFQSTVAELQRLKMKVGKTREL

Figure 2. Tryptic Peptides of r62-kDa Protein. Tryptic peptide sequences are indicated in the figure above the predicted sequence. A lower case a, b, or c following the peptide designation indicates that the peptide was obtained as a mixture of peptides. IUPAC nomenclature is used for amino acid abbreviation.
[Figure 3 plots: HPLC traces for the Control and Aged samples.]
Figure 3. Isoaspartate Containing Tryptic Peptides of Aged Recombinant 62-kDa. Samples of control and aged r62-kDa were reduced, alkylated, and then digested with trypsin as described in the text. Each digest was split into two portions. One portion was subjected to reversed-phase HPLC at low pH with UV detection to generate a peptide elution profile (left panels). The second portion was enzymatically methylated with [methyl-³H] AdoMet to radiolabel all of the isoaspartate bearing peptides. The labeled peptides were then chromatographed under the same conditions as the first portion (right panels) and detected by radioactivity.
C. Isolation and Characterization of Methyl-Accepting Sites
The tryptic peptides of aged r62-kDa were pooled and subjected to reversed-phase HPLC in a pH 6.0 sodium phosphate/acetonitrile solvent system as previously described (7). Compared to the control sample, the aged samples showed a significant elevation of peptide isoforms which were identical to the controls but did not sequence through asparagine residues when these were adjacent to a glycine, serine, or threonine residue (Table I). Conversely, the aged samples showed a decrease in peptide isoforms which sequenced through the asparagine residues. These isoforms were also evaluated with the Isoquant assay for methyl-accepting capacity, and it was determined that the peptides from the aged samples which did not allow complete sequencing by Edman degradation were readily methylated. Methyl accepting peptides which did not sequence efficiently also gained at least 1 atomic mass unit when examined by mass spectrometry. In the pH 6.0 solvent system, the individual peaks from the pH 2.0 solvent system were resolved into a series of peaks. The initial methyl-accepting site, encompassing residues 199-208, resolved into 6 isoforms (a-f) when chromatographed at pH 6.0. These peaks were evaluated by Edman degradation and found to consist primarily of two sequences. Peak d did not sequence beyond residue 203, whereas peak e sequenced through to the end of the peptide. Likewise, methyl accepting peaks 2 and 3 were resolved on the pH 6.0 buffered system, yielding similar results. The third methyl accepting peak was unusual in that the methyl accepting isoform did not sequence beyond residue 39. Unfortunately, sequencing yields were low enough in this region that the exact site(s) of deamidation could not be called with confidence. The aged peptide (A3a) gave rise to a methyl accepting species which was also 1.8 atomic mass units larger than the intact peptide.
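The sequence context described above (Asn followed by Gly, Ser, or Thr) can be located in a peptide programmatically. The sketch below is a hypothetical helper, not part of the analysis reported here; the two peptide sequences are read from the predicted r62-kDa sequence in Figure 2 for the residue spans 199-208 and 241-255 given in Table I.

```python
def asn_hotspots(peptide, first_residue):
    """Return protein-level positions of Asn residues that are followed on the
    carboxyl side by Gly, Ser, or Thr.

    first_residue is the position of the peptide's first residue in the protein.
    """
    hits = []
    for i in range(len(peptide) - 1):
        if peptide[i] == "N" and peptide[i + 1] in "GST":
            hits.append(first_residue + i)
    return hits

# Tryptic peptides read from the predicted sequence (Figure 2).
print(asn_hotspots("NLTPGNTNTR", 199))       # residues 199-208
print(asn_hotspots("DLYFTSTNGVGEIGR", 241))  # residues 241-255
```

The scan only flags candidate sites. It does not consider residues on the amino terminal side of Asn or local conformation, both of which the discussion treats as possible contributors.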
The deamidation at residue 60 would seem somewhat obvious, as the asparagine residue is flanked by threonine on its carboxyl side. However, deamidation has also been known to occur when asparagine is flanked on its amino terminal side by serine, threonine, and to some extent, lysine (8). The asparagine at residue 41 is preceded by a threonine, while another asparagine at residue 70 is preceded by a serine. This strongly suggests that the peptide encompassing residues 24-75 contains two potential deamidations. We have shown that in vitro aging of r62-kDa leads to considerable deamidation and isoaspartate formation, primarily at Asn204 and Asn248. It is likely from mass spectrometry data that the asparagine at residue 206 may also be deamidated, although it is difficult to verify by sequence analysis that this residue is deamidated. Significant deamidation also occurred in residues 24-75 of the r62-kDa protein, although it was difficult to assess the exact location of these deamidation sites. For all of these sites, the Asn residue is flanked on its carboxyl side by glycine, serine, or threonine, producing linkages that are highly susceptible to succinimide formation in small peptides. As mentioned above, deamidation involving amino terminal adjacent residues may also occur. It has been reported that several deamidation sites in structured proteins do in fact occur at the same sequence pairs known to be the most labile in short, flexible model peptides (9). Highly flexible regions have also been indicated to play a role in antigenic determination. The three-dimensional structure of r62-kDa has not been determined, so there is no detailed information on the local conformation or chain mobility at Asn204, Asn206, and Asn248. However, using the flexibility plot of Karplus and Schulz (10), three of the ten most flexible peptides in the 62-kDa sequence contained the three methyl accepting sites described in this work.

Table I. Characterization of Methyl-Accepting Peptides*

HPLC   Edman      Predicted   Methyl     Mass         Mass         Mass
Peak   Sequence   Sequence    Acceptor   (Expected)   (Observed)   Δ
A1d    199-203    199-208     Yes        1087.8       1089.5       +1.7
A1e    199-208    199-208     No         1087.8       1087.8        0.0
A2c    241-247    241-255     Yes        1630.2       1631.4       +1.2
A2d    241-255    241-255     No         1630.2       1630.4       +0.2
A3a    24-39      24-75       Yes        5558.0       5559.8       +1.8
A3b    24-42**    24-75       No         5558.0       5558.1       +0.1

*The methyl-accepting tryptic peptides of aged and control r62-kDa were pooled and repurified as described. The peptides were evaluated for isoaspartate formation by Edman degradation, methylation by PIMT, and FAB-MS. The nomenclature used to describe the HPLC peaks is as follows: "A" indicates the peptide was derived from an aged sample. The numbers 1, 2, or 3 indicate whether the sample was derived from the first, second, or third methyl accepting HPLC peak using the pH 2.0 buffer system. The lower case letters indicate the order of peaks resolved using the pH 6.0 buffer system.
**Sequencing yields were too low to give meaningful data beyond residue 42.
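The relation between mass shift and methyl-accepting capacity in Table I can be checked with a short calculation. The sketch below is illustrative: the function name is hypothetical, the 1.0 mass unit threshold follows the statement in the text that methyl accepting peptides gained at least 1 atomic mass unit, and the value of 0.984 Da for a single Asn to Asp/isoAsp conversion (NH2 replaced by OH) is a standard figure that is not quoted in this chapter.

```python
# (HPLC peak, methyl acceptor, expected mass, observed mass), from Table I.
TABLE_I = [
    ("A1d", True, 1087.8, 1089.5),
    ("A1e", False, 1087.8, 1087.8),
    ("A2c", True, 1630.2, 1631.4),
    ("A2d", False, 1630.2, 1630.4),
    ("A3a", True, 5558.0, 5559.8),
    ("A3b", False, 5558.0, 5558.1),
]

SHIFT_THRESHOLD = 1.0      # "at least 1 atomic mass unit", as stated in the text
DEAMIDATION_SHIFT = 0.984  # Da per Asn -> Asp/isoAsp; standard value, assumed here

def mass_shift_flags(rows, threshold):
    """Return (peak, delta, shifted, methyl_acceptor) for each table row."""
    results = []
    for peak, acceptor, expected, observed in rows:
        delta = round(observed - expected, 1)
        results.append((peak, delta, delta >= threshold, acceptor))
    return results

for peak, delta, shifted, acceptor in mass_shift_flags(TABLE_I, SHIFT_THRESHOLD):
    ratio = delta / DEAMIDATION_SHIFT
    print(f"{peak}: delta {delta:+.1f}, shifted={shifted}, "
          f"methyl acceptor={acceptor}, delta/0.984={ratio:.1f}")
```

In this data set every peak with a shift of at least 1.0 is also a methyl acceptor. The ratio column is only a rough guide to the number of deamidation events, since the mass accuracy of the measurements is not stated.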
IV. Conclusion
Deamidation has been shown to reduce the biological activity of several recombinant biopharmaceuticals. The predicted amino acid sequence of our r62-kDa vaccine candidate contained several potential deamidation "hotspots" in which an asparagine residue was flanked on its carboxyl side by a threonine, glycine, or serine residue or, in one example, the putative site(s) were flanked by a threonine or a serine on the amino terminal side. By methylation of isoaspartate residues generated through in vitro aging, it was determined that three tryptic peptides contained notable isoaspartate accumulation. The presence of isoaspartate in these peptides was confirmed by sequence analysis and mass spectrometry. Whether deamidation plays a role in immune recognition has not been determined. However, alteration of a neutralizing epitope through deamidation at a crucial asparagine (or glutamine) residue could lead to either enhanced immune recognition or quite possibly infectious breakthrough in immunized populations. The 62-kDa antigen has been previously shown to elicit protective immune responses in primates after heterologous challenge with live HEV (11). We are currently evaluating the effects of deamidation of the 62-kDa antigen to determine whether important parameters of immune response are affected in vivo.

References
1. Aswad, D. (1995) In Deamidation and Isoaspartate Formation in Peptides and Proteins (Aswad, D., Ed.) pp. 1-7, CRC Press.
2. Bornstein, P. and Balian, G. (1977) Methods Enzymol. 47, 132-145.
3. Patel, K. and Borchardt, R.T. (1990) Pharm. Res. 7, 787-793.
4. Johnson, B.A. and Aswad, D. (1990) In Protein Methylation (Paik, W.K. and Kim, S., Eds.) pp. 195-210, CRC Press.
5. McAtee, C.P., Zhang, Y., Yarbough, P.O., Bird, T., and Fuerst, T.R. (1996) Prot. Exp. Pur. (in press).
6. McAtee, C.P., Zhang, Y., Yarbough, P.O., Fuerst, T.R., Stone, K.L., Samander, S., and Williams, K.R. (1996) J. Chromatog. B (in press).
7. Paranandi, M.V., Guzzetta, A.W., Hancock, W.S., and Aswad, D. (1994) J. Biol. Chem. 269, 243-253.
8. Bischoff, R. and Kolbe, H.V.J. (1994) J. Chromatog. B 662, 261-278.
9. Wright, H.T. (1991) Protein Eng. 4, 283-294.
10. Karplus, P.A. and Schulz, G.E. (1985) Naturwissenschaften 72, 212-213.
11. Fuerst, T.R., Yarbough, P.O., Zhang, Y., McAtee, C.P., Tam, A.W., McCaustland, K.A., Garcon, N., Spelbring, J., Carson, D., Myriam, F., Lifson, J.D., Slaoui, M., Prieels, J.-P., Margolis, H., and Krawczynski, K. (1996) In Enterically-Transmitted Hepatitis Viruses (Y. Buisson, P. Coursaget, and M. Kane, eds.) La Simarre, Joue-les-Tours (France) pp. 384-392.
The Isolation and Characterization of Active Site Peptides in Lysyl Oxidase Sophie X. Wang and Judith P. Klinman* Departments of Chemistry and Molecular and Cell Biology University of California Berkeley, CA 94720 Katalin F. Medzihradszky and Alma L. Burlingame Department of Pharmaceutical Chemistry and the Liver Center University of California San Francisco, CA 94143
I. Introduction
Lysyl oxidase (LO, EC 1.4.3.13) is an important extracellular, matrix-embedded protein. It catalyzes the oxidative deamination of the ε-amino group of lysine side chains in elastin and collagen to form allysine (α-aminoadipic-δ-semialdehyde) (1). The protein-bound aldehydic functional groups subsequently undergo either self-condensation or condensation with a second, unmodified lysine side chain to form inter- and intrachain crosslinks. These crosslinks confer both strength and altered solubility to the modified proteins. In the case of elastin, the presence of such crosslinks facilitates the return of protein fibers to their original size and shape after stretching (2-4). Since LO has a central role in the biogenesis of connective tissues, abnormal expression of its activity is associated with a number of pathological conditions (5). Due to its important physiologic functions, the nature of the covalently bound cofactor in LO has been the subject of speculation for many years. Both redox cycling assays (6) and enzyme inhibition patterns (7, 8) indicate that LO contains a quinone-like structure at its active site. Resonance Raman studies of a C-terminal, cyanogen bromide cleavage product indicated spectral similarities to pyrroloquinoline quinone (PQQ), leading to the proposal of covalently bound PQQ (9). In light of the finding of topa (trihydroxyphenylalanine) quinone (TPQ) in a wide range of copper amine oxidases previously claimed to contain PQQ (10, 11), the presence of TPQ in LO appeared more feasible. However, the large size difference between LO and other amine oxidases, as well as the absence of the TPQ consensus sequence in LO (12), raised the possibility of a different cofactor existing in LO.
* To whom correspondence should be addressed.
TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press All rights of reproduction in any form reserved.
[Structure 1: PQQ]
In a recent paper, we have documented in detail the isolation and characterization of an active site, cofactor-containing peptide from bovine aorta LO (13). Investigation of this peptide by a combination of Edman sequencing, mass spectrometry and resonance Raman spectroscopy indicates a previously undescribed type of quinocofactor, which is derived from the crosslinking of a modified tyrosine to the ε-amino group of a lysyl side chain (13). This new cofactor is designated as lysine tyrosylquinone (LTQ, structure shown as 3). The present paper focuses on the methodology of the isolation and mass spectrometric characterization of the active site, cofactor-containing peptide from bovine aorta LO.
[Structure 3: LTQ]
II. Materials and Methods
A. Protein Purification and Inactivation
LO was purified from two-week-old bovine calf aorta by a modification of published procedures (14, 15). The aortas (ca. 600 g) were cleaned and finely ground prior to protein extraction. Extractions and all subsequent procedures were carried out at 4 °C. The ground aortas were extracted sequentially with 0.15 M NaCl-16 mM potassium phosphate (KPi) (saline buffer), 16 mM KPi, and 4 M urea-16 mM KPi buffer (pH 7.8), in the presence of the protease inhibitors PMSF (1 mM) and iodoacetamide (0.04% w/v). The saline buffer and KPi extracts contained negligible activity and were discarded. The urea buffer extracts were pooled (total of 4 to 5 L) and mixed with hydroxyapatite gel (100 g, preequilibrated in 4 M urea-16 mM KPi). After batch elution from hydroxyapatite, the urea-soluble enzyme solution was concentrated (ca. 700 ml) and dialyzed against 16 mM KPi (pH 7.8) buffer.
To the dialyzed, cloudy enzyme solution an equal volume of 1 M KPi (pH 7.75) was added to further precipitate the urea-soluble proteins. The suspension was centrifuged and the pellet was collected and redissolved in ca. 130 ml of 6 M urea, 0.03 M NaCl, 16 mM KPi (pH 7.8) buffer prior to being loaded on a Sephacryl S-200 column previously equilibrated with 6 M urea, 0.03 M NaCl, 16 mM KPi (pH 7.8). The column was eluted with the same buffer and fractions were collected. Those containing enzyme activity or having an enriched protein species of ~32 kDa as determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) (16) were pooled and concentrated. The concentrated eluant (20 ml) was applied to a Sephacryl S-100 column previously equilibrated with 6 M urea, 0.03 M NaCl, 16 mM KPi, pH 7.8. The column was eluted with the same buffer, fractions were collected and those containing only 32 kDa (LO) and 24 kDa species were pooled and concentrated. This is referred to as the "two-banded" protein, and was used without further purification (to reduce protein loss) for phenylhydrazine inactivation (see below). The two-banded enzyme (6.34 mg) had a specific activity of 0.019 U/mg when assayed at pH 7.2 with benzylamine as substrate (17, 18). It was dissolved in 3.43 M urea, 16 mM KPi (pH 8.0) and reacted with 240 nmoles of [14C]phenylhydrazine hydrochloride (universally labeled, 5550 dpm/nmole) for 30 min. A chromophore was formed with a λmax of 454 nm. The reaction was monitored on a Hewlett-Packard 8452A Diode Array Spectrophotometer by taking a scan every 3-5 min until the formation of the chromophore reached completion. Excess phenylhydrazine was removed by passing the reaction mixture through a Bio-Rad DG-10 desalting column, equilibrated and eluted with 2 M urea, 100 mM NH4HCO3 (pH 8.0).
To determine the incorporation of 14C-labeled phenylhydrazine in the LO protein, a sample of the desalted, radiolabeled two-banded material was analyzed by SDS-PAGE (16). The separating gel contained 10.5% acrylamide using N,N'-diallyltartardiamide as crosslinking reagent. The protein gel was stained with Coomassie Brilliant Blue G - Colloidal (Sigma) and subsequently cut into 16 slices according to the position of the protein bands. The gel slices were then dissolved in 1 to 5 ml of 2% periodic acid solution, and the radioactivity in each gel slice was quantitated by liquid scintillation.
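The radiolabel quantitation described above reduces to a unit conversion from disintegrations per minute to nanomoles of phenylhydrazine, using the stated specific activity of 5550 dpm/nmole. The following Python sketch illustrates that conversion only; the gel-slice counts and the background value are hypothetical placeholders, not data from this study.

```python
# Sketch: convert liquid scintillation counts (dpm) into nmol of
# [14C]phenylhydrazine using the specific activity stated in the text.
SPECIFIC_ACTIVITY = 5550.0  # dpm per nmol (from the text)


def dpm_to_nmol(dpm, background_dpm=0.0):
    """Return nmol of label for a background-corrected count."""
    corrected = max(dpm - background_dpm, 0.0)
    return corrected / SPECIFIC_ACTIVITY


def label_fractions(slice_dpm, background_dpm=0.0):
    """Return the fraction of total recovered label found in each gel slice."""
    nmol = [dpm_to_nmol(d, background_dpm) for d in slice_dpm]
    total = sum(nmol)
    return [n / total if total else 0.0 for n in nmol]


# Total radioactivity offered in the labeling reaction (240 nmol of reagent).
total_offered_dpm = 240 * SPECIFIC_ACTIVITY

# Hypothetical counts for four gel slices and a hypothetical background
# of 50 dpm (illustration only, not measured values).
example_slices = [60.0, 5610.0, 75.0, 55.0]
example_fractions = label_fractions(example_slices, background_dpm=50.0)
```

In this made-up example nearly all of the recovered label falls in the second slice, which is the kind of pattern one would look for when asking whether incorporation is confined to a single band.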
B. The Isolation of the Active Site Peptide
The [14C]phenylhydrazine-labeled protein (ca. 6 mg, as described above) was lyophilized and redissolved in 3 ml of 6 M guanidine hydrochloride (19, 20). The resulting solution was capped, deoxygenated, flushed with argon and incubated for 30 min at 37 °C. Dithiothreitol was added carefully with a syringe to a 50-fold excess and the solution was flushed with argon and incubated at 37 °C for an additional 4 hrs. The solution was then cooled on ice. To the cooled solution, iodoacetamide (44.4 mg) was added to give a slight excess over the estimated free SH groups. The mixture was left in the dark for 1 hr before the excess reagents were removed by desalting with a Bio-Rad DG-10 column, preequilibrated with 2 M urea, 100 mM NH4HCO3, pH 8.0. The reduced and carboxyamidomethylated LO was then lyophilized and redissolved in 100 mM NH4HCO3 (pH 8.0) containing 2 M urea at 37 °C, with shaking. The digestion buffer also contained 5 mM CaCl2, which is an absolute requirement for thermolysin activity (21). Proteolytic digestion was initiated by the addition of thermolysin to 4% (w/w) of the total protein weight. A second
aliquot of protease was added after 24 hrs. The digestion was stopped after 49 hrs by placing the solution at -70 °C (or by addition of EDTA, 2 mM). The enzymatic digest was subjected to high performance liquid chromatography (HPLC) purification to isolate the active site-containing peptides. After the preliminary HPLC purification, those thermolytic peptides which had absorption peaks at 438 nm were lyophilized and redissolved in 50 mM sodium phosphate (pH 8.0). Subdigestion was initiated by the addition of Asp-N endoproteinase to a final concentration of 2.5% (w/w) of the substrate. The digestion mixture was incubated at 37 °C with gentle shaking for 19 hrs and later stored at -70 °C before it was purified by HPLC. The thermolysin digest was injected onto a Dynamax reversed-phase C18 column (5 μm, 300 Å, 4.6 x 250 mm), equilibrated with solvent A (0.11% TFA, 5% CH3CN in H2O), on a Shimadzu HPLC system (see Fig. 2). Peptides were eluted using a linear gradient of 20 to 30% solvent B (0.1% TFA, 80% CH3CN in H2O) over 65 min at a flow rate of 1 ml/min. Elution of peptides was monitored at 214 nm and 438 nm. The Asp-N subdigest of the cofactor-containing thermolytic peptide (e.g., the dominant peak in Fig. 2) was again separated on HPLC by injecting the mixture onto a Vydac reversed-phase C18 column (4.6 x 250 mm), preequilibrated with solvent A (0.3% triethylammonium acetate or TEAA in H2O) and solvent B (0.3% TEAA, 60% CH3CN in H2O) with a ratio of solvent A : solvent B of 95 : 5. This was followed by a linear gradient elution of 5 to 50% solvent B over 60 min at a flow rate of 1 ml/min. Peptide elution was monitored at 220 and 438 nm. The cofactor-containing peptides were collected and lyophilized and subsequently used for sequencing, mass spectrometry, and resonance Raman studies.
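Because the gradients above are specified in percent solvent B while both solvents may themselves contain acetonitrile, it can help to translate them into the actual CH3CN content of the mobile phase. The Python sketch below does this arithmetic; it assumes percentages are by volume and that the two solvents mix ideally, and the function names are illustrative only.

```python
# Sketch: CH3CN content of the mobile phase along a linear binary gradient.
# Assumes v/v percentages and ideal (additive) mixing of solvents A and B.

def percent_b_at(t_min, duration_min, b_start, b_end):
    """Percent solvent B at time t for a linear gradient, clamped at the ends."""
    frac = min(max(t_min / duration_min, 0.0), 1.0)
    return b_start + (b_end - b_start) * frac


def acetonitrile_percent(percent_b, acn_in_a, acn_in_b):
    """Percent CH3CN in a mixture containing percent_b of solvent B."""
    frac_b = percent_b / 100.0
    return (1.0 - frac_b) * acn_in_a + frac_b * acn_in_b


# Thermolytic digest: A = 5% CH3CN, B = 80% CH3CN, 20 to 30% B over 65 min.
thermo_start = acetonitrile_percent(percent_b_at(0, 65, 20, 30), 5, 80)
thermo_end = acetonitrile_percent(percent_b_at(65, 65, 20, 30), 5, 80)

# Asp-N subdigest: A = 0% CH3CN, B = 60% CH3CN, 5 to 50% B over 60 min.
aspn_start = acetonitrile_percent(percent_b_at(0, 60, 5, 50), 0, 60)
aspn_end = acetonitrile_percent(percent_b_at(60, 60, 5, 50), 0, 60)
```

Under these assumptions the thermolysin separation runs from 20% to 27.5% CH3CN, and the Asp-N separation from 3% to 30% CH3CN; the 3% starting value agrees with the pre-equilibration condition quoted in the caption of Fig. 3.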
C. Mass Spectrometric Analyses
The electrospray mass spectrum of a LO active site peptide (sample 1) was acquired by an LC/ESIMS (electrospray ionization mass spectrometry) experiment, in which the sample was further purified by microbore reversed-phase HPLC and the eluant was directly injected into an electrospray ionization mass spectrometer. An Applied Biosystems (ABI) microbore HPLC system was interfaced with a Micromass BioQ quadrupole mass spectrometer equipped with an electrospray source. The mass spectrometer was scanned in noncontinuum mode over a range of m/z of 350 to 2000 at 5 s/scan. The chromatographic separation was carried out on a C-18 column (ABI, 1 x 100 mm) equilibrated at 98% solvent A (0.1% TFA/H2O) and 2% solvent B (0.08% TFA/CH3CN). A linear gradient was started immediately after the injection, increasing solvent B concentration by 1% per min. The average molecular weight of the peptide was determined from the doubly and the triply protonated ions detected. The accurate mass measurement was performed on a Micromass AutoSpec SE mass spectrometer using electrospray ionization on the triply charged ion of a cofactor-containing peptide (sample 2). Doubly charged ions of bradykinin (m/z 530.7885) and gramicidin S (m/z 571.3608) were used as calibration standards. Leu-enkephalin (MH+ at m/z 556.2771) was also included to verify the accuracy of the mass measurement. The monoisotopic molecular mass of this peptide was established by the mean [M+3H]3+ value obtained from four separate injections. Elemental compositions of the crosslinked residue were obtained by computer calculations.
The matrix-assisted laser desorption ionization (MALDI)-high energy collision-induced dissociation (CID) mass spectrum of an active site peptide (sample 4) was obtained on a Micromass AutoSpec SE orthogonal acceleration time-of-flight (TOF) tandem mass spectrometer. The 12C isobar of the peptide MH+ ion was selected as the precursor ion, at m/z 1597.6. The collision gas was Xe, and the collision energy was 800 eV. Fragments are labeled according to Biemann nomenclature (22). Peptide bond cleavage with charge retention at the N-terminus yields b fragments, and when the charge is retained at the C-terminus y ions are formed.
III. Results and Discussion
A. Peptide Isolations and Characterization
Unlike bovine serum amine oxidase, which is available in gram quantities and was used as the prototypic system for the establishment of TPQ, a typical preparation of LO yields about 5 mg (starting with about 500 grams of aorta tissue). In an effort to minimize protein loss, purification was stopped after the stage where LO (32 kDa) coelutes with a second protein (24 kDa) from Sephacryl S-200 gel filtration. The two-banded protein was labeled with [14C]phenylhydrazine, to yield the expected chromophore for a phenylhydrazone derivative of a quinone structure (Fig. 1). Analysis of the 14C-labeled protein by SDS gel electrophoresis showed almost exclusive incorporation of 14C into the 32-kDa LO band (13). Therefore, the unreactive 24-kDa band would not affect subsequent procedures, since selection of active site-derived peptides from LO was dependent on screening for 14C in addition to monitoring the UV-Vis absorption of the newly formed chromophore.
[Plot: absorption spectrum; x-axis, wavelength (nm)]
Figure 1. [14C]Phenylhydrazine labeling of the LO. The two-banded LO enzyme (6.34 mg) was dissolved in 3.43 M urea-16 mM KPi buffer (pH 8.0) and reacted with 240 nmoles of [14C]phenylhydrazine hydrochloride for 30 minutes. A chromophore was formed at 454 nm. After labeling, the reaction mixture was loaded on a Bio-Rad DG-10 desalting column to remove the excess phenylhydrazine (see text for details).
[Chromatogram: x-axis, time (min)]
Figure 2. HPLC purification of the thermolytic peptide. The thermolytic digest was injected onto a Dynamax reversed-phase C18 HPLC column pre-equilibrated with 0.11% TFA, 5% CH3CN in H2O. Peptides were eluted with a CH3CN gradient and monitored at both 214 nm and 438 nm (see text for details).
The HPLC elution profile of a thermolytic digest of the [14C]phenylhydrazine derivative of lysyl oxidase (Fig. 2) showed a large number of 214 nm peptide peaks (top panel), together with a single dominant 438 nm peak eluting at 36 to 37 min (bottom panel). Subsequent determination of radioactivity in the peptide fractions showed the coincidence of 14C radioactivity with the 438 nm peaks. Edman sequencing results showed that the thermolytic peptides were fairly long and that more than one peptide was present in the sample (13). In order to yield shorter peptides to facilitate the separation and analysis, thermolytic peptides (e.g., the dominant 438 nm peak in Fig. 2) were collected and further digested with Asp-N endoproteinase. The HPLC profile of this subdigestion is shown in Fig. 3. The dominant 438 nm peak (indicated by an arrow in Fig. 3) was collected and subjected to further analysis. Once again, Edman sequencing of the Asp-N digested peptide showed two amino acids in each round, indicating that two peptides were present in the sample (13). Both peptides can be located in the cDNA-derived LO protein sequence and are the same peptides as those observed in the thermolytic sample, except that they are shorter in length (13). This has led to the proposal of two crosslinked peptides, as shown below:
Asp - Thr - (Tyr - derivative) - Asn - Ala
Asp - Val - Ala - Glu - Gly - His - (Lys - derivative) - (Ala - Ser)
[Chromatogram: x-axis, time (min)]
Figure 3. HPLC purification of the Asp-N peptide. The Asp-N subdigest was loaded onto a Vydac reversed-phase C18 HPLC column pre-equilibrated with 0.3% TEAA, 3% CH3CN in H2O. Peptides were eluted with a CH3CN gradient and monitored at both 214 nm and 438 nm (see text for details).
Tyr and Lys residues were suggested to have been crosslinked to form the active site cofactor since they were not detectable at the expected recovery level during Edman sequencing. This was further supported by mutagenesis studies which are discussed in the full paper (13).
B. Mass Spectral Analysis of the Active Site Peptides
The cofactor-containing peptide from the thermolysin/Asp-N digests (sample 1, Fig. 3) was analysed in an LC/ESIMS experiment and the mass spectrum obtained in this way showed doubly and triply protonated ions at m/z 720.9 and 480.9, respectively (Fig. 4). These values were used to establish the average molecular mass as 1439.71. The ion at m/z 747.4 in Fig. 4 is a doubly charged Fe-adduct ion, which is believed to arise from the stainless steel capillary. Subtraction of the residue weights of the known amino acid residues (2Asp, Thr, Asn, 2Ala, Val, Glu, Gly, His) in the peptide, plus masses corresponding to two N-terminal protons and two C-terminal hydroxyl groups, yielded a mass of 393.45 as the residue weight for the unknown crosslinking cofactor structure labeled with phenylhydrazine. To calculate the elemental composition of an ion, its mass must be determined to three or four decimal places. The accurate mass measurement was carried out on another cofactor-containing peptide sample (sample 2) isolated
from different thermolytic and Asp-N digests. Leu-enkephalin (MH+ at m/z of 556.2771) was included to verify the accuracy of the mass measurement. In four separate injections of this compound, MH+ was measured as m/z 556.2760 (σ = 0.0025), which shows an approximately 2 ppm deviation from the calculated mass. The mean [M+3H]3+ for sample 2 (from four separate injections) was m/z 533.2356 (σ = 0.0026), which establishes the monoisotopic molecular mass of this peptide as 1596.6833. From the protein sequence, the mass of sample 1 and the CID data, sample 2 is the elongated version of peptide sample 1 with an Ala-Ser sequence at the C-terminus of the crosslinked Lys residue. This mass value shows an approximately 5 ppm deviation from the calculated m/z value for the corresponding peptide with the proposed structure. Again, subtraction of the residue weights of the known amino acid residues (2Asp, Thr, Asn, 3Ala, Ser, Val, Glu, Gly, His) in this peptide, plus masses corresponding to two N-terminal protons and two C-terminal hydroxyl groups, yielded a mass of 393.1830 as the residue weight for the unknown crosslinking cofactor structure labeled with phenylhydrazine. A mass of 288.1377 for the crosslinked residue was obtained when the phenylhydrazine label was further subtracted. Within a 10 ppm deviation of this mass value, computer calculations permitted only two compositions which contained an odd number of N-atoms as required for this molecule (nitrogen rule of mass spectrometry) using composition limits of 12C 10/30; 1H 15/50; 14N 1/9; 16O 2/10; 32S 0/2 (where the numbers indicate the number of a particular atom required/allowed in the structure). The C12H22N3O3S composition shows only 1.7 ppm deviation from the measured mass value. However, there is no other indication of a sulfur atom present in the cofactor.
Therefore, the C15H18N3O3 composition (C21H23N5O3 including the phenylhydrazine label), which exactly matches our proposed structure and shows a -10 ppm deviation from the measured mass value, is concluded to be the correct composition.
[Mass spectrum: x-axis, m/z; labeled peaks at m/z 480.9 (3+), 720.9 (2+) and 747.4; average mass = 1439.71]
Figure 4. Electrospray mass spectrum of the LO active site peptide (sample 1). The electrospray ionization mass spectrometer was scanned in noncontinuum mode over a range of m/z of 350 to 2000 at 5 s/scan (see text for details).
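The mass arithmetic in this section can be retraced in a few lines of Python. The sketch below is an illustration, not the software used in the study: it converts multiply charged ions to neutral masses and then subtracts the known residues from the monoisotopic mass of sample 2. The residue masses are standard monoisotopic values supplied here as assumptions, and the hydrogen atom mass (1.007825) is used for each added proton, a choice made because it reproduces the reported 1596.6833.

```python
# Sketch: neutral mass from multiply protonated ions, then the mass left
# over for the derivatized cofactor after subtracting known residues.
H_AVERAGE = 1.00794   # average atomic mass of hydrogen
H_MONO = 1.007825     # monoisotopic mass of the hydrogen atom
WATER_MONO = 18.01056  # H + OH, monoisotopic

# Standard monoisotopic residue masses (assumed reference values).
RESIDUE_MONO = {
    "Asp": 115.02694, "Thr": 101.04768, "Asn": 114.04293,
    "Ala": 71.03711, "Ser": 87.03203, "Val": 99.06841,
    "Glu": 129.04259, "Gly": 57.02146, "His": 137.05891,
}


def neutral_mass(mz, charge, adduct_mass):
    """Neutral mass M from an [M + zH]z+ ion observed at m/z."""
    return charge * mz - charge * adduct_mass


# Sample 1: average mass estimated from the 2+ and 3+ ions in Fig. 4.
sample1_estimates = [neutral_mass(720.9, 2, H_AVERAGE),
                     neutral_mass(480.9, 3, H_AVERAGE)]
sample1_mean = sum(sample1_estimates) / len(sample1_estimates)

# Sample 2: monoisotopic mass from the mean [M+3H]3+ value.
sample2_mono = neutral_mass(533.2356, 3, H_MONO)

# Known residues in sample 2; two peptide chains, hence two sets of
# terminal H and OH (two waters).
sample2_counts = {"Asp": 2, "Thr": 1, "Asn": 1, "Ala": 3, "Ser": 1,
                  "Val": 1, "Glu": 1, "Gly": 1, "His": 1}
known_mass = sum(RESIDUE_MONO[r] * n for r, n in sample2_counts.items())
known_mass += 2 * WATER_MONO

labeled_cofactor_mono = sample2_mono - known_mass
```

With these inputs the sample 2 mass comes out at 1596.6833 and the remainder at 393.1830, matching the values quoted above. The sample 1 estimate from the rounded m/z values in Fig. 4 is about 1439.7, close to the reported 1439.71.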
A reasonable structure for the phenylhydrazine derivative of this cofactor, which has the composition of C21H23N5O3, is given below, as the two possible tautomers (13).
[Structures: azo form and hydrazone form]
The calculated mass for these tautomers is 393.1801 (monoisotopic mass) or 393.45 (average mass), which shows excellent agreement with the experimental mass of 393.1830 (monoisotopic mass from accurate mass measurement, sample 2) or 393.45 (average mass from LC/ESIMS experiment, sample 1) for the derivatized active site cofactor. The underivatized cofactor itself has a structure of the type shown as 3. This is consistent with the strong evidence for a quinone-like structure in LO (6, 8, 9, 23), and with the conclusion that the cofactor is comprised of a crosslink between a tyrosine derivative and a lysine residue. This structure could arise from an initial hydroxylation of Tyr to form dopa, followed by the oxidation of dopa to dopa quinone, and subsequent nucleophilic attack by the ε-amino group of a lysine side chain to generate an aminoquinol (13). One of the peptides was subjected to high energy CID analysis (13). As documented therein, the CID data, together with results obtained with additional peptide samples (data not shown), provide independent evidence for the proposed crosslinked cofactor structure and the crosslinking site in the protein. The immonium ions, b type and y type ions observed in the fragmentation pattern have confirmed the presence of most amino acids in both peptides and supported the proposed cofactor structure (13). In particular, the fragment ion at m/z 902 is present due to cleavages between the α- and β-carbon of the modified Tyr residue (-590 Da) and a cleavage between the aromatic ring and the phenylhydrazine label (-105 Da), with charge retention on the N-terminal fragment. Unusual cleavages were also observed. The ion at m/z 884 seems to be formed via cleavages between the amino group and the α-carbon, and between the α-carbon and the carbonyl group of the Lys residue with charge retention on the crosslinked residue or peptide 1 (13). The cleavage at both sides of the peptidyl α-carbon is a very rare event in mass spectral analysis.
However, a novel cofactor structure analyzed by a new ionization technique could result in some unexpected fragmentation results. It should be noted that of the two possible nucleophiles in the second peptide, His and Lys, the His appears unmodified, thereby establishing Lys as the most suitable residue to undergo crosslinking to a modified Tyr residue.
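The composition assignment in this section rests on comparing calculated monoisotopic masses with the measured value of 288.1377. As a hedged illustration, the Python sketch below recomputes the masses of the two candidate compositions and of the phenylhydrazine-labeled cofactor, and expresses each deviation in ppm as (calculated - measured)/measured. The isotope masses are standard reference values supplied here as assumptions; this is not the program used by the authors.

```python
# Sketch: monoisotopic masses and ppm deviations for the candidate
# elemental compositions discussed in the text.
ISOTOPE_MONO = {"C": 12.0, "H": 1.007825, "N": 14.003074,
                "O": 15.994915, "S": 31.972071}


def mono_mass(composition):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(ISOTOPE_MONO[el] * n for el, n in composition.items())


def ppm_deviation(calculated, measured):
    """Deviation of the calculated mass from the measured mass, in ppm."""
    return (calculated - measured) / measured * 1e6


MEASURED_CROSSLINK = 288.1377  # crosslinked residue, label subtracted

candidates = {
    "C12H22N3O3S": {"C": 12, "H": 22, "N": 3, "O": 3, "S": 1},
    "C15H18N3O3": {"C": 15, "H": 18, "N": 3, "O": 3},
}
deviations = {name: ppm_deviation(mono_mass(comp), MEASURED_CROSSLINK)
              for name, comp in candidates.items()}

# Both candidates carry an odd number of nitrogen atoms, as the text requires.
odd_nitrogen = {name: comp["N"] % 2 == 1 for name, comp in candidates.items()}

# Phenylhydrazine-labeled cofactor, C21H23N5O3.
labeled_mass = mono_mass({"C": 21, "H": 23, "N": 5, "O": 3})

# Nominal-mass check of the CID fragment: precursor minus the two losses.
fragment_mz = 1597.6 - 590 - 105
```

The deviations come out near +1.7 ppm for the sulfur-containing composition and near -10 ppm for C15H18N3O3, and the labeled cofactor mass is 393.1801, in line with the figures quoted above. The last line simply confirms that the stated losses of 590 and 105 Da from the precursor at m/z 1597.6 land at a nominal m/z of 902.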
IV. Conclusions
In order to confirm the structure for the LO cofactor, model compounds mimicking the proposed enzymic structure were synthesized (13, 24). The UV-Vis absorption spectrum of the model compound, with a λmax of 504 nm which is red-shifted compared to other TPQ-containing amine oxidases, corresponded almost exactly to that of native LO (13). Under the same conditions, the resonance Raman spectrum of the phenylhydrazine derivative of the model compound was found to be superimposable with that of the isolated LO peptide labeled with phenylhydrazine, yet very different from the phenylhydrazine derivative of a TPQ model compound (13). As documented, a previously unknown cofactor has been identified in LO (13). It is formed by the crosslinking of two amino acid side chains and catalyzes the redox reaction in the enzyme (13). This discovery extends the range of quino-structures now demonstrated to function as redox catalysts, provides insights into the biogenetic pathways leading to quinone production from amino acid precursors, and introduces the possibility of regulating LO activity through the design of cofactor-specific inhibitors.
References
1. Pinnell, S. R. & Martin, G. R. (1968) Proc. Natl. Acad. Sci. U.S.A. 61, 708-716.
2. Eyre, D. R., Paz, M. A. & Gallop, P. M. (1984) Ann. Rev. Biochem. 53, 717-748.
3. Sandberg, L. B., Soskel, N. T. & Leslie, J. G. (1981) New Engl. J. Med. 304, 556-579.
4. Kagan, H. M. (1986) in Characterization and Regulation of Lysyl Oxidase (Academic Press, Orlando, FL), Vol. 1, pp. 321-389.
5. Uitto, J. & Perejda, A. J. (1987) in Connective Tissue Disease (Marcel Dekker, New York).
6. Paz, M. A., Fluckiger, R., Boak, A., Kagan, H. M. & Gallop, P. M. (1991) J. Biol. Chem. 266, 689-692.
7. Buffoni, F., Ignesti, G. & Lodovici, M. (1981) Ital. J. Biochem. 30, 179-189.
8. Gacheru, S. N., Trackman, P. C., Calaman, S. D., Greenaway, F. T. & Kagan, H. M. (1989) J. Biol. Chem. 264, 12963-12969.
9. Williamson, P. R., Moog, R. S., Dooley, D. M. & Kagan, H. M. (1986) J. Biol. Chem. 261, 16302-16305.
10. Janes, S. M., Mu, D., Wemmer, D., Smith, A. J., Kaur, S., Maltby, D. M., Burlingame, A. L. & Klinman, J. P. (1990) Science 248, 981-987.
11. Klinman, J. P. & Mu, D. (1994) Annu. Rev. Biochem. 63, 299-344.
12. Janes, S. M., Palcic, M. M., Seaman, C. H., Smith, A. J., Brown, D. E., Dooley, D. M., Mure, M. & Klinman, J. P. (1992) Biochem. 31, 12147-12154.
13. Wang, S. X., Mure, M., Medzihradszky, K. F., Burlingame, A. L., Brown, D. E., Dooley, D. M., Smith, A. J., Kagan, H. M. & Klinman, J. P. (1996) Science 273, 1078-1084.
14. Kagan, H. M., Sullivan, K. A., Olsson, T. A. & Cronlund, A. L. (1979) Biochem. J. 177, 203-214.
15. Williams, M. A. & Kagan, H. M. (1985) Anal. Biochem. 149, 430-437.
16. Laemmli, U. K. (1970) Nature 227, 680-685.
17. Trackman, P. C. & Kagan, H. M. (1979) J. Biol. Chem. 254, 7831-7836.
18. Kagan, H. M. & Sullivan, K. A. (1982) Methods Enzymol. 82, 637-650.
19. Fontana, A. & Gross, E. (1986) in Fragmentation of Polypeptides by Chemical Methods, ed. Darbre, A. (John Wiley and Sons, New York), pp. 67-120.
20. Allen, G. (1989) in Sequencing of Proteins and Peptides (Elsevier Science Publishers B.V., New York), pp. 58.
21. Wilkinson, J. M. (1986) in Fragmentation of Polypeptides by Enzymic Methods, ed. Darbre, A. (John Wiley and Sons, New York), pp. 121-147.
22. Biemann, K. (1990) Methods Enzymol. 193, 886-887.
23. Williamson, P. R., Kittler, J. M., Thanassi, J. W. & Kagan, H. M. (1986) Biochem. J. 235, 597-605.
24. Wang, S. X., Mure, M. & Klinman, J. P. (in preparation).
Acknowledgments The authors thank Dr. Herbert Kagan and co-workers for assistance leading to the enzyme preparations used for the studies described herein. This work was supported by NIH grants GM39296 to J.P.K. and NCRR BRTP P41 RR01614 to A.L.B., and supported by NSF Biol. Instru. Prog, grant DIR 8700766 to A.L.B.
Complement activation in EDTA blood/plasma samples may be caused by coagulation proteases Philippe H. Pfeifer Tony E. Hugli Department of Immunology The Scripps Research Institute La Jolla, California 92037
Earl W. Davie Kazuo Fujikawa Department of Biochemistry University of Washington School of Medicine Seattle, Washington 98195-7350
I. Introduction
C3a and C4a are important markers of alternative and classical pathway activation, respectively. Both pathways require divalent ions for activation, mainly Mg2+ for generation of the C3 convertase of the alternative pathway and Ca2+ for generation of the C1-complex in the classical pathway. It is known that EDTA, which complexes divalent ions, is a poor stabilizing agent to prevent ex vivo generation of C4a, whereas Futhan (nafamostat mesilate, a powerful serine protease inhibitor) is an excellent stabilizing agent in this regard (1-3). On the other hand, heparin, when combined with EDTA, significantly reduces ex vivo generation of C4a, which may indicate an involvement of coagulation enzymes in C4 cleavage. Thrombin has been shown to be able to cleave complement factors (4). We hypothesized that other enzymes of the coagulation or fibrinolytic system were responsible for at least part of the ex vivo C4a generation observed in EDTA plasma.
Even though many coagulation factors require phospholipids or Ca2+ for activation, we tested the hypothesis that some coagulation enzymes may be able to cleave C3 and/or C4 in a Ca2+- and phospholipid-free environment.
II. Materials and Methods
Blood from healthy volunteers was drawn into standard green (heparin, 14.3 U/ml) or lavender (10 mM EDTA) top tubes, or into syringes containing Futhan (nafamostat mesilate, Torii Pharmaceutical Co., Tokyo, Japan, 0.2 mg/ml final concentration). Either whole blood or plasma, isolated by centrifugation at 1000 x g and 4°C for 15 minutes immediately after drawing, was kept at 4°C for up to 48 hours to assess generation of C3a and C4a. All of the plasma samples were then frozen at -70°C until the C3a and C4a levels were measured by RIA (Amersham). Human C3 and C4 were obtained from Advanced Research Technologies (San Diego, CA). 20 μg of C3 or C4 were digested with the respective enzyme for 0 to 90 min in HEPES buffer (50 mM HEPES and 150 mM NaCl, pH 7.3). The reaction was stopped by adding Futhan at a final concentration of 0.5 mg/ml. The samples were then subjected to Tricine-PAGE (5), stained with Coomassie and analyzed by scanning the visualized bands (Personal Densitometer by Molecular Dynamics). Factor XI and plasma prekallikrein were prepared according to Tait and Fujikawa (6) while β-factor XIIa was prepared according to Fujikawa and McMullen (7). Prothrombin was isolated by the method of Mann (8). Factor X, factor IX and urokinase were the gifts of The Green Cross Co., Osaka, Japan. Factor XI and plasma prekallikrein were activated by trypsin at an enzyme to substrate ratio of 1 to 25 in 100 mM Tris-HCl at pH 8.0. The remaining trypsin was inactivated by aprotinin. Factor IX, factor X and prothrombin were activated by factor XIa, Russell's viper venom factor X activating enzyme and factor Xa (1 to 25-50 ratio), respectively, in 100 mM Tris-HCl, pH 8.5/5 mM CaCl2. The resulting factor IXa, factor Xa and thrombin were purified by column chromatography using Waters DEAE 15HR resin.
The individual samples were applied to the DEAE column that had been equilibrated with 50 mM Tris-HCl at pH 8.0 and the proteins were eluted by a NaCl gradient (0 to 0.6 M NaCl) in the same buffer. SDS-PAGE analysis demonstrated the complete conversion of the respective zymogens to their active forms. To assess the generation of antigenically active fragments, identical aliquots of C3 or C4 (40 μg/ml in HEPES buffer) were incubated with thrombin (9 μg/ml), plasma kallikrein (35 μg/ml), factor XIa (7 ng/ml), factor Xa (20 ng/ml), tissue-type plasminogen activator (t-PA) plus plasminogen (0.6 + 350 μg/ml), or buffer alone. After 60 minutes the different aliquots were precipitated with the precipitating agent provided with the RIA kit to remove the precursor molecules, and C3a and C4a were measured by RIA.
III. Results
Futhan stabilized complement against ex vivo activation in EDTA plasma for extended periods of time at 4°C, as measured by RIA. Background levels of C3a and C4a in stabilized plasma from whole blood drawn into Futhan + EDTA and stored at 4°C for up to 48 hours showed only minimal ex vivo activation (Table I). EDTA effectively inhibited ex vivo generation of C3a, whereas there was an ongoing production of C4a. Heparin also appeared to keep the C3a, but not the C4a, at a low level. Heparin + EDTA was no better in preventing ex vivo C3a generation than EDTA or heparin alone, but yielded a marked improvement in the stabilization of C4a. Indeed, C4a levels remained almost as low with this combination as with Futhan. In whole blood anticoagulated with the same reagents and stored at 4°C for up to 48 hours, a similar tendency as in plasma could be observed. Here too, heparin plus EDTA attenuated the generation of C3a and C4a while heparin alone stabilized C3a levels but allowed a 10- to 20-fold increase in C4a over baseline values. Again, EDTA + Futhan practically inhibited the ex vivo generation of C3a and C4a (not shown). Incubation of C3 and C4 for 60 minutes at 37°C with a number of enzymes involved in coagulation and fibrinolysis indicated that half of the enzymes tested were able to cleave C3 and C4 (Table II). Interestingly, factors IXa and XIIa (β-XIIa) did not seem to cleave C3 or C4, whereas factors XIa and Xa, thrombin, plasma kallikrein and plasmin clearly degraded both complement factors.
Table I. Generation of C3a and C4a in plasma at 4°C^a
24 hours 0 hours 48 hours 175 201 EDTA 173 201 200 235 C3a (ng/ml) EDTA + Futhan 123 228* heparin 183 171* 271* heparin + EDTA 229* 100 271 499 EDTA 106 131 90 C4a (ng/ml) EDTA + Futhan heparin 433 525 664* 109* heparin + EDTA 125* 134*
^a Whole blood was anticoagulated with 10 mM EDTA, heparin (14 U/ml), heparin + 5 mM EDTA or 0.2 mg/ml Futhan + 10 mM EDTA. Plasma was separated from whole blood by centrifugation for 15 minutes at 1000 x g at 4°C. All samples were kept at 4°C for the length of time indicated. Values are averages of two experiments except when marked with an asterisk for single values.
Philippe H. Pfeifer et al
Table II. Qualitative differences in the C3- and C4-converting ability of some enzymes involved in coagulation and fibrinolysis^a
Enzyme                      C3 cleavage   C4 cleavage
Thrombin                    +             +
VIIa + tissue factor        -             -
Activated protein C         -             -
Plasma kallikrein           +             +
Activated XII (β-XIIa)      -             -
XIa                         +             +
Xa                          +             +
IXa                         -             -
Urokinase                   -             -
t-PA + plasminogen          +             +
^a With the exception of factor VIIa + tissue factor, all experiments were performed in Ca2+- and Mg2+-free buffers. (-: no cleavage, +: cleavage)
Figure 1: Kinetics of C3 degradation by factor Xa and plasmin. C3 was incubated at 200 µg/ml for 0, 2, 30, 60 and 90 minutes at 37°C with 44 ng/ml of factor Xa (lanes 1-5). Clearly visible are the C3 α- and β-chains, with molecular weights of 115 and 75 kD, respectively. The band of degraded material visible in lane 3 has an estimated M.W. of 105-110 kD and increases in intensity from 0 to 30 minutes. This pattern, where only the α-chain is degraded, is similar to the proteolysis seen with thrombin, plasma kallikrein and factor XIa (not shown). Lanes 6-10: degradation of C3 (200 µg/ml) with tissue-type plasminogen activator (0.4 µg/ml) and plasminogen (350 µg/ml) under identical conditions. Clearly visible are the bands of the C3 α-chain (115 kD), the C3 β-chain (75 kD) and plasminogen (90 kD), but not of plasmin (78 kD). Even though little degradation is apparent after 30 minutes of incubation, extensive cleavage of both the α- and β-chains occurs between 30 and 90 minutes. The larger C3 degradation product again has a M.W. of 105-110 kD, whereas the smaller fragment of about 60 kD indicates further degradation of the C3 molecule.
Analysis by PAGE indicated that generally only the α-chain of C3 was cleaved, with the exception of plasmin, which cleaved both the α- and the β-chains (Fig. 1). Similarly, only the α-chain of C4 was cleaved in most instances, the β-chain remaining intact and the γ-chain possibly degraded only by kallikrein (Fig. 2). On a molar basis the most active of the enzymes was factor XIa, which cleaved almost half of the C3 α-chains at a 1:15,000 molar ratio. The highest degree of degradation of the C4 α-chain was seen with thrombin at a 1:5 molar ratio (Table III). When the sizes of the native C3 or C4 α-chains were compared with those of their degradation products, both showed a reduction of about 10 kD, indicating that C3a- or C4a-like fragments could have been generated. Indeed, after incubation of C3 or C4 with thrombin, kallikrein, factors XIa and Xa, or t-PA plus plasminogen for one hour at 37°C, significant elevations in C3a, and particularly C4a, levels could be measured. This result clearly indicates that antigenically active fragments had been generated. However, no band in the 10 kD region could be seen on the Coomassie-stained gels.
Figure 2: Kinetics of C4 degradation by factor XIa (left) and kallikrein (right). C4 (200 µg/ml) was incubated for 0, 2, 30, 60 and 90 minutes with factor XIa (9 ng/ml) or kallikrein (45 µg/ml) at 37°C in HEPES buffer. Clearly visible are the C4 α-, C4 β- and C4 γ-chains of 93, 75 and 32 kD, respectively. Both enzymes have similar activities, apparently cleaving the C4 α-chain at a moderate but constant rate over the whole 90 minutes. The degradation product of the C4 α-chain appears to be about 80-83 kD. Gel analysis by scanning also revealed a possible slight degradation of the C4 γ-chain by kallikrein, but not by any of the other active enzymes (i.e., factors Xa, XIa, thrombin or plasmin).
Table III. Relative activity of various proteases in the cleavage of the α-chains of C3 and C4^a
Enzyme       Molar ratio, enzyme : C3/C4   % C3 α cleaved   % C4 α cleaved
XIa          1 : 15,000                    47               16
Xa           1 : 1,000                     11               32
Thrombin     1 : 5                         51               90
Kallikrein   2 : 3                         27               33
Plasmin      4 : 1                         39*              4
^a C3 or C4 was incubated for 60 minutes at 37°C with the respective enzymes, subjected to tricine-PAGE and Coomassie staining and analyzed by scanning the different bands. The values were obtained after background subtraction and normalization to the intensity of the C3 β and C4 β bands, respectively, except where marked with an asterisk.
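The comparison "on a molar basis" made in the text can be sketched numerically. The following is a minimal illustration, not part of the original analysis: it divides each percentage in Table III by the enzyme-to-substrate molar ratio. The values are as read from Table III, and the normalized score (and the names `TABLE_III`, `cleavage_per_molar_equivalent`, `rank_enzymes`) are assumptions introduced here; the score presumes that cleavage scales linearly with the amount of enzyme, which the experiments did not test.

```python
from fractions import Fraction

# Values as read from Table III: enzyme -> (enzyme:substrate molar ratio,
# % C3 alpha-chain cleaved, % C4 alpha-chain cleaved).
# The plasmin C3 value carries an asterisk in the table.
TABLE_III = {
    "XIa": (Fraction(1, 15000), 47, 16),
    "Xa": (Fraction(1, 1000), 11, 32),
    "thrombin": (Fraction(1, 5), 51, 90),
    "kallikrein": (Fraction(2, 3), 27, 33),
    "plasmin": (Fraction(4, 1), 39, 4),
}


def cleavage_per_molar_equivalent(ratio, percent_cleaved):
    """Percent of alpha-chain cleaved per mole of enzyme per mole of substrate.

    Hypothetical score for illustration only; assumes linear scaling.
    """
    return float(Fraction(percent_cleaved) / ratio)


def rank_enzymes(substrate):
    """Return enzyme names sorted from highest to lowest normalized cleavage."""
    column = {"C3": 1, "C4": 2}[substrate]
    scores = {
        enzyme: cleavage_per_molar_equivalent(row[0], row[column])
        for enzyme, row in TABLE_III.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    for substrate in ("C3", "C4"):
        print(substrate, rank_enzymes(substrate))
```

Under this assumption factor XIa ranks first for both substrates, in line with the statement that it was the most active enzyme on a molar basis, even though thrombin gave the highest absolute degradation of the C4 α-chain.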
IV. Conclusions
We found that EDTA plasma shows little ex vivo generation of C3a at 4°C within 48 hours, whereas C4a levels increased significantly. This effect was even more pronounced in whole blood that had undergone the same treatment. Heparin by itself did not appear to be an effective stabilizing agent, but worked well when combined with EDTA. However, the best results were obtained with the serine protease inhibitor Futhan, indicating that enzymes of the complement activation and/or coagulation pathway possessed residual activity after the divalent ions in plasma had been chelated by EDTA. In our experiments factors Xa, XIa, thrombin and kallikrein of the coagulation system, as well as plasmin, were indeed able to cleave C3 and C4. For both molecules, the α-chain was preferentially attacked and initially diminished in size by about 10 kD. This was paralleled by the appearance of antigenically active C3a- and C4a-like products, indicating that a C3a- or C4a-like fragment was generated. However, the fact that no band in the 10 kD area could be detected on the polyacrylamide gel probably means that the fragments generated from C3 and C4 are not necessarily identical with C3a and C4a. Since we used active coagulation enzymes that are not necessarily generated in EDTA plasma, we cannot rule out that continued classical pathway activation in EDTA plasma accounts for the ex vivo conversion of C4.
Bibliography
1. Fujii, S. and Y. Hitomi. 1981. New synthetic inhibitors of C1r, C1 esterase, thrombin, plasmin, kallikrein and trypsin. Biochim. Biophys. Acta 661:342-345.
2. Watkins, J., G. Wild, and S. Smith. 1989. Nafamostat to stabilise plasma samples taken for complement measurements. Lancet 896-897.
3. Issekutz, A. C., D. M. Roland, and R. A. Patrick. 1990. The effect of FUT-175 (Nafamostat mesilate) on C3a, C4a and C5a generation in vitro and inflammatory reactions in vivo. Int. J. Immunopharmac. 12:1-9.
4. Hugli, T. E. 1977. Complement factors and inflammation: effects of α-thrombin on components C3 and C5. In Chemistry and biology of thrombin. R. L. Lundblad, J. W. Fenton, and K. G. Mann, editors. Ann Arbor Science, Ann Arbor, Mich. 345-360.
5. Schaegger, H. and G. von Jagow. 1987. Tricine-sodium dodecyl sulfate-polyacrylamide gel electrophoresis for the separation of proteins in the range from 1 to 100 kDa. Anal. Biochem. 166:368-379.
6. Tait, J. F. and K. Fujikawa. 1987. Primary structure requirements for the binding of human high molecular weight kininogen to plasma prekallikrein and factor XI. J. Biol. Chem. 262:11651-11656.
7. Fujikawa, K. and B. A. McMullen. 1983. Amino acid sequence of human β-factor XIIa. J. Biol. Chem. 258:10924-10933.
8. Mann, K. G. 1976. Methods in Enzymology XLV. L. Lorand, editor. Academic Press, New York. 123-156.
DISULFIDE-LINKED HUMAN STEM CELL FACTOR DIMER: Method of Identification and Molecular Comparison to the Noncovalent Dimer
Hsieng S. Lu, Michael D. Jones, and Keith E. Langley Amgen Inc., Amgen Center, Thousand Oaks CA 91320
I. INTRODUCTION
Stem cell factor (SCF), also termed "kit ligand" or "mast cell growth factor" (1-6), functions in the early stages of hematopoiesis and is an important growth factor involved in the development and function of other cell lineages, including melanocytes and germ cells (7,8). A soluble SCF form of 165 amino acids is biologically functional and contains approximately 40% of N- and O-linked sugar moieties (9-11). Two soluble SCF(1-165) forms, recombinantly expressed in Escherichia coli in a non-glycosylated form (rhSCF) and by mammalian cells in a glycosylated form (9,12,13), contain native SCF structure and are biologically functional. SCF binds to its receptor, kit, to elicit its specific biological functions (1-3). The kit receptor belongs to the type III tyrosine kinase family, whose members include receptors for macrophage colony-stimulating factor (M-CSF) and platelet-derived growth factor (PDGF) (14-16). SCF, M-CSF and PDGF are all dimeric ligands that mediate receptor dimerization (12,13,17,18). In contrast with the M-CSF and PDGF dimers, whose monomers are disulfide-linked (17-19), both glycosylated and nonglycosylated SCF dimers contain non-covalently linked monomers (10,12). The noncovalently associated SCF dimer was observed to undergo spontaneous dissociation-reassociation of monomers in its native state (20).
There are two intramolecular disulfide bonds in each monomer of the SCF molecule (Cys4-Cys89 and Cys43-Cys138), and the production of active rhSCF from E. coli requires an oxidative folding procedure to recover its biological activity (13). Oxidation and folding of denatured and reduced rhSCF involves at least three major partially oxidized intermediates, I-1 to I-5, each containing a native-like or mis-paired disulfide bond (21). These forms appear to reach steady-state equilibrium and are important folding intermediates.
There are two off-pathway intermediates that are dimers linked by a single intermolecular disulfide bond (Cys43-Cys^^ and Cys^^Cys^^, respectively). These two intermediates are present early in folding and disappear once folding is complete. In the final folding mixture, the major folded
SCF is the noncovalently linked dimer (SDS-dissociable) and a small fraction is SDS-nondissociable dimer. In this report, we describe the strategy and methodology leading to the verification of the structure of the SDS-nondissociable SCF dimer. The dimer is covalently linked by four intermolecular disulfide bonds involving all cysteinyl residues. The cysteines are paired as in the non-covalently associated dimer except that all pairings are intermolecular rather than intramolecular. Other structural models, involving intertwining of intramolecular disulfide loops, are ruled out. Understanding the molecular properties of the noncovalently and covalently linked dimers provides some insight into the structure and function of SCF. Detailed purification and biological/biochemical characterization of the disulfide-linked dimer have been described elsewhere (22).
II. MATERIALS AND METHODS
Materials: Escherichia coli-derived rhSCF (SDS-dissociable dimer) was purified according to methods described previously (12,13). The recombinant molecule contains 165 amino acids plus an N-terminal methionine at position -1. Iodoacetic acid was purchased from Sigma. HPLC solvents and water were purchased from Burdick and Jackson. Sequencing reagents and solvents were supplied by Applied Biosystems (Foster City, CA) and Hewlett Packard (Mountain View, CA). All other reagents were of the highest quality available.
Isolation of SDS-nondissociable dimer: Recovery of rhSCF expressed in E. coli includes solubilization of rhSCF-containing inclusion bodies, oxidation and folding, and subsequent chromatographic steps (13). After cationic exchange chromatography using an S-Sepharose column, pooled SCF (approximately 1 L containing 600 mg rhSCF) was further subjected to C-4 reverse-phase chromatography performed with a BioCat liquid chromatographic system (Perceptive Inc., New Jersey) as described (22).
Analytical HPLC analysis and structural characterization: SCF dimer hybridization studies were performed according to a previously described cationic exchange chromatographic procedure (20). Reverse-phase (RP-) HPLC was performed using TFA-acetonitrile gradient elution. A Vydac C4 column (4.6 mm x 25 cm; 300 Å) was equilibrated with 97% solvent A (0.1% TFA)/3% solvent B (0.1% TFA in 90% acetonitrile) with 215 and 280 nm UV detection at a flow rate of 0.7 ml/min. After samples were injected into the column, the following elution program was used: a linear gradient to 20% solvent B in 5 min and to 70% B in 60 min, then isocratic elution at 70% B for 20 min. N-terminal amino acid sequence analysis of peptides was performed on an automatic protein sequencer (Applied Biosystems Models 477A, 470) as described (10). Procedures used to sequence peptides recovered from gel bands electroblotted onto PVDF membranes were described in a previous report (23). Mass spectrometric analysis was performed in a Sciex API-III electrospray mass spectrometer by direct infusion of sample (0.1 mg/ml in 0.1% acetic acid) at 10 µl/min.
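The elution program above can be written as a piecewise-linear function of time. The sketch below is illustrative only and rests on one assumption not stated in the text: the three segments are read as consecutive durations (5 min, then 60 min, then 20 min), so the run ends at 85 min. If the 70% B point is instead reached 60 min after injection, the breakpoints would need to be changed accordingly. The names `BREAKPOINTS`, `percent_b` and `acetonitrile_percent` are hypothetical.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints. Assumes the segments are
# consecutive durations: 3% -> 20% B over 5 min, 20% -> 70% B over the
# next 60 min, then 70% B held for 20 min.
BREAKPOINTS = [(0.0, 3.0), (5.0, 20.0), (65.0, 70.0), (85.0, 70.0)]


def percent_b(t_min):
    """Return % solvent B at t_min minutes after injection."""
    if t_min < 0 or t_min > BREAKPOINTS[-1][0]:
        raise ValueError("time outside the elution program")
    times = [t for t, _ in BREAKPOINTS]
    i = bisect_right(times, t_min) - 1
    if i == len(BREAKPOINTS) - 1:
        return BREAKPOINTS[-1][1]
    (t0, b0), (t1, b1) = BREAKPOINTS[i], BREAKPOINTS[i + 1]
    # Linear interpolation within the current segment.
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def acetonitrile_percent(t_min):
    """Approximate % acetonitrile (v/v); solvent B is 90% acetonitrile."""
    return 0.9 * percent_b(t_min)


if __name__ == "__main__":
    for t in (0, 5, 35, 65, 85):
        print(t, percent_b(t), round(acetonitrile_percent(t), 1))
```

Expressing the program this way makes it easy to estimate the acetonitrile concentration at which a peak elutes from its retention time, ignoring the system dwell volume.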
Hydrogen peroxide oxidation of SCF dimer and CNBr cleavages: SDS-nondissociable rhSCF dimer at 1 mg/ml in 10 mM sodium acetate, pH 5.0, was incubated with 0.5% (w/v) H2O2 at 25°C for 3 h (24). After reaction, the mixture was analyzed by analytical reverse-phase HPLC as described above. The conditions used were found to completely oxidize all Met residues except Met48, of which only a small fraction (about 10%) was oxidized. A complete CNBr cleavage at the Met residues of H2O2-oxidized SCF dimer species was performed as follows. Vacuum-dried samples were redissolved in 70% formic acid (0.2 mg in 150 µl) and then incubated with freshly prepared CNBr (400:1 molar ratio to SCF) at 25°C for 24 h in the dark. All the cleaved samples were immediately vacuum dried for further analysis.
Limited proteolysis by endoproteinase Lys-C: Sample was reconstituted in 20 mM Tris-HCl buffer, pH 7.5 (1 mg/ml) and digested with endoproteinase Lys-C (enzyme-to-substrate ratio = 1:100) at 25°C. At 15 min and 2 h, sample aliquots (100 µl each) were taken and digestion was stopped by adding 5 µl of 20% TFA. Samples of 5-20 µg were dried completely and subjected to SDS-PAGE as described below.
Partial reduction of SCFs: Solutions of SCF dimer species at 1 mg/ml were incubated in the presence of 1.24 mg/ml dithiothreitol (DTT) in 0.1 M Tris-HCl buffer (pH 8.5) containing 2.5 M urea, 60 mM NaCl and 2 mM EDTA. Aliquots of the reaction mixture were removed at selected time intervals and unreacted thiols were blocked by the addition of 1 M iodoacetic acid (10:1 molar ratio to the thiol) in 0.3 M Tris, pH 8.0, for 2 min at room temperature. Samples were then quickly frozen in a methanol/dry ice bath and subsequently analyzed by RP-HPLC using conditions described previously (21).
Gel electrophoresis and electroblotting: Aliquots of dried samples (5-20 µg) were loaded onto individual lanes of precast 16% Laemmli polyacrylamide gels (10 wells; Novex Inc., San Diego, CA) and electrophoresed (25) under nonreducing and reducing conditions. After Coomassie blue staining and destaining, protein band intensity in each gel lane was measured using an image scanner (PDI Inc., New York). In separate analyses, gel bands were also electrophoretically transferred onto PVDF membrane and the Coomassie blue-stained bands were excised for N-terminal sequence analysis (23).
III. RESULTS AND DISCUSSION
Isolation of SDS-nondissociable SCF dimer: Expression of rhSCF in bacteria results in the production of insoluble and inactive SCF accumulated in inclusion bodies. Solubilization and in vitro folding and oxidation are therefore necessary for the recovery and chromatographic purification of active SCF (12,13,21). The rhSCF isolated in this way is a noncovalently linked, SDS-dissociable dimer (12), like naturally occurring
SCF (1). However, during the cationic exchange chromatography after folding and oxidation, we have noticed that rhSCF bands of 18 and 37 kDa co-elute, as analyzed by nonreducing SDS-PAGE (data not shown). The 37 kDa species was also detectable in the final folding mixture by HPLC (21).
[Figure 1 graphic, panels B and C: RP-HPLC traces (x-axis: Retention Time, min) and ESI mass spectra (x-axis: m/z, 1200-2200) of rhSCF and the nondissociable dimer. Peak labels legible in the scan: [M+13H], 1556.3, 2074.0, [M+17H], 2196.0.]
Figure 1. A, SDS-PAGE analysis. From the left: standards (45, 31, 21 and 14 kDa, from the top), DTT-reduced rhSCF, DTT-reduced dimer, nonreduced rhSCF, and nonreduced dimer. B, RP-HPLC of wild-type rhSCF and the SDS-nondissociable dimer. C, ESI-MS analysis of wild-type rhSCF and the dimer. The multiply charged ions are indicated; the molecular masses of each form obtained from the respective deconvoluted spectrum are 18,658.5 ± 2.3 and 37,315.2 ± 3.6, respectively.
Preparative reversed-phase column chromatography resolved these two species from partially purified preparations obtained after cationic exchange chromatography. The isolated major form (80-85% of the total) has a molecular weight of 18 kDa under nonreducing SDS-PAGE, while the minor form (15-20%) has a molecular weight of 37 kDa (Fig. 1A). Both forms migrate as 19 kDa bands on reducing SDS-PAGE, as seen in Figure 1A. The major form corresponds to the active non-covalently associated, SDS-dissociable rhSCF dimer (17), while the minor form represents the covalently linked, SDS-nondissociable SCF dimer. Analytical RP-HPLC using TFA-acetonitrile gradient elution is shown in Fig. 1B. This analysis provides full resolution of the two species, with the covalently linked dimer eluting later (more hydrophobic). In electrospray mass spectrometric analysis shown in Fig. 1C, the major form gave an average MH+ mass of
18,658.5 ± 2.3 (theoretical mass = 18,657.6), while the dimer gave an average mass of 37,315.6 ± 3.6 (theoretical mass = 37,315.2). These data indicate that the covalent dimer contains two identical SCF monomers, like the dissociable dimer. The 37 kDa species is referred to hereafter as the SDS-nondissociable dimer to distinguish it from the normal dissociable dimer.
Molecular comparison: An extensive comparison of the biological, biochemical and biophysical properties of the SDS-dissociable and nondissociable forms has been reported (22). Many molecular properties are shared by both molecules; however, clear differences can also be observed. A brief summary of the structural and functional similarities and differences is given in Table 1. One difference between the two forms can be demonstrated in a dimer dissociation-reassociation experiment using a cationic exchange HPLC method. The usual noncovalently associated dimer can undergo spontaneous, rapid monomer dissociation-reassociation (20). This was shown with the use of an N10D variant of rhSCF, which migrates differently from the wild type on ion-exchange HPLC, as indicated in Figure 2. Upon mixing the N10D and wild-type molecules, the appearance of hybrid dimer could be monitored (Fig. 2, chromatogram 3). Not surprisingly, the SDS-nondissociable dimer did not undergo such dissociation-reassociation and subunit exchange (Fig. 2, chromatogram 6).
Table 1. Comparison of the molecular properties of SDS-dissociable and nondissociable rhSCF dimers
A. Similarity:
-Profile in ion-exchange chromatography and gel filtration (similar charge and size).
-In vitro biological activity to hematopoietic cells and receptor binding. Nondissociable dimer is three-fold more active, but it binds to receptor with half the efficiency (22).
-Identity in disulfide pairing, Cys4-Cys89 and Cys43-Cys138 (identical peptide map).
-Identity in CD spectra, fluorescence spectra, and thermostability (similar secondary and tertiary structures and local environment).
-Homodimer in solution.
B. Difference:
-The covalent dimer is not dissociated by SDS (linked by intermolecular disulfides).
-The SDS-nondissociable dimer elutes later in reverse-phase chromatography (i.e., more hydrophobic).
-The SDS-nondissociable dimer cannot undergo spontaneous monomer-dimer dissociation-reassociation.
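The electrospray masses reported above for the monomer and the covalent dimer can be related to the multiply charged ions seen in a spectrum with a few lines of arithmetic. The sketch below is illustrative and makes two assumptions: every charge is carried by an added proton (1.00728 Da), and the theoretical masses quoted in the text are treated as neutral average masses. The function names are hypothetical.

```python
PROTON_MASS = 1.00728  # Da; assumes all charges come from added protons

MONOMER_MASS = 18657.6  # theoretical mass quoted in the text
DIMER_MASS = 37315.2    # theoretical mass quoted in the text


def mz(neutral_mass, charge):
    """m/z of the [M + zH]z+ ion for a given neutral mass and charge z."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_peak(mz_value, charge):
    """Invert mz(): recover the neutral mass from one assigned peak."""
    return charge * (mz_value - PROTON_MASS)


def charge_from_adjacent(mz_high, mz_low):
    """Charge of the peak at mz_high, given the neighboring peak at lower
    m/z that carries one more proton (standard two-peak deconvolution)."""
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


if __name__ == "__main__":
    for z in (17, 18):
        print("dimer", z, round(mz(DIMER_MASS, z), 1))
    print("monomer", 9, round(mz(MONOMER_MASS, 9), 1))
```

Because the dimer mass is exactly twice the monomer mass, a dimer ion with an even charge 2z falls at the same m/z as the monomer ion with charge z. Only odd charge states of the dimer, such as z = 17, appear between the monomer peaks, which is why they are diagnostic for the covalent dimer.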
Figure 2. Cationic exchange HPLC. Chromatograms 1-6: rhSCF, N10D variant, rhSCF and N10D variant mixture, nondissociable dimer, nondissociable dimer and rhSCF, and nondissociable dimer and N10D variant. Incubation was in 10 mM NaOAc, pH 4.5, at 37°C for 20 h with each sample at 1 mg/ml; 50 µl of sample was injected.
Possible structures for SDS-nondissociable rhSCF dimer: Since the disulfide bonds Cys4-Cys89 and Cys43-Cys138 are present in both forms and the monomers of the nondissociable SCF dimer become dissociable in the presence of reducing agent (see Fig. 1A), it follows that there are two types of structures that could explain the lack of dissociation in SDS. These two types of models are intermolecularly disulfide-linked and concatenated dimers (Fig. 3A and 3B, respectively). There are three possible disulfide-linked dimers, in which the cysteines form either four intermolecular S-S linkages (structure A1) or two inter- and two intramolecular S-S bridges (structures A2 and A3). The five possible concatenated dimers would contain interlocked, but not covalently linked, monomers (Fig. 3B). Structure B1 is a dimer concatenated by the N-terminal disulfide loops of the two monomers, while structure B2 is interlocked by the two C-terminal disulfide loops. Structures B3 and B4 are dimers with, respectively, an N- or C-terminal disulfide loop of one monomer locked into the other monomer near a sequence region (between residues 44 and 88) shared by both N- and C-terminal disulfide loops. B5 is concatenated between the N-terminal disulfide loop of one monomer and the C-terminal loop of the other. In order to determine which structure(s) correspond(s) to the isolated SDS-nondissociable dimer, the experiments described below were performed. Several strategies were followed according to the above models. As indicated in Fig. 3, Lys99, Lys103 and Met48 are important sequence positions for the cleavages. Table 2 compares the expected cleavage results to the observed data for the structures shown in Fig. 3.
Structural characterization of the SDS-nondissociable dimer:
A. Limited endoproteinase Lys-C digestion: Figure 4A shows SDS-PAGE of digests generated by limited proteolysis (nonreducing conditions, for 15 min [lane 1] or 2 h [lane 2]) with Lys-C protease. The SCF polypeptide has
14 Lys residues along the polypeptide chain, including Lys99 and Lys103. After the limited proteolysis, bands are still apparent near the 36 kD position. When these bands were transferred to PVDF membrane and sequenced, two sequences, M(-1)-E-G-I-C... and S104-P-E-P-R..., were detected in equivalent yields, suggesting that there is complete cleavage after Lys103. Several small peptides were also isolated from the digest by reverse-phase HPLC and shown by sequence and mass spectrometric analyses to be D97-L-K, K100-S-F-K, D149-S-R-V-S-V-T-K-P-F-M-L-P-P-V-A-A, and P157-F-M-L-P-P-V-A-A. The latter two are C-terminal peptides not in the disulfide loops. Identification of these small peptides indicates that there was also partial cleavage by Lys-C after Lys96, Lys99, Lys148, and Lys156. Verification of these complete and partial cleavages was provided by reducing SDS-PAGE (lanes 3 and 4, Figure 4A). Only three large peptides, of 11, 10, and 7 kDa, were seen; the 11 kDa and 10 kDa bands had the rhSCF N-terminal sequence M(-1)-E-G-I-C, and the 7 kDa peptide had the sequence S104-P-E-P-R... In this case, the key point is that cleavage after Lys103, i.e., within the Cys43-Cys138 disulfide loop, still leaves material which migrates near 35 kDa on non-reducing SDS-PAGE. This finding is inconsistent with models B2, B4, and B5, but consistent with all of the other models (Table 2).
Table 2. Assignment of SDS-nondissociable SCF dimer structure by specific cleavages
               Lys-C cleavage at            CNBr cleavage                Partial DTT reduction
               Lys99, 103, 148              at Met48                     (Cys4, Cys89 reduced)
Structure(a)   Expected(b)   Observed(c)    Expected(b)   Observed(c)    Expected(b)   Observed(c)
A1             dimer         dimer          monomer       monomer        dimer         dimer
A2             dimer         dimer          dimer         monomer        monomer       dimer
A3             dimer         dimer          dimer         monomer        dimer         dimer
B1             dimer         dimer          dimer         monomer        monomer       dimer
B2             monomer       dimer          dimer         monomer        dimer         dimer
B3             dimer         dimer          monomer       monomer        monomer       dimer
B4             monomer       dimer          monomer       monomer        dimer         dimer
B5             monomer       dimer          dimer         monomer        monomer       dimer
(a) Structure A1 is the only model compatible with all experimental data (for details, see text).
(b) Expected result if the predicted structure is cleaved by the specific cleavage methods used.
(c) Observed results are shown in Fig. 4 and 5 (for details, see text).
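The elimination logic summarized in Table 2 can be expressed as a small consistency check. The sketch below is an illustration rather than part of the original analysis: the expected outcomes are as read from Table 2, and the names `EXPECTED`, `OBSERVED`, `consistent_models` and `excluded_by` are hypothetical.

```python
# Expected migration on nonreducing SDS-PAGE for each model, in the order
# (Lys-C cleavage, CNBr cleavage at Met48, partial DTT reduction),
# as read from Table 2.
EXPERIMENTS = ("Lys-C", "CNBr", "DTT")

EXPECTED = {
    "A1": ("dimer", "monomer", "dimer"),
    "A2": ("dimer", "dimer", "monomer"),
    "A3": ("dimer", "dimer", "dimer"),
    "B1": ("dimer", "dimer", "monomer"),
    "B2": ("monomer", "dimer", "dimer"),
    "B3": ("dimer", "monomer", "monomer"),
    "B4": ("monomer", "monomer", "dimer"),
    "B5": ("monomer", "dimer", "monomer"),
}

# Observed outcomes of the same three experiments.
OBSERVED = ("dimer", "monomer", "dimer")


def consistent_models(expected, observed):
    """Models whose predictions match every observation."""
    return [model for model, pred in expected.items() if pred == observed]


def excluded_by(expected, observed):
    """Map each experiment to the models it rules out on its own."""
    return {
        name: [m for m, pred in expected.items() if pred[i] != observed[i]]
        for i, name in enumerate(EXPERIMENTS)
    }


if __name__ == "__main__":
    print(consistent_models(EXPECTED, OBSERVED))
    for experiment, models in excluded_by(EXPECTED, OBSERVED).items():
        print(experiment, "excludes", models)
```

No single experiment is decisive on its own; each removes a different subset of models, and only their intersection leaves one candidate.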
Figure 3. Proposed models for the SDS-nondissociable dimer. A, disulfide-linked dimers (A1, A2, and A3). The number 48 shown in A1, A2, and A3 is Met at position 48. C4, C43, C89, and C138 are cysteines at positions 4, 43, 89, and 138. K99 and K103 are lysines at positions 99 and 103. B, concatenated dimers (B1, B2, B3, B4, and B5). M48 and K103 are also indicated as cleavage sites.
B. CNBr cleavage of H2O2-oxidized dimer at Met48: Five methionines, i.e., Met(-1), Met27, Met36, Met48, and Met159, are in the rhSCF sequence, with Met27 and Met36 in the N-terminal loop created by the Cys4-Cys89 bond and Met48 in a sequence region shared by both disulfide loops. A complete CNBr cleavage at Met residues will open the N-terminal loop and cut in the area shared by both loops. Such a complete cleavage, however, opens up all dimer structures (Fig. 3) and generates monomer forms under nonreducing SDS-PAGE. In another approach, the SDS-nondissociable dimer was reacted with H2O2 under conditions which completely oxidize Met(-1), Met27, Met36, and Met159, but only partially (about 10%, as indicated by peptide mapping analyses) oxidize Met48. The oxidized material was then subjected to complete CNBr cleavage. Since Met sulfoxide residues are resistant to the cleavage, a selective cleavage is expected to occur after Met48 in the 90% of polypeptide chains which were not oxidized at this position, and sequence analysis of the digest confirmed this expectation. Fig. 4B shows SDS-PAGE analyses. Note in lane 4 (reducing condition) that some material remains uncleaved (about 18 kDa) whereas the majority has been cleaved. Sequence analysis showed that the 6 kDa and 13 kDa bands correspond to the peptides generated by cleavage at Met48. In lane 2 (nonreducing condition), the material at 18 kDa has two equivalent sequences, corresponding to the rhSCF N-terminus and to the peptide sequence starting at Val49. Thus, cleavage after Met48 generates "monomer" (18 kDa on nonreducing SDS-PAGE), a finding which is consistent only with models A1, B3, and B4. Nineteen percent of the material visualized in lane 2
remains at the "dimer" position (about 35 kDa). The material at this band position has sequences corresponding to the rhSCF N-terminus and to the peptide beginning at Val49, in a ratio of 2:1; this result is expected since 10% of the H2O2-treated material was oxidized at Met48 and therefore uncleavable with CNBr (if the two chains of a dimer are assumed to be oxidized independently, 1 - 0.9 x 0.9 = 19% of the dimers retain at least one uncleavable chain). Models A1, B3, and B4 all allow for retention of "dimer" if only one chain and not the other is cleaved.
Figure 4. SDS-PAGE of peptide products of SDS-nondissociable dimer derived from chemical and proteolytic cleavages. A, endoproteinase Lys-C digestion. Lanes 1 and 2 (nonreducing), products at 40 and 10 µg; lanes 3 and 4, as lanes 1 and 2, but reducing. B, CNBr cleavage of Met-oxidized dimer. Lane 1 (nonreducing), oxidized dimer; lane 2 (nonreducing), cleavage product; lanes 3 and 4, as lanes 1 and 2, but reducing.
C. Partial reduction of dimeric SCFs: When native, SDS-dissociable rhSCF dimer was partially reduced with DTT and then alkylated with iodoacetate, and the resulting mixture was analyzed by reverse-phase HPLC and peptide mapping of the HPLC peaks, the Cys4-Cys89 bond was found to be preferentially reduced, with generation of an intermediate (I-2) containing only the Cys43-Cys138 bond (Fig. 5A, chromatogram 2); reduction of the Cys43-Cys138 bond follows at later times. When the SDS-nondissociable SCF dimer was similarly subjected to partial reduction and alkylation, no I-2 was detected. Instead, two unique peaks, a and b, were resolved by HPLC at retention times later than that of the SDS-nondissociable SCF dimer (Fig. 5A, chromatogram 4). By sequence analysis, both species gave a clear PTH-Cys (Cm) signal at position 4; the signal for peak b was about half that for peak a. In addition, both peaks a and b migrate at the 36 kDa "dimer" position on non-reducing SDS-PAGE (Figure 5B, lanes 6 and 7). These findings indicate that peak a is "dimeric" material in which both Cys4-Cys89 disulfide bonds have been broken and the Cys43-Cys138 disulfide bonds are intact, while peak b is "dimeric" material in which only one of the Cys4-Cys89 disulfide bonds has been broken. Since there is no detectable "monomeric" material (as I-2), we conclude that the data are inconsistent with models A2, B1, B3, and B5, but consistent with the other models (Table 2).
As summarized in Table 2, the only model for the SDS-nondissociable rhSCF dimer compatible with all the results of the last three experiments is A1, with four intermolecular disulfide bonds involving all four Cys residues of each monomer. Therefore, the SDS-nondissociable dimer is a disulfide-linked dimer, with no intramolecular disulfide bonds. None of the proposed concatenated dimers exist.
Figure 5. Partial DTT reduction of rhSCF dimers. A, RP-HPLC analysis. Chromatograms 1 and 2: SDS-dissociable dimer (N), untreated and DTT-treated (10 min), respectively; chromatograms 3 and 4: SDS-nondissociable dimer, untreated and DTT-treated (5 min), respectively. 50 µg of each was injected. B, Nonreducing SDS-PAGE of forms referred to in A. Lanes, from left: protein standards, R, N, SDS-nondissociable dimer, I-2, a, and b.
Comparison and speculation of quaternary structure: As described in rhSCF folding studies (21), intermediate I-1 with a Cys4-Cys89 disulfide bond is the main intermediate form during rhSCF folding and oxidation. This and other intermediates lead to the non-covalently associated SCF dimer
with intramolecular disulfides, but could also undergo disulfide rearrangement to form intermolecular disulfides. For such events to occur, the partially-oxidized rhSCF monomers would have to be associated prior to disulfide formation; we have shown that all of the intermediate forms that have been identified are in dimeric state (21). As described in Table 1, many of the biochemical and biophysical properties of the non-covalently associated dimer and the disulfide-linked dimer appear indistinguishable - including surface charge, molecular size, plus secondary and tertiary structure and local environments. The disulfide-linked dimer does behave differently than the non-covalently associated dimer on RPHPLC at low pH and in the monomer dissociation-reassociation experiments (Figs. 1 and 2). In each case the differences essentially reflect the covalent attachment of the disulfide-linked dimer. The biological properties of the covalent dimer are noteworthy. Its activity toward hematopoietic target cells is 3-fold higher than the activity of non-covalently associated dimer (Table 1). However, in c-kit receptor binding experiments, the disulfide-linked dimer if anything displayed slightly lower affinity for kit in comparison with the non-covalently associated dimer. This phenomenon may be due to the
Figure 6. Proposed quaternary structures of rhSCF dimer and disulfide-linked dimer. A, SDS-dissociable rhSCF dimer with topology similar to M-CSF. B, disulfide-linked dimer having all disulfides at the dimer interface. C, disulfide-linked dimer containing A and D helices swapped between subunits (distinguished as shaded and unshaded helices). The four-helix structure (A-D helices) was derived from that proposed by Bazan (27).
possibility that SCF dimer is necessary to mediate kit dimerization, or at least that SCF dimer may be more effective at doing so than SCF monomer, although monomeric SCF can mediate the dimerization and activation of kit receptor (26). Depending on the Ka for monomer association to dimer (20), it is possible that much of the noncovalently associated SCF dimer could be monomeric in the 0.05-2 ng/ml range, which is equivalent to the effective concentration for the biological assay, while the disulfide-linked dimer is dimeric at all concentrations. The observation above implies that the overall quaternary structure, including interactions at the dimer interface, would be similar for the disulfide-linked and non-covalently associated dimers. In considering the structure of SCF, as pointed out by Bazan (27), there are many reasons to expect similarity to the structure of M-CSF, which is known (28). The X-ray crystallographic structure of the M-CSF dimer (28) includes the four-helix bundle for each monomer which had been proposed by Bazan for both M-CSF and SCF (27). The two monomers of M-CSF associate in head-to-head fashion, i.e., the top ends of the helix bundles associate, leading to a flat and elongated overall shape. The SCF-equivalent intramolecular disulfide bonds (Cys⁷-Cys⁹⁰ and Cys⁴⁸-Cys¹³⁹) of M-CSF are at the ends of the helix bundles distal to the dimer interface. Given that the disulfide-linked SCF dimer described here is highly active, we offer the following speculation as to how its structure may compare to that of the noncovalently associated SCF dimer. If the quaternary structure of noncovalently associated SCF (see the predicted four-helix bundle structure in Fig. 6, model A) is homologous to that of M-CSF, the monomers of the disulfide-linked dimer would need to be inverted in order to accommodate the disulfide bond formation, without an adverse effect on activity (Fig. 6, model B).
Alternatively, if the quaternary structures of the disulfide-linked SCF dimer (model B) and the noncovalently associated SCF dimer were similar to each other, both could be inverted relative to that of the M-CSF dimer. Thirdly, and perhaps most likely, the quaternary structures of the SCFs could be similar to each other and to that of M-CSF if, for the disulfide-linked SCF dimer, the proposed A and D helices (27) were swapped between the monomers within the dimer, as seen in model C (Fig. 6). Such swapping would be feasible within the constraints of the proposed SCF structure (i.e., similar to the M-CSF structure), and could conceivably arise during the refolding of the E. coli-derived recombinant molecule. There is precedent for such swapping of helices or other domains between monomers within overall oligomeric structures, e.g., interleukin 5 (29) and many other proteins as well (30).

ACKNOWLEDGMENTS

We are indebted to the microbial fermentation and recovery process development groups at Amgen Inc. for technical help in the expression and isolation of rhSCF.
REFERENCES

1. Zsebo, K.M., Wypych, J., et al. (1990) Cell 63, 195-201.
2. Martin, F.H., Suggs, S.V., et al. (1990) Cell 63, 203-211.
3. Zsebo, K.M., Williams, D.A., et al. (1990) Cell 63, 213-224.
4. Williams, D.E., Eisenman, J., et al. (1990) Cell 63, 167-174.
5. Copeland, N.G., Gilbert, D.J., et al. (1990) Cell 63, 175-183.
6. Huang, E., Nocka, K., et al. (1990) Cell 63, 225-233.
7. Russell, E.S. (1979) Adv. Genet. 20, 357-459.
8. Silvers, W.K. (1979) The Coat Colors of Mice. A Model for Mammalian Gene Action and Interaction. Springer-Verlag, New York.
9. Lu, H.S., Clogston, C.L., et al. (1992) Arch. Biochem. Biophys. 298, 150-158.
10. Lu, H.S., Clogston, C.L., et al. (1991) J. Biol. Chem. 266, 8102-8107.
11. Langley, K.E., Bennett, L.G., et al. (1993) Blood 81, 656-660.
12. Arakawa, T., Yphantis, D.A., et al. (1991) J. Biol. Chem. 266, 18942-18948.
13. Langley, K.E., Wypych, J., et al. (1992) Arch. Biochem. Biophys. 295, 21-28.
14. Yarden, Y., Kuang, W.-J., et al. (1987) EMBO J. 6, 3341-3351.
15. Ullrich, A. and Schlessinger, J. (1990) Cell 61, 203-212.
16. Miyajima, A., Kitamura, T., Harada, N., Yokota, T. and Arai, K.-I. (1992) Ann. Rev. Immunol. 10, 295-331.
17. Johnsson, A., Heldin, C.-H., Westermark, B. and Wasteson, A. (1982) Biochem. Biophys. Res. Commun. 104, 66-74.
18. Das, S.K. and Stanley, E.R. (1982) J. Biol. Chem. 257, 13679-13684.
19. Glocker, M.O., Arbogast, B., Schreurs, J. and Deinzer, M.L. (1993) Biochemistry 32, 482-488.
20. Lu, H.S., Chang, W.-C., et al. (1995) Biochem. J. 305, 563-568.
21. Jones, M.D., Narhi, L.O., Chang, W.-C. and Lu, H.S. (1996) J. Biol. Chem. 271, 11301-11308.
22. Lu, H.S., Jones, M.D., et al. (1996) J. Biol. Chem. 271, 11309-11316.
23. Fausset, P.R. and Lu, H.S. (1991) Electrophoresis 12, 22-27.
24. Hsu, Y.-R., Narhi, L.O., Spahr, C., Langley, K.E. and Lu, H.S. (1996) Protein Science 5, 1165-1173.
25. Laemmli, U.K. (1970) Nature 227, 680-685.
26. Lev, D., Yarden, Y. and Givol, D. (1992) J. Biol. Chem. 267, 15970-15977.
27. Bazan, J.F. (1991) Cell 65, 9-10.
28. Pandit, J., Bohm, A., et al. (1993) Science 258, 1358-1362.
29. Milburn, M.V., Hassell, A.M., et al. (1993) Nature 363, 172-176.
30. Bennett, M.J., Schlunegger, M.P. and Eisenberg, D. (1995) Protein Science 4, 2455-2468.
AUTOCATALYTIC REDUCTION OF A HUMANIZED ANTIBODY A. Ashok Kumar John Kimura Jennifer Running Deer ICOS Corporation Bothell, Washington 98021
I. INTRODUCTION

Disulfide (-S-S-) bonds play an important role in the structure and function of proteins. IgG molecules are composed of two heavy and two light chains linked by interchain disulfide bonds; intrachain disulfide bonds are also present. Alkyl thiols have been used to demonstrate the structural and functional role played by disulfide bonds. Reduction of IgGs with alkyl thiols under denaturing conditions results in separation of light and heavy chains. Complementarity determining region (CDR) grafting is a unique way of generating human antibodies with the same specificities as murine antibodies. We report here our work with a humanized antibody (hAB-1), its parent murine antibody (pMAB) and a control humanized antibody (hAB-2). Both pMAB and hAB-1 contain a cysteine residue in CDR-1 of the heavy chain. Our results demonstrate autocatalytic reduction of hAB-1 under denaturing conditions. We also report the identity of the multiple species formed by autoreduction. N-ethylmaleimide (NEM), an alkylating reagent, reacts specifically with protein cysteines at neutral pH to form stable thioether bonds (1-4). Thiol-specific reagents and in vitro mutagenesis experiments were used to confirm the involvement of a cysteine residue in the autoreduction of hAB-1.

TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press All rights of reproduction in any form reserved.
II. MATERIALS AND METHODS

Purification of Antibodies: Humanized antibodies (hAB-1 and hAB-2) were purified by Protein-A chromatography, followed by Phenyl Sepharose and ion-exchange chromatography. The murine antibody (pMAB) was purified by Protein-A chromatography.

SDS-PAGE: Polyacrylamide gel electrophoresis of proteins was performed using NOVEX (San Diego, CA) pre-cast, 1 mm, 12% Tris-glycine gels, by the method of Laemmli (5). Protein samples were boiled for 1-10 minutes in reducing or non-reducing NOVEX sample buffers (pH 6.8). For reducing gels, DTT was added to the sample to a final concentration of 50 mM. Samples were electrophoresed at a constant voltage of 200 volts for 70 minutes. Protein bands were visualized by staining with Coomassie Blue R-250.

Alkylation of Protein Thiols: Free cysteines of pMAB and hAB-1 were blocked with NEM or iodoacetamide by incubating the proteins (50 µg) with NEM (4 µg) or iodoacetamide (200 µg) in PBS at room temperature for 10 or 30 minutes.

Matrix Assisted Laser Desorption Ionization-Mass Spectrometry (MALDI-MS): Mass spectra of native and denatured antibodies were obtained with a PerSeptive Biosystems (Framingham, MA) Voyager Elite mass spectrometer operated in the linear mode with a Laser Sciences Inc. 337 nm nitrogen laser. hAB-1 was denatured by boiling the sample in 1.0 M guanidine-HCl, 50 mM Tris, pH 7.5 buffer. Native and denatured samples were diluted with 20 mM Tris, 10 mM octylglucoside (Tris/OG), pH 6.8 buffer prior to MALDI-MS analysis. Proteins were spotted on the sample plate as a sandwich between two layers of the matrix. The bottom layer consisted of 100 mM sinapinic acid in acetonitrile and the top layer consisted of 50 mM sinapinic acid in 30% acetonitrile / 70% H2O / 0.07% TFA. The m/z scale of the instrument was calibrated using a Hewlett-Packard protein standard mixture.

Cysteine to Serine Conversion: The heavy chain CDR-1 cysteine residue of hAB-1 was converted to a serine residue by standard in vitro mutagenesis techniques. The mutation was confirmed by sequencing the entire cDNA. The modified CS-hAB-1 was purified by Protein-A chromatography.
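The reagent amounts given for the alkylation step can be put on a molar basis. The following is a minimal sketch, not part of the original protocol; it assumes an antibody mass of about 148 kDa (the MALDI-MS value reported later in this chapter) and standard molecular weights for NEM (125.13 g/mol) and iodoacetamide (184.96 g/mol). The function name is illustrative only.

```python
# Rough molar excess of alkylating reagent over antibody.
# Assumptions (not stated in the methods): antibody mass ~148 kDa,
# NEM = 125.13 g/mol, iodoacetamide = 184.96 g/mol.
NEM_MW = 125.13
IAA_MW = 184.96
IGG_MW = 148_000.0


def nmol(mass_ug: float, mw_g_per_mol: float) -> float:
    """Convert a mass in micrograms to nanomoles."""
    return mass_ug / mw_g_per_mol * 1000.0


antibody_nmol = nmol(50.0, IGG_MW)            # 50 ug protein per reaction
nem_excess = nmol(4.0, NEM_MW) / antibody_nmol
iaa_excess = nmol(200.0, IAA_MW) / antibody_nmol

print(f"antibody: {antibody_nmol:.2f} nmol")
print(f"NEM molar excess: ~{nem_excess:.0f}-fold")
print(f"iodoacetamide molar excess: ~{iaa_excess:.0f}-fold")
```

Under these assumptions both reagents are in large molar excess over the antibody; the excess is expressed per antibody molecule, not per free thiol.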
III. RESULTS AND DISCUSSION

A. Apparent Heterogeneity of Humanized Antibody hAB-1

The antibodies hAB-1, hAB-2 and pMAB were examined by SDS-PAGE under non-reducing and reducing conditions (Figure 1). Non-reducing SDS-PAGE of hAB-1 revealed 10-15 protein bands in the molecular weight range of 25 to 200 kDa, while the hAB-2 protein had two minor bands (>95 kDa and <200 kDa) and a major band at 200 kDa. The parent murine antibody had two very closely spaced bands near 200 kDa. On reduction, all three proteins showed primarily two major bands corresponding to mass values of 50 kDa and 25 kDa. All three proteins contained a minor band just above the 25 kDa band. Occasionally, a very low molecular weight band, near the dye front, was seen with hAB-2 under reducing conditions.
Figure 1. Apparent heterogeneity of humanized antibody hAB-1. Ten micrograms of each antibody were mixed with non-reducing or reducing SDS-PAGE sample buffer and boiled for ten minutes prior to electrophoresis. The NOVEX molecular weight markers (MW) are identified by their mass values. Samples: Lanes 1,5 pMAB, lanes 2,6 hAB-2 and lanes 3,7 hAB-1. HC and LC correspond to heavy and light chains of antibodies, respectively.
B. Effect of Sample Preparation on Apparent Heterogeneity

The possible effect of the sample preparation technique on the observed heterogeneity was examined by studying how incubation temperature and duration influence the appearance of multiple bands with hAB-1. Results from SDS-PAGE of hAB-1 and pMAB boiled for 1, 5, and 10 minutes in non-reducing buffer are shown in Figure 2. The intensity of the lower molecular weight bands increases with boiling time, particularly for hAB-1. Figure 3 shows the effect of incubating the humanized antibody in non-reducing SDS-PAGE buffer at different temperatures. hAB-1 migrates as a doublet between the 116.3 kDa and 200 kDa markers when incubated at 5°C or 21°C. These samples also contain a minor band at 67 kDa. The 37°C samples showed a >95 kDa band and a 200 kDa band in addition to the bands seen with the 5°C or 21°C samples. Samples incubated at 60°C or boiling temperature had several low molecular weight bands.
Figure 2. Influence of sample preparation time on apparent heterogeneity. pMAB and hAB-1 were mixed with non-reducing SDS-PAGE sample buffer and boiled for various times prior to electrophoresis. Boiling times: lanes 1 and 2, 1 minute; lanes 3 and 4, 5 minutes; lanes 5 and 6, 10 minutes. Samples: pMAB (lanes 1, 3 and 5), hAB-1 (lanes 2, 4 and 6).
Figure 3. Influence of temperature on apparent heterogeneity. hAB-1 was incubated in non-reducing SDS-PAGE sample buffer for ten minutes at various temperatures prior to electrophoresis. Incubation temperatures: lane 1, 5°C; lane 2, 21°C; lane 3, 37°C; lane 4, 60°C; lane 5, boiling.

C. Molecular Integrity of hAB-1 Antibody

Gel filtration HPLC of hAB-1 in 0.2 M sodium phosphate (pH 6.8) buffer showed that the majority of the protein (97%) was in the monomer form, with a retention time similar to that of the bovine gamma globulin (158 kDa) standard. MALDI-MS is a technique well suited for the examination of low and high molecular weight biomolecules (6,7). Figure 4 shows the mass spectra for hAB-1, hAB-2 and pMAB diluted in Tris/OG buffer. The spectra contain signals for the (M+H)⁺ (148 kDa), (M+2H)²⁺ (74 kDa), (M+3H)³⁺ (49.5 kDa) and (M+4H)⁴⁺ (37.6 kDa) species in each case. In addition, a minor signal at 23-24 kDa was observed.
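The positions of the multiply protonated signals follow from the general relation m/z = (M + z·1.00728)/z for an ion carrying z protons. The short sketch below applies that relation to a nominal IgG mass of 148,000 Da; the nominal mass and the function name are assumptions made for illustration, and the calculated values are idealized positions rather than the observed peak centroids quoted above.

```python
# Expected m/z for an intact protein carrying z protons.
PROTON_MASS = 1.00728  # Da


def mz(mass_da: float, z: int) -> float:
    """Return m/z of the (M+zH)z+ ion for a neutral mass in Da."""
    return (mass_da + z * PROTON_MASS) / z


IGG_MASS = 148_000.0  # nominal value, assumed for illustration

for z in range(1, 5):
    print(f"(M+{z}H){z}+ : m/z = {mz(IGG_MASS, z):,.0f}")
```

For a protein of this size the proton mass is negligible, so each signal sits close to M/z; broad, poorly resolved peaks in linear-mode spectra may account for the small offsets between such idealized positions and reported values.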
[Figure 4 spectra. Labeled peaks: hAB-2, m/z 148117; pMAB, m/z 148253. Horizontal axis: m/z.]
Figure 4. MALDI-MS of native hAB-1, hAB-2 and pMAB. Samples (2 pmol) in 20 mM Tris, 10 mM octylglucoside, pH 6.8 buffer were used for MS analysis.

D. Autoreduction of hAB-1 Antibody

The parent murine antibody (pMAB) and the humanized antibody (hAB-1) both contain a cysteine residue in their heavy chain CDR-1. The role of this cysteine residue in the apparent heterogeneity of hAB-1 was examined by incubating the hAB-1 protein with the thiol-specific reagents iodoacetamide (IAA) and N-ethylmaleimide (NEM). The data in Figure 5 show that preincubation with IAA or NEM prevents the formation of the low molecular weight species seen with untreated hAB-1. This protection by thiol-specific reagents suggests autoreduction as the cause of the apparent heterogeneity.
Figure 5. Role of a cysteine residue in autoreduction of hAB-1. hAB-1 (50 µg) was incubated with either NEM (4 µg) or IAA (200 µg) in PBS at room temperature for 10 or 30 minutes. At the end of the incubation period, treated and untreated samples (10 µg) were boiled in non-reducing SDS-PAGE sample buffer and electrophoresed. Samples: lanes 1 and 4, untreated; lanes 2 and 5, treated with NEM; lanes 3 and 6, treated with IAA.

E. Autoreduction of hAB-1 and pMAB in Guanidine-HCl

We have examined the mass spectra of all three proteins in 1.0 M guanidine-HCl with and without thermal denaturation. Samples incubated in guanidine-HCl at room temperature have spectra very similar to those in Tris/OG buffer (data not shown). Mass spectra of samples boiled in guanidine-HCl (Figure 6) revealed that similar species are seen with hAB-1 and pMAB and that both are subject to autoreduction under these conditions. Results with hAB-2 differed from those with hAB-1 and pMAB. With all three samples, boiling in guanidine-HCl resulted in a significant loss of (M+H)⁺ signal for high molecular weight components (74.5-148 kDa). The loss of the (M+H)⁺ and (M+2H)²⁺ signals of IgG in all three cases could be due to precipitation of denatured high molecular weight components. In addition, guanidine-HCl could be interfering with desorption of the ionized high molecular weight components.
Figure 6. MS of hAB-1, hAB-2 and pMAB boiled in 1.0 M guanidine-HCl. Samples (1 mg/ml) in 50 mM Tris, 1.0 M guanidine-HCl, pH 7.5 buffer were boiled for 5 minutes, diluted with Tris/OG buffer and analyzed by MALDI-MS.

Table 1 lists the molecular mass values and possible species from the mass spectra of hAB-1 and pMAB boiled in guanidine-HCl. Observed species and probable identities include: the monomer (24 kDa), dimer (47.8 kDa), and trimer (71.9 kDa) of light chain; the monomer (50 kDa) and dimer (100.5 kDa) of heavy chain; and combinations of one heavy and one light chain (74.3 kDa), one heavy and two light chains (98.1 kDa), and two heavy chains and one light chain (124.5 kDa). The 74.3 kDa signal could be due to the (M+2H)²⁺ species of IgG or the (M+H)⁺ species of a light chain-heavy chain heterodimer. Also present were species with mass values of 172.5 kDa and 197 kDa, corresponding to an IgG complexed with an additional light chain or an additional heavy chain. Finally, we saw signals for several (M+2H)²⁺ species.
Table 1. Identity of Autoreduction Products of hAB-1 and pMAB

Molecular Mass (Da)
hAB-1      pMAB       Possible Species    Charge Status
197007     -          IgG + HC            (M+H)⁺
172513     172513     IgG + LC            (M+H)⁺
148347     148018     IgG                 (M+H)⁺
124489     124209     2HC + LC            (M+H)⁺
100475     100279     HC + HC             2(M+H)⁺
98097      98186      HC + 2LC            (M+H)⁺
74308      74118      HC + LC or IgG      (M+H)⁺ or (M+2H)²⁺
71885      71929      3LC                 3(M+H)⁺
50404      50236      HC                  (M+H)⁺
47848      48010      2LC                 2(M+H)⁺
37210      37158      IgG                 (M+4H)⁴⁺
25254      25149      HC                  (M+2H)²⁺
24031      24027      LC                  (M+H)⁺
12039      12022      LC                  (M+2H)²⁺
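The assignments in Table 1 can be checked against simple sums of the single-chain masses. The sketch below is an illustration only, not the authors' procedure: it takes the hAB-1 heavy chain (50404) and light chain (24031) values from the table, enumerates combinations of up to three chains of each type, and reports those within an assumed 1% tolerance of an observed mass. The tolerance, the chain limit and the function name are hypothetical choices.

```python
from itertools import product

# Single-chain masses for hAB-1 taken from Table 1 (Da).
HC = 50404.0
LC = 24031.0


def assign(observed: float, tol: float = 0.01, max_chains: int = 3):
    """Return (n_HC, n_LC, expected_mass) combinations within a
    relative tolerance of the observed singly charged mass."""
    matches = []
    for n_hc, n_lc in product(range(max_chains + 1), repeat=2):
        if n_hc + n_lc == 0:
            continue
        expected = n_hc * HC + n_lc * LC
        if abs(observed - expected) / expected <= tol:
            matches.append((n_hc, n_lc, expected))
    return matches


for mass in (148347, 124489, 98097, 71885):
    for n_hc, n_lc, expected in assign(mass):
        print(f"{mass}: {n_hc} HC + {n_lc} LC (expected {expected:.0f})")
```

With these inputs each of the four example masses matches a single combination, in agreement with the table (IgG, 2HC + LC, HC + 2LC and 3LC, respectively). A tolerance this loose would not distinguish species that differ by only a few hundred daltons, so the sketch is a consistency check rather than an identification method.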
F. Identification of the Cysteine Residue Responsible for the Autoreduction of hAB-1

The heavy chain CDR-1 cysteine of hAB-1 was converted to a serine residue by in vitro mutagenesis. Non-reducing SDS-PAGE of the CS-hAB-1 antibody showed that the mutated protein was resistant to autoreduction (Figure 7), confirming the role played by this cysteine in the autoreduction during thermal denaturation.
Figure 7. Establishment of CDR-1 cysteine as the residue responsible for hAB-1 autoreduction. Ten micrograms of sample were boiled with or without reducing agent for ten minutes prior to electrophoresis. Samples: lane 1, pMAB; lane 2, hAB-1; lane 3, NEM-treated hAB-1; lane 4, cysteine to serine mutant of hAB-1; lane 5, hAB-2.

IV. CONCLUSIONS

The humanized antibody, hAB-1, was heterogeneous by non-reducing SDS-PAGE and MALDI-MS under denaturing conditions. NEM and iodoacetamide treatments established the role of a cysteine residue in generating the heterogeneity. Reducing SDS-PAGE, HPLC gel filtration and MALDI-MS of native hAB-1 have shown that the native protein is intact and is similar to pMAB and hAB-2. Our results have shown that hAB-1, when denatured, is subject to autocatalytic reduction by a cysteine residue. This leads to the formation of low molecular weight species. The role played by the heavy chain CDR-1 cysteine of hAB-1 in this process was confirmed by in vitro mutagenesis of the cysteine to a serine residue. Our results have shown that a humanized antibody with a cysteine residue in its CDR is more susceptible to autoreduction compared to the
parent murine antibody. pMAB is not susceptible to autoreduction in non-reducing SDS-PAGE buffer, while thermal denaturation of pMAB in guanidine-HCl leads to autoreduction. hAB-1 is subject to autoreduction under both conditions. hAB-2 is not subject to reduction or breakdown under either condition. Thus, the reactivity of the pMAB CDR-1 cysteine towards disulfide bonds depends on the denaturant employed to perturb the structure, while the reactivity of the hAB-1 CDR-1 cysteine is independent of the denaturant employed.
ACKNOWLEDGMENTS The authors would like to thank Dr. Leland Paul for critical reading of the manuscript and Eileen Jarvis for typing the manuscript.
REFERENCES

1. Heitz, J. R., Anderson, C. D., and Anderson, B. M. (1968). Arch. Biochem. Biophys. 127, 627-636.
2. Smyth, D. G., Blumenfeld, O. O., and Konigsberg, W. (1964). Biochem. J. 91, 589-595.
3. Gorin, G., Martic, P. A., and Doughty, G. (1966). Arch. Biochem. Biophys. 115, 593-597.
4. Partis, M. D., Griffiths, D. G., Roberts, G. C., and Beechey, R. B. (1983). J. Protein Chem. 2, 263-277.
5. Laemmli, U. K. (1970). Nature (London) 227, 680-685.
6. Zaluzec, E. J., Gage, D. A., and Watson, J. T. (1995). Prot. Expr. Purif. 6, 109-123.
7. Andersen, J. S., Svensson, B., and Roepstorff, P. (1996). Nature Biotech. 14, 449-457.
SECTION V Interactions of Protein with Ligands
Oxygen and Ascorbate Mediated Modification of a Recombinant Hemoglobin
Bruce A. Kerwin, Edward Hess, Julie Lippincott, Ray Kaiser¹ and Izydor Apostol
Somatogen Inc., Boulder, Colorado 80301
¹Eli Lilly and Co., Indianapolis, Indiana 46285
I. Introduction

Recombinant hemoglobin, rHb1.1, is a trimeric protein composed of two β-globins, two genetically fused α-globins (di-α globin) and four hemes (see ribbon structure). In the reduced state the iron center of each heme group reversibly binds molecular oxygen. Upon binding oxygen the hemoglobin can autooxidize, forming metHb, which is incapable of binding oxygen: HbFe²⁺-O₂ → HbFe³⁺ + O₂⁻. The metHb can be reduced back to the ferrous state using ascorbate and reduced oxygen conditions (Vestling, 1941 and Gibson, 1943) as described by the mechanism Ascorbate²⁻ + 2HbFe³⁺ → 2HbFe²⁺ + dehydroascorbate (Al-Ayash and Wilson, 1979). In the presence of oxygen, however, ascorbate reacts with molecular oxygen to form dehydroascorbate and superoxide anion. The dehydroascorbate can undergo hydrolytic ring rupture to form 2,3-diketogulonic acid, which in turn can react further with oxygen, forming additional byproducts which may modify proteins (Washko et al., 1992). One of the more prevalent protein modifications detected has been carboxymethylation of lysine groups to form Nε-(carboxymethyl)lysine (Dunn et al., 1990 and Ortwerth et al., 1992). Although it is known that superoxide reacts with and oxidizes hemoglobin, little is known concerning the modification of hemoglobin by dehydroascorbate and its byproducts. Studies were undertaken to determine how rHb1.1 is modified in the presence of dehydroascorbate as well as in the presence of ascorbate and oxygen.
II. Materials and Methods

A. Recombinant Hemoglobin

Recombinant hemoglobin (rHb1.1) was produced at Somatogen as described by Looker et al., 1992.

B. Ascorbic Acid and Dehydroascorbate

Ascorbic acid (sodium salt) and dehydroascorbate were both obtained from Sigma Chemical Co.
C. Reverse Phase HPLC

Reverse phase HPLC analysis was performed using a Zorbax C3 analytical column. The oven temperature was maintained at 40°C. Solvent A = H2O/0.1% TFA and Solvent B = 100% acetonitrile/0.1% TFA. Flow rate = 1 ml/min. The column was equilibrated in 35% solvent B. Following sample injection the column was maintained for 5 minutes at 35% solvent B, then ramped to 49% solvent B over 45 minutes. Samples were prepared by precipitation with ice-cold acid/acetone (Witkowska et al., 1993) and solubilization of the pellet in 0.1% TFA/20% acetonitrile.
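The gradient described above can be written as a simple function of time after injection. The sketch below is a reading of the stated program, assuming that the ramp is linear and begins immediately after the 5 minute hold; holding at 49% B after the ramp ends is an additional assumption, since the text does not describe what follows. The function and parameter names are illustrative.

```python
def percent_b(t_min: float,
              hold_min: float = 5.0,
              ramp_min: float = 45.0,
              start_b: float = 35.0,
              end_b: float = 49.0) -> float:
    """Solvent B (%) at t_min minutes after injection.

    Assumes an isocratic hold followed by a linear ramp; the value is
    clamped at end_b once the ramp is complete (an assumption).
    """
    if t_min <= hold_min:
        return start_b
    if t_min >= hold_min + ramp_min:
        return end_b
    return start_b + (end_b - start_b) * (t_min - hold_min) / ramp_min


for t in (0, 5, 27.5, 50):
    print(f"t = {t:>4} min: {percent_b(t):.1f}% B")
```

Under these assumptions the slope is 14% B over 45 minutes, roughly 0.31% B per minute, which is a shallow gradient of the kind typically used to resolve closely related globin chains.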
D. Trypsin Mapping

Tryptic mapping was performed on an HP 1090 HPLC modified with two independent switching valves for controlling flow to a Poroszyme immobilized trypsin cartridge (PerSeptive Biosystems) and an in-line Zorbax analytical C18 column (Lippincott et al., manuscript in preparation).

E. Mass Spectrometry

Mass spectrometry was performed using a Finnigan MAT LCQ.

F. Amino Acid Sequence Analysis

Sequencing was performed on the isolated β-globin N-terminal tryptic peptide and the difference tryptic peptide using a Porton 2020 sequencer.
III. Results and Discussion

Studies were undertaken to determine how rHb1.1 was modified by the oxygen-mediated degradation of ascorbate. In the presence of ascorbate and 150 ppm of oxygen the reverse phase HPLC profile of the hemoglobin was not significantly different from that of the control oxy-rHb without ascorbate (see Fig. 1). In contrast, when the hemoglobin was incubated with ascorbate and 15,000 ppm of oxygen, a decrease in the height of the β-globin peak was seen along with new peaks appearing on the lagging shoulder of the β-globin, indicating a modification to the β-globin chain. Ascorbate is known to degrade in the presence of oxygen, forming dehydroascorbate and superoxide anion, which oxidizes the hemoglobin. In the sample containing only hemoglobin and 15,000 ppm of oxygen the met-rHb levels rose from 19% to approximately 40%, while the met-rHb levels in the sample containing 15,000 ppm oxygen and 5 mM ascorbate decreased overall to a final concentration of 0.5% (data not shown). This suggests that the superoxide anion produced during reaction of the ascorbate with oxygen was not by itself responsible for the modification; rather, the modification was due to a degradation product of the ascorbate. The initial product of ascorbate oxidation is dehydroascorbate. We incubated dehydroascorbate with the hemoglobin to determine whether it could modify the protein and produce reverse phase HPLC profiles similar to those seen with ascorbate and oxygen (see Fig. 1). Addition of dehydroascorbate to deoxy-rHb (middle panel Fig. 2) gave a reverse phase HPLC profile similar to that seen for ascorbate and oxygen (see Fig. 1). The height of the β-globin peak in the deoxy-rHb/ascorbate sample again decreased, with new peaks appearing on the lagging shoulder of the β-globin. It appears that both the deoxy-rHb and oxy-rHb in the presence of dehydroascorbate are modified to the same degree (Fig. 2). However, we cannot exclude the possibility that residual oxygen leaking into the deoxy-rHb/ascorbate system through the septum was responsible for the modification. It is also possible that dehydroascorbate decomposition byproducts were already present prior to addition and caused the modification. The location of the modification was determined using trypsin mapping. Following dehydroascorbate modification of deoxy-rHb, the main β-globin peak and the lagging shoulder of the β-globin were purified by reverse phase HPLC (see Fig. 2) and mapped with trypsin (see Fig. 3). The map of the unmodified β-globin is shown in the upper panel and is not different from that of β-globin not exposed to ascorbate (data not shown). The map of the modified β-globin (lower panel Fig. 3) demonstrated a marked decrease in the N-terminal peptide (MHLTPEEK) of the β-globin, eluting with a retention time of 51 min, and the appearance of a difference peptide eluting with a retention time of 53.5 min. This
[Figure 1 chromatograms (detector signal DAD1 A, 215 nm, in mAU versus time in min). Panels, top to bottom: 0 mM ascorbate, 15,000 ppm oxygen; 5 mM ascorbate, 150 ppm oxygen; 5 mM ascorbate, 15,000 ppm oxygen. Labeled peaks: β and di-α.]
Figure 1: Reverse phase HPLC analysis of hemoglobin modified by ascorbate and oxygen. Recombinant hemoglobin (50 mg/ml) was equilibrated to the indicated oxygen tension, followed by addition of ascorbate to a final concentration of 5 mM or buffer (150 mM NaCl, 5 mM NaPi, pH 7.4) as a control. Aliquots (0.5 ml) were stored in stoppered 2 ml vials for 15-16 days at 4°C, then analyzed by reverse phase HPLC as described in materials and methods. The upper panel represents a typical chromatographic profile of rHb1.1. The peak at ~34.5 min represents the β-globin subunit and the peak at ~40 min represents the di-α-globin subunit. At 150 ppm of oxygen no apparent modification of the protein primary structure was observed. However, at 15,000 ppm of oxygen a distinct modification of the β-globin was observed.
suggests that the dehydroascorbate modification of the deoxy-rHb is on the N-terminal tryptic peptide of the β-globin.

[Figure 2 chromatograms (detector signal DAD1 A, 215 nm, in mAU versus time in min). Panels, top to bottom: rHb1.1 control; deoxy-rHb + DHA; oxy-rHb + DHA. Labeled peaks: β and di-α.]
Figure 2: Reverse phase HPLC analysis of dehydroascorbate modified hemoglobin. Recombinant hemoglobin (50 mg/ml in 150 mM NaCl, 5 mM NaPi, pH 7.4) was deoxygenated under a stream of humidified nitrogen, divided into two aliquots (3 ml each) in glass flasks stoppered with white rubber septa and stored at 4°C. Dehydroascorbate was prepared by evacuating and flushing a flask containing the dehydroascorbate solid, followed by addition of deoxygenated water. Dehydroascorbate was added to each sample at an 8:1 molar ratio of dehydroascorbate:rHb1.1. For the oxygenated sample one of the aliquots was reoxygenated by flushing its flask with oxygen prior to addition of the dehydroascorbate solution. Both samples were allowed to react overnight at 4°C prior to analysis. Samples were analyzed by reverse phase HPLC as described in materials and methods. The data indicate that modification by dehydroascorbate gives a reverse phase HPLC profile similar to that seen for ascorbate and oxygen modification, and that similar profiles are observed when modification is performed under both oxy and deoxy conditions.
Figure 3: Tryptic mapping of the dehydroascorbate modified β-globin subunit. Dehydroascorbate modified deoxy recombinant hemoglobin was prepared as described in Fig. 2. The β-globin peak from 26-28 min and the lagging shoulder of the β-globin peak from 28-30 min were collected and mapped as described in materials and methods. The upper panel represents the tryptic map of the β-globin peak (26-28 min) and the lower panel represents the tryptic map of the lagging shoulder of the β-globin (28-30 min). The arrow in the upper panel indicates the position of the β-globin N-terminal peptide and the arrow in the lower panel indicates the position of the difference peptide present in the modified β-globin shoulder.
Modifications of other proteins by ascorbate have been reported on the Nε-amino group of lysine to produce Nε-(carboxymethyl)lysine, which has a mass of 58 amu (Dunn et al., 1990; Ortwerth et al., 1992). The mass of the difference peptide was determined using LC-MS (see Fig. 4 and Table I). The lower panel shows the spectrum of the modified peptide with a mass of 1055.2 amu and the upper panel shows the spectrum of the β-globin N-terminal peptide with a mass of 983.5 amu, giving a difference of 71.7 amu.
[Figure 4 mass spectra, two panels: β-globin N-terminal peptide (upper); Difference peptide (lower). Vertical axis: relative intensity, 0 to 100.]
Figure 4: Mass spectrometry analysis of difference peptides from the dehydroascorbate modified β-globin subunit. Dehydroascorbate modified deoxy recombinant hemoglobin was prepared as described in Fig. 2. Trypsin digestion was performed as described in materials and methods, and the β-globin N-terminal peptide (see upper panel, Fig. 3) and the difference peptide (see lower panel, Fig. 3) were analyzed by LC-MS. The spectrum in the upper panel represents the β-globin N-terminal peptide and the spectrum in the lower panel represents the difference peptide. A summary of the data is presented in Table I.
This mass difference is different from that expected for dehydroascorbate, which has a mass of 156 amu, and from carboxymethylation, which produces a mass increase of 58 amu. Edman protein sequencing analysis of the difference peptide demonstrated a blocked N-terminus, suggesting that the N-terminal methionine was modified. Experiments are currently in progress to determine the exact structure of the modification.
Table I: Summary of mass spectrometry and sequencing data from the unmodified and dehydroascorbate modified β-globin peptides

Sample                        Sequence                      Expected monoisotopic mass (amu)   Observed mass (amu)
β-globin N-terminal peptide   MHLTPEEK                      983.5                              983.5
Difference peptide            Blocked to Edman sequencing   -                                  1055.2
                                                                                               Δ = 71.7
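The expected mass and the mass difference in Table I can be checked with a short calculation. The sketch below is only an illustration: it uses standard monoisotopic residue masses, and the function and variable names are hypothetical rather than taken from the original work.

```python
# Standard monoisotopic residue masses (Da) for the amino acids in MHLTPEEK.
RESIDUE_MASS = {
    "M": 131.04049, "H": 137.05891, "L": 113.08406, "T": 101.04768,
    "P": 97.05276, "E": 129.04259, "K": 128.09496,
}
WATER = 18.01056  # one water is added for the free N- and C-termini


def monoisotopic_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


expected = monoisotopic_mass("MHLTPEEK")

# Observed masses reported in Table I (amu).
observed_unmodified = 983.5
observed_modified = 1055.2
delta = observed_modified - observed_unmodified

print(f"expected mass of MHLTPEEK: {expected:.1f}")
print(f"observed mass difference:  {delta:.1f}")
print(f"difference minus carboxymethylation (58): {delta - 58:.1f}")
```

The calculated value agrees with the expected mass listed in the table, and the observed difference of 71.7 amu is clearly larger than the 58 amu expected for carboxymethylation.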
References
Al-Ayash, A.I. and Wilson, M.T. (1979) Biochem. J. 177:641.
Dunn, J.A., Ahmed, M.U., Murtiashaw, M.H., Richardson, J.M., Walla, M.D., Thorpe, S.R. and Baynes, J.W. (1990) Biochemistry 29:10964.
Gibson, Q.H. (1943) Biochem. J. 37:615.
Hoffman, S.J., Looker, D.L., Roehrich, J.M., Cozart, P.E., Durfee, S.L., Tedesco, J.L. and Stetler, G.L. (1990) Proc. Natl. Acad. Sci. USA 87:8521.
Looker, D., Abbott-Brown, D., Cozart, P., Durfee, S., Hoffman, S., Mathews, A.J., Miller-Roehrich, J., Shoemaker, S., Trimble, S., Fermi, G., Komiyama, N.H., Nagai, K. and Stetler, G. (1992) Nature 356:258.
Ortwerth, B.J., Slight, S.H., Prabhakaram, M., Sun, Y. and Smith, J.B. (1992) Biochim. Biophys. Acta 1117:207.
Vestling, C.S. (1941) J. Biol. Chem. 143:439.
Washko, P.W., Welch, R.W., Dhariwal, K.R., Wang, Y. and Levine, M. (1992) Anal. Biochem. 204:1.
Witkowska, H.E., Bitsch, F. and Shackleton, C.H.L. (1993) Hemoglobin 17:227.
Metal activation and regulation of E.coli RNase H
James L. Keck and Susan Marqusee
Dept. of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720
Introduction: The ribonuclease H (RNase H) family of enzymes are ubiquitous nucleases that catalyze the hydrolysis of RNA in RNA•DNA hybrids (for review, see 1). In contrast to the well-studied ribonucleases A and T1, RNase H does not employ the 2'-OH in RNA as a nucleophile but instead activates water as the nucleophile for hydrolysis in a metal-dependent reaction. The number and role(s) of divalent metal ions in the RNase H reaction mechanism are still unclear. Two RNase H mechanisms have been proposed based on well-characterized metal-dependent DNase activities: a one-metal mechanism (2,3,18), modeled after DNase I (4), and a two-metal mechanism (5,19), modeled after the exonuclease domain from Klenow fragment (6-8). The one-metal mechanism is supported by observation of a single Mg2+ binding to E.coli RNase HI via X-ray crystallography, NMR and isothermal titration calorimetry (9-11,18). Also, mutagenesis of conserved residues in RNase H shows that only three of these ten residues result in a complete loss of activity when mutated to alanine (12). Of these three acidic residues, two (Asp10 and Glu48) are found by X-ray crystallography to ligand a single Mg2+ and the third (Asp70) is proposed to abstract a proton from the attacking nucleophilic water (3,9). This divalent metal is proposed to stabilize the reaction's pentacovalent phosphorane transition state intermediate. In contrast, the two-metal mechanism is supported by observation of two Mn2+ ions bound in the active site in the crystal structure of the HIV-1 RNase H domain (a domain of reverse transcriptase) (5). The Mn2+ ions are ~4 Å apart (as is seen in the Klenow fragment exonuclease domain (6,8)) and are bridged by a uranium heavy atom. It is thought that the uranium acts as an artificial bridging ligand that would normally be satisfied by substrate. In this two-metal mechanism, one metal activates the hydroxyl nucleophile and the second metal stabilizes the phosphorane intermediate of the reaction. Both mechanistic hypotheses are limited by the lack of structural information on RNase H bound to its nucleic acid substrate.

An understanding of the RNase H mechanism is important for a number of reasons. First, the RNase H activity is essential to the life cycle of HIV (for review, see 13). Mutations that merely reduce the reverse transcriptase RNase H activity are sufficient to completely inhibit virulence in the mutant in vivo (14). Its absolute requirement makes the RNase H activity a logical drug target for anti-HIV therapies. Development of knowledge-based inhibitors will require an understanding of the activity's mechanism. Second, the structures of a number of proteins homologous to RNase H have been solved in the past two years, all of which are metal-requiring nucleic acid manipulating proteins (reviewed in 15-17). This superfamily of proteins, now termed "polynucleotide transferases", includes RNase H (5,18,19), resolvase (20), integrase (21,22) and Mu transposase (23). It is assumed that the enzymes share a common mechanism, so an understanding of the RNase H mechanism will assist in clarifying the mechanism of other members of this superfamily. Clearly, a determination of the number and role(s) of metal ions in the E.coli RNase H active site will help establish the enzyme's mechanism.

We have examined the metal dependence (Mn2+ and Mg2+) of E.coli RNase H activity. Mn2+-dependent activity requires much less metal for activation than does Mg2+-dependent activity and is inhibited upon the further addition of Mn2+. Using electron paramagnetic resonance (EPR), we have measured two distinct Mn2+ binding constants, consistent with the concentration requirements for activation and inhibition in vitro. Our data are most consistent with a single divalent metal-catalyzed reaction which can be attenuated (inhibited) upon binding a second metal. We discuss a possible mechanism for metal activation and inhibition of RNase H in light of previous mutagenesis and structural studies.

TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press All rights of reproduction in any form reserved.
Materials and Methods
Materials: pJK502 is a T7-overexpression vector that encodes wild-type E.coli RNase HI. It was made by site-directed mutagenesis of pSM101 (24), reverting three alanines (residues 13, 63 and 133) to cysteines, and then subcloning the resulting gene into a pET11a overexpression vector (details of plasmid sequence available upon request). Overexpression and purification of E.coli RNase HI were performed essentially as described in (24) for RNase H*. RNase H* is a cysteine-free version of E.coli RNase HI and was a gift from Dung Vu.

Methods: RNase H activity assay. Production of RNA•DNA hybrid and RNase H activity assays were performed essentially as described in (25). MnCl2 stocks were made by serial dilution of a 1 M MnCl2 stock in 1% nitric acid. Final MnCl2 stocks were in 0.1% nitric acid. The pH of the final assay solutions was unchanged by the addition of the nitric acid, as confirmed by measuring the pH of mock reactions containing buffer (Tris) and the appropriate volume of 0.1% nitric acid. Velocity values are specific activity (units/mg enzyme) measurements based on four time points in the linear range of enzymatic activity. Assays were performed in 50 mM Tris, pH 8.0 / 50 mM NaCl / 1 mM DTT / 1.5 µM BSA / 1 µM (basepairs) RNA•DNA hybrid with 0.2 to 1.2 nM E.coli RNase HI at 37 °C. One unit is defined as the amount of enzyme needed to generate 1 µmol of acid-soluble product in 15 minutes under our reaction conditions.

Electron Paramagnetic Resonance (EPR) binding experiments: All EPR measurements were carried out on a Bruker ESP300E X-band spectrometer at ambient temperature. Lyophilized RNase H* was resuspended in 50 mM Hepes, pH 7.5, and diluted 1:1 with MnCl2 stocks made up in water; final concentrations were 25 mM Hepes, pH 7.5, 50 µM RNase H* and MnCl2 from 15 to 300 µM. Free Mn2+ is the only component that gives an EPR spectrum, allowing calculation of [Mn2+]bound ([Mn2+]total = [Mn2+]free + [Mn2+]bound). A Mn2+ peak (centered at 3629.6 G with a ±60 G sweep) was measured and then compared (height and peak shape) with Mn2+ standards in the same buffer conditions to determine the scalar difference. Using these scalar factors, [Mn2+]free was calculated at various total Mn2+ concentrations. Dissociation constants were determined via Scatchard analysis of the data (26).
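The Scatchard step described above can be sketched in a few lines. This is a minimal illustration and not the authors' procedure: it assumes a single class of binding site, uses synthetic free-Mn2+ values generated from an assumed Kd of 15 µM rather than measured EPR data, and all function names are hypothetical. For one class of site, a plot of ν/[Mn2+]free against ν is a straight line with slope -1/Kd and x-intercept equal to the number of sites.

```python
import math


def scatchard_points(total_mn, free_mn, protein):
    """Return (nu, nu/free) pairs, where bound = total - free and nu = bound/protein."""
    points = []
    for total, free in zip(total_mn, free_mn):
        nu = (total - free) / protein
        points.append((nu, nu / free))
    return points


def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def single_site_free(total, protein, kd):
    """Free ligand for one binding site per protein (exact quadratic solution)."""
    b = protein + total + kd
    bound = (b - math.sqrt(b * b - 4.0 * protein * total)) / 2.0
    return total - bound


# Synthetic example (all concentrations in µM); 50 µM protein as in the EPR samples.
protein = 50.0
totals = [15.0, 50.0, 100.0, 200.0, 300.0]
frees = [single_site_free(t, protein, 15.0) for t in totals]  # stand-in for EPR data

slope, intercept = fit_line(scatchard_points(totals, frees, protein))
kd = -1.0 / slope              # dissociation constant from the slope
n_sites = -intercept / slope   # number of sites from the x-intercept
print(f"Kd = {kd:.2f} µM, sites = {n_sites:.2f}")
```

With two sites of different affinity the Scatchard plot is curved, and each limb would need to be fitted separately to recover two Kd values.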
Results
Mn2+-dependence of E.coli RNase HI: The Mn2+-dependence of E.coli RNase HI catalysis was determined using a soluble assay that monitors acid-solubility of radiolabeled RNA in an RNA•DNA hybrid (25). This Mn2+-dependence shows activation at low concentrations of Mn2+, followed by inhibition at higher Mn2+ concentrations (Figure 1). The optimum activity is achieved in 5 µM MnCl2, and is 30% of the maximum activity in MgCl2 (data not shown). At 1 mM MnCl2, where inhibition is maximal, activity is reduced ~20-fold relative to activity at 5 µM MnCl2. Activation and inhibition of RNase H* were indistinguishable from those of RNase HI (data not shown).
Figure 1. Mn2+-dependence of E.coli RNase HI activity. Assays were performed as described in Materials and Methods. Data points are the average of two assays with standard deviations shown as error bars.
[Plot: x-axis [MnCl2], µM (ticks at 10, 100, 1000).]
Mn2+-binding by E.coli RNase H*: EPR spectrometry was used to determine the stoichiometry and affinity of Mn2+ binding to E.coli RNase H*. Scatchard analysis of the binding data shows that E.coli RNase H* (a cysteine-free version of E.coli RNase HI) has multiple Mn2+ binding sites. Two binding sites were determined, with dissociation constants (Kd) of ~15 µM and ~60 µM (Figure 2).

Figure 2. Equilibrium Mn2+-binding to E.coli RNase H*. ν represents the fraction of bound Mn2+ per total RNase H* as described in Materials and Methods. Dashed lines indicate the two best-fit lines representing the data points. Kd measurements are the inverse of the fits' abscissa. Measured dissociation constants: Kd1 ≈ 15 µM, Kd2 = 60 µM.
[Scatchard plot: x-axis ν (ticks 0 to 1.6); y-axis ticks 0 to 0.05.]
Figure 3. Comparison of Mn2+-activation to simulated activation curves.
[Plot: x-axis [MnCl2], µM (ticks at 1, 10, 100, 1000); y-axis ticks 0.1 to 0.5. Curve labels: "both metals required for activation"; "first metal activates, second metal inhibits".]
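The two simulated curves in Figure 3 can be approximated with a simple population model. The sketch below is an illustration under stated assumptions, not the authors' calculation: it assumes ordered (sequential) binding with Kd1 = 15 µM and Kd2 = 60 µM, treats free Mn2+ as equal to total Mn2+ (reasonable when enzyme is at nanomolar concentration), and equates relative activity with the population of the active species. The function names are hypothetical.

```python
def populations(mn, kd1=15.0, kd2=60.0):
    """Fractions of enzyme with 0, 1 or 2 Mn2+ bound at free [Mn2+] = mn (µM).

    Assumes ordered binding: the tight site (kd1) fills before the weak site (kd2).
    """
    w1 = mn / kd1                   # statistical weight of the singly bound state
    w2 = mn * mn / (kd1 * kd2)      # statistical weight of the doubly bound state
    z = 1.0 + w1 + w2               # sum over all states
    return 1.0 / z, w1 / z, w2 / z


def activity_both_required(mn):
    """Model 1: only the doubly bound enzyme is active."""
    return populations(mn)[2]


def activity_first_activates(mn):
    """Model 2: the singly bound enzyme is active; the second metal inhibits."""
    return populations(mn)[1]


for mn in (1.0, 5.0, 10.0, 30.0, 100.0, 1000.0):
    print(f"{mn:7.1f} µM  both-required={activity_both_required(mn):.3f}  "
          f"activate-then-inhibit={activity_first_activates(mn):.3f}")
```

In this sketch the first model rises monotonically with Mn2+, whereas the second rises and then falls, with its maximum at the geometric mean of the two dissociation constants (30 µM). Only the second shape resembles the activation and inhibition seen experimentally.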
Comparison of metal binding to dependence of the RNase H catalysis: Using the determined Mn2+-binding constants for E.coli RNase H*, the relative populations of enzyme with either 0, 1 or 2 Mn2+ ions bound as a function of [Mn2+]total were determined. Figure 3 shows two simulations of the Mn2+ dependence of E.coli RNase H activity: one for the case where both metals are required for activation and one for the case where one metal is activating and one inhibiting. A comparison with Mn2+ activation of E.coli RNase HI (from Figure 1) shows the similarity between the data and the latter model. The overall shape similarity of the two plots is striking. Binding of the highest affinity metal correlates well with in vitro activation, and binding of the lower affinity metal correlates approximately with inhibition (Figure 3). Relative RNase H activity is scaled to represent the observation that maximum Mn2+-dependent activity is ~0.3 that of maximum Mg2+-dependent activity (27). Differences between the real and theoretical activation data may imply differences in metal binding in the presence of substrate.

Discussion
It has been known for over 20 years that either Mn2+ or Mg2+ can activate E.coli RNase HI. However, differences between metal requirements in Mn2+ and Mg2+ are only now beginning to be understood. Here, we have shown that the Mn2+ requirement for E.coli RNase HI activity is in the low micromolar range. This value can be contrasted with the relatively high (~0.1 to 1 mM) Mg2+ concentrations required for activity (10,27). Further, we have demonstrated that the Mn2+-dependent RNase H activity can be inhibited with higher Mn2+ concentrations (> 5 µM). Mn2+ inhibition at higher metal concentrations could be due to a number of factors, including: (1) metal-induced conformational changes in the RNA•DNA hybrid substrate or (2) metal binding to the enzyme that reduces its activity. Similar inhibition has been documented for the E.coli RNase H Mg2+-dependent reaction, with the inhibition attributed to substrate-metal association (28). This interpretation was based on the fact that E.coli RNase H binds Mg2+ with a 1:1 stoichiometry (10) and that the binding constant for Mg2+ to nucleic acid is similar to the inhibition constant (29,30). It is possible, however, that binding studies would reveal a second Mg2+ binding site on the enzyme only in the presence of substrate. With Mn2+, we can correlate metal binding to both activation and inhibition. We therefore support the idea that Mn2+ inhibits as a result of binding an inhibitory site on the enzyme.

We have determined here that E.coli RNase H can bind multiple (presumably two) Mn2+ ions with Kd values of ~15 µM and ~60 µM. Upon comparison of metal-binding with our activation/inhibition data, the simplest model is that the tightest binding metal activates the enzyme while the second metal inhibits the activity (Figure 3). If both metals were required for activity, presumably there would be no inhibition at higher metal concentrations. Binding of the first metal in the absence of substrate correlates well with activation, but binding of the second metal appears weaker without substrate (i.e. the apparent inhibitory Kd is less than the measured Kd of the second metal binding). This discrepancy may indicate that substrate is involved in complete formation of the second metal binding site.

Mechanism of E.coli RNase H: The metal-dependence of the RNase H reaction mechanism is not well understood.
Currently there are two primary mechanisms that have been proposed: a one-metal mechanism and a two-metal activation mechanism. In light of the information presented in this paper, and in the context of information that has been gathered on the RNase H family of enzymes, we hypothesize that the RNase H mechanism is a single divalent metal-catalyzed reaction that can be attenuated by a second metal binding event. This hypothesis encompasses all of the seemingly contradictory information that has been presented to defend both the one- and two-metal mechanisms. Figure 4 diagrams the basis of the proposed mechanism. In the absence of metal, E.coli RNase H is completely inactive. Upon addition of metal at activating concentrations (< 5 µM Mn2+), the tight metal binding site is filled and the enzyme is optimally active. We assume that the tightest binding metal binds in the single Mg2+ site observed crystallographically (9). Upon increasing the metal concentration (> 5 µM Mn2+), the second metal binding site (assumed from the co-crystal structure of the HIV RNase H domain with Mn2+ (5)) becomes occupied and the enzyme is inhibited.

What is the mechanism of metal-inhibition? Our current hypothesis is based on E.coli RNase H active-site mutagenesis results coupled with the observation of two Mn2+-binding sites in the HIV RNase H domain structure. In the one-metal mechanism, Asp 70 abstracts a proton from the attacking nucleophilic water and then needs to deprotonate to reset the enzyme for the next hydrolysis (3). This deprotonation is believed to occur by shuffling the proton to His 124, since solvent is not accessible to Asp 70 and His 124 is nearby (within 4 Å). Mutagenesis of His 124 to Ala results in a 100-fold reduction of kcat (3), presumably since Asp 70 must deprotonate through a less efficient mechanism. If His 124 is a liganding element for the second metal binding site, its pKa would be expected to shift down upon metal binding, making it more difficult to protonate. The effect of this pKa shift would be to inhibit the proton transfer from Asp 70 to His 124, and thus slow the overall kcat for the reaction. We are currently testing this mechanism through mutagenesis and structural studies of E.coli RNase HI in Mn2+.

Figure 4. Hypothetical metal-binding in the E.coli RNase H active site.
[Diagram labels: Asp10, Glu48, Asp70, Asp134; panel states include "Inactive" and "Inhibited".]

Acknowledgments: We thank Mark Rabenstein and Yeon-kyun Shin for assistance with the EPR measurements. This work was supported by a grant from the N.I.H. (GM53321).
References
1. Hostomsky, Z., Hostomska, Z. and Matthews, D. A. (1993) Ribonucleases H in Nucleases (ed. Linn, S. M., Lloyd, R. S. and Roberts, R. J.) 2nd Ed., pp. 341-76, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
2. Nakamura, H., Oda, Y., Iwai, S., Inoue, H., Ohtsuka, E., Kanaya, S., Kimura, S., Katsuda, C., Katayanagi, K., Morikawa, K., Miyashiro, H. and Ikehara, M. (1991) Proc. Natl. Acad. Sci. U.S.A. 88, 11535-9
3. Oda, Y., Yoshida, M. and Kanaya, S. (1993) J. Biol. Chem. 268, 88-92
4. Suck, D. and Oefner, C. (1986) Nature 321, 620-5
5. Davies, J. F., Hostomska, Z., Hostomsky, S., Jordan, S. and Mathews, D. A. (1991) Science 252, 88-95
6. Beese, L. and Steitz, T. A. (1991) EMBO J. 10, 25-33
7. Derbyshire, V., Grindley, N. D. F. and Joyce, C. M. (1991) EMBO J. 10, 17-24
8. Freemont, P. S., Friedman, J. M., Beese, L. S., Sanderson, M. R. and Steitz, T. A. (1988) Proc. Natl. Acad. Sci. U.S.A. 85, 8924-8
9. Katayanagi, K., Okumura, M. and Morikawa, K. (1993) Proteins 17, 337-46
10. Huang, H. W. and Cowan, J. A. (1994) Eur. J. Biochem. 219, 253-60
11. Oda, Y., Nakamura, H., Kanaya, S. and Ikehara, M. (1991) J. Biomol. NMR 1, 247-55
12. Kanaya, S., Kohara, Y., Miura, Y., Sekiguchi, A., Iwai, S., Inoue, H., Otsuka, E. and Ikehara, M. (1990) J. Biol. Chem. 265, 4615-21
13. Skalka, A.-M. and Goff, S. P. (eds) (1993) Reverse Transcriptase, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
14. Tisdale, M., Schulze, T., Larder, B. A. and Moelling, K. (1991) J. Gen. Virol. 72, 59-66
15. Yang, W. and Steitz, T. A. (1995) Structure 3, 131-4
16. Venclovas, C. and Siksnys, V. (1995) Nature Struct. Biol. 2, 838-41
17. Rice, P., Cragie, R. and Davies, D. R. (1996) Curr. Op. Struct. Biol. 6, 76-83
18. Katayanagi, K., Miyagawa, M., Matsushima, M., Ishikawa, M., Kanaya, S., Ikehara, M., Matsuzaki, T. and Morikawa, K. (1990) Nature 347, 306-9
19. Yang, W., Hendrickson, W. A., Crouch, R. J. and Satow, Y. (1990) Science 249, 1398-405
20. Ariyoshi, M., Vassylyev, D. G., Iwasaki, H., Nakamura, H., Shinagawa, H. and Morikawa, K. (1994) Cell 78, 1063-72
21. Dyda, F., Hickman, A. B., Jenkins, T. M., Engelman, A., Craigie, R. and Davies, D. R. (1994) Science 266, 1981-6
22. Bujacz, G., Jaskolski, M., Alexandratos, J. and Wlodawer, A. (1995) J. Mol. Biol. 253, 333-46
23. Rice, P. and Mizuuchi, K. (1995) Cell 82, 209-20
24. Dabora, J. M. and Marqusee, S. (1994) Protein Sci. 3, 1401-8
25. Keck, J. L. and Marqusee, S. (1995) Proc. Natl. Acad. Sci. U.S.A. 92, 2740-4
26. Scatchard, G. (1949) Ann. N. Y. Acad. Sci. 51, 660-72
27. Keck, J. L. and Marqusee, S. (1996) J. Biol. Chem. 271, 19883-7
28. Black, C. B. and Cowan, J. A. (1994) Inorg. Chem. 33, 5805-8
29. Cowan, J. A. (1991) J. Am. Chem. Soc. 113, 6025-32
30. Black, C. B. and Cowan, J. A. (1994) J. Am. Chem. Soc. 116, 1174-8
Crystal structure of avian sarcoma virus integrase with bound essential cations

Jerry Alexandratos(1), Grzegorz Bujacz(1,2), Mariusz Jaskolski(1,3) and Alexander Wlodawer(1,*)
(1) Macromolecular Structure Laboratory, NCI-Frederick Cancer Research and Development Center, ABL-Basic Research Program, Frederick, Maryland
(2) Faculty of Food Chemistry and Biotechnology, Technical University of Lodz, Lodz, Poland
(3) Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznań, Poland

George Merkel, Richard A. Katz and Anna Marie Skalka
Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania
I. Introduction
Retroviral integrase (IN) is a virus-encoded enzyme that catalyzes nonspecific insertion of viral DNA into multiple sites on host DNA (1-3). Since DNA integration is an essential step in the retroviral replication cycle, this enzyme is an attractive target for inhibition of human immunodeficiency virus (HIV), the causative agent of acquired immunodeficiency syndrome (AIDS). Work over the last several years has resulted in a general understanding of the enzymatic mechanism, but more detailed analyses have been hampered by the lack of precise structural information. The situation changed when the crystal structures of the catalytic domains of both HIV-1 IN (4) and avian sarcoma virus (ASV) IN (5,6) became available. Precise data on the interaction of these enzymes and the essential ligands are necessary for understanding the structural basis of the reaction mechanism and for guiding rational drug design.

Members of the structurally related superfamily of enzymes that includes RNase H, RuvC resolvase, MuA transposase, and retroviral integrase contain at least three acidic residues in the active site and require divalent cations, such as Mg2+ or Mn2+, for their enzymatic activity. However, the precise placement of cations is reported in the X-ray crystal structures of only two of these proteins, E. coli RNase H and HIV-1 RNase H. Details of the location of metal ions in the active site of retroviral integrases can enhance our understanding of the catalytic mechanism of these enzymes and their relationship to that of other members of the superfamily. We present the structure of the ASV IN catalytic domain with the essential cations Mg2+ or Mn2+ bound in the active site. In addition, we present the structure of an inactive complex of the catalytic domain of ASV IN with Zn2+.
II. Methods
The expression strategy, purification from E. coli, and activities of the purified ASV IN 52-207 fragment have been described previously (7). ASV IN protein crystals were produced from 20% polyethylene glycol (PEG) solution by the hanging drop vapor diffusion method, also described previously (5). All conditions yielded tetragonal crystals with approximate cell dimensions of a = b = 66 Å, c = 81 Å, space group P4₃2₁2, with one molecule in the asymmetric unit. Crystals were soaked in metal chloride solutions in a synthetic mother liquor for at least 3 days each. Concentrations of 10 mM MnCl2, 100 and 500 mM MgCl2, and 100 mM ZnCl2 produced complete occupancy, whereas the occupancy was only partial in 20 mM MgCl2. X-ray diffraction data were collected at room temperature on a MAR 300 mm image plate detector, using a Rigaku RU200 rotating anode operated at 50 kV and 100 mA (Table I). Data were processed with DENZO and scaled with SCALEPACK (8). Variations in the unit cell parameters for different crystals were less than 1.25 Å, even between metal complexes and low temperature native structures. Electron density maps, calculated with the program PROTEIN (9), were interpreted using FRODO (10). The model underwent multiple cycles of restrained structure-factor least-squares refinement using PROLSQ (11). In addition to protein atoms and water molecules, the final models also include a well-ordered HEPES molecule that cocrystallized with the native protein. At least 60 water molecules were added to each structure during refinement.
Table I. Summary of data collection and refinement

Protein                            Cell dimensions (Å)        Resolution (Å)   R-factor   R-free   Root mean square deviation from Mg structure
Low temperature selenomethionine   a,b = 65.40, c = 80.41     6.0-1.70         0.139      0.208    0.27
Mg (500 mM)                        a,b = 66.05, c = 81.65     8.0-1.75         0.150      0.191    -
Mn (10 mM)                         a,b = 66.24, c = 81.60     8.0-2.05         0.130      0.189    0.14
Zn (100 mM)                        a,b = 66.08, c = 80.96     10.0-1.95        0.176      -        0.17

III. Results and Discussion
The central, catalytic domain of ASV IN is a five-stranded mixed β-sheet flanked by five α-helices. The active site is characterized by the presence of the D,D(35)E motif of three carboxylate-containing amino acids, the last two of which are separated by 35 residues. The two aspartate residues are located on strand β1 and the end of strand β4, with these strands being part of the stable β-sheet core of the protein. There is a long 10-residue loop between strand β5 and helix α4, which has higher B-factors and slightly different conformations in the different structures. This loop extends out of the compact shape of the molecule and appears quite flexible. The third catalytic residue appears near the end of helix α4 at one end of this flexible loop.

The structure of ASV IN complexed with divalent cations (Mn2+, Mg2+, and Zn2+) was solved at a resolution of 1.70 - 2.05 Å. This enzyme is active in the presence of either Mn2+ or Mg2+, with the activity higher in the former than in the latter, and is inactive in the presence of Zn2+. After refinement, the structures of the Mn2+ and Mg2+ complexes were nearly identical in their overall architecture and in the metal binding scheme. A single ion of either metal interacts with the aspartate side chains of the D,D(35)E catalytic center and uses four water molecules to complete its octahedral coordination (Fig. 1a). The metal-ligand distance is within a 2.1 - 2.4 Å range for the Mn2+ ion and 2.05 - 2.15 Å for the Mg2+ ion. Glu-157 does not take part in binding of these cations. Only small adjustments take place in the active site of ASV IN upon binding of a metal cofactor. The Asp-64 carboxylate shifts less than 0.4 Å compared with the uncomplexed enzyme (5). A slight rotation of the Asp-121 carboxylate shifts the oxygen atom about 1 Å. Only the non-metal-binding side chain of Glu-157 in the Mn2+ structure moves a greater distance. This result agrees with previous studies, which show that even conservative mutations to these residues abolish protein activity. The soaking experiments were performed with modest metal salt concentrations.
The 10 mM Mn^"^ concentration was only three times above that used in activity assays, and the 100 mM Mg concentration was similarly proportional to the concentration used in in vitro activity assays. We are certain that under the conditions of these soaking experiments only one of these catalytic metal cations occupies the active site at any one time. Because the final Mn^"^ Fo-Fc map showed a peak height of 6.5 o above background, we are very confident of its location. Metal binding with even one third of this occupancy would have been easily visible. Even the structure obtained using an extremely high 500 mM Mg^"^ concentration did not indicate a second metal-binding site. It is not known at this time whether one or two metals are required for the integration reaction to proceed, although the detailed modeling of reactions catalyzed by nucleotidyl tranferases appears to require two metal ions (12). Since we could find only one divalent cafion in the complex, we decided to examine whether other divalent cations could be bound with different stoichiometry. Even though it is known that Zn^"^ does not activate integrases, we used this cation because zinc chemistry is very similar to that of Mg (13). Unexpectedly, we observed a structure with two Zn^"^ ions coordinated by all three active site residues, and with one water ligand interacting with each metal ion (Fig. lb). As in the other structures, we observed only minimal conformational changes for both aspartate side chains. One of the Zn^"^ ions appears in essentially the same location as the Mn'^'^ and Mg^"*" ions, only 0.36 A from the position of the former and 0.29 A from the latter, coordinating with Asp-64 and Asp-121, respectively. The coordination of the second active site metal involves the other carboxylate oxygen
420
Jerry Alexandratos et al
Figure 1. Stereo views of active sites of ASV IN complexed with metals, la. Electron density map of Mg coordinated with four water molecules. Two active site carboxylate oxygens and four waters create the octahedral coordination for the metal cation, lb. Electron density map of two Zv?^ ions with coordinating two water molecules. Four active site carboxylate oxygens and two water molecules coordinate the metal cations.
Figure 2. Ribbon diagram of the ASV IN catalytic domam, with explicitly shown active site side chains for both Zv?^ (black) and Mn^"^ (grey) complexes. The metal locations and corresponding coordinated water molecules are indicated in the same colors.
Crystal Structure of Sarcoma Virus Integrase
421
of Asp-64, as well as Glu-157. Not surprisingly, the side chain of Glu-157 was observed to rotate, as this residue did not previously point into the active site. The fact that only a side chain rotation, with no backbone displacement, was needed for the second cation to bind explains why even conservative mutations of these active site residues inactivate IN completely. The coordination also seems to extend between the metal ions themselves, since the distance between the Zn^"^ ions is only 3.5 A. Both Zn^"^ ions are coplanar with the two carboxylate oxygens from Asp-121 and Glu-157 and the liganded waters. Each Zn^^ is located in the center of a triangle formed by coordination with a carbonyl oxygen, a water molecule, and the other cation, with Asp-64 coordinating both Zn^"^ ions from below this plane. The distance between any oxygen Hgand and the metal ion is within 2.1-2.4 A. The number of waters coordinating the metal bound by IN may be crucial for catalytic activity. If one water is replaced by an incoming DNA phosphate ligand, then this active site-Zn^"^ arrangement will not have a water molecule available for the hydrolysis reaction (14). Another Zn^"*" ion was found bound in a distant part of the structure, with His-103 and three additional water molecules, forming a more typical tetrahedral coordination. Binding of the Mg^"^, Mn^"^, and Zn^"*" ions does not lead to significant structural modifications in the active site or the overall protein architecture when compared with native ASV IN (Table I). This result indicates that metal-binding sites are preformed in this IN structure. The observed configuration of the D,D(35)E residues may represent a catalytically-competent active site (Fig. 2). When one divalent cation is bound, the active site side chains remain in the positions seen in the PEG native structures. Binding of two Zn^"^ cations also causes no change to the IN backbone. 
The side chain of Glu-157 rotates with respect to the conformation seen in the protein complexed with the other metals. However, this is completely consistent with the side chain conformation seen in the native structure of IN crystals grown from ammonium sulfate. These minor differences between the active sites of ASV IN with different cofactors seem to reflect a tendency for structural flexibility in the active site of integrases. The two Zn²⁺ ions are observed at a concentration similar to that used for the single Mg²⁺ ion, with no overall changes to the protein, indicating a possible mode of binding for Mg²⁺ under other conditions. This observation supports the hypothesis that a second metal-binding site exists for Mg²⁺/Mn²⁺, but forms in the presence of substrate and/or other domains of the protein.
A. Comparison with related enzymes
Although many enzymes that are active in the processing of nucleic acids, such as nucleases, DNA polymerases, or reverse transcriptases, have acidic residues in the active sites and require divalent cations for activity, such cations have been reported only for a few published structures. One published structure of reverse transcriptase from Moloney murine leukemia virus (MMLV RT) shows a single metal bound in the active site (15), whereas none of the available structures of HIV-1 RT show bound metals. In addition, the structures of MMLV RT and E. coli RNase H with bound metals have been solved at a lower resolution than the
same proteins without metals. As our data show, the quality of the metal-containing ASV IN structures is as good as that of the apoenzyme. Since the electron density maps are of excellent quality, we can describe the active site with high accuracy. We have compared the structure of the ASV IN active site with the active sites of the other members of this superfamily for which metal complexes have been described or inferred, namely both HIV-1 and E. coli RNases H, and E. coli RuvC resolvase. As reported by Yang and Steitz (16), the similarity of the cluster of acidic residues forming the active sites of the RNase H enzymes is striking. With alignment based on conserved secondary structure elements, we have found that the best agreement in the active site of these enzymes is with ASV IN Asp-64. The placement and direction of the analogous carboxylates are very similar in RNases H and in ASV IN (Fig. 3). ASV IN Asp-121 is close to its equivalent in HIV-1 RNase H (17) and E. coli RNase H (18), whereas the side chain of the equivalent residue in E. coli RuvC resolvase (19) is more distant (not shown). The third residue of the cluster, ASV IN Glu-157, is also in quite good agreement among these enzymes. The other acidic residues in this region do not have counterparts in ASV IN. Similar to what we have observed for ASV IN, the residues in the active site of E. coli RNase H move only slightly upon binding of the metal, with Cα atoms shifting less than 0.4 Å when structures without (20) and with (18) divalent cations are compared; the side chain acidic groups shift no more than 1.5 Å. Interestingly, the two residues that coordinate the Mg²⁺ ion move less than the other carboxylates, indicating that the part of the active site directly coordinating the metal ion has an invariant character. A similar case is noted with a comparison of the metal-bound and unbound active sites of MMLV RT (18,20) and HIV-1 RT (21), with less than a 1.5 Å r.m.s. 
deviation among the three active site residues. The sole Mg²⁺ ion reported for E. coli RNase H (22) is complexed by Asp-10 (Asp-64) and by Glu-48 (no ASV IN equivalent). Although no precise data on the location of a divalent cation are available for RuvC resolvase, a Mn²⁺-binding site apparently exists between Asp-7 (Asp-64) and Asp-141 (Glu-157) (19).
Figure 3. Comparison of the active sites of the HIV-1 RNase H-Mn²⁺ complex (16) (black) with the ASV IN-Zn²⁺ complex (grey) shows the excellent alignment of catalytic residues and metal cations.
Structural alignment of the crystallographically determined structures of ASV IN complexed with Zn²⁺ and of the HIV-1 RNase H complexed with Mn²⁺ was carried out using ALIGN (23). The general architecture of the ASV IN and RNase H monomers is significantly different, but it is possible to superimpose structurally conserved regions, three α-helices and one β-strand, which contain the active site and nearby structurally important residues. This alignment reveals a surprisingly good superposition of the catalytic residues, with an r.m.s. deviation of 1.3 Å for the 28 atom pairs. The positions of the two Zn²⁺ ions in ASV IN are very close to the two Mn²⁺ ions in RNase H (17), which are directly coordinated by the carboxylates of Asp-43 (equivalent to Asp-64 in ASV IN) and Asp-98 (equivalent to Asp-121 in ASV IN), and between Asp-43 and Asp-149 (equivalent to Glu-157 in ASV IN). The distances between the two pairs of cations are 0.38 Å for the Zn²⁺ ion bound between the two aspartates and 0.48 Å for the other Zn²⁺ ion, less than the r.m.s. deviations between the protein atoms (Fig. 3). Although the three most highly conserved acidic residues are present in similar locations in all of these enzymes, the exact relationships between them are not strictly preserved. However, for the two (quite divergent) RNase H enzymes, the maximum differences in the positions of the carboxylates do not exceed 1.5 Å, despite some disorder reported in the vicinity of the active site of the isolated HIV-1 RNase H domain from HIV-1 RT (17). The minimal influence of the presence of the divalent cation on the disposition of residues in the active site of RNase H and ASV IN is mirrored in MMLV RT, where the differences observed in the positions of the three critical aspartates in the active site are not larger than 0.4 Å when metal-bound and unbound structures are compared. 
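The r.m.s. deviation and cation-cation distances quoted above are simple functions of paired coordinates once both structures sit in a common frame. The following is a minimal Python sketch, assuming the superposition has already been carried out (the fitting step performed by ALIGN is not reproduced here) and using made-up coordinates rather than the deposited ones; the function names are illustrative only.

```python
import math

def distance(a, b):
    """Euclidean distance (in Å) between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rmsd(pairs):
    """Root-mean-square deviation over already-superimposed atom pairs."""
    if not pairs:
        raise ValueError("need at least one atom pair")
    return math.sqrt(sum(distance(a, b) ** 2 for a, b in pairs) / len(pairs))

# Hypothetical coordinates (Å), for illustration only; not from the structures.
pairs = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((2.0, 1.0, 0.0), (2.0, 2.0, 0.0)),
    ((0.0, 3.0, 1.0), (0.0, 3.0, 2.0)),
]
print(round(rmsd(pairs), 2))
```

With real data, `pairs` would hold the 28 equivalent atom positions from the two superimposed active sites, and `distance` would be applied to each Zn²⁺/Mn²⁺ pair.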
In the case of MMLV RT, however, the quality of the difference Fourier map is not sufficient to determine the details of the coordination of the metal. The similarities between the active sites of MMLV RT and ASV IN include the interaction of only two of the three carboxylates with a single Mn ion present in the active site. We have presented here the structure of the catalytic domain of ASV IN complexed with three different divalent cations. These results clearly show that the active site of this enzyme is preformed, in that only relatively small movements of side chains and no shifts of the main chain are needed in order to provide an environment suitable for cation binding. This is in contrast with the related core HIV-1 IN (4), in which no binding of the divalent cations could be shown by crystallographic means, and in which the constellation of active site residues differs significantly from their counterparts in ASV IN. However, antibody-binding experiments have shown that HIV-1 IN undergoes a conformational change when incubated with divalent cation cofactors (Asante-Appiah, E. and Skalka, A. M., personal communication). These results are consistent with the notion that the activity can be modulated by transitional order-disorder phenomena involving the active site, and that such conformational changes may be different for enzymes obtained from different sources. Although only a single divalent cation was observed upon soaking ASV IN in Mn²⁺ and Mg²⁺, the unexpected observation of two Zn²⁺ ions binding to the active site of ASV IN could provide indirect proof of the hypothesis postulating the utilization of two cations for catalytic activity.
Acknowledgements Research sponsored in part by the National Cancer Institute, DHHS, under contract with ABL. Other support includes National Institutes of Health grants CA47486 and CA06927, a grant for infectious disease research from Bristol-Myers Squibb Foundation, and an appropriation from the Commonwealth of Pennsylvania. The contents of this publication do not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
References
1. Katz, R. A., and Skalka, A. M. (1994). The retroviral enzymes. Annu. Rev. Biochem. 63, 133-173.
2. Goff, S. P. (1992). Genetics of retroviral integration. Annu. Rev. Genet. 26, 527-544.
3. Vink, C., Groeneger, O. A. M., and Plasterk, R. H. (1993). Identification of the catalytic and DNA-binding region of the human immunodeficiency virus type I integrase protein. Nucleic Acids Res. 21, 1419-1425.
4. Dyda, F., Hickman, A. B., Jenkins, T. M., Engelman, A., Craigie, R., and Davies, D. R. (1994). Crystal structure of the catalytic domain of HIV-1 integrase: Similarity to other polynucleotidyl transferases. Science 266, 1981-1986.
5. Bujacz, G., Jaskolski, M., Alexandratos, J., Wlodawer, A., Merkel, G., Katz, R. A., and Skalka, A. M. (1995). High resolution structure of the catalytic domain of the avian sarcoma virus integrase. J. Mol. Biol. 253, 333-346.
6. Bujacz, G., Jaskolski, M., Alexandratos, J., Wlodawer, A., Merkel, G., Katz, R. A., and Skalka, A. M. (1996). The catalytic domain of avian sarcoma virus integrase: conformation of the active-site residues in the presence of divalent cations. Structure 4, 89-96.
7. Kulkosky, J., Katz, R. A., Merkel, G., and Skalka, A. M. (1995). Activities and substrate specificity of the evolutionarily conserved central domain of retroviral integrase. Virology 206, 448-456.
8. Otwinowski, Z. (1992). An Oscillation Data Processing Suite for Macromolecular Crystallography, Yale University, New Haven.
9. Sheriff, S. (1987). Addition of symmetry-related contact restraints to PROTIN and PROLSQ. J. Appl. Crystallogr. 20, 55-57.
10. Jones, T. A. (1985). Interactive computer graphics: FRODO. Methods Enzymol. 115, 157-171.
11. Hendrickson, W. A. (1985). Stereochemically restrained refinement of macromolecular structures. Methods Enzymol. 115, 252-270.
12. Steitz, T. A. (1993). DNA- and RNA-dependent DNA polymerases. Curr. Opin. Struct. Biol. 3, 31-38.
13. Cotton, F., and Wilkinson, G. (1988). Advanced Inorganic Chemistry (5th edition, Wiley-Interscience).
14. Beese, L. S., and Steitz, T. A. (1991). Structural basis for the 3'-5' exonuclease activity of Escherichia coli DNA polymerase I: a two metal ion mechanism. EMBO J. 10, 25-33.
15. Georgiadis, M. M., Jessen, S. M., Ogata, C. M., Telesnitsky, A., Goff, S. P., and Hendrickson, W. A. (1995). Mechanistic implications from the structure of a catalytic fragment of Moloney murine leukemia virus reverse transcriptase. Structure 3, 879-892.
16. Yang, W., and Steitz, T. A. (1995). Recombining the structures of HIV integrase, RuvC and RNase H. Structure 3, 131-134.
17. Davies, J. F., II, Hostomska, Z., Hostomsky, Z., Jordan, S. R., and Matthews, D. A. (1991). Crystal structure of the ribonuclease H domain of HIV-1 reverse transcriptase. Science 252, 88-95.
18. Yang, W., Hendrickson, W. A., Crouch, R. J., and Satow, Y. (1990). Structure of ribonuclease H phased at 2 Å resolution by MAD analysis of the selenomethionyl protein. Science 249, 1398-1405.
19. Ariyoshi, M., Vassylyev, D. G., Iwasaki, H., Nakamura, H., Shinagawa, H., and Morikawa, K. (1994). Atomic structure of the RuvC resolvase: A Holliday junction-specific endonuclease from E. coli. Cell 78, 1063-1072.
20. Katayanagi, K., Miyagawa, M., Matsushima, M., Ishikawa, M., Kanaya, S., Nakamura, H., Ikehara, M., Matsuzaki, T., and Morikawa, K. (1992). Structural details of ribonuclease H from Escherichia coli as refined to an atomic resolution. J. Mol. Biol. 223, 1029-1052.
21. Unge, T., Knight, S., Bhikhabhai, R., Lovgren, S., Dauter, Z., Wilson, K., and Strandberg, B. (1994). 2.2 Å resolution structure of the amino-terminal half of HIV-1 reverse transcriptase (fingers and palm subdomains). Structure 2, 953-961.
22. Katayanagi, K., Okumura, M., and Morikawa, K. (1993). Crystal structure of Escherichia coli RNase HI in complex with Mg²⁺ at 2.8 Å resolution: proof for a single Mg²⁺-binding site. Proteins 17, 337-346.
23. Satow, Y., Cohen, G. H., Padlan, E. A., and Davies, D. R. (1986). Phosphocholine binding immunoglobulin Fab McPC603: An X-ray diffraction study at 2.7 Å. J. Mol. Biol. 190, 593-604.
Multidimensional NMR Studies of an Exchangeable Apolipoprotein and Its Interactions with Lipids
Jianjun Wang^, Daisy Sahoo^, Dean Schieve^, Stephane M. Gagne§, Brian D. Sykes§ and Robert O. Ryan^
^Lipid and Lipoprotein Research Group, »Protein Engineering Network Centres of Excellence, Department of Biochemistry, University of Alberta Edmonton, Alberta, Canada T6G 2S2
I. INTRODUCTION
Exchangeable apolipoproteins are a class of functionally important proteins which play a key role in plasma lipoprotein metabolism. In this capacity they have been associated with several human disorders, including hyperlipidemia and cardiovascular disease (1,2). Apolipophorin-III (apoLp-III) is a model exchangeable apolipoprotein derived from the insect Manduca sexta (166 residues, Mr 18,380). ApoLp-III is a major hemolymph protein in the adult life stage and functions in lipid transport during sustained flight (3,4). Biophysical studies demonstrate that apoLp-III is a soluble monomeric protein at concentrations of 15 mg/ml (5). While the tertiary structure of M. sexta apoLp-III has not been solved, X-ray crystallography of apoLp-III from Locusta migratoria reveals a globular structure composed of a bundle of five elongated amphipathic α-helices which are connected by short loops (6). A similar molecular architecture was also found for the 22 kDa N-terminal fragment of human apolipoprotein E (7). The crystal structure of L. migratoria apoLp-III was obtained for the protein in its lipid-free state. The lipid-bound structure of apoLp-III, however, is more interesting since it represents the active form of the protein. To date, no detailed structures of exchangeable apolipoproteins in complex with lipid have been reported. The crystal structure of lipid-free apoLp-III demonstrated that the five amphipathic helices orient in such a way that their hydrophobic faces are directed toward each other to form a hydrophobic core while the hydrophilic faces of the helices are exposed to solvent. It has been hypothesized that, upon binding to a
lipid surface, the protein undergoes a major conformational change that results in opening of the helix bundle, with exposure of the hydrophobic surfaces of the helices which contact the lipid (6). This putative conformational change is depicted in Figure 1:
Figure 1. Open conformation model of exchangeable apolipoproteins upon lipid-binding.
Since M. sexta apoLp-III is a well-behaved member of the exchangeable apolipoprotein family in terms of its physico-chemical properties, it represents a good candidate for investigation of the molecular details of the exchangeable apolipoproteins associated with lipid binding. A study of this system may reveal a general mechanism for exchangeable apolipoprotein-lipid interaction. In a manner similar to all exchangeable apolipoproteins, apoLp-III resists crystallization when complexed with lipid. Thus, NMR is the only potentially useful high resolution technique to investigate structural changes of apoLp-III which accompany lipid-binding. To date, no NMR structures of exchangeable apolipoproteins have been reported. In order to carry out 3D and/or 4D-NMR experiments, however, the protein of interest must be ¹⁵N- and/or ¹³C-isotope labeled. Isotope labeling strategies require an efficient bacterial expression system and such a system has recently been developed in this laboratory for apoLp-III from M. sexta (8). The present study describes the results of labeling experiments and presents a useful method to incorporate ¹⁵N specifically and exclusively into peptide backbone amide nitrogens. 3D-NMR experiments have been performed on isotopically labeled apoLp-III and nearly complete assignment has been achieved. In addition,
preliminary experiments provide direct experimental evidence in support of a significant conformational change in apoLp-III upon interaction with lipid.
II. METHODS
Materials. ¹⁵NH₄Cl, ¹⁵N-leucine, ¹⁵N-glycine, ¹⁵N-valine and ¹⁵N-lysine were purchased from Cambridge Isotope Laboratories (Andover, MA). ¹³C₆-Glucose and ²H-dodecylphosphocholine were obtained from Isotec Inc. (Miamisburg, OH). Unlabeled amino acids were obtained from Sigma Chemical Co. (St. Louis, MO).
Bacterial expression and isotope-labeling of recombinant apoLp-III. The coding sequence of M. sexta apoLp-III was cloned into the pET expression vector (Novagen Corp., Madison, WI) directly downstream from the pelB leader sequence cleavage site. Introduction of the plasmid vector into E. coli BL21(DE3) permits high level expression upon induction with 1 mM isopropyl β-D-thiogalactopyranoside (IPTG). Significant amounts of recombinant apoLp-III were secreted into the culture medium during expression and protein was isolated from the culture supernatant following five hours of incubation at 30 °C. Typically, a one liter cell culture produces 150-200 mg of pure apoLp-III (8). The purification procedure was essentially as described by Ryan et al. (8). Isotope-labeling of apoLp-III used M9 minimal media with ¹⁵NH₄Cl for uniform ¹⁵N labeling; ¹⁵NH₄Cl/¹³C₆-glucose for uniform ¹⁵N/¹³C labeling; ¹⁵N-leucine/¹⁴NH₄Cl for specific ¹⁵N labeling of backbone nitrogens; and the ¹⁵N-amino acid of interest plus the 19 other ¹⁴N-amino acids for amino acid-specific ¹⁵N labeling.
NMR Spectroscopy. NMR experiments were carried out at 30 °C on a Varian Unity 600 spectrometer equipped with three channels, a pulsed-field gradient triple resonance probe with an actively shielded z gradient and a gradient amplifier unit. NMR sample concentrations ranged from 0.5 to 1.1 mM, pH 6.5 ± 0.1, with 250 mM phosphate buffer and 0.5 mM NaN₃. 2D ¹H-¹⁵N HSQC spectra were recorded using the enhanced sensitivity mode with 8-32 transients (9). 
Triple resonance HNCACB (10) and CBCA(CO)NNH (10) 3D-NMR spectra, recorded on a uniformly ¹⁵N/¹³C-labeled H₂O sample with 8-16 transients, correlate backbone amide protons of residue i with CA and CB atoms of residue i (HNCACB) and i-1 (HNCACB, CBCA(CO)NNH) for the backbone sequential assignment. ¹⁵N-edited NOESY (9) and ¹⁵N-edited TOCSY (9) were also acquired on a uniformly ¹⁵N-labeled H₂O sample with 8-12 transients, for both backbone and sidechain assignments. A mixing time of 150 ms was used for ¹⁵N-edited NOESY experiments and a mixing time of 59 ms was used for ¹⁵N-edited TOCSY
experiments. Pulsed-field gradient HCCH-TOCSY (11) and simultaneous ¹⁵N- and ¹³C-edited NOESY (12) were acquired for the sidechain assignment. A mixing time of 100 ms was used for simultaneous ¹⁵N- and ¹³C-edited NOESY. Titrations of ²H-dodecylphosphocholine (DPC) into specific amino acid ¹⁵N-labeled apoLp-III samples were monitored by 2D ¹H-¹⁵N HSQC spectra at pH 6.9-7.0 in order to investigate structural changes induced by lipid-binding.
Electrospray ionization mass spectrometry. Molecular weight determinations for control and isotope enriched apoLp-IIIs were made using a VG Quattro electrospray mass spectrometer (Fisons Instruments, Manchester, UK). Molecular weights were determined as the mean value calculated for several multiply charged ions within a coherent series. The instrument was calibrated using the series of ion peaks from horse heart myoglobin with a molecular mass of 16,951 daltons. Calculated masses were derived from the amino acid sequence using the program MacPro Mass (Terry Lee, City of Hope, Duarte, CA).
III. RESULTS and DISCUSSION
¹⁵N Isotope-labeling Strategies
In order to pursue heteronuclear multidimensional NMR experiments, a bacterial system for expression of apoLp-III has been developed which allows facile production of 150-200 mg/L ¹⁵N-labeled apoLp-III or 100-125 mg/L ¹⁵N/¹³C-double labeled apoLp-III. Figure 2, panel A shows the ¹H-¹⁵N HSQC spectrum of a 1.0 mM solution of lipid-free, uniformly ¹⁵N-labeled apoLp-III. Panel A also indicates that, although the chemical shift dispersion in the ¹H dimension is rather small (6.5 ppm to 9.5 ppm), it is generally upfield shifted, consistent with the fact that the protein secondary structure is predominantly α-helix (13). The chemical shifts in the ¹⁵N dimension are well dispersed, which results in good separation of the overall crosspeaks. However, certain regions in the spectrum are still crowded as shown in Figure 2. 
The upper right corner of the HSQC spectrum of ¹⁵N-labeled apoLp-III shown in Figure 2, panel A contains numerous doublet crosspeaks which are derived from side chain amines of glutamine and asparagine residues. Since apoLp-III is rich in glutamine and asparagine (25 of the 166 amino acids), this region of the spectrum is
crowded. In considering possible approaches to selectively label backbone amide nitrogens, we speculated that although bacteria efficiently and readily transaminate nitrogen to α-amino nitrogens of other amino acids, they may prefer NH₄Cl as a precursor of Gln and Asn side chain amine nitrogens. Interestingly, when bacteria were grown in media containing ¹⁵N-leucine and unlabeled NH₄Cl, specific labeling of backbone amide nitrogens was achieved (Panel B, Figure 2). The presence of unlabeled NH₄Cl is essential to this process because it provides an alternative, preferentially utilized, biosynthetic precursor of side chain nitrogen atoms. The isotope-labeling strategy shown in Panel B, Figure 2 is the first report, to our knowledge, which allows for specific isotope-labeling of the backbone amide nitrogen atoms of a protein. As a reference control, bacteria harboring the apoLp-III/pET plasmid were cultured in medium containing ¹⁵N-leucine as the sole nitrogen source. ApoLp-III obtained by this isotope-labeling method gave an HSQC spectrum which was indistinguishable from the spectrum shown in Figure 2, panel A (data not shown). This result confirms that bacteria are capable of redistributing nitrogen derived from ¹⁵N-leucine into all backbone and side chain nitrogens in this protein. Importantly, when the uniformly labeled spectrum is overlaid with the specifically backbone amide labeled spectrum, it is apparent that several backbone amide resonances are masked by the abundant side chain amine nitrogen resonances. Thus, this labeling strategy simplifies HSQC spectra, enabling identification of crosspeaks from backbone amides that otherwise overlap with crosspeaks from side chain amines. Spectroscopic methods (DEPT-HMQC and DEPT-SQC) have been developed which permit resolution of ¹⁵NH and ¹⁵NH₂ groups in proteins (14,15). 
However, the metabolic labeling strategy described herein offers an attractive alternative method, permitting specific enrichment of backbone amide nitrogens with ¹⁵N, effectively eliminating glutamine and asparagine side chain NH₂ resonances, and thus simplifying the spectrum. Mass spectrometric analysis of ¹⁵N-backbone amide nitrogen labeled apoLp-III indicated an isotope enrichment of 50%, demonstrating that unlabeled NH₄Cl competes with ¹⁵N-leucine derived nitrogens for incorporation into the backbone nitrogen atoms in the protein. By contrast, an isotope enrichment of > 95% was observed for uniformly ¹⁵N-labeled apoLp-III using ¹⁵NH₄Cl as the sole nitrogen source in M9 minimal media. It has been proposed that the amphipathic α-helix is the putative lipid-associating motif in exchangeable apolipoproteins (16,17). Further, it is generally accepted that binding of exchangeable apolipoproteins to lipid surfaces involves contact of the apolar faces of the α-helices, with the polar faces of the helices exposed to the solvent. Taken together with the known structures of L. migratoria apoLp-III and the 22 kDa N-terminal fragment of human apolipoprotein E in the absence of lipid, it is apparent that a significant conformational change must
accompany lipid association. Hence, we are particularly interested in residues located in the apolar and/or polar faces of the helices. Leucine is a typical hydrophobic residue and lysine is a typical hydrophilic residue. In M. sexta apoLp-III, there are 11 leucines and 23 lysines which are dispersed along the entire length of the protein sequence. These two residues are proposed to be located on the apolar and polar faces of the amphipathic helices, respectively. To accomplish specific labeling of the 11 leucine residues in apoLp-III, we cultured bacteria in M9 minimal media containing ¹⁵N-leucine and the 19 other unlabeled amino acids. Figure 2, panel C reveals a spectrum consisting of 11 well-separated resonances, consistent with the conclusion that specific labeling of the 11 leucine residues in apoLp-III has been achieved. For ¹⁵N/¹³C-double labeling of apoLp-III, the amount of ¹³C₆-glucose was optimized. Normally, minimal media contains 2.4-3.0 g glucose/L. However, it was found that efficient overexpression of ¹⁵N/¹³C double labeled apoLp-III can be achieved with as little as 1.2 g/L of ¹³C₆-glucose, significantly reducing the cost of isotope-labeling.
Strategy for Complete Assignment of ApoLp-III
In order to completely assign the NMR spectra, heteronuclear 3D-NMR experiments have been performed. Experiments essential for the backbone assignment are the ¹⁵N-edited NOESY, HNCACB and CBCA(CO)NNH. These spectra allow for nearly complete assignment of the backbone atoms of apoLp-III. For an α-helical protein, strong/medium intensity dNN(i, i±1) NOEs can usually be observed (18). The strategy for sequential assignment of apoLp-III took advantage of the dNN connectivities and used NH-NH walking based on the ¹⁵N-edited NOESY spectrum. Figure 3 gives an example of data obtained from the NH-NH walking strategy. This figure shows strip plots extracted from the 150 ms mixing time ¹⁵N-edited NOESY spectrum of apoLp-III for residues 50-59. 
This strip plot clearly demonstrates the NH-NH walking strategy and, using this approach, we assigned 80% of the backbone N, HN atoms, as well as 80% of the Hα atoms with the help of a ¹⁵N-edited TOCSY spectrum. Triple resonance HNCACB and CBCA(CO)NNH spectra correlate backbone amide protons of residue i with CA and CB atoms of residues i and i-1. Spectra obtained, which yielded crosspeaks correlating 85-95% of apoLp-III residues, were used to confirm the assignment obtained by NH-NH walking, complete the remaining assignment and correct two mistakes. Using a combination of the above mentioned 3D-experiments, it was possible to assign nearly all of the backbone atoms of apoLp-III. Sidechain atom assignment was completed using 3D HCCH-TOCSY and simultaneous ¹⁵N/¹³C NOESY spectra (at 100 ms mixing time).
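The way HNCACB and CBCA(CO)NNH correlations link neighboring residues can be sketched in a few lines of Python. This is an illustration only, not the procedure used for apoLp-III: the spin-system records, the chemical shifts, and the 0.3 ppm matching tolerance are all hypothetical. Each amide spin system carries its own CA/CB shifts and those of the preceding residue; system B is a candidate successor of system A when B's i-1 shifts match A's own shifts.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SpinSystem:
    name: str
    ca: float       # own CA shift (ppm), seen in HNCACB
    cb: float       # own CB shift (ppm), seen in HNCACB
    ca_prev: float  # CA of residue i-1, seen in CBCA(CO)NNH
    cb_prev: float  # CB of residue i-1, seen in CBCA(CO)NNH

def follows(a, b, tol=0.3):
    """True if b's i-1 shifts match a's own shifts within tol (ppm)."""
    return abs(b.ca_prev - a.ca) <= tol and abs(b.cb_prev - a.cb) <= tol

def successors(systems, tol=0.3):
    """Map each spin system name to the names of its candidate successors."""
    return {
        a.name: [b.name for b in systems if b is not a and follows(a, b, tol)]
        for a in systems
    }

# Hypothetical shifts, for illustration only.
systems = [
    SpinSystem("X", ca=57.8, cb=41.9, ca_prev=65.1, cb_prev=31.5),
    SpinSystem("Y", ca=58.9, cb=28.4, ca_prev=57.7, cb_prev=42.0),
    SpinSystem("Z", ca=59.2, cb=63.0, ca_prev=58.8, cb_prev=28.5),
]
links = successors(systems)
print(links)
```

When a spin system has more than one candidate successor, as is likely in a helical protein with limited shift dispersion, the dNN connectivities from the NOESY spectrum would be needed to choose between them.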
Figure 3. Strip plot of 3D ¹⁵N-edited NOESY of apoLp-III showing NH-NH walking strategy for residues 50-59. Axis: chemical shift (ppm).
The complete assignment of apoLp-III will be reported elsewhere. However, Figure 2, Panel C shows the assignment of the 11 leucine residues of apoLp-III.
Lipid binding induced conformational changes in ApoLp-III
In order to investigate structural changes in apoLp-III induced by lipid-binding, we prepared specifically ¹⁵N-valine and ¹⁵N-lysine labeled protein using a labeling strategy similar to that described for ¹⁵N-leucine above. Compared to leucine labeling, the efficiency of label incorporation was greater with ¹⁵N-lysine, consistent with the work of others (19). For ¹⁵N-valine specific labeling, in addition to strong crosspeaks from direct incorporation of ¹⁵N-valine, several weaker crosspeaks were found in HSQC spectra. These data suggest scrambling of valine's α-amino nitrogen to other amino acids. This interpretation is consistent with the fact that, in bacteria, valine can donate its amino nitrogen directly to α-ketoglutarate to form glutamic acid. However, we can easily identify the crosspeaks from valine in ¹⁵N-¹H HSQC spectra due to the much stronger intensity
of valine crosspeaks versus those arising from scrambling of the label. The assignment derived from 3D-NMR experiments also allows us to confirm the valine assignments. D38-dodecylphosphocholine (DPC) micelles were used to provide a lipid environment. This molecule, which possesses a phosphocholine head group and a single C₁₂ hydrocarbon chain, mimics the phospholipid component of lipoprotein surface monolayers. About 40 DPC molecules form a micelle which has a molecular weight of about 16 kDa. Thus, the apoLp-III/DPC (1:1 protein/micelle) complex
Figure 4. ¹H-¹⁵N HSQC spectra of specifically ¹⁵N-lysine labeled apoLp-III in the presence (light contour peaks) and absence (black contour peaks) of DPC micelles. The assignment shown in the figure is for lysine residues of apoLp-III in its lipid-free state. Axes: F1 (ppm) and F2 (ppm).
has a predicted molecular weight of about 34.4 kDa. In addition, the final apoLp-III:DPC molar ratio should be beyond 1:40 (more than 40 DPC molecules per protein) for the study of the lipid-induced conformational changes of apoLp-III, since under this condition the DPC concentration is much higher than its CMC, and the DPC micelle concentration is also higher than the protein concentration. Hence, a 1:1 apoLp-III/DPC micelle complex will be obtained. For simplicity, specifically ¹⁵N-valine and ¹⁵N-lysine labeled apoLp-IIIs were used to study conformational changes of apoLp-III upon lipid-binding. It is postulated that these two residues (and the helices they are associated with) will undergo significant repositioning when apoLp-III binds to lipid. ¹H-¹⁵N HSQC spectra were used to monitor chemical shift changes of the crosspeaks arising from these two residues. Figure 4 shows the HSQC spectra of specifically ¹⁵N-lysine labeled recombinant apoLp-III in the absence and presence of ²H-DPC micelles. In this figure, the dark contour crosspeaks represent apoLp-III in the lipid-free helix bundle conformation whereas the light contour crosspeaks represent apoLp-III in the presence of DPC micelles. It is noteworthy that specifically ¹⁵N-lysine labeled apoLp-III in its lipid-free state gives rise to only 16 distinct resonances, of which at least four crosspeaks represent more than one lysine residue due to resonance overlap (see the assignment in Figure 4). In addition, two lysine HSQC crosspeaks (K71, K136) were missing in Figure 4 due to the fast exchange of the amide protons of these two lysine residues under the experimental conditions we used for lipid titration (pH 6.9-7.0). The apoLp-III:DPC mole ratio was 1:45 in Figure 4, so that the DPC concentration is well above its critical micelle concentration of 1.3 mM. ¹⁵N-¹H HSQC experiments were carried out on both samples to evaluate the effect of ²H-DPC titration. 
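The stoichiometry behind the 34.4 kDa figure and the 1:45 ratio can be checked with a short calculation. The sketch below takes the protein Mr, micelle mass, aggregation number, and CMC from the text; the 1.0 mM protein concentration is an assumed value within the range given in the Methods, and estimating the micelle concentration as (total DPC minus CMC) divided by the aggregation number is a common approximation rather than something stated by the authors.

```python
PROTEIN_MR = 18_380      # apoLp-III Mr, from the text
MICELLE_KDA = 16.0       # approximate DPC micelle mass, from the text
DPC_PER_MICELLE = 40     # approximate aggregation number, from the text
DPC_CMC_MM = 1.3         # DPC critical micelle concentration (mM), from the text

def complex_mass_kda(n_micelles=1):
    """Predicted mass (kDa) of one protein bound to n_micelles DPC micelles."""
    return PROTEIN_MR / 1000 + n_micelles * MICELLE_KDA

def micelle_conc_mm(total_dpc_mm):
    """Approximate micelle concentration: DPC above the CMC / aggregation number."""
    return max(total_dpc_mm - DPC_CMC_MM, 0.0) / DPC_PER_MICELLE

protein_mm = 1.0          # assumed sample concentration (Methods range: 0.5-1.1 mM)
dpc_mm = 45 * protein_mm  # 1:45 protein:DPC mole ratio
print(round(complex_mass_kda(), 1))
print(micelle_conc_mm(dpc_mm) > protein_mm)
```

Under these assumptions the micelle concentration comes out slightly above the protein concentration, in line with the expectation of a 1:1 protein/micelle complex.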
Our goal was to follow the chemical shift changes of each crosspeak shown in Figure 4 and, ultimately, obtain the assignment of lysine and valine residues in the lipid-bound state. Interestingly, these two samples behave differently upon DPC titration in terms of relative HSQC crosspeak chemical shift changes. While the valine crosspeaks are less sensitive to DPC titration, lysine crosspeaks are extremely sensitive. Small amounts of DPC, even less than its CMC (100 µg per 3 mg apoLp-III), cause dramatic changes in the chemical shifts of lysine resonances, making it difficult to follow the changes in individual crosspeaks. For this reason, we do not show the assignment of apoLp-III lysine residues in the lipid-bound state (Figure 4). On the other hand, DPC titration-induced chemical shift changes in valine crosspeaks were more easily followed, which allows us to obtain the assignment of valine residues of apoLp-III in its lipid-bound state. Since the DPC micelle concentration in Figure 4 is higher than the protein concentration, a 1:1 ratio of protein/DPC micelle complex is expected. In general, crosspeaks observed with lipid-bound apoLp-III are broader than those obtained with apoLp-III in the lipid-free state, consistent with association of the protein to the micellar surface. The dramatic differences in resonance
distribution between the lipid-free and lipid-bound apoLp-III provide strong direct experimental support for the concept that lipid association is accompanied by a significant protein conformational change (5-6,20). Further detailed structural studies are currently in progress to characterize the molecular details of the lipid-associated conformation of apoLp-III in terms of the structural model depicted in Figure 1.

ACKNOWLEDGEMENTS
We thank Dr. Bill Bachovchin and David Corson for helpful discussions. ROR is a Senior Scholar of the Alberta Heritage Foundation for Medical Research and a Medical Research Council of Canada Scientist. This work is supported by a grant from the Medical Research Council of Canada.
REFERENCES
1. Weisgraber, K.H. (1994) Adv. Protein Chem., 45, 249-302.
2. Weisgraber, K.H., Pitas, R.E. and Mahley, R.W. (1994) Curr. Opin. Struct. Biol., 4, 507-515.
3. Blacklock, B.J. and Ryan, R.O. (1994) Insect Biochem. Mol. Biol., 24, 855-873.
4. Ryan, R.O. (1994) Curr. Opin. Struct. Biol., 4, 499-506.
5. Kawooya, J.K., Meredith, S.C., Wells, M.A., Kezdy, F.J. and Law, J.H. (1986) J. Biol. Chem., 261, 13588-13591.
6. Breiter, D.R., Kanost, M.R., Benning, M.M., Wesenberg, G., Law, J.H., Wells, M.A., Rayment, I. and Holden, H.M. (1991) Biochemistry, 30, 603-608.
7. Wilson, C., Wardell, M.R., Weisgraber, K.H., Mahley, R.W. and Agard, D.A. (1991) Science, 252, 1817-1822.
8. Ryan, R.O., Schieve, D., Wientzek, M., Narayanaswami, V., Oikawa, K., Kay, C.M. and Agellon, L.B. (1995) J. Lipid Res., 36, 1066-1072.
9. Zhang, O., Kay, L.E., Olivier, J.P. and Forman-Kay, J.D. (1994) J. Biomol. NMR, 4, 845-858.
10. Muhandiram, D.R. and Kay, L.E. (1994) J. Magn. Reson., B103, 203-216.
11. Kay, L.E., Xu, G.Y., Singer, A.U., Muhandiram, D.R. and Forman-Kay, J.D. (1993) J. Magn. Reson., B101, 333-337.
12. Pascal, S.M., Muhandiram, D.R., Yamazaki, T., Forman-Kay, J.D. and Kay, L.E. (1994) J. Magn. Reson., B103, 197-201.
13. Ryan, R.O., Oikawa, K. and Kay, C.M. (1993) J. Biol. Chem., 268, 1525-1530.
14. Kessler, H., Schmieder, P. and Kurz, M. (1989) J. Magn. Reson., 85, 400-405.
15. Tate, S.-I., Masui, Y. and Inagaki, F. (1991) J. Magn. Reson., 94, 625-630.
16. Segrest, J.P., Jackson, R.L., Morrisett, J.D. and Gotto, A.M. Jr. (1974) FEBS Lett., 38, 247-253.
17. Segrest, J.P., Garber, D.W., Brouillette, C.G., Harvey, S.C. and Anantharamaiah, G.M. (1994) Adv. Protein Chem., 45, 303-369.
18. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids. John Wiley & Sons, N.Y.
19. Muchmore, D.C., McIntosh, L.P., Russell, C.B., Anderson, D.E. and Dahlquist, F.W. (1989) Methods Enzymol., 177, 44-73.
20. Wientzek, M., Kay, C.M., Oikawa, K. and Ryan, R.O. (1994) J. Biol. Chem., 269, 4605-4612.
NMR Methods for Analysis of CRALBP Retinoid Binding*
Linda A. Luck¹, Ronald A. Venters², James T. Kapron³, Karen E. Roth, Seth A. Barrows¹, Sara G. Paradis¹ and John W. Crabb³
¹Department of Biology, Clarkson University, Potsdam, NY 13699
²Duke University NMR Center, Duke University, Durham, NC 27708
³Protein Chemistry Facility, W Alton Jones Cell Center, Lake Placid, NY 12946
*This work was supported in part by USPHS grant EY06603.

I. Introduction
Cellular retinaldehyde-binding protein (CRALBP) may play a key role in visual pigment regeneration as a substrate carrier/routing protein in the visual cycle, mediating the conversion of 11-cis-retinol to 11-cis-retinaldehyde through interaction with an 11-cis-retinol dehydrogenase in the retinal pigment epithelium (Saari et al., 1994). The protein exhibits retinoid stereoselectivity, binding only 11-cis retinoids with high affinity and 9-cis-retinaldehyde with lower affinity. 11-cis-Retinaldehyde bound to CRALBP is less susceptible to photoisomerization than when bound to rhodopsin (Saari and Bredberg, 1987). No evidence has been found for covalent linkage between retinoid and CRALBP. Toward identification of the CRALBP retinoid-binding pocket and definition of the structural properties of the protein that provide high ligand stereoselectivity and low photosensitivity, solution state NMR analysis has been initiated using human recombinant CRALBP labeled by biosynthetic isotope incorporation. Heteronuclear gradient-enhanced ¹⁵N NMR and one-dimensional ¹⁹F and ¹³C NMR methods, coupled with improved isotope incorporation methods and mass spectrometry, have proven to be complementary approaches for characterizing CRALBP-ligand interactions. While these methods have been used separately elsewhere, usually as a primary approach to structural problems, here we emphasize the complementarity of the
techniques and the advantages of combining the methods for studying protein-ligand interactions. Such protein biotechnology is suitable for characterizing a variety of protein-ligand interactions and is becoming more accessible through specialized biomolecular resource facilities.
II. Materials and Methods
Human Recombinant CRALBP (rCRALBP). All human recombinant CRALBP used in this study was expressed as a fusion protein (339 residues, Mr = 39,110) in bacteria [E. coli strain BL21(DE3)LysS] with a His-tag N-terminal extension using the pET 19b vector, labeled with 11-cis-retinaldehyde in the crude cell lysate and purified by nickel affinity chromatography (Qiagen Ni-NTA resin) as previously described (Chen et al., 1994; Crabb et al., 1996). The presence of ligand was monitored by the characteristic UV absorption maximum at 425 nm. Prior to NMR analysis, sample purity was verified by SDS-PAGE and Edman degradation, the sample was quantified by amino acid analysis, the solvent was exchanged to 25 mM Tris-Cl, pH 7.5, 1 mM DTT-EDTA, and the preparation was concentrated using Amicon Centriprep concentrators (10,000 MW cutoff). Because 11-cis-retinaldehyde is a light-sensitive ligand, rCRALBP preparation was performed under dim red illumination to retain the ligand in the binding pocket and preserve the holoprotein conformation. Biosynthetic isotope incorporation was as described in section III. The ¹⁵N ammonium chloride and ¹³C-methyl-methionine were obtained from Cambridge Isotope Laboratories; 5-fluorotryptophan was from Sigma.
Solution State NMR. ¹H-¹⁵N gradient-enhanced sensitivity-enhanced heteronuclear single quantum correlation (GESE-HSQC) experiments were carried out on a three-channel Varian Unity 600 spectrometer using a ¹H/¹³C/¹⁵N triple-resonance probe equipped with an actively shielded Bz gradient coil (Farmer II and Venters, 1996). ¹³C NMR spectra were acquired on Varian Unity 600 and General Electric GN300 instruments. ¹⁹F NMR spectra were obtained at 564 MHz on a Varian Unity 600 spectrometer equipped with a 5 mm ¹H/¹⁹F probe. 3-Fluorophenylalanine (-38.0 ppm) was used to calibrate the chemical shifts of fluorine resonances relative to trifluoroacetic acid at 0 ppm (Luck, 1995).
All samples were adjusted to 10% D2O (v/v) prior to NMR analysis. Spectra were acquired at 25 °C, first under dark conditions and then again after exposure of the protein samples to bleaching illumination.
Mass Spectrometry and Other Analytical Procedures. Liquid chromatography electrospray mass spectrometry (LC-ESMS) was performed on approximately 1 µg protein samples with a Perkin Elmer Sciex API-300 triple quadrupole mass spectrometer fitted with an articulated ion spray source and set to scan over a range of 400-3000
m/z at 7 s/scan in 0.25 Da steps using an orifice potential of 35 V. RP-HPLC was performed with an Applied Biosystems Model 120 HPLC system (modified with a 75 µl mixing chamber), a 5 µm Vydac C18 column (1 x 250 mm), aqueous TFA/acetonitrile solvents and a flow rate of 50 µl/min. About 30% of the HPLC effluent was split to the mass spectrometer. Purified protein was quantified by phenylthiocarbamyl amino acid analysis (Applied Biosystems models 420H/130/920) and purity was evaluated by Edman degradation (Applied Biosystems models 470/120/900) as described elsewhere (Crabb et al., 1988). rCRALBP retinoid binding was monitored by ultraviolet spectral analysis, and photoisomerization of bound retinoid was achieved by exposure to bleaching illumination for 10-30 min at room temperature (Crabb et al., 1996).
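As a rough illustration of why a 39 kDa protein can be observed in a 400-3000 m/z scan window, the sketch below lists the charge states of a protonated protein that fall inside the window and recovers the neutral mass from two adjacent peaks. It is not the instrument's deconvolution software; it assumes simple [M + zH]z+ ions, and the function names are hypothetical.

```python
PROTON_MASS = 1.00728  # Da; assumed charge carrier for positive-ion electrospray


def mz(mass, z):
    """m/z of an [M + zH]z+ ion for a neutral mass given in Da."""
    return (mass + z * PROTON_MASS) / z


def charge_states_in_window(mass, mz_min, mz_max):
    """Return the charge states whose m/z lies inside the scan window."""
    states = []
    z = 1
    # m/z falls monotonically as z grows, so stop once it drops below mz_min.
    while mz(mass, z) >= mz_min:
        if mz(mass, z) <= mz_max:
            states.append(z)
        z += 1
    return states


def mass_from_adjacent_peaks(mz_low, mz_high):
    """Neutral mass from two adjacent peaks; mz_low carries one more charge."""
    z_high = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    return z_high * (mz_high - PROTON_MASS)


# Calculated mass of unlabeled rCRALBP (39,110 Da) and the 400-3000 scan range.
states = charge_states_in_window(39110.0, 400.0, 3000.0)
print(states[0], states[-1])
```

In practice only a subset of these charge states carries appreciable intensity; the window calculation only shows which ones could be recorded.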
III. Isotope Incorporation and Pre-NMR Analyses
Recipes for defined media used to obtain uniform biosynthetic ¹⁵N and ¹³C labeling of recombinant proteins expressed in bacteria have been published elsewhere (Venters et al., 1991). For uniform ¹⁵N-labeling of rCRALBP, we utilized a modified minimal medium containing 1 g/L [¹⁵N, 98%] ammonium chloride as the sole nitrogen source plus M9 salts, 2 mM MgSO4, 1 µM FeCl3, 100 µM CaCl2, 50 µM ZnSO4, 10 µg/ml biotin, 10 µg/ml folic acid, 0.1 µg/ml riboflavin, 5 µg/ml thiamine and 50 µg/ml ampicillin. This medium provided about a 2-fold improved yield of purified rCRALBP (5-6 mg/L) over that achieved with standard LB growth medium. The level of ¹⁵N isotope incorporation was evaluated by liquid chromatography electrospray mass spectrometry as shown in Fig. 1. For rCRALBP grown in the above medium, the measured mass of the ¹⁵N labeled protein was 39,591 ± 5 (compared with 39,594 for the calculated mass of the protein with 100% ¹⁵N incorporation), indicating that essentially complete labeling was achieved.
Several commercially available fluorinated analogs of aromatic amino acids are useful for ¹⁹F NMR studies, including 2-, 3- and 4-fluorophenylalanine, 4-, 5- and 6-fluorotryptophan and 3-fluorotyrosine. High level incorporation of fluorinated amino acids can be accomplished by using a bacterial strain auxotrophic for the amino acid of interest or by using glyphosate [N-(phosphonomethyl)-glycine], which inhibits the synthetic pathways of the aromatic amino acids (Kim et al., 1990). Because 5-fluorotryptophan is a protein synthesis inhibitor, protein production requires a balance with unfluorinated tryptophan. A 5:1 molar ratio of fluorinated to unfluorinated tryptophan yields up to 65% incorporation of the fluorinated amino acid in many bacterial auxotroph systems (Luck and Falke, 1991).
Variable, lower levels of fluorine incorporation are usually obtained without a bacterial auxotroph, yet informative NMR signals are often still achievable by extending the NMR measurement time. We have obtained useful ¹⁹F-Trp spectra from rCRALBP (which contains 2 mole Trp per mole protein) using 8-10 h NMR analysis periods. For these analyses,
[Figure 1: electrospray mass spectra (x-axis m/z, 800-1800) with deconvoluted spectra as insets (x-axis molecular weight, amu) for panels (A) ¹⁵N-CRALBP, (B) and (C).]
Figure 1. Mass Spectra of ¹⁵N, ¹³C and ¹⁹F labeled rCRALBP. The extent of biosynthetic incorporation of isotopic labels into rCRALBP was evaluated by liquid chromatography electrospray mass spectrometry. Electrospray mass spectra with deconvoluted spectra are shown for human rCRALBP labeled with (A) ¹⁵N, (B) ¹³C-Met, and (C) ¹⁹F-Trp. The mass spectral data indicate that essentially complete ¹⁵N incorporation and significant ¹³C-Met and ¹⁹F-Trp incorporation were obtained.
rCRALBP was produced using a minimal medium (Luck and Falke, 1991) containing M9 salts with 2x the normal phosphate concentration, 2% (w/v) casamino acids (Difco), 1% (v/v) glycerol, 10 µg/ml thiamine, 100 µg/ml ampicillin, 65 µg/ml 5F-tryptophan (DL) and 8 µg/ml tryptophan (L). LC-ESMS analysis of ¹⁹F-labeled rCRALBP yielded the electrospray mass spectra shown in Fig. 1C. Multiple molecular species of the ¹⁹F labeled protein are apparent in the deconvoluted spectra (inset) with measured masses of (a) 39,117 ± 8, (b) 39,126 ± 9 and (c) 39,146 ± 8, which approximate the calculated masses for the unlabeled protein (39,110), the protein containing one equivalent of ¹⁹F-Trp (39,128) or two equivalents of ¹⁹F-Trp (39,146).
rCRALBP was labeled with ¹³C-methyl methionine for NMR analysis with the same medium used for ¹⁹F labeling, with the addition of 200 µg/ml ¹³C-methyl-methionine, 10 µg/ml tryptophan and no 5-fluorotryptophan. This formulation, which results in a 2:1 ratio of ¹³C-Met to ¹²C-Met, was used to keep isotope cost low while still providing useful ¹³C NMR data with 10-12 h analysis times. Since ¹³C-labeling has no adverse effect on protein synthesis or bacterial cell growth, higher amounts can be used for isotopic labeling to reduce NMR instrument time. The fusion rCRALBP contains 7 Met per mole protein, and mass spectral analysis (Fig. 1B) demonstrates that significant ¹³C incorporation was obtained, based on a major molecular species with a measured mass of 39,121 ± 5 (compared with a calculated mass of 39,117 for 100% incorporation). Compared with the uniformly labeled ¹⁵N rCRALBP (Fig. 1A), the microheterogeneity of the ¹⁹F and ¹³C preparations from the presence of both labeled and unlabeled Trp and Met residues is readily apparent in the mass spectra (Fig. 1B,C).
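The calculated masses quoted in this section can be checked with simple arithmetic. The sketch below is an illustration only: it assumes each label replaces an atom of average natural mass (H by ¹⁹F on Trp, average C by ¹³C on the Met methyl), uses rounded atomic masses, and the helper names are hypothetical. It also estimates fractional ¹⁵N incorporation by linear interpolation between the unlabeled and fully labeled masses.

```python
UNLABELED_MASS = 39110.0  # Da, calculated mass of unlabeled rCRALBP (from the text)

# Rounded atomic masses in Da (assumed values for this sketch).
H_AVG, F_19 = 1.00794, 18.99840   # average hydrogen, fluorine-19
C_AVG, C_13 = 12.0107, 13.00335   # average carbon, carbon-13


def labeled_mass(n_sites, mass_in, mass_out, base=UNLABELED_MASS):
    """Mass after swapping n_sites atoms of mass_out for atoms of mass_in."""
    return base + n_sites * (mass_in - mass_out)


def fractional_incorporation(measured, unlabeled, fully_labeled):
    """Linear estimate of label incorporation from the observed mass shift."""
    return (measured - unlabeled) / (fully_labeled - unlabeled)


one_f_trp = labeled_mass(1, F_19, H_AVG)   # one 5-fluoro-Trp
two_f_trp = labeled_mass(2, F_19, H_AVG)   # both Trp fluorinated
seven_met = labeled_mass(7, C_13, C_AVG)   # all seven Met methyls 13C labeled

# 15N: measured 39,591 against 39,594 calculated for 100% incorporation.
n15 = fractional_incorporation(39591.0, 39110.0, 39594.0)

print(round(one_f_trp), round(two_f_trp), round(seven_met), round(n15, 3))
```

The interpolation ignores the ±5 Da measurement uncertainty, so the ¹⁵N estimate is only meaningful to about one percent.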
IV. NMR Applications
Gradient-enhanced sensitivity-enhanced heteronuclear single quantum correlation (GESE-HSQC) NMR is a rapid and sensitive high-resolution, multidimensional methodology that requires advanced instrumentation and pure samples with uniform isotopic labeling in millimolar concentrations (Venters and Spicer, 1995). The experiment, when applied to ¹⁵N labeled protein, correlates each amide proton with its directly bound ¹⁵N nucleus. Concerted effort and significant associated costs in time and money can yield complete assignment of backbone resonances and protein three-dimensional solution structures; however, application of this methodology in a modest manner is affordable and appropriate for probing protein-ligand interactions. As an example, we present ¹⁵N GESE-HSQC data that support localized conformational changes in rCRALBP when the ligand is removed from the retinoid binding pocket (Fig. 2). The GESE-HSQC pulsed field gradient experiment is depicted in Fig. 3. This type of NMR analysis is particularly useful for discerning whether ligand removal results in global or localized protein conformational change and, as in
[Figure 2: ¹⁵N GESE-HSQC spectra of rCRALBP; only the ¹H (ppm) axis label is recoverable from the source.]
[Figure 3: pulse sequence diagram with ¹H, ¹⁵N, ¹³C and Gz channels, WALTZ-16 decoupling and gradient pulses G1-G6.]
Figure 3. GESE-HSQC Experiment. Coherence transfer selection between ¹H nuclei and their directly bonded ¹⁵N nuclei was achieved using the GESE-HSQC experiment shown above. In the depicted pulse sequence, 90 degree pulses are represented by wide lines, simple 180 degree pulses by black rectangles and composite inversion pulses by cross-hatched rectangles. Gz pulses represent pulsed field gradient pulses. The water-selective 90 degree flipback pulse is labeled with phase φ2 and had a duration of 1.7 msec. Phase cycle elements are: φ1 = 2(x), 2(-x); φ3 = x, -x; all other phases are x unless otherwise indicated. Additional acquisition parameters are: τ1 = 2.65 msec; τ2 = 5.56 msec; z = 1.2 msec; sw (¹H) = 11001 Hz; sw (¹⁵N) = 5000 Hz; γB2 (¹⁵N decoupling field strength) = 1.27 kHz with GARP1; t2 = 93 msec. Pulsed field gradient parameters were G3 = 26 G/cm, tG3 = 5 µsec; G6 = 25.76 G/cm, tG6 = 0.5 msec; G1 = 1 G/cm, tG1 = 0.4 msec; gradients labeled G2, G4 and G5 were not used.
the case of rCRALBP, provides a rational foundation for pursuing identification of specific amino acid residues involved in the localized change.
One-dimensional ¹⁹F NMR experiments offer advantages of low cost, sensitivity and simplicity. Fluorine provides a sensitive probe for monitoring protein conformational changes and ligand interactions, in part because the chemical shift range is 100-fold larger than that of the proton due to the lone pair electrons (Gerig, 1994; Danielson and Falke, 1996; Sykes and Hull, 1978). Most commonly, fluorine is substituted for hydrogen in the ring structure of aromatic amino acids such as tyrosine and tryptophan, which enhances sensitivity due to the resonance electrons in the ring. Because of the usual low abundance of tryptophan in proteins and the low cost of 5-fluorotryptophan, tryptophan is generally the residue of choice for fluorine incorporation. A potential pitfall is that fluorine incorporation may cause protein denaturation and/or instability, depending on the site of incorporation and the nature of the protein. Partial apparent denaturation of 5-fluorotryptophan labeled rCRALBP was observed by NMR; nevertheless, distinct ¹⁹F resonances for each of the two labeled Trp were also observed (in NMR spectra collected over 8 h from 18 mg/ml
protein solutions before and after exposure to bleaching illumination). Apparent chemical shift differences of about 0.25 and 0.75 ppm for the two Trp were seen upon removal of ligand. The smaller shift suggests that this residue is experiencing a small conformational change whereas the larger shift suggests that the other Trp residue may be in more immediate contact with bound retinoid (data not shown). To determine which Trp residue in the CRALBP sequence is associated
[Figure 4: ¹³C NMR spectra labeled BEFORE BLEACH and AFTER BLEACH; x-axis ppm, 12-20.]
Figure 4. NMR Spectra of ¹³C-Met labeled CRALBP with and without ligand. Solution state NMR spectra of CRALBP labeled with ¹³C-Met (23 mg/ml) containing bound 11-cis-retinaldehyde were recorded in the dark. The sample was then exposed to bleaching illumination and reanalyzed without bound ligand. Major ¹³C chemical shift differences are apparent between the two spectra, suggesting that a Met residue may be in direct contact with bound retinoid and/or associated with the CRALBP retinoid binding pocket. NMR conditions include: pw90 (¹³C) = 15 µsec, sw (¹³C) = 13422 Hz, preacquisition delay = 1 sec, ¹H broadband decoupling during acquisition at a field strength of 1.85 kHz using MLEV-16.
with the observed chemical shifts, site directed Trp to Phe mutants are being prepared for additional retinoid binding and ¹⁹F NMR analyses.
One-dimensional ¹³C NMR experiments are also relatively inexpensive and straightforward; however, natural abundance ¹³C (0.0018 relative to ¹H) does not exhibit the sensitivity in NMR of fluorine (0.8331 relative to ¹H). Specific enrichment of sites, such as the methyl carbons of methionine, can increase the sensitivity, and ¹³C-Met has proven to be an effective tool for probing protein conformational changes (Beatty et al., 1996). The advantage of the innocuous substitution of carbon-13 for carbon-12 is counteracted by the smaller chemical shift dispersion of ¹³C compared with ¹⁹F. Residues of high abundance in the protein may cause overlap of resonances in the NMR spectrum. Another consideration is that while ¹³C-methionine is less expensive than many isotopes, it remains more expensive than ¹⁹F-Trp and ¹⁵N ammonium chloride. NMR analysis of the ¹³C-Met labeled rCRALBP (Fig. 4) was used to probe the possible interaction of any of the seven Met residues with retinoid. Preliminary one-dimensional NMR results reveal a predominant set of overlapping ¹³C resonances with minor chemical shift differences before and after bleaching (Fig. 4). However, a distinct and major ¹³C chemical shift difference of about 1 ppm is also apparent after removal of ligand, suggesting that at least one Met residue may be in direct contact with bound retinoid. Site directed mutagenesis of the Met residues in CRALBP is underway, and additional two-dimensional HSQC NMR analyses will be used to assign specific Met residues to observed chemical shifts and ligand interactions.
V. Conclusions
This study demonstrates the applicability of ¹⁵N, ¹⁹F and ¹³C NMR methodology for studying ligand interactions in a light sensitive protein such as rCRALBP. Gradient-enhanced sensitivity-enhanced heteronuclear single quantum correlation ¹⁵N NMR has provided evidence that rCRALBP undergoes a specific localized conformational change upon photoisomerization of 11-cis-retinaldehyde and removal of the ligand from the binding pocket. The results from the multidimensional NMR measurements strongly support the likelihood that the ¹⁹F-Trp and ¹³C-Met NMR chemical shift differences observed for the protein with and without bound 11-cis-retinaldehyde are associated with protein-ligand interactions. Site directed mutagenesis in conjunction with further NMR and ligand binding studies promises to identify components of the rCRALBP retinoid binding pocket. The effectiveness of these NMR experiments was greatly facilitated by careful quantification and protein characterization prior to NMR analysis, particularly by liquid chromatography electrospray mass spectrometry.
References
Beatty, E.J., Cox, M.C., Frenkiel, T.A., Tam, B.M., Kubal, G., Mason, A.B., MacGillivray, R.T.A., Sadler, P.J. and Woodworth, R.C. (1996) J Amer Chem Soc. (in press).
Chen, Y., Johnson, C., West, K., Goldflam, S., Bean, M.F., Huddleston, M.J., Carr, S.C., Gabriel, J.L. and Crabb, J.W. (1994) In Techniques in Protein Chemistry V, J.W. Crabb, ed., pp 371-378, Academic Press, San Diego, CA.
Crabb, J.W., Johnson, C.M., Carr, S.A., Armes, L.G. and Saari, J.C. (1988) J Biol Chem. 263, 18678-18687.
Crabb, J.W., Chen, Y., Goldflam, S., West, K.A. and Kapron, J.T. (1996) In Techniques in Molecular Biology: Retinoids, Redfern, C., ed., Humana Press, NJ (in press).
Danielson, M.A. and Falke, J.J. (1996) Annu. Rev. Biophys. Biomol. Struct. 25: 163-195.
Gerig, J.T. (1994) Prog. Nucl. Magn. Reson. Spectrosc. 26: 293-370.
Kim, H.W., Perez, J.A., Ferguson, S.J. and Campbell, I.D. (1990) FEBS Letts. 272: 34-36.
Luck, L.A. (1995) In Techniques in Protein Chemistry VI, J.W. Crabb, ed., pp 487-494, Academic Press, San Diego, CA.
Luck, L.A. and Falke, J.J. (1991) Biochemistry 30: 4248-4252.
Sykes, B.D. and Hull, W.E. (1978) Meth Enzymol. 49: 270-295.
Saari, J.C. and Bredberg, D.L. (1987) J Biol Chem 262, 7618-7622.
Saari, J.C., Bredberg, D.L. and Noy, N. (1994) Biochemistry 33: 3106-3112.
Farmer II, B.T. and Venters, R.A. (1996) J BioMolecular NMR 7, 59-71.
Venters, R.A., Calderone, T.L., Spicer, L.D. and Fierke, C.A. (1991) Biochemistry 30, 4491-4494.
Venters, R.A. and Spicer, L.D. (1995) In Techniques in Protein Chemistry VI, J.W. Crabb, ed., pp 495-502, Academic Press, San Diego, CA.
A Novel Method for Measuring the Binding Properties of the Site-Directed Mutants of the Proteins That Bind Hydrophobic Ligands: Application to Cellular Retinoic Acid Binding Proteins
Honggao Yan, Lincong Wang and Yue Li
Department of Biochemistry Michigan State University East Lansing, Michigan
I. Introduction
Many hydrophobic molecules such as vitamin A, vitamin D and steroid hormones play vital roles in a variety of cellular processes. Because of the low solubility of these molecules in water, it has been difficult to measure the binding properties of the site-directed mutants of the proteins that interact with these hydrophobic ligands, such as cellular retinoic acid binding proteins (CRABPs) (Zhang et al., 1992; Chen et al., 1995). This has greatly hampered the studies of the quantitative structure-function relationships of these important proteins. Retinoic acid (RA), a hormonally active metabolite of vitamin A, has profound effects on cell growth, differentiation, and morphogenesis. Two types of proteins have been found to bind RA: nuclear retinoic acid receptors (RARs and RXRs) and CRABPs. RARs and RXRs are RA-activated transcriptional factors that regulate expression of target genes (Mangelsdorf et al., 1994). Although the physiological roles of CRABPs are not clear at present, they are thought to be involved in cellular transport and metabolism of RA (Ong et al., 1994). Two isoforms (CRABP-I and CRABP-II) have been characterized. Both CRABP-I and CRABP-II bind specifically to all-trans-retinoic acid, but they differ in affinity for RA, expression pattern and regulation. It appears that the two isoforms may have distinct functions. The idea is supported by the fact that the sequence identity of human and mouse CRABP-I (99.3%) or human and
mouse CRABP-II (93.5%) is much higher than the sequence identity (13.1%) between the two isoforms from the same source. Four conserved residues (Arg-111, Leu-121, Arg-132 and Tyr-134 in CRABP-II) lie at the bottom of the RA binding pockets of CRABPs and interact with the carboxyl group of RA (Kleywegt et al., 1994). Site-directed mutagenesis studies have shown that the two arginine residues are important for binding of RA (Zhang et al., 1992; Chen et al., 1995). However, the affinities of these mutants for RA have not been quantitatively determined because the current RA binding assays are inapplicable to mutants with greatly decreased affinity for RA. We have developed a novel competitive binding assay for measuring the dissociation constants of the site-directed mutants of CRABPs. We have used this novel method, in conjunction with site-directed mutagenesis and NMR, to evaluate the contribution of Leu-121 of CRABP-II to binding of RA. The results show that Leu-121 is also important for binding of RA and contributes to the binding energy by ~1.4 kcal/mol.
II. Experimental Procedures
A. Site-Directed Mutagenesis
The oligonucleotide for making the L121A mutant was 5'-GGAACTGATCGCGACCATGACG-3'. The mutant was generated by the method of Kunkel (1985) and screened by DNA sequencing. In order to ensure that there were no unintended mutations in the mutant, the entire sequence of the mutated gene was determined. Both the wild-type and mutant proteins were purified by ion exchange chromatography using DEAE-cellulose DE53 followed by gel filtration using Sephadex G-50 (Wang et al., 1996).
B. Competitive Binding Assay
The assays were designed to measure the affinity of a mutant for RA relative to that of the wild-type CRABP-II. The proteins were dissolved in a phosphate buffer (4 mM NaH2PO4, 16 mM Na2HPO4, 150 mM NaCl, pH 7.3). The concentrations of the protein stock solutions were measured by OD280 using the absorption coefficient of 19,480 M⁻¹ cm⁻¹ for CRABP-II. RA stock solutions were prepared in absolute ethanol. The concentrations of the RA stock solutions were determined by OD336 using the absorption coefficient of 45,000 M⁻¹ cm⁻¹. The assays were carried out in equilibrium dialysis cells at room temperature. The two compartments of each dialysis cell were separated by a semipermeable membrane with a molecular weight cutoff of 6-8 kDa. One compartment was filled with the wild-type CRABP-II (1 ml), and the other with L121A. An equal amount of [³H]RA (100 nM) was added to each compartment. The proteins in both compartments were in large excess of RA
(>20-fold). Samples of 100 µl were taken from the two compartments after various times of incubation at room temperature, mixed with 5 ml of scintillation fluid and counted in a liquid scintillation counter. The equilibria in the two compartments that contain the wild-type CRABP-II and mutant proteins can be described by Eq. (1) and Eq. (2):

$$K_{d(WT)} = \frac{[WT][RA]}{[WT \cdot RA]} \tag{1}$$

$$K_{d(MT)} = \frac{[MT][RA]}{[MT \cdot RA]} \tag{2}$$

where WT, MT, WT·RA and MT·RA represent the wild-type CRABP-II, a CRABP-II mutant and their RA complexes, respectively. At equilibrium the concentration of free RA is the same in both compartments. Therefore

$$\frac{K_{d(MT)}}{K_{d(WT)}} = \frac{[MT][WT \cdot RA]}{[WT][MT \cdot RA]} \tag{3}$$

Since the concentrations of the proteins were much greater than their respective dissociation constants and the concentration of RA, $[WT] \approx [WT]_{total}$, $[MT] \approx [MT]_{total}$, $[WT \cdot RA] \gg [RA]$, and $[MT \cdot RA] \gg [RA]$. Then the relative dissociation constant can be calculated by Eq. (4):

$$\frac{K_{d(MT)}}{K_{d(WT)}} = \frac{[MT]_{total}\, C_{WT}}{[WT]_{total}\, C_{MT}} \tag{4}$$

where $C_{WT}$ and $C_{MT}$ are the measured radioactivities of the two compartments containing the wild-type CRABP-II and the mutant, respectively.

It turned out that the system could not reach equilibrium in 2 days, presumably because too little free RA was present in solution to diffuse across the membrane. Since RA is not stable even in the dark, the assay was redesigned to match the equilibrium conditions by varying the ratio of the protein concentrations of the mutant and the wild-type ($[MT]_{total}/[WT]_{total}$). Thus the concentration of the mutant was varied while keeping the concentration of the wild-type at ~2 µM. Initially the concentration of the mutant was increased in an exponential manner (e.g., 2, 20, 200 µM). Then it was varied in a small range. Since an equal amount of RA was added to the two compartments of the dialysis cell, the two compartments should have the same RA concentration and radioactivity at the beginning of each assay. If $[MT]_{total}/[WT]_{total} \neq K_{d(MT)}/K_{d(WT)}$, there would be a net transfer of RA across the semipermeable membrane separating the two compartments. Thus the radioactivity counts of the two compartments ($C_{WT}$ and $C_{MT}$) would differ after incubation for a certain period. When $[MT]_{total}/[WT]_{total} < K_{d(MT)}/K_{d(WT)}$, then $C_{WT} - C_{MT} > 0$. When $[MT]_{total}/[WT]_{total} > K_{d(MT)}/K_{d(WT)}$, then $C_{WT} - C_{MT} < 0$. When $[MT]_{total}/[WT]_{total} = K_{d(MT)}/K_{d(WT)}$, then $C_{WT} - C_{MT} = 0$.
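The sign rule described above can be sketched in a few lines: given a series of assays at different mutant-to-wild-type concentration ratios, the relative dissociation constant is bracketed by the two neighbouring ratios at which the count difference changes sign. This is only an illustration; the function names are hypothetical and the count differences in the usage example are invented for demonstration, not measured data.

```python
def relative_kd(mt_total, wt_total, c_wt, c_mt):
    """Eq. (4): Kd(MT)/Kd(WT) from total protein concentrations and counts."""
    return (mt_total * c_wt) / (wt_total * c_mt)


def bracket_relative_kd(points):
    """Bracket Kd(MT)/Kd(WT) from (ratio, C_WT - C_MT) pairs.

    ratio is [MT]total/[WT]total. A positive difference means the ratio is
    below the relative Kd, a negative one means it is above. Returns the pair
    of neighbouring ratios around the sign change, (r, r) if a difference is
    exactly zero, or None if no sign change was sampled.
    """
    ordered = sorted(points)
    for ratio, diff in ordered:
        if diff == 0:
            return (ratio, ratio)
    for (r1, d1), (r2, d2) in zip(ordered, ordered[1:]):
        if d1 > 0 > d2:
            return (r1, r2)
    return None


# Hypothetical titration: ratios and invented count differences (cpm).
assays = [(1, 520), (4, 310), (8, 90), (12, -110), (20, -400)]
bracket = bracket_relative_kd(assays)
print(bracket)
```

Narrowing the spacing between the two bracketing ratios in a second round of assays tightens the estimate, which mirrors the coarse-then-fine variation of the mutant concentration described in the text.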
C. NMR Spectroscopy
NOESY was performed at 32 °C on a VXR-500 spectrometer operating at a proton frequency of 500 MHz. The protein was dissolved in 20 mM sodium phosphate, pH 7.5 (direct pH meter reading), 100 mM NaCl, 5 mM DTT in D2O. The protein concentration was ~2 mM. The data were acquired in the hypercomplex mode with a mixing time of 150 ms (Jeener et al., 1979; Macura & Ernst, 1980). The spectral width was 7200 Hz in both dimensions. 2048 complex points in the t2 dimension and 256 complex points in the t1 dimension were acquired. 96 transients were collected for each FID. Data processing was performed on a Sun Sparc 10 station using VNMR software from Varian. The time domain data were zero-filled once and multiplied by shifted sinebell or Gaussian functions before Fourier transformation in both dimensions. Chemical shifts were referenced to internal sodium 3-(trimethylsilyl)-propionate-2,2,3,3-d4.
III. Results and Discussions
A. Competitive Binding Assay
Two types of methods have been in general use for measuring binding of RA to CRABPs: fluorometry and radiometry. The radiometric method involves separation of bound from free RA by dextran-coated charcoal, gel filtration and other means. Substantial loss of bound ligand during the separation process makes the method unsuitable for measuring the dissociation constants of site-directed mutants with greatly decreased affinity for RA. The very limited solubility of RA in water (~200 nM; Szuts & Harosi, 1991) also makes the fluorometric method inapplicable for determining the dissociation constants of these mutants. Studies of the quantitative structure-function relationships of CRABPs have been hampered by the lack of methods for measuring the affinities of site-directed mutants for RA (Zhang et al., 1992; Chen et al., 1995). We have developed a novel competitive binding assay for measuring the affinities of site-directed mutants for RA relative to that of the wild-type CRABP. The essence of the method is to monitor the competition between a mutant and the wild-type protein for binding of limited RA. Equilibrium dialysis cells are used for the assays. The two compartments of each dialysis cell are filled with the wild-type and mutant proteins, respectively. The absolute concentration of RA is not important as long as the concentration of free RA is much smaller than that of bound RA. There is no need to separate bound from free RA. The transfer of RA from one compartment to the other is determined by measuring the radioactivities of the samples taken from the two compartments. The direction of the net transfer is dependent on the relative affinity of the proteins and the ratio of the protein concentrations of the two compartments. A representative result is shown in Figure 1. When the ratio of
the concentrations of the two proteins ([L121A]/[WT]) is < 8, there is a net transfer of RA from the compartment containing L121A to the compartment containing the WT. When the ratio of the concentrations of the two proteins is > 12, there is a net transfer of RA from the compartment containing the WT to the compartment containing L121A. Since the relative $K_d$ lies between the points with opposite net transfers, the $K_d$ of L121A relative to that of the WT, $K_{d(\mathrm{L121A})}/K_{d(\mathrm{WT})}$, is ~8-12.
Determination of the relative dissociation constant of a point mutant is sufficient for estimating the energetic contribution of the amino acid residue to ligand binding ($\Delta\Delta G = RT\ln(K_{d(\mathrm{mutant})}/K_{d(\mathrm{WT})})$). The method can also be used for measuring the relative dissociation constants of the mutants of other proteins that bind hydrophobic ligands, such as RA receptors, vitamin D receptors and steroid hormone receptors.
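As a worked illustration of the relation above, the following minimal Python sketch converts the relative dissociation constants bracketed by the assay (8 and 12, with ~10 as the midpoint) into a binding energy difference. The temperature of 298.15 K is an assumption made only for this example, and the function name is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def ddg_from_relative_kd(kd_ratio, temperature_k=298.15):
    """Return ddG = RT ln(Kd(mutant)/Kd(WT)) in kcal/mol.

    temperature_k defaults to 298.15 K, an assumed value for illustration.
    """
    if kd_ratio <= 0:
        raise ValueError("kd_ratio must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_ratio)


# Bounds on Kd(L121A)/Kd(WT) taken from the text, plus the ~10-fold midpoint.
ddg_by_ratio = {ratio: ddg_from_relative_kd(ratio) for ratio in (8, 10, 12)}

for ratio, ddg in ddg_by_ratio.items():
    print(f"Kd ratio {ratio:>2}: ddG = {ddg:.2f} kcal/mol")
```

Under the assumed temperature, the bracketed range of 8 to 12 corresponds to roughly 1.2 to 1.5 kcal/mol, which is in line with the ~1.4 kcal/mol quoted for Leu-121.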
Figure 1. Competitive binding assays for measuring the dissociation constant of the L121A mutant relative to that of the wild-type CRABP-II. The relative radioactivity (vertical axis) is the radioactivity count of the compartment containing the wild-type protein minus that of the compartment containing L121A. The horizontal axis is [L121A]/[WT].
B. Conformational Characterization by NMR
Since a decrease in the affinity of a mutant for RA may be caused by conformational changes, we compared the conformation of L121A with that of the wild-type protein by NMR. Parts of the NOESY spectrum of L121A are shown in Figure 2. We have recently made total sequential resonance assignments of the wild-type CRABP-II (Wang et al., in preparation). Eighteen interresidue NOEs between the aromatic protons in the wild-type protein have been identified and assigned. Of these 18 NOE cross peaks, 16 can be identified in the NOESY spectrum of L121A. The other two NOEs are rather weak in L121A. We have not assigned the NOEs between aromatic and aliphatic protons. Qualitatively, the aromatic-aliphatic NOE patterns of the wild-type and L121A are very similar. The results suggest that the L121A mutant is properly folded and that its conformation is highly similar to that of the wild-type protein. Thus the decrease in the affinity of L121A for RA is unlikely to be caused by conformational perturbations.
C. Leu-121 Is Important for Binding of RA
The results of the competitive binding assay and NMR characterization of the L121A mutant suggest that Leu-121 is important for binding of RA. Leu-121 is located at the bottom of the RA binding pocket of CRABP-II. One of the methyl groups of Leu-121 is in close contact with the carboxyl group of the bound RA (Kleywegt et al., 1994). The distance between the carbon of the methyl group and the oxygen of the carboxyl group is 3.26 Å. The packing of the methyl group and the carboxyl group is very close to the optimal van der Waals interaction (Derewenda et al., 1995). On the basis of the relative dissociation constant, the van der Waals interaction between the methyl group of Leu-121 and the carboxyl group of RA contributes ~1.4 kcal/mol to the binding energy.
IV. Conclusions
A novel competitive binding assay has been developed for measuring the relative dissociation constants of the site-directed mutants of CRABPs. Leu-121 has been replaced with alanine by site-directed mutagenesis. The affinity of the mutant for RA is decreased ~10-fold as measured by the competitive binding assay. NMR characterization indicates that the conformation of the L121A mutant is very similar to that of the wild-type protein. The results taken together show that Leu-121 is important for binding retinoic acid and contributes ~1.4 kcal/mol to the binding energy.
Figure 2. Parts of the 500 MHz NOESY spectrum of L121A at 32 °C (axes: F1 and F2, in ppm). The mixing time was 150 ms. Only the interresidue NOEs are labeled. The identities of the NOEs are: A, F65-2,6H ··· F71-2,6H; B, F65-2,6H ··· F71-3,5H; C, F65-4H ··· W109-6H; D, F65-3,5H ··· W109-6H; E, F50-2,6H ··· W87-7H; F, F71-4H ··· W109-7H; G, F65-4H ··· W109-7H; J, W87-5H ··· F3-4H; K, F50-3,5H ··· F3-2,6H; L, F50-3,5H ··· W87-7H; M, F50-4H ··· W87-7H; N, F50-4H ··· F3-2,6H; O, F50-2,6H ··· W87-6H; P, F50-2,6H ··· F3-2,6H; Q, W87-5H ··· F3-2,6H; R, F3-4H ··· W87-4H.
Acknowledgments
We are indebted to Dr. Anders Astrom for providing us with the wild-type cDNA clone of human CRABP-II. This work was supported by funds from the REF Center of Protein Structure and Design and the Cancer Center at Michigan State University.
References
Chen, L. X., Zhang, Z.-P., Scafonas, A., Cavalli, R. C., Gabriel, J. L., Soprano, K. J., & Soprano, D. R. (1995) J. Biol. Chem. 270, 4518-4525.
Cogan, U., Kopelman, M., Mokady, S., & Shinitzky, M. (1976) Eur. J. Biochem. 65, 71-78.
Derewenda, Z. S., Lee, L., & Derewenda, U. (1995) J. Mol. Biol. 252, 248-262.
Jeener, J., & Ernst, R. P. (1979) J. Chem. Phys. 71, 4546-4553.
Kleywegt, G. J., Bergfors, T., Senn, H., Le Motte, P., Gsell, B., Shudo, K., & Jones, T. A. (1994) Structure 2, 1241-1258.
Kunkel, T. A. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 488-492.
Macura, S., & Ernst, R. P. (1980) Mol. Phys. 41, 95-117.
Mangelsdorf, D. J., Umesono, K., & Evans, R. M. (1994) In The Retinoids: Biology, Chemistry, and Medicine (Sporn, M. B., Roberts, A. B., & Goodman, D. S., Eds.) pp 319-349, Raven, New York.
Ong, D. E., Newcomer, M. E., & Chytil, F. (1994) In The Retinoids: Biology, Chemistry, and Medicine (Sporn, M. B., Roberts, A. B., & Goodman, D. S., Eds.) pp 283-317, Raven, New York.
Szuts, E. Z., & Harosi, F. I. (1991) Arch. Biochem. Biophys. 287, 297-304.
Wang, L., Li, Y., & Yan, H. (1996) submitted to J. Biol. Chem.
Zhang, J., Liu, Z.-P., Jones, T. A., Gierasch, L. M., & Sambrook, J. F. (1992) Proteins: Struct., Funct., Genet. 13, 87-89.
A Strategy for Predicting the Ligand Binding Competence of Recombinant Orphan Nuclear Receptors using Biophysical Characterization
Derril Willard, Bruce Wisely, Derek Parks, Martin Rink, William Holmes, Michael Milburn, and Thomas Consler
Departments of Molecular Sciences, Structural Chemistry, and Molecular Biochemistry, Glaxo Wellcome Research and Development, Research Triangle Park, NC 27709
I. Introduction
Nuclear receptors are a loosely related group of ligand-dependent transcriptional regulators with varying degrees of sequence homology. These proteins have been historically associated with the steroid hormone receptors, e.g. estrogen and glucocorticoid receptors, by virtue of DNA binding domain sequence homology comprising two zinc finger motifs. Many of these are orphan receptors, having no defined ligand. The nuclear receptors present tempting targets in the pursuit of a systems-based research approach since so many have now been cloned. However, when recombinant forms of a receptor are available before its cognate ligand has been identified, a problem arises. How does one determine if an orphan receptor is active for use in in vitro assays? Researchers now have access to unparalleled amounts of DNA sequence and genetic data. Families of homologous gene products can be studied with the intent of connecting specific proteins to various disease conditions. An obvious advantage to this wide scope of research is that once the mechanism of action has been elucidated for a few family members, this information can then be applied in a general sense to other homologues. Specifically, to apply this type of strategy to our studies, we approached the problem with two premises. First, we engineered recombinant constructs of orphan nuclear receptors to contain domains with hypothetical functional homology to receptors with known activities and ligands.
In particular, a great deal is known concerning retinoid X receptor α (RXRα) (1) and the domains necessary for DNA binding, retinoid binding, and self-/hetero-association. For the purposes of this study, PPARα, PPARδ, PPARγ, RXRα, and LXRα constructs were created to contain the putative ligand binding domains (LBD). The amino acid residues within this conserved contiguous region have been
shown to be both necessary and sufficient to demonstrate ligand binding competence for RXRα and other nuclear receptors. Structurally, the LBDs are composed primarily of multiple α-helices (2,3). Second, we began to characterize each nuclear receptor using a variety of biophysical techniques. The object of this scrutiny was to compile a set of physical traits for each protein and to use these characteristics as a basis to compare the orphans with those receptors having defined ligand binding ability. Purified proteins were characterized by low resolution solution structure, thermal stability and propensity to self-associate or to aggregate. Observations of protein expression levels and solubility were also considered as qualitative estimates of native structure. Circular dichroism spectroscopy (CD) was employed to probe secondary structural features. CD-monitored unfolding and differential scanning calorimetry (DSC) provided two separate measures of thermal stability. Static and dynamic light scattering (SLS and DLS, respectively) and analytical ultracentrifugation were used to determine solution values for molecular size, association and aggregation state.
II. Materials and Methods
A. Expression and purification
Nuclear receptor LBDs were engineered to have an amino-terminal polyhistidine fusion tag. Constructs were expressed in E. coli strain BL21 (DE3) using the T7 promoter. PPARα LBD was also expressed as a polyhistidine-tagged recombinant in baculovirus-infected T. ni. Protein purification in most cases was performed using a single nickel affinity step. Tagged protein was either initially purified by anion exchange chromatography and then adsorbed onto a Pharmacia Chelating Sepharose Fast Flow column charged with nickel, or adsorbed by nickel-chelating chromatography directly out of crude cell lysate. Proteins were eluted with a 0-1 M linear gradient of imidazole in the lysis buffer.
B. Structure and stability
Purified nuclear receptor proteins were buffer exchanged into PBS for CD spectral analysis using an Aviv model 62DS CD spectropolarimeter. The proteins were scanned repetitively in 0.1 cm quartz cuvettes from 197 to 300 nm in 1 nm wavelength increments. Ellipticity was converted to molar ellipticity for comparisons. Thermal transitions were measured with the same CD instrument. Proteins were monitored at 222 nm over a temperature range of 5-80°C. Data were collected in 1°C increments with a slope of 10°C/min. Initially, data were fit to a simple sigmoidal mathematical relationship for comparison. The half-point of the thermal transition, T1/2, was determined by iterative fitting using the Boltzmann equation. Data were also fit to the following thermodynamic model:
$$\theta_T = \frac{\theta_N + \theta_D \exp(u)}{1 + \exp(u)}$$
where $u = [(1/T - 1/T_D)(\Delta c_p T_D - \Delta H_{T_D}) + \Delta c_p \ln(T/T_D)]/R$
and where $\theta_T$ is ellipticity at $T$ (temperature in K), $\theta_N$ is the native protein ellipticity, $\theta_D$ is the unfolded protein ellipticity, $R$ is the gas constant, $T_D$ is the temperature at which the protein unfolding transition is half-complete, $\Delta H_{T_D}$ is the enthalpy change at $T_D$, and $\Delta c_p$ is the heat capacity change. DSC analyses were performed using a MicroCal MCS DSC unit. Data were analyzed to determine the midpoint of a two-state thermal transition (Tm) using the accompanying MicroCal Origin data analysis package. For each experiment approximately 1.5 mL of protein at concentrations ranging from 3-20 micromolar was analyzed during an increase in chamber temperature from 5-80 °C. The data were collected at a scan rate of 90 °C/hr and a filter period of 5 seconds. Scans were performed using PBS as the reference buffer.
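The two-state model used for the CD melts can be sketched in a few lines of Python. This is only an illustration of the functional form, not the fitting program used in the study: the baseline ellipticities, the enthalpy of 80 kcal/mol and the default heat capacity change of zero are hypothetical values, and only the transition temperature of 53.5 °C is taken from the chapter (the TD reported for PPARα).

```python
import math

# Gas constant in kcal/(mol*K); energies below are therefore in kcal/mol.
R_KCAL_PER_MOL_K = 1.987204e-3


def unfolding_exponent(temp_k, t_d, dh_td, dcp):
    """u = [(1/T - 1/T_D)(dcp*T_D - dH_TD) + dcp*ln(T/T_D)] / R."""
    return ((1.0 / temp_k - 1.0 / t_d) * (dcp * t_d - dh_td)
            + dcp * math.log(temp_k / t_d)) / R_KCAL_PER_MOL_K


def fraction_unfolded(u):
    """exp(u) / (1 + exp(u)), evaluated without overflow for large |u|."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def ellipticity(temp_k, theta_n, theta_d, t_d, dh_td, dcp=0.0):
    """theta_T = (theta_N + theta_D*exp(u)) / (1 + exp(u))."""
    f = fraction_unfolded(unfolding_exponent(temp_k, t_d, dh_td, dcp))
    return theta_n + (theta_d - theta_n) * f


# Hypothetical baselines and enthalpy; T_D = 53.5 C as reported for PPAR-alpha.
THETA_N, THETA_D = -6000.0, -2000.0
T_D_KELVIN = 53.5 + 273.15
DH_TD = 80.0  # kcal/mol, assumed for illustration only

for temp_c in (5, 40, 53.5, 65, 80):
    theta = ellipticity(temp_c + 273.15, THETA_N, THETA_D, T_D_KELVIN, DH_TD)
    print(f"{temp_c:5.1f} C  theta = {theta:8.1f}")
```

At T = T_D the exponent u is zero, so the curve passes through the midpoint of the two baselines, which is the sense in which T_D marks the half-complete transition. In practice the function would be handed to a nonlinear least-squares routine, with sloping baselines added if the data require them.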
C. Association/Solution State
Dynamic light scattering (DLS) measurements were performed on a DynaPro-801 instrument (Protein Solutions) and the data analyzed with the Autopro software package. All proteins were filtered through a 0.10 micron syringe filter and analyzed at 22°C. The translational diffusional coefficient ($D_T$) was obtained directly and the hydrodynamic radius ($R_H$) was derived by a rearrangement of the following equation:
$$D_T = \frac{kT}{6\pi\eta R_H}$$
where $k$ is Boltzmann's constant, $T$ is the absolute temperature and $\eta$ is the solvent viscosity. Static light scattering (SLS) measurements were performed at 22°C on a Wyatt Technology Dawn DSP laser photometer (argon-ion laser 488 nm). Data were analyzed with the Astra software package. All proteins were filtered through a 0.10 micron syringe filter and held at 30 psi during analysis. Each sample was run in duplicate and the refractive index increment was estimated to equal 0.180. Molecular weights were derived from a rearrangement of the following equation:
$$\frac{K^{*}c}{R(\theta)} = \frac{1}{M_w P(\theta)} + 2A_2c$$
where $R(\theta)$ is the excess intensity of scattered light, $c$ is sample concentration, $A_2$ is the second virial coefficient, $P(\theta)$ is the scattering function, which depends on the molecular configuration and approaches 1 for proteins, and $K^{*}$ is an optical parameter. Sedimentation equilibrium analytical ultracentrifugation was performed using a Beckman XL-A (Palo Alto, CA) centrifuge with two-channel or six-channel 12 mm charcoal-filled epon centerpieces. Runs were performed at 20, 25, and 30 krpm at 4°C. Equilibrium was judged to be achieved by the absence of change between plots of several successive scans after approximately 20 hours. Solvent density was determined empirically at 4°C and 20°C using a Mettler DA-110 density/specific gravity meter calibrated against water. The partial specific volume of each protein was calculated using the method of Cohn and Edsall (4). Temperature differentials were incorporated using the appropriate equation (5) modified from values of each amino acid at 25°C (6). Raw data were analyzed by the Beckman/Microcal Origin non-linear regression software package using multiple iterations of the Marquardt-Levenberg algorithm (7) for parameter estimation.
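The two light scattering rearrangements described in this section can be sketched as follows. This is a minimal Python illustration rather than the instrument software: the diffusion coefficient, water viscosity at 22 °C, concentration and virial coefficient are assumed example values, and the function names are hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann's constant, J/K


def hydrodynamic_radius(d_t, temp_k, viscosity_pa_s):
    """Stokes-Einstein rearranged: R_H = kT / (6*pi*eta*D_T), in metres."""
    return BOLTZMANN_J_PER_K * temp_k / (6.0 * math.pi * viscosity_pa_s * d_t)


def molecular_weight_from_sls(kc_over_r, conc, a2=0.0, p_theta=1.0):
    """Rearranged K*c/R(theta) = 1/(Mw*P(theta)) + 2*A2*c, solved for Mw.

    Units must be consistent, e.g. kc_over_r in mol/g, conc in g/mL and
    a2 in mol*mL/g^2, which gives Mw in g/mol.
    """
    reduced = kc_over_r - 2.0 * a2 * conc
    if reduced <= 0:
        raise ValueError("K*c/R must exceed 2*A2*c")
    return 1.0 / (p_theta * reduced)


# Assumed example inputs (not data from the study).
D_T = 9.0e-11             # m^2/s, hypothetical diffusion coefficient
TEMP_K = 22.0 + 273.15    # measurement temperature given in the methods
ETA_WATER = 9.55e-4       # Pa*s, approximate viscosity of water at 22 C

r_h_nm = hydrodynamic_radius(D_T, TEMP_K, ETA_WATER) * 1e9
print(f"R_H = {r_h_nm:.2f} nm")

# Round trip: build K*c/R for a 35 kDa species, then recover Mw.
conc, a2 = 1.0e-3, 1.0e-4
kc_over_r = 1.0 / 35000.0 + 2.0 * a2 * conc
print(f"Mw = {molecular_weight_from_sls(kc_over_r, conc, a2):.0f} g/mol")
```

Setting `p_theta` to 1 reflects the statement in the text that the scattering function approaches 1 for proteins. Small amounts of aggregate raise the apparent weight-average molecular weight sharply, which is why scattering is a sensitive probe of aggregation.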
D. Ligand Binding
Binding constants were determined for PPARγ and RXRα using ligands with known affinities. Each protein was incubated with the appropriate tritiated radioligand for two hours. Bound ligand was separated from free by gel filtration. The unbound radioligand was mixed with scintillation fluid and counted. For PPARα, PPARδ, and LXRα, similar assays were performed using compounds which had been implicated as nuclear receptor effectors in a separate cell-based assay.
III. Results
Nuclear receptor LBDs were constructed from the homologous regions of the native sequences (Figure 1). Expression and purification of the nuclear receptors used in this study proceeded as detailed. N-terminal histidine fusion tags provided an easy and similar method of purification for each protein. In general expression yields were good (over 20 mg protein/fermentation liter) with the exception of LXRα, which produced <5 mg protein/fermentation liter. Post nickel-chelating chromatography, the proteins were dialyzed into PBS for use in these studies, although long-term stability was found to vary from protein to protein over a range of buffer and storage conditions. In particular the histidine tag of PPARα produced in baculovirus-infected T. ni cells was found to be processed away in all purification attempts shortly after elution from the nickel-chelating chromatography step. Protein solubility, as evidenced by the ability to concentrate each nuclear receptor, was found to be sufficient for the studies involved. The ultimate concentration attainable with LXRα was however
PPARα  ..GMSHNAIRFGRMPRSEKAKLKAEILTCEHDIEDSETADLKSLAKRIYEAYLKNFN
PPARδ  ..GNSHNAIRFGRNPEAEKRKLVAGLTANEGSQYNPQVADLKAFSKHIYNAYLKNFN
PPARγ  ..GMSHNAIRFGRNPQAEKEKLLAEISSDIDQLNPESADLRQALAKHLYDSYIKSFP
LXRα   ..QAHATSLPPRASS
RXRα   ..GNKREAVQEERQ_RGKDR _NENEVESTSSANEDMPVERILEAELAVEP

PPARα  MNKVKARVILSGKASNNPPFVIHDMETLCMAEKTLVAKLVANGIQ_NKEAEVRIFHC
PPARδ  MTKKKARSILTGKASHTAPFVIHDIETLWQAEKGLVWKQLVNGLPPYKEISVHVFYR
PPARγ  LTKAKARAILTGKTTDKSPFVIYDMNSLNNGEDKIKFKHITPLQEQSKEVAIRIFQG
LXRα   PPQILPQLSPEQLGMIEKLVAAQQQCNRRSFSDRLRVTPWPMAPDPHSREARQQRFA
RXRα   KTETYVEANNGLNPS SPNDP VTN I

PPARα  CQCTSVETVTELTEFAKAIPGFANLDLNDQVTLLKYGVYEAIFAMLS SVNNKDGM
PPARδ  CQCTTVETVRELTEFAKSIPSFSSLFLNDQVTLLKYGVHEAIFAMLA__SIVNKDGL
PPARγ  CQFRSVEAVQEITEYAKSIPGFVNLDLNDQVTLLKYGVHEIIYTMLA_SLMNKDGV
LXRα   HFTELAIVSVQIVDFAKQLPGFLQLSREDQIALLKTSAIEVMLLETS RRYNPGSE
RXRα   CQAADKQLFT_LVEWAKRIPHFSELPLDDQVILLRAGWNELLIASFSHRSIAVKDGI

PPARα  LVAYGNGFITREFLKSLRKPFCDINEPKFDFAMKFNALELDDSDISLFVAAIICCGD
PPARδ  LVANGSGFVTREFLRSLRKPFSDIIEPKFEFAVKFNALELDDSDLALFIAAIILCGD
PPARγ  LISEGQGFMTREFLKSLRKPFGDFMEPKFEFAVKFNALELDDSDLAIFIAVIILSGD
LXRα   SITFLKDFSYNREDFAKAGLQVEFINPIFEFSRANNELQLNDAEFALLIAISIFSAD
RXRα   LLATGLHVHRNSAHSAGVGAIFDRVLT ELVSKMRDMQMDKTELGCLRAIVLFNPD

PPARα  RPGLLNVGHIEKMQEGIVHVLRLHLQSWHPDDIFLFPKLLQKMADLRQL VTEHA
PPARδ  RPGLMWVPRVEAIQDTILRALEFHLQANHPDAQYLFPKLLQKMADLRQL VTEHA
PPARγ  RPGLLNVKPIEDIQDNLLQALELQLKLNHPESSQLFAKLLQKMTDLRQI _VTEHV
LXRα   RPWVQDQLQVERLQHTYVEALHAYVSIHHPHDRLMFPRMLMKLVSLRTL SSVHS
RXRα   SKGLSNPAEVEALREKVYASLEAYCKHKYPEQPGRFAKLLLRLPALRSIGLKCLEHL

PPARα  QLVQIIKKTESDAALHPLLQEIYRDNY
PPARδ  QMMQRIKKTETETSLHPLLQEIYKDMY
PPARγ  QLLQVIKKTETDNSLHPLLQEIYKDLY
LXRα   EQVFALRLQDK KLPPLLSEIWDVHE
RXRα   FFFKLIGDTPIDTFLMEMLEAPHQMT

Figure 1. Primary sequence alignment of PPARα, PPARγ, PPARδ, LXRα, and RXRα LBDs.
considerably lower than for the other constructs, resulting in precipitation of LXRα in the range of 0.5 mg/ml. PPARα, PPARγ, PPARδ, and RXRα all exhibited classic α-helical structure by CD spectroscopy (Figure 2). None of the four proteins had significant ellipticities in the aromatic region and all began downslopes near 240 nm. Minima at 222 nm and 208 nm were present in each scan with a crossover point at or near 200 nm. However, the spectrum for LXRα did not exhibit characteristic α-helical traits. The spectrum did not agree well with any major structural class and showed a considerable amount of scattering in the far UV. Thermal stability of the nuclear receptor LBDs was determined by both CD (Figure 3) and DSC melts. All of the proteins except LXRα showed secondary
Figure 2. CD spectra of A, PPARα; B, PPARγ; C, PPARδ; D, RXRα; and E, LXRα LBDs, plotted as [θ] against wavelength (nm). [θ] denotes molar ellipticity in deg cm²/dmol, converted from raw ellipticity units. Data are averaged from multiple scans in each case and blank subtracted.
Figure 3. CD melts of A, PPARα; B, PPARγ; C, PPARδ; D, RXRα; E, LXRα LBDs measured as a function of temperature (°C) at 222 nm. [θ] denotes molar ellipticity in deg cm²/dmol, converted from raw ellipticity units. Protein concentrations range from 4-20 µM.
structural transitions over the temperature range 40-55°C. TD and Tm values were derived assuming a two-state model, although in the case of RXRα there is some indication that a more complex model should be applied. Melts were not reversible and in some cases resulted in precipitation of a portion of the sample. T1/2 and Tm values derived by both techniques showed good agreement for each protein. CD unfolding data for LXRα appeared by visual inspection to exhibit an unfolding transition and T1/2 similar to those seen for the other nuclear receptors. However, attempts to fit the LXRα scans with the same program used to fit the other data failed to identify a TD in the above range. By DSC analysis, LXRα did not exhibit a transition, as judged by the lack of any identifiable peak corresponding to ΔH over the experimental temperature range of 10-80°C. Association and aggregation states of the proteins were examined by SLS, DLS and sedimentation equilibrium analytical ultracentrifugation (Table 1). With the exception of SLS data for PPARδ, light scattering data indicated some degree of aggregation in all cases (DLS data for PPARα were not determined). Analytical ultracentrifugation indicated that PPARα, PPARγ and PPARδ existed predominantly as monomers with molecular weights in good agreement with those calculated from the respective sequences. RXRα was shown to exist as a monomer-tetramer self-associating system in the absence of ligand. Analysis of data from LXRα runs indicated that the protein existed as a monomer but the molecular weight derived, 46.3 kD, did not agree with the calculated and apparent monomer molecular weight of 31.5 kD.

Table 1. Summary of data from biophysical techniques.

Receptor  CD spectrum           T1/2/TD (CD)(a)  Tm (DSC)(a)             MW(b)  SLS(b)  DLS(b)  Centrifugation(b)  Binding(c)
PPARα     alpha helical         53.4/53.5        49                      31.5   agg(d)  n.d.    31.7               Yes
PPARγ     alpha helical         46.4/46.5        41.5                    34.8   agg     agg     35.5               Yes
PPARδ     alpha helical         51/51            51                      35.9   34      agg     36.8               Yes
RXRα      alpha helical         54/54            55                      27.7   agg     agg     monomer-tetramer   Yes
LXRα      mixed beta/undefined  49/*             no transition observed  31.5   agg     agg     46.3               No

(a) Values in units of °C; (b) values in units of kilodaltons; (c) gel filtration assay; (d) aggregation; * no value obtained.
PPARγ and RXRα LBDs exhibited the expected binding to their respective ligands. Novel ligands that had been implicated as PPARα and LXRα effectors in a cell-based assay were found to bind to PPARα and PPARδ, respectively, in the gel filtration assay. No ligands were found to bind to LXRα in the gel filtration assay.
IV. Discussion
In our evolving strategy, we are attempting to apply low resolution structural, thermodynamic and solution state analyses in a systematic approach to develop a set of parameters indicative of native proteins for target families. Orphan nuclear receptor constructs were engineered to be homologues of constructs of verified functional nuclear receptors. Obvious initial check points in the strategy would be the lack of protein expression or the expression of only insoluble material. Four of the five constructs reported in this study provided amply expressed, soluble proteins. The low levels of both expression and solubility of LXRα provided initial indications that the construct might not exist in a native form. Qualitatively, CD spectral analysis indicated that LXRα was different from the other constructs, all of which exhibited classical α-helical structure. The LXRα spectrum clearly did not contain double minima at 208 and 222 nm but instead had a narrower trough with a minimum around 220 nm. The spectrum is very similar to that taken from a PPARα sample which we heated to 60°C and then rescanned (data not shown). This unfolded PPARα did not exhibit ligand binding in the gel filtration assay. Also, LXRα showed heavy scattering below 220 nm. Taken altogether, the CD data suggest that recombinant LXRα as expressed does not share a similar structure with the other four constructs. CD and DSC thermal stability studies are convenient techniques for comparing the thermodynamic characteristics of a group of proteins. While recognizing that homologous primary structure does not imply that the constructs would display similar thermodynamic properties, obvious outliers of T1/2, TD, Tm, and ΔH calculations might be diagnostic for non-native proteins. Additionally, if significant misfolded populations were present, multiple transitions or less cooperative unfolding might be observed. All five constructs exhibited a range of 7°C for T1/2 as determined by CD.
The range as determined by DSC for PPARα, PPARγ, PPARδ, and RXRα was 13.5°C. LXRα did not produce a characteristic DSC melting profile over the temperature range 10-80°C. This difference between the two techniques for LXRα is difficult to interpret. The CD transition is a measure of the difference of ellipticity at 222 nm while DSC is a measure of enthalpy of the system. The mean difference between CD T1/2 and DSC Tm for each of the other constructs was less than 3°C. The solution behavior of a protein is often a good indicator of protein state. In this study both DLS and SLS were applied to probe for aggregation and solution molecular weight. Both techniques indicated that even constructs with verified ligand binding activity contained some proportion of aggregates. Light scattering in general is very sensitive to aggregates, even at levels representing only a small fraction of the protein populations. Analytical ultracentrifugation data indicated that RXRα existed in a monomer-tetramer equilibrium (monomer-dimer with compound, data not shown). The other four nuclear receptors existed as monomers. Ultracentrifugation findings for the
monomeric constructs are consistent with small amounts of aggregate (as suggested by light scattering data) being spun down during the runs. These populations would be invisible to later data analysis. The good ideal species fit for LXRα, which yielded an aberrant molecular weight, is difficult to explain. If this construct is improperly folded, assumptions (e.g. regular hydration shell) inherent in the molecular weight calculation may have been inappropriate. This quality of LXRα may also be indicative of non-native structure. The primary research impetus behind this study is the desire to establish ligands and/or mechanisms of action for each nuclear receptor. Using a simple gel filtration assay which has been validated for characterizing RXRα- and PPARγ-binding ligands, we were able to test novel ligands which had been indicated as orphan effectors by a cell-based assay system (data not shown). Compounds that were identified as effectors of PPARα and LXRα were shown to bind PPARα and PPARδ, respectively, but not LXRα. Additionally, unfolded PPARα resembled LXRα in CD spectrum (see above) and did not retain ligand binding ability. No small molecules were found to bind the recombinant LXRα LBD. We have presented data which suggest that we can determine the usefulness of a recombinant protein before we have an appropriate ligand in hand. These results are an encouraging start in our attempt to predict the binding competency of recombinant orphan nuclear receptors. Obviously, a considerable amount of work needs to be completed before a great deal of confidence can be placed in this type of predictive strategy. At present more than 20 human nuclear receptors have been identified and the number is certain to grow. Biophysical characterization of other nuclear receptors should aid in the construction and refinement of a database for use in further predictive studies.
Bibliography
1. Chen, Z.-P., Shemshedini, L., Durand, B., Noy, N., Chambon, P., and Gronemeyer, H. (1994) J. Biol. Chem., 269 (41): 25770-25776.
2. Wurtz, J.-M., Bourguet, W., Renaud, J.-M., Vivat, V., Chambon, P., Moras, D., and Gronemeyer, H. (1996) Nature Struct. Biol., 3(1): 87-94.
3. Parker, M.G. and White, R. (1996) Nature Struct. Biol., 3(2): 113-115.
4. Cohn, E.J., and Edsall, J.T. (1943) Proteins, amino acids and peptides as ions and dipolar ions, Rheinhold, New York, p. 157.
5. Laue, T.M., Shah, B.D., Ridgeway, T.M., and Pelletier, S.L. (1991) Analytical ultracentrifugation in biochemistry and polymer science (Harding, S.E., Rowe, A.J., and Horton, J.C., eds). The Royal Society of Chemistry, Cambridge, p. 102.
6. Durschlag, H. (1986) Thermodynamic data for biochemistry and biotechnology (Hinz, H.J., ed). Springer-Verlag, New York, p. 45.
7. Marquardt, D.W. (1963) J. Soc. Ind. Appl. Math., 11: 431-441.
SECTION VI Protein-Protein Interactions
Detection of Intra-Cellular Protein-Protein Interactions: Penicillin Interactive Proteins and Morphogene Proteins
S. Bhardwaj and R. A. Day
Department of Chemistry, University of Cincinnati, Cincinnati, Ohio 45221-0172
I. INTRODUCTION
Protein-protein interactions are recognized as central to understanding regulation of metabolic processes. There are, at present, no general methods for detection and measurement of protein-protein interactions in the intact cell. Current emphasis on this subject is the measurement of these interactions in vitro outside the cell (1). The use of recombinant DNA technology has been successful in many studies to demonstrate specific interactions; the "two-hybrid" system relies on transcriptional activity and on the modular nature of many particular sites (2). There are other "library based" methods which generally require varying degrees of genetic manipulation (1,2). We describe here a method that exploits the occurrence of salt-bridges formed by interacting proteins in their specific associations. The method applies to intra-cellular interactions in the intact cell, including membrane proteins which normally can only be solubilized at the expense of their normal interactions (3). Salt bridges are ion pairs within hydrogen bonding distances and can be converted to covalent links by the gaseous reagent cyanogen (C2N2, N≡C-C≡N, ethanedinitrile). C2N2 has been demonstrated to be a salt-bridge specific reagent and to generate covalent bonds in a carbodiimide-like mode (4,5,6). This is depicted in Reaction I:
R(R')NH2+ ··· -OOC-R'' + C2N2 → R(R')N-C(=O)-R'' + NH2C(=O)C≡N
Reaction I
The somewhat analogous carbodiimides will drive this intra-molecular process and also inter-molecular condensations such as are exploited in peptide bond formation. C2N2 will only do the former (4,5), i.e., convert pre-existing salt-bridged functional groups into a covalent link. An in vivo method must cause
minimal perturbation of the cell at the critical point in the process from which the data will arise. The methodology must also provide identification of the proteins in question. The method described here satisfies the above criteria. Fluorescence/spectroscopic (F/S) labeled, active site directed substrate analogues mark the proteins in question. Cyanogen rapidly converts salt-bridges into covalent links (6). It is our working hypothesis, borne out by the results to date, that many, if not most, protein-protein associations involve specific salt-bridges. The modified proteins in this study are the penicillin interactive proteins of Gram (+) and Gram (-) bacteria and the other morphogene proteins (MGPs) that interact with them. They are isolated by RPLC (7) following the F/S label. The proteins are unambiguously identified by computer based analysis of MALDI-TOF generated data from the tryptide and chymotryptide maps. The F/S labeled active site peptide is also identified.
II. EXPERIMENTAL MATERIALS AND METHODS
A. Synthesis of active site directed fluorescence/spectroscopic (F/S) labeled β-lactams
The synthesis of dansyl amino penicillanic acid (DNS-APA, I) was carried out according to the procedure outlined in Ref. (8). The ε-dansyl monocyclic β-lactam of lysine (II) was prepared essentially as described for another monocyclic β-lactam (9).
[Chemical structure drawing, labeled II]
B. Labeling of bacterial cells
The labeling of Gram positive and Gram negative penicillin binding proteins (PBPs) is outlined in Scheme 1. Log phase bacterial cells were harvested by centrifuging them at 12,000 rcf for 10 minutes at 4°C. The cell pellet was washed twice with the 50 mM Tris HCl buffer (pH = 7.2) and divided into three parts. One part of the pellet (control) was suspended in 500 µL of the Tris buffer and then 25 µL of 15% SDS was added. It was left standing for 5 minutes. The other two equally divided pellets were then resuspended in 200 µL of buffer each and treated with 100-150 µg of dansyl-APA (I) or the monocyclic β-lactam of N-ε-dansyl lysine (II). The cell pellets were incubated for 20 minutes at 37°C. Treated cells were washed 3-4 times with buffer by centrifuging them for 5-10 minutes at 12,000 rcf and at 4°C so as to get rid of excess label, as determined by visual inspection of the supernatant with UV irradiation. The cell pellets were resuspended in 200 µL of the Tris buffer. The
SCHEME 1: LABELING OF PBPS AND PBP-MORPHOGENE PRODUCT (MGP) COMPLEXES BY FLUORESCENCE SPECTROSCOPIC (F/S) PROBES
Bacterial cells are divided into three parts:
Control cells → centrifuge → soluble unlabeled PBPs → reverse phase chromatography
F/S probe, centrifuge → labeled cells → centrifuge → soluble F/S labeled PBP complexes → RPLC → chromatographically resolved F/S labeled PBP complexes
F/S probe, centrifuge → labeled cells → C2N2 → cross-linked PBP-MGP complexes → centrifuge → soluble F/S labeled PBP-MGP complexes → RPLC → chromatographically resolved F/S labeled PBP-MGP complexes

SCHEME 2: PROTEIN IDENTIFICATION
Chromatographically resolved F/S labeled PBP/PBP-MGP complexes → chromatographic fraction → enzymatic digestion → peptide fragments → MALDI-TOF/LSIMS mass spectrometry → search of the protein sequence database for multipeptides of individual proteins using the "MS-FIT" program → generation of sequence fragments with matching mol. wt.(s)
The label-only pellet was treated with 25 µL of 15% SDS and left standing for 5 minutes.
C. Cyanogen Treatment
Cyanogen may be obtained commercially or prepared readily in one step from AgCN (10). Cyanogen is far less toxic than HCN, H2S or CO, but should be handled carefully (11). The headspace over the second F/S labeled pellet was swept out with ~5 mL of cyanogen and the pellet was allowed to stand at room temperature for 30 minutes. The color of the cell pellet changed from cream to brown and the pellet disintegrated. This treatment was repeated three times with the same amount of C2N2. After the final C2N2 treatment, the cell pellet was treated with 25 µL of 15% SDS for 5 minutes. The total volume was brought to 1 mL by adding Tris HCl buffer to each set and the cells were lysed by sonication. Cellular debris was removed by centrifugation at 12,000 rcf for 10 minutes at 4°C. The supernatant was filtered through a 0.22 µm nylon membrane filter and diluted 5-10 fold before analysis by HPLC. The unused portion of the labeled supernatant was kept frozen at -80°C.
D. Reverse phase chromatographic separation of penicillin binding proteins (PBPs) and cross-linked PBP-morphogene products (See Scheme 2)
The chromatographic separation of hydrophobic membrane proteins, as well as hydrophilic "hybrid" proteins, was carried out using a Microsorb-MV C18 reverse phase column (Rainin Instruments; 4.6 x 250 mm, 300 Å pore size, 5 µm particle size). The mobile phase consisted of 99.9% acetonitrile with 0.1% trifluoroacetic acid as one solvent and 99.9% water with 0.1% trifluoroacetic acid as the other. The sample size was 50-80 µL, with a 100 µL loop attached to the injector. A gradient of 0-100% acetonitrile in 60 minutes was used, with monitoring at 280 nm and 320 nm. A minimum of ten collections were made and the separated fractions were pooled. The column was cleaned periodically by injecting 100 µL of trifluoroethanol (2-3 times) (12).
The rechromatography of collected protein fractions was done using solvent gradients related to the retention time of that fraction in the original chromatographic profile.
E. Enzymatic digestion of the RPLC purified penicillin binding proteins and "hybrid" proteins
The enzymatic digestion of the purified proteins was carried out as described (13). The enzymes TPCK-treated pancreatic trypsin (EC 3.4.21.4; Type XIII) and TLCK-treated pancreatic chymotrypsin (EC 3.4.21; Type VII) were from Sigma Chemical Company. The amount of proteinase used was ~2% of the concentration of the purified protein. The proteins were dissolved in 200 µL of 0.1 N ammonium bicarbonate buffer (pH 7.0), followed by addition of the enzyme dissolved in deionized water. The enzymatic digestion was carried out by incubating the samples for ~18 hours at 37°C and the reaction was stopped by addition of 10% trifluoroacetic acid solution. The proteins were lyophilized.
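The digestion step can be mirrored in silico when preparing candidate peptides for mass mapping. The following is a minimal sketch, not the procedure used by the authors: the function name `digest`, the simplified cleavage rules (trypsin after K/R, chymotrypsin after F/W/Y/L, no cleavage before proline) and the toy sequence are all assumptions made for illustration.

```python
def digest(sequence, cleave_after, blocked_by="P"):
    """Split a protein sequence C-terminal to the residues in `cleave_after`.

    Simplified rule (an assumption of this sketch): a site is skipped when
    the following residue is in `blocked_by` (proline by default).
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in cleave_after and not is_last and sequence[i + 1] not in blocked_by:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # remaining C-terminal peptide
    return peptides


TRYPSIN = "KR"         # cleaves after Lys and Arg
CHYMOTRYPSIN = "FWYL"  # simplified specificity: Phe, Trp, Tyr, Leu

toy = "AKRPLKGFDR"  # hypothetical sequence, for illustration only
print(digest(toy, TRYPSIN))       # ['AK', 'RPLK', 'GFDR']
print(digest(toy, CHYMOTRYPSIN))  # ['AKRPL', 'KGF', 'DR']
```

Using two proteinases with different specificities, as in the text, gives two independent peptide sets for the same protein.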
F. Mass spectral analysis of digested proteins
The samples were analyzed on the VG TofSpec-SE MALDI-TOF mass spectrometer in the reflectron mode with positive ion detection. The samples were spotted on the sample plate in an acetonitrile:water (60:40) or chloroform:methanol:TFA (1:1:0.1) mixture plus ammonium sulfate on alpha-C (α-cyano-4-hydroxycinnamic acid) matrix. The ionization of the samples was carried out with a Nd:YAG laser at 355 nm or a nitrogen laser at 337 nm. Some of the fractions were analyzed by SIMS on a Kratos 890 mass spectrometer equipped with a Phrasor Scientific SIMS source. The mass spectral data were analyzed by the MS-FIT program at the University of California, San Francisco.

III. RESULTS
A. F/S labeled PBPs
The F/S labeled β-lactams were prepared as described. On the basis of their mode of action against bacteria, I is termed "lytic" and II "non-lytic." That is, diastereomers of II produce the microbiological Liesegang effect (MLEs) (14). I and II each produced characteristic chromatographic patterns in the F/S labeled penicillin interactive proteins, the penicillin binding proteins (PBPs), by RPLC (7). The hydrophobic PBPs elute much later than the cytosolic proteins. When monitored at 320 nm, I-labeled PBPs showed a pattern of eight or nine major peaks (Fig. 1a) and >20 minor peaks. II-labeled PBPs appeared essentially as one peak (Fig. 1b). The major peaks were rechromatographed before further analysis of their proteinase digests by MALDI-TOF and/or SIMS. Two sets of peptides were generated from trypsin and chymotrypsin treatment. Characteristically, each digest showed among the peptides only one F/S labeled peptide, the active site labeled peptide. The entire digest was needed for peptide mass mapping (15). The details of the peptide mass-fingerprinting of only one of the several fractions analyzed (Fig. 1a) are shown here. Digestion of fraction 1 (Fig. 1a) by the two proteinases was carried out. Shown are the results of MALDI-TOF analyses of the resultant tryptides (Fig. 2a) and chymotryptides (Fig. 2b). After eliminating peaks that were ambiguous because they (a) could arise from more than one PBP or (b) could come from the proteinase itself, we were left with a set of peptide identities associated only with PBP 1B. A typical result is shown in Table 1. In this case the mass window used to examine the MALDI-TOF data was less than ±3 amu, consistent with the range used in other studies (15-18). Peptide mass-fingerprinting of the I-labeled PBPs (Fig. 1a) revealed the eight known PBPs of E. coli (Table 2).
Unlike SDS-PAGE analysis of labeled PBPs (19), the elution order by RPLC is not related directly to molecular weight but is dictated by hydrophobicity. II-labeled PBPs presented only one peak (Fig. 1b). Peptide mass-fingerprinting of the chymotryptic and tryptic digests revealed that it was a
Figure 1. Chromatographic Traces (monitored at 320 nm) of F/S Labeled Proteins from E. coli: (a) from I treated cells, (b) from II treated cells, (c) from cells treated sequentially with I and C2N2, and (d) from cells treated sequentially with II and C2N2.
[Figure 2. MALDI-TOF mass spectra of the digests of fraction 1 (Fig. 1a): (a) tryptides, (b) chymotryptides. Horizontal axis: M/Z, 400 to 2600.]
complex containing the eight well-known PBPs plus a candidate PBP, PHSE (Table 2).
B. Cyanogen linking of PBPs to other morphogene proteins
The I-treated E. coli cells were treated with C2N2 and again carried through the procedures up to and including peptide mass-fingerprinting. A new set of F/S labeled peaks, intermediate in retention time between the early putative cytosolic proteins and the later eluting PBPs, showed combinations of PBPs and known MGPs (20,21). The chromatographic profile of these "hybrid" proteins is shown in Figure 1c. Space allows us to show only the identity of the proteins found in fraction 1 (Table 3). The cyanogen trapped F/S labeled complexes reveal only PBPs and MGPs. Donachie (21) summarizes their functions. Chromatography of the E. coli proteins from the sequentially II and cyanogen treated cells (Fig. 1d) showed F/S labeled components. The peptide mass-fingerprinting of proteolytic digests of these purified components showed only PBPs and MGPs (20). Table 4 shows the constituent proteins of one component. Found here are all the PBPs, save PBPS and PBP7, and seventeen MGPs. Not shown here are the results from a parallel analysis of the Gram (+) B. subtilis, which gave entirely analogous results with I and II, with and without C2N2 treatment (20).

IV. DISCUSSION
A. Specificity
The most significant measurements of cellular processes must be done with no perturbation of the process. While zero perturbation is probably not attainable, minimization of the perturbation is nevertheless important. Cyanogen as a reagent readily diffuses into a cell and is itself apolar and noninteractive until it participates in a specific reaction. Its specificity is very high for hydration by bi-functional catalysis (6). When that catalysis is provided by a salt-bridge, it is accompanied by conversion of the salt-bridge to a covalent link. This study strongly supports a high specificity for pre-formed salt-bridges. The E. coli K-12 genome has almost completely been sequenced. It contains thousands of structural genes (3469 according to the data base Swiss-Prot r33). Of these, almost 70 have been identified as PBPs and MGPs (21). Without cyanogen treatment only PBPs were found in the F/S labeled proteins. With cyanogen only other MGPs were found covalently linked to the PBPs. That none of the other thousands of proteins became linked provides an extremely stringent internal control. It can be anticipated that this is a general property of protein-protein interactions and that any appropriately labeled protein could become a productive target for identification of proteins that may be interacting with it. It should be noted that in this study the C2N2 treatment
Table 1. Peptides Identified from Mass Spectral Data of Chymotryptides Shown in Fig. 2a Showing Identification of PBP1B. There were 7 matches to P02919, penicillin-binding protein 1B (PBP-1B) (94267.0 Da).

Mass Found   Mass Matched   Delta (Da)   Start   End   Peptide Sequence
                                                 540   (W)IADAPIAL(R)
1017.700     1016.556        1.144       415     422   (F)MQLVRQEL(Q)
1017.700     1019.451       -1.751       734     743   (L)YGASGAMSIY(Q)
1099.600     1100.610       -1.010       216     225   (F)VPRSGFPDLL(V)
1594.300     1592.811        1.489       202     215   (L)ITMISSPNGEQRLF(V)
2026.840     2027.997       -1.157       129     145   (L)EATQYRQVSKMTRPGEF(T)
2435.030     2437.040       -2.010       787     808   (W)TSDPQSLCQQSEMQQQPSGNPF(D)

Unmatched masses: 824.2, 1126.7, 1879.1, 2515.1
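The matching step behind Table 1 can be illustrated with a short calculation. This is a sketch only, not the MS-FIT algorithm: it assumes singly protonated, monoisotopic peptide masses (which reproduces the "Mass Matched" values for the peptides checked here) and applies the ±3 amu window mentioned in the text. The function names `mh_plus` and `match_masses` are hypothetical.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def match_masses(observed, peptides, window=3.0):
    """Pair each observed mass with every peptide whose [M+H]+ lies within the window."""
    hits = []
    for mass in observed:
        for pep in peptides:
            calc = mh_plus(pep)
            delta = mass - calc
            if abs(delta) < window:
                hits.append((mass, pep, round(calc, 3), round(delta, 3)))
    return hits


candidates = ["MQLVRQEL", "VPRSGFPDLL"]  # two PBP1B peptides listed in Table 1
observed = [1017.700, 1099.600, 824.2]   # two matched and one unmatched mass from Table 1
for hit in match_masses(observed, candidates):
    print(hit)
```

With a window this wide, a single observed mass can match more than one peptide (1017.700 appears twice in Table 1), which is why ambiguous peaks have to be eliminated before a protein is assigned.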
Table 2. Identification of PBPs of E. coli K12 (ATCC 29079) Labeled with F/S labeled β-lactams. The intact log phase cells were treated with the F/S β-lactams for 10 minutes and then disrupted by sonication. After removal of the debris by centrifugation (10,000 rpm, 10 min.), aliquots of the supernate were separated chromatographically as described (39). The F/S labeled peaks were isolated and rechromatographed on a shallower gradient. The purified peaks were divided and subjected to tryptic and chymotryptic digestion, respectively. The digests were analyzed by MALDI-TOF or by SIMS. The resultant mass spectral data were submitted for peptide mass fingerprinting at the UCSF Mass Spectrometry Facility.

F/S β-Lactam   F/S Labeled Chromatographic Fraction   Protein   Mol. Wt.
I                                                      PBP1A     93,636
I                                                      PBP4      51,798
I                                                      PBP1B     94,266
I                                                      PBP7      34,245
I                                                      PBP2      70,856
I                                                      PBP6      43,639
I                                                      PBP3      63,877
I                                                      PBP5      44,444
II                                                     PBP1A, PBP1B, PBP2, PBP3, PBP4, PBP5, PBP7, PHSE
was extreme; C2N2 driven reactions proceed at rates that vary by five orders of magnitude (4,6). For example, the subunits of hemoglobin have salt-bridges among them; it requires ~60 seconds to covalently cross-link them (4). Thus, in this study, if there were any possibility of adventitious salt-bridge formation and conversion to covalently linked functionalities, it should have been seen.
Table 3. Identification of PBPs and MGPs in Fraction 1 Isolated from the Sequential F/S β-Lactam DNS-APA (I) and C2N2 Treated E. coli Log Phase Cells. Parentheses indicate possible but not confirmed by one or more peptides unique to the parenthetic protein.

PBPs
Protein   Mol. Wt.     Protein   Mol. Wt.
PBP1A     93,636       (PBP4)    51,798
PBP1B     94,266       (PBP5)    44,444
PBP2      70,856       PBP6      43,639
PBP3      63,877       PBP7      34,245

Other Morphogene Proteins
Protein   Mol. Wt.     Protein   Mol. Wt.
(FTSH)    70,708       ALRI      39,000
(SLT70)   73,369       (ENVC)    41,317
LON       87,438       MURA      44,817
SECA      101,909      FTSN      45,987
MUKB      176,935      FTSY      54,513
Table 4. Identification of PBPs and MGPs in Fraction I L Isolated from the Sequential F/S DNS-Monocyclic β-Lactam (II) and C2N2 Treated E. coli Log Phase Cells. Parentheses indicate possible but not confirmed by one or more peptides unique to the parenthetic protein.

PBPs
Protein   Mol. Wt.     Protein   Mol. Wt.
PBP1A     93,636       (PBP4)    51,798
PBP1B     94,266       (PBP5)    44,444
PBP2      70,856       (PBP6)    43,639
PBP3      63,877

Other Morphogene Proteins
Protein   Mol. Wt.     Protein   Mol. Wt.
MUKB      176,935      (FTSY)    54,513
SECA      101,909      ENVC      41,317
(LON)     87,438       MURA      44,817
(FTSH)    70,708       (DDLA)    39,315
SLT70     78,969       ALRI      39,060
(MURE)    53,212       MREB      39,952
B. Membrane proteins
Membrane proteins are difficult to work with. Solubilization in a semifunctional form is possible with the aid of certain non-ionic detergents (3); however, the sheath of detergent molecules attached to the hydrophobic regions of these molecules compromises studies of the interactions of membrane proteins. Thus, a method applicable to an intact cell or organelle may have some value. The system here involves membrane proteins (PBPs) and cytosolic proteins (most of the other MGPs) and thus demonstrates that such proteins and their interacting ligands can be accessed.
C. Covalently linked complex of penicillin interactive proteins
The non-lytic II labeled a single peak in the chromatographic profile (Fig. 1b). Since the denaturing conditions of isolation and chromatography separate the I-labeled PBPs (Fig. 1a), it is clear that non-lytic II prevents their separation from a covalently bonded complex. As with I-treatment, II-treatment must be followed with C2N2 to show MGP association. More MGPs are associated in the latter case. The inference is that a covalently bonded network is normal and prevents perforation and lysis as had been observed (22); the typical lytic β-lactam causes this dissociation of the complex murasome situated
in a pre-existing opening in the cell wall which is, then, a factor of importance in bacterial cell death. When the complex breaks up the cytosol is released.
D. Sites of Salt Bridging
There are two important consequences of the C2N2 treatment for the analysis. The first is that some peptides that contain cross-linked residues will be diminished and may not appear in the MS-FIT analysis. The second is that such cross-linked residues identify the salt-bridged site(s); in the first phase of this study we identified only the interacting proteins.
E. Limitations of the Method
Thus far no data have been found which indicate that C2N2 produces inter-molecular condensations between proteins that have no normal association to form complexes. Neither in the published papers cited here nor in an extensive unpublished body of experiments from this laboratory, in which conditions had been optimized for inter-molecular amide bond formation, has any detectable condensation been seen. The only important limitation arises from the peptide mass fragment analysis. It appears to be completely unambiguous for single proteins (Table 2). As the number of proteins in a mixture or in a complex increases, the number of unambiguous M/Z values from the MALDI-TOF analysis decreases. An important observation is that when ambiguities arose in this study, only MGPs appeared in the C2N2 driven association. None of the other ~3400 proteins showed up. At this stage of development, it is clear that maximum complexity is limiting. For the ~70 MGP system there is a need to follow the time course of condensation, not only to deal with complexity but, more importantly, to collect information about rates and degree of association.

V. CONCLUSION
There is no way to know what fraction of protein-protein interactions involve salt-bridges; however, recent crystallographic studies of protein-protein complexes of cytokines and growth hormone with their receptors show multiple salt bridges at the interfaces (23).
Those that do not have salt bridges or hydrogen-bonded carboxylate carbinol associations (24) must remain undetectable by this technique. However, these early results show a high specificity and a good fraction of the expected interactions. It seems likely in the case of PBP-MGP interactions that we have seen vegetative state complexes and can expect a different set for cell division. When the septasome complex(es) and any other specialized complex(es) are characterized, we feel that a different set of MGPs will be found. This technique also provides a means of identifying each covalently linked, inter-protein salt-bridged site through identification of the peptide component contributed by each protein of the linked pairs and/or multimers. This is an aspect of this ongoing project. Such data will provide more specific
information on how the almost 70 MGP/PBPs mutually modulate each other and cell wall synthesis (20).

ACKNOWLEDGEMENTS
We thank Professor Jayasimhulu for the SIMS analyses and Mr. J. E. Carlson for the MALDI-TOF work. We appreciate being given access to the MALDI-TOF unit by Professor J. Monaco. We appreciate having access to the MS-FIT program at UCSF and assistance from K. Clauser. The cyanogen aspect of this work was developed in the past under NIH Grant GM42697.

REFERENCES
1. Phizicky, E.M. and Fields, S. (1995) Microbiol. Rev. 59, 94-123.
2. Allen, J.B., Walberg, M.W., Edwards, M.C. and Elledge, S.J. (1995) Trends Biochem. Sci. 20, 511-516.
3. Kyte, J. (1995) Structure in Protein Chemistry, Garland Publishing (New York), p. 520.
4. Day, R.A., Kirley, J., Tharp, R., Flicker, O., Strange, C. and Ghenbot, G. (1989) In (T.E. Hugli, Ed.) Techniques in Protein Chemistry, Academic Press, San Diego, pp. 517-525.
5. Day, R.A., Hignite, A. and Gooden, W.E. (1995) In (J.W. Crabb, Ed.) Techniques in Protein Chemistry VI, Academic Press, San Diego, pp. 435-442.
6. Day, R.A., Tharp, R.L., Madis, M.E., Wallace, J.A., Silanee, A.A., Hurt, P. and Mastruserio, N. (1990) Peptide Res. 3, 169-175.
7. Day, R.A., Ahluwalia, R. and Du, Y. (1994) LC-GC 12, 384-394.
8. Cartwright, S.J., Tan, A.K. and Fink, A.L. (1989) Biochem. J. 263, 905-912.
9. Ahluwalia, R., Day, R.A. and Nauss, J. (1995) Biochem. Biophys. Res. Commun. 206, 577-583.
10. Kirley, J.W., Day, R.A. and Kreishman, G.P. (1985) FEBS Lett. 193, 145.
11. Fassett, D.W. (1983) In (F.A. Patty, Ed.) Industrial Hygiene and Toxicology, 2nd Ed., Interscience Publishers, NY, p. 2003.
12. Bhardwaj, S. and Day, R.A. (1996) submitted to LC/GC.
13. Lee, T.D. and Shively, J.E. (1990) Methods Enzymol. 193, 361-374.
14. Day, R.A., Bhardwaj, S. and Bai, H. (1995) Miami Bio/Technology Short Reports 6, 88.
15. Henzel, W.J., Billeci, T.M., Stults, J.T., Wong, S.C., Grimley, C. and Watanabe, C. (1993) Proc. Natl. Acad. Sci. USA 90, 5011-5015.
16. Pappin, D.J.C., Hojrup, P. and Bleasby, A.J. (1993) Current Biol. 3, 327-332.
17. James, P., Quadroni, M., Carafoli, E. and Gonnet, G. (1993) Biochem. Biophys. Res. Commun. 195, 58-64.
18. Hynes, G., Sutton, C.W., U, S. and Willison, K.R. (1996) FASEB J. 10, 127-147.
19. Waxman, D.J. and Strominger, J.L. (1983) Ann. Rev. Biochem. 52, 825-869.
20. Bhardwaj, S. (1996) Ph.D. Dissertation, University of Cincinnati.
21. Donachie, W.D. (1993) In (M.A. de Pedro, J.-V. Höltje and W. Löffelhardt, Eds.) Bacterial Growth and Lysis. Metabolism and Structure of the Bacterial Sacculus, Plenum, New York, pp. 409-18.
22. Giesbrecht, P., Kersten, T., Madela, K., Grob, H., Blümel, P. and Wecke, J. (1993) in de Pedro et al. op. cit. pp. 393-407.
23. Ealick, S., Thiel, D., le Du, M., Walter, R., D'Arcy, A., Chene, C., Fountoulakis, M., Garotta, G. and Winkler, F. (1996) Prot. Science 5 (Suppl. 1) 59.
24. Karagozler, A.A., Ghenbot, G. and Day, R.A. (1993) Biopolymers 33, 687-692.
Use of Synthetic Peptides in Mapping the Binding Sites for hsp70 in a Mitochondrial Protein Antonio Artigues, Ana Iriarte, and Marino Martinez-Carrion Division of Molecular Biology and Biochemistry. School of Biological Sciences. University of Missouri-Kansas City, Kansas City, MO 64110
I. Introduction
The members of the 70-kDa heat shock protein (hsp70) family perform functions that are essential for cell viability, both under normal and stress conditions. Constitutively expressed hsp70s are thought to be involved in the folding and assembly of newly synthesized proteins, disassembly of oligomeric proteins, protein degradation, and the transport of nascent peptide chains across membranes (1, 2). The structure of hsp70 consists of a variable C-terminal peptide-binding domain and a highly conserved N-terminal ATPase domain. The crystal structure of a peptide-C-terminal domain complex shows that the peptide substrate is bound in an extended conformation through numerous interactions from both side chains and backbone groups (3). As with all molecular chaperones, hsp70 shows a remarkable selectivity for unfolded structures, but low specificity towards the sequence of its potential peptide substrates. Among the few consensus features identified in peptides binding to hsp70 with high affinity is the presence of internal hydrophobic residues (3, 4), which agrees with the proposed role of this chaperone in binding to hydrophobic regions of unfolded proteins normally hidden in the native structure. However, a great variety of synthetic peptides, including organellar targeting sequences containing basic residues (5-7), also bind to hsp70. Perhaps the binding sites recognized by hsp70 depend on the specific role fulfilled by the chaperone with a given substrate. In any case, the precise characteristics of the targeting sequences that determine the recognition and binding of hsp70 to its multiple substrates remain largely obscure. The cytosolic (cAAT) and mitochondrial (mAAT) isozymes of aspartate aminotransferase share a significant degree of sequence homology (63%), and almost identical crystallographic structures (8, 9). The mitochondrial isozyme is synthesized in the cytosol as a precursor protein (pmAAT) with a 29-residue presequence peptide that is required for targeting and import into mitochondria. Despite the broad substrate specificity of hsp70 mentioned before, this chaperone is able to discriminate between the two highly homologous AAT isozymes. Following either synthesis in cell-free extracts (10) or refolding from acid-denatured states, the mitochondrial isozyme binds to hsp70, whereas its cytosolic counterpart does not interact with this molecular chaperone. Thus, these isozymes provide a particularly attractive system to delineate sequence elements that might be binding motifs for hsp70 in pmAAT, but are absent or hidden in cAAT. An initial screening with a series of tetradecameric peptides corresponding to the complete amino acid sequence of pmAAT has led to the identification of several binding regions scattered over the full length of the polypeptide chain.
II. Methods
A. Protein Purification
Purification of pmAAT and cAAT was carried out as previously described (11,12). After concentration by ultrafiltration (Amicon Centricon, 30,000 molecular weight cutoff), the proteins were transferred to 2 mM Tris HCl, pH 7.5, a low ionic strength buffer suitable for the subsequent unfolding of the proteins at low pH (pH 2.0). Protein concentrations were estimated from the absorbance at 356 nm or 362 nm of the pyridoxal-5'-phosphate (PLP) cofactor bound to pmAAT or cAAT, respectively (molar absorption coefficient of 8,500 M⁻¹ cm⁻¹), and Mr=46,597 for pmAAT or Mr=46,399 for cAAT. Hsp70 was purified following published procedures (13). Following the last ammonium sulfate precipitation, the protein was exhaustively dialyzed against 25 mM Tris HCl, pH 7.5, 20 mM NaCl, 10 mM β-mercaptoethanol and kept at 4 °C until use. Hsp70 protein concentration was measured using a molar absorption coefficient at 280 nm of 47,800 M⁻¹ cm⁻¹ and Mr=70,000 (14).
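The concentration estimates above follow from the Beer-Lambert law, A = ε·c·l. A minimal sketch of the arithmetic is given below; the absorption coefficients and Mr values are those quoted in the text, while the absorbance reading, the 1 cm path length and the function name are assumptions made for illustration.

```python
def molar_concentration(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), in mol/L."""
    return absorbance / (epsilon * path_cm)


EPS_PLP = 8500.0     # M^-1 cm^-1, PLP cofactor bound to pmAAT or cAAT (from the text)
EPS_HSP70 = 47800.0  # M^-1 cm^-1 at 280 nm (from the text)
MR_PMAAT = 46597.0   # g/mol (from the text)

a356 = 0.085  # hypothetical absorbance reading at 356 nm, 1 cm cuvette
c = molar_concentration(a356, EPS_PLP)
print(f"{c * 1e6:.1f} uM pmAAT subunit = {c * MR_PMAAT:.3f} mg/ml")
```

Because one PLP is bound per subunit, the value obtained this way is a subunit concentration.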
B. Unfolding and Refolding of pmAAT and cAAT
For the reversible acid unfolding of pmAAT, a stock solution of the enzyme in 2 mM Tris HCl, pH 7.5 was denatured by addition of diluted HCl to pH 2.0, followed by incubation for 90 min at room temperature (15). Refolding of pmAAT was performed by rapid dilution of the unfolded protein to 1.8 µM final concentration in refolding buffer (40 mM Hepes, 0.1 mM EDTA, 1 mM DTT, 10 µM PLP, pH 7.5) at 10 °C. When studying the effect of hsp70 on the refolding of the enzyme, hsp70 (1.8 µM) was present in the refolding mixture before initiation of refolding by addition of the unfolded protein. For competition studies, hsp70 (1.8 µM) was preincubated with a 70-fold molar excess of each peptide (120 µM) in refolding buffer for 2 to 16 h at 10 °C before addition of acid-denatured pmAAT (1.8 µM). The reaction mixture was incubated for an additional 2 h to allow complete refolding of the pmAAT molecules that were not complexed with hsp70. Binding of a peptide to hsp70 would result in an increase in the fraction of pmAAT molecules that are allowed to recover full catalytic activity. Therefore, peptide competition was examined by determining the recovery of pmAAT activity in the presence of hsp70 and in the presence or absence of peptide. The effect of the peptides on the spontaneous refolding of pmAAT was analyzed by measuring the yield of reactivation in samples containing 120 µM peptide but no chaperone. The yield of reactivation was determined by measuring the recovery of transaminase activity. Data are expressed as percentage relative to the activity of a control sample maintained under identical conditions.
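The quantities used in the competition assay reduce to simple ratios. The sketch below shows one way to express them; the function names and the activity values are hypothetical, and only the concentrations (120 µM peptide, 1.8 µM hsp70) come from the text.

```python
def percent_reactivation(recovered_activity, control_activity):
    """Yield of reactivation relative to a control kept under identical conditions."""
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * recovered_activity / control_activity


def molar_excess(ligand_conc, protein_conc):
    """Fold molar excess of ligand over protein (same concentration units)."""
    return ligand_conc / protein_conc


def competition_gain(yield_with_peptide, yield_hsp70_alone):
    """Increase in reactivation yield attributable to the competing peptide."""
    return yield_with_peptide - yield_hsp70_alone


print(round(molar_excess(120.0, 1.8), 1))  # 66.7, i.e. roughly the 70-fold excess quoted

# Hypothetical activities in arbitrary units: refolded sample vs. native control.
with_peptide = percent_reactivation(38.0, 50.0)
hsp70_alone = percent_reactivation(10.0, 50.0)
print(with_peptide, hsp70_alone, competition_gain(with_peptide, hsp70_alone))
```

A peptide that does not bind hsp70 would leave the yield at the hsp70-alone level, giving a gain near zero.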
C. Synthesis and Purification of Peptides
The collection of 14-residue peptides spanning the complete amino acid sequence of pmAAT was a generous gift from Dr. B.M. Conti-Tronconi (University of Minnesota). The peptides were synthesized according to Houghten (16). The purity of the peptides was assessed by reverse phase HPLC using a C18 column (Vydac 218TP, 250 x 4.6 mm) and an acetonitrile/water gradient (5 to 70% over 30 min) containing 1% trifluoroacetic acid. The purity of the different peptide preparations ranged from 50 to 95%. Most of the contaminants were truncated peptides randomly missing amino acids as a result of incomplete coupling. For screening purposes, these peptides were used without further purification. Selected peptides were further purified by reverse phase HPLC on a BioRad C18 Hi-Pore RP-318 semipreparative column (250 x 10 mm), using the same gradient as before. Major peaks were collected and the full-length peptide peak was identified by amino acid composition analysis. Peptide concentration was determined from the molar composition obtained by amino acid analysis. The presequence peptide, MALLHSGRVLSGMAAAFHPGLAAAASARA, was also synthesized as a single 29-mer peptide in an Applied Biosystems 433A peptide synthesizer at the Molecular Core Facility of the School of Biological Sciences and purified by reverse phase chromatography as described above.
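The layout of such a peptide collection can be sketched as a sliding window. The example below assumes 14-residue windows advanced by 10 residues, so that neighbors share 4 residues; this step is inferred from the peptide sequences listed in Table I rather than stated in this section, and the function name `tile_peptides` is hypothetical.

```python
def tile_peptides(sequence, length=14, step=10):
    """Cut a sequence into overlapping peptides; the last one may be shorter."""
    if not sequence:
        return []
    tiles, start = [], 0
    while True:
        tiles.append(sequence[start:start + length])
        if start + length >= len(sequence):
            break  # this window already reaches the C-terminus
        start += step
    return tiles


# Presequence (29 residues) plus the first five residues of the mature
# protein, taken from the peptide sequences in Table I.
n_terminus = "MALLHSGRVLSGMAAAFHPGLAAAASARASSWWT"
for number, peptide in enumerate(tile_peptides(n_terminus), start=1):
    print(f"p-{number}", peptide)
```

The three windows printed are the p-1, p-2 and p-3 sequences of Table I; applied to a full-length sequence, the same scheme ends with a shorter C-terminal peptide.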
D. Enzymatic Activities
The transaminase activity was measured at 37 °C using L-aspartate and α-ketoglutarate as substrates in a coupled assay with malate dehydrogenase, as described previously (17). To measure the ATPase activity of hsp70, the chaperone (0.5 µM) was incubated at 37 °C in 40 mM Hepes, 45 mM KCl, 120 µM MgCl2, 60 µM ATP, pH 7.5, in the presence or absence of different synthetic peptides (120 µM). At different incubation times, a 20-µl aliquot was withdrawn and assayed for ATP content on a Turner TD 15-e Luminometer, using the ATP bioluminescence assay kit from Sigma and following the manufacturer's instructions. This assay measures the light emitted upon spontaneous decomposition of the adenylate-luciferin produced by luciferase from ATP and luciferin substrates (18). When ATP is the limiting reagent, the light emitted is proportional to the ATP present in the sample, and the concentration of ATP can be calculated by reference to an ATP standard curve. The rate of spontaneous hydrolysis of ATP was estimated in samples incubated at 37 °C in the absence of hsp70.
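Reading ATP concentrations off a standard curve amounts to fitting a straight line to the standards and inverting it. The sketch below uses an ordinary least-squares fit; the standard concentrations, light readings and function names are hypothetical, and only the 60 µM starting ATP concentration comes from the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (x, y) points; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def atp_from_light(light, slope, intercept):
    """Invert the standard curve: ATP concentration from a light reading."""
    return (light - intercept) / slope


# Hypothetical standard curve (ATP in uM, light in arbitrary units).
standards_uM = [0.0, 15.0, 30.0, 60.0]
light_units = [2.0, 32.0, 62.0, 122.0]
slope, intercept = fit_line(standards_uM, light_units)

remaining = atp_from_light(92.0, slope, intercept)  # reading from an hsp70 sample
blank = atp_from_light(118.0, slope, intercept)     # same time point, no hsp70
hydrolyzed_by_hsp70 = blank - remaining             # corrected for spontaneous hydrolysis
print(remaining, blank, hydrolyzed_by_hsp70)
```

Subtracting the no-chaperone sample, as in the last step, corresponds to the correction for spontaneous ATP hydrolysis described in the text.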
III. RESULTS AND DISCUSSION
A. Effect of hsp70 on the Refolding of pmAAT and cAAT
In vitro refolding of the acid-unfolded isozymes results in the reconstitution of native-like proteins. Figure 1 shows the yield of reactivation of cAAT and pmAAT (1.8 µM) in the absence or presence of hsp70 (1.8 µM). In the absence of hsp70, a significant recovery of activity (70-80%) can be achieved following spontaneous refolding of both proteins. However, when hsp70 is present, the yield of reactivation of pmAAT is reduced considerably, whereas the yield of reactivation of its cytosolic counterpart is not affected. Inhibition of pmAAT refolding by hsp70 results in the formation of insoluble aggregates of the hsp70-pmAAT complex. Hsp70 does not affect the activity of the native protein. This effect of hsp70 is specific, since addition of high concentrations (1 mg/ml) of other unrelated proteins such as bovine serum albumin, aldehyde dehydrogenase, or malic dehydrogenase does not prevent pmAAT refolding and reactivation (data not shown).
Fig. 1. The effect of hsp70 on the refolding of cAAT and pmAAT. Refolding of acid unfolded cAAT or pmAAT was performed by rapid dilution of the denatured enzymes in the refolding buffer to a final protein concentration of 1.8 µM. When present, hsp70 (1.8 µM) was added to the refolding buffer before initiation of the refolding reaction. After incubation for 120 min at 10 °C, the transaminase activity recovered was measured as indicated under Methods. Reactivation data are expressed relative to that of the native enzyme incubated under identical conditions.
B. Competition by pmAAT Peptides of Binding of Intact Unfolded pmAAT to hsp70
Taking advantage of the fact that hsp70 binds to unfolded pmAAT and markedly reduces the yield of reactivation (from 70% to 20%), we developed a competition assay to search for putative binding sites for hsp70 in the pmAAT polypeptide. In this assay, each peptide in a collection of 43 synthetic tetradecamers spanning the entire amino acid sequence of pmAAT was tested for its ability to compete with unfolded pmAAT for binding to hsp70. Since binding to hsp70 stops refolding of pmAAT, competition by a given synthetic peptide should result in an increase in the fraction of pmAAT activity recovered. Thus, the relative affinity of the different 14-mer peptides for binding to hsp70 was established by comparing the yield of pmAAT reactivation in the presence of hsp70 alone, or hsp70 that had been preincubated with a 70-fold molar excess of the peptide. The results of this competition assay are summarized in Table I.
Table I. Selective binding of pmAAT peptides to hsp70. Synthetic tetradecamer peptides corresponding to the amino acid sequence of rat liver pmAAT (p-1 to p-43 from the N-terminal to the C-terminal end) were tested for their ability to compete for the binding of unfolded pmAAT to hsp70 as described under Methods. The percentage of pmAAT activity recovered relative to that obtained in the presence of hsp70 alone (20%) represents an index of the magnitude of peptide competition of pmAAT binding to hsp70: <25% (-), 26-35% (±), 36-50% (+), 51-65% (++), >66% (+++). The maximum yield of reactivation in the absence of hsp70 is 75 ± 5%.
Peptide sequence  Number        Activity recovered (a)  Competition
                                (+ hsp70, %)
none                            20
presequence       pre-p (b)(*)  58                      ++
MALLHSGRVLSGMA    p-1 (c)       45                      +
SGMAAAFHPGLAAA    p-2 (c)       47                      +
LAAAASARASSWWT    p-3 (c)       50                      +
SWWTHVEMGPPDPI    p-4 (d)       nd                      nd
PDPILGVTEAFKRD    p-5 (*)       76                      +++
FKRDTNSKKMNLGV    p-6           37                      +
NLGVGAYRDDNGKP    p-7           32                      ±
NGKPYVLPSVRKAE    p-8           35                      ±
RKAEAQIAGKNLDK    p-9           20                      -
NLDKEYLPIGGLAD    p-10          6                       -
GLADFCKASAELAL    p-11          65                      ++
ELALGENSEVLKSG    p-12          80                      +++
LKSGRFVTVQTISG    p-13          75                      +++
TISGTGALRVGASF    p-14          45                      +
GASFLQRFFKFSRD    p-15 (*)      80                      +++
FSRDVFLPKPSWGN    p-16          38                      +
SWGNHTPIFRDAGM    p-17          30                      ±
DAGMQLQGYRYYDP    p-18          53                      ++
YYDPKTCGFDFSGA    p-19          43                      +
FSGALEDISKIPEQ    p-20          62                      ++
IPEQSVLLLHACAH    p-21 (e)      nd                      nd
ACAHNPTGVDPRDE    p-22          42                      +
PRPEQWKEMAAVVK    p-23 (*)      80                      +++
AVVKKKNLFAFFDM    p-24          52                      +
FFDMAYQGFASGDG    p-25          11                      -
SGDGDKDAWAVRHF    p-26          39                      +
VRHFIEQGINVCLC    p-27 (e)      nd                      nd
VCLCQSYAKNMGLY    p-28          52                      +
MGLYGERVGAFTW     p-29 (e)      nd                      nd
FTWCDKAEEAKRV     p-30          69                      +++
AKRVESQLKILIRP    p-31 (*)      80                      +++
LIRPLYSNPPLNGA    p-32          45                      +
LNGARIAATILTSP    p-33          52                      ++
LTSPDLRKQWLQEV    p-34          34                      ±
LQEVKGMADRIISM    p-35          46                      +
IISMRTQLVSNLKK    p-36          44                      +
NLKKEGSSHNWQHI    p-37          33                      ±
WQHITDQIGMFCFT    p-38 (e)      nd                      nd
FCFTGLKPEQVERL    p-39          52                      +
VERLTKEFSVYMTK    p-40 (*)      70                      +++
YMTKDGRISVAGVT    p-41 (*)      74                      +++
AGVTSGNVGYLAHA    p-42 (*)      71                      +++
LAHAIHQVTK        p-43 (*)      20                      -
(a) Reactivation data are expressed relative to a sample of native pmAAT incubated under identical conditions.
(b) A 29-residue peptide corresponding to the entire presequence region of pmAAT.
(c) Tetradecameric peptides with 4-residue overlapping regions spanning the presequence and the first five residues of the mature sequence.
(d) Binding of peptide p-4 to hsp70 could not be analyzed by the competition assay due to its strong inhibition of pmAAT refolding. nd, not determined.
(e) Peptides p-21, p-27, p-29 and p-38 show very low solubility in aqueous solutions.
(*) These peptides were selected for further characterization of their hsp70 binding by ATPase activity stimulation assays.
Addition of several synthetic peptides (p-5, p-12, p-13, p-15, p-23, p-30, p-31, p-40, p-41, p-42, rated as +++ in Table I) produced a complete reversal of the hsp70-induced reduction in pmAAT refolding, indicating that they bind strongly to the chaperone and thus prevent formation of a hsp70-pmAAT complex. Another group of peptides (labeled as ++ or + in Table I) induce only a partial recovery of pmAAT activity in the presence of hsp70, suggesting a lower affinity for binding to hsp70. Finally, equivalent concentrations of several peptides (those rated - or ± in Table I) had very little or no effect on the interaction of pmAAT with hsp70, indicating that they do not bind to the chaperone.
This competition assay would not be feasible if the synthetic peptides interfered with the spontaneous refolding of pmAAT. This was tested by monitoring the yield of reactivation in the presence of concentrations of peptide similar to those used in the competition experiments (120 µM) but without hsp70. Among the 43 tetradecamers tested, only p-4, whose sequence corresponds to the N-terminal peptide of the mature portion of pmAAT, had a marked effect on the recovery of pmAAT activity (Table I). In the presence of this peptide, the yield was reduced from about 70% to 8%, and there was extensive aggregation of the refolding polypeptide. Consequently, binding of this peptide to hsp70 could not be tested using the competition assay. Since in the native pmAAT dimer this N-terminal peptide interacts strongly with a hydrophobic pocket on the surface of the neighboring subunit (8, 9), the presence of an excess of the synthetic peptide may interfere with the dimerization step in the folding pathway.
On the other hand, four peptides (p-21, p-27, p-29 and p-38) have a very limited solubility in aqueous solutions and therefore could not be tested at concentrations similar to those used for the other peptides. When used at a lower concentration, they did not show any effect on either the spontaneous refolding of pmAAT or its binding to hsp70.
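The competition readout can be put on a common 0-to-1 scale by expressing each recovered activity as the fraction of the hsp70-induced loss that the peptide reverses. The sketch below is only an illustration: the normalization and the function name `fraction_reversed` are assumptions introduced here, not part of the published assay, and the two reference yields (20% with hsp70 alone, 75% without hsp70) are the values quoted in the legend of Table I.

```python
def fraction_reversed(activity_with_peptide,
                      activity_hsp70_only=20.0,
                      activity_no_hsp70=75.0):
    """Fraction of the hsp70-induced loss of reactivation reversed by a peptide.

    0.0 means the yield equals that with hsp70 alone (no competition);
    1.0 means the yield equals that of spontaneous refolding without hsp70.
    This normalization is a hypothetical convenience, not the authors' index.
    """
    window = activity_no_hsp70 - activity_hsp70_only
    if window <= 0:
        raise ValueError("yield without hsp70 must exceed yield with hsp70 alone")
    return (activity_with_peptide - activity_hsp70_only) / window


# Recovered activities (%) for a few entries of Table I
examples = {"pre-p": 58, "p-5": 76, "p-43": 20}
for name, activity in examples.items():
    print(f"{name}: {fraction_reversed(activity):.2f}")
```

Values slightly above 1 simply reflect the stated ± 5% uncertainty in the maximum yield.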
The peptide sequences with highest affinity for binding to hsp70 are not clustered in a specific region of the pmAAT polypeptide chain, but rather are scattered over the entire amino acid sequence of the enzyme. The sequence of these regions shows several of the characteristics described for peptides with high binding affinity to hsp70 (6, 19), such as the presence of hydrophobic and positively charged residues. Moreover, with the exception of the presequence peptides, they are localized within regions of the enzyme that are normally hidden in the folded state of the protein. However, sequence homology analysis of the different high affinity peptides did not allow for the identification of a consensus sequence, which agrees with the known broad specificity of hsp70 for peptide substrates (1, 20, 21). In addition, the majority of the peptides with high binding affinity to hsp70 map to regions in the amino acid sequence of pmAAT having the lowest degree of homology with the corresponding position in the cytosolic homologue.
In addition to the collection of tetradecamers having 4-residue overlapping ends (see Table I for sequences), we also tested the competition of a 29-residue peptide corresponding to the entire presequence peptide (pre-p in Table I). Interestingly, the effect of this peptide in the competition assay is more pronounced than that of each of the 14-mer peptides (p-1, p-2, and p-3) containing sequence elements from the same region (see first four entries in Table I). One possible explanation for this different behavior is that the targeting sequence recognized by hsp70 in the intact presequence peptide has been split in the three related shorter peptides. The effect of the presequence peptide is of particular interest since it is unique to the mitochondrial enzyme. The competition of the presequence peptide with pmAAT for binding to hsp70 is concentration dependent, with an apparent affinity constant of about 9.4 µM (data not shown).
Preincubation of hsp70 with saturating concentrations of the presequence peptide also stimulates the ATPase activity of hsp70 (see below, Table II). Binding of other mitochondrial presequences to hsp70 has been recently reported (5, 7).
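The layout of the peptide collection (14 residues per peptide, 4 residues shared between neighbors) can be reproduced with a short tiling routine. This is a minimal sketch; the function name `tile_peptides` and its parameters are hypothetical, and the test sequence is simply the first three peptides of Table I joined through their overlaps.

```python
def tile_peptides(sequence, length=14, overlap=4):
    """Split a sequence into peptides of `length` residues that share
    `overlap` residues with the following peptide. The last peptide may be
    shorter, as is the case for p-43 in Table I."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap
    peptides = []
    start = 0
    while start < len(sequence):
        peptides.append(sequence[start:start + length])
        if start + length >= len(sequence):
            break  # the sequence end has been reached
        start += step
    return peptides


# p-1 to p-3 of Table I, joined through their 4-residue overlaps
region = "MALLHSGRVLSGMAAAFHPGLAAAASARASSWWT"
for number, peptide in enumerate(tile_peptides(region), start=1):
    print(f"p-{number}: {peptide}")
```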
C. Stimulation of the ATPase Activity of hsp70 by High Affinity Binding Peptides
Hsp70 has a weak ATPase activity, with turnover rates ranging from 0.0004 to 0.0012 s⁻¹ (14). Peptides binding to the C-terminal domain of hsp70 induce a conformational change in the N-terminal domain (6, 23, 24), which leads to a discrete stimulation of the ATPase activity. Therefore, binding of substrates to hsp70 can also be tested by monitoring changes in its ATPase activity. For this reason, we next examined the effect of several of the pmAAT tetradecamers on the ATPase activity of hsp70 using a sensitive bioluminescence assay to monitor the decrease in ATP concentration with time. All of the peptides assayed were repurified by RP-HPLC before use. Stimulation of the ATPase activity correlated well with peptide binding data obtained from competition experiments. The presequence peptide and several of the 14-mer peptides showing maximal competition with pmAAT for binding to hsp70 induced a 1.5 to 2-fold ATPase stimulation (Table II). In contrast, p-43, the C-terminal peptide from pmAAT which did not bind to hsp70 according to the competition assay, showed no stimulation of the chaperone ATPase activity.
Table II. The effect of peptides on the hsp70 ATPase activity. Hsp70 ATPase activity was measured by monitoring the disappearance of ATP substrate over time using a bioluminescence assay as described under Methods. The concentration of the various peptides in the assay mixture was 120 µM.
Peptide   ATPase activity (nmole/min/mg)   Stimulation (a)
none      0.92                             1.00
pre-p     1.70                             1.85
p-5       1.82                             1.98
p-15      1.75                             1.90
p-23      1.34                             1.46
p-31      1.63                             1.77
p-40      1.50                             1.63
p-41      1.30                             1.41
p-42      1.30                             1.41
p-43      1.00                             1.09
(a) Activity in the presence of peptide/basal activity in the absence of peptide.
IV. CONCLUSIONS
Possible hsp70 binding sites on the primary structure of the pmAAT polypeptide have been identified by competition studies in which, prior to the initiation of pmAAT refolding, hsp70 had been preincubated with a series of synthetic tetradecameric peptides spanning the complete sequence of pmAAT. The rationale of this approach was based on two assumptions: i) hsp70 binds peptides in an extended, or at least flexible, conformation, and ii) sequence homology analysis and the use of peptides derived from a known sequence will allow the identification of peptide motifs responsible for the differential interaction of hsp70 with two homologous proteins, pmAAT and cAAT. The first assumption has recently been strengthened by the publication of the crystal structure of the hsp70 peptide binding domain (3). The second has led to the mapping of putative binding sites of polypeptide regions that show maximum sequence divergence between the two isozymes.
Fig. 2. Structural comparison between cAAT and pmAAT. The average sequence homology between cAAT and pmAAT was calculated using the Plotsimilarity program included in the Wisconsin Package of the Genetics Computer Group suite of programs (version 8.0, 1984) with a window size of seven residues, after the proteins were aligned inserting gaps where necessary to maximize homology. A score of 1.5 corresponds to a region of perfect homology. The dotted line represents the overall average similarity between the two proteins. Horizontal bars indicate the position of peptides showing maximum competition with pmAAT for binding to hsp70. The peptides are identified by numbers as assigned in Table I. The horizontal axis gives the residue position.
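The kind of running-average similarity profile described in the legend of Fig. 2 can be sketched as follows. This is not the Plotsimilarity algorithm itself: the scoring below is a simplified identity score (1.5 for identical aligned residues, 0 otherwise) chosen here as an assumption in place of the comparison table used by the GCG program, and the function name `windowed_similarity` is hypothetical.

```python
def windowed_similarity(aligned_a, aligned_b, window=7,
                        match_score=1.5, mismatch_score=0.0):
    """Running average of a per-position similarity score over two aligned
    sequences. Gap characters ('-') are scored as mismatches.

    Simplified stand-in: identity scoring only, not a substitution table.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not 1 <= window <= len(aligned_a):
        raise ValueError("window must be between 1 and the alignment length")
    scores = [
        match_score if a == b and a != "-" else mismatch_score
        for a, b in zip(aligned_a, aligned_b)
    ]
    # One averaged value per window position along the alignment
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


profile = windowed_similarity("ACDEFGHIKLMN", "ACDEFGHIKLMN")
print(profile)  # every window of identical residues scores 1.5
```

Regions where the profile falls below the overall average would correspond to the divergent stretches discussed in the text.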
Mechanistic studies on the structure-function of hsp70 have shown that upon binding of peptides there is a conformational change in hsp70 that results in a slight stimulation of hsp70 ATPase activity. Release of peptide substrates is expected to be a slow step and may require coupling to ATP hydrolysis and possibly the cooperation of other molecular chaperones. Consequently, in the absence of any other cytosolic factors, the binding of peptides to hsp70 is basically irreversible. Considering these properties, several strategies have been used to identify substrate recognition features of hsp70. An initial screening of a battery of peptides derived from pmAAT for their ability to compete with pmAAT for the formation of a complex with hsp70 has allowed for a fast, easy, and accurate identification of protein sequences that efficiently bind to hsp70. Confirmation of the binding of selected peptides has been obtained by measuring the stimulation of the hsp70 ATPase activity as a consequence of the conformational change induced upon substrate binding.
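The ATPase figures of Table II can be checked with two small conversions: the fold stimulation (activity with peptide divided by basal activity) and the turnover rate implied by a specific activity. The sketch below assumes a molecular mass of 70,000 Da for hsp70, a nominal value introduced here for illustration; the function names are hypothetical.

```python
def fold_stimulation(activity, basal_activity):
    """Activity in the presence of peptide divided by the basal activity."""
    return activity / basal_activity


def turnover_per_second(specific_activity_nmol_min_mg, molecular_mass_da=70_000.0):
    """Convert nmol ATP/min/mg protein into mol ATP per mol enzyme per second.

    molecular_mass_da is an assumed nominal mass for hsp70.
    """
    nmol_enzyme_per_mg = 1e6 / molecular_mass_da  # 1 mg divided by g/mol, in nmol
    per_minute = specific_activity_nmol_min_mg / nmol_enzyme_per_mg
    return per_minute / 60.0


print(round(fold_stimulation(1.70, 0.92), 2))   # pre-p row of Table II
print(f"{turnover_per_second(0.92):.4f} s^-1")  # basal activity
```

Under that mass assumption the basal activity of 0.92 nmole/min/mg corresponds to roughly 0.001 s⁻¹, which falls within the range of turnover rates quoted for hsp70.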
With the exception of the presequence-containing peptides, and in agreement with the generally accepted mechanism of hsp70 action, the peptides that bind with high affinity to hsp70 comprise sequences that are hidden in the native state of the protein. These peptides contain central hydrophobic and basic carboxyl terminal amino acids, but few acidic residues. More interestingly, a sequence homology comparison of the cytosolic and mitochondrial protein sequences shows that the mitochondrial peptides binding to hsp70 correspond to regions of major sequence dissimilarity between the two isozymes (Figure 2). This suggests that sequence divergences observed between the mitochondrial and cytosolic isozymes may have arisen as a consequence of biochemical specialization to ensure the different interaction of each enzyme with the cellular machinery responsible for protein folding and translocation in vivo, thus promoting efficient import into the organelle of pmAAT and rapid folding in the cytosol of cAAT. Detailed analyses of the binding properties of each peptide, including the accurate determination of the binding affinity of each region as well as the identification of the critical residues involved in the peptide-hsp70 interaction, are in progress. Information gathered from these studies should contribute to a better characterization of putative recognition sites responsible for the distinct interaction of the two isozymes with hsp70.
Bibliography
1. McKay, D. (1993) Advances in Protein Chemistry 44, 67-98.
2. Hendrick, J.D., and Hartl, F.U. (1993) Annu. Rev. Biochem. 62, 349-384.
3. Zhu, X., Zhao, X., Burkholder, W.F., Gragerov, A., Ogata, C.M., Gottesman, M.E., and Hendrickson, W.A. (1996) Science 272, 1606-1614.
4. Flynn, G.C., Pohl, M.T., Flocco, M.T., and Rothman, J.E. (1991) Nature 353, 726-730.
5. Endo, T., Mitsui, S., Nakai, M., and Roise, D. (1996) J. Biol. Chem. 271, 4161-4167.
6. Takenaka, I.M., Leung, S.M., McAndrew, S.J., Brown, J.P., and Hightower, L.E. (1995) J. Biol. Chem. 270, 19839-19844.
7. Schmid, D., Baici, A., Gehring, H., and Christen, P. (1994) Science 263, 971-973.
8. Malashkevich, V.N., Strokopytov, B.V., Borisov, V.V., Dauter, Z., Wilson, K.S., and Torchinsky, Y.M. (1995) J. Mol. Biol. 247, 111-124.
9. Jansonius, J.N., and Vincent, M.G. (1987) In Biological Macromolecules and Assemblies (Jurnak, F., and McPherson, A., Eds.) Vol. 3, pp. 187-285, John Wiley & Sons Inc., New York.
10. Lain, B., Iriarte, A., Mattingly, J.R., Jr., Moreno, J.I., and Martinez-Carrion, M. (1995) J. Biol. Chem. 270, 24732-24739.
11. Altieri, F., Mattingly, J.R., Jr., Rodriguez-Berrocal, F.J., Iriarte, A., Wu, T., and Martinez-Carrion, M. (1989) J. Biol. Chem. 264, 4782-4786.
12. Mattingly, J.R., Jr., Iriarte, A., and Martinez-Carrion, M. (1995) J. Biol. Chem. 270, 1138-1148.
13. Welch, W.J., and Feramisco, J.R. (1985) Molecular and Cellular Biology 5, 14949-14959.
14. Palleros, D.R., Welch, W.J., and Fink, A.L. (1991) Proc. Natl. Acad. Sci. U.S.A. 88, 5719-5723.
15. Artigues, A., Iriarte, A., and Martinez-Carrion, M. (1994) J. Biol. Chem. 269, 21990-21999.
16. Houghten, R.A. (1985) Proc. Natl. Acad. Sci. U.S.A. 82, 5131-5135.
17. Martinez-Carrion, M., Turano, C., Chiancone, E., Bossa, F., Giartosio, A., Riva, F., and Fasella, P. (1967) J. Biol. Chem. 242, 2397-2409.
18. Leach, F.R., and Webster, J.J. (1986) Methods in Enzymology 133, 51-70.
19. Fourie, A.M., Sambrook, J.F., and Gething, M.-J. (1994) J. Biol. Chem. 269, 30470-30478.
20. Glick, B.S. (1995) Cell 80, 11-14.
21. Hightower, L.E., Sadis, S.E., and Takenaka, I.M. (1994) In The Biology of Heat Shock Proteins and Molecular Chaperones (Morimoto, R.I., Tissieres, A., and Georgopoulos, C., Eds.) pp. 179-207, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
22. Mattingly, J.R., Jr., Iriarte, A., and Martinez-Carrion, M. (1993) J. Biol. Chem. 268, 26320-26327.
23. Takeda, S., and McKay, D.B. (1996) Biochemistry 35, 4636-4644.
24. Park, K., Flynn, G.C., Rothman, J.E., and Fasman, G.D. (1993) Protein Science 2, 325-330.
Interfacing Biomolecular Interaction Analysis with Mass Spectrometry and the use of Bioreactive Mass Spectrometer Probe Tips in Protein Characterization
Randall W. Nelson, Jennifer R. Krone, David Dogruel, Kemmons Tubbs
Department of Chemistry and Biochemistry, Arizona State University, Tempe AZ 85287-1604
Russ Granzow and Osten Jansson
Pharmacia Biosensor AB, S-751 82 Uppsala, Sweden
OVERVIEW
The past decade has seen the development of new and powerful technologies capable of the accurate characterization of biomolecules with extreme speed and sensitivity. Two of these techniques, Biomolecular Interaction Analysis (BIA) and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF), lend themselves particularly to such analyses; the former is ideally suited for the real-time investigation of biomolecular interactions, and the latter finds much use in the qualitative assessment of analytes. Although the two analytical approaches operate on mutually exclusive detection principles (either surface plasmon resonance detection of a refractive index change or the physical determination of the molecular mass of a gas-phase ion), they can share a common denominator: the use of affinity interactions in selecting the analyte. Interfacing of the two thereby creates a unique approach for the investigation of the kinetic parameters of biomolecular interaction (using BIA), and the unambiguous confirmation of the presence of targeted affinity ligands by direct mass analysis (using MALDI-TOF). In other applications, MALDI-TOF analysis can be extended beyond the primary role of protein molecular weight determination by combination with analytical enzymologies. The simplest use of enzymes in combination with MALDI-TOF is digestion of analytes into smaller fragments using endoproteases. The masses of the fragments are then determined in order to confirm or deny the sequence of the protein (or the presence of a given variant of the analyte).
Traditionally, digestions are performed with both the analyte and enzymes in solution. As a result, autolysis signals are frequently observed in the mass spectra. Enzyme autolysis can be eliminated by using proteases immobilized to chromatographic supports, but generally at the expense of speed and sensitivity in analysis. An alternative to using enzymatically active chromatographic supports is to covalently attach enzymes to the surface of the mass spectrometer sample introduction device (probe). The probe device thus serves a two-fold purpose: as the enzymatic agent used for modification of the analyte, and as a sample introduction device into the mass spectrometer. Over the past few years we have been developing new mass spectrometric approaches for the rapid, sensitive, and accurate characterization of proteins. Reported here are some of our findings on the interfacing of Biomolecular Interaction Analysis with mass spectrometry, and the use of enzymatically active (or bioreactive) mass spectrometer probe tips in the characterization of analytes.
TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press All rights of reproduction in any form reserved.
INTERFACING BIA WITH MASS SPECTROMETRY
I. Introduction
Biomolecular Interaction Analysis (BIA) refers to a number of techniques used in the characterization of bio-specific interactions. A form of the technique is based on the non-destructive detection principle of surface plasmon resonance (SPR), and is capable of monitoring the binding of an analyte to a surface-immobilized binding partner in real-time [1]. Briefly, a biosensor surface (chip) comprised of an affinity ligand-derivatized carboxylated dextran layer coupled to a thin gold surface is monitored using SPR while the chip surface is exposed to the complementary affinant. Differences in surface concentration resulting from ligand-affinant interaction are detected as a change in the SPR signal, expressed in resonance units (RU), with 1000 RU corresponding to a surface concentration of ~1 ng/mm². The resulting sensorgrams report the mass quantity of analyte bound to the chip surface as a function of time. Sensorgram data, as a function of analyte concentration, can then be used to determine kinetic parameters, and molar absorptivity constants, of the interaction [2].
Fig. 1 Biomolecular interaction analysis/mass spectrometry (BIA/MS). Biosensor chips are derivatized with affinant (or used with an affinant of streptavidin) and used in the BIA analysis of biological fluids. The chips are then introduced into a MALDI time-of-flight mass spectrometer and retained ligands analyzed by virtue of molecular weight. Labels in the schematic: 100 nm dextran, 50 nm gold, polarized light source, surface plasmon resonance signal, m/z.
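How sensorgram data relate to kinetic parameters can be illustrated with the simplest commonly used description, a 1:1 interaction in which the response R follows dR/dt = ka·C·(Rmax − R) − kd·R during injection. The text does not state which model was applied, so the sketch below is an assumption for illustration only, with hypothetical function and parameter names and made-up rate constants.

```python
import math


def association_response(t_s, conc_molar, ka, kd, rmax_ru):
    """Response (RU) at time t_s during analyte injection for an assumed
    1:1 interaction, starting from an empty surface.

    ka: association rate constant (1/(M*s)); kd: dissociation rate constant (1/s).
    """
    k_obs = ka * conc_molar + kd                  # observed rate constant
    r_eq = rmax_ru * ka * conc_molar / k_obs      # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t_s))


# Illustrative (made-up) constants: KD = kd / ka = 10 nM
ka, kd, rmax = 1e5, 1e-3, 100.0
for seconds in (0, 60, 180, 600):
    print(seconds, round(association_response(seconds, 1e-8, ka, kd, rmax), 1))
```

At an analyte concentration equal to kd/ka, the plateau is half of Rmax, which is one way such curves recorded at several concentrations constrain the constants.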
Although BIA is capable of providing pertinent information on ligand binding and kinetics, SPR detection is indirect. As a result, the identity of the bound affinant(s) may not always be certain. This situation can hold particularly true in complex systems where there exists the possibility of binding multiple, or unknown, affinants, either non-specifically or in competition for the surface bound ligand. MALDI-TOF mass spectrometry differentiates between species by detection of analytes at precise mass-to-charge (m/z) values. When coupled with affinity isolation, this direct detection enables the unambiguous determination, or possible identification, of the retained affinants. Interfacing of BIA with MALDI-TOF thus affords a powerful combination of techniques capable of real-time monitoring of biospecific interactions, and absolute determination of retained analytes. The coupling of BIA with MALDI-TOF mass spectrometry has therefore been explored [3,4,5]. An approach was taken in which BIA analyses were first performed; the retained analytes were then mass analyzed directly from the sensor chips (see Fig. 1).
II. Materials and Methods
A. Biomolecular Interaction Analysis
BIA analyses were performed on a rabbit anti-human IgG/human myoglobin system using a Pharmacia Biosensor BIAcore 2000 (Uppsala, Sweden). Individual flow cells of CM5 (carboxylated dextran) sensor chips were derivatized with polyclonal rabbit anti-human IgG using an amine-coupling protocol described previously [6]. Cyano-stabilized human myoglobin (400 ng/mL) in the presence of human serum albumin (20 mg/mL) was flowed (10 µL/minute, 20 mM HEPES, 0.005% Tween 20, 150 mM NaCl, 5 mM EDTA, pH 7.4 (HBS)) over the anti-myoglobin-derivatized flow cells for times ranging from 30 seconds to 3 minutes while monitoring the SPR signal. After incubation, the flow cell surfaces were (flow) rinsed with HBS for an additional 3 minutes before the chips were de-blocked from the instrument. Chips were dried and stored at ambient until mass spectrometric analysis.
B. MALDI Mass Spectrometry
Approximately 100 nL of a MALDI matrix, a-cyano-4-hydroxycinnamic acid (~50 mM dissolved in 1:2, acetonitrile:1.4% TFA), was applied to each of the four flow cells (500 µm x 2.0 mm) and allowed to air dry. The chips were next introduced into a prototype MALDI time-of-flight mass spectrometer built specifically for analysis of the BIA chips. Briefly, the instrument consists of a linear translation stage/ion source capable of the precise targeting of each of the four flow cells under a focused laser spot (with a spatial resolution on the order of the diameter of the laser spot; ~200 µm). Ions generated during a 4 ns laser pulse (355 nm; Q-switched frequency-tripled Nd:YAG) were accelerated to a potential of 25 kV (in a continuous extraction mode) over a single-stage ion extraction source distance of ~1 cm before entering a 1.5 m field-free drift region. Ion signals were detected using a 2-stage hybrid (channel plate/discrete dynode) electron multiplier. Time-of-flight spectra were produced by signal averaging the individual spectra from 50 - 100 laser pulses (using a 500 MHz; 500 MS/sec digital transient recorder). Custom software was used in acquisition and analysis of the mass spectra. Spectra were obtained in the positive ion mode and externally calibrated using equine cytochrome c (MW = 12360.7 Da) as a standard.
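The instrument parameters above fix the time scale of the measurement. A minimal sketch, assuming an idealized linear instrument in which an ion acquires the full 25 kV before a 1.5 m field-free drift (the time spent in the ~1 cm source is ignored), gives the drift time and a one-point external calibration based on m/z being proportional to the square of the flight time. The function names are hypothetical and this is not the custom acquisition software mentioned in the text.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19
DALTON_KG = 1.66053906660e-27


def drift_time_s(mass_da, charge=1, accel_volts=25_000.0, drift_length_m=1.5):
    """Flight time through the field-free region for an ion of the given mass.

    Idealized: the ion enters the drift region with kinetic energy z*e*V.
    """
    kinetic_energy_j = charge * ELEMENTARY_CHARGE_C * accel_volts
    velocity = math.sqrt(2.0 * kinetic_energy_j / (mass_da * DALTON_KG))
    return drift_length_m / velocity


def calibrated_mass(time_s, reference_time_s, reference_mass_da):
    """One-point calibration for singly charged ions: mass scales with time squared."""
    return reference_mass_da * (time_s / reference_time_s) ** 2


t_standard = drift_time_s(12360.7)    # equine cytochrome c standard
t_myoglobin = drift_time_s(17150.0)   # approximate mass of the retained myoglobin
print(f"{t_myoglobin * 1e6:.1f} microseconds")
print(round(calibrated_mass(t_myoglobin, t_standard, 12360.7)))
```

Under these assumptions the singly charged myoglobin arrives after roughly 90 microseconds, and the doubly charged ion arrives earlier by a factor of the square root of two.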
III. Results/Discussion
Sensorgrams of the antibody immobilization and myoglobin binding are shown in Fig. 2. Fig. 2A shows a sensorgram obtained for one of the flow cells during the amine-coupling of the anti-myoglobin IgG to the surface of the CM5 sensor chip. Anti-human IgG (~2 mg/mL in HBS) was flow incubated over the chip surface for ~7 minutes before a ~2 minute rinse with HBS, followed by a ~7 minute blocking with ethanolamine. The final difference in the sensorgram reading of ~15,000 RU translates to ~15 ng of antibody covalently linked to the surface of the 1 mm² area of the flow cell. Considering two binding sites per antibody molecule, a myoglobin binding capacity of 200 fmol is estimated for the flow cell. All four flow cells of the sensor chip were derivatized using identical conditions and resulted in virtually identical sensorgrams (i.e., < 1% deviation in the amount of antibody bound). Fig. 2B shows sensorgrams obtained during the incubation of the anti-myoglobin-derivatized flow cells with human myoglobin. Sensorgrams for flow cells two and three are shown. A difference in the sensorgram signal of ~250 RU translates to approximately 20 fmoles of myoglobin retained in flow cell 2 (during the 2.5 minute incubation). The sensorgram signal for flow cell 3 indicates roughly half that amount (~10 fmole) retained during the shorter (1 minute) incubation time.
Fig. 2 Sensorgrams of CM5/anti-human myoglobin IgG/HSA; myoglobin system. (A) Covalent immobilization of IgG to flow cells. A sensorgram reading of 15,000 RU is indicated corresponding to an antibody binding capacity of ~200 fmol myoglobin. (B) Myoglobin retained by flow cells 2 (FC2), and 3 (FC3). Retention of 20 fmol, and 10 fmol, of myoglobin is indicated for flow cells 2, and 3, respectively. Panel titles: Immobilization of anti-human myoglobin on CM5 chip (A); Human myoglobin bound on CM5 chip (B). Horizontal axis: Time (sec).
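The conversion from a sensorgram reading to an amount of protein follows from the relation 1000 RU ≈ 1 ng/mm² given in the Introduction. The sketch below reproduces the antibody estimate; the IgG molecular mass of 150,000 Da is an assumption introduced here (the text does not state the value used), and the function name is hypothetical.

```python
def retained_fmol(response_ru, molecular_mass_da, area_mm2=1.0,
                  ng_per_mm2_per_1000_ru=1.0):
    """Convert an SPR response into femtomoles of retained protein."""
    mass_ng = response_ru / 1000.0 * ng_per_mm2_per_1000_ru * area_mm2
    # ng divided by g/mol gives nmol; multiply by 1e6 to express in fmol
    return mass_ng * 1e6 / molecular_mass_da


# ~15,000 RU of immobilized antibody on a 1 mm^2 flow cell
antibody_fmol = retained_fmol(15_000, 150_000)  # assumed IgG mass
binding_sites_fmol = 2 * antibody_fmol          # two binding sites per antibody
print(antibody_fmol, binding_sites_fmol)
```

With that mass, 15 ng of antibody is about 100 fmol, and two sites per molecule give the 200 fmol binding capacity quoted in the text.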
Fig. 3 shows the mass spectra obtained from the direct MALDI-TOF analysis of flow cells 2 and 3 of the anti-myoglobin-derivatized/myoglobin-incubated CM5 sensor chip. Fig. 3 (lower) was one of ca. 5 mass spectra taken from the area within flow cell 2. Significant signal is observed for both the singly- and doubly-charged ion species of the myoglobin. A measured molecular mass of 17,150 ± 15 Da was found for the myoglobin by averaging the centroided mass values of the 5 spectra acquired from flow cell 2. This molecular weight is significantly higher (~0.4%) than that calculated for the mono-derivatized (cyano) myoglobin (MW = 17,080 Da). Considering that the myoglobin ion signals are fairly broad, the shift to higher mass is consistent with the attachment of multiple cyano groups to the myoglobin (creating a heterogeneous sample). Fig. 3 (upper) shows a mass spectrum obtained from within flow cell 3. From the sensorgram it was estimated that ~10 fmol of myoglobin was present within the area of the flow cell. Again ion signal is readily observed for the myoglobin. A measured mass of MW = 17,160 ± 15 Da was determined for the myoglobin using the average of ca. 5 mass spectra taken from within the area of flow cell 3.
Fig. 3 BIA/MS of CM5/anti-human myoglobin IgG/HSA; myoglobin system flow cells 2 (FC2) and 3 (FC3). Ion signals are observed in both spectra for the singly- and doubly-charged myoglobin. Retention of species other than the myoglobin is observed in flow cell 3 (marked by *), possibly due to non-specific interactions or the specific retention of myoglobin fragments.
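The positions at which the singly- and doubly-charged ions are expected, and the size of the mass shift discussed above, follow from simple arithmetic. The sketch assumes protonated ions, [M + zH] with charge z, in positive ion mode; the function names are hypothetical.

```python
PROTON_MASS_DA = 1.007276


def mz(neutral_mass_da, charge):
    """m/z of a protonated ion [M + zH]z+ (assumed ion type)."""
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def percent_difference(measured, expected):
    """Signed difference of the measured value from the expected one, in percent."""
    return 100.0 * (measured - expected) / expected


print(round(mz(17150.0, 1), 1), round(mz(17150.0, 2), 1))
print(round(percent_difference(17150.0, 17080.0), 2))
```

The difference of 70 Da between the measured mass for flow cell 2 and the calculated mono-derivatized mass corresponds to the ~0.4% shift quoted in the text.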
A few issues of BIA/MS are worth noting. The first is the comparable sensitivities of the two techniques. BIA analyses registering above the ~100 RU level are generally considered significant. This sensorgram response translates to ~5 fmole of a 20 kDa protein retained over an area of ~1 mm² (the area of a flow cell), an amount generally at the limit of detection of MALDI-TOF analysis (this is, of course, a general statement as the limits of detection observed during MALDI-TOF are highly dependent on acquisition factors, e.g., matrix and instrument, and the nature of the analyte). Furthermore, the overall sensitivity of the BIA/MS approach reported here (analysis of retained analytes directly from the sensor chip) is not compromised by sample losses associated with eluting the retained affinants and transfer to the mass spectrometer. In fact, there was no actual handling of the samples for mass spectrometry beyond the simple application of matrix solution to the sensor chip surface. While making no claims on the universality of the limits of detection, similar studies with other systems have demonstrated BIA/MS limits of detection comparable to, or lower than, those observed here [3,5]. A second aspect of the BIA/MS analysis is the observance of species in the mass spectra other than those targeted. Fig. 3 (upper) shows the presence of a number of lower molecular weight species retained along with the myoglobin. Blank analyses of flow cells derivatized with antibody and incubated with HBS/HSA buffer (no myoglobin) demonstrated the presence of a number of the lower molecular weight species; however, not all of those observed in Fig. 3 (upper). A combination of both nonspecific retention of background species, and specific retention of myoglobin fragments (present in the starting solution) is suggested. Non-specific retention (while of obvious concern) can be compensated for during BIA analysis by blank subtraction or saturation of the sensor chip surface.
That is to say that the BIA analysis is concerned with the change in response, due to the biospecific interactions defined by the immobilized affinity ligand, after a baseline measurement is established. It is not easy, however, to compensate for the specific binding of non-targeted ligands while simultaneously analyzing for targeted ligands. By direct detection of retained species at defined molecular weights, and incorporation of quantitative methodologies [7,8], MALDI-TOF mass spectrometry has the potential to compensate for such competitive binding.
BIOREACTIVE PROBE TIPS IN PROTEIN CHARACTERIZATION
I. Introduction
A particular strength of MALDI-TOF mass spectrometry is the ability to analyze complex biological mixtures with little or no prior sample workup. This ability allows for a number of intricate analyses directed at the characterization - from primary to quaternary structure, and post-translational modifications - of proteins. Several such analyses involve the use of enzymes to modify a protein or peptide prior to analysis of the resultant using MALDI-TOF. More often than not, digestions are performed free in solution; a process which allows the possibility of the enzyme autolysis. The resulting autolysis products are recognized in the mass spectrum as interferences and pose a hinderance to the analysis through potential mis-interpretation, or masking of true analyte signals. A way to eliminate such interferences is to covalently immobilize the enzymes to a solid support, the complex then used as the enzymatic reagent. When considering the MALDI-TOF analysis, the support of choice is in fact the mass spectrometer probe device, which, when enzymatically-derivatized, serves to both digest the analyte, and to introduce the digestion mixture into the mass spectrometer [9,10]. There are several advantages to performing digestions using enzymatically-derivatized
Biomolecular Interaction Analysis with MS and MS Probe Tips
499
mass spectrometer probe tips. First is an overall increase in sensitivity as sample losses (in transfer and handling) are minimized. Lack of sample loss is critical in maintaining limits of detection throughout the process which are comparable to conventional MALDI analyses (elimination of sample losses is also a contributing factor to the number of proteolytic fragments observed in the mass spectrum during mass mapping). A second advantage is (as stated) the absence of interfering, or background signals due to autolytic digestion of the enzyme. The enzyme is covalently anchored to the probe surface preventing association into the MALDI matrix DSP/isopropanol ISmin.
Fig. 4 General approach of the bioreactive MALDI mass spectrometer probe tips. Gold-plated probe tips are activated through the covalent attachment of enzymes (the general terminology of Au/enzyme is used to indicate the nature of the activated surfaces). The probe tips are then used for protein characterization by direct application of the analyte and time given for digestion. The digestions are stopped with the addition of a MALDI matrix, the reaction product-matrix mixture is allowed to dry, and the probe tips are inserted into the mass spectrometer for MALDI-TOF analysis.
(negating desorption/ionization), and also prohibiting the freedom necessary for autolysis (which would also produce interferences). Third, digestions can be performed on a time scale equivalent to that required for the MALDI analysis (a few minutes). Covalent anchoring of the enzymes is again largely responsible for the ability to perform digestions rapidly because high effective enzyme concentrations can be used without introducing interferences. Digestion rates can be further increased by using the probe tips at elevated temperatures (accelerating diffusion limited processes and equilibrium kinetics). Lastly, use of enzymatically-derivatized probe devices is
quite easy, requiring no more steps than a normal MALDI analysis (application of analyte and matrix to the probe). Reported here is the use of bioreactive mass spectrometer probe tips to serially digest myoglobin. The object of the serial digestion was to view simultaneously the relative stability of molecular fragments of myoglobin (generated during an initial, limited digestion of the myoglobin under denaturing conditions using pepsin-active tips at low pH) by exposing the fragment set to extensive digestion (using trypsin tips) under renaturing conditions.
II. Experimental
A graphic depiction of the experimental process is given in Fig. 4. Stainless steel probe tips were first sputter-coated with ~ 300 nm of gold and then activated by treatment with a dithiobis(succinimidyl propionate) (DSP)/isopropanol solution (for ~ 30 minutes). Probe tips were then rinsed vigorously with isopropanol and either used directly (for amine linkage) or further derivatized (for carbodiimide-mediated carboxylic acid linkage) by a 15 minute incubation with a solution of ethylenediamine (EDA):isopropanol:triethylamine (40:40:20%). Trypsin was linked through amine coupling by addition of the enzyme (0.1 mg/mL in 20 mM phosphate buffer, pH 8.0) directly to the DSP-derivatized probe tips. Pepsin was linked through carboxylate coupling by addition of the enzyme (0.1 mg/mL in acetate buffer, pH 4.5, containing 0.1 mg/mL 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide) to DSP/EDA-derivatized gold tips. Tips were prepared in batches (20 - 40) with the reactions performed in 50 mL conical tubes, generally overnight at ~ 4 °C, using ~ 0.5 mL of enzyme solution per probe tip. After incubation the tips were washed with liter volumes of ice-cold incubation buffer, dried, and stored at ambient temperature until needed. For clarity, tips are termed Au/enzyme to denote the gold surface and linked enzyme. Whale myoglobin (MW = 17,200.4 Da) was dissolved to 0.01 mg/mL (~ 0.6 µM) in 20 mM ammonium acetate buffer, pH 2.7, and allowed to stand for ~ 30 min. A one minute pepsin digestion was performed by application of 3 µL of the myoglobin solution directly to the surface of an Au/pepsin probe tip (maintained in a humidified environment at 60 °C). At the same time, 1.5 µL of a 20 mM phosphate buffer (pH 10) was applied to the surface of an Au/trypsin tip (maintained in high humidity at 60 °C).
After one minute, the tips were touched together, effectively transferring a portion of the peptic digest to the Au/trypsin tip (the combination of the two buffers resulted in a solution pH of ~ 7.5, as verified with pH paper). Immediately following, 1.5 µL of an α-cyano-4-hydroxycinnamic acid (ACCA) solution (in 1:2 acetonitrile:1.5% TFA) was applied to the ~ 2 µL of the digest mixture remaining on the Au/pepsin tip. Trypsin digestion was then allowed to proceed for 5 minutes before termination by addition of 1.5 µL of the ACCA matrix. Samples were allowed to air dry prior to insertion of the probes into the mass spectrometer. MALDI-TOF mass spectrometry was performed using a Vestec LaserTec ResearcH linear time-of-flight mass spectrometer (Vestec Corp., Houston, TX), modified to accommodate the probe tips (see Fig. 4) and equipped with a two-stage gridded ion source operating at 30 kV. The rest of the instrument remained unchanged from that described previously [11]. Mass spectra were acquired in the positive ion mode, with each spectrum the sum of 50 - 100 individual laser desorption/ionization events. Spectra were externally calibrated using horse heart cytochrome c (MW 12,360.7 Da) as a standard. Mass data were analyzed using protein analytical worksheet software (PAWS) [12].
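The quantities quoted in the sample preparation can be cross-checked with a little unit bookkeeping. The sketch below converts the stated myoglobin concentration to molar units and estimates the amount of protein placed on the Au/pepsin tip; the function and variable names are illustrative only and are not part of the original procedure.

```python
# Unit bookkeeping for the myoglobin solution described in the text.
MW_MYOGLOBIN_DA = 17200.4   # g/mol, molecular weight quoted in the text
CONC_MG_PER_ML = 0.01       # mg/mL, concentration quoted in the text
APPLIED_VOLUME_UL = 3.0     # µL applied to the Au/pepsin tip


def molar_concentration_uM(conc_mg_per_ml: float, mw_g_per_mol: float) -> float:
    """Convert mg/mL to µmol/L (mg/mL is numerically equal to g/L)."""
    grams_per_litre = conc_mg_per_ml
    return grams_per_litre / mw_g_per_mol * 1e6


def amount_pmol(conc_uM: float, volume_uL: float) -> float:
    """µmol/L times µL gives pmol (1e-6 mol/L x 1e-6 L = 1e-12 mol)."""
    return conc_uM * volume_uL


conc_uM = molar_concentration_uM(CONC_MG_PER_ML, MW_MYOGLOBIN_DA)
loaded_pmol = amount_pmol(conc_uM, APPLIED_VOLUME_UL)
print(f"{conc_uM:.2f} uM, {loaded_pmol:.2f} pmol applied")
```

The result is consistent with the ~ 0.6 µM figure given in the text and implies that the serial digestion starts from less than 2 pmol of protein.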
III. Results/Discussion
A combination of enzymatically-active probe tips was used to investigate the regional stability of myoglobin. A set of molecular fragments representing different regions of the protein was first prepared by a limited pepsin digestion of the protein under denaturing conditions (pH ~ 3). The set was then exposed to further, more extensive, degradation (using trypsin) under native conditions (pH ~ 8). Regions of myoglobin that do not exhibit an intrinsic steric shielding by the tertiary structure of the molecule (the molecule being either the intact myoglobin or one of the fragments) are more susceptible to digestion by the trypsin, and therefore, signals representing these fragments are expected to be attenuated in the final mass spectrum. Regions of the myoglobin possessing a tighter tertiary structure (when folded under native conditions) will exhibit a higher degree of immunity to the trypsin digestion, and representative signals in the mass spectrum will be attenuated to a lesser extent.
Fig. 5 One minute Au/pepsin digestion of whale myoglobin under denaturing conditions (pH 3, 60 °C) (A, grey). Peptic fragments digested for 5 minutes using an Au/trypsin probe tip (pH ~ 8, 60 °C) (B). Select fragments have been completely digested, indicating a relatively low degree of steric hindrance (to tryptic sites) in the final 46 residues of the protein. Ion signals are marked with residue numbers. Region indicated is shown in Fig. 7.
Fig. 6 Coverage maps derived from the Au/pepsin-Au/trypsin serial digestion of whale myoglobin. (A) Au/pepsin digest fragments. (B) Peptic fragments exhibiting a relatively high immunity to tryptic digestion. (C) Peptic fragments eliminated during trypsin digestion. Residue numbers are as indicated.
Fig. 5A shows the results of myoglobin digested under denaturing conditions using an Au/pepsin tip. Strong ion signals representing fragments due to cleavage of the myoglobin at five sites (residues 29, 69, 106, 109, and 137) are observed. All peptic fragments contain between one and nine trypsin cleavage sites. Upon exposure to an Au/trypsin tip (Fig. 5B), select fragments in the peptic mixture are observed to undergo complete digestion, whereas others exhibit a relative immunity to digestion. Fig. 6 shows mass coverage maps of the fragments from the pepsin digestion, the fragments surviving the Au/trypsin digestion, and those completely digested by the trypsin. In general, minimal steric shielding of tryptic sites is observed in fragments comprising the final two helices of the myoglobin. Fig. 7 shows an evolution of tryptic fragments derived from the original pepsin digest. Signals consistent with cleavage at three of the six possible trypsin sites present in the 107 - 153 region of the myoglobin are observed. There are no other strong ion signals in the Au/pepsin-Au/trypsin spectrum due to both pepsin and trypsin digestion of the myoglobin (other signals in the spectrum are consistent with cleavage at trypsin sites, as confirmed by Au/trypsin digestion of myoglobin). This observation, and the survival of numerous fragments containing residues 1 - 106, is consistent with the steric inaccessibility of sites within the central, heme-coordinated region of the molecule (independent of the final 46 residues of the molecule).
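Because the pepsin step is a limited digestion, any contiguous stretch between two cleavage sites (or a terminus) is a candidate fragment. The following sketch enumerates those residue ranges from the five sites named in the text. It assumes that each site number denotes the last residue before the cut, so that cleavage at 106 yields the 107 - 153 region discussed above; the function name is hypothetical, and no claim is made about which candidates actually appear in the spectra.

```python
from itertools import combinations

PROTEIN_LENGTH = 153                     # residues in whale myoglobin
PEPSIN_SITES = [29, 69, 106, 109, 137]   # cleavage sites reported in the text


def limited_digest_fragments(length, sites):
    """Return (start, end) residue ranges for every fragment a partial digest can yield.

    Assumption: each site number is the last residue before the cut, so a cut
    at 29 separates residues 1-29 from residues 30-153.
    """
    boundaries = [0] + sorted(sites) + [length]
    # Every pair of boundaries delimits one contiguous candidate fragment.
    return [(a + 1, b) for a, b in combinations(boundaries, 2)]


fragments = limited_digest_fragments(PROTEIN_LENGTH, PEPSIN_SITES)
for start, end in fragments:
    print(f"{start}-{end}")
```

With five sites there are seven boundaries and therefore 21 candidate ranges, including the intact protein (1-153). Masses could be attached to each range given the sequence, which is not reproduced here.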
Fig. 7 Mass spectra (horizontal axis: m/z) showing the evolution of proteolytic fragments generated by the successive Au/pepsin-Au/trypsin digestion of myoglobin. Ion signals representing peptic fragments of the myoglobin originating between residues 107-153 (A) are observed to undergo complete tryptic digestion (B), indicating relatively free access to trypsin cleavage sites. Ion signals are marked to indicate proteolytic fragments (by residue).
Obviously, more data are needed before any broader statements can be made on the relative degree of intra-molecular interaction within the myoglobin. However, serial digestions are possible in numerous combinations and are quite easy to perform using the bioreactive probe tips. Further, digestion of the myoglobin in the presence of denaturants (detergents, salts) is also possible as a means of studying the relative accessibility of proteolytic sites, yielding additional information on the overall structure of the molecule [13]. Finally, incorporation of quantitative MALDI-TOF techniques allows the tracking of digestions as a function of time, providing even further insight into the dynamics of digestion (i.e., determination of fragment precursors and final products) and molecular stability [13, 14]. Currently, we are exploring such uses of the bioreactive probe tips, in combination with the defined and accurate mass spectrometric identification of proteolytic fragments, in the study of higher-order protein structure.
FINAL REMARKS
The rapid advancement of analytical technologies such as SPR-based biomolecular interaction analysis (BIA) and MALDI-TOF mass spectrometry has allowed the routine characterization of biomolecules present in complex environments at physiological concentrations. Presented here has been the coupling of the two orthogonal techniques into a combined approach capable of observing real-time, solution-phase biospecific interactions (using BIA) and of rapidly and qualitatively assessing binding partners (using MALDI-TOF). The combined analysis is performed without compromising the speed, sensitivity, or accuracy of the constituent techniques, and therefore demonstrates the inception of a new bioanalytical approach: Biomolecular Interaction Analysis Mass Spectrometry (BIA/MS). An additional approach to biomolecular analysis, bioreactive mass spectrometry probe tips, has also been given. These are conceptually and practically simple devices constructed to analytically modify biomolecules prior to mass spectrometric analysis. The bioreactive devices have proven quite convenient in use, and often necessary in maintaining high speed, sensitivity, and accuracy in the mass spectrometric analysis of proteolytic mixtures. An obvious next step is the combination of the two techniques, BIA/MS with bioreactive mass spectrometry probe tips. Such an approach would allow (all on a single surface) the real-time observation of affinity interactions followed by enzymatic modification and mass spectrometric characterization of retained ligands.
REFERENCES
1. Szabo, A., Stoltz, L., and Granzow, R. (1995) Curr. Opinion Struc. Biol. 5, 699-705.
2. Karlsson, R., Roos, H., Fargerstam, Persson, B. (1994) Methods: A Companion to Methods in Enzymology 6, 99-110.
3. Krone, J.R., Nelson, R.W., Dogruel, D., Granzow, R., Williams, P., in Proceedings of the 5th Annual European BIAsymposium, Stockholm, Sweden, September 27-29, 1995, Ed. R. Millett, pp. 173-179.
4. Krone, J.R., Nelson, R.W., Dogruel, D., Williams, P., Granzow, R. (1996) BIAjournal 3, 16-17.
5. Krone, J.R., Nelson, R.W., Dogruel, D., Williams, P., Granzow, R. (1996) Anal. Biochem. In press.
6. BIAapplications Handbook (1994). Chapter 4.
7. Nelson, R.W., McLean, M.A., Hutchens, T.W. (1994) Anal. Chem. 66, 1408-1415.
8. Nelson, R.W., Krone, J.R., Bieber, A.L., Williams, P. (1995) Anal. Chem. 67, 1153-1158.
9. Dogruel, D., Williams, P., Nelson, R.W. (1995) Anal. Chem. 67, 4343-4348.
10. Nelson, R.W., Dogruel, D., Krone, J.R., Williams, P. (1995) Rapid Comm. Mass Spectrom. 9, 1380-1385.
11. Vestec LaserTec ResearcH specification sheet, Vestec Corporation, Houston, TX (1992).
12. Beavis, R.C. Protein Analysis Worksheet Version 6.1.1 (1995).
13. Patterson, D.H., Tarr, G.E., Hines, W.M., Vestal, M.L., Proceedings of the 44th ASMS Conference on Mass Spectrometry and Allied Topics, Portland, Oregon, May 1996. In press.
14. Lewis, J.K., Krone, J.R., Dogruel, D., Williams, P., Nelson, R.W., Proceedings of the 44th ASMS Conference on Mass Spectrometry and Allied Topics, Portland, Oregon, May 1996. In press.
Transition-State Theory and Secondary Forces in Antigen-Antibody Complexes Mark E. Mummert and Edward W. Voss, Jr. Dept. of Microbiology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
I. Introduction
Secondary forces, defined as those interactions exhibited outside of the classically defined antibody active site, have been demonstrated to modulate the conformation and free energy of binding of antifluorescein antibodies (1-3). Figure 1 defines and distinguishes primary from secondary interactive components. The ability of the epitopic environment to influence antibody binding has obvious immunological ramifications. Dissection of those interactions that influence the overall dynamics and thermodynamics of a given protein system is of general importance in understanding interfacial protein chemistry. The antifluorescein system is advantageous for evaluating and quantitating interfacial chemistry. Binding of fluorescein ligand in the antifluorescein active site results in bathochromic shifts of the ligand's absorption spectrum and a decrease in both the fluorescence quantum yield and lifetime. These properties allow sensitive spectral and kinetic measurements to be made (4). Changes in the spectral and kinetic properties of a given antifluorescein antibody upon interacting with fluorescein attached to a carrier molecule, compared to fluorescein devoid of carrier residues, thus provide important information about perturbations directed by secondary forces. Placement of the fluorescein moiety in various environments is easily achieved due to the availability of the highly reactive isothiocyanate derivative of fluorescein. Evaluations of secondary interactive components have been discussed (5-8). In general, the delineation between primary and secondary interactive components has been vague (9). An important advantage of the fluorescein system is that the ligand fills the active site (10-12), which has been conclusively demonstrated by X-ray crystallographic results for the monoclonal antifluorescein antibody (mAb) 4-4-20 (13-15).
Thus, interactions with carrier residues associated with the ligand-carrier complex are by necessity outside of the primary interactions. An understanding of interfacial protein chemistry requires evaluation of the thermodynamics of the system under investigation as well as the energetic barriers responsible for the observed kinetics and affinity. Due to the kinetic methodology available for the antifluorescein system, the energetic barriers for complex decomposition
Figure 1. Schematic representation differentiating primary and secondary interactions. Secondary interactions are the result of interactions between regions surrounding the mouth of the active site and regions of the carrier environment surrounding the ligand. A highly charged protein or lipid interface represents an example of a substrate exerting secondary effects. (Figure labels: antibody variable domains; carrier environment, a highly charged protein or lipid membrane.)
can be evaluated (Figure 2). It is important to note that the kinetic measurements can be conducted in solution under near physiological conditions. Thus, the results obtained can be extrapolated to biological situations. In this report, we summarize the results of a study in which the energetic barriers of several protein/complex decompositions were analyzed utilizing transition-state theory. In essence, fluorescein 5-isothiocyanate was covalently linked to a variety of synthetic peptides and allowed to bind with the well defined high affinity 4-4-20 mAb. Differences in the rates of decomposition were measured at 275 K and 291 K and the height of energetic barriers calculated using classical transition-state analysis (16).
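For orientation, the relations on which a classical transition-state analysis of a unimolecular rate constant k usually rests can be written as below. This is a sketch in the standard Eyring form and not a transcription of the authors' working equations, which are given in refs. 3 and 16. The symbols k_B (Boltzmann constant), h (Planck constant) and R (gas constant), and the two-temperature estimate of the enthalpy, are assumptions introduced here; the estimate further assumes that the enthalpy and entropy of activation are constant between 275 K and 291 K.

```latex
\begin{align}
k &= \kappa\,\frac{k_B T}{h}\,K^{\ddagger},
\qquad
K^{\ddagger} = \exp\!\left(-\frac{\Delta G^{\ddagger}}{RT}\right), \\
\Delta G^{\ddagger} &= \Delta H^{\ddagger} - T\,\Delta S^{\ddagger}, \\
\Delta H^{\ddagger} &\approx
  \frac{R\,\ln\!\left[\dfrac{k_{291}/T_{291}}{k_{275}/T_{275}}\right]}
       {\dfrac{1}{T_{275}} - \dfrac{1}{T_{291}}},
\qquad T_{275} = 275\ \mathrm{K},\; T_{291} = 291\ \mathrm{K}.
\end{align}
```

Here the transmission coefficient κ is taken as unity in the ideal case, K‡ is the dimensionless transition-state equilibrium constant, and the last line follows from the linearity of ln(k/T) in 1/T.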
II. Methods and Materials
A. Monoclonal anti-fluorescein antibody 4-4-20
mAb 4-4-20 was produced in ascitic fluid from pristane-treated Balb/c mice and affinity purified using a fluorescein-Sepharose 4B adsorbent (17,18).
B. Peptide synthesis for use as carriers
Peptides of different chemical composition, acetylated at the amino terminus, were synthesized using an Applied Biosystems model 430A peptide synthesizer at the University of Illinois Genetic Engineering Facility (Urbana, IL) employing solid-phase Fmoc chemistry with standard amino acid protecting groups. The generic peptide design was as follows: Ac-NH-(X)6-K-(X)6-COO⁻, where Ac-NH denotes the acetylated α-amino group, X represents aspartate or arginine, and K is the central lysine residue available for FITC (I) derivatization. Peptides were desalted and purity verified by RP-HPLC. Purified peptides were analyzed by mass spectrometry to verify composition.
Figure 2. Two dimensional reaction coordinate depicting the interaction of mAb 4-4-20 with homologous ligand. The x-axis (reaction coordinate) is arbitrarily assigned reaction progression, while the y-axis is the chemical potential. The height of the chemical potential barriers dictates the rate of the reaction. The encounter complex was included based on kinetic considerations (19).
Monofluoresceinated peptides were synthesized by adding an equimolar concentration of FITC (I) to peptides. The reaction was adjusted to a pH of 10.3 with K2CO3 and incubated at ambient temperature overnight. The resulting reaction mixture was resolved over a P-2 column (Bio-Rad) equilibrated in 0.1 M phosphate, pH 8.0, to remove unreacted fluorescein from the peptides. Fluorescently labeled peptides were analyzed by thin layer chromatography with water-saturated methyl ethyl ketone as the solvent system.
C. Determination of unimolecular rate constants
Ligand dissociation rates were determined at 275 K and 291 K utilizing the methodology and analysis described in detail by Herron (19). This technique provides an essentially unidirectional displacement of the fluorescein/antibody complex.
D. Calculation of transition-state thermodynamic parameters
All calculations have been described in detail elsewhere (3). Transition-state equations can be found in most elementary physical chemistry texts or in the classical work of Wynne-Jones and Eyring (16).
III. Results
A. Monofluoresceinated peptides
Thin layer chromatographic analyses of monofluoresceinated peptides indicated a single fluorescent band for each of the labeled peptides. RF values were 0.90, 0.85, 0.83 and 0.76 for FDS, D12KF1, R6D6KF1 and R12KF1, respectively.
Table I. Comparative unimolecular rate constants at 275 K and 291 K for the interaction of FDS and monofluoresceinated peptides with mAb 4-4-20

Ligand     k-1a                    k-1b                    k-1b/k-1a
FDS        1.63(±0.02) × 10^-4     1.92(±0.09) × 10^-3     11.8
D12KF1     3.52(±0.62) × 10^-3     1.06(±0.19) × 10^-1     30.1
R6D6KF1    6.96(±1.02) × 10^-3     1.15(±0.42) × 10^-1     16.5
R12KF1     6.79(±0.25) × 10^-3     6.08(±0.81) × 10^-2     8.9

k-1a = unimolecular rate constant at 275 K
k-1b = unimolecular rate constant at 291 K
B. Affinity of mAb 4-4-20 with various ligands
In previous studies (2), the affinity constants (Ka) for the interaction of mAb 4-4-20 with fluorescein and monofluoresceinated peptides were measured at 275 K. The affinities of mAb 4-4-20 for FDS, D12KF1, R12KF1 and R6D6KF1 were 3.14 × 10^10 M^-1, 1.49 × 10^[?] M^-1, 7.49 × 10^[?] M^-1 and 7.55 × 10^[?] M^-1, respectively.
C. Unimolecular rate constants
Unimolecular rate constants for decay of the mAb 4-4-20/fluorescein complex and mAb 4-4-20/monofluoresceinated peptide complexes were determined at 275 K and 291 K. The 16 K differential resulted in significant changes in the individual decay rates of the various complexes. The largest change with temperature was with the mAb 4-4-20/D12KF1 complex (30.1-fold), while the smallest change was with R12KF1 (8.9-fold). Importantly, the R6D6KF1 ligand gave a change (16.5-fold) that was approximately the average of those for the polyanionic (D12KF1) and polycationic (R12KF1) environments. Table 1 summarizes these results.
D. Relationship between enthalpy and entropy
Table 2 summarizes the calculated transition-state thermodynamic parameters (ΔH‡, ΔS‡ and ΔG‡). The secondary effects that resulted from the carrier molecule caused an apparent enhancement in ΔH‡ and ΔS‡ relative to fluorescein devoid of carrier residues. The enhanced values of ΔS‡ offset the enhanced ΔH‡, with the net effect of lowering the overall energetic barriers (ΔG‡) of the 4-4-20/monofluoresceinated peptide complexes relative to the 4-4-20/fluorescein complex (Table 3).
Table II. Comparative thermodynamic transition-state parameters and transition-state equilibria for the interaction of mAb 4-4-20 with FDS and monofluoresceinated peptides at 275 K

Ligand     ΔH‡             ΔS‡            ΔG‡             K‡
FDS        +23.96±0.06     +0.01±0.00     +20.82±0.07     2.84 × 10^-17
D12KF1     +33.28±1.95     +0.05±0.00     +19.15±1.95     6.03 × 10^-16
R6D6KF1    +27.32±3.36     +0.03±0.00     +18.77±3.36     1.21 × 10^-15
R12KF1     N.A.            N.A.           N.A.            N.A.

ΔH‡ = transition-state enthalpy (kcal/mol)
ΔS‡ = transition-state entropy (kcal/mol/K)
ΔG‡ = transition-state free energy (kcal/mol)
K‡ = transition-state equilibrium (dimensionless)
N.A. = not applicable; does not conform to the theoretical assumptions of transition-state theory
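The connection between the rate constants of Table I and the parameters of Table II can be illustrated with a two-temperature Eyring estimate. The sketch below is an assumption-laden reconstruction rather than the authors' procedure (which is described in ref. 3): it takes the rate constants to be in s^-1, sets the transmission coefficient to 1, treats the enthalpy and entropy of activation as constant between 275 K and 291 K, and uses only the mean rate constants, so no uncertainties are propagated. All names in the code are illustrative.

```python
import math

R_KCAL = 1.987204e-3        # gas constant, kcal/(mol K)
K_B = 1.380649e-23          # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34   # Planck constant, J s

T_LOW, T_HIGH = 275.0, 291.0

# Mean rate constants from Table I, assumed to be in s^-1: (k at 275 K, k at 291 K).
# R12KF1 is omitted because the text reports it as not applicable.
RATES = {
    "FDS": (1.63e-4, 1.92e-3),
    "D12KF1": (3.52e-3, 1.06e-1),
    "R6D6KF1": (6.96e-3, 1.15e-1),
}


def eyring_parameters(k_low, k_high, t_low=T_LOW, t_high=T_HIGH):
    """Two-temperature Eyring estimate with the transmission coefficient taken as 1."""
    # Enthalpy from the slope of ln(k/T) versus 1/T between the two points.
    d_h = R_KCAL * math.log((k_high / t_high) / (k_low / t_low)) / (1 / t_low - 1 / t_high)
    # Transition-state equilibrium constant and free energy at the lower temperature.
    k_ts = k_low * H_PLANCK / (K_B * t_low)
    d_g = -R_KCAL * t_low * math.log(k_ts)
    # Entropy from dG = dH - T dS.
    d_s = (d_h - d_g) / t_low
    return {"dH": d_h, "dS": d_s, "dG": d_g, "K_ts": k_ts}


results = {name: eyring_parameters(*ks) for name, ks in RATES.items()}
for name, p in results.items():
    print(f"{name:8s} dH={p['dH']:.2f} dS={p['dS']:.3f} dG={p['dG']:.2f} K={p['K_ts']:.2e}")
```

Under these assumptions the computed enthalpies and free energies land close to the tabulated values, which suggests that the tabulated parameters follow from the two rate constants in essentially this way. Small differences (for example in K‡ for D12KF1) may reflect rounding or a slightly different choice of constants.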
E. κ values
Values for the transmission coefficient (κ) at 275 K were 1.00, 1.02, 1.00 and 0.58 for FDS, D12KF1, R6D6KF1 and R12KF1, respectively. Transition-state theory assumes that κ is unity. Deviations of κ from unity indicate a poor approximation of the various transition-state thermodynamic parameters. Thus all complex decays were adequately described by transition-state theory, except for that of the R12KF1 peptide.
IV. Discussion
Understanding those components that influence the interfacial binding properties in protein/protein and protein/ligand interactions is of basic importance in protein chemistry. In this report, we have defined a system that should allow the dissection of those chemical properties that influence primary interactions via an evaluation of the transition-state thermodynamic components. It is important to recognize the fundamental assumptions made in the calculations. At the temperatures utilized in these experiments (275 K and 291 K), it was assumed that complexes moved over energetic barriers with standard Arrhenius motion. Deviations from Arrhenius motion (e.g., tunneling) usually result as a consequence of low temperature (20-22). It is also important to realize that the values calculated for ΔH‡, ΔS‡ and ΔG‡ are the upper limits for the system, since solvent was considered as a part of the system (23). This study suggested that secondary forces of the mAb 4-4-20/monofluoresceinated peptide complexes modulated binding interactions via increased transition-state enthalpic and entropic contributions. The net result was a decreased energetic barrier that allowed modulation of the previously reported affinity constants of mAb 4-4-20 for the monofluoresceinated peptides due to variation of the unimolecular rate constant (2).
Table III. Comparative differences in thermodynamic transition-state parameters of monofluoresceinated peptides with respect to FDS at 275 K

Ligand     ΔΔH‡           ΔΔS‡           ΔΔG‡
D12KF1     +9.32±1.95     +0.04±0.00     -1.67±1.95
R6D6KF1    +3.36±3.36     +0.02±0.00     -2.05±3.36
R12KF1     N.A.           N.A.           N.A.

ΔΔH‡ = change in transition-state enthalpy with respect to FDS (kcal/mol)
ΔΔS‡ = change in transition-state entropy with respect to FDS (kcal/mol/K)
ΔΔG‡ = change in transition-state free energy with respect to FDS (kcal/mol)
N.A. = not applicable
Increased values of ΔH‡ and ΔS‡ for the decay of the mAb 4-4-20/monofluoresceinated peptide complexes relative to the mAb 4-4-20/fluorescein complex were interpreted as resulting from inclusion of the carrier peptides. Increased enthalpic contributions may have resulted from actual binding interactions between the surface-accessible complementarity determining regions (CDRs) surrounding the mouth of the antibody active site and the amino acids of the peptides. Whitlow et al. (15) reported that a significant percentage of the amino acids that compose the mAb 4-4-20 CDRs were solvent accessible when fluorescein was in the active site. The increased values for ΔH‡ also may have been due to differences in hydration of the antibody complexes. Enhanced ΔS‡ values for the antibody/peptide complexes may have been a result of the greater rotational, translational and vibrational degrees of freedom gained as the complexes decayed, relative to the mAb 4-4-20/fluorescein complex. As in the ΔH‡ argument, hydration may also be an important factor to consider. Hydration has been shown to significantly influence the free energy of binding (14). We interpreted the inability of transition-state theory to predict the decay of the mAb 4-4-20/R12KF1 complex to be a result of differential conformational changes. Deviations of κ from unity are a direct result of the inertial (solvent coupling) and diffusive (intramolecular dynamic) regimes (24-27). The frictional coefficient in both of these regimes dictates the value of κ (24,25). Both inertial and diffusive regimes modulate κ in proteins (27-29). We therefore proposed that the mAb 4-4-20/R12KF1 complex could not be evaluated by transition-state theory due to inertial and/or diffusive regimes. We conceived that the secondary forces dictated by R12KF1 resulted in greater perturbation of the antibody variable domains than the secondary forces dictated by either D12KF1 or R6D6KF1.
It was postulated that the greater van der Waals volume of arginine (R ~ 148 Å³) as opposed to aspartic acid (D ~ 91 Å³) resulted in greater displacement of variable domain atomic coordinates and thus enhanced frictional components. In conclusion, the antifluorescein system provides a reasonable model with which to evaluate interfacial interactions utilizing transition-state theory. Evaluations like those presented herein provide a means to develop mechanistic models that describe interfacial interactions from an energetic barrier viewpoint.
References
1. Mummert, M.E. and Voss, E.W., Jr. (1995) Mol. Immunol. 32, 1225-1233.
2. Mummert, M.E. and Voss, E.W., Jr. (1996) Mol. Immunol. in press.
3. Mummert, M.E. and Voss, E.W., Jr. (1996) Biochemistry 35, 8187-8192.
4. Voss, E.W., Jr. (1993) J. Mol. Recog. 6, 51-58.
5. vanOss, C.J. and Absolom, D.R. (1984) In "Molecular Immunology" (Atassi, M.Z., vanOss, C.J. and Absolom, D.R., eds.) pp. 337-360. Marcel Dekker, New York.
6. vanOss, C.J., Good, R.J. and Chaudhury, M.K. (1986) J. Chromatog. 376, 111-119.
7. vanOss, C.J. (1992) In "Structure of Antigens" (Van Regenmortel, M.H.V., ed.) vol. 1, pp. 179-208. CRC Press, Inc., Boca Raton, FL.
8. vanOss, C.J. (1994) In "Immunochemistry" (vanOss, C.J. and Van Regenmortel, M.H.V., eds.) pp. 581-613. Marcel Dekker, New York.
9. vanOss, C.J. (1995) Mol. Immunol. 32, 199-211.
10. Voss, E.W., Jr., Eschenfeldt, W. and Root, R.T. (1976) Immunochemistry 12, 745-749.
11. Omelyanenko, V.G., Jiskoot, W. and Herron, J.N. (1993) Biochemistry 32, 10423-10429.
12. Carrero, J. and Voss, E.W., Jr. (1996) J. Biol. Chem. 271, 5332-5337.
13. Herron, J.N., He, X-m., Mason, M.L., Voss, E.W., Jr. and Edmundson, A.B. (1989) Proteins: Struct., Funct., Genet. 5, 271-280.
14. Herron, J.N., Terry, A.H., Johnson, S., He, X-m., Guddat, L.W., Voss, E.W., Jr. and Edmundson, A.B. (1994) Biophys. J. 67, 2167-2183.
15. Whitlow, M., Howard, A.J., Wood, J.F., Voss, E.W., Jr. and Hardman, K.D. (1995) Prot. Eng. 8, 749-761.
16. Wynne-Jones, W.F.K. and Eyring, H. (1935) J. Chem. Phys. 3, 492-502.
17. Kranz, D.M. and Voss, E.W., Jr. (1981) J. Biol. Chem. 257, 6987-6995.
18. Weidner, K.M., Denzin, L.K., Kim, M.L., Mallender, W.D., Miklasz, S.D. and Voss, E.W., Jr. (1993) Mol. Immunol. 30, 1003-1011.
19. Herron, J.N. (1984) In "Fluorescein Hapten: An Immunological Probe" (Voss, E.W., Jr., ed.) pp. 50-75. CRC Press, Inc., Boca Raton, FL.
20. Frauenfelder, H., Nienhaus, G.U. and Johnson, J.B. (1991) Ber. Bunsenges. Phys. Chem. 95, 272-278.
21. Wolynes, P. (1987) In "Protein Structure: Molecular and Electronic Reactivity" (Austin, R., Buhks, E., Chance, B., DeVault, D., Dutton, P.L., Frauenfelder, H. and Gol'danskii, V.I., eds.) pp. 201-209. Springer-Verlag, Inc., New York.
22. Frauenfelder, H. (1979) In "Tunneling in Biological Systems" (Chance, B., DeVault, D.C., Frauenfelder, H., Marcus, R.A., Schrieffer, J.R. and Sutin, N., eds.) pp. 627-649. Academic Press, Inc., New York.
23. Beece, D., Eisenstein, L., Frauenfelder, H., Good, D., Marden, M.C., Reinisch, L., Reynolds, A.H., Sorensen, L.B. and Yue, K.T. (1980) Biochemistry 19, 5147-5157.
24. Chandler, D. (1978) J. Chem. Phys. 68, 2959-2970.
25. Northrup, S. and Hynes, J.T. (1978) J. Chem. Phys. 69, 5246-5260.
26. Hasha, D.L., Eguchi, T. and Jonas, J. (1982) J. Am. Chem. Soc. 104, 2290-2297.
27. Doster, W. (1983) Biophys. Chem. 17, 97-103.
28. Karplus, M.A. and McCammon, J.A. (1981) FEBS Lett. 131, 34-36.
29. McCammon, J.A. and Karplus, M. (1979) Proc. Natl. Acad. Sci. U.S.A. 76, 3585-3589.
Thermodynamic Investigation of Enzyme and Inhibitor Interactions with High Affinity¹ Yudu Cheng, Jacek Slon-Usakiewicz, Jing Wang, Enrico O. Purisima and Yasuo Konishi National Research Council of Canada, Biotechnology Research Institute Montreal, Quebec, Canada
I. Introduction
The enzyme and inhibitor binding interactions may be elucidated by thermodynamic functions such as the free energy (ΔG), enthalpy (ΔH), entropy (TΔS) and heat capacity (ΔCp). These thermodynamic functions are related through the following equation:
$$\Delta G^\circ(T) = \Delta H^\circ(T) - T\,\Delta S^\circ(T) = \left[\Delta H^\circ(T^\circ) - T\,\Delta S^\circ(T^\circ)\right] + \Delta C_p\left[(T - T^\circ) - T\ln(T/T^\circ)\right] \tag{1}$$
In the above equation, ΔG°, ΔH°, ΔS° and ΔCp are the thermodynamic functions relative to a standard state (1.0 mol/L for all chemical species and 25 °C), and T° is the reference temperature (298.15 K in this work). Thermodynamic study plays a major role in assessing the molecular basis of enzyme and inhibitor interactions because the thermodynamic functions convey extensive information, from the binding affinity to the conformational change. In general, ΔG reflects the affinity between enzyme and inhibitor. ΔH is the binding energy arising from van der Waals interactions, hydrogen bonding, dehydration and other effects (e.g. deprotonation, ion-bridge formation, etc.). ΔS measures the loss or gain in the rotational, translational and/or vibrational degrees of freedom in the conformational change and consists of both solvent and conformational contributions. ΔCp measures the temperature dependence of ΔH and ΔS. ΔCp may also be temperature dependent; in eq 1, it is assumed to be temperature independent for simplicity. We have conducted thermodynamic studies on the interactions of thrombin with its bivalent inhibitors, in which the binding affinity ranges from Ki = 10^-9 to Ki = 10^-12 M (Ki is the inhibition constant) (1,2). Thrombin is a key enzyme regulating thrombosis in cardiovascular disease. The synthetic bivalent thrombin inhibitors possess an active-site binding segment, a linker and a fibrinogen recognition exosite (FRE) binding segment, which is based on the C-terminal sequence of hirudin, AspH55-PheH56-GluH57-GluH58-IleH59-ProH60-GluH61-GluH62-TyrH63-LeuH64-
1NRC publication No. 39931 TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © Government of Canada All rights of reproduction in any form reserved.
513
Yudu Cheng et al
514
GlnH65-OH(HirudinH55-H65^ H stands for hirudin). Hirudin is a 65 amino acid protein and naturally occurring thrombin inhibitor with a Ki value of 2.2 x 10-14 M(3). The crystal structure of thrombin-hirudin complex(4) indicates that, besides those distinct electrostatic interactions of hirudin and thrombin, the complementary fit of the nonpolar residues seems to be of particular importance. The site-directed mutagenesis(5) and Gly substitution(6) studies of five nonpolar residues PheH56^ IleH59^ ProH60, TyrH63 and LeuH64 in the FRE binding segment showed that the residues PheH56 and IleH59 are crucial to the binding at the FRE. In order to understand the molecular details of the nonpolar residue and thrombin interactions at the FRE, the complete thermodynamic profiles (AG°, AH°, TAS° and ACp) of the analogs, in which the five nonpolar residues PheH56, IleH59, ProH60, TyrH63 and LeuH64 of a thrombin inhibitor P552(7) are consecutively substituted by Gly, were measured and analyzed in conjunction with the structural features obtained from the molecular modelling for these substitutions. The results show that the change in the binding free energy (AAG°) due to the Gly substitution has a linear correlation with the change in the molecular surface area (AAA) around the Gly substitution site, evidencing the structural basis of the free energy. Meanwhile, the components of AAG°, AAH° and TAAS°, appear no correlation with AAA because of the linear compensation of these two quantities, but are specific to the conformational effects (e.g the movement of the inhibitor's backbone and neighboring water molecules) due to the Gly substitution. In this article, we describe the technique procedures employed by us to measure and analyze the thermodynamic functions in eq 1 for the system of thrombin and inhibitor interactions.
II. Experimental method
A. Materials
Human α-thrombin and the fluorogenic substrate (Tos-Gly-Pro-Arg-AMC·HCl) were purchased from Sigma. Fmoc derivatives of amino acids were purchased from Advanced ChemTech and Novabiochem. N-α-Fmoc-N-γ-trityl-L-Gln-Wang resin was purchased from Applied Biosystems Inc. The solvents used in peptide synthesis were obtained from Anachemia Chemical Inc. and Applied Biosystems Inc.
B. Peptide synthesis and purification
The thrombin inhibitors are synthesized on a 396 Multiple Peptide Synthesizer (Advanced ChemTech) using a conventional Fmoc strategy of solid-phase peptide synthesis. Double couplings are performed throughout the synthesis. The peptides are purified by preparative HPLC using a linear gradient of 20 to 50% acetonitrile in 0.1% TFA (0.5%/min gradient, 33 mL/min flow rate). The purified products, with >98% purity as estimated by analytical HPLC, are lyophilized. The final peptides are identified using a Beckman 6300 amino acid analyzer and a SCIEX API III mass spectrometer.
C. Enzymatic assay
The inhibition of the amidolytic activity of human α-thrombin is measured using Tos-Gly-Pro-Arg-AMC as a fluorogenic substrate in 50 mM Tris-HCl buffer (pH 7.80) containing 0.1 M NaCl and 0.1% poly(ethylene glycol) 8000 at various temperatures from 10 to 45 °C. Human α-thrombin is stable in this temperature range in the presence of poly(ethylene glycol) (8). The temperature dependence of Km and Vmax is measured at 1-40 µM substrate and 30 pM thrombin. Ki is measured at each temperature with 40 µM substrate, 30 pM thrombin and inhibitor concentrations varying from 0.3- to 100-fold of the Ki value over the temperature range of 10-45 °C. The steady-state velocity of the enzyme reaction is measured at λex = 383 nm and λem = 455 nm in a Hitachi F2000 spectrophotometer and a Perkin Elmer LS50B luminescence spectrometer. The running solution is preincubated at the assay temperature for 15 min. The temperature is controlled and monitored using a HAAKE circulator and a YSI Series 400 probe (±0.1 °C). The reaction is started by adding thrombin, and the progress curve is recorded for 5-15 min.
D. Molecular modelling
Energy minimization. The crystal structure of the thrombin-P500 complex (9) is used as the starting structure for molecular modelling. P500 is a bivalent thrombin inhibitor with the sequence dansyl-Arg-(D-Pip)-µAdod-Gly-HirudinH55-H65. Because the residues of interest (PheH56, IleH59, ProH60, TyrH63 and LeuH64) are located only in the FRE binding segment, the bound state of the inhibitors is modelled using the sequence Ac-HirudinH55-H65-NHCH3. The complex, which includes the FRE segment Ac-HirudinH55-H65-NHCH3 together with the thrombin residues and water molecules within 6 Å of any atom of the FRE segment, is energy-minimized again. This refined structure is the starting point for modelling the complexes of thrombin with the inhibitor analogs. The energy minimization for the analogs is conducted only for the atoms in the static substructure set, which includes the residues within 4 Å of the substituted residue. The AMBER force field (10) as implemented in SYBYL 6.1 (Tripos Inc.) is used with a non-bonded cutoff of 8 Å, a dielectric constant of 80 and a gradient convergence tolerance of 0.005 kcal/(mol·Å).

Conformational search, Monte Carlo sampling and energy minimization. For the Gly substitution of PheH56, a thermodynamic profile considerably different from those of the other P552 analogs was observed (a small decrease in ΔH°, but large decreases in T°ΔS° and ΔCp). An alternative procedure, which includes a systematic conformational search, Monte Carlo sampling and energy minimization (2), is therefore applied for further study. The rotatable bonds of the backbone and the side chains are varied in 15° increments for (PheH56 or GlyH56) and GluH57, and in 30° and 45° increments for AspH55, in the complex with thrombin in order to generate a database of sterically feasible conformations. Water molecules are not included in the conformational search but are included in the energy minimization. The conformers in the database are sampled and energy-minimized. This is followed by a clustering step in which all of the energy-minimized conformers are grouped into several clusters, each containing conformers with similar energies and structures. Further energy minimization is conducted only for the lowest-energy conformer in each cluster.

Molecular surface area calculation. The molecular surface is the envelope of a molecule from which the solvent is excluded (11). The molecular surface area is estimated using the GEPOL algorithm (12) with the van der Waals radii of the AMBER force field (10) and a solvent probe radius of 1.4 Å. The polar molecular surface area is contributed by oxygen, nitrogen and polar hydrogen atoms (e.g., NH and OH), and the nonpolar molecular surface area by all other atoms. The molecular surface area of the bound state is calculated using the energy-minimized complex structures. The molecular surface area of the free state is calculated using the geometry of a tripeptide Gly-Xaa-Gly, where Xaa stands for the residue studied. The backbone conformation of the tripeptide is set to ψ = 140° and φ = 140°. The side-chain conformations and their populations are determined from a statistical survey of the side-chain conformations in 100 refined protein structures (13).
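The polar/nonpolar partition described above can be illustrated with a minimal sketch. It assumes that per-atom molecular surface areas have already been produced by some surface program; the `AtomArea` record, the `bonded_to` field and the numerical areas are hypothetical and are not taken from this chapter or from GEPOL.

```python
from dataclasses import dataclass

POLAR_HEAVY = {"O", "N"}  # heavy atoms counted as polar in the text


@dataclass
class AtomArea:
    element: str    # element symbol, e.g. "C", "N", "O", "H"
    bonded_to: str  # element of the attached heavy atom (used for H only)
    area: float     # molecular surface area of this atom, in Å^2


def is_polar(atom: AtomArea) -> bool:
    """O, N and hydrogens attached to O or N are polar; all other atoms are nonpolar."""
    if atom.element in POLAR_HEAVY:
        return True
    return atom.element == "H" and atom.bonded_to in POLAR_HEAVY


def split_area(atoms):
    """Return (polar area, nonpolar area) summed over the given atoms."""
    polar = sum(a.area for a in atoms if is_polar(a))
    nonpolar = sum(a.area for a in atoms if not is_polar(a))
    return polar, nonpolar


# Hypothetical per-atom areas, for illustration only.
atoms = [
    AtomArea("N", "", 4.0),
    AtomArea("H", "N", 2.5),   # amide hydrogen: polar
    AtomArea("C", "", 6.0),
    AtomArea("H", "C", 3.5),   # aliphatic hydrogen: nonpolar
    AtomArea("O", "", 9.0),
]
polar_area, nonpolar_area = split_area(atoms)
print(polar_area, nonpolar_area)
```

In practice the same partition would be applied to the bound-state and free-state structures separately, so that the differences in polar and nonpolar area can be formed afterwards.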
The change in the molecular surface area (ΔΔA) is estimated using the thermodynamic cycle shown in Scheme I, where E, I and I′ stand for the free enzyme, inhibitor and analog, respectively, and EI and EI′ stand for the complexes of thrombin with the wild-type and mutated inhibitors, respectively. The cycle satisfies ΔΔX = ΔX2 - ΔX1 = ΔX4 - ΔX3, which allows the measured relative thermodynamic functions (ΔΔG°, ΔΔH°, T°ΔΔS°) to be analyzed using the predicted structural properties (conformations and ΔΔA).
Scheme I

    E + I   --ΔX1-->  EI
      |                |
     ΔX3              ΔX4
      v                v
    E + I′  --ΔX2-->  EI′

Scheme II

    (IIa)   E + S  <==(k1, k2)==>  ES  --(kp)-->  P + E
    (IIb)   E + I  <==(k3, k4)==>  EI  <==(k5, k6)==>  EI*
III. Data analysis
A. Kinetic data transformation
For the system studied, the reactions between enzyme and substrate and between enzyme and inhibitor may be described by Scheme II. According to Scheme IIa, which represents the reaction of enzyme and substrate in the absence of inhibitor, the Michaelis constant (Km) and the maximal velocity (Vmax) are given by

\[ K_m = [E][S]/[ES] = (k_2 + k_p)/k_1 \tag{2} \]

and

\[ V_{max} = k_p[E] \tag{3} \]

respectively, where [E], [S] and [ES] are the concentrations of enzyme, substrate and enzyme-substrate complex. The enzymatic parameters Km and Vmax are estimated at each temperature using the equation

\[ v = V_{max}[S]/(K_m + [S]) \tag{4} \]

where v is the velocity of the enzyme-substrate reaction. According to Scheme IIb, which represents slow-binding inhibition, the progress curves of the enzymatic assay in the presence of a competitive inhibitor are analyzed using the following equation (14):

\[ P = v_s t + (v_0 - v_s)(1 - e^{-kt})/k \tag{5} \]

where P is the fluorescence intensity, vs is the steady-state velocity, t is time, v0 is the initial velocity and k is a parameter related to the kinetic mechanism (15). The variation of the steady-state velocity (vs) with inhibitor concentration ([I]) obtained using eq 5 is then used to determine the inhibition constant (Ki) through the following equation (16):

\[ v_s = V_{max}[S]/\{K_m(1 + [I]/K_i) + [S]\} + v_c \tag{6} \]

where Ki = k4k6/[k3(k5 + k6)] represents the overall inhibition constant and vc is a parameter used to account for the deviation from linearity (vc > 0).

Temperature dependence of Km and Vmax. Since both Km and Vmax enter the calculation of the inhibition constant (Ki) at each temperature, their temperature dependence must be determined beforehand. Figure 1 shows the temperature dependence of the Michaelis constant (Km)
and maximal velocity (Vmax) in the range of 10-45 °C. The temperature dependence of Km and Vmax is analyzed using the van't Hoff equation

\[ \ln K_m \approx \ln K_d = \Delta G^\circ(T)/RT = [\Delta H^\circ(T) - T\Delta S^\circ(T)]/RT = \{[\Delta H^\circ(T^\circ) - T\Delta S^\circ(T^\circ)] + \Delta C_p[(T - T^\circ) - T\ln(T/T^\circ)]\}/RT \tag{7} \]

and

\[ V_{max} = k_p[E]_T = A[E]_T\,e^{-E/RT} \tag{8} \]

respectively. The temperature dependence of Km is fairly weak at low temperatures (< 25 °C) but becomes strong at high temperatures. Vmax increases rapidly with temperature. The estimated parameters in eqs 7 and 8 are ΔH° = -12.3 ± 0.5 kcal/mol, T°ΔS° = -5.1 ± 0.5 kcal/mol, ΔCp = -0.80 ± 0.09 kcal/(mol·K), A = 9.65 × 10¹¹ s⁻¹ and E = 10.4 kcal/mol. The values of ΔH° and T°ΔS° are in good agreement with those previously published for the same system (17).

Figure 1. Temperature dependence of the Michaelis constant (Km) and maximal velocity (Vmax). The solid and dashed lines are the fits obtained using eqs 7 and 8 in the text, respectively. (Horizontal axis: T (K).)

Figure 2. Progress curves of the reaction of thrombin, substrate and inhibitor (P552) at 25 °C. The solid lines are the fits obtained using eq 5 in the text. (Horizontal axis: t (sec).)

Temperature dependence of Ki. Before the temperature dependence of Ki is determined, the progress curve of the enzymatic assay is analyzed using eq 5 in order to obtain the steady-state velocity. Figure 2 shows the assay data and the fitting results for the thrombin-substrate reaction inhibited by an inhibitor at varying concentrations at 25 °C. The steady-state velocity becomes more evident with increasing inhibitor concentration ([I]) because the inhibitor slows the depletion of the substrate. Figure 3 shows 1/vs vs. [I] for the same system in the inhibitor concentration range of 0-0.269 nM at 25 °C. The parameters of eq 6 for Figure 3 are Vmax = 28.8 µM/s, Km = 5.3 µM, Ki = 0.011 nM and vc = 0.00001 µM/s. Like the temperature dependence of Km, the variation of the inhibition constant with temperature may be analyzed using the van't Hoff equation (eq 7). Figure 4 shows ln Ki vs. T (K) for P552 and its analogs P552(F56G) and P552(I59G). In general, ln Ki vs. T (K) is nonlinear, which indicates that the binding enthalpy and entropy must be treated as temperature dependent. In fact, the fitted values of the heat capacity are largely negative, ranging from -644 to -193 cal/(mol·K) for the inhibitors listed in Table I. It is noteworthy that a negative heat capacity is characteristic of binding interactions.
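The curvature of ln Ki against temperature follows directly from eq 7 with a constant ΔCp. The following minimal sketch evaluates ΔG°(T) and Ki(T) for P552 from the 25 °C values listed in Table I; the function names and the value of the gas constant are our own choices for illustration, not part of the original analysis software.

```python
import math

R = 1.987e-3    # gas constant in kcal/(mol*K)
T_REF = 298.15  # reference temperature T° in K


def delta_g(T, dH_ref, TdS_ref, dCp):
    """Binding free energy (kcal/mol) at temperature T from eq 1, constant dCp.

    dH_ref  : ΔH°(T°) in kcal/mol
    TdS_ref : T°ΔS°(T°) in kcal/mol
    dCp     : ΔCp in kcal/(mol*K)
    """
    dS_ref = TdS_ref / T_REF
    return (dH_ref - T * dS_ref) + dCp * ((T - T_REF) - T * math.log(T / T_REF))


def inhibition_constant(T, dH_ref, TdS_ref, dCp):
    """Ki in mol/L from ln Ki = ΔG°(T)/RT (eq 7 with Km replaced by Ki)."""
    return math.exp(delta_g(T, dH_ref, TdS_ref, dCp) / (R * T))


# P552 at 25 °C, values from Table I
p552 = dict(dH_ref=-13.66, TdS_ref=1.31, dCp=-0.644)

for celsius in (10, 25, 45):
    T = celsius + 273.15
    dG = delta_g(T, **p552)
    ki_nM = inhibition_constant(T, **p552) * 1e9
    print(f"{celsius:2d} °C  ΔG° = {dG:7.2f} kcal/mol  Ki = {ki_nM:.4f} nM")
```

At the reference temperature the ΔCp term vanishes, so the sketch reduces to ΔG° = ΔH° - T°ΔS°, which can be compared against the tabulated ΔG° and Ki as a consistency check.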
Figure 3. Reciprocal of the steady-state velocity (1/vs) vs. the inhibitor concentration ([I]) for P552 at 25 °C. The data are fitted using eq 6 in the text. (Horizontal axis: [I] (nM).)
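The two fitting steps behind Figure 3 can be sketched as follows: eq 5 describes each progress curve, and eq 6 relates the resulting steady-state velocity to [I]. With vc = 0, 1/vs is linear in [I], and Ki follows from the slope and intercept. The parameter values for eq 6 are those quoted in the text for P552 at 25 °C; the values of v0, vs and k used for the progress curve are hypothetical, and the function names are ours.

```python
import math


def progress_curve(t, v0, vs, k):
    """Eq 5: signal P at time t for slow-binding inhibition."""
    return vs * t + (v0 - vs) * (1.0 - math.exp(-k * t)) / k


def steady_state_velocity(I, S, Vmax, Km, Ki, vc=0.0):
    """Eq 6: steady-state velocity at inhibitor concentration I."""
    return Vmax * S / (Km * (1.0 + I / Ki) + S) + vc


# Parameters quoted in the text (all concentrations in µM; Ki = 0.011 nM)
Vmax, Km, Ki, S = 28.8, 5.3, 0.011e-3, 40.0

# Two points on the 1/vs vs. [I] line (vc neglected for this illustration)
I1, I2 = 0.0, 0.269e-3
y1 = 1.0 / steady_state_velocity(I1, S, Vmax, Km, Ki)
y2 = 1.0 / steady_state_velocity(I2, S, Vmax, Km, Ki)
slope = (y2 - y1) / (I2 - I1)
intercept = y1

# intercept = (Km + S)/(Vmax*S) and slope = Km/(Ki*Vmax*S), hence:
Ki_recovered = Km * intercept / (slope * (Km + S))
print(f"Ki recovered from the line: {Ki_recovered * 1e3:.4f} nM")

# Hypothetical progress curve: the late-time slope approaches vs
v0_demo, vs_demo, k_demo = 25.0, 5.0, 0.01
late_slope = (progress_curve(2000.0, v0_demo, vs_demo, k_demo)
              - progress_curve(1000.0, v0_demo, vs_demo, k_demo)) / 1000.0
print(f"late-time slope: {late_slope:.4f} (vs = {vs_demo})")
```

A nonzero vc, as used by the authors, adds a constant to vs and makes 1/vs deviate from a straight line at high [I], so in that case eq 6 would be fitted directly by nonlinear regression rather than by the linear shortcut shown here.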
Figure 4. Variation of the inhibition constant (Ki) with temperature: (A) P552, (B) P552(F56G) and (C) P552(I59G). The solid lines are the fits obtained using eq 7, where Km is replaced by Ki. (Horizontal axis: T (K).)
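Because RT ln Ki in eq 7 is linear in ΔH°(T°), T°ΔS°(T°) and ΔCp, fits such as those in Figure 4 can in principle be obtained by ordinary linear least squares. The sketch below is one possible implementation, not the authors' fitting procedure; it is exercised on synthetic, noise-free Ki values generated from the P552 parameters of Table I, and all function names are ours.

```python
import math

R = 1.987e-3    # kcal/(mol*K)
T_REF = 298.15  # K


def g(T):
    """Temperature function multiplying ΔCp in eq 7."""
    return (T - T_REF) - T * math.log(T / T_REF)


def delta_g(T, dH_ref, TdS_ref, dCp):
    """ΔG°(T) from eq 1 with temperature-independent ΔCp."""
    return dH_ref - T * (TdS_ref / T_REF) + dCp * g(T)


def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_vant_hoff(temps, ki_values):
    """Return (ΔH°(T°), T°ΔS°(T°), ΔCp) from Ki(T) via the normal equations.

    Model: RT ln Ki = ΔH° * 1 + T°ΔS° * (-T/T°) + ΔCp * g(T)
    """
    rows = [[1.0, -T / T_REF, g(T)] for T in temps]
    y = [R * T * math.log(K) for T, K in zip(temps, ki_values)]
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return tuple(solve3(AtA, Aty))


# Synthetic data: Ki (mol/L) every 5 K from 10 to 45 °C, generated from Table I values
true_params = (-13.66, 1.31, -0.644)
temps = [283.15 + 5.0 * i for i in range(8)]
ki_values = [math.exp(delta_g(T, *true_params) / (R * T)) for T in temps]

fitted = fit_vant_hoff(temps, ki_values)
print("fitted ΔH°, T°ΔS°, ΔCp:", [round(p, 3) for p in fitted])
```

With real data the three regressors are strongly correlated over a 35 K window, which is one reason the uncertainties on ΔH°, T°ΔS° and ΔCp are much larger than those on ΔG°.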
B. Thermodynamic data analysis
Table I lists the thermodynamic functions of some inhibitors studied previously. The free energy change, ΔG°, is determined directly from the logarithm of the inhibition constant (i.e. RT ln Ki); the enthalpy, entropy and heat capacity are obtained by fitting eq 7 with Km replaced by Ki. Based on Scheme I, the relative thermodynamic functions (ΔΔG°, ΔΔH°, T°ΔΔS°) are given by

\[ \Delta\Delta X = \Delta X_m - \Delta X_w \tag{9} \]

where "m" and "w" stand for the "mutated" and "wild-type" inhibitors, respectively. With these we are able to analyze the thermodynamic changes due to the mutation in conjunction with the structural features predicted from the molecular modelling. Interestingly, for most analogs the decrease in binding affinity is attributed to a less favorable enthalpy, whereas for P552(F56G) it is attributed to an unfavorable entropy instead of enthalpy (see Table I). Molecular modelling of P552(F56G) suggests that the backbone at AspH55-GlyH56-GluH57 undergoes a large movement, in which the Cα of GlyH56 shifts to the Cβ position of the original residue PheH56. Meanwhile, the water molecules hydrogen-bonded to AspH55 and GluH57 move towards thrombin and form additional hydrogen bonds. It is therefore believed that the movement of both the backbone and the water molecules towards thrombin compensates for the loss of molecular interactions caused by the removal of the phenyl ring of PheH56 and reduces the configurational entropy of the inhibitor in this region. For the analogs other than P552(F56G), no such conformational change is predicted. Correspondingly, the enthalpy and entropy of these analogs exhibit regular changes.

Table I. Thermodynamic functions of thrombin and inhibitor interactions (25 °C)*

Inhibitor     Ki (nM)        ΔG° (kcal/mol)   ΔH° (kcal/mol)   T°ΔS° (kcal/mol)   ΔCp (kcal/(mol·K))
P552          0.011 ± 0.0    -14.97 ± 0.02    -13.66 ± 1.12     1.31 ± 1.13       -0.644 ± 0.211
P552(F56G)    5.20 ± 0.32    -11.30 ± 0.03    -12.74 ± 0.69    -1.44 ± 0.70       -0.193 ± 0.131
P552(I59G)    1.94 ± 0.03    -11.88 ± 0.01     -8.08 ± 1.23     3.80 ± 1.24       -0.439 ± 0.232
P552(P60G)    0.39 ± 0.08    -12.84 ± 0.10    -10.73 ± 1.62     2.11 ± 1.64       -0.573 ± 0.305
P552(Y63G)    0.21 ± 0.03    -13.19 ± 0.11     -6.07 ± 1.57     7.12 ± 1.58       -0.623 ± 0.294
P552(L64G)    0.32 ± 0.02    -12.95 ± 0.04    -10.22 ± 1.04     2.73 ± 1.05       -0.532 ± 0.197

*Data are adapted from reference (1).

Moreover, linear enthalpy-entropy compensation is observed for the congeneric series of bivalent thrombin inhibitors. Figure 5 shows the relative enthalpy (ΔΔH°) vs. the relative entropy (ΔΔS°).

Figure 5. Enthalpy-entropy compensation for the congeneric series of bivalent thrombin inhibitors. The filled circles are data from reference (1) and the open circles are data from reference (17). (Horizontal axis: ΔΔS° (cal/(mol·K)).)

The relative free energy (ΔΔG°) and the overall change in the molecular surface area are found to be linearly correlated for the substitution of the nonpolar residues at the FRE:

\[ \Delta\Delta G^\circ = 0.0154\,\Delta\Delta A_{net} + 1.34 \ \text{(kcal/mol)} \quad (r = 1.00) \tag{10} \]
where ΔΔAnet = ΔΔAnpl - ΔΔApol ("npl" and "pol" stand for the nonpolar and polar molecular surface areas, respectively), which are calculated from the inhibitors in their bound and free states using Scheme I. It is important to note that enthalpy-entropy compensation is an intrinsic property of solute-solute interactions but is greatly enhanced by the participation of solvent, and this feature of the compensation is essential for the linear correlation between the relative free energy and the change in molecular surface area due to the mutation (18).
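As a worked illustration of eq 9 and of the compensation discussed above, the relative functions can be computed directly from the 25 °C values in Table I. This is a minimal sketch; the dictionary layout and function name are ours, and the uncertainties listed in the table are ignored.

```python
# (ΔG°, ΔH°, T°ΔS°) in kcal/mol at 25 °C, taken from Table I
table_1 = {
    "P552":       (-14.97, -13.66,  1.31),
    "P552(F56G)": (-11.30, -12.74, -1.44),
    "P552(I59G)": (-11.88,  -8.08,  3.80),
    "P552(P60G)": (-12.84, -10.73,  2.11),
    "P552(Y63G)": (-13.19,  -6.07,  7.12),
    "P552(L64G)": (-12.95, -10.22,  2.73),
}


def relative_functions(mutant, wild_type="P552"):
    """Eq 9: (ΔΔG°, ΔΔH°, T°ΔΔS°) = mutated minus wild-type, rounded to 0.01 kcal/mol."""
    return tuple(round(m - w, 2) for m, w in zip(table_1[mutant], table_1[wild_type]))


for name in table_1:
    if name == "P552":
        continue
    ddG, ddH, TddS = relative_functions(name)
    # ΔΔG° = ΔΔH° - T°ΔΔS°; a negative T°ΔΔS° means an entropic penalty
    print(f"{name:11s} ΔΔG° = {ddG:5.2f}  ΔΔH° = {ddH:5.2f}  T°ΔΔS° = {TddS:5.2f}")
```

For P552(F56G) the enthalpy change is small and the loss of affinity comes mainly from the entropy term, whereas for the other analogs a large unfavorable ΔΔH° is partly offset by a favorable T°ΔΔS°, which is the compensation pattern described in the text.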
IV. Summary
The method presented in this article is applicable to enzyme-inhibitor interactions and is particularly useful for systems with high binding affinity. Analysis of the thermodynamic functions in conjunction with the structural features predicted by molecular modelling may reveal the structural origin of changes in enzyme-inhibitor interactions upon point mutation or substitution. The method may be used to develop strategies for rational inhibitor/drug design and protein/enzyme engineering.
References
1. Cheng, Y., Slon-Usakiewicz, J., Wang, J., Purisima, E., & Konishi, Y. (1996) Biochemistry 35, in press.
2. Wang, J., Szewczuk, Z., Yue, S.-Y., Tsuda, Y., Konishi, Y., & Purisima, E. O. (1995) J. Mol. Biol. 253, 473-492.
3. Stone, S. R., & Hofsteenge, J. (1986) Biochemistry 25, 4622-4628.
4. Rydel, T. J., Ravichandran, K. G., Tulinsky, A., Bode, W., Huber, R., Roitsch, C., & Fenton, J. W., II (1990) Science 249, 277-280.
5. Betz, A., Hofsteenge, J., & Stone, S. R. (1991) Biochemistry 30, 9848-9853.
6. Yue, S.-Y., DiMaio, J., Szewczuk, Z., Purisima, E. O., Ni, F., & Konishi, Y. (1992) Protein Eng. 5, 77-85.
7. Yuko, T., Cygler, M., Gibbs, B. F., Pedyczak, A., Fethiere, J., Yue, S.-Y., & Konishi, Y. (1994) Biochemistry 33, 14443-14451.
8. Borgne, S. L., & Graber, M. (1994) Appl. Biochem. Biotech. 48, 125-135.
9. Fethiere, J., Tsuda, Y., Coulombe, R., Konishi, Y., & Cygler, M. (1996) Protein Sci. 5, 1174-1183.
10. Weiner, S. J., Kollman, P. A., Nguyen, D. T., & Case, D. A. (1986) J. Comp. Chem. 7, 230-252.
11. Richards, F. M. (1977) Annu. Rev. Biophys. Bioeng. 6, 151-176.
12. Pascual-Ahuir, J. L., Silla, E., & Tuñón, I. (1994) J. Comp. Chem. 15, 1127-1138.
13. Blaber, M., Zhang, X., Lindstrom, J. D., Pepiot, S. D., Baase, W. A., & Matthews, B. W. (1994) J. Mol. Biol. 235, 600-624.
14. Segel, I. H. (1975) Enzyme Kinetics: Behavior and Analysis of Rapid Equilibrium and Steady-State Enzyme Systems, pp 100-160, John Wiley & Sons.
15. Morrison, J. F., & Stone, S. R. (1985) Comments Mol. Cell Biophys. 2, 347-368.
16. Morrison, J. (1988) Adv. Enzymol. 61, 201-301.
17. Di Cera, E., De Cristofaro, R., Albright, D. J., & Fenton, J. W., II (1991) Biochemistry 30, 7913-7924.
18. Cheng, Y., Wang, J., Slon-Usakiewicz, J., Purisima, E. O., & Konishi, Y., manuscript in preparation.
Development and characterization of a Fab fragment as a surrogate for the IL-1 receptor
Y. Cong, A. S. McColl, T. R. Hynes, R. C. Meckel, P. S. Mezes, C. L. Lane, S. E. Lee, D. J. Wasilko, K. F. Geoghegan, I. G. Otterness and G. O. Daumy
Central Research Division, Pfizer Inc., Eastern Point Road, Groton, Connecticut 06340

I. Introduction
Interactions between proteins account for a substantial number of biological signaling events. These include classical hormone-receptor interactions at the cell surface, as well as the great number of intracellular interactions revealed by analyses of signal transduction pathways. They represent an attractive but difficult set of potential targets for pharmaceutical intervention. Most successful drugs are compounds of molecular mass < 500 Da that are bioavailable when taken orally. Experience has shown that such small compounds, while capable of binding tightly to pocket-like sites that accommodate natural ligands of similar structure and size, usually cannot bind with sufficiently high affinity to the molecular surfaces of proteins recognized by other proteins (1,2). As a result, it remains to be determined how drugs that disrupt protein-protein interactions can be developed. In these circumstances, it appears worthwhile to create model systems that allow aspects of this problem to be analyzed. We have elected to use a monoclonal antibody that binds to human interleukin-1β (IL-1β) by recognizing amino acid residues that are also recognized by the IL-1β receptor (IL-1R). A monoclonal antibody that recognizes the receptor-binding residues of a cytokine can be considered a surrogate for the cytokine's natural receptor. Such a reagent might be valuable in assessing the structural basis of cytokine-receptor affinity, and could furnish a starting point from which to attempt the design of smaller competitive agents (3,4). Certain precautions need to be observed at the outset of this effort.
To be an appropriate subject for downsizing, an antibody must achieve critical interactions with at least part of the receptor-binding surface of IL-1β. This is to exclude selection of an antibody that blocks access of IL-1β to the receptor merely by steric overlap of its molecular bulk with the space occupied by bound receptor (5). Such an antibody, on downsizing, would lead to a compound that fails to compete with IL-1β binding. Here we describe the selection and characterization of an antibody, and its Fab fragment, that provides a suitable starting point for this endeavor.

II. Methods
A. Cloning and expression of IL-1β and its mutant derivatives
A clone of the human IL-1β gene, modified to reflect the preferred codon usage of E. coli, was obtained from R&D Systems (Minneapolis, MN) and subcloned into the expression vector pET22b (Novagen, Madison, WI). Site-directed mutagenesis was performed to generate the desired mutations of IL-1β (6). The pelB leader encoded by the vector was eliminated by NdeI digestion and religation, resulting in the final expression construct. The mutations chosen for this study were a cysteine substitution of residue K138 and alanine substitutions of residues R4 and L6. While the K138C mutation does not affect receptor binding (7, 8), the R4A and L6A changes were expected to produce proteins defective in binding to the IL-1R Type I (9-11). Two mutant forms of IL-1β were constructed. One harbored only the single K138C substitution, and was termed "native" to denote the unaltered condition of its receptor-binding surface. The second, termed "mutant #1", harbored the R4A and L6A substitutions in addition to the K138C replacement. The mutations were verified by DNA sequencing.

B. Preparation of wild type rhIL-1β and rhIL-1β mutants
Recombinant wild type hIL-1β, the K138C mutant and the K138C, R4A, L6A triple mutant (mutant #1) were isolated from the soluble fraction of E. coli lysates by ammonium sulfate fractionation and hydrophobic interaction chromatography. The purified proteins were characterized by SDS-PAGE, western blots, N-terminal sequencing, size exclusion chromatography (SEC), isoelectric focusing (IEF), matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS), and electrospray mass spectrometry (ESMS).

C. Biotinylation of K138C IL-1β mutants
The K138C mutants (~1.5 mg in 0.5 mL of PBS) were treated with 50 mM DTT (removed by gel filtration) and then biotinylated using biotin-maleimide (Sigma) (2:1 molar ratio). The biotinylated product was purified by gel filtration on Superdex 75. Evidence of biotinylation was routinely obtained by western blots probed with an avidin-HRP conjugate (Pierce) and by binding of the proteins to streptavidin-coated BIAcore chips (Pharmacia SA5).

D. Isolation of monoclonal antibody (mAb) and preparation of Fab fragments
BALB/c mice immunized with recombinant human interleukin-1β (rhIL-1β) were the source of splenocytes for the production of mAbs. The X63-Ag8.653 myeloma cell line was used as the fusion partner (12). The mAbs were selected by binding to biotinylated K138C IL-1β immobilized in streptavidin-coated plates. A clone (F18/1E3) that produced an anti-IL-1β antibody (IgG1) with the slowest off-rate when tested on K138C IL-1β immobilized on BIAcore chips was scaled up in two 1 L spinners, and the mAb (1E3) was isolated by protein G affinity chromatography. Fab fragments were prepared by digestion with immobilized papain and purified by size exclusion chromatography on Superdex 75 (Pharmacia) after undigested IgG and Fc fragments were first removed by affinity chromatography on protein A.

E. Determination of kinetic constants
Biotinylated K138C and mutant #1 were immobilized onto streptavidin-derivatized biosensor chips (Pharmacia SA5) by direct injection. Kinetic analysis of the binding of soluble IL-1R to the immobilized forms of IL-1β was carried out at low density, i.e. below 100 resonance units (RU). Each binding cycle, with either soluble rhIL-1R (Genzyme) or Fab fragments as analytes, was performed at a constant flow of 5 µL/min in PBS containing 0.02% Tween. After the binding cycle, the chip was regenerated to its RU baseline with either 5 mM or 2.5 mM NaOH, depending on the amount of analyte bound (as gauged by RU). Rate constants of dissociation (koff) were determined from analyte-saturated chips in order to minimize rebinding. Kinject, as described in the BIAcore manual, was used to confirm that koff was being measured under conditions free of rebinding. Rate constants of association (kon) were determined at different analyte concentrations and averaged (ave kon). The ratio koff/ave kon was used to determine the dissociation constant (Kd) of the soluble IL-1 receptor and Fab fragments for immobilized K138C and mutant #1. All data analysis was carried out using the BIAevaluation software (Pharmacia). For competition experiments, Fab was chemically coupled to CM5 chips activated by treatment with 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride and N-hydroxysuccinimide, and the binding of wild type IL-1β (29 nM) was measured at different concentrations of soluble IL-1 receptor (0-167 nM).

III. Results
rhIL-1β and its mutant and biotinylated mutant derivatives were characterized biochemically before use. All the proteins appeared homogeneous by SDS-PAGE and SEC, but microheterogeneity was observed by ESMS, MALDI-MS, IEF and N-terminal sequencing. This was largely accounted for by N-terminal variation. Although all of the recombinant forms of IL-1β originally had an N-terminus of Met-Ala-Pro, products with N-termini of Met, Ala and Pro were obtained. Kinetics of the binding of soluble IL-1R to immobilized IL-1β K138C and mutant #1 were measured using a BIAcore system (Table I). Soluble IL-1R exhibited about 8-fold higher affinity for the K138C mutant than for mutant #1.
The higher Kd exhibited for mutant #1 resulted from both a 4-fold lower association rate constant (kon) and a 2-fold higher dissociation rate constant (koff). This experiment confirmed that mutant #1, with the R4A and L6A mutations, was defective in binding the receptor.

Table I. Kinetics of IL-1R binding to biotinylated IL-1β K138C and "mutant #1"

IL-1 form    Mutation           koff (sec⁻¹)   kon (M⁻¹ sec⁻¹)   Kd (nM)
"native"     K138C              0.0037         8.7 × 10⁵          4.3
mutant #1    K138C, R4A, L6A    0.0087         2.4 × 10⁵          35.5
In a similar fashion, the binding of Fab fragment 103095 to the same two ligands was also assessed (Table II). The Fab fragment bound to IL-1β K138C with Kd ≈ 200 nM, i.e. about 50-fold more weakly than IL-1R bound this ligand. The higher Kd resulted primarily from a 20-fold lower kon. However, the Fab expressed the same relative preference as IL-1R between the two forms of IL-1β; it recognized the K138C mutant, but failed to recognize the mutant defective in receptor binding.

Table II. Kinetics of Fab binding to biotinylated IL-1β K138C and "mutant #1"

IL-1 form    Mutation           koff (sec⁻¹)   kon (M⁻¹ sec⁻¹)   Kd (nM)
"native"     K138C              0.0069         3.5 × 10⁴          197
mutant #1    K138C, R4A, L6A    no binding
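The dissociation constants in Tables I and II follow from Kd = koff/kon, as described in the Methods. The short sketch below recomputes them and the fold differences quoted in the text. The exponents of kon are not legible in the source tables; the values used here (10⁵ for IL-1R, 10⁴ for Fab) are assumptions chosen because they are the only powers of ten consistent with the tabulated Kd values. The function and variable names are ours.

```python
def kd_nM(koff, kon):
    """Dissociation constant in nM from koff (1/s) and kon (1/(M*s))."""
    return koff / kon * 1e9


# (koff, kon); kon exponents are assumed, see the note above
il1r_native = (0.0037, 8.7e5)   # IL-1R binding to K138C ("native")
il1r_mutant = (0.0087, 2.4e5)   # IL-1R binding to mutant #1
fab_native = (0.0069, 3.5e4)    # Fab binding to K138C ("native")

kd_receptor_native = kd_nM(*il1r_native)
kd_receptor_mutant = kd_nM(*il1r_mutant)
kd_fab_native = kd_nM(*fab_native)

print(f"IL-1R / K138C     Kd = {kd_receptor_native:6.1f} nM")
print(f"IL-1R / mutant #1 Kd = {kd_receptor_mutant:6.1f} nM")
print(f"Fab   / K138C     Kd = {kd_fab_native:6.1f} nM")

# Fold differences discussed in the Results
print(f"mutant #1 vs K138C (IL-1R): {kd_receptor_mutant / kd_receptor_native:.1f}-fold higher Kd")
print(f"Fab vs IL-1R (K138C):       {kd_fab_native / kd_receptor_native:.0f}-fold higher Kd")
print(f"kon ratio IL-1R/Fab:        {il1r_native[1] / fab_native[1]:.0f}-fold")
```

Small differences between the recomputed and tabulated Kd values are expected, because the rate constants in the tables are rounded to two significant figures.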
For competition studies, the Fab fragment was chemically coupled to the BIAcore chip and wild type rhIL-1β was used as analyte. Figure 1 presents the BIAcore sensorgrams obtained when different concentrations of wild type rhIL-1β (0.6-29 nM) were passed sequentially over the immobilized Fab chip. Soluble IL-1R was able to inhibit the binding of wild type rhIL-1β to immobilized Fab (Figure 2). Inhibition increased with increasing levels of soluble receptor until about 75% of the binding signal was suppressed. A residual level of apparent binding was attributed to nonspecific interaction by soluble proteins with the chip.
Figure 1. Sensorgram traces for binding experiments in which different concentrations of rhIL-1β were allowed to bind to anti-IL-1 Fab immobilized on a biosensor chip. The arrow indicates the time at which analyte was injected. (Horizontal axis: time (sec).)

Figure 2. Competitive inhibition by soluble IL-1 receptor of the binding of wild type rhIL-1β to immobilized Fab. (Horizontal axis: IL-1 receptor (nM).)
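The shape of a competition curve such as Figure 2 can be rationalized with a simple solution-equilibrium model. The sketch below is an idealized illustration, not the authors' analysis: it assumes 1:1 binding between IL-1β and soluble receptor in solution, that receptor-bound IL-1β cannot bind the immobilized Fab, and that the solution Kd for wild type IL-1β equals the 4.3 nM measured for immobilized K138C. It also ignores the nonspecific residual signal reported in the text. All names are ours.

```python
import math


def free_ligand(L_total, R_total, Kd):
    """Free ligand concentration for 1:1 binding L + R <-> LR.

    All arguments share the same concentration unit (nM here). The complex
    concentration is the physically meaningful root of the mass-action quadratic.
    """
    b = L_total + R_total + Kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * L_total * R_total)) / 2.0
    return L_total - complex_conc


IL1_TOTAL = 29.0   # nM wild type IL-1β, as used in the competition experiment
KD_ASSUMED = 4.3   # nM, assumed equal to the value in Table I

for receptor in (0.0, 10.0, 20.0, 40.0, 80.0, 167.0):
    free = free_ligand(IL1_TOTAL, receptor, KD_ASSUMED)
    suppressed = 100.0 * (1.0 - free / IL1_TOTAL)
    print(f"IL-1R = {receptor:6.1f} nM  free IL-1β = {free:5.2f} nM  "
          f"predicted suppression = {suppressed:5.1f}%")
```

Under these assumptions the predicted suppression rises steeply until the receptor concentration exceeds that of IL-1β and then levels off, which is qualitatively the behavior described for Figure 2; any plateau below complete suppression would have to come from effects outside this model.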
IV. Discussion The receptor-binding site of IL-ip has been mapped extensively by site-directed mutagenesis (9-11). These studies have revealed that two regions located about 25 A apart are important for receptor interactions (Figure 3, see color insert). One region (Patch A) is formed by the loop between p-strands 3 and 4, and is located on the side of the pbarrel. It is comprised of H30 and Q32. The other (Patch B) is located at the open end of the p-barrel, and encompasses a discontinuous set of residues including R4, L6, F46,156, K93, K103 and E105. The spatial separation of the two regions makes it unlikely that they could be bridged by a designed small molecule or even by an antibody, since antibody epitopes typically encompass four to nine amino acids (13-15). Nevertheless, anti IL-ip neutralizing antibodies have been shown to block the IL-lp:receptor interaction, and were also shown to bind at different but overlapping regions where the receptor binds (5). Thus, an antibody binding interaction to one of the two domains was strong enough to block the IL-lp:receptor interaction. We sought such an antibody as a first step toward designing a small molecular weight IL-ip antagonist. The traditional method of generating a blocking antibody has been to elicit a series of antibodies against the ligand and then determine which of the antibodies are neutralizing. Instead, we set out to generate a series of antibodies against one of the critical binding patches in the IL-1P:IL-1R interaction. Our criterion for antibody selection was that an antibody should bind well with wild type IL-lp, but poorly against a
528
Y. Cong etaL
mutant IL-ip in which prominent residues of the patch B binding region had been substituted by alanine. We also sought an efficient way to select for antibody that recognized the patch B binding region. It has previously been shown (5, 8) that the site-selectively monobiotinylated IL-lp mutant K138C retains its binding to IL-IR on a streptavidin surface, presumably because biotinylation occurs on a surface residue remote from the two binding regions (Figure 3). We bound monobiotinylated, oriented K138C IL-lp to a BIAcore streptavidin-coated chip and carried out binding studies with IL-IR in the presence of different concentrations of either K138C or wild type IL-lp as competitor. No appreciable differences in binding constants were apparent between K138C and wild type IL-ip, confirming that K138C was fully active in receptor binding using the E14 murine cell line and in the traditional IL-l/LAF bioassay (data not shown). Therefore, use of biotinylated K138C allows coating of a functional, orientated IL-1 on a streptavidincoated microtiter plate. In a standard antibody selection protocol, IL-ip would have been applied to the wells of microtiter plates so that binding could occur in a random fashion. With much of the antigen denatured by interaction with the plastic, antibodies selected by the screen would include many capable of recognizing only denatured IL-1 p. These would lack the ability to block receptor binding by the native cytokine. To focus monoclonal antibody selection on the native receptor-recognizing regions, only antibodies that bound to biotinylated K138C orientated on streptavidin plates were selected for further study. As a result, instead of the relatively large number of positive clones that might have been expected (potentially >100 in this case), only six antibodies were obtained, and all six of these bound to biologically active IL-1 p. 
This result appeared to validate the strategy of using a strategically oriented IL-1β as the antigen in the screening step, although no systematic comparison with a less specific method was performed to confirm this. Since the rate constant of dissociation (koff), rather than the rate constant for association, is the primary determinant of differences in the Kd, we determined the apparent koff for each of the antibodies. The antibody (1E3) with the lowest apparent koff, and therefore presumably the lowest Kd, was chosen for further study with the triple mutant K138C, R4A, L6A. The triple mutant was prepared and its binding to IL-1R was compared to that of K138C. The results confirmed the importance of R4 and L6 for IL-1R binding: a 10-fold increase in Kd was found for the triple mutant compared to K138C alone. To minimize the effects of steric hindrance and divalent binding of IgG-1E3, a Fab fragment was prepared and its binding to the triple mutant was compared with its binding to biotinylated K138C. Fab-1E3 failed to bind to the triple mutant. This result demonstrated the successful selection of an antibody to the receptor-binding surface of the IL-1β molecule. It also demonstrated a fundamental difference between the IL-1β:antibody and the IL-1β:IL-1R binding interfaces. The IL-1R protein:protein interaction interface contains at least two spatially separated binding domains. Diminished binding due to mutation at one domain raises the Kd, but need not abolish binding, because residues elsewhere can still support a lower-affinity interaction. By contrast, an antibody
binding domain encompasses a limited number of spatially contiguous residues. Changes in those spatially close critical residues more readily abolish antibody binding. Finally, although by the criterion of non-binding to the triple mutant Fab-1E3 appeared likely to be a receptor antagonist, it was important to confirm this. Consistent with the direct involvement of R4 and L6 in receptor binding, IL-1R binding to wild-type IL-1β decreased the binding of Fab-1E3, and conversely, the binding of Fab-1E3 to wild-type IL-1β decreased the binding of IL-1R.

The techniques developed during these studies are broadly applicable to selecting surrogate receptor (or ligand) antibodies toward other protein ligand:receptor pairs. First, the use of a biologically active, oriented ligand can result in a much more efficient first selection for blocking antibodies. Second, negative selection using an appropriate mutant will directly provide a blocking antibody that will also be a surrogate receptor (or ligand). We used K138C for the first selection and the triple mutant K138C, R4A, L6A for the second selection, and found the blocking antibody Fab-1E3. Replacing negative selection using an appropriate mutant with a traditional positive selection scheme based on blocking activity will, of course, provide blocking antibodies, but such a scheme will also detect blocking antibodies that are not receptor surrogates and thus are poor candidates for downsizing. Fab-1E3 fits the criterion that it is a receptor surrogate and therefore should be suitable for downsizing.

References
1. Braisted, A. C. and Wells, J. A. (1996) Proc. Natl. Acad. Sci. USA 93, 5688-5692.
2. DeGrado, W. F. and Sosnick, T. R. (1996) Proc. Natl. Acad. Sci. USA 93, 5680-5681.
3. Smythe, M. L. and von Itzstein, M. (1994) J. Am. Chem. Soc. 116, 2725-2733.
4. Saragovi, H. U., Fitzpatrick, D., Raktabutr, A., Nakanishi, H., Kahn, M. and Greene, M. I. (1991) Science 253, 792-795.
5. Simon, P. L., Kumar, V., Lillquist, J. S., Bhatnagar, P., Einstein, R., Lee, J., Porter, T., Green, D., Sathe, G. and Young, P. R. (1993) J. Biol. Chem. 268, 9771-9779.
6. Kunkel, T. A., Roberts, J. D. and Zakour, R. A. (1987) Methods Enzymol. 154, 367-382.
7. Wingfield, P., Graber, P., Shaw, A. R., Gronenborn, A. M., Clore, G. M. and MacDonald, H. R. (1989) Eur. J. Biochem. 179, 565-571.
8. Chollet, A., Bonnefoy, J.-Y. and Odermatt, N. (1990) J. Immunol. Methods 127, 179-185.
9. Labriola-Tomkins, E., Chandran, C., Kaffka, K. L., Biondi, D., Graves, B. J., Hatada, M., Madison, V. S., Karas, J., Kilian, P. L. and Ju, G. (1991) Proc. Natl. Acad. Sci. USA 88, 11182-11186.
10. Grutter, M. G., van Oostrum, J., Priestle, J. P., Edelmann, E., Joss, U., Feige, U., Vosbeck, K. and Schmitz, A. (1994) Prot. Eng. 7, 663-671.
11. Evans, R. J., Bray, J., Childs, J. D., Vigers, G. P. A., Brandhuber, B. J., Skalicky, J. J., Thompson, R. C. and Eisenberg, S. P. (1995) J. Biol. Chem. 270, 11477-11483.
12. Kearney, J. F., Radbruch, A., Liesegang, B. and Rajewsky, K. (1979) J. Immunol. 123, 1548-1550.
13. Kabat, E. (1970) Ann. N. Y. Acad. Sci. 169, 43-54.
14. Schecter, I. (1971) Ann. N. Y. Acad. Sci. 190, 394-419.
15. Hodges, R. S., Heaton, R. J., Parker, J. M. R., Molday, L. and Molday, R. S. (1988) J. Biol. Chem. 263, 11768-11775.
16. Priestle, J. P., Schaer, H. P. and Gruetter, M. G. (1989) Proc. Natl. Acad. Sci. USA 86, 9667-9671.
SECTION VII Macromolecular Assemblies
Topology of Membrane Proteins in Native Membranes Using Matrix-assisted Laser Desorption Ionization/Mass Spectrometry
Kamala Tyagarajan¹, John G. Forte¹ and R. Reid Townsend²
¹Dept. of Molecular and Cell Biology, University of California, Berkeley, CA 94720-3200 and ²Dept. of Pharmaceutical Chemistry, University of California, San Francisco, CA 94143-0446
I. Introduction
Knowledge of the topological orientation of membrane proteins within native membranes is fundamental to establishing structure-function relationships. In particular, information on topology is important for understanding the structural basis underlying the translocation function of cation pumps like the Na,K-ATPase, Ca-ATPase or H,K-ATPase. Previous efforts to define topology have used both theoretical and experimental approaches, such as hydropathy plots, proteolysis of vesicles, binding of regio-specific antibodies, or labeling with group-specific, membrane-sided reagents followed by identification of the modified sites (1, 2). Proteolysis of sided vesicles followed by analysis of peptide products has been one of the most common approaches to determine exposed peptide sequences. Conversely, the remaining membrane-associated peptides can be analyzed after exhaustive protease digestion. Many analyses have utilized SDS-PAGE to separate proteolytic fragments, followed by Edman sequencing of peptides or identification using regio-specific antibodies (3, 4). However, these approaches are not useful for identifying small peptides (< 5 kDa) from proteolysis. Alternatively, HPLC separation of peptides followed by Edman sequencing is possible but time-consuming, and the coelution of multiple peptides makes identification by Edman sequencing difficult. More recently, mass spectrometry has been used for the identification of peptides and glycopeptides in topological studies (5-8). In this study, we used matrix-assisted laser desorption ionization/mass spectrometry (MALDI/MS) to identify the peptides released from gastric parietal cell microsomes. MALDI, because of its sensitivity and relative tolerance to the presence of salts and buffers, was examined for the analysis of unfractionated proteolytic digests (9, 10). MALDI with post-source decay (PSD) analysis was used to obtain sequence information on peptides even in crude digestion mixtures.
Our strategy (Figure 1) consisted of proteolysis of intact vesicles, centrifugation at high speed to separate membrane-bound and soluble fractions, and analysis of the mixture of released peptides by MALDI/MS. In addition, to increase the
Figure 1. Methodology used to determine the topology of membrane proteins. [Flow diagram: protein in vesicles; proteolysis; centrifugation into pellet and supernatant peptides; supernatant analyzed by MALDI/MS directly, or after RP-HPLC separation into HPLC fractions by MALDI/MS with PSD analysis; sequencing of peptides; topological models.]
sensitivity and breadth of analysis, supernatant peptides were separated by reverse-phase HPLC and individual fractions were analyzed by MALDI/MS. PSD analysis was also performed to obtain partial sequence information and identify peptides (11). On the basis of the released peptide products, a topological map for a major portion of the H,K-ATPase in gastric parietal cell tubulovesicles is proposed. We focused on the gastric H,K-ATPase as a test protein because i) purified gastric microsomal vesicles are highly enriched in the enzyme (> 85-90% purity), ii) the vesicles are oriented with a common asymmetry, i.e. cytoplasmic side out (12), iii) the vesicles are sealed, allowing selective cytoplasmic digestion, and iv) there is a pool of existing topological data from other methods (13-16).
Figure 2. The gastric H,K-ATPase in gastric microsomal vesicles. The H,K-ATPase is a heterodimer composed of an α-subunit and a glycoprotein β-subunit, which are asymmetrically oriented. [Cartoon labels: α-subunit, β-subunit, cytoplasm.]
Thus, the H,K-ATPase in microsomes is a useful model to develop new methods to determine protein topology. The cartoon in Figure 2 illustrates that the H,K-ATPase is a heterodimer composed of two subunit proteins: an α-subunit of 1035 amino acids, traversing the membrane either 8 or 10 times (13), with most of its mass cytoplasmically disposed (and therefore outside the vesicles); and a glycosylated β-subunit of 300 amino acids, traversing the membrane once and, except for a short cytoplasmic tail, with most of its mass on the extracellular side (inside the vesicles).
II. Materials and Methods

Materials. Trypsin, Lys C, chymotrypsin and adrenocorticotropic hormone fragment (18-39) were purchased from Sigma (St. Louis, MO). Tris(hydroxymethyl)aminomethane (Tris), sucrose, acetonitrile, HPLC-grade water and acetic acid were purchased from Fisher Scientific (Pittsburgh, PA). Matrix (α-cyano-4-hydroxycinnamic acid) was purchased from Hewlett Packard (Palo Alto, CA). The low-molecular-weight calibration standard was purchased from Bio-Rad (Richmond, CA).

Preparation of H,K-ATPase-enriched microsomal vesicles. H,K-ATPase-containing gastric microsomal vesicles were isolated from rabbit stomach as previously described (12). Crude microsomes were harvested from homogenized mucosa of unstimulated rabbit stomach (H2 receptor-blocked) as the membrane pellet sedimenting between 10 min at 13,000 × g and 1 hr at 100,000 × g. The pellet was resuspended in 10% sucrose, brought to 40% sucrose (9 ml), and overlaid with successive layers of 30% sucrose (11 ml) and 10% sucrose (16 ml) [300 mM sucrose, 5 mM Tris, and 0.2 mM EDTA, pH 7.4] in a 37 ml tube. After centrifugation at 80,000 × g for 4 hr, the purified gastric microsomal vesicles were collected from the interface between 10% and 30% sucrose and stored at 4°C until use.

Trypsinization of H,K-ATPase-enriched gastric microsomal vesicles. Tubulovesicles (~100 µg of protein) were treated with trypsin (5 µg) in Tris-HCl (20 mM, pH 7.5) at 37°C for 30 min. The vesicles were then centrifuged at 100,000 × g in a TL100 tabletop centrifuge for 1 hr at 4°C. The supernatant was carefully separated from the pellet, boiled for 5 min and stored at -20°C until further analysis.

Reverse-phase HPLC separation of tryptic digest. The tryptic digest (60%) was separated on an Aquapore OD-300 (Applied Biosystems Inc.) C18 reverse-phase column (7 µm, 1 × 250 mm) using a Michrom UMA Model 600 HPLC system with eluant monitoring at 214 nm.
The first 5 min of the gradient was isocratic at 5% eluant B (98% CH3CN, 0.1% TFA) and 95% eluant A (2% CH3CN, 0.1% TFA). This was followed by a linear gradient of 5-15% B in 15 min, 15-50% B at 75 min and 50-75% B at 90 min. The flow rate was 50.0 µl/min. Individual fractions were collected and stored for subsequent use.

MALDI/MS analysis. Supernatant (1 µl) was diluted 1:2 with 50% CH3CN in water, and this mixture was mixed with 2 µl of α-cyano-4-hydroxycinnamic acid, vortexed and centrifuged. One µl was spotted onto the target. MALDI/MS of samples was carried out on a TofSpec SE from Micromass (Manchester, UK), equipped with a reflectron and using a nitrogen laser (337 nm). Samples were
initially examined in the linear mode to determine whether signals > 5 kDa were present. An accelerating potential of 25 kV, a reflectron voltage of 28.5 kV and an extraction voltage of 10 kV in the reflectron-ion mode were typically used. Thirty shots were usually averaged. The instrument was calibrated with peptides from a low-molecular-weight peptide set from Bio-Rad (Richmond, CA). Molecular ions of bombesin and the 18-39 amino acid clip of adrenocorticotropic hormone were used as calibration standards.
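The RP-HPLC gradient given in the Methods can be written down as a piecewise-linear program, which makes it easy to ask what eluant composition a given fraction saw. The sketch below is illustrative only. It assumes that the stated times are cumulative run times (5% B held to 5 min, 15% B reached at 15 min, 50% B at 75 min, 75% B at 90 min), which the text does not state unambiguously, and the names `GRADIENT`, `percent_b` and `acetonitrile_percent` are hypothetical.

```python
from bisect import bisect_right

# (run time in min, % eluant B) breakpoints.
# ASSUMPTION: the times in the Methods are cumulative run times.
GRADIENT = [(0.0, 5.0), (5.0, 5.0), (15.0, 15.0), (75.0, 50.0), (90.0, 75.0)]


def percent_b(t_min, program=GRADIENT):
    """Return % eluant B at run time t_min by linear interpolation."""
    times = [t for t, _ in program]
    if t_min <= times[0]:
        return program[0][1]
    if t_min >= times[-1]:
        return program[-1][1]
    i = bisect_right(times, t_min)  # index of the first breakpoint after t_min
    (t0, b0), (t1, b1) = program[i - 1], program[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def acetonitrile_percent(t_min):
    """Approximate % CH3CN delivered: eluant A is 2% and eluant B is 98% CH3CN."""
    b = percent_b(t_min) / 100.0
    return 2.0 * (1.0 - b) + 98.0 * b
```

If the 15 min of the first ramp is instead a duration, only the breakpoint list needs to change; the interpolation itself is unaffected.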
III. Results and Discussion
The H,K-ATPase-enriched vesicles were trypsinized for 30 min using a trypsin:protein ratio of 1:25 and then centrifuged to separate the pellet from the supernatant fraction. An aliquot of the supernatant was analyzed by MALDI/MS as shown in Figure 3. The observed signals (m/z 600-4400) were assigned to masses (±2 Da) of the predicted tryptic digestion products of the gastric H,K-ATPase α-subunit as shown in Table 1. Since the exposure to trypsin was for a limited period of time (30 min), incompletely cleaved tryptic peptides were also observed, so it was important to include these incompletely cleaved peptides when searching the molecular mass signals.
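The assignment step described here (predict tryptic peptides, allow incomplete cleavage, match observed signals within ±2 Da) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' software: the cleavage rule (after K or R, not before P) is the usual textbook simplification, the residue masses are standard monoisotopic values, and the function names are hypothetical.

```python
import re

# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER + PROTON


def tryptic_peptides(sequence, missed_cleavages=1):
    """Yield (start, end, peptide) using 1-based residue numbers.

    ASSUMPTION: trypsin cleaves after K or R except when followed by P.
    """
    cuts = [0] + [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    if cuts[-1] != len(sequence):
        cuts.append(len(sequence))
    for i in range(len(cuts) - 1):
        # j - i - 1 is the number of missed cleavage sites in the peptide
        for j in range(i + 1, min(i + 2 + missed_cleavages, len(cuts))):
            yield cuts[i] + 1, cuts[j], sequence[cuts[i]:cuts[j]]


def assign(observed, sequence, tol=2.0, missed_cleavages=1):
    """Return candidate peptides whose [M+H]+ lies within tol Da of observed."""
    return [(s, e, p, round(mh_plus(p), 1))
            for s, e, p in tryptic_peptides(sequence, missed_cleavages)
            if abs(mh_plus(p) - observed) <= tol]
```

As a check against the chapter's own numbers, `mh_plus("GQELPLDEQWR")` evaluates to about 1370.67, in line with the calculated MH+ of 1370.7 listed for Gly536-Arg546. With a ±2 Da window, `assign` can return more than one candidate for a signal, which is why several entries in Table 1 carry alternative assignments.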
Figure 3. MALDI mass spectrum of the total supernatant from the tryptic digest of the H,K-ATPase-enriched tubulovesicles. The H,K-ATPase was digested with trypsin and the vesicles centrifuged to separate supernatant from the pellet. An aliquot of the supernatant was analyzed by MALDI/MS in the reflectron-ion mode using α-cyano-4-hydroxycinnamic acid as a matrix. The signals are denoted by numbers and were assigned to α-subunit peptides (Table 1). [Spectrum; x-axis: m/z, 800-4000.]
Table 1. Assignment of signals obtained by MALDI/MS of a tryptic digest supernatant from H,K-ATPase-enriched vesicles. The observed masses of the numbered signals shown in Figure 3 were assigned to the masses of α-subunit peptides.

Signal No.   Observed MH+   Calculated MH+   α-subunit peptide
 1            663            662.4           Asp483-Lys487
 2            819            818.5           Val435-Arg441
 3            899            898.6           Leu78-Arg85
 4           1043           1044.6           Ala32-Lys42
 5           1047           1046.6           Leu710-Arg718
 6           1056           1055.6           Gly206-Arg215
 7           1076           1076.6           Ala673-Lys682
 8           1088           1087.5           Thr695-Arg703
 9           1093           1092.6           Asp86-Arg95
10           1196           1195.7           Leu659-Arg668
11           1239           1238.7           Asn174-Arg184
12           1283           1282.7           Tyr66-Arg77
13           1324           1323.8           Leu659-Lys669 or Ala838-Arg848
14           1370           1370.7           Gly536-Arg546
15           1375           1374.7           Asp683-Arg694
16           1458           1458.7           Phe470-Arg482
17           1485           1485.7           Ser239-Arg251 or Arg456-Lys469
18           1619           1619.8           Val224-Arg238 or Asp511-Arg524
19           1679           1678.9           Ala431-Lys445
20           1712           1711.9           Glu547-Arg562
21           1824           1823.9           Phe499-Arg513
22           2013           2013             Glu49-Lys65
23           2159           2159.1           Ala637-Arg658
24           2448           2449.2           Asn755-Arg777
25           2543           2544.2           Asn252-Arg275
26           2736           2737.4           Asn371-Arg396
27           2793           2792.3           Glu49-Lys72
28           3553           3553.9           Asn174-Lys205
29           4004           4005             Lys738-Arg777
Based on mass analysis by MALDI/MS of the unfractionated tryptic digest, 29 tryptic peptides from the α-subunit were tentatively identified, but only two signals corresponded to β-subunit peptides. These latter signals at m/z 791 and 1485 corresponded to peptides from the short cytoplasmic tail of the β-subunit, and included Met^-Lys'^ and Lys^-Lys^^ from the N-terminus of the sequence. No signals corresponding to the masses of peptides from the extracellular domain of the β-subunit were observed, consistent with the vesicles being oriented cytoplasmic side out and with preservation of vesicular integrity during
proteolysis by trypsin. The α-subunit peptides tentatively identified by MALDI/MS are listed in Table 1, including: Ala32-Lys42, Glu49-Lys65, Glu49-Lys72, Tyr66-Arg77, Leu78-Arg85 and Asp86-Arg95 from the N-terminus (before membrane segment M1); peptides Asn174-Arg184, Asn174-Lys205, Gly206-Arg215, Val224-Arg238, Ser239-Arg251 and Asn252-Arg275 in the cytosolic loop between membrane segments M2 and M3; peptides in the large cytosolic loop between M4 and M5, which included Asn371-Arg396, Ala431-Lys445, Arg456-Lys469, Phe470-Arg482, Asp483-Lys487, Phe499-Arg513, Gly536-Arg546, Glu547-Arg562, Ala637-Arg658, Leu659-Lys669, Ala673-Lys682, Asp683-Arg694, Thr695-Arg703, Leu710-Arg718, Lys738-Arg777 and Asn755-Arg777; and a peptide Ala838-Arg848 from the cytosolic loop between membrane segments M6 and M7. All of these regions have previously been deduced to be cytoplasmic (15, 16). No peptides corresponding to any of the intramembrane segments or intravesicular (extracellular) regions of the α-subunit were observed. Thus the topological prediction obtained by analysis of the MALDI mass spectrum of the entire tryptic supernatant was consistent with the currently accepted topological model of the H,K-ATPase (16). An assignment of the identified peptides to extravesicular (cytoplasmic) regions of the H,K-ATPase is shown schematically in Figure 4.

Although analysis of peptide masses in the total supernatant allows a tentative identification, mass overlap at this resolution may lead to erroneous assignments. In addition, signal suppression can lead to low intensity or abolition of certain peptide signals. Since the peptides are tentatively identified on the basis of mass alone, it is prudent to perform PSD analysis to obtain sequence information and confirm the identity. PSD analysis could be performed on some peptides in the total mixture; however, it was difficult to obtain sequence information on low
Figure 4. Topological model of the gastric H,K-ATPase. The topological model shown is adapted from a proposal by Besancon et al. (16). The model depicts the α-subunit having ten transmembrane segments, denoted M1-M10. Amino acid numbers are shown for the cytoplasmic ends of segments M1-M8. The glycoprotein β-subunit traverses the membrane once and has most of its mass luminally oriented. The darkened regions indicate peptides of the α-subunit that were identified by MALDI/MS analysis of the total tryptic digest supernatant of H,K-ATPase-enriched vesicles (Figure 3 and Table 1). [Diagram labels: luminal solution, apical plasma membrane, cytoplasm.]
Figure 5. Reverse-phase HPLC of the supernatant from a tryptic digest of H,K-ATPase-enriched vesicles. Peak fractions were collected up to 60 min using the gradient described in "Methods". [Chromatogram; x-axis: time (min).]
intensity peptides and peptides that were separated by less than 14 Da. In order to obtain a series of purified peptides, we subjected the tryptic digest supernatant to RP-HPLC as described in "Methods". The RP-HPLC trace of the digest is shown in Figure 5. We collected 30 individual fractions and an aliquot of each was subjected to MALDI/MS. The MALDI/MS of each HPLC fraction showed the presence of several peptides which had sufficient mass differences for successful PSD analyses. Figure 6A shows the MALDI mass spectrum of a representative fraction, Fraction 13, from the HPLC preparation. Signals were observed at m/z 730, 1047, 1327, 1371, 1394, 1798, and 2141. The assignment of these signals to peptides of the α-subunit is summarized in Table 2. Although we had noted peptides at m/z 1047 and 1371 in the total supernatant material, signals at m/z 730, 1327, 1394, 1798 and 2141 were apparent only after HPLC fractionation. The sequence and identity of the peptides were confirmed by PSD analyses. As an example, the PSD spectrum of the signal at m/z 1798 is shown in Figure 6B. It gave a series of y ions ranging from ya-yi? and b ions from b3-b6, confirming the amino acid sequence to be identical to peptide 719LGAIVAVTGDGVNDSPALK737 of the α-subunit. Interestingly, the presence of the series of y ions from ys-yi? demonstrated that the sequon 731Asn-Asp-Ser733 exists in a non-glycosylated form. It has been suggested that one of the Asn residues within the cytoplasmic domain of the α-subunit is glycosylated (17).
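The reasoning behind the PSD argument is that consecutive b or y ions differ by exactly one residue mass, so a y-ion ladder that steps across Asn731 with the mass of an unmodified asparagine argues against a glycan on that residue. The sketch below computes idealized singly charged b and y ions for the peptide named in the text. It is an illustration under simplifying assumptions (monoisotopic masses, no neutral losses or internal fragments, no instrument calibration error), and `fragment_ions` is a hypothetical helper name.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def fragment_ions(peptide):
    """Return (b, y): lists of singly charged m/z for b1..b(n-1) and y1..y(n-1)."""
    masses = [RESIDUE[aa] for aa in peptide]
    b, y = [], []
    running = 0.0
    for m in masses[:-1]:            # N-terminal fragments
        running += m
        b.append(running + PROTON)
    running = 0.0
    for m in reversed(masses[1:]):   # C-terminal fragments
        running += m
        y.append(running + WATER + PROTON)
    return b, y


b_ions, y_ions = fragment_ions("LGAIVAVTGDGVNDSPALK")
# y7 is NDSPALK and y6 is DSPALK, so their difference is the mass of Asn731.
asn_step = y_ions[6] - y_ions[5]
```

Here `asn_step` comes out at the unmodified Asn residue mass (about 114.04 Da); an N-linked glycan on Asn731 would increase this step by the mass of the glycan.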
Figure 6. MALDI mass spectrum of fraction 13 from RP-HPLC. The H,K-ATPase-enriched vesicles were trypsinized and centrifuged to separate supernatant from pellet. The supernatant was subjected to RP-HPLC and individual fractions were collected and subjected to MALDI/MS. The MALDI mass spectrum (reflectron-ion mode) was obtained using α-cyano-4-hydroxycinnamic acid as a matrix (Panel A). The signals were assigned to α-subunit peptides (Table 2). The signal at m/z 1798, indicated by an arrow, was then subjected to PSD analysis. The PSD spectrum of MH+ 1798.4 is shown in Panel B. Only the peaks for the b and y fragment ions are labeled. The deduced amino acid sequence is shown at the top of the panel. [Spectra; x-axis: m/z.]
Table 2. Tryptic peptides of α-subunit in Fraction 13. The H,K-ATPase-enriched microsomes were trypsinized and centrifuged to separate the supernatant from the pellet. The supernatant was subjected to RP-HPLC and individual fractions were collected and analyzed by MALDI/MS. The MALDI mass spectrum of fraction 13 is shown in Figure 6A. The signals seen were assigned to α-subunit tryptic peptides, as shown below.

Observed MH+   Calculated MH+   α-subunit peptide
1047           1046.6           710LVIVESCQR718
1327           1329.8           457IVIGDASETALLK469
1371           1370.7           536GQELPLDEQWR546
1394           1396.8           661VPVDQVNRKDAR672
1798           1797.0           719LGAIVAVTGDGVNDSPALK737
2141           2141.1           48KEMEINDHQLSVAELEQK65
Use of alternative proteases

Proteases other than trypsin may be used to increase the coverage of the protein sequence or to resolve ambiguities from mass overlap. For example, Lys C digestion gave topological results that were complementary to those from trypsin (data not shown). From a Lys C digest it was determined that several peptides from regions Ala^-Lys^^^, Seri^4-Lys223, Arg^^^-Lys'^^^ and Asp^^s.LygSSi were cytoplasmic. Again, a signal at m/z 2824 corresponded to the mass of peptide 710Leu-Lys737 (2825 Da) of the α-subunit, which includes Asn731. These data were again consistent with the accepted topological model of the H,K-ATPase (Figure 4). Treatment of vesicles with chymotrypsin using conditions similar to those for trypsin (1:20, chymotrypsin:protein) and MALDI/MS analysis of the supernatant after centrifugation of the digest gave some interesting results. Signals at m/z 996, 1015, 1298, 1460 and 1678 were observed which corresponded to the masses of β-subunit peptides (Tyr219-Leu227, Ser151-Leu159, Leu251-Leu262, Cys58-Tyr69 and Arg^^-Tyr^^, respectively) that are known to have an intravesicular orientation. Further investigation including PSD analyses will be performed to confirm the identity of these peptides.
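The complementarity of proteases mentioned here comes from their different cleavage specificities, which yield different peptide sets from the same exposed sequence. The sketch below encodes simplified textbook rules for the three enzymes used in this study; these rules are assumptions (real enzymes are less strict), the example sequence is invented purely for illustration, and `SITES` and `digest` are hypothetical names.

```python
import re

# ASSUMPTION: simplified cleavage specificities (C-terminal side of the residue).
SITES = {
    "trypsin": r"[KR](?!P)",         # after Lys/Arg, not before Pro
    "lys_c": r"K",                   # after Lys only
    "chymotrypsin": r"[FYWL](?!P)",  # after Phe/Tyr/Trp/Leu, not before Pro
}


def digest(sequence, protease):
    """Return the complete-digest peptides of sequence for the named protease."""
    cuts = [m.end() for m in re.finditer(SITES[protease], sequence)]
    bounds = [0] + cuts + [len(sequence)]
    # Skip the empty piece that arises when the last residue is itself a site.
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


# Invented toy sequence, not taken from the H,K-ATPase.
toy = "AAKGGRSLLKTTFW"
by_enzyme = {name: digest(toy, name) for name in SITES}
```

On the toy sequence, Lys C leaves the Arg-containing stretch intact where trypsin splits it, which is the kind of difference that can help resolve an ambiguous mass assignment.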
IV. Conclusions
We have demonstrated the utility of MALDI/MS in combination with proteolysis to investigate the topology of a heterodimeric membrane glycoprotein, the gastric H,K-ATPase within its native microsomal membrane. MALDI/MS proved to be a rapid and sensitive method for topological analysis of membrane proteins in native membranes. The high sensitivity, and relative tolerance of MALDI/MS to buffers and some detergents, allowed rapid assessment of topology by examination of unfractionated supernatants from vesicular digests. The above approach may also be usefully employed to assess the reconstitution of proteins into vesicles and vesicular integrity. Analysis of HPLC fractions by MALDI with PSD analysis allowed the determination of partial peptide sequence and may prove suitable for identifying post-translational modifications of
extravesicular peptides. Finally, this approach should provide a convenient, sensitive and rigorous assessment of protein topology in artificial and native membranes.
Acknowledgments

This project was supported in part by NIH grant DK38792. The mass spectra were obtained at the UCSF Mass Spectrometry Facility supported by the Biomedical Research Technology Program of the National Center for Research Resources (NIH NCRR BRTP RR01614 and RR08282). The VG TofSpec SE was partially supported by Micromass, Beverly, MA.
References
1. Modyanov, N., Lutsenko, S., Chertova, E., Efremov, R. and Gulyaev, D. (1992) Acta Physiol. Scand. Supplementum 607, 49-58.
2. Loo, T. W. and Clarke, D. M. (1995) J. Biol. Chem. 270, 843-848.
3. Serrano, R., Monk, B. C., Villalba, J. M., Montesinos, C. and Weiler, E. W. (1993) Eur. J. Biochem. 212, 737-744.
4. Ball, W. J. Jr, Abbott, A., Sun, Y. and Malik, B. (1992) Ann. New York Acad. Sci. 671, 436-439.
5. le Maire, M., Deschamps, S., Moller, J. V., La Caer, J. P. and Rossier, J. (1993) Anal. Biochem. 214, 50-57.
6. Mel, S. F., Falick, A. M., Burlingame, A. L. and Stroud, R. M. (1993) Biochemistry 32, 9473-9479.
7. Moore, C. R., Yates, J. R., Griffin, P. R., Shabnowitz, J., Martino, P. A., Hunt, D. F. and Cafiso, D. S. (1989) Biochemistry 28, 9184-9191.
8. Poulter, L., Earnest, J. P., Stroud, R. M. and Burlingame, A. L. (1989) Proc. Natl. Acad. Sci. 86, 6645-6649.
9. Tsarbopoulos, A., Karas, M., Strupat, K., Pramanik, B. N., Nagabushan, T. L. and Hillenkamp, F. (1994) Anal. Chem. 66, 2062-2070.
10. Billeci, T. M. and Stults, J. T. (1993) Anal. Chem. 65, 1709-1716.
11. Spengler, B., Kirsch, D., Kaufmann, R. and Jaeger, E. (1992) Rapid Commun. Mass Spectrom. 6, 105-108.
12. Reenstra, W. W. and Forte, J. G. (1990) Meth. in Enzymol. 192, 151-165.
13. Bamberg, K. and Sachs, G. (1994) J. Biol. Chem. 269, 16909-16919.
14. Asano, S., Arakawa, S., Hirasawa, M., Sakai, H., Ohta, M., Ohta, K. and Takeguchi, N. (1994) Biochem. J. 299, 59-64.
15. Sachs, G., Besancon, M., Shin, J. M., Mercier, F., Munson, K. and Hersey, S. (1992) J. Bioenerg. Biomem. 24, 301-308.
16. Besancon, M., Shin, J. M., Mercier, F., Munson, K., Miller, M., Hersey, S. and Sachs, G. (1993) Biochemistry 32, 2345-2355.
17. Tai, M. M., Im, W. B., Davis, J. P., Blakeman, D. P., Zurcher-Neely, H. A. and Heinrikson, R. L. (1989) Biochemistry 28, 3183-3187.
Role of D-Ser46 in the P-type Calcium Channel Blocker, ω-Agatoxin-TK
Tomohiro Watanabe, Manabu Kuwada, Kumiko Y. Kumagaye*, Kiichiro Nakajima*, Yukio Nishizawa and Naoki Asakawa
Eisai Tsukuba Research Laboratories, 5-1-3 Tokodai, Tsukuba, Ibaraki 300-26, Japan and *Peptide Institute Inc., Protein Research Foundation, Osaka 562, Japan
I. Introduction

Multiple types of voltage-dependent calcium channels in mammalian neurons play important roles in controlling various nervous functions such as synaptic transmission, gene expression, neuronal development and differentiation. There are at least four subtypes of the calcium channels, namely T-type, L-type, N-type, and P-type channels, classified on the basis of their electrophysiological and pharmacological properties. Among them, the P-type calcium channels have been reported to be primarily associated with neuronal transmission through regulating the release of excitatory amino acids and catecholamines (1-4). We have previously isolated a 48-amino-acid peptide, named ω-agatoxin-TK (ω-Aga-TK), from the venom of the funnel web spider, Agelenopsis aperta. It was found to be a potent blocker of the P-type calcium channels in rat cerebellar Purkinje neurons, but
had no activity against T-type, L-type, or N-type channels in brain neurons. The peptide has a unique structural profile including a high-density disulfide core structure with four disulfide bonds and a D-form amino acid, D-Ser, at position 46 (Fig. 1). Interestingly, ω-Aga-TK contains two serine residues, at positions 28 and 46, of which only Ser46 is in the D-form (ω-[D-Ser46]Aga-TK) (5, 6). We have also found in the spider venom a related peptide with the same amino acid sequence and disulfide pairings as those of ω-[D-Ser46]Aga-TK except for the L-configuration of Ser46 (ω-[L-Ser46]Aga-TK), though the L-Ser46 toxin is about six times less abundant than the D-Ser46 toxin (7). These findings raise the questions of why only Ser46 of the two serine residues is in the D-form and why the two ω-Aga-TKs containing opposite configurations at Ser46 are both present in the Agelenopsis aperta venom. Heck et al. (8) have reported the presence in the venom of a novel peptide isomerase that specifically converts the L-Ser46 residue of ω-Aga-TK to D-Ser46. We have recently reported the complete primary structure of the peptide isomerase, which is a 29-kDa glycoprotein consisting of a 243-residue heavy chain and an 18-residue light chain connected by a single disulfide bond (9). This was the first report to assign the structure of a peptide isomerase from a eukaryotic organism that converts the chirality of amino acid residues. ω-[D-Ser46]Aga-TK has very low solubility under neutral conditions, which precluded detailed studies of its tertiary structure by NMR spectroscopy.
However, Adams et al. (10) and Yu et al. (11) reported two-dimensional NMR analyses of ω-[D-Ser46]Aga-TK in acidic solution; they concluded that the cystine-rich region consists of a triple-stranded antiparallel β-sheet with four loops formed by four disulfides (Cys4-Pro38), but the carboxyl-terminal tail was very poorly defined, since the carboxyl-terminal ten residues containing the D-Ser46 residue (Arg39-Ala48) adopt a disordered structure. Our structure-function relationship studies of ω-[D-Ser46]Aga-TK demonstrated that ω-[L-Ser46]Aga-TK has 80- to 90-fold less potency towards the P-type calcium channels compared with ω-[D-Ser46]Aga-TK. Two proteolytic fragments of ω-[D-Ser46]Aga-TK, namely ω-Aga-TK (1-43) and a carboxyl-terminal peptide fragment, ω-Aga-TK (44-48), did not exert any significant inhibition of P-type calcium channels or interfere with the blockade of the channels elicited by native ω-Aga-TK (12). Furthermore, molecular dynamics calculations showed that the carboxyl-terminal six-amino-acid peptide of ω-Aga-TK containing D-Ser46 assumes a different conformation from that containing L-Ser46. These data suggested that the specific conformation of the carboxyl-terminal
tail generated by the D-Ser46 residue, together with the triple-stranded antiparallel β-sheet, might be essential for the blockade of the P-type calcium channels.
Figure 1. Schematic diagram of the high-density disulfide core and carboxyl-terminal tail containing D-Ser46 in ω-[D-Ser46]Aga-TK. The disulfide core structures are represented on the basis of the coordinates determined by NMR spectroscopy (11). Amino acid residues of the peptide are represented by single-letter abbreviations in the circles.
In this study, the conformations of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK were investigated by the combination of size-exclusion chromatography, circular dichroism (CD) measurement, and fluorescence spectroscopy in order to elucidate the structural and functional effects of the configuration of the Ser46 residue in ω-Aga-TK. We have found that ω-[D-Ser46]Aga-TK has a particularly compact molecular shape involving β-sheet structure, whereas ω-[L-Ser46]Aga-TK has a relatively unfolded or extended structure at physiological pH and ionic strength. These data are discussed in terms of the possible role of the configuration of the Ser46 residue in determining the molecular conformation of ω-Aga-TK.
II. Experimental Procedures

A. Peptides and Reagents

ω-[L-Ser46]Aga-TK and ω-[D-Ser46]Aga-TK were synthesized by Drs. K. Y. Kumagaye and K. Nakajima of Peptide Institute Inc. using an Applied Biosystems type 430A peptide synthesizer as described previously (5). Synthetic ω-[D-Ser46]Aga-TK is commercially available from the company. High-purity guanidine hydrochloride was obtained from ICN Biomedicals, Inc. (Aurora, OH). The phosphate-buffered saline, pH 7.4, was prepared by dissolving Dulbecco's PBS Powder (Nissui Pharmaceutical Co., Ltd., Tokyo) in Milli-Q water, and consists of 8.10 mM Na2HPO4, 1.47 mM KH2PO4, 2.68 mM KCl, and 137 mM NaCl. Other chemicals and reagents used were of reagent grade.
B. Size-Exclusion Chromatography
The apparent molecular masses of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK were determined by size-exclusion chromatography (LKB GTI HPLC Systems) with a Pharmacia Superdex 75HR column (10 x 300 mm) or a TSK G3000SWXL column (10 x 300 mm) equilibrated with Dulbecco's phosphate-buffered saline, pH 7.4, with or without 5.2 M guanidine hydrochloride. The peptides were eluted from the columns with the buffer at a flow rate of 0.5 ml/min at 25 °C, and elution profiles were monitored by measuring the absorbance at 280 nm or 220 nm. The column was calibrated using a Pharmacia low-molecular-weight marker kit (blue dextran, bovine serum albumin, ovalbumin, chymotrypsinogen, and ribonuclease A), aprotinin (Sigma), and substance P (Peptide Institute Inc.).
C. Spectroscopic analysis
For CD and fluorescence spectroscopic analyses of ω-[L-Ser46]Aga-TK and ω-[D-Ser46]Aga-TK, peptide samples were prepared by freshly dissolving the lyophilized peptides at a concentration of 150 μg/ml in Dulbecco's phosphate-buffered saline, pH 7.4, in the presence or absence of guanidine hydrochloride. CD spectra were recorded with a Jasco J-720WI spectropolarimeter at room temperature using a 0.1 cm path-length cell. In all cases, the buffer base-line spectrum was subtracted, and the results were expressed in terms of the mean residue ellipticity [θ] in units of degrees cm2 dmol-1. Fluorescence spectra were determined with a Hitachi F4500 spectrofluorometer using a 1 cm path-length cell at 25 °C.
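The conversion from a measured ellipticity to the mean residue ellipticity used above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' processing script: it uses the standard relation [θ] = θ(mdeg) x MRW / (10 x l(cm) x c(mg/ml)), takes the mean residue weight as the molecular mass divided by the number of peptide bonds (some laboratories divide by the number of residues instead), and the -10 mdeg reading is a made-up value.

```python
def mean_residue_ellipticity(theta_mdeg, mass_da, n_residues,
                             conc_mg_per_ml, path_cm):
    """Convert an observed ellipticity (mdeg) to [theta] in deg cm^2 dmol^-1.

    Assumption: mean residue weight = mass / (n_residues - 1), i.e. mass per
    peptide bond.
    """
    mean_residue_weight = mass_da / (n_residues - 1)
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_per_ml)


# Sample conditions from the text: 150 ug/ml peptide, 0.1 cm cell,
# 48 residues, molecular mass 5273 Da. The -10 mdeg reading is hypothetical.
example = mean_residue_ellipticity(-10.0, 5273.0, 48, 0.150, 0.1)
print(round(example, 1))

# Molar peptide concentration implied by 150 ug/ml (in micromolar).
print(round(150.0 / 5273.0 * 1000.0, 1))
```

With these sample conditions, each millidegree of signal corresponds to about 750 deg cm2 dmol-1, and the peptide concentration is about 28 μM.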
III. Results and Discussion
A. Different molecular shapes of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK
During the isolation and characterization of biologically active peptides from Agelenopsis aperta venom, we found that two stereoisomers of the P-type calcium channel blocker, ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK, were eluted in distinct fractions on size-exclusion chromatography (13). This finding was confirmed with synthetic standards of the two toxins on a Superdex HR75 column equilibrated with phosphate-buffered saline, pH 7.4, as shown in Fig. 2. ω-[D-Ser46]Aga-TK was eluted from the column significantly later than ω-[L-Ser46]Aga-TK (the D-Ser toxin, 39.9 min; the L-Ser toxin, 31.5 min), in spite of their identical molecular mass of 5273 Da. The D-Ser toxin was eluted in close proximity to an 11-residue peptide, substance P (molecular mass of 1348 Da), but the L-form toxin was eluted at a similar position to aprotinin (molecular mass of 6512 Da). Under the conditions used, each toxin was eluted as a single peak at the same position at loading concentrations from 2 μM to 200 μM, whereas aggregated or oligomeric forms were observed at a concentration of 2 mM.
Figure 2. Size-exclusion chromatography of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK. The two toxins (200 μM) were analyzed on a Superdex HR75 column equilibrated with phosphate-buffered saline, pH 7.4, as described under Experimental Procedures. The molecular masses and elution positions of bovine serum albumin (67 kDa), ovalbumin (43 kDa), chymotrypsinogen (25 kDa), ribonuclease A (14 kDa), and aprotinin (7 kDa), used as calibration standards, are shown.
The calibration of the size-exclusion column with standard proteins demonstrated that the L-Ser toxin has an apparent molecular mass of 6 kDa, which is close to the real molecular mass of the toxin. The apparent molecular mass of the D-Ser toxin was too small to evaluate accurately from the calibration data. These results indicated that both ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK are monomeric at physiological pH and ionic strength, but that the two toxins differ significantly in apparent molecular mass. The apparent molecular mass of the D-Ser toxin was dramatically increased by the addition of guanidine hydrochloride to the elution buffer, whereas that of the L-Ser toxin was not altered by the denaturing reagent. In the presence of 5.2 M guanidine hydrochloride, the D-form toxin was eluted at the same position as the L-form toxin, and the apparent molecular masses of the two toxins were estimated as 6 kDa based on calibration with the standard proteins. CD and fluorescence spectroscopic analyses revealed that the two toxins were unfolded and lost their secondary and tertiary structure in 5.2 M guanidine hydrochloride at pH 7.4, as described below. It therefore appears that the D-Ser toxin forms a compact folded structure, whereas the L-Ser toxin has a relatively unfolded or extended structure. To see whether the elution behavior of the two toxins depends on the separation support used, the two toxins were also analyzed on a TSK G3000SWXL column under the same elution conditions as those used for the Superdex column. Similar results were obtained, i.e., the D-form toxin was eluted later than the L-form toxin with phosphate-buffered saline. These results confirm that the different elution behavior of the two toxins was caused by their distinct molecular shapes.
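The apparent molecular masses discussed above come from a column calibration in which log10(mass) is treated as approximately linear in elution time. The sketch below illustrates that idea only. The elution times of the marker proteins are not given in the text, so the fit uses just the two anchor points that can be read from it (aprotinin near 31.5 min and substance P near 39.9 min, both approximate); a real calibration would use all standards. The function names are hypothetical.

```python
import math


def fit_log_mass_vs_time(times_min, masses_da):
    """Least-squares line log10(mass) = slope * time + intercept."""
    logs = [math.log10(m) for m in masses_da]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def apparent_mass(time_min, slope, intercept):
    """Apparent molecular mass (Da) read off the calibration line."""
    return 10 ** (slope * time_min + intercept)


# Approximate anchor points taken from the text (assumption: the toxins'
# elution times stand in for those of the co-eluting standards).
slope, intercept = fit_log_mass_vs_time([31.5, 39.9], [6512.0, 1348.0])
print(round(apparent_mass(35.0, slope, intercept)))
```

Because larger molecules elute earlier, the slope is negative; a peptide that elutes as late as substance P lies at the low-mass end of the line, where the text notes that the calibration is no longer reliable.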
B. Conformational analyses of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK
We examined the CD spectra of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK in phosphate-buffered saline, pH 7.4, to compare the secondary structures of the two toxins. As illustrated in Fig. 3, the spectrum of the D-Ser toxin showed a negative peak at 208 nm, while the spectrum of the L-Ser toxin had both a negative peak at 200 nm and broad positive ellipticity centered near 220 nm.
Figure 3. CD spectra of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK in phosphate-buffered saline, pH 7.4, in the presence or absence of guanidine hydrochloride. 1, ω-[D-Ser46]Aga-TK in phosphate-buffered saline; 2, ω-[L-Ser46]Aga-TK in phosphate-buffered saline; 3, ω-[L-Ser46]Aga-TK in phosphate-buffered saline containing 5.2 M guanidine hydrochloride; 4, ω-[D-Ser46]Aga-TK in phosphate-buffered saline containing 5.2 M guanidine hydrochloride. CD spectra were recorded between 210 and 250 nm or 195 and 250 nm in the presence or absence of guanidine hydrochloride, respectively.
These features are characteristic of peptide random coil and β-sheet structures, and the magnitude of the positive ellipticity band revealed a significant difference in β-sheet content between the two toxins. The secondary structures of the two toxins were found to be disrupted by the addition of 5.2 M guanidine hydrochloride at pH 7.4, since the spectra changed to a pattern typical of predominantly random coil structure. It was concluded that ω-[D-Ser46]Aga-TK has a significantly higher β-sheet content than ω-[L-Ser46]Aga-TK under neutral conditions.
Intrinsic fluorescence of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK was determined to compare the tertiary structures around the Trp and Tyr residues of the two toxins. The two toxins have a single residue each of Trp and Tyr in the disulfide-rich region, at positions 14 and 9, respectively. As shown in Fig. 4, tryptophan fluorescence with an emission maximum near 345 nm was strongly quenched in the D-Ser toxin, but not in the L-Ser toxin, whereas tyrosine fluorescence of the two toxins showed almost the same intensity at the emission maximum of 310 nm. Further, the intensity of tryptophan fluorescence of the D-form toxin, but not that of the L-form toxin, increased with increasing concentration of guanidine hydrochloride at pH 7.4. These results clearly indicate that the Trp14 residue in the L-form toxin is exposed to the solvent, but that this residue of the D-form toxin is in a relatively hydrophobic environment. Previously, Yu et al. reported the solution structure of ω-[D-Ser46]Aga-TK at pH 4.0, showing that the indole side chain of the Trp14 residue packs against the sulfur atoms of Cys^^-Cys^ and may serve to stabilize the loop formed by the disulfide bond (11).
It is, therefore, suggested that the tryptophan fluorescence of the D-Ser toxin may be quenched by the sulfur atoms of the disulfide bond, whereas the indole chromophore of the L-Ser toxin may not be affected by the sulfur atoms because of the greater distance between the two groups.

Figure 4. Intrinsic fluorescence spectra of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK in phosphate-buffered saline, pH 7.4. Emission spectra were recorded between the wavelengths of 250 and 450 nm at the excitation wavelength of 280 nm.

In conclusion, we have investigated the conformation of ω-[D-Ser46]Aga-TK and ω-[L-Ser46]Aga-TK at physiological pH and ionic strength using size-exclusion chromatography and spectroscopic methods. We have found that the apparent molecular mass of ω-[D-Ser46]Aga-TK is significantly smaller than that of ω-[L-Ser46]Aga-TK as determined by size-exclusion chromatography. CD spectra of the two toxins also revealed that ω-[D-Ser46]Aga-TK has a higher β-sheet content than ω-[L-Ser46]Aga-TK. Furthermore, the intrinsic fluorescence of ω-[D-Ser46]Aga-TK showed that Trp14 of ω-[D-Ser46]Aga-TK is in a relatively hydrophobic environment compared with that of ω-[L-Ser46]Aga-TK. These data imply that the D-Ser46 residue of ω-[D-Ser46]Aga-TK may be involved in the formation of additional intramolecular β-sheet structure in the carboxyl-terminal region or between the disulfide core and carboxyl-terminal tail, which contributes to the compact folding of ω-[D-Ser46]Aga-TK. It is also likely that the additional β-sheet causes a change in the tertiary environment around the Trp14 residue of ω-[D-Ser46]Aga-TK. Additional experiments to assess the biological importance of the carboxyl-terminal tail seem worthwhile. For instance, it would be interesting to examine the effects of sequential truncation of the carboxyl-terminal region of ω-[D-Ser46]Aga-TK on the blockade of the P-type calcium channels. Studies are in progress to characterize further the carboxyl-terminal conformation of ω-[D-Ser46]Aga-TK.
Acknowledgments
We thank Dr. Kozaki for helpful discussions and Dr. Takakuwa (Jasco) for measuring CD.
References
1. Olivera, B. M., Miljanich, G. P., and Ramachandran, J. (1994) Annu. Rev. Biochem. 63, 823-867
2. Niidome, T., Teramoto, T., Murata, Y., Tanaka, I., Seto, T., Sawada, K., Mori, Y., and Katayama, K. (1994) Biochem. Biophys. Res. Commun. 203, 1821-1827
3. Kimura, M., Yamanishi, Y., Hanada, T., Kagaya, T., Kuwada, M., Watanabe, T., Katayama, K., and Nishizawa, Y. (1995) Neuroscience 66, 609-615
4. Teramoto, T., Niidome, T., Miyagawa, T., Nishizawa, Y., Katayama, K., and Sawada, K. (1995) NeuroReport 6, 1684-1688
5. Kuwada, M., Teramoto, T., Kumagaye, K. Y., Nakajima, K., Watanabe, T., Kawai, T., Kawakami, Y., Niidome, T., Sawada, K., Nishizawa, Y., and Katayama, K. (1994) Mol. Pharmacol. 46, 587-593
6. Kozaki, T., Kuwada, M., Narukawa, M., Nagai, Y., and Asakawa, N. (1996) in "Peptide Chemistry 1995" (Nishi, N., ed) Protein Research Foundation, Osaka, 245-248
7. Watanabe, T., Teramoto, T., Kuwada, M., Shikata, Y., Niidome, T., Kawakami, Y., Sawada, K., Nishizawa, Y., and Katayama, K. (1995) in "Peptide Chemistry 1994" (Ohno, M., ed) Protein Research Foundation, Osaka, 253-256
8. Heck, S. D., Siok, C. J., Krapcho, K. J., Kelbaugh, P. R., Thadeio, P. P., Welch, M. J., Williams, R. D., Ganong, A. H., Kelly, M. E., Lanzetti, A. J., Gray, W. R., Phillips, D., Parks, T. N., Jackson, H., Ahlijanian, M. K., Saccomano, N. A., and Volkmann, R. A. (1994) Science 266, 1065-1068
9. Shikata, Y., Watanabe, T., Teramoto, T., Inoue, A., Kawakami, Y., Nishizawa, Y., Katayama, K., and Kuwada, M. (1995) J. Biol. Chem. 270, 16719-16723
10. Adams, M. E., Mintz, I. M., Reily, M. D., Thanabal, V., and Bean, B. P. (1993) Mol. Pharmacol. 44, 681-688
11. Yu, H., Rosen, M. K., Saccomano, N. A., Phillips, D., Volkmann, R. A., and Schreiber, S. L. (1993) Biochemistry 32, 13123-13129
12. Teramoto, T., Kuwada, M., Niidome, T., Sawada, K., Nishizawa, Y., and Katayama, K. (1993) Biochem. Biophys. Res. Commun. 196, 134-140
13. Watanabe, T., Shikata, Y., Oda, Y., Nishizawa, Y., Kuwada, M., and Asakawa, N. The two dimensional HPLC purification of biologically active polypeptides and polyamines in funnel web spider venom, manuscript in preparation
Involvement of Basic Amphiphilic α-helical Domain in the Reversible Membrane Interaction of Amphitropic Proteins: Structural Studies by Mass Spectrometry, Circular Dichroism, and Nuclear Magnetic Resonance
Nobuhiro Hayashi, Mamoru Matsubara, Koiti Titani, and Hisaaki Taniguchi
Division of Biomedical Polymer Science, Institute for Comprehensive Medical Science, Fujita Health University, Toyoake, Aichi 470-11, Japan
I. Introduction
A growing number of proteins have been shown to belong to the so-called "amphitropic" proteins, which are neither "pure" membrane proteins nor soluble proteins (1). Interestingly, many of them are involved in signal transduction, and their stimulation-dependent translocation plays important roles in the transmission of signals between the plasma membrane and the nucleus (2). They usually lack any apparent hydrophobic membrane-binding domain, but the importance of highly basic domains in the Src family proteins (2) and that of the basic amphiphilic domain in MARCKS (3) in membrane association have been well established. In the latter case, the direct phosphorylation of the domain by protein kinase C regulates the reversible membrane association of MARCKS (3, 4). It is of interest to note that some of these amphitropic proteins are fatty acylated, and the modification is also involved in the membrane interaction (2, 5). One of the major phosphoproteins in the neuronal growth cone, GAP-43 (growth-associated protein-43, also known as B50, F1, P56, or neuromodulin), which is found associated with membrane cytoskeletal fractions (6), is very hydrophilic and lacks any apparent hydrophobic membrane-binding domain (7). Palmitoylation of two cysteine residues near the N-terminus has been assumed to be involved in the interaction with membranes (8, 9). However, we have recently shown that GAP-43 isolated from the membrane fractions is not palmitoylated at all but still retains the ability to bind phospholipid membranes in vitro (10, 11). GAP-43 belongs to the MARCKS family of acidic hydrophilic membrane-associated proteins (12) and has a similar basic amphiphilic domain, which serves as the calmodulin-binding domain and the phosphorylation domain for PKC. The involvement of the domain in the membrane anchoring of GAP-43 has, in fact, been suggested (13, 14).
In the present study, we first show a detailed mass spectrometric analysis of the posttranslational modifications of GAP-43, which provides the basis for understanding the structures of the molecules involved. The interaction of GAP-43 and that of the basic amphiphilic domain with membrane phospholipids are then studied using circular dichroism (CD) and nuclear magnetic resonance (NMR) to understand the underlying structural mechanisms of the interaction.
II. Materials and Methods
A. Materials
GAP-43 (10) and PKC (15) were purified from bovine brain as described previously. A peptide (QASFRGHITRKKLKGEK) corresponding to the calmodulin-binding domain of GAP-43, named GAP peptide, was synthesized using conventional tBoc chemistry in an ABI 430A peptide synthesizer (Applied Biosystems), and purified over a C18 reversed-phase column (Vydac 218TP1010, The Separations Group) using a linear H2O-acetonitrile gradient in the presence of 0.1% trifluoroacetic acid. Lipids purchased from Avanti Polar Lipids were suspended in 5 mM phosphate buffer (pH 7.5), and sonicated in a Branson Sonifier 250 sonicator for 30 min. The supernatant obtained after centrifugation in a tabletop centrifuge for 20 min was used as unilamellar liposomes.
B. Preparation of Phosphorylated GAP-43 and GAP Peptide
Phosphorylation of the intact GAP-43 and GAP peptide by PKC was carried out in the reaction buffer (25 mM Tris-HCl buffer (pH 7.5), 10 mM MgCl2, 100 mM CaCl2, 80 μg/ml phosphatidylserine, 8 μg/ml dioleoyl glycerol, 1 mM ATP) at 35°C for 90 min, and was stopped by adding trifluoroacetic acid to a final concentration of 0.1%. The extent of the phosphorylation was analyzed by mass spectrometry as described previously (10, 16). The phosphorylated GAP peptide was purified over a reversed-phase HPLC column. The phosphorylated GAP-43 protein was purified by ion exchange chromatography on a Mono Q column (HR 5/5) using a linear gradient of NaCl (0 - 0.5 M) in 20 mM Tris-HCl buffer (pH 7.5) containing 1 mM EDTA and 1 mM dithiothreitol.
C. Mass Spectrometric Analysis
Electrospray mass spectra were recorded in a PE Sciex API-III mass spectrometer as described previously (10, 16). A capillary HPLC was connected on-line to the electrospray interface of the mass spectrometer.
D. Circular Dichroism (CD) Spectrometry
CD spectra were recorded at 25°C in a JASCO J-720 CD spectropolarimeter using a 0.1 cm cell. The concentration of the peptide was 20 μM in 5 mM phosphate buffer (pH 7.3). The contents of secondary structures were calculated from the CD spectra using a CONTIN program (17) modified by Dr. F. Arisaka, Tokyo Institute of Technology.
E. NMR Spectrometric Analysis
500 MHz proton NMR spectra were recorded on a Bruker DMX-500 spectrometer. Chemical shifts were measured relative to the methyl resonance of an internal reference, 4,4-dimethyl-4-silapentane-1-sulfonate. GAP peptide (5 mM) was dissolved in 90% H2O - 10% D2O, 99.98% D2O, 50% H2O - 10% D2O - 40% trifluoroethanol (TFE)-d3, or 60% D2O - 40% TFE-d3. The pH of the samples was 4.0 (direct meter reading). By using standard procedures for 2D proton NMR of proteins (18), the sequence-specific assignment of resonances was obtained from two-dimensional TOCSY (19), NOESY (20, 21), DQF-COSY with phase cycling (22) or with pulsed field gradient (23, 24), and TQF-COSY with pulsed field gradient (23, 24) spectra. All spectra were acquired at 25°C in the phase-sensitive mode using the time proportional phase increment technique. WATERGATE (25, 26) or presaturation was used for water suppression. A total of 512 measurements with increasing t1 values were made, and 64 transients were accumulated for each measurement. For t2, 2048 data points were taken, and the spectral widths along f2 were 5000 Hz. The data were zero filled once in the f1 dimension. A cosine window function and a Gaussian function were used in the f1 and f2 dimensions, respectively, before Fourier transformation. For the NOESY spectra, the time-domain data were multiplied by Gaussian functions in both dimensions. All spectra were processed using Bruker XWIN-NMR or MSI Felix95.0 software packages.
III. Results and Discussion
A. Mass Spectrometric Analysis of the in Vivo Posttranslational Modifications
Soft ionization techniques such as electrospray ionization and matrix-assisted laser desorption are now routinely used to determine the mass of large hydrophilic polymers like proteins (27). However, as is usual for the ionization process, the presence of salts and detergents, which is common for biological samples, can affect the process significantly. The use of on-line capillary reversed-phase HPLC in combination with the electrospray mass spectrometer (LC/MS) has made it possible to analyze such samples directly (10, 16, 28). When GAP-43 isolated from the membrane fractions of bovine brain was analyzed, a single major peak with a minor peak corresponding to a phosphorylated species was observed (Fig. 1a). To study the posttranslational modifications in detail, the protein was digested with specific proteases such as lysyl endoprotease and trypsin, and the resulting mixtures were directly analyzed with the same LC/MS apparatus. Since the cDNA sequence is known, most of the peptides detected could be assigned solely from their masses, and the two peptides containing phosphorylation and a peptide corresponding to the N-terminal peptide were observed (10). Interestingly, the mass of the latter (796.3 Da) was slightly but significantly lower than the theoretical mass of 798.3 Da. Since the peptide contained two successive Cys residues, the peptide was treated with dithiothreitol and directly analyzed with the LC/MS apparatus. As shown in Fig. 1b, c, the mass of the peptide increased by 2 Da after the dithiothreitol treatment, suggesting that the two Cys residues form an intrachain disulfide bridge. Since no palmitoylated N-terminal peptide was detected to a significant extent, we conclude that the isolated GAP-43 is not palmitoylated at the two Cys residues near the N-terminus.

Fig. 1. Mass spectrometric analysis of GAP-43 purified from membrane fractions of bovine brain. (a) A deconvoluted mass spectrum of GAP-43. A deconvoluted mass spectrum of the N-terminal peptide before reduction (b) and after reduction (c). Peaks formed by oxidation of Met were also observed.

B. Conformational Change of GAP-43 and GAP Peptide upon Phospholipid Binding
GAP-43 purified from bovine brain showed a CD spectrum with a single negative peak at around 197 nm in aqueous solution, which is typical of a random structure. At most 10% of the whole molecule seems to assume an α-helical conformation. Upon addition of acidic phospholipids such as phosphatidylglycerol (PG), however, a broad negative peak between 220 and 230 nm due to the increase in the α-helix content was observed (Fig. 2a). All the acidic phospholipids tested, but not a neutral phospholipid such as phosphatidylcholine, affected the CD spectrum in a similar way. A peptide corresponding to the calmodulin-binding domain of GAP-43 (GAP peptide) showed a similar random coil to α-helix conformational change upon phospholipid binding (Fig. 2b). The extents of the change in the CD spectra of the intact protein and the peptide are comparable, suggesting that only this domain interacts with the lipids and undergoes a conformational change to α-helix. This is reasonable, since the whole molecule of GAP-43 except for the calmodulin-binding domain is hydrophilic and acidic without any hydrophobic amino acids. When the ionic strength of the buffer was increased, the apparent affinity between the GAP peptide and the phospholipids decreased, suggesting that the interaction between the GAP peptide and the phospholipids involves electrostatic interaction (29, 30). The addition of TFE, a membrane-mimicking reagent, caused a concentration-dependent induction of the CD spectrum component typical of an α-helix. The α-helical content reached almost 100% in the presence of 40% TFE.

Fig. 2. Effects of phospholipids on CD spectra of GAP-43 and GAP peptide. CD spectra of GAP-43 (a) and GAP peptide (b) were measured in the absence (○) and in the presence (●) of phosphatidylglycerol or phosphatidylcholine (△).
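The helical contents quoted in this section were obtained with CONTIN. As a much cruder cross-check that is often applied to short peptides, the helix fraction can be estimated from the mean residue ellipticity at 222 nm alone. The sketch below is that swapped-in simplification, not the CONTIN analysis: the limiting value for a complete helix, -40000 x (1 - 2.5/n), and the coil value of 0 are commonly quoted empirical constants and are assumptions here, as are the function names.

```python
def helix_limit(n_residues, theta_inf=-40000.0, k=2.5):
    """[theta]222 assumed for a fully helical peptide of n residues.

    theta_inf and k are empirical constants (assumptions, not from the text).
    """
    return theta_inf * (1.0 - k / n_residues)


def helix_fraction(theta_222, n_residues, theta_coil=0.0):
    """Two-state estimate of the helix fraction, clamped to the range 0-1."""
    full = helix_limit(n_residues)
    fraction = (theta_222 - theta_coil) / (full - theta_coil)
    return min(1.0, max(0.0, fraction))


# GAP peptide has 17 residues; the ellipticity below is a hypothetical reading.
print(round(helix_limit(17)))
print(round(helix_fraction(-17000.0, 17), 2))
```

A single-wavelength estimate of this kind cannot separate β-sheet or turn contributions, which is one reason a full-spectrum method such as CONTIN is preferred for the intact protein.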
C. Structural Analysis by Nuclear Magnetic Resonance
The structural characteristics of the domain were further studied in detail by NMR techniques. Compared with CD spectrometry, the NMR method gave more accurate and residue-specific information on the conformation. A large portion of the synthetic peptide formed a regular α-helix in the presence of TFE, as was evidenced by the consecutive NOE connectivities (Fig. 3a) (18). Fig. 3a shows that rather strong medium-range 1H-1H NOEs of both αβ(i, i+3) and αN(i, i+3) are detected in the region from Phe4 to Lys^^. Furthermore, when the chemical shifts of the α protons observed in GAP peptide were compared with those obtained for a random structure peptide (31), the characteristic upfield shifts of the α protons of GAP peptide, except for those of two residues near the C terminus, were observed (Fig. 3b). This feature is observed in regions of α-helical structure (32, 33). These results indicate that the region (Phe4-Lys^^) forms a "regular" α-helix in the presence of TFE.
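The chemical-shift argument above amounts to computing, for each residue, the secondary shift Δδ = δ(observed) - δ(random coil) of the α proton and looking for a run of negative (upfield) values. The following minimal sketch shows that bookkeeping. All shift values in it are made up for illustration, the -0.1 ppm cutoff is a commonly used convention rather than a figure from this chapter, and the function names are hypothetical.

```python
def secondary_shifts(observed_ppm, random_coil_ppm):
    """Per-residue secondary shift: observed minus random-coil value (ppm)."""
    return {res: observed_ppm[res] - random_coil_ppm[res]
            for res in observed_ppm}


def helix_like(delta_ppm, cutoff=-0.1):
    """Residues whose alpha-proton shift is upfield by at least |cutoff| ppm."""
    return [res for res, delta in delta_ppm.items() if delta <= cutoff]


# Hypothetical alpha-proton shifts (ppm); not data from the chapter.
observed = {"F4": 4.30, "I8": 3.90, "L13": 4.10, "K17": 4.20}
random_coil = {"F4": 4.66, "I8": 4.23, "L13": 4.38, "K17": 4.23}

deltas = secondary_shifts(observed, random_coil)
print(helix_like(deltas))
```

In this toy input the three interior residues are flagged as helix-like while the C-terminal residue is not, which mirrors the pattern described for the peptide in TFE.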
In the absence of TFE, GAP peptide showed a CD spectrum typical of a random structure (Fig. 2b). Because of resonance overlap, only a few peaks in the NMR spectra could be uniquely assigned. However, as is shown in Fig. 3c, the signals of the α protons again generally showed characteristic upfield shifts, although the shifts were not as large as those obtained in the presence of TFE. Because Ala2, Ile8, Thr9, and Leu13 each occur only once in GAP peptide, and their methyl groups give well-resolved signals in the higher magnetic field region, it was possible to assign these residues. Interestingly, the α proton chemical shifts of all the assigned residues showed values intermediate between those typical of random coil and those for the α-helix obtained as above (Fig. 3b). Since the resonance overlap observed is characteristic of a random coil, and the chemical shifts of the α protons showed values intermediate between those of random coil and those of α-helical structure, the GAP peptide in aqueous solution assumes an intermediate state between a random coil and a regular α-helix. Such a "nascent" helical structure may deviate from ideal geometry, and/or the ends of the α-helix can fray (34, 35). The interaction of GAP peptide with phospholipids seemed to stabilize the conformation to induce an α-helix, as is often the case for "nascent" α-helical structures, which are usually induced or further stabilized by addition of the α-helix-promoting solvent TFE (36, 37).

Fig. 3. NMR analysis of GAP peptide. (a) NOE connectivities of specified proton pairs observed in the NOESY spectra of GAP peptide in the presence of 40% TFE are marked with a bar (αN(i,i+1)), open (αβ(i,i+3)) and/or shaded (αN(i,i+3)) boxes. (b) Deviations in the chemical shifts of some of the α protons in the presence of TFE (open bars) and in the absence of TFE (shaded bars) from those observed with typical random coil (31) are indicated. (c) αH (F1)/NH (F2) region of the 500 MHz DQF-COSY spectrum of GAP peptide in 90% H2O - 10% D2O. The regions in which Lys, Arg, Gln, Ser, His, Phe, and Gly in random coil are observed are indicated.
IV. Conclusions
GAP-43 lacks any hydrophobic region of the kind found in typical membrane proteins, and the palmitoylation that has been implicated in membrane anchoring is not present in the purified protein. However, the effector domain of basic amphiphilic nature has the ability to bind acidic phospholipids. The domain adopts an α-helical conformation when placed in hydrophobic environments, as shown by the CD and NMR analyses. A growing body of evidence suggests that the basic amphiphilic α-helical domain, which was initially found as a calmodulin-binding motif, serves as a reversible membrane-association signal.
References
1. Burn, P. (1988) Trends Biochem. Sci. 13, 79-83.
2. Resh, M. D. (1994) Cell 6, 411-413.
3. Taniguchi, H., and Manenti, S. (1993) J. Biol. Chem. 268, 9960-9963.
4. Kim, J., Shishido, T., Jiang, X., Aderem, A., and McLaughlin, S. (1994) J. Biol. Chem. 269, 28214-22821.
5. Peitzsch, R. M., and McLaughlin, S. (1993) Biochemistry 32, 10436-10443.
6. Meiri, K. P., and Gordon-Weeks, P. R. (1990) J. Neurosci. 10, 256-266.
7. LaBate, M. E., and Skene, J. H. P. (1989) Neuron 3, 299-310.
8. Zuber, M. X., Strittmatter, S. M., and Fishman, M. C. (1989) Nature 341, 345-348.
9. Skene, J. H. P., and Virag, I. (1989) J. Cell Biol. 108, 613-624.
10. Taniguchi, H., Suzuki, M., Manenti, S., and Titani, K. (1994) J. Biol. Chem. 269, 22481-22484.
11. Hayashi, N., Matsubara, M., Titani, K., and Taniguchi, H. (1996) in preparation.
12. Blackshear, P. J. (1993) J. Biol. Chem. 268, 1501-1504.
13. Houbre, D., Duportail, G., Deloulme, J. C., and Baudier, J. (1991) J. Biol. Chem. 266, 7121-7123.
14. Kim, J., Blackshear, P. J., Johnson, J. D., and McLaughlin, S. (1994) Biophys. J. 67, 227-237.
15. Manenti, S., Sorokine, O., Van Dorsselaer, A., and Taniguchi, H. (1992) J. Biol. Chem. 267, 22310-22315.
16. Taniguchi, H., Manenti, S., Suzuki, M., and Titani, K. (1994) J. Biol. Chem. 269, 18299-18302.
17. Provencher, S. W., and Glockner, J. (1981) Biochemistry 20, 33-37.
18. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids, J. Wiley, New York.
19. Bax, A., and Davies, D. G. (1985) J. Magn. Reson. 65, 393-402.
20. Jeener, J., Meier, B. H., Bachman, P., and Ernst, R. R. (1979) J. Chem. Phys. 71, 4546-4553.
21. Macura, S., Hyang, Y., Suter, D., and Ernst, R. R. (1981) J. Magn. Reson. 43, 259-281.
22. Rance, M., Sorensen, O. W., Bodenhausen, G., Wagner, G., Ernst, R. R., and Wüthrich, K. (1983) Biochem. Biophys. Res. Commun. 177, 479-485.
23. Baker, P., and Freeman, R. (1985) J. Magn. Reson. 64, 334-338.
24. Hurd, R. E. (1990) J. Magn. Reson. 87, 422-428.
25. Piotto, M., Saudek, V., and Sklenar, V. (1992) J. Biomol. NMR 2, 661-665.
26. Sklenar, V., Piotto, M., Leppik, R., and Saudek, V. (1993) J. Magn. Reson. 102 (Ser. A), 241-245.
27. Biemann, K. (1992) Annu. Rev. Biochem. 61, 977-1010.
28. Taniguchi, H. (1996) J. Mass Spectrom. Soc. Japan 44, 443-457.
29. McLaughlin, S. (1977) Curr. Top. Membr. Transp. 9, 1-144.
30. McLaughlin, S. (1989) Annu. Rev. Biophys. Biophys. Chem. 18, 13-136.
31. Bundi, A., and Wüthrich, K. (1979) Biopolymers 18, 285-298.
32. Pastore, A., and Saude, V. (1990) J. Magn. Reson. 90, 165-176.
33. Wishart, D., Sykes, B., and Richards, F. (1991) J. Mol. Biol. 222, 311-333.
34. Dyson, H. J., Merutka, J., Waltho, J. P., Lerner, R. A., and Wright, P. E. (1992) J. Mol. Biol. 226, 795-817.
35. Manning, M. C., Illangasekare, M., and Woody, R. W. (1988) Biophys. Chem. 31, 77-86.
36. Munier, H., Blanco, F. J., Precheur, B., Diesis, E., Nieto, J. L., Craescu, C. T., and Barzu, O. (1993) J. Biol. Chem. 268, 1695-1701.
37. Shang, M., and Vogel, H. J. (1994) J. Biol. Chem. 269, 981-985.
Acknowledgements
We thank Mr. M. Suzuki for technical assistance. This work was supported in part by Grants-in-Aid from the Fujita Health University, Science Research Promotion Fund from the Japan Private School Promotion Foundation, Research Grant from the Naito Foundation for Medical Research, Grant-in-Aid for Scientific Research (C) (06680773) and Grants-in-Aid for Scientific Research on Priority Areas (06253218, 06276218, 07268221, 07279242, 08249240 and 08260220) from the Ministry of Education, Science and Culture, Japan. M.M. is a Research Fellow of the Japan Society for the Promotion of Science.
One-Dimensional Diffusion of a Protein along a Single-Stranded Nucleic Acid
Bradley R. Kelemen and Ronald T. Raines
Department of Biochemistry, University of Wisconsin, Madison, WI 53706-1569
I. Introduction
One-dimensional diffusion can accelerate the formation of site-specific interactions within biopolymers by up to 10^-fold (Berg et al., 1981). Such facilitated diffusion is used by transcription factors and restriction endonucleases to locate specific sites on double-stranded DNA (von Hippel and Berg, 1989). The backbone of RNA, like that of DNA, could allow for the facilitated diffusion of proteins. Yet, the facilitated diffusion of a protein along RNA (or any single-stranded nucleic acid) has not been demonstrated previously.
Bovine pancreatic ribonuclease A (RNase A; RNA depolymerase; EC 3.1.27.5) is a distributive endoribonuclease that catalyzes the cleavage of the P-O5' bond of RNA on the 3' side of pyrimidine residues. RNase A binds to polymeric substrates (Imura et al., 1965; Irie et al., 1984; Moussaoui et al., 1995), but the mechanism by which RNase A locates a pyrimidine residue within a polymeric substrate is not known. Binding to phosphoryl groups is important for the one-dimensional diffusion of proteins along DNA (Winter et al., 1981), and may likewise provide the nonspecific interactions necessary to generate one-dimensional diffusion by RNase A.
RNase A has three defined phosphoryl group binding subsites, P0, P1, and P2, as well as three base binding subsites, B1, B2, and B3 (Pares et al., 1991). The subsite interactions in the RNase A·RNA complex are shown in Figure 1a. The P0 and P2 subsites interact with phosphoryl groups that remain intact during catalysis; the P1 subsite is the active site. The B1 subsite is responsible for the pyrimidine specificity of RNase A. RNase A cleaves poly(cytidine) [poly(C)] or poly(uridine) [poly(U)] 10^-fold faster than poly(adenosine) [poly(A)] as a result of the selectivity of the B1 subsite. In contrast to the B1 subsite, the B2 and B3 subsites prefer to bind purines. Previously, we demonstrated that enlarging the B1 subsite increases the rate of poly(A) cleavage by 10^-fold (delCardayre and Raines, 1994; delCardayre et al., 1994). This enlargement also converts the distributive mechanism of wild-type RNase A to a processive mechanism when poly(A) is the substrate.
[Figure 1a: schematic of a nucleic acid strand bound to RNase A. Residues labeled in the drawing: Lys66 (P0 subsite); Thr45, Asp83, and Phe120 (B1 subsite, Cyt/Ura); Gln11, His12, Lys41, His119, and Asp121 (P1 subsite, scissile bond); Gln69, Asn71, and Glu111 (Ade); Lys7 and Arg10; Lys1 (Ade/Gua).]

Figure 1. a. Amino acid residues of RNase A that compose the subsites for binding phosphoryl groups (P0, P1, and P2) and bases (B1, B2, and B3) of single-stranded nucleic acids. b. Fluorescein-labeled deoxynucleotides used to assess binding to the B1 subsite.
Single-stranded DNA is an excellent substrate analog for RNase A, and this analogy is the basis for the work described here. First, we report on the use of DNA oligonucleotides and fluorescence polarization to probe the binding of adenine to the B1 subsite of RNase A. Then, we describe the use of DNA/RNA chimeric oligonucleotides to distinguish between three-dimensional and one-dimensional diffusion mechanisms for catalysis by RNase A. Our results provide a biophysical rationale as well as direct evidence for the diffusion of a protein along a single-stranded nucleic acid.
II. Materials and Methods

A. Oligonucleotide synthesis

DNA and DNA/RNA chimeric oligonucleotides were synthesized with a Model 392 DNA/RNA synthesizer from Applied Biosystems (Foster City, CA) with reagents from Glen Research (Sterling, VA). Oligonucleotides were purified by elution from an acrylamide gel after electrophoresis.

To assess binding to the B1 subsite, we synthesized deoxynucleotides that differ only in the base that interacts with the B1 subsite (Figure 1a). The ligands have a uridine (U), adenosine (A), or abasic (0) residue at their 5' ends, followed by two adenosine residues to fill the enzymic B2 and B3 subsites. Each deoxynucleotide is labeled with fluorescein (Fl) so that binding can be detected by fluorescence polarization. The products of these syntheses are shown in Figure 1b.

To probe for one-dimensional diffusion, we synthesized DNA/RNA chimeric oligonucleotides. Special precautions were taken to avoid ribonuclease contamination during synthesis, purification, and use of these chimeras. For example, all water was treated with diethylpyrocarbonate before exposure to the chimeras. Ribonucleotide 2'-hydroxyl groups were deprotected with 1 M tetrabutylammonium fluoride in dimethylformamide (Aldrich Chemical; Milwaukee, WI). Purified oligonucleotides were labeled on the 5' end with [γ-32P]ATP (duPont; Wilmington, DE) by T4 kinase (Promega; Madison, WI), and desalted with a NICK™ gel filtration column (Pharmacia; Uppsala, Sweden).
B. Binding

Fluorescence polarization (like fluorescence anisotropy) can be used to measure the rate of tumbling of a fluorescent molecule (Jameson and Sawyer, 1995; Royer, 1995). A receptor (e.g., RNase A) binding a fluorescent ligand (e.g., a labeled nucleic acid) slows the tumbling of the ligand. Accordingly, fluorescence polarization can reveal the fraction of a nucleic acid that is bound to RNase A.

Fluorescence polarization experiments were performed as described elsewhere (B. M. Templer and R. T. Raines, unpubl. results). Briefly, RNase A (Sigma Chemical; St. Louis, MO) was dialyzed exhaustively at 4 °C against distilled water to remove salts. The enzyme was then lyophilized. The lyophilized enzyme was suspended in 0.90 mL of 0.10 M Mes-HCl buffer, pH 6.0, containing NaCl (0.10 M), such that the concentration was 1 - 2 mM (15 - 30 mg/mL). Fluorescein-labeled deoxynucleotides were dissolved in buffer and added to half of the enzyme solution to a final concentration of 2 - 3 nM. The sample volume was then raised to 1.00 mL with buffer. A blank containing enzyme but not DNA was made by raising the volume of the remaining enzyme solution to 1.00 mL with buffer. The precise concentration of enzyme was determined by assuming that A = 0.72 at 277.5 nm for a 1.0 mg/mL solution.

At least five repetitive fluorescence polarization readings (with individual blank readings) were made at room temperature with a Beacon fluorescence polarization instrument (Panvera; Madison, WI). The average and standard deviations were calculated for the readings. The protein sample was then diluted by removing 0.25 mL and replacing it with buffer containing the same concentration of labeled deoxynucleotide as was in the original protein sample. The blank was diluted with buffer. The data collection and dilution steps were repeated up to thirty times. The resulting data were fit to eq 1 by a non-linear least squares analysis, which was weighted by the standard deviation of each reading.
$$P = \frac{P_{\max}\,[\text{RNase A}]}{K_d + [\text{RNase A}]} + P_{\min} \tag{1}$$
In eq 1, P is the average of the measured fluorescence polarization, Pmin is the polarization of free deoxynucleotide, and Pmax is the polarization at deoxynucleotide saturation minus Pmin. [RNase A] is the protein concentration, and Kd is the equilibrium dissociation constant. For Fl-d(AAA) and Fl-d(0AA), the value of Pmax was poorly defined but apparently similar to that for Fl-d(UAA); therefore, the Pmax of Fl-d(UAA) was used to fit the Fl-d(AAA) and Fl-d(0AA) data.
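The titration and fit can be illustrated with a minimal sketch. It assumes that each dilution step replaces 0.25 mL of a 1.00 mL sample, so that the enzyme concentration falls by a factor of 0.75 per step, and, as a simplification, that Pmax and Pmin are held fixed while only Kd is fitted. The function names, the golden-section search, and the example values are hypothetical; the original analysis used a non-linear least squares fit whose details are not given here.

```python
import math


def polarization(rnase_conc, k_d, p_max, p_min):
    """Eq 1: polarization expected at a given RNase A concentration (M)."""
    return p_max * rnase_conc / (k_d + rnase_conc) + p_min


def dilution_series(initial_conc, n_steps, removed_ml=0.25, total_ml=1.00):
    """Enzyme concentrations before and after each of n_steps dilutions."""
    factor = (total_ml - removed_ml) / total_ml
    return [initial_conc * factor ** i for i in range(n_steps + 1)]


def fit_k_d(concs, readings, sds, p_max, p_min, lo=1e-8, hi=1e-1, iters=100):
    """Weighted least-squares estimate of K_d with P_max and P_min fixed.

    Minimizes the sum of squared, SD-weighted residuals by a golden-section
    search over log10(K_d) between lo and hi (both in M).
    """
    def chi2(log_k):
        k_d = 10.0 ** log_k
        return sum(
            ((p - polarization(c, k_d, p_max, p_min)) / sd) ** 2
            for c, p, sd in zip(concs, readings, sds)
        )

    a, b = math.log10(lo), math.log10(hi)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        x1 = b - inv_phi * (b - a)
        x2 = a + inv_phi * (b - a)
        if chi2(x1) < chi2(x2):
            b = x2  # minimum lies in [a, x2]
        else:
            a = x1  # minimum lies in [x1, b]
    return 10.0 ** ((a + b) / 2.0)


if __name__ == "__main__":
    # Hypothetical noise-free data generated from eq 1 itself.
    concs = dilution_series(1.5e-3, 30)
    readings = [polarization(c, 0.13e-3, 100.0, 50.0) for c in concs]
    sds = [1.0] * len(concs)
    print(fit_k_d(concs, readings, sds, p_max=100.0, p_min=50.0))
```

Holding Pmax fixed mirrors the treatment described for Fl-d(AAA) and Fl-d(0AA); a full fit would also let Pmax and Pmin vary.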
C. One-dimensional diffusion

Enzymes capable of one-dimensional diffusion should cleave a substrate with a long nonspecific binding region faster than a similar substrate with a short such region (Berg et al., 1981). The substrates used here derive from simpler substrates with long and short nonspecific binding regions (Figure 2a). By merging the simpler substrates into one, evidence for facilitated diffusion can be obtained directly in a single experiment. A conceptually analogous experiment has been performed with EcoRI endonuclease (Jeltsch et al., 1994).
[Figure 2a: substrate sequences.
Simple substrates:
d(AAAAA)Ud(AAAAA)
d(AAAAA)Ud(AAAAAAAAAAAAAAAAAAAAAAAAA)
Composite substrates:
Oligo 1: d(AAAAA)Ud(AAAAA)Ud(AAAAAAAAAAAAAAAAAAAAAAAAA)
Oligo 2: d(AAAAAAAAAAAAAAAAAAAAAAAAA)Ud(AAAAA)Ud(AAAAA)
Figure 2b: scheme of the 5'-32P-labeled products, P1D and P3D, formed from Oligo 1 and Oligo 2 by RNase A.]

Figure 2. a. DNA/RNA chimeric oligonucleotide substrates used to detect one-dimensional diffusion by RNase A. Oligo 1 and Oligo 2 are circular permutations containing two cleavage sites, one of which is proximal to a long nonspecific binding region. b. Products of the cleavage of Oligo 1 and Oligo 2. P1D results from one-dimensional diffusion of RNase A along the long poly(dA) tract.
Oligo 1 and Oligo 2 are chimeric oligonucleotides that contain 35 DNA residues and 2 RNA residues. The RNA residues are uridine nucleotides, and are referred to as the 1D and 3D sites. We chose this naming system because the 1D site is closer to the long nonspecific binding region and will be cleaved faster if RNase A uses a one-dimensional diffusion mechanism. In both substrates, the 1D cleavage site is flanked on one side by 25 deoxyadenosine residues. The 1D and 3D cleavage sites are separated by 5 deoxyadenosine residues, and 5 more deoxyadenosine residues separate the 3D site from the end. Oligo 1 has the uridine nucleotides near the 5' end, whereas Oligo 2 has the uridine nucleotides near the 3' end.

The use of composite substrates could complicate data interpretation because of the possibility of multiple catalytic events on the same substrate. Diffusion in one dimension, however, like diffusion in three dimensions, cannot be directional (von Hippel and Berg, 1989). RNase A bound to the long nonspecific binding region should therefore cleave the 1D site faster than the 3D site regardless of the site's proximity to the 5' or 3' end. Comparing the initial rates of cleavage of Oligo 1 and Oligo 2 thus resolves the complications incurred from the consolidation of substrates.

Only two detectable products are formed from the degradation of Oligo 1 or Oligo 2 because only a 5' 32P label is used for detection (Figure 2b). RNase A cleavage at the 1D site produces a detectable product, P1D. Cleavage at the 3D site forms a detectable product, P3D, of a different length. For Oligo 1, P1D is 12 nt and P3D is 6 nt. For Oligo 2, P1D and P3D are 26 and 32 nt, respectively. The ratio [P1D]/[P3D] is approximately equal to the ratio of the initial rates of cleavage at the 1D (k1D) and 3D (k3D) sites (i.e., [P1D]/[P3D] ≈ k1D/k3D). This ratio is an indicator of one-dimensional diffusion of RNase A along Oligo 1 and Oligo 2. A ratio of [P1D]/[P3D] > 1 is indicative of one-dimensional diffusion; [P1D]/[P3D] = 1 is indicative of three-dimensional diffusion.

Assays for one-dimensional diffusion were performed as follows. Reactions were initiated at room temperature by the addition of substrate. The reaction mixture consisted of 0.050 M Mes-HCl buffer, pH 6.0, containing RNase A (1 fmol - 0.1 pmol), NaCl (0.025, 0.12, or 1.0 M), and substrate (0.4 - 0.8 µM). Aliquots (2 µL) of the reaction were quenched at various times by addition to an equal volume of formamide (95% v/v) containing EDTA (20 mM), xylene cyanol (0.05% w/v), and bromophenol blue (0.05% w/v). Less than 10% of the substrate was cleaved during the course of an experiment. Reaction products were separated by electrophoresis on a denaturing 18% (w/v) acrylamide gel. To prevent shattering, these gels were soaked in an aqueous solution of acetic acid (7% v/v) and methanol (7% v/v), then in methanol, before drying under reduced pressure (Thomas et al., 1992). Detection and quantification of cleavage products were made using a PhosphorImager™ radioisotope imaging system from Molecular Dynamics (Sunnyvale, CA).
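The product lengths quoted above follow directly from the substrate sequences and can be checked with a short sketch. It assumes a single cleavage per substrate molecule on the 3' side of a uridine, with only the 5'-labeled fragment detected. The function names and the band intensities in the usage example are hypothetical.

```python
# 5'->3' sequences: 'A' is a deoxyadenosine, 'U' is a uridine ribonucleotide.
OLIGO_1 = "AAAAA" + "U" + "AAAAA" + "U" + "A" * 25
OLIGO_2 = "A" * 25 + "U" + "AAAAA" + "U" + "AAAAA"


def labeled_products(oligo):
    """Return the lengths (nt) of the 5'-labeled products (P1D, P3D).

    Cleavage on the 3' side of a uridine leaves a labeled fragment that ends
    with that uridine, so its length equals the 1-based position of the U.
    The 1D site is the uridine adjacent to the longer flanking poly(dA) tract.
    """
    sites = [i + 1 for i, base in enumerate(oligo) if base == "U"]
    if len(sites) != 2:
        raise ValueError("expected exactly two uridine cleavage sites")
    first, second = sites
    tract_5prime = first - 1            # residues 5' of the first uridine
    tract_3prime = len(oligo) - second  # residues 3' of the second uridine
    if tract_3prime > tract_5prime:
        return second, first
    return first, second


def product_ratio(p1d_signal, p3d_signal):
    """[P1D]/[P3D]; values above 1 suggest one-dimensional diffusion."""
    return p1d_signal / p3d_signal


if __name__ == "__main__":
    print(labeled_products(OLIGO_1))
    print(labeled_products(OLIGO_2))
    # Hypothetical band intensities in arbitrary units.
    print(product_ratio(150.0, 100.0))
```

Because the two composite substrates place the uridines at opposite ends, a ratio above 1 for both cannot be attributed to a preference for the 5' or 3' end of the strand.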
III. Results

A. Binding

Fluorescence polarization data for the binding of RNase A to Fl-d(UAA), Fl-d(AAA), and Fl-d(0AA) are shown in Figure 3. RNase A binds Fl-d(UAA) approximately 20-fold more tightly than Fl-d(AAA) or Fl-d(0AA), demonstrating that the B1 subsite has affinity for a pyrimidine base. The similarity in binding affinity for Fl-d(AAA) and Fl-d(0AA) indicates that the B1 subsite of RNase A does not bind adenine significantly, but does not discriminate against it.
[Figure 3: plot of fluorescence polarization (mP) versus [RNase A] (M) on a logarithmic concentration axis.]

Figure 3. Binding of RNase A to Fl-d(UAA) (●), Fl-d(AAA) (○), and Fl-d(0AA) (□) as assessed by changes in fluorescence polarization (mP). Data were obtained in 0.10 M Mes-HCl buffer, pH 6.0, containing NaCl (0.10 M). Data were fit to eq 1, yielding Kd values of 0.13 mM, 3.3 mM, and 2.5 mM for Fl-d(UAA), Fl-d(AAA), and Fl-d(0AA), respectively.
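As a worked check on the "approximately 20-fold" statement, the reported Kd values can be converted into fold differences and into the corresponding differences in binding free energy, ΔΔG = RT ln(Kd,weak/Kd,tight). This is a sketch only: the temperature of 298.15 K is an assumption (the text says only "room temperature"), and the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)

# Kd values (M) reported in the legend to Figure 3.
K_D = {"Fl-d(UAA)": 0.13e-3, "Fl-d(AAA)": 3.3e-3, "Fl-d(0AA)": 2.5e-3}


def fold_weaker(ligand, reference="Fl-d(UAA)"):
    """Ratio of dissociation constants, ligand relative to reference."""
    return K_D[ligand] / K_D[reference]


def delta_delta_g(ligand, reference="Fl-d(UAA)", temp_k=298.15):
    """Difference in binding free energy (kcal/mol); temp_k is assumed."""
    return R_KCAL * temp_k * math.log(fold_weaker(ligand, reference))


def fraction_bound(ligand, rnase_conc):
    """Fraction of ligand bound when RNase A is in large excess."""
    return rnase_conc / (K_D[ligand] + rnase_conc)


if __name__ == "__main__":
    for name in ("Fl-d(AAA)", "Fl-d(0AA)"):
        print(name, round(fold_weaker(name), 1), round(delta_delta_g(name), 2))
```

The two ratios bracket 20, which is consistent with the rounded figure used in the text.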
B. Facilitated diffusion

A typical time-course for the degradation of Oligo 1 and Oligo 2 by RNase A in the presence of 25 mM NaCl is shown in Figure 4. The concentration of P1D exceeds that of P3D at all times for both Oligo 1 and Oligo 2. These data provide evidence that RNase A uses one-dimensional diffusion to locate pyrimidine nucleotides within a polymeric substrate.

The one-dimensional diffusion of RNase A is diminished by added NaCl. The ratio [P1D]/[P3D] for Oligo 1 and Oligo 2 at three concentrations of NaCl is shown in Figure 5. RNase A displays no indication of facilitated diffusion at high NaCl concentration, where [P1D]/[P3D] = 1. At 0.12 M NaCl, [P1D]/[P3D] > 1, indicating that RNase A can use one-dimensional diffusion at NaCl concentrations close to physiological. At 0.025 M NaCl, [P1D]/[P3D] is even greater, consistent with a facilitated diffusion mechanism that relies on nonspecific binding to the phosphoryl groups of poly(dA). Under these low-salt conditions, RNase A also shows the slowest turnover of substrate. As shown in Figure 4, the cleavage occurs in a burst but is then inhibited by products. The size of this burst increases with enzyme concentration (data not shown).
[Figure 4: (a) reaction products of Oligo 1 and Oligo 2 at successive times; (b) plots of P1D and P3D formation versus time (min) for Oligo 1 and Oligo 2.]

Figure 4. a. Reaction products 0, 1, 2, 5, and 10 min after addition of RNase A to Oligo 1 and Oligo 2. Reactions were performed in 0.050 M Mes-HCl buffer, pH 6.0, containing NaCl (0.025 M). b. Plots of product formation versus time for Oligo 1 and Oligo 2.
[Figure 5: plot of [P1D]/[P3D] versus [NaCl] (M) on a logarithmic concentration axis.]

Figure 5. The [P1D]/[P3D] ratio versus the log of the concentration of NaCl. Data were obtained in 0.050 M Mes-HCl buffer, pH 6.0, containing NaCl (0.025, 0.12, or 1.0 M).
IV. Conclusions

RNase A can use one-dimensional diffusion along a poly(dA) tract to accelerate the location of a uridine substrate. Use of this mechanism depends on the concentration of NaCl, as expected if the enzyme were binding to the nucleic acid by nonspecific interactions with phosphoryl groups. Binding of the enzymic active site to adenosine residues is 20-fold weaker than to uridine residues, which could enhance the ability of the enzyme to slide along the poly(dA) tract.

A facilitated diffusion mechanism may have evolved for a sinister purpose. Some homologs of RNase A are cytotoxic because they are able to deliver ribonucleolytic activity to the cytosol of mammalian cells (Youle et al., 1993). Facilitated diffusion may enable these cytotoxic ribonucleases to use the poly(A) tail of mammalian mRNAs as a runway leading to substrates in the indispensable coding region.
References

Berg, O. G., Winter, R. B., and von Hippel, P. H. (1981). Biochemistry 20, 6929-6948.
delCardayre, S. B., and Raines, R. T. (1994). Biochemistry 33, 6031-6037.
delCardayre, S. B., Thompson, J. E., and Raines, R. T. (1994). In "Techniques in Protein Chemistry V" (Crabb, J. W., ed.) pp. 313-320, Academic Press, New York.
Imura, N., Irie, M., and Ukita, T. (1965). J. Biochem. 58, 264-272.
Irie, M., Mikami, F., Monma, K., Ohgi, K., Watanabe, H., Yamaguchi, R., and Nagase, H. (1984). J. Biochem. (Tokyo) 96, 89-96.
Jameson, D. M., and Sawyer, W. H. (1995). Methods Enzymol. 246, 283-300.
Jeltsch, A., Alves, J., Wolfes, H., Maass, G., and Pingoud, A. (1994). Biochemistry 33, 10215-10219.
Jensen, D. E., and von Hippel, P. H. (1976). J. Biol. Chem. 251, 7198-7214.
Moussaoui, M., Guasch, A., Boix, E., Cuchillo, C. M., and Nogues, M. V. (1995). J. Biol. Chem. 271, 4687-4692.
Pares, X., Nogues, M. V., de Llorens, R., and Cuchillo, C. M. (1991). Essays Biochem. 26, 89-103.
Royer, C. A. (1995). Methods Molec. Biol. 40, 65-89.
Thomas, M., Abedi, H., and Farzaneh, F. (1992). Biotechniques 13, 533.
von Hippel, P. H., and Berg, O. G. (1989). J. Biol. Chem. 264, 675-678.
Winter, R. B., Berg, O. G., and von Hippel, P. H. (1981). Biochemistry 20, 6961-6977.
Youle, R. J., Newton, D., Wu, Y.-N., Gadina, M., and Rybak, S. M. (1993). Crit. Rev. Therapeutic Drug Carrier Systems 10, 1-28.
Acknowledgements We thank B. M. Templer and C. A. Royer for advice on fluorescence polarization assays. This work was supported by NIH grant GM44783. BRK was supported by NIH Chemistry - Biology Interface training grant GM08505.
Metal-dependent Structure and Self Association of the RAG1 Zinc-Binding Domain Karla K. Rodgers and Karen G. Fleming Department of Molecular Biophysics and Biochemistry Yale University, New Haven, CT 06520-8114
I. Introduction

Structural zinc-binding domains are often characterized by the requirement of zinc coordination for proper protein folding [1]. One specific class of zinc-binding motif that will be discussed here is the zinc C3HC4 motif, also known as the RING finger [2]. To date, at least eighty proteins include a sequence of approximately 50 residues consistent with a RING finger motif. This conserved sequence, with minor variations in some cases, is defined as follows: C-X2-C-loopI-C-X-H-X2-C-X2-C-loopII-C-X2-C, where X represents any amino acid. A common function attributable to the RING finger module has remained elusive, although a role in protein-protein interactions has been speculated [2].

One of the first RING finger sequences was identified in RAG1, a protein expressed in developing lymphocytes by recombination activating gene-1 [3]. RAG1, along with RAG2, is an essential component of the V(D)J recombination reaction, which produces the genetic sequence encoding the variable regions of the T cell receptor and immunoglobulin chains. Briefly, V(D)J recombination is accomplished via selection and assembly of gene segments known as variable (V), joining (J), and sometimes diversity (D) in an ordered and precisely regulated process (for a review see [4]).

The RING finger sequence of RAG1 is present within the N-terminal third of the protein, which contains a total of 1040 residues in the murine form. Besides the RING finger sequence, we have recently identified the presence of two C2H2 zinc finger sequences within RAG1 [5]. A domain in RAG1 containing one of the zinc finger modules plus the RING finger forms a highly specific dimer, as characterized by a variety of biophysical techniques [5]. The dimerization of this zinc-binding domain provides further support for the participation of RING fingers in protein-protein interactions. This dimerization domain of RAG1, previously referred to as R121, will be referred to here as ZDD, zinc-binding dimerization domain.

Here we focus on the role of metal binding to the ZDD dimer. In particular, we have investigated the stabilities of different species of ZDD with varying metal-to-protein stoichiometries. Combined with the metal-binding studies, we have further investigated dimer formation of this unique zinc-binding domain while providing additional detail on the techniques and methods used.
II. Materials and Methods

A. ZDD Purification, Metal Exchange and Analysis

ZDD and a fragment of RAG1 including only the RING finger sequence have been expressed in E. coli as fusion proteins with maltose binding protein (MBP). These proteins are referred to as MBP-ZDD and MBP-RF, respectively. We have recently described the cloning, expression, and purification of MBP-ZDD and MBP-RF. In addition, the proteolytic cleavage of the MBP-ZDD chimera to generate the ZDD fragment, and its subsequent purification, was done as previously reported [5]. N-terminal amino acid sequencing was done at the W.M. Keck Foundation Biotechnology Resource Laboratory, and electrospray mass spectrometry was done by Walter McMurray at the Yale University School of Medicine.

The Zn2-coordinated form of ZDD (Zn2-ZDD) was produced by dialysis of the native ZDD against nitrogen-saturated 20 mM Tris-HCl (pH 7.5), 100 mM NaCl, 5 mM 2-mercaptoethanol (BME), and 1 mM EDTA at 4°C for 17 hours. The Zn2Cd1 form of ZDD was generated by dialysis of Zn2-ZDD against nitrogen-saturated 20 mM Tris-HCl (pH 7.5), 100 mM NaCl, 5 mM BME supplemented with one molar equivalent of CdCl2 at 4°C for 12 hours.

The metal to protein stoichiometry was determined by atomic absorption spectroscopy using an Instrumentation Laboratory IL157 spectrometer. The concentration of metal ions in the RAG1 proteins was measured in solutions of 3 to 10 µM protein. These were compared to either a Zn or Cd calibration curve ranging from 1 to 15 µM, which was measured prior to each protein sample.
B. Circular Dichroism Spectroscopy

Circular dichroism (CD) spectra were collected on an AVIV model 62DS spectropolarimeter using a 0.2 mm path length cell. Protein samples were dialyzed against buffer containing 20 mM sodium phosphate (pH 7.0), 50 mM NaCl, and 1 mM BME. Five separate spectra with a step size of 0.5 nm and a 1.5 nm bandwidth were averaged to obtain the final spectrum for each protein sample. The temperature during the scans was maintained at 25°C with a water-jacketed cuvette holder. The molar ellipticity was determined using protein concentrations obtained from amino acid analysis.

Thermal denaturation studies were done by collecting data at a single wavelength and increasing the temperature in 1°C increments, equilibrating for 60 sec, and using a 30 sec signal integration time. The melting temperatures (Tm) were determined from the temperature at which the first derivative of the data was at a minimum.
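The derivative-based estimate of Tm can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the derivative is taken by central differences, the extremum is located by magnitude so that the sign convention of the ellipticity does not matter, and the melting curve is synthetic. All function names and parameter values are hypothetical.

```python
import math


def first_derivative(temps, signal):
    """Central-difference estimate of d(signal)/dT at the interior points."""
    return [
        (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(temps) - 1)
    ]


def melting_temperature(temps, signal):
    """Temperature at which the first derivative is largest in magnitude."""
    deriv = first_derivative(temps, signal)
    steepest = max(range(len(deriv)), key=lambda i: abs(deriv[i]))
    return temps[steepest + 1]  # deriv[0] corresponds to temps[1]


def two_state_curve(temps, t_m, width, folded, unfolded):
    """Synthetic sigmoidal melting curve, used only for illustration."""
    return [
        folded + (unfolded - folded) / (1.0 + math.exp(-(t - t_m) / width))
        for t in temps
    ]


if __name__ == "__main__":
    temps = list(range(40, 91))  # 1 degree C increments
    signal = two_state_curve(temps, t_m=78.0, width=2.0,
                             folded=-20.0, unfolded=-5.0)
    print(melting_temperature(temps, signal))
```

With real data, smoothing before differentiation is usually needed, because a point-by-point derivative amplifies noise.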
C. Analytical Ultracentrifugation

The details of the ultracentrifugation experiments have been described previously [5]. Briefly, equilibrium sedimentation experiments were performed in a Beckman XL-A analytical ultracentrifuge at multiple speeds in buffer containing 20 mM sodium phosphate (pH 7.0), 150 mM NaCl, and 5 mM BME at 20°C. The partial specific volumes were calculated using the values of Cohn & Edsall [6]. The data were analyzed using a modified version of IGOR Pro as well as the MacNONLIN program [7]. The applicable mathematical models for the equilibrium distributions of the isolated ZDD and the MBP-ZDD chimera, respectively, are given by

$$c_i = c_{\mathrm{ref}} \exp(\sigma) + \mathrm{base} \tag{1}$$

for a single species, where $\sigma = M(1 - \bar{v}\rho)\,\omega^2 (r_i^2 - r_{\mathrm{ref}}^2)/2RT$, $c_i$ is the total concentration at a radial position $r_i$, $c_{\mathrm{ref}}$ is the concentration at a reference position, $r_{\mathrm{ref}}$, $M$ and $\bar{v}$ are the monomer molecular weight (g/mol) and partial specific volume (ml/g), $\omega$ is the angular velocity (rad/sec), $\rho$ is the solvent density (g/ml), $r_i$ and $r_{\mathrm{ref}}$ are the radial positions (cm) at an arbitrary position and at the reference position, $R$ is the universal gas constant (erg/mol K), $T$ is the absolute temperature, and base is a term for non-sedimenting material; and

$$c_i = c_{\mathrm{ref}} \exp(\sigma) + (K c_{\mathrm{ref}}^2) \exp(2\sigma) + \mathrm{base} \tag{2}$$

for a monomer/dimer distribution, where $K$ is the equilibrium constant, $(K c_{\mathrm{ref}}^2)$ is the concentration of dimer (using the law of mass action), and other terms are as previously defined.

Velocity sedimentation experiments on both proteins at several concentrations were performed at 55,000 rpm in the same buffer at 20°C. The data were analyzed using the time derivative method of Stafford [8].

1. Calculation of $s^0_{20,w}$

A detailed description of the calculation of the $s^0_{20,w}$ parameter used in shape estimations, as well as in solution molecular weight determination in conjunction with the $D^0_{20,w}$ from dynamic light scattering, has previously been reported [5]. Briefly, since at all concentrations the apparent sedimentation coefficient distributions were symmetrical and approximately Gaussian on the s* scale (data not shown here), the weight average sedimentation coefficient at a particular concentration, $s_{20,\mathrm{solvent}}$, was calculated from the apparent distribution function [8]. These values were used to calculate the corresponding sedimentation coefficient, $s^0_{20,w}$, which is corrected to an infinitely dilute protein concentration and to water at 20°C.

2. Calculation of $s_{\mathrm{app}}$ and $D_{\mathrm{app}}$

An extended analysis of data using the time-derivative method provides for simultaneous determination of apparent sedimentation coefficient, $s_{\mathrm{app}}$, and apparent diffusion coefficient, $D_{\mathrm{app}}$, values at a particular concentration and temperature [9]. The apparent diffusion coefficient was calculated from the apparent sedimentation coefficient distribution by the following relationship:

$$D_{\mathrm{app}} = \frac{(\sigma\, r_m\, \omega^2 t)^2}{2t} \tag{3}$$

where $r_m$ is the radial position of the meniscus (cm), $t$ is the equivalent sedimentation time (sec), and $\sigma$ is the standard deviation of the $g(s^*)$ versus $s^*$ curve determined by fitting to the following equation:

$$g(s^*) = A \exp\left[-0.5\left((s^* - s_{\mathrm{app}})/\sigma\right)^2\right] \tag{4}$$

where $A$ and $\sigma$ are constants, and $s_{\mathrm{app}}$ is the sedimentation coefficient given by the maximum position of the $g(s^*)$ versus $s^*$ curve.

D. Calculation of Molecular Weight from s and D

The Svedberg equation was used to calculate the molecular weight from the sedimentation and diffusion coefficients:

$$\frac{s}{D} = \frac{M(1 - \bar{v}\rho)}{RT} \tag{5}$$

E. Calculation of Shape Factor and Axial Ratio

Calculation of the frictional coefficient, shape factor, and axial ratio for the RAG1 fragment (ZDD) from the sedimentation coefficient, $s^0_{20,w}$, has been previously described in detail [5].
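The sedimentation relationships used in this section can be collected into a short numerical sketch. It assumes cgs units throughout (R in erg/(mol K), radii in cm, s in sec, D in cm²/sec) and takes the diffusion relationship to be D_app = (σ ω² t r_m)²/2t, the form associated with the time-derivative method; that form, the function names, and every numerical input below are assumptions made for illustration.

```python
import math

R_ERG = 8.314e7  # universal gas constant in erg/(mol K), cgs units


def angular_velocity(rpm):
    """Convert rotor speed (rpm) to angular velocity (rad/sec)."""
    return rpm * 2.0 * math.pi / 60.0


def sigma_exponent(mass, vbar, rho, rpm, r, r_ref, temp_k):
    """Exponent of eq 1: M(1 - vbar*rho) w^2 (r^2 - r_ref^2) / 2RT."""
    omega = angular_velocity(rpm)
    return (mass * (1.0 - vbar * rho) * omega ** 2
            * (r ** 2 - r_ref ** 2) / (2.0 * R_ERG * temp_k))


def monomer_dimer_concentration(c_ref, k_assoc, sigma, base=0.0):
    """Eq 2: monomer term plus dimer term (K c_ref^2) plus baseline."""
    return (c_ref * math.exp(sigma)
            + k_assoc * c_ref ** 2 * math.exp(2.0 * sigma)
            + base)


def apparent_diffusion(sigma_s, r_meniscus, rpm, t):
    """Assumed form of eq 3: D_app = (sigma_s w^2 t r_m)^2 / 2t."""
    omega = angular_velocity(rpm)
    return (sigma_s * omega ** 2 * t * r_meniscus) ** 2 / (2.0 * t)


def svedberg_mass(s, d, vbar, rho, temp_k):
    """Eq 5 rearranged: M = s R T / (D (1 - vbar*rho))."""
    return s * R_ERG * temp_k / (d * (1.0 - vbar * rho))


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    mass = svedberg_mass(s=1.9e-13, d=1.0e-6, vbar=0.73, rho=1.0,
                         temp_k=293.15)
    print(f"Svedberg molecular weight: {mass:.0f} g/mol")
    sig = sigma_exponent(13200.0, 0.73, 1.0, 15000, 7.05, 7.00, 293.15)
    print(monomer_dimer_concentration(c_ref=1.0e-6, k_assoc=3.0e5, sigma=sig))
```

Setting `k_assoc` to zero reduces the monomer/dimer model to the single-species model of eq 1, which is a convenient check when comparing the two fits.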
III. Results and Discussion

A. The Zinc-binding Dimerization Domain of RAG1

The zinc-binding dimerization domain, ZDD, of RAG1 includes two different zinc-binding modules: a RING finger and a C2H2 zinc finger. ZDD was expressed as a fusion protein with the maltose-binding protein (MBP) in E. coli. It could be efficiently cleaved from the MBP-ZDD chimera after purification via limited proteolysis with trypsin [5]. This domain, previously referred to as R121, was originally believed to consist of 121 residues. However, from electrospray mass spectrometry and N-terminal amino acid sequencing we have determined that the domain corresponds to residues 265 to 380 in the RAG1 full length sequence after cleavage from MBP, yielding a monomer molecular weight of 13.2 kDa.

The position of ZDD relative to other proposed domains in the entire RAG1 sequence is shown in Figure 1. It can be seen that the dimerization domain is located immediately N-terminal to the core region of RAG1, the minimal RAG1 domain required for efficient recombination [10]. The locations of the proposed RING finger and zinc finger modules within ZDD are also illustrated in Figure 1. In addition to ZDD, a fragment containing only the RING finger of RAG1 has been cloned and expressed as a fusion protein with MBP and is referred to as MBP-RF.
[Figure 1: bar diagram of full-length murine RAG1 (residues 1 to 1040) marking the RING finger, ZFA, and ZFB modules, the ZDD (residues 265 to 380), the core RAG1 region, and the MBP-ZDD and MBP-RF constructs.]

Figure 1. Schematic of proposed domains in RAG1. The top bar represents the full-length murine RAG1 sequence. Solid boxes are zinc-binding domains, with ZFA and ZFB representing two zinc-finger subdomains [5]. Hatched boxes represent positively-charged potential nucleic-acid binding regions. Lines beneath the bar indicate the positions of the ZDD and core RAG1 domains. RAG1 clones are represented as bars placed in the corresponding position relative to the full-length protein.

B. Metal Binding in the Dimerization Domain

To determine the metal-to-protein stoichiometries of the zinc-binding domains of RAG1, atomic absorption spectroscopy was used. As expected, MBP-RF, which contains only the RING finger sequence of RAG1, binds approximately two zinc ions (1.7±0.2 Zn/protein molecule). The zinc binding stoichiometry of ZDD was found to be 3.2 zinc ions bound per monomer (3.0-3.5 for multiple determinations). Two of these zinc ions bind within the conserved RING finger sequence, with the third bound within the zinc finger module. With measurements ranging as high as 3.5 Zn/monomer, there remains the possibility for the coordination of a fourth zinc ion. Although the geometry of a fourth zinc site is unclear from observation of the primary amino acid sequence, there are several cysteine and histidine residues that could serve as additional coordinating ligands. For the purposes of this report we will refer only to three zinc sites: two in the RING finger and one in the zinc finger.

A two zinc-coordinated form (1.80±0.2 Zn/protein molecule) of ZDD (Zn2-ZDD) is easily generated by dialysis against dilute concentrations of EDTA, indicating that one of the three zinc ions is weakly bound as compared to the other two metal ions. Under similar conditions one of the two zinc ions is removed from MBP-RF, which contains only the RING finger module of RAG1. Thus, the location of the labile zinc-binding site can be narrowed down to one of two sites present in the RING finger. Similar results in the COP1 protein indicate that one of its RING finger zinc ions is also relatively labile [11]. Zinc-binding at the labile RING finger site is, however, reversible since dialysis of Zn2-ZDD against zinc or cadmium-containing solutions restores the fully coordinated species. In the case of cadmium solutions, this results in a two zinc-one cadmium coordinated species (1.9±0.2 Zn and 0.9±0.2 Cd/protein molecule).
C. Structural Stability of the Dimerization Domain

The circular dichroism (CD) spectrum of the fully zinc-coordinated (native) ZDD as well as of an apo form is shown in Figure 2A. Removal of all zinc ions to produce the apo form of the domain results in extensive loss of ordered secondary structure as judged by the reduction of molar ellipticity in the CD spectrum. As in other structural zinc-binding domains, we conclude that the free energy associated with the coordination of metal ions is necessary for correct folding of ZDD. We also used CD to ascertain the effects on structure and stability of the ZDD domain upon removal of the labile-bound RING finger zinc ion. In this case, the CD spectrum shows a 36% loss in molar ellipticity at 204 nm as compared to the native domain, indicating partial unfolding upon release of the metal ion from the labile site (shown in Figure 2A). These observations support the conclusion that the labile zinc plays an important role in the determination and stabilization of the local secondary structure in the RING finger subdomain.
The thermal stability of ZDD was determined by monitoring the temperature dependence of the molar ellipticity at 204 nm. ZDD was found to be quite stable with a melting temperature of 78±1°C (Figure 2B); however, a ΔGunfolding could not be calculated since the denaturation was completely irreversible. The Zn2-ZDD fragment lacking the labile zinc ion was found to be significantly less stable than the native form. Its melting temperature was 67±2°C, 11°C lower than that of the fully metal-coordinated fragment. Further, the melting curve of Zn2-ZDD displayed less cooperativity than the native form. Native-like stability and cooperativity could be recovered upon dialysis against cadmium-containing solutions to produce the triply liganded Zn2Cd1 species (data not shown). The reduced stability and decreased cooperativity observed for ZDD missing one of the RING finger zinc ions suggests that the extent of zinc-binding can fine-tune structural properties of not only the RING finger subdomain, but also of the entire ZDD domain.

[Figure 2: (A) CD spectra; (B) thermal denaturation curves plotted against temperature (°C) from 40 to 90°C.]

Figure 2. Circular Dichroism of ZDD. A, Spectra of the fully zinc-coordinated form (native), a Zn2 species, and an apo-form of the fragment. B, Thermal denaturation curves of native ZDD as compared to a Zn2 species.

That removal of one of the coordinated zinc ions from the RING finger can have a major influence on structural stability is supported by previous structural studies of homologous RING finger domains. Specifically, high resolution structures of two different RING finger modules, from equine herpes virus (EHV) gene 63 and from a putative human transcription factor, PML, have been solved by nuclear magnetic resonance [12, 13]. Both structures reveal two separate zinc-binding sites, with the zinc ions separated by approximately 14 Å. The polypeptide chain alternately winds between the two zinc sites, such that the first and third pairs of Cys ligands coordinate one zinc ion, with the second zinc ion ligated by the second and fourth pairs of Cys and His ligands. This unique feature is accomplished via an antiparallel β sheet situated between the individual zinc-binding sites. Thus, removal of one of the zinc ions from the RING finger module would most likely result in partial disruption of this β sheet.

The variation in affinity for zinc ions between multiple zinc-binding sites has been demonstrated with other zinc-binding subdomains. One example is the zinc binuclear cluster in the GAL4 DNA binding domain, in which one of the two zinc ions is bound with higher affinity. The single zinc species of GAL4 shows a marked decrease in the free energy of binding to its specific DNA sequence, exhibiting the consequence of differential affinities for zinc-coordination on protein function as well [14].
D. Solution Properties of the Zn-Binding Domain
Sedimentation equilibrium analytical ultracentrifugation of isolated ZDD was used to determine its solution molecular weight. These experiments revealed the presence of a single species in solution corresponding to the molecular mass of the dimeric ZDD fragment. Global analysis of the equilibrium data using equation 1
Figure 3. Sedimentation Equilibrium of MBP-ZDD. B, Equilibrium distribution of MBP-ZDD at 15000 rpm (x-axis: radius, cm). The monomer and dimer exponentials, whose sum gives rise to the model fit, as well as the sum itself are shown by the solid lines. The circles are the data points. A, Residuals of the fit. C, The thickened portions of the curves indicate the concentration range wherein the analysis was carried out (x-axis: total concentration, M (monomer)). The thin portions of the curves are extrapolated from analysis of those data.
Solution Properties of RAG1 ZDD
yielded a molecular weight within 2% of that predicted from the amino acid sequence [5]. From the observed concentration range, an upper limit for the equilibrium dissociation constant could be estimated as 14 µM. Sedimentation equilibrium of the MBP-ZDD chimera, which has a larger extinction coefficient at 280 nm, permitted experiments to be done at low enough molar concentrations to detect significant amounts of monomeric protein using the absorbance optics. An exponential distribution of the chimeric protein at 15,000 rpm is shown in Figure 3B. The observed data are best described by equation 2, which is the sum of two exponentials corresponding to the distributions of the monomeric and dimeric chimeric proteins. The equilibrium dissociation constant was found to be 3.12 µM (±16%). Using the parameters derived from the monomer/dimer fit, Figure 3C shows the relative concentrations of the RAG1 monomer and dimer as a function of total monomer concentration, where it can be seen that the protein is predominantly dimeric at concentrations above 5 µM. Although all ultracentrifugation measurements were done in buffers which contained no excess zinc, bound zinc ions are required for the specific homodimer formation of ZDD, as the apo form of the domain was shown to be unfolded and nonspecifically aggregated [5]. Atomic absorption spectroscopy of the samples in the buffers used for these experiments confirmed the expected stoichiometry of zinc-binding.
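The dependence of the dimer population on total protein concentration can be illustrated with a short calculation. The sketch below assumes a simple 2 M <-> D equilibrium with the dissociation constant defined as Kd = [M]^2/[D]; that definition and the function name are assumptions of this example, not taken from equation 2 of the chapter. Only the reported Kd of 3.12 µM comes from the text.

```python
import math

def dimer_fraction(total_monomer_uM, kd_uM):
    """Fraction of protein chains found in dimers for 2 M <-> D.

    Assumes Kd = [M]^2 / [D] (an assumption of this sketch) and the mass
    balance Ct = [M] + 2[D], which gives a quadratic in the free monomer [M].
    """
    free_monomer = (kd_uM / 4.0) * (
        math.sqrt(1.0 + 8.0 * total_monomer_uM / kd_uM) - 1.0
    )
    return 1.0 - free_monomer / total_monomer_uM

KD_UM = 3.12  # reported equilibrium dissociation constant, in micromolar

if __name__ == "__main__":
    for total in (0.5, 1.0, 3.12, 5.0, 10.0, 50.0):
        frac = dimer_fraction(total, KD_UM)
        print(f"{total:6.2f} uM total monomer: {100.0 * frac:5.1f}% of chains in dimer")
```

Under this definition, half of the chains are dimeric when the total monomer concentration equals Kd, and the dimer is the majority species above that, in line with the statement that the protein is predominantly dimeric above 5 µM.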
1. Combination of Hydrodynamic Parameters for Molecular Weight Determination
The dimeric molecular weight of the isolated ZDD fragment was further confirmed by combining sedimentation and diffusion coefficient measurements. Even though both of these coefficients are hydrodynamic measures of a macromolecule, the ratio of s to D is proportional to the molecular weight by the Svedberg equation (equation 5). By using the Svedberg relationship, the shape and hydration factors inherent in each coefficient cancel out, and the molecular weight can be calculated.
2. Application of the Svedberg Equation Using s°20,w and D°20,w Values
We first applied the Svedberg equation to calculation of the solution molecular weight from the s°20,w, as measured by velocity sedimentation, and the D°20,w, as measured by dynamic light scattering. As previously described, we determined values of 2.44 S and 7.97 F for the s°20,w and the D°20,w coefficients, respectively [5]. Combining these two coefficients, obtained by two independent
experimental approaches, in the Svedberg equation yielded a solution molecular weight of 29.2 kDa for purified ZDD, which is within 10% of the dimeric mass determined by electrospray mass spectrometry.
3. Application of the Svedberg Equation Using s_app and D_app Values
The molecular weight was also calculated from velocity sedimentation analysis alone by simultaneous determination of the apparent sedimentation, s_app, and diffusion, D_app, coefficients using an extended analysis of the time-derivative method. It has recently been shown that the diffusion coefficient of the macromolecule is related to the standard deviation of the g(s*) versus s* curve fitted to equation 4 [9]. Figure 4B shows such a fit to ZDD sedimentation velocity data. Using equations 3 and 4, we calculated an s_app of 2.33 S and a D_app of 8.21 F. Combining these simultaneously determined parameters in the Svedberg equation yielded a solution molecular weight of 26.8 kDa, within 2% of the molecular weight as measured by electrospray mass spectrometry. A major advantage of using the time-derivative method is the rapidity with which one can determine the solution molecular weight of an ideal, monodisperse macromolecule. Essentially, the data collection and analysis can both be done in one afternoon to yield an estimate of the solution oligomeric state.
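The two Svedberg calculations described above can be sketched numerically. The example uses the standard form of the Svedberg equation, M = sRT / (D(1 - v̄ρ)). The partial specific volume (0.745 mL/g), the solvent density (water at 20°C) and the temperature are assumptions of this sketch, since they are not stated in this section; the function and constant names are likewise illustrative. The s and D values are those reported in the text.

```python
R_ERG = 8.314462618e7   # gas constant in erg mol^-1 K^-1 (cgs units)
SVEDBERG = 1.0e-13      # one svedberg (S) in seconds
FICK = 1.0e-7           # one fick (F) in cm^2 s^-1

def svedberg_mass(s_svedberg, d_fick, vbar=0.745, rho=0.99823, temp_k=293.15):
    """Molecular weight (g/mol) from the Svedberg equation.

    vbar (mL/g), rho (g/mL) and temp_k are assumed values for this sketch,
    not parameters given in the chapter section.
    """
    s = s_svedberg * SVEDBERG
    d = d_fick * FICK
    return s * R_ERG * temp_k / (d * (1.0 - vbar * rho))

if __name__ == "__main__":
    m_20w = svedberg_mass(2.44, 7.97)   # s and D from independent experiments
    m_app = svedberg_mass(2.33, 8.21)   # apparent s and D from the g(s*) fit
    print(f"M from s(20,w), D(20,w): {m_20w / 1000.0:.1f} kDa")
    print(f"M from s_app, D_app:     {m_app / 1000.0:.1f} kDa")
```

With these assumed parameters the sketch gives roughly 29 kDa and 27 kDa, close to the 29.2 kDa and 26.8 kDa reported; exact agreement would depend on the actual partial specific volume and solvent density used by the authors.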
Figure 4. Sedimentation Velocity Analysis of ZDD. A, Primary data collected at 1 mg/ml (10 scans); x-axis: radius, cm. B, Apparent sedimentation coefficient distribution function, g(s*) versus s* (svedbergs). The error bars represent the standard error of the mean. The solid line is the fit to equation 4. Apparent s, D, and Ms,D values were calculated as described. Values annotated on the plot: s_app = 2.33 S, D_app = 8.21 F, Ms,D = 26.8 kDa, M(mass spec.) = 26.4 kDa.
E. Interpretation of Shape Parameters
Insight into the overall shape of the ZDD dimer in solution was obtained by interpretation of sedimentation velocity and small-angle X-ray scattering (SAXS) experiments [5]. An experimental frictional coefficient, f20,w, was calculated from the sedimentation coefficient, s°20,w, and using an estimate of the protein hydration, the shape factor of ZDD was found to be 1.14. When modelled as a prolate ellipsoid of revolution using Perrin's law, this shape factor corresponds to an axial ratio of 3.2, indicating a quite elongated structure. We have previously reported small-angle X-ray scattering results that gave values for the radius of gyration (Rg = 23.4 Å), as well as the maximum dimension (dmax = 89 Å), for the ZDD dimer [5]. Again, modelled as a prolate ellipsoid of revolution, a major axis of 89 Å (from dmax) would require equivalent minor axes of 27 Å in order to enclose a volume consistent with the molecular weight and partial specific volume of the dimer. Thus, these studies gave an axial ratio of 3.3, consistent with that obtained from sedimentation velocity experiments. Although the values obtained from these separate techniques cannot be directly compared, as velocity sedimentation is a hydrodynamic measure of the molecule in contrast to small-angle X-ray scattering, both results indicate that the ZDD dimer is likely to be more elongated than spherical in overall shape.
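Both shape estimates can be approximated with a few lines of arithmetic. The sketch below uses the textbook Perrin expression for a prolate ellipsoid and a simple volume argument for the SAXS dimensions. The partial specific volume of 0.745 mL/g, the use of the 26.4 kDa mass, the neglect of hydration in the volume, and all function names are assumptions of this example rather than details from the chapter.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1

def perrin_prolate(axial_ratio):
    """Perrin shape factor f/f0 for a prolate ellipsoid with a/b = axial_ratio."""
    if axial_ratio <= 1.0:
        return 1.0  # a sphere has f/f0 = 1
    p = 1.0 / axial_ratio
    root = math.sqrt(1.0 - p * p)
    return root / (p ** (2.0 / 3.0) * math.log((1.0 + root) / p))

def axial_ratio_from_shape_factor(shape_factor, upper=100.0):
    """Invert perrin_prolate by bisection (f/f0 rises monotonically with a/b)."""
    lo, hi = 1.0, upper
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if perrin_prolate(mid) < shape_factor:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def minor_axis_angstrom(mass_da, vbar_ml_per_g, major_axis_angstrom):
    """Minor axis (2b) of a prolate ellipsoid enclosing the anhydrous volume."""
    volume = mass_da * vbar_ml_per_g / AVOGADRO * 1.0e24  # cubic angstroms
    semi_major = major_axis_angstrom / 2.0
    semi_minor = math.sqrt(3.0 * volume / (4.0 * math.pi * semi_major))
    return 2.0 * semi_minor

if __name__ == "__main__":
    ratio_hydro = axial_ratio_from_shape_factor(1.14)
    minor = minor_axis_angstrom(26400.0, 0.745, 89.0)  # vbar is an assumed value
    print(f"axial ratio from f/f0 = 1.14: {ratio_hydro:.2f}")
    print(f"minor axis for dmax = 89 A:  {minor:.1f} A (ratio {89.0 / minor:.2f})")
```

With these inputs the Perrin inversion gives an axial ratio of about 3.4 and the volume argument a minor axis of about 26 Å, both close to the reported values of 3.2 and 27 Å. The small differences are of the size expected from rounding of the shape factor and from the unknown hydration and partial specific volume used in the original analysis.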
IV. Conclusions
Using a combination of biophysical techniques we have defined the solution properties of the amino terminal zinc-binding domain of the recombination activating protein, RAG1. The ZDD domain consists of two types of zinc-binding subdomains: a RING finger and a zinc finger, both of which appear to be intimately involved in the structural determination and stability of this domain. Full metal coordination is required for proper folding, since even the loss of one zinc ion results in significant alterations to the structure and stability of this protein. This zinc-binding RAG1 domain self-associates in solution to form a stable dimer. The dimeric state was confirmed by combining complementary hydrodynamic parameters in the Svedberg equation to yield the solution molecular weight, as well as by direct measurement in equilibrium sedimentation experiments. We were further able to measure the equilibrium dissociation constant of the dimerization reaction by equilibrium sedimentation of the MBP-ZDD fusion protein, which allowed us to access much lower concentrations than were possible with the ZDD fragment alone. The free energy of the interaction shows that the dimer forms
with relatively high affinity, suggesting that dimerization may play an important role in the physiological function of RAG1. The overall shape of the RAG1 zinc-binding dimerization domain is elongated, as modelled by a prolate ellipsoid of revolution. Both sedimentation velocity and small-angle X-ray scattering experiments yielded axial ratios consistent with an extended molecule in solution for the ZDD dimer. This zinc-binding dimerization domain in RAG1 is positioned immediately N-terminal to the essential core region of the entire RAG1 protein (Figure 1). Stable dimerization of such an elongated structure ensures a specific positioning of the zinc-binding domain monomers with respect to each other. In such a manner, this domain is poised to orient and bring together the core region of RAG1 for optimum function. Given the strong influence of zinc coordination on the structure and stability of this domain, it is plausible that the extent of zinc-binding may modulate the tertiary and quaternary structure of RAG1, possibly contributing to mechanisms of effective cellular control for V(D)J recombination.
Acknowledgements
We thank Charles B. Millard and Clarence A. Broomfield for generous use of their ultracentrifuge. We thank Joseph E. Coleman, David G. Schatz, Preston Hensley, and Walter F. Stafford, III for helpful discussions. This work was supported by NIH grants DK09070 to JEC, AI32524 to DGS, GM16039 to KKR and GM16769 to KGF.
References
1. Schwabe, J.W.R. and Klug, A. (1994) Nature Struct. Biol. 1: 345-349.
2. Saurin, A.J., Borden, K.L.B., Boddy, M.N., and Freemont, P.S. (1996) Trends Biochem. Sci. 21: 208-214.
3. Schatz, D.G., Oettinger, M.A., and Baltimore, D. (1989) Cell 59: 1035-1048.
4. Lewis, S.M. (1994) Advan. Immunol. 56: 27-150.
5. Rodgers, K.K., Bu, Z., Fleming, K.G., Schatz, D.G., Engelman, D.M., and Coleman, J.E. (1996) J. Mol. Biol. 260: 70-84.
6. Cohn, E.J. and Edsall, J.T. (1943) in Proteins, Amino Acids and Peptides. Reinhold Publishing Corporation: New York. pp. 370-381.
7. Johnson, M.L., Correia, J.J., Yphantis, D.A., and Halvorson, H.R. (1981) Biophys. J. 36: 575-588.
8. Stafford III, W.F. (1992) Anal. Biochem. 203: 295-391.
9. Stafford III, W.F. (1996) Biophys. J. 70: M-Pos452.
10. McBlane, J.F., van Gent, D.C., Ramsden, D.A., Romeo, C., Cuomo, C.A., Gellert, M., and Oettinger, M.A. (1995) Cell 83: 387-395.
11. von Armin, A.G. and Deng, X.W. (1993) J. Biol. Chem. 268: 19626-19631.
12. Barlow, P.N., Luisi, B., Milner, A., Elliot, M., and Everett, R. (1994) J. Mol. Biol. 237: 201-211.
13. Borden, K.L.B., Boddy, M.N., Lally, J., O'Reilly, N.J., Martin, S., Howe, K., Solomon, E., and Freemont, P.S. (1995) EMBO J. 14: 1532-1541.
14. Rodgers, K.K. and Coleman, J.E. (1994) Protein Sci. 3: 608-619.
Localizing Flexibility within the Target Site of DNA-bending Proteins
Anne Grove and E. Peter Geiduschek
Department of Biology and Center for Molecular Genetics
University of California, San Diego
La Jolla, CA 92093-0634
I. Sequence-specific DNA Bendability
DNA is not the perfect double helix of traditional textbooks. Slight but significant structural variations have been demonstrated from comparison of crystal structures of oligonucleotides. It has also become evident that these variations are not entirely determined by the individual base-steps (AA, AT, GC, etc.), but are influenced by sequence context (1,2). The emerging picture of the DNA duplex, in fact, suggests a dynamic structure that is continuously contorted in a sequence-dependent manner. Macroscopic DNA bending, which is a frequent consequence of interaction with proteins, is generated by the cumulative effects of changes in local variables such as twist and roll. Substantial DNA bending usually involves a change in roll angles that results in a compression of the major groove, presumably because charge repulsion between the sugar-phosphate backbones opposes a compression of the minor groove (3). Accommodation of DNA in a complex that involves significant DNA curvature or looping must reflect its propensity for bending (i.e. its anisotropic flexibility). Analysis of the distribution of DNA sequences in nucleosome structures has yielded a statistical profile of trinucleotide sequences that are more tolerant of bending (2,4,5). A similar data set has been obtained by analysis of the relative accessibility of DNA to cleavage by DNase I, as variations in cutting frequency may be interpreted in terms of the widening of the minor groove that accompanies DNA bending away from the enzyme (6). The TA step has received particular attention due to its frequent occurrence in binding sites for DNA-bending proteins, which has been rationalized by a greater range of allowable roll angles (7). For the nucleosome core particle, bendability is a major determinant of specific positioning.
For proteins that introduce sharp kinks in DNA upon binding, bending appears to supplement sequence-specific
recognition of binding sites (8). The variable flexibility that is built into the DNA sequence is obviously not the sole determinant of protein binding. Specificity in binding site selection must derive from interactions with the protein that are peripheral to the bending locus. We have reported a strategy that aims to evaluate the contribution of DNA flexibility to complex formation by measuring the binding of DNA-bending proteins to DNA in which flexibility has been imposed by tandem mismatches (9,10). It is discussed here in the context of the prokaryotic type II DNA-binding proteins.
II. Architectural DNA-Binding Proteins: Preference for Prebent DNA
Bacterial nucleoids are dense structures in which DNA supercoiling and compaction are assisted by DNA-bending proteins (11,12). Several abundant proteins are associated with the nucleoid, forming what is somewhat loosely referred to as "bacterial chromatin". In Escherichia coli, four such proteins are associated with the nucleoid: H-NS, Fis, HU and Integration Host Factor (IHF) (12,13), all of which bend DNA. HU binds preferentially to bent or deformed DNA, such as four-way (cruciform) junctions and DNA with nicks and gaps (14-16), and in that respect resembles the eukaryotic HMG-domain proteins, which have increased affinities for cruciform structures and cisplatin DNA adducts (17,18). Both Holliday junctions and cisplatin adducts are thought to cause two helical DNA segments to form a sharp angle (19) and may allow increased binding by lessening the energetic cost of DNA bending. HU and IHF are members of the ubiquitous prokaryotic family of type II DNA-binding proteins, all dimers of 90- to 99-amino acid subunits. Most HU proteins are homodimers; IHF is heterodimeric. The structure of Bacillus stearothermophilus HU (20-23) revealed that a flexible, anti-parallel β-hairpin arm extends from each monomer, as though poised to embrace a DNA double helix. Whereas HU binds to DNA non-specifically, E. coli IHF binds relatively tightly (Kd in the nM range) to unique sites. Comparison of numerous IHF binding sites established a 9 bp interrupted consensus sequence (AATCAAxxxxTTA), asymmetrically disposed within an ~30 bp site (24-26). The large binding site, determined by DNase I footprinting, and the sharp DNA bending suggest that DNA bends toward the protein and wraps around it. The genome of the B. subtilis bacteriophage SPO1 contains hydroxymethyluracil (hmU) entirely in place of T. Phage SPO1 encodes TF1, which is homologous to HU and possesses similar structural features (27,28).
Like HU and IHF, TF1 is very abundant, with more than 50,000 dimers accumulating in an SPO1-infected cell. TF1 binds preferentially to hmU-containing DNA relative to T-containing DNA, prefers double-stranded over single-stranded DNA, and binds to selected sites in the phage genome. Only a few TF1 sites have been sequenced, and no strong consensus has been found. TF1 sharply bends DNA and, in so doing, wraps the DNA around the body of the protein to allow interactions with an ~30 bp site, very similar to IHF (29). The type II DNA-binding proteins in prokaryotes and the HMG box proteins in eukaryotes are regarded as architectural, as they are thought to mold the DNA into a conformation that facilitates the formation of higher order protein-DNA assemblies (30). The ability to interchange these proteins in processes such as phage λ site-specific recombination (with variable and in some cases limited efficiency) suggested a common function and motivated this designation (e.g. ref. 31).
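As an illustration of how the interrupted IHF consensus quoted in this section (AATCAAxxxxTTA) might be located in a sequence, the following sketch scans both strands with a regular expression. The test sequence is hypothetical, the "x" positions are treated as any base, and the function names are inventions of this example; it is not the procedure used to define the consensus.

```python
import re

# Consensus as quoted in the text; the four "x" positions accept any base.
# The lookahead lets overlapping candidate sites be reported as well.
CONSENSUS = re.compile(r"(?=(AATCAA[ACGT]{4}TTA))")
SITE_LENGTH = 13
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_consensus(seq):
    """Return (start, strand) pairs; start is 0-based on the given strand."""
    seq = seq.upper()
    hits = [(m.start(), "+") for m in CONSENSUS.finditer(seq)]
    for m in CONSENSUS.finditer(revcomp(seq)):
        # Convert the position on the reverse strand to forward coordinates.
        hits.append((len(seq) - m.start() - SITE_LENGTH, "-"))
    return sorted(hits)

if __name__ == "__main__":
    example = "GGCAATCAAGCGCTTAGG"  # hypothetical sequence, not a real IHF site
    print(find_consensus(example))
```

Because the consensus is asymmetric, a match is reported with its strand, which matters for the off-center placement of the consensus within the ~30 bp site.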
III. DNA Loops: Effect on Protein Binding
Since the type II DNA-binding proteins sharply bend DNA, it follows that sequence-dependent DNA deformability may contribute to the selection of preferred binding sites. This direction of thinking was explored by designing and synthesizing DNA with specifically placed loops, which should confer site-specific flexibility, and by analyzing these loop-containing duplexes for protein binding. The strategy is outlined below for two members of the family of type II DNA-binding proteins: TF1, which exhibits sequence-specific DNA binding only in the context of hmU-DNA, and IHF, for which a consensus sequence is known (26). Details of the experimental design have been reported (9,10). Several properties were considered in designing loop-constructs.
(i) Loops consisting of three consecutive mismatches were reported to enhance DNA flexibility (32). Based on protein binding studies, we concluded that tandem mismatches (4-nt loops) reproduce the effect of 6-nt loops in terms of increased DNA flexure but are preferable for studying protein binding, as the introduction of three consecutive base-substitutions is more likely to disrupt specific contacts (10). Only constructs with 4-nt loops were used for subsequent analyses.
(ii) For the dimeric type II DNA-binding proteins, structural analysis indicated that the DNA was likely to be distorted at two sites; loop-constructs in which sets of loops were separated by variable spacings were therefore evaluated for protein binding.
(iii) The asymmetrical disposition of the IHF consensus sequence required that sets of loops with optimized spacing be differently positioned across the binding region.
(iv) Differences in affinity between constructs with separate loop-spacings or placements were compared to variations contributed by loops of different nucleotide composition; such sequence variations had only secondary effects on affinity (10).
(v) The length of the DNA construct was selected to accommodate only one protein molecule to reduce opportunities for alternative placements.
A. TF1
For TF1, the reference DNA sequence corresponds to a preferred binding site in the SPO1 genome (Figure 1). A set of 37-mer T-containing DNA constructs was prepared with pairs of 4-nt loops placed symmetrically about the center of the binding site, spaced apart by 7-11 bp. Protein binding was evaluated by electrophoretic mobility shift assays, and equilibrium dissociation constants, Kd, were determined from the slopes of Scatchard plots (10). Four-nt loops separated by 9 bp of duplex DNA are optimal for TF1 binding (Kd ~3 nM; Figure 2); other loop-separations generate suboptimal binding. When the formation of TF1 complexes with short duplex DNA is monitored by gel electrophoresis, the discrimination is effectively absolute, because affinity differences are compounded by the greater rate of dissociation of less stable complexes in the gel. To the extent that loops generate partly single-stranded regions, this would not be expected to increase the affinity of TF1, which prefers duplex DNA. We interpret our results to suggest that increased binding of TF1 to loop-containing duplexes is due to recognition based on DNA deformability and that DNA in a complex with TF1 is distorted at two sites separated by 9 bp of duplex (10).
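The extraction of Kd from the slope of a Scatchard plot can be sketched as follows. For a single binding site, plotting (fraction of DNA bound)/(free protein) against the fraction bound gives a straight line of slope -1/Kd. The data below are synthetic, generated from an assumed Kd of 3 nM at illustrative protein concentrations, and the sketch assumes that free protein is approximately equal to total protein (DNA at trace concentration). None of this reproduces the authors' actual measurements or fitting procedure.

```python
def scatchard_kd(free_protein_nM, fraction_bound):
    """Estimate Kd (nM) from the least-squares slope of a Scatchard plot.

    x = fraction of DNA bound, y = fraction bound / free protein.
    For simple 1:1 binding y = (1 - x) / Kd, so the slope is -1/Kd.
    """
    xs = list(fraction_bound)
    ys = [theta / p for theta, p in zip(fraction_bound, free_protein_nM)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return -1.0 / slope

# Synthetic titration: assumed Kd and illustrative protein concentrations (nM).
ASSUMED_KD_NM = 3.0
PROTEIN_NM = [3.0, 9.0, 14.0, 20.0, 27.0, 41.0, 54.0]
FRACTION_BOUND = [p / (ASSUMED_KD_NM + p) for p in PROTEIN_NM]

if __name__ == "__main__":
    kd = scatchard_kd(PROTEIN_NM, FRACTION_BOUND)
    print(f"Kd recovered from Scatchard slope: {kd:.2f} nM")
```

With noise-free synthetic input the fit simply recovers the assumed Kd; with real gel data, scatter in the low-occupancy lanes would dominate the uncertainty of the slope.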
A. No loop
5'-CCTAGGCTACACCTACTCTTTGTAAGAATTAAGCTTC-3'
3'-GGATCCGATGTGGATGAGAAACATTCTTAATTCGAAG-5'

B. 4-nt loops (bottom strand)
4-nt (spacing 7):   3'-GGATCCGATGTGGTAGAGAAACTATCTTAATTCGAAG-5'
4-nt (spacing 8):   3'-GGATCCGATGTGCTTGAGAAACTATCTTAATTCGAAG-5'
4-nt (spacing 9):   3'-GGATCCGATGTGCTTGAGAAACAAACTTAATTCGAAG-5'
4-nt (spacing 10):  3'-GGATCCGATGTCCATGAGAAACAAACTTAATTCGAAG-5'
4-nt (spacing 11):  3'-GGATCCGATGTCCATGAGAAACATAGTTAATTCGAAG-5'
Figure 1. Sequences of 37-mer oligonucleotides corresponding to a preferred binding site for TF1. The position of a short inverted repeat flanking the center of the binding site is indicated by arrows, and two TA steps 9 bp apart noted by asterisks. For loop-containing duplexes, the sequence of the bottom strand is altered to generate mismatches of identical nucleotides. Sequences generating loops are underlined. Oligonucleotides with T-content were purchased and purified by denaturing polyacrylamide gel electrophoresis. The top strand (shared among all DNA constructs) was 32P-labeled at the 5'-end using T4 polynucleotide kinase. Complementary oligonucleotides were mixed stoichiometrically, heated to 90°C and slowly cooled to 4°C over several hours to form duplex DNA.
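The loop positions in the Figure 1 constructs can be checked directly from the bottom-strand sequences. In the sketch below the shared top strand is taken to be the exact complement of the no-loop bottom strand, which is an assumption of this example, and the helper names are illustrative. Each bottom strand is written 3'->5' so that it aligns position by position with the top strand.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Bottom strands as listed in Figure 1, written 3'->5'.
PERFECT_BOTTOM = "GGATCCGATGTGGATGAGAAACATTCTTAATTCGAAG"
LOOP_BOTTOMS = {
    7:  "GGATCCGATGTGGTAGAGAAACTATCTTAATTCGAAG",
    8:  "GGATCCGATGTGCTTGAGAAACTATCTTAATTCGAAG",
    9:  "GGATCCGATGTGCTTGAGAAACAAACTTAATTCGAAG",
    10: "GGATCCGATGTCCATGAGAAACAAACTTAATTCGAAG",
    11: "GGATCCGATGTCCATGAGAAACATAGTTAATTCGAAG",
}

# Assumption: the top strand (5'->3') is the complement of the no-loop bottom.
TOP = PERFECT_BOTTOM.translate(_COMPLEMENT)

def mismatch_positions(top, bottom):
    """1-based positions where the bottom base is not the complement of the top."""
    return [
        i + 1
        for i, (t, b) in enumerate(zip(top, bottom))
        if t.translate(_COMPLEMENT) != b
    ]

def loop_spacing(top, bottom):
    """Number of paired positions between the two blocks of mismatches."""
    positions = mismatch_positions(top, bottom)
    gaps = [b - a - 1 for a, b in zip(positions, positions[1:]) if b - a > 1]
    return max(gaps)

if __name__ == "__main__":
    for label, bottom in sorted(LOOP_BOTTOMS.items()):
        print(label, mismatch_positions(TOP, bottom), loop_spacing(TOP, bottom))
```

Each construct carries two blocks of two consecutive mismatches (four unpaired nucleotides per loop), the mismatched bottom base is identical to the top base at every such position, and the number of paired positions between the blocks matches the spacing label.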
Figure 2. Electrophoretic mobility shift analysis of TF1 binding to (A) perfect duplex or (B) duplex with two 4-nt loops separated by 9 bp. Protein concentrations are indicated below. Bands are labeled Complex, dsDNA and ssDNA. Lanes (nM TF1): A, 14, 20, 27, 41, 54, 68, 95; B, 3, 9, 14, 20, 27, 41, 54.
B. IHF
Unlike TF1, IHF exhibits sequence-specificity in T-DNA, yet is anticipated to interact with DNA in a comparable fashion. The approach to evaluating the contribution of sequence-dependent DNA flexibility to complex formation must therefore consider not only optimal spacing between sets of 4-nt loops, but also the location of loops with respect to the consensus sequence. The resulting iterative process showed that IHF has highest affinity for loops separated by 8-9 bp, even if the DNA sequence does not have a strong consensus (9). Placing sets of 4-nt loops separated by 8 bp across a consensus binding region (a 37-mer duplex representing the H' site of the phage λ genome) indicated that an increase in affinity requires that loops do not disrupt the consensus sequence. Optimal binding is generated by an off-center placement with one of two 4-nt loops at the edge of the upstream consensus block (Kd = 0.25 nM compared to 3.7 nM for the perfect duplex). Re-evaluating the optimal separation between loops in the context of the consensus sequence confirmed the 8-9 bp optimal spacing (9). The preferred separation between loops is similar for TF1 and IHF, indicating that the two proteins indeed engage their DNA target in similar fashions. For IHF, the contribution of direct base-contacts is evidenced by the distinct preference for loop placement with respect to consensus sequence elements.
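The gain in affinity from optimally placed loops can be expressed as a free energy difference, ΔΔG = RT ln(Kd,loops / Kd,duplex). The sketch below applies this to the two IHF dissociation constants quoted above. The temperature of 25°C is an assumption of this example, since the assay temperature is not given in this section, and the function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ddg_kcal_per_mol(kd_reference_nM, kd_variant_nM, temp_k=298.15):
    """Change in binding free energy (kcal/mol) relative to the reference.

    Negative values mean the variant binds more tightly. temp_k is an
    assumed temperature, not one reported in the text.
    """
    return R_KCAL * temp_k * math.log(kd_variant_nM / kd_reference_nM)

if __name__ == "__main__":
    fold = 3.7 / 0.25
    ddg = ddg_kcal_per_mol(3.7, 0.25)
    print(f"{fold:.1f}-fold tighter binding, ddG = {ddg:.2f} kcal/mol")
```

Under this assumption the roughly 15-fold change in Kd corresponds to about 1.6 kcal/mol of additional binding free energy.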
IV. DNA-Bending Proteins and Hydroxymethyluracil-Containing DNA
The decreased affinity of TF1 for T-DNA is correlated with reduced bending, suggesting that the substitution of hmU for T might affect deformability. Binding to hmU-containing loop-constructs was therefore compared to results obtained with T-containing DNA. Most loop-placements diminish the affinity of TF1 for hmU-DNA. For DNA with optimal placement of 4-nt loops (9 bp separation), the affinity is identical to that of perfect hmU-duplex (~3 nM). Remarkably, the discrimination between hmU and T essentially disappears with the optimal loop separation. Since site-specific flexure qualitatively and quantitatively substitutes for hmU-preference, we propose that hmU-content and loops offer the same or similar contributions to complex formation (10). A similar analysis was extended to three other DNA-bending proteins: IHF, HU and HMG1. The affinity of IHF for one of its preferred sites is increased ~6.5-fold by substituting hmU for T. Both HU and HMG1, which bind DNA non-specifically, have increased affinity for hmU-DNA relative to T-DNA of otherwise identical sequence (9). There is relatively little information about the effect of substituting T with hmU on DNA bending. HmU-DNA melts ~10°C lower than does T-containing DNA of otherwise identical composition, but has been thought to have a normal B-type structure (33,34). A measurement of the torsional rigidity of hmU-containing DNA by time-resolved fluorescence polarization anisotropy of intercalated ethidium failed to show differences from similar measurements on T-DNA, indicating that hmU-DNA does not possess freely flexible joints on a length scale of ~10^ bp (35). However, the relationship between wedge models of localized DNA bending and the hydrodynamic models of long-range cooperative motions, which form the basis for interpreting fluorescence polarization experiments, has not been worked out.
The structure and dynamical properties of hmU-DNA are being re-examined by the group of D. R. Kearns (36). It is a striking finding of our experiments that a substitution for the hmU-preference of TF1 can be made by suitably placing flexible loops in T-DNA. To our thinking, the implication is that hmU-selectivity is at least partly due to differences in the energetics of DNA deformation between T- and hmU-DNA. We surmise that these differences are sequence-specific.
Acknowledgments
We greatly appreciate the contributions of our collaborators L. Mayol and A. Galeone and the continuing interest of and discussions with V. L. Hsu and D. R. Kearns. This research was supported by a grant from the NIGMS.
References
1. Dickerson, R. E., Goodsell, D. & Kopka, M. L. (1996). MPD and DNA bending in crystals and in solution. J. Mol. Biol. 256, 108-125.
2. Wolffe, A. P. & Drew, H. R. (1996). DNA structure: implications for chromatin structure and function. In: Frontiers in Molecular Biology. IRL Press. In press.
3. Travers, A. A. (1995). Reading the minor groove. Nature Struct. Biol. 2, 615-618.
4. Satchwell, S. C., Drew, H. R. & Travers, A. A. (1986). Sequence periodicities in chicken nucleosome core DNA. J. Mol. Biol. 191, 659-675.
5. Travers, A. A. & Klug, A. (1990). Bending of DNA in nucleoprotein complexes. In: DNA Topology and its Biological Implication (Cozzarelli, N. R. & Wang, J. C., eds.), pp. 57-106. Cold Spring Harbor Laboratory Press, NY.
6. Brukner, I., Sanchez, R., Suck, D. & Pongor, S. (1995). Sequence-dependent bending propensity of DNA as revealed by DNase I: parameters for trinucleotides. EMBO J. 14, 1812-1818.
7. Quintana, J. R., Grzeskowiak, K., Yanagi, K. & Dickerson, R. E. (1992). Structure of a B-DNA decamer with a central T-A step: C-G-A-T-T-A-A-T-C-G. J. Mol. Biol. 225, 379-395.
8. Gartenberg, M. R. & Crothers, D. M. (1988). DNA sequence determinants of CAP-induced bending and protein binding affinity. Nature 333, 824-829.
9. Grove, A., Galeone, A., Mayol, L. & Geiduschek, E. P. (1996a). Localized DNA flexibility contributes to target site selection by DNA-bending proteins. J. Mol. Biol. 206, 120-125.
10. Grove, A., Galeone, A., Mayol, L. & Geiduschek, E. P. (1996b). On the connection between inherent DNA flexure and preferred binding of hydroxymethyluracil-containing DNA by the type II DNA-binding protein TF1. J. Mol. Biol. 206, 196-206.
11. Kellenberger, E. (1996). Structure and function at the subcellular level. In: Escherichia coli and Salmonella: cellular and molecular biology. (Neidhardt, F. C., ed. in chief), pp. 17-28. ASM Press, Washington, DC.
12. Pettijohn, D. E. (1996). The nucleoid. In: Escherichia coli and Salmonella: cellular and molecular biology. (Neidhardt, F. C., ed. in chief), pp. 158-166. ASM Press, Washington, DC.
13. Finkel, S. E. & Johnson, R. C. (1992). The Fis protein: it's not just for DNA inversion anymore (Published erratum: Mol. Microbiol. 1993, 7, 1023). Mol. Microbiol. 6, 3257-3265.
14. Pontiggia, A., Negri, A., Beltrame, M. & Bianchi, M. E. (1993). Protein HU binds specifically to kinked DNA. Mol. Microbiol. 7(3), 343-350.
15. Bonnefoy, E., Takahashi, M. & Rouviere-Yaniv, J. (1994). DNA-binding parameters of the HU protein of Escherichia coli to cruciform DNA. J. Mol. Biol. 242, 116-129.
16. Castaing, B., Zelwer, C., Laval, J. & Boiteux, S. (1995). HU protein of Escherichia coli binds specifically to DNA that contains single-strand breaks or gaps. J. Biol. Chem. 270, 10291-10296.
17. Bianchi, M. E., Beltrame, M. & Paonessa, G. (1989). Specific recognition of cruciform DNA by nuclear protein HMG1. Science 243, 1056-1059.
18. Pil, P. M. & Lippard, S. J. (1992). Specific binding of chromosomal protein HMG1 to DNA damaged by the anticancer drug cisplatin. Science 256, 234-237.
19. Lilley, D. M. J. (1992). HMG has DNA wrapped up. Nature 357, 282-283.
20. Tanaka, I., Appelt, K., Dijk, J., White, S. W. & Wilson, K. S. (1984). 3-Å resolution structure of a protein with histone-like properties in prokaryotes. Nature 310, 376-381.
21. White, S. W., Appelt, K., Wilson, K. S. & Tanaka, I. (1989). A protein structural motif that bends DNA. Proteins: Struct. Funct. Genet. 5, 281-288.
22. Vis, H., Boelens, R., Mariani, M., Stroop, R., Vorgias, C. E., Wilson, K. S. & Kaptein, R. (1994). 1H, 13C, and 15N resonance assignments and secondary structure analysis of the HU protein from Bacillus stearothermophilus using two- and three-dimensional double- and triple-resonance heteronuclear magnetic resonance spectroscopy. Biochemistry 33, 14858-14870.
23. Vis, H., Mariani, M., Vorgias, C. B., Wilson, K. S., Kaptein, R. & Boelens, R. (1995). Solution structure of the HU protein from Bacillus stearothermophilus. J. Mol. Biol. 254, 692-703.
24. Craig, N. L. & Nash, H. A. (1984). E. coli integration host factor binds to specific sites in DNA. Cell 39, 707-716.
25. Yang, C.-C. & Nash, H. A. (1989). The interaction of E. coli IHF protein with its specific binding sites. Cell 57, 869-880.
26. Nash, H. A. (1996). The E. coli HU and IHF proteins: accessory factors for complex protein-DNA assemblies. In: Regulation of gene expression in Escherichia coli. (Lin, E. C. C. & Lynch, A. S., eds.), pp. 149-179. R. G. Landes Company.
27. Jia, X., Reisman, J. M., Hsu, V. L., Geiduschek, E. P., Parello, J. & Kearns, D. R. (1994). Proton and nitrogen NMR sequence-specific assignments and secondary structure determination of the Bacillus subtilis SPO1-encoded transcription factor 1. Biochemistry 33, 8842-8852.
28. Jia, X., Grove, A., Ivancic, M., Hsu, V. L., Geiduschek, E. P. & Kearns, D. R. (1996). Structure of the Bacillus subtilis phage SPO1-encoded type II DNA-binding protein TF1 in solution. J. Mol. Biol. In press.
29. Schneider, G. J., Sayre, M. H. & Geiduschek, E. P. (1991). DNA-bending properties of TF1. J. Mol. Biol. 221, 777-794.
30. Grosschedl, R. (1995). Higher-order nucleoprotein complexes in transcription: analogies with site-specific recombination. Curr. Biol. 7, 362-370.
31. Segall, A. M., Goodman, S. D. & Nash, H. (1994). Architectural elements in nucleoprotein complexes: interchangeability of specific and non-specific DNA binding proteins. EMBO J. 13, 4536-4548.
32. Kahn, J. D., Yun, E. & Crothers, D. M. (1994). Detection of localized DNA flexibility. Nature 368, 163-166.
33. Kallen, R. G., Simon, M. & Marmur, J. (1962). The occurrence of a new pyrimidine base replacing thymine in a bacteriophage DNA: 5-hydroxymethyl uracil. J. Mol. Biol. 5, 248-250.
34. Mellac, S., Fazakerley, G. V. & Sowers, L. C. (1993). Structure of base pairs with 5-(hydroxymethyl)-2'-deoxyuridine in DNA determined by NMR spectroscopy. Biochemistry 32, 7779-7786.
35. Hard, T. & Kearns, D. R. (1990). Reduced DNA flexibility in complexes with a type II DNA binding protein. Biochemistry 29, 959-965.
36. Pasternack, L. B., Bramham, J., Mayol, L., Galeone, A., Jia, X. & Kearns, D. R. (1996). 1H NMR studies of the 5-(hydroxymethyl)-2'-deoxyuridine containing TF1 binding site. Nucleic Acids Res. 24, 2740-2745.
Assembly of the multifunctional EcoKI DNA restriction enzyme in vitro
David T. F. Dryden*, Laurie P. Cooper and Noreen E. Murray
Institute of Cell and Molecular Biology, The University of Edinburgh, The King's Buildings, Edinburgh, EH9 3JR, United Kingdom
TECHNIQUES IN PROTEIN CHEMISTRY VIII. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.

I. Introduction
Type I DNA restriction/modification systems have been found in many strains of Escherichia coli and Salmonella enterica (Bickle & Kruger, 1993; King & Murray, 1994; Barcus et al., 1995) and in several other gram-negative and gram-positive bacteria (Dybvig & Yu, 1994; Fleischmann et al., 1995; Stein et al., 1995; Valinluck et al., 1995; Xu et al., 1995). They maintain the modification of the host chromosome after DNA replication by methylating adenine bases on the newly synthesised DNA strand within specific DNA target sequences. This methylation reaction is triggered by the recognition of targets which are methylated on the parental DNA strand. If methylation is not detected on either strand, the restriction reaction is triggered instead. Unmodified target sequences will exist on foreign DNA, usually of viral origin. A type I system cleaves the foreign DNA, thereby preventing (restricting) its replication and propagation. In contrast to the widely used type II restriction/modification systems, which have separate restriction endonucleases and modification methyltransferases (mtases), the type I systems combine both activities in one large oligomeric enzyme. The archetypal type I system is that of E. coli K12, EcoKI. This enzyme comprises three different subunits: the specificity (S) subunit, which recognises the DNA sequence 5'-AAC(N)6GTGC-3', the modification mtase (M) subunit, and the restriction endonuclease (R) subunit. Two M subunits bind to one S subunit to form an active modification mtase which has a strong preference for methylating a target sequence which already contains one methylated adenine base (figure 1). The binding of two R subunits to the mtase gives rise to a nuclease activity which is triggered only when the target sequence is not methylated at either position. The molecular weights of the S, M, and R subunits are 51kDa, 59kDa and 134kDa respectively; thus the complete EcoKI enzyme has a molecular weight of 437kDa. It has proved difficult to produce more than 1 or 2 milligrams of the complete nuclease for biophysical analysis by in vivo expression of the EcoKI genes. However, it has been possible to produce large amounts of the mtase and its subunits (Dryden et al., 1993) and milligram quantities of the R subunit. Therefore, we have examined the possibility of assembling the complete EcoKI enzyme in vitro, using intramolecular crosslinking combined with denaturing gel electrophoresis to detect subunit-subunit contacts, and the method of continuous variation titration to confirm subunit stoichiometries (Job, 1928; Agmus, 1961).
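The molecular weight quoted for the complete enzyme follows directly from the subunit masses and the stated composition of two R, two M and one S subunit. The short Python sketch below reproduces that arithmetic; the function and dictionary names are illustrative only and are not taken from the original work.

```python
# Subunit masses in kDa, as quoted in the Introduction.
SUBUNIT_MASS_KDA = {"S": 51, "M": 59, "R": 134}


def complex_mass(composition):
    """Return the mass in kDa of a complex given as {subunit: copy number}."""
    return sum(SUBUNIT_MASS_KDA[name] * count for name, count in composition.items())


# Complete nuclease (two R, two M, one S) and the mtase trimer (two M, one S).
nuclease_mass = complex_mass({"R": 2, "M": 2, "S": 1})
mtase_mass = complex_mass({"M": 2, "S": 1})
print(nuclease_mass, mtase_mass)
```

The same helper can be used to tabulate the masses of other candidate assemblies when comparing them with apparent molecular weights from gels.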
[Figure 1 diagram: the duplex target 5'-AAC(N)6GTGC-3' / 3'-TTG(N)6CACG-5' drawn in four forms: the unmodified target (no methyl groups), the fully modified target (a CH3 group on each strand), and the two hemimethylated targets (a CH3 group on the top strand only, or on the bottom strand only).]

Figure 1. The DNA target sequence for EcoKI in its different methylated forms. The unmodified form elicits the restriction reaction, the two hemimethylated forms elicit the modification reaction and the fully methylated form causes no reaction. EcoKI binds with the same affinity to all of these forms (Powell et al., 1993).
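As an illustration of what the bipartite target in Figure 1 means in practice, the following Python sketch scans a DNA sequence for the EcoKI recognition sequence on either strand. It is only a sketch: the function name and the example sequences are hypothetical, and the search says nothing about methylation state, which a plain sequence cannot encode.

```python
import re

# Target from the text: 5'-AAC(N)6GTGC-3'. Read on the opposite strand,
# the same site appears as 5'-GCAC(N)6GTT-3'. Lookahead groups are used
# so that overlapping sites are also reported.
FORWARD_SITE = re.compile(r"(?=(AAC[ACGT]{6}GTGC))")
REVERSE_SITE = re.compile(r"(?=(GCAC[ACGT]{6}GTT))")


def find_target_sites(sequence):
    """Return sorted (position, strand) pairs for every target site found."""
    seq = sequence.upper()
    sites = [(match.start(), "+") for match in FORWARD_SITE.finditer(seq)]
    sites += [(match.start(), "-") for match in REVERSE_SITE.finditer(seq)]
    return sorted(sites)


# Hypothetical example sequence containing one site on the top strand.
print(find_target_sites("TTAACGGATCCGTGCTT"))
```

Positions are zero-based offsets of the first base of the site as read 5' to 3' on the supplied strand.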
II. Materials and Methods
The M2S1 mtase and the partially assembled, inactive M1S1 form were purified as described (Dryden et al., 1993). The M subunit was purified using DEAE ion exchange and gel filtration chromatography from cells containing an overexpression plasmid, pJFM, derived from the mtase overexpression plasmid, pJFMS. This was made by excising the SmaI-HindIII fragment containing the M gene from pJFMS and ligating it into plasmid vector pJF118EH. The R subunit was purified from cells containing the multicopy plasmid pJK2 (Kelleher et al., 1991) using DEAE ion exchange, heparin affinity and gel filtration column chromatography. The proteins were all at least 95% pure as judged by SDS-PAGE with Coomassie blue or silver staining. Protein concentration was measured by absorption at 280nm using extinction coefficients calculated from the tyrosine and tryptophan content of the subunits (Sober, 1970). The buffer used in all crosslinking experiments was 20mM Tris, 20mM MES, 10mM MgCl2, 7mM β-mercaptoethanol, 0.1mM EDTA, pH8, and experiments were all performed at room temperature. Glutaraldehyde crosslinking was performed by adding a 25% stock solution of glutaraldehyde to the protein solution to obtain a final concentration of 1% glutaraldehyde and approximately 0.2mg/ml of protein in a volume of 100µl. The reaction was terminated by the addition of 2.5µl of 2M NaBH4 freshly prepared in 0.1M NaOH. After 20 minutes, the samples were mixed with an equal volume of SDS-PAGE loading buffer, boiled for 5 minutes and applied to the gel. The crosslinked samples were subjected to electrophoresis on 10% acrylamide gels with a stacking gel or on 0.8% agarose, 3.5% acrylamide gels without a stacking gel. The agarose significantly improved the strength of the acrylamide gels without affecting their resolution. These gels were made by dissolving the agarose in hot gel electrophoresis buffer, followed by the addition of the acrylamide solution, TEMED and ammonium persulphate (Sambrook et al., 1989). The mixture was rapidly poured between pre-warmed glass plates. Gel casting and subsequent electrophoresis used either the standard tris-glycine buffer (Sambrook et al., 1989) or a 20mM sodium phosphate, 2% SDS, pH7 buffer (Sigma technical note MWS-877X). Molecular weight markers of up to 205kDa (Sigma) were used with the tris-glycine buffer system, but it was possible to use crosslinked phosphorylase b and crosslinked bovine serum albumin markers (Sigma)
with molecular weights up to 600kDa with the phosphate buffer system (Sigma technical note MWS-877X). The molecular weights of the crosslinked complexes were estimated by comparison to a calibration plot of log(molecular weight) versus migration for the molecular weight markers. This calibration for the markers used with the phosphate buffer system and the agarose/acrylamide gels was linear over the very large range of 27kDa to at least 700kDa.

To obtain a measure of the number of R subunits which would bind in vitro to the mtase we used the continuous variation titration method (Job, 1928; Agmus, 1961). In this method, the two components are mixed such that the sum of their concentrations remains constant but the mole fraction, x, of each component is varied. If one chooses a total concentration, C, substantially greater than the dissociation constant (Kd) for the binding of the components under investigation, then a plot of the amount of complex formed as a function of mole fraction will give the stoichiometry of the complex and an estimate of the Kd. The method can be applied to any associating system at equilibrium using any appropriate technique to measure the amount of complex formed. From the usual equation describing the binding of n molecules of B to one of A to form a complex AB_n one can write

$$K_d = \frac{[A]\,[B]^n}{[AB_n]}$$

$$[A] = C(1-x) - [AB_n] \quad \text{and} \quad [B] = Cx - n[AB_n]$$

where x is the mole fraction of B. Therefore,

$$\{C(1-x) - [AB_n]\}\,\{Cx - n[AB_n]\}^n = K_d\,[AB_n]$$

The solution of this equation is simple for n = 1 but is more complicated for other values of n. However, when one plots the amount of complex formed versus mole fraction, one can immediately estimate the ratio of A to B from the value of the mole fraction which gives the maximum amount of complex AB_n. It can be calculated that $x_{max} = n/(n+1)$, so that the maxima for n = 1, 2 and 3 are at x = 0.5, 0.667 and 0.75 respectively.
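For values of n where the binding equation has no convenient closed form, it can be solved numerically. The following Python sketch, which is not part of the original analysis, finds [AB_n] by bisection and then locates the maximum of the resulting Job plot on a grid of mole fractions. The function names, the grid spacing, and the concentration and Kd values used in the example are all assumptions chosen for illustration; note that for n greater than 1 the Kd in this equation carries units of concentration to the power n.

```python
def complex_concentration(x, n, total, kd, iterations=100):
    """Solve {C(1-x) - y} * {Cx - n*y}**n = Kd * y for y = [AB_n] by bisection.

    x is the mole fraction of B, n the (integer) number of B bound per A,
    total the constant summed concentration C, and kd the dissociation constant.
    """
    a_total = total * (1.0 - x)
    b_total = total * x
    low, high = 0.0, min(a_total, b_total / n)  # physically allowed range of y

    def residual(y):
        return (a_total - y) * (b_total - n * y) ** n - kd * y

    # The residual decreases monotonically from >= 0 at y = 0 to <= 0 at y = high.
    for _ in range(iterations):
        middle = 0.5 * (low + high)
        if residual(middle) > 0:
            low = middle
        else:
            high = middle
    return 0.5 * (low + high)


def job_plot_maximum(n, total, kd, steps=1000):
    """Return the grid mole fraction of B at which [AB_n] is largest."""
    fractions = [i / steps for i in range(steps + 1)]
    amounts = [complex_concentration(x, n, total, kd) for x in fractions]
    best = max(range(len(fractions)), key=lambda i: amounts[i])
    return fractions[best]


# Illustrative values only: C = 1.0 and Kd = 1e-3 in arbitrary units.
for n in (1, 2, 3):
    print(n, job_plot_maximum(n, total=1.0, kd=1e-3))
```

The located maxima should fall at, or within one grid step of, n/(n+1), in line with the expression given above.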
[Figure 2 plot: % transmission at 280nm (%T) versus NaCl concentration (moles/litre) for elution runs A and B.]

Figure 2. Elution profile (A) of the EcoKI mtase from a heparin agarose chromatography column showing the presence of two peaks, corresponding to the M2S1 and M1S1 forms of the enzyme. Elution profile (B) formed by reapplying the smaller of the two peaks from elution run A to the heparin agarose column, showing the re-equilibration of the mtase into M2S1 and M1S1 forms. % transmission at 280nm was monitored (Dryden et al., 1993).
[Figure 3 gel image: lanes 1 to 8, with molecular weight markers at 205, 116 and 97.4 kDa.]

Figure 3. SDS-PAGE, on a 0.8% agarose, 3.5% acrylamide gel run in tris-glycine buffer, of samples after crosslinking with glutaraldehyde. Bands were stained with silver. Lane 1, M2S1 mtase; lane 2, M2S1 mtase + R subunit; lane 3, M1S1 + M subunit; lane 4, M1S1 visible at very bottom of gel; lane 5, M1S1 + R subunit; lane 6, M subunit + R subunit; lane 7, M subunit dimer with the predominant M subunit monomer having migrated off the base of the gel; lane 8, R subunit in a monomeric form.
III. Results and Discussion
It has been found that the M2S1 mtase can dissociate during ion exchange and heparin affinity chromatography (Dryden et al., 1993) to give a mixture of M2S1, M1S1 and M subunit (figure 2). The dissociation has been confirmed by determination of the mtase molecular weight as a function of protein concentration by both gel filtration and sedimentation equilibrium measurements (results not shown). The Kd for this process is approximately 15nM. Figure 3 shows the effect of glutaraldehyde on our preparations of M2S1, M1S1, M and R. The most intense band in each case is that of the lowest molecular weight and corresponds to the normal multimeric state of each protein, i.e. a trimer, dimer, monomer and monomer respectively. The less intense bands of higher molecular weight are due to intermolecular crosslinking between different protein molecules rather than intramolecular crosslinking. The amount of intermolecular crosslinking can be minimised by reducing the amount of crosslinker; however, this will also lead to the presence of some free subunits which have not undergone intramolecular crosslinking (Klotz et al., 1975). Lane 3 shows that the mtase can be reconstituted in vitro by mixing M1S1 with the M subunit. The crosslinked mixture shows a band not present in either of the individually crosslinked samples, lanes 4 and 7, that migrates at the same position as the crosslinked mtase, lane 1. The apparent molecular weight of this band is 150kDa, slightly less than the 170kDa expected for the mtase trimer. This slightly lower molecular weight can be attributed to the crosslinks preventing complete unfolding of the protein and resulting in faster migration of the more compact structure through the gel. Lanes 2, 5 and 6 show that the R subunit can be crosslinked to M2S1, M1S1 and the M subunit, giving rise to complexes of very high molecular weight. Electrophoresis of these complexes on agarose/acrylamide gels with the phosphate buffer system allows their weights to be estimated at 400-450kDa. Further analysis of these complexes using gel filtration chromatography suggests that they are of the form R2M2S1, R2M2S2 and R2M2 respectively (data not shown). Only the complex between R and M2S1 shows full nuclease activity. The estimation of subunit stoichiometry of such large complexes is a rather uncertain process, so we used the continuous variation titration method to examine the binding of R to M2S1 in more detail. Figure 4 shows a typical result of the titration of M2S1 with R after crosslinking of the
samples and electrophoresis through the agarose/acrylamide gel. The amount of crosslinked mtase decreases with increasing mole fraction of R and a high molecular weight band of the complete nuclease appears. The amount of nuclease reaches a maximum at a mole fraction of R = 0.7 and then disappears at higher mole fractions when free R subunit becomes visible. Densitometry of this gel and several others allowed figure 5, showing the amount of nuclease formed as a function of mole fraction of R, to be plotted. This graph clearly shows that more than one molecule of the R subunit binds to each molecule of the mtase and the most likely stoichiometry is that predicted from the molecular weight determination i.e. R2M2S1. This stoichiometry agrees with that observed for EcoKl nuclease purified from cells expressing all three genes. The nuclease assembled in vitro has the same enzymatic activities as the nuclease isolated from in vivo sources (data not shown).
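The step from the observed maximum to a stoichiometry can be made explicit. Since the Job plot for n molecules of R per mtase peaks at a mole fraction of n/(n+1), an observed maximum can be compared with the expected positions for each candidate n. The Python sketch below does this for the maximum of 0.7 reported above; the function names and the candidate list are illustrative assumptions, and the sketch ignores experimental error in locating the maximum.

```python
def expected_maximum(n):
    """Mole fraction of the titrant at which a 1:n complex is maximal."""
    return n / (n + 1)


def closest_stoichiometry(observed_maximum, candidates=(1, 2, 3)):
    """Return the candidate n whose expected maximum lies nearest the observed one."""
    return min(candidates, key=lambda n: abs(observed_maximum - expected_maximum(n)))


# Observed maximum from the titration of M2S1 with R.
print(closest_stoichiometry(0.7))
```

With candidates of 1, 2 and 3 the expected maxima are 0.5, 0.667 and 0.75, so an observed value of 0.7 lies closest to n = 2, although n = 3 is not far away, which is why the comparison with theoretical curves over the whole titration is the stronger test.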
[Figure 4 gel image: lanes at mole fractions of R subunit from 0 to 1.0; labelled bands (M2S1)2, R2M2S1 and M2S1; molecular weight markers at 205kDa and 116kDa.]

Figure 4. SDS-PAGE using the same gel system as in figure 3, of samples from the continuous variation titration of the M2S1 mtase with the R subunit after crosslinking with glutaraldehyde. The maximum amount of high molecular weight complex corresponding to the EcoKI nuclease is visible between 0.6 and 0.8 mole fraction of R subunit.
[Figure 5 plot: amount of nuclease complex (arbitrary units, 0 to 1) versus mole fraction of R subunit (0 to 1).]

Figure 5. A plot of the amount of nuclease complex formed in the continuous variation titration experiments versus mole fraction of the R subunit as determined by densitometry of silver stained gels such as that shown in figure 4. The lines drawn are the theoretical curves expected for the association of 1 (...), 2 (-), or 3 (--) R subunits per molecule of M2S1 mtase. The error bars are +/- one standard deviation.
IV. Conclusions
Our results show the effectiveness of intramolecular crosslinking coupled with SDS-PAGE in analysing a complex assembly process. The continuous variation titration method of Job (Job, 1928; Agmus, 1961) is particularly useful for determining subunit stoichiometries in situations where the high molecular weight of the complexes potentially permits many different subunit stoichiometries.
The ability to assemble the EcoKI nuclease in vitro is a great advantage in mutagenesis studies, since one can very easily assemble different combinations of M1S1, M, M2S1 and R containing single amino acid changes and possessing altered activities. This is particularly valuable if one wishes to make a nuclease proficient in restriction but deficient in modification, which would be lethal if expressed in the cell.
Acknowledgements We would like to thank Peter Thorpe for the construction of the pJFMS and pJFM plasmids. This work was supported by grants from the Medical Research Council and The Royal Society. David Dryden thanks the Royal Society for a University Research Fellowship.
References
Agmus, E. (1961) Z. Analyt. Chem. 183, 321-333.
Barcus, V. A., Titheradge, A. J. B., & Murray, N. E. (1995) Genetics 140, 1187-1197.
Bickle, T. A., & Kruger, D. H. (1993) Microbiol. Rev. 57, 434-450.
Dryden, D. T. F., Cooper, L. P., & Murray, N. E. (1993) J. Biol. Chem. 268, 13228-13236.
Dybvig, K., & Yu, H. (1994) Molec. Microbiol. 12, 547-560.
Fleischmann, R. D., Adams, M. D., White, O., Clayton, R. A., Kirkness, E. P., Kerlavage, A. R., Bult, C. J., Tomb, J. F., Dougherty, B. A., Merrick, J. M., McKenney, K., Sutton, G., FitzHugh, W., Fields, C., Gocayne, J. D., Scott, J., Shirley, R., Liu, L-L., Glodek, A., Kelley, J. M., Weidman, J. F., Phillips, C. A., Spriggs, T., Hedblom, E., Cotton, M. D., Utterback, T. R., Hanna, M. C., Nguyen, D. T., Saudek, D. M., Brandon, R. C., Fine, L. D., Fritchman, J. L., Fuhrmann, J. L., Geoghagen, N. S. M., Gnehm, C. L., McDonald, L. A., Small, K. V., Fraser, C. F., Smith, H. O., & Venter, J. C. (1995) Science 269, 496-512.
Job, P. (1928) Annls. Chim. (Ser. 10) 9, 113-134.
Kelleher, J. E., Daniel, A. S., & Murray, N. E. (1991) J. Mol. Biol. 221, 431-440.
King, G., & Murray, N. E. (1994) Trends in Microbiol. 2, 465-469.
Klotz, I. M., Darnall, D. W., & Langerman, N. R. (1975) in The Proteins, 3rd ed. (Neurath, H., Hill, R. L., & Boeder, C-L., eds) pp 293-411, Academic Press, New York.
Powell, L. M., Dryden, D. T. F., Willcock, D. F., Pain, R. H., & Murray, N. E. (1993) J. Mol. Biol. 234, 60-71.
Sambrook, J., Fritsch, E. F., & Maniatis, T. (1989) Molecular Cloning: A laboratory manual. Cold Spring Harbor Press, NY.
Sober, H. A. (1970) Handbook of Biochemistry, 2nd ed. pp B75-B76, CRC Press, Boca Raton, FL.
Stein, D. C., Gunn, J. S., Radlinska, M., & Piekarowicz, A. (1995) Gene 157, 19-22.
Valinluck, B., Lee, N. S., & Ryu, J. (1995) Gene 167, 59-62.
Xu, G., Willert, J., Kapfer, W., & Trautner, T. A. (1995) Gene 157, 59.
SECTION VIII Three Dimensional Structure
Strategies for NMR Assignment and Global Fold Determinations Using Perdeuterated Proteins
Ronald A. Venters, Hai M. Vu, Robert M. de Lorimier, and Leonard D. Spicer
Departments of Biochemistry and Radiology and the Duke University NMR Center, Duke University, Durham, NC 27710
I. Introduction
NMR is proving to be a very useful tool in structural studies of small- to medium-sized proteins in solution. For larger proteins, however, magnetic relaxation becomes a limiting factor. Here we show the benefits of using uniform high-level (> 96%) deuteration to inhibit relaxation processes. This facilitates assignment of larger proteins for structural studies and enables, via edited NOESY experiments, the determination of medium- to long-range distance constraints important in establishing the tertiary organization or global fold of proteins. The proteins studied are human carbonic anhydrase II (HCA II), a 29 kDa metalloenzyme recently assigned in our lab (1), and a 12 kDa core packing mutant of thioredoxin (L78K-TRX) for which we have characterized motional dynamics (2). NMR pulse sequences utilized for protein ¹³C, ¹⁵N, and ¹H assignment (3,4) rapidly lose sensitivity as the size of the protein under study increases above 25 kDa, due mainly to fast ¹³C transverse relaxation via the strong dipolar coupling between a ¹³C nucleus and its directly bonded protons (5,6,7). Since the gyromagnetic ratio of ²H is 6.5 times smaller than that of ¹H, perdeuteration dramatically reduces this relaxation. We have successfully ¹³C, ¹⁵N, and ²H-labeled the protein HCA II (8) and have demonstrated significant advantages in signal-to-noise ratios for heteronuclear NMR experiments compared to a fully protonated ¹³C/¹⁵N protein (1,9). Using this protein we have also developed a general strategy for the complete mainchain, as well as carbon and NHx sidechain, assignments of perdeuterated proteins (1,9,10). In addition, for both HCA II and L78K-TRX we have obtained 3D and 4D ¹⁵N/¹⁵N-separated NOESY data which show anticipated long range interactions from which distance constraints can be derived. These are currently being evaluated in establishing the global folding patterns for these proteins (11), and here we show initial results for L78K-TRX that confirm the importance and utility of these data in establishing tertiary organization. The rapid determination of protein global
folds can enhance the comparison of mutant proteins with their wild-type counterparts and can significantly speed up efforts in drug discovery. In addition, the global fold may subsequently be utilized in more detailed structural studies by helping to resolve ambiguities in 4D ¹³C/¹³C-separated and ¹³C/¹⁵N-separated NOESY data.
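The relaxation argument in the Introduction can be put into rough numbers. In the standard expression for heteronuclear dipolar relaxation, the rate scales with the square of the gyromagnetic ratio of the attached spin S and with S(S+1). The Python sketch below evaluates that scaling for ¹H (spin 1/2) versus ²H (spin 1) using the factor of 6.5 quoted above. It is a simplified estimate that assumes identical bond lengths and spectral densities for the two isotopes, and the function name is illustrative only.

```python
def dipolar_rate_ratio(gamma_ratio=6.5, spin_h=0.5, spin_d=1.0):
    """Estimate the ratio of 13C dipolar relaxation rates for an attached 1H versus 2H.

    Assumes the rate is proportional to gamma_S**2 * S * (S + 1) and that all
    other factors (bond length, spectral densities) are the same for both isotopes.
    """
    spin_factor = (spin_h * (spin_h + 1.0)) / (spin_d * (spin_d + 1.0))
    return gamma_ratio ** 2 * spin_factor


print(dipolar_rate_ratio())
```

Under these assumptions the dipolar contribution from a directly bonded deuteron is smaller than that from a proton by a factor in the mid-teens, which is the origin of the sensitivity gains described for perdeuterated samples.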
II. Experimental Conditions
High-level expression of HCA II in E. coli (12) has been achieved by the construction of vectors (pACA) which contain the protein gene subcloned behind a phage T7 RNA polymerase promoter (13). Transcription was initiated by the addition of isopropyl-β-D-thiogalactopyranoside (IPTG), inducing a chromosomal copy of T7 RNA polymerase (behind a lac UV5 promoter) in the cell line BL21(DE3) (14). HCA II was purified using sulfonamide affinity chromatography with slight modifications to the published procedures (15). HCA II activity was measured by assaying enzyme-catalyzed hydrolysis of p-nitrophenyl acetate at 348 nm (16).

[Figure 1 flow chart. Recoverable labels: starting strains pHH121/XL1BLU (L78K-TRX) and pACA/BL21 (HCA II); rich medium plate, LB/Amp, 37°C, pH 7; rich medium (H2O); rich medium (100% D2O), A600 ~ 0.8, 37°C, pH 7; minimal/glucose medium (99% D2O), 37°C, pH 7; minimal/acetate medium (99% D2O), induction at OD ~ 0.4, grow at 34°C, 16 hours; minimal/glucose medium (99% D2O), induction at OD ~ 0.8, grow at 37°C, 8 hours; harvest. Carbon sources = protonated glucose for L78K-TRX and protonated sodium (1,2-¹³C2) acetate for HCA II. Nitrogen source = NH4Cl.]

Figure 1. Growth of E. coli in D2O for biosynthetic labeling of HCA II and L78K-TRX.
The flow chart for biosynthetic labeling of HCA II and L78K-TRX is shown in Figure 1 above. Uniform ²H and ¹⁵N labeled HCA II was obtained by growing BL21(DE3)pACA E. coli in defined media containing essentially 100% D2O, 3 g/L sodium acetate as the sole carbon source, and 1 g/L [¹⁵N, 99%] ammonium chloride as the sole nitrogen source (8). In addition, the defined media contained M9 salts (17), 2 mM MgSO4, 1 µM FeCl3, 10 mL/L vitamin mixture (containing 10 mg/100 mL each of biotin, choline chloride, folic acid, niacinamide, D-pantothenate, and pyridoxal and 1 mg/100 mL riboflavin), 5 mg/L thiamine, 100 µM CaCl2, 50 µM ZnSO4, and 50 µg/mL ampicillin. Stock reagents were prepared in D2O and filter sterilized. To minimize ¹H/²H exchange, the media were used immediately after preparation and were never autoclaved. In order to obtain maximum sensitivity in heteronuclear 3D experiments it is essential that all amide ²H be exchanged with ¹H. To achieve this, deuterated HCA II was unfolded in the presence of H2O by incubation in 3 M guanidine-HCl at pH 7.5 and room temperature for 1 hour, followed by a rapid 20-fold step dilution with 0.1 M tris sulfate at pH 7.5 and subsequent refolding for 2 hours (18). Furthermore, we have optimized HCA II growth conditions for maximum protein yields in the defined acetate media described above (8). Conditions optimized included A600 at time of induction, induction time, growth temperature, antibiotic levels, and pH. Doubling times for cells in 98.8% D2O/acetate media increased slightly compared to H2O/acetate. Optimum protein yields were obtained using the conditions we found optimum for acetate growths in H2O, with two exceptions: maximum yield was achieved when the cells were induced at A600 ≈ 0.3-0.5, and when induction times were increased from 8 hours to 16 hours. The total mass of protein produced per liter of medium decreased approximately 33% to 50 mg compared with the same fully protonated medium.

For ¹⁵N/²H L78K-TRX a different strain of E. coli was utilized (pHH121/XL1BLU) and growth was carried out in a minimal glucose medium. Temperature and pH were as indicated in the flow chart and induction was initiated when A600 ~ 0.8. Induction was optimum at 8 hours compared with 4 hours in H2O. The total yield was 15 mg/L compared with 14 mg/L in protonated media.

We have also determined the upper limit of deuterium incorporation in HCA II. For this purpose milligram quantities of ²H labeled protein were produced in defined media containing 98.8% D2O and [²H3, 98%] sodium acetate as the sole carbon source using the optimized procedures outlined above. To quantitate the level of deuterium incorporation, we analyzed the molecular mass of purified HCA II by mass spectrometry. The molecular mass of fully protonated HCA II was measured to be 29102 +/- 2.4 (theoretical mass = 29098.9). At low pH the protein contains 2018 protons; therefore, one would predict a theoretical mass increase of 2030.5 mass units upon complete deuteration. The molecular mass of protein produced in 98.8% D2O and [²H3, 98%] sodium acetate was measured to be 31133 +/- 13, an increase of 2034 +/- 15 mass units, indicating above 96% deuterium incorporation.
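The mass spectrometry figures above can be combined into a simple consistency check on the incorporation level. The Python sketch below divides the observed mass increase by the theoretical increase for complete deuteration, and repeats the calculation with the quoted uncertainty subtracted. It uses only numbers stated in the text; the function name and the one-uncertainty lower bound are illustrative assumptions, and the sketch does not attempt to reproduce how the 96% figure itself was derived.

```python
# Values quoted in the text (mass units).
THEORETICAL_INCREASE = 2030.5   # predicted for complete deuteration of 2018 sites
OBSERVED_INCREASE = 2034.0      # measured increase relative to the theoretical mass
OBSERVED_UNCERTAINTY = 15.0     # quoted uncertainty on the observed increase


def incorporation_fraction(observed_increase, theoretical_increase=THEORETICAL_INCREASE):
    """Return the observed mass increase as a fraction of the theoretical maximum."""
    return observed_increase / theoretical_increase


central_estimate = incorporation_fraction(OBSERVED_INCREASE)
lower_estimate = incorporation_fraction(OBSERVED_INCREASE - OBSERVED_UNCERTAINTY)
print(round(central_estimate, 3), round(lower_estimate, 3))
```

Both values lie above 0.96, so this arithmetic is at least consistent with the stated conclusion of greater than 96% deuterium incorporation.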
[²H3, 98%] sodium acetate, [¹⁵N, 99%] ammonium chloride, and D2O were obtained from Cambridge Isotope Laboratories. NMR experiments were carried out on a 3-channel Varian Unity 600 spectrometer using a ¹H/¹³C/¹⁵N triple-resonance probe equipped with an actively shielded Bz gradient coil.
III. Results and Discussion
A. Backbone and Aliphatic Sidechain Resonance Assignments
Since there are no aliphatic protons present in perdeuterated proteins, new strategies must be employed and new pulse sequences developed for NMR assignment and structure determination. The sequential mainchain assignment of perdeuterated proteins is achieved by collecting and analyzing 3D HNCACB, 3D HN(CO)CACB, and 4D HN(CACO)NH data (1). These sequences include ²H decoupling when ¹³C is transverse and work best if H2O flip-back pulses and pulsed field gradients are employed. Complete aliphatic deuteration increases both resolution and sensitivity in these experiments by eliminating partially deuterated CHnDm moieties, which have different ¹³C chemical shifts due to the ²H isotope shift. Sidechain carbon assignments are obtained from a 3D C(CC)(CO)NH data set (9). This sequence is a modified version of the HC(CC)(CO)NH sequence in which magnetization originates on aliphatic ¹³C and not aliphatic ¹H. Theoretical calculations and experimental evidence indicate an approximate 3.5-fold increase in sensitivity for methine groups and an approximate 7-fold increase in sensitivity for methylene groups using the C(CC)(CO)NH experiment on perdeuterated HCA II. Sidechain NHx assignments are obtained using modified 2D versions of the ¹H-¹⁵N HSQC, HNCO, HNCACB, and HN(CO)CACB experiments (10) to provide through-bond correlations of these sidechain ¹HN/¹⁵N resonances to the previously assigned sidechain ¹³C resonances. Subsequent to the assignment of the perdeuterated protein, inter-residue Cα/Cβ and Hα/Hβ chemical shifts can be obtained from the CBCA(CO)NH and HBHA(CO)NH experiments using a fully protonated ¹³C/¹⁵N labeled protein sample. These data allow for the parameterization of the ²H isotope shifts on the Cα and Cβ carbons and allow for a reasonable estimation of the ²H isotope shifts at sidechain ¹³C resonances (1). Sidechain ¹H resonances can then be assigned from a 4D HCCH-TOCSY data set collected on the fully protonated protein sample. The ¹³Cα and ¹³Cβ chemical shift values obtained directly from the protonated sample and the corrected ¹³C chemical shift values of the additional sidechain carbons should facilitate the analysis of the TOCSY data.
B. Secondary Structure Determination
The relationship between NMR chemical shifts and the secondary structure of a protein has been well established (19,20,21). The Cα and carbonyl carbons experience an upfield shift in extended structures, such as a β-strand, and a downfield shift in helical structures. Both the Cβ and the Hα proton chemical shifts exhibit the opposite correlation. These shifts have proven to be sufficiently consistent to permit the prediction of secondary structural elements for a number of proteins (1,19,20). Knowledge of the secondary structure of a protein can be useful in identifying spin-diffusion effects during the analysis of 4D ¹⁵N/¹⁵N-separated NOESY data collected with long mixing times as described below. The secondary structure can also be used as a constraint in the calculation of protein global folds.
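The opposite behaviour of the Cα and Cβ shifts suggests a simple way to combine them. The Python sketch below is an illustration only and is not the procedure used in this work: it takes secondary shifts (observed minus random coil values, in ppm, with downfield counted as positive) for Cα and Cβ and classifies a residue from the sign of their difference. The function name and the 1.0 ppm threshold are hypothetical choices.

```python
def classify_residue(delta_ca, delta_cb, threshold=1.0):
    """Classify one residue from Calpha and Cbeta secondary shifts in ppm.

    Helices shift Calpha downfield (positive) and Cbeta upfield (negative);
    extended structures do the opposite. The threshold is an assumed cutoff.
    """
    score = delta_ca - delta_cb
    if score > threshold:
        return "helix"
    if score < -threshold:
        return "strand"
    return "undetermined"


# Hypothetical secondary shifts for three residues.
for shifts in [(2.5, -0.5), (-1.5, 2.0), (0.2, 0.1)]:
    print(shifts, classify_residue(*shifts))
```

In practice such per-residue calls are usually smoothed over several consecutive residues before a secondary structure element is assigned.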
C. Global Fold Determination
A global fold of a protein may be determined from the analysis of a 4D ¹⁵N/¹⁵N-separated NOESY spectrum collected on perdeuterated protein once the mainchain and sidechain ¹HN/¹⁵N resonances have been assigned (11). Detection of ¹HN-¹HN NOEs in a perdeuterated protein can provide longer distance constraints than in a fully protonated protein. This is due to greater control of alternate relaxation pathways and a reduction in the number of possible spin-diffusion routes which would otherwise compete with direct ¹HN-¹HN cross-relaxation at long mixing times. Results in perdeuterated HCA II and L78K-TRX suggest that NOEs are detected between amides separated by 7 Å or more in the crystal structure. For example, Figures 2 and 3 show planes from ¹⁵N/¹⁵N-separated NOESY spectra of HCA II and L78K-TRX respectively. Labeled peaks correspond to amide-amide NOEs for Leu 118 (HCA II) and Val 55 (L78K-TRX), with inter-proton distances (from the crystal structures of wild type protein) given in Å, several of which are greater than 5 Å. This increased range leads not only to more total constraints, but also to highly informative constraints between different structural elements, which should allow more accurate prediction of the global folding pattern (11,22-25). For example, in the crystal structure of wild type thioredoxin there are 205 mainchain amide-to-amide distances less than 5 Å. Extending this range up to 7 Å gives an additional 134 constraints, many of which link different substructures. In many respects medium and long range constraints are particularly important in determining precise protein structures by NMR (22), and such constraints are crucial for determining an accurate global fold of a protein.

[Figure 2 spectrum: axis ¹H (don) (ppm).]

Figure 2. ¹H donor/¹⁵N donor planes from a 4D ¹⁵N/¹⁵N-separated NOESY spectrum on a 2.8 mM perdeuterated ¹⁵N-labeled HCA II sample.

[Figure 3 spectrum: axis ¹H don (ppm), 10.0 to 7.5; labelled peaks include V55 (diagonal peak), A46 and T54, annotated with inter-proton distances.]

Figure 3. ¹H acceptor plane from a 3D ¹⁵N/¹⁵N-separated NOESY spectrum on a 4.0 mM perdeuterated ¹⁵N-labeled L78K-TRX sample.

An example of the utility of using ¹⁵N/¹⁵N-separated NOE data from perdeuterated L78K-TRX to establish the tertiary organization of the protein is illustrated below. The 4D ¹⁵N/¹⁵N-separated NOESY data were collected using a mixing time of 400 ms and the resulting spectrum was referenced using previously determined resonance assignments (2). Data on spectral peaks were tabulated using the peak-picking and volume measurement routines of a modified version of the FELIX program (Hare Research), then assigned as NOEs between specific amides. The number of inter-residue mainchain NOEs was 381, of which 80 were sequential i to i+1. The remaining 301 were i to i+2 or greater. Also assigned were approximately 60 sidechain to mainchain NOEs involving Trp indole and Asn/Gln primary amide groups. Some mainchain amides had no detectable NOEs, perhaps because their direct ¹⁵N-¹H correlations are weak as observed in ¹H/¹⁵N-HSQC data.
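The amide-to-amide distance counts quoted above for wild type thioredoxin are the kind of number that can be obtained from a set of amide proton coordinates by counting pairs within a cutoff. The Python sketch below shows one way to do this; it is not the authors' procedure, the coordinates in the example are invented, and the function name and the `min_separation` parameter are hypothetical.

```python
import math
from itertools import combinations


def count_pairs_within(coords, cutoff, min_separation=1):
    """Count pairs of points closer than cutoff (same units as the coordinates).

    coords is a list of (x, y, z) tuples ordered by residue; min_separation
    is the smallest sequence separation (j - i) that is counted.
    """
    count = 0
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        if j - i >= min_separation and math.dist(a, b) < cutoff:
            count += 1
    return count


# Invented coordinates (in angstroms) for four amide protons on a line.
amide_protons = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
within_5 = count_pairs_within(amide_protons, 5.0)
within_7 = count_pairs_within(amide_protons, 7.0)
print(within_5, within_7 - within_5)
```

The difference between the two counts corresponds to the additional constraints gained by extending the detectable range from 5 Å to 7 Å.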
The relationship between NOE volume and 1H-1H distance was examined for all NOEs where both symmetry related cross peaks were observed. A plot of the logarithm of the average volume of the two peaks, versus the inter-proton distance as measured from the crystal structure of wild type thioredoxin, is shown in Figure 4. The data suggest an approximately linear overall relationship, as expected from a dipolar interaction. Of the 280 NOEs in Figure 4, 153 (55%) correspond to distances less than 5 Å, while the remaining 127 (45%) NOEs occur between protons separated by 5 Å or more. It is also noted that cross peaks corresponding to distances as large as 9 Å are clearly observed. NOE volume appears to be less well correlated with distance at longer distances, perhaps because of a greater contribution of spin-diffusion to peak volume (25).

Figure 4. Log of the NOE volumes in L78K thioredoxin versus the distance as measured from the WT x-ray crystal structure. (Plot: logarithmic volume axis, 1.0E+06 to 1.0E+09; horizontal axis, NH-NH distance in WT thioredoxin, 2 to 10 Å.)

Our initial effort to calculate the fold of the L78K-TRX mainchain was based only on observed 1HN-1HN NOEs. Each of 398 independent NOEs (381 inter-residue mainchain and 17 Trp indole-mainchain) was assigned one of three upper bounds for distances: 5.6, 7.6, or 9.0 Å. This classification scheme, suggested by the data in Figure 4, provides very loose constraints. The lower bound for all inter-proton distances was assigned as 1.9 Å, since this is the shortest HN-HN distance in wild type thioredoxin. Starting with an extended random configuration, NMRchitect 95.0 (Biosym/MSI) was used to produce structures for the protein based on distance geometry methods. Triangle bound smoothing was employed initially to find the Euclidean limits before the coordinates were randomly embedded. Optimization of the embedded structure was done by simulated annealing followed by further energy minimization using the method of conjugate gradients.
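Triangle bound smoothing, mentioned above, tightens a set of loose distance bounds by repeatedly applying the triangle inequality. The sketch below illustrates the general idea only; it is not the NMRchitect implementation, and the function name `smooth_bounds` and the three-atom example are assumptions introduced for illustration. The upper bound of 5.6 Å and the lower bound of 1.9 Å are taken from the text.

```python
def smooth_bounds(lower, upper):
    """Tighten symmetric lower/upper distance-bound matrices with the triangle inequality.

    upper: u[i][j] <= u[i][k] + u[k][j]
    lower: l[i][j] >= l[i][k] - u[k][j]  and  l[i][j] >= l[k][j] - u[i][k]
    """
    n = len(upper)
    u = [row[:] for row in upper]
    l = [row[:] for row in lower]
    # Upper bounds: all-pairs shortest paths (Floyd-Warshall style).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if u[i][j] > u[i][k] + u[k][j]:
                    u[i][j] = u[i][k] + u[k][j]
    # Lower bounds: raised using the already smoothed upper bounds.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                candidate = max(l[i][k] - u[k][j], l[k][j] - u[i][k])
                if l[i][j] < candidate:
                    l[i][j] = candidate
    return l, u

UNKNOWN = 999.0   # placeholder for "no NOE observed" (assumption)
LOWER = 1.9       # lower bound used in the text, in angstroms

# Three amide protons: NOEs 0-1 and 1-2 in the tightest class (5.6 angstroms),
# no NOE observed between 0 and 2.
upper = [[0.0, 5.6, UNKNOWN],
         [5.6, 0.0, 5.6],
         [UNKNOWN, 5.6, 0.0]]
lower = [[0.0, LOWER, LOWER],
         [LOWER, 0.0, LOWER],
         [LOWER, LOWER, 0.0]]

smoothed_lower, smoothed_upper = smooth_bounds(lower, upper)
```

In this example the unobserved 0-2 pair inherits an upper bound of 11.2 Å from the two observed NOEs, which shows how even very loose constraints propagate through the structure before embedding.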
Without any further refinement, this procedure yielded a topological fold for L78K-TRX that is largely homologous to wild type, as illustrated in Figure 5. A central five-strand β sheet is formed (black segments) surrounded by helix-like domains that correspond closely to helical sequences in wild type (shading). Furthermore, reverse turns (white) in the L78K-TRX fold also appear in the expected positions. Long range distance constraints (>5 Å) proved important in determining the tertiary fold, as illustrated in Figure 5 by four such NOEs observed in L78K-TRX (drawn on the wild type structure) that link different types of secondary structure. One segment, the C-terminal helix, lacked NOEs to other secondary structural elements and is the only substructure not properly positioned. Thus, using only amide proton NOEs (3.6 per residue) as distance geometry constraints, a generally correct mainchain fold of L78K-TRX was obtained.
Figure 5. X-ray crystal structure of wild type thioredoxin (A) and the calculated structure of L78K-TRX (B). Distances between mainchain amide protons are given in Å.

A more detailed evaluation of the extent to which spin-diffusion contributes to NOEs in perdeuterated HCA II and L78K-TRX is under way. Modeling studies with three spins suggest that NOESY mixing times of up to 600 ms may be employed without significant (<25%) spin-diffusion contributing to the NOE intensity for long range HN pairs. Build-up curves for measured NOEs of long and short range protons in L78K-TRX are illustrated in Figure 6. Up to a mixing time of 600 ms, NOEs arising from long range (~6 Å) HN pairs continue to intensify, while NOE intensities for short range (~2.5 Å) HN pairs begin to decrease.
Figure 6. NOE build-up curves for a sample of ten long range (A) and ten short range (B) amide pairs in L78K thioredoxin. (Panel A: average distance of 6.2 Å. Panel B: average distance of 2.5 Å. Horizontal axes: mixing time, 200 to 600 ms.)

Once a global fold has been established for a protein, a complete high-resolution 3D structure of the protein can then be calculated using distance constraints derived from 4D 13C/15N-separated and 13C/13C-separated NOESY data. The analysis of these 4D NOESY data sets should be facilitated by the previous determination of the protein global fold.
IV. Conclusions

These studies indicate that perdeuteration can be achieved in proteins expressed in several different E. coli strains by growing selected cells in D2O media. Complete deuteration provides significant signal-to-noise enhancement in heteronuclear NMR assignment and structure determination experiments which use the amide proton for detection. Using a perdeuterated 13C/15N sample, we have completed the mainchain 1HN, 15N, and 13C and the aliphatic sidechain 13C assignments for the 259 residue protein HCA II utilizing the strategies outlined above. We are in the process of analyzing 4D 15N/15N-separated NOESY data on both perdeuterated HCA II and a mutant of thioredoxin in order to generate distance constraints which will be used to determine the global folds of these proteins. The data include particularly useful longer range constraints, often extending to greater than 7 Å. Our initial application of the strategy to the L78K-TRX mutant protein is highly encouraging and illustrates the importance of long range NOE constraints in tertiary structure evaluation.
The strategies we have outlined here should be applicable to proteins with rotational correlation times substantially longer than HCA II.
Acknowledgments

The Duke University NMR Center was established with grants from the NIH, NSF, and the North Carolina Biotechnology Center, which are gratefully acknowledged. This work was supported in part by the NIH research grant GM 41829. The authors thank Homme W. Hellinga for providing the expression strain and facilities to purify L78K-TRX.
References

1) Venters, R.A., Farmer, B.T. II, Fierke, C.A., and Spicer, L.D. (1996) J. Mol. Biol., in press.
2) de Lorimier, R.M., Hellinga, H., and Spicer, L.D. (1996) Protein Science, in press.
3) Bax, A., and Grzesiek, S. (1993) Acc. Chem. Res. 26, 131-138.
4) Muhandiram, D.R., and Kay, L.E. (1994) J. Magn. Reson., series B 103, 203-216.
5) Grzesiek, S., Anglister, J., Ren, H., and Bax, A. (1993) J. Am. Chem. Soc. 115, 4369-4370.
6) Yamazaki, T., Muhandiram, R., and Kay, L.E. (1994) J. Am. Chem. Soc. 116, 8266-8278.
7) Yamazaki, T., Lee, M., Revington, M., Mattiello, D.L., Dahlquist, F.W., Arrowsmith, C.H., and Kay, L.E. (1994) J. Am. Chem. Soc. 116, 6464-6465.
8) Venters, R.A., Huang, C.-C., Farmer, B.T. II, Trolard, R., Spicer, L.D., and Fierke, C.A. (1995) J. Biomol. NMR 5, 339-344.
9) Farmer, B.T. II, and Venters, R.A. (1995) J. Am. Chem. Soc. 117, 4187-4188.
10) Farmer, B.T. II, and Venters, R.A. (1996) J. Biomol. NMR 7, 59-71.
11) Venters, R.A., Metzler, W.J., Spicer, L.D., Mueller, L., and Farmer, B.T. II (1995) J. Am. Chem. Soc. 117, 9592-9593.
12) Nair, S.K., Calderone, T.L., Christiansen, D.W., and Fierke, C.A. (1991) J. Biol. Chem. 266, 17320-17325.
13) Rosenberg, A.H., Lade, B.N., Chui, D.S., Lin, S.W., Dunn, J.J., and Studier, F.W. (1987) Gene 56, 125-135.
14) Studier, F.W., and Moffatt, B.A. (1986) J. Mol. Biol. 189, 113-130.
15) Khalifah, R.G., Strader, D.J., Bryant, S.H., and Gibson, S.M. (1977) Biochemistry 16, 2241-2247.
16) Verpoorte, J.A., Mehta, S., and Edsall, J.T. (1967) J. Biol. Chem. 242, 4221-4229.
17) Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York.
18) Carlsson, U., Henderson, L.E., and Lindskog, S. (1973) Biochim. Biophys. Acta 310, 376-387.
19) Wishart, D.S., and Sykes, B.D. (1994) J. Biomol. NMR 4, 171-180.
20) Metzler, W.J., Constantine, K.L., Friedrichs, M.S., Bell, A.J., Ernst, E.G., Lavoie, T.B., and Mueller, L. (1993) Biochemistry 32, 13818-13829.
21) Spera, S., and Bax, A. (1991) J. Am. Chem. Soc. 113, 5490-5492.
22) James, T.L. (1994) Methods in Enzymology 239, 416-439.
23) Zhao, D., and Jardetzky, O. (1994) J. Mol. Biol. 239, 601-607.
24) Clore, G.M., Robien, M.A., and Gronenborn, A.M. (1993) J. Mol. Biol. 231, 82-102.
25) Hoogstraten, C.G., and Markley, J.L. (1996) J. Mol. Biol. 258, 334-348.
1H-NMR EVIDENCE FOR TWO BURIED ASN SIDE-CHAINS IN THE cMYC-MAX HETERODIMERIC α-HELICAL COILED-COIL

Pierre Lavigne, Matthew P. Crump, Stephane M. Gagne, Brian D. Sykes, Robert S. Hodges and Cyril M. Kay

Department of Biochemistry and the Protein Engineering Network of Centres of Excellence, University of Alberta, Edmonton, Alberta, CANADA T6G 2S2
I. INTRODUCTION

The Leucine Zipper (LZ) is a dimerization motif found in the b-LZ and b-HLH-LZ transcription factor families (1,2). Upon dimerization, LZs fold into parallel, two-stranded α-helical coiled-coils (3-6). The primary structure of coiled-coil-forming proteins is characterized by the heptad repeat (abcdefg)n, where Leu residues are conserved at positions d, positions a are mostly occupied by β-branched and hydrophobic residues, and positions e and g are often occupied by acidic or basic residues (7,8). The tertiary interactions of the dimeric LZ, or parallel two-stranded α-helical coiled-coil, are described by the knobs-into-holes model (3,9).

In the b-LZ family (e.g., GCN4 and c-Jun), Asn residues are found to be conserved at an a position in the heptad repeat (1,10). A pair of Asn side-chains destabilizes the homodimeric LZ coiled-coil compared to the hydrophobic side-chains otherwise conserved at this position (3,10). From a biological point of view, a lower stability for homodimeric species will facilitate the reassortment of LZs, which is desirable in the light of their regulative (heterodimerization) role (3,11). It has also been shown in a series of GCN4 LZ mutants and de novo designed LZs that replacement of the Asn residue by aliphatic residues leads to the formation of oligomers, namely trimeric and tetrameric species (12-14). In addition to decreasing the stability of dimeric LZs, Asn side-chains can impose the correct dimer orientation (parallel and in-register) and specify folding of dimeric species over oligomeric ones (3,12,13). The crystal structure of the GCN4 homodimeric LZ indicates that the Asn side-chains pack asymmetrically at the interface of the dimer, where an interhelical H-bond between the δNH2 of one Asn side-chain and the Oδ of the other is formed (3). On the other hand, solution NMR studies on the GCN4 (15) and c-Jun homodimeric LZ (16) have shown that they are symmetric.
It has been shown that the Asn side-chains are most likely flipping between two distinct, symmetry-related H-bonded conformations in the fast chemical exchange regime at room temperature (14).

The oncoprotein c-Myc (a b-HLH-LZ protein) heterodimerizes specifically with the protein Max (another b-HLH-LZ protein) to bind DNA and activate transcription (17,18). The LZ domain of Max contains two Asn residues at a positions (Fig. 1A). In two previous studies (19,20) it has been shown that the c-Myc and Max LZs form a heterodimeric α-helical coiled-coil with a high specificity. This has led us (19) and others (20) to propose that the LZ domains of Max and c-Myc are responsible for the specificity of molecular recognition in vivo. Molecular models describing interhelical salt bridges and hydrogen bonds that might be responsible for the specificity have been proposed (19,20). Amongst other features, the two Asn side-chains found on the Max LZ are proposed to be buried and to form interhelical side-chain to side-chain and side-chain to main-chain hydrogen bonds.

In this paper we focus, using proton NMR spectroscopy, on the interactions of the two Asn side-chains at the interface of the c-Myc-Max heterodimeric LZ. We report interhelical NOEs between the Hδ protons of Max Asn5a and Asn19a and protons from the side-chains of the c-Myc LZ forming the holes in which they are proposed to pack according to the knobs-into-holes model, indicating that they are indeed buried. Moreover, Max Asn19a HZ shows an interhelical NOE to a backbone amide proton of the c-Myc LZ as well as slow amide exchange, indicating that Max Asn19a is potentially forming side-chain to main-chain hydrogen bonds. As discussed, these results support the molecular models for the c-Myc-Max heterodimeric coiled-coil and shed more light on the putative role of the conserved Asn residues in the mechanism of heterodimerization in this b-HLH-LZ subfamily of transcription factors.

II. MATERIAL AND METHODS

Solid phase peptide synthesis of the c-Myc and the Max LZs, characterization by mass spectrometry, purification by reversed-phase HPLC and the formation of the disulfide linked c-Myc-Max heterodimeric LZ have been described elsewhere (19). All proton NMR spectra were recorded on a Varian Unity 600 at 25 °C.
6 to 10 mg of the disulfide linked c-Myc-Max heterodimeric LZ were dissolved in 0.5 mL of potassium phosphate buffer (50 mM, 10% D2O / 90% H2O, pH 4.7) containing 100 mM KCl and 1 mM 2,2-dimethyl-2-silapentane-5-sulfonic acid (DSS) to yield solutions ranging from 0.75 to 1.25 mM. Proton resonances were assigned from two-dimensional double quantum filtered correlation spectroscopy (DQF-COSY; (21)), two-dimensional total correlation spectroscopy (TOCSY; mixing time = 50 ms; (22)) and two-dimensional nuclear Overhauser enhancement spectroscopy (NOESY; mixing times = 150 and 200 ms; (23)) experiments. Sequential assignment of the proton resonances was performed as described by Wüthrich (24). The spectra were acquired with 2048 t2 complex data points and 256 t1 increments in the phase sensitive mode with quadrature detection using the method described by States et al. (25). The water resonance was suppressed by continuous irradiation at its resonance frequency during the 1.5 s relaxation period used in the NOESY, DQF-COSY and TOCSY experiments and during the mixing period of the NOESY experiments. The amide exchange experiments were carried out by acquiring 1-D spectra as described elsewhere (19) after dissolving the lyophilized sample in 100% D2O. pH readings were not corrected for the isotopic effect.

III. RESULTS

We present in Fig. 1A the primary sequences of the c-Myc and Max LZs. Fig. 1B shows the arrangement of the heterodimeric LZ in a helical wheel representation.
   defgabcdefgabcdefgabcdefgabcd
CGGMRRKNDTHQQDIDDLKRQNALLEQQVRAL   Max LZ
CGGVQAEEQKLISEEDLLRKRREQLKHKLEQL   c-Myc LZ

Figure 1. A. Primary structures of the c-Myc and Max LZs (heptad positions are shown above the sequences). Sequences are taken from Zervos et al. (26) and renumbered. B. Helical wheel diagram of the c-Myc-Max heterodimeric LZ. Potential interhelical electrostatic interactions have been discussed elsewhere (19,20). In the knobs-into-holes model (9), side-chains (knobs) at position a in the heptad repeat pack in the holes formed by consecutive g and a residues and two d positions. Accordingly, Max Asn5a is proposed to pack in the hole formed by Val1d, Glu4g, Glu5a and Leu8d on the c-Myc LZ. Similarly, Max Asn19a is proposed to pack in the hole formed by Leu15d, Arg18g, Arg19a and Leu22d.
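The residue labels used in the legend (for example Asn5a or Leu22d) combine the residue type, its number and its heptad position. The short sketch below shows how these labels follow from the sequences in Fig. 1A. It assumes, as the labels in the text imply, that numbering starts after the N-terminal CGG linker and that residue 1 occupies heptad position d; the helper names are hypothetical.

```python
HEPTAD = "abcdefg"
LINKER = "CGG"  # assumed to precede residue 1 of each zipper

MAX_LZ = "CGGMRRKNDTHQQDIDDLKRQNALLEQQVRAL"
MYC_LZ = "CGGVQAEEQKLISEEDLLRKRREQLKHKLEQL"

def heptad_position(residue_number, first_position="d"):
    """Heptad letter of a residue, given the letter assigned to residue 1."""
    offset = HEPTAD.index(first_position)
    return HEPTAD[(offset + residue_number - 1) % 7]

def label(sequence, residue_number):
    """Label such as 'N5a': one-letter residue code, number, heptad position."""
    core = sequence[len(LINKER):]
    return f"{core[residue_number - 1]}{residue_number}{heptad_position(residue_number)}"

# The two Asn knobs on Max and the c-Myc residues proposed to form their holes.
knobs = [label(MAX_LZ, 5), label(MAX_LZ, 19)]
hole_for_asn5 = [label(MYC_LZ, n) for n in (1, 4, 5, 8)]
hole_for_asn19 = [label(MYC_LZ, n) for n in (15, 18, 19, 22)]
```

Each hole is built from one g residue, the following a residue and the two flanking d residues, which matches the description of the knobs-into-holes model in the legend.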
A. Secondary structure of the disulfide linked c-Myc-Max heterodimeric LZ

As shown previously (19) by circular dichroism spectroscopy, the disulfide linked c-Myc-Max heterodimeric LZ is highly helical (>90%) between pH 4.0 and pH 7.0. We present in Fig. 2 the amide-amide region from a NOESY spectrum recorded at 25 °C and pH 4.7. Extensive sequential dNN(i, i+1) NOEs typical for α-helices (24) can be seen. Despite poor chemical shift dispersion of the α-protons, a significant portion of the short range dαN(i, i+3 and i, i+4) and dαβ(i, i+3) α-helical connectivities (24) could be unambiguously identified. In summary, enough α-helical connectivities encompassing all the primary structure of both LZs were observed to ensure that the heterodimeric disulfide linked c-Myc-Max LZ has an extensive α-helical secondary structure.
Figure 2. Backbone and side-chain amide region of a 600 MHz NOESY spectrum of the disulfide linked c-Myc-Max LZ at 25 °C. Mixing time = 200 ms, pH 4.7. Labelled is the interhelical NOE between Max Asn19a Hδ (HZ) and c-Myc Arg19a backbone HN. (Horizontal axis: F2, ppm.)

B. Tertiary interactions involving the two Asn side-chains at a positions

The spin systems of Max Asn5a and Max Asn19a have been completely assigned. As these residues are proposed to be buried at the interface of the heterodimer, long range (interhelical) NOEs involving their side-chains should enable us to probe their tertiary interactions and verify whether they are indeed buried. Figure 2 shows an NOE between Max Asn19a Hδ (Z) and c-Myc Arg19a backbone HN. In addition, both Hδ side-chain protons show NOEs to c-Myc Arg19a Hα and one of c-Myc Arg19a Hβ (Fig. 3). Max Asn5a Hδ side-chain protons show NOEs with c-Myc Glu5a Hα, one of c-Myc Glu4g Hβ and the protons of one of the c-Myc Leu8d methyl groups (Fig. 3). The NOEs from the Hδ protons of Max Asn5a and Max Asn19a connect these side-chains to the residues on the c-Myc LZ that form the holes in which they would pack according to the knobs-into-holes model (see legend of Fig. 1). This strongly supports the proposition that both Asn side-chains are buried at the interface of the c-Myc-Max heterodimeric LZ.
Figure 3. Section of a 600 MHz NOESY spectrum of the disulfide linked c-Myc-Max LZ at 25 °C. Mixing time = 200 ms, pH 4.7. Labelled are NOEs between the Hδ protons of Max Asn5a and Max Asn19a and protons from residues forming the holes on the c-Myc LZ in which they are predicted to pack. (Labelled cross peaks: Max N5 HE/c-Myc E4 Hβ; Max N5 HE/c-Myc L8 Hδ; Max N5 HZ/c-Myc L8 Hδ; Max N19 HE/c-Myc R19 Hβ; Max N19 HZ/c-Myc R19 Hβ; Max N19 HZ/c-Myc R19 Hα. Horizontal axis: F2, ppm.)
C. Amide exchange experiments

To further verify whether the Asn side-chains are buried at the interface of the c-Myc-Max heterodimeric LZ, we studied the susceptibilities of their Hδ protons to deuterium exchange. We show in Fig. 4 the amide and side-chain NH region of 1-D spectra in water (Top) and 45 minutes after dissolution of the lyophilized sample in 100% D2O (Bottom), recorded at 25 °C. As shown, the Hδ protons of Max Asn19a are not completely exchanged after 45 minutes while those of Max Asn5a are. The fact that the Hδ protons of Max Asn19a exchange slowly compared to all other side-chain amide protons indicates that they are indeed buried and most likely involved in side-chain to main-chain H-bonds, as indicated by the NOE between Max Asn19a Hδ (Z) and c-Myc Arg19a HN (Fig. 2). On the other hand, Max Asn5a Hδ protons are completely exchanged after 45 minutes, indicating that they are on a time average more solvent exposed than those of Max Asn19a. It is possible that local structural fluctuations in the folded form of the c-Myc-Max heterodimeric LZ occur at the N terminus.
Figure 4. Proton NMR amide exchange study of the disulfide linked heterodimeric c-Myc-Max LZ at 25 °C. Top, 1-D spectrum in H2O 90% : D2O 10%, pH 4.7. Bottom, 1-D spectrum of the lyophilized sample dissolved in 100% D2O, 45 minutes after dissolution. (Horizontal axis: 9.0 to 6.4 ppm.)
IV. DISCUSSION

The results presented here indicate that the two Asn side-chains found on the Max LZ are buried at the interface of the c-Myc-Max heterodimeric LZ. In summary, we have described NOEs between the Hδ protons of both Asn side-chains and protons from the residues forming the holes in which they are predicted to pack on the c-Myc LZ. We report an NOE between the Max Asn19a HZ proton and the c-Myc Arg19a backbone NH, and slow deuterium exchange for both of the Max Asn19a Hδ protons. These results support the proposition, based on molecular modeling, that this Asn side-chain forms H-bonds with the backbone carbonyl of c-Myc Leu15d and the backbone NH of c-Myc Arg19a (20).

In a previous study we proposed that Max Asn5a was buried and involved in the stabilization of the potentially buried c-Myc Glu5a carboxylate through its NH2 moiety. This carboxylate was proposed to be involved in a critical buried salt bridge with Max His8d. The proposed formation of this buried salt bridge was supported by the observed elevated pKa of Max His8d in the folded form of the c-Myc-Max heterodimeric LZ compared to its pKa in a truncated and unfolded version of the heterodimer (19).

In the crystal structure of the c-Fos-c-Jun heterodimeric LZ, the conserved Asn at position a on the c-Jun LZ forms H-bonds with the Glu side-chain at the flanking position g on the c-Fos LZ (5). Interestingly, a similar interaction could be predicted to occur in the c-Myc-Max heterodimeric LZ between Max Asn5a and c-Myc Glu4g (Fig. 1B). On the other hand, according to the data presented here, it appears that the side-chain of Max Asn5a points in the direction of c-Myc Glu5a rather than that of c-Myc Glu4g.

While it has been demonstrated that the conserved Asn side-chain at a positions undergoes a conformational exchange process at the interface of the homodimeric c-Jun LZ (6,14), it appears that the side-chains of Max Asn5a and Max Asn19a adopt a single conformation at the interface of the c-Myc-Max heterodimeric LZ. By lowering the temperature from 30 to 0 °C, Junius et al. (14) observed the gradual disappearance of the 15Nδ-Hδ correlations (15N-1H HSQC) of the conserved Asn side-chain, indicating a transition from a rapid to a medium rate of chemical exchange at lower temperature. In the present case, no significant change in the intensity of the Hδ proton signals of Max Asn5a and Asn19a has been observed at lower temperature (5 °C, data not shown).

The two Asn side-chains found at a positions should contribute to heterodimerization specificity by destabilizing the Max homodimeric LZ. From the data presented here, it appears that Max Asn19a can form interhelical side-chain to main-chain H-bonds as proposed by Muhle-Goll et al. (20). Moreover, the Hδ protons of Max Asn19a are protected from deuterium exchange, indicating that these H-bonds are potentially strong and therefore favour heterodimerization.
As proposed before, Max Asn5a could also contribute to heterodimerization specificity by solvating through H-bonds the carboxylate of c-Myc Glu5a at the interface of the heterodimeric c-Myc-Max LZ (19). While the data presented here are not sufficient to draw a general conclusion about this putative interaction, it is interesting to note that the LZs of other proteins known to interact with Max, namely Mad (27) and Mxi1 (26), have a conserved acidic side-chain at the position equivalent to c-Myc Glu5a (19). This suggests an important role for Max Asn5a in stabilizing the conserved acidic side-chain to form the critical buried salt bridge for molecular recognition with Max His8d, as discussed before (19).

ACKNOWLEDGEMENTS

This work was supported by the Protein Engineering Network of Centres of Excellence of Canada. Pierre Lavigne acknowledges the Medical Research Council of Canada for a postdoctoral fellowship.
REFERENCES

1. Hurst, H. (1994). Protein Profile 1, 123-152.
2. Littlewood, T.D. & Evan, G.I. (1994). Protein Profile 1, 639-741.
3. O'Shea, E.K., Klemm, J.D., Kim, P.S. & Alber, T. (1991). Science 254, 539-544.
4. Ferré-D'Amaré, A.R., Prendergast, G.C., Ziff, E.B. & Burley, S.K. (1993). Nature (London) 363, 38-45.
5. Glover, J.N.M. & Harrison, S.C. (1995). Nature (London) 373, 257-261.
6. Junius, F.K., O'Donoghue, S.I., Nilges, M., Weiss, A.S. & King, G.F. (1996). J. Biol. Chem. 271, 13663-13667.
7. Hodges, R.S., Sodek, J., Smillie, L.B. & Jurasek, L. (1972). Cold Spring Harbor Symp. Quant. Biol. 37, 299-310.
8. McLachlan, A.D. & Stewart, M. (1975). J. Mol. Biol. 98, 293-304.
9. Crick, F.H.C. (1953). Acta Crystallogr. 6, 689-693.
10. Hu, J.C. & Sauer, R.T. (1992). Nucleic Acids Mol. Biol. 6, 82-101.
11. Wendt, H., Berger, C., Baici, A., Thomas, R.M. & Bosshard, H.R. (1995). Biochemistry 34, 4097-4107.
12. Harbury, P.B., Zhang, T., Kim, P.S. & Alber, T. (1993). Science 262, 1401-1407.
13. Lumb, K.J. & Kim, P.S. (1995). Biochemistry 34, 8642-8648.
14. Junius, F.K., MacKay, J.P., Bubb, W.A., Jensen, S.A., Weiss, A.S. & King, G.F. (1995). Biochemistry 34, 6164-6174.
15. Oas, T.G., McIntosh, L.P., O'Shea, E.K., Dahlquist, F.W. & Kim, P.S. (1990). Biochemistry 29, 2891-2894.
16. Junius, F.K., Weiss, A.S. & King, G.F. (1993). Eur. J. Biochem. 214, 415-424.
17. Amati, B., Dalton, S., Brooks, M.W., Littlewood, T.D., Evan, G.I. & Land, H. (1992). Nature (London) 359, 423-426.
18. Amati, B., Brooks, M.W., Levy, N., Littlewood, T.D., Evan, G.I. & Land, H. (1993). Cell 72, 233-245.
19. Lavigne, P., Kondejewski, L.H., Houston, M.E. Jr., Sönnichsen, F.D., Lix, B., Sykes, B.D., Hodges, R.S. & Kay, C.M. (1995). J. Mol. Biol. 254, 505-520.
20. Muhle-Goll, C., Nilges, M. & Pastore, A. (1995). Biochemistry 34, 13554-13564.
21. Rance, M., Sørensen, O.W., Bodenhausen, G., Wagner, G., Ernst, R.R. & Wüthrich, K. (1983). Biochem. Biophys. Res. Commun. 117, 479-485.
22. Davis, D.G. & Bax, A. (1985). J. Am. Chem. Soc. 107, 2821-2822.
23. Jeener, J., Meier, B.H., Bachmann, P. & Ernst, R.R. (1979). J. Chem. Phys. 71, 4546-4553.
24. Wüthrich, K. (1986). NMR of Proteins and Nucleic Acids. John Wiley & Sons, New York.
25. States, D.J., Haberkorn, R.A. & Ruben, D.J. (1982). J. Magn. Reson. 48, 286-292.
26. Zervos, A.S., Gyuris, J. & Brent, R. (1993). Cell 72, 223-232.
27. Ayer, D.E., Kretzner, L. & Eisenman, R.N. (1993). Cell 72, 211-222.
NMR confirms the presence of the amino-terminal helix of group II phospholipase A2 in solution

Roman Jerala*, Paulo F.F. Almeida1, Rodney L. Biltonen and Gordon S. Rule2

Depts. of Biochemistry and Pharmacology, University of Virginia School of Medicine, Charlottesville, VA 22908 and *Laboratory for Molecular Modeling and NMR Spectroscopy, National Institute of Chemistry, 1115 Ljubljana, Slovenia
I. Introduction

Phospholipases A2 (E.C. 3.1.1.4) are widely distributed, physiologically important phospholipid degrading enzymes. Several tertiary and a much greater number of primary structures of PLA2 have been determined (Scott and Sigler, 1994). Based on structural similarity, the low molecular weight PLA2 have been classified into three groups (Heinrikson et al., 1977). Often both group I and II PLA2 are present in the same organism. For example, in humans the group I PLA2 is present in the pancreas and the group II extracellular PLA2 are found in platelets and arthritic synovial joints. The tertiary structure and disulfide bond pattern are similar between the two groups, with the following main differences: group II PLA2 have an extension of approximately 7 residues on the carboxy terminal part, which is also connected by a disulfide bond; group I PLA2 have an insertion between residues 54-56; and in group I the amino terminal helix is connected by a disulfide bond at residue 11 (Renetseder et al., 1985; Scott and Sigler, 1994).

The activity of both group I and II PLA2 is calcium dependent and they both exhibit interfacial activation, i.e. the activity against aggregated substrate such as micelles or vesicles is several orders of magnitude higher than against monomeric phospholipids (Slotboom et al., 1982). The activity against the aggregated substrate depends on the physical state of the membrane and its chemical composition (Burrack et al., 1993). Despite the evidence demonstrating the influence of the membrane structure on the enzymatic activity, the involvement of a putative conformational change of the enzyme remains an open question. Fluorescence studies have indicated the existence of spectroscopically different states of the enzyme (Jain and Maliwal, 1993; Bell and Biltonen, 1992). Recent NMR determination of the solution structure of the pancreatic group I PLA2 showed that, in contrast to the crystal structures, the beginning of the amino terminal helix is partly disordered (Van den Berg et al., 1995a). Moreover, the solution structure of the same PLA2 complexed with the inhibitor and micelle displayed ordering of this region, which thus became more similar to the crystal structure (Van den Berg et al., 1995b). The authors have suggested that this conformational change is essential for interfacial activation.

We have performed an NMR study on a group II PLA2 from the snake Agkistrodon piscivorus piscivorus, which has previously been used extensively in biophysical studies of membrane-protein interactions. Improvement in the production and refolding procedure for recombinant PLA2 has allowed preparation of 13C and 15N labeled protein, allowing its assignment and the measurement of several structural parameters which describe the conformation of the amino terminal helix. These parameters include NOE signals, main chain dihedral angles, coupling constants, amide hydrogen exchange rates and the deviation of the chemical shift. These data provide reliable conclusions about the secondary structure of the protein even before building the complete tertiary structure.

1 Present address: UCEH, Universidade do Algarve, 8000 Faro, Portugal.
2 Present address: Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213.
II. Materials and Methods

The gene for the App D49 PLA2 has been synthesized and subcloned into the pET3a expression vector as described (Lathrop et al., 1992). Production and refolding of recombinant protein was carried out as described (Jerala et al., 1996). The protein for our sample was dissolved in either 5 or 100% D2O, 20 mM sodium phosphate buffer, pH 4.3 and 100 mM KCl. All experiments were performed at 40 °C. The concentration of the protein was approximately 1 mM.

Spectra were recorded on a 500 MHz Varian Unity Plus NMR spectrometer, equipped with a Nalorac triple resonance probe with a Z gradient coil. Water suppression was achieved with gradients applied primarily as zz filters as described (Bax and Pochapsky, 1992) and with spin-lock purge pulses (Messerle et al., 1989). The States-TPPI method of data collection was used for quadrature detection (Marion et al., 1989a). Data in the indirect heteronuclear dimensions were typically extended by 30% using linear prediction (Olejniczak and Eaton, 1990). Residual water signal was removed by deconvolution in the time domain (Marion et al., 1989b). Spectra were transformed and processed using Felix software (MSI).

For hydrogen exchange experiments, 15N labeled protein was dissolved in H2O. After freeze-drying it was dissolved in an equal volume of D2O as before. HSQC spectra recording was started immediately and spectra were recorded every two hours for one day and then after 2 days and 2 weeks. Intensities of the HN-N cross peaks were integrated and the time course was fitted to an exponential decay function. 3J HN-Hα coupling constants were calculated using data from the 3D HNHA experiment (Vuister and Bax, 1993). The amount of magnetization transferred from the HN to the Hα nucleus was evaluated from the ratio of the diagonal to the cross peak intensities.

To obtain resonance assignments a set of heteronuclear multidimensional experiments was performed: HNCA, HN(CO)CA, HNCO, HNCACB, CBCA(CO)NH, HBHA(CBCACO)NH, HBHA(CBCA)NH, HCACO, and HN(CA)HA (Jerala et al., 1996, parameters in Table 1). 2D NOESY, 3D 15N NOESY-HSQC and 3D TS (15N, 13C) HSQC-NOESY-HSQC (Jerala and Rule, 1995) experiments were performed as described (Jerala et al., 1996). NOE cross peaks were assigned based on the previous assignment of the protons. Spectra were referenced using TSP in the proton dimension. In the 15N and 13C dimensions the shifts were referenced by calculation from the ratio of the zero point frequencies of the standard and corrected for the temperature (Edison et al., 1994). Chemical shift deviation was calculated for Hα, Cα, CO and Cβ nuclei. The consensus chemical shift index was calculated using the program CSI (Wishart and Sykes, 1994).
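The hydrogen exchange analysis described above fits the decay of each HN-N cross peak intensity to a single exponential. The sketch below shows one common way to do this, a linear least-squares fit of ln I against time; it is an illustration under stated assumptions, not the fitting routine used by the authors. The function name `fit_exponential` and the synthetic intensities are hypothetical.

```python
import math

def fit_exponential(times, intensities):
    """Fit I(t) = I0 * exp(-k * t) by linear least squares on ln I.

    Returns (I0, k). Intensities must be positive; noise is not weighted.
    """
    ys = [math.log(v) for v in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sxy / sxx
    k = -slope
    i0 = math.exp(mean_y - slope * mean_t)
    return i0, k

# Synthetic, noise-free intensities sampled every two hours (illustration only).
times_h = [0.0, 2.0, 4.0, 6.0, 8.0]
synthetic = [100.0 * math.exp(-0.25 * t) for t in times_h]

i0, k = fit_exponential(times_h, synthetic)
half_life_h = math.log(2) / k   # time for the peak to fall to half its intensity
```

With real data a nonlinear fit, possibly with a baseline term, may be preferable, because taking logarithms amplifies the noise of weak peaks at long exchange times.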
III. Results A. Resonance Assignment Doubly ¹³C and ¹⁵N labeled PLA2 was prepared by overexpression of the synthetic gene for the App D49 PLA2 in E. coli. The protein was refolded from solubilized inclusion bodies. The amino acid residue type for each spin system was determined from the characteristic Cα and Cβ chemical shifts obtained from the HNCA, HNCACB and CBCA(CO)NH experiments. Gly, Ser, Thr and Ala residues could be uniquely identified, while the identity of the other residues was established from the NOEs and from the TOCSY pattern. Sequential assignment was obtained as described in detail (Jerala et al., 1996). Combinations of experiments were used: those which correlate two or more nuclei in adjacent residues and those which identify intraresidual cross peaks. For each of those combinations of experiments, a more or less degenerate connectivity pathway between the residues was established. Separate connectivities were established for the correlations between Cα; Cα and Cβ; Hα; and Hα and Hβ. Figure 1 illustrates the parallel sequential assignment based on the inter- and intraresidual Cα and Hα correlations. Due to the resonance overlap in the HN-N plane (Figure 2), particularly in the α-helical regions, degeneracies had to be resolved using CO chemical shifts. Combination of all the connectivities effectively eliminated degeneracies and allowed complete sequential assignment of the backbone. Side chains were assigned from the ¹⁵N-separated TOCSY-HSQC and, for the aliphatic chains, from the 3D HCCH-TOCSY experiments. Assignment was also confirmed with short and medium range NOEs.
Figure 1. Parallel sequential assignment based on the HNCA (left panel) and HN(CA)HA spectrum (right panel). Boxes denote intraresidual cross peaks which are connected by arrows to the interresidual cross peaks
B. Conformation of the N-terminal Residues a. NOE. NOE cross peaks were assigned from the 2D and 3D NOESY experiments. Sequential HN-HN distances in regular α-helices typically measure around 2.8 Å. Intermediate intensity sequential HN-HN NOEs, another sign of α-helix, were observed for Leu2, Phe3, Gln4, Glu6 and most of the residues up to Leu19. Even more characteristic distances for α-helices are those between Hα(i) and HN(i+3), which are 3.4 Å apart. Strong NOEs have been observed between the Hα of Leu2 and the HN of Phe5, a typical sign of the α-helical conformation. The HN of Leu2 was also found close to the Hα of Lys60, a distance restraint consistent with a formed and docked α-helix as found in the crystal structures of other PLA2. b. Coupling Constants. The ³J(HN-Hα) coupling constant is related to the φ dihedral angle by the Karplus equation. Coupling constants smaller than 6 Hz and larger than 8 Hz are indicative of the α-helical and β-sheet conformation, respectively (Wüthrich, 1986). Coupling constants were measured, and for the amino terminal residues up to Lys11 they were smaller than 6 Hz.
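The relation between the coupling constant and the φ angle can be sketched numerically. The example below assumes the Karplus parametrization reported by Vuister and Bax (1993), J = A cos²(φ - 60°) + B cos(φ - 60°) + C with A = 6.51, B = -1.76 and C = 1.60 Hz; the text does not state which coefficients were used, so these values and the function names are illustrative only.

```python
import math


def karplus_3j_hn_ha(phi_deg, a=6.51, b=-1.76, c=1.60):
    """3J(HN-Halpha) in Hz for a backbone phi angle given in degrees.

    Coefficients are an assumed parametrization (Vuister and Bax, 1993).
    """
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


def classify(j_hz):
    """Apply the 6 Hz / 8 Hz rule of thumb quoted in the text."""
    if j_hz < 6.0:
        return "helix-like"
    if j_hz > 8.0:
        return "sheet-like"
    return "ambiguous"


for phi in (-60.0, -120.0):  # typical helical and extended phi values
    j = karplus_3j_hn_ha(phi)
    print(f"phi = {phi:7.1f} deg  J = {j:5.2f} Hz  {classify(j)}")
```

With these coefficients a helical φ near -60° gives a coupling of about 4 Hz and an extended φ near -120° gives almost 10 Hz, which is why the 6 Hz and 8 Hz limits separate the two conformations.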
Figure 2. 2D ¹H, ¹⁵N HSQC spectrum of App D49 PLA2 (nitrogen axis in ppm). Amino terminal amide signal is labeled.
c. Chemical Shift Index. The deviation of chemical shift from the values for random coil is characteristic of the secondary structure (Dalgarno et al., 1983; Wishart and Sykes, 1994). Cα and CO nuclei exhibit a positive deviation and Hα a negative deviation in α-helical regions, and the opposite in β-sheet regions. Each individual deviation can depend on additional factors besides the secondary structure. Therefore, a consensus of the deviations, the chemical shift index, provides a more reliable indication of the secondary structure of the residue. In the case of App D49 the chemical shift index indicated the existence of an α-helix between residues 2-11. All the carbonyl and Cα carbon nuclei in this part have a positive deviation. All the Hα have a negative deviation, with the exception of Leu2. Two longer helices were identified between residues 39-49 and 81-98 and a short one from 17-21. A two stranded β-sheet was predicted between residues 63-69 and 72-79, in good agreement with long range NOEs across the two strands. d. Hydrogen Exchange Protection. With the deuterium exchange experiment we could evaluate with good precision only amides exchanging at a rate slower than approx. 2 h⁻¹. The amino terminal amide group is generally not observed due to its rapid exchange with water. The fact that we have observed it for the App D49 PLA2 (Fig. 2) indicates that it is probably partially protected from exchange. Amides of Leu2 and Phe3 have exchanged within two hours, while the amide of Gln4 and, more so, that of Phe5 are already protected, indicating that the HN of Phe5 probably participates in a hydrogen bond and
Figure 3. Chemical shift deviation of the assigned residues compared to the random coil values for the first twelve amino terminal residues. Bottom panel shows the consensus chemical shift index for the complete protein.
initiates the α-helix. All other amides up to Thr13 exchanged at a rate slower than 0.5 h⁻¹.
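The exchange rates quoted above come from fitting the decay of HSQC cross peak intensities to an exponential. A minimal sketch of such a fit follows; it uses a log-linear least-squares fit as a stand-in for whatever fitting routine was actually used, and the intensities are synthetic values generated for illustration, not measured data.

```python
import math


def fit_exchange_rate(times_h, intensities):
    """Fit I(t) = I0 * exp(-k t) by least squares on ln(I) versus t.

    Returns (I0, k) with k in h^-1. Non-positive intensities are skipped.
    """
    points = [(t, math.log(i)) for t, i in zip(times_h, intensities) if i > 0]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    i0 = math.exp(mean_y - slope * mean_t)
    return i0, -slope


# Synthetic time course sampled every two hours, as in the protocol.
times = [0.0, 2.0, 4.0, 6.0, 8.0]
peaks = [100.0 * math.exp(-0.3 * t) for t in times]  # hypothetical k = 0.3 h^-1

i0, k = fit_exchange_rate(times, peaks)
print(f"I0 = {i0:.1f}, k = {k:.3f} per hour, half-life = {math.log(2) / k:.2f} h")
```

With spectra recorded every two hours, an amide exchanging much faster than about 2 h⁻¹ has essentially decayed before the first spectrum is finished, which is consistent with the precision limit stated in the text.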
Figure 4. Schematic representation of the NOEs (HN(i), HN(i+1); Hα(i), HN(i+3); Hα(i), HN(i)), ³J(Hα-HN) coupling constants and hydrogen exchange (HE) of the first twelve residues (1 S, 2 L, 3 F, 4 Q, 5 F, 6 E, 7 K, 8 L, 9 I, 10 K, 11 K, 12 M). Decreasing thickness of lines represents strong, medium and weak NOEs and small (<5 Hz) and medium (>5, <9 Hz) coupling constants. Hydrogen exchange: - unprotected, + protected amides.
IV. Conclusions The secondary structure of the App D49 PLA2, as inferred from the chemical shift index, the NOE pattern, coupling constants and the hydrogen exchange experiment, is similar to previously determined PLA2 structures. The consensus chemical shift index identifies two larger α-helices, a short one, and the amino terminal helix comprising residues 2-12. In comparison with the solution structure of the free pancreatic PLA2, the most notable difference in our study is that we find the amino terminal helix to be completely formed in the free enzyme without inhibitor or micelles. Although the tertiary structure is still not completely refined, this statement is supported by the following evidence. Coupling constants are consistent with the helical conformation, and a number of short and medium range NOEs characteristic of the α-helical conformation have been observed. Observation of the amino terminal amide group in the HSQC spectrum indicates that it is partly protected from exchange rather than flexible and oriented towards solvent. Backbone amides of residues 4-13 are protected from hydrogen exchange. The HN of Phe5 is probably hydrogen bonded to the carbonyl group of Ser1. Biochemical studies on the precursor of the bovine phospholipase A2 and chemical modification of the amino terminal amino group have shown that the modified amino terminus does not support increased activity on the aggregated substrate (Verheij et al., 1981). We conclude that there are some differences between the conformations of the amino terminal helices of different PLA2. Whether these differences are a general distinction between the group I and II enzymes remains to be established. However, the beginning of the first helix, where the differences between the two solution structures have been observed, is docked at the base of the β-wing, which has an insertion of the "elapid loop" (residues 62-66) in group I phospholipases.
Indeed, deletion of this loop decreased the mobility of the surrounding residues and increased the activity towards micelles (Kuipers et al., 1989). The helix of the group I PLA2 is disulfide bonded (Cys11-Cys77). Removal of this bond does not affect the catalytic activity, while it has a large impact upon the stability of the protein (Zhu et al., 1995). While the ordered conformation at the amino terminus of the PLA2 is probably necessary for the activity against the aggregated substrate, our observations suggest that the ordering of this helix is not sufficient to explain the interfacial activation of type II PLA2. In this work we have used different techniques from the repertoire of NMR methods. Those techniques have provided independent data which were consistent and permitted conclusions to be reached on the secondary structure of the protein. Acknowledgements We thank Drs. Quang Ye and Brian Lathrop for their help in the purification of ¹⁵N labeled protein and (Q.Y.) assistance in assignment of ¹⁵N spectra. The NMR spectrometer was purchased, in part, with funds from NSF (BIR-9217013). This research was supported by a grant from the NSF (DMB9305002) to G.S.R. and R.L.B. and by a grant from the NIH (GM37658) to R.L.B. R.J. was supported in part by the Ministry of Science and Technology of Slovenia.
References
Bax, A. and Pochapsky, S. (1992) J. Magn. Reson. 99, 638-643.
Bax, A., Vuister, G.W., Grzesiek, S., Delaglio, F., Wang, A.C., Tschudin, R. and Zhu, G. (1994) Methods Enzymol. 239, 79-105.
Bell, J.D. and Biltonen, R.L. (1992) J. Biol. Chem. 267, 11046-11056.
Biltonen, R.L., Lathrop, B.K. and Bell, J.D. (1991) Methods Enzymol. 197, 234-248.
Bodenhausen, G. and Ruben, D.J. (1980) Chem. Phys. Lett. 69, 185.
Burack, W.R., Yuan, Q. and Biltonen, R.L. (1993) Biochemistry 32, 583-589.
Dalgarno, D.C., Levine, A. and Williams, R.J.P. (1983) Biosci. Rep. 3, 443.
Edison, A.S., Abildgaard, F., Westler, W.M., Mooberry, S. and Markley, J.L. (1994) Methods Enzymol. 239, 3-79.
Heinrikson, R.L., Krueger, E.T. and Keim, P.S. (1977) J. Biol. Chem. 252, 4913.
Jain, M.K. and Maliwal, B.P. (1993) Biochemistry 32, 11838-11846.
Jerala, R. and Rule, G.S. (1995) J. Magn. Reson. B 108, 294-298.
Jerala, R., Almeida, P.F.F., Ye, Q., Biltonen, R.L. and Rule, G.S. (1996) J. Biomol. NMR 7, 101-120.
Kuipers, O.P., Thunnissen, M.M.G.M., De Geus, P., Dijkstra, B.W., Drenth, J., Verheij, H.M. and De Haas, G.H. (1989) Science 244, 82-85.
Lathrop, B.K. and Biltonen, R.L. (1992) J. Biol. Chem. 267, 21425-21431.
Marion, D., Ikura, M., Tschudin, R. and Bax, A. (1989a) J. Magn. Reson. 85, 393.
Marion, D., Ikura, M. and Bax, A. (1989b) J. Magn. Reson. 85, 425.
Messerle, B.A., Wider, G., Otting, G., Weber, C. and Wüthrich, K. (1989) J. Magn. Reson. 85, 608-613.
Olejniczak, E.T. and Eaton, H. (1990) J. Magn. Reson. 87, 628-632.
Renetseder, R., Brunie, S., Dijkstra, B.W., Drenth, J. and Sigler, P. (1985) J. Biol. Chem. 260, 11627-11634.
Scott, D.L. and Sigler, P.B. (1994) Adv. Protein Chem. 45, 53-88.
Scott, D.L., White, S.P., Browning, J.L., Rosa, J.J., Gelb, M.H. and Sigler, P.B. (1991) Science 254, 1007-1010.
Scott, D.L., White, S.P., Otwinowski, Z., Yuan, W., Gelb, M.H. and Sigler, P.B. (1990) Science 250, 1541-1546.
Slotboom, A.J., Verheij, H.M. and De Haas, G.H. (1982) in "Phospholipids" (Hawthorne, Ansell eds.) Elsevier Press.
Van den Berg, B., Tessari, M., Boelens, R., Dijkman, R., de Haas, G., Kaptein, R. and Verheij, H.M. (1995a) Nature Struct. Biol. 2, 402-406.
Van den Berg, B., Tessari, M., Boelens, R., Dijkman, R., Kaptein, R. and de Haas, G. (1995b) J. Biomol. NMR 5, 110-121.
Vuister, G.W. and Bax, A. (1993) J. Am. Chem. Soc. 115, 7772.
Wishart, D.S. and Sykes, B.D. (1994) Methods Enzymol. 239, 363-392.
Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids, Wiley Interscience, New York.
Zhu, H., Dupureur, C.M., Zhang, X. and Tsai, M.D. (1995) Biochemistry 34, 15307-15314.
The Crystallographic Analysis of Glycosylation-Inhibiting Factor
Yoichi Kato¹, Takanori Muto¹, Hiroshi Watarai², Takafumi Tomura², Toshifumi Mikayama², and Ryota Kuroki¹
¹Central Laboratories for Key Technology, Kirin Brewery Co. Ltd., Yokohama, Kanagawa 236, Japan and ²Pharmaceutical Research Laboratory, Kirin Brewery Co. Ltd., Maebashi, Gunma 371, Japan
I. INTRODUCTION Glycosylation-inhibiting factor (GIF) is a cytokine involved in the selective formation of IgE-suppressive factor (1). GIF inhibits N-glycosylation of IgE-binding factors. The unglycosylated IgE-binding factor then selectively suppresses IgE synthesis. Further, GIF appears to be a subunit of antigen-specific suppressor T cell factors (2), which facilitate the generation of antigen-specific suppressor T cells (3). GIF was thought to be produced only by suppressor T cells; however, molecular cloning of GIF cDNA provided the unexpected finding that GIF mRNA is present in a variety of cell lines (4). Recent studies indicate that post-translational modification of GIF in suppressor T cells is required for the generation of the biological activity (5). However, the relationship between GIF bioactivity and the conformational transition of the protein is not known. In order to understand the mechanisms of GIF functions, we determined the crystal structure of recombinant human GIF. We have already described that GIF has a novel tertiary structure (6). Here, we report the crystallographic analysis of GIF.
II. METHODS Recombinant human GIF was expressed in Escherichia coli and purified as described previously (7). The protein solution was concentrated to 25 mg/ml using Centricon-3 (Amicon) and crystallized using the hanging-drop vapor diffusion method. Initial crystallization conditions were determined with the sparse-matrix sampling method (8). Droplets consisting of 5 μl of the protein solution and the same volume of a precipitating solution were equilibrated against 1 ml of the precipitating solution at 4°C. X-ray diffraction experiments were carried out using CuKα radiation from a Rigaku RU-200H rotating anode generator operating at 40 kV and 100 mA. For space group determination, precession photographs were taken with a Rigaku precession camera. Intensity data of native and heavy atom derivatives were collected with a Rigaku R-AXIS IIc imaging-plate system at 4°C and reduced using the supplied software. The x-ray beam was focused by a Charles Supper double-mirror optics system. The structure of GIF was solved by the multiple isomorphous replacement (MIR) method. Heavy atom derivatives were prepared by soaking crystals for several days in stock solution (2.6 M ammonium sulfate in 0.1 M Hepes buffer pH 6.9) in which the heavy-atom compound had been dissolved. Heavy-atom positions from the difference Patterson maps were determined using the vector verification program VERIFY (written by S. Roderick, Institute of Molecular Biology, University of Oregon) and confirmed using cross-difference Fourier maps. Refinement of heavy-atom parameters and phase calculation were carried out using PHASES (9). The initial MIR map was improved by solvent flattening (10) and non-crystallographic symmetry averaging using PHASES. The molecular model of GIF was constructed using QUANTA/XFIT (MSI). The refinement of the structure was carried out with the simulated annealing method using X-PLOR (11). Water molecules were located using the program
Figure 1. Crystal of human glycosylation-inhibiting factor. The size of crystal is approximately 0.6 mm x 0.6 mm x 0.4 mm.
QUANTA/XSOLVATE (MSI). The last stage of structure refinement and the refinement of temperature factor for individual atoms were executed with TNT (12).
III. RESULTS and DISCUSSION
A. Crystallization
The crystals of GIF (Figure 1) were obtained from 2.0 M ammonium sulfate and 2% (w/v) polyethylene glycol 400 (Wako Pure Chemical Industries, Japan) in 0.1 M Hepes (pH 7.5). Polyethylene glycol 400 was necessary for reproducible growth of GIF crystals. The space group was P3₁21 with a = b = 96.6 Å and c = 105.8 Å. Assuming three molecules in the asymmetric unit, the ratio of unit cell volume to the molecular mass in the unit cell, Vm (13), is 3.82 Å³/Da.
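The Vm value can be checked with a short calculation. The sketch below assumes a monomer mass of about 12,400 Da, a figure that is not given in the text and is used here only as a placeholder; the six asymmetric units per cell follow from space group P3₁21, and the solvent estimate uses the common approximation 1 - 1.23/Vm.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume (cubic angstroms) of a trigonal/hexagonal cell, gamma = 120 deg."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(a, c, asym_units_per_cell, mols_per_asym_unit, mw_da):
    """Vm = cell volume / total protein mass in the cell (A^3/Da)."""
    total_mass = asym_units_per_cell * mols_per_asym_unit * mw_da
    return hexagonal_cell_volume(a, c) / total_mass


ASSUMED_MONOMER_MW = 12400.0  # Da; assumption, not stated in the text

vm = matthews_coefficient(96.6, 105.8, asym_units_per_cell=6,
                          mols_per_asym_unit=3, mw_da=ASSUMED_MONOMER_MW)
solvent_fraction = 1.0 - 1.23 / vm  # standard approximation
print(f"Vm = {vm:.2f} A^3/Da, solvent content about {100 * solvent_fraction:.0f}%")
```

Under this assumed mass the result is close to the reported 3.82 Å³/Da, which corresponds to a rather high solvent content for a protein crystal.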
B. Multiple Isomorphous Replacement Analysis
Four complete data sets were collected, two from native crystals (Native1 and Native2) and two from heavy atom derivatives (HgCl2 and ethylmercurithiosalicylate (EMTS)). Native1 was mounted directly from mother liquor (2.0 M ammonium sulfate in 0.1 M Hepes buffer pH 7.5), and Native2 was soaked in the stock solution (2.6 M ammonium sulfate in 0.1 M Hepes buffer pH 6.9) used for heavy atom soaks. The results of data collection are summarized in Table I.

Table I. Results of Data Collection

                              Native1    Native2    HgCl2      EMTS
Cell dimensions
  a, b (Å)                    96.62      96.72      96.73      96.63
  c (Å)                       105.59     106.16     106.50     105.90
Stock solution
  pH                          -          6.9        6.9        6.9
  (NH4)2SO4 (M)               -          2.6        2.6        2.6
  Heavy atom (mM)             -          -          0.2        1.0
Soaking time (days)           -          2          5          3
Number of crystals            1          1          1          1
Resolution (Å)                1.9        2.3        2.5        2.7
Total observations            89768      58234      28107      21827
Unique reflections            34810      25273      15887      10192
Completeness (%)              75.6       95.5       76.4       61.8
Rmerge (a)                    0.06       0.05       0.07       0.08
Riso (b)
  for Native1                 -          0.114      0.189      0.117
  for Native2                 0.114      -          0.156      0.125

(a) $R_{merge} = \sum_N \sum_n |I - \langle I \rangle| / \sum_N n \langle I \rangle$, where $N$ is the number of unique reflections, $n$ is the number of multiple measurements of a particular reflection, $I$ is the measured intensity of a reflection, and $\langle I \rangle$ is the average intensity of equivalent reflections.
(b) $R_{iso} = \sum |F_{PH} - F_P| / \sum |F_P|$, where $F_{PH}$ is the derivative structure factor amplitude and $F_P$ is the native structure factor amplitude.
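The two agreement factors defined in the footnotes to Table I can be written out directly. The sketch below is only an illustration of those definitions: the data structures (dictionaries keyed by Miller indices) and the toy numbers are hypothetical, and scaling between data sets is assumed to have been done already.

```python
def r_merge(measurements):
    """Rmerge over unique reflections.

    measurements maps a unique (h, k, l) to the list of intensities
    measured for that reflection and its symmetry equivalents.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += len(intensities) * mean_i
    return numerator / denominator


def r_iso(f_native, f_derivative):
    """Riso between native and derivative amplitudes on a common scale."""
    common = f_native.keys() & f_derivative.keys()
    numerator = sum(abs(f_derivative[hkl] - f_native[hkl]) for hkl in common)
    denominator = sum(abs(f_native[hkl]) for hkl in common)
    return numerator / denominator


# Toy data, for illustration only.
observed = {(1, 0, 0): [100.0, 110.0], (0, 0, 3): [50.0, 50.0]}
native = {(1, 0, 0): 100.0, (0, 0, 3): 50.0}
derivative = {(1, 0, 0): 110.0, (0, 0, 3): 45.0}

print(f"Rmerge = {r_merge(observed):.3f}")
print(f"Riso   = {r_iso(native, derivative):.3f}")
```

Read this way, the Riso of 0.114 between Native1 and Native2 in Table I says that the two native crystals differ from each other about as much as Native1 differs from the EMTS derivative (0.117).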
Difference Patterson maps were calculated between both native data sets and the derivative data sets. Heavy-atom peaks were apparent in the Harker sections using Native1, but vector verification using Native1 failed. Using Native2, difference Patterson maps were less noisy and vector verification succeeded. Therefore, MIR phasing was carried out using the Native2 data set. Two sites of HgCl2 (one major site and one minor site) and three sites of EMTS (one major site and two minor sites) were found. These heavy-atom positions were refined at 2.8 Å resolution. The result of phase calculation is summarized in Table II. The mean figure of merit from 11969 reflections was 0.49.
C. Phase Improvement
After the calculation of MIR phases, the electron density was improved by solvent flattening followed by noncrystallographic symmetry averaging. The map after solvent flattening showed most of the secondary structure. However, the connectivity of electron density was ambiguous in the loop regions. Since it was clear that the asymmetric unit contained a trimer of GIF related by a non-crystallographic
Table II. MIR Phasing Statistics (2.8 Å resolution)

                                  Native2    HgCl2      EMTS
Number of reflections phased      11969
Heavy atom sites                             2          3
RCullis (a)                                  0.62       0.68
Phasing power (b)                            1.79       1.52
Mean figure of merit              0.49

(a) $R_{Cullis} = \sum ||F_{PH} - F_P| - F_H(calc)| / \sum |F_{PH} - F_P|$, where $F_{PH}$ is the derivative structure factor amplitude, $F_P$ is the native structure factor amplitude, $F_H(calc)$ is the calculated heavy atom structure factor, and the summation is over the centric reflections only.
(b) Phasing power is the ratio of the root-mean-square heavy atom scattering factor amplitude to the rms lack of closure error.
three-fold, non-crystallographic symmetry averaging was performed. Poly-alanine models for each GIF monomer were built into the regions of secondary structure found in the solvent-flattened map, and a molecular envelope was defined within 7 Å of all atoms in these models. Using the map after symmetry averaging, it was possible to trace the main chain of all three GIF molecules.
D. Refinement
Initial refinement was carried out at 2.3 Å against the Native2 data set, which was used for the MIR phasing. The R-factor was 20.0% for 22184 reflections. The Ramachandran plot (14) shows that no residue in the structure has unfavorable dihedral angles. The coordinate error of the structure was estimated to be 0.25-0.30 Å using a Luzzati plot (15). Root-mean-square deviations of bond lengths and bond angles from ideal values are 0.014 Å and 1.877°, respectively. Next, using the Native1 data, the resolution was extended to 1.9 Å with an R-factor of 21.4% for 33351 reflections. Since the R-factor between the two native data sets is 11.4%, the two refined models were compared to investigate the difference between the structures. The root-mean-square deviations for main-chain atoms and side-chain atoms between the structures refined using Native1 and Native2 were 0.18 Å and 0.32 Å, respectively. This indicates that the models refined against Native1 and Native2 are essentially the same. Finally, 170 water molecules were added and the structure was refined to a final R-factor of 16.8%. Root-mean-square deviations of bond lengths and bond angles from ideal values are 0.018 Å and 3.035°, respectively. A Luzzati plot shows that the structure has an estimated coordinate error of 0.20 Å to 0.25 Å.
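The comparison of the two refined models rests on a root-mean-square deviation over matched atoms. A minimal sketch is given below; it assumes that both models are already in the same coordinate frame (reasonable here, since both were refined in essentially the same unit cell) and therefore applies no superposition. The coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists.

    Each list holds (x, y, z) tuples in angstroms; no superposition is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical main-chain atoms from two refinements of the same model.
model_native1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
model_native2 = [(0.0, 0.0, 0.2), (1.5, 0.2, 0.0)]
print(f"rmsd = {rmsd(model_native1, model_native2):.2f} A")
```

Deviations of 0.18 Å and 0.32 Å are at or near the coordinate error estimated from the Luzzati plots, which supports the statement that the two models are essentially the same.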
E. Structure of GIF
The overall structure of the GIF trimer is a three-fold-related barrel structure composed of three six-stranded β-sheets on the inside and six α-helices on the outside (Figure
Figure 2. Stereo diagram of GIF trimer.
Figure 3. Stereo diagram of GIF monomer. Two α-helices are labeled as α1 and α2. Six β-strands are labeled from β1 to β6. Two β-strands from the other monomers (β3' and β6'') are also shown.
2). Each subunit consists of two β-α-β motifs related by a pseudo-twofold axis (Figure 3). The trimer structure is formed by intermonomer hydrogen bonds and hydrophobic interfaces between β-sheets. There is a 5-Å diameter "hole" through the middle of the barrel. The barrel structure of GIF in part resembles "trefoil" cytokines such as interleukin-1 and
fibroblast growth factor (16). GIF, interleukin-1, and fibroblast growth factor are secreted from mammalian cells even though these proteins lack a signal peptide sequence (4, 17, 18). The similarity of the three-dimensional structures of these proteins suggests the possibility that the three proteins are secreted by a common unknown mechanism.
Acknowledgments The authors express great appreciation to Drs. R. H. Jacobson, L. H. Weaver and B. W. Matthews for helpful advice on the structure determination. We thank Drs. M. Kusunoki and T. Tsukihara for advice on the molecular averaging technique, and Dr. T. Shimizu for data collection.
References
1. Ishizaka, K. (1984) Annu. Rev. Immunol. 2, 159-182.
2. Steele, J. K., Kuchroo, V. K., Kawasaki, H., Jayaraman, S. A., Iwata, M., Ishizaka, K. & Dorf, M. E. (1989) J. Immunol. 142, 2213-2220.
3. Iwata, M. & Ishizaka, K. (1988) J. Immunol. 141, 3270-3277.
4. Mikayama, T., Nakano, T., Gomi, H., Nakagawa, Y., Liu, Y.-C., Sato, M., Iwamatsu, A., Ishii, Y., Weiser, W. Y., & Ishizaka, K. (1993) Proc. Natl. Acad. Sci. USA 90, 10056-10060.
5. Liu, Y.-C., Nakano, T., Elly, C., & Ishizaka, K. (1994) Proc. Natl. Acad. Sci. USA 91, 11227-11231.
6. Kato, Y., Muto, T., Tomura, T., Tsumura, H., Watarai, H., Mikayama, T., Ishizaka, K., & Kuroki, R. (1996) Proc. Natl. Acad. Sci. USA 93, 3007-3010.
7. Nakano, T., Liu, Y.-C., Mikayama, T., Watarai, H., Taniguchi, M., & Ishizaka, K. (1995) Proc. Natl. Acad. Sci. USA 92, 9196-9200.
8. Jancarik, J., & Kim, S.-H. (1991) J. Appl. Crystallogr. 24, 916-924.
9. Furey, W. & Swaminathan, S. (1990) American Crystallographic Association Meeting Abstracts, Series 2 18, 73.
10. Wang, B. C. (1985) Methods in Enzymology 115, 90-112.
11. Brünger, A. T. (1992) In "X-PLOR Version 3.1: A system for crystallography and NMR." Yale University, New Haven, CT.
12. Tronrud, D. E., Ten Eyck, L. P., & Matthews, B. W. (1987) Acta Crystallogr. Sect. A: Found. Crystallogr. 43, 489-503.
13. Matthews, B. W. (1968) J. Mol. Biol. 33, 491-497.
14. Ramachandran, G. N. & Sasisekharan, V. (1968) Advan. Protein Chem. 23, 283-437.
15. Luzzati, P. V. (1952) Acta Crystallogr. 5, 802-810.
16. Murzin, A. G., Lesk, A. M., and Chothia, C. (1992) J. Mol. Biol. 223, 531-543.
17. Cao, Y. & Pettersson, R. F. (1993) Growth Factors 8, 277-290.
18. Jessop, J. J. & Hoffman, T. (1993) Lymphokine Cytokine Res. 12, 51-58.
Structure of the D30N active site mutant of FIV proteinase complexed with a statine-based inhibitor
Celine Schalk-Hihi, Jacek Lubkowski, Alexander Zdanov, Alexander Wlodawer and Alla Gustchina Macromolecular Structure Laboratory, NCI-Frederick Cancer Research and Development Center, ABL-Basic Research Program, Frederick, Maryland.
Gary S. Laco and John H. Elder Department of Molecular Biology, The Scripps Research Institute, La Jolla, California
I. Introduction Retroviruses encode a protease (PR) responsible for cleaving polyprotein precursors, and such processing is essential for proper virion assembly and maturation. Based on the presence of a sequence Asp-Ser/Thr-Gly in the active sites of retroviral proteases (1) and their inhibition in vitro by pepstatin (2-7), these enzymes have been classified as members of the aspartic protease family. Crystal structures have been determined for the proteases from Rous sarcoma virus (RSV PR) (8), from two variants and several mutants of the human immunodeficiency virus (HIV PR) (9-11), from feline immunodeficiency virus (FIV PR) (12) and from equine infectious anemia virus (EIAV PR) (13). Aspartic proteases contain a single active site which includes two aspartates. In apoenzymes, the two catalytic Asp residues from the active site triad have been found to be in hydrogen bond contact with a water molecule (10). Mutations of the active site Asp25 in HIV-1 PR into Asn (14,15), Thr (3) or Ala (4,16,17) led to an inactive enzyme. Similarly, the RSV PR was inactivated by mutation of its active site Asp to Ile (18). Retroviral proteases have become a major target for the rational design of drugs against AIDS (19). A large number of crystal structures of complexes between retroviral proteases and inhibitors, mostly transition-state analogs, have been determined by x-ray analysis (11). These analyses show that peptidomimetic inhibitors usually bind to the active site in a similar mode, making hydrogen bonds and hydrophobic contacts with the active site pockets of the enzyme. Such data are crucial for studies of the specificity and the catalytic mechanism of aspartic proteases, and thus facilitate rational drug design. It is believed that a substrate bound to the protease would form similar types of interactions. However, since a substrate would be processed rapidly, no structure of a protease/substrate complex is available.
As a first step in studying
such interactions, we expressed a mutant of FIV PR in which the catalytic Asp30 was mutated into an Asn, leading to an inactive protease (here designated FIV PR(D30N)). A complex between this mutant and a substrate should therefore be stable. In order to investigate the extent of perturbation of the active site of FIV PR caused by this mutation, we determined the crystal structure of FIV PR(D30N) at 2.0 Å resolution in a complex with LP-149, a statine-based inhibitor. This structure was then compared with the structure of the wild type enzyme FIV PR(wt) complexed with the same inhibitor. The study described here reveals that the mode of binding of LP-149 to FIV PR(D30N) is similar to the mode of binding of LP-149 to FIV PR(wt), making the mutant a valuable model to study the interactions of substrates with FIV PR, and retroviral proteases in general.
II. Materials and Methods A. Construct Expression and Purification of Mutant Protease pT7-D30N/PR: The PR ORF in the FIV 34TF10 infectious molecular clone (20) was the template in a PCR in which a 5' NdeI restriction site, initiation ATG codon, stop codon and 3' HindIII restriction site were introduced into the PR ORF to facilitate cloning and expression of the processed form of the FIV PR (21). In addition, PCR was used to mutate the PR aspartic acid codon #30 to an asparagine codon. The resulting D30N/PR ORF was cloned into pT7-7 (22), resulting in pT7-D30N/PR, in which the D30N/PR ORF is under control of the T7 RNA polymerase promoter. The mutated PR ORF was sequenced to verify the mutations as well as the fidelity of the PCR reaction. pT7-D30N/PR was transformed into the E. coli cell line BL21(DE3) pLysS (23). Cultures were induced at OD600 = 0.5 with 1 mM IPTG for 5 hrs; cells were then pelleted and frozen. Cells were lysed and the PR inclusion bodies pelleted. The PR inclusion bodies were then solubilized in buffer A (8 M urea, 10 mM Tris pH 8.0, 5 mM EDTA). The soluble PR was loaded onto a Q Sepharose Fast Flow column (Pharmacia) equilibrated in buffer A. The protein fraction that did not bind to the column was collected and brought to 20 mM sodium acetate pH 5.0. The protein was then loaded onto a Resource S column (Pharmacia) equilibrated in buffer B (20 mM sodium acetate pH 5.0, 8 M urea, 5 mM EDTA). The PR that bound to the column was eluted with a NaCl gradient. The Resource S fractions containing PR were combined, brought to 0.1 M DTT, 0.15 M NaCl, and dialyzed against 25 mM NaPO4 pH 7.4, 150 mM NaCl, 5 mM EDTA, and 2 mM DTT to facilitate folding and renaturation of soluble PR as described (12). The preparation was stored at -70°C. Wild type FIV PR prepared in this manner was soluble and active.
However, while the D30N/PR was soluble, consistent with proper folding of the molecule, it had no detectable PR activity on either a FIV Gag polyprotein or a Gag derived peptide substrate (G.S.L., unpublished).
B. Crystallization and Structure Solution Before crystallization trials, the protein was subjected to gel filtration on Superdex-75 (Pharmacia) in 50 mM sodium/potassium phosphate buffer, pH 7.4, containing 1 mM EDTA, 50 mM 2-mercaptoethanol, 150 mM sodium chloride, 5% glycerol and 5% 2-propanol, as described previously (12). The statine-based inhibitor, LP-149 (Ac-Nal-Val-Sta-Glu-Nal-NH2, where Nal is naphthylalanine and Sta is statine) (Fig. 1), was prepared at Lilly Research Laboratories (K. Hui, unpublished results). Crystallization was carried out at 4 °C using the hanging-drop vapor diffusion method as follows: 2.5 μl of the FIV PR(D30N) at 7 mg/ml complexed with LP-149 (1:4 molar ratio) in 50 mM imidazole-HCl pH 7.0 containing 1 mM EDTA and 1 mM dithiothreitol were mixed with an equal volume of 2 M ammonium sulfate, 0.1 M sodium acetate, pH 4.6 (Hampton Crystal Screen, solution #47). Crystals appeared within a few days and reached the size of 0.2 x 0.2 x 0.4 mm in one week. X-ray diffraction data were collected at room temperature using a MAR Research 300 mm image plate detector mounted on a Rigaku RU-200 rotating anode generator operated at 50 kV and 100 mA. The data set was collected from a single crystal and processed using the program Denzo (24). Crystals belong to the trigonal system, space group P3₁21, a = b = 50.49 Å, c = 74.29 Å, with half of the dimeric protease molecule in the asymmetric unit, and are isomorphous with the crystals of the wild-type enzyme. Since for technical reasons the last steps of the refinement were carried out in the lower symmetry space group P3₁, the x-ray data were also reduced in this lower symmetry space group. The statistics of the data are summarized in Table I.
Figure 1. Structure of LP-149 (Ac-Nal-Val-Sta-Glu-Nal-NH2, where Nal is naphthylalanine and Sta is statine).
Table I. Summary of data collection and refinement statistics

Rsymm
  P3₁21                                      0.089
  P3₁                                        0.080
Number of reflections
  P3₁21                                      9086 (unique), 63643 (total)
  P3₁                                        16742 (unique), 63643 (total)
Completeness
  P3₁21                                      95.2% (10-2.0 Å), 90.0% (2.05-2.0 Å)
  P3₁                                        89.2% (10-2.0 Å), 77.9% (2.05-2.0 Å)
Rmerge, P3₁ vs P3₁ (a)                       0.015
Rfactor (final model) (b)
  P3₁                                        0.158 (0.159)
  P3₁21                                      0.159
Rms dev. (monomer vs monomer, all atoms)     0.002 Å (maximum deviation 0.009 Å)
Rms dev. from ideality (c)
  bonds                                      0.013 Å
  bond angles                                1.69°
  torsion angles                             27.9°
  improper angles                            1.4°
Average B-factor, all non-H atoms            25.0 Å²

(a) One x-ray data set is the result of data reduction in the space group P3₁, while the second data set was generated by symmetry operations from x-ray data in space group P3₁21.
(b) The values shown in parentheses correspond to the data set generated by symmetry operations from x-ray data in space group P3₁21.
(c) The ideal targets were assumed to be those published by Engh and Huber (25).
Figure 2. Stereoview of the electron density map in the vicinity of Asn30, the mutation site. Contour level is 0.8σ.
structure of D30N Active Site Mutant of FIV Proteinase
III. Results and Discussion
A. Structure solution and Refinement
Since the crystals of the FIV PR(D30N)/LP-149 complex were isomorphous with those of the FIV PR(wt)/LP-149 complex (12), crystallographic coordinates of the monomer of FIV PR(wt) (12) were used for initial refinement, using X-PLOR (26). The first step consisted of rigid body refinement and dropped the R-factor to 0.267 at the resolution range of 10-2.0 Å. This procedure was followed by positional refinement. The resulting model was then manually corrected using Frodo (27), and the inhibitor LP-149 (Fig. 1) was built into the enzyme active site. The refinement procedure of the complex was then performed using PROLSQ (28,29) at a resolution range of 10-2.0 Å. The electron density for Ile37 and residues 59-61 (the tip of the flap) clearly indicated that these fragments adopt more than one conformation. Further refinement of the structure was thus carried out with two different conformations for these residues. After a few rounds of model building and refinement, the approximation of residues 37 and 59-61 by two alternate conformations resulted in very good agreement between the structure and the electron density. The majority of the protein side chains could also be unambiguously rebuilt into their corresponding electron density, and solvent sites were located. At this stage the crystallographic R-factor was lowered to 0.163. The major focus was then shifted toward the vicinity of the active site mutation, where the electron density for Asn30 also clearly indicated that this residue adopted more than one conformation (Fig. 2). Further refinement of the modified model was thus carried out with two different conformations for residues 29-32, using X-PLOR (26). The final value of the R-factor is 0.158 for 13,484 reflections in the resolution range 10.0-2.0 Å (with |F| > 3.0 σ(F), for space group P3₁). The resulting geometric and stereochemical description of the model is shown in Table I.
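For readers less familiar with the quantity being tracked during refinement, the conventional crystallographic R-factor, R = Σ||Fo| - |Fc|| / Σ|Fo|, with the 3σ amplitude cutoff mentioned above can be sketched as follows. This is an illustration only: the reflection values are invented, the calculated amplitudes are assumed to be already on the scale of the observed ones, and none of this reproduces the X-PLOR or PROLSQ implementations.

```python
def r_factor(reflections, sigma_cutoff=3.0):
    """Conventional R-factor over reflections with |Fo| > sigma_cutoff * sigma(Fo).

    Each reflection is a tuple (f_obs, f_calc, sigma_f_obs); f_calc is assumed
    to be already scaled to f_obs.
    """
    numerator = 0.0
    denominator = 0.0
    for f_obs, f_calc, sigma in reflections:
        if abs(f_obs) > sigma_cutoff * sigma:
            numerator += abs(abs(f_obs) - abs(f_calc))
            denominator += abs(f_obs)
    if denominator == 0.0:
        raise ValueError("no reflections pass the sigma cutoff")
    return numerator / denominator


# Hypothetical amplitudes, for illustration only.
example = [
    (100.0, 95.0, 5.0),
    (50.0, 60.0, 10.0),
    (10.0, 12.0, 5.0),   # rejected: 10 is not greater than 3 * 5
    (80.0, 76.0, 4.0),
]
print(f"R = {r_factor(example):.3f}")
```

The cutoff explains why the number of reflections used in refinement (13,484) is smaller than the number of unique reflections measured in P3₁.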
The refined model of the FIV PR(D30N)/LP-149 complex contains 1871 protein atoms, 122 inhibitor atoms, 144 water molecules, and two sulfate anions.
B. Structure description
As expected, mutation of the active site Asp30 to Asn did not affect the overall structure of the protein. FIV PR(D30N) is a homodimer (117 residues per monomer, including the N-terminal alanine). The tertiary structure of the monomer consists mainly of β sheets and turns and is similar to that of FIV PR(wt) (12). The two monomers are related by a two-fold axis. The active site, characterized by AsnA30, AsnB30 and the inhibitor molecule LP-149 (Fig. 1), is located between the two monomers. (To simplify the description of the dimer, letters A and B are arbitrarily used to describe the residues contributed by the two different monomers; for residues adopting two different conformations, the second conformation is denoted by a prime.) The flaps, two β hairpin structures formed by residues A49-A69 and B49-B69, fold over the inhibitor in the active site. Their position and conformation are similar to those observed in the wild type structure (12). Residues A59-A61 (and B59-B61), which form the tip of the flaps, adopt two conformations,
similar to the FIV PR(wt)/LP-149 complex (12), with equal occupancy for each of them. The flap structure is stabilized by a hydrogen bond between the carbonyl oxygen atom of Val59 and the amide nitrogen atom of Gly60 from the other monomer. Additional hydrogen bonds between the flaps and the backbone atoms of the inhibitor stabilize the complex. These hydrogen bonds are formed between the carbonyl oxygen of IleA57 and the amide nitrogen atom of inhibitor residues B206 or A203 and between the amide nitrogen of IleA57 and the carboxyl oxygen of inhibitor residues A201 or B206, depending on the orientation of the inhibitor molecule in the active site. A water molecule (Wat301), also found in the FIV PR(wt)/LP-149 complex (12), is located on the pseudodyad axis, between the inhibitor and the flaps. It participates in four hydrogen bonds with nearly perfect tetrahedral coordination, accepting two hydrogen bonds from the amide nitrogen atoms of the valine residues (A59 and B59) at the tip of the flap and donating two hydrogen bonds to the carbonyl oxygen atoms of inhibitor residues 203 and 204. Although the overall structure of FIV PR(D30N) is very similar to that of FIV PR(wt) (12), differences in the conformation of a few protein side chains exist. The side chain of Cys84, which adopted two different orientations in the FIV PR(wt)/LP-149 complex, has only one orientation in the current structure. The side chain of Arg53, also built in two orientations in the FIV PR(wt)/LP-149 complex, is disordered in the mutant and does not show well-defined density for all the atoms of the side chain. The side chain of Ile37, which had only one orientation in the wild type structure, was modeled in two orientations in the mutant. The side chains of Lys12, Arg64, and Glu78 are disordered and could not be located at all in the electron density map.
As in FIV PR(wt), the four amino-terminal residues (including the N-terminal alanine) are disordered and could not be located in the electron density map. The unclear electron density for residue 4 suggests that this residue is probably very flexible. A sulfate anion, which was not seen in the FIV PR(wt)/LP-149 complex, has been identified at the interface between two molecules. It forms hydrogen bonds with the hydroxyl group of TyrA23, the nitrogen atom of LysA46 and the amide nitrogen atom of ArgB13. None of the residues described above make direct hydrogen bonds with the inhibitor molecule LP-149. Only slight differences between the FIV PR(D30N)/LP-149 complex and the FIV PR(wt)/LP-149 complex could be observed in the active site, a large pocket made of residues from both monomers. Its structure is symmetric, with the mutated Asn residue lying in its center. Residues A29-A32 (B29-B32), which are located in the active site and include the mutated Asn, adopt two conformations in the mutant structure, in contrast to the wild type structure, where they had only one well-defined conformation (12). Their second conformation was numbered A29'-A31' and B29'-B31', respectively. Both conformations have an occupancy of 0.5. As in the FIV PR(wt)/LP-149 complex (12), the inhibitor molecule, LP-149, bound to the active site of the protease, adopts two orientations in the crystals of the FIV PR(D30N)/LP-149 complex. One orientation is numbered from A201 to A206 and the second orientation from B201 to B206. Their overall position and conformation in the active site are similar to those observed in the FIV PR(wt)/LP-149 complex (12). The subsites of the inhibitor-binding pockets in the FIV
PR(D30N)/LP-149 complex and in the FIV PR(wt)/LP-149 complex are also very similar. Because of the crystallographic two-fold symmetry, subsites S1 and S1', S2 and S2' as well as S3 and S3' are identical. The S1 and S1' subsites include Leu28, Asn30, Gly32, Asp34, Ile35, Gly58, Val59 and Val101. The S2 and S2' subsites include Ile35, Ile37 and Asn56. The subsites S3 and S3' are formed by Arg13, Glu15, Leu28, Asp34, Ile57, Ile98 and Gln99. A more significant difference between the two complexes concerns the position of the hydroxyl groups of the statine residues of the two inhibitor molecules built in the enzyme active site. The line which passes through the hydroxyl atoms of statine residues A204 and B204 in the D30N mutant structure was found to be roughly perpendicular to the line through the equivalent atoms in the wild type structure. In addition, the two hydroxyl oxygen atoms are further apart (1.7 Å) in the mutant structure than in the wild type structure (1.1 Å). The inhibitor hydroxyl groups in the FIV PR(D30N)/LP-149 complex are within hydrogen bonding distance of two oxygen atoms and three nitrogen atoms of the two alternate conformations of the protease asparagine residues. The hydroxyl group of inhibitor B (A) is within hydrogen bonding distance of Oδ1 and Nδ2 of AsnA30 (B30), Oδ1 of AsnA30' (B30'), Nδ2 of AsnB30' (A30') and Nδ2 of AsnB30 (A30), respectively. The inhibitor hydroxyl group cannot, of course, form hydrogen bonds with each of these atoms at the same time. In addition, the distance between Nδ2 of AsnA30' (B30') and the hydroxyl oxygen atom of inhibitor residue B204 (A204) is only 2.0 Å, and Oδ1 of AsnA30' and Oδ1 of AsnB30' are only 1.6 Å apart. These distances are too short to exist in the mutated active site and lead us to define two distinct triplets composed of two asparagine residues and one inhibitor molecule. The first triplet consists of AsnA30, AsnB30' and the inhibitor molecule B201-B206.
The second triplet consists of AsnB30, AsnA30' and the inhibitor molecule A201-A206. Other combinations of Asn residues and inhibitor molecule are not possible, and each protein/inhibitor complex contains only one of these two possibilities. As a result, the hydroxyl group of inhibitor residue B204 (A204) makes three hydrogen bond contacts, with Nδ2 and Oδ1 of AsnA30 (B30) and with Nδ2 of AsnB30' (A30'). In addition, the Nδ2 atom of AsnA30 (B30) is within hydrogen bonding distance of the Oδ1 atom of AsnB30' (A30') (Fig. 3a). The interactions between the two Asn residues of FIV PR(D30N) and the inhibitor are similar to those observed in the FIV PR(wt)/LP-149 complex (Fig. 3b). The inhibitor's statine hydroxyl group in the wild type enzyme is within hydrogen bonding distance of Oδ2 and Oδ1 from one Asp residue and Oδ2 from the other Asp residue. The two aspartate oxygen atoms which are nearer to the interior of the molecule are 3.0 Å apart, the same distance which separates Nδ2 of AsnA30 and Oδ1 of AsnB30' in FIV PR(D30N) (Fig. 3). The differences found in the position of the side chains of the Asn residues and the statine groups in FIV PR(D30N) compared with those described for FIV PR(wt) do not affect the hydrogen bond network between the two Asn residues and the inhibitor statine residues. It is similar to the hydrogen bond network found in the active site of the FIV PR(wt)/LP-149 complex. The hydrogen bonds made by the two Asn residues of FIV PR(D30N) with the surrounding protein are also similar to those observed in the FIV PR(wt)/LP-149 complex (12). The backbone amide nitrogen from each
Figure 3. (a) Stereo view of the hydrogen-bonding network in the mutated active site of FIV PR(D30N) complexed with LP-149, involving AsnA30, AsnB30' and LP-149. Dashed lines indicate possible hydrogen bond contacts. (b) Stereo view of the hydrogen-bonding network in the active site of FIV PR(wt) complexed with LP-149, involving AspA30, AspB30 and LP-149. Dashed lines indicate possible hydrogen bond contacts.
"active site" Asn is within hydrogen bonding distance of the carbonyl oxygen atom of LeuA102 (LeuB102), and their carbonyl oxygen atoms form hydrogen bonds with the amide nitrogens from GlyA32 (GlyB32) and AlaA33 (AlaB33). The hydroxyl group of each ThrA31 (ThrB31) forms hydrogen bonds with the amide nitrogen of ThrB31 (ThrA31) and the carbonyl oxygen atom of LeuB29 (LeuA29). The amide nitrogen atom of LeuA29 (LeuB29) also forms a hydrogen bond with the carbonyl oxygen atom of ProA14 (ProB14). Similarly, in addition to the hydrogen bonds that GlyA32 (GlyB32) makes with AsnA30 (AsnB30), the carbonyl oxygen of GlyA32 (GlyB32) is within hydrogen bonding distance of the amide nitrogen atom of the inhibitor residue B205 (A205) and the hydroxyl oxygen atom of inhibitor residue A204 (B204). This complicated network of hydrogen bonds makes the active site quite rigid. Slight differences in the distances of other hydrogen bonds that the inhibitor molecule makes with FIV PR(D30N) and FIV PR(wt) have been observed (Table II). The hydrogen bond distance CO-NH between Ile57 and the carbonyl oxygen of P3 in LP-149 is 2.8 Å in the FIV PR(D30N)/LP-149 complex, whereas it is 3.0 Å in the FIV PR(wt)/LP-149 complex. The hydrogen bonds that the amide nitrogen atom and the Oδ2 atom of Asp34 make with the P3 peptide bond of LP-149 are, respectively, 0.4 Å and 0.3 Å shorter in the FIV PR(wt)/LP-149 complex than in the FIV PR(D30N)/LP-149 complex. In the S2 subsite, the hydrogen bond distance between the carbonyl oxygen of Ile57 and the amide nitrogen atom in LP-149 is 2.9 Å in the FIV PR(D30N)/LP-149 complex, whereas it is 3.2 Å in the wild type complex. In the S1 subsite, the main differences were observed in the hydrogen bonds that Oδ1 of Asn and Oδ2 of Asp, from FIV PR(D30N) and FIV PR(wt) respectively, make with the statine group of the inhibitor. These distances are 2.5 Å and 3.1 Å in FIV PR(wt), while they are 2.8 Å in FIV PR(D30N).
The hydrogen bond distance between the carbonyl oxygen of Gly32 and the amide nitrogen atom in position P1 of LP-149 is 3.1 Å in the FIV PR(D30N)/LP-149 complex as well as in the FIV PR(wt)/LP-149 complex. In the S2' subsite, the hydrogen bonds that the inhibitor side chain makes with Ile35 and Asp34 are 0.3 Å shorter in FIV PR(wt) than in FIV PR(D30N), whereas the hydrogen bond distance with Gly32 is 3.2 Å for both enzymes. There is only a 0.1 Å difference between the hydrogen bonds that Ile57 of FIV PR(D30N) and of FIV PR(wt) makes with inhibitor residue P3'. Among all the hydrogen bond distances listed above, the largest difference between the FIV PR(D30N)/LP-149 complex and the FIV PR(wt)/LP-149 complex is only 0.4 Å, which can be considered insignificant for the binding mode of the inhibitor in the active site.
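The reasoning that reduces the alternate conformations of the active site to two triplets can be made explicit with a small enumeration. The sketch below is an illustration under stated assumptions, not the authors' procedure. It encodes the two too-short contacts reported above (2.0 Å and 1.6 Å) together with the symmetry mate of the first, and then adds an occupancy step that the text does not spell out: because the primed conformers each have occupancy 0.5 and cannot coexist, the unprimed/unprimed pair is left with no population.

```python
from itertools import combinations, product

ASN_A = ("AsnA30", "AsnA30'")
ASN_B = ("AsnB30", "AsnB30'")
INHIBITOR = ("A201-A206", "B201-B206")

# Pairs that cannot coexist because the reported contact is too short.
CLASHES = {
    frozenset(("AsnA30'", "B201-B206")),  # Ndelta2 ... statine OH, 2.0 angstroms
    frozenset(("AsnB30'", "A201-A206")),  # symmetry mate of the contact above
    frozenset(("AsnA30'", "AsnB30'")),    # Odelta1 ... Odelta1, 1.6 angstroms
}
OCC_PRIMED = 0.5  # occupancy of each primed conformer, from the text


def is_clash_free(state):
    """True if no pair of components in the state is a forbidden contact."""
    return not any(frozenset(pair) in CLASHES for pair in combinations(state, 2))


def allowed_triplets():
    clash_free = [s for s in product(ASN_A, ASN_B, INHIBITOR) if is_clash_free(s)]
    # ASSUMPTION (not stated explicitly in the text): the two primed conformers
    # are mutually exclusive and each is populated 50% of the time, so together
    # they account for every complex and the unprimed/unprimed pair has zero
    # population.
    unprimed_pair_population = 1.0 - 2 * OCC_PRIMED
    if unprimed_pair_population > 1e-9:
        return clash_free
    return [s for s in clash_free if s[0].endswith("'") or s[1].endswith("'")]


for triplet in allowed_triplets():
    print(triplet)
```

Under these assumptions the enumeration leaves exactly the two triplets named in the text: AsnA30 with AsnB30' and inhibitor B201-B206, and AsnA30' with AsnB30 and inhibitor A201-A206.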
Table II. Hydrogen bond distances between FIV PR(D30N) and LP-149 compared to the hydrogen bonds formed between FIV PR(wt) and LP-149 (12)

Rows are listed in order of the LP-149 residue, from P4 (top) through P3, P2, P1, P1', P2' and P3' to P4' (bottom).

FIV PR(wt)                            FIV PR(D30N)
Hydrogen bond  Residue   (Å)          Hydrogen bond  Residue   (Å)
CO-NH          Ile57     3.0          CO-NH          Ile57     2.8
NH-Oδ2         Asp34     2.6          NH-Oδ2         Asp34     2.9
CO-NH          Asp34     2.5          CO-NH          Asp34     2.9
NH-CO          Ile57     3.2          NH-CO          Ile57     2.9
CO-OH          Wat301    2.7          CO-OH          Wat301    2.8
NH-CO          Gly32     3.1          NH-CO          Gly32     3.1
OH-Oδ2         AspA30    2.6          BOH-Nδ2        AsnA30    2.7
OH-Oδ1         AspA30    2.5          BOH-Oδ1        AsnA30    2.8
OH-Oδ2         AspB30    3.1          BOH-Nδ2        AsnB30'   2.8
CO-OH          Wat301    2.6          CO-OH          Wat301    2.7
NH-CO          Gly32     3.2          NH-CO          Gly32     3.2
Oε2-NH         Ile35     3.0          Oε2-NH         Ile35     3.3
Oε2-NH         Asp34     2.8          Oε2-NH         Asp34     3.1
CO-OH          Wat338    3.2          CO-OH          Wat302    3.1
NH-CO          Ile57     2.5          NH-CO          Ile57     2.6
CO-NH          Ile57     2.6          CO-NH          Ile57     2.8
NH2-Oδ2        Asp34     3.0          NH2-Oδ2        Asp34     2.9
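The comparison of the two columns of distances in Table II can be reproduced with a few lines of code. This is a minimal sketch using the values as transcribed from the table, in table order; the variable names are illustrative, and the row labels simply pair each distance with the protein residue or water molecule listed for the wild-type complex.

```python
# Distances in angstroms, transcribed from Table II in row order.
WT = [3.0, 2.6, 2.5, 3.2, 2.7, 3.1, 2.6, 2.5, 3.1,
      2.6, 3.2, 3.0, 2.8, 3.2, 2.5, 2.6, 3.0]
D30N = [2.8, 2.9, 2.9, 2.9, 2.8, 3.1, 2.7, 2.8, 2.8,
        2.7, 3.2, 3.3, 3.1, 3.1, 2.6, 2.8, 2.9]
PARTNER = ["Ile57", "Asp34", "Asp34", "Ile57", "Wat301", "Gly32",
           "Asp/AsnA30", "Asp/AsnA30", "AspB30/AsnB30'", "Wat301", "Gly32",
           "Ile35", "Asp34", "Wat338/Wat302", "Ile57", "Ile57", "Asp34"]

# Mutant minus wild type, rounded to the precision of the table.
differences = [round(mutant - wild, 1) for wild, mutant in zip(WT, D30N)]
largest = max(range(len(differences)), key=lambda i: abs(differences[i]))

for partner, delta in zip(PARTNER, differences):
    print(f"{partner:<16s} {delta:+.1f}")
print(f"largest change: {differences[largest]:+.1f} angstroms ({PARTNER[largest]})")
```

Positive values mark hydrogen bonds that are longer in the mutant complex, negative values those that are shorter.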
IV. Conclusion
The results described here show that the crystal structure of the FIV PR(D30N)/LP-149 complex is very similar to that of the FIV PR(wt)/LP-149 complex (12). The position and orientation of the inhibitor molecule LP-149 in both proteins are nearly identical (12). Some hydrogen bond distances between the inhibitor and the active site pockets of FIV PR(D30N) were found to be slightly different from those observed in the FIV PR(wt)/LP-149 complex. These differences are, however, too small to affect the binding mode of the inhibitor in the mutated active site. Since the network of hydrogen bonds between the ligand and the enzyme in the close vicinity of the mutated asparagines did not change, we suggest that FIV PR(D30N) is a very useful tool in the study of the interactions of retroviral proteases with substrates. Such structural data may provide crucial information about the general catalytic mechanism of retroviral proteases.
Acknowledgments
Research sponsored by the National Cancer Institute, DHHS, under contract with ABL. The contents of this publication do not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
References
1. Toh, H., Ono, M., Saigo, K., and Miyata, T. (1985). Retroviral protease-like sequence in the yeast transposon Ty1. Nature 315, 691.
2. Hansen, J., Billich, S., Schulze, T., Sukrow, S., and Moelling, K. (1988). Partial purification and substrate analysis of bacterially expressed HIV protease by means of monoclonal antibody. EMBO J. 7, 1785-1791.
3. Seelmeier, S., Schmidt, H., Turk, V., and von der Helm, K. (1988). Human immunodeficiency virus has an aspartic-type protease that can be inhibited by pepstatin A. Proc. Natl. Acad. Sci. U.S.A. 85, 6612-6616.
4. Darke, P. L., Leu, C. T., Davis, L. J., Heimbach, J. C., Diehl, R. E., Hill, W. S., Dixon, R. A., and Sigal, I. S. (1989). Human immunodeficiency virus protease. Bacterial expression and characterization of the purified aspartic protease. J. Biol. Chem. 264, 2307-2312.
5. Richards, A. D., Roberts, R., Dunn, B. M., Graves, M. C., and Kay, J. (1989). Effective blocking of HIV-1 proteinase activity by characteristic inhibitors of aspartic proteinases. FEBS Lett. 247, 113-117.
6. Schneider, J. and Kent, S. B. (1988). Enzymatic activity of a synthetic 99 residue protein corresponding to the putative HIV-1 protease. Cell 54, 363-368.
7. Umezawa, H. (1976). Methods Enzymol. 45, 678.
8. Miller, M., Jaskolski, M., Rao, J. K. M., Leis, J., and Wlodawer, A. (1989). Crystal structure of a retroviral protease proves relationship to aspartic protease family. Nature 337, 576-579.
9. Navia, M. A., Fitzgerald, P. M., McKeever, B. M., Leu, C. T., Heimbach, J. C., Herber, W. K., Sigal, I. S., Darke, P. L., and Springer, J. P. (1989). Three-dimensional structure of aspartyl protease from human immunodeficiency virus HIV-1. Nature 337, 615-620.
10. Wlodawer, A., Miller, M., Jaskolski, M., Sathyanarayana, B. K., Baldwin, E., Weber, I. T., Selk, L. M., Clawson, L., Schneider, J., and Kent, S. B. H. (1989). Conserved folding in retroviral proteases: Crystal structure of a synthetic HIV-1 protease. Science 245, 616-621.
11. Wlodawer, A. and Erickson, J. W. (1993). Structure-based inhibitors of HIV-1 protease. Annu. Rev. Biochem. 62, 543-585.
12. Wlodawer, A., Gustchina, A., Reshetnikova, L., Lubkowski, J., Zdanov, A., Hui, K. Y., Angleton, E. L., Farmerie, W. G., Goodenow, M. M., Bhatt, D., Zhang, L., and Dunn, B. M. (1995). Structure of an inhibitor complex of the proteinase from feline immunodeficiency virus. Nature Struct. Biol. 2, 480-488.
13. Gustchina, A., Kervinen, J., Powell, D. J., Zdanov, A., Kay, J., and Wlodawer, A. (1996). Structure of equine infectious anemia virus proteinase complexed with an inhibitor. Protein Science (in press).
14. Kohl, N. E., Emini, E. A., Schleif, W. A., Davis, L. J., Heimbach, J. C., Dixon, R. A., Scolnick, E. M., and Sigal, I. S. (1988). Active human immunodeficiency virus protease is required for viral infectivity. Proc. Natl. Acad. Sci. U.S.A. 85, 4686-4690.
15. Loeb, D. D., Hutchison, C. A., Edgell, M. H., Farmerie, W. G., and Swanstrom, R. (1989). Mutational analysis of human immunodeficiency virus type 1 protease suggests functional homology with aspartic proteinases. J. Virol. 63, 111-121.
16. Le Grice, S. F., Mills, J., and Mous, J. (1988). Active site mutagenesis of the AIDS virus protease and its alleviation by trans complementation. EMBO J. 7, 2547-2553.
17. Mous, J., Heimer, E. P., and Le Grice, S. F. (1988). Processing protease and reverse transcriptase from human immunodeficiency virus type I polyprotein in Escherichia coli. J. Virol. 62, 1433-1436.
18. Kotler, M., Katz, R., and Skalka, A. M. (1988). Activity of avian retroviral protease expressed in E. coli. J. Virol. 62, 2696-2700.
19. Kaplan, A. H. (1996). Constraints on the sequence diversity of the protease of human immunodeficiency virus type 1: a guide for drug design. AIDS Res. Hum. Retroviruses 12, 849-853.
20. Talbott, R. L., Sparger, E. E., Lovelace, K. M., Fitch, W. M., Pedersen, N. C., Luciw, P. A., and Elder, J. H. (1989). Nucleotide sequence and genomic organization of feline immunodeficiency virus. Proc. Natl. Acad. Sci. U.S.A. 86, 5743-5747.
21. Elder, J. H., Schnolzer, M., Hasselkus-Light, C. S., Henson, M., Lerner, D. A., Philips, T. R., Wagaman, P. C., and Kent, S. B. H. (1993). Identification of proteolytic processing sites within the Gag and Pol polyproteins of feline immunodeficiency virus. J. Virol. 67, 1869-1876.
22. Tabor, S. and Richardson, C. C. (1985). A bacteriophage T7 RNA polymerase/promoter system for controlled exclusive expression of specific genes. Proc. Natl. Acad. Sci. U.S.A. 82, 1074-1078.
23. Studier, F. W., Rosenberg, A. H., Dunn, J. J., and Dubendorff, J. W. (1990). Methods Enzymol. 185, 60-89.
24. Otwinowski, Z. (1992). An Oscillation Data Processing Suite for Macromolecular Crystallography, Yale University, New Haven.
25. Engh, R. and Huber, R. (1991). Accurate bond and angle parameters for X-ray protein-structure refinement. Acta Crystallogr. A 47, 392-400.
26. Brunger, A. (1992). X-PLOR Version 3.1: A System for X-Ray Crystallography and NMR, Yale University Press, New Haven.
27. Powell, M. J. D. (1977). Restart procedures for the conjugate gradient method. Mathematical Programming 12, 241-254.
28. Hendrickson, W. A. (1985). Stereochemically restrained refinement of macromolecular structures. Methods Enzymol. 115, 252-270.
29. Finzel, B. C. (1987). Incorporation of fast Fourier transforms to speed restrained least-squares refinement of protein structures. J. Appl. Crystallogr. 20, 53-55.
A Homology-Based Model of Juvenile Hormone Esterase from the Crop Pest, Heliothis virescens
Beth Ann Thomas, W. Bret Church and Bruce D. Hammock
Departments of Entomology and Environmental Toxicology, University of California, Davis, CA 95616, and Garvan Institute of Medical Research, St. Vincent's Hospital, Sydney NSW 2010, Australia
I. Introduction
JHE (Juvenile Hormone Esterase) (EC 3.1.1.1), a member of the carboxylesterase family, is a specific and kinetically competent enzyme that quickly decreases insect juvenile hormone (JH) titers by hydrolysing JH to the metabolically inactive carboxylic acid. These events occur at appropriate times during insect development. Because of its role in regulating JH titers and thus development, JHE has been a target for development as a biologically based insecticide. An important step in the development of structure/function relationships in proteins is the comparison of the enzymatic and chemical properties with the three-dimensional structure. While JHE is believed to be a member of the α/β hydrolase fold family, only limited structural information is available for esterases (Ollis et al., 1992). In the absence of an X-ray crystal structure, we have constructed a three-dimensional model of JHE by comparison to related proteins in the α/β hydrolase fold family by the method of homology-based molecular modeling (Greer, 1991). In this communication we present the deduced three-dimensional structure of JHE from the crop pest, Heliothis virescens.
B.A.T. is supported by a USDA post-doctoral fellowship, #95-37302-1861. This work was partly funded by the USDA Competitive Research Grants Program, #94-37302-0567, and NIEHS Grant R01 ES02710-16. University of California at Davis is a NIEHS Center for Environmental Health Sciences (P30 ES05707) and an EPA Center for Ecological Health Research.
Beth Ann Thomas et al
II. Materials and Methods
All modeling was performed on a Silicon Graphics INDY using the visualization software, Insight II (Version 95.0) from BIOSYM Technologies, Inc. Homology (Homology User Guide) was used for model generation and evaluation and Discover (Discover User Guide) for energy minimization. All minimizations utilized the CVFF forcefield. Profiles-3D (Luthy et al, 1992) interfaced to Insight II was used to evaluate both sequence alignments and the final model.
A. Primary Sequence Alignment
Multiple sequence alignments were accomplished using a block-oriented algorithm that uses a heuristic divide and conquer strategy (Wathey, 1992; Homology Users Guide). Default values were used for alignments. Profiles-3D was used both to identify sequences homologous to JHE and to assist in the evaluation of sequence alignments of JHE with other α/β hydrolase fold proteins. The following X-ray crystal structures were used in this study: acetylcholinesterase from Torpedo californica (PDB entry 1ace, Sussman et al, 1991), haloalkane dehalogenase from Xanthobacter autotrophicus (PDB entry 1hal, Liao et al, 1992), wheat serine carboxypeptidase II (PDB entry 3sc2, Francken et al, 1991) and lipase from Geotrichum candidum (PDB entry 1thg, Schrag and Cygler, 1993). These four structures have been refined at resolutions of 2.8, 1.9, 2.2 and 1.8 Å, respectively. All sequence alignments were examined in the context of superimpositions of these structures. The alignment of JHE to both acetylcholinesterase and lipase by Cygler et al (1993) was finally used for modeling (Hanzlik et al, 1989). This alignment was produced from 32 sequences of α/β hydrolase fold proteins and was consistent with structural superimpositions and individual alignments in the regions of structural similarity for all proteins of interest. The α-carbon coordinates of both acetylcholinesterase and lipase which are classified as "low variable cluster" by Cygler et al (1993) were used for the superimposition prior to the assignment of three-dimensional coordinates for JHE.
This superimposition included the following secondary structure elements: β sheets 1-7, regions around the amino acids serine, histidine and glutamate that compose the catalytic triad, loop regions that include the loop after β sheet 1 and loops between β sheets 3 and 4 and 4 and 5, helices as, a n and the C-terminal half of helix an. A root mean square deviation of 1.83 Å was obtained in a superimposition of 170 α-carbon coordinates from acetylcholinesterase and lipase.
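The root mean square deviation quoted for the superimposed templates is the usual coordinate RMSD over paired α-carbon atoms. A minimal sketch of that calculation follows; it assumes the two coordinate sets are already superimposed and listed in the same residue order, and it does not perform the optimal rotation and translation that the modeling package carries out beforehand. The function name and the example coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superimposed and that atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates: the second set is the first shifted by 1 angstrom in z.
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(f"RMSD = {rmsd(set_a, set_b):.2f} angstroms")
```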
Juvenile Hormone Esterase from Heliothis virescens
B. Assignment of Structurally Conserved Regions (SCRs) and Loops
The crystal structure of acetylcholinesterase was incomplete in two regions; for the purposes of having a complete model the two N-terminal residues were built in an extended conformation and residues 485-489 were placed by a search of the Protein Data Bank (see later for loop placement methods). The choice in this case has an effect on Profiles-3D results, but neither region was used to build the JHE model. The Profiles quality scores before and after building this region were 248 and 261, respectively, indicating better compatibility between compared sequence and structure as a result of further building. SCRs were determined according to Cygler et al. (1993). In this communication, SCRs are defined as contiguous regions that are assigned to a template and, in most cases, are inclusive of more than one secondary structure element. Some turns and loops were found to be structurally conserved between the templates, and in these cases, JHE was assigned according to the templates. In addition, nonmutated side chains were built according to the reference template that was used for assignment of the SCR in closest proximity to the region of interest. Regions of the sequence not designated as SCRs are designated as loops. Coordinates for JHE SCRs were assigned using Homology according to the template which demonstrated the highest sequence homology. Loop regions of JHE which do not fit the aforementioned criteria were assigned by loop searching in Homology. This algorithm searches the Protein Data Bank for proteins whose sequences immediately N- or C-terminal to the loop of interest are structurally similar to the model. The 10 loop candidates were visually inspected and, in general, the loop displaying the smallest root mean square deviation from the template structure was chosen for the model.
In addition, loop coordinates were assigned from the protein whose loop demonstrates the least steric overlap with atoms of JHE. The three N-terminal amino acids were built in an extended conformation and did not produce any steric clash with the model. A disulfide bond was manually built between JHE Cys 70 and Cys 92 because these two cysteines align with Cys in both templates that form a conserved disulfide bond. JHE Cys 92 aligns with a Cys in both acetylcholinesterase and lipase that is involved in a conserved disulfide bond. JHE Cys 70 aligns with Cys 61 and Cys 67 in lipase and acetylcholinesterase, respectively.
C. Refinement
Following the assignment of all coordinates, conformational searching of mutated side chains was performed using Autorotamer in Homology. Briefly, this algorithm will minimize steric overlap by using an iterative procedure to generate new side chain conformations with the lowest energy. For each nonconserved residue in JHE, the rotamer with the lowest non-bond energy was saved. Junctions between SCRs and loop regions frequently contain peptide bonds that are skewed in length and/or dihedral angle. All SCR-loop splice points were
repaired by 500 iterations each of steepest descents and conjugate gradients minimization while applying a torsion force of 50 kcal/Å² to ω, the dihedral angle that defines the peptide bond, tethering the heavy atoms of the residue in the loop adjacent to the splice point with a force constant of 100 kcal/Å² and fixing the atoms of the SCR. A similar procedure was used in the one case where a splice point was created between SCRs originating from different templates. In this case all heavy atoms from adjacent residues were tethered except the amide nitrogen on the N-terminal side and the carbonyl atoms on the C-terminal side of the peptide bond, which were fixed. This protocol was sufficient to satisfactorily correct most splice points. Selective energy minimization can be used to further reduce the steric strain that arises from building a model from components. The following protocol was used to "relax" the JHE model: 500 iterations of steepest descents minimization were applied to loop atoms, all mutated side chains in the SCRs and the side chains of the three N-terminal residues. A force constant of 100 kcal/Å² was used in all calculations to tether the minimizing atoms. Minimization successfully reduced the total, bond, non-bond and non-bond repulsion energy terms. Finally, the JHE model was submitted for energy minimization with the goal of simply removing steric conflicts without complete minimization. The model was energy minimized by using 100 and 200 iterations of steepest descents and conjugate gradients minimizations, respectively, with a distance-dependent dielectric constant of 10. All backbone atoms in the model were tethered using a force constant of 100 kcal/Å² during the second minimization. This same minimization protocol was applied to both acetylcholinesterase and lipase as a control. The total reduction in energies for JHE was comparable to the observed reductions in the two templates, which were minimized identically as controls.
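The effect of tethering during minimization can be illustrated with a one-dimensional toy problem. The sketch below is not the Discover implementation and does not use the CVFF forcefield: it assumes a single coordinate, a quadratic strain term that pulls the atom toward a relaxed position, and a harmonic tether of the form k(x - x0)² that holds it near its starting position. All parameter values other than the tether constant of 100 and the 500 iterations are invented for illustration.

```python
def tethered_steepest_descent(x0, strain_min, strain_k, tether_k, step, n_iter):
    """Fixed-step steepest descents on E = strain_k*(x - strain_min)^2
    + tether_k*(x - x0)^2, starting from x0."""
    x = x0
    for _ in range(n_iter):
        gradient = 2.0 * strain_k * (x - strain_min) + 2.0 * tether_k * (x - x0)
        x -= step * gradient
    return x


START = 1.0        # starting coordinate (hypothetical)
RELAXED = 3.0      # position favored by the strain term (hypothetical)
STRAIN_K = 10.0    # strain force constant (hypothetical)
TETHER_K = 100.0   # tether force constant, as in the protocol above
STEP = 0.004       # step size chosen small enough for a stable iteration

tethered = tethered_steepest_descent(START, RELAXED, STRAIN_K, TETHER_K, STEP, 500)
free = tethered_steepest_descent(START, RELAXED, STRAIN_K, 0.0, STEP, 500)
print(f"tethered: {tethered:.3f}   untethered: {free:.3f}")
```

With the tether in place the coordinate moves only a short way from its starting value, whereas without it the coordinate relaxes fully. This is the behavior the protocol relies on when it removes steric conflicts without allowing the model to drift away from the template-derived coordinates.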
The rms deviation of 978 α-carbon coordinates of both JHE and acetylcholinesterase was 3.13 and 3.19 Å before and after minimization, respectively. The rms deviation of atom coordinates of JHE (4368 atoms) before and after this minimization protocol is 0.6 Å. Profiles-3D Analysis in Homology was used to test the validity of the JHE model. All default parameters were used for these calculations. The two template structures were treated identically as controls. Accessible solvent area (ASA) calculations were completed for the model and templates as a tool to examine whether JHE forms a globular structure that is similar to the template (Lee and Richards, 1971). The ASA values represent an average of 9 adjacent residues.
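Averaging per-residue ASA values over 9 adjacent residues amounts to a sliding-window mean. The following is a minimal sketch of such smoothing; the handling of chain termini (windows truncated at the ends rather than padded) is an assumption, since the text does not say how the first and last four residues were treated, and the input values are invented.

```python
def windowed_average(values, window=9):
    """Centered moving average; windows are truncated at the chain termini."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical per-residue accessible areas (square angstroms).
per_residue_asa = [120.0, 80.0, 10.0, 5.0, 0.0, 15.0, 60.0, 95.0, 110.0, 40.0, 20.0]
for index, value in enumerate(windowed_average(per_residue_asa), start=1):
    print(f"residue {index:2d}: {value:6.1f}")
```

Smoothing of this kind suppresses residue-to-residue noise so that buried and exposed stretches of the model and the templates can be compared along the sequence.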
III. Results and Discussion
A. Primary Sequence Alignment and Model Generation
The following proteins were chosen for multiple sequence alignment: T. californica acetylcholinesterase, Xanthobacter autotrophicus haloalkane dehalogenase, G. candidum lipase and wheat serine carboxypeptidase. This set was selected because they are all members of the α/β hydrolase fold family (Ollis et al., 1992). This family of proteins, which is believed to have evolved by
divergent evolution, adopts a characteristic core of 8 β sheets connected by α helices despite a lack of significant sequence similarity. The proteins in this family possess several well-conserved residues at the active site, the hydrophobic core, disulfide bridges and salt bridges (Cygler et al., 1993). Table I shows the pairwise sequence identities between JHE and the α/β hydrolase fold proteins examined as potential templates.
Table I. Pairwise Sequence Identities of JHE to Proteins of the α/β Hydrolase Fold Family.

Protein                    PDB Accession Code    % Identity to JHE(a)
acetylcholinesterase       1ace                  27.9
lipase                     1thg                  112
carboxypeptidase II        3sc2                  21.6
haloalkane dehalogenase    1hal                  23.2

(a) % Identity was determined using Pairwise alignment in Homology. Dayhoff mutation matrix, gap penalty and gap length penalty of 6 and 1.65, respectively, were used.
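As an illustration of what a percent-identity figure measures, the sketch below counts identical residues over the aligned (non-gap) positions of two pre-aligned sequences. This denominator is an assumption; the exact definition used by the Homology program is not stated here, and the two short sequences are invented for the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over positions where neither aligned sequence has a gap.

    Assumption: identity is normalized by the number of aligned residue pairs;
    other programs may normalize by the shorter sequence or the full alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Invented toy alignment (not JHE data): 6 identical pairs out of 7 compared.
seq_a = "GESAG-AS"
seq_b = "GESSGGAS"
print(round(percent_identity(seq_a, seq_b), 1))
```

Because the result depends on the alignment itself, the mutation matrix and gap penalties quoted in the table footnote affect the reported identities.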
Acetylcholinesterase and lipase demonstrated the highest overall sequence homology and the best alignment to JHE and were, therefore, used as templates for modeling JHE (Table I). The alignment of JHE to both acetylcholinesterase and lipase in the N-terminal domain of the protein is very good. The N-terminal domain is defined as the N-terminal half of the protein and is consistent with the definition of Cygler et al. (1993). The alignment was much less obvious in the C-terminal domain; the pairwise alignments indicate less than 20 % identity when comparing the C-terminal domain of JHE with either acetylcholinesterase or lipase (Table II).

Table II. % Identity of Domains Between JHE and its Templates.

           % Identity(a)
           N-Terminal Domain      C-Terminal Domain
           ACE       Lipase       ACE       Lipase
JHE        39.7      40.1         19.6      18.8
ACE                  41.9                   23.3

(a) % Identity was determined using Pairwise alignment in Homology. Dayhoff mutation matrix, gap penalty and gap length penalty of 6 and 1.65, respectively, were used.
Multiple sequence alignment in Homology unsurprisingly detected no significant homology in this portion of JHE when compared with either template. However, it has been reported that despite a low sequence similarity in the C-terminal
domains of acetylcholinesterase and lipase, there are portions that are well conserved structurally, such as the region around the His and Glu members of the active site triad (Cygler et al., 1993). It is interesting that two of the three members of the catalytic triad are located in the more divergent domain of these proteins. The active site channel in these proteins is also composed mostly of amino acids from the C-terminal domain. Thus one can imagine that substrate specificity is determined, in part, by variability present in the C-terminal domain of these α/β hydrolase fold proteins. Profiles 3D analysis was used to generate an alignment between the C-terminal domains of interest since this method is theoretically able to detect similarities in proteins that are less than 25 % identical (Luthy et al., 1992). While the program aligned the N-terminal domain in approximate congruence with the results obtained by multiple sequence alignment in Homology, it did not find significant similarities between templates and JHE. Profiles-3D analysis of JHE as compared to acetylcholinesterase yielded a Z score of 24.4. This type of analysis yields Z scores that are consistent with the idea that the templates and JHE possess the same general protein fold; a Z score > 7 indicates that the two proteins probably assume a similar three-dimensional fold (Bowie et al., 1993). However, this alignment encompassed only the N-terminal domain of the proteins. The Z scores in Profiles-3D improved when a complete model for acetylcholinesterase was used. The N- and C-terminal domains of JHE have 39.7 and 19.6 % homology to the comparable domains in acetylcholinesterase, respectively (Table II). The C-terminal domains of the test proteins were independently analyzed by Profiles. The program produced little alignment in this domain between JHE and each of its templates. Most of the C-terminus of each template, in fact, was not aligned with JHE.
A closer examination of the regions not included in any sort of alignment revealed that there is a sequence identity of 17 and 24 %, respectively, between acetylcholinesterase and its corresponding JHE sequence aligned according to Cygler et al. (1993). Furthermore, these regions of the templates are composed mostly of "highly variable" regions, 61 and 80 %, respectively, for acetylcholinesterase and lipase. Therefore, we conclude that it is not surprising that Profiles analysis yielded negligible results for these alignments. Several available alignments were examined for sequence and structural homology. The alignment of lipase and acetylcholinesterase with JHE by Cygler et al. (1993) was ultimately used for modeling. Figure 1 shows the alignment used for SCR assignment and model building. This alignment contains minor modifications in the C-terminus. For example, to improve the alignment of β sheets 11 and 12 in acetylcholinesterase and lipase, 2 gaps were introduced after THG Thr 508. Two gaps were introduced in acetylcholinesterase before Leu 516, which improved the sequence homology score from -18 to -8. This region of the proteins is classified as "highly variable" according to Cygler et al. (1993); thus it is not surprising that it is difficult to be absolutely certain about the local sequence alignment. It has been observed that despite a low sequence similarity among proteins in the α/β hydrolase fold family they possess a surprising structural homology (Ollis et al., 1992; Cygler et al., 1993). Cygler et al. (1993) found, as expected, a high degree of sequence and structural homology in the N-terminal domains of acetylcholinesterase and lipase (~ 60 % of the total sequence). While similarity was much less evident in the C-terminal domain, there are some regions of this domain that are structurally conserved. In general, the α/β hydrolase fold shows the lowest structural variability in the core of the protein, which includes β sheets 1-7 (Ollis et al., 1992). Most of the α helices are less well conserved, in general, except as (Cygler et al., 1993). A view of the general topology of the JHE model is shown in Figure 2.
Figure 2. Ribbon diagram of JHE using MOLSCRIPT (Kraulis, 1991) showing the typical α/β hydrolase fold topology consisting of 8 β sheets surrounded by α helices.
Portions of JHE not assigned according to a template are classified as loops. All loop assignments are shown in Table III.
Table III. Loop Structure Assignments for JHE

Loop   JHE #(a)   JHE Seq(b)    Template(c)   Template #(a)   Template Sequence(b)
1      80-85      LMAASN        2cna          159-164         SPEGSS
2      104-113    LPRVRGTTPL    1aid          252-257         RRTVPPAVTG
3      142-145    TKNV          2prk          145-148         SSGV
4      160-163    LSMN          1csc          263-266         GLAG
5      255-262    LGNQRDGS      2sq)          A34-A41         SEMKAEHA
6      280-289    ANAVLIEQIC    2q)p          176-185         TDQMTRPDGS
7      293-297    FLPIV         3dfr          86-90           HLDQE
8      344-350    NFDLVKK       2ccy          A93-A99         AAAKAGP
9      383-388    YYNGTI        3hhb          A29-A34         LSFPTT
10     414-418    AETGG         2gbp          51-55           LAKGV
11     428-437    YEGQNSIIK     3rnt          55-63           PILSSGDVYS
12     440-444    GLNHE         3rn3          32-36           NLTKD
13     465-472    LHASPSEN      2cyp          75-82           FNDPSNAG
14     494-500    TCEDNNS       21bl          109-115         FESAQFP
15     510-513    MQYE          1ake          A144-A147       TGEE
16     525-530    EFASRQ        6xla          358-363         AAAARG

(a) Amino acid numbers for JHE or template structures.
(b) Amino acid sequence using the standard residue notation.
(c) Standard Protein Data Bank code names for template structures.
B. Analysis of the model
Several algorithms were used to test the validity of the model. Profiles 3D analysis will detect misfolded regions of a protein by measuring the compatibility of a one-dimensional protein sequence with its three-dimensional coordinates. The profiles score (S) is used as a measure of correct folding; average profile scores of correctly folded proteins are ~ 0.5 while misfolded structures often have values of -- 0.1 (Luthy et al., 1992). The average amino acid profile score for JHE is 0.32 while it is 0.54 and 0.53, respectively, for acetylcholinesterase and lipase. Thus, the JHE score falls between the values typical of correctly folded and misfolded structures. The sections of JHE that have scores less than zero correspond to "highly variable" regions and are the only sections in the model that fall below zero in profile score. The solvent-accessible hydrophobic surface area of the model was also calculated as a measure of structure validity since it is known that, in general, charges buried in the protein interior are inconsistent with the principles of protein structure. Figure 3 shows that the accessible surface area calculations of JHE relative to acetylcholinesterase are comparable.
[Plot: ASA Calculations for the JHE and ACE structures; horizontal axis: JHE residue number, 50 to 500]
Figure 3. Accessible Solvent Area as a function of aligned sequence for JHE (thin line) and acetylcholinesterase (bold line). Numbering is according to the JHE sequence. Solid horizontal bars represent regions of the model originally taken from lipase or acetylcholinesterase and are considered to be core components of the model.
This is yet another indication of overall correct folding of JHE. However, the regions where the calculated area is greater than the threshold that would be indicative of incorrect folding are classified as "highly variable" regions and the absolute N- and C-termini. In addition, the quality of the model was evaluated using Prostat. This algorithm checks the structure for correctness in geometry by evaluating parameters such as bond length, bond angle, dihedral angle. The Prostat results for the model were qualitatively compared with those from the templates. For example, 73.2 % of the JHE residues were found to be in acceptable regions of the Ramachandran plot whereas the figures for acetylcholinesterase and lipase are 80.2 and 83.4 %, respectively. All bond lengths and dihedral angles were found to be in acceptable regions with the exception of two regions of the protein (JHE amino acids 493-494 and 437-438)
where extensive splice repairing had to be done to correct skewed peptide bond dihedral angles. These omega angles could not be totally corrected even when applying a high force constant. As another measure of structure validity, the CD spectrum of wild-type JHE was determined to compare the amount of α-helical secondary structure calculated from the spectrum with that observed in the model. Deconvolution of the data in the wavelength region of 200 to 300 nm yields 36 % α helix (J. Magdalou, personal communication). This compares to 30 % α helix in our JHE model.
IV. Conclusions
The JHE model was constructed in order to better test hypotheses of catalytic activity, substrate preference and protein uptake. It will also allow us to improve our ability to rationally design modified forms of JHE as a biologically based insecticide. This model was built using both acetylcholinesterase from T. californica and lipase from G. candidum as templates and is consistent with the principles of protein structure.
References
Barton, G.J. (1993) Prot. Eng. 6, 37-40.
Bowie, J.U., Luthy, R. and Eisenberg, D. (1993) Science 253, 164-170.
Cygler, M., Schrag, J.D., Sussman, J.L., Harel, M., Silman, I., Gentry, M.K. and Doctor, B.P. (1993) Prot. Sci. 2, 366-382.
Discover User Guide (1994) version 2.9.5. San Diego: Biosym Technologies, Inc.
Franken, S.M., Rozeboom, H.J., Kalk, K.H. and Dijkstra, B.W. (1991) EMBO J. 10, 1297-1302.
Greer, J. (1991) Meth. Enzym. 202, 239-252.
Hanzlik, T.N., Abdel-Aal, Y.A.I., Harshman, L.G. and Hammock, B.D. (1989) J. Biol. Chem. 264, 12419-12425.
Homology Users Guide (1993) version 2.3. San Diego: Biosym Technologies, Inc.
Kraulis, P.J. (1991) J. Appl. Cryst. 24, 946-950.
Lee, B. and Richards, F.M. (1971) J. Mol. Biol. 55, 379-400.
Liao, D., Breddam, K., Sweet, R.M., Bullock, T. and Remington, S.J. (1992) Biochemistry 31, 9796-9812.
Luthy, R., Bowie, J.U. and Eisenberg, D. (1992) Nature 356, 83-85.
Ollis, D.L., Cheah, E., Cygler, M., Dijkstra, B., Frolow, F., Franken, S.M., Harel, M., Remington, S.J., Silman, I., Schrag, J., Sussman, J.L., Verschueren, K.H.G. and Goldman, A. (1992) Prot. Eng. 5, 197-211.
Schrag, J.D. and Cygler, M. (1993) J. Mol. Biol. 230, 575-591.
Sussman, J.L., Harel, M., Frolow, F., Oefner, C., Goldman, A., Toker, L. and Silman, I. (1991) Science 253, 872-879.
Wathey, J.C. (1992) Prot. Sci. 2 (Suppl. 1), 142 (abstract).
Analysis of Linkers of Regular Secondary Structures in Proteins
V. Geetha and Peter J. Munson
Analytical Biostatistics Section, Laboratory of Structural Biology, DCRT, National Institutes of Health, Bethesda, MD 20892

I. Introduction
Linkers are polypeptide segments connecting regular secondary structural regions of the protein backbone. They were previously categorized as "random coils" with no defined secondary structure. This notion has been contradicted by a growing body of literature on these linker regions from various groups over the last few years (1-16). In spite of their flexibility, the loops are important for protein structure and function. The EF-hands of calcium binding proteins, the helix-turn-helix motif of repressor proteins and the antigen binding sites of immunoglobulins are classic examples of repeating structural motifs which are important for recognition. Most of the common structural motifs that lead to super-secondary structural features were described by Efimov and Thornton (1-5). The folding units made up of secondary structural elements adjacent along the chain have also been found to associate rapidly to form compact structures (4). β-hairpins with various size linkers (9-12) have been shown to exhibit preferences for both sequence and conformation in the linker regions. For example, a two-residue linker between two β-strands has mostly turn types I' or II'. Nevertheless, the description of such motifs rests strongly upon the method of classification of the loop structures. Different groups have attempted different clustering parameters to obtain a reasonable description of the so-called "random coils". Thornton's description of loop regions into structural families (5) is based on φ,ψ and, as has been pointed out by Wodak and her co-workers (17), the end-to-end distances are more closely correlated to backbone RMSD than the backbone torsions. Even Efimov's description of loops based on the Ramachandran plot is not better in delineating structural classes. A recent study of the taxonomy of loops (18) has considered the end-to-end distances and an axial descriptor using virtual positions to identify families of clusters.
This exhaustive method combines the same linker lengths irrespective of the flanking elements. We feel that the nature of the flanking elements is important for understanding the linkers. The present study of linkers of regular secondary structures seeks an understanding of the non-regular regions through a systematic classification scheme. The method, like any earlier method, depends strongly upon a good secondary structural assignment method. We have studied various secondary structural definitions including DSSP (19), SSTRUC (20) [combination of DSSP and extension of helices and strands according to IUPAC standards (21)]
and STRIDE (22) on a standard data set. We have chosen STRIDE (22) which has slightly improved assignment over other methods. Such analyses of linker regions should certainly pave the way for the improvement of present loop modeling procedures.
II. Methods
A. Data Base of Linkers
A data set of 330 unique protein chains which have <45% sequence identity among themselves (23,24), with resolution better than 2.25 Å and R-factor <0.25, is considered. Short segments of less than eight residues should have a higher percentage of sequence identity [according to the function derived by Schneider and Sander (25)] to be called homologous pairs. Moreover, even identical pentapeptides or hexapeptides may have different conformations in different environments (26). Secondary structural regions are designated according to the secondary structural assignment program STRIDE (22). Linker regions of lengths from two to eight residues which connect regular secondary structures (helices and strands) are isolated from the data set (Figure 1).
[Histogram panels: H-Ln-H, H-Ln-E, E-Ln-E; horizontal axis: linker length, 2 to 8 residues; vertical axis: number of linkers]
Figure 1. Database of linkers of length two to eight residues flanked by different secondary structural regions. Note the large population of two- to four-residue linkers flanked by β-sheets.
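The isolation of linkers described in Methods can be sketched as follows. This is an illustrative reading, not the authors' program: it assumes a per-residue secondary-structure string in which only the states "H" (helix) and "E" (strand) count as regular, and every other state is treated as linker. How other states reported by an assignment program (for example other helix types or isolated bridges) were handled in the study is not specified, so that mapping is an assumption, as is the function name `find_linkers`.

```python
from itertools import groupby

REGULAR = {"H", "E"}  # assumption: only these states count as regular elements


def find_linkers(ss, min_len=2, max_len=8):
    """Return (linker_type, start_index, length) for each linker in `ss`.

    A linker is a run of non-regular states flanked on both sides by a helix
    or strand run; chain termini are therefore never reported as linkers.
    """
    runs = []
    pos = 0
    # Collapse the string into runs; all non-regular states share the key "L".
    for state, group in groupby(ss, key=lambda c: c if c in REGULAR else "L"):
        length = len(list(group))
        runs.append((state, pos, length))
        pos += length
    linkers = []
    for i in range(1, len(runs) - 1):
        state, start, length = runs[i]
        if state == "L" and min_len <= length <= max_len:
            before, after = runs[i - 1][0], runs[i + 1][0]
            linkers.append((f"{before}-L{length}-{after}", start, length))
    return linkers


# Invented example string: helix, 2-residue linker, strand, 4-residue linker,
# strand, 9-residue segment (too long to be counted), helix.
ss = "CC" + "H" * 6 + "TT" + "E" * 5 + "CCCG" + "E" * 5 + "T" * 9 + "H" * 4 + "CC"
for linker in find_linkers(ss):
    print(linker)
```

Keeping the flanking element types in the label (H-Ln-E, E-Ln-E, and so on) mirrors the classification used throughout this chapter.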
There are in total about 3133 linkers, of which 446 are between two helices (H-Ln-H), 748 between a helix and a strand (H-Ln-E), 558 between a strand and a helix (E-Ln-H), and 1381 between two strands (E-Ln-E), where n = 2 to 8.
B. Clustering of Linkers and analysis of conformations
Linkers of length n = 2 to 8 are described by various α-carbon distances that include those between the entering and exiting regular secondary structural residues. The α-carbon distances for a linker of size n are [Cα(i-1) to Cα(i+j)] for j = 1, n for every linker residue i. An initial clustering procedure with all the relevant α-carbon distances is performed using K-means clustering (27) with the commercially available software package JMP™ (28). These clusters are in turn clustered by K-means based on backbone torsions. The resultant sub-clusters from the backbone conformational space are once again K-means clustered for the side chain torsions. We have not made any attempt to do a complete clustering of the entire data. At each clustering stage, we have eliminated data that lie outside the clusters.
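The first clustering stage can be sketched in a few lines. The sketch assumes the distance descriptor is the set of distances from the Cα of the residue preceding the linker to the Cα of each of the next n residues after the first linker position, and it uses a plain Lloyd-style K-means iteration started from user-supplied centers. The K-means implementation in JMP may differ in initialization and stopping rules; the function names and all numbers below are hypothetical.

```python
import math


def ca_distance_features(ca_coords, first_linker_index, n):
    """Distances from CA(i-1) to CA(i+j), j = 1..n, with i the first linker residue."""
    anchor = ca_coords[first_linker_index - 1]
    return [math.dist(anchor, ca_coords[first_linker_index + j])
            for j in range(1, n + 1)]


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iteration from the given starting centers; returns (labels, centers)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        new_labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        # Move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Invented two-residue-linker descriptors (distances in Angstrom), two groups.
features = [[5.5, 5.0], [5.6, 5.2], [5.4, 4.9],
            [6.8, 9.9], [7.0, 10.2], [6.9, 10.0]]
labels, centers = kmeans(features, centers=[features[0], features[3]])
print(labels)
```

The same `kmeans` function could then be reapplied within each distance cluster using backbone torsions, and again using side chain torsions, to mimic the staged procedure described above; torsion angles are periodic, so a real implementation would need an angular distance rather than the Euclidean one used here.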
Table I. Linkers of length two residues
Linker Type
Number of Sub Clusters
Number of examples
H-L2-H
2
21 28
(t)V (LI)
(L2)
PP
PP
non specific
H-L2-E
1
84
E-L2-H
2
27
PE
pp
37
OtR
PE
186
ttR
OR
E-L2-E
4
PP
ttL
105
PP
e
"R
III. Results and Discussion
A. Two-residue linkers
The two-residue linkers between helices (H-L2-H) are analyzed in terms of the various α-carbon distances as described in Methods. Table I lists the two-residue linkers with the conformations of the sub-clusters identified. There is a major cluster found for this linker which divides into two clusters in φ,ψ space. The φ,ψ regions considered here are αR (right-handed helix), αL (left-handed helix), βE (extended), βP (parallel strands) and ε (occupied mostly by Gly) (4). More than half of the residues at position L1 are hydrophobic and most residues at the second position are Ser/Thr/Pro. Ser and Thr could mimic Pro in certain positions, as proposed by Thornton and her co-workers (5). About half of the linker residues at the first position of the second sub-cluster are Gly/Asn, and around half of those at the second position are Ser/Thr. This linker brings about a chain reversal of close to 90°, and such linkers cannot be ignored as suggested earlier (18). The (H-L2-E) linker also shows a strong amino acid preference at the first position. Around 80% of them are Gly/Asn and about the same number of them are hydrophobic at the second position of the linker. The φ,ψ distributions at both positions are shown in Figure 2. The corresponding side chain torsion angle χ1 is found to have g"" conformation both at the first and second positions of the linker residue.

Figure 2. φ,ψ distributions for a sub-cluster of a H-L2-E linker, where φ1,ψ1 (.) is for the L1 and φ2,ψ2 (+) is for the L2 linker residue.

The linker between strand and helix (E-L2-H) has also been identified to form distinct clusters. One of them prefers βEβP main chain conformations at both the linker sites and the other, the αRβE conformation. The side chain torsion χ1 of the second residue is around g" conformation. The linkers between strands (E-L2-E) have three clusters identified in the Cα distance space (Figure 3a). The first cluster indicates two distinctive sub-clusters in φ,ψ space (Figure 3b), corresponding to type I and type I' turn conformations. We have not made a distinction between the α and γ regions of the φ,ψ map since we have only five distinct regions instead of the usual seven regions of the Ramachandran plot. The second major cluster in turn gives rise to two distinct sub-clusters in φ,ψ space (Figure 3c), corresponding to type II and II' turns. Although it is known from earlier studies that two-residue hairpins have turn type conformations I, I', II', the present study has very clearly distinguished the four hydrogen bonded turn types based on distance and main chain torsion clustering.
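The main chain and side chain torsions used for clustering are dihedral angles defined by four consecutive atoms. A minimal sketch of that calculation is given below, using the standard atan2 formulation; the function name `dihedral` and the test coordinates are illustrative only. For φ the four atoms would be C(i-1), N(i), Cα(i), C(i), and for ψ they would be N(i), Cα(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees (range -180 to 180) for four points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Illustrative points: the last atom is rotated a quarter turn about the
# central bond relative to the first atom.
print(round(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)), 1))
```

Assigning a residue to one of the five φ,ψ regions named in the text would then require region boundaries, which are not given here and are therefore left out of the sketch.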
[Panel (A): scatter plots of the α-carbon distances Cα(i-1)-Cα(i+1) and Cα(i-1)-Cα(i+2); panels (B) and (C): φ,ψ plots]
Figure 3. (a) Major clusters identified through α-carbon distances for E-L2-E linker; (b) φ,ψ plot for one of the clusters (Fig. 3a) for E-L2-E linker. The turn types I & I' are shown to isolate; (c)
Considering the second cluster of E-L2-E, about one fourth of the linkers adopt the turn type II conformation, although such turns are known to occur seldom in hairpins. A few of the members of this structural sub-family have medium to heavily twisted β-strands and the linker has both C'-O(i) to N-H(i+3) as well as N-H(i) to C'-O(i+3) hydrogen bonds (Figure 4). The above clusters correspond to chain-reversal linkers of E-L2-E. A third cluster is found to be quite variable and constitutes the "non-chain-reversal linker", as found in the distance cluster (Fig. 3a). Some of these correspond to β-bulges, where φ,ψ
values are βE βE and they do not bring about chain reversals of the polypeptide chain.
B. Three Residue linker
Linkers of length three connecting various secondary structural elements with distinct sub-clusters are listed in Table II. Those that connect two helices (H-L3-H) have a mostly hydrophobic central residue, with the side chains pointing inwards into the space between the two packed helices (Figure 5a). This is in agreement with earlier observations of helix interactions linked by residues of length three (5). The linker is slightly extended in order to accommodate the side chain of the middle hydrophobic residue. The side chain torsion χ1 of the second linker residue is mostly g" and the third linker residue (whose main chain is in an extended conformation) has a g' side chain conformation. The second major cluster for the three-residue packing between helices turns out to be a turn-like conformation, more folded when compared to the previous conformation. In this case, the central residue is pointing outside and it is clear that the folded turn-like conformation (although there is no indication of main chain hydrogen bonding) cannot accommodate the side chain of the middle residue, as shown in Figure 5b. The three-residue linker between two strands (E-L3-E) has a single major cluster (Table II). The side chain conformation of the second linker residue is mostly g". The third linker residue is mostly Gly for this linker type. A few of the representatives from this cluster are shown in Figure 6. The remarkable similarity between the backbone conformations of the three-residue linkers of H-L3-H (second sub-cluster) and E-L3-E is obvious (Figure 6). The carbonyl oxygen is pointing outwards and the side chain of L2 also points outwards, in both the linkers.
Figure 4. E-L2-E linkers shown along with the flanking strands for type II turns. Members of the cluster are 1har_ (184-185), 1pmy_ (12-13), 2mcm_ (41-42), 1led_ (32-33).
The linker between helix and strand (H-L3-E) has one major sub-cluster. The first residue is mostly Gly and the second is hydrophobic with a significant population of Ala residues. The side chain of the central linker residue points outwards but the carbonyl oxygen of L2 points inside the space between the packed helices (Figure 7), making this different from those observed in H-L3-H or E-L3-E linkers. The linker E-L3-H that links strands and helices has two major clusters and descriptions of backbone conformations are listed in Table II. The third linker residue L3 of the first sub-cluster has χ1 of g conformation. For the second sub-cluster, L2 has an αR main chain conformation and g" side chain conformation, and L3 has an extended backbone conformation and g" side chain conformation.
Figure 5. (a) H-L3-H linkers of a sub-cluster showing the hydrophobic side chains pointing inwards. The main chain conformation is slightly extended. The proteins are 1chmA (158-160), 1dsbA (116-118), 1ede_ (228-230). (b) H-L3-H linkers of another sub-cluster showing the side chains of L2 pushed outwards from the packed space of the flanking helices, due to the folded backbone conformation. The representatives of the cluster are 1iscA (167-169), 2lhb_ (88-90), 1arp_ (252-254).
C. Linkers of length greater than three residues
The four-residue linker between two strands (E-L4-E) shows distinct clusters. One fourth of the L2 residues of the first sub-cluster are either Asp, Gly or Ser/Thr/Pro. The main chain torsions preferred are αR for the first three residues and αL for the last residue. For the linker H-L4-E, one of the sub-clusters shows a very strong preference for Pro at the second position (about half of observed L2), which constrains the φ angle to be negative as observed, and for Asp at the third position (again about half of observed L3 residues). The E-L4-H linker is similar to the first cluster of the H-L4-E linker but the difference lies in the strong preference of Pro at

Table II. Linkers of length three residues
Linker Type    Number of Sub-Clusters    Number of examples    φ,ψ (L1)
H-L3-H
2
40
aR(/aL)
%
PE
16
OCR
OtR
bridge
H-L3-E
1
56
ttL
PP
OtR
E-L3-E
1
42
ttR
OtR
OtL
E-L3-H
2
38
non specific PP non specific OtR
PP
29
PE
Figure 6. E-L3-E linkers of a distinct sub-cluster showing the L2 residues pointing inwards. The examples are 1arb_ (201-203), 1cobA (89-91), 7ccp_ (217-219).
the second position of the linker residue in H-L4-E. The behavior of both H-L4-E and E-L4-H linkers, with the same preferred main chain conformations at the second and third positions of the linker residues, and the role of the flanking secondary structural regions remain to be explored. The five-residue linker between helices (H-L5-H) divides into three major clusters. The main chain torsion preferences for the first sub-cluster are mostly αR at the first position and βE at the second to fifth positions of the linker. It is not clear if the first position is still a helical residue even though the main chain torsion is still helical. There is a strong preference for Gly at this position (about half of observed) and it might act as a helix breaker rather than participating in the helical conformation. The third and fourth residues are mostly hydrophobic (about half of them). The entering helix is quite short compared to the exiting helix. The extended linker conformation allows mostly the aromatic hydrophobic side chains to point inwards into the gap between the packed helices. The other sub-cluster of H-L5-H exhibits a folded turn conformation with a C-O(i) to N-H(i+3) hydrogen bond in the linker region, quite different from the first sub-cluster. A third distinct sub-cluster has a 3₁₀ helical conformation in the linker region. The E-L5-E linker has one large cluster; the second and third residues of the linker prefer mostly αR and the fourth residue αL conformations. The linker H-L5-E has a preference for an α main chain conformation at both the second and third residues and is non-specific for the other linker residues. Pro is found more often at the second position. The E-L5-H linker has no specific preference for main chain torsions at the first and fourth positions. However, the L2 to L4 positions prefer βE, αR and αR, respectively. Our analyses show that other linkers of less than eight residues form similar clusters in the backbone conformational space.
Figure 7. H-L3-E linkers with L2 residues pointing outwards. The examples are 1cdg_ (70-72), 1dts_ (30-32), 1chmA (39-41).
IV. Conclusions
Our method of clustering based on α-carbon distances and, in turn, backbone and side chain torsions has demonstrated the existence of possible structural motifs within well defined linker regions. The method is not exhaustive enough to account for data outside the clusters, and its limitations arise from the non-standard way of defining the limits of the clusters. Nevertheless, the method does seem to bring out the structural differences among various linkers. Our analyses on mainly-α proteins from the CATH database (results not shown here) have resulted in two major clusters of three-residue linkers between helices (H-L3-H), and a few of the helical proteins have the same angle of orientation of helices within a cluster. We are yet to extend this observation to the unique data set of protein chains discussed in this paper.
References
1. Efimov, A.V. (1984). FEBS Lett. 166, 33
2. Efimov, A.V. (1986). Molecular Biology (Moscow). 20, 250
3. Efimov, A.V. (1991). Protein Eng. 4, 245-250
4. Efimov, A.V. (1993). Prog. Biophys. Mol. Biol. 60, 201
5. Thornton, J.M., Sibanda, B.L., Edwards, M.S. & Barlow, D.J. (1988). 8, 63
6. Tramontano, A. & Lesk, A.M. (1992). Proteins: Str. Funct. & Genet. 13, 231
7. Wintjens, R.T., Rooman, M.J. & Wodak, S.J. (1996). J. Mol. Biol. 255, 235
8. Richards, F.M. & Kundrot, C.E. (1988). Proteins: Str. Funct. & Genet. 3, 71
9. Sibanda, B.L. & Thornton, J.M. (1985). Nature, 316, 170
10. Sibanda, B.L. & Thornton, J.M. (1991). Methods Enzymol. 202, 59
11. Mattos, C., Petsko, G.A. & Karplus, M. (1994). J. Mol. Biol. 238, 733
12. Milner-White, E.J. & Poet, R. (1986). Biochem. J. 240, 289
13. Rooman, M., Rodriguez, J. & Wodak, S. (1990). J. Mol. Biol. 213, 327
14. Ring, C.S., Kneller, D.G., Langridge, R. & Cohen, F.E. (1992). J. Mol. Biol. 224, 685
15. Scheerlinck, J.-P.Y. et al. (1992). Proteins: Str. Funct. & Genet. 12, 299
16. Rice, P.A., Goldman, A. & Steitz, T.A. (1990). Proteins: Str. Funct. & Genet. 8, 334
17. Boutonnet, N.S., Rooman, M.J., Ochagavia, M., Richelle, J. & Wodak, S. (1995). Prot. Engg. 8, 647
18. Kwasigroch, J., Chomilier, J. & Mornon, J. (1996). J. Mol. Biol. 259, 855
19. Kabsch, W. & Sander, C. (1983). Biopolymers, 22, 2577
20. Smith, D.K. (1989). unpublished results
21. IUPAC-IUB (1970). IUPAC-IUB commission on biochemical nomenclature. Abbreviations and symbols for the description of the conformations of polypeptide chains, J. Mol. Biol. 52, 1
22. Frishman, D. & Argos, P. (1995). Proteins: Str. Funct. & Genet. 23, 566
23. Hobohm, U., Scharf, M., Schneider, R. & Sander, C. (1992). Prot. Sci. 1, 409
24. Hobohm, U. & Sander, C. (1994). Prot. Sci. 3, 522
25. Schneider, R. & Sander, C. (1991). 9, 56
26. Cohen, B.I., Presnell, S.R. & Cohen, F.E. (1993). Prot. Sci. 2, 2134
27. Fischer, L. & Van Ness, J.W. (1971). "Admissible Clustering Procedures", Biometrika, 58, 91
28. JMP™ 3.0.1 - Statistics made visual from SAS Institute Inc. (1994)
Structural and Functional Roles of Tyrosine-50 of Yeast Guanylate Kinase
Yanling Zhang, Yue Li and Honggao Yan
Department of Biochemistry, Michigan State University, East Lansing, Michigan
I. Introduction
Guanylate kinase (GK) catalyzes the reversible phosphoryl transfer from ATP to GMP in the presence of Mg2+ (Agarwal et al., 1978). It belongs to a family of nucleoside monophosphate kinases that includes adenylate kinase (AK) and uridylate kinase. GK is essential for converting GMP to GDP and therefore for the synthesis of GTP. It also plays an important role in the cGMP cycle (Hall & Kühn, 1986), and is required for metabolic activation of the antiviral drugs acyclovir and ganciclovir (Miller & Miller, 1980; Boehme, 1984). Interestingly, several proteins, including the protein encoded by the Drosophila tumor suppressor gene dlg-h (Woods & Bryant, 1991), a rat presynaptic density protein (SAP90) (Kistner et al., 1993), a rat postsynaptic density protein (PSD95) (Cho et al., 1992), and the major palmitoylated membrane protein p55 of human erythrocytes (Bryant & Woods, 1992), share significant homology with the entire sequence of GK. It has been suggested that GK may be involved in guanine nucleotide-mediated signal transduction pathways by regulating the ratio of GTP to GDP (Woods & Bryant, 1991). Among nucleoside monophosphate kinases, AK has been extensively studied (Tsai & Yan, 1991). Although the reaction catalyzed by GK is very similar to that of AK, the two enzymes are only distantly related and share only 13% identity in their amino acid sequences. The putative ATP binding domain of GK is very similar to that of AK; however, the GMP binding domain of GK and the AMP binding domain of AK are grossly different in structure (Stehle & Schulz, 1992). While the GMP binding domain consists of a mixed β-sheet and a short helix, the AMP binding domain is completely α-helical.
GK is rather specific with respect to the nucleoside monophosphate substrate (Agarwal et al., 1978; Konrad, 1992). The amino acid residues involved in binding of GMP form a distinct nucleoside monophosphate binding motif.
Tyr-50 is strictly conserved among GKs and other related proteins (Zschocke et al., 1993). X-ray crystallography reveals that the hydroxyl group of Tyr-50 is hydrogen bonded to the phosphoryl group of the bound GMP in the GK·GMP complex (Stehle & Schulz, 1992). In the unliganded enzyme it is hydrogen bonded to the main-chain carbonyl oxygen of Asp-98 (Ji et al., personal communication). In this report we evaluate the structural and functional roles of Tyr-50 by site-directed mutagenesis in conjunction with kinetics, 2D NMR and guanidine hydrochloride-induced denaturation. We show that Tyr-50 stabilizes the GK·GMP complex by 2.2 kcal/mol, the ternary complex by 2.1 kcal/mol, and the transition state by 3.2 kcal/mol. Tyr-50 is not essential for proper folding of the enzyme, but it contributes 2.3 kcal/mol to the conformational stability.
II. Experimental Procedures
A. Site-Directed Mutagenesis
The oligonucleotide used to make the Y50F mutant was 5'-GGTAAGGACTTTAACTTTGTC-3'. The mutant was generated by the method of Kunkel (1985) and screened by DNA sequencing. To ensure that there were no unintended mutations, the entire DNA sequence of the mutated gene was determined. Expression and purification of the mutant were the same as described for the wild-type GK, except that the mutant was eluted with 5 mM ATP in the Affi-Gel Blue chromatography step (Li et al., 1996).
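As a quick sanity check on the mutagenic oligonucleotide, the sketch below translates it codon by codon. It rests on assumptions that the text does not state: that the oligonucleotide is the coding strand, that it is read in frame from its first base, and that the first codon corresponds to residue 47 (the name FIRST_RESIDUE is hypothetical). Under those assumptions the fourth codon, TTT, encodes the phenylalanine that replaces Tyr-50, and the sixth codon agrees with the Phe-52 mentioned in the NMR assignments. The codon table is the standard genetic code, restricted to the codons needed here.

```python
# Minimal sketch: translate the Y50F oligonucleotide in an ASSUMED reading frame.
# Assumptions (not given in the text): coding strand, frame starts at base 1,
# and the first codon is residue 47, so that codon 4 is residue 50.

# Standard genetic code, restricted to the codons relevant to this oligo.
CODON_TABLE = {
    "GGT": "Gly", "AAG": "Lys", "GAC": "Asp",
    "TTT": "Phe", "AAC": "Asn", "GTC": "Val",
    "TAT": "Tyr", "TAC": "Tyr",  # the two tyrosine codons, for comparison
}

OLIGO = "GGTAAGGACTTTAACTTTGTC"
FIRST_RESIDUE = 47  # hypothetical numbering offset


def translate(dna, first_residue):
    """Return [(residue_number, codon, amino_acid)] for each complete codon."""
    out = []
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        out.append((first_residue + i // 3, codon, CODON_TABLE[codon]))
    return out


if __name__ == "__main__":
    for number, codon, amino_acid in translate(OLIGO, FIRST_RESIDUE):
        print(number, codon, amino_acid)
```

A single base change separates the tyrosine codons (TAT, TAC) from TTT, which fits a one-residue substitution of this kind.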
B. Steady-State Kinetics
Kinetic experiments were carried out by measuring the formation of ADP and GDP with a coupled assay as previously described (Agarwal et al., 1978; Li et al., 1996). Kinetic parameters were obtained by nonlinear least-squares analysis of the data according to Cleland (1986).
C. Determination of the Dissociation Constants of Binary Complexes by NMR
Substrate titration experiments were carried out in a 5 mm NMR tube. GK was dissolved in a D2O buffer containing 20 mM perdeuterated Tris and 100 mM KCl, pH 7.5 (pH meter reading without correction for deuterium isotope effects). The initial volume was 0.65 ml. The initial GK concentration was 0.63 mM. The concentrations of the ATP and GMP stock solutions were 27.5 and 11.2 mM, respectively. 1D proton NMR spectra were acquired at 25 °C on a
Varian VXR 500 NMR spectrometer operating at a proton frequency of 500 MHz. The spectral width was 6000 Hz with the carrier frequency at the HDO resonance. The solvent resonance was suppressed by presaturation. Each FID was composed of 16 k data points with 80 transients. The delay between successive transients was 2.8 s. The time domain data were processed by zero-filling to 32 k points, exponential multiplication (0.5 Hz) and Fourier transformation. Chemical shifts were referenced to internal sodium 3-(trimethylsilyl)propionate-2,2,3,3-d4. The Kd values were obtained by nonlinear least-squares fit of the data to the equation

\[ \delta = \delta_f + (\delta_b - \delta_f)\,\frac{(E_t + L_t + K_d) - \sqrt{(E_t + L_t + K_d)^2 - 4E_tL_t}}{2E_t} \]

where δf and δb are the chemical shifts of a protein resonance in the free and ligand-bound states, δ is the chemical shift of the resonance at each titration point, Et is the total protein concentration, and Lt is the total ligand concentration. Et and Lt were varied in each titration according to the following expressions:

\[ E_t = \frac{E_0 V_0}{V_0 + \Delta V}, \qquad L_t = \frac{L_0\,\Delta V}{V_0 + \Delta V} \]

where E0 is the initial protein concentration, V0 is the initial volume of the protein solution, ΔV is the total volume of the added ligand solution, and L0 is the concentration of the ligand stock solution.
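The titration analysis can be illustrated with a short sketch. It assumes a 1:1 complex in fast exchange, fixes δf at the ligand-free shift, and uses a coarse scan followed by a golden-section search over log Kd; the fitting routine the authors actually used is not specified, so the search strategy and the function names (totals, fraction_bound, observed_shift, fit_kd) are illustrative choices rather than their method. For a trial Kd the model is linear in (δb − δf), so that amplitude is solved in closed form.

```python
import math


def totals(e0, l0, v0, dv):
    """Dilution-corrected totals: Et = E0*V0/(V0+dV), Lt = L0*dV/(V0+dV)."""
    return e0 * v0 / (v0 + dv), l0 * dv / (v0 + dv)


def fraction_bound(et, lt, kd):
    """Fraction of protein in a 1:1 complex (physical root of the quadratic)."""
    s = et + lt + kd
    return (s - math.sqrt(s * s - 4.0 * et * lt)) / (2.0 * et)


def observed_shift(delta_f, delta_b, et, lt, kd):
    """Fast-exchange shift: population-weighted average of free and bound."""
    return delta_f + (delta_b - delta_f) * fraction_bound(et, lt, kd)


def profile_sse(kd, e0, l0, v0, dvs, shifts, delta_f):
    """Squared error at a trial Kd, with the amplitude solved linearly."""
    f = [fraction_bound(*totals(e0, l0, v0, dv), kd) for dv in dvs]
    y = [s - delta_f for s in shifts]
    amp = sum(a * b for a, b in zip(f, y)) / sum(a * a for a in f)
    sse = sum((b - amp * a) ** 2 for a, b in zip(f, y))
    return sse, amp


def fit_kd(e0, l0, v0, dvs, shifts, delta_f,
           kd_min=1e-4, kd_max=100.0, n_grid=200):
    """Return (Kd, delta_b); Kd is in the same units as e0 and l0."""
    def err(log_kd):
        return profile_sse(math.exp(log_kd), e0, l0, v0, dvs, shifts, delta_f)[0]

    # Coarse scan on a logarithmic grid to bracket the minimum.
    lo, hi = math.log(kd_min), math.log(kd_max)
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    errs = [err(x) for x in grid]
    k = errs.index(min(errs))
    a, b = grid[max(k - 1, 0)], grid[min(k + 1, n_grid - 1)]

    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if err(c) < err(d):
            b = d
        else:
            a = c
    kd = math.exp(0.5 * (a + b))
    amp = profile_sse(kd, e0, l0, v0, dvs, shifts, delta_f)[1]
    return kd, delta_f + amp


if __name__ == "__main__":
    # Synthetic, noise-free example. E0, L0 and V0 follow the text (mM, mM, ul);
    # the shifts 1.900 and 1.800 ppm and the added volumes are made up.
    dvs = [5.0 * i for i in range(17)]
    data = [observed_shift(1.900, 1.800, *totals(0.63, 11.2, 650.0, dv), 0.84)
            for dv in dvs]
    print(fit_kd(0.63, 11.2, 650.0, dvs, data, 1.900))
```

With real data one would normally also estimate parameter uncertainties, which this sketch does not attempt.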
D. 2D Proton NMR
GK was dissolved in the same deuterated Tris buffer as described above. The concentration was ~2 mM. For the NMR measurements of the GK·GMP complex, GK was saturated with GMP (~5-fold excess) on the basis of the titration experiments. Clean TOCSY spectra (Braunschweiler & Ernst, 1983; Bax & Davis, 1985; Griesinger et al., 1988) were acquired with the MLEV-17 mixing sequence and a mixing time of 45 ms. NOESY spectra (Jeener et al., 1979; Macura & Ernst, 1980) were acquired with a mixing time of 200 ms. All the 2D spectra were acquired in the hypercomplex mode with standard phase cycling schemes. The spectral width was 6000 Hz in both dimensions. The data consisted of 2048 complex points in the t2 dimension and 256 complex points in the t1 dimension. Spectral processing was carried out on an SGI Indigo 2 workstation using the software NMRPipe (Delaglio et al., 1995). The time domain data were apodized with a Gaussian function in t2 and a shifted sine bell function in t1. The t1 dimension was zero-filled to 1024 points prior to Fourier transformation.
E. Equilibrium Unfolding
Gdn-HCl-induced unfolding was followed by fluorometry. Both protein and Gdn-HCl stock solutions were prepared in 100 mM Tris-HCl buffer, pH 7.7, containing 100 mM KCl. The concentration of the Gdn-HCl stock solution was determined by measuring its refractive index according to Pace (1986). The final protein concentration was 0.125 mg/ml. The protein solutions were allowed to equilibrate at room temperature for 1 h after mixing with Gdn-HCl. Fluorescence was measured on a Perkin-Elmer fluorometer with an excitation wavelength of 282 nm and an emission wavelength of 332 nm. The unfolding data were analyzed by the linear extrapolation method (Pace, 1986). The values for unfolding free energy changes were obtained by nonlinear least-squares analysis of the entire denaturant titration curves as described by Santoro & Bolen (1988).
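The two-state model behind the linear extrapolation method can be sketched as follows. The code assumes that the unfolding free energy depends linearly on denaturant concentration, ΔG(D) = ΔG(H2O) − m·D, and that the native and unfolded baselines are linear in D, in the spirit of the Santoro and Bolen treatment. The temperature of 298.15 K stands in for "room temperature" and the baseline values in the example are invented; only the ΔG(H2O) and m values come from Table I.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def fraction_unfolded(d, dg_h2o, m, temp_k=298.15):
    """Two-state fraction unfolded with dG(D) = dG_H2O - m*D (kcal/mol)."""
    dg = dg_h2o - m * d
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * temp_k)))


def signal(d, dg_h2o, m, y_n, m_n, y_u, m_u, temp_k=298.15):
    """Observed signal with linear native (y_n + m_n*D) and unfolded baselines."""
    fu = fraction_unfolded(d, dg_h2o, m, temp_k)
    return (1.0 - fu) * (y_n + m_n * d) + fu * (y_u + m_u * d)


def midpoint(dg_h2o, m):
    """Denaturant concentration at which dG(D) = 0, i.e. half unfolded."""
    return dg_h2o / m


if __name__ == "__main__":
    # dG_H2O (kcal/mol) and m (kcal/mol/M) from Table I; baselines are made up.
    for name, dg, m in (("WT", 7.0, 5.9), ("Y50F", 4.7, 4.1)):
        print(name, "predicted midpoint (M):", round(midpoint(dg, m), 2))
    print("fluorescence at 1.2 M:", signal(1.2, 4.7, 4.1, 100.0, 0.0, 40.0, 0.0))
```

The midpoints predicted from ΔG(H2O)/m fall close to the tabulated D1/2 values, which is a useful internal consistency check on a fit of this kind.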
III. Results and Discussion
A. Kinetic and Binding Properties of Y50F
The steady-state kinetic parameters of Y50F are listed in Table I along with those of the wild-type GK. In comparison with the wild-type enzyme, substitution of Tyr-50 with a phenylalanine residue resulted in a ~6-fold decrease in kcat, a ~30-fold increase in Km(GMP) and a ~39-fold increase in Ki(GMP). The values of Km(MgATP) and Ki(MgATP) increased only about twofold. The results indicate that the effects of the point mutation on the kinetic properties of the enzyme are rather specific. The dissociation constants of the binary substrate complexes were determined by 1D NMR titration experiments. Representative titration curves are shown in Figure 1. The Kd values are listed in Table I. The results suggest that the affinity of Y50F for MgATP is similar to that of the wild-type enzyme, but the affinity of the mutant for GMP decreases dramatically.
B. Structural Characterization by NMR
Since changes in the kinetic properties of a mutant may be caused by conformational perturbations, we compared the conformation of Y50F with that of the wild-type GK by NMR. By analysis of the TOCSY spectra and deuteration of the ring hydrogens of the phenylalanine residues, we were able to identify most aromatic spin systems. Since GK contains only one tryptophan, the tryptophan spin system was assigned to Trp-70. Tyr-78 was assigned by site-directed mutagenesis. Tyr-25, Phe-29, Phe-52, Tyr-77 and Phe-183 were tentatively assigned by analysis of the NOEs in the NOESY spectra and the distance information from the crystal coordinates (Stehle & Schulz, 1992). The chemical shifts of the aromatic resonances of Y50F and its GMP complex are
Table I. Comparison of Kinetic, Binding, and Conformational Properties Between WT and Y50F

Steady-State Kinetics              WT               Y50F
kcat (s^-1)                        394 ± 15         56 ± 4
Km(MgATP) (mM)                     0.20 ± 0.01      0.36 ± 0.03
Ki(MgATP) (mM)                     0.080 ± 0.004    0.16 ± 0.01
Km(GMP) (mM)                       0.091 ± 0.006    3.0 ± 0.26
Ki(GMP) (mM)                       0.035 ± 0.003    1.4 ± 0.12
kcat/Km(MgATP) (M^-1 s^-1)         2.0 x 10^6       1.6 x 10^5
kcat/Km(GMP) (M^-1 s^-1)           4.3 x 10^6       1.9 x 10^4

NMR Titration                      WT               Y50F
Kd(MgATP) (mM)                     0.090 ± 0.007    0.13 ± 0.04
Kd(GMP) (mM)                       0.029 ± 0.02     0.84 ± 0.02

Gdn-HCl Denaturation               WT               Y50F
ΔGd(H2O) (kcal/mol)                7.0 ± 0.4        4.7 ± 0.3
m (kcal/mol·M)                     5.9              4.1
D1/2 (M)                           1.20             1.17
Table II. Chemical Shifts of the Aromatic Residues of Y50F and Its GMP Complex^a
Spin systems: Ya, Yb, Yc, Yd, Y78, Fa, Fb, Fc, Fd, Fe, Ff, Fg, Fh, Fi, W70
Y50F
7.26 6.43 6.75 6.59 7.18 6.80 7.45 6.75 7.12 6.63 5.44 (+0.03) 6.57 6.86 6.90 7.35 6.09 6.20 (-0.04) 6.86 7.37 7.00 7.13 6.73 6.88 (+0.05) 7.02 (+0.03) 7.50 (+0.03) 7.34 7.43 7.06 7.52 (+0.04) 6.93 6.94 7.05 (+0.03) 7.42 (0.04) 7.22 (+0.03) 7.10 7.30 7.36 7.03 7.57
Y50F·GMP
6.33 6.60 6.78 6.76 6.89 (+0.10) 5.60 (-0.06) 6.08 6.57 (-0.04) 6.66
7.27 6.74 7.13 (-0.03) 7.45 7.65 (+0.34) 6.57 6.89 6.85 6.98
7.43 6.88 (-0.06) 6.88 7.17 (-0.04) 7.01 7.56
7.30 7.47 7.00 7.25 (-0.06) 7.28
6.87 7.32 (-0.03) 7.28 7.19 7.01 7.54 7.40 7.40 (-0.04)
a The resonances that differ by >0.02 ppm from the corresponding resonances of the wild-type GK are underlined. The magnitudes of the differences are indicated in parentheses. The identities of Yb, Yd, Fb, Fd and Fi were tentatively assigned as follows: Yb, Y77; Yd, Y25; Fb, F183/F29; Fd, F29/F183; and Fi, F52.
Figure 1. GMP (A) and MgATP (B) titrations of Y50F measured by proton NMR. Each panel plots the chemical shift (ppm) of a protein resonance against the volume (µl) of added ligand stock solution. The dotted lines were obtained by nonlinear least-squares fit of the data as described in Experimental Procedures.
listed in Table II. For the free enzymes, the chemical shifts of all the aromatic resonances are very similar between Y50F and the wild-type GK. For the GMP complexes, the chemical shifts of all aromatic resonances of Y50F are also very similar to those of the wild-type GK, except for Tyr-78. The aromatic resonances of Tyr-78 in the mutant GMP complex are shifted downfield by >0.1 ppm. Since both Tyr-50 and Tyr-78 are hydrogen bonded to the phosphoryl group of the bound GMP, the downfield shifts of Tyr-78 in the mutant GMP complex are likely to be caused by changes in the local electronic environment due to the substitution of Tyr-50 with a phenylalanine. The aromatic regions of the NOESY spectra of Y50F and its GMP complex are shown in Figure 2 and Figure 3, respectively. The NOE pattern of Y50F is virtually the same as that of the wild-type GK, apart from the substitution of Tyr-50 with a phenylalanine. The interresidue NOEs of the mutant GMP complex are also very similar to those of the wild-type GMP complex. The NOEs between H8 of GMP and Tyr-78 observed in the wild-type GMP complex are also present in the mutant complex. The NOEs between H8 of GMP and Tyr-50 in the wild-type complex are replaced by NOEs between H8 of GMP and Phe-50 in the mutant complex. Taken together, the results suggest that Y50F is properly folded. The conformation of Y50F is virtually unperturbed relative to that of the wild-type enzyme, and the conformation of the mutant GMP complex is highly similar to that of the wild-type GMP complex.
C. Conformational Stability
Gdn-HCl-induced denaturation experiments were used to measure the conformational stability of the proteins. The reversibility of the unfolding reaction was demonstrated by refolding after complete unfolding by Gdn-HCl. A representative denaturation curve is shown in Figure 4. The result is characteristic of two-state unfolding. Furthermore, the transition curves obtained by fluorometry are very similar to those obtained by CD measurements (data not shown), suggesting that the two-state model is appropriate for analysis of the data. The free energy changes of unfolding obtained by the linear extrapolation method (Pace, 1986; Santoro & Bolen, 1988) are listed in Table I. The results indicate that the hydroxyl group of the phenolic ring of Tyr-50 contributes 2.3 kcal/mol to the conformational stability. Since the hydroxyl group of Tyr-50 is hydrogen bonded to the carbonyl oxygen of Asp-98 in the unliganded state (Ji et al., personal communication), the lower stability of Y50F is likely to be caused by disruption of this hydrogen bond by the mutation.
D. Structural and Functional Roles of Tyr-50
Tyrosine is one of the most commonly found residues at non-helix phosphate-binding sites in proteins (Copley & Barton, 1994). Tyr-50 has been identified at the GMP binding site of yeast GK by X-ray crystallography
Figure 2. NOESY spectra of the aromatic protons of Y50F. Interresidue NOEs are numerically labeled: 1-3 and 5-13 are the NOEs between α-protons and aromatic protons; 4 is the NOE between Fa and Fe; 14 and 17 are the NOEs between Fb and Yd; 15 and 16 are the NOEs between Fb and Fd; 18 and 20 are the NOEs between Fc and Fe; 19 is the NOE between Fc and Fd; 21, 23, 25 and 27 are the NOEs between Y77 and W70; 22, 24 and 29 are possibly the NOEs between Y78 and F50; 26 is the NOE between Fb and Yd or Fd; 28 is possibly the NOE between Fc and Fd; 30 is the NOE between Yd and Fd.
(Stehle & Schulz, 1992). Its phenolic hydroxyl group is hydrogen bonded to the phosphoryl group of the bound GMP. In the unliganded form, Tyr-50 is hydrogen bonded to the backbone oxygen of Asp-98 (Ji et al., personal communication). To evaluate the roles of these hydrogen bonds in catalysis and structure, Tyr-50 was replaced by a phenylalanine. NMR characterization of Y50F suggests that the conformation of the mutant is very similar to that of the wild-type GK. Thus the hydroxyl group of Tyr-50 is not important for proper folding of the protein. However, it stabilizes the native structure by 2.3 kcal/mol as determined by Gdn-HCl-induced denaturation.
Figure 3. NOESY spectra of the aromatic protons of the complex of Y50F with GMP. Interresidue NOEs and the NOEs between the bound GMP and the aromatic residues are numerically labeled: 1-7 are the NOEs between α-protons and aromatic protons; 8, 10 and 14 are the NOEs between Fb and Fd; 9, 11 and 16 are the NOEs between Fb and Yd; 12, 13 and 17 are the NOEs between Y77 and W70; 15 and 18 are the NOEs between Y78 and F50; 19 and 20 are the NOEs between H8 of GMP and F50; 21 is the NOE between W70 and an amide or aromatic proton; 22 and 23 are the NOEs between H8 of GMP and Y78.
Steady-state kinetic studies indicate that Km(GMP) and Ki(GMP) of the mutant increase by ~30-fold while kcat of the mutant decreases by ~6-fold. kcat/Km(GMP) of the mutant decreases by ~230-fold. The mutation causes relatively little change in the other kinetic parameters. The kinetic results are corroborated by substrate titration studies. The titration data indicate that the affinity of Y50F for MgATP is similar to that of the wild-type enzyme, but the affinity of the mutant for GMP decreases dramatically. Since the effects of the mutation on the kinetic and binding properties of the enzyme are rather specific and the structure of the mutant is highly similar to that of the wild-type GK, the kinetic
Figure 4. Gdn-HCl-induced unfolding of Y50F measured by fluorometry. The plot shows the fluorescence signal against the Gdn-HCl concentration (M). The dotted line was obtained by nonlinear least-squares fit of the data as described in Experimental Procedures.
and binding data can be quantitatively interpreted. The results suggest that the hydrogen bond between the hydroxyl group of Tyr-50 and the phosphoryl group of GMP stabilizes the GK·GMP complex by 2.2 kcal/mol, the ternary complex by 2.1 kcal/mol, and the transition state by 3.2 kcal/mol.
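These energies can be reproduced from the Table I parameters with ΔΔG = RT ln(ratio). The sketch below is a reading of the numbers rather than a procedure stated in the text: it assumes that the binary-complex value comes from the Ki(GMP) ratio, the ternary-complex value from the Km(GMP) ratio, and the transition-state value from the kcat/Km(GMP) ratio, and it assumes a temperature of 298.15 K, which the text does not give for the kinetic assays.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)
TEMP_K = 298.15    # ASSUMED temperature; not stated for the kinetic assays


def ddg(ratio, temp_k=TEMP_K):
    """Free energy difference (kcal/mol) for a ratio of constants."""
    return R_KCAL * temp_k * math.log(ratio)


# Table I values: kcat in s^-1, Km and Ki in mM.
WT = {"kcat": 394.0, "Km_GMP": 0.091, "Ki_GMP": 0.035}
Y50F = {"kcat": 56.0, "Km_GMP": 3.0, "Ki_GMP": 1.4}

# Assumed mapping of each reported energy to a kinetic ratio.
binary = ddg(Y50F["Ki_GMP"] / WT["Ki_GMP"])
ternary = ddg(Y50F["Km_GMP"] / WT["Km_GMP"])
transition_state = ddg((WT["kcat"] / WT["Km_GMP"]) /
                       (Y50F["kcat"] / Y50F["Km_GMP"]))

if __name__ == "__main__":
    print("binary complex:   %.1f kcal/mol" % binary)
    print("ternary complex:  %.1f kcal/mol" % ternary)
    print("transition state: %.1f kcal/mol" % transition_state)
```

Under these assumptions the three values round to 2.2, 2.1 and 3.2 kcal/mol, matching the figures quoted above.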
Acknowledgments
We thank Dr. Richard Young for providing the yeast genomic library from which we cloned the wild-type GK gene. This work was supported by a grant from the National Institutes of Health (GM51901).
References
Agarwal, K. C., Miech, R. P., & Parks, R. E., Jr. (1978) Methods Enzymol. 51, 483-491.
Bax, A., & Davis, D. G. (1985) J. Magn. Reson. 65, 355-360.
Boehme, R. E. (1984) J. Biol. Chem. 259, 12346-12349.
Braunschweiler, L., & Ernst, R. R. (1983) J. Magn. Reson. 53, 521-528.
Bryant, P. J., & Woods, D. F. (1992) Cell 68, 621-622.
Cho, K.-O., Hunt, C. A., & Kennedy, M. B. (1992) Neuron 9, 929-942.
Cleland, W. W. (1986) in Investigations of rates and mechanisms of reaction, Part 1 (Bernasconi, C. F., Ed.) pp 791-870, Wiley, New York.
Copley, R. R., & Barton, G. J. (1994) J. Mol. Biol. 242, 321-329.
Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., & Bax, A. (1995) J. Biomol. NMR 6, 277-293.
Griesinger, C., Otting, G., Wüthrich, K., & Ernst, R. R. (1988) J. Am. Chem. Soc. 110, 7870-7872.
Hall, S. W., & Kühn, H. (1986) Eur. J. Biochem. 161, 551-556.
Jeener, J., Meier, B. H., Bachmann, P., & Ernst, R. R. (1979) J. Chem. Phys. 71, 4546-4553.
Kistner, U., Wenzel, B. M., Veh, R. W., Cases-Langhoff, C., Garner, A. M., Appeltauer, U., Voss, B., Gundelfinger, E. D., & Garner, C. C. (1993) J. Biol. Chem. 268, 4580-4583.
Konrad, M. (1992) J. Biol. Chem. 267, 25652-25655.
Kunkel, T. A. (1985) Proc. Natl. Acad. Sci. USA 82, 488-492.
Li, Y., Zhang, Y., & Yan, H. (1996) J. Biol. Chem., in press.
Macura, S., & Ernst, R. R. (1980) Mol. Phys. 41, 95-117.
Miller, W. H., & Miller, R. L. (1980) J. Biol. Chem. 255, 7204-7207.
Pace, C. N. (1986) Methods Enzymol. 131, 266-280.
Santoro, M. M., & Bolen, D. W. (1988) Biochemistry 27, 8063-8068.
Stehle, T., & Schulz, G. E. (1992) J. Mol. Biol. 224, 1127-1141.
Tsai, M.-D., & Yan, H. (1991) Biochemistry 30, 6806-6818.
Woods, D. F., & Bryant, P. J. (1991) Cell 66, 451-464.
Zschocke, P. D., Schiltz, E., & Schulz, G. E. (1993) Eur. J. Biochem. 213, 263-269.
SECTION IX Dynamics and Folding
Flexibility of Serine Protease in Nonaqueous Solvent
Samuel Toba, David S. Hartsough and Kenneth M. Merz Jr.
Department of Chemistry, 152 Davey Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802

I. Introduction
Recent studies of proteins in non-aqueous solvents have yielded many interesting insights into protein structure, function and dynamics. For example, Klibanov and co-workers have demonstrated that proteins have increased thermostability, molecular pH memory and altered substrate specificity when they are placed in an anhydrous organic solvent. In addition to the advantage of enhancing the solubilities of organic substrates, enzymes in organic solvents can have novel properties such as the ability to catalyze reactions that are either thermodynamically or kinetically impossible in water. In organic solvents, serine proteases are promising catalysts for organic synthesis and the preparation of unusual polymers. The protease family of enzymes has long generated considerable interest due to their role in peptide degradation and possibly peptide synthesis. Although many distinct families of serine proteases seem to exist, the two best studied are chymotrypsin and subtilisin. Several MD studies have been reported on serine proteases in aqueous media, but little is known about the dynamics of serine proteases in non-aqueous solvents. We report the results from a molecular dynamics simulation of the serine protease γ-chymotrypsin (γ-CT) in hexane. The active site of chymotrypsin contains the "catalytic triad", which consists of Ser-His-Asp. γ-CT suspended in nearly anhydrous solvents has been found to be catalytically active. For proteins to retain their activity in anhydrous solvents, some water molecules must be present.
These "essential waters" have been suggested to function as a molecular lubricant for the protein. Hexane, having a dielectric constant of 1.89, is a suitable nonaqueous solvent for enzymatic reactions. Because of its low dielectric constant, hexane does not compete with the protein for the essential water, which allows enzymes to retain their catalytic activity. γ-CT in hexane is thus an ideal system to further explore the effect of non-aqueous solvation on protein structure, function and dynamics.
II. Methods
The starting coordinates for all heavy atoms were obtained from the crystal structures of γ-CT determined by Yennawar and co-workers in hexane. All the MD simulations were carried out using a parallelized version of the MINMD module of the AMBER 4.0 simulation package in conjunction with the OPLS force field, TIP3P water, and a united-atom hexane model. The three starting configurations for the simulations are labeled CT, CTWAT and CTMONO: CT is γ-CT with seven surface-bound hexane molecules and fifty bound "essential" water molecules (from the hexane-exposed crystal), all immersed in 1109 hexane molecules; CTWAT is γ-CT solvated with fifty "essential" water molecules (from the native crystal) and immersed in 1107 hexane molecules; CTMONO is γ-CT solvated with a monolayer of 444 water molecules and immersed in 931 hexane molecules. The 50 "essential" water molecules included in the simulations were determined from experimental B-values and are required for hydration of charged groups. We introduced three chloride ions near the positively charged groups as counterions to neutralize the protein charge. γ-CT consists of three peptide segments (1-13, 16-146 and 151-245) linked together by disulfide linkages. Only residues observed in the X-ray structure (residues 1-10, 16-146 and 151-245) were used in the MD simulations. The pentapeptide present in the active site of the γ-CT X-ray structure was removed in the simulations. Before each MD simulation, each system was subjected to energy minimization for 2000 steps with periodic boundary conditions. All the bond lengths were constrained to their equilibrium values using the SHAKE algorithm with a tolerance of 0.0005 Å, and a time step of 1.5 fs was used.
The minimized systems were equilibrated by gradually increasing the system temperature from 0 K to 300 K over a period of ~30 ps by coupling to a temperature bath, at a constant pressure of 1 atm. Both the CT and CTWAT simulations were performed for 300 ps (150 ps equilibration and 150 ps sampling), while the CTMONO dynamics was run for 270 ps (150 ps equilibration and 120 ps sampling). Coordinates were saved for analysis every 50 steps for CT and CTWAT, and every 1000 steps for CTMONO. All analyses were carried out for the last 150 ps for CT and CTWAT, and the last 120 ps for CTMONO. The solvent accessible surface area (SASA) values for the three systems were calculated using XPLOR with a probe radius of 1.4 Å.
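The run lengths and save intervals above translate into step and frame counts as sketched below. The sketch assumes that the 1.5 fs time step applies to the whole of every run and that frames are counted only within the sampling window; the dictionary layout and function names are illustrative.

```python
from fractions import Fraction

TIME_STEP_FS = Fraction(3, 2)  # 1.5 fs, kept exact to avoid rounding surprises

# Run lengths (ps) and coordinate save intervals (steps) as given in the text.
RUNS = {
    "CT":     {"total_ps": 300, "sampling_ps": 150, "save_every": 50},
    "CTWAT":  {"total_ps": 300, "sampling_ps": 150, "save_every": 50},
    "CTMONO": {"total_ps": 270, "sampling_ps": 120, "save_every": 1000},
}


def steps(picoseconds):
    """Number of MD steps needed to cover the given time."""
    return int(Fraction(picoseconds * 1000) / TIME_STEP_FS)


def frame_spacing_fs(save_every):
    """Simulation time between saved coordinate sets, in fs."""
    return float(save_every * TIME_STEP_FS)


def frames(picoseconds, save_every):
    """Number of saved coordinate sets within the given time window."""
    return steps(picoseconds) // save_every


if __name__ == "__main__":
    for name, run in RUNS.items():
        print(name,
              "total steps:", steps(run["total_ps"]),
              "frame spacing (fs):", frame_spacing_fs(run["save_every"]),
              "sampled frames:", frames(run["sampling_ps"], run["save_every"]))
```

The much sparser sampling of CTMONO (one frame every 1.5 ps instead of every 75 fs) is worth keeping in mind when time-averaged quantities from the three runs are compared.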
III. Results and Discussion
A molecular understanding of the effect of organic solvents on proteins is still in its infancy. It has been proposed that the organic solvent may induce a conformational change in the enzyme, resulting in a drop in activity. However, recent crystal structures of enzymes in organic solvent are indistinguishable from their counterparts in water, implying that the protein-organic solvent interaction does not induce a
conformational change. Here, we address the effect of solvent dynamics on protein structure and flexibility. Our simulations show that both CTWAT and CTMONO have lower RMS deviations than the CT system, indicating that they more closely resemble the starting crystal structure (Table 1). If we assume that CTMONO represents an aqueous simulation, this is in contrast to our earlier work on BPTI, where we found that the RMS deviation was on average higher for BPTI in water. This difference is more obvious in the RMS deviation of the backbone atoms only. We further found that the change in the placement of the 50 "essential" water molecules from CT to CTWAT resulted in a significant difference in RMS deviations. It has been reported that the correct placement of buried water molecules is necessary to obtain a locally correct structure. In protein simulations, the RMS deviation of the instantaneous structure from a reference structure (usually either X-ray or NMR derived) is indicative of the structural rearrangements the model undergoes during equilibration. The convergence of this property is taken as the point where the system is equilibrated. The RMS fluctuations of the instantaneous structure from the time-averaged structure of an equilibrated system may be taken as an indicator of the flexibility of the system. If we consider RMS deviation to be indicative of protein mobility, we observe that γ-CT exhibits higher mobility in anhydrous non-polar organic media than in water. However, these data show that although the RMS deviation from the crystal structure can give a relative insight into conformational change, the RMS fluctuation from the average equilibrated structure is a better representation of the flexibility. We find that both the total and backbone flexibility are lower in CT and CTWAT than in CTMONO.
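The distinction drawn here between RMS deviation (from a fixed reference) and RMS fluctuation (about the time-averaged structure) can be made concrete with a small sketch. It assumes that all frames have already been superimposed on a common reference, which a real trajectory analysis would have to do first, and it represents a structure simply as a list of (x, y, z) tuples; none of this reflects the analysis code actually used in the study.

```python
import math


def rmsd(frame, reference):
    """RMS deviation of one frame from a reference structure."""
    total = sum((a - b) ** 2
                for p, q in zip(frame, reference)
                for a, b in zip(p, q))
    return math.sqrt(total / len(reference))


def average_structure(frames):
    """Time-averaged coordinates over all frames."""
    n = len(frames)
    return [tuple(sum(f[i][k] for f in frames) / n for k in range(3))
            for i in range(len(frames[0]))]


def rmsf(frames):
    """Per-atom RMS fluctuation about the time-averaged structure."""
    mean = average_structure(frames)
    n = len(frames)
    return [math.sqrt(sum(sum((f[i][k] - mean[i][k]) ** 2 for k in range(3))
                          for f in frames) / n)
            for i in range(len(mean))]


if __name__ == "__main__":
    # Toy two-atom system: atom 0 sits 2 units away from its reference
    # position but never moves; atom 1 oscillates around its reference.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    trajectory = [
        [(2.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
        [(2.0, 0.0, 0.0), (0.5, 0.0, 0.0)],
    ]
    print("RMSD per frame:", [rmsd(f, reference) for f in trajectory])
    print("RMSF per atom:", rmsf(trajectory))
```

In the toy example the displaced but rigid atom dominates the RMS deviation while contributing nothing to the RMS fluctuation, which is why the two measures can rank systems differently.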
This is in agreement with our previous finding that BPTI is more flexible in water than in chloroform. Furthermore, we found that the total protein flexibility is not affected by the placement of bound water molecules. Distinct regions of high and low RMS fluctuation were also observed in all the systems. Higher RMS fluctuations are observed at the beginning (the 10-residue peptide) and the end of each segment of the protein. In the CT and CTWAT runs, all the high RMS fluctuations correspond to loop regions exposed to the solvent. In CTMONO, although the large RMS fluctuations correspond mainly to surface loops, a few semi-buried regions with large fluctuations were also observed. Some random fluctuations in the α-helices of both CT and CTWAT were also observed, where some hydrogen bonds within the α-helices are disrupted for short periods during the simulation. However, the contribution of this type of fluctuation is small compared to the total fluctuations from the surface loops. Surface loops are known to be highly flexible regions within proteins, and it is this flexibility that allows the protein surface region to undergo substantial changes when placed in hexane. We were also interested in how the active site residues were affected by the non-aqueous solvent. The calculated RMS deviations and RMS fluctuations for the active site residues in the three simulations are lower than the protein's average (Table 2). An MD simulation of acyl-chymotrypsin in water by Yu et al. found the active site triad residues to have
Table 1. Summary of the RMS Deviation, RMS Fluctuation, Radius of Gyration, SASA, Hydrophobic exposed SASA, Solvent Diffusion, Average Distances and Hydrogen Bonding Data from the Simulations 1 Analysis Type 1 CT | CTWAT | CTMONO 1 RMS Deviation (A) Total 2.7 2.4 2.5 1 2.4 2.1 1 Backbone 2.0 1 RMS Fluctuation (A) 1 Total 0.68 0.84 1 0.67 Backbone 0.58 0.60 0.73 1 SASA (1(P A^) 1 Crystal Structure 9.2 9.2 9.2 1 MD^ average 8.7 8.7 9.2 1 Hydrophobic Residues S AS A/Total SASA (% exposed residues) | 1 Crystal Structure 34.3 34.3 34.3 1 1 MD average 39.8 39.1 38.3 1 Average Distances*^ NE2-H0G 4.2 (2.0) 4.1 (2.0) 1 3.1 (2.0) HND-ODl 1.8 (1.8) 1.8 (1.8) 1.7 (1.8) HND-0D2 2.5 (2.5) 2.8 (2.5) 1 2.5 (2.5) Hydrogen Bonding | Crystal (total) 183 183 183 1 216 203 1 Min. structure (total) 193 570 605 Total 561 <10%c 224 294 216 1 >90%ci 139 141 93 1 a) Using the structure obtained from the end of the simulation. b) Value given in parenthesis is from the crystal structure. c) Number of hydrogen bonds present <10% of the time. d) Number of hydrogen bonds present >90% of the time.
Table 2. RMS Deviation and RMS Fluctuation of the active site residues CTMONO CT 1 CTWAT Analysis Type RMS Deviation (A) 1.22 0.98 0.87 HIS-57 ASP-102 0.80 0.60 1.41 1.42 0.77 0.89 SER-195 RMS Fluctuations (A) 0.71 0.37 HIS-57 0.51 0.73 ASP-102 0.39 0.48 0.54 0.32 0.56 1 SER-195
smaller fluctuations than the rest of the protein. This observation is also consistent with the experimentally observed structural integrity of the catalytic triad in the serine protease α-lytic protease in octane. The flexibility of the active site can be further observed from the fluctuations in the ψ and φ plot. The Ramachandran plots of the active site residues throughout the trajectories are within allowed regions and show close agreement with the active site values obtained from the crystal structure. This indicates that the active site of chymotrypsin remains relatively unchanged and is in accord with structural measurements of active enzymes in most anhydrous solvents. However, as observed in the RMS fluctuation results, the scatter for His-57 and Asp-102 observed in the CT and CTWAT simulations is less than that observed in the CTMONO simulation. From this, we note that the reduced flexibility of the active site side chains in hexane compared to the protein's average allows the protein to remain stable and catalytically active in nonaqueous environments. However, if the flexibility of the active site residues is also needed for catalytic activity, this might result in reduced activity of the enzyme in non-polar non-aqueous solvents. From our simulations we find that the structural characteristics of the active site residues are similar to the findings of NMR and crystallographic studies of serine proteases in aqueous solvent. A stable hydrogen bond between the Nδ1 hydrogen of His-57 and the carbonyl group of Asp-102 was found (also thought to be present in the crystal structure).
It has been reported that hydrogen bonds between His-57 and Asp-102 do exist but the hydrogen bond between Ser-195 and His-57 does not exist in the enzyme without the substrate in the active site.^"^ With the absence of the peptide in our simulations, the hydrogen bond between the hydrogen on the 0')90% of the time) and those weak or non-existent (present <10% of the time). We observe that more stable hydrogen bonds are present in the CT
Samuel Toba et al
and CTWAT systems and although CTMONO has the highest number of total hydrogen bonds, it also has the least number of stable hydrogen bonds. Other studies have also shown an increase in intramolecular hydrogen bonds within a protein when placed in an organic environment. When we consider that a folded protein is usually only 5-15 kcal/mol more stable than its unfolded form in aqueous solvent, which is equivalent to no more than a handful of hydrogen bonds, this increase in intramolecular hydrogen bonding is an important factor in stabilizing a protein in organic solvent. To understand how hexane may affect the protein, we analyzed the solvent interaction with the protein. Six of the seven initially "bound" hexane molecules in CT were found to have diffused off the protein surface by the end of the dynamics run. The one bound hexane (Hex 238) remaining is surrounded by several large hydrophobic residues forming a stable hydrophobic pocket on the protein surface. Interestingly, this hexane site was again observed in a subsequent chymotrypsin crystal grown and then exposed to hexane and isopropanol mixtures by Farber and co-workers. This hexane site was not occupied in the CTWAT and CTMONO systems and no hexane molecules were found in other interior regions of the protein. Although no hexane molecules were found in the protein's interior for the CTWAT and CTMONO systems, hydrophobic contacts were observed between hexane molecules near the protein surface and hydrophobic side chains in all three systems. Hexane molecules on the protein surface tend to reside in the surface "clefts" formed by the hydrophobic side chains extended into the hexane solvent. At the same time, the hydrophilic residues tended to fold back onto the surface of the protein in order to minimize surface contacts.
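The hydrogen-bond counts discussed above are grouped by how often each bond is present during a trajectory (more than 90% of the time, or less than 10% of the time, as in the footnotes to Table 1). The following is a minimal sketch of that bookkeeping, not the analysis code used for the simulations. The function names, the per-frame input format, and the "intermediate" class are assumptions introduced here for illustration.

```python
def occupancy_from_frames(frames):
    """Fraction of frames in which each hydrogen bond is present.

    frames: list of sets; each set holds the labels of the hydrogen bonds
    detected in one trajectory frame (hypothetical input format).
    """
    n_frames = len(frames)
    labels = set().union(*frames) if frames else set()
    return {label: sum(label in f for f in frames) / n_frames for label in labels}


def classify_hbonds(occupancy, stable_cutoff=0.90, weak_cutoff=0.10):
    """Count bonds present >90% of the time, <10% of the time, and in between."""
    counts = {"stable": 0, "weak": 0, "intermediate": 0}
    for fraction in occupancy.values():
        if fraction > stable_cutoff:
            counts["stable"] += 1
        elif fraction < weak_cutoff:
            counts["weak"] += 1
        else:
            counts["intermediate"] += 1
    counts["total"] = len(occupancy)
    return counts
```

With per-frame hydrogen-bond lists from any trajectory analysis tool, `classify_hbonds(occupancy_from_frames(frames))` would give the kind of totals compared between the CT, CTWAT, and CTMONO systems.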
In our CTMONO simulation, we further observed that the water molecules clustered around charged hydrophilic residues, leaving the hydrophobic residues exposed to the solvent (Fig. 1). It has been reported that preferential solvation of the hydrophobic regions of the protein surface by the non-polar solvent is due to the thermodynamically unfavorable formation of a complete monolayer of water in a non-polar solvent. Klibanov and co-workers have also shown that hexane does not strip the water layer, nor does it immobilize the water molecules at the protein/solvent interface. Instead, rearrangement of the water molecules on the protein surface is the more favored process. Our simulations clearly support these experimental observations.
IV. Conclusions
Our studies have shown that both hydration and the placement of "essential" water molecules affect the RMS deviation, but not the RMS fluctuation, of the protein when placed in a non-aqueous environment. As hydration increases, the structural similarity of the protein to the crystal structure increases. Although the deviation of the protein from the X-ray structure is higher in organic solvent than in water, the flexibility of the protein is higher in water. The protein remained spherical and the major movement is due to the folding back of the hydrophilic side chains on the protein surface exposed to hexane. The placement of bound water molecules affects the "local" mobility of the protein, mainly the surface loops. The total solvent accessible surface area (SASA) of the protein is reduced when placed in hexane, with hydrophobic residues experiencing the biggest increase in SASA while hydrophilic residues undergo the largest decrease in SASA. Water "stripping" by hexane was not observed in our simulations, but rearrangements of the monolayer water molecules occurred. At the end of the simulations, the water molecules on the surface of the protein have clustered around the hydrophilic residues, leaving the hydrophobic residues more exposed to the non-polar solvent. The side chain fluctuations in the active site were less than the average fluctuations of the protein. Hexane molecules did not diffuse into the hydrophobic interior of the protein, indicating that although the protein remains stable in hexane, it remains impermeable to the organic solvent. Although placement of bound water molecules does not have a major effect on the protein intramolecular interactions, the number of water molecules in a non-aqueous environment affects the formation of intramolecular hydrogen bonds. The number of stable intramolecular hydrogen bonds decreases with increasing hydration, indicating the role that aqueous solvent plays in facilitating fluctuations in the protein structure. We conclude that the reduction in solvent accessible surface area of the protein, the increase in intramolecular hydrogen bonds, and the increase in net ion pair interactions have all led to the reduction in protein flexibility, which in turn increases the stability of the folded protein in organic media.
Figure 1. Stereoview of a surface hydrophilic residue (Lysine 88) in CT (top) and CTMONO (bottom). The side chain of Lys-88 in CT is folded back to minimize surface contact with hexane. In CTMONO, the Lys-88 side chain is extended into the solvent and is preferentially solvated by the cluster of water molecules.
Acknowledgments
Special thanks to Dr. K. V. Damodaran for his critical review of this manuscript. All molecular graphics images were produced using the MidasPlus software from Computer Graphics Laboratory, University of California, San Francisco. We thank the Office of Naval Research for support of this research through grant No. N00014-90-3-4002.
The Pittsburgh Supercomputer Center and the Cornell Theory Center are acknowledged for generous allocations of supercomputer time through a MetaCenter grant. We thank Dr. Greg Farber for helpful discussions.
References
(1) Zaks, A.; Klibanov, A. M. Science 1984, 224, 1249-1251.
(2) Klibanov, A. M. CHEMTECH 1986, 16, 354-359.
(3) Zaks, A.; Klibanov, A. M. J. Am. Chem. Soc. 1986, 108, 2767-2768.
(4) Zaks, A.; Klibanov, A. M. Proc. Natl. Acad. Sci. USA 1985, 82, 3192-3196.
(5) Kuhl, P.; Halling, P. J.; Jakubke, H. D. Tetrahedron Lett. 1990, 31, 5213-5216.
(6) Stahl, M.; Mansson, M. O.; Mosbach, K. Biotech. Lett. 1990, 12, 161.
(7) West, J. B.; Hennen, W. J.; Lalonde, J. L.; Bibbs, J. A.; Zhong, Z.; Meyer, E. F., Jr.; Wong, C.-H. J. Am. Chem. Soc. 1990, 112, 5313-5320.
(8) Kasche, V.; Michaelis, G.; Galunsky, B. Biotech. Lett. 1991, 13, 75.
(9) Margolin, A. L.; Fitzpatrick, P. A.; Dubin, P. L.; Klibanov, A. M. J. Am. Chem. Soc. 1991, 113, 4693-4694.
(10) Riva, S.; Chopineau, J.; Kieboom, A. P. G.; Klibanov, A. M. J. Am. Chem. Soc. 1988, 110, 584-589.
(11) Kullman, W. Enzymatic Peptide Synthesis; CRC Press: Boca Raton, FL, 1987.
(12) Schellenberger, V.; Jakubke, H. D. Angew. Chem. Int. Ed. Engl. 1991, 30, 1437-1449.
(13) Wong, C. H.; Wang, K. T. Experientia 1991, 47, 1123.
(14) Yu, H.-A.; Karplus, M.; Nakagawa, S.; Umeyama, H. PROTEINS: Structure, Function and Genetics 1993, 16, 172-194.
(15) Gerig, J. T. Magnetic Resonance in Chemistry 1990, 28, 47.
(16) Bemis, G. W.; Carlson-Golab, G.; Katzenellenbogen, J. A. J. Am. Chem. Soc. 1992, 114, 570-578.
(17) Zheng, Y.-J.; Ornstein, R. J. Am. Chem. Soc. 1996, 118, 4175-4180.
(18) Polgar, L. In Hydrolytic Enzymes; A. Neuberger and K. Brocklehurst, Eds.; Elsevier Science Publishers B. V. (Biomedical Division): Amsterdam, 1987; pp 159-200.
(19) Paradkar, V. M.; Dordick, J. S. J. Am. Chem. Soc. 1994, 116, 5009-5010.
(20) Klibanov, A. M. TIBS 1989, 14, 141-144.
(21) Akhadov, Y. Y. Dielectric Properties of Binary Solutions. A Data Handbook; Pergamon Press: New York, 1981.
(22) Yennawar, N. H.; Yennawar, H. P.; Farber, G. K. Biochemistry 1994, 33, 7326-7336.
(23) Vincent, J. J.; Merz, K. M., Jr. J. Comput. Chem. 1995, 16, 1420-1427.
(24) AMBER 4.0, Pearlman, D. A.; Case, D. A.; Caldwell, J. C.; Seibel, G. L.; Singh, U. C.; Weiner, P.; Kollman, P. A. University of California, San Francisco: 1991.
(25) Jorgensen, W. L.; Tirado-Rives, J. J. Am. Chem. Soc. 1988, 110, 1657-1666.
(26) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J.; Impey, R. W.; Klein, M. L. J. Chem. Phys. 1983, 79, 926-935.
(27) Jorgensen, W. L.; Madura, J. D.; Swenson, C. J. J. Am. Chem. Soc. 1984, 106, 6638-6646.
(28) Rupley, J. A.; Gratton, E.; Careri, G. TIBS 1983, 8, 18-22.
(29) Ryckaert, J. P.; Ciccotti, G.; Berendsen, H. J. C. J. Comput. Phys. 1977, 23, 327-341.
(30) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; DiNola, A. D.; Haak, J. R. J. Chem. Phys. 1984, 81, 3684-3690.
(31) Brunger, A. T.; Kuriyan, J.; Karplus, M. Science 1987, 235, 458-460.
(32) Desai, U. R.; Osterhout, J. J.; Klibanov, A. M. J. Am. Chem. Soc. 1994, 116, 9420-9422.
(33) Hartsough, D. S.; Merz, K. M., Jr. J. Am. Chem. Soc. 1992, 114, 10113-10116.
(34) Hartsough, D. S.; Merz, K. M., Jr. J. Am. Chem. Soc. 1993, 115, 6529-6537.
(35) Sreenivasan, U.; Axelsen, P. H. Biochem. 1992, 31, 12785-12791.
(36) Toba, S.; Hartsough, D. S.; Merz, K. M., Jr. J. Am. Chem. Soc. 1996, 118, 6490-6498.
(37) Aqvist, J.; Tapia, O. Biopolymers 1990, 30, 205-209.
(38) Norin, M.; Edholm, O.; Haeffner, F.; Hult, K. Biophys. Journal 1994, 67, 548-559.
(39) Burke, P. A.; Smith, S. O.; Bachovchin, W. W.; Klibanov, A. M. J. Am. Chem. Soc. 1989, 111, 8290-8291.
(40) Yennawar, H. P.; Yennawar, N. H.; Farber, G. K. J. Am. Chem. Soc. 1995, 117, 577-585.
(41) Fitzpatrick, P. A.; Ringe, D.; Klibanov, A. M. Biochem. Biophys. Res. Comm. 1994, 198, 675-681.
(42) Fitzpatrick, P. A.; Steinmetz, A. C.; Ringe, D.; Klibanov, A. M. Proc. Natl. Acad. Sci. USA 1993, 90, 8653-8657.
(43) Wescott, C. R.; Klibanov, A. M. Biochimica et Biophysica Acta 1994, 1206, 1-9.
(44) Steitz, T. A.; Shulman, R. G. Ann. Rev. Biophys. Bioeng. 1982, 11, 419-444.
(45) Tirado-Rives, J.; Jorgensen, W. L. J. Am. Chem. Soc. 1990, 112, 2773-2781.
(46) Parker, M. C.; Moore, B. D.; Blacker, A. J. Biotech. and Bioengineering 1995, 46, 452.
(47) Zaks, A.; Klibanov, A. M. J. Biol. Chem. 1988, 263, 8017-8021.
Higher-Order Structure and Dynamics of FK506-Binding Protein Probed by Backbone Amide Hydrogen/Deuterium Exchange and Electrospray Ionization Fourier Transform Ion Cyclotron Resonance Mass Spectrometry
Zhongqi Zhang, Weiqun Li, Ming Li, Timothy M. Logan, Shenheng Guan, and Alan G. Marshall
Center for Interdisciplinary Magnetic Resonance, National High Magnetic Field Laboratory, Florida State University, Tallahassee, FL 32310
Department of Chemistry, Florida State University, Tallahassee, FL 32306
I. Introduction
Protein hydrogen exchange has become a powerful tool for probing the higher-order structure and dynamics of proteins (1,2). Among the many different techniques for determining backbone amide hydrogen exchange rates, high-resolution multidimensional NMR, which can determine hydrogen exchange rates of individual amides on small and highly soluble proteins, has been the most successful and definitive. However, such experiments require large quantities of protein (typically tens of mg); very fast proton exchange rates are not observable with standard techniques; the upper mass limit is typically <30,000 Da; and quantitative measurement of distinct populations is not easily done. Recently, mass spectrometry has been used to determine the extent of H/D exchange in proteins (3). When combined with enzymatic cleavage after the H/D exchange, mass spectrometry can also determine the extent of H/D exchange for different segments of the primary amino acid sequence (4-9). Compared to H/D exchange observed by NMR, mass spectrometry is vastly more sensitive (picomole quantity); mass spectrometry extends to proteins and complexes of much higher mass; multiple conformations or intermediates with different H/D exchange rates are readily identified by their different masses; and H/D exchange can even be conducted directly in the gas phase to reveal gas-phase (i.e., unhydrated) protein structure. Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR MS (10-14)) is especially useful for protein hydrogen exchange experiments because of its ultrahigh mass resolving power, which gives it the unique capability to resolve isotopic distributions in all but very large proteins (15-17). The isotopic distributions for a deuterated protein and its proteolytic fragments reveal not only their deuterium contents, but also how the deuterons are distributed, a unique advantage of mass spectrometry which makes it complementary to NMR for understanding protein structure and dynamics (18), and especially protein folding, in which several conformations may coexist simultaneously (19).
The FK506 binding protein (FKBP) is a small (107 amino acids) protein which, when complexed to ascomycin or rapamycin, is involved in immunosuppression through the formation of ternary complexes with calcineurin or FRAP, respectively (20). We are using FKBP to study the role of non-random conformations in the unfolded state on folding stability mechanisms. Here, we present FT-ICR mass analysis of global and local hydrogen exchange behavior of a mutant of recombinant human FKBP under native and mildly denaturing conditions, thereby identifying the number of distinct conformations, the location of surface-accessible segments of the primary amino acid sequence, and partial kinetics and equilibria of the folding/unfolding pathways.
II. Materials and Methods
A. Materials
Sephadex G-15 was purchased from Pharmacia Biotech; urea was purchased from Fisher Scientific; D2O and pepsin were purchased from Sigma Chemical Co. All chemicals were used without further purification. Deuterated urea/D2O solution was prepared by repeated lyophilization. FKBP was mutated to remove the single cysteine (C22A) by use of a megaprimer PCR protocol (21) (Li & Logan, unpublished). Mutant C22A FKBP was over-expressed in E. coli strain JM109 by use of a di-cistronic vector described previously (22). The bacteria were grown in LB medium containing 50 mg/L ampicillin and induced with 1 mM IPTG at an OD of 0.6. Protein expression was continued for 8-10 hours after induction before harvesting by centrifugation. The cell paste was suspended in lysis buffer (50 mM Na phosphate, pH 7.4, containing 10 mM EDTA and 1 mM PMSF), and broken by passage through a French press. Cellular debris was removed by centrifugation (10,000 rpm for 20 min in a JA20 rotor) and the supernatant was loaded onto a DE52 column (2.5 x 10 cm), pre-equilibrated in elution buffer (10 mM HEPES, pH 7.5). At this pH, FKBP is not retained on the column and elutes as the first peak (23). Fractions containing FKBP were pooled, and brought to 90% saturation with ammonium sulfate. The resulting precipitate was pelleted by centrifugation, re-dissolved in 2 mL elution buffer, and applied to a gel filtration column (G75, 2.5 x 120 cm). FKBP-containing fractions were pooled, concentrated using a Centriprep 10 (Amicon), and further purified by reverse-phase HPLC (C4, Vydac) on a Shimadzu HPLC system. The 4.6 mm x 250 mm column was developed at a flow rate of 1.0 mL/min with a linear gradient (10% to 80%) of solvent B (acetonitrile containing 0.1% TFA) in solvent A (water containing 0.1% TFA) in 30 min, with two LC-10AD solvent delivery modules. FKBP eluted at 37% acetonitrile.
B. H/D Exchange of C22A-FKBP
H/D exchange of C22A-FKBP was initiated by rapidly replacing H2O by D2O (10 mM KH2PO4/K2HPO4) by use of a spinning Sephadex G-15 gel filtration column, then mixing with an equal volume of 10 mM phosphate D2O buffer containing an appropriate concentration of urea to make the final urea concentration 0 M, 3.5 M or 4.5 M. The protein/D2O solutions with different urea concentrations were incubated at 25 °C for various exchange periods and then combined with a five-fold excess of 0.1 M H3PO4/KH2PO4 buffer (pH 2.4) to decrease the pH to 2.4, thereby quenching the exchange reaction. The samples were stored at -70 °C until analysis. To measure the equilibrium constant for unfolding, after C22A-FKBP was equilibrated in 3.5 or 4.5 M urea/D2O solution, we applied a five-second labeling pulse by rapid dilution of the fully deuterated protein into a 10-fold excess of H2O solution containing 3.5 or 4.5 M urea, respectively, followed by quenching of the exchange by decreasing the pH to 2.4. To adjust for deuterium gain or loss under quenched conditions, we prepared two control samples. A zero-deuteration control was prepared by diluting the protiated protein solution directly into a solution so as to make the final solution composition the same as the quenched sample. A full-deuteration control was prepared by incubating the protein/D2O solution overnight at 40 °C in 4.5 M urea. Both control samples were analyzed as for the other samples.
C. H/D Exchange Determination by LC/FT-ICR MS
For determination of the global hydrogen exchange behavior of C22A-FKBP, each sample was thawed and analyzed by electrospray FT-ICR mass spectrometry with on-line desalting with a reverse-phase C-8 microbore guard column (15 x 1 mm, Microtech Scientific). The total analysis period for each sample was about 2.5 min. For determination of the extent of deuterium incorporation in different segments of the protein primary sequence, deuterated protein was thawed and digested for 3 min at 0 °C with pepsin (1:1 substrate:enzyme ratio). Each peptide mixture was then analyzed by on-line LC/MS, and the extent and distribution of deuterium incorporation was determined from their ESI FT-ICR mass spectra. A capillary perfusion column (50 x 0.3 mm, LC Packings) was used for LC/MS. The perfusion column decreases the separation time and thus reduces deuterium loss during separation. The complete separation period was typically less than 4 min. Mass spectrometric analyses were performed with a homebuilt FT-ICR mass spectrometer, with a shielded 9.4 Tesla superconductive magnet, and equipped with a homebuilt external electrospray interface described elsewhere (17). An Odyssey™ data system (Finnigan FTMS, Madison, WI) was used to acquire data and process the data into a peak list. The peak list was further analyzed with software written by the author to obtain the extent and distribution of deuterium in the protein or peptides.
D. Identification of Proteolytic Fragments
C22A-FKBP was digested with pepsin (~1:1 substrate:enzyme weight ratio) at pH 2.4 and 0 °C for three min, then subjected to FT-ICR MS analysis with on-line desalting. Mass resolution was between 50,000 and 120,000, depending on the peptide mass-to-charge ratio, m/z (24). Identification of some peptides could be achieved from their measured nominal-mass molecular weights and the known specificity of pepsin (25). With these peptides identified, very accurate (to 5 ppm) molecular weights of other peptides could be determined from internal mass calibration based on the identified peptides. Thus, most of the proteolytic peptides could be identified by their accurate masses.
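Assigning a peptide by accurate mass amounts to checking whether a measured mass agrees with a candidate's theoretical mass within the stated tolerance (5 ppm here). A minimal sketch follows; the function names and the candidate dictionary are hypothetical, and any masses passed in would have to come from the protein sequence and the calibrated spectrum.

```python
def ppm_error(measured_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (measured_mass - theoretical_mass) / theoretical_mass * 1e6


def match_peptides(measured_mass, candidates, tolerance_ppm=5.0):
    """Return (name, ppm error) for every candidate within the tolerance.

    candidates: dict mapping a peptide name to its theoretical mass
    (hypothetical input; the names and masses are supplied by the caller).
    Results are sorted by increasing absolute error.
    """
    hits = []
    for name, mass in candidates.items():
        err = ppm_error(measured_mass, mass)
        if abs(err) <= tolerance_ppm:
            hits.append((name, err))
    return sorted(hits, key=lambda hit: abs(hit[1]))
```

A measured mass that matches exactly one candidate within the tolerance can be assigned with some confidence; several matches would call for additional evidence, such as the known cleavage specificity of pepsin.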
III. Results and Discussion
A. Global H/D Exchange of C22A-FKBP Under Denaturing Conditions
In determining the contribution of a particular inter-residue interaction toward protein folding and stability, one creates mutations to one or more residues, and the stability of the mutant is compared to that of the wild-type. In such studies, reversible folding/unfolding reactions are required, but are often precluded by the formation of non-native disulfide bonds. FKBP contains no disulfides, but there is a single cysteine at position 22. To minimize complications due to inter-molecular disulfide bond formation during folding and unfolding, we have thus replaced this residue with alanine, and this mutant, C22A FKBP, is the new wild-type. NMR studies have shown no significant difference in the structure of C22A versus wild-type (Li and Logan, unpublished).
The first step in characterizing the stability of a protein is to measure some property that indicates a change in native structure with added denaturant. To that end, we employ H/D exchange, with detection by mass spectrometry. As shown in Figure 1, the natural-abundance isotopic distribution (Figure 1, top) shifts upward by ~28 Da on incubation in D2O buffer (10 mM phosphate, pD 7.04) for 6 min. (All pD values described in this text are uncorrected for isotope effects.) The extent of deuteration can be determined by the difference between the average molecular weights (calculated from the centroid of the isotope distribution) of the deuterated and non-deuterated protein. In contrast, the isotopic distributions (Fig. 1, bottom two panels) for C22A-FKBP following incubation in D2O/phosphate buffer in the presence of 3.5 M and 4.5 M urea (pD 7.2) for 6 min are clearly bimodal. Furthermore, the low-m/z envelope is the same as that for hydrogen exchange under non-denaturing conditions (Fig. 1, second from top), and the high-m/z envelope for both 3.5 M and 4.5 M urea is the same as that for the fully deuterated control B (not shown). Finally, as the exchange period is increased, the low-m/z envelope moves gradually to the right (i.e., to higher m/z) whereas the high-m/z envelope increases in relative abundance without change in its position.
Figure 1. FT-ICR high-resolution mass spectrum of C22A-FKBP (Cys-22 → Ala-22), [M+8H]8+. Top: before H/D exchange. Second from top: after exchange under non-denaturing conditions for 6 min. Third from top: after exchange in 3.5 M urea for 6 min. Bottom: after exchange in 4.5 M urea for 6 min.
To determine the global unfolding rate constant $k_{\text{unfold}}$ (see below), H/D exchange of C22A-FKBP was carried out at 3.5 and 4.5 M urea (pD 7.2) for different lengths of time. Since the two envelopes were well separated (Fig. 1, bottom two panels), the relative abundance of the high-m/z envelope could be calculated by summing all peak magnitudes in the high-m/z envelope, and then dividing by the sum of all peak magnitudes in both the low- and high-m/z distributions.
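The two quantities described in this section, the deuterium uptake from the centroid shift and the relative abundance of the high-m/z envelope, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' software: the peak-list format (pairs of m/z and magnitude), the function names, and the use of a single m/z cutoff to separate two well-resolved envelopes are all choices made here.

```python
PROTON_MASS = 1.00728  # Da, mass of the charge-carrying proton


def centroid_mz(peaks):
    """Magnitude-weighted mean m/z of an isotopic envelope.

    peaks: list of (mz, magnitude) pairs (assumed peak-list format).
    """
    total = sum(magnitude for _, magnitude in peaks)
    return sum(mz * magnitude for mz, magnitude in peaks) / total


def average_mass(peaks, charge):
    """Average neutral mass from the centroid of an [M+zH]z+ envelope."""
    return charge * (centroid_mz(peaks) - PROTON_MASS)


def deuterium_uptake(deuterated_peaks, reference_peaks, charge):
    """Mass increase (Da) of the deuterated sample over the undeuterated one."""
    return average_mass(deuterated_peaks, charge) - average_mass(reference_peaks, charge)


def high_envelope_fraction(peaks, mz_split):
    """Summed magnitude at or above mz_split divided by the total magnitude."""
    high = sum(magnitude for mz, magnitude in peaks if mz >= mz_split)
    total = sum(magnitude for _, magnitude in peaks)
    return high / total
```

For an 8+ ion, a centroid shift of 3.5 m/z units corresponds to a mass increase of 28 Da. The cutoff approach in `high_envelope_fraction` only makes sense when the two envelopes do not overlap.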
B. Local Variations in Exchange Behavior Analyzed by Proteolytic Digests
The weakness of the global approach described above is a lack of spatial resolution of H/D exchange. To extend the global H/D exchange pattern, C22A FKBP was digested with pepsin after deuterium exchange periods to reveal not only the extent of deuterium incorporation into the protein as a whole, but also of incorporation into each different proteolytic fragment of the protein (4). For example, Figure 2 shows ESI FT-ICR mass spectra of the proteolytic fragment, V^-M^^, following incubation of C22A-FKBP in D2O under the same conditions as for the intact protein in Figure 1. Since the two envelopes were not well separated, as in Fig. 2C,D, two controls were used to deconvolve the overlapped isotopic distributions. Because (a) the low-m/z envelope of the deuterated peptide is the same as the isotopic distribution of that peptide when the protein was incubated for the same length of time under non-denaturing conditions, and (b) the high-m/z envelope is the same as the isotopic pattern for the fully deuterated peptide, the two isotopic distributions of the peptide could thus serve as controls to deconvolve the overlapped isotope pattern. Specifically, the two control distributions were added together, and the relative abundance of the high-m/z envelope component was varied to yield the best fit to the experimental isotopic distribution.
Figure 2. FT-ICR mass spectrum of the V^-M^^ fragment, [M+3H]3+, of C22A-FKBP. Conditions for each spectrum are as in Figure 1. Horizontal axis: m/z (935.0 to 942.5).
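The fitting step described above, adding the two control distributions and varying the weight of the high-m/z component, can be written as a one-parameter least-squares problem. The sketch below assumes that the three distributions are sampled on the same list of isotope peaks and that each is normalized to unit total magnitude before fitting; both are assumptions made here, and the closed-form solution is one of several reasonable ways to perform the fit.

```python
def normalize(values):
    """Scale a list of peak magnitudes so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def fit_high_fraction(observed, low_control, high_control):
    """Least-squares estimate of f in: observed = (1 - f) * low + f * high.

    All three inputs are peak magnitudes at the same m/z positions.
    The result is clipped to the physically meaningful range [0, 1].
    """
    if not (len(observed) == len(low_control) == len(high_control)):
        raise ValueError("distributions must share the same peak positions")
    obs, low, high = normalize(observed), normalize(low_control), normalize(high_control)
    diff = [h - l for h, l in zip(high, low)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("control distributions are identical")
    f = sum((o - l) * d for o, l, d in zip(obs, low, diff)) / denom
    return min(1.0, max(0.0, f))
```

Because the model is linear in f, no iterative search is needed; the residual sum of squares can still be inspected to judge how well the two-component description holds.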
Although proteolysis and HPLC separation were performed under nominally quenched conditions, some deuterium gain or loss may take place during these steps. To correct for such isotopic exchange, a control with no artificially incorporated deuterium (control A) and a control with all amide sites previously deuterated (control B) were analyzed according to the same procedure as for the deuterated samples. If a deuterated protein or peptide has the same deuterium content as control A (or B), it is considered to have zero (or full) deuteration, and any deuteration level in between is assumed to vary linearly between the two extremes (4). All deuterium contents reported in this paper represent deuterium contents adjusted by this procedure.
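The adjustment described above is a linear interpolation between control A (zero deuteration) and control B (full deuteration). A minimal sketch, assuming the three inputs are average masses measured under identical quench, digestion, and separation conditions; the function name and the optional scaling by the number of exchangeable amide sites are assumptions introduced here.

```python
def corrected_deuteration(m_sample, m_control_a, m_control_b, n_amides=None):
    """Deuteration level adjusted for gain or loss during analysis.

    m_sample:    average mass of the deuterated sample
    m_control_a: average mass of the zero-deuteration control
    m_control_b: average mass of the full-deuteration control
    n_amides:    optional number of exchangeable amide sites (assumed input)

    Returns the fractional deuteration (0 to 1), or the number of
    deuterons if n_amides is given.
    """
    span = m_control_b - m_control_a
    if span <= 0:
        raise ValueError("control B must be heavier than control A")
    fraction = (m_sample - m_control_a) / span
    return fraction if n_amides is None else fraction * n_amides
```

For instance, a peptide whose mass lies halfway between the two controls would be reported as 50% deuterated, whatever absolute loss occurred during digestion and separation.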
C. Kinetics of Folding and Unfolding: Identification of Slow- and Fast-Exchanging Protons
When a protein is continuously labeled with deuterium under mildly denaturing conditions, the unfolding rate of the intact protein as well as various of its component segments can be determined. The experiments described above monitored H/D isotope patterns in various urea concentrations after a fixed period of exchange time. Kinetic data on the rates of folding or unfolding can be obtained by monitoring changes in the isotope distribution for differing periods of time at fixed denaturant concentrations. Figure 3 shows the deuterium exchange-in time courses for intact C22A-FKBP and a few examples of its proteolytic segments. For example, ~50% of the backbone amide hydrogens of intact C22A-FKBP are labeled with deuterium within one hour, whereas ~25% of amide hydrogens resist exchange for up to 50 hours (the longest time investigated), indicating a range of exchange rates for different amide hydrogens due to their different conformational and hydrogen-bonding environments. These different H/D exchange rates can be localized by proteolysis as described above. For instance, amide hydrogens in segment V^^-E^^^ have very slow exchange rates, indicating a shielded and/or highly hydrogen-bonded structure. In the three-dimensional x-ray crystal structure of FKBP (26), this C-terminal segment forms the center strand of a five-stranded β-sheet, with all of its amide hydrogens involved in intramolecular hydrogen bonds. In contrast, most amide hydrogens in the T75-L97 segment exchange very rapidly, consistent with the x-ray structure, in which all but residues I^^ and L^^ lie in an exposed loop (without α-helix or β-sheet structure). Finally, the amide hydrogens in the segment V^'^-E^^ exchange relatively slowly, consistent with their location in a β-strand with higher solvent exposure than the C-terminus.
Figure 3. Deuterium exchange-in time courses for C22A-FKBP and a few of its pepsin-catalyzed hydrolytic fragments. Horizontal axis: Exchange Period (h).
Hydrogen exchange of a protein under denaturing conditions can be described by the following simple two-state (folded/unfolded) model:
\[
\begin{array}{ccc}
\text{Folded/Protonated} & \underset{k_{\text{fold}}}{\overset{k_{\text{unfold}}}{\rightleftharpoons}} & \text{Unfolded/Protonated} \\[6pt]
\Big\downarrow\, k^{\text{folded}}_{\mathrm{H\to D}} & & \Big\downarrow\, k^{\text{unfolded}}_{\mathrm{H\to D}} \\[6pt]
\text{Folded/Deuterated} & \underset{k_{\text{fold}}}{\overset{k_{\text{unfold}}}{\rightleftharpoons}} & \text{Unfolded/Deuterated}
\end{array}
\]
The model simply proposes that hydrogen exchange may occur (with different rates) in both the folded and unfolded forms of the protein, and that the folding/unfolding rates are not affected by replacement of amide hydrogens by deuteriums. (Note that $k_{\mathrm{D\to H}}$ is zero for experiments in D2O, since there are no H2O solvent protons available for back-exchange; this rate constant does become observable by separate pulse-labeling experiments discussed later. Also, $k_{\mathrm{D\to H}}$ is the same as $k_{\mathrm{H\to D}}$ except for kinetic isotope effects.) Let $k_{\text{unfold}}$ be the rate constant for unfolding, $k_{\text{fold}}$ the rate constant for folding, and $k^{\text{unfolded}}_{\mathrm{H\to D}}$ the hydrogen/deuterium exchange rate for an amide hydrogen in the unfolded state. If $k^{\text{unfolded}}_{\mathrm{H\to D}} \ll k_{\text{fold}}$, the protein will fold and unfold many times before it can be fully deuterated, and the observed exchange rate is $k_{\text{obs}} = K_{\text{unfold}}\,k^{\text{unfolded}}_{\mathrm{H\to D}}$, in which $K_{\text{unfold}} = k_{\text{unfold}}/k_{\text{fold}}$. This mechanism represents the EX2 limit. Under these conditions, the deuteration sites are randomly distributed about the average, so that a single-envelope isotope distribution is observed. ($k^{\text{unfolded}}_{\mathrm{H\to D}}$ values for different amide hydrogens have previously been found to range over only about one order of magnitude (27).) We observed such behavior when C22A-FKBP was incubated in 2 M urea (data not shown). On the other hand, if $k^{\text{unfolded}}_{\mathrm{H\to D}} \gg k_{\text{fold}}$, then once the protein unfolds, all amide hydrogens in the unfolded region will rapidly be replaced by deuteriums, to yield a second isotopic envelope with high deuterium content (unfolded/deuterated) distinct from that of the folded/deuterated protein. Under these conditions, known as EX1, $k_{\text{obs}} = k_{\text{unfold}}$. We observe this type of behavior for C22A-FKBP in the presence of 3.5 or 4.5 M urea (Figure 1, bottom two panels). In this limit, the rate of increase in relative abundance of the high-m/z envelope provides a direct measure of $k_{\text{unfold}}$.
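The two limits above can be seen as special cases of a single expression. The sketch below uses the commonly quoted steady-state form of the two-state exchange model, k_obs = k_unfold * k_ex / (k_fold + k_ex), which is an assumption brought in here rather than a formula given in the text; it also neglects exchange from the folded state. The function names and the factor-of-ten threshold for labeling a regime are illustrative choices.

```python
def observed_exchange_rate(k_unfold, k_fold, k_exchange):
    """Steady-state observed exchange rate for a two-state model.

    Assumes exchange occurs only from the unfolded state and that the
    unfolded population is small and at steady state (assumptions of
    this sketch). All rate constants share the same units.
    """
    return k_unfold * k_exchange / (k_fold + k_exchange)


def exchange_regime(k_fold, k_exchange, factor=10.0):
    """Label the kinetic regime by comparing k_exchange with k_fold."""
    if k_exchange * factor <= k_fold:
        return "EX2"   # k_obs approaches (k_unfold / k_fold) * k_exchange
    if k_exchange >= k_fold * factor:
        return "EX1"   # k_obs approaches k_unfold
    return "intermediate"
```

When k_exchange is much smaller than k_fold the expression reduces to K_unfold times k_exchange (EX2), and when it is much larger the expression reduces to k_unfold (EX1), matching the two cases described in the text.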
If the equilibrium constant for the unfolding can also be determined, then the refolding rate constant, kfold, niay also be determined. By the same reasoning, if the isotopic distributions for different segments of the deuterated protein can be determined, then the unfolding and refolding rate constants for each segment can be determined independently. A detailed unfolding and refolding mechanism may then be developed from the kinetics of those individual exchange processes, providing a powerful probe of detailed unfolding/refolding pathways. For the present example, our observation of similar unfolding kinetics for intact C22A-FKBP (Fig. 1) and all of its fragments (e.g.. Fig. 2 shows the V^-M^^ segment) in 3.5 or 4.5 M urea further confirms that the protein unfolds with a single cooperative transition. Figure 4 shows the relative abundance of high-m/z envelope (corresponding to the unfolded form) vs. H/D exchange period for C22A-FKBP and its V^-M^^ segment, at either 3.5 M or 4.5 M urea. The unfolding rate constant, kunfold* of the protein and its V^-M^^ segments of the protein can be determined by fitting each data set to a single-exponential curve. At either 3.5 M or 4.5 M urea, the unfolding rate of the intact protein and its V^-M^^ segment are near-identical. Similar results (not shown) were obtained for numerous other C22A-FKBP segments. Observation of a common unfolding rate for each of many segments of the protein backbone further establishes a two-state (folded/unfolded) equilibrium for C22A-FKBP in 3.5 and 4.5 M urea. The unfolding _ rate constant, kunfold = 1.8 ± 0.2 h-1 at 3.5 M urea and 7.2 ± 0.4 h-1 at 4.5 M urea, corresponding to a unfolding half-life of 23 min at 3.5 M urea and 6 min at 4.5 M urea. If the unfolding equilibrium constant Kunfold = k u n f o l d / k f o l d can be determined, then the refolding constant, kfoid» for different regions of the protein can also be determined as follows. 
The fully deuterated protein equilibrated in 3.5 M and 4.5 M urea/D2O solution was rapidly diluted (10-fold) with H2O buffer, and after 5 seconds, further deuterium exchange-out was quenched by decreasing the pH to 2.4. During the brief 5 s period, amide deuteriums in unfolded protein molecules are replaced with hydrogen faster than are amide deuteriums in folded protein molecules. If the D/H exchange rate is faster than or comparable to the refolding rate during those five seconds, then two distinct envelopes will be seen in the isotopic distribution, and the relative abundances of the two envelopes provide a direct measure of the relative concentrations of the folded and unfolded proteins.

Figure 5 shows the isotope distribution of C22A-FKBP after such pulse-labeling in 3.5 M and 4.5 M urea. For the 4.5 M urea experiment, two isotopic envelopes are clearly evident. The low-m/z envelope represents the unfolded (rapidly D/H-exchanged) protein and the high-m/z envelope is the folded (slowly D/H-exchanged) protein. Thus, the D/H exchange rate must be much faster than the refolding rate. The relative abundance of the two envelopes thus provides a direct measure of the relative concentrations of folded and unfolded protein. (Although the two distributions do not overlap in this example, deconvolution of the natural-abundance isotope distribution clearly narrows the observed distributions, and promises to increase the power of mass spectrometry to resolve conformational states more similar than shown here.) From the distributions shown in Fig. 5, we calculated an unfolding equilibrium constant, Kunfold, of <0.05 in 3.5 M urea and 0.27 in 4.5 M urea. Thus, the refolding rate constant for C22A-FKBP in 3.5 M urea should be >40 h⁻¹ (half-life <1 min), and the refolding rate constant for C22A-FKBP in 4.5 M urea is 27 h⁻¹ (half-life ~1.6 min).

[Figure 4: plot of relative abundance vs. Exchange Period (h), 0 to 0.8 h.]
Figure 4. Relative abundance of high-m/z isotopic envelope (i.e., relative number of FKBP molecules that have unfolded at least once) vs. exchange period for C22A-FKBP and its V^-M^^ segment in either 3.5 M or 4.5 M urea. Each data set is fitted to a single exponential curve. The close agreement between the data for intact protein and its V^-M^^ segment and all of its other segments (not shown) supports a two-state folded/unfolded protein equilibrium model.

[Figure 5: isotope distributions of C22A FK506-binding protein after equilibration in D2O, exposure to H2O (5 seconds) and quench. Upper panels: mass spectra in 3.5 M urea and 4.5 M urea, m/z 1310 to 1318. Bottom panel: deuterium distribution in 4.5 M urea, number of deuterons 10 to 70.]
Figure 5. Isotope distributions for C22A-FKBP, [M+9H]9+, following equilibration in 3.5 M and 4.5 M urea in D2O, followed by pulse-labeling for 5 s with H2O, and quenching to stop the D/H exchange. The low-m/z envelope thus represents the unfolded form of the protein. The bottom panel represents the distribution of incorporated deuterons computed by deconvolving the natural-isotopic distribution from the experimental mass spectra.
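The arithmetic linking the rate constants, half-lives and equilibrium constant quoted above can be sketched as follows. This is a minimal illustration that assumes simple first-order two-state kinetics; the function names are hypothetical and are not taken from the original work.

```python
import math


def half_life_minutes(rate_constant_per_hour):
    """Half-life (min) of a first-order process with rate constant in h^-1."""
    return math.log(2) / rate_constant_per_hour * 60.0


def equilibrium_constant(unfolded_abundance, folded_abundance):
    """K_unfold taken as the ratio of unfolded to folded envelope abundances."""
    return unfolded_abundance / folded_abundance


def refolding_rate_constant(k_unfold, K_unfold):
    """Two-state model: K_unfold = k_unfold / k_fold, so k_fold = k_unfold / K_unfold."""
    return k_unfold / K_unfold


# Values quoted in the text for 4.5 M urea.
k_unfold = 7.2   # h^-1
K_unfold = 0.27  # dimensionless

k_fold = refolding_rate_constant(k_unfold, K_unfold)
print(f"unfolding half-life: {half_life_minutes(k_unfold):.1f} min")
print(f"k_fold: {k_fold:.1f} h^-1")
print(f"refolding half-life: {half_life_minutes(k_fold):.1f} min")
```

With the 4.5 M urea values, this reproduces a refolding rate constant close to the 27 h⁻¹ quoted above and an unfolding half-life of about 6 min.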
IV. Conclusion
The present results demonstrate the utility of electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry (ESI FT-ICR MS) for analysis of hydrogen/deuterium exchange to determine the number of conformations of a protein and their associated interconversion equilibria and kinetics. Comparison of the H/D exchange for the intact protein and its proteolytic fragments provides independent simultaneous probes of the conformation(s) of various segments of the protein. Compared to fluorescence, CD, and UV methods for monitoring unfolding, H/D exchange with MS detection offers the following advantages: high sensitivity, rapid measurement, and (especially) non-overlapping signals for native/unfolded forms. Furthermore, it is relatively easy to obtain fragment-specific information. NMR can also be used to study folding equilibria and kinetics by use of H/D exchange. NMR has the distinct advantage of site-specific folding information, but suffers from low sensitivity and slow measurement periods.

FT-ICR offers several unique advantages over other forms of mass spectrometry for such problems. First, due to its ultrahigh mass resolution, isotopic peaks of an intact protein can be resolved. The natural isotopic abundance distribution may thus be deconvolved (by maximum-entropy techniques, to be reported in a future paper) out of the isotopic distribution of the deuterated protein or protein fragments, to leave just the distribution of the artificially introduced deuteriums, as in Figure 5 (bottom panel). The deuterium distribution is critical for identifying folding intermediates and their kinetics and equilibria. Deconvolution of the isotope natural abundance distribution increases the resolution for distinguishing between different conformational states in such experiments. Furthermore, in a proteolytic digest experiment, FT-ICR MS isotopic peaks of all protein fragments are well resolved, facilitating charge state assignment.
With other mass spectrometers, isotopic peaks of all fragments are usually not resolved, and a better HPLC separation is needed in order to assign charge states, with concomitant increase in HPLC separation time and thus increased deuterium back-exchange. Second, because of FT-ICR's ultrahigh mass measurement accuracy, most of the proteolytic fragments of a protein can be identified by their accurate masses, thereby reducing the need for MS/MS or other partial sequencing of each fragment. Because peptide identification is the limiting factor (in both time and sample consumption), FT-ICR MS greatly reduces the experimental time and effort as well as sample consumption, compared to other mass analyzers.
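The link between resolved isotopic peaks and charge state assignment can be illustrated with a short sketch. It relies on the general mass-spectrometric relation that adjacent isotopic peaks of an ion of charge z are spaced by roughly 1/z in m/z; the peak list below is hypothetical and the code is not the authors' procedure. For deuterated samples the spacing also reflects the D/H mass difference, which is close enough to the 13C/12C difference for charge assignment.

```python
C13_C12_MASS_DIFF = 1.003355  # Da, mass difference between 13C and 12C
PROTON_MASS = 1.007276        # Da


def charge_from_isotope_spacing(mz_peaks):
    """Estimate the charge state from the mean spacing of adjacent isotopic peaks."""
    spacings = [b - a for a, b in zip(mz_peaks, mz_peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(C13_C12_MASS_DIFF / mean_spacing)


def neutral_mass(mz, charge):
    """Neutral mass of an [M + zH]z+ ion observed at the given m/z."""
    return charge * (mz - PROTON_MASS)


# Hypothetical resolved isotopic peaks near m/z 1312 (illustration only).
peaks = [1312.0000, 1312.1115, 1312.2230, 1312.3345]
z = charge_from_isotope_spacing(peaks)
print("charge state:", z)
print("neutral mass of first peak (Da):", round(neutral_mass(peaks[0], z), 2))
```

When isotopic peaks are not resolved, this spacing is unavailable and the charge state must be inferred by other means, which is the limitation of lower-resolution analyzers noted above.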
Acknowledgments
This work was supported by NSF (CHE-94-13008), NIH (GM-54035), NIH (GM-31683), Florida State University, and the National High Magnetic Field Laboratory in Tallahassee, FL.
References
1. Woodward, C.; Simon, I.; Tuchsen, E. (1982). Mol. Cell. Biochem. 48, 135-60.
2. Englander, S. W.; Kallenbach, N. R. (1984). Quart. Rev. Biophys. 16, 521-655.
3. Smith, D. L.; Zhang, Z. (1994). Mass Spectrom. Rev. 13, 411-429.
4. Zhang, Z.; Smith, D. L. (1993). Protein Sci. 2, 522-531.
5. Liu, Y.; Smith, D. L. (1994). J. Am. Soc. Mass Spectrom. 5, 19-28.
6. Johnson, R. S.; Walsh, K. A. (1994). Protein Sci. 3, 2411-2418.
7. Johnson, R. S. (1996). J. Am. Soc. Mass Spectrom. 7, 515-521.
8. Zhang, Z.; Post, C. B.; Smith, D. L. (1996). Biochemistry 35, 779-91.
9. Zhang, Z.; Smith, D. L. (1996). Protein Sci. 5, 1282-1289.
10. Comisarow, M. B.; Marshall, A. G. (1974). Chem. Phys. Lett. 25, 282-283.
11. Marshall, A. G.; Schweikhard, L. (1992). Int. J. Mass Spectrom. Ion Proc. 118/119, 37-70.
12. Speir, J. P.; Gorman, G. S.; Amster, I. J. (1992). In Mass Spectrometry in the Biological Sciences: A Tutorial (M. L. Gross, ed.), p. 199-212. Kluwer Academic Publishers, Dordrecht, The Netherlands.
13. Buchanan, M. V.; Hettich, R. L. (1993). Anal. Chem. 65, 245A-259A.
14. Trends in Analyt. Chem. 13, Special Issue: Fourier Transform Mass Spectrometry (C. L. Wilkins, ed.), 1994, p. 223-251.
15. McLafferty, F. W. (1994). Acc. Chem. Res. 27, 379-386.
16. Wu, Q.; Van Orden, S.; Cheng, X.; Bakhtiar, R.; Smith, R. D. (1995). Anal. Chem. 67, 2498-2509.
17. Senko, M. W.; Hendrickson, C. L.; Pasa-Tolic, L.; Marto, J. A.; White, F. M.; Guan, S.; Marshall, A. G. (1996). Rapid Commun. Mass Spectrom. 10, 0000-0000.
18. Miranker, A.; Robinson, C. V.; Radford, S. E.; Aplin, R. T.; Dobson, C. M. (1993). Science 262, 896-900.
19. Suckau, D.; Shi, Y.; Beu, S. C.; Senko, M. W.; Quinn, J. P.; Wampler, F. M., III; McLafferty, F. W. (1993). Proc. Natl. Acad. Sci. USA 90, 790-793.
20. Fruman, D. A.; Burakoff, S. J.; Bierer, B. E. (1994). FASEB J. 8, 391-400.
21. Barettino, D.; Feigenbutz, M.; Valcarcel, R.; Stunnenberg, H. G. (1994). Nucl. Acids Res. 22, 541-542.
22. Pilot-Mathias, T.; Pratt, S. D.; Lane, B. C. (1993). Gene 128, 219-225.
23. Holzman, T. L.; Egan, D. A.; Edalji, R.; Simmer, R. L.; Helfrich, R.; Taylor, A.; Burres, N. S. (1991). J. Biol. Chem. 226, 2747-2479.
24. Marshall, A. G.; Comisarow, M. B.; Parisod, G. (1979). J. Chem. Phys. 71, 4434-4444.
25. Powers, J. C.; Harley, A. D.; Myers, D. V. (1977). Adv. Exp. Med. Biol. 95, 141-57.
26. Duyne, G. D. V.; Standaert, R. F.; Karplus, P. A.; Schreiber, S. L.; Clardy, J. (1991). Science 252, 839-842.
27. Bai, Y.; Milne, J. S.; Mayne, L.; Englander, S. W. (1993). Proteins: Struct., Funct., Genet. 17, 75-86.
Internal Dynamics of Human Ubiquitin Revealed by 13C-Relaxation Studies of Randomly Fractionally Labeled Protein
A. Joshua Wand¹, Jeffrey L. Urbauer, Robert P. McEvoy² and Ramona J. Bieber
Departments of Biological Sciences, Biophysical Sciences and Chemistry and Center for Structural Biology, State University of New York at Buffalo, Buffalo, New York 14260

¹ To whom correspondence should be addressed.
² Deceased.

TECHNIQUES IN PROTEIN CHEMISTRY VIII. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.

I. Introduction
The physical basis of protein structure, dynamics and stability has been a subject of intense study and debate for several decades. While our knowledge of the taxonomy of protein structure appears to be nearing completeness, an understanding of the existence, character and interconversion of states near the lowest free energy state of proteins remains largely incomplete. The magnitude of the residual entropy of proteins is of critical importance to an understanding of protein structure, stability and ultimately function. In principle, nuclear magnetic resonance (NMR) spectroscopy can be employed to estimate the local dynamics throughout a protein. However, though comprehensively applied to small polypeptides, the application of NMR relaxation techniques to the study of fast local dynamics throughout a protein has been hindered by the apparent need to employ selective 13C-enrichment. Thus, although a number of studies of the motion of N-H vectors in proteins have been reported, there has been a paucity of reports on the internal dynamics of side chain C-H vectors of proteins. Here we apply a recently introduced technique to obtain reliable 13C relaxation parameters in the protein ubiquitin.

Ubiquitin is a small (76 amino acids), extremely stable protein containing a broad collection of secondary structure elements, including parallel and antiparallel beta strands assembled into a mixed beta sheet, alpha and 3₁₀ helices and a variety of turns (Vijay-Kumar et al., 1987; Di Stefano & Wand, 1987). In previous work, we have examined the fast main chain dynamics of ubiquitin by use of 15N NMR relaxation methods (Schneider et al., 1992). These data were analyzed in terms of the so-called model free treatment of Lipari and Szabo (1982a,b). The amplitudes of motion of the backbone amide N-H vectors of the packed regions of the protein are generally highly restricted and show no apparent correlation with secondary structure context but do show a strong correlation with the presence of hydrogen bonding of the amide hydrogen or its peptide bond associated carbonyl. These data suggest that the main chain of ubiquitin has highly restricted motion on the sub-nanosecond time scale and that molecular packing interactions provide the dominant restriction to this motion.

Although informative, the study of 15N relaxation of backbone sites in ubiquitin did not, of course, provide any insight as to the extent, range and character of side chain dynamics. To obtain a measure of the amplitude and timescale of fast (subnanosecond) dynamics of C-H vectors of both the main chain and side chains, we have carried out low pass filtered NMR relaxation experiments using 15% randomly 13C-enriched ubiquitin. Analysis of the relaxation parameters of methine alpha carbon and methyl carbon centers reveals a wide range of side chain dynamics throughout the interior of the protein. In contrast to these results, but in general agreement with the view of the main chain provided by amide N-H dynamics, the relaxation behavior of alpha carbon sites is consistently indicative of low amplitude-high frequency motion along the backbone of the protein.
II. Materials and Methods
A. Preparation of 13C-Enriched Ubiquitin
The ubiquitin structural gene construct was synthesized as described in detail elsewhere (Wand et al., 1996). Expression was undertaken in minimal media containing M9 salts and the desired nitrogen and carbon source. Three types of ubiquitin samples were prepared. Uniformly 15N,13C-labeled ubiquitin was prepared utilizing 13C6-glucose (2 g/L) and 15NH4Cl (1 g/L). The sample used for relaxation studies was prepared using a mixture of acetates (4 g/L) comprising 15% 13C1-acetate, 15% 13C2-acetate and 70% unlabeled acetate. To determine prochiral methyl assignments, ubiquitin was prepared using a mixture of uniformly 13C-labeled glucose (10%) and unlabeled glucose (90%). Ubiquitin was purified by a minor variation of the standard protocol (Ecker et al., 1987b). NMR samples ranged between 2 and 4 mM in ubiquitin and were prepared in 50 mM potassium phosphate buffer, pH* 5.7, in 90% H2O and 10% D2O or 100% D2O as required.
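A rough sense of why fractional labeling leaves most 13C sites without a directly bonded 13C neighbor can be had from a back-of-the-envelope sketch. It assumes, as a simplification not stated in the source, that every carbon site is labeled independently with probability 0.15 and that natural abundance can be ignored; because the labeled acetates are singly labeled, the true fractions for carbons derived from the same acetate molecule should be lower.

```python
P_LABEL = 0.15  # assumed per-site 13C labeling probability (simplification)


def fraction_with_13c_neighbor(n_neighbors, p_label=P_LABEL):
    """Fraction of 13C sites with at least one directly bonded 13C neighbor,
    assuming independent labeling of each of the n neighboring carbon sites."""
    return 1.0 - (1.0 - p_label) ** n_neighbors


# A methyl carbon has one bonded carbon; a methine may have up to three.
for label, n in (("methyl (1 neighbor)", 1), ("methine (3 neighbors)", 3)):
    print(f"{label}: {fraction_with_13c_neighbor(n):.3f}")
```

Under these assumptions roughly 15% of 13C-methyl carbons and up to about 39% of 13C-methine carbons would have a 13C neighbor, which is the subpopulation a 13C-13C coupling filter is meant to remove.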
B. NMR Spectroscopy
Spectra were recorded on Bruker AMX-500 and DMX-750 NMR spectrometers at 30 °C. All triple resonance experiments and the HCCH-TOCSY experiment were performed on a single sample of 15N,13C-ubiquitin in 90% H2O/10% D2O buffer. HCCH-TOCSY spectra were obtained using the pulse sequence described by Bax et al. (1990) and utilized a 27 ms DIPSI-3 mixing sequence. The HCCH-TOCSY data sets were composed of 92 complex points in the incremented 1H dimension, 48 complex points in the incremented 13C dimension and 256 complex points in the 1H acquisition dimension. The obtained data sets were transformed to 256 by 128 by 512 points for the 1H, 13C and 1H dimensions spanning 4000 Hz, 4310 Hz and 4000 Hz, respectively. HNCA, HNCO and HN(CO)CA experiments were obtained essentially as described by Grzesiek and Bax (1992) and were composed of 52 (HNCA and HN(CO)CA) or 64 (HNCO) complex points in the incremented 13C dimension, 32 complex points in the incremented 15N dimension and 512 complex points in the 1H acquisition dimension. The obtained data sets were transformed to 128 (HN(CO)CA and HNCA) or 256 (HNCO) by 128 by 512 points for the 13C, 15N and 1H dimensions spanning 3333 Hz, 2016 Hz and 6410 Hz, respectively. Constant time 13C-HSQC spectra were collected as described by Vuister and Bax (1992). The 13C-HSQC spectrum without 13C-decoupling during the incremented time domain was acquired as described by Neri et al. (1990). NOE and spin lattice relaxation were monitored using the pulse sequences described in Figure 1. A 3 second period of composite pulse 1H decoupling was used to generate the steady-state NOE. All data processing was done using the program FELIX (BioSym Technologies). Spectra are referenced to trace TMS in water (1H, 0 ppm), neat TMS (13C, 2.86 ppm) and 15NH4Cl in 1 M HCl (15N, 24.93 ppm).
C. Relaxation Data Analysis
Center cross peak intensities of the serial two dimensional 13C-HSQC spectra used to quantitate spin lattice relaxation were fitted to a two parameter single exponential using standard Gaussian elimination methods. The heteronuclear NOE was determined using cross peaks of 13C-1H correlation spectra obtained with and without saturation of proton resonances, which were integrated and corrected for baseline offsets as required. Model free order parameters were obtained using equations [1], [2], [3] and [4] as described in the text and by Dellwo & Wand (1989). Estimates of the error in the obtained model free parameters were obtained from the variation of the model free parameters during Monte Carlo sampling of the observed relaxation parameters over the range of their estimated individual precision.
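One way a two parameter single exponential fit can be reduced to a linear problem solved by elimination is sketched below. The text does not give the fitting details, so the log-linearization, the function name and the synthetic intensities are assumptions made for illustration; the delay values are those listed for the 17.6 Tesla measurements in the Results.

```python
import numpy as np


def fit_single_exponential(delays, intensities):
    """Fit I(t) = I0 * exp(-t / T1) by linear least squares on ln(I).

    The 2 x 2 normal equations are solved with numpy.linalg.solve, which
    uses an LU (Gaussian elimination) factorization. Returns (I0, T1).
    """
    t = np.asarray(delays, dtype=float)
    y = np.log(np.asarray(intensities, dtype=float))
    design = np.column_stack([np.ones_like(t), -t])  # columns for ln(I0) and 1/T1
    ln_i0, rate = np.linalg.solve(design.T @ design, design.T @ y)
    return float(np.exp(ln_i0)), float(1.0 / rate)


# Relaxation delays (s) as listed for the 17.6 Tesla data set.
delays = [0.01, 0.16, 0.31, 0.46, 0.61, 0.76, 0.91, 1.06, 1.21, 1.36, 1.51, 1.66]
# Synthetic, noise-free intensities for a hypothetical site with T1 = 0.5 s.
intensities = 100.0 * np.exp(-np.asarray(delays) / 0.5)

i0, t1 = fit_single_exponential(delays, intensities)
print(f"I0 = {i0:.2f}, T1 = {t1:.3f} s")
```

A log-linear fit weights noisy low-intensity points heavily, so with real data a weighted or nonlinear fit may be preferable; the sketch is meant only to show the two-parameter structure of the problem.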
[Figure 1: pulse-sequence diagrams showing the 1H and 13C channels for the T1 experiment (top panel) and the NOE experiment (bottom panel).]
Figure 1. Pulse sequences used to monitor the heteronuclear NOE (bottom panel) and the spin lattice relaxation (top panel). The NOE experiment is a simple extension of the basic pulse sequence introduced by Kay et al. (1989) and utilizes continuous broadband 1H decoupling during the preparation period to generate the NOE. Two dimensional spectra with and without 1H decoupling (lightly shaded region) define the NOE. The T1 relaxation experiment is a simple extension of the basic pulse sequence introduced by Sklenar et al. (1987). The NOE via 1H decoupling rather than coherent polarization transfer is used to polarize the carbons. For both the NOE and T1 measurement, the proton pulse θ (or the delay of the corresponding reverse INEPT) is set to the magic angle as described by Palmer et al. (1991). The constant time period, Δ, is set to minimize a cosine modulation term that depends on n, the one-bond coupling 1JCC and the delay periods. When τ is set to 1/(2 1JCH), then 2Δ = 1/(2 1JCC) - 1/(2 1JCH), resulting in the suppression of the contribution of 13C-13C-1H spin systems to the observed relaxation (see Wand et al., 1995). The value of n allows a null to be obtained at different constant time intervals, providing for increased digital resolution at the expense of sensitivity. Reprinted with permission from Wand et al. (1996). Copyright 1996 American Chemical Society.

III. Results
The essentially complete assignment of 1H, 13C and 15N resonances of recombinant human ubiquitin was obtained by analysis of HNCA, HN(CO)CA, HNCO and HCCH-TOCSY spectra and reference to previously reported 1H (Di Stefano and Wand, 1987) and 15N (Schneider et al., 1992) resonance assignments (Wand et al., 1996). Stereospecific assignments of leucine δ-CH3 and valine γ-CH3 groups were obtained using the isotopic labeling strategy developed by Neri et al. (1990). This approach provided unequivocal prochiral assignments for all of the valine and leucine methyl pairs (Wand et al., 1996). Human ubiquitin was randomly and fractionally enriched with 13C by expression of the protein in E. coli during growth on minimal media utilizing a
mixture of labeled and unlabeled acetates (15% 13C2-acetate, 15% 13C1-acetate, 70% 12C1,12C2-acetate) as the sole carbon source. Low pass filtered indirectly detected T1 relaxation experiments were utilized to suppress contributions from 13C-13C pairs to the constant time HSQC spectra used to sample relaxation (Wand et al., 1995; see Figure 1). The contribution of the 13C-13C subpopulation to the obtained two-dimensionally sampled relaxation profile is suppressed by low pass filtration based on 13C-13C scalar coupling (Wand et al., 1995). The pulse sequence used to sample T1 relaxation polarizes the 13C nuclei via the NOE, utilizes a difference T1 time course and increases sensitivity by use of a reverse DEPT (or INEPT) sequence. This approach has several distinct advantages over the double DEPT sequences used to study methine carbon or amide nitrogen relaxation. Simple polarization with the NOE avoids creation of unwanted and complicating multiple spin coherences and allows use of 1H decoupling throughout the low pass filter with similar advantages. Complications with individual multiplet component transfer and relaxation through the constant time evolution and filter are also avoided.

T1 relaxation obtained at 17.6 Tesla was sampled at 0.01, 0.16, 0.31, 0.46, 0.61, 0.76, 0.91, 1.06, 1.21, 1.36, 1.51 and 1.66 seconds following inversion. T1 relaxation obtained at 11.3 Tesla was sampled at 0.01, 0.11, 0.21, 0.31, 0.41, 0.51, 0.61, 0.71, 0.81, 0.91, 1.01 and 1.11 seconds following inversion. In all cases, a constant time period of 36 ms was used, which corresponds to a predicted null in the intensity contributed by 13C-13C-1H spin systems. The obtained relaxation curves were uniformly single exponential for both methine and methyl carbon sites (Wand et al., 1995, 1996). Methyl group 13C T1 relaxation profiles were essentially insensitive to the value of the low pass filter delay used.
This is consistent with the small fraction (15%) of 13C nuclei bonded to 13C-methyl carbons in the preparation of randomly fractionally 13C-enriched ubiquitin used. Obtained T1 relaxation profiles of methine carbons, having potentially up to three adjacent carbon sites, showed some sensitivity to variation of the low pass filter delay from the null value to non-optimal values. Extensive decoupling during relaxation, appropriate setting of the proton pulse of the reverse DEPT (or the delays of the reverse INEPT) to the magic angle and use of incoherent polarization give a decidedly single exponential character to the T1 relaxation data. This indicates the absence of significant cross-correlation and other potentially contaminating effects (Werbelow & Grant, 1977) as discussed in this context by Palmer et al. (1991) and Kay et al. (1992). The T1 and NOE relaxation parameters for a 13C spin relaxed solely by dipole-dipole interactions with a directly bonded proton spin are given by (Wittebort & Szabo, 1978):
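$$\frac{1}{T_1} = \frac{\hbar^2 \gamma_H^2 \gamma_C^2}{4 r_{CH}^6}\left[ J(\omega_H - \omega_C) + 3J(\omega_C) + 6J(\omega_H + \omega_C) \right] \tag{1}$$

$$\mathrm{NOE} = 1 + \frac{\gamma_H}{\gamma_C}\,\frac{6J(\omega_H + \omega_C) - J(\omega_H - \omega_C)}{J(\omega_H - \omega_C) + 3J(\omega_C) + 6J(\omega_H + \omega_C)} \tag{2}$$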
where γH, γC, ħ and rCH correspond to the gyromagnetic ratios characteristic of hydrogen and carbon, Planck's constant (divided by 2π) and the C-H bond length, respectively. The effects of other relaxation mechanisms such as chemical shift anisotropy (Spiess, 1978) and spin rotation (Spiess et al., 1973) can be assumed to be negligible for methyl groups. The CSA of alpha and side chain methines such as the β-carbon of threonine may be as large as 20 to 40 ppm but is also not expected to contribute significantly to the relaxation rate determined. Given knowledge of the covalent geometry and fundamental constants underlying relaxation, the spectral densities, J(ω), remain to be defined. Here, we adopt the so-called model free treatment due to Lipari and Szabo (1982a,b), which seeks to encapsulate the unique motional character expressed in observed relaxation behavior by defining the spectral density in terms of two local parameters:
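$$J(\omega) = \frac{2}{5}\left[ \frac{S^2 \tau_m}{1 + \omega^2 \tau_m^2} + \frac{(1 - S^2)\,\tau}{1 + \omega^2 \tau^2} \right] \tag{3}$$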
where 1/τ = 1/τe + 1/τm. S², τm and τe correspond to the generalized order parameter, the global isotropic tumbling correlation time and the effective internal correlation time, respectively. Equation [3] indicates that two local model free parameters need to be determined for each site under investigation. Spin lattice relaxation rates obtained at two fields and the NOE obtained at one field were fitted to Equations [1] and [2], respectively, for an explicit range of values for S² and τe. Values of 1.1 Å for the C-H bond length and 4.125 ns (Schneider et al., 1992) for τm were used. The error function describing this fit is defined as (Dellwo and Wand, 1989):
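$$E = \sum_i \left( \frac{X_i^{\mathrm{obs}} - X_i^{\mathrm{calc}}}{X_i^{\mathrm{obs}}} \right)^2 \tag{4}$$

where the sum runs over the observed relaxation parameters X (the T1 and NOE values).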
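The way equations [1] to [4] combine in a grid search over S² and τe can be sketched numerically. The code below is an illustration rather than the authors' program: it uses SI units (so the dipolar constant carries the factor μ0/4π, whereas equation [1] is written in cgs form), assumes the NOE is measured at 11.3 Tesla, assumes the error function is a sum of squared fractional deviations, and starts both grids one step above zero to avoid dividing by τe = 0. The "observed" values are synthetic.

```python
import numpy as np

HBAR = 1.054571817e-34   # J s
MU0_OVER_4PI = 1.0e-7    # T m / A, SI prefactor for the dipolar constant
GAMMA_H = 2.6752e8       # rad s^-1 T^-1
GAMMA_C = 6.7283e7       # rad s^-1 T^-1
R_CH = 1.1e-10           # m, C-H bond length used in the text
TAU_M = 4.125e-9         # s, global tumbling time used in the text


def spectral_density(omega, s2, tau_e, tau_m=TAU_M):
    """Model free spectral density, equation [3]."""
    tau = 1.0 / (1.0 / tau_m + 1.0 / tau_e)
    return 0.4 * (s2 * tau_m / (1.0 + (omega * tau_m) ** 2)
                  + (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2))


def t1_and_noe(b0, s2, tau_e):
    """13C T1 (s) and NOE for one C-H pair at field b0 (T), equations [1] and [2]."""
    w_h, w_c = GAMMA_H * b0, GAMMA_C * b0
    dipolar = MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_C / R_CH ** 3
    j_diff = spectral_density(w_h - w_c, s2, tau_e)
    j_c = spectral_density(w_c, s2, tau_e)
    j_sum = spectral_density(w_h + w_c, s2, tau_e)
    total = j_diff + 3.0 * j_c + 6.0 * j_sum
    t1 = 1.0 / (0.25 * dipolar ** 2 * total)
    noe = 1.0 + (GAMMA_H / GAMMA_C) * (6.0 * j_sum - j_diff) / total
    return t1, noe


def predict(s2, tau_e, t1_fields=(17.6, 11.3), noe_field=11.3):
    """T1 at two fields and the NOE at one field (the NOE field is an assumption)."""
    values = [t1_and_noe(b0, s2, tau_e)[0] for b0 in t1_fields]
    values.append(t1_and_noe(noe_field, s2, tau_e)[1])
    return values


def grid_search(observed):
    """Minimize the fractional error over a 500 x 500 grid of S2 and tau_e."""
    s2 = (np.arange(1, 501) * 0.002)[:, None]      # 0.002 ... 1.000
    tau_e = (np.arange(1, 501) * 2.0e-12)[None, :]  # 2 ps ... 1 ns
    calculated = predict(s2, tau_e)
    error = sum(((obs - calc) / obs) ** 2 for obs, calc in zip(observed, calculated))
    i, j = np.unravel_index(np.argmin(error), error.shape)
    return float(s2[i, 0]), float(tau_e[0, j]), float(error[i, j])


# Synthetic "observations" generated from known parameters (illustration only).
observed = predict(0.8, 50.0e-12)
best_s2, best_tau_e, best_error = grid_search(observed)
print(f"S2 = {best_s2:.3f}, tau_e = {best_tau_e * 1e12:.0f} ps")
```

Plotting the `error` array as a contour map over the two grid axes gives the kind of surface that can be inspected visually for a single global minimum.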
For a perfect fit, the target function would equal zero. A fractional error was used to obtain equal weighting of the observed T1 and NOE data (Dellwo & Wand, 1989). The global isotropic tumbling time has been determined previously as 4.125 ns (Schneider et al., 1992). Local parameters were determined from individual error grids, constructed for each site, of 500 values for S² ranging from 0 to 1 in 0.002 steps and 500 values of τe ranging from 0 to 1 nanosecond in 2 ps steps. Global minima were confirmed by visual inspection of the two dimensional contour plots. Examples are shown in Figure 2.

[Figure 2: contour plots of the error function; horizontal axes: τe (ps), 180 to 900.]
Figure 2. Examples of the behavior of the error function defined by equation [4] in determining local model free parameters. Shown are contours of the error function determined by sampling reasonable ranges of values for S² and τe. Examples are shown for the methyl carbons of Ile-3 and the alpha carbons of Ile-44 and Leu-69. Reproduced with permission from Wand et al. (1996). Copyright 1996 American Chemical Society.

Generalized order parameters and effective correlation times for alpha C-H pairs
were obtained for 56 residues (Figure 3). The order parameters range from essentially one (internally rigid) to approximately 0.3. However, the vast majority of generalized order parameters of backbone alpha C-H vectors in recombinant human ubiquitin are between 0.7 and 1.0. This is consistent with the view provided by backbone N-H dynamics that the main chain of the protein is relatively rigid (see Figure 3; Schneider et al., 1992).

[Figure 3: two panels plotting S² (0.0 to 1.0) against residue number (5 to 75).]
Figure 3. Determined generalized order parameters of alpha C-H (top panel) and amide N-H (bottom panel) vectors of human ubiquitin. The amide N-H order parameters are taken from Schneider et al. (1992).

The generalized order parameters obtained for 45 methyl groups of human ubiquitin are shown in Figure 4. Monte Carlo estimates of the reliability of the obtained generalized order parameters indicate an average confidence interval of 0.036. A simple model for the motion of a methyl group is the Woessner model, which predicts an order parameter based on the angle between the C-H bond and the methyl group symmetry axis (Woessner, 1962). In this model, the reorientation occurs by either simple diffusion about the symmetry axis or by a jump-rotation mechanism interchanging the three equivalent sites. For the expected perfect tetrahedral covalent geometry, a corresponding S² of 0.111 is obtained. Significant deviations of the generalized order parameter from this value indicate that the Woessner model is inappropriate. Division of the generalized order parameter by 0.111 results in the order parameter corresponding to the motion of the symmetry axis itself. The generalized order parameters of methyl groups in human ubiquitin show an unexpected range of values, ranging from 0.11 to as low as 0.01. The generalized order parameters obtained for the symmetry axes of methyl groups of alanine and threonine correspond to cone angles ranging from effectively zero (e.g., Ala-46, Thr-9, Thr-12, Thr-55) to about 50 degrees (e.g., Thr-7) and even close to 90 degrees (Ala-28). The value obtained for Ala-46 is particularly interesting as it corresponds well with the generalized order parameter obtained for the alpha
C-H vector to which the methyl symmetry axis is rigidly attached. A similar range of generalized order parameters is seen for the methyl groups of valine and leucine and reinforces the conclusion that the interior of ubiquitin is heterogeneously dynamic.

[Figure 4: four panels plotting S² against residue number, labeled Val, Ile, Thr/Ala and Leu.]
Figure 4. Determined generalized order parameters of methyl group C-H vectors of human ubiquitin. The upper left panel shows the S² values obtained for valine gamma pro-R (open symbols) and gamma pro-S (closed symbols) methyl groups. The upper right panel shows the S² values obtained for isoleucine gamma (closed symbols) and delta (open symbols) methyl groups. The lower left panel shows S² values obtained for threonine and alanine residues. The lower right panel shows S² values obtained for leucine delta pro-R (open symbols) and delta pro-S (closed symbols) methyl groups. Circles correspond to the value of S² at which the best fit to the relaxation data is obtained. The upward facing triangles and downward facing triangles denote the estimated 99% confidence limits of the given S² value as determined by Monte Carlo sampling. The contribution of chemical shift anisotropy to the observed relaxation was assumed to be zero. Reproduced with permission from Wand et al. (1996). Copyright 1996 American Chemical Society.
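The relation between a methyl order parameter, the order parameter of its symmetry axis and a cone angle can be sketched as follows. The value 0.111 follows from the squared second Legendre polynomial of the tetrahedral angle; the conversion to a cone semi-angle assumes the diffusion-in-a-cone model, which is an assumption here since the text quotes cone angles without naming the model. All function names are hypothetical.

```python
import math


def legendre_p2(x):
    """Second Legendre polynomial."""
    return 0.5 * (3.0 * x * x - 1.0)


def methyl_rotation_s2(angle_deg=109.47):
    """Order parameter for rapid rotation about the methyl symmetry axis,
    given the angle between the C-H bond and that axis."""
    return legendre_p2(math.cos(math.radians(angle_deg))) ** 2


def axis_order_parameter(methyl_s2):
    """Order parameter of the symmetry axis: the methyl S2 divided by the
    rotation-only value (about 0.111), capped at 1."""
    return min(1.0, methyl_s2 / methyl_rotation_s2())


def cone_s2(semi_angle_deg):
    """S2 for free diffusion in a cone of the given semi-angle (assumed model)."""
    c = math.cos(math.radians(semi_angle_deg))
    return (0.5 * c * (1.0 + c)) ** 2


def cone_angle_deg(s2_axis):
    """Invert cone_s2 by bisection on the interval 0 to 90 degrees."""
    low, high = 0.0, 90.0
    for _ in range(60):
        mid = 0.5 * (low + high)
        if cone_s2(mid) > s2_axis:
            low = mid   # cone too narrow, order parameter still too high
        else:
            high = mid
    return 0.5 * (low + high)


print(f"rotation-only methyl S2: {methyl_rotation_s2():.3f}")
for methyl_s2 in (0.11, 0.03, 0.01):
    s2_axis = axis_order_parameter(methyl_s2)
    print(f"methyl S2 = {methyl_s2:.2f} -> axis S2 = {s2_axis:.2f}, "
          f"cone semi-angle = {cone_angle_deg(s2_axis):.0f} degrees")
```

Under this model a methyl S² near 0.11 corresponds to an essentially immobile symmetry axis, while values approaching zero correspond to cone semi-angles approaching 90 degrees.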
IV. Discussion
The unexpected range of dynamics within the interior of the protein has several implications. First is the suggestion that the interior of the protein is not only dynamic but heterogeneously so. The presence of potentially spatially extensive dynamics within the core of the protein points to the existence of considerable side chain entropy. Side-chain conformational entropy is one of many components of the free energy balance that leads to the marginal stability of proteins. Estimates of changes in side-chain conformational entropy upon folding have been made on the basis of conformationally restricted models for the native state and various degrees of conformational freedom for the unfolded state, and range between 0 and -2 kcal mol⁻¹ at 300 K, depending on amino acid type (for a recent review, see Doig and Sternberg, 1995). This implies that many interior side chains are potentially interconverting between discrete states (i.e., barrier crossing) and/or moving within potential wells which are broader than usually imagined. In either case, the residual side chain entropy is potentially significant. Second, the data presented here suggest that the conformational entropy changes experienced by burial of a side chain are context dependent and may not be estimated on the basis of amino acid type alone. Thus, although somewhat qualitative, these arguments promote the need to expand the view of the side chain dynamics of proteins by extension to the motions of methylene centers and to ultimately quantitatively relate the internal dynamics to residual conformational entropy.
Acknowledgments
Supported by NIH research grants GM-35940 and DK-39806. This study made use of the National Nuclear Magnetic Resonance Facility at Madison, which is supported by NIH grant RR02301 from the Biomedical Research Technology Program, National Center for Research Resources. Equipment at the facility was purchased with funds from the University of Wisconsin, the NSF Biological Instrumentation Program (DMB-8415048), NSF Academic Research Instrumentation Program (BIR-9214394), NIH Biomedical Research Technology Program (RR02301), NIH Shared Instrumentation Program (RR02781 and RR08438), and the U.S. Department of Agriculture.
References
Bax, A., Clore, M., and Gronenborn, A. M. (1990) J. Magn. Reson. 88, 425-431.
Dellwo, M. J., and Wand, A. J. (1989) J. Am. Chem. Soc. 111, 4571-4578.
Dellwo, M. J., and Wand, A. J. (1991) J. Magn. Reson. 91, 505-516.
Di Stefano, D. L., and Wand, A. J. (1987) Biochemistry 28, 7272-7281.
Doig, A. J., and Sternberg, M. J. E. (1995) Protein Science 4, 2247-2251.
Grzesiek, S., and Bax, A. (1992) J. Magn. Reson. 96, 432-440.
Kay, L. E., Torchia, D. A., and Bax, A. (1989) Biochemistry 28, 8972-8979.
Kay, L. E., Bull, T. E., Nicholson, L. K., Griesinger, C., Schwalbe, H., Bax, A., and Torchia, D. A. (1992) J. Magn. Reson. 100, 538-558.
Lipari, G., and Szabo, A. (1982a) J. Am. Chem. Soc. 104, 4546-4559.
Lipari, G., and Szabo, A. (1982b) J. Am. Chem. Soc. 104, 4559-4570.
Neri, D., Otting, G., and Wüthrich, K. (1990) Tetrahedron 46, 3287-3296.
Palmer, A. G., III, Wright, P. E., and Rance, M. R. (1991) Chem. Phys. Lett. 185, 41-46.
Schneider, D. M., Dellwo, M. J., and Wand, A. J. (1992) Biochemistry 31, 3645-3652.
Sklenar, V., Torchia, D., and Bax, A. (1987) J. Magn. Reson. 73, 375-379.
Spiess, H. W. (1978) NMR: Basic Princ. Prog. 15, 55-214.
Spiess, H. W., Schweitzer, D., and Haeberlen, U. (1973) J. Magn. Reson. 9, 444-460.
Vijay-Kumar, S., Bugg, C. E., and Cook, W. J. (1987) J. Mol. Biol. 194, 531-544.
Vuister, G., and Bax, A. (1992) J. Magn. Reson. 98, 428-435.
Wand, A. J., Bieber, R. J., Urbauer, J. L., McEvoy, R. P., and Gan, Z. (1995) J. Magn. Reson. Series B 108, 173-175.
Wand, A. J., Urbauer, J. L., McEvoy, R. P., and Bieber, R. J. (1996) Biochemistry 35, 6116-6125.
Werbelow, L. G., and Grant, D. M. (1977) In "Advances in Magnetic Resonance" (J. S. Waugh, ed.), Vol. 9, p. 189, Academic Press.
Wittebort, R. J., and Szabo, A. (1978) J. Chem. Phys. 69, 1722-1736.
Woessner, D. E. (1962) J. Chem. Phys. 36, 1-4.
Detection of protein unfolding and fluctuations by native state hydrogen exchange
Aaron K. Chamberlain, Tracy M. Handel and Susan Marqusee
Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720-3206
Protein amide hydrogen exchange is an extremely powerful technique for studies on protein folding, stability and structure. It allows one to probe different regions of a protein simultaneously and has a high degree of sensitivity. Because of this, amide hydrogen exchange presents a unique merging of energetic and structural information. The basic chemistry underlying amide hydrogen exchange is well characterized, yet the structural transitions of a protein which allow exchange with solvent are still not understood. Most protein amide hydrogen exchange studies are carried out under native conditions and simply measure the protection of different amide sites within the molecule. Without some knowledge of the underlying basis for protection, however, these standard protection measurements yield very little information about the structure, dynamics or energetics of the protein. Recently, several groups have examined protection as a function of dilute amounts of denaturant (Mayo & Baldwin, 1993; Bai et al., 1995; Chamberlain et al., 1996). This technique is termed native state hydrogen exchange (Bai et al., 1995) since the small levels of denaturant have no influence on the protein when monitored by standard probes of stability such as circular dichroism or fluorescence. These new studies reveal that hydrogen exchange from an amide site in proteins can be modeled by two different processes. Here, we show how this new method greatly increases our understanding of protein energetics and dynamics using the example of E. coli ribonuclease HI* (RNase H*).
Hydrogen Exchange Theory
Exchange of an individual amide hydrogen is generally described by the following reaction scheme (Hvidt & Nielsen, 1966; Englander & Kallenbach, 1983):

\[
\text{closed} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \text{open} \xrightarrow{k_{\text{chem}}} \text{exchanged} \tag{1}
\]
Here, "closed" refers to a conformation protected from exchange, "open" refers to an exchange-competent conformation, and "exchanged" refers to the amide which has exchanged a hydrogen with solvent. k_1 and k_-1 are the rate constants for the opening and closing reactions, respectively, and k_chem is the rate constant for the actual exchange reaction. k_chem can be modeled using rates determined from unstructured dipeptides (Bai et al., 1993; Connelly et al., 1993). K_op (= k_1/k_-1) is the equilibrium constant for the closed to open transition. Under most native protein conditions, k_chem is the slow step and exchange occurs by an EX2 mechanism in which the closed to open transition is in pre-equilibrium with the chemical exchange reaction. These conditions allow measurement of the equilibrium constant of the opening reaction, K_op, since

\[
k_{\text{obs}} = \frac{k_{\text{chem}} K_{\text{op}}}{K_{\text{op}} + 1} \approx k_{\text{chem}} K_{\text{op}} \tag{2}
\]
where k_obs is the observed rate constant for exchange. The second, approximate equality holds only when the closed conformation is much more stable than the open conformation (K_op << 1), which is generally true for measurable protons under native conditions. The free energy of hydrogen exchange, ΔG_HX = -RT ln K_op, is then the free energy difference between the closed and open conformations and represents the minimum free energy needed before exchange will occur.
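The conversion from a measured rate to a free energy can be sketched in a few lines of Python. This is only an illustration of equation (2) and of ΔG_HX = -RT ln K_op; the function names and the example rate constants are assumptions made here, not values from this study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def opening_constant(k_obs, k_chem):
    """Solve equation (2), k_obs = k_chem*K_op/(K_op + 1), for K_op."""
    if not 0 < k_obs < k_chem:
        raise ValueError("EX2 analysis requires 0 < k_obs < k_chem")
    return k_obs / (k_chem - k_obs)


def delta_g_hx(k_obs, k_chem, temp_k=298.15):
    """Free energy of hydrogen exchange, dG_HX = -RT ln K_op, in kcal/mol."""
    return -R_KCAL * temp_k * math.log(opening_constant(k_obs, k_chem))


# Hypothetical rate constants (s^-1), chosen only for illustration.
K_CHEM_EXAMPLE = 1.0
K_OBS_EXAMPLE = 1.0e-6
print(f"dG_HX = {delta_g_hx(K_OBS_EXAMPLE, K_CHEM_EXAMPLE):.2f} kcal/mol")
```

Solving equation (2) exactly, rather than using the approximation k_obs ≈ k_chem K_op, costs nothing here and keeps the sketch valid for the less protected sites where K_op is not negligible compared with 1.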
Distinguishing unfolding events from local fluctuations: hydrogen exchange with dilute denaturant
One difficult question remains: what is the structural basis for the transitions from the closed to the open, exchange-competent form? By measuring hydrogen exchange as a function of dilute denaturant, two distinct transitions become apparent: unfolding events and local fluctuations (Figure 1).

[Figure 1 schematic. Top: NH (Closed) ⇌ NH (Open) → ND. Bottom: Native → Exchanged, either through 1) Unfolded (a. globally unfolded, b. partially unfolded) or through 2) Locally Open (fluctuations).]
Figure 1. The transitions allowing hydrogen exchange. Top: The classical scheme of exchange kinetics (Hvidt & Nielsen, 1966; Englander & Kallenbach, 1983) shows a simple closed to open transition allowing exchange to occur. Bottom: By following exchange as a function of denaturant, the protein can be seen to undergo three different opening reactions: global unfolding, partial unfolding and local fluctuations. These opening reactions differ in the energy required to cause the opening transition and in their dependence on denaturant.

We define "unfolding events" to be any opening transition promoted by the addition of denaturant, while "local fluctuations" are not influenced by denaturant. For an individual amide exchanging through the EX2 mechanism, the hydrogen exchange reaction pathway is then

\[
\begin{aligned}
\text{native} &\overset{K_{\text{unf}}}{\rightleftharpoons} \text{unfolded} \xrightarrow{k_{\text{chem}}} \text{exchanged} \\
\text{native} &\overset{K_{\text{fluct}}}{\rightleftharpoons} \text{fluctuated} \xrightarrow{k_{\text{chem}}} \text{exchanged}
\end{aligned}
\tag{3}
\]
where K_unf and K_fluct are the equilibrium constants for the unfolding transitions and fluctuations, respectively. For simplicity, both the unfolded and the fluctuated conformations are assumed to exchange with the rate determined from unstructured dipeptides (Bai et al., 1993; Connelly et al., 1993). This may be a poor assumption, especially for the fluctuated form, but it allows the rates to be normalized for temperature, pH and amino acid sequence effects. K_op, the equilibrium constant determined from k_obs, is then given by

\[
K_{\text{op}} = K_{\text{unf}} + K_{\text{fluct}} \tag{4}
\]
and can be dominated by either the unfolding events or fluctuations. In general, unfolding events are known to be promoted by denaturant. Although there are many models for this dependence of unfolding free energy on denaturant (Pace, 1986), the simplest is a linear model where

\[
\Delta G_{\text{unf}} = \Delta G_{\text{unf}}(\mathrm{H_2O}) - m[\text{Den}] \tag{5}
\]

and

\[
K_{\text{unf}} = \exp\left[\frac{m[\text{Den}] - \Delta G_{\text{unf}}(\mathrm{H_2O})}{RT}\right] \tag{6}
\]

where [Den] is the denaturant concentration, m (m-value) is the denaturant sensitivity or slope of ΔG_HX vs [Den], ΔG_unf(H2O) is the free energy of unfolding in the absence of denaturant, R is the gas constant and T is the absolute temperature. Complete, or global, unfolding of the protein should allow all amide hydrogens in the protein to exchange and will have a ΔG_unf(H2O) equal to that of the global unfolding reaction monitored by standard probes such as circular dichroism or calorimetry. However, the protein could also unfold partially, exposing only a subset of hydrogens to exchange (Figure 1). Those sites exchanging through partial unfolding will demonstrate lower ΔG_unf(H2O) and m-values than the hydrogens exchanging through global unfolding. By contrast, local fluctuations are opening events which are not sensitive to denaturant. Therefore, ΔG_fluct is constant, m = 0, and

\[
K_{\text{fluct}} = \exp\left(-\frac{\Delta G_{\text{fluct}}}{RT}\right) \tag{7}
\]
where ΔG_fluct is the free energy of the fluctuation allowing exchange. Equation (7) is simply equation (6) with m = 0. Hydrogen exchange is unique in its ability to detect these different transitions. The observed opening equilibrium constant, K_op, can be influenced by both unfolding events and fluctuations, as seen by equation (4). The opening event with the lowest free energy (largest K) will dominate the observed exchange. At higher denaturant concentrations, unfolding events dominate due to their denaturant sensitivity.
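How equations (4), (6) and (7) combine can be seen in a small numerical sketch. The Python below evaluates the apparent ΔG_HX across a denaturant series for a single amide that can exchange through both an unfolding event and a local fluctuation. The parameter values (ΔG_unf(H2O) = 10 kcal/mol, m = 4 kcal/mol/M, ΔG_fluct = 8 kcal/mol) and the function names are hypothetical, chosen only to make the switch between the two regimes visible; they are not fitted values from this work.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def k_unf(den, dg_unf_h2o, m, temp_k=298.15):
    """Equation (6): K_unf = exp[(m[Den] - dG_unf(H2O)) / RT]."""
    return math.exp((m * den - dg_unf_h2o) / (R_KCAL * temp_k))


def k_fluct(dg_fluct, temp_k=298.15):
    """Equation (7): K_fluct = exp(-dG_fluct / RT)."""
    return math.exp(-dg_fluct / (R_KCAL * temp_k))


def dg_hx(den, dg_unf_h2o, m, dg_fluct, temp_k=298.15):
    """Apparent dG_HX = -RT ln(K_unf + K_fluct), using equation (4)."""
    k_op = k_unf(den, dg_unf_h2o, m, temp_k) + k_fluct(dg_fluct, temp_k)
    return -R_KCAL * temp_k * math.log(k_op)


def crossover(dg_unf_h2o, m, dg_fluct):
    """Denaturant concentration (M) at which K_unf equals K_fluct."""
    return (dg_unf_h2o - dg_fluct) / m


# Hypothetical parameters (kcal/mol, kcal/mol/M, kcal/mol), for illustration.
DG_UNF, M_VALUE, DG_FLUCT = 10.0, 4.0, 8.0

for den in (0.0, 0.25, 0.5, 0.75, 1.0, 1.25):
    value = dg_hx(den, DG_UNF, M_VALUE, DG_FLUCT)
    print(f"[Den] = {den:4.2f} M   dG_HX = {value:5.2f} kcal/mol")
print(f"crossover at {crossover(DG_UNF, M_VALUE, DG_FLUCT):.2f} M")
```

Below the crossover concentration the curve is nearly flat (fluctuation-dominated); above it, the curve falls with a slope approaching -m (unfolding-dominated).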
Methods
RNase H*, wild type E. coli ribonuclease HI with all three free cysteines replaced by alanine, was used in all experiments. RNase H* labeled with 15N was expressed in E. coli cells grown in M9 minimal media with 15N ammonium chloride. Purification was achieved as described (Dabora & Marqusee, 1994) with the addition of ion exchange chromatography between the two reported column steps. Purified RNase H* was dialyzed into 200 mM ammonium bicarbonate and lyophilized.

Global denaturation of unlabeled RNase H* by guanidinium chloride was monitored by following the circular dichroism signal at 222 nm (25°C, 100 mM sodium acetate, pH 5.5) from 0 M to 5 M guanidine. The fraction folded, F_f, was calculated as described (Dabora & Marqusee, 1994).

Amide hydrogen exchange was measured on samples of 10 mg RNase H* dissolved in 550 µl of 100 mM d3-sodium acetate, pD_read = 5.1, 25°C, in D2O with deuterated guanidinium chloride (GdmCl) concentrations varying from 0 M to 1.3 M. HSQC (heteronuclear single quantum coherence) spectra (Bax et al., 1990; Norwood et al., 1990) with gradients and water flip back pulses (Grzesiek & Bax, 1993) were acquired after various times ranging from hours to ~5 months. Spectra were acquired on a 600 MHz Bruker DMX spectrometer with 32 scans, 1k data points taken in the detected dimension, and 64 complex points in the indirect dimension. Data collection took 1.2 hours per spectrum. FIDs were apodized in both dimensions with shifted sine functions and zero filled before Fourier transformation with Azara (Wayne Boucher, Cambridge, unpublished). Non-exchangeable proton peaks were used to normalize for instrument variations. Amide proton peak heights were calculated by peak fitting with a modified version of Priism software (Chen et al., 1996). The exchange rate constants, k_obs, were calculated by fitting peak heights vs time with single exponential curves (KaleidaGraph, Abelbeck) and ΔG_HX was calculated as described previously. Of the 155 amino acids in RNase H*, 53 amide protons were resolved and exchanged slowly enough to measure these rates.
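The rate-extraction step can be illustrated with a short Python sketch. It is not the fitting procedure used in this work: it substitutes a log-linear least-squares fit for the nonlinear single exponential fit performed in KaleidaGraph, and it assumes positive, baseline-free peak heights. The function name and the synthetic data are hypothetical.

```python
import math


def fit_single_exponential(times, heights):
    """Least-squares fit of ln(height) = ln(h0) - k_obs * t.

    Returns (h0, k_obs). Assumes every height is positive and that the
    peak decays to zero, with no baseline offset.
    """
    if len(times) != len(heights) or len(times) < 2:
        raise ValueError("need at least two (time, height) pairs")
    logs = [math.log(h) for h in heights]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    return math.exp(y_mean - slope * t_mean), -slope


# Synthetic, noise-free peak heights generated from a known rate constant.
TIMES_H = [0.0, 24.0, 72.0, 168.0, 336.0, 720.0]  # hours after dissolving
TRUE_K = 0.004  # per hour, hypothetical
HEIGHTS = [100.0 * math.exp(-TRUE_K * t) for t in TIMES_H]

h0, k_obs = fit_single_exponential(TIMES_H, HEIGHTS)
print(f"h0 = {h0:.1f}, k_obs = {k_obs:.4f} per hour, "
      f"half-life = {math.log(2) / k_obs:.0f} h")
```

With noisy data the log transform over-weights the weakest peaks, so a weighted or nonlinear fit would be preferable; the log-linear form is used here only because it needs no external libraries.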
Results and Discussion
The data on RNase H* were collected at guanidine concentrations in which RNase H* appears fully native by circular dichroism. The hydrogen exchange samples contained up to 1.3 M guanidine, but the unfolding transition starts around 1.5 M guanidine (Figure 2, inset). Thus, the open conformations detected by hydrogen exchange are rare and undetectable by other techniques. Theoretically, we could follow exchange into the beginning of the transition region, but the hydrogen exchange reaction becomes too fast (longest half life ~6 hours).
[Figure 2 plot; horizontal axis: [GdmCl] (M).]
Figure 2. The hydrogen exchange behavior of RNase H* is well described by unfolding and fluctuation opening events. Results shown are from the amide protons of M47 (squares), A110 (diamonds), and Y22 (circles). The local fluctuations of A110 hide the global unfolding reaction below ~0.6 M guanidine, and A110 would appear identical to Y22 if exchange were monitored only at 0 M guanidine. Inset: All concentrations of guanidine used to measure exchange have no observable effect on RNase H* as monitored by circular dichroism. The fraction of folded protein is shown as a function of guanidine concentration.
The unfolding/fluctuation model can explain all of the measured exchange behavior in RNase H*. The hydrogen exchange behavior of three amide protons, M47, A110, and Y22, is shown in Figure 2. These protons demonstrate three independent hydrogen exchange opening processes in RNase H*: global unfolding, partial unfolding and local fluctuations. M47 (squares) exchanges through global unfolding at all guanidine concentrations, since its ΔG_unf(H2O) value, 9.8 kcal/mole, is close to the global stability of the protein as measured by circular dichroism (10.3 kcal/mole). A110 (diamonds) also exchanges through global unfolding at guanidine concentrations > 0.6 M, but local fluctuations dominate exchange below this concentration. Y22 (circles) exchanges through partial unfolding at all guanidine concentrations, since the exchange of this amide hydrogen is promoted by the addition of denaturant, but the ΔG_unf(H2O) value, 7.5 kcal/mole, is considerably lower than the global free energy of unfolding. This represents the free energy of a partial unfolding event. Even though global unfolding is a viable pathway for Y22 to exchange, the partial unfolding event is more frequent (lower in free energy) than global unfolding at all denaturant concentrations. Consequently, global unfolding is unobservable in the exchange behavior of Y22.

These data demonstrate the problem with trying to interpret hydrogen exchange results without following exchange as a function of denaturant. If the exchange properties of RNase H* were monitored only at one condition, say 0 M guanidine, the exchange behavior of Y22 and A110 would be indistinguishable. However, following exchange as a function of denaturant demonstrates that these two hydrogens exchange by two different opening processes. Therefore the interpretation of the ΔG_HX values measured in 0 M guanidine differs. For Y22, this value represents the stability of the amide site to unfolding. The measured opening event is a structural transition exposing a large surface area, as is indicated by the m value (Myer et al., 1995). However, the measured ΔG_HX for A110 is determined by local fluctuations. A110 can exchange through an opening reaction that exposes very little surface area. The native conformation fluctuates, so it does not limit exchange sufficiently for the unfolding reactions to be observed. Hydrogen exchange can monitor these fluctuations and opening events in many protons throughout the molecule simultaneously. Combining these data leads to a structural interpretation of the exchange kinetics and determination of the regions important to protein folding and stability.
Acknowledgments
We thank David Agard for discussion, advice and enhancements to the Priism program for peak fitting of the exchange data. We thank Tom Alber and the Marqusee and Handel labs for discussion. This work was supported by a grant from the NIH (SM, GM50945), a Beckman Young Investigator Award (SM), the Keck Foundation (SM), and the Lucille P. Markey Charitable Trust.
References
Bai, Y., Milne, J. S., Mayne, L. & Englander, S. W. (1993). Proteins 17, 75-86.
Bai, Y., Sosnick, T. R., Mayne, L. & Englander, S. W. (1995). Science 269, 192-7.
Bax, A., Ikura, M., Kay, L. E., Torchia, D. A. & Tschudin, R. (1990). J. Magn. Reson. 86, 304-318.
Chamberlain, A. K., Handel, T. M. & Marqusee, S. (1996). Nature Structural Biology, in press.
Chen, H., Hughes, D. D., Chan, T.-A., Sedat, J. W. & Agard, D. A. (1996). Journal of Structural Biology 116, 56-60.
Connelly, G. P., Bai, Y., Jeng, M. F. & Englander, S. W. (1993). Proteins 17, 87-92.
Dabora, J. M. & Marqusee, S. (1994). Protein Sci. 3, 1401-8.
Englander, S. W. & Kallenbach, N. R. (1983). Q. Rev. Biophys. 16, 521-655.
Grzesiek, S. & Bax, A. (1993). J. Am. Chem. Soc. 115, 12593-12594.
Hvidt, A. & Nielsen, S. O. (1966). Advances in Protein Chemistry 21, 288-386.
Mayo, S. L. & Baldwin, R. L. (1993). Science 262, 873-6.
Myer, J. K., Pace, C. N. & Scholtz, J. M. (1995). Protein Science 4, 2138-2148.
Norwood, T. J., Boyd, J., Heritage, J., Soffe, N. & Campbell, I. D. (1990). J. Magn. Reson. 488-501.
Pace, C. N. (1986). Methods in Enzymology 131, 266-280.
Laser Temperature Jump for the Study of Early Events in Protein Folding
Peggy A. Thompson
Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892-0520
I. INTRODUCTION
Experimental studies of protein folding kinetics have shown that processes occur within the millisecond dead time of conventional mixing experiments. Such experiments, along with recent developments in the theory of protein folding (1), have led to the realization that improved time resolution in folding experiments could lead to new insights into the mechanism of protein folding. The first such experiment was carried out by Jones et al. (2). They optically triggered the folding of reduced cytochrome c by photodissociating the carbon monoxide complex with a nanosecond laser pulse. More recently, submillisecond ultrarapid mixing (3), optically triggered electron transfer (4) and temperature jump techniques (5) have been developed that can rapidly initiate protein folding and unfolding. These methods show great promise for addressing the questions concerning the fast events in protein folding.

One potentially powerful method for monitoring the earliest events of protein folding is the laser temperature jump. Here a pulse of laser energy absorbed by vibrational modes of the solvent initiates the refolding or unfolding of a protein. This technique was first implemented in the 1960s and its applicability has continued to increase with advances in laser technology (6). Advantages of a laser-based instrument are the ability to (i) generate fast temperature jumps, limited only by the pulse width, (ii) use small reaction volumes, (iii) monitor the extent of the reaction by a variety of spectroscopic methods (UV, visible, and infrared absorption, fluorescence, and optical activity), and (iv) tune the laser wavelength to match different solvent absorption regions.
Several difficulties have been overcome during the development of this technique. Initially, only ruby and YAG lasers, providing radiation at 694 nm and 1060 nm respectively, generated enough pulse energy to produce temperature jumps of > 1 °C (7). However, these wavelengths are not ideal for studies of biological systems because they are not strongly absorbed by aqueous solutions. Absorbing dyes were therefore needed to transfer heat to the solvent. To generate larger temperature jumps, more suitable wavelengths (between 1100 and 2000 nm) were needed. Holzwarth et al. used an iodine laser with an output at 1.315 µm for direct heating of water (8). Turner et al. (9, 10) made use of the stimulated Raman effect to shift the Nd:YAG fundamental to 1.41 µm in liquid N2. However, difficulties with using liquid N2 (unstable pulses due to boiling at room temperature, self focussing, and the need for an expensive dewar) prompted use of H2 gas as the Raman active medium (11). H2 produces a frequency shift from 1.06 µm to 1.89 µm. Not only were larger temperature jumps achieved, but very rapid heating was accomplished since the radiation was directly absorbed by the solvent. Near-infrared heating also eliminated the complication of affecting the reactant chemical relaxation, since it is unable to excite most electronic transitions in a single photon process. One remaining difficulty was the possibility of producing temperature gradients in the sample, since the amount of light absorbed at any point depends exponentially on the depth. This effect can, however, be minimized by using short pathlength cells (0.1-1 mm), by reflecting the heat pulse multiple times through the sample, or by decreasing the IR optical density, either by using a laser wavelength with lower absorption or by mixing water and D2O (10).
Recently, interesting results have emerged from nanosecond or faster temperature jump experiments studying the initial events during protein folding/unfolding with 10-30 °C temperature changes. Williams et al. used difference frequency generation of a Nd:YAG laser and a pulsed dye laser to produce a 20 ns temperature jump pulse at 2 µm, after which the temperature remained elevated for ~1 millisecond (5c). The unfolding reactions of a small helical peptide (5c) and apomyoglobin (5d) were monitored with transient infrared spectroscopy. The stimulated Raman effect was used by Ballew et al. (5f) to generate a temperature jump pulse at 1.54 µm with time resolution similar to that reported by Williams et al. (5c). Intrinsic tryptophan fluorescence was used to follow the refolding reaction of apomyoglobin from its cold denatured state. Phillips et al. used a different approach to generating a temperature jump (5a). The energy from a laser pulse was absorbed by homogeneously dispersed dye molecules that subsequently released energy as heat to cause a temperature jump of up to 10 °C within 70 ps. They studied the unfolding of RNase A by picosecond transient infrared spectroscopy.

One interesting problem that can be addressed with this technique is the kinetics of the helix-coil conformational change. The α-helix is the most commonly occurring form of secondary structure found in proteins, and understanding the kinetics of the helix-coil transition will undoubtedly contribute to understanding the mechanism of protein folding. Studies have shown that for long polypeptides (>200 residues), α-helix formation occurs at a rate faster than 1x10^ s^-1 (12); however, α-helical segments in proteins are much smaller than 200 residues. It was initially thought that small peptides would not be stable enough to form α-helical structures in solution. Recently, however, there have been several reports of helix formation in small peptides in aqueous solution (13). The fact that small peptides are stable enough to form secondary structure may support theoretical postulates that secondary structure forms early in the folding process (14). Tertiary structure could then form through the relative diffusion of fluctuating secondary structural regions. Simulations of small α-helical peptides have provided insight into the helix folding mechanism and have suggested that folding and unfolding can occur in less than ten nanoseconds (15). Applying recently developed rapid initiation techniques to the helix-coil transition of small peptides will allow these theoretical predictions to be examined.

This paper describes a nanosecond laser temperature-jump instrument with time resolution suitable for investigations of the early events in protein and peptide folding/unfolding, as well as kinetic events that occur out to 12 milliseconds. As an example of the capabilities of the instrument described here, some results are presented on the kinetics of the helix-coil transition of a synthetic, 21-residue α-helical peptide studied by Lockhart and Kim (16). The peptide has a fluorescent probe, 4-methylaminobenzoic acid (MABA), covalently attached to the N-terminus, giving the structure MABA-AAAAA(AAARA)3A-CONH2, where A is alanine and R is arginine. The carbonyl oxygen of MABA forms a hydrogen bond with the amide NH of residue 4 only in the helical conformation (16), dramatically affecting the fluorescence intensity.
II. METHODS
As determined from HPLC, the purity of the peptide was greater than 90%. Steady-state fluorescence spectra of this peptide were collected from 310-480 nm with an excitation wavelength of 264 nm. A 1 cm pathlength cuvette was used with concentrations of ~8.6 µM. The emission quantum yields were determined relative to N-acetyl-L-tryptophanamide at pH = 6.9 (Φf = 0.13) (17). Steady-state circular dichroism spectra were obtained using a 0.5 mm pathlength cylindrical cell and concentrations of ~0.26 mM. The mean molar ellipticity [θ] (deg cm^2 dmol^-1) was calibrated with (+)-10-camphorsulfonic acid. Concentrations of the solutions were determined by measuring the absorbance of 4-methylaminobenzoic acid.
A. Laser Temperature Jump Instrument
A near-infrared laser pulse at 1.54 µm was used to rapidly heat the aqueous peptide solution by 10-20 °C, while a cw ultraviolet probe beam excited the fluorescence of the labeled peptide and monitored the relaxation kinetics. A schematic of the instrument is shown in Figure 1.

[Figure 1 schematic; component labels: Nd:YAG, Prism, Photodiode, Frequency Doubled Argon Ion Laser, Digital Oscilloscope, Computer, Trigger.]
Figure 1. Schematic of the temperature jump instrument. A 3-5 ns (FWHM) temperature jump pulse is generated using the stimulated Raman effect to shift the Nd:YAG fundamental (1064 nm) to 1.54 µm in a mixture of CH4 and He gas. 10 mJ of 1.54 µm light is focused onto the 500 µm pathlength sample cell to give a spot size of 1 mm. The remaining near-infrared light is reflected back onto the sample cell to ensure uniform heating. A cw ultraviolet beam from an intracavity frequency doubled argon ion laser is focused to a spot size of 60-70 µm and excites the fluorescent sample. The fluorescence signal is detected 90° from the excitation beam with a photomultiplier tube.

To produce the temperature jump pulse, the fundamental (1064 nm) of a Nd:YAG laser (Continuum Surelite I), operating at 1.67 Hz, was focused with a 0.75 m lens into a one meter Raman cell (Princeton Optics, Inc.). The Raman cell contained 600 psi of CH4 and 500 psi of He and had a conversion efficiency of up to 20% for the first Stokes line (1.54 µm). The 1.54 µm wavelength (pulse width of 3-5 ns FWHM) was separated from the fundamental and anti-Stokes lines with a Pellin-Broca prism. The T-jump pulse was then focused onto the sample with a 0.75 m lens to give a spot size of 1 mm, and directly heated a small volume of water (~0.4 µL) by vibrational excitation. This wavelength corresponds to the near-IR absorption band of the OH stretching overtone with ε = 5.2 cm^-1. Using a 500 µm pathlength cuvette, 38% of the infrared beam was absorbed in an aqueous medium. To ensure uniform heating at the front and back of the cuvette, the remaining 62% of the 1.54 µm light was reflected back onto the sample in a double pass configuration.

An ultraviolet fluorescence excitation beam was generated with an intracavity frequency doubled argon ion laser (Coherent, FRED) providing a tunable wavelength range of 229-264 nm. The excitation beam was focused onto the sample with a 10 cm lens to give a spot size of 60-70 µm. For fluorescence experiments, front-face illumination geometry was used. The fluorescence signal was collected with a 3.9 cm focal length lens, filtered from residual reflected excitation and infrared light, and detected 90 degrees from the excitation beam with a photomultiplier tube. The signal was amplified by cascading two channels (5x gain per channel) of a fast preamp module (Stanford Research Systems, SR240) and processed by a digital oscilloscope (Tektronix, TDS 620). The waveforms from the oscilloscope were transferred to a personal computer for analysis.

The size of the laser temperature jump was characterized by monitoring the change in tryptophan fluorescence intensity, which decreases 1% for each degree of temperature increase. Temperature jumps of 20 degrees can consistently be generated with this instrument using 10 mJ of laser energy. Since the vibrational relaxation time of water in the near-IR is about 10"^^ s (18), the time it takes to reach a maximum temperature jump is limited only by the laser pulse width. Figure 2a shows that the instrument response time for a temperature jump from 0 °C to 20 °C is ~5 ns and that thermal diffusion from the reaction volume takes several milliseconds (Figure 2b).
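The tryptophan calibration amounts to a one-line conversion from a fractional fluorescence change to a temperature change. The Python sketch below assumes that the 1% per degree figure applies linearly to the pre-jump intensity over the whole jump; the function name and the photomultiplier readings are hypothetical and are not taken from Figure 2.

```python
def temperature_jump(intensity_before, intensity_after, percent_per_degree=1.0):
    """Estimate the temperature jump (degrees C) from a fluorescence drop.

    Assumes the intensity falls linearly by `percent_per_degree` percent of
    the pre-jump intensity for each degree of heating.
    """
    if intensity_before <= 0:
        raise ValueError("intensity_before must be positive")
    if percent_per_degree <= 0:
        raise ValueError("percent_per_degree must be positive")
    fractional_drop = (intensity_before - intensity_after) / intensity_before
    return 100.0 * fractional_drop / percent_per_degree


# Hypothetical photomultiplier readings (arbitrary units), for illustration.
jump = temperature_jump(0.050, 0.040)
print(f"estimated temperature jump: {jump:.1f} degrees C")
```

If the temperature coefficient of the fluorescence were itself temperature dependent, a measured calibration curve would replace the single `percent_per_degree` constant.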
[Figure 2 plots; panel a horizontal axis: Time, nanoseconds; panel b horizontal axis: Time, milliseconds.]
Figure 2. a) The decrease in tryptophan fluorescence intensity after a 0 °C to 20 °C temperature jump is shown. The fluorescence intensity can be fit with a 5 ns time constant, giving the time resolution of the instrument. b) Monitoring the tryptophan fluorescence intensity shows that the temperature remains elevated for several milliseconds. Relaxation kinetics can be measured over a time range of 5 ns to ~2 ms with this instrument.
III. RESULTS AND DISCUSSION
A. Steady State Circular Dichroism and Fluorescence
The equilibrium properties of the peptide were characterized by circular dichroism and fluorescence spectroscopies. At low temperatures, the CD spectrum of this peptide has the characteristic double minima at 222 nm and 208 nm indicative of an α-helix. An isodichroic point observed at 202 nm implies that each residue of the peptide exists in either a helix or coil conformation. The extent of helix formation is most easily monitored by following the 222 nm minimum, -[θ]222. In Figure 3, the CD thermal unfolding curve monitored at 222 nm shows that the helix content decreases with increasing temperature. At 0 °C the peptide is ~70% α-helix, whereas at 70 °C it is ~10% helical. The midpoint of the thermal transition occurs at ~25 °C, consistent with what has been previously reported (16). Similar to other small helical peptides, the helix-to-coil transition is very broad as a function of temperature, >70 °C (19). However, it is still possible to induce a significant change in the helical population with a 20 degree temperature change.

The quantum yields from the fluorescence spectra of MABA-peptide within the thermally induced helix-coil transition are shown in Figure 3 for temperatures between -3.5 °C and 65 °C. The total fluorescence intensity for MABA bound to the peptide is strongly dependent on temperature, decreasing 55% as the temperature increases from 0 °C to 65 °C. No spectral shift (emission λmax = 368 nm) is detected in this temperature range. By comparison, very little fluorescence temperature dependence is observed for free MABA in solution (Figure 3), indicating that the fluorescence intensity for the MABA-peptide monitors the helix-coil conformational transition at the N-terminus.

Equilibrium helix-coil theories (20) predict that the probability of forming a helical segment is higher in the middle of a peptide sequence than at the termini. Spectroscopic techniques sensitive to helical content may then be expected to respond differently to perturbations in the helix population. Because the CD signal has contributions from all amino acids in this peptide, the helical fraction as determined by [θ]222 measures the average helical content. By contrast, the fluorescence signal should be sensitive only to the helical population of the N-terminal amino acid residues, owing to the location of the fluorescent probe. A comparison of the fluorescence quantum yields and the CD data (222 nm) shows differences in the thermal transition curves (see Figure 3), consistent with the expectation that the two experimental techniques provide different measures of the peptide's helical content.

[Figure 3 plot; horizontal axis: Temperature, °C; data series label: Circular Dichroism.]
Figure 3. Equilibrium data for the MABA-peptide as a function of temperature. The thermal unfolding curve for the peptide monitored by CD at 222 nm shows that the helical content decreases with increasing temperature (Circles). The helix-coil transition is very broad as a function of temperature with the mid-point occurring at ~25 °C. The fluorescence quantum yield of MABA attached to the peptide has a strong temperature dependence (Triangles), in marked contrast to free MABA in solution (Squares). There are significant differences between the CD and fluorescence thermal transition curves. It is expected that the two experimental techniques provide different measures of the helical content. (Lines through the data points are provided to guide the eye.)
B. Temperature Jump Kinetics The laser temperature jump instrument was used to rapidly initiate the helixcoil transition for constant initial temperatures between - 8 °C and 50 ""C. The unfolding reaction kinetics were monitored by detecting the fluorescence intensity change of the MABA labeled peptide for the wavelength range 320400 nm. Figure 4 shows an example of the relaxation kinetics for a temperature jump from 10 °C to 29 °C. An average time constant of 18 (± 4) ns was measured for the unfolding reaction. A maximum relaxation time of 21 (± 4) ns is observed near the mid-point of the helix-coil transition for 1
1
1
0.06 >. • V c
o c
O D O
v^
0.055
-
^ i " ^ ^ * - s , , . i : : : : \ ^.^•~^-
0.05 -20
r. ^- y^**^"^
v^v ^ —*.
0
20
40
60
80
100
Time, nanoseconds Figure 4. The time constant for the change in fluorescence of the MABA-peptide after a temperature jump from 10 "C to 29 °C is 18 (± 4) ns. This is interpreted as the relaxation time for the change in helix content at the N-terminus.
Peggy A. Thompson
this peptide. The relaxation times for final temperatures 30 °C above and below the mid-point temperature are ~3 times faster (7-9 ns). All of the relaxation data could be fit with a single exponential decay. It is interesting to compare these results with those from previous work on the same (but not MABA-labeled) peptide. Williams et al. studied the helix-coil transition by infrared spectroscopy for a temperature jump from 9 °C to 27 °C (5c). They observed a relaxation time of about 160 (± 60) ns, which is approximately 8 times longer than what is observed in the present experiment under similar conditions. However, infrared spectroscopy measures an average helix content, similar to what is expected from CD, whereas the fluorescence monitors the change in helix population at the N-terminus. It has been observed that the helix probability is lower at the termini than in the middle of the peptide sequence (21). Simulations of the kinetics suggest that the relaxation time for the average helical content will be longer than for the N-terminus (22). Previous investigations of helix-coil transition kinetics, which used a variety of fast relaxation methods (electric field jump, ultrasonic absorption, dielectric relaxation and temperature jump), encountered many difficulties (12). The systems studied were long homopolymers (>200 residues) that often had hydrolyzable side chains. Controversial results have been reported, depending on the experimental technique employed, because unwanted side chain reactions or molecular reorientation were often difficult to distinguish from the helix-coil conformational change. However, as observed here, a maximum in the relaxation times was detected for these experiments, ranging from 15 μs to 20 ns, and was attributed to the helix-coil transition.
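The statement that the relaxation data fit a single exponential decay can be made concrete with a short calculation. The following is a minimal sketch, assuming a known baseline and a noise-free synthetic trace; the function name, the amplitudes and the log-linear fitting approach are illustrative assumptions and not the analysis procedure used for the measurements reported here.

```python
import math


def fit_relaxation_time(times_ns, signal, baseline):
    """Estimate tau for signal(t) = baseline + amplitude * exp(-t / tau).

    Uses a least-squares line through ln(signal - baseline) versus time;
    the slope of that line is -1 / tau.
    """
    xs, ys = [], []
    for t, y in zip(times_ns, signal):
        if y > baseline:  # the logarithm needs a positive argument
            xs.append(t)
            ys.append(math.log(y - baseline))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -1.0 / slope


# Synthetic, noise-free trace (NOT measured data): 18 ns decay onto a baseline.
true_tau = 18.0
times = [float(t) for t in range(0, 100, 2)]
trace = [0.050 + 0.008 * math.exp(-t / true_tau) for t in times]
tau = fit_relaxation_time(times, trace, baseline=0.050)
print(f"fitted relaxation time: {tau:.1f} ns")
```

With real, noisy data a nonlinear least-squares fit that also estimates the baseline would usually be preferable; the log-linear form is used here only because it keeps the sketch short.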
IV. CONCLUSIONS The laser temperature jump instrument can effectively be used to initiate and observe the fast events in protein/peptide folding and unfolding as well as those events that extend out to several milliseconds. In the present study, the unfolding of a helical peptide was determined to occur within tens of nanoseconds, supporting the need for nanosecond or faster initiation techniques. Promising results obtained by the laser temperature jump method will continue to stimulate the development of additional monitoring techniques such as UV absorption and circular dichroism.
ACKNOWLEDGEMENTS This work was carried out in collaboration with James Hofrichter and William Eaton at the National Institutes of Health. The peptide was a kind gift from Peter Kim.
Laser Temperature Jump in Protein Folding
REFERENCES
1. (a) M. Karplus and D. L. Weaver, Prot. Sci. 3, 650 (1994); (b) J. D. Bryngelson, J. N. Onuchic, N. D. Socci, P. G. Wolynes, Proteins 21, 167 (1995); (c) K. A. Dill et al., Prot. Sci. 4, 561 (1995); (d) D. Thirumalai, J. de Phys. I 5, 1457 (1995); (e) A. A. Mirny, V. Abkevich, E. I. Shakhnovich, Folding & Design 1, 103 (1996).
2. C. M. Jones, E. R. Henry, Y. Hu, C.-K. Chan, S. D. Luck, A. K. Bhuyan, H. Roder, J. Hofrichter, W. A. Eaton, Proc. Natl. Acad. Sci. USA 90, 11860 (1993).
3. C.-K. Chan, Y. Hu, S. Takahashi, D. L. Rousseau, W. A. Eaton, J. Hofrichter, in preparation.
4. T. Pascher, J. P. Chesick, J. R. Winkler, H. B. Gray, Science 271, 1558 (1996).
5. (a) C. M. Phillips, Y. Mizutani, R. M. Hochstrasser, Proc. Natl. Acad. Sci. USA 92, 7292 (1995); (b) B. Nolting, R. Golbik, A. Fersht, Proc. Natl. Acad. Sci. USA 92, 10668 (1995); (c) S. Williams, T. P. Causgrove, R. Gilmanshin, K. S. Fang, R. H. Callender, W. H. Woodruff, R. B. Dyer, Biochemistry 35, 691 (1996); (d) R. B. Dyer, S. Williams, W. H. Woodruff, R. Gilmanshin, R. H. Callender, Biophys. J. 70, A177 (1996); (e) P. A. Thompson, W. A. Eaton, J. Hofrichter, Biophys. J. 70, A177 (1996); (f) R. M. Ballew, J. Sabelko, M. Gruebele, Proc. Natl. Acad. Sci. USA 93, 5759 (1996).
6. (a) G. W. Flynn, N. Sutin, in Chemical and Biochemical Applications of Lasers, C. Bradley Moore, ed., Academic Press, New York, p. 309, 1974; (b) C. F. Bernasconi, in Relaxation Kinetics, Academic Press, New York, p. 180, 1976; (c) D. H. Turner, in Investigation of Rates and Mechanisms of Reactions, Part 2, C. F. Bernasconi, ed., John Wiley & Sons, Inc., Vol. 6, p. 141, 1986.
7. H. Staerk, G. Czerlinski, Nature 205, 63 (1965); H. Hoffmann, E. Yeager, J. Stuehr, Rev. Sci. Instrum. 39, 649 (1968).
8. J. F. Holzwarth, A. Schmidt, H. Wolff, R. Volk, J. Phys. Chem. 81, 2300 (1977).
9. J. V. Beitz, G. W. Flynn, D. H. Turner, N. Sutin, J. Am. Chem. Soc. 92, 4130 (1970).
10. D. H. Turner, G. W. Flynn, N. Sutin, J. V. Beitz, J. Am. Chem. Soc. 94, 1554 (1972).
11. S. Ameen, Rev. Sci. Instrum. 46, 1209 (1975).
12. (a) R. Zana, Biopolymers 14, 2425 (1975); (b) B. Gruenewald, C. U. Nicola, A. Lustig, G. Schwarz, H. Klump, Biophys. Chem. 9, 137.
13. (a) J. E. Brown, W. A. Klee, Biochemistry 10, 470 (1971); (b) J. M. Scholtz, R. L. Baldwin, Annu. Rev. Biophys. Biomol. Struct. 21, 95 (1992).
14. (a) O. B. Ptitsyn, A. A. Rashin, Biophys. Chem. 3, 1 (1975); (b) M. Karplus, D. L. Weaver, Nature 260, 404 (1976).
15. (a) V. Daggett, P. A. Kollman, I. D. Kuntz, Biopolymers 31, 1115 (1991); (b) V. Daggett, M. Levitt, J. Mol. Biol. 223, 1121 (1992); (c) W. Schneller, D. L. Weaver, Biopolymers 33, 1519 (1993); (d) S.-S. Sung, Biophys. J. 66, 1796 (1994); (e) S.-S. Sung, X.-W. Wu, Proteins: Struct., Funct., and Genet. 25, 202 (1996).
16. (a) D. J. Lockhart, P. S. Kim, Science 257, 947 (1992); (b) D. J. Lockhart, P. S. Kim, Science 260, 198 (1993).
17. R. W. Cowgill, Biochim. Biophys. Acta 168, 431 (1968).
18. (a) D. M. Goodall, R. C. Greenhow, Chem. Phys. Lett. 9, 583 (1971); (b) L. Genberg, F. Heisel, G. McLendon, R. J. D. Miller, J. Phys. Chem. 91, 5521 (1987); (c) P. A. Anfinrud, C. Han, R. M. Hochstrasser, Proc. Natl. Acad. Sci. USA 86, 8387 (1989).
19. (a) K. R. Shoemaker, P. S. Kim, E. J. York, J. M. Stewart, R. L. Baldwin, Nature 326, 563 (1987); (b) J. M. Scholtz, S. Marqusee, R. L. Baldwin, E. J. York, J. M. Stewart, M. Santoro, D. W. Bolen, Proc. Natl. Acad. Sci. USA 88, 2854 (1991); (c) J. M. Scholtz, H. Qian, E. J. York, J. M. Stewart, R. L. Baldwin, Biopolymers 31, 1463 (1991).
20. (a) B. H. Zimm, J. K. Bragg, J. Chem. Phys. 31, 526 (1959); (b) S. Lifson, A. Roig, J. Chem. Phys. 34, 1963 (1961); (c) D. Poland, H. A. Scheraga, in Theory of Helix-Coil Transitions in Biopolymers, Academic Press, New York, 1970.
21. (a) E. K. Bradley, J. F. Thomason, F. E. Cohen, P. A. Kosen, I. D. Kuntz, J. Mol. Biol. 215, 607 (1990); (b) S. M. Miick, A. P. Todd, G. L. Millhauser, Biochemistry 30, 9498 (1991); (c) A. Chakrabartty, J. A. Schellman, R. L. Baldwin, Nature 351, 586 (1991); (d) M. I. Liff, P. C. Lyu, N. R. Kallenbach, J. Am. Chem. Soc. 113, 1014 (1991); (e) C. A. Rohl, R. L. Baldwin, Biochemistry 33, 7760 (1994).
22. P. A. Thompson, W. A. Eaton, J. Hofrichter (manuscript in preparation).
Biophysical and Structural Analysis of Human Acidic Fibroblast Growth Factor Michael Blaber, Daniel H. Adamek, Aleksandar Popovic and Sachiko I. Blaber Institute of Molecular Biophysics and Department of Chemistry, Florida State University, Tallahassee, FL 32306-3015
I. Introduction
II. Materials and Methods
A. Expression and Purification of Human aFGF
B. Calorimetric Analysis
III. Results
A. Purification of Human aFGF
B. Calorimetric Analysis
IV. Discussion
A. Calorimetric Analysis
Acknowledgments
References
I. Introduction Acidic fibroblast growth factor (aFGF) is one of nine known members of the FGF family (1, 2, 3, 4). It is the only member that binds with high affinity to all four characterized FGF receptors (FGFRs) and to variants produced by alternative mRNA splicing (5). Since expression of the various FGFRs is distributed over a wide variety of cell types, including cells of mesodermal and ectodermal origin, aFGF is probably one of the mitogens with the broadest specificity known. FGFs have also been termed "heparin binding" growth factors due to their binding specificity for heparin and heparan proteoglycans (6, 7). Complexation with heparin has been demonstrated to protect aFGF from inactivation by heat, acid (8), proteolysis (9) and oxidation (10). Thermal inactivation appears to be a physiologically relevant phenomenon. Circular dichroism and differential calorimetric studies have suggested that the thermal transition midpoint (Tm) may be near physiological temperature, and that interaction with heparin can stabilize aFGF by some 20 °C (11). In addition to an apparently low thermal stability, FGF appears to face additional problems in maintaining its native, functional structure. Human aFGF contains three cysteine residues and the related basic FGF (bFGF) contains four cysteines. These residues are present in the active protein as free cysteine residues, and oxidation, to form either inter- or intra-chain disulfide bonds, has been demonstrated to inactivate the protein
(10). Mutation of these residues has demonstrated that they are not functionally important, and substitution by serine can extend the in-vitro protein half-life considerably (12, 13). Although this increase in half-life has been interpreted as the result of stabilization of the structure, there is no evidence for this in the formal sense, i.e. that the mutation has increased the Tm value. An alternative interpretation is that a disulfide-mediated irreversible denaturation pathway has been effectively eliminated. Formulation studies of aFGF have identified yet another contribution to inactivation, namely irreversible aggregation of the unfolded state. Thermal denaturation studies indicate that after unfolding, the protein aggregates and precipitates (14). This does not appear to be related to the formation of mixed disulfide bonds, but is instead a non-covalent association of the protein while in the unfolded state. Unfolded, or partially folded, forms of aFGF have very low solubility and aggregate irreversibly (15). Formulations which minimize or postpone aggregation have been interpreted as stabilizing the structure (14). Again, in the formal sense, this may not be the case. It could very well be that useful formulation additives are able to solubilize the unfolded state without any influence upon the Tm. In any case, the FGFs (and particularly aFGF) may, in fact, utilize stability as a regulatory mechanism. This is achieved by combining inherently low thermal stability with irreversible denaturation (both covalent and non-covalent in origin). Furthermore, as a true regulation mechanism, under specific circumstances (e.g. in the presence of heparin) the stability (Tm) can be significantly increased. While this increase in stability would not necessarily have any effect upon the irreversible mechanisms, at physiological temperatures it would effectively minimize the fraction of the protein population which would be in the unfolded state at any given time. Thus, the irreversible denaturation mechanisms would occur at a significantly lower effective rate. The relationship between the reversible and irreversible denaturation pathways for human aFGF is diagrammed in figure 1.
[Figure 1 scheme. Labels: Native State; Denatured; active; inactive; low Tm (absence of heparin); high Tm (presence of heparin); aggregation of unfolded state; Precipitation; oxidation of cys residues; Mixed Disulfide Formation.]
Figure 1. Reversible and irreversible denaturation pathways leading to the inactivation of aFGF function.
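The argument that a higher Tm lowers the effective rate of irreversible inactivation can be written compactly. The following is a minimal sketch, assuming (for illustration only) that the reversible unfolding step equilibrates rapidly compared with the irreversible steps, that the unfolding enthalpy is independent of temperature, and that all irreversible routes can be lumped into one first-order rate constant; the symbols K, f_U, k_irr and k_eff are introduced here and do not appear in the original text.

```latex
\begin{align}
\mathrm{N} \;\rightleftharpoons\; \mathrm{U} \;\xrightarrow{\;k_{\mathrm{irr}}\;}\; \mathrm{I},
\qquad
K(T) &= \frac{[\mathrm{U}]}{[\mathrm{N}]}
      = \exp\!\left[-\frac{\Delta H}{R}\left(\frac{1}{T}-\frac{1}{T_m}\right)\right], \\
f_U(T) &= \frac{K(T)}{1+K(T)}, \\
k_{\mathrm{eff}}(T) &= k_{\mathrm{irr}}\, f_U(T).
\end{align}
```

Under these assumptions, raising Tm at a fixed physiological temperature T lowers K(T), and therefore f_U and k_eff, without any change in k_irr itself, which is the sense in which stabilization could slow the irreversible pathways.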
Are the "irreversible" denaturation pathways truly irreversible? Thomas and coworkers have demonstrated that the activity of (inactive) oxidized aFGF can be recovered by the addition of the reducing agent dithiothreitol (10). Furthermore, as previously mentioned, various formulation additives have minimized thermally induced aggregation of aFGF. In addition to the physiological implications, these "irreversible" denaturation pathways also complicate thermodynamic analyses of protein stability, particularly those which rely on van't Hoff analysis of denaturation profiles. We focus here on our thermodynamic and structural studies of human aFGF and their relationship to the utilization of stability as a regulatory control mechanism for this growth factor.
II. Materials and Methods A. Expression and Purification of Human aFGF A synthetic gene for the 141 amino acid form of human aFGF (10) was inserted into the isopropyl β-D-thiogalactoside (IPTG) inducible pET21 expression vector (Novagen). The transformed E. coli host BL21(DE3) was grown at 37 °C to half the stationary phase cell density and the temperature was then reduced to 28 °C prior to induction by IPTG. Cells were harvested three hours after induction. All buffers for chromatographic purification included 5 mM dithiothreitol (DTT), 2 mM (NH4)2SO4 and 0.5 mM EDTA, and all steps were performed at 4 °C. Mechanically disrupted cells were batched with DE-52 (Whatman) in 20 mM phosphate buffer, pH 5.8. The slurry was filtered through a Büchner funnel and aFGF was present in the supernatant fraction. This supernatant was then loaded onto a CM-Sephadex column (Pharmacia) equilibrated with 20 mM phosphate, pH 5.8. The aFGF bound under these conditions and was eluted with a linear NaCl gradient to 1.5 M. The peak of aFGF was pooled and loaded onto a Sephadex G-50 column (Pharmacia) with 50 mM phosphate, pH 5.8, as the running buffer. The aFGF peak was then subjected to denaturation in 3 M guanidine hydrochloride, followed by refolding by dialysis versus 20 mM phosphate, pH 6.2. After refolding, the aFGF was loaded onto Heparin-Sepharose (Pharmacia) and eluted with a linear NaCl gradient to 1.5 M. The aFGF peak was pooled for further characterization.
B. Calorimetric Analysis High sensitivity differential scanning calorimetry was performed utilizing a MicroCal MSC-DSC instrument (MicroCal, Inc.). The scan rate was 2 °C/min unless otherwise indicated and protein concentrations were 0.5 mg/ml. Triplicate runs were performed for each experiment and averaged. Data analysis was performed using the Origin program (MicroCal Software, Inc.) using a non-two-state model with a constant ΔCp of unfolding. In this way, both the calorimetric (ΔHcal) and van't Hoff (ΔHvH) enthalpies of unfolding were determined. The effects of both phosphate and sulfate ions upon the stability of human aFGF were determined by the addition of either 10 mM of NaH2PO4 or (NH4)2SO4 to 50
mM HEPES, 0.5 mM EDTA, 2.0 mM DTT, pH 7.0. The effects of guanidine hydrochloride (GuHCl) upon the stability, and the reversibility of thermal denaturation, were determined by the addition of GuHCl, in 0.1 M increments, to 20 mM NaH2PO4, 0.15 M NaCl, 0.5 mM EDTA, 2 mM DTT, 10 mM (NH4)2SO4, pH 7.3.
III. Results A. Purification of Human aFGF One of the surprises in the purification of human aFGF was that coming out of E. coli the protein appeared to be monomeric and soluble, yet in some way misfolded. Initial attempts to utilize heparin-Sepharose chromatography indicated that the aFGF eluted at approximately 0.6 M NaCl, significantly lower than previously reported values (16). After denaturation in guanidine hydrochloride and refolding in phosphate buffer, the aFGF peak now eluted at a more characteristic 1.5 M NaCl from heparin-Sepharose resin. Another surprise related to the use of the CM-Sephadex resin. Human aFGF elutes from this resin over a broad range (0.5 to 1.0 M) of NaCl. However, elution from CM-52 (cellulose based carboxymethyl cation exchanger) requires only 0.15 to 0.2 M NaCl. Thus, the aFGF may be interacting not only with the functional group of the resin, but in the case of CM-Sephadex, the matrix as well. The final yield from a 1.6 liter fermentation preparation typically approached 80 mg of >98% pure material.
B. Calorimetric Analysis Calorimetric data for human aFGF at neutral pH (50 mM HEPES, 0.5 mM EDTA and 2.0 mM DTT), with and without the addition of 10 mM phosphate or sulfate ion, are listed in table I. Also listed in this table are the calorimetric data for the addition of 0.6 M GuHCl to phosphate buffered saline (plus 0.5 mM EDTA and 2.0 mM DTT) in the presence of 10 mM (NH4)2SO4.
Table I. Thermodynamic parameters of unfolding for human aFGF in the presence of phosphate and sulfate ions. Also listed is the effect of 0.6 M guanidine hydrochloride on the thermodynamic parameters of unfolding.
Sample buffer                                   Tm      ΔHcal        ΔHvH         Reversibility
                                                (°C)    (kcal/mol)   (kcal/mol)   (%)
50 mM HEPES, 0.5 mM EDTA, 2.0 mM DTT, pH 7.0    35.2    61           98           0
  +10 mM NaH2PO4                                40.9    67           143          0
  +10 mM (NH4)2SO4                              46.2    72           160          0
20 mM NaH2PO4, 0.15 M NaCl, 0.5 mM EDTA,        46.2    86           144          0
  2.0 mM DTT, 10 mM (NH4)2SO4, pH 7.3
  +0.6 M GuHCl                                  46.0    66           67           88
IV. Discussion A. Calorimetric Analysis With the exception of the sample containing 0.6 M GuHCl (discussed below), all DSC samples exhibited irreversible denaturation, as judged by the absence of an endotherm on the second run, even in the presence of DTT. Furthermore, all samples upon removal from the calorimeter were opaque, indicating that precipitation had occurred during or after denaturation. A denaturation endotherm for human aFGF at neutral pH is shown in figure 2. Initial studies at various pH values suggest that the thermodynamic parameters do not vary much over the range pH 7.0 to 8.0; thus, this endotherm is representative of the physiological pH range. The Tm of human aFGF at pH 7.0 is approximately 35 °C, a value similar to that reported by Middaugh and coworkers (11). The profile of the endotherm indicates that a significant fraction of the protein is actually unfolded at physiological temperature. In practical terms, this information also suggests that yields from fermentation would be expected to be low unless the temperature upon induction is lowered from the typical value of 37 °C. The ΔHcal is 61 kcal/mol and ΔHvH is 98 kcal/mol. Normally, a ΔHvH/ΔHcal ratio greater than 1.0 is indicative of a protein which is present in a multimeric state in solution. However, there is no evidence for stable multimer formation of aFGF. The most likely explanation for the observed ΔHvH/ΔHcal ratio is that it is related to the associated aggregation and precipitation under these conditions.
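The remark that a significant fraction of the protein is unfolded at physiological temperature can be illustrated with a rough estimate. The following is a minimal sketch, assuming a reversible two-state transition with a temperature-independent enthalpy (ΔCp taken as zero); these assumptions do not hold strictly for the irreversible, non-two-state behaviour described above, so the numbers are indicative only. The function name and the choice of ΔHcal as the enthalpy are illustrative assumptions.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def fraction_unfolded(temp_c, tm_c, delta_h_kcal):
    """Two-state unfolded fraction from the van't Hoff relation.

    Assumes a temperature-independent enthalpy (delta Cp = 0) and a
    reversible two-state transition, which is a simplification here.
    """
    t = temp_c + 273.15
    tm = tm_c + 273.15
    k_unfold = math.exp(-(delta_h_kcal / R_KCAL) * (1.0 / t - 1.0 / tm))
    return k_unfold / (1.0 + k_unfold)


# Tm and calorimetric enthalpy reported for 50 mM HEPES, pH 7.0 (Table I).
for temp in (25.0, 35.2, 37.0):
    f_u = fraction_unfolded(temp, tm_c=35.2, delta_h_kcal=61.0)
    print(f"{temp:5.1f} C: fraction unfolded = {f_u:.2f}")
```

By construction the estimate is 0.5 at the Tm; at 37 °C it exceeds one half for these inputs, while the higher Tm measured in the presence of sulfate gives a much smaller value at the same temperature.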
[Figure 2 plot. Horizontal axis: temperature (°C). Inset values (Tm, Hcal, HvH): Control 35.2, 6.14E4, 9.80E4; 10 mM Pi 40.9, 6.72E4, 1.43E5; 10 mM SO4 46.2, 7.20E4, 1.60E5.]
Figure 2. DSC denaturation endotherms for human aFGF in 50 mM HEPES, 0.5 mM EDTA, 2 mM DTT, pH 7.0 (short dashed line). Overlaid onto this plot are the endotherms for the same conditions but with the addition of either 10 mM NaH2PO4 (long dashed line) or 10 mM (NH4)2SO4 (solid line).
The effects of either phosphate or sulfate ion on the stability of human aFGF at neutral pH are shown in figure 2. The addition of 10 mM phosphate ion increases the Tm by 5.7 °C to 40.9 °C, and the presence of 10 mM sulfate ion increases the Tm by 11.0 °C to 46.2 °C. The structure of human aFGF was solved with crystals grown in the presence of 10 mM (NH4)2SO4 and 20 mM phosphate buffer (17). A region of positive density was observed on the surface of the molecule near the residues asparagine 18, lysine 113 and lysine 118 (figure 3). This density was interpreted as an ordered sulfate ion (17). Near this region are additional basic residues including lysine 112, arginine 116, and arginine 122. Thus, this region can be described as a clustering of like-charged (i.e. basic) residues. In the unfolded state these residues are separated from one another along the polypeptide chain. Thus, due to charge repulsion, they may actually contribute to instability of the native structure, and the introduction of an appropriate counter ion (e.g. sulfate) stabilizes the structure. The lack of reversibility, and the presence of precipitation, make thermodynamic analysis of aFGF particularly challenging. Precipitation in the presence of DTT indicates that precipitation is not dependent upon the formation of mixed disulfides. Structural analysis of human aFGF (17) shows that the three free cysteine residues are located at solvent inaccessible positions (figure 4). Thus, formation of mixed disulfides would be expected to destabilize the protein because a) structural changes would be required to expose the cysteines for oxidation and b) covalent adducts of the cysteine residues would have to be tolerated within the packing constraints of the interior of the protein for the native state to be adopted.
[Figure 3 stereo image. Residue labels: Lys 112, Arg 122.]
Figure 3. X-ray crystal structure of human aFGF in the region of lysine 118. Shown is an Fobs-Fcalc difference density map (phases from the model), contoured at 4 σ, into which a sulfate ion has been built (17). The region around this site contains several other basic residues, including lysine 112, lysine 113, arginine 122 and lysine 128.
Figure 4. Stereo Cα trace of human aFGF (17) showing the locations of the three free cysteine residues at positions 16, 83 and 117.
The addition of GuHCl had little effect upon either the stability or the reversibility of thermal denaturation until a concentration of approximately 0.6 M. At this concentration the reversibility of the thermal denaturation went from 0% to 88%, as judged by a comparison of ΔHcal values for repetitive scans (figure 5). The addition of this amount of GuHCl did not appear to significantly destabilize the protein, as judged by the similar Tm value in comparison to the sample in the absence of GuHCl (46.0 °C versus 46.2 °C). Furthermore, the ratio of the van't Hoff to the calorimetric enthalpy was much closer to unity (table I). In comparison to the DSC analysis in the absence of GuHCl, the effect was primarily upon the apparent van't Hoff enthalpy. Thus, for those DSC analyses demonstrating aggregation and precipitation, the calorimetric enthalpy is the more reliable value. How is reversibility of folding achieved by the addition of a relatively small amount of GuHCl? Since there is almost no change in the Tm, and the calorimetric enthalpy is approximately 80% that of the sample in the absence of GuHCl (table I), it would appear that this amount of GuHCl has little effect upon the native state of the protein. Therefore, the GuHCl appears to be affecting primarily the unfolded state of the protein, i.e. it helps to prevent aggregation of the unfolded state, resulting in reversible folding upon cooling. The discovery that the addition of a relatively small amount of GuHCl can allow reversible denaturation will, for the first time, allow accurate determination of the thermodynamic parameters of unfolding for human aFGF. We are currently constructing a series of alanine and serine mutants at the three cysteine residues in human aFGF. DSC analyses of these mutants will allow the determination of their specific contribution to stability, separate from their effects upon irreversible denaturation.
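The comparison of van't Hoff and calorimetric enthalpies discussed above reduces to simple arithmetic on the values in Table I. The following is a minimal sketch that tabulates the ΔHvH/ΔHcal ratio for each condition; the dictionary keys are shortened descriptions of the buffers and the function name is an illustrative choice.

```python
# Values transcribed from Table I: (Tm in C, dH_cal, dH_vH in kcal/mol, reversibility %).
TABLE_I = {
    "HEPES, pH 7.0": (35.2, 61.0, 98.0, 0.0),
    "HEPES + 10 mM phosphate": (40.9, 67.0, 143.0, 0.0),
    "HEPES + 10 mM sulfate": (46.2, 72.0, 160.0, 0.0),
    "phosphate-buffered saline + sulfate, pH 7.3": (46.2, 86.0, 144.0, 0.0),
    "same + 0.6 M GuHCl": (46.0, 66.0, 67.0, 88.0),
}


def enthalpy_ratio(dh_vh, dh_cal):
    """Return dH_vH / dH_cal; about 1 is expected for a two-state monomer."""
    return dh_vh / dh_cal


for name, (tm, dh_cal, dh_vh, reversibility) in TABLE_I.items():
    ratio = enthalpy_ratio(dh_vh, dh_cal)
    print(f"{name:45s} Tm={tm:4.1f}  vH/cal={ratio:4.2f}  reversible={reversibility:.0f}%")
```

Only the GuHCl sample gives a ratio near 1; the irreversible samples all give ratios well above 1, in line with the interpretation that aggregation distorts the apparent van't Hoff enthalpy. The same table also gives 66/86, or roughly 0.8, for the calorimetric enthalpy with GuHCl relative to the sample without it.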
[Figure 5 plots. Panels A and B; horizontal axes: temperature (°C). Inset values in panel A: Tm 46.23, Hcal 8.58E4, HvH 1.44E5.]
Figure 5. DSC denaturation endotherms for human aFGF in 20 mM NaH2PO4, 0.15 M NaCl, 0.5 mM EDTA, 2.0 mM DTT, 10 mM (NH4)2SO4, pH 7.3. Panel A shows repetitive scans (the first scan is on the left and the second on the right). Panel B shows repetitive scans, as in panel A, but with the addition of 0.6 M guanidine hydrochloride to the buffer.
Acknowledgments The authors would like to thank Drs. Ken Thomas, C. Russell Middaugh and John Brandts for helpful discussions. This work was supported in part by the Markey Foundation, Florida State University Council on Research and Creativity, and N.I.H. grant GM54429-01.
References
1. Burgess, W. H. and Maciag, T. (1989) Annual Reviews of Biochemistry 58, 575-606.
2. Miyamoto, M., Naruo, K., Seko, C., Matsumoto, S., Kondo, T. and Kurokawa, T. (1993) Molecular and Cellular Biology 13, 4251-4259.
3. Tanaka, A., Miyamoto, K., Minamino, N., Takeda, M., Sato, B., Matsuo, H. and Matsumoto, K. (1992) Proceedings of the National Academy of Science USA 89, 8928-8932.
4. Thomas, K. A. in Neurotrophic factors, S. E. Loughlin, J. H. Fallon, Eds. (Academic Press, Inc., San Diego, 1993) pp. 285-312.
5. Chellaiah, A. T., McEwen, D. G., Werner, S., Xu, J. and Ornitz, D. M. (1994) Journal of Biological Chemistry 269, 11620-11627.
6. Gospodarowicz, D., Cheng, J., Lui, G.-M., Baird, A. and Bohlen, P. (1984) Proceedings of the National Academy of Science USA 81, 6963-6967.
7. Lobb, R. R. and Fett, J. W. (1984) Biochemistry 23, 6295-6299.
8. Gospodarowicz, D. and Cheng, J. (1986) Journal of Cellular Physiology 128, 475-484.
9. Rosengart, T. K., Johnson, W. V., Friesel, R., Clark, R. and Maciag, T. (1988) Biochemical and Biophysical Research Communications 152, 432-440.
10. Linemeyer, D. L., Menke, J. G., Kelly, L. J., DiSalvo, J., Soderman, D., Schaeffer, M.-T., Ortega, S., Gimenez-Gallego, G. and Thomas, K. A. (1990) Growth Factors 3, 287-298.
11. Copeland, R. A., Ji, H., Halfpenny, A. J., Williams, R. W., Thompson, K. C., Heiber, W. K., Thomas, K. A., Bruner, M. W., Ryan, J. A., Marquis-Omer, D., Sanyal, G., Sitrin, R. D., Yamazaki, S. and Middaugh, C. R. (1991) Archives of Biochemistry and Biophysics 289, 53-61.
12. Seno, M., Sasada, R., Iwane, M., Sudo, K., Kurokawa, T., Ito, K. and Igarashi, K. (1988) Biochemical and Biophysical Research Communications 151, 701-708.
13. Ortega, S., Schaeffer, M.-T., Soderman, D., DiSalvo, J., Linemeyer, D. L., Gimenez-Gallego, G. and Thomas, K. A. (1991) Journal of Biological Chemistry 266, 5842-5846.
14. Tsai, P. K., Volkin, D. B., Dabora, J. M., Thompson, K. C., Bruner, M. W., Gress, J. O., Matuszewska, B., Keogan, M., Bondi, J. V. and Middaugh, C. R. (1993) Pharmaceutical Research 10, 649-659.
15. Mach, H., Ryan, J. A., Burke, C. J., Volkin, D. B. and Middaugh, C. R. (1993) Biochemistry 32, 7703-7711.
16. Linemeyer, D. L., Kelly, L. J., Menke, J. G., Gimenez-Gallego, G., DiSalvo, J. and Thomas, K. A. (1987) Biotechnology 5, 960-965.
17. Blaber, M., DiSalvo, J. and Thomas, K. A. (1996) Biochemistry 35, 2086-2094.
A thermodynamic analysis discriminating loop backbone conformations Jean-Luc Pellequer^ and Shu-wen W. Chen^ Department of Biochemistry and Molecular Biophysics. Columbia University. 630W 168th Street. New York, NY 10032
I. INTRODUCTION Antibodies are soluble molecules that specifically recognize antigens through their antigen-binding sites, called complementarity determining regions (CDRs). CDRs were originally characterized as regions having a high variability in amino-acid sequence (Kabat et al., 1977). The X-ray crystal structures of antibodies revealed that CDRs are loops connecting β-strands located at the extremity of a highly conserved β-barrel fold known as the framework (FR) (Padlan & Davies, 1975; Amzel & Poljak, 1979; Davies et al., 1990; Barre et al., 1994; Bork et al., 1994). Moreover, X-ray crystal structures of protein antigen-antibody complexes demonstrate that CDRs provide almost all intermolecular contacts with the antigenic determinant or epitope (Sheriff et al., 1987; Padlan et al., 1989; Fischmann et al., 1991; Herron et al., 1991; Tulip et al., 1992; Chitarra et al., 1993; Prasad et al., 1993; Ban et al., 1994; Bhat et al., 1994; Braden et al., 1994; Malby et al., 1994; Braden et al., 1996). In order to understand the intimate details of the specificity of the recognition process between antibodies and antigens, the three-dimensional structures of these molecules are required. Although X-ray diffraction studies provide an accurate description of molecules at the atomic level, they are time-consuming to undertake. Because of the very high structural similarity of the framework conformation, attempts have been made to model new antibody conformations using homology modeling techniques. Moreover, these modeling experiments provide a basis for integrating and testing our understanding of antibody structure. The major challenge in this approach is to search conformational space adequately for the six hypervariable loops or CDRs, three for the light chain and three for the heavy chain, to obtain accurate models.
Several methods have been used to model antibody CDRs; they can be divided into two categories: (1) a knowledge-based approach that uses CDRs from known crystal structures of antibodies and (2) an ab initio approach that builds CDR loops. All of these approaches must fulfill one criterion: to identify the conformation that is best adapted to the framework of a current model. Unfortunately, none of the current methods has an appropriate energy function that discriminates an incorrect from a correct CDR conformation. Such a discrimination is of great significance, especially for methods employing the knowledge-based approach by using canonical structures of CDRs. Indeed, methods relying on CDR structural knowledge identify the class to which the CDR being modeled belongs, but do not discriminate which CDR from the known crystal structures of antibodies one should select. Moreover, when critical residues are not present, modeling using canonical structures becomes less efficient (Steipe et al., 1992; Bell et al., 1995). In this paper we report results concerning the development of a complete physical treatment that allows the screening of loop conformations in order to identify the most suitable ones for a particular antibody model. Here, we establish a formalism that allows the computation of the conformational free energies of loops by combining a molecular mechanics treatment of a loop with a continuum treatment of the solvent (Smith & Honig, 1994). We simulate a modeling study by removing the three light chain CDRs from a recently solved crystal structure of an antibody in a bound conformation, namely Fab R4545-11 (R45) (Altschuh et al., 1992; Vix et al., 1993), then replacing them with loops from our database (Pellequer & Chen, 1996) and calculating the conformational free energy for each conformation. Our results reveal that loops in the database having the lowest conformational energy are the loops with the smallest RMSD compared to the CDRs of R45. We expect our thermodynamic analysis to be generally useful for antibody modeling.
Present address: Department of Molecular Biology, The Scripps Research Institute, La Jolla, CA 92037.
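The RMSD comparison between database loops and the CDRs of R45 mentioned in the Introduction can be sketched in a few lines. The following is a minimal sketch, assuming the two loops have the same number of residues and that their coordinates are already expressed in the same frame (no superposition is performed); the function name, the data layout and the coordinates are hypothetical and serve only as an illustration.

```python
import math

BACKBONE_ATOMS = ("N", "CA", "C")


def backbone_rmsd(loop_a, loop_b):
    """RMSD over N, CA and C atoms of two equal-length loops.

    Each loop is a list of residues; each residue maps an atom name to an
    (x, y, z) tuple in angstroms. The coordinates are assumed to be in the
    same frame already (no superposition is performed here).
    """
    if len(loop_a) != len(loop_b):
        raise ValueError("loops must have the same number of residues")
    squared = 0.0
    count = 0
    for res_a, res_b in zip(loop_a, loop_b):
        for atom in BACKBONE_ATOMS:
            squared += sum((p - q) ** 2 for p, q in zip(res_a[atom], res_b[atom]))
            count += 1
    return math.sqrt(squared / count)


# Hypothetical two-residue fragments; loop_y is loop_x shifted by 1 A along x.
loop_x = [
    {"N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0), "C": (2.0, 1.4, 0.0)},
    {"N": (3.3, 1.5, 0.0), "CA": (4.0, 2.8, 0.0), "C": (5.5, 2.6, 0.0)},
]
loop_y = [{k: (x + 1.0, y, z) for k, (x, y, z) in res.items()} for res in loop_x]
print(f"backbone RMSD: {backbone_rmsd(loop_x, loop_y):.2f} A")
```

Ranking candidate loops by such an RMSD against the crystal structure gives the reference ordering against which an energy-based ranking can be compared.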
II. MATERIALS AND METHODS A. Insertion of a loop from the database into the Fv R45 1. Replacing loop side-chains Side-chains were substituted to match the sequence of the Fv R45. We set the dihedral angles χ1 and χ2 (Table I) according to the highest probability found in the rotamer library established by Tuffery et al. (1991). Dihedral angles χ3 and χ4 were set to 180°. Consequently, each amino acid side chain of a given type displays the same starting conformation. 2. Optimizing the inserted loop Compared with loop building, the use of a database requires an additional step, which is the insertion of a loop into the antibody framework. For example, we could superimpose residues from the framework onto the flanking residues of each loop (N-1 and C+1). However, this would require an additional constraint on the distances between atoms from residues N-1 and C+1 during the building of the database. Such a constraint is inconsistent with our definition of antibody CDRs (Pellequer & Chen, 1996). Indeed, this constraint could have been adopted only for loops in which residues N-1 and C+1 were in β-conformations in their original molecule. Moreover, a recent study concluded that
Thermodynamics of Loop Backbone Conformations
superimposing flanking peptides of loops onto an antibody framework failed to provide accurate models for CDRs (Tramontano & Lesk, 1992). An alternative method is a docking procedure (e.g., Carlacci & Englander, 1993). However, this method is computationally costly and introduces an additional variable, which is the identification of the best docking orientation. Although we could have used the conformational free energy calculation to identify the best docking orientation, we found experimentally that it is more appropriate to start with a simple insertion protocol such as a least squares superimposition of the backbone atoms (N, CA, C) of each loop onto those of the native crystal structure.
Table I. Original and rotamer dihedral angles of light chain CDRs from R45
Residue    χ1 Fab R45   χ1 Rotamer   χ2 Fab R45   χ2 Rotamer
CDRL1
SER26         58.81         65           -            -
GLN27        166.61        -63        -164.22        178
ASP28       -171.73        -70         132.32        -32
ILE29         59.22        -62         168.06        163
SER30        -63.97         65           -            -
THR31        -83.74        -61           -            -
TYR32        -46.51        -64         -46.33        102
CDRL2
TYR50       -144.37        -64         -23.9         102
THR51       -155.63        -61           -            -
SER52         65.03         65           -            -
ARG53       -168.13       -176        -176.31        156
LEU54        -74.7         -62          62.12        170
ARG55        100.98       -176         110.83        156
SER56        -54.98         65           -            -
CDRL3
GLY91           -            -           -            -
SER92       -176.11         65           -            -
ARG93        -55.36       -176         -60.19        156
ILE94        -60.92        -62         176.79        163
PRO95         35.86         27a        -45.97        -29a
PRO96         25.61         27a        -42.22        -29a
a Values for Pro were obtained from Ponder & Richards (1987).
Explicit hydrogens were built on each loop when it was inserted into the Fv R45. We used the HBUILD command (Brünger & Karplus, 1988) provided with the Xplor program (Version 3.0) (Brünger, 1992). The Xplor topology and parameter files were TOPALLH6x.PRO and PARMALL3x.PRO, respectively. Sixty cycles of conjugate gradient minimization (Powell method) were then carried out while fixing all atoms of the Fv and all heavy atoms of the loop. We used a dielectric constant of 1 and a nonbonded cut-off of 9 Å. The cut-on and cut-off of both VdW switch and electrostatic shift functions were
Jean-Luc Pellequer and Shu-wen W. Chen
respectively 6.5 Å and 8 Å. The total energy used in this evaluation includes bond distance, valence angle, dihedral and improper angle, van der Waals, and electrostatic energies. Then, we carried out two types of optimization (including hydrogens and heavy atoms): (1) side chains only and (2) all atoms. (1) Side chain conformations were minimized by 600 cycles of conjugate gradient minimization (Powell method) and saved. We observed that 600 cycles of minimization allow convergence in a reasonable time. (2) Starting from the conformations in (1), we applied 600 cycles of conjugate gradient minimization to all atoms of the loop.
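The switched van der Waals and shifted electrostatic terms mentioned above can be illustrated with a minimal sketch. It assumes the common CHARMM-style switching and shifting forms; the text does not state the exact functional forms used by X-PLOR 3.0, so the expressions and the function names `vdw_switch` and `elec_shift` are illustrative assumptions. Only the 6.5 Å cut-on and 8 Å cut-off values come from the text.

```python
def vdw_switch(r, r_on=6.5, r_off=8.0):
    """Switching factor S(r) multiplying the van der Waals term.

    Assumed CHARMM-style form: 1 below r_on, 0 beyond r_off, and a smooth
    polynomial in r**2 in between. Distances are in angstroms.
    """
    if r <= r_on:
        return 1.0
    if r >= r_off:
        return 0.0
    numerator = (r_off**2 - r**2) ** 2 * (r_off**2 + 2.0 * r**2 - 3.0 * r_on**2)
    denominator = (r_off**2 - r_on**2) ** 3
    return numerator / denominator


def elec_shift(r, r_off=8.0):
    """Shift factor multiplying the Coulomb term (assumed CHARMM-style form).

    This assumed form depends only on the cut-off distance r_off.
    """
    if r >= r_off:
        return 0.0
    return (1.0 - (r / r_off) ** 2) ** 2


if __name__ == "__main__":
    for r in (6.0, 6.5, 7.0, 7.5, 8.0):
        print(f"r = {r:.1f} A  switch = {vdw_switch(r):.3f}  shift = {elec_shift(r):.3f}")
```

Both factors fall smoothly to zero at the cut-off, which avoids an energy discontinuity during minimization when an atom pair crosses the cut-off distance.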
3. Optimization of the loop closure
For both minimization procedures described above, loop closure was handled identically: the peptide bond atoms located at both extremities of a loop were allowed to move during the minimizations (even when only side chains were optimized). Only four atoms per extremity were needed for loop closure because of the restricted distances between the N and C termini imposed when the database was established (Pellequer & Chen, 1996).
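B. Relative stability of loops
We assessed the stability of loops by evaluating their conformational free energy relative to the CDRs from Fv R45. The conformational free energy in solution can be described in terms of a thermodynamic cycle and equation 1.

$$\begin{array}{ccc}
\mathrm{Fab}^{gas}_{nat} & \xrightarrow{\;\Delta G^{gas}_{conf}\;} & \mathrm{Fab}^{gas}_{mod} \\
\downarrow \Delta G^{solv}_{nat} & & \downarrow \Delta G^{solv}_{mod} \\
\mathrm{Fab}^{sol}_{nat} & \xrightarrow{\;\Delta G^{sol}_{conf}\;} & \mathrm{Fab}^{sol}_{mod}
\end{array}$$

$$\Delta G^{sol}_{conf} = \Delta G^{gas}_{conf} + \Delta\Delta G_{solv} \qquad (1)$$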
The conformational free energy in the gas phase is obtained from a molecular mechanics force field (CHARMm): it includes internal coordinate energies as well as non-bonded interactions (van der Waals and electrostatic). The solvation free energy for transferring a molecule from the gas phase to the aqueous phase is calculated with a continuum model of the solvent (Jean-Charles et al., 1991; Honig et al., 1993). Addition of the gas phase conformational free energy to the solvation free energy gives the conformational free energy
in solution. It should be stressed that in this thermodynamic cycle there is no double counting of interactions due to the combination of a continuum model and a molecular force field (Smith & Honig, 1994). The solvation free energy difference between modeled and native loops is

$$\Delta\Delta G_{solv} = \Delta G^{solv}_{mod} - \Delta G^{solv}_{nat} \qquad (2)$$

The solvation free energy change can be written as

$$\Delta G_{solv} = \Delta G^{gas \rightarrow water}_{el} + \Delta G^{gas \rightarrow water}_{np} \qquad (3)$$

where $\Delta G^{gas \rightarrow water}_{el}$ is the difference in electrostatic free energy of transferring the Fv from gas to water, obtained from finite difference Poisson-Boltzmann calculations (Delphi Version 3.0 (Sridharan et al., in preparation; Nicholls & Honig, 1991)), which is the difference between the reaction field energy in vacuum and in water (Gilson & Honig, 1988; Jean-Charles et al., 1991). $\Delta G^{gas \rightarrow water}_{np}$ is the transfer free energy of an uncharged molecule of the same size and shape as the Fv from gas to water. It is commonly assumed that $\Delta G^{gas \rightarrow water}_{np}$ is proportional to the total accessible surface area of the Fv (equation 4):

$$\Delta G^{gas \rightarrow water}_{np} = \gamma A \qquad (4)$$

where $\gamma$ is the vacuum-to-water transfer free energy coefficient and $A$ is the total accessible surface area. In our study we used a value of 5 cal/mol/Å² as determined from solubility experiments (Ben-Naim & Marcus, 1984). The reaction field energy in vacuum and in water was calculated with the Delphi program using a cubic grid of 129 points per side, three focusing runs per calculation (24%, 48% and 96%), and a dipolar boundary condition for the first run. The final resolution was 2 grid points per Å, which has been shown to be sufficient for convergence. At such a resolution, the relative energy is almost insensitive to the orientation of the molecule inside the grid (Smith & Honig, 1994). The internal dielectric constant was 2 and the external dielectric constant was 80 for water. In the gas phase calculation, the external dielectric constant was 1. We used the newly derived PARSE parameters for radii and atom charges (Sitkoff et al., 1994), as these parameters have been optimized for accurate reproduction of the hydration free energy of amino acids upon transfer from the gas phase to water. The PARSE parameters allow an assignment of particular values for N and C terminal residues as well as for disulfide bridges. Only Asp, Glu, Lys, and Arg were charged.
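The bookkeeping of equations 1-4 can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the function names and the numerical inputs are hypothetical, and in practice the electrostatic term comes from a Poisson-Boltzmann solver and the surface area from a separate calculation. Only the coefficient of 5 cal/mol/Å² is taken from the text.

```python
GAMMA = 0.005  # kcal/mol/A^2, i.e. 5 cal/mol/A^2 as quoted in the text


def solvation_free_energy(dg_electrostatic, surface_area, gamma=GAMMA):
    """Equations 3 and 4: electrostatic term plus gamma times accessible area."""
    return dg_electrostatic + gamma * surface_area


def conformational_free_energy_in_solution(dg_gas_conf, dg_solv_model, dg_solv_native):
    """Equations 1 and 2: gas phase term plus the solvation difference."""
    ddg_solv = dg_solv_model - dg_solv_native
    return dg_gas_conf + ddg_solv


if __name__ == "__main__":
    # Hypothetical inputs in kcal/mol and A^2, chosen only for illustration.
    solv_native = solvation_free_energy(-1200.0, 10000.0)
    solv_model = solvation_free_energy(-1190.0, 10100.0)
    print(conformational_free_energy_in_solution(20.0, solv_model, solv_native))
```

Because only the difference between model and native solvation energies enters equation 1, errors common to both calculations (for example from the grid placement) tend to cancel.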
III. RESULTS Two ways of optimizing CDR modeled loops were tested: (1) only the side chains of the loops were minimized, and (2) all atoms of loops were minimized. In the first case, the backbone was kept fixed in the original loop conformation. In the second case, all atoms were minimized in order to obtain a "clash-free" loop conformation. To reduce computational requirements, only a subset of the lowest conformational free energy loops (<1500 kcal/mol) from (1) were optimized in (2), see below. Loop closure occurs identically in both approaches and therefore does not contribute to the energy difference observed between them. Modeled loops were compared with the native loops treated with identical minimization procedures.
A. CDRL1
The range in gas phase conformational free energy is wide: when only side chains were optimized, 33 of 406 loops had energies lower than 1000 kcal/mol. The main high-energy component came from steric clashes (vdW energy) among backbone atoms or between side chains that could not be fully optimized due to the fixed backbone orientation. The loop with the lowest conformational free energy (in gas phase and in solution) has the lowest RMSD (Table IIa). This loop was extracted from an antibody molecule (6fab, residues L26_L32). We noticed a gap (more than 100 kcal/mol) in the conformational gas phase free energy between the lowest energy loop (6fab.L26_L32) and the one with the next best energy. Due to the relatively high conformational free energy in the gas phase, the solvation free energy term did not change the ordering of the loops. As expected with an all-atom optimization procedure, most of the steric clashes were removed and the conformational free energy was greatly reduced, ranging from 7.9 to 488.37 kcal/mol for the 40 selected loops. The best conformation observed when only side chains were minimized is also the best in all-atom minimization (Table IIb, 6fab.L26_L32). Two loops had very close conformational gas phase free energies (6fab.L26_L32 and 4enl.287_293, Figure 1a), but 6fab.L26_L32 displayed a lower conformational free energy in solution, indicating the beneficial effect of the solvation free energy when assessing loop stability (Figure 1b). This is a clear case where the addition of the solvation free energy to the conformational gas phase free energy allowed the identification of the correct conformation.
[Figure 1a and Figure 1b: plots for CDRL1 loops with an RMSD (Å) axis; the graphics are not recoverable from the source.]
Furthermore, when the solvation free energy is included, the difference in conformational free energy between the native and modeled loops is reduced, as shown in Figure 1 for CDRL1. Similar plots were obtained for CDRL2 and CDRL3 (data not shown). The gap in energy observed when only side chains were optimized is reduced but still present for CDRL1 (Table IIb). There is no modeled CDRL1 loop that had a lower conformational gas phase free energy than the native CDR loop.
B. CDRL2
As for CDRL1, the range in gas phase conformational free energy is wide. When only side chains were optimized, only 20 of 196 loops had energies lower than 1000 kcal/mol. The main component of such high energy conformations came from atom clashes (vdW energy). Several loops were found to have native-like CDR geometry (RMSD < 0.51 Å). Although the loop with the lowest conformational free energy did not have the lowest RMSD (Table IIIa), its conformation was almost as close to the native as the best fit structure. The three lowest conformational free energy loops are from antibody molecules (2rhe.51_57, 2fb4.L49_L55, 1rei.A50_A56). The conformational free energies of 2rhe.51_57, both in the gas phase and in solution, were lower than those calculated for the native structure. The electrostatic contribution of the two arginine residues is 40 kcal/mol lower for this loop than for the native. We observed a gap (more than 100 kcal/mol) in the conformational gas phase free energy between this lowest energy CDRL2 loop and the next one (Table IIIa). However, this gap did not correlate with a gap in the RMSD because most of the low energy CDRL2 loops have a native-like conformation. The conformational free energy ranged from -6.24 to 1995.12 kcal/mol for the 39 selected conformations with all-atom optimization. The three conformations with the lowest RMSD were the same as those found when only side chains were minimized, but their ordering changed (Table IIIb). Inclusion of the solvation free energy allowed a more accurate description of the free energy, although 1rei.A50_A56 still has a lower energy than the native CDRL2 loop (Table IIIb). The gaps in energy previously observed in Table IIIa have decreased in magnitude but are still present. Although nine loops were found to have a native-like CDR geometry (RMSD < 0.51 Å), two sub-groups could be distinguished both by their RMSD and by their conformational free energy in solution. One group has an RMSD < 0.31 Å and $\Delta G^{sol}_{conf}$ < 1.5 kcal/mol (3 loops), and the second has an RMSD ranging from 0.37 to 0.51 Å and $\Delta G^{sol}_{conf}$ from 19.35 to 34.39 kcal/mol.
C. CDRL3
As for both CDRL1 and CDRL2, the range in gas phase conformational free energy is wide: when only side chains were optimized, 229 of 497 loops had energies lower than 1000 kcal/mol. There is no large gap in energy between the best conformation and the next lowest one. Indeed, the second lowest free energy conformation also has a native-like CDR conformation (RMSD < 0.51 Å, Table IVa). A large energy gap is found between the second loop and the third one, which is not native-like (gap of about 200 kcal/mol). The two loops with the lowest energy are equally close to the native conformation (Table IVa). The conformational free energy ranged from -2.25 to 443.17 kcal/mol for the 47 selected conformations with all-atom optimization. The two lowest RMSD conformations observed in all-atom minimization were the same as when only side chains were minimized (Figure 2). Because of the solvation free energy term, the loop 451c.21_26 has a conformational free energy close to the two best conformations, but it has a non-native-like CDR conformation. This particularly low solvation free energy is due to increases in the solvent accessible surface area of 50 Å² for arginine 93 and 13 Å² for the polar atoms of the backbone compared with the other loops shown in Table IVb. Although the inclusion of solvation free energy did not
exclude a wrong conformation, the difference in energy between native and minimized loops was reconciled. It is interesting to point out that the all-atom minimization improved the conformation of one loop toward a native-like CDR geometry (2cba.166_171, Table IVb), with an RMSD of 0.27 Å compared with 0.61 Å when only side chains were optimized. The improvement of the geometry occurs in the last two prolines of CDRL3.
Tables II-IV. Five lowest conformational free energy loops for CDRL1, CDRL2 and CDRL3. The RMSD concerns backbone atoms N, CA, C. Energies are in kcal/mol and RMSD values in Å.

Table II. CDRL1
Loop               Sequence   $\Delta G^{sol}_{conf}$   $\Delta G^{gas}_{conf}$   RMSD
a) side-chains minimized
R45 CDRL1          SQDISTY       0         0        0
6fab.L26_L32       SQDINNF     135.50    161.23     0.18
1msb.A73_A79       SNWKKDE     275.16    265.87     1.52
3tms.64_70         QGDTNIA     380.96    387.21     1.47
2aaa.453_459       VPMASGL     425.98    427.11     1.32
1cox.47_53         DTPGADG     460.32    440.38     1.66
b) all-atoms minimized
6fab.L26_L32       SQDINNF       7.90     35.30     0.36
4enl.287_293       KRYPIVS      33.03     35.68     1.73
4enl.374_380       RSGETED      34.93     69.54     1.54
1rei.A26_A32       SQDIIKY      38.67     60.56     0.65
8adh.128_134       SRFTCRG      51.85     80.86     1.16

Table III. CDRL2
Loop               Sequence   $\Delta G^{sol}_{conf}$   $\Delta G^{gas}_{conf}$   RMSD
a) side-chains minimized
R45 CDRL2          YTSRLRS       0         0        0
2rhe.51_57         YNDLLPS      -5.00    -14.13     0.51
2fb4.L49_L55       RDAMRPS     115.92    109.00     0.27
1rei.A50_A56       EASNLQA     122.26    157.01     0.48
9wga.A13_A19       PNNLCCS     292.04    280.9      0.89
2fbi.H195_H201     PEGESVK     341.22    339.46     0.61
b) all-atoms minimized
1rei.A50_A56       EASNLQA      -6.24      3.10     0.29
2rhe.51_57         YNDLLPS       0.92      1.76     0.31
2fb4.L49_L55       RDAMRPS       1.48      1.30     0.30
1pcy.22_28         SPGEKIV      19.35     47.25     0.39
4mdh.A30_A36       GKDQPII      19.52     27.92     0.37

Table IV. CDRL3
Loop               Sequence   $\Delta G^{sol}_{conf}$   $\Delta G^{gas}_{conf}$   RMSD
a) side-chains minimized
R45 CDRL3          GSRIPP        0         0        0
1rei.A91_A96       YQSLPY        6.22     13.82     0.19
6fab.L91_L96       GNALPR       20.35     19.52     0.25
1fcb.A217_A222     GQGVTK      208.43    238.45     0.59
2scp.A104_A109     DTNEDN      246.00    248.74     1.20
3app.186_191       NSQGFW      255.33    239.78     1.40
b) all-atoms minimized
1rei.A91_A96       YQSLPY       -2.25     18.74     0.29
6fab.L91_L96       GNALPR        0.03     17.59     0.23
451c.21_26         KMVGPA        2.59     34.66     0.82
2cba.166_171       IKTKGK        8.05     22.25     0.27
3psg.316_321       QDDDSC       15.42     46.44     0.86
IV. DISCUSSION
A. What are the determinant parameters of a native-like CDR loop?
In this section, we assess the contributions that allow the discrimination of correct from incorrect loops (only all-atom minimized loops) and discuss the fact that loops extracted from antibodies were always found to be the best candidates. First, we found a strong correlation (r > 0.95) between the conformational free energy $\Delta G_{conf}$ and the sum of internal bonded energies (Brünger, 1992), as well as between $\Delta G_{conf}$ and the van der Waals energy (r ~ 0.90). This indicates that high energy loops have high van der Waals energies as well as high internal bonded energies. All of these loops have been extensively minimized, with no bad contacts or clashes remaining. Consequently, a higher van der Waals energy signifies imperfect packing rather than bad steric contacts. We checked all loops for imperfect closure with the framework (internal bonded energies). Although a few loops displayed some high bond angle energies (especially in CDRL1), this was not the general trend and therefore could not be responsible for the correlation observed between $\Delta G_{conf}$ and the internal bonded energy. For instance, CDRL3 showed no bad geometric closures (all deviations from the average bond distance below 0.3 Å, from the average bond angle below 20°, and from the average improper angle below 30°) but displayed a strong correlation between free energy and internal bonded energy (r = 0.96). We conclude that the difference in energy among loops comes from a concomitant effect of internal bonded energy and packing (van der Waals energy). Second, why were antibody loops found to be the best models? Without exception, they have the lowest van der Waals energy as well as the lowest internal bonded energies. We suggest that antibody loops are the best candidates because their backbone atoms pack better onto the highly conserved FR than those of a non-antibody loop. Because the orientation of side chains was of minor importance, the most crucial interaction appears to be between the backbone of a loop and the FR. Because the FR is structurally conserved, we are not surprised that loops coming from a similar environment (another antibody) were found to dock better to the R45 framework. Therefore local environment appears to be the dominant factor in CDR modeling because non-antibody loops were found
to be incapable of accurately modeling CDRs in general. Although it may not be surprising that CDRs are the best candidates for modeling CDRs, our conformational free energy screening is highly accurate since no "false positives" were found. More interestingly, our energy screening discriminates correct from incorrect conformations even when all atoms of the loops were minimized.
[Figure 2a and Figure 2b: plots with an RMSD (Å) axis; the graphics are not recoverable from the source.]
B. Conformational free energy analysis: Side chains vs. all atom optimization
Because the computational time required for minimizing seven residue loops is not negligible, it is fair to ask whether side chain optimization alone is sufficient during a large screening of candidates. Nevertheless, we agree that a complete optimization is required at the ultimate stage of modeling in order to remove any intra-molecular bad contacts. According to our results, the same loops were found to be correct with both optimization methods (Tables II-IV, compare side-chain vs. all-atom minimized loops). We found that the best loops (RMSD < 0.51 Å) had a much lower free energy than incorrect loops when only side chains are minimized (Figure 2). Such a separation in energy makes it easier to identify correct loops (Figure 2a). When all atoms of the loops are minimized, this gap in energy is reduced or disappears (Figure 2b). Nevertheless, the lowest conformational free energies (either in gas phase or in solution) always correlated with native-like CDR conformations. A second aspect of this comparison should be considered. Does a complete optimization improve the geometry toward the native structure? As shown in Tables II-IV and Figure 1, there is no significant improvement when all atoms are minimized, except for one loop in CDRL3. An extended minimization (all atoms) leads to a compaction of the conformational energy scale but did not improve loop conformations toward the native state (similar RMSD to side chain only minimization). Therefore a large and fast screening may not require full optimization of all loop atoms. A complete minimization is therefore best carried out on a few selected conformations at a later stage.
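The staged strategy discussed above (cheap side-chain scoring for every candidate, full minimization only for the survivors) can be sketched as follows. This is a hypothetical illustration: `side_chain_energy` and `all_atom_energy` stand in for whatever minimization and scoring routine is available, and the loop identifiers and energies in the usage example are made up. The 1500 kcal/mol threshold is the one quoted in the Results.

```python
def two_stage_screen(loops, side_chain_energy, all_atom_energy, threshold=1500.0):
    """Return (loop, all-atom energy) pairs sorted from lowest to highest energy.

    Stage 1 scores every loop with the cheap side-chain-only protocol.
    Stage 2 re-scores only loops whose stage 1 energy is below `threshold`.
    """
    survivors = [loop for loop in loops if side_chain_energy(loop) < threshold]
    rescored = [(loop, all_atom_energy(loop)) for loop in survivors]
    return sorted(rescored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical energies in kcal/mol for three made-up loop identifiers.
    stage1 = {"loopA": 135.0, "loopB": 2400.0, "loopC": 380.0}
    stage2 = {"loopA": 35.0, "loopC": 8.0}
    print(two_stage_screen(list(stage1), stage1.get, stage2.get))
```

The expensive scoring function is never called for loops rejected in the first stage, which is where the savings in a large database screen would come from.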
C. Conformational free energy analysis: gas phase vs. solution conformational free energy
An important question is: does inclusion of the solvation free energy improve the identification of correct loops? Based on the results presented in Tables IIb, IIIb, and IVb, the conformational free energy in the gas phase alone is able to identify the correct conformation. When only side chains were minimized, the inclusion of solvation free energy did not help distinguish candidates due to the very high free energy of such conformations. However, when loops are completely optimized (all-atom minimization), the solvation free energy allowed a more accurate description of the near-native conformations (Tables IIb-IVb). As a consequence, including a solvation energy term in the conformational energy is a crucial step when the loops to be evaluated have similar gas phase energies. However, during a fast screening test, calculation of the solvation free energy for each loop is computationally expensive without increasing modeling accuracy. We suggest that solvation free energies should only be calculated for the conformations with the lowest range of gas phase energies. In conclusion, this work confirms that with the current PDB, modeling antibody CDRs is better achieved by using a database of CDRs from antibody molecules rather than from other molecules. Nevertheless, calculation of the conformational free energy is essential in order to select the best CDR conformation for a given model. Although we have so far applied the present method only to antibody loops, it appears to be a promising approach for screening an ensemble of loop conformations for any protein structure framework.^
REFERENCES
Altschuh, D., Vix, O., Rees, B., et al. (1992). Science 256, 92-94.
Amzel, L. M. & Poljak, R. J. (1979). Ann. Rev. Biochem. 48, 961-997.
Ban, N., Escobar, C., Garcia, R., et al. (1994). Proc. Natl. Acad. Sci. USA 91, 1604-1608.
Barre, S., Greenberg, A. S., Flajnik, M. F., et al. (1994). Nature Struct. Biol. 1, 915-920.
Bell, C. W., Roberts, V. A., Scholthof, K.-B. G., et al. (1995). In Immunoanalysis of agrochemicals. Emerging technologies, pp. 50-71. American Chemical Society, Washington D.C.
Ben-Naim, A. & Marcus, Y. (1984). J. Chem. Phys. 81, 2016-2027.
Bhat, T. N., Bentley, G. A., Boulot, G., et al. (1994). Proc. Natl. Acad. Sci. USA 91, 1089-1093.
Bork, P., Holm, L. and Sander, C. (1994). J. Mol. Biol. 242, 309-320.
Braden, B. C., Fields, B. A., Ysern, X., et al. (1996). J. Mol. Biol. 257, 889-894.
Braden, B. C., Souchon, H., Eisele, J.-L., et al. (1994). J. Mol. Biol. 243, 767-781.
Brünger, A. T. (1992). X-PLOR Manual. Version 3.0. Yale University, New Haven.
Brünger, A. T. & Karplus, M. (1988). Proteins 4, 148-156.
Carlacci, L. & Englander, S. W. (1993). Biopolymers 33, 1271-1286.
Chitarra, V., Alzari, P. M., Bentley, G. A., et al. (1993). Proc. Natl. Acad. Sci. USA 90, 7711-7715.
Davies, D. R., Padlan, E. A. and Sheriff, S. (1990). Annu. Rev. Biochem. 59, 439-473.
Fischmann, T. O., Bentley, G. A., Bhat, T. N., et al. (1991). J. Biol. Chem. 266, 12915-12920.
^ This work has been carried out in the laboratory of Prof. Barry Honig in the Department of Biochemistry and Molecular Biophysics at Columbia University. We acknowledge J.A. Tainer for critical reading of the manuscript.
Gilson, M. K. & Honig, B. (1988). Proteins 4, 7-18.
Herron, J. N., He, X. M., Ballard, D. W., et al. (1991). Proteins 11, 159-175.
Honig, B., Sharp, K. and Yang, A.-S. (1993). J. Phys. Chem. 97, 1101-1109.
Jean-Charles, A., Nicholls, A., Sharp, K., et al. (1991). J. Amer. Chem. Soc. 113, 1454-1455.
Kabat, E. A., Wu, T. T. and Bilofsky, H. (1977). J. Biol. Chem. 252, 6609-6616.
Malby, R. L., Tulip, W. R., Harley, V. R., et al. (1994). Structure 2, 733-746.
Nicholls, A. & Honig, B. (1991). J. Comp. Chem. 12, 435-445.
Padlan, E. A. & Davies, D. R. (1975). Proc. Natl. Acad. Sci. USA 72, 819-823.
Padlan, E. A., Silverton, E. W., Sheriff, S., et al. (1989). Proc. Natl. Acad. Sci. USA 86, 5938-5942.
Pellequer, J.-L. & Chen, S.-W. W. (1996). submitted.
Ponder, J. W. & Richards, F. M. (1987). J. Mol. Biol. 193, 775-791.
Prasad, L., Sharma, S., Vandonselaar, M., et al. (1993). J. Biol. Chem. 268, 10705-10708.
Sheriff, S., Silverton, E. W., Padlan, E. A., et al. (1987). Proc. Natl. Acad. Sci. USA 84, 8075-8079.
Sitkoff, D., Sharp, K. A. and Honig, B. (1994). J. Phys. Chem. 98, 1978-1988.
Smith, K. C. & Honig, B. (1994). Proteins 18, 119-132.
Sridharan, S., Nicholls, A. and Honig, B. in preparation.
Steipe, B., Plückthun, A. and Huber, R. (1992). J. Mol. Biol. 225, 739-753.
Tramontano, A. & Lesk, A. M. (1992). Proteins 13, 231-245.
Tulip, W. R., Varghese, J. N., Laver, W. G., et al. (1992). J. Mol. Biol. 227, 122-148.
Vix, O., Rees, B., Thierry, J.-C., et al. (1993). Proteins 15, 339-348.
The Equilibrium Ensemble of Conformational States in Staphylococcal Nuclease
Vincent J. Hilser and Ernesto Freire
Department of Biology and Biocalorimetry Center, The Johns Hopkins University
Baltimore, MD 21218
I. Introduction
The folding/unfolding equilibrium of proteins as observed by global macroscopic observables can usually be described by simple models that include two, three or at most a few conformational states (Freire, 1995; Kuwajima et al., 1976; Privalov, 1979; Privalov, 1982). The situation is different when the equilibrium is studied by physical observables that monitor the behavior of individual residues. Among these observables, NMR detected hydrogen exchange has acquired tremendous prominence due to its unique ability to monitor simultaneously the behavior of a large number of residues (Bai et al., 1995; Jacobs & Fox, 1994; Jennings & Wright, 1993; Kim et al., 1993; Kim & Woodward, 1993; Loh et al., 1993; Morozova et al., 1995; Radford et al., 1992a; Radford et al., 1992b; Roder et al., 1988; Schulman et al., 1995; Udgaonkar & Baldwin, 1988; Woodward, 1993). It has become evident that the pattern of hydrogen exchange protection observed under native or denaturing conditions cannot be rationalized in terms of models that involve a few discrete states (Bai et al., 1995) and that its interpretation requires a statistical thermodynamic treatment of the ensemble of conformations accessible to the protein (Hilser & Freire, 1996a; Hilser & Freire, 1996b; Hilser & Freire, 1996c).
TECHNIQUES IN PROTEIN CHEMISTRY VIII Copyright © 1997 by Academic Press All rights of reproduction in any form reserved.
Under experimental conditions in which the so-called EX2 regime is obeyed, the equilibrium constant for the following reaction is measured by hydrogen exchange experiments (see for example Bai et al., 1995):

$$\mathrm{NH}_j(\mathrm{closed}) \;\overset{K_{op,j}}{\rightleftharpoons}\; \mathrm{NH}_j(\mathrm{open})$$

The above reaction cannot be understood as a simple chemical reaction involving two discrete states. Rather, $K_{op,j}$ is a statistical quantity equal to the ratio between the sum of the concentrations of all protein conformations in which residue j is open and therefore able to exchange protons with the solvent, and the sum of the concentrations of all conformations in which residue j is closed. For this reason, physical parameters derived from $K_{op,j}$, like van't Hoff enthalpies or m values, should also be interpreted in statistical terms.
In this paper we summarize the basic statistical thermodynamic formalism required to interpret hydrogen exchange protection factors.
II. Apparent Stability Constants Per Residue
The prediction or analysis of hydrogen exchange protection factors requires the introduction of a new type of statistical descriptor: the apparent stability constant per residue (Hilser & Freire, 1996a; Hilser & Freire, 1996b). The apparent stability constant per residue, $K_{f,j}$, is defined as the ratio of the probabilities of all states in which residue j is folded, $P_{f,j}$, to the probabilities of the states in which residue j is not folded, $P_{nf,j}$:

$$K_{f,j} = \frac{\sum_{(\text{states with residue } j \text{ folded})} P_i}{\sum_{(\text{states with residue } j \text{ not folded})} P_i} = \frac{P_{f,j}}{P_{nf,j}} \qquad (1)$$

The apparent folding constant per residue, $K_{f,j}$, is the quantity that one would measure if it were possible to experimentally determine the stability of the protein by monitoring each individual residue. The corresponding apparent free energy per residue is simply $\Delta G_{f,j} = -RT \ln K_{f,j}$.
Under equilibrium conditions, the probability of any given conformational state, $P_i$, is given by the equation:

$$P_i = \frac{\exp(-\Delta G_i/RT)}{\sum_{i=0}^{N} \exp(-\Delta G_i/RT)} \qquad (2)$$
where the statistical weights or Boltzmann exponents ($\exp(-\Delta G_i/RT)$) are defined in terms of the relative Gibbs energies $\Delta G_i$ for each state (R is the gas constant and T the absolute temperature). The relative Gibbs energy of each state ($\Delta G_i$) is expressed in terms of the standard thermodynamic equation:

$$\Delta G_i = \Delta H_i - T\,\Delta S_i \qquad (3)$$
where $\Delta G_i$, $\Delta H_i$ and $\Delta S_i$ are the relative Gibbs energy, enthalpy and entropy of state i at temperature T, respectively. In the presence of denaturants like urea or GuHCl, $\Delta G_i$ can be written in terms of the standard linear extrapolation model, as

$$\Delta G_i = \Delta G_i^{\circ} - m_i\,[D] \qquad (4)$$

where $\Delta G_i^{\circ}$ is the Gibbs energy in the absence of denaturant, $[D]$ the concentration of denaturant, and $m_i$ the m value for state i. In the above equations, the native state is chosen as the reference state.
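The definitions in equations 1 and 2 can be made concrete with a small numerical sketch. The three-state ensemble below is hypothetical and chosen only for illustration; the function names are not from the original work. State 0 is the native reference state, so its relative Gibbs energy is zero.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol K)


def state_probabilities(dg_states, temperature):
    """Equation 2: Boltzmann probabilities from relative Gibbs energies."""
    weights = [math.exp(-dg / (R * temperature)) for dg in dg_states]
    partition = sum(weights)
    return [w / partition for w in weights]


def residue_stability_constant(probabilities, folded, j):
    """Equation 1: sum of P_i with residue j folded over sum with j not folded."""
    p_folded = sum(p for p, mask in zip(probabilities, folded) if mask[j])
    p_not_folded = sum(p for p, mask in zip(probabilities, folded) if not mask[j])
    return p_folded / p_not_folded


if __name__ == "__main__":
    # Hypothetical three-state ensemble for a two-residue toy protein.
    dg_states = [0.0, 2.0, 5.0]  # kcal/mol, relative to the native state
    folded = [(True, True), (False, True), (False, False)]
    probs = state_probabilities(dg_states, 298.15)
    for j in range(2):
        k = residue_stability_constant(probs, folded, j)
        print(f"residue {j}: K_f = {k:.1f}")
```

In this toy ensemble the residue that unfolds in the partially unfolded state has a much smaller stability constant than the residue that unfolds only when the whole molecule unfolds, which is the residue-level information that a global two-state description cannot provide.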
A. Temperature Dependence Of Stability Constants
The temperature dependence of $\ln K_{f,j}$ can be used to define a van't Hoff enthalpy, $\Delta H_{vh,j}$, that has a rigorous statistical thermodynamic meaning. This temperature dependence is proportional to the difference between the average enthalpy of the states in which residue j is folded and the average enthalpy of the states in which residue j is non-folded. More precisely, application of the classical van't Hoff equation yields

$$\Delta H_{vh,j} = -R\,\frac{\partial \ln K_{f,j}}{\partial T^{-1}} = \langle \Delta H \rangle_{f,j} - \langle \Delta H \rangle_{nf,j} \qquad (5)$$

where $\langle \Delta H \rangle_{f,j}$ and $\langle \Delta H \rangle_{nf,j}$ are the average enthalpies of the states in which residue j is folded or non-folded, respectively. In equation 5 the averages $\langle \Delta H \rangle_{f,j}$ and $\langle \Delta H \rangle_{nf,j}$ are taken over the subensemble of states in which residue j is folded or non-folded, respectively. Consequently,
neither of the two averages reflects the total population of a given set of states over the entire ensemble of protein molecules. According to equation 5, the actual value of $\Delta H_{vh,j}$ will be weighted by the relative population of the different states in which residue j is folded or non-folded. At temperatures in which the native state exhibits maximal stability, $\langle \Delta H \rangle_{f,j}$ is essentially zero since it primarily represents the relative enthalpy of the native state, i.e. the reference state. Thus, under native conditions $-\Delta H_{vh,j}$ is approximately equal to the average enthalpy of the states in which residue j is unfolded. If under native conditions this value is similar to the enthalpy change for global unfolding of the protein (as measured calorimetrically, for example) then residue j belongs to a highly stable region of the molecule which undergoes unfolding only when the entire molecule unfolds. Conversely, regions that exhibit smaller $-\Delta H_{vh,j}$ values can become unfolded without the entire molecule being in the unfolded state. The situation changes as the temperature approaches the denaturation temperature and the native and unfolded states exhibit non-negligible populations. As the transition region is approached, $K_{f,j}$ values for different residues will progressively converge to a common value that characterizes the equilibrium between the native and unfolded states.
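Equation 5 can be checked numerically on a toy ensemble. The sketch below is an illustration under stated assumptions: the three states, their enthalpies and entropies are hypothetical, and both are taken as temperature independent (no heat capacity term). It compares a finite-difference van't Hoff enthalpy with the difference of subensemble-averaged enthalpies.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol K)

# Hypothetical ensemble: (relative enthalpy in kcal/mol,
#                         relative entropy in kcal/(mol K),
#                         folded flag for each of two residues).
STATES = [
    (0.0, 0.0, (True, True)),        # native reference state
    (20.0, 0.060, (False, True)),    # residue 0 unfolded
    (60.0, 0.180, (False, False)),   # fully unfolded
]


def weight(dh, ds, temperature):
    """Boltzmann weight of a state with dG = dH - T * dS."""
    return math.exp(-(dh - temperature * ds) / (R * temperature))


def ln_k(temperature, j, states=STATES):
    """ln of the stability constant of residue j (equation 1)."""
    folded = sum(weight(dh, ds, temperature) for dh, ds, mask in states if mask[j])
    unfolded = sum(weight(dh, ds, temperature) for dh, ds, mask in states if not mask[j])
    return math.log(folded / unfolded)


def vant_hoff_numeric(temperature, j, step=1e-3):
    """-R * d ln K / d(1/T), using d/d(1/T) = -T**2 * d/dT and a central difference."""
    dlnk_dt = (ln_k(temperature + step, j) - ln_k(temperature - step, j)) / (2.0 * step)
    return R * temperature**2 * dlnk_dt


def vant_hoff_from_averages(temperature, j, states=STATES):
    """<dH> over states with residue j folded minus <dH> over states with j not folded."""
    def average(flag):
        rows = [(dh, weight(dh, ds, temperature)) for dh, ds, mask in states if mask[j] == flag]
        return sum(dh * w for dh, w in rows) / sum(w for _, w in rows)
    return average(True) - average(False)


if __name__ == "__main__":
    for j in range(2):
        print(j, vant_hoff_numeric(298.15, j), vant_hoff_from_averages(298.15, j))
```

For the residue that unfolds only in the fully unfolded state, the negative of the van't Hoff enthalpy comes out close to the enthalpy of that state, in line with the interpretation given above for highly stable regions.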
B. Denaturant Dependence Of Stability Constants
The denaturant dependence of $\ln K_{f,j}$ can be used to define an apparent or effective m value per residue, $m_{eff,j}$:

$$m_{eff,j} = -RT\,\frac{\partial \ln K_{f,j}}{\partial [D]} = \langle m \rangle_{nf,j} - \langle m \rangle_{f,j} \qquad (6)$$

As in the case of the van't Hoff enthalpy per residue, the effective m values are statistical quantities equal to the difference between the average m value for the conformations in which the residue is not folded and the average m value for the conformations in which that residue is folded. At low
denaturant concentrations the two averages are about equal and cancel each other to a large extent. This explains why it is possible for some residues to exhibit high stability constants or protection factors and simultaneously m values close to zero (Hilser & Freire, 1996a; Hilser et al., 1996). This behavior should be contrasted with that expected for the logarithm of a two-state equilibrium constant which should be linear with denaturant concentration.
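The origin of equation 6 can be seen from a short derivation sketch. It assumes the linear extrapolation model of equation 4 with each $m_i$ independent of the denaturant concentration, written here as $[D]$, and uses $\langle m \rangle$ for a Boltzmann-weighted average over the indicated subensemble; the set notation $i \in f,j$ (states with residue j folded) is introduced here only for compactness.

```latex
\begin{align*}
\ln K_{f,j} &= \ln \sum_{i \in f,j} e^{-(\Delta G_i^{\circ} - m_i [D])/RT}
             - \ln \sum_{i \in nf,j} e^{-(\Delta G_i^{\circ} - m_i [D])/RT} \\
\frac{\partial \ln K_{f,j}}{\partial [D]}
            &= \frac{\langle m \rangle_{f,j} - \langle m \rangle_{nf,j}}{RT} \\
m_{eff,j} = -RT\,\frac{\partial \ln K_{f,j}}{\partial [D]}
            &= \langle m \rangle_{nf,j} - \langle m \rangle_{f,j}
\end{align*}
```

When the states that dominate the two subensembles have similar m values, the two averages nearly cancel, which is how a residue can combine a large stability constant with an effective m value near zero.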
III. Hydrogen Exchange Protection Factors. While the residue stability constants are purely thermodynamic quantities defined for all residues, the protection factors also contain non-thermodynamic contributions and are defined only for a subset of residues. For example, proline residues lack the amide group and therefore are not included. From a statistical standpoint, the protection factor for any given residue j can be defined as the ratio of the sum of the probabilities of the states in which residue j is closed, to the sum of the probabilities of the states in which residue j is open:
\[ PF_j = \frac{\sum_{\text{states with residue } j \text{ closed}} P_i}{\sum_{\text{states with residue } j \text{ open}} P_i} = \frac{P_{closed,j}}{P_{open,j}} \tag{7} \]
It is obvious that not all residues that are folded are protected from exchange, since they can be exposed to the solvent in the native state or become exposed because adjacent or complementary surfaces become unfolded. The statistical definition of the protection factors has the same form as that of the stability constants (equation 1) and can be expressed in terms of the folding probabilities as follows:
\[ PF_j = \frac{P_{f,j} - P_{f,xc,j}}{P_{nf,j} + P_{f,xc,j}} \tag{8} \]
where the correction term P_f,xc,j is the sum of the probabilities of all states in which residue j is folded, yet exchange competent. It is evident that the hydrogen exchange protection factors, PF_j, are equal to the stability constants per residue, K_f,j, only when the P_f,xc,j terms are small.
The most common situations in which a residue is folded but exposed to the solvent occur when: 1) the amide group of the residue is exposed in the native state; and, 2) the amide group of the residue becomes exposed by being located in a region of the protein that is structurally complementary to an unfolded region. Of course, amide protons that exchange via different mechanisms (e.g. solvent penetration) will not be accounted for by this formalism. Structural thermodynamic analysis of the protection factors of several globular proteins (staphylococcal nuclease, hen egg white lysozyme, equine lysozyme, turkey ovomucoid third domain, BPTI) (Hilser & Freire, 1996a; Hilser et al., 1996) suggests that for most residues that show protection the contribution of P_f,xc,j is small and the protection factors are similar to the stability constants.
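The relation between the protection factor and the stability constant of a residue can be illustrated with a few lines of arithmetic. This is a minimal sketch based on the verbal definitions above: a residue counts as closed when it is folded and not exchange competent, and as open when it is either non-folded or folded but exchange competent. The function names and the probabilities used in the example are hypothetical.

```python
def stability_constant(p_folded):
    """K_f,j: probability that residue j is folded over probability that it is not."""
    return p_folded / (1.0 - p_folded)


def protection_factor(p_folded, p_folded_exchange_competent):
    """PF_j: closed over open populations for residue j.

    p_folded_exchange_competent is the summed probability of states in which
    the residue is folded yet exchange competent (the correction term).
    """
    if not 0.0 <= p_folded_exchange_competent <= p_folded < 1.0:
        raise ValueError("probabilities out of range")
    closed = p_folded - p_folded_exchange_competent
    opened = (1.0 - p_folded) + p_folded_exchange_competent
    return closed / opened


# Hypothetical residue: folded 99.9% of the time.
k_fj = stability_constant(0.999)
pf_small_correction = protection_factor(0.999, 1e-6)
pf_large_correction = protection_factor(0.999, 0.05)
```

With a negligible correction term the protection factor is essentially the stability constant; with a sizeable one the residue appears far less protected than its thermodynamic stability would suggest.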
Finally, the prediction of hydrogen exchange protection factors requires knowledge of the limiting exchange rates that can be measured under a given set of experimental conditions. This constraint sets a limit to the magnitude of the protection factors that can be determined for a given amino acid in the sequence. This is a purely experimental constraint, the magnitude of which depends on the actual experimental setup (Radford et al., 1992a). In our calculations, the expected exchange rates for each amide were estimated by using the intrinsic exchange rates calculated according to the method of Bai et al. (1993).
IV. The COREX Algorithm
Analysis of protein equilibrium in terms of the formalism described above involves an approximation of the ensemble of conformational states available to a protein. In our laboratory, the ensemble of partially folded states is approximated with the computer by using the high resolution structure as a template. In the COREX algorithm (Hilser & Freire, 1996a; Hilser & Freire, 1996b) the entire protein is considered as being composed of different folding units and partially folded states are generated by folding and unfolding those units in all possible combinations.
The division of the protein into a given number of folding units is called a partition. In order to maximize the number of distinct partially folded states, different partitions are included in the analysis. Each partition is defined by placing a block of windows over the entire sequence of the protein. The folding units are defined by the location of the windows irrespective of whether or not they coincide with specific secondary structure elements. By sliding the entire block of windows one residue at a time, different partitions of the protein are obtained. For two consecutive partitions the first and last amino acids of each folding unit are shifted by one residue. This procedure is repeated until the entire set of partitions has been exhausted. Usually, 50,000 - 150,000 states are generated for a typical globular protein.
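The partitioning scheme described above can be sketched in a few lines. The code below is an illustration of the idea, not the COREX implementation: the fixed window length, the way truncated units at the chain ends are handled, and the function names are all assumptions. Each partition is a set of contiguous folding units; every combination of folded and unfolded units gives one state, and sliding the block of windows by one residue gives the next partition.

```python
from itertools import product


def partition(n_residues, window, offset):
    """Folding units as half-open (start, end) ranges for one partition.

    offset shifts the whole block of windows; the first and last units are
    truncated at the chain ends (an assumption of this sketch).
    """
    bounds = [0]
    position = offset if offset > 0 else window
    while position < n_residues:
        bounds.append(position)
        position += window
    bounds.append(n_residues)
    return list(zip(bounds, bounds[1:]))


def enumerate_states(n_residues, window):
    """All distinct states over all partitions, as tuples of per-residue
    booleans (True = folded)."""
    states = set()
    for offset in range(window):
        units = partition(n_residues, window, offset)
        for mask in product((True, False), repeat=len(units)):
            state = []
            for (start, end), folded in zip(units, mask):
                state.extend([folded] * (end - start))
            states.add(tuple(state))
    return states


toy_states = enumerate_states(n_residues=12, window=4)
```

The number of states doubles with every additional folding unit, so for a real protein the window length controls how large the ensemble becomes; states shared by several partitions, such as the fully folded and fully unfolded ones, are stored only once.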
Each of the states generated by the COREX algorithm is characterized by having some regions folded and some other regions unfolded. There are two basic assumptions in this algorithm: 1) The folded regions in partially folded states are native-like; and, 2) the unfolded regions are assumed to be devoid of structure. While these assumptions appear drastic at first, it has been shown that the resulting ensemble accounts well for hydrogen exchange protection patterns, suggesting that non-native-like intermediates have vanishingly small probabilities and do not contribute measurably to the experimental values under normal equilibrium conditions. The Gibbs energy (ΔG) and probability of each state (equation 2) are calculated using a structural parametrization of the folding energetics as described elsewhere (Hilser & Freire, 1996a; Hilser et al., 1996).
V. The Pattern of Hydrogen Exchange Protection for Staphylococcal Nuclease
Figure 1 shows the predicted and experimental hydrogen exchange protection factors for Staphylococcal Nuclease (SNase).
Figure 1. Comparison of predicted and experimental (Loh et al., 1993) protection factors at 37°C, plotted against residue number. For better comparison, the negative value of the experimental protection factors has been plotted in the figure. Shown at the top of both panels are the corresponding elements of secondary structure (adapted from Hilser & Freire, 1996b).
Inspection of the figure reveals three major protection levels. The highest ln PF_j values correspond to the second and third β strands (residues 21-39), the central residues of the fourth β strand (residues 73-75) and the last portion of the fifth β strand through the second α helix (residues 91-106). Within this level, values are higher still for the highly hydrophobic clusters Leu36, Leu37, Leu38 and Val39 in β3 and Ala102, Leu103 and Val104 in α2. The second level corresponds to the first β strand and the adjacent turns (residues 10-20), the second half of the first α helix (residues 62-68) along with the beginning of the fourth β strand (residues 71-73), and the region from the loop following the second α helix through the third helix (residues 107-135). The third level corresponds to the amino and carboxyl terminal residues (7-10 and 136-141) and the loop region from residue 41 to 53, the first half of the first α helix (residues 54-61) and the loop region defined by residues 77-89.
Of the 49 protected residues, 44 are correctly predicted to exhibit protection. In addition 62 are correctly predicted to show no protection: 6 are prolines, 26 are solvent accessible, and 30 (residues 9, 10, 35, 41, 44-46, 49, 50, 52, 54, 55, 57-60, 77-80, 83, 85-88, 118, 119, 121, 138, and 139) are predicted to have protection factors below the experimental limit of detection. This relatively large number of residues beyond experimental detection is primarily due to the high temperature (37°C) at which the experiments were performed (Loh et al., 1993). This gives a total of 100 residues (excluding prolines) or 78% for which the prediction matches the experimental results. Of the 29 mispredictions, the vast majority (24) represent cases in which protection was predicted but not observed. This pattern suggests that many of those residues may indeed be thermodynamically stable but able to exchange by a different mechanism. For those residues that exhibit protection, the average difference between predicted and experimental protection factors, expressed as differences in the apparent free energies per residue, amounts to 0.3 ± 0.6 kcal/mol.
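The tally quoted above can be reproduced with simple arithmetic. The short check below uses only the counts given in the text; the one assumption, stated in the comments, is that the 100 matching residues and the 29 mispredictions together make up the set of residues that were evaluated.

```python
# Counts quoted in the text.
protected_correct = 44
unprotected_correct = {
    "proline": 6,
    "solvent_accessible": 26,
    "below_detection_limit": 30,
}
mispredicted = 29

unprotected_total = sum(unprotected_correct.values())              # 62
# Prolines have no amide proton, so they are left out of the score.
matching = protected_correct + unprotected_total - unprotected_correct["proline"]

# Assumption of this sketch: matches plus mispredictions form the evaluated set.
evaluated = matching + mispredicted
percent_matching = 100.0 * matching / evaluated
```

Under that assumption the evaluated set contains 129 residues and the matching fraction rounds to the 78% reported.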
VI. Predicted Temperature Dependence of Hydrogen Exchange Protection for Staphylococcal Nuclease
Figure 2 shows the temperature dependence of the predicted individual folding constants for SNase. When the individual ln(1/K_f,j) values of each residue are plotted against the inverse temperature, a noticeable trend emerges. Specifically, there are groups of residues which show identical ln(1/K_f,j) values and identical temperature dependencies, suggesting that these residues define cooperative folding units. These predicted trends agree with experimental results obtained for other proteins and reproduce the general behavior described by Bai et al. (1995).
Figure 2. The temperature dependence of the natural logarithm of the apparent residue stability constants (horizontal axis: temperature, °C; labeled curves: β2, β3, β5 and α2; residues 112-122; residues 77-89). For clarity a single line is shown for groups of residues that exhibit similar behavior.
VII. Predicted Urea Dependence of Hydrogen Exchange Protection for Staphylococcal Nuclease
The urea concentration dependence of the natural logarithm of the apparent residue stability constants is shown in Figure 3.
Figure 3. The urea concentration dependence of the natural logarithm of the apparent residue stability constants (horizontal axis: [Urea]). For clarity a single line is shown for groups of residues that exhibit similar behavior. These calculations simulate the energetic conditions at 25°C where higher resolution between protection factors is expected (adapted from Hilser & Freire, 1996a).
In general, as the urea concentration increases and the stability of the protein diminishes, the magnitude of the stability constants decreases. At low urea concentration, the rate of decrease is not the same for all residues; several groups of residues with similar stability constants and similar m values can be recognized, as shown in the figure. At increasing urea concentrations the stability constants progressively merge into a single curve characterized by the parameters corresponding to the global unfolding of the protein. This is the same type of behavior observed experimentally for the denaturant dependence of the protection factors (Bai et al., 1995), which have been used to define cooperative folding units or partially unfolded forms (PUFs) (Bai et al., 1995).
As shown in the figure, the β barrel, particularly strands 2, 3 and 5, as well as α-helix 2 define the group of residues with the highest stability constants and m values. These residues define the most stable core of SNase. The unfolding of these residues only occurs by complete unfolding of the protein. α-helix 1, the first β strand and the loop region defined by residues 112-122 come next, followed by the loop region defined by residues 77-89. The loop between β strand 3 and α-helix 1 (residues 42-57) is unstable at all urea concentrations and is not shown in the figure.
VIII. Conclusions
The agreement between predicted and experimental hydrogen exchange protection factors for SNase and other proteins (Hilser & Freire, 1996a; Hilser & Freire, 1996b; Hilser & Freire, 1996c) suggests that the probability distribution of partially folded states generated with the COREX algorithm mimics the general features of the ensemble of conformations under native conditions and that this approach can be used to examine general aspects of the equilibrium ensemble of partially folded states. It has also been shown that this approach accounts for the cooperativity of folding/unfolding transitions and successfully predicts the apparent two-state behavior observed in temperature and denaturant induced folding/unfolding reactions.
Acknowledgments
This work was supported by grants RR04328 and GM51362 from the National Institutes of Health.
References
Bai, Y., Milne, J. S., Mayne, L. & Englander, S. W. (1993). Primary structure effects on peptide group hydrogen exchange. Proteins 17, 75-86.
Bai, Y., Sosnick, T. R., Mayne, L. & Englander, S. W. (1995). Protein Folding Intermediates: Native-State Hydrogen Exchange. Science 269, 192-197.
Freire, E. (1995). Thermodynamics of Partly Folded Intermediates in Proteins. Ann. Rev. of Biophys. and Biomolec. Struct. 24, 141-165.
Hilser, V. J. & Freire, E. (1996a). Predicting the Equilibrium Protein Folding Pathway: Structure-Based Analysis of Staphylococcal Nuclease. Proteins In Press.
Hilser, V. J. & Freire, E. (1996b). Structure based calculation of the equilibrium folding pathway of proteins. Correlation with hydrogen exchange protection factors. J. Mol. Biol. In Press.
Hilser, V. J. & Freire, E. (1996c). Structure-Based Statistical Thermodynamic Analysis of T4 Lysozyme Mutants: Structural Mapping of Cooperative Interactions. Biophysical Chem.
Hilser, V. J., Gomez, J. & Freire, E. (1996). The Enthalpy Change in Protein Folding and Binding. Refinement of Parameters for Structure Based Calculations. Proteins In Press.
Jacobs, M. D. & Fox, R. O. (1994). Staphylococcal nuclease folding intermediate characterized by hydrogen exchange and NMR spectroscopy. Proc. Natl. Acad. Sci. USA 91, 449-453.
Jennings, P. A. & Wright, P. E. (1993). Formation of a Molten Globule Intermediate Early in the Kinetic Folding Pathway of Apomyoglobin. Science 262, 892-896.
Kim, K.-S., Fuchs, J. A. & Woodward, C. K. (1993). Hydrogen exchange identifies native-state motional domains important in protein folding. Biochemistry 32, 9600-9608.
Kim, K.-S. & Woodward, C. (1993). Protein internal flexibility and global stability: Effect of urea on hydrogen exchange rates of bovine pancreatic trypsin inhibitor. Biochemistry 32, 9609-9613.
Kuwajima, K., Nitta, K., Yoneyama, M. & Sugai, S. (1976). Three-state denaturation of α-lactalbumin by guanidine hydrochloride. J. Mol. Biol. 106, 359-373.
Loh, S. N., Prehoda, K. E., Wang, J. & Markley, J. L. (1993). Hydrogen exchange in unligated and ligated staphylococcal nuclease. Biochemistry 32, 11022-11028.
Morozova, L. A., Haynie, D. T., Arico-Muendel, C., Van Dael, H. & Dobson, C. M. (1995). Structural Basis of the Stability of a Lysozyme Molten Globule. Nature Structural Biology 2, 871-875.
Privalov, P. L. (1979). Stability of Proteins: Small Globular Proteins. Adv. Protein Chem. 33, 167-239.
Privalov, P. L. (1982). Stability of Proteins: Proteins Which Do Not Present a Single Cooperative System. Adv. Protein Chem. 35, 1-104.
Radford, S. E., Buck, M., Topping, K. D., Dobson, C. M. & Evans, P. A. (1992a). Hydrogen exchange in native and denatured states of hen egg-white lysozyme. Proteins 14, 237-248.
Radford, S. E., Dobson, C. M. & Evans, P. A. (1992b). The Folding of Hen Lysozyme Involves Partially Structured Intermediates and Multiple Pathways. Nature 358, 302-307.
Roder, H., Elove, G. A. & Englander, S. W. (1988). Nature 335, 700-704.
Schulman, B. A., Redfield, C., Peng, Z., Dobson, C. M. & Kim, P. S. (1995). Different subdomains are most protected from hydrogen exchange in the molten globule and native states of human α-lactalbumin. J. Mol. Biol. 253, 651-657.
Udgaonkar, J. B. & Baldwin, R. L. (1988). NMR evidence for an early framework intermediate on the folding pathway of ribonuclease A. Nature 335, 694-699.
Woodward, C. (1993). Is the slow-exchange core the protein folding core? TIBS 18, 359-360.
An Evaluation Of Protein Secondary Structure Prediction Algorithms
Georgios Pappas Jr. and Shankar Subramaniam
Department of Physiology and Biophysics, Beckman Institute, University of Illinois, Urbana, Illinois 61801
I. Introduction
Over the past years several algorithms, based on very distinct theoretical approaches, have been developed to predict the secondary structure of proteins (Fasman, 1989; Eisenhaber et al., 1995). The list keeps growing as authors seek to improve the predictive power of their methods. This continuing search reflects the current state of prediction accuracy, which is still too low to support reasonable inferences about tertiary structure, although this does not invalidate the use of these methods as rough starting points for modeling purposes (Rost and Sander, 1995; Schultz, 1987). The level of success in the predictions reported by the authors ranges from 60% to 72%. Most of these results are overestimates, owing to incomplete cross-validation (Holley and Karplus, 1989) or to the lack of a reasonable number of test cases (Burgess et al., 1974). All the methods developed so far try to extract information, directly or indirectly (Lim, 1974), from the ever growing databases of protein structures solved by X-ray crystallography. Unfortunately, the rate at which new structures are added to the structure databases is far from optimal. Chothia (1992) estimated that all proteins, when their structures are known, would fall into about one thousand folding classes, more than half of them yet to be discovered. If so, a great deal of the information contained in forthcoming structures is not yet available to the current methods, and a coherent and realistic increase in the accuracy of secondary structure prediction methods must await those structures. Comparative analyses of the performance of various algorithms have been carried out in the past (Kabsch and Sander, 1983). However, this task can be deceptive if the proteins for the testing set and the scoring index are not chosen properly.
The present work aims to provide an updated evaluation of several predictive methods with a testing set large enough to yield more accurate statistics, which in turn can measure the usefulness of the information gathered by those methods and identify trends that characterize the behavior of individual algorithms. Further, we present a uniform testing of these methods vis-à-vis the size of the datasets, the measure of accuracy and proper cross-validation procedures.
II. Material And Methods
A. Secondary Structure Prediction Algorithms
Algorithms for secondary structure prediction are based upon diverse theoretical approaches. There are three mainstream classes of methods (Garnier and Levin, 1991):
1. Statistical: These rely on the assumption that amino acids have intrinsic propensities to form a specific type of secondary structure. This information is collected by analysis of proteins with known tertiary structure, using either simple statistical principles (Chou and Fasman, 1974) or more elaborate ones such as information theory (Garnier et al., 1978).
2. Neural networks: These are highly nonlinear pattern recognition devices that try to mimic the organization of nervous systems. They are trained by adaptively learning a set of patterns and can extract high order features of the input space, with the ability to generalize to unknown input events.
3. Sequence similarity: These are based on the comparison of the protein to be predicted with an available database of known structures. The prediction is made by assigning the secondary structure of the fragment in the database which displays the most sequence similarity with a segment in the test protein.
Other variations utilizing different methodologies often appear, but with no relative gain in accuracy. These include methods based on hidden Markov models (Sasagawa and Tagima, 1993), stereo-chemical principles (Lim, 1974) and statistical mechanics (Ptitsyn and Finkelstein, 1983), just to cite a few. From a large list of available algorithms for secondary structure prediction, nine were selected to represent the main classes described above. They were chosen mainly because they are the most often cited in the literature and because they permit a relatively safe implementation as a computer program. The selected methods are summarized in table I.
Table I. List of secondary structure prediction methods utilized

Method Code   Type                  Reference
BPS           Statistical           Burgess et al. (1974)
C_F           Statistical           Chou and Fasman (1974)
D_R           Statistical           Deléage and Roux (1987)
G_G           Statistical           Gascuel and Golmard (1988)
GGR           Information Theory    Gibrat et al. (1989)
GOR           Information Theory    Garnier et al. (1978)
H_K           Neural Networks       Holley and Karplus (1989)
L_G           Sequence Similarity   Levin and Garnier (1988)
Q_S           Neural Networks       Qian and Sejnowski (1988)
A software package called MultPred (Multiple Predictions) was developed in the C++ language. It implements all the methods except C_F (Chou and Fasman, 1974) and GGR (Gibrat et al., 1987), which were taken from the program ANTHEPROT (Deléage and Roux, 1989). Additionally, a joint prediction scheme (JOI, Joint prediction) was utilized, in which the predictions from the different methods analyzed were combined and the structure predicted by the majority was assigned to the respective residue, in the same fashion implemented by Nishikawa and Noguchi (1991). Some methods provide three-state prediction (i.e., locating helices, sheets and coil regions) and others four-state prediction (the former plus β-turns). For the present analysis only three-state predictions were analyzed, and the four-state predictions were transformed to three-state ones by assigning the coil state to predicted turn regions. In the case of the BPS method (Burgess et al., 1974) the secondary structural propensities of the amino acids were recalculated following the paper, because the original values were based on just 9 proteins. For the L_G method (Levin and Garnier, 1988) the database used to make the prediction was the database used in this work (see below) instead of the original given in the paper. This was necessary to avoid over-predictions owing to high sequence homology between the two datasets. However, we note that the L_G parameters are not optimized for the current protein database.
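The two post-processing steps described above, reducing four-state predictions to three states and combining the methods by majority vote, can be sketched as follows. This is an illustrative Python sketch, not the MultPred code: the one-letter state codes (H, E, C, T) and the rule of assigning coil when the vote is tied are assumptions made here, since the text does not specify how ties were resolved.

```python
from collections import Counter


def to_three_state(prediction):
    """Map a four-state prediction (H, E, C, T) to three states by
    assigning coil (C) to predicted turn (T) residues."""
    return prediction.replace("T", "C")


def joint_prediction(predictions):
    """Per-residue majority vote over several three-state predictions.

    Ties are resolved as coil, which is an assumption of this sketch.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    joint = []
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        winners = [state for state in "HEC" if counts.get(state, 0) == best]
        joint.append(winners[0] if len(winners) == 1 else "C")
    return "".join(joint)


# Hypothetical predictions from three methods for a four-residue segment.
example = joint_prediction(["HHEC", "HHCC", to_three_state("HETC")])
```

With an odd number of methods and three states a tie is still possible (for example one vote each for helix, sheet and coil), which is why an explicit tie rule is needed.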
B. Protein Database Selection
The testing set used is composed of 148 proteins with resolution better than 2.4 Å and less than 25% sequence homology with each other (Hobohm et al., 1992). The total number of residues analyzed was 36229. Secondary structure assignments were taken from the DSSP program (Kabsch and Sander, 1983a) and were transformed to a three-state form according to the rules given by Levin and Garnier (1988). Protein class assignments were based on the SCOP database (Murzin et al., 1995), dividing the set of proteins into 31 all-alpha, 31 all-beta, 51 alpha/beta, 21 alpha+beta and 14 irregular or multi-domain. The relative secondary structure composition for each class is given below:
Table II. Relative composition in terms of structure for the current database distributed in terms of protein classes

Protein Class   α-helix (%)   β-sheet (%)   Coil (%)   Number of proteins
All Proteins    31.93         19.88         48.19      148
All-Alpha       56.74          3.22         40.04       31
All-Beta         8.64         38.39         52.98       31
Alpha/Beta      35.78         17.65         46.57       51
Alpha+Beta      26.63         23.77         49.60       21
C. Accuracy Measurements
One key element in the performance analysis of secondary structure prediction methods is the proper selection of the accuracy measurement to be employed. Three different types of predictive accuracy measurements were used (Schultz and Schirmer, 1979):
1. Q3: The number of correctly predicted residues divided by the total number of amino acids in the sequence. This index often produces high values because it does not penalize over-predictions.
2. Matthews correlation coefficient (C_s): The correlation coefficient between predicted and observed assignments, taking into account both positive and negative predictions. This index is specific to a structure type s and is given by
\[ C_s = \frac{w x - y z}{\sqrt{(x + y)(x + z)(w + y)(w + z)}} \]
where
s = α, β or coil
w = number of residues correctly predicted to be in structure s
x = number of residues correctly predicted not to be in structure s
y = number of residues under-predicted for structure s
z = number of residues over-predicted for structure s
3. Entropy-related information: This measure was introduced by Rost and Sander (1993) and is related to the probability of deviation between a random prediction and the actual prediction. The value of this index is affected by over- and under-predictions, which is not the case for Q3. Therefore it provides a more reliable estimate of the significance of the accuracy. The formula is given by
\[ \mathrm{Info} = 1 - \frac{\sum_{i=1}^{3} a_i \ln a_i - \sum_{i,j=1}^{3} A_{ij} \ln A_{ij}}{N \ln N - \sum_{i=1}^{3} b_i \ln b_i} \]
where
N = number of residues
a_i = number of residues predicted to be in secondary structure i
b_i = number of residues observed to be in secondary structure i
A_ij = number of residues predicted to be in i and observed to be in j
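The three indices defined in this section can be computed directly from a predicted and an observed state string. The following is a minimal sketch, assuming one-letter state codes (H, E, C) and the convention that the correlation coefficient and the information value are reported as zero when their denominators vanish; neither convention is stated in the text.

```python
import math

STATES = "HEC"  # helix, strand, coil (one-letter codes are an assumption)


def q3(pred, obs):
    """Percentage of residues whose predicted state equals the observed one."""
    return 100.0 * sum(p == o for p, o in zip(pred, obs)) / len(obs)


def matthews(pred, obs, s):
    """Matthews correlation coefficient for structure type s."""
    pairs = list(zip(pred, obs))
    w = sum(p == s and o == s for p, o in pairs)   # correct positives
    x = sum(p != s and o != s for p, o in pairs)   # correct negatives
    y = sum(p != s and o == s for p, o in pairs)   # under-predicted
    z = sum(p == s and o != s for p, o in pairs)   # over-predicted
    denom = math.sqrt((x + y) * (x + z) * (w + y) * (w + z))
    return 0.0 if denom == 0 else (w * x - y * z) / denom


def _xlnx(value):
    return value * math.log(value) if value > 0 else 0.0


def information(pred, obs):
    """Entropy-related information index (1 = perfect, 0 = uninformative)."""
    n = len(obs)
    a = [pred.count(s) for s in STATES]            # predicted totals
    b = [obs.count(s) for s in STATES]             # observed totals
    table = [sum(p == i and o == j for p, o in zip(pred, obs))
             for i in STATES for j in STATES]      # A_ij counts
    numerator = sum(_xlnx(v) for v in a) - sum(_xlnx(v) for v in table)
    denominator = _xlnx(n) - sum(_xlnx(v) for v in b)
    return 0.0 if denominator == 0 else 1.0 - numerator / denominator


# Hypothetical 16-residue chain that is 50% coil.
observed = "HHHHEEEECCCCCCCC"
all_coil = "C" * len(observed)
```

For the hypothetical chain above, predicting every residue as coil gives a Q3 equal to the coil content, while the helix and strand correlation coefficients and the information value are zero. This is the behavior that makes Q3 alone a weak indicator of predictive success.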
D. Multidimensional Scaling
To help visualize how the secondary structure prediction methods relate to each other, the statistical technique called multidimensional scaling (MDS) was utilized. Given a matrix of dissimilarities between objects (in our case, the algorithms for secondary structure prediction), MDS finds a low-dimensional (here 2-dimensional) representation of the data, with one point representing each object, such that the distances in the new coordinate system match as well as possible the original distances provided in the dissimilarity matrix (Cox and Cox, 1994). For this kind of analysis one of the crucial steps is the definition of the dissimilarity between two predictive algorithms. Q3 values and Matthews' correlation coefficients were calculated for all proteins in the database, resulting in an accuracy vector for each predictive method (^mX_l, where m = predictive method and l = Q3, Cα, Cβ or Cc). Given those vectors, dissimilarity matrices were calculated for each accuracy index over all predictive methods using Guttman's μ2 coefficient (Guttman, 1968), which is a measure of simple monotonic relationship between variables and is given by
\[ \mu_{r,s} = \frac{\sum_{i} \sum_{j} ({}^{r}X_i - {}^{r}X_j)({}^{s}X_i - {}^{s}X_j)}{\sum_{i} \sum_{j} |{}^{r}X_i - {}^{r}X_j| \, |{}^{s}X_i - {}^{s}X_j|} \]
μ_r,s = dissimilarity between methods r and s; ^mX_i = accuracy index of method m for protein i.
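The dissimilarity step can be sketched with Guttman's coefficient of weak monotonicity, computed between the per-protein accuracy vectors of two methods. The code below is an illustration under stated assumptions: the coefficient is taken in its usual form (signed products of pairwise differences over their absolute values), and converting it to a dissimilarity as 1 minus the coefficient is a choice made for this sketch, not something stated in the text. The accuracy values in the example are invented.

```python
def guttman_mu2(x, y):
    """Guttman's weak monotonicity coefficient between two equal-length
    sequences; +1 for a monotone increasing relation, -1 for decreasing."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    numerator = 0.0
    denominator = 0.0
    for i in range(len(x)):
        for j in range(len(x)):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            numerator += dx * dy
            denominator += abs(dx) * abs(dy)
    if denominator == 0:
        raise ValueError("coefficient undefined for constant vectors")
    return numerator / denominator


def dissimilarity_matrix(accuracy):
    """Pairwise dissimilarities between methods, taken here as 1 - mu2
    (an assumption of this sketch). accuracy maps method code -> list of
    per-protein accuracy values for one index."""
    methods = sorted(accuracy)
    return {(r, s): 1.0 - guttman_mu2(accuracy[r], accuracy[s])
            for r in methods for s in methods}


# Invented per-protein Q3 values for three hypothetical methods.
toy = {"M_1": [50.0, 60.0, 70.0], "M_2": [52.0, 58.0, 75.0], "M_3": [70.0, 61.0, 49.0]}
toy_matrix = dissimilarity_matrix(toy)
```

A matrix of this kind is the input that a nonmetric multidimensional scaling routine would then embed in two dimensions.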
III. Results And Discussion
A. Analysis Of Predictive Accuracy
The first step in the analysis was to obtain the secondary structure prediction for the 148 proteins in the test database with the selected methods. The accuracy results in terms of the Q3 index can be examined in table III.

Table III. Newly calculated and reported accuracies in the original papers in terms of Q3

Method   New Accuracy (Q3 %)   Original Accuracy (Q3 %)
BPS      53.1 ± 7.5            61.3
C_F      55.5 ± 8.7            59.2
D_R      59.6 ± 9.4            61.3
G_G      55.3 ± 7.5            62.3
GGR      60.4 ± 8.1            63.0
GOR      57.4 ± 8.4            58.0
H_K      60.1 ± 8.4            63.2
L_G      55.2 ± 9.7            63.0
Q_S      59.6 ± 8.7            64.3
The Q3 values are clearly lower than the ones claimed by the authors in the original papers. This can be attributed to two factors: poor statistics as a consequence of the low number of test proteins, and lack of cross-validation of the results. The discrepancy between the reported values and the newly calculated ones is variable, indicating different degrees of prediction generalization attained by each method. Additionally, it must be kept in mind that the database used in this work may contain proteins used in the training set of some of the methods, which induces an overestimation of the accuracies. Nevertheless, the accuracy decreased for every method. Although Q3 is the most popular and widespread measure, it suffers from serious problems in terms of providing a reliable and significant accuracy estimate. The main drawback of Q3 is that it does not take into account under- and over-predictions, and so fails to capture the real significance of the results. For example, if we predict all the residues in the test database as being coil, an average Q3 value of 48.19% is obtained, but the correlation coefficients and information values will be null. As an alternative way to analyze the accuracies it is possible to use the average Matthews' correlation coefficients and information values reported in table IV for all predicted proteins. These two measures are rarely used in the secondary structure prediction literature, despite their obvious superiority over Q3. In one of the few publications that utilize Matthews' correlation coefficients, Holley and Karplus (1989) reported values of Cα=41%, Cβ=32% and Cc=36%. In the new analysis those values were appreciably lower (Cα=32%, Cβ=25% and Cc=31%), clearly indicating that the method generalizes poorly to a larger set of proteins.
It also strengthens an important fact for secondary structure evaluation, already noted by Rost and Sander (1994), which is the need for a testing set that is representative in terms of size and structural composition and that permits gathering reliable statistical information from the results.

Table IV. Average Matthews' correlation coefficients and information values for the 148 chains in the testing set. Standard deviation values are shown in parentheses

Method   Cα (%)          Cβ (%)          Cc (%)          Information (%)
BPS      24.08 (16.51)   13.73 (14.22)   20.00 (12.56)    8.23 (5.34)
C_F      24.33 (18.61)   21.24 (17.12)   29.25 (12.38)   11.10 (6.69)
D_R      29.15 (14.98)   23.27 (16.68)   36.58 (11.65)   12.85 (6.30)
G_G      20.68 (13.49)   28.23 (18.04)   23.32 (11.27)    9.63 (5.07)
GGR      32.15 (16.92)   26.87 (17.98)   36.17 (11.87)   14.08 (6.68)
GOR      29.31 (17.32)   26.16 (17.35)   34.65 (11.66)   13.71 (6.75)
H_K      31.96 (19.66)   24.62 (16.69)   30.99 (13.10)   13.01 (7.34)
L_G      30.56 (12.75)   25.64 (17.54)   33.67 (10.22)   12.69 (5.31)
Q_S      28.66 (19.34)   22.63 (17.46)   33.31 (13.21)   12.85 (9.02)
JOI      34.68 (18.71)   27.67 (17.86)   35.44 (12.56)   14.97 (8.32)
To further extend the analysis, accuracies were measured in terms of correlation coefficients and information independently for each protein structural class, in order to check whether there are biases particular to specific chain folds. The results are shown in figures 1 and 2.
Figure 1. Predictive accuracy in terms of information and Cα (%), plotted by algorithm code (BPS, C_F, D_R, G_G, GGR, GOR, H_K, L_G, Q_S, JOI). The values are averaged over the respective class of proteins.
Figure 2. Variation of predictive accuracies (average) according to the protein class as measured by the Cβ (%) and Cc (%) values, plotted by algorithm code (BPS, C_F, D_R, G_G, GGR, GOR, H_K, L_G, Q_S, JOI).
The first observation of this kind of analysis is that for all types of measures utilized the behavior of predictive methods varies significantly according to the protein fold family. This can be relevant in pointing out what method performs better for the prediction of a determined structural element depending on the protein class. Conversely, it also is possible to diagnose critical points where the algorithms fail. From the information values in figure 1 it is possible to observe that in general the prediction in all alpha and alpha+beta classes has a greater success than for all beta and alpha/beta classes. Also it shows a greater variability between predictive accuracies of the methods among each other as well as between the fold classes individually. Figures 1 and 2 reinforce this view with the use of Mathews' correlation coefficients for each type of secondary structure. However, a more striking observation arises. When the prediction is done for all-beta proteins the Ca is extremely low (<16% for all methods) and this is more pronounced for Cp when predicting all-alpha proteins (<7%). This means that when one has a protein dominated by just one type of secondary structure element (all-alpha or all-beta), the prediction of a structure of the opposite type (like beta-sheets in all-alpha proteins) is an almost random event. The correlation coefficients for coil structures although showing a better performance for all alpha proteins does not show a pronounced variation as for Ca and Cp. Also it is possible to identify why generally the methods perform better for all alpha and alpha+beta proteins. For all alpha proteins the good performance is due more to the correct prediction of coils than the a-helical residues. However, those two combine to give a fair performance. In the case of alpha+beta proteins all methods display a better than average performance for both a-helices and p-strands. 
This may be due to the segregation of the domains, which may give each of them a homogeneous character that the methods capture successfully. For the all-beta proteins, Cα is so low that it decreases the overall performance of the algorithms for this class. The question now shifts to which method performs best. This is a complex matter because, even with the same testing set and accuracy measures, the results are not totally comparable. Nevertheless, one can observe that the joint prediction (JOI) is consistently the most successful in terms of Cα and information index. Individually, the Q_S method (Qian and Sejnowski, 1988) performs well for all-alpha and alpha+beta proteins, while GOR (Garnier et al., 1978) is indicated for all-beta proteins and GGR (Gibrat et al., 1987) has an edge for alpha/beta proteins.
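As a minimal sketch of how the per-state Matthews' correlation coefficients (Cα, Cβ, Cc) and Q3 discussed above can be computed from an observed and a predicted secondary structure string, consider the following. The three-letter alphabet (H, E, C), the function names and the toy strings are assumptions made for illustration; they are not data from this study.

```python
import math


def matthews_coefficient(observed, predicted, state):
    """Matthews' correlation coefficient for one secondary structure state."""
    tp = tn = fp = fn = 0
    for obs, pred in zip(observed, predicted):
        if pred == state and obs == state:
            tp += 1  # state predicted and observed
        elif pred != state and obs != state:
            tn += 1  # state neither predicted nor observed
        elif pred == state:
            fp += 1  # state predicted but not observed
        else:
            fn += 1  # state observed but not predicted
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # The coefficient is undefined when a margin is empty; return 0 by convention.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def q3(observed, predicted):
    """Fraction of residues whose three-state assignment is predicted correctly."""
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)


# Hypothetical toy example (H = helix, E = strand, C = coil).
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHCCCCEEECCC"

c_alpha = matthews_coefficient(observed, predicted, "H")
c_beta = matthews_coefficient(observed, predicted, "E")
c_coil = matthews_coefficient(observed, predicted, "C")
accuracy = q3(observed, predicted)
```

Note that for a protein with no observed residues in a given state the denominator vanishes, which is one reason the coefficient for the minority structure type is fragile in the all-alpha and all-beta classes.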
B. Multidimensional Scaling
In order to provide an easy way to visualize how the methods differ from each other, the accuracy values (Q3, Cα, Cβ and Cc) for each method were subjected to nonmetric multidimensional scaling analysis (Cox and Cox, 1994). The resulting graphs, shown in Figure 3, represent the methods in a two-dimensional space. As can be observed, the interrelationship between the methods
Figure 3. Multidimensional scaling analysis of the dissimilarities between accuracies of different protein secondary structure prediction methods. Panels: Matthews' Cα, Matthews' Cβ, Matthews' Cc, and Q3; the horizontal axis is Axis 1. The method codes can be found in Table I.
varies appreciably with the accuracy measure used. This brings up an important observation about the accuracy indices themselves: they extract different pieces of information about the accuracy of the methods, and they expose the fact that the predictive methods work differently to attain the same goal, which is not necessarily the optimal one. Additionally, it indicates that when reporting the performance of a method it is imperative to use several evaluation indices in order to provide a less biased estimate of efficacy. The distance between points in the graphs represents the dissimilarity of the methods, within the statistical error of the construction. Therefore, clusters of methods indicate similar performance for the specific index, whereas points far apart indicate that the methods diverge in predictive terms. One observation is that the methods JOI, H_K and GGR cluster together for the indices Cα, Cβ and Cc, suggesting that they behave similarly. Incidentally, those three are among the best in terms of accuracy. Another observation is that methods that share the same theoretical framework, like H_K and Q_S (both neural-network based), lie fairly close together (except perhaps in the case of Cα), maybe because they extract complementary information from the training set, even though the sets of proteins used are different.
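The input to a multidimensional scaling analysis is a matrix of pairwise dissimilarities between the methods. A minimal sketch of how such a matrix might be built is given below. The use of Euclidean distance, the function name, the method labels M1 to M3 and the accuracy vectors are all illustrative assumptions (the numbers are invented placeholders, not results from this study), and the nonmetric scaling step itself (Cox and Cox, 1994) is not reproduced here.

```python
import math


def dissimilarity_matrix(profiles):
    """Pairwise Euclidean distances between accuracy profiles.

    `profiles` maps a method label to a list of accuracy values,
    for example one value per protein fold class.
    """
    names = sorted(profiles)
    matrix = {}
    for a in names:
        for b in names:
            matrix[(a, b)] = math.dist(profiles[a], profiles[b])
    return names, matrix


# Hypothetical accuracy profiles for three unnamed methods.
profiles = {
    "M1": [0.30, 0.10, 0.25, 0.20],
    "M2": [0.33, 0.14, 0.25, 0.20],
    "M3": [0.10, 0.30, 0.15, 0.05],
}

names, matrix = dissimilarity_matrix(profiles)
```

In this toy case M1 and M2 lie close together and M3 lies far from both, which is the kind of relationship that a two-dimensional scaling plot is meant to make visible.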
IV. Conclusions
The present analysis might give rise to a somewhat pessimistic view of the effectiveness of protein secondary structure prediction algorithms. In fact, with the increasing number of proteins with known three-dimensional structure, constant reevaluation of performance must take place in order to ascertain the validity of the methods. We note that the methods do not have the predictive power claimed by their authors when analyzed consistently using the 148 proteins selected in this study. Moreover, the situation is even worse for the Matthews correlation coefficient, which indicates that the predictions are poorly correlated with the actual structure. The inherent variability of predictive success rate depending on the protein fold class brings important observations: (1) when reporting accuracies, the selection of the test set proteins should be balanced in order to include a representative number of each of the protein fold classes; (2) prior knowledge of the protein fold class (Chou and Zhang, 1995) can be a valuable aid for the predictions, and with it one can use the different algorithms in combination to predict a specific structural element of the chain. The apparent failure of the prediction methods can be explained by the fact that they take into account just short-range interactions, which are strongly influenced by the number of proteins in the training process. It seems that this type of statistics tends to reach a plateau with the increasing number of structures available, ruling out the existence of absolute structural propensities for individual amino acids.
Acknowledgments
This work was supported by an NSF grant (ASC89-02829) to SS, a fellowship from the Brazilian government (CNPq) to GP and a computational grant from the National Center for Supercomputing Applications (NCSA). We also thank Drs. A.F.P. de Araujo and M.M. Ventura for helpful discussions.
References
Bohm, G. (1996). Biophys. Chem., 59, 1-32.
Burgess, A.W., Ponnuswamy, P.K. and Scheraga, H.A. (1974). Israel J. Chem., 12, 239-286.
Chothia, C. (1992). Nature, 357, 543-544.
Chou, P.Y. and Fasman, G.D. (1978). Adv. Enzymol., 13, 211-215.
Cox, T.F. and Cox, M.A.A. (1994). "Monographs on statistics and applied probability 59". Chapman and Hall: New York.
Deleage, G., Clerc, F.F., Roux, B. and Gautheron, D.C. (1988). CABIOS, 4 (3), 351-356.
Deleage, G. and Roux, B. (1987). Prot. Eng., 1 (4), 289-294.
Eisenhaber, F., Persson, B. and Argos, P. (1995). CRC Crit. Rev. Biochem., 30 (1), 1-94.
Garnier, J. and Levin, J.M. (1991). CABIOS, 7, 133-142.
Garnier, J., Osguthorpe, D.J. and Robson, B. (1978). J. Mol. Biol., 120, 97-120.
Gascuel, O. and Golmard, J.L. (1988). CABIOS, 4, 357-365.
Gibrat, J.-F., Garnier, J. and Robson, B. (1987). J. Mol. Biol., 198, 425-443.
Guttman, L. (1968). Psychometrika, 33, 469-506.
Hobohm, U., Scharf, M., Schneider, R. and Sander, C. (1992). Protein Sci., 1, 409-417.
Holley, L.H. and Karplus, M. (1989). Proc. Natl. Acad. Sci. USA, 86, 152-156.
Kabsch, W. and Sander, C. (1983a). Biopolymers, 22, 2577-2637.
Kabsch, W. and Sander, C. (1983b). FEBS Lett., 155, 179-182.
Levin, J.M. and Garnier, J. (1988). Biochim. Biophys. Acta, 955, 283-295.
Lim, V.I. (1974). J. Mol. Biol., 88, 873-894.
Murzin, A.G., Brenner, S.E., Hubbard, T. and Chothia, C. (1995). J. Mol. Biol., 247, 536-540.
Nishikawa, K. and Noguchi, T. (1991). Methods Enzymol., 202, 31-44.
Ptitsyn, O.B. and Finkelstein, A.V. (1983). Biopolymers, 22, 15-25.
Qian, N. and Sejnowski, T.J. (1988). J. Mol. Biol., 202, 865-884.
Rost, B. and Sander, C. (1993). J. Mol. Biol., 232, 584-599.
Rost, B. and Sander, C. (1994). J. Mol. Biol., 235, 13-26.
Rost, B. and Sander, C. (1995). Proteins, 23, 295-300.
Rumelhart, D.E. and McClelland, J.L. (1986). "Parallel distributed processing I". MIT Press, Cambridge, MA.
Sasagawa, F. and Tajima, K. (1993). CABIOS, 9 (2), 147-152.
Schulz, G.E. and Schirmer, R.H. (1979). "Principles of protein structure". Springer-Verlag, New York, NY.
Schulz, G.E. (1987). Ann. Rev. Biophys. Biophys. Chem., 17, 1-21.
SECTION X Biological and Chemical Design
Designing Water Soluble β-Sheet Peptides with Compact Structure
Elena Ilyina, Vikram Roongta, Kevin H. Mayo*
Department of Biochemistry, Biomedical Engineering Center, University of Minnesota, Minneapolis, MN 55455
I. Introduction
While de novo design of α-helix, helix bundle, coiled-coil, and α/β peptides [e.g., 1-14] has been successful, designing water-soluble, purely β-sheet-containing peptides has met with limited success. β-Sheet peptide design has proved more complicated primarily due to limited solubility via aggregation in water and to the nature of β-sheet folding, which is dictated by long-range interactions. The betabellin series [6] and betadoublet [9] peptides, for example, are soluble in water primarily at lower pH values and show non-compact β-sheet conformational properties as monitored by CD and NMR. With betabellin 12, for example, NMR assignments could be made only in dimethylsulfoxide, and no long-range NOEs were observed, indicating the absence of stable folded structure [6]. Betadoublet [9], which has the same predicted anti-parallel β-sheet motif as betabellin, was water soluble, albeit at low pH; it gave a typical β-sheet CD trace but showed a "random coil" ¹H-NMR spectrum [15]. Betabellin 14D [13], the best in that series, showed good solubility at mM concentrations up to pH 5.5, but weak β-sheet folding even with a disulfide bond between two sandwiched monomers. One reason for problems in producing water-soluble, compact β-sheet peptides may lie in the fact that their design usually has been based on a number of structural propensity scales [recent examples, 16-19], which are derived either statistically from structural databases of known folded proteins or by making single or minimal site-specific changes in a fully folded protein. Such scales may be generally less useful [20] when designing only β-sheet-containing peptides, where considerably more β-sheet and/or side-chain surface (particularly hydrophobic surface) will be exposed to water. 
Even though some β-hairpin [21,22] and α/β [14] peptides are water-soluble, form β-sheet structures and remain monomeric, larger de novo designed β-sheet-forming peptides like betadoublet and betabellin are inherently designed to self-associate through their otherwise solvent-exposed amphipathic hydrophobic surface. Other amphipathic β-sheet-forming peptides have been derived from proteins in the α-chemokine family [23], which includes platelet factor-4 (PF4) [24], interleukin-8 (IL-8) [25] and growth-related protein (Gro-α) [26].
Each member of this family contains some 70 residues, about 30 of which form a three-stranded, anti-parallel β-sheet scaffold onto which are folded a C-terminal α-helix and an aperiodic N-terminal segment that is disulfide-bonded to the β-sheet domain [27-32]. All known natural α-chemokines self-associate as dimers or tetramers. Dimers form by extension of monomer β-sheet domains to form a six-stranded sheet, called the AB-dimer, exemplified by IL-8. In native PF4, two AB-dimers are sandwiched together to form a tetramer. Peptide sequences of 33 residues in length have been derived from the β-sheet domains of PF4, IL-8 and Gro-α and characterized in terms of water solubility and of structure by using CD and NMR [36,37]. Figure 1 shows β-sheet sequences from PF4, IL-8 and Gro-α and indicates the general anti-parallel β-sheet backbone folding scheme within a subunit. In terms of water solubility, Gro-α peptide is soluble at least up to 30 mg/mL at pH values between pH 2 and pH 10, while IL-8 peptide is soluble only at pH values below about pH 4.5 and above pH 7.5 (at least up to 24 mg/mL by pH 9) [36]. PF4 peptide is water soluble up to about pH 5.5 (22 mg/mL) [37]. For IL-8 [36] and PF4 [37] peptides at pH values below pH 4 and pH 3, respectively, and for Gro-α peptide [36] at pH values between pH 2 and pH 10, conformational preferences are not discernible by NMR or CD. At pH 5.3, however, PF4 peptide forms a β-sheet sandwich tetramer [37], and at pH values above 7.5, where IL-8 peptide is also soluble, NMR data indicate formation of a compact β-sheet dimer which exhibits maximal structure at a pH of about 10.3 [36]. These data indicate, therefore, that as arginine/lysine side-chains are deprotonated, a conformational transition to a structured β-sheet occurs. In both PF4 and IL-8 peptides, β-sheet stability increases with increasing temperature and is greatest at temperatures between 40 °C and 50 °C, consistent with predominantly hydrophobically-modulated stabilization. 
These peptides are composed of about 40% to 50% hydrophobic residues. In effect, β-sheet folding in IL-8 and PF4 peptides is thermodynamically linked with oligomerization [36,37]. Moreover, β-sheet structures of PF4 [37] and IL-8 [36] peptides are native-like, at least in terms of monomer strand alignments and formation of a three-stranded β-sheet. In this respect, the hydrophobic pairings between strands are preserved as in native α-chemokines. The present communication summarizes a novel approach to de novo design of water-soluble β-sheet sandwich peptides with compact structure.
[Figure 1. β-Sheet sequences from PF4, IL-8 and Gro-α, with the general anti-parallel β-sheet backbone folding scheme within a subunit.]
[Table 1. Amino acid compositions of PF4, IL-8, Gro-α, betabellin 14D and betadoublet.]
II. Materials and Methods
Peptide Synthesis
Peptides of 33 amino acid residues in length were synthesized on a Milligen Biosearch 9600 automated peptide synthesizer. The procedures used were based on Merrifield solid phase synthesis utilizing Fmoc-BOP chemistry [38] as described by Mayo et al. [36].
Circular Dichroism
Circular dichroic (CD) spectra were measured on a JASCO JA-710 automatic recording spectropolarimeter coupled with a data processor. Curves were recorded digitally and fed through the data processor for signal averaging and baseline subtraction. Spectra were recorded from 5 °C to 65 °C in the presence of 10 mM potassium phosphate, over a 185 nm to 250 nm range, using a 0.5 mm path length, thermally jacketed quartz cuvette. Temperature was controlled by using a NesLab water bath. Peptide concentration was varied from 0.014 to 0.14 mM. The scan speed was 20 nm/min. Spectra were signal-averaged 8 times, and an equally signal-averaged solvent baseline was subtracted.
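The signal averaging and baseline subtraction described above amount to a point-by-point mean over repeated scans followed by subtraction of an equally averaged solvent trace. A minimal sketch follows; the function names and the toy numbers are assumptions for illustration and do not reproduce the instrument's data processor.

```python
def average_scans(scans):
    """Point-by-point mean of repeated scans recorded on the same wavelength grid."""
    n = len(scans)
    return [sum(points) / n for points in zip(*scans)]


def baseline_corrected(sample_scans, solvent_scans):
    """Averaged sample spectrum minus an equally averaged solvent baseline."""
    sample = average_scans(sample_scans)
    baseline = average_scans(solvent_scans)
    return [s - b for s, b in zip(sample, baseline)]


# Hypothetical three-point spectra, two scans each (real spectra used 8 scans).
sample_scans = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
solvent_scans = [[0.5, 0.5, 0.5], [1.5, 1.5, 1.5]]

corrected = baseline_corrected(sample_scans, solvent_scans)
```

Averaging the solvent trace over the same number of scans keeps the noise contribution of the baseline comparable to that of the sample spectrum.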
NMR Measurements
For NMR measurements, freeze-dried peptide was dissolved either in D2O or in H2O/D2O (9:1). Peptide concentration normally was in the range of 1 to 5 mM. pH was adjusted by adding µL quantities of NaOD or DCl to the peptide sample. NMR spectra were acquired on a Bruker AMX-600 or AMX-500 NMR spectrometer. For resonance assignments, double quantum filtered COSY [39,40] and 2D-homonuclear magnetization transfer (HOHAHA) spectra, obtained by spin-locking with a MLEV-17 sequence [41] with a mixing time of 60 ms, were used to identify spin systems. NOESY experiments [42,43] were performed to sequentially connect spin systems and to identify NOE connectivities. 2D-NMR spectra were acquired in the TPPI [44] or States-TPPI [45,46] phase sensitive mode. The water resonance was suppressed by direct irradiation (0.8 s) at the water frequency during the relaxation delay between scans as well as during the mixing time in NOESY experiments. Data were collected as 256 to 400 t1 experiments, each with 1k or 2k complex data points over a spectral width of 6 kHz in both dimensions with the carrier placed on the water resonance. For HOHAHA (COSY) and NOESY spectra, normally 16 and 64 scans, respectively, were time averaged per t1 experiment. The data were processed directly on the Bruker AMX-600 X-32 or offline on a Bruker Aspect-1 workstation with the Bruker UXNMR program. Data sets were multiplied in both dimensions by a 30-60 degree shifted sine-bell
function and zero-filled to 1k in the t1 dimension prior to Fourier transformation. Pulsed field gradient (PFG) NMR self-diffusion measurements were made on a Bruker AMX-600 using a GRASP gradient unit and a 5 mm triple-resonance probe equipped with an actively shielded z-gradient coil. The PFG longitudinal eddy-current delay pulse-sequence [47] was used on peptides dissolved in D2O over the temperature range 275 K to 310 K. Peptide concentrations ranged from 3 mg/mL to 10 mg/mL. Diffusion constants and apparent molecular weights were obtained as described by Mayo et al. [36].
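The window multiplication and zero-filling step described above can be sketched as follows. The exact form of the shifted sine-bell used here (a sine running from the shift angle to 180 degrees across the acquired points) is an assumption, as are the function names; the processing program may define the window slightly differently.

```python
import math


def shifted_sine_bell(n_points, shift_degrees):
    """Sine-bell window that starts at `shift_degrees` and ends at 180 degrees."""
    phase = math.radians(shift_degrees)
    if n_points == 1:
        return [math.sin(phase)]
    return [
        math.sin(phase + (math.pi - phase) * i / (n_points - 1))
        for i in range(n_points)
    ]


def apodize_and_zero_fill(fid, shift_degrees, final_size):
    """Multiply a time-domain signal by the window, then pad with zeros."""
    window = shifted_sine_bell(len(fid), shift_degrees)
    weighted = [x * w for x, w in zip(fid, window)]
    padding = max(0, final_size - len(weighted))
    return weighted + [0.0] * padding


# Hypothetical constant signal of 256 points, extended to 1k points
# before Fourier transformation.
fid = [1.0] * 256
processed = apodize_and_zero_fill(fid, shift_degrees=30, final_size=1024)
```

A larger shift angle weights the early part of the signal more heavily, trading resolution enhancement for signal-to-noise ratio, which is presumably why a range of 30 to 60 degrees was used.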
III. Results and Discussion
In designing β-sheet peptides, the intricate interplay among β-sheet folding, self-association and water solubility must be addressed. Solubility is a double-edged effect: precipitation and over-solvation versus a soluble folded peptide. Over-solvation may be defined as the tendency of a soluble peptide to prefer inter-molecular water-peptide interactions over intra-molecular folding interactions. Going too far in either direction (precipitation or over-solvation) destabilizes the folded state. Reduced solubility generally occurs due to inter-molecular peptide-peptide (hydrophobic and electrostatic) networking, which results in precipitation or gelation. If, for example, a designed β-sheet peptide contains a relatively large number of hydrophobic residues which are not screened to some extent by the folding process, precipitation or gelation usually results. Inherent in the design of β-sheet-forming peptides, therefore, is the capacity to self-associate, thereby screening hydrophobic surface from solvent water. Studies on β-sheet domain peptides derived from IL-8 [36], Gro-α [36] and PF4 [37], as well as those on betabellins [6,13,48] and betadoublet [9], led to some general guidelines for designing water-soluble, self-association-induced β-sheet-forming peptides, i.e., the βpep series. While PF4 and IL-8 peptides and the betabellin and betadoublet peptides can form β-sheet structure to varying degrees, all display limited solubility in water, particularly at pH values between pH 4 and pH 8. Moreover, PFG-NMR diffusion measurements have shown that PF4 and IL-8 33mers self-associate in solution, forming tetramers and dimers, respectively [36]. On the other hand, Gro-α is soluble at least up to 30 mg/mL at pH values between pH 2 and pH 10, but shows no apparent β-sheet conformation as measured by NMR and CD. 
Comparison of the amino acid compositions of PF4, IL-8, Gro-α, betabellin 14D [13] and betadoublet [9] (very similar in composition to the betabellin series [6]), shown in Table 1, yielded several observations aimed at optimizing solubility and the potential for β-sheet folding in aqueous solution, which have led to the following guidelines:
1. Maintain a positive (K, R) to negative (E, D) residue ratio between 4/2 and 6/2. A higher ratio may work as well; however, when the number of positively and negatively charged residues is about equal (e.g., IL-8 and betadoublet), intermolecular electrostatic interactions shift the solvation-precipitation equilibrium to the precipitate state. The inverse relationship, i.e., more negative than positive character, gives rise to good solubility but does not lead to discernible structural populations as judged by NMR or CD.
2. Adjust the non-charged polar residue (N, Q, T, S) composition to be less than about 20%. When the number of polar residues is too high and other stabilizing forces are too low (e.g., Gro-α, betadoublet, betabellin), intra-molecular collapse or folding may be opposed by inter-molecular peptide-water associations (referred to as over-solvation).
3. Keep the aliphatic hydrophobic residue (I, L, V, M and A) composition between 40% and 50% to promote self-association-induced structural collapse and stability (e.g., PF4, IL-8).
4. Properly place and pair residues, particularly hydrophobic ones, in the sequence. Choosing the proper placement of hydrophobic residues in the sequence and the combination of hydrophobic side-chains across the strands and within the sandwich is key to designing a more compact, stable β-sheet fold.
5. Choose turn/loop motifs. For a particular β-sheet fold, some turns may be crucial, while others may not.
Point 1 is crucial for achieving good solubility and avoiding precipitation, while points 2 and 3 are crucial for avoiding over-solvation and setting the stage for structural collapse. The latter points, 4 and 5, are aimed at directing β-sheet folding as depicted in Figure 1. Here, the pairing of hydrophobic residues establishes the strand alignments, shown boxed in the figure.
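The compositional quantities named in guidelines 1 to 3 are simple to tally for any candidate sequence. A minimal sketch follows, applied to the βpep-4 sequence given in Figure 2; the function name and the dictionary keys are assumptions for illustration, and no threshold check is applied, since the guidelines are stated only approximately.

```python
# βpep-4 sequence as listed in Figure 2 (33 residues, one-letter code).
BPEP4 = "SIQDLNVSMKLFRKQAKWKIIVKLNDGRELSLD"


def composition_summary(seq):
    """Tally the residue classes used in design guidelines 1 to 3."""
    n = len(seq)
    positive = sum(seq.count(r) for r in "KR")      # guideline 1
    negative = sum(seq.count(r) for r in "ED")      # guideline 1
    polar = sum(seq.count(r) for r in "NQTS")       # guideline 2
    aliphatic = sum(seq.count(r) for r in "ILVMA")  # guideline 3
    return {
        "length": n,
        "positive": positive,
        "negative": negative,
        "charge_ratio": positive / negative if negative else float("inf"),
        "polar_percent": 100.0 * polar / n,
        "aliphatic_percent": 100.0 * aliphatic / n,
    }


summary = composition_summary(BPEP4)
```

Guidelines 4 and 5 concern residue placement and turn motifs and cannot be captured by composition alone.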
Following these guidelines, several novel peptides, the βpep series, were designed which use a combination of α-chemokine sequences (primarily from strands 2 and 3 with the loop/turn initiation sequence [49]) to maintain specific residue pairings, and alternative sequences in the first turn/loop and strand 1. At present, eight βpep peptides have been studied by using CD and NMR. The amino acid sequences for two of these are given in Figure 2 along with NMR and CD spectral traces. All βpep peptides are water soluble at least up to 20-30 mg/mL at pH values between pH 2 and pH 10, and all form β-sheets, albeit to varying degrees. For most βpep peptides, CD data show a prominent negative ellipticity at 217 nm, indicating formation of primarily β-sheet structure [50,51]. On raising the temperature from 5 °C, CD data in general indicate an increase in β-sheet structure which becomes maximal at about 40 °C, consistent with the well-known hydrophobic effect. Based on the presence of relatively well-defined, downfield-shifted αH and NH resonances [15,52], NMR data accumulated
[Figure 2 panels: far-ultraviolet CD spectra of βpep-4 and βpep-8 (Wavelength, nm; 180 to 240) and ¹H NMR spectra (ppm; 10.0 to 6.0).]
βpep-4: S I Q D L N V S M K L F R K Q A K W K I I V K L N D G R E L S L D
βpep-8: A N I K L S V E M K L F K R H L K W K I I V K L N D G R E L S L D
Figure 2. The amino acid sequences for βpep-4 and βpep-8 are shown at the bottom. 600 MHz ¹H NMR spectra are shown for βpep-4 (in H2O) and βpep-8 (in D2O). The peptide concentration was 20 mg/mL in 20 mM potassium phosphate, 40 °C, pH 6.3. Far-ultraviolet CD spectra of both peptides are shown at the top of the figure. Peptide concentration for CD was 20 µM in 20 mM potassium phosphate, pH 6.3, 40 °C.
at pH 6.3, 20 mM NaCl and 40 °C indicate that βpep-4 forms compact β-sheet structure. The relative integrated intensity of downfield-shifted αH resonances compared to that from α-chemokine proteins suggests that βpep-4 is at least 90%-95% folded. Although the NMR spectrum for βpep-8 (Figure 2), which is typical of those observed for all other βpep peptides, also shows characteristic β-sheet downfield-shifted αH and NH resonances, the resonances are highly broadened. Even though the intensities of these downfield-shifted αH resonances indicate fractional populations of relatively well-formed β-sheets in slow chemical exchange (600 MHz NMR chemical shift time scale) with "non-compact" or "unfolded" conformational states, αH resonance broadening most likely is the result of an intermediate exchange process among various aggregate states and/or among similarly folded β-sheet conformations. PFG-NMR self-diffusion measurements do indicate formation of aggregate state distributions of primarily dimers to tetramers. Furthermore, although CD data suggest more β-sheet structure for βpep-8 compared to βpep-4, NMR data indicate the opposite. CD data, however, would give evidence for significant β-sheet structure even if it were highly transient in a molten globule-like or non-compact state. PFG-NMR self-diffusion measurements on βpep-4 indicate formation of stable tetramers over a range of temperatures from 5 °C to 50 °C. In fact, even at 40 °C and a peptide concentration of 40 µM, βpep-4 maintains an average molecular weight characteristic of a dimer (data not shown). Downfield-shifted αH resonances remain, although diminished in relative intensity. In this respect, βpep-4 is not fully dissociated and exhibits an apparent dissociation constant in the µM range. Only when 3.5 M urea is added to the solution is βpep-4 fully dissociated. 
Although it is presently unknown why any one βpep peptide folds better than any other, some insight is gleaned from an NMR structural analysis of compactly folded βpep-4. Sequence-specific resonance assignments were done using a standard protocol [15]. Figure 3 shows segments from ¹H-NMR TOCSY and NOESY spectra of βpep-4. Downfield-shifted αH and NH resonances, as expected, were found to belong to residues within a three-stranded β-sheet: D4-R13, W18-K23, E29-L32, as shown schematically in Figure 4. In the αH region of the NOESY spectrum, several αH-αH NOEs are observed which define the strand alignments within the β-sheet. While the strand 2 to strand 3 alignment in the sheet is the same as that observed in α-chemokines, the strand 1 to strand 2 alignment is different, making the loop between strands 1 and 2 shorter in βpep-4. NOEs are also evident between monomers which associate to form two types of dimers, each created by extending its three-stranded anti-parallel β-sheet into a six-stranded sheet via strand 1 interactions as depicted in Figure 4. Heterodimers arise primarily from a two-residue shift
Figure 3. TOCSY and NOESY contour plots are shown for βpep-4 for the amide/aromatic region in H2O (90%)/D2O (10%) (left) and for the αH region in D2O (right). βpep-4 peptide concentration was 20 mg/mL in 20 mM potassium phosphate, pH 6.3, 40 °C. The NOESY mixing time was 0.2 s, and the TOCSY spin lock time was 40 ms. The data were zero-filled to 1024 in t1. The raw data were multiplied by a 30° shifted sine-squared function in t1 and t2 prior to Fourier transformation.
in the interfacial strand alignments, giving rise to two distinct dimer folds. In either case, this folding pattern defines an amphipathic β-sheet similar to that observed in α-chemokines [27-32], from which βpep-4 was designed. Heterodimers associate to form a β-sheet sandwich tetramer via the hydrophobic surface of amphipathic sheets. Hydrophobic residues V7, M9, L11, I20 and V22 from each subunit define the core of the tetramer. Assuming this alignment is conserved in other βpep peptides, the same hydrophobic residue positions should form the core in any βpep peptide.
Figure 4. Inter-subunit anti-parallel β-sheet strand alignments are schematically shown for both types of dimers (Dimer 1 and Dimer 2). Lines indicate αH-αH pairings.
The answer to successful β-sheet folding lies in knowing how residues on the hydrophobic surface of one amphipathic dimer pack with the hydrophobic surface of another. Optimal side-chain packing relates to fold stability and compactness. Currently, restrained molecular dynamics and simulated annealing are being used to model the folding of the β-sheet sandwich tetramer to help answer this question.
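The residue positions reported to define the tetramer core in βpep-4 (V7, M9, L11, I20 and V22) can be looked up in any other βpep sequence to see which side-chains would occupy the core if the strand alignment were conserved. A minimal sketch follows, using the two sequences listed in Figure 2; the dictionary, the constant and the function name are assumptions for illustration, and the lookup says nothing about actual packing.

```python
# Sequences as listed in Figure 2 (one-letter code, 33 residues each).
BPEP_SEQUENCES = {
    "bpep-4": "SIQDLNVSMKLFRKQAKWKIIVKLNDGRELSLD",
    "bpep-8": "ANIKLSVEMKLFKRHLKWKIIVKLNDGRELSLD",
}

# 1-based core positions taken from the βpep-4 analysis in the text.
CORE_POSITIONS = (7, 9, 11, 20, 22)


def core_residues(seq, positions=CORE_POSITIONS):
    """Return labels such as 'V7' for the residues at the given 1-based positions."""
    return [f"{seq[p - 1]}{p}" for p in positions]


core_by_peptide = {
    name: core_residues(seq) for name, seq in BPEP_SEQUENCES.items()
}
```

For these two peptides the residues at the core positions are identical, so any difference in their folding behavior would have to originate elsewhere in the sequence.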
Conclusions
Analysis of the composition, folding and solubility properties of PF4, IL-8, Gro-α 33mers and betadoublet and betabellin peptides has led to a general recipe for designing water-soluble, β-sheet-forming peptides. Of the eight de novo designed βpep peptides investigated to date, only one, βpep-4, exemplifies an exceptional application of this design approach by showing relatively stable, compact β-sheet structure with apparently good side-chain packing in a tetrameric state. These results have important implications for designing β-sheet peptides with a particular
bioactivity. βpep peptides have a highly positively charged surface which should be able to bind negatively charged surfaces, for example heparin and cell surface heparan sulfate, as well as phosphorylated biomolecules. Understanding how to spatially arrange lysines and arginines on the surface of βpep peptides may allow modulation of binding activities.
Acknowledgements
This work was supported by generous research grants from the Minnesota Medical Foundation and the Graduate School of the University of Minnesota.
*Address correspondence to K. H. Mayo.
Abbreviations: NMR, nuclear magnetic resonance; 2D-NMR, two-dimensional NMR; HOHAHA, 2D-NMR homonuclear Hartmann-Hahn spectroscopy; NOE, nuclear Overhauser effect; NOESY, 2D-NMR nuclear Overhauser effect spectroscopy; rf, radio frequency; FID, free induction decay; CD, circular dichroism; PF4, platelet factor-4; IL-8, interleukin-8; Gro-α, growth-related protein α.
References
1. Regan, L., and DeGrado, W. (1988) Science 241, 976-978.
2. DeGrado, W., Wasserman, Z., and Lear, J. (1989) Science 243, 622-628.
3. Hecht, M., Richardson, J., Richardson, D., and Ogden, R. (1990) Science 249, 884-891.
4. Hahn, K., Klis, W., and Stewart, J. (1990) Science 248, 1544-1547.
5. Fedorov, A.N., Dolgikh, D.A., Chemeris, V.V., Chernov, B.K., Finkelstein, A.V., Schulga, A.A., Alakhov, Y.B., Kirpichnikov, M.P., and Ptitsyn, O.B. (1992) J. Mol. Biol. 223, 927-931.
6. Richardson, J.S., Richardson, D.C., Tweedy, N.B., Gernert, K.M., Quinn, T.P., Hecht, M.H., Erickson, B.W., Yang, Y., McClain, R.D., Donlan, M.E., and Surles, M.C. (1992) Biophys. J. 63, 1185-1209.
7. Handel, T.M., Williams, S.A., and DeGrado, W.F. (1993) Science 261, 879-885.
8. Kamtekar, S., Schiffer, J.M., Xiong, H., Babik, J.M., and Hecht, M.H. (1993) Science 262, 1680-1685.
9. Quinn, T.P., Tweedy, N.B., Williams, R.W., Richardson, J.S., and Richardson, D.C. (1994) Proc. Natl. Acad. Sci. USA 91, 8747-8751.
10. Fezoui, J., Weaver, D.L., and Osterhout, J.J. (1994) Proc. Natl. Acad. Sci. USA 91, 3675-3679.
11. Kuroda, Y., Nakai, T., and Ohkubo, T. (1994) J. Mol. Biol. 236, 862-868.
12. Bryson, J.W., Betz, S.F., Lu, H.S., Suich, D.J., Zhou, H.X., O'Neil, K.T., and DeGrado, W.F. (1995) Science 270, 935-941.
13. Yan, Y., and Erickson, B.W. (1994) Protein Sci. 3, 1069-1073.
14. Struthers, M.D., Cheng, R.P., and Imperiali, B. (1996) Science 271, 342-345.
15. Wuthrich, K. (1986) NMR of Proteins and Nucleic Acids, Wiley-Interscience, New York.
16. Kim, C.A., and Berg, J.M. (1993) Nature 362, 267-270.
17. Minor, D.L., Jr., and Kim, P.S. (1994) Nature 367, 660-663.
18. Minor, D.L., Jr., and Kim, P.S. (1994) Nature 371, 264-267.
19. Smith, C.K., Withka, J.M., and Regan, L. (1994) Biochemistry 33, 5510-5517.
20. Otzen, D.E., and Fersht, A.R. (1995) Biochemistry 34, 5718-5724.
21. Blanco, F.J., Rivas, G., and Serrano, L. (1994) Structural Biology 1, 584-590.
22. Searle, J., Williams, D.H., and Packman, L.C. (1995) Nature Struct. Biol. 2, 999-1006.
23. Miller, M.D., and Krangel, M.S. (1992) Crit. Rev. Immunol. 12, 17-46.
24. Holt, J.C., and Niewiarowski, S. (1985) Seminars in Hematology 22, 151-163.
25. Schmid, J., and Weissmann, C. (1987) J. Immunol. 139, 250-256.
26. Anisowicz, A., Bardwell, L., and Sager, R. (1987) Proc. Natl. Acad. Sci. USA 84, 7188-7192.
27. Zhang, X., Chen, L., and Bancroft, D.P. (1994) Biochemistry 33, 8361-8366.
28. Mayo, K.H., Roongta, V., Ilyina, E., Milius, R., Barker, S., Quinlan, C., LaRosa, G., and Daly, T.J. (1995) Biochemistry 34, 11399-11409.
29. Clore, G.M., Bax, A., Wingfield, P.T., and Gronenborn, A.M. (1990) Biochemistry 29, 5671-5676.
30. Baldwin, E.T., Weber, I.T., St. Charles, R., Xuan, J.-C., Appella, E., Yamada, M., Matsushima, K., Edwards, B.F.P., Clore, G.M., Gronenborn, A.M., and Wlodawer, A. (1991) Proc. Natl. Acad. Sci. USA 88, 502-506.
31. Mayo, K.H., Yang, Y., Daly, T.J., Barry, J.K., and La Rosa, G.J. (1994) Biochem. J. 304, 371-376.
32. Fairbrother, W.J., Reilly, D., Colby, T.J., Hesselgesser, J., and Horuk, R. (1994) J. Mol. Biol. 242, 252-270.
33. Mayo, K.H., and Chen, M.-J. (1989) Biochemistry 28, 9469-9478.
34. Mayo, K.H. (1991) Biochemistry 30, 925-934.
35. Yang, Y., Mayo, K.H., Daly, T.J., Barry, J.K., and LaRosa, G.J. (1994) J. Biol. Chem. 269, 20110-20118.
36. Mayo, K.H., Ilyina, E., and Park, H. (1996) Protein Science 5, 1301-1315.
37. Ilyina, E., and Mayo, K.H. (1995) Biochem. J. 306, 407-419.
38. Stewart, J.M., and Young, J.D. (1984) Solid Phase Peptide Synthesis, 2nd edition. Pierce Chemical Co., Rockford, IL, pp. 135.
39. Piantini, U., Sorensen, O.W., and Ernst, R.R. (1982) J. Am. Chem. Soc. 104, 6800-6801.
40. Shaka, A.J., and Freeman, R. (1983) J. Magn. Reson. 51, 169-173.
41. Bax, A., and Davis, D.G. (1985) J. Magn. Reson. 65, 355-360.
42. Jeener, J., Meier, B., Backman, P., and Ernst, R.R. (1979) J. Chem. Phys. 71, 4546-4550.
43. Wider, G., Macura, S., Kumar, A., Ernst, R.R., and Wuthrich, K. (1984) J. Magn. Reson. 66, 207-234.
44. Marion, D., and Wuthrich, K. (1983) Biochem. Biophys. Res. Commun. 113, 967-975.
45. States, D.J., Haberkorn, R.A., and Ruben, D.J. (1982) J. Magn. Reson. 48, 286-293.
46. Marion, D., Ikura, M., Tschudin, R., and Bax, A. (1985) J. Magn. Reson. 89, 393-399.
47. Gibbs, S.J., and Johnson, C.S. (1991) J. Magn. Reson. 93, 395-402.
48. Richardson, J.S., and Richardson, D.C. (1989) Trends in Biochem. Sci. 14, 304-309.
49. Ilyina, E., Milius, R., and Mayo, K.H. (1994) Biochemistry 33, 13436-13444.
50. Greenfield, N., and Fasman, G.D. (1969) Biochemistry 8, 4108-4116.
51. Johnson, W.C., Jr. (1990) Proteins 7, 205-214.
52. Wishart, D.S., Sykes, B.D., and Richards, F.M. (1992) Biochemistry 31, 1647-1651.
Engineering Secondary Structure to Invert Coenzyme Specificity in Isopropylmalate Dehydrogenase

Ridong Chen, Ann F. Greer, Antony M. Dean and James H. Hurley¹

Department of Biological Chemistry, The Chicago Medical School, N. Chicago, IL 60064-3095
¹Laboratory of Molecular Biology, NIDDK-LMB, NIH, Bethesda, MD 20892-0580
I. Introduction

The pyridine nucleotide coenzymes, NAD and NADP, provide the reducing equivalents necessary for energy transduction and biosynthesis in living cells. The reduced coenzymes are generated by the transfer of a hydride anion from a wide variety of substrates, in reactions catalyzed by an equally wide variety of dehydrogenases, the half reaction being

$$\mathrm{NAD(P)^{+}} + \mathrm{H^{-}} \rightarrow \mathrm{NAD(P)H}$$
Structurally, NADP differs from NAD only by a phosphate group esterified at the 2'C of the adenosine ribose, a difference which is reflected in the enzymatic roles: NAD-dependent dehydrogenases are mostly involved in catabolic reactions, while NADP-specific enzymes are usually confined to biosynthetic pathways (1). The marked specificities displayed by dehydrogenases towards NAD and NADP have provided attractive model systems to understand the process of molecular recognition by protein engineering. Isocitrate dehydrogenase (E.C. 1.1.1.42, IDH) from Escherichia coli and isopropylmalate dehydrogenase (E.C. 1.1.1.85, IMDH) from Thermus thermophilus are involved in the Krebs cycle and leucine biosynthesis, respectively. Both enzymes catalyze the sequential reactions

$$\text{2-Hydroxy acid} + \mathrm{NAD(P)^{+}} \xrightarrow{\mathrm{Mg^{2+}}} \text{2-Oxalo acid} + \mathrm{NAD(P)H} + \mathrm{H^{+}}$$

$$\text{2-Oxalo acid} + \mathrm{H^{+}} \rightarrow \text{2-Keto acid} + \mathrm{CO_{2}}$$
Figure 1. A cladogram of the decarboxylating dehydrogenases. Branches: NAD-IMDH, eukaryotic; NAD-IMDH, eubacterial; NAD-IDH, eukaryotic; NADP-IDH, eukaryotic; NADP-IDH, eubacterial.
Although the two enzymes share only 25% sequence identity, structural analyses indicate that both are homodimers which share a common protein fold that lacks the βαβαβ motif characteristic of the nucleotide binding Rossmann fold (2-4). Phylogenetic analyses suggest that specificity towards isocitrate and isopropylmalate evolved before specificity towards NADP, and that the latter evolved in the IDHs around the time the eukaryotes first appeared, some 3.5 billion years ago (Fig. 1). IDH displays a 7000-fold preference for NADP whereas IMDH displays a 100-fold preference for NAD (5-7). High resolution X-ray crystal structures have been obtained for both enzymes with and without coenzymes bound (3, 4, 8, 9). The kinetic and catalytic mechanisms have also been determined (5, 7). In addition, large quantities of the recombinant proteins are readily purified, facilitating biochemical and structural analyses. The strict specificity, together with an extensive knowledge of their structure and biochemistry, makes IDH and IMDH ideal targets for rationally engineering coenzyme specificity. Previously, and despite the evolutionary distances involved, IDH coenzyme specificity was rationally inverted from NADP to NAD by substituting seven residues (10). Here, guided by X-ray crystal structures and molecular modeling, the NAD-specific IMDH of T. thermophilus is redesigned into a highly active NADP-specific enzyme, an engineering feat requiring the substitution of four amino acid residues and the replacement of a β-turn by an α-helix and loop.
II. Materials and Methods

A. Molecular Modeling
A model of an NADP-dependent IMDH was constructed by introducing an IDH-like α-helix and loop and four requisite substitutions into a model of T. thermophilus IMDH (11). A phosphate was added at the 2' position of the NAD ribose, and the IDH-like α-helix and loop were subjected to energy minimization with the CHARMM force field for an initial 50 steps, after which all restraints on the coenzyme, on amino acids contacting residues in the coenzyme binding pocket, on amino acids contacting residues in the engineered α-helix and loop, and on all residues within 8 Å of the coenzyme were removed. The final 100 steps of 1000 steps of minimization reduced the calculated energy by less than 0.001 kcal/mol, suggesting that the model had approached a stable conformation.
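The convergence criterion used here (an energy change of less than 0.001 over the final 100 of 1000 steps) can be illustrated with a toy calculation. The sketch below is not CHARMM and does not model a protein: it runs plain steepest descent on a hypothetical two-variable quadratic "energy", and the energy function, step size, starting point and function names are all assumptions chosen only to show how such a check might be expressed.

```python
def energy(x, y):
    # Hypothetical quadratic "energy" surface with its minimum at (1, -2).
    return 2.0 * (x - 1.0) ** 2 + 0.5 * (y + 2.0) ** 2


def gradient(x, y):
    # Analytical gradient of the toy energy above.
    return 4.0 * (x - 1.0), 1.0 * (y + 2.0)


def minimize(x, y, steps=1000, step_size=0.05):
    """Steepest descent; returns the energy recorded before and after every step."""
    trace = [energy(x, y)]
    for _ in range(steps):
        gx, gy = gradient(x, y)
        x -= step_size * gx
        y -= step_size * gy
        trace.append(energy(x, y))
    return trace


def converged(trace, window=100, tolerance=0.001):
    """True if the last `window` steps changed the energy by less than `tolerance`."""
    drop = trace[-window - 1] - trace[-1]
    return abs(drop) < tolerance


trace = minimize(x=5.0, y=5.0)
print("converged after 1000 steps:", converged(trace))
print("converged after 149 steps: ", converged(trace[:150]))
```

Applying the same test to a truncated trace shows why the window matters: early in the run the energy is still falling quickly, so the criterion is not yet met.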
B. Mutagenesis
Substitutions were introduced into IMDH by the method of Kunkel (12). Putative mutants were screened by enzyme assays and confirmed by dideoxy sequencing. PCR overlap extension (13) was used to replace the β-turn of IMDH by an α-helix and loop, modeled on that of E. coli IDH. The resulting 1.2 kb hybrid fragment was purified, digested with Kpn I and Sal I and inserted into the expression vector pEMBL18-. The Nco I-Hind III fragment comprising the 3'-terminal 500 bp of IMDH, which includes three additional substitutions and the engineered secondary structure, was subcloned into wildtype IMDH and the entire region sequenced.
C. Enzyme Purification
Enzymes were purified as described previously (5). The final mutant, unlike the wildtype enzyme, binds tightly to Affi-Gel blue. This provides an additional means of purifying this mutant by affinity chromatography.
D. Kinetic Analyses
Kinetic parameters were determined by following the reduction of the nicotinamide coenzymes at 340 nm as described by Dean and Dvorak (7). Rates were calculated using a molar extinction coefficient for NAD(P)H of 6200 M⁻¹·cm⁻¹, and protein concentrations were determined at 280 nm using a molar extinction coefficient of 30,420 M⁻¹·cm⁻¹. Nonlinear least squares Gauss-Newton regressions were used to determine the fit of the data to the Michaelis-Menten model.
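The two calculations described above can be sketched in a few lines. The example below is a minimal illustration, not the authors' software: it converts an absorbance slope at 340 nm into a rate using the stated extinction coefficient (a 1 cm path length is assumed), and fits the Michaelis-Menten model by a plain Gauss-Newton iteration. The function names, the synthetic noise-free data and the starting guesses are all hypothetical.

```python
EPSILON_NADPH = 6200.0  # M^-1 cm^-1 at 340 nm, as stated in the text


def rate_from_absorbance(delta_a340_per_s, path_cm=1.0):
    """Convert an absorbance slope (A340 per second) into M/s via Beer-Lambert."""
    return delta_a340_per_s / (EPSILON_NADPH * path_cm)


def michaelis_menten(s, vmax, km):
    return vmax * s / (km + s)


def gauss_newton_fit(s_values, v_values, vmax, km, iterations=50):
    """Gauss-Newton least-squares fit of v = Vmax*S/(Km+S); returns (Vmax, Km)."""
    for _ in range(iterations):
        # Accumulate the normal equations J^T J (2x2) and J^T r (2x1).
        a11 = a12 = a22 = b1 = b2 = 0.0
        for s, v in zip(s_values, v_values):
            r = v - michaelis_menten(s, vmax, km)   # residual
            d_vmax = s / (km + s)                   # dv/dVmax
            d_km = -vmax * s / (km + s) ** 2        # dv/dKm
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            b1 += d_vmax * r
            b2 += d_km * r
        # Solve the 2x2 system for the parameter update.
        det = a11 * a22 - a12 * a12
        step_vmax = (a22 * b1 - a12 * b2) / det
        step_km = (a11 * b2 - a12 * b1) / det
        vmax += step_vmax
        km += step_km
        if abs(step_vmax) < 1e-12 and abs(step_km) < 1e-12:
            break
    return vmax, km


# Hypothetical, noise-free data generated with Vmax = 1.0 and Km = 25 (µM).
substrate = [5.0, 10.0, 20.0, 50.0, 100.0, 200.0]
rates = [michaelis_menten(s, 1.0, 25.0) for s in substrate]
vmax_fit, km_fit = gauss_newton_fit(substrate, rates, vmax=max(rates), km=20.0)
print(round(vmax_fit, 4), round(km_fit, 2))
```

Gauss-Newton needs a reasonable starting point; here the largest observed rate and the substrate concentration nearest half that rate serve as initial guesses, which is a common heuristic rather than anything specified in the text.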
III. Results and Discussion

Graphical superposition of IDH and IMDH indicates that specificity in IMDH is conferred by Asp-278 (Fig. 2, Table I), a rigidly conserved residue that forms a double H-bond with the 2'- and 3'-hydroxyls of the adenosine ribose of NAD (Fig. 2) and which repels the negatively charged 2'-phosphate of NADP. In IDH, Asp-278 is replaced by Lys (Fig. 2, Table I), a rigidly conserved residue that, although disordered in the crystal structures, probably interacts with the 2'-phosphate of NADP. Ser-226' (on the second subunit) and the conserved Ile-279 of IMDH are replaced by Arg-226' and Tyr-279 in all eubacterial IDHs (Table I), where they form H-bonds to the 2'-phosphate of NADP. In addition, a β-turn in the coenzyme binding pocket of IMDH is replaced by an α-helix and loop in IDH. Two additional interactions with the 2'-phosphate are found in IDH: Tyr-325 (IMDH numbering) and 395-Arg (IDH numbering) (Fig. 2, Table I). These residues have no equivalent in IMDH, where the α-helix of IDH is replaced by a β-turn. Hence, inverting the coenzyme specificity requires that a change in the secondary structure of IMDH be engineered.
Figure 2. Superposition of the coenzyme binding pockets of IMDH (light gray) with NAD (middle gray) and IDH (black) with NADP (dark gray with the 2'-phosphate in black). Dashes indicate H-bonds. IMDH numbering is used throughout, except in the α-helix and loop of IDH where IDH numbering is italicized. Labels in the figure include Loop 2 and K278D.
A. Engineering Individual Residues
Ser-226', Asp-278 and Ile-279 were replaced by Arg, Lys and Tyr in sequential rounds of mutagenesis (mutants I and II, Table II). These substitutions result in a dramatic increase in Km for NAD, from 31 µM to 1836 µM, due to the loss of H-bonds to the adenosine ribose hydroxyls of NAD. In contrast, the Km for NADP is improved from 722 µM to 14 µM, suggesting that H-bonds between Lys-278, Tyr-279 and the 2'-phosphate have been successfully established. Nevertheless, preference now favors NADP over NAD only by a modest factor of 6 (Fig. 3) because of a marked drop in kcat with NADP.
B. Engineering a Secondary Structure
Further improvements in specificity require that the β-turn of IMDH be replaced by an IDH-like α-helix so that two additional residues, Tyr-325 and 395-Arg, can interact with the 2'-phosphate of NADP while further increasing the polarity of the pocket. Molecular modeling indicated that engineering such a change in
Table I. Aligned sequences surrounding the nucleotide binding pocket in the decarboxylating dehydrogenases

NADP-dependent IDH: E. coli, Anabaena sp., B. subtilis, T. thermophilus, Vibrio sp.
NAD-dependent IMDH: T. thermophilus, A. chrysogenum, B. subtilis, C. utilis, E. coli, S. cerevisiae, T. ferrooxidans, Y. lipolytica

[Aligned residue columns not recoverable from the source.]
The single letter amino acid code, with dashes representing deletions and periods for identical residues, is used throughout.

Table II. Kinetic parameters of wildtype and mutant enzymes toward NADP and NAD (a)

                          wildtype     mutant I     mutant II   mutant III   mutant IV
NADP
  kcat (s⁻¹)              0.26         0.88         0.09        0.58         0.39
  Km (µM)                 1750         722          14          25           20
  kcat/Km (µM⁻¹·s⁻¹)      0.00015      0.0012       0.0064      0.023        0.020
NAD
  kcat (s⁻¹)              0.15         0.48         1.91        1.90         0.52
  Km (µM)                 12           31           1836        17800        25560
  kcat/Km (µM⁻¹·s⁻¹)      0.0125       0.015        0.001       0.00011      0.00002
Specificity (NADP/NAD)    1.0 × 10⁻²   8.0 × 10⁻²   6.4         2.1 × 10²    1.0 × 10³

Sequences
Residue                   226   278   279   324 (β-turn)      332   285
wildtype IMDH             S     D     I     PPDLGGS           AG    A
mutant I                  R
mutant II                 R     K     Y
mutant III                R     K     Y     TYDLERLADGAKLAG
mutant IV                 R     K     Y     TYDLERLADGAKLAG         V

(a) Kinetic data determined in 25 mM MOPS, 100 mM KCl, 1 mM DTT, pH 7.3 at 21 °C.
secondary structure would require additional substitutions to facilitate packing against the remaining IMDH structure. Phe-327 was replaced by Leu, also found in IMDH (Tables I, II), to avoid the steric crowding that would distort the helix and disrupt interactions with the 2'-phosphate. Met-397 of IDH (Table I, IDH numbering) was replaced by Ala, again to avoid steric crowding and to allow an Arg in IMDH to H-bond to several main chain carbonyls near the terminus of the helix (Table I, IDH numbering). The two terminal amino acids of the IMDH β-turn, Ala-332 and Gly-333, were retained to avoid steric packing problems associated with substituting the Leu and Lys of IDH. However, at the proximal end, Pro-324 was replaced by the more flexible Thr, found in some related sequences (Table I), to allow minor shifts in the peptide backbone that might facilitate H-bonding between the Tyr substituted for Pro-325 and the 2'-phosphate of NADP. As judged by kinetic analyses, these major structural changes were successful: specificity for NADP increased from 6 to 208, a factor of 35 (mutant III, Table II, Fig. 3). The Km towards NAD increased 10-fold, and while the Km towards NADP increased 2-fold, kcat improved 6-fold.
C. Generating the Final Mutant

Substituting Ala-285, which lies at the back of the nucleotide binding pocket, by Val impairs performance with NAD by a factor of 5, while that towards NADP is largely maintained (mutant IV, Table II). Molecular modeling suggests that the bulkier side chain of Val forces the adenine ring to shift, disrupting an H-bond between the adenine N2 and the main chain amide at residue 286. The resulting loss in affinity might be compensated in the case of NADP by improved interactions between the 2'-phosphate and the introduced Asp-278-Lys, Ile-279-Tyr, Pro-325-Tyr and 395-Arg. Either that, or the strong interactions with the 2'-phosphate had already forced NADP to shift out of the way.
Figure 3. The systematic shift in coenzyme specificity generated by engineering IMDH. The height of the columns represents the degree to which specificity, (kcat/Km)NADP/(kcat/Km)NAD, is improved or impaired. Columns, left to right: wildtype, mutant I, mutant II, mutant III, mutant IV.
Preference now favors NADP over NAD by a factor of 1000 (mutant IV, Table II, Fig. 3). Like wildtype IDH, and unlike wildtype IMDH, this final mutant binds tightly to Affi-Gel blue affinity columns. This provides additional evidence to support the notion that an IDH-like NADP binding pocket has been successfully engineered in IMDH.
D. Kinetic Properties of the Final Mutant

Four mutations, Ser-226-Arg, Asp-278-Lys, Ile-279-Tyr and Ala-285-Val, coupled with engineering a surface IDH-like α-helix and loop, convert the coenzyme specificity of IMDH from a 100-fold preference for NAD to a 1000-fold preference for NADP (Table II, Fig. 3). Performance with NAD was impaired 630-fold while that with NADP was improved 130-fold. Indeed, the performance of the redesigned enzyme towards NADP is 60% higher than that of the wildtype towards NAD. Inspection of Table II reveals that throughout this engineering project the kcat values towards NAD and NADP were generally similar and remained in the vicinity of those of the wildtype enzyme. This suggests that most of the change in specificity arises from discrimination in binding, rather than changes in catalysis.
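The fold changes quoted above follow directly from the kcat and Km values in Table II. The short sketch below recomputes them; the dictionary layout and function names are illustrative assumptions, and the numbers are the kcat (s⁻¹) and Km (µM) entries as read from Table II. Because the ratios are recomputed from kcat and Km rather than from the rounded kcat/Km entries, they come out close to, but not identical with, the figures quoted in the text.

```python
# kcat (s^-1) and Km (µM) for each enzyme and coenzyme, as read from Table II.
TABLE_II = {
    "wildtype":   {"NADP": (0.26, 1750), "NAD": (0.15, 12)},
    "mutant I":   {"NADP": (0.88, 722),  "NAD": (0.48, 31)},
    "mutant II":  {"NADP": (0.09, 14),   "NAD": (1.91, 1836)},
    "mutant III": {"NADP": (0.58, 25),   "NAD": (1.90, 17800)},
    "mutant IV":  {"NADP": (0.39, 20),   "NAD": (0.52, 25560)},
}


def performance(enzyme, coenzyme):
    """Catalytic efficiency kcat/Km in µM^-1 s^-1."""
    kcat, km = TABLE_II[enzyme][coenzyme]
    return kcat / km


def specificity(enzyme):
    """Preference for NADP over NAD: (kcat/Km)NADP / (kcat/Km)NAD."""
    return performance(enzyme, "NADP") / performance(enzyme, "NAD")


for name in TABLE_II:
    print(f"{name:10s}  specificity (NADP/NAD) = {specificity(name):.3g}")

# Wildtype versus final mutant (mutant IV).
nad_impairment = performance("wildtype", "NAD") / performance("mutant IV", "NAD")
nadp_improvement = performance("mutant IV", "NADP") / performance("wildtype", "NADP")
relative_to_wildtype = performance("mutant IV", "NADP") / performance("wildtype", "NAD")
print(f"NAD impaired {nad_impairment:.0f}-fold, NADP improved {nadp_improvement:.0f}-fold")
print(f"mutant IV with NADP relative to wildtype with NAD: {relative_to_wildtype:.2f}")
```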
IV. Conclusion
The successful redesign of coenzyme specificities in both IDH (10) and IMDH demonstrates convincingly that coenzyme specificities in the β-decarboxylating dehydrogenases are primarily determined by interactions between the nucleotides and surface amino acid residues in the nucleotide-binding pockets. Nevertheless, additional residues not in contact with the coenzymes play key roles by modulating the direct interactions (the Ala-285-Val substitution, along with several others in the introduced α-helix). We also note that homology-based engineering, in which sequence alignments are used as a strict guide for introducing substitutions conserved in related enzymes, frequently failed to produce the desired changes, no doubt because of local differences in packing, alternative means of stabilizing secondary structures, and unidentified differences in local secondary structures. Thus a knowledge-based approach to engineering specificities will remain essential to engineering enzymes with novel properties. Our results demonstrate that protein engineering is a powerful means of understanding structure-function relations in proteins, and that engineering secondary structures to produce enzymes with novel properties is feasible. Lessons learned from our studies should be applicable to the redesign and optimization of functions in other enzymes.
Acknowledgments

This work was supported by US Public Health Service Grant GM-48735 from the National Institutes of Health.
References

1. Chen, R. & Gadal (1990) Plant Physiol. Biochem. 28, 411-418.
2. Rossmann, M. G., Moras, D. & Olsen, K. W. (1974) Nature 250, 194-199.
3. Imada, K., Sato, M., Tanaka, N., Katsube, Y., Matsuura, Y. & Oshima, T. (1991) J. Mol. Biol. 222, 725-738.
4. Hurley, J. H., Thorsness, P., Ramalingham, V., Helmers, N., Koshland, D. E., Jr., & Stroud, R. M. (1989) Proc. Natl. Acad. Sci. USA 86, 8635-8639.
5. Dean, A. M. & Koshland, D. E., Jr. (1993) Biochemistry 32, 9302-9309.
6. Miyazaki, K. & Oshima, T. (1994) Protein Eng. 7, 401-403.
7. Dean, A. M. & Dvorak, L. (1995) Protein Science 4, 2156-2167.
8. Hurley, J. H., Dean, A. M., Koshland, D. E., Jr., & Stroud, R. M. (1991) Biochemistry 30, 8671-8678.
9. Hurley, J. H. & Dean, A. M. (1994) Structure 2, 1007-1016.
10. Chen, R., Greer, A. & Dean, A. M. (1995) Proc. Natl. Acad. Sci. USA 92, 11666-11670.
11. Bolduc, J. M., Dyer, D. H., Scott, W. G., Singer, P., Sweet, R. M., Koshland, D. E., Jr., & Stoddard, B. L. (1995) Science 268, 1312-1318.
12. Kunkel, T. A., Roberts, J. D. & Zakour, R. A. (1987) Methods Enzymol. 154, 367-382.
13. Horton, R. M., Hunt, H. D., Ho, S. N., Pullen, J. K. & Pease, L. R. (1989) Gene 77, 61-68.
A method for determining domain binding sites in proteins with swapped domains: implications for βA3- and βB2-crystallins

Yuri V. Sergeev and J. Fielding Hejtmancik

National Eye Institute, National Institutes of Health, Bethesda, MD, 20892
I. Introduction

The structure of the interdomain interface is very similar in bovine γB- and βB2-crystallins (1-2). In γB-crystallin the interface is formed by domains from the same molecule, while in the βB2-crystallin dimer the interface consists of residues from swapped amino- and carboxy-terminal domains of two different molecules which associate to form a dimer. Domain swapping can be considered a common step in the oligomerization of proteins and may play an important role in the evolution of oligomeric proteins (3-4). The importance of surface interactions in the formation of intermolecular contacts in protein crystals is shown by comparative studies of various γ-crystallins (5-6). These suggest that differences in surface interactions between the domains of β- and γ-crystallins may be responsible for dimer formation by β-crystallins.

Mouse βA3-crystallin has about 38% and mouse βB2-crystallin about 95% sequence homology with the bovine βB2-crystallin core domains. The structures of βA3- and βB2-crystallins from mouse lens can be obtained by homology modeling, which is usually included as a standard block in molecular modeling packages such as WHAT IF (7). However, structure prediction for protein associates such as dimers is not well established. In order to understand how the properties of the domain binding site and the dynamics of the structure change when a monomer becomes part of an associate, it is first necessary to determine the correct location of the domain binding site on the surface of the predicted protein. Here we consider how to determine the correct location of the domain binding sites on the surface of the predicted structures. We located these sites in murine βA3- and βB2-crystallin sequences by using a combination of accessibility calculations, sequence homology and the known 3D structure of the domain interface for monomeric γB-crystallin and dimeric βB2-crystallin from the bovine lens. With modeling we show that the structures of the common part of the domain binding sites in mouse βA3- and βB2-crystallins are similar to those in bovine βB2- and γB-crystallins. The prediction that those surface residues having a major accessible area change and low sequence variability are critical for domain association should be amenable to testing by site-directed mutagenesis followed by biochemical characterization of their association properties.
II. Methods

A. Sequence and structural data

Five βA3-crystallin sequences from mouse, human, rat, bovine and chicken lenses, βB2-crystallin sequences from mouse and human lenses, and βB2-crystallin sequences from bovine, rat and chicken lenses were used (Sergeev & Hejtmancik, unpublished data). All these sequences and the bovine sequence of γB-crystallin were mailed to the PHD server for multiple sequence alignment and secondary structure prediction (8). Files 1gcs and 1blb from the January 1995 Release of the Brookhaven Protein Data Bank were used for γB- and βB2-crystallins, respectively (9). File 1blb contained four molecules, A, B, C and D, grouped as two dimers, AB and CD. Assignment of the secondary structure for PDB files was carried out according to the locally adopted version of DSSP (10).
B. Accessibility calculations

The DSSP program was also used for calculation of the accessible area with a 'probe' sphere of 1.4 Å radius. Accessible area changes for the γB-monomer and βB2-dimers were calculated as follows. The coordinates of the water molecules were removed from the files. The coordinate file of γB-crystallin was split into two files containing the N-terminal domain and part of the interdomain linker (residues from -2 to 85), and the rest of the molecule (the C-terminal domain and arm, residues from 86 to 175). The accessible area was calculated for the file 1gcs containing the coordinates of the whole molecule and for the two files for the isolated domains. The accessibility calculations for βB2-crystallin were repeated for the four isolated molecules and for the two dimers from the 1blb file. The accessible area change for residues in the interdomain interface was calculated by subtracting the accessible area of the whole molecule from that of the isolated domains of γB-crystallin. Accessibility change calculations for isolated molecules and βB2-dimers were carried out in a similar fashion. Residues with an initial accessible area of less than 20 Å² were considered to be buried, while those with initial accessible areas greater than 20 Å² were considered to be surface residues. Determinant residues were chosen from those surface residues which showed low variability of the accessibility change in the interfaces of different monomers. For each residue from the interface we calculated the average accessibility change ⟨ΔA⟩ and its root-mean-square deviation (rms) σ. Residues with the scoring function f = ⟨ΔA⟩/σ < 3.0 were considered to have a high variability of the accessible area change. Otherwise, residues with f ≥ 3.0 were considered to be determinant surface positions.
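The classification rule described above can be written out as a short calculation. The sketch below is an illustration under stated assumptions rather than the authors' procedure: the function names and all input areas are hypothetical, the "initial" accessible area is taken to be that of the isolated domain, and the rms deviation is computed as a population standard deviation about the mean (the text does not say which estimator was used).

```python
from statistics import mean, pstdev


def accessibility_change(area_isolated, area_whole):
    """Accessible area lost on association: isolated-domain area minus whole-molecule area."""
    return area_isolated - area_whole


def score(changes):
    """Return (<dA>, sigma, f) with f = <dA>/sigma for one template position."""
    avg = mean(changes)
    sigma = pstdev(changes)  # assumption: rms deviation about the mean
    f = avg / sigma if sigma > 0 else float("inf")
    return avg, sigma, f


def classify(initial_area, changes, area_cutoff=20.0, f_cutoff=3.0):
    """Apply the 20 Å^2 surface cutoff, then the f >= 3.0 determinant cutoff."""
    if initial_area < area_cutoff:
        return "buried"
    _avg, _sigma, f = score(changes)
    return "determinant" if f >= f_cutoff else "variable"


# Hypothetical accessibility changes (Å^2) for one position in six interface copies.
steady = [30, 36, 33, 29, 37, 33]
scattered = [10, 95, 40, 120, 5, 60]

print(accessibility_change(75.0, 42.0))
print(classify(45.0, steady))
print(classify(80.0, scattered))
print(classify(12.0, steady))
```

A position whose accessibility change is reproducible across interfaces (small σ relative to the mean) scores a high f and is treated as a determinant; a position with a similar or larger mean but wide scatter does not.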
Figure 1. Schematic presentation of β-sheet 2 of the N-domain and amino-terminal residues or β-sheet 4 of the C-domain and the proximal C-terminal extension. Interface residues with large accessibility change in the interface are open circles and buried residues are filled circles. Other residues of the β-sheet are always accessible to the water 'probe' and are shown by squares, while buried residues are shown as filled squares. Residues with large accessibility change are labelled v1-v8 for the variable region, and c1-c7 for the interface residues included in the 3-layer packing of β-sheets (the constant region). The main chain pathways and virtual interstrand connections of up/down equivalent Cα-atoms are shown by solid and interrupted lines, respectively. Strand label in the figure: β3/β11.
C. Structural surface template

The simplified structure of the molecular surface involved in the domain-domain interactions is shown in Fig. 1. Interface residues were selected by calculation of the surface accessible area change. All residues having an accessibility change greater than 20 Å² were considered to take part in the structural template located on the surface of the protein domain. The β-sheet residues and close neighbours were represented on the lattice in a fashion similar to the presentation of the β-barrel by the hydrogen-bonding pattern in (β/α)8-barrel proteins (11). The extensions of β-strands β8 and β16, including residues v1-v4 from the linker or the C-terminus, respectively, were also represented on the 2-dimensional grid as shown in Fig. 1. Because residues in the interdomain interface show 2-fold symmetry (2, 12), the template shown in Fig. 1 was used to represent the asymmetric unit of the molecular surface. Indeed, the structure of antiparallel β-sheets 2 and 4 involved in the domain-domain interactions of γB- and βB2-crystallins can be presented on the same 2-dimensional (2D) lattice. All β-sheets from both domains were aligned on the lattice by the relative position of their residues as determined by their hydrogen-bonding pattern. The position of the residues of each β-sheet has been described by two coordinates: 1) the consecutive number of the residue along the chain segment of the β-strand, and 2) the number of the chain segment. Only the accessible surface positions shown by open circles were taken into account (Fig. 1). The presence or absence of hydrogen bonds in β-sheets and the location of residues in virtual lines (connecting Cα-atoms and showing position within the same up/down motifs of the β-sheet) were determined by visual inspection of the 1gcs and 1blb protein structures using the program GEMM. The planar presentation of the β-sheet has been extended by projecting some additional residues from the C-terminal and interdomain linker onto the plane image for better representation of all residues involved in interdomain interactions.
D. Location of the domain-binding site in protein sequence

The location of the domain-binding site was determined by multiple sequence alignment, estimation of the sequence pattern variability, and properties of residues for particular positions of the structural surface template. The template prepared for the bovine γB- and βB2-crystallins was described above (Fig. 1). The sequences of mouse βA3- and βB2-crystallins were aligned with bovine γB- and βB2-crystallins, and corresponding residues were placed within the structural template. The sequence variability score was calculated for each residue in the surface template as described in the caption to Fig. 2. The location of the common domain-binding site for the bovine γB- and βB2-crystallins and mouse βA3- and βB2-crystallins was determined by considering the residue positions with a positive sequence variability score (> 0) and the residues with conserved properties in corresponding positions of γB- and βB2-crystallins.
III. Results

A. The domain-binding site

The structural surface template was obtained for the asymmetric part of the molecular surface involved in the domain-binding site. The common part of the template, consisting of the accessible residues of the β-sheet surface, is represented in Fig. 2a. Equivalent surface residues from the N- and C-domains were aligned with residues as shown in Fig. 2b. The domain-binding site can be divided into two parts: the highly conserved common part (residue positions c1 - c7) and the part with higher variability of the accessibility change (positions v1 - v8). The average total accessibility changes calculated from Fig. 2c for the variable and common parts are about the same: 416 Å² and 404 Å², respectively. Residues in columns containing β-strands β3 and β11 are not involved in close interactions in the interface (Fig. 1). Residues with the most significant accessibility change are mainly located in β-strands β6, β14, β8 and β16.

Figure 2. Average accessibility changes and the sequence variability for the interface surface residues for both domains in γB- and βB2-crystallins: (a) surface residue positions v1-v8, c1-c7, labelled as shown in Fig. 1; (b) numbers of the equivalent surface residues from the N- and C-terminal domains; (c) accessibility changes (Å²) averaged for both domains in γB-crystallin (1gcs file) and two βB2-dimers (1blb file), with the ratio f = ⟨ΔA⟩/σ shown after the slash; (d) the sequence similarity score for aligned surface positions (b) of the γB- and βB2-crystallins, estimated from the mutation matrix (Gonnet et al., 1992) as score = Σ p(i,j), where the sum was calculated over all i, j = 1,...,4, and p(i,j) is the score for the single mutation. Lattice rows in panel (a): v5 - v6 - v7; c1 - c2 - c3; c4 - c5 - c6.

(a) Position   (b) Residues (N/C)   (c) Change ± rms / f   (d) Score
v1             86/175               91±62 / 1.5            n/a
v2             85/174               53±36 / 1.5            -3.8
v3             84/173               84±20 / 4.2            -6.9
v4             83/172               50±38 / 1.3            -0.8
v5             82/171               33±5 / 6.6             -4.3
v6             40/129               24±13 / 1.8            39.6
v7             59/148               34±40 / 0.9            13.1
c1             81/170               59±8 / 7.4             21.3
c2             41/130               39±8 / 4.9             -1.5
c3             58/147               65±15 / 4.3            9.9
c4             79/168               44±5 / 5.5             28.2
c5             43/132               11±16 / 0.7            17.0
c6             56/145               82±7 / 11.7            11.7
c7             54/143               104±13 / 8.0           16.2
v8             53/142               47±18 / 2.6            -5.5
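The two totals quoted for the variable and common parts can be checked by summing the average accessibility changes of Fig. 2c. The sketch below does this; the values are transcribed from the figure, the assignment of each value to a template position follows the layout of panels (a) and (b) and should be treated as an assumption, and the variable and function names are illustrative only.

```python
# (average accessibility change in Å^2, rms deviation) per template position,
# transcribed from Fig. 2c; the position assignment is an assumption.
FIG_2C = {
    "v1": (91, 62), "v2": (53, 36), "v3": (84, 20), "v4": (50, 38),
    "v5": (33, 5),  "v6": (24, 13), "v7": (34, 40), "v8": (47, 18),
    "c1": (59, 8),  "c2": (39, 8),  "c3": (65, 15), "c4": (44, 5),
    "c5": (11, 16), "c6": (82, 7),  "c7": (104, 13),
}


def total_change(prefix):
    """Sum of the average accessibility changes for positions starting with `prefix`."""
    return sum(avg for pos, (avg, _sigma) in FIG_2C.items() if pos.startswith(prefix))


variable_total = total_change("v")   # positions v1-v8
common_total = total_change("c")     # positions c1-c7

# Common-part positions whose average change exceeds 50 Å^2.
large_common = sorted(
    pos for pos, (avg, _sigma) in FIG_2C.items()
    if pos.startswith("c") and avg > 50
)

print(variable_total, common_total, large_common)
```

With these values the sums come to 416 Å² for the variable part and 404 Å² for the common part, matching the totals given in the text.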
Residues in positions c1, c3, c4, c6 and c7 show both lower variability of the accessibility change (f > 3.0) and a higher sequence similarity score (Fig. 2c, d). Hydrophobic residues are preferably located in positions c1 and c6, and charged residues are located in positions c3 and c4. Only a few residues have an accessible area change greater than 50 Å², which is considered significant (Fig. 2c). Residues c1, c3, c6 and c7 show a remarkably low sequence variability and a significant change in accessible area in both proteins. This suggests that they might serve as structural determinants in the interface. Significant accessibility changes were also observed for the linker and the C-terminus (residues v1 - v4). However, these residues show a weak sequence similarity, calculated from the mutation data (Fig. 2d).
Figure 3. Graphical presentation of amino acid residue packing in the common part of the interdomain interface (see Fig. 1 and 2a). Asterisks identify the symmetry-equivalent positions from the C-terminal domain. The 2-fold symmetry axis is shown in the center of the interface. The surface residues from the N- and C-terminal domains are related by that symmetry and form two nearly parallel layers of residues. Four central residues in each layer form a roughly rectangular figure with sides of approximately 5 × 8 Å joined at angles of about 80° and 110°.
B. Layer packing of the common surface residues

The interdomain interface can be described as the contact of two domain recognition sites, which reside on the surface of the N- and C-terminal domains, respectively, and are related by a 2-fold symmetry axis. Accessibility and sequence similarity characteristics of residues in the interdomain interface are shown in Fig. 2. Contacts between the surface residues of the two β-sheets are depicted schematically in Fig. 3. Positions ci and ci' are related by this 2-fold symmetry axis. Each of the two surface lines in the top of the domain (virtual lines c1-c3 and c4-c6, connecting Cα-atoms) contains three residues: the two in the central rectangle are preferably hydrophobic and the third is a charged residue, often arginine. This type of residue packing on the top of a domain is called layer packing. Fig. 3 shows the top (residues c1 - c3, c1' - c3') and the middle (residues c4 - c6, c4' - c6') layers of the interface. In both γB- and βB2-crystallins, this local two-layer structure of 8 mainly hydrophobic residues at the top may form the local hydrophobic core of the interface. The bottom layer contains only two residues (c7, c7'), which are both polar glutamines in all known βγ-crystallins. All layers of residues are close to parallel.

Residues c3, c3', c4 and c4' are located on the periphery of the top and middle layers (Fig. 2). Polar charged residues are preferably found in these positions in γB- and βB2-crystallins. Both the c4 and c4' positions are always arginine; the c3 and c3' residues are R58 and R147 in γB-crystallin and E58 and E147, located in the same positions, in βB2-crystallin. Residues in positions c4 and c3' participate in the same charge cluster on one side, and residues in positions c4' and c3 form a second charge cluster on the opposite side of the interface. In this fashion, positively charged residues R79 and R147 come together from two layers and interact with D21 from β-strand β3 in the γB-crystallin structure, while on the other side of the interface residues R58, R59 and R168 are also closely approximated. A similar situation occurs in the βB2-crystallin dimer, with the exception that glutamic acid residues 58 and 147 participate in these charged clusters.

In conclusion, the core structure of the interface, common to βB2- and γB-crystallins, is formed by the surface residues of β-sheets 2 and 4, which are packed in a 3-layer structure. The hydrophobic interaction in the center of the top two-layer structure is additionally stabilized by the interaction of charged residues on the periphery and in the third layer. The methylene groups of arginine often have a significant hydrophobic potential and may additionally increase the stability of the two-layer packing.
C. Location of the domain-binding sites in βA3- and βB2-crystallins
The common part of the structural surface template derived from bovine γB- and βB2-crystallins was used to determine the domain-binding sites in mouse βA3- and βB2-crystallins. Residue properties obtained for 5 protein sequences of βA3- and βB2-crystallins in the common part of the domain-binding site are shown in Table I. As shown in that table, residues of the common recognition sites of all sequences have conserved properties. Additionally, all the surface clusters are conserved between the mouse and bovine sequences. The entire mouse βB2-crystallin sequence has only 8 residue changes compared to the bovine sequence, and all these changes are found outside of the domain-binding site area, suggesting that the dimerization of mouse βB2-crystallin should be similar to that of bovine βB2-crystallin.
Table I. Properties of residues in the domain binding site of βA3- and βB2-crystallins
Position    βA3-crystallin^a    βB2-crystallin^a    Property
c1 (c1')    IFIVI (IIIII)       IIIII (IIIII)       hydrophobic
c2 (c2')    AAAAA (AAAAA)       AAAAA (TTTTT)       hydrophobic
c3 (c3')    EEEEE (EKEEE)       EEEEE (EEEEE)       polar charged
c4 (c4')    RRRRR (RRRRR)       RRRRR (RRRRR)       polar charged
c5 (c5')    IIIVV (VVVVV)       VVVVV (VVVVV)       hydrophobic
c6 (c6')    VVVVV (LLLLL)       IIIIV (IIIVI)       hydrophobic
c7 (c7')    QQQQQ (QQQQQ)       QQQQQ (QQQQQ)       polar
^a Residues from mouse, human, bovine, rat and chicken lenses are shown from left to right.
The assignment of residues from the βA3-crystallin sequence to the 2D-lattice of the domain surface was also carried out using multiple sequence alignment. In that assignment the geometrical structure of the surface and the buried lines in the β-sheets is preserved among all the sequences. Amino acid residues from the aligned sequence of the mouse βA3-crystallin were placed in the 2D-lattice (Fig. 4). Those βA3-crystallin residues located in the common
Sheet 2 (one β-strand per line; strands β3, β8, β5, β6):
T19 S20 S21 C22 F23 N24
N90 A89 S88 C87 I86 P85 R84 F83 S82 M81
G41 A42 K43 T44 G45 Y46 E47
R60 E59 L58 I57 F56 Q55 Q54
Sheet 4 (one β-strand per line; strands β11, β16, β13, β14):
D113 D114 Y115 P116 S117
Q185 Q184 I183 R182 R181 V180 S179 Q178
C155 E154 L153 I152 Y151 Q150 Y149
G136 A137 W138 V139 G140 Y141 Q142
Figure 4. The proposed structure of the domain-binding site for βA3-crystallin. Even-numbered β-sheets in bovine γB- and βB2-crystallins are shown in (a) and (b), respectively. The residue numbers correspond to those in PDB files. β-Strands are located in vertical columns and designated as βi, where i = 1, 2, ..., 16. Alternating lines of Cα-atoms, carrying the surface and buried residues of the β-sheet, are positioned on horizontal rows. The symbols (t), (m) and (b) indicate the top, middle and bottom surface lines, respectively. The expansion of the β-sheet at the top and the bottom of the 2D-lattice has been interrupted when the accessible area change for the residues drops below 20 Å². Residues forming the layer structure in the interface are double underlined.
part of the template show similar residue properties. Hydrophobic residues are found in positions participating in the 2-layer interactions in βB2-crystallin, and the glutamine residues at positions Q54 and Q143 of both βγ-crystallins with known 3D structure are conserved. Thus, the physical properties of interface residues involved in the interaction of the two β-sheets are preserved, suggesting that dimer formation in mouse βA3-crystallin might occur by a mechanism similar to that in bovine βB2-crystallin.
IV. Discussion
The structural surface template was derived for the asymmetric unit of the molecular surface buried in the interdomain interface of βγ-crystallin. The common part of the template was selected from all those residues of the surface β-sheets participating in layer packing. It consists of those residues which are highly conserved among all participating domains of both γB- and βB2-crystallins and which show consistently large changes in their surface accessible areas on dimer formation. Those residues with the most similar properties are v6, c1, c3, c4, c5, c6 and c7 (Fig. 2d). The rest of the template, mainly formed by the linker (or the C-terminus) residues, did not show any sequence similarity. However, the variable part of the template is responsible for about 50% of the total accessible surface area change in the template. It seems that the common part of the template plays a key role in determining the specificity of domain binding, while the variable part is responsible for nonspecific binding. The layer packing model of the interdomain interface was critical for recognition of the common part of the template. The important role of the layer structure in the analysis of the core residue packing was first considered for the (β/α)8 barrel proteins by Lesk et al. (13). Later, Sergeev & Lee (11) pointed out that the difference between the elliptical and the circular β-barrel shapes can be explained by the packing of two or three aromatic residues in the central layer of proteins with a circular shape. However, that principle was considered for the residue packing inside globular domains and was never applied to the analysis of residue interactions in the interface of swapped domains.
The introduction of a similar structural principle for the interface residues of two β-structural domains may suggest types of mutations that might be permitted in the interface and may help in designing experiments to study the association properties of β-crystallins. The significance of the common part of the recognition site is shown by comparison with the domain-binding site of protein S, a member of the βγ-crystallin superfamily (17). NMR studies show that the domain-binding site in the protein S structure is different from that in γB- and βB2-crystallins. In protein S, the domain interface is formed by residues from β-sheets 2 and 3 instead of β-sheets 2 and 4 as in γB- and βB2-crystallins. The structural sequence alignment of protein S and γB-crystallin was used to determine which residues reside in the aligned positions of the template for bovine γB-crystallin. The sequence pattern of residues in that case was changed significantly. Indeed, such a distribution of residues cannot produce a favorable interaction for β-sheets 2 and 4. This suggests that protein S should not be able to form a stable dimer with β- or γ-crystallin domains. The ability of acidic and basic β-crystallins to form homo- and heterodimers and higher molecular weight oligomeric structures was studied earlier by Slingsby & Bateman (14), suggesting that the bovine βA3- and βB2-crystallins preferentially form heterodimers. The association behaviour of βA3-crystallin was studied by Hope et al. (15-16). Modeling of the domain-recognition site for both mouse βA3- and βB2-crystallin shows similar domain-binding sites in both proteins. In contrast, the presence of a common binding subsite in these two crystallins suggests that the same site in βA3- and βB2-crystallins may be involved in homo- and hetero-dimer formation and may explain why the acidic and basic β-crystallins participate in hetero-dimer and higher associations experimentally.
References
1. Bax, B., Lapatto, R., Nalini, V., Driessen, H., Lindley, P., Mahadevan, D., Blundell, T., and Slingsby, C. (1990). Nature 347, 776-780.
2. Lapatto, R., Nalini, V., Bax, B., Driessen, H., Lindley, P., Blundell, T., and Slingsby, C. (1991). J. Mol. Biol. 222, 1067-1083.
3. Bennett, M.J., Choe, S., and Eisenberg, D. (1994). Proc. Natl. Acad. Sci. USA 91, 3127-3131.
4. Norledge, B.V., Mayr, E.-M., Glockshuber, R., Bateman, O.A., Slingsby, C., Jaenicke, R., and Driessen, H.P.C. (1996). Nature Struct. Biol. 3, 267-274.
5. Sergeev, Y., Chirgadze, Y., Mylvaganam, S., Driessen, H., Slingsby, C., and Blundell, T. (1988). Proteins 4, 137-147.
6. White, H.E., Driessen, H.P.C., Slingsby, C., Moss, D.S., and Lindley, P.F. (1989). J. Mol. Biol. 207, 217-235.
7. Vriend, G. (1990). J. Mol. Graph. 8, 52-56.
8. Rost, B., and Sander, C. (1994). Proteins 19, 55-72.
9. Abola, E., Bernstein, F.C., Bryant, S.H., Koetzle, T.F., and Weng, J. (1987). In: Crystallographic databases - information content, software systems, scientific applications (Allen, F.H., Bergerhoff, G., and Sievers, R., eds.), pp. 107-132. Data Commission of the International Union of Crystallography, Bonn, Cambridge, Chester.
10. Kabsch, W., and Sander, C. (1983). Biopolymers 22, 2577-2637.
11. Sergeev, Y., and Lee, B. (1994). J. Mol. Biol. 244, 168-182.
12. Nalini, V., Bax, B., Driessen, H., Moss, D., Lindley, P., and Slingsby, C. (1994). J. Mol. Biol. 236, 1250-1258.
13. Lesk, A.M., Branden, C.I., and Chothia, C. (1989). Proteins: Struct. Funct. Genet. 5, 139-148.
14. Slingsby, C., and Bateman, O. (1990). Biochemistry 29, 6592-6599.
15. Hope, J., Chen, H.-C., and Hejtmancik, J.F. (1994a). Prot. Eng. 7, 445-451.
16. Hope, J.N., Chen, H.-C., and Hejtmancik, J.F. (1994b). J. Biol. Chem. 269, 21141-21145.
17. Bagby, S., Harvey, T.S., Eagle, S.G., Inouye, S., and Ikura, M. (1994). Proc. Natl. Acad. Sci. USA 91, 4308-4312.
Complete Mutagenesis of the Gene Encoding TEM-1 β-lactamase
Timothy Palzkill, Wanzhi Huang, and Joseph Petrosino
Department of Microbiology and Immunology and Department of Biochemistry, Baylor College of Medicine, Houston, TX 77030
I. Introduction
The most common mechanism of bacterial resistance to β-lactam antibiotics such as the penicillins and cephalosporins is the synthesis of β-lactamases that cleave an amide bond in the antibiotics to generate inactive products (Wiedemann et al., 1989). Genes encoding β-lactamases can be found on the bacterial chromosome or on plasmids. The active-site serine β-lactamases belong to a larger family of penicillin-recognizing enzymes that includes the penicillin binding proteins (Joris et al., 1988). All of these enzymes contain the active-site serine as well as a conserved triad of K(S/T)G between the active-site serine and the C-terminus (Joris et al., 1988). The class A β-lactamases are a subset of the active-site serine β-lactamases. TEM-1 β-lactamase is a class A enzyme encoded by the blaTEM-1 gene that is present on the transposons Tn2 and Tn3 (Datta et al., 1965). Epidemiological studies have shown that TEM-1 is the most common plasmid-mediated β-lactamase and is therefore a major determinant of bacterial resistance to β-lactam antibiotics (Wiedemann et al., 1989). Compounding the problem of resistance, TEM-1 variants with altered substrate specificity have been identified in natural isolates (Jacoby and Medeiros, 1991). These variant enzymes contain from one to three amino acid substitutions that enable the enzyme to hydrolyze the newer extended-spectrum cephalosporin antibiotics such as cefotaxime and ceftazidime (Jacoby and Medeiros, 1991). Thus, the selective pressure of antibiotic therapy leads to the creation of new enzymes with expanded hydrolytic capabilities. Because of the significant role of TEM-1 β-lactamase and its mutant derivatives in antibiotic resistance, it is of interest to understand how the amino acid sequence of the enzyme establishes its structure and activity.
We have determined the tolerance of each residue in TEM-1 β-lactamase to amino acid substitutions to identify those residues that make critical contributions to the structure and activity of the enzyme. The tolerance of each residue was determined by randomizing three to six contiguous codons to create a random library containing all possible amino acid substitutions for the region randomized (Palzkill and Botstein, 1992). Functional random mutants were then selected from the libraries and sequenced to identify permissible substitutions at each position. The sequences for each set of mutants allowed the importance of
individual positions to be assessed. Similar saturation mutagenesis approaches have been used to identify amino acid residues critical for the function of HIV-1 protease, λ repressor, Lac repressor, T4 lysozyme, and bacteriophage f1 gene V protein (Loeb et al., 1989; Bowie et al., 1990; Markiewicz et al., 1994; Rennell et al., 1991; Terwilliger et al., 1994).
II. Materials and Methods
A. Bacterial strains and Plasmids
Escherichia coli BW313 [Hfr lysA(61-62), dut1, ung1, thi1, recA1, spoT1] was used to propagate plasmid DNA prior to mutagenesis (Kunkel et al., 1987). Mutagenized DNA was initially introduced into E. coli ES1301 [lacZ53, mutS101::Tn5, thyA36, rha5, metB1, deoC, IN(rrnD-rrnE)] (Siegel et al., 1982). E. coli XL1-Blue [recA1, endA1, gyrA96, thi-1, hsdR17, supE44, relA1, lac, [F'::Tn10 (Tetr) proAB, lacIq Δ(lacZ)M15]] was used to assay antibiotic susceptibility and to prepare single-stranded DNA (Bullock et al., 1987). Mutagenesis was performed on the plasmid pBG66, which was used in previous studies (Palzkill and Botstein, 1992).
B. Random Mutagenesis Procedures
The construction of ten of the random libraries has been described previously (Palzkill and Botstein, 1992; Palzkill et al., 1994). The remaining 78 libraries were constructed as follows. First, a unique Sal I restriction site was inserted into a location within the bla gene which had been targeted for mutagenesis. A frameshift mutation, resulting from this insertion, rendered the bla gene nonfunctional. Subsequent randomization was achieved by replacing the unique restriction site with a randomized nine-base DNA sequence. The Sal I restriction site was introduced by site-directed mutagenesis using the method of Kunkel et al. (1987) and an oligonucleotide containing the Sal I recognition sequence, 5'-GTCGAC-3'. The Sal I recognition sequence is flanked by two 12-base arms which are complementary to the sequence adjacent to the site targeted for mutagenesis. The restriction site was positioned at or near the middle of the three codons to be randomized, and the second base of the middle codon was deleted to create a frameshift mutation. Phosphorylation of the oligonucleotides was done using 200 pmol of the primers in the presence of 0.1 M Tris-Cl (pH 7.5), 0.01 M MgCl2, 5 mM DTT, 1 mM ATP, and 5 units of T4 polynucleotide kinase (New England Biolabs) at 37°C for 45 minutes. The T4 kinase was inactivated at 65°C for 10 minutes. The oligonucleotide was then annealed to 200 ng of the ssDNA pBG66 template in the presence of 20 mM Tris-Cl (pH 7.5), 2 mM MgCl2, and 50 mM NaCl. This mixture was heated to 70°C for 10 minutes and was slowly cooled to 30°C. The annealed DNA was then placed on ice. Second strand synthesis was done using 5 units of T7 DNA polymerase (United States Biochemical) in the presence of 0.5 mM dNTPs (Pharmacia), 1 mM ATP, 10 mM Tris-Cl (pH 7.5), 5 mM MgCl2, and 2 mM DTT, as well as 400 units of T4 DNA ligase (New England Biolabs). The synthesis reaction was carried out at 4°C for 5 minutes, then 25°C for 5 minutes, and finally 37°C for 30 minutes.
After the 37°C incubation, 90 µl of a TE stop buffer containing 10 mM Tris-Cl (pH 8) and 10 mM EDTA was added to the reaction to bring the volume up to 100 µl. Following phenol extraction and ethanol precipitation, the mutagenized DNA was transformed into E. coli ES1301 cells by electroporation. Mutants were screened on separate LB plates containing 12.5 µg ml⁻¹ chloramphenicol and 1 mg ml⁻¹
ampicillin. Plasmid DNA containing the Sal I insert was identified by selecting clones that were chloramphenicol resistant and ampicillin susceptible. The plasmid DNA was isolated by an alkaline lysis procedure, and proper insertion of the Sal I recognition sequence was confirmed by DNA sequencing. In order to randomize the targeted regions, plasmid DNA containing the Sal I site was electroporated into E. coli BW313 (Kunkel et al., 1987). Single-stranded DNA for random replacement mutagenesis was prepared from BW313 transformants as described (Sambrook et al., 1989). An oligonucleotide designed to replace the nine-base window (including the Sal I site) with random sequence, 5'-NNS NNS NNS-3' (where N indicates an equal probability of any base, and S indicates an equal probability of either C or G), was used in a second round of mutagenesis. This ensured that all amino acids would be sampled in the window. Two 14-base complementary arms flanked the random sequence. The second round of mutagenesis reactions was carried out as in the first. However, after transformation into E. coli ES1301, the number of transformants needed to be greater than 75,000 to ensure that all possible sequences in the three amino acid window have a 90% chance of being generated. Library DNA was isolated from the E. coli ES1301 cells by alkaline lysis and was electroporated into E. coli XL1-Blue for further screening. The probability that the pool size in each experiment was large enough to contain the least probable (i.e., Trp Trp Trp) sequence combination was calculated using the Poisson distribution $P(x) = \lambda^{x} e^{-\lambda}/x!$. For these calculations $\lambda = np$, where n = pool size, p = probability of the least common sequence, and x = number of times the sequence occurs in a pool of size n. Setting x = 0 gives the probability that the sequence is absent from the pool, $e^{-np}$. The probability that the given sequence occurs is then $1 - e^{-np}$, which is the probability that the sequence occurs one or more times in the pool (Palzkill & Botstein, 1992).
Using these calculations, a library randomized for three codons that was made by pooling 75,000 transformants has a 90.3% probability of containing the Trp Trp Trp sequence. Note that the same library has a >99.9% probability of containing the most common Leu Leu Leu sequence.
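The pool-size calculation above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' own calculation: it assumes the standard genetic code, equal base frequencies at every N and S position, and that the frequency of a three-residue sequence is the product of the per-codon frequencies. The function names `prob_present` and `pool_size_for` are made up for this example. Under these assumptions the Trp Trp Trp probability for 75,000 transformants comes out close to the roughly 90% quoted in the text.

```python
import math
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# NNS: any base at positions 1 and 2, C or G at position 3.
nns_codons = [a + b + c for a, b, c in product(BASES, BASES, "CG")]

# Count how many NNS codons encode each residue ('*' is a stop codon).
codons_per_residue = {}
for codon in nns_codons:
    residue = CODON_TABLE[codon]
    codons_per_residue[residue] = codons_per_residue.get(residue, 0) + 1


def prob_present(pool_size, p):
    """Poisson probability that a sequence of frequency p occurs at least once."""
    return 1.0 - math.exp(-pool_size * p)


def pool_size_for(target, p):
    """Smallest pool size giving at least `target` probability of occurrence."""
    return math.ceil(-math.log(1.0 - target) / p)


n_codons = len(nns_codons)
p_trp3 = (codons_per_residue["W"] / n_codons) ** 3  # least probable triple
p_leu3 = (codons_per_residue["L"] / n_codons) ** 3  # one of the most probable

print("NNS codons:", n_codons)
print("amino acids encoded:", len(set(codons_per_residue) - {"*"}))
print("P(Trp Trp Trp present, n = 75,000):", round(prob_present(75000, p_trp3), 3))
print("P(Leu Leu Leu present, n = 75,000):", round(prob_present(75000, p_leu3), 6))
print("pool size for 90% (Trp Trp Trp):", pool_size_for(0.90, p_trp3))
```

The enumeration also shows why NNS is a convenient randomization scheme: the 32 codons still encode all 20 amino acids while admitting only one of the three stop codons.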
C. Selection of Functional Mutants
To select for functional random mutants, E. coli XL1-Blue cells containing the plasmid library to be tested were streaked on the surface of an LB agar plate that contained 1 mg ml⁻¹ ampicillin (Sigma Chemical Co.). The agar plate was incubated overnight at 37°C. Clearly isolated single colonies were picked the next day, and cultures were grown from the single colonies to isolate ssDNA for sequencing. Alternatively, the isolated single colonies were picked and used directly for PCR to amplify the coding region of the blaTEM-1 gene. The amplified PCR product was then sequenced directly (Hanke and Wink, 1992).
III. Results
A. Randomization Procedure
Two different site-directed mutagenesis approaches were used to generate the set of 88 random libraries that encompass the blaTEM-1 gene. Ten libraries were constructed using the random replacement mutagenesis protocol, which has been described in detail (Palzkill and Botstein, 1992). The remaining 78 libraries were constructed by a combination of linker insertion mutagenesis and oligonucleotide-directed mutagenesis (Huang et al., 1996) (Fig. 1). A set of 78 linker insertion mutants was generated throughout the blaTEM-1 gene by oligonucleotide-directed mutagenesis (Kunkel et al., 1987). Each linker insert lies within a set of codons to be randomized. In addition, each linker contains a Sal I restriction enzyme site. Importantly, the Sal I site is not present elsewhere on the plasmid used for these experiments (Fig. 1). An oligonucleotide was then designed and synthesized that would replace the Sal I site and the three codons with random sequence DNA. Oligonucleotide mutagenesis was carried out using the method of Kunkel et al. (1987) and the reactions were electroporated into E. coli. The transformants were pooled and the plasmid DNA was extracted. The number of colonies pooled at this step is an indication of the probability that the library contains all possible amino acid substitutions. For each of the 88 libraries, greater than 75,000 colonies were pooled. Therefore, all of the libraries in which three codons were mutagenized have a >90% probability of containing all possible amino acid substitutions. The final step was to digest the pooled plasmid DNA with Sal I and electroporate E. coli again (Fig. 1). This procedure effectively eliminates all non-mutagenized molecules and leaves only the random substitutions. This strategy is termed site-selection mutagenesis and has been used by others to create a variety of site-directed mutations (Deng and Nickoloff, 1992). The advantage for the functional selection strategy is that the starting molecule is itself a linker insert mutant of β-lactamase. Thus, there are no unmutagenized, wild-type molecules present in the final libraries. However, it was necessary to perform all DNA manipulations using barrier micropipette tips to eliminate contamination from aerosols originating from the micropipettor. In the absence of these tips, extensive contamination of the random libraries with wild-type plasmid DNA was observed.
Each of the 88 random libraries was used to transform E. coli, and functional random mutants were selected by spreading the transformed cells on agar plates containing 1 mg ml⁻¹ ampicillin. This is the maximal concentration on which E. coli containing the wild-type blaTEM-1 gene on the plasmid used to construct the random libraries can grow. Thus, phenotypically wild-type mutants are selected. The specific activity of enzyme from a number of mutants indicated that, on average, the selected mutants possess 80% of the activity of the wild-type enzyme (Huang et al., 1996).
(A) bla gene in plasmid pBG66, codons 75-77 targeted: 5'-ATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGT-3' (M S T F K V L L C G A V L S R)
(E) Randomized product: 5'-ATGAGCACTTTTAAAGTTNNSNNSNNSGGCGCGGTATTATCCCGT-3'
Figure 1. Randomization procedure used to construct 78 of the 88 bla random libraries. First, nine bases, corresponding to three contiguous codons, are selected for randomization. In this example, codons 75-77 are targeted for mutagenesis. Single-stranded plasmid DNA is shown in (A). A β-lactamase insert mutant is next created, using the Kunkel method of mutagenesis, to insert a unique Sal I restriction site within the region targeted for mutagenesis (B) (Kunkel et al., 1987). Single-stranded insert mutant DNA is then isolated from an ung-, dut- strain of E. coli (C). Randomization is accomplished by annealing an oligonucleotide, designed to replace the target with nine base pairs of random sequence, to the template (D). Second strand synthesis and transformation into an ung+, dut+ strain results in randomization of the targeted codons 75-77 (E). Since mutagenesis is not 100% efficient, some cells will have plasmids still containing the Sal I site. To remove non-mutagenized DNA, pooled plasmid DNA is restricted with Sal I. Non-mutagenized DNA is linearized, and only randomized bla DNA is transformed in the last step.
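The Sal I digestion step described above amounts to a simple filter on the plasmid pool: any molecule that still carries the recognition sequence is linearized and transforms poorly, so only molecules in which the linker was replaced survive. The sketch below illustrates that logic only. The flanking sequences are taken from the codon 75-77 example in Fig. 1, but the linker-insert window and the two randomized windows are hypothetical sequences invented for this illustration, and the helper `is_cut_by_sal_i` is a made-up name.

```python
SAL_I_SITE = "GTCGAC"  # Sal I recognition sequence given in the Methods

# Flanks of codons 75-77 from the Fig. 1 example sequence.
LEFT_FLANK = "ATGAGCACTTTTAAAGTT"
RIGHT_FLANK = "GGCGCGGTATTATCCCGT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def is_cut_by_sal_i(seq):
    """True if the sequence contains a Sal I site.

    The site is palindromic, so scanning one strand is sufficient.
    """
    return SAL_I_SITE in seq


# Hypothetical pool: one unmutagenized linker-insert molecule and two
# molecules whose window was replaced by an NNS NNS NNS sequence.
pool = {
    "linker insert (unmutagenized)": LEFT_FLANK + "CT" + SAL_I_SITE + "GT" + RIGHT_FLANK,
    "random mutant 1": LEFT_FLANK + "GCGTTCAAG" + RIGHT_FLANK,
    "random mutant 2": LEFT_FLANK + "CTGCTCTGC" + RIGHT_FLANK,
}

# Molecules without the site stay circular and are recovered after
# the second electroporation.
survivors = [name for name, seq in pool.items() if not is_cut_by_sal_i(seq)]
print(survivors)
```

Because the starting molecule is the frameshifted linker-insert mutant rather than the wild-type gene, any molecule that escapes this filter without being randomized is still non-functional, which is the advantage the text points out for the functional selection.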
B. Comparison of Tolerance to Amino Acid Substitutions in Mutagenesis Experiments and Sequence Conservation in the Gene Family
To determine the identity of allowable substitutions at each residue position, the DNA sequences of an average of 9 functional random mutants from each library were determined. In total, 43 out of the 263 (16%) mutated residues are inferred to be critical for TEM-1 β-lactamase structure and function, since only the wild-type amino acid is found at these positions among the sequenced mutants. This set of essential residues includes catalytic residues and a number of other amino acids that are buried in the hydrophobic core of the enzyme. A detailed description and analysis of these results has been published elsewhere (Huang et al., 1996). A large number of class A β-lactamases have now been identified and sequenced, and an alignment of 20 class A β-lactamases has been published (Ambler et al., 1991). These aligned sequences permit a comparison between the conserved amino acid residue positions among class A β-lactamases and the conserved positions among the functional random mutants in TEM β-lactamase (Fig. 2). In general, there is agreement between the tolerance of a residue in TEM β-lactamase to amino acid substitutions and the amount a position is substituted in
Figure 2. Comparison of sequence variability among functional TEM-1 β-lactamase mutants and twenty aligned class A β-lactamases. The wild-type TEM-1 β-lactamase primary sequence is shown. Above the sequence are the different amino acids that were identified at that sequence position among functional random mutants. Below the TEM-1 primary sequence are the different amino acids that appear at these positions in an alignment of 20 class A β-lactamases (Figure adapted from Huang et al., 1996).
the class A family. However, a detailed analysis reveals several important differences. The patterns of substitutions between TEM-1 and the class A enzymes can be grouped into 4 classes based on tolerance to amino acid substitutions.
Class 1. Positions substituted in class A and TEM-1 enzymes. 210 residues.
Class 2. Positions not substituted in class A or TEM-1. 13 residues. F66, S70, K73, P107, S130, D131, A134, R164, E166, D179, T180, D233, G236
Class 3. Positions substituted in class A but not TEM-1. 10 residues. E37, G45, L81, N136, G144, G156, D157, L169, L199, L207
Class 4. Positions substituted in TEM-1 but not class A. 30 residues. L30, Y46, P67, T71, L76, L122, A125, N132, D176, T181, W210, D214, V216, A217, L220, R222, P226, W229, A232, K234, S235, G242, R244, G245, L250, G251, M272, N276, I282, W290
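As a quick consistency check, the class sizes listed above can be tallied in Python and compared with the totals quoted in this chapter (263 mutated residues, 43 of which tolerated no substitution). The sketch only re-counts the residue labels given in the four classes; the variable names are illustrative and nothing beyond the listed residues is assumed.

```python
# Residue labels copied from the class lists (Class 1 is given only as a count).
class1_count = 210
class2 = "F66 S70 K73 P107 S130 D131 A134 R164 E166 D179 T180 D233 G236".split()
class3 = "E37 G45 L81 N136 G144 G156 D157 L169 L199 L207".split()
class4 = ("L30 Y46 P67 T71 L76 L122 A125 N132 D176 T181 W210 D214 V216 A217 "
          "L220 R222 P226 W229 A232 K234 S235 G242 R244 G245 L250 G251 M272 "
          "N276 I282 W290").split()

# Total number of positions examined by mutagenesis.
total = class1_count + len(class2) + len(class3) + len(class4)

# Positions at which only the wild-type residue was recovered in TEM-1:
# those conserved everywhere (Class 2) plus those conserved only in TEM-1 (Class 4).
invariant_in_tem1 = len(class2) + len(class4)

# Positions invariant across the aligned class A family (Classes 2 and 3).
invariant_in_class_a = len(class2) + len(class3)

print("total positions:", total)
print("invariant in TEM-1 mutants:", invariant_in_tem1,
      f"({100 * invariant_in_tem1 / total:.0f}%)")
print("invariant in class A alignment:", invariant_in_class_a)
```

The tally reproduces the 43 of 263 (16%) figure given earlier, which confirms that the essential residues are exactly the members of Classes 2 and 4.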
As evidenced by the large size of Class 1 above, the majority of positions in TEM-1 and class A enzymes may be substituted to some extent. Class 4 is also a large class and represents residues that make essential interactions in TEM-1 but are not conserved among other class A enzymes. An example of a Class 4 residue is Leu76. Leu76 is part of a buried, hydrophobic cluster of residues including Phe72, Ala126, and Ala135 (Jelsch et al., 1993) (Fig. 3). The B. licheniformis and S. aureus enzymes have a threonine and an asparagine residue, respectively, at position 76 (Fig. 3) (Herzberg, 1991; Knox and Moews, 1991). In the B. licheniformis and S. aureus enzymes, the residues surrounding residue 76 are predominantly hydrophilic: residue 76 interacts with its neighbors by means of hydrogen bonds rather than hydrophobic interactions, as in TEM-1. Such a change in environment may have occurred by coupled, compensating substitutions that altered the character of the region. For example, introduction of an Asn residue at position 76 of the TEM-1 enzyme would, based on the mutagenesis results, create a non-functional enzyme. Selective pressure would then favor additional mutations that either revert residue 76 to Leu or provide a new hydrogen bonding partner for Asn76.
C. Identification of Substitutions that Correct the Asn76 Defect
To test the hypothesis that compensating mutations have occurred in the class A β-lactamase gene family in the vicinity of residue 76, position 76 was first converted to an asparagine residue so that residue 76 is identical to that of the S. aureus enzyme. As expected, the function of this enzyme was drastically reduced. Whereas E. coli containing the wild-type enzyme is able to grow on agar plates containing 1 mg/ml ampicillin, E. coli containing the Asn76 enzyme can only grow on agar plates containing 50 µg/ml or less of ampicillin. Western blots have shown that the Asn76 enzyme is poorly expressed relative to the wild-type enzyme (data not shown). To assess whether compensating mutations can occur in the vicinity of the Asn76 residue, suppressor mutants of the Asn76 mutant were isolated. These mutants were isolated by introducing the plasmid containing the blaTEM gene
Figure 3. Ribbon diagram of the Leu76 region of the TEM-1 (Jelsch et al., 1993), S. aureus (Herzberg, 1991) and B. licheniformis (Knox and Moews, 1991) structures. For simplicity, only those residues that are hydrophobic in TEM-1 and hydrophilic in the S. aureus or B. licheniformis structures are shown. Figure prepared with MOLSCRIPT (Kraulis, 1991) (Figure adapted from Huang et al., 1996).
encoding the Asn76 substitution into E. coli ES1301. This E. coli strain is defective in mismatch repair and accumulates mutations at a rate approximately 100-fold higher than wild-type E. coli (Siegel et al., 1982). The strain was grown for 20 generations and the plasmid DNA was isolated and used to electroporate E. coli XL1-Blue (Bullock et al., 1987). Suppressor mutants were isolated by spreading the transformed cells on agar plates containing 500 µg/ml ampicillin. Ten colonies were picked and plasmid DNA was isolated and used to re-transform E. coli XL1-Blue. In each case the ampicillin resistance phenotype was linked to the plasmid containing the blaTEM gene. The DNA sequence of the blaTEM gene of six mutants was determined. In one case the codon at position 76 had accumulated 2 nucleotide substitutions to revert to a leucine residue (AAC Asn -> CTC Leu). Two mutants exhibited a single nucleotide change at codon 76 to introduce an isoleucine (AAC Asn -> ATC Ile). Finally, three mutants contained a nucleotide substitution at codon 182 that resulted in the replacement of methionine with threonine (Met182 ATG -> Thr182 ACG). Interestingly, the α-carbons of residues 76 and 182 are 17 Å apart and thus do not directly interact. These results do not eliminate the possibility that compensating mutations can occur in the immediate vicinity of the residue 76 side chain, as appears to have occurred in the gene family, but they do indicate that there are multiple mutational pathways for correcting a defect in a protein. Studies are in progress to introduce substitutions at positions whose side chains directly interact with Asn76 to determine if they can also suppress the defect caused by the Asn76 substitution.
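The codon changes reported for the six sequenced suppressors can be tallied with a few lines of Python. This is only an illustrative sketch: the codon table is restricted to the five codons named in the text, and the function and variable names are hypothetical rather than part of any published analysis.

```python
# Genetic-code entries for only the codons named in the text.
CODON_TO_RESIDUE = {
    "AAC": "Asn", "CTC": "Leu", "ATC": "Ile", "ATG": "Met", "ACG": "Thr",
}

# (residue position, parent codon, suppressor codon, number of mutants sequenced)
SUPPRESSORS = [
    (76, "AAC", "CTC", 1),
    (76, "AAC", "ATC", 2),
    (182, "ATG", "ACG", 3),
]


def nucleotide_changes(parent, mutant):
    """List (codon position, old base, new base) for each mismatch."""
    if len(parent) != 3 or len(mutant) != 3:
        raise ValueError("codons must be three nucleotides long")
    return [
        (i + 1, old, new)
        for i, (old, new) in enumerate(zip(parent, mutant))
        if old != new
    ]


def describe(position, parent, mutant, count):
    """Return a one-line summary of one class of suppressor."""
    changes = nucleotide_changes(parent, mutant)
    return (
        f"{CODON_TO_RESIDUE[parent]}{position} {parent} -> "
        f"{CODON_TO_RESIDUE[mutant]}{position} {mutant}: "
        f"{len(changes)} nucleotide change(s), {count} mutant(s)"
    )


for entry in SUPPRESSORS:
    print(describe(*entry))
```

The reversion to leucine is the only class that needs two base changes, which fits its being the rarest of the three classes recovered.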
IV. Discussion
Improvements in oligonucleotide synthesis methods in recent years have greatly reduced the cost of oligonucleotide-directed mutagenesis experiments. This has enabled large-scale mutagenesis projects, such as that described here, to be conducted. In a few months, one can create 50-100 site-directed mutations. This allowed us to create SalI linker inserts by oligonucleotide-directed mutagenesis at 78 sites in the blaTEM-1 gene. An additional 78 oligonucleotides were then used to randomize the codons that had been targeted by the linker insertions. In addition, the widespread use of automated DNA sequencing should facilitate the determination of mutant sequences obtained using the functional selection approach. Finally, the development of phage display technology has made functional selections possible for a large variety of proteins and targets. Taken together, these improvements in technology will permit large-scale oligonucleotide mutagenesis studies to be performed routinely.
References
Ambler, R. P., Coulson, F. W., Frere, J.-M., Ghuysen, J.-M., Joris, B., Forsman, M., Levesque, R. C., Tiraby, G. & Waley, S. G. (1991). Biochem. J. 276, 269-272.
Bowie, J. U., Reidhaar-Olson, J. F., Lim, W. A. & Sauer, R. T. (1990). Science 247, 1306-1310.
Bullock, W. O., Fernandez, J. M. & Short, J. M. (1987). BioTechniques 5, 376-379.
Datta, N. & Kontomichalou, P. (1965). Nature 208, 239-241.
Deng, W. P. & Nickoloff, J. A. (1992). Anal. Biochem. 200, 81-88.
Hanke, M. & Wink, M. (1994). BioTechniques 17, 858-860.
Herzberg, O. (1991). J. Mol. Biol. 217, 701-719.
Huang, W., Petrosino, J., Hirsch, M., Shenkin, P. S. & Palzkill, T. (1996). J. Mol. Biol. 258, 688-703.
Jacoby, G. A. & Medeiros, A. A. (1991). Antimicrob. Agents and Chemother. 35(9), 1697-1704.
Jelsch, C., Mourey, L., Masson, J.-M. & Samama, J.-P. (1993). Proteins 16, 364-383.
Joris, B., Ghuysen, J.-M., Dive, G., Renard, A., Dideberg, O., Charlier, P., Frere, J.-M., Kelly, J. A., Boyington, J. C., Moews, P. C. & Knox, J. R. (1988). Biochem. J. 250, 313-324.
Knox, J. R. & Moews, P. C. (1991). J. Mol. Biol. 220, 435-455.
Kraulis, P. J. (1991). J. Appl. Crystallogr. 24, 946-950.
Kunkel, T. A., Roberts, J. D. & Zakour, R. A. (1987). Methods Enzymol. 154, 367-382.
Loeb, D. D., Swanstrom, R., Everitt, L., Manchester, M., Stamper, S. E. & Hutchison, C. A. (1989). Nature 340, 397-400.
Markiewicz, P., Kleina, L. G., Cruz, C., Ehret, S. & Miller, J. H. (1994). J. Mol. Biol., 421-433.
Palzkill, T. & Botstein, D. (1992). Proteins 14, 29-44.
Palzkill, T., Le, Q.-Q., Venkatachalam, K. V., LaRocco, M. & Ocera, H. (1994a). Mol. Microbiol. 12, 217-229.
Rennell, D., Bouvier, S. E., Hardy, L. W. & Poteete, A. R. (1991). J. Mol. Biol. 222, 67-87.
Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989). Molecular cloning: a laboratory manual. 2nd edit. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
Siegel, E. C., Wain, S. L., Meltzer, S. F., Binion, M. L. & Steinberg, J. L. (1982). Mutation Research 93, 25-33.
Terwilliger, T. C., Zabin, H. B., Horvath, M. B., Sandberg, W. S. & Schlunk, P. M. (1994). J. Mol. Biol. 236, 556-571.
Wiedemann, B., Kliebe, C. & Kresken, M. (1989). J. Antimicrob. Chemother. 24, 1-24.
Characterization of Truncated Kirsten-Ras Purified from Baculovirus Infected Insect Cells Indicates Heterogeneity due to N-terminal Processing and Nucleotide Dissociation
Lisa M. Churgay, Nancy B. Rankl, John M. Richardson, Gerald W. Becker and John E. Hale
Lilly Research Labs, Indianapolis, IN
I. Introduction
Kirsten-ras is the most frequently activated oncogene in human tumors (1). Activated K-ras is bound to GTP and possesses intrinsic GTPase activity which leads to its inactivation. Oncogenic K-ras has dramatically lower GTPase activity. The importance of K-ras in tumorigenesis makes it an important target for drug development. Determination of the crystal structure of K-ras will aid in the rational design of anti-cancer therapeutics. To this end, truncated K-ras was expressed in baculovirus infected insect cells, which have been suggested as an appropriate system for its production (1). K-ras was highly expressed in these cells and was purified to apparent homogeneity as evaluated by SDS-PAGE. Analytical DEAE chromatography and electrospray-ionization mass spectrometry (ESI-MS) of the purified protein indicated substantial heterogeneity. The different proteins were characterized by tryptic mapping utilizing LC-MS. This analysis indicated the presence of at least 4 different N-terminal variants of K-ras and additional heterogeneity due to dissociation of bound nucleotide, indicating that unwanted cellular processing of proteins may occur in baculovirus infected cells. This processing may impact the further usefulness
of the protein, particularly in the case of protein crystallography, in which N-terminal variants may impair the ability to obtain crystals suitable for diffraction studies. Thus, additional purification steps must be included in order to separate these very similar molecules, significantly reducing yields. More preferably, other cell lines or growth conditions will be evaluated in an effort to minimize the heterogeneity from this cellular processing and to obtain protein suitable for structural studies. These results demonstrate that careful characterization of purified recombinant proteins must be undertaken in order to understand the extent of cellular modification and determine the impact on the biological and physical behavior of these proteins.

II. Materials and Methods
A. Kirsten ras protein production
A truncated analog containing residues 1-166 of K-ras 4B (Val 12) and a C-terminal Arg-Ser dipeptide was produced in Sf9 (Spodoptera frugiperda) insect cells. Cells were cultured in Grace's insect medium supplemented with 3.33 mg/ml yeastolate, 3.33 mg/ml lactalbumin hydrolysate (JRH Biosciences), 10% FBS (Atlanta Biologicals), 1% antibiotic/antimycotic (Sigma), and 0.1% pluronic F-68 (JRH Biosciences) using magnetic spinner flasks. Infections were performed in 9 L stirred vessels as previously described (2) by seeding the vessels at a final density of 8.5X10^ cells/ml using virus at an MOI of 5.
B. Preparative purification of K-ras
Cell pellets from 4 L of baculovirus infected insect cells were resuspended in 25 mM Tris-HCl, pH 8.0, 5 mM DTT, 250 mM sucrose and protease inhibitor cocktail tablets (Boehringer Mannheim). Cells were homogenized and centrifuged at 38,000 x g for 20 min. The supernatant was ultracentrifuged at 100,000 x g for 2 hr, filtered, and loaded over a 21.5 mm ID x 15 cm DEAE column
(Tosohaas) equilibrated in 25 mM Tris-HCl, pH 8.0, and 5 mM DTT. The protein was eluted with a NaCl gradient from 0-0.5 M developed over 85 min. Fractions containing the K-ras were identified on 4-20% tris-glycine gels (Novex), pooled, and concentrated to 10 ml in an Amicon stirred cell. Protein was passed over a Superdex 75 Prepgrade 35/600 column (Pharmacia) equilibrated in 10 mM MOPS, pH 7.0, 100 mM NaCl, 5 mM DTT, and 1 mM MgCl2. Column fractions were analyzed by SDS-PAGE and a peak fraction was analyzed by mass spectrometry.
C. Analysis of K-ras
The peak fraction was dialyzed overnight into 10 mM MOPS, pH 7.0, 5 mM DTT and 1 mM MgCl2, and approximately 5 mg was injected onto a 7.5 mm x 7.5 cm DEAE column (Tosohaas). The K-ras was eluted with a NaCl gradient from 0-250 mM developed over 75 min. Protein peaks were isolated and digested overnight at 37°C with trypsin at an enzyme to substrate ratio of 1:25. Reversed phase HPLC was done on the digested protein using a Vydac C18 (4.5 mm x 25 cm) column and peaks were eluted with a linear gradient from 0-50% acetonitrile (0.1% TFA) in 60 min. Intact protein was analyzed by reversed phase HPLC on a Vydac C18 column with a linear gradient from 30-60% acetonitrile (0.1% TFA) over 60 min. All protein sequence analysis was performed on a Procise sequencer (Applied Biosystems, Foster City, CA).
D. Electrospray-mass spectral analysis
All mass spectra were obtained on a PE-Sciex triple quadrupole instrument (model API III) as described (3). Collisionally induced dissociation (CID) MS/MS experiments were performed in the positive ion detection mode with the orifice potential set at +50 V and the argon collision gas thickness maintained at 315 X 10^^ molecules/cm^. Product ion scans were averaged over a range of 50-600 u in 0.1 u intervals for a dwell time of 1 msec per interval.

III. Results
A. Purification of K-ras from insect cells and initial characterization
Insect cell cytoplasm was initially purified by preparative DEAE chromatography. Fractions containing K-ras were pooled and fractionated over a Superdex-75 column. The protein obtained from this purification appeared to be homogeneous by SDS-PAGE (figure 1A). This protein preparation was subjected to ESI-MS analysis and multiple masses were noted (figure 1B). The mass expected for the K-ras protein was detected (19146); however, additional masses were seen, including those at 19012, 19055 and 19186. N-terminal sequence analysis of the protein preparation yielded a major sequence of MTEY and a minor sequence of TEYK, indicating that some N-terminal processing of the K-ras, resulting in removal of the methionyl residue, had occurred.
B. Identification of the N-terminally processed forms of K-ras
The K-ras mixture was analyzed on a TSK-DEAE HPLC column. Four major peaks were seen to elute from this column (figure 2) and these peaks were analyzed by ESI-MS. Peak 1 was primarily mass 19146 with minor components of 19012 and approximately 19590. Peak 2 was primarily 19055 with minor components of 19189 and approximately 19500. Peak 3 was primarily 19146 with a minor component of 19012, and peak 4 was primarily 19055 with a minor component of 19189. We digested peaks 1 and 2 with trypsin and separated the peptides by reversed phase HPLC. A single peptide was seen to be shifted in the digests of peaks 1 and 2 (figure 3). LC-MS of these digests indicated that this peptide was the N-terminal tryptic peptide in peak 1 with a mass of 671. This peptide was absent in the peak 2 digest and a new peak was present eluting slightly earlier in the gradient with a mass of 582. Thus the N-terminus of the protein in peak 2 was modified. This peptide was
[Figure 1 panels. A: SDS-PAGE gel, lanes 1-3, with molecular weight markers at 200, 116, 97, 66, 55, 35, 31, 21, 14 and 6 kDa. B: reconstructed mass distribution, x-axis Molecular Weight (18,800 to 19,600), with labeled peaks at 19,012, 19,146, 19,186, 19,497 and 19,586.]
Figure 1. A, SDS-PAGE of K-ras produced in baculovirus infected insect cells. Protein samples were electrophoresed on a 4-20% SDS gel under reducing conditions. 1, MW standards; 2, preparative DEAE pool; 3, purified Kirsten ras. B, Mass distribution of the purified K-ras reconstructed from the ESI-MS using PE-Sciex MacSpec software.
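The masses in Figure 1B can be related to one another by simple mass arithmetic. The sketch below is a reader's check, not the authors' software: it starts from the expected mass of 19146 given in the text and applies average mass shifts that are assumptions of this example (loss of a methionine residue, 131.19 Da; N-terminal acetylation, +42.04 Da; a retained GDP, +443.20 Da, which the Results go on to propose for the heaviest components). The 5 Da tolerance and all names are hypothetical.

```python
MET_RESIDUE = 131.19  # average mass of a methionine residue (Da); assumption
ACETYL = 42.04        # net mass added by N-terminal acetylation (Da); assumption
GDP = 443.20          # average mass of GDP (Da); assumption
EXPECTED = 19146.0    # expected mass of the truncated K-ras, from the text


def candidate_forms(expected=EXPECTED):
    """Enumerate masses for every combination of the three modifications."""
    forms = {}
    for met_label, met_shift in (("", 0.0), ("des-Met ", -MET_RESIDUE)):
        for ac_label, ac_shift in (("", 0.0), ("acetyl ", ACETYL)):
            for gdp_label, gdp_shift in (("", 0.0), (" + GDP", GDP)):
                name = f"{met_label}{ac_label}K-ras{gdp_label}"
                forms[name] = expected + met_shift + ac_shift + gdp_shift
    return forms


def assign(observed, forms, tolerance=5.0):
    """Return the closest candidate form, or None if none is within tolerance."""
    name, mass = min(forms.items(), key=lambda item: abs(item[1] - observed))
    return name if abs(mass - observed) <= tolerance else None


FORMS = candidate_forms()
for observed in (19012, 19055, 19146, 19189, 19500, 19590):
    print(observed, "->", assign(observed, FORMS))
```

A fixed tolerance of a few daltons is a crude stand-in for the real accuracy of a reconstructed ESI mass, so an assignment made this way still needs the sequencing and MS/MS confirmation described in the text.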
[Figures 2 and 3 are printed sideways in the original and their captions did not survive extraction. Recoverable axis labels: Abs (280 nm) and Abs (214 nm).]
subjected to CID MS/MS. Analysis of the daughter ions of this peptide indicated that it is acetyl-TEYK (figure 4). Thus the component of mass 19055 is des-methionyl-acetyl-K-ras. The K-ras mixture was injected onto a reversed phase column and eluted with an acetonitrile gradient. Four peaks were detected and LC-MS analysis indicated that peak 1 had the mass 19012; peak 2, 19146; peak 3, 19055; and peak 4, 19189 (figure 5). N-terminal sequence analysis of peak 1 gave the sequence TEYK, indicating that the component of mass 19012 is des-methionyl-K-ras. Peak 4 was N-terminally blocked and tryptic mapping of this peak yielded a single peptide difference from the peptide map of K-ras. This peptide was subjected to CID MS/MS. Analysis of the daughter ions of this peptide indicated that it is acetyl-MTEYK (data not shown). Thus the component of mass 19186 is acetyl-K-ras.
C. Identification of heterogeneity due to nucleotide dissociation
N-terminal processing of K-ras is sufficient to explain the heterogeneity due to peaks 1 and 2 on the analytical DEAE chromatogram. However, peaks 3 and 4 on this chromatogram indicated no evidence of additional N-terminal forms. There were, however, minor components 442-444 mass units heavier in peaks 1 and 2 that were not present in peaks 3 and 4. This could be due to some residual protein-GDP complex that survived the ESI-MS analysis. In order to examine this possibility, peaks 1-4 from the analytical DEAE run were dialyzed into ammonium bicarbonate buffer and analyzed by negative mode ESI-MS. The analyses for peaks 1 and 3 are shown in figure 6. A component of mass 442 is present in the spectrum for peak 1 and virtually absent in that for peak 3. CID tandem MS of this component indicated that it was GDP. Similar analysis indicated the presence of GDP in DEAE peak 2 but not peak 4 (data not shown). Thus DEAE
[Figures 4 to 6 are printed sideways in the original and their captions did not survive extraction. Recoverable axis labels: Relative Intensity (%) and Abs (280 nm).]
peaks 3 and 4 arise from dissociation of GDP from the different N-terminal variants of K-ras.

IV. Conclusions
Baculovirus infected insect cells are a good system for the overexpression of heterologous proteins. The level of soluble K-ras produced by these cells is quite high, as seen here and by others (1). Insect cells are known to post-translationally modify proteins, and indeed several groups have noted N-terminal acetylation of proteins produced in these cells (3-6). We have observed partial processing of the N-terminus of K-ras both in the removal of the N-terminal residue and in the acetylation of the resulting amino terminus. This heterogeneity in processing could be a result of the level of overexpression of K-ras in this case. It is possible that the level of K-ras exceeds the capacity of the cellular enzymes involved in the processing. Thus, lower levels of expression may lead to more completely processed protein. We do not know, however, whether this type of N-terminal modification is desirable for production of diffractable crystals of K-ras. We have also demonstrated the utility of analytical DEAE coupled with ESI-MS in the determination of bound nucleotide. This technique may be useful in evaluation of K-ras protein that has been loaded with GTP analogues in order to activate it (7). These results also demonstrate the power of appropriate use of mass spectra in the interpretation of protein characterization data. The source of peaks 3 and 4 from the analytical DEAE column could have been interpreted as the result of deamidation of the protein, which would be expected to yield more negatively charged protein and later eluting peaks by DEAE chromatography. Deamidation of protein does not lead to substantial alteration in its molecular mass. Negative mode ESI-MS analysis, however, demonstrated that the heterogeneity was due, in fact, to GDP dissociation
allowing for the correct identification of all forms of the expressed protein.

References
1. Lowe, P.N., Page, M.J., Bradley, S., Rhodes, S., Sydenham, M., Paterson, H., Skinner, R.H. (1991) J. Biol. Chem. 266, 1672-1678.
2. Rice, J.W., Rankl, N.B., Gurganus, T.M., Marr, C.M., Barna, J.B., Walters, M.M., Burns, D.J. (1993) BioTechniques 15, 1052-1059.
3. Becker, G.W., Miller, J.R., Kovacevic, S., Ellis, R.M., Louis, A.L., Small, J.S., Stark, D.H., Roberts, E.F., Wyrick, T.K., Hoskins, J.-A., Chiou, X.G., Sharp, J.D., McClure, D.B., Riggin, R.M., Kramer, R.M. (1994) Bio/Technology 12, 69-74.
4. Urbancikova, M., Hitchcock-DeGregori, S.E. (1994) J. Biol. Chem. 269, 24310-24315.
5. Han, B-D., Livingstone, L.R., Paseck, D.A., Yablonski, M.J., Jones, M.E. (1995) Biochemistry 34, 10835-10843.
6. Nishimura, C., Yamaoka, T., Mizutani, M., Yamashita, K., Akera, T., Tanimoto, T. (1991) Biochim. Biophys. Acta 1078, 171-178.
7. John, J., Sohmen, R., Feuerstein, J., Linke, R., Wittinghofer, A., Goody, R.S. (1990) Biochemistry 29, 6058-6065.
Isolation and Characterization of Multiple-Methionine Mutants of T4 Lysozyme with Simplified Cores
Nadine C. Gassner, Walter A. Baase, Joel D. Lindstrom, Brian K. Shoichet^ and Brian W. Matthews
Institute of Molecular Biology, Howard Hughes Medical Institute and Department of Physics, University of Oregon, Eugene, Oregon 97403-1229
I. Introduction
Tight packing of amino acids in the cores of globular proteins has led to the idea that the complementary sizes and shapes of side-chains may help define the overall protein structure. The aim of the present study was to test the degree to which precise spatial complementarity between core residues is required to maintain native-like protein properties. Sites within the carboxy-terminal domain core of phage T4 lysozyme were substituted singly and as a group with methionine to produce a simplified core sequence. The properties of such mutant lysozymes are briefly described. In addition we describe a method to isolate mutant protein from inclusion bodies and a sensitive enzymatic assay to detect small differences in mutant protein activities.
^Present address: Department of Molecular Pharmacology and Biological Chemistry, Northwestern University Medical School, 303 E Chicago Avenue, Chicago, IL 60611-3008
II. Materials and Methods

A. Selection Criteria for Substitutions
The carboxy-terminal domain of T4 lysozyme (residues 81-164) is composed of seven helices and includes the largest contiguous set of buried residues in the protein. Side-chains were considered to be part of the core if they have less than 10% solvent accessible surface.

Table I. Characteristics of methionine and methionine-substitution sites

Columns: (1) accessibility of side-chain to solvent^a; (2) solvent transfer relative to methionine, ΔGtr (kcal/mol)^b; (3) side-chain entropy relative to alanine, TΔS (kcal/mol)^c; (4) volume of amino acid relative to alanine (Å^3)^d; (5) volume available for substitution relative to alanine (Å^3)^e.

Amino acid    (1)     (2)     (3)    (4)    (5)
Ile 78        0.00   -0.78    0.75    57     85
Leu 84        0.00   -0.64    0.66    57     52
Leu 91        0.00   -0.64    0.66    57    107
Leu 99        0.00   -0.64    0.66    57    130
Ile 100       0.06   -0.78    0.75    57     23
Met 102       0.02    0.00    1.51    57      -
Val 103       0.05    0.02    0.29    38     12
Met 106       0.22    0.00    1.51    57      -
Leu 118       0.10   -0.64    0.66    57     88
Met 120       0.21    0.00    1.51    57      -
Leu 121       0.01   -0.64    0.66    57    145
Leu 133       0.01   -0.64    0.66    57    155
Phe 153       0.00   -0.76    0.63    68    132

^a Fraction of the side-chain in the wild-type structure accessible to solvent, calculated by the method of Lee and Richards (1971) with a probe radius of 1.4 Å.
^b From Fauchere and Pliska (1983).
^c Entropy cost of localizing the side-chain at 300 K. The quoted values are the average of the estimates of Creamer and Rose (1992), Pickett and Sternberg (1993), Sternberg and Chickos (1994) and Lee et al. (1994).
^d Van der Waals volume of the amino acid relative to alanine (Creighton, 1992).
^e Space available if the side-chain is truncated to alanine in the model of the wild-type structure and the volume of the resultant cavity estimated by the method of Connolly (1983) with a probe radius of 1.2 Å.
Six leucines (Leu 84, 91, 99, 118, 121 and 133), two isoleucines (Ile 78 and 100), one phenylalanine (Phe 153) and one valine (Val 103) were chosen for substitution with methionine. All residues are within α-helical regions or turns. The carboxy-terminal domain also contains a single, completely buried methionine (Met 102), and two others (Met 106 and Met 120), the side-chains of which are about 80% buried (Table I).

Table II. Activity and stability of methionine-substituted lysozymes

Mutant      Activity   ΔTm      ΔH(Tm)       ΔH(ref)      ΔΔG
            (%)        (°C)     (kcal/mol)   (kcal/mol)   (kcal/mol)
WT*         100          0      130          115            0
I78M         70         -3.7    117          111           -1.5
L84M        104         -4.9    110          108           -1.9
L91M         96         -2.0    125          115           -0.8
L99M^a       90         -1.3    134          122           -0.4
I100M       105         -4.5    125          121           -1.6
V103M        70         -3.1    117          109           -1.2
L118M        98         -1.8    130          119           -0.7
L121M        87         -2.1    129          119           -0.8
L133M       106         -1.0    128          115           -0.4
F153M^a      87         -1.6    128          116           -0.6
7-Met^b      43        -14.5     96          117           -5.0
10-Met^b    ≈20        -25       42           88           -7.3

Activity and stability were determined as in the text. For the 10-Met mutant a loss in activity was seen with time. The melting temperature, Tm, of WT* lysozyme was 65.3°C. ΔTm is the change in the Tm of the mutant relative to wild-type. For the single mutants the uncertainty in ΔTm is ±0.2°C; for the multiple mutants it is ±0.5°C. ΔH(Tm) is the enthalpy of unfolding measured at Tm. The uncertainty is ±5 kcal/mol. ΔH(ref) is the enthalpy of unfolding calculated at the reference temperature of 59°C using a constant ΔCp of 2.5 kcal/mol-deg. ΔΔG is the free energy of unfolding of the mutant relative to wild-type. ΔG values were computed at 59°C using a constant ΔCp of 2.5 kcal/mol-deg. The uncertainty in ΔΔG is ±0.1 kcal/mol for the single mutants and ±0.4 kcal/mol for the 7-Met replacement. Because of the low value of ΔH of the 10-Met mutant, ΔΔG was determined at the Tm of the mutant with an estimated uncertainty of about 1 kcal/mol.
^a Mutants L99M and F153M were previously described (Eriksson et al., 1993).
^b The 7-methionine mutant includes the substitutions L84M/L91M/L99M/L118M/L121M/L133M/F153M. The 10-methionine variant includes the additional substitutions I78M/I100M/V103M.
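The ΔH(ref) column of Table II can be cross-checked against the other columns. The sketch below assumes the usual constant-ΔCp (Kirchhoff) extrapolation, ΔH(ref) = ΔH(Tm) + ΔCp (Tref - Tm); the table notes give ΔCp and the reference temperature but do not spell out the formula, so treating it as the authors' procedure is an assumption. Function and variable names are hypothetical.

```python
TM_WT = 65.3  # melting temperature of WT* (deg C), from the Table II notes
T_REF = 59.0  # reference temperature (deg C), from the Table II notes
DCP = 2.5     # constant heat-capacity change (kcal/mol-deg), from the notes

# mutant: (delta Tm in deg C, dH at Tm, tabulated dH(ref)); enthalpies in kcal/mol
TABLE_II = {
    "WT*": (0.0, 130, 115),
    "I78M": (-3.7, 117, 111),
    "L84M": (-4.9, 110, 108),
    "L91M": (-2.0, 125, 115),
    "L99M": (-1.3, 134, 122),
    "I100M": (-4.5, 125, 121),
    "V103M": (-3.1, 117, 109),
    "L118M": (-1.8, 130, 119),
    "L121M": (-2.1, 129, 119),
    "L133M": (-1.0, 128, 115),
    "F153M": (-1.6, 128, 116),
    "7-Met": (-14.5, 96, 117),
    "10-Met": (-25.0, 42, 88),
}


def dh_at_reference(delta_tm, dh_tm):
    """Extrapolate the unfolding enthalpy from Tm to the reference temperature."""
    tm = TM_WT + delta_tm
    return dh_tm + DCP * (T_REF - tm)


def deviations(table=TABLE_II):
    """Calculated minus tabulated dH(ref) for every row of the table."""
    return {
        name: dh_at_reference(delta_tm, dh_tm) - tabulated
        for name, (delta_tm, dh_tm, tabulated) in table.items()
    }


for name, diff in deviations().items():
    print(f"{name:7s} calculated minus tabulated = {diff:+.2f} kcal/mol")
```

Under this assumption every row agrees with the tabulated value to well within the quoted ±5 kcal/mol uncertainty in ΔH(Tm).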
B. Mutagenesis
Mutant proteins were constructed in a cysteine-free background, i.e. Cys 54 -> Thr plus Cys 97 -> Ala (C54T/C97A), which will be referred to as WT* (Matsumura and Matthews, 1989). Methionines were introduced singly at each of the core sites listed in Table II. Multiple methionine variants were then made from different combinations of the single core sites. Among these are a seven-methionine mutant that includes the substitutions L84M/L91M/L99M/L118M/L121M/L133M/F153M, and a ten-methionine mutant that includes the additional replacements I78M/I100M/V103M.
C. Protein Purification
The single-Met and 7-Met variant proteins were purified according to standard methods (Alber and Matthews, 1987; Poteete et al., 1991; Muchmore et al., 1989). The ten-methionine variant was expressed as inactive inclusion bodies (Rudolph and Lilie, 1996), which were observed directly by phase contrast microscopy in the E. coli cells as refractile bodies. To harvest the inclusion bodies, the cells were pelleted, weighed, and resuspended in 50 mM NaCl, 50 mM Tris-HCl, 10 mM NajHEDTA, 2.5 mM benzamidine-HCl, 1 mM phenylmethanesulfonyl fluoride and 0.1 mM 1,4-dithiothreitol, pH 8.0, at 23°C, using a volume in milliliters equivalent to the number of grams of pellet. After disrupting the cells by three passages through a French press at 20,000 psi, the lysed cells were pelleted for 10 minutes at 8,000 rpm in a JA-20 rotor at 4°C. The pellet was resuspended in ten-fold the apparent pellet volume of Nanopure water, repelleted, resuspended in 0.1 M KCl, 10 mM Tris-HCl, 5 mM MgCl2, pH 8.0, and stirred at room temperature for 30 minutes with DNase I (10 µg/ml). The sample was again pelleted, then resuspended and pelleted two times with ten-fold the pellet volume of water. Ten-fold the pellet volume of water was added, and the suspension was brought to 10% Triton X-100 and stirred for one hour at 4°C. It was repelleted and washed twice as above. The pellet was resuspended in ten-fold the pellet volume of water, brought to 2.5% octyl-β-D-glucopyranoside and stirred at 4°C for one hour. The sample was again washed twice in water and repelleted. The pellet was solubilized in ten-fold its own volume with 4 M urea, 10 mM glycine and the pH adjusted to 2.0 with concentrated
phosphoric acid at 23°C. The sample was refolded by rapid dilution with one hundred times the pellet volume of 25 mM NaCl, 20 mM Tris-HCl, pH 5.5, at 23°C. Alternatively, refolding was accomplished by slow dialysis of the solubilized sample against 25 mM NaCl, 50 mM Na citrate, pH 5.5, plus 10% glycerol. Any remaining insoluble material was removed by centrifugation for 5 minutes at 8000 rpm. Protein purification from this point was according to standard methods previously described for other soluble lysozyme variants (Poteete et al, 1991). To test for the possible oxidation of the sulfur in the methionine side-chains, the molecular masses of the multiple methionine proteins were determined by electrospray mass spectrometry and were found to agree to within 5 Da with the theoretical values expected on the basis of natural isotopic abundance (7-Met variant expected 18694.82 Da, observed 18698.75 Da; 10-Met variant expected 18762.82 Da, observed 18767.54 Da). This indicates little if any oxidation of the introduced methionines.
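The oxidation argument rests on comparing the observed mass excess with the mass one added oxygen would contribute. The following sketch repeats that comparison using the expected and observed masses quoted above; the value of 15.999 Da for one oxygen atom (one methionine sulfoxide) is an assumption of this example, and the names are hypothetical.

```python
OXYGEN = 15.999  # mass added by oxidizing one methionine to the sulfoxide (Da)

# variant: (expected mass, observed mass) in Da, as quoted in the text
MEASUREMENTS = {
    "7-Met": (18694.82, 18698.75),
    "10-Met": (18762.82, 18767.54),
}


def mass_excess(expected, observed):
    """Observed minus expected mass (Da)."""
    return observed - expected


def oxygen_equivalents(expected, observed):
    """Mass excess expressed as a fraction of one added oxygen atom."""
    return mass_excess(expected, observed) / OXYGEN


for name, (expected, observed) in MEASUREMENTS.items():
    print(
        f"{name}: excess {mass_excess(expected, observed):.2f} Da, "
        f"{oxygen_equivalents(expected, observed):.2f} oxygen equivalents"
    )
```

Both discrepancies are below 5 Da, well short of the roughly 16 Da that a single fully oxidized methionine would add, which is the basis for the conclusion that little if any oxidation occurred.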
D. Preparation of Cell Wall Fragments
E. coli cell wall fragments were isolated by modification of a previous method (Becktel and Baase, 1985) and without the use of trypsin. Thirty-six grams of freeze-dried E. coli B (Sigma, EC 11303) were suspended in 250 ml of 1.5 M NaCl at 23°C. All subsequent steps were at 23°C except as noted. After French-pressing the slurry at 16,000 psi, the cells were centrifuged at 16,000 g in a Beckman JA-20 rotor for one hour. Upon resuspension in 200 ml of 1 M NaCl, the cells were brought to a boil, cooled, then pelleted at 5,000 g. Suspension and pelleting was repeated three times, the supernatant becoming more clear and less viscous with each cycle. The final resuspension was into Nanopure water to a total volume of 150 ml. The suspension was added slowly to 250 ml of boiling 10% SDS with constant stirring. After cooling in an ice bath to 23°C (below room temperature SDS precipitates), the cell walls were centrifuged at 37,000 g for one hour at room temperature. Discarding the supernatant, the cell walls were resuspended in water using a teflon pestle tissue grinder (0.15 mm clearance, at least twenty strokes), then centrifuged. The cycle of resuspension and centrifugation was repeated six times with the penultimate and final resuspensions in 66.7 mM
K3/2H3/2PO4, pH 6.88, 0.02% sodium azide. Cell walls made in this manner were stable at 4°C for one month, but will aggregate if frozen.
E. Enzyme Activity Assays
Activity of the mutant enzymes was initially assessed by halo size on lysis plates (Streisinger et al., 1961) and by a modification of the lysoplate assay of Becktel and Baase (1985) where a chloroform treated bacterial lawn was used in lieu of embedded peptidoglycan. In order to obtain greater sensitivity, an assay was used based on the increase of circular dichroism (CD) signal observed after peptidoglycan fragments are exposed to lysozyme (Zhang et al., 1995). CD assays were carried out in 1 cm pathlength QS-IU (Hellma) birefringence-free optical cells with a 3 x 6.5 mm teflon-coated stirring flea (VWR). Two ml of 66.7 mM K3/2H3/2PO4, pH 6.8, were mixed with cell wall fragments, prepared as above, sufficient to result in a post-reaction signal of -30 millidegrees at 223 nm as measured in a J-720 spectropolarimeter (Jasco). Stirring speed was maintained at 700 rpm using the PTC-348W temperature controller (Jasco) to keep the peptidoglycan fragments in suspension. After thermal equilibration to 20°C, reactions were initiated by manual addition of enzyme and the reaction course was followed by the time evolution of the CD signal at 223 nm. The change in CD with time was analyzed by fitting the early linear portion of the sigmoidal decay curve. These rates scaled with WT* concentration. Relative activities were calculated by dividing the rate of the mutant by the rate of WT* at the same protein concentration.
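The rate analysis described above (a straight-line fit to the early part of the CD trace, then a ratio of mutant to WT* slopes) might be sketched as follows. The traces, the 30-second fitting window and every name here are invented for illustration and are not data or software from this study; a real analysis would also need to choose the linear window by inspection of each trace.

```python
def least_squares_slope(times, signal):
    """Slope of the ordinary least-squares line through (time, signal) points."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(signal) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signal))
    return sxy / sxx


def initial_rate(times, signal, window):
    """Fit only the early points (time <= window) of a CD trace."""
    early = [(t, s) for t, s in zip(times, signal) if t <= window]
    if len(early) < 2:
        raise ValueError("need at least two points inside the fitting window")
    early_times, early_signal = zip(*early)
    return least_squares_slope(early_times, early_signal)


def relative_activity(rate_mutant, rate_wt):
    """Mutant rate as a percentage of the WT* rate at equal protein concentration."""
    return 100.0 * rate_mutant / rate_wt


# Invented, perfectly linear traces (millidegrees at 223 nm versus seconds).
times = list(range(0, 65, 5))
wt_trace = [-0.30 * t for t in times]
mutant_trace = [-0.21 * t for t in times]

wt_rate = initial_rate(times, wt_trace, window=30)
mutant_rate = initial_rate(times, mutant_trace, window=30)
print(f"relative activity: {relative_activity(mutant_rate, wt_rate):.0f}%")
```

Because the result is a ratio of slopes, it does not depend on the sign convention or scaling of the CD signal, only on both traces being recorded under the same conditions.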
F. Thermal Measurements
Protein thermal stabilities were determined by monitoring the circular dichroism at 223 nm as a function of sample temperature (Eriksson et al., 1993; Zhang et al., 1995). The buffer was 0.1 M sodium chloride, 1.4 mM acetic acid, 8.6 mM sodium acetate, pH 5.42, with protein present at 15 to 30 µg/ml. The melting temperature, Tm, of WT* was 65.3°C. Free energy values were computed at 59°C assuming a constant change in heat capacity, ΔCp, of 2.5 kcal/mol-deg.
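One common way to obtain a free energy at a reference temperature from Tm, ΔH(Tm) and a constant ΔCp is the integrated Gibbs-Helmholtz relation, ΔG(T) = ΔH(Tm)(1 - T/Tm) + ΔCp[(T - Tm) - T ln(T/Tm)], with temperatures in kelvin. The sketch below applies it to two rows of Table II. Whether this is exactly the procedure used by the authors is an assumption, and the function names are hypothetical.

```python
import math

DCP = 2.5       # constant heat-capacity change (kcal/mol-deg), from the text
T_REF_C = 59.0  # reference temperature (deg C), from the text
KELVIN = 273.15


def unfolding_free_energy(tm_c, dh_tm, t_c=T_REF_C, dcp=DCP):
    """Integrated Gibbs-Helmholtz free energy of unfolding (kcal/mol) at t_c."""
    tm = tm_c + KELVIN
    t = t_c + KELVIN
    return dh_tm * (1.0 - t / tm) + dcp * ((t - tm) - t * math.log(t / tm))


def ddg(tm_mutant_c, dh_mutant, tm_wt_c=65.3, dh_wt=130.0):
    """Free energy of unfolding of a mutant relative to WT* at the reference temperature."""
    return unfolding_free_energy(tm_mutant_c, dh_mutant) - unfolding_free_energy(
        tm_wt_c, dh_wt
    )


# Tm values are 65.3 deg C plus the tabulated change in Tm; dH(Tm) from Table II.
print(f"I78M  ddG = {ddg(65.3 - 3.7, 117.0):.2f} kcal/mol")
print(f"7-Met ddG = {ddg(65.3 - 14.5, 96.0):.2f} kcal/mol")
```

With these inputs the estimates come out close to, though not identical with, the tabulated ΔΔG values for I78M and the 7-Met variant, which suggests the relation is at least a reasonable approximation to what was done.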
G. Crystallography and Structure Analysis
Determination and evaluation of the structure of the seven-methionine mutant was as described (Gassner et al., 1996). In order to estimate the volume available for each substituted methionine side-chain, a model calculation was carried out based on the coordinates of WT* lysozyme. The coordinates were displayed on a graphics system using FRODO (Jones, 1982), and the side-chain in question was truncated to alanine. The volume of the cavity that resulted was then calculated using the program of Connolly (1983). In a prior report (Eriksson et al., 1992) the same procedure was used except that the radius of the β-carbon was assumed to correspond to a methylene rather than a methyl group. This explains why the model cavity volumes quoted here are not identical with those given by Eriksson et al. (1992).
H. Chemicals
All reagents were "Baker analyzed". pH was measured using a Radiometer PHM84 and Radiometer "S" series standards with a GK2421C electrode.
III. Results and Discussion
A. Activity
As judged by halo size using modified lysis assay plates (see Materials and Methods), all single variants had close to wild-type levels of activity. The multiple methionine variant halos were 50-70% smaller than those of the single mutants. As measured by the CD-based assay (Figure 1) the activity relative to WT* varied from 70% (I78M) to 106% (L133M) for the single methionine mutants (Table II). Despite alteration of the core composition, the ten methionine variant retained activity equal to about 20% that of wild-type. The activity measurements therefore suggest that the single methionine substitutions described here have a small effect on biological function.
[Figure: single-methionine mutants (L99M, L133M, L91M, F153M, L118M, L121M, V103M, I78M, I100M, L84M) plotted against the estimated volume available for the side-chain (Å³).]
Figure 1. Comparison of the catalytic activity of WT* lysozyme and the 7-methionine mutant (see text). The enzymatic hydrolysis of peptidoglycan is followed as a change in circular dichroism. Rates are obtained from the slopes, as indicated (see text for details).
B. Structure
Crystals of the seven-methionine variant were obtained using 2 M phosphate solutions, pH 6.7, and were found to be isomorphous with wild-type lysozyme (Weaver and Matthews, 1987). A data set to 1.9 Å resolution, 92% complete, was measured. The structure was refined to a crystallographic residual of 15.2% with bond lengths and angles within 0.18 Å and 3.0° of ideal values and found to be similar to wild-type. The root-mean-square discrepancy of the main-chain atoms within the C-terminal domain was 0.20 Å. The χ1 and χ2 values of the mutant side-chains were similar to those in wild-type except at position 153 where χ1 changed by 92° and avoided a steric clash (Table III). Thus, each of the substituted methionines essentially traces the path of
the residue it replaces, repacks the core, and appears to interact with neighboring residues in a favorable manner. In the WT* structure, there is a cavity of volume 34 Å³ adjacent to Leu 99 (Eriksson et al., 1992). In the mutant structure, the volume of this cavity is 50 Å³. The increase of 16 Å³ corresponds quite well to the expected overall decrease in van der Waals side-chain volume of 11 Å³ (Table I). In other words, the introduction of up to seven methionines does not cause the small cavity that is present in wild-type to collapse in the mutant. The crystallographic thermal factors of the side-chains of some of the seven methionines in the mutant structure are slightly higher than those of the wild-type amino acids they replace (Table IV). The three methionines that are present within the carboxy-terminal domains of both structures are, if anything, better ordered in the mutant molecule (Table IV). In all cases the electron density for the introduced methionines is well defined. There is no suggestion that substitution of seven methionines leads to disorder within the protein core or in other regions of the structure.
Table III. Comparison of side-chain rotation angles in wild-type lysozyme and the 7-Met mutant
Wild-type                          7-Met mutant
Residue    χ1 (°)    χ2 (°)        Residue    χ1 (°)    χ2 (°)
Leu 84     301       175           Met 84     289       178
Leu 91     294       171           Met 91     282       188
Leu 99     188       191           Met 99     203       181
Leu 118    294       173           Met 118    284       196
Leu 121    289       166           Met 121    276       153
Leu 133    282       166           Met 133    290       159
Phe 153    277       129           Met 153    185       11
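The comparison of rotation angles in Table III can be checked with a short calculation. The sketch below is illustrative only: the helper name is hypothetical, and the angles are transcribed from Table III as read here. Differences are wrapped into the range -180° to 180° so that angles near 0°/360° compare sensibly.

```python
def angle_difference(reference_deg, other_deg):
    """Signed difference other - reference, wrapped into [-180, 180) degrees."""
    return (other_deg - reference_deg + 180.0) % 360.0 - 180.0


# (chi1, chi2) in degrees, transcribed from Table III.
wild_type = {
    84: (301, 175), 91: (294, 171), 99: (188, 191), 118: (294, 173),
    121: (289, 166), 133: (282, 166), 153: (277, 129),
}
seven_met = {
    84: (289, 178), 91: (282, 188), 99: (203, 181), 118: (284, 196),
    121: (276, 153), 133: (290, 159), 153: (185, 11),
}

changes = {
    pos: tuple(angle_difference(wt, mut) for wt, mut in zip(wild_type[pos], seven_met[pos]))
    for pos in wild_type
}

for pos, (d_chi1, d_chi2) in changes.items():
    print(f"position {pos}: d_chi1 = {d_chi1:+.0f}, d_chi2 = {d_chi2:+.0f}")

# Position with the largest chi1 change.
largest_chi1 = max(changes, key=lambda pos: abs(changes[pos][0]))
```

Run on these values, the largest χ1 change falls at position 153, in line with the 92° change noted in the text.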
Table IV. Side-chain thermal factors in wild-type and a seven-methionine mutant lysozyme
Wild-type lysozyme                                7-Met mutant lysozyme
Amino acid           Thermal factor (Å²)          Amino acid           Thermal factor (Å²)
Sites substituted with methionine
Leu 84               16.9                         Met 84               33.5
Leu 91               16.6                         Met 91               29.9
Leu 99               16.9                         Met 99               23.6
Leu 118              18.5                         Met 118              23.7
Leu 121              16.8                         Met 121              16.9
Leu 133              18.3                         Met 133              18.2
Phe 153              22.5                         Met 153              21.6
Sites with methionine in the wild-type and mutant proteins
Met 102              18.8                         Met 102              16.9
Met 106              39.1                         Met 106              25.1
Met 120              25.2                         Met 120              26.1
Whole protein        25.7                         Whole protein        24.0
C-terminal domain    23.7                         C-terminal domain    22.9
C. Thermal Stability
Although methionine is present with relatively low frequency in naturally-occurring proteins (Klapper, 1977), in this case it appears to act as a normal hydrophobic residue. The thermal denaturations of all single variants are essentially as cooperative as wild-type with comparable enthalpies of unfolding (Table II). This is also the case for the seven- but less so for the ten-methionine variant. It is expected that loss of stability will occur for methionine replacements of Leu, Ile and Phe because of a reduction in solvent transfer free energy (Table I). In addition, methionine has one or more additional rotatable bonds compared with leucine, isoleucine, valine and phenylalanine, which may entail a greater loss of side-chain entropy upon folding (Table I). Strain may also be introduced in some cases. The range of destabilization that is observed for the single substitutions (-0.4 to -1.9 kcal/mol) (Table II) shows how the characteristics of the local site of substitution can contribute.
The most destabilizing single mutants (I78M, L84M, I100M and V103M) have the least available space, while those that destabilize the least (e.g. L99M, L133M and F153M) have volumes available in excess of methionine (Table I; Figure 2). This suggests that in the more extreme cases, part of the loss of stability is due to steric strain associated with insufficient space for the introduced methionine.
[Figure axis: time after addition of lysozyme (seconds), 0 to 4000.]
Figure 2. The decrease in free energy of unfolding (ΔΔG, Table II) of the single methionine mutants plotted as a function of the model volume available for methionine substitution (Table I). The volumes shown correspond to truncation of the side-chain in the WT* structure to alanine (see text). For comparison, the van der Waals volume of a methionine side-chain relative to alanine is 57 Å³ (Table I).
As more methionines are substituted, the loss in stability is less than the sum of the stability losses for the constituent single replacements (Gassner et al., 1996). The difference between the sum of the losses in stability for the single methionine variants and the loss of stability for the ten-methionine protein is 2.5 kcal/mol. As judged by the cavity calculations described above, some of the single mutants lose stability because of introduced strain. In the multiple methionine mutant, however, it appears that the combination of substitutions permits the structure to relax to reduce strain, or that new favorable interactions are introduced.
IV. Conclusion
The fact that at least seven core residues can be replaced as a group with methionine in phage T4 lysozyme without introducing molten globule-like characteristics shows that strict side-chain complementarity is not required to maintain native-like protein properties.
Acknowledgements
We thank Joan Wozniak and Sheila Snow for help with purifying and crystallizing mutant lysozymes. We are also grateful to Drs. Ingrid Vetter, Eric Anderson, Dale Tronrud and Larry Weaver for helpful advice. This work was supported in part by National Institutes of Health grant GM21967 to B.W.M.
References
Alber, T. and Matthews, B.W. (1987) Methods Enzymol. 154, 511-533.
Becktel, W.J. and Baase, W.A. (1985) Analyt. Biochem. 150, 258-263.
Connolly, M.L. (1983) Science 221, 709-713.
Creamer, T.P. and Rose, G.D. (1992) Proc. Natl. Acad. Sci. USA 89, 5937-5941.
Creighton, T.E. (1992) "Protein Folding", W.H. Freeman and Co., New York.
Eriksson, A.E., Baase, W.A., Zhang, X.-J., Heinz, D.W., Blaber, M., Baldwin, E.P. and Matthews, B.W. (1992) Science 255, 178-183.
Eriksson, A.E., Baase, W.A. and Matthews, B.W. (1993) J. Mol. Biol. 229, 747-769.
Fauchere, J.-L. and Pliska, V. (1983) Eur. J. Med. Chem. 18, 369-375.
Gassner, N., Baase, W.A. and Matthews, B.W. (1996) Proc. Natl. Acad. Sci. USA, in press.
Jones, T.A. (1982) In "Crystallographic Computing" (Sayre, D., ed.), Oxford University Press, Oxford, pp. 303-317.
Klapper, M.H. (1977) Biochem. Biophys. Res. Commun. 78, 1018.
Lee, B. and Richards, F.M. (1971) J. Mol. Biol. 55, 379-400.
Lee, K.H., Xie, D., Freire, E. and Amzel, L.M. (1994) Proteins 20, 68-84.
Matsumura, M. and Matthews, B.W. (1989) Science 243, 792-794.
Muchmore, D.C., McIntosh, L.P., Russell, C.B., Anderson, D.E. and Dahlquist, F.W. (1989) Meth. Enzymol. 177, 44-73.
Pickett, S.D. and Sternberg, M.J.E. (1993) J. Mol. Biol. 231, 825-839.
Poteete, A.R., Dao-pin, S., Nicholson, H. and Matthews, B.W. (1991) Biochemistry 30, 1425-1432.
Rudolph, R. and Lilie, H. (1996) FASEB J. 10, 49-56.
Shortle, D., Stites, W.E. and Meeker, A.K. (1990) Biochemistry 29, 8033-8041.
Sternberg, M.J.E. and Chickos, J.S. (1994) Prot. Engin. 7, 149.
Streisinger, G., Mukai, F., Dreyer, W.J., Miller, B. and Horiuchi, S. (1961) Cold Spring Harbor Symp. Quant. Biol. 26, 25-30.
Weaver, L.H. and Matthews, B.W. (1987) J. Mol. Biol. 193, 189-199.
Zhang, X.-J., Baase, W.A., Shoichet, B.K., Wilson, K.P. and Matthews, B.W. (1995) Prot. Engin. 8, 1017-1022.
SYNTHESIS OF ALZHEIMER'S (1-42) Aβ-AMYLOID PEPTIDE WITH PREFORMED Fmoc-AMINOACYL FLUORIDES
Saskia C.F. Milton¹, R.C. de Lisle Milton², Steven A. Kates³ and Charles Glabe¹
¹Department of Molecular Biology and Biochemistry, University of California, Irvine CA 92697
²Advanced Technology Center, Beckman Instruments Inc., Fullerton CA 92634
³PerSeptive Biosystems Inc., Framingham MA 01701
I. INTRODUCTION
The major component of the senile plaque amyloid deposits associated with Alzheimer's disease is a self-assembling 42-residue peptide, known as Aβ1-42, which is generated by proteolytic processing of the amyloid precursor protein (5). Previous syntheses of this difficult peptide sequence in our laboratory resulted in an optimal strategy employing Fmoc/tBu tactics. This consisted of BOP/HOBt/NMM* coupling reagents for 2 h and 20 % piperidine in DMF for 7 min, both at 40 °C (1) in a continuous-flow, semiautomatic peptide synthesizer (4). Although the product obtained using these tactics can be purified, the synthesis takes a long time and gives a relatively low yield. As preformed Fmoc-aminoacyl fluorides have been shown to exhibit advantages over standard coupling reagents (3, 8), the Aβ1-42 synthesis procedure was modified to employ solutions of aminoacyl fluorides in dry DMF to see if improvement in yield and purity of the crude product could be obtained. The fluoride method allowed reduced coupling times (10 min) at 22 °C, except for Fmoc-Arg(Pbf)-OH and Fmoc-His(Trt)-OH, which were activated with HATU/DIEA and coupled for 1 h. However, to ensure complete removal of the Nα-protecting groups at each step, Fmoc deprotection was maintained at elevated temperature (40 °C). The deprotection temperature was also raised to 55 °C (7) in a further effort to improve the yield of the crude product in a subsequent synthesis.
*Abbreviations: Boc: t-butyloxycarbonyl; BOP: benzotriazole-1-oxy-tris-(dimethylamino)phosphonium hexafluorophosphate; But/tBu: t-butyl; DIEA: diisopropylethylamine; DMF: N,N-dimethylformamide; DAST: diethylaminosulfur trifluoride; Fmoc: 9-fluorenylmethoxycarbonyl; HATU: O-(7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate; HOBt: hydroxybenzotriazole; NMM: 4-methylmorpholine; Pbf: 2,2,4,6,7-pentamethyldihydrobenzofuran-5-sulfonyl; PEG-PS: polyethylene glycol-polystyrene; TFA: trifluoroacetic acid; Trt: trityl.
TECHNIQUES IN PROTEIN CHEMISTRY VIII. Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.
II. METHODS
Side chain protected Fmoc amino acids, Fmoc-L-Ala-PEG-polystyrene support, BOP and HATU were obtained from PerSeptive Biosystems, Inc. DAST (diethylaminosulfur trifluoride) was obtained from Fluka Chemical Corp. and DIEA and NMM were obtained from Aldrich Chemical Co., Inc. The following Fmoc amino acid derivatives were fluorinated: Ala, Phe, Gly, Ile, Leu, Met, Val, Asp(OtBu), Glu(OtBu), Lys(Boc), Asn(Trt), Gln(Trt), Ser(tBu) and Tyr(OtBu), while His(Trt) and Arg(Pbf) were used as purchased. All solvents were of HPLC grade and all chemicals of Analar grade. DMF was dried over 4 Angstrom sieve before use.
1. Fmoc-aminoacyl fluorides
Fmoc-aminoacyl fluorides were prepared using either diethylaminosulfur trifluoride (DAST) (8) or cyanuric fluoride (3) and kept under argon for long term storage. Although different Fmoc-aminoacyl fluorides were prepared with each reagent, one or the other method could be used for all (3, 6).
a. Preparation of aminoacyl fluorides using DAST (Ser, Val, Gly, Asn, Glu and Met)
Fmoc-amino acids were reacted at 4 °C for 30 min with DAST (1 : 1.2 equiv.) in dry methylene chloride (over 3 Angstrom sieve). Following the addition of a further 100 ml dry methylene chloride, the reaction mixture was poured over ice (from 18 mohm water) in a separating funnel and washed two more times with ice water. The organic phase was dried over a mixture of anhydrous magnesium sulphate and 3 Angstrom sieve and filtered through a sintered funnel. Rotary evaporation produced an oil or a precipitate which was dissolved in a small amount of dry methylene chloride and crystallized by the addition of hexane. After stirring overnight, the resultant crystals were collected by filtration and dried under high vacuum (6) (Figure 1a).
b. Preparation of aminoacyl fluorides using cyanuric fluoride (Ala, Gln, Leu, Asp, Ile, Phe, Lys and Tyr)
Fmoc-amino acids in methylene chloride were refluxed under N2 with cyanuric fluoride and pyridine (1:2:1 equiv.) for 3 h. The mixture was then washed with two portions of ice water, filtered to remove precipitated cyanuric acid, dried over anhydrous magnesium sulphate and filtered through a sintered funnel. Rotary evaporation produced a precipitate which was dissolved in a small amount of dry methylene chloride and crystallized by the addition of hexane. After stirring overnight, the resultant crystals were collected by filtration and dried under high vacuum (3) (Figure 1b).
Figure 1. Preparation of Fmoc-amino acid fluorides: (a) with diethylaminosulfur trifluoride (DAST) in CH2Cl2; (b) with cyanuric fluoride and pyridine. Preformed aminoacyl fluorides were prepared by two methods. Preparation of aminoacyl fluorides using DAST was accomplished using the method of Kaduk et al. (6) as shown in Figure 1a. Preparation with cyanuric fluoride using the method of Carpino et al. (3) is shown in Figure 1b.
2. Peptide synthesis
Semiautomatic syntheses were performed by the continuous-flow method on a custom built synthesizer. The synthesizer, consisting of a ChronTrol 4 outlet timer (Fisher Scientific), a back pressure regulator (Western Analytical), a FMI pump (Fluid Metering Inc.) and slider valves (Rainin), was connected to a pressurized nitrogen gas source to actuate the valves. The synthesizer was fitted with a 5 ml sample loop (Rainin) and Omni columns and fittings (Omnifit) as previously described (4). A polyethylene glycol-polystyrene (PEG-PS) graft support, functionalized with p-alkoxybenzyl ester to form an acid labile linker, was used. Nα-deprotections were accomplished using 20 % piperidine in DMF for 7 min.
The synthesizer was run at 40 °C for aminoacylation as well as deprotection steps, using either the BOP/HOBt/NMM protocol (1) or the Fmoc-aminoacyl fluoride protocol. Further syntheses were also performed at 40 °C for the acylation steps, but 55 °C (7) for the deprotection steps with both protocols. The Fmoc-aminoacyl fluoride protocol was also used at 22 °C for the acylation steps, but the deprotection steps were maintained at 40 °C in this instance.
a. Activation protocols
i. BOP/HOBt activation. 4 molar excess: Fmoc amino acid/BOP/HOBt (1:1:1) + NMM in DMF - 2 h at 40 °C (typically Asp¹,⁷, Ala², Glu³ and a second Glu, Phe⁴,¹⁹,²⁰, Arg⁵, His⁶,¹³,¹⁴, Tyr¹⁰, Val¹² and a second Val, Gln¹⁵, Lys¹⁶,²⁸, Leu¹⁷ and Asn²⁷ were recoupled and Val¹⁸ twice recoupled for 2 h, guided by qualitative ninhydrin results).
ii. Preformed Fmoc aminoacyl fluorides. 4 molar excess: Fmoc aminoacyl fluoride in dry DMF - 10 min at 22 °C or 40 °C (no recoupling required).
iii. HATU/DIEA activation. 4 molar excess: Fmoc-amino acid/HATU/DIEA (1:1:2) in DMF - 1 h at 40 °C (no recoupling required).
3. Cleavage and final deprotection
Cleavage of the peptides from the resin support and concomitant deprotection of the amino acid side chains was achieved in reagent R (TFA:thioanisole:1,2-ethanedithiol:anisole = 90:5:3:2) at room temperature for 6 h. This was followed by removal of the exhausted resin by filtration and precipitation of the peptide product in cold ether. The precipitate was allowed to settle overnight at -20 °C and then washed 3x with cold ether and dried under high vacuum.
4. RP HPLC
Analytical RP HPLC was performed on a Waters system (Model 510) with a Vydac C4 (214TP54) column at a flow rate of 1.0 ml/min. Preparative RP HPLC employed a Vydac C4 (214TP1022) column and a flow rate of 8 ml/min. The peptides in both preparative and analytical RP HPLC were eluted by gradient (5-95 % B, 60 min) with 0.1 % TFA (buffer A) and 0.1 % TFA / acetonitrile (buffer B). The center cut from each preparative run was frozen (in liquid nitrogen) immediately upon collection and lyophilized under high vacuum.
5. Mass spectroscopic analysis
Ion spray mass spectral analysis of peptides was performed by Peptidogenic Research, Livermore, CA.
Figure 2. RP HPLC chromatograms (0-60 min) of the Alzheimer's Aβ1-42 peptide synthesized by different protocols for activation of the Fmoc-amino acids. Analytical RP HPLC employed a Waters system with a Vydac C4 (214TP54) column at a flow rate of 1.0 ml/min. The peptides were eluted by gradient (5-95 % B, 60 min) with 0.1 % TFA (buffer A) and 0.1 % TFA / acetonitrile (buffer B). a. Crude product obtained using BOP/HOBt/NMM: 40 °C acylations + 40 °C deprotections. b. Crude product obtained using BOP/HOBt/NMM: 40 °C acylations + 55 °C deprotections. c. Crude product obtained using preformed Fmoc-aminoacyl fluorides: 40 °C acylations + 55 °C deprotections. d. RP HPLC purified peptide.
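The elution gradient quoted for RP HPLC (5-95 % B over 60 min) can be expressed as a simple linear function of time. The sketch below assumes the gradient is linear, which the text does not state explicitly; the function names are hypothetical.

```python
def percent_b(t_min, start_b=5.0, end_b=95.0, duration_min=60.0):
    """Percentage of buffer B at time t_min for an assumed linear gradient."""
    if not 0.0 <= t_min <= duration_min:
        raise ValueError("time lies outside the gradient")
    return start_b + (end_b - start_b) * t_min / duration_min


def solvent_volume_ml(flow_ml_min, duration_min=60.0):
    """Total mobile-phase volume delivered over the gradient."""
    return flow_ml_min * duration_min


# Composition at the midpoint, and volumes for the two flow rates in the text.
midpoint_b = percent_b(30.0)
analytical_ml = solvent_volume_ml(1.0)    # analytical column, 1.0 ml/min
preparative_ml = solvent_volume_ml(8.0)   # preparative column, 8 ml/min
print(midpoint_b, analytical_ml, preparative_ml)
```

Under this assumption the gradient slope is 1.5 % B per minute, which can help translate an observed retention time into an approximate eluting composition.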
III. RESULTS
Ninhydrin monitoring is essential in the procedure employing BOP coupling tactics, as even after 2 h, certain residues typically require recoupling. However, using the aminoacyl fluoride approach (with HATU for Arg and His), qualitative ninhydrin testing revealed complete coupling at every step. The analytical RP HPLC traces of the crude products obtained using either the BOP/HOBt/NMM protocol or the Fmoc aminoacyl fluoride protocol are shown in Figure 2, together with a trace of purified Aβ1-42. The major peak in the BOP/HOBt/NMM protocols (Figure 2a and 2b) is not the desired product (as judged by spiking a run with purified material), which elutes on the back slope of the peak. In Figure 2b (BOP/HOBt/NMM protocol with coupling at 40 °C and deprotections at 55 °C), the proportion of correct product has increased, although it is still not the major component. In Figure 2c (Fmoc-aminoacyl fluoride protocol with coupling at 40 °C and deprotections at 55 °C), the target peptide corresponds to the major peak, which simplifies preparative HPLC purification. Ion spray mass spectrometric analyses of two purified Aβ1-42 preparations are shown in Figure 3; the observed mass (4513 daltons) agrees with the calculated mass (4514 daltons) for both purified peptides.
TABLE 1. Typical yields obtained for syntheses with the various protocols.
Protocol             Acylation    Deprotection    Yield
BOP/HOBt/NMM         40°C         40°C            21%
BOP/HOBt/NMM         40°C         55°C            22%
Fmoc-aminoacyl-F     22°C         40°C            23%
Fmoc-aminoacyl-F     40°C         40°C            25%
Fmoc-aminoacyl-F     40°C         55°C            28%
Yield calculated as purified weight from crude weight.
Typical yields obtained for syntheses with the various protocols are shown in Table 1, where an improvement in yield (calculated as purified weight from crude weight) is observed for the Fmoc-aminoacyl fluoride protocols. These latter yields may be further improved if a succession of small fractions is collected and monitored across the preparative profile rather than the single cut that was taken.
Figure 3. Ion spray mass spectra of the RP HPLC purified Alzheimer's Aβ1-42 peptide employing two different protocols for Fmoc-amino acid activation. The observed molecular mass for the peptide was 4513 daltons in both cases, while the calculated mass was 4514 daltons (monoisotopic). a. Purified product obtained using BOP/HOBt/NMM: 40°C acylations + 55°C deprotections. b. Purified product obtained using preformed Fmoc-aminoacyl fluorides: 40°C acylations + 55°C deprotections.
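Ion spray spectra of a peptide of this size show a series of multiply protonated ions rather than a single peak, and the molecular mass is recovered from that series. The sketch below illustrates the arithmetic under the usual assumption that every charge is carried by an added proton. The function names are hypothetical, and the 4514 Da input is the calculated mass quoted in the Figure 3 caption.

```python
PROTON_MASS = 1.007276  # Da


def mz(neutral_mass, charge):
    """m/z of an ion carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_adjacent_peaks(mz_low, mz_high):
    """Recover the charge and neutral mass from two adjacent charge states.

    mz_low is the peak at lower m/z; it is assumed to carry exactly one more
    proton than the peak at mz_high.
    """
    charge_high = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    return charge_high, charge_high * (mz_high - PROTON_MASS)


CALCULATED_MASS = 4514.0  # Da, from the Figure 3 caption

# Expected positions of the 3+ to 7+ ions.
series = {z: mz(CALCULATED_MASS, z) for z in range(3, 8)}
for z, value in series.items():
    print(f"{z}+ : m/z {value:.1f}")

# Round trip: deconvolute the 5+ and 4+ peaks back to the neutral mass.
charge, mass = mass_from_adjacent_peaks(series[5], series[4])
```

In practice each pair of adjacent peaks gives an independent mass estimate, and the estimates are averaged.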
IV. DISCUSSION
The method employing BOP coupling tactics at elevated temperature was the result of an optimization process designed to give the best yield and purity of this recalcitrant target peptide (1). The abbreviated (10 min) aminoacyl fluoride coupling steps were inserted directly into the procedure, as a replacement for the optimized two hour coupling protocol, without further changes, giving improved purity and at least equivalent yield with no need for recoupling. Analytical RP HPLC of the crude aminoacyl fluoride product revealed a major peak with an elution time corresponding to that of the purified, mass spectrometrically characterized product, unlike the crude product of the original procedure. Optimizing the Fmoc-deblocking temperature (7) resulted in a further improvement in yield. These improvements resulting from adopting preformed acyl fluoride coupling tactics are less remarkable considering the advantages of these reagents. The fluorine atom is only about 2 amu heavier than the hydroxyl it replaces, so avoiding the steric hindrance associated with larger leaving groups. Also, the polarity of the C-F bond and the adjacent oxygen increases reactivity towards nucleophiles, allowing rapid and complete reaction. Similarly, the unique nature of the C-F bond imparts greater stability toward oxygen nucleophiles such as water or alcohols than is found for the analogous chlorides, yet the fluorides appear to be of equal or nearly equal reactivity toward amine nucleophiles (3). In addition, the simplicity of the reaction components, the activated Fmoc amino acid in dry DMF (in the absence of base) and the single inorganic side product (HF, which is quenched by the DMF; ref. 8), obviates side reactions leading to a loss of yield and chirality.
Figure 4. The reaction mechanism of the acyl fluorides.
A remaining refinement, side chain protection for Arg and His that is stable to fluorination and allows isolation of the aminoacyl fluoride product (2), would be expected to give this methodology unsurpassed general utility in peptide synthesis.
V. CONCLUSIONS
The protocols employing Fmoc-aminoacyl fluoride aminoacylations resulted in an improved purity and yield of crude product (by analytical RP HPLC) compared to the optimized BOP/HOBt coupling tactics. Additional advantages of using preformed aminoacyl fluoride couplings are a reduced dependence on elevated temperature during coupling and a considerable saving in synthesis time with this known difficult sequence.
References
1. Burdick, D., Soreghan, B., Kwon, M., Kosmoski, J., Knauer, M., Henschen, A., Yates, J., Cotman, C. and Glabe, C. (1992) J. Biol. Chem. 267, 546.
2. Carpino, L.A. and El-Faham, A. (1995) J. Am. Chem. Soc. 117, 5401.
3. Carpino, L.A., Sadat-Aalaee, D., Chao, H.G. and DeSelms, R.H. (1990) J. Am. Chem. Soc. 112, 9651.
4. Glabe, C.G. (1990) Technique 2, 138.
5. Haass, C., Hung, A.Y., Schlossmacher, M.G., Oltersdorf, T., Teplow, D.B. and Selkoe, D.J. (1993) Ann. N.Y. Acad. Sci. 695, 109.
6. Kaduk, C., Wenschuh, H., Beyermann, M., Carpino, L.A. and Bienert, M. (1996) Letters in Peptide Science 2, 285.
7. Rabinovich, A.K. and Rivier, J.E. (1994) In "Peptides: Chemistry, Structure and Biology". Hodges, R.S. and Smith, J.A. (Eds.), pp 71-73. Escom, Leiden, Netherlands.
8. Wenschuh, H., Beyermann, M., El-Faham, A., Ghassemi, S., Carpino, L.A. and Bienert, M. (1995) J. Chem. Soc., Chem. Commun., 669.
Analysis of Racemization During "Standard" Solid Phase Peptide Synthesis: A Multicenter Study
Ruth Hogue Angeletti¹, Lisa Bibbs², Lynda F. Bonewald³, Gregg B. Fields⁴, Jeffery W. Kelly⁵, John S. McMurray⁶, William T. Moore⁷ & Susan T. Weintraub⁸
¹Lab. for Macromolecular Analysis, Albert Einstein College Med., Bronx NY 10461
²Research Institute of Scripps Clinic, La Jolla CA 92037
³Depts. Med. & Biochem., Univ. Texas Health Science Center, San Antonio TX 78284
⁴Dept. Lab Medicine & Pathology, Univ. of Minnesota, Minneapolis MN 55455
⁵Dept. of Chemistry, Texas A&M University, College Station TX 77843
⁶Dept. Neuro-Oncology, Univ. of Texas, M.D. Anderson Cancer Center, Houston TX 77030
⁷Dept. Pathology & Lab Medicine, Univ. of Pennsylvania, Philadelphia PA 19104
⁸Dept. of Biochem., Univ. Texas Health Science Center, San Antonio TX 78284
I. Introduction
Synthetic peptides have been produced in abundance by resource and research laboratories for both basic research and drug discovery programs, in numbers estimated in the tens to hundreds of thousands. Inclusion of combinatorial libraries would increase this number to many millions. While improved synthetic procedures and analytical technologies can provide assurance that these peptides have the desired sequence and purity, little or no concern is usually given to stereoisomeric purity. The assumption is often made that racemization is unlikely to occur, or that it need not be examined. The sheer number of peptides which can now be rapidly prepared, either one at a time or in sets or arrays, presents an analytical challenge, whether examining purity or racemization. Yet the critical importance of a unique peptide stereoisomer to its given biological function is widely recognized. Thus, evaluation of the degree to which racemization occurs in peptides produced by core laboratories is timely.
The Peptide Synthesis Research Committee of the Association of Biomolecular Resource Facilities (ABRF) conducts anonymous studies to evaluate the ability of ABRF member laboratories to synthesize and characterize test peptides (1-5). The committee has also conducted studies which provided an opportunity for our member laboratories to attempt new synthetic methods and evaluate new analytical technologies. Previous studies by this committee have shown that peptide assembly and cleavage are no longer significant problems in most core laboratories. Therefore, for its 1996 study, the ABRF Peptide Synthesis Research Committee sought to assess the extent to which racemization occurs during peptide assembly in peptides synthesized by our member resource laboratories.
The committee prepared for the 1996 study by designing and testing appropriate short peptides that should be straightforward to synthesize. Evaluation of these peptides by the committee also provided the opportunity to establish which tests would be most suitable for identification and quantitation of racemization in this study. During the preparatory phase of this study, it was found that handling and analysis of a hexapeptide with two potential sites of racemization were too problematic to be useful as a model for this multicenter study. A hexapeptide susceptible to racemization at a single His residue was subsequently designed, tested and found suitable for the study. This synthetic peptide was requested from each of the core laboratories for evaluation by the committee. Coded samples of unpurified peptides were characterized by amino acid analysis (AAA), AAA following reaction with Marfey's reagent, high pressure liquid chromatography (HPLC), electrospray ionization mass spectrometry (ESI-MS), matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS), and enzymatic digestion by carboxypeptidase A (CPA) followed by MALDI-MS. Forty-eight laboratories participated in this study, submitting 53 samples for analysis. In addition to the results of this multicenter study, a summary of the committee's preparatory experiments is also presented in this report.
II. Materials and Methods
ABRF member laboratories were asked to synthesize the following peptide (ABRF96), H-Arg-Glu-Arg-His-Ala-Tyr-OH, by the method most frequently used in their facilities. Crude samples were submitted as dry products equally apportioned in six 1.5-ml microcentrifuge tubes to minimize any artefacts due to sample handling. Laboratories were requested to provide information on the protecting groups, resin, coupling conditions, and cleavage protocols used. The peptide sequence was chosen to be straightforward to synthesize, with no problems anticipated in peptide assembly or cleavage. The His residue was included because of its known susceptibility to racemization. Purity of the submitted peptides was assessed by AAA, analytical HPLC, and mass spectrometry. The extent of racemization was evaluated as described below by AAA following reaction with Marfey's reagent, by analytical reversed-phase HPLC, and by mass spectrometric analysis of enzymatic digestion products.
A. Synthesis of Reference Peptides
For both the preliminary and final test peptides, it was necessary to prepare all possible isomers as standards for analysis. These reference peptides were synthesized using a PE-ABD 430A peptide synthesizer configured for FastMoc synthesis protocols. Piperidine (20%) in N-methylpyrrolidone (NMP) was used to deprotect the N-terminal amino acid. To activate each amino acid in the reaction vessel, the following was used: 1.0 mM amino acid, 1.0 mM 2-(1H-benzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HBTU) in 1 M 1-hydroxybenzotriazole (HOBT) in NMP, 2.0 mM diisopropylethylamine (DIEA). Each His residue was double-coupled. HBTU, DIEA, piperidine, DMF and NMP were purchased from ABI. Methylene chloride was purchased from Burdick and Jackson.
For the preliminary assessment of a hexapeptide with two potential sites for racemization (6,7), the following set of four reference peptides was synthesized:
LL    Arg-Asp-Arg-(L-His)-Glu-(L-Cys)
LD    Arg-Asp-Arg-(L-His)-Glu-(D-Cys)
DL    Arg-Asp-Arg-(D-His)-Glu-(L-Cys)
DD    Arg-Asp-Arg-(D-His)-Glu-(D-Cys)
A second set of reference peptides with one potential site of racemization was subsequently prepared, as shown below:
L     Arg-Glu-Arg-(L-His)-Ala-Tyr
D     Arg-Glu-Arg-(D-His)-Ala-Tyr
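The requirement to prepare all possible isomers means that a peptide with n potential racemization sites needs 2^n reference standards. The sketch below enumerates them for the two reference sets; the function name and the 0-based site indices are illustrative choices, not part of the original protocol.

```python
from itertools import product


def stereoisomer_set(residues, chiral_sites):
    """Return {label: sequence} for every L/D combination at the given sites.

    residues: residue names in N- to C-terminal order.
    chiral_sites: 0-based positions varied between the L and D forms.
    """
    peptides = {}
    for config in product("LD", repeat=len(chiral_sites)):
        labelled = list(residues)
        for site, form in zip(chiral_sites, config):
            labelled[site] = f"({form}-{residues[site]})"
        peptides["".join(config)] = "-".join(labelled)
    return peptides


# Preliminary hexapeptide: His (position 4) and Cys (position 6) varied.
preliminary = stereoisomer_set(["Arg", "Asp", "Arg", "His", "Glu", "Cys"], [3, 5])

# Final test peptide: only His (position 4) varied.
final = stereoisomer_set(["Arg", "Glu", "Arg", "His", "Ala", "Tyr"], [3])

for label, sequence in {**preliminary, **final}.items():
    print(label, sequence)
```

The doubling with each added site is one practical reason a single-site peptide is easier to handle as a multicenter test case.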
For preparation of the reference peptides, all N"-Fmoc amino acids, Arg (Pmc), Asp (OtBu), Glu (OtBu) and Ala were purchased from ABI/Perkin Elmer (Foster City, CA) in preloaded cartridges. D- or Z-Cys (Trt) 2-chlorotrityl resin was purchased from Anaspec Inc. (San Jose, CA) and the Fmoc-(Z)- or Z-)His (Trt) from Novabiochem (La Jolla, CA). The resin used for the second peptide synthesis was Fmoc-Tyr (tBu)-HMP-resin from Perkin Elmer/ABD. All reference peptides were cleaved from their respective resins using 500 jLil thioanisole, 750 mg phenol, 250 \i\ 1,2-ethanedithiol, and 500 |il water; the resulting mixtures were diluted to 10 ml total volume with neat TFA. Thioanisole and 1,2-ethanedithiol were purchased from Aldrich (Milwaukee, WI), phenol and ethyl ether from EM Science (Gibbstown, NJ) and acetic acid from Mallinkrodt (Paris, KY). After 2 hr incubation in the cleavage solution, the cleaved peptides were precipitated in -70° ethyl ether, centrifuged, the supernatant decanted, and the precipitation procedure repeated. The final pellet was dissolved in 10 ml 25% acetic acid, lyophilized and redissolved in 0.1% TFA for desalting. All peptides were desalted using a Beckman 0DS-RP-C18 5 |im column, 10 mm x 25 cm, using a 0-50% gradient over 60 min: A= 0.1%) TFA, B = 90% acetonitrile, 0.1% TFA. B. Amino Acid Analysis Analyses for amino acid composition were performed according to the method of Spackman, Stein and Moore (8). Samples were hydrolyzed for 24 hr at 110°C in 6 N HCl containing 2% phenol and 1% 2-mercaptoethanol. All analyses were performed on a Beckman 6300 amino acid analyzer with a sodium polystyrene sulfonated cation exchange column (Pickering Laboratories). Determination of D- and Z-His content in peptides was carried out using Marfey's reagent, l-fluoro-2,4-dinitrophenyl-5-Z-alanine amide (FDAA) (9). Peptides were first hydrolyzed for 18 hr in 6 N HCl under reduced pressure. 
To ensure that racemization during hydrolysis would be minimized, this time was determined by evaluation of a time course of hydrolysis on the reference peptides. After incubation with 10 mM FDAA in 0.1 M sodium bicarbonate for 1 hr at 40°, the hydrolysates were acidified with 0.2 N HCl before injection on a Hypersil
C18 column (2.1 x 200 mm, 5 µm particle size, 120 Å pore size). Each hydrolysate was incubated with FDAA immediately prior to analysis. The separations were carried out on a Hewlett Packard HP1090M HPLC with a gradient extending from 0.1% aqueous TFA to 60% acetonitrile/0.1% TFA and detection at 340 nm. The series of test peptides that contained Cys were pyridylethylated before hydrolysis.
C. Optical Rotatory Dispersion
A JASCO model OR-990 chiral detector was tested as a means for assessing racemization in the reference samples. This instrument was equipped with a 44 µl flow cell (25 mm path length), a 150 W Hg-Xe lamp as light source, and an "open-loop" polarizer/analyzer. Samples were analyzed at 350 nm, with a 10x gain, 0.1 sec response, and 16 mdegree deflection. Separations were performed on a Hewlett Packard HP 1090 HPLC with a diode array detector.
D. Analytical Reversed-Phase HPLC
Analytical HPLC of reference peptides included evaluation of two gradient conditions: 1) 0.1% aqueous TFA to 100% acetonitrile/0.1% TFA, or 2) 0.1 M aqueous ammonium acetate pH 6.5 to 100% acetonitrile/0.1 M ammonium acetate. Both gradients were run by increasing the acetonitrile concentration by 1% per minute. A Vydac C18 column (4.6 x 250 mm, 5 µm bead, 300 Å pore size) was used with both gradients at a flow rate of 1.5 ml/min, and absorbance monitored at 230 nm. An Astec Cyclobond II-γ chiral cyclodextrin-based column was used with gradient 2 under the same elution and detection conditions. Test separations were also performed using a Nucleosil C18 column (4.6 x 250 mm, 5 µm bead, 300 Å pore size) eluted with gradient 1 at a flow rate of 0.5 ml/min and detection at 214 nm. Analyses of peptides submitted by participating laboratories were routinely carried out on a Hewlett Packard 1090 HPLC using a truncated version of gradient 1 (up to 20% acetonitrile) on the Vydac C18 column at a flow rate of 1.5 ml/min, and absorbance monitored simultaneously at 230 and 274 nm. Exceptional peptides were re-examined using longer gradients.
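The run times implied by the gradient conditions above follow from simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes a linear gradient that starts at 0% acetonitrile with no initial hold or re-equilibration step, which the text does not specify.

```python
def gradient_summary(start_pct, end_pct, slope_pct_per_min, flow_ml_per_min):
    """Return (duration in min, mobile-phase volume in ml) for a linear gradient.

    Assumes no hold or re-equilibration time (not stated in the chapter).
    """
    if slope_pct_per_min <= 0 or flow_ml_per_min <= 0:
        raise ValueError("slope and flow rate must be positive")
    if end_pct < start_pct:
        raise ValueError("end_pct must not be below start_pct")
    duration_min = (end_pct - start_pct) / slope_pct_per_min
    return duration_min, duration_min * flow_ml_per_min


# Full gradient 1 on the Vydac column: 0-100% acetonitrile at 1%/min, 1.5 ml/min
full = gradient_summary(0, 100, 1.0, 1.5)
# Truncated gradient used for submitted samples: stops at 20% acetonitrile
truncated = gradient_summary(0, 20, 1.0, 1.5)
print(full, truncated)
```

Under these assumptions the truncated gradient takes 20 min per sample, one fifth of the full run.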
E. Mass Spectrometry
Electrospray ionization mass spectra were acquired on a Finnigan MAT SSQ700 quadrupole mass spectrometer fitted with an Analytica of Branford electrospray interface. The electrospray voltage was -3.5 kV and the nitrogen bath gas was heated to 160 °C. Samples were dissolved in 0.5 ml of 5% aqueous acetic acid and diluted with 50% aqueous acetonitrile/0.5% acetic acid to give a concentration of approximately 15 pmole/µl, based on quantity assessment of a separate sample by amino acid analysis. It is important to note that, by this approach, spectral intensity was partially dependent on even sample distribution among the tubes by the submitting laboratory. Analyses were performed by flow injection of 5-µl aliquots, infused at a rate of 1.5 µl/min. Spectral averaging for 1 min was employed prior to profile mode data acquisition. Deconvolution of the ESI mass spectra was accomplished by the BioMass program component of the SSQ data system software.
Liquid chromatography-mass spectrometry (LC-ESI-MS) was performed on a PE-Sciex API-III triple quadrupole mass spectrometer with an IonSpray source. The samples were separated on the Nucleosil C18 column with gradient 1; a splitting device was used to direct a portion of the effluent into the mass spectrometer. Data were acquired in the range of m/z 400-1000, with sufficient resolution to detect the isotope peaks for singly-charged ions at m/z 1000.
Matrix-assisted laser desorption ionization mass spectrometry (MALDI-MS) was performed on a Fisons Instruments (Beverly, MA) VG TofSpec time-of-flight mass spectrometer (0.6 m flight tube) fitted with a nitrogen laser (337 nm, 4 ns pulse). The accelerating voltage was 20 kV and the detector voltage 1.7 kV. Positive ion spectra were collected in the linear mode, with each spectrum derived from the accumulation of 20 to 50 laser shots. External calibrations were performed using synthetic peptides with masses covering the range of interest. Mass accuracy to within 1-2 amu was routinely obtained. Data were analyzed using Fisons Instruments Opus Software.
Upon receipt, peptide samples were dissolved at a concentration of 0.3 µmole/10 µl in 50% acetonitrile/0.1% TFA and stored at -20°. In order to obtain initial characterization of the submitted peptides, samples were diluted 1:10 in 50% acetonitrile containing 0.1% TFA saturated with α-cyano-4-hydroxycinnamic acid (AC50) as the matrix. In preparation for enzymatic treatment, each peptide solution was diluted 1:100 with 0.25% ammonium bicarbonate containing 0.5 M NaCl. CPA (Sigma C9762) was diluted 1:10 from the reagent vial into the above peptide solution. A limit digest was achieved by overnight incubation at 37°. Trypsin was also used for some test digestions. For MALDI-MS analysis after enzyme digestion, each reaction mixture was diluted 1:10 in the AC50 matrix solution.
Assessment of component quantity in the peptide samples was obtained by measurement of ion intensities, recognizing the limitations in quantification by MALDI-MS. The calculated average mass-to-charge ratio of the protonated Arg-Glu-Arg-His-Ala-Tyr peptide ([M+H]+) is 831.9. The limit carboxypeptidase digest of the L-His containing peptide yields Arg-Glu-Arg with an [M+H]+ at m/z 460.5, while digestion of the D-His containing peptide produces Arg-Glu-Arg-His-Ala, characterized by [M+H]+ at m/z 668.7. The percent racemization was calculated by comparison of the relative peak heights of the m/z 669 and m/z 461 ions in the MALDI mass spectra obtained after CPA treatment of each submitted peptide.
III. Results and Discussion
Analysis Of Reference Peptides With Two Potential Racemization Sites, Arg-Asp-Arg-His-Glu-Cys
The committee first designed a peptide susceptible to racemization at two residues, His and Cys: Arg-Asp-Arg-His-Glu-Cys. Glu was placed after the Cys in order to facilitate determination of the influence of a racemized cysteine on the action of CPA. Arg was placed after the histidine in order to permit examination of whether the presence of a D-His would stop the action of trypsin.
The committee's first attempts to identify methods to detect and distinguish among the four isomeric peptides focused on HPLC. In these preliminary studies, it was found that reversed-phase HPLC was readily able to separate the four possible isomeric forms of the reference peptide. Of the several separation systems evaluated, the Nucleosil column (described above) provided the best resolution, while the cyclodextrin-based chiral column was unable to separate the four reference peptides. However, a major problem was encountered in that there was a noticeable change in the number of peaks in the HPLC profile of each "pure" peptide when rechromatographed after a period of time. These changes could not entirely be attributed to oxidation of sulfhydryl groups to disulfides. A further difficulty was found in evaluating racemization in these peptides by AAA following reaction with Marfey's reagent. Although there was satisfactory separation of derivatized D- and L-His, D- and L-Cys were incompletely resolved, seriously compromising accurate quantification. As a consequence, analysis of the four reference peptides by this technique showed variable amounts of Cys racemization that were inconsistent with the behavior of the purified peptides on HPLC. It was not clear if racemization was occurring during the analysis process per se. There were also complications in interpreting the results of MALDI-MS analysis after CPA digestion of the four reference isomeric peptides due to variations in the proteolytic product distribution. Because these problems would be expected to be exacerbated by the additional shipping and handling steps for samples submitted from member laboratories, this peptide model was deemed unsuitable for the purposes of the 1996 study.
Analysis Of Reference Peptides With A Single Potential Site of Racemization, Arg-Glu-Arg-His-Ala-Tyr
A simpler peptide that was susceptible to racemization at only a histidine residue was prepared: Arg-Glu-Arg-His-Ala-Tyr. Of the HPLC separation protocols tested, only the chiral column was ineffective in resolving D- or L-His containing peptides. The others achieved baseline resolution (Figure 1A). Interestingly, the order of elution of the isomers was reversed when ammonium acetate was used in the mobile phase (gradient 2) instead of TFA (gradient 1). During AAA following reaction with Marfey's reagent, the D-His and L-His derivatives were well resolved, with a 2.1% racemization detected for the L-His-containing peptide, and 1.9% detected for the D-His-containing peptide. Direct analysis with the on-line ORD detector linked to the HPLC was hampered by peak broadening, which led to incomplete resolution of the two peptides. Diffusion in the large flow cell may be a major factor in this difficulty. Because detector response can be either positive or negative, it is important to have an excellent signal-to-noise ratio and baseline HPLC resolution of the peptide isomers to be able to quantitate the degree of racemization by this technique. While the on-line ORD detector is promising, further improvements in resolution as well as sensitivity are required for it to be useful for routine peptide analysis.
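For this single-site reference peptide, the [M+H]+ values quoted in the Methods for the CPA assay (831.9, 460.5 and 668.7) can be checked from standard average residue masses. The sketch below does that and adds one possible reading of the percent-racemization estimate. The function names are hypothetical, and the ratio h669 / (h669 + h461) is an assumption: the text states only that the relative heights of the two peaks were compared.

```python
# Average residue masses (Da) for the amino acids in the reference peptide.
AVG_RESIDUE_MASS = {
    "Arg": 156.1875,
    "Glu": 129.1155,
    "His": 137.1411,
    "Ala": 71.0788,
    "Tyr": 163.1760,
}
WATER = 18.0153   # average mass of H2O, added once per free peptide
PROTON = 1.00728  # mass of the charge-carrying proton


def average_mh(sequence):
    """Average m/z of the singly protonated peptide, [M+H]+."""
    residues = sequence.split("-")
    return sum(AVG_RESIDUE_MASS[r] for r in residues) + WATER + PROTON


def percent_d_his(height_669, height_461):
    """Share of D-His peptide estimated from the two CPA signature ions.

    Assumed formula: h669 / (h669 + h461), expressed in percent.
    """
    total = height_669 + height_461
    if total <= 0:
        raise ValueError("at least one peak height must be positive")
    return 100.0 * height_669 / total


intact = average_mh("Arg-Glu-Arg-His-Ala-Tyr")
l_his_product = average_mh("Arg-Glu-Arg")          # limit digest of the L-His peptide
d_his_product = average_mh("Arg-Glu-Arg-His-Ala")  # CPA stops after removing Tyr
print(f"{intact:.1f} {l_his_product:.1f} {d_his_product:.1f}")  # 831.9 460.5 668.7
```

With hypothetical peak heights of 5 and 95 (arbitrary units) for the m/z 669 and m/z 461 ions, `percent_d_his(5.0, 95.0)` gives 5.0. Because MALDI-MS ion intensities are only semi-quantitative, as the Methods note, such a figure is an estimate rather than a measurement.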
Figure 1. HPLC Profiles of Reference Peptides and Selected Samples. A = mixture of D and L reference peptides; B = sample 9050; C = sample 5110. (x-axis: time, minutes)
Table I. Peptide Evaluation Sheet
Column headings: Sample*; Chemistry (Fmoc default); Protecting group (Trt default); % desired, HPLC; % desired, MALDI-MS; % desired, MALDI-MS diluted; % D, Marfey's; % D, HPLC; % D, CPA/MS; other molecular species present by ESI-MS or MALDI-MS analysis.
nd some +12 3.34 2.58 69 Bom 70 175 nd trace +Pbf 1.91 90 484 3.98 85 nd trace +Fmoc 0.46 2.18 94 88 690 3.12 71 81 nd 1.25 988 4.1 some +Tyr 5.32 1002 4.31 68 78 87 5.9 5.99 5.69 88 1295 nd trace +Pmc 94 2.26 1354 3.25 63 95 4.85 87 nd trace +Tyr, +Fmoc, +tBu, +71 4.85 79 1609 2.57 nd trace +Bzl 1.59 77 Dnp 86 2522 Boc nd 5.30 3.83 96 Boc 93 3845 nd 1.14 2.02 98 96 4088 6.97 nd trace +Tyr, +Fmoc 5.18 65 70 4308 5.7 2.27 1.69 98 93 4343 3.00 94 nd 2.28 88 4609 1.4 some -Arg, trace +Tyr, +Pbf, +Pbf & tBu 2.49 7.68 69 60 4612 97 nd 3.25 2.03 93 4761 21 7.15 94 5.76 96 5110 94 14.9 7.22 5.53 98 5111 10.1 7.47 94 5112 5.81 98 15.7 5.52 6.99 97 83 5113 8.6 6.75 94 5114 5.48 98 nd trace +Pbf 2.16 2.06 86 95 5307 2.24 91 5874 nd some +Pmc 1.59 56 98 4.57 nd failed synthesis nd 0 6410 18 nd trace +Pmc, +224 1.61 2.25 82 97 6849 nd trace Orn substitution 3.37 1.93 84 70 6930 2.11 2.67 92 7154 nd trace +Pbf 41 91 nd 1.97 94 98 2.31 7325 nd some +Arg, +Pbf 3.14 3.96 84 80 7440 2.6 3.09 90 7638 5.66 93 nd Arg-Gly-Arg-His-Ala-Tyr 1.66 6.16 0 7818 0 nd significant -Arg 2.88 70 60 8246 0.53 76 nd 0.06 1.96 97 99 8398 4.62 nd trace +Tyr, +Fmoc 4.57 84 95 8402 68 10.3 some +Pmc 4.87 6.30 78 93 8409 7.07 92 6422 5.1 trace +Pmc 4.67 83 nd significant +nTyr, up to n=7 1.12 2.72 36 56 8465 nd significant +nTyr, up to n=7 2.82 8475 1.42 43 56 2.9 0.04 2.14 100 8583 96 nd trace -22, +Pmc 2.53 81 96 8596 3.09 76 nd 2.20 97 100 0.00 8808 2.8 2.26 0.00 98 100 9050 17.6 trace +Fmoc 5.62 97 94 8.11 77 9140 nd trace +Pmc 0.18 3.65 98 89 9280 nd trace +anisyl, +BrZ 98 9453 Boc 0.10 1.91 80 Dnp 6.12 2.9 some +Tyr, trace +Pbf 68 81 85 9818 Mtt 7.09 94 87 DBE1 nd some -His 12.96 3.63 5.7 trace +Tyr 5.25 88 FBR1 4.18 90 nd trace +112, +Pmc 2.67 89 JL96 1.56 79 94 nd 2.53 97 ND05
1.55 nd some +Tyr 1.63 2.58 84 PEPA 86 nd na 3.78 92 89 PEPB
na = not analyzed. The hydrolysate of PEPB precipitated after reaction with Marfey's reagent.
nd = not detected.
The usefulness of combining MALDI-MS with proteolytic digestion to provide insight into the extent of racemization was found to depend on the enzyme used for digestion. Trypsin did not discriminate between peptides with D- and L-His adjacent to the cleavage site. However, CPA proved very sensitive to the presence of D-His, even distal to the cleavage site. As seen in Figures 2A-D, CPA removed only the terminal Tyr residue from the D-His-containing peptide, leading to a characteristic ion at m/z 669 (Arg-Glu-Arg-His-Ala) instead of the m/z 461 ion (Arg-Glu-Arg) observed for the L-His-containing peptide. Although this particular assay may not be of general utility, similar strategies could most likely be designed for other sequences.
Determination of Purity of Coded Samples by Different Analytical Methods
Fifty-three peptide samples were submitted by 48 laboratories. Previous studies by this committee have demonstrated the need for multiple analytical methods for the assessment of purity. Therefore the peptides in this study were analyzed by AAA, HPLC, ESI-MS and MALDI-MS to determine purity (Table I). Only two peptide samples had less than 50% of the desired product, and three other samples had less than 70% of the desired product, as judged by their mass spectra, amino acid composition and HPLC retention time. Overall, the peptides were of excellent quality. In assessing peptide purity, there were occasional discrepancies noted among the different analytical methods. MALDI-MS analysis of the peptide samples often revealed impurities not readily observed by ESI-MS and HPLC. In most of these cases, greater dilution of the sample prior to MALDI-MS yielded data more consistent with ESI-MS and HPLC assessment. This is illustrated by sample 1354 in Figure 3A-C and in Table I. These results underscore the importance of using more than one analytical method to evaluate peptide purity, and of testing samples under several different conditions.
Analysis of Coded Study Samples for Extent of Racemization
Racemization was judged by three methods: AAA following reaction with Marfey's reagent, HPLC, and MALDI-MS of products produced by digestion with CPA. Although most samples that were chemically pure were found to have low levels of racemization, a few very pure samples showed extensive racemization. For example, AAA, HPLC, ESI-MS (data not shown) and MALDI-MS analysis (Figure 2E) of sample 5110 revealed that this sample contained a high percentage of the correct sequence. However, both CPA/MALDI-MS (Figure 2F) and HPLC analysis (Figure 1C) of 5110 revealed high levels of the D-His-containing product as compared to sample 9050 (Figure 1B), which showed little racemization by all criteria used. AAA following reaction with Marfey's reagent was consistent with these data. The results obtained by AAA following reaction with Marfey's reagent and by HPLC generally agreed, with somewhat less agreement for CPA/MALDI-MS. However, in some cases the three analytical methods yielded divergent results. Therefore, the possible sources of discrepancy among the methods were also evaluated as part of this
Figure 2. MALDI-MS Analysis of Peptides Before and After Treatment with CPA. A, C and E: samples analyzed in the absence of CPA; B, D and F: samples analyzed after treatment with CPA; A and B: L-reference peptide; C and D: D-reference peptide; E and F: sample 5110. (x-axis: m/z)
study. CPA/MALDI-MS analysis tended to detect the presence of D-His only in those cases where there was high racemization. When the percentage of D-His peptide estimated by HPLC was found to be very low, the small peaks present were not recognized by the instrument's computerized integration protocol. In several instances, AAA following reaction with Marfey's reagent revealed higher racemization than shown by either HPLC or CPA/MALDI-MS; this most frequently occurred when there was a failed synthesis, such as samples 6410 and 7818 (see below). Information can be obtained from AAA following reaction with Marfey's reagent even if the synthesis fails. In contrast, HPLC and CPA/MALDI-MS are more dependent on the presence of the correct sequence. In order to effectively utilize the HPLC and CPA/MALDI-MS methods with less pure samples, the corresponding signature ions and retention times of the modified products must be ascertained. For several other samples, the percentages of D-His-containing peptide determined by HPLC were higher than by AAA following reaction with Marfey's reagent or by CPA/MALDI-MS. For example, this discrepancy was observed for sample 4612, which contained a significant amount of des-Arg peptide, and for sample DBE1, which contained some des-His peptide. A possible explanation for this is that the D-His peptide elutes just before the L-His peptide in the HPLC analysis. Peptide byproducts in which basic residues have been deleted would be expected to shift to an earlier retention time, perhaps coeluting with the D-His-containing peptide. Figure 4 shows the distribution of the results from AAA following reaction with Marfey's reagent. Because of the skewed distribution of data, assessment of the mean and/or median extent of racemization was judged not to be useful. The majority of samples contained approximately 2% of the D-His form of the peptide.
It should be noted that of the ten samples with the highest extent of racemization, five comprised the 5110-5114 series from one laboratory. Racemization of histidine is attributed to a base-sensitive, intramolecular reaction which occurs at the imidazole π-nitrogen (6,10). This reaction is influenced by the type of base used in the coupling step of solid phase peptide synthesis, by the relative strength of the coupling reagent, and by the polarity of the solvent. In order to control the extent of racemization, different side chain protecting strategies have been devised for both the π- and τ-imidazole nitrogens. All but five of the samples submitted for this study used a Trt group on the τ-nitrogen (Table I), an approach reported to be effective in suppressing racemization. The results of the study suggest that the Trt group is not always as effective as desired. Five samples were produced by other protecting strategies, but this is an insufficient number to permit statistical evaluation of the effects of these groups. The two peptide samples synthesized by Boc chemistry (2522, 9453) both used Dnp-protected His, and had low levels of racemization. Peptide 9818 used Mtt protection and had a high content of D-His-containing peptide. One sample (175), which used Bom protection of the π-nitrogen, showed the presence of a +12 amu derivative by MALDI-MS. This modification is observed frequently as the result of reaction of formaldehyde released by the Bom group during cleavage procedures. Sample 3845 used Boc protection of His, but had elevated levels of D-His peptide.
Figure 3. Effect of Sample Dilution on MALDI-MS Analysis of Sample 1354. A: 300 pmoles; B: 30 pmoles; C: 3 pmoles. (Labeled peak: +Pmc)
Figure 4. Distribution of Data from AAA Followed by Marfey's Analysis. (x-axis: sample)
Samples 5110-5114, submitted by one laboratory, were all excellent in chemical purity, but lower in optical purity. The preparation of these five samples differed only in the length of coupling time. Examination of the detailed protocol sheets submitted by participating laboratories revealed that there was no correlation between coupling times and extent of racemization among the samples submitted, even though a wide variety of coupling times had been employed. Similarly, no correlation was noted for the type of base used in the coupling cocktail. Out of concern that the amount of D-His-containing peptide might simply reflect the quality of the starting protected His derivative used in the synthesis, additional information about the vendor and lot number of the compound was requested anonymously from the study participants. No correlation was found between the extent of racemization and the vendor of the L-His derivative. There were five pairs of peptides for which each sample was synthesized using the same lot of His derivative. Two pairs of peptides synthesized with the same lot of protected His derivative were in the cohort of samples with low racemization. It was interesting to note that another pair had one member with low racemization and the other with approximately 4-5% of D-His-containing peptide. Two other peptide pairs had one sample with low racemization, and the second with among the highest racemization observed. Thus, it appeared unlikely that the origin of the racemization could be the quality of the starting material.
Analysis of Synthetic Failures
From the studies conducted by this committee in the past several years, it can be concluded that the quality of peptides produced by ABRF member laboratories is high, as judged by modern analytical techniques. However, there have always been a few less than satisfactory samples for which the source of synthetic problems could only be attributed to human error.
In the present study, errors of this type originated not only from some member laboratories, but also from manufacturers, as illustrated in some of the examples described below. Sample 6410 appeared to be a failed synthesis, with little detectable peptide. Sample 7818 was quite pure, but had no correct product, since Gly had been substituted for Glu in the synthesis. The laboratory submitting sample 7818 used an instrument with amino acid cartridges that were refilled with bulk amino acid derivatives. Although it is not known whether the refilling procedure was carried out in-house or by a commercial supplier, this type of opportunity for a mistake could affect many peptide syntheses, and should be a cause for concern in all laboratories. Sample 8246 contained a large quantity of peptide missing the C-terminal Tyr. From the description on the protocol sheet, it appeared likely that this laboratory had prepared its own Tyr-resin. Yet purchase of preloaded resins provides no assurance of quality. Analysis of the samples by MALDI-MS revealed the presence of additional Tyr residues on ten peptide samples from nine laboratories. As noted above, at higher analyte concentrations, MALDI-MS frequently shows greater intensity for by-product ions than is appropriate for their concentration in the sample. Indeed, ESI-MS analysis of the samples introduced by flow injection revealed only a few instances of samples with ions corresponding to additional
Figure 5. Analysis of Sample 8475. A = MALDI-MS; B = ESI-MS; C = HPLC. (Labeled peak: +Y, 994.1; HPLC x-axis: time, minutes)
Tyr residues. LC-ESI-MS analysis of sample 8475 permitted detection of peptides containing up to 7 additional Tyr residues. Consistent with this observation, the amount of Tyr detected by AAA was very high. The MALDI-MS and ESI-MS analyses of this sample are shown in Figure 5A-B. HPLC analysis of all ten +Tyr-containing samples showed that seven had less than 2% of the +Tyr peptide. Three samples, including 8475 (Figure 5C), had significant amounts of the +Tyr peptide and other peptides containing a variable number of Tyr residues. All of the laboratories that submitted these ten samples indicated that they had purchased resin with the Tyr already attached. However, these resins appeared to have been purchased from several vendors. Thus, additional quality control of resins sold by manufacturers should be implemented.
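By-products carrying extra Tyr residues can be recognized in a mass spectrum by their regular spacing above the target ion. The sketch below lists the expected average [M+H]+ values for up to seven additional Tyr residues and matches an observed peak against them. The function names are hypothetical, and the 2 amu tolerance is an assumption taken from the 1-2 amu mass accuracy quoted for MALDI-MS in the Methods.

```python
TYR_RESIDUE = 163.176  # average mass of one Tyr residue (Da)
TARGET_MH = 831.9      # [M+H]+ of Arg-Glu-Arg-His-Ala-Tyr, from the Methods


def extra_tyr_series(max_extra=7):
    """Expected average [M+H]+ of the target peptide plus n extra Tyr residues."""
    return {n: round(TARGET_MH + n * TYR_RESIDUE, 1) for n in range(1, max_extra + 1)}


def match_extra_tyr(observed_mz, tolerance=2.0, max_extra=7):
    """Return n if observed_mz lies within tolerance of a +nTyr ion, else None."""
    for n, expected in extra_tyr_series(max_extra).items():
        if abs(observed_mz - expected) <= tolerance:
            return n
    return None


for n, mz in extra_tyr_series().items():
    print(f"+{n} Tyr: m/z {mz}")
```

Here `match_extra_tyr(994.1)` returns 1 and `match_extra_tyr(900.0)` returns None. Other by-products listed in Table I (for example +Pmc or +Fmoc) would need their own mass offsets, which are not included in this sketch.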
IV. Conclusions
The overall results from this study indicate that racemization during peptide assembly is not a serious problem in most participating laboratories. Nevertheless, it is evident that laboratories can produce peptides which are predominantly composed of the desired product, yet exhibit unacceptably high levels of racemization. AAA following reaction with Marfey's reagent permits very sensitive detection of racemization, and complementary procedures involving enzymatic digestion coupled to MS detection can also be devised. Unfortunately, HPLC separations for AAA following reaction with Marfey's reagent do not always provide baseline separation of pairs of D- and L-amino acids. Moreover, in laboratories which produce large numbers of peptides, the additional personnel and instrument time required to analyze all peptides by this method may preclude its routine use. Nonetheless, in the present study, it was possible to achieve baseline resolution of the D- and L-His containing study peptides by conventional reversed-phase HPLC. Since fortuitous separation of racemized peptides by HPLC cannot be assumed, accurate determination of racemization should be considered for peptides which are central to large research programs. Considerable thought and emphasis should be placed on the potential effects of a racemized product on the outcome of a project. It is well known that the L and D forms of peptides can vary significantly with respect to stability and biological activity. Therefore, for example, if a significant portion of a synthesis is racemized and the racemized peptide is 1000 times more stable in vivo, the majority of the immune response may be directed to the racemized peptide and not to the desired peptide. This could be one explanation for peptide immunizations which result in antisera of suboptimal avidity or specificity.
Racemization could also be an explanation as to why positive results obtained with a peptide library cannot be repeated with the newly synthesized, highly purified product. The implications of the above examples require serious reflection by the investigators designing the experiments and by the core laboratories which supply the requested peptides. In most of the previous studies conducted by this committee, a few peptides stood apart from the majority of high quality peptides as samples whose problems originated from human error. However, in the present study it was
clear that these errors were not limited to the participating peptide resource laboratories, but extended perhaps to a greater extent to reagent suppliers. While three samples exhibited serious problems with peptide assembly in the participating laboratories, nine laboratories had purchased faulty preloaded resins. The critical role of peptides in research programs requires that both suppliers and core synthesis laboratories examine their quality control procedures. While strict application of GLP or ISO 9002 protocols may not be necessary for many laboratories, incorporation of some of the procedures into general laboratory operations might be considered in order to provide additional assurance that high laboratory standards are maintained, and that errors, when they occur, will be detected sufficiently early to prevent unnecessary expenditure of time and effort, and loss of good will. Although the results of this study cannot be used to formulate recommendations about remedies or precautions to be taken, they do emphasize the importance of seriously considering analysis of racemization. In the present study, it was not possible to identify the origins of high levels of racemization that were occasionally found in peptides from laboratories whose instruments and personnel seemed to be otherwise working at high performance levels. While the length of coupling times, the type of base, or the source of the amino acid derivatives did not appear to be a factor in the racemization detected in this study, it is possible that the purity or age of coupling reagents could be important, since it has been demonstrated that His racemization takes place in the coupling step. Synthesis of the reference peptide sequence used in this study may be a helpful approach to determine whether coupling conditions are optimized, since the D-His and L-His forms of this peptide are so easy to separate by HPLC.
Acknowledgements
This study was supported in part by a grant from the Department of Energy. The committee also thanks Nguyet Le, Anthony J. Makusky, Andrew J. Miles, and Edward Nieves for their assistance in carrying out this study.
References
1. A.J. Smith, J.D. Young, S.A. Carr, D.R. Marshak, L.C. Williams & K.R. Williams (1992) Techniques in Protein Chemistry III (ed. R.H. Angeletti): 219-229.
2. G.B. Fields, S.A. Carr, D.R. Marshak, A.J. Smith, J.T. Stults, L.C. Williams, K.R. Williams & J.D. Young (1993) Techniques in Protein Chemistry IV (ed. R.H. Angeletti): 229-238.
3. G.B. Fields, R.H. Angeletti, S.A. Carr, A.J. Smith, J.T. Stults, L.C. Williams & J.D. Young (1994) Techniques in Protein Chemistry V (ed. J.W. Crabb): 501-507.
4. G.B. Fields, R.H. Angeletti, L.F. Bonewald, W.T. Moore, A.J. Smith, J.T. Stults & L.C. Williams (1995) Techniques in Protein Chemistry VI (ed. J.W. Crabb): 539-546.
5. R.H. Angeletti, L. Bibbs, L.F. Bonewald, G.B. Fields, J.S. McMurray, W.T. Moore & J.T. Stults (1996) Techniques in Protein Chemistry VII (ed. D. Marshak):
6. J.H. Jones, W.I. Ramage & M.J. Witty (1980) Int. J. Peptide Protein Res. 15, 301-303.
7. E. Atherton, P.M. Hardy, D.E. Harris & B.H. Matthews (1991) in Peptides 1990 (E. Giralt & D. Andreu, eds.), Escom, Leiden, The Netherlands, pp. 243-244.
8. J. Spackman, W. Stein & S. Moore (1958) Analytical Chemistry 30, 1190-1205.
9. J.G. Adamson, T. Hoang, A. Crivici & G.A. Lajoie (1992) Anal. Biochem. 202, 210-214.
10. G.B. Fields, Z. Tian & G. Barany (1992) in Synthetic Peptides: A User's Guide, G.A. Grant, ed., Oxford University Press, New York, pp. 77-183.
Index
AAT, see Aspartate aminotransferase ABRF, see Association of Biomolecular Resource Facilities Acetic anhydride, lyophilized protein acetylation, 222,227 Acidic fibroblast growth factor (aFGF) aggregation, 746-747 differential scanning calorimetry anion effects on stability, 747-748,750 data collection, 747 guanidium hydrochloride effects, 751 reversibility of denaturation, 749,751 expression and purification of recombinant human protein, 747-748 heparin binding, 745 stability as regulatory mechanism, 745-746 Active site peptide, see Lysyl oxidase aFGF, see Acidic fibroblast growth factor ω-Agatoxin-TK circular dichroism, 547,549,551 P-type calcium channel inhibition, 543-544 D-serine-46 effects conformation, 544-546, 549, 551,553 inhibition potency, 544 molecular shape, 547, 549 tryptophan fluorescence, 551 peptide isomerase, 544 size-exclusion chromatography, 546-547, 549 structure, 544 synthesis, 546 Alkaline phosphatase, single molecule assay with capillary electrophoresis activation energy determination, 121-124, 128 distribution of activity, 126,128 instrumentation, 122 reaction condition, 123-124,126 reagents, 122 thermal denaturation, 124,128,130
Amicon Microcon™-SCX clean-up of samples amino acid analysis, 135,142 glycosylation analysis, 135,139,141 high-performance liquid chromatography, 135-137 oligonucleotides, 140 sequence analysis, 135-137 kinetics of binding, 136,141 operation, 134 sample elution, 134-135,141 Amino acid analysis 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate derivatization advantages, 185 cell culture supernatant, 194-195 collagen analysis, 186-187,189 high-performance liquid chromatography gradient system, 185-186,189,196 instrumentation, 187 optimization, 191-192 materials, 186 sample preparation, 186-187 data evaluation, see Association of Biomolecular Resource Facilities free amino acids in physiological samples with 420A ABI/PE analyzer cleaning cycles, 205-206 gradient optimization, 205 materials, 197-198 running conditions, 198 sample treatment, 198,205 leptin, 156,161 micellar electrokinetic capillary chromatography with thermo-optical absorbance detection, 5,7 sample clean-up, 135,142 6-Aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC), derivatization in amino acid analysis advantages, 185 cell culture supernatant, 194-195 collagen analysis, 186-187,189 high-performance liquid chromatography
gradient system, 185-186,189,196 instrumentation, 187 optimization, 191-192 materials, 186 sample preparation, 186-187 Amphitropic protein, see Growth-associated protein-43 β-Amyloid, Aβ1-42 Alzheimer's disease role, 865 peptide synthesis activation protocols, 868 aminoacyl fluoride preparation, 866-867 approaches, 865-866,872-873 cleavage, 868 deprotection, 868,872 mass spectrometry, 870 reverse-phase high-performance liquid chromatography, 868,870,872 semiautomatic synthesis, 867-868,870 Angiotensin-I, Edman sequencing and nanoelectrospray mass spectrometry from capillary electrophoresis samples, 37-40,43-46 Antibody, see also Humanized antibody; Myoglobin antifluorescein binding and transition-state theory affinity constants, 508 antibody production, 506 limitations, 510 peptide carriers, 506-507 thermodynamic parameters, 508-510 transmission coefficients, 509-510 unimolecular rate constant determination, 507-508 complementarity determining regions conformational free energy calculation gas phase versus solution analysis, 765 inserted loop optimization, 756-758 loop closure optimization, 758 loop side-chain replacement, 756 relative stability of loops, 758-759 side-chain versus all atom optimization, 759-762 determinant parameters for native loops, 763-764 modeling approaches, 755-756,759 X-ray crystal structures, 755 receptor blocking, see Interleukin-1β secondary forces in binding, 505 Apolipophorin-III conformational change on lipid binding, 427-428,432-437
nuclear magnetic resonance, structure analysis assignment strategy, 433-434 data collection, 429-430 lipid binding, 434-437 mass spectrometry, 430 nitrogen-15 labeling of recombinant protein, 429-430,432-433 X-ray crystallography, 427 Applied Biosystems Procise™ 494, high-sensitivity instrument analysis data analysis, 59 limits of detection, 63,66 running conditions, 58 sequencing supports, 57,65-67 signal enhancement over standard instrument, 59,61,66 yields, repetitive and initial, 64-67 AQC, see 6-Aminoquinolyl-N-hydroxysuccinimidyl carbamate Arginine modification, see Phenylglyoxal Ascorbate, hemoglobin modification, 399, 401,403 Aspartate aminotransferase (AAT) activity assay, 484 heat shock protein-70 binding ATPase stimulation by peptides, 488-490 binding sites, identification, 489,491 peptide competition assay, 485,487-488 peptide synthesis and purification, 483 refolding, 482-484 purification, 482 Aspartate protease, see Feline immunodeficiency virus protease Association of Biomolecular Resource Facilities (ABRF) ABRF-96SEQ comparison with previous samples, 75-76 dataset A, 70-72 dataset B, 70,72-74,76 distribution, 70 matrix assisted laser desorption/ionization mass spectrometry data analysis, 74-75,78 results of data analysis, 71-78 amino acid analysis data evaluation accuracy for individual amino acids, 213 error calculations, 208-209 membrane-blotted samples, 215-216 molecular mass determination, 214-215 participation, 208-209 sample preparation, 208 unknown protein identification using databanks, 207,213-214 yield, 208-209,215
establishment and goals, 69 internal digest versus blotting, sequencing study chromatogram analysis, 102 comparative data, 105-109 protocol, 100 questionnaire responses, 102-103 sample design, 99-101 distribution, 100 racemization during solid phase peptide synthesis, evaluation amino acid analysis, 877-878, 883,885, 889 high-performance liquid chromatography, 880,883, 885, 889-890 mass spectrometry analysis, 878-879, 883,885 optical rotatory dispersion analysis, 878, 880 peptide selection, 876,879-880 purity assessment, 883,887 reference peptide synthesis, 876-877 synthetic failure analysis, 887,889-890 Atomic absorption spectroscopy, zinc-binding dimerization domain of RAG1, 574, 577-578 ATPase, see Proton, potassium-ATPase Avian sarcoma virus B Basic fibroblast growth factor (bFGF), heparin interactions studied by size exclusion chromatography with light scattering/ultraviolet absorbance/refractive index detection, 117-118 Beckman LF 3600, glycosylation site identification coupling reaction, 332-333 high-performance liquid chromatography, 333,336-339 principle, 331-332 sample preparation, 332 standards, 332,336 bFGF, see Basic fibroblast growth factor BIA, see Biomolecular interaction analysis Biomolecular interaction analysis (BIA), matrix assisted laser desorption/ionization mass spectrometry interfacing myoglobin/antibody system, 495-497 principle, 494-495 sensitivity, 498 β-sheet peptides, water soluble design biological examples, 797-798,801,806
circular dichroism, 800, 802, 804 difficulty, 797 guidelines, 802 nuclear magnetic resonance, 800-802, 804, 806 strand alignment, 804 surface charge, 807 synthesis of peptides, 800
C3, see Complement C4, see Complement Calcium channel P-type blocker, see ω-Agatoxin-TK types, 543 Capillary electrophoresis laser-induced fluorescence detection, single alkaline phosphatase molecule assay activation energy determination, 121-124,128 distribution of activity, 126,128 instrumentation, 122 reaction condition, 123-124,126 reagents, 122 thermal denaturation, 124,128,130 sequencing of proteins automation, 6-7 concentration limits of detection, 15-16,30 Edman sequencing in off-line analysis, 37,39-40,43-44 immunoaffinity capillary electrophoresis, 16-17,20-21 mass spectrometry coupling, 37 membrane-preconcentration-capillary electrophoresis-mass spectrometry, 16,18-19,25-28,30-34 micellar electrokinetic capillary chromatography with thermo-optical absorbance detection, 4-5,7,9,13 microreactor enzyme digestion capillary electrophoresis, 18,22 miniaturization of system, 4-5,7, 9,13 nanoelectrospray technique in off-line analysis, 37-40,43-46 sensitivity, 3-4,7 Carbon-13, metabolic labeling of proteins, 155,439,441,443,447 Carbonic anhydrase carbonic anhydrase II, structural determination of human protein using nuclear magnetic resonance aliphatic side-chain resonance assignment, 608 backbone resonance assignment, 608
deuteration of protein, 605,607,613 global fold determination, 609-613 metabolic labeling, 607-608 purification, 606 secondary structure determination, 609 sequencing of blotted proteins from gels automated sequencing, 94-95 capillary liquid chromatography of peptides, 94-95 chemical cleavage, 92,94 cysteine modification, 92 Edman degradation, 92 peptide mapping, 93 reagents, 91-92 recovery, 96-97 sample preparation, 92 Carcinoembryonic antigen (CEA), glycosylation site identification with Beckman LF 3600 sequencer coupling reaction, 332-333 high-performance liquid chromatography, 333,336-339 principle, 331-332 sample preparation, 332 standards, 332,336 CD, see Circular dichroism CDR, see Complementarity determining region CEA, see Carcinoembryonic antigen Cellular retinaldehyde-binding protein (CRALBP) retinoid binding nuclear magnetic resonance of binding site assignment, 443 carbon-13 labeling, 439, 441, 443, 447 data collection, 440 fluorine-19 labeling, 439, 441,443, 445 ligand-induced changes, 443, 445-447 mass spectrometry, 440-441 nitrogen-15 labeling, 439,441 recombinant protein expression, 440 stereoselectivity, 439 visual pigment regeneration, 439 Cellular retinoid-binding protein (CRABP) biological functions, 449 isoforms, 449 sequence homology, 449-450
site-directed mutagenesis of retinoid-binding leucine-121 in CRABP-II binding energy contribution, 450,454 competitive binding assay, 450-453 mutagenesis, 450 nuclear magnetic resonance analysis of conformation, 452,454 Chymotrypsin acetic anhydride acetylation of lyophilized protein, 222,227 γ-chymotrypsin dynamics in hexane active site, 695,697 algorithms, 694-695 effects on conformation, 694-695, 697 hexane binding, 698,700 hydrogen bond stability, 697-698 water binding, 694-695,698,700 water in catalysis, 693-695 iodomethane modification aqueous solution, 224-225 materials, 220 nuclear magnetic resonance analysis, 222 octane solution, 221-222 in vacuo, 224 Circular dichroism (CD) ω-agatoxin-TK, D-serine effects on conformation, 547, 549,551 growth-associated protein-43, phospholipid binding effects, 556-560 helix-coil transition of peptides, 740-741 lysozyme activity assay cell wall fragment substrate preparation, 855 data collection, 856 orphan nuclear receptors, 458-459,461, 465 zinc-binding dimerization domain of RAG1, 574-575,578-580 Clean-up of samples, see Membrane-preconcentration-capillary electrophoresis-mass spectrometry; Amicon Microcon™-SCX CMV protease, see Cytomegalovirus protease Collagen, amino acid analysis and 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate derivatization, 186-187, 189 Combinatorial peptide library phage display, 178 screening with IAsys binding assay, 179-180 electrospray mass spectrometry, 180-181,182
enzyme-linked immunosorbent assay, 181-183 high-performance liquid chromatography analysis, 181 library synthesis, 179 materials, 178 solid-phase synthesis, 178-179 Complement activation by divalent ions, 363 cleavage by coagulation enzymes, 363-365,367 stabilization in blood samples EDTA, 363,365 Futhan, 363,365,368 heparin, 363,365 Complementarity determining region (CDR) conformational free energy calculation gas phase versus solution analysis, 765 inserted loop optimization, 756-758 loop closure optimization, 758 loop side-chain replacement, 756 relative stability of loops, 758-759 side-chain versus all atom optimization, 759-762 determinant parameters for native loops, 763-764 modeling approaches, 755-756,759 CRABP, see Cellular retinoid-binding protein CRALBP, see Cellular retinaldehyde-binding protein Crystallin heterodimerization, 817, 825-826 interdomain interface localization in βA3- and βB2-crystallins accessibility calculations, 818-819, 821-822 layer packing of common surface residues, 822-823, 825 sequence homology analysis, 817-818, 820,824 structural surface template, 819-820, 822-825 Cyanogen covalent linkage of salt bridges, 469-470, 476 membrane permeability, 476 penicillin-binding protein, in vivo morphogene protein binding assay cyanogen cross-linking, 472,476 digestion of purified proteins, 472-473 fluorescence labeling bacterial cells, 470, 472-473 β-lactam, 470 high-performance liquid chromatography, 472-473,478-479
limitations, 479 mass spectrometry, 473 principle, 469-470 specificity, 476-477 Cysteine modification, see N-Ethylmaleimide; 2-(4'-Maleimidylanilino)naphthalene-6-sulfonic acid; Thiuram disulfides; Tris-(2-carboxyethyl)phosphine Cytochrome-C Edman sequencing and nanoelectrospray mass spectrometry from capillary electrophoresis samples, 37-40, 43-46 peptide mapping with liquid chromatography/mass spectrometry, 166-169, 171,175 structural investigation with liquid chromatography/Fourier transform infrared spectrometry, 166-169, 171-172,175 tryptic digestion, 167 Cytomegalovirus (CMV) protease assay, 259 diisopropylfluorophosphate labeling of active site serine autoproteolysis prevention, 257-258, 265-266 electrospray mass spectrometry quantitation, 259-262,264 reaction conditions, 258,260,265 function, 257 substrate specificity, 257 D Deamidation, see Isoaspartate Dehydroascorbate, hemoglobin modification, 399,401,403 2-Deoxyglucose, synthesis carbon-14 substrate, 313 deuterated substrate, 312-313 DFP, see Diisopropylfluorophosphate Differential scanning calorimetry (DSC) acidic fibroblast growth factor anion effects on stability, 747-748, 750 data collection, 747 guanidinium hydrochloride effects, 751 reversibility of denaturation, 749,751 orphan nuclear receptors, 459,461, 464-465 Diisopropylfluorophosphate (DFP), labeling of active site serine in cytomegalovirus protease autoproteolysis prevention, 257-258, 265-266
electrospray mass spectrometry quantitation, 259-262,264 reaction conditions, 258,260,265 Disulfide bond, see Humanized antibody; Stem cell factor DNA bend anisotropic flexibility, 585 binding proteins Bacillus stearothermophilus, 586-587 DNA loops, effects on binding, 587-589 Escherichia coli, 586 hydroxymethyluracil-containing DNA, effects on binding, 590 integration host factor, 586,589-590 TF1, 586-588,590 sequence context, 585 DSC, see Differential scanning calorimetry
EcoKI assembly, in vitro continuous variation titration, 596, 598-600 gel electrophoresis, 595-596,598 glutaraldehyde crosslinking, 595,598 mutant proteins, 601 purification of subunits, 595 methylation of DNA, 593 subunits, 593-594 Edman degradation automation, 57 capillary electrophoresis off-line analysis, 37,39-40,43-44 glycoaminoacids, 331 EDTA, complement stabilization in blood samples, 363,365 Electron paramagnetic resonance (EPR), manganese binding assay with ribonuclease H, 411-412 Enkephalin combinatorial peptide library screening, 177-183 Edman sequencing and nanoelectrospray mass spectrometry from capillary electrophoresis samples, 37-40, 43-46 Enthalpy of activation, glycosylation effects in glucose oxidase, 311, 316-318 EPO, see Erythropoietin EPR, see Electron paramagnetic resonance
Erythropoietin (EPO), Applied Biosystems Procise™ 494HS sequencing, 58-59,61,65-66 N-Ethylmaleimide (NEM) glial cell line-derived neurotrophic factor, modification, 279,283 humanized antibody alkylation, 386,390, 394 NC p7 of human immunodeficiency virus-1, modification, 233-235,242 Extinction coefficient, determination for proteins calculation from amino acid composition, 113 dry weight method, 113 size exclusion chromatography, light scattering/ultraviolet absorbance/refractive index determination accuracy, 114,117,119 apparatus, 114-115 principle, 114 protein interaction analysis, 115, 117-118 standard curve, 116-117
Feline immunodeficiency virus (FIV) protease mechanism, 643 X-ray crystallography of aspartate-to-asparagine mutant crystallization, 645 data collection, 645 mutation effects on structure of LP-149 complex, 647-649,651-652 preparation of protein, 644 refinement, 647 Fetuin, glycosylation profile analysis, 322-324,326,328 Fibroblast growth factor, see Acidic fibroblast growth factor; Basic fibroblast growth factor FIV protease, see Feline immunodeficiency virus protease FK506-binding protein, amide deuterium exchange measured by mass spectrometry advantages over nuclear magnetic resonance, 703,712 exchange reaction, 704-705 folding/unfolding kinetics, analysis, 708-711 Fourier transform ion cyclotron mass spectrometry, 705,712
global exchange under denaturing conditions, 706-707 local variations examined by proteolysis, 705,707-708 protein preparation, 704 Fluorescence polarization, adenine binding to ribonuclease A, 567-568, 570 Fluorine-19, labeling of proteins, 439,441, 443,445 Futhan, complement stabilization in blood samples, 363,365,368
β-Galactosidase, single molecule assay, 121 GAP-43, see Growth-associated protein-43 GDNF, see Glial cell line-derived neurotrophic factor Gel digestion, see Polyacrylamide gel electrophoresis GIF, see Glycosylation-inhibiting factor GK, see Guanylate kinase Glial cell line-derived neurotrophic factor (GDNF) disulfide structure, 277-278, 280 Parkinson's disease treatment, 277 tris-(2-carboxyethyl)phosphine modification alkylation of partially reduced protein, 279,283 complete reduction and pyridylethylation, 279 intermolecular disulfide identification, 283,286 mass spectrometry, 280,283 materials, 278 partial reduction of protein, 278-280, 282 sequence analysis, 279,283 tryptic digestion, 279,283 Glucoamylase, glycosylation site identification with Beckman LF 3600 sequencer coupling reaction, 332-333 high-performance liquid chromatography, 333,336-339 principle, 331-332 sample preparation, 332 standards, 332,336 Glucose oxidase catalytic reaction, 311 deglycosylation, 313-314 glycosylation effects enthalpy of activation, 311,316-318 hydrogen tunneling, 311,314-318
initial velocity data analysis, 315 kinetic isotope effect analysis, 314-318 Glycosylation glucose oxidase effects enthalpy of activation, 311,316-318 hydrogen tunneling, 311,314-318 profile analysis enzymatic release, 321-322,324,326 hydrazine release, 321-322 1-phenyl-3-methyl-5-pyrazolone labeling mass spectrometry, 323,326, 328 reaction conditions, 322-323 reverse-phase high-performance liquid chromatography, 321-323, 326, 328 site identification with Beckman LF 3600 sequencer coupling reaction, 332-333 high-performance liquid chromatography, 333,336-339 principle, 331-332 sample preparation, 332 standards, 332,336 Glycosylation-inhibiting factor (GIF) biological functions, 633 X-ray crystallography crystallization, 635 data collection, 634 multiple isomorphous replacement, 634, 636-637 phase improvement, 637-638 refinement, 634-635, 638 trimer structure, 638-640 Granulocyte colony-stimulating factor (GCSF), Applied Biosystems Procise™ 494HS sequencing, 58-59,65-66 Growth-associated protein-43 (GAP-43) basic amphiphilic domain, 555-556 calmodulin-binding domain, 556,562 mass spectrometry of posttranslational modifications, 556-558 palmitoylation, 555 phospholipid binding effects circular dichroism, 556-560 proton nuclear magnetic resonance, 557,560-562 phosphorylation, 556 Guanylate kinase (GK) adenylate kinase similarity, 679 biological functions, 679 tyrosine-50 mutation to phenylalanine in yeast enzyme, effects on catalysis and structure equilibrium unfolding, 682,685-686
hydrogen bonding, 680,686,688 nuclear magnetic resonance structural analysis, 681-682,685 substrate titration, 680-682 site-directed mutagenesis, 680, 686 steady-state kinetics, 680, 682,687 H Heat shock protein-70 (Hsp70) aspartate aminotransferase ligands activity assay, 484 ATPase stimulation by peptides, 488-490 binding sites, identification, 489,491 peptide competition assay, 485,487-488 peptide synthesis and purification, 483 protein purification, 482 refolding, 482-484 constitutive functions, 481 peptide binding conformational change induction, 490 specificity, 481 structure, 481 Helix-coil transition, laser temperature jump kinetics circular dichroism, 740-741 fluorescence, 740-742 peptides applications, 736-737 design, 737 fluorescence labeling, 737 Hemoglobin, modification site identification ascorbate modification, 399,401,403 dehydroascorbate modification, 399,401, 403 mass spectrometry of peptides, 400,405 oxygen modification, 399,401,403 sequence analysis, 400,406 tryptic mapping, 400-401,403 Heparin, basic fibroblast growth factor interactions studied by size exclusion chromatography with light scattering/ultraviolet absorbance/refractive index detection, 117-118 Heparin, complement stabilization in blood samples, 363,365 Hepatitis E vaccine, isoaspartate assay in recombinant 62-kDa antigen high-performance liquid chromatography, 342,344 kinetics of formation, 343
mass spectrometry, 342,348 principle, 341-342 sequencing, 342, 348 site isolation and characterization, 346-347 tryptic digestion, 342,344 Hepatitis E, vaccine candidate characterization cloning and expression, 47,54 digestion in gels, 49-50 mass spectrometry LC-MS, 49-50, 53 matrix assisted laser desorption/ionization mass spectrometry, 49 purification, 47 sequencing, 49 Hexane, see γ-Chymotrypsin Hirudin, thrombin binding energy minimization, 515 inhibition constant, temperature dependence, 518-519 molecular surface area calculation, 516 site-directed mutagenesis, 514 thermodynamics, 514-516,519-521 HIV-1, see Human immunodeficiency virus-1 H,K-ATPase, see Proton, potassium-ATPase Hsp70, see Heat shock protein-70 Human immunodeficiency virus-1 (HIV-1), see Nucleocapsid protein Humanized antibody disulfide bonds autocatalytic reduction cysteine to serine conversion of catalytic residue, 386,393 denaturing conditions, 391-392,395 N-ethylmaleimide alkylation, 386, 390,394 structure, 385 heterogeneity in gels, 386-387 matrix assisted laser desorption ionization-mass spectrometry, 386,389, 391-392 purification by protein-A chromatography, 386 sample preparation and heterogeneity, 388 Human serum albumin, Applied Biosystems Procise™ 494HS sequencing, 58-59 Hydrogen exchange, see also Nuclear magnetic resonance equilibrium constant of folding, 768 FK506-binding protein, amide deuterium exchange measured by mass spectrometry advantages over nuclear magnetic resonance, 703,712
exchange reaction, 704-705 folding/unfolding kinetics, analysis, 708-711 Fourier transform ion cyclotron mass spectrometry, 705,712 global exchange under denaturing conditions, 706-707 local variations examined by proteolysis, 705,707-708 protein preparation, 704 Myc-Max interface, amide deuterium exchange, 621 native state hydrogen exchange of ribonuclease H* distinguishing unfolding events from local fluctuations, 728-730, 732-733 global denaturation, 731 nuclear magnetic resonance, 731 principle, 727-730,733 protein preparation, 730-731 phospholipase A2, amino-terminal helix structure analysis, 626, 629-630 protection factors apparent stability constants per residue, 767-770 COREX algorithm, 774-775,779 denaturant dependence of stability constants, 771-772 folding probability analysis, 772-773 limiting exchange rate determination, 773 temperature dependence of stability constants, 770-771 Staphylococcal nuclease pattern of protection, 775-777 temperature dependence, 777 urea dependence, 778-779 theory of amide exchange, 728 Hydrogen peroxide, methionine oxidation in keratinocyte growth factor circular dichroism analysis, 301, 305-306 kinetics of oxidation, 302,305 proliferation bioassay of modified protein, 302,306,308 reaction conditions, 300 reagents, 300 sequencing, 301 tryptic peptide mapping, 300-302,304-305 Hydrogen tunneling, glycosylation effects in glucose oxidase, 311, 314-318 N-Hydroxysuccinimidyl palmitate, lysine palmitoylation in insulin, 290
I IA-CE, see Immunoaffinity capillary electrophoresis IAsys, combinatorial peptide library screening binding assay, 179-180 electrospray mass spectrometry, 180-181, 182 enzyme-linked immunosorbent assay, 181-183 high-performance liquid chromatography analysis, 181 library synthesis, 179 materials, 178 IDH, see Isocitrate dehydrogenase IHF, see Integration host factor IMDH, see Isopropylmalate dehydrogenase Immunoaffinity capillary electrophoresis (IA-CE) immunoglobulin analysis, 21 sample preconcentration, 20 Infrared spectrometry, see Liquid chromatography/Fourier transform infrared spectrometry Insulin Edman sequencing and nanoelectrospray mass spectrometry from capillary electrophoresis samples, 37-40, 43-46 iodomethane modification materials, 220 nuclear magnetic resonance analysis, 222 octane solution, 221-224 in vacuo, 224 surface hydrophobicity, effects on structure absorption spectroscopy of cobalt complexes, 293-294 circular dichroism analysis, 290-291, 293,295-296 denaturation profiles, 294 light scattering analysis of zinc-induced aggregation, 291,293,295 palmitoylation, 289-290 stabilization mechanisms, 295-296 Integrase antiviral therapy, 417 avian sarcoma virus catalytic domain structure, 418-419 metal binding site comparison with related enzymes, 421-423 magnesium, 419,421-423
manganese, 419,421,423 zinc, 419,421,423 X-ray diffraction data collection, 418 metal binding site, 417-419,421-423 Integration host factor (IHF), see DNA bend Interleukin-1β biotinylation of cysteine, 524,528 cloning and expression, 523-524 Fab fragment to epitope on receptor-binding site competition studies with receptor, 526-529 drug therapy rationale, 523 kinetics of binding, 525,528 preparation, 524,527-528 receptor binding, kinetic assay, 524-525 site-directed mutagenesis of receptor-binding site, 524,527 Iodomethane, lyophilized protein modification aqueous solution, 224-225,228 materials, 220 nuclear magnetic resonance analysis, 222, 225 octane solution, 221-227 in vacuo, 224 Isoaspartate assay in recombinant hepatitis E vaccine high-performance liquid chromatography, 342,344 kinetics of formation, 343 mass spectrometry, 342,348 principle, 341-342 sequencing, 342,348 site isolation and characterization, 346-347 tryptic digestion, 342,344 mechanism of formation, 341 Isocitrate dehydrogenase (IDH), nicotinamide adenine dinucleotide specificity, 810,815 Isopropylmalate dehydrogenase (IMDH) leucine biosynthesis, 809 nicotinamide adenine dinucleotide specificity alteration engineering of secondary structure, 812, 814-815 isocitrate dehydrogenase modeling, 810 kinetic analysis, 811,815 mutagenesis, 811-812 Isotope effect, see Kinetic isotope effect
JHE, see Juvenile hormone esterase Juvenile hormone esterase (JHE) biological function, 655 homology-based modeling of three-dimensional structure algorithms, 656,660 analysis of model, 664-665 assignment, structurally conserved regions and loops, 657,662-663 homolog protein selection, 656,665 refinement, 657-658 sequence alignment, 656,658-660 K Keratinocyte growth factor (KGF) function, 299 methionine oxidation with hydrogen peroxide circular dichroism analysis, 301,305-306 kinetics of oxidation, 302,305 proliferation bioassay of modified protein, 302,306,308 reaction conditions, 300 reagents, 300 sequencing, 301 tryptic peptide mapping, 300-302, 304-305 KGF, see Keratinocyte growth factor KIE, see Kinetic isotope effect Kinetic isotope effect (KIE), glycosylation effects in glucose oxidase, 314-318 Kirsten-ras (K-ras) activation in cancer, 837 expression and purification of recombinant protein, 837-840, 848 GTPase, 837 mass spectrometry data collection, 839 heterogeneity due to nucleotide dissociation, 844,848-849 posttranslational modification analysis, 844,848 N-terminal processing, identification with high-performance liquid chromatography, 839-840,844,848 Kit receptor family, 371 stem cell factor interactions studied by size exclusion chromatography with light scattering/ultraviolet absorbance/refractive index detection, 118 K-ras, see Kirsten-ras
L78K-TRX, see Thioredoxin β-Lactam bacterial resistance mechanisms, 827 fluorescence labeling, 470 β-Lactamase, see TEM-1 β-lactamase Lactate dehydrogenase, single molecule assay, 121 β-Lactoglobulin, Applied Biosystems Procise™ 494HS sequencing, 58-59,63-65,67 Laser-induced fluorescence detection, see Capillary electrophoresis LC/FTIR, see Liquid chromatography/Fourier transform infrared spectrometry LC/MS, see Liquid chromatography/mass spectrometry Leptin amino acid analysis, 156,161 isoform separation by reverse-phase high-performance liquid chromatography, 156-157 norleucine analysis in recombinant protein, 155,160,162 peptide mapping, 156,158 Leucine zipper, see Max-Myc dimer Light scattering extinction coefficient, determination by size exclusion chromatography with light scattering/ultraviolet absorbance/refractive index detection accuracy, 114,117,119 apparatus, 114-115 principle, 114 protein interaction analysis, 115, 117-118 standard curve, 116-117 orphan nuclear receptors, analysis of aggregation, 459-460,464-465 Linker region classification, 667 functions, 667 structure analysis clustering, 669,677 data base of linkers, 668-669 five-residue linkers, 676 four-residue linkers, 675-676 STRIDE, 667-668 three-residue linkers, 673-674 two-residue linkers, 670-673 Liquid chromatography/Fourier transform infrared spectrometry (LC/FT-IR) cytochrome-C, 166-169,171-172,175
particle beam interface, 166 structure elucidation, 166 Liquid chromatography/mass spectrometry (LC/MS), peptide mapping advantages, 165-166 cytochrome-C, 166-169,171,175 recombinant leptin, 156 LO, see Lysyl oxidase Lyophilized protein acetic anhydride acetylation, 222,227 effects on structure, 219-220 iodomethane modification aqueous solution, 224-225,228 materials, 220 nuclear magnetic resonance analysis, 222,225 octane solution, 221-227 in vacuo, 224 pH memory, 228-229 sequencing, 219 Lysine modification, see Acetic anhydride; N-Hydroxysuccinimidyl palmitate; Iodomethane Lysozyme carboxy-terminal domain core of T4 enzyme, mutagenesis activity effects, 857 mutagenesis, 854 site selection, 852-853 structural analysis, 857-859 thermal stability measurements, 856, 860-862 circular dichroism activity assay cell wall fragment substrate preparation, 855 data collection, 856 Edman sequencing and nanoelectrospray mass spectrometry from capillary electrophoresis samples, 37-40, 43-46 purification from inclusion bodies, 854-855 Lysyl oxidase (LO) active site peptide high-performance liquid chromatography, 354,356 labeling, 353,355 mass spectrometry, 354-355,357-359 sequencing, 356-357 thermolysin digestion, 353-354 catalytic reaction, 351 cofactor identification, 351,355,360 functions, 351 phenylhydrazine inactivation, 353 purification from bovine aorta, 352-353
M
Major histocompatibility complex (MHC) class I peptide role in immune system, 25 sequencing with membrane-preconcentration-capillary electrophoresis-tandem mass spectrometry cartridge assembly, 26-27 ionization source, 27-28 K^-derived peptides, 31-34 peptide isolation, 26,28,30 peptide recovery, 30 MALDI-MS, see Matrix assisted laser desorption/ionization mass spectrometry 2-(4'-Maleimidylanilino)naphthalene-6-sulfonic acid (MIANS), ricin modification absorbance measurements, 246,254 affinity purification of MIANS label, 248, 254 applications, 246 enzymatic digestion, 247-248 peptide mapping A-chain,249,251,254 B-chain attachment sites, 251,253-254 purification of blocked B-chain, 247 reaction conditions, 247,249 sequencing of peptide, 248-249 Mass spectrometry (MS), see also Liquid chromatography/mass spectrometry; Matrix assisted laser desorption/ionization mass spectrometry; Membrane-preconcentration-capillary electrophoresis-mass spectrometry combinatorial peptide library screening, 180-181,182 FK506-binding protein, amide deuterium exchange measurements advantages over nuclear magnetic resonance, 703,712 exchange reaction, 704-705 folding/unfolding kinetics, analysis, 708-711 Fourier transform ion cyclotron mass spectrometry, 705,712 global exchange under denaturing conditions, 706-707 local variations examined by proteolysis, 705,707-708 protein preparation, 704 hepatitis E vaccine candidate characterization, 49-50,53 Kirsten-ras data collection, 839
heterogeneity due to nucleotide dissociation, 844, 848-849 posttranslational modification analysis, 844, 848 nanoelectrospray technique in protein sequencing, 37-40,43-46 World Wide Web sites for protein analysis, 75 Matrix assisted laser desorption/ionization mass spectrometry (MALDI-MS) biomolecular interaction analysis interfacing myoglobin/antibody system, 495-497 principle, 494-495 sensitivity, 498 bioreactive probe tips advantages of technique, 498-500 endoprotease digestion of analytes, 493, 498-503 immobilization of enzymes, 498 myoglobin digestion, 500-503 humanized antibody, 386,389,391-392 membrane topology determination of proton, potassium-ATPase enriched microsomal vesicle preparation, 535 high-performance liquid chromatography, 535 mass spectrometry, 536-538 peptide assignment, 537-538 post-source decay analysis, 533-534, 538-539,541 principle, 533-534 protease selection, 541 tryptic digest, 535-536 membrane-blotted samples chemical cleavage on membrane, 146, 148,152 contaminated samples, 146,148,151 endopeptidase digestion on membrane, 146,151 instrumentation, 148 membrane materials, 145,152 principle of sample clean-up, 144-145 pure samples, 145-146 oligosaccharides containing 1-phenyl-3-methyl-5-pyrazolone label, 323, 326,328 sample purity requirements, 143-144 sequencing of in-gel digests, 79-90 Max-Myc dimer leucine zipper solid phase synthesis, 618 structural overview, 617-618 proton nuclear magnetic resonance of interface structure amide deuterium exchange, 621
asparagine side-chain interactions, 620-623 data collection, 618-619 secondary structure, 619-620 MECC, see Micellar electrokinetic capillary chromatography Membrane-preconcentration-capillary electrophoresis-mass spectrometry (mPC-CE-MS) disease diagnosis, 19 major histocompatibility complex class I peptide sequencing with tandem mass spectrometry cartridge assembly, 26-27 ionization source, 27-28 K^-derived peptides, 31-34 peptide isolation, 26,28,30 peptide recovery, 30 sample cleanup, 18-19,30,33 preconcentration, 18 Membrane protein topology approaches in determination, 533 matrix assisted laser desorption/ionization mass spectrometry, see Proton, potassium-ATPase Methionine norleucine similarity, 162 oxidation with hydrogen peroxide, 299-300,302,303,308 replacement with norleucine under stressed fermentation conditions, 155,161-162 MHC class I peptide, see Major histocompatibility complex class I peptide MIANS, see 2-(4'-Maleimidylanilino)naphthalene-6-sulfonic acid Micellar electrokinetic capillary chromatography (MECC) amino acid analysis, 5,7 protein sequencing, 4-5,7,9,13 mPC-CE-MS, see Membrane-preconcentration-capillary electrophoresis-mass spectrometry MS, see Mass spectrometry Myc, see Max-Myc dimer Myoglobin Edman sequencing and nanoelectrospray mass spectrometry from capillary electrophoresis samples, 37-40,43-46 matrix assisted laser desorption/ionization mass spectrometry biomolecular interaction analysis of myoglobin/antibody system, 495-497 bioreactive probe tips and endoprotease digestion, 500-503
N NC, see Nucleocapsid protein NEM, see N-Ethylmaleimide Neuromodulin, see Growth-associated protein-43 Nitrogen-15 incorporation assessment by mass spectrometry, 156-157, 430 metabolic labeling of proteins, 155, 429-430, 432-433, 439, 441 NMR, see Nuclear magnetic resonance NOESY, see Nuclear Overhauser effect spectroscopy Norleucine detection in proteins, 155, 160-161 methionine similarity, 162 replacement for methionine under stressed fermentation conditions, 155, 161-162 Nuclear magnetic resonance (NMR) apolipophorin-III, structure analysis assignment strategy, 433-434 data collection, 429-430 lipid binding, 434-437 mass spectrometry, 430 nitrogen-15 labeling of recombinant protein, 429-430, 432-433 cellular retinaldehyde-binding protein, retinoid binding studies assignment, 443 carbon-13 labeling, 439, 441, 443, 447 data collection, 440 fluorine-19 labeling, 439, 441, 443, 445 ligand-induced changes, 443, 445-447 mass spectrometry, 440-441 nitrogen-15 labeling, 439, 441 recombinant protein expression, 440 deuterated proteins, structure analysis aliphatic sidechain resonance assignment, 608 backbone resonance assignment, 608 deuteration of protein, 605, 607, 613 global fold determination, 609-613 metabolic labeling, 607-608 purification, 606 secondary structure determination, 609 growth-associated protein-43, phospholipid binding effects on proton resonance, 557, 560-562 guanylate kinase mutants structural analysis, 681-682, 685 substrate titration, 680-682 Max-Myc dimer, proton nuclear magnetic resonance of interface structure amide deuterium exchange, 621
asparagine side-chain interactions, 620-623 data collection, 618-619 secondary structure, 619-620 native state hydrogen exchange of ribonuclease H* distinguishing unfolding events from local fluctuations, 728-730, 732-733 global denaturation, 731 nuclear magnetic resonance, 731 principle, 727-730, 733 protein preparation, 730-731 phospholipase A2 amino-terminal helix structure determination chemical shift index, 629, 631 coupling constants, 628, 631 hydrogen exchange, 626, 629-630 nuclear Overhauser effect spectroscopy, 628, 631 protein expression and labeling, 626 resonance assignment, 626-627 ubiquitin dynamics studies carbon-13 enrichment, 716, 718-719 data collection, 716-717 methyl group dynamics in protein interior, 719-724 nitrogen-15 relaxation of backbone, 715-716 nuclear Overhauser effect spectroscopy, 719-720 relaxation data analysis, 717 resonance assignment, 717-718 side-chain conformational entropy on folding, 724 Nuclear Overhauser effect spectroscopy (NOESY), see Nuclear magnetic resonance Nuclear receptor, see Orphan nuclear receptor Nucleocapsid protein (NC) NC p7 of human immunodeficiency virus-1, modification N-ethylmaleimide, 233-235, 242 thiuram disulfides digestion, 233 fluorescence assay, 233 mechanism of reactions, 237-242 reaction conditions, 232-233 reactivity, 235-237, 242
reagents, 232 whole virus modification, 233-234, 241-243 zinc fingers, 231 O Optical rotatory dispersion (ORD), racemic peptide analysis, 878, 880 ORD, see Optical rotatory dispersion Orphan nuclear receptor binding assay, 460, 464, 466 circular dichroism, 458-459, 461, 465 differential scanning calorimetry, 459, 461, 464-465 ligand-binding domain structure, 452-453 light scattering analysis of aggregation, 459-460, 464-465 recombinant protein expression, 458, 460 sedimentation equilibrium analytical ultracentrifugation, 460, 464-466
P PAGE, see Polyacrylamide gel electrophoresis Penicillin-binding protein, in vivo morphogene protein binding assay cyanogen cross-linking, 472, 476 digestion of purified proteins, 472-473 fluorescence labeling bacterial cells, 470, 472-473 β-lactam, 470 high-performance liquid chromatography, 472-473, 478-479 limitations, 479 mass spectrometry, 473 principle, 469-470 specificity, 476-477 Peptide synthesis, see ω-Agatoxin-TK; β-Amyloid; Aspartate aminotransferase; Association of Biomolecular Resource Facilities; β-sheet peptides; Combinatorial peptide library; Max-Myc dimer Phenylglyoxal, phospholipase A2 modification activity effects, 272-275 calcium-binding effects, 272-274 circular dichroism analysis, 269, 273 identification of modified residues, 268-270, 272 reaction conditions, 268, 270
1-Phenyl-3-methyl-5-pyrazolone (PMP), oligosaccharide labeling in glycosylation profile analysis mass spectrometry, 323, 326, 328 reaction conditions, 322-323 reverse-phase high-performance liquid chromatography, 321-323, 326, 328 pH memory, see Lyophilized protein Phospholipase A2 (PLA2) amino-terminal helix structure determination by nuclear magnetic resonance chemical shift index, 629, 631 coupling constants, 628, 631 nuclear Overhauser effect spectroscopy, 628, 631 protein expression and labeling, 626 proton exchange, 626, 629-630 resonance assignment, 626-627 role in catalysis, 267-268, 626, 631 classification, 625 phenylglyoxal modification activity effects, 272-275 calcium-binding effects, 272-274 circular dichroism analysis, 269, 273 identification of modified residues, 268-270, 272 reaction conditions, 268, 270 substrate specificity, 267, 625 PLA2, see Phospholipase A2 PMP, see 1-Phenyl-3-methyl-5-pyrazolone Polyacrylamide gel electrophoresis (PAGE), in-gel digestion Hepatitis E vaccine candidate, 49-50 sequencing background, 81-82 blotted proteins, see sequencing Coomassie Blue interference, 84-85 data evaluation, see Association of Biomolecular Resource Facilities detergent interference, 81 digestion reaction, 79-80 kinetics of digestion, 86 matrix assisted laser desorption/ionization mass spectrometry, 80-81 molecular weight and protein loss, 83-85 optimization, 81-85 protein quantity estimation, 88-89 reverse phase HPLC, 80 sample preparation, 79 sensitivity, 79, 86-87 success rates, 86, 88 Protein folding, see Differential scanning calorimetry; Hydrogen exchange; Temperature jump
Proton, potassium-ATPase (H,K-ATPase) membrane topology determination by matrix assisted laser desorption/ionization mass spectrometry enriched microsomal vesicle preparation, 535 high-performance liquid chromatography, 535 mass spectrometry, 536-538 peptide assignment, 537-538 post-source decay analysis, 533-534, 538-539, 541 principle, 533-534 protease selection, 541 tryptic digest, 535-536 structure, 535
R RAG1 RING finger motif, 573, 577 V(D)J recombination role, 573 zinc-binding dimerization domain analytical ultracentrifugation, 575-576, 580-584 atomic absorption spectroscopy, 574, 577-578 circular dichroism, 574-575, 578-580 location in protein, 577, 584 metal exchange, 574, 578 purification, 574, 577 thermal stability analysis, 579 Random coil, see Linker region Ras, see Kirsten-ras Retinoid, see Cellular retinaldehyde-binding protein; Cellular retinoid-binding protein Ribonuclease A (RNase A) adenine binding to B1 subsite, fluorescence polarization analysis, 567-568, 570 base-binding subsites, 565-566 oligonucleotide ligand synthesis, 566-567 one-dimensional diffusion assay, 569 evidence, 570-571 experimental design, 568-569 phosphoryl-binding subsites, 565 Ribonuclease H (RNase H) assay, 411 biological functions, 409-410 magnesium dependence, 413-414 manganese activation, 411, 413
electron paramagnetic resonance binding assay, All-All inhibition, 414-415 mechanism of catalysis, 409-410, 414-415 metal binding site, 409 recombinant protein production, 410-411 Ribonuclease H* (RNase H*), native state hydrogen exchange distinguishing unfolding events from local fluctuations, 728-730, 732-733 global denaturation, 731 nuclear magnetic resonance, 731 principle, 727-730, 733 protein preparation, 730-731 Ricin B-chain blocking, 245-246 2-(4'-maleimidylanilino)naphthalene-6-sulfonic acid labeling of thiols absorbance measurements, 246, 254 affinity purification of MIANS label, 248, 254 applications, 246 enzymatic digestion, 247-248 peptide mapping A-chain, 249, 251, 254 B-chain attachment sites, 251, 253-254 purification of blocked B-chain, 247 reaction conditions, 247, 249 sequencing of peptide, 248-249 structure, 245 RNase A, see Ribonuclease A RNase H*, see Ribonuclease H* RNase H, see Ribonuclease H
S Salt-bridge crosslinking, see Cyanogen SCF, see Stem cell factor Secondary structure, see also Circular dichroism; Helix-coil transition engineering, see β-sheet peptides; Isopropylmalate dehydrogenase prediction algorithm evaluation accuracy measurement, 786-789, 791 multidimensional scaling, 787, 791, 793 protein database selection, 785 success rate, 793 X-ray structure comparison, 783 prediction methods, 784-785 Sequencing, proteins, see also Capillary electrophoresis blotted proteins from gels automated sequencing, 94-95 capillary liquid chromatography of peptides, 94-95
chemical cleavage, 92, 94 cysteine modification, 92 Edman degradation, 92 peptide mapping, 93 reagents, 91-92 recovery, 96-97 sample preparation, 92 data evaluation, see Association of Biomolecular Resource Facilities Edman degradation, 37, 39-40, 43-44, 57 gel digestions, see Polyacrylamide gel electrophoresis nanoelectrospray technique, 37-40, 43-46 sensitivity of methods, 3-4 sequencers, see Applied Biosystems Procise™ 494; Beckman LF 3600 Serine modification, see Diisopropylfluorophosphate D-serine, see ω-Agatoxin-TK Serine protease, see Chymotrypsin; Cytomegalovirus protease Site-directed mutagenesis carboxy-terminal domain core of T4 lysozyme activity effects, 857 mutagenesis, 854 site selection, 852-853 structural analysis, 857-859 thermal stability measurements, 856, 860-862 cysteine to serine conversion in autocatalytic disulfide reduction, 386, 393 feline immunodeficiency virus protease, X-ray crystallography of aspartate-to-asparagine mutant crystallization, 645 data collection, 645 mutation effects on structure of LP-149 complex, 647-649, 651-652 preparation of protein, 644 refinement, 647 guanylate kinase, tyrosine-50 to phenylalanine in yeast enzyme, effects on catalysis and structure equilibrium unfolding, 682, 685-686 hydrogen bonding, 680, 686, 688 nuclear magnetic resonance structural analysis, 681-682, 685 substrate titration, 680-682 site-directed mutagenesis, 680, 686 steady-state kinetics, 680, 682, 687 retinoid-binding leucine-121 in CRABP-II binding energy contribution, 450, 454 competitive binding assay, 450-453 mutagenesis, 450
nuclear magnetic resonance analysis of conformation, 452, 454 Size exclusion chromatography, light scattering/ultraviolet absorbance/refractive index detection and extinction coefficient determination accuracy, 114, 117, 119 apparatus, 114-115 principle, 114 protein interaction analysis, 115, 117-118 standard curve, 116-117 Staphylococcal nuclease, hydrogen exchange protection COREX algorithm in prediction, 774-775, 779 pattern of protection, 775-777 temperature dependence, 777 urea dependence, 778-779 Stem cell factor (SCF) disulfide bond structure dimer isolation, 372-374 folding intermediates, 371-372 gel electrophoresis and blotting, 373 high-performance liquid chromatography analysis, 372, 374-375 hydrogen peroxide oxidation and cyanogen bromide cleavage, 373, 377-379 Lys-C proteolysis, 373, 376-377 partial reduction reaction, 373, 379-380 possible structures, 375-376, 382 quaternary structure effects, 380-382 glycosylation, 371 Kit interactions studied by size exclusion chromatography with light scattering/ultraviolet absorbance/refractive index detection, 118 STRIDE, see Linker region Substance-P, Edman sequencing and nanoelectrospray mass spectrometry from capillary electrophoresis samples, 37-40, 43-46
T TCEP, see Tris-(2-carboxyethyl)phosphine TEM-1 β-lactamase plasmid transfer, 827 random mutagenesis compensating mutations for asparagine-76 defect, 833, 835 correlation with sequence homology of conserved residues, 831, 833 identification of essential residues, 827-828, 831, 833 library construction, 828
randomization procedure, 828-829, 831, 835 screening, 828-829, 831 Temperature jump, laser instrument advantages, 735 design, 737-739 helix-coil transition kinetics circular dichroism, 740-741 fluorescence, 740-742 peptides applications, 736-737 design, 737 fluorescence labeling, 737 sources, 736 TF1, see DNA bend Thermo-optical absorbance detection (TOAD), PTH amino acid detection in sequencing, 4-5, 7, 9, 13 Thioredoxin, structural determination of L78K-TRX mutant using nuclear magnetic resonance aliphatic sidechain resonance assignment, 608 backbone resonance assignment, 608 deuteration of protein, 605, 607, 613 global fold determination, 609-613 metabolic labeling, 607-608 purification, 606 secondary structure determination, 609 Thiuram disulfides NC p7 of human immunodeficiency virus-1, modification digestion, 233 fluorescence assay, 233 mechanism of reactions, 237-242 reaction conditions, 232-233 reactivity, 235-237, 242 reagents, 232 whole virus modification, 233-234, 241-243 structures, 232, 236 Thrombin activity assay, 515 bivalent inhibitors structure, 513-514 synthesis, 514 hirudin binding energy minimization, 515 inhibition constant, temperature dependence, 518-519 molecular surface area calculation, 516 site-directed mutagenesis, 514 thermodynamics, 514-516, 519-521 temperature dependence maximal velocity, 517-518 Michaelis constant, 517-518
TOAD, see Thermo-optical absorbance detection Transferrin, sequencing of blotted proteins from gels automated sequencing, 94-95 capillary liquid chromatography of peptides, 94-95 chemical cleavage, 92, 94 cysteine modification, 92 Edman degradation, 92 peptide mapping, 93 reagents, 91-92 recovery, 96-97 sample preparation, 92 Transition-state theory antifluorescein binding affinity constants, 508 antibody production, 506 limitations, 510 peptide carriers, 506-507 thermodynamic parameters, 508-510 transmission coefficients, 509-510 unimolecular rate constant determination, 507-508 thrombin binding of hirudin binding energy minimization, 515 inhibition constant, temperature dependence, 518-519 molecular surface area calculation, 516 site-directed mutagenesis, 514 thermodynamics, 513-516, 519-521 Tris-(2-carboxyethyl)phosphine (TCEP), glial cell line-derived neurotrophic factor modification alkylation of partially reduced protein, 279, 283 complete reduction and pyridylethylation, 279 intermolecular disulfide identification, 283, 286 mass spectrometry, 280, 283 materials, 278 partial reduction of protein, 278-280, 282 sequence analysis, 279, 283 tryptic digestion, 279, 283 Trypsin, Edman sequencing and nanoelectrospray mass spectrometry from capillary electrophoresis samples, 37-40, 43-46 U Ubiquitin, dynamics studies with nuclear magnetic resonance carbon-13 enrichment, 716, 718-719 data collection, 716-717
methyl group dynamics in protein interior, 719-724 nitrogen-15 relaxation of backbone, 715-716 nuclear Overhauser effect spectroscopy, 719-720 relaxation data analysis, 717 resonance assignment, 717-718 side-chain conformational entropy on folding, 724 Ultracentrifugation, analytical calculation axial ratio, 576 Dapp, 576, 582 molecular weight, 576, 581-582 s%^, 575-576, 581 Sapp, 576, 582 shape factor, 576, 583 orphan nuclear receptors, 460, 464-466 zinc-binding dimerization domain of RAG1, 575-576, 580-584
X X-ray crystallography apolipophorin-III, 427 complementarity determining regions, 755 feline immunodeficiency virus protease, aspartate-to-asparagine mutant crystallization, 645 data collection, 645 mutation effects on structure of LP-149 complex, 647-649, 651-652 preparation of protein, 644 refinement, 647 glycosylation-inhibiting factor crystallization, 635 data collection, 634 multiple isomorphous replacement, 634, 636-637 phase improvement, 637-638 refinement, 634-635, 638 trimer structure, 638-640 integrase, avian sarcoma virus catalytic domain structure, 418-419 data collection, 418 metal binding site comparison with related enzymes, 421-423 magnesium, 419, 421-423 manganese, 419, 421, 423 zinc, 419, 421, 423
Z Zinc-binding domain, see RAG1
Figure 3. Structure of IL-1β illustrating the location of receptor binding site A (residues 30, 32) and site B (residues 4, 6, 46, 56, 93, 103, 105). Site B residues R4 and L6 were mutated to alanine in mutant /. The position of the K138C mutation is also indicated. Coordinates were taken from Protein Data Bank entry 2I1B (16).