ADVANCES IN PROTEIN CHEMISTRY Volume 39
ADVANCES IN PROTEIN CHEMISTRY
EDITED BY
C. B. ANFINSEN
Department of Biology, The Johns Hopkins University, Baltimore, Maryland
JOHN T. EDSALL
Department of Biochemistry and Molecular Biology, Harvard University, Cambridge, Massachusetts
FREDERIC M. RICHARDS
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut
DAVID S. EISENBERG
Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California
VOLUME 39
ACADEMIC PRESS, INC. Harcourt Brace Jovanovich, Publishers
San Diego New York Berkeley Boston London Sydney Tokyo Toronto
COPYRIGHT © 1988 BY ACADEMIC PRESS, INC.
ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC. San Diego, California 92101
United Kingdom Edition published by ACADEMIC PRESS LIMITED 24-28 Oval Road, London NW1 7DX
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 44-8853
ISBN 0-12-034239-1 (alk. paper)
PRINTED IN THE UNITED STATES OF AMERICA
88 89 90 91 9 8 7 6 5 4 3 2 1
CONTENTS

PREFACE  vii

Basement Membrane Proteins: Molecular Structure and Function
GEORGE R. MARTIN, RUPERT TIMPL, AND KLAUS KÜHN
I. Introduction  1
II. Morphology and Ultrastructure  3
III. Components  6
IV. Self-Assembly and Interaction between Components  36
V. Biological Aspects  40
References  43

Design of Peptides and Proteins
WILLIAM F. DEGRADO
I. Introduction  51
II. Design of Small Peptides  53
III. Design of Medium-Sized Peptides  68
IV. Protein Design  101
References  118

Weakly Polar Interactions in Proteins
S. K. BURLEY AND G. A. PETSKO
I. Introduction  125
II. Electrostatic Interactions in Proteins  126
III. Weakly Polar Interactions in Proteins  152
IV. Interactions: A Summary  176
V. Hydrophobic Interactions in Proteins  177
VI. Discussion  181
VII. Conclusion  183
References  186

Stability of Protein Structure and Hydrophobic Interaction
PETER L. PRIVALOV AND STANLEY J. GILL
I. Introduction  193
II. Calorimetric Studies of Protein Denaturation  194
III. Studies of Dissolution of Nonpolar Substances into Water  207
IV. Hydration of Nonpolar Molecules  217
V. Comparison of Results on Protein Denaturation and Hydrocarbon Dissolution in Water  225
VI. Mechanism of Stabilization of Compact Protein Structures  228
References  231

Abstract of a Review on Chemistry of Peanut Proteins
R. BHUSHAN, G. P. REDDY, AND K. R. N. REDDY
Abstract  235
References  238

AUTHOR INDEX  239
SUBJECT INDEX  255
PREFACE
We are happy to announce that David S. Eisenberg has agreed to join us as an editor of Advances in Protein Chemistry. He has been concerned with protein chemistry since he worked with one of us on a laboratory thesis during his undergraduate work at Harvard. As a Rhodes scholar at Oxford, working with C. A. Coulson, he broadened his knowledge of fundamental physical chemistry. Returning to Princeton, he wrote, with Walter Kauzmann, an influential book on water and its properties. Since then, in his years at Cal Tech and especially at the University of California at Los Angeles, he has become a leader in many aspects of protein chemistry. As an X-ray crystallographer he has elucidated the structures of two complicated and important enzymes: glutamine synthetase and ribulose-bisphosphate carboxylase. He has been deeply involved in the study of hydrophobic interactions, and his formulation of the concept of hydrophobic moment and its application to peptides and proteins of diverse functions have been illuminating. He has been involved in the design of new peptides and proteins with specific conformations and functions, a topic of one of the articles in this volume. He has served as the first president of the recently formed Protein Society. Although he had no part in the planning of this volume, he has already, in discussions with us, done extensive planning for possible authors and subjects for future volumes.

The first article in this volume, by George R. Martin, Rupert Timpl, and Klaus Kühn, deals with the biologically important subject of basement membrane proteins. These membranes, which underlie most epithelia and are also involved in the structure of nerve, muscle, and fat cells, are thin sheets with complex structures. They involve such unusual proteins as type IV collagen, the large glycoprotein laminin, the heparan sulfate proteoglycans, calcium-binding proteins, and others. In addition to dealing with these individual constituents, Martin et al. consider the structure and organization of the basement membrane system as a whole.

William F. DeGrado treats the problems of designing new peptides and proteins to extend and deepen our understanding of the molecules found in nature and to determine the structural and conformational basis of biologically significant activities. Sometimes the synthetic peptides surpass, in some specific activity, the natural molecules to which they are related. The importance of such studies for the future of physiology, pharmacology, and medicine is obvious. Important applications to the structure and activity of natural proteins have already been made, and much greater ones are certain to follow in the future.

S. K. Burley and G. A. Petsko cover the field of noncovalent interactions of proteins, with particular emphasis on weakly polar interactions. Their presentation of the whole field of electrostatic interactions should be of value to many workers in protein chemistry, but their special concern is with the weaker, but very important, interactions involving aromatic side chains, their orientation relative to one another, to oxygen and sulfur atoms, to amino groups, and to aromatic ligands that may bind to the protein. These interactions, only recently recognized for their influence on protein structure, play an important part in the formation of aromatic clusters in the interior of globular proteins and in other features of structure. The authors provide numerous illustrations of the principles involved, from recently determined structures, of both small molecules and proteins.

In the last full review article, Peter L. Privalov and Stanley J. Gill propose a drastic revision of our concepts relating to the role of hydrophobic interactions in protein structure and stability. Recent studies of reversible heat denaturation of proteins and of the interaction of small nonpolar molecules with water, over a wide range of temperature, appear to compel a reconsideration of the role of entropic factors in protein structure and stability. Privalov and Gill conclude that the maintenance of the compact folded state in native proteins is primarily due to hydrogen bonding and to van der Waals interactions between the nonpolar side chains in the protein interior. They infer that water solvation of nonpolar groups actually destabilizes the compact folded state. This destabilizing action increases with decreasing temperature, leading to the phenomenon of cold denaturation. This article is likely to arouse controversy, but the conclusions drawn are based on much recent experimental evidence, which potential critics must assimilate before attempting to draw different conclusions.

The final brief report on peanut proteins, a subject treated by J. C. Arthur in Volume 8 of Advances in Protein Chemistry, is an abstract of a more extensive review by R. Bhushan, G. P. Reddy, and K. R. N. Reddy, which appeared in a new Indian journal. Since this journal is apparently not covered at present by any abstracting service, the authors have prepared, at our request, an abstract of the longer review so that workers interested in the subject may become aware of a study that they might otherwise miss. In the future, we may consider including other such abstracts to call attention to a relevant article that might otherwise be "lost."

C. B. ANFINSEN
JOHN T. EDSALL
FREDERIC M. RICHARDS
BASEMENT MEMBRANE PROTEINS: MOLECULAR STRUCTURE AND FUNCTION
By GEORGE R. MARTIN,*,¹ RUPERT TIMPL,† and KLAUS KÜHN†
*Laboratory of Developmental Biology and Anomalies, National Institute of Dental Research, National Institutes of Health, Bethesda, Maryland 20892
†Max-Planck-Institut für Biochemie, D-8033 Martinsried bei München, Federal Republic of Germany

I. Introduction  1
II. Morphology and Ultrastructure  3
III. Components  6
  A. Type IV Collagen  7
  B. Laminin  21
  C. Nidogen/Entactin  26
  D. Heparan Sulfate Proteoglycans  28
  E. Calcium-Binding Proteins  32
  F. Other Basement Membrane Components  33
IV. Self-Assembly and Interaction between Components  36
  A. Laminin  38
  B. Multiple Interactions  38
V. Biological Aspects  40
  A. Structural Functions  40
  B. Cellular Receptors for Collagen IV and Laminin  41
  C. Turnover and Degradation  42
References  43
I. INTRODUCTION

Basement membranes, first described over 100 years ago, are the principal extracellular matrices for most epithelial cells and for peripheral nerve, muscle, and fat cells. Their presence at the basal surface of epithelial tissues gave rise to the term basement membrane. These structures have been the subject of many histological studies, in part because they are very prominent after staining with the periodic acid-Schiff reagent. Such studies suggest that basement membranes are rather homogeneous sheets, perhaps contiguous with the cell surface itself. However, this is incorrect, since basement membranes are very thin extracellular matrices (50-250 nm) whose structural diversity cannot be resolved by light microscopy (Martinez-Hernandez and Amenta, 1983). Chemical studies on most basement membranes are also difficult, since the basement membrane is not readily separated from adjacent connective tissue or from cellular elements. Careful dissection or physical separations involving sieving, sonication, and flotation are often utilized to obtain small amounts of purified material from normal tissues (Krakower and Greenspon, 1978). Analyses of material isolated in this way demonstrated the presence of amino acids such as hydroxyproline and hydroxylysine, which are relatively specific markers for collagen (Kefalides, 1973). However, ultrastructural studies did not show the presence of typical collagen fibers. Recent studies have shown that all basement membranes contain a unique collagen arranged in an unusual network structure as well as certain glycoproteins and proteoglycans that occur only in basement membranes (Laurie et al., 1983). The cooccurrence of these proteins in basement membranes is due both to their synthesis by the cells that lie along the basement membrane and to strong and specific interactions that maintain the components together in situ. Also, the cells on basement membranes have specific receptors for these components (Yamada, 1983; von der Mark and Kühl, 1985; Kleinman et al., 1985; Martin and Timpl, 1987).

Basement membrane is the first extracellular matrix produced during development, and it is required for the initial events in morphogenesis (Leivo, 1983). Various components of basement membranes have been found to influence cellular adhesion, migration, growth, and differentiation. Basement membranes also provide a physical support for a wide variety of cells, create barriers that maintain tissue structure, and form a molecular filter that prevents the passage of proteins (Farquhar, 1981) in kidney glomeruli, in capillaries, and in other sites.

The structure and function of basement membranes are altered in certain diseases. Grossly thickened basement membranes are observed in diabetic individuals, and these thickened basement membranes are believed to induce the degeneration of ocular, vascular, and renal tissues in long-term diabetes (Brownlee and Cerami, 1981). Further, the ability of tumor cells to penetrate basement membranes signals the transformation of such cells from benign to malignant status (Liotta et al., 1986). Such observations have made this a major research front; more than 5000 publications on basement membranes were retrieved by a computer search of the literature from 1979 to 1986.

In recent years, many excellent reviews and books have been published on basement membranes (Kefalides, 1973, 1978; Kefalides et al., 1979; Heathcote and Grant, 1981; Kühn et al., 1982; Timpl and Martin, 1982; Martinez-Hernandez and Amenta, 1983; Porter and Whelan, 1984; Shibata, 1985; Timpl and Dziadek, 1986). Our review will not attempt to be inclusive but will deal with a description of basement membrane components, their structures, and the interactions that relate to the formation and maintenance of this matrix.

¹ Portions of this article were prepared at the Max-Planck-Institut while G. R. Martin was the recipient of an Alexander von Humboldt award.

II. MORPHOLOGY AND ULTRASTRUCTURE
Basement membranes show characteristic arrangements in tissues. They lie at the base of various epithelial tissues, such as the epidermis and glands, and of endothelial tissues, thereby separating the cells of these tissues from the adjacent stromal connective tissue. This arrangement creates a distinct polarity, with nutrient exchange occurring at the basal surface of the cells and secretion of cellular products occurring at their apical surface. Other cells, including Schwann, cardiac, striated and smooth muscle, and fat cells, are completely encircled by a basement membrane. The basement membrane in such sites can show extensive convolutions, as at neuromuscular junctions where multiple interdigitations occur in the muscle. In the glomerulus, lung, and some other sites, the basement membranes adjacent to endothelial and epithelial cells are fused.

A scanning electron micrograph of a section of human skin is shown in Fig. 1A. In this case, the papillary dermis with attached basement membrane has separated from the epidermis. The basement membrane is seen as a smooth continuous sheet. A transmission electron micrograph of a section cut through a part of the epidermis and dermis is shown in Fig. 1B and C. In general, three distinct zones² have been delineated in such basement membranes: an electron-lucent zone next to the epithelial cells (the lamina lucida), an electron-dense zone (the lamina densa), and the reticular layer (the lamina fibroreticularis), which is an incompletely delineated zone continuous with and also containing connective tissue fibers. In a few basement membranes such as Reichert's membrane (a fetal basement membrane) and lens capsule, there may be little or no lamina lucida. In tissues where epithelial and endothelial cells are directly apposed, as in glomerular and alveolar basement membrane, the two fuse to form a single, thickened basement membrane.
There is little to note in the way of structures in the lamina lucida with the exception of strands of material that extend from the lamina densa to the cell surface and appear to be extensions of the lamina densa (Laurie et al., 1984). These and other data (Goldberg and Escaig-Haye, 1986) indicate that the lamina lucida could be a fixation artifact.

² Diverse terms have been used to describe the zones in basement membranes, including basal lamina and lamina rara. The terminology used here follows the recommendation of the International Anatomical Nomenclature Committee (Laurie and Leblond, 1985).
FIG. 1. (A) Scanning electron micrograph of human skin. The epidermis has pulled away from part of the basement membrane. (B and C) Transmission electron micrograph through the epidermal-dermal junction of human skin. Keratinocytes (KF) are the cells in the human epidermis. LD, The lamina densa of the basement membrane; LL, the lamina lucida. Typical anchoring fibrils (AF) formed from type VII collagen are shown at higher power in C. Courtesy of Dr. K. Holbrook, University of Washington.
The lamina densa is a well-delineated, highly stained zone whose presence is the sine qua non of a basement membrane. Many investigators use the term basement membrane to refer to this sheet-like structure, particularly where the presence of unique basement membrane components, such as collagen IV and laminin, can be established with specific antibodies. At higher magnification, the lamina densa can be seen to be a three-dimensional network of 3- to 8-nm cords (Fig. 2). In addition, in such preparations one can usually see tubule-like structures (basotubules) as well as small parallel rods known as double pegs (Inoue et al., 1983; Laurie et al., 1984).

The reticular layer of basement membrane in a tissue such as the skin contains a variety of matrix structures. Strands of basement membrane may project down to type I collagen fibers. Anchoring fibers, banded fibers composed of type VII collagen, extend into the basement membrane and help to fix it in place (Burgeson et al., 1985; Sakai et al., 1986). Fibers of various other collagens may also be in close association with the lamina densa. The absence of certain proteins from these zones, or the deposition of antibodies against components of these regions, occurs in certain disorders and induces blistering due to a loss of firm connections (Patel et al., 1983; Goldsmith and Briggaman, 1983; Fine et al., 1984).

FIG. 2. A high-power view of the basement membrane zone. The lamina densa (D) is composed of poorly delineated cord structures that vary from 3 to 8 nm in diameter. Strands from the lamina densa (arrow) are seen to cross the lamina lucida (L) to the surface of the epidermal cells (Ep). Careful examination of the lamina densa has shown the presence of 8-nm hollow tubes (see circled structure in the lower right quadrant): the basotubules. These are formed of pentagonal units stacked one over the other and are believed to contain amyloid P. The insets at the lower left of the figure show double peg structures (arrows) that occur throughout basement membranes. Courtesy of Dr. S. Inoue and Dr. C. P. Leblond, McGill University.

III. COMPONENTS
Those components common to all basement membranes and believed to be integral to them rather than adventitiously associated with them are listed in Table I. They include collagen IV, laminin, heparan sulfate proteoglycan, and nidogen (entactin). Electron micrographs of collagen IV, laminin, and nidogen are shown in Fig. 3. Other components are known to be present in certain basement membranes, and they are presumed to be unique contributions from the resident cells. These include acetylcholinesterase in neuromuscular junctions and the bullous pemphigoid antigen in the epidermal basement membrane. Other constituents, such as a small calcium-binding glycoprotein (osteonectin/BM-40/SPARC), are apparently not restricted to basement membranes but are also found in both mineralizing and nonmineralizing extracellular matrices. Still other components may arise in distant sites, be transported through the blood, and lodge in the basement membrane due to specific binding or to mechanical trapping. These latter substances may include fibronectin, C1q bound to antigen-antibody complexes, and albumin.

TABLE I
Intrinsic Basement Membrane Components

Component                       Chain structure                                          Major function
Collagen IV                     α1(IV), α2(IV)                                           Structural support
Laminin                         B1, B2, A                                                Cell attachment
Heparan sulfate proteoglycans   Low density (Mr 500-650K); 3-4 heparan sulfate chains    Filtration
                                High density (Mr 130-250K); 4 heparan sulfate chains     Filtration
Nidogen/entactin                Single chain (Mr 150K)                                   Laminin binding
Laminin receptor                Single chain (Mr 67K)                                    Laminin binding
BM-40/osteonectin/SPARC         Single chain (Mr 35K)                                    Calcium binding

A. Type IV Collagen

Analyses of highly purified preparations of basement membrane showed the presence of amino acids, particularly hydroxyproline and hydroxylysine, that are characteristic of collagens (Kefalides, 1973), and the collagen content of various basement membranes has been estimated as 50% or higher on the basis of amino acid analyses. Attempts were made to extract collagen from basement membrane, but it proved difficult to isolate and to characterize due to a low solubility in ordinary solvents and the very limited amounts of basement membrane available. A major advance in this work was the solubilization of a collagenous protein from anterior lens capsule by digestion with pepsin, a procedure that produced a protein with α-like chains somewhat larger than the chains of interstitial collagens (Kefalides, 1973; Dehm and Kefalides, 1978). Somewhat earlier it had been discovered that there were genetically distinct collagens, often
FIG. 3. Electron micrographs of type IV collagen (a), laminin (b), and nidogen (c) molecules after rotary shadowing. Bar in a-c, 50 nm.
with a preferential distribution among the tissues (see Table II for an updated list). Since the collagen from basement membrane had a distinctive cyanogen bromide peptide pattern, Kefalides (1972a, 1973) concluded that basement membrane contained a distinctive collagen, which he designated type IV collagen, since three other collagens had been described and enumerated. He also proposed that the molecule was triple helical, since it resisted protease digestion and was composed of three identical chains. This model assumed that pepsin had cleaved some nonhelical
TABLE II
Types of Collagenous Proteins

Type   Helix (nm)   Tissue form                                            Characteristics, special function, and distribution
I      300          Large, 67-nm banded fibers                             Abundant; structural; skin, bone, tendon, etc.
II     300          Small, 67-nm banded fibers                             Abundant; structural; cartilage
III    300          Small, 67-nm banded fibers                             Abundant; reticulin-like; blood vessels, internal organs
IV     390          Nonfibrillar network                                   Basement membrane specific; structural
V      300          Small fibers                                           Pericellular; in most interstitial tissues
VI     105          Microfibrils, zebra bodies, 105-nm banded tactoids     Links constituents of interstitial tissues
VII    450          Short, banded straps                                   Anchoring fibers for basement membranes
VIII   -            Unknown                                                Some endothelial cells
IX     200          Unknown                                                Throughout cartilage; contains chondroitin sulfate
X      150          Unknown                                                Hypertrophic and mineralizing cartilage
XI     300          Unknown                                                Cartilage
regions of the protein and left the main triple helix intact. However, these results, i.e., the isolation of a collagen with three identical chains, were not readily reproduced and were controversial, since multiple collagenous components could be isolated from glomerular basement membrane using essentially identical conditions (Daniels and Chu, 1975). In fact, it was found later that the structure of the basement membrane collagen was quite different from that of the fiber-forming collagens (Table II).

In retrospect, much of the problem in isolating basement membrane collagen was due to the methods applied. The mechanical sieving and sonication used in isolating basement membranes, as well as the pepsin used to solubilize basement membrane collagen, caused extensive degradation of the protein and created artifactual heterogeneity (Heathcote and Grant, 1981; Timpl and Martin, 1982). Model systems, including tumors and cultured cells, proved to be more suitable sources of intact basement membrane proteins. A transplantable mouse tumor, the EHS tumor,³ was found to produce a matrix composed solely of basement membrane components that were readily extracted and served as an abundant source of these proteins (Orkin et al., 1977). By light microscopy, the matrix of the EHS tumor appears quite homogeneous, although areas with a repetitive lamellar structure are readily discerned by electron microscopy (Fig. 4). Comparable rat tumors have been identified (Martinez-Hernandez et al., 1982; Engvall et al., 1983). Other tumors, derived from parietal yolk sac (PYS cells), had been found even earlier to produce basement membrane, but these tumors grow only in less common strains of mice, are toxic for the host, and produce less tissue when transplanted (Pierce et al., 1982). However, the parietal yolk sac cells grow well in culture and continue to produce collagen IV, laminin, and other basement membrane components and are frequently used for biosynthetic studies on basement membrane (Oohira et al., 1982; Oberbäumer et al., 1982; Leivo et al., 1982). Certain teratocarcinoma cell lines, such as the F9 line of mouse tumor cells, are also widely used, since retinoic acid plus cyclic AMP induces their differentiation to parietal yolk sac cells with a concomitant transcription of the genes for basement membrane proteins (Strickland et al., 1980; Howe and Solter, 1980; Prehm et al., 1982; Carlin et al., 1983; Durkin et al., 1986). The fact that the molecules produced in such model systems are comparable to those occurring in normal tissues is usually inferred from the reaction of antibodies prepared to the tumor proteins with tissue basement membranes and not by more direct methods.

³ This tumor was discovered by Professor Engelbreth-Holm in Denmark and extensively studied by Dr. Richard Swarm at the National Cancer Institute in Bethesda, Maryland. The EHS prefix is derived from the initials of these two investigators.

FIG. 4. Transmission electron micrograph through a portion of the EHS tumor. A cell is seen at the bottom of the figure. At a distance from the cell (d), the basement membrane is seen to take on a laminated appearance (arrows), which is not apparent in the matrix proximal (p) to the cell. Courtesy of Dr. S. Inoue and Dr. C. P. Leblond, McGill University.

1. Molecular Structure

The collagen IV molecule contains two distinct chains, α1(IV) and α2(IV), which form heterotrimers with the composition [α1(IV)]2α2(IV). The conclusion that most collagen IV molecules contain two distinct types of chains and are heteropolymers is based on the ability of monoclonal antibodies to react with native molecules containing both chains (Odermatt et al., 1984; Mayne et al., 1984), on the finding that the two chains comigrate when the native protein is chromatographed on ion-exchange columns (Mayne and Zettergren, 1980; Trueb et al., 1982), and on the identification of a trimeric, cyanogen-bromide peptide from a region approximately 70 nm from the 7 S domain in which fragments of two α1(IV) chains and one α2(IV) chain are cross-linked by disulfide bonds (Dieringer et al., 1985). However, some molecules may also exist as homopolymers (Gehron-Robey and Martin, 1981; Timpl et al., 1979a; Haralson et al., 1985). Collagen IV molecules have been visualized by electron microscopy after spreading on cleaved mica and rotary shadowing (Oberbäumer et al., 1982; Bächinger et al., 1982). A typical molecule appears as a long strand terminating in a distinct globule (Fig. 3a). Several major domains (Fig. 5) have been defined in the molecule. These include NC1, a terminal
FIG. 5. Schematic representation of the collagen IV molecule, which consists of two α1(IV) chains and one α2(IV) chain. The non-triple-helical interruptions of the triple helix are indicated by black bars. The cysteine residues (C) and lysine or hydroxylysine (K) residues putatively involved in intra- or intermolecular bonds are shown. CHO designates an N-glycosidically bound oligosaccharide chain. The subscript numerals indicate the number of residues in a distinct region, summarized for all three α-chains. P designates a main pepsin cleavage site. In interruption 13, the α2(IV) chain forms a 21-residue-long loop, stabilized by an interchain disulfide bridge. NC1, Noncollagenous domain 1 (carboxyl-terminal globule); TH, triple-helical domain (360 nm); 7 S, amino-terminal domain (30 nm).
globule at the carboxyl end. The main triple-helical domain extends from the NC1 globule to about 30 nm from the N-terminal portion of the molecule. The amino-terminal domain that participates in intramolecular cross-linking is termed the 7 S domain (Kühn et al., 1981; Timpl et al., 1981; Mayne et al., 1982). Additionally, a second noncollagenous domain (NC2) has been inferred to occur some 90 nm from the amino terminus of the molecule (Timpl et al., 1981), although it may actually represent a region with unusual resistance to collagenase. Current information indicates that three chains extend in register the entire length of the molecule and form the helical domains (390 nm versus 300 nm in the interstitial collagens). A similar alignment of chains is observed in interstitial procollagens (types I-III), but these have noncollagenous domains that occur at both ends of the molecule and are removed enzymatically prior to deposition of these collagens into fibers. No such processing is observed with collagen IV molecules (reviewed in Timpl and Dziadek, 1986). In addition, collagen IV molecules are considerably more flexible than collagens I, II, and III (Hofmann et al., 1984). Some sequence studies, discussed below, indicate that the greater flexibility of the molecules and their susceptibility to proteases are due to the presence of many nonhelical sequences within the major helical domain (Fig. 5). The helices of interstitial collagens have repetitive units that are about 234 residues long and that regulate the lateral alignment of molecules and allow the molecules to assemble into ordered fibrils. Such repetitive units do not occur in collagen IV, a situation that may account in part for the inability of these molecules to form such fibrils, although current models for the structure of collagen IV molecules in the matrix suggest that some lateral associations occur (Yurchenco and Furthmayr, 1984).
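The nonhelical sequences mentioned above are stretches that break the repeating Gly-X-Y triplet pattern on which a collagen triple helix depends. As a minimal sketch of how such interruptions might be located in a chain sequence, the Python function below scans a one-letter amino acid string with a simple greedy rule. The function name `find_interruptions`, the greedy rule itself, and the short sequences used to exercise it are illustrative assumptions, not a method or data taken from the studies cited here.

```python
def find_interruptions(seq: str) -> list[tuple[int, int]]:
    """Locate breaks in a Gly-X-Y repeat (hypothetical helper, greedy rule).

    seq is a one-letter amino acid string whose first residue is assumed
    to be the glycine of a triplet. Returns (start, length) pairs, with
    0-based start positions, for each run of residues that does not fit
    the Gly-X-Y frame.
    """
    interruptions = []
    i = 0
    n = len(seq)
    while i < n:
        if seq[i] == "G":
            # A glycine opens a triplet: skip Gly, X, and Y together.
            i += 3
            continue
        # No glycine where one is expected: collect residues until the
        # next glycine, which is taken to restart the triplet frame.
        start = i
        while i < n and seq[i] != "G":
            i += 1
        interruptions.append((start, i - start))
    return interruptions


# Made-up test sequences (not real collagen IV sequence data):
one_extra_residue = find_interruptions("GPPGPPAGPPGPP")  # one extra residue between triplets
missing_glycine = find_interruptions("GPPPPGPP")         # a triplet that has lost its glycine
```

Because the rule is greedy, a glycine that happens to sit inside an interruption would be read as the start of a new triplet, so real sequence analysis would need a more careful alignment than this sketch provides.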
The complete amino acid sequences of the human α1(IV) and α2(IV) chains have been obtained (Babel and Glanville, 1984; Pihlajaniemi et al., 1985; Brinker et al., 1985; Glanville et al., 1985; Brazel et al., 1987; Siebold et al., 1987; K. Kühn, unpublished). Including a signal peptide, the α1(IV) chain comprises 1669 amino acids and the α2(IV) chain has about 1712 amino acids. The carboxyl ends of both chains are formed by the globular NC1 domains, which occupy 229 amino acid residues that are homologous to one another but not, for example, to the carboxyl-terminal propeptides of collagens I-III. Each individual NC1 domain consists of two repeating units (Schwarz-Magdolen et al., 1986). Each repeat is stabilized by three disulfide bridges, which force the peptide chain into two large loops and one small loop. Figure 6 indicates how the two repeats of an NC1 domain can be schematically arranged in a rather symmetrical four-leaf clover-like structure. These structures have been conserved during evolution, since 60% sequence identity is observed between the NC1 domains of Drosophila, mouse, and human (Blumberg et al., 1987). Interestingly, the
FIG. 6. Loop structures in the NC1 domain of collagen IV chains. The beginning portion of the triple-helical domain is shown as a jagged line labeled N. The NC1 domain contains two repeating structures, each comprising a large loop, followed by a small loop, and then another large loop. The location of the intrachain disulfide bonds has been deduced (Siebold et al., unpublished).
subdomains of NC1 do not coincide with the exon structure of the human α1(IV) chain gene (Soininen et al., 1986b), a finding indicating that this homology is not the result of simple gene duplications. The major helical domain is about 360 nm long. The α1(IV) chain contains 21 and the α2(IV) chain 23 nontripeptide interruptions (Fig. 5). These include deletions of a single glycine residue from the Gly-X-Y triplet as well as stretches of as many as 13 residues lacking the Gly-X-Y repeat. Not all of the nontripeptide interruptions in the two chains are aligned, thus creating 26 separate interruptions in the triple helix of the molecule (Schwarz et al., 1986; K. Kühn, unpublished). These non-triple-helical segments can be divided into flexible sites, bends, and neutral substitutions (Table III). Such sequences could serve in the attachment of other proteins to the collagen IV network and also serve as sites of cleavage by various proteases. The amino-terminal region of the collagen IV molecule was isolated from proteolytic digests of basement membrane-rich tissue in the form of a cross-linked tetrameric complex, the so-called 7 S collagen. The α1(IV)
and α2(IV) chains in this 7 S domain both consist of a short (30 nm) helical region flanked on each side by nonhelical regions (Fig. 7A). The short nonhelical N-terminal segment on each chain contains cysteines and hy-
TABLE III
Some Nontripeptide Sections of α1(IV) and α2(IV)
Flexible sites 1 al a2 a1 2 a2 3 al a2 al 4 a2 Bends I al
G G G G G G G G
K i b S M D H V D M G S M K G Q K P 6 - P L I L P G M K D I K G E K W A E L f i G I U f D T V D L - P G S P T 6 8 H P - V E G P i T S P P S N G G S P P b E P Y D V I K G E P I P Q K I A V Q P G T L
G Q P ! G S ) K
a2
G F I K G V K 2 al G E V Y G F h G a2 G E A N T L P G Neutral substitutions 1 al G L P b G F t a2 G V D U G D P al G L i 8 I P G V 2 a2 G P N A L P G I 3 al G F D 8 A P G Q a2 G V S A V P G F
L A P V
+
K K K R
The nontriplet regions of the α1(IV) and the α2(IV) chains are underlined. P denotes 4-hydroxyproline; K denotes fully hydroxylated and glycosylated lysine residues; (-) denotes deletions which had to be introduced to ensure optimal matching of the tripeptide structure of both chains. Three different types of non-triple-helical interruptions have been observed: (1) longer nonhelical sequences, expected to create flexible sites sensitive to proteases; (2) an extra amino acid in one of the two chains, expected to create a bend; (3) neutral substitutions, expected to disturb only slightly the structure of the triple helix.
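As an illustration of how nontripeptide interruptions of the kind listed in Table III can be located in a chain, the following minimal Python sketch scans a one-letter sequence in a Gly-X-Y reading frame and reports every stretch that breaks the repeat. The function names, the resynchronization rule (resume at the first glycine that is followed three residues later by another glycine), and the toy sequences are assumptions made for this example; they are not taken from the sequence analyses cited in the text.

```python
def _restarts_repeat(seq, i):
    """True if position i holds a glycine that can restart the Gly-X-Y frame.

    Assumed rule: the glycine must be followed three residues later by
    another glycine, or lie within three residues of the end of the chain.
    """
    if seq[i] != "G":
        return False
    nxt = i + 3
    return nxt >= len(seq) or seq[nxt] == "G"


def find_interruptions(seq):
    """Return (start_index, residues) for each break in the Gly-X-Y repeat.

    The scan assumes the sequence starts in frame, i.e. with the glycine
    of a triplet. Indices are 0-based.
    """
    seq = seq.upper()
    interruptions = []
    i = 0
    while i < len(seq):
        if seq[i] == "G":
            i += 3  # regular triplet (the last one may be incomplete)
            continue
        start = i
        # Skip residues until a glycine that puts the chain back in frame.
        while i < len(seq) and not _restarts_repeat(seq, i):
            i += 1
        interruptions.append((start, seq[start:i]))
    return interruptions
```

With hypothetical inputs, `find_interruptions("GPPGPPAGPPGPP")` flags the single extra residue `A`, the situation that Table III associates with a bend, whereas a triplet that has lost its glycine, as in `"GPPGPPPPGPPGPP"`, is reported as the two-residue stretch `PP`. A glycine in the X or Y position does not trigger a false interruption because the scan tracks the reading frame and not the spacing between glycines.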
droxylysines that are used to generate intra- and intermolecular crosslinks. Adjacent to this nonhelical sequence is a helical segment of 114 residues. Both of these segments form the overlapping, antiparallel struc-
[Figure labels, panel A: helical cross-linking site; cap site; CHO; α1; α2; nonhelical cross-linking site; nonhelical bend. Panel B: 7 S; overlap zone.]
FIG. 7. (A) A detailed model of the amino-terminal α1(IV) and α2(IV) chains containing the 7 S domain and an adjacent segment of the main triple helix. The terminal sequence of both chains is nonhelical and contains lysines (K), hydroxylysines (K), and cysteines (C) that participate in intra- and intermolecular cross-linking. A helical cross-linking site is located about 30 nm from the amino terminus (N) of the molecule and contains cysteines and a hydroxylysine in the α1(IV) chain involved in cross-linking plus a complex carbohydrate attachment site (CHO). The “cap” site within the main triple helix identifies a series of four triplets containing proline (P) and hydroxyproline (P), a composition that would be expected to form a very stable helical structure [data from Glanville et al. (1985); Siebold et al. (1987)]. (B) The antiparallel arrangement of type IV molecules with alignment of cross-linking sites.
tures and contain cross-linking sites where cysteine and hydroxylysine residues form intramolecular crosslinks (Fig. 7B). Intermolecular bonds are likely to occur using both the hydroxylysine and cysteine residues of the N-terminal nonhelical regions and those in the triple-helical cross-link site of two 28-nm overlapping antiparallel molecules. The triple-helical domains of the α1(IV) and the α2(IV) chains are terminated by nonhelical regions, respectively 13 and 12 residues long, which define the border of the 7 S domain and provide the molecule with a flexible joint. This short segment allows the triple-helical molecules to form a bend that projects the major triple helix away from the 7 S domain, thus facilitating the formation of the network structure in the basement membrane. The adjacent triple-helical domains of both chains are characterized by a high content of proline and hydroxyproline residues. These proline- and hydroxyproline-rich sequences form very stable triple helices and may stabilize the molecule in those regions that would otherwise be destabilized by the presence of neighboring nonhelical domains. Sequence analyses of the 7 S domain have been done (Glanville et al., 1985; Siebold et al., 1987), and a model based on hydrophobic interactions has been constructed. This model predicts close apposition of cross-linking sites and stronger binding between antiparallel segments than between parallel segments. In addition, a heterotrimer chain composition is favored over homotrimer assemblies (Siebold et al., 1987). These predictions are in excellent agreement with the experimental data and also suggest a role for the α2(IV) chain in the molecule.
2. Supramolecular Organization
a. 7 S Domain of Type IV Collagen. Various models have been advanced to explain the organization of collagen IV molecules in basement membranes.
On the basis of the absence of fibers in the basement membrane, such models proposed that the collagen was arranged between sheets of noncollagenous proteins or as a random mesh of molecules linked end to end through noncollagenous domains (Kefalides, 1973). A much more defined model has been obtained via the examination of the collagen IV molecule and of fragments derived from enzyme digests of basement membrane (Timpl et al., 1981). In these latter studies, most of the collagenous protein of the basement membrane was brought into solution after incubation with either pepsin or trypsin. A subsequent exposure to bacterial collagenase at 20°C was expected to destroy all collagenous protein but instead left behind a large component with a typical collagenous composition (Timpl et al., 1979b). This component formed a rather homogeneous boundary in the ultracentrifuge and had a sedimentation constant of 7 S. Material prepared in this manner is usually referred to as
7 S collagen and can be isolated from a variety of basement membranes (Risteli et al., 1980). Circular dichroism measurements showed that 7 S collagen contained triple-helical domains that showed thermal transitions at 48 and 70°C and were stabilized by disulfide bonds. Reduction of disulfide bonds lowered this thermal stability to 45°C and conferred collagenase sensitivity. A smaller form of 7 S collagen was also obtained utilizing collagenase digestion at 37°C (Risteli et al., 1980; Madri et al., 1983; Mayne et al., 1984). The nature and significance of the long and short forms of 7 S collagen became clear from examination of aggregates of collagen IV brought into solution by a brief digestion with pepsin (Fig. 8). Electron micrographs of
FIG. 8. The network structure of type IV collagen as deduced from electron microscopy of fragments solubilized with proteases (Timpl et al., 1981). The top left micrograph shows the short form of the 7 S region. In the top right, a more limited enzyme digestion allows a portion of the helical domain of each molecule to remain linked to the 7 S domain. The bottom left micrograph shows isolated dimeric NC1 fragments and, at the lower right, dimeric triple-helical material connected in the center by the NC1 domain.
such digests revealed spider-like structures containing up to four thin, 370-nm-long strands extending from a central rod-like domain (Kühn et al., 1981; Timpl et al., 1981). The short form of 7 S collagen lacked these strands and consisted of the central rod structure. On the basis of the structure of the collagen IV molecule, it was concluded that the 7 S domain represented a site where four molecules overlapped by 28 nm, with two molecules in parallel and two molecules in an antiparallel orientation (Fig. 7B). The long form of 7 S collagen was an intermediate fragment consisting of the rod plus four short (30 nm) triple-helical extensions (Fig. 8).
b. NC1 Domain. Identification of the NC1 domain, the C-terminal globule, as the other major cross-linking site of collagen IV was achieved in a similar manner. Electron microscopy of collagen IV solubilized by reduction or with acid from the EHS tumor matrix showed many dimeric structures, usually lacking the 7 S region but joined together through a shared globule (Timpl et al., 1981; Yurchenco and Furthmayr, 1984). Monomeric material terminated in a single, somewhat smaller globule (Fig. 3A). The NC1 complex can be isolated from a collagenase digest of various tissues and is predominantly a hexameric structure arising from the condensation of the NC1 domains of six α chains of two collagen IV molecules (Weber et al., 1984). As discussed above, the NC1 domains of the α1(IV) and α2(IV) chains are highly homologous structures, particularly in the location and the size of disulfide-linked loops (Fig. 6). The loop structure is duplicated to create two highly homologous regions in each chain. The arrangement of these domains within the globule is not known, although crystals prepared from such material show in X-ray diffraction a high degree of symmetry (Timpl et al., 1985). The dimeric structure formed by the condensation of two NC1 domains is stabilized by disulfide bonds and by nonreducible bonds.
c. Collagen IV Network. Based on the observations discussed earlier, it was suggested that collagen IV molecules were arranged in a network with like ends of molecules in apposition (Timpl et al., 1981) (Fig. 8). This arrangement generates a very open structure that would also exhibit considerable tensile strength due to the presence of cross-links. In its most extended form, the distance from NC1 to the end of the 7 S domain would be about 390 nm and the distance from globule to globule about 780 nm. The network model describes how molecules are arranged in two dimensions, but it does not define how they are arranged in three dimensions. In its most open form, the molecules are in contact only at their ends. In vitro experiments, however, provided evidence that the collagen IV molecules are also able to interact laterally with partial overlapping of their triple-helical domains (Yurchenco and Furthmayr, 1984).
When collagen IV prepared from the EHS tumor is incubated at 37°C it self-assembles, forming aggregates containing polygonal structures. These lateral associations are stabilized by the interaction between the domain NC1 and sites that occur along the triple helix and are separated from one another by about 100 nm (Tsilibary and Charonis, 1986). The additional possibility of lateral aggregation of the molecules would lead to a much more complex three-dimensional structure. Examination of type IV collagen produced by cells in tissue culture demonstrated in the cell layer the expected 7 S tetrameric and NC1 dimeric structures discussed earlier. In contrast, the culture medium contained, in addition to monomeric collagen IV molecules, dimers and tetramers linked via their 7 S domains but no globule-linked dimers (Bachinger et al., 1982; Oberbaumer et al., 1982). It was also found that collagen IV molecules undergo a ready association via their 7 S domains in a concentration-dependent process (Yurchenco and Furthmayr, 1985). Upon longer incubation, the tetramers formed are stabilized by intramolecular disulfide bonds (Duncan et al., 1983). The formation of NC1-linked dimers and their stabilization by disulfide bonding is apparently a slower process, at least in vitro (Blumberg et al., 1986). The network structure has been observed by electron microscopy in Reichert’s membrane following treatment of tissue sections with plasmin (Inoue et al., 1983). This enzyme removes a major portion of the 3- to 8-nm-thick cords that constitute the major element in the basement membrane, leaving behind a network of fine filaments 1.5 to 2 nm in diameter. Presumably, the cords have a framework consisting of one or more collagen IV filaments that are coated with laminin and with other basement membrane components. A better understanding of the structure of collagen IV molecules explains some previous observations.
The occurrence of the 7 S domain and nonhelical interruptions in the molecule can explain in part the heterogeneous mixture of collagenous components isolated from basement membrane, particularly when pepsin or other proteases are used to solubilize the protein. The molecules isolated from lathyritic basement membrane by acid extraction are truncated and lack the 7 S domain, presumably due to cleavage by endogenous proteases.
3. Gene Structure and Evolution of Collagen IV Genes
Considerable progress has been made in studying the structure of collagen genes. The genes for the chains of collagens I, II, III, and IX have been isolated. The DNA coding for the major helical domains of collagens type I, II, and III is divided into many exons of 54 bases each or of multiples of this number (Boedtker et al., 1983). Indeed, it has been
TABLE IV
Size of Helical Exons in the α2(I), α1(IV), and α2(IV) Chain Genes

         α2(I)                    α1(IV)                   α2(IV)
Exon number  Exon size    Exon number  Exon size    Exon number  Exon size
     4       247(54)a          4        23(71)a         16b          64
     5        54               5        99              17          182
     6       108               6       129              18          123
     7        54               7        12
     8       108               8        73
     9        54               9       134
    10        54              10       178
    14       108              14        81

a The numbers in parentheses represent the number of bases in the helical portion of the fusion exon.
b The location of these exons has not been determined, but they correspond to sequences from the center of the helical domain and are 5' to exon 14 in the α1(IV) gene.
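The comparison summarized in Table IV can be restated as simple arithmetic: a 54-base exon encodes 18 residues (six Gly-X-Y triplets of 9 bases each), so helical exons that follow the proposed primordial pattern should be multiples of 54 bases. The Python sketch below applies that check to the exon sizes as they are transcribed in Table IV, leaving out the fusion exons because only part of each is helical. The dictionary layout and function names are assumptions made for illustration. The remainder test for split codons is only a one-way check: a size that is not a multiple of 3 guarantees that at least one exon boundary falls inside a codon, but a multiple of 3 does not rule it out.

```python
PRIMORDIAL_EXON = 54  # bases: six Gly-X-Y triplets = 18 amino acids
CODON = 3             # bases per amino acid

# Helical exon sizes (bases) as transcribed in Table IV, fusion exons omitted.
EXON_SIZES = {
    "alpha2(I)": [54, 108, 54, 108, 54, 54, 108],
    "alpha1(IV)": [99, 129, 12, 73, 134, 178, 81],
    "alpha2(IV)": [64, 182, 123],
}


def follows_54_pattern(size):
    """True if the exon is 54 bases long or a multiple of 54."""
    return size % PRIMORDIAL_EXON == 0


def must_split_codon(size):
    """True if the exon length forces a boundary to fall inside a codon."""
    return size % CODON != 0


def summarize(sizes):
    """Count pattern-conforming exons and forced codon splits for one gene."""
    return {
        "exons": len(sizes),
        "multiples_of_54": sum(follows_54_pattern(s) for s in sizes),
        "must_split_codon": sum(must_split_codon(s) for s in sizes),
        "all_sizes_unique": len(set(sizes)) == len(sizes),
    }


if __name__ == "__main__":
    for gene, sizes in EXON_SIZES.items():
        print(gene, summarize(sizes))
```

On these values every α2(I) exon is a multiple of 54 bases, whereas none of the listed type IV exons is, and several type IV exons have lengths that cannot be accommodated without splitting a codon.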
postulated that there was a primordial collagen gene of 54 bases coding for 18 amino acids consisting of six Gly-X-Y triplets and that the structure of this primordial gene has been duplicated and maintained (Yamada et al., 1980). This pattern, however, has been found only in the main triple-helical domains of the fiber-forming collagens. The gene for the α1(IX) chain, for example, is smaller, and the portions coding for the helical domains are not divided into regularly sized exons (Lozano et al., 1985). The collagen IV gene also deviates from the pattern observed with the genes for the fiber-forming collagens. The sizes of three helical exons in the α2(IV) chain (Kurkinen et al., 1985) and several helical exons in the α1(IV) chain (Sakurai et al., 1986; Soininen et al., 1986a,b) have been determined by sequencing. None of these corresponds to the 54-base pair pattern, and each has a unique size (see Table IV). Further, the initial and final codons of these exons are often split, in contrast to the genes for the chains of collagens I-III. In addition, examination of the amino acid sequences of the α1(IV) and α2(IV) chains shows no homology with the chains of collagens I-III beyond that of the Gly-X-Y triplet characteristic of all collagens. Taken together the data show that there is little relation between collagen IV and other collagen types and that collagen IV genes evolved independently. Given that collagen IV occurs first during the initial development of the embryo (Leivo et al., 1980) and is present in invertebrates (i.e., hydra and sea urchins), it may have evolved prior to the interstitial collagens. The genes for α1(IV) and α2(IV) chains have
been localized to a narrow segment of human chromosome 13 (Griffin et al., 1987), their locations indicating a close linkage.
B. Laminin
The strong reaction obtained in basement membranes with the periodic acid-Schiff stain is due to reaction with carbohydrate present in glycoproteins (Leblond et al., 1957). Subsequently, glycoproteins including some larger than Mr 200K were solubilized with 8 M urea from lens capsule (Kefalides, 1972b). The probable importance of these glycoproteins to the structure of basement membranes was fully appreciated and most early structural models suggested that glycoproteins were part of the basement membrane structure. PYS cells were found to secrete material that was highly antigenic, and antibodies raised against these antigens reacted with authentic basement membranes. Johnson and Warfel (1976) used molecular-sieve chromatography to isolate a large noncollagenous protein(s) secreted by cultured PYS cells. They also found it in kidney extracts and showed that antibody to this material reacted with tissue basement membranes. Similarly, Hogan et al. (1980) as well as Howe and Solter (1980) showed that both cells derived from embryos and PYS cells secreted primarily two or three large proteins in culture. Chung et al. (1979) also studied a line of PYS-like cells that deposited an extracellular membrane, which, when freed of cellular elements and dissolved in detergent plus mercaptoethanol, was found to consist of two principal polypeptides called GP1 (Mr 320K) and GP2 (Mr 230K). These glycoproteins were isolated by preparative electrophoresis and shown to be noncollagenous and distinct from fibronectin. Antibodies prepared against GP2 were found to react with basement membranes in kidney and some other organs. The nature of these components became clear after the isolation of laminin, a large glycoprotein (Mr 800-900K) that is a major constituent of basement membranes.
Laminin was first isolated in its intact form from the EHS tumor (Timpl et al., 1979~). It is readily solubilized from this source with aqueous solutions (i.e., 0.5 M NaCl, pH 7.4) and can be purified to homogeneity by ion-exchange and molecular-sieve chromatography. In the absence of reducing agents, laminin migrates on electrophoresis as a single sharp band and sediments as a single sharp boundary (11 S) in the ultracentrifuge (Engel et al., 1981). Hydrodynamic studies indicate that laminin has an asymmetric and rather rigid structure. Circular dichroism revealed about 30% α helix and some 20% β structure in the molecule. Electrophoresis of laminin from the EHS tumor under reducing conditions resolved two polypeptide bands of similar intensity, the A (Mr 400K)
and B (Mr 220K) chains of laminin (Timpl et al., 1979~). Several lines of evidence indicated that the A and B chains were distinct and not, for example, related as precursor and product. The peptide maps obtained with the chains were different. No interconversion of one to the other was noted in pulse-chase experiments, and antibody to the B chain did not appear to react with the A chain (Chung et al., 1979; Cooper et al., 1981; Kurkinen et al., 1983; Howe and Dietzschold, 1983). Rotary shadowing and subsequent electron microscopy produced striking images of laminin (Engel et al., 1981). These showed a cross-shaped structure with one long (77 nm) and three short (37 nm) arms. Each short arm had two prominent globules, and the long arm terminated in a single large globule (Fig. 3b). Initial models suggested that three B chains constituted the short arms of laminin and one A chain constituted the long arm and that these chains were joined in a disulfide knot in the center of the cross. This model was supported by the appearance of the molecule, by the ratio of A to B chains in laminin prepared from the EHS tumor, and by the report that thrombin cleaved the long arm (i.e., the A chain) of laminin in preparations examined by electron microscopy, leaving the three-short-arms structure with B chains intact (Rao et al., 1982). More recently, this model has undergone substantial modification. First, it became apparent from biosynthetic studies on cells in culture that there were two distinct B chains, B1 and B2 (Cooper et al., 1981), which were not well resolved in laminin prepared from the EHS tumor. Analysis of the stoichiometry of the chains produced in these systems suggested that the laminin molecule contained one A, one B1, and one B2 chain (Fig. 9). Studies with a monoclonal antibody specific for the A chain indicated that the A chain formed part of the short arm as well as the long arm of laminin (Palm et al., 1985).
The assembly of the molecule appears to proceed in discrete steps, with the initial assembly of a B1-B2 dimer linked by disulfide bonds to which an A chain is added (Morita et al., 1985; Peters et al., 1985).
1. Domains, Fragments, and Sequence Data
Detailed analysis of fragments isolated from laminin following cleavage with proteases and amino acid sequence deduced from cDNA clones from both the B1 and B2 chains of laminin have led to a further refinement of the structure. Various portions of the molecule survive proteolysis and have been isolated and characterized (Fig. 10A). Fragment 1 (Mr 275K) appeared to consist of the central portion of the cross, lacking terminal globules and the long arm. It could be isolated in good yield following pepsin and other protease digestion (Ott et al., 1982). Fragment 1 showed a strong reaction with antibody to laminin, a result indicating that it con-
[Figure labels: A chain (Mr = 400,000); B1 chain; B2 chain (Mr = 205,000); YIGSR; cell attachment; coiled coil; neurite outgrowth; heparin binding.]
FIG. 9. Schematic model for laminin showing the 3-chain structure (A, B1, and B2) and the projection of the three chains down the long arm of the molecule. The location of a pentapeptide (YIGSR) with cell attachment activity is indicated, as well as the region with neurite-promoting activity (Baron van Evercooren et al., 1982; Edgar et al., 1984).
tains a major portion of the antigenic activity of laminin, is strikingly enriched in cysteine, and retains the ability to react with cellular receptors (Rao et al., 1982; Timpl et al., 1983b; Aumailley et al., 1987). It lacked both α helix and β structure. The three-short-arms structure, α3, although less stable than fragment 1, was isolated from laminin treated with thrombin (Rao et al., 1982) or elastase (Ott et al., 1982). The long arm of laminin was particularly sensitive to proteolysis, and its removal was associated with the complete loss of α-helical structures (Ott et al., 1982). Two fragments were isolated from the long arm of laminin after limited cleavage by elastase (fragment 8) or after cleavage with trypsin in conjunction with endogenous protease (fragment 25K). Fragment 8 was localized to the long arm of laminin on the basis of its appearance in electron micrographs, which showed a rather rigid rod terminating in a single globule. Fragment 8 also bound to heparin and reacted with antibody to fragment 3, the globular domain at the end of the long arm. Fragment 8 has a high content of α helix, presumably in the rod (Paulsson et al., 1985a). It contains a high-affinity cell attachment site (Aumailley et al., 1987; Goodman et al., 1987) and is responsible for the stimulation of neurite outgrowth (Edgar et al., 1984; Engvall et al., 1986). Fragment 25K also had a high content of α helix and reacted with antibody to fragment 8 (Paulsson et al., 1985a). It was found to consist of two disulfide-linked peptides (Mr 12K), which on the basis of immunologi-
[Figure labels, panel A: fragment 1; cell binding; fragment 3; heparin binding; fragment 8; “25K”; α3; cell and collagen binding; limited elastase. Panel B: domains; globular domains; homologous cysteine-rich repeats; α-helical coiled coils; COOH.]
FIG. 10. (A) Proteolytic fragments derived from laminin and activities found to be associated with them. (B) A domain model for the B1 chain of mouse laminin deduced from the nucleotide structure of cDNA clones. Domains I and II are largely helical and probably form a coiled-coil structure with a similar portion of the B2 chain. There are several possible carbohydrate attachment sites. These domains are separated by a region “a” containing six cysteines closely bunched, possibly involved in cross-linking to the B2 and A chains. Domains III and V are cysteine-rich regions composed of repetitive segments of about 50 amino acids each. These domains may form the two rod-like elements within the short arm, whereas domains IV and VI are thought to form the visible globular structures.
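The heptad repeat that marks the coiled-coil regions of the B chains (domains I and II in Fig. 10B) can be illustrated with a short Python sketch. In a heptad (positions a to g), apolar residues tend to occupy positions a and d; the code scores each of the seven possible reading frames by the fraction of a and d positions filled by apolar residues. The residue set, the function names, and the idealized test sequence are assumptions chosen for this example, and this is not the computer analysis used to derive Fig. 10B.

```python
# Residues counted as apolar for this sketch (an assumed, simplified set).
APOLAR = set("AILMFVWY")


def heptad_score(seq, frame):
    """Fraction of a and d positions occupied by apolar residues.

    `frame` (0-6) is the index of the first residue that sits at
    position a of a heptad.
    """
    core = [res for i, res in enumerate(seq.upper()) if (i - frame) % 7 in (0, 3)]
    if not core:
        return 0.0
    return sum(res in APOLAR for res in core) / len(core)


def best_heptad_frame(seq):
    """Return (frame, score) for the reading frame with the highest score."""
    scores = [heptad_score(seq, frame) for frame in range(7)]
    best = max(range(7), key=scores.__getitem__)
    return best, scores[best]
```

For an idealized, hypothetical repeat such as `"LEELKKK" * 4`, frame 0 places leucine at every a and d position and scores 1.0, while the shifted frames score 0.5 or less. Real sequences give intermediate values, so a score is better read as a rough indicator of heptad periodicity than as proof of a coiled coil.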
cal cross-reaction and sequence data were identified as arising from the B1 and B2 chains of laminin. The sequence of each peptide showed a heptad repeat, characteristic of helices in proteins such as tropomyosin, whose chains are arranged in coiled-coil structures. Fragment 3 (Mr 50K) possessed β structure, appeared globular in electron micrographs, and was found to bind to heparin. It was assumed to be the globular region at the end of the long arm of laminin. This site is one of the main heparin- and heparan sulfate-binding domains in laminin (Ott et al., 1982). Advances in isolating and identifying cDNA clones for laminin provided considerable additional information on the amino acid sequences at the carboxyl end of the B1 and B2 chains (Barlow et al., 1984). These analyses showed that at least 350 residues of the B1 chain and over 200 residues of the B2 chain had the α-helical heptad repeat and predicted that these chains would be aligned together, along the long arm of laminin in a coiled-coil α-helical structure. Taken together, these and other data indicate that the laminin molecule is composed of one A, one B1, and one B2 chain (Fig. 9). The carboxyl terminus of each B chain extends down the long arm of laminin in a coiled-coil structure while the remainder of each chain forms a short arm. The A chain of laminin could also participate in such a coiled-coil structure, although this has not been shown and would require the alignment of a comparable α-helical domain in the A chain. However, the A chain is presumed to form one of the short arms and to extend through the molecule to form the globule at the end of the long arm of laminin. More recently, the complete amino acid sequence of the B1 and B2 chains of laminin has been deduced from the nucleotide sequence of cDNA clones (Yamada et al., 1985; Sasaki et al., 1987; M. Sasaki and Y. Yamada, unpublished). Computer analysis of possible secondary structures (Fig.
10B) shows a diversity of domains, including two α-helical-rich domains (I and II), two cysteine-rich regions with regular repeats (domains III and V), and two cysteine-poor (possibly the globular) regions (domains IV and VI) (Fig. 10B). These various domains have been correlated recently with distinct laminin fragments by Edman degradation (R. Deutzmann, unpublished), this study demonstrating that domains IV and VI correspond to the inner and outer globular domains within one short arm of laminin, respectively, and that domain III is a typical constituent of the cysteine-rich fragment 1.
2. Isoforms
Laminin composed of A, B1, and B2 chains appears to be the major form of laminin secreted by cells that produce and deposit basement
membranes. However, other isoforms probably exist. For example, B1a and B1b chains are observed after in vitro translation of B1 chain mRNA (Kurkinen et al., 1983). Since this mRNA was isolated by hybridization to a single cDNA clone for the B1 chain, the B1a and B1b chains must arise from similar mRNA perhaps created by differential splicing of the original transcript. Laminin from placenta contains an additional component, the M chain, whose size is intermediate between the A and B chains of laminin (Ohno et al., 1983). These chains could form heteropolymers with different molecular and biological activities. The chains of laminin are not synthesized synchronously in early mouse embryo cells. Only the B1 chain is synthesized in the oocyte (Cooper and MacQueen, 1983). No laminin chains are produced at the 2-cell stage; and at the 4- and 8-cell stage, both B1 and B2 chains are synthesized. The A, B1, and B2 chains appear together only at the 16-cell stage. Production of the three chains is associated with the extracellular appearance and deposition of basement membranes and the initiation of morphogenesis. Laminin consisting of A and B chains has also been identified in embryonic cells of Drosophila (Fessler et al., 1984b) and sea urchins (McCarthy et al., 1987). These proteins showed a cross-shaped structure like that of mouse laminin, but with a more extended long arm. Some cells, including cultured Schwann cells, produce only the B chains of laminin (Palm and Furcht, 1983; Cornbrooks et al., 1983; Dziadek et al., 1986a). In addition, some animal tissues appear to contain B chains or B chain mRNA but to lack significant amounts of A chain (Mohan and Spero, 1986; Kleinman et al., 1987).
C. Nidogen/Entactin
In addition to the A and B chains of laminin, SDS-mercaptoethanol extracts of the membranous sacs produced by a line of PYS-like cells contained a novel protein (Mr 158K) into which 35SO4 was incorporated (Carlin et al., 1981).
This component was isolated by preparative electrophoresis and characterized by amino acid analysis, carbohydrate content, and antigenic reactivity. Antibodies against this sulfated glycoprotein were observed to localize to the basement membrane zones of various tissues. Due to a close association with epithelial and endodermal cell surfaces, the protein was named entactin. Additional studies utilizing immunoelectron microscopy showed that entactin was present in the lamina densa and colocalized with collagen IV, laminin, and heparan sulfate proteoglycan (Laurie et al., 1984). A similar component was independently observed to be produced by parietal endoderm cells and to be incorporated into Reichert’s membrane (Hogan et al., 1982a,b).
Nidogen was initially isolated from the EHS tumor as a fragment (Mr 80K) that appeared by electron microscopy as an 8- to 10-nm globule with a short tail (10-14 nm) (Timpl et al., 1983a). It received the name nidogen on the basis of its ability to aggregate into nest-like structures (Latin: nidus, “nest”). Immunofluorescence studies with antibody to nidogen showed that it occurred in a wide variety of basement membranes. Even larger forms of nidogen were found in tissue extracts by immunological reaction of electrophoresed material. Subsequent work suggested that nidogen was rather susceptible to proteolysis but that the intact form (Mr 150K) can be isolated from a variety of basement membranes using concentrated guanidine solutions containing protease inhibitors (Dziadek and Timpl, 1985). Such a component (Mr 150K) was obtained from Reichert’s membrane, which was sulfated and also reacted with nidogen antibody (Paulsson et al., 1985b). The sulfate was shown to occur in the protein as tyrosine sulfate, a well-known posttranslational modification common to a number of secreted proteins. The intact nidogen molecule is characterized by a single N-terminal amino acid sequence and has the shape of a dumbbell (Fig. 3C), consisting of two globular domains of unique size connected by a 16-nm rod-like segment (Paulsson et al., 1986, 1987a). Nidogen is readily degraded to a series of well-defined fragments (Dziadek et al., 1985a). Such data suggest that nidogen consists of three to five separate domains that lack substantial portions of α helix or β structure. Strong complexes (KD, 1 nM) were observed between nidogen and laminin, an association allowing the isolation of the complex by molecular-sieve chromatography (Dziadek et al., 1985a; Paulsson et al., 1987a), and the dissociated components are able to reassociate. Chemical and electron microscopical studies demonstrated binding of one nidogen molecule in the complex to the center of the cross (fragment 1) of laminin.
Specific antibodies to either laminin or nidogen precipitate the other when both are present in solution. In several basement membranes, laminin and nidogen occur in nearly equimolar amounts. However, nidogen appears rather later than laminin in embryonic development (Dziadek and Timpl, 1985). Certain observations suggest that entactin and nidogen are either identical or closely related proteins. These include their similarity in size, their strong affinity for laminin, and the cross reaction of antibodies to the intact form of each (Carlin et al., 1983; Hogan et al., 1980; Palm and Furcht, 1983; Paulsson et al., 1985b). More definitive proof of the relationship between the two proteins will depend on direct sequence comparisons. The functions of nidogen/entactin are not known, although complex
formation with laminin may modulate some of the functions of laminin, such as its binding to cells and collagen IV (Dziadek et al., 1985a; Aumailley et al., 1987).
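The tight association reported for nidogen and laminin (KD of about 1 nM) can be put in perspective with a simple mass-action estimate. The following Python sketch is an illustration only: it assumes a single-site 1:1 equilibrium, and the total concentrations used are hypothetical values chosen for the example, not measurements from the studies cited. The function name is likewise introduced here for illustration.

```python
import math

def fraction_in_complex(total_a_nM, total_b_nM, kd_nM):
    """Fraction of the limiting partner bound in a 1:1 complex A + B <-> AB.

    Solves the mass-action quadratic
        [AB]^2 - (A_t + B_t + KD) [AB] + A_t * B_t = 0
    and takes the physically meaningful (smaller) root.
    """
    s = total_a_nM + total_b_nM + kd_nM
    complex_nM = (s - math.sqrt(s * s - 4.0 * total_a_nM * total_b_nM)) / 2.0
    return complex_nM / min(total_a_nM, total_b_nM)

KD_NM = 1.0  # value quoted in the text for the nidogen-laminin complex

# Hypothetical equimolar total concentrations (nM), for illustration only.
for total in (1.0, 10.0, 100.0, 1000.0):
    bound = fraction_in_complex(total, total, KD_NM)
    print(f"{total:7.1f} nM each -> {bound:.3f} of each partner in complex")
```

Under these assumptions, more than 90% of both partners is complexed once their concentrations exceed the KD by about two orders of magnitude, which would be consistent with recovery of the complex by molecular-sieve chromatography.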
D. Heparan Sulfate Proteoglycans
Current concepts suggest that every extracellular matrix contains one or more proteoglycans in addition to collagens and glycoproteins. Proteoglycans are often classified on the basis of the glycosaminoglycan chains that they bear and their tissue source. However, it is their core proteins that are the unique gene products (Hassell et al., 1986). It is now firmly established that all basement membranes contain a distinctive class of heparan sulfate proteoglycans. The first indication of the presence of proteoglycans in basement membranes came from studies using cationic probes, such as cationized ferritin or ruthenium red, which react specifically at low pH with the sulfate groups in glycosaminoglycans and reveal regular deposits along the basement membrane (Kanwar and Farquhar, 1979a; Gordon and Bernfield, 1980). Also, radioactive sulfate was found to be incorporated into macromolecules that were deposited in basement membrane (Hay and Meier, 1974). Subsequently, physiological and ultrastructural studies on the clearance of macromolecules through the kidney demonstrated that the basement membrane in the glomerulus acts as a selective filter, preventing the passage of macromolecules (Karnovsky, 1979; Farquhar, 1981). Further, the charge on the macromolecule is important in determining its clearance, since negatively charged macromolecules pass less readily than those with neutral or positive charges. These results indicated that there was an anionic barrier in the basement membrane that was important in regulating the passage of proteins from blood to the urinary space. The anionic groups observed in basement membrane with cationic stains were identified as heparan sulfate chains on the basis of their sensitivities to heparitinase and to nitrous acid (Kanwar and Farquhar, 1979b) and by chemical analysis (Kanwar and Farquhar, 1979c; Parthasarathy and Spiro, 1982).
Current concepts suggest that the heparan sulfate chains are linked through a common oligosaccharide terminating in xylose linked to serine residues on the protein core and that certain xylosides can act as initiators for the synthesis of heparan sulfate. In the absence of added xyloside, the synthesis of heparan sulfate is entirely dependent on the synthesis of a core protein. Heparan sulfate proteoglycans have been isolated from glomerular basement membranes, from the EHS tumor matrix, and from various cells producing basement membranes, including endothelial cells and epithelial cells. In general, these studies show a size diversity of heparan sulfate proteoglycans. It should be noted that proteoglycans are not readily resolved from one another by ion-exchange chromatography due to their high negative charge. Rather, resolution is obtained by differential extraction, by density gradient centrifugation, and by molecular-sieve chromatography. In addition, the protein portion of the molecules is rather sensitive to endogenous proteases, and it is necessary to use protease inhibitors to limit artifactual heterogeneity. Proteoglycans prepared from the EHS tumor matrix have been extensively studied and may serve as prototypes for other basement membrane proteoglycans. The EHS tumor contains two different heparan sulfate proteoglycans, which can be distinguished by their buoyant density, size, and solubility (Fujiwara et al., 1984; Hassell et al., 1985). The larger proteoglycan (Mr 650K), a low-density proteoglycan with about 70% of its mass in the form of protein, is firmly associated with other components of the matrix and requires denaturing solvents such as 7 M urea or 6 M guanidine for extraction. High-density proteoglycans, with less than 25% protein, are readily extracted from the tumor and are smaller (Mr 130K). These proteoglycans have been purified by ion-exchange chromatography, molecular-sieve chromatography, and CsCl-gradient centrifugation and are highly immunogenic when injected into rabbits. The specific antibodies produced are directed toward the protein portion of the proteoglycan, reacting with the tumor matrix and with all basement membranes in normal tissues (Hassell et al., 1980, 1985; Dziadek et al., 1985b). Rotary-shadowing electron microscopy (Fig. 11) together with chemical and physical analyses permits detailed models for both forms of EHS tumor proteoglycan to be proposed.
FIG. 11. Electron micrographs of low- and high-density heparan sulfate proteoglycans (panels labeled “Low density PG” and “High density PG”; scale bar, 100 nm) prepared from the EHS tumor and tracings of the molecules. The full thick lines indicate the protein cores and the dashed lines indicate the heparan sulfate chains. Heparan sulfate chains produce only very faint contours due to their low mass-to-length ratio. Brackets in the model of low-density heparan sulfate chains denote variations in their length.
The low-density proteoglycan consists of an elongated protein core (length, 80 nm) with three heparan sulfate chains clustered at one end (Paulsson et al., 1987b). The protein core, when released by treatment with heparitinase, migrates in electrophoresis more slowly than does the laminin A chain, with an Mr of 500K, although the possible retention of carbohydrate and the lack of adequate standards limit accurate estimates of its size. It consists of a single polypeptide chain folded into about six compact globular domains. The multidomain structure of the protein core was confirmed by protease digestion experiments, which released several large protein fragments together with a small segment containing the heparan sulfate chains (Ledbetter et al., 1987; Paulsson et al., 1987b). These studies also indicated that the heparan sulfate attachment region is at one end of the molecule. One form of high-density proteoglycan was obtained from NaCl extracts of EHS tumor and appears as a star-shaped molecule (Fig. 11). The proteoglycan contains four 30-nm long heparan sulfate chains (Mr 29K) linked to a small protein core that accounts for about 10% of the total mass (Fujiwara et al., 1984). Two more species of high-density proteoglycans that differ in solubility have also been obtained from the EHS tumor (Hassell et al., 1985). They could be variants of the same molecule or distinct species. They have a larger molecular mass (Mr ≥ 200K), contain about 25% protein, and yield a heterogeneous protein core on heparitinase digestion. It is not yet clear whether the differences between these high-density forms reflect genetically distinct proteoglycans, artifacts produced during isolation, or
differences in the methods used to estimate size by chemical analysis, molecular-sieve chromatography, and ultracentrifugation. A variety of studies suggest that the low- and high-density proteoglycans may be related structures (Ledbetter et al., 1985; Fujiwara et al., 1984; Hassell et al., 1985; Dziadek et al., 1985b). Antibodies prepared to either variant react with the other, and both antibodies precipitate a single precursor protein (Mr > 400K) from cells that synthesize basement membranes. Pulse-chase experiments indicate conversion of this nascent polypeptide to the low-density proteoglycan with the subsequent appearance of a high-density proteoglycan (Ledbetter et al., 1985). The simplest explanation for these observations is that heparan sulfate chains are added to the nascent protein to form the low-density proteoglycan, which is then converted to the high-density proteoglycan by proteolysis. Whether such processes account for the formation of all forms of high-density proteoglycan remains to be established. Heparan sulfate proteoglycans isolated from the basement membranes of certain tissues and from cultured cells are clearly different in size. For example, most studies on glomerular basement membrane (Kanwar et al., 1981, 1984; Stow et al., 1983) indicate that its heparan sulfate proteoglycan is a small, high-density form (Mr 130K) with four heparan sulfate chains (Mr 26K). Similar material has been obtained from kidney labeled in situ with sulfate and from glomeruli labeled in culture and was localized to the lamina lucida of glomerular basement membranes by antibody staining (Stow et al., 1985a). On the other hand, Spiro and Parthasarathy (1984) reported the isolation of a low-density heparan sulfate proteoglycan (Mr 200K, 70% protein) from glomerular basement membranes; and lens capsule also contains a larger form. The glomerular proteoglycan appeared to contain four heparan sulfate chains (Mr 14K) clustered in one portion of the molecule.
Also, Oohira et al. (1983) reported that endothelial cells produced and deposited a large heparan sulfate proteoglycan (Mr 400K) that reacts with antibody to the EHS tumor proteoglycans. These basement membrane types of proteoglycans differ, however, from cell-bound heparan sulfate proteoglycans (Hook et al., 1984) as shown by antibody staining of tissues and cells (Dziadek et al., 1985b; Stow et al., 1985b). It is difficult at present to reconcile these conflicting results, even in a simple model system such as the EHS tumor. However, since all tissue basement membranes react with antibodies prepared to the EHS tumor proteoglycan, it seems likely that they contain related species even though real differences may exist between the proteoglycans in different tissues. Such molecules could be separate gene products, arise from differential splicing of mRNA from the same gene, or be formed by specific
proteolytic cleavage of a low-density proteoglycan. Additional heterogeneity may arise due to differences in the size of the heparan sulfate chains. For example, the proteoglycan from Reichert’s membrane, in contrast to that from the EHS tumor, shows strong binding to antithrombin. Such binding, as shown previously for heparin, involves sulfation at 3-OH sites, which is found in the proteoglycans from Reichert’s membrane but not in the proteoglycans from the EHS tumor (Pejlar et al., 1987).
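The size and composition figures quoted in this section for the small, high-density proteoglycans can be cross-checked with simple arithmetic. The Python sketch below is illustrative only: it assumes that the total mass is just the sum of the core protein and the heparan sulfate chains, and the function names are introduced here and are not from the source.

```python
def protein_fraction(total_kDa, n_chains, chain_kDa):
    """Mass fraction left for the core protein after subtracting the chains."""
    glycosaminoglycan_kDa = n_chains * chain_kDa
    return (total_kDa - glycosaminoglycan_kDa) / total_kDa

def total_mass_from_chains(n_chains, chain_kDa, protein_frac):
    """Total mass implied by the chain mass and a given protein fraction."""
    return n_chains * chain_kDa / (1.0 - protein_frac)

# Glomerular proteoglycan: Mr 130K with four chains of Mr 26K (values from the text).
print(round(protein_fraction(130, 4, 26), 2))       # 0.2

# Star-shaped EHS form: four chains of Mr 29K, core about 10% of the total mass.
print(round(total_mass_from_chains(4, 29, 0.10)))   # 129
```

Both results fall within the ranges given for the high-density forms (less than 25% protein, Mr about 130K), so the quoted figures are at least mutually consistent under this simple additive assumption.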
E. Calcium-Binding Proteins
A few recent studies indicate that divalent cations, most likely calcium, are important for maintaining basement membrane structure, a conclusion suggesting the presence of several cation-binding proteins. Initially, studies on the heat-induced polymerization of laminin showed it to be increased in the presence of calcium and other divalent cations and arrested at the level of small oligomers by chelating agents (Yurchenco et al., 1985). Likewise, chelating agents increase the extraction of laminin-nidogen complexes from the EHS tumor, a result indicating that calcium mediates their binding to the matrix (Paulsson et al., 1987a). It was also directly shown that laminin is in fact a calcium-binding protein, but the localization and number of binding sites and their affinities are not yet known. Another small calcium-binding protein, BM-40, was isolated from 6 M guanidine extracts of the EHS tumor (Dziadek et al., 1986b). Immunohistology showed it to be a component of the tumor matrix and of Reichert’s membrane. It was also identified as a constituent of other tissues (e.g., lens capsule) by immunoassay, even though it did not react immunohistologically. Later studies showed that BM-40 is efficiently extracted from the EHS tumor in physiological buffers containing EDTA (Mann et al., 1987). Partial amino acid sequence analysis indicates the identity of BM-40 with bone osteonectin (Bolander et al., 1988) and with SPARC, a parietal endoderm protein (Mason et al., 1986a). Together the data demonstrate that this calcium-binding protein is a ubiquitous constituent of extracellular matrices, including most if not all basement membranes (Dziadek et al., 1986b; Mason et al., 1986b; Young et al., 1986). Complete sequence analysis based on cDNA was achieved for osteonectin and SPARC and indicated a disulfide-bonded, four-domain structure for the protein (Fig. 12).
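A rough sense of what the calcium affinities discussed in this section imply can be had from a single-site occupancy estimate. The Python sketch below is an illustration under stated assumptions: it treats binding as noncooperative and independent (binding to BM-40 is reported in this section to be cooperative, so this is a simplification), uses the KD of 0.3 µM reported for bone osteonectin, and takes free calcium concentrations spanning the intracellular and extracellular ranges mentioned in this section. The function name is introduced here for illustration.

```python
def occupancy(free_ca_uM, kd_uM):
    """Fractional saturation of one independent site: [Ca] / (KD + [Ca])."""
    return free_ca_uM / (kd_uM + free_ca_uM)

KD_UM = 0.3  # bone osteonectin value cited in this section

# Free calcium in micromolar: 0.1 uM (100 nM, intracellular range),
# then 100 uM and 1000 uM (0.1-1 mM, extracellular fluids).
for ca in (0.1, 100.0, 1000.0):
    print(f"[Ca] = {ca:7.1f} uM -> occupancy {occupancy(ca, KD_UM):.4f}")
```

With these numbers a site of this affinity would be about one-quarter occupied at 100 nM calcium but more than 99% occupied at 0.1 mM and above.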
Remarkable features of the structure are a cluster of 16 glutamic acid residues at the N terminus and an EF-hand domain located between the last two cysteine residues, both regions having the potential for calcium binding (Bolander et al., 1988; Engel et al., 1987). Studies with BM-40 demonstrated a reversible change in α helix from 30 to 20% upon removal of calcium. The data indicate the
FIG. 12. Predicted domain structure of the small calcium-binding protein BM-40/osteonectin/SPARC (figure labels: EF domain, Glu-rich domain). Numbering is according to the sequence of SPARC (Mason et al., 1986a) but omits the signal peptide. A dashed line denotes predicted α helix, C identifies cysteine residues, and + indicates clusters of basic residues. A potential calcium-binding domain of the EF type is noted as well as a cluster of glutamic acids. Modified from Engel et al. (1987).
cooperative binding of several calcium ions with KD in the micromolar range (J. Engel, personal communication). No α helix is evident, but calcium binding with KD = 0.3 µM has been found for bone osteonectin (Romberg et al., 1985). Cells contain a large variety of calcium-binding proteins, with the KD for cation binding falling within the 10- to 100-nM range of intracellular calcium concentration (Kretsinger, 1980). Limited changes in calcium concentration, like those occurring during muscle contraction or transmission of secondary messages, can therefore modulate the binding behavior of these proteins. The calcium concentration within basement membranes is not known, but it could be the same as that of extracellular fluids (0.1-1 mM). If this assumption is correct, BM-40 should be present in the extracellular matrix exclusively in calcium-associated form. It could serve there as a calcium-sequestering protein; or, alternatively, calcium may induce conformational changes that regulate its binding to other matrix proteins. Support for such speculations will depend on a better understanding of calcium binding by laminin and by BM-40 and the characterization of further calcium-binding proteins possibly present in basement membrane.
F. Other Basement Membrane Components
Collagen VII forms the anchoring fibrils in the reticular lamina that link the basement membrane to underlying connective tissue (Burgeson et al., 1985; Sakai et al., 1986). These attachment fibers are particularly abundant along the epidermal-dermal basement membrane, in the amnion, in lung, and in other sites where basement membranes are under tension. The fibers of collagen VII form a centrosymmetrically banded structure
about 800 nm in length. This distance apparently represents the length of two molecules, each 460 nm, arranged in an antiparallel fashion with an overlap of 60 nm. The anchoring fibrils form an extended network underneath the lamina densa and connect by their C-terminal domains to the lamina densa, with anchoring plaques being abundant in the reticular lamina (Keene et al., 1987). The plaques contain type IV collagen, a finding suggesting that it may interact with collagen VII. Anchoring fibrils and collagen VII are absent from the skin of certain patients with the recessive dystrophic variety of epidermolysis bullosa, a severe blistering disease (Goldsmith and Briggaman, 1983).
1. Fibronectin
While fibronectin is usually listed as a component of basement membranes, it is present in much larger amounts elsewhere and may be primarily an adventitious constituent of basement membrane. Fibronectin (Mr 450K) contains two similar or identical chains that are each 60 nm long and are linked at their carboxyl ends by disulfide bonds (Yamada, 1983). Each chain contains a number of discrete domains with distinct binding sites for cellular receptors, collagen, fibrin, DNA, heparan sulfate, and so on. Fibronectin exists as both plasma- and matrix/cell-associated forms whose structures show minor differences that arise by differential splicing of the transcript of the fibronectin gene (Yamada, 1983; Hynes, 1985). The entire amino acid sequence of the cellular form has been deduced from the nucleotide sequence of human cDNA clones (Kornblihtt et al., 1985) and by sequencing bovine plasma fibronectin (Skorstengaard et al., 1986). Immunofluorescent studies with antibody to fibronectin show the protein to be localized in the basement membrane zone.
At the ultrastructural level, fibronectin is observed to be concentrated in structures in the reticular zone as well as being directly associated with the lamina densa (Madri et al., 1980; Martinez-Hernandez et al., 1981; Courtoy et al., 1982; Semoff et al., 1982; Laurie et al., 1983; Fleischmajer and Timpl, 1984). Most studies suggest that fibronectin is not a ubiquitous component of all basement membranes or necessarily synthesized by the same cells that produce collagen IV and laminin. It may be acquired from the circulation and from nonepithelial cells. Fibronectin can bind to both collagen IV and heparan sulfate proteoglycan and could be localized to the lamina densa in this fashion. Studies on developing tissues suggest that the highest levels of fibronectin are associated with newly deposited basement membranes and that fibronectin could play a role in their deposition as well as allowing fibroblastic cells to bind to basement membranes (Stenman and Vaheri, 1978; Wartiovaara et al., 1979).
2. Chondroitin Sulfate Proteoglycans
Chondroitin sulfate was identified in embryonic basement membrane on the basis of studies showing the susceptibility of sulfate-labeled material to chondroitinase and a reaction with monoclonal antibodies (Couchman et al., 1984; Paulsson et al., 1985b). Similarly, Reichert’s membrane, endothelial cells, L2 cells, PYS cells, EHS tumor cells, and isolated glomeruli produce variable amounts of these proteoglycans (Oohira et al., 1983; Stow et al., 1983; Oldberg et al., 1981; Tyree et al., 1984; Hassell et al., 1980). The molecule produced by L2 cells (Mr 40K) contains a small protein core with a unique amino acid sequence (Bourdon et al., 1985). It appears that in such systems, little of the chondroitin sulfate proteoglycan is deposited in the matrix.
3. Amyloid P Component
This protein (Mr 230K) is an acute-phase protein produced in response to either trauma or toxic factors (Pepys and Baltz, 1983). Amyloid P was also observed by histology in glomerular and certain vascular basement membranes. It appears to be bound to type IV collagen, since it is brought into solution with collagenase (Dyck et al., 1980). Amyloid P was suggested to form the basotubule structures observed in some basement membranes (Inoue et al., 1983; Inoue and Leblond, 1985).
4. Acetylcholinesterase
This enzyme is concentrated in the synaptic clefts between nerve ending and muscle. These are cholinergic synapses, and acetylcholinesterase inactivates acetylcholine released by the nerve to allow muscular relaxation. A considerable proportion of this acetylcholinesterase is bound to the basement membrane in the cleft separating the muscle surface from the nerve process. Some studies suggest that the basement membrane contains the asymmetric form of acetylcholinesterase, which contains a collagenous tail (McMahan et al., 1978; Inestrosa et al., 1982; Dreyfus et al., 1983).
Binding studies show that this form has a rather high affinity for heparan sulfate proteoglycan, and its binding to the basement membrane could be through this interaction, since this enzyme does not bind to laminin, collagen IV, or fibronectin (Vigny et al., 1983; Grassi et al., 1983).
5. Bullous Pemphigoid Antigen
This protein is limited to basement membranes beneath squamous epithelia such as the epidermal, urethral, bladder, bronchial, and gall bladder basement membranes. It was detected using autoantibodies in the sera of
patients with bullous pemphigoid, a dermatological disorder characterized by multiple blistering of the skin. These blisters develop due to the deposition of the autoantibody in the epidermal basement membranes, a process that initiates an inflammatory reaction and leads to the separation of the epidermis from the dermis at the level of the basement membrane (Lever, 1979; Patel et al., 1983). Bullous pemphigoid antigen is synthesized in culture by epidermal cells and has been identified as a disulfide-linked oligomer of a unique chain (Mr 230K) (Stanley et al., 1981, 1982). It has an affinity for the surface of epidermal cells and is the first component of the epidermal basement membrane to be deposited by epidermal cells during wound healing.
6. Goodpasture Antigen
The occurrence of this component was also discovered due to the presence of autoantibodies in patients with a rare form of glomerulonephritis. Such patients present with a syndrome of hematuria associated with pulmonary hemorrhage, which may progress rapidly and lead to the complete loss of kidney function. In this disorder, basement membranes in the glomeruli and lung contain linear deposits of IgG, whose presence elicits inflammatory reactions that lead to the loss of tissue function. Such antibodies can be eluted from the affected kidneys after their removal and are also detected in the blood of these patients (Wilson and Dixon, 1981; Peters et al., 1982). These autoantibodies react with the kidney basement membranes, but not, for example, with those in skin. While the Goodpasture antigen was originally thought to reside only in lung and kidney, it may also occur in smaller amounts or in a cryptic form in other basement membranes (Wieslander and Heinegard, 1985). The antigen is solubilized by collagenase and copurifies with the NC1 domain of type IV collagen (Wieslander et al., 1984a,b; Timpl et al., 1985).
The epitope appears to be partially cryptic, and a stronger reaction with autoantibodies is obtained after dissociation of the hexameric NC1 structure into subunits (Wieslander et al., 1985). However, not all of the subunits possess the epitope, and the precise relationship of the NC1 globule to the Goodpasture antigen remains to be established. It is possible that the intact antigen is a unique collagen IV chain (Butkowski et al., 1985, 1987).
IV. SELF-ASSEMBLY AND INTERACTION BETWEEN COMPONENTS
Self-assembly processes are usually invoked to explain the deposition of collagenous matrices, since they form outside the cell. Self-assembly in the stroma is based on the ability of interstitial collagen types I-III to
spontaneously align in an ordered manner. These particular collagens are secreted as soluble precursors, the procollagens I-III, which are converted to collagen molecules by the enzymatic excision of propeptides. This modification reduces the solubility of these proteins and encourages their lateral association, thereby giving rise to the rapid formation of fibers at physiological pH and temperature. A somewhat different pattern is observed with collagen IV. It is incorporated into the matrix without proteolytic modification and its deposition is not associated with a change from a soluble to an insoluble form. However, spontaneous interactions that may be involved in the deposition of basement membrane are observed between molecules. Monomeric preparations of intact collagen IV can be obtained from the media used to culture cells that produce the protein. These monomers of collagen IV, when incubated under physiological conditions, particularly in the presence of a glutathione redox system, form dimeric, trimeric, and tetrameric aggregates. Association occurs only at the amino terminus of the molecule, without any globular interactions. No requirement for cation was found, nor did laminin accelerate the process (Duncan et al., 1983). The data suggest that the formation of aggregates of collagen IV through their 7 S domains can occur spontaneously in a stepwise fashion up to the tetrameric level. The collagen IV extracted from lathyritic tissue under reducing conditions (Kleinman et al., 1982) is predominantly composed of dimers, with the molecules linked at their globular ends. When incubated in physiological solvents at 25°C, tetrameric, hexameric, and octameric aggregates are formed that resemble the aggregates of collagen IV brought into solution by limited digestion of basement membrane with proteases (Yurchenco and Furthmayr, 1984).
When solutions of the dimeric preparation of collagen IV are incubated at 37°C, a more condensed and regular lattice network is observed. This material consisted of the aggregate structures formed at 25°C, but two or three triple helices were laterally associated to form polygonal structures. The globular domains of the dimers are arranged on the vertices of the polygon in a regular fashion and are separated by an average distance of 170 nm (Yurchenco and Furthmayr, 1984). More recent studies suggest that dimeric globules isolated by collagenase treatment of collagen IV bind at specific sites along the helix (Tsilibary and Charonis, 1986). In summary, these observations suggest that purified collagen IV can form amino-terminal interactions, such as those occurring in situ, but not the globular (NC1) interactions. The open network formed by the association of like ends of collagen IV molecules can undergo additional associations determined by the affinity of dimeric NC1 globules
for sites along the triple helix. Such a condensed structure, with the other components of basement membrane arrayed on it, could generate the cords observed in the lamina densa.
A. Laminin
Laminin has also been observed to polymerize in vitro. Polymerization is observed under physiological conditions at 37°C above a critical concentration (100 µg/ml), requires calcium ions, and is reversible at low temperatures (Yurchenco et al., 1985). Examination of the laminin aggregates suggests that the interactions occur through the terminal globules on the arms of laminin. Denatured laminin does not undergo the polymerization reaction. Antibodies to the end of the long arm of laminin block the polymerization of laminin (Charonis et al., 1986). A similar calcium-mediated polymerization is also observed with the laminin-nidogen complex (Paulsson et al., 1987a).
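The notion of a critical concentration can be illustrated with the simplest idealized assembly model, in which all protein above the critical concentration is found in polymer and the free concentration never exceeds that threshold. The Python sketch below assumes this idealized behavior, takes the critical concentration quoted above as 100 µg/ml, and uses hypothetical total concentrations; it is not a description of the kinetics measured in the studies cited.

```python
def polymer_concentration(total_ug_ml, critical_ug_ml=100.0):
    """Protein in polymer for an idealized critical-concentration model."""
    return max(0.0, total_ug_ml - critical_ug_ml)

# Hypothetical total laminin concentrations (ug/ml), for illustration only.
for total in (50.0, 100.0, 200.0, 400.0):
    polymer = polymer_concentration(total)
    print(f"total {total:5.1f} -> polymer {polymer:5.1f}, free {total - polymer:5.1f}")
```

In this model no polymer forms at or below the threshold, and every increment above it appears as polymer while the free concentration stays at the critical value.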
B. Multiple Interactions
Collagen IV has been found to bind to both laminin and heparan sulfate proteoglycan, whereas laminin binds both nidogen and heparan sulfate proteoglycan. These observations suggest that multiple interactions of a specific nature could be important in determining both the composition and the deposition of basement membranes. Collagen IV binds about an equal amount of laminin (Woodley et al., 1983). The initial interaction between the two appears to involve a globular portion of a short arm of laminin binding to a site on the collagen IV molecule about 80-140 nm from the NC1 domain (Charonis et al., 1985; Rao et al., 1985; Laurie et al., 1986). Subsequently, the laminin appears to collapse onto the collagen IV molecule. The short-arm fragment of laminin, which lacks the long arm, is also able to bind to type IV collagen, whereas the P1 fragment lacking globules does not (Rao et al., 1982). The binding affinity of laminin for collagen IV has been reported to be on the order of KD = 10⁻⁷ M, and denaturation of either protein destroys their interaction (Woodley et al., 1983). Others have not observed interactions in solution or saturable binding (Charonis et al., 1985). Both the large and small forms of the heparan sulfate proteoglycans bind to laminin and to collagen IV (Woodley et al., 1983; Fujiwara et al., 1984). Binding of these proteoglycans occurs at the globule on the end of the long arm of laminin, presumably due to electrostatic forces, since binding is reversed by relatively low levels of NaCl (Ott et al., 1982). Binding of heparan sulfate proteoglycans to collagen IV has been studied
both by affinity chromatography and by electron microscopy. A major site for proteoglycan binding has been identified at about 200 nm from the NC1 domain (Laurie et al., 1986). Fibronectin has also been found to bind to collagen IV. In this case, binding is stronger to the denatured protein than to the native form, a binding pattern also seen with other collagen types (Woodley et al., 1983). Binding of fibronectin to collagen IV appears to involve a site on the collagen molecule about 200 nm from the NC1 domain and a region at the end of a fibronectin chain (Laurie et al., 1986). Even though the laminin and fibronectin binding sites are distinct on collagen IV, the binding of laminin blocks any binding of fibronectin but not vice versa. Presumably, this interference occurs for steric reasons due to the condensation of laminin on the collagen IV molecule (Terranova et al., 1986a). Such data indicate that the associations of the major components of basement membrane (collagen IV, laminin, heparan sulfate proteoglycans, and nidogen) are specific and occur through a limited number of binding sites. Direct evidence that such interactions occur between mixtures of these components has been obtained by combining the purified proteins as well as by studying the interaction of the materials from unfractionated extracts (Kleinman et al., 1983, 1986). These studies showed that mixtures of collagen IV, laminin, and the large heparan sulfate proteoglycan precipitate at 37°C in the ratio 1 : 1 : 0.15. The stoichiometry observed in this interaction and the presence of a limited number of binding sites on each molecule are consistent with their ability to form defined complexes. Similar interactions were observed when extracts of the EHS tumor matrix, which contain laminin, nidogen, heparan sulfate proteoglycans, and a variety of other proteins, were examined (Kleinman et al., 1986).
Examination of the extract by molecular-sieve chromatography showed that the majority of the laminin in the extract was present in an aggregate with nidogen and heparan sulfate proteoglycan. By electron microscopy, the aggregate was observed to contain a central core of proteoglycan to which were attached several outstretched laminin molecules. Nidogen was not visualized, possibly because it is much smaller than the other molecules. Other data indicated that laminin-nidogen complexes are particularly stable (Paulsson et al., 1987a) and that tissues contain equal amounts of these proteins (Dziadek and Timpl, 1985). These findings suggest the ubiquitous existence of such complexes in basement membranes. Supplementation of the extract with collagen IV leads to the rapid assembly of certain components in the extract into a solid gel composed of an interconnected network of lamellar structures resembling basement membranes (Kleinman et al., 1986). The major components of the gel
were identified as equal amounts of collagen IV, laminin, and nidogen, plus smaller amounts of proteoglycan. In summary, both collagen IV and laminin show the ability to self-assemble. In addition, the various components of basement membrane have an affinity for one another that involves binding to specific sites. These interactions generate rather defined aggregates in solution and lead to the deposition of the components in a gel-like form whose ultrastructure resembles, in some details, that of authentic basement membranes. Such multiple interactions would be expected to be stronger than single interactions and may account for the codistribution of these components in basement membranes.
V. BIOLOGICAL ASPECTS
A. Structural Functions
Basement membranes are the primary extracellular matrix of epithelial tissues. As such, basement membranes are presumed to provide physical stability and thus maintain tissue shape and integrity. The physical characteristics of basement membranes have not been adequately studied, but they are expected to exhibit much greater elasticity than other collagenous matrices due to the end-to-end arrangement of molecules. Such a network structure would also facilitate the passage of fluids through capillary and glomerular walls. However, basement membranes in the kidney, in capillaries, and possibly in other sites exhibit the ability to restrict the passage of negatively charged proteins. This barrier in the basement membrane is formed by the heparan sulfate chains of the proteoglycan, which are arranged along its surfaces (Farquhar, 1981). In embryogenesis, proteoglycan appears to be absent from the basement membranes in those regions of developing glands that are undergoing rapid growth and branching (Bernfield et al., 1984). It has been suggested that the absence of proteoglycans from these regions allows macromolecules, such as growth factors, to pass through the basement membrane and stimulate the growth and morphogenesis of the tissue on the other side. Indeed, an important function of basement membranes in the mature individual may be to prevent biologically active mediators produced by the separated tissues from reaching each other. In diabetes, basement membranes may also become defective and leak serum proteins, a pathology resulting in degenerative changes in kidney, eye, and blood vessels. Additionally, basement membranes in a number of sites, including kidney, capillary, and nerve, become grossly thickened with increased collagen IV but reduced proteoglycan contents. Two general concepts have been advanced to explain these changes (Brownlee and Cerami, 1981; Rohrbach et al., 1983; Tarsio et al., 1987). First, the basement membrane produced in the diabetic has a reduced proteoglycan content and therefore is functionally altered. Second, an increased nonenzymatic glycosylation of basement membrane proteins occurs in diabetes due to the elevated blood glucose and interferes with the binding of the components to one another, most strongly altering proteoglycan binding (Tarsio et al., 1987). The fact that the thickened basement membranes in diabetes are located in sites where quantities of fluid are filtered points to the involvement of compensatory mechanisms underlying the increased production of basement membrane proteins.
B. Cellular Receptors for Collagen IV and Laminin
Many cells attach to basement membranes through cellular receptors that are specific for various components of this matrix. Diverse mechanisms undoubtedly exist, since some cells bind preferentially to collagen IV, others to laminin, and others to the two proteins together (Terranova et al., 1980; Aumailley and Timpl, 1986). Colligin (Mr 47K), a collagen IV binding protein that also binds gelatin, has been isolated from endodermal cell membranes and could be a collagen receptor (Kurkinen et al., 1984; Hughes et al., 1987). Much more is known about the interaction of cells with laminin. A high-affinity receptor for laminin (KD = 1 × [...] to 4 × [...]) has been found on many cells including myoblasts, tumor cells, and macrophages (Lesot et al., 1983; Rao et al., 1983; Malinoff and Wicha, 1983). The laminin receptor (Mr 67K) is solubilized by detergent and has all the characteristics of an integral membrane protein. A partial amino acid sequence of the receptor has been deduced from the nucleotide sequence of cDNA clones. These show a possible transmembrane sequence and suggest that this receptor has substantial cytoplasmic and extracellular domains (Wewer et al., 1986). A small peptide from the B1 chain of laminin has been identified as a major binding site for the laminin receptor (Graf et al., 1987). To identify this sequence as the active site, synthetic peptides (20-mers) corresponding to sequences in the various domains of the B1 chain were prepared and used as immunogens to induce the production of specific antibodies. The antibody to the peptide from domain III, a cysteine-rich domain of homologous repeats, inhibited cell attachment, although the peptide itself did not exhibit activity. Additional peptides were prepared from neighboring sequences, and a nonapeptide, CDPGYIGSR, was found to support cell attachment, to be chemotactic, and to elute the laminin receptor from a column of immobilized laminin.
The pentapeptide YIGSR is the smallest
active sequence that will mediate cell attachment and receptor binding. On a molar basis, YIGSR shows only 0.5-1% of the activity of laminin, a result suggesting that other sequences or conformational factors are important. As expected, this peptide does not stimulate neurite outgrowth, since this activity has been localized to a different domain in laminin within or adjacent to the large globule at the end of the long arm (Edgar et al., 1984). Presumably there is a neurite-specific receptor, although this has not been shown directly, and additional laminin receptors on other cells (Horwitz et al., 1985; Aumailley et al., 1987) may exist. Presumably these receptors react with other peptide sequences, as shown by inhibition of cell attachment to laminin by the peptide SDGR (Yamada and Kennedy, 1987), which resembles the inverted cell-binding sequence of fibronectin (Pierschbacher and Ruoslahti, 1984).
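The sequence relationships among these peptides can be checked mechanically. The following is a minimal illustrative sketch in Python and is not part of the studies cited. The helper names are hypothetical, and the fibronectin cell-binding sequence RGDS is supplied here as an assumption taken from outside this text (it is the sequence generally attributed to Pierschbacher and Ruoslahti, 1984), since the text itself does not spell it out.

```python
# Illustrative sketch only; helper names are hypothetical.
# Peptide sequences are written in one-letter amino acid code.

NONAPEPTIDE = "CDPGYIGSR"      # active laminin B1 peptide named in the text
PENTAPEPTIDE = "YIGSR"         # smallest active sequence named in the text
INHIBITORY_PEPTIDE = "SDGR"    # inhibits cell attachment to laminin (text)
FIBRONECTIN_SITE = "RGDS"      # ASSUMPTION: not stated in the text


def find_motif(sequence: str, motif: str) -> int:
    """Return the 0-based offset of motif within sequence, or -1 if absent."""
    return sequence.find(motif)


def is_inverted(candidate: str, reference: str) -> bool:
    """Return True if candidate is the reference sequence read backwards."""
    return candidate == reference[::-1]


if __name__ == "__main__":
    offset = find_motif(NONAPEPTIDE, PENTAPEPTIDE)
    print("YIGSR offset within CDPGYIGSR:", offset)
    print("SDGR is inverted RGDS:",
          is_inverted(INHIBITORY_PEPTIDE, FIBRONECTIN_SITE))
```

The check only compares letter strings. It says nothing about conformation, which the activity data above suggest is also important.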
C. Turnover and Degradation
Basement membranes are arrayed as thin but continuous sheets that form a significant physical barrier to the passage of cells. Current concepts suggest that degradative enzymes are required for turnover of the basement membrane and for passage of cells across it. These could include collagenases, heparitinases, and less specific enzymes. Given the cross-linked network that collagen IV forms in the basement membrane, one would expect that its removal is the critical step. Indeed, turnover studies show that collagen IV is more long lived than laminin and that heparan sulfate is the most rapidly replaced. Collagen IV is not attacked by the collagenases that attack stromal collagens, and a metalloproteinase that has been identified in the media of cultured tumor cells cleaves collagen IV molecules at a specific site (Liotta et al., 1979; Salo et al., 1983; Fessler et al., 1984a). The requirement for different collagenases to degrade fibrous versus basement membrane collagen is consistent with the role that basement membranes play in maintaining tissue architecture during the turnover of stromal tissue. In addition to the metalloproteinase IV, a variety of enzymes including elastase, pepsin, and chymotrypsin cleave collagen IV (Uitto et al., 1980; Mainardi et al., 1980). The importance of the various enzymes in the turnover and degradation of basement membranes has not been thoroughly examined.
The ability of tumor cells to invade through basement membranes signifies their progression to malignant status (Liotta et al., 1986). As noted earlier, the collagen IV-cleaving metalloproteinase was originally discovered in the media used to culture metastatic tumor cells (Liotta et al., 1979). Binding of tumor cells to laminin, which precedes basement membrane invasion, results in a significant stimulation of protease release (Turpeenniemi-Hujanen et al., 1986). Current concepts on the process of
tumor cell dissemination suggest that their ability to cross basement membranes is necessary for their invasive activity. Basement membrane-lined tissues and intact reconstituted basement membranes have been used to assess tumor cell invasiveness in in vitro assays (Terranova et al., 1986a,b; Albini et al., 1987). These studies show that most, if not all, metastatic cells are able to invade through basement membranes and that specific mechanisms are involved. Interestingly, inhibitors of collagenase retard the invasion of tumor cells (Reich et al., 1988). This process represents a potential target for therapeutic agents.
REFERENCES
Albini, A., Iwamoto, Y., Kleinman, H. K., Martin, G. R., Aaronson, S., Kozlowski, J. M., and McEwan, R. (1987). Cancer Res. 47, 3239-3245.
Aumailley, M., and Timpl, R. (1986). J. Cell Biol. 103, 1569-1575.
Aumailley, M., Nurcombe, V., Edgar, D., Paulsson, M., and Timpl, R. (1987). J. Biol. Chem. 262, 11532-11538.
Babel, W., and Glanville, R. W. (1984). Eur. J. Biochem. 143, 545-556.
Bächinger, H. P., Fessler, L. I., and Fessler, J. H. (1982). J. Biol. Chem. 257, 9796-9803.
Barlow, D. P., Green, N. M., Kurkinen, M., and Hogan, B. L. M. (1984). EMBO J. 3, 2355-2362.
Baron van Evercooren, A., Kleinman, H. K., Ohno, S., Marangos, P., Schwartz, J. P., and Dubois-Dalcq, M. (1982). J. Neurosci. Res. 8, 179-184.
Bernfield, M., Banerjee, S. D., Koda, J. E., and Rapraeger, A. C. (1984). In “The Role of Extracellular Matrix in Development” (R. L. Trelstad, ed.), pp. 545-572. Liss, New York.
Blumberg, B., Fessler, L. I., Kurkinen, M., and Fessler, J. H. (1986). J. Cell Biol. 103, 1711-1719.
Blumberg, B., MacKrell, A. J., Olsen, P. F., Kurkinen, M., Monson, J. M., Natzle, J. E., and Fessler, J. H. (1987).
Boedtker, H., Fuller, F., and Tate, V. (1983). Int. Rev. Connect. Tissue Res. 10, 1-63.
Bolander, M. E., Young, M. F., Fisher, L. W., Termine, J. D., and Yamada, Y. (1988). Proc. Natl. Acad. Sci. U.S.A. 85, 2919-2923.
Bourdon, M. A., Oldberg, A., Pierschbacher, M., and Ruoslahti, E. (1985). Proc. Natl. Acad. Sci. U.S.A. 82, 1321-1325.
Brazel, D., Oberbäumer, I., Dieringer, H., Babel, W., Glanville, R. W., Deutzmann, R., and Kühn, K. (1988). Eur. J. Biochem., submitted.
Brinker, J. M., Gudas, L. J., Loidl, H. R., Wang, S. Y., Rosenbloom, J., Kefalides, N. A., and Myers, J. C. (1985). Proc. Natl. Acad. Sci. U.S.A. 82, 3649-3653.
Brownlee, M., and Cerami, A. (1981). Annu. Rev. Biochem. 50, 385-432.
Burgeson, R. E., Morris, N. P., Murray, L. W., Duncan, K. G., Keene, D. R., and Sakai, L. Y. (1985). Ann. N.Y. Acad. Sci. 460, 47-57.
Butkowski, R. J., Wieslander, J., Wisdom, B. J., Barr, J. F., Noelken, M. E., and Hudson, B. G. (1985). J. Biol. Chem. 260, 3739-3747.
Butkowski, R. J., Langeveld, J. P. M., Wieslander, J., Hamilton, J., and Hudson, B. G. (1987). J. Biol. Chem. 262, 7874-7877.
Carlin, B., Jaffe, R., Bender, B., and Chung, A. E. (1981). J. Biol. Chem. 256, 5209-5214.
Carlin, B. E., Durkin, M. E., Bender, B., Jaffe, R., and Chung, A. E. (1983). J. Biol. Chem. 258, 7729-7737.
Charonis, A. S., Tsilibary, E. C., Yurchenco, P. D., and Furthmayr, H. (1985). J. Cell Biol. 100, 1848-1853.
Charonis, A. S., Tsilibary, E. C., Saku, T., and Furthmayr, H. (1986). J. Cell Biol. 103, 1689-1697.
Chung, A. E., Jaffe, R., Freeman, I. L., Vergnes, J. P., Braginski, J. E., and Carlin, B. (1979). Cell 16, 277-287.
Cooper, A. R., and MacQueen, H. A. (1983). Dev. Biol. 96, 467-471.
Cooper, A. R., Kurkinen, M., Taylor, A., and Hogan, B. L. M. (1981). Eur. J. Biochem. 119, 189-197.
Cornbrooks, C. J., Carey, D. J., McDonald, J. A., Timpl, R., and Bunge, R. P. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 3850-3854.
Couchman, J. R., Caterson, B., Christner, J. F., and Baker, J. R. (1984). Nature (London) 307, 650-652.
Courtoy, P. J., Timpl, R., and Farquhar, M. (1982). J. Histochem. Cytochem. 30, 874-886.
Daniels, I. R., and Chu, G. H. (1975). J. Biol. Chem. 250, 3531-3537.
Dehm, P., and Kefalides, N. A. (1978). J. Biol. Chem. 253, 6680-6686.
Dieringer, H., Hollister, D. W., Glanville, R. W., Sakai, L. Y., and Kühn, K. (1985). Biochem. J. 227, 217-222.
Dreyfus, P. A., Rieger, F., and Pincon-Raymond, M. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 6698-6702.
Duncan, K. G., Fessler, L. I., Bächinger, H. P., and Fessler, J. H. (1983). J. Biol. Chem. 258, 5869-5877.
Durkin, M. E., Phillips, S. L., and Chung, A. E. (1986). Differentiation 32, 260-266.
Dyck, R. F., Lockwood, C. M., Kershaw, M., McHugh, N., Duance, V. C., Baltz, M. L., and Pepys, M. B. (1980). J. Exp. Med. 152, 1162-1174.
Dziadek, M., and Timpl, R. (1985). Dev. Biol. 111, 372-382.
Dziadek, M., Paulsson, M., and Timpl, R. (1985a). EMBO J. 4, 2513-2518.
Dziadek, M., Fujiwara, S., Paulsson, M., and Timpl, R. (1985b). EMBO J. 4, 905-912.
Dziadek, M., Edgar, D., Paulsson, M., Timpl, R., and Fleischmajer, R. (1986a). Ann. N.Y. Acad. Sci. 486, 248-259.
Dziadek, M., Paulsson, M., Aumailley, M., and Timpl, R. (1986b). Eur. J. Biochem. 161, 455-464.
Edgar, D., Timpl, R., and Thoenen, H. (1984). EMBO J. 3, 1463-1468.
Engel, J., Odermatt, E., Engel, A., Madri, J. A., Furthmayr, H., Rohde, H., and Timpl, R. (1981). J. Mol. Biol. 150, 97-120.
Engel, J., Taylor, W., Paulsson, M., Sage, H., and Hogan, B. L. M. (1987). Biochemistry 26, 6958-6965.
Engvall, E., Krusius, T., Wewer, U., and Ruoslahti, E. (1983). Arch. Biochem. Biophys. 222, 649-656.
Engvall, E., Davis, G. E., Dickerson, K., Ruoslahti, E., Varon, S., and Manthorpe, M. (1986). J. Cell Biol. 103, 2457-2465.
Farquhar, M. G. (1981). In “Cell Biology of Extracellular Matrix” (E. D. Hay, ed.), pp. 335-378. Plenum, New York.
Fessler, L. I., Duncan, K. G., Fessler, J. H., Salo, T., and Tryggvason, K. (1984a). J. Biol. Chem. 259, 9783-9789.
Fessler, J. H., Lunstrum, G., Duncan, K. G., Campbell, A. G., Stern, R., Bächinger, H. P., and Fessler, L. I. (1984b). In “The Role of Extracellular Matrix in Development” (R. L. Trelstad, ed.), pp. 207-219. Liss, New York.
Fine, J. D., Breathnach, S. M., Hintner, H., and Katz, S. I. (1984). J. Invest. Dermatol. 82, 35-38.
Fleischmajer, R., and Timpl, R. (1984). J. Histochem. Cytochem. 32, 315-321.
Fujiwara, S., Wiedemann, H., Timpl, R., Lustig, A., and Engel, J. (1984). Eur. J. Biochem. 143, 145-157.
Gehron Robey, P., and Martin, G. R. (1981). Collagen Relat. Res. 1, 27-38.
Glanville, R. W., Qian, R. Q., Siebold, B., Risteli, J., and Kühn, K. (1985). Eur. J. Biochem. 152, 213-219.
Goldberg, M., and Escaig-Haye, F. (1986). Eur. J. Cell Biol. 42, 365-368.
Goldsmith, L. A., and Briggaman, R. A. (1983). J. Invest. Dermatol. 81, 464-466.
Goodman, S., Deutzmann, R., and von der Mark, K. (1987). J. Cell Biol. 105, 589-598.
Gordon, J. R., and Bernfield, M. R. (1980). Dev. Biol. 74, 118-135.
Graf, J., Iwamoto, Y., Sasaki, M., Martin, G. R., Kleinman, H. K., Robey, F. A., and Yamada, Y. (1987). Cell 48, 989-996.
Grassi, J., Massoulie, J., and Timpl, R. (1983). Eur. J. Biochem. 133, 31-38.
Griffin, C. A., Emanuel, B. S., Hansen, J. R., Cavenee, W. K., and Myers, J. C. (1987). Proc. Natl. Acad. Sci. U.S.A. 84, 512-516.
Haralson, M. A., Federspiel, S. J., Martinez-Hernandez, A., Rhodes, K. R., and Miller, E. J. (1985). Biochemistry 24, 5792-5797.
Hassell, J. R., Gehron-Robey, P., Barrach, H.-J., Wilczek, J., Rennard, S. I., and Martin, G. R. (1980). Proc. Natl. Acad. Sci. U.S.A. 77, 4494-4498.
Hassell, J. R., Leyshon, W. C., Ledbetter, S. R., Tyree, B., Suzuki, S., Kato, M., Kimata, K., and Kleinman, H. K. (1985). J. Biol. Chem. 260, 8098-8105.
Hassell, J. R., Kimura, J. H., and Hascall, V. C. (1986). Annu. Rev. Biochem. 55, 539-567.
Hay, E. D., and Meier, S. (1974). J. Cell Biol. 62, 889-898.
Heathcote, J. G., and Grant, M. E. (1981). Int. Rev. Connect. Tissue Res. 9, 191-264.
Hofmann, H., Voss, T., Kühn, K., and Engel, J. (1984). J. Mol. Biol. 172, 325-343.
Hogan, B. L. M., Cooper, A. R., and Kurkinen, M. (1980). Dev. Biol. 80, 289-300.
Hogan, B. L. M., Taylor, A., and Cooper, A. R. (1982a). Dev. Biol. 90, 210-214.
Hogan, B. L. M., Taylor, A., Kurkinen, M., and Couchman, J. R. (1982b). J. Cell Biol. 95, 197-204.
Hook, M., Kjellen, L., and Johansson, S. (1984). Annu. Rev. Biochem. 53, 849-869.
Horwitz, A., Duggan, K., Greggs, R., Decker, C., and Buck, C. (1985). J. Cell Biol. 101, 2134-2144.
Howe, C. C., and Dietzschold, B. (1983). Dev. Biol. 98, 385-391.
Howe, C. C., and Solter, D. (1980). Dev. Biol. 77, 480-487.
Hughes, R. C., Taylor, A., Sage, H., and Hogan, B. L. M. (1987). Eur. J. Biochem. 163, 57-65.
Hynes, R. (1985). Annu. Rev. Cell Biol. 1, 119-142.
Inestrosa, N. C., Silberstein, L., and Hall, Z. W. (1982). Cell 29, 71-79.
Inoue, S., and Leblond, C. P. (1985). Am. J. Anat. 174, 399-407.
Inoue, S., Leblond, C. P., and Laurie, G. W. (1983). J. Cell Biol. 97, 1524-1537.
Johnson, L. D., and Warfel, J. (1976). Biochim. Biophys. Acta 455, 538-549.
Kanwar, Y. S., and Farquhar, M. G. (1979a). J. Cell Biol. 81, 137-153.
Kanwar, Y. S., and Farquhar, M. G. (1979b). Proc. Natl. Acad. Sci. U.S.A. 76, 1303-1307.
Kanwar, Y. S., and Farquhar, M. G. (1979c). Proc. Natl. Acad. Sci. U.S.A. 76, 4493-4497.
Kanwar, Y. S., Hascall, V. C., and Farquhar, M. G. (1981). J. Cell Biol. 90, 527-532.
Kanwar, Y. S., Veis, A., Kimura, J. H., and Jakubowski, M. L. (1984). Proc. Natl. Acad. Sci. U.S.A. 81, 762-766.
Karnovsky, M. J. (1979). Annu. Rev. Med. 30, 213-224.
Keene, D. R., Sakai, L. Y., Lunstrum, G. P., Morris, N. P., and Burgeson, R. E. (1987). J. Cell Biol. 104, 611-621.
Kefalides, N. A. (1972a). Biochem. Biophys. Res. Commun. 47, 1151-1158.
Kefalides, N. A. (1972b). Connect. Tissue Res. 1, 3-13.
Kefalides, N. A. (1973). Int. Rev. Connect. Tissue Res. 6, 63-104.
Kefalides, N. A., ed. (1978). “Biology and Chemistry of Basement Membranes.” Academic Press, New York.
Kefalides, N. A., Alper, R., and Clark, C. C. (1979). Int. Rev. Cytol. 61, 167-228.
Kleinman, H. K., McGarvey, M. L., Liotta, L. A., Gehron-Robey, P., Tryggvason, K., and Martin, G. R. (1982). Biochemistry 21, 6188-6193.
Kleinman, H. K., McGarvey, M. L., Hassell, J. R., and Martin, G. R. (1983). Biochemistry 22, 4969-4974.
Kleinman, H. K., Cannon, F. B., Laurie, G. W., Hassell, J. R., Aumailley, M., Terranova, V. P., Martin, G. R., and Dubois-Dalcq, M. (1985). J. Cell. Biochem. 27, 317-325.
Kleinman, H. K., McGarvey, M. L., Hassell, J. R., Star, V. L., Cannon, F. B., Laurie, G. W., and Martin, G. R. (1986). Biochemistry 25, 312-318.
Kleinman, H. K., Ebihara, I., Killen, P., Sasaki, M., Cannon, F. B., Yamada, Y., and Martin, G. R. (1987). Dev. Biol. 122, 373-378.
Kornblihtt, A. R., Umezawa, K., Vibe-Pedersen, K., and Baralle, F. E. (1985). EMBO J. 4, 1755-1759.
Krakower, C. A., and Greenspon, S. A. (1978). In “Biology and Chemistry of Basement Membranes” (N. A. Kefalides, ed.), pp. 1-16. Academic Press, New York.
Kretsinger, R. H. (1980). CRC Crit. Rev. Biochem. 8, 110-174.
Kühn, K., Wiedemann, H., Timpl, R., Risteli, J., Dieringer, H., Voss, T., and Glanville, R. W. (1981). FEBS Lett. 125, 123-128.
Kühn, K., Schöne, H. H., and Timpl, R., eds. (1982). “New Trends in Basement Membrane Research.” Raven, New York.
Kurkinen, M., Barlow, D. P., Jenkins, J. R., and Hogan, B. L. M. (1983). J. Biol. Chem. 258, 6543-6548.
Kurkinen, M., Taylor, A., Garrels, J. I., and Hogan, B. L. M. (1984). J. Biol. Chem. 259, 5915-5922.
Kurkinen, M., Bernard, M. P., Barlow, D. P., and Chow, L. T. (1985). Nature (London) 317, 177-179.
Laurie, G. W., and Leblond, C. P. (1985). Nature (London) 313, 272.
Laurie, G. W., Leblond, C. P., and Martin, G. R. (1983). Am. J. Anat. 167, 71-82.
Laurie, G. W., Leblond, C. P., Inoue, S., Martin, G. R., and Chung, A. (1984). Am. J. Anat. 169, 463-481.
Laurie, G. W., Bing, J. T., Kleinman, H. K., Hassell, J. R., Aumailley, M., Martin, G. R., and Feldmann, J. R. (1986). J. Mol. Biol. 189, 205-216.
Leblond, C. P., Glegg, R. E., and Eidinger, D. (1957). J. Histochem. Cytochem. 5, 445-458.
Ledbetter, S. R., Tyree, B., Hassell, J. R., and Horigan, E. A. (1985). J. Biol. Chem. 260, 8106-8113.
Ledbetter, S. R., Fisher, L. W., and Hassell, J. R. (1987). Biochemistry 26, 988-995.
Leivo, I. (1983). Med. Biol. 61, 1-30.
Leivo, I., Vaheri, A., Timpl, R., and Wartiovaara, J. (1980). Dev. Biol. 76, 100-114.
Leivo, I., Alitalo, K., Risteli, L., Vaheri, A., Timpl, R., and Wartiovaara, J. (1982). Exp. Cell Res. 137, 15-23.
Lesot, H., Kühl, U., and von der Mark, K. (1983). EMBO J. 2, 861-865.
Lever, W. F. (1979). J. Am. Acad. Dermatol. 1, 2-31.
Liotta, L. A., Abe, S., Gehron-Robey, P., and Martin, G. R. (1979). Proc. Natl. Acad. Sci. U.S.A. 76, 2268-2272.
Liotta, L. A., Rao, C. N., and Wewer, U. (1986). Annu. Rev. Biochem. 55, 1037-1057.
Lozano, G., Ninomiya, Y., Thompson, H., and Olsen, B. R. (1985). Proc. Natl. Acad. Sci. U.S.A. 81, 3014-3018.
McCarthy, R. A., Beck, K., and Burger, M. M. (1987). EMBO J. 6, 1587-1593.
McMahan, U. J., Sanes, J. R., and Marshall, L. M. (1978). Nature (London) 271, 172-174.
Madri, J. A., Roll, F. J., Furthmayr, H., and Foidart, J.-M. (1980). J. Cell Biol. 86, 682-687.
Madri, J. A., Foellmer, H. G., and Furthmayr, H. (1983). Biochemistry 22, 2797-2804.
Mainardi, C. L., Dixit, S. N., and Kang, A. H. (1980). J. Biol. Chem. 255, 5435-5441.
Malinoff, H. L., and Wicha, M. S. (1983). J. Cell Biol. 96, 1475-1479.
Mann, K., Deutzmann, R., Paulsson, M., and Timpl, R. (1987). FEBS Lett. 218, 167-172.
Martin, G. R., and Timpl, R. (1987). Annu. Rev. Cell Biol., in press.
Martinez-Hernandez, A., and Amenta, P. S. (1983). Lab. Invest. 48, 656-677.
Martinez-Hernandez, A., Marsh, C. A., Clark, C. C., Macarak, E. J., and Brownell, A. G. (1981). Collagen Relat. Res. 1, 405-418.
Martinez-Hernandez, A., Miller, E. J., Damjanov, I., and Gay, S. (1982). Lab. Invest. 47, 247-257.
Mason, I. J., Taylor, A., Williams, J. G., Sage, H., and Hogan, B. L. M. (1986a). EMBO J. 5, 1465-1472.
Mason, I. J., Murphy, D., Münke, M., Francke, U., Elliott, R. W., and Hogan, B. L. M. (1986b). EMBO J. 5, 1831-1837.
Mayne, R., and Zettergren, J. G. (1980). Biochemistry 19, 4065-4072.
Mayne, R., Wiedemann, H., Dessau, W., von der Mark, K., and Bruckner, P. (1982). Eur. J. Biochem. 126, 417-423.
Mayne, R., Wiedemann, H., Irwin, M. H., Sanderson, R. D., Fitch, J. M., Linsenmayer, T. F., and Kühn, K. (1984). J. Cell Biol. 98, 1637-1644.
Mohan, P. S., and Spiro, R. G. (1986). J. Biol. Chem. 261, 4328-4336.
Morita, A., Sugimoto, E., and Kitagawa, Y. (1985). Biochem. J. 229, 259-264.
Oberbäumer, I., Wiedemann, H., Timpl, R., and Kühn, K. (1982). EMBO J. 1, 805-810.
Odermatt, B. F., Lang, A. B., Rüttner, J. R., Winterhalter, K. H., and Trüeb, B. (1984). Proc. Natl. Acad. Sci. U.S.A. 81, 7343-7347.
Ohno, M., Martinez-Hernandez, A., Ohno, N., and Kefalides, N. A. (1983). Biochem. Biophys. Res. Commun. 112, 1091-1098.
Oldberg, A., Hayman, E. G., and Ruoslahti, E. (1981). J. Biol. Chem. 256, 10847-10852.
Oohira, A., Wight, T. N., McPherson, J., and Bornstein, P. (1982). J. Cell Biol. 92, 357-367.
Oohira, A., Wight, T. N., and Bornstein, P. (1983). J. Biol. Chem. 258, 2014-2021.
Orkin, R. W., Gehron, P., McGoodwin, E. B., Martin, G. R., Valentine, T., and Swarm, R. (1977). J. Exp. Med. 145, 204-220.
Ott, U., Odermatt, E., Engel, J., Furthmayr, H., and Timpl, R. (1982). Eur. J. Biochem. 123, 63-72.
Palm, S. L., and Furcht, L. T. (1983). J. Cell Biol. 96, 1218-1226.
Palm, S. L., McCarthy, J. B., and Furcht, L. T. (1985). Biochemistry 24, 7753-7760.
Parthasarathy, N., and Spiro, R. G. (1982). Arch. Biochem. Biophys. 213, 504-511.
Parthasarathy, N., and Spiro, R. G. (1984). J. Biol. Chem. 259, 12749-12755.
Patel, H. P., Anhalt, G. J., and Diaz, L. A. (1983). Ann. Allergy 50, 144-150.
Paulsson, M., Deutzmann, R., Timpl, R., Dalzoppo, D., Odermatt, E., and Engel, J. (1985a). EMBO J. 4, 309-316.
Paulsson, M., Dziadek, M., Suchanek, C., Huttner, W. B., and Timpl, R. (1985b). Biochem. J. 231, 571-579.
Paulsson, M., Deutzmann, R., Dziadek, M., Nowack, H., Timpl, R., Weber, S., and Engel, J. (1986). Eur. J. Biochem. 156, 467-478.
Paulsson, M., Aumailley, M., Deutzmann, R., Timpl, R., Beck, K., and Engel, J. (1987a). Eur. J. Biochem. 166, 11-19.
Paulsson, M., Yurchenco, P. D., Ruben, G. C., Engel, J., and Timpl, R. (1987b). J. Mol. Biol. 197, 297-313.
Pejler, G., Backstrom, G., Lindahl, U., Paulsson, M., Dziadek, M., Fujiwara, S., and Timpl, R. (1987). J. Biol. Chem. 262, 5036-5043.
Pepys, M. B., and Baltz, M. L. (1983). Adv. Immunol. 34, 141-212.
Peters, B. P., Hartle, R. J., Krzesick, R. F., Kroll, T. G., Perini, F., Balun, J. E., Goldstein, I. J., and Ruddon, R. W. (1985). J. Biol. Chem. 260, 14732-14742.
Peters, D. K., Rees, A. J., Lockwood, C. M., and Pusey, C. D. (1982). Transplant. Proc. 14, 513-521.
Pierce, G. B., Jones, A., Orfanakis, N. G., Nakane, P. K., and Lustig, L. (1982). Differentiation 23, 60-72.
Pierschbacher, M. D., and Ruoslahti, E. (1984). Nature (London) 309, 30-33.
Pihlajaniemi, T., Tryggvason, K., Myers, J. C., Kurkinen, M., Lebo, R., Cheung, M. C., Prockop, D. J., and Boyd, C. D. (1985). J. Biol. Chem. 260, 7681-7687.
Porter, R., and Whelan, J., eds. (1984). Basement membranes and cell movement. CIBA Found. Symp. 108.
Prehm, P., Dessau, W., and Timpl, R. (1982). Connect. Tissue Res. 10, 275-285.
Rao, C. N., Margulies, I. M. K., Tralka, T. S., Terranova, V. P., Madri, J. A., and Liotta, L. A. (1982). J. Biol. Chem. 257, 9740-9744.
Rao, N. C., Barsky, S. H., Terranova, V. P., and Liotta, L. A. (1983). Biochem. Biophys. Res. Commun. 111, 804-808.
Rao, C. N., Margulies, I. M. K., and Liotta, L. A. (1985). Biochem. Biophys. Res. Commun. 128, 45-52.
Reich, R., Thompson, E. W., Iwamoto, Y., Martin, G. R., Deason, J. R., Fuller, G. C., and Miskin, R. (1988). Cancer Res. 48, 3307-3312.
Risteli, J., Bächinger, H. P., Furthmayr, H., Engel, J., and Timpl, R. (1980). Eur. J. Biochem. 108, 239-250.
Rohrbach, D. H., Wagner, C. W., Star, V. L., Martin, G. R., Brown, K. S., and Yoon, J. W. (1983). J. Biol. Chem. 258, 11672-11677.
Romberg, R. W., Werness, P. G., Lollar, P., Riggs, B. L., and Mann, K. G. (1985). J. Biol. Chem. 260, 2728-2736.
Sakai, L. Y., Keene, D. R., Morris, N. P., and Burgeson, R. E. (1986). J. Cell Biol. 103, 1577-1586.
Sakurai, Y., Sullivan, M., and Yamada, Y. (1986). J. Biol. Chem. 261, 6654-6657.
Salo, T., Liotta, L. A., and Tryggvason, K. (1983). J. Biol. Chem. 258, 3058-3063.
Sasaki, M., Kato, S., Kohno, K., Martin, G. R., and Yamada, Y. (1987). Proc. Natl. Acad. Sci. U.S.A. 84, 935-939.
Schwarz, U., Schuppan, D., Oberbäumer, I., Glanville, R. W., Deutzmann, R., Timpl, R., and Kühn, K. (1986). Eur. J. Biochem. 157, 49-56.
Schwarz-Magdolen, U., Oberbäumer, I., and Kühn, K. (1986). FEBS Lett. 208, 203-207.
Semoff, S., Hogan, B. L. M., and Hopkins, C. R. (1982). EMBO J. 1, 1171-1175.
Shibata, S., ed. (1985). Proc. Int. Symp. Basement Membr. Mishima, June 24-26.
Siebold, B., Quian, R. G., Glanville, R. W., Hofman, H., Deutzmann, R., and Kühn, K. (1987). Eur. J. Biochem. 168, 569-575.
Skorstengaard, K., Jensen, M. S., Sahl, P., Petersen, T. E., and Magnusson, S. (1986). Eur. J. Biochem. 161, 441-453.
Soininen, R., Tikka, L., Chow, L., Pihlajaniemi, T., Kurkinen, M., Prockop, D. J., Boyd, C. D., and Tryggvason, K. (1986a). Proc. Natl. Acad. Sci. U.S.A. 83, 1568-1572.
Soininen, R., Chow, L., Kurkinen, M., Tryggvason, K., and Prockop, D. J. (1986b). EMBO J. 5, 2821-2823.
Stanley, J. R., Hawley-Nelson, P., Yuspa, S. H., Shevach, E. M., and Katz, S. I. (1981). Cell 24, 897-903.
Stanley, J. R., Hawley-Nelson, P., Yaar, M., Martin, G. R., and Katz, S. I. (1982). J. Invest. Dermatol. 78, 456-459.
Stenman, S., and Vaheri, A. (1978). J. Exp. Med. 147, 1054-1064.
Stow, J. L., Glasgow, E. F., Handley, C. F., and Hascall, V. C. (1983). Arch. Biochem. Biophys. 225, 950-957.
Stow, J. L., Sawada, H., and Farquhar, M. G. (1985a). Proc. Natl. Acad. Sci. U.S.A. 82, 3296-3300.
Stow, J. L., Kjellen, L., Unger, E., Hook, M., and Farquhar, M. G. (1985b). J. Cell Biol. 100, 975-980.
Strickland, S., Smith, K. K., and Marotti, K. R. (1980). Cell 21, 347-355.
Tarsio, J. F., Reger, L. A., and Furcht, L. T. (1987). Biochemistry 26, 1014-1020.
Terranova, V. P., Rohrbach, D. H., and Martin, G. R. (1980). Cell 22, 719-726.
Terranova, V. P., Hujanen, E. S., Loeb, D. M., Martin, G. R., Thornberg, L., and Glushko, V. (1986a). Proc. Natl. Acad. Sci. U.S.A. 83, 465-469.
Terranova, V. P., Aumailley, M., Sultan, L. H., Martin, G. R., and Kleinman, H. K. (1986b). J. Cell. Physiol. 127, 473-479.
Terranova, V. P., Hujanen, E. S., and Martin, G. R. (1986c). J. Natl. Cancer Inst. 77, 311-316.
Timpl, R., and Dziadek, M. (1986). Int. Rev. Exp. Pathol. 29, 1-112.
Timpl, R., and Martin, G. R. (1982). In “Immunochemistry of the Extracellular Matrix” (H. Furthmayr, ed.), Vol. II, pp. 119-150. CRC Press, Boca Raton, Florida.
Timpl, R., Bruckner, P., and Fietzek, P. (1979a). Eur. J. Biochem. 95, 255-263.
Timpl, R., Risteli, J., and Bächinger, H. P. (1979b). FEBS Lett. 101, 265-268.
Timpl, R., Rohde, H., Gehron Robey, P., Rennard, S. I., Foidart, J.-M., and Martin, G. R. (1979c). J. Biol. Chem. 254, 9933-9937.
Timpl, R., Wiedemann, H., van Delden, V., Furthmayr, H., and Kühn, K. (1981). Eur. J. Biochem. 120, 203-211.
Timpl, R., Dziadek, M., Fujiwara, S., Nowack, H., and Wick, G. (1983a). Eur. J. Biochem. 137, 455-465.
Timpl, R., Johansson, S., van Delden, V., Oberbäumer, I., and Hook, M. (1983b). J. Biol. Chem. 258, 8922-8927.
Timpl, R., Oberbäumer, I., von der Mark, H., Bode, W., Wick, G., Weber, S., and Engel, J. (1985). Ann. N.Y. Acad. Sci. 460, 58-72.
Trüeb, B., Grobli, B., Spiess, M., Odermatt, B. F., and Winterhalter, K. H. (1982). J. Biol. Chem. 257, 5239-5245.
Tsilibary, E. C., and Charonis, A. S. (1986). J. Cell Biol. 103, 2467-2473.
Turpeenniemi-Hujanen, T., Thorgeirsson, U. P., Rao, C. N., and Liotta, L. A. (1986). J. Biol. Chem. 261, 1883-1889.
Tyree, B., Horigan, E. A., Klippenstein, D. L., and Hassell, J. R. (1984). Arch. Biochem. Biophys. 231, 328-335.
Uitto, V. J., Schwartz, D., and Veis, A. (1980). Eur. J. Biochem. 105, 409-417.
Vigny, M., Martin, G. R., and Grotendorst, G. R. (1983). J. Biol. Chem. 258, 8794-8798.
Von der Mark, K., and Kühl, U. (1985). Biochim. Biophys. Acta 823, 147-160.
Wartiovaara, J., Leivo, I., and Vaheri, A. (1979). Dev. Biol. 69, 247-257.
Weber, S., Engel, J., Wiedemann, H., Glanville, R. W., and Timpl, R. (1984). Eur. J. Biochem. 139, 401-410.
Wewer, U. M., Liotta, L. A., Jaye, M., Ricca, G. A., Drohan, W. N., Claysmith, A. P., Rao, C. N., Wirth, P., Coligan, J. E., Albrechtsen, R., Mudryj, M., and Sobel, M. E. (1986). Proc. Natl. Acad. Sci. U.S.A. 83, 7137-7141.
Wieslander, J., and Heinegard, D. (1985). Ann. N.Y. Acad. Sci. 460, 363-374.
Wieslander, J., Bygren, P., and Heinegard, D. (1984a). Proc. Natl. Acad. Sci. U.S.A. 81, 1544-1548.
Wieslander, J., Barr, J. F., Butkowski, R. J., Edwards, S. J., Bygren, P., Heinegard, D., and Hudson, B. G. (1984b). Proc. Natl. Acad. Sci. U.S.A. 81, 3838-3842.
Wieslander, J., Langeveld, J., Butkowski, R., Jodlowski, M., Noelken, M., and Hudson, B. G. (1985). J. Biol. Chem. 260, 8564-8570.
Wilson, C. B., and Dixon, F. J. (1981). In “The Kidney” (B. M. Brenner and F. C. Rector, eds.), pp. 1237-1352. Saunders, Philadelphia.
Woodley, D. T., Rao, C. N., Hassell, J. R., Liotta, L. A., Martin, G. R., and Kleinman, H. K. (1983). Biochim. Biophys. Acta 761, 278-283.
Yamada, K. M. (1983). Annu. Rev. Biochem. 52, 761-799.
Yamada, K. M., and Kennedy, D. W. (1987). J. Cell. Physiol. 130, 21-28.
Yamada, Y., Avvedimento, V. E., Mudryj, M., Ohkubo, H., Vogeli, G., Irani, M., Pastan, I., and de Crombrugghe, B. (1980). Cell 22, 887-892.
Yamada, Y., Sasaki, N., Kohno, K., Kleinman, H. K., Kato, S., and Martin, G. R. (1985). In “Basement Membranes” (S. Shibata, ed.), pp. 139-146. Elsevier, Amsterdam.
Young, M. F., Bolander, M. E., Day, A. A., Ramis, C. I., Gehron Robey, P., Yamada, Y., and Termine, J. D. (1986). Nucleic Acids Res. 14, 4483-4497.
Yurchenco, P. D., and Furthmayr, H. (1984). Biochemistry 23, 1839-1850.
Yurchenco, P. D., and Furthmayr, H. (1985). Ann. N.Y. Acad. Sci. 460, 530-533.
Yurchenco, P. D., Tsilibary, E. C., Charonis, A. S., and Furthmayr, H. (1985). J. Biol. Chem. 260, 7636-7644.
DESIGN OF PEPTIDES AND PROTEINS
By WILLIAM F. DEGRADO
Central Research and Development Department, E. I. du Pont de Nemours & Company, Incorporated, Experimental Station, Wilmington, Delaware 19898
I. Introduction
II. Design of Small Peptides
   A. Conformational Properties of the Commonly Occurring Amino Acids
   B. Properties of Some Conformationally Constrained Amino Acids
   C. Cyclic Peptides
   D. Design of Analogs of the Pentapeptide Enkephalin
III. Design of Medium-Sized Peptides
   A. Factors Stabilizing the Formation of Secondary Structures in Aqueous Solution
   B. Enhancing the Helical Potential of Natural Peptides
   C. Design of Peptides That Form Amphiphilic Secondary Structures
IV. Protein Design
   A. Why Design Proteins?
   B. α-Helical Proteins
   C. β Proteins
   D. Conclusions
References
I. INTRODUCTION

Recent dramatic advances in methodologies for the synthesis, modification, and analysis of peptides have increased the ease with which novel sequences can be prepared. This development has opened a number of new possibilities for studying peptide and protein structure/function relationships (Oxender and Fox, 1987). For instance, the techniques of peptide synthesis and site-directed mutagenesis provide convenient methods for introducing systematic changes into a native protein sequence, thereby allowing one to evaluate how one or more side chains contribute to the physical and biological properties of a protein. Furthermore, as our understanding of the determinants of peptide and protein structure expands, it should be increasingly possible to design peptides and proteins with predetermined structures and properties. Initially, this approach should be useful for understanding natural peptides and proteins, for it critically tests our understanding of these systems; but ultimately it also holds the promise of providing entirely novel substances with unprecedented properties.

ADVANCES IN PROTEIN CHEMISTRY, Vol. 39
Copyright © 1988 by Academic Press, Inc. All rights of reproduction in any form reserved.
This article will summarize some contemporary approaches to the design of structurally defined peptides and proteins, with special emphasis on how this approach has helped to elucidate the structural basis for the function of a variety of natural molecules.

When in their proper biological milieu, peptides and proteins assume fairly well-defined conformations, which are responsible for their biological and physical properties. The mechanisms by which these molecules adopt defined three-dimensional structures depend on their size and chemical compositions. With few exceptions (Shoemaker et al., 1985, 1987a,b), short to medium-sized peptides (fewer than 30-50 residues) tend to be unordered in dilute aqueous solutions; that is, they adopt a large number of different conformations that are in dynamic equilibrium. However, peptides of this size adopt defined conformations when they bind to their receptors (e.g., membranes, proteins), which often have preorganized structures. Thus, short and medium-sized peptides tend to adopt conformations that are complementary to those of their receptors. In contrast, proteins tend to adopt fairly well-defined three-dimensional structures, even in the absence of stabilizing interactions with other molecules or molecular surfaces. The intramolecular interactions involved in the stabilization of protein structures are weak and tend to be of the same type as those used by peptides in their intermolecular interactions with receptors. Thus, a major difference in the conformational properties of peptides and those of proteins is that proteins are large enough to contain sufficient loci for intramolecular, noncovalent interactions to overcome the unfavorable conformational entropy associated with adopting a defined three-dimensional structure.

There are a variety of strategies and approaches for protein and peptide design.
Those adopted for a given problem will depend on the size and molecular characteristics of the desired target molecule. If one wishes to create a model for a small peptide (e.g., the pentapeptide hormone enkephalin), then it might be feasible to design an analog of this molecule that is locked into a biologically active conformation. In this case, a conformation (or set of conformations) that promotes a productive interaction with a receptor is (or are) reinforced by introducing conformational constraints including macrocyclization and addition of alkyl groups that sterically restrict the conformational space available to the peptide chain. For peptides of longer chain lengths, it is not possible to introduce sufficient conformational constraints to restrict the peptide to just one or a few sets of conformations. In these cases, a different type of approach often produces impressive results. One postulates a hypothetical structure for the receptor-bound conformation of a given peptide, and then one designs an analog of this peptide that has minimal sequence homology to the
parent peptide but quintessentially embodies the conformational and molecular characteristics thought to be important for biological activity. Finally, the design of proteins that adopt predetermined three-dimensional structures is perhaps the most ambitious goal, and one that has been attacked only recently. Because of the large size of proteins, the number of conformational possibilities for even a small protein of 100 residues reaches astronomical proportions (Creighton, 1984). Nevertheless, by carefully considering the structures of natural proteins and by judiciously applying computational and graphical techniques in conjunction with physical models, it appears possible to achieve this goal. This review will highlight each of these three approaches to peptide and protein design.

II. DESIGN OF SMALL PEPTIDES
The design of conformationally constrained peptides is a powerful method for elucidating the biologically active conformation of a peptide when it is bound to its receptor (for reviews, see Hruby, 1982, 1984, 1985a,b; Freidinger and Veber, 1984; Struthers et al., 1984). In solution, small peptides tend to adopt large ensembles of conformations, and the biologically active conformation might only be rarely populated (Hruby, 1982, 1984, 1985a,b). Therefore, the solution conformations of a peptide deduced by experimental or theoretical methods, although interesting in establishing the dynamic range of conformations available to a given peptide, are not necessarily of relevance to the receptor-bound conformation. For instance, the conformations of Leu-enkephalin have been investigated by X-ray spectroscopy, NMR, conformational calculations, and fluorescence spectroscopy (reviewed in Schiller, 1984). The picture emerging from these techniques is that the predominant backbone conformation of this flexible molecule is highly dependent on its environment. Figure 1 illustrates two of the H-bonding patterns observed for Met-enkephalin in crystal structures (Smith and Griffin, 1978; Stezowski et al., 1985; Griffin et al., 1986). Some of the observed conformations for Leu-enkephalin, such as the extended β strands (Camerman et al., 1983; Griffin et al., 1986), do not appear to be consistent with the receptor-bound conformations inferred from studies with cyclic peptides (Schiller, 1984). On the other hand, some of the other experimentally and theoretically derived conformations, particularly those involving turns, are broadly consistent with studies with conformationally constrained peptides (Schiller, 1984). Thus, the synthesis of conformationally constrained peptides used in conjunction with theoretical and experimental methods of conformational analysis provides a powerful method for determining which (if any) of the many conformations observed in solution
FIG. 1. Small peptides can assume multiple conformational states in the solid state. The hydrogen-bonding patterns for two crystalline forms of Met-enkephalin (A and B) and a potent derivative of Met-enkephalin (C) as determined by X-ray crystallography are illustrated (Smith and Griffin, 1978; Stezowski et al., 1985; Griffin et al., 1986). The conformations are antiparallel β pleated sheet (A), type I' β turn (B), and type II' β turn (C).
are responsible for binding and activating a receptor. In addition, conformationally constrained peptides might also be expected to bind more tightly to their receptors, because their binding should occur with a smaller decrease in the conformational entropy than the binding of an unconstrained analog. Finally, different receptor types appear to recognize different conformations of the same hormones (Schiller and DiMaio, 1982). Therefore, conformationally restricted peptides can have marked selectivities for binding to a certain receptor class.

An important first step in the design of conformationally constrained peptides is the delineation of the residues that are important for activity. This goal is generally accomplished by first synthesizing a set of analogs that vary from the parent peptide by having shorter chain lengths, establishing the minimal chain lengths for complete and partial activities. Once a minimal chain length has been established, each side chain is systematically varied to determine the importance of charge, steric bulk, hydrophobicity, aromaticity, and chirality at each position. These studies, although somewhat tedious, are essential for establishing the positions at which substituents might be introduced to constrain the conformation of the peptide. Also, after evaluation of the properties of a large set of analogs, it is often possible to postulate hypotheses concerning the conformational features involved in binding to the receptor. These hypotheses can be tested by preparing conformationally constrained analogs.

A. Conformational Properties of the Commonly Occurring Amino Acids
In a discussion of the effects of conformational constraints on the overall configuration of a peptide, the conformation versus potential energy maps introduced by Ramachandran (Ramachandran et al., 1963) are useful (Fig. 2). The peptide bond generally adopts a trans configuration and deviates only slightly from planarity (Cantor and Schimmel, 1980). Therefore, to a first approximation, only two angles need to be considered to define the backbone configuration of a peptide chain: φ, which defines the torsional angle about the N-Cα bond, and ψ, which defines the angle about the Cα-C (carbonyl) bond. Figure 2A illustrates the calculated conformational energy versus φ, ψ angle diagram for N-acetylglycine-N'-methylamide (Zimmerman et al., 1977; Scheraga, 1968). Even for this least hindered of amino acids, much of the conformational space is not allowed due to unfavorable overlaps of neighboring atoms. Figure 2B illustrates the effect that addition of a single methyl group has on the available conformational space (Zimmerman et al., 1977). Much of the conformational space available to glycine is not allowed for alanine, and the conformations corresponding to structural themes such as α helices and β sheets fall near local minima in the potential surface. The plots for the other common L-amino acids are similar, with the exception of proline. Before proceeding to a discussion of unusual, conformationally constrained amino acids, it is useful to discuss this residue (Cantor and Schimmel, 1980). The angle φ of proline is fixed at approximately -60° due to the constraints of the pyrrolidine ring. The values allowed for the corresponding ψ angle are approximately -55° and 130°. The φ = -60°, ψ = -55° set of angles leads to a partial reversal in the direction of the chain and explains the occurrence of proline at bends and the beginning of helices.
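For readers who wish to evaluate the two backbone torsion angles from atomic coordinates, the following Python sketch may help. It is illustrative only: the function name `dihedral` and the toy coordinates are assumptions introduced here and do not come from the works cited. It applies the standard four-point torsion formula, in which a cis arrangement gives 0° and a trans arrangement gives 180°. Under the usual convention, φ of residue i would be evaluated over the atoms C(i-1), N(i), Cα(i), C(i), and ψ over N(i), Cα(i), C(i), N(i+1).

```python
import math

def _sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond.

    Hypothetical helper: 0 for cis (eclipsed), 180 for trans,
    positive for a clockwise twist viewed along p1 -> p2.
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)          # normal to the plane p0-p1-p2
    n2 = _cross(b1, b2)          # normal to the plane p1-p2-p3
    b1_len = math.sqrt(_dot(b1, b1))
    x = _dot(n1, n2)
    y = b1_len * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (not a real peptide): a right-angle twist about the z axis.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Applied to every residue of a structure, such a function yields the (φ, ψ) pairs that are plotted on maps of the kind shown in Fig. 2.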
The N-alkyl group of proline also sterically restricts the conformation of the residue preceding it in sequence, thereby making it difficult for that residue to adopt an α-helical conformation. This is in large part the origin of proline's tendency to break helices; the lack of an amide NH
group also is an important factor. The β sheet and left-handed α-helical conformations remain accessible for the residue preceding proline in the sequence.

FIG. 2. Conformational energy contour map of N-acetyl-Gly-N'-methylamide (A), N-acetyl-Ala-N'-methylamide (B) (taken from Zimmerman et al., 1977), and N-acetyl-α-Aib-N'-methylamide (C) (taken from Paterson et al., 1981). Locations of minima are indicated by the filled circles. The contour lines in A and B are labeled with energy in kcal/mol above the minimum-energy point. The contour lines in C are drawn at 1, 3, 5, 10, and 15 kcal/mol above the 10° grid point of lowest energy for each map. The left graph in C was calculated using asymmetric bond lengths and angles for the dimethyl substituent as determined by X-ray crystallography. In the right graph, standard symmetrical, tetrahedral geometries were used.

B. Properties of Some Conformationally Constrained Amino Acids

A large number of amino acid and dipeptide derivatives with built-in conformational constraints have been synthesized and incorporated into biologically active peptides (reviewed in Hruby, 1982). In this review we shall consider only those modifications that appear to be generally applicable and for which the amino acids are either commercially available or can be prepared relatively easily.

1. Cα-Methylamino Acids

One of the most successful methods of constraining the conformation of an amino acid unit is to replace the hydrogen atom at the α-carbon with a methyl group to give a dialkylamino acid (reviewed in Prasad and Balaram, 1984; Toniolo et al., 1983; Bosch et al., 1984). Figure 2C illustrates the conformational energy map for the Cα-methyl derivative of alanine, N-acetyl-Cα,Cα-dimethylglycine-N'-methylamide (Ac-Aib-NMe, where Aib refers to α-aminoisobutyric acid), as a prototype for the Cα-methylamino acids (Paterson et al., 1981). The result of the addition of the α-methyl group to alanine is similar to the effect of adding a methyl group onto glycine. About 50% of the plot is accessible to glycine (Fig. 2A); addition of a single methyl group decreases the accessible area to 16% for alanine (Fig. 2B), whereas only a few percent remains accessible for the Aib residue (Fig. 2C). The region allowed for the Aib residue is particularly useful; it includes both the left-handed and right-handed α and 3₁₀ helices. If the residues adjoining the Aib residue are chiral, a given helical sense will be favored: left-handed for D-residues and right-handed for L-residues. Thus, the addition of one or several Aib residues should be a useful method to promote either a turn or a helical conformation. Depending on the surrounding sequence, one can obtain type I, I', III, or III' conformations (Rose et al., 1985); or, if no helix-breaking residues are present, a helix of either sense can be engineered. Indeed, crystallographic and
NMR investigations of Aib-containing peptides have confirmed this expectation (Prasad and Balaram, 1984; Bosch et al., 1984, 1985a,b; Fox and Richards, 1982; Kokkinidis et al., 1986; Bavaso et al., 1986; Karle et al., 1986). Alamethicin and related antibiotics contain multiple Aib residues in their sequences, and the crystal structures of a large number of fragments of these peptides have been determined to a high degree of resolution. As reviewed previously (Prasad and Balaram, 1984), the φ and ψ angles of Aib residues virtually always occur in the allowed region of the φ, ψ plot (Fig. 2C). Long peptides tend to occur in 3₁₀- or α-helical conformations, whereas short Aib-containing peptides generally adopt type I or III turns, although type II turns are also occasionally observed. Clearly, Cα-methylation is a particularly useful method of dictating a desired range of φ, ψ values for a given residue in a peptide chain.

Methodology that should have broad applicability is appearing for the synthesis of Cα-methyl derivatives of amino acids other than alanine. In particular, the chiral auxiliary approach of Schollkopf (1983), which involves the alkylation of Meerwein salts of lactim ethers of diketopiperazines, is promising. The "self-reproduction of chirality" approach of Seebach (Seebach and Fadel, 1985; Seebach et al., 1983, 1985) is even more attractive for the synthesis of chiral dialkyl glycine derivatives. In this method (Seebach and Fadel, 1985), an optically active amino acid is condensed with pivaldehyde, a reaction giving a cis-1,3-oxazolidin-5-one; the corresponding chiral enolate is then alkylated with methyl iodide, a reaction giving, after hydrolysis, the Cα-methylamino acid in high yield and diastereomeric purity. Such methods, elegant in their conceptual and experimental simplicity, should open the way to the widespread use of Cα-methylamino acids as conformational constraints.
2. Nα-Methylamino Acids

There are a variety of other conformational constraints that can be introduced into amino acids, but none appear to give the same degree of predictability or to have been studied as extensively as Cα-methylation. Nα-Methyl groups occur frequently in peptide antibiotics (Ovchinnikov and Ivanov, 1975) and have been introduced into peptides, occasionally to test the possibility of backbone hydrogen bonding (Sugano et al., 1974). The conformational implications of this transformation are complex and extend far beyond a simple change in hydrogen-bonding ability. Calculations (Tonelli, 1976; Manavalan and Momany, 1980) show that an Nα-methyl group has a major effect on the conformation of both the residue to which it is attached (the ith residue) and the residue preceding it in the sequence (the (i - 1)th residue). For both residues, when the peptide bond preceding the N-methyl group is in a trans conformation, the right-handed
α-helical portion of the φ, ψ plot is high in energy, as assessed with the model compounds N-acetyl-N-methylalanine-N'-methylamide (NNMA) and N-acetylalanine-N'-dimethylamide, and the β portion of the plot is sharpened somewhat, giving a minimum at φ = -135°, ψ = 75° (Manavalan and Momany, 1980). In addition, the left-handed α-helical portion of the plot represents the second lowest conformer for the ith residue, whereas for the (i - 1)th residue the C5 (extended) conformation is second lowest in energy. N-Methylamino acids, like proline, often adopt cis as well as trans conformations about the peptide bond. Conformational calculations and NMR studies on sarcosylsarcosine (Howard et al., 1973) show that the two forms are nearly isoenthalpic, the cis isomer being less favored by 0.6 kcal/mol in D2O. The trans form of NNMA is calculated to be more stable than the cis form by approximately 4.5 kcal/mol (Manavalan and Momany, 1980). The lowest energy conformer for the cis form of NNMA has φ and ψ angles of approximately -140° and 70°, respectively.

The effect of the introduction of N-methylamino acids into peptides has been experimentally addressed in a series of investigations by Vitoux et al. (1981, 1986; Aubry et al., 1981). The model peptide pivaloyl-L-prolyl-N-methyl-L-alanyl-N-methylamide adopts a cis peptide bond in dimethyl sulfoxide (DMSO), in CCl4, and in the crystalline state, with a type VI β turn centered between the two amino acids (for a review on turns, see Rose et al., 1985). In contrast, a type II β turn with all trans peptide bonds is adopted by the corresponding peptide lacking a methyl group at the amide nitrogen of alanine (Vitoux et al., 1981). For the diastereomeric peptide pivaloyl-L-prolyl-N-methyl-D-alanyl-N-methylamide, the trans form is most stable, and a type II turn is formed in the crystal structure of the anhydrous form of this compound (Aubry et al., 1981).
A monohydrate of this peptide was also examined; it had a similar backbone conformation (all trans peptide bonds), although minor adjustments in backbone angles allowed a bridging hydrogen-bonding water molecule to be inserted between the pivaloyl CO group and the C-terminal N-methylamide NH group (Aubry et al., 1981). More recent work (Vitoux et al., 1986) has focused on peptides of the general form Piv-X-Me-Y-NHMe (where X = Gly, L-Ala, L-Pro, and Y = Gly, and L- or D-Ala, Leu, or Phe). When the configurations of X and Y were both L, the peptide tended to adopt a type VI turn with a cis peptide bond, whereas the heterochiral pairs tended to adopt a type I1 turn with all trans peptide bonds. These conclusions were also consistent with the known conformations of more complex peptides that contain N-methylamino acids (Vitoux et al., 1986). For glycine-containing residues, the behavior was somewhat more complex; both cis and trans forms were populated.
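The energy differences quoted above for cis and trans forms can be translated into approximate populations with a two-state Boltzmann estimate. The Python sketch below is a rough illustration under stated assumptions: the quoted values are treated as free-energy differences, the temperature is taken as 298.15 K, and the function name `minor_fraction` is hypothetical. None of these choices comes from the cited studies.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3

def minor_fraction(delta_kcal, temp_k=298.15):
    """Fraction of the higher-energy state in a two-state equilibrium.

    delta_kcal: energy of the less favored state above the favored one,
    treated here (as an assumption) as a free-energy difference.
    """
    if temp_k <= 0:
        raise ValueError("temperature must be positive")
    ratio = math.exp(delta_kcal / (R_KCAL * temp_k))  # favored / unfavored
    return 1.0 / (1.0 + ratio)

# cis form of sarcosylsarcosine, 0.6 kcal/mol less favored (value from the text)
cis_sar = minor_fraction(0.6)
# cis form of NNMA, about 4.5 kcal/mol less favored (value from the text)
cis_nnma = minor_fraction(4.5)
```

Under these assumptions a difference of 0.6 kcal/mol leaves roughly a quarter of the molecules in the cis form, whereas a difference of 4.5 kcal/mol leaves well under 1%, which is one way to see why only the smaller difference corresponds to a readily observed mixture of isomers.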
3. D-Amino Acids

Another change that is commonly made in peptides is the reversal of the chirality of one or more amino acid residues (reviewed in Rose et al., 1985). This is a particularly popular modification, because protected D-amino acids are commercially available, and the resulting analogs, if active, would have enhanced stabilities to enzymatic degradation. The chirality of the amino acids in the central two positions (i + 1 and i + 2) of a turn has a profound effect on the type of turn that is formed. If the central two residues are both of the L configuration, a type I turn is often formed. If the residue at position i + 1 is L and that at position i + 2 is D (an L, D pair), then a type II turn is stabilized, while a D, L pair at the central position will stabilize a type II' turn (Rose et al., 1985). For this reason, type II turns are often referred to as L, D turns and type II' turns as D, L turns. It is interesting to note that a type I turn composed of all L-amino acids has an overall topography that is similar to that for a type II' turn with a D, L pair at the i + 1 and i + 2 positions (Rose et al., 1985). The backbone atoms in these turns tend to lie in a plane, and, because H-bonded turns are cyclic structures, they can be considered to have substituents in either axial or equatorial positions. Type I and II' turns differ significantly only in the φ and ψ angles about the second residue. For the L, L pair, these are -60°, -30° (type I), and +60°, -120° for the D, L pair (type II'). The result is that the side chains project from roughly the same position in both types of turns if the chirality of the second amino acid is reversed in going from a type I to a type II' turn. Thus, if a type I turn is found in a native all-L peptide, substitution of a D-amino acid at position 2 of the turn might lead to an active analog.
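The chirality rules just described can be restated as a small lookup table. The Python sketch below encodes only the three cases and the two angle pairs given in the text. The names `FAVORED_TURN`, `SECOND_RESIDUE_ANGLES`, and `favored_turn` are hypothetical, and pairs not discussed in the text deliberately return `None` rather than a guess.

```python
# Chirality at positions (i + 1, i + 2) -> turn type favored, per the text.
FAVORED_TURN = {
    ("L", "L"): "I",
    ("L", "D"): "II",
    ("D", "L"): "II'",
}

# (phi, psi) in degrees about the second residue of the turn, as quoted.
SECOND_RESIDUE_ANGLES = {
    "I": (-60, -30),
    "II'": (60, -120),
}

def favored_turn(chirality_i1, chirality_i2):
    """Return the turn type favored by a chirality pair, or None if the
    pair is not covered by the rules restated here."""
    key = (chirality_i1.upper(), chirality_i2.upper())
    return FAVORED_TURN.get(key)

# Replacing the L-residue at position i + 1 of an L, L turn with a D-residue
# moves the expected turn type from I to II'.
native = favored_turn("L", "L")
analog = favored_turn("D", "L")
```

A table of this kind is only a rule of thumb; the text notes that these turn types are "often" formed, not invariably.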
An enhancement of activity might even be possible, as the hydrogen bond formed in a type II' turn is tighter and more linear than that formed in a type I turn. In fact, substitution of a D- for an L-amino acid at position 8 of somatostatin gives rise to an analog with enhanced potency (Arison et al., 1978; Veber et al., 1981). In this case, NMR studies on this peptide and related analogs containing a D-residue at position 8 provide strong support for the formation of a type II' turn at this position (Veber, 1981).

4. α,β-Unsaturated Amino Acids

Yet another substitution that restricts the conformational freedom of amino acids is the introduction of a double bond between the Cα and Cβ atoms. In addition to the obvious effect that α,β-unsaturation has on the side-chain mobility, this modification also affects the conformation of the backbone. The effects of Cα-Cβ unsaturation have been explored experimentally and by semiempirical potential energy calculations (Ajò et al., 1980). They are strongly dependent on the presence of alkyl groups in the γ position and the configuration of this substituent about the double bond. More recently, Bach and Gierasch (1986) have focused on the conformational preferences of Z-dehydrophenylalanine (ΔZPhe) when incorporated into peptides. NMR of linear and cyclic peptides showed that this amino acid may readily be incorporated into the i or i + 2 positions of β turns, or the i + 1 position of a γ turn. The preference for ΔZPhe to occupy the i + 2 position of β turns has been confirmed by Singh et al. (1987) and Chauhan et al. (1987) in their studies of N-acetylated derivatives of tripeptides in which ΔZPhe occupies the central position. The overall conformational impact of this residue was similar to that observed for D-Phe. Interestingly, dehydrophenylalanine was found to have a rigidifying effect on the conformations of flexible peptides, thereby encouraging linear peptides to adopt turn conformations.

C. Cyclic Peptides

Another method of constraining the peptide backbone, perhaps even more drastic than incorporating unusual amino acids, is the introduction of covalent cross-links. Macrocyclization is often accomplished by forming an amide bond between the peptide N and C termini, between a side chain and the N or C terminus, or between two side chains. Disulfides can also be introduced, but they may be reduced in vivo. The conformational entropy of a cyclic peptide is lower than that of its linear counterpart. Thus, adoption of a specific conformation may occur with a smaller decrease in entropy for a cyclic analog than for an acyclic analog, thereby making the free energy for the process more favorable. The literature concerning cyclic peptides is extensive and has been exhaustively reviewed (Rose et al., 1985; Ovchinnikov and Ivanov, 1975; Kessler, 1982).

D. Design of Analogs of the Pentapeptide Enkephalin

The design of conformationally constrained peptides has helped to elucidate the structural bases for the activities of a number of important biologically active peptides, including somatostatin (Veber, 1981), oxytocin (Hruby, 1985b), and melanotropin (Lebl et al., 1984). Here, we shall consider the design of analogs of the pentapeptides Leu- and Met-enkephalin (Tyr-Gly-Gly-Phe-Leu and Tyr-Gly-Gly-Phe-Met) as a recent example of this approach. An extraordinarily large number of analogs of these peptides have been synthesized, so the functional groups required for activity are well known (Morley, 1980). One of the most intriguing questions concerning the enkephalins is how they can compete with compounds such as morphine for binding to opiate receptors. Apparently the
enkephalins adopt bioactive conformations in which the appropriate functional groups are positioned such that their topography mimics that of morphine. Ultimately, it should be possible to elucidate the essential conformational and molecular features of this structure by synthesizing conformationally constrained peptides. However, this has been complicated by the considerable flexibility of the enkephalins as well as by the fact that nonequivalent conformations of peptides can nevertheless have side chains in nearly equivalent orientations. Thus, when a series of peptides are examined, they may not share a single bioactive conformation. Nevertheless, in the last several years, the study of conformationally constrained peptides has contributed much to our understanding of the structural and dynamic requirements for binding and activation of opiate receptors. For example, it has been shown that receptor subtypes recognize distinct conformational states of enkephalins. In the following section we shall review the recent findings in this rapidly advancing field.

1. Enkephalins Bearing Conformationally Constrained Amino Acids
The enkephalins are highly flexible molecules, and this flexibility has hampered attempts to elucidate their bioactive conformers. In particular, the two consecutive glycyl residues provide the enkephalins with even more potential conformational states than a typical pentapeptide composed of all L-amino acids. This flexibility is apparent in Fig. 1, which illustrates several conformations of enkephalins in the solid state. A large number of conformations have also been proposed on the basis of NMR studies of enkephalins in various solvents (reviewed in Schiller, 1984). Computational studies have also underscored the large number of potential conformations. Most of the NMR and computational studies show structures involving a β turn centered at the Gly-Gly or the Gly-Phe bond to be favored (Schiller, 1984). Therefore, in an effort to limit the conformational space available to the native molecule, a large number of analogs bearing various conformational constraints have been prepared and evaluated. An early discovery was that substitution of a D-Ala residue at position 2 of the sequence led to an enhancement of activity (Roemer et al., 1977), a finding suggesting that a turn might be centered at the D-Ala-Gly bond. To further stabilize this turn, a set of molecules was synthesized in which Aib residues either singly or in a pair replaced the Gly-Gly unit in enkephalin (Sudha and Balaram, 1983). It was found that substitution of an Aib residue for the first Gly residue gave rise to highly active analogs: both the Aib-Gly and Aib-Aib peptides showed potent in vivo activities (see also Gorin et al., 1980). In contrast, the Gly-Aib analog showed much lower, but still significant, activity. Unfortunately, no in vitro studies were reported, so it is difficult to assess the extent to which an increased resistance to proteolysis influenced these results. The conformations of the Aib-containing enkephalins and fragments thereof were investigated by CD (Sudha and Balaram, 1981), NMR (Sudha and Balaram, 1983), and X-ray crystallography (Prasad et al., 1983). It was shown that replacement of the Gly with Aib residues caused the peptides to adopt conformations more ordered than those of the parent compound (Sudha and Balaram, 1981). NMR studies of the peptides in dimethyl sulfoxide and chloroform (Sudha and Balaram, 1983) showed that the Aib-Gly peptide adopted a β turn centered about the Aib-Gly peptide bond, whereas the Gly-Aib peptide adopted a β turn centered about the Aib-Phe bond. The Aib-Aib peptide adopted a 3₁₀-helical conformation with two consecutive turns centered at the Aib-Aib and Aib-Phe bonds. This conformation is also seen in the crystal structure of Boc-Aib-Aib-Phe-Met-CONH2 (Prasad et al., 1983). These results led the authors to suggest that the enkephalins must adopt folded conformations, possibly with a turn centered at the Gly-Gly bond. The authors of these studies note that stabilization of this turn conformation by hydrogen bonding must not be essential, because analogs with N-methyl-Phe or -Met at positions 4 and 5 are known to be highly active (Sudha and Balaram, 1983).

Along similar lines, Stammer and co-workers (Shimohigashi and Stammer, 1982a,b; Shimohigashi et al., 1982) have investigated the effects of substituting the glycyl residues in enkephalin with dehydroalanine. Substitution at position 2 (Shimohigashi and Stammer, 1982a) gives an analog that is nearly equipotent to Leu-enkephalin in the δ receptor binding assay but is more than twice as potent in the μ receptor assay. Thus, the ΔAla2 peptide binds to μ receptors nearly as well as to δ receptors, having a ratio of 0.45 for the association constants for binding to the two receptors. (Enkephalin favors binding to the δ receptor by a factor of approximately 5.) ΔAla3-Leu-enkephalin (Shimohigashi and Stammer, 1982b) shows a marked decrease in its affinity for the δ receptor (4% as active), but only a 20% decrease in its affinity for the μ receptor. Thus, the ratio of the association constant for the μ versus the δ receptors is approximately 1.0 for this peptide. It would be of interest to determine whether these intriguing differences in receptor affinities are a result of a modification of the conformational properties or whether they are a result of the increased hydrophobicities of the analogs. The same researchers have also incorporated ΔZPhe into the 4-position of Leu-enkephalin (Shimohigashi et al., 1982). The resulting peptide is a highly potent, δ-selective binder.
2. Cyclic Enkephalins
The studies discussed in the preceding sections demonstrate both the potential and the limitations associated with the use of conformationally restricted amino acids for elucidating bioactive conformations. Even though the conformations of the substituted amino acids are restricted, the overall chain still has a great deal of flexibility. If the possible configurations of the side chains are also considered, the problem is increased by orders of magnitude. In principle, it should be possible to combine more than one conformational constraint into a single peptide, but the resulting peptides may still have a large number of low-energy conformational states. Therefore, more recent work has focused on the introduction of even more extreme conformational constraints, in the form of macrocyclization, into enkephalins.
Operating under the assumption that enkephalins adopt folded, turn-like conformations at their receptors, Schiller and co-workers designed a series of cyclic enkephalins in which the 2-position was occupied by a D-amino acid bearing an alkylamino side chain (DiMaio et al., 1982). Cyclization was achieved by forming an amide bond between this side-chain amino group and the C-terminal α-carboxylate (Fig. 3). The prototype cyclic analog, Tyr-cyclo[N4-D-Dbu-Gly-Phe-Leu] (Dbu = 2,4-diaminobutanoic acid), was an extraordinarily μ-selective ligand. Relative to Leu-enkephalin, it was 17 times more potent in the guinea pig ileum (GPI, a tissue rich in μ receptors) assay and 7-fold less effective in the mouse vas deferens (MVD, a tissue rich in δ receptors) assay. The conformational properties of the ring could be modulated by adjusting the number of methylene groups in the side chain at the 2-position. Four peptides were synthesized with one to four methylene groups. Relative to Leu-enkephalin, each had 5- to 50-fold higher activities in the GPI assay and up to 40-fold lower potencies in the MVD assay. It was found that as the number of methylenes was increased from one to four, the selectivity ratio (the ratio of the potency in the GPI versus the MVD assays) increased, reaching a value of 29 for the analog with four methylene groups.
FIG. 3. Structure of a cyclic enkephalin analog designed by Schiller and co-workers (DiMaio et al., 1982).
The receptor selectivity of the cyclic peptides could be only partially explained in terms of their binding affinities for the μ and δ receptors. Each of the cyclic analogs had about the same binding affinity for the μ receptor as Leu-enkephalin, although they had markedly poorer affinities for the δ receptor. The ratio of the association constants for the μ versus the δ receptors reached a maximum of 8 when n = 2. Thus, the high degree of activity of the cyclic analogs in the GPI versus MVD assay systems had two causes: (1) the peptides bound rather more poorly to δ receptors than to μ receptors; and (2) they also had greater efficacy at the μ receptor than did Leu-enkephalin. (That is, at a given degree of receptor-site saturation, the cyclic enkephalins produced a larger response than did Leu-enkephalin.) A reasonable explanation of these data is that multiple conformational forms of Leu-enkephalin bind to the μ receptor but only one or a small subset of these bound conformational forms is responsible for signal transduction (DiMaio et al., 1982). Cyclization preferentially stabilizes the conformations that are important for signal transduction. Also consistent with this view are the results obtained with acyclic correlates of the above cyclic peptides (DiMaio et al., 1982; Schiller and DiMaio, 1982). These peptides are related to their cyclic analogs by a reductive cleavage of the ring and hence differ in chemical composition by the addition of only two hydrogen atoms. The prototype for this series is Tyr-Abu-Gly-Phe-Leu-CONH2 (Abu = 2-aminobutanoic acid). These peptides have biological properties similar to those of Leu-enkephalinamide and are nonselective for μ versus δ receptors, both in binding and in activity assays. Thus, the enhanced efficacy of the cyclic analogs must be a consequence solely of their conformational properties.
In an extension of the above approach, partially retro-inverso analogs of the prototype peptide, Tyr-cyclo[N4-D-Dbu-Gly-Phe-Leu], were synthesized and evaluated (Berman et al., 1983). [A retro-inverso peptide (Goodman and Chorev, 1979) differs from its parent peptide by having the direction of its peptide bonds reversed and the chirality of each amino acid residue reversed. Nevertheless, the overall stereochemical relationships of the side chains remain roughly unchanged; Cα-CO-NH-Cα′ is converted to Cα-NH-CO-Cα′.] These peptides contain a reversal of the peptide bond at two or more sites centered about the Phe-Leu bond. The cyclic retro-inverso analogs maintained the high μ selectivity and efficacy of the parent compound. Thus, the orientation of the side chains must be similar in the retro-inverso analogs, and the backbone atoms in the modified region must serve primarily as a structural framework for positioning these side chains in proper juxtaposition.
The conformations of several retro-inverso enkephalin analogs have been examined by a combination of computational and NMR techniques (Mammi et al., 1985; Hassan and Goodman, 1986; Mammi and Goodman, 1986). Molecular dynamics and systematic search methods were used to search conformational space to obtain starting conformations for energy minimization. Several low-lying backbone and side-chain conformations were found for each of the analogs. In comparison with the linear enkephalins, the cyclic analogs had relatively few low-energy conformers. However, they still appeared to be partially flexible in the 10-ps molecular dynamics simulations. Not only were their side chains rather variable in conformation, but their main chains also showed multiple transitions between hydrogen-bonding patterns and regions of the φ, ψ plot. The conformations of the peptides were then evaluated by standard NMR techniques. The interpretation of the probable positions of hydrogen bonds was aided by examination of the low-energy conformations of the peptides. Each of the peptides was found to adopt preferred conformations in dimethyl sulfoxide, and some appeared to adopt the same preferred conformation in water as well. The peptides all had one or two hydrogen bonds within the ring structure, and the parent compound had two C7 hydrogen bonds in dimethyl sulfoxide. By searching for structural commonalities among the low-energy conformations of this class of cyclic peptides, it may be possible to surmise some structural requirements for binding to the opiate μ receptor. In addition, it should now be possible to design peptides with even greater conformational restrictions by incorporating conformationally constrained amino acids into cyclic enkephalin analogs.
A second class of cyclic enkephalins consists of a set of analogs that contain intramolecular disulfide bonds. The prototype of this series is Tyr-cyclo(D-Cys-Gly-Phe-Cys)-CONH2. This peptide (Schiller et al., 1981) and the corresponding peptide with two D-Cys residues were 38- and 73-fold more active than Met-enkephalin in the GPI assay.
Again, this enhanced potency was only partially due to more potent binding to the μ receptor; the D,L and D,D compounds bound 2- and 3.5-fold more tightly to the μ receptor than did Met-enkephalin. Interestingly, these peptides failed to show substantial specificity for the μ receptor. The ratios of the association constants for the μ versus the δ receptors were approximately 2.5 for both peptides. Thus, the cystine-bridged enkephalins were less selective than the methylene-bridged analogs, even when the same number of atoms was involved in the ring. This is presumably due to the increased rigidity of the latter class of cyclic peptides. The methylene-bridged enkephalins contain one more rigid, planar peptide bond in their ring structures than do the disulfide-bridged enkephalins.
In an effort to decrease the flexibility of the disulfide-bridged enkephalins, penicillamine residues (β,β-dimethylcysteine, Pen) have been introduced in place of the cysteine residues. The D-Pen²-D-Cys⁵ peptide exhibited substantial δ receptor activities (Mosberg et al., 1982). Even more impressive effects were obtained when both cysteine residues were replaced by penicillamine (Mosberg et al., 1983). The D-Pen²-D-Pen⁵ enkephalin analog [Tyr-cyclo(D-Pen-Gly-Phe-D-Pen)-COOH] not only retained the high potency of the D-Cys²-D-Cys⁵ analog in the MVD assay but also displayed a high selectivity for δ receptors. The δ receptor selectivity ratio was 3200 in a bioassay (MVD versus GPI) and 370 in a binding assay. More recently, conformationally restricted amino acids have been introduced into the D-Pen²-D-Pen⁵ cyclic analog in an effort to further restrict the conformation of the peptide (Hruby et al., 1985). Substitution of N-methyl-Phe in the 4-position leads to a compound with reduced activity in both the MVD and GPI assays and with lower δ selectivity than the parent compound. However, replacing Phe⁴ with tetrahydroisoquinoline carboxylate gives rise to an analog with very high δ selectivity. These results led the authors to suggest that the Phe⁴ side chain prefers to be in a gauche (-) conformation when bound to the δ receptor.
Conformational analysis of the disulfide-linked cyclic enkephalins should lead to an understanding of the conformational differences between δ- and μ-selective ligands. NMR analysis of Pen-containing analogs has shown that they indeed have more rigid conformations than the corresponding Cys-containing analogs (Hruby et al., 1985; Mosberg and Schiller, 1984). However, no specific conformations have yet been proposed for these peptides.
A third class of cyclic enkephalins consists of those in which cyclization is achieved by amide bond formation between two side chains (Schiller et al., 1985). The morphiceptin analog, Tyr-cyclo(D-Orn-Phe-Asp)-CONH2, binds selectively to μ receptors with a binding selectivity ratio of 213. The corresponding peptide Tyr-cyclo(D-Lys-Phe-Glu)-CONH2 is far less selective, binding with a selectivity ratio of 3.0.
The loss in selectivity has been explained in terms of an increase in flexibility when the ring is expanded from 13 to 15 atoms. The antiparallel cyclodimer of the D-Orn²-Asp⁴ peptide was also examined and found to be nonselective in its receptor binding preferences. Thus, the selectivity of the monomeric cyclic peptide was probably due to its conformational properties.
In conclusion, studies with conformationally constrained analogs of enkephalins have provided important information concerning the relationship between the conformations and the bioactivity of the enkephalins. Clearly, a folded, turn-like conformation is recognized by the μ and δ receptors. The primary purpose of the amide main chain appears to be to position the side chains in proper juxtaposition for interaction with receptors. Further, the δ and μ receptors appear to recognize different conformational forms of enkephalin. By synthesizing and analyzing the conformations of increasingly rigid cyclic enkephalin analogs, it should be possible to determine what arrangement of side chains is recognized by the μ and δ receptors. Also, it should be possible to determine whether some flexibility in the enkephalin molecules is required for their activities.
III. DESIGN OF MEDIUM-SIZED PEPTIDES
The design of medium-sized peptides is beset with many of the same problems encountered in the design of small peptides. Medium-sized peptides are also very flexible and tend to lack defined conformations at room temperature in aqueous solutions. An additional problem in designing medium-sized peptides is the increase in chain length. This increases the time required for synthesis and purification, while at the same time exponentially increasing the number of possible variants for a given sequence. With a peptide of approximately five residues, it is feasible to try all possible amino acids at a given position and ultimately to introduce conformational constraints that greatly restrict the peptide in solution. The same approach is simply not practical for peptides of more than approximately 20 residues. Fortunately, simple secondary structures (α helices, β sheets) figure largely in the bioactive conformations of peptides of this length (Kaiser and Kézdy, 1983, 1984). This fact suggests a rational and experimentally feasible approach to designing medium-sized peptides. Proceeding from a hypothesis for the bioactive conformation of a peptide (which includes the locations of secondary structures), one designs peptides that should favor formation of this hypothetical structure. The resulting analogs are still quite flexible but should interact more favorably with their receptors, if the guiding hypothesis is correct. It has been proposed that both α helices and β sheets are involved in the bioactive conformations of medium-sized peptides (Kaiser and Kézdy, 1984).
However, α-helical secondary structures have been more clearly implicated in the conformations of a variety of peptides, including apolipoproteins (Kaiser and Kézdy, 1983, 1984), toxins (DeGrado et al., 1981), hormones (Kaiser and Kézdy, 1983, 1984), and calmodulin-binding peptides (DeGrado et al., 1985). Also, the techniques for prediction and analysis of α helices are more advanced than those for β sheets. Therefore, the following section will primarily highlight the design of α-helical peptides.
A. Factors Stabilizing the Formation of Secondary Structures in Aqueous Solution
1. Short-Range Interactions
Several approaches have been adopted for elucidating the factors involved in helix formation and stability. Perhaps the simplest of these are the statistical approaches, as typified by the Chou-Fasman method (Chou and Fasman, 1978). In this method the frequencies of occurrence of a given amino acid in helical versus nonhelical structures are determined by analysis of a large number of protein crystal structures. Residues that frequently occur in helical conformations are reasonably assumed to favor helix formation.
An alternative, experimental approach is the host-guest method of Scheraga (1978). This method attempts to determine the extent to which a given amino acid stabilizes or destabilizes the helical conformation by examining amino acid copolymers. A random copolymer of the amino acid of interest (the guest) and a helix-favoring amino acid (the host) is formed, with the host amino acid in large molar excess. The thermally induced helix-to-coil transitions of the host polymer with and without the guest amino acid are then evaluated to investigate the perturbational effect of the guest. The data are analyzed using the Zimm-Bragg formalism, which includes parameters for both helix initiation (σ) and helix propagation (s). The parameter σ can be interpreted as an equilibrium constant for formation of the first turn of the α helix from a random coil polymer. This parameter is unfavorable, that is, much smaller than unity. The parameter s corresponds to the equilibrium constant for addition of a single residue to an already initiated helix. The values of s calculated from experimental data on host-guest copolymers range from 0.6 for Gly to 1.32 for protonated Glu. There is good qualitative agreement between the Zimm-Bragg s values thus obtained and the Chou-Fasman parameters for helix formation (Scheraga, 1978; Chou and Fasman, 1978; Sueki et al., 1984), a result suggesting that both parameters are measures of the intrinsic conformational preferences of individual amino acids.
2. Medium-Range Interactions
Host-guest studies of random copolymers suggest that peptides of 10 to 30 residues should show very little helix formation in dilute aqueous solution, even when these peptides are composed of helix-favoring amino acids. This prediction is generally true for amino acid homopolymers, but recent work with peptides of defined sequence has shown that sequence-dependent side chain-side chain or side chain-main chain interactions, which are not included in the Zimm-Bragg formalism, can have a dramatic influence on the stabilities of isolated helices in solution. This outcome has been most clearly shown in studies (Bierzynski et al., 1982; Kim and Baldwin, 1984; Shoemaker et al., 1985, 1987a) of the N-terminal α helix of ribonuclease A, which spans residues 3-13 in the crystal structure of this enzyme. As predicted from the Zimm-Bragg model, peptides corresponding to this sequence have little helical structure in water at room temperature, as assessed by NMR (Rico et al., 1986) or CD spectroscopy (Bierzynski et al., 1982; Brown and Klee, 1971). However, they deviate from this model at lower temperatures; near 0°C in aqueous buffers at neutral pH, a peptide spanning the N-terminal 13 residues of the ribonuclease sequence (C-peptide, residues 1-13 terminating with a homoserine lactone) is approximately 30% helical (Bierzynski et al., 1982). Clearly, there must be specific interactions that provide the extra stabilizing force required for helix formation.
The pH dependence of helix formation for this peptide indicates that two groups with apparent pKa values of 3.3 and 6.5 are important for helix stability. Initially it was hypothesized that a stabilizing salt bridge was formed between the side chains of Glu-9 and His-12, which are located slightly more than a single α-helical turn apart (Bierzynski et al., 1982); subsequent systematic studies with synthetic peptides disproved this hypothesis. It appears instead that the ionizing groups are Glu-2 and His-12, residues that are located at opposite ends of the helix. Their location suggests that they might serve to interact favorably with the α-helical dipole (Shoemaker et al., 1985, 1987a). In an α helix the amide NH groups all point approximately toward the N terminus of the helix, and the C=O groups point in the opposite direction, an arrangement giving rise to a macrodipole (Hol et al., 1981; Sheridan et al., 1982). Negatively charged residues at the N terminus of C-peptide and positively charged residues at the C terminus appear to interact favorably with the helix dipole and favor helix formation (Shoemaker et al., 1985, 1987a; Blagdon and Goodman, 1975). The salt dependence of helix formation for various analogs of C-peptide is also consistent with the helical dipole model (Shoemaker et al., 1985, 1987a). Also consistent with this view are studies in which the charge of the N-terminal residue is systematically varied in the following series: +2 (Lys); +1 (Ala); 0 (acetyl-Ala); -1 (succinyl-Ala).
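As a point of reference for the deviation described above, the Zimm-Bragg expectation for a short peptide can be sketched numerically. The snippet below is a minimal illustration under stated assumptions, not the analysis used in the cited studies: it treats the chain as a homopolymer, gives each helical residue a weight s and each helical stretch an extra factor σ, and sums over all helix/coil states with a two-state recursion. The function name `helix_fraction`, this simplified weighting scheme, and the value σ = 1e-4 are assumptions made for illustration; s = 1.3 is taken near the upper end of the host-guest range quoted earlier in this section.

```python
def helix_fraction(n_residues, s, sigma):
    """Mean fraction of helical residues in a simplified Zimm-Bragg chain."""
    # z_h, z_c: summed statistical weights of chains ending in helix / coil.
    # d_h, d_c: the same sums with each state weighted by its helical-residue count.
    z_h, z_c = 0.0, 1.0   # the empty chain is treated as preceded by coil
    d_h, d_c = 0.0, 0.0
    for _ in range(n_residues):
        # Add a helical residue: weight s after helix, sigma*s after coil.
        new_z_h = s * z_h + sigma * s * z_c
        new_d_h = s * (d_h + z_h) + sigma * s * (d_c + z_c)
        # Add a coil residue: weight 1 after either state.
        new_z_c = z_h + z_c
        new_d_c = d_h + d_c
        z_h, z_c, d_h, d_c = new_z_h, new_z_c, new_d_h, new_d_c
    return (d_h + d_c) / (n_residues * (z_h + z_c))


SIGMA = 1e-4  # assumed helix-initiation parameter (illustrative value)
for length in (13, 30, 100):
    print(length, round(helix_fraction(length, s=1.3, sigma=SIGMA), 3))
```

With these assumed parameters a 13-residue chain comes out only a few percent helical, which is the kind of baseline against which the roughly 30% helicity of C-peptide near 0°C stands out.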
The extent of helix formation increased steadily as the charge was varied from +2 to -1 (Table I, compounds I-IV). Similarly, neutralization of the negative charge at the C terminus by conversion of the carboxylate to a carboxamide or homoserine lactone also favors helix formation (Shoemaker et al., 1985, 1987a).
Two other types of specific side-chain interactions have been proposed to stabilize the α helix formed by C-peptide (Shoemaker et al., 1987b). A salt bridge between Glu-2 and Arg-10 has been detected in the crystal structure of ribonuclease A (Wlodawer and Sjolin, 1983) as well as in C-peptide in aqueous solution (Rico et al., 1986). This salt bridge also fixes the N-terminal boundary of the helix between Glu-2 and Thr-3. It is not sterically possible to make this ion pair if Glu-2 is part of the helix. Synthetic studies in which either Glu-2 or Arg-10 is replaced by Ala have provided support for the importance of this interaction in stabilizing the α-helical conformation of C-peptide. However, the analysis is not simple, because of the dual role played by the Glu at position 2. This residue appears to be important for interacting with the helical dipole as well as for forming a salt bridge to Arg-10. The protonated form of His-12 might also play a dual role in stabilizing helix formation. In addition to interacting favorably with the helix dipole, this residue might form an aromatic interaction (Blundell et al., 1986; Burley and Petsko, 1985) with Phe-8 in solution (Shoemaker et al., 1987b). Preliminary evidence suggests that this interaction is stronger for the protonated than for the nonprotonated form of His-12 (Shoemaker et al., 1987b). There are probably other specific side-chain interactions that have yet to be identified. Although the magnitudes of the stabilizing side-chain interactions are small (they tend to account for a factor of 2 to 3), they work in concert to make helix formation by C-peptide only slightly unfavorable.
TABLE I
Correlation of Association Constant and Thermostability of Semisynthetic RNase S with Peptide Helix Contentᵃ
Peptide                  Peptide ellipticity (3°C)   Ka (35°C)        Tmᵇ
                         (deg cm² dmol⁻¹)            (10⁶ M⁻¹)        (°C)
S-peptide (1-15)          2,500                       0.865 ± 0.03    48.1
S-peptide (1-19)          3,500                       1.05 ± 0.12     45.9
V (Lys¹, Gln¹¹)ᶜ          8,200                       2.19 ± 0.41     41.5
I (Lys¹, Glu¹¹)          11,100                       4.99 ± 1.30     49.3
II (Ala¹, Glu¹¹)         12,800                       6.32 ± 1.09     50.4
VI (SucAla¹, Gln¹¹)      13,000                       4.03 ± 1.05     48.8
III (AcAla¹, Glu¹¹)      15,800                       6.45 ± 2.29     51.1
IV (SucAla¹, Glu¹¹)      16,200                      10.00 ± 2.13     51.9
ᵃ From Mitchinson and Baldwin (1986); all values refer to 0.10 M NaCl, pH 7.30. Suc and Ac refer to α-N-succinyl and α-N-acetyl groups, respectively.
ᵇ Tm measured at an RNase S concentration between 100 and 200 μM with 30% mole excess peptide. Reported uncertainties in Tm range from ±0.1 to ±0.9°C.
ᶜ Compounds I-IV correspond to S-peptide 1-15 with Leu substituted for Glu at position 9, and other changes as indicated in parentheses.
3. Long-Range Interactions
The studies mentioned above have shown how short-range and medium-range interactions can stabilize helix formation in isolated peptides. Also important for structural stability are long-range interactions between the residues within a secondary structural unit and groups not included in that unit. In some cases such long-range interactions play a decisive role in determining which secondary structure is formed (DeGrado and Lear, 1985). Several types of attractive forces, including hydrogen bonding,
electrostatic interactions, and hydrophobic forces, are known to be involved in long-range interactions (Creighton, 1984). Hydrophobic forces (Tanford, 1980) in particular appear to be important for stabilizing secondary structures in globular proteins (Kauzmann, 1959; Janin, 1979), peptide hormone-hormone receptor complexes (Kaiser and Kézdy, 1983), and apolipoprotein-phospholipid complexes (Kaiser and Kézdy, 1984). α Helices and antiparallel β sheets often lie along the surface of globular proteins (Richardson, 1981) and are hence amphiphilic (Eisenberg et al., 1982a); their solvent-exposed faces are more hydrophilic than their opposite faces, which are in contact with the apolar interior of the protein. A similar amphiphilic disposition of hydrophobic and hydrophilic groups has been envisioned for peptides that bind to extrinsic apolar surfaces, such as phospholipid membranes and hormone receptors (Kaiser and Kézdy, 1983, 1984). Amphiphilic secondary structures show periodic distributions of hydrophilic and hydrophobic amino acids in their sequences; the repeat period for this distribution matches that of the secondary structure (Eisenberg et al., 1984). Hydrophobic residues repeat approximately every two residues in an amphiphilic β sheet and every three to four residues in an amphiphilic α helix. Upon interacting with an apolar interface, the peptide adopts a secondary structure corresponding to its hydrophobic period, providing a low-energy conformation in which the hydrophobic residues are maximally dehydrated and the hydrophilic residues are maximally hydrated. This is shown schematically in Fig. 4. Thus, hydration forces play an important role in determining the conformations of amphiphilic secondary structures. Hydrophobic periodicity can have such a large influence on the formation of secondary structure that it dominates over short- and medium-range interactions (DeGrado and Lear, 1985).
This has been demonstrated recently, using a set of designed peptides with leucyl and lysyl residues in identical ratios but with different hydrophobic periodicities.
peptide 1   FMOC-(Leu-Lys-Lys-Leu-Leu-Lys-Leu)₁
peptide 2   FMOC-(Leu-Lys-Lys-Leu-Leu-Lys-Leu)₂
peptide 3   FMOC-(Leu-Lys-Leu-Lys-Leu-Lys-Leu)₁
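The difference between the repeats in the three sequences listed above can be made concrete with a hydrophobic-moment-style calculation, in the spirit of the periodicity analysis of Eisenberg et al. (1982a, 1984). The sketch below is only an illustration: the binary hydrophobicity scale (Leu = +1, Lys = -1), the function name `moment_per_residue`, and the neglect of the FMOC group are assumptions, and the rotation per residue is taken simply as 360° divided by the period (3.5 or 2.0).

```python
import cmath
import math

# Assumed binary hydrophobicity scale (illustrative only, not a published scale).
HYDROPHOBICITY = {"L": 1.0, "K": -1.0}


def moment_per_residue(sequence, period):
    """Magnitude of the hydrophobicity Fourier component at the given period."""
    delta = 2.0 * math.pi / period  # rotation per residue, in radians
    total = sum(
        HYDROPHOBICITY[res] * cmath.exp(1j * delta * i)
        for i, res in enumerate(sequence)
    )
    return abs(total) / len(sequence)


# One-letter versions of the three designed sequences (FMOC group ignored).
PEPTIDES = {
    "peptide 1": "LKKLLKL",
    "peptide 2": "LKKLLKL" * 2,
    "peptide 3": "LKLKLKL",
}

for name, seq in PEPTIDES.items():
    helix_like = moment_per_residue(seq, 3.5)
    strand_like = moment_per_residue(seq, 2.0)
    print(f"{name}: period 3.5 -> {helix_like:.2f}, period 2.0 -> {strand_like:.2f}")
```

On this crude scale peptides 1 and 2 carry their largest component at a period of 3.5 and peptide 3 at a period of 2.0, even though the amino acid compositions are the same.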
[Figure 4 panel labels: hydrophobic periodicity (primary structure), periodicity 3.5 and periodicity 2.0; apolar surface (interface); induced secondary structure.]
FIG. 4. Schematic illustration of how hydrophobic periodicity can influence secondary structure formation. In this illustration, the closed circles symbolize apolar residues and open circles symbolize polar residues. In dilute, aqueous solution, the peptides lack a single defined conformation. However, in the presence of an apolar-water interface, they adopt a secondary structure that maximizes the interactions of the apolar groups with the apolar medium and the polar groups with water. Taken from DeGrado and Lear (1985).
Chou-Fasman parameters (Chou and Fasman, 1978) were identical for each of these peptides, as were the Zimm-Bragg s values (Scheraga, 1978) for helix formation. The peptides differ only in their hydrophobic periodicities (3.5 for peptides 1 and 2; 2.0 for peptide 3) and chain lengths, thus allowing the effect of these parameters on peptide conformation to be investigated while keeping short-range interactions approximately constant. All three of these peptides adopted random conformations in very dilute, aqueous solutions, but peptides 2 and 3 could be induced to form defined secondary structures in the presence of an apolar-water interface. Peptide 2 was strongly surface-active and formed stable monolayers at the air-water interface [see Adamson (1976) for a review of methods for studying molecules adsorbed to the air-water interface]. Surface pressure-area isotherms were analyzed according to the equation $\pi (A - A_0) = kT/(n\,\mathrm{DP})$, where $\pi$ is the surface pressure at a given area $A$ (expressed in Å² per amino acid residue), DP is the number of residues per peptide, $k$ is Boltzmann's constant, and $T$ is the temperature in kelvins. Linear regression gives $A_0$, the limiting cross-sectional molecular area, and $n$, the degree of aggregation of the peptide at the air-water interface, which should be close to unity for an ideal surface gas. Peptide 2 appeared to form monomers or dimers at the air-water interface, and its molecular cross-sectional area (16 Å²/amino acid residue) was consistent with a helical conformation. Compressed monolayers were transferred to solid, planar surfaces and examined by CD spectroscopy and attenuated total
reflectance infrared spectroscopy. Both techniques indicated that the peptides were in predominantly α-helical conformations in the transferred monolayers. At high peptide concentration, peptide 2 also formed helical tetramers in aqueous salt-containing solutions. Thus, an extrinsic hydrophobic surface (air) or the hydrophobic surface of a neighboring, self-associated helix or helices could serve to stabilize helix formation. The 7-residue peptide 3, with a hydrophobic period of 2.0, was also highly surface-active and appeared to form β sheets at the air-water interface. Analysis of surface pressure-area isotherms for this peptide indicated a high degree of self-association (n = 30), a result suggesting that it might be forming extended, intermolecularly hydrogen-bonded β sheets. The observed cross-sectional surface areas (20 Å²/residue) were consistent with this interpretation, and CD and IR spectroscopy of transferred monolayers of this peptide together indicated that the peptides were in an antiparallel β conformation. Peptide 3 also aggregated in aqueous salt-containing solution to form β sheets. In contrast, the 7-residue peptide 1, with a hydrophobic repeat of 3.5, showed very low surface activity. CD data showed that it failed to adopt an ordered configuration in aqueous solution under a variety of conditions. The above data illustrate the crucial role that hydrophobic periodicity plays in determining secondary structure formation. These data also illustrate the effect of chain length on secondary structure formation. Helix formation required a 14-residue segment, whereas β sheet formation was achieved with only 7 residues. This result is in agreement with studies of amino acid homopolymers of defined lengths.
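The isotherm analysis described above, π(A − A0) = kT/(n·DP), can be sketched as a straight-line fit: multiplying out gives πA = A0·π + kT/(n·DP), so regressing πA on π yields A0 as the slope and kT/(n·DP) as the intercept. The example below is a hedged illustration only. The data are synthetic and noise-free, generated from assumed values (A0 = 16 Å² per residue, n = 2, DP = 14, T = 298.15 K), the function name `fit_isotherm` is hypothetical, and surface pressure is assumed to be in mN/m with area in Å² per residue.

```python
K_BOLTZMANN = 1.380649e-23  # J/K
UNIT = 1e-23                # 1 (mN/m) * 1 Å^2 expressed in joules


def fit_isotherm(pressures, areas, dp, temperature=298.15):
    """Fit pi*(A - A0) = kT/(n*DP); return (A0 in Å^2 per residue, n)."""
    # Rearranged form: pi*A = A0*pi + kT/(n*DP), a straight line in pi.
    xs = list(pressures)
    ys = [p * a for p, a in zip(pressures, areas)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                       # limiting area A0
    intercept = mean_y - slope * mean_x     # kT/(n*DP)
    kt = K_BOLTZMANN * temperature / UNIT   # kT in (mN/m)*Å^2
    return slope, kt / (intercept * dp)


# Synthetic isotherm generated from assumed values (not measured data).
TRUE_A0, TRUE_N, DP = 16.0, 2.0, 14
KT = K_BOLTZMANN * 298.15 / UNIT
pressures = [0.5, 1.0, 2.0, 4.0, 8.0]  # mN/m
areas = [TRUE_A0 + KT / (TRUE_N * DP * p) for p in pressures]

a0, n = fit_isotherm(pressures, areas, DP)
print(f"A0 = {a0:.1f} Å^2/residue, n = {n:.1f}")
```

Because the fit recovers the values used to generate the data, it checks only the algebra of the rearrangement; real isotherms would need attention to the pressure range over which the ideal-gas-like form is a reasonable description.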
Under forcing conditions (low dielectric solvents that strongly favor secondary structure formation), the critical chain lengths for α helix and β sheet formation are approximately 8-13 residues and 4-6 residues, respectively (Narita et al., 1984; Mutter, 1985). Thus, chain length is another extremely important consideration in the design of secondary structures. The systematic studies of peptides with defined hydrophobic periodicities described above are also consistent with earlier results obtained from studies of amino acid copolymers of high molecular weight (
The above discussion applies only to the design of peptides that form secondary structures in water or at the interface between an apolar surface and water. In nonpolar solvents, α helix and β sheet formation occurs much more readily, in large part because solvent molecules cannot compete as readily for hydrogen bonding of the peptide amide groups.
B. Enhancing the Helical Potential of Natural Peptides
Often direct or indirect evidence suggests that peptides that are random coils in aqueous solution form helices when they bind to their receptors. In such cases it may be possible to enhance the affinity of the peptides for their receptors by introducing modifications that should make the coil-to-helix transition more favorable. By increasing the helical potential of a peptide, it should be possible to increase the binding affinity.
1. Ribonuclease S-Peptide
For a number of years, ribonuclease S (Richards, 1958) has served as a convenient model system for studying peptide-protein interactions (Richards and Wyckoff, 1971; Finn and Hofmann, 1973). Limited proteolysis of ribonuclease A with subtilisin gives rise to a complex (ribonuclease S) composed of residues 1-20 (S-peptide) and 21-124 (S-protein). Neither fragment is individually active, but they reversibly associate to form a complex with full enzymatic activity. Residues 3-13 of ribonuclease S-peptide are helical in the complex, as they are in the native enzyme (Wyckoff et al., 1967). Changes that increase the stability of the S-peptide helix in solution should be expected to stabilize the complex formed between S-protein and S-peptide. Mitchinson and Baldwin (1986) have prepared a set of six analogs of S-peptide that should vary in their helical potentials but not in their mode of interaction with ribonuclease.
These analogs encompass the first 15 residues of S-peptide and differ from it only in the nature of the N-terminal amino acid and in other residues that are not in direct contact with S-protein in the crystal structure of the complex. Their helical contents in aqueous solution at 3°C vary from approximately 10% for S-peptide 1-15 to 65% for the most helical analog. As the helicities of the peptides in solution increase, so too do the affinities of the peptides for S-protein (Table I). Also, the thermal stabilities of the reconstituted ribonuclease S-peptide-S-protein complexes increase with increasing helical stabilities (Table I). The most helical peptide binds ribonuclease with more than a 10-fold higher affinity than does the least helical analog. The results have broad implications for the mechanisms by which proteins fold into their three-dimensional structures as well as for the process of protein design.
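The trend summarized above can be checked with a short calculation on the Table I values. The sketch below converts the association constants into standard binding free energies with ΔG° = −RT ln Ka and computes a Pearson correlation between peptide ellipticity and ln Ka. It is an illustration rather than the authors' analysis: the numbers are transcribed from Table I as it appears in this text (row alignment assumed to follow the printed order), a 1 M standard state is assumed, and the helper names are hypothetical.

```python
import math

R = 8.314462618e-3   # gas constant, kJ/(mol K)
T = 35.0 + 273.15    # association constants in Table I refer to 35°C

# (peptide, ellipticity at 3°C in deg cm2 dmol-1, Ka in 1e6 M-1), from Table I.
TABLE_I = [
    ("S-peptide (1-15)", 2500, 0.865),
    ("S-peptide (1-19)", 3500, 1.05),
    ("V", 8200, 2.19),
    ("I", 11100, 4.99),
    ("II", 12800, 6.32),
    ("VI", 13000, 4.03),
    ("III", 15800, 6.45),
    ("IV", 16200, 10.00),
]


def binding_free_energy(ka_per_molar):
    """Standard free energy of association in kJ/mol (1 M standard state)."""
    return -R * T * math.log(ka_per_molar)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


ka_values = [ka * 1e6 for _, _, ka in TABLE_I]
fold = max(ka_values) / min(ka_values)
ddg = binding_free_energy(max(ka_values)) - binding_free_energy(min(ka_values))
r = pearson([e for _, e, _ in TABLE_I], [math.log(k) for k in ka_values])

print(f"affinity range: {fold:.1f}-fold, ddG = {ddg:.1f} kJ/mol, r = {r:.2f}")
```

The fold difference between the tightest and weakest binders is consistent with the "more than 10-fold" statement in the text; the correlation coefficient is a rough descriptive summary of eight points and should not be over-interpreted.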
Komoriya and Chaiken (1982) have designed a 15-residue analog of S-peptide that contains many of the minimal structural requirements predicted to be essential for forming an enzymatically active complex with S-protein (Richards and Wyckoff, 1971). This analog contains alanine at every position in its sequence except Glu-2, Lys-7, Phe-8, Arg-10, His-12, and Met-13. Alanine was chosen because it is small, neutral, and has a strong tendency to form α helices. Thus, the alanine residues should help stabilize a semirigid α-helical structure, a characteristic that is important for positioning the functionally important nonalanine residues. The apolar side chains of Phe-8 and Met-13 are involved in van der Waals contacts with S-protein in the crystal structure of the complex of ribonuclease S, and His-12 is required for catalytic activity. The remaining nonalanine residues were thought to be important for providing a hydrophilic surface and other stabilizing interactions. The model S-peptide retains many of the features of native S-peptide. It forms a 1 : 1 complex with S-protein, although its affinity is decreased by an order of magnitude. The dissociation constant for binding to S-protein was 1.1 x M for the model versus 0.1 x M for the native peptide. Also, the reconstituted model S-peptide : S-protein complex was 35% as active as the native complex in catalyzing the hydrolysis of cyclic cytidine 2',3'-monophosphate. Part of the reason that the model peptide bound less tightly than the native peptide appears to be related to the helical potentials of the two peptides in aqueous solution. The model peptide was actually less helical than S-peptide in water as assessed by their ellipticities at 222 nm. It would be interesting to determine whether changes known to increase the helicity of S-peptide (see above) would also increase the helicity (and affinity for S-protein) of the model peptide.
For instance, blocking the N and C termini with acetyl and carboxamide moieties, respectively, should favor helix formation. Recently, the crystal structure of S-protein complexed with the model peptide has been solved to moderate resolution (3 Å) (Taylor et al., 1985). Most of the structural features envisioned in the design of the model peptide were indeed observed in the structure of the complex. The peptide is in a helical conformation, the histidine is held in a reasonable orientation for catalysis, and the complex is stabilized by nonbonded interactions between the hydrophobic cleft of S-protein and the side chains of Phe-8 and Met-13 of the peptide. There were also a number of subtle differences between the structures of the native and the model S-protein : S-peptide complexes. Most notably, the N terminus of the peptide has undergone a major reorientation that prevents Glu-2 from forming a hydrogen bond with Arg-10. Further, the δ-nitrogen of the active-site
histidine is shifted by approximately 1 Å, a change perhaps accounting for the decreased enzymatic activity of the model complex. The lack of a salt bridge between Glu-2 and Arg-10 in the crystal structure of the model complex suggests that this interaction might not be obligatory for the structural stability of the complex. This proposal is consistent with results obtained with other model peptides that display even less homology to S-peptide (Kanmera et al., 1983). An analog in which Glu-2, Lys-7, and Arg-10 were also replaced by alanine formed a complex with S-protein that was 15% as active as native ribonuclease S. Unfortunately, the affinity of this peptide for S-protein could not be properly evaluated because the peptide was not obtained in pure form. These studies with analogs of S-peptide demonstrate the usefulness of synthetic modeling in identifying which residues provide a conformational framework and which are required for intra- and intermolecular surface recognition.
2. Other Peptides
Much of the information gleaned from studies of ribonuclease S can be applied to studies of far more complex systems. In most cases the receptor-bound form of the peptide will not be known, but the same synthetic modeling approaches can be applied to probe the conformational features of the complex. Two recent examples of the application of this approach to the design of biologically active peptides will be considered in this section. Glucagon is a 29-residue hormone whose primary biological role is to stimulate glucose release and production. In dilute aqueous solution the peptide is in a random conformation, but it can be induced to adopt a largely helical conformation under a variety of conditions. Under basic conditions it aggregates to form an α-helical trimer, and the crystal structure of this trimer is known (Sasaki et al., 1975). It also adopts a partially helical conformation when it forms mixed micelles with detergents (Braun et al., 1983). Several researchers have suggested that glucagon binds to its receptor with its C-terminal 10 to 13 residues in a helical conformation (Kaiser and Kézdy, 1984; Epand, 1983; Krstenansky et al., 1986a). To test this hypothesis Hruby and co-workers (Krstenansky et al., 1986a,b) have designed analogs of glucagon that are similar to glucagon in their charge and hydrophobicity but that have enhanced helical potentials as assessed by the method of Chou and Fasman (1978). One such analog (Krstenansky et al., 1986a) incorporated three changes to simultaneously enhance the probability that residues 17 to 29 will form a helix while at the same time
decreasing the probability that this segment will form a β sheet. Two arginine residues (Pα = 0.98, Pβ = 0.93) at positions 17 and 18 were substituted with lysine (Pα = 1.16, Pβ = 0.74), and an aspartic acid residue (Pα = 1.01, Pβ = 0.54) at position 21 was replaced with a glutamic acid residue (Pα = 1.51, Pβ = 0.37). The resulting synthetic peptide, [Lys17,18, Glu21]glucagon, was five times as potent as native glucagon in a receptor binding assay, and it was seven times as potent in an adenylate cyclase activation assay. Furthermore, CD studies showed that the monomeric form of the analog in aqueous solution had a slight but significant increase in its overall α-helical content ([θ]222 = -6200 deg cm²/dmol) relative to that of native glucagon ([θ]222 = -4200 deg cm²/dmol). These findings provide indirect evidence to support a helical conformation for this region of the peptide. Lu, Mojsov, and Merrifield (Lu et al., 1987) have applied a similar approach in an attempt to decrease the helical potential of glucagon. A phenylalanine residue at position 22 was replaced with a tyrosine, which had the effect of simultaneously decreasing the helical parameter, Pα, from 1.13 to 0.70, while increasing Pβ from 1.38 to 1.49. CD spectroscopy of the resulting analog showed that it had about the same amount of α helix and β sheet as native glucagon in dilute aqueous solution, but that it contained more β sheet and less α helix in mixtures of trifluoroethanol and water. The Tyr-22 analog was a full agonist, with 20-30% the activity of glucagon in an in vivo glucose assay and 10% the activity of glucagon in an adenylate cyclase assay. While these data are not inconsistent with the hypothesis that helix formation is important for binding and activity, it is difficult to make specific mechanistic conclusions based on the properties of one analog that has a change at just a single residue.
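The direction in which these substitutions move the Chou and Fasman (1978) parameters can be tallied directly. The following is a minimal sketch that uses only the helix and sheet parameters quoted above; summing per-residue differences is a simplification chosen for illustration and is not the full Chou-Fasman prediction procedure, and the function name is hypothetical.

```python
# Conformational parameters quoted in the text: residue -> (P_alpha, P_beta)
PARAMS = {
    "Arg": (0.98, 0.93),
    "Lys": (1.16, 0.74),
    "Asp": (1.01, 0.54),
    "Glu": (1.51, 0.37),
    "Phe": (1.13, 1.38),
    "Tyr": (0.70, 1.49),
}

def propensity_shift(substitutions):
    """Sum the change in P_alpha and P_beta over (old, new) residue pairs."""
    d_alpha = sum(PARAMS[new][0] - PARAMS[old][0] for old, new in substitutions)
    d_beta = sum(PARAMS[new][1] - PARAMS[old][1] for old, new in substitutions)
    return round(d_alpha, 2), round(d_beta, 2)

# [Lys17,18, Glu21]glucagon: Arg17 -> Lys, Arg18 -> Lys, Asp21 -> Glu
helix_enhanced = propensity_shift([("Arg", "Lys"), ("Arg", "Lys"), ("Asp", "Glu")])
# Tyr-22 analog: Phe22 -> Tyr
tyr22 = propensity_shift([("Phe", "Tyr")])

print("helix-enhanced analog (dP_alpha, dP_beta):", helix_enhanced)
print("Tyr-22 analog (dP_alpha, dP_beta):", tyr22)
```

The first analog raises the summed helix parameter while lowering the sheet parameter, and the Tyr-22 analog does the reverse, which matches the stated design intent of each peptide.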
It is not clear whether the change in activity is a result of a change in the conformational properties of the peptide, or a change in a residue that interacts directly with the receptor. Hoeprich and Hugli (1986) have adopted a more radical approach to investigating the requirement for helicity in the C-terminal portion of human anaphylatoxin C3a, a 77-residue protein that is a potent humoral effector of the inflammatory response. Studies with synthetic peptides have shown that a fragment of C3a consisting of just the C-terminal 21 residues is equipotent with C3a in a smooth muscle contraction assay using guinea pig ileums (Lu et al., 1984). Decreasing the length of this peptide leads to a gradual loss in potency (Caporale et al., 1980). The C-terminal pentapeptide is several orders of magnitude less active than C3a, and peptides shorter than this have no activity at all. It has been hypothesized that the C-terminal peptide forms the essential active site of the
structure, and the remaining residues serve to enhance the binding of this element (Caporale et al., 1980). The crystal structure of native C3a has been determined, and the 15 residues immediately preceding the C-terminal pentapeptide "active site" are in a helical conformation. In contrast, a synthetic peptide comprising the C-terminal 21 residues of C3a shows very little helical character in aqueous solution, although a partially helical conformation can be induced by addition of trifluoroethanol to a final concentration of 25% (Lu et al., 1984). To determine whether this peptide segment formed a helix when bound to the C3a receptor, Hoeprich and Hugli (1986) designed several peptides with either substantially enhanced or reduced potentials to form α helices. When in a helical conformation, the C-terminal 21-residue peptide is somewhat amphiphilic; that is, the hydrophilic and hydrophobic residues show a weak tendency to segregate on opposite faces of the helix. The side chains of the residues on the predominantly hydrophilic side of the helix were considered important for activity, whereas the hydrophobic residues on the opposite side of the helix were thought to be important only insofar as they served to stabilize helix formation and to impart an amphiphilic nature to the structure. Accordingly, two peptides were designed in which the hydrophilic face was retained as such, but the residues on the hydrophobic side of the helix were replaced by a mixture of alanine and either α-aminoisobutyric acid (Aib) or α-aminobutyric acid (Ab) residues. The Aib or Ab residues were chosen for their high potential to initiate helix formation (Paterson and Leach, 1978; Paterson et al., 1981) and were spaced at 4-residue intervals in the peptide sequence. CD spectroscopy of the resulting Ab- and Aib-containing peptides indicated that they were approximately 1.5-fold more helical in aqueous solution than the parent peptide was.
Also, they were slightly more helical than the parent peptide in 25% trifluoroethanol. The helicities of the parent, Ab-containing, and Aib-containing peptides were calculated to be 17, 23, and 26%, respectively, in aqueous solution and 41, 47, and 51% in 25% trifluoroethanol. The increase in helicity occurred concomitantly with an increase in biological activity; the Ab- and Aib-peptides were 2.5-fold more active than the parent peptide. As a further probe of the importance of helicity for the activity of these peptides, an analog of the Ab peptide was prepared in which the helix-breaking residue, proline, was placed at positions 6 and 14 of the sequence. The proline-containing peptide showed a very low level of helicity, and it could not be induced to form a helix by addition of trifluoroethanol to a final concentration of 25%. This peptide showed 1/300th the potency of the parent peptide in the guinea pig ileum assay. Taken together, these
data provide strong support for the hypothesis that the C-terminal 20 residues of C3a are largely helical when bound to the C3a receptor (Lu et al., 1984).
C. Design of Peptides That Form Amphiphilic Secondary Structures
The preceding section described several examples in which a secondary structural element served as a rigid framework for positioning a few critical side chains in proper orientation to achieve macromolecular recognition and catalysis. Large portions of the sequences of these peptides appeared to be important only for stabilizing the secondary structural framework and hence could be mutated without adversely affecting the physical and biological properties of the peptide of interest. However, there were limits to the extent to which the framework residues could be varied. For instance, analogs of S-peptide with the highest enzymatic activities contained hydrophilic groups on the side of the helix that projected into solvent when complexed with S-protein (Kanmera et al., 1983; Komoriya and Chaiken, 1982). Similarly, the amphiphilic character of the C-terminal helix of C3a was retained and even enhanced in the highly active models of this peptide (Hoeprich and Hugli, 1986). Indeed, amphiphilicity is a very general attribute that is common to the secondary structures of a variety of medium-sized peptides that act at biological surfaces (Kaiser and Kézdy, 1983, 1984). The generality of this feature is a direct result of the fact that medium-sized peptides are too short to adopt stable tertiary structures. They therefore adopt structures complementary to the anisotropic environments of their receptors, which include proteins or phospholipid surfaces. Amphiphilic secondary structures can serve a variety of roles in the structures of peptides and proteins (Kaiser and Kézdy, 1983, 1984).
Many peptides such as apolipoproteins and peptide toxins that bind to phospholipid surfaces do so by adopting an amphiphilic α-helical conformation (Kaiser and Kézdy, 1983). The exact sequence of the membrane-binding helix is often unimportant for function as long as the relative ratio of charge, hydrophobicity, and amphiphilicity is retained. In other cases amphiphilic secondary structures act as auxiliaries that serve to modulate the properties of neighboring sequences that often have quite specific roles in binding and/or catalysis. For instance, synthetic modeling has shown that the 31-residue peptide hormone β-endorphin is composed of a highly conserved pentapeptide segment (enkephalin) followed by a highly variable flexible linker, which is in turn followed by a helical auxiliary (Taylor and Kaiser, 1986). Potential amphiphilic α-helical segments are also found at N or C termini of globular proteins, where they appear to
play a role in binding to other macromolecules or macromolecular surfaces (Lear and DeGrado, 1987; Recny et al., 1985; Lucas et al., 1986). The design of model peptides has played an important role in recognizing the importance of amphiphilic secondary structures for the properties of peptides. This approach involves replacing the regions of a peptide that are believed to form amphiphilic secondary structures with sequences that can effectively mimic the physicochemical properties of the postulated amphiphilic regions but have minimal sequence homology to the parent sequence. The properties of the resulting model compounds are then measured to assess the correctness of the guiding hypothesis. A number of recent reviews have described detailed procedures for identifying and modeling amphiphilic secondary structures in peptides (Erickson-Viitanen and DeGrado, 1987; Taylor and Kaiser, 1987). The application of these methods to the design of a variety of peptides has also been reviewed (Taylor and Kaiser, 1986; Kaiser and Kézdy, 1983, 1984). Therefore, the following section on this subject will not attempt to exhaustively review the literature but instead will provide three representative examples of the application of this approach to the design of biologically active peptides. They will be treated in order of increasing complexity.
1. Membrane-Binding Peptides
One of the first examples of the modeling of amphiphilic secondary structures involved the design of α-helical peptides that bind to phospholipid surfaces. The approach was very successfully applied to the design of models for apolipoprotein A-I (Kaiser and Kézdy, 1983; Nakagawa et al., 1985), the major protein constituent of high density lipoprotein. This apolipoprotein serves to stabilize the surface of the lipoprotein particle and is also important for activating the enzyme lecithin-cholesterol acyltransferase. It was hypothesized that this 243-residue protein did not form a stable tertiary structure on the lipoprotein particle, but instead formed a series of 22-residue amphiphilic α-helical segments, each of which bound to the phospholipid surface (Edelstein et al., 1979). About one-third of the residues in each of these peptides were hydrophobic, and the remaining residues were predominantly charged. Positively and negatively charged residues were present in nearly equal proportion, so the helices were approximately neutral at pH 7. A 22-residue peptide was designed to epitomize this hypothetical structure but to have minimal sequence homology to any sequence in apolipoprotein A-I. This peptide and its covalent dimer were found to reproduce all of the salient physical and biological properties of apolipoprotein A-I, including its ability to bind to phospholipid surfaces and activate lecithin-cholesterol acyltransferase. Thus, the amphiphilic helix was firmly established as a structure that can bind phospholipid surfaces. Concurrent work with synthetic model peptides prepared in other laboratories (see Epand et al., 1987, and references cited within) led to essentially the same conclusions.
About this time it was becoming increasingly clear that amphiphilic α helices were also involved in a class of highly surface-active toxic peptides that disrupt phospholipid membranes. The 26-residue peptide from honeybee venom, melittin (Haberman, 1972), served as a prototype of this class of peptides. Early evidence suggested that it existed in a random coil conformation in aqueous solution, but that it could be induced to form helices in the presence of membranes or when aggregated into tetramers (Drake and Hider, 1979; Dawson et al., 1978). Analysis of the first 20 residues of its sequence showed that this segment was capable of forming an amphiphilic α helix (DeGrado et al., 1981) that differed markedly from the helices found in apolipoprotein A-I. Whereas the helices in apolipoprotein A-I were approximately one-third hydrophobic and contained a large number of charged groups, the amphiphilic helix in melittin was two-thirds hydrophobic and contained just one charged residue at position 7 in its sequence. Also, melittin contained a proline at position 14, which might produce a kink in the amphiphilic helix. The difference in hydrophobicities of the apolipoprotein helix and the melittin helix might account for their different properties. The apolipoprotein helix has been proposed to stabilize phospholipid surfaces by interdigitating its hydrophobic groups among the head groups of the phospholipids (Edelstein et al., 1979).
In contrast, the melittin helix is more hydrophobic and so would be expected to penetrate more deeply into the membrane, possibly contacting the fatty acid acyl chains and disrupting the fluidity and other physical properties of the membrane (Dawson et al., 1978; DeGrado et al., 1981). An amphiphilic helix alone was not sufficient to account for the full hemolytic properties of melittin, as a peptide spanning the first 20 residues of melittin was not a potent hemolytic agent even though it was highly surface active. The C-terminal hexapeptide was likewise inactive and thus appeared to act in concert with the N-terminal amphiphilic α helix. This segment was highly hydrophilic and basic, with a net positive charge of +4 at neutral pH. It appeared possible that this charged segment might serve to tighten the interaction of the helix with biological membranes, whose phospholipids tend to be negatively charged. To test the role of the proposed amphiphilic α-helical segment in determining the properties of melittin, the model melittin analog illustrated in Fig. 5 was prepared (DeGrado et al., 1981, 1982). It was designed to have minimal homology to melittin in the N-terminal 20 residues but to have the C-terminal hexapeptide intact. This peptide was approximately 2.5-fold as active as melittin in effecting lysis of human erythrocytes. It also was capable of binding and disrupting small unilamellar phospholipid membranes and could activate phospholipase A2. Thus, much of the sequence of melittin was shown to serve a purely structural role and could be replaced by an idealized α helix.
Melittin: H2N-GLY-ILE-GLY-ALA-VAL-LEU-LYS-VAL-LEU-THR-THR-GLY-LEU-PRO-ALA-LEU-ILE-SER-TRP-ILE-LYS-ARG-LYS-ARG-GLN-GLN-CONH2
Model melittin: H2N-LEU-LEU-GLN-SER-LEU-LEU-SER-LEU-LEU-GLN-SER-LEU-LEU-SER-LEU-LEU-LEU-GLN-TRP-LEU-LYS-ARG-LYS-ARG-GLN-GLN-CONH2
FIG. 5. (A) Amino acid sequence of melittin and a model lytic peptide. (B) Axial helical projection (Schiffer and Edmundson, 1967) of the model peptide. Taken from DeGrado et al. (1981).
Simultaneously with the completion of the synthetic modeling work, the crystal structure of tetrameric melittin was completed by Terwilliger and Eisenberg (1982a,b; Terwilliger et al., 1982). Each of the melittin molecules in the tetramer was in a helical conformation and displayed a highly pronounced amphiphilicity. The hydrophobic faces of the helices pointed toward the interior of the structure, while the hydrophilic groups pointed outward toward the solvent. The conformation of melittin when bound to membranes is likely to be similar to the conformations of the monomers observed in the crystal structure (Terwilliger and Eisenberg, 1982b).
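The segregation of apolar and polar residues in the model peptide can be checked with the axial projection used for Fig. 5B. The following is a minimal sketch, assuming an ideal α helix with 100° of rotation per residue (3.6 residues per turn) and reading position 3 of the model sequence as Gln; the variable and function names are hypothetical.

```python
# First 20 residues of the model melittin analog (Fig. 5A).
MODEL_HELIX = ("Leu Leu Gln Ser Leu Leu Ser Leu Leu Gln "
               "Ser Leu Leu Ser Leu Leu Leu Gln Trp Leu").split()
APOLAR = {"Leu", "Trp"}
DEGREES_PER_RESIDUE = 100  # ideal alpha helix: 360 degrees / 3.6 residues

def wheel_angles(residues):
    """Return (position, residue, angle) tuples, angle in degrees about the helix axis."""
    return [(i + 1, res, (i * DEGREES_PER_RESIDUE) % 360)
            for i, res in enumerate(residues)]

wheel = wheel_angles(MODEL_HELIX)
polar_angles = sorted(angle for _, res, angle in wheel if res not in APOLAR)
apolar_angles = sorted(angle for _, res, angle in wheel if res in APOLAR)

print("polar residues at: ", polar_angles)
print("apolar residues at:", apolar_angles)
```

In this projection the Gln and Ser residues fall within a single arc of the wheel while Leu and Trp occupy the remainder, which is the amphiphilic pattern the design was meant to produce.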
More recent studies with natural and model peptides have suggested that the C-terminal hexapeptide of melittin serves only to provide charge and that it can be eliminated if the amphiphilic α-helical segment contains basic residues on its hydrophilic side. Several classes of toxic peptides have been isolated from insect venoms; they include mastoparan from wasp, crabrolin from European hornet, and bombolitins from bumblebee (Argiolas and Pisano, 1985; Nakajima et al., 1986). These peptides are 13 to 17 residues long and share a variety of properties with melittin, including the ability to lyse erythrocytes and elicit release of histamine from mast cells. Each of these peptides is capable of forming an amphiphilic α helix with hydrophobic residues on one face of the structure and basic residues on the opposite face (Fig. 6). Thus, the basic residues need not be present on a neighboring stretch of peptide chain as in melittin, but can be incorporated directly into the helical sequence instead. Indeed, the 14-residue, basic, amphiphilic α-helical peptide 2 described in Section III,A,3, composed of just leucine and lysine residues, is a hemolytic agent that is nearly equipotent with melittin (W. F. DeGrado, unpublished results).
FIG. 6. Axial helical projections (Schiffer and Edmundson, 1967) of three peptide toxins: bombolitin I, mastoparan, and crabrolin. Taken from Argiolas and Pisano (1985).
Another peptide that appears to form basic, amphiphilic α helices is the antibacterial peptide cecropin A (Boman and Steiner, 1981; Andreu et al., 1983), isolated from the hemolymph of the North American silk moth Hyalophora cecropia. This 37-residue peptide is capable of protecting the silk moth from infection by a wide variety of gram-negative and gram-positive bacteria. It appears to be capable of forming an amphiphilic α-helical segment in approximately the first 20, and last 15, residues of its sequence. The mechanism by which this peptide exerts its biological activity is poorly understood at present. However, irrespective of the mechanism of action, evidence suggests that an amphiphilic α helix is involved in the activity of this peptide. A series of analogs has been prepared in which the helical potential, the charge, or the amphiphilicity of the first proposed helix has been compromised by the introduction of a change at single sites (Andreu et al., 1985). None of the modifications gave peptides with the same activities as the parent molecule, a result suggesting that all three factors act in concert to effect antibacterial activity. Also, an amphiphilic α-helical peptide composed of just leucine and lysine residues has been shown to be capable of reproducing some of the antibacterial properties of cecropin (DeGrado, 1983). This finding has been confirmed in a very extensive study by Mihara et al. (1987), who synthesized a series of twelve peptides approximately 15 residues long and containing leucine, alanine, and from four to six lysine residues. These investigators found that the abilities of these peptides to cause release of carboxyfluorescein from phospholipid vesicles correlated well with their antibacterial activities against gram-positive bacteria. The analogs with the largest number of lysine residues were most active against gram-negative bacteria, although they showed reduced potency against gram-positive bacteria.
2. Calmodulin-Binding Peptides
Calmodulin is a small acidic protein that serves to activate a variety of key regulatory enzymes, including phosphodiesterase, protein kinases, and adenylate cyclase (Klee and Vanaman, 1982). Calmodulin has binding sites for up to four calcium ions; it changes its conformation and exposes a hydrophobic binding surface or surfaces as these sites are filled. Activation of target enzymes generally requires that three or four of the calcium-binding sites be occupied, although calmodulin can bind protein phosphatase after only two calcium-binding sites are filled (Kincaid and Vaughan, 1986). The mechanism by which calmodulin can bind to and activate such a large number of target enzymes is a major unanswered question that is only beginning to be solved. Many target enzymes can be activated by partial proteolytic digestion (Klee and Vanaman, 1982); and in the case of skeletal muscle myosin light-chain kinase, it has been shown that proteolysis removes a short calmodulin-binding peptide (Edelman et al., 1985). Thus, for at least some target enzymes, interaction with calmodulin is largely a peptide-protein interaction and can be studied using synthetic model peptides.
FIG. 7. (A) Aligned, partial sequences of a number of calmodulin-binding peptides. The boxes indicate residues that are generally occupied by apolar residues. Reported dissociation constants for interaction with calmodulin are given on the right. LK2, a model peptide; VIP, vasoactive intestinal peptide; GIP, gastric inhibitory peptide. (B) The mean hydrophobicities for the residues at a given position were plotted versus their position in the aligned sequence. The horizontal bar indicates the period of an α helix. From Cox et al. (1985).
Calmodulin is also known to bind a number of peptides (DeGrado et al., 1985; Cox et al., 1985; McDowell et al., 1985; Giedroc et al., 1983; Malencik and Anderson, 1982, 1983, 1984; Sellinger-Barnette et al., 1984) that include hormones and toxins such as melittin and mastoparan (Fig. 7). Although the physiological role of the interaction of calmodulin with these peptides (if any) is not known, they provide an attractive model system for studying calmodulin-target enzyme interactions (Erickson-Viitanen et al., 1987). The interaction of calmodulin with these peptides is calcium-dependent, and the dissociation constants for the complexes are of the same order of magnitude as for calmodulin-target
enzyme complexes. In addition, the peptides compete with target enzymes for binding to calmodulin, a finding suggesting that both are binding to similar sites on calmodulin. The calmodulin-binding peptides (CBP) vary considerably in chain length, structure, and apparent function. The fact that there is no exact sequence homology running through the entire set of peptides reinforces the question of how one protein can interact with so many disparate structures. A possible answer to this question arose from a consideration of the overall structural properties of the calmodulin-binding peptides rather than just their amino acid sequences. The peptides that bound calmodulin most tightly could be shown to have more positively charged residues than negatively charged residues in their sequences (Malencik and Anderson, 1984). In addition, it appeared likely that calmodulin recognized an amphiphilic α helix in at least one peptide to which it bound, β-endorphin. Using synthetic fragments of β-endorphin, it was shown that the active calmodulin-binding portion of this peptide was localized within a 12-residue segment capable of forming an amphiphilic α helix (Giedroc et al., 1983). CD evidence also suggested that β-endorphin bound to calmodulin in a partially helical conformation, although the degree of helical induction was small and difficult to quantify. Unfortunately, the affinity of calmodulin for β-endorphin was much weaker (KD = 1 µM) than its affinity for target enzymes (KD = 1 nM), and the stoichiometry for the peptide-calmodulin complex was 2 : 1 rather than the 1 : 1 ratio seen for many target enzyme-calmodulin complexes. Therefore, considerable interest was generated by the discovery that a number of other hormonal and cytotoxic peptides formed high-affinity (KD < 100 nM) complexes with calmodulin (reviewed in Cox et al., 1985).
These peptides included the toxins melittin (Comte et al., 1983) and mastoparan (Malencik and Anderson, 1984), which had already been shown to have a high tendency to form basic amphiphilic α helices (see above). It seemed likely that this structural feature might be responsible for the calmodulin-binding properties of each of the peptides illustrated in Fig. 7. Indeed, hydrophobic and hydrophilic residues occupied equivalent positions in the sequences of these peptides when they were aligned as in Fig. 7. When the mean hydrophobicities of the residues at each position in the aligned sequences were considered as a function of the position along the chain, a sine wave with a periodicity similar to that of an α helix (3.6 residues) was observed (Cox et al., 1985). To obtain additional evidence in support of this hypothesis, three peptides were designed to be capable of forming a basic, amphiphilic α helix but to share little sequence homology with any natural calmodulin-binding peptide (Cox et al., 1985). The first two synthetic peptides, CBP1 and
TABLE II
Synthetic Calmodulin-Binding Peptides (a)
Peptide   Sequence                                                                      KD (nM)
CBP1      FMOC (b) Leu Lys Lys Leu Leu Lys Leu                                          200
CBP2      FMOC Leu Lys Lys Leu Leu Lys Leu Leu Lys Lys Leu Leu Lys Leu                  3
CBP3      FMOC Leu Glu Glu Leu Leu Glu Leu Leu Glu Glu Leu Leu Glu Leu                  > 1000
CBP4      Lys Leu Trp Lys Lys Leu Leu Lys Leu Leu Lys Lys Leu Leu Lys Leu Gly           0.4
CBP5      Leu Lys Trp Lys Lys Leu Leu Lys Leu Leu Lys Lys Leu Leu Lys Leu Gly           0.2
(a) The data are taken from Cox et al. (1985) (peptides 1-3) and from DeGrado et al. (1985) (peptides 4, 5).
(b) FMOC, 9-fluorenylmethyloxycarbonyl.
CBP2 (Table II), corresponded to peptides 1 and 2 of Section III,A,3; and the third peptide, CBP3, was a negatively charged analog of CBP2. It was hypothesized that if helix formation were indeed important for high affinity binding (KD < 100 nM), then CBP1, which was too short to form a very stable α helix, should bind rather weakly. On the other hand, CBP2 readily formed helices in the presence of apolar-water interfaces, so it should bind to calmodulin much more tightly than CBP1 did. CBP3 was designed to test the electrostatic requirements for binding; it was expected to have conformational properties very similar to those of CBP2. In accordance with these predictions, CBP1 and CBP2 were found to bind to calmodulin with approximately 0.2 µM and 3 nM dissociation constants, respectively. CBP3 failed to bind calmodulin at all, even at micromolar concentration, a result that demonstrated that electrostatic interactions must play an important role in recognition. However, polylysine binds very poorly to calmodulin, an observation indicating that electrostatic interactions alone are not responsible for binding. Taken together, these data strongly suggest that a basic, amphiphilic α helix is an important structural feature for the binding of many peptides to calmodulin. The synthetic modeling approach described above is largely conceptual in nature and is based entirely on a guiding structural hypothesis concerning the peptide alone. In this approach one begins with the simplifying assumption that the receptor protein is entirely indifferent to the fine details of the topography of the helical surface of the peptide and that only the overall distribution of charge, hydrophobicity, and amphiphilicity is important. Clearly, this is a gross approximation; that this approach is so successful is somewhat surprising and requires an explanation.
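The α-helical periodicity built into CBP2 can be illustrated numerically. The following is a minimal sketch that scans the rotation angle per residue and reports the angle giving the largest hydrophobic-moment-style vector sum; this is a swapped-in technique used for illustration, not the sine-wave analysis of Cox et al. (1985), and the two-level hydrophobicity scale (Leu = +1, Lys = -1) and function names are assumptions.

```python
import cmath
import math

CBP2 = "LKKLLKLLKKLLKL"          # Leu/Lys sequence of CBP2 (Table II), one-letter code
SCALE = {"L": 1.0, "K": -1.0}    # assumed two-level hydrophobicity scale

def hydrophobic_moment(seq, degrees):
    """Magnitude of the summed hydrophobicity vectors for a given rotation per residue."""
    step = math.radians(degrees)
    return abs(sum(SCALE[res] * cmath.exp(1j * n * step)
                   for n, res in enumerate(seq)))

def best_turn_angle(seq):
    """Scan 1-180 degrees per residue and return the angle with the largest moment."""
    return max(range(1, 181), key=lambda deg: hydrophobic_moment(seq, deg))

angle = best_turn_angle(CBP2)
print("best angle per residue:", angle, "degrees")
print("residues per turn:", round(360 / angle, 2))
```

Under these assumptions the maximum falls close to the 100° per residue of an ideal α helix, consistent with a sequence designed to place leucine on one face and lysine on the other.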
In considering the interaction of calmodulin with CBP2, it is important to consider the flexibility of the side chains on the surface of the peptide and protein.
The main chain atoms also possess a certain degree of flexibility. Thus, the peptide and protein can adjust their surface topographies to maximize their mutual complementarity. However, there are limits to the degree to which peptides can vary their conformations, and it seems unlikely that a regularly repeating structure such as that of CBP2 could assume a conformation that is perfectly complementary to a surface as irregular as a protein-water interface. Therefore, the sequence of CBP2 is almost certainly not optimally designed for affinity or specificity. In addition to not having an optimal fit with calmodulin, it is probably too amphiphilic in its design. For instance, CBP2 also binds to and disrupts a variety of membranes (W. F. DeGrado, unpublished results). In contrast, the target enzymes of calmodulin tend to be highly specific in their interactions with this regulatory protein.
The design of peptides with improved affinities and specificities for calmodulin requires a consideration of the structure of calmodulin as well as that of the peptide. Recently, the crystal structure of calmodulin has been solved, using data that extend to 3.0 Å (Babu et al., 1985). At this level of resolution, the positions of the calcium-binding sites and secondary structures could easily be discerned, but the exact positions of side chains will remain ambiguous until the structure has been refined to higher resolution. Prior to the publication of the crystallographic structure, two related structures were predicted for calmodulin on the basis of its sequence homology with two proteins of known three-dimensional structures, intestinal calcium-binding protein and parvalbumin (O’Neil and DeGrado, 1985). Because of the high degree of sequence homology between parvalbumin or intestinal calcium-binding protein and calmodulin, the models were found to be qualitatively similar to the crystal structure of calmodulin.
An examination of the surface of these models resulted in the identification of a likely site for binding amphiphilic peptides. This site was primarily composed of residues near the C terminus of calmodulin and contained a hydrophobic patch that was flanked by several acidic residues. With computer graphics it was possible to “dock” CBP2 onto this site in such a way that the apolar side of the helix contacted the hydrophobic surface. Further analysis suggested that the affinity of CBP2 might be enhanced by adding a lysine residue near its N terminus on the otherwise uninterrupted hydrophobic side of the helix. Also, it appeared that a tryptophanyl residue might be well accommodated near the N terminus of the sequence. These and other considerations (DeGrado et al., 1985) led to the design of CBP4 and CBP5 (Table II). Both peptides had a tryptophan in the 3-position of their sequences, and they varied only in the sequence of their N-terminal dipeptides. CBP4 had the N-terminal sequence Lys-Leu, which placed the lysine on the
hydrophilic side of the helix, while CBP5 had the N-terminal sequence Leu-Lys, which placed the lysine on the hydrophobic side of the helix. CBP4 was found to bind calmodulin with a 0.4-nM dissociation constant, a finding indicating that adding a tryptophan to CBP2 and increasing its chain length indeed improved its affinity for calmodulin. CBP5 bound even more tightly, with a dissociation constant of 0.2 nM. Thus, interrupting the perfect amphiphilicity of CBP4 led to an analog that bound more tightly to calmodulin. These data demonstrate how second-order conformational considerations can lead to the design of peptides with higher specificity and affinity for their receptors.
While these studies with model peptides were in progress, other groups were involved in elucidating the calmodulin-binding sites of two related target enzymes, smooth muscle and skeletal muscle myosin light-chain kinase (MLCK) (Lucas et al., 1986; Blumenthal et al., 1985; Klevit et al., 1985; Edelman et al., 1985). Analysis of CNBr digests of each of these enzymes gave rise to a single peptide for each protein that could bind calmodulin with extremely high affinity. The peptides derived from the smooth muscle (Lucas et al., 1986) and skeletal muscle enzymes (Blumenthal et al., 1985) were 20 and 27 residues in length, respectively, and each lies near the C terminus of its parent enzyme. CD studies showed that when calmodulin and either of these two peptides were mixed in a 1:1 molar ratio, the helicity of the complex was greater than the sum of the helicities of the two individual components (Lucas et al., 1986; Klevit et al., 1985), an outcome consistent with the hypothesis that these peptides form helices when they bind to calmodulin. Studies involving limited proteolysis of native skeletal muscle MLCK have provided evidence that strongly suggests that the peptide isolated from the CNBr digest is involved in binding calmodulin in the native enzyme (Edelman et al., 1985).
Thus, these peptides appear to make up much if not all of the calmodulin-binding domains of the enzymes. Each of the natural peptides derived from MLCK contains a 16-residue segment that conforms to the structural paradigm derived from the studies of model amphiphilic peptides. When aligned as in Fig. 8, hydrophobic and basic residues tend to occupy invariant positions in their sequences. As illustrated in the helical net diagram (Fig. 9), all three peptides display a potential apolar ridge. This ridge is formed by hydrophobic residues that occur at positions i - 3, i, i + 3 over a 10-residue stretch of the chain. This repeat has been found in many helix-helix packings (Chothia, 1984). Preliminary data from a number of laboratories suggest that the calmodulin-binding domains of several other enzymes might contain basic, amphiphilic α helices. The catalytic subunit (γ subunit) of phosphorylase b kinase is known to bind extremely tightly to calmodulin (the δ subunit;
[Fig. 8 alignment. The reference for each sequence is given in parentheses.]
Skeletal muscle MLCK 342-361 ([illegible] et al., 1986)
  Lys-Arg-Arg-Trp-Lys-Lys-Asn-Phe-Ile-Ala-Val-Ser-Ala-Ala-Asn-Arg-Phe-Lys-Lys-Ile
Smooth muscle MLCK peptide (Lucas et al., 1986)
  Arg-Arg-Lys-Trp-Gln-Lys-Thr-Gly-His-Ala-Val-Arg-Ala-Ile-Gly-Arg-Leu-Ser-Ser-Ser
β-Subunit of type II CaM kinase (297-316) (Bennett & Kennedy, 1987)
  Arg-Arg-Lys-Leu-Lys-Gly-Ala-Ile-Leu-Thr-Thr-Met-Leu-Ala-Thr-Arg-Asn-Phe-Ser-Val
Unknown calmodulin kinase (Sikela & Hahn, 1987)
  Arg-Arg-Lys-Leu-Lys-Ala-Ala-Val-Lys-Ala-Val-Val-Ala-Ser-Ser-Arg-Leu-Gly-Ser-Ala
CBP5 (DeGrado et al., 1985)
  Leu-Lys-Trp-Lys-Lys-Leu-Leu-Lys-Leu-Leu-Lys-Lys-Leu-Leu-Lys-Leu-Gly
[Bottom row of + and Hb symbols; column positions not recoverable.]
FIG. 8. Amino acid sequences of the model calmodulin-binding peptide, CBP5, and of the proposed calmodulin-binding domains of four calmodulin-dependent kinases. The + symbols on the bottom row indicate positions where a positively charged residue occurs in at least half of the aligned sequences; the Hb symbol refers to positions generally occupied by hydrophobic residues.
Kee and Graves, 1986). A computer-assisted examination of the sequence of the γ subunit revealed that there is a very basic 21-residue segment with a high helical hydrophobic moment (Eisenberg et al., 1982a,b) near the C terminus of this protein (DeGrado et al., 1987a). A synthetic replicate of this segment binds calmodulin with a dissociation constant of approximately 10 nM, a result indicating that this region might constitute a portion of the calmodulin-binding domain of the γ subunit.
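The helical hydrophobic moment referred to above (Eisenberg et al., 1982a,b) is the magnitude of the vector sum of residue hydrophobicities arranged around the helix axis at 100° per residue. A minimal sketch follows. The three-level hydrophobicity scale (+1, 0, -1) is a deliberately crude stand-in assumed for illustration and is not the published scale; the function names and the grouped control sequence are hypothetical. The CBP5 sequence is the one listed in Fig. 8.

```python
import math

# Crude illustrative scale (an assumption, NOT the published consensus scale):
# +1 for apolar residues, -1 for charged or strongly polar ones, 0 otherwise.
APOLAR = set("ACFILMVW")
POLAR = set("DEHKNQR")

def hydrophobicity(residue):
    if residue in APOLAR:
        return 1.0
    if residue in POLAR:
        return -1.0
    return 0.0

def mean_hydrophobic_moment(sequence, degrees_per_residue=100.0):
    """Mean helical hydrophobic moment per residue.

    Residue n contributes a vector of length H_n pointing at angle n * delta
    around the helix axis; delta is 100 degrees for an alpha helix.
    """
    delta = math.radians(degrees_per_residue)
    sum_cos = sum(hydrophobicity(r) * math.cos(n * delta)
                  for n, r in enumerate(sequence))
    sum_sin = sum(hydrophobicity(r) * math.sin(n * delta)
                  for n, r in enumerate(sequence))
    return math.hypot(sum_cos, sum_sin) / len(sequence)

CBP5 = "LKWKKLLKLLKKLLKLG"    # model peptide, single-letter code
BLOCK = "LLLLLLLLWKKKKKKKG"   # same composition, grouped (hypothetical control)

moment_cbp5 = mean_hydrophobic_moment(CBP5)
moment_block = mean_hydrophobic_moment(BLOCK)
print(f"CBP5:  {moment_cbp5:.2f}")
print(f"block: {moment_block:.2f}")
```

With this toy scale, the designed sequence gives a much larger moment than a control of identical composition in which the apolar and basic residues are simply grouped, which is the sense in which a segment is called amphiphilic here.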
FIG. 9. Helical net diagram (Crick, 1953) of a model calmodulin-binding peptide and the putative calmodulin-binding domains of two forms of myosin light-chain kinase (MLCK). The sequences are drawn together on a single helical net and are taken from (clockwise from left) the model peptide described by DeGrado et al. (1985), skeletal muscle MLCK peptide 342-359 (Edelman et al., 1985), and the N-terminal 18 residues of a peptide derived from smooth muscle MLCK (Lucas et al., 1986). The amino acids in the sequences are given in single-letter codes. Positions that are hydrophobic in all three sequences are indicated by shading.
The sequence of the β subunit of brain type II Ca2+/calmodulin-dependent protein kinase has recently been determined by molecular cloning (Fig. 8) (Bennett and Kennedy, 1987). The sequence of this enzyme begins with a region of approximately 300 residues that has a high degree of homology to a variety of protein kinases, including kinases that are not regulated by calmodulin. This region of the protein contains the active site residues and hence represents the catalytic portion of the protein. Immediately downstream of this sequence (toward the C terminus) is a short segment that is capable of forming a basic, amphiphilic α helix and is proposed to form this enzyme’s calmodulin-binding domain (Bennett and Kennedy, 1987). The calmodulin-binding domain of skeletal muscle MLCK is also downstream of the putative catalytic portion of the enzyme. Very recently, the partial sequence of another calmodulin-binding brain protein was determined by a novel genetic method (Sikela and Hahn, 1987). Mouse brain cDNA was cloned into a vector that expresses the inserted DNA as a fusion protein, and the resulting clones were selected for their ability to produce calmodulin-binding proteins. The partial sequence of a protein that shows some homology to the β subunit of brain type II Ca2+/calmodulin-dependent protein kinase was obtained by this procedure. Restriction enzyme mapping experiments indicated that the calmodulin-binding segment of this protein was contained within a 39-residue segment; a 20-residue segment within this region had a high potential to form a basic, amphiphilic α helix, and it had a high degree of homology with the proposed calmodulin-binding domain of the β subunit of brain type II Ca2+/calmodulin-dependent protein kinase (Fig. 8).
A major limitation of the above studies of calmodulin-peptide interactions was that spectral evidence to support helix formation was limited to predictive algorithms and measurements of the difference in the circular dichroism of peptides and calmodulin in free solution and the CD in 1:1 complexes. Interpretation of such experiments was severely limited by the fact that calmodulin probably undergoes conformational changes upon binding peptides (Klevit et al., 1985). One elegant NMR study has been reported on a complex of melittin and bacterial-derived perdeuterated calmodulin; the results were consistent with helix formation by the peptide in the complex (Seeholzer et al., 1986). To obtain additional evidence in support of helix formation, O’Neil et al. (1987) adopted a novel approach that may be applicable to a number of other systems involving peptide-protein interactions. In a set of analogs of CBP5, the tryptophan was systematically moved from its original position at residue 3 to every other possible position in the peptide chain. The fluorescence properties of tryptophan are markedly dependent on environment and can provide information concerning the solvent accessibility, rigidity, and polarity of the immediate vicinity of the indole moiety. As tryptophan residues become increasingly less solvent-accessible in the interiors of proteins, they tend to have lower Stern-Volmer constants for acrylamide quenching of the tryptophan fluorescence, higher anisotropies (indicative of a larger degree of immobilization), and emission maxima that are increasingly shifted toward the blue relative to those of tryptophan in water (Lakowicz, 1983). CBP5, which has a tryptophan at position 3 on the hydrophobic side of the helix, was found to have a low accessibility to acrylamide, a high anisotropy, and a highly blue-shifted emission maximum (DeGrado et al., 1985). Figure 10 illustrates the results that might be expected when tryptophan is moved to other positions, assuming that the peptides bind to calmodulin in a helical conformation with their apolar residues contacting the hydrophobic surface of calmodulin. If the fluorescence properties of the entire set of tryptophan-containing peptides are considered as a function of the position of the tryptophan in the sequence, then the fluorescence properties should be periodic and have a repeat period that matches that of an α helix. This was indeed found to be the case (O’Neil et al., 1987). The peptides bound to calmodulin with dissociation constants ranging from 0.2 nM (for the Trp-3 peptide) to 3.0 nM (for the Trp-10, Trp-14, and Trp-15 peptides). Therefore, the perturbational effect due to the introduction of the tryptophan was small when compared with the total binding energy, which ranged from approximately -12 to -13 kcal/mol. Significantly, the affinity was the highest when tryptophan was placed in position 3, as in the original design of CBP5. The Stern-Volmer quenching constants, the anisotropies, and the emission maxima of the full set of peptides were periodic with respect to the position of the tryptophan in the sequence.
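One way to test a positional profile for this kind of periodicity is to fit a sine wave at each of a series of trial repeat periods and keep the period that leaves the smallest residual. The sketch below illustrates the idea only and is not the analysis used in the cited work. The data are synthetic and noise-free (a property repeating every 3.6 residues), and all function names are hypothetical.

```python
import math

def solve_linear(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution

def residual_for_period(positions, values, period):
    """Residual sum of squares of the best fit y = a + b*cos(wx) + c*sin(wx)."""
    w = 2.0 * math.pi / period
    rows = [(1.0, math.cos(w * x), math.sin(w * x)) for x in positions]
    # Normal equations for the three linear parameters a, b, c.
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    target = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    a, b, c = solve_linear(normal, target)
    return sum((y - (a * r[0] + b * r[1] + c * r[2])) ** 2
               for r, y in zip(rows, values))

def best_period(positions, values, trial_periods):
    """Trial period whose sine-wave fit leaves the smallest residual."""
    return min(trial_periods,
               key=lambda p: residual_for_period(positions, values, p))

# Synthetic, noise-free signal (NOT experimental data): a property that
# repeats every 3.6 residues, sampled at tryptophan positions 1..16.
positions = list(range(1, 17))
values = [0.5 + 0.3 * math.cos(2.0 * math.pi * (x - 3) / 3.6) for x in positions]

trial_periods = [2.5 + 0.05 * k for k in range(51)]   # 2.5 to 5.0 residues
period = best_period(positions, values, trial_periods)
print(f"best-fit period: {period:.2f} residues")
```

Fitting the offset, cosine, and sine amplitudes linearly at each fixed period avoids a nonlinear search; with real, noisy data the minimum would be broader and the recovered period correspondingly less certain.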
Fourier analysis of the data gave a period of approximately 3.3, in good agreement with the value of 3.6 expected for an α helix. Figure 11 illustrates a normalized average of each of these parameters as a function of the tryptophan position. The dashed curve is a theoretical curve calculated with the parameters obtained from the best fit of a sine wave (3.3-residue period) to the data. The deviation of the data from the dashed curve is greater than can be explained by experimental error and arises from the fact that the peptides do not bind to a planar interface that varies smoothly in its dielectric, rigidity, and solvent accessibility. This property might also explain the deviation of the observed repeat of 3.3 residues from the ideal value of 3.6 residues for an α helix. Thus, the deviations of the experimental points from the smooth curve provide information concerning the fine structure of the peptide-binding site. For instance, data points on the upper portion of the curve in Fig. 11 correspond to residues on the apolar side of the
FIG. 10. Highly schematic representation of the orientation of several tryptophan-containing peptides with respect to calmodulin. (A) With tryptophan in position 1, the indole is located on the hydrophilic side of the helix and is exposed to solvent. Peptides with tryptophan on this face of the helix should exhibit emission maxima near that of indole in water (~350 nm), a small anisotropy, and a high accessibility for acrylamide quenching. (B) In position 2, the tryptophan is partially exposed at the interface between the peptide and calmodulin. Peptides with a tryptophan in this location should have fluorescence properties that are intermediate between those of examples A and C. (C) The tryptophan is on the hydrophobic side of the helix and is almost entirely buried. The emission maximum should be strongly blue-shifted, the anisotropy should be large, and the accessibility to acrylamide quenching low. Taken from O’Neil et al. (1987).
helix, which presumably directly contact the surface of calmodulin. The affinities of the peptides for calmodulin relative to a peptide lacking tryptophan are given above each of the local maxima. There is a correspondence between the height of each of the local maxima and the tightness of the interaction for the corresponding derivative. This correspondence
[Figure 11 plot. Horizontal axis: Trp position (0 to 16).]
FIG. 11. Variation of the fluorescence properties of a set of tryptophan-containing peptides as a function of the position of the tryptophan in their sequence. The parameter f_AVE describes the degree of rigidity and hydrophobicity of the tryptophan's environment; it is based on emission maximum, anisotropy, and accessibility to acrylamide. When the values for each of these parameters were similar to those expected for indole in water, a value near 0 was assigned to f, whereas values up to 1.0 were assigned as the fluorescence parameters more closely resembled those observed in very rigid and apolar environments such as the interior of a protein or ethylene glycol at -60°C (Lakowicz, 1983). The values of f calculated for each parameter were then averaged to give f_AVE. The dotted curve was generated by fitting a sine wave to the data (period = 3.3 residues). Taken from O'Neil et al. (1987).
indicates that the tryptophan can increase the affinity of the peptide for calmodulin only if there is a hydrophobic pocket on the surface of calmodulin that can accommodate the indole moiety of the tryptophan side chain. Furthermore, the data in Fig. 11 suggest several ways in which the sequence of CBP5 could be changed to further improve the affinity and specificity of the peptide for calmodulin. Residue 12 is a lysine in CBP5, and yet it appears to be experiencing a hydrophobic environment when bound to calmodulin (Fig. 11). Also, residue 14 is a leucine in CBP5, and yet it is experiencing a very hydrophilic environment. In the aligned sequences of the calmodulin-binding domains shown in Fig. 8, the positions homologous to residues 12 and 14 of CBP5 are generally occupied by hydrophobic and hydrophilic amino acids, respectively. These differences between the natural and designed peptides are likely to have two consequences: the natural peptides are less amphiphilic than the designed peptide and are therefore less likely to bind indiscriminately to hydrophobic surfaces, and the surface topographies of the natural peptides are likely to be more complementary to the surface of calmodulin.
The aforementioned results demonstrate the synergism between studying natural and model peptides. The use of designed model peptides allowed the identification of a basic, amphiphilic α helix as a structural feature that was important for binding. Subsequent spectroscopic studies as well as studies with natural peptides increased the understanding of the peptide-protein interface. This knowledge allowed the simple amphiphilic α-helical model to be refined to include the contribution of residues that decrease the overall amphiphilicity of the structure but increase the specificity of the peptides for calmodulin. Within several years, the structures of several calmodulin-peptide complexes should be known in a reasonable degree of detail. In the near future, the use of calmodulin-binding peptides bearing the photoaffinity amino acid p-benzoylphenylalanine should aid in the determination of those residues on calmodulin that are involved in peptide binding (Kauer et al., 1986). Data from such experiments can be evaluated in light of the known crystal structure of calmodulin to provide a reasonable estimation of the conformational features of the complex. Also, crystallographic (Tanaka et al., 1985) and NMR studies on the interaction of peptides with calmodulin (Klevit et al., 1985; Seeholzer et al., 1986) have been initiated and should provide much higher resolution structures. The availability of structural details for peptide-calmodulin complexes should make them attractive model systems for studying the interaction of amphiphilic peptides with their receptors.
3. Peptide Hormones
The design of model peptides has recently been applied to elucidating the structural features involved in peptide hormone-receptor interactions. A first step in designing a model for a medium-sized peptide hormone is to assign a structural and functional role to each portion of a peptide sequence. This process is aided considerably if structure-activity relationships have been established from synthetic and phylogenetic studies.
As described previously (Kaiser and Kézdy, 1984; Taylor and Kaiser, 1986; Schwyzer, 1982), the amino acid sequences of many medium-sized peptide hormones can be divided into two or more distinct domains on the basis of the tolerance of these domains to changes in their sequences. Often, there is a single domain (referred to here as the specificity element) that has a highly conserved sequence. Synthetic modifications in this domain lead to hormones with decreased potency, and, in some cases, peptides spanning just the specificity element are active as agonists or antagonists. The specificity elements appear to be directly and intimately involved in interactions with the receptor. The remaining portion or portions of the structure show more sequence variation and will be referred to here as auxiliary sequences. They appear to play a more structural role and serve to decrease the susceptibility of the hormone to enzymatic
attack, to improve its binding to a receptor, and to modulate the affinity of the hormone for its receptor or receptors. The next step in the design of a model peptide hormone involves replacing the auxiliary sequences with structurally similar surrogate sequences. In general, it is safest to change one structural region at a time, but the changes within that region should be radical enough to allow unambiguous conclusions to be drawn from the experiments. Finally, the properties of the analogs are evaluated, new hypotheses are formulated, and new model peptides are designed. The modeling process proceeds in an incremental manner until a peptide with the desired properties has been produced.
The encompassing contributions of Kaiser and co-workers in the design of analogs of β-endorphin serve to illustrate this approach. Human β-endorphin is a 31-residue opioid peptide that is biosynthetically derived from a larger precursor, proopiomelanocortin (Höllt, 1983). It has been postulated (Taylor and Kaiser, 1986) to be composed of three modular units (Fig. 12). The N-terminal pentapeptide forms the specificity element and is identical to Met-enkephalin; the C-terminal 18 residues are postulated to form an α helix; and the 8-residue segment between these two modules is proposed to serve as a flexible linking sequence. The enkephalin portion is thought to be directly involved in binding to opiate receptors, while the two auxiliary sequences are thought to be responsible for modulating the receptor subtype specificity and for stabilizing the hormone against proteolysis. Taylor, Kaiser, and their co-workers (Taylor and Kaiser, 1986) prepared a series of model peptides that were designed to test the role of each of the postulated modules in β-endorphin. Figure 13 illustrates α-helical net diagrams of the proposed helical auxiliary sequence of β-endorphin.
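The modular description of β-endorphin can be written down directly as a partition of the sequence. The sketch below is illustrative only: the module lengths (5, 8, and 18 residues) are those quoted above and the sequence is transcribed from Fig. 12, but the function names are hypothetical and the surrogate linker is an arbitrary placeholder, not one of the published analogs.

```python
# Human beta-endorphin in single-letter code, transcribed from Fig. 12.
BETA_ENDORPHIN = "YGGFMTSEKSQTPLVTLFKNAIIKNAYKKGE"

def split_modules(sequence, specificity_len=5, helix_len=18):
    """Return (specificity element, linking sequence, helical auxiliary).

    The default lengths follow the module sizes quoted in the text:
    N-terminal pentapeptide, C-terminal 18 residues, linker in between.
    """
    if specificity_len + helix_len > len(sequence):
        raise ValueError("module lengths exceed the sequence length")
    helix_start = len(sequence) - helix_len
    return (sequence[:specificity_len],
            sequence[specificity_len:helix_start],
            sequence[helix_start:])

def replace_module(sequence, surrogate, which, **lengths):
    """Rebuild the sequence with one module swapped for a surrogate string."""
    modules = dict(zip(("specificity", "linker", "helix"),
                       split_modules(sequence, **lengths)))
    if which not in modules:
        raise ValueError(f"unknown module: {which}")
    modules[which] = surrogate
    return modules["specificity"] + modules["linker"] + modules["helix"]

specificity, linker, helix = split_modules(BETA_ENDORPHIN)
print(specificity, linker, helix)

# Placeholder surrogate linker, chosen only to exercise the function.
analog = replace_module(BETA_ENDORPHIN, "GGGGGGGG", "linker")
print(analog)
```

Changing one module per call mirrors the advice above to alter one structural region at a time while leaving the specificity element untouched.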
The hydrophobic residues in this structure cover half the surface of the helix, twisting around the structure in a clockwise manner. [It has also been noted that if this sequence were to form a π helix rather than an α
Specificity Element | Flexible Linking Sequence
    1               5                   10
H2N-Tyr-Gly-Gly-Phe-Met-Thr-Ser-Glu-Lys-Ser-Gln-Thr-Pro-

α-Helical Auxiliary
    15                  20                  25                  30
Leu-Val-Thr-Leu-Phe-Lys-Asn-Ala-Ile-Ile-Lys-Asn-Ala-Tyr-Lys-Lys-Gly-Glu
FIG. 12. Amino acid sequence and structural assignments for human β-endorphin (Taylor and Kaiser, 1986).
[Figure 13 panels: β-Endorphin 13-31; β-1 13-31; β-2 13-31; β-3 13-31; β-4 13-31]
FIG. 13. Helical net diagrams (Crick, 1953) of the proposed helical region of human β-endorphin and analogs thereof (Taylor and Kaiser, 1986).
helix, the hydrophobic residues would lie along one face of the helix and generate a substantially more amphiphilic structure (Taylor and Kaiser, 1986). At present, there are no direct, experimental data to support this hypothesis.] The first two model peptides were designed to maximize the amphiphilicity of the helical auxiliary when in an α-helical conformation. In the first model, β-1, residues 20-31 were replaced with a model sequence containing just three different types of residues: leucine, lysine, and glutamine
(Taylor et al., 1981). These residues were arranged so that the hydrophobic patch of the proposed α helix would lie parallel to the length of the α-helical axis throughout the entire length of the structure rather than spiraling around the helix (Fig. 13). In β-2, the sequence of the entire helical auxiliary was replaced with a model sequence (Fig. 13), and the flexible linking sequence was replaced by a model sequence composed of repeating Ser-Gly dipeptidyl units (Taylor et al., 1982). In β-3, the helical auxiliary was replaced with a model peptide in which the hydrophobic residues spiral in a clockwise manner around the helix (Fig. 13) as in the native sequence (Taylor et al., 1983). Finally, β-4 has the same amino acid composition as β-2, but the residues in the helical auxiliary portion of the sequence have been arranged so that the helix is not amphiphilic and does not contain an extended hydrophobic patch (Fig. 13) (Blanc et al., 1983).
The pharmacological and physical properties of β-1 to β-4 have been characterized in considerable detail (Taylor and Kaiser, 1986). The evaluation of the pharmacological data is complicated by the fact that each of the peptides contains the pentapeptide Met-enkephalin as its specificity element. To a large extent, this pentapeptide sequence controls the pharmacological behavior of the peptides, and many of the properties of Met-enkephalin and β-endorphin are similar. However, there are also several important differences that distinguish Met-enkephalin from β-endorphin (Taylor and Kaiser, 1986). Met-enkephalin is considerably more susceptible to enzymatic degradation by enkephalinases and aminopeptidases. This finding might account for the fact that Met-enkephalin is much less active than β-endorphin as an analgesic when administered intracerebrally.
In vitro, β-endorphin binds to μ receptors about as tightly as it does to δ receptors, and it has a very high potency in the rat vas deferens (RVD) assay, whereas Met-enkephalin binds preferentially to δ receptors and has a much lower activity in the RVD assay. The following discussion of the properties of the model peptides will focus on those properties that differentiate β-endorphin from Met-enkephalin. Each of the model peptides, including the nonamphiphilic analog β-4, bound the μ receptor with high affinity (Taylor and Kaiser, 1986). The considerable potency of β-4 is supportive of other work that suggests that amphiphilic α helix formation is not obligatory for potent μ receptor binding (Blake et al., 1981). In contrast, β-4 was inactive in the RVD assay, while the other three model peptides were potent agonists (Taylor and Kaiser, 1986). Interestingly, the model that had the highest homology to β-endorphin, β-1, was most potent in this assay, with a potency approximately the same as that of β-endorphin. The peptides β-1, β-2, and β-3 were more stable than β-endorphin toward enzymatic attack in the rat brain homogenate and smooth muscle assays, whereas the nonamphiphilic peptide β-4 was rapidly degraded in these assays. Thus, an amphiphilic helical secondary structure appears to be important for enzymatic stability and in vitro activity in the RVD assay.
Although peptides β-1 and β-2 differ from β-3 in the disposition of their hydrophobic residues along the putative helix, they have very similar properties in the above assays. This observation raises the question as to whether there is any functional significance to the fact that the hydrophobic patch runs diagonally along the helical net diagram in β-endorphin (Fig. 13). A comparison of the pharmacokinetic and in vivo properties of peptides β-1 to β-4 suggests that the shape of the hydrophobic domain is indeed important for fine-tuning of the properties of this hormone (Taylor and Kaiser, 1986). For instance, of these four analogs, only β-3 has analgesic activity when administered by intracerebroventricular injection, with an activity nearly half that of human β-endorphin in this assay. It is possible that amphiphilicity is indeed important for in vivo activity, but peptides β-1 and β-2 are too amphiphilic and hence bind indiscriminately to cell surfaces. This nonspecific adsorption might retard or prevent diffusion to the receptors from the site of injection. Even the active analog, β-3, required approximately five to ten times longer than human β-endorphin to produce its maximal effect, a finding suggesting that diffusion of this analog to the site of action was slower than for the native hormone. These differences in the pharmacokinetic effects were also seen in the RVD assay, where the very amphiphilic peptides β-1 and β-2 required much longer than β-3 or β-endorphin to achieve their maximal effects. Recently, it has been shown that the auxiliary sequences in β-endorphin can be replaced by segments composed entirely of amino acids other than the 20 commonly occurring residues.
In one analog of β-endorphin, the postulated α-helical segment was replaced by a segment composed entirely of D-amino acids (Blanc and Kaiser, 1984). This peptide was capable of forming left-handed α helices in helix-inducing solvents such as 50% aqueous trifluoroethanol, but it had biological and physical properties very similar to those of peptide β-3. In a second analog, the flexible linking sequence that spans residues 6-12 was replaced with four units of the γ-amino acid, γ-amino-γ-hydroxymethylbutyric acid (Rajashekhar and Kaiser, 1986). This peptide also had physical and biological properties that were similar to those of β-3, a result indicating that residues 6-12 of β-endorphin served as a flexible linking sequence joining the helical auxiliary with the specificity element. The aforementioned synthetic modeling approach, which was so successfully applied to studying the structural properties of β-endorphin, appears to be general and has been applied to the study of several other peptide hormones and hormone-releasing factors including glucagon
(Musso et al., 1983, 1984), vasoactive intestinal peptide (G. Musso, personal communication), growth hormone-releasing factor (Tou et al., 1986; Velicelebi et al., 1986), and calcitonin (Moe et al., 1983; Moe and Kaiser, 1985). In each of these cases, highly active analogs were produced by replacing a portion of the peptide sequence with a model amphiphilic α-helical sequence that had minimal homology to the parent hormone.
IV. PROTEIN DESIGN
A. Why Design Proteins?
The final topic of this review will be the de novo design of proteins. While there exist ample precedents for the design of small peptides, the same does not hold for the design of proteins. There are two fundamental reasons for this. Until very recently the synthetic challenge associated with assembling a polypeptide chain the size of a protein was considerable, and it is only with the advent of recombinant DNA that this has become routine. In addition, our understanding of the mechanisms by which proteins adopt their tertiary structures is at a rather primitive state. Therefore, the design of functional proteins must begin with a consideration of the protein folding problem. In contrast to peptides that assume conformations complementary to those of their receptors, proteins are large enough to fold back upon themselves and adopt stable conformations in aqueous solution. When studied on an individual basis, these structures appear to be almost hopelessly complex, and indeed it is not currently possible to predict de novo the structures of natural proteins from their amino acid sequences. However, the systematic study of the three-dimensional structures of large numbers of proteins has led to the recognition of some structural features that are common to a variety of proteins and has provided fundamental insights into the protein folding process (for reviews, see Levitt and Chothia, 1976; Richards, 1977; Richardson, 1981; Salemme, 1983; Chothia, 1984). Regularly repeating secondary structures make up large portions of proteins, and there is a high degree of order in the packing of individual elements of secondary structures. Elements of secondary structure tend to form layers that pack against neighboring layers of secondary structures in a nonrandom manner (Richardson, 1981). Thus, there are preferred angles for helix-helix, helix-sheet, and sheet-sheet packings (Chothia, 1984).
The preferred crossover angles occur because there are a limited number of ways to closely pack side chains at the interface between secondary structures, as can be shown by construction of simple three-dimensional models of proteins (Salemme, 1983; Ohlendorf et al., 1987; Chothia, 1984) or by energy calculations (Salemme, 1983; Chou et al., 1984, 1985, 1986). Furthermore, there appear to be a finite number of higher level structural folding patterns found in the domains of proteins (Levitt and Chothia, 1976; Richardson, 1981). Thus, our understanding of the protein folding problem, although far from complete, has advanced far enough to provide some fundamental rules and principles from which to construct proteins.
At present, the primary benefit to be derived from designing proteins is that it critically tests and advances our understanding of the principles governing protein stability and folding. Physical organic chemists have long relied on the construction of model compounds as an important tool for elucidating reaction mechanisms; similarly, peptide chemists have found model peptides useful for addressing conformational questions. The construction of model proteins should likewise further our understanding of the kinetics, dynamics, and thermodynamics associated with protein folding. With this focus in mind, the following section will review some preliminary attempts to design such model proteins.
A second potential benefit that might be derived from the construction of artificial proteins is that it may eventually be possible to design proteins with novel catalytic, pharmaceutical, or fibrous properties. Along these lines there have already been occasional reports of the design of peptides or proteins with specific binding and/or catalytic properties. These include a DDT-binding peptide (Moser et al., 1983, 1985), an enkephalin-binding peptide (Kullmann, 1984), and a model for ribonuclease (Moser et al., 1983; Gutte et al., 1979; Jaenicke et al., 1980). The intended folding patterns for these peptides and proteins were not strictly modeled after entire domains of known proteins, but do incorporate some supersecondary structures known to occur in proteins.
To date, the structural properties of these designed peptides have not been fully characterized.
B. α-Helical Proteins
The preceding sections reviewed some of the considerable progress that has recently been made in the design of peptides that form α helices in amphiphilic environments. A logical extension of this work is the design of proteins that are formed by the coalescence of two or more α helices. The study of amphiphilic peptides suggested that this should indeed be possible. For instance, melittin aggregates in solution to form α-helical tetramers, and crystallographic analysis of the tetramer showed that it had many of the attributes normally associated with folded proteins: most of the apolar side chains were closely packed in the interior of the structure, and the helical crossing angles were typical of those found
in folded proteins (Terwilliger and Eisenberg, 1982b). This finding raises the possibility of designing amphiphilic α-helical peptides that would self-assemble into helical proteins of predetermined geometries.
1. Coiled Coils
The simplest of α-helical proteins are the two-stranded coiled coils found in such fibrous proteins as the keratins and tropomyosin. In this structure, two α helices coil about one another with a left-handed superhelical twist (reviewed in Talbot and Hodges, 1982; Cohen and Parry, 1986). The left-handed superhelical twisting arises from the nonintegral repeat of the α helix, which makes it impossible to pack straight α helices over extended distances with a regular, repeating interaction pattern. One way in which α helices can favorably interact is referred to as “knobs-into-holes” packing (Crick, 1953) (Figs. 14A and 15); a side chain on one α helix packs between residues at positions i, i + 3, i + 4, and i + 7 of a neighboring helix. For two straight α helices, this interaction pattern can extend over several helical turns if the α helices are inclined with respect to one another by about 18° (Fig. 14A), although the helices will eventually diverge from one another. However, if the α helices wrap about one another with a left-handed sense, they remain at a constant distance from one another irrespective of their lengths, and their interaction pattern becomes integrally periodic, repeating every seven residues. This interaction pattern is also reflected in the amino acid sequences of proteins such as tropomyosin, which can be shown to be composed of a basic seven-residue sequence that is repeated 40 times without interruption (Hodges et al., 1972; McLachlan and Stewart, 1975). Hydrophobic residues almost invariably occupy the second and fifth positions of the heptad and are presumably directed toward the major axis of the superhelix, where they serve to stabilize the structure by hydrophobic effects (Fig. 15). The charged side chains are also nonrandomly distributed and are believed to form interhelical ion pairs that further stabilize the structure (Talbot and Hodges, 1982).
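The geometric origin of the seven-residue repeat can be checked with a little arithmetic. The sketch below assumes an ideal α helix with 3.6 residues per turn (100° of rotation per residue), a textbook value that is not stated in this section; the function name `angular_offset` is introduced here only for illustration.

```python
# Angular position of residue i + k relative to residue i, viewed down the
# axis of a straight helix.
DEG_PER_RESIDUE = 100.0  # assumption: ideal alpha helix, 3.6 residues per turn


def angular_offset(k):
    """Return the angle in degrees, in the range (-180, 180], between
    residue i and residue i + k around the helix axis."""
    angle = (k * DEG_PER_RESIDUE) % 360.0
    return angle - 360.0 if angle > 180.0 else angle


# Offsets for the neighbors named in the knobs-into-holes description.
offsets = {k: angular_offset(k) for k in (3, 4, 7)}

# Residual rotation after one full heptad on a straight helix.
heptad_drift = angular_offset(7)

for k, angle in offsets.items():
    print(f"i + {k}: {angle:+.0f} degrees relative to residue i")
```

Under this assumption residues i + 3 and i + 4 lie on either side of residue i on the same face of the helix, while residue i + 7 falls about 20° short of two full turns. On straight helices the contact stripe therefore drifts slowly around the helix axis, which is consistent with the statement that supercoiling is needed for the interaction pattern to repeat exactly every seven residues.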
Hodges and co-workers have designed a series of model peptides (Talbot and Hodges, 1982; Lau et al., 1984) containing the structural features envisioned to be important for stabilizing the two-stranded coiled coil. To evaluate the role of chain length in stabilizing the coiled-coil conformation, a series of peptides were prepared containing from one to five copies of the model heptapeptide illustrated in Fig. 15 (Lau et al., 1984). CD spectroscopy indicated that, as the number of heptad units increased, the helicity and stability of the structures increased in concert; the peptides containing one or two heptad repeats showed no helical structure in aqueous solution, whereas the remaining peptides showed
FIG. 14. Two types of helix-helix packing that give rise to nearly antiparallel crossover angles. The side-chain interaction pattern for a pair of α helices can be conveniently represented by drawing helical nets that describe the superposition of the two helices (Crick, 1953). The “knobs into holes” packing of Crick (1953) (A), and the “ridges into grooves” packing of Chothia et al. (1977, 1981) (B) give rise to similar interhelical packing angles, although the side chains are packed somewhat differently in the two models. Drawing by F. R. Salemme and P. Weber.
increasing amounts of helicity as the number of heptads was increased from three to five. Size exclusion chromatography and sedimentation equilibrium ultracentrifugation showed that the peptides with four or five heptameric repeats formed dimers in solution. Thermal denaturation studies indicated that these peptides were even more stable than carboxymethylated tropomyosin; the temperatures required to decrease the magnitude of their ellipticities by 30% of the value measured at 93°C were 62, 74, and 37°C, respectively. Presumably tropomyosin is thermally less stable than the model peptides because it contains a number of hydrophilic groups or alanines at the positions occupied by leucine in the model peptides (Lau et al., 1984). The role of electrostatic interactions in stabilizing the two-stranded
FIG. 15. A single heptad repeat of an idealized model for coiled-coil proteins (Talbot and Hodges, 1982; Hodges et al., 1981). The primary driving force for formation of the structure arises from the interdigitation of the apolar leucyl side chains. In addition, electrostatic interactions between the oppositely charged residues on neighboring helices may also contribute to the stability of the parallel form of the coiled coil (Hodges et al., 1981). [Figure labels: potential electrostatic interactions; hydrophobic core (Leu).]
coiled coil remains ambiguous. The α-helical dimer formed by the peptide containing five heptad units was found to have greater thermal stability at pH 2.5 than at pH 7.0, and the ionic strength dependence of the stability was the opposite of what would be predicted if interhelical salt bridges contributed largely to the stabilization of the structure (Lau et al., 1984).
In the tropomyosin coiled coil, the α helices run parallel to one another and in register (Talbot and Hodges, 1982). Experiments with disulfide cross-linked model peptides suggested that the same orientation of the helices was maintained in the aforementioned model peptides (Hodges et al., 1981). A peptide of the sequence A-B5-Lys (A = Lys-Cys-Ala-Glu-Leu-Glu-Gly; B = Lys-Leu-Glu-Ala-Leu-Glu-Gly) formed a covalent dimer upon air oxidation. The covalent dimer had properties very similar to those of the corresponding reduced peptide: their CD spectra were nearly identical, and the shapes of their thermal unfolding curves were also similar. However, the midpoint of the thermal unfolding curve was 63°C higher for the oxidized peptide, a result indicating that it formed a more stable structure. Presumably the additional stabilization arose from a decrease in the conformational entropy of the unfolded form of the oxidized peptide from that of the reduced peptide (Kauzmann, 1959).
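The composition of the cross-linkable model peptide can be made concrete by assembling it from its two heptad blocks. The following is a minimal sketch, assuming that the subscript on block B is 5 (consistent with the five-heptad peptides discussed above); the helper names `assemble_peptide` and `heptad_positions` are hypothetical and serve only this illustration.

```python
# Heptad blocks as quoted in the text for the peptide A-B5-Lys.
BLOCK_A = "Lys-Cys-Ala-Glu-Leu-Glu-Gly".split("-")
BLOCK_B = "Lys-Leu-Glu-Ala-Leu-Glu-Gly".split("-")
N_B_REPEATS = 5  # assumption: the subscript on B is read as 5


def assemble_peptide(n_repeats=N_B_REPEATS):
    """Return the residue list for A-(B)n-Lys."""
    return BLOCK_A + BLOCK_B * n_repeats + ["Lys"]


def heptad_positions(peptide, residue):
    """Return the heptad positions (1 to 7, counted from the N terminus)
    at which the given residue occurs."""
    return sorted({(index % 7) + 1
                   for index, name in enumerate(peptide)
                   if name == residue})


peptide = assemble_peptide()
print("length:", len(peptide))
print("Leu heptad positions:", heptad_positions(peptide, "Leu"))
print("Cys heptad positions:", heptad_positions(peptide, "Cys"))
```

With these inputs every leucine falls at the second or fifth position of a heptad, matching the hydrophobic positions described for tropomyosin, and the single cysteine occupies the second position of block A, where block B carries a leucine.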
2. Four-Helix Bundles
The study of models of fibrous proteins has provided a number of principles that might be applied to the design of globular proteins. In
particular, the use of simplified sequences and amino acids that strongly favor the formation of a given secondary structure should allow design of simple, highly stable proteins. However, the design of globular proteins is fundamentally a far more complex problem. While fibrous proteins such as coiled coils have highly symmetrical structures with relatively short periodicities, globular proteins are far less simple, showing less symmetry. One possible exception is the four-helix bundle class of proteins (Fig. 16), which includes myohemerythrin, apoferritin, tobacco mosaic virus coat protein, and cytochrome c′ (Weber and Salemme, 1980; Richardson, 1981). As pointed out by Weber and Salemme (1980; Ohlendorf et al., 1987), if the directionality of the helices is ignored, then the helices are related by a pseudo 4-fold rotational axis that runs down the center of the structure. As in two-stranded coiled coils, neighboring helices cross at approximately 18° angles in the four-helix bundle structures (Weber and Salemme, 1980). However, there is a fundamental difference between the arrangement of the α helices in the coiled coils and that of the helices in the four-helix bundle structures. In the coiled-coil structures, the helices remain at a constant distance from one another because of the strong left-handed supercoiling, whereas in the four-helix bundle structures, the α helices diverge from a point of closest approach, an arrangement giving rise to a cavity at the base of the structures (Weber and Salemme, 1980; Ohlendorf et al., 1987). This cavity provides a binding site for prosthetic groups in many of the four-helix bundles. The structural simplicity and functional diversity of the four-helix bundle make it an attractive target for the study of protein folding as well as the eventual design of synthetic binding sites.
FIG. 16. Some examples of the four-helix bundle folding motif. Taken from Richardson (1981). [Panels: myohemerythrin; cytochrome b562.]
Various models have been proposed to account for the high frequency of occurrence of the interhelical packing angle observed in four-helix bundle proteins (Chothia, 1984; Chothia et al., 1977, 1981; Richmond and Richards, 1978; Weber and Salemme, 1980; Ohlendorf et al., 1987). Chothia and co-workers proposed that interhelical side-chain packing interactions are the dominant factor influencing the interhelical packing angles. These authors suggested that the packing could more closely be described as “ridges into grooves” rather than “knobs into holes” as observed in coiled coils. In “ridges into grooves” packing, residues at positions i - 4, i, and i + 4 appear to form a ridge that packs against residues at positions j - 3, j, and j + 3 on a neighboring helix (Fig. 14B). This defines the packing angle between neighboring helices and allows tight packing of the apolar side chains, thereby driving the folding process. Ohlendorf et al. (1987; Weber and Salemme, 1980) have pointed out that this is probably an oversimplification, as a detailed comparison of the known four-helix bundle proteins showed unexpected variability in packing interactions. These authors failed to find one single packing interaction that optimally described all naturally occurring bundles, but nevertheless found that “independent of detailed pairwise interactions between helices relatively inclined at 18°, square arrays of four helices with this interaction angle tend naturally to produce structures where all pairwise helix interactions are symmetry related and those of individual residues
pseudoequivalent” (Ohlendorf et al., 1987). Electrostatic interactions between helical macrodipoles (Sheridan et al., 1982) have also been proposed to increase the stability of four-helix bundles. Finally, the loops between helices, although variable in length and sequence, nevertheless serve to break the helices and direct the formation of hairpin loops (Ohlendorf et al., 1987).
Recently, an attempt at the de novo design of a four-helix bundle was initiated (Eisenberg et al., 1986; Ho and DeGrado, 1987). In this designed protein, the pseudo-222 symmetry found in natural four-helix bundle proteins was idealized. This simplified the modeling process; with a highly symmetrical structure, it was only necessary to design a single α helix that upon application of a symmetry operator would provide a tightly packed, protein-like structure. In addition, it allowed the modeling process to be approached in an experimental, incremental manner, as described in Fig. 17. The first step of this approach involved the design of a peptide that could self-assemble into a helical tetramer composed of four identical monomers (Fig. 17a). The stability of the aggregate relative to that of the monomeric peptides could be assessed from the monomer-to-tetramer equilibrium constant. The availability of a quantitative parameter describing the stability of the aggregate allowed the evaluation of alternate designs. Subsequent to optimizing the α-helical sequence, designed loops were inserted between two identical helical sequences, and the stability of the resulting peptide was evaluated (Fig. 17b). Finally, the entire four-helix bundle was constructed from four identical helical sequences and three identical loop sequences. The design of the helical sequences (Fig. 18) was first accomplished with physical models and later refined by computer graphics (Eisenberg et al., 1986; Ho and DeGrado, 1987).
Models of four 16-residue helices were arranged so that their side chains could interact in a manner similar to that described by Chothia (1984; Chothia et al., 1977, 1981). Leucine side chains were placed at positions that projected into the interior of the structure, while glutamate and lysine side chains were placed at positions projecting toward the exterior of the structure. Glycine residues were placed at the N and C termini to help break the helix and to contribute to the formation of a hairpin loop. A helical net of the resulting structure, α1A, is illustrated in Fig. 18. Subsequent refinement of the model for the tetramer (Ho and DeGrado, 1987) indicated that the Leu side chain at position 11 was excessively exposed to solvent, while the Glu side chain at position 13 was partially buried. Consequently, a second peptide, α1B (Fig. 18), was designed in which Leu was changed to Lys at position 11 and Glu was changed to Leu at position 13.
FIG. 17. An incremental approach to the design of a four-helix bundle protein (Ho and DeGrado, 1987). (a) The sequence of a peptide is first optimized for forming a very stable tetramer of α helices. The stability of the tetramer can be assessed from the dissociation constant for the cooperative monomer-to-tetramer equilibrium. (b) Two optimized helical sequences are then connected in a head-to-tail manner by a single loop. The loop sequence can be optimized by evaluating a series of alternate designs. (c) Finally, the entire four-helix bundle structure can be constructed from four optimized helices and three optimized loops.
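The residue changes that separate the two 16-residue designs can be tabulated directly from their sequences. The sketch below transcribes the sequences of the first and second designs (α1A and α1B) and of the 12-residue peptide α1 from Fig. 18 as read here; the helper `substitutions` and the constant names are illustrative only, and terminal blocking groups are omitted.

```python
# Sequences transcribed from Fig. 18 (N-terminal acetyl and C-terminal
# groups omitted); the transcription itself is an assumption of this sketch.
ALPHA_1A = ("Gly-Lys-Leu-Glu-Glu-Leu-Leu-Lys-"
            "Lys-Leu-Leu-Glu-Glu-Leu-Lys-Gly").split("-")
ALPHA_1B = ("Gly-Glu-Leu-Glu-Glu-Leu-Leu-Lys-"
            "Lys-Leu-Lys-Glu-Leu-Leu-Lys-Gly").split("-")
ALPHA_1 = "Glu-Leu-Leu-Lys-Lys-Leu-Leu-Glu-Glu-Leu-Lys-Gly".split("-")


def substitutions(first, second):
    """List (position, old residue, new residue) wherever two sequences of
    equal length differ; positions are numbered from 1."""
    if len(first) != len(second):
        raise ValueError("sequences must have the same length")
    return [(position, old, new)
            for position, (old, new) in enumerate(zip(first, second), start=1)
            if old != new]


changes = substitutions(ALPHA_1A, ALPHA_1B)
for position, old, new in changes:
    print(f"position {position}: {old} -> {new}")

# The 12-residue peptide corresponds to residues 5 through 16 of the
# first 16-residue design.
is_fragment = ALPHA_1 == ALPHA_1A[4:]
```

The comparison reports changes at positions 2, 11, and 13 and leaves the number of leucines per helix unchanged at six, so the redesign repositions the apolar residues rather than adding to them.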
In addition, the Lys side chain at position 2 and the α-carboxylate at the C terminus were expected to interact unfavorably with the helical macrodipole (Shoemaker et al., 1987a) and were therefore converted to a Glu and an α-carboxamide, respectively (Fig. 18). The assembly of the peptides into tetramers was assessed by size exclusion chromatography and by analyzing the concentration dependence of their CD spectra (Fig. 19). At low concentrations, the peptides were
FIG. 18. Helical net diagrams (A) and amino acid sequences (B) of α1, α1A, and α1B. In the helical nets the hydrophobic residues are circled, and potential salt bridges between the side chains of residues spaced at positions i ± 3 or residues at i ± 4 are indicated by solid and dashed lines, respectively.
α1: Ac-Glu-Leu-Leu-Lys-Lys-Leu-Leu-Glu-Glu-Leu-Lys-Gly-COOH
α1A: Ac-Gly-Lys-Leu-Glu-Glu-Leu-Leu-Lys-Lys-Leu-Leu-Glu-Glu-Leu-Lys-Gly-COOH
α1B: Ac-Gly-Glu-Leu-Glu-Glu-Leu-Leu-Lys-Lys-Leu-Lys-Glu-Leu-Leu-Lys-Gly-CONH2
monomeric and had low helical contents; in concentrated solutions, however, they formed α-helical aggregates. These conformational changes were reflected in the CD spectra of the peptides. The concentration dependence of the CD spectra was extremely well described by a highly cooperative monomer-to-tetramer equilibrium (Ho and DeGrado, 1987). Analysis of these curves provided the stabilities of the tetramer (RT ln K_D) and the approximate helical contents of the monomeric and tetrameric forms of the peptides. These parameters are listed in Table III for α1A and α1B, as well as for α1, a 12-residue fragment of α1A isolated as a byproduct that arose during the synthesis of α1A (Eisenberg et al., 1986). The helical contents of the monomeric forms of these peptides were dependent on their chain lengths. The 12-residue peptide was approximately 15% helical, whereas the 16-residue peptides were approximately 30% helical, a value that is unusually high when compared with that for other monomeric peptides in aqueous solution at room temperature (Section III,A). Thus, α1A and α1B were probably capable of forming helices at a rather low energetic cost. The helical content of the tetramers formed by all three peptides was approximately 70%, a value consistent with the proposed four-helix bundle structure. The stabilities of the tetramers depended both on chain
FIG. 19. Concentration dependence of the ellipticity at 222 nm of α1A. The lines are computer-generated, theoretical curves describing various monomer-to-n-mer equilibria. The top panel shows the monomer-tetramer equilibrium, the middle panel shows monomer-trimer and monomer-hexamer equilibria, and the bottom panel shows monomer-dimer and monomer-octamer equilibria.
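Theoretical curves of the kind described in this caption follow from a two-state model in which n monomers assemble cooperatively into a single n-mer. The sketch below is one way to compute them, not the authors' fitting procedure. It assumes a dissociation constant defined as K_D = [M]^n / [M_n], a 1 M standard state, and a temperature of 298.15 K; all function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def kd_from_free_energy(rt_ln_kd, temperature=298.15):
    """Convert RT ln K_D (kcal/mol) into K_D, assuming a 1 M standard state.
    The default temperature is an assumption of this sketch."""
    return math.exp(rt_ln_kd / (R_KCAL * temperature))


def monomer_concentration(total, kd, n):
    """Solve the mass balance m + n * m**n / kd = total for the free monomer
    concentration m by bisection on the interval [0, total]."""
    low, high = 0.0, total
    for _ in range(200):
        mid = 0.5 * (low + high)
        if mid + n * mid ** n / kd > total:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)


def fraction_assembled(total, kd, n):
    """Fraction of peptide chains present in the n-mer."""
    return 1.0 - monomer_concentration(total, kd, n) / total


def ellipticity(total, kd, n, theta_monomer, theta_nmer):
    """Population-weighted ellipticity for a two-state monomer/n-mer model."""
    f = fraction_assembled(total, kd, n)
    return (1.0 - f) * theta_monomer + f * theta_nmer


# Illustrative use with a free energy of the magnitude listed in Table III.
kd_example = kd_from_free_energy(-19.0)
for total in (1e-6, 1e-5, 1e-4, 1e-3):
    print(f"{total:.0e} M: fraction in tetramer = "
          f"{fraction_assembled(total, kd_example, 4):.3f}")
```

Changing `n` while refitting `kd` reproduces the comparison shown in the three panels: the steepness of the transition with concentration depends on n, which is why the data can discriminate a tetramer from a dimer, trimer, hexamer, or octamer.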
TABLE III
Amino Acid Sequences and Free Energies of Tetramerization or Dimerization of Synthetic Peptides (a)

Peptide    RT ln K_D (kcal/mol)    -θ × 10^-4 (monomer)    -θ × 10^-4 (tetramer)
α1 (b)     -11.14                  0.55                    2.17
α1A (b)    -19                     0.87                    2.12
α1B (b)    -22                     0.94                    2.30
α2B (c)    -13                     ~0.9                    2.10

Sequences:
α1     Ac-Glu-Leu-Leu-Lys-Lys-Leu-Leu-Glu-Glu-Leu-Lys-Gly-COOH
α1A    Ac-Gly-Lys-Leu-Glu-Glu-Leu-Leu-Lys-Lys-Leu-Leu-Glu-Glu-Leu-Lys-Gly-COOH
α1B    Ac-Gly-Glu-Leu-Glu-Glu-Leu-Leu-Lys-Lys-Leu-Lys-Glu-Leu-Leu-Lys-Gly-CONH2
α2B    Ac-Gly-Glu-Leu-Glu-Glu-Leu-Leu-Lys-Lys-Leu-Lys-Glu-Leu-Leu-Lys-Gly-Pro-Arg-Arg-Gly-Glu-Leu-Glu-Glu-Leu-Leu-Lys-Lys-Leu-Lys-Glu-Leu-Leu-Lys-Gly-CONH2

(a) Data taken from Ho and DeGrado (1987).
(b) Monomer-tetramer equilibria.
(c) Monomer-dimer equilibrium.
length and on sequence, increasing in the order α1 << α1A < α1B. The fact that α1B formed more stable tetramers than α1A did suggested that the packing interactions must be more favorable for α1B. (The two peptides have equal helicities as monomers, a finding suggesting that helix formation was energetically equivalent for both peptides.) This result was consistent with the modeling, which indicated that α1B should form a more stable tetramer than α1A. The observed free energy of tetramerization for α1B corresponds to approximately -0.8 kcal/mol of Leu side chains, a value in reasonable agreement with that of -1.2 kcal/mol calculated by Guy (1985) for the transfer of a Leu side chain from water to the interior of a protein.
Having succeeded in designing a highly stable tetramer of α helices, a loop sequence to connect two identical α-helical sequences in a head-to-tail manner was designed (Ho and DeGrado, 1987). Examination of models of tetramers of α1B suggested that this could be accomplished with a three-residue linking sequence. The sequence of this linker was Pro-Arg-Arg; proline was chosen to break the helix and arginine was chosen for its charge and hydrophilicity. On the basis of size exclusion chromatography and the concentration dependence of its CD spectrum in the presence of guanidine hydrochloride, it was concluded that the resulting peptide, α2B, formed dimers. (The dimers formed by this peptide and the tetramers formed by α1B were too stable to allow determination of dissociation constants in the absence of denaturants.) At similar peptide concentrations, α2B was found to be substantially more resistant to guanidine-induced unfolding than α1B.
Approximately twice as great a concentration of guanidine hydrochloride was required to reach the midpoint of the unfolding curve for α2B as for α1B (approximately 5.0 and 2.5 M, respectively; Ho and DeGrado, 1987). This success notwithstanding, a detailed analysis of the thermodynamic data for α2B indicated that it might be possible to design an even more stabilizing loop (Ho and DeGrado, 1987).
Recently, a gene encoding a protein comprising the entire four-helix bundle, α4, has been cloned and expressed in Escherichia coli (Regan and DeGrado, 1988; DeGrado et al., 1987b). This protein contains four copies of α1B connected by three Pro-Arg-Arg loop sequences. When expressed from the tac promoter (De Boer et al., 1983), the protein can be isolated in reasonable quantities and is in soluble form in the bacteria, a finding indicating that it is stable in E. coli. The stability is a good indication that the protein can adopt a folded conformation, and the CD spectrum of the product is similar to that of α2B, an observation indicating that it contains a large amount of helix. Further characterization is in progress.
These observations indicate that it is possible to prepare α-helical proteins with hydrodynamic, thermodynamic, and spectral properties similar to those of four-helix bundle proteins. However, elucidation of the structures of the folded forms of these proteins must await a crystallographic analysis. Crystals of α1A suitable for X-ray crystallographic analysis have been prepared (Eisenberg et al., 1986).
C. β Proteins
Proteins composed of β sheets are attractive targets for protein design because of their elegantly simple structures (Salemme, 1983; Richardson, 1981). However, one possible problem associated with the design of β proteins is the formation of infinite β sheets, as seen in the crystal structures of small peptides like enkephalin (Camerman et al., 1983; Griffin et al., 1986). In proteins with a high content of β structure, e.g., antiparallel β barrels (Fig. 20), the β sheets tend to fold back upon themselves, thereby giving rise to a well-packed globular structure (Richardson, 1981). While the structural properties of both folded, globular β structures and infinite β sheets are well characterized (Salemme, 1983; Chothia, 1984; Chou et al., 1986), the sequence determinants that cause a protein to adopt a globular, folded structure rather than an infinite sheet are less clear. Nevertheless, there have been several attempts to design β sheet-containing proteins (Moser et al., 1983; Kullmann, 1984; Richardson and Richardson, 1987).
FIG. 20. The structures of two protein domains that contain a high content of β sheet. Taken from Richardson and Richardson (1987). [Panel label: catalase domain 2.]
FIG. 21. Sequence and proposed secondary structure of the DDT-binding peptide designed by Moser et al. Dotted lines indicate hydrogen bonds between the main-chain amide groups. Taken from Moser et al. (1983).
The structural characterization of these proteins may lead to important insights concerning the folding of natural β-sheet proteins.
1. Artificial DDT-Binding Peptide
Some years ago Moser et al. (1983) described the synthesis of a 24-residue peptide that was designed to bind to the insecticide DDT [2,2-bis(p-chlorophenyl)-1,1,1-trichloroethane]. This peptide could potentially form a four-stranded antiparallel β-pleated sheet that contains a hydrophobic site for binding DDT on either face of the sheet (Fig. 21). Perhaps because of its considerable hydrophobicity, the peptide formed large aggregates in aqueous solutions, even in the presence of 1 M acetic acid. Nevertheless, it has been reported to form a monomer in aqueous 50% ethanol. In this solvent, it has been reported to bind DDT with approximately a 20 μM dissociation constant. The affinity of the peptide for other hydrophobic substances was not measured. However, a peptide with the same amino acid composition as the designed DDT-binding peptide but with its amino acids in a “random” sequence bound DDT about 140-fold less tightly. The designed peptide could be crystallized in the presence of DDT, giving needles that, unfortunately, were too small for X-ray crystallography. More recently, the peptide has been cloned and expressed as a fusion protein in bacteria (B. Gutte, personal communication). This achievement should facilitate the synthesis of variants of the peptide as
well as provide a source of homogeneous peptide for crystallographic investigations.
2. Enkephalin-Binding Peptide
Kullmann (1984) designed a 40-residue peptide that appeared to form a 1:1 complex with Leu-enkephalin. This peptide was designed to form two long antiparallel β sheets, which were envisioned to fold over to form a “trough” capable of binding Leu-enkephalin. The peptide was synthesized and on the basis of CD spectroscopy was found to contain a large amount of β structure. It was reported to bind Leu-enkephalin with a dissociation constant of approximately 10 μM, and the binding showed some stereospecificity; an analog of Leu-enkephalin composed of all D-amino acids bound 10-fold more poorly than Met-enkephalin. More complete structural studies are needed to determine whether the peptides are folding and binding enkephalin in the envisioned manner.
3. Betabellin
Recently Richardson and Richardson (1987) have designed a protein that is intended to adopt an antiparallel β-barrel structure similar to that of an immunoglobulin VL domain (Fig. 22). To aid in the chemical synthesis of the molecule, the protein was designed as a homodimer of two identical chains connected by an organic cross-linking moiety. The model for the designed protein is highly symmetrical (Fig. 22), with a 2-fold rotational
FIG. 22. Schematic drawing of the proposed tertiary structure of betabellin. Taken from Richardson and Richardson (1987).
symmetry axis running down its center. The design of a sequence that would adopt this structure was accomplished by carefully applying secondary structure prediction schemes (e.g., Chou and Fasman, 1978). Hydrophilic amino acids were chosen for the exterior of the structure, and apolar residues for its interior. In models, the interior side chains pack tightly, as in natural proteins. A peptide corresponding to the designed sequence was synthesized in the laboratory of Bruce Erickson (Unson et al., 1984). Although the peptide, on the basis of CD spectroscopy, appeared to have a high content of β sheet in solution, it failed to crystallize under a variety of conditions. The sequence has subsequently been varied several times in an effort to improve the ease of synthesis as well as to improve the folding properties (Richardson and Richardson, 1987). It is hoped that one such sequence will display reasonable water solubility, hydrodynamic properties, and sufficient homogeneity to yield crystals suitable for X-ray diffraction.
D. Conclusions
The approach of de novo protein design, although in its infancy, has already shown significant promise in the aforementioned examples. As new graphical and analytical methods are developed to aid in the design of proteins (e.g., Ponder and Richards, 1987; Fletterick and Zoller, 1986), it should become increasingly possible to design new structures, including proteins representing structural classes other than the simple α-helical and β-sheet proteins mentioned above (e.g., proteins composed of both α helices and β sheets). However, the value of such exercises will depend critically on the ability of the designers to focus on problems of key importance in protein chemistry and to develop strategies for extracting structural and thermodynamic information concerning the designed proteins. A great virtue of the de novo design approach is that it allows the construction of simple, well-characterized systems that lack some of the complexities and consequent difficulties of interpretation inherent to natural proteins. Such model systems will, in fact, bear fruit only if they are systematically studied and characterized in adequate structural detail. Once the structural properties of some designed proteins are well established, it should also be possible to use them to investigate questions of function, including binding, signal transduction, and ultimately catalysis.
ACKNOWLEDGMENTS
I wish to thank Christine Tyma DeGrado for critical reading of this manuscript and support throughout its preparation. I also thank Siew Peng Ho for many useful conversations, and many of my colleagues for providing me with reprints and preprints of their work. Much of this review was written while W. F. D. was a Sloan Visiting Lecturer in the Chemistry Department of Harvard University.
REFERENCES
Adamson, A. W. (1976). “Physical Chemistry of Surfaces.” Wiley (Interscience), New York.
Ajò, D., Granozzi, G., Tondello, E., and Del Pra, A. (1980). Biopolymers 19, 469-475.
Andreu, D., Merrifield, R. B., Steiner, H., and Boman, H. G. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 6475-6479.
Andreu, D., Merrifield, R. B., Steiner, H., and Boman, H. G. (1985). Biochemistry 24, 1683-1688.
Argiolas, A., and Pisano, J. J. (1985). J. Biol. Chem. 260, 1437-1444.
Arison, B. H., Hirschman, R., and Veber, D. F. (1978). Bioorg. Chem. 7, 447-451.
Aubry, A., Vitoux, B., Boussard, G., and Marraud, M. (1981). Int. J. Pept. Protein Res. 18, 195-202.
Babu, Y. S., Sack, J. S., Greenhough, T. J., Bugg, C. E., Means, A. R., and Cook, W. J. (1985). Nature (London) 315, 37-40.
Bach, A. C., II, and Gierasch, L. M. (1986). Biopolymers 25, S175-S192.
Barbier, B., Caille, A., and Brack, A. (1984). Biopolymers 23, 2299-2310.
Bavoso, A., Benedetti, E., Di Blasio, B., Pavone, V., Pedone, C., Toniolo, C., and Bonora, G. M. (1986). Proc. Natl. Acad. Sci. U.S.A. 83, 1988-1992.
Bennett, M. K., and Kennedy, M. B. (1987). Proc. Natl. Acad. Sci. U.S.A. 84, 1794-1798.
Berman, J. M., Goodman, M., Nguyen, T. M.-D., and Schiller, P. W. (1983). Biochem. Biophys. Res. Commun. 115, 864-870.
Bierzynski, A., Kim, P. S., and Baldwin, R. L. (1982). Proc. Natl. Acad. Sci. U.S.A. 79, 2470-2474.
Blagdon, D. E., and Goodman, M. (1975). Biopolymers 14, 241-245.
Blake, J., Ferrara, P., and Li, C. H. (1981). Int. J. Pept. Protein Res. 17, 239-242.
Blanc, J. P., and Kaiser, E. T. (1984). J. Biol. Chem. 259, 9549-9556.
Blanc, J. P., Taylor, J. W., Miller, R. J., and Kaiser, E. T. (1983). J. Biol. Chem. 258, 8277-8284.
Blumenthal, D. K., Takio, K., Edelman, A. M., Charbonneau, H., Titani, K., Walsh, K. A., and Krebs, E. G. (1985). Proc. Natl. Acad. Sci. U.S.A. 82, 3187-3191.
Blundell, T., Singh, J., Thornton, J., Burley, S. K., and Petsko, G. A. (1986). Science 234, 1005.
Boman, H. G., and Steiner, H. (1981). Curr. Top. Microbiol. Immunol. 94, 75-92.
Bosch, R., Jung, G., Schmitt, H., Sheldrick, G. M., and Winter, W. (1984). Angew. Chem. Int. Ed. Engl. 23, 450-453.
Bosch, R., Jung, G., Schmitt, H., and Winter, W. (1985a). Biopolymers 24, 961-978.
Bosch, R., Jung, G., Schmitt, H., and Winter, W. (1985b). Biopolymers 24, 979-999.
Brack, A., and Spach, G. (1981). J. Am. Chem. Soc. 103, 6319-6323.
Braun, W., Wider, G., Lee, K. H., and Wüthrich, K. (1983). J. Mol. Biol. 169, 921-948.
Brown, J. E., and Klee, W. A. (1971). Biochemistry 10, 470-476.
Burley, S. K., and Petsko, G. A. (1985). Science 229, 23-28.
Camerman, A., Mastropaolo, D., Karle, I. L., Karle, J., and Camerman, N. (1983). Nature (London) 306, 447-450.
Cantor, C. R., and Schimmel, P. R. (1980). “Biophysical Chemistry,” Part I, Chap. 5. Freeman, San Francisco.
Caporale, L. H., Tippett, P. S., Erickson, B. W., and Hugli, T. E. (1980). J. Biol. Chem. 255, 10758-10763.
Chauhan, V. S., Sharma, A. K., Uma, K., Paul, P. K. C., and Balaram, P. (1987). Int. J. Pept. Protein Res. 29, 126-133.
Chothia, C. (1984). Annu. Rev. Biochem. 53, 537-572.
Chothia, C., Levitt, M., and Richardson, D. (1977). Proc. Natl. Acad. Sci. U.S.A. 74, 4130-4134.
Chothia, C., Levitt, M., and Richardson, D. (1981). J. Mol. Biol. 145, 215-250.
Chou, P. Y., and Fasman, G. D. (1978). Adv. Enzymol. 47, 45-148.
Chou, K.-C., Nemethy, G., and Scheraga, H. A. (1984). J. Am. Chem. Soc. 106, 3161-3170.
Chou, K.-C., Nemethy, G., Rumsey, S., Tuttle, R. W., and Scheraga, H. A. (1985). J. Mol. Biol. 186, 591-609.
Chou, K.-C., Nemethy, G., Rumsey, S., Tuttle, R. W., and Scheraga, H. A. (1986). J. Mol. Biol. 188, 641-649.
Cohen, C., and Parry, D. A. (1986). Trends Biochem. Sci. 11, 245-247.
Comte, M., Maulet, Y., and Cox, J. A. (1983). Biochem. J. 209, 269-272.
Cox, J. A., Comte, M., Fitton, J. E., and DeGrado, W. F. (1985). J. Biol. Chem. 260, 2527-2534.
Creighton, T. (1984). “Proteins,” p. 161. Freeman, San Francisco.
Crick, F. H. C. (1953). Acta Crystallogr. 6, 689-697.
Dawson, C. R., Drake, A. F., Helliwell, J., and Hider, R. C. (1978). Biochim. Biophys. Acta 510, 75-86.
DeBoer, H. A., Comstock, L. J., and Vasser, M. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 21-25.
DeGrado, W. F. (1983). In “Peptides: Structure and Function” (V. Hruby and D. H. Rich, eds.), pp. 195-198. Pierce Chemical Co., Rockford, Illinois.
DeGrado, W. F., and Lear, J. D. (1985). J. Am. Chem. Soc. 107, 7684-7689.
DeGrado, W. F., Kézdy, F. J., and Kaiser, E. T. (1981). J. Am. Chem. Soc. 103, 679-681.
DeGrado, W. F., Musso, G. F., Lieber, M., Kaiser, E. T., and Kézdy, F. J. (1982). Biophys. J. 37, 329-338.
DeGrado, W. F., Prendergast, F. G., Wolfe, H. R., Jr., and Cox, J. A. (1985). J. Cell. Biochem. 29, 83-94.
DeGrado, W. F., Erickson-Viitanen, S., Wolfe, H. R., Jr., and O’Neil, K. T. (1987a). Proteins 2, 20-33.
DeGrado, W. F., Regan, L., and Ho, S. P. (1987b). Cold Spring Harbor Symp. Quant. Biol. 52, 521-526.
DiMaio, J., Nguyen, T. M.-D., Lemieux, C., and Schiller, P. W. (1982). J. Med. Chem. 25, 1432-1438.
Drake, A. F., and Hider, R. C. (1979). Biochim. Biophys.
Acta 555, 371-373. Edelman, A. M.,Takio, K., Blumenthal, D. K., Hansen, R. S., Walsh, K. A,, Titani, K., and Krebs, E. G. (1985). J . Biol. Chem. 260, 11275-11285. Edelstein, C., Ktzdy, F. J., Scanu, A. M., and Shen, B. W. (1979). J. Lipid Res. 20, 143153. Eisenberg, D., Weiss, R. M., and Tenvilliger, T. C. (1982a). Nature (London)299,371-374. Eisenberg, D., Weiss, R. M., Terwilliger, T. C., and Wilcox, W. (1982b). Symp. Furaday Chem. SOC. 17, 109-120. Eisenberg, D., Weiss, R. M., and Tenvilliger, T. C. (1984). Proc. Natl. Acad. Sci. U.S.A. 81, 140-144. Eisenberg, D., Wilcox, W., Eshita, S. M., Pryciak, P. M., Ho,S. P., and DeGrado, W. F. (1986). Proteins 1, 16-22. Epand, R. M. (1983). Trends Biochem. Sci. 258,203-207. Epand, R. M., Gawish, A., Iqbal, M., Gupta, K. B., Chen, C. H., Segrest, J. P., and Anantharamaiah, G. M. (1987). J. Biol. Chem. 262,9389-93%. Enckson-Viitanen, S., and DeGrado, W. F. (1987). In “Methods in Enzymology” (A. R. Means, ed.), Vol. 139, p. 455-478. Academic Press, Orlando, Florida.
120
WILLIAM F. DEGRADO
Erickson-Viitanen, S., ONeil, K. T., and DeGrado, W. F. (1987). In “Protein Engineering” (D. L. Oxender and C. F. Fox, eds.), pp. 201-211. Liss, New York. Finn, F. M., and Hofmann, K. (1973). Acc. Chem. Res. 6, 169-176. Fletterick, R., and Zoller, M., eds. (1986). “Current Communications in Molecular Biology, Computer Graphics and Molecular Modeling.” Cold Spring Harbor Laboratory: Cold Spring Harbor, New York. Fox, R. D., Jr., and Richards, F. M. (1982). Naiure (London) 300, 325-330. Freidinger, R., and Veber, D. F. (1984). ACS Symp. Ser. 251, 169-187. Giedroc, D. P., Ling, N., and h e t t , D. (1983). Biochemistry 22, 5584-5591. Giedroc, D. P., Keravis, T. M., Staros, J. V., Ling, N., Wells, J. N., and Puett, D. (1985). Biochemisfry 24, 1203-121 1. Goodman, M., and Chorev, M. (1979). Acc. Chem. Res. l2, 1. Gorin, F. H., Balasubramanian, T. M., Cicero, T. J., Schwietzer, J., and Marshall, C. R. (1980). J . Med. Chem. 23, 1113-1122. Griffin, J. F., Langs, D. A., Smith, G. D., Blundell, T. L., Tickle, I. J., and Bedarkar, S. (1986). Proc. Natl. Acad. Sci. U.S.A. 83, 3272-3276. Gutte, B., Diiumigen, M.,and Wittschieber, E. (1979). Nature (London) 281, 650-655. Guy, R. (1985). Biophys. J. 47, 61-72. Haberman, E. (1972). Science 177, 314-322. Hassan, M., and Goodman, M. (1986). Biochemisfry 25, 7596-7606. Ho, S. P., and DeGrado, W. F. (1987). J. Am. Chem. SOC.109,6751-6758. Hodges, R. S., Sodek, J., Smillie, L. B., and Jurasek. L. (1972). Cold Spring Harbor Symp. Quanf. Biol. 37, 299-310. Hodges, R. S., Saund, A. K., Chong, P. C. S.,%Pierre, S. A., and Reid, R. E. (1981). J . Biol. Chem. 256, 1214-1224. Hoeprich, Jr., P. D., and Hugli, T. E. (1986). Biochemistry 25, 1945-1950. Hol, W. G. J., Halie, L. M., and Sander, C. (1981). Narure (London) 294, 532-536. Hollt, V. (1983). Trends Neurosci. 6, 24-26. Howard, J. C., Momany, F. A., Andreatta, R. H.,and Scheraga, H. A. (1973). Macromolecules 6, 535-541. Hruby, V. J. (1982). Life Sci. 31, 189-200. Hruby, V. J. (1984). 
ACS Symp. Ser. 251,9. Hruby, V. J . (1985a). Peptides 7,6-14. Hruby, V. J. (1985b). I n “Oxytocin: Clinical and Laboratory Studies” (J. A. Amico and A. G. Robinson, eds.), pp. 405-414. Elsevier, Amsterdam. Hruby, V. J., Kao, L.-F., Hirning, L. D., and Burks, T. F. (1985). In “Peptides: Structure and Function” (C. M.Deber, V. V. Hruby, and K. D. Kopple, eds.), p. 487. Pierce Chemical Co., Rockford, Illinois. Jaenicke, R., Gutte, B., Glatter, U., Stassburger, W., and Wollmer, A. (1980). FEBS Lett. 114, 161-164. Janin, J. (1979). Nature (London) 277,491-492. Kaiser, E. T., and Ktzdy, F. J. (1983). Proc. Nafl. Acad. Sci. U.S.A. 80, 1137-1143. Kaiser, E. T., and Kkzdy, F. J. (1984). Science 223, 249-255. Kanmera, T., Homandberg, G. A., Komorita, A., and Chaiken, I. M. (1983). Inr. J. Pept. Protein Res. 21, 74-83. Karle, I. L., Sukumar, M., and Balaram, P. (1986). Proc. Natl. Acad. Sci. U.S.A.83,92849288. Kauer, J. C., Erickson-Viitanen, S.,Wolfe, Jr., H. R., and DeGrado, W. F. (1986). J . Biol. Chem. 260,2527-2534. Kauzman, W. (1959). Adv. Protein Chem. 14, 1-64.
DESIGN OF PEPTIDES AND PROTEINS
121
Kee, S. M., and Graves, 1). J. (1986). J . Biol. Chem. 261,4732-4737. Kessler, H. (1982). Angew. Chem. Int. Ed. Engl. 21, 512-523. Kim, P. S., and Baldwin, R. L. (1984). Nature (London) 307, 329-334. Kincaid, R. L., and Vaughan, M. (1986). Proc. Natl. Acad. Sci. U.S.A. 83, 1193-1197. Klee, C. B., and Vanaman, T. C. (1982). Adu. Protein Chem. 35, 215-321. Klevit, R. E., Blumenthal, D. K., Wemmer, D. E., and Krebs, E. G. (1985). Biochemistry 24,8152-8157.
Kokkinidis, M., Banner, D. W., Tsernoglou, D., and Bruckner, H. (1986). Biochem. Biophys. Res. Commun. w9,590-595. Komoriya, A.. and Chaiken, I. M. (1982). J. Biol. Chem. 257, 2599-2604. Krstenansky, J. L., Trivedi, D., Johnson, D., and Hruby, V. J. (1986a). J. Am. Chem. SOC. 108, 1696-1698.
Krstenansky, J. L., Trivedi, D., and Hruby, V. (1986b). J. Biochem. 25, 3833-3839. Kullmann, W. (1984). J. Med. Chem. 27, 106-115. Lakowicz, J. R. (1983). “Fluorescence Spectroscopy.” Plenum, New York. Lau, S. Y. M., Taneja, A. K., and Hodges, R. S. (1984). J. Biol. Chem. 259, 13253-13261. Lear, J. D., and DeGrado, W. F. (1987). J. Biol. Chem. 262, 6500-6505. Lebl, M., Cody, W. L., Wilkes, B. C., Hruby, V. J., De L. Castrucci, A. M., and Hadley, M. E. (1984). I n t . J. Pept. Protein Res. 24, 472-479. Levitt, M., and Chothia, C. (1976). Nature (London) 261, 552-558. Lu,Z., Fok, K. F., Enckson, B. W., and Hugli, T. E. (1984). J . Biol. Chem. 259,7367-7370. Lu, G.-S., Mojsov, S., and Memfield, R. B. (1987). Int. J . Pept. Protein Res. 29, 545-557. Lucas, T. J., Burgess, W. H., Prendergast, F. G., Lau, W., and Watterson, D. M. (1986). Biochemistry 25, 1458-1464. McDowell, L., Sanyal, G., and Prendergast, F. G. (1985). Biochemistry 24, 2979-2984. McLachlan, A. D., and Stewart, M. (1975). J. Mol. Biol. 98, 293-304. Malencik, D. A., and Anderson, S. R. (1982). Biochemistry 21,3480-3486. Malencik, D. A., and Anderson, S. R. (1983). Biochem. Biophys. Res. Commun. 114,50-56. Malencik, D. A., and Anderson, S. R. (1984). Biochemistry 23, 2420-2428. Mammi, N. J., and Goodman, M. (1986). Biochemistry 25,7607-7614. Mammi, N. J., Hassan, M., and Goodman, M. (1985). J. Am. Chem. SOC.107,4008-4013. Manavalan, P.. and Momany, F. A. (1980). Biopolymers 19, 1943-1973. Mihara. H., Kanmera, T.. Yoshida, M., Lee, S., Aoyagi, H., Kato, T., and Izumiya, N. (1987). Bull. Chem. SOC.Jpn. 60, 697-706. Mitchinson, C., and Baldwin, R. L. (1986). Proteins 1, 23-33. Moe, G. R., and Kaiser, E. T. (1985). Biochemistry 24, 1971-1976. Moe, G. R., Miller, R. J., and Kaiser, E. T. (1983). J . Am. Chem. SOC. 105, 4100-4102. Morley, J. (1980). Annu. Reu. Pharmacol. Toxicol. 20, 81. Mosberg, H. I., and Schiller, P. W. (1984). Int. J. Pept. Protein Res. 23, 462-466. Mosberg, H. I., Hurst, R., Hruby, V. 
J., Galligan, J. J., Burks, T. F., Gee, K., and Yamamura, H.I. (1982). Biochem. Biophys. Res. Commun. 106, 506-512. Mosberg, H. I., Hurst, R., Hruby, V. J., Gee, K., Yamamura, H. I., Galligan, J. J., and Burks, T. F. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 5871-5874. Moser, R., Thomas, R. M., and Gutte, B. (1983). FEES Lett. 157, 247-251. Moser, R., Klauser, S., Leist, T., Langen, H., Epprecht, T., and Gutte, B. (1985). Angew. Chem. 24, 719-727. Musso, G. F., Kaiser, E. T., Ktzdy, F. J., and Tager, H. S. (1983). I n “Peptides: Structure and Function” (V. Hruby and D. H. Rick, eds.), pp. 365-368. Musso, G. F., Assoian, R. K., Kaiser, E. T., KCzdy, F. J., and Tager, H. S. (1984). Biochem. Biophys. Res. Commun. 119,713-719.
122
WILLIAM F. DEGRADO
Mutter, M. (1985). Angew. Chem. Int. Ed. Engl. 24, 639-653. Nakagawa, S. H., Lau, H. S. H., KCzdy, F. J., and Kaiser, E. T. (1985). J. Am. Chem. SOC. 107,7087-7092.
Nakajima, T., Uzu, S., Wakamatsu, K., Saito, K., Miyazawa, T., Yasuhara, T., Tsukamoto, T., and Fujino, M. (1986). Biopolyrners 25, sl 15-s212. Narita, M., Tomotake, Y., Isokawa, S., Matsuzawa, T., and Miyauchi, T. (1984). Macromolecules 17, 1903-1906. Ohlendorf, D. H., Finzel, B. C., Weber, P. C., and Salemme, F. R. (1987). I n “Protein Engineering” (D. L. Oxender and C. F. Fox, eds.), pp. 165-174. Liss, New York. O’Neil, K. T., and DeGrado, W. F. (1985). Proc. Nail. Acad. Sci. U.S.A. 82,4954-4958. O’Neil, K.T., Wolfe, Jr., H. R., Erickson-Viitanen, S., and DeGrado, W. F. (1987). Science 236, 1454-1456.
Osterman, D., Mora, R., KCzdy, F., Kaiser, E. T., and Meredith, S. C. (1984). J . Am. Chem. SOC. 106,6845-6847.
Ovchinnikov, Y. A., and Ivanov, V. T. (1975). Tetrahedron 31, 2177-2209. Oxender, D. L., and Fox, C. F., eds. (1987). “Protein Engineering.” Liss, New York. Paterson, Y., and Leach, S. J. (1978). Macromolecules 11,409-415. Paterson. Y., Rumsey, S. M., Benedetti, E., Nemethy, G., and Scheraga, H. A. (1981). J. Am. Chem. SOC.103,2947-2955. Ponder, J. W., and Richards, F. M. (1987). J. Mol. Biol. 193, 775-792. Prasad. B. V., and Balaram, P. (1984). CRC Crit. Rev. Biochem. 16, 307-348. Prasad, B. V. V., Sudha, T. S., and Balaram, P.J. (1983). Chem. SOC.Perkin Trans. 1,417421.
Rajashekhar, B., and Kaiser, E. T. (1986). J. Biol. Chem. 261, 13617-13623. Ramachandran, 0 . N., Ramakrishman, C., and Sasisekharan, V. (1963). J . Mol. Biol. 7, 95-99.
Recny, M. A., Grabau, C., Cronan, Jr., J. E., and Hager, L. P. (1985). J. Biol. Chem. 260, 14287-14291.
Regan, L., and DeGrado, W. F. (1988). Science, in press. Richards, F. M., (1958). Proc. Narl. Acad. Sci. U.S.A. 44, 162-166. Richards, F. M. (1977). Annu. Rev. Biophys. Bioeng. 6, 151-176. Richards, F. M., and Wyckoff, H. W. (1971). I n “The Enzymes” (P. B. Boyer ed.), Vol. 4, pp. 647-806. Academic Press, New York. Richardson, J. S. (1981). Adu. Protein. Chem. 34, 167-339. Richardson, J. S., and Richardson, D. C. (1987). I n “Protein Engineering” (D. L. Oxender and C. F. Fox, eds.), pp. 149-163. Liss, New York. Richmond, T. J.. and Richards, F. M. (1978). J. Mol. Biol. 119, 537-555. Rico, M., Santoro, J., Bermejo, F. J., Herranz, J., Nieto, J. L., Gallego, E., and Jimenez, M. A. (1986). Biopolymers 25, 1031-1053. Roemer, D., Buescher, H. H., Hill, R. C., Pless, J., Bauer, W., Cardinaux, F., Closse, A., Hauser, D., and Huguenin, R. (1977). Nature (London) 268, 547-549. Rose, G. D., Gierasch, L. M., and Smith, J. A. (1985). Adu. Protein Chem. 37, 1-109. Salemme, F. R. (1983). Prog. Biophys. Mof.Biol. 42,95-133. Sasaki, K., Dockerill, S., Adamiak, D. A., Tickel, I. J., and Blundell, T. (1975). Nature (London) 257,751-757. Scheraga, H. A. (I%@. Adu. Phys. Org. Chem. 6, 103-184. Scheraga, H. A. (1978). Pure Appl. Chem. 50, 315-324. Schiffer, M., and Edmundson, A. B. (1%7). Biophys. J. 7, 121-135. Schiller, P. W. (1984). Peptides 6, 219. Schiller, P. W.,and DiMaio, J. (1982). Nature (London) 297,74-76.
DESIGN OF PEPTIDES AND PROTEINS
123
Schiller, P. W., Eggimann, B., DiMaio, J. Lemieux, C., and Nguyen, T. M.-D. (1981). Eiochem. Eiophys. Res. Commun. 101,337-343. SchiJIer, P. W., Nguyen, T. M.-D., Lemieux, C., and Maziak, L. A. (1985). J. Med. Chem. 28, 1766-1771. SchCjllkopf, U. (1983). Curr. Top. Chem. 109, 65-84. Schwyzer, R. (1982). Natunvissenschafien 69, 15-20. Seeback, D., and Fadel, A. (1985). Helv. Chim. Acta 68, 1243-1250. Seeback, D., Boes, M., Naef, R., and Schweizer, W. B. (1983). J. A m . Chem. SOC. 105, 5390-5398. Seebach, D., Aebi, J. D., Naef, R., and Weber, T. (1985). Helv. Chim. Acta 68, 144-154. Seeholzer, S. H., Cohn, M., Putkey, J. A., Means, A. R., and Crespi, H. L. (1986). Proc. Natl. Acad. Sci. U.S.A. 83, 3634-3638. Sellinger-Barnette, M., and Weiss, B. (1984). Adu. Cyclic Nucleotide Protein Phosphorylation Res. 16, 261-276. Sheridan, R. P., Levy, R. M., and Salemme, F. R. (1982). Proc. Natl. Acad. Sci. U.S.A. 79, 4545-4549. Shimohigashi, Y., and Stammer, C. H. (1982a). Int. J. Pept. Protein Res. 19, 54-62. Shimohigashi, Y., and Stammer, C. H. (1982b). Int. J. Pept. Protein Res. 20, 199-206. Shimohigashi, Y.,English, M. L., Stammer, C. H., and Costa, T. (1982).Eiochem. Eiophys. Res. Commun. 104,583-590. Shoemaker, K. R., Kim, P. S., Brems, D. N., Marqusee, S., York, E. J.. Chaiken, I. M., Stewart, J. M., and Baldwin, R. L. (1985). Proc. Natl. Acad. Sci. U.S.A. 82, 23492353. Shoemaker, K. R., Kim, P. S., York, E. J.. Stewart, J. M., and Baldwin, R. L. (1987a). Nature (London) 326,563-567. Shoemaker, K. R., Fairman, R., Kim, P. S., York, E. J., and Stewart, J. M. (1987b). Cold Spring Harbor Symp. Quant. Eiol. 52, 391-398. Sikela, J. M., and Hahn, W. E. (1987). Proc. Natl. Acad. Sci. U.S.A. 84, 3038-3042. Singh, T. P., Haridas, M., Chauhan, V. S., and Kumar, A. (1987). Eiopolymers 26,819-829. Smith, G. D., and Grimn. J. F. (1978). Science 199, 1214-1216. Stezowski, J. J.. Eckle, E., and Bajusz, S. J. (1985). Chem. SOC.Chem. Commun. 11,681685.
Struthers, R. S., Hagler, A. T., and Rivier, (1984). J. ACS Symp. Ser. 251, 239. Sudha, T. S., and Balaram, P. (1981). FEES Lett. l34,32-36. Sudha. T. S., and Balaram, P. (1983). Int. J. Pept. Protein Res. 21, 381-388. Sueki, M., Lee, S., Powers, S. P., Denton, J. B., Konishi, Y., and Scheraga, H. A. (1984). Macromolecules 17, 148-155. Sugano, H., Abe, H., Miyoshi, M., Kato, T., and Izumiya, N. (1974). Bull. Chem. SOC.Jpn. 47,698-703. Takio, K., Blumenthal, D. K., Walsh, K. A., Titani, K., and Krebs, E. G. (1986). Eiochemi s t v 25, 8049-8057. Talbot, J. A., and Hodges, R. S. (1982). Acc. Chem. Res. 15, 224-230. Tanaka, Y., Takahashi, S., Mitsui, Y., Itoh, S., Iitaka, T., Kasai, H., and Okuyama, T. (1985). J . Mol. Eiol. 186, 675-677. Tanford, C. (1980). “The Hydrophobic Effect.” Wiley (Interscience), New York. Taylor, H. C., Komoriya, A., and Chaiken, I. M. (1985). Proc. Natl. Acad. Sci. U . S . A . 82, 6423-6426. Taylor, J. W.,and Kaiser, E. T. (1986). Pharmacol. Rev. 38,291-319. Taylor, J. W., and Kaiser, E. T. (1987). I n “Methods in Enzymology” (R. Wu and L. Grossman, ed.), Vol. 154, pp. 473-498. Academic Press, San Diego, California.
124
WILLIAM F. DEGRADO
Taylor, J. W., Osterrnan, D. G., Miller, R. J., and Kaiser, E. T. (1981). J. Am. Chem. SOC. 103,6965-6966, Taylor, J. W., Miller, R. J., and Kaiser, E. T. (1982). Mol. Pharmacol. 22, 657-666. Taylor, J. W., Miller, R. J., and Kaiser, E. T. (1983). J. Biol. Chem. 258, 446444471. Terwilliger, T. C., and Eisenberg, D. (1982a). J. Biol. Chem. 257, 6010-6015. Terwilliger, T. C., and Eisenberg, D. (1982b). J. Biol. Chem. 257, 6016-6022. Terwilliger, T. C., Weissman, L., and Eisenberg, D. (1982). Biophys. J . 37, 353-359. Tonelli, A. E. (1976). Biopolymers 15, 1615-1622. Toniolo, C., Bonora, G. M., Bavoso, A., Benedetti, E., DiBlasio, B., Pavone, V., and Pedone, C. (1983). Biopolymers 22, 205-215. Tou, J. S., Kaempfe, L. A,, Vineyard, B. D., Buonomo, F. C., Della-Fera, M. A,, and Baile, C. A. (1986). Biochem. Biophys. Res. Commun. 139, 763-770. Unson, C. B., Erickson, B. W., Richardson, D. C., and Richardson, J. S. (1984). Fed. Proc., Fed. Am. SOC.Exp. Biol. 4, 1837. Veber, D. F. (1981). I n “Peptides Synthesis Structure and Function” (D. H. Rich and E. Gross, eds.), pp. 685-694. Pierce Chemical Co., Rockford, Illinois. Veber, D. F., Freidinger, R. M., Perlow, D. S., Palveda, W. J., Jr., Holly, F. W., Strachan, R. G., Nutt, R. F., Arison, B. H.,Hommick, C., Randall, W. C., Glitzer, M. S., Saperstein, R., and Hirschman, R. (1981). Nature (London) 292, 55-58. Velicelebi, G., Patthi, S., and Kaiser, E. T. (1986). Proc. Natl. Acad. Sci. U.S.A. 83,53975399. Vitoux, B., Aubry, A., Cung, M. T., Boussard, G., and Marraud, M. (1981). Int. J. Pept. Protein Res. 17, 469-479. Vitoux, B., Aubry, A., Cung, M. T., and Marraud, M. (1986). Int. J. Pept. Protein Res. 27, 617-632. Weber, P. C., and Salemme, F. R. (1980). Nature (London) 287, 82-84. Wlodawer, A., and Sjolin, L. (1983). Biochemistry 22, 2720-2728. Wyckoff, H. W.. Hardman, K. D., Allewell, N. M., Inagami, T., Johnson, L. N., and Richards, F. M. (1967). J . Biol. Chem. 242, 3984-3988. Zimmerman, S. S., Pottle, M. S. 
Ntmethy, G., and Scheraga, H. A. (1977). Macromolecules 10, 1-9.
WEAKLY POLAR INTERACTIONS IN PROTEINS
By S. K. BURLEY*,†,1 and G. A. PETSKO*

*Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
†Harvard Medical School, Health Sciences and Technology Division, Boston, Massachusetts 02115
I. Introduction
II. Electrostatic Interactions in Proteins
    A. Multipole Representations of Molecular Charge Distribution
    B. Charge-Charge Interactions
    C. Partial Electronic Charges in Proteins
    D. Polarizability
    E. Charge-Dipole Interactions
    F. Dipole-Dipole Interactions
    G. Interactions with Electronic Quadrupoles
    H. London Forces and Short-Range Electron Shell Repulsion
    I. Hydrogen Bonds in Proteins
    J. Hierarchy of Electrostatic Interactions
III. Weakly Polar Interactions in Proteins
    A. Aromatic Amino Acids
    B. Oxygen-Aromatic Interactions
    C. Sulfur-Aromatic Interactions
    D. Aromatic-Aromatic Interactions
    E. Amino-Aromatic Interactions
IV. Interactions: A Summary
V. Hydrophobic Interactions in Proteins
VI. Discussion
VII. Conclusion
    A. Hypothesis
    B. Proposed Classification of Electrostatic Interactions
References
1 Present address: Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts 02115.

ADVANCES IN PROTEIN CHEMISTRY, Vol. 39. Copyright © 1988 by Academic Press, Inc. All rights of reproduction in any form reserved.

I. INTRODUCTION

Although it has been known for two decades that the three-dimensional structure of a protein is determined by its amino acid sequence, the forces that drive a disordered polypeptide chain into its final, folded conformation and, once there, maintain this compact structure are only understood in general terms. There are two broad classes of noncovalent interactions that are traditionally thought to contribute to the net free energy of stabilization of protein structure. These are electrostatic interactions (of which salt bridges, hydrogen bonds, and van der Waals interactions are the best characterized examples) and the hydrophobic effect (which results primarily from entropic changes and is not really a force). The interiors of globular proteins are closely packed, and it has been assumed that the arrangement of residues in the hydrophobic core is dictated largely by the need to fill space efficiently. Recent geometric analyses of structural data bases of proteins and small organic compounds suggest that the packing of atoms in the interior of a protein is, in part, also determined by interactions between asymmetric distributions of the electrons surrounding the nuclei of atoms in a protein (Thomas et al., 1982; Reid et al., 1985; Burley and Petsko, 1985, 1986a,b; Singh and Thornton, 1985; Gould et al., 1985). Such interactions are enthalpically favorable and have been termed weakly polar because they are essentially electrostatic in origin, but involve one or more amino acids that are traditionally considered to be nonpolar. Weakly polar interactions are stronger than van der Waals interactions between nonpolar groups, and are of an enthalpic importance comparable to that of a hydrogen bond between uncharged groups.

We review the subject of noncovalent interactions in proteins with particular emphasis on the so-called weakly polar interactions. First, the physical bases of the noncovalent electrostatic interactions that stabilize protein structure are discussed. Second, the four types of weakly polar interactions that have been shown to occur in proteins are described with reference to some biologically significant examples of protein structure stabilization and protein-ligand binding. Third, hydrophobic effects in proteins are discussed.
Fourth, a hypothesis regarding the biological importance of the weakly polar interaction is advanced. Finally, we propose adoption of a systematic classification of electrostatic interactions in proteins.

II. ELECTROSTATIC INTERACTIONS IN PROTEINS

A. Multipole Representations of Molecular Charge Distribution
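If a charge distribution $\rho(\mathbf{r})$ is placed in an electric field [$\Phi(\mathbf{r})$] due to a second nonoverlapping charge distribution, $\rho'(\mathbf{r})$, the electrostatic potential energy of the system is given by

$$W = \int \rho(\mathbf{r})\,\Phi(\mathbf{r})\,d\mathbf{r}$$

where $\mathbf{r}$ is the position vector over the volume of the charge distribution $\rho(\mathbf{r})$ and the origin of the coordinate system lies within the charge distribution. In this case, the potential function $\Phi$ can be usefully rewritten as

$$\Phi(\mathbf{r}) = \int \frac{\rho'(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,d\mathbf{r}'$$

When the charge distributions correspond to molecules such as proteins that are composed of discrete segments, i.e., atoms, the charge distribution function can be expressed as the sum of atomic charge densities given by

$$\rho(\mathbf{r}) = \sum_a \rho_a(\mathbf{r})$$

where $\rho_a(\mathbf{r})$ is the atomic charge density function for atom $a$. The above equations can be recast as

$$W = \sum_a \int \rho_a(\mathbf{r})\,\Phi(\mathbf{r})\,d\mathbf{r}$$

Further simplification is achieved by rewriting the potential function $\Phi(\mathbf{r})$ as a Taylor series expansion about the origin:

$$\Phi(\mathbf{r}) = \Phi(0) + \mathbf{r}\cdot\nabla\Phi(0) + \frac{1}{2}\sum_i \sum_j r_i r_j \frac{\partial^2 \Phi(0)}{\partial r_i\,\partial r_j} + \cdots$$

thereby giving the electrostatic energy in a similar form

$$W = \Phi(0)\int \rho(\mathbf{r})\,d\mathbf{r} + \nabla\Phi(0)\cdot\int \mathbf{r}\,\rho(\mathbf{r})\,d\mathbf{r} + \frac{1}{2}\sum_i \sum_j \frac{\partial^2 \Phi(0)}{\partial r_i\,\partial r_j}\int r_i r_j\,\rho(\mathbf{r})\,d\mathbf{r} + \cdots$$

Once the electronic multipole moments have been identified as $q$, net charge; $\mathbf{p}$, dipole moment; $\mathbf{Q}$, quadrupole moment; etc., the above expression for the electrostatic energy becomes

$$W = q\,\Phi(0) + \mathbf{p}\cdot\nabla\Phi(0) + \sum_i \sum_j (\mathbf{Q})_{ij}\,\frac{\partial^2 \Phi(0)}{\partial r_i\,\partial r_j} + \cdots$$

In addition, the potential function $\Phi(\mathbf{r})$ becomes

$$\Phi(\mathbf{r}) = q\,|\mathbf{r}|^{-1} + (\mathbf{p}\cdot\mathbf{r})\,|\mathbf{r}|^{-3} + (\mathbf{r}\cdot\mathbf{Q}\cdot\mathbf{r})\,|\mathbf{r}|^{-5} + \cdots$$

which can be written as the sum of its corresponding atomic components

$$\Phi(\mathbf{r}) = \sum_a \left[\, q_a\,|\mathbf{r}|^{-1} + (\mathbf{p}_a\cdot\mathbf{r})\,|\mathbf{r}|^{-3} + (\mathbf{r}\cdot\mathbf{Q}_a\cdot\mathbf{r})\,|\mathbf{r}|^{-5} + \cdots \right]$$

Finally, the electrostatic energy can be written as a sum of interactions between the multipoles of the two molecules

$$W = \sum_m \sum_n W_{mn}$$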
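where the first nine terms of the expansion are

Monopole-monopole: $W_{00} = q_1 q_2\,|\mathbf{r}|^{-1}$
Monopole-dipole: $W_{01} = q_1 (\mathbf{p}_2\cdot\mathbf{r})\,|\mathbf{r}|^{-3}$
Dipole-monopole: $W_{10} = q_2 (\mathbf{p}_1\cdot\mathbf{r})\,|\mathbf{r}|^{-3}$
Dipole-dipole: $W_{11} = (\mathbf{p}_1\cdot\mathbf{p}_2)\,|\mathbf{r}|^{-3} - 3(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{p}_2\cdot\mathbf{r})\,|\mathbf{r}|^{-5}$
Monopole-quadrupole: $W_{02} = q_1 (\mathbf{r}\cdot\mathbf{Q}_2\cdot\mathbf{r})\,|\mathbf{r}|^{-5}$
Quadrupole-monopole: $W_{20} = q_2 (\mathbf{r}\cdot\mathbf{Q}_1\cdot\mathbf{r})\,|\mathbf{r}|^{-5}$
Dipole-quadrupole: $W_{12} = 2(\mathbf{p}_1\cdot\mathbf{Q}_2\cdot\mathbf{r})\,|\mathbf{r}|^{-5} - [5(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{r}\cdot\mathbf{Q}_2\cdot\mathbf{r})]\,|\mathbf{r}|^{-7}$
Quadrupole-dipole: $W_{21} = -2(\mathbf{p}_2\cdot\mathbf{Q}_1\cdot\mathbf{r})\,|\mathbf{r}|^{-5} + [5(\mathbf{p}_2\cdot\mathbf{r})(\mathbf{r}\cdot\mathbf{Q}_1\cdot\mathbf{r})]\,|\mathbf{r}|^{-7}$
Quadrupole-quadrupole: $W_{22} = (1/3)[\text{trace of } \mathbf{Q}_1\mathbf{Q}_2]\,|\mathbf{r}|^{-5} - (20/3)[\mathbf{r}\cdot\mathbf{Q}_2\cdot\mathbf{Q}_1\cdot\mathbf{r}]\,|\mathbf{r}|^{-7} + (35/2)(\mathbf{r}\cdot\mathbf{Q}_1\cdot\mathbf{r})(\mathbf{r}\cdot\mathbf{Q}_2\cdot\mathbf{r})\,|\mathbf{r}|^{-9}$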
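As a rough numerical illustration of how the leading terms of such an expansion behave, the following Python sketch compares a direct pairwise Coulomb sum with the monopole and dipole terms for two well-separated groups of point charges. It is only a sketch under stated assumptions: the two "molecules", their charges, and their coordinates are hypothetical, units are arbitrary, and the vector R is taken to point from the center of molecule 1 to the center of molecule 2. With that choice the charge-dipole term for a dipole on molecule 2 carries a minus sign; the sign of the monopole-dipole terms depends on the direction in which the separation vector is defined.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def net_charge_and_dipole(charges, positions, center):
    """Net charge q and dipole vector p of a set of point charges about `center`."""
    q = sum(charges)
    p = tuple(sum(c * (r[k] - center[k]) for c, r in zip(charges, positions))
              for k in range(3))
    return q, p

def exact_energy(charges1, pos1, charges2, pos2):
    """Direct pairwise Coulomb sum (arbitrary units, dielectric constant of 1)."""
    return sum(c1 * c2 / norm(sub(r2, r1))
               for c1, r1 in zip(charges1, pos1)
               for c2, r2 in zip(charges2, pos2))

def multipole_terms(q1, p1, q2, p2, R):
    """Terms through dipole-dipole; R points from molecule 1 to molecule 2."""
    r = norm(R)
    return {
        "W00": q1 * q2 / r,                       # monopole-monopole
        "W01": -q1 * dot(p2, R) / r**3,           # charge 1 with dipole 2
        "W10": q2 * dot(p1, R) / r**3,            # dipole 1 with charge 2
        "W11": dot(p1, p2) / r**3
               - 3.0 * dot(p1, R) * dot(p2, R) / r**5,  # dipole-dipole
    }

# Hypothetical two-charge "molecules", 10 length units apart along z.
charges1 = [0.6, 0.4]
pos1 = [(0.0, 0.0, 0.3), (0.0, 0.0, -0.3)]
center1 = (0.0, 0.0, 0.0)
charges2 = [-0.7, -0.3]
pos2 = [(0.2, 0.0, 10.0), (-0.2, 0.0, 10.0)]
center2 = (0.0, 0.0, 10.0)

q1, p1 = net_charge_and_dipole(charges1, pos1, center1)
q2, p2 = net_charge_and_dipole(charges2, pos2, center2)
R = sub(center2, center1)

terms = multipole_terms(q1, p1, q2, p2, R)
w_exact = exact_energy(charges1, pos1, charges2, pos2)
w_truncated = sum(terms.values())

print("exact pairwise sum:     ", w_exact)
print("through dipole-dipole:  ", w_truncated)
print("monopole-monopole only: ", terms["W00"])
```

In this example the truncated sum lies closer to the direct sum than the monopole-monopole term alone; the small remaining difference comes from the quadrupole and higher terms, which are omitted here.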
The expression for the electrostatic energy given above can be further subdivided to give a sum of interactions between atomic multipoles, which is in turn summed over all possible pairs of interacting atoms. Atomic multipoles are estimated by fitting the atomic multipole expansion to the detailed features of the ground-state wave function obtained from ab initio quantum mechanical calculations. Rein (1975) reviewed the problem of estimating atomic multipoles and presented examples of the application of the atomic multipole expansion method to the problem of molecular recognition in biology. More recently, Liang and Lipscomb (1986) considered the problem of the transferability of atomic multipoles in atomic multipole expansions.
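The multipole form of the potential given earlier in this section can also be checked numerically. The sketch below computes the net charge, dipole, and quadrupole of a set of point charges and compares the truncated expansion with the exact potential at a distant point. It is not the fitting procedure described above, which relies on quantum mechanical wave functions. It assumes the traceless normalization $Q_{ij} = \tfrac{1}{2}\sum_a q_a (3 r_{a,i} r_{a,j} - |\mathbf{r}_a|^2 \delta_{ij})$, for which the quadrupole term takes the form $(\mathbf{r}\cdot\mathbf{Q}\cdot\mathbf{r})|\mathbf{r}|^{-5}$; the text does not state its normalization, and the linear three-charge model and its coordinates are hypothetical.

```python
import math

def multipole_moments(charges, positions):
    """Net charge q, dipole p, and traceless quadrupole Q about the origin."""
    q = sum(charges)
    p = [sum(c * r[k] for c, r in zip(charges, positions)) for k in range(3)]
    Q = [[0.0, 0.0, 0.0] for _ in range(3)]
    for c, r in zip(charges, positions):
        r2 = sum(x * x for x in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                # Assumed normalization: Q_ij = 1/2 * sum q (3 r_i r_j - r^2 delta_ij)
                Q[i][j] += 0.5 * c * (3.0 * r[i] * r[j] - r2 * delta)
    return q, p, Q

def potential_multipole(q, p, Q, r):
    """Potential at r from the first three terms of the multipole series."""
    d = math.sqrt(sum(x * x for x in r))
    p_dot_r = sum(p[k] * r[k] for k in range(3))
    r_Q_r = sum(r[i] * Q[i][j] * r[j] for i in range(3) for j in range(3))
    return q / d + p_dot_r / d**3 + r_Q_r / d**5

def potential_exact(charges, positions, r):
    """Potential at r as a direct sum over the point charges."""
    total = 0.0
    for c, s in zip(charges, positions):
        total += c / math.sqrt(sum((r[k] - s[k]) ** 2 for k in range(3)))
    return total

# Hypothetical linear quadrupole: no net charge and no dipole moment.
charges = [1.0, -2.0, 1.0]
positions = [(0.0, 0.0, 0.5), (0.0, 0.0, 0.0), (0.0, 0.0, -0.5)]

q, p, Q = multipole_moments(charges, positions)
point = (0.0, 0.0, 10.0)
phi_series = potential_multipole(q, p, Q, point)
phi_exact = potential_exact(charges, positions, point)

print("series:", phi_series)
print("exact: ", phi_exact)
```

Because this model has neither a net charge nor a dipole, the quadrupole term is the leading contribution, and the two values agree to within about one percent at this distance.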
B. Charge-Charge Interactions

The carboxy and amino termini of proteins can possess unit electronic charges, as can the side chains of aspartic acid, glutamic acid, cysteine, tyrosine, lysine, arginine, and histidine. At physiological pH these ionizable groups are at least partially charged and are typically found on the surface of a protein, where they interact with other charged groups and/or become solvated by water molecules in the protein's hydration shell. It is important to realize that some of these groups (tyrosine, arginine, lysine, and glutamic acid) are polar and charged only at one extremum of their structures. The remainder of the amino acid side chain is nonpolar and often at least partially buried. When charged groups are positioned in the interior of a protein, where they are inaccessible to solvent, oppositely charged ionizable groups are usually found within 4 Å, and the two charges form an ion pair or salt bridge, which is enthalpically favorable. If the buried ionizable group is not balanced by a neighboring charge of opposite sign, it is usually found to be involved in a hydrogen bond with a nearby polar group or groups that are electrically neutral, thereby forming
a charge-dipole interaction. Charge-charge interactions in proteins have been extensively described by Wada and Nakamura (1981), Rashin and Honig (1984), and Barlow and Thornton (1983), who documented their important role in protein structure stabilization. Charge-charge interactions are of two kinds: one when the cation and anion are not hydrogen bonded to each other, and one when they are. This latter case is called a salt bridge.

Specific examples of the effects of salt bridges on protein thermal stability have been described. Perutz and Raidt (1975) examined ferredoxin thermal stability and explained changes in denaturation temperature in terms of changes in the number of internal salt bridges. A hereditary functional human triosephosphate isomerase deficiency, which is an autosomal recessive disease that has severe clinical manifestations including chronic hemolytic anemia and neuromuscular disorders (Valentine et al., 1983), has been analyzed genetically by Maquat and co-workers (Daar et al., 1986). They documented a single amino acid substitution from glutamate to aspartate at position 104. This change conserves the overall charge of the amino acid, but the shortening of the residue by loss of a side-chain methylene group (-CH2-CH2-COO⁻ → -CH2-COO⁻) is sufficient to disrupt the counterbalancing charge network that normally exists within this hydrophobic region of the native enzyme and thereby render the mutant enzyme thermolabile. In a laboratory mimicry of amino acid side-chain deamidation, which may be an important mechanism of the kind of single-site mutation that is deleterious to the thermal stability of a protein, Ahern et al. (1987) made site-directed mutations of some of the asparagine residues of yeast triosephosphate isomerase (Casal et al., 1987). They documented reduced thermal stability when asparagine-78 was changed to an aspartic acid, a conversion that introduces a negative charge into the interface between the two protein subunits of the dimer. The same substitution is thought to occur via deamidation of the asparagine side chain at higher temperatures in aqueous solution. Moreover, they documented enhanced stability to irreversible inactivation at higher temperatures when asparagine-98 was changed either to a threonine or an isoleucine.

Salt bridges also mediate binding of charged ligands to proteins; 2,3-diphosphoglycerate, for example, makes four salt bridges with charged residues lining the central cavity of deoxyhemoglobin (Fermi and Perutz, 1981). In addition, substrates and substrate analogs can bind to enzyme active sites via salt bridges. A salt bridge is observed between the negatively charged carboxylate moiety of glycyl-L-tyrosine and the guanidinium group of the active site arginine-145 of carboxypeptidase A in the X-ray crystal structure of this enzyme-ligand complex (Lipscomb, 1980).
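For orientation, the bare Coulomb energy of a pair of unit charges at the roughly 4 Å separation mentioned above can be estimated with a few lines of Python. This is a minimal sketch under stated assumptions: it uses the standard conversion factor of about 332.06 kcal Å mol⁻¹ e⁻² for charges in units of e and distances in Å, treats the medium as a uniform dielectric, and takes an interior dielectric constant of 4 purely as an illustrative assumption. The function name is hypothetical, and the numbers are not free energies of salt-bridge formation, which also depend on desolvation and other effects.

```python
# Conversion factor so that charges in e and distances in Angstrom give kcal/mol.
COULOMB_KCAL = 332.06

def coulomb_energy(q1, q2, r_angstrom, dielectric=1.0):
    """Coulomb energy (kcal/mol) of two point charges in a uniform dielectric."""
    if r_angstrom <= 0.0:
        raise ValueError("separation must be positive")
    return COULOMB_KCAL * q1 * q2 / (dielectric * r_angstrom)

# Unit charges of opposite sign at 4 Angstrom, for three dielectric constants:
# vacuum (1), an assumed protein-interior value (4), and bulk water (80).
for eps in (1.0, 4.0, 80.0):
    energy = coulomb_energy(+1.0, -1.0, 4.0, dielectric=eps)
    print("dielectric", eps, "->", round(energy, 2), "kcal/mol")
```

The three values are roughly -83, -21, and -1 kcal/mol, which shows how strongly the result depends on the dielectric constant assumed for the intervening medium.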
The potential energy of interaction between two electronic charges $q_1$ and $q_2$ separated by a distance $r$ in a medium of homogeneous dielectric constant ($\varepsilon$) is given by Coulomb's law and can be written in arbitrary units as

$$V \propto \frac{q_1 q_2}{\varepsilon r}$$

where $\varepsilon$ is the bulk dielectric constant of the medium separating the two charges. The dielectric constant is a bulk property of the medium that reflects the shielding of one charge from another by the molecules in the medium ($\varepsilon$ is unity for vacuum and is 80 for water). If all the charges in a given system, including fluctuating charges in the solvent, can be described explicitly, a dielectric constant of unity is appropriate because the medium is completely specified and needs no description of bulk properties. Although it is usually not possible to determine precisely the energetic contribution any one charge-charge interaction makes to the stability of a protein, typical estimates for the free energy of stabilization of a buried salt bridge between an amino acid NH3+ group bearing a positive charge and a negatively charged carboxylate ion in a protein are about -3 to -4 kcal/mol (Fersht, 1972). Salt bridges that occur on the surface of hemoglobin have lower stabilization energies, because their constituents can readily form hydrogen bonds with nearby water molecules (Perutz, 1970). Weaker interactions are also possible at quite large charge-charge separations because the strength of the Coulomb interaction varies as $1/r$, and has the longest range of any electrostatic interaction. This formula is, however, not valid in the vicinity of boundaries between media of differing dielectric properties, where the Poisson-Boltzmann equation (of which Coulomb's law is a special case) must be solved explicitly (Gilson et al., 1987). The distance dependence for charge-charge interactions varies as $1/r$ and is spatially isotropic.

C. Partial Electronic Charges in Proteins

Covalent bonding between two atoms occurs when an electron orbital of one atom overlaps an electron orbital of the other atom. Although the two atoms are said to share electrons, they do not do so equally; the distribution of electrons in the vicinity of the two nuclei is not symmetric.
Electronegativity is the property of an atom describing its ability to attract a shared electron pair (Pauling, 1960). Biologically important atoms and their electronegativities are O (3.49), N (2.98), C (2.55), S (2.53), H (2.13), and P (2.10). When two atoms of different electronegativity form a covalent bond, the pair of atoms can be treated as if unlike fractional (δ) electronic charges occur in the vicinity of the two atomic nuclei. The more electronegative of the two atoms is said to have the δ- charge and
the less electronegative atom is said to have the δ+ charge, and a small permanent electronic dipole moment results. A self-consistent set of partial electronic charges for each of the twenty naturally occurring amino acids is given in Table I. This set of partial charges attributes three different magnitudes of electronic dipole moment to pairs of covalently joined atoms in the polar, weakly polar, and aliphatic amino acids, respectively. In the polar amino acid side chains, the δ charges are typically about ±0.35 e; in the aliphatic amino acid side chains, the δ charges are typically about ±0.10 e. Between these two extremes lie the weakly polar aromatic amino acid side chains with δ charges of about ±0.15 e. The magnitudes of the partial electronic charges are calculated by fitting the results of ab initio quantum mechanical calculations of well-defined model systems and experimental data, such as dipole and quadrupole moments and sublimation energies, to an analytic potential energy function that includes Coulomb’s law.

Two force fields have been widely used for protein energy and molecular dynamics calculations. The AMBER all-atom force field developed by Kollman and co-workers (Weiner et al., 1986) explicitly treats bonds, bond angles, dihedral angles, hydrogen bonds, and the nonbonded Coulombic and Lennard-Jones 6-12 contributions to the total enthalpy of the system. The CHARMM all-atom force field developed by Karplus and co-workers (Brooks et al., 1983; M. Karplus, unpublished observations) also determines the total enthalpy of a macromolecule by subdividing the contributions of bonds, bond angles, dihedral angles, and the nonbonded Coulombic and Lennard-Jones 6-12 potential, but does not treat the hydrogen bond as a special case of a nonbonded interaction. Different empirical potential energy functions use different values for the partial charges; although these differences appear small, they can lead to markedly different results in some applications.
For example, the results of molecular mechanics and molecular dynamics calculations of the model system N-phenylacetyl-L-phenylalanine (Burley and Wang, 1987a) depend critically on the precise choice of partial electronic charges attributed to the phenyl rings (Forman et al., 1988). When partial electronic charges of δ = -0.15 e for ring carbon atoms and δ = +0.15 e for ring hydrogen atoms are used, the minimum energy conformation closely approximates the structure obtained with single-crystal X-ray methods, illustrated in Fig. 1. If, however, much smaller partial electronic charges are attributed to the phenyl ring atoms, the minimum energy conformation differs markedly from the observed structure.
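To make concrete how partial charges enter the Coulombic term of such a potential function, the following sketch sums Coulomb's law over atom pairs of two small groups. It is only an illustration under stated assumptions: the conversion factor 332.06 kcal Å/(mol e²) is the customary one and is not taken from the text, the function names are hypothetical, and the two C-H fragments (charges of -0.15 e and +0.15 e, C-H length 1.08 Å, 4 Å apart) are an invented geometry rather than any structure discussed here.

```python
import math

# ASSUMPTION: 332.06 kcal*Å/(mol*e^2) converts e^2/Å to kcal/mol.
# The text gives Coulomb's law in arbitrary units only.
COULOMB_KCAL = 332.06


def coulomb_energy(q1, q2, r, eps=1.0):
    """Coulomb energy (kcal/mol) of charges q1, q2 (units of e) at distance r (Å)."""
    if r <= 0:
        raise ValueError("r must be positive")
    return COULOMB_KCAL * q1 * q2 / (eps * r)


def group_energy(group_a, group_b, eps=1.0):
    """Sum Coulomb terms over every atom pair between two groups.

    Each group is a list of (partial_charge, (x, y, z)) tuples.
    """
    total = 0.0
    for qa, pa in group_a:
        for qb, pb in group_b:
            total += coulomb_energy(qa, qb, math.dist(pa, pb), eps)
    return total


def ch_fragment(y, flipped=False):
    """Hypothetical aromatic C-H fragment lying along x at height y (Å)."""
    c_x, h_x = (1.08, 0.0) if flipped else (0.0, 1.08)
    return [(-0.15, (c_x, y, 0.0)), (+0.15, (h_x, y, 0.0))]


# Two fragments side by side, 4 Å apart, pointing the same or opposite way.
parallel = group_energy(ch_fragment(0.0), ch_fragment(4.0))
antiparallel = group_energy(ch_fragment(0.0), ch_fragment(4.0, flipped=True))
print(f"parallel: {parallel:+.3f} kcal/mol, antiparallel: {antiparallel:+.3f} kcal/mol")
```

Even with charges this small, the sign of the summed energy changes with the relative orientation of the two fragments, which is one way to see why the choice of partial charges can shift a computed minimum energy conformation.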
D. Polarizability Polarizability is defined conceptually as the propensity for a given distribution of electrons to be spatially distorted, and this property is in-
TABLE I. Partial Charges for All Atoms of the 20 Naturally Occurring Amino Acidsᵃ
Charge distribution
Amino acid N -0.35
HN +0.25
CA -0.10 CA 0.00 CA +0.00 CA +0.00 CA +0.00 N -0.36
HCA +o. 10 HCA +o. 10 HCA
c
Invariant Backbone Atoms
o
+0.55
-0.55
Nonpolar Amino Acids
G~Y Ala
Val Leu Ile
Pro
CYS
Met
Ser Thr Asn Gln
His His LYS LYS Arg Art!
CA +0.00 CA +0.00
CB -0.30 CB +0.10 -0.10 HCA CB +o. 10 -0.20 HCA CB +0.10 -0.20 CA HCA +0.00 +o. 10 HCA
. +0.10 HCA +0.10
CA 0.00 CA +0.00 CA +0.00 CA +0.00
HCA
CA +0.00 NDI -0.40 CA +0.00 CD -0.20 CA +0.00 CD -0.10
HCA +o. 10 HNDl
+0.10
HCA +o. 10 HCA +o. 10 HCA +0.10
+OM
HCA +o. 10 HCD +o. 10 HCA +0.10
HCD +o. 10
HCB +o. 10 HCB CG +O. 10 -0.30 HCB CG +0.10 -0.10 HCB CG +o. 10 -0.10 HCB CB -0.20 +0.10
HCG +0.10
HCG +0.10
HCG +0.10
CG -0.20
CD -0.30 CD -0.30 HCG
HCD +o. 10 HCD +0.10
CD +0.06
HCD
Sulfur-ContainingAmino Acids HCB SG HSG CB -0.20 +0.10 -0.05 +0.05 CB HCB CG HCG SD -0.20 +0.10 -0.12 +0.10 -0.17
CE -0.21
HCE
Uncharged Polar Amino Acids HCB OG HOG CB +0.05 +0.10 -0.65 +0.40 HOG HCB OG CB +0.15 +0.10 -0.65 +0.40 OD1 CB HCB CG -0.20 +0.10 +0.55 -0.55 CB HCB CG HCG -0.20 +0.10 -0.20 +0.10
HCG +o. 10 HND2 +0.30 OEl NE2 -0.55 -0.60
+0.10
CG -0.30 ND2 -0.60 CD +0.55
Positively Charged Amino Acids CB HCB CG -0.20 +0.10 +0.05 CD2 HCD2 CEl HCEl NE2 -0.14 +0.14 -0.14 +0.14 -0.05 HCB CG HCG CB -0.20 +0.10 -0.20 +0.10 CE HCE NZ HNZ +0.05 +0.10 -0.30 +0.35 HCB CG HCG CB -0.20 +0.10 -0.20 +0.10 NE HNE CZ NHI HNHl -0.40 +0.30 +0.50 -0.45 +0.35
NH2 -0.45
+Oslo
+0.10
HNH2 t0.35
HNE2 +0.30
TABLE I (Continued)
Amino acid
ASP
Glu
Charge distribution
CA
HCA
CA
HCA
CA
HCA
CG -0.03 CA
CD -0.16 HCA
+o.oo
+0.10
+o.oo +O.lO
+om
+om
+0.10
+0.10
CG CD +O.W -0.14 CA HCA +o.oo +O.lO NEI CE2 +0.40 +0.13
Negarively Charged Amino Acids CB HCB CG OD1 OD2G -0.36 +0.10 +0.36 -0.60 -0.60 CB HCB CG HCG CD -0.20 +0.10 -0.36 +0.10 +0.36 Aromatic CB HCB -0.16 +0.10 HCD CE +0.14 -0.15 CB HCB -0.20 +O.lO HCD CE +0.14 -0.14 CB HCB -0.20
CE3 -0.16
OEl -0.60
OE2 -0.60
HOH +0.40 CD2 -0.49 HCZ3 +0.14
CH2 -0.18
Amino Acids
HCE CZ +0.14 -0.15
HCE +0.14 CG +0.10 -0.04 HCE3 CZ2 +0.14 -0.15
CZ +0.20 CDI -0.01 HCZ2 +0.14
HCZ +0.14 OH -0.60 HCDl +0.14 CZ3 -0.15
HCH2 +0.14
Partial charge given in units of electronic charge. These partial charge distributions are those used in the molecular mechanics modeling program CHARMM (Brooks et al., 1983). Their inclusion is solely for the reader's information and does not constitute a specific endorsement of one molecular mechanics parameter set. HX denotes the hydrogen atom or atoms attached to atom X.
versely related to the electronegativity of a given atom. It has units of volume, and the polarizabilities of some common biochemical groups are given in the tabulation below.
Group    Polarizability (10⁻²⁴ ml/molecule)
CH3      1.97
NH2      1.44
SH       1.83
OH       0.733
H2O      1.46
Hence, N and O are less polarizable than C and H, even though they form more polar molecules. Polarizability increases with atomic radius, since the electrons are less tightly bound. Therefore, S is more polarizable than N.
FIG. 1. Stereodrawing of the molecular structure of N-phenylacetyl-L-phenylalanine drawn by ORTEP II (Johnson, 1967). Molecular mechanics and dynamics calculations of the compound both in vacuum and in the crystal environment reproduced the observed crystal structure only when the appropriate partial electronic charges were used for the aromatic carbon and hydrogen atoms.
The potential energy due to the induction of a dipole of polarizability $\alpha$ by a point charge of magnitude $q$ at a distance $r$ is given by

$$V \propto -\frac{0.5\,\alpha q^2}{r^4}$$
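The expression above can be evaluated numerically as a rough check on the magnitudes involved. The sketch below assumes the customary conversion factor of 332.06 kcal Å/(mol e²), which is not given in the text, takes the polarizability in Å³ (numerically equal to units of 10⁻²⁴ ml/molecule), and ignores dielectric shielding altogether; the function name is hypothetical.

```python
# ASSUMPTION: 332.06 kcal*Å/(mol*e^2) converts e^2/Å to kcal/mol (not from the text).
COULOMB_KCAL = 332.06


def induction_energy(alpha, q, r):
    """Charge-induced dipole energy, V = -0.5 * alpha * q^2 / r^4, in kcal/mol.

    alpha: polarizability in Å^3 (1 Å^3 = 1e-24 ml/molecule)
    q:     inducing charge in units of e
    r:     charge-to-group distance in Å
    No dielectric shielding is applied.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    return -0.5 * COULOMB_KCAL * alpha * q**2 / r**4


# Unit charge acting on a water molecule (alpha = 1.46 Å^3), unshielded.
for r in (2.0, 4.0, 8.0):
    print(f"r = {r:4.1f} Å   V = {induction_energy(1.46, 1.0, r):8.3f} kcal/mol")
```

At 2 Å this gives roughly -15 kcal/mol, and each doubling of the distance reduces the energy sixteenfold, which is why the effect matters mainly at short range.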
If the unit charge is 40 Å from a water molecule with $\alpha = 1.46 \times 10^{-24}$ ml/molecule, $V = -1.8 \times 10^{(\ldots)}$ kcal/mol, because of shielding by intervening water molecules. However, if the two groups are only 2 Å apart and there is no shielding, $V = -15.1$ kcal/mol. Therefore, at short distances, charge distortion effects become important, and theoretical modeling of systems such as enzyme active sites should include atomic polarizabilities. Although no successful algorithm for treating polarizability explicitly in proteins has been developed, serious attempts to use experimentally determined polarizabilities when modeling proteins are underway and these data will soon be included explicitly in theoretical studies of protein stability and dynamics (Warshel and Levitt, 1976; Tapia and Johannin, 1981; Bashford and Karplus, 1988). Honig and co-workers have determined empirically that a choice of $\varepsilon = 2$ for the interior of a protein provides a reasonable approximation for the effects of polarization in proteins when medium- and long-range electrostatic interactions are modeled, but this approximation breaks down when short-range interactions are treated (Gilson et al., 1987; Sharp et al., 1987). The distance dependence of polarizability effects varies as $1/r^4$ and is spatially isotropic.

E. Charge-Dipole Interactions

The distribution of partial electronic charges within an amino acid gives rise to various permanent electronic dipole and higher order multipole
moments in both the residue backbone and the side chain. An electronic dipole is formed when two nearby charges of opposite sign are linked, and the resulting electronic dipole moment [a vector quantity, with units Debye (D)] is a function of the charge magnitudes and their separation distance. In favorable cases, of which the α helix is a well-characterized example, some of these permanent electronic dipole moments (usually denoted by the vector μ) become aligned and their contributions sum to give a large macroscopic dipole moment spanning a substantial portion of the protein. The dipole moment of a water molecule is about 1.84 D and that of a hydroxyl group is 1.51 D. The electronic dipole of a peptide unit is depicted in Fig. 2 and has a dipole moment of magnitude μ = 0.72 e Å = 3.46 D. Therefore, a 10-residue α helix would have a dipole moment of about 34 D. It has been demonstrated that the effect of a typical helix dipole is comparable to placing one-half of a positive unit charge at the amino terminus of the helix (Hol et al., 1978). Analyses of α helix dipoles in proteins have demonstrated that they play an important role in stabilizing charges, and the enthalpy due to a typical ion-helix dipole interaction is estimated to be about -4 kcal/mol. For example, Hol et al. (1978) described negatively charged phosphate moieties bound to the positively charged N termini of α helices in proteins, and they suggest that helix dipoles may play a role in enzyme catalysis. More recently, the problem of α helix dipoles in proteins has been considered in detail by Rogers and Sternberg (1984) and Rogers et al. (1985). In addition, the questions of the involvement of helix macrodipoles in sulfate and phosphate binding to proteins have been discussed (Pflugrath and Quiocho, 1985; Johnson, 1984). Moreover, Baldwin and co-workers have documented the importance of charge-dipole interactions in stabilization of α helices themselves (Shoemaker et al., 1987).
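The arithmetic behind the peptide and helix dipole moments quoted above can be reproduced in a few lines. This is a sketch only: the conversion 1 e Å ≈ 4.803 D is the standard one but is not stated in the text, the function names are hypothetical, and the helix figure assumes perfectly aligned peptide dipoles, so it is an upper bound rather than a measured value.

```python
# ASSUMPTION: 1 e*Å ≈ 4.803 D (standard unit conversion, not quoted in the text).
DEBYE_PER_E_ANGSTROM = 4.803


def e_angstrom_to_debye(mu_e_angstrom):
    """Convert a dipole moment from e*Å to debye."""
    return mu_e_angstrom * DEBYE_PER_E_ANGSTROM


def aligned_dipole_sum(n_units, unit_dipole_debye):
    """Total moment of n perfectly aligned dipoles (an idealized upper bound)."""
    return n_units * unit_dipole_debye


peptide_dipole = e_angstrom_to_debye(0.72)           # peptide unit, from 0.72 e*Å
helix_dipole = aligned_dipole_sum(10, 3.46)          # 10-residue helix
print(f"peptide unit: {peptide_dipole:.2f} D, 10-residue helix: {helix_dipole:.1f} D")
```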
The stabilization of α helices by charge-dipole interactions is particularly interesting. If the helix dipole is a real physical quantity, one would expect that terminal residues with side-chain charges opposite to the dipole partial charges would stabilize the helical conformation. Thus, negatively charged side chains would be stabilizing at the N-terminal end of α helices, whereas positively charged side chains would be stabilizing at the C-terminal end of an α helix. The experiments of Baldwin and colleagues support this hypothesis.

FIG. 2. Electronic dipolar nature of the peptide unit. The numbers adjacent to each atom give the approximate fractional electronic charge attributed to each atom (in units of fundamental electronic charge). The magnitude of the dipole moment is 0.72 e Å = 3.46 D.

Interactions between charges and simple dipoles also occur in proteins, and carboxyl-carboxylate interactions (which are acid salts of monobasic carboxylic acids) were examined by Sawyer and James (1982). Fersht et al. (1985) have estimated the free energy change due to a hydrogen bond involving a charged donor or acceptor to be approximately -3 kcal/mol. The potential energy due to a charge-dipole interaction is

$$V \propto q_1 q \left(\frac{1}{r_+} - \frac{1}{r_-}\right)$$

where $q_1$ is the charge interacting with the dipole, $q$ is the magnitude of the charge at either end of the dipole, $r_+$ is the distance between charge $q_1$ and $+q$, and $r_-$ is the distance between charge $q_1$ and $-q$. Charge-dipole interactions are necessarily weaker than charge-charge interactions because of the cancellation effects of the opposite charges constituting the electronic dipole moment. When the charge $q_1$ is a long distance from the electronic dipole ($\mu$), the potential energy of interaction varies as $1/r^2$ and is given by

$$V \propto \frac{q_1 \mu \cos\theta}{r^2}$$
where the angle θ is defined in Fig. 3. Therefore, the charge-dipole interaction is of shorter range than the charge-charge interaction, which varies as 1/r. The charge-dipole interaction also differs from the charge-charge interaction, because the potential energy of the former interaction depends on the precise orientation of the electronic dipole moment with respect to the nearby charge. This dependence on spatial orientation is a hallmark of all interactions involving at least one dipole or higher order electronic multipole moment, and, as discussed later, directly influences the precise packing of atoms in crystals and in the interiors of proteins.
FIG. 3. Interaction of a positive charge q with an electronic dipole having a dipole moment μ in the coordinate system used to define the energetics of this interaction.
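The relationship between the two-charge expression and its long-distance limit can be checked numerically. The sketch below is illustrative and rests on assumptions: the dipole is modeled as charges +q and -q a distance d apart on the x axis, θ is taken as the angle between the dipole direction (from -q to +q) and the line from the dipole center to the external charge, and the factor 332.06 kcal Å/(mol e²) is the customary conversion, not a value from the text. The function names are hypothetical.

```python
import math

# ASSUMPTION: 332.06 kcal*Å/(mol*e^2) converts e^2/Å to kcal/mol (not from the text).
COULOMB_KCAL = 332.06


def charge_dipole_exact(q1, q, d, r, theta_deg):
    """Exact energy (kcal/mol) of charge q1 with a dipole made of +q and -q.

    The dipole is centered at the origin with +q at x = +d/2 and -q at x = -d/2.
    q1 sits at distance r from the center, at angle theta from the +x axis.
    """
    t = math.radians(theta_deg)
    x, y = r * math.cos(t), r * math.sin(t)
    r_plus = math.hypot(x - d / 2.0, y)
    r_minus = math.hypot(x + d / 2.0, y)
    return COULOMB_KCAL * q1 * q * (1.0 / r_plus - 1.0 / r_minus)


def charge_dipole_far(q1, q, d, r, theta_deg):
    """Long-distance limit: V = q1 * mu * cos(theta) / r^2 with mu = q * d."""
    mu = q * d
    return COULOMB_KCAL * q1 * mu * math.cos(math.radians(theta_deg)) / r**2


# Compare the two forms for a 0.35 e, 1 Å dipole at several separations.
for r in (2.0, 5.0, 20.0):
    exact = charge_dipole_exact(1.0, 0.35, 1.0, r, 0.0)
    far = charge_dipole_far(1.0, 0.35, 1.0, r, 0.0)
    print(f"r = {r:5.1f} Å   exact = {exact:8.4f}   far-field = {far:8.4f} kcal/mol")
```

The two expressions agree closely once r is several times the charge separation d, and both change sign when the charge moves from one end of the dipole to the other, which is the orientation dependence emphasized above.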
FIG. 4. (A) The general case of a dipole-dipole interaction. The angles θ1 and θ2 define the orientation of the two electronic dipole moments μ1 and μ2 with respect to the vector r, which passes from the center of one dipole to the next. The magnitude of the vector is the separation of the two dipoles. (B) The colinear dipole-dipole interaction, the enthalpically optimal case. (C) The antiparallel dipole-dipole interaction, which is often seen between macroscopic helix dipoles in proteins. (D) The parallel dipole-dipole interaction, which is enthalpically unfavorable. (E) The behavior of dipole-dipole interaction enthalpy as a function of the parameter θ1 with θ2 = 0°.
The distance dependence of the charge-dipole interaction is given by $1/r^2$ and is spatially anisotropic.

F. Dipole-Dipole Interactions
Dipole-dipole interactions are still weaker and of shorter range than the charge-dipole interaction. The strength of the interaction depends on the distance ($r$) between the centers of the two dipoles, their dipole moments $\mu_1$ and $\mu_2$, and the angles $\theta_1$ and $\theta_2$ between each electronic dipole moment and the vector r, and is given by

$$V \propto -\frac{\mu_1 \mu_2 \left(2\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2\right)}{r^3}$$

where $\theta_1$ and $\theta_2$ are defined as the angles between the two dipole moments and the vector connecting their centers (see Fig. 4A). The optimum arrangement occurs when the two electronic dipoles are colinear and the positively charged end of one dipole moment is adjacent to the negatively charged end of the other (see Fig. 4B). In this configuration, the potential energy of interaction is given by

$$V \propto -\frac{2\mu_1 \mu_2}{r^3}$$
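The general angular expression can be evaluated directly for the limiting arrangements shown in Fig. 4. The sketch below works in the same arbitrary units as the text and treats the two dipoles as coplanar. The angle assignments used for the side-by-side cases (θ1 = 90° with θ2 = -90° for antiparallel, θ1 = θ2 = 90° for parallel) are an assumption about the sign convention of Fig. 4A, and the function name is hypothetical.

```python
import math


def dipole_dipole(mu1, mu2, r, theta1_deg, theta2_deg):
    """Coplanar dipole-dipole energy in arbitrary units.

    V = -mu1 * mu2 * (2 cos(t1) cos(t2) - sin(t1) sin(t2)) / r^3,
    with t1 and t2 measured from the vector joining the dipole centers.
    """
    t1 = math.radians(theta1_deg)
    t2 = math.radians(theta2_deg)
    angular = 2.0 * math.cos(t1) * math.cos(t2) - math.sin(t1) * math.sin(t2)
    return -mu1 * mu2 * angular / r**3


colinear = dipole_dipole(1.0, 1.0, 1.0, 0.0, 0.0)          # head-to-tail
antiparallel = dipole_dipole(1.0, 1.0, 1.0, 90.0, -90.0)   # side by side, opposed
parallel = dipole_dipole(1.0, 1.0, 1.0, 90.0, 90.0)        # side by side, aligned
tilted = dipole_dipole(1.0, 1.0, 1.0, 25.0, 0.0)           # colinear, one dipole tilted 25°

print(f"colinear {colinear:+.3f}, antiparallel {antiparallel:+.3f}, parallel {parallel:+.3f}")
print(f"tilted 25°: {tilted / colinear:.3f} of the colinear strength")
```

With unit moments and unit separation the colinear case gives -2, the antiparallel case about -1, and the parallel case about +1, and tilting one dipole by 25° retains cos 25°, about 0.91, of the colinear strength.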
By contrast, if the dipole moments are antiparallel to one another (see Fig. 4C), the interaction potential energy is reduced to

$$V \propto -\frac{\mu_1 \mu_2}{r^3}$$

and it becomes repulsive,

$$V \propto \frac{\mu_1 \mu_2}{r^3}$$

if the dipole moments are oriented parallel to one another (see Fig. 4D). Hence, the strength and sign of the dipole-dipole interaction depend critically on the spatial arrangement of the two electronic dipole moments. The angular dependence of the strength of the dipole-dipole interaction for the simple case of $\theta_2 = 0^\circ$ is given by $\cos\theta_1$, illustrated in Fig. 4E. A relatively large angular deviation of $\theta_1 = 25^\circ$ reduces the strength of the interaction by only 10%. Therefore, nonlinear hydrogen bonds are not substantially weaker than the optimal colinear case, and nonlinear hydrogen bonds are often observed in proteins (Baker and Hubbard, 1984). The best-characterized dipole-dipole interaction occurring in proteins is the hydrogen bond, which will be described later because it also involves other electrostatic interactions.

Before continuing, it is interesting to note that interactions between the macroscopic electronic dipoles of secondary structural elements such as α helices and β sheets are thought to play a role in protein structure stabilization and folding. Sheridan et al. (1982) analyzed the common antiparallel arrangement of adjacent α helices in proteins and estimated the resulting enthalpy of stabilization to be between -5 and -7 kcal/mol. Hol et al. (1981) discussed dipoles due to α helices and β sheets and their role in protein folding. They demonstrated that the electrostatic energy of a protein is very sensitive to the relative orientation of secondary structural segments in which the peptide dipoles are regularly arranged, and they suggest that alignment of secondary structure dipoles is significant in determining the tertiary structure of globular proteins.

G. Interactions with Electronic Quadrupoles

The simplest electronic quadrupole is formed when three charges, two like charges of unit magnitude and one unlike charge of twice unit magnitude, are aligned as shown in Fig. 5A.
This group of charges is electrically neutral, and their arrangement has no net electronic dipole moment; but it does have a nonzero quadrupole moment, which is a six-term quantity called a tensor. The quadrupole moment of a given charge distribution [ρ(x, y, z)] is formally defined as a symmetric 3 × 3 matrix, or tensor, the terms of which ($Q_{ij}$) are given by the following integrals:
where V is the volume occupied by the charge distribution function ρ(x, y, z), and dV is the differential element of volume. Electronic quadrupole moments of the aromatic amino acid side chains are substantial. Figure 5B depicts a model of the distribution of partial electronic charges in benzene in which the δ- delocalized π-electron cloud is treated as six discrete negative partial electronic charges located at each of the six carbon nucleus positions. The hydrogen atoms are treated as six positive partial electronic charges located near each of the hydrogen nucleus positions. Values of approximately ±0.15 e for these partial charges have been shown to give appropriate calculated electronic dipole and quadrupole moments for benzene. Quadrupole-quadrupole, charge-quadrupole, and dipole-quadrupole interactions occur frequently in proteins and involve one or more aromatic moieties (vide infra).

FIG. 5. (A) An example of the simplest electronic quadrupole consisting of three point charges. The system of charges can be thought of as consisting of two electronic dipoles, which are arranged in such a manner that the dipole moment of the system is zero, as is the system's net charge. (B) Schematic view of the arrangement of partial electronic charges in the aromatic compound benzene. Although it is known that bonding electrons are located between adjacent carbon atoms and that the δ- π-electron cloud is further delocalized, the partial charge distribution is usually treated as if there are discrete partial electronic charges located at each of the atomic nuclei. Such an arrangement of partial charges has no net charge, no net electronic dipole moment, and can be thought to consist of three simple electronic quadrupoles.

The distance dependence of these three types of interactions with electronic quadrupole moments can easily be calculated and are

$$V \propto 1/r^3 \quad \text{(charge-quadrupole)}$$
$$V \propto 1/r^4 \quad \text{(dipole-quadrupole)}$$
$$V \propto 1/r^5 \quad \text{(quadrupole-quadrupole)}$$
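For discrete partial charges, the statement that the three-charge arrangement and the benzene model carry no net charge and no dipole moment but a nonzero quadrupole moment can be verified by direct summation. The sketch below is hedged in several respects: it uses the traceless convention $Q_{ij} = \sum_k q_k (3 x_{ki} x_{kj} - r_k^2 \delta_{ij})$, which is one common choice and may differ from the authors' definition by normalization; the benzene geometry (ring carbons at 1.40 Å and hydrogens at 2.48 Å from the ring center) is an assumed idealization; and all function names are hypothetical.

```python
import math


def multipole_moments(charges):
    """Net charge, dipole vector, and traceless quadrupole tensor of point charges.

    charges: list of (q, (x, y, z)) with q in units of e and coordinates in Å.
    ASSUMPTION: Q_ij = sum q * (3 * x_i * x_j - r^2 * delta_ij).
    """
    total = sum(q for q, _ in charges)
    dipole = [sum(q * p[i] for q, p in charges) for i in range(3)]
    quad = [[0.0] * 3 for _ in range(3)]
    for q, p in charges:
        r2 = sum(c * c for c in p)
        for i in range(3):
            for j in range(3):
                quad[i][j] += q * (3.0 * p[i] * p[j] - (r2 if i == j else 0.0))
    return total, dipole, quad


# Simplest linear quadrupole: +1, -2, +1 spaced 1 Å apart along z.
linear = [(1.0, (0.0, 0.0, -1.0)), (-2.0, (0.0, 0.0, 0.0)), (1.0, (0.0, 0.0, 1.0))]


def benzene_model(delta=0.15, r_carbon=1.40, r_hydrogen=2.48):
    """Twelve point charges in the xy plane: -delta on each C, +delta on each H."""
    atoms = []
    for k in range(6):
        a = math.radians(60.0 * k)
        atoms.append((-delta, (r_carbon * math.cos(a), r_carbon * math.sin(a), 0.0)))
        atoms.append((+delta, (r_hydrogen * math.cos(a), r_hydrogen * math.sin(a), 0.0)))
    return atoms


for name, system in (("linear", linear), ("benzene", benzene_model())):
    q_net, mu, Q = multipole_moments(system)
    print(f"{name}: charge {q_net:+.2f}, |dipole| {math.hypot(*mu):.2e}, Qzz {Q[2][2]:+.3f}")
```

In this convention the benzene model has a negative component along the ring normal, reflecting negative charge concentrated nearer the ring center and positive charge at the periphery.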
The magnitudes and signs of these interactions also depend on the spatial arrangement of the charges, with enthalpically optimal arrangements bringing unlike charges close to one another and separating like charges. Some examples of interactions involving quadrupole moments are described later in the discussion of the weakly polar interactions.

H. London Forces and Short-Range Electron Shell Repulsion

The term van der Waals interaction traditionally refers to a combination of attractive interactions involving induced electronic multipoles and short-range repulsive interactions due to unfavorable spatial overlap of electron orbitals. Multipole-multipole interactions can exist between neutral molecules that do not possess permanent electronic multipole moments, because rapidly occurring, spatially random fluctuations of electrons in one molecule give rise to a set of instantaneous multipole moments in that molecule, which in turn induce a set of multipole moments in a neighboring molecule. Such interactions between fluctuating multipoles and induced multipoles, called London or dispersion interactions, are always attractive because the multipole moments are aligned antiparallel to one another. Their strength is

$$V \propto -\frac{B}{r^6} - \frac{B'}{r^8} - \frac{B''}{r^{10}} - \cdots$$
where the successive terms refer to induced dipole-dipole interactions ($1/r^6$), induced dipole-quadrupole interactions ($1/r^8$), and induced dipole-octupole and induced quadrupole-quadrupole interactions ($1/r^{10}$), etc. (London, 1937). The inverse sixth power term ($1/r^6$) dominates most calculations, and the higher order terms are usually neglected for computational simplicity. In addition, the strength of the London or dispersion interaction depends on the polarizabilities of the interacting molecules (Slater and Kirkwood, 1931). The dependence on polarizability means that nonpolar atoms like aliphatic C and H have stronger London interactions than polar atoms such as N and O. These attractive dispersion forces are counterbalanced by repulsion of the electronic shells as the interacting atoms approach one another, and this short-range repulsion is usually modeled with a $1/r^{12}$ distance dependence because it is computationally convenient. The Lennard-Jones 6-12 potential function (Jones, 1924),

$$V \propto \frac{A}{r^{12}} - \frac{B}{r^6}$$
which combines an inverse twelfth power-of-distance repulsive term with an inverse sixth power-of-distance attractive term, is a computationally simple method of estimating the effect of the combination of the two interactions, and the parameters A and B have been evaluated from various experimental data. Polarizability affects the magnitude of B. Numerous studies have confirmed the validity of the inverse sixth power attractive potential term and have shown that a variety of exponents from inverse ninth to inverse twelfth are all suitable for modeling the repulsive component due to unfavorable electron cloud overlap (Lifson et al., 1979). An alternative representation of the repulsive term using an exponential function of the form exp(-cr) has also been used to model electron cloud overlap (Lifson et al., 1979).

It has been traditional to define a “van der Waals potential” (which combines Coulomb’s law and the Lennard-Jones 6-12 potential function) and thereby subsume electronic shell repulsion, London forces, and electrostatic interactions under the term “van der Waals interaction.” Unfortunately, the resulting expression is an oversimplified treatment of the electrostatic interactions, which are only calculated between close neighbors and are considered to be spatially isotropic. Both of these implicit assumptions are untrue and do not represent physically realistic approximations. We prefer to use the term “van der Waals distance” for the internuclear separation at which the 6-12 potential function is a minimum (see Fig. 6), the “van der Waals radius” being one-half this value when the two interacting atoms are identical, and explicitly treat the Lennard-Jones and electrostatic terms separately. While the term “van der Waals interaction” may have some value as a shorthand in structure description, it should be avoided when energetics are treated quantitatively.
FIG. 6. Schematic drawing of the shape of the Lennard-Jones 6-12 potential energy versus interatomic distance ( r ) . The equilibrium separation distance occurs at the potential energy minimum and is defined to be twice the van der Waals radius if the two interacting atoms are identical.
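The position and depth of the minimum of the 6-12 function follow directly from its form: setting dV/dr = 0 gives $r_{min} = (2A/B)^{1/6}$ and $V_{min} = -B^2/(4A)$. The sketch below encodes these relations and their inverse. The function names and the sample values (a 3.4 Å van der Waals distance and a 0.1 kcal/mol well depth) are hypothetical illustrations, not parameters from the text or from any particular force field.

```python
def lennard_jones(r, A, B):
    """Lennard-Jones 6-12 energy: V = A / r^12 - B / r^6."""
    if r <= 0:
        raise ValueError("r must be positive")
    return A / r**12 - B / r**6


def lj_minimum(A, B):
    """Return (r_min, V_min) for the 6-12 potential.

    dV/dr = 0 gives r_min = (2A/B)^(1/6) and V_min = -B^2 / (4A).
    """
    r_min = (2.0 * A / B) ** (1.0 / 6.0)
    return r_min, -B * B / (4.0 * A)


def lj_params_from_well(r_min, depth):
    """Invert the relations above: A = depth * r_min^12, B = 2 * depth * r_min^6."""
    return depth * r_min**12, 2.0 * depth * r_min**6


# Hypothetical pair: van der Waals distance 3.4 Å, well depth 0.1 kcal/mol.
A, B = lj_params_from_well(3.4, 0.1)
r_min, v_min = lj_minimum(A, B)
for r in (3.0, 3.4, 4.0, 6.0):
    print(f"r = {r:3.1f} Å   V = {lennard_jones(r, A, B):+.4f} kcal/mol")
```

Half of the recovered r_min is the van der Waals radius in the sense defined above when the two atoms are identical.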
I. Hydrogen Bonds in Proteins

Most discussions of noncovalent interactions in proteins treat the hydrogen bond as a separate entity, but the hydrogen bond is a combination of an electrostatic interaction of the form dipole-dipole and a charge transfer interaction (Umeyama and Morokuma, 1977; Reed and Weinhold, 1983). Although there is some disagreement as to the relative contributions of these two types of interactions, hydrogen bonding can be described phenomenologically as an electrostatic dipole-dipole interaction and such a description is given below. A hydrogen bond is formed by the favorable orientation and proximity of partial electronic charges, as shown below:

$$\mathrm{X}^{\delta-}{-}\mathrm{H}^{\delta+} \cdots \mathrm{Y}^{\delta-}{-}\mathrm{Z}^{\delta+}$$

The electron shell surrounding the hydrogen atom is shifted toward the more electronegative atom (designated X) to which it is covalently bound, thereby giving the hydrogen and its covalent partner X unlike partial electronic charges. A hydrogen bond is said to form when this Xδ--Hδ+ electronic dipole approaches another electronic dipole, as shown schematically above. The Coulombic and London forces are large and attractive, and the short-range repulsion due to unfavorable electron cloud overlap is minimized by withdrawal of the shared electrons away from the hydrogen nucleus toward the more electronegative atom X. As a result, the “donor” X and the “acceptor” Y are held in close proximity, typically less than 3 Å in internuclear separation. For formation of Xδ--Hδ+···Yδ- hydrogen bonds, the distance X···Y should be less than the sum of the X-H covalent bond length and the van der Waals radii of H and Y (Rahim and Barman, 1978). Ab initio quantum mechanical calculations have shown that this interaction can be accurately modeled by combining Coulomb’s law with the Lennard-Jones 6-12 potential if appropriate partial electronic charges are assigned to each of the participants (Reiher, 1985). Each case of the resulting interaction contributes -0.5 to -1.5 kcal/mol to the free energy of stabilization of a protein if the donor and acceptor groups are only partially charged and -3 to -5 kcal/mol if one member of the pair bears a full electronic charge (Fersht et al., 1985). The increase in interaction energy in the case of charged species is consistent with the longer range distance dependence, $1/r^2$, for charge-dipole interactions as opposed to the $1/r^3$ distance dependence for dipole-dipole interactions. The dipole-dipole interaction model of the hydrogen bond shown above also accounts for the importance of the directionality of the hydrogen bond interaction. The energetically optimum arrangement occurs
when the two dipole moments are colinear, and deviations of 25° reduce the enthalpy of the interaction by at least 10%. Multiple hydrogen bonding arrangements from a single donor or to a single acceptor are possible as long as the dipolar alignments are favorable. The term bifurcated hydrogen bond has been used for these multiple arrangements. Bifurcated hydrogen bonds have been described frequently in both protein and small-molecule structural literature. There is abundant evidence both from bond length/bond angle data and from ab initio calculations that they exist. However, the term has often been misused. The term bifurcated hydrogen bond should only be used to describe the case where a single proton is shared by two proton acceptors. Baker and Hubbard (1984) have documented 41 examples of such “true” bifurcated hydrogen bonds in 15 well-refined protein crystal structures. Ninety percent of these involve at least one protein atom as the proton acceptor. The other acceptor is a water oxygen atom or a protein acceptor atom equally often. The term bifurcated hydrogen bond is often used incorrectly to describe a carbonyl oxygen receiving a proton from two distinct, simultaneous donors. This situation is a normal hydrogen bonding arrangement for a carbonyl oxygen, which possesses two lone pair orbitals that can accept protons. It is easy to distinguish this conventional hydrogen bonding situation from true bifurcated hydrogen bonding. The sharing of a single proton by two acceptors should, according to our electrostatic model, weaken the interaction between any pair of them. In the examples cited by Baker and Hubbard (1984), the bond lengths are longer than usually found for conventional hydrogen bonds. By contrast, the case of a carbonyl oxygen atom receiving two protons should not, and does not, reduce the strength of either interaction; bond length data confirm this.
Small-molecule crystal structure data confirm these observations: the mean O···H distance in 304 bifurcated hydrogen bonds in one survey was 2.004 Å compared with a mean distance of 1.899 Å for 1199 simple, two-center hydrogen bonds (Taylor et al., 1983). Hydrogen bonds in biological systems are traditionally thought to involve only oxygen and nitrogen as one or more of the two electronegative participants in the bond, thereby limiting the possible classes of hydrogen bonds to the forms N-H···O, O-H···O, N-H···N, and O-H···N. However, the existence of hydrogen bonds involving carbon as the hydrogen donor atom, such as C-H···O and C-H···N, has been documented using neutron and X-ray crystallography (Taylor and Kennard, 1982).
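The distance criterion stated earlier in this section, that the X···Y separation should be less than the sum of the X-H bond length and the van der Waals radii of H and Y, applies to carbon and sulfur donors as readily as to nitrogen and oxygen. The sketch below writes it as a simple check. The radii are Bondi's commonly tabulated values and the X-H bond lengths are typical textbook figures; neither set comes from the text, and the names are hypothetical. The check covers distance only and says nothing about bond angle.

```python
# ASSUMPTION: Bondi van der Waals radii in Å (not taken from the text).
VDW_RADIUS = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}

# ASSUMPTION: typical covalent X-H bond lengths in Å (illustrative values).
XH_BOND_LENGTH = {"N": 1.01, "O": 0.96, "C": 1.09, "S": 1.34}


def max_donor_acceptor_distance(donor, acceptor):
    """Upper limit on the X...Y separation: d(X-H) + r_vdW(H) + r_vdW(Y)."""
    return XH_BOND_LENGTH[donor] + VDW_RADIUS["H"] + VDW_RADIUS[acceptor]


def satisfies_distance_criterion(donor, acceptor, distance):
    """True if the donor-acceptor distance (Å) is below the limit."""
    return distance < max_donor_acceptor_distance(donor, acceptor)


for donor, acceptor in (("N", "O"), ("O", "O"), ("C", "O"), ("S", "O"), ("N", "S")):
    limit = max_donor_acceptor_distance(donor, acceptor)
    print(f"{donor}-H...{acceptor}: X...Y must be below {limit:.2f} Å")
```

With these assumed values, a C···O separation such as the 3.241 Å reported for the C-8-H···O-5' contact in Fig. 7 falls within the limit for a C-H donor.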
The crystallographic evidence for the existence of these nontraditional hydrogen bonds has been obtained from both small organic compounds and proteins. Taylor and Kennard (1982) tabulated the results of various
high-accuracy neutron and X-ray crystallographic studies of small organic compounds and assembled an unambiguous case that proves that C-H···O and C-H···N hydrogen bonds do in fact occur. By a statistical analysis of 113 neutron diffraction crystal structures, they found that H atoms covalently bound to C have a statistically significant preference to form intermolecular contacts to O atoms rather than to C or H atoms. Such interactions have frequently been appreciated in single-crystal X-ray studies of nucleic acid constituents (Kvick et al., 1974; Taylor and Kennard, 1982; Sundaralingam, 1966; Lai and Marsh, 1972; Burley and Wang, 1987b), and an example detected in the structure of a modified deoxyadenosine compound is illustrated in Fig. 7. A particularly intriguing example of a C-H···O hydrogen bond is found in the crystal structure of uracil (Stewart and Jensen, 1967). One of the two uracil carbonyl groups accepts two N-H···O hydrogen bonds, but the other makes two short contacts to two C-H groups. The C-H groups are arranged in the expected positions to interact with the two lone pair orbitals of the acceptor oxygen. Theoretical methods suggest that each C-H···O interaction in a nucleic acid constituent makes an enthalpic contribution of about -2 kcal/mol to the stability of its three-dimensional structure (Amidon et al., 1975), and the role of carbon as a hydrogen bond donor in nucleic acid
FIG. 7. Stereodrawing of the molecular structure of N-benzoyl-5'-O-tert-butyldimethylsilyl-2'-deoxyadenosine monohydrate drawn with ORTEP II (Johnson, 1967). There is an intramolecular C-H···O hydrogen bond between C-8 and O-5', which is indicated by a solid line drawn between them [C-8-H···O-5' = 3.241(11) Å]. In addition, there is a C-H···O hydrogen bond between C-6 of the benzyl ring and the water of crystallization [C-6(benzyl)-H···O(water) = 3.197(11) Å].
base pairing has been reviewed recently by Ornstein and Fresco (1988). Moreover, the role played by C-H···O and C-H···N interactions in determining molecular packing and conformation has been examined in detail by Berkovitch-Yellin and Leiserowitz (1984), who document that such nontraditional hydrogen bonds are instrumental in fixing packing motifs in a variety of small organic crystals.

Although crystallographic studies of hydrogen bonds in proteins are necessarily less accurate than most single-crystal X-ray work, nonstandard hydrogen bonds identified by Taylor and Kennard (1982) are present in proteins. A set of three carbon-donor hydrogen bonds has been observed in the three-dimensional structure of the complex of NADPH with dihydrofolate reductase and is illustrated in Fig. 14 (Filman et al., 1982; Villafranca et al., 1984). The nicotinamide ring of the cofactor is stabilized in its binding site by C-H···O hydrogen bonds involving isoleucine-13, threonine-45, and alanine-97. C-H···O hydrogen bonds are highly nonlinear, in keeping with their weakness: the mean C-H···O bond angle is 152.7°, a value based on 41 intermolecular neutron diffraction examples (Taylor and Kennard, 1982).

Sulfur, like oxygen, can act as both a hydrogen bond donor and an acceptor. The pKa of the sulfhydryl group in free cysteine is about 8.3 (Tanford, 1962), but this value may be different in the interior of a protein. N-H···S hydrogen bonds have been seen in chymotrypsin (Birktoft and Blow, 1972) and in Pseudomonas aeruginosa ferredoxin (Adman et al., 1973), and a hydrogen bond of the type S-H···O has been observed between the 3’-OH group of the ribose ring of adenosine 5’-triphosphate and cysteine-35 in tyrosyl-tRNA synthetase from Bacillus stearothermophilus (Wilkinson et al., 1983).
An S-H ... O hydrogen bond also seems to anchor the side-chain carboxylate of glutamate-165 to cysteine-126 in yeast triosephosphate isomerase and may play a role in the catalytic mechanism of this enzyme (Davenport, 1986). The sulfur atom of methionine can also act as a hydrogen bond acceptor. An O-H ... S(Met) hydrogen bond has been observed in cytochrome c (Takano and Dickerson, 1981). The small-molecule crystallographic studies described above are without ambiguity and imply that a very broad definition of a hydrogen bond should be adopted. Donohue (1968) argued that these nontraditional hydrogen bonds are somehow “different” from the “normal” hydrogen bond, but the results of an exhaustive analysis of ab initio quantum mechanical calculations of hydrogen bond potential energy surfaces by Reiher (1985) demonstrate that all hydrogen bonding is well modeled by a combination of the Coulombic and 6-12 potential functions. We, therefore, prefer to define the term hydrogen bond operationally as an enthalpically favorable, quasi-linear hydrogen bridge between two negatively polarized nonhydrogen atoms, which yields an internuclear separation between the two nonhydrogen atoms that is less than the sum of the donor-to-hydrogen bond length and the appropriate van der Waals radii. Today, this definition is not widely used by biological chemists and is sharply at odds with the widespread but false belief that only nitrogen and oxygen can engage in hydrogen bonds.
Several recent papers have attempted to define energetically preferred hydrogen bond geometry through the examination of small-molecule and protein crystal structure data bases. These studies have made possible a quantitative description of hydrogen bonding in terms of experimental data, and it is heartening to note that the geometric information is in complete agreement with the electrostatic picture outlined above. Since hydrogen bonding plays the major role in the stabilization of protein secondary structure, we will review the principal conclusions of these papers.
Murray-Rust and Glusker (1984) analyzed the directionality of hydrogen bonding to the most common acceptor, the sp2- and sp3-hybridized oxygen atom, by a search through the Cambridge Crystallographic Data Base for short (less than 3 Å) donor-acceptor atom distances. In all of the O ... H-X systems examined, the largest concentration of hydrogen-bonded X-H groups lay in the direction of the oxygen lone pairs. For ether and epoxide oxygen acceptors, this is in a plane perpendicular to the C-O-C plane; whereas for ketones and esters, it is in the plane of the carbonyl group. Taylor and Kennard (1984) reviewed the geometry of all types of hydrogen bonds commonly found in organic crystals, with particular attention to bond lengths and angles. Their data show that, in the N-H ...
O system that is most commonly found in proteins and is the determinant of secondary structural stability, the average N-H distance is 1.030 Å, the mean H ... O distance is 1.869 Å, and the mean N ... O distance is 2.85 Å. The last two values have standard deviations of about 0.1 Å; the N-H distance has an estimated standard deviation of only 0.015 Å. The bonds are close to linear, but deviations from perfect linearity are common: the mean N-H ... O angle is 161°, with a standard deviation of 12°. The authors note that “the hydrogen bond is largely an electrostatic phenomenon (Umeyama and Morokuma, 1977)... Consequently, the length of a hydrogen bond is highly dependent on the nature of the donor and acceptor atoms; even small changes in their properties may produce significant alterations in the H ... Acceptor distance.” Consistent with the energetic evaluations of Fersht from site-directed mutagenesis, they find that the mean H ... O distance increases as one proceeds from salt bridges to
uncharged donor-acceptor pairs. They also note that bifurcated hydrogen bond lengths are consistently longer than bonds involving single donors. In an interesting correlation with the electrostatic model, Taylor and Kennard observe that long chains of “cooperative bonds” of the form O-H ... O-H ... exhibit shorter H ... O distances than those found in isolated O-H ... O bonds, 1.805(9) Å versus 1.869(23) Å (where the digits in parentheses indicate the uncertainty in the final digits, and typical random errors were estimated to be on the order of 0.005 Å). Such an effect is predicted by the electrostatic picture since the formation of an O-H ... O bond would polarize the electrons of the donor O-H group so as to increase the partial negative charge at the oxygen atom; the atom thus becomes a better potential hydrogen bond acceptor (Kollman, 1977). A distinct preference was found for N-H ... O=C hydrogen bonds to form in or near the directions of the carbonyl oxygen atom lone pairs. This preference was observed for both singly bonded and doubly bonded carbonyl oxygen atoms. In small-molecule structures, all “active” hydrogens tend to be donated; this is a much stronger tendency than for lone pairs to receive hydrogen bonds (Olovsson and Jonsson, 1976).
For our purposes, the most interesting compilation is that made by Baker and Hubbard (1984) on hydrogen-bonding patterns in proteins themselves. In their exhaustive (and exhausting, to quote the authors!) catalog, they used a data base of 15 highly refined protein crystal structures from the Brookhaven Protein Data Bank and attempted to break down the results according to residue type. They concentrated only on the conventional hydrogen bonding groups, but their results are nonetheless of great interest. Although it is impossible to summarize this massive paper in a brief review, some of its chief conclusions may be listed.
Full hydrogen bonding potential is not always expressed because of steric effects of surrounding atoms, but the vast majority of possible bonds are made. Only 11.2% of the main-chain carbonyl groups are not hydrogen bonded; the corresponding figure for main-chain N-H groups is 12.4%. Eighty-five percent of all possible side-chain hydrogen bonds are made. Hydrogen bond geometry in proteins is largely the geometry of the N-H ... O bond, since this bond is the most prevalent one in proteins, as it is in small-molecule structures. The distribution of distances covers a 0.4-Å range, with the mean O ... H value being significantly longer (1.95 Å) than in the case of charged donors and acceptors. N-H ... O distances have a clear cutoff below 1.7 Å for O ... H and 2.6 Å for O ... N, while for O-H ... O hydrogen bonds the minimum O ... O distance is 2.4 Å. The distribution of distances tails off very slowly at the upper end, as a result of the presence of a “second sphere” of neighbors. Crudely, limits of 3.3 Å for the O ... O distance and 3.4 Å for the O ... N distance seem
appropriate; these values will allow future investigators to differentiate between van der Waals interactions and possible hydrogen bonds. These limits will lead to the omission of some weak bifurcated hydrogen bonds, however, and there is probably no totally satisfactory value. C=O ... H angles at the acceptor oxygen atoms cover a wider range than angles at the hydrogens and are centered near 120°, the sp2 lone-pair orientation. Ninety percent of the values are within 40° of the carbonyl group plane, but out-of-plane bonding is common (we particularly like the analogy of the interaction at the acceptor oxygen to a ball and socket joint [Peters and Peters, 1980]). The angle at the acceptor oxygen is uncorrelated with hydrogen bond length. Regarding the linearity of the bond itself, the average N-H ... O angle is about 155°, with 90% lying between 140° and 180°. In other words, deviations from linearity are real and common, but the equilibrium configuration is close to linear, as expected from energy considerations. There is an obvious correlation between linearity as expressed by the N-H ... O angle and the H ... O distance. Hydrogen bond energy is thus insensitive to the angle at the acceptor oxygen but sensitive to the linearity of the bond at large deviations. In the potential energy diagrams of Hagler et al. (1974), the energy is no longer stabilizing at N-H ... O angles of 120°; whereas for the C=O ... H angle at the acceptor oxygen, the energy is essentially unchanged over the range 180°-120°, only starting to rise at about 100°. All of these results are completely consistent with both the electrostatic model for hydrogen bonding and the small-molecule surveys reported earlier. Main-chain hydrogen bonds are in secondary structural features 50% of the time. Forty-six percent of all carbonyl oxygen atoms are hydrogen bonded to other main-chain atoms; the figure for backbone amides is 68%. Only 11% of the C=O or N-H groups make hydrogen bonds to side chains. Water molecules make up the difference.
Of the carbonyl oxygen atoms, 28.2% accept hydrogens from two donors. Side-chain oxygen and nitrogen atoms hydrogen bond with water 52% of the time, with other side-chain oxygen and nitrogen atoms 24% of the time, and with main-chain oxygen and nitrogen atoms 24% of the time. Twenty-four percent of all lysines are not hydrogen bonded, a finding perhaps reflecting the oft-disordered states of these residues in protein crystal structures. Thirty-three percent of the arginines have all three of their N-H groups involved in hydrogen bonding; 35% have only two. The terminal NH2 groups hydrogen bond 71% of the time in arginine. Amino acids with hydroxyl groups (serine, threonine, tyrosine) serve as donors almost 70% of the time. The charged side chains of lysine, arginine, and histidine form salt bridges with carboxylate groups about 25% of the time. Seven carboxylate-carboxylate pairs are found; this structural feature seems to be most common in acid-stable proteins such as the acid proteases. Water molecules are hydrogen bonded to main-chain carbonyl oxygen atoms in 42% of the cases studied; 15% are hydrogen bonded to main-chain N-H groups. Forty-four percent of all bound solvents make at least one hydrogen bond to a side-chain group.
In summarizing the results of their study, Baker and Hubbard list 12 generalizations regarding the role of hydrogen bonds in stabilizing the structures of proteins.
1. Almost all groups capable of forming hydrogen bonds do so. Where groups are not explicitly hydrogen bonded, there are probably disordered solvent molecules around, although a few internal groups are prevented from hydrogen bonding by steric factors. The authors assert that “this drive for all polar groups to be hydrogen bonded, either to protein atoms or to solvent, and for hydrogen bond angles to take characteristic values, is a powerful factor in establishing structural patterns.”
2. Main-chain carbonyl and amide groups are most often hydrogen bonded to each other, generally in elements of secondary structure.
3. All polar side chains have the ability to make more than one hydrogen bond, some as many as four (aspartate, glutamate, glutamine, asparagine) or five (arginine). This property, expressed by internal side chains, leads to the extensive hydrogen-bonded networks that are found in nonrepetitive regions of protein structure. The hydrogen bonds made by these interior residues are relatively short and have bond angles close to the ideal value. Surface side chains have less ideal geometry.
4. Side-chain-main-chain interactions involving mostly serine, threonine, aspartate, and asparagine help to satisfy the hydrogen bond potential of free N-H groups in turns and at the ends of α helices. The authors speculate that these residues may have a role in α helix initiation.
5. Local interactions between side chains and main-chain C=O groups are dominated by serine and threonine in α helices and (less often) in turns. These residues usually make a second hydrogen bond to a carbonyl oxygen atom already bonded to a main-chain amide.
6. Side-chain hydrogen bonds are more common at α helix N termini than at C termini. The α helix dipole is confirmed by a statistical preference for negatively charged side chains at the N termini and positively charged side chains at the C termini. In turns, side-chain hydrogen bonds to free N-H groups are predominantly local; they are long-range to free carbonyl groups. However, in both turns and α helix termini, most free C=O and N-H groups only hydrogen bond to solvent.
7. Carbonyl groups prefer to accept two hydrogen bonds, with C=O ... H angles near 120°.
8. Carbonyl groups in α helices tilt outward to a greater extent than that predicted from models. This may be a reflection of the carbonyl group preference for 120° hydrogen bond angles, since the acceptor bond angle in an idealized α helix would be 150°. Of course, this outward tilt also makes the carbonyl group more accessible for a second hydrogen bonding interaction.
9. α Helices in proteins can kink (for an example, see Ringe et al., 1983). Hydrogen bonds on the outside are longer than those on the inside, and some are broken.
10. The hydrogen bond angles at the carbonyl oxygen atoms in β sheets are similar to those in α helices, but only 18% have two donors (the figure is 36% for C=O groups in α helices), and most of the ones that do are at the ends of strands. Most probably, this effect arises from a combination of the nonpolar environment around most β sheets, the greater rigidity of the β structure itself, and the interdigitation pattern of the β sheet side chains.
11. Despite the often considerable twist in β structures, their hydrogen bond geometry is close to optimal. Parallel and antiparallel β pleated sheets do not differ in this respect.
12. Bound solvent molecules are an integral part of the structures of their proteins. This is seen in the extensive hydrogen bond networks that they form and that bridge protein atoms. Such cross-linking is observed internally as well as externally; many globular proteins contain a number of buried water molecules. The water molecule is unique because it has both double-donor and double-acceptor capability.
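The distance limits and angular ranges quoted above from Baker and Hubbard (1984) and Hagler et al. (1974) suggest a simple geometric screen for candidate hydrogen bonds. The Python sketch below is illustrative only: the function names are hypothetical, and treating the quoted survey values (3.4 Å for N ... O, 3.3 Å for O ... O, and loss of stabilization at a donor angle of 120°) as hard cutoffs is an assumption of the sketch, not a procedure taken from those papers.

```python
import math

# Heavy-atom distance limits (angstroms) quoted in the text; using them as
# hard cutoffs is an assumption of this sketch.
MAX_HEAVY_ATOM_DISTANCE = {("N", "O"): 3.4, ("O", "O"): 3.3}
# Donor-H ... acceptor angle at which the energy is said to stop being stabilizing.
MIN_DONOR_ANGLE_DEG = 120.0


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    ba = [x - y for x, y in zip(a, b)]
    bc = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(ba, bc))
    cos_t = dot / (math.hypot(*ba) * math.hypot(*bc))
    # Clamp to guard against rounding slightly outside [-1, 1]
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def is_candidate_hbond(donor, hydrogen, acceptor,
                       donor_element="N", acceptor_element="O"):
    """Return True if the geometry passes the distance and linearity screens."""
    limit = MAX_HEAVY_ATOM_DISTANCE[(donor_element, acceptor_element)]
    heavy_atom_distance = math.dist(donor, acceptor)
    donor_angle = angle_deg(donor, hydrogen, acceptor)
    return heavy_atom_distance <= limit and donor_angle > MIN_DONOR_ANGLE_DEG


if __name__ == "__main__":
    # A linear N-H ... O arrangement with N ... O = 2.90 angstroms
    print(is_candidate_hbond((0.0, 0.0, 0.0), (1.03, 0.0, 0.0), (2.90, 0.0, 0.0)))
```

Such a screen will, as the text notes for the distance limits, miss some weak bifurcated bonds; it only flags geometries worth closer inspection.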
J. Hierarchy of Electrostatic Interactions
The data presented in preceding sections demonstrate that there is a wide variety of distance and geometric dependences in the electrostatic interactions that occur in proteins. A distance dependence hierarchy of the potential energy of these electrostatic interactions is given below (the symbol θ is included to indicate dependence on spatial arrangement, and the symbol D is included to denote dependence on the relative orientation of two quadrupole moments).

                     Charge     Dipole     Quadrupole   London     Electron repulsion
Charge               ±1/r
Dipole               ±θ/r^2     ±θ/r^3
Quadrupole           ±θ/r^3     ±θ/r^4     ±θD/r^5
London                                                  -1/r^6
Electron repulsion                                                 +1/r^12

This hierarchy demonstrates that electrostatic interactions span a very wide range of distance and geometric dependences. It is interesting to note that dipole-dipole interactions show the same distance dependence as charge-quadrupole interactions even though the dependence on interaction geometry of these two interactions is quite different. Finally, such a hierarchy provides a means of assessing the relative importance of each possible interaction at given atomic separations and interaction geometries.
The general features of hydrogen bonding, charge-charge interactions, London forces, and dipolar interactions have been appreciated for many years, as has their role in protein structure stabilization and ligand-protein interactions. In the past few years, it has become apparent that this is not the whole story. A new class of interactions involving weakly polar aromatic amino acids has been identified and characterized. These interactions are specific, enthalpically comparable to a hydrogen bond, and ubiquitous. They are also of considerable importance.
Consider the interior of a globular protein. Ignoring main-chain interactions, the predominant stabilization would seem to derive from the hydrophobic effect plus a myriad of London forces between randomly packed hydrophobic side-chain atoms. These include the aromatic side chains, which are commonly regarded as inert to polar interactions. Since neither the hydrophobic effect nor the London dispersion force has a directional term, the geometric arrangement of the side-chain atoms is thought to be dependent only on packing considerations. The interiors of globular proteins are known to be nearly close-packed (Richards, 1977), which would maximize both hydrophobic and London stabilization. And for the aliphatic side chains, this picture is not incorrect.
Although C-H ... O and C-H ... N hydrogen bonds are possible, where the C-H derives from an aliphatic amino acid side chain group, their presence in the interior of a protein is unlikely. Polar side chains that could provide the acceptor atom are not commonly found in protein interiors. This leaves only the peptide carbonyl oxygen atom as a potential acceptor. The free energy of stabilization of this interaction is more favorable if the peptide amide N-H serves as donor to this atom, so the polar groups of the backbone are usually hydrogen bonded to each other in the centers of proteins, and London forces dominate the interactions involving aliphatic C-H groups. But the aromatic side chains are different. The geometric pattern of their interactions, both with each other and with polar groups in the interior, is governed by weakly polar forces. These interactions are not random, and the random packing of aliphatic residues does not take precedence over them. We shall now treat these interactions in detail.
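To make the distance-dependence hierarchy of Section II,J concrete, the following Python sketch compares how quickly the radial term 1/r^n of each interaction class decays when the separation is doubled. It is a minimal illustration only: the dictionary and function names are hypothetical, the example separations (3.5 and 7.0 Å) are arbitrary, and the angular factors (θ, D) and all prefactors are deliberately ignored, so the numbers describe relative falloff and not absolute energies.

```python
# Exponent n of the leading 1/r^n distance dependence for each interaction
# class, as listed in the hierarchy of Section II,J.
DISTANCE_EXPONENT = {
    "charge-charge": 1,
    "charge-dipole": 2,
    "dipole-dipole": 3,
    "charge-quadrupole": 3,
    "dipole-quadrupole": 4,
    "quadrupole-quadrupole": 5,
    "London dispersion": 6,
    "electron repulsion": 12,
}


def relative_falloff(interaction, r_near, r_far):
    """Factor by which the radial term 1/r^n shrinks on going from r_near to r_far."""
    n = DISTANCE_EXPONENT[interaction]
    return (r_near / r_far) ** n


def falloff_table(r_near=3.5, r_far=7.0):
    """Relative falloff of every interaction class between two separations."""
    return {name: relative_falloff(name, r_near, r_far) for name in DISTANCE_EXPONENT}


if __name__ == "__main__":
    for name, factor in falloff_table().items():
        n = DISTANCE_EXPONENT[name]
        print(f"{name:22s} 1/r^{n:<2d} retains {factor:.6f} of its magnitude")
```

Doubling the separation halves a charge-charge term but reduces a London term sixty-four-fold, which is one way of seeing why the longer-range terms can matter at separations where dispersion is already negligible.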
III. WEAKLY POLAR INTERACTIONS IN PROTEINS
A. Aromatic Amino Acids
The amino acids phenylalanine, tyrosine, and tryptophan are traditionally grouped because their side chains are aromatic. Phenylalanine has a phenyl group distal to the β-carbon atom, tyrosine has a phenolic group distal to the β-carbon atom, and tryptophan has an indole moiety distal to the β-carbon atom. These aromatic amino acids demonstrate a characteristic segregation of partial electronic charges like that of benzene, which is depicted in Fig. 5B. This segregation is a product of the presence of double bonds between ring carbon atoms, which give rise to a δ- π-electron cloud covering the face of the aromatic ring and δ+ hydrogen atoms bound to each of the ring carbons and occupying the edge of the planar structure. Estimates of the partial electronic charges give typical values of about ±0.15 electrons in the vicinity of each hydrogen nucleus and each carbon nucleus, respectively (positive at the hydrogens, negative at the carbons). This localization of partial electronic charges to the ring carbon atom nuclei is an approximation of convenience only. It is well known from very high-resolution single-crystal X-ray studies of small organic compounds that there is substantial concentration of electrons not only in the vicinity of the carbon nuclei but also between adjacent carbon nuclei, where the ring bonding electrons reside. Although it is impossible to be certain that the values chosen for the partial electronic charges attributed to the constituents of benzene are appropriate, the requirement that they reproduce the experimentally determined molecular dipole and quadrupole moments represents a useful way of restricting the number of available choices.
The space surrounding any planar molecule can be completely described using the standard right-handed polar coordinate system illustrated in Fig. 8. The position of each point in the space is labeled with three orthogonal coordinates (r, θ, φ), which are defined in Fig. 8.
The center of mass of the six-membered aromatic ring is placed at the origin, with its 6-fold symmetry axis colinear with the z-axis, and the Cβ-Cγ bond colinear with the x-axis (see Fig. 8). The angle θ can vary between 0° and 90°, and φ can vary between 0° and 360°. Nearby atoms such as N, O, and S can be located with the polar coordinates (r, θ, φ). A nearby aromatic amino acid side chain can be described by the position of its centroid (r, θ, φ) and the angle between the two planes defined by the two aromatic rings (D), which can vary between 0° and 90°.
In the next four subsections, we shall describe the results of statistical surveys of single-crystal X-ray and protein crystallographic data bases of atomic coordinates. These surveys document the existence of statistically significant preferences for certain three-dimensional geometric arrangements of four different chemical groups found near aromatic amino acid side chains, thereby elucidating geometric patterns that are likely to arise from energetically important noncovalent interactions. We note that positive results of statistical surveys do not in themselves prove the existence of an energetically favorable noncovalent interaction. However, when complemented by corroborative ab initio quantum mechanical calculations of model systems, they provide strong evidence for both the existence and the importance of such energetically favorable noncovalent interactions.
FIG. 8. Coordinate axes for definition of the right-handed polar coordinate system (r, θ, φ), which was used to analyze the locations of O, S, and N atoms with respect to a reference aromatic ring (left) or the locations of other aromatic rings (right) with respect to a reference aromatic ring. The center of mass of the six-membered reference ring was placed at the origin, with its 6-fold symmetry axis colinear with the z-axis. The Cβ-Cγ bonds of phenylalanine and tyrosine were made colinear with the x-axis. Tryptophan was positioned with the vector connecting the atoms Cε3 and C, parallel to the x-axis. The geometry of interaction of an atom near the reference aromatic ring is completely specified by the three polar coordinates (r, θ, φ). The geometry of interaction of an aromatic ring near the reference aromatic ring is completely specified by the three polar coordinates (r, θ, φ) of the ring centroid and the angle (D) between the two planes defined by the two planar aromatic rings.
What pattern of geometric arrangements would we expect if only random packing forces determined the interactions of chemical groups with aromatic amino acid side chains in globular proteins? The case of an interacting atom, such as N, O, or S, is relatively simple and can readily be derived from geometric considerations. The distribution of distance between nearby atoms of all chemical types and the centroid of the aromatic amino acid side chain is shown in Fig. 9 and represents the spatial arrangement of atoms near aromatic amino acid side chains in globular proteins under a random close-packing regime. Peaks in such a distance distribution function computed for a given atom type will document violation of the random close-packing regime. A similar argument holds for the distribution of aromatic amino acid side chains in the vicinity of other aromatic amino acid side chains, although there the geometric considerations are somewhat more complex.
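The ring-centered coordinate system described above can be sketched in a few lines of Python. This is a hedged illustration, not the procedure used in the surveys: the function names are hypothetical, the ring is assumed to be planar, the x-axis is assumed to point from the ring centroid toward Cγ (so that the Cβ-Cγ bond lies near φ = 0°), and θ is folded into the range 0°-90° so that the two faces of the ring are treated as equivalent. Because the sense of the ring normal depends on the order in which the ring atoms are supplied, the handedness of φ in this sketch depends on that order as well.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    norm = math.sqrt(_dot(a, a))
    return tuple(x / norm for x in a)


def ring_frame(ring_atoms, c_gamma):
    """Return (centroid, x_axis, y_axis, z_axis) for a planar six-membered ring."""
    centroid = tuple(sum(c) / len(ring_atoms) for c in zip(*ring_atoms))
    # Ring normal from two centroid-to-atom vectors (assumes a planar ring)
    z_axis = _unit(_cross(_sub(ring_atoms[0], centroid),
                          _sub(ring_atoms[1], centroid)))
    # x-axis: centroid -> C-gamma, with any out-of-plane component removed
    v = _sub(c_gamma, centroid)
    v = tuple(vi - _dot(v, z_axis) * zi for vi, zi in zip(v, z_axis))
    x_axis = _unit(v)
    y_axis = _cross(z_axis, x_axis)
    return centroid, x_axis, y_axis, z_axis


def polar_coordinates(atom, ring_atoms, c_gamma):
    """Return (r, theta, phi) of an atom; r in input units, angles in degrees."""
    centroid, x_axis, y_axis, z_axis = ring_frame(ring_atoms, c_gamma)
    d = _sub(atom, centroid)
    r = math.sqrt(_dot(d, d))
    x, y, z = _dot(d, x_axis), _dot(d, y_axis), _dot(d, z_axis)
    # theta = 0 above the ring face, 90 in the ring plane (both faces equivalent)
    theta = math.degrees(math.acos(min(1.0, abs(z) / r)))
    phi = math.degrees(math.atan2(y, x)) % 360.0
    return r, theta, phi
```

For an idealized hexagon lying in the xy plane with Cγ on the positive x-axis, an atom 5.0 Å from the centroid in the ring plane along the y-axis would be reported at θ = 90° and φ = 90°, and an atom directly above the ring centroid at θ = 0°.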
FIG. 9. Normalized distance distribution function of the polar coordinate r for all protein atom-aromatic ring contacts as defined by the coordinate system depicted in Fig. 8. Horizontal axis: Distance (Å), 0 to 10.
Once a statistically significant preference for a given separation distance has been identified, distance criteria for further investigation of this unexpected proximity can be established. The subset of cases is then examined for any relationship between the chemical constituents of the aromatic ring and the positions of the nearby atom of interest (θ, φ). When the case of nearby aromatic rings is considered, there are three important geometrical parameters to consider. These parameters are the location of the nearby aromatic ring centroid, given by (θ, φ), and the angle between the two planar molecules, denoted by D. The distribution of values of the polar coordinate angle θ that would be expected under the random close-packing regime can be derived simply by two independent methods. One is of importance in the history of mathematics. In the eighteenth century the French polymath Comte de Buffon posed and solved a problem, now known as Buffon's needle problem, which involves calculating the probabilities of various geometric aspects of a needle falling on a striped surface (Buffon, 1801). The second method is more prosaic, but provides the same result. Geometric considerations of solid angle dictate that the behavior of θ expected under conditions of random packing will follow the simple trigonometric function sin θ. Therefore, if the random close-packing regime were to apply to the distribution of the angle θ, the majority of atoms near aromatic amino acid side chains would be found in the vicinity of the ring hydrogen atoms. Significant deviations of the observed distribution of the variable θ from sin θ would suggest that factors in addition to random close packing dictate the locations of these nearby atoms. The expected distribution of the angle between two randomly arranged planar groups is also given by the trigonometric sine function. Therefore, if the distribution of interplanar angle between two planar molecules (D) deviates substantially from sin(D), it suggests that factors other than random close packing are involved in determining the relative orientations of nearby planar molecules. It is important to note that the preceding discussion is only strictly true for planar molecules of zero thickness. However, the distribution of values of the parameter θ for aliphatic carbon atoms in the vicinity of aromatic amino acid side chains, which are thought to obey the random close-packing regime, closely approximates the trigonometric function sin θ. The distribution of values of the equatorial angle φ is not determined by intrinsic geometric considerations, and all possible values of φ are equally probable. However, unfavorable spatial overlap of electron orbitals determines the observed distribution of the variable φ. Thus, the distribution of values of φ is nearly uniform over its entire range (0° < φ < 360°) with the following notable exceptions: the presence of the Cβ-Cγ bond in tyrosine, phenylalanine, and tryptophan makes values of φ = 0° unlikely, and φ = 180° in tyrosine is partially blocked by the phenolic hydroxyl group. Finally, the five-membered ring of tryptophan partially blocks values of φ between 180° and 360°. The most widely used test for comparing observed and expected frequencies (or for examining the "goodness of fit" of the sample to the expected distribution) is called the χ² test; it is defined by
$$\chi^2 = \sum_i \frac{[F(i) - N f(i)]^2}{N f(i)}$$
where N is the sample size; f(i), the predicted frequency; and F(i), the observed frequency. The results of the χ² calculation are interpreted with the aid of a set of χ² tables, which attribute to each value of χ² for a given number of degrees of freedom a probability that by chance the value of χ² will be larger because of random fluctuations. Therefore, if a very large value of χ² for a given number of degrees of freedom is obtained, it is very unlikely that this value occurred by chance, and the sample probably does not fit the predicted distribution (Bevington, 1969).
Before presenting the results of statistical surveys, it is important to note that there is some danger that studies of single-crystal X-ray crystallographic data bases may not yield exactly the same findings as identical studies of protein crystal structures. Modern single-crystal X-ray structures are of extremely high accuracy, and the ratio of observations to parameters varied during least-squares refinement of the structure typically falls between 5 and 10 (Sheldrick, 1976). Unfortunately, this is rarely the case in protein crystallography because protein crystals do not often diffract to atomic resolution. Instead, restrained least-squares refinement
of protein structure is usually employed (Konnert, 1976; Hendrickson and Konnert, 1980; Konnert and Hendrickson, 1980). Addition of restraints to quantities such as bond length, bond angle, and peptide bond dihedral angle, together with minimization of close van der Waals contacts, effectively increases the ratio of the number of observations to the number of parameters varied in the refinement. However, application of such restraints may yield something other than the “true” average X-ray structure of the protein because the restraints represent bias. For example, Teeter (1985) analyzed the case of crambin and found evidence that the restraints on certain bond lengths were not appropriate for crambin.
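The random close-packing expectation for θ and the χ² goodness-of-fit test discussed in this subsection can be combined in a short numerical sketch. The Python below is illustrative only. It assumes the common form of the statistic in which observed bin counts F(i) are compared with expected counts N f(i), it models random packing by directions distributed isotropically about the ring centroid (folded onto 0°-90°), and all function names and the choice of nine 10° bins are hypothetical.

```python
import math
import random


def expected_fractions(n_bins):
    """Fraction of contacts expected per theta bin (0-90 deg) if P(theta) ~ sin(theta)."""
    edges = [math.radians(90.0 * k / n_bins) for k in range(n_bins + 1)]
    # The integral of sin(theta) over a bin is cos(lo) - cos(hi); the total over 0-90 deg is 1
    return [math.cos(lo) - math.cos(hi) for lo, hi in zip(edges, edges[1:])]


def histogram_theta(thetas_deg, n_bins):
    """Count angles (degrees, 0-90) in n_bins equal-width bins."""
    counts = [0] * n_bins
    width = 90.0 / n_bins
    for t in thetas_deg:
        counts[min(int(t / width), n_bins - 1)] += 1
    return counts


def chi_square(observed_counts, predicted_fractions):
    """chi^2 = sum_i [F(i) - N f(i)]^2 / (N f(i)), with N the total sample size."""
    n = sum(observed_counts)
    return sum((obs - n * f) ** 2 / (n * f)
               for obs, f in zip(observed_counts, predicted_fractions))


def random_thetas(n, rng):
    """Folded polar angles of n isotropically distributed directions."""
    # For isotropic directions |cos(theta)| is uniform on [0, 1]
    return [math.degrees(math.acos(rng.random())) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n_bins = 9
    counts = histogram_theta(random_thetas(20000, rng), n_bins)
    print("chi-square vs. sin(theta):", chi_square(counts, expected_fractions(n_bins)))
    # A flat distribution in theta is a poor fit to sin(theta)
    print("chi-square, flat counts:  ", chi_square([1000] * n_bins, expected_fractions(n_bins)))
```

With nine bins there are eight degrees of freedom; a χ² value far above the tabulated critical value for that number indicates that the sample does not follow sin θ, which is the signal sought in the surveys described in the following subsections.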
B. Oxygen-Aromatic Interactions
The oxygen-aromatic interaction in proteins was first described by Thomas et al. (1982), who analyzed the atomic environments of 170 phenylalanine residue aromatic rings from 28 well-determined protein structures. The spatial distribution of atom types in the aromatic ring environment demonstrates a statistically significant preference for the δ- oxygen atoms to be found in the plane of the aromatic ring near the δ+ hydrogen atoms. Statistical methods were used to calculate an apparent net free energy change of about -1 kcal/mol favoring location of the oxygen in the plane of the aromatic ring. In addition, ab initio quantum mechanical calculations using a model system consisting of benzene and formamide in vacuo were used to analyze the shape of the oxygen-benzene interaction energy surface. A global enthalpy minimum of about -2.5 kcal/mol was found when the formamide carbonyl oxygen was positioned in the plane of the aromatic ring 5.0 Å from the benzene centroid and 2.5 Å from the nearest hydrogen atom, which corresponds to r = 5.0 Å and θ = 90° in the polar coordinate system described earlier and illustrated in Fig. 8. The sum of appropriate bond lengths (aromatic C-C = 1.426 Å, aromatic C-H = 1.084 Å) and van der Waals radii (H = 1.2 Å, O = 1.4 Å) gives 5.1 and 2.6 Å for the van der Waals distances between the benzene ring centroid and O and between H and O, respectively. Hence, calculations using the ab initio model of the oxygen-aromatic interaction suggest that the spatial arrangement of interacting atoms resembles a hydrogen bond. Similar findings were obtained with a water-benzene model system. These theoretical results are supported by simple physicochemical data. Water is more soluble in benzene than in cyclohexane, a fact that implies an enthalpically favorable interaction that depends on the aromatic nature of the former. Further test calculations were performed with an ethane-benzene model system.
These data established that aliphatic amino acid side chains are unable to compete with the oxygen atom for the edge of the aromatic ring of the phenylalanine side chain. This result has important and obvious implications for the way amino acid side chains arrange themselves in the interior of proteins. It suggests that random close packing of nonpolar groups cannot dictate completely the internal geometry. Moreover, a carbonyl oxygen-aromatic interaction does not preclude formation of at least one hydrogen bond. The carbonyl oxygen has two lone pairs of electrons and can, therefore, make one hydrogen bond and one oxygen-aromatic interaction (Finney et al., 1980). If these interactions are real, they should be apparent in simple organic solids, where intramolecular associations can be determined by crystal structure analysis. Gould et al. (1985) examined the Cambridge Crystallographic Data Base for interactions between oxygen atoms and phenyl rings in a set of high-accuracy, small-molecule single-crystal X-ray structures. They also detected a statistically significant preference for oxygen atoms to be located in the plane of the aromatic ring, where the δ- oxygen atom approaches the δ+ ring hydrogen atoms (see Fig. 10). Therefore, analyses of two independent crystallographic data bases, one of extremely high accuracy, demonstrate that the distribution of positive and negative partial charges in aromatic groups in proteins influences the positions of nearby δ- or negatively charged oxygen atoms. Ab initio quantum mechanical calculations provide an estimate of about -1 to -2.5 kcal/mol for the enthalpic contribution of this favorable disposition of unlike partial charges. In addition, the spatial distribution of such enthalpically favorable oxygen-aromatic interactions in proteins suggests
FIG. 10. Stereodrawing showing the oxygen environment of 26 crystallographically determined phenylalanine side chains. The oxygen atoms show a statistically significant preference for the plane of the aromatic ring, which corresponds to θ = 90° in the coordinate system defined in Fig. 8. The apparent free energy difference between an oxygen lying in the plane of the ring and one lying over the face of the ring was determined to be about -1.1 kcal/mol at 298 K. Reproduced with permission from Gould et al. (1985).
FIG. 11. Diagrammatic view of contacts between ethacrynic acid and deoxyhemoglobin A. The ligand is covalently bound to cysteine-93 of the β chain. ⊕ and ⊖ indicate fractional charges due to the electronic dipolar nature of the o-dichlorobenzene moiety. Broken lines indicate hydrogen bonds and dotted lines van der Waals contacts. Oxygen-aromatic interactions involving aspartate-94β and cysteine-93β are shown schematically. Reproduced with permission from Perutz et al. (1986).
that they are involved in stabilizing protein tertiary structure and are not dominated by optimal packing considerations. The distance dependence of the oxygen-aromatic interaction varies as 1/r⁴ for an uncharged oxygen atom and 1/r³ for carboxylate oxygen, and both are spatially anisotropic.
Oxygen-aromatic interactions also play a role in protein-ligand binding. Perutz et al. (1986) described a series of crystallographic structures of drugs and peptides bound to deoxyhemoglobin A (Hb A). The following oxygen-aromatic interactions were detected by Perutz and co-workers: the carbonyl oxygen atoms of aspartate-94β and cysteine-93β with the phenyl ring of ethacrynic acid and the Oδ− of aspartate-94β with the phenyl ring of ethacrynic acid (see Fig. 11), the Oε− atom of glutamate-22β with the phenyl ring of ethacrynic acid (see Fig. 12), and the carbonyl oxygen atom of leucine-96β with one of the indole moieties of succinyl-L-tryptophanyl-L-tryptophan (see Fig. 13). Determination of the three-dimensional structure of dihydrofolate reductase revealed that the nicotinamide ring is bound in part to the protein by oxygen-aromatic interactions (see Fig. 14). C-2 is 3.15 Å from the carbonyl oxygen of isoleucine-13, C-4 and C-5 are 3.25 and 3.22 Å from the carbonyl oxygen of alanine-97, respectively, and C-6 is 3.28 Å from Oγ of threonine-45.
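The apparent free energies quoted in this section for the oxygen-aromatic interaction (about -1 kcal/mol from the protein survey, and about -1.1 kcal/mol at 298 K in Fig. 10) were derived statistically from observed atom distributions. As a rough guide to the scale of such numbers, the sketch below assumes a simple Boltzmann relation between a free energy difference and a relative population, ΔG = -RT ln(ratio). This generic relation is an assumption made for illustration and is not necessarily the exact statistical procedure used in the cited studies; the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def population_ratio(delta_g_kcal, temperature_k=298.0):
    """Relative population (favored : reference) implied by delta_g.

    Assumes a plain Boltzmann distribution between two alternatives.
    """
    return math.exp(-delta_g_kcal / (R_KCAL * temperature_k))


def apparent_free_energy(ratio, temperature_k=298.0):
    """Inverse relation: free energy difference implied by a population ratio."""
    return -R_KCAL * temperature_k * math.log(ratio)


if __name__ == "__main__":
    ratio = population_ratio(-1.1)
    print(f"in-plane : over-face population ratio = {ratio:.1f}")
```

Under this assumption, a difference of -1.1 kcal/mol at 298 K corresponds to roughly a sixfold preference for the favored position, which illustrates how a weak interaction can still produce a clearly nonrandom distribution.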
FIG. 12. Diagrammatic view of contacts between ethacrynic acid and deoxyhemoglobin A. The ligand is covalently bound to histidine-117 of the β chain. Broken lines indicate hydrogen bonds and dotted lines van der Waals contacts. An oxygen-aromatic interaction involving glutamate-22β is shown schematically. Reproduced with permission from Perutz et al. (1986).
FIG. 13. Schematic view of the oligopeptide succinyl-L-tryptophanyl-L-tryptophan bound between two deoxyhemoglobin A molecules in the crystal lattice. Broken lines indicate hydrogen bonds and dotted lines van der Waals contacts. The protein-ligand interactions include oxygen-aromatic interactions involving leucine-96β and an aromatic-aromatic interaction involving phenylalanine-41β. Reproduced with permission from Perutz et al. (1986).
FIG. 14. Nicotinamide-binding site of Lactobacillus casei dihydrofolate reductase. NADPH is indicated by its van der Waals surface. The oxygen atoms of isoleucine-12, threonine-45, and alanine-97 that are involved in enthalpically favorable oxygen-aromatic interactions with NADPH are labeled.
Similar interactions have been observed in small-molecule nucleic acid crystal structures (see Fig. 7), and enthalpies of between -1.1 and -2.3 kcal/mol have been calculated for these cases by ab initio quantum mechanical methods (Amidon et al., 1975). Thus, an impressive body of evidence, both theoretical and from examination of protein and small-molecule structure data bases, indicates that oxygen atoms and aromatic rings can interact in a highly nonrandom fashion. This interaction, which may be thought of as a type of C-H···O hydrogen bridging interaction for simplicity, can aid in binding small molecules to proteins. It can also, by inference, aid in the stabilization of the folded structure of a protein. If oxygen can form such enthalpically favorable contacts with aromatic residues, one might expect other polar atoms to do so as well. We shall consider sulfur next.
C. Sulfur-Aromatic Interactions
Reid et al. (1985) first described sulfur-aromatic interactions in proteins, which occur between the sulfur atoms of methionine and cysteine and the aromatic side chains of the amino acids phenylalanine, tyrosine, and tryptophan. They examined the geometry of sulfur-aromatic interactions in globular proteins using a crystallographic data base of 36 proteins, solved to 2 Å resolution or better, and documented a statistically significant preferred separation distance of <6 Å between the sulfur atom and the centroid of the aromatic ring. In addition, the sulfur-aromatic interaction geometries display a statistically significant preference for close approach of the δ− sulfur atom to the δ+ ring hydrogen atoms and a corresponding avoidance of the δ− π-electron cloud of the aromatic group, which corresponds to r < 6 Å and θ = 90° in the polar coordinate system.
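The polar coordinates used throughout this section can be computed from atomic positions. The sketch below is illustrative only: it assumes that r is the distance from the ring centroid and that θ is measured from the ring normal, which is consistent with θ = 90° for in-plane positions as stated in the text, and it folds the two faces of the ring together. The function name and the idealized ring coordinates are hypothetical.

```python
import math


def ring_polar_coordinates(ring_atoms, atom):
    """Return (r, theta_deg) of `atom` relative to a planar ring.

    r is the distance from the ring centroid; theta is the angle between the
    centroid-to-atom vector and the ring normal, folded into 0-90 degrees.
    Assumes the ring is planar and the atom is not at the centroid.
    """
    n = len(ring_atoms)
    centroid = [sum(p[i] for p in ring_atoms) / n for i in range(3)]
    # Two in-plane vectors from the centroid define the ring normal.
    u = [ring_atoms[0][i] - centroid[i] for i in range(3)]
    v = [ring_atoms[1][i] - centroid[i] for i in range(3)]
    normal = [u[1] * v[2] - u[2] * v[1],
              u[2] * v[0] - u[0] * v[2],
              u[0] * v[1] - u[1] * v[0]]
    d = [atom[i] - centroid[i] for i in range(3)]
    r = math.sqrt(sum(x * x for x in d))
    normal_length = math.sqrt(sum(x * x for x in normal))
    cos_theta = abs(sum(a * b for a, b in zip(d, normal))) / (r * normal_length)
    theta_deg = math.degrees(math.acos(min(1.0, cos_theta)))
    return r, theta_deg


# Idealized benzene ring in the xy plane (regular hexagon, C-C = 1.426 Å).
RING = [(1.426 * math.cos(math.radians(60 * k)),
         1.426 * math.sin(math.radians(60 * k)), 0.0) for k in range(6)]

in_plane = ring_polar_coordinates(RING, (5.0, 0.0, 0.0))   # approach at the ring edge
over_face = ring_polar_coordinates(RING, (0.0, 0.0, 3.5))  # approach over the ring face
```

In this convention an atom approaching the ring hydrogens in the plane has θ near 90°, whereas an atom sitting over the π-electron cloud has θ near 0°.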
The sum of appropriate bond lengths (aromatic C-C = 1.426 Å, aromatic C-H = 1.084 Å) and van der Waals radii (H = 1.2 Å, S = 1.85 Å) gives 5.56 Å for the van der Waals distance between the benzene ring centroid and S when θ = 90° and the sulfur atom is located in the same plane as the aromatic ring. This distance is slightly greater than the position of the peak in the observed distribution of sulfur-aromatic ring separation distances calculated by Reid et al. (1985), which occurs at about r = 5.0 Å. Hence, the observed distance distribution function for the sulfur-aromatic interaction suggests that the spatial arrangement of interacting atoms resembles a hydrogen bond. The sulfur-aromatic interaction has a distance dependence that varies as 1/r⁴ for an uncharged sulfur atom and 1/r³ for S⁻ and is spatially anisotropic. This pattern of interaction is identical to that of the oxygen-aromatic interaction and is also thought to be due to an enthalpically favorable interaction between unlike partial charges. The sulfur-aromatic interaction, like the oxygen-aromatic interaction, must contribute to protein structure stability. It is a particularly common feature of the three-dimensional structure of the eye lens protein γ-crystallin solved by Blundell et al. (1981) (see Fig. 15); and in this structure, the sulfur-aromatic interaction is also said to play an important role in protecting the protein from radiation damage (Summers et al., 1984). We presume that this protection arises because the sulfur-aromatic interaction reduces the solvent accessibility of the sulfur atoms and, in turn, the number of free radicals that can react with the cysteine sulfur atoms. Both cysteine and methionine residues are particularly vulnerable to chemical attack by free radicals of radiolytic origin (Burley et al., 1988b). Thus, sulfur atoms can, and do, make nonrandom interactions with aromatic rings in proteins. One is led to conclude that the fabled nonpolarity of aromatic groups, which is reflected in their being found predominantly in the interiors of proteins, does not preclude their participation in bonding that can best be described as weakly polar in origin.
FIG. 15. A pair of sulfur-aromatic interactions in γ-crystallin. The Sγ atom is within van der Waals contact of the ring hydrogen atom of Cη2 of tryptophan-68 and the ring hydrogen atom of Cδ1 of phenylalanine-5.
D. Aromatic-Aromatic Interactions
The weakly polar nature of aromatic residues that leads to their interaction with oxygen and sulfur atoms suggests that they ought to be able to interact with themselves as well. The positively polarized hydrogen atoms of one ring could interact with the δ− π-electron cloud of a second aromatic ring. This would produce an edge-to-face interaction, as opposed to the face-to-face π-π stacking of the rings. Ring stacking is often observed between heterocyclic aromatic molecules, such as tryptophan and the isoalloxazine ring system of the flavin chromophore in flavodoxin (Smith et al., 1983). Monocyclic aromatic rings are more often found in nearly perpendicular arrangements, of which crystalline benzene is the best-known example (Cox et al., 1958). It is unlikely that this geometric arrangement could arise as the result of random orientation, since the weak but favorable enthalpy of stacking, even in the monocyclic case, should win out over optimization of packing. In other words, if simple aromatic pairs are not found in the parallel stacked arrangement, it is probably because they are making more energetically favorable interactions. Aromatic-aromatic interactions in proteins were first examined by Burley and Petsko (1985). Further analyses of this interaction have been made by Singh and Thornton (1985) and Burley and Petsko (1986a). These interactions occur between the side chains of the aromatic amino acids phenylalanine, tyrosine, and tryptophan and between the aromatic moieties of ligands and these aromatic amino acid side chains. Geometric analyses of a crystallographic data base of 33 protein structures documented that aromatic groups in proteins pair with centroid separations of between 3.4 and 6.5 Å (Burley and Petsko, 1985).
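The two quantities used to describe an aromatic pair, the centroid separation and the interplanar angle, can be obtained from ring coordinates as in the following sketch. It is a minimal illustration that assumes planar rings and idealized benzene geometry; the helper names are hypothetical, and the 3.4 to 6.5 Å window is simply the range of centroid separations quoted above.

```python
import math


def _centroid(ring):
    n = len(ring)
    return [sum(p[i] for p in ring) / n for i in range(3)]


def _normal(ring):
    """Unit normal of a planar ring from two centroid-to-atom vectors."""
    c = _centroid(ring)
    u = [ring[0][i] - c[i] for i in range(3)]
    v = [ring[1][i] - c[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(x * x for x in n))
    return [x / length for x in n]


def centroid_separation(ring_a, ring_b):
    """Distance between the two ring centroids."""
    return math.dist(_centroid(ring_a), _centroid(ring_b))


def interplanar_angle(ring_a, ring_b):
    """Angle between the two ring planes in degrees, folded into 0-90."""
    cos_angle = abs(sum(a * b for a, b in zip(_normal(ring_a), _normal(ring_b))))
    return math.degrees(math.acos(min(1.0, cos_angle)))


def within_pair_range(separation, low=3.4, high=6.5):
    """True if the separation lies in the range quoted in the text (Å)."""
    return low <= separation <= high


R = 1.426  # Å, idealized centroid-to-carbon distance
ANGLES = [math.radians(60 * k) for k in range(6)]
ring_a = [(R * math.cos(t), R * math.sin(t), 0.0) for t in ANGLES]         # xy plane
edge_to_face = [(R * math.cos(t), 0.0, 5.0 + R * math.sin(t)) for t in ANGLES]  # xz plane
stacked = [(R * math.cos(t), R * math.sin(t), 3.4) for t in ANGLES]        # parallel
```

Here `edge_to_face` is a perpendicular ring placed over the face of `ring_a`, and `stacked` is a parallel ring; the first gives an interplanar angle near 90° and the second an angle near 0°.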
In addition, these analyses showed that there are preferred geometric arrangements of the two aromatic rings that do not reflect random close packing of planar molecules in the interior of a protein (Singh and Thornton, 1985; Burley and Petsko, 1986a). Analysis of the observed pattern of interaction geometries and its comparison with the potential energy surface of dibenzene demonstrated that there are statistically significant preferences for interaction geometries that are enthalpically favorable (Burley and Petsko, 1986a). Such geometric arrangements permit the δ+ hydrogen atoms on the edge of one aromatic ring to approach the δ− π-electron cloud of the other member of the aromatic pair, and the favorable enthalpy can be understood readily in terms of electrostatic interactions between partial electronic charges (see Fig. 16). Gould et al. (1985) examined the Cambridge Crystallographic Data Base
FIG. 16. van der Waals stereodrawings of interacting benzene rings positioned in four enthalpically optimal geometric arrangements for r = 5.5 Å, using the coordinate system defined in Fig. 8. (A) θ = 0°, interplanar angle = 90°; (B) θ = 30°, interplanar angle = 55°; (C) θ = 50°, interplanar angle = 20°; (D) θ = 90°, interplanar angle = 90°.
for intermolecular interactions between the phenyl rings of phenylalanine in a set of high-accuracy small-molecule single-crystal X-ray structures. They detected a statistically significant preference for one of the local minima of the dibenzene potential energy surface described in Burley and Petsko (1986a). Therefore, analyses of two independent crystallographic data bases, one of extremely high accuracy, demonstrate that the characteristic distribution of positive and negative partial charges in aromatic groups influences the pattern of interactions between aromatic groups in proteins. Calculations of the dibenzene potential energy surface, based on ab initio quantum mechanical calculations of Karlstrom et al. (1983), provide an estimate of about -1 to -2.5 kcal/mol for the enthalpic contribution of each occurrence of this aromatic-aromatic interaction (Burley and Petsko, 1986a). The aromatic-aromatic interaction can be largely attributed to an enthalpically favorable interaction between two molecular quadrupole moments, and multipole expansion calculations for dibenzene reproduce the shape of the dibenzene potential energy surface (S. K. Burley, unpublished observations). Hence, the aromatic-aromatic interaction has a distance dependence that varies as 1/r⁵ and is spatially anisotropic. In addition, the spatial distribution of such enthalpically favorable aromatic pairs in proteins suggests that they are involved in stabilizing protein tertiary structure. Proteins such as parvalbumin (Kretsinger and Nockolds, 1973), calmodulin (Babu et al., 1985), and troponin C (Herzberg and James, 1985) are particularly rich in aromatic residues, and clusters of interacting aromatic residues have been observed within these proteins (see Fig. 17). These aromatic amino acid side chain clusters are analogous to the crystal structure of benzene, which displays an edge-to-face, or herringbone, arrangement of benzene molecules (Cox et al., 1958).
Moreover, the conservation of aromatic residues in parvalbumins
FIG. 17. Model of the three-dimensional structure of carp parvalbumin showing the α-carbon backbone and a group of phenylalanine side chains, which form an enthalpically favorable set of aromatic-aromatic interactions and, to a large extent, define the core structure of this protein.
FIG. 18. Stereodrawing of the antigen-binding site of the Fab fragment NEW. The α-carbon backbone and numbered aromatic side chains show the large number of aromatic-aromatic interactions present at the VL-VH interface. These interactions occur both within and between the two domains. Courtesy of J. Novotny.
isolated from different species of fish demonstrates a marked tendency to preserve these enthalpically favorable clusters of aromatic amino acid side chains (Burley and Petsko, 1985). Properties of benzene clusters have been examined theoretically by Oikawa et al. (1985), who determined the geometric arrangements of enthalpically favorable clusters of benzene from dimers up to an assembly of 42 molecules. The small clusters resemble the packing of aromatic amino acid side chains within proteins, and the larger clusters reproduce the crystal lattice of benzene. Human immunoglobulin Fab fragments NEW and KOL and mouse immunoglobulin Fab fragment MCPC 603 have been solved to comparable resolution by X-ray crystallography and were extensively compared by Novotny and Haber (1985). The most striking feature of the antigen-binding site in all three of these Fab fragments is the preponderance of phenylalanine, tyrosine, and tryptophan residues at the interface of the light and heavy chains (VL-VH) (see Fig. 18). Analysis of the Fab fragment NEW revealed 13 aromatic pairs in the VL-VH pseudodimer, 5 of which occur at the bottom of the antigen-binding site, where they contribute an attractive nonbonded interaction energy. These five interactions
are absolutely conserved in the Fab fragments NEW, KOL, and MCPC 603, and we propose that they play a significant role in antigen binding. Aromatic-aromatic interactions also play an important role in protein-ligand binding. Both carboxypeptidase A and γ-chymotrypsin have a single mobile tyrosine residue that makes an aromatic-aromatic interaction with an aromatic moiety of an inhibitor bound in the enzyme active site. In carboxypeptidase A, ligand binding in the active site results in side-chain torsion angle changes, which move tyrosine-248 a distance of about 10 Å, thereby allowing the aromatic side chain of tyrosine-248 to tend toward an aromatic-aromatic interaction with the benzoyl group of the peptide analog ligand N-benzoyl-L-phenylalanine (Christianson and Lipscomb, 1987). In addition, tyrosine-198 makes an aromatic-aromatic interaction with the L-phenylalanine residue, which resides in the s-1 aromatic cleft of the enzyme active site. Shoham et al. (1987) examined the structure of carboxypeptidase A with 5-amino-(N-tert-butoxycarbonyl)-2-benzyl-4-oxo-6-phenylhexanoic acid bound in the active site, and they documented a similar interaction between a substrate aromatic group and tyrosine-198. Site-directed mutagenesis was used by Gardell et al. (1985) to make phenylalanine-248 carboxypeptidase A. This mutation does not significantly alter the catalytic constant toward either peptide or ester substrates, a finding suggesting that tyrosine-248 plays a role in ligand binding rather than in catalysis. We conclude that this role arises both from hydrogen bonds (between the phenolic hydroxyl group and the terminal carboxylate of substrates, and between the phenolic oxygen and the penultimate substrate amide N-H group) and from the favorable enthalpy of an aromatic-aromatic interaction. Further, we suggest that replacement of the phenylalanine at position 248 with alanine would further decrease substrate binding enthalpy and reduce the specificity of carboxypeptidase A for aromatic side chains.
γ-Chymotrypsin also has a mobile tyrosine residue that can engage in an aromatic-aromatic interaction with a ligand bound in the active site (Ringe et al., 1985). When 5-benzyl-6-chloro-2-pyrone binds to serine-195 at the active site, tyrosine-228 undergoes side-chain torsion angle changes that allow the phenolic side chain to make an edge-to-face interaction with the benzyl group of the inhibitor (see Fig. 19). Perutz et al. (1986) have recently described a series of X-ray crystallographic structures of drugs and peptides bound to deoxy-Hb A. Perutz and co-workers detected the following aromatic-aromatic interaction: the side chain of tryptophan-37β interacts with the phenyl ring of bezafibrate (see Fig. 20).
FIG. 19. Stereodrawing of 6-benzyl-3-chloro-2-pyrone bound to the active site of γ-chymotrypsin. The structures of the native enzyme (solid lines) and the enzyme-inhibitor complex (clear lines) are overlaid. Reproduced with permission from Ringe et al. (1985).
FIG. 20. Schematic view of the antihyperlipoproteinemic compound bezafibrate bound to deoxyhemoglobin A in the central cavity of the tetramer. The protein-ligand interactions include an amino-aromatic interaction involving asparagine-108β and an aromatic-aromatic interaction involving tryptophan-37β. Reproduced with permission from Perutz et al. (1986).
L-Arabinose-binding protein represents an interesting example of an aromatic-aromatic interaction indirectly influencing ligand binding. Tryptophan-16 and phenylalanine-17 adopt an edge-to-face arrangement that creates a hydrophobic patch in the L-arabinose binding site. C-3, C-4, and C-5 of L-arabinose constitute a corresponding hydrophobic patch and make van der Waals contacts with the aromatic pair (Quiocho and Vyas, 1984). This arrangement corroborates the observation that only one of the five tryptophans in L-arabinose-binding protein is responsible for the fluorescence change on sugar binding, and that the spectral shift is consistent with a more hydrophobic tryptophan environment after the sugar binds (see Fig. 21).
Finally, aromatic-aromatic interactions are important in stabilizing the three-dimensional structures of biologically active oligopeptides and peptide analogs. Recently, a group of bisphenylalanine compounds that are model therapeutic agents for the treatment of sickle cell disease has been described by Burley et al. (1987). The two antigelling agents, L-lysyl-L-phenylalanyl-L-phenylalanine (Wang and Burley, 1987a) and L-phenylalanylglycylglycyl-D-phenylalanine (Fujii et al., 1987), and the two antisickling agents, L-phenylalanine benzyl ester (Wang and Burley, 1987b) and N-phenylacetyl-L-phenylalanine (Burley, 1987; Burley and Wang, 1987a), are maintained in their compact, amphipathic conformations by intramolecular aromatic-aromatic interactions (see Fig. 22). It is interesting to note that the mean ring centroid separation for these five single-crystal X-ray structures is 5.02 Å (calculated standard deviation, 0.12 Å), which is significantly lower than the mean value of about 5.5 Å obtained from a survey of proteins (Burley and Petsko, 1986a). The three-dimensional structures of these antisickling and antigelling compounds resemble that of another bisaromatic antisickling agent, succinyl-L-tryptophanyl-L-tryptophan, when it is bound in the central cavity of deoxy-Hb A (Perutz et al., 1986) (see Fig. 23). Larger oligopeptide hormones are also stabilized by such interactions. A nuclear magnetic resonance spectroscopic study of somatostatin by Hirschmann and co-workers (Arison et al., 1981) revealed that the side chains of phenylalanine-6 and phenylalanine-11 make an edge-to-face aromatic-aromatic interaction. Synthetic studies have shown that elimination of one or both of the aromatic groups at positions 6 and 11 abolishes biological activity. However, there is one notable exception. When
FIG. 21. Stereodrawing of L-arabinose bound to L-arabinose-binding protein from Escherichia coli. The edge-to-face interaction between tryptophan-16 and phenylalanine-17 gives rise to a hydrophobic surface that interacts with the hydrophobic portion of the sugar. Reproduced with permission from Quiocho and Vyas (1984).
a disulfide bridge is created between positions 6 and 11, the biological activity of the compound is restored. Arison et al. (1981), therefore, concluded that the aromatic-aromatic interaction creates a bridge analogous to the disulfide bridge and is essential for biological activity. The single-crystal structure of pressinoic acid, a fragment of vasopressin, demonstrates an edge-to-face interaction between tyrosine-2 and phenylalanine-3 (Langs et al., 1986) (see Fig. 24). This finding explains some earlier work on the relationship between structure and function in oxytocin, which differs slightly from vasopressin and has an isoleucine at position 3. Walter et al. (1976) made [3-L-phenylalanine]oxytocin (oxypressin) and [3-β-cyclohexylalanine]oxytocin and demonstrated that the oxytocin activity of oxypressin was substantially below that of both wild-type oxytocin and [3-β-cyclohexylalanine]oxytocin. They explained these data by suggesting that the introduction of a phenylalanine at position 3 in oxytocin to make oxypressin permitted a π-π parallel stacking interaction between tyrosine-2 and phenylalanine-3. It now seems more likely that they,
in fact, created an edge-to-face interaction between tyrosine-2 and phenylalanine-3, thereby making an analog of vasopressin with little oxytocin activity. Nucleic acid structural studies have conditioned many scientists to
FIG. 22. Ball-and-stick stereodrawings of the molecular structures of each of the four bisphenyl antigelling and antisickling compounds. Small open circles indicate carbon atoms; large open circles indicate oxygen atoms; and large shaded circles represent nitrogen atoms. Each α-carbon atom is labeled with the appropriate one-letter code signifying amino acid type. Water molecules are indicated with an encircled W, and the counterions with an encircled B for bromide and an encircled C for chloride. (A) The molecular structure of the antigelling agent L-lysyl-L-phenylalanyl-L-phenylalanine with its two bromide counterions. (B) The molecular structure of the antigelling agent L-phenylalanyl-glycyl-glycyl-D-phenylalanine, showing only two of its three bound water molecules for the sake of clarity. (C) The molecular structure of the antisickling agent L-phenylalanine benzyl ester with its chloride counterion. (D) The molecular structure of the antisickling agent N-phenylacetyl-L-phenylalanine.
FIG. 23. Stereodrawing of the oligopeptide succinyl-L-tryptophanyl-L-tryptophan bound to deoxyhemoglobin A. The compact amphipathic conformation is maintained by an enthalpically favorable edge-to-face interaction between the two indole moieties. Reproduced with permission from Perutz et al. (1986).
FIG. 24. Molecular structure of pressinoic acid. The two aromatic residues tyrosine-2 and phenylalanine-3, which make an enthalpically favorable edge-to-face interaction, are covered by a network of points that depict their van der Waals surfaces.
expect that aromatic side chains will self-interact through π-π parallel stacking interactions similar to those made by consecutive bases in the interior of the DNA double helix. Calculation of interaction enthalpies, by ourselves and others, as well as the examination of protein and small-molecule structures summarized above, clearly indicates that this expectation is erroneous. Simple aromatic residues prefer to associate via enthalpically favorable, edge-to-face, weakly polar interactions in which a δ+ hydrogen atom from one ring makes a close contact with the δ− π-electron cloud of the other ring. This interaction is worth approximately -1.5 kcal/mol of stabilization energy, in contrast to π-π parallel stacking, which contributes almost zero when the two rings are monocyclic. How, then, is the DNA structure to be explained? First of all, the favorable Watson-Crick hydrogen bonding between complementary strands, which is strongest when the paired bases are coplanar, combined with the geometry of the sugar-phosphate backbone, favors a "ladder" of parallel aromatic rings. Second, approximately half of the bases are heterocyclic. The enthalpic contribution of a single edge-to-face weakly polar interaction is independent of the number of rings in each member of the pair, but the energy of π-π (parallel ring) stacking increases markedly as the surface areas of the ring systems increase. For heterocycles, parallel stacking is competitive with, and may be more favorable than, perpendicular interaction. Compelling, though indirect, evidence for this view comes from model studies of possible prebiotic template-directed oligonucleotide synthesis. In these experiments, a single-stranded homopurine or homopyrimidine polynucleotide is used as a template, and the ability of the prebiotic system to achieve synthesis of the complementary strand from mononucleotide precursors is measured.
Since there is no polymerase present, the success of the system depends on the ability of the mononucleotides to form a DNA-like stacked column in order to pair with the template bases; only then will their sugar-phosphate groups be oriented correctly for polymerization. When the mononucleotides are heterocyclic purines, template-directed synthesis is achieved. However, when pyrimidines, which are monocyclic, are used, no polymerization is observed (Ninio and Orgel, 1978). Presumably, in this case, the greater enthalpic stability of the perpendicular edge-to-face interaction prevents the formation of the essential stack of bases (Felsenfeld and Miles, 1967). Up to now, our discussion has focused on interactions involving the δ+ hydrogens of aromatic rings as donor atoms in enthalpically favorable interactions. We shall now turn our attention to the role of the δ− π-electron cloud as an acceptor.
E. Amino-Aromatic Interactions
Oxygen-aromatic, sulfur-aromatic, and aromatic-aromatic interactions all derive from the polar attraction of the δ+ hydrogens of an aromatic ring to a negatively polarized atom or δ− π-electron cloud. Conversely, we might expect that the δ− π-electron cloud of an aromatic ring would interact with other δ+ polarized hydrogens, such as those on an amino group. Such an interaction would lead to a preference for amino groups to be found in axial orientations above and below the plane of the ring, rather than close to the equatorially distributed δ+ ring hydrogen atoms. Amino-aromatic interactions in proteins were first described by Burley and Petsko (1986b). They occur between the side-chain amino groups of lysine, arginine, glutamine, asparagine, and histidine and the side chains of the aromatic amino acids phenylalanine, tyrosine, and tryptophan, between the amino moieties of ligands and these aromatic residues, or between amino groups of a protein and aromatic moieties of ligands. Geometric analyses of a crystallographic data base of 33 protein structures documented that side-chain amino and aromatic groups in proteins have preferred separation distances of 3.0 to 6 Å between the nitrogen atom and the centroid of the nearby aromatic ring. In addition, these analyses show that the side-chain amino groups are preferentially found axially near the δ− π-electron clouds of aromatic rings and that they avoid equatorial positions near the δ+ hydrogen atoms of these rings. Both these spatial preferences are different from that expected from random close packing of groups within the hydrophobic core of a protein and are statistically significant. The energetics of the interaction have been characterized by ab initio quantum mechanical calculations of the positively charged ammonium ion interacting with the δ− π-electron cloud of benzene, and the interaction is enthalpically favorable with the optimal separation distance of about 3 Å between the nitrogen atom and the phenyl ring centroid (Deakyne and Meot-Ner (Mautner), 1985). The distance dependence is expected to be 1/r⁴ for -NH or -NH2 and 1/r³ for -NH3+, and the interaction is spatially anisotropic in both cases. The spatial distribution of such enthalpically favorable amino-aromatic pairs in proteins suggests that they are involved in stabilizing protein tertiary structure. The amino-aromatic interaction has also been observed in two interesting cases in proteins that involve stabilization of one of two alternate positions for the side chains of arginine residues. In carbon monoxymyoglobin, arginine-45 is disordered and is found in two distinct conformations. One of the conformations is stabilized by an amino-aromatic interaction between the positively charged guanidinium group and the δ− π-electron cloud of phenylalanine-43 (amino to phenyl ring centroid separation, 3.0 Å), and it is thought to play a role in ligand entry to the heme group (Kuriyan et al., 1986) (see Fig. 25). A similar phenomenon was observed in the structure of lamprey hemoglobin, where an alternate arginine conformation is stabilized by an amino-aromatic interaction (Honzatko et al., 1985). In addition, a highly unusual pair of amino-aromatic interactions in basic pancreatic trypsin inhibitor has been characterized by neutron and X-ray crystallography and proton magnetic resonance (Wlodawer et al., 1984; Tuchsen and Woodward, 1987) (see Fig. 26). The δ− π-electron cloud of the aromatic ring of tyrosine-35 is sandwiched between the main-chain amino group of glycine-37 and the side-chain amino group of asparagine-44. Recently, Levitt and Perutz (1988) have examined the amino-aromatic interaction using an empirical potential energy function to calculate the shape of the potential energy surface for N-H interacting with benzene.
They found the optimal distance between the ring center and the N atom to be about 3.4 Å, which is in good agreement with the amino-aromatic interactions detected in proteins by Burley and Petsko (1986b) and similar calculations (S. K. Burley, unpublished observations). In addition, Levitt and Perutz proposed that aromatic rings should be considered to be "hydrogen bond" acceptors. Although this view is supported by ab initio calculations of HF interacting with benzene (Cheney et al., 1988), we prefer to characterize amino-aromatic interactions in terms of electrostatics as charge-quadrupole or dipole-quadrupole interactions. Like the other weakly polar interactions, amino-aromatic interactions are a mechanism of protein-ligand binding. Perutz et al. (1986) described a series of X-ray crystallographic studies of drugs and peptides bound to deoxy-Hb A. They characterized an amino-aromatic interaction between the δ+ Nδ-H group of asparagine-108β and the δ− π-electron cloud of one of the phenyl rings of bezafibrate (see Fig. 20).
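The distance dependences quoted in this section (1/r³ for interactions of a charged group with an aromatic ring, 1/r⁴ for an uncharged polar group, and 1/r⁵ for a pair of aromatic rings) can be compared with a short calculation. The sketch below is illustrative only: it considers the radial factor alone, ignores the spatial anisotropy emphasized in the text, and uses hypothetical names. The labels follow the charge-quadrupole, dipole-quadrupole, and quadrupole-quadrupole terminology used in this chapter.

```python
# Radial attenuation of the weakly polar interactions, ignoring anisotropy.
# Exponents are those quoted in the text; everything else is illustrative.

EXPONENTS = {
    "charge-quadrupole": 3,      # e.g., carboxylate oxygen or -NH3+ with a ring
    "dipole-quadrupole": 4,      # e.g., uncharged O, S, or N-H with a ring
    "quadrupole-quadrupole": 5,  # aromatic-aromatic pair
}


def relative_strength(kind, r, r_ref):
    """Interaction strength at distance r relative to that at r_ref."""
    return (r_ref / r) ** EXPONENTS[kind]


if __name__ == "__main__":
    for kind in EXPONENTS:
        factor = relative_strength(kind, r=6.0, r_ref=5.0)
        print(f"{kind}: {factor:.2f} of the value at 5.0 Å")
```

Moving from 5.0 to 6.0 Å reduces the radial factor to roughly 58%, 48%, and 40% of its value for the three cases, respectively, so the aromatic-aromatic interaction is the shortest ranged of the three.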
FIG. 25. Stereodrawings of the two conformations of arginine-45 of carbon monoxymyoglobin from sperm whale. (a) Conformation 1, which is comparable to the conformation normally observed in metmyoglobin from sperm whale. (b) Conformation 2, which is stabilized by an amino-aromatic interaction between arginine-45 and phenylalanine-43.
FIG. 26. Amino-aromatic sandwich detected in bovine pancreatic trypsin inhibitor. The side chain of tyrosine-35, indicated by its van der Waals surface, is sandwiched between the main-chain N-H group of glycine-37 and the ND2-H amino group of asparagine-44. The distance between the two hydrogen atoms is 5.2 Å.
To conclude, in a manner complementary to the interactions of oxygens and sulfurs with the hydrogens of aromatic rings, amino group hydrogens can make enthalpically favorable, weak, polar interactions with the π electrons of aromatic rings. These interactions are of sufficient importance to cause a marked anisotropy in the distribution of N-H and NH₃⁺ groups in the vicinity of aromatic side chains.

IV. INTERACTIONS: A SUMMARY

The physical characteristics of noncovalent electrostatic interactions that stabilize protein structure have been described. In addition, an unusual group of weak electrostatic interactions in proteins, only recently characterized, has been reviewed in detail, and some examples of biological importance have been cited. These interactions, termed weakly polar, result from the characteristic distribution of partial charges in some amino acid side-chain moieties and involve interactions between electronic monopole, dipole, and quadrupole moments. Charge-quadrupole interactions occur in the form of negatively charged oxygen-aromatic interactions and positively charged amino-aromatic interactions. Dipole-quadrupole interactions occur in the form of δ− oxygen-aromatic and sulfur-aromatic interactions, which can also be said to represent a type of “hydrogen bond” with a carbon atom as the hydrogen donor. Dipole-quadrupole interactions also occur in the form of δ+ amino-aromatic interactions. Finally, the aromatic-aromatic interaction is an example of a quadrupole-quadrupole interaction. Both the aromatic-aromatic interaction and the δ+ amino-aromatic interaction can also be described as “hydrogen bonds.” The δ+ amino group interacting with an aromatic ring can be said to act as a “hydrogen bond donor,” and the π-electron
cloud of the remaining aromatic ring can be said to act as the “hydrogen bond acceptor” group. Although we do not favor such terminology, it does serve to underscore the similarity between these interactions and the more strongly polar conventional hydrogen bond.
V. HYDROPHOBIC INTERACTIONS IN PROTEINS

This section does not represent an attempt to review the enormously complex subject of the hydrophobic effect in proteins, which has been discussed in depth by Tanford (1980). Instead, we shall present a discussion of the hydrophobic effect that is couched in terms of its relationship to the subject matter treated in the preceding sections. Proteins are amphipathic molecules and are composed of amino acids that have a strongly hydrophilic peptide group and either hydrophobic or hydrophilic side chains. When a nonpolar group is introduced into water, there are both enthalpy and entropy changes in the system as a whole. There is an unfavorable enthalpy change caused by the creation of a hole in the hydrogen-bonded structure of water (the difficulty in creating such a hole is the origin of the high surface tension of water). This unfavorable change is offset to a large extent by a favorable enthalpy change due to London forces between the nonpolar groups and water molecules. The attractive force between nonpolar molecules and water molecules is comparable in magnitude to that of nonpolar molecules for each other (Tanford, 1980). In fact, the net enthalpy of interaction of a hydrophobic group with water may actually be favorable. Why, then, are such groups insoluble in water? Kauzmann (1959) pointed out that the low solubility of hydrocarbons in aqueous solvents can be explained by the “iceberg” model of Frank and Evans (1945). When transferred into water, the nonpolar hydrocarbon molecule induces in the layer of water immediately surrounding it a “cage” of more or less fully hydrogen-bonded water molecules (this is the iceberg). Formation of the cage makes the entropy of the system decrease when hydrocarbons are introduced into water, and this unfavorable entropy change offsets the enthalpy effect. 
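The sign argument above follows from the standard relation ΔG = ΔH − TΔS. The short sketch below illustrates it; the function name is hypothetical and the numerical values are placeholders chosen only to reproduce the signs described in the text (a slightly favorable enthalpy and an unfavorable entropy of transfer). They are not measured data.

```python
def transfer_free_energy(delta_h, delta_s, temperature):
    """Return dG = dH - T*dS (kcal/mol) for moving a nonpolar solute into water."""
    return delta_h - temperature * delta_s


# Hypothetical illustrative values (NOT measurements):
DELTA_H = -1.0     # kcal/mol; slightly favorable enthalpy of transfer
DELTA_S = -0.020   # kcal/(mol K); entropy loss from ordering the water "cage"
T = 298.15         # K

delta_g = transfer_free_energy(DELTA_H, DELTA_S, T)
print(f"dH = {DELTA_H:+.2f}, -T*dS = {-T * DELTA_S:+.2f}, dG = {delta_g:+.2f} kcal/mol")
```

With any values of this kind, a favorable ΔH is outweighed by the −TΔS term, so the transfer is unfavorable overall even though the enthalpy alone would favor it.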
Hydrophobic molecules therefore tend to self-associate in water, because doing so decreases their total surface area in contact with the solvent and minimizes the unfavorable entropy decrease. Aggregation of nonpolar solutes in water is thus entropy-driven. The thermodynamics of this model have been treated quantitatively by Gill (1985), and there is support from molecular dynamics simulations of methane and neon in water, which show a layer of structured solvent around the nonpolar solute molecule. The model implies that, if proteins are stabilized to a major extent by this tendency of hydrophobic groups to avoid contact
with water, some of the thermodynamic data on protein unfolding should be explicable in terms of data on the solubility of simple liquid hydrocarbons in water. Baldwin has recently shown that quantitative accounting for the protein data is possible in this way (Baldwin, 1986). Earlier literature has used the term “hydrophobic bond,” but it is clear from the above discussion that no special hydrophobic force exists. Nonpolar groups self-associate in water because their dispersal throughout the solvent would be entropically unfavorable. Once they come together and water is largely excluded, enthalpically favorable interactions are possible, but these are just (for nonaromatic hydrocarbons) the normal weak London forces between any polarizable groups. There is no “bonding” that is specifically “hydrophobic.” The correct term is hydrophobic effect. Measurements of the solubilities of free amino acids and their analogs in aqueous solution and in nonpolar organic solvents have documented their free energies of transfer from aqueous solution to organic solvent. These measurements demonstrate that the free energy gained by burying hydrophobic side chains in the solvent-inaccessible interior of a protein is a major factor stabilizing the native conformation or conformations of a protein in aqueous solution, and the behavior of amino acid side chains in a protein has been likened to that of the constituents of a detergent micelle (Kauzmann, 1959). Recently, Baldwin (1986) described the temperature dependence of the hydrophobic interaction in protein folding. As mentioned earlier, his discussion suggests that studies of the solubilities of simple hydrocarbons in water can account for much of the calorimetric data obtained from unfolding proteins (Privalov, 1979). In addition, he analyzed the temperature dependences of the changes in enthalpy and entropy for protein folding. 
At room temperature, the free energy change of protein folding can be ascribed to the sum of an enthalpy change, which is independent of the burial of hydrophobic groups, and entropic changes, which are only partially due to hydrophobic interactions (the remainder of the entropic contribution probably comes from conformational entropy, which increases when the closely packed protein atoms are liberated on unfolding). It was originally thought that nonpolar groups were restricted to the interior of the protein (where they are not exposed to solvent water molecules) and that the polar portions of the polypeptide chain would be found on the protein’s surface. However, analysis of three-dimensional protein structures determined by X-ray crystallography revealed that many hydrophobic groups are at least partially exposed to solvent (Richards, 1977), a finding suggesting that the free energy of transfer from aqueous to organic solvent was not well correlated with the tendency of a residue to
be buried in a protein’s interior. More recently though, Rose et al. (1985) reviewed this problem and documented a positive linear correlation between the hydrophobicity measure of Nozaki and Tanford (1971) and the van der Waals surface area buried on folding. They determined that the twenty naturally occurring amino acids were readily divisible into three distinct groups by the mean fractional surface area buried on folding and the average residual unburied surface area after folding. The three amino acid classes are (1) very polar, which includes serine, proline, aspartate, asparagine, glutamate, glutamine, lysine, and arginine; (2) moderately polar, which includes alanine, threonine, histidine, and tyrosine; and (3) hydrophobic, which includes glycine, cysteine, valine, isoleucine, leucine, methionine, phenylalanine, and tryptophan (all given in ascending order of hydrophobicity). In sum, these data affirm the validity of the analogy between a detergent micelle and the three-dimensional structure of a globular protein [a statistical-mechanical explanation for the presence of nonpolar residues on the protein surface has been provided by Dill (1985), who points out that the reconfiguration entropy, which drives the system toward distributional disorder, would lead to about 40% of the solvophobic residues being found at the surface of the protein]. Unfortunately, the micelle analogy is not without flaws. First, a linear polypeptide chain is not segregated into regions with only polar or nonpolar amino acids. Second, even the most hydrophobic amino acids contain the peptide group, which is strongly hydrophilic. Third, unlike the interior of a detergent micelle, the interior of a protein is, in general, highly ordered and can be resolved readily by X-ray crystallography. Hence, the interior of a protein consists of a collection of both polar and nonpolar atoms, which are typically found in well-ordered arrangements. 
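For readers who wish to use the three-class grouping quoted above in a script, it can be written down as a simple lookup table. This is a sketch only: the three-letter codes, the dictionary name, and the helper `classify` are our own hypothetical conventions, and the membership and ordering are copied from the list in the text rather than recomputed from the data of Rose et al. (1985).

```python
# Residue classes as listed in the text; within each class the order is the
# ascending order of hydrophobicity stated there.
RESIDUE_CLASSES = {
    "very polar": ["Ser", "Pro", "Asp", "Asn", "Glu", "Gln", "Lys", "Arg"],
    "moderately polar": ["Ala", "Thr", "His", "Tyr"],
    "hydrophobic": ["Gly", "Cys", "Val", "Ile", "Leu", "Met", "Phe", "Trp"],
}


def classify(residue):
    """Return the class name for a three-letter residue code (hypothetical helper)."""
    for class_name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return class_name
    raise KeyError(f"unknown residue code: {residue}")


# Flatten the table to check that every residue appears exactly once.
all_residues = [r for members in RESIDUE_CLASSES.values() for r in members]
print(len(all_residues), "residues in", len(RESIDUE_CLASSES), "classes")
print("Tyr ->", classify("Tyr"))
```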
We can partially explain the structural order in terms of the apparent requirement that most, but not all, buried hydrophilic peptide groups be involved in hydrogen bonds with one another. However, the positions of the side chains of hydrophobic amino acids are unlikely to be influenced directly by backbone hydrogen bonding patterns. The packing density of amino acid side chains in a protein’s interior is also likely to play a role in imposing structural order, but observed atomic packing densities in proteins are not maximal and there are even large spatial cavities within globular proteins (Richards, 1977; Tilton et al., 1984; Tilton and Petsko, 1988). Despite the presence of spatial cavities, the hydrophobic amino acid side chains are not observed to be significantly disordered in the vicinity of these cavities, where disorder, if it could occur anywhere in a protein interior, would appear to be very likely. When pressurized xenon was introduced into the cavities found in sperm whale metmyoglobin, the nonpolar atoms lining the cavity underwent positional shifts of up to 1.29
Å without significant changes in their crystallographically determined temperature factors, which reflect static disorder and/or thermal motion (Tilton et al., 1984). More recently, Tilton and Petsko (1988) have compared 1 and 200 atmosphere structures of nitrogen-myoglobin obtained from a single myoglobin crystal. These two structures differ markedly in their cavity structures. Some of the buried amino acid side chains undergo large side-chain torsion angle changes, which create new cavities in the protein. Thus there are significant changes in the core structure of myoglobin when it is subjected to large gas pressures. Finally, it is interesting to note that there is a large group of naturally occurring cavity mutations in human hemoglobin A, mutations that invariably destabilize the protein by introducing a new cavity into the interior of the tetramer (Fermi and Perutz, 1981). How then can we account for the high degree of internal order routinely found within globular proteins? We believe that combinations of the wide variety of electrostatic interactions reviewed above determine the precise three-dimensional structure of the interior of a protein. We argue that the sum of these interactions produces, at least in part, the enthalpy change on protein folding that is independent of the hydrophobic effect. Crystal structures of small organic compounds provide a useful model of protein interiors, and we now discuss some recent theoretical studies of these systems. Acetic acid has been investigated by Smith and Karplus (1988), who found that both dimerization energy and dimerization geometry are strongly influenced by electrostatic effects involving oxygen, hydrogen, and carbon atoms. The crystal structure of the dipeptide L-alanyl-L-alanine has also been the subject of considerable theoretical study because it is unusual (J. C. Smith and M. Karplus, unpublished observations). 
The observed torsion angles in the crystal are not characteristic of random coil but can be well explained by the precise geometry of the nonbonded intra- and intermolecular interactions in the crystal. Dill (1985) has presented a theory for the folding and stability of globular proteins that addresses many of these concerns. In his model, folding is driven by the self-association of hydrophobic residues to reduce solvent contact and is opposed by the chain configurational entropy. Once a compact, hydrophobic-buried state is achieved, conformational searching is greatly restricted by steric interactions. Such a view eliminates the oft-quoted objection that a protein does not have time to fold in a multipath manner by sampling most of the conformational possibilities for the peptide backbone. In the Dill scheme, the number of allowed conformations is reduced enormously because excluded volume “is of overwhelming importance” as the protein folds, since it is folding within a globule.
There are far fewer possible globular states than random coil states; Dill calculated that for a polymer of 100 residues only of the states are accessible! The notion that folding proceeds from a condensed state rather than from a random coil state is somewhat heretical in an age dominated by the “secondary structure nucleation” point of view. However, the question has not been addressed experimentally, and the Dill model is in accord with the thermodynamics of unfolding for many globular proteins. To complete our abbreviated treatment of the hydrophobic effect, we note that the liquid hydrocarbon free-energy of transfer model fails to explain the effects of pressure on protein folding (Kauzmann, 1987). Pressure effects are determined by the volume change that accompanies a process; in this case, unfolding is usually studied. The volume change ΔV for unfolding is positive at low pressures (the unfolded protein occupies more volume than the native molecule), but it becomes negative at pressures above 1000 to 2000 atmospheres. This result means that the compressibilities of the folded and unfolded forms of proteins must be very different, the unfolded form being much more compressible. Unfortunately, the solubility of nonpolar substances in water shows the opposite behavior: ΔV for the transfer from nonpolar to aqueous environment is negative at low pressures but becomes positive at high pressures! We have speculated that this discrepancy may be explained by the existence of numerous weakly polar interactions in the hydrophobic interior of globular proteins, which could serve to reduce the compressibility of the native state by a factor of 20 over that of liquid hydrocarbons. Quantitative treatment of this hypothesis has not yet been attempted, and we conclude by echoing the view of Kauzmann (1987) that “until more searching is done in the darkness of high pressure studies, our understanding of the hydrophobic effect must be considered quite incomplete.”
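The pressure argument rests on the thermodynamic identity (∂ΔG/∂P) at constant temperature = ΔV, so the sign of ΔV determines whether pressure favors unfolding. The sketch below assumes, purely for illustration, that ΔV for unfolding decreases linearly with pressure; the function names and both parameter values are hypothetical and were chosen only so that the sign change falls in the 1000 to 2000 atmosphere range mentioned above. They are not fitted to any protein.

```python
def delta_v_unfolding(pressure_atm, dv0, dkappa):
    """Assumed linear model: volume change of unfolding (mL/mol) at a given pressure.

    dv0    : volume change at zero pressure (mL/mol)
    dkappa : excess compressibility term of the unfolded over the native form,
             in mL/(mol atm); positive when the unfolded form is more compressible
    """
    return dv0 - dkappa * pressure_atm


def crossover_pressure(dv0, dkappa):
    """Pressure (atm) at which the modeled volume change passes through zero."""
    return dv0 / dkappa


# Hypothetical illustrative parameters (NOT experimental values):
DV0 = 30.0      # mL/mol
DKAPPA = 0.02   # mL/(mol atm)

for p in (1.0, 1000.0, 3000.0):
    print(f"P = {p:6.0f} atm: dV = {delta_v_unfolding(p, DV0, DKAPPA):+.1f} mL/mol")
print(f"sign change near {crossover_pressure(DV0, DKAPPA):.0f} atm")
```

In this model a positive `dkappa` is what makes ΔV change sign, which mirrors the inference in the text that the unfolded form must be the more compressible one.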
VI. DISCUSSION

The data presented above suggest that the three-dimensional structure of a protein is stabilized both by the hydrophobic effect and by a wide variety of electrostatic interactions that occur between amino acid constituents. These electrostatic interactions display markedly different dependences on the separation and geometric arrangement of interacting chemical groups and range in distance dependence from charge-charge interactions, which are governed by Coulomb’s law (1/r), to repulsion due to unfavorable electron shell overlaps, which declines in strength as the inverse twelfth power of the molecular separation. In addition, these electrostatic interactions demonstrate distinct dependences on the geometric arrangement of nearby chemical groups. For example, the strength of the charge-charge interaction is spatially isotropic, whereas the strength of the aromatic-aromatic, or quadrupole-quadrupole, interaction depends both on the spatial overlap of the two aromatic rings and on their interplanar angle. Variation in these two geometric parameters permits the δ+ hydrogen atoms of one aromatic ring to approach the δ− π-electron cloud of the other aromatic ring. Although the precise contribution of any given electrostatic interaction in a protein to its free energy of stabilization is uncertain, because it is a function of charge distribution, separation, interaction geometry, and the local “solvent” environment, enthalpy estimates for the various electrostatic interactions can be made. Attractive charge-charge interactions with optimal separation are said to contribute a free energy of about -3 to -4 kcal/mol; charge-dipole interactions are estimated to contribute a free energy of about -3 kcal/mol; the hydrogen bond, a dipole-dipole interaction, contributes an average enthalpy of about -2 kcal/mol, if neither group is charged; and the weakly polar interactions involving quadrupole moments typically contribute enthalpies of between -1 and -2.5 kcal/mol. Each of these individual enthalpy estimates is small when compared with the sum of all enthalpic contributions to the stabilization of a protein’s tertiary structure [e.g., ΔH at 25°C is about -65 kcal/mol for lysozyme going from the unfolded to the folded state (Tanford, 1968)], but is substantial when compared with the free energy of protein tertiary structure stabilization, which is typically about -10 kcal/mol (Privalov, 1979). 
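To put these magnitudes in perspective, a free energy difference ΔG shifts an equilibrium by the standard Boltzmann factor exp(−ΔG/RT). The sketch below evaluates that factor at 25°C for contributions of the sizes quoted in this section. The function name is hypothetical, and treating each quoted value as a free energy difference between two states is an illustrative simplification, since several of the estimates in the text are enthalpies rather than free energies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def equilibrium_factor(delta_g_kcal, temperature_k=298.15):
    """Return exp(-dG/RT): the factor by which dG shifts a two-state equilibrium."""
    return math.exp(-delta_g_kcal / (R_KCAL * temperature_k))


# Magnitudes taken from the text: single weakly polar interactions (-1 to -2.5),
# and the typical net stabilization of a folded protein (about -10 kcal/mol).
for delta_g in (-1.0, -2.0, -2.5, -10.0):
    factor = equilibrium_factor(delta_g)
    print(f"dG = {delta_g:+5.1f} kcal/mol -> equilibrium shifted {factor:.3g}-fold")
```

Because RT is only about 0.6 kcal/mol at room temperature, even a single interaction worth 1 to 2 kcal/mol changes the folded-to-unfolded ratio several-fold under this simplification.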
For example, analyses of thermostable proteins suggest that the additional free energy of stabilization required to convert a mesophilic protein into a thermophilic protein is on the order of -1 to -2 kcal/mol and is readily provided by the insertion of a single charge-charge interaction into the protein’s interior (Perutz and Raidt, 1975). Moreover, these various electrostatic interactions are involved in stabilizing protein-ligand complexes and, in doing so, determine the precise geometry of ligand binding. Finally, a recent theoretical examination of multiple conformational states of the protein myoglobin by Elber and Karplus (1987) suggests that the potential energy surface of a protein is characterized by a large number of thermally accessible minima in the vicinity of the potential energy minimum obtained for the “crystallographic” structure of the protein. In other words, the equilibrium conformation of the protein is defined by a potential well that itself has fine structure resembling an egg carton. The protein can sample these substates in the neighborhood of the average structure. Different minima in the potential energy surface correspond to changes in the relative orientations of myoglobin’s α helices, which are
closely coupled with side-chain rearrangements that preserve the close packing of the protein’s interior. These rearrangements produce minima with very similar potential energies, because the wide variety of weak but enthalpically favorable electrostatic interactions that occur in the interior of a protein allow similar optimization of the potential energy of each conformational state. Although the problem has not yet been addressed specifically, it seems reasonable to believe that protein-ligand interactions are also subject to the multiple minima effects seen in theoretical studies of myoglobin. Recent evidence of this effect has been obtained from crystallographic studies of the inhibitor N-acetyl-L-alanyl-L-alanyl-L-prolyl-L-alanine chloromethyl ketone binding to porcine pancreatic elastase. There are two distinct positions for the moiety distal to and including L-proline-3, which are due to two different inhibitor peptide torsion angles (Giammona et al., 1988). The occurrence of multiple conformational states of proteins and protein-ligand complexes is expected to have important consequences for the interpretation of their biological function and raises an important philosophical question regarding the definition of the concept of “structure” in molecular biophysics. Recent advances in both experimental and theoretical studies of protein dynamics support the picture of a rich variety of structurally similar conformations that interconvert rapidly at physiologic temperatures (Petsko and Ringe, 1984; Karplus and McCammon, 1983). The “structure” observed by X-ray diffraction or sampled by most spectroscopic and chemical probes is most likely a time and space average of this ensemble of conformations. We believe that, as far as, say, X-ray diffraction is concerned, it is never strictly correct to speak of the structure of a protein. 
The average structure is a more accurate term, the average being over all unit cells in the crystal and over the time required for data collection. These concepts are reinforced by recent calculations that suggest that a number of slightly different structural models provide equally “good” fits to high-resolution X-ray diffraction data (Kuriyan et al., 1987; Burley et al., 1988a).
VII. CONCLUSION

We have discussed noncovalent interactions that stabilize protein structure and have reviewed the results of recent geometric and energetic analyses of the atomic environments of aromatic side chains in protein and oligopeptide crystal structures. These studies have identified a group of weakly polar, enthalpically favorable interactions that exploit the characteristic segregation of positive and negative partial electronic charges found in aromatic moieties. Members of this group include (1) oxygen-aromatic interactions, which bring a δ− oxygen atom near the δ+ hydrogen atoms of an aromatic side chain; (2) sulfur-aromatic interactions, which bring the δ− sulfur atoms of cysteine and methionine near the δ+ hydrogen atoms of an aromatic ring; (3) edge-to-face interactions between aromatic amino acid side chains, which bring a δ+ hydrogen atom of one aromatic ring near the δ− π-electron cloud of the other aromatic ring; and (4) amino-aromatic interactions, which bring a positively charged or δ+ amino group near the δ− π-electron cloud of an aromatic amino acid side chain. These four interactions are both ubiquitous and numerous, and the distribution of observed geometries for each type of weakly polar interaction differs from that expected from random close packing. Moreover, the results of ab initio quantum mechanical calculations suggest that each occurrence of these interactions makes an enthalpic contribution of between -1 and -2 kcal/mol to the three-dimensional structural stability of a protein. We presented examples of the biological significance of these weakly polar interactions involving proteins, oligopeptides, and protein-ligand binding and advanced a hypothesis regarding their importance in three-dimensional protein structure stabilization and protein-ligand binding. In addition, the role of the hydrophobic effect was examined in the context of the above discussion of noncovalent interactions. Finally, we propose a systematic classification of the types of electrostatic interactions that occur in proteins.

A. Hypothesis

We suggest that packing of amino acid side chains in the hydrophobic core of a protein is determined by at least two requirements: (1) the need to exclude water molecules, and (2) the need to both form and optimize a large number of enthalpically favorable electrostatic interactions, which include ion pairs, “classical” hydrogen bonds, and the enthalpically favorable weakly polar interactions that involve electronic dipole and quadrupole moments, attractive London interactions, and repulsive electron cloud overlaps. Further, we suggest that these electrostatic effects dominate the biological activities and physicochemical properties of proteins. Although surface electrostatic interactions must play a role in determining the local structure of the protein’s hydration shell, we believe that the precise geometric and enthalpic detail of the buried electrostatic interactions largely determines the structure, and through it, the biological function of any given protein. Recent site-directed mutagenesis experiments on T4 lysozyme by Matthews and co-workers have provided intriguing evidence for this view of protein structure (Alber et al., 1987). They documented the occurrence of temperature-sensitive mutations at 18 distinct sites in the protein’s hydrophobic core. The amino acid substitutions that destabilized the protein are chemically varied, and there is no simple pattern in the nature of the substitutions that cause thermal sensitivity. Finally, it is both fashionable and tempting to speculate that electrostatic interactions occurring in the interior of a protein may govern core structure formation and thereby play a role in determining the folding pathway or pathways followed by a polypeptide chain.
B. Proposed Classification of Electrostatic Interactions

The wide variety of electrostatic interactions described above suggests that the terms salt bridge and hydrogen bond do not provide an adequate description of the role of electrostatics in stabilizing protein structure and may actually lead to an erroneously simple view of protein physics. We suggest that electrostatic interactions should be classified by the relative distributions of nuclei and electron clouds and by the distance dependence of the strength of the interaction and its geometric properties. Therefore, all charge-charge interactions would be grouped together because of their 1/r dependence; all charge-dipole interactions would be grouped together because of their 1/r² dependence; all dipole-dipole and charge-quadrupole interactions would be grouped together because of their 1/r³ dependence; all dipole-quadrupole interactions would be grouped together because of their 1/r⁴ dependence; and all quadrupole-quadrupole interactions would be grouped together because of their 1/r⁵ dependence. In addition, the London forces, with their 1/r⁶ dependence, and overlapping electron shell repulsion at very close range (1/r¹² dependence) must also be considered, as must the short-range effects of charges on polarizable groups (1/r⁴ dependence). Such a scheme would eliminate the arbitrary distinctions present in the currently accepted descriptions of protein electrostatics and direct discussion of electrostatic interactions in proteins along physically meaningful lines. Finally, by advancing this classification scheme, we hope to underscore the vital importance of pursuing high-accuracy, atomic-resolution neutron and X-ray crystallographic studies of proteins and models of protein structural motifs, such as α helices, β sheets, and β turns.

ACKNOWLEDGMENTS

We thank Drs. J. Singh and J. M. Thornton for early transmission of their work on phenylalanine interactions in proteins. We appreciate the helpful suggestions made by Dr. H. Franklin Bunn, Dr. David W. Christianson, Professor Daniel S. Kemp, Dr. Jun-Yun Liang, Dr. Max F. Perutz, Professor Sir David C. Phillips, Professor Frederic M. Richards, Dr. Dagmar Ringe, and Dr. Robert F. Tilton, Jr. concerning this review. To all these colleagues we express our gratitude. This work was supported by grant No. GM26788 from the National Institutes of Health to G.A.P. S.K.B. thanks the W. R. Grace Foundation for a research fellowship.
REFERENCES

Adman, E. T., Sieker, L. C., and Jensen, L. H. (1973). J. Biol. Chem. 248, 3987.
Ahern, T. J., Casal, J. I., Petsko, G. A., and Klibanov, A. M. (1987). Proc. Natl. Acad. Sci. U.S.A. 84, 675.
Alber, T., Dao-pin, S., Nye, J., Muchmore, D., and Matthews, B. W. (1987). Biochemistry 26, 3754.
Amidon, G. L., Anik, S., and Rubin, J. (1975). In “Structure and Conformation of Nucleic Acids and Protein-Nucleic Acid Interactions” (M. Sundaralingam and S. T. Rao, eds.), p. 729. Univ. Park Press, Baltimore.
Arison, B. H., Hirschmann, R., Paleveda, W. J., Brady, S. F., and Veber, D. F. (1981). Biochem. Biophys. Res. Commun. 100, 1148.
Babu, Y. S., Sack, J. S., Greenhough, T. J., Bugg, C. E., Means, A. R., and Cook, W. J. (1985). Nature (London) 315, 37.
Baker, E. N., and Hubbard, R. E. (1984). Prog. Biophys. Mol. Biol. 44, 97.
Baldwin, R. L. (1986). Proc. Natl. Acad. Sci. U.S.A. 83, 8069.
Barlow, D. J., and Thornton, J. M. (1983). J. Mol. Biol. 168, 867.
Bashford, D., and Karplus, M. (1988). In preparation.
Berkovitch-Yellin, Z., and Leiserowitz, L. (1984). Acta Crystallogr. B40, 159.
Bevington, P. R. (1969). “Data Reduction and Error Analysis for the Physical Sciences.” McGraw-Hill, New York.
Birktoft, J. J., and Blow, D. M. (1972). J. Mol. Biol. 68, 187.
Blundell, T. L., Lindley, P. F., Miller, L. R., Moss, D. S., Slingsby, C., Tickle, I. J., Turnell, W. G., and Wistow, G. J. (1981). Nature (London) 289, 771.
Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S., and Karplus, M. (1983). J. Comput. Chem. 4, 187.
Buffon. (1801). ‘Essai d'arithmetique morale.’ See his Oeuvres, ed. An. VIM, Vol. xxi, pp. 163 ff.
Burley, S. K. (1987). (1987b). Acta Crystallogr. C43, 1316.
Burley, S. K., and Petsko, G. A. (1985). Science 229, 23.
Burley, S. K., and Petsko, G. A. (1986a). J. Am. Chem. Soc. 108, 7995.
Burley, S. K., and Petsko, G. A. (1986b). FEBS Lett. 203, 139.
Burley, S. K., and Wang, A. H.-J. (1987a). Acta Crystallogr. C43, 797.
Burley, S. K., and Wang, A. H.-J. (1987b). Acta Crystallogr. C43, 988.
Burley, S. K., Wang, A. H.-J., Votano, J. R., and Rich, A. (1987). Biochemistry 26, 5091.
Burley, S. K., Kuriyan, J., Hendrickson, W. A., Karplus, M., and Petsko, G. A. (1988a). In preparation.
Burley, S. K., Ringe, D., and Petsko, G. A. (1988b). Biochemistry, submitted.
Casal, J. I., Ahern, T. J., Davenport, R. C., Jr., Petsko, G. A., and Klibanov, A. M. (1987). Biochemistry 26, 1258.
Cheney, B. V., Schulz, M. W., Cheney, J., and Richards, W. G. (1988). J. Am. Chem. Soc. 110, 4195.
Christianson, D. W., and Lipscomb, W. N. (1987). J. Am. Chem. Soc. 109, 5536.
Cox, E. G., Cruickshank, D. W. J., and Smith, J. A. S. (1958). Proc. R. Soc. Ser. A 247, 1.
Daar, I. O., Artymiuk, P., Phillips, D. C., and Maquat, L. E. (1986). Proc. Natl. Acad. Sci. U.S.A. 83, 7903.
Davenport, R. C., Jr. (1986). Ph.D. thesis, Massachusetts Institute of Technology.
Deakyne, C. A., and Meot-Ner (Mautner), M. (1985). J. Am. Chem. Soc. 107, 474.
Dill, K. A. (1985). Biochemistry 24, 1501.
Donohue, J. (1968). In “Structural Chemistry and Molecular Biology” (A. Rich and N. Davidson, eds.), p. 443. Freeman, San Francisco.
Elber, R., and Karplus, M. (1987). Science 235, 318.
Felsenfeld, G., and Miles, H. T. (1967). Annu. Rev. Biochem. 36, 648.
Fermi, G., and Perutz, M. F. (1981). “Atlas of Molecular Structures in Biology: Hemoglobin and Myoglobin.” Clarendon, Oxford.
Fersht, A. R. (1972). J. Mol. Biol. 64, 497.
Fersht, A. R., Shi, J.-P., Knill-Jones, J., Lowe, D. M., Wilkinson, A. J., Blow, D. M., Brick, P., Carter, P., Waye, M. M. Y., and Winter, G. (1985). Nature (London) 314, 235.
Filman, D. J., Bolin, J. T., Matthews, D. A., and Kraut, J. (1982). J. Biol. Chem. 257, 13663.
Finney, J. L., Gellatly, B. J., Golton, I. C., and Goodfellow, J. (1980). Biophys. J. 32, 17.
Forman, J. D., Burley, S. K., and Petsko, G. A. (1988). In preparation.
Frank, H. S., and Evans, M. W. (1945). J. Chem. Phys. 13, 507.
Fujii, S., Burley, S. K., and Wang, A. H.-J. (1987). Acta Crystallogr. C43, 1008.
Gardell, S. J., Craik, C. S., Hilvert, D., Urdea, M. S., and Rutter, W. J. (1985). Nature (London) 317, 551.
Giammona, D. A., Ringe, D., Mofonen, J. M., and Petsko, G. A. (1988). In preparation.
Gill, S. J. (1985). J. Phys. Chem. 89, 3758.
Gilson, M. K., Sharp, K., Honig, B., Fine, R., and Hagstrom, R. (1987). Biophys. J. 51, 22a.
Gould, R. O., Gray, A. M., Taylor, P., and Walkinshaw, M. D. (1985). J. Am. Chem. Soc. 107, 5921.
Hagler, A. T., Lifson, S., and Huler, E. (1974). In “Peptides, Polypeptides and Proteins” (E. R. Blout, F. A. Bovey, M. Goodman, and N. Lotan, eds.), p. 35. Wiley (Interscience), New York.
Hendrickson, W. A., and Konnert, J. H. (1980). In “Biomolecular Structure, Function and Evolution” (R. Srinivasan, ed.), Vol. I, p. 43. Pergamon, Oxford.
Herzberg, O., and James, M. N. G. (1985). Nature (London) 313, 653.
Hol, W. G. J., van Duijnen, P. T., and Berendsen, H. J. C. (1978). Nature (London) 273, 443.
Hol, W. G. J., Halie, L. M., and Sander, C. (1981). Nature (London) 294, 532.
Honzatko, R. B., Hendrickson, W. A., and Love, W. A. (1985). J. Mol. Biol. 184, 147.
Johnson, C. K. (1967). ORTEPII. Report ORNL-5138. Oak Ridge National Laboratory, Tennessee.
Johnson, L. N. (1984). In “Inclusion Compounds” (J. L. Atwood, J. E. D. Davies, and D. D. MacNicol, eds.), Vol. 3, p. 509. Academic Press, New York.
Jones, J. E. (1924). Proc. R. Soc. London, Ser. A 106, 441.
Karlstrom, G., Linse, P., Wallqvist, A., and Jonsson, B. (1983). J. Am. Chem. Soc. 105, 3777.
Karplus, M., and McCammon, J. A. (1983). Annu. Rev. Biochem. 53, 263.
Kauzmann, W. (1959). Adv. Protein Chem. 14, 1.
Kauzmann, W. (1987). Nature (London) 325, 763.
Kollman, P. A. (1977). In “Modern Theoretical Chemistry” (H. F. Schaefer, ed.), Vol. 4, Chap. 3. Plenum, New York.
Konnert, J. H. Acta Crystallogr. A32, 614.
Konnert, J. H., and Hendrickson, W. A. Acta Crystallogr. A36, 344.
Kretsinger, R. H., and Nockolds, C. E. (1973). J. Biol. Chem. 248, 3313.
Kuriyan, J., Wilz, S., Karplus, M., and Petsko, G. A. (1986). J. Mol. Biol. 192, 133.
Kuriyan, J., Karplus, M., and Petsko, G. A. (1987). Proteins 2, 1.
Kvick, A., Koetzle, T. F., and Thomas, R. (1974). J. Phys. Chem. 61, 2711.
Lai, T. F., and Marsh, R. E. (1972). Acta Crystallogr. B28, 1982.
Langs, D. A., Smith, G. D., Stezowski, J. J., and Hughes, R. E. (1986). Science 232, 1240.
Levitt, M., and Perutz, M. F. (1988). J. Mol. Biol. 201, 751.
Liang, J.-Y., and Lipscomb, W. N. (1986). J. Phys. Chem. 90, 4246.
Lifson, S., Hagler, A. T., and Dauber, P. (1979). J. Am. Chem. Soc. 101, 5111.
Lipscomb, W. N. (1980). Proc. Natl. Acad. Sci. U.S.A. 77, 3875.
London, F. (1937). Trans. Faraday Soc. 33, 8.
Murray-Rust, P., and Glusker, J. P. (1984). J. Am. Chem. Soc. 106, 1018.
Ninio, J., and Orgel, L. E. (1978). J. Mol. Evol. 12, 91.
Novotny, J., and Haber, E. (1985). Proc. Natl. Acad. Sci. U.S.A. 82, 4592.
Nozaki, Y., and Tanford, C. (1971). J. Biol. Chem. 246, 2211.
Oikawa, S., Tsuda, M., Kato, H., and Urabe, T. (1985). Acta Crystallogr. B41, 437.
Olovsson, I., and Jonsson, P.-G. (1976). In “The Hydrogen Bond” (P. Schuster, G. Zundel, and C. Sandorfy, eds.), Vol. 2, p. 393. North-Holland Publ., Amsterdam.
Ornstein, R. L., and Fresco, J. R. (1988). In preparation.
Pauling, L. (1960). “The Nature of the Chemical Bond,” 3rd Ed. Cornell Univ. Press, Ithaca, New York.
Perutz, M. F. (1970). Nature (London) 228, 726.
Perutz, M. F., and Raidt, H. (1975). Nature (London) 255, 256.
Perutz, M. F., Fermi, G., Abraham, D. J., Poyart, C., and Bursaux, E. (1986). J. Am. Chem. Soc. 108, 1064.
Peters, D., and Peters, J. (1980). J. Mol. Struct. 68, 255.
Petsko, G. A., and Ringe, D. (1984). Annu. Rev. Biophys. 13, 331.
Pflugrath, J. W., and Quiocho, F. A. (1985). Nature (London) 314, 257.
Privalov, P. L. (1979). Adv. Protein Chem. 33, 167.
Quiocho, F. A., and Vyas, N. K. (1984). Nature (London) 310, 381.
Rahim, Z., and Barman, B. N. (1978). Acta Crystallogr. A34, 761.
Rashin, A. A., and Honig, B. H. (1984). J. Mol. Biol. 173, 515.
Reed, A. E., and Weinhold, F. (1983). J. Chem. Phys. 78, 4066.
Reid, K. S. C., Lindley, P. F., and Thornton, J. M. (1985). FEBS Lett. 190, 209.
Reiher, W. E., III. (1985). Ph.D. thesis, Harvard University.
Rein, R. (1975). Adv. Quantum Chem. 7, 335.
Richards, F. M. (1977). Annu. Rev. Biophys. Bioeng. 6, 151.
Ringe, D., Petsko, G. A., Yamakura, F., Suzuki, K., and Ohmori, D. (1983). Proc. Natl. Acad. Sci. U.S.A. 80, 3879.
Ringe, D., Seaton, D. B., Gelb, M., and Abeles, R. H. (1985). Biochemistry 24, 64.
Rogers, N. K., and Sternberg, M. J. E. (1984). J. Mol. Biol. 174, 527.
Rogers, N. K., Moore, G. R., and Sternberg, M. J. E. (1985). J. Mol. Biol. 182, 613.
Rose, G. D., Geselowitz, A. R., Lesser, G. J., Lee, R. H., and Zehfus, M. H. (1985). Science 229, 834.
Sawyer, L., and James, M. N. G. (1982). Nature (London) 295, 79.
Sharp, K., Fine, R., and Honig, B. (1987). Science 236, 1460.
Sheldrick, G. M. (1976). “SHELX76, Program for Crystal Structure Determination.” Univ. of Cambridge Press, Cambridge.
Sheridan, R. P., Levy, R. M., and Salemme, F. R. (1982). Proc. Natl. Acad. Sci. U.S.A. 79, 4545.
Shoemaker, K. R., Kim, P. S., York, E. J., Stewart, J. M., and Baldwin, R. L. (1987). Nature (London) 326, 563.
Shoham, G., Christianson, D. W., and Oren, D. A. (1988). Proc. Natl. Acad. Sci. U.S.A. 85, 684.
Singh, J., and Thornton, J. M. (1985). FEBS Lett. 190, 1.
Slater, J. C., and Kirkwood, J. G. (1931). Phys. Rev. 37, 682.
Smith, J. C., and Karplus, M. (1988). In preparation.
Smith, W. W., Pattridge, K. A., Ludwig, M. L., Petsko, G. A., Tsernoglou, D., Tanaka, M., and Yasunobu, K. T. (1983). J. Mol. Biol. 165, 737.
Summers, L., Wistow, G., Narebor, M., Moss, D., Lindley, P. F., Slingsby, C., Blundell, T. L., Bartunik, J., and Bartels, K. (1984). Pept. Protein Rev. 4, 147.
Sundralingham, M. (1966). Acta Crystallogr. 21, 495.
Stewart, R. F., and Jensen, L. H. (1967). Acta Crystallogr. 23, 1102.
Takano, T., and Dickerson, R. E. (1981). J. Mol. Biol. 153, 79.
Tanford, C. (1962). Adv. Protein Chem. 17, 69.
Tanford, C. (1968). Adv. Protein Chem. 23, 121.
Tanford, C. (1980). “The Hydrophobic Effect,” 2nd Ed. Wiley, New York.
Tapia, O., and Johannin, G. J. (1981). J. Chem. Phys. 75, 3624.
Taylor, R., and Kennard, O. (1982). J. Am. Chem. Soc. 104, 5063.
Taylor, R., and Kennard, O. (1984). Acc. Chem. Res. 17, 320.
Taylor, R., Kennard, O., and Versichel, W. (1983). J. Am. Chem. Soc. 105, 5761.
Teeter, M. M. (1985). In “Molecular Dynamics and Protein Structure: Proceedings of a Workshop” (J. Hermans, ed.), p. 177. UNC Printing Company, Chapel Hill.
Thomas, K. A., Smith, G. M., Thomas, T. B., and Feldman, R. J. (1982). Proc. Natl. Acad. Sci. U.S.A. 79, 4843.
Tilton, R. F., Jr., and Petsko, G. A. (1988). Biochemistry 27, in press.
Tilton, R. F., Jr., Kuntz, I. D., Jr., and Petsko, G. A. (1984). Biochemistry 23, 2849.
Tüchsen, E., and Woodward, C. H. (1987). Biochemistry 26, 1918.
Umeyama, H., and Morokuma, K. (1977). J. Am. Chem. Soc. 99, 1316.
Valentine, W. N., Tanaka, K. R., and Paglia, D. E. (1983). In “The Metabolic Basis of Inherited Disease” (J. B. Stanbury, J. B. Wyngarden, D. S. Frederickson, J. L. Goldstein, and M. S. Brown, eds.), p. 1606. McGraw-Hill, New York.
Villafranca, J. E., Howell, E. E., Voet, D. H., Strobel, M. S., Ogden, R. C., Abelson, J. N., and Kraut, J. (1984). Science 222, 782.
Wada, A., and Nakamura, H. (1981). Nature (London) 293, 757.
Walter, R., Smith, C. W., and Roy, J. (1976). Proc. Natl. Acad. Sci. U.S.A. 73, 3054.
Wang, A. H.-J., and Burley, S. K. (1987a). Acta Crystallogr. C43.
Wang, A. H.-J., and Burley, S. K. (1987b). Acta Crystallogr. C43, 1011.
Warshel, A., and Levitt, M. (1976). J. Mol. Biol. 103, 227.
Weiner, S. J., Kollman, P. A., Nguyen, D. T., and Case, D. A. (1986). J. Comput. Chem. 7, 230.
Wilkinson, A. J., Fersht, A. R., Blow, D. M., and Winter, G. (1983). Biochemistry 22, 3581.
Wlodawer, A., Walter, J., Huber, R., and Sjölin, L. (1984). J. Mol. Biol. 180, 301.
STABILITY OF PROTEIN STRUCTURE AND HYDROPHOBIC INTERACTION

By PETER L. PRIVALOV* and STANLEY J. GILL†

*Institute of Protein Research, Academy of Sciences of the USSR, Moscow Region, USSR
†Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80308
List of Symbols
I. Introduction
II. Calorimetric Studies of Protein Denaturation
III. Studies of Dissolution of Nonpolar Substances into Water
IV. Hydration of Nonpolar Molecules
A. Thermodynamics of Dissolution of Nonpolar Molecules
B. Two-Step Description of Dissolution
V. Comparison of Results on Protein Denaturation and Hydrocarbon Dissolution in Water
VI. Mechanism of Stabilization of Compact Protein Structures
References
ADVANCES IN PROTEIN CHEMISTRY, Vol. 39. Copyright © 1988 by Academic Press, Inc. All rights of reproduction in any form reserved.

LIST OF SYMBOLS

$a_i$: activity of ith species
effective surface area of a molecule
$C_p^D$: heat capacity of denatured state at constant pressure
$C_p^N$: heat capacity of native state at constant pressure
change in heat capacity at constant pressure from pure liquid to dissolved aqueous states
change in heat capacity at constant pressure from pure liquid to gaseous states
change in heat capacity at constant pressure from pure gas to dissolved aqueous states
$\Delta_N^D C_p$: change in heat capacity at constant pressure from native to denatured states
$\Delta_N^D G$: change in Gibbs energy at given T, pH, and solution activity $a_i$ from native to denatured states
change in Gibbs energy from hypothetical compact state to dissolved aqueous state
hydration free energy defined as change in Gibbs energy from hypothetical compact state to dissolved aqueous state
hydration enthalpy defined as change in enthalpy from hypothetical compact state to dissolved aqueous state
hydration entropy defined as change in entropy from hypothetical compact state to dissolved aqueous state
$\Delta h$: enthalpy difference between ground and first enthalpy state of solvated water molecule surrounding an apolar solute for two-state hydration model
$\Delta_N^D H$: change in enthalpy at given T, pH, and solution activity $a_i$ from native to denatured states
change in enthalpy from pure liquid state to dissolved state in water
change in enthalpy from pure gas state to dissolved state in water
change in enthalpy from pure liquid state to gas state
change in enthalpy from pure hydrocarbon liquid state to hypothetical compact state
change in enthalpy from hypothetical compact state to dissolved aqueous state
change in entropy from hypothetical compact state to dissolved aqueous state
$\Delta_N^D S$: change in entropy at given T, pH, and solution activity $a_i$ from native to denatured states
D: denatured state of protein
$G^D$: Gibbs energy of denatured state
$G^N$: Gibbs energy of native state
$H^D$: enthalpy of denatured state
$H^N$: enthalpy of native state
N: native state of protein
number of solvated water molecules in first solvation shell surrounding a solute
pH: -log[proton concentration]
R: gas constant
$S^D$: entropy of denatured state
$S^N$: entropy of native state
T: absolute temperature
temperature at which both states of two-state hydration model for apolar solutes are equal
$T_G$: high transition temperature where population of native and denatured states is equal
$T_G'$: low transition temperature where population of native and denatured states is equal
temperature where the hydration enthalpy is zero for hydrophobic solutes
temperature where the enthalpy of transfer is zero for hydrophobic solutes
$T_H^{\mathrm{den}}$: enthalpy inversion temperature where enthalpy of denaturation is zero
temperature where the hydration entropy is zero for hydrophobic solutes
$T_S^{\mathrm{den}}$: entropy inversion temperature where entropy of denaturation is zero
$T_{\max}$: temperature at which Gibbs energy of denaturation is at its maximum value
$T_x$: approximate temperature (110-140°C) above which enthalpy and entropy of denaturation is constant and universal for all globular proteins
X: mole fraction solute
This article summarizes results of calorimetric studies of protein denaturation and of dissolution of nonpolar substances in water. An analysis of the available experimental data shows that the stabilization of the compact state of a protein, usually attributed to hydrophobic interactions, is primarily due to van der Waals interactions between the protein nonpolar groups, and that the contribution of water solvation by these groups, in spite of the widely held opinion, actually destabilizes the compact state. This destabilizing action of water solvation increases as the temperature decreases, and at sufficiently low temperatures causes unfolding of the compact structure of a protein, i.e., cold denaturation.
I. INTRODUCTION

The mechanism of folding and stabilization of native protein structures has been discussed for at least half a century, during which time attention has been concentrated alternately on various possible intramolecular forces, acting between the groups in the protein, and the interactions with water that might determine its unique three-dimensional structure. These forces include van der Waals interactions, interactions between charged groups (salt links), interactions between polar groups (hydrogen bonds), and the so-called hydrophobic interactions between nonpolar groups (Kauzmann, 1959). The latter are usually assumed to be the reason for low solubility of nonpolar substances in water, i.e., their hydrophobicity. In water-soluble proteins, most of the nonpolar groups are not exposed to the water solvent (Tanford, 1962; Perutz, 1965; Janin, 1979; Wolfenden et al., 1981; Rose et al., 1985), and thus one assumes that hydrophobic interactions play a major role in the formation of the compact structure of globular proteins, as they do in the case of an oil drop in water. The interest in hydrophobic interactions was stimulated by their unusual thermodynamic properties: it was argued and believed that they are governed, not by enthalpic, but by entropic features, characterized by the undesirable entropy decrease of water in the vicinity of nonpolar groups (Frank and Evans, 1945; Kauzmann, 1959; Franks, 1975; Tanford, 1980). This conclusion was reached largely from consideration of solvation effects at room temperature. The number of polar groups in proteins is almost the same as the number of nonpolar ones; and according to crystallographic studies, most of them are arranged at distances suggesting hydrogen bond formation. Thus hydrogen bonds were invoked to various degrees of importance in explaining the stabilization of the native structure.
However, one could only speculate on the real contribution of the various interactions stabilizing protein structure. Not only was there a lack of quantitative information on these interactions, but also the concept of structural stability of a protein was itself unclear, since the macromolecule consists of numerous structural elements in thermal motion.
One can judge the stability of any structure by studying its disruption. So the stability of a protein can be determined by the disruption of its native structure, i.e., by denaturation. Since a protein can be regarded as essentially a macroscopic system, the disruption of its structure is just a change of the macroscopic state of the system. However, the main problem was to define the macroscopic states that can be realized under a given range of external conditions. The result of particular importance for the purposes of this article came with the observation that two states, native and denatured, describe the dominant species in the denaturation of small globular proteins. The population of intermediates is so small that they can be neglected for most purposes. The solution of this thermodynamic problem emerged with the appearance of the scanning microcalorimetric technique, which allows direct measurement of the energy change on disruption of protein structure with increasing temperature. Protein structures can also be disrupted by variation of many other intensive thermodynamic parameters such as pressure, pH, and concentration of denaturant, and the process can be studied by many other appropriate methods. But temperature and thermal energy can be measured to high precision, and they are basic conjugated intensive and extensive thermodynamic parameters whose functional dependence includes all information on the macroscopic states (see Lumry et al., 1966; Freire and Biltonen, 1978; Wyman, 1981; Gill et al., 1985b). Another important event contributing to the progress in this field was the development of reaction microcalorimetry, which has permitted direct measurement of heat effects involved with the transfer of hydrophobic substances from a nonpolar environment to water. These processes have been thought to mimic the unfolding of compact protein structures.
Prior to the development of direct calorimetric techniques, all information on the interaction of a hydrophobic substance with water was obtained from equilibrium studies. However, the results were limited in accuracy, particularly those properties that are obtained by consecutive temperature differentiation of the solubility, for example, the change in heat capacity. In this article we shall examine the main achievements of microcalorimetric studies of protein denaturation and of the dissolution of nonpolar substances in water. This analysis has led us to reconsider the popular point of view on the mechanism of hydrophobic interaction and its role in the stabilization of protein structures.

II. CALORIMETRIC STUDIES OF PROTEIN DENATURATION
Among the most important accomplishments of calorimetric studies of protein denaturation has been the establishment of the following general features.
FIG. 1. Partial specific heat capacity of sperm whale metmyoglobin in aqueous solutions with different pH values in the temperature range in which heat denaturation takes place. The observed heat capacity peak corresponds to the heat absorption upon protein denaturation that also results in a significant heat capacity increase $\Delta_N^D C_p$ [for details see Privalov et al. (1986)].
1. The denaturation of small globular proteins represents a process in which essentially only two macroscopic states are displayed, native (N) and denatured (D), while the populations of all other (intermediate) states are small. Therefore, to a good approximation, denaturation of small proteins can be regarded as a two-state transition that proceeds with absorption of a definite energy (Fig. 1). It follows then that a small protein molecule represents a single cooperative macroscopic system in which the constituent structural elements (amino acid residues) do not change their state independently (Privalov, 1963, 1979; Privalov and Khechinashvili, 1974; Freire and Biltonen, 1978).

2. The native structure of large proteins is disrupted in several discrete stages in each of which discrete amounts of energy are absorbed. Each of these steps corresponds to the “all-or-none” breakdown of definite structural blocks of the protein molecule. Therefore, the large protein structure is not a monolith but appears to be composed of discrete, more or less independent, cooperative blocks, i.e., domains (Wetlaufer, 1973; Janin and Wodak, 1983; Privalov, 1982; see also Privalov et al., 1981; Privalov and Medved’, 1982; Potekhin and Privalov, 1982; Novokhatny et al., 1984).
It appears that the discreteness of a structure is a general principle of protein architecture, which not only reflects the evolution of the protein molecule but also has a deep physical ground (Privalov, 1985, 1986). It is just this unique thermodynamic property of the protein molecule that has made possible a quantitative definition of protein stability.
Indeed, since the macroscopic states of a protein are discrete, they are described by discrete surfaces in the phase space of considered variables (Pfeil and Privalov, 1976c). The small globular proteins, or individual cooperative domains, which have only two stable macroscopic states, the native (N) and denatured (D), are described by two surfaces in the phase space, corresponding to their extensive thermodynamic functions. The transition between these states is determined by the differences of

Enthalpy:

$$\Delta_N^D H(T, \mathrm{pH}, a_i) = H^D(T, \mathrm{pH}, a_i) - H^N(T, \mathrm{pH}, a_i) \tag{1}$$

Entropy:

$$\Delta_N^D S(T, \mathrm{pH}, a_i) = S^D(T, \mathrm{pH}, a_i) - S^N(T, \mathrm{pH}, a_i) \tag{2}$$

Gibbs energy:

$$\Delta_N^D G(T, \mathrm{pH}, a_i) = G^D(T, \mathrm{pH}, a_i) - G^N(T, \mathrm{pH}, a_i) = \Delta_N^D H(T, \mathrm{pH}, a_i) - T\,\Delta_N^D S(T, \mathrm{pH}, a_i) \tag{3}$$
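The link between the Gibbs energy difference of Eq. (3) and the populations of the two states can be sketched numerically. The following is a minimal illustration, assuming the standard two-state relation $K = \exp(-\Delta_N^D G / RT)$ for the equilibrium constant [D]/[N]; the function name `fraction_denatured` and the numerical inputs are hypothetical and are not taken from the article.

```python
import math

R = 8.314  # gas constant, J K^-1 mol^-1


def fraction_denatured(delta_g, temperature):
    """Fraction of molecules in the denatured state for a two-state N <-> D system.

    delta_g:     Gibbs energy difference G_D - G_N, in J per mole of protein
    temperature: absolute temperature, in K
    """
    # Equilibrium constant K = [D]/[N] (assumed standard two-state relation)
    k = math.exp(-delta_g / (R * temperature))
    return k / (1.0 + k)


# Equal populations when the Gibbs energy difference is zero
print(fraction_denatured(0.0, 330.0))

# Hypothetical stability of 40 kJ/mol at 25 degrees C: almost all molecules native
print(fraction_denatured(40_000.0, 298.15))
```

In this picture a positive Gibbs energy difference of a few tens of kilojoules per mole already leaves only a negligible denatured population, which is why the quantity serves as a measure of stability.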
From the discontinuity of thermodynamic functions specifying the state of the protein, it follows that protein denaturation can be regarded as a first-order phase transition (Privalov, 1979; Pfeil, 1981). It should be emphasized that the native and denatured states of a protein depend on the environmental conditions, but these modify the protein states gradually and cannot be considered as phase transitions, i.e., as transitions between macroscopic states, but only as transitions between microscopic states, corresponding to the same macroscopic state (see, e.g., Griko et al., 1988a). The Gibbs energy difference of the denatured and native states corresponds to the work required for the transition of a system from the native to the denatured state, i.e., the work of disruption of the native cooperative structure. Therefore, this quantity is usually considered as a measure of the stability of the cooperative structure, i.e., the stability of a small globular protein or cooperative domain. As for the large proteins, their stability cannot be expressed by a single value, but only by a set of values specifying the stability for each domain within these molecules and the interaction between the domains.

3. The transition of a protein or a single cooperative domain from the native to the denatured state is always accompanied by a significant increase of its partial heat capacity (see, for reviews, Sturtevant, 1977; Privalov, 1979). The denaturational increment of heat capacity $\Delta_N^D C_p = C_p^D - C_p^N$ amounts to 25-40% of the partial heat capacity of the native protein and does not depend noticeably on the environmental conditions under which denaturation proceeds (Fig. 1) or on the method of denaturation. However, it is different for different proteins and seems to correlate with the number of contacts between nonpolar groups in native proteins (Table I).
On the other hand, the partial specific heat capacities of denatured states of different proteins appear to be rather similar (Tiktopulo et al., 1987). Therefore, it is the partial specific heat capacities of the native states of various proteins that differ significantly (Fig. 2) (Privalov et al., 1988). There have been many attempts to determine the temperature dependence of the denaturational heat capacity increment of proteins. In the limited temperature range from 20 to 80°C, this quantity, as seen in Fig. 2, seems to be temperature-independent, i.e., the heat capacities of the native and denatured states change in parallel upon an increase in temperature (Privalov, 1963; Brandts and Hunt, 1967; Privalov and Khechinashvili, 1974; Pfeil and Privalov, 1976a,b; Privalov, 1979; Pfeil, 1981; Khechinashvili and Tsetlin, 1984). However, in a broader temperature range (0-130°C), the results shown in Fig. 2 suggest that the heat capacities of the native and denatured forms converge; i.e., whereas the heat capacity of the native state increases linearly with temperature (in any case in the range from 0 to 80°C), the heat capacity of the denatured state is a nonlinear function of temperature in the range from 0 to 130°C (Privalov et al., 1988). Thus, extrapolating the heat capacity of the native state linearly above 80°C, one comes to the conclusion that $\Delta_N^D C_p$ decreases to zero at about 140°C (Fig. 3).

TABLE I
Thermodynamic Parameters of Denaturation of Compact Globular Proteins with Average Molecular Weight^a

| Protein | Molecular weight | $N_H$ | $N_W$ | $\Delta_N^D C_p$ | $\Delta_N^D H$(25°C) | $\Delta_N^D S$(25°C) | $\Delta_N^D H$(110°C) | $\Delta_N^D S$(110°C) | References |
|---|---|---|---|---|---|---|---|---|---|
| Ribonuclease A | 13,600 | 0.69 | 0.76 | 43.5 | 2.37 | 6.70 | 6.06 | 17.8 | Privalov et al. (1973); Privalov and Khechinashvili (1974) |
| Parvalbumin | 11,500 | 0.71 | 0.71 | 46.0 | 1.36 | 2.80 | 6.12 | 16.8 | Filimonov et al. (1978); Privalov (1979) |
| Egg-white lysozyme | 14,300 | 0.71 | 0.86 | 51.7 | 2.02 | 5.52 | 6.24 | 17.6 | Khechinashvili et al. (1973); Privalov and Khechinashvili (1974) |
| Fragment K4 of plasminogen | 97,000 | - | - | 51.7 | 1.82 | 5.00 | 6.32 | 18.0 | Novokhatny et al. (1984) |
| β-Trypsin | 23,800 | 0.79 | 0.91 | 57.7 | 1.33 | 3.43 | 6.23 | 17.9 | Tischenko and Gorodnov (1979); Privalov (1979) |
| α-Chymotrypsin | 25,200 | 0.79 | 1.08 | 57.7 | 1.26 | 3.57 | 6.17 | 18.0 | Tischenko et al. (1974); Privalov and Khechinashvili (1974) |
| Papain | 23,400 | 0.68 | 1.00 | 60.1 | 0.93 | 1.60 | 6.22 | 17.0 | Tiktopulo and Privalov (1978); Privalov (1979) |
| Staphylococcus nuclease | 16,800 | - | - | 61.3 | 0.85 | 2.10 | 6.05 | 17.5 | Calderon et al. (1985) |
| Carbonic anhydrase | 29,000 | - | - | 63.3 | 0.80 | 1.76 | 6.17 | 17.6 | Tatunashvili and Privalov (1986) |
| Cytochrome c | 12,400 | 0.64 | 1.26 | 67.3 | 0.65 | 0.90 | 6.37 | 17.8 | Privalov and Khechinashvili (1974) |
| Pepsinogen | 40,000 | - | - | 73.3 | -0.24 | -0.27 | 6.47 | 18.7 | Mateo and Privalov (1981) |
| Myoglobin | 17,900 | 0.85 | 1.37 | 74.5 | 0.04 | -0.80 | 6.37 | 17.9 | Atanasov et al. (1972); Privalov and Khechinashvili (1974) |
| Average value | - | 0.73 | 1.00 | - | 1.10 | 3.00 | 6.25 | 17.6 | - |

^a $N_H$, number of hydrogen bonds per amino acid residue; $N_W$, number of contacts of nonpolar groups per amino acid residue; $\Delta_N^D C_p$ in J K$^{-1}$ per mole of amino acid residue at 50°C; $\Delta_N^D H$ in kilojoules per mole of amino acid residue obtained on the assumption that $\Delta_N^D C_p$ is constant; $\Delta_N^D S$ in J K$^{-1}$ per mole of amino acid residue obtained on the assumption that $\Delta_N^D C_p$ is constant. The average molecular weight of a residue is accepted to be 115.
FIG. 2. Temperature dependence of the partial specific heat capacity for pancreatic ribonuclease A (RNase), hen egg-white lysozyme (Lys), sperm whale myoglobin (Mb), and catalase from Thermus thermophilus (CTT). The flattened curves are for RNase and Lys with disrupted disulfide cross-links and for apomyoglobin, when polypeptide chains have a random coil conformation without noticeable residual structure (Privalov et al., 1988).

FIG. 3. Denaturational increment of the partial specific heat capacity of pancreatic ribonuclease A (RNase), hen egg-white lysozyme (Lys), and sperm whale myoglobin (Mb). The dashed lines represent the parts of these functions that were obtained by a linear extrapolation of the partial heat capacity of the native state. The dot-and-dash lines show the behavior when the values measured at 50°C are assumed to be temperature independent.
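The constant heat capacity increment assumption behind Table I can be checked with a short calculation. The sketch below takes the ribonuclease A row of Table I (values per mole of amino acid residue) and extrapolates the 25°C enthalpy and entropy to 110°C; the function name `extrapolate` is hypothetical, and small differences from the tabulated 110°C values are expected from rounding of the tabulated inputs.

```python
import math


def extrapolate(dh_ref, ds_ref, dcp, t_ref, t):
    """Extrapolate denaturation enthalpy and entropy assuming a constant dcp.

    dh_ref: enthalpy at t_ref, J/mol
    ds_ref: entropy at t_ref, J/(K mol)
    dcp:    heat capacity increment, J/(K mol)
    t_ref, t: absolute temperatures, K
    """
    dh = dh_ref + dcp * (t - t_ref)
    ds = ds_ref + dcp * math.log(t / t_ref)
    return dh, ds


# Ribonuclease A, Table I (per mole of amino acid residue)
DH_25 = 2370.0   # J/mol  (2.37 kJ/mol)
DS_25 = 6.70     # J/(K mol)
DCP = 43.5       # J/(K mol)
T_25 = 298.15    # K
T_110 = 383.15   # K

dh_110, ds_110 = extrapolate(DH_25, DS_25, DCP, T_25, T_110)
dg_25 = DH_25 - T_25 * DS_25  # Gibbs energy difference at 25 degrees C, J/mol

print(f"dH(110 C) = {dh_110 / 1000:.2f} kJ/mol (Table I lists 6.06)")
print(f"dS(110 C) = {ds_110:.1f} J/(K mol) (Table I lists 17.8)")
print(f"dG(25 C)  = {dg_25:.0f} J per mole of residue")
```

The per-residue Gibbs energy obtained this way is only a few hundred joules per mole, small compared with the enthalpy and entropy terms from which it is formed.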
The main consequence of a heat capacity difference between native and denatured states of a protein is that the thermodynamic functions that determine the transition between these states are all temperature-dependent. Indeed, since

$$\frac{\partial\,\Delta_N^D H}{\partial T} = \Delta_N^D C_p \qquad \text{and} \qquad \frac{\partial\,\Delta_N^D S}{\partial T} = \frac{\Delta_N^D C_p}{T}$$

the temperature dependence of the enthalpy and entropy differences of the native and denatured states at fixed pH and environmental conditions $a_i$ are given as

$$\Delta_N^D H(T) = \Delta_N^D H(T_G) + \int_{T_G}^{T} \Delta_N^D C_p \, dT \tag{4}$$

$$\Delta_N^D S(T) = \Delta_N^D S(T_G) + \int_{T_G}^{T} \frac{\Delta_N^D C_p}{T} \, dT \tag{5}$$

where $T_G$ is chosen to be the transition temperature at which the population of the native and denatured states is equal. Therefore, the Gibbs energy difference of these states at $T_G$ is zero,

$$\Delta_N^D G(T_G) = \Delta_N^D H(T_G) - T_G\,\Delta_N^D S(T_G) = 0$$

and thus

$$\Delta_N^D S(T_G) = \frac{\Delta_N^D H(T_G)}{T_G}$$

In the approximation that $\Delta_N^D C_p$ is considered constant, Eqs. (4) and (5) can be expressed simply as

$$\Delta_N^D H(T) = \Delta_N^D H(T_G) - (T_G - T)\,\Delta_N^D C_p \tag{6}$$

$$\Delta_N^D S(T) = \frac{\Delta_N^D H(T_G)}{T_G} - \Delta_N^D C_p \ln\frac{T_G}{T} \tag{7}$$

Thus the Gibbs energy difference between the native and denatured protein states is given as

$$\Delta_N^D G(T) = \Delta_N^D H(T_G)\,\frac{T_G - T}{T_G} - (T_G - T)\,\Delta_N^D C_p + T\,\Delta_N^D C_p \ln\frac{T_G}{T} \tag{8}$$

The functions $\Delta_N^D H$, $\Delta_N^D S$, and $\Delta_N^D G$ are presented in Figs. 4-6 for two globular proteins that differ markedly in their $\Delta_N^D C_p$ values (see Table I). The dot-and-dash line represents the situation where $\Delta_N^D C_p$ is assumed to be temperature-independent and equal to the value measured near 50°C in the denaturation experiment. As seen, this approximation is good in the temperature range from 0 to 80°C. In this temperature range, the enthalpy difference between the native and denatured states decreases linearly with a decrease in temperature. Consequently, one can expect that at some sufficiently low temperature, $T_H^{\mathrm{den}}$, the enthalpy difference reaches zero and then inverts its sign. From Eq. (6), this temperature is

$$T_H^{\mathrm{den}} = T_G - \frac{\Delta_N^D H(T_G)}{\Delta_N^D C_p}$$

Similarly, the entropy difference of the native and denatured states according to Eq. (7) should also decrease with a decrease in temperature, but nonlinearly, and should reach zero at a somewhat higher temperature, $T_S^{\mathrm{den}}$, than that at which the enthalpy function inverts its sign:

$$T_S^{\mathrm{den}} = T_G \exp\left[-\frac{\Delta_N^D H(T_G)}{T_G\,\Delta_N^D C_p}\right]$$
FIG. 4. Temperature dependence of the specific enthalpy of denaturation of myoglobin and ribonuclease A (per mole of amino acid residues) in solutions with pH and buffer providing maximal stability of these proteins and compensation of heat effects of ionization (see Privalov and Khechinashvili, 1974). The broken extension of the solid lines represents a region that is less certain due to uncertainty in the $\Delta_N^D C_p$ function (see Fig. 2). The dot-and-dash lines represent the functions calculated with the assumption that the denaturation heat capacity increment is temperature independent.
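The constant heat capacity increment approximation can be written out as a few small functions. This is a minimal sketch, assuming the standard forms that follow from integrating a constant increment from the transition temperature; the function names and the input values are hypothetical (a myoglobin-like increment of 74.5 J K$^{-1}$ per mole of residue from Table I, with an assumed transition temperature of 350 K and an assumed transition enthalpy of 3900 J per mole of residue).

```python
import math


def delta_h(t, t_g, dh_g, dcp):
    """Enthalpy of denaturation at temperature t (constant dcp)."""
    return dh_g - (t_g - t) * dcp


def delta_s(t, t_g, dh_g, dcp):
    """Entropy of denaturation at temperature t (constant dcp)."""
    return dh_g / t_g - dcp * math.log(t_g / t)


def delta_g(t, t_g, dh_g, dcp):
    """Gibbs energy of denaturation at temperature t (constant dcp)."""
    return delta_h(t, t_g, dh_g, dcp) - t * delta_s(t, t_g, dh_g, dcp)


def enthalpy_inversion_temperature(t_g, dh_g, dcp):
    """Temperature at which delta_h changes sign."""
    return t_g - dh_g / dcp


def entropy_inversion_temperature(t_g, dh_g, dcp):
    """Temperature at which delta_s changes sign."""
    return t_g * math.exp(-dh_g / (t_g * dcp))


# Hypothetical inputs, per mole of amino acid residue
T_G = 350.0     # K, assumed heat denaturation temperature
DH_G = 3900.0   # J/mol, assumed enthalpy at T_G
DCP = 74.5      # J/(K mol), myoglobin-like increment

t_h = enthalpy_inversion_temperature(T_G, DH_G, DCP)
t_s = entropy_inversion_temperature(T_G, DH_G, DCP)
print(f"enthalpy inversion at {t_h:.1f} K, entropy inversion at {t_s:.1f} K")
```

With these inputs the entropy inversion temperature lies a few kelvin above the enthalpy inversion temperature, in line with the ordering described in the text.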
FIG. 5. Temperature dependence of the specific entropy of denaturation of myoglobin and ribonuclease A (per mole of amino acid residues) under the same conditions as indicated in Fig. 4. The dot-and-dash lines represent the functions calculated on the assumption that the denaturational heat capacity increment is temperature independent.

A decrease in the heat of denaturation with decreasing denaturation temperature has been observed for all proteins studied calorimetrically. It has also been found that, when the effects of ionization of the protein and buffer and ligation of the denaturants are taken into account, a single temperature function describes either the enthalpy or entropy of the transition process as predicted by Eqs. (4) and (5) (Privalov and Khechinashvili, 1974; Pfeil and Privalov, 1976a,b). The Gibbs energy difference between the native and denatured states is represented by Eq. (8), a function with an extremum (Fig. 6). Its maximum value is reached at the temperature $T_{\max}$, which can be determined from the condition

$$\frac{\partial\,\Delta_N^D G}{\partial T} = -\Delta_N^D S = 0$$

from which it then follows that

$$T_{\max} = T_S^{\mathrm{den}} \tag{11}$$
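The statement that the Gibbs energy difference peaks where the entropy of denaturation vanishes can be checked numerically. The sketch below assumes a constant heat capacity increment and reuses hypothetical per-residue inputs (transition temperature 350 K, transition enthalpy 3900 J/mol, increment 74.5 J K$^{-1}$ mol$^{-1}$); the golden-section search is a generic maximizer chosen for illustration, not a method from the article.

```python
import math


def delta_g(t, t_g, dh_g, dcp):
    """Gibbs energy of denaturation for a constant heat capacity increment."""
    return (dh_g * (t_g - t) / t_g
            - (t_g - t) * dcp
            + t * dcp * math.log(t_g / t))


def argmax_golden(f, lo, hi, tol=1e-4):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return 0.5 * (a + b)


# Hypothetical inputs, per mole of amino acid residue
T_G, DH_G, DCP = 350.0, 3900.0, 74.5

t_max_numeric = argmax_golden(lambda t: delta_g(t, T_G, DH_G, DCP), 250.0, 350.0)
t_s_closed_form = T_G * math.exp(-DH_G / (T_G * DCP))

print(f"numerical maximum of dG at {t_max_numeric:.2f} K")
print(f"entropy inversion temperature {t_s_closed_form:.2f} K")
```

Because the second derivative of the Gibbs energy difference is $-\Delta_N^D C_p / T$, which is negative for a positive increment, the function is concave and the search converges to a single maximum.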
Since the Gibbs energy difference of the native and denatured states determines the stability of a cooperative unit, it follows from Eq. (11) that the stability of a small globular protein (or a single domain) is maximal at the temperature at which the entropies of the native and denatured states are equal. At this temperature, the structure is stabilized only by the enthalpy difference between these states. At temperatures above and below $T_{\max}$, the protein stability decreases: upon heating, it reaches zero at temperature $T_G$; upon cooling, it reaches zero at temperature $T_G'$. If $\Delta_N^D C_p$ is constant, the relation between these temperatures follows from Eq. (8) with $\Delta_N^D G(T_G') = 0$.

FIG. 6. The Gibbs energy difference of the native and denatured states of myoglobin and ribonuclease A calculated per mole of amino acid residues under the same conditions as indicated in Fig. 4. The dot-and-dash lines represent functions obtained on the assumption that the denaturation heat capacity increment is temperature independent.
One expects a significant amount of both the native and denatured protein structure in the vicinity of these two temperatures. The disruption of the native state on heating is usually called heat denaturation, since it proceeds with heat absorption and, consequently, with an increase in the molecular enthalpy and entropy. The disruption of the native structure on cooling, which we can call by analogy cold denaturation, should then proceed with a release of heat and, hence, with a decrease in enthalpy and entropy, because both of these functions have reversed their signs before reaching temperature $T_G'$. Upon the breakdown of a protein's ordered native structure, a decrease in entropy, i.e., the increase of order of the system, seems to be a paradox. This might be the reason why cold denaturation, predicted almost 20 years ago by Brandts (1964, 1969) on the basis of a van't Hoff analysis of optical studies of heat denaturation, has not received adequate attention. Doubts concerning this prediction were aggravated by the difficulty of making direct observations of the proposed phenomenon, because the predicted value of $T_G'$ for all the known proteins was far below the freezing point of aqueous solutions. There was some evidence for a low-temperature decrease of protein stability in the presence of either high pressure (Hawley, 1971; Zipp and Kauzmann, 1973) or denaturants (Pace and Tanford, 1968; Nojima et al., 1978) (see also Hatley and Franks, 1986). However, only recently was it shown directly using scanning microcalorimetry on various proteins that cold denaturation does indeed proceed with the release of heat as the protein solution is cooled (see Fig. 7) (Privalov et al., 1986; Griko et al., 1988a,b).
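The low transition temperature can be located numerically as the second root of the Gibbs energy function. The following sketch assumes a constant heat capacity increment and uses hypothetical per-residue inputs (heat denaturation at 350 K, transition enthalpy 3900 J/mol, increment 74.5 J K$^{-1}$ mol$^{-1}$); the bisection routine and all names are illustrative choices, not part of the original analysis.

```python
import math


def delta_h(t, t_g, dh_g, dcp):
    """Enthalpy of denaturation for a constant heat capacity increment."""
    return dh_g - (t_g - t) * dcp


def delta_g(t, t_g, dh_g, dcp):
    """Gibbs energy of denaturation for a constant heat capacity increment."""
    return (dh_g * (t_g - t) / t_g
            - (t_g - t) * dcp
            + t * dcp * math.log(t_g / t))


def bisect_root(f, lo, hi, tol=1e-6):
    """Find a root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Hypothetical inputs, per mole of amino acid residue
T_G, DH_G, DCP = 350.0, 3900.0, 74.5

# Temperature of maximal stability, used as the upper end of the bracket
t_max = T_G * math.exp(-DH_G / (T_G * DCP))

t_cold = bisect_root(lambda t: delta_g(t, T_G, DH_G, DCP), 200.0, t_max)
print(f"cold denaturation temperature: {t_cold:.1f} K")
print(f"enthalpy of denaturation there: {delta_h(t_cold, T_G, DH_G, DCP):.0f} J/mol")
```

For these assumed inputs the root falls below the freezing point of water and the enthalpy of denaturation at that temperature is negative, which matches the expectation that unfolding on cooling releases heat.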
Thus, it has become evident that the lower values of enthalpy and entropy of the denatured state predicted for low temperatures, relative to those of the native state, are not a fiction resulting from an excessive extrapolation of thermodynamic properties of the protein, but a fact that requires explanation.

FIG. 7. Microcalorimetric recording of the heat effect on cooling and subsequent heating of a metmyoglobin solution at pH 3.83 (abscissa: temperature, °C). The low-temperature peaks correspond to heat release on cold denaturation and heat absorption on subsequent renaturation of the protein. The shift of these peaks in temperature is caused by the slow kinetics of unfolding and folding of the myoglobin structure at low temperature (for details, see Privalov et al., 1986).

4. The specific enthalpy and entropy of the conformational transition of proteins from the native to the denatured state have an upper limit that is reached above 140°C and seems to be universal for all compact globular proteins (Figs. 4 and 5). By enthalpy and entropy of conformational transition, we mean the enthalpy and entropy of denaturation from which the enthalpy and entropy effects of ionization of protein and buffer are excluded. (For details, see Privalov and Khechinashvili, 1974; Pfeil and Privalov, 1976a,b.) It is noteworthy that the use of the approximation $\Delta_N^D C_p$ = constant does not lead to a significant modification of these values but only decreases the temperature $T_x$ (to 110°C) at which the apparently universal values are reached. $T_x$ represents an approximate temperature above which either the entropy or the enthalpy change upon denaturation (per mole of amino acid residue) has the same constant value regardless of the specific protein. Since one obtains essentially the same enthalpy (or entropy) value for different proteins, one can infer that this limit has some general physical basis. For compact proteins with molecular masses greater than 10,000 and saturation of native structure by intramolecular hydrogen bonds of about 0.75 ± 0.10 mole of bonds per mole of amino acid residues, the asymptotic values of enthalpy and entropy of the conformational transition, calculated per amino acid residue, amount to $\Delta_N^D H(T_x) = (6.25 \pm 0.2)$ kJ mol$^{-1}$ and $\Delta_N^D S(T_x) = (17.6 \pm 0.6)$ J K$^{-1}$ mol$^{-1}$. For some noncompact proteins (e.g., histones) or small globular proteins with molecular masses
less than 10,000 (e.g., neurotoxins) whose structure is less saturated by hydrogen bonds and contacts between nonpolar groups, the specific enthalpy and entropy of the conformational transition extrapolated to $T_x$ are lower than those indicated above (Privalov, 1979; Tiktopulo et al., 1982; Khechinashvili and Tsetlin, 1984).

The general thermodynamic properties of proteins reported above give rise to several questions: What do the asymptotic (at $T_x$) values of the denaturation enthalpy and entropy mean, and why are they apparently universal for very different proteins? Why should the denaturation enthalpy and entropy depend so much on temperature and consequently have negative values at low temperature? In other words, why is the denaturational increment of the protein heat capacity so large, with a value such that the specific enthalpies and entropies of various proteins converge to the same values at high temperature?

The denaturational increment of the heat capacity might be explained partly by the increase in the configurational freedom of the protein molecule upon denaturation. However, as was shown by Sturtevant (1977) and Velicelebi and Sturtevant (1979), the contribution of this effect to the observed denaturational increment of the protein heat capacity cannot be large. This conclusion becomes especially evident from the impossibility of using this configurational effect alone to explain the negative values of the enthalpy and entropy of protein denaturation at low temperatures.

The denaturational increment of the heat capacity can also be partly explained by a gradual melting of the residual structure in the denatured protein on heating. The existence of residual structure in denatured proteins, and especially in heat-denatured proteins, has been discussed for a long time (see, e.g., Tanford, 1968).
However, all attempts to find, by direct calorimetric measurements, a difference between the heat capacity of proteins denatured by heat, in which residual structure is suspected, and that of proteins unfolded completely to a random coil state by 6 M guanidinium chloride (Pfeil and Privalov, 1976b) or by disruption of disulfide cross-links (see Fig. 2) (Privalov et al., 1988) have failed. Certainly, the assumption that the observed $\Delta_N^D C_p$ value is due entirely to gradual melting of the residual structure leads to the absurd conclusion that at temperatures below $T_H$, when the enthalpy of the denatured state is lower than that of the native one, the residual structure in the denatured protein would be more extensive than the structure in the native protein.

The most plausible explanation for the significant denaturational increment of the protein heat capacity is that it is due to water that comes in contact with the protein nonpolar groups exposed upon denaturation
(Brandts, 1967; Privalov and Khechinashvili, 1974; Sturtevant, 1977). It is assumed that water ordering increases in the vicinity of nonpolar groups (Kauzmann, 1959). If the order of the water molecules surrounding nonpolar groups decreases faster than that of bulk water as the temperature rises, one will observe the gradual melting of ordered water as an increment of the partial heat capacity of the protein in aqueous media. Let us consider the experimental facts that form the basis for the assumption that protein nonpolar groups influence the state of the surrounding water.

III. STUDIES OF DISSOLUTION OF NONPOLAR SUBSTANCES INTO WATER

Equilibrium and calorimetric studies of the thermodynamics of dissolution of various gaseous and liquid nonpolar substances into water have led to the establishment of the following facts.

1. The enthalpy of dissolution of gaseous nonpolar molecules into water is always negative at room temperature, and its absolute value is proportional to the accessible surface area of the solute molecule (Frank and Evans, 1945; Tanford, 1980; Dec and Gill, 1984, 1985a,b; Olofsson et al., 1984). The most reliable calorimetric data on the enthalpy of dissolution of various nonpolar gases, the noble gases and hydrocarbons, are collected in Table II. The very small gaseous molecules (helium and neon) were not included. The surface area of each molecule considered, $A_s$, has been calculated from the known spatial structure of the molecule and is expressed either in Å$^2$ or as the number of contacting water molecules, $N_s$, assuming that each water molecule occupies an area of about 9 Å$^2$ (Hermann, 1972). The direct correlation between the enthalpy of dissolution of a gas in water, $\Delta_g^w H$, and the surface area of the solute molecule, $A_s$, or the number of water molecules contacting the solute molecule, $N_s$, is seen from column five in Table II.

The enthalpy of dissolution of different liquid hydrocarbons in water provides additional information. The transfer of a nonpolar molecule from the pure liquid phase (l) to water (w) can be represented by two steps: (1) transfer from the liquid phase to the gaseous phase, i.e., vaporization, and (2) transfer from the gaseous phase into water (Fig. 8). Consequently, the enthalpy of transfer can be presented as

$$\Delta_l^w H = \Delta_l^g H + \Delta_g^w H$$

Therefore, we obtain

$$\Delta_g^w H = \Delta_l^w H - \Delta_l^g H$$
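The two-step cycle above can be checked numerically. The following is a minimal sketch, not part of the original text: the function name is hypothetical, and the input values are those printed in the first row of Table III (2.08 and 33.85 kJ mol$^{-1}$).

```python
def hydration_enthalpy(dH_liquid_to_water, dH_vaporization):
    """Gas-to-water enthalpy from the two-step cycle; all values in kJ/mol.

    Implements dH(g->w) = dH(l->w) - dH(l->g). The function name is a
    hypothetical helper chosen for this illustration.
    """
    return dH_liquid_to_water - dH_vaporization


# First row of Table III (liquid hydrocarbon at 25 C).
dH_lw = 2.08    # liquid -> water, kJ/mol
dH_lg = 33.85   # liquid -> gas (vaporization), kJ/mol

dH_gw = hydration_enthalpy(dH_lw, dH_lg)
print(f"gas -> water enthalpy: {dH_gw:.2f} kJ/mol")
```

The result agrees with the gas-to-water value tabulated for the same row, which is simply the statement that enthalpy is a state function.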
TABLE II
Enthalpy and Heat Capacity Increment of Solution of Some Nonpolar Gases in Water at 25°C

Substance                  $A_s$   $N_s$   $\Delta_g^w H$   $\Delta_g^w H/N_s$   $\Delta_g^w C_p$   $\Delta_g^w C_p/N_s$   $\Delta_l^w C_p$   $\Delta_l^w C_p/N_s$   $T_H(g \to w)$
Argon, Ar                  143     15.9    -11.9?           -753                 20?                12.6                                                              85.1
Methane, CH4               152     16.9    -13.27           -785                 217                12.8                                                              86.0
Krypton, Kr                155     17.2    -15.29           -889                 220                12.8                                                              90.0
Xenon, Xe                  168     18.7    -18.99           -1015                250                13.4                                                              100.5
Ethylene, C2H4             169     18.8    -16.55           -880                 236                12.5                                                              95.1
Ethane, C2H6               191     21.2    -19.48           -919                 284                13.4                                                              93.6
Cyclopropane, c-C3H6       209     22.8    -23.03           -1010                303                13.3                                                              101.0
Propylene, H2CCHCH3        207     23.0    -21.17           -920                 278                12.1                                                              101.1
Propane, C3H8              223     24.8    -23.11           -932                 319                12.9                   264                10.6                   92.8
Butylene, H2CCHCH2CH3      240     26.7    -24.14           -904                 389                14.6                   331                12.4                   87.1
Isobutane, i-C4H10         249     27.7    -24.45           -883                 377                13.6                   301                10.9                   89.9
n-Butane, n-C4H10          255     28.3    -25.30           -894                 390                13.8                   331                11.6                   89.9
Average value                                               -886                                    13.1                                      11.3
Standard error                                              ±85                                     ±0.6                                      ±0.8

$A_s$, surface area of the solute molecule in Å$^2$; $N_s$, number of water molecules contacting the solute molecule; $\Delta H$ in kJ per mole of the solute; $\Delta H/N_s$ in J per mole of water; $\Delta C_p$ in J K$^{-1}$ per mole of the solute; $\Delta C_p/N_s$ in J K$^{-1}$ per mole of water; $T_H(g \to w)$ in °C; $\Delta_l^w C_p = \Delta_g^w C_p + (\Delta H_{vap}^{b.p.} - \Delta H_{vap}^{298.15})/(T_{b.p.} - 298.15)$.
Sources: Dec and Gill (1985); Olofsson et al. (1984); Hermann (1972); Dec and Gill (1984); Naghibi et al. (1987a); Gill et al. (1985a); Zwolinski and Wilhoit (1971).
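The per-water normalization used in Table II can be reproduced with a few lines of arithmetic. This is a minimal sketch under the assumption stated in the text that each contacting water molecule covers about 9 Å$^2$; the function names are hypothetical, and the methane inputs (152 Å$^2$, -13.27 kJ mol$^{-1}$) are the values printed in the table.

```python
AREA_PER_WATER = 9.0  # A^2 per contacting water molecule (Hermann, 1972, as quoted in the text)


def contacting_waters(surface_area):
    """Number of water molecules contacting a solute of the given surface area (A^2)."""
    return surface_area / AREA_PER_WATER


def enthalpy_per_water(dH_kJ_per_mol, n_s):
    """Dissolution enthalpy per mole of contacting water, in J/mol."""
    return 1000.0 * dH_kJ_per_mol / n_s


# Methane row of Table II.
n_s_methane = round(contacting_waters(152.0), 1)
dH_per_water_methane = enthalpy_per_water(-13.27, n_s_methane)

print(n_s_methane, round(dH_per_water_methane))
```

Running the same arithmetic over the other rows is a convenient way to cross-check individual table entries against the tabulated ratios.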
FIG. 8. Scheme of various ways of transfer of nonpolar solute into water. [Diagram connecting four states: gaseous phase (g) of nonpolar substance; liquid phase (l) of pure nonpolar substance; water phase (w); condensed phase (c) of noninteracting nonpolar molecules.]
The values of $\Delta_l^w H$, $\Delta_l^g H$, and $\Delta_g^w H$ at 25°C for various liquid hydrocarbons are given in Table III, from which it becomes clear that the $\Delta_g^w H$ values for these hydrocarbons are negative at 25°C and that their absolute values are proportional to the surface areas $A_s$ of the molecules considered, a relationship that also holds for nonpolar molecules that are gases at normal temperatures and pressures. Interestingly, the enthalpy of solution of liquid hydrocarbons into water is zero at a temperature near room temperature (Kauzmann, 1959; Gill et al., 1976; Tanford, 1980); thus the enthalpy of hydration ($\Delta_g^w H$) is very close to the negative of the enthalpy of vaporization of the compounds at this temperature. Members of these two groups, i.e., those characterized as gases or liquids at room temperature and normal pressure, have values of $\Delta_g^w H/N_s$ that are rather close, with a deviation of only ±10%, despite large differences in surface areas.
TABLE III
Enthalpy and Heat Capacity Increment of Solution of Some Liquid Hydrocarbons in Water at 25°C

Substance        $N_s$   $\Delta_l^w H$   $\Delta_l^g H$   $\Delta_g^w H$   $\Delta_g^w H/N_s$   $\Delta_l^w C_p$   $\Delta_l^w C_p/N_s$   $\Delta_g^w C_p$   $\Delta_g^w C_p/N_s$
(not legible)    26.7    2.08             33.85            -31.77           -1270                225                8.43                   279                10.45
(not legible)    30.4    1.73             37.99            -36.26           -1240                263                8.65                   315                10.3
(not legible)    30.3    -2.0             26.43            -28.43           -872                 400                13.2                   460                15.2
(not legible)    31.0    -0.10            33.04            -33.14           -1070                360                11.6                   406                13.1
(not legible)    31.4    0                31.55            -31.55           -1000                440                14.0                   492                15.7
(not legible)    33.5    2.02             42.25            -40.23           -1260                318                9.49                   376                11.2
(not legible)    37.1    2.30             46.23            -43.93           -1250                391                10.5                   449                12.1
Average value                                                               -1140 ± 160                             10.8 ± 2.2                                12.6 ± 2.2

$\Delta_l^g C_p$ values as printed, row assignment not recoverable: -54, -50, -46, -52, -58, -58, -52.

$N_s$, number of water molecules contacting the solute molecule; $\Delta H$ in kJ per mole of the solute; $\Delta H/N_s$ in J per mole of solute per mole of water; $\Delta C_p$ in J K$^{-1}$ per mole of the solute; $\Delta C_p/N_s$ in J K$^{-1}$ per mole of solute per mole of water; $T_H(g \to w)$ in °C.
Sources: Hermann (1972); Gill et al. (1976); Zwolinski and Wilhoit (1971); Weast (1970).
2. The heat capacity change upon aqueous dissolution of nonpolar molecules from the gaseous phase and from the liquid phase is positive and is proportional to the surface area of the solute molecule (Edsall, 1935; Gill and Wadsö, 1976). On the basis of initial calorimetric measurements (Gill et al., 1976; Olofsson et al., 1984; Dec and Gill, 1984, 1985), one can represent the enthalpy of transfer of hydrocarbons from the gaseous phase to water by a linear function of temperature in the temperature range 15-35°C. Bearing in mind Kirchhoff's relation between the enthalpy and heat capacity changes of a reaction, one can conclude that the transfer of nonpolar molecules to water leads to an increase of heat capacity by a value that is independent of temperature in this temperature range. The values of the heat capacity increment upon transfer of various nonpolar substances into water at 25°C are given in Tables II and III. As in the previous case, one can write for the transfer from the liquid phase to water

$$\Delta_l^w C_p = \Delta_l^g C_p + \Delta_g^w C_p$$

and thus

$$\Delta_g^w C_p = \Delta_l^w C_p - \Delta_l^g C_p \qquad (14)$$

As seen from the tables, the heat capacity increment is also proportional to the surface area of the molecule, and the values of $\Delta_g^w C_p/N_s$ are rather close for all the compounds studied. Therefore, one can conclude that the heat capacity increment of solution for nonpolar molecules in water is mainly caused by the water solvating these molecules.

3. The heat capacity change upon dissolution of nonpolar molecules from the gaseous phase decreases with increasing temperature. From general consideration of water interactions with a nonpolar molecule, one can expect that the influence of this molecule on the state of water should decrease with increasing temperature and, therefore, that the solution heat capacity increment should decrease with increasing temperature. Assuming that water molecules in a hydrated shell behave independently, Gill et al. (1985a) suggested that the contribution of water solvation to the heat capacity change is given by the following expression, which approximately represents the heat capacity increment upon transfer of nonpolar molecules to water:

$$\Delta_g^w C_p = N_s \frac{(\Delta h)^2}{RT^2} \frac{\exp[(-\Delta h/R)(1/T - 1/T_m)]}{\{1 + \exp[(-\Delta h/R)(1/T - 1/T_m)]\}^2} \qquad (15)$$

where $\Delta h$ is the enthalpy difference of a solvated water molecule between the ground state and the next enthalpically important state the solvated water molecule can have. Here $T_m$ is the temperature at which the concentrations of solvated water molecules in these two states are equal.
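Equation (15) is straightforward to evaluate. The sketch below is an illustration rather than the authors' own calculation: the function name is hypothetical, the default parameters ($\Delta h$ = 6500 J mol$^{-1}$, $T_m$ = 480 K) are the values quoted in the text in connection with Fig. 9, and $N_s$ = 16.9 is the methane entry of Table II.

```python
import math

R = 8.314  # gas constant, J K^-1 mol^-1


def cp_per_water(T, dh=6500.0, Tm=480.0):
    """Two-state heat capacity increment per mole of hydration water (J K^-1 mol^-1).

    Eq. (15) divided by N_s. T and Tm in kelvin, dh in J/mol. The defaults
    are the parameter values quoted in the text for Fig. 9.
    """
    k = math.exp(-(dh / R) * (1.0 / T - 1.0 / Tm))
    return (dh ** 2 / (R * T ** 2)) * k / (1.0 + k) ** 2


cp_25 = cp_per_water(298.15)
cp_100 = cp_per_water(373.15)
print(f"per water at 25 C:  {cp_25:.1f} J/K/mol")
print(f"per water at 100 C: {cp_100:.1f} J/K/mol")
print(f"methane (N_s = 16.9) at 25 C: {16.9 * cp_25:.0f} J/K/mol")
```

With these parameters the model gives roughly 11 J K$^{-1}$ per mole of water at 25°C, somewhat below the tabulated averages, and the value falls as the temperature rises, which is the qualitative behavior the expression is meant to capture.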
This behavior is borne out by experiment. A definite decrease of $\Delta_g^w C_p$ with increasing temperature in the temperature range of 0-50°C was indicated by calorimetric studies of the heats of dissolution of various hydrocarbon gases, namely, methane (Naghibi et al., 1986), ethane and propane (Naghibi et al., 1987a), and butanes (Naghibi et al., 1987b). According to Shinoda and Fujihara (1968) and Shinoda (1977), who analyzed the dissolution of nonpolar liquids in water over a broad temperature range up to 270°C, a noticeable decrease of the heat capacity increment occurs at temperatures above 80°C, and at 160°C the heat capacity increment drops essentially to zero. From the methane work of Rettich et al. (1981), the decrease of the solution heat capacity increment occurs at even lower temperatures. Recently the partial specific heat capacity of benzene and toluene in water has been measured directly over the broad temperature range from 0 to 150°C, using the scanning microcalorimetric technique (Makhatadze and Privalov, 1987). It has been found that the solution increment of the heat capacity of these two substances decreases asymptotically in the studied temperature range and that the $\Delta_g^w C_p$ values calculated per mole of hydrated water are in good correspondence with the results found for methane (Naghibi et al., 1986) and with those predicted by Eq. (15) with the parameters $\Delta h$ = 6500 J mol$^{-1}$ and $T_m$ = 480 K (Fig. 9). Thus, one can conclude that the solution heat capacity increment $\Delta_g^w C_p$ is a universal function of the surface area of nonpolar substances and of temperature and is described to a good approximation by Eq. (15). Applying the established temperature dependence of $\Delta_g^w C_p$ to the substances listed in Tables II and III, one finds that the enthalpy of transfer of all these substances from the gaseous phase to water decreases to zero within the temperature range 100-180°C (Fig. 10).
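The temperature at which the hydration enthalpy reaches zero can be estimated from Eq. (15) in closed form, because the integral of the two-state heat capacity is $N_s \Delta h$ times the change in the fraction of water molecules in the upper state. The following is a sketch under stated assumptions: the function names are hypothetical, the parameters are those quoted above for Fig. 9, and the methane inputs are the Table II entries. Because Eq. (15) with these parameters does not reproduce the measured heat capacity increment exactly, the numbers should be read as illustrative.

```python
import math

R = 8.314  # gas constant, J K^-1 mol^-1


def upper_state_fraction(T, dh=6500.0, Tm=480.0):
    """Fraction of hydration-shell waters in the upper enthalpy state at T (K)."""
    k = math.exp(-(dh / R) * (1.0 / T - 1.0 / Tm))
    return k / (1.0 + k)


def zero_enthalpy_temperature(dH_T0, n_s, dh=6500.0, Tm=480.0, T0=298.15):
    """Temperature (K) where dH_T0 + integral of Eq. (15) from T0 equals zero.

    The integral of Eq. (15) from T0 to T is n_s * dh * (f(T) - f(T0)),
    with f the upper-state fraction, so the root can be written explicitly.
    """
    f_target = upper_state_fraction(T0, dh, Tm) - dH_T0 / (n_s * dh)
    if not 0.0 < f_target < 1.0:
        raise ValueError("the model enthalpy never reaches zero for these inputs")
    k_target = f_target / (1.0 - f_target)
    return 1.0 / (1.0 / Tm - (R / dh) * math.log(k_target))


# Methane, Table II: enthalpy of dissolution -13.27 kJ/mol, N_s = 16.9.
dH_298, n_s, dh, T0 = -13270.0, 16.9, 6500.0, 298.15
T_H_model = zero_enthalpy_temperature(dH_298, n_s)

# Linear extrapolation with the model heat capacity frozen at its 25 C value.
f0 = upper_state_fraction(T0)
cp_298 = n_s * dh ** 2 / (R * T0 ** 2) * f0 * (1.0 - f0)
T_H_linear = T0 - dH_298 / cp_298

print(f"Eq. (15):             {T_H_model - 273.15:.0f} C")
print(f"constant increment:   {T_H_linear - 273.15:.0f} C")
```

Under these assumptions the model places the zero of the methane hydration enthalpy slightly above 100°C, inside the 100-180°C range stated in the text, while freezing the heat capacity increment at its 25°C value gives a lower temperature.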
As is evident, when one linearly extrapolates the $\Delta_g^w H$ values determined at 25°C, using the usual assumption that $\Delta_g^w C_p$ is temperature independent, one finds a lower value of the temperature $T_H(g \to w)$ at which the hydration enthalpy is zero (see the last column in Table II). It is clear, however, that these values, obtained by linear extrapolation, i.e., assuming a constant heat capacity increment, have only a fictitious meaning. Nevertheless, in all cases one can conclude that the heat of solvation becomes zero at an elevated temperature in the range of 410 ± 40 K.

FIG. 9. The $\Delta_g^w C_p/N_s$ function for (1) methane (Naghibi et al., 1986), (2) benzene, and (3) toluene (Makhatadze and Privalov, 1987). Abscissa: T/°C.

4. The entropy of solution of a nonpolar substance (liquid or gas) in water is always negative at 25°C, and its absolute value decreases as the temperature increases (see Tables IV and V). The negative value of the entropy of solution of nonpolar substances in water is an immediate consequence of the small (liquids) or rather negative (gases) enthalpies of solution along with the large and positive Gibbs energy of solution, which follows from the low solubility (mole fraction $X$) of these substances (Frank and Evans, 1945; Kauzmann, 1959; Franks and Reid, 1973; Tanford, 1980). Indeed, since

$$\Delta_{l,g}^{w} S = \frac{\Delta_{l,g}^{w} H - \Delta_{l,g}^{w} G}{T} = \frac{\Delta_{l,g}^{w} H + RT \ln X}{T} \qquad (16)$$
at $X \ll 1$ and $\Delta_{l,g}^{w} H \le 0$, then $\Delta_{l,g}^{w} S < 0$. As for the decrease of its absolute value with increasing temperature, this follows from the positive heat capacity increment for the dissolution of a nonpolar substance in water and the negative value of the entropy of dissolution at room temperature (see Eq. 15). The standard entropy change for solution of several gases is presented in Table IV as a function of temperature, using Eq. (15) to estimate the temperature dependence of $\Delta_g^w C_p/N_s$. The standard entropy change for solution of several liquid hydrocarbons at 25°C is shown in Table V. The standard entropy changes for solution of the compounds shown in Table V are plotted in Fig. 11, in which the temperature dependence of $\Delta_l^w C_p$ is approximated by an appropriately scaled exponential function that closely represents the results shown in Fig. 9, i.e., $\Delta_l^w C_p(T) = \Delta_l^w C_p(T_0) \exp[-b(T - T_0)]$, where $T_0$ is 298 K and $b$ is 0.0041.

It may be noted that the differences between $\Delta_g^w S°$ and $\Delta_l^w S°$ are all close to 90 J K$^{-1}$ mol$^{-1}$. This is not surprising; indeed, we are actually comparing standard entropies of the transfer of nonpolar substances into water from two different phases, the gaseous one at normal pressure and the condensed one (a pure liquid), and they should differ by the entropy of vaporization of the condensed phase. According to Trouton's rule (Prigogine and Defay, 1954), the entropies of vaporization of all nonpolar liquids at their normal boiling points are almost the same and close to 90 J K$^{-1}$ mol$^{-1}$. For the gases considered, their values, reduced to the standard temperature of 25°C, are listed in the last column of Table IV.

Bearing this in mind, it is clear from Fig. 11 that the entropy of transfer of all nonpolar molecules from the liquid phase to water becomes equal to zero in a rather limited temperature range, $T_S \approx$ 130-160°C. This important behavior was noticed first by Baldwin (1987), who assumed that the heat capacity increment was temperature independent. However, as illustrated in Fig. 12 for liquid benzene and pentane, the temperature dependence of the heat capacity of transfer does not lead to essential changes in this conclusion. An immediate consequence of the fact that the entropy of transfer of all nonpolar substances into water is zero at $T_S$ is a clear proportionality between the entropy of solution at 25°C and the surface area of the solute (see Tables IV and V). Indeed, if, as noted, the heat capacity increment of the transfer of a nonpolar molecule into water is proportional to $N_s$ and is a universal function of temperature, one can expect, according to Eq. (9), that the entropy of transfer should also be proportional to $N_s$ at any temperature $T$ if it is zero at some temperature $T_S$.

FIG. 10. Temperature dependence of $\Delta_g^w H$ for some nonpolar substances listed in Tables II and III for which the $\Delta_g^w C_p$(25°C) values are known with the highest accuracy: (1) CH4, (2) C2H6, (3) c-C3H6, (4) C6H6, (5) C7H8, (6) C8H10, (7) C?H12. Abscissa: T/K and T/°C.

FIG. 11. Temperature dependence of $\Delta_l^w S°$ for liquids (1) n-propane, (2) n-butane, (3) i-butane, (4) pentane, (5) hexane, (6) cyclohexane, (7) benzene, (8) toluene, and (9) ethylbenzene. High-temperature behavior of the heat capacity change is described by exponential scaling [see Eq. (15) and p. 216]. Abscissa: T/K and T/°C.

TABLE IV
Entropy of Solution of Some Nonpolar Gases in Water, Calculated on the Assumption That $\Delta_g^w C_p$ Decreases with Increasing Temperature as It Does in the Case of Benzene

Substance     $N_s$   25°C    50°C    75°C    100°C   125°C
Ar            15.9    -12?    -113    -100    -90     -81
CH4           16.9    -132    -115    -101    -90     -81
Kr            17.2    -134    -117    -103    -92     -83
Xe            18.7    -142    -123    -107    -94     -84
C2H6          21.2    -151    -133    -118    -106    -96
H2CCHCH3      23.0    -147    -125    -108    -93     -82
c-C3H6        22.8    -149    -125    -107    -91     -78
C3H8          24.8    -165    -140    -120    -102    -88
i-C4H10       27.7    -173    -145    -121    -102    -86
n-C4H10       28.3    -175    -146    -122    -101    -85

$\Delta_l^g S°$(25°C) values of the last column as printed, row assignment not recoverable: 95, 97, 97, 107, 98, 95, 90, 90, 95.

$\Delta S$ in J K$^{-1}$ mol$^{-1}$. The $\Delta_l^g S°$ values were calculated from the enthalpy of vaporization, the boiling temperature, and the heat capacity differences in the liquid and gaseous states from Weast (1970).
Sources: Dec and Gill (1985); Weast (1970); Rettich et al. (1981); Wilhelm et al. (1977).

TABLE V
Solubility, Gibbs Energy, and Entropy of Transfer of Hydrocarbons from the Liquid Phase to Water

Substance   Solubility at 25°C      $\Delta_l^w G°$(25°C)   $\Delta_l^w S°$(25°C)   $T_S$
C6H6        4.01 × 10$^{-4}$        19.4                    -58                     139
C7H8        1.01 × 10$^{-4}$        22.8                    -71                     145
C8H10       0.258 × 10$^{-4}$       26.2                    -81                     133
C5H12       0.117 × 10$^{-4}$       28.2                    -95                     132
C6H12       0.095 × 10$^{-4}$       28.7                    -103                    126
C6H14       0.020 × 10$^{-4}$       32.5                    -109                    121
C3H8                                14.41                   -75.32                  156
i-C4H10                             21.71                   -89.14                  164
n-C4H10                             23.14                   -93.2                   153

Solubility in mole fractions; $\Delta G$ in kJ mol$^{-1}$; $\Delta S$ in J K$^{-1}$ mol$^{-1}$; $T_S$ in °C. The properties of the last three substances in their hypothetical liquid state under standard conditions were estimated by appropriate conversion from gas to dissolved state thermodynamic properties. Temperature dependence of the heat capacity change is described by exponential scaling (see p. 217).
Sources: Franks et al. (1963); McAuliffe (1966); Dec and Gill (1984); Naghibi et al. (1987a,b); Wilhelm et al. (1977); Zwolinski et al. (1971).

FIG. 12. Temperature dependence of $\Delta_l^w H°$, $T\Delta_l^w S°$, and $\Delta_l^w G°$ of transfer for benzene and pentane into water. The solid line is computed assuming $\Delta_l^w C_p$ depends on temperature according to Eq. (15), and the dashed line represents the calculation assuming $\Delta_l^w C_p$ is temperature independent. Abscissa: T, K.

IV. HYDRATION OF NONPOLAR MOLECULES

A. Thermodynamics of Dissolution of Nonpolar Molecules

The established correspondence of the ratios $\Delta_g^w H°/N_s$, $\Delta_g^w C_p/N_s$, and $\Delta_g^w S°/N_s$ for such different molecules as those of noble gases and multiatomic hydrocarbons shows that the observed heat effect of the transfer of a nonpolar molecule to water is caused mainly by the changes in the water contacting the nonpolar molecule, i.e., by hydration of these molecules. The large negative entropy of the transfer of a nonpolar substance to water at room temperature indicates a definite increase of the order in water in the presence of such solutes. It was suggested by Frank and Evans (1945) that water forms frozen patches or microscopic icebergs around nonpolar molecules, the extent of iceberg formation increasing with the size of the solute molecule. Later, these icebergs were considered as "flickering clusters" (Frank and Wen, 1957). However, leaving aside the specification of various possible models of hydration of nonpolar molecules (for review, see, e.g., Franks and Reid, 1973), let us consider formally whether hydration in itself might cause the low solubility of these substances, particularly of liquid hydrocarbons, in water.
These attract special attention because they can be used as close models of the apolar compact core of a globular protein. As was shown in the previous section, the entropy of transfer of all liquid hydrocarbons into water is zero at a temperature $T_S$ that is about 140°C. This observation means that the Gibbs energy of transfer has a maximum at that temperature, because $\partial \Delta G/\partial T = -\Delta S$ (see Fig. 13), and it follows that the process of transferring liquid hydrocarbons into water is most unfavorable at this temperature. Because the entropy of transfer is zero at this temperature, the Gibbs energy of transfer is determined entirely by the enthalpy change [$\Delta_l^w H°(T_S)$]. It includes the energy of disrupting contacts between nonpolar molecules as well as the enthalpy of hydration of these molecules upon transfer into water.

It appears that there are two temperatures of a universal nature that describe the thermodynamic properties for the dissolution of liquid hydrocarbons into water. The first of these, $T_H$, is the temperature at which the heat of solution is zero; it has a value of approximately 20°C for a variety of liquids. The second universal temperature is $T_S$, where the standard-state entropy change is zero; as noted, $T_S$ is about 140°C. The standard-state free energy change can be expressed in terms of these two temperatures, requiring knowledge only of the heat capacity change for an individual substance
$$\Delta_l^w G° = \int_{T_H}^{T} \Delta_l^w C_p \, dT - T \int_{T_S}^{T} \frac{\Delta_l^w C_p}{T} \, dT \qquad (17)$$

In the approximation where $\Delta_l^w C_p$ is constant, $T_S$ is 113°C (Baldwin, 1987), and Eq. (17) can be integrated to yield

$$\Delta_l^w G° = \Delta_l^w C_p (T - T_H) - T \Delta_l^w C_p \ln(T/T_S) \qquad (18)$$

which can also be expressed to first order by

This function¹ is plotted for a number of hydrocarbon liquids in Fig. 13. Since the heat capacity change, when normalized to the number of solvating water molecules, is also a universal parameter, one sees that the free energy change per solvating water molecule is a general function of temperature alone. A similar relationship holds as well for the entropy and enthalpy change per solvating water molecule, as can be seen, for example, in Eq. (18). It should be noted that the preceding equations assume the existence of a temperature $T_S$ where $\Delta_l^w S°$ is zero.

The temperature dependence of the solubility may also be expressed in terms of the parameters given generally by Eq. (17) or, for the simple case of constant heat capacity change, by Eq. (18). In the latter situation, we rearrange Eq. (18) to give $\Delta_l^w G°/T$, which is $-R \ln(\text{solubility})$, to obtain the result

$$\Delta_l^w G°/T = \Delta_l^w C_p (1 - T_H/T) - \Delta_l^w C_p \ln(T/T_S) \qquad (20)$$

¹ The actual calculation of this function is facilitated from values known at 298 K by use of the equation: $\Delta_l^w G° = [(298 - T)/298]\Delta_l^w H(298) - (298 - T)\Delta_l^w C_p + T\Delta_l^w C_p \ln(298/T)$.
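Equations (18) and (20) can be explored numerically to see where each function peaks. This is a minimal sketch, not the authors' calculation: the function name is hypothetical, the heat capacity increment of 225 J K$^{-1}$ mol$^{-1}$ is the first entry of Table III, and $T_H$ = 20°C and $T_S$ = 113°C are the approximate values quoted in the text for the constant heat capacity case.

```python
import math


def transfer_free_energy(T, dCp, T_H, T_S):
    """Eq. (18): liquid-to-water Gibbs energy (J/mol) for a constant dCp (J/K/mol)."""
    return dCp * (T - T_H) - T * dCp * math.log(T / T_S)


dCp = 225.0    # J K^-1 mol^-1, first row of Table III
T_H = 293.15   # K, about 20 C as quoted in the text
T_S = 386.15   # K, 113 C as quoted for the constant-dCp approximation

grid = [250.0 + 0.5 * i for i in range(401)]  # 250 K to 450 K in 0.5 K steps

# Temperature of the maximum of the transfer free energy, Eq. (18).
T_at_max_G = max(grid, key=lambda T: transfer_free_energy(T, dCp, T_H, T_S))

# Temperature of the maximum of dG/T, Eq. (20), i.e., of the solubility minimum.
T_at_max_G_over_T = max(grid, key=lambda T: transfer_free_energy(T, dCp, T_H, T_S) / T)

print(f"dG   is largest near {T_at_max_G:.1f} K (T_S = {T_S} K)")
print(f"dG/T is largest near {T_at_max_G_over_T:.1f} K (T_H = {T_H} K)")
```

The grid search simply confirms what differentiation of Eqs. (18) and (20) gives directly: the free energy of transfer peaks at $T_S$, whereas $\Delta_l^w G°/T$ peaks at $T_H$.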
FIG. 13. Temperature dependence of $\Delta_l^w G°$ of transfer for various liquid hydrocarbons into water, assuming $\Delta_l^w C_p$ is constant. Abscissa: T/K.
The first-order expansion of this equation gives the result

$$\Delta_l^w G°/T = \Delta_l^w C_p (\ln T_S - \ln T_H) - \Delta_l^w C_p (1 - T/T_H)^2/2 \qquad (21)$$
As this equation shows, there is a maximum value for the function $\Delta_l^w G°/T$, or a minimum in the solubility, at the temperature $T_H$ where the heat of solution is zero. A comparison of the solubility function (Eq. 20) with the free energy of transfer (Eq. 18) is shown in Fig. 14. We plot the free energy of transfer as a function of $T$ in order to show the underlying contributions of $\Delta H$ and $T\Delta S$. On such a plot, the slope of the $\Delta H$ function is $\Delta C_p$, and the free energy is maximal at $T_S$. On the other hand, the solubility function $\Delta_l^w G°/T$ is plotted versus $\ln T$ in order to reveal the contributions of $\Delta H/T$ and $\Delta S$ directly. On this plot, the slope of the $\Delta S$ curve is $\Delta C_p$. Here the maximum of $\Delta_l^w G°/T$ is seen to occur at $T_H$.

There are two simple relationships in the case of constant heat capacity change that are revealed by these plots. As Becktel and Schellman (1987) pointed out, the shaded triangle (Fig. 14, left) enables one to relate the maximum free energy of transfer at $T_S$, i.e., the vertical side, to the base interval $T_S - T_H$ times the slope given by $\Delta C_p$. In the case of the negative solubility function the maximum occurs at $T_H$, again indicated by a shaded triangle, with base given by $\ln T_H - \ln T_S$ and again the slope $\Delta C_p$. $\Delta G°$ and $\Delta C_p$ scale for different hydrocarbons with the number of solvating water molecules, as shown to a first approximation by experiment. Thus if one has knowledge of the maximum value of the transfer free energy, the minimum value of the solubility, and the change in the heat capacity, then the two equations for the two triangles will provide a general and unique solution for $T_H$ and $T_S$.

FIG. 14. Two thermodynamic representations of the dissolution of a liquid hydrocarbon (benzene) into water, assuming constant heat capacity change. (Left) Free energy of transfer and underlying contributions as a function of temperature (abscissa: T/K). (Right) Solubility function ($\Delta_l^w G°/T$) and underlying contributions as a function of $\ln T$ (abscissa: ln T/K). Dotted lines are drawn at the respective maxima, $T_H$ and $T_S$. Shaded regions show the triangular relations discussed in the text.
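The two triangle relations can be inverted explicitly. The sketch below assumes the constant heat capacity forms, $\Delta G°_{max} = \Delta C_p (T_S - T_H)$ from Eq. (18) evaluated at $T_S$ and $(\Delta G°/T)_{max} = \Delta C_p \ln(T_S/T_H)$ from Eq. (20) evaluated at $T_H$. The function name and the round-trip input values are hypothetical and serve only to check the algebra.

```python
import math


def solve_TH_TS(dG_max, dG_over_T_max, dCp):
    """Recover T_H and T_S (K) from the two maxima and a constant dCp.

    dG_max        : maximum transfer free energy, J/mol (reached at T_S)
    dG_over_T_max : maximum of dG/T, J/K/mol (reached at T_H, solubility minimum)
    dCp           : heat capacity increment, J/K/mol
    """
    difference = dG_max / dCp            # T_S - T_H
    log_ratio = dG_over_T_max / dCp      # ln(T_S / T_H)
    T_H = difference / math.expm1(log_ratio)
    T_S = T_H * math.exp(log_ratio)
    return T_H, T_S


# Round trip with illustrative inputs built from known temperatures.
dCp, T_H_in, T_S_in = 225.0, 293.15, 386.15
dG_max = dCp * (T_S_in - T_H_in)
dG_over_T_max = dCp * math.log(T_S_in / T_H_in)

T_H_out, T_S_out = solve_TH_TS(dG_max, dG_over_T_max, dCp)
print(f"T_H = {T_H_out:.2f} K, T_S = {T_S_out:.2f} K")
```

Because one relation fixes the difference of the two temperatures and the other fixes their ratio, the solution is unique whenever both maxima are positive.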
B. Two-step Description of Dissolution

In order to focus on the particular role of water in the dissolution process, it is helpful to decompose the process into two steps through a hypothetical intermediate state. As indicated in the previous section, there are two natural thermodynamic reference points, namely, the maxima of the transfer free energy and solubility functions. In this section we shall show that each of these reference points provides a useful hypothetical intermediate state, defined in terms of thermodynamic measurements.

We first make use of the state characterized by the maximum value of the free energy of transfer. At this temperature ($T_S$), the overall entropy change is zero. The natural intermediate state, characterized by a constant value of the free energy of transfer $\Delta_l^w G°(T_S)$, will then have the
same entropy and heat capacity properties, but not enthalpy, as the pure hydrocarbon liquid at $T_S$. The properties of this hypothetical state are essentially those of a compressed hydrocarbon gas at $T_S$. The enthalpy change from the liquid to this state is then given by the enthalpy of vaporization corrected by a small compression term ($RT$). We call this the "compact" reference state (denoted by c), with properties $\Delta_l^c H = \Delta_l^w H(T_S)$, $\Delta_l^c S = 0$, $\Delta_l^c C_p = 0$. It should be noted that the experimental quantities $\Delta_l^w H$, $\Delta_l^w S°$, and $\Delta_l^w C_p$ permit calculation of the temperature $T_S$ where the entropy change for this process is zero.

We define the change from the compact state to the dissolved aqueous state (w) as the "hydration" step. At the reference temperature $T_S$, the hydration free energy change is zero, and it decreases on either side. It may be evaluated at any temperature by the integrals involving the heat capacity change for the change from the liquid to the aqueous state (w):

$$\Delta_c^w H = \int_{T_S}^{T} \Delta_l^w C_p \, dT \qquad (22)$$

and

$$\Delta_c^w S = \int_{T_S}^{T} \frac{\Delta_l^w C_p}{T} \, dT \qquad (23)$$

The introduction of the compact reference state permits the evaluation of the net thermodynamic effect of hydration at any temperature. The hydration free energy is then given by the dissolution of the compact hydrocarbon state into water:

$$\Delta G_{hydration} = \Delta_c^w G = \int_{T_S}^{T} \Delta_l^w C_p \, dT - T \int_{T_S}^{T} \frac{\Delta_l^w C_p}{T} \, dT \qquad (24)$$

The free energy of the process, assuming a constant heat capacity change in Eq. (24), then results in

$$\Delta G_{hydration} = \Delta_l^w C_p (T - T_S) - T \Delta_l^w C_p \ln(T/T_S) \qquad (25)$$
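The sign behavior of Eq. (25) is easy to verify numerically. The following is a minimal sketch with a hypothetical function name; the heat capacity increment (225 J K$^{-1}$ mol$^{-1}$) and $N_s$ (26.7) are taken from the first row of Table III, and $T_S$ = 113°C is the constant heat capacity value quoted in the text.

```python
import math


def hydration_free_energy(T, dCp, T_S):
    """Eq. (25): hydration free energy (J/mol) for a constant dCp (J/K/mol)."""
    return dCp * (T - T_S) - T * dCp * math.log(T / T_S)


dCp = 225.0    # J K^-1 mol^-1, first row of Table III
n_s = 26.7     # contacting water molecules, same row
T_S = 386.15   # K, 113 C (constant-dCp value quoted in the text)

for T in (273.15, 298.15, 350.0, T_S, 420.0):
    dG = hydration_free_energy(T, dCp, T_S)
    print(f"T = {T:6.2f} K   dG_hyd = {dG:9.1f} J/mol   per water = {dG / n_s:7.1f} J/mol")
```

The value vanishes at $T_S$ and is negative on both sides of it, and dividing by $N_s$ gives the per-water normalization used when different solutes are compared.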
Expanding the logarithmic term in this equation and keeping the leading nonvanishing term in $(T_S - T)$ results in the expression

$$\Delta G_{hydration} \approx -\frac{T \Delta_l^w C_p}{2} \left( \frac{T_S}{T} - 1 \right)^2 \qquad (26)$$

which shows that the free energy of hydration is negative at all temperatures, except at $T_S$, where it is zero. Figure 15 shows the hydration Gibbs energies normalized by the number of water molecules in the hydration shell for various nonpolar substances. The universal nature of the hydration effect and its temperature dependence is seen from Eq. (25) to be a consequence of the proportionality of the heat capacity change with the number of solvating water molecules. This situation is general for all nonpolar substances that have thus far been studied. (One notes some grouping of the hydration free energy functions for different classes of hydrocarbons, i.e., alkyl or aromatic, but this effect is secondary to the overall nature of the hydration phenomenon.)

FIG. 15. Temperature dependence of the hydration free energy normalized by the number of water molecules in the hydration shell, $N_s$, for several aromatic and alkyl hydrocarbons.

The normalized thermodynamic quantities of transfer from liquid hydrocarbon to water are each general functions of the number of waters, as depicted in Fig. 16. The average values of $T_H$ and $T_S$ were taken at 300 K and 420 K, calculated from $\Delta_l^w C_p/N_s$ varying with temperature as in the case of $\Delta_l^w C_p$ for benzene. As can be seen by the appropriate arrows, the hydration quantities are determined by the values for $\Delta_l^w G^\circ/N_s$, $\Delta_l^w H^\circ/N_s$, and $T\Delta_l^w S^\circ/N_s$, along with the definition of the hydration properties: $\Delta H_{hyd} = \Delta_c^w H/N_s$ and $\Delta S_{hyd} = \Delta_c^w S/N_s$.

The net effect of hydration of nonpolar solutes at any given temperature other than $T_S$ is to favor the transfer of nonpolar molecules from the gaslike compact state into water, and this effect increases as one moves farther away from $T_S$. From this view, the hydration effect of nonpolar solutes stabilizes the dissolved state and thus in itself cannot be regarded as a cause of their hydrophobicity.
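The sign behavior of the hydration free energy can be checked numerically. The following is a minimal sketch, assuming a constant heat capacity change as in Eq. (25); it evaluates Eq. (25) directly and compares it with the quadratic form obtained by expanding the logarithm to second order about $T_S$. The function names are illustrative. The two inputs are values quoted in the text (318 J K$^{-1}$ mol$^{-1}$ for ethylbenzene at 25°C, and 420 K as the average $T_S$), but combining them in this way is an assumption made only for illustration.

```python
import math

def dg_hydration(T, T_S, dCp):
    """Eq. (25): hydration free energy for a constant heat capacity change."""
    return dCp * (T - T_S) - T * dCp * math.log(T / T_S)

def dg_hydration_quadratic(T, T_S, dCp):
    """Second-order expansion of the logarithm in Eq. (25) about T_S."""
    return -0.5 * dCp * T * (T_S / T - 1.0) ** 2

dCp = 318.0  # J/(K mol), quoted in the text for ethylbenzene at 25 C
T_S = 420.0  # K, average T_S quoted in the text

for T in (280.0, 340.0, 400.0, 420.0, 440.0):
    exact = dg_hydration(T, T_S, dCp)
    approx = dg_hydration_quadratic(T, T_S, dCp)
    print(f"T = {T:5.1f} K  exact = {exact:9.1f}  quadratic = {approx:9.1f} J/mol")
```

The exact expression is zero at $T_S$ and negative on either side. The quadratic form tracks it closely near $T_S$ and becomes less accurate far from the reference temperature.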
FIG. 16. Diagram of general behavior of the temperature dependence of the general thermodynamic parameters, normalized to moles of solvated water, involved in the dissolution process of liquid hydrocarbons into water. The free energy and enthalpy change of forming the compact reference state is denoted by the horizontal line where $\Delta_l^c G = \Delta_l^c H$. The hydration free energy change (normalized to the number of solvating water molecules and denoted by the bar) along with its entropic and enthalpic components is depicted by the designated arrows and is expressed by $\Delta\bar{G}_{hyd} = \Delta G_{hydration}/N_s = \Delta_l^w G - \Delta_l^w H(T_S)$.
The usual view, in which hydrophobic interactions occur in order to avoid the ordering of water, i.e., to avoid a decrease in the entropy of the system, is based primarily upon properties evaluated at room temperatures. However, the maximum stabilization of a hydrophobic interaction occurs at a high temperature, where the enthalpy is the dominating contributor in determining the stability (on this aspect, see Patterson and Barbe, 1976; Shinoda, 1977; Abraham, 1982; Hvidt, 1983; Baldwin, 1986). On this basis we conclude that the enthalpic factor must be the principal thermodynamic effect accounting for hydrophobicity.

The second major feature of the properties of dissolving nonpolar solutes into water is the large heat capacity change. This property has the effect of making the enthalpic and entropic contributions highly temperature dependent, but they largely cancel each other in calculations of the free energy of transfer. As seen in Fig. 16, the small value of hydration Gibbs energy shows that its two components, $\Delta H_{hyd}$ and $T\Delta S_{hyd}$, which are generally large in absolute values, compensate each other to a great extent. It is usually supposed that the main source for the enthalpy and entropy of hydration is a change in the state of water in the presence of a nonpolar solute, i.e., ordering of water. According to Lumry and Ben-Naim, there is exact compensation for the entropy and enthalpy changes that arise from water ordering, and thus this effect would contribute nothing to the hydration Gibbs energy (Ben-Naim, 1980; Lumry et al., 1982; Lumry and Gregory, 1986). However, this is not confirmed by the present analysis. In our treatment, the large heat capacity change of dissolution occurs in the hydration step and is a direct manifestation of water ordering. From our view, the heat capacity change plays the key role in regulating the hydration free energy change.

We would like to point out that an alternative stepwise decomposition of the thermodynamics of the dissolution process can be made from consideration of the solubility as a function of temperature. The function $\Delta_l^w G^\circ/T$ has a maximum value at $T_H$, which can serve as a reference point. By arguments similar to those advanced above, one may choose a hypothetical intermediate state (i) with the property that $\Delta_l^i G^\circ/T$ is independent of temperature and has the value $\Delta_l^w G^\circ(T_H)/T_H$. Thus the properties of this state are identical with those of the liquid with respect to the enthalpy and the heat capacity, but not the entropy, which is given by the entropy of dissolution at $T_H$ and also by minus the free energy of dissolution divided by this temperature. One can envision this hypothetical intermediate state as a nonpolar molecule that has been placed into a cavity that has been formed in the aqueous phase and in which no specific interactions occur with the water. For convenience we might call this the “cavity” state. The entropy change for attaining this state may be considered as the entropy of mixing of two liquids with dissimilar particle sizes. Statistical mechanics provides a means for estimating this quantity (Lee, 1985a; B. Lee, private communication).
Since there is no enthalpy change in forming the cavity state from the liquids, the two states must have the same intermolecular interactions. To complete the overall dissolution process from the cavity state, the specific interactions between the nonpolar solute and surrounding water molecules are “turned on,” a process resulting in a significant heat capacity change for this step. As we move away from the reference temperature $T_H$, the enthalpy change for this second step, which is the enthalpy change of dissolution, increases with temperature. This scheme, motivated here by thermodynamic considerations, finds a detailed molecular expression by scaled particle theory (Pierotti, 1963, 1965; Klapper, 1973; Lucas, 1976; Lee, 1985a,b).

Of the two approaches to decomposing the thermodynamics for dissolution of nonpolar solutes into water, the first, from a reference point at the maximum of the free energy of transfer, leads to the concept of the compact state of the nonpolar substance. The compact state can be described as a noninteracting liquid, whose properties are similar to a compressed gas. In contrast to the second approach, which suggests a solute-filled cavity intermediate in the aqueous phase, the compact state has properties that are amenable to physical measurement. More important, analysis of the thermodynamics of nonpolar molecules dissolving into water through a compact state intermediate enables one to focus on the characteristics of hydration. The principal feature of the hydration process is the heat capacity change.
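The two decompositions describe the same overall process, so their steps must add up to the same total free energy of transfer. The sketch below checks this bookkeeping under the assumption of a constant heat capacity change. The function names and the inputs for the enthalpy and entropy of transfer are hypothetical placeholders chosen only to make the example run; the heat capacity value is the one quoted in this chapter for ethylbenzene.

```python
import math

T0 = 298.15  # K, temperature at which the input data are specified

def transfer(T, dH0, dS0, dCp):
    """Enthalpy, entropy and Gibbs energy of liquid -> water transfer at T."""
    dH = dH0 + dCp * (T - T0)
    dS = dS0 + dCp * math.log(T / T0)
    return dH, dS, dH - T * dS

def compact_route(T, dH0, dS0, dCp):
    """Liquid -> compact state -> water. Returns (dG_step1, dG_step2)."""
    T_S = T0 * math.exp(-dS0 / dCp)          # transfer entropy is zero here
    dH_S, _, _ = transfer(T_S, dH0, dS0, dCp)
    step1 = dH_S                             # no entropy, no heat capacity
    step2 = dCp * (T - T_S) - T * dCp * math.log(T / T_S)  # hydration
    return step1, step2

def cavity_route(T, dH0, dS0, dCp):
    """Liquid -> cavity state -> water. Returns (dG_step1, dG_step2)."""
    T_H = T0 - dH0 / dCp                     # transfer enthalpy is zero here
    _, dS_H, _ = transfer(T_H, dH0, dS0, dCp)
    step1 = -T * dS_H                        # no enthalpy, dG/T is constant
    step2 = dCp * (T - T_H) - T * dCp * math.log(T / T_H)  # interactions on
    return step1, step2

# Hypothetical dH0 and dS0 (placeholders); dCp as quoted for ethylbenzene.
dH0, dS0, dCp = 2000.0, -60.0, 318.0
for T in (280.0, 300.0, 350.0, 400.0):
    total = transfer(T, dH0, dS0, dCp)[2]
    c1, c2 = compact_route(T, dH0, dS0, dCp)
    v1, v2 = cavity_route(T, dH0, dS0, dCp)
    print(f"T = {T:5.1f} K  total = {total:8.1f}  "
          f"compact = {c1:8.1f} + {c2:8.1f}  cavity = {v1:8.1f} + {v2:8.1f}")
```

In the compact route the first step is purely enthalpic and temperature independent. In the cavity route the first step is purely entropic, so its free energy divided by temperature is constant.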
V. COMPARISON OF RESULTS ON PROTEIN DENATURATION AND HYDROCARBON DISSOLUTION IN WATER
Comparison of results on thermodynamic studies of protein denaturation and hydrocarbon dissolution in water shows a number of surprising similarities and differences between these two processes. The most surprising result is the close correspondence of the temperature of convergence of the enthalpy and entropy functions for the denaturation of proteins, $T_x$, and the temperature $T_S$ for the dissolution of hydrocarbons in water. At this temperature, the entropy change for dissolution of liquid hydrocarbons in water is zero. However, the entropy of protein denaturation is far from zero at this temperature but amounts to 17.6 J K$^{-1}$ per mole of amino acid residues (Privalov, 1979), a value that corresponds to an 8-fold increase of the number of possible configurations and is close to the value expected for the helix-coil transition of polypeptides (Schellman, 1955). This difference shows that an oil drop is an inadequate model for a globular protein. A more suitable model resembles that of a small crystal with a quite definite positive melting entropy (see also Bellow, 1977, 1978).

The crystal-like nature of a protein interior manifests itself also upon considering the packing density of these macromolecules. According to crystallographic data, the packing density (i.e., the dimensionless ratio of the actual volume of an object to the volume of the space occupied by it) for very different globular proteins is almost identical and close to 0.75. This value is also found for crystals of small organic molecules (Klapper, 1973; Richards, 1977). In contrast, organic liquids have packing densities that do not exceed 0.44 (Klapper, 1971, 1973). This drastic difference in packing densities of the protein interior and of organic liquids explains the difference in volume effects of denaturation and of transfer of nonpolar molecules from the pure liquid phase to water.
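The link between the per-residue denaturation entropy quoted earlier in this section and the stated 8-fold increase in configurations can be checked with the Boltzmann relation $\Delta S = R \ln W$. The short sketch below does this arithmetic; the function name is illustrative, and the use of the molar gas constant with this relation is the only assumption beyond the quoted entropy value.

```python
import math

R = 8.314  # J/(K mol), molar gas constant

def configuration_ratio(dS):
    """Factor W by which the number of configurations grows, from dS = R ln W."""
    return math.exp(dS / R)

dS_residue = 17.6  # J/(K mol residue), value quoted in the text
print(f"W = {configuration_ratio(dS_residue):.1f}")
```

The result is roughly 8.3, in line with the 8-fold figure given in the text.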
The transfer of nonpolar molecules from the liquid phase to water is accompanied by a decrease of the volume and compressibility (Masterton, 1954; Friedman and Scheraga, 1965; Schneider, 1963; Alexander and Hill, 1965), whereas for protein denaturation, the change of volume and compressibility due to hydration of nonpolar groups appears to be positive (Brandts et al., 1970; Zipp and Kauzmann, 1973; Kauzmann, 1987). The cause for this difference in sign becomes clear from simple geometric considerations, if one takes into account the small size of water molecules (solvent) relative to that of nonpolar solute molecules (Assarsson and Eirich, 1968) along with the noted difference of packing densities in the nonpolar liquids and in the protein interior (Klapper, 1971, 1973; Lee, 1983).

Comparison of the enthalpy of protein denaturation (Table I) with the enthalpy of solution of liquid hydrocarbons at $T_S$ (Table II) shows also a great difference in their values: the enthalpy of protein denaturation at $T_S$ is about 6 kJ per mole of amino acid residues with an average molecular weight of 115; the enthalpy of solution of hydrocarbons of comparable size (ethylbenzene, $M_r$ = 106) is almost five times larger at this temperature. For denaturation of proteins in aqueous solution, $\Delta_N^D C_p$(25°C) is about 70 J K$^{-1}$ per mole of amino acid residues, whereas $\Delta_l^w C_p$(25°C) for ethylbenzene is 318 J K$^{-1}$ mol$^{-1}$. However, this difference in the enthalpy and heat capacity increment is quite understandable, as not all of the groups in a protein are nonpolar, not all are screened from water in the native state, and not all are in contact with water in the denatured state. It is tempting to suppose that all the denaturation enthalpy at $T_x$ is provided by the disruption of nonpolar contacts, i.e., actually by van der Waals interactions, but that the temperature dependence of enthalpy is determined by hydration of nonpolar groups. The latter is supported by the correlation found between $\Delta_N^D C_p$ and the saturation of the native structure by the contacts between the nonpolar groups (Table I).
However, this simple model immediately raises two questions: Why do proteins with different concentrations of nonpolar contacts have the same denaturation enthalpy values at $T_x$? Is it reasonable to neglect the contribution of hydrogen bonds in the denaturation enthalpy? The content of hydrogen bonds in various globular proteins is almost the same and amounts to 0.75 ± 0.10 per amino acid residue (Privalov and Khechinashvili, 1974; Chothia, 1975; Privalov, 1979). Therefore, we can neglect them only if the enthalpy of their disruption in aqueous media is small or if these bonds are not disrupted upon protein denaturation.

It is known that heat-denatured proteins have higher ellipticity than the proteins denatured by guanidinium chloride or urea. This is usually explained by the assumption that they have some regular residual structure (Tanford, 1962, 1968). At the same time, all attempts to measure the heat effect associated with disruption of this residual structure by guanidinium chloride or urea have failed (Pfeil and Privalov, 1976b; Pfeil, 1981, 1986; Pfeil et al., 1986; Griko et al., 1988a). Therefore, the enthalpy of the residual structure in heat-denatured protein cannot be large.

As for the enthalpy of hydrogen bonding of protein CO and NH groups, there is considerable divergence in its estimation. On the basis of thermodynamic studies of N-methylacetamide dimerization in water, Klotz and Franzen (1962) concluded that the enthalpy of hydrogen bonding of these molecules is close to zero; and according to Kresheck and Klotz (1969), it is even positive. However, these results were criticized, because N-methylacetamide forms not only hydrogen bonds upon dimerization but also the contact between methyl groups, and the dehydration of these groups could give a significant heat effect (Kresheck, 1969; Gill and Noll, 1972). Schellman (1955), studying the association of urea in water, came to the conclusion that the enthalpy of formation of hydrogen bonds between urea molecules is about 7 kJ mol$^{-1}$. A similar value was obtained by Kresheck and Scheraga (1965) from the temperature dependence of nonideality of urea solution. Susi (1969), studying dimerization of δ-valerolactam, found that the enthalpy of hydrogen bond formation is 11.7 kJ mol$^{-1}$. According to Gill and Noll (1972), who studied calorimetrically the dilution of diketopiperazine, the enthalpy of the hydrogen bond between CO and NH groups in aqueous media is 8.8 kJ mol$^{-1}$. Studies of the helix-coil transition of synthetic polypeptides lead to values for the enthalpy of forming hydrogen bonds in aqueous solutions within the limits of 3.5 to 5.0 kJ mol$^{-1}$ (Hermans, 1966; Rialdi and Hermans, 1966; Bychkova et al., 1971; Chou and Scheraga, 1971; Terboyevich et al., 1972; Warashina and Ikegami, 1972; Hill et al., 1977). Therefore, at the present time there is no reason for neglecting the contribution of hydrogen bonds in the enthalpy of protein denaturation.
One can even expect that this contribution should increase as the temperature increases because the denatured state of the protein will make fewer hydrogen bonds with water molecules at higher temperatures. It was suggested earlier that hydrogen bonds in proteins are the main contributors to the denaturation enthalpy at $T_x$, whereas the nonpolar contacts determine only the temperature dependence of the denaturation enthalpy (Privalov, 1979). The main argument for this was the observation that proteins that have the same enthalpies at $T_x$ have an almost equal concentration of intramolecular hydrogen bonds, but differ in the concentration of nonpolar contacts (Table I). As is evident, the assumption of the dominant role of hydrogen bonds in the stabilization of protein structure explained the observed temperature convergence of the denaturation enthalpy, if the enthalpy of exposure of nonpolar groups to water is zero at this temperature. This assumption implied that either the enthalpy of disruption of van der Waals contacts between nonpolar groups is completely compensated at this temperature by the enthalpy of hydration of these groups or that both these quantities are zero at this temperature. This latter inference contradicts the results obtained on nonpolar liquids. A comparison of thermodynamic data on denaturation of globular proteins with that of fibrillar proteins and melting of phospholipid membranes led to a conclusion that the van der Waals contribution to the stabilization of protein structure is of the same order as that of hydrogen bonds (Privalov, 1982; see also Crigbaum and Komoriya, 1979a,b).
VI. MECHANISM OF STABILIZATION OF COMPACT PROTEIN STRUCTURES
As seen in Fig. 6, the $\Delta_N^D G$ function in the temperature range 0-100°C is insensitive to the assumption as to whether $\Delta_N^D C_p$ is temperature dependent or temperature independent. In the latter case, the enthalpy and entropy of protein denaturation can be presented in the first approximation in the following way:

$$\Delta_N^D H(T) = \Delta_N^D H(T_x) - \Delta_N^D C_p (T_x - T) \qquad (27)$$

$$\Delta_N^D S(T) = \Delta_N^D S(T_x) - \Delta_N^D C_p \ln(T_x/T) \qquad (28)$$

assuming that $\Delta_N^D C_p$ does not depend on temperature and includes all possible contributions: the increase of the extent of freedom of the polypeptide chain upon denaturation, the gradual melting of residual structure, the temperature dependence of hydrogen bond and van der Waals interactions, and the temperature dependence of hydration of nonpolar groups. As the latter is the major one, $T_x$ can be considered as a temperature at which the enthalpy and entropy of hydration of nonpolar groups become zero, i.e., as $T_S$. Consequently, $\Delta_N^D H(T_x)$ is then determined by hydrogen and van der Waals bonds stabilizing the native protein structure, and $\Delta_N^D S(T_x)$ is determined by disordering of the native conformation of the polypeptide chain. Thus, to the first approximation, we will have for the Gibbs energy of stabilization of the native protein state

$$\Delta_N^D G(T) = \Delta_N^D H(T) - T\Delta_N^D S(T) \approx \Delta_N^D H(T_x) - T\Delta_N^D S(T_x) - \frac{T \Delta_N^D C_p}{2} \left( \frac{T_x}{T} - 1 \right)^2 \qquad (29)$$
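To see how the enthalpy and entropy expressions above shape the stability curve, the sketch below evaluates the Gibbs energy of denaturation directly from them, without the series expansion, and locates the temperature of maximum stability and the two temperatures at which the Gibbs energy changes sign. It assumes a constant heat capacity change. The per-residue enthalpy, entropy, and heat capacity values are the approximate figures quoted in this chapter; the value of $T_x$ (383 K), the function names, and the bisection helper are assumptions introduced for illustration, so the resulting numbers should be read as qualitative only.

```python
import math

def dg_denaturation(T, dH_x, dS_x, dCp, T_x):
    """Gibbs energy of denaturation at T for a constant heat capacity change."""
    dH = dH_x - dCp * (T_x - T)            # enthalpy referred to T_x
    dS = dS_x - dCp * math.log(T_x / T)    # entropy referred to T_x
    return dH - T * dS

def bisect_root(f, lo, hi, tol=1e-9):
    """Locate a sign change of f between lo and hi by bisection."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Per-residue values quoted in the text (approximate).
dH_x = 6000.0  # J/mol residue
dS_x = 17.6    # J/(K mol residue)
dCp = 70.0     # J/(K mol residue)
T_x = 383.0    # K, assumed placeholder for the convergence temperature

def g(T):
    return dg_denaturation(T, dH_x, dS_x, dCp, T_x)

# The maximum of dG lies where the denaturation entropy vanishes.
T_max = T_x * math.exp(-dS_x / dCp)
T_cold = bisect_root(g, 250.0, T_max)
T_heat = bisect_root(g, T_max, T_x)

print(f"maximum stability near {T_max:.1f} K: {g(T_max):.1f} J/mol residue")
print(f"sign changes near {T_cold:.1f} K (cooling) and {T_heat:.1f} K (heating)")
```

With these inputs the stabilizing Gibbs energy is a small difference between large terms and is positive only within a limited temperature window, bounded on the high side by the entropic term and on the low side by the heat capacity term.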
FIG. 17. Contribution of dissipative forces [$T\Delta S(T_x)$] and water solvation effect [$(\Delta C_p T/2)(T_x/T - 1)^2$] to the stabilization of an abstract globular protein consisting of 200 amino acid residues.
It is only the first term in Eq. (29), representing the total enthalpy of hydrogen bonds and van der Waals interactions in protein, that is positive. The second term, which represents the action of dissipative forces, is negative and increases in absolute value as the temperature increases. The third term in Eq. (29) is the only one that represents the effect of water solvation by the nonpolar groups upon protein stabilization. This term is never positive: it is zero at $T_x$ and grows rapidly in absolute value as the temperature decreases. Therefore, the native compact protein structure is stabilized only by intramolecular hydrogen and van der Waals bonding, while the effect of water solvation by nonpolar groups has only a destabilizing action. Thus, the protein compact native structure is stable at temperatures where the destabilizing actions of the water solvation and of dissipative forces are relatively small. At higher temperatures, this structure breaks down due to an increase of dissipative forces, whereas at lower temperatures it breaks as a result of an increase of the solvation tendency caused by hydration of nonpolar groups (Fig. 17).

Considering the van der Waals interaction between nonpolar groups and hydration effects of these groups as an integral entity, i.e., the “hydrophobic interaction,” one can state that upon increasing temperature up to $T_x$, the “hydrophobic interactions” between the protein nonpolar groups increase. A perfect illustration of this is the squeezing of the hydrodynamic volume of a polypeptide chain in the random coil conformation observed upon increasing temperature (Fig. 18). On the other hand, one can regard the breakdown of the protein compact structure upon cooling as a result of weakening of the “hydrophobic interactions” of the nonpolar groups in the protein. However, although the term hydrophobic interaction is widely accepted, its use hardly clarifies the real situation in proteins: indeed, as we have already seen, the hydrophobic interactions not only decrease with decreasing temperature, but they could also change their sign at low enough temperature, and convert from a factor stabilizing the compact state to a factor destabilizing this state. On the other hand, at present we cannot separate the van der Waals and hydrogen bonding contributions to the stabilization of protein structure and, hence, cannot estimate quantitatively the net effect of so-called hydrophobic interactions. All this raises a question whether the use of the concept of hydrophobic interaction is indeed justified in considering the protein stability problem (in connection with this, see also Klapper, 1973). Does it not lead to too great a misunderstanding? For example, under hydrophobic interaction, many authors mean only one of its two components, usually the one that is stipulated by water ordering by the nonpolar groups, and consider the van der Waals interactions of these groups separately (see, e.g., Kauzmann, 1959) or do not consider them at all (see, e.g., Eagland, 1975).

FIG. 18. Temperature dependence of the intrinsic viscosity of apomyoglobin (aMb), hen egg-white lysozyme with disrupted disulfide cross-links (Lys), and pancreatic ribonuclease A with disrupted disulfide cross-links (RNase) in solutions with pH 2.0 where polypeptide chains are in a random coil conformation [Privalov et al. (1988)].
ACKNOWLEDGMENTS

We wish to acknowledge the support of NSF grant CHE-8611408 to S.J.G. We wish to thank Charles Robert for his assistance in preparing the section on thermodynamic properties of hydrophobic compounds, and we wish to express our appreciation for various suggestions and comments made by B. K. Lee, Robert Baldwin, Ken Dill, and Ingemar Wadsö.
REFERENCES

Abraham, M. (1982). J. Am. Chem. Soc. 104, 2085-2094.
Alexander, D. M., and Hill, D. J. T. (1965). Aust. J. Chem. 18, 605-608.
Assarsson, P., and Eirich, R. F. (1968). J. Phys. Chem. 72, 2710-2719.
Atanasov, B. P., Privalov, P. L., and Khechinashvili, N. N. (1972). Mol. Biol. (USSR) 6, 33-41.
Baldwin, R. (1986). Proc. Natl. Acad. Sci. U.S.A. 83, 8069-8072.
Becktel, W. J., and Schellman, J. A. (1987). Biopolymers 26, 1859-1877.
Bellow, J. (1977). J. Theor. Biol. 68, 139-142.
Bellow, J. (1978). Int. J. Pept. Protein Res. 12, 38-41.
Ben-Naim, A. (1980). “Hydrophobic Interactions.” Plenum, New York.
Brandts, J. F. (1964). J. Am. Chem. Soc. 86, 4291-4301.
Brandts, J. F. (1969). In “Structure and Stability of Biological Macromolecules” (S. N. Timasheff and G. D. Fasman, eds.), pp. 213-290. Dekker, New York.
Brandts, J. F., and Hunt, L. (1967). J. Am. Chem. Soc. 89, 4826-4838.
Brandts, J. F., Oliveira, R. J., and Westort, Ch. (1970). Biochemistry 9, 1038-1047.
Bychkova, V. E., Ptitsyn, O. B., and Barskaya, T. V. (1971). Biopolymers 10, 2161-2179.
Calderon, R. O., Stolowich, N. J., Gerlt, J. A., and Sturtevant, J. M. (1985). Biochemistry 24, 6041-6049.
Cox, J. D., and Pilcher, G. (1970). “Thermochemistry of Organic and Organometallic Compounds.” Academic Press, London.
Cho, K. C., Poon, H. T., and Choy, C. L. (1982). Biochim. Biophys. Acta 701, 206-215.
Chothia, C. (1975). Nature (London) 254, 304-308.
Chou, P. J., and Scheraga, H. A. (1971). Biopolymers 10, 657-680.
Crigbaum, W. R., and Komoriya, A. (1979a). Biochim. Biophys. Acta 576, 204-228.
Crigbaum, W. R., and Komoriya, A. (1979b). Biochim. Biophys. Acta 576, 229-246.
Dec, S. F., and Gill, S. J. (1984). J. Solution Chem. 13, 27-41.
Dec, S. F., and Gill, S. J. (1985a). J. Solution Chem. 14, 417-429.
Dec, S. F., and Gill, S. J. (1985b). J. Solution Chem. 14, 827-836.
Eagland, D. (1975). In “Water” (F. Franks, ed.), Vol. 4, pp. 305-315. Plenum, New York.
Edsall, J. T. (1935). J. Am. Chem. Soc. 57, 1506-1507.
Edsall, J. T., and McKenzie, H. A. (1983). Adv. Biophys. 16.
Filimonov, V. V., Pfeil, W., Tsalkova, T. N., and Privalov, P. L. (1978). Biophys. Chem. 8, 117-122.
Frank, H. S., and Evans, M. W. (1945). J. Chem. Phys. 13, 507-532.
Frank, H. S., and Wen, W. Y. (1957). Discuss. Faraday Soc. 24, 133-140.
Franks, F. (1975). In “Water” (F. Franks, ed.), Vol. 4, pp. 1-93. Plenum, New York.
Franks, F., and Reid, D. S. (1973). In “Water. A Comprehensive Treatise” (F. Franks, ed.), Vol. 2, pp. 323-661. Plenum, New York-London.
Franks, F., Gent, M., and Johnson, H. H. (1963). J. Chem. Soc. 1963, 2716-2723.
Freire, E., and Biltonen, R. L. (1978). Biopolymers 17, 463-479.
Friedman, M. E., and Scheraga, H. A. (1965). J. Phys. Chem. 69, 3795-3800.
Gill, S. J., and Noll, L. (1972). J. Phys. Chem. 76, 3065-3068.
Gill, S. J., and Wadsö, I. (1976). Proc. Natl. Acad. Sci. U.S.A. 73, 2955-2958.
Gill, S. J., Nichols, N. F., and Wadsö, I. (1976). J. Chem. Thermodyn. 8, 445-452.
Gill, S. J., Dec, S. F., Olofsson, G., and Wadsö, I. (1985a). J. Phys. Chem. 89, 3758-3761.
Gill, S. J., Richey, B., Bishop, G., and Wyman, J. (1985b). Biophys. Chem. 21, 1-14.
Griko, Yu. V., Privalov, P. L., Venyaminov, S. Yu., and Kutyshenko, V. P. (1988a). Biofizika (USSR), in press.
Griko, Yu. V., Privalov, P. L., Sturtevant, J. M., and Venyaminov, S. Yu. (1988b). Proc. Natl. Acad. Sci. U.S.A. 85, 3343-3347.
Hatley, R. H. M., and Franks, F. (1986). Cryo-Letters 7, 226-233.
Hawley, S. A. (1971). Biochemistry 10, 2435-2441.
Hermann, R. B. (1972). J. Phys. Chem. 76, 2754-2759.
Hermans, J., Jr. (1966). J. Phys. Chem. 70, 510-515.
Hill, D. J. T., Cardinaux, F., and Scheraga, H. A. (1977). Biopolymers 16, 2447-2467.
Hvidt, A. (1983). Acta Chem. Scand. A 37, 99-103.
Janin, J. (1979). Nature (London) 277, 491-492.
Janin, J., and Wodak. (1983). Prog. Biophys. Mol. Biol. 42, 21-78.
Kauzmann, W. (1959). Adv. Protein Chem. 14, 1-63.
Kauzmann, W. (1987). Nature (London) 325, 763-764.
Khechinashvili, N. N., and Tsetlin, V. I. (1984). Mol. Biol. (USSR) 18, 786-791.
Khechinashvili, N. N., Privalov, P. L., and Tiktopulo, E. I. (1973). FEBS Lett. 30, 57-60.
Klapper, M. H. (1971). Biochim. Biophys. Acta 229, 557-566.
Klapper, M. H. (1973). Prog. Bioorg. Chem. 2, 55-133.
Klotz, I. M., and Franzen, J. S. (1962). J. Am. Chem. Soc. 84, 3461-3466.
Kresheck, G. C. (1969). J. Phys. Chem. 73, 2441-2443.
Kresheck, G. C., and Klotz, I. M. (1969). Biochemistry 8, 8-12.
Kresheck, G. C., and Scheraga, H. A. (1965). J. Phys. Chem. 69, 1704-1706.
Lee, B. (1983). J. Phys. Chem. 87, 112-118.
Lee, B. (1985a). Biopolymers 24, 813-823.
Lee, B. (1985b). “Mathematics and Computers in Biomedical Applications” (J. Eisenfeld and C. Delisi, eds.). Elsevier, Amsterdam.
Lucas, M. (1976). J. Phys. Chem. 80, 359-362.
Lumry, R., and Gregory, R. B. (1986). In “The Fluctuating Enzyme” (G. R. Welch, ed.). Wiley (Interscience), New York.
Lumry, R., Biltonen, R. L., and Brandts, J. F. (1966). Biopolymers 4, 917-944.
Lumry, R., Battistel, E., and Jolicoeur, C. (1982). Faraday Discuss. Chem. Soc. 17, 93-108.
McAuliffe, C. (1966). J. Phys. Chem. 70, 1267-1275.
Makhatadze, G. I., and Privalov, P. L. (1988). J. Chem. Thermodyn. 20, 405-420.
Masterton, W. L. (1954). J. Chem. Phys. 22, 1830-1833.
Mateo, P. L., and Privalov, P. L. (1981). FEBS Lett. 123, 189-192.
Naghibi, H., Dec, S. F., and Gill, S. J. (1986). J. Phys. Chem. 90, 4621-4623.
Naghibi, H., Dec, S. F., and Gill, S. J. (1987a). J. Phys. Chem. 91, 245-248.
Naghibi, H., Ownby, D., and Gill, S. J. (1987b). J. Chem. Eng. Data 32, 422-425.
Nojima, H., Hon-nami, K., Oshima, T., and Noda, H. (1978). J. Mol. Biol. 122, 33-42.
Novokhatny, V. V., Kudinov, S. A., and Privalov, P. L. (1984). J. Mol. Biol. 179, 215-232.
Olofsson, G., Oshodj, A. A., Qvarnström, E., and Wadsö, I. (1984). J. Chem. Thermodyn. 16, 1041-1052.
Pace, N. C., and Tanford, C. (1968). Biochemistry 7, 198-208.
Patterson, D., and Barbe, M. (1976). J. Phys. Chem. 80, 2435-2436.
Perutz, M. (1965). J. Mol. Biol. 13, 646-668.
Pfeil, W. (1981). Mol. Cell. Biochem. 40, 3-28.
Pfeil, W. (1986). In “Biochemical Thermodynamics” (M. N. Jones, ed.), 2nd Ed. Elsevier, Amsterdam.
Pfeil, W., and Privalov, P. L. (1976a). Biophys. Chem. 4, 23-32.
Pfeil, W., and Privalov, P. L. (1976b). Biophys. Chem. 4, 33-40.
Pfeil, W., and Privalov, P. L. (1976c). Biophys. Chem. 4, 41-50.
Pfeil, W., Bychkova, V. E., and Ptitsyn, O. B. (1986). FEBS Lett. 198, 287-291.
Pierotti, R. A. (1963). J. Phys. Chem. 67, 1840-1845.
Pierotti, R. A. (1965). J. Phys. Chem. 69, 281-288.
Potekhin, S. A., and Privalov, P. L. (1982). J. Mol. Biol. 159, 519-535.
Prigogine, I., and Defay, R. (1954). “Chemical Thermodynamics.” Longmans, Green, London.
Privalov, P. L. (1963). Biofizika (USSR) 8, 308-316.
Privalov, P. L. (1979). Adv. Protein Chem. 33, 167-241.
Privalov, P. L. (1982). Adv. Protein Chem. 35, 1-104.
Privalov, P. L. (1985). Biofizika (USSR) 30, 722-734.
Privalov, P. L. (1986). Vestn. Akad. Nauk SSSR 2, 72-79.
Privalov, P. L., and Khechinashvili, N. N. (1974). J. Mol. Biol. 86, 665-684.
Privalov, P. L., and Medved’, L. V. (1982). J. Mol. Biol. 159, 665-683.
Privalov, P. L., Tiktopulo, E. I., and Khechinashvili, N. N. (1973). Int. J. Pept. Protein Res. 5, 229-237.
Privalov, P. L., Mateo, P. L., Khechinashvili, N. N., Stepanov, V. M., and Revina, L. P. (1981). J. Mol. Biol. 152, 445-464.
Privalov, P. L., Griko, Yu. V., Venyaminov, S. Yu., and Kutyshenko, V. P. (1986). J. Mol. Biol. 190, 487-498.
Privalov, P. L., Tiktopulo, E. I., Venyaminov, S. Yu., Griko, Yu. V., Makhatadze, G. I., and Khechinashvili, N. N. (1988). In preparation.
Rettich, T. R., Handa, Y. P., Battino, R., and Wilhelm, E. (1981). J. Phys. Chem. 85, 3230-3237.
Rialdi, G., and Hermans, J., Jr. (1966). J. Am. Chem. Soc. 88, 5719-5720.
Richards, F. M. (1977). Annu. Rev. Biophys. Bioeng. 6, 151-176.
Rose, G. D., Geselowitz, A. R., Gleen, S. J., Lee, R. H., and Zehfus, M. H. (1985). Science 229, 834-838.
Schellman, J. A. (1955). C. R. Trav. Lab. Carlsberg, Ser. Chim. 29, 230-259.
Schneider, G. (1963). Z. Phys. Chem. (Frankfurt) 37, 333-352.
Shinoda, K. (1977). J. Phys. Chem. 81, 1300-1302.
Shinoda, K., and Fijihara, M. (1968). Bull. Chem. Soc. Jpn. 41, 2612-2615.
Sturtevant, J. M. (1977). Proc. Natl. Acad. Sci. U.S.A. 74, 2236-2240.
Susi, H. (1969). In “Structure and Stability of Biological Macromolecules” (S. N. Timasheff and G. D. Fasman, eds.), pp. 575-663. Dekker, New York.
Tanford, C. (1962). J. Am. Chem. Soc. 84, 4240-4247.
Tanford, C. (1968). Adv. Protein Chem. 23, 121-275.
Tanford, C. (1980). “The Hydrophobic Effect: Formation of Micelles and Biological Membranes.” Wiley (Interscience), New York.
Tatunashvili, L. V., and Privalov, P. L. (1986). Biofizika (USSR) 31, 578-581.
Terboyevich, M., Cosani, A., Peggion, E., Quadrifoglio, F., and Crescenzi, V. (1972). Macromolecules 5, 622-627.
Tiktopulo, E. I., and Privalov, P. L. (1978). FEBS Lett. 91, 57-58.
Tiktopulo, E. I., Privalov, P. L., Odintsova, T. I., Ermokhina, T. M., Krashennikov, I. A., Aviles, F. X., Cary, P. D., and Crane-Robinson, C. (1982). Eur. J. Biochem. 122, 327-331.
Tischenko, V. M., and Gorodnov, B. G. (1979). Biofizika (USSR) 24, 334-335.
Tischenko, V. M., Tiktopulo, E. I., and Privalov, P. L. (1974). Biofizika (USSR) 19, 400-404.
Velicelebi, G., and Sturtevant, J. M. (1979). Biochemistry 18, 1180-1186.
Warashina, A., and Ikegami, A. (1972). Biopolymers 11, 529-547.
Watanabe, K., and Anderson, H. C. (1986). J. Phys. Chem. 90, 795-802.
Weast, R. C., ed. (1970). “Handbook of Chemistry and Physics.” Chemical Rubber Co., Cleveland, Ohio.
Wetlaufer, D. E. (1973). Proc. Natl. Acad. Sci. U.S.A. 20, 601-701.
Wilhelm, E., Battino, R., and Wilcock, R. J. (1977). Chem. Rev. 7, 219-262.
Wolfenden, R., Anderson, L., Cullis, P. M., and Southgate, C. C. B. (1981). Biochemistry 20, 849-855.
Wyman, J. (1981). Biophys. Chem. 14, 135-146.
Zale, S. E., and Klibanov, A. M. (1986). Biochemistry 25, 5432-5444.
Zipp, A., and Kauzmann, W. (1973). Biochemistry 12, 4217-4228.
Zwolinski, B. J., and Wilhoit, R. C. (1971). “Handbook of Vapor Pressures and Heats of Vaporization of Hydrocarbons and Related Compounds.” Thermodynamics Research Center, American Petroleum Institute, College Station, Texas.
ABSTRACT OF A REVIEW ON CHEMISTRY OF PEANUT PROTEINS

From time to time, the editors may wish to call attention to reviews on proteins published elsewhere, especially if they are in journals that might easily be overlooked. Here we call attention to a review on peanut proteins, which may be regarded as a sequel to the article on the same subject in Advances in Protein Chemistry, Vol. 8, pp. 393-414 (1953) by J. C. Arthur, Jr. The new review is by R. Bhushan, G. P. Reddy, and K. R. N. Reddy, of the University of Roorkee, India, published in Quarterly Chemical Reviews, Vol. 2, pp. 41-58 (1986). We give below the title, authorship, and table of contents of this review, and an abstract prepared by Dr. Bhushan at our invitation.
CHEMISTRY OF PEANUT PROTEINS: A REVIEW
By R. BHUSHAN, G. P. REDDY, and K. R. N. REDDY
Department of Chemistry, University of Roorkee, Roorkee, India
I. Introduction
II. Composition of Peanuts
III. Isolation and Separation of Peanut Proteins: Arachin and Conarachin
IV. Analytical Studies on Peanut Proteins
    A. Electrophoresis and Chromatography
    B. Dissociation-Association Behavior
    C. Molecular Species
    D. Subunit Structure
    E. Spectroscopic and Optical Studies
    F. Composition of Proteins and Subunits
    G. Hydrogen Ion Equilibria and Binding Studies
V. Uses of Peanut Proteins
    A. Nutritional Value
    B. Nonfood Uses
References
ABSTRACT

Peanuts (Arachis hypogaea) contain about 25% protein and are reported to have two proteins, namely, arachin and conarachin, of which the former is the major one. Isolation, composition, and properties of peanut proteins were reviewed by J. C. Arthur, Jr. in 1953, a review covering the literature up to 1951 (Peanut protein isolation, composition and properties
by J. C. Arthur, Jr., Advances in Protein Chemistry, Vol. 8, pp. 393-414). The present review covers the literature from 1951 to 1985. During the past 6 years, studies on various physicochemical aspects, especially determination of primary structure (sequence of amino acids) of one of the subunits of arachin and quantitative determination of ionizable groups by electroanalytical techniques and hydrogen ion equilibria titrations, have been carried out in this laboratory. The proximate composition of peanut shells and seeds has already been compiled in the earlier review by J. C. Arthur, Jr. The proteins of peanuts were first investigated by Ritthausen in 1880. He extracted the proteins from oil-free peanut cake with sodium chloride and weakly basic solutions and precipitated them by acidification. He considered that the protein thus precipitated was homogeneous; but later, Johns and Jones (1916) separated two proteins, arachin and conarachin, by ammonium sulfate fractionation. Afterward, several workers isolated peanut proteins, using different methods, and a survey of methods for separating individual globulins was published by Vaintraub and Shutov (1968). Gel filtration indicated that the protein is a single component, but disc and agarose gel electrophoresis of the protein revealed that there were two components, i.e., arachin and conarachin, though they had similar electrophoretic mobilities (Johnson et al., 1950). The two proteins were reported to have the same association-dissociation behavior in sucrose as in low ionic strength phosphate buffer (Neucere and St. Angelo, 1972). Using DEAE-cellulose and Sephadex chromatography and acrylamide gel electrophoresis, Tombs (1963) detected 17 components in peanut protein. He also reported polymorphism in arachin; this report was based on studies of 81 peanuts. Difference in the chromatographic behavior of arachin obtained by two different methods was observed by Neucere (1969).
Sedimentation patterns in low ionic strength solution showed two components in the case of arachin. Arachin dissociated into subunits of approximately one-sixth its molecular weight when SDS was added (Shetty and Rao, 1973). Disc gel electrophoresis of arachin in the presence of 8 M urea and 0.03 M 2-mercaptoethanol at pH 8.6 showed 6 major, 2 semimajor, and 14 minor bands at the anode side and 3 major bands at the cathode side. In lower ionic strength solutions, the sedimentation constant of arachin changed to 9 S, whereas in high ionic strength solutions, it changed to 14 S, and the 9 S component underwent reversible association to the 14 S component at high ionic strength (Yotsuhashi and Shibasaki, 1983). Arachin contained two molecular species, arachin I and arachin II. Arachin I existed as a monomer in 0.01 M sodium phosphate buffer but associated reversibly to a dimer in 0.3 M sodium phosphate buffer, whereas arachin II existed as a dimer in both conditions. It was shown that there were no sulfur-sulfur linkages between the
different subunits of arachin. The six different subunits of arachin had different isoelectric points and were present in different weight ratios in the parent arachin. Arachin reconstituted from the six subunits was obtained in 90% yield and was indistinguishable from intact arachin in electrophoretic mobility, subunit composition, and sedimentation behavior (Yamada et al., 1979, 1981). The CD spectra showed that the reconstituted arachin exhibited a positive maximum at 194 nm and negative troughs at 210 nm. The contents of α helices, pleated sheets, and disordered structures in both arachins, i.e., reconstituted and parent, were computed to be 17, 32, and 51%, respectively (Jacks et al., 1975; Yamada et al., 1981). CD indicated that the α-helical content of arachin was increased when the protein was exposed to catechol and pyrogallol (Neucere and Jacks, 1978). The fluorescence emission spectrum of arachin showed a peak at 307 nm; fluorescence decreased to 1/5 the intensity of that of the native state when denatured with 6 M urea (Shetty and Rao, 1975). Bhushan and co-workers (1985) separated one of the subunits of arachin by electrophoresis and gel filtration and established the complete amino acid sequence of the same subunit, which contained 201 amino acid residues. Ultracentrifugation, electrophoresis, and light-scattering studies (Johnson and Naismith, 1953) showed that conarachin contained two different molecular species. The dissociation of conarachin into conarachin I and conarachin II (1 : 1) was later reported by Yotsuhashi and Shibasaki (1973a); they had different physicochemical properties. Changes in the ionic strength of the solvent did not produce an association-dissociation reaction for conarachin I and did not reveal subunit structure. In high ionic strength solutions, conarachin changed to an 8 S form, and in low ionic strength solutions, to 18 S. The subunit structure of the 8 S form was more stable than that of the 18 S form.
Amino acid analysis of peanut proteins revealed that the principal amino acids were glutamic acid, 22-27%; arginine, 11-13%; and aspartic acid, 8-13% (Oslova et al., 1973). Cation-exchange HPLC and amino acid analyzer results showed differences in the content of phenylalanine and tyrosine, which were lower when analyzed by cation-exchange HPLC (Eukin and Griffith, 1985). The amino acid composition of six subunits of arachin was reported by Yamada et al. (1979). Bhushan and co-workers (1984) determined the amino acid content of intact arachin and calculated the number of each amino acid residue. Significant differences in amino acid composition were shown when the protein was extracted from eight varieties of peanuts grown in different locations (Dawson and McIntosh, 1973). Hydrogen ion equilibria studies were made both on arachin and conarachin by Malik et al. (1982) and Shetty and Rao (1976), respectively. Malik et al. calculated the heat of ionization on the basis of pH measurements made in hydrogen ion studies on arachin. The same group determined the binding sites by polarography of Cu and Zn ion complexes of arachin. The review has cited two reports with 64 and 39 references on properties and potential food uses of peanut proteins (Martinez, 1979) and on the manufacture of protein products from peanuts and their uses in food and feed (Carter and Rhee, 1975), respectively. Fourteen additional references on uses of peanut proteins have been included. The review describes amino acid composition, molecular weights of subunits, fluorescence emission and UV spectra, a 201-amino acid residue sequence of one of the subunits, titration curve and polarograms showing effects of zinc and copper concentrations on diffusion current depression with respect to arachin. In all, there are 108 references in the bibliography. To obtain copies of the review, please send a letter or telex to Dr. Ravi Bhushan, Department of Chemistry, University of Roorkee, Roorkee 247 667, India; Telex: 0597-201 UOR IN.

REFERENCES

Arthur, J. C., Jr. (1953). Adv. Protein Chem. 8, 393. Bhushan, R., and Reddy, K. R. N. (1985). Composition of Conarachin (unpublished results). Bhushan, R., Goyal, R. N., and Agarwal, A. (1984). J. Protein Chem. 3, 395. Bhushan, R., Goyal, R. N., and Agarwal, A. (1985). Biochem. Int. 11, 477. Carter, C. M., and Rhee, K. C. (1975). Peanut Prod. Tex. 115. Dawson and McIntosh (1973). J. Sci. Food Agric. 24, 597. Eukin, R. G., and Griffith, J. E. (1985). J. Assoc. Off. Anal. Chem. 58, 1028. Jacks, T. J., Neucere, N. J., and McCall, E. R. (1975). Int. J. Pept. Protein Res. 7, 155. Johns, C. O., and Jones, D. B. (1916). J. Biol. Chem. 28, 77. Johnson, P., and Naismith, W. E. F. (1953). Discuss. Faraday Soc. 13, 98. Johnson, P. A., Shooter, E. M., and Rideal, E. K. (1950). Biochim. Biophys. Acta 5, 376. Johnson, P. A., Shooter, E. M., and Rideal, E. K. (1950). Biochim. Biophys. Acta 5, 176. Malik, W. U., Bhushan, R., and Agarwal, A. (1982). J. Indian Chem. Soc. 59, 1316. Martinez, W. H. (1979). J. Am. Oil Chem. Soc. 56, 280.
Neucere, N. J. (1969). Anal. Biochem. 27, 15. Neucere, N. J., and Jacks, T. J. (1978). J. Agric. Food Chem. 26, 214. Neucere, N. J., and St. Angelo, A. J. (1972). Anal. Biochem. 47, 80. Oslova, L. P., Kozmina, E. P., and Panilova, A. I. (1973). Izv. Vyssh. Ucheb. Pishch. Technol. 2, 128. Shetty, K. J., and Rao, M. S. N. (1973). Indian J. Biochem. Biophys. 10, 149. Shetty, K. J., and Rao, M. S. N. (1975). J. Agric. Food Chem. 23, 1220. Shetty, K. J., and Rao, M. S. N. (1976). Anal. Biochem. 73, 458. Tombs, M. P. (1963). Nature (London) 200, 1321. Vaintraub, I. A., and Shutov, A. D. (1968). Tr. Khim. Prir. Soedin. 7, 23. Yamada, T., Aibara, S., and Morita, Y. (1979). Agric. Biol. Chem. 42, 2536. Yamada, T., Aibara, S., and Morita, Y. (1981). Agric. Biol. Chem. 45, 1243. Yotsuhashi, K., and Shibasaki, K. (1973a). Nippon Shokuhin Kogyo Gakkaishi 20, 327. Yotsuhashi, K., and Shibasaki, K. (1973b). J. Jpn. Soc. Food Sci. Technol. 29, 321. Yotsuhashi, K., and Shibasaki, K. (1983). J. Jpn. Soc. Food Sci. Technol. 20, 519.
AUTHOR INDEX Numbers in italics refer to the pages on which the complete references are listed.
A
Aubry, A., 59, 118, 124 Aumaillcy, M.,2, 23, 27, 28, 32, 38,
Aarown, S., 43, 43 Abe, H., 58, 123 Abe, S., 42, 46 Abeles, R. H.,166, 188 Abelson, J. N.,145, 189 Abraham, D. J., 158, 159, 166, 174, 188 Abraham, M., 223, 231 Adamiak, D. A., 77, 122 Adammn, A. W.,73, 118 Adman, E. T., 145, 186 Aebi, J. D., 58, 123 Aganval, A., 237, 238 Ahern, T. J., 129. 186 Aibara, S., 237, 238 A@, D., 61, 118 Alber, T., 184, 186 Albini, A., 43, 43 Albrechtsen, R., 41, 49 Alexander, D. M.,225, 231 Alper, R.,2, 46 Amenta, P. S., 1, 2, 47 Amidon. G. L.,144, 160, 186 Anantharamaiah, G. M., 82, 119 Anderson, L.,193, 234 Anderson, S. R., 86, 87, 121 Andreatta, R. H., 59, 120 Andreu, D., 85, 118 Anhalt, G.J., 6, 36, 47 An&, S., 144, 160, 186 Aoyagi, H.,85, 121 Ariolas, A,, 84, 118 Ariaon, B. H., 60, 118 124, 168, 169, 186 Arthur, J. C., Jr., 235, 236, 238 Artymuik, P., 129, 186 Asmian, R. K.,101, 121 Atanamv, B. P., 197, 231
39, 41, 42, 43, 43, 46, 47, 49
Aviles, F. X., 206, 234 Avvedimento, V. E., 20, 50
Babel, W., 12, 43 Babu, Y. S., 89, 118, 164, 186 Bach II, A. C., 61, 118 Bächinger, H. P., 11, 16, 17, 19, 26, 37, 43, 44, 48, 49
Bäckström, G., 32, 48 Baile, C. A., 101, 124 Bajusz, S. J., 53, 123 Baker, E. N., 138, 143, 147, 149, 186 Baker, J. R., 35, 44 Balaram, P. J., 57, 58, 61, 62, 63, 118, 120, 122, 123
Balasubramanian, T.M.,62, 120 Baldwin, R.,216, 218, 223, 231 Baldwin, R. L., 52, 69, 70, 71, 75, 109, 118, 121 123, 135, 178, 186, 188
Baltz, M. L., 35, 44, 48 Balun, J. E.,22, 48 Banerjee, S. D., 40, 43 Banner, D. W.,58, 121 Baralle, F. E., 34, 46 Barbe, M., 223, 233 Barbier, B., 74, 118 Barlow, D. J., 129, 186 Barlow, D. P., 20, 25, 26, 43, 46 Barman, B. N., 142, 188 Baron van Evercooren, A., 23, 43 Barr, J. F., 36, 43 Barrach, H.-J., 29, 35, 45
Bamkaya, T. V., 227, 232 Bady, S. H., 41, 48 Bartels, K., 161, 189 Bartunik, J., 161, I89 Bashford, D., 134, 186 Battino, R., 212, 215, 234 Bavaso, A., 57, 58, 118, 124 Beck, K.,26, 27, 32, 38, 39, 46, 47 Becktel, W.J., 219, 231 Bedarkar, S., 53, 120 Bellow, J., 225, 231 Bender, B., 10, 26, 27, 43, 44 Benedetti, E., 57, 58, 79, 118, 122, 124 Ben-Naim, A., 224, 231 Bennett, M. K.,92, 118 Berendson, H.J. C., 135, 187 Berkovitch-Yellin, Z., 145, 286 Berman, J. M.,65, 118 Bermejo, F. J., 69, 70, 122 Bernard, M. P., 20, 46 Bernfield. M. R., 28, 40, 43, 44 B h g t o n , P. R., 155, 186 Bhushan, R.,235, 237, 238 Bienynaki, A., 69, 70, 118 Biltonen, R. L., 194, 195, 233 Bing, J. T., 38, 39, 46 Birktoft, J. J., 145, 186 Bishop, G., 194, 211, 232 Blagdon, D. E.,70, 118 Blanc, J. P., 99, 100, 118 Blow, D. M., 145, 174, 186, 189 Blumberg, B., 12, 19, 43 Blumenthal, D. K.,86, 90, 91, 92, 96, 118, 119, 121, 123 Blundell, T.L., 53, 71, 77, 114, 118, 120, 122, 161, 186, 189 Bode, W.,18, 49 Boedtker, H.,19, 43 Boa, M., 58, 123 Bolander, M. E., 32, 43, SO Bolin, J. T., 145, I87 Boman, H.G.,85, 118 Bonora, C. M., 57, 58, 118, 124 Bornstein, P., 10, 31, 55, 47 Bosch, R., 57, 58, 118 Bourdon, M. A., 35, 43 Boussard, G., 59, 118, 124 Boyd, C. D. 12, 20, 48 Bra& A., 74, 118 Brady, S. F., 168, 169, 186 Braginski, J. E., 21, 22, 44
Brandts, J. F., 194, 198, 204, 206, 226, 232, 233 Braun, W.,77, 118 Brazel, D., 12, 43 Breathnach, S. M., 6, 44 B m , D. N., 52, 69, 70, 123 Briggaman, R. A., 6, 34, 45 Brinker, J. M., 12, 43 Brooks, B. R., 131, 133, 186 Brown, J. E., 70, 118 Brown, K. S., 41, 48 Brownell, A. C., 34, 47 Brownlee, M., 2, 41, 43 B~ccoleri,R. E., 131, 133, 186 Bruckner, H., 58, 121 Bruckner, P., 11, 12, 47, 49 Buck, C., 42, 45 Buffon, 154, 186 Bugg, C. E., 89, 118, 164, 186 Bunge, R. P., 26, 44 Buonomo, F. C., 101, 124 Burger, M. M., 26, 46 B u r p s , W.H.,81, 90, 91, 121 Burgeon, R. E., 6, 33, 34, 43, 45, 48 Burks, T.E,66, 67, 120, 121 Burley, S. K.,71, 118, 126, 131, 144, 161, 162, 164, 165, 167, 168, 173, 183, 186, 187, 189 Bursaux, E., 158, 159, 166, 174, 188 Butkowki, R., 34, 36, 43, 49 Bychkova, V. E., 227, 232 Bygren, I?, 36, 49
C Caille, A., 74, 118 Calderon, R. O., 197, 232 Camerman, A., 53, 114, 118 Camerman, N., 53, 114, 118 Campbell, A. C., 26, 44 Cannon, F. B., 2, 26, 39, 46 Cantor, C. R.,55, 118 Caporale, L. H.,78, 79, 118 Carey, D. J., 26, 44 Carlin, B. E., 10, 21, 22, 26, 27, 43, 44 Carter, C. M.,238, 238 Cary, P. D., 206, 234 Casal, J. I., 129, 186 Caterson, B., 35, 44 Cavenee, W.K.,21, 45
D
Cerami, A., 2, 41, 43 Chaiken, I. M.,52, 69, 70, 76, 77, 80, 120, 121, 123
Charbonneau, H., 90, 118 Charonis, A. S., 19, 32, 37, 38, 44, 49, 50 Chauhan, V. S., 61, 118, 123 Chen, C. H.,82, 119 Cheney, B. V.. 174, 186 Cheney, J., 174, 186 Cheung, M. C., 12, 48 Chong, P. C. S., 105, 120 Chow, M.,65, 120 Chothia, C., 90, 101, 102, 104, 106, 108, 114, 119, 121, 226, 232
Chou, K. C., 114, 119 Chou, P. J., 227, 232 Chou, P. Y.,69, 72, 77, 102, 117, 119 Chow, L. T., 13, 20, 46, 48 Christianson, D. W.,166, 186, 188 Christner. J. F., 35, 44 Chu, G. H.,9, 44 Chung, A. E., 3, 4, 10, 21, 22, 26, 27, 43, 44, 46
Cicero, T. J., 62, 120 Clark, C. C., 2, 34, 46, 47 Claysmith, A. P., 41, 49 Cohen, C., 103, 119 Cohn, M.,92, 96, 123 Coligan, J. E., 41, 49 Comtock, L.J., 113, 119 Comte, M.. 87, 119 Cook, W.J., 89, 118, 164, 186 Cooper, A. R., 21, 22, 26, 27, 44, 45 Cornbroob, C. J., 26, 44 Cosani, A.. 227, 234 Costa, T., 63, 123 Couchman. J. R., 26, 35, 44, 45 Courtoy, P. J., 34, 44 Cox, E. G., 162, 164, 186 Cox, J. A., 86, 87, 88, 89, 93, 119 Craik, C. S., 166, 187 Crane-Robinson, C., 206, 234 Creighton, T., 53, 72, 119 Crescenzi, V., 227, 234 Crespi, H. L.,92, 96, 123 Crick, E H.C., 91, 98, 103, 104, 119 Crigbaum, W. R., 228, 232 Cronan, Jr., J. E., 81, 122 Cruicbhank, D. W.J., 162, 164, 186 Cullis, P. M.,193, 234 Cung, M. T., 59, 124
Daar, I. O., 129, 186 Dalzoppo, D., 23, 47 Damjanov, I., 10, 47 Daniels, J. R., 9, 44 Dao-pin, S., 184, 186 Dauber, P., 141, 188 Diiumigen, M., 102, 120 Davenport, R. C., Jr., 129, 145, 186 Davis, G. E., 23, 44 D a m n , C. R.,82, 119, 237, 238 Day, A. A., 32, 50 Deakyne, C. A., 174, 186 Deason, J. R., 43, 186 De Boer, H. A., 113, 119 Dec, S. F., 207, 208, 211, 212, 215, 232, 233
Decker, C., 42, 45 De Crombrugghe, B., 20, 50 Dday, R., 216, 233 De Grado, W. F., 68, 71, 72, 73, 81, 82, 83, 84, 85, 86, 88, 89, 91, 93, 108, 110, 113, 119, 120, 121, 122, 123 Dehm, P., 7, 44 Del Pra, A., 61, 118 Della-Fera, M. A., 101, 124 Denton, J. B., 69, 123 Dessau, W.,10, 12, 47, 48 Deutzmann, R., 12, 13, 15, 16, 23, 27, 32, 38, 39, 43, 45, 47, 48 Diaz, L. A., 6, 36, 47 DiBlaeio, B., 57, 58, 118, 124 Dickerson, K.,23, 44 Dickerson, R. E., 145, 189 Dieringer, H., 11, 12, 18, 43, 44, 46 Dill, K. A., 179, 180, 186 DiMaio, J., 54, 64, 65, 119 Dixit, 42, 47 Dixon, F. J., 36, 50 Dockerill, S., 77, 122 Donohue, J.. 145, 186 Diiumigen, M., 120 Drake, A. F., 82, 119 Dreyfus, P. A., 35, 44 Drohan, W.N.,41, 49 Duance, V. C., 35, 44 Dubois-Dalcq, M.,2, 23, 43, 46 Duggan, K., 42, 45 Duncan, K. G., 6, 19, 26, 33. 37, 42, 43, 44
Durkin, M. E., 10, 27, 44 Dyck, R. F., 35, 44 Dziadek, M.,2 , 12, 26, 27, 28, 29, 31, 32, 35, 39, 44, 47, 48, 49
Faaman, C. D., 69, 72, 77, 117, 119 Feldmann, J. R.,38, 39, 46 Feldman, R. J., 126, 156, 189 Feknfeld, C., 173, 187 Fmni, C., 129, 158. 159, 166, 174, 180, 187, 188
E
Fersht, A. R., 130, 136, 142, 145, 146,
Eagland, D., 231, 232 Ebihara, I., 26, 46 Eckle, E., 53, 123 Edelman, A. M.,86, 90, 91, 118, 119 Edehtein, C., 81, 82, 119 Edgar, D.. 23, 26, 28, 42, 43, 44 Edmundson, A. B., 83, 84, 122 Edsall, J. T., 211, 232 Eggimann, B., 66, 123 Eidingcr, D., 21, 46 k b e r g , D., 72, 83, 91, 103, 108, 110,
Fader, J. H., 11, 12. 19, 26, 37, 42, 43,
174, 187, 189
114, 119, 124
Elber, R., 182, 186 Elliott, R. W.,32, 47 Emanuel, B. S., 21, 45 Engel, A,, 21, 22, 44 Engel, J., 12, 17, 18, 21, 22, 23, 25, 27, 29, 30, 31, 32, 33, 38, 39, 44, 45. 47, 48, 49 Engelbreth-Holm, 9n Engliih, M. L., 63, 123 Engvall, E., 10, 23, 44 Epand, R. M.,77, 82, 119 Epprecht, T., 102, 121 E r i c b n , B. W.,78, 79, 80, 117, 118, 121, 124 Ericbn-Viitanen, S., 81, 86, 91, 119, 120, 122 Ermokhina, T. M.,206, 234 EscaigHaye, E, 3, 45 Eahita, S. M.,119 Eukin, R. C., 237, 238 Evana, M. W.,177, 187, 193, 207, 213, 217, 232
F Fadel, A., 58, 123 Fairman, R., 123 Farquhar, M. C., 2, 28, 31, 34, 40, 44, 45, 49
44
Fessler, L. I., 11. 19, 26, 37, 42, 43, 44 Fietzek, P., 11, 49 Filimonov, V. V., 197, 232 Filman, D. J., 145, 187 Fine, J. D., 6, 44 Fine, R., 130, 134, 187, 188 Finn, E M.,75, 120 Finney, J. L., 157, 187 Finzel, B. C., 122 Fisher, L. W.,29, 32, 43, 46 Fitch, J. M., 11, 17, 47 Fitton, J. E., 119 Fleischmajer, R., 26, 34, 44, 45 Fletterick, R.,117, 120 Foellmer, H. C., 17, 47 Foidart, J.-M., 21, 22, 34, 47, 49 Fok, K. F., 121 Forman, J. D., 131, 187 Fox, C. F., 51, 122 Fox, R. D., Jr., 58, 120 Franckc, U., 32, 47 Frank, H. S., 177, 187, 193, 207, 213, 217, 232
Franks, F., 193, 204, 213, 215, 217, 232 Franzen, J. S., 227, 233 Freeman, I. L., 21, 22, 44 Freidingcr, R. M.,53, 60, 120, 124 Freire, E., 194, 195, 232 Fresco, J. R., 145, 188 Friedman, M. E., 225, 232 Fujihara, M.,212, 234 Fujino, M.,84, 122 Fujiwara, S., 27. 29, 30, 31, 32, 38, 44, 45, 48, 49
Fuller, F., 19, 43 Fpller, C. C., 43, 48 Furcht, L. T.,22, 26, 27, 41, 47, 49 Furthmayr, H., 12, 16, 17, 18. 19, 21, 22, 23, 25, 32, 34, 37, 38, 44, 47, 48, 49, 50
G Gallego, E., 69, 70, 122 Galligan, J. J., 66, 67, 121 Gardell, S. J., 166, 187 Gamls, J. I., 41, 46 Gawish, A., 82, 119 Gay, S., 10, 47 Gee,
K., 66, 67, 121
Gchron, P., 9, 47 Gchron-Robey, P., 11, 21, 22, 29, 32, 35, 37. 42, 45, 46, 49, 50
Gelb, M.,166, 188 Gellatly, B. J., 157, 187 Gent, M.,215, 232 Gerlt, J. A., 197, 232 Geselowitz, A. R., 193, 234 Giammona, D. A., 183, 187 Giedmc, D. P., 86, 87, 120 Gierasch, L. M.,61, 118, 122 Gill, S. J., 177, 187, 194, 207, 208, 209, 210, 211, 212, 215, 227, 232, 233 Gilson, M. K., 130, 134, 187 Glanville, R. W., 11. 12, 13, 15, 16, 18, 43, 44, 45, 46, 48, 49 Glaspw, E. E, 31, 35, 48 Clatter, U.,102, 120 Gleen, S. J., 193, 234 Glegg, R. E., 21, 46 Glitzer, M. S., 60, 124 Glushko, V., 39, 43, 49 Gluskcr, J. P., 146, 188 Coldbeg, M.,3 , 45 Goldsmith, L. A., 6, 34, 45 Goldstein, I. J., 22, 48 Golton, I. C., 157, 187 Coodfellow, J., 157, 187 Goodman, M.,65, 66, 70, 118, 120, 121 Goodman, S., 23, 45 Gordon, J. R., 28, 45 Gorin, F. H., 62, 120 Gorodnov, B. G . , 197, 234 Could, R. O., 126, 157, 162, 287 Grabau, C., 81, 122 Graf, J., 41, 45 Granozzi, G., 61, 118 Grant, M. E., 2, 9, 45 Grassi, J., 35, 45 Graves, D. J., 91, 121 Gray, A. M.,126, 157, 162, 187
Green, N. M.,25, 43 G d o u g h , T. J., 89, 118, 164, 186 Gmnspon, S. A., 2. 46 Grcgga, R., 42, 45 Gregory, R. B., 224, 233 Griffin, C. A., 21, 45 Griffin, J. E, 53, 114, 120, 123 Griffith, J. E., 237, 238 Griko, Yu. V., 195, 196, 198, 204, 206, 227, 232, 234
Grobli, B., 11, 49 Grotendorst, G. R., 35, 49 Cudas, L. J., 12, 43 Gupta, K. B., 82, 119 Gutte, B., 102, 114, 115, 120, 221 Guy, R., 113, 120
H Haber, E., 165, 188 Haberman, E.,82, 120 Hager, L. P., 81, 122 Hagler, A. T., 53, 123, 141, 148, 187, 188
Hagstrom, R., 130, 134, 187 Hahn, W. E., 91, 123 Halie, L. M.,70, 120, 138, 187 Hall, Z. W., 35, 45 Hamilton, J.. 36, 43 Handa, Y. P., 212, 215, 234 Handley, C. F.,31, 35, 48 Hansen, J. R., 21, 45 Hansen, R. S., 86, 90, 91, 119 Haridas, M.,61, 123 H a n k , R. J., 22, 48 Hascall, V. C . , 28, 31, 35, 45, 48 Hassan, M.,65, 66, 120, 121 Hasscll, J. R., 2, 28, 29, SO, 31, 35, 38, 39, 45, 46, 47, 49, 50
Hatley, R. H. M.,204, 232 Hawley, S. A., 204, 232 Hawley-Nelson, P., 36, 48 Hay, E. D., 28, 45 Hayman, E. G., 35, 47 Heathcote, J. G., 2, 9, 45 Heinegard, D., 36, 49 Helliwell, J., 82, 119 Hendrickmn, W. A,, 156, 161, 174, 183, 186, 187
Hermann, R. B., 207, 208, 210, 232 Hermans, J., Jr., 227, 232, 234 Herranz, J., 69, 70, 122 Herzberg, O., 164, 187 Hider, R. C., 82, 119 Hill, D. J. T., 225, 227, 231, 232 Hilvert, D., 166, 187 Hintner, H., 6, 44 Hirning, L. D., 67, 120 Hirschmann, R., 60. 118, 124, 168, 169, 186
Ho, S. P., 108, 110, 113, 119, 120 Hodgw, R. S., 103, 105, 120, 121, 123 Hoeprich, P. D., Jr., 78, 79, 80, 120 Hofmann, H., 12, 15, 16, 45, 48 Hofmann, K., 75, 120 Hogan, B. L. M., 21, 22, 25. 26, 27, 52, 34, 41, 43, 44, 45, 46, 47, 48
b
Hol, W.C. J., 70, 120, 135, 138, 187 H o l b ~ k K., , 4 Holliiter, D. W., 11, 44 Hollt, V.,97, 120 Holly, F. W.,60, 124 Homandberg, C. A., 77, 80, 120 Hommick, C., 60, 124 Honig, B. H., 129, 130, 134, 187, 188 Hon-nami, K.,204, 233 Honzatko, R. B., 174, 187 HMk, M.,31, 45, 49 Hoplrins, C. R., 34, 48 Horigan, E. A., 31, 35, 46, 49 Homitz, A.,42, 45 Howard, J. C., 59, 120 Howe, C. C., 10, 21, 45 Howell, E. E., 145, 189 Hruby, V. J., 53, 57, 61, 66, 67, 77, 120, 121
Hubbard, R. E., 138, 143, 147, 149, 186
Hudson, B. C., 34, 36, 43, 49 Hughes, R. C., 41, 45 Hughes, R. E., 169, 187 Hugli, T. E., 78, 79, 80, 118, 120, 121 Hujanen, E. S., 39, 43, 49 Huler, E.. 148, 187 Hunt, L., 198, 232 Hunt, R., 66, 67, 121 Huttner, W.B., 27, 35, 47 Hvidt, A., 223, 231 Hynes, R., 34, 45
I Iitaka, T., 96, 123 Ikcgami, A., 227, 234 Inestrosa, N. C., 35, 45 I n a , S., 3, 4, 6, 10, 19, 26, 55, 45, 46 Iqbal, M., 82, 119 Irani, M., 20, 50 Irwin, M. H., 11, 17, 47 Isokawa, S., 74, 122 Itoh, S., 96, 123 Ivanov, V. T., 58, 61, 122, 123 Iwamoto, Y.,41, 43, 43, 45, 48 Izumiya, N., 58, 85, 121, 123
J Jacks, T. J., 237, 238 Jaenicke, R., 102, 120 Jaffe, R. 10, 21, 22, 26, 27, 43, 44 James, M. N. G., 136, 164, 187, 188 Janin, J., 72, 120, 193, 195, 232 Jaye, M., 41. 49 Jenkins, J. R., 26, 46 Jerwn, L. H., 144, 145, 186, 189 Jensen, M. S., 34, 48 Jimenez, M. A,,69, 70, 122 Jodlmlri, M., 34, 36, 49 Johanin, C. J., 134, 189 Johansson, S., 31, 45 Johns, C. O., 236, 238 Johnson, C. K., 144, 187 Johnson, D., 77, 121 Johnson, H. H., 215, 232 Johnson, L. D., 21, 45 Johnson, L. N., 134, 135, 187 Johnson, P., 237, 238 Johnson, P. A., 236. 238 Jones, A., 10, 48 Jones, D. B., 236, 238 Jones, J. E., 140, 187 Joneson, B., 164, 187 Jonsson, P.-G., 147, 188 Jung, C., 57, 58, 118
K Kaempfe, L. A., 101, 124 Kaiser, E. T., 68, 72, 74, 77, 80, 81, 96, 97, 98, 99, 100, 101, 118, 119, 120, 121, 122, 123, 124
Kang, A. H., 42, 47 Kanmera, T., 77, 80, 85, 120, 121 Kanwar, Y. S., 28, 31, 45 Kao, L. F., 67, 120 Karle, I. L., 53, 58, 114, 118, 120 Karle, J., 53, 114, 118 Karlström, G., 164, 187 Karnovsky, M. J., 28, 45 Karplus, M., 131, 133, 134, 161, 180, 182, 183, 186, 188
Kasai, H., 96, 123 Kato, H., 165, 188 Kato, M., 29, 30, 31, 45 Kato, S., 25, 50 Kato, T., 58, 85, 121, 123 Katz, S.I., 6, 36, 44, 48 Kauer, J. C., 96, 120 Kauzmann, 72, 105, 120, 177, 178, 181, 187, 193, 204, 207, 209, 213, 226, 231, 232, 234 Kee, S. M., 91, 121 Keene, D. R., 6, 33, 34, 43, 45, 48 Kcfalides, N. A., 2, 7, 8, 12, 16, 21, 26, 43, 44, 46, 47 Kennard, O., 143, 144, 145, 146, 147, 189 Kennedy, D. W.,42, 50 Kennedy, M. B., 91, 92, 118 Kershaw, M., 35. 44 Kessler, H., 61, 121 Kezdy, F. J., 68, 72, 74, 77, 80, 81, 82, 96, 101, 119, 121, 122 Khechinashvili, N. N., 195, 197, 198, 201, 202, 205, 206, 226, 231, 233, 234 Killen, P., 26, 46 Kim, P. S., 52, 69, 70. 71, 109, 118, 121, 123, 135, 188 Kimata, K., 29, 30, 31, 45 Kinura, J. H., 28, 45 Kincaid, R. L., 85, 121 Kirkwood, J. G . , 140, 188 Kitagawa, Y., 22, 47 Kjellen. L., 31, 45, 49 Klapper, M. H., 224, 225, 226, 231, 233 Klauser, S., 102, 121 Klee, C. B., 85, 121 Klee, W. A., 70, 118 Kleinman, H. K., 2, 23, 25, 26, 29, 30, 31, 37, 38, 39, 41, 43, 43, 45, 46, 49, 50
Klevit, R. E., 90, 92, 96, 121 Klibanov, A. M., 129, 186 Klippenstein, D. L., 35, 49 Klotz, I. M., 227, 233 Koda, J. E., 40, 43 Koetzle, T. F., 144, 187 Kohno, K., 25, 50 Kokkinidis, M., 58, 121 Kollman, P. A., 147, 187 Komoria, A., 228. 232 Komorita, A,, 77, 80, 120 Komoriya, A., 76, 80, 121, 123 Konishi, Y., 69, 123 Konnert, J. H., 156, 187 Kornblihtt, A. R., 34, 46 Kozlowski, J. M., 43, 43 Kozmina, E. P., 237, 238 Krakower, C. A., 2, 46 Krashennikov, I. A., 206, 234 Kraut, J., 145, 187, 189 Krebs, E. G., 86, 90, 91, 92, 96, 118, 119, 121, 123
Kresheck, G. C., 227, 233 Kretsinger, R. H., 33, 46, 164, 187 Kroll, T. G., 22, 48 Krstenansky, J. L., 77, 121 Krusius, T., 10, 44 Krzesick, R. F., 22, 48 Kudinov, S. A., 195, 233 Kühl, U., 2, 41, 46, 49 Kühn, K., 2, 10, 11, 12, 13, 15, 16, 17, 18, 19, 44, 45, 46, 47, 48, 49
Kullman, W., 102, 114, 116, 121 Kumar, A., 61, 123 Kuntz, I. D., Jr., 179, 180, 189 Kuriyan. J., 161, 174, 183, 186, 187 Kurkinen, M., 12, 13, 19, 20, 21, 22, 25, 26, 27, 41, 43, 44, 45, 46
Kutyshenko, V. P., 195, 196, 204, 232, 234
Kvick, A,, 144, 187
L Lackowicz, J. R., 121 Lai, T.F., 144, 187 Lang, A. B., 11, 47 Langen, 102, 121 Langeveld, J., 34, 36, 43, 49 Langs, D. A., 53, 120, 169, 187
Lau, H. S. H., 103, 122 Lau, S. Y. M., 104, 105, 121 Lau, W.,81, 90, 91, 121 Laurie, C. W.,2, 3, 4, 19, 26, 34, 35,
Lunntrum, G.P., 26, 34, 44, 45 Lustig, L., 10, 29, 30, 31, 38, 45, 48
M
38, 39, 45, 46
Leach, S. J., 79, 122 Lear, J. D., 71, 72. 73, 81, 119, 121 Lebl, M., 61, 121 Leblond, C. P., 2, 3, 4, 6, 10, 19, 21, 26, 34, 35, 45, 46
Lebo, R., 12, 48 Ledbetter, S. R., 29, 30, 31, 45, 46
Lee, B., 224, 226, 233 he, K.H., 77, 118 Lee, R. H., 193, 234 Lee, S., 69, 85, 121, 123
Leiacrowitz, L., 145, 186 Leiit, T., 102, 121 Leivo, I . , 2 , 10, 20, 46, 47 Lemieux, C., 64, 65, 66, 67, 119, 123 Lesot, H., 41, 46 Lever, W.F., 36, 46
M.,101, 102, 119, 121, 134, 174, 187, 189 Levy, R. M.,70, 108, 123, 138, 188 Leyahon, W.C., 29, SO, 31, 45 Liang, J.-Y, 128, 187 LiEaon, S., 141, 148, 187, 188 Lindahl, U., 32, 48 Lhdley, P. F., 126, 160, 161, 186, 188, 189 Ling, N., 86, 87, 120 Lime, P., 164, 187 Linaenxnayer, T.F., 11, 17, 47 Liotta, L. A., 2, 22, 23, 37, 38, 39, 41, 42. 46, 48, 50 Lipccomb, W. N., 128, 129, 166, 186, 187, 188 Lackwood, C. M., 35, 36, 44, 48 h b , D. M.,39, 43, 49 Loidl, H. R., 12, 43 Lollar, P., 33, 48 London, F., 140, 188 Love, W.A., 174, 187 Lozano, C., 20, 46 Lu, G. S., 78, 121 Lu, z., 79, 80, 121 Luca.s. M.,224, 233 Lucu, T.J., 81, 90, 91, 121 Ludwig, M. L., 162, 188 Lumry, R., 194, 224, 233 Mtt,
Macarak, E. J., 34, 47 McAuliffe, C., 215, 233 McCall, E. R., 237, 238 McCammon, J. A., 183, 187 McCarthy, J. B., 22, 47 McCarthy, R. A., 26, 46 McDonald, J. A., 26, 44 McDomll, L., 86, 121 McEwan, R., 43, 43 McGarvey, M. L., 37, 39, 46 McGoodwin, E. B.. 9, 47 McHugh, N., 35, 44 McIntosh, 237, 238 MacKrell, A. J., 12, 43 McLachlan, A. D., 103, 121 McMahan, U. J., 35, 47 McPherwn, J., 10, 47 MacQueen, H. A., 26, 44 Madri, J. A., 17, 21, 22, 23, 34, 38,
.Magnumn,
44, 47, 48 S., 34, 48 Mainardi, C. L., 42, 47 Makhatadze, G. I., 198, 206, 212, 233, 234 Malmcik, D. A , , 86, 87, 121 Mali, W. U., 237, 238 Malimoff, H. L., 41, 47 M d , N. J., 65, 66, 121 Manavalan, P., 58, 59, 121 Mann, K., 6, 32, 33, 47, 48 Manthropr, M.,23, 44 Maquat, L. E., 129, 186 Marangor, P., 23, 43 Margulies, I. M. K., 22, 23, 38, 48 Marquee, S., 52, 69, 70, 123 Marotti, K. R., 10, 49 Marraud, M.,59, 118, 124 Mamh, C. A . , 34, 47 Mamh, R. E., 144, 187 Mamhall, C. R., 62, 120 Mamhall, L. M.,35, 47 Martin, C. R., 2, 3, 4, 9, 11, 21, 22, 25, 26, 29, 34, 35, 36, 37, 38, 39, 41, 42, 43, 43, 45, 46, 47, 48, 49, 50
Martinez, W. H., 238, 238 Martinez-Hernandez, A., 1, 2, 10, 26, 34, 47
Mason, I. J.. 32, 47 Mamulie, J., 35, 45 Masterton, W. L.. 225, 233 Mastropaolo, D., 53, 114, 118 Mateo, P. L., 195, 197, 233, 234 M a t h m , B. W., 184, 186 Mathews, D. A., 145, 187 Matsuzawa, T., 74, 122 Maulet, Y., 87, 119 Mayne, R., 11. 12, 17, 47 Maziak, L. A., 67, 123 Means, A. R., 89, 92, 96, 118, 123, 164, 186
M e d v d , L. V., 195, 233 Meier, S., 28, 45 Meredith, S. C., 74, 122 Merrificld, R. B., 78, 79, 80, 85, 118, 121
Mihara, H., 85, 121 Miles, H. T., 173, 187 Miller, E.J., 10, 47 Miller, L. R., 161, 186 Miller, R. J., 99, 101, 118, 121, 123, 124 Miskin, R., 43, 48 M i t c h i w n , C., 71, 75, 121 Mitsui, Y.,96, 123 Miyauchi, T., 74, 122 Miyazawa. T., 84, 122 Miyoshi, M., 58, 123 Moe, G. R., 101, 122 Moet-Ner(Mautner). M., 174, 186 Mofonm, J. M., 183, 187 Mohan, P. S., 26, 47 Mojaov, S., 78, 79, 80, 121 Momany, F. A., 58, 59, 120, 121 M o w n , J. M., 12, 43 Mora, R., 74, 122 Morita, A., 22, 47 Morita, Y., 237, 238 Morley, J., 61, 121 Morokuma, K., 142, 146, 189 Morris, N. P., 6, 33. 34, 43, 45, 48 Mosberg, H. I., 66, 67, 121 Moser, R., 102, 114, 115, 121 Moss, D. S., 161, 186, 189 Muchmorc. D., 184, 186 Mudryj, M., 20, 41, 49, 50 Miinkc, M., 32, 47
Murphy, D., 32, 47 Murray, L. W., 6, 33, 43 Murray-Rust, P., 146, 188 Musso, G. F., 101, 121 Mutter, M., 74, 122 Myers, J. C., 12, 21, 43, 45, 48
N Naef, R., 58, 123 Naghibi, H.. 208, 212, 215, 233 Nakagawa, S. H., 81, 122 Nakajima. T., 84, 122 Nakamura. H.,129, 189 Nakane, P. K.,10, 48 Narebor, M.. 161, 189 Narita, M., 74, 122 Natzle, J. E., 12, 43 Neidhart, D., 174, 188 Nemethy, G., 55, 56, 57, 79, 122, 124 Neucere, N. J., 236, 237, 238 Nguyen, T. M. D., 64, 65, 66, 67, 118, 119, 123
Nichols, N. F.,209, 210, 211, 232 Nieto, J. L., 69, 70, 122 Ninio, J., 173, 188 Ninomiya, Y.,20, 46 Nockolds, C. E., 164, 187 Noda, H., 204, 233 Noelken, M., 34, 36, 43, 49 Nojima, H., 204, 233 Noll, L., 227, 232 Novokhatny, V. V., 195, 197, 233 Novotny, J., 165, 188 Nowack, H., 27, 49 Nozaki, Y., 179, 188 Nurcombc, Y., 23, 28, 42, 43 Nutt. R. F., 60, 124 Nye, J., 184, 186
O Oberbaumer, I., 10, 11, 12, 13, 18, 19, 43, 47, 48, 49
Odermatt, B. F., 11, 21, 22, 23, 25, 38, 44, 47, 49
Odintsova, T. I., 206, 234 Ogden, R. C., 145, 189 Ohkubo, H., 20, 50
Ohlendorf, D. H., 102, 106, 108, 122 Ohmori, D., 150, 188
Ohno, M., 26, 47 Ohno, N., 26, 47
Ohno, S., 23, 43 Oikawa, S., 165, 188 Okuyama, T., 96, 123 Olafson, B. D., 131, 133, 186 Oldberg, A., 35, 43, 47 Oliveira, R. J., 226, 232 Olofsson, G., 207, 208, 211, 233 Olovsson, I., 147, 188 Olsen, B. R., 20, 46 Olsen, P. F., 12, 43 O'Neil, K. T., 88, 91, 92, 93, 94, 95, 122 Oohira, A., 10, 31, 35, 47 Oren, D. A., 166, 188 Orfanakis, N. G., 10, 48 Orgel, L. E., 173, 188 Orkin, R. W., 9, 47 Ornstein, R. L., 145, 188 Oshima, T., 204, 233 Oshodj, A. A., 207, 211, 233 Oslova, L. P., 237, 238 Osterman, D. G., 74, 99, 122, 123 Ott, U., 22, 23, 25, 38, 47 Ovchinnikov, Y. A., 58, 61, 122 Ownby, D., 212, 233 Oxender, D. L., 51, 122
P Paattridge, K. A., 162, 188 Pace, N. C., 204, 233 Paglia, D. E., 129, 189 Palveda, W. J., Jr., 60, 124 Paleveda, W. J., 168, 169, 186 Palm, S. L., 22, 26, 27, 47 Panilova, A. I., 237, 238 Parry, D. A., 103, 119 Parthasarathy, N., 28, 47 Pastan, I., 20, 50 Patel, H. P., 6, 36, 47 Paterson, Y., 57, 79, 122 Patterson, D., 223, 233 Patthi, S., 101, 124 Paul, P. K. C., 61, 118 Pauling, L., 130, 188 Paulsson, M., 23, 26, 27, 28, 29, 31,
32, 35, 38, 39, 42, 43, 44, 47, 48
Pavone, V., 57, 58, 118, 124 Pedone, C., 57, 58, 118, 124 Peggion, E., 227, 234 Pejlar, 32, 48 Pepys, M. B., 35, 44, 48 Perini, F., 22, 48 Perlow, D. S., 60, 124 Perutz, M. F., 129, 130, 158, 159, 166, 168, 171, 174, 180, 182, 187, 188, 193, 233 Peters, B. P., 22, 48 Peters, D. K., 36, 48, 148, 188 Peters, J., 148, 188 Petersen, T. E., 34, 48 Petsko, G. A., 71, 118, 126, 129, 131, 150, 161, 162, 164, 165, 168, 173, 174, 179, 180, 183, 186, 187, 188, 189 Pfeil, W., 196, 197, 198, 202, 205, 206, 226, 227, 232, 233 Pflugrath, J. W., 135, 188 Phillips, D. C., 129, 186 Phillips, S. L., 10, 44 Pierce, G. B., 10, 48 Pierotti, R. A., 224, 233 Pierschbacher, M. D., 35, 42, 43, 48 Pihlajaniemi, T., 12, 20, 48 Pincon-Raymond, M., 35, 44 Pisano, J. J., 84, 118 Ponder, J. W., 117, 122 Porter, R., 2, 48 Potekhin, S. A., 195, 233 Pottle, M. S., 55, 56, 124 Powers, S. P., 69, 123 Poyart, C., 158, 159, 166, 174, 188 Prasad, B. V., 57, 58, 63, 122 Prehm, P., 10, 48 Prendergast, F. G., 68, 81, 86, 88, 89, 90, 91, 93, 119, 121 Prigogine, I., 216, 233 Privalov, P. L., 178, 182, 188, 195, 196, 197, 198, 201, 202, 204, 205, 206, 212, 225, 226, 227, 228, 230, 231, 233, 234 Prockop, D. J., 12, 13, 20, 48 Pryciak, P. M., 108, 110, 114, 119 Ptitsyn, O. B., 227, 232 Puett, D., 86, 87, 120 Pusey, C. D., 36, 48 Putkey, J. A., 92, 96, 123
Q Quadrifoglio, F., 227, 234 Quian, R. Q., 12, 15, 16, 45, 48 Quiocho, F. A., 135, 167, 188 Qvarnström, E., 207, 211, 233
R Rahim, Z., 142, 188 Raidt, H., 129, 182, 188 Ramachandran, C., 55, 122 Ramachandran, G. N., 55, 122 Ramis, C. I., 32, 50 Randall, W. C., 60, 124 Rao, C. N., 2, 22, 23, 38, 39, 41, 42, 46, 48, 49, 50 Rao, M. S. N., 236, 237, 238 Rashin, A. A., 129, 188 Recny, M. A., 81, 122 Reddy, G. P., 235, 238 Reddy, K. R. N., 235, 238 Reed, A. E., 142, 188 Rees, A. J., 36, 48 Regan, L., 113, 122 Reger, L. A., 41, 49 Reich, R., 43, 48 Reid, D. S., 213, 217, 232 Reid, K. S. C., 126, 160, 161, 188 Reid, R. E., 105, 120 Reiher, W. E., III, 142, 145, 188 Rein, R., 128, 188 Rennard, S. I., 21, 22, 29, 35, 45, 49 Repraeger, A. C., 40, 43 Rettich, T. R., 212, 215, 234 Revha, L. P., 195, 234 Rhee, K. C., 238, 238 Rialdi, G., 227, 234 Ricca, G. A., 41, 49 Rich, A., 167, 186 Richards, F. M., 58, 75, 76, 101, 106, 117, 120, 122, 151, 178, 179, 188, 225, 234 Richards, W. G., 174, 186 Richardson, D. C., 114, 116, 117, 119, 122, 124 Richardson, J. S., 72, 101, 102, 106, 107, 114, 116, 122, 124 Richey, B., 194, 211, 232
Richmond, T. J., 106, 122 Rico, M., 69, 70, 122 Rideal, E. K., 236, 238 Rieger, F., 35, 44 Riggs, B. L., 33, 48 Ringe, D., 150, 166, 183, 187, 188 Risteli, J., 12, 15, 16, 17, 18, 45, 46, 48 Rivier, 53, 123 Robey, F. A., 41, 45 Roemer, D., 62, 122 Rogers, N. K., 135, 188 Rohde, H., 21, 22, 44, 49 Rohrbach, D. H., 41, 48, 49 Roll, F. J., 34, 47 Romberg, R. W., 33, 48 Rose, G. D., 57, 60, 61, 122, 179, 188, 193, 234
Rosenbloom, J., 12, 43 Roy, J., 169, 189 Ruben, G. C., 29, 47 Rubin, J., 144, 160, 186 Ruddon, R. W., 22, 48 Rumsey, S. M., 57, 79, 122 Ruoslahti, E., 10, 23, 35, 42, 43, 44, 47, 48
Rutter, W. J., 166, 187 Rüttner, J. R., 11, 47
S Sack, J. S., 89, 118, 164, 186 Sage, H., 32, 41, 44, 45, 47 Sahl, P., 34, 48 St. Angelo, A. J., 236, 238 St. Pierre, S. A., 105, 120 Saito, K., 84, 122 Sakai, L. Y., 6, 11, 33, 34, 43, 44, 45, 48 Saku, T., 38, 44 Sakurai, Y., 20, 48 Salemme, F. R., 70, 101, 102, 104, 106, 108, 114, 122, 123, 124, 138, 188
Salo, T., 42, 48 Sander, C., 70, 120, 138, 187 Sanderson, R. D., 11, 17, 47 Sanes, J. R., 35, 47 Santoro, J., 69, 70, 122 Sanyal, G., 86, 121 Saperstein, R., 60, 124 Sasaki, K., 77, 122
Sasaki, M., 26, 41, 45, 46 Sasaki, N., 25, 48, 50 Sasisekharan, V., 55, 122 Saund, A. K., 105, 120 Sawada, H., 31, 49 Sawyer, L., 136, 188 Scanu, A. M., 81, 82, 119 Schellman, J. A., 219, 225, 227, 231, 234 Scheraga, H. A., 55, 56, 57, 59, 69, 72, 79, 120, 122, 123, 225, 227, 232, 233 Schiffer, M., 83, 84, 122 Schiller, P. W., 53, 54, 62, 64, 65, 66, 67, 118, 119, 121, 122, 123 Schimmel, P. R., 55, 118 Schmitt, H., 57, 58, 118 Schneider, C., 225, 234 Schöllkopf, U., 58, 123 Schulz, M. W., 174, 186 Schuppan, D., 13, 48 Schwartz, D., 42, 49 Schwartz, J. P., 23, 43 Schwarz, U., 13, 48 Schwarz-Magdolen, U., 12, 48 Schweizer, W. B., 58, 123 Schwietzer, J., 62, 120 Schwyzer, R., 96, 123 Seaton, D. B., 166, 188 Seebach, D., 58, 123 Seeholzer, S. H., 92, 96, 123 Segrest, J. P., 82, 119 Sellinger-Barnette, M., 86, 123 Semoff, S., 34, 48 Sharma, A. K., 61, 118 Sharp, K., 130, 134, 187, 188 Sheldrick, C. M., 57, 58, 118, 155, 188 Shen, B. W., 81, 82, 119 Sheridan, R. P., 70, 108, 123, 138, 188 Shetty, K. J., 236, 237, 238 Shevach, E. M., 36, 48 Shibasaki, K., 236, 237, 238 Shibata, S., 2, 48 Shimohigashi, Y., 63, 123 Shinoda, K., 212, 223, 234 Shoemaker, K. R., 52, 69, 70, 71, 109, 123, 135, 188 Shoham, C., 166, 188 Shooter, E. M., 236, 238 Shutov, A. D., 236, 238 Siebold, B., 12, 15, 16, 43, 45, 48 Sieboldt, 13 Sieker, L. C., 145, 186
Sikela, J. M., 91, 123 Silberstein, L., 35, 45 Singh, J., 71, 118, 126, 162, 188 Singh, T. P., 61, 123 Sjolin, L., 70, 124 Skorstengaard, K., 34, 48 Slater, J. C., 140, 188 Slingsby, C., 161, 186, 189 Smith, C. W., 169, 189 Smith, C. D., 53, 120, 123, 169, 187 Smith, C. M., 126, 156, 189 Smith, J. A., 57, 60, 61, 122 Smith, J. A. S., 162, 164, 186 Smith, J. C., 180, 188 Smith, K. K., 10, 49 Smith, W. W., 162, 188 Sobel, M. E., 41, 49 Soininen, R., 13, 20, 48 Solter, D., 10, 21, 45 Southgate, C. C. B., 193, 234 Spach, C., 74, 118 Spero, R. C., 26, 47 Spies, M., 11, 49 Spiro, R. C., 28, 47 Stammer, C. H., 63, 123 Stanley, J. R., 36, 48 Star, V. L., 39, 41, 46, 48 Stamburger, W., 102, 120 States, D. J., 131, 133, 186 Steiner, H., 85, 118 Stenman, S., 34, 48 Stepanov, V. M., 195, 234 Stern, R., 26, 44 Sternberg, M. J. E., 135, 188 Stewart, J. M., 52, 69, 70, 71, 109, 123, 135, 188 Stewart, M., 103, 121 Stewart, R. F., 144, 189 Stezowski, J. J., 53, 123, 169, 187 Stolowich, N. J., 197, 232 Stow, J. L., 31, 35, 48 Strachan, R. C., 60, 124 Strickland, S., 10, 49 Strobel, M. S., 145, 189 Struthers, R. S., 53, 123 Sturtevant, J. M., 196, 197, 204, 206, 232, 234 Suchanek, C., 27, 35, 47 Sudha, T. S., 62, 63, 122, 123 Sueki, M., 69, 123 Sugano, H., 58, 123
Sugimoto, E., 22, 47 Sukumar, M., 58, 120 Sullivan, M., 20, 48 Sultan, L. H., 43, 49 Summers, L., 161, 189 Sundaralingam, M., 144, 189 Susi, H., 227, 234
Suzuki, K., 150, 188 Suzuki, S., 29, 30, 31, 45 Swaminathan, S., 131, 133, 186 Swarm, R., 9, 47
T Tager, H. S., 101, 121 Takahashi, S., 96, 123 Takano, T., 145, 189 Takio, K., 86, 90, 91, 118, 119, 123 Talbot, J. A., 103, 105, 123 Tanaka, K. R., 129, 189 Tanaka, M., 162, 188 Tanaka, Y., 96, 123 Tanford, C., 72, 123, 145, 177, 179, 182, 189, 193, 204, 206, 207, 209, 213, 226, 233, 234 Tapia, O., 134, 189 Tarsio, J. F., 41, 49 Tate, V., 19, 43 Tatunashvili, L. V., 197, 234 Taylor, A., 22, 26, 32, 41, 44, 45, 46, 47 Taylor, H. C., 76, 123 Taylor, J. W., 80, 81, 96, 97, 98, 99, 100, 118, 123 Taylor, P., 126, 157, 162, 187 Taylor, R., 143, 144, 145, 146, 147, 189 Taylor, W., 32, 44 Teeter, M. M., 156, 189 Terboyevich, M., 227, 234 Termine, J. D., 32, 43, 50 Terranova, V. P., 2, 22, 23, 38, 39, 41, 43, 46, 48, 49 Terwilliger, T. C., 83, 103, 119, 124 Thoenen, H., 23, 42, 44 Thomas, K. A., 126, 156, 189 Thomas, R. M., 102, 114, 115, 144, 187 Thomas, T. B., 126, 156, 189 Thompson, E. W., 43, 48 Thompson, H., 20, 46 Thorgeirsson, U. P., 42, 49 Thornberg, L., 39, 43, 49
Thornton, J. M., 71, 118, 126, 129, 160, 161, 162, 186, 188
Tickle, I. J., 53, 77, 114, 120, 122, 161, 186
Tikka, L., 20, 48 Tiktopulo, E. I., 196, 197, 198, 206, 234
Tilton, R. F., Jr., 179, 180, 189 Timpl, R., 2, 9, 10, 11, 12, 13, 16, 17, 18, 19, 20, 21, 22, 23, 25, 26, 27, 28, 29, 30, 31, 32, 34, 35, 36, 38, 39, 41, 42, 43, 44, 45, 46, 47, 48, 49 Tippett, P. S., 78, 79, 118 Tischenko, V. M., 197, 234 Titani, K., 86, 90, 91, 118, 119, 123 Tombs, M. P., 236, 238 Tomotake, Y., 74, 122 Tondello, E., 61, 118 Tonelli, A. E., 58, 124 Toniolo, C., 57, 58, 118, 124 Tou, J. S., 101, 124 Tralka, T. S., 22, 23, 38, 48 Trivedi, D., 77, 121 Trüeb, B., 11, 47, 49 Tryggvason, K., 12, 13, 20, 37, 42, 44, 46, 48 Tsalkova, T. N., 197, 232 Tsernoglou, D., 58, 121, 162, 188 Tsetlin, V. I., 198, 206, 232 Tsilibary, E. C., 19, 32, 37, 38, 44, 49, 50 Tsuda, M., 165, 188 Tsukamoto, T., 84, 122 Tüchsen, E., 174, 189 Turnell, W. G., 161, 186 Turpeenniemi-Hujanen, T., 42, 49 Tyree, B., 29, 30, 31, 35, 45, 46, 49
U Uitto, V. J., 42, 49 Uma, K., 61, 118 Umeyama, H., 142, 146, 189 Umezawa, K., 34, 46 Unger, E., 31, 49 Unson, C. B., 117, 124 Urabe, T., 165, 188 Urdea, M. S., 166, 187 Uzu, S., 84, 122
V Vaheri, A., 20, 34, 46, 48 Valentine, T., 9, 47 Valentine, W. N., 129, 189 Van Delden, V., 12, 16, 17, 18, 49 van Duijnen, P. T., 135, 187 Vanaman, T. C., 85, 121 Vanitraub, I. A., 236, 238 Varon, S., 23, 44 VaMer, M., 113, 119 Vaughan, M., 85, 121 Veber, D. F., 53, 60, 61, 118, 120, 124, 168, 169, 186
Vein, A., 42, 49 Velicelebi, G., 101, 124, 206, 234 Venyaminov, S. Yu., 195, 196, 198, 204, 206, 232, 234
Vergnes, J. P., 21, 22, 44 Versichel, W., 143, 189 Vibe-Pedersen, K., 34, 46 Vigny, M., 35, 49 Villafranca, J. E., 145, 189 Vineyard, B. D., 101, 124 Vitoux, B., 59, 118, 124 Voet, D. H., 145, 189 Vogeli, C., 20, 50 Von der Mark, K., 2, 12, 18, 23, 41, 45, 46, 47, 49
Voss, T., 12, 18, 45, 46 Votano, J. R., 167, 186 Vyas, N. K., 167, 188
W Wada, A., 129, 189 Wadsö, I., 207, 208, 209, 210, 211, 232, 233
Wagner, C. W., 41, 48 Wakamatsu, K., 84, 122 Walkinshaw, M. D., 126, 157, 162, 187 Wallqvist, A., 164, 187 Walsh, K. A., 86, 90, 91, 118, 119, 123 Walter, R., 169, 189 Wang, A. H.-J., 131, 144, 167, 186, 189 Wang, S. Y., 12, 43 Warashina, A., 227, 234 Warfel, J., 21, 45
Warshel, A., 134, 189 Wartiovaara, J., 20, 34, 46, 49 Watterson, D. M., 81, 90, 91, 121 Weast, R. C., 210, 234 Weber, P. C., 104, 106, 122, 124 Weber, S., 18, 27, 47, 49 Weber, T., 58, 123 Weinhold, F., 142, 188 Weiss, B., 86, 122 Wemmer, D. E., 90, 92, 96, 121 Wen, W. Y., 217, 232 Werness, P. G., 33, 48 Weston, Ch., 226, 232 Wetlaufer, D. E., 195, 234 Wewer, U. M., 2, 10, 41, 42, 44, 46, 49
Whelan, J., 2, 48 Wicha, M. S., 41, 47 Wick, C., 18, 27, 49 Wider, G., 77, 118 Wiedemann, H., 10, 11, 12, 16, 17, 18, 19, 29, 30, 31, 38, 45, 46, 47, 49 Wieslander, J., 36, 43, 49 Wight, T. N., 10, 31, 35, 47 Wilcock, R. J., 215, 234 Wilcox, W., 119 Wilczek, J., 29, 35, 45 Wilhelm, E., 212, 215, 234 Wilhoit, R. C., 208, 210, 215, 234 Wilkinson, A. J., 145, 189 Williams, J. G., 32, 47 Wilson, C. B., 36, 50 Winter, C., 145, 174, 189 Winter, W., 57, 58, 118 Winterhalter, K. H., 11, 47, 49 Wirth, P., 41, 49 Wisdom, B. J., 36, 43 Wistow, G. J., 161, 186, 189 Wittnchieber, E., 102, 120 Wlodawer, A., 70, 124, 174, 189 Wodak, 195, 232 Wolfe, H. R., Jr., 68, 86, 88, 89, 91, 93, 119, 120, 122
Wolfenden, R., 193, 234 Wollmer, A., 102, 120 Woodley, D. T., 38, 39, 50 Woodward, C. H., 174, 189 Wutrich, K., 77, 118 Wyckoff, H. W., 75, 76, 122 Wyman, J., 194, 211, 232, 234
Y Yaar, M., 36, 48 Yamada, K. M., 2, 20, 26, 32, 34, 42, 43, 46, 48, 50 Yamada, T., 237, 238 Yamada, Y., 25, 41, 45, 50 Yamakura, F., 150, 188 Yamamura, H. I., 66, 67, 121 Yasuhara, T., 84, 122 Yasunobu, K. T., 162, 188 Yoon, J. W., 41, 48 York, E. J., 52, 69, 70, 71, 109, 123, 135, 188 Yoshida, M., 85, 121
Yotsuhashi, K., 236, 237, 238 Young, M. F., 32, 43, 50 Yurchenco, P. D., 12, 18, 19, 29, 32, 37, 38, 44, 47, 50
Yuspa, S. H., 36, 48
Z Zettergren, J. G., 11, 47 Zehfus, M. H., 193, 234 Zimmerman, S. S., 55, 56, 124 Zipp, A., 204, 226, 234 Zoller, M., 117, 120 Zwolinski, B. J., 208, 210, 215, 234
A Acetic acid, dimerization energy and geometry, electrostatic effects, 180
N-Acetyl-α-Aib-N'-methylamide, conformational energy contour map, 56
N-Acetyl-Ala-N'-methylamide, conformational energy contour map, 56
N-Acetyl-Gly-N'-methylamide, conformational energy contour map, 56
Acetylcholinesterase, basement membranes, 7, 35 α-Helical peptides, binding to phospholipid surfaces, design, 81-85 α-Helical proteins, 102-103, see also α-Helix formation coiled coils, 103-105 four-helix bundles, 105-114 α-Helix formation, see also α-Helical proteins long-range interactions, 71-75 hydrophobic forces, 72 hydrophobic periodicity, 72-74 medium-range interactions, 69-71 C-peptide, 70 N-terminal α-helix of ribonuclease A, 69-70 protonated form of His-12, 71 salt dependence, 70 side chain-side chain interactions, 69-70 short-range interactions, 68-69 Chou-Fasman method, 69 host-guest method, 69 AMBER all-atom force field, 131 Amino-aromatic interactions, 173-176
Amino acid sequences α1, α1A, α1B, and α2B, 112 calmodulin-binding domains of calmodulin-dependent kinases, 91 calmodulin-binding peptide CBP5, 91 collagen IV α1(IV) and α2(IV) chains, 12-16 human β-endorphin, 97 laminin in basement membranes, 25 Amino acids aromatic, weakly polar, 151-156 aromatic side chains, electronic quadrupole moments, 139-140 conformational properties, 55-57 conformationally constrained, enkephalins bearing, 62-63 naturally occurring, partial charges for all atoms, table, 132-133 α,β-unsaturated, properties, 60-61 D-Amino acids, properties, 60 Amino-terminal domain, collagen IV, 13, 15-16 Amphiphilic secondary structures calmodulin-binding peptides, 85-96 membrane-binding peptides, 81-85 peptide hormones, 96-101 Amyloid P component, basement membranes, 35 Anaphylatoxin C3a, enhancement of helical potential, 78-80 Apolipoprotein A-1, models, design, 81-82 Apomyoglobin, intrinsic viscosity, temperature dependence, 230 Aromatic-aromatic interactions, 162-173 strength of, 182 Atomic multipoles, estimation, 128 Average structure, 183
SUBJECT INDEX
B Backbone configuration, φ and ψ angles, 55 Basement membranes acetylcholinesterase, 7, 35 amyloid P component, 35 BM-40, 7, 32-33 bullous pemphigoid antigen, 7, 35-36 calcium-binding proteins, 32-33 chondroitin sulfate proteoglycans, 35 collagen IV, 7-11 interactions, 38 isolation methods, 9 molecular structure, 11-16 receptors, 41-42 self-assembly, 36-38 entactin, 7, 26 fibronectin, 34 interactions, 39 Goodpasture antigen, 36 heparan sulfate proteoglycans, 7, 28-32 interactions, 38-39 interactions between components, 38-40 intrinsic components, table, 7 laminin, 7, 21-22 A and B chains, 22 B1 chain domain model, 24 domains, 22-25 fragments, 22-25 isoforms, 25-26 isolation and preparation, 21 receptors, 41-42 self-assembly, 38 sequence data, 25 3-chain structure, model, 23 morphology, 3-6 nidogen, 7, 27-28 interactions, 39 self-assembly of components, 36-38 structural functions, 40-41 turnover and degradation, 42-43 ultrastructure, 3-6 Benzene rings, interacting, van der Waals stereodrawing, 163 N-Benzoyl-5'-O-tert-butyldimethylsilyl-2'-deoxyadenosine monohydrate, molecular structure, stereodrawing, 144 β Proteins, design, 114-115 BM-40 calcium-binding protein, 7, 32-33 Bombolitins, amphiphilic α-helix formation, 84 Betabellin, design, 116-117 Brookhaven Protein Data Bank, hydrogen-bonding patterns in proteins, 147-150 Buffon's needle problem, 154 Bullous pemphigoid antigen, basement membranes, 7, 35-36
C C3a anaphylatoxin, enhancement of helical potential, 78-80 Ca2+/calmodulin-dependent protein kinase (brain type II), β-subunit sequence, 92 Calcium-binding proteins, basement membranes, 32-33 Calmodulin, aromatic-aromatic interactions, 164-165 Calmodulin-binding peptides CBP1, CBP2, CBP3, CBP4, and CBP5, 87-90 design, 85-96 model, helical net diagram, 91 myosin light-chain kinase, 90 Cambridge Crystallographic Data Base hydrogen bonding to sp2- and sp3-hybridized oxygen atom, 146 interactions between oxygen atoms and phenyl rings, 157 intermolecular interactions between phenylalanine phenyl rings, 163-164 Carbon, as hydrogen donor atom, hydrogen bonds with, 143-145 Carbonate dehydratase, thermodynamic parameters of denaturation, table, 197 Carboxyl-carboxylate interactions, 136 Carboxypeptidase A, mobile tyrosine residue, aromatic-aromatic interactions, 166 Cecropin A, amphiphilic α-helix formation, 85
Charge-charge interactions, 128-130 distance dependence, 130 potential energy of interaction, 130 salt bridges, 129 strength of, 182 Charge-dipole interactions, 134-137 α-helix stabilization, 135 distance dependence, 137 potential energy due to, 136 Charge-quadrupole interactions, 139-140 Charge distribution, molecular, see Molecular charge distribution CHARMM all-atom force field, 131 Chi-square test, 155 2,2-bis(p-Chlorophenyl)-1,1,1-trichloroethane-binding peptide, artificial, 115-116 Chondroitin sulfate proteoglycans, basement membranes, 35 Chou-Fasman method, 69 Chymotrypsin, collagen IV-cleaving, 42 α-Chymotrypsin, thermodynamic parameters of denaturation, table, 197 γ-Chymotrypsin, mobile tyrosine residue, aromatic-aromatic interactions, 166 Coiled-coil proteins, design, 103-105 Collagen IV, basement membranes, 7-11 α1(IV) and α2(IV) chains, 11 amino acid sequence, 12-16 domains, 11-16 genes, size of helical exons, table, 20 loop structures in NC1 domain, 13 NC1 domain, 18 nontripeptide sections, 14 7 S domain, 13, 15-18 electron micrograph, 8 genes, structure and evolution, 19-21 isolation methods, 9 network arrangement, 18-19 receptors, 41-42 self-assembly, 36-38 Collagenous proteins, types, table, 9 Compact reference state, 221 Compact state, nonpolar substances, 224-225
Conformational constraints, peptide design and, 52-53 Conformational energy contour maps N-acetyl-α-Aib-N'-methylamide, 56 N-acetyl-Ala-N'-methylamide, 56 N-acetyl-Gly-N'-methylamide, 56
Conservation, amino-aromatic interactions, 174 Crabrolin, amphiphilic α-helix formation, 84 Cross-links, covalent, introduction in peptides, 61 γ-Crystallin, sulfur-aromatic interactions, 161 Cyclic enkephalins, 63-68 amide bond formation between two side chains, 67 analogs containing intramolecular disulfide bonds, 66-67 receptor selectivity, 65 Cyclic peptides, design, 61 Cytochrome c, thermodynamic parameters of denaturation, table, 197
D DDT-binding peptide, artificial, 115-116 Deoxyhemoglobin A amino-aromatic interactions, 174, 176 aromatic-aromatic interactions, 166-168 oxygen-aromatic interactions, 158, 160 Deoxyribonucleic acid, polar interactions and, 172 Dipole-dipole interactions, 137-138 angular dependence, 138 hydrogen bond, 138 Dipole-quadrupole interactions, 139-140
Dipoles, electronic α-helix dipoles in proteins, 135 formation, 135 of a peptide unit, 135 Dispersion interactions, in proteins, 140-141
Dissolution, nonpolar molecules thermodynamics, 217-220 two-step process, 220-225 Distance dependence aromatic-aromatic interactions, 164 charge-charge interactions, 130 charge-dipole interactions, 137 polarizability effects, 134 potential energy of electrostatic interactions, hierarchy of, 150-151 quadrupole interactions, 140 sulfur-aromatic interactions, 161
E Egg-white lysozyme intrinsic viscosity, temperature dependence, 230 thermodynamic parameters of denaturation, table, 197 EHS tumor BM-40 calcium-binding protein, 32-33 heparan sulfate proteoglycans, isolation, 28-29 laminin isolation, 21 nidogen isolation, 27 source of basement-membrane proteins, 9 Elastase, collagen IV-cleaving, 42 Electron micrograph, collagen IV, network structure, 17 Electron micrographs collagen IV, laminin, and nidogen, 8 low- and high-density heparan sulfate proteoglycans from EHS tumor, 30
Electronegativities biologically important atoms, 130-131 polarizability and, 133 Electronic charges partial atom of naturally occurring amino acids, table, 132-133 in proteins, 130-131 potential energy of interaction, 130 Electronic dipoles, see Dipoles, electronic Electrostatic interactions charge-charge interactions, 128-130 charge-dipole interactions, 134-137 classification, 185 electronic quadrupoles, 138-140 enthalpy estimates, 182 hierarchy, 150-151 hydrogen bonds, 142-150 London forces, 140-141 multipole representations of molecular charge distribution, 126-128 partial electronic charges, 130-131 polarizability, 131, 133-134 short-range electron shell repulsion, 140-141 structure and function of proteins, 184-185
Electrostatic potential energy, 126-128 β-Endorphin amino acid sequence and structural segments, 97 helical net diagrams and proposed helical region, 98 model peptides, 97-101 Enkephalin-binding peptide, design, 116-117 Enkephalins analogs, see specific analog cyclic, see Cyclic enkephalins Entactin, basement membranes, 26 Enthalpy of dissolution, nonpolar molecules in water, 207-209 electrostatic interactions, estimates, 182 nonpolar gases in water, table, 208 protein denaturation, enthalpy of solution of liquid hydrocarbons and, 226 upper limit, 204-207 Entropy upon breakdown of native structure, 204 upper limit, 204-207 Entropy difference, native and denatured protein states, 200-201 Entropy of solution, nonpolar gases in water, table, 215 Entropy of transfer equal to zero, nonpolar molecules, 216-217, 221 hydrocarbons from liquid phase to water, table, 215
F Fibronectin, basement membranes, 34 Folding, pressure effects, 181 Four-helix bundle proteins, design, 105-114 Free energies of tetramerization/dimerization, α1, α1A, α1B, and α2B, 112
G Gene structure, collagen IV, 19-21 Geometry, energetically preferred hydrogen bonds, 146-148 Gibbs energy, hydrocarbons from liquid phase to water, table, 215 Gibbs energy difference maximum value, 202-203 between native and denatured protein states, 200 Globular proteins compact thermodynamic parameters of denaturation, table, 197 upper limit of specific enthalpy and entropy of conformation transition, 204-207 folding and stability, 180-181 internal order, electrostatic interactions, 180 mechanism of stabilization, 228-231 small, macroscopic states in denaturation, 195 Glucagon, enhancement of helical potential, 77-78 Goodpasture antigen, basement membranes, 36 GP1 glycoprotein, 21 GP2 glycoprotein, 21
H Heat capacity change, role in regulating hydration free energy change, 224 increment, nonpolar gases in water, table, 208 nonpolar molecules, change upon aqueous dissolution, 211-212 partial, during transition from native to denatured state, 196-206 Helical domains, collagen IV α1(IV) and α2(IV) chains, 12-13 Heparan sulfate proteoglycan, basement membranes, 7, 28-32 low- and high-density, 29-31 size diversity, 31 Host-guest method, 69 Hydration free energy change, role of heat capacity change, 224 nonpolar molecules, 217-225 Hydrocarbons, liquid dissolution in water, protein denaturation and, 225-228 enthalpy of dissolution in water, 207-209 thermodynamic parameters of solution into water, temperature dependence, 223 Hydrogen bonds bifurcated, 143 bonding patterns in proteins, 147-150 with carbon as hydrogen donor atom, 143-145 classification of electrostatic interactions and, 185 definition, 145-146 denaturation enthalpy and, 227-228 dipole-dipole interactions, 138, 142-143 energetically preferred, geometry, 146-147 enthalpy of protein denaturation and, 227 formation, 142-143 linearity, 148 Met-enkephalin bonding patterns, 54 nontraditional, 143-146 sulfur as donor and acceptor, 145 Hydrophobic bond, 178 Hydrophobic effect, 178 Hydrophobic forces, effect on secondary structure formation, 71-75 Hydrophobic interactions, 177-181 globular proteins, 180-181 internal order of globular proteins, 180 micelle analog, 179 between protein nonpolar groups, 230-231 Hydrophobic periodicity, role in secondary structure formation, 72-74 Hydroxylysine, in basement membranes, 7 Hydroxyproline, in basement membranes, 7
I Isoforms, basement membrane laminin, 25-26
K Knobs into holes packing, 103-104, 106
L Lamina densa, in basement membranes, 3-4 Lamina fibroreticularis, in basement membranes, 3-4, 6 Lamina lucida, in basement membranes, 3 Laminin, basement membranes, 7, 21-22 A and B chains, 22 B1 chain domain model, 24 domains, 22-25 electron micrograph, 8 electrophoresis, 21-22 fragments, 22-25 isoforms, 25-26 isolation and preparation, 21 receptors, 41-42 self-assembly, 38 sequence data, 25 3-chain structure, model, 23 Large proteins, native structure, disruption stages, 195-196 Lennard-Jones 6-12 potential function, 140-141 Leu-enkephalins, design, 61-68 Ligand binding, protein, see Protein-ligand binding London forces, 140-141
M Macrocyclization, 61 enkephalins, 64 Mastoparan, amphiphilic α-helix formation, 84 Melittin, analog, model, 82-83 Membrane-binding peptides, design, 81-85 Met-enkephalins crystalline forms, hydrogen-bonding patterns, 54 design, 61-68 Metalloproteinase, collagen IV-cleaving, 42 Cα-Methylamino acids, properties, 57-58 Nα-Methylamino acids, properties, 58-59 Cα-Methylation, 58 Molecular charge distribution, multipole representations, 126-128 Morphology, basement membranes, 3-6 Multipole representations, molecular charge distribution, 126-128 Multipoles, atomic, estimation, 128 Myoglobin specific enthalpy of denaturation, temperature dependence, 201 thermodynamic parameters of denaturation, table, 197 Myosin light-chain kinase α-helix-forming peptides, 90 calmodulin-binding domains, helical net diagram, 91
N NC1 domain, collagen IV, 11-12, 18 NC2 domain, collagen IV, 12 Network arrangement, collagen IV, 18-19 Nidogen basement membranes, 7, 27-28 electron micrograph, 8 Nonpolar gases enthalpy and heat capacity increment of solution, table, 208 entropy of solution in water, table, 215 Nonpolar molecules enthalpy of dissolution in water, 207-209 heat capacity change upon aqueous dissolution, 211-212 hydration, 217-225 thermodynamics of dissolution, 217-220 two-step dissolution process, 220-225 Nonpolar substances compact state, 224-225 entropy of solution in water, 212-217
O Oligonucleotide synthesis, prebiotic template-directed, 172-173 Osteonectin, BM-40 calcium-binding protein and, 32
Oxygen-aromatic interactions, 156-160 Oxytocin, aromatic-aromatic interactions, 169-170
P Papain, thermodynamic parameters of denaturation, table, 197 Parietal yolk sac cells GP1 and GP2 glycoproteins, 21 production of basement-membrane components, 10 Partial heat capacity, during transition from native to denatured state, 196-206 Parvalbumin aromatic-aromatic interactions, 164-165 thermodynamic parameters of denaturation, table, 197 Peanut proteins, chemistry, abstract, 235-238 Pepsin, collagen IV-cleaving, 42 Pepsinogen, thermodynamic parameters of denaturation, table, 197 Peptide design amino acids, conformational properties, 55-57 amino acids, conformationally constrained D-amino acids, 60 enkephalins bearing, 62-63 Cα-methylamino acids, 57-58 Nα-methylamino acids, 58-59 α,β-unsaturated amino acids, 60-61 amphiphilic secondary structure-forming peptides, 80-81 calmodulin-binding peptides, 85-96 membrane-binding peptides, 81-85 peptide hormones, 96-101 cyclic peptides, 61 effects of conformational constraints, 52-53 enhancement of helical potential, 75-80 glucagon, 77-78 human anaphylatoxin C3a, 78-80 ribonuclease S-peptide, 75-77 Leu- and Met-enkephalin analogs, 61-68 medium-sized peptides, 68 small peptides, 55-55
stabilization of helix formation long-range interactions, 71-75 medium-range interactions, 69-71 short-range interactions, 68-69 Peptide hormones, design, 96-101 N-Phenylacetyl-L-phenylalanine molecular structure, stereodrawing, 134 partial electronic charges of phenyl rings, 131 Plasminogen, fragment K4, thermodynamic parameters of denaturation, table, 197 Polar coordinates, right-handed, 152-155 Polar interactions amino-aromatic interactions, 175-176 aromatic-aromatic interactions, 162-175 DNA structure and, 172 oxygen-aromatic interactions, 156-160 sulfur-aromatic interactions, 160-162 Polarizability, 131, 133-134 Potential energy charge-charge interactions, 150 charge-dipole interactions, 136 dipole-dipole interactions, 137-138 electrostatic, 126-127 polarizability, 134 Potential energy surfaces, minima, 182-185 Pressure effects, on protein folding, 181 Proline, conformational properties, 55, 57 Protein-ligand binding amino-aromatic interactions, 174-175 aromatic-aromatic interactions, 166 charged ligands, salt bridge mediation, 129-130 minima effects, 182 oxygen-aromatic interactions, 158, 160 Protein denaturation calorimetry, 194-195 discrete stages of protein structure disruption, 195-196 native and denatured states, 195 partial heat capacity changes, 196-204 specific enthalpy and entropy of conformation transition, upper limits, 204-207
enthalpy enthalpy of solution of liquid hydrocarbons and, 226 hydrogen bonds and, 227 hydrocarbon dissolution in water and, 225-228 Protein design, 101-102 α-helical proteins, 102-103 coiled coils, 103-105 four-helix bundles, 105-114 artificial DDT-binding peptide, 115-116 β proteins, 114-115 betabellin, 116-117 enkephalin-binding peptide, 116-117 Protein structures, compact, stabilization mechanism, 228-231
Q Quadrupole-quadrupole interactions, 139-140 strength of, 182 Quadrupoles, electronic, electrostatic interactions, 138-140
R Reaction microcalorimetry, 194 Ribonuclease A α-helix N terminal, 69-70 C-peptide, 70-71 intrinsic viscosity, temperature dependence, 230 pH dependence of helix formation, 70 salt bridge, 70-71 salt dependence of helix formation, 70
specific enthalpy of denaturation, temperature dependence, 201 thermodynamic parameters of denaturation, table, 197 Ribonuclease S-peptide, enhancement of helical potential, 75-77 Ridges into grooves packing, 104, 106 Ring stacking, 162
S Salt bridges, 129 buried, 130 classification of electrostatic interactions and, 185 effect on protein thermal stability, 129 mediation of charged ligand-binding to proteins, 129-130 Scanning electron micrographs, human skin basement membrane, 4 Scanning microcalorimetric technique, 194 Secondary structures, amphiphilic, see Amphiphilic secondary structures 7 S domain, collagen IV, 13, 15-18 Single-crystal X-ray crystallographic data bases, 155-156 Solubility, hydrocarbons from liquid phase to water, table, 215 Somatostatin, aromatic-aromatic interactions, 168-169 SPARC, BM-40 calcium-binding protein and, 32 Stabilization, compact protein structures, 228-231 Staphylococcus nuclease, thermodynamic parameters of denaturation, table, 197 Stereodrawings antigen-binding site of Fab fragment NEW, 165 N-benzoyl-5'-O-tert-butyldimethylsilyl-2'-deoxyadenosine monohydrate, molecular structure, 144 6-benzyl-S-chloro-2-pyrone bound to active site of γ-chymotrypsin, 167 conformations of carbon monoxymyoglobin arginine-45, 175 molecular structures of bisphenyl antigelling and antisickling compounds, 170 oxygen environment of 26 phenylalanine side chains, 157 N-phenylacetyl-L-phenylalanine, molecular structure, 134 succinyl-L-tryptophanyl-L-tryptophan bound to deoxyhemoglobin A, 171 van der Waals, interacting benzene rings, 163
Sulfur-aromatic interactions, 160-162 Sulfur, as hydrogen bond donor and acceptor, 145 Synthetic peptides α1, α1A, α1B, and α2B, 112 amino acid sequences, 112 free energies of tetramerization or dimerization, 112
T Temperature dependence denaturational heat capacity increment, 198-200 intrinsic viscosity of egg-white lysozyme, apomyoglobin, and ribonuclease A, 230 specific enthalpy of denaturation of myoglobin and ribonuclease A, 201 thermodynamics parameters in dissolution of liquid hydrocarbons into water, 223 Temperature effects entropy of solution of nonpolar substances, 212-217 entropy of transfer equal to zero, 216-217, 221 heat capacity change upon dissolution of nonpolar molecules, 211-212 Teratocarcinoma cells, production of basement-membrane components, 10 Tertiary structure, stabilization by enthalpically favorable amino-aromatic pairs, 174 by enthalpically favorable aromatic pairs, 164 Thermal stability, effects of salt bridges, 129 Thermodynamics, dissolution of nonpolar molecules, 217-220 Troponin, aromatic-aromatic interactions, 164-165 β-Trypsin, thermodynamic parameters of denaturation, table, 197 Tryptophan-containing peptides fluorescence properties, 95 orientation, schematic, 94 perturbational effect of tryptophan introduction, 93 Tryptophan introduction, perturbational effect, 93
U Ultrastructure, basement membranes, 3-6
V van der Waals distance, 141 van der Waals interactions, 140-141 van der Waals potential, 141 van der Waals radius, 141