Carbohydrate Bioengineering


Author: S.B. Petersen | B. Svensson | S. Pedersen


CARBOHYDRATE BIOENGINEERING

Progress in Biotechnology

Volume 1 New Approaches to Research on Cereal Carbohydrates (Hill and Munck, Editors)
Volume 2 Biology of Anaerobic Bacteria (Dubourguier et al., Editors)
Volume 3 Modifications and Applications of Industrial Polysaccharides (Yalpani, Editor)
Volume 4 Interbiotech '87. Enzyme Technologies (Blažej and Zemek, Editors)
Volume 5 In Vitro Immunization in Hybridoma Technology (Borrebaeck, Editor)
Volume 6 Interbiotech '89. Mathematical Modelling in Biotechnology (Blažej and Ottova, Editors)
Volume 7 Xylans and Xylanases (Visser et al., Editors)
Volume 8 Biocatalysis in Non-Conventional Media (Tramper et al., Editors)
Volume 9 ECB6: Proceedings of the 6th European Congress on Biotechnology (Alberghina et al., Editors)
Volume 10 Carbohydrate Bioengineering (Petersen, Svensson and Pedersen, Editors)

Progress in Biotechnology 10

CARBOHYDRATE BIOENGINEERING
Proceedings of an International Conference
Elsinore, Denmark, April 23-26, 1995

Edited by

Steffen B. Petersen
SINTEF UNIMED, MR-Center, N-7034 Trondheim, Norway

Birte Svensson
Gamle Carlsberg Vej 10, DK-2500 Valby, Denmark

Sven Pedersen
Novo Nordisk A/S, Novo Allé, DK-2880 Bagsvaerd, Denmark

ELSEVIER
Amsterdam - Lausanne - New York - Oxford - Shannon - Tokyo
1995

Published by: Elsevier Science B.V. P.O. Box 211 1000 AE Amsterdam The Netherlands

ISBN 0-444-82223-2

© 1995 Elsevier Science B.V. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher, Elsevier Science B.V., Permissions Department, P.O. Box 521, 1000 AM Amsterdam, The Netherlands.

No responsibility is assumed by the Publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, the Publisher recommends that independent verification of diagnoses and drug dosages should be made.

Special regulations for readers in the USA - This publication has been registered with the Copyright Clearance Center Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01293, USA. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the USA. All other copyright questions, including photocopying outside the USA, should be referred to the copyright owner, Elsevier Science B.V., unless otherwise specified.

This book is printed on acid-free paper. Printed in the Netherlands

PREFACE

The rapid development in glycobiology and glycotechnology has resulted in an enormous increase in our knowledge of the structure, conversion, and application of carbohydrates in industry and medicine. The Carbohydrate Bioengineering Meeting held at the LO-School in Elsinore, Denmark, April 23-26, 1995, gathered 230 scientists, mostly from Europe, with interests in carbohydrate analysis and structure; carbohydrates in medicine and glycopathology; structure, function, application, and protein engineering of carbohydrate active enzymes; oligo- and polysaccharides of industrial interest; and production of carbohydrate-containing new materials. The meeting provided a forum where highly distinguished researchers presented their latest results. The strong representation of participants in the EU Biotechnology Programme on Carbohydrate Active Enzymes illustrated a great need for this type of meeting, which provides a stimulating atmosphere where established scientists and students in academia and industry can get together. The contributions, from invited and selected speakers as well as from the poster presentations, spanned the broad field in a continuous manner. Chapters 1-4 of the Proceedings address glycoconjugates as modulatory and recognition molecules, structure determination using NMR and mass spectrometry, and microdialysis-chip enzyme-based sensors. Carbohydrate active enzymes turned out to be a predominant topic. Chapters 5-8 describe different bacterial enzymes that are either involved in carbohydrate metabolism or have potential application in bio-processing of special polysaccharides, including at elevated temperatures. Details of the tools used to analyse the mechanism of carbohydrases and carbohydrate-protein interactions are discussed in chapters 9-12, which include active site mutations coupled with crystal structures and synthetic substrate analogue interactions.
Protein engineering of the specificity and stability of two industrially important starch-degrading enzymes is described in chapters 13 and 14. The significance and engineering of electrostatics in carbohydrate active enzymes, and the role and engineering of N-linked sugar moieties in protein stability, are demonstrated in chapters 15 and 16. A very timely update on cellulolytic enzymes in chapters 17-21 covers the new three-dimensional structures and binding domains for biotechnological applications. The descriptions in chapters 22 and 23 of transgenic plants for understanding and manipulation of starch biosynthesis, and of the expression of cellulases in transgenic animals, both have wide perspectives in nutrition and related sciences. Examples of prospects for industrial synthesis of polysaccharides using enzymes not acting on sugar nucleotides, and the potential and details of the complex mechanism of such enzymes, are given in chapters 24 and 25. Furthermore, industrial applications using monocomponent plant cell-wall polysaccharide hydrolases, synthesis of fatty acid-carbohydrate ester surfactants, and the utilization of bacteria in commercial production of sugar derivatives are described in chapters 26-28.

Any scientific meeting benefits from financial support. In our case we would like to express our special gratitude for the generous support from the following sponsors: Biosym Technologies, Boehringer DAKO, Dansk Metal, European Commission/DG XII, European Congress of Biotechnology 5, Nordic Fund for Technology and Industrial Development, Novo Nordisk A/S, Nutek, Pharmacia, Pharmacia Biotech, Radiometer, Silicon Graphics and SNF.

Finally, we also wish to express our warm thanks to the participants for their contributions and for making this a most exciting and fruitful meeting. A very special thank you goes to all authors for the effort they have made in the preparation of the manuscripts. We also gratefully acknowledge the professional assistance of Ms. Mona K. Eidem in the preparation of this book.

Copenhagen, September 27, 1995

Sven Pedersen, Steffen B. Petersen and Birte Svensson, Editors


TABLE OF CONTENTS

Chapter 1: Glycans of glycoconjugates as modulatory and recognition molecules
N. Sharon

Chapter 2: NMR studies of the structure and dynamics of carbohydrates in aqueous solution
H. van Halbeek and S. Sheng

Chapter 3: Linkage analysis by mass spectrometry of chemically modified oligosaccharides from glycosphingolipids and glycoproteins
B. Nilsson

Chapter 4: Development of a novel enzyme based glucose sensor
F. Spener, R. Steinkuhl, C. Dumschat, H. Hinkers, K. Cammann and M. Knoll

Chapter 5: Carbohydrate binding at the active site of Escherichia coli maltodextrin phosphorylase
P. Drueckes, D. Palm and R. Schinzel

Chapter 6: The chitinolytic system of Streptomyces olivaceoviridis
H. Schrempf

Chapter 7: Properties and production of the β-glycosidase from the thermophilic Archaeon Sulfolobus solfataricus expressed in mesophilic hosts
M. Moracci, L. Capalbo, M. De Rosa, R. La Montagna, A. Morana, R. Nucci, M. Ciaramella and M. Rossi

Chapter 8: Contribution of subsites to catalysis and specificity in the extended binding cleft of Bacillus 1,3-1,4-β-D-glucan 4-glucanohydrolases
A. Planas and C. Malet

Chapter 9: Probing of glycosidase active sites through labeling, mutagenesis and kinetic studies
S.G. Withers

Chapter 10: Thiooligosaccharides: toys or tools for the studies of glycanases
H. Driguez

Chapter 11: Mutational analysis of catalytic mechanism and specificity in amylolytic enzymes
B. Svensson, T.P. Frandsen, I. Matsui, N. Juge, H.-P. Fierobe, B. Stoffer and K.W. Rodenburg

Chapter 12: The structure and function relationship of Schizophyllum commune xylanase A
M.R. Bray and A.J. Clarke

Chapter 13: Protein engineering of cyclodextrin glycosyltransferase from Bacillus circulans strain 251
L. Dijkhuizen, D. Penninga, H.J. Rozeboom, B. Strokopytov and B.W. Dijkstra

Chapter 14: Oxidation stable amylases for detergents
T.V. Borchert, S.F. Lassen, A. Svendsen and H.B. Frantzen

Chapter 15: Electrostatic studies of carbohydrate active enzymes
A. Baptista, T. Brautaset, F. Drabløs, P. Martel, S. Valla and S.B. Petersen

Chapter 16: Effects of glycosylation on protein folding, stability and solubility. Studies of chemically modified or engineered plant and fungal peroxidases
K.G. Welinder and J.W. Tams

Chapter 17: Modes of action of two Trichoderma reesei cellobiohydrolases
T.T. Teeri, A. Koivula, M. Linder, T. Reinikainen, L. Ruohonen, M. Srisodsuk, M. Claeyssens and T.A. Jones

Chapter 18: Structural studies on fungal endoglucanases from Humicola insolens
G.J. Davies and M. Schülein

Chapter 19: The catalytic domain of endoglucanase A from Clostridium cellulolyticum belonging to family 5: an α/β-barrel enzyme
V. Ducros, M. Czjzek, A. Belaich, C. Gaudin and R. Haser

Chapter 20: Cellulosome domains for novel biotechnological application
E.A. Bayer, E. Morag, M. Wilchek, R. Lamed, S. Yaron and Y. Shoham

Chapter 21: Interactions of cellulases from Cellulomonas fimi with cellulose
N. Din, J.B. Coutinho, N.R. Gilkes, E. Jervis, D.G. Kilburn, R.C. Miller Jr., E. Ong, P. Tomme and R.A.J. Warren

Chapter 22: Transgenic plants as a tool to understand starch biosynthesis
J. Kossmann, G. Abel, V. Büttcher, E. Duwenig, M. Emmermann, R. Lorberth, F. Springer, I. Virgin, T. Welsh and L. Willmitzer

Chapter 23: Targeted expression of microbial cellulases in transgenic animals
S. Ali, J. Hall, K.L. Soole, C.M.G.A. Fontes, G.P. Hazlewood, B.H. Hirst and H.J. Gilbert

Chapter 24: Mechanism and action of glucansucrases
J.F. Robyt

Chapter 25: Studies of recombinant amylosucrase
M. Remaud-Simeon, F. Albaret, B. Canard, I. Varlet, P. Colonna, R.M. Willemot and P. Monsan

Chapter 26: Application of cloned monocomponent carbohydrases for modification of plant materials
L.V. Kofod, T.E. Mathiasen, H.P. Heldt-Hansen and H. Dalbøge

Chapter 27: Fatty acid esters of ethyl glucoside, a unique class of surfactants
O. Andresen and O. Kirk

Chapter 28: A wide range of carbohydrate modifications by a single microorganism: Leuconostoc mesenteroides
W. Soetaert, D. Schwengers, K. Buchholz and E.J. Vandamme

S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.

Glycans of glycoconjugates as modulatory and recognition molecules

Nathan Sharon

Department of Membrane Research and Biophysics, The Weizmann Institute of Science, Rehovot 76100, Israel

Abstract

Glycoproteins and glycolipids are the most common classes of glycoconjugates. Their glycans are structurally diverse, although overlapping to some extent. The functions of these glycans are mostly unknown, but from the information available it is clear that they often modulate the physicochemical properties of the proteins to which they are attached, and sometimes also their biological activities. In addition, there is increasing evidence that many of these glycans, whether protein or lipid bound, serve as recognition determinants in molecule-cell and cell-cell interactions. The knowledge accrued is having a marked impact on the biotechnological production of therapeutically useful glycoproteins and provides a strong impetus for the development of novel types of drugs for a wide range of diseases, such as microbial infections, inflammation and cancer.

1. INTRODUCTION

Living organisms produce a bewildering array of glycans, many of them in the form of glycoproteins or glycolipids. These generally go under the term glycoconjugates or complex carbohydrates. The question of the function of these glycans is currently attracting increasing interest. Thus, oligosaccharides attached to proteins often modulate the physicochemical and biological properties of the latter. Furthermore, changes in glycan structure accompany many normal and pathological processes, from differentiation and development to cancer. In some cases, specific carbohydrates on cancer cells have been correlated with the tumor grade, metastatic potential and prognosis, raising the possibility that glycans may serve as markers for diagnostic purposes and as targets for drugs. Developments in genetic engineering have made possible the biotechnological production of pharmaceutically useful glycoproteins, both for research purposes and for therapeutic use. Last, but not least, is the prospect of creating carbohydrate-based drugs for the treatment of many diseases, from microbial infections to cancer.

Findings made prior to the 1970s gave some indications that carbohydrates may have functions other than as structural materials and energy sources [1]. Work carried out in the 1940s and 1950s showed that sialic acid is the receptor for influenza virus, that sugars on the surface of enteric bacteria serve as phage receptors and that such compounds are determinants of ABO blood type specificity. The sixties witnessed the demonstration that malignant cells often have a different sugar phenotype from their normal counterparts, and that sugars determine the lifetime of glycoproteins in the circulation. These findings were largely ignored by biochemists and biologists alike. During the last quarter of the century the situation has changed, in the beginning rather slowly, then at a steadily accelerating rate, and our knowledge of glycan structures and their functions has increased greatly. These developments have not escaped the attention of the popular press, as reflected in the article entitled "Sugar coated truths" (The Economist, September 24th, 1994): "Researchers have been bewitched by the elegance of nucleic acids and proteins. Now they are opening their eyes to the subtleties of sugars".

Here I present a survey of the role of glycans in modulating the properties of the proteins to which they are attached and of their function as recognition determinants. The latter function is shared by membrane-bound glycolipids, which is not surprising in view of the identity of many glycan structures found in both types of glycoconjugate. These subjects have been extensively dealt with in two recent reviews [2,3], in which references to earlier literature can be found.

2. METHODOLOGY

Different approaches are being employed in the quest to unravel the role of glycoconjugate glycans. They include modification of the glycans by purified glycosidases or transferases, use of inhibitors of glycosylation or glycoprotein processing and of cell mutants with known defects in glycosylation, and more recently, techniques of molecular genetics. Recombinant glycoproteins can be expressed in heterologous cells or organisms, resulting in different patterns of glycosylation. An extreme example is that of bacteria (e.g., E. coli), which produce completely non-glycosylated proteins. Nucleotide-directed mutagenesis can be applied to modify glycosylation sites on proteins so that they will no longer serve for glycan attachment. With N-glycoproteins, where the carbohydrate is linked to the Asn-X-Ser/Thr sequon, modification of either the first or third amino acid will abolish glycosylation at this site. In glycoproteins with more than one carbohydrate unit, whether N-, O- or both, the sites can be systematically eliminated in various combinations, to form a panel of mutants in which the role of each carbohydrate chain can be assessed.

Although very useful, these and other in vitro methods rarely provide answers to questions concerning the biological roles of carbohydrates in the intact, multicellular organism. Important information in this respect can be derived from investigations of congenital disorders involving genetic defects in protein glycosylation. A well-studied case is that of I-cell disease, a lysosomal storage disease resulting from the inability to synthesize mannose-6-phosphate (due to a deficiency in N-acetylglucosaminyl phosphotransferase) [4]. Other examples are HEMPAS, a rare form of genetic anemia in humans [5], and the carbohydrate deficient glycoprotein (CDG) syndromes I and II, a set of multisystemic diseases with major nervous system involvement, all of which are due to deficiencies in various steps of N-glycosylation [6,6a].

Last but not least, the use of transgenic animals, still in its infancy, opens up new avenues for the study of glycan function in the intact organism [7]. Carbohydrate structures can now be manipulated by the expression of degradative enzymes that act on a particular carbohydrate in a particular linkage, of transferases that attach monosaccharides to glycoproteins or glycolipids, or of other enzymes, such as those that act on monosaccharide substituents (e.g. O-acetyl). These structures may also be modified by the complete elimination of a particular glycosyltransferase ("knock-out" experiments). For example, insight into the role of complex or hybrid N-linked glycans in the development of embryos was recently obtained in experiments with transgenic mice from which the gene for GlcNAc-T I had been removed by homologous recombination [8,9]. Embryos of such mice exhibited an overall stunting of growth and lethality at around day 10 of gestation. These findings show that complex N-linked chains are not required during early development, but are necessary later, for the completion of morphogenesis.
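The sequon-elimination strategy described in this section, in which Asn-X-Ser/Thr sites are removed singly and in combination to build a panel of mutants, can be illustrated with a short sketch. It is a minimal illustration only, not a protocol from the cited work: the toy sequence, the function names and the Asn-to-Gln substitution are hypothetical choices, and the exclusion of proline at position X is a common convention that the text does not state.

```python
import re
from itertools import combinations

# N-glycosylation sequon: Asn-X-Ser/Thr. Excluding Pro at X is an assumption
# (a common convention); the text only specifies Asn-X-Ser/Thr.
# The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq):
    """Return the 0-based positions of the Asn in every sequon of seq."""
    return [m.start() for m in SEQUON.finditer(seq)]


def knock_out(seq, asn_positions, replacement="Q"):
    """Replace the Asn at each given position (Asn->Gln is illustrative)."""
    residues = list(seq)
    for pos in asn_positions:
        residues[pos] = replacement
    return "".join(residues)


def mutant_panel(seq):
    """Map every non-empty combination of sequon positions to its mutant."""
    sites = find_sequons(seq)
    panel = {}
    for k in range(1, len(sites) + 1):
        for combo in combinations(sites, k):
            panel[combo] = knock_out(seq, combo)
    return panel


if __name__ == "__main__":
    toy = "MKNGSAANPTLLNATQ"  # hypothetical sequence, not a real protein
    print("sequon Asn positions:", find_sequons(toy))
    for sites, mutant in mutant_panel(toy).items():
        print(sites, mutant)
```

For a protein with n sequons the panel contains 2^n - 1 mutants, which is why such panels are practical only for glycoproteins with a small number of sites.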

3. MODULATION OF PHYSICOCHEMICAL PROPERTIES

Quite frequently, the carbohydrate groups affect the physical properties of the protein to which they are attached (Table 1). In general, the effects are more pronounced the higher the carbohydrate content of the glycoprotein. The negative charges of sialic acid residues and sulfate groups change the solubility and modify the conformation of glycoproteins, as well as the adhesive properties of cells. These effects are of particular importance for the function of the highly glycosylated mucins, which may carry sialic acid-containing oligosaccharides on as many as one third of their amino acids. As a result, they assume rigid, rod-like structures that may reach a length of several hundred nanometers. Mucin regions are often found on cell surface receptors and it is thought that the role of the rigid structures is to extend the functional domains away from the cell surface. Since the surface area of the glycans is quite significant when compared to that of the peptide moiety, they may, in addition, influence other properties of proteins, such as heat stability and susceptibility to proteolysis. These properties are of special interest to industries producing commercially used enzymes, for which stability is a common requirement.

Perhaps the most important functions of N-glycosylation are to aid in folding of the nascent polypeptide chain and in stabilization of the conformation of the mature glycoprotein. As a consequence, it may also affect any, or all, of the functions that depend on conformation. In the absence of glycosylation, some (glyco)proteins aggregate and/or are degraded, and as a result, are not secreted from the cells in which they are synthesized. Other glycoproteins are less influenced and are secreted, but have compromised biological activities, while some appear to be totally unaffected.

Recent studies suggest that one way in which N-linked oligosaccharide chains affect protein folding is by mediating the interaction of the newly synthesized peptide with calnexin, a chaperone with apparent selectivity for N-glycoproteins [10]. It is a non-glycosylated membrane protein of 65.4 kDa molecular weight, with a large, negatively charged carboxyterminal cytoplasmic tail and an external domain containing three internal repeats of a conserved hexapeptide sequence. Experiments with two viral membrane glycoproteins (influenza virus hemagglutinin and vesicular stomatitis virus G protein) showed that calnexin binds transiently to the newly synthesized glycoproteins that have partially trimmed, monoglucosylated oligosaccharides; the binding coincides with protein folding and oligomer assembly. The proteins remain bound to calnexin for different periods of time, depending on the rates at which they achieve conformational maturation. During this time, the single terminal glucose is rapidly turned over in a deglucosylation-reglucosylation cycle. Once a glycoprotein is folded or assembled into oligomeric form, it is no longer reglucosylated and is therefore released by calnexin and free to leave the endoplasmic reticulum.

Table 1
Carbohydrates modify the properties of the proteins to which they are linked

OFTEN
  Affect solubility, charge and viscosity
  Control folding and subunit assembly
  Stabilize protein conformation
  Protect against proteolysis
  Affect the lifetime in circulation
  Change the immunological properties
  Modify the transmission of signals by cellular receptors
  Modify the activity of enzymes and hormones
RARELY

The carbohydrate may markedly change the quaternary structure of a protein to which it is attached, as demonstrated in the X-ray crystallographic study of the Erythrina corallodendron lectin, a member of the large family of legume lectins [11]. The heptasaccharide, linked at Asn17 of each of the two subunits of this lectin, prevents the formation of the characteristic dimer observed in other members of the legume lectin family (e.g. concanavalin A and pea lectin). As a result, these subunits adopt a completely different quaternary structure. Not only is glycosylation at a particular site important in directing protein folding and assembly, but the precise structure of the glycan may also be critical. A clear case is that of human chorionic gonadotropin, a glycoprotein hormone composed of two subunits, for which it was shown that an abnormally glycosylated α-subunit is unable to associate with the β-subunit to form the mature hormone.

4. MODULATION OF BIOLOGICAL ACTIVITY

The ability of carbohydrates to modulate the activities of biologically functional proteins, occasionally even in an all-or-nothing manner, has been established unequivocally during the last decade in a limited number of cases. For most glycoproteins, however, the role of the carbohydrates is still obscure.

4.1. Enzymes

Well-documented cases of the effect of carbohydrate on enzymatic activity are rare. One of these is that of tissue plasminogen activator (tPA), a serine protease which converts plasminogen into plasmin and thereby induces clot lysis (fibrinolysis). Naturally occurring type I and type II tPA, which possess different numbers of N-linked glycans, differ in the rate of formation of an active complex with fibrin that is able to cleave plasminogen. Plasminogen itself exists in two forms, with either one or more oligosaccharide units attached. The rate of fibrin-dependent plasminogen activation spans a range that is dependent on the glycosylation site occupancy of both tPA and its substrate. At the extremes, this activity for type II tPA with type II plasminogen (possessing one O-linked glycan) is 2-3 times the value for type I tPA and type I plasminogen with one O-linked and one N-linked glycan [12,13].

Another, more recent, case is that of bovine pancreatic RNase. The enzyme occurs both in unglycosylated (RNase A) and glycosylated (RNase B) form; the latter is a collection of isoforms, in which the same polypeptide is associated with nine different oligomannose chains (from Man5 to Man9) at the single N-glycosylation site (Asn34). RNase A and RNase B have always been reported as having the same enzymatic activity and were frequently quoted as a proof, provided by nature, that such activity is not affected by the presence of carbohydrates in the molecule. With the aid of a novel sensitive assay using a double-stranded RNA substrate it was shown [13] that RNase A was more than three times as active as RNase B. The individual glycoforms RNase-Man5, RNase-Man1 and RNase-Man0, prepared by exoglycosidase treatment of naturally occurring RNase B and separated by capillary electrophoresis, were intermediate in activity (Table 2). These differences were attributed to an overall increase in dynamic stability of the molecule with glycosylation (as demonstrated by measurements of proton exchange rates of the various RNase forms) and to steric effects. Molecular modeling indicated that increasing the size of the RNase oligosaccharide up to five mannose residues could lead to a decrease in activity, whereas the Man5 to Man9 glycoforms would exhibit similar activities, as was indeed found [14].

Table 2
Enzymatic activity of RNase A and glycoforms of RNase B ¹

RNase    Carbohydrate       Relative activity
A        None               1.0
B ²      GlcNAc2Man0        0.62
B ²      GlcNAc2Man1        0.45
B ²      GlcNAc2Man5        0.28
B        GlcNAc2Man5-9      0.26

¹ Based on ref. [13].
² Modified by enzymatic removal of some mannose residues.
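The statement that RNase A is "more than three times as active" as RNase B follows directly from the relative activities in Table 2. The short sketch below works through that arithmetic; the dictionary labels and the function name are illustrative choices, and the numbers are simply those transcribed from the table.

```python
# Relative activities transcribed from Table 2 (RNase A = 1.0).
# The labels are illustrative names, not nomenclature from the source.
RELATIVE_ACTIVITY = {
    "RNase A (no carbohydrate)": 1.0,
    "RNase B, GlcNAc2Man0": 0.62,
    "RNase B, GlcNAc2Man1": 0.45,
    "RNase B, GlcNAc2Man5": 0.28,
    "RNase B, GlcNAc2Man5-9 (native)": 0.26,
}


def fold_reduction(relative_activity):
    """How many times more active RNase A is than the given form."""
    return 1.0 / relative_activity


if __name__ == "__main__":
    for form, activity in RELATIVE_ACTIVITY.items():
        print(f"{form:34s} {activity:4.2f}  A is {fold_reduction(activity):.2f}x as active")
```

With these values, native RNase B (0.26) corresponds to a ratio of about 3.8, and the Man5 glycoform (0.28) to about 3.6, consistent with the observation that the Man5 to Man9 glycoforms have similar activities.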

4.2. Hormones

Chemically or enzymatically deglycosylated glycoprotein hormones bind to their receptors on target cells with the same affinity as the native ones; their ability to activate the hormone-responsive adenylate cyclase is, however, drastically decreased [reviewed in 2,3]. Site-directed mutagenesis experiments on human chorionic gonadotropin cDNA implied that glycosylation at Asn-52 of the α-chain alone is sufficient for normal signal transduction. Furthermore, in the absence of this critical oligosaccharide unit, glycosylation at Asn-13 of the β-chain resulted in intermediate activity of the hormone, whereas glycosylation at Asn-30 of the same chain resulted in an inactive product. Deglycosylated hCG interacts with a different domain of the receptor than the native hormone. This difference may be a factor determining the success or failure of signal transduction from the receptor to the effector system.

The role of carbohydrates in the activity of erythropoietin, a glycoprotein hormone that stimulates erythropoiesis, is the subject of intense studies, not least because of the great commercial interest in this compound. Erythropoietin has the distinction of being the first recombinant glycoprotein produced industrially for clinical use and is being widely employed for the treatment of anaemia in patients on haemodialysis. Desialylation of the hormone enhanced its in vitro activity by increasing its affinity for the receptor, but decreased its activity in vivo, presumably by decreasing its lifetime in circulation. Similar results were reported for the N-deglycosylated hormone, which in vitro exhibited several-fold higher specific activity than the native one, but was inactive in vivo. Examination of several preparations of recombinant erythropoietin that differ in the degree of branching of their N-glycans revealed that in vivo activity of the hormone increased with the ratio of tetraantennary to biantennary saccharides [15].

4.3. Other biologically active molecules

Carbohydrates on receptors may affect the functional coupling of the latter to effector systems such as adenylate cyclase (via guanine nucleotide binding proteins (G-proteins)) and tyrosine kinase, essential for the transmission of signals from the ligand to the cell. Thus, an insulin receptor in which all four potential N-glycosylation sites of the β-subunit had been eliminated by site-directed mutagenesis had similar affinity for its ligand as the wild type receptor but lost its transmembrane signaling ability, as evidenced by lack of stimulation of glucose transport and glycogen synthesis by the hormone [16]. Two types of T-cell derived, IgE-binding factor have been described, one of which enhances, and the other suppresses, IgE synthesis in mast cells. The factors share a common polypeptide backbone, but only the former is glycosylated. This is perhaps the only known case of such a remarkable change in the activity of a protein caused by the presence of a carbohydrate [17]. In IgG antibodies, elimination of the conserved glycan linked to Asn297 of the heavy chain leads to a loss of some of the effector functions of the molecule, such as binding to Fc receptors on macrophages [18].

5. ACTIVITIES OF FREE OLIGOSACCHARIDES

Diverse activities are exhibited by free oligosaccharides, either derived from glycoproteins or from other sources (Table 3). This was originally demonstrated with heparin oligosaccharides that, like the parent molecule, act as anticoagulants. Recently it was found that a heparin/heparan sulfate dodecasaccharide activates cell-bound fibroblast growth factor, similarly to full-size heparin (or heparan sulfate proteoglycans, such as syndecan) [19,20]. The saccharide binds both to the growth factor and its receptor and the formation of such a trimeric complex appears to be a prerequisite for signal transduction. Glycans isolated from plant glycoproteins were shown to delay tomato ripening. Other oligosaccharides that act on plants in different ways have been described, some of which are listed in Table 3.

Table 3
Biologically active oligosaccharides

Structure                  Activity                                             Refs.
In animals
  Heparin-derived          Anticoagulant                                        [19,20]
  Heparan sulfate-derived  Growth factor activators                             [19,20]
In plants
  Oligoglucosides          Induce disease resistance                            [21]
  Pectin-derived           Anti-auxins                                          [21]
  Oligo-GlcN derivatives   Nodulation factors in rhizobia                       [22]
  Fuc-Xyl-N-glycans        Delay tomato ripening                                [23]
  Xyloglucan-derived       Inhibit or promote elongation of pea stem segments   [21]

6. CARBOHYDRATES AS RECOGNITION DETERMINANTS

There is increasing evidence for the concept, formulated over 20 years ago, that carbohydrates act as recognition determinants in a variety of physiological and pathological processes [24,25] (Tables 4 and 5).

Table 4
Carbohydrates and lectins in cell-cell recognition

Process             Sugars on           Lectins on
Infection           Host cells          Microorganisms
Defense             Phagocytes          Microorganisms
                    Microorganisms      Phagocytes
Fertilization       Eggs                (Sperm) ¹
Leukocyte traffic   Leukocytes          Endothelial cells
                    Endothelial cells   Lymphocytes
Metastasis          Target organs       Malignant cells
                    Malignant cells     (Target organs)

¹ Presumed, no experimental evidence available

This concept evolved with the realization that carbohydrates have an enormous potential for encoding biological information. The messages encoded in the structures of complex carbohydrates are deciphered through interactions with complementary sites on carbohydrate-binding proteins, chiefly lectins. Processes in which the participation of carbohydrate-lectin interactions was clearly demonstrated include intracellular trafficking of enzymes, clearance of glycoproteins from the circulatory system and a wide range of cell-cell interactions. Particularly exciting is the recent demonstration that binding of carbohydrates on the surface of leukocytes to a class of animal lectins designated selectins controls leukocyte traffic by mediating adhesion of these cells to restricted portions of the endothelium and their recruitment to inflammatory sites.

6.1. Clearance (traffic) markers

The rapid removal of desialylated glycoproteins from rabbit serum via the hepatic asialoglycoprotein receptor (or lectin), a phenomenon discovered in the late 1960s, is the prototype of the saccharide-based recognition system, although its role in nature has not yet been proven beyond doubt [26].

Table 5
Clearance and targeting of glycoproteins

                       Receptor
Glycoprotein           Specificity        Location
Asialoglycoproteins    Galactose          Liver (hepatocytes)
Hormones               SO4-GalNAc         Liver (Kupffer cells, endothelial cells)
Lysosomal enzymes      Man-6-phosphate    Ubiquitous
Diverse                Mannose            Macrophages, liver (endothelial cells)

Several other systems in which the traffic of glycoproteins is controlled by their carbohydrate constituents are known (Table 5). A prominent example is the intracellular routing of lysosomal enzymes to their compartment, which is mediated by the recognition between Man-6-P attached to the oligomannose unit(s) of such enzymes and the Man-6-P receptors [4]. Two such receptors have been described, one cation-independent and of high molecular weight (220 kDa), the other cation-dependent and of low molecular weight (48 kDa). A defect in the synthesis of the Man-6-P marker recognized by the receptors results in I-cell disease (also called mucolipidosis II or MLII), an inherited lysosomal storage disease characterized by a lack in the lysosomes of those enzymes that normally carry the marker [27]. It is caused by a deficiency of GlcNAc-phosphotransferase, the first enzyme in the pathway of mannose phosphorylation, and is thus a processing disease, the first of its kind to be identified. Therefore, even though the disease is transmitted by a single gene, some 20 enzymes are affected. The enzymes lacking the recognition marker do not reach their destination (the lysosomes) and are, consequently, secreted into the extracellular milieu, which is one of the biochemical abnormalities of the affected cells. The specificity of the GlcNAc-phosphotransferase for certain lysosomal enzymes is based on its ability to recognize a specific lysine residue and a particular tertiary domain of the acceptor glycoprotein.

Another carbohydrate-specified targeting system is that of the sulfated glycoprotein hormones. Native lutropin (LH), carrying predominantly mono- and di-sulfated oligosaccharides on its β-subunit, is cleared from the circulation 4-5 times more rapidly than recombinant LH produced in CHO cells, which bears only sialylated oligosaccharides. The sulfated oligosaccharides of LH are synthesized by the action of two enzymes, a glycoprotein hormone-specific GalNAc-transferase and a GalNAc-specific sulfotransferase [28]. The oligosaccharide formed by the two transferases, SO4-GalNAcβ4GlcNAcβ2Manα, is recognized by a receptor present on hepatic endothelial and Kupffer cells. Sulfated oligosaccharides are also present on the common precursor to two other hormones, adrenocorticotropin and melanotropin. It has been hypothesized that the attachment of this structure is a general tag that signals rapid clearance, resulting in a short burst of circulating hormones and thus preventing overloading of the corresponding receptors.

The presence of well-defined carbohydrate-binding proteins on cell surfaces is being exploited for drug targeting to specific organs. Gaucher's disease is caused by a deficiency of the enzyme β-glucocerebrosidase, resulting in accumulation of glucocerebroside in Kupffer and endothelial (non-parenchymal) cells of the liver. These cells carry on their surface a mannose- (and N-acetylglucosamine-) specific lectin. To target β-glucocerebrosidase to these cells, the complex and hybrid sugar chains of the enzyme were trimmed with the aid of exoglycosidases to expose the mannose residues of the pentasaccharide core. In this way the administered glucocerebrosidase is effectively delivered to the deficient cells, where the enzyme is needed to degrade the accumulated glucocerebroside [29].

6.2. Infection

The oligosaccharide repertoire on the host-cell surface is among the key genetic susceptibility factors in viral and microbial infection and in toxin action. A number of viral, mycoplasmal, bacterial and protozoan pathogens use specific carbohydrate structures (of glycoproteins or glycolipids) on host cells as attachment sites in the initial stages of infection [30,31]. Experiments in intact animals have indeed proved that it is possible to prevent bacterial infection by blocking the attachment of the responsible organism with an appropriate sugar (Table 6).
Such findings have provided an impetus for the development of carbohydrate-based anti-adhesion drugs to combat infections. At present, at least two drugs of this kind have been patented, against bacterial pneumonia and against Helicobacter pylori (a bacterium associated with stomach ulcers). Should a bacterium mutate so that it no longer recognizes the anti-adhesive carbohydrates, it will also fail to bind to its cell surface receptors, and therefore lose the ability to cause infection. Moreover, since such drugs do not kill the pathogens, they will not exert selection pressure and their use will not result in the development of resistance.

Even if a particular carbohydrate has been established as an inhibitor for a disease-causing microorganism in an animal (or in humans), it must be determined whether the use of this carbohydrate, or its analogues, will interfere with other processes in the body. One such process is lectinophagocytosis, well documented for the mannose-specific E. coli [32]. This mode of phagocytosis may result from binding of the bacteria to phagocytes, e.g. macrophages or neutrophils, which is followed by activation of the phagocytes and uptake and killing of the bacteria. Lectinophagocytosis may occur in vivo and may provide protection against bacterial infection to nonimmune hosts or in sites that are poor in opsonins. The latter include the lungs, the renal medulla, the cerebrospinal fluid and the peritoneal cavity, especially during peritoneal dialysis. In another mode of lectinophagocytosis, a wide range of microorganisms (bacteria, fungi and protozoa) that express mannose on their surface bind to the mannose-specific lectin present on the surface of macrophages. This binding, too, may lead to the uptake of the microorganisms by the phagocytic cell and occasionally also to their killing. A particularly interesting example of such a microorganism is the pathogenic fungus Pneumocystis carinii, a major cause of death among AIDS patients.

Table 6
Inhibitors of sugar-specific adhesion prevent infection in vivo

Organism                        Animal, site        Inhibitor
Escherichia coli type 1         Mice, UT            MeαMan
                                Mice, GIT           Mannose
                                Mice, UT            Anti-Man antibody
Klebsiella pneumoniae type 1    Rats, UT            MeαMan
Shigella flexneri type 1        Guinea pigs, eye    Mannose
Escherichia coli type P         Mice                Globotetraose
                                Monkeys             Galα4GalβOMe
Escherichia coli K99            Calves, GIT         Glycopeptides of serum glycoproteins
Pseudomonas aeruginosa          Human, ear          Gal+Man+NeuAc

UT, urinary tract; GIT, gastrointestinal tract

Human immunodeficiency virus (HIV), the causative agent of AIDS, is heavily glycosylated [reviewed in 2]. The major envelope glycoprotein gp120 of HIV has a key role in infection by the virus through its interaction with the membrane glycoprotein CD4 of T lymphocytes. Glycosylation of gp120 appears to be a prerequisite for CD4 binding: the non-glycosylated protein from cells grown in the presence of tunicamycin does not bind to CD4, and treatment of gp120 with deglycosylating enzymes impairs binding. Of the various inhibitors of glycosylation tested, the most dramatic anti-viral effects have been observed with N-butyldeoxynojirimycin.

6.3. Leukocyte traffic

Research carried out mainly during the last 5 years has demonstrated that adhesive interactions mediated by surface carbohydrates and surface lectins play a crucial role in leukocyte trafficking to sites of inflammation and hemostasis and in the migration (homing) of lymphocytes to specific lymphoid organs. In these processes, the carbohydrates serve as "area codes" which are interpreted by E-selectin, P-selectin and L-selectin, members of a family of endogenous lectins [33-35]. The selectins are highly asymmetric membrane-bound proteins. Their extracellular part consists of an amino-terminal carbohydrate recognition domain (CRD), an epidermal growth factor-like domain and several short repeating units related to complement-binding protein. They bind specifically to sialyl-Lewis x (siaLex in brief), NeuAc(α2-3)Gal(β1-4)[Fuc(α1-3)]GlcNAc, and its positional isomer, sialyl-Lewis a (siaLea), NeuAc(α2-3)Gal(β1-3)[Fuc(α1-4)]GlcNAc, with both fucose and sialic acid required for binding; sialic acid can be replaced by another negatively charged group such as sulfate. These proteins recognize the carbohydrate ligands only when the latter are present on particular glycoproteins, such as cell surface mucins, pointing to the role of the carrier molecule and of carbohydrate presentation in the recognition of the ligands by lectins.

The selectins provide the best paradigm for the role of sugar-lectin interactions in biological recognition. In broad outline, they all mediate, although with some differences, the adhesion of circulating leukocytes to endothelial cells of blood vessels, leading to the exit of the former cells from the circulation. This extravasation is necessary for the migration of leukocytes into tissues, such as occurs during normal recirculation of lymphocytes between different lymphoid organs or in recruitment of leukocytes to sites of inflammation. L-selectin, also known as the "homing receptor", is found on all leukocytes. It is predominantly involved in the recirculation of lymphocytes, directing them specifically to peripheral lymph nodes. In contrast to the homing receptor, the two other selectins are expressed mainly on endothelial cells, and only when these cells are activated by inflammatory mediators, mainly cytokines (e.g. interleukin-2 and tumor necrosis factor). The latter are released from tissue leukocytes in response to, e.g., wounding, infection or ischemia, and induce the expression of P-selectin on the endothelial surface within minutes and of E-selectin within 3-4 hours.

Recent experiments in animals have provided direct evidence for the role of selectins in the control of leukocyte traffic. For instance, in P-selectin deficient mice, generated by targeted gene disruption, the recruitment of neutrophils to the inflamed peritoneal cavity was significantly delayed [36]. The clinical importance of selectin-carbohydrate interactions in acute inflammatory responses in humans is illustrated by the finding that the neutrophils of two patients were deficient in siaLex [37]. The specific biochemical lesion responsible for this defect has not yet been established, but it is believed to reflect a general fucosyltransferase deficiency in these patients. In agreement with the ligand activity of siaLex, the neutrophils of the patients were unable to bind to E-selectin. These findings imply that this inability prevents the neutrophils from migrating to the sites of infection and suggest that inhibitors of the selectins may be potent anti-inflammatory agents. Prevention of adverse inflammatory reactions by inhibition of leukocyte extravasation has become a major aim of many pharmaceutical companies. Preliminary experiments in animal models indeed show that oligosaccharides recognized by the selectins exert protective effects against experimentally induced lung injury. These approaches are now being evaluated for treatment of human disease.

7. OPEN QUESTIONS

Although the enormous progress made in the last decade in many aspects of glycobiology "has finally opened a crack in the door to one of the last great frontiers of biochemistry" [38], many questions remain unanswered. The structures of the glycans encountered are exceedingly diverse and we are, as yet, unable to discern the principles that guide their formation. Although preliminary insights have been obtained into the central role of the polypeptide backbone in specifying glycosylation [28], the mechanism(s) by which the information encoded in the primary amino acid sequence is translated into a particular glycan structure is still not clear. A related problem concerns the molecular basis for the variations in glycosylation of the same protein between different species or cell types, and in the course of development, differentiation and oncogenesis.

The intriguing question of the function(s) of the carbohydrate is, for most glycoproteins, whether soluble or membrane bound, still wide open, since modulation of physicochemical properties or biological activities by their glycans has been demonstrated for only a small number of glycoproteins. One thing is clear: glycosylation can have markedly different effects on different proteins. This means that each glycoprotein must be examined individually and meticulously for the possible functions of the glycans it carries. Progress in this area will therefore be unavoidably slow. Still unresolved is the question of the biological relevance of microheterogeneity: can unique biological functions be ascribed to different components of the ensemble of glycoforms?

The accumulation of evidence on the role of carbohydrates, whether bound to proteins or lipids, as recognition molecules is exceedingly rewarding. It serves as a strong impetus for improving techniques of structural analysis of carbohydrates as well as for the development of new methods of synthesis of oligosaccharides and glycomimetics, a challenge for the carbohydrate bioengineers. Recent developments also focus attention on lectins as co-partners in the recognition process. Of particular importance in this context is the need for a detailed knowledge of their combining sites [39], which should allow the design of highly effective carbohydrate anti-adhesive drugs.

8. ACKNOWLEDGEMENT

Special thanks are due to Dr. Halina Lis for her help in preparing this manuscript.

9. REFERENCES

1 N. Sharon, Complex Carbohydrates: Their Chemistry, Biosynthesis and Functions. Addison-Wesley, Reading, Massachusetts, 1975.
2 H. Lis and N. Sharon, Eur. J. Biochem., 218 (1993) 1.
3 A. Varki, Glycobiology, 3 (1993) 97.
4 S. Kornfeld, Annu. Rev. Biochem., 61 (1992) 307.
5 M.N. Fukuda, Glycobiology, 1 (1990) 9.
6 J. Jaeken, H. Carchon and H. Stibler, Glycobiology, 3 (1993) 423.
6a J. Jaeken, H. Schachter, H. Carchon, P. De Cock, B. Coddeville and G. Spik, Arch. Dis. Child., 71 (1994) 123.
7 B.D. Shur, Curr. Biol., 4 (1994) 996.
8 E. Ioffe and P. Stanley, Proc. Natl. Acad. Sci. USA, 91 (1994) 728.
9 M. Metzler, A. Gertz, M. Sarkar, H. Schachter, J.W. Schrader and J.D. Marth, EMBO J., 13 (1994) 2056.
10 C. Hammond and A. Helenius, Curr. Biol., 3 (1993) 884.
11 B. Shaanan, H. Lis and N. Sharon, Science, 254 (1991) 862.
12 A.J. Wittwer, S.C. Howard, L.S. Carr, N.K. Harakas, J. Feder, R.B. Parekh, P.M. Rudd, R.A. Dwek and T.W. Rademacher, Biochemistry, 28 (1989) 7662.
13 P.M. Rudd, H.C. Joao, E. Coghill, P. Fiten, M.R. Saunders, G. Opdenakker and R.A. Dwek, Biochemistry, 33 (1994) 17.
14 R.J. Woods, C.J. Edge and R.A. Dwek, Nature Struct. Biol., 1 (1994) 499.
15 M. Takeuchi and A. Kobata, Glycobiology, 1 (1991) 337.
16 I. Leconte, C. Auzan, A. Debant, B. Rossi and E. Clauser, J. Biol. Chem., 267 (1992) 17415.
17 K. Ishizaka, Annu. Rev. Immunol., 6 (1988) 513.
18 R.A. Dwek, Glycoconjugate J., 10 (1993) 357.
19 D. Spillman and U. Lindahl, Curr. Opin. Struct. Biol., 4 (1994) 667.
20 A.D. Lander, Chemistry & Biology, 1 (1994) 73.
21 A. Darvill, C. Augur, C. Bergman et al., Glycobiology, 2 (1992) 181.
22 I. Vijn, L. das Neves, A. van Kammen, H. Franssen and T. Bisseling, Science, 260 (1993) 1764.
23 H. Yunovitz and K.C. Gross, Physiologia Plantarum, 90 (1993) 152.
24 N. Sharon and H. Lis, Sci. Am., 268(1) (1993) 82.
25 T. Feizi, Curr. Opin. Struct. Biol., 3 (1993) 701.
26 G. Ashwell, in Lectin Blocking: New Strategies for the Prevention and Therapy of Tumor Metastasis and Infectious Diseases (J. Beuth and G. P Pulverer, eds.), Gustav Fischer Verlag, Stuttgart, Jena, New York, 1994, p. 26.
27 E.F. Neufeld, Annu. Rev. Biochem., 60 (1991) 257.
28 J.U. Baenziger, FASEB J., 8 (1994) 1019.
29 E. Beutler, A. Kay, P. Garver, D. Thurston, A. Dawson and B. Rosenbloom, Blood, 78 (1991) 1183.
30 I. Ofek and N. Sharon, Curr. Top. Microbiol. Immunol., 151 (1990) 91.
31 I. Ofek and R.J. Doyle, Bacterial Adhesion to Cells and Tissues. Chapman and Hall, New York, London, 1994.
32 I. Ofek, J. Goldhar, Y. Keisari and N. Sharon, Annu. Rev. Microbiol., 49 (1995) 239.
33 L.A. Lasky, Science, 258 (1992) 964.
34 M.P. Bevilacqua, Annu. Rev. Immunol., 11 (1993) 767.
35 S.D. Rosen and C.R. Bertozzi, Curr. Opin. Cell Biol., 6 (1994) 663.
36 T.N. Mayadas, R.C. Johnson, H. Rayburn, R.O. Hynes and D.D. Wagner, Cell, 74 (1993) 541.
37 U.H. von Andrian, E.M. Berger, L. Ramezani, J.D. Chambers, H.D. Ochs, J.M. Harlan, J.C. Paulson, A. Etzioni and K.E. Arfors, J. Clin. Invest., 91 (1993) 2893.
38 G.W. Hart, Curr. Opin. Cell Biol., 4 (1992) 1017.
39 N. Sharon, Trends Biochem. Sci., 18 (1993) 221.


S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © Elsevier Science B.V. All rights reserved.


NMR studies of the structure and dynamics of carbohydrates in aqueous solution

Herman van Halbeek and Shuqun Sheng

Complex Carbohydrate Research Center and Departments of Chemistry and Biochemistry, The University of Georgia, 220 Riverbend Road, Athens, Georgia 30602-4712, USA

Abstract

Notable recent developments in NMR methodology for studying carbohydrate structure and dynamics include the increased information gained from the observation of hydroxyl proton signals in supercooled water, as well as improved measurements of long-range heteronuclear scalar couplings and 13C relaxation rates. This chapter presents an example of recent progress in each of those categories. (1) A 2-D homonuclear rotating-frame exchange experiment is proposed as a suitable means, under supercooled aqueous conditions, to detect transient intramolecular hydrogen bonds in the flexible disaccharide sucrose. (2) A modified 2-D 1H-detected heteronuclear multiple-quantum experiment is introduced for speedy and accurate measurement of nJCH (n>2); the application of the so-called HSMBC experiment is illustrated for sialyllactose. (3) The quantitative aspects of the dynamics of the hexasaccharide headgroup of ganglioside GD1a embedded in a perdeuterated mixed micelle in aqueous solution are assessed by 1H-detected natural-abundance 13C T1ρ measurements; the data are interpreted according to the Lipari-Szabo model-free approach, which reveals relatively fast internal motions in the outer region of the hexasaccharide.

1. INTRODUCTION

A detailed NMR analysis of the solution conformations and dynamics of a carbohydrate encompasses the following steps:
• complete assignment of the 1H and 13C NMR spectra of the carbohydrate;
• establishment of spatial (distance and/or torsion angle) constraints between atoms from dipolar and scalar correlation NMR measurements;
• measurement of 1H and 13C relaxation parameters (that is, T1, T2, T1ρ, and homo- and heteronuclear cross-relaxation rates).

Despite the efforts typically involved in conducting the pertinent NMR experiments [1,2], the experimentally measurable constraints may fall short of determining the complete conformation and dynamic behavior of an oligosaccharide. Evaluation of the experimental data with computational strategies is then a necessity, typically by using potential energy calculations and molecular dynamics (MD) simulations [3-5]. However, in favorable cases, the NMR study yields a sufficiently large number of constraints to broadly define the conformation of the oligosaccharide in aqueous solution. Often one finds that not all the obtained NMR constraints are compatible with the existence of a single rigid structure, implying that the oligosaccharide is dynamic with respect to torsional vibrations around each glycosidic bond.

The past few years have witnessed a vast increase in the number of efforts aimed at the measurement of NMR parameters directly related to the flexibility of carbohydrates, including 1H and 13C T1 and T1ρ, homo- and heteronuclear cross-relaxation rates, and global and local correlation times (reviewed in [6]). It is gradually becoming clear that very few oligosaccharides adopt a single, fully constrained ("rigid") conformation. The ability of most carbohydrates to present one covalent structure to their environment in many different ways may very well contribute to their versatility in biological functions. NMR spectroscopy not only helps to narrow the conformational space accessible to a flexible oligosaccharide, it can also provide information about the relative populations of, and the rate of interconversion between, different energetically favored conformers.

We will discuss below three examples of this role for NMR in carbohydrate conformational analysis, namely, to aid in restricting the theoretically possible ensemble of conformations for a given oligosaccharide. We present two recently developed NMR methods that provide either additional (hydrogen bond) or more accurately determined (3JCH) parameters to restrain the oligosaccharide structure in question. Also, a quantitative evaluation of 13C relaxation data is presented for a glycolipid system mimicking physiologic cell-surface conditions.

2. HYDROGEN BONDING

Hydroxyl proton resonances in aqueous solutions of sugars were first observed over 15 years ago [7], but the value of these protons in the conformational analysis of carbohydrates has been demonstrated only recently [8-11]. Hydroxyl resonances have the potential to provide a wealth of structural information in the form of chemical shifts, 3JH-C-OH couplings, nuclear Overhauser effects (NOEs), and exchange rates. However, this information is accessible only if the intermolecular exchange of OH protons with solvent H2O can be slowed sufficiently. At room temperature, hydroxyl protons in aqueous solutions of carbohydrates engage in chemical exchange with water, the rate of which is very fast on the NMR time scale. Researchers at first applied mixed solvents (water/acetone and water/methanol) to study OH groups at low temperatures (-5 to -10 °C) (see, e.g., [9]). More recently, 1H NMR studies have been reported of hydroxyl groups in dilute solutions of mono- and disaccharides in pure H2O under supercooled (-15 to -20 °C) conditions [10]. The chemical exchange rates under these conditions are reduced to such an extent that signals can be observed for each hydroxyl site; thus, all hydroxyl proton resonances can be assigned on the basis of (scalar or dipolar) connectivities to non-labile aliphatic protons. The line widths, temperature shift coefficients, and coupling constants of OH protons are valuable hydration and hydrogen-bonding probes in NMR studies [11]. Also, H/D isotope effects of hydroxyl protons/deuterons on 13C resonances can be used to obtain indirect evidence of the involvement of OH groups in intra- or interresidue hydrogen bonds [12]. Furthermore, because they protrude farther from the glycosyl ring systems than most CH protons, OH protons may serve as long-range conformational sensors which can be interrogated by NOESY and ROESY experiments on the carbohydrate in aqueous solution [8,9]. In order to use OH protons as conformational probes for oligosaccharides, special NMR techniques for water suppression must be utilized [13-16].

We report below on the study of intramolecular hydrogen bonds in the disaccharide sucrose in aqueous solution. The three-dimensional structure of sucrose [Fruf-β(2↔1)α-Glcp] (Fig. 1), particularly the conformation of its glycosidic linkage, has been the subject of numerous investigations. Data obtained by NMR analyses and MD simulations on sucrose in solution have led to uncertainties concerning the degree of rigidity of the linkage conformation in solution [17-21], raising the question whether it is the same as in the crystal structure [22]. Careful 13C T1 measurements for sucrose revealed very fast and small-amplitude torsional and vibrational motions at different sucrose ring positions [23,24] and mobilities of the exocyclic groups different from those of the ring skeletons [25]; however, on the basis of this type of measurement the glycosidic linkage was judged to be rigid and similar in conformation to that observed in the crystalline state.

In 1992, inspired by the work of Pérez and co-workers [20], we reinvestigated the solution conformation of sucrose, applying new NMR methods to reveal 1H/1H NOE contacts explicitly including OH protons [26]. The results of our quantitative NOE measurements are compiled in Table 1. Interestingly, all NOE connectivities found for sucrose in water solution can be explained in terms of a single conformation, which is virtually identical to the crystal structure. However, based on a quantitative analysis of the magnetic field strength dependence of 1H/1H NOE data, we questioned [26] the rigidity in solution of the sucrose glycosidic bond. The NOEs between protons on different rings proved to be magnetic field strength-dependent, while this did not seem to be the case for NOEs between protons within the glucopyranose ring. We interpreted this observation as evidence of rearrangements occurring around the glycosidic bond between the glucosyl and fructosyl residues that take place much more quickly than the tumbling of the molecule in solution. This type of internal motion is transparent to relaxation parameters (including 13C T1) if it occurs on the same time scale as overall molecular tumbling; this seems to be the case for sucrose at ambient temperature. Thus, we demonstrated that sucrose in solution more than likely experiences fast internal motion around its glycosidic bond.
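As background to the field-strength argument, the following minimal Python sketch evaluates how the homonuclear 1H/1H cross-relaxation rate of a single rigid proton pair varies with spectrometer frequency. It assumes the textbook isotropic-tumbling spectral density J(ω) = τc/(1 + ω²τc²), for which the rate is proportional to 6J(2ω) - J(0); this model, the function names and the example correlation times are illustrative assumptions and are not taken from the analysis in [26].

```python
import math

def spectral_density(omega, tau_c):
    """Isotropic rigid-rotor spectral density J(w) = tau_c / (1 + (w*tau_c)^2)."""
    return tau_c / (1.0 + (omega * tau_c) ** 2)

def relative_cross_relaxation(field_mhz, tau_c):
    """Return 6J(2w) - J(0), proportional to the 1H/1H cross-relaxation rate.

    field_mhz: 1H resonance frequency in MHz; tau_c: correlation time in s.
    The distance-dependent prefactor (1/r^6 and constants) is omitted.
    """
    omega = 2.0 * math.pi * field_mhz * 1e6
    return 6.0 * spectral_density(2.0 * omega, tau_c) - spectral_density(0.0, tau_c)

# Hypothetical correlation times (s), chosen only to span the three regimes.
for tau_c in (1e-11, 3e-10, 5e-9):
    row = [relative_cross_relaxation(f, tau_c) / tau_c for f in (300, 500, 600)]
    print(f"tau_c = {tau_c:.0e} s:", ", ".join(f"{v:+.2f}" for v in row))
```

In this simplified model the rate is nearly independent of field for very short correlation times and changes sign where ωτc = √5/2, so a marked field dependence indicates motion on a time scale comparable to 1/ω.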

Figure 1. Left: the crystal structure of sucrose (hydrogen bonds are represented by dashed lines) [22]. Right: the 600-MHz 1-D 1H NMR spectrum of sucrose in pure water, recorded at -17 °C and pH 6.5.

Table 1. Apparent transglycosidic interproton distances for sucrose in aqueous solution

Proton pair a    Distance b (Å)
H1g/H1f          2.6
H1g/H3f          4.6
H1g/H4f          3.5
H1g/H6f          4.0
H5g/H4f          3.0
H1g/OH1f         3.6
H1g/OH6f         3.4
H1g/OH3f         4.0
H5g/OH3f         3.8
OH2g/H1f         3.2

a Data were derived from NOE contacts observed at 500 MHz and 27 °C for CH protons, and at -10 °C for OH protons [26]; 'g' denotes the glucosyl, 'f' the fructosyl ring.
b Distances were calculated using the formula $r_{ij} = (\sigma_{ref}/\sigma_{ij})^{1/6}\, r_{ref}$, in which $\sigma_{ref}$ and $r_{ref}$ represent the cross-relaxation rate of, and the distance between, a reference pair of protons. Intraglucosyl distances H1g/H2g (2.4 Å), H1g/H3g (3.7 Å) and H1g/H4g (4.0 Å) were used as references.
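The distance calibration described in the footnote to Table 1 can be sketched in a few lines of Python. The function name and the cross-relaxation rates used below are hypothetical (the measured rates are not given in this chapter); only the formula and the 2.4 Å H1g/H2g reference distance come from the text.

```python
def noe_distance(sigma_ij, sigma_ref, r_ref):
    """Apparent distance r_ij = (sigma_ref / sigma_ij)**(1/6) * r_ref.

    sigma_ij, sigma_ref: cross-relaxation rates (same arbitrary units, same sign)
    r_ref: known distance of the reference proton pair, in angstrom
    """
    if sigma_ij == 0 or sigma_ref / sigma_ij <= 0:
        raise ValueError("rates must be non-zero and of the same sign")
    return r_ref * (sigma_ref / sigma_ij) ** (1.0 / 6.0)

# Hypothetical rates in arbitrary units; reference pair H1g/H2g at 2.4 angstrom.
sigma_reference = 1.00
sigma_pair = 0.25
distance = noe_distance(sigma_pair, sigma_reference, r_ref=2.4)
print(f"apparent distance: {distance:.2f} angstrom")
```

Because of the sixth-root dependence, even a fourfold change in cross-relaxation rate shifts the apparent distance by only about a quarter, which is why such distances are robust to moderate intensity errors yet insensitive to conformational averaging.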

Table 2. Temperature coefficients and scalar coupling constants of the hydroxyl protons of sucrose dissolved in pure water

Proton a    κ b      3JH-C-OH c
OH1f        -8.2     5.7
OH3f        -8.8     7.3
OH4f        -9.2     5.6
OH6f        -11.0    5.3
OH2g        -8.0     7.4
OH3g        -10.3    4.8
OH4g        -9.4     6.4
OH6g        -10.3    5.5

a 'f' denotes the fructosyl, 'g' the glucosyl ring.
b Temperature coefficients (κ) are in ppm/°C × 10^3; they were measured over the temperature range -18 to -8 °C at 1 °C intervals.
c Scalar coupling constants (3JH-C-OH) are in Hz; the values were measured at -17 °C.

[Structural formula of sucrose with atom numbering]
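A temperature coefficient of the kind listed in Table 2 is the slope of hydroxyl chemical shift against temperature. The sketch below assumes an ordinary least-squares line through shifts sampled at 1 °C intervals from -18 to -8 °C; the fitting procedure is not stated in the text, and the shift values are synthetic, generated only to exercise the function.

```python
def temperature_coefficient(temps_c, shifts_ppm):
    """Least-squares slope of shift vs temperature, returned in ppm/degC x 10^3."""
    n = len(temps_c)
    if n < 2 or n != len(shifts_ppm):
        raise ValueError("need at least two (temperature, shift) pairs")
    mean_t = sum(temps_c) / n
    mean_s = sum(shifts_ppm) / n
    sxx = sum((t - mean_t) ** 2 for t in temps_c)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(temps_c, shifts_ppm))
    return 1e3 * sxy / sxx

# Synthetic data: a hydroxyl signal moving upfield by 0.0082 ppm per degC.
temps = list(range(-18, -7))  # -18, -17, ..., -8 degC
synthetic_shifts = [6.50 - 0.0082 * (t + 18) for t in temps]
kappa = temperature_coefficient(temps, synthetic_shifts)
print(f"kappa = {kappa:.1f} (ppm/degC x 10^3)")
```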

In this study we refine our investigations of sucrose at low temperatures in aqueous solution in an effort to shed light on the existence of any interresidue hydrogen bonds in the context of its newly revealed flexibility around the glycosidic bond. We were aware that a great deal of effort had been spent by others [27,28] examining intramolecular hydrogen bonding and determining its effect on the overall conformation of sucrose in solution. However, we are the first to use the hydroxyl proton signals of sucrose in supercooled aqueous solution as NMR probes for the detection of intramolecular hydrogen bonds.

A capillary tube (1.5 mm i.d.) containing a 20 mM sucrose solution in H2O/D2O (9:1, v/v) was inserted into a 5-mm tube and, once placed in the magnet, the sample was gradually cooled to -17 °C. All of the sucrose OH signals were indeed observed in 1-1 echo water suppression [13] experiments (Fig. 1) and had lines sufficiently narrow to be assigned by a 2-D COSY experiment. Neither the OH temperature shift coefficients (see Table 2), the 3JH-C-OH scalar couplings (see Table 2), the 13C (COH vs. COD) chemical shifts, nor the OH exchange rates allowed us to single out any sucrose hydroxyl group with characteristics significantly different from the others. However, 2-D rotating-frame exchange spectroscopy (ROESY) of sucrose under supercooled conditions revealed a direct exchange between the glucosyl OH2 and fructosyl OH1 protons (see Fig. 2). We conclude that the observation of chemical exchange between OH2g and OH1f ('g' denotes the glucosyl, 'f' the fructosyl moiety in sucrose) provides reasonable evidence for the transient existence of an OH1f···O2g (or OH2g···O1f) hydrogen bond in solution under the applied supercooled conditions. This "weak" hydrogen bond is compatible with the flexibility of sucrose around its glycosidic bond.

It appears that the intramolecular hydrogen bonding in sucrose is concentration dependent; the existence of hydrogen bonds OH1f···O2g, OH3f···O2g, and OH6f···O5g at higher concentrations has been postulated based, inter alia, on 13C isotope effects [27]. The existence of the first two hydrogen bonds at high concentration in solution implies flexibility and fast interconversion between two conformers around the glycosidic linkage in solution; in the crystal state only OH1f···O2g and OH6f···O5g are observed. It should be noted that the aforementioned 1H/1H ROESY experiment can only detect direct chemical exchange between two OH groups, so it is restricted in its applicability to those hydrogen bonds that involve two participating OH groups (thus, ROESY cannot reveal any hydrogen bonding to ring oxygens). Moreover, the ROESY spectrum contains no information about the identity of the hydrogen-bond donor and acceptor and, therefore, cannot discriminate between the possibilities of an OH1f···O2g and an OH2g···O1f hydrogen bond in sucrose.

Figure 2. Hydroxyl proton region of a 2-D ROESY spectrum of sucrose in H2O/D2O (9:1, v/v) recorded at 600 MHz, pH 6.5, and -17 °C. The effective spin-lock field strength was set to 1.5 kHz, and the ROESY mixing time was 100 ms.

3. HETERONUCLEAR LONG-RANGE COUPLINGS MEASURED BY A TWO-DIMENSIONAL HSMBC EXPERIMENT

The classic HMBC experiment [29] has long been recognized as a powerful means to qualitatively detect long-range heteronuclear couplings. For carbohydrates, the HMBC experiment provides the key to primary sequencing, exclusively and entirely based on tracing through-bond J-couplings, the only scalar couplings consistently observable across a glycosidic bond being the transglycosidic 3JCH [30]. However, determining the magnitudes of the nJCH couplings involved from the HMBC cross-peak patterns with sufficiently high accuracy is not straightforward. Cross-peak multiplets in HMBC spectra are recorded in mixed-phase mode in the 1H dimension, owing to evolution of 1H chemical shifts and homonuclear couplings during the relatively long delay period required for heteronuclear long-range couplings to evolve [31]. Direct extraction of heteronuclear coupling constants from non-pure absorption patterns is impossible. It is therefore mandatory to find a reference spectrum of the oligosaccharide that has the same phase behavior as the cross peak in the HMBC spectrum. A 2-D 1H/1H TOCSY spectrum recorded under otherwise the same conditions as the HMBC spectrum has been used successfully for this purpose [32]. Alternatively, the refocused HMBC experiment designed for the quantitative measurement of long-range coupling constants based on HMBC cross-peak intensities [33,34] uses either a 1-D version of the HMBC spectrum (obtained by omitting the 13C pulses from the original sequence) or a full 2-D HMQC spectrum as reference. Both approaches therefore require the recording of the reference spectrum in an experiment separate from the HMBC experiment itself.

The method we propose here uses the original HMBC pulse sequence, without suppression of one-bond correlations. Indeed, the one-bond correlation peaks are essential because they provide the necessary references for the phase properties of the multiple-bond correlations. Hence, we refer to this experiment as HSMBC, for heteronuclear single- and multiple-bond correlation spectroscopy. The pulse sequence of the 2-D HSMBC experiment is shown in Fig. 3. No separate, additional NMR experiment needs to be conducted to obtain an appropriate reference spectrum. Also, the reference peaks here have truly the same phase properties as the multiple-bond correlation peaks, as they are collected during the very same experiment. Values for the long-range couplings can be extracted by applying the shift method introduced by Titman et al. [32]. We will illustrate the application of the HSMBC experiment for sialyl-α(2→6)-lactose (SL6):

Neu5Ac-α(2→6)-Galp-β(1→4)-Glcp

[Pulse-sequence diagram: 1H pulses 90°x and 180°x; 13C pulses 90°φ1 and 90°φ2, separated by the two t1/2 periods.]

Figure 3. Pulse sequence of the 2-D HSMBC experiment. The phase cycling is as follows: φ1 = x, −x; φ2 = x, x, x, x, −x, −x, −x, −x; φ3 = x, −x, x, −x, −x, x, −x, x.

The HSMBC spectrum of SL6 (Fig. 4) was recorded on a Bruker AMX-600 spectrometer (1H frequency 600 MHz, 13C frequency 150 MHz) using a 20 mM solution of the trisaccharide in D2O at pD 7.3 and 25 °C. The delay Δ was set to 45 ms, and the acquisition times t2 and t1 were 1.13 s (spectral width 1805 Hz, 4 K complex data points) and 25.6 ms (spectral width 10,000 Hz, 512 real data points), respectively. The relaxation delay was 1 s; 64 scans were accumulated per t1 increment. Quadrature detection in t1 was accomplished by time-proportional phase incrementation. The total duration of the HSMBC experiment on SL6 was 15 h. The HSMBC data were processed using BioSym's FELIX software package, version 2.3, on a Silicon Graphics Personal Iris workstation. No window function and no zero-filling were applied in t2 before Fourier transformation. A cosine-squared window function was applied in the t1 dimension; the first data point was multiplied by 0.5 to reduce t1 noise. No zero-filling was applied in t1. The final resolution in the 13C dimension was 19.5 Hz/pt. Fourier transformation in both dimensions was followed only by phasing in F1. The pair of 1H cross sections containing the 3JCH and 1JCH correlation multiplets for a particular proton was then selected. Each of these traces was subjected to Hilbert transformation (to create the imaginary part of the spectral slice) followed by inverse Fourier transformation; subsequent zero-filling to 16 K real points resulted in a digital resolution of 0.1 Hz/pt. After Fourier transformation, the baseline of the signals was carefully adjusted using the "flat" routine in FELIX, and any unwanted signals in the traces were zeroed. The selected 1JCH multiplet was then inverted, shifted by 0.5 × 3J(trial), and added to its parent multiplet, which was shifted in the opposite direction by 0.5 × 3J(trial). The resulting convoluted multiplet was fitted to the experimental 3JCH multiplet.
The calculations were performed by an in-house FORTRAN routine on a DEC-3100 workstation. The best fit is achieved when

$$\cos\theta = \frac{\mathbf{E}\cdot\mathbf{T}}{|\mathbf{E}|\,|\mathbf{T}|}$$

reaches its maximum [35], where E is the vector corresponding to the experimental 3JCH multiplet and T is the trial vector reconstructed from the reference multiplet by the procedure outlined above.
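The shift-and-add fit described above can be sketched in a few lines of Python. This is a minimal illustration, not the in-house FORTRAN routine: the synthetic absorption-mode Lorentzian doublet, the 3.8 Hz homonuclear splitting, the 1 Hz linewidth, the scan range, and all function names are assumptions made for the example. Only the 0.1 Hz/pt grid, the antiphase combination shifted by ±0.5 × J(trial), and the cosine criterion are taken from the text.

```python
import numpy as np


def lorentzian(freq, center, width):
    """Absorption-mode Lorentzian of unit height; width is the FWHM in Hz."""
    half = width / 2.0
    return half**2 / ((freq - center) ** 2 + half**2)


def shifted(freq, trace, offset_hz):
    """Move a trace by offset_hz along the frequency axis (linear interpolation)."""
    return np.interp(freq - offset_hz, freq, trace, left=0.0, right=0.0)


def trial_multiplet(freq, reference, j_trial):
    """Reference shifted by +J/2 plus the inverted reference shifted by -J/2."""
    return shifted(freq, reference, +j_trial / 2.0) - shifted(freq, reference, -j_trial / 2.0)


def cosine(e, t):
    """cos(theta) = E.T / (|E||T|) between experimental and trial vectors."""
    return float(np.dot(e, t) / (np.linalg.norm(e) * np.linalg.norm(t)))


def fit_coupling(freq, reference, experimental, j_grid):
    """Return the trial J (Hz) that maximizes the cosine criterion, and that maximum."""
    scores = [cosine(experimental, trial_multiplet(freq, reference, j)) for j in j_grid]
    best = int(np.argmax(scores))
    return float(j_grid[best]), scores[best]


# Synthetic data (assumed): reference doublet split by 3.8 Hz on a 0.1 Hz/pt grid.
freq = np.arange(-40.0, 40.0, 0.1)
reference = lorentzian(freq, -1.9, 1.0) + lorentzian(freq, 1.9, 1.0)
# "Experimental" long-range multiplet, simulated here with a 6.5 Hz heteronuclear coupling.
experimental = trial_multiplet(freq, reference, 6.5)

j_grid = np.arange(0.5, 12.01, 0.1)
best_j, best_cos = fit_coupling(freq, reference, experimental, j_grid)
print(f"best trial J = {best_j:.1f} Hz, cos(theta) = {best_cos:.4f}")
```

Because the trial multiplet is built from a reference recorded in the same experiment, the same comparison applies to mixed-phase lineshapes; the Lorentzians here merely keep the example short.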

[Contour plot; horizontal axis F2 (1H), 5.0 to 2.0 ppm. Labeled 1H positions: Neu H3e, Neu H3a, Glcα H1, Gal H1. Labeled 13C traces: Neu C5, Neu C4, Glcα C5, Glc C4, Glcα C1, Glcβ C1, Neu C2, Gal C1.]

Figure 4. 2-D HSMBC spectrum of sialyl-α(2→6)-lactose in D2O, recorded at 600 MHz, pD 7.3, and 25 °C.

The procedure for obtaining the value of a particular long-range coupling constant is illustrated graphically for Glcα 3JH1,C5. Figure 5 shows the doublet taken from the one-bond Glcα 1JH1,C1 trace of the HSMBC spectrum, and the subsequent reconstruction of the 3JH1,C5 multiplet. The 3JH1,C5 coupling constant obtained by this method is 6.5 Hz. This value is accurate to ±0.2 Hz, that is, within two times the digital resolution of the 1-D traces processed as described above. Figure 6 shows the matching patterns of five other nJCH multiplets for SL6, including those for the interglycosidic coupling 3J(Gal H1, Glc C4). The values obtained by our method are marked in bold; the italicized values are those obtained by a quantitative HMBC experiment [34]. For this selected subset of nJCH values for sialyllactose, the HSMBC method provides generally more accurate values for the desired couplings in a shorter period of time. The need to record a separate reference spectrum is circumvented, while the accuracy of the nJCH values (±0.2 Hz) is significantly improved over the quantitative HMBC experiment (±0.4 to ±1.1 Hz for the literature values shown in Fig. 6). The HSMBC method also compares favorably with 1-D selective HSQC measurements of the same couplings in SL6 [36].

[Traces (a) to (d); the shift 3J(trial) and the homonuclear splitting 3JH1,H2 are marked on the traces.]

Figure 5. The J-fitting procedure as performed for Glcα 3JH1,C5 from the pertinent traces in the 2-D HSMBC spectrum of sialyl-α(2→6)-lactose. (a) Doublet of Glcα H1 taken from the 1JH1,C1 row; (b) inverted doublet 'a' left-shifted by 3JH1,C5(trial); (c) reconstructed multiplet obtained by co-adding 'a' and 'b'; (d) actual multiplet of Glcα H1 taken from the 3JH1,C5 row.

[Panels: Gal H1-Glc C4; Neu H3a-C5; Neu H7-C8; Neu H3a-C4; Neu H3a-C2; Glcα H1-C5. Values legible in the panels: Gal H1-Glc C4, 4.0 (3.9±0.8); Neu H3a-C4, (7.7±1.1); Glcα H1-C5, 6.5 (6.5±0.9); one further literature value, (4.0±0.4).]

Figure 6. Selected set of 2JCH and 3JCH couplings (±0.2 Hz) for sialyl-α(2→6)-lactose, resulting from the fit of experimental (bottom traces) and reconstructed (top traces) 2-D HSMBC cross-peak multiplets. The resulting coupling constant values (in Hz) are marked in bold. For comparison, the values determined by Zhu et al. [34] from a quantitative HMBC experiment on SL6 are included (in italics).

4. DYNAMICS OF MEMBRANE-BOUND CARBOHYDRATES

The molecular behavior of an oligosaccharide chain isolated from a glycoprotein or glycolipid hardly mimics the behavior of the chain when bound to a protein or a lipid. For example, the molecular motion of a freely tumbling oligosaccharide will be substantially different from that of the same oligosaccharide anchored to a protein backbone and/or in a lipid bilayer. It is likely that both molecular conformation and internal dynamics are affected upon covalent binding of an oligosaccharide to a protein or lipid. Thus, appropriate caution must be exercised when extrapolating from NMR-derived conformation and dynamics results on isolated oligosaccharides to similar structures linked to proteins or lipids. NMR efforts are under way to examine the conformation and dynamics of oligosaccharides as covalent parts of larger glycoconjugates (both intact glycoproteins and glycolipids dispersed in lipid bilayers). Being amphipathic, isolated glycolipids aggregate in aqueous solution to form micelles. These systems, despite their long overall rotational correlation times, are amenable to high-resolution NMR study and are the most revealing probes available so far for the study of internal motions in lipid-linked oligosaccharides. The conformation and dynamics of the carbohydrate headgroups of a number of glycolipids inserted in deuterated dodecylphosphocholine (DPC) micelles in D2O and/or H2O have been studied [37-40]. As an example, we present here some of our work on ganglioside GD1a, Neu5Ac'α(2→3)Gal'pβ(1→3)GalpNAcβ(1→4)[Neu5Acα(2→3)]Galpβ(1→4)Glcpβ(1→1)Cer, embedded in a perdeuterated DPC micelle in aqueous solution. Without the addition of an organic solvent, 1H NMR studies of GD1a in a micellar membrane in H2O near 0 °C revealed relatively narrow OH signals, allowing their incorporation in extensive NOE studies.

It appeared [40] that the observed interresidue 1H/1H NOE contacts (involving both CH and OH protons) could not be explained by a single conformation of the GD1a hexasaccharide, such as the quasi-rigid structures proposed in keeping with 13C T1 data for the GD1a free oligosaccharide [41] in D2O and the glycolipid [42] in DMSO. At least two families of carbohydrate conformers [differing in the relative orientation of the external Neu5Ac'α(2→3)Gal'β(1→3) disaccharide arm; see Fig. 7(a)] must exist in this model membrane system to account for the NOE constraints. The internal dynamics of the oligosaccharide were probed by 1H detection of the 13C relaxation parameters (T1, T1ρ, and heteronuclear NOEs) of the GD1a hexasaccharide in the micellar system at 13C natural abundance. The results of these measurements are listed in Table 3. The 13C relaxation rates R1 and R2 were converted into the spectral densities J(ω) at the pertinent frequencies [40,45-47]. Aided by an independent measurement of the global rotation correlation time τo of the micellar system (from 31P NMR relaxation measurements on the DPC 31P signal), the spectral density data were interpreted at the molecular level with the Lipari-Szabo model-free approach [43,44]. As illustrated schematically in Figure 7(b), the model-free approach distinguishes the types of molecular motion of an oligosaccharide by their different time scales. The total motion of the molecule is separated into the tumbling of the entire aggregate (with rotation correlation time τo) modulated by more rapid internal (segmental) motion with a characteristic correlation time (τi) and an amplitude expressed by an order parameter (S2). The latter is a measure of internal reorientation (S2 = 1 in the absence of reorientation; S2 = 0 in the presence of completely isotropic reorientation). The separation of the motions into distinct regimes is crucial to the success of this method. Thus, the internal dynamics (segmental motions) of carbohydrate molecules in solution can be assessed by NMR spectroscopy if the rearrangements occur much faster (or much slower) than overall molecular tumbling. The internal reorientations in GD1a on the micellar surface were found to occur with a rotation correlation time of 0.35 ns. The slow tumbling of the micelle (Mr ≈ 8 kDa) provided the distinct time scale (τo = 2.8 ns), facilitating the discovery of the faster internal motion of a couple of flexible glycosidic linkages in the carbohydrate headgroup [40]. The order parameter S2 was measured for reorientation of the C-H vectors of each glycosyl residue. As computed from 13C T1ρ, S2 was found to be ~1.0 for internal residues and ~0.5 for the terminal (Neu5Ac') and penultimate (Gal') residues most distal to the lipid (Table 3).

Figure 7. (a) Two conformers of GD1a on the DPC micellar membrane, differing in the orientation of the outer disaccharide moiety [40]. (b) Diagram of GD1a in a DPC micelle illustrating the independent global and local motion model, the characteristics of which can be described by the Lipari-Szabo equations [43,44].

Table 3. Average 13C relaxation rates (a) and dynamics parameters (b) for the headgroup hexasaccharide in GD1a/DPC micelles in aqueous solution

| Residue | R1 [s-1] | R2 [s-1] | NOE | τo [ns] | τi [ns] | S2 |
|---|---|---|---|---|---|---|
| GalNAc, Neu5Ac, Gal, Glc | 2.2±0.2 | 12.5±2.0 | 1.2±0.2 | 2.8±0.1 | 0 | 1 |
| Neu5Ac', Gal' (c) | 2.3±0.2 | 7.1±1.0 | 1.6±0.2 | 2.8±0.1 | 0.34±0.10 | 0.55±0.08 |

a The 13C relaxation rates are defined as R1 = (T1)−1 and R2 = (T1ρ)−1. b Obtained by nonlinear least-squares optimization using the equations in Fig. 7(b). The relaxation data for the methylene carbons were not included in the analysis. c Data for Gal' C4 were not included.
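The model-free analysis can be made concrete with the standard Lipari-Szabo spectral density [43,44], J(ω) = (2/5)[S2 τo/(1 + ω²τo²) + (1 − S2) τ/(1 + ω²τ²)] with 1/τ = 1/τo + 1/τi. The sketch below evaluates it for the two rows of Table 3. It assumes that this textbook form corresponds to the equations in Fig. 7(b), and it assumes a 13C frequency of 150 MHz purely for illustration; the function and variable names are hypothetical.

```python
import math


def spectral_density(omega, s2, tau_o, tau_i):
    """Lipari-Szabo model-free J(omega); omega in rad/s, correlation times in s."""
    rigid = s2 * tau_o / (1.0 + (omega * tau_o) ** 2)
    # With S2 = 1 (or vanishing tau_i) the internal-motion term contributes nothing.
    if s2 >= 1.0 or tau_i <= 0.0:
        return 0.4 * rigid
    tau = 1.0 / (1.0 / tau_o + 1.0 / tau_i)  # effective correlation time
    internal = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * (rigid + internal)


OMEGA_C = 2.0 * math.pi * 150e6  # assumed 13C Larmor frequency in rad/s

# (S2, tau_o, tau_i) taken from Table 3, converted from ns to s.
ROWS = {
    "GalNAc, Neu5Ac, Gal, Glc": (1.0, 2.8e-9, 0.0),
    "Neu5Ac', Gal'": (0.55, 2.8e-9, 0.34e-9),
}

for name, (s2, tau_o, tau_i) in ROWS.items():
    j0 = spectral_density(0.0, s2, tau_o, tau_i)
    jc = spectral_density(OMEGA_C, s2, tau_o, tau_i)
    print(f"{name}: J(0) = {j0:.3e} s, J(wC) = {jc:.3e} s")
```

In this sketch J(0) comes out smaller for the outer residues than for the inner ones, which is qualitatively in line with their lower R2 in Table 3.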

Application of the methods discussed to larger complex oligosaccharides as well as to those tethered to proteins (see, for example, [48]) or lipid membranes will continue to provide valuable evidence on internal motions on a time scale of the reciprocal of the NMR frequency range used in the experiment (0.1 to 1 ns). Motions on a slower time scale, which would not be detectable by ordinary 13C T1 experiments, can be studied with rotating-frame relaxation data (T1ρ). With control of the spin-locking field strength, the latter parameter can reveal motions on the µs time scale.

5. CONCLUSIONS Carbohydrate NMR spectroscopy has reached a level of sophistication at which the number of detectable constraints (including hydrogen bonds and glycosidic bond torsion angles) is sometimes large enough to be incompatible with a single rigid conformer, prompting the consideration of ensemble average models. Internal motion in oligosaccharides can be detected and quantified by NMR spectroscopy if it occurs relatively fast compared to the overall motion of the molecule. Tethering the oligosaccharide to a phospholipid micelle in aqueous solution satisfies this criterion. Knowledge of three-dimensional conformations and the dynamics/flexibility of complex carbohydrates in their natural environment will broaden our insight into their functioning as mediators of numerous biological cell-cell and cell-molecule interactions. It is anticipated that NMR spectroscopy will continue to make invaluable contributions towards this goal.

6. ACKNOWLEDGMENTS The authors thank Drs. Leszek Poppe and John Glushka for helpful discussions and Rosemary Nuri for editing the manuscript. This research is supported by National Institutes of Health (NIH) grant P41-RR-05351 and Department of Energy (DOE) Plant Science Center grant DE-FG09-93ER20097.

7. ABBREVIATIONS

The abbreviations used are: Cer, ceramide; COSY, homonuclear scalar correlation spectroscopy; DPC, dodecylphosphocholine; f, furanose; HMBC, heteronuclear multiple-bond connectivity; HMQC, heteronuclear multiple quantum coherence; HSMBC, heteronuclear single- and multiple-bond connectivity; HSQC, heteronuclear single quantum coherence; MD, molecular dynamics; n-D, n-dimensional (where n = 1, 2, or 3); NMR, nuclear magnetic resonance; NOE, nuclear Overhauser effect; NOESY, NOE correlated spectroscopy; p, pyranose; ROESY, rotating-frame Overhauser and exchange spectroscopy; SL6, sialyl-α(2→6)-lactose; TOCSY, total scalar correlation spectroscopy. The symbols used are defined as follows: J, scalar coupling constant; J(ω), spectral density at frequency ω; r, distance; S2, generalized order parameter; σ, cross-relaxation rate; T1, longitudinal relaxation time; T2, transverse relaxation time; T1ρ, longitudinal relaxation time in the rotating frame; τo, global rotation correlation time; τi, internal (local) rotation correlation time; φ, torsion angle H1-C1-O1-C'x; ψ, torsion angle C1-O1-C'x-H'x; ω, torsion angle H5-C5-C6-O6.

8. REFERENCES

1. H. van Halbeek, in "Encyclopedia of Nuclear Magnetic Resonance," D.M. Grant and R.K. Harris (eds.), Wiley, Chichester, 1995, in press.
2. J. Dabrowski, in "Two-Dimensional NMR Spectroscopy. Applications for Chemists and Biochemists (Second Edition)," W.R. Croasmun and R.M.K. Carlson (eds.), VCH, New York, 1994, p. 741.
3. S.W. Homans, Prog. NMR Spectrosc., 22 (1990) 55.
4. S.W. Homans, in "Molecular Glycobiology," M. Fukuda and O. Hindsgaul (eds.), Oxford University Press, Oxford, 1994, p. 230.
5. S. Pérez, A. Imberty, and J.P. Carver, Adv. Comput. Biol., 1 (1994) 147.
6. H. van Halbeek, Curr. Opin. Struct. Biol., 4 (1994) 697.
7. M.C.R. Symons, J.A. Benbow, and J.M. Harvey, Carbohydr. Res., 83 (1980) 9.
8. J. Dabrowski and L. Poppe, J. Am. Chem. Soc., 111 (1989) 1510.
9. L. Poppe and H. van Halbeek, J. Am. Chem. Soc., 113 (1991) 363.
10. L. Poppe and H. van Halbeek, Nature Struct. Biol., 1 (1994) 215.
11. B. Adams and L.E. Lerner, Magn. Reson. Chem., 32 (1994) 225.
12. J.C. Christofides and D.B. Davies, J. Am. Chem. Soc., 105 (1983) 5099.
13. V. Sklenář and A. Bax, J. Magn. Reson., 74 (1987) 469.
14. V. Sklenář, R. Tschudin, and A. Bax, J. Magn. Reson., 75 (1987) 352.
15. V. Sklenář and A. Bax, J. Magn. Reson., 75 (1987) 378.
16. M. Guéron, P. Plateau, and M. Decorps, Prog. NMR Spectrosc., 23 (1991) 135.
17. B. Mulloy, T.A. Frenkiel, and D.B. Davies, Carbohydr. Res., 184 (1988) 39.
18. V.H. Tran and J.W. Brady, Biopolymers, 29 (1990) 961.
19. V.H. Tran and J.W. Brady, Biopolymers, 29 (1990) 977.
20. C. Hervé du Penhoat, A. Imberty, N. Roques, V. Michon, J. Mentech, G. Descotes, and S. Pérez, J. Am. Chem. Soc., 113 (1991) 3720.
21. J.M. Duker and A.S. Serianni, Carbohydr. Res., 249 (1993) 281.
22. G.M. Brown and H.A. Levy, Acta Crystallogr., Sect. B, 29 (1973) 790.
23. D.C. McCain and J.L. Markley, Carbohydr. Res., 152 (1986) 73.
24. D.C. McCain and J.L. Markley, J. Am. Chem. Soc., 108 (1986) 4259.
25. D.C. McCain and J.L. Markley, J. Magn. Reson., 73 (1987) 244.
26. L. Poppe and H. van Halbeek, J. Am. Chem. Soc., 114 (1992) 1092.
27. D.B. Davies and J.C. Christofides, Carbohydr. Res., 163 (1987) 269.
28. B. Adams and L. Lerner, J. Am. Chem. Soc., 114 (1992) 4827.
29. A. Bax and M.F. Summers, J. Am. Chem. Soc., 108 (1986) 2093.
30. F.J. Cassels and H. van Halbeek, Methods Enzymol., 253 (1995) 69.
31. A. Bax and D. Marion, J. Magn. Reson., 78 (1988) 186.
32. J.J. Titman, D. Neuhaus, and J. Keeler, J. Magn. Reson., 85 (1989) 111.
33. G. Zhu and A. Bax, J. Magn. Reson. Ser. A, 104 (1993) 353.
34. G. Zhu, A. Renwick, and A. Bax, J. Magn. Reson. Ser. A, 110 (1994) 257.
35. P. Huber, C. Zwahlen, S.J.F. Vincent, and G. Bodenhausen, J. Magn. Reson. Ser. A, 103 (1993) 118.
36. L. Poppe, R. Stuike-Prill, B. Meyer, and H. van Halbeek, J. Biomol. NMR, 2 (1992) 109.
37. L. Poppe, C.-W. von der Lieth, and J. Dabrowski, J. Am. Chem. Soc., 112 (1990) 7762.
38. D. Acquotti, L. Poppe, J. Dabrowski, C.-W. von der Lieth, S. Sonnino, and G. Tettamanti, J. Am. Chem. Soc., 112 (1990) 7772.
39. H.-C. Siebert, G. Reuter, R. Schauer, C.-W. von der Lieth, and J. Dabrowski, Biochemistry, 31 (1992) 6962.
40. L. Poppe, H. van Halbeek, D. Acquotti, and S. Sonnino, Biophys. J., 66 (1994) 1642.
41. S. Sabesan, J.O. Duus, T. Fukunaga, K. Bock, and S. Ludvigsen, J. Am. Chem. Soc., 113 (1991) 3236.
42. J.N. Scarsdale, J.H. Prestegard, and R.K. Yu, Biochemistry, 29 (1990) 9843.
43. G. Lipari and A. Szabo, J. Am. Chem. Soc., 104 (1982) 4546.
44. G. Lipari and A. Szabo, J. Am. Chem. Soc., 104 (1982) 4559.
45. S. Bagley, H. Kovacs, J. Kowalewski, and G. Widmalm, Magn. Reson. Chem., 30 (1992) 733.
46. P.J. Hajduk, D.A. Horita, and L.E. Lerner, J. Am. Chem. Soc., 115 (1993) 9196.
47. J. Kowalewski and G. Widmalm, J. Phys. Chem., 98 (1994) 28.
48. T.J. Rutherford, J. Partridge, C.T. Weller, and S.W. Homans, Biochemistry, 32 (1993) 12715.

S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.


Linkage analysis by mass spectrometry of chemically modified oligosaccharides from glycosphingolipids and glycoproteins

Bo Nilsson

National Defence Research Establishment, Department of NBC Defence, S-901 82 Umeå, Sweden

Abstract

Mass spectrometry of oligosaccharides gives structural information on monosaccharide sequence and, depending on the ionisation technique used, molecular weight. In order to extend the structural information to include glycosidic linkage positions, specific chemical modifications were introduced. Trifluoroacetolysis was used to specifically degrade high-mannose structures from the reducing terminal. Periodate oxidation followed by reduction, peracetylation or permethylation, and analysis of the products by mass spectrometry gave, on the basis of the sequence ions, the linkage positions between monosaccharide residues in oligosaccharides obtained from glycosphingolipids and glycoproteins.

1. INTRODUCTION

In order to understand how biological compounds interact, i.e. the structure-function relationship, structural analysis is essential. Biological phenomena work through specific interactions of chemical structures. Recent developments of cell systems for manufacturing biologically important proteins for use in therapy have focused on post-translational modifications such as glycosylation. Structural analysis of the carbohydrate chains has therefore become important in order to avoid cell systems that produce glycoproteins whose glycosylation can cause undesired immunological reactions and an unfavourable serum clearance rate in humans. Another type of biologically important glycoconjugate is the glycosphingolipids, which are cell-surface associated. Since the carbohydrate portion is exposed to the exterior, it should be accessible to interactions with other molecules. Neutral and acidic glycosphingolipids have been implicated in several roles, for example as receptors for bacteria, viruses and bacterial toxins and as tumour-associated antigens. Analytical methods for proteins are well developed, but carbohydrate-containing polymers have long been difficult to analyse. One of the reasons is that no other biological compounds can form as many isomeric species as carbohydrates. The great number of compounds that can be formed also indicates that the content of biological information is enormous. The biological functions of glycoconjugates are, however, still not completely understood. Carbohydrates have in the past, in contrast to proteins and nucleic acids, been regarded as less interesting from a biological point of view, with the result that less effort has been invested in analytical techniques for this type of compound. When spectroscopic methods such as mass spectrometry (MS) and NMR were introduced, dramatic progress in the structural analysis of glycoconjugates followed. A comprehensive strategy including MS and NMR for glycoprotein analysis has previously been published [1]. This contribution will focus on recent developments in mass spectrometry of glycosphingolipid and glycoprotein oligosaccharides.

2. MASS SPECTROMETRY

Structural analysis of glycoconjugates usually involves various types of mass spectrometric techniques. Traditionally, electron ionisation mass spectrometry (EI-MS) of derivatised monosaccharides and oligosaccharides has been used. Compounds have to be volatile in order to be analysed by this ionisation technique. When fast atom bombardment mass spectrometry (FAB-MS) was introduced, underivatised as well as derivatised high molecular weight glycoconjugates could be analysed. Techniques introduced later, such as electrospray mass spectrometry (ES-MS) and time-of-flight mass spectrometry (TOF-MS), have further extended the mass range, allowing molecular weight determination of intact glycoproteins. These recently developed techniques have focused mainly on molecular weight determination rather than structural analysis. Carbohydrate-containing polymers differ from other biological substances in that they are branched and have multiple binding positions, and the monosaccharide residues usually cannot be discriminated by their masses. Structural analysis of glycoconjugates by mass spectrometry is therefore a special challenge. The structural information that can be obtained includes molecular weight and monosaccharide sequence in terms of hexose, deoxyhexose, acetamido-deoxyhexose and sialic acid. In order to determine a structure completely, several other features have to be assigned, e.g. linkage positions between monosaccharide residues. Mass spectra are often recorded after derivatisation (peracetylation or permethylation), which in FAB-MS enhances the sensitivity. The sequence ions of derivatised compounds are formed by ionisation of the ring oxygen followed by a homolytic cleavage of glycosidic bonds, yielding cyclic oxonium ions. The presence of branch points can be determined by the absence of certain sequence ions. Only in exceptional cases can binding positions between monosaccharides be determined, e.g. substitution positions of permethylated 2-acetamido-2-deoxyhexosyl (HexNAc) residues. After a primary cleavage of the HexNAc glycosidic bond in EI-MS or FAB-MS, the relative intensities of the secondary fragments formed by eliminations reveal the substitution pattern (Figure 1). The elimination takes place preferentially from the 3-position of the HexNAc residue [2]. Substitution positions of monosaccharide residues can be determined by gas chromatography-mass spectrometry (GC-MS) of partially methylated alditol acetates [3], but the sequential order of linkages is in general difficult to determine. NMR analysis or enzymatic methods can often provide this information if sufficient material is available. In order to extend the structural information to include linkage analysis in oligosaccharides by mass spectrometry, specific chemical modifications were introduced.

[Spectra of four structures: Hex1-4HexNAc1-; Hex1-3HexNAc1-; Hex1-3(deoxyHex1-4)HexNAc1-; Hex1-4(deoxyHex1-3)HexNAc1-. Vertical axes: relative intensity (%); horizontal axes: m/z. Ions labeled: m/z 196, 228, 402, 432, 464 and 638.]

Figure 1. Relative intensities in FAB-MS of primary and secondary ions derived from 3- and 4-mono- and 3,4-disubstituted permethylated HexNAc residues.
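The m/z values labeled in Figure 1 can be rationalised with simple nominal-mass arithmetic, sketched below in Python. The ion values 464, 432, 228, 638 and 402 are those shown in the figure; the building-block masses (219 for a terminal permethylated hexosyl ion, 174 for the net gain from a deoxyhexosyl branch) and the neutral losses (32 for methanol, 236 and 206 for permethylated hexose and deoxyhexose) are assumptions chosen because they reproduce those values, and the function names are hypothetical.

```python
# Nominal masses in mass units (m.u.).
TERMINAL_HEX = 219      # permethylated non-reducing hexosyl oxonium ion (assumed)
INTERNAL_HEXNAC = 245   # monosubstituted permethylated HexNAc increment
DEOXYHEX_BRANCH = 174   # net gain when an O-methyl is replaced by a deoxyHex branch (assumed)

# Neutral molecule lost when the group on C-3 of the HexNAc is eliminated (assumed masses).
NEUTRAL_LOSS = {"OMe": 32, "Hex": 236, "deoxyHex": 206}


def primary_ion(branched_with_deoxyhex=False):
    """m/z of the Hex-HexNAc oxonium ion, optionally carrying a deoxyHex branch."""
    mz = TERMINAL_HEX + INTERNAL_HEXNAC
    if branched_with_deoxyhex:
        mz += DEOXYHEX_BRANCH
    return mz


def secondary_ion(primary_mz, group_on_c3):
    """m/z after elimination of whatever is attached to O-3 of the HexNAc."""
    return primary_mz - NEUTRAL_LOSS[group_on_c3]


# structure -> (has a deoxyHex branch, group attached to O-3 of HexNAc)
CASES = {
    "Hex1-4HexNAc1-": (False, "OMe"),
    "Hex1-3HexNAc1-": (False, "Hex"),
    "Hex1-3(deoxyHex1-4)HexNAc1-": (True, "Hex"),
    "Hex1-4(deoxyHex1-3)HexNAc1-": (True, "deoxyHex"),
}

predicted = {}
for name, (branched, on_c3) in CASES.items():
    primary = primary_ion(branched)
    predicted[name] = (primary, secondary_ion(primary, on_c3))
    print(name, predicted[name])
```

Under these assumptions a 4-substituted HexNAc loses only methanol from position 3, whereas a 3-substituted HexNAc loses its whole substituent, which is why the two isomers give different secondary ions from the same primary ion.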

3. CHEMICAL MODIFICATIONS

Periodate oxidation was used to specifically modify monosaccharide residues in oligosaccharides [4]. Bonds connecting carbons carrying hydroxyl groups in vicinal positions are oxidatively cleaved and aldehydes are formed, which are subsequently reduced with NaBD4 (Figure 2). The product is then analysed after peracetylation or permethylation by GC-MS or FAB-MS, depending on the size of the oligosaccharide.

[Reaction scheme: HCOH-HCOH (vicinal diol) → periodate oxidation → two HC=O (aldehyde) groups → NaBD4 reduction → two HDCOH groups → acetylation or methylation → HDCOAc or HDCOMe groups.]

Figure 2. Reaction steps used for preparation of derivatives for analysis by mass spectrometry.

The products obtained depend on the linkage positions between residues. The mass values of ions formed by cleavage of glycosidic bonds for terminal and internal residues after permethylation and peracetylation are shown in Table 1.

Table 1. Mass values for terminal and internal residues after periodate oxidation, NaBD4-reduction and derivatisation

| Residue | Permethylation | Peracetylation |
|---|---|---|
| Non-reducing | | |
| Hex1- | 179 | 263 |
| 6-deoxyHex1- | 149 | 205 |
| HexNAc1- | 264 | 334 |
| Neu5Ac2- | 289 | a |
| Internal | | |
| -2Hex1- | 208 | 292 |
| -3Hex1- | 204 | 288 |
| -4Hex1- | 208 | 292 |
| -6Hex1- | 164 | 220 |
| -3HexNAc1- | 245 | 287 |
| -4HexNAc1- | 245 | 287 |
| -6HexNAc1- | 249 | 291 |
| -8Neu5Ac2- | 361 | a |
| Reducing | | |
| -2Hex | 105 | 161 |
| -3Hex | 192 | 304 |
| -4Hex | 148 | 232 |
| -6Hex | 104 | 160 |
| Reduced | | |
| -3HexNAc-ol | 189 | 231 |
| -4HexNAc-ol | 233 | 303 |
| -6HexNAc-ol | 60 | 88 |

a No mass due to lactone formation
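As a hedged illustration of how Table 1 is used, the Python sketch below adds up the permethylation values from the non-reducing end to predict primary sequence ions. It assumes that the sequence ions are simple cumulative sums of the tabulated values, which is consistent with the ions quoted for the gangliosides in Section 4.2; the dictionary and function names are hypothetical.

```python
from itertools import accumulate

# Permethylation column of Table 1 (non-reducing and internal residues only).
PERMETHYLATED = {
    "Hex1-": 179, "6-deoxyHex1-": 149, "HexNAc1-": 264, "Neu5Ac2-": 289,
    "-2Hex1-": 208, "-3Hex1-": 204, "-4Hex1-": 208, "-6Hex1-": 164,
    "-3HexNAc1-": 245, "-4HexNAc1-": 245, "-6HexNAc1-": 249, "-8Neu5Ac2-": 361,
}


def sequence_ions(residues):
    """Cumulative m/z of primary sequence ions, listed from the non-reducing end."""
    return list(accumulate(PERMETHYLATED[r] for r in residues))


print(sequence_ions(["Neu5Ac2-", "-3Hex1-"]))                # [289, 493]
print(sequence_ions(["Neu5Ac2-", "-8Neu5Ac2-"]))             # [289, 650]
print(sequence_ions(["Neu5Ac2-", "-3Hex1-", "-3HexNAc1-"]))  # [289, 493, 738]
```

The three example chains correspond to the Neu5Ac2-3Gal1-, Neu5Ac2-8Neu5Ac2- and Neu5Ac2-3Gal1-3GalNAc1- sequences discussed for GM3 and GT1b.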

Terminal residues are always oxidised, and internal hexosyl (Hex) residues are oxidised if 2-, 4- or 6-substituted, whereas 3-substituted residues are not. For permethylated compounds, 2- and 4-substituted Hex residues both give a mass increment of 208 mass units (m.u.), but the structures are different. A secondary fragment, formed by elimination of methanol, is seen for the 2-substituted but not for the 4-substituted Hex residue [5]. In peracetylated samples these two products give in addition an ion formed by cleavage of the bond between carbon 5 and the former ring oxygen, which for the 2-substituted residue has a constant value of m/z 160, whereas for the 4-substituted residue the m/z value depends on the mass of the substituent [6]. Both 3- and 4-substituted HexNAc residues are resistant to periodate oxidation. As in unoxidised permethylated compounds, these residues are differentiated by the secondary fragments formed by eliminations, as will be illustrated by examples below. In order to determine the substitution position of a reducing 2-acetamido-2-deoxyhexose, the periodate oxidation has to be carried out on the reduced compound. Substitution positions of branched residues can sometimes be determined (Table 2). A 3,6-disubstituted Hex-ol gives two products after periodate oxidation and NaBD4-reduction, ethylene glycol and glycerol. O-Linked glycans often contain a 3,6-disubstituted HexNAc-ol (GalNAc-ol), which is cleaved by periodate to give two products, ethylene glycol and N-acetyl tetrosaminitol. In N-linked structures the HexNAc-ol (GlcNAc-ol) is usually 4,6-disubstituted and is not cleaved by periodate. Another chemical modification, trifluoroacetolysis, is carried out in a mixture of trifluoroacetic acid (TFA) and trifluoroacetic anhydride (TFAA). This mixture has a powerful acylating property, resulting in rapid trifluoroacetylation of hydroxyl groups. Glycosidic bonds are stabilised by the inductive effect exerted by the O-trifluoroacetyl groups [7].

Table 2. Products obtained from some common reduced branched residues after periodate oxidation and NaBD4-reduction

| Branched residue | Products |
|---|---|
| 3,6-disubstituted Hex-ol | ethylene glycol and glycerol |
| 3,6-disubstituted HexNAc-ol | ethylene glycol and N-acetyl tetrosaminitol |
| 4,6-disubstituted HexNAc-ol | not cleaved |
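The oxidation rules stated above follow from one criterion: periodate cleaves a carbon-carbon bond only when both carbons carry a free hydroxyl group. The Python sketch below encodes that criterion. It is a simplified model under stated assumptions: pyranose rings for Hex and HexNAc, hydroxyl positions as listed in the dictionary, an N-acetyl group treated as blocking its carbon, and no account taken of reaction conditions; the names are hypothetical.

```python
# Carbons that carry a hydroxyl group when the residue is unsubstituted (assumed model).
HYDROXYL_CARBONS = {
    "Hex": {2, 3, 4, 6},            # pyranoside: C-1 glycosidic, C-5 in the ring
    "HexNAc": {3, 4, 6},            # C-2 carries NHAc
    "Neu5Ac": {4, 7, 8, 9},         # C-5 carries NHAc
    "Hex-ol": {1, 2, 3, 4, 5, 6},   # reduced (open-chain) residue
    "HexNAc-ol": {1, 3, 4, 5, 6},
}


def cleaved_bonds(residue, substituted=()):
    """C-C bonds (as carbon pairs) flanked by two free hydroxyls, i.e. periodate-labile."""
    free = HYDROXYL_CARBONS[residue] - set(substituted)
    return [(c, c + 1) for c in sorted(free) if c + 1 in free]


def is_oxidised(residue, substituted=()):
    return bool(cleaved_bonds(residue, substituted))


for residue, positions in [
    ("Hex", [3]), ("Hex", [4]), ("HexNAc", [4]), ("HexNAc", [6]),
    ("Neu5Ac", [8]), ("Hex-ol", [3, 6]), ("HexNAc-ol", [3, 6]), ("HexNAc-ol", [4, 6]),
]:
    print(residue, positions, cleaved_bonds(residue, positions))
```

For the 3,6-disubstituted Hex-ol the model cleaves C1-C2 and C4-C5, leaving a three-carbon (glycerol) and a two-carbon (ethylene glycol) fragment; for the 3,6-disubstituted HexNAc-ol only C4-C5 is cleaved, and the 4,6-disubstituted HexNAc-ol is untouched, in agreement with Table 2.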

Oligosaccharides, both N- and O-linked, are released from glycoproteins as N-trifluoroacetylated derivatives [8,9]. Under certain conditions glycoprotein oligosaccharides are specifically degraded from the reducing terminal [10]. Trifluoroacetolysis is also useful for liberation of the carbohydrate portion from glycosphingolipids [11]. Periodate oxidation in combination with trifluoroacetolysis has been used in structural analysis of high-mannose glycoprotein oligosaccharides, as will be discussed below.

4. GLYCOSPHINGOLIPIDS

Neutral and acidic glycosphingolipids, isolated from various species and cell types, have been subjected to numerous studies of their biological functions. Owing to their lipophilic nature, these compounds are suitable for separation by thin layer chromatography, which has been used for identification by comparison of migration with reference compounds. Neutral glycosphingolipids and gangliosides can be analysed by FAB-MS as underivatised compounds or after permethylation. Besides the monosaccharide sequence, the composition of the ceramide moiety can be deduced from the spectra [12]. Only in permethylated samples, and in the presence of internal HexNAc residues, can the substitution position of these residues be determined, as previously discussed. In order to determine other glycosidic linkages by FAB-MS, periodate oxidation can be used on intact glycosphingolipids [5]. An alternative approach is to release the oligosaccharide from the ceramide portion and then perform the periodate oxidation.

4.1. Release of oligosaccharides

The carbohydrate portion of glycosphingolipids can be released by chemical or enzymatic methods. Treatment of neutral glycolipids or gangliosides containing an unsaturated sphingosine base with ozone followed by mild base liberates the oligosaccharide [13]. Another chemical method, trifluoroacetolysis, also requires an unsaturated sphingosine base and works best on neutral glycosphingolipids. When it is used on gangliosides, partial loss of sialic acid is seen. From N-acetylhexosamine-containing glycosphingolipids the oligosaccharides are released as N-trifluoroacetyl derivatives, which after permethylation are suitable for analysis by GC-MS [14]. An alternative to chemical methods is treatment with enzymes. Enzymatic methods have the advantage that the carbohydrate moiety and the ceramide can be recovered separately, which is useful in studies of the ceramide residue. A number of ceramidases are commercially available, e.g. from the medicinal leech and from the earthworm Lumbricus terrestris [15, 16].

4.2. Mass spectrometry of glycosphingolipid oligosaccharides

Ozonolysis of the GM3 ganglioside released the oligosaccharide:

Neu5Acα2-3Galβ1-4Glc

The trisaccharide was subjected to periodate oxidation, NaBD4-reduction and permethylation. Analysis of the product by GC-MS gave the spectrum shown in Figure 3. The primary sequence ion of m/z 289 together with a secondary ion of m/z 257, formed from the primary ion by elimination of methanol, are characteristic of a periodate oxidised non-reducing terminal sialic acid. A disaccharide sequence of Neu5Ac2-3Gal1- is deduced from the above ions together with m/z 493. The linkage position between these residues is determined by the mass increment of 204 m.u., representing a periodate resistant hexosyl residue, which therefore must be 3-substituted (Table 1). The alditol-containing ions of m/z 148 and 352 show that the reducing glucose is 4-substituted. These two ions give rise to secondary ions of m/z 116 and m/z 320, respectively, by elimination of methanol. An ion of m/z 89, formed by cleavage within the alditol, shows two O-methylated carbons. EI mass spectra of carbohydrates never give molecular ion species. In this case an [M-59]+ ion of m/z 598 is seen, which corresponds to loss of a methoxycarbonyl radical (·COOMe) from the molecular ion and is typical for EI spectra of sialic acid-containing oligosaccharides.
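The ion relationships quoted for the GM3 oligosaccharide are nominal-mass differences, so the bookkeeping can be written out in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the constants 32 (methanol) and 59 (methoxycarbonyl radical) are the nominal masses of the neutral losses named in the text.

```python
# Nominal masses of the neutral losses named in the text.
METHANOL = 32          # CH3OH, eliminated from a primary ion
METHOXYCARBONYL = 59   # .COOMe radical, lost from the molecular ion in EI-MS


def secondary_ion(primary_mz: int) -> int:
    """m/z of the secondary ion formed by elimination of methanol."""
    return primary_mz - METHANOL


def mass_from_m_minus_59(fragment_mz: int) -> int:
    """Nominal molecular mass implied by an [M-59]+ ion."""
    return fragment_mz + METHOXYCARBONYL


# Primary ions quoted for the GM3 oligosaccharide (labels are informal).
gm3_primary_ions = {
    "oxidised terminal Neu5Ac": 289,
    "alditol": 148,
    "Gal-alditol": 352,
}

gm3_secondary_ions = {
    label: secondary_ion(mz) for label, mz in gm3_primary_ions.items()
}
gm3_nominal_mass = mass_from_m_minus_59(598)

for label, mz in gm3_secondary_ions.items():
    print(f"{label}: {gm3_primary_ions[label]} -> {mz}")
print("nominal mass implied by [M-59]+ 598:", gm3_nominal_mass)
```

The computed secondary ions (257, 116 and 320) are the ones listed in the text; the nominal mass of 657 follows directly from the quoted [M-59]+ ion.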

Figure 3. GC-MS spectrum of the GM3 oligosaccharide after periodate oxidation, NaBD4-reduction and permethylation. [Spectrum not reproduced. Labelled ions: m/z 89, 116, 148, 257, 289, 352, 493 and 598 ([M-59]+).]

Treatment of the GT1b ganglioside with ozone liberated the oligosaccharide with the structure:

Neu5Acα2-3Galβ1-3GalNAcβ1-4[Neu5Acα2-8Neu5Acα2-3]Galβ1-4Glc

A FAB-mass spectrum obtained after the above periodate treatment is shown in Figure 4. A disialyl sequence of Neu5Ac2-8Neu5Ac2- can be deduced from the primary sequence ions of m/z 289 and m/z 650, where the latter ion represents an increase of 361 m.u. (Table 1). The linkage between these residues must therefore be 2-8, since the internal sialic acid residue is not oxidised by periodate. Any other linkage position between these residues would make the internal sialic acid susceptible to periodate and thereby give rise to ions other than the above. Another sialylated sequence, Neu5Ac2-3Gal1-, is determined by the primary sequence ions of m/z 289 and m/z 493 as previously discussed. This monosialylated sequence is linked to the 3-position of the GalNAc, as determined from the primary sequence ion of m/z 738 and a secondary ion of m/z 228. Elimination of the Neu5Ac2-3Gal sequence gives rise to m/z 228, which is specific for a 3-substituted 2-acetamido-2-deoxyhexosyl residue [17]. The substitution pattern of the branched galactosyl residue cannot be inferred from the spectrum. An [M+23]+ ion (molecular ion plus sodium) of m/z 1765 is consistent with a periodate oxidised GT1b oligosaccharide.

After trifluoroacetolysis of globoside and reconstitution to the N-acetyl derivative the globo-N-tetraose oligosaccharide was obtained:

GalNAcβ1-3Galα1-4Galβ1-4Glc
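The assignment of the GT1b sequences rests on the differences between consecutive sequence ions. A minimal sketch of that lookup follows, assuming a small table of increments: 204 and 361 m.u. are quoted in the text, whereas 245 m.u. is simply the difference between the quoted ions m/z 738 and m/z 493 and is labelled as such. The table and function names are hypothetical.

```python
# Increments between consecutive sequence ions (nominal, in m.u.).
# 204 and 361 are quoted in the text; 245 is inferred here from the
# quoted ions m/z 493 and m/z 738.
INCREMENTS = {
    204: "periodate resistant hexosyl residue (3-substituted)",
    245: "HexNAc residue (738 - 493, inferred)",
    361: "periodate resistant internal sialic acid (8-substituted)",
}


def assign_ladder(ions):
    """Return (lower, upper, increment, assignment) for consecutive ions."""
    steps = []
    for lower, upper in zip(ions, ions[1:]):
        delta = upper - lower
        steps.append((lower, upper, delta, INCREMENTS.get(delta, "unassigned")))
    return steps


disialyl_branch = assign_ladder([289, 650])
monosialyl_branch = assign_ladder([289, 493, 738])

for step in disialyl_branch + monosialyl_branch:
    print(step)
```

Any increment that is not in the table is reported as "unassigned" rather than guessed, which mirrors the caution needed when an unexpected ion appears in a spectrum.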

Figure 4. FAB-MS spectrum of the GT1b oligosaccharide after periodate oxidation, NaBD4-reduction and permethylation. [Spectrum not reproduced. Labelled ions: m/z 228, 257, 289, 493, 650, 738 and 1765 ([M+23]+).]

Treatment with periodate as above and analysis by FAB-MS gave the spectrum shown in Figure 5. A periodate oxidised non-reducing terminal GalNAc is determined by the primary sequence ion of m/z 264 and a secondary ion of m/z 232, formed by elimination of methanol. These ions combined with m/z 468 and m/z 676 show a sequence of GalNAc1-3Gal1-4Gal1-. The ion of m/z 676 is an increase of 208 m.u. over m/z 468 and indicates a 2-substituted or 4-substituted Hex residue. Substitution in the 2-position can, however, be excluded since no secondary fragment formed by elimination of methanol is seen [5]. Substitution of the reducing glucose in the 4-position is calculated from the [M+23]+ ion of m/z 863 and the sequence ion of m/z 676.
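The reasoning used for globo-N-tetraose can be summarised as a small decision rule: an increment of 204 m.u. marks a periodate resistant (3-substituted) hexose, and an increment of 208 m.u. marks a 2- or 4-substituted hexose, the two being distinguished by whether a secondary ion 32 m.u. below the primary ion is observed. The sketch below encodes only this rule for monosubstituted chain residues; the function name and the peak sets are illustrative assumptions built from ions quoted in the text.

```python
METHANOL = 32  # nominal mass of the eliminated methanol


def hex_substitution(lower_mz, upper_mz, observed):
    """Classify a chain hexose from two consecutive sequence ions.

    observed is the set of m/z values seen in the spectrum.
    """
    delta = upper_mz - lower_mz
    if delta == 204:
        return "3-substituted"
    if delta == 208:
        # A 2-substituted residue gives a secondary ion by loss of
        # methanol; a 4-substituted residue does not.
        if upper_mz - METHANOL in observed:
            return "2-substituted"
        return "4-substituted"
    return "unassigned"


# Ions quoted for periodate-treated globo-N-tetraose.
globoside_ions = {232, 264, 468, 676, 863}
first_gal = hex_substitution(264, 468, globoside_ions)
second_gal = hex_substitution(468, 676, globoside_ions)

# Ions quoted later in the text for the biantennary structure (Figure 6),
# where the secondary ion m/z 874 accompanies m/z 906.
biantennary_ions = {257, 289, 666, 698, 874, 906}
mannose = hex_substitution(698, 906, biantennary_ions)

print(first_gal, second_gal, mannose)
```

With the quoted ions the rule returns 3-substituted and 4-substituted for the two galactose residues of globo-N-tetraose and 2-substituted for the mannose of the biantennary structure, in agreement with the assignments in the text.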

Figure 5. FAB-MS spectrum of globo-N-tetraose after periodate oxidation, NaBD4-reduction and permethylation. Reprinted from: Carbohydr. Res. 168 (1987) 15. Copyright 1995 Elsevier. [Spectrum not reproduced. Labelled ions: m/z 232, 264, 468, 676 and 863 ([M+23]+).]

5. GLYCOPROTEINS

Production of recombinant proteins and monoclonal antibodies for use in therapy and diagnosis has renewed the importance of glycoprotein analysis. A complicating factor when working on glycoproteins is the heterogeneity among the oligosaccharides. A given glycosylation site often carries several different structures. In order to carry out structural analysis the oligosaccharides have to be released from the polypeptide chain and purified to homogeneity. Separation of released oligosaccharides has traditionally been a problem due to the small differences in structure. New HPLC columns have, however, dramatically improved the separation methods, allowing separation of neutral as well as acidic isomeric structures [18, 19]. Structural analysis has been carried out by chemical, enzymatic and spectroscopic methods. Complete structural determination can in most cases be performed by NMR spectroscopy if sufficient material is available. Mass spectrometry usually requires less material, but complete structural determination cannot be obtained.

5.1. Release of oligosaccharides

There are a number of chemical and enzymatic methods to liberate carbohydrates from glycoproteins. A chemical method frequently used to release N-linked oligosaccharides is hydrazinolysis [20]. After N-acetylation and mild acid treatment the oligosaccharides are recovered as free reducing compounds. With this method O-linked oligosaccharides are partially degraded from the reducing terminal. The most commonly used method to release O-linked glycans is reductive β-elimination [21]. By using harsher conditions this procedure can also be used to liberate N-linked structures, which then have to be re-N-acetylated [22, 23]. When these alkaline degradation methods are used the carbohydrates are recovered as oligosaccharide alditols. Trifluoroacetolysis is another chemical method for release of N- and O-linked glycans [8, 9]. Sialic acid is partially lost, which makes this procedure more suitable for asialoglycoproteins. The fate of a reducing hexosamine depends on the proportions of TFA/TFAA, and under certain conditions this method can be used to specifically modify high-mannose structures, as will be discussed below [24]. All chemical methods suffer to some extent from degradation or modification of the reducing terminal of the released glycans. Enzymatic methods are therefore a viable alternative. Several specific endoglycosidases are commercially available. Intact N-linked oligosaccharides can be obtained by treatment with PNGase F from Flavobacterium meningosepticum. For release of O-linked glycans there is an O-glycanase which releases the disaccharide Galβ1-3GalNAc from Ser/Thr, but not if this disaccharide is substituted. The specificity and sources of endoglycosidases have been reviewed elsewhere [25].
After separation of the released glycans, structural analysis can be carried out.

5.2. Mass spectrometry of N-linked oligosaccharides

Glycoprotein oligosaccharides have been subjected to structural analysis by FAB-MS mostly as permethylated derivatives. In order to extend the structural information beyond sequence and molecular weight, periodate oxidation was used. The procedure includes the same reaction steps as previously mentioned for glycosphingolipid oligosaccharides. Since almost all glycoprotein oligosaccharides, N- or O-linked, contain a reducing terminal HexNAc, the periodate oxidation has to be carried out on the NaBH4-reduced compound in order to determine the substitution position of this residue (Tables 1 and 2). The above procedure was applied to a biantennary oligosaccharide alditol with the structure:

Neu5Acα2-6Galβ1-4GlcNAcβ1-2Manα1-6[Neu5Acα2-6Galβ1-4GlcNAcβ1-2Manα1-3]Manβ1-4GlcNAcβ1-4GlcNAc-ol

The FAB-mass spectrum obtained is shown in Figure 6. A non-reducing terminal periodate oxidised sialic acid is recognised by the primary and secondary fragments of m/z 289 and m/z 257. These sequence ions combined with m/z 698, from which a secondary ion of m/z 666 is formed, determine a sequence of Neu5Ac2-6Gal1-4GlcNAc1-. As previously discussed, elimination from a HexNAc takes place preferentially from the 3-position, in this case elimination of methanol giving m/z 666 [17]. Substitution in the 6-position of the GlcNAc residue can be excluded since this residue would then be oxidised, giving sequence ions other than the above. No sequence ion corresponding to cleavage of the oxidised Gal residue, which would be m/z 453, is seen; instead an ion of m/z 378 is formed by cleavage of the bond between carbon 5 and the former ring oxygen of the periodate degraded Gal residue. Both antennae of this biantennary structure contain 2-6 sialylated sequences. These sequences are linked to the 2-position of a mannose residue, as deduced from the primary and secondary ions of m/z 906 and m/z 874, respectively. As previously discussed, a periodate oxidised 2-substituted hexosyl residue gives a secondary ion by elimination of methanol, in this case m/z 874, which is not seen for a 4-substituted residue (Figure 5). Use of the periodate oxidation allows determination of all glycosidic linkage positions in the linear sequences. Substitution positions of the branched mannose residue cannot, however, be determined from the mass spectrum. Since the periodate oxidation was carried out on the reduced compound, the 4-substitution of the reduced GlcNAc residue is determined from the ion of m/z 233, and combined with m/z 478 the chitobiose structure is established. An [M+23]+ ion of m/z 2520 gives supporting evidence for the structure.

Figure 6. FAB-MS spectrum of a biantennary oligosaccharide alditol after periodate oxidation, NaBD4-reduction and permethylation. Reprinted from: Protein Glycosylation, Cellular, Biotechnological and Analytical Aspects. GBF Monographs 15 (1990) 125. Copyright 1995 VCH. [Spectrum not reproduced. Labelled ions: m/z 233, 257, 289, 378, 478, 666, 698, 874, 906 and 2520 ([M+23]+).]

This method is also useful when screening for sequences in mixtures of structures. The triantennary oligosaccharide fraction from a mouse monoclonal IgA antibody was analysed by FAB-MS after the above periodate treatment [26]. The sequence ions determining linkages in the outer linear sequences are indicated in Figure 7.
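Because every secondary ion used above lies 32 m.u. below its primary ion, a peak list can be screened for such pairs before any sequence is assigned. The following sketch does this for the ions quoted for the biantennary alditol; the function name is hypothetical and the peak list contains only the m/z values mentioned in the text, not a full spectrum.

```python
def methanol_pairs(peaks, loss=32):
    """Return (primary, secondary) pairs separated by loss of methanol."""
    peak_set = set(peaks)
    return sorted((mz, mz - loss) for mz in peak_set if mz - loss in peak_set)


# Ions quoted in the text for the biantennary oligosaccharide alditol.
figure6_ions = [233, 257, 289, 378, 478, 666, 698, 874, 906, 2520]

pairs = methanol_pairs(figure6_ions)
print(pairs)  # [(289, 257), (698, 666), (906, 874)]
```

The three pairs found correspond to the oxidised terminal sialic acid, the 4-substituted GlcNAc and the 2-substituted mannose discussed above. In a real mixture a coincidental 32 m.u. spacing is possible, so a pair found this way is a candidate to be checked against the rest of the spectrum, not an assignment in itself.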

Figure 7. Part of a FAB-MS spectrum of a mixture of triantennary structures after periodate oxidation, NaBD4-reduction and permethylation. Reprinted from: Arch. Biochem. Biophys. 300 (1993) 335. Copyright 1995 Academic Press. [Spectrum not reproduced. Labelled secondary/primary ion pairs: m/z 257/289, 392/424, 596/628, 666/698 and 706/738; further labelled ions include m/z 378, 410 and 493.]

All GlcNAc residues in the antennae of the triantennary structures are 4-substituted, as deduced from the secondary fragments of m/z 392, m/z 596, m/z 666 and m/z 706, which all result from elimination of methanol. Furthermore, the ion of m/z 410 shows that the reducing GlcNAc is substituted in the 6-position by a fucose residue.

High-mannose structures are less favourable to analyse by FAB-MS as permethylated derivatives due to the absence of branches with N-acetylhexosamine residues, which as shown above give abundant sequence ions. Peracetylated derivatives are in this case preferable. Two methods for linkage analysis of high-mannose structures by mass spectrometry have previously been published [6, 24]. The first method is based on periodate oxidation and analysis by FAB-MS and EI-MS of peracetylated and permethylated derivatives. This method allows determination of all glycosidic linkage positions including substitution of branched residues. An application of this method is demonstrated on a Man6GlcNAc compound with the following structure:

Manα1-6[Manα1-3]Manα1-6[Manα1-2Manα1-3]Manβ1-4GlcNAc

Periodate oxidation of the NaBH4-reduced compound followed by NaBD4-reduction, permethylation and analysis by FAB-MS gave the spectrum shown in Figure 8. An intense sequence ion of m/z 179 is seen, representing the non-reducing terminal mannose residues. A secondary sequence ion of m/z 355, formed by elimination of methanol from m/z 387 (which is not seen), determines the Man1-2Man1- sequence [6]. Substitution of the reducing GlcNAc residue in the 4-position is established by the ion of m/z 233. The spectrum also gives information on substitution positions of branched residues. Studies on high-mannose structures have shown that 3,6-disubstituted residues give rise to alditol-containing ions where the positive charge is located on carbon 6 of the disubstituted residue [6]. Two such ions are seen in the spectrum, m/z 809 and m/z 1177. The former ion shows that the Man1-2Man1- sequence is linked to the 3-position of the 3,6-disubstituted Man adjacent to the reducing GlcNAc residue. The latter ion, m/z 1177, shows that the other 3,6-disubstituted residue is substituted in the 3-position by a single Man residue. From the [M+1]+ ion of m/z 1373 it can be calculated that this branched residue also carries a single Man residue in the 6-position. The branch-specific ions of m/z 809 and m/z 1177 are more intense in direct-probe EI-MS.

The second method, which is exemplified below, uses degradation by trifluoroacetolysis. Treatment with TFA/TFAA in proportions 1/1 (v/v) at 100 °C for 48 h specifically degrades compounds with a reducing terminal N-acetylhexosamine, resulting in products quantitatively depleted of the reducing HexNAc residue [10]. For high-mannose structures, protein-linked or as free reducing compounds, this means that the chitobiose sequence is lost and a mannose residue occupies the reducing terminal position. High-mannose compounds frequently contain a 3,6-disubstituted residue linked to the chitobiose. After TFA/TFAA treatment and reduction (NaBH4), periodate oxidation is carried out, followed by a second reduction (NaBD4). The 3,6-disubstituted mannitol is cleaved by periodate to give two products, a glycerol containing two deuteriums and an ethylene glycol with one deuterium (Table 2). The products obtained are analysed after peracetylation by FAB-MS or GC-MS. The procedure described above was applied to the following pentasaccharide:

Manα1-3Manα1-6[Manα1-3]Manβ1-4GlcNAc

The GC-MS spectra obtained are shown in Figure 9.

Figure 8. FAB-MS spectrum of Man6GlcNAc-ol after periodate oxidation, NaBD4-reduction and permethylation. Reprinted from: Methods in Molecular Biology: Glycoprotein Analysis in Biomedicine 14 (1993) 35. Copyright 1995 Humana Press. [Spectrum not reproduced. Labelled ions: m/z 179, 233, 355, 809, 1177 and 1373 ([M+1]+).]

A periodate oxidised, NaBD4-reduced and peracetylated non-reducing terminal hexosyl residue is recognised by the cleavage ions of m/z 160 and m/z 263. The former ion is formed by cleavage of the bond between carbon 5 and the former ring oxygen. Spectrum A shows a glycerol residue from m/z 161, and together with m/z 264, formed by cleavage within the oxidised non-reducing terminal Man residue, it can be concluded that the spectrum represents the mannose linked to the 3-position of the 3,6-disubstituted Man residue in the original pentasaccharide. The other component formed gave spectrum B. An ethylene glycol residue is determined from the ion of m/z 88. The sequence ions of m/z 263 and m/z 551 determine a sequence of Man1-3Man1-. The ethylene glycol-containing sequence ion of m/z 376, combined with m/z 479, formed by cleavage within the oxidised non-reducing terminal, confirms the linkage of the disaccharide sequence. This structure represents the Man1-3Man1- branch, which is linked to the 6-position of the 3,6-disubstituted mannose residue of the original pentasaccharide. Using this method all glycosidic linkage positions, including substitution positions of the branched residue, can be determined.
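The way the two periodate products report on the branch positions can be captured in a small data structure. The sketch below is a hypothetical model, not a simulation of the chemistry: it only encodes the statement that a 3,6-disubstituted mannitol yields a glycerol fragment (two deuteriums) carrying the former 3-branch and an ethylene glycol fragment (one deuterium) carrying the former 6-branch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodateProduct:
    alditol_fragment: str   # what remains of the reducing-terminal mannitol
    deuteriums: int         # introduced by the NaBD4-reduction
    branch: str             # sequence that was attached at this position
    original_position: int  # position on the 3,6-disubstituted residue


def cleave_3_6_disubstituted_mannitol(branch_on_3, branch_on_6):
    """Products expected after periodate oxidation and NaBD4-reduction."""
    return [
        PeriodateProduct("glycerol", 2, branch_on_3, 3),
        PeriodateProduct("ethylene glycol", 1, branch_on_6, 6),
    ]


# Branches of the pentasaccharide after loss of the reducing GlcNAc.
products = cleave_3_6_disubstituted_mannitol(
    branch_on_3="Man1-",
    branch_on_6="Man1-3Man1-",
)

for product in products:
    print(product)
```

Read in reverse, the model is the interpretation used in the text: the component containing glycerol (spectrum A) identifies the 3-branch, and the component containing ethylene glycol (spectrum B) identifies the 6-branch.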

Figure 9. GC-MS spectra of Man4GlcNAc after trifluoroacetolysis, NaBH4-reduction, periodate oxidation, NaBD4-reduction and peracetylation. (A) Branch connected to the 3-position of the 3,6-disubstituted mannose residue. (B) Branch connected to the 6-position of the same disubstituted residue. Reprinted from: Anal. Biochem. 200 (1992) 58. Copyright 1995 Academic Press. [Spectra not reproduced. Labelled ions: (A) m/z 160, 161, 263 and 264; (B) m/z 88, 160, 263, 376, 479 and 551.]

5.3. Mass spectrometry of O-linked oligosaccharides

Oligosaccharides linked to Ser/Thr are released by alkaline borohydride as alditols, which is an advantage when structural analysis is carried out by periodate oxidation and mass spectrometry. To illustrate the use of this method on O-linked structures the following linear pentasaccharide alditol was chosen:

Galβ1-3GlcNAcβ1-3Galβ1-4GlcNAcβ1-6GalNAc-ol

Analysis of the intact compound by FAB-MS as the permethylated derivative gave spectrum A in Figure 10. Primary sequence ions from the non-reducing terminal of m/z 464, m/z 668 and m/z 914 show a linear tetrasaccharide sequence. An alditol-containing ion of m/z 276 shows a monosubstituted GalNAc-ol, and combined with the alditol-containing ions of m/z 521 and m/z 971 the linear sequence is confirmed. The substitution positions of the GlcNAc residues can be determined from the intensities of the secondary fragments. As previously discussed, secondary fragments derived from a HexNAc residue are formed by preferential elimination from the 3-position of this residue [2]. The primary sequence ion of m/z 464 gives a secondary ion of m/z 228, showing a Gal1-3GlcNAc1- sequence. A secondary ion of m/z 882, formed from m/z 914 by elimination of methanol, shows that the GlcNAc residue adjacent to GalNAc-ol is 4-substituted. No other linkage positions can be deduced from the spectrum. The monosaccharide composition is confirmed by an [M+1]+ ion of m/z 1207.

The FAB-mass spectrum of the oxidised compound as the permethylated derivative is shown in spectrum B. The sequence ions from the non-reducing terminal of m/z 424 and m/z 228 show a Gal1-3GlcNAc1- sequence. The secondary fragment of m/z 228 determines the linkage position between the residues, in agreement with data obtained from the intact compound (spectrum A). The sequence ion of m/z 628 means addition of 204 m.u. to m/z 424 and consequently a periodate resistant residue, which in this case is a 3-substituted Gal residue. The primary sequence ion of m/z 873 and a secondary ion of m/z 841, formed from the primary ion by elimination of methanol, show a 4-substituted GlcNAc residue in accordance with the data obtained from the intact compound. From the ion of m/z 873 and an [M+1]+ ion of m/z 951 it can be concluded that the alditol has been shortened to ethylene glycol, which is a product of periodate oxidation of a 6-substituted GalNAc-ol. Supporting evidence for the sequence and linkage positions is furnished by the alditol-containing ions of m/z 305 and m/z 754.

In most O-linked structures the GalNAc-Ser/Thr is 3,6-disubstituted. After treatment with alkaline borohydride the GalNAc-ol in the released oligosaccharide alditol is cleaved by periodate, giving two products (Table 2).
This is exemplified by the following trisaccharide alditol:

Galβ1-3[Neu5Acα2-6]GalNAc-ol

Analysis of the products by GC-MS after permethylation gave the spectra shown in Figure 11. Spectrum A shows a periodate oxidised non-reducing terminal Gal, determined by the cleavage ions of m/z 179 and m/z 104. Ions from the alditol part of m/z 189 and m/z 130 show an N-acetyl tetrosaminitol, which is a product of cleavage of the GalNAc-ol by periodate [27]. The spectrum represents the structural element Gal1-3GalNAc-ol of the original trisaccharide alditol. The other part of the compound gave spectrum B. As previously discussed, the oxidised sialic acid is recognised by the primary and secondary ions of m/z 289 and m/z 257, respectively. An alditol-containing ion of m/z 60 shows an ethylene glycol residue. The base peak of m/z 306 corresponds to loss of a methoxycarbonyl radical from the molecular ion. These ions show that the GalNAc-ol is substituted in the 6-position by a sialic acid residue, establishing the structural element Neu5Ac2-6GalNAc-ol of the intact compound.
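For the linear pentasaccharide alditol discussed above, the statement that the alditol has been shortened rests on the part of the [M+1]+ ion that is not accounted for by the largest non-reducing-terminal sequence ion. The sketch below simply computes that difference before and after oxidation from the m/z values quoted in the text; the function name is hypothetical and no structural interpretation beyond the one given in the text is implied.

```python
def reducing_end_remainder(pseudo_molecular_mz, largest_sequence_mz):
    """Part of the pseudo-molecular ion not covered by the sequence ion."""
    return pseudo_molecular_mz - largest_sequence_mz


# Intact permethylated compound: [M+1]+ 1207, largest sequence ion 914.
intact_remainder = reducing_end_remainder(1207, 914)

# Periodate oxidised compound: [M+1]+ 951, largest sequence ion 873.
oxidised_remainder = reducing_end_remainder(951, 873)

print("intact:", intact_remainder)      # 293
print("oxidised:", oxidised_remainder)  # 78
```

The drop from 293 to 78 m.u. at the reducing end is the numerical basis for concluding that the 6-substituted GalNAc-ol has been degraded to a much smaller fragment, which the text identifies as ethylene glycol.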

Figure 10. FAB-MS spectra of Galβ1-3GlcNAcβ1-3Galβ1-4GlcNAcβ1-6GalNAc-ol. (A) Intact permethylated. (B) After periodate oxidation, NaBD4-reduction and permethylation. Reprinted from: Methods Enzymol. 193 (1990) 587. Copyright 1995 Academic Press. [Spectra not reproduced. Labelled ions: (A) m/z 228, 276, 464, 521, 668, 882, 914, 971 and 1207 ([M+1]+); (B) m/z 228, 305, 424, 628, 754, 841, 873 and 951 ([M+1]+).]

Figure 11. GC-MS spectra of Galβ1-3[Neu5Acα2-6]GalNAc-ol after periodate oxidation, NaBD4-reduction and permethylation. (A) Residue linked to the 3-position of the GalNAc-ol. (B) Residue linked to the 6-position of the GalNAc-ol. Reprinted from: Carbohydr. Res., 239 (1993) 35. Copyright 1995 Elsevier. [Spectra not reproduced. Labelled ions: (A) m/z 104, 130, 179 and 189; (B) m/z 60, 257, 289 and 306 ([M-59]+).]

6. CONCLUSION

In order to establish an oligosaccharide structure completely, several structural parameters have to be determined, e.g. linkage positions, sequence, identity of the monosaccharide building blocks, and absolute and anomeric configurations. One way to address some of these problems by mass spectrometry, as an alternative to investment in costly instruments, is to modify the constituent monosaccharide residues chemically in such a way that differences in mass lead to structural information. The choice of ionisation technique is important, since the structural information depends on the formation of fragment ions. EI-MS and FAB-MS, in contrast to softer ionisation techniques, produce intense cleavage ions. Specific chemical pretreatments using periodate oxidation and trifluoroacetolysis have added another dimension to structural analysis of glycoconjugates by mass spectrometry, allowing determination of linkage positions in a single analysis. The chemical transformation used has to be specific and the yield of the reactions employed should be quantitative, otherwise wrong conclusions regarding the structure can be drawn from the spectra. Trifluoroacetolysis, when used on high-mannose structures, satisfies these demands. Periodate oxidation is a quantitative and specific reaction as determined by FAB-MS. However, some large and highly branched high-mannose structures sometimes give incomplete oxidation. This problem was easily resolved by a further oxidation after reduction, after which complete oxidation was achieved.

7. REFERENCES

1. A-S. Angel, G. Grönberg, G. Nilsson, S. Strömberg and B. Nilsson in Protein Glycosylation: Cellular, Biotechnological and Analytical Aspects. GBF Monographs. H.S. Conradt (ed.), VCH, New York, 15 (1990) 125.
2. G. Grönberg, P. Lipniunas, T. Lundgren, K. Erlansson, F. Lindh and B. Nilsson, Carbohydr. Res., 191 (1989) 261.
3. H. Björndal, C.G. Hellerqvist, B. Lindberg and S. Svensson, Angew. Chem. Int. Ed. Engl., 9 (1970) 610.
4. I.J. Goldstein, G.W. Hay, B.A. Lewis and F. Smith, Methods Carbohydr. Chem., 5 (1965) 361.
5. A-S. Angel, F. Lindh and B. Nilsson, Carbohydr. Res., 168 (1987) 15.
6. A-S. Angel, P. Lipniunas, K. Erlansson and B. Nilsson, Carbohydr. Res., 221 (1991) 17.
7. B. Nilsson and S. Svensson, Carbohydr. Res., 69 (1979) 292.
8. B. Nilsson and S. Svensson, Carbohydr. Res., 72 (1979) 183.
9. B. Lindberg, B. Nilsson, T. Norberg and S. Svensson, Acta Chem. Scand., B33 (1979) 230.
10. B. Nilsson and S. Svensson, Carbohydr. Res., 65 (1978) 169.
11. B. Nilsson and D. Zopf, Arch. Biochem. Biophys., 222 (1983) 628.
12. P. Påhlsson and B. Nilsson, Anal. Biochem., 168 (1988) 115.
13. H. Wiegandt and G. Baschang, Z. Naturforsch., Teil B 20 (1965) 164.
14. B. Nilsson and D. Zopf, Methods Enzymol., 83 (1982) 46.
15. Y.-T. Li, Y. Ishikawa and S.-C. Li, Biochem. Biophys. Res. Commun., 149 (1987) 167.
16. M. Ito and T. Yamagata, J. Biol. Chem., 261 (1986) 14278.
17. A-S. Angel and B. Nilsson, Methods Enzymol., 193 (1990) 587.
18. M.R. Hardy and R.R. Townsend, Proc. Natl. Acad. Sci. USA, 85 (1988) 3289.
19. R.R. Townsend, M.R. Hardy, D.A. Cumming, J.P. Carver and B. Bendiak, Anal. Biochem., 182 (1989) 1.
20. T. Mizuochi, K. Yonemasu, K. Yamashita and A. Kobata, J. Biol. Chem., 253 (1978) 7404.
21. D.M. Carlson, J. Biol. Chem., 243 (1968) 616.
22. Y.C. Lee and J.R. Scocca, J. Biol. Chem., 247 (1972) 5753.
23. H. Krotkiewski, B. Nilsson and S. Svensson, Eur. J. Biochem., 184 (1989) 29.
24. P. Lipniunas, A-S. Angel, K. Erlansson, F. Lindh and B. Nilsson, Anal. Biochem., 200 (1992) 58.
25. F. Maley, R.B. Trimble, A.L. Tarentino and T.H. Plummer Jr., Anal. Biochem., 180 (1989) 195.
26. P. Lipniunas, G. Grönberg, H. Krotkiewski, A-S. Angel and B. Nilsson, Arch. Biochem. Biophys., 300 (1993) 335.
27. H. Krotkiewski, E. Lisowska, G. Nilsson, G. Grönberg and B. Nilsson, Carbohydr. Res., 239 (1993) 35.


S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © Elsevier Science B.V. All rights reserved.

Development of a novel enzyme based glucose sensor

F. Spener, R. Steinkuhl, C. Dumschat, H. Hinkers, K. Cammann and M. Knoll

Institut für Chemo- und Biosensorik Münster, Mendelstr. 7, D-48149 Münster, Germany

Abstract

Glucose oxidase from Aspergillus niger is an extremely stable, FAD-dependent enzyme that has found wide application in electrochemical sensing of glucose in blood. Conventional sensor technology immobilises this enzyme in membranes deposited on top of the transducer. Here we report a new containment technology in which the glucose oxidase is immobilised in gelatine or polyvinyl alcohol and deposited in the chip in pyramidal containments produced on silicon by anisotropic etching. This configuration of the sensor enhances the adhesion and stability of the membrane and has been applied to monitoring glucose in serum discontinuously as well as continuously in a flow-through system. Moreover, we demonstrate the applicability of the containment technology to monitoring glucose ex vivo with the help of a microdialysis system.

1. INTRODUCTION

1.1. Glucose oxidase

Glucose oxidase (β-D-glucose: oxygen 1-oxidoreductase, EC 1.1.3.4) is a FAD-dependent enzyme that catalyzes the oxidation of β-D-glucose by dioxygen to hydrogen peroxide and δ-gluconolactone, which subsequently hydrolyzes spontaneously to gluconic acid. Apart from glucose monitoring in food, drinks and fermentation processes, the most important application of this enzyme is in clinical diagnostics. Here it is used as a component of colorimetric diagnostic kits, of dry reagent test strips, and more recently of biosensors for the determination of glucose in blood, serum or plasma. Glucose oxidase (GOD) is well suited to widespread application in biosensors because of its high specificity for glucose and its extremely high stability [1]. The enzyme has been isolated from various moulds, from red algae, citrus fruits, insects and bacteria. The most widely used enzyme in terms of research and commercialized products, however, is that from Aspergillus niger, a glycoprotein with a high-mannose type carbohydrate content of 10 to 16 % of its molecular mass. The carbohydrate component appears to be in the form of a branched polysaccharide that partially surrounds the protein core [2]. Many of the enzyme's physical properties, such as high solubility in water and resistance to proteases, may be ascribed to its carbohydrate shell. Partial removal of this shell reduces the enzyme's stability.

Figure 1. Tertiary structure of FAD-containing glucose oxidase (monomer) from Aspergillus niger [3].

The holo-enzyme is a homodimer, where each subunit contains the tightly bound (Kd = 1 × 10⁻¹⁰ M) but not covalently attached coenzyme (FAD), and the reaction mechanism is essentially dependent on this coenzyme. The recently unraveled tertiary structure of glucose oxidase (Fig. 1) from A. niger [3] shows that this enzyme allows electron acceptors other than dioxygen to reoxidize the reduced FAD in the same way. This is important for the application of the enzyme in the amperometric approach to glucose sensors, because the transfer of electrons between the active site of the reduced enzyme and an electrode takes place slowly or not at all. Transfer of electrons can be facilitated if a small electron acceptor is used as a mediator [1]. These compounds have the advantage that they allow amperometric biosensors to operate at relatively low potentials, and this can lead to a decrease in interference from electroactive compounds (Fig. 2).

Figure 2. Natural and artificial substrates in redox reactions catalyzed by glucose oxidase from Aspergillus niger. [Scheme not reproduced. Oxidised GOD converts glucose to gluconolactone; the reduced enzyme is reoxidised by O2 or quinones (giving H2O2 or hydroquinones) or by mediators such as DCPIP (low charge density), ferricyanide (high charge density), phenoxazines, ferrocenes, TTF and TCNQ.]

1.2. Biosensors

Given the extraordinary operational stability of glucose oxidase and the high world-wide demand for monitoring glucose in the blood of patients with type I diabetes (insulin-dependent diabetes mellitus) as a parameter for insulin injections that compensate for the defect in insulin production by the pancreas, it is no surprise that the foremost application of biosensors to date is in the medical field. Initially, biosensors for monitoring blood glucose were used as analyzers in clinical laboratories. The US company Yellow Springs Instruments was the first to offer commercial glucose analyzers, based on the patent of L.C. Clark. Today the decentralized use of point-of-care diagnostics in the doctor's office or in intensive care units, and bed-side measurements in hospitals, are of greater interest (for example the multiparameter sensor of ISTAT Corp., Princeton, USA). Of utmost importance today, however, is self-testing by patients using miniaturized electrodes with immobilized GOD for blood glucose sensing. In this area three sensor systems are in different states of development. The commercially available, disposable pen- or card-type sensors are already a big success. The more ambitious approach aims at continuous monitoring of blood glucose to attain normoglycemia and to avoid acute metabolic disturbances. With this in mind, work on the development of implantable biosensors has been going on for years. Because the stability of the enzyme is still limited in this setting, and the biocompatibility of the sensor materials used is still not optimal, this concept has not yet been realized. A solution in the far future may be artificial organs, such as the pancreas, with a built-in glucose sensor and insulin pump made by nanotechnology. The more realistic concept for realization in the near future is a continuously working microdialysis system, in which dialysates of the subcutaneous tissue are channeled to a sensor for continuous glucose monitoring ex vivo [4].

Our approach to the glucose sensor is based on the GOD-catalyzed oxidation of glucose to gluconolactone and the concomitant reduction of dioxygen to H2O2. The latter is electrochemically oxidized at a platinum anode. The anode is polarized at +600 mV vs. an Ag/AgCl reference electrode. Often the enzyme is immobilized in a membrane covering the surface of the anode. The current depends linearly on the glucose concentration in the solution [5]. Until now most enzymatic biosensors have had a membrane fixed on top of the transducer. This method often leads to malfunction of the sensor arising from problems such as inadequate membrane adhesion and insufficient mechanical stability. Dip-coating procedures are difficult to perform reproducibly enough to obtain sensors with identical performance. Sensors with membranes that have been cast separately and mounted on an electrode can hardly be miniaturized and are therefore not suited for potential implantation.
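The linear dependence of current on glucose concentration is what makes a simple two-parameter calibration possible. The following is a minimal sketch of such a calibration, assuming an ordinary least-squares line; the concentration/current pairs and the function names are hypothetical placeholders chosen for illustration, not data from this work.

```python
# Illustrative calibration of an amperometric glucose sensor.
# The data points below are made-up placeholder values, NOT measurements.

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def current_to_glucose(current_na, slope, intercept):
    """Invert the calibration line: current (nA) -> glucose (mM)."""
    return (current_na - intercept) / slope


# Hypothetical calibration standards (mM) and sensor currents (nA).
glucose_mm = [0.0, 1.0, 2.0, 4.0, 8.0]
current_na = [2.0, 27.0, 52.0, 102.0, 202.0]

slope, intercept = fit_line(glucose_mm, current_na)  # sensitivity, background
unknown_mm = current_to_glucose(127.0, slope, intercept)
print(f"sensitivity = {slope:.1f} nA/mM, background = {intercept:.1f} nA")
print(f"127 nA corresponds to {unknown_mm:.2f} mM glucose")
```

The inversion is only meaningful inside the range in which the sensor response is actually linear.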

In order to solve these problems we developed a new concept for membrane deposition [6, 7], the so-called containment sensors. They are based on micromechanically etched cavities in silicon substrates (<100> crystal orientation, 380 µm thick) with openings toward the analyte solution between 120 and 480 µm wide, so that the membrane is located not on but in the chip (Fig. 3). On the side walls of the cavities, platinum electrodes are deposited by means of semiconductor technology such as photolithography and physical vapour deposition. Here we demonstrate the application of this technology to the sensing of glucose in body fluids.

Figure 3. Schematic view of the containment sensor. [Labels: Pt, Si, SiO2; containment with immobilized enzyme.]

2. EXPERIMENTAL

2.1. Chip fabrication

A silicon wafer (p-type, 3-inch, (100)-surface, 380 µm thick, optically polished on both sides) was oxidized in wet ambient at 1200 °C for 210 min to obtain a 1.5 µm thick oxide layer. Spin-coating was performed sequentially with HMDS (hexamethyldisilazane) as adhesion promoter and photoresist (AZ 5214E/Hoechst) at 4000 rpm, followed by a prebake at 90 °C for 5 min. The resist layer had a thickness of 1.4 µm. It was exposed to about 80 mJ/cm² of mid-UV radiation through a photomask to define the containments on the front side of the wafer. Development with AZ 524MIF/Hoechst was followed by a postbake at 120 °C for 60 s. A protective coating was then applied to the back side of the wafer to withstand the following oxide etching process, which was carried out with BOE (buffered oxide etchant) to obtain square holes in the oxide on the front side. Photoresist and protective layers were removed and the silicon was etched anisotropically in 20 % KOH solution at 75 °C for 6 h to create the containment holes. The silicon oxide was then stripped with hydrofluoric acid and a new 150 nm thick oxide layer was grown thermally in dry oxygen at 1050 °C.

In order to form platinum electrodes, a 1 µm thick aluminum layer was first sputter deposited on the front side of the wafer; it acts as a sacrificial layer for reliable platinum patterning in a lift-off process. Using our newly developed electrospray-coating technique, the aluminum layer was subsequently covered with a photoresist layer, again AZ 5214E from Hoechst, which was now used in image reversal mode. The prebake at 90 °C for 5 min was followed sequentially by a 40 mJ/cm² UV exposure through a photomask, a reversal bake at 120 °C for 5 min, a 320 mJ/cm² UV flood exposure, and development with AZ 524MIF. Aluminum etching was then performed with phosphoric acid etchant (PES 83.5-5.5-5.5/Merck) to obtain a negative mask for the subsequent application of the platinum electrodes and conducting lines. A 200 nm thick platinum layer was then sputter deposited and patterned by removing the aluminum layer in a sodium hydroxide solution. Anodic bonding of a Pyrex® glass wafer was performed at 500 °C and 300 V for 60 min. Finally the chips were separated.
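The geometry produced by the anisotropic KOH etch can be estimated with a short calculation. The sketch below assumes ideal {111} side walls on a (100) wafer (inclined at about 54.7° to the surface), no mask undercut, a cavity etched through the full wafer thickness, and that the 120 to 480 µm openings quoted above are the small openings on the side opposite the mask. These are assumptions made for illustration; the mask dimensions are not stated in the text, and the function names are hypothetical.

```python
import math

# Angle between a {111} side wall and the (100) wafer surface: atan(sqrt(2)).
SIDEWALL_ANGLE_DEG = math.degrees(math.atan(math.sqrt(2)))  # about 54.74


def mask_opening_um(exit_opening_um, wafer_thickness_um):
    """Mask (wide) opening needed for a given exit (narrow) opening.

    Each of the two opposite side walls recedes laterally by
    thickness / tan(54.74 deg) over the full etch depth.
    """
    lateral = wafer_thickness_um / math.tan(math.radians(SIDEWALL_ANGLE_DEG))
    return exit_opening_um + 2.0 * lateral


def mean_etch_rate_um_per_h(depth_um, time_h):
    """Average vertical etch rate if depth_um is removed in time_h."""
    return depth_um / time_h


THICKNESS_UM = 380.0  # wafer thickness from the text
for exit_um in (120.0, 250.0, 480.0):  # opening sizes used in this work
    wide = mask_opening_um(exit_um, THICKNESS_UM)
    print(f"exit opening {exit_um:5.0f} um -> mask opening about {wide:5.0f} um")

# Assumption: the 6 h etch just penetrates the 380 um wafer.
print(f"mean etch rate about {mean_etch_rate_um_per_h(THICKNESS_UM, 6.0):.0f} um/h")
```

Under these assumptions the mask opening is larger than the exit opening by sqrt(2) times the wafer thickness, roughly 537 µm for a 380 µm wafer.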

2.2. Enzyme immobilisation and membrane deposition

The buffer (pH 7.0) was prepared by dissolving 8.1 g NaCl (Merck), 0.272 g KH2PO4 (Sigma), and 0.697 g K2HPO4 (Sigma) in 1 l deionized water. GOD (200 U/mg) was obtained from Fluka. 50 mg of acid photo gelatine (Filmfabrik Wolfen, Germany) were allowed to swell in 0.5 ml deionized water for 60 min at room temperature. Then 0.5 ml of the prepared buffer solution (pH 7.0) were added. After mixing, the solution was kept at 35 °C for 60 min, then different amounts of GOD powder were added to 100 µl of the warm, stirred gelatine solution in a small Eppendorf vessel. The vessel with the enzyme was placed in a chamber and the transducer chip was dipped into the enzyme solution to cover the containment opening with it. The chamber was then evacuated in order to fill the containment with the gelatine solution. After gelling, the sensor surface was cleaned and the sensor was examined under a microscope to ensure that the containment was properly filled. The sensor was then stored in buffer solution until the first measurement.

In addition, polyvinyl alcohol functionalized with stilbene groups (PVA-SbQ) was used for enzyme immobilization [8]. 1600 U GOD were dissolved in 500 mg PVA-SbQ 500-88 solution. The solution was filled into the containment and the PVA-SbQ was then hardened by UV irradiation.

2.3. Measurements

The measurements were performed in a two-electrode configuration. A chloridized silver wire was used as reference electrode. A voltage of 0.6 V was applied between the containment electrode and the Ag/AgCl reference electrode (potentiostat Autolab PSTAT 10 / 4 channel, Eco Chemie). The sensors were tested in the buffer solution described above. Before calibration was started, the sensors were polarized until the current was lower than 1 nA, which took about one hour. Known amounts of a highly concentrated glucose solution (Merck) were added to the stirred buffer solution to obtain the calibration graphs.

When not in use, the sensors were stored in buffer solution at 4 °C. The sensor was also tested in a flow system, fixed in a block of acrylic glass with a dead volume of approximately 10 µl and equipped with a pump from Meredos and a valve from Knauer (the flow rate was 50 µl/min). The response time was obtained from the response curves in the flow system. The stability of the sensor was determined by immersing the polarised sensor together with the chloridized silver wire in stirred, undiluted human serum (purchased from Sigma) and recording the current. The concentration changes in the undiluted serum were obtained by adding known amounts of highly concentrated glucose solution.
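For the recipes in section 2.2, the molar composition of the buffer and the mass of enzyme powder per aliquot follow from simple arithmetic. The sketch below uses standard molar masses; the loading of 1600 U GOD per ml is the value quoted in the figure captions, and the function names are hypothetical.

```python
# Standard molar masses in g/mol.
MOLAR_MASS_G_PER_MOL = {"NaCl": 58.44, "KH2PO4": 136.09, "K2HPO4": 174.18}


def millimolar(mass_g, compound, volume_l=1.0):
    """Concentration in mM for mass_g of compound dissolved in volume_l."""
    return 1000.0 * mass_g / MOLAR_MASS_G_PER_MOL[compound] / volume_l


def enzyme_mass_mg(target_u_per_ml, volume_ml, specific_activity_u_per_mg):
    """Mass of enzyme powder needed for a target activity per volume."""
    return target_u_per_ml * volume_ml / specific_activity_u_per_mg


# Buffer recipe from the text (grams per litre).
buffer_recipe_g = {"NaCl": 8.1, "KH2PO4": 0.272, "K2HPO4": 0.697}
buffer_mm = {name: millimolar(mass, name) for name, mass in buffer_recipe_g.items()}
for name, conc in buffer_mm.items():
    print(f"{name}: {conc:.1f} mM")

# GOD powder (200 U/mg) for 100 ul of gelatine solution at 1600 U/ml.
god_mg = enzyme_mass_mg(1600.0, 0.1, 200.0)
print(f"GOD powder per 100 ul aliquot: {god_mg:.2f} mg")
```

The recipe thus corresponds to roughly 139 mM NaCl in a 6 mM phosphate buffer, and to somewhat less than 1 mg of enzyme powder per 100 µl aliquot at this loading.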

3. RESULTS AND DISCUSSION

The response curves and the corresponding calibration graphs of simultaneously measured containment sensors with different opening sizes toward the analyte solution (edge lengths of 120, 250 and 480 µm) are shown in Fig. 4. All sensors with gelatine-based membranes responded rapidly and stably to changes in the glucose concentration. As expected, the sensitivity of the sensors increased with increasing opening size. The detection limit was lower than 0.05 mM.

Figure 4. Dependence of signal on the opening size (edge length) of the containment sensor toward the analyte solution. A, response curves (current vs. time in h); B, calibration curves (current vs. glucose concentration in mM). Glucose (0.139-9.12 mM) in buffer; 1600 U GOD/ml gelatine solution.

In Fig. 5 it is demonstrated that the sensor is well suited for a flow system configuration. The sensor responded linearly and reproducibly in a concentration range relevant for continuous subcutaneous monitoring of glucose in a microdialysis system, if the microdialysis needle and flow rate are chosen properly. The response time (t90) in the flow system (Fig. 5) is 65 s for the change in concentration from 0.5 to 1.25 mM, and 104 s for the change from 1.25 to 0.5 mM.
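A response time such as t90 can be read from a recorded step response as the time needed to cover 90 % of the total signal change. The sketch below shows one way to do this with linear interpolation between samples; the synthetic first-order trace and its time constant of 30 s are hypothetical and serve only to exercise the function.

```python
import math


def t90(times, signal):
    """Time after times[0] at which 90 % of the total change is reached.

    Works for rising and falling steps; uses linear interpolation
    between the two samples that bracket the 90 % level.
    """
    start, end = signal[0], signal[-1]
    if start == end:
        raise ValueError("signal shows no net change")
    target = start + 0.9 * (end - start)
    rising = end > start
    for i in range(1, len(signal)):
        reached = signal[i] >= target if rising else signal[i] <= target
        if reached:
            s0, s1 = signal[i - 1], signal[i]
            frac = (target - s0) / (s1 - s0)
            return times[i - 1] + frac * (times[i] - times[i - 1]) - times[0]
    raise ValueError("90 % level not reached")


# Synthetic first-order step response (hypothetical time constant of 30 s).
TAU_S = 30.0
times = [float(t) for t in range(0, 301)]
rise = [10.0 + 20.0 * (1.0 - math.exp(-t / TAU_S)) for t in times]
fall = [30.0 - 20.0 * (1.0 - math.exp(-t / TAU_S)) for t in times]

print(f"t90 (rising step):  {t90(times, rise):.1f} s")
print(f"t90 (falling step): {t90(times, fall):.1f} s")
```

For an ideal first-order response t90 equals the time constant multiplied by ln 10, which gives a quick consistency check. A real sensor in a flow system need not be first order, as the different rise and fall times reported above indicate.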

Figure 5. Response curve (A, current vs. time in h) and response time (B, current vs. time in s) of the containment sensor in the flow system. Glucose concentrations in buffer; 1600 U GOD/ml gelatine solution; flow rate 50 µl/min.

Figure 6. Stability of the containment sensor during measurements of glucose in undiluted human serum (current vs. time in h; additions of 1.2 mM glucose and a new polarization are marked in the plot). 2400 U GOD/ml gelatine solution.

As demonstrated in Fig. 6, the sensor shows a stable signal in undiluted human serum over more than two days. This demonstrates the applicability of the sensor in undiluted body fluids. Because the linear range of the gelatine-based membranes was not sufficient in undiluted samples, the photosensitive PVA-SbQ was chosen for the immobilization of GOD. The calibration plot indeed shows a wider linear range (Fig. 7).

Figure 7. Response curve (A, current vs. time in min) and calibration curve (B, current vs. glucose concentration in mM) of the containment sensor with GOD immobilized in PVA-SbQ (1500 U/500 mg solution).

The sensors with GOD immobilized in PVA can be stored for more than 3 months in the refrigerator with only a minute decline in sensor characteristics. The influence of interfering substances was not investigated, but a behaviour similar to that of planar sensors using the same materials is expected. The results show that enzyme membrane deposition in silicon containments makes glucose sensors with good analytical response behaviour feasible. The linear range, which covers in particular the normal physiological range of 3.9-6.1 mM (70-110 mg/dl) for glucose in blood and the higher values of the diseased state, as well as the lifetime and response times are acceptable for most practical applications, including use in flow systems combined with microdialysis sampling. In the introduction we alluded to the lack of stable, implantable and long-lasting glucose sensors [9, 10]. On the way toward this goal, one immediate solution with application potential is the development of a miniaturized microdialysis system. Such a microdialysis system, under development in our laboratory, is shown in Fig. 8.
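Clinical glucose values are quoted either in mM or in mg/dl, and the two are related through the molar mass of glucose (180.16 g/mol). A minimal conversion sketch, with hypothetical function names, is:

```python
GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.16


def mm_to_mg_per_dl(conc_mm):
    """Convert a glucose concentration from mmol/l to mg/dl."""
    # mmol/l * g/mol = mg/l; divide by 10 to get mg/dl.
    return conc_mm * GLUCOSE_MOLAR_MASS_G_PER_MOL / 10.0


def mg_per_dl_to_mm(conc_mg_per_dl):
    """Convert a glucose concentration from mg/dl to mmol/l."""
    return conc_mg_per_dl * 10.0 / GLUCOSE_MOLAR_MASS_G_PER_MOL


for mm in (3.9, 6.1):
    print(f"{mm} mM = {mm_to_mg_per_dl(mm):.0f} mg/dl")
```

With this factor, 3.9 mM and 6.1 mM correspond to about 70 mg/dl and 110 mg/dl, respectively.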

Figure 8. Schematic view of the microdialysis system. A, principle; B, lay-out of the microdialysis chip; C, integrated microcontainment sensor. [Labels: gas-pressurized spring, perfusion solution, syringe, microdialysis chip, microcontainment sensor, waste, silicon.]

The analytical response behaviour of these containment sensors is comparable with that of other sensor configurations fabricated in silicon planar technology, but the containment concept offers the following advantages:
- Containment sensors are very robust because the mechanically sensitive membrane is protected against damage inside the chip. Adhesion problems with the membranes are also minimized since they are attached inside the containment. Membrane patterning is not needed.
- Encapsulation is no longer a problem since the main components of the sensor (membrane and metal electrodes) are located inside the containment. Encapsulation is further simplified by the fact that the conducting leads and the sensitive sensor surfaces are located on opposite sides of the wafer. Thus, encapsulation can be performed on the top side of the whole wafer.
- Containment sensors are ideally suited for integration into microsystems as the dead volume added to the system is minute.
- The fabrication process is entirely compatible with mass-production technologies since it is a full-wafer process. In mass production even the membrane deposition can be performed as a full-wafer process, either under vacuum or with an automated dispensing system. Only the soldering of the plug must be done after the chips have been separated.

In conclusion, the containment concept allows the fabrication of glucose sensors with good analytical response behaviour and with a considerably improved technology.

4. REFERENCES

1 R. Wilson and A. P. F. Turner, Biosens. & Bioelectron., 7 (1992) 165.
2 J. H. Pazur, K. Kleppe and A. Cepure, Arch. Biochem. Biophys., 111 (1965) 351.
3 H. J. Hecht, H. M. Kalisz, J. Hendle, R. D. Schmid and D. Schomburg, J. Mol. Biol., 229 (1993) 153.
4 C. Meyerhoff, F. J. Mennel, F. Bischof, F. Sternberg and E. E. Pfeiffer, Horm. Metab. Res., 26 (1994) 538.
5 F. Scheller and F. Schubert, Biosensoren. Akademie-Verlag, Berlin, 1989.
6 M. Knoll, German Patent DE 4115414A1 (1991).
7 R. Steinkuhl, C. Sundermeier, C. Dumschat, K. Cammann and M. Knoll, German Patent application Nr. P 4337418.2 (1993).
8 R. Renneberg, K. Sonomoto, S. Katoh and A. Tanaka, Appl. Microbiol. Biotechnol., 28 (1988) 1.
9 G. S. Wilson, Y. Zhang, G. Reach, D. Moatti-Sirat, V. Poitout, D. R. Thévenot, F. Lemonnier and J.-C. Klein, Clin. Chem., 38 (1992) 1613.
10 S. J. Updike, M. C. Shults, R. K. Rhodes, B. J. Gilligan, J. O. Luebow and D. von Heimburg, ASAIO J., 40 (1994) 157.

S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.

Carbohydrate binding at the active site of Escherichia coli maltodextrin phosphorylase

P. Drueckes, D. Palm and R. Schinzel

Theodor-Boveri-Institut für Biowissenschaften der Universität Würzburg, Physiologische Chemie I, Am Hubland, D-97074 Würzburg, Germany

Abstract The oligosaccharide binding site of E. coli maltodextrin phosphorylase was characterised by a site-directed mutagenesis approach. Kinetic studies with oligosaccharides of different length imply that the oligosaccharide binding site consists of five subsites. Subsite five contributes about 14 kJ/mol to binding. Mutations of active site residues E67, E350 and H536 impair oligosaccharide binding significantly. Based on molecular modelling studies residues E350 and H536 appear to be important parts of subsite five, while residue E67 most likely contacts subsite two, which also seemed to be important for binding. Subsites three and four appeared to contribute only little to oligosaccharide binding energy.

1. INTRODUCTION

Glycogen phosphorylases (EC 2.4.1.1) are key enzymes of carbohydrate metabolism. They catalyse the phosphorolytic cleavage of α-1,4-linked glucose units to produce α-D-glucose 1-phosphate:

$(\mathrm{glucose})_n + \mathrm{P_i} \rightleftharpoons (\mathrm{glucose})_{n-1} + \text{glucose 1-phosphate}$

In muscle, glucose 1-phosphate is further metabolised via glycolysis to provide energy, and in liver phosphorylase helps to maintain a constant blood glucose level via the action of glucose 6-phosphatase [4]. Therefore, suppression of glucose output from the liver may be achieved by inhibition of glycogen phosphorylase. Such inhibitors may be of use for therapy of the non-insulin-dependent form of diabetes (NIDDM or Type II diabetes). One class of phosphorylase inhibitors consists of glucose analogs which stabilise the inactive T-form of the enzyme. Since only weak physiological inhibitors are known, a variety of glucose compounds with better inhibitory properties were designed, synthesised and tested by L. Johnson, G. Fleet, N. Oikonomakos and coworkers [13, 23]. An alternative way to look for potent inhibitors other than glucose compounds would be to design analogs derived from the oligosaccharide substrate.

However, present knowledge about the binding sites for the glucosyl residues of the substrate in phosphorylases is still incomplete. From X-ray crystallography, molecular recognition and site-directed mutagenesis studies, the binding of glucose 1-phosphate and its derivatives in the ground state and the transition state is well characterised [20, 21, 3, 12]. Substantially less is known about productive binding of the polysaccharide awaiting degradation. So far no binding of oligosaccharides at the active site of rabbit muscle glycogen phosphorylase has been observed in crystals [18, 12], although there is at least gross structural information on a second carbohydrate binding site responsible for the attachment of the enzyme to glycogen particles, the "glycogen storage site" [9, 12].

Figure 1. Schematic drawing of the substrate binding site of maltodextrin phosphorylase. Subsites 5-2 are the primer subsites. [Arrows in the drawing mark the directions of synthesis and degradation.]

From kinetic studies with branched and linear oligosaccharides, French and coworkers [7, 8] suggested a binding site for five glucose units: four subsites for the primer and one for the glucose moiety (Fig. 1). This subsite concept is comparable to that described for other oligosaccharide-degrading enzymes [11, 16]. However, glycogen phosphorylases differ from those enzymes. First, in the reaction of phosphorylase a glucosyl residue is transferred to a phosphate group rather than to a water molecule. Consequently, the exclusion of water from the active site is essential for the phosphorylase reaction. Indeed, mutations of active site residues cause an increase of the remarkably low error rate, i.e. the release of glucose rather than glucose 1-phosphate [15]. Further, the phosphorolytic cleavage of oligosaccharides is freely reversible. At equilibrium (Keq = [Pi]/[Glc-1-P] = 3.6 at pH 6.8) "synthesis" is favoured. In the cell, the physiological role of phosphorylase is nevertheless the energy-conserving mobilisation of storage polysaccharides through phosphorolysis, due to the relatively high phosphate concentration in the cell. The work presented here is aimed at mapping the oligosaccharide binding site by a combination of kinetic studies with linear oligosaccharides of increasing length and site-directed mutagenesis. The E. coli maltodextrin phosphorylase is used as a model system, since this enzyme binds short linear oligosaccharides better than glycogen phosphorylase does. In addition, the bacterial enzyme lacks the glycogen storage site, which makes determination of kinetic parameters less complicated.
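The statement that synthesis is favoured at equilibrium while phosphorolysis prevails in the cell can be made quantitative with the usual relation between the reaction free energy and the ratio [Pi]/[Glc-1-P]. The sketch below assumes a temperature of 25 °C and equal activities of the primer and the elongated oligosaccharide; the cellular ratio of 100 is a hypothetical value chosen only for illustration.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)
T_K = 298.15                     # assumed temperature (25 C)
K_EQ = 3.6                       # [Pi]/[Glc-1-P] at equilibrium, pH 6.8 (from the text)


def delta_g_synthesis_kj(ratio_pi_to_g1p):
    """Free energy change of the synthesis direction at a given [Pi]/[Glc-1-P].

    Negative values favour synthesis, positive values favour phosphorolysis.
    """
    return R_KJ_PER_MOL_K * T_K * math.log(ratio_pi_to_g1p / K_EQ)


# Standard free energy of synthesis ([Pi]/[Glc-1-P] = 1).
dg_standard = delta_g_synthesis_kj(1.0)
print(f"synthesis, [Pi]/[Glc-1-P] = 1:   {dg_standard:+.1f} kJ/mol")

# Hypothetical cellular ratio with a large excess of phosphate.
dg_cell = delta_g_synthesis_kj(100.0)
print(f"synthesis, [Pi]/[Glc-1-P] = 100: {dg_cell:+.1f} kJ/mol")
```

Whenever the ratio [Pi]/[Glc-1-P] exceeds 3.6, the sign changes and phosphorolysis becomes the favoured direction.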

2. SUBSTRATE SPECIFICITY OF MALTODEXTRIN PHOSPHORYLASE

In the phosphorolysis mode the shortest oligosaccharide degraded efficiently by maltodextrin phosphorylase is maltopentaose (G5). Maltotetraose (G4) is degraded much less efficiently, exhibiting an approximately 300-fold lower specificity constant (Table 1). On the other hand, glucose units beyond the fifth provide no significant additional binding energy (Table 1). A very slow degradation of maltotriose was detectable by thin layer chromatography, but this activity was too low to be assayed quantitatively (Hülsmeyer and Palm, unpublished results). In phosphorolysis, maltotetraose acts as an inhibitor with an inhibitor constant comparable to the Km values of the oligosaccharide substrates (Table 2). Maltotriose (G3), maltose (G2) and glucose (G) do not inhibit the phosphorylase reaction (Table 2). In the synthesis mode the phosphorylase reaction is efficiently primed in the presence of oligosaccharides with a minimum length of four glucose residues [7]. As in degradation, longer oligosaccharides do not have better primer properties. Maltotriose was found to be a markedly less efficient primer molecule. Comparable observations were made for potato phosphorylase [22]. No primer activity of maltose was detectable even at high enzyme concentration and at prolonged incubation times of up to 48 h (Hülsmeyer and Palm, unpublished observations). In addition, maltotriose is an inhibitor of the synthesis reaction with a dissociation constant comparable to the Km value of the primer molecule maltotetraose (Table 2).

Table 1
Kinetic parameters of maltodextrin phosphorylase action on different oligosaccharides

             Degradation                        Synthesis
Substrate    Km      kcat     kcat/Km           Km      kcat     kcat/Km
             (mM)    (s-1)    (s-1 mM-1)        (mM)    (s-1)    (s-1 mM-1)
G7           0.4     18       45                3.2     48       15
G6           0.5     22       44                3.0     44       15
G5           0.8     30       38                4.6     58       12.5
G4           3.9     0.5      0.13              4.1     60       14.5
G3           n.d.    <0.01    n.d.              23      2.1      0.091

Table 2
Inhibitory effects of short oligosaccharides, maltose and glucose on the phosphorylase reaction

             Degradation    Synthesis
Inhibitor    Ki (mM)        Ki (mM)
G4           0.4            -
G3           >100           4
G2           >250           >250
G            >500           >500

A series of oligosaccharides ranging from G3 to G7 was labelled with dansylhydrazine at the reducing end. Labelled oligosaccharides (from G4 to G7) behave like physiological substrates in both directions. The fluorescence properties of the labelled substrates remained unchanged in the presence of phosphorylase (Drueckes, unpublished results). Thus, the fluorophore of the substrates does not seem to interact with the protein. This confirms the observation described above that additional glucose residues consecutive to G4 do not interact with the enzyme and that smaller oligosaccharides do not bind to the enzyme.

From the kinetic and binding experiments the following model of oligosaccharide binding at the active site of phosphorylases emerges: oligosaccharides which contain at least four glucose residues in synthesis, or five residues in degradation, are required for effective catalysis. Additional glucose residues do not contribute significantly to binding, either in synthesis or in degradation. Therefore, the oligosaccharide binding site is composed of five consecutive subsites in the direction of degradation, and of four glucose binding subsites and one glucose 1-phosphate binding subsite in synthesis. The pronounced loss in catalytic competence between G4 and G3, or G5 and G4, respectively, points to a central role of subsite five in substrate binding both in the ground state and in the transition state. The contribution of this subsite to the binding energy can be quantitated using

$\Delta\Delta G_n = -RT \ln\left[ (k_{cat}/K_m)_{n-1} / (k_{cat}/K_m)_n \right]$

according to Hiromi [11]. A free energy of binding of approximately 14 kJ/mol was determined for subsite five, while values of about 0.4 kJ/mol and 1 kJ/mol were determined for subsites six and seven.

Since shorter oligosaccharides are not inhibitors, or only weak ones, subsite five does not seem sufficient to bind the oligosaccharide substrate, and the other subsites must also contribute to oligosaccharide binding. If it is assumed that binding at two subsites is required for efficient oligosaccharide binding, all oligosaccharides which can occupy those sites should be substrates or at least inhibitors of the phosphorylase reaction. The failure of G3 to be an efficient inhibitor makes a contribution of subsite two to binding evident, since G3 cannot bind at sites two and five at the same time. Only G4 can span both binding sites, which makes it an inhibitor in the direction of degradation. If sites three or four contributed significantly to binding, maltotriose should have inhibitory properties comparable to those of G4. Binding at subsites three and four does not seem to provide binding energy, or may even lower the binding energy ("negative binding"). This is in accordance with the observation of Giri and French [8] that subsite three can accommodate 1,6-glycosidic bonds. However, subsite four is specific for 1,4-linked glucose units. From these data no conclusions on the contribution of subsite one, the catalytic site, can be made.
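The subsite energy quoted for subsite five can be reproduced from the degradation specificity constants of G5 and G4 in Table 1. The sketch below assumes a temperature of 25 °C, which is not stated in the text, and uses a hypothetical function name.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)
T_K = 298.15                     # assumed temperature (25 C)


def subsite_energy_kj(kcat_km_shorter, kcat_km_longer):
    """ddG_n = -RT ln[(kcat/Km)_(n-1) / (kcat/Km)_n] in kJ/mol.

    kcat_km_shorter: specificity constant of the substrate with n-1 units.
    kcat_km_longer:  specificity constant of the substrate with n units.
    """
    return -R_KJ_PER_MOL_K * T_K * math.log(kcat_km_shorter / kcat_km_longer)


# Degradation values of kcat/Km (s-1 mM-1) from Table 1.
KCAT_KM_G4 = 0.13
KCAT_KM_G5 = 38.0

ddg_subsite5 = subsite_energy_kj(KCAT_KM_G4, KCAT_KM_G5)
print(f"subsite five: {ddg_subsite5:.1f} kJ/mol")
```

The roughly 300-fold difference in kcat/Km between G5 and G4 thus corresponds to about 14 kJ/mol; the exact figure shifts slightly with the assumed temperature.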

3. FUNCTIONAL ROLE OF ACTIVE SITE RESIDUES IN OLIGOSACCHARIDE BINDING

Since kinetic analysis of the oligosaccharide binding site is difficult due to the lack of suitable substrate analogs and the failure of earlier attempts to measure binding directly, a site-directed mutagenesis approach was employed to obtain further information on how carbohydrates bind at the active site. In an alanine scanning experiment, as first suggested by Cunningham et al. [5], a number of active site residues were changed to alanine (Drueckes et al., in preparation). Only residues which met at least two of the following three criteria were selected for the mutagenesis experiments:
- Location at a channel with access to the active site, which is most probably identical with the oligosaccharide binding site. This channel, which was proposed by Barford and coworkers [2], is about 1.5 nm long and leads from the cofactor binding site to the surface of the enzyme.
- Conservation in all known phosphorylase primary structures, including enzymes of bacterial, plant and mammalian origin [14].
- Participation in oligosaccharide binding. Only amino acid residues with charged or hydrophilic side chains, which are commonly involved in carbohydrate binding, were changed. In contrast to many other enzymes acting on oligosaccharides [19], no tryptophans and only a few aromatic amino acid side chains seem to be involved in carbohydrate binding by interacting with the plane of the sugar through nonpolar interactions.

All mutant enzymes created so far have some properties in common:
- They exhibited thermal stabilities comparable to that of the wildtype enzyme, except for the E67A mutant enzyme, which is slightly more sensitive to heat.
- There are no significant differences in the steady state parameters for maltopentaose and maltoheptaose in the degradation mode (Fig. 2A), or for maltotetraose and maltopentaose in the synthesis mode (Fig. 2B). This confirms the above-mentioned observation that residues beyond the fourth glucose unit do not contribute significantly to binding of oligosaccharides.

- All mutations which influence carbohydrate binding affect binding of inorganic phosphate as well, but to a lesser extent. This is in accordance with earlier observations that the two substrates in the direction of degradation, inorganic phosphate and the oligosaccharide, do not bind independently [10].

3.1. Effects of mutations on apparent binding

According to their effects on apparent binding, the resulting mutant enzymes can be divided into three classes.

A few mutant enzymes, forming the first class, showed no or only minor changes in the steady state parameters upon mutation, although the mutated residues are highly conserved in all phosphorylase sequences. In addition, one mutant enzyme, N112A, exhibits significantly increased thermal stability (Drueckes and Schinzel, unpublished results). The functional role of these side chains remains unclear.

A second class comprises all mutations which caused only moderate changes of the apparent binding. The loss in apparent ground state binding energy (Fig. 2A) compared to the wildtype enzyme is about 4-6 kJ/mol for all mutants in this class. This corresponds to the loss of one hydrogen bond between uncharged side chains [6, 20]. Indeed, all mutants belonging to this class are mutants of amino acids with hydrophilic but uncharged side chains: N307, Y578 and T346. In the case of the N258A/D259A/N260A mutant enzyme, one of the asparagines rather than aspartic acid 259 might be engaged in binding of the oligosaccharide substrate in the ground state. Those three residues are part of a loop which blocks the putative substrate channel in the inactive T-form of phosphorylase b. This loop swings back upon allosteric activation of phosphorylase [2]. The moderate effects of this mutation on the non-allosteric bacterial phosphorylase show that maltodextrin phosphorylase resembles the R-form of rabbit muscle phosphorylase. The Y578A and T346A mutants are of special interest, since for these two mutants the ground state binding energy of the oligosaccharide is less affected in the synthesis mode than in the degradation mode (Fig. 2). This suggests that the oligosaccharide substrate may occupy partially different binding sites in synthesis and degradation. Although substitution of E350 by an alanine removes a charge, only moderate changes in apparent ground state binding are observed when compared to the E67A and H536L mutant enzymes.

The third class consists of mutant enzymes which showed a more substantial decrease of apparent carbohydrate binding. Here the observed changes in free energy of oligosaccharide binding in the degradation mode are in the range expected for a hydrogen bond with a charged or uncharged residue, respectively (Fig. 2). This is the case for the E67A and H536L mutants, in which charged amino acid side chains have been substituted by neutral ones. These data imply that both residues are of major importance in carbohydrate binding. However, in the direction of synthesis the values given (Fig. 2) represent only a lower limit, since oligosaccharides are insoluble at high concentrations and thus Km values cannot be determined accurately. Therefore, conclusions about differential effects of these mutations on binding in synthesis and degradation cannot be drawn from the present data.

3.2. Effects of mutations on kcat/Km

Changes in kcat/Km can be expressed in terms of the free energy of binding: $\Delta\Delta G = -RT \ln\left[ (k_{cat}/K_m)_{mut} / (k_{cat}/K_m)_{wt} \right]$. In Fig. 3 the effects of mutations on kcat/Km values are shown. As observed for the Km values, mutations of H536 and E67 cause the largest effects on kcat/Km. Due to the very high Km values, kcat/Km was determined directly at [S]<
activity. This suggests that a negative charge at this position is indispensable for a fully active enzyme (Drueckes and Schinzel, manuscript in preparation).

Figure 2. Effect of different mutations on Km values for oligosaccharides relative to the Km of the wildtype enzyme. The values for the H536L and E67A mutant give lower limits, since the determination of the apparent Km values was limited by the solubility of the oligosaccharides. (A) Degradation mode; (B) Synthesis mode. G4 = maltotetraose; G5 = maltopentaose; G7 = maltoheptaose.

The D308A mutant enzyme retained wildtype Km while activity was lowered 100-fold. Substituting T346 by alanine affects both binding in the ground state and, to a larger extent, binding in the transition state. Exchanging Y538 and N307 to alanine affects kcat only moderately; the changes in ΔΔG therefore reflect only the impaired binding in the ground state. Both residues seem not to be involved in transition state binding.
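The free-energy relation of section 3.2 can be illustrated with a short numerical sketch. The temperature of 298.15 K is an assumption (the assay temperature is not given in this section), and the function name is purely illustrative. The example uses the D308A case described above, where Km is unchanged and activity drops 100-fold, so the kcat/Km ratio is taken as roughly 0.01.

```python
import math

R = 8.314       # gas constant, J mol^-1 K^-1
T = 298.15      # ASSUMED temperature in K; not stated in the text


def ddg_from_kcat_km_ratio(ratio_mut_over_wt, temperature=T):
    """Return ddG = -RT ln[(kcat/Km)mut / (kcat/Km)wt] in kJ/mol."""
    if ratio_mut_over_wt <= 0:
        raise ValueError("ratio must be positive")
    return -R * temperature * math.log(ratio_mut_over_wt) / 1000.0


# D308A: wildtype Km, 100-fold lower activity -> ratio of about 0.01
ddg_d308a = ddg_from_kcat_km_ratio(0.01)
print(f"ddG(D308A) is about {ddg_d308a:.1f} kJ/mol")
```

Under these assumptions a 100-fold loss in kcat/Km corresponds to roughly 11 kJ/mol, and each further 10-fold loss adds about 5.7 kJ/mol.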

Figure 3. Changes in free energy of activation (calculated from kcat/K m values) of mutant maltodextrin phosphorylase compared with the wildtype enzyme. The values for the H536L and E67A mutant were determined directly at S<
3.3. Structural implications

Unlike glucoamylases and α-amylases [17], the glycogen phosphorylases, although highly conserved from bacterial to mammalian enzymes, share no homology with other carbohydrate degrading enzymes. The binding site differs in some respects from binding sites in other enzymes acting on oligosaccharides; for example, no tryptophan residues are located near the putative carbohydrate binding site at the active site of glycogen phosphorylase. Up to now no structure of a phosphorylase-carbohydrate complex has been solved. However, a channel from the active site to the surface of the enzyme has been identified in the phosphorylase b structure [1], which is most probably the substrate channel. A maltopentaose molecule has been modelled into the phosphorylase b structure. However, this was only possible under the assumption that the oligosaccharide molecule is distorted, mainly between the second and third glucose unit (L. Johnson, personal communication). Since almost all amino acid residues of the catalytic domain of the bacterial and mammalian phosphorylase are conserved [14], it can be assumed that the 3-D structures of the active sites of both enzymes are very similar. Therefore, the position relative to the maltopentaose molecule of the amino acid residues which correspond to the mutated residues in the bacterial enzyme was examined.

In this model H571 (corresponding to H536 in maltodextrin phosphorylase) and E382 (E350) are relatively close to glucose residue five. Tyr613 (Y578) appears too distant to make direct contacts with the sugar model. Hence the participation of this residue in substrate binding is most likely indirect or mediated by other side chains. In contrast, in the model only one side chain, E382 (E350), seems to be in the vicinity of glucose residue four. Binding at this subsite should contribute only little to binding both in the ground state and the transition state. As at subsite four, only one amino acid in the model, aspartic acid at position 339 (D308), is located near the glucose residue at subsite three. However, the glucose residue at subsite two is in contact with all amino acids found to be involved in substrate binding in the bacterial maltodextrin phosphorylase (except for H571 and E382). From this observation it can be predicted that subsite two contributes substantially to binding of the substrate and is critical for binding the oligosaccharide molecule in a distorted form.

Figure 4. Position of amino acid side chains (rabbit muscle numbering) found to be involved in oligosaccharide binding in maltodextrin phosphorylase, relative to a maltopentaose molecule fitted into the structure of phosphorylase b (L. Johnson, personal communication).
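Because the discussion switches between maltodextrin phosphorylase numbering and rabbit muscle numbering, a small lookup table can help keep the two apart. The sketch below only encodes the four correspondences quoted in the text; the variable and function names are illustrative and not part of the original work.

```python
# Residue correspondences quoted in the text:
# maltodextrin phosphorylase numbering -> rabbit muscle phosphorylase b numbering
MALP_TO_MUSCLE = {
    "H536": "H571",
    "E350": "E382",
    "Y578": "Y613",
    "D308": "D339",
}


def numbering_offset(malp_residue):
    """Difference between the rabbit muscle and bacterial residue numbers."""
    muscle_residue = MALP_TO_MUSCLE[malp_residue]
    return int(muscle_residue[1:]) - int(malp_residue[1:])


offsets = {residue: numbering_offset(residue) for residue in MALP_TO_MUSCLE}
for residue, offset in offsets.items():
    print(f"{residue} -> {MALP_TO_MUSCLE[residue]} (offset +{offset})")
```

The offset is not constant across these four residues, which is why an explicit table (in practice, a sequence alignment) is safer than adding a fixed shift.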

If the kinetic and site-directed mutagenesis studies are combined with the structural model of substrate binding (Fig. 4), a functional model of substrate binding in glycogen phosphorylases emerges: subsites five and two seem to be of major importance for binding of the oligosaccharide in both the transition state and the ground state, while subsites three and four make no or only small contributions to binding. The different Km values observed for some mutant enzymes in the directions of degradation and synthesis point to a structural rearrangement of parts of the oligosaccharide binding site during catalysis.

4. ACKNOWLEDGEMENT

We are grateful to L. Johnson, Oxford, for making the coordinates of the oligosaccharide-phosphorylase b model available to us. This work was supported by a grant from the EC (BIO2-CT94-3025).

5. REFERENCES

1 D. Barford, J.W. Schwalbe, N.G. Oikonomakos, K. Acharya, A. Hajdu, G. Papageorgiou, J.L. Martin, A. Knott, A. Vasella and L.N. Johnson, Biochemistry, 27 (1988) 6733.
2 D. Barford and L.N. Johnson, Nature, 218 (1989) 233.
3 S. Becker, D. Palm and R. Schinzel, J. Biol. Chem., 269 (1994) 2485.
4 M. Browner and R.J. Fletterick, Trends Biochem. Sci., 17 (1992) 66.
5 B.C. Cunningham and J.A. Wells, Science, 244 (1989) 1081.
6 A.R. Fersht, J.P. Shi, J. Knill-Jones, D.M. Lowe, A.J. Wilkinson, D.M. Blow, P. Brick, P. Carter, M. Waye and G. Winter, Nature, 314 (1985) 235.
7 D. French and G.M. Wild, J. Amer. Chem. Soc., 75 (1953) 4490.
8 N.Y. Giri and D. French, Arch. Biochem. Biophys., 145 (1970) 505.
9 E.J. Goldsmith, S.R. Sprang, R. Hamlin, N.-H. Xuong and R.J. Fletterick, Science, 245 (1989) 528.
10 D.J. Graves and J.H. Wang, The Enzymes, Vol. III, pp. 460-463, Academic Press, New York (1970).
11 K. Hiromi, Biochem. Biophys. Res. Commun., 40 (1970) 1.
12 L.N. Johnson, FASEB J., 6 (1992) 2274.
13 J.L. Martin, K. Veluraja, L.N. Ross, L.N. Johnson, G.W.J. Flett, N.G. Ramsden, I. Bruce, M.G. Orchard, N.G. Oikonomakos, G. Papageorgiou, D.D. Leonidas and H.S. Tsitoura, Biochemistry, 31 (1991) 10101.
14 C.B. Newgard, P.K. Hwang and R.J. Fletterick, Crit. Rev. Biochem. Mol. Biol., 24 (1989) 69.
15 D. Palm, S. Becker and R. Schinzel, in T. Fukui, H. Kagamiyama, K. Soda and H. Wada (eds.), Enzymes Dependent on Pyridoxalphosphate and other Carbonyl Compounds as Cofactors, pp. 377-385, Pergamon Press, Oxford (1991).
16 M.R. Sierks, C. Ford, P.J. Reilly and B. Svensson, Prot. Eng., 2 (1989) 621.
17 M.R. Sierks, C. Ford, P.J. Reilly and B. Svensson, Prot. Eng., 6 (1993) 75.
18 S.R. Sprang, S.G. Withers, E. Goldsmith, R.J. Fletterick and N.B. Madsen, Science, 254 (1991) 1367.
19 J.C. Spurlino, G.-Y. Lu and F. Quiocho, J. Biol. Chem., 266 (1991) 5202.
20 I.P. Street, C.R. Armstrong and S.G. Withers, Biochemistry, 25 (1986) 6021.
21 I.P. Street, K. Rupiz and S.G. Withers, Biochemistry, 28 (1989) 1581.
22 T. Suganuma, J.-I. Kitazone, K. Yoshinaga, S. Fujimoto and T. Nagahama, Carbohydrate Res., 217 (1991) 213.
23 K.A. Watson, E.P. Mitchell, L.N. Johnson, J.C. Son, C.F. Bichard, M.G. Orchard, G.W. Fleet, N.G. Oikonomakos, D.D. Leonidas, M. Kontou and A. Papagoergioui, Biochemistry, 33 (1994) 5745.


The chitinolytic system of Streptomyces olivaceoviridis

H. Schrempf

Department of Biology, University of Osnabrück, Barbarastraße 11, D-49069 Osnabrück, Germany

Abstract

Streptomyces olivaceoviridis secretes several chitinases. The characterized 59 kDa exochitinase consists of a C-terminal catalytic domain, one central region, and one N-terminal chitin-binding part (12 kDa). The presence of the binding domain was shown to be a prerequisite for efficient degradation of crystalline α- and β-chitins. In addition to the chitinolytic enzymes, S. olivaceoviridis produces a so far unique chitin-binding protein (CHB1). Biochemical studies and immunofluorescence microscopy revealed that the CHB1 protein binds specifically only to α-chitin from crab shell or fungi. The role of the novel lectin-like protein is at present being investigated.

1. INTRODUCTION

Chitin, a polymer of N-acetyl-D-glucosamine, is highly abundant in nature: it is present in the exoskeleton of insects, in mollusca, coelenterata, nematodes, protozoa and the cell walls of certain fungi. Naturally occurring chitins vary in the length of their chains, which are stabilized by hydrogen bridges into a highly ordered crystalline structure. Chitin is frequently associated with proteins and may be stabilized by additional inorganic compounds. The natural annual production is estimated to amount to 10^8 tons. Thus, after cellulose, it is the second most abundant polysaccharide in nature [14]. Chitin can be hydrolyzed by enzymes produced by plants, fungi, and bacteria. Gram-positive streptomycetes are highly abundant in soil and known as important antibiotic producers. Though nearly all Streptomyces species have been shown to be chitinolytic, and chitin has been successfully used to enrich predominantly streptomycetes from soil [11], relatively few studies on chitinolytic enzymes have been performed. Streptomyces olivaceoviridis was identified as the most efficient degrader of crystalline chitin [1]. Recently, five chitinases (20.5 kDa, 30 kDa, 47 kDa, 70 kDa and 92 kDa) from S. olivaceoviridis have been purified to homogeneity [17].

Many extracellular hydrolases produced by bacteria consist of two or more domains. Most is known about the catalytic domains and the binding domains (CBDs) of cellulases. Proteins predicted from different bacterial chitinase genes showed in various cases an architecture of several domains, and comparisons of the deduced amino acid sequences allowed the identification of the individual catalytic domains [23, 2]. With the help of biochemical and genetic data, chitin-binding domains could be identified within chitinases from Bacillus circulans [24]. We succeeded in analysing the catalytic and binding characteristics of an overproduced exochitinase and a novel lectin-like chitin-binding protein from S. olivaceoviridis, as well as their corresponding chitin-inducible genes.

2. RESULTS AND DISCUSSION

2.1. The exochitinase and its gene

Since S. olivaceoviridis degraded chitin efficiently, its DNA was used for shot-gun cloning experiments. In the presence of chitin, the S. lividans transformants containing the plasmid pCHIO1 produced, like S. olivaceoviridis, a chitin-inducible exochitinase of 59 kDa which was proteolytically processed to a 47 kDa protein in the culture medium [2, 3]. Sequence analysis of a reading frame of 1794 base pairs, comparison of the deduced amino acid sequence, and biochemical studies of the mature protein (59 kDa) and the proteolytically processed form (47 kDa) allowed the identification of one N-terminal chitin-binding domain, one central region with significant similarity to the type III module of fibronectin, and the C-terminal catalytic domain, which spans residues 261 to 597. This catalytic domain belongs to family 18 of glycosyl hydrolases [10], and four amino acids (Ser, Gly, Asp and Glu) were found to be completely invariant in the 12 aligned sequences. It is interesting to note that Asp and/or Glu residues have been discovered in the catalytic region of many glycosyl hydrolases, including type-C lysozyme, which also hydrolyses chitin [22]. The deduced S. olivaceoviridis enzyme has highest overall amino acid identity with the B. circulans chitinase D [2].

The 59 kDa enzyme was shown to adhere very strongly to crystalline chitin and to hydrolyse native crab and fungal α-chitin more efficiently than the 47 kDa truncated form of the enzyme. In contrast, colloidal chitin and chitooligomers were degraded equally well by both forms of the enzyme. Neither standard physiological conditions nor elevated NaCl and detergent concentrations allowed the removal of the bound Streptomyces exo-ChiO1 (59 kDa) enzyme; only high concentrations of urea and guanidine hydrochloride led to its release. Therefore it has to be assumed that the binding is irreversible under physiological conditions. These data reveal that strong adhesion of the large form of the enzyme is a prerequisite for effective hydrolysis of the crystalline structure of chitin. In addition, it can be concluded that the N-terminal domain (12 kDa) of the 59 kDa enzyme is a chitin-binding domain [3].

By immunofluorescence, we could demonstrate that the 59 kDa chitinase binds very specifically and strongly to crystalline α- and β-chitins from various sources, but does not adhere to colloidal chitin. Since streptomycetes are dominant in soil and expected to encounter fungi in their natural habitat, it was not surprising that the 59 kDa enzyme interacts particularly efficiently with α-chitin-containing spores and phialides from fungi (i.e. Aspergillus proliferans and others). It hydrolyses the chitin within the fungal cell wall very well; thus protoplast-like structures and shortened mycelia are formed. We show for the first time that fungal chitin was hydrolysed considerably more efficiently than crab chitin, which is usually used for the analysis of chitinase activities [3].
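The sizes quoted in section 2.1 can be cross-checked with simple arithmetic. The sketch below assumes that the 1794 bp reading frame includes a terminal stop codon (this is not stated explicitly in the text) and uses only numbers given above; the variable names are illustrative.

```python
ORF_BP = 1794                    # reading frame length from the text
CATALYTIC_DOMAIN = (261, 597)    # residue range of the catalytic domain from the text
FULL_KDA, PROCESSED_KDA = 59, 47 # sizes of the mature and processed enzyme from the text

codons, remainder = divmod(ORF_BP, 3)
residues = codons - 1            # ASSUMPTION: the last codon is a stop codon

catalytic_length = CATALYTIC_DOMAIN[1] - CATALYTIC_DOMAIN[0] + 1
removed_kda = FULL_KDA - PROCESSED_KDA

print(f"{codons} codons, {residues} residues if the last codon is a stop")
print(f"catalytic domain: {catalytic_length} residues")
print(f"mass lost on processing: {removed_kda} kDa")
```

Under this assumption the deduced protein ends at residue 597, which matches the reported end of the catalytic domain, and the 12 kDa lost on processing matches the stated size of the N-terminal chitin-binding domain.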


2.2. The lectin-like chitin-binding protein and its gene

Only in the presence of chitin did Streptomyces olivaceoviridis produce several chitinases, including the exochitinase described above, as well as an extracellular chitin-binding protein (CHB1). The corresponding gene (chb1) was cloned on a multicopy vector in S. lividans. Like the exochi1 gene, the chb1 gene is only transcribed in the presence of chitin, both in its original host S. olivaceoviridis and in the S. lividans transformant harbouring the chb1 gene on a multicopy vector. Since the S. lividans transformant overproduced the protein only in the presence of chitin, it was rapidly purified to homogeneity [20].

Using immunofluorescence microscopy and biochemical analyses, a very specific binding of the overproduced CHB1 to α-chitin was demonstrated. Native α-chitin from crab shells consists of N-acetylglucosamine polymer chains arranged in an antiparallel way. Weaker adsorption of CHB1 was found with colloidal chitin; this derives from native crab shell α-chitin after treatment with acids, and its structure is probably predominantly amorphous. The CHB1 protein binds neither to carboxymethylchitin nor to β-chitin, nor to any type of cellulose.

Since the CHB1 protein strongly interacted with α-chitin and additional spectroscopic studies suggested a possible interaction of tryptophan residues with the crystalline substrate, we looked for a region within CHB1 harbouring several of these amino acids. Four tryptophans were deduced in the N-terminal part of the CHB1 protein. By manual alignment of the relative positions of aromatic amino acids within CHB1 and other glycohydrolases, four tryptophan residues were found in the cellulose-binding domains (CBDs) of several bacterial cellulases [8, 12, 13, 15, 25, 26]. The CHB1 protein contained, within a region of about 100 amino acids, one cysteine, two asparagines, and one glycine. These amino acids are also present within several cellulose-binding domains (CBDs), each of which consists of approximately 100 amino acid residues. Thus, they are about 70 amino acids shorter than the CHB1 protein. It is interesting that two tryptophan residues are present in lectins from plants, such as Vicia faba [21] and Dolichos biflorus [19]. Several conservatively substituted amino acids show an analogous arrangement in the above-cited CBDs and CHB1.

Lectins from the seeds of several plants (Graminaceae, Solanaceae and some Leguminosae) have been found to bind to N-acetylglucosamine, oligomers of this sugar, and chitin [6]. One of the best-characterized chitin-binding lectins of plant origin is the wheat germ agglutinin (WGA). Within lectins, stretches of 30-43 amino acids which contain various conserved cysteines and glycines (see also below) have been identified [4]. Analyses of deleted genes and their products from the B. circulans chitinases A1 and D resulted in the identification of chitin-binding domains of 52 to 55 amino acids [24]. The exochitinase from S. olivaceoviridis [2, 3] was shown to be proteolytically processed to a catalytically active part of the enzyme and a chitin-binding part of 131 amino acids (see above). Up to now, the binding capacities of the domains from the above-mentioned chitinases have been tested with α-chitin, but neither with any other forms of chitin nor with different cellulose types. Despite our alignments, we could not identify similar amino acid arrangements among the binding domains of the above chitinases and CHB1. Further studies are necessary to test whether the various above-mentioned binding regions mediate adsorption to regions with a varying degree of crystallinity of the chitin, and/or whether they differ in their interaction with regions which are largely or only to a small extent exposed at the surface of chitin.

Although the sizes of the binding domains of various cellulases and chitinases of bacterial or eucaryotic origin range from 35 to about 150 residues, their β-structure can in general be predicted (as for CHB1, data not shown), and they seem to mediate the binding by exposed aromatic residues such as tryptophans and/or tyrosines. More extensive studies are required to elucidate the structure of proteins interacting solely with α-chitin, in comparison with those binding to various types of cellulose and chitin. Therefore, truncated and mutated forms of the proteins are being investigated at present.

Recent data give reason to believe that the cellulose-binding domains from cellulases are involved in the non-hydrolytic disruption of high-molecular-weight cellulose. The precise action is not yet known, but it was proposed that the CBD protein binds to and penetrates the cellulose fibre at surface discontinuities and sloughs off cellulose fragments non-covalently associated with the fibre but bound to underlying microfibrils [5]. Therefore the interaction of CHB1 with chitin may assist the hydrolytic attack by enzymes; this is being investigated at present. CHB1 does not possess any antifungal activity. At present we are testing whether it mediates the interaction of Streptomyces and chitin-containing fungi. It will also be interesting to analyse whether the production of the specific chitin-binding protein is typical for S. olivaceoviridis, which had been isolated in the course of a screening programme [1] as an extremely efficient chitin degrader, or whether other Streptomyces species produce binding proteins, too.

Up to now, Calcofluor and Congo red [18] have been used to quickly identify polysaccharides within different organisms. However, the dyes are not very specific, as they interact with various types of chitin, cellulose, and other polysaccharides. X-ray diffraction spectra have been necessary to distinguish different types of chitin. Since CHB1 binds specifically to α-chitin, it could also be used for the rapid identification of this chitin type. In addition, fluorescein-labelled CHB1 could serve as a tool to study the synthesis of α-chitin as well as the dynamics of its crystallization during the development of various organisms.

Recent studies have revealed that cellulose-binding domains have great potential for the purification of various proteins. Thus a gene fragment encoding a CBD was fused to several genes encoding proteins of interest [16, 7]. Using similar constructs with the chb1 gene, the adsorption of the corresponding gene products to chitin could serve as a tool for the rapid enrichment of proteins which are difficult to isolate. Moreover, various proteins could be applied in their chitin-immobilized form.

3. ACKNOWLEDGEMENTS

We thank M. Lemme for her help with the writing of the manuscript. The studies were supported by the Deutsche Forschungsgemeinschaft (Schr 203/3-2).

4. REFERENCES

1 M. Beyer and H. Diekmann, Appl. Microbiol. Biotechnol., 23 (1985) 140.
2 H. Blaak, J. Schnellmann, S. Walter, B. Henrissat and H. Schrempf, Eur. J. Biochem., 214 (1993) 659.
3 H. Blaak and H. Schrempf, Eur. J. Biochem., 229 (1995) 132.
4 M.J. Chrispeels and N.V. Raikhel, Plant Cell, 3 (1991) 1.
5 N. Din, N.R. Gilkes, B. Tekant, R.C. Miller Jr., R.A.J. Warren and D.G. Kilburn, Bio/Technology, 9 (1991) 1096.
6 I.J. Goldstein and R.D. Poretz, in I.E. Liener, N. Sharon and I.J. Goldstein (eds.), The Lectins: Properties, Functions, and Applications in Biology and Medicine, Academic Press, pp. 35-250, 1986.
7 J.M. Greenwood, E. Ong, N.R. Gilkes, R.A.J. Warren, R.C. Miller Jr. and D.G. Kilburn, Prot. Eng., 5 (1992) 361.
8 J. Hall and H.J. Gilbert, Mol. Gen. Genet., 213 (1988) 112.
9 J. Hall, G.P. Hazlewood, N.S. Huskisson, A.J. Durrant and H.J. Gilbert, Mol. Microbiol., 3 (1989) 1211.
10 B. Henrissat, Biochem. J., 280 (1991) 309.
11 H.J. Kutzner, in M.P. Starr, H. Stolp, H.G. Trüper, A. Balows and H. Schlegel (eds.), The Prokaryotes: A Handbook on Habitats, Isolation and Identification of Bacteria, Springer-Verlag, Berlin, Germany, pp. 2028-2090, 1981.
12 A. Meinke, C. Braun, N.R. Gilkes, D.G. Kilburn, R.C. Miller Jr. and R.A.J. Warren, J. Bacteriol., 173 (1991) 308.
13 A. Meinke, N.R. Gilkes, D.G. Kilburn, R.C. Miller Jr. and R.A.J. Warren, J. Bacteriol., 175 (1993) 1910.
14 R.A.A. Muzzarelli, Chitin, Pergamon Press, 1977.
15 G.P. O'Neil, S.H. Goh, R.A.J. Warren, D.G. Kilburn and R.C. Miller Jr., Gene, 44 (1986) 325.
16 E. Ong, N.R. Gilkes, R.C. Miller Jr., R.A.J. Warren and D.G. Kilburn, Biotechnol. Bioeng., 42 (1993) 401.
17 A. Romaguera, U. Menge, R. Breves and H. Diekmann, J. Bacteriol., 174 (1992) 3450.
18 C. Roncero and A. Durán, J. Bacteriol., 163 (1985) 1180.
19 D.J. Schnell, D.C. Alexander, B.G. Williams and M.E. Etzler, Eur. J. Biochem., 167 (1987) 227.
20 J. Schnellmann, A. Zeltins, H. Blaak and H. Schrempf, Mol. Microbiol., 13 (1994) 807.
21 N. Sharon and H. Lis, Science, 246 (1989) 227.
22 M. Sinnott, Chem. Rev., 90 (1990) 1171.
23 T. Watanabe, W. Oyanagi, K. Suzuki, K. Ohnishi and H. Tanaka, J. Bacteriol., 174 (1992) 408.
24 T. Watanabe, K. Kobori, T. Yamada, Y. Ho, M. Uchida and H. Tanaka, in R.A.A. Muzzarelli (ed.), Chitin Enzymology, Eur. Chitin Soc., Ancona, Italy, pp. 329-337, 1993.
25 R.W.K. Wong, B. Gerhard, Z.M. Guo, D.G. Kilburn, R.A.J. Warren and R.C. Miller Jr., Gene, 44 (1986) 315.
26 M.D. Yablonsky, T. Bartley, K.O. Elliston, S.K. Kahrs, Z.P. Shalita and D.E. Eveleigh, in J.-P. Aubert, P. Béguin and J. Millet (eds.), Biochemistry and Genetics of Cellulose Degradation, FEMS Symposium 43, Academic Press, London, pp. 249-266, 1988.


Properties and production of the β-glycosidase from the thermophilic Archaeon Sulfolobus solfataricus expressed in mesophilic hosts

M. Moracci a, L. Capalbo b, M. De Rosa b, R. La Montagna a, A. Morana b, R. Nucci a, M. Ciaramella a and M. Rossi a

a Institute of Protein Biochemistry and Enzymology, CNR, Via Marconi 10, 80125 Naples, Italy
b Istituto di Biochimica delle Macromolecole, II University of Naples, Via Costantinopoli 16, 80138 Naples, Italy

Abstract

The β-glycosidase from the thermophilic Archaeon Sulfolobus solfataricus (Sβgly) shows broad substrate specificity, thermophilicity and high resistance to heat and other protein denaturants. It is an interesting model for elucidating the structure-function relationships in glycosyl hydrolases as well as the molecular determinants of thermophilicity and thermostability. In this paper we describe different strategies to obtain and optimize heterologous expression of the Sβgly in mesophilic hosts and its downstream processing, which are the prerequisites for large-scale production of the enzyme. We also describe the properties of the expressed enzyme and those of protein chimeras obtained by fusing the Sβgly to the glutathione-S-transferase (GST); such fusions are useful tools to investigate the properties of the enzyme and to easily purify mutants impaired in their activity or stability.

1. INTRODUCTION

Among enzymes involved in carbohydrate metabolism, attention has recently been focused on β-glycosidases from thermophilic microorganisms, enzymes showing highly thermostable β-glucosidase and β-galactosidase activities. We have previously reported studies on the thermostable β-glycosidase (Sβgly) from the extreme thermoacidophilic Archaeon Sulfolobus solfataricus (for a review, see [1]). The enzyme is thermophilic (its activity increases with temperature up to 90 °C) and thermostable (the half-life at 75 °C is 24 hours); it is stable to different protein denaturants and proteases, and moreover it is activated by heat and some protein denaturants. The enzyme has a broad substrate specificity: it hydrolyzes β-D-gluco-, fuco-, and galactosides, and a large number of β-linked glycoside dimers and oligomers, linked β1-3, β1-4, and β1-6. It has noticeable exo-cellobiase activity on oligosaccharides with up to 5 glucose residues. Furthermore, it is also able to promote transglycosylation reactions [2, 3].

The Sβgly encoding gene, called lacS, has been cloned and expressed in mesophilic hosts [4-6; see below]. We have analyzed the transcriptional organization of the lacS region and have identified the transcription start site and the presumptive promoter sequences [7]. The enzyme has been crystallized [8] and the resolution of its structure is in progress.

The Sβgly belongs to glycosyl hydrolase family 1, which includes enzymes with β-glycosidase, β-galactosidase and phospho-β-galactosidase activity, widely distributed among the three living Domains (Bacteria, Eukarya and Archaea). Enzymes of this family share significant sequence similarity with enzymes of cellulase family A, which includes microbial enzymes with β-1,4-glucanase activity involved in the degradation of cellulose. Enzymes of both families catalyze hydrolysis of β-glycosidic bonds, possibly by the same mechanism of general acid catalysis [9].

Because of its particular features, the Sβgly is a useful tool in all circumstances in which wide substrate specificity coupled with resistance to denaturation is required. For instance, its ability to hydrolyze oligosaccharides allows its use, coupled with mesophilic or thermophilic cellulases, in the conversion of cellulose to glucose at high temperature. We are interested in identifying and defining the functional sites of the Sβgly involved in its different properties (substrate specificity, exo-cellobiase and transglycosylation activities), and the structural determinants of thermostability and thermophilicity; such results could also open the possibility of rationally engineering the enzyme to achieve better performance in applications. In this paper we describe different strategies for production, purification and downstream processing of the heterologously expressed Sβgly.

2. HETEROLOGOUS EXPRESSION

The development of an innovative enzymology based on enzymes from thermophiles is currently hampered by difficulties in producing thermophilic biomass. The extreme growth conditions of hyperthermophiles and their slow growth rate, with consequently low biomass yields, make it difficult both to obtain the amounts of enzyme needed for structure-function studies and to devise methodologies for industrial scale-up of production. Cloning and overexpression of thermophile genes in mesophilic hosts is now a well-established methodology to overcome these problems. Thermophilic enzymes are inactive at the optimal temperature of the host, so the cells may tolerate the accumulation of higher quantities of the expression product. Moreover, the intrinsic stability of the heterologous enzymes allows their selective purification from the host proteins.

2.1. Expression in Escherichia coli

Expression of protein genes in E. coli is the essential prerequisite for easy production of material for enzymological and structural studies, as well as for manipulation of the genes by random and site-directed mutagenesis. We have over-expressed the Sβgly in E. coli and purified it to homogeneity by using a heating step which proved to be crucial: most host proteins were easily removed, and two chromatographic steps employed for the purification of the native enzyme were by-passed [6]. Subsequent conventional hydrophobic chromatography, ion-exchange chromatography and gel filtration gave a purified protein with the same specific activity as the native enzyme, but with a better final purification factor and yield. A typical purification gave more than 5 mg/L of culture of homogeneous enzyme. Since the enzyme produced in E. coli shows the same properties as the native one, it can be used as a substrate for mutagenesis studies; by sequence alignment among members of glycosyl hydrolase family 1, several conserved residues can be identified which are potential targets for site-directed substitutions.

2.2. Expression in yeast

Because of its features, the Sβgly has several potential applications in the food and pharmaceutical industries; the possibility of expressing high levels of the enzyme in a safe (GRAS) host such as Saccharomyces cerevisiae is particularly attractive. We have previously obtained expression of the enzyme in yeast by using the vector pYE87, based on the classical URA3 selection [5]. However, when this expression system was scaled up to production volumes, the amount of enzyme produced turned out to be significantly lower than expected, due to plasmid instability. To overcome this problem we have used an autoselection system based on the FBA1 gene, coding for the glycolytic enzyme fructose 1,6-bisphosphate aldolase, which is essential in any growth condition. In this system, the chromosomal copy of the FBA1 gene in the host strain is stably disrupted [10], and the strain contains a functional copy of the same gene on a plasmid. Since only cells retaining the vector are viable, the plasmid is completely stable after 50 generations in rich media [11]. We have constructed an autoselection-expression vector (pYG5) containing a functional copy of the FBA1 gene and the S. solfataricus lacS gene, coding for the Sβgly, under the control of the galactose-inducible UAS_GAL. Yeast cultures transformed with pYG5 or with pYE87 were subjected to three rounds of growth in non-selective medium; with conventional selection plasmids, this experimental scheme favours a high rate of plasmid loss (see below). Under these conditions the two strains show very similar growth curves and reach the same maximal optical density (Fig. 1), showing that the autoselection system does not result in a growth disadvantage.

Figure 1. Growth curves of yeast cultures transformed with pYE87 and pYG5 vectors.

Plasmid stability was measured in cultures grown in non-selective conditions for 19 generations (Table 1). With plasmid pYE87, based on conventional selection, the fraction of plasmid-containing cells drops to 30%, whereas the autoselection vector pYG5 is present in 100% of the cells, confirming that this autoselection system is very stringent and that there are no plasmidless cells in the population. The possibility of scaling up this expression system to production volumes will be evaluated.

Table 1
Stability of pYE87 and pYG5 in non-selective medium (plasmid-containing cells)

            Generations
Plasmid     3        19
pYE87       70 %     30 %
pYG5        97 %     98 %
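The values in Table 1 can be turned into a rough per-generation estimate of plasmid retention. The sketch below assumes a constant fractional loss per generation between the two sampling points, which is a simplification not made in the text; the function and variable names are illustrative.

```python
# Fraction of plasmid-containing cells from Table 1 (generations -> fraction)
PYE87 = {3: 0.70, 19: 0.30}
PYG5 = {3: 0.97, 19: 0.98}


def retention_per_generation(data):
    """Per-generation retention factor r, ASSUMING f(g2) = f(g1) * r**(g2 - g1)."""
    (g1, f1), (g2, f2) = sorted(data.items())
    return (f2 / f1) ** (1.0 / (g2 - g1))


r_pye87 = retention_per_generation(PYE87)
r_pyg5 = retention_per_generation(PYG5)

print(f"pYE87: about {100 * (1 - r_pye87):.1f} % of plasmid-bearing cells lost per generation")
print(f"pYG5: retention factor {r_pyg5:.4f} per generation")
```

Under this simple model pYE87 loses roughly 5 % of its plasmid-bearing cells per generation, while the pYG5 values show no measurable loss between generations 3 and 19.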

3. DOWNSTREAM PROCESSING

We have developed a downstream processing methodology for the purification of the enzyme expressed in yeast, taking advantage of the unusual stability of the Sβgly to heat and proteases. This methodology involves two key steps: cell autolysis and thermal precipitation of the extract. A pilot fermentation experiment was carried out with a recombinant S. cerevisiae strain expressing the Sβgly on a 100-liter scale. Cells were collected, subjected to one cycle of freeze-thawing, and then to autolysis. During this process, the enzyme was released into the medium. Interestingly, the time course of protein release showed that, whereas total protein concentration decreased with time, presumably due to digestion by endogenous proteases released during autolysis, the β-glycosidase activity increased continuously up to a maximum recovery after 72 hours, confirming once more the exceptional stability of the enzyme. The enzymatic yield, in units per liter of cell culture, is about 35-fold higher in the recombinant yeast autolysate than in the S. solfataricus homogenate. The enzyme was further purified by heating the autolysate for 20 min at 80 °C and discarding the precipitate by centrifugation. With this new purification procedure, the yield of the whole process (units/L cell culture) increased up to 56-fold. This methodology for the recovery of a thermophilic enzyme from yeast is suitable for industrial processes, being inexpensive and environmentally acceptable.

4. GST-Sβgly FUSIONS: EXPRESSION AND PURIFICATION

We have constructed fusions of Sβgly to the glutathione-S-transferase (GST) from Schistosoma japonicum and expressed them in E. coli. Such fusions allow both one-step purification of the chimeric enzyme by affinity chromatography on glutathione and assay of GST activity to follow the purification. They therefore offer an alternative to classical methods for Sβgly purification, and are an essential tool for producing mutant enzyme molecules impaired in activity and/or stability. We made a protein chimera by cloning the lacS gene in the BamHI site of vector pGEX-2T [12], obtaining vector pGexgly. The resulting product contains at the N-terminus the 26 kDa GST polypeptide fused to the 56 kDa Sβgly protein. The sequence at the junction of the two polypeptides is as follows:

-Leu-Val-Pro-Arg-Gly-Ser-Met-Tyr-Ser-Phe-Pro-Asn-

where the last six residues shown (Met-Tyr-Ser-Phe-Pro-Asn) are the first six lacS amino acids. After transformation of E. coli JM105 cells with pGexgly, expression of the chimeric protein was induced with IPTG; after overnight growth, abundant GST activity could be detected in the crude homogenate at 37 °C with the standard GST assay [13]. The extract was loaded on a Glutathione Sepharose 4B column; no GST activity was released from the column after extensive washing, and the fusion was eluted with glutathione buffer. The sample obtained showed both GST and β-galactosidase activity, at 37 and 75 °C respectively, and a high degree of purity. When the sample was analysed by SDS-polyacrylamide gel electrophoresis (SDS-PAGE; Fig. 2), a two-band pattern was found (lane 2): a 90 kDa band corresponding to the denatured chimera monomer, and a high molecular weight band corresponding to the incompletely denatured, oligomeric form of the protein, which surprisingly is still active, as demonstrated by β-galactosidase activity staining (lane 9). A similar pattern (but with bands of reduced size owing to the absence of the GST portion) was found for wild-type Sβgly analysed under the same conditions (lanes 7 and 8).
These results demonstrate that fusion of GST to Sβgly has little effect on the overall architecture of the latter molecule, barely affecting its functional folding, oligomerisation and exceptional thermal stability (fusion and wild-type protein samples were incubated at 100 °C in 0.1% SDS for 5 min before loading on SDS-PAGE). Nevertheless, the β-glycosidase specific activity of the fusion was as much as 25% lower than that of the wild-type enzyme at the same purification step, indicating that the catalytic efficiency and/or specificity were partially affected. In order to separate the GST portion from the fusion and recover the free Sβgly, samples obtained from the affinity chromatography were subjected to cleavage with thrombin protease. The fusion includes a thrombin recognition site (Leu-Val-Pro-Arg-Gly-Ser):

-Leu-Val-Pro-Arg ↓ Gly-Ser-Met-Tyr-Ser-Phe-Pro-Asn-

The arrow shows the bond hydrolysed by thrombin; as a consequence, the Sβgly produced by this digestion would carry two additional amino acids at the N-terminus. Unfortunately the proteolytic cleavage proved inefficient, even at very high thrombin/fusion ratios, probably because the thrombin recognition site is buried within the chimera structure (Fig. 2, lanes 2 to 6). Indeed, the efficiency of cutting was partially enhanced when the protein was pre-incubated at 55 °C and then subjected to thrombin hydrolysis: under these conditions the GST portion is partially denatured and presumably exposes the thrombin site. However, this treatment produced incomplete and nonspecific protein cleavage (data not shown).


Figure 2. 10% SDS-PAGE of GST-Sβgly fusions after treatment with thrombin protease. Left panel: Coomassie staining. Lane 1, molecular weight markers: phosphorylase b, 94,000 Da; bovine serum albumin, 67,000 Da; ovalbumin, 43,000 Da; carbonic anhydrase, 30,000 Da; soybean trypsin inhibitor, 20,100 Da. Lane 2, GST-Sβgly fusion incubated o.n. at 25 °C. Lanes 3-6, GST-Sβgly fusion incubated at 25 °C in the presence of thrombin protease (10 units/mg of chimeric protein) for 1, 2, 4 hrs and o.n., respectively. Lane 7, wild-type Sβgly. Right panel: β-galactosidase activity staining. Lane 8, wild-type Sβgly. Lane 9, GST-Sβgly fusion incubated o.n. at 25 °C in the presence of thrombin protease (10 units/mg of chimeric protein).

To overcome this problem we prepared a new GST-Sβgly fusion by cloning the lacS gene in the BamHI site of vector pGEX-2TK, obtaining pGKgly. The resulting chimeric protein, called GST-K-Sβgly, is similar to GST-Sβgly but includes the recognition sequence for the catalytic subunit of cAMP-dependent protein kinase (PK), Arg-Arg-Ala-Ser-Val:

-Leu-Val-Pro-Arg ↓ Gly-Ser-Arg-Arg-Ala-Ser-Val-Met-Tyr-Ser-Phe-Pro-Asn-

When this new chimera was expressed and partially purified, it was efficiently and specifically cleaved by thrombin, suggesting that the PK recognition sequence worked as a 'spacer' between the two polypeptides, exposing the protease site (Fig. 3, lane 1). The β-glycosidase portion obtained in this way contained seven extra amino acids at its N-terminus. Nevertheless, this tail did not affect Sβgly activity or thermal stability, confirming that the pGKgly vector can be used to produce Sβgly mutants, even those severely affected in their properties.
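The residue counts quoted for the two fusions (two extra N-terminal amino acids for GST-Sβgly, seven for GST-K-Sβgly) can be checked directly from the junction sequences given in the text. The following is a minimal sketch, assuming thrombin cuts only at the Arg-Gly bond of Leu-Val-Pro-Arg-Gly-Ser and that lacS starts at Met-Tyr-Ser-Phe-Pro-Asn; the function and variable names are illustrative and not part of the original work.

```python
# Junction sequences as given in the text, keyed by the vector that encodes them.
JUNCTIONS = {
    "pGexgly": "Leu-Val-Pro-Arg-Gly-Ser-Met-Tyr-Ser-Phe-Pro-Asn",
    "pGKgly": "Leu-Val-Pro-Arg-Gly-Ser-Arg-Arg-Ala-Ser-Val-Met-Tyr-Ser-Phe-Pro-Asn",
}

THROMBIN_SITE = ["Leu", "Val", "Pro", "Arg", "Gly", "Ser"]  # bond cut: Arg|Gly
LACS_START = ["Met", "Tyr", "Ser", "Phe", "Pro", "Asn"]     # first six lacS residues


def find_sublist(residues, pattern):
    """Return the index of the first occurrence of pattern in residues, or -1."""
    for i in range(len(residues) - len(pattern) + 1):
        if residues[i:i + len(pattern)] == pattern:
            return i
    return -1


def extra_n_terminal_residues(junction):
    """Residues left in front of the lacS Met after thrombin cuts Arg|Gly."""
    residues = junction.split("-")
    site = find_sublist(residues, THROMBIN_SITE)
    start = find_sublist(residues, LACS_START)
    if site < 0 or start < 0:
        raise ValueError("thrombin site or lacS start not found")
    cut = site + 4  # index of Gly, the first residue after the scissile bond
    return residues[cut:start]


for vector, sequence in JUNCTIONS.items():
    tail = extra_n_terminal_residues(sequence)
    print(vector, len(tail), "-".join(tail))
```

The longer tail of the pGKgly product is exactly the PK recognition sequence plus Gly-Ser, which is the 'spacer' invoked to explain the improved cleavage.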


Figure 3. 7% SDS-PAGE of GST-K-Sβgly fusions treated with thrombin protease, after Coomassie staining. Lane 1, GST-K-Sβgly fusion incubated o.n. at 25 °C in the presence of thrombin protease (10 units/mg of chimeric protein). Lane 2, GST-K-Sβgly fusion incubated o.n. at 25 °C. Lane 3, molecular weight markers (see Fig. 2 legend). Lane 4, wild-type Sβgly.

5. CONCLUSIONS

We have developed different strategies for optimizing the heterologous expression, improving the downstream processing, and analysing Sβgly, which together offer a versatile set of tools suitable for different applications. Our expression/purification system in yeast can be scaled up to production volumes: it maintains the foreign DNA stably in transformed cells and gives high biomass, a high amount of the target product and consistent yields in large-volume fermentors at low cost. Furthermore, the availability of GST-Sβgly fusions, besides prompting interesting considerations on the structure-activity relationships of the wild-type enzyme, allows the purification of mutants designed to address questions about the enzyme's activity, thermostability and thermophilicity.

6. ACKNOWLEDGEMENTS

We are grateful to Prof. E. Martegani for strains and suggestions. This work was partially supported by CNR Target Project Biotechnology and Bioinstrumentation, and CNR P.F. Ingegneria Genetica.

7. REFERENCES

1 M. Moracci, M. Ciaramella, R. Nucci, L.H. Pearl, I. Sanderson, A. Trincone and M. Rossi, Biocatalysis, 11 (1994) 89.
2 F.M. Pisani, R. Rella, C. Rozzo, C.A. Raia, R. Nucci, A. Gambacorta, M. De Rosa and M. Rossi, Eur. J. Biochem., 187 (1990) 321.
3 R. Nucci, M. Moracci, C. Vaccaro, N. Vespa and M. Rossi, Biotechnol. Appl. Biochem., 17 (1993) 239.
4 M.V. Cubellis, C. Rozzo, P. Montecucchi and M. Rossi, Gene, 94 (1990) 89.
5 M. Moracci, A. LaVolpe, J.F. Pulitzer, M. Rossi and M. Ciaramella, J. Bacteriol., 174 (1992) 873.
6 M. Moracci, R. Nucci, F. Febbraio, C. Vaccaro, N. Vespa, F. La Cara and M. Rossi, Enz. Microb. Technol., (1995) in press.
7 A. Prisco, M. Moracci, M. Rossi and M. Ciaramella, J. Bacteriol., 177 (1995) 1614.
8 L.H. Pearl, A.M. Hemmings, R. Nucci and M. Rossi, J. Mol. Biol., 229 (1993) 558.
9 B. Henrissat, Biochem. J., 280 (1991) 309.
10 C. Compagno, B.M. Ranzi and E. Martegani, FEBS Lett., 293 (1991) 97.
11 C. Compagno, A. Tura, B.M. Ranzi and E. Martegani, Biotechnol. Progr., 9 (1993) 594.
12 D.B. Smith and K.S. Johnson, Gene, 67 (1988) 31.
13 B. Mannervik and U.H. Danielson, CRC Crit. Rev. Biochem., 23 (1988) 283.

S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © Elsevier Science B.V. All rights reserved.


Contribution of subsites to catalysis and specificity in the extended binding cleft of Bacillus 1,3-1,4-β-D-glucan 4-glucanohydrolases

Antoni Planas* and Carles Malet

Laboratory of Biochemistry, Department of Organic Chemistry, CETS Institut Químic de Sarrià, Universitat Ramon Llull, 08017 Barcelona, Spain

Abstract

Specific low molecular weight oligosaccharides having a chromophoric aglycon were synthesized for kinetic evaluation of the enzyme catalysis. A subsite mapping study showed that the active site cleft is composed of four subsites on the non-reducing end, and allowed the contribution of subsites -II to -IV to transition state stabilization to be estimated. Beyond the requirement for proper orientation of the scissile glycosidic bond of the substrate, inhibition kinetics with β-glucan and cello-oligosaccharides suggest a binding preference of subsite -I for a 3-O-substituted over a 4-O-substituted glucopyranose unit. Subsite +I, on the other hand, makes a minor contribution to binding, but the nature of the aglycon in the substrate largely affects catalysis.

1. INTRODUCTION

1,3-1,4-β-D-glucan 4-glucanohydrolase (EC 3.2.1.73, abbreviated 1,3-1,4-β-glucanase) catalyses the regio- and stereospecific hydrolysis of β-glucans containing mixed β-1,3 and β-1,4 glycosidic linkages, such as cereal β-glucans and lichenan [1,2,3]. The trisaccharide 3-O-β-cellobiosyl-D-glucopyranose (1) and the tetrasaccharide 3-O-β-cellotriosyl-D-glucopyranose (2) are the final hydrolysis products of barley β-glucan, thus defining the enzymic specificity as cleavage of β-1,4-glycosidic linkages in 3-O-substituted glucopyranose units.

[Structures 1 (n = 1) and 2 (n = 2)]

The 1,3-1,4-β-glucanases cloned and sequenced so far belong to two distinct enzyme families, plant and microbial enzymes, with no apparent sequence similarity and unrelated three-dimensional structures. The Bacillus isozymes are the best characterized; they are highly homologous, have molecular masses of 25-30 kDa and basic pI values (8-9), are active over a wide pH range, and are quite thermostable compared with the plant isozymes. Because of its importance in the food industry (brewing and animal feedstuff), the enzyme is a current target of protein engineering programs, both to understand the enzyme function at the molecular level and to redesign some enzyme properties for practical applications. Recent progress on the structure-function relationships in Bacillus 1,3-1,4-β-glucanases includes identification of the catalytic residues by site-directed mutagenesis [4,5] and chemical modification with active site-directed inhibitors [1], as well as determination of the tertiary structure by X-ray crystallography [6,7]. Bacillus licheniformis 1,3-1,4-β-glucanase is a retaining glycosidase [8] with Glu138 acting as the general acid catalyst and Glu134 as the catalytic nucleophile [5]. The three-dimensional structure has recently been refined at 0.18 nm resolution by X-ray crystallography, showing an almost identical structure to that of the previously reported hybrid H(A16M) enzyme, which consists of amino acid residues 1-16 from B. amyloliquefaciens with the rest derived from B. macerans [6]. The protein adopts a jelly roll β-sandwich fold with a carbohydrate-binding cleft located on the concave face of a β-sheet formed of consecutive antiparallel β-strands. The catalytic residues Glu134, Asp136 and Glu138 lie in the same β-strand and are surrounded by a number of aromatic amino acid residues, as is commonly found in carbohydrate-binding proteins. The polysaccharide substrate may bind in the extended cleft, which can be envisaged as formed of individual subsites, each accommodating one glucopyranose unit of the bound carbohydrate.
The structure of a covalent enzyme-inhibitor complex between the hybrid H(A16M) 1,3-1,4-β-glucanase and 3,4-epoxybutyl β-cellobioside [6] provides little information about the binding site cleft, since the cellobiose unit of the inhibitor only partially fills the cleft and is covalently attached to the protein by an alkyl chain. Kinetic studies, on the other hand, have been limited to natural polysaccharide substrates, namely barley or oat β-glucan, lichenan, and the reduced SIII pneumococcal polysaccharide (a regular polymer containing alternating 1,3- and 1,4-β-D-glucopyranosyl units [9]). These substrates undergo extensive endo depolymerization, with a large number of glycosidic bonds hydrolyzed per molecule, yielding a complex mixture of oligosaccharides over a wide range of molecular weights at initial reaction times. Although they are currently used for analytical purposes (i.e. determination of β-glucanase contents in cereals), polysaccharide substrates are not appropriate for kinetic measurements in mechanistic studies of the enzyme action. We have recently prepared a family of low molecular weight β-glucan oligosaccharides bearing a chromophoric aglycon as potential substrates for 1,3-1,4-β-glucanases. Here we describe their use in the enzymology of the Bacillus licheniformis enzyme, aimed at evaluating the contribution of subsites in the extended binding site cleft to binding and catalysis.


2. RESULTS

2.1. Chromophoric substrates: 4-methylumbelliferyl β-glucan oligosaccharides

A set of low molecular weight β-glucan oligosaccharides having a 4-methylumbelliferyl aglycon has been synthesized for kinetic evaluation of the enzyme catalysis [10]:

[Structures 3a and 3b-e (n = 0-3)]

Criteria for designing the specific substrates were:
a. a basic substructure G3G-X, according to the requirements for natural polysaccharide substrates (barley β-glucan and lichenan) to be hydrolyzed by the enzyme, i.e. cleavage of a β-1,4 glycosidic bond on a 3-O-substituted glucopyranose unit;
b. a single scissile glycosidic bond, so that the enzymic reaction can be described by a simple kinetic model;
c. release of a chromophoric aglycon upon enzymic hydrolysis, for easy monitoring of the reaction course by continuous UV or fluorescence spectrophotometry.

HPLC and NMR monitoring of the enzymic hydrolysis by Bacillus licheniformis 1,3-1,4-β-glucanase shows that 3a-e undergo a single cleavage with release of 4-methylumbelliferone, and that the reaction proceeds with net retention of the anomeric configuration.

2.2. Number of subsites on the non-reducing end and their contribution to transition state stabilization

Kinetics were performed at pH 7.3 and 53 °C, the conditions of maximum activity on the natural substrate barley β-glucan. Initial velocities (<3% conversion) for 3a-e at different substrate concentrations were determined by monitoring the release of 4-methylumbelliferone at 365 nm (Δε = 5440 M⁻¹ cm⁻¹). 3a is a very poor substrate, and no saturation kinetics were observed up to 10 mM. For 3b-e the curve of v0 vs. [S] reaches a maximum and decreases at higher substrate concentrations (Figure 1). The data were fitted to equation 1, which corresponds to an uncompetitive substrate inhibition model. A clearly biphasic Hill plot, log[v/(V-v)] vs. log[S] (Figure 1, for 3d as an example), is consistent with this inhibition scheme, in which a second molecule of substrate binds in the extended binding site cleft to form unproductive ternary complexes (ES2), producing a downward curvature of the Hill plot at high substrate concentration with the slope becoming negative (h = -1). Kinetic parameters are summarized in Table 1.

$$ E + S \underset{k_{-1}}{\overset{k_{+1}}{\rightleftharpoons}} E{\cdot}S \xrightarrow{k_2} E + P, \qquad E{\cdot}S + S \overset{K_I}{\rightleftharpoons} E{\cdot}S_2 $$

$$ v_0 = \frac{k_{cat}\,[E_0]\,[S]}{K_M + [S] + [S]^2/K_I} \qquad (1) $$

[Plot: v0 versus [S] (mM), 0-20 mM. Inset: Hill plot versus ln [S], with limiting slope h = -1 at high [S].]

Figure 1. Steady-state kinetics for 3d. Inset: Hill plot

Table 1
Kinetic parameters of B. licheniformis 1,3-1,4-β-glucanase with substrates 3a-e.

Substrate              KM (mM)        kcat (s⁻¹)       KI (mM)         kcat/KM (M⁻¹ s⁻¹)
3a  G-MU (a)           -              -                -               0.348 ± 0.008
3b  G3G-MU             16.8 ± 2.2     0.139 ± 0.012    57.12 ± 11.5    8.27 ± 1.80
3c  G4G3G-MU           2.69 ± 0.06    4.58 ± 0.06      37.8 ± 1.5      1700 ± 60
3d  G4G4G3G-MU         0.79 ± 0.02    8.86 ± 0.10      36.4 ± 1.8      11200 ± 500
3e  G4G4G4G3G-MU       0.66 ± 0.03    8.78 ± 0.20      29.3 ± 4.9      13300 ± 900

(a) MU: 4-methylumbelliferyl. Conditions: citrate-phosphate buffer pH 7.3, 53 °C, 0.1 mM CaCl2, [E] = 25-100 nM; [S]: 3a 1-10 mM, 3b 1-51 mM, 3c 0.1-24 mM, 3d 0.1-22 mM, 3e 0.1-6 mM.
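To illustrate how equation 1 produces a velocity maximum, the sketch below evaluates v0/[E0] for the tetrasaccharide 3d with the Table 1 parameters. It assumes the substrate inhibition form v0 = kcat[E0][S]/(KM + [S] + [S]²/KI); setting the derivative with respect to [S] to zero places the maximum at [S] = √(KM·KI). The function name is illustrative only.

```python
import math


def v0_over_e0(s, kcat, km, ki):
    """Turnover v0/[E0] (s^-1) from equation 1; s, km and ki in mM."""
    return kcat * s / (km + s + s * s / ki)


# Table 1 parameters for the tetrasaccharide 3d
KCAT, KM, KI = 8.86, 0.79, 36.4

# d(v0)/d[S] = 0 when KM - [S]^2/KI = 0, i.e. [S] = sqrt(KM * KI)
s_opt = math.sqrt(KM * KI)
v_opt = v0_over_e0(s_opt, KCAT, KM, KI)

for s in (0.5, 1.0, 2.0, 5.0, 10.0, 20.0):
    print(f"[S] = {s:5.1f} mM   v0/[E0] = {v0_over_e0(s, KCAT, KM, KI):.2f} s^-1")
print(f"optimum [S] = {s_opt:.2f} mM, v0/[E0] at optimum = {v_opt:.2f} s^-1")
```

Because of the [S]²/KI term, the observed maximum rate stays below kcat even at the optimum substrate concentration, which is why kcat has to be obtained from the fit rather than read off the plateau.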

The dependence of kcat and KM on the degree of polymerization of the substrates approaches a plateau between the tetrasaccharide 3d and the pentasaccharide 3e. This indicates that the binding site cleft is composed of 4 subsites for glucopyranose units on the non-reducing side of the scissile glycosidic bond. This result was further analyzed by computer molecular modelling on the X-ray structure of the Bacillus licheniformis enzyme. Docking of a hexasaccharide into the binding cleft, based on the 3D structure of the covalent protein-inhibitor complex between the hybrid B. amyloliquefaciens-B. macerans 1,3-1,4-β-glucanase and 3,4-epoxybutyl β-cellobioside [6], is consistent with a cleft containing 4 subsites, in which a fifth glucopyranose unit on the non-reducing end is not conformationally restricted by interaction with the protein in the productive complex, but rather lies outside the cleft and is solvent exposed.


Subsite mapping.

Binding of a substrate to the multi-subsite cleft of a depolymerase may give rise to different positional isomers. The uncompetitive substrate inhibition observed for compounds 3b-e provides evidence for multiple binding modes, only one of which leads to a productive complex followed by specific hydrolysis of a single glycosidic bond. The general kinetic model accounts for all possible binary and ternary complexes,

$$ E + S_n \underset{k_{-1,i,n}}{\overset{k_{+1,i,n}}{\rightleftharpoons}} E{\cdot}S_{i,n}, \qquad E{\cdot}S_{r,n} \xrightarrow{k_{2,r,n}} E + P_{n-1} + X $$

$$ E{\cdot}S_{i,n} + S_n \rightleftharpoons E{\cdot}S_{i,n}{\cdot}S_{j,n} $$

where i and j indicate the binding mode (positional isomer), n is the length of the oligosaccharide, r denotes the productive complex, and E and S_n are the free enzyme and substrate. From steady-state considerations, the experimental kcat, KM and KI parameters relate to the microscopic constants of this model by

$$ k_{cat} = \frac{k_{2,r,n}/K_{r,n}}{1/K_{r,n} + \sum_{i \neq r} 1/K_{i,n}}, \qquad K_M = \frac{1}{1/K_{r,n} + \sum_{i \neq r} 1/K_{i,n}} $$

$$ K_I = \frac{1/K_{r,n} + \sum_{i \neq r} 1/K_{i,n}}{\sum_i \sum_j 1/(K_{i,n} K_{i,j,n})}, \qquad \left( \frac{k_{cat}}{K_M} \right)_n = \left( \frac{k_{2,r}}{K_r} \right)_n $$

K_{r,n} and k_{2,r,n} being the microscopic Michaelis constant and the hydrolytic coefficient for the productive process, and K_{i,n} (i ≠ r) and K_{i,j,n} the microscopic dissociation constants for the unproductive binary and ternary complexes. Only the second-order rate constant (kcat/KM) is independent of the eventual formation of unproductive complexes. The subsite mapping model assumes that the free energy of binding at each subsite is an intrinsic constant, unaffected by binding or absence of binding at any other subsite [11], so

$$ \Delta G^{b}_{i,n} = -RT \ln \frac{1}{K_{i,n}} = \sum_h \Delta G^{b}_h $$

ΔG^b_h being the contribution of subsite h to the free energy of binding (ΔG^b_{i,n}) for a substrate of length n in binding mode i. Because K_{r,n} and k_{2,r,n} are not experimentally accessible, and assuming that the intrinsic binding energy can be used both for substrate binding and for lowering the activation free energy in catalysis, the contribution of each subsite to ligand binding cannot be calculated. Additional substrates that map the reducing end of the binding cleft are required to complete the analysis and to estimate binding energies. However, the effect of occupying subsite h on the second-order rate constant kcat/KM is the contribution of that individual subsite to transition state stabilization, and can be expressed as the difference in transition state activation energy between two substrates differing by one glucopyranose unit:

$$ \Delta G^{\ddagger}_{r,h=n+1} = \Delta G^{\ddagger}_{r,n+1} - \Delta G^{\ddagger}_{r,n} = -RT \ln \frac{(k_{cat}/K_M)_{n+1}}{(k_{cat}/K_M)_n} $$

The values of ΔG‡h are calculated from the data in Table 1 and summarized in Table 2.

Table 2
Contribution of subsites to transition state stabilization.

Subsite   Binding mode   ΔG‡h (kcal·mol⁻¹)
-I        3-O-Glcp       -
-II       4-O-Glcp       -2.1 ± 0.2
-III      4-O-Glcp       -3.5 ± 0.2
-IV       4-O-Glcp       -1.2 ± 0.2
-V                       -0.11 ± 0.07
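The Table 2 entries follow from the kcat/KM values of Table 1 through ΔG‡h = -RT ln[(kcat/KM)n+1/(kcat/KM)n]. The sketch below repeats that arithmetic, assuming R = 1.987 × 10⁻³ kcal·mol⁻¹·K⁻¹ and T = 326.15 K (53 °C, the assay temperature); the dictionary and function names are illustrative.

```python
import math

R = 1.987e-3          # gas constant, kcal mol^-1 K^-1
T = 273.15 + 53.0     # assay temperature in K

# kcat/KM (M^-1 s^-1) from Table 1
KCAT_OVER_KM = {"3a": 0.348, "3b": 8.27, "3c": 1700.0, "3d": 11200.0, "3e": 13300.0}

# Subsite newly occupied on going from each substrate to the next longer one
STEPS = [("3a", "3b", "-II"), ("3b", "3c", "-III"), ("3c", "3d", "-IV"), ("3d", "3e", "-V")]


def ddg_ts(shorter, longer):
    """Transition state stabilization (kcal/mol) gained by the extra Glcp unit."""
    ratio = KCAT_OVER_KM[longer] / KCAT_OVER_KM[shorter]
    return -R * T * math.log(ratio)


subsite_ddg = {site: ddg_ts(short, long_) for short, long_, site in STEPS}

for site, value in subsite_ddg.items():
    print(f"subsite {site:>4}: {value:6.2f} kcal/mol")
```

Each added glucopyranose unit enters only as a ratio of second-order rate constants, so the result does not depend on the units chosen for kcat/KM.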

Figure 2. Subsites on the non-reducing end of the bound carbohydrate and their contribution to transition state stabilization.

Binding of 4-O-substituted glucopyranose units to subsites -II to -IV stabilizes the enzyme transition state complex, with the largest contribution from subsite -III (Figure 2). A virtual subsite -V would contribute only about 0.1 kcal·mol⁻¹, a value assigned to nonspecific interactions between the edge of the binding cleft and a fifth Glcp unit facing the bulk solvent in a loose conformation. Kinetics with substrates cannot give any information about subsite -I, since it is always occupied in the productive complex.

2.3. Contribution of subsite -I to substrate specificity

Because the enzyme is a heterodepolymerase, the contribution of each subsite varies depending on whether it is occupied by a 3-O- or a 4-O-substituted glucopyranose unit. Beyond the requirement for proper orientation of the scissile glycosidic bond of the substrate, the strict substrate specificity shown by the enzyme suggests a preference for a 3-O-substituted glucopyranose in subsite -I. Inhibition kinetics with β-glucan (1, 2) and cello- (4, 5) oligosaccharides were analyzed to address the question of subsite selectivity.

[Structures: β-glucan oligosaccharides 1 (n = 1) and 2 (n = 2); cello-oligosaccharides 4 (n = 1) and 5 (n = 2)]

All of them behave as competitive inhibitors of B. licheniformis 1,3-1,4-β-glucanase when the 4-methylumbelliferyl tetrasaccharide 3d is used as substrate. To avoid complex behaviour arising from substrate inhibition, the range of substrate concentration was chosen to be close to KM (0.33 mM < [S] < 1 mM, KM = 0.79 mM), far below KI (3d) = 36 mM (Table 1) under the same experimental conditions. Figure 3 shows the double reciprocal plot for 2 as an example. Inhibition constants are compiled in Table 3.

[Double reciprocal plot: 1/v0 versus 1/[S] (mM⁻¹) at [I] = 0, 0.5, 1, 1.5 and 2 mM.]

Figure 3. Competitive inhibition by 2 using 3d as substrate.

Table 3
Inhibition constants for the competitive inhibitors 1, 2, 4 and 5.

Inhibitor        KI (mM)
1  G4G3G         12.0 ± 0.1
4  G4G4G         20.4 ± 0.3
2  G4G4G3G       2.28 ± 0.04
5  G4G4G4G       5.60 ± 0.11

Conditions: citrate-phosphate buffer pH 7.3, 53 °C, 0.1 mM CaCl2, [E] = 25 nM, [S] = 0.2-1 mM; [I]: 1 0-10 mM, 2 0-2 mM, 4 0-20 mM, 5 0-10 mM.
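As an illustration of what competitive inhibition means for the double reciprocal plot, the sketch below computes the line 1/v = (KM,app/kcat)(1/[S]) + 1/kcat for substrate 3d in the presence of inhibitor 2, with KM,app = KM(1 + [I]/KI). It assumes simple Michaelis-Menten behaviour, neglecting substrate inhibition, which is reasonable only in the low [S] range used for these measurements; velocities are expressed per unit enzyme and the function name is illustrative.

```python
def double_reciprocal_line(i_mM, kcat, km, ki):
    """Slope (mM s) and intercept (s) of 1/(v0/[E0]) versus 1/[S] at inhibitor level i_mM."""
    km_app = km * (1.0 + i_mM / ki)   # competitive inhibitor raises the apparent KM
    slope = km_app / kcat
    intercept = 1.0 / kcat            # unchanged by a competitive inhibitor
    return slope, intercept


KCAT, KM = 8.86, 0.79    # substrate 3d (Table 1)
KI_2 = 2.28              # inhibitor 2, G4G4G3G (Table 3)

for i_mM in (0.0, 0.5, 1.0, 1.5, 2.0):
    slope, intercept = double_reciprocal_line(i_mM, KCAT, KM, KI_2)
    print(f"[I] = {i_mM:.1f} mM   slope = {slope:.4f} mM s   intercept = {intercept:.4f} s")
```

All lines share the same intercept on the 1/v axis, while the slope doubles when [I] equals KI; this common intercept is the graphical signature used to classify 1, 2, 4 and 5 as competitive inhibitors.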

The tetrasaccharides 2 and 5 are better inhibitors than the trisaccharides 1 and 4, as may be expected for a glycosidase with an extended binding site. On the other hand, the β-glucan oligosaccharides 1 and 2, which have a β-1,3 glycosidic bond, have lower KI values than the corresponding cello-oligosaccharides 4 and 5. The change in free energy of binding associated with the structural modifications that relate any pair of inhibitors can then be calculated from the relationship Δ(ΔGb) = -RT ln[(KI)x/(KI)y], where x and y refer to each compound. As shown in the thermodynamic cycle of Figure 4, the total free energy change in going from the worst inhibitor (G4G4G, 4) to the best one (G4G4G3G, 2) is independent of the path taken. However, the effect of each modification on binding depends on the pair of inhibitors considered. Thus, addition of a new glucopyranose unit to the non-reducing end of the β-glucan trisaccharide (1→2, -1.08 kcal·mol⁻¹) makes a larger contribution than the same structural modification of the cello-trisaccharide (4→5, -0.84 kcal·mol⁻¹). Likewise, conversion of a β-1,4 to a β-1,3 linkage has a different effect for the trisaccharides (4→1, -0.35 kcal·mol⁻¹) than for the tetrasaccharides (5→2, -0.58 kcal·mol⁻¹). If the free energy of binding at each subsite is an intrinsic constant, unaffected by binding to any other subsite (subsite mapping model), this differential behaviour is interpreted as evidence of different binding modes for each enzyme-inhibitor complex.

[Thermodynamic cycle: G4G4G (4) → G4G4G4G (5), -0.84; G4G4G4G (5) → G4G4G3G (2), -0.58; G4G4G (4) → G4G3G (1), -0.35; G4G3G (1) → G4G4G3G (2), -1.08; overall 4 → 2, -1.43.]

Figure 4. Δ(ΔGb) in kcal·mol⁻¹ relating the competitive inhibitors 1, 2, 4 and 5.
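The thermodynamic cycle can be reproduced from the Table 3 inhibition constants with Δ(ΔGb) = -RT ln[(KI)x/(KI)y] for a step from inhibitor x to inhibitor y. This is a minimal sketch assuming R = 1.987 × 10⁻³ kcal·mol⁻¹·K⁻¹ and T = 326.15 K (53 °C); the names are illustrative. Small differences in the second decimal place relative to the quoted values come from rounding.

```python
import math

R = 1.987e-3          # gas constant, kcal mol^-1 K^-1
T = 273.15 + 53.0     # assay temperature in K

# Competitive inhibition constants (mM) from Table 3
KI = {"1": 12.0, "2": 2.28, "4": 20.4, "5": 5.60}


def ddg_b(x, y):
    """Change in binding free energy (kcal/mol) on going from inhibitor x to inhibitor y."""
    return -R * T * math.log(KI[x] / KI[y])


path_via_5 = ddg_b("4", "5") + ddg_b("5", "2")   # add a Glcp unit, then switch the linkage
path_via_1 = ddg_b("4", "1") + ddg_b("1", "2")   # switch the linkage, then add a Glcp unit
direct = ddg_b("4", "2")

for x, y in (("4", "5"), ("5", "2"), ("4", "1"), ("1", "2")):
    print(f"{x} -> {y}: {ddg_b(x, y):6.2f} kcal/mol")
print(f"4 -> 2 via 5: {path_via_5:.2f}, via 1: {path_via_1:.2f}, direct: {direct:.2f}")
```

The two paths necessarily give the same total, because free energy is a state function; what carries mechanistic information is that the same structural change (for example, adding one glucopyranose unit) is worth a different amount on each side of the cycle.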

Any microscopic interpretation should be handled with care, since the measured KI values may be a function of all possible equilibria between free enzyme and inhibitor, and of a number of binary or even ternary complexes in different binding modes. A preferential binding mode, however, can be expected given the specificity of the enzyme. A binding model consistent with the above data is proposed and summarized in Figure 5.

[Scheme: occupancy of subsites -IV to -I by inhibitors 4, 5, 1 and 2, with the Δ(ΔGb) values relating them: 4 → 5, ΔG-I(1→4) (-0.84); 4 → 1, ΔG-I(1→3) - ΔG-IV (-0.35); 5 → 2, ΔG-I(1→3) - ΔG-I(1→4) (-0.58).]

Figure 5. Proposed binding model describing the preferred binding modes. Symbols denote 4-O-Glcp and 3-O-Glcp units. In parentheses, Δ(ΔGb) in kcal·mol⁻¹.

The β-glucan oligosaccharides 1 and 2 bind preferentially at subsites -I to -III or -IV, positioning the β-1,3 glycosidic bond between subsites -I and -II, as may be expected from the substrate specificity shown by the enzyme. The Δ(ΔGb) value of -1.08 kcal·mol⁻¹ that relates 1 and 2 is thus the free energy of binding assignable to subsite -IV. This value is very close to that determined previously for ΔG‡-IV (-1.2 ± 0.2 kcal·mol⁻¹, see Section 2.2), suggesting that the contribution of this subsite to transition state stabilization is mainly due to stabilization of the enzyme-substrate complex. The cello-trisaccharide 4 will not bind in the same way as the β-glucan trisaccharide 1. If subsite -I prefers a 3-O-substituted Glcp, and binding to subsite -IV is exergonic, 4 will fill subsites -II to -IV. On going to the cello-tetrasaccharide 5, however, the additional glucopyranose may either lie outside the binding site cleft or fill subsite -I. Since KI for 5 is lower than for 4 (with an associated Δ(ΔGb) of -0.84 kcal·mol⁻¹), the cello-tetrasaccharide cannot simply occupy the same subsites -II to -IV, because the fourth Glcp would then make no contribution to binding; rather, it fills subsite -I, resulting in a binding mode similar to that proposed for the β-glucan tetrasaccharide 2. Therefore, Δ(ΔGb) between 5 and 2 accounts for the preference of subsite -I for a 3-O-substituted over a 4-O-substituted glucopyranose unit (-0.6 kcal·mol⁻¹). From this model, the contribution of subsite -I to binding a 3-O-substituted glucopyranose unit of β-glucan oligosaccharides in the normal and preferential binding mode can be evaluated as 1.4 kcal·mol⁻¹. The proposed model is tentative and requires further analysis: evaluation of the inhibitory capacity of the α and β anomers, extension of the analysis to larger oligosaccharides to test the consistency of the model, and determination of the 3D structure of some enzyme-inhibitor complexes.

2.4. Subsite +I has a minor contribution to binding but a large contribution to catalysis

The 4-methylumbelliferyl glycosides 3a-e reported as substrates (see Section 2.1) have an aromatic aglycon that fills subsite +I in the productive complex. To evaluate the effect of the leaving group in the hydrolytic reaction, as well as the contribution of subsite +I to the enzyme specificity, compounds 6 and 7 were prepared and assayed with the B. licheniformis enzyme. Together with the 4-methylumbelliferyl trisaccharide 3c, these compounds share a common fragment β-Glcp-(1→4)-β-Glcp-(1→3)-β-Glcp- that occupies subsites -I to -III in the productive complex (Figure 6). The methyl glycoside 6 is not hydrolyzed by the enzyme, as determined by HPLC and NMR monitoring. On the other hand, the methyl glycoside 7 is a good substrate. Initial velocities were determined by monitoring the increase in reducing power due to the newly formed reducing ends upon enzymic hydrolysis. The tetrasaccharide 7 (kcat/KM = 30000 M⁻¹ s⁻¹) reacts 18-fold faster than the 4-methylumbelliferyl trisaccharide 3c (kcat/KM = 1700 M⁻¹ s⁻¹), equivalent to a difference in transition state activation energy (ΔΔG‡) of 1.9 kcal·mol⁻¹. This difference in kcat/KM is mainly due to kcat, since both substrates have approximately the same KM values. If KM reflects the dissociation constant of the enzyme-substrate complex, this result suggests that the interaction of the glucopyranose unit of 7 with subsite +I is utilized in stabilizing the enzyme transition state, with no additional stabilization of the enzyme-substrate complex.

[Structures of 6, 3c and 7 aligned over subsites -IV, -III, -II, -I and +I. 6: no hydrolysis. 3c: KM = 2.7 mM, kcat = 4.58 s⁻¹. 7: KM = 2.6 mM, kcat/KM = 30000 M⁻¹ s⁻¹.]

Figure 6. Binding mode and kinetic parameters for substrates 3c, 6 and 7.

In terms of the enzyme mechanism, retaining glycosidases are believed to operate through a double displacement mechanism assisted by general acid-base catalysis [12]. Following protonation of the glycosidic oxygen by the general acid, the mechanism involves some development of a positively charged oxocarbonium ion that is stabilized by the catalytic nucleophile, either by electrostatic stabilization or by formation of a covalent glycosyl-enzyme intermediate. Since 7 and 3c differ only in the aglycon (leaving group), both leading to the same intermediate, the lower kcat value for 3c indicates that the rate-determining step in the hydrolysis of the 4-methylumbelliferyl β-glucan oligosaccharides is the initial glycosylation step leading to the glycosyl-enzyme intermediate, prior to its hydrolytic breakdown by a water molecule.

3. ACKNOWLEDGEMENTS

The authors thank Udo Heinemann (MCD, Berlin) for the resolution of the X-ray structure of the Bacillus licheniformis enzyme. This work was supported by Grant BIO94-0912-C02-02 from Comisión Interministerial de Ciencia y Tecnología (CICYT), Spain. C.M. acknowledges a fellowship from Generalitat de Catalunya.

4. REFERENCES

1 P.B. Høj, R. Condron, J.C. Traeger, J.C. McAuliffe and B.A. Stone, J. Biol. Chem., (1992) 267, 25059.
2 F.W. Parrish, A.S. Perlin and E.T. Reese, Can. J. Chem., (1960) 38, 2094.
3 B.A. Stone and A.E. Clarke, in Chemistry and Biology of (1→3)-β-Glucans, La Trobe University Press, Australia, 1992.
4 A. Planas, M. Juncosa, J. Lloberas and E. Querol, FEBS Lett., (1992) 308, 141.
5 M. Juncosa, J. Pons, T. Dot, E. Querol and A. Planas, J. Biol. Chem., (1994) 269, 14530.
6 T. Keitel, O. Simons, R. Borriss and U. Heinemann, Proc. Natl. Acad. Sci. U.S.A., (1993) 90, 5287.
7 T. Keitel, M. Meldgaard and U. Heinemann, Eur. J. Biochem., (1994) 222, 203.
8 C. Malet, J. Jiménez-Barbero, M. Bernabé, C. Brosa and A. Planas, Biochem. J., (1993) 296, 753.
9 M.A. Anderson and B.A. Stone, FEBS Lett., (1975) 52, 202.
10 C. Malet, J.L. Viladot, A. Ochoa, B. Gállego, C. Brosa and A. Planas, Carbohydr. Res., in press (1995).
11 J.D. Allen, Methods Enzymol., (1980) 64, 248.
12 M.L. Sinnott, Chem. Rev., (1990) 90, 1171.


Probing of glycosidase active sites through labeling, mutagenesis and kinetic studies

Stephen G. Withers

Protein Engineering Network of Centres of Excellence and Department of Chemistry, University of British Columbia, Vancouver, British Columbia, Canada, V6T 1Z1

Abstract

Glycosidases which hydrolyse their substrates with net retention of anomeric configuration (retaining enzymes) do so via a double-displacement mechanism in which a covalent glycosyl-enzyme is formed and hydrolysed with acid/base catalytic assistance via oxocarbenium ion-like transition states. Strategies for identification of the nucleophilic residue and the acid/base catalytic residue have been devised. By use of mechanism-based inhibitors to trap the glycosyl-enzyme intermediate and affinity labels to derivatise the acid/base catalyst, in combination with novel mass spectrometric techniques, the identities of these residues have been determined in several glycosidases. In all cases identified to date these have turned out to be the carboxylic amino acids glutamate and aspartate. Armed with this knowledge, a retaining glycosidase has been converted into an inverter through replacement of the nucleophilic glutamate by alanine, along with the addition of alternative anionic nucleophiles such as azide and formate to the reaction medium. This provides the first example in which such a change of mechanism in a glycosidase has been effected.

1. INTRODUCTION

The principal criterion by which glycosidases can be classified mechanistically is the stereochemical outcome of the bond cleavage which they carry out. One group of glycosidases hydrolyses the glycosidic linkage with net inversion of anomeric configuration, and these are known as inverting enzymes. Thus an inverting α-glucosidase, for example, would hydrolyse an α-glucoside substrate yielding β-glucose as the first-formed sugar product. Mutarotation would of course subsequently equilibrate this to the thermodynamic mixture of α- and β-glucose. The second group of glycosidases hydrolyses the glycosidic linkage with net retention of anomeric configuration, and these are known as retaining enzymes. Different mechanisms are required for such different stereochemical outcomes, and indeed the fundamentals of these mechanisms were proposed over 40 years ago by Koshland [1]. These are shown in the diagram below in which the enzyme active sites are represented by boxes containing two key carboxylic acid residues. Clearly a large number of other residues are essential for specificity and catalysis, particularly those whose function is to bind to and stabilise the substrate as it approaches its transition state conformation. However, these have been omitted from this diagram in the interests of clarity.

[Scheme: RETAINING MECHANISM and INVERTING MECHANISM, each active site drawn as a box containing two carboxylic acid residues.]

As is suggested by this diagram, there are considerable similarities between these two classes of enzymes in that both contain a pair of carboxylic acid residues suitably disposed on either side of the bond to be cleaved. However, as will become apparent, the roles of these two residues are different. In the case of the inverting enzymes, the mechanism involves a direct displacement in which the water attacks at the anomeric centre with general base catalytic assistance from one of the carboxylates. In a somewhat symmetrical manner, the other carboxylic acid functions as a general acid catalyst, protonating the leaving glycosidic oxygen in a process concerted with bond breaking. This results in hydrolysis with net inversion of anomeric configuration. The mechanism for the retaining enzymes, by contrast, involves a double displacement reaction in which a covalent glycosyl-enzyme intermediate is formed and hydrolysed. In the first step the deprotonated carboxylate functions as a nucleophile, attacking at the anomeric centre and displacing the glycosidic oxygen in a process which occurs with general acid catalytic assistance from the protonated carboxylic acid. This results in the formation of a covalent glycosyl-enzyme intermediate. In the second step water attacks at the anomeric centre of the glycosyl-enzyme species with general base-catalytic assistance from the deprotonated carboxylate, displacing the enzymatic active site nucleophile and releasing product sugar and free enzyme.

Both these steps, as well as the single step for the inverting enzyme, proceed via transition states with considerable oxocarbenium ion character. This means that, formally, bond cleavage precedes attack by the nucleophile, reaction proceeding via a dissociative (SN1) pathway. However, there is considerable evidence that there is still some degree of pre-association by the nucleophile, be it water or the active site carboxylate, presumably as one way of helping to stabilise the relatively unstable oxocarbenium ion [2]. The best evidence for the involvement of these oxocarbenium ion-like transition states has come from secondary deuterium kinetic isotope effect studies, which provide insight into changes in hybridisation state at the anomeric centre between the ground state for the step in question and its subsequent transition state. Isotope effects between 1.05 and 1.25 have been measured for both steps (glycosylation and deglycosylation) for β-glycosidases [3,4], and recently large isotope effects have been measured for α-glycosidases [5]. Additional evidence has come from the potent inhibition afforded by inhibitors such as glyconolactones and amidinosugars, which bear considerable structural resemblance to the proposed transition state [6]. Such inhibitors are, of course, recruiting the strong binding interactions which have evolved for stabilisation of the transition state conformation of the cleaving substrate.

While the mechanism of retaining glycosidases is most conveniently described as proceeding through a covalent glycosyl-enzyme intermediate via oxocarbenium ion-like transition states, it is also quite possible that the oxocarbenium ion species flanking this covalent intermediate could be sufficiently stable to exist as short-lived species in their own right. This does not, however, significantly change the mechanistic description.

2. CARBOXYLIC ACIDS PLAYING A ROLE IN CATALYSIS

2.1. Identification of the active site carboxylic acids in retaining glycosidases

Knowledge of the identities of the two key carboxylic acid-containing amino acids (aspartate or glutamate) in glycosidases is of considerable importance not only academically, but also for any structure/function or mutagenic studies as well as in cases where mutants are being made to provide enzymes of altered properties. A very effective, though typically rather labor-intensive, way of getting this information, as well as insight into the identities of the other active site amino acids, is through X-ray crystallographic determination of the 3-dimensional structure of the protein. Indeed, the structures of a number of glycosidases have now been determined crystallographically, and a compilation of the recently determined structures is provided in a recent review [7]. Frequently, however, this technique is not applicable, or available, and it is necessary to resort to other methods of identification. Further, even when a three-dimensional structure is available it is not always clear which amino acids are important unless the structure of an enzyme/substrate or enzyme/inhibitor complex has been determined. Even then it is also possible that the substrate or analogue may be bound in an unproductive mode, leading to a misassignment.

An alternative approach involves the use of affinity labels or, far preferably, mechanism-based inactivators, to specifically derivatise the key amino acids in the active site. This requires the development of reagents which are capable of selectively reacting with and covalently "marking" the residues of interest. This manuscript reviews methods developed in the author's laboratory for the identification of these residues. Identification of active site nucleophiles in glycosidases is carried out using a class of mechanism-based inactivator which traps the glycosyl-enzyme intermediate, in conjunction with electrospray tandem mass spectrometric methods for the localisation and sequencing of the labeled peptide within proteolytic digests. Identification of the acid/base residue can sometimes be achieved using affinity labels, or via detailed kinetic analysis of mutants modified at conserved glutamic and aspartic acids.

2.2. Identification of the active site nucleophile

As noted above, the mechanism for retaining glycosidases involves the formation of a glycosyl-enzyme intermediate in which the sugar is covalently attached to the protein via the carboxylic side chain of a glutamic or aspartic acid. One way of identifying the nucleophilic amino acid residue would be to trap this intermediate by somehow decreasing its deglycosylation rate. This requires differential manipulation of the rates of formation (glycosylation) and hydrolysis (deglycosylation) of the intermediate to slow the deglycosylation step enormously from its typical values (t1/2 around 1-10 ms) without ultimately slowing the glycosylation step to as great an extent, since if this step were also slowed enormously, no intermediate would accumulate. This was achieved through the use of 2-deoxy-2-fluoro glycosides with good leaving groups such as dinitrophenolate or fluoride [8,9]. The presence of the C-2 fluorine substituent slows both the glycosylation and deglycosylation steps in two ways. One way relates to the fact that the hydroxyl substituent at C-2 plays a crucial role in transition state stabilisation in glycosidases by making key interactions (worth more than 8 kcal/mol), probably predominantly hydrogen bonding interactions, with the enzyme active site [10,11].
Replacement of the hydroxyl by fluorine, a substituent of limited hydrogen bonding potential, likely deletes, or at least decreases the value of, these hydrogen bonding interactions, destabilising both transition states significantly. The second way in which the fluorine substituent slows the two steps is through inductive destabilisation of the two electron-deficient transition states. Fluorine is much more electronegative than a hydroxyl, thus the positive charge developed at the transition state will be significantly destabilised by the presence of this substituent, slowing down both steps. The consequence of these two effects combined is a massive (up to 10⁶ to 10⁷-fold) reduction in rates of both steps [12,13]. Incorporation of a good leaving group such as 2,4-dinitrophenolate or fluoride speeds up the glycosylation step relative to the deglycosylation step, with the effect that the intermediate is accumulated. Incubation of the enzyme with its corresponding 2-deoxy-2-fluoroglycoside then results in time-dependent inactivation, via the accumulation of a relatively stable 2-deoxy-2-fluoroglycosyl-enzyme intermediate.
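The two figures quoted above, interactions at C-2 worth more than 8 kcal/mol and rate reductions of up to 10⁶ to 10⁷-fold, can be related through the usual transition state theory expression, rate ratio = exp(ΔΔG‡/RT). The following is a minimal sketch of that arithmetic only; the temperature of 298.15 K and the function names are assumptions made for illustration and are not taken from the original work.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def rate_factor(ddg_kcal_per_mol, temp_k=298.15):
    """Fold reduction in rate caused by a transition-state destabilisation.

    temp_k is an assumed temperature; the text does not state one.
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temp_k))


def ddg_from_factor(fold, temp_k=298.15):
    """Transition-state destabilisation (kcal/mol) implied by a fold rate reduction."""
    return R_KCAL * temp_k * math.log(fold)


if __name__ == "__main__":
    # Loss of 8 kcal/mol of transition-state binding energy
    print(f"8 kcal/mol corresponds to a {rate_factor(8.0):.1e}-fold rate reduction")
    # Energies implied by the quoted range of rate reductions
    for fold in (1e6, 1e7):
        print(f"{fold:.0e}-fold corresponds to {ddg_from_factor(fold):.1f} kcal/mol")
```

Under this assumption, the loss of 8 kcal/mol of transition state stabilisation alone would account for a rate reduction approaching 10⁶-fold, which is in line with the magnitude reported for the 2-fluoro sugars.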

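[Scheme: inactivation of a retaining glycosidase by a 2-deoxy-2-fluoro glycoside bearing a dinitrophenolate leaving group, giving the 2-deoxy-2-fluoroglycosyl-enzyme intermediate.]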

Supporting evidence for this mechanism of inactivation has been obtained in several ways. Firstly, stoichiometric reaction of inhibitor and enzyme has been demonstrated by electrospray mass spectrometry and by measurement of the magnitude of the "burst" of dinitrophenolate released [13]. Secondly, 19F-NMR studies of the inactivated enzyme have demonstrated the formation of a covalent α-D-glycopyranosyl-enzyme intermediate [14]. Thirdly, and very importantly, the intermediate has been demonstrated to be catalytically competent. This was achieved by purifying inactivated enzyme of contaminating inactivator, then monitoring the reactivation of the enzyme consequent on turnover of this intermediate and release of the free enzyme [13,15]. Reactivation can be greatly accelerated by the inclusion of a suitable sugar acceptor in the reactivation mixture, such that turnover occurs via transglycosylation, a reaction typical of glycosidases.

Identification of the amino acid residue labelled has been achieved in two distinct ways. The first of these required the synthesis of a radiolabelled version of the inactivator to generate a radiolabelled enzyme. Standard methods of proteolysis, HPLC separation of the resultant peptide mixture, and purification and ultimately sequencing of the radiolabelled peptide were then employed. This strategy was used successfully to identify the active site nucleophiles of four glycosidases: the β-glucosidase from Agrobacterium faecalis, the exo-xylanase/glucanase from Cellulomonas fimi, the β-galactosidase from Escherichia coli, and endoglucanase C from Clostridium thermocellum [15-18]. An alternative strategy, which obviates the need for synthesis of radiolabelled versions of the inactivator, involves the use of electrospray tandem mass spectrometry to identify the labeled peptide within a proteolytic digest, by monitoring for a collision-induced fragmentation reaction that is specific to the sugar-peptide linkage.

This is illustrated below and in Figure 1 using the identification of the catalytic nucleophile in human β-glucocerebrosidase (GCase) by inactivation of the enzyme with 2-deoxy-2-fluoro-β-glucosyl fluoride as an example [19]. Human β-glucocerebrosidase is a lysosomal enzyme that cleaves the β-glucosidic linkage of glucosylceramide. It is a membrane-associated glycoprotein (67 kDa, 497 amino acids) with both high mannose and complex oligosaccharides [20], whose crucial role in glycolipid catabolism has been established by disruption of the GCase gene in mice. Such mice die within hours of birth [21]. Inherited deficiencies of the human enzyme result in the variants of Gaucher disease, the most prevalent lysosomal storage disorder, affecting some 20,000 to 30,000 individuals worldwide. As such, this disease has been a prototype for the development of enzyme replacement therapies using specifically oligosaccharide-modified enzyme for targeting to the major sites of pathologic involvement, the macrophages of the liver, spleen and bone marrow [22-24]. Given the clinical, and commercial, importance of this enzyme, considerable interest in the identities of active site residues is warranted, especially given the desire to understand the reasons for the inactivity of the enzyme in naturally occurring mutants.

Treatment with 2-deoxy-2-fluoro-β-glucosyl fluoride (2FGlcF) resulted in time-dependent inactivation of the enzyme according to pseudo-first order kinetics, yielding a value for the second order inactivation rate constant of ki/Ki = 22.7 min⁻¹ M⁻¹. Inclusion of castanospermine (8.3 µM), a known competitive inhibitor of GCase (Ki = 7 µM), in an inactivation mixture containing 10.4 mM 2FGlcF reduced kobs, the pseudo-first order inactivation rate constant, from 0.23 min⁻¹ to 0.15 min⁻¹, showing that the inactivation was a consequence of reaction at the active site.
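The relationship between the second order inactivation rate constant and the observed pseudo-first order rate constant can be checked with a short calculation. The sketch below assumes that the inactivator concentration is well below Ki, so that kobs = (ki/Ki)[I], and treats protection by a competitive inhibitor with the simple factor 1/(1 + [C]/Ki,C). Both the model and the function names are illustrative assumptions rather than the analysis used in the original study.

```python
def k_obs(k_i_over_K_i, inactivator_M):
    """Pseudo-first-order inactivation rate constant (min^-1).

    Assumes [I] << K_i, so that k_obs = (k_i / K_i) * [I].
    """
    return k_i_over_K_i * inactivator_M


def k_obs_protected(k_i_over_K_i, inactivator_M, competitor_M, competitor_Ki_M):
    """k_obs when a competitive inhibitor occupies part of the free enzyme.

    Simple competitive model (an assumption): the rate is divided by
    (1 + [C] / K_i of the competitor).
    """
    return k_obs(k_i_over_K_i, inactivator_M) / (1.0 + competitor_M / competitor_Ki_M)


if __name__ == "__main__":
    # Values quoted in the text: 22.7 min^-1 M^-1 and 10.4 mM 2FGlcF
    unprotected = k_obs(22.7, 10.4e-3)
    # Castanospermine at 8.3 uM with K_i = 7 uM
    protected = k_obs_protected(22.7, 10.4e-3, 8.3e-6, 7e-6)
    print(f"k_obs without competitor: {unprotected:.3f} min^-1")
    print(f"k_obs with competitor (simple model): {protected:.3f} min^-1")
```

The unprotected value agrees with the measured 0.23 min⁻¹. The simple competitive model predicts protection in the same direction as that observed, although it is not expected to reproduce the measured value of 0.15 min⁻¹ exactly.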
Further, the intermediate was shown to be catalytically competent by removal of excess inactivator from the labeled enzyme and then monitoring the return of activity. This was a first order process, with a spontaneous reactivation rate constant, kreac, of 5.3 × 10⁻⁴ min⁻¹, corresponding to a half-life of t1/2 = 1300 minutes.

The site of labeling was then identified by proteolysis of the labeled enzyme followed by HPLC separation, using an electrospray ionisation mass spectrometer (ESMS) to identify the labeled peptide by observing a specific, predicted fragmentation reaction, as follows. Peptic hydrolysis of 2FGlc-labeled GCase resulted in a mixture of peptides which was separated by reversed phase HPLC using the ESMS as detector. When the spectrometer was scanned in the normal LC/MS mode, the total ion chromatogram (TIC) of the 2FGlc-labeled GCase digest displayed a large number of peaks, which arise from every peptide in the mixture (Fig. 1a). The 2-fluoroglucosylated peptide was then identified in a second run by using the tandem mass spectrometer set up in the neutral loss mode. In this technique the ions are subjected to limited fragmentation by collisions with an inert gas (Ar) in a collision cell located between the two quadrupole mass filters. Since the ester linkage between the sugar inhibitor and the peptide is quite labile under these conditions, relatively facile homolytic cleavage of this bond occurs with loss of a neutral sugar residue of known mass (165 Da), leaving the peptide moiety with its original charge. The two quadrupole analysers on either side of the collision cell can then be scanned in a linked manner in which the scanning of the two analysers is offset by the mass of the anticipated "lost" neutral species. Under these conditions only ions differing in m/z by the mass of the lost sugar moiety (165 Da) can pass through both quadrupoles and be detected.

Figure 1. ESMS experiments on glucocerebrosidase proteolytic digests.

As can be seen in Figure 1b, when the spectrometer was scanned in this neutral loss tandem MS/MS mode searching for the mass loss m/z 165, corresponding to the loss of the 2FGlc label

[Figure 2 plot: relative intensity (%) against m/z; labeled peaks include FASEA+2FGlc and FASE.]

Figure 2. Tandem MS/MS daughter ion spectrum of the 2FGlc-labeled peptide.
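The selection principle of the neutral loss scan described above can be sketched as a simple filter: a precursor ion is reported only if collision produces a fragment whose m/z is lower by the mass of the lost neutral divided by the charge. The example below is a simplified model for illustration only. The digest values, the tolerance and the function name are hypothetical, and a real instrument performs this selection by linked scanning of the two quadrupoles rather than by post hoc comparison of lists.

```python
from typing import List, Tuple

NEUTRAL_LOSS_DA = 165.0  # mass of the neutral 2FGlc residue, as given in the text


def neutral_loss_hits(
    precursors: List[Tuple[float, List[float]]],
    charge: int = 1,
    loss_da: float = NEUTRAL_LOSS_DA,
    tol: float = 0.5,
) -> List[float]:
    """Return precursor m/z values that show a fragment offset by loss_da / charge.

    precursors holds (precursor m/z, list of fragment m/z) pairs.
    tol is an assumed m/z tolerance.
    """
    offset = loss_da / charge
    hits = []
    for precursor_mz, fragment_mzs in precursors:
        target = precursor_mz - offset
        if any(abs(f - target) <= tol for f in fragment_mzs):
            hits.append(precursor_mz)
    return hits


# Hypothetical digest: only the second peptide carries the label.
digest = [
    (702.3, [555.2, 401.1]),  # unlabeled: no fragment at 702.3 - 165.0
    (850.4, [685.4, 520.2]),  # labeled: 850.4 - 165.0 = 685.4 is present
    (933.5, [800.1, 610.0]),  # unlabeled: no fragment at 933.5 - 165.0
]

if __name__ == "__main__":
    print(neutral_loss_hits(digest))
```

For multiply charged peptides the m/z offset is the neutral mass divided by the charge, which is why the charge state appears as a parameter.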

a true reflection of the activity of the enzyme, but results from contamination by wild type enzyme likely due to translational misincorporation, since equivalent levels of misincorporation have been observed previously [26,27]. Indeed, the activity observed is completely inactivatable by 2FGlcF at rates comparable to those of wild type enzyme, a result which is inconsistent with the expected enzymatic activity of a mutant in which the attachment site has been removed.

This method has now been applied to a number of different glycosidases, including some whose catalytic nucleophile had been identified previously, namely Agrobacterium faecalis β-glucosidase, Cellulomonas fimi exo-xylanase/glucanase and Clostridium thermocellum endoglucanase C [28], as well as the Bacillus subtilis xylanase [29]. The importance of these residues to catalysis has been confirmed in each case by kinetic analysis of mutants modified at those positions, and their active site location has been confirmed by X-ray crystallographic analysis of two of the enzymes, the Cellulomonas fimi exo-xylanase/glucanase [30] and the Bacillus subtilis xylanase [31].

2.3. Identification of the acid/base residue by affinity labeling

Techniques for the identification of this residue are not as well developed, or as reliable, as those described for identification of the catalytic nucleophile. One method which has been successful in several cases involves the use of an affinity label based upon a sugar with a reactive anomeric N-bromoacetyl functionality. The hope with this class of labels is that the reactive N-bromoacetyl moiety will react preferentially with the active site carboxylic acid in whose proximity it is bound, as shown in the Scheme below.

[Scheme: reaction of a bound N-bromoacetyl glycosylamine affinity label with the nearby active site carboxylic acid.]

N-Bromoacetyl β-glycosylamines were first used as affinity labels with Escherichia coli β-galactosidase [32], the labeled amino acid being identified as Met501. More recently we have described their use as affinity labels for both the Agrobacterium faecalis β-glucosidase and the Cellulomonas fimi exo-xylanase/glucanase [33]. In the former case, mass spectrometric analysis of the enzyme labeled with N-bromoacetyl β-glucosylamine revealed that, even though simple pseudo-first order kinetics of inactivation had been determined, the inactivated enzyme was labeled with at least three equivalents of inactivator (Figure 3). The picture is further complicated by the presence of two native β-glucosidase species of masses 51,205 and 51,066, corresponding to species with and without an N-terminal methionine residue. The presence of this N-terminal methionine residue, due to incomplete processing at the higher expression levels employed, has previously been shown to have no effect upon kinetic parameters. However, in the case of the Cellulomonas fimi exo-xylanase/glucanase, inactivation by N-bromoacetyl β-cellobiosylamine resulted in the addition of only a single equivalent of inactivator per enzyme, as shown by the mass increase (Δm = 382) in Figure 4.

The multiple labeling seen for the β-glucosidase precluded simple identification of the site of labeling which led to inactivation, but the single labeling seen for the C. fimi exo-xylanase/glucanase provided an ideal system for study. Since it seemed unlikely that an appropriate neutral loss fragmentation would be observed in proteolytic digests of the labeled enzyme, the identification of the site of labeling was attempted via comparative analysis of HPLC/ESIMS profiles of peptic digests of the labeled and unlabeled enzymes, using the mass information to "realign" the profiles and thereby correct for the irreproducibility of the HPLC chromatograms. A search was thus made for a peptide which was present in the labeled sample, but absent from the unlabeled sample, and which was also greater in mass by exactly the mass of the label (Δm = 382) than another peak which was present in the unlabeled sample, but (ideally) absent from the labeled. After searching through all the peaks only one peptide was found which met these criteria. Figure 5 shows the composite mass spectrum of all the peptides eluting from the HPLC between 19 and 24 minutes, the region in which the peptide of interest eluted. As can be seen, a peptide of m/z = 1028 is present in the digest of the labeled protein (Figure 5B), but is not found in the equivalent digest of the unlabeled enzyme (Figure 5A). Correspondingly, a peptide of m/z = 646, smaller by the mass of the label (382), is found in the digest of the unlabeled enzyme, but only at lower intensity in the digest from the labeled. Ideally this peptide would not be found in the sample of the labeled protein digest, but is presumably present because of incomplete labeling or, more likely, because of partial degradation of the labeled peptide. A search of the amino acid sequence of Cex revealed only nine peptides of this mass which could be generated from this protein.

The peptide was then further purified by reversed phase (C-18) HPLC, using the mass spectrometer as the detector, and sequenced by the Edman degradation in collaboration with Dr. Ruedi Aebersold, using the modified Edman reagent developed by his group and a mass spectrometric detector [34,35]. The great advantage of this approach, in which a mass spectrometer is used to identify the product modified phenylthiohydantoins, is that the mass of the modified amino acid phenylthiohydantoin is directly obtained. Using this approach the sequence DVVNEA was determined for this peptide, where the 5th cycle resulted in the release of a phenylthiohydantoin derivative of m/z = 822. This is exactly the mass predicted for the phenylthiohydantoin of N-bromoacetylcellobiosylamine-modified glutamic acid.
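The comparative search described above amounts to looking for a peak that is specific to the labeled digest and lies exactly one label mass above a peak in the unlabeled digest. A minimal sketch of such a search is given below. It assumes singly charged ions, so that the m/z difference equals the label mass, and the tolerance, function name and all peak values other than 646 and 1028 are hypothetical.

```python
LABEL_MASS = 382.0  # mass added by the label, as given in the text


def find_label_shifted_pairs(unlabeled, labeled, label_mass=LABEL_MASS, tol=0.5):
    """Return (unmodified m/z, modified m/z) pairs consistent with a single label.

    A candidate must be present in the labeled digest, absent from the
    unlabeled digest, and lie label_mass above a peak of the unlabeled digest.
    Singly charged ions are assumed.
    """

    def present(mz, peaks):
        return any(abs(mz - p) <= tol for p in peaks)

    pairs = []
    for heavy in labeled:
        if present(heavy, unlabeled):
            continue  # also in the control digest, so not label-specific
        light = heavy - label_mass
        if present(light, unlabeled):
            pairs.append((light, heavy))
    return pairs


# Illustrative peak lists: only 646 and 1028 are taken from the text.
unlabeled_peaks = [646.0, 714.0, 806.0, 922.0, 1014.0]
labeled_peaks = [646.0, 714.0, 806.0, 922.0, 1014.0, 1028.0]

if __name__ == "__main__":
    print(find_label_shifted_pairs(unlabeled_peaks, labeled_peaks))
```

The sketch does not require the unmodified peptide to disappear from the labeled digest, in keeping with the observation that the m/z 646 peptide persisted at lower intensity.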

[Figures 3 and 4 plots: reconstructed mass spectra plotted against molecular weight.]

Figure 3. Reconstructed ESMS of β-glucosidase labeled with N-bromoacetylglucosylamine.

Figure 4. Reconstructed ESMS of C. fimi exo-glycanase, a) free, b) labeled.

Figure 5. Mass spectra of proteolytic peptides of C. fimi exo-glycanase, A) free, B) labeled.

Alignment of the sequence of this peptide with the amino acid sequence of the enzyme indicates that the modified residue corresponds to Glu127. Gratifyingly, this residue was recently identified as the most likely candidate for the acid/base residue on the basis of a detailed kinetic analysis of mutants modified at conserved glutamic and aspartic acid residues [36]. In addition, this residue has very recently been located in the active site of the enzyme through X-ray crystallographic determination of the 3-dimensional structure [30]. In addition to the above, the putative active site acid catalytic residue in the β-glucosidase from Manihot esculenta has recently been identified using N-bromoacetyl β-glucopyranosylamine as affinity label, locating the labeled peptide through use of radiolabels [37].

2.4. Identification of the acid/base residue by kinetic analysis of mutants

The alternative approach to identification of the acid catalytic residue involves performance of a detailed kinetic analysis of mutants in which strictly conserved glutamic and aspartic acids are, individually, replaced by alanine residues. The approach is illustrated with the exo-glucanase/xylanase from C. fimi (Cex). Cex belongs to a family of more than 20 enzymes capable of hydrolysing xylan and cellulose. Hydrolysis occurs with net retention of anomeric configuration [38], thus it is a 'retaining' glycosidase, following a double displacement mechanism [39]. Sequence comparisons permit the identification of a number of conserved amino acid residues in this family, among which are 6 glutamic and aspartic acids, the probable candidates for the acid/base catalyst as noted earlier. As expected, it is found that the catalytic nucleophile Glu233 identified using the fluorosugar approach [16] is indeed one of these conserved residues, thus establishing one of the others as the probable acid/base catalyst. The strategy used thus involves mutation of these conserved Glu and Asp residues to alanine, then detailed investigation of the kinetic properties of the mutants so generated. Such an investigation requires the application of several mechanistic tests, as described below.

As noted previously, the residue in question will function as an acid catalyst in the first step (glycosylation) and as a base catalyst in the second (deglycosylation). Therefore deletion of this residue might be expected to slow down both steps. However, if a substrate is used which has a very good leaving group not requiring acid catalysis for its departure, then the first step may not be significantly compromised for the mutant, but the second step necessarily will be, since this step is common for all substrates of a fixed sugar type. Mutants modified at Glu127 do indeed show this behaviour, as can be seen in Table 1 below. Rates of the first step, glycosylation, were assessed through kcat/Km measurements since in this system kcat/Km represents the first irreversible step, which is indeed glycosylation. Rates of deglycosylation were obtained from kcat values of substrates for which deglycosylation was rate-limiting. As can be seen, for an excellent substrate such as 2',4'-dinitrophenyl cellobioside (DNPC; pKa of leaving 2,4-dinitrophenol = 4.0) the first step (kcat/Km) is not significantly slowed by mutation, whereas for a substrate requiring some acid catalysis such as 4'-nitrophenyl cellobioside (PNPC; pKa of 4'-nitrophenol = 7.0) there is a significant rate reduction, and for substrates with leaving groups of very high pKa such as 4-bromophenol (PBrPC; pKa = 10) the rate was much lower again. These results are therefore completely consistent with the role of Glu127 as the acid/base catalyst.

Table 1
Kinetic parameters for hydrolysis of various substrates by Cex and E127 mutants

Enzyme       Substrate   kcat (min⁻¹)   Km (mM)   kcat/Km (min⁻¹ mM⁻¹)
Native Cex   DNPC        419            0.06      6983
             PNPC        677            0.53      1278
             PBrPC       255            2.0       128
E127A        DNPC        2.4            0.0003    7742
             PNPC        2.3            0.025     92
             PBrPC       4.0 × 10⁻²     1.9       2.1 × 10⁻²

A second consequence of removal of the acid/base catalyst is that a small cavity, possibly capable of accommodating an anion, will be generated at the β-face of the substrate adjacent to the anomeric centre. Such a site could permit the binding and attack of a nucleophilic anion at the anomeric centre. If a substrate with a good leaving group, such as DNPC, is employed, the second step (deglycosylation) will be rate-limiting, thus reaction of the intermediate with an anion more nucleophilic than water, and one not requiring general base catalysis, will result in an increase in the steady state rate. Such is indeed found to be the case, as is shown in Figure 6. Increasing concentrations of sodium azide as exogenous nucleophile result in progressive increases in kcat values. As can be seen, much greater overall rate increases are seen with DNPC (200-fold) than with PNPC (8-fold), rates reaching a limiting value at higher azide concentrations. This plateauing of rates is clearly due to a change in rate-determining step. As azide concentrations are increased, so the deglycosylation rate increases, and the steady state rate with it, up to the point at which the rate of the deglycosylation step becomes greater than that of the glycosylation step. Beyond this point no further rate increases are observed. The lower maximal rate observed with PNPC than with DNPC is therefore consistent with the smaller glycosylation rate constant for the substrate with the poorer leaving group.

This mechanism therefore requires that the reaction product in the presence of azide be β-cellobiosyl azide. This was clearly shown by chemical synthesis of β-cellobiosyl azide and comparison of its chromatographic behaviour and 1H-NMR spectrum with that of the reaction mixture. The two were shown to be identical. Interestingly, as can be seen in Figure 6, values of kcat/Km are essentially invariant with azide concentration. This is also completely consistent with the proposed mechanism since the parameter kcat/Km reflects the first irreversible step in catalysis, formation of the glycosyl-enzyme intermediate, and this step is not affected by nucleophilic competition with the water.
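Two quantitative points made in this section can be illustrated with a short sketch: the leaving group dependence of kcat/Km in Table 1, and the plateau in kcat at high azide concentration. The first part uses the tabulated kcat/Km values directly. The second part uses the standard two-step expression kcat = k2k3/(k2 + k3), with an assumed second order azide term added to the deglycosylation rate constant. The rate constants in the azide example and the function names are hypothetical and are chosen only to show the shape of the behaviour, not to fit Figure 6.

```python
# kcat/Km values (min^-1 mM^-1) as listed in Table 1
NATIVE = {"DNPC": 6983.0, "PNPC": 1278.0, "PBrPC": 128.0}
E127A = {"DNPC": 7742.0, "PNPC": 92.0, "PBrPC": 2.1e-2}


def glycosylation_penalty(substrate):
    """Fold decrease in kcat/Km on going from native Cex to E127A."""
    return NATIVE[substrate] / E127A[substrate]


def kcat_two_step(k2, k3, k_nuc=0.0, nucleophile_mM=0.0):
    """Steady-state kcat for glycosylation (k2) followed by deglycosylation.

    Deglycosylation is taken as k3 (water) plus an assumed second-order
    term k_nuc * [nucleophile] for an exogenous anion such as azide.
    """
    k_degly = k3 + k_nuc * nucleophile_mM
    return k2 * k_degly / (k2 + k_degly)


if __name__ == "__main__":
    for substrate in ("DNPC", "PNPC", "PBrPC"):
        print(f"{substrate}: {glycosylation_penalty(substrate):.3g}-fold lower kcat/Km in E127A")
    # Hypothetical constants: k2 = 480 min^-1, k3 = 2.4 min^-1, k_nuc = 10 min^-1 mM^-1
    for azide_mM in (0, 10, 100, 1000, 2000):
        rate = kcat_two_step(480.0, 2.4, k_nuc=10.0, nucleophile_mM=azide_mM)
        print(f"[azide] = {azide_mM} mM: kcat = {rate:.1f} min^-1")
```

In the two-step scheme kcat/Km equals k2 divided by the dissociation constant of the Michaelis complex, so it contains no deglycosylation term. This is why the model, like the data, shows kcat rising to a plateau set by k2 while kcat/Km is unaffected by the added nucleophile.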
These data therefore combine to strongly suggest that Glu127 is indeed the acid/base catalyst in Cex. This is, of course, the same residue that was identified using the N-bromoacetylcellobiosylamine affinity label, and as has been noted earlier, it has been found in the appropriate location in the active site by X-ray crystallographic analysis [30]. This concurrence of findings lends great confidence to this approach to identifying the acid/base residue.

[Figure 6 plots: kinetic parameters against azide concentration (mM).]

Figure 6. Kinetic parameters for hydrolysis of PNPC (panel A) and 2,4-DNPC (panel B) by Cex Glu127Ala in the presence of various concentrations of sodium azide.

3. MODIFYING THE MECHANISM

As noted in the introductory section of this manuscript, both inverting and retaining glycosidases have a pair of carboxylic acids present in their active site, playing crucial roles, albeit these roles are somewhat different in the two cases. In addition, both classes of enzyme catalyse hydrolysis via oxocarbenium ion-like transition states. There are therefore considerable similarities between the two classes of enzyme. However, one difference which would be expected is that the two carboxylic acids should be further apart in the inverting enzymes than in the retainers since, in addition to the bound glycoside, a water molecule has to be fitted in between the two groups. These expectations are indeed confirmed upon inspection of crystal structures of enzymes of both mechanistic classes. Such an analysis [7,27] revealed that the two carboxylic acids are approximately 5 Å apart (oxygen to oxygen) in retaining enzymes, but approximately 9.5 Å apart in inverting enzymes. This opened up the possibility of converting an enzyme from one mechanistic class to the other by changing the separation of their carboxylic acids.

The first attempt to do this on the Agrobacterium faecalis β-glucosidase, for which no three-dimensional structural information is available, involved simply the mutation of the nucleophilic carboxylate, Glu358, to an aspartic acid, thereby increasing the separation, but only by about 1 Å [40]. This mutant was indeed active, with an activity some 2500-fold reduced over wild type, but product analysis revealed that it still followed the retaining mechanism. Clearly a greater separation was necessary if the mechanisms were to be changed, but in the absence of 3-dimensional structural information it was not clear how this would be achieved.

A second alternative involved the complete removal of this residue, replacing it by alanine so that it could not follow a retaining mechanism. This could work well if general base catalytic assistance for the attack of water were not too important. Such a mutant was indeed constructed, but as is shown in Table 2 below, it was essentially completely catalytically inactive, with an apparent kcat value some 10⁷-fold lower than wild type enzyme. Clearly general base catalysis is very important, as has indeed been shown for several other naturally inverting glycosidases (see for example ref. 41). However, addition to this mutant of small nucleophilic anions which do not need general base catalytic assistance resulted in enormous regains of activity of up to 10⁵-fold, most of the way back up to wild type activity.

Table 2
Kinetic parameters for hydrolysis of 2,4-DNP-glucoside by Agrobacterium β-glucosidase and its Glu358Ala mutant.

Enzyme + activators      kcat (s^-1)    Km (mM)
Native glucosidase       89             0.031
E358A                    7 x 10^-6      0.1
E358A + 2 M azide        1.1            3.8
E358A + 4 M formate      3.0            1.1
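The fold changes quoted in the text can be checked directly against the kcat column of Table 2. The short Python sketch below does only that arithmetic; the dictionary keys and the helper name `fold_change` are illustrative choices, not part of the original work.

```python
# kcat values (s^-1) as listed in Table 2 for 2,4-DNP-glucoside hydrolysis.
KCAT = {
    "native": 89.0,
    "E358A": 7e-6,
    "E358A + 2 M azide": 1.1,
    "E358A + 4 M formate": 3.0,
}


def fold_change(numerator, denominator):
    """Return the ratio of two rate constants."""
    return numerator / denominator


# Loss of activity on replacing the nucleophile Glu358 by alanine.
loss_on_mutation = fold_change(KCAT["native"], KCAT["E358A"])

# Regain of activity when an external nucleophile is added to the mutant.
rescue = {
    name: fold_change(k, KCAT["E358A"])
    for name, k in KCAT.items()
    if name.startswith("E358A +")
}

# How far each rescued mutant still is from the native enzyme.
remaining_gap = {name: fold_change(KCAT["native"], KCAT[name]) for name in rescue}

if __name__ == "__main__":
    print(f"native / E358A: {loss_on_mutation:.2e}")
    for name in rescue:
        print(f"{name}: rescue {rescue[name]:.2e}-fold, "
              f"still {remaining_gap[name]:.0f}-fold below native")
```

On these numbers the mutation costs a little over 10^7-fold in kcat and the anions restore somewhat more than 10^5-fold, in line with the orders of magnitude given in the text.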

Further, analysis of the reaction product in the presence of sodium azide revealed that α-glucosyl azide was indeed being formed, presumably through the inverting mechanism shown below.

[Scheme: proposed inverting mechanism for formation of α-glucosyl azide by the Glu358Ala mutant in the presence of azide.]

This provides the first example of a change in mechanism being intentionally engineered into a glycosidase. It thereby serves both as a reassurance that the mechanisms and active site structures of these two classes of glycosidase (inverting and retaining) might not be so different, and as a demonstration that it is indeed possible to modify these enzymes in a defined way so as to alter their catalytic activities.

4. ACKNOWLEDGEMENTS

The author would like to thank the many co-workers and collaborators who made this work possible. Their names are listed in the references provided. I also thank the Natural Sciences and Engineering Research Council of Canada and the Protein Engineering Network of Centres of Excellence for financial support.

5. REFERENCES

1 D.E. Koshland, Biol. Rev., 28 (1953) 416.
2 N.S. Banait and W.P. Jencks, J. Amer. Chem. Soc., 113 (1991) 7958.
3 M.L. Sinnott, Chem. Rev., 90 (1990) 1171.
4 J.B. Kempton and S.G. Withers, Biochemistry, 31 (1992) 9961.
5 Y. Tanaka, W. Tao, J.S. Blanchard and E.J. Hehre, J. Biol. Chem., 269 (1994) 32306.
6 G. Legler, Adv. Carb. Chem. Biochem., 48 (1990) 319.
7 J. McCarter and S.G. Withers, Current Opinion in Structural Biology, 4 (1994) 885.
8 S.G. Withers, I.P. Street, P. Bird and D.H. Dolphin, J. Amer. Chem. Soc., 109 (1987) 7530.
9 S.G. Withers, K. Rupitz and I.P. Street, J. Biol. Chem., 263 (1988) 7929.
10 J. McCarter, M. Adam and S.G. Withers, Biochem. J., 286 (1992) 721.
11 R. Wolfenden and W.M. Kati, Acc. Chem. Res., 24 (1991) 209.
12 I.P. Street, K. Rupitz and S.G. Withers, Biochemistry, 28 (1989) 1581.
13 I.P. Street, J.B. Kempton and S.G. Withers, Biochemistry, 31 (1992) 9970.
14 S.G. Withers and I.P. Street, J. Amer. Chem. Soc., 110 (1988) 8551.
15 S.G. Withers, R.A.J. Warren, I.P. Street, K. Rupitz, J.B. Kempton and R. Aebersold, J. Amer. Chem. Soc., 112 (1990) 5887.
16 D. Tull, S.G. Withers, N.R. Gilkes, D.G. Kilburn, R.A.J. Warren and R. Aebersold, J. Biol. Chem., 266 (1991) 15621.
17 J.C. Gebler, R. Aebersold and S.G. Withers, J. Biol. Chem., 267 (1992) 11126.
18 Q. Wang, D. Tull, A. Meinke, N.R. Gilkes, R.A.J. Warren, R. Aebersold and S.G. Withers, J. Biol. Chem., 268 (1993) 14096.
19 S. Miao, J.D. McCarter, M. Grace, G. Grabowski, R. Aebersold and S.G. Withers, J. Biol. Chem., 269 (1994) 10975.
20 A. Berg-Fussman, M. Grace, Y. Ioannou and G. Grabowski, J. Biol. Chem., 268 (1993) 14861.
21 V. Tybulewicz, M. Tremblay, M. LaMarca, R. Willemsen, B. Stubblefield, S. Winfield, B. Zablocka, E. Sidransky, B. Martin, S. Huang, K. Mintzer, H. Westphal, R. Mulligan and E. Ginns, Nature, 357 (1992) 407.
22 N.W. Barton, F.S. Furbish, G.J. Murray, M. Garfield and R.O. Brady, Proc. Natl. Acad. Sci. U.S.A., 87 (1990) 1913.
23 M.L. Figueroa, B.E. Rosenbloom, A.C. Kay, P. Garver, D.W. Thurston, J.A. Koziol, T. Gelbart and E. Beutler, The New England Journal of Medicine, 327 (1992) 1632.
24 S. Fallet, M.E. Grace, A. Sibille, D.S. Mendelson, R.S. Shapiro, G. Hermann and G.A. Grabowski, Pediatric Research, 31 (1992) 496.
25 J. Sorge, C. West, B. Westwood and E. Beutler, Proc. Natl. Acad. Sci., 82 (1985) 7289.
26 P. Schimmel, Acc. Chem. Res., 22 (1989) 232.
27 Q. Wang, R.W. Graham, D. Trimbur, R.A.J. Warren and S.G. Withers, J. Amer. Chem. Soc., 116 (1994) 11594.
28 D. Tull, S. Miao, S.G. Withers and R. Aebersold, Anal. Biochem., 224 (1995) 509.
29 S. Miao, L. Ziser, R. Aebersold and S.G. Withers, Biochemistry, 33 (1994) 7027.
30 A. White, S. Withers, N. Gilkes and D. Rose, Biochemistry, 33 (1994) 12546.
31 W.W. Wakarchuk, R.L. Campbell, W.L. Sung, J. Davoodi and M. Yaguchi, Protein Sci., 3 (1994) 467.
32 F. Naider, Z. Bohak and J. Yariv, Biochemistry, 11 (1972) 3202.
33 T. Black, L. Kiss, D. Tull and S.G. Withers, Carbohydrate Research, 250 (1993) 195.
34 D. Hess, H. Nika, D.T. Chow, E.J. Bures, H. Morrison and R. Aebersold, Anal. Biochem., 224 (1995) 373.
35 E.J. Bures, H. Nika, D.T. Chow, H. Morrison, D. Hess and R. Aebersold, Anal. Biochem., 224 (1995) 364.
36 A.M. MacLeod, T. Lindhorst, S.G. Withers and R.A.J. Warren, Biochemistry, 33 (1994) 6571.
37 Z. Keresztessy, L. Kiss and M. Hughes, Arch. Biochem. Biophys., 315 (1994) 323.
38 S.G. Withers, D. Dombroski, L.A. Berven, D.G. Kilburn, R.C.J. Miller, R.A.J. Warren and N.R. Gilkes, Biochem. Biophys. Res. Commun., 139 (1986) 487.
39 D. Tull and S.G. Withers, Biochemistry, 33 (1994) 6363.
40 S.G. Withers, K. Rupitz, D. Trimbur and R.A.J. Warren, Biochemistry, 31 (1992) 9979.
41 H.G. Damude, S.G. Withers, D.G. Kilburn, R.C.J. Miller and R.A.J. Warren, Biochemistry, 34 (1995) 2220.


S.B. Petersen, B. Svensson, and S. Pedersen (Eds), CarbohydrateBioengineering 9 Elsevier Science B.V. All rights reserved.


Thiooligosaccharides: toys or tools for the studies of glycanases

H. Driguez

Centre de Recherches sur les Macromolécules Végétales (CERMAV-CNRS), B.P. 53, F-38041 Grenoble Cedex 9, France

Abstract

The latest syntheses of oligosaccharides having one or several interosidic sulfur linkages are reported. The donor molecules are glycopyranose-1-thiolates, readily prepared from their corresponding fully acetylated derivatives by selective S-deacetylation and activation. SN2 reactions of these donors with acylated acceptors bearing a good leaving group (generally a triflyl group or an iodine atom at a secondary or primary position, respectively) afforded the expected compounds in good to high yields. These non-natural, hydrolysis-resistant oligosaccharides are substrate analogues for glucoamylases, α-amylases, cellulases and 1,3-1,4-β-D-glucanases.

1. INTRODUCTION

Glycosyl hydrolases are carbohydrate-acting enzymes that catalyze in vivo the transfer of glycosyl units onto water. These enzymes, in general, can be divided into three groups: glycosidases, which act on oligosaccharides and liberate monosaccharide units from the non-reducing end; exo-glycanases, which also cleave glycosidic bonds at the non-reducing end of oligo- and polysaccharides but may liberate mono- or disaccharides; and endo-glycanases, which cleave internal glycosidic bonds of polysaccharides. These enzymes have been studied extensively and several reviews have been written on their mechanism of action and inhibition [1]. Nevertheless, the objective of several E.C. projects is the analysis of structural and functional relationships of glycosidases and glycanases with the aims of: i) elucidating the principles that govern substrate specificity at the molecular level, ii) using protein engineering to modulate or change substrate and cleavage specificities.

To understand enzyme-sugar interactions and the mechanism of action of these enzymes, naturally occurring compounds as well as structural analogues which can form reversible enzyme-inhibitor complexes are required. These inhibitors may be divided into ground-state and transition-state analogues. The first class of inhibitors should give the most interesting information, since these compounds possess the same overall geometry as the natural substrate but have one (or several) interglycosidic oxygen atom(s) and/or ring oxygen(s) substituted by methylene group(s), nitrogen or sulfur atom(s). These compounds should establish more or less the same polar and non-polar interactions with the amino acids of the active site without being hydrolyzed.

Since the first synthesis of a C-disaccharide in which the interglycosidic oxygen atom is replaced by a methylene group [2], extensive research has made available several carba-analogues of di- or trisaccharides [3]. However, no papers have reported the behaviour of these compounds as competitive inhibitors of the corresponding glycosidases. Aza-sugar inhibitors include those in which the oxygen atom between two sugar or pseudosugar units is replaced by a nitrogen functionality [4]. These basic compounds may possess a positive charge, and their inhibitory capacities are therefore strongly pH-dependent. However, these compounds, especially those which may be considered as transition-state analogues, are very often potent inhibitors of glycanases [4]. Recently, the enzymatic synthesis of disaccharides containing sulfur in the ring of the reducing or non-reducing unit was reported [5], together with the chemical synthesis of analogues of disaccharides with sulfur in the ring of the non-reducing sugar and/or a chalcogen atom in the interglycosidic linkage [6]. This paper will discuss the recent syntheses of the largest class of inhibitors, oligosaccharides with interglycosidic sulfur atom(s), and their uses for biochemical and X-ray studies of glycanases.

2. SYNTHESIS OF THIOOLIGOSACCHARIDES

6-Thiogentiobiose and 6-thioallolactose were the first reducing S-linked thiodisaccharides prepared [7]. In 1982, we developed a general strategy for the establishment of any thioglycosidic linkage in good yield [8]. The reaction of a glycopyranose 1-thiolate, readily prepared from the corresponding acetylated derivative, with an acylated acceptor bearing a good leaving group (generally a triflyl group) afforded, after subsequent reacetylation, the desired disaccharide in good yield (scheme 1). More recently, this method was improved by a selective in situ S-deacetylation and activation using cysteamine (cyst.) and dithioerythritol (DTE) [9]. The versatility of these approaches has been shown in a recent review [10], which reports all the syntheses and uses of thiooligosaccharides published up to 1991. This paper will therefore cover the results which have appeared in the literature since that date.

2.1. Synthesis of thiodisaccharides

Hasegawa et al. [11] have synthesized several ganglioside analogs containing α-thioglycosides of sialic acid and have observed that these analogs are potent inhibitors of sialidase activities of different subtypes of influenza virus [12]. In this approach, the S-acetyl compound was more or less selectively S-deacetylated with sodium methoxide at low temperature, and the sodium salt thus obtained was then coupled in DMF with glycosyl acceptors, to afford the expected disaccharides upon reacetylation (55-70%). In 1994, von Itzstein and his group [13] described a mild and efficient method for the selective in situ S-deacetylation; under these conditions the desired disaccharide was obtained in more than 80% yield after a few hours. Convenient syntheses of N,N'-diacetylthiochitobiose [14a], thiokojibiose, thionigerose and thioisomaltose [14b] were also reported.

[Scheme 1: coupling of glycopyranose 1-thiolates with triflate acceptors; 1) HMPA, 2) Ac2O, pyridine.]

2.2. Oligosaccharides possessing only one thioglucosyl unit

2.2.1. At their non-reducing end

Glucoamylase is an exo-glycanase that hydrolyzes both α1→6 and α1→4 glucosidic bonds of starch from non-reducing ends to produce β-D-glucose. Most fungal glucoamylases possess a separate starch binding domain, and although several enzymes with high activity on raw starch also have strong debranching activity, it remains to be demonstrated whether or not the starch binding domain plays a role in hydrolytic or reversion reactions. It was thus considered that 6^2-thiopanose and its higher oligomers, 6^ω-S-α-glucosyl-6^ω-thiomaltodextrins, should be useful tools to understand this duality. As shown in scheme 2, the displacement of the iodine atom of acetylated 6^2-iodomaltose by the in situ activated 1-S-α-D-glucose peracetate afforded 6^2-thiopanose in excellent yield (82 %) [15]. The synthesis of higher oligomers is not so straightforward. However, we thought that the coupling reaction of glucose and 6-iodo-β-cyclodextrin catalyzed by cyclodextrin glucosyltransferase (CGTase) should provide the most direct approach to 6^ω-iodo-maltodextrins (scheme 3). Indeed, 6^3-iodo-maltotriose and 6^4-iodo-maltotetraose were obtained in good yield. The procedure used for the synthesis of 6^2-thiopanose, applied in the present work, afforded the expected tetra- and pentasaccharide in 86 and 80 % yield respectively. The binding experiments of these compounds with both glucoamylase G1 from A. niger and its isolated starch binding domain fragment were done by Dr B. Svensson, Copenhagen (Table 1). The dissociation constant Kd, determined by U.V. difference spectroscopy, decreased from about 1 mM to 0.2 mM as the degree of polymerization increased, and these values are approximately one order of magnitude lower than those reported for α-1,4-maltodextrins containing the same number of glucosyl residues [16].
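To put the change in Kd into energetic terms, the standard relation ΔG° = RT ln Kd (with Kd expressed relative to a 1 M standard state) can be applied. The sketch below is only an illustration: the temperature of 298.15 K is an assumption, since the measurement temperature is not stated here, and the function name is hypothetical.

```python
import math

R = 8.314462618e-3   # gas constant, kJ mol^-1 K^-1
T_ASSUMED = 298.15   # K; assumed, the text does not give the temperature


def binding_free_energy(kd_molar, temperature=T_ASSUMED):
    """Standard free energy of binding (kJ/mol) from a dissociation constant in M."""
    return R * temperature * math.log(kd_molar)


# Approximate limits quoted in the text: about 1 mM down to about 0.2 mM.
dg_weakest = binding_free_energy(1.0e-3)
dg_tightest = binding_free_energy(0.2e-3)

# Extra stabilisation gained on going from the shortest to the longest ligand.
gain = dg_weakest - dg_tightest

if __name__ == "__main__":
    print(f"dG(1 mM)   = {dg_weakest:.1f} kJ/mol")
    print(f"dG(0.2 mM) = {dg_tightest:.1f} kJ/mol")
    print(f"gain       = {gain:.1f} kJ/mol")
```

Under this assumption a fivefold tighter Kd corresponds to only about 4 kJ/mol of additional binding energy, so the extra glucosyl residues contribute modestly.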

[Scheme 2: coupling of peracetylated 1-thio-α-D-glucose with acetylated 6^2-iodomaltose (cyst., HMPA) to give 6^2-thiopanose, 82 %.]

2.2.2. At a specific inner position

Complex polysaccharides consisting of glycosyl units interconnected by various glycosidic bonds may be hydrolyzed by enzymes which specifically split only one type of linkage. For example, 1,3:1,4-β-glucans, the major matrix polysaccharides of barley endosperm cell walls, may be depolymerized by cellulases, lichenases or, to a lesser extent, laminarinases (scheme 4). To map the active site of lichenases, which specifically cleave 1,4-β-linkages in β-glucans where the glucosyl residue is substituted at position 3, we decided to synthesize the target molecule shown in scheme 4 as a potential inhibitor for these enzymes.

Table 1
Dissociation constants for thiopanose homologs and glucoamylase G1 or its starch binding domain (SBD).

                 Thiopanose homologs^a              Maltodextrins^b
Protein   DP     Kd (mM)                            Kd (mM)
SBD       3      0.96 ± 0.05   (1.6 ± 0.11)         3.8
SBD       4      0.30 ± 0.03   (0.56 ± 0.06)        3.8
SBD       5      0.22 ± 0.02   (0.32 ± 0.01)        1.3
G1        3      0.84 ± 0.04   (1.7 ± 0.04)         -
G1        4      0.31 ± 0.02   (0.53 ± 0.02)        -
G1        5      0.17 ± 0.02   (0.15 ± 0.01)        -

a Values in parentheses, determined by titration calorimetry, are for the site of highest affinity.
b Comparison to the dissociation constants found for maltodextrins and the starch binding domain. Reprinted from N.J. Belshaw et al., Eur. J. Biochem., 211 (1993) 717.

[Scheme 3: CGTase-catalyzed coupling of glucose with 6-iodo-β-cyclodextrin to give 6^ω-iodo-maltodextrins, followed by coupling with 1-thioglucose (cysteamine, HMPA). Labels recoverable from the scheme: 2, n=1 (59%); 5, n=1, R=Ac (86%); 6, n=2, R=Ac (80%); 7, n=1, R=H; 8, n=2, R=H; 1) glucoamylase, 2) Ac2O, pyridine, 2 (22%).]

[Scheme 4: β-Glc 1→4 and β-Glc 1→3 linkages of 1,3:1,4-β-glucan, the sites of attack of cellulases, laminarinases and lichenases, and the target methyl glycoside.]

[Scheme 5: synthesis of the target tetrasaccharide. Donor molecule from scleroglucan: 1) Ac2O, H+; 2) HBr, AcOH; 3) Bu4NSAc, toluene. Coupling with the benzoylated triflate acceptor: 1) cyst., HMPA (35%) or Et2NH, DMF (45%); 2) MeONa, MeOH. Enzymatic elongation: cellodextrin phosphorylase, glucose oxidase, catalase (56%).]

The disaccharide acceptor was obtained in six steps from methyl β-lactoside (scheme 5). Starting from acetylated laminaribiose, isolated in 26 % yield by acetolysis of scleroglucan, the donor 1-thioacetyl laminaribiose was easily prepared. However, we were not able to find experimental conditions which afforded the expected tetrasaccharide in good yield. Under various conditions the yield was 35-45 %. This result contrasts with those reported in 2.2.1. The elimination of triflic acid from an acceptor bearing an equatorial aglycon seems to be the side reaction which lowered the coupling yield. Nevertheless, this tetrasaccharide was used as a primer in an enzymatic elongation, using cellodextrin phosphorylase isolated from C. thermocellum [17]. When this enzyme was incubated for 12 h with the tetrasaccharide and glucose-1-P, the corresponding DP 5 was obtained in very good yield (56 %). However, the phosphorolytic synthesis of DP 6 was not so straightforward, since higher oligomers (DP 7-8) were also obtained. The recognition of these compounds by lichenases from plant and bacterial origins is under investigation.

2.3. Oligosaccharides having sulfur atoms in all their interglycosidic linkages

2.3.1. 4-Thiocellooligosaccharides

Cellulose, the major polysaccharide in plant cell walls, is randomly hydrolyzed by several families of cellulases [18]. Because of the complex structure of the natural solid substrate, the precise mechanism of recognition and action of these enzymes may be more conveniently approached by the use of inhibitors and/or artificial substrates. The synthesis in the S-series of cellotri-, tetra- and pentaosides was achieved in excellent yield by stepwise coupling of the same triflate acceptor with S-acetyl 1-thio-4-thiocellodextrins of DP 2, 3 and 4 respectively (scheme 6) [19]. These compounds were used as inhibitors for endoglucanase I (EGI) and cellobiohydrolase II (CBHII) from Humicola insolens, which belong to two different classes of cellulases and which require respectively three and four unmodified β(1→4) glucosyl units to efficiently hydrolyze a glycosidic bond. It has been found by Dr. M. Schülein and his group from Novo Nordisk that these 4-thiooligomers are potent inhibitors of these two cellulases (see Table 2) [19]. Furthermore, Prof. T.A. Jones and his group in Uppsala identified the active site residues of CBHII from T. reesei by solving the X-ray structure of the enzyme complexed with the thiocellopentaoside [20].

Table 2
Inhibition constants (Ki) for H. insolens EGI and CBHII with methyl 4-thiocellooligosaccharides.

Enzyme   Constant    DP 3           DP 4           DP 5
EGI      Ki (µM)     330 ± 1.6      73 ± 1.6       35 ± 2.7
EGI      Kis (µM)    2,000 ± 11     1,000 ± 100    -
CBHII    Ki (µM)     1,400 ± 250    270 ± 2.3      15 ± 0.9
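To give a feel for what a Ki in the low micromolar range means in practice, the sketch below evaluates the textbook rate law for simple competitive inhibition, v/Vmax = S / (Km(1 + I/Ki) + S). This is an illustrative model only: the substrate concentration and Km are hypothetical placeholders, the kinetic model actually fitted in ref. 19 is not described here, and the only value taken from Table 2 is Ki = 15 µM for CBHII with the DP 5 analogue.

```python
def relative_rate(s, km, i, ki):
    """v/Vmax for simple competitive inhibition (all concentrations in the same unit)."""
    return s / (km * (1.0 + i / ki) + s)


def residual_activity(s, km, i, ki):
    """Rate in the presence of inhibitor divided by the uninhibited rate."""
    return relative_rate(s, km, i, ki) / relative_rate(s, km, 0.0, ki)


KI_CBHII_DP5 = 15.0     # µM, from Table 2
KM_HYPOTHETICAL = 100.0  # µM, placeholder value, not from the source
S_HYPOTHETICAL = 100.0   # µM, placeholder value, not from the source

if __name__ == "__main__":
    for inhibitor in (0.0, 15.0, 150.0):
        frac = residual_activity(S_HYPOTHETICAL, KM_HYPOTHETICAL,
                                 inhibitor, KI_CBHII_DP5)
        print(f"[I] = {inhibitor:6.1f} µM -> residual activity {frac:.2f}")
```

With substrate at its Km, an inhibitor concentration equal to Ki leaves two thirds of the uninhibited rate in this model; the numbers are meant as orientation, not as a reanalysis of the published data.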

2.3.2. 4-Thiomaltooligosaccharides

Rational design of starch-degrading enzymes with modified specificity will become possible with a better understanding, at the atomic level, of the interactions of enzymes with maltooligosaccharides. In order to investigate these interactions using the substrate-analogue approach, efforts have been directed toward a practical synthesis of 4-thiomaltodextrins. We reasoned that the acceptor molecule should possess at its anomeric position an α-thiol that is unreactive under the coupling conditions, but which can be easily and quantitatively transformed into a donor molecule when necessary. We demonstrated that triphenylmethanethiol is a useful reagent in the synthesis of 1,2-cis-1-thioglycoses [21], and we showed the feasibility of this strategy by the synthesis of methyl 4,4'-dithiomaltotrioside [22] (scheme 7). This compound was resistant to enzymatic hydrolysis, but did not behave as an ordinary competitive inhibitor. The Ki values decreased to a greater or lesser extent with increasing incubation time, and increased at high concentrations of inhibitor. This phenomenon may be explained by an abnormal binding of this compound in the active site. This speculation is supported by the results obtained by Drs F. Payan and R. Haser from Marseille, who solved the X-ray structure of a crystal of pig pancreatic α-amylase that was soaked with this compound [23]. They found that three molecules were bound to the enzyme, two in the active site and one at the surface binding site.

[Scheme 6: stepwise synthesis of methyl 4-thiocellooligosaccharides from a common triflate acceptor and S-acetyl 1-thio-4-thiocellodextrin donors.]

[Scheme 7: synthesis of methyl 4,4'-dithiomaltotrioside via a 1-S-trityl (STr) protected acceptor.]

3. CONCLUSION

In this review, we would like to point out that thiooligosaccharides constitute a readily available class of non-natural oligosaccharides. The interactions between an oligosaccharide and a protein are initiated by recognition of the solution conformation of the interacting species. The few X-ray structures of thiosugars show that, as expected, the C-S bond is longer than the corresponding C-O bond [24]. These differences for thiomaltoside and maltoside are shown in the figure. It is worth noting that the non-bonded C(1')...C(4) distance in thiomaltoside is only 0.35 Å longer than in maltoside [25].

[Figure: comparison of maltoside and thiomaltoside geometries. Bond to the interglycosidic atom: C-O 1.437 Å, C-S 1.828 Å. Non-bonded C(1')...C(4) distance: 2.455 Å in maltoside, 2.805 Å in thiomaltoside.]
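The distances quoted for the figure can be checked with a few lines of arithmetic. The sketch below reproduces the 0.35 Å difference and, as an additional illustration, estimates the angle at the bridging atom from an isosceles triangle. That second step assumes that both bonds to the bridging atom have the single bond length shown in the figure, which is a simplification introduced here and not a statement from the source.

```python
import math

# Distances (Å) as they appear in the figure.
C_O_BOND = 1.437          # C-O bond in maltoside
C_S_BOND = 1.828          # C-S bond in thiomaltoside
C1_C4_MALTOSIDE = 2.455   # non-bonded C(1')...C(4), oxygen bridge
C1_C4_THIO = 2.805        # non-bonded C(1')...C(4), sulfur bridge


def bridge_angle(bond, nonbonded):
    """Angle (degrees) at the bridging atom, assuming two equal bonds."""
    return math.degrees(2.0 * math.asin(nonbonded / (2.0 * bond)))


nonbonded_difference = C1_C4_THIO - C1_C4_MALTOSIDE
bond_difference = C_S_BOND - C_O_BOND
angle_oxygen = bridge_angle(C_O_BOND, C1_C4_MALTOSIDE)
angle_sulfur = bridge_angle(C_S_BOND, C1_C4_THIO)

if __name__ == "__main__":
    print(f"C(1')...C(4) difference: {nonbonded_difference:.2f} Å")
    print(f"bond length difference per bond: {bond_difference:.3f} Å")
    print(f"estimated C-O-C angle: {angle_oxygen:.1f} deg")
    print(f"estimated C-S-C angle: {angle_sulfur:.1f} deg")
```

Under the equal-bond assumption, the longer C-S bonds are largely compensated by a narrower angle at sulfur, which is why the carbon-carbon separation grows far less than twice the bond-length difference.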

Furthermore, recent theoretical studies of the conformation of thiomaltoside [26] as well as NMR data [27] in aqueous solution demonstrated that thiomaltoside is more flexible than its oxygen analogue and that both compounds adopt the same conformation. All these data explain the excellent recognition of thiooligosaccharide by glycosylhydrolases and show that these compounds should be used more extensively for a better understanding of protein-sugar interactions.

4. ACKNOWLEDGEMENTS

This research was supported by CNRS and by the BAP, BRIDGE and BIOTECH programmes of the E.C. Enzymatic studies were done by Drs. G. Fincher (Adelaide), J. Lehmann (Freiburg), A. Planas (Barcelona), M. Schülein (Copenhagen) and B. Svensson (Copenhagen).

5. REFERENCES

1 M.L. Sinnott, Chem. Rev., 90 (1990) 1171.
2 D. Rouzaud and P. Sinaÿ, J. Chem. Soc., Chem. Commun., (1983) 1353.
3 For references on the synthesis of C-disaccharides see Y.C. Xin, J.M. Mallet and P. Sinaÿ, J. Chem. Soc., Chem. Commun., (1993) 864 and references cited therein.
4 K. Bock and B.W. Sigurskjold in Atta-ur-Rahman (ed.), Studies in Natural Products Chemistry, vol. 7, 1990, pp. 29 and references cited therein.
5a C.H. Wong, T. Krach, C. Gautheron-Le Narvor, Y. Ichikawa, G.C. Look, F. Gaeta, D. Thompson and K.C. Nicolaou, Tetrahedron Lett., 32 (1991) 4867.
5b H. Yuasa, O. Hindsgaul and M.M. Palcic, J. Am. Chem. Soc., 114 (1992) 5891.
6 S. Mehta, J.S. Andrews, B.O. Johnston and B. Mario Pinto, J. Am. Chem. Soc., 116 (1994) 1569.
7a D.H. Hutson, J. Chem. Soc. (C), (1967) 442.
7b W. Boos, P. Schaedel and K. Wallenfels, Eur. J. Biochem., (1967) 382.
8a M. Blanc-Muesser, J. Defaye and H. Driguez, J. Chem. Soc. Perkin Trans. 1, (1982) 15.
8b D. Rho, M. Desrochers, L. Jurasek, H. Driguez and J. Defaye, J. Bacteriol., 149 (1982) 47.
9 M. Blanc-Muesser and H. Driguez, J. Chem. Soc. Perkin Trans. 1, (1988) 3345.
10 J. Defaye and J. Gelas in Atta-ur-Rahman (ed.), Studies in Natural Products Chemistry, vol. 8, 1991, pp. 315 and references cited therein.
11 A. Hasegawa, T. Terada, H. Ogawa and M. Kiso, J. Carbohydr. Chem., 11 (1992) 319.
12 Y. Suzuki, K. Sato, M. Kiso and A. Hasegawa, Glycoconjugate J., 7 (1990) 349.
13 S. Bennet, M. von Itzstein and M.J. Kiefel, Carbohydr. Res., 259 (1994) 293.
14a L.X. Wang and Y.C. Lee, XVIIth Intern. Carbohydr. Symposium, Canada (1994) Abst. B2, 11 p. 285.
14b R.N. Comber, J.D. Friedrich, D.A. Dunshee, S.L. Petty and J.A. Secrist III, Carbohydr. Res., 262 (1994) 245.
15 S. Cottaz, H. Driguez and B. Svensson, Carbohydr. Res., 228 (1992) 299.
16 C. Apparu, H. Driguez, G. Williamson and B. Svensson, Carbohydr. Res., in press.
17 E. Samain, C. Lancelon-Pin, F. Férigo, V. Moreau, H. Chanzy, A. Heyraud and H. Driguez, Carbohydr. Res., 271 (1995) 217.
18 B. Henrissat, M. Claeyssens, P. Tomme, L. Lemesle and J.P. Mornon, Gene, 81 (1989) 83.
19 C. Schou, G. Rasmussen, M. Schülein, B. Henrissat and H. Driguez, J. Carbohydr. Chem., 12 (1993) 743.
20 T.A. Jones, personal communication.
21 M. Blanc-Muesser, L. Vigne and H. Driguez, Tetrahedron Lett., 31 (1990) 3869.
22 M. Blanc-Muesser, L. Vigne, H. Driguez, J. Lehmann, J. Steck and K. Urbahns, Carbohydr. Res., 224 (1992) 59.
23 F. Payan and R. Haser, personal communication.
24 V. Durier, H. Driguez, P. Rollin, E. Duée and G. Buisson, Acta Cryst., C48 (1992) 1791 and references cited therein.
25 S. Pérez and C. Vergelati, Acta Cryst., B40 (1984) 294 and references cited therein.
26 K. Mazeau and I. Tvaroska, Carbohydr. Res., 225 (1992) 27.
27 K. Bock, J.O. Duus and S. Refn, Carbohydr. Res., 253 (1994) 51.


Mutational analysis of catalytic mechanism and specificity in amylolytic enzymes

B. Svensson, T.P. Frandsen, I. Matsui, N. Juge, H.-P. Fierobe, B. Stoffer^a and K.W. Rodenburg^b

Department of Chemistry, Carlsberg Laboratory, Gamle Carlsberg Vej 10, DK-2500 Copenhagen Valby, Denmark.

Present address: ^a University of Copenhagen, Department of Chemistry, Laboratory IV, Universitetsparken 5, DK-2100 Copenhagen Ø, Denmark. ^b Aarhus University, Institute of Molecular Biology, C.F. Møllers Allé 130, DK-8000 Århus C, Denmark

Abstract

Engineering of proteins and substrates in conjunction with enzyme kinetic, thermodynamic and X-ray crystallographic studies has provided new knowledge on the mechanism of substrate binding and catalysis of starch-hydrolases and related enzymes. Enzyme-substrate interactions at a distance from the site of catalysis will receive special attention to expand the insight into the structural basis of the diversity known to amylolytic enzymes. Examples are reported of mutants of glucoamylase from Aspergillus niger and of mutants and hybrids of barley α-amylase isozymes with altered specificity. In addition, site-directed mutagenesis of barley α-amylase isozyme hybrids probes side chains critical for association with barley α-amylase/subtilisin inhibitor.

1. INTRODUCTION

In the past three years basic molecular features have been reported of inhibitors, substrate analogs, or products in complex with porcine pancreatic amylase [1,2], barley α-amylase 2 [3], soybean β-amylase [4,5], Aspergillus awamori var. X100 glucoamylase (GA) [6-8], Bacillus circulans cyclodextrin glycosyltransferases (CGTases) [9-11], B. stearothermophilus CGTase [12], and Pseudomonas stutzeri maltotetraohydrolase [13] (reviewed in ref. 14). These crystal structures permit interpretation of mutants either rationally designed or selected after random mutagenesis. For the native structures available, those of Taka-amylase A (A. oryzae) [15], the closely related acid α-amylase from A. niger [16], B. licheniformis amylase [17], sweet potato β-amylase [18], and oligo-1,6-glucosidase from B. cereus [19], modeled complexes may be applied. In addition, structure prediction coupled with alignment to related proteins of known three-dimensional structure allows identification at the sequence level of amino acid residues important in activity and stability [14,20]. Valuable information on structure/function relationships of a very large number of amylolytic enzymes is thus at hand even when details of the enzyme-substrate interaction at the atomic level are lacking.

Mutational analysis of substrate specificity and catalysis (reviewed in refs. 14,21) has been performed in i) amylolytic enzymes active on α-1,4 or α-1,6 linkages of maltodextrins and starch, representing seven EC classes within the amylase (β/α)8-barrel structural family: α-amylase, CGTase, cyclodextrinase, neopullulanase, amylase-pullulanase, and recently branching enzyme [22] and pullulanase [23]; ii) the inverting exo-glucanases, β-amylase and GA, containing a different (β/α)8- and an (α/α)6-barrel, respectively; and iii) the high molecular weight α-glucosidases, probably constituting a separate structural family. The results obtained span from identification of catalytic groups to successful design of mutants with altered properties. In the case of B. circulans CGTases [9,24] and A. niger GA [25] the behavior of mutants has been rationalized based on crystal structures of the engineered proteins.

The present paper focuses on A. niger GA and barley malt amylase. GA catalyses release of α-1,4 and α-1,6-linked glucose from nonreducing ends of starch and related oligo- and polysaccharides with inversion of the anomeric configuration. Barley α-amylase, in contrast, is a retaining, strictly α-1,4 specific endoglucanase. Following the replacement of catalytic residues [26-28], the main emphasis currently is on enzyme-substrate contacts at a distance from the site of catalysis. Investigations in GA involve mutagenesis in conjunction with inhibitor binding thermodynamics [29], presteady state kinetics analysis [30-32], and molecular recognition of substrate analogs [33-36] as an extension of the general characterization of enzymic activity and stability. Finally, the target isozyme of barley α-amylase/subtilisin inhibitor (BASI), one of numerous proteinaceous α-amylase inhibitors [37], is subjected to site-directed mutagenesis to map the contribution of individual side chains to the protein-protein binding [38].

2. SITE-DIRECTED MUTAGENESIS OF A. NIGER GLUCOAMYLASE GA is one of the best known starch-hydrolases with respect to structure and function. Ongoing dissection of the interplay between specific parts of the enzyme and substrates/inhibitors, serves to probe the mechanism of action of starch-hydrolases and related enzymes in general and broadens the background knowledge required for rational design of novel enzymes. 2.1. The catalytic site Inverting exoglucanases act by rather simple binding and catalytic mechanisms (for reviews see ref.s 21, 39-40). Glu179 and Glu400 in GA have been identified as the general acid and base catalyst, respectively, using crystallography [6-8], site-directed mutagenesis [26,27], and differential labeling [41]. Insight is lacking, however, in the structural features that define the substrate specificity and the energetics of transition-state stabilization. Crystallography uncovered a remarkable interaction between Tyr48, which is completely conserved in the GAs [42], and 1-deoxynojirimycin bound at the innermost subsite 1 of the funnel-shaped active site [6]. This motivated mutational analysis of the role of Tyr48. Because the obvious replacement by phenylalanine failed, tryptophan was introduced [37]. In wild-type GA OH of Tyr48 hydrogen bonds to OE2 of Glu400, the catalytic base [6] and circumstantial

127 evidence suggests a coupled functionality between these residues, since Tyr48---)Trp and Glu400---)Gln GAs i) undergo similar losses in kcat and increases in Km (Table 1A), ii) have highly reduced binding affinity at the most strongly interacting subsite 2 (Table 1B), and iii) the pH-activity profiles of both mutants increase rather than decline at low pH where protonation of Glu400 is assumed to occur in wild-type GA [27]. As for substrates, the affinity of

Table 1 Mutants at the catalytic site. A. Kinetic parameters ~. B. Subsite 2 affinity a and acarbose binding thermodynamicsb A.

Maltose

kc s-~

Km

Wild-type

10.7

1.21

Tyr48---)Trp

0.12

3.92

Glu179~Gln c

-

-

Glu400---)Gln

0.30

14.8

Enzyme

B.

Maltoheptaose

kc/Km

kc s-1

Km mM

kc/Km

8.48

59.7

0.120

498

0.031

0.762

0.168

4.54

0.047

0.15

0.31

1.05

0.380

2.76

Ka M -~

-AG ~ kJ mol -~

-AH ~ kJ mol -~

TAS ~ kJ mo1-1

mM

Subsite 2 affinity kJ mol -~

0.020

Wild-type

-20.5

9.4•

68.9

32.8

36.1

Tyr48---)Trp

-15.4

2.0•

30.5

10.9

19.6

Glu400---~Gln

- 11.8

.

.

.

.

apH 4.5, 45~ ref. 27; b27~ ref. 29; c50~ ref. 26

Tyr48~Trp GA for the tight-binding pseudotetrasaccharide inhibitor acarbose (Figure 1) was drastically reduced (Table 1B), in fact to an even greater extent than in Trp52---)Phe, Arg54---)Leu, and Arg54---)Lys GAs (see Table 2B), three mutants also located at or near subsite 1 [29]. This behaviour is in accordance with the participation of Tyr48 in a hydrogen bond network involving besides Glu400, the water molecule that exerts nucleophilic attack a at C1 of the substrate glycon ring, and OH of Tyr311, a residue stacking onto the sugar ring at subsite 2 [7,8]. OH of Tyr48 is only 3.43/~ apart from C7A of o-gluco-dihydroacarbose which is equivalent to the endocyclic oxygen in the substrate [8]. Non-bonded electrons of Tyr48 therefore presumably contribute to stabilization of the oxycarbonium ion intermediate [27]. Very recently the structure was solved of A. niger Tyr48~Trp GA in complex with the inhibitor "Iris [25]. In the active site (Figure 2) the hydrogen bond network with Glu400, Tyr311, and Tyr48 has been disrupted and the distance between Glu400 and Tyr311 has increased from 4.70 to 5.21 A indicating that the structure of the active site is indeed perturbed. Compared to other GA crystal structures, Tyr48--rTrp GA lacks a water molecule

Figure 1. Molecular structure of the pseudotetrasaccharide inhibitors acarbose (top) and D-gluco-dihydroacarbose.

Figure 2. Stereoview of the active site of A. niger GA Tyr48→Trp with bound Tris [25]. Dashed lines indicate hydrogen bond donor-acceptor distances within 3.2 Å.

at a position for attack at C1 [25]. The model of Tyr48→Trp GA thus accounts for the decreased activity and loss of affinity at subsite 2, while the relative increase in activity at low pH [27] is not readily explained by the mutant structure [25].
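The acarbose binding data in Table 1B can be cross-checked with the standard relations ΔG° = -RT ln Ka and ΔG° = ΔH° - TΔS°. The short Python sketch below is an illustration only and not part of the original work: the function names are hypothetical, the temperature of 27 °C is taken from the table footnote, and the wild-type Ka of 9.4×10¹¹ M⁻¹ is the value listed in Table 2B.

```python
import math

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def minus_dg_from_ka(ka, temp_c=27.0):
    """Return -dG (kJ/mol) for an association constant Ka (M^-1)."""
    return R * (temp_c + 273.15) * math.log(ka)


def minus_dg_from_parts(minus_dh, t_ds):
    """Return -dG (kJ/mol) from -dH and TdS, using dG = dH - TdS."""
    return minus_dh + t_ds


# Wild-type GA: -dG from Ka versus -dG from the tabulated -dH and TdS
print(round(minus_dg_from_ka(9.4e11), 1))
print(round(minus_dg_from_parts(32.8, 36.1), 1))

# Tyr48->Trp GA: tabulated -dH = 10.9 and TdS = 19.6 kJ/mol
print(round(minus_dg_from_parts(10.9, 19.6), 1))
```

Both routes give about 69 kJ mol⁻¹ for wild-type GA, and the enthalpy and entropy terms for Tyr48→Trp GA add up to the tabulated 30.5 kJ mol⁻¹, so the table entries are mutually consistent within rounding.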

2.2. Enzyme/substrate-OH hydrogen bonds

Hydrogen bonds between residues in the active site and polar groups of the sugar substrate contribute importantly to stabilization of the substrate transition-state, as may be assessed by site-directed mutagenesis of the amino acids in such pairs of interacting groups (Table 2). Arg54 and Asp55 hydrogen bond to OH4 and OH6 of the sugar ring (see Figure 4) at subsite 1 [6-8], each stabilizing the transition-state complex by Δ(ΔG‡) = 15-21 kJ mol⁻¹ [43,44], calculated from the difference in kcat/Km between the parent and mutant GAs. Substitution of these residues hardly affected substrate ground-state binding (Km), whereas mutation of Glu180, Arg305, or Asp309 at subsite 2 greatly decreased the affinity for α-1,4-linked, but not for α-1,6-linked, substrates (Table 2A). Arg305 is particularly critical: Arg305→Lys GA has kcat/Km reduced by a factor of 10³ even for isomaltose (Table 2A). The corresponding loss

Table 2
Mutants at residues hydrogen bonding to substrate OH-groups. A. Kinetic parametersᵃ. B. Subsite 2 affinityᵃ and acarbose binding thermodynamicsᵇ

A.
Enzyme     Maltose                       Isomaltose                     Maltoheptaose
           kcat    Km     kcat/Km        kcat     Km     kcat/Km        kcat    Km     kcat/Km
           (s⁻¹)   (mM)   (s⁻¹ mM⁻¹)     (s⁻¹)    (mM)   (s⁻¹ mM⁻¹)     (s⁻¹)   (mM)   (s⁻¹ mM⁻¹)
wt         10.7    1.21   8.84           0.41     19.8   0.021          59.7    0.120  498
R54K       0.007   0.53   0.013          -        -      -              0.031   0.032  0.97
D55Gᶜ      0.04    1.50   0.027          0.002    38.5   5×10⁻⁵         0.23    0.190  1.21
E180Qᵈ     1.53    41.4   0.037          0.184    95.0   0.002          30.9    9.39   3.29
R305K      -       >400   0.0024         4×10⁻⁴   21.0   2×10⁻⁵         -       -      0.66
D309E      9.11    52.4   0.174          0.089    61.9   0.0014         67.1    9.44   7.11

B.
Enzyme     Subsite 2 affinity   Ka          -ΔG°         -ΔH°         TΔS°
           (kJ mol⁻¹)           (M⁻¹)       (kJ mol⁻¹)   (kJ mol⁻¹)   (kJ mol⁻¹)
wt         -20.1                9.4×10¹¹    68.9         32.8         36.1
R54K       -18.4ᵉ               8.2×10⁶     39.7         41.2         -1.5
E180Qᵈ     -13.4                2.8×10⁴     25.5         19.8         5.7
R305K      -                    9.3×10⁴     28.5         30.4         -1.9
D309E      -                    2.0×10¹⁰    59.2         31.1         28.1

ᵃ pH 4.5, 45 °C, ref. 43; ᵇ 27 °C, ref. 29; ᶜ ref. 44; ᵈ ref. 26; ᵉ for Arg54→Leu, ref. 43

in transition-state stabilization energy Δ(ΔG‡) is calculated to be 10-20 kJ mol⁻¹. Arg305 is situated to bridge subsites 1 and 2 (Figure 4) and is crucial in both substrate bond-type specificity and activity. We speculate, therefore, that engineering of the bond-type specificity should concentrate on residues somehow connected with Arg305, and perhaps located in the fifth of the six highly conserved α→α segments of the GA (α/α)6-barrel [6-8,42,45,46], which contains Arg305. The binding thermodynamics for acarbose, which is structurally related to the transition-state, indicate that hydrogen bonding especially to the second ring is critical for transition-state stabilization (Table 2B) and that modulation of specificity with less severe consequence on activity may be achieved by mutation of Asp309 (compare data in Table 2A and B), which is in indirect contact with the ligand via a salt-link with the guanido group of Arg305 [27].
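The Δ(ΔG‡) values quoted in this section follow from the ratio of kcat/Km for mutant and parent enzyme through Δ(ΔG‡) = -RT ln[(kcat/Km)mutant/(kcat/Km)parent]. As a minimal sketch, assuming the assay temperature of 45 °C given in the footnote to Table 2 and using a hypothetical helper name, the calculation for the two subsite 1 mutants acting on maltose could look as follows.

```python
import math

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def delta_delta_g(kcat_km_mutant, kcat_km_parent, temp_c=45.0):
    """Loss of transition-state stabilization (kJ/mol): -RT ln(mutant/parent)."""
    temp_k = temp_c + 273.15
    return -R * temp_k * math.log(kcat_km_mutant / kcat_km_parent)


# kcat/Km values (s^-1 mM^-1) for maltose, copied from Table 2A
WT_MALTOSE = 8.84
MUTANTS = {"R54K": 0.013, "D55G": 0.027}

for name, kcat_km in MUTANTS.items():
    print(name, round(delta_delta_g(kcat_km, WT_MALTOSE), 1), "kJ/mol")
```

Both results fall inside the 15-21 kJ mol⁻¹ range stated for Arg54 and Asp55. The same function applies to substrate analogs, where the parent is the unmodified substrate rather than the wild-type enzyme.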

2.3. Molecular recognition of α-1,4- and α-1,6-linked disaccharide substrates

The transition-state stabilization energy Δ(ΔG‡) contributed by OH4', OH6', and OH3 in maltose (Figure 3), known to be critical for GA-catalyzed hydrolysis to occur [47], was calculated according to equation (1) [48] from the kinetic parameters determined using a series of monodeoxy-analogs (Table 3; refs. 33,35).

$$\Delta(\Delta G^{\ddagger}) = -RT \ln\left[\frac{(k_{cat}/K_m)_{analogue}}{(k_{cat}/K_m)_{parent}}\right] \qquad (1)$$

Because α-1,4- and α-1,6-linked substrates are hydrolyzed at the same catalytic site, the Δ(ΔG‡) values for the corresponding monodeoxy-analogs (Table 3) suggest that OH4 in isomaltose and OH3 in maltose are found at similar positions in the GA-transition-state complex. This implies, as pointed out before [49], that GA associates with isomaltose in the least populated of its three prevalent solution conformers. The conformationally biased methyl 6R-C-methyl-α-isomaltoside (not shown) exists primarily in that conformation [49]. In the absence of crystallographic insights, identification of interacting pairs of atoms or groups in enzyme-substrate transition-state complexes can in principle be done through combined analysis with mutant enzymes and substrate analogs. In the case of Glu180→Gln GA

Table 3
Binding energy contributed by substrate OH groupsᵃ

Analogue       Δ(ΔG‡) (kJ mol⁻¹)
               β-methylmaltosideᵇ,ᶜ    α-methylisomaltosideᵈ
2-deoxy        5.2                     1.1
3-deoxy        11.2                    8.6
6/4-deoxy      -1.8                    16.5
2'-deoxy       0.19                    4.0
3'-deoxy       9.1                     2.2
4'-deoxy       18.8                    23.2
6'-deoxy       17.4                    24.3

ᵃ pH 4.5, 45 °C; ᵇ ref. 35; ᶜ ref. 33; ᵈ ref. 36

Figure 3. Key polar groups are encircled.

the OH2 of maltose supplied no transition-state stabilization energy [33], indicating that a hydrogen bond to wild-type GA (Table 3) was not established in the mutant. However, for monodeoxy-maltose analogs this approach further demonstrated [34], that loss of a protein group hydrogen bonding to a specific substrate ring can weaken the transition-state

stabilization at an adjacent glucose residue. Similarly, transition-state stabilization of the α-1,6-linked substrate was sensitive to loss of hydrogen bonding: Δ(ΔG‡) for hydrolysis of 3- and 4-deoxy-isomaltoside by Glu180→Gln GA decreased to 3.0 and 10.8 kJ mol⁻¹, respectively, and Asp309→Glu GA showed a similar decrease for the 4-deoxy-analog (Table 3). With these two mutants Δ(ΔG‡) furthermore decreased by 10-14 kJ mol⁻¹ for the 4'- and 6'-deoxy-isomaltosides, even though the corresponding substrate OH groups do not bind to Glu180 or Asp309. In contrast, however, the 4'- and 6'-deoxy-analogs of the conformationally blocked methyl 6R-C-methyl-α-isomaltoside yielded Δ(ΔG‡) values of 18-20 kJ mol⁻¹ with wild-type and the two GA mutants [36]. We conclude that removal of a hydrogen bond to the glucose ring at subsite 2 hampers the conformational change normally imposed on substrate by GA to secure optimal transition-state stabilization via strong hydrogen bonds to charged groups at subsite 1.

2.4. Distant subsites and the condensation reaction

Two binding modes for the inhibitor in D-gluco-dihydroacarbose-GA from A. awamori var. X100 revealed flexibility in oligosaccharide accommodation at subsites 3 and 4 (Figure 4). Ser119 OG (not shown) hydrogen bonds with OH3 of the fourth sugar ring in the low (35%)

Figure 4. Stereoview of the interaction of D-gluco-dihydroacarbose with GA from A. awamori var. X100 at pH 4.0. The nonreducing end is at the bottom; the low occupancy mode is to the left (from ref. 8; copyright permission by Elsevier).

occupancy conformation [8]. Mutation on either side of the substrate binding cavity at a distance from the site of catalysis, as in Ser119→Tyr and Ser185→His GAs and the stretch preceding Ser185 (not shown), unexpectedly enhanced condensation as well as α-1,6-bond-type specificity (Table 4A). Clearly, remote structural alterations had a surprising impact on the functionality of the catalytic site and/or its near vicinity. The quadruple (loop3B) mutant (Table 4A) of Val181-Ser184, adjacent to Glu179 (the general acid catalyst) and Glu180, thus gave a decreased relative specificity, [(kcat/Km)maltose/(kcat/Km)isomaltose], of 37 compared to a wild-type value of 421. The slight increase in activity for isomaltose was accompanied by a significant decrease in activity for maltose. Presumably the critical interaction of Glu180 with the second glucose ring of the substrate has been perturbed in this GA variant. The favorable effect on hydrolysis of the α-1,6-linked substrate indicates that the superior activity towards α-1,4-linked substrates is more sensitive to structural changes. The triple (loop3A) mutant GA (Table 4A) aimed at mimicking a distinct sequence of the Hormoconis resinae GA [42,45,46] to localize structural elements correlated with the high α-1,6-bond activity of that enzyme [51]. But the loop3A mutant GA only slightly favored α-1,6-bond cleavage and had decreased specificity towards α-1,4-linked substrates (Table 4A). While this result confirms that Val181-Ser184 in A. niger GA tolerates major structural modification, the Hormoconis resinae GA tripeptide conferred on GA a favorable relative specificity without improving the overall activity for α-1,6-linked substrates. The Ser185→His GA had 4-5 fold reduced specificity (kcat/Km) for maltose, maltoheptaose, and isomaltose (Table 4A). Ser119→Tyr and Ser185→His GAs, among a large number of mutants, possessed an unusual capacity to produce branched oligosaccharides under conditions emulating industrial

Table 4
Mutants at a distance from the site of catalysis. A. Kinetic parametersᵃ. B. Subsite 2 affinityᵃ and acarbose binding thermodynamicsᵇ

A.
Enzyme     Maltose                       Isomaltose                     Maltoheptaose
           kcat    Km     kcat/Km        kcat    Km     kcat/Km         kcat    Km     kcat/Km
           (s⁻¹)   (mM)   (s⁻¹ mM⁻¹)     (s⁻¹)   (mM)   (s⁻¹ mM⁻¹)      (s⁻¹)   (mM)   (s⁻¹ mM⁻¹)
wt         10.7    1.21   8.84           0.41    19.8   0.021           59.7    0.120  498
S119Yᶜ     10.1    1.05   9.6            0.36    42.0   0.0086          56.9    0.158  360
Loop3Aᵈ    5.65    3.58   1.58           0.23    22.9   0.010           31.8    0.171  186
Loop3Bᵈ    1.68    1.83   0.92           0.63    24.8   0.0254          38.5    0.316  122
S185H      4.87    2.04   2.39           0.30    55.2   0.0054          29.9    0.323  93

B.
Enzyme     Subsite 2 affinity   Ka          -ΔG°         -ΔH°         TΔS°
           (kJ mol⁻¹)           (M⁻¹)       (kJ mol⁻¹)   (kJ mol⁻¹)   (kJ mol⁻¹)
wt         -20.1                9.4×10¹¹    68.9         32.8         36.1
S119Yᶜ     -21.3                9.3×10¹¹    68.8         29.9         38.9
Loop3Bᵈ    -19.5                1.2×10¹¹    63.7         54.3         9.3
S185H

ᵃ pH 4.5, 45 °C; ᵇ 27 °C, ref. 29; ᶜ ref. 50; ᵈ Loop3A: V181T/N182Y/G183A; Loop3B: V181A/N182A/G183K/S184H

saccharification. Although the subsite map [50] and Ka for acarbose [29] were comparable to wild-type data, the enthalpy and entropy of binding varied, especially for Ser185→His GA (Table 4B), indicating that the mutation confers low complementarity, in accordance with a histidine being sterically unfavorable at position 185 in A. niger GA. After 96 hours of typical saccharification (Figure 5), hepta- and octasaccharides accumulated with Ser119→Tyr compared to wild-type GA. The substitution of Ser119 by tyrosine probably suppresses the low occupancy binding mode (Figure 4) due to steric conflicts between the bulky tyrosine side chain and the fourth sugar ring. Consequently, Ser119→Tyr GA might select the high occupancy conformer and promote condensation with glucose to generate poorly hydrolyzed branched oligodextrins. At present, we cannot discriminate between the possibility that Ser119→Tyr GA favors binding of branched oligodextrins, resulting in condensation, and the possibility that the condensation products, once formed, are less readily degraded by Ser119→Tyr than by wild-type GA because their productive binding to the otherwise highly active mutant is suppressed.

Figure 5. BioGel P2 separation of maltodextrin saccharification products (96 hrs) with wild-type and Ser119→Tyr GAs. The structures are evaluated by NMR spectrometry.
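The relative specificity used in this section, (kcat/Km)maltose/(kcat/Km)isomaltose, can be recomputed directly from the kcat/Km entries of Table 4A. The following Python sketch is purely illustrative; the dictionary and function names are hypothetical and the numbers are the rounded table values.

```python
# kcat/Km (s^-1 mM^-1) for maltose and isomaltose, copied from Table 4A
KCAT_KM = {
    "wt": (8.84, 0.021),
    "S119Y": (9.6, 0.0086),
    "Loop3A": (1.58, 0.010),
    "Loop3B": (0.92, 0.0254),
    "S185H": (2.39, 0.0054),
}


def relative_specificity(maltose, isomaltose):
    """Ratio of catalytic efficiencies for the alpha-1,4 and alpha-1,6 substrates."""
    return maltose / isomaltose


for enzyme, (maltose, isomaltose) in KCAT_KM.items():
    ratio = relative_specificity(maltose, isomaltose)
    print(f"{enzyme}: {ratio:.0f}")
```

With the rounded table entries the wild-type ratio comes out at 421, as quoted in the text, and the loop3B ratio at about 36, close to the quoted value of 37; the small difference presumably reflects rounding of the tabulated parameters.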

2.5. Recognition and binding of maltooligosaccharides at subsites 1 through 3

A thorough analysis of the roles in the binding mechanism of structural elements along the active-site cleft is made possible by presteady state kinetic measurements on selected mutants, as done here for i) Glu180, an important hydrogen bond acceptor of OH2 of maltose at subsite 2, ii) Trp52, and iii) Trp120, situated at the substrate binding crevice between subsites 1 and 2 and at subsite 3, respectively; both tryptophans belong to a hydrophobic cluster involved in stabilization of the enzyme-substrate complex [34] and contribute perhaps to activity via hydrogen bonds between indole NH groups and the general acid catalyst, Glu179 (Figure 4). Mutation reduced kcat (at 8 °C) for maltose hydrolysis to 0.2% (Trp52→Phe), 3% (Trp120→Phe), and 45% (Glu180→Gln) of that of wild-type GA [30-32]. Glu180→Gln GA slightly lost capacity in the initial association, K1 for maltose being 25 mM [32] compared to 3.4 mM for wild-type, while K1 of longer substrates was not affected. Remarkably, this mutant rearranged EL to E*L (Figure 6) with k2 of 10% of the wild-type rate and the reverse rate k-2, especially for longer substrates, around 20 times faster than for wild-type GA [32]. Trp120→Phe GA, mutated at the residue stacking with the ring at subsite 3 (Figure 4), had k-2 decreased only for maltose, by around 8 fold, and while k2 for maltose was similar to the wild-type value, decreasing k2 values were obtained for that mutant with increasing length of substrate [31]. With longer substrates EL is thus less readily rearranged to E*L when Trp120 is mutated, whereas wild-type GA behaves the opposite [31]. Mutation of Trp52, situated between subsites 1 and 2, caused slight changes at all steps of the binding mechanism. The efficiency in the last step of formation of productive complex E*L is indicated by K2 (i.e. k-2/k2), which for Glu180→Gln GA remained around 0.5, as compared to values for wild-type decreasing from 0.03 to 0.006, and for the Trp52→Phe and Trp120→Phe GAs from 0.007 to 0.001, for maltose through maltotetraose. The loss in catalytic efficiency is reflected in k3 (Figure 6) for acarbose binding, which is 140 s⁻¹ and thus

E + A ⇌ EA ⇌ E*A → E'A

E + L ⇌ EL ⇌ E*L → E + P

Figure 6. Pathways for binding of acarbose (A) or substrate (L) to glucoamylase (E) to form inhibited complex (E'A) or product (P). * signifies the step preceding the transition-state.

facilitated for Trp120→Phe compared to wild-type GA, for which k3 = 0.6 s⁻¹ [32]. To illustrate the effect at subsite 1 and the catalytic site, glucosyl-1-fluoride was used as substrate. Since the rate of hydrolysis (kcat/Km) catalyzed by wild-type and Trp120→Phe GA was similar [34], Trp120, despite the hydrogen bond with Glu179 [6-8], appears not to be essential for the chemical catalysis, but for stabilization of transition-state complexes with longer substrates [32]. This was observed also for Tyr116→Ala, another mutant in the hydrophobic cluster in contact with substrate via Trp120 [34].
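The meaning of K2 = k-2/k2 in the two-step scheme E + L ⇌ EL ⇌ E*L can be made concrete with a small calculation. The Python sketch below is an illustration under stated assumptions, not the analysis used in the cited work: it assumes that both binding steps are at equilibrium, that K1 is the dissociation constant of the first step in mM, and that K2 = [EL]/[E*L]; the function names are hypothetical.

```python
def fraction_productive(k2_ratio):
    """Fraction of bound enzyme present as E*L when K2 = k-2/k2 = [EL]/[E*L]."""
    return 1.0 / (1.0 + k2_ratio)


def overall_kd(k1_mm, k2_ratio):
    """Overall dissociation constant (mM): [E][L] / ([EL] + [E*L])."""
    return k1_mm * k2_ratio / (1.0 + k2_ratio)


# Maltose: wild-type GA (K1 = 3.4 mM, K2 = 0.03) versus Glu180->Gln GA
# (K1 = 25 mM, K2 around 0.5), using the values quoted in the text
for name, k1, k2_ratio in (("wild-type", 3.4, 0.03), ("Glu180->Gln", 25.0, 0.5)):
    print(
        name,
        f"E*L fraction = {fraction_productive(k2_ratio):.2f}",
        f"overall Kd = {overall_kd(k1, k2_ratio):.2f} mM",
    )
```

Under these assumptions nearly all bound wild-type enzyme is in the productive E*L form, whereas about one third of bound Glu180→Gln GA remains as EL, which illustrates why a larger K2 signals a less efficient final binding step.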

3. GENETIC AND CHEMICAL STUDIES OF BARLEY α-AMYLASE

α-Amylases are retaining endo-acting starch-hydrolases of the (β/α)8-barrel family [14,39,40,52]. Although the sequence similarity between α-amylases can be extremely low, a few invariant residues play key roles in catalysis, transition-state stabilization, and maintenance of the active site geometry. A large number of amylolytic and related enzymes representing 18 different EC classes show the same weak similarity [14]. Short sequence motifs recognized at (β/α)8-barrel β-strands furthermore define an evolutionary tree that respects taxonomy and enzyme specificity [20]. Based on this insight, selected motifs near essential residues of the catalytic domain are tentatively correlated with the specificity variation among the family members. A separate useful comparison concerns enzymes from closely related species or isozymes to identify structural features with special impact on physico-chemical and enzymic properties. Germinating barley seeds produce two α-amylase isozyme families, AMY1 and AMY2, showing approx. 80% sequence identity and distinct differences in temperature and pH stability, Ca²⁺ stimulation of activity, sensitivity to the endogenous inhibitor BASI (barley α-amylase/subtilisin inhibitor), turn-over rate and affinity of different substrates, and association with starch granules [53-55]. In AMY1, site-directed mutagenesis [28] at Asp180, Glu205, and Asp291 in the catalytic site and at His93 and His290 involved in substrate binding suggested that the histidines stabilize the transition-state, but since mutants at the three acid residues were inactive, their individual roles in catalysis were assigned entirely on the basis of the crystal structure of acarbose-AMY2: Glu205 as general acid catalyst, Asp180 as catalytic nucleophile, and Asp291 to guide the nucleophilic water for attack at C1 [3]. Mutation also confirmed that the chemically identified surface site important in adsorption onto granular starch includes Trp279 [28].

Currently three genetic engineering strategies are pursued with the aim of tailor-making new barley α-amylases. The crystal structure of AMY2 is available [3,56] and cDNAs encoding AMY1 and AMY2 have been expressed in Saccharomyces cerevisiae [54,55]. Occasionally poor yields prevented biochemical characterization of recombinant AMY2 or variant forms of both isozymes [54,55,57,58]. Both the random and the site-directed mutageneses are therefore made in AMY1 or in bipartite AMY1-AMY2 hybrids with an N-terminal AMY1 part [58]. Previously, random mutagenesis and formation of chimeras of different bacterial α-amylases supplied collections of engineered enzymes evaluated by screening, especially for improved stability [59-61]. We have now implemented related, more advanced techniques for the barley α-amylase. Background information is provided also from mutation at conserved residues in the second and fourth β→α loops of the (β/α)8-barrel of Saccharomycopsis fibuligera amylase, leading to enhanced transglycosylation and alteration of the action pattern on oligosaccharide substrates [62-64].

3.1. Barley α-amylase isozyme hybrids highlight the role of domain B

Functional bipartite AMY1-AMY2 hybrids were obtained by in vivo homologous recombination in yeast of the corresponding cDNAs followed by screening for transformants

Figure 7. Alignment of domain B from barley α-amylases 1 and 2 [67,68]. Conserved residues are in upper case. The cross-over points for the different AMY1-AMY2 hybrids [66] are indicated by vertical lines. Residues binding to the three Ca²⁺ in AMY2 (Ca500, Ca501, and Ca502) are indicated [56].

that secrete active enzyme. Reasonable amounts of hybrids, even with a large AMY2 content, were produced provided the first 54 amino acid residues were from AMY1 [58]. Such AMY1-AMY2 hybrids have been extremely useful in site-directed mutagenesis of residues conferring target isozyme specificity for BASI [38,65,66] (see also 3.4). Surprisingly, the behavior of hybrid isozymes complied with the origin of domain B [65], which protrudes from the (β/α)8-barrel at the C-terminus of the third β-strand [56]. Briefly, if the N-terminal AMY1 part comprised domain B (L161, Table 5), the hybrid resembled AMY1 in enzymic and stability properties. In contrast, if the C-terminal AMY2 portion comprised domain B, the hybrid was inhibited by BASI and had lower substrate affinity (V90; Table 5). However, the Ca²⁺ stimulation of activity, also characteristic of AMY2, was an exception. Since the ligands to the three Ca²⁺ found in AMY2 [56] are conserved in AMY1 and, except for the carbonyl group of Gly183, belong to domain B (Figure 7), it remains to be understood how the activity of all hybrids, even L161 with AMY1 domain B, is stimulated by Ca²⁺ [65,66]. The prospects of creating AMY1-AMY2 hybrids with useful combinations of isozyme traits encouraged the analysis of hybrids that cross over within domain B [66] (Table 5). Characterization of such hybrids suggests that the AMY1 segments Val90-Thr112 and

Table 5
Properties of AMY1-AMY2 hybrids

Crossoverᵃ   Substrate p-NPG7          Substrate amylose (DP=17)   Inhibitor                    Stability (pH 3.5)
             kcat    Km     kcat/Km    kcat    Km     kcat/Km      Acarbose       BASI          t½
             (s⁻¹)   (mM)              (s⁻¹)                       Ki,app (mM)    Ki (nM)       (min)
AMY1         660     0.5    1320       236     0.4    590          2.2            >10⁶          25
L161ᵇ        327     0.5    654        535     0.8    669          5.2            >10⁶          15
F144ᶜ        352     0.7    503        765     1.1    695          8.9            5×10³         5
R116ᶜ        215     0.5    430        880     1.3    677          11.5           0.62          2
T112ᶜ        310     0.5    620        659     0.7    941          4.0            0.73          2
V90ᵇ         400     2.4    167        647     1.1    588          11.4           0.33          <1
AMY2         405     2.5    162        684     1.2    570          6.8            0.22          <1

ᵃ Residues at the C-terminus of the AMY1 part are listed to identify the different hybrids; ᵇ ref. 65; ᶜ ref. 66

Ala145-Leu161 define higher affinity for the substrates p-nitrophenyl-maltoheptaoside and amylose, respectively, while AMY2 Leu116-Leu160 and Leu116-Phe143 are critical for the lower stability at pH 3.5 and the sensitivity to BASI, respectively [65,66]. In the AMY2 model, Leu116-Ala140 of domain B lacks direct contact with the (β/α)8-barrel [56]. It is conceivable that this exposed part contains determinants involved in BASI-AMY2 interaction, while segments in contact with the (β/α)8-domain influence the substrate affinity.

3.2. Random mutagenesis at the fourth β→α loop of the (β/α)8-barrel of AMY1

Barley α-amylase is described to contain 10 subsites accommodating glucosyl residues in linear substrates: six subsites at the glycon and four at the aglycon side of the catalytic site [69,70] (Figure 8). Considering that certain β→α substrate binding segments are

Figure 8. Schematic representation of the kinetically identified subsites in the active site of barley α-amylase [69,70].

readily identified at the sequence level in the amylase family [14,20], it is tempting to explore the variability permitted in such regions and the impact of structural changes on action pattern and specificity. In the case of barley α-amylase we focused on Arg183-Tyr185 in the fourth β→α segment of AMY1 as target for random mutation. Lys182 from the corresponding AMY2 sequence (Lys182-Tyr184) binds to the glucose residue at subsite 8 (Figures 8 and 9). Also the equivalent Lys210 in Saccharomycopsis fibuligera amylase is known to bind substrate close to the catalytic site [71]. A pool of AMY1 random mutant genes was transformed into yeast followed by screening on starch plates for colonies secreting active enzyme. The action pattern of a few selected mutants (Table 6) is studied using the substrates p-nitrophenyl-maltoheptaoside (pNPG7), pNPG6, and pNPG5. Product analysis indicated that the preferred productive binding mode of AMY1, where G-pNP occupies subsites 7 and 8, was altered. For both of the highly active mutants (Table 6), albeit a little less conspicuously for the Ser-Gly-Met than for the Asn-Gly-Tyr mutant, which also has higher activity, primarily p-nitrophenol was released from pNPG7 and pNPG6, suggesting that subsite 8 and maybe 7 were affected by the mutation. p-Nitrophenol, however, was not liberated from the poor substrate pNPG5. The Asn-Gly-Tyr mutant and wild-type AMY1 gave very similar action patterns on this substrate, while the Ser-Gly-Met mutant produced an unusually large amount of pNPG4. In addition, the mutants obtained by random mutagenesis suggested that Gly184 is strictly required for activity. All known α-amylase sequences, except those from cereals, have a conserved histidine (Figure 9) at the corresponding position. This histidine interacts with the glucose ring of the aglycon at the bond to be cleaved [1,15]. Mutation in human pancreatic α-amylase revealed that it has a multifunctional role in substrate binding, control of the pH-activity dependence, activation by Cl⁻, and inhibition by a proteinaceous inhibitor [72].

                                 *  bb
α-Amylases (not plants)        RXDXXKH
barley 1                       RLDFARG
barley 2                       RFDFAKG
Maltogenic α-amylase           RIDAVKH
Maltotetraohydrolase           RFDFVRG
Maltohexaohydrolase            RIDAVKH
α-glucosidase                  RIDTAGL
Cyclodextrinase                RLDVANE
Pullulanase                    RFDLMGX
Isoamylase                     RFDLASV
Amylase-pullulanase            RLDVANE
Neopullulanase                 RLDVANE
Oligo-1,6-glucosidase          RMDVINF
Dextran glucosidase            RMDVIDM
CGT-ase                        RXDAXKH
Amylomaltase                   RIDHFRG
Branching enzyme               RXDXVXS
Glycogen debranching enzyme    RLDNCHS

Figure 9. Alignment of short sequences containing the catalytic nucleophile (*) and residues (b) binding to glucose rings in the substrate aglycon. β designates residues in the fourth β-strand in known three-dimensional structures (modified from ref. 14).

Table 6
Random mutants at the substrate binding Arg183 in the fourth β→α segment in AMY1

Enzyme       Sequenceᵃ       Activity (starch plate)
Wild-type    Arg Gly Tyr     +++
#31          Arg Gly Cys     +
#73          Asn Gly Tyr     +++
#89          Thr Gly Leu     +
#117         Ser Gly Met     ++

ᵃ Mutated residues are in bold

3.3. Affinity labeling of AMY2 by a series of photoreactive maltooligodextrins

Enzyme-substrate interactions illustrated in acarbose-AMY2 [3] cover only subsites 6 through 8 (Figure 8). To get information on the distant subsites, a series of linear maltodextrins (maltotetraose, maltopentaose, and maltohexaose) was synthesized which carried the photoreactive 1-thio-3-diazirine-n-butyl group (diaz) at the reducing end [73]. The ³H-labeled analogs were covalently bound to AMY2 by photolabeling. In spite of the very concentrated reaction mixtures, reasonable stoichiometry of incorporation and loss of activity were obtained (Table 7). Modified parts of AMY2 are assumed to be in contact with the

Table 7
Labeling of AMY2 with reducing end photoreactive maltooligodextrins

Substrate analogue                 G4-diaz    G5-diaz    G6-diaz
Reaction mixtureᵃ:
  Gn-diaz (mM)                     30         30         10
  AMY2 (mg ml⁻¹)                   0.83       0.45       0.45
Productᵇ:
  Gn/AMY2 (mol/mol)                0.85       1.12       0.66
  Activity (% of native AMY2)      15         15         28

ᵃ Irradiation was carried out in 50 mM tris-ethanolamine-HCl pH 8.0, 10 mM CaCl2, for 20 min at 350 nm; ᵇ analyzed by ³H incorporation and activity towards pNPG7

photoreactive group at the reducing end of the substrate. The use of tritiated analogs facilitated identification of labeled peptide fragments using conventional protocols for proteolysis, peptide purification, and sequence analysis. Up to 20% of the incorporated label was recovered in a single peptide fragment and combined recoveries of around 60% were obtained within a small number of fragments. By comparing the localization of label in the AMY2 structural model [3,56] and the reported action patterns on maltooligodextrins of DP 5-7 [69,70], specific regions of the protein fold are tentatively assigned a specific subsite location, covered by the aglycon moiety (Table 8). This resulted in identification of the outermost subsite 10 at the segment between the extra helix, which succeeds the sixth β-strand, and the sixth helix of the (β/α)8-fold [56], as labeled by G5-diaz. Minor reaction occurred at the outer part of the fifth and at the eighth β→α segments. A high affinity at subsite 1 [69,70] (Figure 8) causes preferred productive binding of G6-diaz to label subsite 7. Subsite 9 was basically modified by G4-diaz. Less prominent reaction was expected with subsite 8, which is, however, illustrated in the acarbose-AMY2 model [3]. As a consequence probably of the highly concentrated reaction mixtures and enzymatic cleavage of the maltooligodextrin derivatives, other parts of AMY2 got labeled too (not shown), but to a lower degree. It is noteworthy that certain amylolytic enzymes do bind oligodextrins at aromatic groups on the surface in regions separate from the active site [2,3,10,74].

3.4. Characterization of side chains in AMY2 that stabilize the complex with BASI

The activity of AMY2 is specifically inhibited by BASI present in the mature barley seed. BASI belongs to the soybean trypsin inhibitor family containing β-trefoil fold proteins engaged in protein-protein interaction [75,76]. BASI is a tight, fast reacting inhibitor of Ki = 2.2×10⁻¹⁰ M [77] which follows a simple two-step binding pathway (Figure 10) where K1 = 0.4 mM, k2 = 320 s⁻¹ and k-2 = 7×10⁻⁵ s⁻¹ [78]. Chemical modification using group specific reagents suggested that arginines of BASI were important for the association with AMY2. Differential labeling

Table 8
Tentative localization of aglycon subsites to β→α segments of the (β/α)8-barrel

Subsiteᵇ

β→α segmentᵃ

7

5

+++

8

9

++

+++

+ +++

6, at α6a→α6b +

7, at α7a→α7b

+

8

10

+

++

ᵃ As defined in ref. 56; ᵇ shown in Figure 8.

BASI + AMY2 ⇌ BASI-AMY2 ⇌ BASI-AMY2*

Figure 10. Reaction scheme for BASI and AMY2 [78].

and subsequent sequence analysis of BASI revealed Arg27, Arg127 and Arg155 to be protected by AMY2 against inactivation by phenylglyoxal [76], suggesting a location at the protein-protein interface. Since a ternary complex composed of BASI, subtilisin, and AMY2 apparently can form [79,80], comparison of this labeling pattern and the three-dimensional structure of the homologous inhibitor from wheat, where the protease target site was identified [81], supported the putative identification of the AMY2-binding region [76]. BASI-AMY2 was recently crystallized [82] and the preliminary model, together with the model of the wheat homologue [81], guided a mutational approach to describe the importance of selected noncovalent bonds between BASI and AMY2 [38]. Advantage was taken of the higher production of AMY1-AMY2 hybrids to test the effect of mutation in the AMY2 part. Hybrids with domain B of AMY2 (V90) and AMY1 (L161) origin, respectively, enabled engineering of decreased affinity for BASI in AMY2 and installation of BASI recognition in AMY1 (Table 9). In the BASI-insensitive L161, the mutation Thr129→Arg mimics Arg128 in AMY2 and Lys130→Pro presumably secures the conformation needed to get significant sensitivity to BASI. In the reverse type of experiment, Arg128→Gln in hybrid V90 reduced the affinity for BASI by a factor of 10² (Table 9). A similar effect is obtained by substituting Asp142, which forms a salt-bridge to BASI, with the corresponding amide (Table 9). The affinity for BASI decreases with decreasing pH or increasing ionic strength [79,83,84], suggesting that ionic interactions dictate the formation of the complex. Additional support for this point was obtained in recent titration calorimetric studies, which demonstrated that the BASI-AMY2 complexation is enthalpy driven with only a small temperature dependence. Thus the hydrophobic effect appears to be of little importance and salt-bridges are most likely critical for the association (C.R. Berland, B.W. Sigurskjold, B. Svensson, unpublished data).

Table 9
Ki (nM) of BASI for AMY1, AMY2, AMY1-AMY2 hybrids, and AMY1-AMY2 mutantsᵃ

Enzyme formᵇ          Ki (nM)
AMY1                  >10⁶
L161                  >10⁶
L161-T129R/K130P      7×10³
V90-D142N             18
V90-R128Q             25
V90                   0.33
AMY2                  0.22

ᵃ at pH 8.0, ref. 38; ᵇ see Table 5 for hybrid nomenclature

5. CONCLUSION

Mutational analysis is a potent tool in structure/function relationship investigations in glucoamylase. Further characterization of mutants with respect to three-dimensional structure, binding thermodynamics, molecular recognition, and presteady state binding kinetics has expanded the understanding of functional roles played by a few selected residues in substrate recognition, specificity, catalysis, and condensation reaction. As a representative of the large family of amylolytic enzymes containing an α-amylase type (β/α)8 catalytic domain, barley α-amylase was subjected to isozyme hybridization, localized random mutagenesis in a binding β→α connecting segment, and site-directed mutagenesis to study the active site, surface site and BASI binding site functionalities. Alterations obtained in substrate affinities and action pattern identify specific areas, especially within domain B, as targets for subsequent engineering experiments. The knowledge gained is useful in guiding similar engineering attempts in related enzymes.

6. FUTURE CHALLENGES

Numerous open questions remain in the field of starch hydrolases and related enzymes. With regard to the α-amylase (β/α)8 family, distinct amino acid residues are located in β→α segments involved in substrate binding. Is binding loop replacement a valuable approach for rational alteration of specificity in amylolytic enzymes? Different family members contain multiple domains, the significance of which has to be elucidated. The domains may represent modules suitable for grafting onto a variety of structural contexts to confer altered properties. Some domain types may be involved in substrate binding or in adsorption of the enzyme onto biological surfaces, e.g. cell walls. CGTases currently represent the most complex structure described. They contain five domains, including a C-terminal one that participates in adsorption onto starch. However, domains N-terminal to the catalytic (β/α)8-barrel, as found in most debranching enzymes, are of distinct types, for which both a structural model and function are yet to be identified. Another challenge is to explore structural manipulation of domain B of the (β/α)8-barrel in the light of our findings on the crucial role of the isozyme origin of this small module for the properties of the barley α-amylase isozymes. Much is also unclear about the way Ca2+ controls function and stability in many amylolytic enzymes. Finally, do catalytic subsites and more distant subsites co-operate in polysaccharide metabolizing enzymes? Does co-operativity exist among individual domains? The structural basis for the multiple binding modes seen with maltodextrin analogs and glucoamylase, and their impact on function, is also poorly understood. Some of the above concepts are considerably more developed for cellulolytic enzymes. Synergistic substrate degradation involving several enzymes is thus a key question shared between that area and the amylolytic enzymes. As a reflection perhaps of the presence of starch branch points, the starch degrading enzymes seem to have evolved a larger variation in substrate and product specificity than found for the cellulolytic enzymes.

7. ACKNOWLEDGEMENTS

We are grateful to C. Dupont, J. Lehmbeck, B.E. Norman, K. Bock, T. Christensen, U. Christensen, C.R. Berland, B.W. Sigurskjold, J. Lehmann, and J. Steck for collaboration in studies not published previously. HPF received a long-term and NJ a short-term EMBO fellowship; BSt holds a post-doctoral fellowship from The Carlsberg Foundation. The financial support from the EU BAP and BRIDGE programmes, The Danish Agencies for Industry and Trade, and The Scandinavia-Sasakawa Foundation is gratefully acknowledged.

8. REFERENCES

1. M. Qian, R. Haser, G. Buisson, E. Duée and F. Payan, Biochemistry, 33 (1994) 6284.
2. S.B. Larson, A. Greenwood, D. Cascio, J. Day and A. McPherson, J. Mol. Biol., 235 (1994) 1560.
3. A. Kadziola, M. Søgaard, B. Svensson and R. Haser (1995) submitted.
4. B. Mikami, E.J. Hehre, M. Sato, Y. Katsube, M. Hirose, Y. Morita and J.C. Sacchettini, Biochemistry, 32 (1993) 6836.
5. B. Mikami, M. Degano, E.J. Hehre and J.C. Sacchettini, Biochemistry, 33 (1994) 7779.
6. E.M.S. Harris, A.E. Aleshin, L.M. Firsov and R.B. Honzatko, Biochemistry, 32 (1993) 1618.
7. A.E. Aleshin, L.M. Firsov and R.B. Honzatko, J. Biol. Chem., 269 (1994) 15631.
8. B. Stoffer, A.E. Aleshin, L.M. Firsov, B. Svensson and R.B. Honzatko, FEBS Lett., 358 (1995) 57.
9. C. Klein, J. Hollender, H. Bender and G.E. Schulz, Biochemistry, 31 (1992) 8740.
10. C.L. Lawson, R. Van Montfort, B. Strokopytov, H.J. Rozeboom, K.H. Kalk, G.E. De Vries, D. Penninga, L. Dijkhuizen and B.W. Dijkstra, J. Mol. Biol., 236 (1994) 590.
11. B. Strokopytov, D. Penninga, H.J. Rozeboom, K.H. Kalk, L. Dijkhuizen and B.W. Dijkstra, Biochemistry, 34 (1995) 2234.
12. Y. Matsuura and M. Kubota, in Enzyme Chemistry and Molecular Biology of Amylases and Related Enzymes (Ed. The Amylase Research Society of Japan, CRC Press, Tokyo, 1995) p. 153.
13. Y. Matsuura, in Enzyme Chemistry and Molecular Biology of Amylases and Related Enzymes (Ed. The Amylase Research Society of Japan, CRC Press, Tokyo, 1995) p. 137.
14. B. Svensson, Plant Mol. Biol., 25 (1994) 141.
15. Y. Matsuura, M. Kusunoki, S. Harada and M. Kakudo, J. Biochem., 95 (1984) 697.
16. E. Boel, L. Brady, A.M. Brzozowski, Z. Derewenda, G.G. Dodson, V.J. Jensen, S.B. Petersen, H. Swift, L. Thim and H.F. Wöldike, Biochemistry, 29 (1990) 6244.
17. M. Machius, G. Wiegand and R. Huber, J. Mol. Biol., 246 (1995) 545.
18. C.G. Cheong, S.H. Eom, D.H. Shin, H.K. Song, K. Min, J.H. Moon, K.K. Kim, K.Y. Hwang and S.W. Suh, PROTEINS: Structure, Function, and Genetics, 21 (1995) 105.
19. H. Kizaki, Y. Hata, K. Watanabe, K. Katsube and Y. Suzuki, J. Biochem., 113 (1993) 646.
20. H. Jespersen, E.A. MacGregor, B. Henrissat, M.R. Sierks and B. Svensson, J. Prot. Chem., 12 (1993) 791.
21. B. Svensson and M. Søgaard, J. Biotechnol., 29 (1993) 1.
22. H.P. Guan, T. Baba and J. Preiss, Cell. Mol. Biol., 40 (1994) 981.
23. M. Yamashita, T. Kinoshita, M. Ihara, T. Mikawa and Y. Murooka, J. Biochem., 116 (1994) 1233.
24. D. Penninga, B. Strokopytov, H.J. Rozeboom, C.L. Lawson, B.W. Dijkstra, J. Bergsma and L. Dijkhuizen, Biochemistry, 34 (1995) 3368.
25. B. Stoffer, T.P. Frandsen, B. Svensson and M. Gajhede, Carbohydrate Bioengineering Meeting, Elsinore April 23-26, 1995, abstr. P16.
26. M.R. Sierks, C. Ford, P.J. Reilly and B. Svensson, Protein Eng., 3 (1990) 193.
27. T.P. Frandsen, C. Dupont, J. Lehmbeck, B. Stoffer, M.R. Sierks, R.B. Honzatko and B. Svensson, Biochemistry, 33 (1994) 13808.
28. M. Søgaard, A. Kadziola, R. Haser and B. Svensson, J. Biol. Chem., 268 (1993) 22480.
29. C.R. Berland, B.W. Sigurskjold, B. Stoffer, T.P. Frandsen and B. Svensson, Biochemistry, 34 (1995) 10153.
30. K. Olsen, B. Svensson and U. Christensen, Eur. J. Biochem., 209 (1992) 777.
31. K. Olsen, U. Christensen, M.R. Sierks and B. Svensson, Biochemistry, 32 (1993) 9686.
32. K. Olsen, Ph.D. Thesis (1994) University of Copenhagen.
33. M.R. Sierks and B. Svensson, Protein Eng., 5 (1992) 185.
34. B. Svensson, B. Stoffer, T.P. Frandsen, M. Søgaard, M.R. Sierks, K.W. Rodenburg, B.W. Sigurskjold and C. Dupont, Proceedings of 36th Alfred Benzon Symposium (K. Bock and H. Clausen, eds.; Munksgaard, Copenhagen, 1994) p. 202.
35. M.R. Sierks, K. Bock, S. Refn and B. Svensson, Biochemistry, 31 (1992) 8972.
36. T.P. Frandsen, M.M. Palcic, B. Stoffer, R.U. Lemieux, O. Hindsgaul and B. Svensson, in preparation.
37. F. Garcia-Olmedo, G. Salcedo, R. Sanchez-Monge, L. Gomez, J. Royo and P. Carbonero, Oxford Surv. Plant. Mol. Cell. Biol., 4 (1987) 275.
38. K.W. Rodenburg, F. Vallée, N. Juge, X.J. Guo, J.C. Chaix, R. Haser and B. Svensson, Miami Bio/Technology Short Reports, 6 (1995) 9.
39. M.L. Sinnott, Chem. Rev., 90 (1990) 1171.
40. J.D. McCarter and S.G. Withers, Curr. Opin. Struct. Biol., 4 (1994) 885.
41. B. Svensson, A.J. Clarke, I. Svendsen and H. Møller, Eur. J. Biochem., 188 (1990) 29.
42. B. Stoffer, Ph.D. Thesis (1994) University of Copenhagen.
43. T.P. Frandsen, T. Christensen, B. Stoffer, J. Lehmbeck, C. Dupont, R.B. Honzatko and B. Svensson, Biochemistry, 34 (1995) 10162.
44. M.R. Sierks and B. Svensson, Biochemistry, 32 (1993) 1113.
45. P.M. Coutinho and P.J. Reilly, Protein Eng., 7 (1994) 393.
46. B. Henrissat, P.M. Coutinho and P.J. Reilly, Protein Eng., 7 (1994) 1281.
47. K. Bock and H. Pedersen, Acta Chem. Scand., B41 (1987) 617.
48. A.J. Wilkinson, A.R. Fersht, D.M. Blow and G. Winther, Biochemistry, 22 (1983) 3581.
49. M.M. Palcic, T. Skrydstrup, K. Bock, N. Le and R.U. Lemieux, Carbohydr. Res., 250 (1993) 87.
50. M.R. Sierks and B. Svensson, Protein Eng., 7 (1994) 1479.
51. R. Fagerström, J. Gen. Microbiol., 137 (1991) 1001.
52. C.-I. Brändén, Curr. Opin. Struct. Biol., 1 (1991) 978.
53. R.L. Jones and J.V. Jacobsen, Int. Rev. Cyt., 126 (1991) 49.
54. M. Søgaard and B. Svensson, Gene, 94 (1990) 173.
55. M. Søgaard, F.L. Olsen and B. Svensson, Proc. Natl. Acad. Sci. U.S.A., 88 (1991) 8140.
56. A. Kadziola, J. Abe, B. Svensson and R. Haser, J. Mol. Biol., 239 (1994) 104.
57. M. Søgaard, J.S. Andersen, P. Roepstorff and B. Svensson, Bio/Technology, 11 (1993) 1162.
58. N. Juge, M. Søgaard, J.C. Chaix, M.F. Martin-Eauclaire, B. Svensson, G. Marchis-Mouren and X.J. Guo, Gene, 130 (1993) 159.
59. Y. Suzuki, N. Ito, T. Yuuku, H. Yamagata and S. Udaka, J. Biol. Chem., 264 (1989) 18933.
60. L. Holm, A.K. Koivula, P.M. Lehtovaara, A. Hemminki and J.K.C. Knowles, Protein Eng., 3 (1990) 181.
61. P. Joyet, N. Declerck and C. Gaillardin, Bio/Technology, 10 (1992) 1579.
62. I. Matsui, K. Ishikawa, S. Miyairi, S. Fukui and K. Honda, Biochim. Biophys. Acta, 1077 (1991) 416.
63. I. Matsui, K. Ishikawa, S. Miyairi, S. Fukui and K. Honda, Biochemistry, 31 (1992) 5232.
64. I. Matsui, S. Yoneda, K. Ishikawa, S. Miyairi, S. Fukui, H. Umeyama and K. Honda, Biochemistry, 33 (1994) 451.
65. K.W. Rodenburg, N. Juge, X.J. Guo, M. Søgaard, J.C. Chaix and B. Svensson, Eur. J. Biochem., 221 (1994) 277.
66. N. Juge, K.W. Rodenburg, X.J. Guo, J.C. Chaix and B. Svensson, FEBS Lett., 363 (1995) 299.
67. J.C. Rogers and C. Milliman, J. Biol. Chem., 258 (1983) 8169.
68. J.C. Rogers, J. Biol. Chem., 260 (1985) 3731.
69. A.W. MacGregor, J.E. Morgan and E.A. MacGregor, Carbohydr. Res., 227 (1992) 301.
70. E.H. Ajandouz, J. Abe, B. Svensson and G. Marchis-Mouren, Biochim. Biophys. Acta, 1159 (1992) 193.
71. I. Matsui, K. Ishikawa, S. Miyairi, S. Fukui and K. Honda, FEBS Lett., 310 (1992) 216.
72. K. Ishikawa, I. Matsui, K. Honda and H. Nakatani, Biochem. Biophys. Res. Commun., 183 (1992) 286.
73. M. Blanc-Muesser, H. Driguez, J. Lehmann and J. Steck, Carbohydr. Res., 223 (1992) 129.
74. M. Qian, R. Haser and F. Payan, Protein Sci., 4 (1995) 747.
75. A.G. Murzin, A.M. Lesk and C. Chothia, J. Mol. Biol., 223 (1992) 531.
76. K.W. Rodenburg, É. Várallyay, I. Svendsen and B. Svensson, Biochem. J., 309 (1995) 969.
77. U. Sidenius, K. Olsen, B. Svensson and U. Christensen, FEBS Lett., 361 (1995) 250.
78. J. Abe, U. Sidenius and B. Svensson, Biochem. J., 293 (1993) 151.
79. J. Mundy, I. Svendsen and J. Hejgaard, Carlsberg Res. Commun., 48 (1983) 81.
80. R.J. Weselake, A.W. MacGregor, R.D. Hill and H.W. Duckworth, Plant Physiol., 72 (1983) 1008.
81. G.P. Pal, C.A. Kavounis, K.D. Jany and D. Tsernoglou, FEBS Lett., 341 (1994) 167.
82. F. Vallée, A. Kadziola, Y. Bourne, J. Abe, B. Svensson and R. Haser, J. Mol. Biol., 236 (1994) 368.
83. A. Törrönen, M. Leisola and S. Haarasilta, Cereal Chem., 69 (1992) 355.
84. A.J. Halayko, R.D. Hill and B. Svensson, Biochim. Biophys. Acta, 873 (1986) 92.


S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.

The structure and function relationship of Schizophyllum commune xylanase A

M.R. Bray and A.J. Clarke

Department of Microbiology, University of Guelph, Guelph, Ontario N1G 2W1, Canada

Abstract

The structure and function relationship of the Family G glycosidase, xylanase A from Schizophyllum commune, is reviewed. The action pattern of xylo-oligosaccharide hydrolysis by xylanase A suggests that the binding site spans about seven xylose units (i.e., seven sub-sites), and that the catalytic site is located asymmetrically within them. Ultraviolet difference spectroscopy with xylanase A in the presence of inhibitors and substrates together with chemical modification studies indicate the participation of a tyrosyl residue in the binding of substrates to xylanase A. This binding-site residue has been identified as Tyr97, a conserved aromatic amino acid residue among the Family G glycosidases. Xylanase A catalyzes the hydrolysis of xylan with the retention of anomeric configuration thereby implying a double displacement mechanism of action. Peptide mapping studies with a derivative of xylanase A inactivated by radiolabelled carbodiimide served to identify Glu87, a totally conserved acidic amino acid residue, as the nucleophile in the catalytic mechanism of action of the enzyme.

1. INTRODUCTION

The potent wood-degrading basidiomycete (white-rot fungus) Schizophyllum commune produces a full complement of extracellular cellulolytic and heteroxylanolytic enzymes. These include cellulases (EC 3.2.1.4; 1,4-(1,3;1,4)-β-D-glucan 4-glucanohydrolase), cellobiohydrolases (EC 3.2.1.91; 1,4-β-D-glucan cellobiohydrolase) and β-glucosidases (EC 3.2.1.21; β-D-glucoside glucohydrolase), which act synergistically to efficiently hydrolyse cellulose to glucose. Xylanases (EC 3.2.1.8; 1,4-β-D-xylan xylanohydrolase) catalyze the random hydrolysis of the xylan backbone of heteroxylans which are layered on cellulose fibrils and thereby expose the entire cellulose chains to the cellulolytic complex. The complete degradation of heteroxylans requires, in addition to xylanases, an array of debranching enzymes that may include β-xylosidases (EC 3.2.1.37), β-glucuronidases (EC 3.2.1.x), α-L-arabinofuranosidases (EC 3.2.1.55) and acetylesterases (EC 3.1.1.6) [1]. Moreover, many fungi and bacteria produce various and multiple forms of these respective enzymes; e.g., the fungus Sporotrichum dimorphosporum produces as many as nine distinct xylanases [2].

Xylanases have attracted considerable attention in recent years in view of their diverse industrial potential. Use of microbial enzymes for the industrial hydrolysis of biomass is advantageous because of the high specificity of enzymic reactions, the mildness of the reaction conditions, and the absence of substrate loss due to chemical modification. Two broad areas of application for xylanolytic enzymes have been identified: with and without other enzymes present [1]. Xylanases can be used in conjunction with other hydrolytic enzymes, such as cellulases, in the bioconversion of wastes, such as those from the forest industries [3]. Xylanases used concurrently with cellulases and pectinases may find wider uses in the clarification of must and juices, and for liquefying fruits and vegetables [1], as well as for extracting coffee, plant oils, and starch [4], for improving the nutritional properties of agricultural silage [5], for macerating plant cell walls and in the formation of protoplasts [6], and for providing different textures to bakery products [7].

Further application also exists for the use of xylanolytic enzymes in the absence of cellulolytic enzymes, particularly in the pulp and paper industry. An efficient and economical enzymic process could convert these polluting liquids into higher value compounds such as ethanol, xylulose, xylitol and xylonic acid [8]. Perhaps most significant, and most likely to be applied in the future, amongst the biotechnological processes in the pulp and paper industry is the incorporation of xylanase prebleaching, where the enzymes are used to enhance the delignification of pulps. The pulp and paper industry is under significant market, environmental, and legislative pressure to modify its pulping, bleaching, and effluent treatment technologies in order to reduce the environmental impact of mill effluents [4]. The incorporation of xylanase prebleaching is being considered and tested because it permits the use of lower chlorine charges during the bleaching of kraft pulps, the bleach boosting effect being associated with reduced chloro-organic discharges [4].

To realize the potential of xylanases in industry, it was, and continues to be, necessary to expand our knowledge concerning the behaviour of these enzymes at the structural and mechanistic level. With this information, it may be possible to modify the enzymes more specifically, such as by cross-linking or immobilization, and thereby increase their stability and expand their applicability. Hydrolysis efficiency, specificity, or stability of xylanases may also benefit from application of structure-function information through genetic and protein engineering. An eventual goal of this type of work is to be able to engineer "designer" enzymes with industrially ideal characteristics, but which retain exquisite specificity for a particular task.

The xylanase from Schizophyllum commune is an ideal candidate for such studies since it is both produced in relatively large quantities and released extracellularly. Both these properties facilitate its routine isolation and purification to homogeneity. In addition, it is characterized by a low molecular weight and no glycosylation, and it is highly active. This review will describe the current knowledge concerning the mode of action, structure and proposed catalytic mechanism of S. commune xylanase A. In addition, structural information and evidence for a lysozyme-type mechanism of action will be considered, with reference to other related xylanases.

2. ENZYME ISOLATION AND PURIFICATION

Screening experiments with both stationary and submerged cultures conducted by Jurasek and co-workers in the late 1960s had indicated that Schizophyllum commune was a potent producer of both cellulase and xylanase [9,10]. Early attempts were made to produce the enzymes by shake flask cultivation using beechwood sawdust as the major carbon source [10]. Also at this time, small amounts of an S. commune xylanase were isolated by electrophoresis [11]. A protocol was finally developed by Paice et al. [12,13] for the isolation and purification of xylanase A from S. commune strain Delmar grown in submerged culture with spruce sawdust as carbon source. The enzyme was isolated from the culture filtrate by ethanol fractionation. Anion-exchange chromatography and gel filtration on DEAE-Sephadex A-50 and Sephadex G-50, respectively, served to render a homogeneous preparation of xylanase A with an overall yield of 25.3% (43-fold purification) [12,13]. A slight modification to this protocol has been described in which, prior to the anion-exchange chromatography, an initial gel filtration of the crude xylanase is performed to remove the bulk of the extracellular polysaccharides still present in the ethanol precipitate of the crude enzyme [14]. The results of a typical experiment using this latter protocol are presented in Table 1. In most cases, analysis by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) reveals the presence of only one protein band at 21 kDa. However, xylanase A has been shown to tightly bind carbohydrates [12], and residual carbohydrates and/or small amounts of glycoproteins do sometimes remain in the xylanase preparation. In these instances, affinity chromatography on Con A-Sepharose has been employed for their removal [14]. The modified protocol is suitable for the purification of large quantities of homogeneous xylanase (15-20 mg/l of culture). The enzyme is highly stable and can be kept in dilute solutions at 4 °C for several weeks with no discernible loss of activity [14].

Table 1
Purification of xylanase A from culture filtrate of S. commune Delmar

Treatment             Total protein   Total activity   Specific activity   Purification          Overall yield
                      (mg)            (IU)             (IU/mg)             Per step   Overall    (%)
Culture filtrate      1140            1140             1.0                 -          -          100
Ethanol precipitate   166             876              5.4                 5.4        5.4        76.9
Sephadex G-50         78.8            665              8.4                 1.6        8.6        56.4
DEAE-Sepharose        20.4            429              21.1                2.5        21.6       37.7
BioGel P-60           7.80            300              38.3                1.8        38.9       26.4
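The derived columns of a purification table follow from the total protein and total activity at each step: specific activity is activity divided by protein, overall purification is the specific activity relative to the starting material, and yield is the activity recovered relative to the starting activity. The sketch below recomputes them from the protein and activity figures of Table 1. The function and field names are illustrative, and the recomputed values need not match the printed columns exactly.

```python
# (step, total protein in mg, total activity in IU), as listed in Table 1
steps = [
    ("Culture filtrate", 1140.0, 1140.0),
    ("Ethanol precipitate", 166.0, 876.0),
    ("Sephadex G-50", 78.8, 665.0),
    ("DEAE-Sepharose", 20.4, 429.0),
    ("BioGel P-60", 7.80, 300.0),
]


def purification_table(rows):
    """Derive specific activity, overall fold purification and yield per step."""
    _, first_protein, first_activity = rows[0]
    base_specific = first_activity / first_protein
    table = []
    for name, protein_mg, activity_iu in rows:
        specific = activity_iu / protein_mg
        table.append({
            "step": name,
            "specific_activity": specific,                # IU/mg
            "fold_overall": specific / base_specific,     # relative to first step
            "yield_percent": 100.0 * activity_iu / first_activity,
        })
    return table


if __name__ == "__main__":
    for row in purification_table(steps):
        print(f"{row['step']:<20} {row['specific_activity']:6.1f} IU/mg "
              f"{row['fold_overall']:6.1f}-fold {row['yield_percent']:6.1f} %")
```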

3. PHYSICO-CHEMICAL PROPERTIES

Xylanase A is a single chain polypeptide with a molecular weight of 21 kDa, as judged by SDS-PAGE [14], and has a pI value of 4.5 [12]. Unlike the cellulolytic enzymes of S. commune [15,16], xylanase A is not a glycoprotein [12-14]. Its amino acid sequence (197 residues) has been determined by automated protein sequencing analysis (Figure 1) and the two cysteine residues at positions 111 and 160 form a disulfide linkage [17]. The amino acid sequence is similar to two Trichoderma xylanases (approximately 56% identical amino acids), and also shows at least 40% identity with the xylanases from Bacillus circulans, B. pumilus and B. subtilis [17]. Alignment of the xylanase sequence with those of the over 60 known β-1,4-glucanases and xylanases places this enzyme in Family G of the classification scheme of Gilkes et al. [18]. This family is comprised of only xylanases, but of both bacterial and fungal origin. A number of amino acid residues appear to be totally conserved among the Family G xylanases, including two acidic and eight aromatic residues.

  1 SGTPSSTGTDGGYY_YSWWTDGAG DATYQN NG_GGSYTLTWSGNNGNLV
 48 G G K~WN PGAAS RS ISYSGTYQ PN G NSY LSV..Y..6...t:VTRSS L I..E...Y..YIVESYGSu
 98 D_PSSAASHKGSVTC N_GAT_YDILSTWRYNAPSlDGTQTFE~FWSVn:N PKKA
148 P_GGSISGTVDVQC_~FDAffKG L_GMNLG SEH NYQIVAT_EGYOSS_GTATITV

Figure 1. Amino acid sequence of xylanase A [17]. Residues in bold type indicate homology amongst at least 10 of the 19 aligned sequences of Family G xylanases [36], while underlined residues identify complete conservation. The two totally conserved Glu residues are shaded. The putative stabilizing anion/nucleophile, Glu87, and the substrate-binding Tyr97 are identified by the asterisks.

The pH optimum exhibited by xylanase A is broad, being centred at pH 5, and the enzyme is most stable within a pH range of 6 to 8 (>90% residual activity after 65 h at 30 °C) [14]. Its optimum temperature for activity (10 min assay) is between 45 °C and 50 °C [12,13], and the enzyme is stable at these temperatures for approximately 60 min [17]. Whereas the disulfide linkage present in the enzyme likely contributes to its thermostability, other factors, such as electrostatic and hydrophobic interactions, are probably more important since the other characterized Family G xylanases do not possess disulfides and at least one, the B. subtilis enzyme, is more thermostable [17].

4. KINETICS

Xylanase A is an endoxylanase [19] with specificity only for xylans; no appreciable cellulase, β-glucosidase or β-xylosidase activity has been detected [13]. The major products of hydrolysis following an 18 h incubation with xylans from either larch (a softwood heteroxylan comprised of a xylose backbone with 4-O-methyl-α-D-glucuronic acid substituents) or esparto grass (a simple homoxylopyranose polysaccharide) are xylobiose and xylose in a ratio of approximately 3:1. Its Km for soluble xylan is 8.4 mg/ml and homogeneous preparations are characterized by a specific activity of 1.5 × 10^3 IU/min [13]. The kinetics of xylanase A-catalyzed hydrolysis of a series of xylan homopolymers has been studied to assess the effect of substrate chain length on the process of catalysis through product ratios, Michaelis parameters, and bond cleavage frequencies. To this end, an HPLC-based method was developed to concomitantly quantify both substrate loss and increase of reaction products without the use of radioactivity [19]. In general, xylanase A attacks the longer xylo-oligomers in a mixture, rapidly hydrolysing them to xylobiose and xylotriose. Unlike most other endoxylanases studied in terms of mode of action [20-23], xylanase A exhibits no exo-xylanase character, a trait which is indicated by the lack of xylobiase activity, but cleavage of xylotriose to xylose and xylobiose occurs [19]. A similar action pattern to that of xylanase A is exhibited by an endo-1,4-xylanase excreted by Clostridium thermolacticum [24], which hydrolysed 4-O-methylglucuronoxylan to the neutral end products xylobiose and xylotriose. Xylanase A does not readily catalyze multi-substrate reactions (e.g., transglycosylation), a characteristic that is unusual among other xylanases [21,25-28] and many related carbohydrases but may be related to a relatively short-lived carbonium ion intermediate in the mechanism of catalysis of the enzyme [29]. However, preliminary hydrolysis experiments using high concentrations of xylotriose indicated that under appropriate conditions, product ratios shift as a result of the contribution of multi-substrate mechanisms to substrate degradation. To explain the observed concentration-dependent product-ratio shift of xylotriose hydrolysis (i.e., more xylobiose present than expected for unimolecular hydrolysis), at least three mechanisms have to be considered [30]: transglycosylation, condensation, and shifted one-two binding (termolecular shifted complexes). The former two degradative pathways are illustrated schematically in Figure 2. They may occur if the enzyme-intermediate complex is sufficiently long-lived for the leaving group to diffuse away and an acceptor molecule to diffuse in and/or shift to the catalytic site. Clearly, the longevity of the enzyme-intermediate complex, and therefore the propensity of an enzyme to undergo multi-molecular reactions, is largely dependent upon the strength of the positive contribution to binding energy of subsites adjacent to the catalytic site.

Figure 2. Possible bi-molecular pathways for degradation of xylotriose by xylanase A. The circles represent xylose residues, while the filled circles denote reducing ends. To result in more xylobiose than expected by the normal uni-molecular hydrolytic pathway (a) under conditions of high xylotriose concentrations, condensation (b) or transglycosylation (c) reactions may precede hydrolysis.

Initial rate data obtained from the xylanase A hydrolysis of individual oligomers at 25 °C and pH 5.8 indicated that Km decreased with increasing chain length of oligomer while kcat increased with chain length up to a degree of polymerization of seven (Figure 3) [19]. Assuming that Km as well as Vmax and bond cleavage frequencies are directly related to the free energy released in the process of monomer-subsite interaction [31], it appears that xylanase A has at least seven subsites capable of binding xylose residues (Figure 3). Bond-cleavage frequencies indicate a strong preference for internal linkages and that the catalytic site is located asymmetrically within the binding subsites of the enzyme. A thermodynamic binding array calculated by the method of Thoma and Allen [32] using the bond-cleavage frequencies is illustrated in Figure 4 [33]. As is typical of enzyme binding sites, the seven subsites vary in their affinity for the substrate xylosyl residues. The sum of affinities of subsites -I and I constitutes a region of positive free energy which may distort the chair form of a bound xylosyl residue into a half-chair, producing the enzyme-induced strain which is proposed to lower the activation energy barrier to form the transition state. The interaction energy for these two subsites is given as a sum since all productive complexes fill both subsites -I and I. The effects of a subsite of low affinity near the active site (i.e., subsite II) would be overcome upon binding of substrates with a degree of polymerization greater than 4 due to the greater affinities associated with subsites -II, -III and III. The strong "endo" character of this binding array is most apparent with these rapidly hydrolysed substrates, and is reflected in the reduced likelihood of a complex being formed the closer one end of the oligomer comes to the repelling effects of subsites -I and I.
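The link between bond-cleavage frequencies and subsite energies can be illustrated with a simplified calculation. If two productive binding modes of the same oligomer are assumed to be hydrolysed with the same rate constant, the ratio of their cleavage frequencies reflects the difference in their binding free energies, ΔG_a − ΔG_b = −RT ln(f_a / f_b). The sketch below applies this relation. It is not the full procedure of Thoma and Allen [32], the equal-rate-constant assumption is a simplification, and the input frequencies are hypothetical values chosen only for illustration.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)


def binding_energy_difference(freq_a: float, freq_b: float,
                              temperature_k: float = 298.15) -> float:
    """Return dG_a - dG_b (J/mol) for two productive binding modes.

    freq_a and freq_b are the bond-cleavage frequencies (any common unit)
    attributed to the two modes. A negative result means mode a binds more
    favourably. Assumes equal hydrolytic rate constants for both modes.
    """
    if freq_a <= 0 or freq_b <= 0:
        raise ValueError("frequencies must be positive")
    return -R * temperature_k * math.log(freq_a / freq_b)


if __name__ == "__main__":
    # Hypothetical frequencies (percent of total cleavage events) at 25 degrees C
    diff = binding_energy_difference(72.1, 27.9)
    print(f"{diff / 1000:.2f} kJ/mol")
```

Because the two modes of a shifted oligomer differ only in the subsite gained at one end and the subsite vacated at the other, such differences, collected over a series of chain lengths, are what allow individual subsite energies to be assigned.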

5. STEREOCHEMISTRY

The use of high-field proton NMR to monitor the chemical shift and coupling constant of the anomeric proton on the hemiacetal carbon of reaction products has greatly facilitated the determination of the stereochemical course of hydrolysis catalyzed by glycanases [34]. Reaction conditions are established such that the initially formed anomer of a hydrolytic reaction accumulates sufficiently to permit analysis by proton NMR before mutarotation of the products occurs. Application of this technique by Gebler et al. [35], using xylotetraose as the substrate, revealed that xylanase A catalyzes hydrolysis with the retention of anomeric configuration. Mechanistically, this implies that this enzyme operates by the double displacement mechanism of Figure 5. The basic features of this mechanism were first proposed for glycosidases by Koshland in 1953 [36] and they have generally remained intact. As modified by Sinnott [29], the mechanism involves the following events: i) an acid catalyst protonates the substrate; ii) a carboxylate group on the enzyme is positioned on the opposite side of the sugar ring to the aglycone; iii) a covalent glycosyl-enzyme intermediate is formed with the carboxylate in which the anomeric configuration of the sugar intermediate is opposite to that of the substrate; iv) the covalent intermediate may be reached from both directions through transition states involving oxocarbonium ions; and v) various non-covalent interactions provide most of the rate enhancement. That the pH-activity profile for xylanase A is bell-shaped and centred at pH 5 [12] gives direct evidence for the participation of two catalytically essential ionizable groups on the enzyme, one with an acidic pKa and remaining deprotonated at neutral pH (stabilizing anion/nucleophile) while the second has a pKa value closer to neutrality and remains protonated (acid catalyst).
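A bell-shaped pH-activity profile of this kind is commonly described by a two-pKa model in which only the singly protonated enzyme form is active, so that the active fraction is 1 / (1 + 10^(pKa1 − pH) + 10^(pH − pKa2)) and the optimum lies midway between the two pKa values. The sketch below evaluates this model. The pKa values of 3.5 and 6.5 are hypothetical, chosen only so that the optimum falls at pH 5; they are not measured values for xylanase A.

```python
def active_fraction(ph: float, pka_nucleophile: float, pka_acid: float) -> float:
    """Fraction of enzyme in the singly protonated (active) form.

    The nucleophile (lower pKa) must be deprotonated and the acid catalyst
    (higher pKa) must be protonated for the enzyme to be active.
    """
    return 1.0 / (1.0 + 10.0 ** (pka_nucleophile - ph) + 10.0 ** (ph - pka_acid))


# Hypothetical pKa values, chosen only so that the optimum falls at pH 5
PKA_NUCLEOPHILE = 3.5
PKA_ACID = 6.5

ph_values = [2.0 + 0.1 * i for i in range(71)]  # pH 2.0 to 9.0 in 0.1 steps
profile = [(ph, active_fraction(ph, PKA_NUCLEOPHILE, PKA_ACID)) for ph in ph_values]
optimum_ph, optimum_fraction = max(profile, key=lambda point: point[1])

if __name__ == "__main__":
    print(f"optimum near pH {optimum_ph:.1f}, active fraction {optimum_fraction:.2f}")
```

In this model, moving the two pKa values closer together lowers and narrows the peak, while moving them apart produces the broad plateau typical of enzymes with a wide pH optimum.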

Figure 3. Bond cleavage frequencies and Michaelis constants of xylanase A catalyzed hydrolysis of xylo-alditols at 25 °C and pH 5.7. Frequencies are expressed as percentages of total cleavage events. The arrowhead denotes the probable site of hydrolysis within the binding subsites. The data and figure are adapted from ref. 19.

Figure 4. Relative subsite binding energies for xylanase A obtained from quantitative bond-cleavage frequency analysis and Michaelis parameters. The position of the catalytic site is denoted by the arrowhead.

Figure 5. Generally accepted endocyclic pathway of the double-displacement mechanism for the retaining β-glycosidases.

6. DIFFERENCE SPECTROSCOPY

Ultraviolet spectroscopic studies of xylanase A in the presence of substrates or inhibitors provided evidence for the participation of one or more Tyr residues in the binding of ligands [37]. Binding of xylooligomer substrates to the enzyme produces a characteristic difference spectrum with a maximum at 283-286 nm indicating that the environment of a Tyr residue(s) located in the binding cleft becomes more hydrophobic. A slight red shift of difference absorbance maxima at approximately 280 and 287 nm relative to a Tyr model compound (with maxima at 275, 283 and 286 nm) and a decrease in the band at 283 nm (Figure 6) suggested either the formation of new hydrogen bonds or the disruption of an intrinsic hydrogen bonding network involving the Tyr residue. Since the xylooligomers used in these experiments would have been rapidly hydrolysed to xylobiose and xylotriose, the perturbed Tyr residue could be located in subsites -II, -III or -IV of Figure 4. However, the apparent subsite binding energies would predict that xylose preferentially binds subsite -IV and this weak inhibitor was observed to neither interact directly with the Tyr residue(s) nor induce sufficient conformational changes to perturb it. Thus, the perturbed Tyr residue is postulated to be localized to subsite -III or -II.

Figure 6. Difference ultraviolet absorbance spectra of (a) 50 µM xylanase A perturbed with 0.75% (wt/vol) xylooligomers and (b) N-acetyltyrosine ethyl ester and (c) N-acetyltryptophan ethyl ester, each perturbed with 40% dioxane. The spectra were recorded with a Beckman DU8 single-beam spectrophotometer at pH 6.0 and 20 °C. The solid bars each represent 0.01 absorbance units. The data are adapted from [37].

7. CHEMICAL MODIFICATION

An essential step in understanding the catalytic process of an enzyme is to locate and characterize the amino acids comprising the active site, and to study the relationship between these residues and enzyme function. This relationship can be probed effectively by the technique of chemical modification. This approach has the advantage over site-directed mutagenesis in that the starting material for the experiments is fully active, native enzyme. However, interpretation of the results of chemical modification requires care because the physical and chemical properties of the product resulting from modification may affect activity adversely, favourably, or not at all depending on the role of a residue in the catalytic process. In addition, correlation of activity with chemical modification is usually based upon various and often tacit assumptions; for example, that the specificity of the reagent employed is absolute, that the conditions under which the reaction is performed do not disturb protein structure and that the activity changes are not the consequence of either conformational changes or denaturation. In the studies with xylanase A, great care was taken to ensure that these inherent limitations of chemical modification were minimized or eliminated to allow unambiguous interpretation of the data.

7.1. Substrate-binding residues

The participation of the aromatic residues Trp and Tyr in the binding of substrate to xylanase A was probed with the group-specific reagents N-bromosuccinimide (NBS) and tetranitromethane (C(NO2)4), respectively [14,37]. The results of these studies, which are summarized in Table 2, indicated that both of these amino acid residues are involved. Reaction with NBS led to the complete inactivation of the enzyme. In the course of the chemical modification of xylanase A with NBS, reduced specificity of the reagent resulted in the modification of Tyr, as detected by a sharp increase in absorbance at 280 nm with increasing NBS concentrations. Such reactions invalidated the use of the spectrophotometric method of Spande and Witkop [38] for Trp quantitation, which is based on the decrease in absorbance at 280 nm. To circumvent this problem, a fourth-derivative spectrophotometric method for Trp and Tyr quantitation was developed [39]. Employing this new method, NBS-oxidized xylanase A with a residual enzymatic activity of less than 10% was shown to retain 4 of its 8 Trp residues and 16 of its 17 Tyr residues (Table 2). The presence of xylopentaose in the NBS reaction mixtures did not protect from loss of activity, but at concentrations greater than 1%, one fewer Trp residue was oxidized. These data imply that one Trp residue participates in binding interactions with substrate, while one, two or three others which are not located in the binding site cleft are required to maintain the structural integrity of a catalytically competent enzyme. At present, the identity of the binding-site Trp residue remains unknown, but likely candidates would include one of the two totally conserved Trp residues (Trp80 and Trp165 of xylanase A) of the Family G enzymes (Figure 1).
Investigations on the identity and possible role of a binding-site Tyr in xylanase A were initiated based on the compelling evidence obtained from both the difference spectroscopic study and NBS modification experiments described above. Reaction of xylanase A with a 100-fold molar excess of C(NO2)4 led to the nitration of approximately 3 Tyr residues and the concomitant elimination of catalytic activity, while N-acetylimidazole (NAI) in great excess had little effect. Through differential labelling with C(NO2)4, employing xylopentaose as protective ligand, and peptide mapping studies, Tyr97 was identified as the catalytically-essential Tyr residue [37]. The differences in both the chemical properties of the two Tyr-specific reagents and their effects on xylanase A are consistent with Tyr97 being located in the relatively hydrophobic environment of a substrate binding cleft. Tetranitromethane is a relatively nonpolar reagent and typically reacts with buried residues, whereas NAI is quite polar and has been shown to preferentially modify exposed surface Tyr residues [40]. The nitration of a Tyr would have several consequences, including the effects of the bulky nitro group being placed ortho to the phenolic hydroxyl function. The substituent nitro group would withdraw electrons from the benzene ring (inductive effect) and thereby lower the pKa of the phenolic hydroxyl from approximately 10.3 to 7.3 [41].

Table 2. Chemical modification of xylanase A.

Amino acid     Reagent    Molar     No. modified residues         % Residual
specificity               excess                                  activity
Asp/Glu        WRK        20,800    n.d.                          <1
               CMC        20,800    n.d.                          <1
               EDC        20,800    n.d.                          <1
               EAC        20,800    3.9 / 12 Glu+Asp              <1
Cys            DTNB       46        0                             100
His            DEP        1,040     3.0 / 3 His                   100
Trp            NBS        30        3.97 / 8 Trp; 1.10 / 17 Tyr   <1
               HNB        470       n.d.                          75
Tyr            NAI        12,000    n.d.                          65
               C(NO2)4    14        0.7 / 17 Tyr                  84
               C(NO2)4    100       2.8 / 17 Tyr                  4.8

n.d. - not determined. Data are taken from [14,37,39].

Thus, the nitrated Tyr97 would be partially ionized at physiological pH and this may critically disrupt the function of a binding residue in the tight confines of the binding cleft. As with the difference spectroscopic investigations described above, xylobiose and xylotriose would have been the predominant protective ligands remaining in the reaction mixtures during the C(NO2)4 differential labelling studies. Hence, it is very tempting to speculate that Tyr97 is the Tyr residue predicted to be located in subsites -II or -III and perturbed by ligands. This would preclude any possibility that Tyr97 plays a more active role in the mechanism of substrate hydrolysis, such as the proton donor or stabilizing anion as has been suggested for Tyr503 of Escherichia coli lacZ β-galactosidase [42] or Tyr48 of Aspergillus niger glucoamylase [43], respectively.
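The extent of ionization implied by this pKa shift can be estimated with the Henderson-Hasselbalch relation. The following Python sketch is illustrative only: the function name and the pH values are assumptions made for the example, and the two pKa values are those quoted in the text [41].

```python
def fraction_ionized(pH: float, pKa: float) -> float:
    """Fraction of a weak acid present in its deprotonated form
    (Henderson-Hasselbalch relation)."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


# pKa values quoted in the text; the pH values are illustrative choices.
PKA_TYR = 10.3
PKA_NITRO_TYR = 7.3

for pH in (6.0, 7.0):
    tyr = fraction_ionized(pH, PKA_TYR)
    nitro = fraction_ionized(pH, PKA_NITRO_TYR)
    print(f"pH {pH:.1f}: Tyr {tyr:.3%} ionized, nitro-Tyr {nitro:.1%} ionized")
```

Under these assumptions roughly one third of a nitrated Tyr side chain would be deprotonated at pH 7.0, compared with about 0.05% for unmodified Tyr. The estimate ignores any further pKa shift imposed by the protein environment.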

7.2. Catalytic residues

The nature of the pH-activity profile for xylanase A [12] implicates the participation of both an acidic amino acid and possibly a histidyl residue in its mechanism of catalysis. The latter possibility arises because the basic limb of the profile is consistent with the ionization of an imidazolium group, which typically has pK values in the range of 5.6-7.0 in proteins [44]. Alternatively, a second acidic residue may be located in a hydrophobic environment resulting in an increase of its pKa. These possibilities were readily resolved using group-specific reagents. All three histidyl residues in xylanase A could be modified by diethylpyrocarbonate (DEP) with no concomitant loss of activity (Table 2) [14]. Since the lack of reactivity with 5,5'-dithiobis(2-nitrobenzoic acid) (DTNB) precluded the presence of free sulfhydryls (the two Cys residues present in the protein have since been confirmed to form a disulfide bond [17]), two acidic residues were implicated in catalysis by the enzyme. This was indeed borne out by reaction of the enzyme with carbodiimides. The water-soluble carbodiimide 1-(4-azonia-4,4-dimethylpentyl)-3-ethylcarbodiimide iodide (EAC) in the absence of added nucleophile inactivates xylanase A very rapidly and completely relative to the other carbodiimides, 1-ethyl-3-[3-(dimethylamino)propyl]carbodiimide (EDC) and 1-cyclohexyl-3-(2-morpholinoethyl)carbodiimide (CMC), or the isoxazolium salt N-ethyl-5-phenylisoxazolium-3'-sulfonate (WRK) [14]. EAC was therefore used for further studies. Analysis of the order of reaction of EAC with the enzyme indicated that an average of at least one molecule of inhibitor binds one molecule of xylanase for inactivation. The inclusion of xylooligomer substrates in the reaction mixtures provided a concentration-dependent protection from inactivation, indicating that the essential residue is located at the active site.
Support for this proposal was provided by differential labelling studies employing xylooligomers as protective ligands and [3H]EAC which demonstrated that only one active-site carboxyl group is modified to inactivate the enzyme. These observations, however, do not preclude the existence of a second catalytically-essential acidic residue at the active site. The catalytically-essential residue susceptible to EAC modification was identified as Glu87 [45]. Reacting exposed carboxyl groups with unlabelled EAC in the presence of substrate, followed by removal of excess reagent and protecting ligand and then complete inactivation of the enzyme with [3H]EAC allowed isolation of peptides with radiolabel only associated with Glu87. Alignment of the amino acid sequences of the Family G enzymes reveals that Glu87 is one of only two totally conserved acidic amino acids, the second being Glu184 (Figure 1). The proposed mechanism of carbodiimide action requires that protonated carboxylic acids catalyze their own modification by both protonating the carbodiimide nitrogen atom and thereby also providing the attacking nucleophile carboxylate anion (Figure 7, reaction pathway a). In view of both this proposal, and that the EAC modification reactions described above were conducted at pH 6.0, it would initially appear that Glu87 participates in the mechanism of xylanase A action as the acid catalyst. However, as suggested by Svensson et al. [46], it is easily conceivable that in the confined space of the active site the carbodiimide is protonated by one specific residue and then is attacked by a carboxylate anion in close proximity to the former (Figure 7, reaction pathway b); a situation which is wholly consistent with the mechanistic scheme proposed for carbohydrases involving a catalytic diad. Evidence for the latter proposal is provided by amino acid sequence homology studies and X-ray crystallography. 
Indeed, the identity of a putative catalytic residue in xylanase A allows an opportunity to make valuable comparisons between it and related enzymes.
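The order-of-reaction analysis mentioned above for EAC is commonly carried out by plotting the logarithm of the pseudo-first-order inactivation rate constant against the logarithm of the reagent concentration; the slope estimates the number of reagent molecules bound per enzyme molecule on inactivation. The source does not describe its procedure in detail, so the Python sketch below is an assumption about the general method, and the concentrations and rate constants are hypothetical numbers chosen only to exercise the code.

```python
import math


def reaction_order(concentrations, k_obs):
    """Least-squares slope and intercept of log10(k_obs) versus log10([reagent]).

    The slope is the apparent order of the inactivation reaction with
    respect to the modifying reagent.
    """
    xs = [math.log10(c) for c in concentrations]
    ys = [math.log10(k) for k in k_obs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data (not from the source): reagent concentration in mM and
# observed pseudo-first-order inactivation rate constants in min^-1.
eac_mM = [5.0, 10.0, 20.0, 40.0]
k_obs_per_min = [0.011, 0.021, 0.043, 0.085]

order, _ = reaction_order(eac_mM, k_obs_per_min)
print(f"apparent reaction order n = {order:.2f}")
```

A slope close to 1 would correspond to the conclusion quoted in the text, namely that on average at least one molecule of reagent reacts per molecule of enzyme inactivated.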

Figure 7. Alternative mechanisms by which carboxyl side chains in the confines of an active site may be modified by a carbodiimide (R-N=C=N-R'); the scheme shows the stabilizing anion/nucleophile and the acid catalyst. In pathway a, the acid catalyst serves as both proton donor and subsequently as attacking nucleophile. In pathway b, the carbodiimide is protonated by the acid catalyst and then attacked by a proximal carboxylate nucleophile.

8. SEQUENCE ALIGNMENTS AND STRUCTURE PREDICTIONS

Unfortunately, a crystal structure has not been determined for S. commune xylanase A. However, X-ray crystal structures have been solved at 1.8, 2.2 and 2.5 Å for the closely related Family G enzymes from Bacillus circulans [47], B. pumilus [48] and Trichoderma reesei [49], respectively. Secondary structure predictions and hydrophobic cluster analysis reveal extensive similarities between these enzymes and xylanase A (Figure 8), suggesting that the tertiary structures of all Family G enzymes share many common features. Indeed, the homologous residues to Glu87 and Glu184 of xylanase A (viz. Glu78 and Glu172 of B. circulans, Glu93 and Glu182 of B. pumilus and Glu86 and Glu177 of T. reesei) are positioned within the active site clefts of their respective enzymes to act as the catalytic residues. Glu93 of the B. pumilus xylanase is salt-bridged to Arg127 [48] while Glu78 of the B. circulans enzyme is within the vicinity of the positive charge contributed by Arg112 [47]. These Arg residues are also totally conserved in the Family G enzymes and correspond to Arg123 of xylanase A (Figure 1). The proximity of this charged residue will have the effect of lowering the pKa values of the respective Glu residues and therefore strongly implicates them as the stabilizing anion/nucleophile. Affinity labelling of the B. subtilis xylanase has subsequently confirmed that these conserved residues serve as the active-site nucleophile [50]. However, there is no direct evidence for the assignment of the acid catalyst. With the B. circulans xylanase, the position of Glu172 relative to the xylotetraose substrate was determined to be similar to that of the putative acid catalyst of hen egg-white lysozyme, Glu35. By analogy to the lysozyme model, Glu172, and presumably Glu184 of xylanase A, fulfils the role of the acid catalyst in the double-displacement mechanism of action.
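The residue correspondences stated in this section can be collected in a small lookup table, which makes it easier to translate a residue number between enzymes. The Python sketch below is only a convenience: the dictionary keys and the function name are hypothetical, and the entries are restricted to the equivalences given explicitly in the text.

```python
# Homologous residues in Family G xylanases, as stated in the text.
# The role labels (dictionary keys) are hypothetical names for this sketch.
HOMOLOGOUS_RESIDUES = {
    "nucleophile": {
        "S. commune": "Glu87", "B. circulans": "Glu78",
        "B. pumilus": "Glu93", "T. reesei": "Glu86",
    },
    "acid_catalyst": {
        "S. commune": "Glu184", "B. circulans": "Glu172",
        "B. pumilus": "Glu182", "T. reesei": "Glu177",
    },
    "conserved_arg": {
        "S. commune": "Arg123", "B. circulans": "Arg112",
        "B. pumilus": "Arg127",
    },
    "binding_site_tyr": {
        "S. commune": "Tyr97", "B. circulans": "Tyr88",
        "T. reesei": "Tyr96",
    },
}


def equivalent_residue(residue, source, target):
    """Return the residue of `target` homologous to `residue` of `source`,
    or None when the text gives no such correspondence."""
    for by_enzyme in HOMOLOGOUS_RESIDUES.values():
        if by_enzyme.get(source) == residue:
            return by_enzyme.get(target)
    return None


print(equivalent_residue("Glu87", "S. commune", "B. circulans"))
print(equivalent_residue("Glu172", "B. circulans", "S. commune"))
```

Returning None for unlisted pairs, rather than guessing, keeps the table limited to the alignments actually reported.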

Figure 8. Hydrophobic-cluster analysis plots comparing the amino acid sequences of the xylanases from S. commune (Sc) [17] and B. circulans (Bc) [51]. Vertical lines indicate the proposed agreement between the two sequences. The closed arrowheads denote the putative stabilizing anion/nucleophile Glu87 (Sc) and Glu78 (Bc) and the acid catalysts Glu184 (Sc) and Glu172 (Bc), while the open arrowhead identifies the conserved aromatic residue Tyr97 (Sc) and Tyr88 (Bc). The conventional symbols for Pro, Gly, Ser and Thr have been used [52].

The putative roles of the two identified Glu residues in the B. circulans [47] and B. pumilus [53] xylanases have been further probed by site-directed mutagenesis in combination with both kinetic and X-ray crystallographic analyses. For both enzymes, kinetic analyses showed that enzyme activity is greatly affected by site-directed mutations of Glu78/93 and Glu172/182, but activity is more affected by mutation of Glu78/93 (Glu78Asp and Glu93Ser in B. circulans and B. pumilus, respectively) than of Glu172/182 (Glu172Asp and Glu182Asp in B. circulans and B. pumilus, respectively). This difference in effects is consistent with Glu78/93 having the more critical role of the stabilizing anion/nucleophile. The xynA gene encoding xylanase A has not been cloned from S. commune, but an artificial gene has been synthesized de novo using a novel PCR-based procedure [54]. The nucleotide sequence of this gene encoding xylanase A was designed and tailored for expression in Escherichia coli, but site-directed mutagenesis has yet to be performed.

A number of proposals have been made to account for the high pKa value of the specific Glu/Asp residue that participates as the acid catalyst in carbohydrases. With hen egg-white lysozyme, these include i) the low accessibility of solvent to the active site cleft, ii) electrostatic interactions with the charge constellation (mainly Asp52), and iii) its location at the C-terminus of an α-helix macrodipole (negative charge) [55]. Secondary structure predictions using different methods, including hydrophobic cluster analysis, together with comparison to the crystal structure of the B. pumilus xylanase indicate that the putative acid catalyst of xylanase A, Glu184, is not preceded by an α-helix but is instead located in a β-strand [45]. In fact, only one short α-helix is predicted for xylanase A [45], and only one is observed in the crystal structures of the B. circulans [47], B. pumilus [48] and T. reesei [49] enzymes. While this does not necessarily exclude the role of macrodipoles in the mechanism of action of hen egg-white lysozyme, it seems that with xylanase A, and likely all other Family G xylanases, other forces must be governing the physical properties of the acid catalyst. That hydrophobic environments favour rearrangement of an intermediate O-acylurea EAC adduct to the stable N-acylurea rather than hydrolysis back to free carboxyl, a situation that occurs upon reactions with EAC [56], would suggest that the modified Glu87, and presumably Glu184, exists in such an environment. Indeed, Glu182 of the B. pumilus xylanase is observed to be slightly buried in a more hydrophobic environment [48], but in contrast, both Glu78 and Glu172 of the B. circulans enzyme are exposed to solvent [47]. Thus, it remains to be determined what governs the pKa of the acid catalyst for at least the Family G xylanases.

The essential binding-site Tyr residue identified in S. commune xylanase A, Tyr97, was predicted to be located in either subsite -III or subsite -II. This residue is homologous to Tyr88 and Tyr96 of the B. circulans and T. reesei xylanases (Figure 1). Based on hydrophobic cluster analysis, all three residues appear in the same secondary structural element (Figure 8) and indeed, Tyr96 of T. reesei XynII is proposed to pack against the xylose ring in subsite +3 [49]. However, analysis of the X-ray crystal structure of the B.
circulans enzyme indicates that two other Tyr residues, Tyr69 and Tyr80 (homologous to Tyr78 and Tyr89, respectively, in xylanase A), directly participate in substrate binding. In addition, Tyr69 is thought to hydrogen bond to the putative stabilizing anion/nucleophile, Glu78, and play a role in positioning this residue for catalysis [47]. Xylotetraose was used in the X-ray crystallographic studies of B. circulans xylanase-substrate complexes and it was observed to span the catalytic site, thus binding in subsites -II, -I, I and II. Hence, it is conceivable that Tyr88 of the B. circulans xylanase, which is located in the crystal structure on the edge of the active site cleft [47], comprises subsite -III and, as predicted for xylanase A Tyr97, makes binding contacts with substrates of greater degrees of polymerization. A stacking interaction of the highly conserved Trp9 (homologous to Trp18 of xylanase A) with one of the xylose rings of xylotetraose is also observed in the crystal structure of B. circulans xylanase [47]. It is likely that Trp18 of xylanase A interacts with substrate in a similar manner and that such interactions provide the observed protection from NBS modification, as described above.

9. CONCLUDING REMARKS

The physical, kinetic and chemical modification investigations reviewed above not only identified catalytic and substrate-binding residues in the mechanism of action of S. commune xylanase A, but also illustrate both the continued potential and utility of such experimental approaches and techniques. Xylanase A has served as an excellent system for discerning the catalytic mechanism of xylanases, and indeed, retaining carbohydrases in general. Whereas there is little doubt about the formation of an enzyme-bound intermediate in the function of retaining carbohydrases, the pathway for its formation involving the protonation of the exocyclic (C1) oxygen illustrated above is strictly conjecture. Kinetic studies on inhibition by hydroxylated cyclic amines [57] and theoretical molecular dynamics calculations on lysozyme [58] led to a proposal for an endocyclic pathway for the formation of the enzyme-substrate adduct. This latter postulate involves the protonation of the ring (endocyclic) oxygen by the acid catalyst, followed by subsequent nucleophilic attack and displacement. The mechanism requires the rotation of the C1-C2 bond of the carbohydrate substrate rather than invoking substrate distortion for the favourable alignment of electronic orbitals. Naturally, the endocyclic mechanism has been received with some scepticism and has initiated much debate (for example, see [59]). Although it remains to be established what the true mechanism involves, most, if not all, of the data support an exocyclic mechanism.

10. ACKNOWLEDGEMENTS These studies were supported by an operating grant from the Natural Science and Engineering Research Council of Canada to A.J.C. and a Graduate Scholarship to M.R.B. from the Province of Ontario.

11. REFERENCES

1. P. Biely, Trends Biotechnol., 3 (1985) 286.
2. J. Combat, Carbohydr. Res., 118 (1983) 215.
3. K.K.Y. Wong, L.V.L. Tan and J.N. Saddler, Microbiol. Rev., 52 (1988) 305.
4. K.K.Y. Wong and J.N. Saddler, Crit. Rev. Biotech., 12 (1992) 413.
5. M. Linko, K. Poutanen and L. Viikari, in Enzyme Systems for Lignocellulose Degradation, M.P. Coughlan (ed.) pp 331-346, Elsevier Applied Science, London, 1989.
6. C.I. Beck and D. Scott, Adv. Chem. Ser., 136 (1974) 1.
7. J. Maat, M. Roza, J. Verbakel, H. Stam, et al. in Xylans and Xylanases, J. Visser, G. Beldman, M.A. Kusters-van Someren, A.G.J. Voragen (eds.) pp 349-360, Elsevier Science Publishers B.V., Amsterdam, 1992.
8. C.S. Gong, L.F. Chen, M. Flickinger and G.T. Tsao, Adv. Biochem. Eng., 20 (1981) 93.
9. L. Jurášek, R. Spoko and J. Varadi, Ceska Mykol., 22 (1968) 43.
10. L. Jurášek, J. Varadi and R. Spoko, Drev. Vysk., 12 (1967) 73.
11. J. Varadi, V. Necesany and P. Kovacs, Drev. Vysk., 16 (1971) 147.
12. M.G. Paice, L. Jurášek, M.R. Carpenter and L.B. Smillie, Appl. Environ. Microbiol., 36 (1978) 802.
13. L. Jurášek and M.G. Paice, Methods Enzymol., 160 (1988) 659.
14. M.R. Bray and A.J. Clarke, Biochem. J., 270 (1990) 91.
15. A.J. Clarke, Biochim. Biophys. Acta, 912 (1987) 424.
16. A.J. Clarke and M. Yaguchi, Eur. J. Biochem., 149 (1985) 233.
17. T. Oku, C. Roy, D.C. Watson, W. Wakarchuk, R. Campbell, M. Yaguchi, L. Jurasek and M.G. Paice, FEBS Lett., 334 (1993) 296.
18. N.R. Gilkes, B. Henrissat, D.G. Kilburn, R.C. Miller and R.A.J. Warren, Microbiol. Rev., 55 (1991) 303.
19. M.R. Bray and A.J. Clarke, Eur. J. Biochem., 204 (1992) 191.
20. Y. Mitsuishi, T. Yamanobe and M. Yagisawa, Agric. Biol. Chem., 52 (1988) 921.
21. P. Biely, Z. Krátký and M. Vršanská, Eur. J. Biochem., 119 (1981) 559.
22. M. Kubackova, S. Karacsonyi, L. Bilisics and R. Toman, Carbohydr. Res., 76 (1979) 177.
23. I.V. Gorbacheva and Rodionova, Biochim. Biophys. Acta, 484 (1977) 94.
24. P. Debiere, B. Priem, G. Strecker and M. Vignon, Eur. J. Biochem., 187 (1990) 573.
25. P. Biely, M. Vršanská and I.V. Gorbacheva, Biochim. Biophys. Acta, 743 (1983) 155.
26. M. Vršanská, I.V. Gorbacheva, Z. Krátký and P. Biely, Biochim. Biophys. Acta, 704 (1982) 114.
27. P. Biely, M. Vršanská and Z. Krátký, Eur. J. Biochem., 119 (1981) 565.
28. S. Takenishi and Y. Tsujisaka, Agric. Biol. Chem., 39 (1975) 2315.
29. M.L. Sinnott, in Enzyme Mechanisms, M.I. Page and A. Williams (eds.) pp 259-297, Royal Society of Chemistry, London, 1987.
30. J.F. Robyt and D. French, J. Biol. Chem., 245 (1970) 3917.
31. J.A. Thoma, G.V.K. Rao, C. Brothers, J. Spradlin and L.H. Li, J. Biol. Chem., 246 (1971) 5621.
32. J.A. Thoma and J.D. Allen, Carbohydr. Res., 48 (1976) 105.
33. M.R. Bray, Ph.D. Thesis (1993) University of Guelph.
34. S.G. Withers, D. Dombroski, L.A. Berven, D.G. Kilburn, R.C. Miller Jr., R.A.J. Warren and N.R. Gilkes, Biochem. Biophys. Res. Commun., 139 (1988) 487.
35. J. Gebler, N.R. Gilkes, M. Claeyssens, D.B. Wilson, P. Bequin, W.W. Wakarchuk, D.G. Kilburn, R.C. Miller, R.A.J. Warren and S.G. Withers, J. Biol. Chem., 267 (1992) 12559.
36. D.E. Koshland, Biol. Rev., 28 (1953) 416.
37. M.R. Bray and A.J. Clarke, Biochemistry, 34 (1995) 2006.
38. T.F. Spande and B. Witkop, Methods Enzymol., 11 (1967) 498.
39. M.R. Bray, A.D. Carriere and A.J. Clarke, Anal. Biochem., 221 (1994) 278.
40. B. Myers II and A.N. Glazer, J. Biol. Chem., 246 (1971) 412.
41. R.C. Lundblad and C.M. Noyes, Chemical Reagents for Protein Modification, pp 105-121, CRC Press, Inc., Boca Raton, 1984.
42. M. Ring and R.E. Huber, Arch. Biochem. Biophys., 283 (1990) 342.
43. M. Stoffer, A.E. Aleshin, T.P. Frandsen, B. Svensson and R.B. Honzatko, Abstract C2.10, 17th International Carbohydrate Symposium, Ottawa, 1994.
44. J.P. Greenstein and M. Winitz, Ann. Rev. Biochem., 38 (1969) 733.
45. M.R. Bray and A.J. Clarke, Eur. J. Biochem., 219 (1994) 821.
46. B. Svensson, A.J. Clarke, I. Svendsen and H. Møller, Eur. J. Biochem., 188 (1990) 29.
47. W.W. Wakarchuk, R.L. Campbell, W.L. Sung, J. Davoodi and M. Yaguchi, Prot. Sci., 3 (1994) 467.
48. Y. Katsube, Y. Hata, H. Yamaguchi, H. Moriyama, A. Shimnyo and H. Okada, in Proceedings of the 2nd International Conference on Protein Engineering: Protein Design in Basic Research, Medicine and Industry, M. Ikehara, T. Oshima and K. Titani (eds.) pp 91-96, Japan Scientific Societies Press, Tokyo, 1989.
49. A. Törrönen and J. Rouvinen, Biochemistry, 34 (1995) 847.
50. S. Miao, L. Ziser, R. Aebersold and S.G. Withers, Biochemistry, 33 (1994) 7027.
51. R.C.A. Yang, C.R. MacKenzie and R.A. Narang, Nucleic Acids Res., 16 (1988) 7187.
52. C. Gaboriaud, V. Bissery, T. Benchetrit and J.P. Mornon, FEBS Lett., 224 (1987) 149.
53. E.P. Ko, H. Akatsuka, H. Moriyama, A. Shimnyo, Y. Hata, Y. Katsube, I. Urabe and H. Okada, Biochem. J., 288 (1992) 117.
54. R.W. Graham, T. Atkinson, D.G. Kilburn, R.C. Miller Jr. and R.A.J. Warren, Nucl. Acid Res., 21 (1993) 4923.
55. V. Spassov, A.D. Karshikov and B.P. Atanasov, Biochim. Biophys. Acta, 999 (1989) 1.
56. R. Timokovich, Anal. Biochem., 79 (1977) 135.
57. G.W.J. Fleet, Tetrahedron Lett., 26 (1985) 5073.
58. C.B. Post and M.J. Karplus, J. Am. Chem. Soc., 108 (1986) 1317.
59. M.L. Sinnott, Bioorg. Chem., 21 (1993) 34.

S.B. Petersen, B. Svensson and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.

Protein engineering of cyclodextrin glycosyltransferase from Bacillus circulans strain 251

L. Dijkhuizen a, D. Penninga a, H.J. Rozeboom b, B. Strokopytov b and B.W. Dijkstra b

aDepartment of Microbiology and bBIOSON Research Institute, Laboratory of Biophysical Chemistry, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, The Netherlands

Abstract

The 3D-structure of CGTase from Bacillus circulans strain 251 has been determined at 2.0 Å resolution, allowing a detailed analysis of structure-function relationships of this protein. The catalytic mechanism of the CGTase cyclization reaction was studied by constructing various mutants using site-directed mutagenesis. 3D-structures of these mutant proteins have been obtained and further analysed.

1. INTRODUCTION

Many bacteria and fungi excrete enzymes that degrade starch to facilitate the uptake of carbohydrates into the cell. A number of bacteria belonging to the genera Bacillus, Thermoanaerobacter, Thermoanaerobacterium, Clostridium, Micrococcus and Klebsiella are able to grow on starch using the extracellular enzyme cyclodextrin glycosyltransferase (CGTase; EC 2.4.1.19) for the initial attack on this polymeric substrate. CGTases catalyse the formation of cyclodextrins from starch and related α(1→4)-linked glucose polymers via an intramolecular transglycosylation reaction. CGTase enzymes are functionally related to α-amylases, which hydrolyse starch into linear products. In contrast with α-amylases, CGTase enzymes preferentially add the "non-reducing end" glycosidic C4 hydroxyl group across the scissile α(1→4) glycosidic bond, resulting in a glycosidic exchange and the formation of cyclodextrins. CGTase enzymes and α-amylases share about 30% amino acid sequence identity [2]. Bacteria employing a CGTase for starch degradation are able to subsequently metabolize the cyclodextrins produced as carbon and energy sources for growth (Figure 1). This appears to involve a cell-associated cyclomaltodextrinase (CDase; EC 3.2.1.54), yielding glucose, maltose and maltotriose from cyclodextrins [1]. The further glucose metabolism proceeds intracellularly via the glycolytic pathway.

Figure 1. Bacterial growth on starch (amylose) involving an extracellular CGTase (yielding cyclodextrins, CD), a cell-associated CDase (yielding glucose, maltose and maltotriose), and intracellular glucose metabolism via the glycolytic pathway, yielding pyruvate (Pyr).

Cyclodextrins are cyclic glucose oligomers linked via α(1,4) glycosidic bonds (Figure 2). They possess a hydrophobic internal cavity formed by the glucopyranose CH groups and a hydrophilic exterior surface formed by primary and secondary hydroxyl groups [3]. Cyclodextrins are able to form inclusion complexes with many small hydrophobic molecules in aqueous solutions, resulting in changes in physical properties, e.g. increased solubility and stability, and decreased chemical reactivity and volatility (Figure 2) [14]. Cyclodextrins find an increasing use in industrial and research applications [21]. All CGTase enzymes studied thus far from different bacterial sources form a mixture of cyclodextrins consisting of 6, 7 or 8 glucose residues, α-, β- and γ-cyclodextrin, respectively. The relative proportion of the α-, β- or γ-cyclodextrins produced varies with the bacterial source of the enzyme [15]. CGTase enzymes not only produce cyclodextrins from starch but also display coupling, disproportionation and saccharifying activities (Figure 3). The hydrolyzing activity of the Bacillus circulans strain 251 CGTase protein, however, is rather low [12] and generally the enzyme only produces cyclodextrins from starch. All CGTase enzymes studied are also rather sensitive to product inhibition by cyclodextrins. The industrial cyclodextrin production process, involving CGTase, thus could be improved considerably by the construction of mutant CGTase enzymes with improved product specificity and decreased product inhibition.
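Because the three cyclodextrins differ only in ring size, their molar masses follow directly from the number of anhydroglucose (C6H10O5) units. The short Python sketch below is an illustrative calculation; the function name is hypothetical and the atomic masses are rounded standard values assumed for the example.

```python
# Rounded standard atomic masses in g/mol (assumed values for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# One anhydroglucose unit of the ring has the composition C6H10O5.
ANHYDROGLUCOSE = {"C": 6, "H": 10, "O": 5}

# Ring sizes given in the text for alpha-, beta- and gamma-cyclodextrin.
RING_SIZE = {"alpha": 6, "beta": 7, "gamma": 8}


def cyclodextrin_molar_mass(n_glucose: int) -> float:
    """Molar mass in g/mol of a cyclodextrin built from n_glucose units."""
    unit = sum(ATOMIC_MASS[el] * count for el, count in ANHYDROGLUCOSE.items())
    return n_glucose * unit


for name, n in RING_SIZE.items():
    print(f"{name}-cyclodextrin ({n} glucose units): "
          f"{cyclodextrin_molar_mass(n):.1f} g/mol")
```

Such values are needed, for instance, when converting a cyclodextrin yield in grams into the micromoles used to define enzyme activity.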

Figure 2. Structure and properties of β-cyclodextrins. The drawing labels the hydrophobic inside and hydrophilic outside of the ring (marked dimension 15.3 Å), with annotations for H2O, attraction and repulsion. Listed effects of complex formation: stabilization of light- or oxygen-sensitive compounds; stabilization of volatile compounds; alteration of chemical reactivity; improvement of solubility; improvement of smell and taste; modification of liquid compounds to powders.

2. METHODS 2.1. Production and Purification of CGTase Proteins

CGTase wild type and mutant proteins were produced using Bacillus subtilis strain DB104A, lacking protease and α-amylase activity. Cells of this strain carrying the plasmid-encoded mutant cgt genes under the control of a cryptic p32 Lactococcus promoter, grown in 3-liter fermentors, produce high extracellular levels of the CGTase proteins. After concentrating, mutant CGTase proteins were purified to homogeneity by affinity chromatography, using an α-cyclodextrin-Sepharose-6FF column. Bound CGTase was eluted with buffer containing α-cyclodextrin. Using these procedures, a single 3-liter fermentor run allowed purification of up to 300 mg of mutant CGTase protein in a 15-60% yield [2, 12, 17].

2.2. Cyclization Enzyme Assays

The CGTase cyclization activities were measured by incubating appropriately diluted enzyme for 5-10 min at 50 °C with 5% Paselli SA2 (partially hydrolyzed potato starch) in 10 mM sodium citrate (pH 6.0) as substrate. The β-cyclodextrin formed was determined with phenolphthalein. One unit of activity is defined as the amount of enzyme able to produce 1 µmol of β-cyclodextrin per min. α- and γ-cyclodextrin formation was subsequently measured by HPLC (see below).
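The unit definition above translates into a simple calculation. The Python sketch below shows one way it might be applied; the function name, its parameters and the example numbers are hypothetical and are not taken from the source.

```python
def cyclization_units_per_ml(beta_cd_umol: float,
                             minutes: float,
                             enzyme_volume_ml: float,
                             dilution_factor: float = 1.0) -> float:
    """Cyclization activity of the undiluted enzyme stock in U/ml.

    One unit (U) is the amount of enzyme producing 1 µmol of
    beta-cyclodextrin per minute under the assay conditions.
    """
    if minutes <= 0 or enzyme_volume_ml <= 0:
        raise ValueError("incubation time and enzyme volume must be positive")
    units_in_assay = beta_cd_umol / minutes          # µmol per min = U
    return units_in_assay / enzyme_volume_ml * dilution_factor


# Hypothetical example: 0.5 µmol of beta-cyclodextrin formed in 5 min by
# 0.01 ml of a 100-fold diluted enzyme preparation.
activity = cyclization_units_per_ml(0.5, 5.0, 0.01, dilution_factor=100.0)
print(f"{activity:.0f} U/ml")
```

The calculation assumes that product formation is linear over the incubation period, which is why the assay uses appropriately diluted enzyme and short incubation times.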

Figure 3. Reactions catalyzed by CGTase enzymes.

Cyclodextrin formation was also determined under industrial production process conditions. For this purpose 0.1 U/ml CGTase was incubated with 10% jet-cooked starch in a 10 mM sodium citrate buffer (pH 6.0) at 50 °C for 45 h. Samples were taken at regular time intervals, boiled for 5 min, and the products formed were analyzed by HPLC using a 25 cm Econosil-NH2 10 micron column (Alltech Associates Inc., USA) eluted with acetonitrile/water (60/40%, v/v) at a flow rate of 1 ml per min.

3. RESULTS

3.1. CGTase 3D-structure

In previous work the B. circulans strain 251 CGTase encoding gene has been cloned, sequenced, and overexpressed in E. coli and B. subtilis. The CGTase protein has been purified, biochemically characterized and crystallized [8, 9]. The crystal structure of the protein has been determined at 2.0 Å resolution, allowing a more detailed analysis of structure-function relationships of this CGTase [9]. The overall fold of the enzyme is presented in Figure 4. The structure of CGTase in our crystal form is essentially the same as that of B. circulans strain 8 described by the group of Schulz [5, 7]. The protein consists of a single polypeptide chain of 686 amino acid residues, forming five domains (A through E). CGTase shares about 30% amino acid sequence identity with α-amylases. The three-dimensional structures of α-amylases and CGTases also show clear structural similarity in the N-terminal A-C domains of approximately 400 residues, which fold into a (β/α)8-barrel structure [6]. However, compared to α-amylases, CGTases have two additional C-terminal domains (D and E domains) which fold into β-pleated sheets.


Figure 4. Ribbon drawing of CGTase from Bacillus circulans strain 251 [9]. Arrows represent β-strands; coils represent α-helices. The amino and carboxyl termini are labelled n and c, respectively. The domains of CGTase are labelled [A] through [E]. Selected sidechains are drawn to indicate relative positions of sites mentioned in the text: Asp 229, Glu 257, Asp 328 (active centre), Trp 616 and Trp 662 (maltose binding site 1), Tyr 633 (maltose binding site 2), Trp 413 (maltose binding site 3).

Although crystals of the B. circulans strain 251 CGTase protein are grown and maintained in the presence of α-cyclodextrin or maltose [8], we did not find any α-cyclodextrin or other oligosaccharide bound in the active site of the enzyme [9]. From inspection of electron density maps we have identified three carbohydrate binding sites on the surface of the protein, located in each case parallel to the flat surfaces of aromatic rings. In these electron densities α-maltose could be modelled [9]. This indicated that the α-cyclodextrin had been degraded into short linear maltodextrins during the crystallization process.

The first maltose binding site is located in the E-domain. Binding of the maltose mainly occurs through hydrophobic contacts of both glucose-rings with the sidechains of Trp 616 and Trp 662. Three direct hydrogen bonds with the protein are made in which both amide- and carbonyl-group of Asn 667 and the Nζ-group of Lys 651 participate. At this binding site three water molecules form mediating hydrogen bonds between protein and maltose.

The second maltose binding site is also located in the E-domain. This maltose molecule is stacked on Tyr 633 with its reducing glucose unit. Hydrogen bonds are made with the Thr 598 Oγ, Ala 599 O, Gly 601 N, Gln 602 N, Asn 603 Oδ1 and Nδ2, Asn 627 Oδ1 and Nδ2, Gln 628 Nε2 and Tyr 633 OH. Only one water molecule is hydrogen bonded to this maltose.

The third maltose binding site is located in the C-domain. A maltose molecule is stacked with the apolar face of the reducing sugar on the aromatic sidechain of Trp 413. Further binding to the protein is accomplished by 10 hydrogen bonds. Only three hydroxyl groups of the maltose do not participate in hydrogen bonding. Water molecules are often found to form mediating hydrogen bonds between sugar and protein. Four water molecules appear to be bound to both protein and maltose.

CGTases bind strongly to raw starch and their amino acid sequences show the raw-starch binding motif proposed by [18]. This raw-starch binding motif is located in the E-domain and it reveals eleven strictly conserved residues [9]:

Thr598-Gly601-Gly608-Leu613-Gly614-Trp616-Pro634-Trp636-Lys651-Trp662-Asn667

Of these residues Trp 616, Lys 651, Trp 662, Asn 667 and Thr 598, Trp 636 form the first and second maltose binding site, respectively, referred to above. The other strictly conserved residues (Gly 601, Gly 608, Leu 613, Gly 614, Pro 634) are not found to bind directly to maltodextrin, but are probably required for structural support of the raw-starch binding domain.

3.2. The Active Site
The mechanism of starch hydrolysis by α-amylases and CGTases is not very well understood. It is assumed that the catalytic mechanism bears some resemblance to the mechanism proposed for hen egg-white lysozyme: a Glu sidechain which is protonated forms a hydrogen bond with the oxygen in the glycosidic bond, thereby triggering a rearrangement in which the bond between the C1 atom and the glycosidic oxygen is broken. An intermediate is formed which has a positive charge delocalized over a double bond between the C1 and O5 atoms. This positive charge is stabilized by a negatively charged Asp residue. Next a water molecule attacks the C1 atom, forming a new hydroxyl group on the C1 atom.
CGTase and amylases, however, possess one Glu and two Asp residues in the active site, instead of one Glu and one Asp residue. In the α-amylase literature all pairs of these acidic residues have been proposed to be the equivalents of the hen egg-white lysozyme Asp/Glu pair involved in hydrolysis. Site-directed mutagenesis of each of these residues in B. subtilis α-amylase [20] and Taka-amylase [10] resulted in inactive enzymes. The CGTase active site is present at the wide end of the (β/α)8-barrel, with catalytic residues Asp 229, Glu 257 and Asp 328 (CGTase numbering). To elucidate the precise function of each of these three carboxylates in the active site, these three residues were replaced by asparagine (D229N, D328N) and glutamine (E257Q), using site-directed mutagenesis. All mutant proteins were purified and crystallized, allowing a detailed comparison of their biochemical properties and 3D structures with those of the wild type protein. The D229N, D328N and E257Q mutations resulted in virtually complete loss of the cyclodextrin forming, liquefying and saccharifying activities of CGTase.

Wild type CGTase and the virtually inactive mutant CGTase proteins have been used in further soaking experiments with, for instance, cyclodextrins and the pseudotetrasaccharide acarbose (Figure 5), a strong α-amylase and CGTase inhibitor [16, 17]. The structure of acarbose is very similar to that of maltotetraose (G4). It consists of one normal maltose and one pseudo-maltose residue, the latter being essential for the inhibitory properties [4]. The pseudo-maltose residue consists of an unsaturated cyclitol unit (also called valienamine) and a 4-amino-4,6-dideoxyglucose unit.


Figure 5. Structural formula of acarbose [17]. Arrows mark the four differences between maltotetraose and acarbose: (i) the C6-hydroxyl group of glucose C is absent; (ii) the O-glycosidic bond between residues C and D has been replaced by an N-glycosidic bond; (iii) the O5 oxygen of residue D has been substituted by a carbon atom (C7); (iv) a double bond has been introduced in residue D between the C5 and C7 atoms.

Acarbose was bound near the catalytic residues and this allowed us to analyse how a linear substrate molecule binds in the active site (Figure 6), which provided further insight into the catalytic mechanism of the enzyme [17]. On the basis of our current knowledge we suggest that Glu 257 is the proton donor in the cleavage reaction and Asp 229 serves as the general base or nucleophile, while Asp 328 is important in substrate binding and may be important for elevating the pKa of Glu 257 [17]. The binding mode of acarbose in CGTase differs from that observed in the complex of pancreatic α-amylase with acarbose where the catalytic Glu was found to be hydrogen bonded to the glycosidic nitrogen [13].

3.3. The Cyclization Reaction
CGTases and α-amylases both degrade starch by hydrolysis of α(1,4) glycosidic bonds but produce virtually exclusively cyclic and linear products, respectively. The various CGTases studied can be further distinguished as α-, β- and γ-CGTases on the basis of their main cyclodextrin product. The B. macerans enzyme is the best studied example of an α-CGTase [19], whereas for instance the B. circulans strain 251 enzyme is a β-CGTase [8], producing a mixture of α-, β-, and γ-cyclodextrins in a ratio of 13:64:23 [12]. The β-CGTases are especially of interest since β-cyclodextrin is most widely applied [15]. β-Cyclodextrin is separated from the other cyclodextrins using organic solvents, which is expensive. Therefore, one of our main goals is to increase the product specificity of CGTase enzymes by means of protein engineering. At present it is unclear what determines the differences in product specificity between the various CGTases (α-, β- and γ-cyclodextrin ratios) and α-amylases (cyclic versus linear malto-oligo-saccharides). Alignment of amino acid sequences of CGTases and α-amylases, and the analysis of the binding mode of the substrate analogue acarbose in the active site cleft [17], suggested that Tyr 195 might play an important role in the cyclization of oligosaccharides. All α-amylases studied possess a small residue (Gly, Ser, Val) at this position [11], in strong contrast with the large aromatic amino acids (Tyr or Phe) present in all CGTases studied.

Figure 6. Schematic representation of the interactions of acarbose bound in the active site of CGTase. [Residues labelled in the figure: His 140, Asp 229, Lys 232, His 233, Glu 257, Asp 328, Arg 375.]

Table 1
Specific Enzyme Activities (units/mg) of Bacillus circulans Strain 251 Wild Type and Mutant CGTase Proteins [12]

Mutants             Cyclization   Coupling   Disproportionation   Saccharifying
Y195 (wild type)    280±4         206±5      620±70               3.0±0.5
Y195F               175±5         84±10      700±80               2.0±0.5
Y195W               74±4          72±10      650±70               3.1±0.5
Y195L               143±6         18±4       650±80               4.8±0.5
Y195G               22±3          25±4       500±70               4.3±0.5

We have studied the biochemical properties and crystal structures of CGTase mutant proteins in which the Tyr 195 residue had been replaced by Trp, Phe, Leu and Gly, using site-directed mutagenesis [2, 12]. With some of these mutants the cyclodextrin product ratio clearly changed, as judged from halo-formation around colonies on starch-containing agar plates, activity assays with phenolphthalein, and HPLC analysis of products formed. Mutant proteins therefore were purified and crystallized, and their X-ray structures were determined at 2.5-2.6 Å resolution, allowing a detailed comparison of their biochemical properties and three-dimensional structures with those of the wild type CGTase protein. The mutant proteins possessed significantly reduced cyclodextrin forming and coupling activities but were not negatively affected in the disproportionation and saccharifying reactions (Table 1). Also under production process conditions, after a 45 h incubation with a 10% starch solution, the Y195W, Y195L and Y195G mutants showed a lower overall conversion of starch into cyclodextrins. These mutants produced a considerable amount of linear oligosaccharides (Table 2). The presence of aromatic amino acids (Tyr or Phe) at the Tyr 195 position thus appears to be of crucial importance for an efficient cyclization reaction, virtually preventing the formation of linear products [12].

Table 2
Starch Conversion and Product Specificity of Bacillus circulans Strain 251 Wild Type and Mutant CGTases (a)

           Conversion of starch      Product ratio (%)    Conversion of starch into
Mutants    into cyclodextrins (%)    α     β     γ        G1-G4 oligosaccharides (%)
Y195       39.3                      13    64    23       0
Y195F      38.8                      15    64    20       0
Y195W      33.3                      18    63    19       2-4
Y195L      24.4                      0     86    14       6-10
Y195G      24.8                      19    64    17       16-20

(a) Results of 45 h incubations of CGTase proteins (0.1 unit of β-cyclodextrin-forming activity per ml) with 10% jet-cooked starch.
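Because Table 2 reports the product ratio separately from the overall conversion, it can help to combine the two into the fraction of starch that ends up in each cyclodextrin. The following sketch does this arithmetic on the tabulated values; the dictionary layout and function name are illustrative choices, and the derived numbers are not reported in the source.

```python
# Table 2 values: overall conversion of starch into cyclodextrins (%) and
# the alpha:beta:gamma product ratio (%), keyed by enzyme variant.
TABLE_2 = {
    "Y195":  (39.3, (13, 64, 23)),
    "Y195F": (38.8, (15, 64, 20)),
    "Y195W": (33.3, (18, 63, 19)),
    "Y195L": (24.4, (0, 86, 14)),
    "Y195G": (24.8, (19, 64, 17)),
}


def absolute_yields(conversion_pct, ratio_pct):
    """Percent of starch converted into alpha-, beta- and gamma-cyclodextrin."""
    return tuple(conversion_pct * share / 100.0 for share in ratio_pct)


for variant, (conversion, ratio) in TABLE_2.items():
    alpha, beta, gamma = absolute_yields(conversion, ratio)
    print(f"{variant:6s} alpha={alpha:5.1f}  beta={beta:5.1f}  gamma={gamma:5.1f}")
```

On these numbers the Y195L mutant, despite its higher β share (86%), converts less of the starch into β-cyclodextrin than the wild type (about 21% versus about 25% of the starch), because its overall conversion is lower.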

4. DISCUSSION
In the Netherlands, AVEBE has developed a process for the production of cyclodextrins with the Bacillus circulans strain 251 CGTase, an enzyme which produces predominantly (65%) β-cyclodextrin. Selective precipitation steps with organic solvents may be used for the production of the separate α-, β- or γ-cyclodextrins. To avoid these expensive procedures and to be able to produce cyclodextrins with applications involving human consumption, the development of mutant CGTase enzymes that produce only one particular form of cyclodextrin, and are less sensitive to product inhibition, is required. Rational design of such mutants requires detailed knowledge of the three-dimensional structure of the protein, as well as elucidation of the transglycosylation pathway. The work presented provides a firm basis for the construction of specific α-, β- or γ-CGTase proteins.

5. REFERENCES

1. H. Bender, Appl. Microbiol. Biotechnol., 39 (1993) 714.
2. L. Dijkhuizen, D. Penninga, H.J. Rozeboom, B. Strokopytov and B.W. Dijkstra, Proc. 3rd Int. Symp. Perspectives on Protein Engineering, in press.
3. D. French, Adv. Carbohydrate Chem., 12 (1957) 189.
4. F.R. Heiker, H. Böshagen, B. Junge, L. Müller and J. Stoltefuss, First Intern. Symp. On Acarbose, W. Creutzfeldt (ed.), pp. 137-141, Excerpta Medica, Amsterdam (1981).
5. B.E. Hofmann, H. Bender and G.E. Schulz, J. Mol. Biol., 209 (1989) 793.
6. H.M. Jespersen, E.A. MacGregor, M.R. Sierks and B. Svensson, Biochem. J., 280 (1991) 51.
7. C. Klein and G.E. Schulz, J. Mol. Biol., 217 (1991) 737.
8. C.L.L. Lawson, J. Bergsma, P.M. Bruinenberg, G.E. de Vries, L. Dijkhuizen and B.W. Dijkstra, J. Mol. Biol., 214 (1990) 807.
9. C.L.L. Lawson, R. van Montfort, B. Strokopytov, H.J. Rozeboom, K.H. Kalk, G.E. de Vries, D. Penninga, L. Dijkhuizen and B.W. Dijkstra, J. Mol. Biol., 236 (1994) 590.
10. T. Nagashima, S. Tada, K. Kitamoto, K. Gomi, C. Kumagai and H. Toda, Biosci. Biotechnol. Biochem., 56 (1992) 207.
11. R. Nakajima, T. Imanaka and S. Aiba, Appl. Microbiol. Biotechnol., 23 (1986) 355.
12. D. Penninga, B. Strokopytov, H.J. Rozeboom, C.L. Lawson, B.W. Dijkstra, J. Bergsma and L. Dijkhuizen, Biochemistry, 34 (1995) 3368.
13. M. Qian, R. Haser, G. Buisson, E. Duée and F. Payan, Biochemistry, 33 (1994) 6284.
14. W. Saenger, Angew. Chem., Int. Ed. Engl., 19 (1980) 344.
15. G. Schmid, Tibtech, 7 (1989) 244.
16. D.D. Schmidt, W. Frommer, B. Junge, L. Müller, W. Wingender and E. Truscheit, First Intern. Symp. On Acarbose, W. Creutzfeldt (ed.), pp. 5-15, Excerpta Medica, Amsterdam (1981).
17. B. Strokopytov, D. Penninga, H.J. Rozeboom, K.H. Kalk, L. Dijkhuizen and B.W. Dijkstra, Biochemistry, 34 (1995) 2234.
18. B. Svensson, H. Jespersen, M.R. Sierks and E.A. MacGregor, Biochem. J., 264 (1989) 309.
19. T. Takano, M. Fukuda, M. Monma, S. Kobayashi, K. Kainuma and K. Yamane, J. Bacteriol., 166 (1986) 1118.
20. K. Takase, T. Matsumoto, H. Mizuno and K. Yamane, Biochim. Biophys., 1120 (1992) 281.
21. G. Wenz, Angew. Chem., Int. Ed. Engl., 33 (1994) 803.

S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © Elsevier Science B.V. All rights reserved.

Oxidation stable amylases for detergents

Torben V. Borchert, Søren F. Lassen, Allan Svendsen and Henrik B. Frantzen

Novo Nordisk A/S, Protein Discovery, 2880 Bagsvaerd, Denmark

Abstract
The addition of an amylase to detergents would be a possible means for the removal of problematic starch stains during automated dish wash and/or laundry. Bacillus licheniformis alpha-amylase Termamyl™ was chosen as the most suitable alpha-amylase for this purpose, although this enzyme had to be stabilized to prevent oxidation. We identified one amino acid, viz. the methionine in position 197, which is mainly responsible for the instability of the amylase in an oxidative environment. The search for the optimal substitution for methionine 197 is described.

1. INTRODUCTION
Enzymes have been used for a number of years as additives for detergents. Proteases have been the main enzymes added to laundry detergents, but more recently lipases and cellulases have also been used for this purpose. Increasing environmental consciousness on the part of consumers has led to low- and medium-temperature laundering becoming more common; coupled with the widespread use of automatic dishwashers, this has made starch soil removal from clothes and porcelain/silverware a difficult task in the household. Even at wash/dishwash temperatures higher than 50 °C starch is practically insoluble and difficult to remove, and residual starch stains are commonly found. The solution to this problem could be the addition of a suitable amylase to detergents. Laundry washing and dishwashing are performed under conditions which are rather severe from the point of view of enzyme stability, viz. rather high temperature, high pH, and the presence of strongly oxidizing compounds. Besides being able to withstand this environment, a suitable amylase should also be compatible with surfactants, calcium chelating agents and proteases present in the detergent. The task was to identify the most suitable amylase, and if necessary alter its properties by protein engineering in order to obtain an amylase that would perform well under the prevailing conditions.

2. RESULTS
Most of the currently available commercial amylases exhibit an optimum of activity at neutral or slightly acidic pH. Figure 1 shows the pH profile of three alpha-amylases from various Bacillus species: Termamyl™ from B. licheniformis, BSG™ from B. stearothermophilus, and BAN™ from B. amyloliquefaciens. Both BAN™ and BSG™ have an activity optimum at pH 5, and little, if any, activity can be detected above pH 10. The activity optimum for Termamyl™ spans a fairly broad pH range, and this enzyme displays superior performance at a pH of 10.5, which is relevant for automated dishwashing. Termamyl™ is additionally extremely thermostable with an activity optimum around 85 °C at pH 9.0 (data not shown), despite the fact that it originates from a mesophilic Bacillus species.

Figure 1. pH profile of BAN™, BSG™ and Termamyl™. The activity of the three amylases of bacterial origin was measured using the Pharmacia Phadebas™ assay [2] at 37 °C in a slightly modified Britton-Robinson buffer (50 mM of each of phosphoric acid, boric acid, and acetic acid, supplemented with 0.1 mM calcium chloride). The buffer was adjusted with NaOH to the relevant pH in the range of 4-10.5. Values for the activity of the three amylases are given relative to the highest value measured for that particular enzyme.

Figure 2. Oxidation stability of Termamyl™. The enzyme was incubated at 40 °C in a 50 mM Britton-Robinson buffer, pH 9.0, supplemented with 200 mM hydrogen peroxide. Samples were withdrawn at the times indicated, diluted 1000-fold, and the residual activity was measured by the Phadebas™ assay. All values are given as a percentage of the activity at time zero.

Figure 3. Oxidation stability of Termamyl™ and leucine-substituted variants thereof (M8L, M15L, M197L, M256L, M304L, M366L and M438L). The experiments were carried out as described in the legend to Figure 2.

Other commercially available amylases, such as Fungamyl™ of fungal origin, were discarded on the basis of their thermal stability and their pH optima.

To examine the oxidation stability of Termamyl™, the enzyme was incubated in the presence of 200 mM hydrogen peroxide for periods of up to 30 minutes. Samples were then withdrawn and diluted to stop the oxidation process, and the residual activity was determined as shown in Figure 2. Rapid oxidative inactivation of the enzyme is observed, and it is clear that this behavior of the protein would limit its application in detergents. The most oxidation-labile amino acid residues in proteins are cysteine and methionine [1], both of which contain sulphur. Termamyl™ contains seven methionines but no cysteines. We have substituted the seven methionines, located in positions 8, 15, 197, 256, 304, 366 and 438 of the mature protein, with leucine residues. Leucine was chosen as it is a fairly conservative substitution for methionine. The oxidation stability of the resulting seven Termamyl™ variant proteins was evaluated by incubation in the presence of 200 mM hydrogen peroxide, and the results are shown in Figure 3. Only one substitution, M197L, had a significant effect on the oxidation stability of the enzyme. During a 30 minute incubation with hydrogen peroxide no loss of activity could be detected; it appears, in fact, that this Termamyl™ variant becomes slightly activated by the presence of hydrogen peroxide, an observation we cannot presently explain.

In order to identify the optimal amino acid replacement of methionine 197, a series of amino acid substitutions in this position was constructed. In addition to leucine, the following amino acids were introduced in position 197: Ala, Cys, Phe, Gly, His, Ile, Asn, Gln, Ser, Thr, and Val. The introduction of a cysteine residue in position 197 creates an enzyme more susceptible to oxidation than the native enzyme, whereas the other substitutions have identical effects on the oxidation stability of the protein: the enzymatic activity of the variant protein is constant or becomes slightly elevated during the 30 minute incubation under oxidizing conditions. The oxidation profile of a selection of the variants is shown in Figure 4.

Although a number of amino acid substitutions in position 197 result in an enhancement of the oxidation stability of Termamyl™, significant differences are observed in the specific activity measured under standard non-oxidative conditions at pH 7.3 and 37 °C (Figure 5). All variants tested display decreased activity compared to the native enzyme. The decrease in specific activity is especially marked when amino acids with bulky side chains are introduced in position 197. Substitution with histidine and phenylalanine residues causes a decrease in specific activities of more than 90% and 80%, respectively, compared to the native enzyme. Amino acids with smaller side chains, such as glycine and alanine, cause less of a decrease, and the resulting variants have almost the same specific activity as the native enzyme. There seems to be a roughly inverse relationship between the volume of the amino acid in position 197 and the specific activity of the variant, as indicated in Figure 6. Interestingly, the methionine (with a volume of 124 Å³) found in the native enzyme, and the wild type Termamyl™ specific activity of 7000 NU/mg, result in a data point that does not fit into the relationship observed for the complete series of substitutions in position 197.

Figure 4. Oxidation stability of Termamyl™ and a selection of position 197 variants thereof (M197L, M197A, M197C, M197V, M197F and M197Q). The experiments were carried out as described in the legend to Figure 2.

Figure 5. Specific activity of Termamyl™ (M) and its variants in position 197. The activity of the enzymes was measured in a slightly modified Pharmacia Phadebas™ assay: 50 mM Britton-Robinson buffer, pH 7.3, temperature 37 °C. The reference was Termamyl™ (M) with a specific activity of 7000 NU/mg. Amino acids are given by the one-letter denotation.

Figure 6. The specific activity of the position 197 variant proteins as a function of the volume (Å³) of the amino acid in position 197 based on van der Waals radii [3].

3. CONCLUSION
We have identified an amino acid which is mainly responsible for the instability of the alpha-amylase Termamyl™ towards oxidation. Among the seven methionine residues in this alpha-amylase, the methionine in position 197 is the major source of oxidation lability. The optimal substitution for methionine 197 with respect to specific activity is to be found among less bulky amino acids. Furthermore, we have shown that the amino acid residue in position 197 has a tremendous influence on the function of the enzyme.

4. REFERENCES

1. Y. Shechter et al., Biochemistry, 14 (1975) 4497.
2. Pharmacia Diagnostics AB, 51-0073-00/02.
3. F.M. Richards, J. Mol. Biol., 82 (1974) 1.


Electrostatic studies of carbohydrate active enzymes

Antonio Baptista¹, Trygve Brautaset², Finn Drabløs¹, Paulo Martel³, Svein Valla² and Steffen B. Petersen¹

¹MR-Center, SINTEF UNIMED, N-7034 Trondheim, Norway. WWW: http://www.mr.sintef.no
²UNIGEN, University of Trondheim, N-7034 Trondheim, Norway
³ITQB, Oeiras, Portugal

Abstract
Most charged or titratable residues reside on the protein surface. Each such residue will exhibit a titration behaviour that depends on the local context around the residue. As a consequence, pH, salt and the docking of a ligand into the binding cavity of a protein can all be expected to alter the protein electrostatics. Electrostatic interactions are the most long-range interactions known. It is widely believed that protein electrostatics plays a major role in the recognition between a protein and its ligand(s). In the present chapter we cover the basic theory of protein electrostatics and illustrate in two case stories how electrostatics-based effects may provide important clues in the interpretation of pH activity profiles for the acid and Taka amylases, as well as in the prediction of functional consequences of protein engineering on Acetobacter xylinum.

1. INTRODUCTION
Over the last couple of decades major advances have been made in the understanding of protein structural determination. Initially all contributions came from X-ray diffraction analysis, but lately multidimensional NMR spectroscopy has also proven a valuable tool for the determination of protein 3D structures, although linewidth and dispersion at the moment restrict NMR to relatively small water-soluble proteins with a molecular weight not exceeding 30 kDa. Today about 1000 protein 3D structures have been determined, and we have learned that many proteins fold into the same 3D fold, despite the fact that they may have little or no detectable sequence homology. Good examples of such proteins are the hemoglobins and the triglyceride lipases, which adopt a packed α-helix bundle and the characteristic α/β-hydrolase fold, respectively. This impressive amount of 3D information has provided us with atomic information about the structural basis for protein function. However, in order to understand molecular recognition (docking) and salt- and pH-dependent phenomena, a more elaborate picture needs to be developed that also includes the effects of the electric force fields that are caused by charged residues at the protein surface. In the present chapter we will present the basic concepts of protein electrostatics and two case studies: 1) we will attempt to explain semi-quantitatively why the pH optimum of the acid α-amylase is shifted towards acidic pH values as compared to the normal neutral α-amylase; 2) we will illustrate that electrostatics are of major importance in the function of phosphoglucomutase, and through point-mutations prove that such residues are indeed central to enzyme function.

The present chapter does not claim that protein electrostatics is the sole important factor that determines protein function and its variation with physical chemical parameters such as pH, temperature and salt. It is clear that other complex factors must be considered as well, such as quantum mechanical as well as molecular dynamics effects. Work addressing these aspects is in progress in various labs, including our own. Here, however, we will concentrate on the role of protein electrostatics in protein function.

2. BASIC CONCEPTS OF PROTEIN ELECTROSTATICS

2.1. The role of electrostatics
Protein function always depends on the encounter of two or more molecules: substrate-enzyme, ligand-receptor, etc. These molecular encounters are the final stage of a diffusional process where attractive interactions have to overcome the random effects of thermal agitation. Given the magnitude and long-ranged nature of electrostatic interactions, it is not surprising that they can play a determinant role in this process, by steering and/or orienting the incoming molecule [1]. In the case of enzymes a further functional requirement is the stabilization of the reaction intermediate, without which they could not act as catalysts. Here, too, electrostatic interactions are of major importance, with the residues of the active site stabilizing the intermediate charge distribution [2]. A much more basic condition for protein function is the existence of a properly folded structure. The hydrophobic effect, usually assumed to be the most determinant factor in the overall folding process, is essentially a consequence of the fact that polar and, especially, charged residues are electrostatically more stable when interacting with water than with the more apolar protein interior. Also, the hydrogen bonding responsible for protein secondary structure is basically the result of electrostatic interactions. Thus, electrostatic interactions occupy a special place among the factors governing protein function and stability, and their proper understanding and modeling is of major importance. The sources of electrostatic interactions in proteins are charged titratable residues (Asp, Glu, His, Lys, Arg, C- and N-termini, Tyr and free Cys), structural ions (e.g., Mg2+ in PGM), bound ions, solution counterions, etc. In the following section we present some of the models that may be used to compute the electrostatic interactions arising from these charges and to relate them to protein functional features.
General reviews of protein electrostatics are Warshel & Russel (1984) [3], Matthew (1985) [4], Rogers (1986) [5], Harvey (1989) [6], Sharp & Honig (1990) [7], Davies & McCammon (1990) [1].

An aspect which is almost impossible to dissociate from protein electrostatics is the effect of pH, because of the charge variability of titratable residues. The sometimes dramatic dependence of protein function on pH is a result of the protonation and deprotonation of these residues, which can lead to pronounced changes in the interplay of electrostatic interactions. At first the theoretical task of explaining and predicting such electrostatic changes with pH may seem straightforward: given the pKa of the titratable residues (available in any biochemistry handbook) it would be a trivial matter to say which residues are charged at a given pH. Unfortunately, the situation is far more complicated because, as explained below, the pKa may be shifted by several pH units from its typical value. In fact, even the usual concept of pKa becomes inappropriate. Thus, to decide the charge to be assigned to a residue at a given pH one has to resort to considerably more complex methods, as discussed below.

2.2. Electrostatic models
There are several approaches to modeling electrostatic interactions in proteins. The most natural one is to use Coulomb's law to describe the electrostatic force between pairs of charges. This is how electrostatic effects are modeled in molecular dynamics (MD) simulations, probably the most familiar simulation method used with biomolecules, where they are just one type of force among many others, which together determine the dynamical behavior of the protein. Although a complete knowledge of the dynamics can in principle lead to an understanding of most other properties, in practice this kind of simulation may take prohibitive computation times and require an elaborate analysis of the results. Furthermore, some important aspects of electrostatic effects cannot be easily included in an MD simulation. In particular, a fixed charge has to be assigned to each titratable group, which may be a dramatic simplification at pH values where partial protonation exists. In this chapter we will focus on the use of another type of model, where dynamical features are included in an implicit manner.
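To make the notion of partial protonation concrete, the following minimal sketch evaluates the textbook Henderson-Hasselbalch relation for a single, isolated titrating site. It is an illustration only and not the method used in this chapter: it ignores the coupling between sites that, as noted above, makes the usual concept of pKa inappropriate in a protein. The function names and the pKa values in the example (4.3 for an unperturbed carboxylate and a value shifted upward by two units) are assumptions chosen for illustration.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of an isolated site carrying its proton (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def mean_charge(ph: float, pka: float, acidic: bool = True) -> float:
    """Average charge of the site.

    Acidic groups (e.g. Asp, Glu) are neutral when protonated and -1 otherwise;
    basic groups (e.g. His, Lys) are +1 when protonated and neutral otherwise.
    """
    f = fraction_protonated(ph, pka)
    return -(1.0 - f) if acidic else f


# Assumed model values: a carboxylate with pKa 4.3, and the same group with
# its pKa shifted upward by two units by its environment.
for pka in (4.3, 6.3):
    q = mean_charge(ph=6.0, pka=pka, acidic=True)
    print(f"pKa {pka}: mean charge at pH 6.0 = {q:+.2f}")
```

Even in this idealized picture a two-unit shift changes the group from almost fully charged to mostly neutral at pH 6, which is why assigning one fixed charge per group can be a severe simplification.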
The rationale behind these models is that macroscopic electrostatic concepts such as dielectric constants or continuous charge distributions reflect the microscopic dynamics of charges and dipoles, so that the atomic description of the protein may be replaced, at least partially, with a macroscopic-like picture. Thus, though a molecular system is being dealt with, a macroscopic language is adopted and quantitative aspects can be handled using classical continuum electrostatics. Actually, only solvent dynamics are considered, the protein being considered in a single conformation, usually the average one obtained from NMR or X-ray diffraction studies. The simplest macroscopic model assumes that charges in the protein, either from titratable residues or permanent ions, interact through a medium characterized by a single dielectric constant. This approach cannot account for the fact that the protein interior and the solvent have very different dielectric properties. The dielectric constant reflects the orientation of dipoles induced by the local electric field. Two types of dipoles contribute to the global value: permanent and induced. Permanent dipoles arise from an unequal distribution of charge among neighbouring bonded atoms, as in the peptide bond or the water molecule. Induced dipoles are due to the deformation of electron clouds by the electric field, an effect called electronic polarization. The orientation of the high permanent dipole of the relatively free water molecules gives rise to a high dielectric constant (~80). In contrast, permanent dipoles in the protein interior are virtually fixed, and the orientation of the low dipoles arising from polarization leads to a much smaller value (2-4). The use of a single dielectric constant somewhere between these values is not realistic, since the use of a low value overestimates the interaction of charges at the protein surface, and the use of a high value underestimates the interaction of buried ones.
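The sensitivity of the single-dielectric model to the chosen constant can be seen from a direct evaluation of Coulomb's law. The sketch below is a minimal illustration, assuming two point charges in a uniform medium; the function name, the 5 Å separation and the intermediate value of 20 are arbitrary choices, and the physical constants are standard SI values.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def coulomb_energy_kj_mol(z1: int, z2: int, r_angstrom: float, eps_r: float) -> float:
    """Coulomb energy of two point charges in a uniform dielectric, in kJ/mol."""
    r_m = r_angstrom * 1e-10
    joule_per_pair = z1 * z2 * E_CHARGE**2 / (4.0 * math.pi * EPS0 * eps_r * r_m)
    return joule_per_pair * N_A / 1000.0


# An ion pair (+1/-1) at 5 Å, in water-like, intermediate and protein-like media.
for eps_r in (80.0, 20.0, 4.0):
    energy = coulomb_energy_kj_mol(+1, -1, 5.0, eps_r)
    print(f"eps_r = {eps_r:4.0f}: E = {energy:7.2f} kJ/mol")
```

Because the energy scales as 1/ε, moving from a water-like to a protein-like constant changes the computed interaction by a factor of 20, which is the over- and underestimation problem described above.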
Furthermore, a more subtle but nevertheless very important point is that the electrostatic interaction between a pair of charges does not occur along the straight line connecting them. Instead, electric field lines travel in curved paths over all space, and apparently paradoxical situations may result, e.g., two charged residues on opposite sides of a protein may interact with an electrostatic energy characteristic of a dielectric constant even higher than that of the solvent. The use of a single dielectric constant cannot possibly account for this kind of effect. A more realistic approach is to explicitly consider that the protein and solvent regions have different local dielectric constants. This means that the interactions can no longer be computed using Coulomb's law. Instead, the Poisson equation of the system of charges and dielectrics has to be solved. In addition to the protein charges, an atmosphere of counterions can be considered around the protein, as in the Debye-Hückel theory of electrolytes; in this case the equation to be solved is the Poisson-Boltzmann one, usually in its linear form. The nature of the solution of either the Poisson or Poisson-Boltzmann equations depends on the particular model being adopted. If the model has some geometrical symmetry an analytical solution may be possible. Otherwise, one has to resort to numerical methods, usually much more demanding in terms of computer time. A simple approximation is to consider the protein to be a sphere with the charges placed at a small distance beneath the surface and surrounded by an ionic atmosphere. For this spherical model it is possible to obtain an analytical solution to the Poisson-Boltzmann equation [8]. Despite the fact that in general proteins will deviate more or less from sphericity, this model was shown to usually give satisfactory results [4], especially when the interactions are corrected according to the solvent accessibility of the residues [9]. This is probably due to the fact that the shielding effect created by the solvent can be reasonably included even if some surface details are neglected.
However, the electrostatic energies obtained with this model include charge-charge interactions but not the so-called self-energies, which arise from the presence of charges deeply buried in the protein, a situation which, although rare, does occur in some cases. The more detailed approach is to consider explicitly the protein shape and use numerical methods to solve the electrostatic problem. The most common method in use is finite differences, though other methods have also been used (see, e.g., [1] and references therein). Given the atomic detail of these models, it is particularly interesting to compute the electrostatic potential around the protein, which can be conveniently displayed using equipotential surfaces. This representation can be very useful in establishing the relation of functional aspects with pH (see below). The computation of the self-energies is not a problem in these models. Although in the foregoing discussion we have implicitly assumed the protein molecule to be surrounded by water, nothing in the physical assumptions of the models precludes their use with other solvents. An organic solvent will have a much lower dielectric constant, comparable to that of the protein, leading to a much lower shielding of the interactions, i.e., the electrostatic interactions will be much stronger and more far-reaching than in an aqueous solution. Another variable shielding effect arises from the presence of counterions, i.e., shielding also depends on ionic strength. The detailed models using numerical methods can easily be extended to consider different dielectric regions outside the protein, so that, e.g., membrane proteins can be simulated in this way [10]. These dielectric regions can correspond to fixed or transient molecular structures. The binding/unbinding of the latter may in principle act as a modulating mechanism of electrostatic interactions, as exemplified below with carbohydrates.
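As a rough numerical illustration of the simplest model mentioned in this section (point charges in a single uniform dielectric, optionally screened by a Debye-Hückel ionic atmosphere), the following Python sketch compares the interaction energy of two unit charges for a protein-like and a water-like dielectric constant. The function names, the 10 Å separation and the 0.1 M ionic strength are illustrative assumptions; this is not the spherical or finite-difference model used in the chapter.

```python
import math

# Physical constants (SI units)
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_length(ionic_strength_molar, eps_r=80.0, temperature=298.15):
    """Debye screening length in metres for ionic strength given in mol/L."""
    ions_per_m3 = ionic_strength_molar * 1000.0 * N_A
    return math.sqrt(
        eps_r * EPS0 * K_B * temperature / (2.0 * ions_per_m3 * E_CHARGE ** 2)
    )


def pair_energy_kT(q1, q2, r_angstrom, eps_r,
                   ionic_strength_molar=0.0, temperature=298.15):
    """Coulomb energy of two point charges (in units of e) in a uniform
    dielectric, in units of kT; screened by exp(-r/lambda) if salt is present."""
    r = r_angstrom * 1e-10
    energy = q1 * q2 * E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * eps_r * r)
    if ionic_strength_molar > 0.0:
        lam = debye_length(ionic_strength_molar, eps_r, temperature)
        energy *= math.exp(-r / lam)
    return energy / (K_B * temperature)


if __name__ == "__main__":
    for eps_r in (4.0, 80.0):
        w = pair_energy_kT(+1, -1, 10.0, eps_r)
        print(f"eps = {eps_r:4.0f}: W = {w:7.2f} kT")
    w_salt = pair_energy_kT(+1, -1, 10.0, 80.0, ionic_strength_molar=0.1)
    print(f"eps = 80, I = 0.1 M: W = {w_salt:7.2f} kT")
```

The factor of 20 between the two dielectric constants shows why a single intermediate value cannot describe both buried and exposed charge pairs.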


2.3. pH-dependent electrostatics
As noted above, electrostatics depends closely and naturally on pH. As we pointed out there, the assignment of charge to the titrable residues may be problematic, since the pKa of a residue may be shifted from its typical value by several pH units. There are two main reasons for these shifts. One is the fact that the protein dielectric environment is different from the one experienced by a model compound (i.e., an analogue having the same titrable group as the residue) in solution, to which the tabulated values refer. If a residue is buried in the mainly apolar interior of the protein, its charged form will be more unstable than in the case of a single residue or model compound in direct contact with water, and its pKa will be shifted in the direction that favors the neutral form. The pKa value resulting from this shift (and assuming an otherwise neutral protein) is usually referred to as the intrinsic pKa [8] and may differ from the model value by several pH units. However, most titrable residues are at the protein surface and their intrinsic pKa's are usually roughly identical to the model pKa's. Yet another cause of pKa shifts exists for both buried and exposed residues: the interaction with other charges in the protein. The positive form of a titrable residue (Arg, Lys, His, N-terminus) will be more stable/unstable due to the presence of a nearby negative/positive charge, and conversely for a negative form (Asp, Glu, Tyr, free Cys, C-terminus). If these neighboring charges also belong to titrable residues, this pKa shift (and thus the pKa itself) becomes pH-dependent. Therefore, one can no longer speak of the pKa of a residue in the usual sense of the value that uniquely characterizes the equilibrium of its two forms at all pH values. Instead, one usually defines an apparent pKa as the pH value at which the residue is half-protonated (i.e., at which half of the protein molecules have that residue protonated).
(In some extremely anomalous cases even this concept becomes useless, since the half charge may exist at several pH values.) From a theoretical standpoint, the (pH-dependent) pKa of a given residue in a given protein may be written as

$$\mathrm{p}K_a = \mathrm{p}K_{\mathrm{model}} - \frac{1}{2.3RT}\,\Delta\Delta G_{w \to p}$$

where $\mathrm{p}K_{\mathrm{model}}$ is the pKa of the corresponding model compound in water and $\Delta\Delta G_{w \to p}$ is the change in free energy for the titration reaction when one moves the residue from water to the protein. This change can be split into the two contributions corresponding to the effects discussed above:

$$\Delta\Delta G_{w \to p} = \Delta\Delta G_{\mathrm{environment}} + \Delta G_{\mathrm{interaction}}$$

where $\Delta\Delta G_{\mathrm{environment}}$ is the contribution due to moving the residue from water into the neutral protein and $\Delta G_{\mathrm{interaction}}$ is the contribution due to the interaction with other charged residues. The first term may be obtained by assuming a thermodynamic cycle involving the protonated and unprotonated forms of the residue in both the protein and the solution and computing the free energy terms using the electrostatic models discussed above [11]. For most (exposed) residues, however, this term can simply be assumed to be zero. The term $\Delta G_{\mathrm{interaction}}$ can also be obtained from electrostatic calculations, but one cannot simply assign a fixed charge to the other residues, because at certain pH values some of them may have significant populations of both protonation forms. Another way of stating the problem is to say that a protein with N titrable residues has $2^N$ possible global charge sets, and that its titration behavior reflects the relative populations of all these sets at each pH value [8] [11]. The computation of all the $2^N$ charge set populations is very demanding and several approximate methods have been suggested [12] [13] [14] [15]. The simplest approach is to assume that, although interacting, the residues titrate independently of each other, an approximation which can lead to errors only if strongly interacting residues titrate in the same pH region [12]. When this approximation is used the above interaction term reduces to [11] [16]

$$\Delta G_{\mathrm{interaction}} = \sum_{j \neq i} W_{ij}\, z_j$$

where $W_{ij}$ is the electrostatic interaction between a pair of unitary charges placed at residues i and j and $z_j$ is the mean charge of residue j at the considered pH. Therefore, the pKa of a residue i is given by

$$\mathrm{p}K_i = \mathrm{p}K_{i,\mathrm{int}} - \frac{1}{2.3RT} \sum_{j \neq i} W_{ij}\, z_j$$

where $\mathrm{p}K_{i,\mathrm{int}}$ is its intrinsic pKa. The usual procedure is to use the previous equation and the Henderson-Hasselbalch equation and compute alternately the pKa's and mean charges until self-consistency is reached [16]. We will use this approach to compute mean charges at several pH values, with the $W_{ij}$ terms computed with the aforementioned spherical model and corrected for the solvent accessibility, a procedure usually called the modified Tanford-Kirkwood method [9]. Once the residue mean charges are known at a given pH value, one can use those charges in an electrostatic calculation using a numerical method, as described above, and compute the electrostatic potential around the protein molecule. The potential thus obtained corresponds to the mean potential existing at the considered pH value. By repeating this procedure one can obtain a pH-dependent electrostatic profile of the molecule, as done for the PGM study presented below.

2.4. Carbohydrates as electrostatic modulators

The binding of a carbohydrate molecule to a protein may influence the function of the latter in various ways. Usually one tends to focus on induced conformational changes of the protein. However, though conformational changes may lead to pronounced functional differences, purely electrostatic effects also arise and in some cases they may be of more importance than the former (which may not even occur). The most obvious case is when the carbohydrate brings one or more new charges to the protein charge system (e.g., glucose-6-phosphate in PGM). Besides being an addition to the existing charge set, the new charge(s) will influence neighboring titrable residues and may shift their pKa values, which will result in new mean charges at a given pH value. The pH-dependence of electrostatic interactions may be further modified if the charged carbohydrate is itself titrable. Even when the carbohydrate does not bear a formal charge, it will change the dielectric environment upon binding. A region previously filled with water (assuming no extensive conformational changes) will be occupied by a medium of much lower dielectric constant. This may act on the electrostatics through two different routes. In the first place the binding may cause a total or partial burying of some titrable residues of the protein, causing their intrinsic pKa values (see above) to change and thereby shifting their pKa's. Secondly, it will modify the interaction of previously existing charges, by modifying the path of electric field lines. For example, two neighboring charged residues on the surface will in general have their interaction increased if a carbohydrate binds between them, possibly leading to further shifting of their pKa's. The new balance of electrostatic interactions arising from carbohydrate binding may affect the steering and/or orientation of other approaching molecules. For example, a different approach and docking of an incoming substrate may lead to a different reaction rate, if the latter is diffusion-limited. Similar effects may occur with cofactors, effectors, etc. Since further pKa shifts may occur, the magnitude of the electrostatic changes varies with pH, so that the pH-dependent functional profile of the protein may change as well. In conclusion, carbohydrates may act as versatile electrostatic modulators of protein function even when their binding induces few or no conformational changes in the protein molecule.
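The self-consistent procedure of Section 2.3 and the effect of a bound charged ligand can be sketched together in a few lines of Python. The sketch below alternates between the pKa expression (with interaction energies expressed in units of RT, so that the factor 1/2.3RT becomes 1/ln 10) and the Henderson-Hasselbalch equation until the mean charges stop changing. The two sites, their intrinsic pKa values, the interaction energies and the ligand charge of -2 are purely illustrative assumptions, not values computed from any structure, and the code is not the Titra program.

```python
import math

LN10 = math.log(10.0)  # the "2.3" factor when energies are given in units of RT


def mean_charge(pH, pKa, kind):
    """Henderson-Hasselbalch mean charge of one site.
    kind == 'base': charge +1 when protonated; 'acid': charge -1 when deprotonated."""
    protonated = 1.0 / (1.0 + 10.0 ** (pH - pKa))
    return protonated if kind == "base" else protonated - 1.0


def titrate(pH, intrinsic_pKa, kinds, W, fixed=(), tol=1e-8, max_iter=1000, damping=0.5):
    """Iterate pK_i = pK_int,i - (1/ln10) * sum_{j != i} W[i][j] * z_j to self-consistency.

    W[i][j] is the interaction energy (units of RT) of unit charges at sites i and j.
    Columns beyond the titrable sites refer to the fixed charges listed in `fixed`.
    """
    n = len(intrinsic_pKa)
    z = [mean_charge(pH, pk, k) for pk, k in zip(intrinsic_pKa, kinds)]
    pKa = list(intrinsic_pKa)
    for _ in range(max_iter):
        charges = z + list(fixed)
        pKa = [
            intrinsic_pKa[i]
            - sum(W[i][j] * charges[j] for j in range(len(charges)) if j != i) / LN10
            for i in range(n)
        ]
        new_z = [mean_charge(pH, pKa[i], kinds[i]) for i in range(n)]
        change = max(abs(a - b) for a, b in zip(new_z, z))
        # Damped update keeps the alternation from oscillating.
        z = [damping * a + (1.0 - damping) * b for a, b in zip(new_z, z)]
        if change < tol:
            break
    return pKa, z


if __name__ == "__main__":
    kinds = ["acid", "base"]    # hypothetical Asp-like and His-like sites
    pK_int = [4.0, 6.5]         # assumed intrinsic pKa values
    W_free = [[0.0, 2.0], [2.0, 0.0]]
    # Third column: interaction with a bound ligand carrying a fixed charge.
    W_bound = [[0.0, 2.0, 1.5], [2.0, 0.0, 1.5]]
    for pH in (5.0, 7.0):
        pk_f, z_f = titrate(pH, pK_int, kinds, W_free)
        pk_b, z_b = titrate(pH, pK_int, kinds, W_bound, fixed=[-2.0])
        print(pH, [round(x, 2) for x in pk_f], [round(x, 2) for x in pk_b])
```

In this toy system the negatively charged ligand raises the pKa of both sites, so at a given pH the acid is less ionized and the base more protonated than in the free protein.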

3. ELECTROSTATIC PROPERTIES OF α-AMYLASES
The α-amylases are endohydrolases acting on α-1,4-glucosidic bonds of starch and related dextrins [17]. They are retaining hydrolases, which means that the stereochemistry at the chiral centers that were linked by the cleaved bond is retained after hydrolysis. Structurally the α-amylases consist of a (β/α)8-barrel fold succeeded by a γ-crystallin-type domain of unknown function (Fig. 1). The active site is formed by loops at the C-terminal ends of the barrel β-strands. The amylases may have quite different pH optima with respect to catalytic activity, e.g. Taka amylase has a pH optimum close to 7.0, whereas the optimum for acid amylase is closer to 4.5. In order to identify possible explanations for the difference in pH optimum for α-amylases, we have looked at the structures of acid amylase [18] and Taka amylase [19].

3.1. Experimental
The two structures were aligned into identical orientations by superimposing the backbone atoms. The charge distributions at various pH values were computed using the Titra program (to be published). The charge of the primary Ca²⁺ ion was fixed to 2.0, and the charge of the secondary Ca²⁺ ion was fixed to 0.0. A solid molecular surface representation of the molecules was generated using the Grasp program [20], and the electrostatic potential at the surface was computed using the electrostatic model of Grasp (which is based on the DelPhi program [21]) with charges from the Titra computations. The residues belonging to the active site region were identified using the "scribing" feature of Grasp.

3.2. Results
The simulated titration curves for the two amylases (Fig. 2) do show a small shift of the Taka amylase towards a less acidic situation. However, for most of the pH range the difference


Figure 1. The structure of Taka amylase. The two Ca²⁺ ions are shown as yellow spheres. The trace of the structure is color coded according to the intron-exon pattern of the Taka amylase gene (Gines et al., 1989), so that the sequence corresponding to each exon has a distinct color. The orientation of the molecule is the same as for Fig. 3. The figure was generated using the Molscript program [34].


Figure 2. Simulated titration curve for acid amylase and Taka amylase, computed using the Titra program.


Figure 3. Electrostatic potential at the molecular surface for acid amylase (left) and Taka amylase (right) at the pH values 2.0 (top), 4.0 (upper middle), 6.0 (lower middle) and 8.0 (bottom). Blue color represents positive potential, red color negative potential. The yellow sphere represents the binding site for the secondary Ca²⁺ ion. This ion has been assigned a charge of 0.0 in the electrostatic computations, and it has not been included in the surface rendering.


Figure 4. Simulated titration curve for the active site environment, based on the simulated titration data from Fig. 2.

in the charge properties for these two molecules seems to be rather small, and it is difficult to predict any significant change in pH optimum based on these data. If we look at the electrostatic potential at the molecular surface (Fig. 3) the difference is more obvious. At pH 4.0 there is a significant charge polarity at the bottom of the active site pocket of the acid amylase, whereas for the Taka amylase most of this pocket is strongly positive. As we move past pH 6.0 towards a more neutral environment, we see an increasing polarisation of the active site pocket of the Taka amylase, whereas the acid amylase is moving towards a situation with a strong negative charge in the pocket. If we postulate that the charge polarity is an essential feature of the amylase active site pocket, then this difference between the acid and the Taka amylase seems to explain much of the difference in pH optimum between the enzymes. In order to identify the residues responsible for this difference the atoms of the active site pocket were identified using the scribing feature of Grasp. The surface of the pocket up to the upper edge between the pocket and the exposed surface was made into a separate surface, and residues close to this surface were identified by two different approaches. In the first approach residues with atoms within 5.0 Å of the surface were included (Table 1). In the second approach only residues with atoms in direct contact with the surface were included (Table 2). As can be seen, the results are remarkably similar. Most residues not included in the second approach are either uncharged residues or residues representing conservative mutations (when comparing acid and Taka amylase). The two exceptions, residues 154 (Asp in acid amylase, Asn in Taka amylase) and 339 (Lys, Asn), are both quite far from the active site, and probably of less importance for understanding the catalytic activity. If we use the computation from Fig. 2, but include only the charge contribution from residues included in Table 2, then we get the simulated "titration curve" shown in Fig. 4. This curve can be looked upon as the local charge properties for the active site environment. We see a more significant difference between the two enzymes, given that we assume a similar net charge of the active site region in the two structures at their optimum pH. It is important to realize that this is an accumulated effect caused by a number of mutations. If we look at the shifts in titration curves for individual conserved active site residues (Fig. 5) we see that in most cases the shift in charge distribution is in the opposite direction compared to Fig. 2 and Fig. 4, towards more negative (or less positive) values as we move from acid to Taka amylase. This is not surprising, since an acidic group will have its pKa lowered through favorable interactions with basic groups [22]. Therefore, although there are large effects on the pKa of individual residues, the shift in pH optimum seems to be caused by changes in the overall charge distribution rather than individual pKa shifts. In Fig. 6 we have highlighted the most important differences between acid and Taka amylase. We see that most of the conserved charged residues are in the bottom of the pocket. Around the edge of the pocket we have six mutations shifting the average charge distribution towards more positive values, and two mutations with the opposite effect. One of the mutations seems to be of particular importance: at position 210 a Glu in acid amylase is mutated into a His in Taka amylase. This mutation will have a dramatic effect on the electrostatic polarity of the pocket, in particular since this residue is one of the mutated residues closest to the bottom of the pocket and the active site of the amylases. Identification of the E210H mutation as a key feature for understanding the difference in pH optimum between acid and Taka amylase is in good agreement with experimental data. For human pancreatic α-amylase the mutation H201N (corresponding to H210 in Taka amylase) shifted the pH optimum from 6.9 to 5.2 [23].
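The idea of a local "titration curve" for a subset of residues, and the size of the effect of a single Glu to His substitution, can be illustrated with a minimal Python sketch. It sums Henderson-Hasselbalch charges using typical model-compound pKa values (assumed here; handbooks differ slightly) instead of the pH-dependent pKa's computed in this chapter, and the residue lists are hypothetical rather than those of Table 2, so only the qualitative trend is meaningful.

```python
# Typical model-compound pKa values and the sign of the charged form
# (assumed illustrative values, not taken from this chapter).
MODEL_PKA = {
    "Asp": (3.9, -1),
    "Glu": (4.3, -1),
    "Tyr": (10.1, -1),
    "His": (6.5, +1),
    "Lys": (10.5, +1),
    "Arg": (12.5, +1),
}


def residue_charge(name, pH):
    """Mean charge of one titrable residue from the Henderson-Hasselbalch equation."""
    pKa, sign = MODEL_PKA[name]
    protonated = 1.0 / (1.0 + 10.0 ** (pH - pKa))
    return protonated if sign > 0 else protonated - 1.0


def net_charge(residues, pH):
    """Sum of mean charges over a residue subset; non-titrable residues count as zero."""
    return sum(residue_charge(r, pH) for r in residues if r in MODEL_PKA)


if __name__ == "__main__":
    # Hypothetical pocket compositions differing by one Glu -> His substitution.
    pocket_glu = ["Asp", "Glu", "Asp", "His", "His", "Asn", "Glu"]
    pocket_his = ["Asp", "Glu", "Asp", "His", "His", "Asn", "His"]
    for pH in (2.0, 4.0, 6.0, 8.0):
        q_glu = net_charge(pocket_glu, pH)
        q_his = net_charge(pocket_his, pH)
        print(f"pH {pH:3.1f}: Glu variant {q_glu:+.2f}, His variant {q_his:+.2f}")
```

With these assumed pKa values the substitution changes the local net charge by about one unit at the pH extremes and by almost two units between the two pKa values, where the Glu is already ionized and the His is still protonated.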

Table 1
Residues with atoms within 5.0 Å of the surface of the active site pocket, separated into residues (potentially) carrying positive (+), negative (-) or no charge (0). Diagonal elements represent residues that are of the same class in both structures, with conservative mutations shown in parentheses. Off-diagonal elements represent residues that are of different class in the two structures. Rows refer to 2aaa, columns to 6taa.

                  6taa
              +       -       0
2aaa    +    5(0)     0       1
        -     1      5(1)     5
        0     1       2     38(11)

Table 2
Residues with atoms that are in direct contact with the surface of the active site pocket. See Table 1 for more details.

                  6taa
              +       -       0
2aaa    +    3(0)     0       0
        -     1      4(0)     4
        0     1       2     12(3)
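As a small consistency check, the off-diagonal counts of Table 2 can be tallied by the direction of the charge shift. The Python sketch below assumes that the table is read as a 3×3 matrix with rows giving the class in 2aaa and columns the class in 6taa (both in the order +, -, 0), and that the shift is taken from the row structure to the column structure; the counts in parentheses are left out.

```python
# Class labels, in the order used for both rows and columns.
CLASSES = ["+", "-", "0"]
NOMINAL_CHARGE = {"+": 1, "-": -1, "0": 0}

# Table 2 counts read as a 3x3 matrix (assumed layout):
# rows = class in 2aaa, columns = class in 6taa.
TABLE_2 = [
    [3, 0, 0],
    [1, 4, 4],
    [1, 2, 12],
]


def count_shifts(table):
    """Return (more_positive, more_negative, unchanged) residue counts,
    comparing the column class with the row class."""
    more_positive = more_negative = unchanged = 0
    for i, row_class in enumerate(CLASSES):
        for j, col_class in enumerate(CLASSES):
            delta = NOMINAL_CHARGE[col_class] - NOMINAL_CHARGE[row_class]
            if delta > 0:
                more_positive += table[i][j]
            elif delta < 0:
                more_negative += table[i][j]
            else:
                unchanged += table[i][j]
    return more_positive, more_negative, unchanged


if __name__ == "__main__":
    print(count_shifts(TABLE_2))
```

Under this reading the tally gives six substitutions towards more positive charge and two in the opposite direction, which matches the numbers quoted in the text for the edge of the pocket.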

4. APPLICATION TO PHOSPHOGLUCOMUTASE

4.1. Protein Engineering on the Acetobacter xylinum Phosphoglucomutase
Phosphoglucomutase (PGM) catalyzes the interconversion between glucose-1-phosphate (G-1-P) and glucose-6-phosphate (G-6-P), which represents a branch point in carbohydrate metabolism. Biochemical studies indicate that PGM's from a variety of organisms convert G-1-P much more efficiently than the corresponding 6-phosphate isomer. G-6-P enters catabolic processes to yield energy and reducing power, whereas G-1-P is the precursor of sugar nucleotides that are used by the cells in the synthesis of various glucose-containing polysaccharides. We have previously cloned and sequenced the Acetobacter xylinum gene (celB) encoding PGM [24]. In this bacterium PGM is essential for the formation of extracellular cellulose, as mutants deficient in the corresponding gene are unable to produce this polymer [25]. Similar observations have also been reported for xanthan production in Xanthomonas campestris [26], and for the biosynthesis of both alginate and lipopolysaccharide in Pseudomonas aeruginosa [27]. In Escherichia coli it has been shown that mutants deficient in PGM accumulate intracellular amylose when grown in the presence of maltose [28]. The


Figure 5. Simulated titration curves for individual conserved residues in the active site region. D206 is believed to be the enzyme nucleophile, E230 is probably the general acid catalyst, D297 may stabilize the partially charged intermediate, and H122 and H296 are involved in transition state stabilisation (Svensson and Søgaard, 1993). The letters 'a' and 't' represent acid and Taka amylase, respectively.


Figure 6. CPK representation of the active site residues of acid amylase. The orientation of the molecule is the same as in Fig. 3. Charged residues in the active site pocket have been color coded according to mutation compared to Taka amylase. Red represents residues mutated so that the net charge is shifted towards more negative values, yellow residues where the mutation has the opposite effect. Blue represents residues that are not mutated, and residue number 210 is colored green (please see text for more details).

probable reason for this is that maltose metabolism results in the formation of G-1-P which cannot be channelled into catabolism due to the PGM deficiency. In the active form of PGM, a divalent metal ion is bound to the enzyme, and a serine residue at the catalytic site is phosphorylated. This phosphate group is initially transferred to the substrate, and the serine residue is rephosphorylated concomitant with the release of either of the two phosphoglucose isomers [29]. The crystal structure of rabbit muscle PGM has previously been described, showing three loops closely spaced in the active-site cleft [30]. One loop contains the active-site serine, one is a metal-ion-binding loop, and the third is suggested to be involved in substrate-binding specificity. Phosphomannomutase (PMM) catalyzes a reaction similar to the PGM reaction, and to date, numerous PMM's from a variety of different organisms have been sequenced and characterized. Generally, they all share the active-site and the metal-binding-loop sequence motifs typical of PGM's, and several of them can in fact efficiently convert both phosphorylated glucose and mannose. This property was however not observed for CelB, and in this respect this enzyme seems to belong to a class of PGM's highly specific for glucose phosphates. All proteins belonging to this class indeed share some common sequence motifs that are not present in the other corresponding proteins with broader substrate specificity [24]. In order to learn more about the mechanisms and particular amino acid residues directly involved in the catalytic reaction of PGM, we have performed site-directed mutagenesis on the A. xylinum PGM gene. So far, nine different amino acid residues in CelB have been mutated to alanine. Mutations introduced are predominantly in codons encoding amino acid residues believed to be involved in substrate or metal-ion binding in the active enzyme. The results of expression analysis in E. coli show that specific enzyme activities for all mutants are reduced to various degrees, compared with the wild-type enzyme. Several of them seem to have completely lost their catalytic properties. The particular residues mutated and the relative specific activities of the resulting enzyme mutants are presented in Table 3.

Table 3
Expression analysis of CelB enzyme mutants in E. coli. Specific PGM activity is expressed as percentage of wild-type activity.

CelB mutant              WT    T45   S46   R49    K158   R311   R360   K377   E393   E394
Specific PGM activity    100   0.2   42    <0.1   <0.1   <0.1   65     15     <0.1   1.5

4.2. Sequence alignment and homology modeling
In order to perform electrostatic calculations for Acetobacter xylinum phosphoglucomutase (PGM), we had to build its structure from the known sequence by homology modeling. The only known PGM structure is the rabbit one reported in [30] (see Acknowledgements). The rabbit and A. xylinum sequences have a 27% residue identity and display extensive homology. In order to improve the alignment of the two sequences, we have done a multiple alignment

rabbit  --------------------V---KIVTVKT--KAYPDQKP--GTSGLRK
Acetob  MPSISPFAGKPVDPDRLVNIDALLDAYYTRKPDPAIATQRVAFGTSGHRX

rabbit  RVKVFQSSTNYAENFIQSIISTVEPAQR-QEAT--LVVGGDGRFYMKEAI
Acetob  ---GSSLTTSFNENHILSISQAIADYRKGAGITGPLFIGIDTHALSRPAL

rabbit  QLIVRIAAANGIGRLVIGQNGILSTPAVSCIIRKI---KAI---GGIILT
Acetob  KSALEVFAANGVEVRIDAQDGYTPTPVISHAILTYNRDRSSDLADGVVIT

rabbit  ASHNPGGPNGDFGIKFNISNGGPAPEAITDKIFQISKTIEEYAICPDLKV
Acetob  PSHNP--PE-DGGYKYNPPHGGPADTDITKVVETAANDYM

rabbit  DLGVLGKQQFDLENKFKPFTVEIVDSVEAYATMLRNIFDFNALKELLSGP
Acetob  AKKMEGVKRVSFEDALKAPTTKRHDYITPYVDDLAAVVDMDVIRE--SG-

rabbit  NR---LKIRIDAMHGVVGPYVKKILCEELG-APAN-SAVNCVPLEDFGGH
Acetob  -----VSIGIDPLGGAAVDYWQPII-DKYG-INATIVSKEVDPTFRFMTA

rabbit  HPDPNLT------YAADLVETMKSGEHDFGAAFDGDGDRNMIL-GKHGFF
Acetob  DWDGQIRMDCSSPYAMARLVGMK-DKFDIAFANDTDADRHGIVSGKYG-L

rabbit  VNPSDSVAVIAANIFSIPYFQQTGVRGFARSMPTSGALDRVANATKIALY
Acetob  MNPNHYLAVAIEYLFNNRE-NWNASAGVGKTVVSSSMIDRVAKEIGRKLV

rabbit  ETPTGWKFFGNLMDASKLSLCGEESFGTGSD------HIREKDGLWAVLA
Acetob  EVPVGFKWFXGLYNGTLGFGGEESAGASFLRRAGTVWSTDKDGIILGLL

rabbit  WLSILATRKQSVEDILKDHWHKFGRNFFTRYDYEEVEAEGATKMMKDLEA
Acetob  AAEITARTKRTPGAAYEDMTRRLGTPYYARIDAPA-DPE-QKAILKNLSP

rabbit  LMFDRS-FVGK-QFSANDKVYTVEKADNFEYHDPVDGSVSKNQGLRLIFA
Acetob  EQIGMT------ELAGEPILSTLTNA---------PGNGAAIGGLKVSAK

rabbit  DGSRIIFRLSGTGSAGATIRLYIDSYEKDNAKINQDPQVMLAPLISIALK
Acetob  DG-WFAARPSGTEN---VYKIYAESF-KSAAHLKAIQTEAQDAISALFAK

rabbit  VSQLQERTGRTAPTVIT
Acetob  AAQKNAG

[The stars and points marking conserved positions could not be recovered. X marks an illegible character; the Acetob rows of the fourth and ninth blocks are shown without complete gap padding.]

Figure 7. Alignment of Acetobacter xylinum with rabbit muscle phosphoglucomutase.


Figure 8. Model 3D structure of phosphoglucomutase from Acetobacter xylinum. The model is based on a revised version of the rabbit phosphoglucomutase structure (Dai et al., 1992).

with several PGM sequences from rabbit, human and the yeast S. cerevisiae (PIR codes: PMRB, B41801, A45077, B45077, A41801, S37699, S40264). Even so, the alignment programs used, MALIGN and CLUSTAL W [31], failed to identify some obviously important and conserved structural motifs (e.g., the GTSG motif near the active site region), which had to be aligned by hand. The final alignment of the rabbit and A. xylinum sequences used as the starting point for the building of the structure of the latter is shown in Fig. 7. The model was built using the facilities of the Homology package of the Insight II program (Biosym Technologies, San Diego). Residues marked in Fig. 7 with a star or a point were assigned to the corresponding rabbit positions. Conformations for the intervening segments were searched in the loop database of Insight II, and selected on the basis of consistency with the flanking secondary structure and solvent exposure of polar or charged residues. The structure was then relaxed by energy minimization. Since the process was done in the absence of solvent, and to avoid unbalanced electrostatic forces, the neutral forms were used for titrable residues. Successive relaxations were done of, respectively, the loops, residues marked with a point, and residues marked with a star. Each of these relaxations was done in two stages: first the side chain and then all the residue atoms. After each relaxation stage the side chain orientation of (potentially) charged residues was checked and new conformations were selected from a rotamer library when it seemed necessary. Residues highly conserved and believed to be important for activity, either in the active site or its vicinity (see below), were kept fixed during the relaxation process. The secondary structures of the N- and C-termini were constructed according to the prediction of the PHDsec program, using the method of [32] [33]. These secondary motifs were then folded onto the seemingly most likely hydrophobic patches on the protein surface, oriented so as to maintain a hydrophobic/hydrophilic polarity. The final model is shown in Fig. 8.

4.3. Electrostatic studies of native and mutant phosphoglucomutase
Using the model of A. xylinum PGM it is now possible to investigate consequences of the assumed 3D structure for the phosphoglucomutase. In particular it is of interest to investigate electrostatic effects, both in terms of pH variations and in terms of electrostatic changes introduced by the site-specific mutations (vide supra). Fig. 9 shows the pH-dependent variations in the isopotential surfaces at pH 5, 7 and 9. It is clear that major changes occur in this pH range, and that the active site is strongly influenced by these changes.

5. CONCLUSIONS
We have introduced the general concepts that are of importance in protein electrostatics. In order to illustrate the effects we have investigated two cases. The first is the marked shift of the pH optimum for enzyme activity towards acidic values that the acid amylase displays when compared with the Taka amylase. We found no global features that could explain this observation. However, when studying the active site environment more closely, we identified residue 210 as a key residue for understanding the differences in pH optimum for these two amylases, although a number of other residues may be important for understanding the detailed electrostatic (and enzymatic) properties. This conclusion is in good agreement with experimental data. We also presented electrostatic isopotential data for several pH values (5, 7 and 9) for the native Acetobacter


Figure 9. pH variations in the electrostatic isopotential surfaces of phosphoglucomutase from Acetobacter xylinum. a: pH 5, b: pH 7 and c: pH 9. Color coding of isopotential surfaces: white +1 kT and red −1 kT.


Figure 10. The electrostatic effects of the charge mutant R49A. In a) the electrostatic isopotential surfaces are shown for the R49A mutant. In b) the difference map between the native Acetobacter xylinum phosphoglucomutase (Fig. 9b) and the R49A mutant is shown.


xylinum phosphoglucomutase. As an illustration we have visualized the electrostatic consequences of a single mutation in the active site, R49A. The difference electrostatic maps make it evident that the active site is dramatically influenced by this mutation. Again, experimental data correlate with this: the enzymatic activity of this mutant is reduced by a factor of 10^3. The data presented here cannot constitute a proof of the general applicability of protein electrostatics in the study of protein structure-function relationships, but as case studies we find them very interesting. Work is in progress in our lab aimed at widening the application areas for such calculations.

6. ACKNOWLEDGEMENTS

We thank W.J. Ray Jr. for kindly providing us with a revised structure of rabbit phosphoglucomutase prior to its deposition in a public databank. A.B. thanks Junta Nacional de Investigação Científica e Tecnológica, Portugal, for his grant.


S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © Elsevier Science B.V. All rights reserved.


Effects of glycosylation on protein folding, stability and solubility. Studies of chemically modified or engineered plant and fungal peroxidases

K. G. Welinder and J. W. Tams

Department of Protein Chemistry, Institute of Molecular Biology, University of Copenhagen, Øster Farimagsgade 2A, DK-1353 Copenhagen K, Denmark

Abstract

Size homogeneity, charge homogeneity and solubility are important parameters for the growth of quality crystals for structure determination of proteins. Horseradish peroxidase isozyme C (HRP) contains 8 heterogeneous N-linked glycans which have hampered crystallographic studies. We have prepared fully active, homogeneous HRP by treatment with trifluoromethanesulfonic acid using a modified protocol and four steps of chromatographic purification. Coprinus cinereus peroxidase (CIP) and site-directed mutants of CIP are secreted in high yields to the fermentation medium of Aspergillus oryzae transformants. Wild-type CIP contains 1 N-linked and 2 O-linked heterogeneous glycans, which were removed by mutagenesis, yielding the 0N mutant. In addition, 1N, 2N, 4N and 6N glyco-mutants were constructed. The enzymatic activities of these mutants were identical to that of the wild-type. Comparison of the properties of fully glycosylated wild-type HRP and deglycosylated HRP, and of wild-type CIP and the glyco-mutants of CIP, showed that the thermodynamic stability towards heat and denaturant was very little affected, whereas the rates of peroxidase unfolding and refolding, which determine the kinetic stability, decreased significantly with increasing extent of glycosylation. Increasing contents of carbohydrate greatly increased the solubility in salt solution and decreased the solubility in acetone-water mixtures.

1. INTRODUCTION

The heterogeneous nature of glycoproteins has been an obstacle to many biochemical studies, most prominently to the growth of quality crystals for X-ray diffraction studies. In addition, highly glycosylated proteins are very soluble in salt solution, and solubility is another important parameter of crystal growth. Glycans can be partly or entirely removed by specific glycosidases or chemical methods, yielding a seemingly homogeneous but rarely fully active protein. Protein engineering is a more recent method to manipulate glycosylation. We have applied these methods to generate glyco-variants for the study of physico-chemical properties and roles of glycans in glycoproteins. The heme- and Ca2+-containing peroxidases proved ideal for such studies, the built-in heme being a convenient spectroscopic probe of protein folding and activity, and Ca2+ availability being essential to reversible folding. In the present paper we summarize our experience with chemically modified and engineered glycoproteins.

2. CHEMICAL DEGLYCOSYLATION OF HORSERADISH PEROXIDASE

Horseradish isoperoxidase C (HRP) is a heme- and Ca2+-containing glycoprotein of Mr 44,000 and pI 9 [1]. HRP contains 8 N-linked glycans, Manα3(Manα6)(Xylβ2)Manβ4GlcNAcβ4(Fucα3)GlcNAc- [2]. These glycans were resistant to enzymatic hydrolysis by the endoglycosidases peptide-N4-(N-acetyl-β-glucosaminyl)asparagine amidase F, endo-β-N-acetylglucosaminidase H, and endo-β-N-acetylglucosaminidase F under conditions where ovalbumin was deglycosylated [3]. Trifluoromethanesulfonic acid (TFMS) cleavage according to published protocols removed the glycans of HRP but left an inactive and highly modified peroxidase. The pyridine used for neutralization of TFMS extracted the heme, and was therefore replaced by 2 M Tris base, bringing the pH to 8. IEF (isoelectric focusing), however, showed a ladder of bands of pI 5.5-3.5 rather than a pI of 9, indicating that negative charges had been introduced. The procedure still yielded inactive HRP. We developed a revised procedure using TFMS for 5 min at -10 °C in the presence of 90 mM phenol, which removed all carbohydrate except (GlcNAc)2 glycans, and preserved 60 % of the peroxidase activity [3]. The active peroxidase fraction was retained by affinity chromatography on benzhydroxamic acid-agarose (KemEnTech, Copenhagen) [4]. 10 % of the active fraction was completely homogeneous by anion and cation exchange chromatography, SDS-PAGE and IEF-PAGE, showing an Mr of 35,500 and pI 9, and containing the (GlcNAc)2 glycans. No other modifications were detected by amino acid analysis, mass spectrometric and sequence analyses of tryptic peptides, and absorption spectroscopy.

The enzymatic activities of the homogeneous deglycosylated HRP (d-HRP) and authentic HRP using hydrogen peroxide and o-dianisidine were identical. Silva et al. [5] previously reported that the glycans were essential to HRP activity, whereas Smith et al. [6] found very little difference in the activities of carbohydrate-free recombinant HRP expressed in E. coli and authentic plant HRP. We therefore conclude that the 8 glycans occurring naturally in HRP do not interfere with substrate access, binding or catalysis. Furthermore, glyco-variants must be meticulously analyzed for possible modifications outside the carbohydrate moiety.

The heat stability of HRP and d-HRP was compared by incubating 1.1 µM peroxidase in 50 mM sodium citrate buffer, pH 6.0, for 15 min at temperatures of 25-80 °C. Further unfolding or refolding reactions were arrested by immediate 20-fold dilution at room temperature with 20 mM EDTA, which binds the two Ca2+ released per molecule of unfolded HRP or d-HRP. The residual activity was determined using o-dianisidine. In these experiments the equilibrium between native and unfolded protein was assumed to be reached under nearly reversible conditions. The inactivation curves for HRP and d-HRP were indistinguishable and showed 50 % residual activity at 57 °C.

HRP is stable in 8 M urea at room temperature, and the more potent denaturant GdmCl was therefore used to study the denaturation of HRP and d-HRP. The unfolding of HRP and d-HRP in GdmCl at pH 7.0 containing 20 mM EDTA was initiated by the addition of the native proteins and was irreversible because the calcium ions were sequestered by EDTA. The reaction was followed by the decrease in absorbance at 402 nm, which indicates the loss of heme from the peroxidase. At 402 nm, ε = 100 mM⁻¹ cm⁻¹ for native and 40 mM⁻¹ cm⁻¹ for unfolded peroxidase. A semi-logarithmic plot of the experimental data can be fitted to a straight line, indicating that the unfolding obeys first-order kinetics. The rate constant for the unfolding process from native to unfolded protein (N → U), $k_u = -\ln([\mathrm{native}]_t/[\mathrm{native}]_0)/t$, is equal to the negative slope. At 5.2 M GdmCl the unfolding rate $k_u$ for d-HRP was increased 3-fold as compared to HRP (0.50 × 10⁻³ sec⁻¹). At 5.8 M GdmCl $k_u$ was increased 2-fold for d-HRP as compared to HRP (2.0 × 10⁻³ sec⁻¹).

The identical heat stability of d-HRP and HRP indicated that ΔG° for the process N ⇌ U was the same for both proteins. The unfolding rate of d-HRP in GdmCl, however, was higher than that of HRP, which carries eight large glycans. The greater unfolding rate for d-HRP corresponds to a decrease in transition state free energy of 2-3 kJ/mol at 5.2-5.8 M GdmCl. In agreement with these results, Schülke and Schmid [7] reported that glycosylated and non-glycosylated yeast invertase had the same heat stability, and furthermore, that the unfolding and refolding rates in GdmCl were both higher for the non-glycosylated form.

The solubility of HRP and d-HRP at pH 7.0, 23 °C, in ammonium sulfate (AMS) was determined after 30 min by measuring the peroxidase concentration at 402 nm after centrifugation. The salting-out concentrations for 1 mg/ml peroxidase were 2.30 M AMS for HRP and 1.81 M AMS for d-HRP. d-HRP showed a 140-fold decrease in solubility at 2.4 M AMS as compared to HRP. The difference in standard molar free energy of solubilization between HRP and d-HRP at 2.4 M AMS was $\Delta\Delta G^\circ = -RT\ln[\mathrm{HRP}] + RT\ln[\text{d-HRP}] = -12.3$ kJ/mol, corresponding to approximately -1.5 kJ/mol glycan, or -0.3 kJ/mol monosaccharide residue [3].
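The kinetic and thermodynamic estimates in this section can be retraced numerically. The following is a minimal sketch and not the authors' analysis: the function names and the synthetic absorbance trace are illustrative assumptions, and the temperature used for the transition state estimate (23 °C) is also an assumption, since the text gives a temperature only for the solubility experiment. The extinction coefficients, the 2-fold and 3-fold rate ratios and the 140-fold solubility ratio are taken from the text. The fraction of native protein is assumed to follow from A402 by linear interpolation between the native and unfolded values.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def native_fraction(a402, a_native, a_unfolded):
    """Fraction of native peroxidase from A402, assuming a two-state mixture."""
    return (a402 - a_unfolded) / (a_native - a_unfolded)


def first_order_rate(times, fractions):
    """Negative least-squares slope of ln(fraction native) versus time."""
    ys = [math.log(f) for f in fractions]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx


def ddg_kj(ratio, temp_k):
    """Free energy difference R*T*ln(ratio), in kJ/mol."""
    return R * temp_k * math.log(ratio) / 1000.0


# Synthetic trace (assumption): 1 uM peroxidase, 1 cm path, k = 0.50e-3 s^-1.
eps_native, eps_unfolded = 100.0, 40.0  # mM^-1 cm^-1, from the text
conc_mm = 0.001
a_nat, a_unf = eps_native * conc_mm, eps_unfolded * conc_mm
times = [0, 600, 1200, 1800, 2400]  # seconds
trace = [a_unf + (a_nat - a_unf) * math.exp(-0.50e-3 * t) for t in times]

fractions = [native_fraction(a, a_nat, a_unf) for a in trace]
k_u = first_order_rate(times, fractions)

T = 296.15  # 23 C; assumed for the GdmCl experiments, stated for solubility
ddg_ts_3fold = ddg_kj(3.0, T)   # 3-fold faster unfolding at 5.2 M GdmCl
ddg_ts_2fold = ddg_kj(2.0, T)   # 2-fold faster unfolding at 5.8 M GdmCl
ddg_sol = -ddg_kj(140.0, T)     # 140-fold lower solubility of d-HRP
ddg_sol_per_glycan = ddg_sol / 8

print(f"k_u = {k_u:.2e} s^-1")
print(f"transition state: {ddg_ts_2fold:.1f} to {ddg_ts_3fold:.1f} kJ/mol")
print(f"solubility: {ddg_sol:.1f} kJ/mol, {ddg_sol_per_glycan:.1f} per glycan")
```

Under these assumptions the rate ratios correspond to about 1.7-2.7 kJ/mol, in line with the 2-3 kJ/mol quoted above, and a 140-fold solubility ratio gives about -12.2 kJ/mol, close to the reported -12.3 kJ/mol.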

3. ENGINEERED GLYCO-MUTANTS OF COPRINUS PEROXIDASE

Systematic studies on the physico-chemical roles of glycans were performed using a series of recombinant glyco-forms of the ink cap Coprinus cinereus peroxidase (CIP). CIP is secreted to the fermentation broth from the fungal mycelia [8]. Wild-type CIP and CIP mutants were expressed in transformed Aspergillus oryzae (collaboration with Drs. J. Vind and H. Dalbøge, Novo Nordisk A/S). CIP is homologous to HRP C and the two are very similar in enzymatic properties, although less than 20 % identical in their amino acid sequences [9,10]. CIP contains 1 N-linked glycan of high-mannose type attached at residue N142, and 2 O-linked glycans of one or more mannose residues attached at T331 and S338, near the C-terminus of the peroxidase [11,12]. The N-linked glycan was removed in the mutants N142P, N142T, N142S and N142D. In addition, all three naturally occurring glycans were removed, giving the 0N CIP mutant (zero N-linked glycans) as described below. Furthermore, 1N, 2N, 4N and 6N CIP mutants were constructed (J.W. Tams, J. Vind and K.G. Welinder, in preparation).

At the time of design of these mutants no crystal structures were available for fungal peroxidases. The crystal structure of yeast mitochondrial cytochrome c peroxidase [13] and a large number of amino acid sequences of fungal, plant and bacterial peroxidases provided the basis for a predicted helix-rich prototypic peroxidase structure [14,9]. Using this model structure we strove to insert new glycans at reverse turns on the surface, away from the presumed substrate access channel. The glyco-mutants were the following: 0N, N142S-T331A-S338A (triple mutant); 1N, T331A-S338A; 2N, T331N-S338A (one O-linked glycan site was changed to an N-linked glycan site); 4N, T331N-S338A-S8N-S263N-N265S; 6N, T331N-S338A-S8N-S263N-N265S-Q38N-A304S.

The peroxidases were precipitated from the filtered fermentation broth with AMS and purified by ion exchange and concanavalin A-Sepharose chromatography. The identity of the peroxidase mutants was checked by DNA sequencing, SDS-PAGE, amino acid analysis, glucosamine analysis and laser desorption mass spectrometry. Despite concanavalin A-Sepharose affinity chromatography, the glyco-mutants containing the greatest number of glycans showed the widest distribution in Mr due to glycan heterogeneity, which was previously demonstrated for wild-type CIP [11,12]. The enzymatic activity of wild-type CIP and of these 5 CIP mutants was the same at pH 6.3 using ABTS (2,2'-azino-bis-(3-ethylbenzthiazoline-6-sulfonic acid)) as substrate at 10 and 100 µM concentrations, indicating that none of the 3 naturally occurring glycans, nor the engineered glycans or the residues that were substituted, played a role in substrate binding or catalysis.

The heat denaturation (irreversible unfolding in the presence of 4 mM EDTA) at 64 °C, pH 7.4, showed a time constant (reciprocal of the rate constant under irreversible conditions) of 264 sec for wild-type CIP and time constants near 125 sec for the mutants N142P, N142T and N142D, which lack only the N-linked glycan. This decrease in kinetic heat stability was similar to that observed for surface mutants in which a polar residue was changed to phenylalanine. Polar surface mutants showed only small increases or decreases relative to the wild-type in their time constants of irreversible unfolding. Heat-induced reversible unfolding performed in the presence of 8 mM CaCl2 gave the same general result for these mutants, except for higher time constants of unfolding due to the contribution from refolding, i.e. 800 sec for wild-type CIP and 650 sec for the N142 mutants, again showing higher kinetic stability as a result of the N-linked glycan.

Heat stability experiments with the 0N, 1N, 2N, 4N and 6N CIP mutants were carried out at a lower temperature, 55 °C, as it appeared that removal of the two O-linked glycans caused a pronounced decrease in stability. Now that the crystal structure of CIP is known [15,16], it appears that the mannose residue at T331 is part of the protein structure. The 6N mutant was unexpectedly less stable than the 4N mutant in experiments measuring both heat and urea kinetic stability. The crystal structure of CIP indicates that A304 is not fully exposed to the solvent, even though the N303-A304 peptide bond is susceptible to proteolytic nicking of CIP in the fermentation broth [11,16]. The physico-chemical properties of the 0N, 1N, 2N and 4N CIP mutants, however, changed gradually. Hence, the time constants for the irreversible heat unfolding were 107, 117, 121 and 330 sec, respectively, and 133 sec for the 6N mutant, at 55 °C in the presence of 4 mM EDTA. These values are not corrected for contributions from the amino acid residues that were concomitantly changed in these mutants, as they were expected to be minor and possibly to cancel out, according to the results on the surface mutants mentioned above.

Irreversible unfolding in 4 M urea, 4 mM EDTA, of the 0N, 1N, 2N, 4N and 6N CIP mutants gave time constants of 48, 55, 64, 105 and 67 sec, respectively. By diluting these solutions to 3 M urea and adding CaCl2 to 20 mM, the time constants for reversible folding were obtained. The time courses of both reactions were clearly biphasic for the 2N, 4N and 6N glyco-mutants, showing small initial and greater late time constants, but monophasic for the 0N and 1N mutants. As the glycans were not homogeneous after lectin affinity chromatography, the smaller initial time constants most likely belong to the fraction of a mutant having the shortest glycans of the heterogeneous glyco-mutants, whereas the fraction carrying a higher carbohydrate load most likely gives rise to the late reactions. Hence, in all cases it appears that unfolding and refolding induced by heat or urea are delayed in relation to the extent of glycosylation (J.W. Tams, J. Vind and K.G. Welinder, in preparation).
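The trend in kinetic stability across the glyco-mutants can be tabulated directly from the time constants quoted above. This is a small illustrative sketch: the dictionary and function names are assumptions introduced here, the labels 0N to 6N denote the zero-glycan to six-glycan mutants, and the only inputs are the time constants given in the text. As in the text, a time constant is taken as the reciprocal of the first-order rate constant under irreversible conditions.

```python
# Time constants (sec) for irreversible unfolding, as quoted in the text.
heat_55c = {"0N": 107, "1N": 117, "2N": 121, "4N": 330, "6N": 133}  # 55 C, 4 mM EDTA
urea_4m = {"0N": 48, "1N": 55, "2N": 64, "4N": 105, "6N": 67}       # 4 M urea, 4 mM EDTA


def rate_constants(tau):
    """First-order rate constants (sec^-1) as reciprocals of the time constants."""
    return {mutant: 1.0 / t for mutant, t in tau.items()}


def relative_to(tau, reference="0N"):
    """Time constants expressed relative to a reference mutant."""
    return {mutant: t / tau[reference] for mutant, t in tau.items()}


heat_rel = relative_to(heat_55c)
urea_rel = relative_to(urea_4m)
heat_k = rate_constants(heat_55c)

for mutant in heat_55c:
    print(f"{mutant}: heat x{heat_rel[mutant]:.2f}, urea x{urea_rel[mutant]:.2f}, "
          f"k(heat) = {heat_k[mutant]:.2e} sec^-1")
```

In both data sets the 4N mutant unfolds most slowly, about 3-fold more slowly than 0N under heat and about 2-fold more slowly in urea, whereas 6N falls back to roughly 1.2- to 1.4-fold.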

Figure 1. The solubility S of wild-type CIP (x), containing one N-linked glycan, and N142T CIP (*), with no N-linked glycan, in ammonium sulfate solution. Horizontal axis: AMS (M).

Figure 2. The solubility S in acetone-water mixture of N142T CIP (*) and wild-type CIP (x), and of wild-type CIP in the presence of 40 mM NaCl (△) and 20 mM CaCl2 (□). Horizontal axis: Acetone (%).

The solubility in AMS solution decreased significantly in the N142P, N142T and N142D mutants as compared to wild-type CIP (Figure 1), in agreement with the findings for HRP and d-HRP. In water-acetone mixtures the N-linked glycan of wild-type CIP clearly reduced the solubility as compared to the N142T mutant. Acetone precipitation was assisted by 40 mM NaCl, and even more by 20 mM CaCl2 (Figure 2).

4. CONCLUSIONS

Our experiments with heme-containing peroxidases with modified contents of carbohydrate stress three points of importance to biotechnology. First, the enzymatic activity was not changed by naturally occurring or engineered glycans as long as they did not interfere with substrate binding, catalysis or proper folding of the protein. Second, unfolding or denaturation of a protein can be delayed markedly by the introduction of glycans, whereas the thermodynamic equilibrium between the native and unfolded forms of a protein, reached under reversible conditions, seemed nearly independent of the extent of glycosylation. Third, with increasing content of carbohydrate the solubility of glycoproteins greatly increases in salt solution and decreases in aqueous acetone solution.

5. ACKNOWLEDGEMENTS

We are grateful to Ms. Y.B. Larsen for expert technical assistance and A.L. Jensen for amino acid analyses. The work was supported by grant no. 11-7266 from the Danish Natural Science Research Council and grant no. 1990-133/1-900090 from the Danish Technology Council.

6. REFERENCES

1. K.G. Welinder, Eur. J. Biochem., 96 (1979) 483.
2. J.E. Harthill, Ph.D. thesis, Faculty of Biological Science, University of Oxford, 1991.
3. J.W. Tams and K.G. Welinder, Anal. Biochem., 228 (1995) 48.
4. L. Reimann and G.R. Schonbaum, Meth. Enzymol., 52 (1978) 514.
5. E. Silva, A.M. Edwards and A. Faljoni-Alario, Arch. Biochem. Biophys., 276 (1990) 527.
6. A.T. Smith, N. Santama, S. Dacey, M. Edwards, R.C. Bray, R.N.F. Thorneley and J.F. Burke, J. Biol. Chem., 265 (1990) 13335.
7. N. Schülke and F.X. Schmid, J. Biol. Chem., 263 (1988) 8832.
8. H. Dalbøge, E.B. Jensen and K.G. Welinder, Patent application (1992) WO 92/16634.
9. L. Baunsgaard, H. Dalbøge, G. Houen, E.M. Rasmussen and K.G. Welinder, Eur. J. Biochem., 213 (1993) 605.
10. G. Smulevich, A. Feis, C. Focardi, J.W. Tams and K.G. Welinder, Biochemistry, 33 (1994) 15425.
11. M. Kjalke, M.B. Andersen, P. Schneider, B. Christensen, M. Schülein and K.G. Welinder, Biochim. Biophys. Acta, 1120 (1992) 248.
12. P. Limongi, M. Kjalke, J. Vind, J.W. Tams, T. Johansson and K.G. Welinder, Eur. J. Biochem., 227 (1995) 270.
13. B.C. Finzel, T.L. Poulos and J. Kraut, J. Biol. Chem., 259 (1984) 13027.
14. K.G. Welinder, Curr. Opin. Struct. Biol., 2 (1992) 388.
15. J.F.W. Petersen, A. Kadziola and S. Larsen, FEBS Lett., 339 (1994) 291.
16. N. Kunishima, K. Fukuyama, H. Matsubara, H. Hatanaka, Y. Shibano and T. Amachi, J. Mol. Biol., 235 (1994) 331.


Modes of action of two Trichoderma reesei cellobiohydrolases

Tuula T. Teeriᵃ, Anu Koivulaᵃ, Markus Linderᵃ, Tapani Reinikainenᵃ, Laura Ruohonenᵃ, Malee Srisodsukᵃ, Marc Claeyssensᵇ and T. Alwyn Jonesᶜ

ᵃ VTT Biotechnology and Food Research, PO Box 1500, FIN-02044 VTT, Finland
ᵇ Department of Biochemistry, University of Ghent, Ledeganckstraat 35, B-9000 Ghent, Belgium
ᶜ Department of Molecular Biology, Uppsala University, Biomedical Center, PO Box 590, S-75124 Uppsala, Sweden

Abstract

Trichoderma reesei degrades native cellulose utilizing a set of cellulolytic enzymes dominated by two cellobiohydrolases, CBHI and CBHII. These enzymes exhibit the typical two-domain architecture of fungal cellulases, and both act primarily as exoglucanases liberating cellobiose from the ends of the polymeric cellulose chains. The three-dimensional structures of the catalytic domains of CBHI and CBHII revealed that their active sites are situated in tunnels formed by long loops on the enzyme surface. The active sites of homologous endoglucanases lack these loops and are more open, permitting catalytic activity at internal positions of cellulose chains. Site-directed mutagenesis and structural studies have identified the key catalytic residues of both CBHI and CBHII. Similarly, the primary interaction surface of the cellulose-binding domain has been defined and the residues responsible for its tight binding to cellulose identified.

1. INTRODUCTION

Native cellulose is a crystalline structure in which parallel polymeric glucose chains pack together by regular H-bonding networks. The role of cellulose in plant cell walls is predominantly structural. Its complete hydrolysis in nature is slow and requires the combined activities of many different enzymes. The filamentous fungus Trichoderma reesei produces a potent set of cellulolytic enzymes which are capable of efficient hydrolysis of highly crystalline celluloses. The key enzymes in this process are its two cellobiohydrolases, CBHI and CBHII. Both enzymes are exoglucanases releasing cellobiose from the ends of the cellulose chains, and they act synergistically with each other and with a number of endoglucanases. Comparison of the primary structures of T. reesei cellulases, together with various biophysical and biochemical studies, revealed a common structural organization: a large catalytic domain is joined by a relatively long O-glycosylated linker peptide to a smaller cellulose-binding domain (CBD) [1-3] (Figure 1). This two-domain structure, also found in many other fungal and bacterial cellulases, is essential to full activity on insoluble, crystalline substrates. The catalytic domain, without the CBD, has much reduced activity on crystalline cellulose. We have undertaken genetic engineering and structural studies of the T. reesei cellobiohydrolases in order to understand their detailed modes of action in the hydrolysis of highly crystalline cellulose.

Figure 1. Domain structure of CBHI. The structure of the catalytic domain was solved by X-ray crystallography [4], and that of the cellulose-binding domain by NMR [5]. There is no experimental structural information on the interdomain linker peptide, which is drawn in the figure simply to illustrate the separation of the two functional domains.

2. CATALYTIC DOMAIN STRUCTURES OF T. reesei CELLOBIOHYDROLASES

The three-dimensional structure of the catalytic domain of T. reesei CBHII was the first cellulase structure solved [6], and many others, including that of T. reesei CBHI [4], have since been determined by X-ray crystallography (see Davies and Schülein, this issue). Despite their apparently similar modes of action, the catalytic domains of CBHI and CBHII do not share a common fold. The central motif of CBHI is a β-sandwich whereas CBHII is an α/β barrel protein. However, the active site architecture of the two cellobiohydrolases is strikingly similar. In both enzymes relatively long surface loops, which are well ordered in the crystal structures, enclose the active site in a tunnel spanning the catalytic domain. The tunnel-shaped active site apparently provides an unusually tight fit for the polymeric substrate. Cellulase catalytic domains have been grouped into families sharing a similar fold and the same stereochemistry of catalysis [7]. CBHI belongs to a family of retaining hydrolases, and CBHII to a family of inverting enzymes. Both families contain exo- and endoglucanases.

Sequence and structural comparisons have revealed that a major difference between the homologous endo- and exoglucanases is the absence of the long active site loops in the endoglucanases [4, 6, 8, Davies and Schülein, this issue]. This offers a structural explanation for the different modes of action of the two types of cellulases. The endoglucanases, with relatively open active sites, can bind in the middle of the polymeric cellulose chains and hydrolyse internal glycosidic bonds along the cellulose crystal surfaces. Consequently, they also exhibit high activities on substituted substrates such as CMC (carboxymethyl cellulose) or HEC (hydroxyethyl cellulose). The cellobiohydrolases, on the other hand, have closed tunnel-shaped active sites which restrict their action primarily to the chain ends. We and others have obtained evidence suggesting that CBHI attacks cellulose chains from the reducing end while CBHII prefers the opposite, non-reducing chain ends [4, 9].

The active site tunnel of CBHII is about 20 Å long and contains at least four binding sites (denoted A-D) for the glucosyl units. In sites A, C and D sugar binding is characterized by hydrophobic interactions of the glucosyl rings with W135 (A), W269 (C) and W367 (D). Site B differs from the other three binding sites since it has no tryptophan residue for the stacking interaction with a sugar ring. In addition, the structure shows that it is the loosest binding site and can in principle accommodate the sugar ring in different conformations. The active site also contains many ionizable groups which interact with each other and form hydrogen bonds to the substrate. We have carried out a series of investigations using site-directed mutagenesis and structural studies of CBHII in order to understand the details of its catalytic mechanism (see below).

The crystal structure of the CBHI catalytic domain reveals an even longer active site tunnel of 40 Å. There are seven putative sites (A-G) for substrate binding, four of which contain a tryptophan residue (W40, W38, W367, W376). The bond cleavage occurs at the reducing end of the cellulose chain between subsites B and C. The catalytic amino acids required for the retaining reaction are readily apparent in the active site of CBHI (Divne). The side-chain of E217 in CBHI is favourably positioned for donating a proton to the leaving group, thus acting as the general acid/general base in a double-displacement reaction (see below). E212, on the other hand, is the most probable nucleophile in the reaction. Site-directed mutagenesis is now being carried out to verify the role of these and other residues in the active site of CBHI.

3. THE ACTIVE SITE OF CBHII

Enzymatic breakdown of a glycosidic linkage proceeds by general acid catalysis in a stereoselective manner. The stereochemical course of the hydrolysis is different in the two cellobiohydrolases: CBHI retains the configuration of the anomeric carbon while CBHII inverts it [10, 11]. Inverting glycosidases are expected to operate via a single-displacement mechanism, whereas retaining glycosidases use a double-displacement mechanism involving a glycosyl-enzyme intermediate. Both reaction mechanisms are thought to include oxocarbonium ion-like transition states corresponding to half-chair conformations of the ring structure [12].

Kinetic and structural studies of CBHII on small soluble oligosaccharides have shown that the bond cleavage takes place between binding sites B and C [6, 13]. An extended cellulose chain adopts a zig-zag pattern of the glycosidic bonds, with alternate linkages pointing "up" and "down", and hydrolysis can occur only when the linkage between subsites B and C is in one of these two orientations. Once the cellulose chain has entered the active site, the restricted volume of the tunnel, H-bonding networks and van der Waals forces prevent extensive conformational changes of the bound glucan chain. On the other hand, experience has shown that subsite A must always be occupied for hydrolysis to occur. Therefore, depending on the initial chain orientation in the active site, either α-cellobiose or α-cellotriose is cleaved off from the non-reducing end of the chain (Figure 2). Cellotriose can then be further degraded to cellobiose and glucose [6, 14, 15].

Figure 2. Schematic representation of cellotetraose degradation by CBHII. See text for details.

A single-displacement inverting reaction mechanism assumes a catalytic acid donating a proton to the glycosidic oxygen, and a base assisting a nucleophilic attack at C1 by a solvent water molecule. Site-directed mutagenesis studies of CBHII have been carried out to identify the amino acids needed for catalysis and substrate binding. The structure revealed two aspartic acids (D175 and D221) close to the scissile bond of the polymeric glucan chain [6]. In its environment D221 was likely to be protonated and was proposed to act as the proton donor, while D175 was more likely to be charged. The mutation D221A abolishes practically all of the catalytic activity of CBHII and does not alter the affinity (Kass values) for small soluble oligosaccharides (Tables 1 and 2). The mutants D175A (Table 1) and D175N [16] are also catalytically inactive, with a slightly altered binding behaviour of D175A (Tables 1 and 2). Thus it seems that D221 indeed acts as the proton donor. D175 may have a role either in ensuring the protonation of D221 or in stabilizing the hypothetical carbonium ion intermediate, or both.

The general base for the reaction has been more difficult to assign. Because of the inverting mechanism, the nucleophilic water should approach the anomeric carbon from the opposite side of the glycosidic bond relative to the proton donor, D221. In the CBHII structure the residue D401 is correctly orientated with respect to the glycosidic linkage. However, its distance from the general acid, D221, is about 10 Å, which seems long even for an inverting reaction mechanism. D401 is also salt-linked to two nearby residues, R353 and K395. Furthermore, results from kinetic studies on cellobiosyl fluorides [17] have cast doubt on a typical single-displacement mechanism for CBHII. It is possible, although speculative, that the only catalytic amino acids needed for the reaction are D221 and D175. In this case the water molecule needed for the reaction could enter the active site through a narrow water-filled tube reaching to the external solvent, as observed in the CBHII structure.

Table 1
Kinetic parameters for the native CBHII and mutants. Hydrolysis experiments were performed in 10 mM sodium acetate buffer, pH 5.0, at 27 °C. Samples were taken at different time points and analyzed with HPLC as described earlier [14]. Kinetic constants were calculated by nonlinear regression analysis (Enzfit).

Protein         kcat (min⁻¹)    Km (µM)    kcat/Km (min⁻¹ µM⁻¹)
Native CBHII    223             1.8        120
D221A           <0.2            nd*
D175A           <0.2            8.1        0.02
Y169F           57              1.2        48
W135F           75              6          12
W135L           1               nd*        -

*nd = not determined
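As a quick consistency check on Table 1, the specificity constant can be recomputed from the tabulated kcat and Km. The sketch below is illustrative only: the names `TABLE1` and `specificity_constant` are our own, and only those entries are included for which the table lists both kcat and Km as definite numbers.

```python
# Recompute kcat/Km (min^-1 uM^-1) from the kcat and Km values in Table 1.
# TABLE1 and specificity_constant are illustrative names, not from the source.

TABLE1 = {
    # protein: (kcat in min^-1, Km in uM), values as listed in Table 1
    "Native CBHII": (223.0, 1.8),
    "Y169F": (57.0, 1.2),
    "W135F": (75.0, 6.0),
}


def specificity_constant(kcat_per_min, km_um):
    """Return kcat/Km in min^-1 uM^-1."""
    return kcat_per_min / km_um


for protein, (kcat, km) in TABLE1.items():
    value = specificity_constant(kcat, km)
    print(f"{protein:13s} kcat/Km = {value:6.1f} min^-1 uM^-1")
```

The recomputed ratios agree with the rounded kcat/Km column of the table to within rounding.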

Table 2
Association constants for the binding of MeUmb-glycosides to native CBHII and mutants. The binding studies were performed in 50 mM NaAc buffer, pH 5.0, at 16 °C unless otherwise stated.

                Kass (M⁻¹)
Protein         MeUmb(Glc)2            MeUmbGlcXyl
Native CBHII    3 × 10⁵                490 × 10⁵ (25 °C)
D221A           2 × 10⁵                nd*
D175A           9 × 10⁵                nd*
Y169F           200 × 10⁵              nd*
W135F           0.5 × 10⁵ (8 °C)       nd*
W135L           0.2 × 10⁵ (8 °C)       nd*

*nd = not determined
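To put the association constants of Table 2 on an energy scale, they can be converted with the standard relation ΔG° = -RT ln Kass. The source does not report free energies, so the sketch below is only an illustration; the function name is our own and the two Kass values are those listed for MeUmb(Glc)2 at 16 °C.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def binding_free_energy(k_ass, temp_c):
    """Standard free energy of binding (kJ/mol) from an association constant (M^-1)."""
    temp_k = temp_c + 273.15
    return -R * temp_k * math.log(k_ass) / 1000.0


# Kass for MeUmb(Glc)2 at 16 degrees C, as listed in Table 2
native = binding_free_energy(3e5, 16.0)
y169f = binding_free_energy(200e5, 16.0)

print(f"Native CBHII: {native:.1f} kJ/mol")
print(f"Y169F:        {y169f:.1f} kJ/mol")
print(f"Difference:   {y169f - native:.1f} kJ/mol")
```

On this scale the tighter binding of Y169F corresponds to roughly 10 kJ/mol of additional binding free energy.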

Structural studies with ligands diffused into the active site of CBHII have so far failed to reveal the sugar conformation at subsite B, which precedes the site of cleavage. The complex structure with glucose and cellobiose shows glucose at site A and cellobiose at sites C-D, all in a regular chair conformation. The 2.5 Å structure with methylumbelliferylcellobioside (MeUmb(Glc)2), occupying sites A-D, shows good density for the sugar at site A, but poor density at site B. Increased binding with cellobionolactonoxime suggests that a half-chair conformation of the sugar ring at site B may be involved in the reaction mechanism of CBHII [13]. One of the amino acid residues possibly involved in substrate binding at site B is a conserved tyrosine, Y169. Interestingly, the Y169F mutant shows increased binding but a lowered catalytic rate on small soluble cello-oligosaccharides (Tables 1 and 2). The most pronounced increase in binding was achieved with methylumbelliferylcellobioside (MeUmb(Glc)2); a very similar increase in affinity (Table 2) has been noticed for the binding of MeUmbGlcXyl to wild-type CBHII [13]. Thus, making more space in binding site B seems to increase binding affinity but decrease the catalytic rate. One possible explanation for these results is that the sugar ring is bound in a strained conformation in site B and that Y169 forms a hydrogen bond to this distorted glucosyl unit. The possibility of a strained sugar conformation at the subsite preceding the scissile bond gains support from structural studies of two other inverting enzymes, a glucoamylase [18] and a catalytically inactive endoglucanase, EGV (Davies and Schülein, this issue).

CBHII complex structures have shown that site A is the tightest binding site and highly specific for a glucose ring. This subsite must be occupied for hydrolysis to occur, as demonstrated by NMR studies failing to detect glucose among the hydrolysis products of CBHII [14, 15]. Binding studies have further revealed that site A is very specific for an intact D-glucopyranose configuration. Mutations of W135, which is located at site A, show decreased binding and a subsequent reduction in the catalytic efficiency of CBHII (Tables 1 and 2). This result confirms that tight binding of the substrate in subsite A is crucial for efficient hydrolysis and is in part dictated by W135.

4. THE CELLULOSE-BINDING DOMAIN OF CBHI

Cellulose-binding domains (CBDs) of cellulases have been shown to promote the degradation of insoluble substrates, while their removal has no effect on the enzymes' capacity to hydrolyse small soluble sugars [3]. We have undertaken genetic and structural studies to examine the properties of the fungal CBDs and their role in cellulose degradation. The cellulose-binding domain of CBHI belongs to CBD family 1 together with other fungal cellulose-binding domains [19]. The size of these highly homologous CBDs is approximately 36 amino acids, which is only one third of the size of bacterial CBDs. The 3D structure of a synthetic peptide corresponding to the CBD of CBHI has been determined [5]. The CBD of CBHI folds into a wedge-shaped structure with overall dimensions of 30 x 18 x 10 Å. The domain is very small and therefore does not have a characteristic hydrophobic interior. Instead the structure is stabilized by two or, in some other enzymes, three disulfide bridges. In CBHI, one face of the CBD is more hydrophilic than the other and contains three tyrosine residues (Y5, Y31 and Y32, corresponding to residues Y466, Y492 and Y493 in the native CBHI protein) (Figure 3). In addition to the tyrosines, this face accommodates two other invariant residues, N29 (N490) and Q34 (Q495). The interaction between an aromatic ring and a glucose residue arises from ring current polarization attraction involving delocalized π-electrons and the pyranose ring. The spacing and alignment of the tyrosines Y466, Y492 and Y493 and the residues N490 and Q495 are in fact perfect for achieving multiple interactions with the glucose residues on the cellulose crystal.

The cellulose-binding domains of T. reesei share over 70 % amino acid sequence similarity, permitting structural model building based on the NMR structure of the CBHI CBD [20]. The completed models and the experimental starting structure were refined by molecular dynamics simulations in water. As expected, all four models were very similar to the structure of the CBHI CBD; differences were found mainly in their hydrophobicities and electrostatic properties. In spite of the sequence and structural similarity of the CBDs, we have found differences in their affinities towards cellulose. A replacement of the CBHI CBD by that of EGI clearly increased the affinity of the enzyme for crystalline cellulose [21]. The same difference was noticed by comparing the affinities of synthetic peptides corresponding to the CBDs of EGI and CBHI (our unpublished results). Part of the difference in binding could be attributed to differences in amino acids on the binding face.

Figure 3. The CBDs of T. reesei cellulases fold into a wedge-shaped structure. One face of the wedge is flat and more hydrophilic in character. It is formed by three tyrosines, an asparagine and a glutamine, and represents the primary interaction surface with cellulose. The other face is less hydrophilic and rougher, and does not directly interact with cellulose.

The function of the CBHI CBD has also been studied by site-directed mutagenesis of amino acid residues at its hydrophilic surface [22, 23]. The mutant enzymes were produced in T. reesei and tested for binding and activity on crystalline cellulose. Mutations introduced at the tyrosine residue Y492 (Y31), located at the tip of the wedge, decreased both the binding and activity of CBHI on native cellulose (Table 3, Figure 4). Furthermore, it was found that the affinity and activity of the mutant Y492A were decreased more than those of Y492H, which shows that a histidine side chain can partially substitute for a tyrosine residue. The data for the mutant P477R (P16R) support the view that the rough, more hydrophobic CBD surface is not in direct contact with cellulose and that the productive cellulose-CBD interaction is transmitted through the flat and more hydrophilic surface.

Table 3
Activities of the CBHI proteins carrying mutated CBDs on bacterial cellulose at pH 5.0. Activities are calculated from the velocity of reducing sugar formation with 2.1 µmol CBHI/g bacterial cellulose. The incubation temperature was 50 °C.

Protein        mkat/g1-1/mol
Native CBHI    79 ± 7
CBHI core      8.2 ± 0.9
P477R          48 ± 2
Y492A          18 ± 2
Y492H          31 ± 0.5

[Figure 4 plot: binding isotherms of the WT peptide and the mutants P16R (++), N29A (++), Q34A (+), Y31A (+), Y32A (++) and Y5A (+++); horizontal axis: free peptide (µM).]

Figure 4. Binding isotherms of the synthetic peptides containing mutations at some of the proposed sites of interaction with cellulose. The plusses correspond roughly to the degree of structural perturbations observed in the mutant peptide structure by NMR.

The effects of amino acid substitutions on the proposed binding face have also been studied using synthetic CBD peptides [24]. A set of six peptides with the mutations Y5A, Y31A, Y32A, P16R, N29A and Q34A was investigated (see Figure 4). The binding isotherms were determined for all these peptides, and two-dimensional NMR was used to assess structural effects of the mutations. It was found that Y5A and Y32A had lost all affinity towards cellulose, but had also suffered structural perturbations. The structure of the mutant peptide Y31A was maintained better, but it had still lost most of its affinity. The affinities of both N29A and Q34A were decreased approximately as expected from the contributions of the lost hydrogen bonds. Again, the mutation P16R did not significantly alter the binding of the CBD, and we are confident that the flat, hydrophilic face of the CBD is its primary interaction site with cellulose. Therefore, if a CBD has a peeling activity on the cellulose crystals, as sometimes suggested [25], this activity must be mediated by binding interactions in the primary binding face.

[Figure 5 bar chart: binding of native CBHI, CBHI core, Y492A, Y492H and P477R at pH 2.2, pH 5.0 and pH 6.5.]

Figure 5. The effect of pH on binding of the CBHI CBD mutants. [E]/[S] = 10 µmol/g cellulose. McIlvaine (citrate-phosphate) buffer was used in all experiments. The values have been derived from the binding isotherms.

The adsorption of CBHI mutants was found to be moderately pH-dependent [23]. As can be seen in Figure 5, the adsorption of all proteins except the mutant Y492H was somewhat increased at pH 5. At this pH, closest to the isoelectric point (pI = 3.9), the overall charge of the molecules is lowest and electrostatic repulsion is minimal, which may therefore allow denser packing. The adsorption of the mutant Y492H was restored almost to the level of native CBHI at pH 6.5 but was drastically decreased at pH 2.2. This difference presumably reflects the ionization of the mutated CBD. Assuming the pKa of a histidine to be 6.2, the tip histidine H492 and also H465 are totally protonated at pH 2.2, whereas approximately 50% of these residues are neutral at pH 6.5. Thus the higher net charge at pH 2.2 results in greater electrostatic repulsion between the CBDs than at pH 6.5.

Salt ions at 1 M concentration had a marked effect on the binding of the proteins (Figure 6). The increase in adsorption is conceivably caused by the masking effect of salt ions on the ionic interactions between the protein molecules. This reduces the electrostatic repulsion between the molecules and allows denser packing, an effect clearly seen as better binding at the highest protein concentrations. Since MgSO4 had a greater effect on the binding than NaCl, it is possible that a hydrophobic effect is involved in the cellulase-cellulose interaction [26]. However, the hydrophobic effect is a complex thermodynamic function including the cellulose-cellulase, water-water, water-cellulose, water-protein and protein-protein interactions, and is therefore impossible to explain in detailed structural terms. In 1 M MgSO4 the activities of the mutants Y492A and Y492H and of the core protein were increased almost to the level of the native CBHI, whereas the activities of the native CBHI and the mutant P477R were not affected (Table 4). The higher cellobiose concentration in the reaction mixtures of these proteins may have partially inhibited the enzymatic activity because no β-glucosidase was added. However, it is clear that weaker binding through the mutated CBDs is compensated by high ionic strength, leading to improved activity. The enzymes seem to be capable of solubilizing cellulose even with a mutated CBD or without the CBD once bound on the cellulose surface.
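The ionization argument for the histidine mutant can be illustrated with the Henderson-Hasselbalch relation. The sketch below is not part of the original analysis: the function name is our own, and the only inputs are the pKa of 6.2 assumed in the text and the three pH values used in the binding experiments.

```python
def fraction_protonated(ph, pka):
    """Fraction of a titratable group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


HIS_PKA = 6.2  # histidine pKa assumed in the text

for ph in (2.2, 5.0, 6.5):
    charged = fraction_protonated(ph, HIS_PKA)
    print(f"pH {ph}: {charged:.1%} protonated (charged), {1.0 - charged:.1%} neutral")
```

With this assumed pKa the histidines are essentially fully protonated at pH 2.2, while at pH 6.5 the relation gives about two-thirds of them in the neutral form, which is of the same order as the approximate figure quoted above.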

[Figure 6 bar chart: binding of native CBHI, CBHI core, Y492A, Y492H and P477R in water, NaCl and Mg-sulphate.]

Figure 6. The effect of ionic strength on binding of the CBHI CBD mutants. [E]/[S] = 10 µmol/g cellulose. The ionic strength was varied by adding 1 M MgSO4 or 1 M NaCl to 50 mM McIlvaine buffer, pH 5.0. The values have been derived from the binding isotherms.

4. THE INTERDOMAIN LINKER PEPTIDE OF CBHI

The two cellulase domains are usually connected by relatively long, glycosylated linker peptides. Although the amino acid sequences and lengths of the linker sequences from different cellulases vary, they typically share an amino acid composition rich in proline and hydroxy amino acids. Some linkers contain repeated sequences and some contain long runs of hydroxy amino acids [27]. The widespread occurrence of these linker sequences in carbohydrate-degrading enzymes emphasizes their importance for enzyme function. It is not clear, however, whether the role of the linker peptides is simply to provide physical separation of two independently functioning domains or whether the linkers mediate more specific interactions between the two domains.

Table 4
The effect of 1 M MgSO4 on the activity on crystalline cellulose. Cellobiose was quantified by HPLC after 1 h incubation with 2.1 µmol CBHI/g bacterial cellulose at pH 5.0. The incubation temperature was 50 °C.

               Cellobiose (mM)
Protein        No MgSO4    1 M MgSO4
Native CBHI    0.73        0.61
CBHI core      0.14        0.30
P477R          0.65        0.62
Y492A          0.31        0.60
Y492H          0.40        0.60
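The salt effect in Table 4 is easier to compare across proteins as a ratio of cellobiose released with and without 1 M MgSO4. The following sketch is only an illustration of that arithmetic; the names `TABLE4` and `fold_change` are our own and the numbers are the tabulated cellobiose concentrations.

```python
# Cellobiose (mM) released without and with 1 M MgSO4, as listed in Table 4.
# TABLE4 and fold_change are illustrative names, not from the source.
TABLE4 = {
    "Native CBHI": (0.73, 0.61),
    "CBHI core": (0.14, 0.30),
    "P477R": (0.65, 0.62),
    "Y492A": (0.31, 0.60),
    "Y492H": (0.40, 0.60),
}


def fold_change(without_salt, with_salt):
    """Ratio of product formed with salt to product formed without salt."""
    return with_salt / without_salt


for protein, (no_salt, salt) in TABLE4.items():
    print(f"{protein:12s} {fold_change(no_salt, salt):.2f}-fold")
```

Ratios above 1 indicate that high ionic strength improved the activity, and ratios near or below 1 indicate little or no benefit.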

Based on biochemical [3] and structural data [4], the linker peptide of CBHI has been defined as the 31 amino acids between the residues N431 and P461. The linker peptide contains a glycine- and proline-rich sequence, GNP(P/S)G, repeated three times and followed by 10 putative O-glycosylation sites provided by serine and threonine residues. The C-terminal junction of the linker again contains two glycines and a proline (Figure 7). We have studied the role of the CBHI linker by introducing two specific deletions in the sequence [28] (Figure 7). The functionality of the CBD was apparently maintained in both deletion mutants, since their affinities were not significantly reduced (data not shown). The deletion of the entire linker peptide in the mutant ΔG-460 drastically reduced the binding capacity and activity of CBHI, emphasizing that sufficient separation between the domains is required for enzyme function. The mutant ΔG-444 found clearly fewer binding sites than the wt CBHI on the cellulose surface (Figure 8) but surprisingly retained its catalytic activity (Figure 9). This led to apparently better productivity of the mutant. The repeated sequence [GNP(P/S)G], deleted in the mutant, could constitute a flexible hinge linked to a more rigid, extended structure created by the O-glycosylated amino acids [3, 29]. The native CBHI has been shown to adsorb readily all along the crystal surfaces, and its CBD, flexibly linked to the catalytic domain, seems to guide it also to catalytically inaccessible regions. Without the hinge, the mutant enzyme becomes less flexible. Our results suggest that this loss of flexibility reduces the enzyme's binding to some of the sites readily utilized by the wt CBHI. Why this leads to preferred binding to the productive binding sites is currently not understood.

[Figure 7 diagram: catalytic domain, linker and CBD of wt CBHI, with the papain cleavage site and residues 434, 444 and 460 marked on the linker sequence FGPIGSTGNPSGGNPPGGNPPGTTTTRRPATTTGSSPGPTQSHYGQ; the deletions DEL 1 and DEL 2 are indicated beneath the sequence.]

Figure 7. Schematic representation of the domain structure of CBHI. The amino acid sequence of the linker peptide is shown. The deleted sequences in the mutants are indicated. DEL 1 corresponds to the deletion of residues 434 to 444, and DEL 2 to the deletion of residues 434 to 460. The zig-zag line represents the putative hinge regions and the Y-shaped symbols the O-glycosylated region.

Figure 8. Adsorption of CBHI linker deletion mutants on bacterial cellulose at various initial enzyme concentrations. Adsorption experiments were performed at 4 °C for 90 min (data from ref. 28).

[Figure 9 bar chart: activity of WT, core, ΔG-444 and ΔG-460 as a function of enzyme concentration (µM).]

Figure 9. Enzymatic activity of the linker deletion mutants based on reducing sugar formation after 3 h incubation at pH 5.0, 50 °C (data from ref. 28).

5. REFERENCES

1. H. van Tilbeurgh, P. Tomme, M. Claeyssens, R. Bhikhabhai and G. Pettersson, FEBS Lett., 204 (1986) 223.
2. T.T. Teeri, P. Lehtovaara, S. Kauppinen, I. Salovuori and J. Knowles, Gene, 51 (1987) 43.
3. P. Tomme, H. van Tilbeurgh, G. Pettersson, J. Van Damme, J. Vandekerckhove, J. Knowles, T. Teeri and M. Claeyssens, Eur. J. Biochem., 170 (1988) 575.
4. C. Divne, J. Ståhlberg, T. Reinikainen, L. Ruohonen, G. Pettersson, J.K.C. Knowles, T.T. Teeri and A. Jones, Science, 265 (1994) 524.
5. P.J. Kraulis, G.M. Clore, M. Nilges, T.A. Jones, G. Pettersson, J. Knowles and A.M. Gronenborn, Biochemistry, 28 (1989) 7241.
6. J. Rouvinen, T. Bergfors, T. Teeri, J.K.C. Knowles and T.A. Jones, Science, 249 (1990) 380.
7. B. Henrissat and A. Bairoch, Biochem. J., 293 (1993) 781.
8. M. Spezio, D.B. Wilson and P.A. Karplus, Biochemistry, 32 (1993) 9906.
9. M. Vršanská and P. Biely, Carboh. Res., 227 (1992) 19.
10. J.K.C. Knowles, P. Lehtovaara, M. Murray and M.L. Sinnott, J. Chem. Soc. Chem. Commun., 1988, 1401.
11. M. Claeyssens, P. Tomme, C.F. Brewer and E.J. Hehre, FEBS Lett., 263 (1990) 89.
12. J.D. McCarter and S.G. Withers, Curr. Op. Struct. Biol., 4 (1994) 885.
13. H. van Tilbeurgh, F. Loontiens, Y. Engelborgs and M. Claeyssens, Eur. J. Biochem., 184 (1989) 553.
14. L. Ruohonen, A. Koivula, T. Reinikainen, A. Valkeajärvi, A. Teleman, M. Claeyssens, M. Szardenings, T.A. Jones and T.T. Teeri, In: P. Suominen and T. Reinikainen (Eds.) Trichoderma reesei cellulases and other hydrolases. Foundation for Biotechnical and Industrial Fermentation Research, Helsinki. Vol 8 (1993) 87.
15. A. Teleman, A. Koivula, T. Reinikainen, A. Valkeajärvi, T.T. Teeri, T. Drakenberg and O. Teleman, Eur. J. Biochem., 231 (1995) 250.
16. C. Barnett, R. Summer, R. Berka, S. Shoemaker, H. Berg, M. Grizali and R. Brown, In: M.E. Himmel and G. Georgiou (Eds.) Biocatalyst Design for Stability and Design, (1993) 220.
17. A.K. Konstantinidis, I. Marsden and M.L. Sinnott, Biochem. J., 291 (1993) 883.
18. A.E. Aleshin, L.M. Firsov and R.B. Honzatko, J. Biol. Chem., 269 (1994) 15631.
19. J.B. Coutinho, N.R. Gilkes, R.A.J. Warren, D.G. Kilburn and R.C. Jr. Miller, Mol. Microbiol., 6 (1992) 1243.
20. A.-M. Hoffrén, T.T. Teeri and O. Teleman, Prot. Eng., 8 (1995) 443.
21. M. Srisodsuk, Thesis, University of Helsinki, 1994, 64-67.
22. T. Reinikainen, L. Ruohonen, T. Nevanen, L. Laaksonen, P. Kraulis, T.A. Jones, J.K.C. Knowles and T.T. Teeri, Proteins: Structure, Function and Genetics, 14 (1992) 475.
23. T. Reinikainen, O. Teleman and T. Teeri, Proteins: Structure, Function and Genetics, in press.
24. M. Linder, M.L. Mattinen, M. Kontteli, G. Lindeberg, J. Ståhlberg, T. Drakenberg, T. Reinikainen, G. Pettersson and A. Annila, Prot. Sci., 4 (1995) 1056.
25. J. Knowles, P. Lehtovaara and T. Teeri, Tibtech., 5 (1987) 255.
26. T.E. Creighton, Proteins, Structures and Principles. Freeman and Company, New York, 1984, 145.
27. N.R. Gilkes, B. Henrissat, D.G. Kilburn, R.C. Jr. Miller and R.A.J. Warren, Microbiol. Rev., 55 (1991) 303.
28. M. Srisodsuk, T. Reinikainen, M. Penttilä and T.T. Teeri, J. Biol. Chem., 268 (1993) 20756.
29. I. Salovuori, M. Makarow, H. Rauvala, J.K.C. Knowles and L. Kääriäinen, Bio/Technology, 5 (1987) 152.

S.B. Petersen, B. Svensson and S. Pedersen (Eds), Carbohydrate Bioengineering © Elsevier Science B.V. All rights reserved.


Structural studies on fungal endoglucanases from Humicola insolens

Gideon J. Davies(a) and Martin Schülein(b)

(a) Department of Chemistry, University of York, York YO1 5DD, England
(b) Novo Nordisk a/s, Novo Allé, 2880 Bagsværd, Denmark

Abstract

The structures of the endoglucanases I and V (EG1 and EGV) from the fungus Humicola insolens have been solved by X-ray crystallography at resolutions of 2.2 and 1.5 Å, respectively. EGV, from glycosyl hydrolase family 45, catalyses cleavage of the β(1→4) glycosidic linkages of the substrate cellulose with inversion of configuration. Analysis of oligosaccharide complexes and studies on mutant proteins indicate the likely basis for catalysis by this enzyme. EG1, from family 7, catalyses cleavage with retention of configuration. It is structurally similar to the previously published Trichoderma reesei cellobiohydrolase I structure [1], with the important difference that the active site is situated in a long open groove, compared to the more enclosed active site of the cellobiohydrolase. This difference helps explain the different patterns of cellulose degradation by these two enzymes.

1. INTRODUCTION

Cellulose is the major polysaccharide component of the plant cell wall and one of the most abundant natural compounds. Cellulose-based fabrics are ubiquitous and cellulose-containing materials make up a large proportion of municipal waste. The mechanisms of cellulose degradation are, therefore, of increasing commercial and ecological importance. Cellulose is a polymer of β(1→4)-linked glucose residues (Figure 1). In common with other β(1→4)-linked polymers, such as xylan, cellulose forms extensive intra- and inter-chain hydrogen bonds from which it derives its structural strength. Furthermore, cellulose forms highly ordered crystalline fibres which are generally resistant to degradation. A number of cellulolytic micro-organisms exist, however, which contain batteries of enzymes to facilitate the complete degradation of cellulose [2-4].

This enzymatic degradation of cellulose is performed by enzymes, such as cellobiohydrolases and endoglucanases, which have been isolated from a variety of bacterial and fungal sources. These enzymes are often multi-domain entities which consist of a catalytic core domain linked to one or more domains via flexible linker regions [5]. In bacterial systems the cellulolytic enzymes are often arranged into a multi-protein complex called the cellulosome. In fungal systems the arrangement is much simpler, usually consisting of the catalytic core domain, a linker region and a cellulose-binding domain (CBD). The structure of an archetypal fungal CBD has been solved by nuclear magnetic resonance techniques [6].

The catalytic domains of glycosyl hydrolases have been classified into 45 families based on sequence comparison and hydrophobic cluster analysis [7,8]. Cellulases and xylanases are found in 11 of these families, which are also known as cellulase families A-K. More recently an additional family, family L, has been described [9]. At the time of writing, structures of enzymes from 7 of these families are known: EGCCA from family 5 (A) (Ducros et al., manuscript submitted to Structure), cellobiohydrolase II [10] (CBH-II) and endocellulase E2 [11] from family 6 (B), CBH-I [1] and EG1 from family 7 (C), CelD from family 9 (E) [12], the xylanases from families 10 (F) [13, 14] and 11 (G) [15, 17], and the endoglucanase V from family 45 (K) [18, 19]. These structures exhibit many distinct tertiary folds, ranging from variants of the α/β barrel first observed in triose phosphate isomerase (families 5(A), 6(B) and 10(F)) through various, different, predominantly β-sheet structures (families 7(C), 11(G) and 45(K)) to the primarily α-helical enzyme from family 9(E). All of these enzymes facilitate bond cleavage by similar acid-base catalytic mechanisms resulting in either retention or inversion at the anomeric C1 carbon [20].

In order to gain some insights into the underlying mechanisms of the enzymatic degradation of cellulose by the thermophilic hyphomycete Humicola insolens, we have determined the structures of a number of cellulases from this organism. In this paper we describe structural studies on two of the endoglucanases from H. insolens, endoglucanase V (EGV) and endoglucanase I (EG1).

[Figure 1 drawing: chemical structures of cellulose and xylan.]

Figure 1. Structures of the two most common β(1→4)-linked polymers in plants, cellulose and xylan.

2. ENDOGLUCANASE V

Humicola insolens is a soft-rot fungus that produces a number of cellulolytic enzymes, of which seven have been characterised [21, 22]. Like many fungal cellulases, EGV consists of a

2.1. Structure of the native enzyme

The structure of the catalytic core of EGV was solved by X-ray crystallography at 1.5 Å resolution [18,19]. The main structural feature is a six-stranded β-barrel domain with long interconnecting, often disulphide-bonded, loop regions. A long active site groove runs across the surface of the enzyme and on either side of this sit the catalytic aspartates, Asp 10 and Asp 121 (Figures 3 and 4). A similar topology can be found in the sugar-binding wound-induced plant defence protein 'barwin', whose structure has been determined by NMR [26]. The catalytic activity of barwin, if any, is not known.

Figure 4. CPK representations of the native EGV structure showing the active site groove and catalytic aspartates 10 and 121.

The native structure of EGV has a disordered loop between two of the β-strands adjacent to the active site. No electron density can be seen in this region (residues 112-117). This region corresponds to a loop which is also disordered in the solution structure of the plant defence protein barwin, which suggests that it may play a common, dynamic role in the function of these two proteins. The two catalytic residues, Asp 10 and Asp 121, sit on either side of the substrate-binding groove, straddling Tyr 8 at this point. Their carboxylates are separated by approximately 9 Å, which is as expected [20] for an enzyme known to catalyse glycosyl hydrolysis with inversion of the anomeric configuration [21]. In order to obtain a more complete understanding of the reaction mechanism of EGV, however, it was necessary to study a number of oligosaccharide complexes of the enzyme.
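The carboxylate separation argument can be sketched as a simple distance calculation. The example below is a rough illustration under stated assumptions: the coordinates are hypothetical placeholders (not taken from the EGV structure), and the 7.5 Å cutoff is an arbitrary midpoint between the commonly quoted separations of about 5.5 Å for retaining enzymes and about 9 to 10 Å for inverting enzymes.

```python
import math


def carboxylate_separation(atom_a, atom_b):
    """Euclidean distance (in Å) between two atoms given as (x, y, z) tuples."""
    return math.dist(atom_a, atom_b)


def likely_mechanism(separation, cutoff=7.5):
    """Crude rule of thumb; the cutoff is an assumption, not a value from the source."""
    return "inverting" if separation > cutoff else "retaining"


# Hypothetical coordinates chosen only so that the separation is 9 Å
asp10_carboxylate = (0.0, 0.0, 0.0)
asp121_carboxylate = (9.0, 0.0, 0.0)

d = carboxylate_separation(asp10_carboxylate, asp121_carboxylate)
print(f"separation = {d:.1f} Å, consistent with an {likely_mechanism(d)} mechanism")
```

With real coordinates one would pass the carboxylate atom positions from the deposited structure; a distance alone is only suggestive and does not replace the mutagenesis and complex structures described in the text.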

2.2. Structure of the product (cellobiose) complex

The structure of EGV co-crystallised with cellobiose, the product of the reaction, has been solved at 1.9 Å resolution. Cellobiose is found, as expected, in the 'leaving-group' subsites E and F (Figure 5). The electron density is clear and unambiguous, with the two glucosyl units having low average temperature factors of 18 and 21 Å², respectively. The directionality of the sugars is consistent with the known cleavage pattern of cellodextrins (shown in Figure 2). Both glucopyranosyl rings are in the full ⁴C₁ chair conformation.

Figure 5. Stereo MOLSCRIPT plot showing the binding of cellobiose to EGV.

The distance between the O4 atom of the E subsite sugar and one of the carboxylate oxygens of Asp 121 is 2.6 Å. This leads us to believe that Asp 121 functions as the proton donor in the reaction mechanism. This is also indicated by the fact that Asp 121 sits in a predominantly hydrophobic environment, which probably assists in maintaining a raised pKa for this residue. It is likely that Asp 10 functions as the base, activating the nucleophilic water through a deprotonation event. Asp 10 is situated sufficiently distant from Asp 121 that both a sugar ring and the catalytic water molecule could lie between the two carboxylate groups, as expected for an inverting enzyme. The critical importance of these two acidic groups is supported by kinetic analyses of site-directed mutants of these two residues, described below.

A number of conformational changes have occurred in the EGV-cellobiose complex. There is a slight closing of the active site cleft, which allows the hydroxyl group of Tyr 147 to make a water-mediated hydrogen bond with the O6 atom of the E subsite sugar. Another important feature of the cellobiose complex is that the loop between β-strands V and VI (residues 112 to 117) becomes ordered, although it makes no direct interactions with the cellobiose product. It may be possible that this is due to crystal packing considerations, but perhaps more likely it is due to a change in the water structure. Amino acids in this region are linked to the cellobiose via a network of well-defined water molecules.

2.3. Mutation of the catalytic residues of EGV

Site-directed mutagenesis of many of the acidic groups in EGV was performed. In addition to the two catalytic aspartates 10 and 121, mutation of Asp 114 was also found to diminish catalytic activity. Tyr 8, at the base of the D subsite and conserved in all family 45 sequences, was also mutated.
The catalytic consequences of these mutations are shown in Table 1. These figures help to confirm the likely roles for Asp 10 and 121, both of which are clearly essential for catalysis. This is in agreement with our proposal that they act as the activator of the catalytic nucleophile and the catalytic proton donor, respectively. The hydroxyl group of Tyr 8, at the base of the D subsite, is clearly only of extremely minor importance. Asp 114, which is situated on the mobile loop, disordered in the native structure, and pointing into solvent away from the active site in the cellobiose complex, clearly has a significant role in catalysis or binding. This is difficult to reconcile with its location in the cellobiose complex. The influence of Asp 114 on enzymatic activity is, however, somewhat easier to explain when its conformation in the cellohexaose complex, described in the following section, is revealed.

Table 1
Activity of EGV mutants measured by reduction in viscosity of carboxymethyl-substituted cellulose at pH 7.5

Protein            Enzymatic activity (ECU)
EGV wild-type      430
Mutant D10N        0
Mutant D121N       0
Mutant D114N       25
Mutant Y8F         290

2.4. Structure of the D10N mutant and its complex with cellohexaose

The availability of completely inactive EGV mutants, such as D10N and D121N, allowed us to try to study substrate complexes of this enzyme. Initially the structure of the D10N mutant alone was solved at 1.9 Å resolution. This structure is completely isomorphous with the native EGV structure, allowing us to conclude that the inactivity of this mutant could be ascribed purely to the chemical nature of the modification (i.e., the loss of charge at residue 10) and not to misfolding or structural rearrangements. The mutant D10N enzyme was therefore used to prepare an oligosaccharide complex by co-crystallisation with cellohexaose. The D10N cellohexaose complex was solved, by molecular replacement, also at 1.9 Å resolution. Clear sugar density can be seen for sugars in subsites A, B and C and E, F and G (Figure 6). The sugars are well defined, with temperature factors of 25, 9, 8, 10, 15 and 42 Å², respectively. The directionality of the sugars is again consistent with the mode of binding expected from kinetic studies, running from non-reducing end to reducing end from subsite A towards G. All the sugar rings are in the full ⁴C₁ chair conformation. No sugar density is present in the D subsite which is, instead, occupied by a number of discrete water molecules. Why might this be?
It is possible that in the very high protein concentrations of the crystallisation conditions cleavage of cellohexaose has taken place, despite the apparent inactivity of the D10N mutant. Inspection of the electron density maps suggests that this is not the explanation. Instead, it appears that we are observing two cellohexaose molecules. The first molecule has its first three glucose units disordered in the crystal solvent, with the last three units tightly bound in the A, B and C subsites. The second cellohexaose molecule is bound in the E, F and G subsites, this time with the last three glucose units disordered in the solvent. The lack of binding to the D position is probably explained by the fact that this subsite is responsible for inducing strain in the substrate or stabilising the transition state and therefore has an unfavourable binding energy for a ⁴C₁ chair glucose ring.


Figure 6. Electron density for the oligosaccharide and catalytic aspartates in the D10N cellohexaose complex, together with a Cα trace of the whole molecule.

Conformational changes take place in this structure compared to both the native and cellobiose complexes. Again, a closing of the active site cleft, relative to the native structure, is observed, with a similar magnitude to that seen in the cellobiose complex. The loop between residues 112 and 117 that was disordered in the native EGV structure, but ordered in the cellobiose complex, is again ordered in the D10N cellohexaose complex. The loop conformation is, however, quite different to that seen in the cellobiose complex. One consequence of this conformational change is to completely bury the active site at the point of cleavage. This is brought about by the loop 'flipping' so as to bury Leu 115 adjacent to Asp 121 (Figure 7). The increased hydrophobicity adjacent to the proton donor presumably assists in elevating the pKa of this residue and, in addition, Asp 114 is brought into a position directly above the D subsite. There is a critical interaction between the O6 atom of the C subsite sugar and the peptide amide group of Gly 113, which may be important for the specificity of EGV for glucose-based as opposed to xylose-based polymers. An additional beneficial effect of this loop movement is that the carboxylate of Asp 114, in addition to sitting over the vacant D subsite and interacting with the E subsite sugar, is now positioned approximately 5 Å from the carboxylate of Asp 121. Presumably this interaction also favours a raised pKa for the catalytic proton donor and explains the significant consequences of mutation of this residue described in section 2.3, above. Recent, elegant, site-directed mutagenesis experiments on a family 6(B) inverting cellulase, CenA from Cellulomonas fimi [28], have indicated the role of an additional acidic group in maintaining a raised pKa for the catalytic proton donor. It seems likely, therefore, that a similar mechanism exists in EGV.

Figure 7. Loop conformations for the cellobiose complex (faint lines) and the cellohexaose complex (thicker lines). The position of Asp 121 and the position of Leu 115 in the two conformations are shown.

2.5. Mechanism of bond cleavage by EGV

Deducing the mechanism of bond cleavage by EGV is complicated by the absence of a sugar in the 'cleavage' subsite D. What then do we know? Kinetic and mutant studies show us that EGV catalyses the cleavage of the glycosidic bond:
- between subsites D and E
- with inversion of configuration at C1
- with Asps 10 and 121 playing an essential role and Asp 114 a minor one

Primary and tertiary structure considerations show that:
- Asp 10 and Asp 121 are completely conserved throughout all family 45 sequences
- Asp 121 sits in a hydrophobic environment and makes a 2.6 Å hydrogen bond to the O4 position of the E subsite sugar, suggesting a likely role as the proton donor
- Asp 10 sits in a more hydrophilic environment, below the D subsite, where it hydrogen bonds to a number of discrete water molecules. The carboxylate of Asp 10 is approximately 9 Å from that of Asp 121
- conformational changes take place upon the binding of substrates

In order to assign a definitive mechanism, knowledge of the conformation of the D subsite sugar would be essential. No complexes with a bound D subsite sugar have, thus far, been obtained, and modelling of the D sugar is also particularly problematic. The A, B & C and E, F & G subsite sugars have full ⁴C₁ chair conformations and conformational angles similar to those found in small molecule structures of oligosaccharides [29,30]. It is, however, difficult to build a sugar into the D subsite which bonds satisfactorily to the C and E subsite sugars. A plausible model can be built, however, in which the orientation of the D subsite sugar continues the natural twist of the A, B and C subsite sugars. In this orientation the α position at C1 of the D subsite sugar could interact with a water molecule bound to Asp 10. This is in agreement with our proposals, based on sequence conservation, mutagenesis and structure, that Asp 10 functions so as to activate the nucleophilic water by a deprotonation event. This suggests that EGV catalyses glycosidic bond cleavage utilising the single displacement, SN2, mechanism proposed by Koshland in 1953 [31], with Asp 121 acting as the proton donor and Asp 10 as the catalytic base (Figure 8). The role of Asp 114 may be to assist in maintaining a raised pKa for the catalytic proton donor, or it may play a role in regenerating the appropriate charges on the two catalytic aspartates, which will have changed following an enzymatic turnover event.

The single displacement, essentially SN2, mechanism is the simplest explanation for catalysis by EGV. Others have addressed the similarities in the transition states for both SN2 and SN1 reaction mechanisms [32]. This suggests that the simple SN2 mechanism for EGV may not be quite the whole story, but in the absence of more mechanistic data, further speculation is inappropriate.

Figure 8. A single displacement reaction scheme for endoglucanase V.

3. ENDOGLUCANASE I

Most fungal cellulases consist of a catalytic core domain connected to a cellulose binding domain via a flexible linker region. The endoglucanase I (EG1) from H. insolens is unusual in that it consists of a catalytic core domain only. This catalytic core can be classified into family 7 of the glycosyl hydrolases, also known as cellulase family C. Family 7(C) contains both endoglucanases (EC 3.2.1.4) and cellobiohydrolases (EC 3.2.1.91), which are believed to hydrolyse cellulose in a predominantly endo or exo manner, respectively. In common with other family 7 enzymes, EG1 catalyses glycosyl hydrolysis with a net retention of the anomeric configuration, giving the β anomer as the reaction product. The kinetics of EG1 have been described [21,22]. The unchanged kcat/KM for cellodextrins containing 4, 5 or 6 sugar rings indicates that EG1 has only four kinetically significant subsites for substrate binding. EG1 is unusual for a cellulase in that it can release monosaccharide aglycons. Under certain conditions both EG1 and CBH-I can cleave cellotriose, liberating either cellobiose or glucose as the leaving group. The kinetics of EG1 are further complicated by the fact that, like many enzymes that operate with a net retention of configuration, significant transglycosylation reactions take place at high product concentrations.

3.1. Structure of the native enzyme

The structure of the H. insolens EG1 was solved at 2.2 Å resolution by molecular replacement, utilising the known T. reesei CBH-I structure [1] as the search model. The first 398 amino acids of the structure are visible in the electron density maps, and the missing 17 C-terminal residues are presumed to be either disordered or proteolytically cleaved from the enzyme. The N-terminal Gln residue is found as the modified pyroglutamate group, as was observed in the T. reesei CBH-I structure [1,33]. Evidence of N-linked glycosylation can be seen at both the potential glycosylation sites, but only at Asn 247 could this be modelled, in this case as a single ordered N-acetylglucosamine group.

The structure of a cellobiohydrolase from family 7(C) has been described [1]. It consists of a predominantly β-strand structure with a similar topology to that found in the plant legume lectins such as concanavalin A [34,35]. This fold is also found in the structure of another glycosyl hydrolase, the bacterial β(1→3):β(1→4) glucanase from family 16 [36]. It has also been likened to that found in the xylanases of family 11(G) [13], but it is structurally distinct from any of the other cellulase topologies that have so far been observed. EG1 shares this same topology, with overall dimensions of approximately 60 Å × 45 Å × 40 Å. The structure is primarily built of two large antiparallel β sheets. In EG1 these two sheets consist of 7 and 8 strands respectively (Figure 9). These stack on top of each other in the form of a convex and a concave sheet. A large groove runs across the surface of the structure and is formed by the enclosure created by the concave β sheet. It is approximately 50 Å long and about 20 Å deep for most of its length. The active site residues are located at the bottom of this groove, approximately two-thirds of the way along the channel. The channel is sufficiently long to accommodate up to 8 sugar units, although it is known that only 4 of these potential binding sites contribute to catalysis.

3.2. Active site of EG1

Catalysis with overall retention of anomeric configuration requires a proton donor, to protonate the glycosidic bond and assist aglycon departure, and a catalytic nucleophile. The function of the nucleophile, or base, is to stabilise the oxycarbonium ion present after aglycon departure. This stabilisation is likely to take the form of a covalent enzyme intermediate, as has been proven for many systems [37,38], although the possibility of the electrostatic stabilisation of a long-lived intermediate has also been proposed [39,40]. In either case catalysis dictates that the two acidic residues are located with their carboxylate groups separated by approximately 5 Å [20].

The active site residues of EG1 lie at the bottom of the active site groove. They are easily identified as the catalytic residues by their strict sequence conservation in family 7(C), by their spatial location and by analogy with the CBH-I [33] and bacterial β(1→3):β(1→4) glucanase structures, for both of which oligosaccharide complexes have been reported. The three active site residues of the H. insolens EG1 are Glu 197, Asp 199 and Glu 202. Comparison with CBH-I and the bacterial β(1→3):β(1→4) glucanase structure suggests that Glu 202 functions as the proton donor in the reaction. Glu 202 is located in a somewhat hydrophobic environment. It is flanked on either side by tryptophan rings and is positioned below an asparagine residue, Asn 143. This Asn residue is located in an analogous manner to that seen adjacent to the proton donor in the family 10 enzymes, in which it has been implicated in maintaining a raised pKa for the proton donor [15]. The family 10 structures are quite distinct from EG1, consisting instead of an (α/β)8 barrel as first observed in triose-phosphate isomerase. In the family 10 structures, and related sequences [41], this Asn residue is provided by the same strand as the proton donor in an Asn-Glu motif. In EG1 the Asn located above the proton donor is instead provided by the adjacent β strand.

The likely catalytic nucleophile in EG1 is Glu 197. The equivalent residue has been labelled by a mechanism-based epoxyalkyl reagent in the bacterial β(1→3):β(1→4) glucanase structure [26]. The carboxylate groups of Glu 197 and Glu 202 are separated by approximately 5.5 Å, consistent with a proton donor and nucleophile pair in a retaining enzyme [20]. The role of Asp 199 is unclear. It sits approximately 3 Å from the potential nucleophile Glu 197 and may play a role in maintaining the appropriate charges on the reactive groups during catalysis.
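The distance criterion used above can be illustrated with a short Python sketch. It is not part of the original analysis: the function names, the 7.0 Å cutoff and the coordinates in the usage example are assumptions made for illustration only. The two reference separations are the ones quoted in the text (approximately 5.5 Å for the Glu 197 / Glu 202 pair of the retaining EG1, and approximately 9 Å for the Asp 10 / Asp 121 pair of the inverting EGV).

```python
import math


def carboxylate_separation(oxygens_a, oxygens_b):
    """Mean of the O-O distances (in Å) between two carboxylate groups.

    Each argument is a list of (x, y, z) tuples, one per carboxylate oxygen.
    """
    dists = [math.dist(p, q) for p in oxygens_a for q in oxygens_b]
    return sum(dists) / len(dists)


def suggested_mechanism(separation, cutoff=7.0):
    """Rough label for a carboxylate pair from its separation in Å.

    The cutoff is an assumed value placed between the ~5.5 Å (retaining)
    and ~9 Å (inverting) separations quoted in the text.
    """
    return "retaining" if separation < cutoff else "inverting"


if __name__ == "__main__":
    # Hypothetical oxygen positions (Å); not taken from any deposited structure.
    acid_a = [(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)]
    acid_b = [(5.5, 0.0, 0.0), (7.7, 0.0, 0.0)]
    sep = carboxylate_separation(acid_a, acid_b)
    print(f"{sep:.1f} Å -> {suggested_mechanism(sep)}")
    # Separation quoted in the text for Asp 10 / Asp 121 of EGV.
    print(f"9.0 Å -> {suggested_mechanism(9.0)}")
```

A single cutoff is a deliberate simplification; a separation on its own only indicates which mechanism a residue pair is geometrically compatible with, and the stereochemical outcome still has to be established experimentally.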

Figure 9. Stereo MOLSCRIPT picture of the H. insolens EG1 structure. The active site residues Glu 197 and Glu 202 are shown in ball-and-stick representation.

Family 7(C) contains both cellobiohydrolases and endoglucanases. A superficial comparison of the known T. reesei CBH-I structure with the EG1 structure presented here indicates the likely basis for the differences in the pattern of cellulose degradation by these two enzymes. CBH-I is considered to be a primarily exo-acting enzyme, based upon its inability to digest carboxymethyl-substituted cellulose. As predicted [1], the primary difference between EG1 and CBH-I is that many of the extended loops of CBH-I are absent in the EG1 structure, resulting in a more open active site channel and an endo-catalytic activity. A similar difference between the enclosed tunnel of a cellobiohydrolase (CBH-II) and the corresponding endoglucanases was predicted [10] and later observed [11] for the family 6(B) enzymes. It is likely, then, that these structural differences will exist for many families of cellulases which contain both cellobiohydrolases and endoglucanases. In order to understand more fully the mechanism of EG1, its differences from CBH-I and its interaction with oligosaccharides, structures of EG1 substrate and inhibitor complexes will be necessary.

5. REFERENCES

1. C. Divne et al., Science, 265 (1994) 524.
2. P. Béguin and J.-P. Aubert, FEMS Micro. Rev., 13 (1994) 25.
3. B. Henrissat, Cellulose, 1 (1994) 169.
4. H.J. Gilbert and G.P. Hazlewood, J. Gen. Micro., 139 (1993) 187.
5. N.R. Gilkes, B. Henrissat, D.G. Kilburn, R.C. Miller Jr. and R.A.J. Warren, Microbiol. Rev., 55 (1991) 303.
6. P.J. Kraulis et al., Biochemistry, 28 (1989) 7241.
7. B. Henrissat, Biochem. J., 280 (1991) 309.
8. B. Henrissat and A. Bairoch, Biochem. J., 293 (1993) 781.
9. H. Shen et al., Biochem. Biophys. Res. Comm., 199 (1994) 1223.
10. J. Rouvinen, T. Bergfors, T. Teeri, J.K.C. Knowles and T.A. Jones, Science, 249 (1990) 380.
11. M. Spezio, D.B. Wilson and P.A. Karplus, Biochemistry, 32 (1993) 9906.
12. M. Juy et al., Nature, 357 (1992) 89.
13. A. Törrönen, A. Harkki and J. Rouvinen, EMBO J., 13 (1994) 2493.
14. W.W. Wakarchuk, R.L. Campbell, W.L. Sung, J. Davoodi and M. Yaguchi, Protein Sci., 3 (1994) 467.
15. A. White, S.G. Withers, N.R. Gilkes and D.R. Rose, Biochemistry, 33 (1994) 12546.
16. U. Derewenda et al., J. Biol. Chem., 269 (1994) 20811.
17. G.W. Harris et al., Structure, 2 (1994) 1107.
18. G.J. Davies, S.P. Tolley and M. Schülein, in: Proceedings of the second Tricel symposium on Trichoderma reesei cellulases and other hydrolases, P. Suominen and T. Reinikainen (eds.), 81-86, Foundation for Biotechnical and Industrial Fermentation Research, Espoo, 1993.
19. G.J. Davies et al., Nature, 365 (1993) 362.
20. J.D. McCarter and S.G. Withers, Curr. Op. Struct. Biol., 4 (1994) 885.
21. C. Schou, G. Rasmussen, M.-B. Kaltoft, B. Henrissat and M. Schülein, Eur. J. Biochem., 217 (1993) 947.
22. M. Schülein, D.F. Tikhomirov and C. Schou, in: Proceedings of the second Tricel symposium on Trichoderma reesei cellulases and other hydrolases, P. Suominen and T. Reinikainen (eds.), 109-116, Foundation for Biotechnical and Industrial Fermentation Research, Espoo, 1993.
23. H.J. Gilbert, J. Hall, G.P. Hazlewood and L.M.A. Ferreira, Mol. Microbiol., 4 (1990) 759.
24. P.O. Sheppard et al., Gene, 150 (1994) 163.
25. A. Saloheimo, B. Henrissat, A.-M. Hoffrén, O. Teleman and M. Penttilä, Mol. Microbiol., 13 (1994) 219.
26. S. Ludvigsen and F.M. Poulsen, Biochemistry, 31 (1992) 8783.
27. P.J. Kraulis, J. Appl. Cryst., 24 (1991) 946.
28. H.G. Damude, S.G. Withers, D.G. Kilburn, R.C. Miller Jr. and R.A.J. Warren, Biochemistry, 34 (1995) 2220.
29. K. Geßler et al., Science, 266 (1994) 1027.
30. S. Raymond, A. Heyraud, D.T. Qui, A. Kvick and H. Chanzy, Macromolecules, 28 (1995) 2096.
31. D.E. Koshland, Biol. Rev., 28 (1953) 416.
32. Y. Tanaka, W. Tao, J.S. Blanchard and E.J. Hehre, J. Biol. Chem., 269 (1994) 32306.
33. C. Divne, Ph.D. Thesis, Uppsala (1994).
34. G.M. Edelman et al., Proc. Natl. Acad. Sci. (USA), 69 (1972) 2580.
35. K.D. Hardman and C.F. Ainsworth, Biochemistry, 11 (1972) 4910.
36. T. Keitel, O. Simon, R. Borriss and U. Heinemann, Proc. Natl. Acad. Sci. (USA), 90 (1993) 5287.
37. M.L. Sinnott, in: Enzyme mechanisms, M.I. Page and A. Williams (eds.), 259-297, Royal Society of Chemistry, London (1987).
38. M.L. Sinnott, Chem. Rev., 90 (1990) 1171.
39. C.C.F. Blake et al., Proc. Roy. Soc. ser. B, 167 (1967) 378.
40. N.C.J. Strynadka and M.N.G. James, J. Mol. Biol., 220 (1991) 401.
41. B. Henrissat et al., Proc. Natl. Acad. Sci. (USA), 92 (1995) 7090.

S.B. Petersen, B. Svensson and S. Pedersen (Eds), Carbohydrate Bioengineering © Elsevier Science B.V. All rights reserved.

The catalytic domain of endoglucanase A from Clostridium cellulolyticum belonging to family 5: an α/β-barrel enzyme

V. Ducros a, M. Czjzek a, A. Belaich b, C. Gaudin b and R. Haser a

a LCCMB, CNRS-Marseille, URA1296, IFRC1, 31 chemin Joseph Aiguier, 13402 Marseille cedex 20, France
b BIP, CNRS-Marseille, UPR9036, IFRC1, 31 chemin Joseph Aiguier, 13402 Marseille cedex 20, France

Abstract

The 1.6 Å resolution crystal structure of the catalytic domain of a family 5 glycosyl hydrolase, the endoglucanase CelCCA from Clostridium cellulolyticum, is reported. The structural fold belongs to the superfamily of (α/β)8-barrel enzymes, with the active site located at the carboxy-terminal ends of the β-strands and the aromatic residues that form the substrate binding site arranged along a long cleft on the surface of the globular enzyme. The determination of the structure allows a comparison with related enzymes which belong to other glycosyl hydrolase families.

1. INTRODUCTION

Native cellulose is degraded to soluble sugars by fungal and bacterial microorganisms harboring multiple cellulases. Complete degradation results from the cooperative action of cellobiohydrolases and endoglucanases. These cellulolytic enzymes are often multifunctional proteins composed of distinct domains: a catalytic domain and one or more domains involved in substrate binding or multi-enzyme complex formation [1]. The mechanism by which glycosyl hydrolases catalyse glycosidic bond cleavage is formally a nucleophilic substitution at the anomeric centre that can take place with either retention or inversion of the configuration. The catalytic domains of glycosyl hydrolases are classified into 45 families on the basis of sequence alignment and hydrophobic cluster analysis (HCA) [2]. Among these, cellulases and xylanases are ordered into 11 families. For a given family, available data are consistent with a common active site topology and catalytic mechanism [1, 3-6]. CelCCA from Clostridium cellulolyticum has a catalytic domain belonging to the largest family, 5, and a carboxy-terminal domain containing a stretch of reiterated sequences [7-9]. This duplicated segment appears, in clostridial enzymes, to be involved in attachment of catalytic subunits to the scaffolding proteins of the cellulosome [1]. Catalysis by enzymes of family 5 proceeds with net retention of the anomeric configuration [9,10]. HCA plots led to the identification of five segments particularly well conserved in the catalytic core, including seven strictly conserved residues [11]. Extended biochemical studies and site-directed mutagenesis experiments were performed in order to identify the residues involved in the catalytic reaction [8, 12-16]. The results of these studies are consistent with the participation in catalysis of one strictly conserved glutamate (corresponding to Glu307 in CelCCA) as the catalytic nucleophile [15-17]. The importance of the other conserved residues has been shown, but there is no conclusive evidence concerning their role in catalysis.

Up to now, there has been no detailed structural information for family 5. For cellulolytic enzymes whose structures have been determined, no general relationship between their tertiary folds can be identified. However, three of the seven known structures exhibit variations of the (α/β)8 barrel motif, namely those of families 5 [ref], 6 [18-19] and 10 [20-22]. Other folds, which contain mainly β-sheet motifs, have been observed in families 7 [23], 11 [24] and 45 [25], and an α/α-barrel in the only representative of family 9 [26]. The catalytic domain of the endoglucanase CelCCA from Clostridium cellulolyticum has been crystallized [27] and its structure solved to a resolution of 1.6 Å with a crystallographic R-factor of 19.1%. Full details of the structure determination will be reported elsewhere [28].

2. THE ENDOGLUCANASE A

2.1. Overall fold

The catalytic core of CelCCA, encompassing residues 1 to 380, folds as a single (α/β)8-barrel domain that contains an elliptical core of eight parallel β-strands (Fig. 1), as first observed in triose-phosphate isomerase (TIM) [29].

Figure 1. Ribbon representation of the overall fold of CelCCA. (a) Top view of the (α/β)8-barrel motif. (b) Side view of the (α/β)8-barrel motif, showing the cleft at the carboxy-terminal end of the β-barrel.

The TIM barrel motif, which represents a large percentage of all known structures [30, 31], has been observed in many proteins that have little or no sequence homology and a variety of functions. Their catalytic sites are always found at the carboxy-terminal extremity of the β-barrel, with the catalytic residues located in the loops connecting a β-strand with the following α-helix. Most α/β-barrel proteins deviate from the canonical fold, with supplementary or missing α-helices and β-strands or additional domains. Farber and Petsko classified the α/β-barrel enzymes into four different structural families based on the following criteria: the number of domains, the location of the principal axis of the β-barrel and the requirement for cofactors [32]. Endoglucanase A is a member of the fourth family in this classification. CelCCA shows the structural features of this family, which comprises single-domain α/β-barrel proteins that have an additional helix preceding the first β-strand. This helix, located at the amino-terminal end of the barrel, is, in contrast to the other helices, oriented perpendicularly to the axis of the barrel. Initially, this family was established on the basis of two aldolases, fructose biphosphate aldolase (FALD) and 2-keto-3-deoxy-6-phosphogluconate aldolase (KDPG) [33, 34]. We also note a slight deviation from the regular α/β-barrel fold caused by the absence of the fifth helix, which is replaced by an extended loop containing only one helical turn. This structural feature leads to a large accessible groove running across the surface of the molecule at the carboxy-terminal end of the barrel. The same irregularity has recently also been reported for the (α/β)8 barrel domain of β-galactosidase from E. coli [35]. The details of the arrangement of the secondary structure elements are presented schematically in Figure 2.

Figure 2. Topological diagram of the secondary structure elements in CelCCA.

2.2. The catalytic site

Enzymes belonging to family 5 are known to catalyse glycosyl hydrolysis with overall retention of the anomeric configuration. This type of catalysis requires a pair of carboxylic acids (Glu or Asp) at the active site, with their carboxylate groups separated by approximately 5.0 Å. One residue acts as the proton donor and protonates the glycosidic bond, and the other, the nucleophile, stabilizes the oxycarbonium ion after the aglycon departure. Based upon sequence alignments and HCA predictions, two glutamate residues, strictly conserved in cellulase family 5 (Glu170 and Glu307 for CelCCA), have been identified, and the study of site-specific mutants and kinetic analysis have demonstrated their involvement in catalysis [15-17]. These two glutamate residues, Glu170 and Glu307, sit at the bottom of the catalytic site on either side of the groove, at the carboxy ends of β-strands IV and VII, respectively (Fig. 3). Their carboxylate groups are separated by approximately 5.5 Å, as expected for enzymes which catalyse glycosyl group hydrolysis with retention of the configuration. Other invariant residues within family 5, Arg79, His122, Asn169, His254 and Tyr256 [11], are all located on the loops interconnecting the β-strands on the same side of the β-barrel, an arrangement which brings them close to each other. The active site is located at the bottom of a large cleft, which contains several aromatic residues forming the substrate binding site (Fig. 4).

Figure 3. The catalytic residues in the active site of CelCCA. The proton donor Glu170 and the nucleophile Glu307 are situated on opposite sides of the active site cleft.

Figure 4. The arrangement of the aromatic residues along the substrate binding site.

3. RELATIONSHIP TO OTHER α/β-BARREL GLYCOSYL HYDROLASES

Other glycosyl hydrolases, such as α-amylases [36, 37], β-amylase [38], cyclodextrin glycosyl transferase [39], β-galactosidase [35], cellulases [18,19], xylanases [20-22,24] and, more recently, chitinases [40,41] and β-glucanases [42], contain the α/β-barrel motif. Active sites are generally found in clefts or tunnels and, for the (α/β)8-barrel motif, at the carboxy-terminal ends of the β-strands. The spatial arrangement, however, and the localization of the catalytic residues on the β-strands may be quite distinct. In the α-amylases belonging to family 13, which are also retaining enzymes, the catalytic site is formed by three carboxylic residues: two implicated in acid catalysis, located on β-strand V and on the loop connecting β-strand VII with α-helix VII respectively, and the putative nucleophile on β-strand IV. This arrangement is close but not identical to the one found in CelCCA. Moreover, α-amylases need an extra domain to bind the substrate in catalysis. A number of retaining glycosyl hydrolases displaying the (α/β)8-barrel motif, such as Cellulomonas fimi xylanase/glycanase Cex [20], Pseudomonas fluorescens xylanase A [21], Streptomyces lividans xylanase A [22] and the catalytic domain of Escherichia coli β-galactosidase [35], have an identical arrangement of the proton donor and the nucleophile on strands IV and VII, respectively. In some cases, however, as for example in β-galactosidase, a supplementary Mg2+ cation is essential in the active site. The similarities observed support the idea of grouping retaining glycosyl hydrolases from different families, which display slightly different substrate specificities but have the same arrangement of the active site, into superfamilies of (α/β)8-barrel structures, as has recently been proposed by Jenkins et al. [43] and Henrissat et al. [44].

4. COMPARISON WITH CELLULASE AND XYLANASE STRUCTURES

The structure of CelCCA has been compared to those of other cellulases and xylanases for which coordinates were available [28]. The greatest resemblance is found to the bifunctional β-1,4-xylanase/cellulase Cex, belonging to family 10, with the best superpositions within the eight-stranded β-barrel and the largest deviations in the loops and in some of the external helices. The active sites of the two enzymes lie within 1 Å of each other when the overall folds are superposed. It is also noteworthy that the (α/β)8-motif of CelCCA is closer to the classical TIM-barrel than to the structures of both the endoglucanase and the cellobiohydrolase belonging to family 6.

4.1. Comparison to cellobiohydrolase II, family 6

In contrast to CelCCA, CBHII and the endoglucanase E2 are inverting enzymes. The arrangements of both the active site and the large groove forming the substrate binding site are quite different from those found in CelCCA or Cex. Again the highest similarity is found at the level of the β-barrel, even though the eighth β-strand is missing in the structures of CBHII and E2. In CBHII, there are two extensive loops, connecting β-strands II and III on one side of the cleft and following β-strand VII on the other, which together with the superficial groove form a tunnel leading to the active site. Variations between CelCCA and CBHII are observed in the helical regions, which differ significantly in length, but the most important variation is seen in the catalytic site. The active site of CelCCA is much more open than that of CBHII. We note that the superposition of the first β-strand in CelCCA with the first β-strand in CBHII does not lead to a superposition of the binding site clefts. They are separated by a rotation of 90° about the barrel axis. When we apply this rotation, β-strands II and VII of CelCCA correspond to β-strands V and II in CBHII, respectively. The two extensive loops following β-strands II and VII in CBHII, which form the tunnel around the active site, then correspond to the three longest loops, after β-strands I, VIII and IV, of CelCCA. In CelCCA these loops do not form a tunnel; instead, they are oriented away from the active site, leaving it much more accessible (Fig. 5). In the structure of CelCCA, these loops correspond to regions with some discontinuities in the electron density, and the side chains of residues 44 to 46, 177, 348 and 350 are poorly defined, as indicated by their high B-factors. In CBHII the loops are reported to be well defined in density and stabilized by disulphide bridges formed at the base of the loops. In CelCCA disulphide bridges are completely absent. This seems to indicate that these loops have some flexibility in CelCCA and may play a role in substrate binding.

Figure 5. Comparison of (a) CBHII from Trichoderma reesei [18] with (b) CelCCA from Clostridium cellulolyticum [27]. The loops in CBHII are constrained by disulphide bridges to form a tunnel, whereas the flexibility and positioning of the corresponding loops in CelCCA leave the active site cleft accessible.

Furthermore, the catalytic residues of the two enzymes are located on different β-strands. In CBHII the nucleophile, Asp265, is situated on the loop following strand VII and the proton donor, Asp117, at the carboxy-terminal end of β-strand III, while in CelCCA the nucleophile, Glu307, and the proton donor, Glu170, are located at the carboxy-terminal ends of β-strands VII and IV, respectively. This discrepancy in the position of the catalytic residues is not surprising, since CBHII is an enzyme proceeding with inversion of the configuration at the anomeric carbon, in contrast to CelCCA, which is a retaining enzyme.

4.2. Comparison to xylanase Cex, family 10

CelCCA from Clostridium cellulolyticum and Cex from Cellulomonas fimi show a low level of sequence similarity and display different specificities towards substrates. They have therefore been classed in two different families based on sequence alignment and hydrophobic cluster analysis. Nevertheless, catalysis in both of these families proceeds with net retention of the anomeric configuration. Furthermore, the structural comparison shows that these two enzymes display a common overall fold and that they share a similar spatial arrangement of their catalytic residues. CelCCA superposes with the xylanase Cex with an rms deviation of 1.43 Å [28]. The sequence alignment of the catalytic core of CelCCA with two members of family 10, xylanase Cex from Cellulomonas fimi and xylanase A from Pseudomonas fluorescens, is given in Figure 6.
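An rms deviation such as the one quoted above is computed over pairs of structurally equivalent atoms. The following Python sketch shows only that final step, assuming the two coordinate sets have already been optimally superposed and paired; the superposition itself is not shown. The function name and the coordinates used in the example are illustrative assumptions and do not reproduce the procedure of [28].

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between paired, pre-superposed atoms.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples in which
    the i-th entries are structurally equivalent atoms (for example Cα atoms).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates (Å): every atom pair is displaced by 1 Å along z.
    structure_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    structure_2 = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
    print(f"rms deviation = {rmsd(structure_1, structure_2):.2f} Å")
```

The value obtained depends on which atoms are declared equivalent, so an rms deviation is only meaningful together with the set of residues over which it was calculated.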

Figure 6. Sequence alignment of CelCCA (1-CelCCA), family 5, with xylanase A (2-XYLPFA) and Cex (3-CexCFi), family 10. The strictly conserved residues common to both families are printed in bold.

The catalytic residues are strictly conserved within the two families: the proton donor Glu170 in CelCCA corresponds to Glu127 in Cex, and the nucleophilic glutamates 307 and 233 both follow an invariant asparagine. In the 3D structures these two invariant glutamates are found in the same positions, β-strand IV for the proton donor and β-strand VII for the nucleophile. Both are situated in a similar chemical environment (Fig. 7). A feature common to both nucleophiles is a hydrophobic stacking formed by a tryptophan and a histidine in family 5, which correspond to a tryptophan and a glutamine in family 10, respectively. Moreover, the hydrogen bond formed with Tyr256 in CelCCA is replaced by a hydrogen bond with His205 in Cex. The environment of the proton donor is less conserved, although in both families the neighbouring asparagine, forming the conserved dipeptide sequence Asn-Glu, is related to the proton donor, via a water molecule in the case of CelCCA and via the conserved Gln203 in Cex.
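The notion of a strictly conserved residue used above can be stated operationally: a column of a multiple alignment is strictly conserved when every sequence carries the same residue there and none has a gap. The Python sketch below illustrates this under stated assumptions; the function name and the three toy sequences are hypothetical and are not taken from Figure 6.

```python
def strictly_conserved_columns(aligned, gap="-"):
    """Return the 0-based indices of strictly conserved alignment columns.

    `aligned` is a list of equal-length strings, one per sequence, with gaps
    written as `gap`. A column counts as conserved only if all sequences
    share the same non-gap residue at that position.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    for i in range(length):
        residue = aligned[0][i]
        if residue != gap and all(seq[i] == residue for seq in aligned):
            conserved.append(i)
    return conserved


if __name__ == "__main__":
    # Toy alignment (hypothetical sequences) containing a shared N-E dipeptide.
    toy = ["AC-NEP", "ACDNEP", "GC-NEQ"]
    for i in strictly_conserved_columns(toy):
        print(i, toy[0][i])
```

Identity in every sequence is the strictest possible criterion; family-level analyses often also consider conservative substitutions, which this sketch does not attempt.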

Figure 7. A stereo representation of the superposed active sites of CelCCA (thick lines) and Cex (thin lines).

Some discrepancies exist between the sequence alignment and the structural superposition of the two families. In particular, this is true for an invariant arginine in family 5 (Arg79 in CelCCA), which aligns with a conserved lysine in family 10. In CelCCA this arginine is buried in the catalytic site and forms a salt bridge with the nucleophile, and therefore seems to be involved in the catalytic activity (Fig. 7). Site-directed mutagenesis studies carried out on CelCCA [8] showed that a lysine could replace this arginine, yielding an enzyme with a residual activity of about 20%. Although the sequence alignment shows a lysine residue for family 10 at the position of this arginine in family 5, in the 3D structure we observed a valine at the equivalent position. As this residue is considerably shorter than arginine or lysine, more space is left in the catalytic cavity. This difference could be associated with the specificities of the two families for different substrates, xylan and cellulose. Since xylan is composed of β-1,4-linked D-xylopyranosyl units carrying several different mono- or oligosaccharide side chains, the catalytic cavity of a xylanase requires more space to accommodate its substrate than that of a cellulase does to accommodate cellulose. Some other residues in the catalytic site, invariant within each family but differing between cellulases of family 5 and xylanases of family 10, may be associated with the distinct specificity for cellulose or xylan substrates. In particular, a conserved His-His motif (122-123) in CelCCA corresponds structurally to a conserved Trp-His motif (84-85) found throughout family 10, although the two motifs are shifted by four residues in the sequence alignment (see Figures 6 and 7).

In CelCCA the invariant His122 is thought to be involved in substrate binding [11,28], as is Trp84 in Cex [19]. However, the precise orientation of the ring planes relative to each other and the orientation of the residues within the catalytic cavities differ considerably between CelCCA and Cex, which again might be explained by the different substrate specificities.

5. CONCLUSION

CelCCA can be classed into the fourth family of (α/β)8-barrel enzymes, according to the classification of Farber and Petsko [32]. Comparison with other (α/β)8-barrel enzymes that do not share the same mechanism shows that the locations of the catalytic residues are not identical. The superposition with retaining enzymes sharing the (α/β)8-barrel motif, for example Cex, a member of family 10, clearly shows strong similarities in the catalytic site, with the two glutamates located on the same β-strands IV and VII. The discrepancies in the catalytic cavity could be attributed to their different substrate specificities. This structural analysis of a representative of family 5 supports the results of Jenkins et al. [43] and Henrissat et al. [44], who proposed that several glycosyl hydrolase families (1, 2, 5, 10, 30, 35, 39 and 42) diverge from the same ancestral (α/β)8-barrel structure. Furthermore, Henrissat et al. [44] predict the precise location and function of the two catalytic glutamates on β-strands IV and VII within these families, a prediction that we confirm by the 3D-structure determination of CelCCA.

6. REFERENCES

1. P. Béguin and J.P. Aubert, FEMS Microbiol. Rev., 13 (1994) 25.
2. B. Henrissat and A. Bairoch, Biochem. J., 293 (1993) 781.
3. M. Claeyssens and B. Henrissat, Protein Sci., 1 (1992) 1293.
4. J. Gebler, N.R. Gilkes, M. Claeyssens, D.B. Wilson, P. Béguin, W.W. Wakarchuk, D.G. Kilburn, R.C. Miller, R.A.J. Warren and S.G. Withers, J. Biol. Chem., 267 (1992) 12559.
5. M.L. Sinnott, Chem. Rev., 90 (1990) 1171.
6. J.D. McCarter and S.G. Withers, Current Opinion in Struct. Biol., 4 (1994) 885-892.
7. H.P. Fierobe, C. Gaudin, A. Belaïch, M. Loutfi, E. Faure, C. Bagnara, D. Baty and J.P. Belaïch, J. Bacteriol., 173 (1991) 7956.
8. A. Belaïch, H.P. Fierobe, D. Baty, B. Busetta, C. Bagnara-Tardif, C. Gaudin and J.P. Belaïch, J. Bacteriol., 174 (1992) 4677.
9. H.P. Fierobe, C. Bagnara-Tardif, C. Gaudin, F. Guerlesquin, P. Sauve, A. Belaïch and J.P. Belaïch, Eur. J. Biochem., 217 (1993) 557.
10. F. Barras, I. Bortoli-German, M. Bauzan, J. Rouvier, C. Gey, A. Heyraud and B. Henrissat, FEBS, 300 (1992) 145.
11. B. Henrissat, M. Claeyssens, P. Tomme, L. Lemesle and J.P. Mornon, Gene, 81 (1989) 83.
12. S.D. Baird, M.A. Hefford, D.A. Johnson, W.L. Sung, M. Yaguchi and V.L. Seligy, Biochem. Biophys. Res. Commun., 169 (1990) 1035.
13. B. Py, I. Bortoli-German, J. Haiech, M. Chippaux and F. Barras, Protein Engineering, 4 (1991) 325.
14. J. Navas and P. Béguin, Biochem. Biophys. Res. Com., 189 (1992) 807.
15. Q. Wang, D. Tull, A. Meinke, N.R. Gilkes, N.A.J. Warren, R. Aebersold and S.G. Withers, J. Biol. Chem., 268 (1993) 14096.
16. I. Bortoli-German, J. Haiech, M. Chippaux and F. Barras, J. Mol. Biol., 246 (1995) 82.
17. S.G. Withers, R. Antony, T. Warren, I.P. Street, K. Ruptiz, J.P. Kempton and R. Aebersold, J. Am. Chem. Soc., 112 (1990) 5887.
18. T. Rouvinen, T. Bergfors, T. Teeri, J.K.C. Knowles and T.A. Jones, Science, 249.
19. M. Spezio, D.B. Wilson and P.A. Karplus, Biochem., 32 (1993) 9906.
20. A. White, S.G. Withers, N.R. Gilkes and D.R. Rose, Biochemistry, 33 (1994) 12546.
21. G.W. Harris, J.A. Jenkins, I. Connerton, N. Cummings, L.L. Leggio, M. Scott, G.P. Hazlewood, J.I. Laurie, H.J. Gilbert and R.W. Pickersgill, Structure, 2 (1994) 1107.
22. U. Derewenda, L. Swenson, R. Green, Y. Weit, R. Morosoli, F. Shareck, D. Kluepfell and Z.S. Derewenda, J. Biol. Chem., 269 (1994) 20811.
23. C. Divne, J. Stahlberg, T. Reinikainen, L. Ruohonen, G. Pettersson, J.K.C. Knowles, T.T. Teeri and T.A. Jones, Science, 265 (1994) 524.
24. A. Törrönen, A. Harkki and J. Rouvinen, EMBO J., 13 (1994) 2493.
25. G.J. Davies, G.G. Dodson, R.E. Hubbard, S.P. Tolley, Z. Dauter, K.S. Wilson, C. Hjort, J.M. Mikkelsen, M. Rasmussen and M. Schülein, Nature, 365 (1993) 362.
26. M. Juy, A.G. Amit, P.M. Alzari, R.J. Poljak, M. Claeyssens, P. Béguin and J.P. Aubert, Nature, 357 (1992) 89.
27. V. Roig, H.P. Fierobe, V. Ducros, M. Czjzek, A. Belaïch, C. Gaudin, J.P. Belaïch and R. Haser, J. Mol. Biol., 233 (1994) 325.
28. V. Ducros, M. Czjzek, A. Belaïch, C. Gaudin, H.P. Fierobe, J.P. Belaïch, G. Davies and R. Haser, Structure (1995) Accepted.
29. D.W. Banner, A.C. Bloomer, G.A. Petsko, D.C. Phillips, C.I. Pogson, I.A. Wilson, P.H. Corran, A.J. Furth, J.D. Milman, R.E. Offord, J.D. Priddle and S.G. Waley, Nature, 255 (1975) 609.
30. C.I. Brändén, Curr. Opin. Struct. Biol., 1 (1991) 978.
31. G.K. Farber, Curr. Opin. Struct. Biol., 3 (1993) 409.
32. G.K. Farber and G.A. Petsko, TIBS, 15 (1990) 228.
33. J. Sygusch, D. Beaudry and M. Allaire, Proc. Natl. Acad. Sci. USA, 84 (1987) 7846.
34. I.M. Mavridis, M.H. Hatada, A. Tulinsky and L. Leboida, J. Mol. Biol., 162 (1982) 419.
35. R.H. Jacobson, X.J. Zhang, R.F. DuBose and B.W. Matthews, Nature, 369 (1994) 761.
36. M. Qian, R. Haser and F. Payan, J. Mol. Biol., 231 (1993) 785.
37. A. Kadziola, J. Abe, B. Svensson and R. Haser, J. Mol. Biol., 239 (1994) 104.
38. B. Mikami, M. Sato, T. Shibata, M. Hirose, S. Aibara, Y. Katsube and Y. Morita, J. Biochem., 112 (1992) 541.
39. C.L. Lawson, R. van Montfort, B. Strokopytov, H.J. Rozeboom, K.H. Kalk, G.E. de Vries, D. Penninga, L. Dijkhuizen and B.W. Dijkstra, J. Mol. Biol., 236 (1994) 590.
40. A. Perrakis, I. Tews, Z. Dauter, A.B. Oppenheim, I. Chet, K.S. Wilson and C.E. Vorgias, Structure, 2 (1994) 1169.
41. A.C. Terwisscha van Scheltinga, K.H. Kalk, J.J. Beintema and B.W. Dijkstra, Structure, 2 (1994) 1181.
42. J.N. Varghese, T.P.J. Garrett, P.M. Colman, L. Chen, P.B. Høj and G.B. Fincher, Proc. Natl. Acad. Sci., 91 (1994) 2785.
43. J. Jenkins, L.L. Leggio, G. Harris and R. Pickersgill, FEBS Lett., 362 (1995) 281.
44. B. Henrissat, I. Callebaud, S. Fabrega, P. Lehn, J.P. Mornon and G. Davies, Proc. Natl. Acad. Sci. USA, (1995), in press.

S.B. Petersen, B. Svensson and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.


Cellulosome domains for novel biotechnological application

Edward A. Bayer a, Ely Morag a, Meir Wilchek a, Raphael Lamed b, Sima Yaron c and Yuval Shoham c

a Department of Biophysics, The Weizmann Institute of Science, Rehovot, Israel
b Department of Molecular Microbiology and Biotechnology, Tel Aviv University
c Department of Food Engineering and Biotechnology, The Technion, Haifa

Abstract

The functional domains of cellulosomes from cellulolytic bacteria can be exploited for an astonishing variety of both conventional and nonconventional applications. Using a combination of molecular biological techniques and chemical probes, the various domains can be mixed and matched, shuffled and scrambled to create new species of functionally altered complexes. For example, biotinylated cellulose-binding domains, attached to cellulose, serve as a basis for a new, simple and inexpensive type of avidin column for use in avidin-biotin technology. In addition, the domains responsible for integrating the catalytic subunits into the cellulosome can be reorganized for incorporation of foreign types of enzyme into heterocellulosomes and/or chimeric complexes. Using this strategy, improved types of "supercellulosome" may be produced which will degrade cellulosic materials more efficiently. In the future, hybrid biomolecules, which comprise selected cellulosomal domains in conjunction with other affinity or enzymatic components, should find broad application in research, medicine and industry.

1. INTRODUCTION

For the greater part of this century, cellulolytic microbes and their cellulase systems have been considered for industrial conversion of cellulosic biomass [1-4]. It eventually became apparent, however, that natural systems are not necessarily compatible with industry. Advanced engineering technique alone is not sufficient to design viable processes for solubilization of cellulosics, and we had to learn more about the enzymes and microbes which mediate cellulolysis. During the past decade, it has been shown that the cellulases of many cellulolytic bacteria are organized into discrete, multienzyme composites, called cellulosomes [5-7]. Their many subunits are composed of numerous functional domains which interact with each other and with the cellulosic substrate (Fig. 1). The definitive cellulosomal subunit represents a distinctive new type of noncatalytic polypeptide, called scaffoldin, which selectively integrates the native cellulase and xylanase subunits into the cohesive complex [8-12]. The scaffoldin subunit contains a cellulose-binding domain (CBD) and numerous attachment sites (called cohesins) which serve to integrate the enzymes into the cohesive complex. The enzymatic subunits (the cellulases and xylanases) contain a catalytic domain and a docking domain (called dockerin), which interacts with one of the cohesins on scaffoldin. The domains of the cellulosomal subunits appear to act independently of one another [13, 14]. The portions of the genes which encode these domains can be "isolated", subcloned into a suitable high-expression vector and expressed in a suitable host system. They can be fused with a variety of enzymes or components of other affinity systems.

Figure 1. Simplified model of a typical cellulosome, based on the complex from Clostridium thermocellum. All of the subunits are composed of multiple domains. Both the scaffoldin subunit (shown as the central structure in white) and some of the catalytic subunits (shaded structures) contain other domains (not shown), the functions of which are still not known.

We are currently using a combination of recombinant gene technology and protein chemistry to produce chimeric constructs of heterologous enzymes and permutated forms of scaffoldin [12]. By virtue of the selective interaction among their binding domains, these components can then be assembled in vitro to form cellulosome hybrids. In this manner, the set of enzymes in the resultant cellulosome can be manipulated to suit the nature of the substrate. For example, hyperactive cellulases and xylanases from different organisms or recombinant forms can be bolstered with selected ligninases, pectinases, etc., which can be incorporated into the same cellulosome complex for efficient degradation of a specific type of cellulosic substrate.

In an extension of this approach, cellulosome components can be exploited for universal application to other fields of the biological sciences. Thus, appropriate cellulosomal domains can either be fused or crosslinked to non-cellulolytic enzymes or components of other affinity systems, such as binding proteins, nucleic acids and virtually any other biologically active material. In this communication, we describe some of our recent studies towards the development of such "designer cellulosomes". In this regard, the production of individual CBD-cohesin constructs is presented. We also describe a process which links the emerging CBD technology with avidin-biotin technology. It is hoped that these newly developed strategies will eventually lead not only to improved utilization of cellulosic biomass but will also give rise to a broad spectrum of unconventional uses in research, medicine and industry.

2. RESULTS AND DISCUSSION

2.1. Expression and purification of cohesin-CBD constructs

What we would like to do in the future, of course, is to use the cellulosomal domains to create new combinations of complexes which contain heterologous mixtures of components. However, before we can start wanton cloning and fusion of the various domains, we have to learn more about how they interact with each other. Thus, in order to examine the functionality and applicability of the cohesin-dockerin interaction, we began by subcloning portions of scaffoldin into simplified forms. In this context, two of the cohesins from the scaffoldin of Clostridium thermocellum were cloned and expressed together with the adjacent CBD in an appropriate Escherichia coli expression vector [15]. The resultant constructs were termed Coh2-CBD, CBD-Coh3 and Coh2-CBD-Coh3. The calculated values for the molecular weights of the three constructs are 37,655, 39,670 and 58,651, respectively. By virtue of the resident CBD, the recombinant constructs could be purified by affinity chromatography on cellulose. The final amount of purified protein obtained for each clone was approximately 10 mg per liter of culture. The SDS-PAGE profile of the three cohesin-CBD constructs is shown in Fig. 2. The Mr values of the purified products were in agreement with the theoretical calculated values.

The purified proteins were assayed by Western blotting for their interaction with the other (catalytic) subunits of the cellulosome (Fig. 3). Both cohesin domains, which differ by about 30% in their primary structure, showed a very similar binding profile to the cellulosomal subunits. Nearly all of the major subunits, with the exception of S2, bound to each cohesin, although some seemed to interact more intensely than others. Most notably, calcium ions dramatically increased the cohesin-induced labeling pattern using either of the cohesin-CBD constructs. EDTA resulted in the complete or near-complete elimination of the interaction between the cohesins and many of the various subunits, whereas the label associated with other subunits was highly reduced. The results suggest that the interaction between cohesins and dockerins is much less specific than hitherto considered. The cohesin-dockerin interaction is currently being analyzed further, both to determine the recognition factors involved in the integration of the cellulosomal subunits and for their eventual utilization. In this context, a variety of dockerin-containing biomolecules are being cloned, and we hope to use the expressed chimerae to form heterologous protein complexes in a selective and controlled manner.

Figure 2. SDS-PAGE of the purified, recombinant cohesin-CBD constructs.
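The molecular weights and the yield quoted in Section 2.1 can be put on a molar basis with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the yield is exactly 10 mg per liter for every construct (the text says "approximately"), and it counts cohesin domains simply from the construct names. The function and variable names are hypothetical.

```python
# Calculated molecular weights (Da) of the three constructs, as quoted in the text.
MW_DA = {
    "Coh2-CBD": 37655,
    "CBD-Coh3": 39670,
    "Coh2-CBD-Coh3": 58651,
}

# Number of cohesin domains per molecule, read off the construct names.
COHESINS_PER_MOLECULE = {
    "Coh2-CBD": 1,
    "CBD-Coh3": 1,
    "Coh2-CBD-Coh3": 2,
}

# Assumption: exactly 10 mg of purified protein per liter of culture.
YIELD_MG_PER_L = 10.0


def molar_yield_umol_per_l(construct, yield_mg_per_l=YIELD_MG_PER_L):
    """Convert a mass yield (mg/L) into a molar yield (umol/L)."""
    # mg/L divided by g/mol gives mmol/L; multiply by 1000 to get umol/L.
    return yield_mg_per_l / MW_DA[construct] * 1000.0


def cohesin_sites_umol_per_l(construct, yield_mg_per_l=YIELD_MG_PER_L):
    """Molar concentration of cohesin domains obtained per liter of culture."""
    return COHESINS_PER_MOLECULE[construct] * molar_yield_umol_per_l(
        construct, yield_mg_per_l
    )


if __name__ == "__main__":
    for name in MW_DA:
        print(
            f"{name}: {molar_yield_umol_per_l(name):.3f} umol protein/L, "
            f"{cohesin_sites_umol_per_l(name):.3f} umol cohesin/L"
        )
```

Under these assumptions each liter of culture gives roughly a quarter of a micromole of the single-cohesin constructs. The larger two-cohesin construct gives fewer molecules per liter but more cohesin sites.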

Figure 3. The effect of calcium and EDTA on the interaction of the CBD-Coh3 construct with the cellulosomal subunits. CS, the cellulosome preparation (2.7 mg of protein per lane), was subjected to SDS-PAGE; the separated subunits were blotted onto nitrocellulose and probed with biotinylated forms of the cohesin-CBD constructs (0.4 mg per sample). The blots were developed using avidin-peroxidase complexes. Identical samples were pretreated with 5 mM EDTA (+ EDTA) or 15 mM CaCl2 (+ Ca) and treated similarly. A cellulosome control sample was stained for protein using Coomassie brilliant blue (CBB). Essentially identical results were obtained using the Coh2-CBD and Coh2-CBD-Coh3 constructs.

2.2. Biotinyl CBD and its use in avidin-biotin technology

In a second approach, we developed an alternative strategy to the emerging CBD technology. Past approaches have centered on the production of hybrid proteins, consisting of the CBD combined with enzymes, binding proteins and the like [16-21]. As an alternative, we have combined the avidin-biotin system [22, 23] with the CBD, in order to prepare novel types of cellulosic affinity columns. In this context, the single cysteine of the cloned CBD from the scaffoldin of C. thermocellum was modified with maleimidopropionyl biocytin, a thiol-specific biotinylating reagent [24]. The reaction is shown schematically in Fig. 4. The biotin moiety is thus bound irreversibly to the CBD molecule. The biotinylated CBD was shown to retain its cellulose-binding properties and also bound tightly to avidin, indicating that the biotin moiety was in an exposed position.

Since we now have a stable avidin-containing cellulose column, we can use this system for the further immobilization of other biotinylated macromolecules to cellulose via the avidin-biotin bridge. In this manner, biotinylated enzymes can be immobilized to the cellulose column for use as an enzyme reactor. Alternatively, biotinylated antigens or biotinylated antibody can be immobilized for immunochromatography. Similarly, biotinylated lectins can be used for isolation of glycoconjugates and polysaccharides, and vice versa. In fact, any biotinylated substance can essentially be bound to such a column, and all of the major advantages of avidin-biotin-based purifications and separations are inherent in this system. Alternatively, the avidin-bound cellulosic matrix can be applied as a diagnostic base for the detection or quantification of target material. Furthermore, biotinylated enzymes can be immobilized on such cellulosic matrices for use as an enzyme reactor.

As an initial example of this general approach, biotinylated protein A was coupled to such an avidin-cellulose column, and the resultant protein A-cellulose matrix was used to isolate antibodies from whole antiserum (Fig. 5). Sequential application to a cellulose column of the modified CBD, avidin and biotinylated protein A provided an affinity resin which was used to adsorb anti-transferrin antibodies directly from rabbit serum [25]. The resultant antibody preparation was pure according to standard assays (Fig. 6). Only minimal traces of leakage were observed. In fact, the only visible contamination of the preparation appeared to be trace levels of CBD which apparently leaked from the column. Avidin and protein A were not observed in the gel. The high-molecular-weight serum components, which accompanied the isolation of antibody by the standard protein A affinity column, were absent in the sample prepared by affinity chromatography on the cellulosic matrix. The antibody could thus be removed efficiently without drastically disturbing any of the interactions among the primary components of the column.

[Reaction scheme: CBD-SH + 3-(N-maleimidopropionyl) biocytin → biotinylated CBD]

Figure 4. Scheme showing the mode of biotinylation of the cellulosomal scaffoldin CBD from C. thermocellum. The reagent binds covalently to the single cysteine (residue 62) of the CBD molecule.

Figure 5. Scheme for isolation of antibodies using biotinyl protein A, coupled to cellulosebound biotinyl CBD using a modified avidin bridge.

There are added attractions over many other previously established strategies, which involve the use of conventional affinity matrices or the production of recombinant CBD-containing fusion proteins. First, the new class of affinity column itself is inexpensive. The matrix, cellulose, is one of the cheapest and most prevalent materials available. In addition, avidin, a component of egg white, is today a relatively cheap protein. As described above, the expressed CBD from C. thermocellum also has the potential of being a relatively inexpensive protein, since it can be produced in high quantities and is readily solubilized without the involvement of chaotropic agents. Other types and sources of CBD can, of course, be substituted for the one described in this communication. In cases where a CBD does not have an exposed cysteine, the more standard class of biotinylation reagents, i.e., an appropriate N-hydroxysuccinimide ester, which labels lysines rather than cysteines, can be employed. Another advantage in using these columns is that chemical activation or other potentially hazardous procedures are precluded. Essentially, one simply percolates the various solutions through the column.

In short, the combination of avidin-biotin and CBD technologies results in cellulosic affinity matrices which are advantageous in that they are safe, simple, effective and cheap! In addition, the preparation of novel types of cellulosome-like complexes should provide a new dimension of versatility for use in various areas of biotechnology. These new procedures will be useful in a variety of affinity separations and purifications, immunoassays and other diagnostics, and as enzyme reactors. It is therefore hoped that these newly developed strategies will eventually lead not only to improved utilization of cellulosic biomass but will also give rise to a broad spectrum of unconventional uses in research, medicine and industry.


Figure 6. Direct isolation of antibodies from serum by sequential application of biotinyl CBD, avidin, and biotinyl protein A on a cellulose support. Applied, whole serum. CBD, purified standard. Effluent, purified IgG by procedure described schematically in Fig. 5. IgG standard, purified IgG by conventional affinity chromatography (on CNBr-activated Sepharose-protein A column).

3. REFERENCES

1. E.A. Bayer and R. Lamed, Biodegradation, 3 (1992) 171.
2. P. Béguin and J.-P. Aubert, FEMS Microbiol. Rev., 13 (1994) 25.
3. H.J. Gilbert and G.P. Hazlewood, J. Gen. Microbiol., 139 (1993) 187.
4. L.G. Ljungdahl and K.-E. Eriksson, Advan. Microb. Ecol., 8 (1985) 237.
5. R. Lamed, E. Setter, R. Kenig and E.A. Bayer, Biotechnol. Bioeng. Symp., 13 (1983) 163.
6. R. Lamed and E.A. Bayer, Advan. Appl. Microbiol., 33 (1988) 1.
7. R. Lamed and E.A. Bayer, in Genetics, Biochemistry and Ecology of Lignocellulose Degradation, K. Shimada, S. Hoshino, K. Ohmiya, K. Sakka, Y. Kobayashi and S. Karita (eds.), p. 1, Uni Publishers Co., Ltd., Tokyo, Japan, 1993.
8. E.A. Bayer, E. Setter and R. Lamed, J. Bacteriol., 163 (1985) 552.
9. J.H.D. Wu, W.H. Orme-Johnson and A.L. Demain, Biochemistry, 27 (1988) 1703.
10. O. Shoseyov, M. Takagi, M.A. Goldstein and R.H. Doi, Proc. Natl. Acad. Sci. USA, 89 (1992) 3483.
11. U.T. Gerngross, M.P.M. Romaniec, T. Kobayashi, N.S. Huskisson and A.L. Demain, Mol. Microbiol., 8 (1993) 325.
12. E.A. Bayer, E. Morag and R. Lamed, Trends Biotechnol., 12 (1994) 378.
13. K. Tokatlidis, P. Dhurjati and P. Béguin, Protein Eng., 6 (1993) 947.
14. M. Takagi, S. Hashida, M.A. Goldstein and R.H. Doi, J. Bacteriol., 175 (1993) 7119.
15. S. Yaron, E. Morag, E.A. Bayer, R. Lamed and Y. Shoham, FEBS Lett., 360 (1995) 121.
16. E. Ong, J.M. Greenwood, N.R. Gilkes, D.G. Kilburn, R.C. Miller, Jr. and R.A.J. Warren, Trends Biotechnol., 7 (1989) 239.
17. D.M. Poole, A.J. Durrant, G.P. Hazlewood and H.J. Gilbert, Biochem. J., 279 (1991) 787.
18. E. Ong, N.R. Gilkes, R.C. Miller, Jr., R.A.J. Warren and D.G. Kilburn, Enzyme Microb. Technol., 13 (1991) 59.
19. J.M. Greenwood, E. Ong, N.R. Gilkes, R.A.J. Warren, R.C. Miller, Jr. and D.G. Kilburn, Protein Eng., 5 (1992) 361.
20. C. Ramírez, J. Fung, R.C. Miller, Jr., R.A.J. Warren and D.G. Kilburn, Bio/technology, 11 (1993) 1570.
21. K.D. Le, N.R. Gilkes, D.G. Kilburn, R.C.J. Miller, J.N. Saddler and R.A.J. Warren, Enzyme Microb. Technol., 16 (1994) 496.
22. E.A. Bayer and M. Wilchek, Trends Biochem. Sci., 3 (1978) N237.
23. M. Wilchek and E.A. Bayer (eds.), Avidin-Biotin Technology, Academic Press, San Diego, 1990.
24. E.A. Bayer, M.G. Zalis and M. Wilchek, Anal. Biochem., 149 (1985) 529.
25. E. Morag, A. Lapidot, D. Govorko, R. Lamed, M. Wilchek, E.A. Bayer and Y. Shoham, Appl. Environ. Microbiol., 61 (1995) 1980.


Interactions of cellulases from Cellulomonas fimi with cellulose

N. Din, J.B. Coutinho, N.R. Gilkes, E. Jervis, D.G. Kilburn, R.C. Miller Jr., E. Ong, P. Tomme and R.A.J. Warren

Department of Microbiology and Immunology, and Protein Engineering Network of Centres of Excellence, University of British Columbia, Vancouver, B.C., Canada V6T 1Z3

Abstract

The amino acid sequences of eight β-1,4-glycanases from Cellulomonas fimi are known from the nucleotide sequences of the corresponding genes. The enzymes, four endoglucanases, two cellobiohydrolases, a xylanase and a mixed function exoglucanase-xylanase, are all modular proteins comprising from two to six modules or domains. All of them contain a catalytic domain (CD) and a cellulose-binding domain (CBD). The CDs come from six of the families of glycoside hydrolases; the CBDs from three of the families of CBDs, although all but one of the enzymes have a CBD from family II. The CDs and the CBDs function independently of each other when separated by proteolysis or genetic engineering. The enzymes interact with cellulose/xylan in two ways. The CDs have weak affinity for substrate, relative to the CBDs, and catalyze hydrolysis of glycosidic bonds with inversion or retention of anomeric configuration, depending on the CD. The CBDs have much greater affinities for cellulose, with Kds of the order of 0.5-1.0 µM for the family II CBDs. The family II CBDs, with the exception of CBDXylD from xylanase D, adsorb to both crystalline and amorphous cellulose; CBDXylD adsorbs only to crystalline cellulose. The family IV CBD from endoglucanase CenC (CBDCenC) adsorbs to amorphous but not to crystalline cellulose. Adsorption to cellulose is strongly dependent on aromatic amino acid residues, especially tryptophans, which are conserved in nearly all family II CBDs. CBDCex, from the exoglucanase-xylanase Cex, is a β-barrel in solution, with extensive β-sheet structure; two of the conserved tryptophans which participate in binding to cellulose are adjacent in space and exposed to solvent. The isolated CBDCenA, from endoglucanase CenA, has a disruptive effect on cotton fibres in spite of lacking hydrolytic activity. CBDCenA interacts synergistically with CDCenA in the release of reducing sugars from cotton fibres. The binding of the family II CBDs to cellulose is stable enough for them to be used as affinity tags for protein purification and for enzyme immobilization.
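To give a feel for what a Kd of 0.5-1.0 µM means in practice, the following sketch evaluates a single-site (Langmuir) binding isotherm. The choice of this model is an assumption made here for illustration; the abstract quotes only the Kd range and does not say which binding model was used. The function name and the example concentrations are hypothetical.

```python
def fraction_occupied(free_conc_um, kd_um):
    """Fraction of binding sites occupied at equilibrium for simple
    single-site binding: theta = [P] / (Kd + [P]).

    free_conc_um: free CBD concentration in micromolar
    kd_um: dissociation constant in micromolar
    """
    if free_conc_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_conc_um / (kd_um + free_conc_um)


if __name__ == "__main__":
    # Kd values spanning the range quoted for the family II CBDs.
    for kd in (0.5, 1.0):
        for conc in (0.1, 0.5, 1.0, 5.0):
            theta = fraction_occupied(conc, kd)
            print(f"Kd = {kd} uM, free CBD = {conc} uM: {theta:.2f} of sites occupied")
```

In this simple model half of the sites are occupied when the free CBD concentration equals Kd, so sub-micromolar Kds imply substantial binding at low micromolar protein concentrations.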

1. INTRODUCTION

Many bacteria and fungi can degrade plant cell walls. The cell walls are complex, comprising several polymers: cellulose, hemicelluloses, xylans, and lignin. Not surprisingly, microorganisms produce correspondingly complex enzyme systems for the degradation of plant cell wall material. The enzyme systems from a number of bacteria and fungi are now understood in considerable detail. However, much of this knowledge has been acquired by examining the degradation of isolated components of the cell walls. The structures of the intact walls are not understood in sufficient detail to allow meaningful analysis of their interaction with and degradation by the microbial enzyme systems. In spite of this, it is clear that the systems from diverse bacteria and fungi which degrade cellulose and xylans have many features in common [1-5]. The bacterium Cellulomonas fimi will serve as a paradigm to discuss the interactions of the components of a system with cellulose.

2. THE CELLULASE SYSTEM OF CELLULOMONAS FIMI

The β-1,4-glycanases from C. fimi characterized to date are endoglucanases A, B, C and D (CenA, CenB, CenC and CenD, respectively), exocellobiohydrolases A and B (CbhA and CbhB, respectively), and xylanases A and D (Cex and XylD, respectively) [6-13]. All are modular proteins of varying degrees of complexity (Fig. 1), but with two features in common: a catalytic domain (CD) and a cellulose-binding domain (CBD) which can function independently [13-16]. In four of the enzymes, CenB, CenD, CbhA and CbhB, fibronectin type III (Fn3) repeats separate the N-terminal CD from the C-terminal CBD. The CDs of the enzymes come from six of the families of glycoside hydrolases [17,18]; all of the enzymes, except CenC, have an N- or C-terminal CBD from family II of CBDs [5]; CenC has tandem CBDs from family IV at its N-terminus; CenB and XylD each have a second, internal CBD from families III and II, respectively. Cex and XylD are clearly xylanases; however, Cex, but not XylD, has low activity on cellulose. Nonetheless, like several other bacterial xylanases [19], they have CBDs. Similar systems are produced by related bacteria [20,21]. C. fimi probably produces other β-1,4-glycanases. The unrelated bacterium, Clostridium thermocellum, for example, produces twenty or more β-1,4-glycanases [22].

3. ANALYSIS OF INTERACTIONS WITH CELLULOSE

Currently, individual components of plant cell walls, especially cellulose, xylans and mannans, are used as substrates for β-1,4-glycanases. Most of the work on the enzymes from C. fimi has been done with cellulose because seven of them hydrolyze it, and all of them bind to it. However, analysis of cellulose hydrolysis and binding of the enzymes to cellulose is complicated by the uncertain nature of the substrate. In cellulose I, the naturally occurring form, the individual glucose polymers are arranged in parallel chains [23], held together by intra- and intermolecular hydrogen bonds. The degree of crystallinity of the cellulose, however, varies with both source and the method of preparation. The cellulose obtained from algae such as Valonia spp. is highly crystalline but of limited availability. Cellulose from the bacterium Acetobacter xylinum is slightly less crystalline but more readily available; in fact, it is now available commercially. Bacterial microcrystalline cellulose (BMCC) is produced from this source. Most other commercially available forms of cellulose, such as cotton fibres and Avicel, generally have much lower and more variable degrees of crystallinity than Valonia cellulose and BMCC [24]. Phosphoric acid-swollen cellulose (PASC) is generally assumed to be amorphous.

Analysis of the interaction(s) of individual enzymes with cellulose is complicated further by their modular structures. Both the CD and the CBD of each enzyme from C. fimi interact with cellulose. Fortunately, the CDs and CBDs retain their functions when separated by proteolysis [14] or gene manipulation [13, 15, 16, 25, 26], allowing their properties to be analysed either alone or in combination.

[Figure 1 graphic: domain maps for CenA, CenB, CenC, CenD, Cex and XylD. Legend: linker; catalytic domain; Fn3 module; CBD; other.]

Figure 1. Schematic diagram showing the arrangement of functional domains in C. fimi endoglucanases (CenA, CenB, CenC and CenD), xylanases (Cex and XylD) and cellobiohydrolases (CbhA and CbhB). The catalytic domains are numbered according to the β-1,4-glycanase family to which they belong [18].

4. HYDROLYTIC ACTIVITIES OF THE ENZYMES

The interactions of the enzymes with cellulose are complex and difficult to analyze kinetically because of the insolubility of the substrate. β-1,4-linked polymers of glucose are insoluble once the chain length exceeds seven glucose units. In native cellulose, many cellulose molecules are held together in crystalline arrays by hydrogen bonds. Enzymes interact with this insoluble substrate. Both products of hydrolysis by endoglucanases will be insoluble for the most part. Do endoglucanases desorb from the cellulose between each hydrolytic event? In contrast, exocellobiohydrolases act processively from the ends of cellulose molecules in the insoluble substrate, producing both an insoluble product and a soluble product, cellobiose. Presumably, they remain associated with the substrate between hydrolytic events.

The three-dimensional structure of only one CD from C. fimi, that of Cex, is known [27]. However, several other CDs from C. fimi belong to families for which one or more structures are known. CenA and CbhA are in family 6, which includes two CDs whose structures are known: CDCBHII from Trichoderma reesei [28] and CDE2 from Thermomonospora fusca [29]. CenB and CenC are in family 9, which includes a CD whose structure is known: CDCelD from Clostridium thermocellum [30]. CenD is in family 5, which includes two CDs whose structures are known: CDCelCCA from Clostridium cellulolyticum [53] and CDCelC from Clostridium thermocellum [54]. The catalytic sites of all the CDs for which the structures are known accommodate a single cellulose molecule. It can be assumed that this also applies to all CDs from C. fimi.

All of the enzymes from C. fimi that we have analyzed hydrolyze PASC, and all of them except Cex hydrolyze BMCC (Table 1). Cex is much more active on xylans, but it does hydrolyze cellulose. The enzymes differ widely in their activities on both BMCC and PASC. CbhA and CbhB appear to have no marked preference for either substrate; CenA and CenC appear to prefer PASC; the most active enzymes on BMCC, CenB and CenD, although about six to eight times more active on PASC than on BMCC, are less active than CenA and CenC on PASC. Excluding Cex, which is truly a xylanase, CbhA and CbhB are the least active of the enzymes. Preliminary evidence indicates that CbhA is the most abundant of the enzymes [9]. The cellulase system from C. fimi contains at least two types of endoglucanase: one type, exemplified by CenA and CenC, which degrades amorphous cellulose; the other, exemplified by CenB and CenD, which degrades both amorphous and crystalline cellulose.
It also contains two types of exocellobiohydrolase: CbhA degrades cellulose molecules from their non-reducing ends; CbhB degrades them from their reducing ends. Both are inverting enzymes. CbhB produces cellotetraose from cellohexaose; the ratio of α- and β-cellotetraoses after 5 min incubation is 1:0.9; it changes to the equilibrium ratio of 1:2 after mutarotation. This is consistent with hydrolysis from the reducing end. The ratio of α- and β-cellotetraoses released from cellohexaose by CbhA is 1:2 both before and after allowing time for mutarotation, consistent with hydrolysis from the non-reducing end. Presumably, the presence of both CbhA and CbhB allows hydrolysis to proceed in both directions from a site resulting from endoglucanase activity.

5. ADSORPTION OF THE CBDs TO CELLULOSE

All of the CBDs of family II in the enzymes from C. fimi, with the exception of those from XylD, bind to both crystalline and amorphous cellulose. The C-terminal CBD of XylD binds only to crystalline cellulose [13]; the internal CBD binds to insoluble xylan but not to cellulose [32]. The CBDs of family IV from CenC bind to amorphous but not to crystalline cellulose [16]. The internal CBD of family III from CenB binds to Avicel, which is about 60% crystalline, but its binding to crystalline and amorphous cellulose has not been determined [15].

Although CDCenA hydrolyzes cellulose, it does not form a stable complex with the cellulose [25]. Does this mean that the CD dissociates from the cellulose between each hydrolytic event? CenA is an endoglucanase; it would be expected not to be processive. The binding of CBDCenA to cellulose is very stable (see below); this could allow the CD to remain close to the substrate between hydrolytic events. Removal of its CBD does not affect the hydrolytic activity of CenA on soluble substrates or on amorphous cellulose; it reduces but does not abolish activity on crystalline cellulose [14, 31]. The activity of CDCenA is also affected by substitution of CBDCenA with CBDCenC: the activity on amorphous cellulose is unchanged, but the activity on crystalline cellulose is no better than that of the CD alone [33]. Clearly, the nature of the CBD affects the activities of CDCenA on different forms of cellulose.

Table 1
Activities of C. fimi β-1,4-glucanases and xylanases on crystalline and amorphous celluloses

Enzyme    Bacterial microcrystalline cellulose    Phosphoric acid-swollen cellulose
CenA       0.2                                    244
CenB      10.9                                     66
CenC       1.6                                    114
CenD       9.7                                     81
CbhA       1.4                                      1.2
CbhB       0.7                                      0.4
Cex        ND                                      27

Activities were determined at 37 °C, pH 7, and are expressed as moles glucose equivalents released min⁻¹ (mole enzyme)⁻¹. ND: not detectable.
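The substrate preferences read off Table 1 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary and function names are hypothetical, the values are copied from Table 1, and `None` stands for the "ND" entry.

```python
# Activities from Table 1, in mol glucose equivalents released per min per mol enzyme.
# Each tuple is (BMCC, PASC); None marks "not detectable" (Cex on BMCC).
ACTIVITY = {
    "CenA": (0.2, 244.0),
    "CenB": (10.9, 66.0),
    "CenC": (1.6, 114.0),
    "CenD": (9.7, 81.0),
    "CbhA": (1.4, 1.2),
    "CbhB": (0.7, 0.4),
    "Cex": (None, 27.0),
}


def pasc_to_bmcc_ratio(bmcc, pasc):
    """Return the PASC/BMCC activity ratio, or None if BMCC activity was not detectable."""
    if bmcc is None:
        return None
    return pasc / bmcc


# Ratio > 1 means the enzyme is more active on amorphous (PASC) than on crystalline (BMCC) cellulose.
RATIOS = {name: pasc_to_bmcc_ratio(*values) for name, values in ACTIVITY.items()}

if __name__ == "__main__":
    for name, ratio in RATIOS.items():
        label = "n/a (BMCC activity not detectable)" if ratio is None else f"{ratio:.1f}"
        print(f"{name}: PASC/BMCC = {label}")
```

On these figures CenA and CenC favour PASC by roughly three and two orders of magnitude, CenB and CenD by roughly six- to eightfold, and CbhA and CbhB give ratios slightly below 1.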

6. NATURE OF THE BINDING OF CBDs TO CELLULOSE

The adsorption of CenA and Cex, and to a lesser extent of CenC, to cellulose is better understood than that of the other enzymes. Manipulation of the genes encoding CenA and Cex is relatively straightforward, allowing production in quantity of their CBDs [25, 34]. The CDs are obtainable by proteolysis of the holoenzymes and removal of the CBDs by adsorption to cellulose [14]. CenA and CBDCenA adsorb along the length of Valonia cellulose microcrystals, with an apparent preference for the 110 crystal faces, or edges [35], an indication that the CBD does adsorb to crystalline cellulose, not just to amorphous regions in the crystals. Scatchard plots for the adsorption of CenA, CBDCenA, Cex and CBDCex to BMCC are concave upwards, indicating a complex interaction of the polypeptides with the BMCC. Based on a model of overlapping potential binding sites, each comprising multiple lattice units (cellobiose residues) on the cellulose surface, the relative association constants (Kr) for the polypeptides were 40.5, 45.3, 33.3 and 38.5 L (g cellulose)⁻¹, respectively [25, 26]. Equilibration is complete in less than 20 s, and there is no detectable desorption of the proteins during the following 16 h. There is no detectable binding of CDCenA or CDCex to BMCC [14, 25]. It is estimated that CBDCenA occupies about 35 cellobiose units on the 110 crystal face of BMCC [25].

As might be expected, the CBDs retain their affinities for cellulose when fused to heterologous polypeptides by genetic manipulation [36, 37]. A further indication of the tightness of binding is given by Abg-CBDCex, a fusion of a β-glucosidase (Abg) from an Agrobacterium sp. and CBDCex. When immobilized by adsorption to cellulose, Abg-CBDCex does not desorb from the column during continuous operation for 10 days at 37 °C [38]. Furthermore, by fusion to a homologue of Abg from a thermophilic bacterium, it can be shown that CBDCex remains tightly bound at 70 °C [E. Ong; unpublished observation].

A striking characteristic of CBDs of family II, which includes CBDCenA and CBDCex, is four tryptophans that are conserved in nearly all members of the family. Mutations W14A and W68A in CBDCenA (corresponding to W17 and W72 of CBDCex, Fig. 2) reduce its affinity for BMCC 50-fold and 30-fold, respectively [39], an indication of the importance of these residues in the adsorption to cellulose. Mutation of any of the four conserved tryptophans in another CBD of family II, that of xylanase A from Pseudomonas fluorescens subsp. cellulosa, also reduces binding, but quantitative data are not given [40]. CBDCex is a nine-stranded, anti-parallel β-barrel [41]. Three of the conserved tryptophans, including those corresponding to W14 and W68 in CBDCenA, are exposed on one side of the barrel (Fig. 2). The conditions required for desorption of the CBDs vary with the nature of the CBD and with the nature of the polypeptide to which it is fused. Nonetheless, hydrophobic interactions with the cellulose surface, which may involve the tryptophans, appear to be involved in binding.
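Why a concave-upward Scatchard plot signals something more complex than simple one-site adsorption can be seen numerically. The sketch below is a generic textbook illustration, not the overlapping-site lattice model used in the cited work: it compares one class of independent sites with two classes of different affinity. All capacities and association constants are invented for illustration.

```python
def bound(free, sites):
    """Total bound ligand for independent site classes given as [(capacity, Ka), ...]."""
    return sum(n * ka * free / (1.0 + ka * free) for n, ka in sites)


def scatchard(free_values, sites):
    """Return Scatchard coordinates (B, B/F) for each free concentration F."""
    return [(bound(f, sites), bound(f, sites) / f) for f in free_values]


def slopes(points):
    """Slopes of the line segments joining consecutive Scatchard points."""
    return [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(points, points[1:])]


def is_concave_upward(points):
    """True if segment slopes rise steadily, i.e. the plot bends upward."""
    s = slopes(points)
    return all(later > earlier for earlier, later in zip(s, s[1:]))


# Hypothetical free concentrations and site parameters (arbitrary units).
FREE = [0.01 * 2 ** k for k in range(12)]
ONE_CLASS = [(1.0, 10.0)]                  # single class: straight line, slope -Ka
TWO_CLASSES = [(0.5, 100.0), (0.5, 1.0)]   # tight plus weak sites: curved plot

if __name__ == "__main__":
    print("one class slopes:", [round(s, 3) for s in slopes(scatchard(FREE, ONE_CLASS))])
    print("two classes concave upward:", is_concave_upward(scatchard(FREE, TWO_CLASSES)))
```

A single class of sites gives a straight line of slope -Ka, so curvature of the kind described for BMCC rules out that simplest model; the choice among more elaborate models cannot be made from the curvature alone.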

Figure 2. Ribbon diagram of C. fimi CBDCex from the structure determined by NMR spectroscopy [41]. Three tryptophan residues implicated in binding to cellulose and a disulfide bridge conserved in most family II CBDs are shown.

CBDCenA disrupts cotton fibres, releasing small particles, even though it is devoid of hydrolytic activity [42]; it interacts synergistically with CDCenA in the hydrolysis of cellulose [43]; and it prevents the flocculation of microcrystalline cellulose [35]. Although the CBD will serve to increase the concentration of the CD on the surface of the substrate, it is clear that it may further facilitate hydrolysis by altering that surface. The apparent tightness of binding is paradoxical, however, if hydrolysis is to be continuous and progressive, as it appears to be.

7. FIBRONECTIN TYPE III (Fn3) REPEATS

Fn3 repeats in CenB, CenD, CbhA and CbhB are approximately 100 amino acids long. Their role(s), if any, in cellulose hydrolysis is unknown. In all four enzymes, the Fn3 repeats separate an N-terminal CD from a C-terminal CBD. The Fn3 repeats in fibronectin have clearly defined structures [44-46]. The repeats in the enzymes from C. fimi may serve simply as linkers between the CBDs and CDs. If they are flexible, a CD could have some mobility on the cellulose surface in spite of being anchored securely by its CBD. An intriguing possibility is that the flexibility is enhanced by reversible unfolding of the Fn3 repeats. The stretch and elasticity of titin and fibronectin are provided in part by the reversible unfolding of Fn3 repeats [47]. The length of an Fn3 repeat is increased 7-fold by unfolding. If the Fn3 repeats in the enzymes from C. fimi also undergo reversible unfolding, the CDs could move considerable distances over the cellulose surface while remaining anchored by the CBDs. Fn3 repeats are uncommon in β-1,4-glycanases, however, having been observed only in cellulases from Cellulomonas spp. and chitinases from Bacillus spp. [48].

8. CONCLUSIONS

The interactions with cellulose of some of the enzymes of the cellulase complex from C. fimi are understood superficially. The CBDs bind the CDs to cellulose, apparently irreversibly. The CDs, which have very low affinities for cellulose, hydrolyze β-1,4-glycosidic bonds within individual cellulose molecules. The catalytic mechanisms of two of the enzymes, Cex and CenA, are understood in some detail [49-52]. In CenA, at least, the CBD and the CD interact synergistically [43]. Details are lacking, however, of the molecular interactions between the CBDs and the cellulose surface, of the events occurring between adsorption and hydrolysis of glycosidic bonds, which may occur concomitantly, and of the processivity of hydrolysis by enzymes which behave as if irreversibly bound to the substrate. These are challenging problems, compounded by the indeterminate nature of the substrate. The question of interactions between the components of the system has not been raised.

9. ACKNOWLEDGEMENTS

This work was supported by the Natural Sciences and Engineering Research Council of Canada and the Protein Engineering Network of Centres of Excellence.

10. REFERENCES

1. P. Béguin, Annu. Rev. Microbiol., 44 (1990) 219.
2. H.J. Gilbert and G.P. Hazlewood, J. Gen. Microbiol., 139 (1993) 187.
3. P. Béguin and J.-P. Aubert, FEMS Microbiol. Rev., 13 (1994) 25.
4. B. Henrissat, Cellulose, 1 (1994) 169.
5. P. Tomme, R.A.J. Warren and N.R. Gilkes, Adv. Microb. Physiol., 37 (1995) 1.
6. W.K.R. Wong, B. Gerhard, Z.M. Guo, D.G. Kilburn, R.A.J. Warren and R.C. Miller, Jr., Gene, 44 (1986) 315.
7. A. Meinke, N.R. Gilkes, D.G. Kilburn, R.C. Miller, Jr. and R.A.J. Warren, J. Bacteriol., 173 (1991) 308.
8. J.B. Coutinho, B. Moser, D.G. Kilburn, R.A.J. Warren and R.C. Miller, Jr., Mol. Microbiol., 5 (1991) 1221.
9. A. Meinke, N.R. Gilkes, D.G. Kilburn, R.C. Miller, Jr. and R.A.J. Warren, J. Bacteriol., 175 (1993) 1910.
10. A. Meinke, N.R. Gilkes, E. Kwan, D.G. Kilburn, R.A.J. Warren and R.C. Miller, Jr., Mol. Microbiol., 12 (1994) 413.
11. H. Shen, N.R. Gilkes, D.G. Kilburn, R.C. Miller, Jr. and R.A.J. Warren, Biochem. J., 311 (1995) 67.
12. G.P. O'Neill, S.H. Goh, R.A.J. Warren, D.G. Kilburn and R.C. Miller, Jr., Gene, 44 (1986) 325.
13. S.J. Millward-Sadler, D.M. Poole, B. Henrissat, G.P. Hazlewood, J.H. Clarke and H.J. Gilbert, Mol. Microbiol., 11 (1994) 375.
14. N.R. Gilkes, R.A.J. Warren, R.C. Miller, Jr. and D.G. Kilburn, J. Biol. Chem., 263 (1988) 10401.
15. A. Meinke, N.R. Gilkes, D.G. Kilburn, R.C. Miller, Jr. and R.A.J. Warren, J. Bacteriol., 173 (1991) 7126.
16. J.B. Coutinho, N.R. Gilkes, R.A.J. Warren, D.G. Kilburn and R.C. Miller, Jr., Mol. Microbiol., 6 (1992) 1243.
17. B. Henrissat, Biochem. J., 280 (1991) 309.
18. B. Henrissat and A. Bairoch, Biochem. J., 293 (1993) 781.
19. H.J. Gilbert and G.P. Hazlewood, J. Gen. Microbiol., 139 (1993) 187.
20. D.B. Wilson, Crit. Rev. Biotechnol., 12 (1992) 45.
21. G.P. Hazlewood, J.I. Laurie, L.M.A. Ferreira and H.J. Gilbert, J. Appl. Bacteriol., 72 (1992) 244.
22. P. Béguin, J. Millet and J.-P. Aubert, FEMS Microbiol. Lett., 100 (1992) 523.
23. A. Sarko and R. Muggli, Macromolecules, 7 (1974) 486.
24. A.K. Kulshreshtha and N.E. Dweltz, J. Polym. Sci., 11 (1973) 487.
25. N.R. Gilkes, E. Jervis, B. Henrissat, B. Tekant, R.C. Miller, Jr., R.A.J. Warren and D.G. Kilburn, J. Biol. Chem., 267 (1992) 6743.
26. E. Ong, N.R. Gilkes, R.C. Miller, Jr., R.A.J. Warren and D.G. Kilburn, Biotechnol. Bioeng., 42 (1993) 401.
27. A. White, S.G. Withers, N.R. Gilkes and D. Rose, Biochemistry, 33 (1994) 12546.
28. J. Rouvinen, T. Bergfors, T. Teeri, J.K.C. Knowles and T.A. Jones, Science, 249 (1990) 380.
29. M. Spezio, D.B. Wilson and P.A. Karplus, Biochemistry, 32 (1993) 9906.
30. M. Juy, A.G. Amit, P.M. Alzari, R.J. Poljak, M. Claeyssens, P. Béguin and J.-P. Aubert, Nature, 357 (1992) 89.
31. K.M. Kleman-Leyer, N.R. Gilkes, R.C. Miller, Jr. and T.K. Kirk, Biochem. J., 302 (1994) 463.
32. G.W. Black, G.P. Hazlewood, S.J. Millward-Sadler, J.I. Laurie and H.J. Gilbert, Biochem. J., 307 (1995) 191.
33. J.B. Coutinho, N.R. Gilkes, D.G. Kilburn, R.A.J. Warren and R.C. Miller, Jr., FEMS Microbiol. Lett., 113 (1993) 211.
34. E. Ong, N.R. Gilkes, R.C. Miller, Jr., R.A.J. Warren and D.G. Kilburn, Biotechnol. Bioeng., 42 (1993) 401.
35. N.R. Gilkes, D.G. Kilburn, R.C. Miller, Jr., R.A.J. Warren, J. Sugiyama, H. Chanzy and B. Henrissat, Int. J. Biol. Macromol., 15 (1993) 347.
36. J.M. Greenwood, N.R. Gilkes, D.G. Kilburn, R.C. Miller, Jr. and R.A.J. Warren, FEBS Lett., 244 (1989) 127.
37. E. Ong, N.R. Gilkes, R.A.J. Warren, R.C. Miller, Jr. and D.G. Kilburn, Biotechnology, 7 (1989) 604.
38. E. Ong, N.R. Gilkes, R.C. Miller, Jr., R.A.J. Warren and D.G. Kilburn, Enzyme Microb. Technol., 13 (1991) 59.
39. N. Din, I.J. Forsythe, L.D. Burtnick, N.R. Gilkes, R.C. Miller, Jr., R.A.J. Warren and D.G. Kilburn, Mol. Microbiol., 11 (1994) 747.
40. D.M. Poole, G.P. Hazlewood, N.S. Huskisson, R. Virden and H.J. Gilbert, FEMS Microbiol. Lett., 106 (1993) 77.
41. G.-Y. Xu, E. Ong, N.R. Gilkes, D.G. Kilburn, D.R. Muhandiram, M. Harris-Brandts, J. Carver, L.E. Kay and T.S. Harvey, Biochemistry, in press.
42. N. Din, N.R. Gilkes, B. Tekant, R.C. Miller, Jr., R.A.J. Warren and D.G. Kilburn, Biotechnology, 9 (1991) 1096.
43. N. Din, H.G. Damude, N.R. Gilkes, R.C. Miller, Jr., R.A.J. Warren and D.G. Kilburn, Proc. Nat. Acad. Sci. USA, 91 (1994) 11383.
44. D.J. Leahy, W.A. Hendrickson, I. Aukhil and H.P. Erickson, Science, 258 (1992) 987.
45. A.L. Main, T.S. Harvey, M. Baron, J. Boyd and I.D. Campbell, Cell, 71 (1992) 671.
46. C.D. Dickinson, B. Veerapandian, X.-P. Dai, R.C. Hamlin, N.-h. Xuong, E. Ruoslahti and K.R. Ely, J. Mol. Biol., 236 (1994) 1079.
47. H.P. Erickson, Proc. Nat. Acad. Sci. USA, 91 (1994) 10114.
48. C.K. Hansen, FEBS Lett., 305 (1992) 91.
49. D. Tull, S.G. Withers, N.R. Gilkes, D.G. Kilburn, R.A.J. Warren and R. Aebersold, J. Biol. Chem., 266 (1991) 15621.
50. D. Tull and S.G. Withers, Biochemistry, 33 (1994) 6363.
51. A.M. MacLeod, T. Lindhorst, S.G. Withers and R.A.J. Warren, Biochemistry, 33 (1994) 6371.
52. H.G. Damude, S.G. Withers, D.G. Kilburn, R.C. Miller, Jr. and R.A.J. Warren, Biochemistry, 34 (1995) 2220.
53. V. Ducros, M. Czjzek, A. Belaich, C. Gaudin, J.-P. Belaich, G. Davies and R. Haser, Structure, in press.
54. R. Dominguez, H. Souchon, S. Spinelli, Z. Dauter, K.S. Wilson, S. Chauvaux, P. Béguin and P.M. Alzari, Nature Struct. Biol., 2 (1995) 56.

S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.


Transgenic plants as a tool to understand starch biosynthesis

J. Koßmann a, G. Abel b, V. Büttcher b, E. Duwenig b, M. Emmermann a, R. Lorberth b, F. Springer a, I. Virgin c, T. Welsh b and L. Willmitzer a

a Max-Planck-Institut für molekulare Pflanzenphysiologie, Karl-Liebknecht-Straße 24-25, D-14476 Golm, Germany
b Institut für Genbiologische Forschung GmbH Berlin, Ihnestr. 63, D-14195 Berlin, Germany
c Stockholm Environment Institute, Lilla Nygatan 1, Box 2142, S-103 14 Stockholm, Sweden

Abstract

Transgenic plants are used as a tool for further understanding starch biosynthesis. To this end cDNAs are cloned which encode biochemically identified enzymes, which are possibly involved in starch metabolism. These cDNAs are then either expressed in Escherichia coli to study the effects of the corresponding enzymes on glycogen metabolism, or used to repress the expression of the respective gene in transgenic potato plants. The functional analysis of the key enzymes of starch metabolism, the ADP-glucose pyrophosphorylase, the different isoforms of starch synthase and the branching enzyme is discussed.

1. INTRODUCTION

More than 6 million tons of starch are currently produced in Europe and used in a wide range of industries. Wild type starch is normally composed of essentially linear α-1,4-glucans (amylose; 0.1% branchpoints) and α-1,4-glucans branched via α-1,6-glycosidic bonds (amylopectin; 4-5% branchpoints). The major sources are maize, wheat and potato. Around 30% of the starch is used in its native form, and 15% in a chemically modified form. These chemical modifications are normally introduced to optimize the physicochemical properties of the different starches for special applications. Genetic engineering of plants might serve as a tool to replace some of the chemical modifications, if it were possible to manipulate, in transgenic plants, key steps of starch biosynthesis that are of major importance in determining certain properties of the starch synthesised. To this end starch biosynthesis has to be fully understood on the biochemical level. To date only a few aspects of a rather complex pathway have been described. The main route of starch synthesis comprises three enzymatic steps [11]:
- the conversion of glucose-1-phosphate to ADP-glucose by the ADP-glucose pyrophosphorylase (AGPase), the key regulatory enzyme of the pathway
- the transfer of glucose from ADP-glucose to an α-1,4-glucan by different isoforms of starch synthase (granule-bound (GBSS) or soluble (SSS))
- the introduction of branchpoints by different isoforms of branching enzyme (BE).

Most of these enzymes have been purified and biochemically characterised. However, the synthesis of starch granules in vitro using these enzymes has never been achieved; only linear and branched, but soluble, glucans have been obtained. This indicates that other enzymes might be involved in the formation of starch granules. Possible candidates are the starch phosphorylases (STP), glucanotransferases, e.g. disproportionating enzyme (DE), glucosyltransferases, e.g. T-enzyme, and different hydrolases, e.g. the debranching enzyme (R-enzyme; RE) or the α- and β-amylases. Furthermore, it is impossible to explain the phosphorylation of potato starch, a feature of major importance for the industrial uses of this starch. No enzyme that can transfer phosphate to starch has been characterised so far.

To elucidate the role of each enzyme it is possible to generate transgenic plants in which the expression of the respective genes is specifically repressed. This is accomplished through the antisense RNA methodology. Here a part of the coding sequence of the target gene is expressed in the reverse orientation, probably leading to the hybridisation of sense and antisense RNA molecules and the subsequent degradation of the hybrid. The mechanism by which the expression of the target gene is repressed is still under debate; however, the technique is available as a valuable tool for the functional analysis of some enzymes. The effects of the absence of one or several enzymes on starch metabolism and structure can then be studied, and in some cases a defined function can be ascribed to the respective enzyme.
It should be noted that this technique not only enables the functional analysis of genes and proteins, but can also be used as a tool to generate plants that synthesise starches with altered physicochemical properties. Another valuable approach to generating modified starches in planta is to express heterologous genes originating from different bacterial, fungal, animal, or even different plant species. Of special interest are genes involved in glycogen metabolism. Glycogen is a polysaccharide similar to amylopectin, but it is more highly branched (7-10% branchpoints) and in most cases has a higher molecular weight. As is evident from the above, the respective coding sequences are needed in order to follow the antisense RNA approach or to overexpress a certain protein in transgenic plants. For most of the enzymes involved in starch metabolism in potato tubers the genes (cDNAs) had to be isolated using different approaches, e.g. heterologous screening, immunoscreening, or direct functional screening using different mutants of Escherichia coli.

2. RESULTS AND DISCUSSION

2.1. Plants with reduced levels of ADP-glucose pyrophosphorylase produce starchless tubers that contain high amounts of soluble sugars

AGPase is an allosterically regulated heterotetrameric enzyme consisting of two different subunits. It is generally believed that the reaction catalysed by the enzyme is the limiting step in starch metabolism with respect to the amount of starch which is produced in a given organ.

Furthermore, many starchless mutants that are deficient in the enzyme, or contain lower levels of it, have been isolated in different plant species [11]. In order to test whether the enzyme plays the same key role in potato tuber starch metabolism, plants containing lower levels of AGPase were generated using the antisense RNA technique. To this end two different cDNAs were cloned using heterologous probes encoding the different subunits of AGPase from maize [9]. Plants were transformed with a construct containing the cDNA encoding the small subunit of potato tuber AGPase in reverse orientation, driven by the constitutive CaMV 35S promoter. Individual lines containing different levels of AGPase were selected and analysed [10]. A clear correlation between the remaining AGPase activity and the starch content in tubers was observed. Plants with only 2% of wild type AGPase levels contain only 5% of wild type starch levels. The lack of starch in the transgenic tubers was partially compensated for by the accumulation of soluble sugars. In normal tubers starch contributes 70-80% of the dry weight. Transgenic tubers expressing the lowest levels of AGPase contain only 1% starch, but 30% sucrose and 8% glucose on a dry weight basis. Interestingly, the number of starch granules initiated in very small (developing) tubers does not seem to be changed in the plants with lower levels of AGPase (unpublished observation). The process of initiation of the starch granules is not understood at all. The results here can be taken as an indication that the "ADP-glucose pathway" is not important for starch granule initiation; however, it is also possible that the remaining enzyme activity (2%) is sufficient to support the initiation of the starch granules, but is then limiting for further growth of the granules.
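The phrase "partially compensated" can be made concrete from the dry-weight figures just quoted. The sketch below is a rough estimate only: the names are hypothetical, and because the text gives no soluble-sugar content for wild type tubers, that content is assumed to be zero, which makes the result an upper bound on the compensation.

```python
# Dry-weight percentages quoted in the text for the lowest-AGPase tubers.
ANTISENSE = {"starch": 1.0, "sucrose": 30.0, "glucose": 8.0}

# Wild type starch content is given only as a range (70-80% of dry weight).
WILD_TYPE_STARCH_RANGE = (70.0, 80.0)


def compensation_fraction(wild_type_starch, antisense):
    """Fraction of the lost starch (in dry-weight points) recovered as soluble sugars.

    Assumption (not from the source): wild type tubers contain negligible
    sucrose and glucose, so all sugars in the antisense tubers count as gain.
    """
    lost_starch = wild_type_starch - antisense["starch"]
    gained_sugars = antisense["sucrose"] + antisense["glucose"]
    return gained_sugars / lost_starch


if __name__ == "__main__":
    for wt in WILD_TYPE_STARCH_RANGE:
        frac = compensation_fraction(wt, ANTISENSE)
        print(f"wild type starch {wt:.0f}% DW: about {frac:.0%} of the missing starch appears as sugars")
```

Under that assumption roughly half of the missing starch reappears as sucrose and glucose, which is in line with the description of the compensation as partial.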
The starch that can be isolated from the plants with the lowest levels of AGPase activity has, as can also be seen in thin sections of tubers, a smaller average granule size, which makes the isolation procedure more tedious. Furthermore, the gelatinisation characteristics of this starch, as determined with a Brabender Viscograph E, are significantly changed. Higher viscosities are obtained after cooling the paste, whereas the maximal viscosity is drastically lowered. How this relates to structural changes in the starch other than the lowered average granule size is currently under investigation. It raises the possibility that, under ADP-glucose limitation, the relative contribution of the different starch synthases to the building of the starch granule is changed, leading to a change in the chain length distribution of the amylopectin or in the amylose content of the starch. In a complementary approach the starch content of potato tubers was increased through the introduction of a bacterial ADP-glucose pyrophosphorylase with allosteric properties that constitutively promote starch synthesis in transgenic plants [13]. This can be taken as a further indication of the importance of this enzyme in controlling the flux of carbon into starch.

2.2. Biochemical, molecular, and functional analysis of the different isoforms of starch synthase in potato tubers

The functional analysis of the different starch synthases in potato tubers is a far more complex and demanding problem, since very little is known on the biochemical level. All attempts to purify these enzymes to homogeneity from potato tubers have so far failed. Starch synthases can be divided into two major classes, the starch granule-bound and the soluble isoforms. Only for the major starch granule-bound protein, GBSS I ("waxy" protein), has a clear function been assigned to date (see below). The individual roles of the other starch synthases are not clear at present, owing to the lack of mutants in these synthases. Furthermore, only very few coding sequences other than cDNAs encoding GBSS I are available.

2.2.1. Reduction of the GBSS I in transgenic potato plants leads to the production of waxy potato starch

One of the primary goals in modifying starch structure and composition is to produce starches of homogeneous composition (exclusively amylose or amylopectin) in order to increase their technical value, since amylose and amylopectin are molecules with rather different physicochemical properties, which makes them suitable for very different applications. Many mutants in cereals (and other plants, including Chlamydomonas) have been described that contain amylose-free starch (waxy starch) [8]. Waxy maize starch is produced commercially, since it has many advantages compared to wild type maize starch. The phenotype can be ascribed to the loss of active GBSS I protein. Such a mutant was also isolated in a diploid potato line [5]; however, this line does not give a sufficient yield to produce waxy potato starch for commercial purposes. Classical genetics are commonly not applicable to potato plants, since normal potato varieties are allotetraploid with a high degree of heterozygosity. The antisense RNA approach is therefore ideally suited to generating commercially interesting lines with waxy potato starch. Transgenic lines were generated which express a cDNA for GBSS I in antisense orientation, and lines with reduced expression of the gene were selected. Reduction of GBSS I leads to a reduction of the amylose content in potato starch, and it was possible to select lines which contained essentially amylose-free starch [14] (unpublished results). In maize this phenotype is normally correlated with a yield penalty. However, this cannot be observed in the case of potato plants.
Field testing of the transgenic lines did not give any indication that the lack of amylose synthesis leads to decreased starch contents or decreased tuber yields [7]. Furthermore, other features which make potato starch superior to cereal starches, such as the larger average granule size, the tastelessness, and the phosphate content, are maintained in the waxy potato starch. No differences in the granule size distribution were observed when the waxy potato starch was compared with wild type potato starch. A 20% increase in the phosphate content at the C6-position, determined enzymatically as glucose-6-phosphate after acid hydrolysis of the starch, was measured in the starch derived from the transgenic lines. This can be explained by the fact that phosphate is bound only to the amylopectin, not to the amylose. Since amylose, normally accounting for 20% of the total starch, is fully replaced by amylopectin, the 20% increase in phosphate is close to the anticipated result.

In the waxy Chlamydomonas mutant it was shown that GBSS I is not only involved in the formation of amylose, but also contributes to the synthesis of amylopectin: a distinct fraction seen in gel permeation chromatography, which contains a branched glucan with longer side chains than normal amylopectin, is also absent from the waxy starch. The same is true for waxy potato starch. The exact mechanism by which the GBSS I protein accomplishes amylose synthesis in the presence of branching enzymes is still unknown. One possibility is that the GBSS I protein remains active in regions of the growing starch granule which are substantially dehydrated, whereas other enzymes lose their activity there.

The pasting characteristics of the waxy potato starch closely resemble what is expected. Overall the viscosity is significantly lower in pastes of waxy potato starch. No retrogradation can be observed in waxy potato starch pastes. This indicates that the waxy potato starch combines the positive features of waxy maize starch with those of normal potato starch. One can imagine a series of applications where waxy potato starch might replace other starches in the paper, food or textile industries.
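The expected size of the phosphate increase in amylose-free starch can be estimated with one line of arithmetic. The sketch below rests on a simplifying assumption that is not stated in the text: the phosphate content per unit of amylopectin is the same in waxy and wild type starch. The function name is hypothetical.

```python
def expected_phosphate_gain(amylose_fraction):
    """Relative gain in phosphate per unit starch when amylose is fully replaced by amylopectin.

    Assumes phosphate is bound only to amylopectin (as stated in the text) and
    that phosphate per unit amylopectin is unchanged (an assumption made here).
    """
    if not 0.0 <= amylose_fraction < 1.0:
        raise ValueError("amylose_fraction must be in [0, 1)")
    amylopectin_before = 1.0 - amylose_fraction
    amylopectin_after = 1.0
    return amylopectin_after / amylopectin_before - 1.0


if __name__ == "__main__":
    gain = expected_phosphate_gain(0.20)
    print(f"expected relative phosphate gain: {gain:.0%}")
```

With 20% amylose the amylopectin share rises from 80% to 100%, a relative gain of about 25%, so the measured 20% increase is of the expected size even though the two figures do not coincide exactly.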

2.2.2. Cloning and functional analysis of the GBSS II from potato

Recently the cloning of a cDNA encoding a second granule-bound starch synthase (GBSS II) from pea embryos was reported [4]. In contrast to GBSS I, this protein is not located exclusively on starch granules, but also exists in a soluble form, and was characterised earlier as a soluble starch synthase II (SSS II). The N-terminal sequence of SSS II is found within the sequence derived from the cDNA clone encoding GBSS II [3]. A cDNA clone showing high similarity to the GBSS II from pea was isolated from a potato tuber cDNA library using an antiserum directed against proteins derived from waxy potato starch granules. This strategy was followed because the GBSS I protein normally makes up as much as 90% of the total protein bound to starch granules. A full length cDNA clone for GBSS II encodes a protein of about 85 kD. The cDNA was expressed in reverse orientation in transgenic potato plants. Individual lines were selected in which very little or no GBSS II protein was detectable on Western blots of crude tuber extracts. As in pea, it can be clearly shown that the GBSS II protein from potato is present in a granule-bound and a soluble form. In extracts from wild type tubers a clear signal is detectable in the granular and the soluble fraction, whereas in selected transgenic lines no signal is obtained in either fraction. In native polyacrylamide gels stained for starch synthase activity the absence of a distinct isoform of starch synthase can be shown easily. Surprisingly, no differences can be identified when starches from plants with little or no GBSS II protein are compared to wild type starch. The starch content in the tubers remains unaltered, indicating that this protein either does not contribute significantly to net starch synthesis in wild type potato tubers, or that the lack of this protein is fully compensated for by the presence of the other isoforms of starch synthase in potato tubers.
The amylose content of the starch remains unaltered. Also, no differences in the chain length distribution can be found using HPAEC (high performance anion exchange chromatography) with pulsed amperometric detection. The phosphate bound to the C-6 position is also unchanged, as is the average starch granule size. Finally, no differences in the gelatinisation and gelation characteristics can be observed. The function of GBSS II in potato tuber starch metabolism therefore remains enigmatic, all the more so since the protein can be actively expressed in Escherichia coli. In a glycogen synthesis deficient mutant it contributes to the formation of granular structures, which are composed of linear glucans.

2.2.3. Cloning of a cDNA for SSS I

Many laboratories have attempted to purify soluble starch synthases biochemically from a wide range of plant species. In most cases it was not possible to isolate sufficient amounts of protein to perform protein sequencing or to raise antibodies against the proteins. In particular, it was impossible to obtain any soluble starch synthase from potato tubers in a pure form. A purification scheme was developed which includes anion-exchange chromatography on DEAE-Sepharose, affinity chromatography on an amylose resin, and a second anion-exchange chromatography on a MonoQ column. In the last step it was possible to recover two distinct starch synthases (170- and 450-fold enrichment), although not in a homogeneous form. The biggest problem is the presence of the branching enzyme in all the fractions containing soluble starch synthase activity. This indicates that these proteins might be active as complexes. Furthermore, far too little activity is recovered after several chromatographic steps, indicating that these proteins are rather unstable. Therefore other strategies had to be followed in order to clone cDNAs encoding soluble starch synthases from potato. Recently the cloning of a cDNA encoding soluble starch synthase (SSS) from rice was reported [1]. Based on the nucleotide sequence, primers were generated to amplify a 1 kb fragment for soluble starch synthase from a leaf specific rice cDNA via the polymerase chain reaction. This fragment was radioactively labelled and used as a probe to screen a potato tuber cDNA library. This screen was performed "differentially", meaning that replica filters were also hybridized with labelled probes for GBSS I and GBSS II. Only those clones were isolated which hybridized with the probe from rice, but not with the other two. Of approximately 500-1000 clones initially hybridizing with the rice probe, only one (designated SR5) was isolated which did not encode GBSS I or II. Sequence analysis of this clone revealed high homology to the SSS from rice at the amino acid sequence level, indicating that it encodes a soluble starch synthase from potato. Expression analysis surprisingly shows that the gene is more highly expressed in leaves than in tubers. This contrasts with the expression pattern of other starch biosynthetic genes from potato, which are normally very highly expressed in tubers and at a lower level in leaves [9, 6].
However, the expression of the gene can be induced in leaves if sucrose is applied exogenously, as is also the case for other starch biosynthetic genes. The clone SR5, which did not contain the full information for the protein, was used to isolate a full size cDNA from a potato leaf library. The full size clone encodes a protein of ca. 70 kD. For a functional analysis the full size clone was expressed in different mutants of E. coli. Glycogen synthase negative mutants were complemented using the cDNA from potato. In native polyacrylamide gels a soluble starch synthase specific for the cDNA introduced into E. coli was detectable. The migration pattern of this starch synthase closely resembles that of a starch synthase which is also detectable in zymograms of crude potato tuber extracts, or of fractions containing only one starch synthase after several purification steps as discussed above. The protein encoded by this clone was designated soluble starch synthase I (SSS I). Inhibiting the expression of the gene in transgenic plants should reveal its specific function in potato starch metabolism in the near future.

2.2.4. Cloning of a novel type of soluble starch synthase from potato

A second strategy was followed to clone soluble starch synthases from potato. Peptides were synthesised which resemble those domains that are conserved within all α-1,4-glucan synthases sequenced so far (including glycogen synthases from bacteria). These peptides were used to generate different antisera. Western blots probed with one of these antisera showed that proteins from the partial enzyme purification discussed above were specifically detected in those fractions containing high levels of soluble starch synthase activity. The major protein detected had a size of approximately 135 kD, but bands of lower molecular weight were also detected in the same fraction. These are possibly degradation products, further indicating that the soluble starch synthases are rather sensitive to proteolytic cleavage. This antiserum was used to screen a potato tuber cDNA library. Positive clones were isolated and sequenced. Sequence analysis of a subsequently isolated full size clone revealed homologies to other α-1,4-glucan synthases. When a dendrogram is computed, the sequence represents a novel class of enzymes, which was designated soluble starch synthase III (SSS III). The gene is expressed in almost every organ, with slightly higher expression in tubers, as determined by Northern blot analysis. As for other starch biosynthetic genes, the expression can be induced in leaves through the exogenous application of sucrose. The protein can also be expressed in E. coli and, as discussed above, can complement glycogen synthase deficient mutants. The migration pattern on native gels is different from that of SSS I, but similar to that of a starch synthase which is also present in potato tubers. The sequence was used to inhibit the expression of the gene. In selected transgenic lines the absence of an isoform of soluble starch synthase can be demonstrated in native gels after staining for starch synthase activity. Further analysis of the plants containing little or no SSS III will elucidate its role in starch metabolism.
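The differential screen used in section 2.2.3 to find SR5 can be read as a set difference: keep the clones that hybridize with the rice SSS probe and discard any that also hybridize with the GBSS I or GBSS II probes. The following Python sketch illustrates only that selection logic. The function name, the hit lists and every clone identifier other than SR5 are hypothetical and invented for illustration; they are not data from the experiment.

```python
def differential_screen(rice_hits, gbss1_hits, gbss2_hits):
    """Return clones positive for the rice SSS probe but negative for both GBSS probes."""
    return set(rice_hits) - set(gbss1_hits) - set(gbss2_hits)


# Hypothetical replica-filter results (clone IDs are made up, apart from SR5).
rice_hits = {"SR1", "SR2", "SR3", "SR4", "SR5"}
gbss1_hits = {"SR1", "SR2", "SR4"}   # clones that also light up with the GBSS I probe
gbss2_hits = {"SR3"}                 # clones that also light up with the GBSS II probe

candidates = differential_screen(rice_hits, gbss1_hits, gbss2_hits)
print(sorted(candidates))
```

In this toy input only one clone survives both exclusions, mirroring the outcome reported in the text, where a single clone remained from several hundred initial positives.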

2.3. Reduction of the branching enzyme in transgenic plants does not lead to the production of high-amylose containing potato starch

Most plants contain two isoforms of the starch branching enzyme. Mutants with reduced levels of one isoform have been described in several plant species and contain higher levels of amylose within the starch (up to 80%) [12]. These mutants are not normally used for the production of high-amylose starch, because the phenotype is accompanied by a severe reduction of the crop yield and because the starch granules from these plants are significantly smaller, leading to significant losses in the starch extraction processes. Potato seems to be an ideal system in which to manipulate the amylose content, since the occurrence of only one isoform of branching enzyme has been described [2, 15, 6]. Furthermore, the average starch granule size of potato starch is higher than in any other plant species used for starch production. A reduction of the granule size due to higher amylose content might therefore not severely affect the extractability of the starch. A cDNA encoding potato branching enzyme was cloned and used to suppress the expression of the gene in transgenic potato plants. Surprisingly, a more than 99% reduction of the branching enzyme as compared to wild type plants did not result in any change in the amylose content of the starch. In addition, no changes in the chain length distribution and the size of the amylopectin were found. This indicates that different isoforms of branching enzyme are most probably also present in potato tubers. Approaches to isolate further sequences that encode other isoforms of branching enzyme are in progress. Nevertheless, differences in glucan metabolism were observed when the plants with reduced branching enzyme activity were analysed further. The first is that water soluble glucans which are stainable with iodine are less branched than in wild type plants.
Soluble extracts from the transgenic plants stain blue, whereas wild type extracts stain brown. The role of these glucans in starch metabolism is unclear; they might be intermediates in the synthesis of amylopectin. In greenhouse grown plants, no differences in yield and starch content were measured between transgenic and wild type plants. This could be taken as an indication that these water soluble glucans are not an essential intermediate of amylopectin synthesis, because a drastically altered structure should result in a change of net flux. On the other hand, it is possible that the conditions in the greenhouse, where relatively little light is available and the growth of the plants is limited by the pot in which they are kept, do not allow the detection of differences when the pot yield is taken as a measure of carbon flux into starch accumulating in sink organs. The second difference found is a significant increase in the phosphate content of the starch, when measured as described under 2.2.1. These increases amount to up to 80%. However, how this can be explained on a biochemical basis is still unclear and subject to further investigation. This increased phosphate content might be of interest to several starch using industries, especially the paper industry. Here cationic potato starches are applied because of their amphoteric nature, which is conferred by the covalently bound phosphate groups. Significant changes were also observed when the gelatinisation characteristics of the different starches were analysed. Starches derived from plants with reduced branching enzyme levels are generally more viscous. This could be due to the higher phosphate content; however, it is not clear whether other changes, which are not detectable with the analytical systems used, might also contribute to the higher viscosities measured.

3. REFERENCES

1 T. Baba et al., Plant Physiol., 103 (1993) 565.
2 D. Borovsky et al., European J. Biochem., 59 (1975) 615.
3 K. Denyer et al., Plant J., 4 (1993) 191.
4 I. Dry et al., Plant J., 2 (1992) 193.
5 J.H.M. Hovenkamp-Hermelink et al., Theor. Appl. Genet., 75 (1987) 217.
6 J. Koßmann et al., Mol. Gen. Genet., 230 (1991) 39.
7 G.J. Kuipers et al., Euphytica, 59 (1992) 83.
8 B. Müller-Röber and J. Koßmann, Plant Cell Environ., 17 (1994) 601.
9 B. Müller-Röber et al., Mol. Gen. Genet., 224 (1990) 136-146.
10 B. Müller-Röber et al., EMBO J., 11 (1992) 1229.
11 J. Preiss, in: Oxford surveys of plant molecular and cell biology, Vol. 7, B.J. Miflin and H.F. Miflin (eds.), 59-114, Oxford University Press, Oxford (1991).
12 J.C. Shannon and D.L. Garwood, in: Starch chemistry and technology, R.L. Whistler, J.N. BeMiller and E.F. Paschall (eds.), 25-86, Academic Press, New York (1984).
13 D.M. Stark et al., Science, 258 (1992) 287.
14 R.G.F. Visser et al., Mol. Gen. Genet., 225 (1991) 289.
15 G.H. Vos-Scheperkeuter et al., Plant Physiol., 90 (1989) 75.

S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.


Targeted expression of microbial cellulases in transgenic animals

Simi Aliᵃ, Judith Hallᵃ, Kathleen L. Sooleᵃᵇ, Carlos M. G. A. Fontesᵃᵇ, Geoffrey P. Hazlewoodᶜ, Barry H. Hirstᵇ and Harry J. Gilbertᵃ.

ᵃDepartment of Biological and Nutritional Sciences, University of Newcastle upon Tyne, Newcastle upon Tyne, NE1 7RU, U.K.
ᵇDepartment of Physiological Sciences, University of Newcastle upon Tyne, Newcastle upon Tyne, NE2 4HH, U.K.
ᶜDepartment of Cellular Physiology, The Babraham Institute, Babraham, Cambridge, CB2 4AT, U.K.

Abstract

In non-ruminant livestock the use of plant structural carbohydrates, such as cellulose and xylan, as a dietary source is limited by the lack of intestinal enzymes to degrade this material to simple sugars. Any dietary energy made available is through the action of microbes in the hind gut, but this fermentation is inefficient, producing volatile fatty acids rather than simple sugars. The nutrition of such simple-stomached animals could, therefore, be significantly improved by the introduction of plant degrading enzymes such as cellulases and xylanases into their gastrointestinal (GI) tract. We have expressed a bacterial gene from Clostridium thermocellum encoding a cellulase, endoglucanase E', in the exocrine pancreas of a simple-stomached animal model, the mouse. A catalytically active enzyme that is resistant to proteolytic inactivation is synthesized and secreted into the small intestine. Currently the level of expression is low, so we are attempting to maximise expression by gene rescue experiments. In addition, we are attempting to express this gene in the intestinal enterocytes lining the GI tract, using a glycosylphosphatidylinositol sorting signal to direct the secretion of the bacterial protein into the intestinal lumen.

1. INTRODUCTION

Simple-stomached animals such as pigs and poultry do not efficiently utilise forage based diets or those rich in cereal glucans. They depend on cellulolytic bacteria in their hind gut to digest the plant structural polysaccharides, but the sugars generated are then metabolised into volatile fatty acids (VFA). VFAs, when absorbed, have only 60% of the energy value of the equivalent quantity of monosaccharide from which they are derived [13]. Furthermore, the hind gut fermentation of sugars is inefficient because dietary energy and carbon are lost to the animal in the heat of fermentation and through methane production. Thus, if hexoses derived from the plant cell wall could be released and absorbed directly in the upper small intestine, without being subject to microbial fermentation, more energy would be available to the animal. Pig and poultry diets contain considerable quantities of cereals such as barley and wheat, and include factors considered non-nutritional. One such factor is the non-starch polysaccharide β-glucan found in barley. β-Glucans are composed of glucopyranose units joined by β-glycosidic bonds, and on solubilisation form viscous solutions. In the GI tract these viscous solutions impair the digestion and absorption of nutrients and cause health problems such as sticky faeces [22]. To improve non-ruminant animal performance, cereal based diets are often supplemented with endo-β-1,4-glucanase and endo-β-1,4-xylanase [29], which in poultry can lower the viscosity of the fluid in digesta from the small intestine and, therefore, result in significant improvement in nutrition and health. Furthermore, the enzymes added must remain active throughout their passage to the small intestine, i.e. the enzyme should not be susceptible to inactivation by the acidic pH found in the proventriculus and gizzard and should also be resistant to proteolysis by intestinal enzymes. The advent of transgenic animal technology provides us with the opportunity of manipulating the repertoire of enzymes secreted into the gastrointestinal tract to include β-1,4-glycanases. The major advantages of this approach are, firstly, that the synthesis and secretion of the enzyme can be directed to the small intestine via exocrine pancreatic cells and intestinal enterocytes, thus by-passing the acid denaturing conditions of the stomach and, secondly, that the bacterial gene is incorporated into the genome, so that upon breeding the transgenic DNA is inherited as a simple Mendelian trait.
To initiate the development of non-ruminant livestock with the endogenous capacity to hydrolyse plant structural polysaccharides, we have expressed a bacterial endoglucanase in the exocrine pancreas of mice. The enzyme is secreted into the gastrointestinal tract, is catalytically active and resistant to the activity of small intestinal proteinases. In addition, we have attempted to deliver the enzyme (into the intestinal lumen) through the intestinal enterocytes and preliminary experiments (in mice) show that a functional protein is synthesised and correctly sorted in these cells.

2. RESULTS

2.1. Screening for a proteinase-resistant β-1,4-glycanase

The genes encoding enzymes which degrade plant structural polysaccharides are mostly found in prokaryotes and lower eukaryotes, though anaerobic cellulose degrading protozoa have also been identified in the rumen. The avocado fruit and the slime mould Dictyostelium discoideum also produce cellulolytic enzymes [1]. Developing an animal model with the capacity to hydrolyse cellulose, the major plant structural polysaccharide, requires that the secreted cellulase is resistant to the proteinases in the gastrointestinal tract. In view of this, the resistance of a range of recombinant cellulases and xylanases to inactivation by proteinases of pancreatic origin was assessed. The results (Table 1) indicate that one enzyme, endoglucanase E (EGE) from Clostridium thermocellum, was completely resistant to inactivation by intestinal proteinases.


Clostridium thermocellum is an anaerobic bacterium which degrades crystalline cellulose with cellobiose as the main product. The degradation of cellulose is achieved by a high molecular weight complex called the cellulosome, which comprises at least 14 distinct polypeptides including numerous endoglucanases and xylanases and at least one β-glucosidase [23]. The complexes are generally associated with the cell surface and mediate attachment between cells and insoluble substrate. Endoglucanase E (EGE), a component of the cellulosome, was isolated and sequenced. The structural gene consists of 2442 bp [16] and the encoded enzyme comprises an N-terminal catalytic domain and a C-terminal cellulose binding domain joined by a proline/threonine rich linker sequence [9]. This enzyme, in addition to cleaving soluble cellulosic substrates, also hydrolyses xylan. The gene encoding this enzyme was, therefore, selected as suitable for expression in the mouse exocrine pancreas.

Table 1
Resistance of plant cell wall hydrolases to proteolytic inactivation

Half-life of enzyme incubated with proteinase (min)

Enzyme            Elastase   Trypsin   Chymotrypsin   Pancreatin
Endoglucanase E   > 180      > 180     > 180          > 180
Endoglucanase A   2          5         3              2
Endoglucanase B   1          3         1              2
Xylanase A        15         12        10             9

Endoglucanase B and endoglucanase E were derived from Ruminococcus albus [31] and Clostridium thermocellum [16] respectively. Endoglucanase A and xylanase A were from Pseudomonas fluorescens subsp. cellulosa [17, 18]. The enzymes were incubated with the relevant proteinases in the ratio of 10:1 (w/w) respectively. Adapted from Hall et al. [21].
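To give a feel for what the half-lives in Table 1 imply, the sketch below converts a half-life into the fraction of activity expected to remain after a given incubation time. It assumes simple first-order (exponential) inactivation, which the text does not state, and the function name is illustrative only.

```python
def fraction_remaining(minutes, half_life_min):
    """Fraction of initial activity left after `minutes`, assuming first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (minutes / half_life_min)


# Half-lives (min) spanning the range reported in Table 1.
# For endoglucanase E the table gives "> 180", so 180 min yields only a lower
# bound on the activity that remains.
for half_life in (2, 15, 180):
    left = fraction_remaining(30, half_life)
    print(f"half-life {half_life:>3} min: {left:.1%} of activity left after 30 min")
```

Under this assumption an enzyme with a half-life of a few minutes is essentially inactive within half an hour, whereas an enzyme with a half-life above 180 min retains most of its activity over the same period.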

2.2. Evaluation of nutritional benefits of endoglucanase E in poultry

The potential of EGE to improve the nutritional capacity of livestock was evaluated [30]. The simple trial consisted of three groups of chicks, each fed a different diet. One group of chicks was fed a standard diet containing 50% barley supplemented with EGE (1%), the second group had the same diet supplemented with a commercial preparation of fungal β-glucanase called Avizyme-SX (AVIZ-SX), and the third group was fed the barley based diet with no enzyme supplements. β-Glucanase activity in the crop and small intestine of birds given exogenous enzymes was generally higher than that of birds given only the basal diet (Fig 1). Low EGE activity on day 35 could be due to a reduction of proventricular pH with age. The fungal β-glucanase is less pH sensitive than EGE, so EGE does not display as much activity as AVIZ-SX in the small intestine. However, this study suggests that exogenous enzyme is of little significance after three weeks of age.

[Figure 1: panels a) and b), each plotted against age (days, 0 to 35) for the Control, EGE and Aviz-SX groups.]

Figure 1. a) Growth performance in broiler chickens fed on barley based diets with or without exogenous β-glucanase. Three groups of birds were fed for up to 35 days on diets containing 50% barley: 1) supplemented with a commercial fungal β-glucanase (Aviz-SX), 2) supplemented with a recombinant endoglucanase E and 3) control without any exogenous glucanase. The birds were weighed on days 4, 7, 11, 14, 21, 28 and 35. b) Measurement of β-glucanase activity of the fluid in the small intestine in the three groups of birds as above on days 14, 21 and 35. β-Glucanase assays were performed using Azo-barley glucan as the substrate.

The viscosity of small intestinal contents of the chicks in each group was measured over a period of 35 days. While the control group showed very high viscosity, a significant reduction in viscosity was found in groups of animals whose diet was supplemented with either EGE or AVIZ-SX. This marked reduction in viscosity was coupled to an improvement in the health status of the chicks, as noted by the reduction in sticky faeces. However, the viscosity of the intestinal fluid from the control birds declined after day 14, which supports the idea of adaptation to β-glucan based diets. The weight of these animals was also monitored. The weight gain in the groups whose diet was supplemented with exogenous β-glucanase was significantly higher (10%) than in their respective controls for the first three weeks (Fig 1). The trial demonstrates the beneficial effect of adding exogenous β-glucanase to β-glucan containing diets of young birds. These results also show that the performance of EGE matches that of the commercial preparation. These results, therefore, indicate the potential of this enzyme to improve the nutritional capacity of simple-stomached livestock.

2.3. The in vitro expression of a bacterial gene encoding an endoglucanase

The majority of mammalian genes contain exons which code for mRNA and are interrupted by non coding sequences of variable length known as introns. Early observations [15] suggested that splicing, in which introns are excised from the primary RNA transcript, is necessary for mRNA accumulation in the cytoplasm. The subsequent use of cDNA vectors has not, however, resulted in the loss of mRNA production and protein expression, at least in cell culture [8]. In 1988 Buchman and Berg determined that the addition of heterologous splicing signals to an expression vector containing rabbit β-globin cDNA was effective in promoting β-globin formation [5].
The authors suggest that the splicing mechanism may play an important role in transcription initiation and elongation, nuclear stability and/or cleavage of RNA. In contrast, Brinster et al. [3] reported that while introns increased transcription rates of the human growth hormone (hGH) gene in transgenic mice, there was no significant difference in human growth hormone expression from the cDNA or the complete gene in cultured mammalian cells. From the foregoing discussion it is clear that the importance of introns for the efficient expression of transgenes is still unresolved. As prokaryote genes do not contain endogenous introns, an important question to consider is whether a prokaryotic gene will express in a eukaryotic cell or whether introns have to be incorporated into the gene to effect expression. To address this question, i.e. the role of heterologous introns in the expression of a bacterial gene in mammalian cells, a series of constructs was made. In these constructs celE', truncated so as to encode the catalytic domain only, was inserted in either the first, second or fifth exon of the hGH gene. These constructs, transcribed from the SV40 early enhancer/promoter, were stably transfected into Chinese Hamster Ovary (CHO) cells and the effect of intron position on the expression of the bacterial gene, at the level of mRNA and protein, was assessed. All the celE'-hGH chimeric transgenes, except the one in which celE' was inserted into the first exon of hGH, expressed endoglucanase activity. Constructs devoid of introns resulted in a 2- to 18-fold increase in EGE activity compared to constructs with introns. Analyses of EGE encoded mRNA from the transfected cell lines suggested that the presence of introns resulted in aberrant splicing of the message through the use of cryptic splice sites in the celE' gene. These data demonstrate that introns are not required for the efficient expression of a bacterial endoglucanase in mammalian cells in vitro and suggest that heterologous introns reduce the expression of the encoded protein [19]. EGE has its own signal peptide, but it was not known whether it would function in the eukaryote secretory system. Thus the above constructs were designed to contain the signal peptide either from EGE (i.e. a prokaryote signal peptide) or from hGH (i.e. a eukaryote signal peptide). 35S-methionine radiolabelling studies revealed that the endoglucanase was secreted, and that secretion was independent of the origin of the signal peptide. Thus the secretion of the bacterial protein in a eukaryote cell system does not appear to have any special requirement regarding the origin of the signal peptide [20], although a signal peptide is obligatory for secretion in eukaryotic cells [36]. Results from the in vitro studies showed that the bacterial endoglucanase was synthesised and secreted by eukaryotic cells, was catalytically active and was resistant to inactivation by intestinal proteases [19, 20].

2.4. Expression in exocrine pancreas of transgenic mice

The celE' gene, encoding the signal peptide and catalytic domain of EGE', was fused to 200 bp of DNA encoding the specific enhancer/promoter sequence of the elastase I gene [27] and to the β-globin polyadenylation signal (Fig 2). The resultant gene construct (E1PBG) was injected into the male pronucleus of single cell mouse embryos to generate transgenic mice. Functional endoglucanase activity, determined by both 4-methylumbelliferyl-cellobiosidase (MUCase) and carboxymethylcellulase (CMCase) activities [21], was detected in three out of eleven lines characterized. Expression of endoglucanase E activity was limited to the pancreas (Table 2).
In pancreatic sections, indirect immunohistochemistry with anti-EGE' antibody revealed acinar cell staining, with an absence of immunoreactive EGE' in islet tissue and blood vessels. In situ hybridisation with antisense celE' as a probe also localized EGE' mRNA to acinar cells (Fig 3). To assess the influence of introns on celE' expression in transgenic animals, celE' was inserted in the second exon of the human growth hormone structural gene (Fig 2). The hybrid gene (E1X2CelhGH) was used to generate transgenic lines which were assayed for cellulase production. One of the lines expressed EGE' exclusively in the pancreas (Table 2), but at a level lower than that observed in animals containing the intronless transgene. The celE' structural gene, when fused to the rat elastase enhancer, was expressed exclusively in the exocrine pancreas. The introduction of introns 5' and 3' to celE' did not appear to enhance transgene expression. This is consistent with the data of Whitelaw et al. [40], who showed that heterologous introns did not elevate the expression of a cDNA encoding human factor IX in the mouse. In contrast, the introduction of heterologous introns into certain cDNAs was shown to enhance transcription [28], although this increase in mRNA synthesis was dependent on the nature of the intron and its position relative to the cDNA.

[Figure 2: schematic maps of the two constructs, E1PBG and E1X2CelhGH, showing the elastase promoter/enhancer, the prokaryote signal peptide, celE', the rabbit β-globin poly A sequence and hGH, with restriction sites marked.]

Figure 2. The celE' constructs used for pronuclear injections into the single cell mouse embryo to generate transgenic lines. The restriction sites are N (NcoI), H (HindIII), Ss (SstI), Sp (SphI), R (EcoRI), Bg (BglII), K (KpnI), Sa (SalI) and X (XhoI). The elastase enhancer, prokaryotic signal peptide, celE', β-globin polyadenylation signal sequence, and the introns and exons of the human growth hormone gene are shown.

Table 2
Detection of functional EGE in transgenic mice

4-Methylumbelliferyl-β-D-cellobioside hydrolysing (MUCase) activity (mU/mg protein)

Mouse line       Construct    Liver         Pancreas            Spleen
Non-transgenic   -            9 ± 5 (5)     2 ± 1 (6)           12 ± 12 (5)
2                pcelE'       9 ± 1 (10)    4805 ± 1040 (23)    50 ± 36 (10)
13               pcelE'       7 (1)         18687 ± 7119 (5)    1 (1)
14               pcelE'       n.d.          337*                n.d.
3                hgh-celE'    6 ± 2 (3)     838 ± 390 (3)       23 ± 21 (3)

Mouse lines 2, 13 and 3 were individual transgenic lines. Appropriate organs were isolated, homogenized in lysis buffer and assayed for MUCase activity. Values given are the mean ± SEM from (n) determinations; n.d. = not determined. * Only two animals were analyzed from line 14. Functional CMCase was only detected in the pancreas of transgenic animals. Adapted from Hall et al. [21].
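As a reading aid for Table 2, the sketch below expresses the mean pancreatic MUCase activity of each transgenic line as a multiple of the non-transgenic background. Only the mean values from the table are used; the SEM and the number of determinations are ignored, so the ratios are rough indicators rather than statistical comparisons. The dictionary keys and function name are illustrative.

```python
# Mean pancreatic MUCase activity (mU/mg protein) taken from Table 2.
PANCREAS_MUCASE = {
    "non-transgenic": 2,
    "line 2 (pcelE')": 4805,
    "line 13 (pcelE')": 18687,
    "line 14 (pcelE')": 337,
    "line 3 (hgh-celE')": 838,
}


def fold_over_background(activities, background_key="non-transgenic"):
    """Divide each line's mean activity by the non-transgenic mean."""
    background = activities[background_key]
    return {
        name: value / background
        for name, value in activities.items()
        if name != background_key
    }


folds = fold_over_background(PANCREAS_MUCASE)
for name, fold in folds.items():
    print(f"{name}: about {fold:.0f}-fold over non-transgenic pancreas")
```

On these means the intron-containing line 3 lies below the intronless lines 2 and 13, which is the comparison the text draws when discussing the effect of introns.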

Generating a non-ruminant animal with the capacity to hydrolyse cellulosic material requires not only the synthesis of the bacterial protein in the appropriate cell type, but also the secretion of the protein into the gastrointestinal tract. The direction of secretion of the microbial protein from the polarized pancreatic acinar cells is unknown. To address this question, we determined the location of EGE' exported from the exocrine pancreas. The protein was found in gastrointestinal contents but not in the blood. As EGE' is endogenous to Clostridium thermocellum, it is unlikely that the cellulase has evolved any sorting signals, apart from the N-terminal signal peptide, which could direct secretion of the enzyme through a specific membrane domain in polarized eukaryotic cells. Thus the secretory route of EGE in the pancreas probably defines the constitutive pathway, which is through the apical membrane and into the gut lumen. The majority of EGE' synthesised and secreted by the exocrine pancreas had a molecular weight identical to that synthesised in Escherichia coli, 37 kilodaltons. Possible explanations for the apparent lack of glycosylation by the pancreas include:
1) The environment of the endoplasmic reticulum/Golgi apparatus in the pancreas may affect the folding of EGE' such that the glycosylation sites are not available for modification.
2) The exocrine pancreas does not efficiently glycosylate secreted proteins, i.e. the major pancreas derived digestive enzymes are not glycosylated despite potential N-glycosylation target sequences [33].
3) The endoglucanase is not secreted via the endoplasmic reticulum/Golgi apparatus pathway. However, the N-terminal signal peptide is essential for EGE' export in eukaryotic cells, arguing against this view [36].
4) EGE' is glycosylated in the pancreas but the sugar residues are subsequently removed.
In the highest expressing transgenic mouse the total quantity of EGE present in the small intestine is about 20 ng.
Taking into account the specific activity of the endoglucanase against β-glucan, the residence time in the small intestine and the β-glucan component of a barley based diet, it is estimated that the mouse must secrete about 600 ng of enzyme to elicit a 90% reduction in the viscosity of the β-glucan. Thus, in order to have a nutritional benefit, expression has to be increased at least 30-fold. Our current work is therefore directed towards increasing the level of expression in the exocrine pancreas and intestinal enterocytes of transgenic mice.

2.5. Rescuing endoglucanase expression in transgenic mice

Experiments with transgenic mice have shown that sequence information sufficient for tissue specific and developmentally regulated expression is usually localized in the promoter and enhancer elements within a few kilobases of the gene. However, if a gene containing all the proximal control region is introduced into transgenic mice, it is rarely expressed at the same level as its endogenous counterpart, nor is there a strict relationship between the number of integrated gene copies and the level of expression. This inefficient expression of transgenes is attributed to two causes. Firstly, the transgene fragment that is injected does not contain all the regulatory regions required for full expression. Secondly, random integration of the transgenes in the mouse genome results in so called "position effects": regulatory regions present at, or near, the site of integration of the transgene can act on the transgene and influence its pattern of expression.
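The 30-fold figure given at the end of section 2.4 is the ratio of the two quantities quoted there: about 600 ng of enzyme estimated to be required and about 20 ng found in the highest expressing mouse. A minimal sketch of that arithmetic follows; the function name is illustrative, and the calculation takes both estimates at face value.

```python
def required_fold_increase(present_ng, required_ng):
    """Factor by which expression must rise to reach the required enzyme quantity."""
    if present_ng <= 0:
        raise ValueError("present quantity must be positive")
    return required_ng / present_ng


# Quantities quoted in section 2.4 (both are approximate estimates).
fold = required_fold_increase(present_ng=20, required_ng=600)
print(f"Expression must rise about {fold:.0f}-fold")
```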

High levels of gene copy number dependent expression can be achieved by either: 1) including locus control regions (LCR) in the gene construct [14]; the LCR achieves activation of transcription in a dominant fashion, perhaps by creating very stable interactions between the LCR and the gene; or 2) manipulating the site of integration, i.e. targeting the heterologous

Figure 3. Cell-specific expression of celE' in exocrine pancreas of transgenic mice. Indirect immunohistochemistry with anti-EGE antibody and a secondary antibody conjugated to immunogold was used to visualise sections of pancreas from transgenic (A) and non-transgenic (B) mice, respectively. In situ hybridisation of pancreatic sections from transgenic (C and D) and non-transgenic (E) mice used the antisense strand of celE' as a probe. Sections were viewed under bright-field (C and E) and dark-field (D) microscopy [21].

gene to the vicinity of a highly expressed endogenous gene. It is possible that the highly expressed endogenous gene alters the chromatin structure in the proximity of the transgene, which then allows transcription factors access to the enhancer/promoter of the heterologous gene. Strategies for targeting the transgene include homologous recombination using embryonic stem cells, or simultaneous coinjection of two genes into the pronuclei of mouse embryos, which should result in their cointegration at a single site [2].

Clark et al. [7] have demonstrated in the mammary gland that coinsertion of the highly expressed sheep β-lactoglobulin gene with two poorly expressed transgenes (human factor IX cDNA and α1-antitrypsin cDNA) rescued the level of expression of the transgenes in transgenic mice. They showed that the sheep β-lactoglobulin gene created an environment in which juxtaposed cDNA transgenes were expressed. However, the α1-antitrypsin and factor IX cDNA transgenes appeared to respond differently, suggesting that the degree of activation may depend on the cDNA transgene used.

Current studies in our laboratory are investigating whether co-integration of a highly expressed pancreas-specific gene, i.e. the hGH gene driven by the elastase enhancer/promoter [27], with a bacterial cellulase construct will increase the expression of the latter gene. Twelve founder lines were identified, bred and analysed. The level of expression of celE' in expressing lines was not significantly higher than that obtained with the celE' constructs alone, though the hGH gene was apparently expressed at high levels. These results seem to be in agreement with the recent observations of McKnight et al. [26]. They tried to activate the mouse whey acidic protein (WAP) cDNA, which was virtually silent in transgenic mice, by cointegrating it with a WAP genomic transgene, which is expressed in approximately 50% of the lines at levels ranging from 1% to more than 100% of endogenous RNA. However, the activity of the WAP cDNA transgene did not exceed 1% of that of the WAP genomic transgene. This suggests that an integration site capable of supporting basal level transcription can be established, but that additional sequences or events are required for activation of the transgene. Our current work therefore involves coinjecting the celE' construct with the full-length elastase gene, on the basis that the elastase gene is expressed at high levels independent of its chromosomal position, i.e. it might contain locus control elements [39].

2.6. Intestinal enterocyte specific expression

In order to maximise the synthesis and secretion of cellulases in the gastrointestinal tract, the genes encoding such enzymes could be expressed at high levels in a variety of specific cell types, such as gastric cells, pancreatic acinar cells and intestinal enterocytes. The cells lining the gastrointestinal tract are polarized epithelial cells; they have distinct plasma membrane domains, termed apical and basolateral, which are separated by tight junctions. Proteins are thought to be sorted in the trans-Golgi network into different vesicle populations, which are then delivered either apically or basolaterally [34, 32]. Proteins that contain no sorting signals apart from an N-terminal signal peptide, which directs proteins into the endoplasmic reticulum, follow the constitutive or default pathway for secretion. In vitro experiments using the model human intestinal cell line Caco-2 have defined the default pathway in intestinal enterocytes to be predominantly basolateral [35]. In vivo this would result in the protein being directed to the blood rather than the gut lumen. However, we have been able to re-sort the protein to the apical membrane in Caco-2 cells (in vitro) by using a glycosyl-phosphatidylinositol (GPI) anchor sequence fused to the 3' end of the celE' gene [37].

The GPI recognition sequence comprises a C-terminal hydrophobic region that signals the attachment of a GPI anchor to the protein soon after translation [10]. The GPI moiety is then believed to act as an apical targeting signal, linking the C-terminus of the protein to the outer leaflet of the lipid bilayer of the apical membrane [24, 25, 4]. The GPI anchor signal has been shown to be an apical sorting signal in Madin-Darby canine kidney (MDCK) cells, as fusion of the signal to proteins normally located at the basolateral membrane redirects the polypeptides to the apical surface [4, 25]. However, recent evidence suggests that many endogenous GPI-anchored proteins contain additional sorting information, recognized in their ectodomains [6]. In contrast to MDCK cells, in polarized Fischer rat thyroid epithelial cells endogenous GPI-anchored proteins, and chimeric proteins which have the decay accelerating factor (DAF) GPI anchor signal sequence fused to the C-terminal end, are localized at the basolateral surface [41]. As the sorting mechanism varies considerably between different epithelial cell types, it cannot be assumed that a GPI anchor functions as an apical sorting signal in enterocytes.

Hence, in our laboratory we have investigated the vectorial secretion of endoglucanase E fused to the GPI sequence from Thy-1. Caco-2 cells were transfected with endoglucanase E or with endoglucanase E fused to GPI. In the cells transfected with EGE, 70% of the enzyme was secreted basolaterally and 30% apically (Fig. 4). These results define the constitutive pathway for protein secretion in Caco-2 [35]. However, in cells transfected with EGE fused to GPI, 80% of the extracellular form of the enzyme was routed through the apical membrane over a 24 hour period. EGE was also detected at the basolateral membrane (Fig. 4). The rates of delivery of EGE-GPI to the two membrane domains in Caco-2 cells, as determined by a biotinylation protocol, revealed apical delivery to be approximately 2.5 times that of basolateral delivery. Transcytosis of the basolateral EGE to the apical membrane was then observed [37]. These data indicate that a GPI anchor does represent a dominant apical sorting signal in this intestinal epithelial cell line. However, the missorting of a proportion of EGE-GPI to the basolateral surface of Caco-2 cells provides an explanation for the additional sorting signals in the ectodomain of some endogenous GPI-anchored proteins.

As we have established that a GPI anchor is a major apical targeting signal in intestinal enterocytes in vitro, we are testing the hypothesis that a GPI anchor can function as a dominant apical targeting signal in intestinal enterocytes of transgenic mice. To date, constructs with and without GPI, under the control of the intestinal fatty acid binding protein gene (iFABP) promoter/enhancer, have been injected into the male pronuclei of single-cell mouse embryos to create transgenic mice [38]. Six founder lines have been identified with the construct without GPI, and three of them are expressing the gene. With the construct containing celE' fused to GPI, fifteen founder lines have been established and six are expressing the enzyme. In mice carrying celE'-GPI the majority of the enzyme is sorted through the apical surface into the intestinal lumen; low levels of enzyme are detected in the blood, indicating a low level of missorting. The distribution of celE' in villus-associated enterocytes and goblet cells, and the lack of expression in the crypts of Lieberkühn, mimics that of the endogenous iFABP gene product [38]. The enzyme is also expressed differentially along the intestine, with maximum levels in the distal jejunum. Detailed analyses are currently being carried out.

2.7. Screening of additional enzymes

The ultimate goal of this project is to achieve complete hydrolysis of plant structural polysaccharides in the GI tract of simple-stomached animals. This will require the synergistic action of a consortium of endo- and exo-acting cellulases and xylanases. In order to achieve this objective we have looked beyond celE and analysed the proteinase sensitivity of several cellulases and xylanases from different enzyme families, which were derived from mesophilic and thermophilic organisms [11]. The data presented in Table 3 indicate that resistance to proteolysis is a common property of cellulases and xylanases, and that this characteristic cannot be attributed to specific enzyme families or to the thermotolerance of the endogenous host organism. It is apparent, therefore, that a range of cellulases and xylanases can be targeted to the GI tract of simple-stomached animals to improve their nutritional efficiency. In addition, a highly thermostable, protease-resistant xylanase has been isolated from Clostridium thermocellum, sequenced and characterized [12]. A truncated form of the xylanase gene encoding the catalytic domain has been successfully expressed in vitro and used to generate transgenic mice. Eleven lines of transgenic mice have been established and are currently being investigated for pancreatic xylanase activity.

[Figure 4: bar chart, vertical scale 0 to 80; basal (B) and apical (A) bars for EGE and for EGE-GPI.]

Figure 4. Relative secretion of endoglucanase E' and endoglucanase E' fused to GPI to the apical and basolateral membrane domains in Caco-2 cells. Results illustrate the proportion of celE' activity secreted to each membrane domain over 24 hours [35, 37].
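The secretion figures quoted in Section 2.6 can be compared on a common scale with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes that whatever fraction of EGE-GPI is not secreted apically is secreted basolaterally, which the text implies but does not state as a number.

```python
def polarity_ratio(apical_percent: float, basolateral_percent: float) -> float:
    """Apical:basolateral ratio from the percentages secreted to each domain."""
    if basolateral_percent <= 0:
        raise ValueError("basolateral_percent must be positive")
    return apical_percent / basolateral_percent


def apical_fraction_from_rate_ratio(apical_to_basolateral: float) -> float:
    """Fraction of total delivery that is apical, given the apical:basolateral rate ratio."""
    if apical_to_basolateral < 0:
        raise ValueError("rate ratio must be non-negative")
    return apical_to_basolateral / (1.0 + apical_to_basolateral)


# EGE alone: 30% apical, 70% basolateral (from the text).
ege_ratio = polarity_ratio(30.0, 70.0)

# EGE-GPI: 80% apical; the 20% basolateral remainder is an assumption.
ege_gpi_ratio = polarity_ratio(80.0, 20.0)

# Biotinylation data: apical delivery about 2.5 times basolateral delivery.
initial_apical_fraction = apical_fraction_from_rate_ratio(2.5)

print(f"EGE apical:basolateral     = {ege_ratio:.2f}")
print(f"EGE-GPI apical:basolateral = {ege_gpi_ratio:.2f}")
print(f"EGE-GPI initial apical fraction = {initial_apical_fraction:.2f}")
```

Under these assumptions the initial apical fraction implied by the delivery rates (about 0.71) is lower than the 80% apical figure at 24 hours, which is the direction expected if some basolateral enzyme is later transcytosed to the apical membrane.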

Table 3
Resistance of different enzymes to proteolytic inactivation

                                                        Half-life of enzymes (min)
Enzyme             Enzyme family   Endogenous host    Pancreatin   Chymotrypsin   Duodenal juice
Endoglucanase A    A               B. fibrosolvens    10.4         11.2           11.1
Xylanase D         G               C. fimi            > 180        > 180          > 180
Endoglucanase I    -               C. mixtus          12.2         10.9           11.4
Endoglucanase I    E               C. thermocellum    > 180        > 180          > 180
Endoglucanase A    D               C. thermocellum    > 180        > 180          > 180
Endoglucanase F    A               C. thermocellum    > 180        > 180          > 180
Xylanase X         F               C. thermocellum    > 180        > 180          > 180
Xylanase A         G               N. patriciarum     > 180        > 180          > 180
Xylanase B         F               N. patriciarum     > 180        > 180          > 180
Endoglucanase B    A               N. patriciarum     13.8         11.4           11.2
Xylanase B         F               P. fluorescens     > 180        > 180          > 180
Endoglucanase E    A               P. fluorescens     > 180        > 180          N.D. (2)
Endoglucanase A    A               R. albus           12.2         11.9           12.2
Malate dehydr.     -               E. coli            4.3          5.1            4.8
NADH reductase     -               E. coli            7.3          6.9            5.8
β-galactosidase    -               E. coli            13.5         12.8           11.3

(1) Uncharacterized enzyme. (2) Not determined.
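To give a feel for what the half-lives in Table 3 mean in practice, the sketch below converts a half-life into the fraction of activity remaining after a given exposure time. It assumes simple first-order (exponential) inactivation, which the table does not state, and the function name is illustrative only.

```python
def fraction_remaining(half_life_min: float, elapsed_min: float) -> float:
    """Fraction of initial activity left after elapsed_min, assuming first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    if elapsed_min < 0:
        raise ValueError("elapsed_min must be non-negative")
    return 0.5 ** (elapsed_min / half_life_min)


# Pancreatin half-lives taken from Table 3 (minutes).
pancreatin_half_life = {
    "Endoglucanase A (B. fibrosolvens)": 10.4,
    "Malate dehydrogenase (E. coli)": 4.3,
}

exposure_min = 60.0  # hypothetical exposure time, chosen only for illustration
for enzyme, half_life in pancreatin_half_life.items():
    left = fraction_remaining(half_life, exposure_min)
    print(f"{enzyme}: {left:.1%} of activity left after {exposure_min:.0f} min")

# For entries reported as "> 180", the table gives only a lower bound on the
# half-life, so at least half of the activity survives a 180 min exposure.
print(f"Lower bound for '> 180' entries at 180 min: {fraction_remaining(180.0, 180.0):.0%}")
```

The enzymes listed as "> 180" can therefore only be bounded from below, whereas the sensitive enzymes lose most of their activity within an hour under the same assumption.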

3. CONCLUSION

This novel research has succeeded in introducing an endoglucanase into the repertoire of digestive enzymes secreted into the gastrointestinal tract of a simple-stomached animal. Endoglucanase E has been successfully expressed in the exocrine pancreas and intestinal enterocytes of transgenic mice. A GPI anchor sequence has been shown to be a dominant apical targeting signal in intestinal cells in vitro, and we have tested the hypothesis that a GPI anchor can function as a major apical sorting signal in intestinal enterocytes in transgenic mice. Initial results show that a GPI anchor can re-sort the majority of the enzyme through the apical surface in vivo. Furthermore, a poultry trial has shown the potential of EGE in improving the nutrition of simple-stomached animals. Finally, we have screened several other plant polysaccharide degrading enzymes that are resistant to intestinal proteases, which makes them potential candidates for expression in the gastrointestinal tract of transgenic mice. A highly thermostable, protease-resistant xylanase gene has been isolated, sequenced and expressed in a polarized epithelial cell line and also in transgenic mice. In addition, this work will provide a foundation for future modification of the repertoire of proteins secreted into the GI tract to include proteins with antimicrobial activities. These proteins could improve the resistance of animals to GI tract infections.

4. ACKNOWLEDGEMENTS

We thank Carl Morland for technical assistance, and A. John Clark, J. Paul Simons, Roberta Wallace and Azim Surani for their help in creating the initial transgenic lines. This work was funded by the Biotechnology and Biological Sciences Research Council (including grant LR 13/573). We would also like to thank the staff at the Comparative Biology Centre at Newcastle for breeding and maintaining the transgenic lines.

5. REFERENCES

1. P. Beguin and J.P. Aubert, FEMS. Microbiol. Rev., 13, 25 (1994) 58.
2. R.R. Behringer, T.M. Ryan, M.P. Reilly, T. Asakura, R.D. Palmiter and R.L. Brinster, Science, 245 (1989) 971.
3. R.L. Brinster, J.M. Allen, R.R. Behringer, R.E. Gelinas and R.D. Palmiter, Proc. Natl. Acad. Sci. USA, 85 (1988) 836.
4. D. Brown, B. Crise and J.K. Rose, Science, 245 (1989) 1499.
5. A.R. Buchman and P. Berg, Mol. Cell. Biol., 8 (1988) 4395.
6. J.E. Casanova, G. Apodaca and K. Mostov, Cell, 66 (1991) 65.
7. A.J. Clark, A. Cowper, R. Wallace, G. Wright and J.P. Simons, Biotechnology, 10 (1992) 1450.
8. D. Drayna, C. Fielding, J. McLean, B. Baer, G. Castro, E. Chen, L. Comstock, W. Henzel, W. Kohr, L. Rhee, K. Wion and W. Lawn, J. Biol. Chem., 261 (1986) 16535.
9. A.J. Durrant, J. Hall, G.P. Hazlewood and H.J. Gilbert, Biochem. J., 273 (1991) 289.
10. M.A.J. Ferguson, M. Duszenko, G.S. Lamont, P. Overath and G.A.M. Cross, J. Biol. Chem., 261 (1986) 356.
11. C.M.G.A. Fontes, J. Hall, B.H. Hirst, G.P. Hazlewood and H.J. Gilbert, Appl. Microbiol. Biotechnol., 43 (1995a) 52.
12. C.M.G.A. Fontes, G.P. Hazlewood, E. Morag, J. Hall, B.H. Hirst and H.J. Gilbert, Biochem. J., 307 (1995b) 151.
13. H.J. Gilbert and G.P. Hazlewood, Proc. Nutr. Soc., 50 (1991) 173.
14. F. Grosveld, G.B. Van Assendeift, D.R. Greaves and G. Kollias, Cell, 51 (1987) 975.
15. P. Gruss and G. Khoury, Nature, 286 (1980) 634.
16. J. Hall, P. Barker, G.P. Hazlewood and H.J. Gilbert, Gene, 69 (1988a) 29.
17. J. Hall and H.J. Gilbert, Mol. Gen. Genet., 213 (1988b) 112.
18. J. Hall, G.P. Hazlewood, N.S. Huskisson, A.J. Durrant and H.J. Gilbert, Mol. Microbiol., 3 (1989) 1211.
19. J. Hall, B.H. Hirst, G.P. Hazlewood and H.J. Gilbert, Biochim. Biophys. Acta, 1130 (1990a) 259.
20. J. Hall, G.P. Hazlewood, M.A. Surani, B.H. Hirst and H.J. Gilbert, J. Biol. Chem., 265 (1990b) 19996.
21. J. Hall, S. Ali, M.A. Surani, G.P. Hazlewood, A.J. Clark, J.P. Simons, B.H. Hirst and H.J. Gilbert, Biotechnol., 11 (1993) 376.
22. K. Hesselman and P. Aman, Animal Feed Sci. Technol., 15 (1986) 83.
23. R. Lamed and E.A. Bayer, Adv. Appl. Microbiol., 33 (1988) 1.
24. M.P. Lisanti, M. Sargiacomo, L. Graeve, A.R. Saltiel and E. Rodriguez-Boulan, Proc. Natl. Acad. Sci. USA, 85 (1988) 9557.
25. M.P. Lisanti, I.W. Caras, M.A. Davitz and E. Rodriguez-Boulan, J. Cell. Biol., 109 (1989) 2145.
26. R.A. McKnight, R.J. Wall and L. Hennighausen, Transgenic Research, 4 (1995) 39.
27. D.M. Ornitz, R.D. Palmiter, R.E. Hammer, R.L. Brinster, G.H. Swift and R.L. MacDonald, Nature, 313 (1985) 600.
28. R.D. Palmiter, E.P. Sandgren, M.R. Avarbock, J.M. Allen and R.L. Brinster, Proc. Natl. Acad. Sci. USA, 88 (1991) 478.
29. D. Pettersson and P. Aman, Br. J. Nut., 62 (1989) 139.
30. J.S. Philips, H.J. Gilbert and R.R. Smithard, Br. Poultry. Sci. (1995) in press.
31. D.M. Poole, G.P. Hazlewood, J.I. Laurie, P.J. Barker and H.J. Gilbert, Mol. Gen. Genet., 223 (1990) 217.
32. E. Rodriguez-Boulan and S.K. Powell, Annu. Rev. Cell. Biol., 8 (1992) 395.
33. G.A. Scheele and H.F. Kern, in: Handbook of physiology: The gastrointestinal system, Vol. III, pp. 447-498, J.G. Forte (ed.), American Physiological Society, Bethesda, MD, 1989.
34. K. Simons and A. Wandinger-Ness, Cell, 62 (1990) 207.
35. K.L. Soole, J. Hall, M.A. Jepson, G.P. Hazlewood, H.J. Gilbert and B.H. Hirst, J. Cell. Sci., 102 (1992) 495.
36. K.L. Soole, B.H. Hirst, G.P. Hazlewood, H.J. Gilbert, J.L. Laurie and J. Hall, Gene, 125 (1993) 85.
37. K.L. Soole, M.A. Jepson, G.P. Hazlewood, H.J. Gilbert and B.H. Hirst, J. Cell. Sci., 108 (1995) 369.
38. D.A. Sweetser, S.M. Hauft, P.C. Hoppe, E.H. Birkenmeier and J.I. Gordon, Proc. Natl. Acad. Sci. USA, 8 (1988) 9611.
39. G.H. Swift, R.E. Hammer, R.J. MacDonald and R.L. Brinster, Cell, 38 (1984) 639.
40. C.B.A. Whitelaw, A.L. Archibald, S. Harris, M. McClenaghan, J.P. Simons and A.J. Clark, Transgenic Research, 1 (1991) 3.
41. C. Zurzolo, M.P. Lisanti, I. Caras, I. Nitsch and E. Rodriguez-Boulan, J. Cell. Biol., 121 (1993) 1031.


S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.


Mechanism and action of glucansucrases

John F. Robyt

Laboratory of Carbohydrate Chemistry and Enzymology, Dept. Biochemistry and Biophysics, Iowa State University, Ames, Iowa, 50011, USA.

Abstract

There are several different glucansucrases elaborated by Leuconostoc mesenteroides strains and Streptococcus species. They catalyze the synthesis of glucans, with different structures, from sucrose. Glucans that have contiguous α-1→6 linked glucose residues are known as dextrans. The dextrans differ from each other in the type of branch linkages (α-1→2, α-1→3 or α-1→4), the percentage of branching, the length of the branch chains, and their spatial arrangement. Some of these dextrans are very highly branched, with 35%-50% branch linkages. Other glucans with contiguous α-1→3 linked glucose residues in the main chains are known as mutans, and glucans with alternating α-1→6 and α-1→3 linked glucose residues in the main chains are known as alternans. The enzymes that catalyze glucan synthesis from sucrose form covalent glucosyl- and glucanyl-enzyme complexes. They transfer the glucosyl unit to the reducing end of the growing glucan chain by a two-site insertion mechanism. Branching of the glucans occurs when a glucan chain acts as an acceptor and attacks the covalent glucosyl- or glucanyl-enzyme complex. Glucose or a glucan chain is transferred to the acceptor chain, where it is attached by a branch linkage. When other carbohydrates, in addition to sucrose, are present in the enzyme digest, the enzyme transfers glucose to the carbohydrate acceptors in a secondary reaction that diverts some of the glucose from incorporation into glucan. Many carbohydrate acceptors have been recognized, and the products that result depend on the particular enzyme and the structure of the particular acceptor. The two-site insertion mechanism has been confirmed by determining that there are two sucrose binding sites at the active site and that chemical modification of one of the sites stops glucan synthesis but does not stop the acceptor reactions.
It has further been shown that activation of the dextransucrases by the addition of exogenous dextran is by an allosteric mechanism rather than by a primer dependent mechanism.

1. SOURCES AND STRUCTURES OF THE GLUCANS SYNTHESIZED FROM SUCROSE BY GLUCANSUCRASES

In 1941, Hehre [1] reported the first cell-free synthesis of dextran from sucrose. In 1954, Jeanes et al. [2] reported the synthesis of several different kinds of glucans by 96 strains of Leuconostoc mesenteroides. The enzymes responsible for the synthesis are glucansucrases. They are secreted into the culture medium by Leuconostoc and Streptococcus strains and species. These two genera are Gram-positive, facultatively anaerobic cocci that are closely related to each other. One notable difference between them is that, until recently, Leuconostoc strains required sucrose in the culture medium to induce the formation of the enzyme(s), whereas the Streptococcus species did not. Thus, the Leuconostoc strains were inducible for the formation of the glucansucrases, and the Streptococcus species were constitutive for their formation. Kim and Robyt [3] recently reported the mutation of several Leuconostoc strains (B-512FM, B-742, B-1142, B-1299, and B-1355) with ethyl methane sulfonate and obtained mutants that were constitutive for their various glucansucrases. All of the mutants, when grown on glucose, produced higher glucansucrase activities (3 to 22 times) than the parent strains grown on sucrose.

In 1954, Jeanes et al. [2] characterized the polysaccharides synthesized by the 96 strains of Leuconostoc by optical rotation, viscosity, periodate oxidation, and their physical appearance after alcohol precipitation. The precipitates were observed to have different appearances, which Jeanes et al. described in qualitative terms such as pasty, fluid, stringy, tough, long, short, flocculent, crumbly, etc. This was an early indication of differences in the structures of the polysaccharides. Some of the polysaccharides were water-soluble, and some of the strains of organisms elaborated more than one kind of polysaccharide. Wilham et al. [4] reported the separation of these mixtures of polysaccharides by differential alcohol precipitation. The structures of the polysaccharides have been determined by methylation [5-8] and 13C-NMR [9-18].

Initially, all of the glucans synthesized from sucrose by the various culture supernatants were considered to be dextrans. Dextrans are defined as glucans that have main chains composed of contiguous α-1→6 linked glucopyranose residues. As knowledge of the different structures was obtained, it was recognized that the different kinds of dextrans differed from each other in the types, the amounts, the lengths, and the arrangements of the branch linkages. The principal type of branch linkage is α-1→3, but α-1→2 and α-1→4 branch linkages have also been found (see Table 1). It also became apparent that there were glucans that did not fit the definition of a dextran. In particular, there were glucans that had contiguous α-1→3 linkages in the main chains and glucans that had alternating α-1→6 and α-1→3 linkages in the main chains instead of contiguous α-1→6 linkages.

Figure 1. Structural representation of segments of different glucans synthesized by glucansucrases from sucrose. A is Leuc. mesenteroides B-512F dextran; B, Leuc. mesenteroides B-742 regular comb dextran; C, Strep. mutans alternating comb dextran; D, Strep. mutans mutan; E, Leuc. mesenteroides B-1355 alternan; and F, Leuc. mesenteroides B-1299 dextran. Separate symbols represent a glucose residue linked α-1→6, α-1→3, or α-1→2 to another glucose residue.

Table 1
Glucans synthesized from sucrose by glucansucrases from selected Leuconostoc and Streptococci

Columns: species and strain no. (a); percent of linkages (1→6, 1→3, 1→3 Br (b), 1→2 Br (b), 1→4 Br (b)); description of the ethanol precipitate.

L.m. B-512F    95 5       translucent gel
L.m. B-742     87 13      heavy, opaque
L.m. B-742     50 50      fine (c)
L.m. B-1299    66 1 27    flocculent (c)
L.m. B-1299    65 35      fine (c)
L.m. B-1355    95 5       translucent gel
L.m. B-1355    54 35 11   heavy, opaque
L.m. B-1191    94 2 4     cohesive, stringy (c)
L.m. B-1308    95 5       pasty, crumbly
S.s. B-1526    83 17      fluid, stringy (c)
S.v. B-1351    89 11      short (c)
S.m. 6715      64 36      heavy, opaque
S.m. 6715      4 96 2     water-insoluble
L.m. B-523     100        water-insoluble
L.m. B-1149    100        water-insoluble

(a) L.m. = Leuconostoc mesenteroides; S.m. = Streptococcus mutans; S.s. = Strep. species; S.v. = Strep. viridans; B-numbers refer to the strain number in the Northern Regional Research Laboratory Collection (NRRL) of the USDA Laboratory, Peoria, IL.
(b) Br = branch linkage.
(c) Description taken from Jeanes et al., J. Am. Chem. Soc., 76 (1954) 5041.

Leuc. mesenteroides B-512F(M) produces only one glucan, a dextran, that has 95% α-1→6 linkages in the main chains and 5% α-1→3 branch linkages [5,11]. The branches are of two types: single glucose units, and relatively long α-1→6 linked chains attached to an α-1→6 linked chain by an α-1→3 branch linkage (see Fig. 1). Other dextrans contain a much higher percentage of α-1→3 branch linkages. For example, Leuc. mesenteroides B-742 dextran-S has 50% α-1→6 linkages in the main chains and 50% α-1→3 branch linkages, of which the majority are single glucose residues [8]. This is the highest degree of branching that can be obtained in a dextran. The structure that results is a bifurcated comb in which each of the single α-1→3 linked glucose residues is like a tooth of a comb on a backbone of α-1→6 linked chains (see Fig. 1). A dextran of this type would be highly resistant to endo-dextranase hydrolysis. Leuc. mesenteroides B-742 also produces another dextran that has 7% α-1→4 branch linkages, instead of α-1→3 branch linkages, attached to an α-1→6 linked main chain.

Strep. mutans 6715 also elaborates two glucansucrases. One is a dextransucrase (sometimes called glucosyltransferase-soluble or GTF-S) that synthesizes a water-soluble dextran reported to have 35% α-1→3 branch linkages, consisting primarily of single glucose residues attached to an α-1→6 linked main chain [6]. This dextran therefore also has a relatively high degree of branching. The degree of branching requires that one out of every two glucose residues of the main chains is branched. If the single branch glucose residues are uniformly distributed along the α-1→6 chain, the result is an alternating, bifurcated comb structure in which the single branch glucose residues are attached by α-1→3 linkages to every other glucose residue in the main chains (see Fig. 1). This dextran would also be resistant to endo-dextranase hydrolysis. The second enzyme elaborated by Strep. mutans 6715 synthesizes a water-insoluble glucan that has contiguous α-1→3 linked glucose residues instead of contiguous α-1→6 linkages [13] and obviously is not a dextran. It is totally resistant to endo-dextranase and is called mutan, and its enzyme is called mutansucrase (sometimes glucosyltransferase-insoluble or GTF-I).

Leuc. mesenteroides B-1355 also elaborates two glucansucrases. The first enzyme synthesizes a dextran very similar in structure to B-512F dextran. The second enzyme, however, synthesizes a glucan that has alternating α-1→6 and α-1→3 linked glucose residues in the main chains with 11% α-1→3 branch linkages [16,17]. This polysaccharide is also not a dextran. It has been called alternan and its enzyme, alternansucrase. Alternan is also totally resistant to endo-dextranase hydrolysis.
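The statements above about comb structures can be checked with a short calculation. The sketch below is a simplification that assumes every branch is a single glucose residue, so that the fraction of branch linkages equals the fraction of residues sitting on branches; the function name is hypothetical and the model does not apply to dextrans with long branch chains, such as B-512F dextran.

```python
def branched_main_chain_fraction(branch_linkage_percent: float) -> float:
    """Fraction of main-chain residues carrying a single-glucose branch.

    Assumes single-residue branches, so branch residues make up b of all
    residues and main-chain residues make up 1 - b.
    """
    if not 0 <= branch_linkage_percent <= 50:
        # Above 50% there would be more branch residues than main-chain
        # residues, which single-residue branching cannot accommodate.
        raise ValueError("branch_linkage_percent must be between 0 and 50")
    b = branch_linkage_percent / 100.0
    return b / (1.0 - b)


# Branch linkage percentages quoted in the text.
for name, percent in [
    ("Leuc. mesenteroides B-742 dextran-S", 50.0),
    ("Strep. mutans 6715 soluble dextran", 35.0),
]:
    fraction = branched_main_chain_fraction(percent)
    print(f"{name}: {fraction:.2f} branches per main-chain residue")
```

With 50% branch linkages every main-chain residue carries a branch (the regular comb), and with 35% roughly one residue in two does (the alternating comb), in line with the structures described in the text.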

2. MECHANISM OF GLUCAN SYNTHESIS

The reaction of glucansucrases with sucrose can be simply formulated as follows:

$$n\,\text{Sucrose} \longrightarrow (\text{Glucose})_{n-m-w} + (n-m)\,\text{Fructose} + m\,\text{Leucrose} + w\,\text{Glucose}$$

The reaction is essentially irreversible. The main products are high molecular weight (1 × 10⁷ to 1 × 10⁸ Da) glucan and fructose, and the minor products are glucose and leucrose (where n >> m or w). Glucose arises from an acceptor reaction with water, and leucrose [5-O-α-D-glucopyranosyl-D-fructopyranose] arises from an acceptor reaction with the primary product, fructose. A discussion of the acceptor reactions is given below.

When dextransucrase was first described by Hehre in 1941 [19], Cori and Cori [20] and Swanson and Cori [21] were studying the action of muscle phosphorylase, and Hanes [22] was studying potato phosphorylase. These investigators observed that phosphorylase could elongate glycogen and starch chains by the transfer of glucose from α-glucose-1-phosphate (α-G-1-P) to the nonreducing-end glucose residues of glycogen and starch. This reaction did not take place unless a glycogen primer or a starch primer chain was present. A primer, that is, a preformed glycogen or starch chain, was thus an absolutely required constituent of the enzyme digest to obtain chain elongation. The phosphorylase reaction was later shown to have an equilibrium constant close to 1, so that the reaction could go either way, toward synthesis or toward degradation, depending on whether α-G-1-P or inorganic phosphate, respectively, was present in the dominant amount. The phosphorylase-catalyzed reaction was later shown to be a degradative process in vivo, in which the enzyme catalyzes the reaction of Pi with the nonreducing-end glucose residue of the glycogen or starch chain to give α-G-1-P as the product [23]. The so-called synthetic reaction, requiring a primer, was the reverse of the degradative reaction, which indeed requires the glycogen or starch chain for reaction. This, however, was not appreciated, and the primer mechanism for polysaccharide synthesis became firmly established without any direct experimental proof. Thus, in the 1940s and 1950s, the primer mechanism was assumed for the synthesis of dextran [24-27].

The primer mechanism for dextran synthesis was somewhat strengthened in the 1970s when Germaine et al. [28-30] found that the addition of dextran to Strep. mutans dextransucrase digests increased the rate of dextran synthesis. Kobayashi and Matsuda [31,32] also reported that the purified dextransucrases of both Leuc. mesenteroides and Strep. mutans were stimulated by dextran, although both enzymes could synthesize dextran without the addition of dextran. The reaction was accompanied by a lag that could be abolished by the addition of dextran. Both groups interpreted their results as evidence for a primer-based mechanism for dextran synthesis.

There is, however, a significant difference between the synthesis of dextran by dextransucrase and the chain elongation of glycogen and starch by phosphorylase. Even though the rate of the dextransucrase reaction is stimulated by the addition of dextran, the stimulation is limited to 1.5 to 2.5 µg/mL of Dextran T-40 [33], depending on the enzyme and its concentration. Further, the maximum amount of stimulation is limited and equal for the different enzymes, and the different dextransucrases can synthesize dextran in the complete absence of any primer dextran. Robyt and Corrigan [34] found that dextrans modified by a blocking group (triisopropyl sulfonyl or tripsyl group) on C-6 of the nonreducing-end glucose residue increased the rate of dextran synthesis just as well as unmodified dextran. The modified dextran could not participate in a priming reaction, as the requisite site for the addition of glucose, the C-6 hydroxyl of the nonreducing-end glucose residue, was blocked by a tripsyl group. This showed that the added dextran was stimulating the reaction not by acting as a primer but by some other mechanism. Recently, Robyt et al. [33] showed that the activation of dextransucrases by the addition of dextran is by an allosteric mechanism rather than by a primer mechanism.
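The overall stoichiometry given at the start of this section can be checked for mass balance with a small Python sketch. The class and function names here are illustrative only, and the numbers in the example are made up; the point is simply that every glucose and fructose unit supplied by n sucrose molecules is accounted for among the products.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlucansucraseProducts:
    glucan_glucose_units: int  # glucose residues incorporated into glucan (n - m - w)
    free_fructose: int         # fructose released (n - m)
    leucrose: int              # fructose acceptor product (m)
    free_glucose: int          # water acceptor product (w)


def product_balance(n_sucrose: int, m_leucrose: int, w_glucose: int) -> GlucansucraseProducts:
    """Distribute n sucrose molecules among the products of the overall reaction."""
    if min(n_sucrose, m_leucrose, w_glucose) < 0:
        raise ValueError("counts must be non-negative")
    if m_leucrose + w_glucose > n_sucrose:
        raise ValueError("m + w cannot exceed n")
    return GlucansucraseProducts(
        glucan_glucose_units=n_sucrose - m_leucrose - w_glucose,
        free_fructose=n_sucrose - m_leucrose,
        leucrose=m_leucrose,
        free_glucose=w_glucose,
    )


# Hypothetical digest with n >> m, w (values chosen only for illustration).
products = product_balance(n_sucrose=1000, m_leucrose=20, w_glucose=5)

# Each sucrose supplies one glucose unit and one fructose unit.
glucose_total = products.glucan_glucose_units + products.leucrose + products.free_glucose
fructose_total = products.free_fructose + products.leucrose
print(products)
print(glucose_total, fructose_total)
```

Both totals equal n, which is the consistency check the subscript n − m − w on the glucan term is meant to guarantee.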

300 In 1974, Robyt et al. [35] presented experimental evidence for the insertion mechanism and the addition of glucose to the reducing-end of dextran. They used pulse and chase techniques with [u-lnc]-sucrose and Bio-Gel P-2 immobilized Leuc. Mesenteroides B-512F dextransucrase. The enzyme was located on the surface of the highly cross-linked Bio-Gel beads. The immobilized enzyme was pulsed with a low concentration of [U-lac]-sucrose. The enzyme beads were removed by centrifugation and washed with buffer several times to remove any soluble label, which was primarily lnc-fructose. Some label remained tightly associated with the immobilized enzyme. This label could be stoichiometrically released by adjusting the pH to 2 and heating at 55 ~ for 5 min. Chromatographic analysis showed that the released label was 14C-glucose and 14C-dextran. In a separate experiment, the pulsed-labeled enzyme beads were chased by incubation in nonlabeled sucrose solution for a short time. The enzyme beads were removed, washed with buffer, adjusted to pH 2, and heated at 55 ~ for 5 min. Chromatographic analysis of the released label showed only 14C-labeled dextran. These experiments showed that glucose and dextran were covalently attached to the enzyme during synthesis and that the glucose was being incorporated into the dextran. The dextrans produced in the pulse and chase experiments were purified (separated from glucose) by chromatography on Bio-Gel P-6. They were reduced with sodium borohydride and acid hydrolyzed. The hydrolyzates contained labeled D-glucitol and D-glucose. The ratio of labeled D-glucitol to labeled D-glucose was much higher in the pulse dextran hydrolyzate than it was in the chase dextran hydrolyzate. This experiment showed that the glucose was being transferred to the reducing-end of the growing dextran chain that was covalently attached to the enzyme active-site. Robyt et al. 
[35] proposed a two-site insertion mechanism to explain the results of the pulse and chase experiments. In this mechanism, there are two sucrose binding-sites and two nucleophiles, presumably two carboxylate anions, that attack the two sucrose molecules to give two covalent glucosyl-enzyme intermediates (see Fig. 2). The C-6 hydroxyl of one of these glucosyl intermediates makes a nucleophilic attack onto C-1 of the other glucosyl intermediate with the formation of an α-1→6 glycosidic linkage and an isomaltosyl-enzyme intermediate. The newly released nucleophile then attacks another sucrose molecule to give a new glucosyl-enzyme intermediate (this is shown in Fig. 2 as a concerted reaction, taking place as the nucleophile is displaced from the glucosyl unit). The C-6 hydroxyl of the new glucosyl-intermediate then attacks the C-1 of the isomaltosyl-intermediate to give a second α-1→6 linkage and the formation of an isomaltotriosyl-enzyme intermediate. The process continues in a similar fashion between the two sites, giving the synthesis of a dextran chain by the addition of glucose to the reducing-end of the chain and the apparent insertion of glucose between the enzyme and the growing chain. In this mechanism, a dextran chain can be synthesized de novo in a continuous manner without the presence of any pre-formed dextran primer. Robyt and Eklund [36] considered the stereochemistry of the reaction and concluded that the linkages of the glucosyl- and dextranyl-units to the enzyme must be β to retain the configuration of the glucose residue in going from sucrose to dextran. They further postulated that the C-6 hydroxyl is stereochemically placed so that it is apposed to the α-side of C-1 of the opposite β-glucosyl unit of the dextran chain, which then assumes a planar conformation to give an axial glycosidic bond to the enzyme (see Fig. 3).
The C-6 hydroxyl of the single glucosyl unit makes a nucleophilic attack onto the axially bonded C-1, displacing the enzyme nucleophile and forming an α-1→6 linkage. During synthesis, the growing dextran chain is transferred from one site to the other. The chain, however, does not have to move a great distance, as only one or two of the glucosyl residues at the reducing-end of the chain have to move a few angstroms to effect the transfer. The chain, thus, is extruded from the active-site as the glucose units are added to the reducing-end.
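The logic of the pulse and chase experiment can be illustrated with a toy model. The sketch below is not taken from ref. [35]; the function names, the chain lengths, and the simplification to a single chain are assumptions made only for illustration. It compares addition at the reducing end (insertion) with addition at the nonreducing end (primer-type elongation) and reports the fraction of label that would be recovered as glucitol after borohydride reduction and hydrolysis, since only the reducing-end residue is converted to glucitol.

```python
def grow_chain(n_pulse, n_chase, add_at_reducing_end):
    """Build a chain as a list of booleans (True = 14C-labeled residue).

    Index 0 is the nonreducing end; the last element is the reducing end.
    Labeled residues (pulse) are added first, then unlabeled ones (chase).
    """
    chain = []
    for labeled in [True] * n_pulse + [False] * n_chase:
        if add_at_reducing_end:
            chain.append(labeled)      # new residue becomes the reducing end
        else:
            chain.insert(0, labeled)   # new residue becomes the nonreducing end
    return chain


def label_fraction_in_glucitol(chain):
    """Fraction of the total label found in the reducing-end residue,
    which is the only residue converted to glucitol by reduction."""
    total_label = sum(chain)
    return (1 if chain[-1] else 0) / total_label


# Hypothetical chain lengths: 5 labeled residues, then 20 unlabeled ones.
for mode, at_reducing_end in (("insertion", True), ("primer-type", False)):
    pulse = label_fraction_in_glucitol(grow_chain(5, 0, at_reducing_end))
    chase = label_fraction_in_glucitol(grow_chain(5, 20, at_reducing_end))
    print(mode, pulse, chase)
```

In this model only the insertion mode lets the labeled glucitol fraction fall during the chase, which is the qualitative pattern reported for the pulse and chase dextrans.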

Figure 2. Two-site insertion mechanism for the synthesis of Leuc. mesenteroides B-512F dextran by dextransucrase. X orients the glucose units at the active-site so that their C-6 hydroxyl groups can make an attack onto C-1 of the apposed glucosyl unit. The symbols in the figure denote sucrose, glucose, and fructose; X− represents an enzyme nucleophile, and the linked-glucose symbol represents two glucose residues linked α-1→6.

An additional requirement for the reaction to take place is the transfer of a hydrogen ion to the displaced fructosyl moiety of sucrose [36]. Fu and Robyt [37] showed, by chemical modification of the enzyme with diethylpyrocarbonate and by Rose Bengal dye photo-oxidation, that two imidazolium groups of histidine were essential for dextran synthesis. They postulated that these two imidazolium groups donate their hydrogen ions to the leaving fructose units (see Fig. 3) and that the resulting imidazole group, in a second step, abstracts a proton from the attacking C-6 hydroxyl group of the glucosyl-enzyme intermediate, facilitating the nucleophilic attack and the formation of the α-1→6 linkage. The imidazole group thereby becomes reprotonated for the next reaction with sucrose. In 1983, Robyt and Martin [38] conducted similar [14C]-sucrose pulse and chase studies with Strep. mutans 6715 dextransucrase (GTF-S) and mutansucrase (GTF-I). They found that these two enzymes also had an insertion mechanism in which the glucose was added to the reducing end of the growing chain. For GTF-I, which catalyzes the synthesis of α-1→3 glycosidic linkages, the stereochemistry of the enzyme-glucosyl unit must be such that the C-3 hydroxyl is positioned to make the nucleophilic attack onto the opposite glucosyl unit of the growing chain to give α-1→3 linkages (see Fig. 4). The synthesis of the dextran chain by GTF-S dextransucrase occurs in a manner similar to that of B-512F dextransucrase. In 1984, Ditson and Mayer [39] confirmed the synthesis of dextran from the reducing-end by Strep. sanguis GTF-S dextransucrase.

Figure 3. Mechanism for the cleavage of sucrose and the formation of an α-1→6 glycosidic bond by dextransucrase. Reaction 1: nucleophilic displacement and protonation of the leaving fructose moiety to form a glucosyl-enzyme intermediate. Reaction 2: formation of an α-1→6 glycosidic bond by attack of a C-6 hydroxyl group onto C-1 of a glucosyl-enzyme complex; the attack is facilitated by abstraction of a proton from the hydroxyl group by the imidazole group.

Mechanisms for the synthesis of other glucans, such as Leuc. mesenteroides B-1355 alternan, also can be formulated by a two-site insertion mechanism. The mechanism for the synthesis of alternan can be postulated to have the two glucosyl-intermediates stereochemically positioned differently. On one site (the X-site), the glucosyl-intermediate is stereochemically positioned so that only its C-6 hydroxyl is in position to make the attack onto C-1 of the opposite glucosyl-enzyme intermediate to give an α-1→6 linkage, and on the other site (the Y-site), the glucosyl-enzyme intermediate is stereochemically positioned so that only its C-3 hydroxyl makes the attack onto C-1 of the opposite glucosyl-intermediate to give an α-1→3 linkage. In this manner, the chain goes back and forth between the two sites, giving an alternating synthesis of α-1→6 and α-1→3 glycosidic linkages (see Fig. 5).

Figure 5. Two-site insertion mechanism for the synthesis of Leuc. mesenteroides B-1355 alternan by alternansucrase. The symbols are the same as in Figs. 2 and 4, with the addition that the two nucleophiles are X and Y; X orients its glucosyl unit so that its C-6 hydroxyl group can make an attack onto C-1 of the apposed unit and Y orients its glucosyl unit so that its C-3 hydroxyl group can make an attack onto C-1 of the apposed glucosyl unit.

Su and Robyt [40] confirmed the two-site mechanism for Leuc. mesenteroides B-512FM dextransucrase, using equilibrium dialysis with 6-deoxy sucrose, a strong competitive inhibitor for the enzyme. They showed that there are two sucrose binding-sites at the active-site. They further showed that two sites were required for dextran synthesis, as shown in Fig. 2, and one site for acceptor-product synthesis, as shown in Fig. 8, by determining the relative decrease in the rate of dextran synthesis and the rate of acceptor-product synthesis as a function of diethylpyrocarbonate modification of histidine. The argument was based on the hypothesis that if two sites were required for glucan synthesis and one of the sites is modified, synthesis of glucan would stop, but if only one of the two sites is required for the acceptor-reaction, the acceptor-reaction can still occur when only one site is modified. This modification should, therefore, produce a difference in the relative rates of decrease of the synthesis of dextran and of acceptor-products, because modification of one site stops dextran synthesis but does not stop acceptor-product synthesis. The experimental results verified the hypothesis, as the enzyme lost the ability to synthesize dextran more rapidly than it lost the ability to synthesize acceptor-products [40].
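The reasoning behind the histidine modification experiment can be made concrete with a small probability sketch. It is only an illustration and not the analysis used in ref. [40]: it assumes that each of the two sites is modified independently with the same probability, and the function name is hypothetical. Under those assumptions, dextran synthesis needs both sites intact, whereas the acceptor-reaction needs at least one.

```python
def fraction_active(p_modified):
    """Return (dextran-competent, acceptor-competent) enzyme fractions.

    p_modified: assumed probability that any one site is modified
    (independent and identical for both sites; an illustrative assumption).
    """
    both_intact = (1 - p_modified) ** 2       # two sites needed for dextran synthesis
    at_least_one = 1 - p_modified ** 2        # one site suffices for the acceptor-reaction
    return both_intact, at_least_one


for p in (0.1, 0.5, 0.9):
    dextran, acceptor = fraction_active(p)
    print(f"p={p}: dextran {dextran:.2f}, acceptor {acceptor:.2f}")
```

For any degree of modification between 0 and 1, the dextran-competent fraction falls below the acceptor-competent fraction, consistent with the faster loss of dextran synthesis described above.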

3. SYNTHESIS OF BRANCH LINKAGES IN LEUC. MESENTEROIDES B-512F DEXTRAN

In 1959, Bovey [41] attempted to study branch formation in B-512F dextran using light scattering measurements. He postulated that there was a branching enzyme similar to the branching enzyme found in the biosynthesis of starch. The branching enzyme, however, has never been found. In 1967, Ebert and Brosche [42] proposed a reaction for the formation of branches in which a dextran chain itself acts as an acceptor, attacking an enzyme-dextran complex so that the acceptor dextran becomes the main chain and the dextran chain from the enzyme becomes the side (branch) chain. Using a [3H]-labeled acceptor dextran of low molecular weight and assuming an average molecular weight of 4 × 10⁵ for the synthesized dextran, they calculated from the specific activity of the synthesized product that there was only one labeled acceptor dextran molecule in each synthesized dextran. While this seemed to be proof for the proposed mechanism, some doubt was cast on the mechanism because of the assumptions and the circular arguments that were made. In 1976, Robyt and Taniguchi [43] reported their studies on the acceptor branching reaction using Bio-Gel P-2 immobilized Leuc. mesenteroides B-512FM dextransucrase. The immobilized enzyme was labeled by incubating it with a relatively low concentration of [14C]-sucrose. In a second procedure, the immobilized enzyme was first incubated with nonlabeled sucrose, washed, and then labeled with a low concentration of [14C]-sucrose. In both experiments, the labeled material was shown to be glucose and dextran. When either of the labeled, immobilized enzymes was incubated with a low molecular weight, nonlabeled dextran, all of the enzyme-bound label was released as [14C]-dextran. No [14C]-labeled dextran was released when the labeled enzyme was incubated in buffer alone. The released [14C]-dextran was shown to be slightly branched by hydrolysis with an exo-dextranase. Acetolysis of the labeled dextran gave 7.3% of the 14C in nigerose.
Reduction of the labeled nigerose, followed by acid hydrolysis, gave all of the label in glucose, demonstrating that the nigerose was exclusively labeled in the nonreducing glucose residue. The results of the experiments indicated that the [14C]-label was being released by the action of the added low molecular weight dextran (acceptor dextran) and that this action gave the formation of a new α-1→3 branch linkage. Robyt and Taniguchi [43] proposed a mechanism for the synthesis of branch linkages by Leuc. mesenteroides B-512FM dextransucrase in which a C-3 hydroxyl of an interior glucose residue on an acceptor dextran makes a nucleophilic attack onto C-1 of either the glucosyl-enzyme complex or the dextranyl-enzyme complex, thereby forming an α-1→3 branch linkage by displacing glucose and dextran from the enzyme (Fig. 6). Thus, branching can take place without a separate enzyme by the action of an acceptor dextran on the glucosyl- and dextranyl-dextransucrase complexes.

Figure 6. Mechanism for the synthesis of α-1→3 branch linkages by Leuc. mesenteroides B-512F dextransucrase. The C-3 hydroxyl of an acceptor dextran chain makes an attack onto (A) C-1 of the glucosyl unit to give a single branched glucose linked α-1→3 or (B) C-1 of the glucosyl unit of the dextranyl chain to give a long α-1→3 linked branched dextran chain.

4. ACCEPTORS AND THE ACCEPTOR-REACTION OF GLUCANSUCRASES

In addition to catalyzing the synthesis of dextran from sucrose, dextransucrase also catalyzes the transfer of glucose from sucrose to other carbohydrates that are present in or are added to the digest [44,45]. The added carbohydrates are called acceptors and the reaction is called an acceptor-reaction. When the acceptor is a monosaccharide or disaccharide, a series of oligosaccharide acceptor-products usually is produced [46]. Fig. 7 shows a chromatographic analysis of the acceptor-products that result when maltose, D-glucose, cellobiose, and lactose are the added acceptors with B-512F dextransucrase and sucrose. There are two classes of acceptors: those that give a homologous series of oligosaccharides, each differing from the next by one glucose residue, and those that form only a single acceptor-product containing one glucose residue more than the acceptor. Koepsell et al. [25] and Tsuchiya et al. [26] also observed that the presence of low molecular weight acceptors shifted the course of the reaction from the synthesis of high molecular weight dextran to the synthesis of a lower molecular weight dextran. Robyt and Eklund [36] showed that the amount of dextran synthesized decreased as the molar ratio of maltose (the best known acceptor) to sucrose increased. When D-glucose, methyl-α-D-glucopyranoside, maltose, and isomaltose are the acceptors, the glucose from sucrose is transferred to the C-6 hydroxyl of the monosaccharide or to the C-6 hydroxyl of the nonreducing-end glucose residue of the disaccharides to give a series of isomaltodextrins of degree of polymerization (d.p.) 2 to 7 attached to the acceptor [47,48]. The first product in the series with isomaltose is isomaltotriose and the first product in the series with maltose is panose (6²-α-D-glucopyranosyl maltose) [46]. The next product in the maltose series is a tetrasaccharide, 6²-α-isomaltosyl maltose, and the other members of the series have isomaltodextrin chains of increasing degrees of polymerization linked to the C-6 hydroxyl group of the nonreducing-end glucose residue of maltose [44]. Similar homologous series are obtained from nigerose, 1,5-anhydro-D-glucitol, and turanose [44]. The amount of each saccharide product in the series decreases as the d.p. increases, usually terminating at d.p. 6 or 7. Cellobiose gives an unusual series in which the first product is 2¹-α-D-glucopyranosyl cellobiose, with glucose attached to the C-2 hydroxyl group of the reducing-end glucose residue [47,48]. The succeeding products of the cellobiose series had the glucose unit of sucrose transferred to the C-6 hydroxyl of the glucose attached to C-2 of the reducing residue of cellobiose. When the cellobiose analog, lactose, was the acceptor, only one acceptor-product was formed, 2¹-α-D-glucopyranosyl lactose [48-50]. There seems to be a pattern that when D-galactose composes part of the acceptor structure, only one acceptor-product is formed; for example, raffinose [6ᴳˡᶜ-α-D-galactopyranosyl sucrose] also gave only a single acceptor-product, 2ᴳˡᶜ-α-D-glucopyranosyl raffinose [51]. When fructose is the acceptor, two products are formed, depending on the ring form of the fructose acceptor. The major product, leucrose [5-O-α-D-glucopyranosyl-D-fructopyranose], is formed from D-fructopyranose, and the minor product, isomaltulose [6-O-α-D-glucopyranosyl-D-fructofuranose], is formed when D-fructofuranose is the acceptor [52-54].
Because D-fructose is a major product in the dextransucrase synthesis of dextran from sucrose, it acts as an acceptor to give leucrose in all dextransucrase-sucrose digests. A small amount of D-glucose also is formed when water acts as an acceptor [44]. This reaction represents the hydrolysis of sucrose. Other unusual acceptor-products result from the reaction of D-mannopyranose and D-galactofuranose. D-Mannopyranose gave a nonreducing α,β-trehalose isomer, α-D-glucopyranosyl-β-D-mannopyranoside, and D-galactofuranose gave α-D-glucopyranosyl-β-D-galactofuranoside [55]. Fu and Robyt [56,57] studied the structures of the acceptor-products of the maltodextrins, maltotriose to maltooctaose (G3-G8), synthesized by Leuc. mesenteroides B-512FM dextransucrase [56] and by Strep. mutans dextransucrase (GTF-S) and mutansucrase (GTF-I) [57]. They found that B-512FM dextransucrase transfers D-glucose to the C-6 hydroxyl of both the nonreducing-end and the reducing-end residues of G3-G8. G3, thus, gave two tetrasaccharides, 6³-α-D-glucopyranosyl maltotriose and 6¹-α-D-glucopyranosyl maltotriose. The former acceptor-product was also an acceptor, giving a homologous series of isomaltodextrins attached to the C-6 hydroxyl of the nonreducing-end glucose residue. The acceptor-product with glucose attached to the reducing-end residue, however, was not an acceptor. This same pattern was observed for the other maltodextrins studied [56]. None of the glucose residues between the reducing-end glucose and the nonreducing-end glucose served as acceptor sites.


Figure 7. TLC analysis of the products formed in acceptor reactions of B-512F dextransucrase. 1 and 7, isomaltodextrin standards; 2, sucrose digest; 3, maltose acceptor digest; 4, glucose acceptor digest; 5, cellobiose acceptor digest; 6, lactose acceptor digest; 8, maltose, cellobiose, and lactose standards.

Côté and Robyt [58] studied the acceptor-products catalyzed by alternansucrase. They found that alternansucrase was capable of forming both α-1→6 and α-1→3 glycosidic bonds with acceptors. Isomaltose gave both isomaltotriose and 3²-α-D-glucopyranosyl isomaltose. These initial acceptor-products also acted as acceptors, and the structures of the products of higher d.p. show that an α-1→3 glycosidic bond is formed only when the nonreducing-end glucose residue is linked by an α-1→6 bond to another glucose residue. Nigerose, thus, gave 6²-α-glucopyranosyl nigerose. Maltose gave 6²-α-glucopyranosyl maltose, but this saccharide gave an unusual tetrasaccharide, 6²-α-nigerosyl maltose, in which there are three types of glycosidic linkages in sequence from the nonreducing-end: α-1→3, α-1→6, and α-1→4. Thus, alternansucrase can synthesize both α-1→6 and α-1→3 acceptor-product linkages. When the nonreducing residue of the acceptor is linked by an α-1→6 linkage, alternansucrase can transfer glucose to either C-6-OH or C-3-OH to give an α-1→6 or α-1→3 linked glucose unit, but when the nonreducing glucose unit of the acceptor is linked by an α-1→3 or α-1→4 bond, alternansucrase will transfer glucose only to C-6-OH of the nonreducing glucose residue. Another unusual feature was that nigerose was a better acceptor than isomaltose.
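The linkage rule described for alternansucrase acceptor reactions can be written as a small lookup. The sketch below only encodes the rule as stated in the text; the function name and the string notation for linkages are hypothetical choices made for illustration.

```python
def alternansucrase_transfer_sites(nonreducing_linkage):
    """Hydroxyl groups of the nonreducing-end glucose that can receive glucose.

    nonreducing_linkage: linkage joining the nonreducing-end glucose to the
    next residue of the acceptor, written as "1->6", "1->3", or "1->4".
    """
    if nonreducing_linkage == "1->6":
        # alpha-1->6 linked nonreducing end: either hydroxyl can be used
        return ["C-6", "C-3"]
    if nonreducing_linkage in ("1->3", "1->4"):
        # alpha-1->3 or alpha-1->4 linked nonreducing end: only C-6 is used
        return ["C-6"]
    raise ValueError(f"linkage not covered by the rule: {nonreducing_linkage}")


# Disaccharide acceptors named in the text, keyed by their single linkage.
for name, linkage in (("isomaltose", "1->6"), ("nigerose", "1->3"), ("maltose", "1->4")):
    print(name, alternansucrase_transfer_sites(linkage))
```

Applied to the disaccharides in the text, the rule gives two first products for isomaltose and a single C-6 product for nigerose and for maltose.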

Robyt and Walseth [45] studied the mechanism of the acceptor reactions of Leuc. mesenteroides B-512FM dextransucrase. A purified dextransucrase was incubated with sucrose, and the resulting fructose, glucose, leucrose, and unreacted sucrose were removed from the enzyme by chromatography on a Bio-Gel P-6 column. The charged enzyme was incubated with [14C]-D-glucose, [14C]-D-fructose, and [14C]-reducing-end labeled maltose acceptors. Each of the three acceptors gave two types of labeled products: a high molecular weight product, identified as dextran, and a low molecular weight product that was an oligosaccharide. It was found that all three of the acceptors were incorporated into the products at the reducing-end. Similar results were obtained when the enzyme and labeled acceptors were reacted in the presence of sucrose. The only differences were higher yields of the labeled products and a series of homologous oligosaccharides from the glucose and the maltose acceptor-reactions. Because both a labeled oligosaccharide and a labeled dextran were produced when labeled acceptor and enzyme were incubated together with and without sucrose, it was concluded that the acceptor reactions were taking place by the acceptor making a nucleophilic displacement of the glucosyl and dextranyl groups from the covalent enzyme intermediates. It was further concluded that the acceptor reactions serve to terminate polymerization of dextran by displacing the growing chain from the active-site, in contrast to the previous ideas expressed in refs. [25-27] that acceptors were serving as primers for dextran synthesis. Even with the best acceptor, maltose, a discontinuous set of products was formed: a set of low molecular weight oligosaccharides (d.p. 3-6 for an acceptor to sucrose ratio of 1:1) and high molecular weight dextran. Intermediate sized oligosaccharide acceptor-products of d.p. 7-15 were not present.
If the acceptors had been acting as primers for dextran synthesis, a continuous series of oligosaccharides from d.p. 3 and upward should have been produced. Furthermore, it also would have been expected that, as the concentration of the acceptor (the so-called primer) was increased, there would have been a stimulation of dextran synthesis as the number of priming sites would be increased, instead of the observed decrease in the amount of dextran synthesized [44,59]. Robyt and Walseth [45] proposed the mechanism shown in Fig. 8 for the acceptor reaction. In this mechanism, the acceptor is bound at an acceptor-binding site [60] and when maltose is the acceptor, its C-6 hydroxyl group at the nonreducing-end attacks C-1 of the glucosyl or dextranyl groups in the enzyme complex to give an oligosaccharide or a dextran acceptor product, respectively. When glucose is the acceptor, its C-6-hydroxyl group makes the attack, and when fructose is the acceptor, its C-5-hydroxyl group makes the attack. For acceptors that form a homologous series, Robyt and Walseth also showed that when the concentration of the first acceptor-product becomes sufficiently high, it too can act as an acceptor to give the next higher homolog, which in turn can act as an acceptor so that a series of homologous oligosaccharides are formed. Su and Robyt [40], using maltose in an equilibrium dialysis experiment, showed that there was one acceptor binding-site at the active-site. Thus, the active-site of Leuc. mesenteroides B-512FM dextransucrase has two sucrose binding-sites and one acceptor binding-site. A systematic study of the effects of three parameters on the acceptor reaction was made for Leuc. mesenteroides B-512FM dextransucrase using maltose as the acceptor [59]. The amount and distribution of acceptor-products and the amount of dextran were studied as a function of (a) the ratio of maltose to sucrose, (b) the concentration of maltose and sucrose, and (c) the concentration of enzyme. 
Figure 8. Mechanism for the acceptor reaction of Leuc. mesenteroides B-512F dextransucrase. A disaccharide acceptor binds in the acceptor binding-site so that (A) its nonreducing C-6 hydroxyl group can make an attack onto C-1 of the glucosyl unit, releasing it from the active-site to give a trisaccharide, or (B) its nonreducing C-6 hydroxyl can make an attack onto C-1 of the glucose residue of the glucanyl chain, releasing it from the active-site.

The ratio of maltose to sucrose was varied from 0:1 to 20:1. As the ratio was increased, the amount of dextran steadily decreased with a concomitant increase in the amount of acceptor-products (see Table 3). The number of acceptor-products, however, decreased. At a constant sucrose concentration of 100 mM, a ratio of 1:5 gave 47.0% dextran and 8 acceptor-products (d.p. 3-10); a ratio of 2:1 gave 3.8% dextran and 4 acceptor-products; and a ratio of 20:1 gave 0% dextran and 1 acceptor-product. Keeping the ratio constant at 1:1 and increasing the concentrations of maltose and sucrose from 1.25 mM to 300 mM also gave a decrease in the amount of dextran, from 49.3% to 0.6%, and an increase in the amount of acceptor-products. The number of acceptor-products in this experiment remained relatively constant at 5-6 for the different concentrations. Using a constant 1:1 ratio of maltose to sucrose and different concentrations from 1.25 mM to 200 mM, the concentration of enzyme was varied 1000-fold, from 120 mU/mL to 120 U/mL. As the concentration of enzyme was increased, there was a decrease in the amount of dextran formed and an increase in the amount of acceptor-products formed. The decrease in dextran was most pronounced at the lower substrate concentrations. At the highest enzyme concentration (120 U/mL), all of the substrate concentrations (1.25 mM to 200 mM) gave only 5% dextran out of the total amount of product formed. Not all of the acceptors reacted with equal efficiency. In a series of reactions with different acceptors at a 1:1 ratio of acceptor to sucrose at 80 mM, the amount of dextran formed in the reaction was determined for Leuc. mesenteroides B-512FM dextransucrase [44]. The most effective acceptor for decreasing the amount of dextran was maltose. Sixteen other acceptors were compared on a relative scale with maltose defined as 100%. The next best acceptor was isomaltose (89%), followed by nigerose (58%), methyl-α-D-glucopyranoside (52%), D-glucose (17%), turanose (13%), lactose (11%), cellobiose (9%), and D-fructose (6.4%). The relative efficiencies of the maltodextrins, maltose to maltooctaose, as acceptors were determined for Leuc. mesenteroides B-512FM dextransucrase [57]. The relative efficiencies decreased from 100% for maltose to 6.2% for maltooctaose. The maltodextrins had higher relative efficiencies as acceptors with Strep. mutans GTF-S and GTF-I than they did with B-512FM dextransucrase [56]. The efficiencies of reaction of GTF-S and GTF-I with the maltodextrins also decreased as the size of the maltodextrins increased but, unlike B-512F dextransucrase, reached a minimum with maltopentaose and then increased again with maltohexaose and maltoheptaose [61].
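The trend reported for the maltose to sucrose ratio at 100 mM sucrose can be tabulated and checked in a few lines. The values below are the three data points quoted in the text; the variable and function names are hypothetical, and the snippet does not attempt to reproduce the full data set of ref. [59].

```python
# (maltose:sucrose ratio, % dextran, number of acceptor-products),
# taken from the three ratios quoted in the text for 100 mM sucrose.
RATIO_SERIES = [
    (1 / 5, 47.0, 8),
    (2 / 1, 3.8, 4),
    (20 / 1, 0.0, 1),
]


def strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


ratios = [ratio for ratio, _, _ in RATIO_SERIES]
dextran = [percent for _, percent, _ in RATIO_SERIES]
n_products = [count for _, _, count in RATIO_SERIES]

print("dextran falls as the ratio rises:", strictly_decreasing(dextran))
print("number of products falls as the ratio rises:", strictly_decreasing(n_products))
```

Both checks hold for the quoted points: a higher acceptor to sucrose ratio gives less dextran and fewer distinct acceptor-products.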

5. SUMMARY

The glucansucrases synthesize glucan by a two-site mechanism in which the glucose and the growing glucan are covalently attached to the active site. The glucose is transferred to the reducing end of the growing glucan chain by an insertion mechanism in which glucose is inserted between the enzyme and the growing chain. The chain is released from the active-site by acceptor reactions. When the acceptor is a glucan chain, a branch linkage is formed. When the acceptor is a low molecular weight carbohydrate, the glucan chain is released with the acceptor attached to the reducing end, and a low molecular weight acceptor-product is produced with a glucose residue attached to the acceptor. With many acceptors, the acceptor-product is an acceptor itself and a series of acceptor-products is produced. The structure of the acceptor-product depends on the structure of the acceptor and the particular glucansucrase. Acceptors divert glucose away from the synthesis of glucan and terminate glucan synthesis. The amount and the number of acceptor-products vary, depending on the concentration ratio of acceptor to sucrose. At low acceptor to sucrose ratios, the yield of acceptor-products is low and the amount of dextran is high; at high ratios, the yield of acceptor-products is high and the amount of dextran is low. With acceptors that form a homologous series of acceptor-products (e.g., maltose), a low ratio gives a relatively large number of acceptor-products (>10) and a high ratio gives a small number of acceptor-products (1-2) in high yields with respect to sucrose.

6. REFERENCES

1. E. J. Hehre, Science, 93 (1941) 237.
2. A. Jeanes, W. C. Haynes, C. A. Wilham, J. C. Rankin, E. H. Melvin, M. J. Austin, J. E. Cluskey, B. E. Fisher, H. M. Tsuchiya and C. E. Rist, J. Am. Chem. Soc., 76 (1954) 5041.
3. D. Kim and J. F. Robyt, Enzyme Microb. Technol., 16 (1994) 659.
4. C. A. Wilham, B. H. Alexander and A. Jeanes, Arch. Biochem. Biophys., 59 (1955) 61.
5. J. W. Van Cleve, W. C. Schaefer and C. E. Rist, J. Am. Chem. Soc., 78 (1956) 4435.
6. A. Shimamura, H. Tsumori and H. Mukasa, Biochim. Biophys. Acta, 702 (1982) 72.
7. M. D. Hare, S. Svensson and G. J. Walker, Carbohydr. Res., 66 (1978) 245.
8. F. R. Seymour, E. C. M. Chen and S. H. Bishop, Carbohydr. Res., 68 (1979) 113.
9. F. R. Seymour, R. D. Knapp, S. H. Bishop and A. Jeanes, Carbohydr. Res., 68 (1979) 123.
10. F. R. Seymour, R. D. Knapp and S. H. Bishop, Carbohydr. Res., 72 (1979) 229.
11. A. Jeanes and F. R. Seymour, Carbohydr. Res., 74 (1979) 31.
12. F. R. Seymour, R. D. Knapp and S. H. Bishop, Carbohydr. Res., 74 (1979) 77.
13. F. R. Seymour, R. D. Knapp, E. C. M. Chen, A. Jeanes and S. H. Bishop, Carbohydr. Res., 75 (1979) 275.
14. F. R. Seymour and R. D. Knapp, Carbohydr. Res., 81 (1980) 67.
15. F. R. Seymour and R. D. Knapp, Carbohydr. Res., 81 (1980) 105.
16. F. R. Seymour, R. L. Julian, A. Jeanes and B. L. Lamberts, Carbohydr. Res., 86 (1980) 227.
17. F. R. Seymour, R. D. Knapp, E. C. M. Chen and S. H. Bishop, Carbohydr. Res., 74 (1979) 41.
18. F. R. Seymour, M. E. Slodki, R. D. Plattner and A. Jeanes, Carbohydr. Res., 53 (1977) 153.
19. E. J. Hehre, Science, 93 (1941) 237.
20. G. T. Cori and C. F. Cori, J. Biol. Chem., 131 (1939) 397.
21. M. A. Swanson and C. F. Cori, J. Biol. Chem., 172 (1948) 815.
22. C. S. Hanes, Proc. Royal Soc. London, Series B, 129 (1940) 174.
23. D. Stetten, Jr. and M. R. Stetten, Physiol. Rev., 40 (1960) 513.
24. E. J. Hehre, Adv. Enzymol., 11 (1951) 297.
25. H. J. Koepsell, H. M. Tsuchiya, N. N. Hellman, A. Kazenko, C. A. Hoffman, E. S. Sharpe and R. W. Jackson, J. Biol. Chem., 200 (1953) 793.
26. H. M. Tsuchiya, N. N. Hellman, H. J. Koepsell, J. Corman, S. S. Stringer, S. P. Rogovin, M. O. Bogard, G. Bryant, W. H. Feger, C. A. Hoffman, F. R. Senti and R. W. Jackson, J. Am. Chem. Soc., 77 (1955) 2412.
27. H. M. Tsuchiya, Bull. Soc. Chim. Biol., 42 (1960) 1777.
28. A. M. Chludzinski, G. R. Germaine and C. F. Schachtele, J. Dent. Res., Special Issue C, 55 (1974) C75.
29. G. R. Germaine, A. M. Chludzinski and C. F. Schachtele, J. Bacteriol., 120 (1976) 287.
30. G. R. Germaine, S. K. Harlander, W-L. S. Leung and C. F. Schachtele, Infect. Immun., 16 (1977) 637.
31. M. Kobayashi and K. Matsuda, Biochim. Biophys. Acta, 614 (1980) 46.
32. M. Kobayashi and K. Matsuda, J. Biochem., 100 (1986) 615.
33. J. F. Robyt, D. Kim and L. Yu, Carbohydr. Res., 266 (1995) 293.
34. J. F. Robyt and A. J. Corrigan, Arch. Biochem. Biophys., 183 (1977) 726.
35. J. F. Robyt, B. K. Kimble and T. F. Walseth, Arch. Biochem. Biophys., 165 (1974) 634.
36. J. F. Robyt and S. H. Eklund, Bioorg. Chem., 11 (1982) 115.
37. D. Fu and J. F. Robyt, Carbohydr. Res., 183 (1988) 97.
38. J. F. Robyt and P. J. Martin, Carbohydr. Res., 113 (1983) 301.
39. S. L. Ditson and R. M. Mayer, Carbohydr. Res., 126 (1984) 170.
40. D. Su and J. F. Robyt, Arch. Biochem. Biophys., 308 (1994) 471.
41. F. A. Bovey, J. Polym. Sci., 35 (1959) 167.
42. K. H. Ebert and M. Brosche, Biopolymers, 5 (1967) 423.
43. J. F. Robyt and H. Taniguchi, Arch. Biochem. Biophys., 174 (1976) 129.
44. J. F. Robyt and S. H. Eklund, Carbohydr. Res., 121 (1983) 279.
45. J. F. Robyt and T. F. Walseth, Carbohydr. Res., 61 (1978) 433.
46. M. Killey, R. J. Dimler and J. E. Cluskey, J. Am. Chem. Soc., 77 (1955) 3315.
47. R. W. Bailey, S. A. Barker, E. J. Bourne, P. M. Grant and M. Stacey, J. Chem. Soc., (1958) 1895.
48. F. Yamauchi and Y. Ohwada, Agr. Biol. Chem., 33 (1969) 1295.
49. E. J. Bourne, J. Hartigan and H. Weigel, J. Chem. Soc., (1959) 2332.
50. R. W. Bailey, S. A. Barker, E. J. Bourne and M. Stacey, Nature, 176 (1955) 1164.
51. W. B. Neely, Arch. Biochem. Biophys., 79 (1959) 154.
52. F. H. Stodola, H. J. Koepsell and E. S. Sharpe, J. Am. Chem. Soc., 74 (1952) 3202.
53. F. H. Stodola, E. S. Sharpe and H. J. Koepsell, J. Am. Chem. Soc., 78 (1956) 2514.
54. E. S. Sharpe, F. H. Stodola and H. J. Koepsell, J. Org. Chem., 25 (1960) 1062.
55. Y. Iriki and E. J. Hehre, Arch. Biochem. Biophys., 134 (1969) 130.
56. D. Fu and J. F. Robyt, Arch. Biochem. Biophys., 283 (1990) 379.
57. D. Fu and J. F. Robyt, Carbohydr. Res., 217 (1991) 201.
58. G. L. Côté and J. F. Robyt, Carbohydr. Res., 111 (1982) 127.
59. D. Su and J. F. Robyt, Carbohydr. Res., 248 (1993) 339.
60. A. Tanriseven and J. F. Robyt, Carbohydr. Res., 225 (1992) 321.
61. J. F. Robyt, "New Products from the Action of Sucrose-glucosyltransferases" in: Carbohydrates in Industrial Synthesis, M. A. Clarke (ed.), pp. 56-67, Bartens, Berlin (1992).

S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.


Studies on a recombinant amylosucrase

M. Remaud-Simeon a, F. Albaret a, B. Canard b, I. Varlet c, P. Colonna d, R.M. Willemot a and P. Monsan a

a C.B.G.D.-I.N.S.A., Complexe Scientifique de Rangueil, 31077 Toulouse cedex, France
b L.G.M.C.H., CNRS URA 1462, Faculté de Médecine, Avenue de Valombrose, 06107 Nice, France
c Centre de Biochimie, Université de Nice Sophia Antipolis, Parc Valrose, 06104 Nice cedex, France
d INRA, Laboratoire de Biochimie et Technologie des Glucides, BP 1627, 44316 Nantes cedex 03, France

Abstract

In order to characterize a recombinant amylosucrase activity (E.C. 2.4.1.4) and to evaluate its potential use as a glucosylation tool, chromosomal Sau3A DNA fragments from Neisseria polysaccharea were cloned into the phage λ EMBL3. A recombinant phage expressing the amylosucrase activity was isolated. The enzyme was produced by infecting a liquid culture of E. coli and was purified from the culture lysate to a specific activity of 0.3 U/mg. When incubated with sucrose and traces of glycogen, the recombinant amylosucrase produced an insoluble glucopolysaccharide composed mainly of α-(1→4) glucosidic linkages, with a very low degree of α-(1→6) branched linkages (less than 5 %). The recombinant enzyme is activated by glycogen, starch and maltooligosaccharides. It also catalyzes the transfer of glucosyl residues from sucrose onto a maltopentaose acceptor to produce maltohexaose and maltoheptaose.

1. INTRODUCTION

The important role of oligosaccharides in cell-cell interactions and the numerous applications of these molecules in the pharmaceutical and nutritional fields greatly stimulate research on oligosaccharide synthesis. Compared with the chemical synthesis of such compounds, which requires tedious protection/deprotection steps and the elimination of side products, the enzymatic approach appears very promising [1]. However, for the industrial production of enzymatically synthesized oligosaccharides to be envisaged, the reactions must be carried out with low-cost substrates and catalysts. In addition, the products must be synthesized in high yields. These constraints very often limit the development of biocatalysis, particularly the use of glycosyltransferases which act on nucleotide-activated sugars or the use of hydrolases in reverse reactions.

However, some transferases were found to be well suited to the efficient synthesis of oligosaccharides. Glucooligosaccharides can be produced in high yields using the glucosyltransferases (E.C. 2.4.1.5) from Leuconostoc mesenteroides [2-5]. These enzymes catalyze the transfer of D-glucopyranosyl units from sucrose (a low-cost and highly available substrate) onto acceptor molecules (mainly sugars) [6]. The chemical structure of the oligosaccharides obtained was shown to be highly dependent on the glucosyltransferase-producing strain [2-6]. They contain α-(1→6) linkages in the linear chain and α-(1→3) or α-(1→2) branched linkages.

Among the glucosyltransferases, amylosucrase (E.C. 2.4.1.4), first discovered by Hehre and Hamilton, is a very original enzyme [7]. The constitutive amylosucrase from Neisseria perflava catalyzes the synthesis of a glycogen-like polysaccharide directly from sucrose without the mediation of nucleotide-activated sugars. The polymer was shown to be composed of 90 % α-(1→4) linkages and 10 % α-(1→6) branched linkages, the branches being shorter than those found in glycogen [8-10]. In 1974, another non-pathogenic strain of Neisseria was isolated from the throat of healthy children and later proposed as a prototype strain constituting a new taxon in the genus Neisseria [11]. This strain was named N. polysaccharea because of the large amount of extracellular polysaccharide produced when the bacteria are grown on agar containing 1 to 5 % sucrose [12]. Analysis of the polysaccharide revealed that it also has a glycogen-like structure and differs from the polysaccharide from N. perflava only in having a lower degree of branched linkages [12].

The present work describes the cloning of the amylosucrase gene from N. polysaccharea into the phage λ EMBL3 vector and a preliminary study of the recombinant enzyme.

2. MATERIALS AND METHODS

2.1. Construction of the genomic library

Chromosomal DNA from N. polysaccharea NCTC 11858 was extracted by the procedure of Brenner et al. [13-14]. The genomic library was constructed following the procedures described by Russel et al. [15-16]. N. polysaccharea chromosomal DNA was partially digested with restriction endonuclease Sau3A. The fragments obtained were inserted into the BamHI cloning site of λ EMBL3. In vitro packaging was carried out with the Gigapack gold II kit (Stratagene) to obtain phage particles.

2.2. Screening of the genomic library

The procedure described by Russel et al. for the detection of the sucrase gene from Streptococcus mutans was used for N. polysaccharea sucrase [16]. The recombinant genomic bank and E. coli C600 in YT medium soft agar were plated on top of M9 medium with supplements (MgSO4, CaCl2, thiamine, threonine and leucine) containing 0.7 % sucrose. After 6 hours of growth at 37 °C, plaques appeared in the soft agar [17]. When sucrase activity was expressed, the enzyme catalyzed the release of fructose and/or glucose, which were metabolized by the bacteria and stimulated growth around the plaque, forming a halo. As sucrose cannot be utilized by E. coli C600, the halo appeared only around plaques where sucrase activity was expressed.

2.3. Enzyme production and purification

500 ml of Luria broth supplemented with 10 mM MgSO4 were inoculated with an overnight preculture of E. coli TG1. When the optical density at 600 nm reached 0.5 (3 × 10⁸ bacteria/ml), the culture was infected with the recombinant phage preparation at MOI = 2. The culture was maintained at 37 °C under vigorous shaking until lysis occurred. After lysis, the culture was centrifuged to pellet the cell debris. Nucleic acids liberated during cell lysis were eliminated by polyimine precipitation at 0.08 % w/v. Solid ammonium sulfate was then added to the supernatant fluid up to 80 % saturation. The mixture was gently stirred at 4 °C. The precipitate obtained was dissolved in sodium maleate buffer (50 mM, pH 6.4) containing MgCl2 (10 mM) and CaCl2 (10 mM) and dialyzed against the same buffer for 16 hours at 4 °C. The preparation was then loaded onto a Mono Q anion exchange column (Pharmacia) equilibrated with imidazole buffer (50 mM, pH 7). The enzyme was recovered by applying a gradient of NaCl.

2.4. Reaction conditions

All reactions were conducted in 50 mM sodium maleate buffer, pH 6.4, at 30 °C with various concentrations of sucrose. Glycogen from bovine liver (Sigma Chemical Co.) was added at a final concentration of 100 mg/l to eliminate the lag phase observed in the absence of exogenous polysaccharide. One unit of amylosucrase activity is the amount of enzyme that catalyzes the release of one µmol of fructose per min at 30 °C in 50 mM sodium maleate buffer, pH 6.4. The initial rate of fructose production was determined using the Glucose/Fructose Kit from Boehringer. Once it had been established that no glucose was released, the enzyme activity was also measured by the dinitrosalicylic acid method [18]. Proteins were assayed using the method of Lowry et al. [19].

2.5. Polysaccharide and maltooligosaccharide analysis

13C NMR spectra of the polymer dissolved in DMSO were recorded with a Bruker AM 300. Average Mw values and size distribution were obtained at 25 °C by coupling on-line HPSEC, a multi-angle laser light scattering (MALLS) photometer and a refractometer. Maltooligosaccharide acceptor reaction products were analyzed using reverse phase chromatography (C18 column) as previously described [5].
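The unit definition in section 2.4 can be turned into a small calculation. The following is a minimal sketch, not the authors' procedure: the function names, the dilution factor and the assay readings are illustrative assumptions, and only the definition (1 U = 1 µmol fructose released per min) comes from the text.

```python
def amylosucrase_activity(fructose_start_umol_per_ml, fructose_end_umol_per_ml,
                          minutes, dilution_factor=1.0):
    """Return activity in U per ml of undiluted enzyme solution.

    1 U = 1 umol fructose released per minute
    (30 degC, 50 mM sodium maleate, pH 6.4, as defined in section 2.4).
    The dilution factor is an assumed convenience parameter.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    rate = (fructose_end_umol_per_ml - fructose_start_umol_per_ml) / minutes
    return rate * dilution_factor


def specific_activity(units_per_ml, protein_mg_per_ml):
    """Return specific activity in U per mg protein."""
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return units_per_ml / protein_mg_per_ml


# Hypothetical readings, chosen only to illustrate the arithmetic:
# fructose rises from 0 to 1.5 umol/ml in 10 min, enzyme diluted 2-fold.
activity = amylosucrase_activity(0.0, 1.5, minutes=10, dilution_factor=2)
print(f"activity: {activity:.2f} U/ml")
print(f"specific activity: {specific_activity(activity, 1.0):.2f} U/mg")
```

Only the initial, linear part of the fructose release curve should be used for such a calculation, since the text defines activity from the initial rate.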

3. RESULTS AND DISCUSSION

3.1. Cloning of the amylosucrase gene

The screening method described in Materials and Methods yielded one positive recombinant in every five hundred plaques. From the positives obtained, two recombinants were further characterized. Recombinant phage DNAs were purified. Their physical maps, given in Figure 1, show that both recombinant phages probably contain the same insert ligated in opposite orientations.

Figure 1. Physical maps of the two amylosucrase recombinant phage DNAs (λ1, insert 13.5 kb; λ2, insert 14.5 kb). Sa: SalI, E: EcoRI, H: HindIII, Sm: SmaI.

3.2. Amylosucrase production and purification

Infection of an E. coli TG1 liquid culture with the recombinant phage λ1 led to the production of 25 U of amylosucrase per 500 ml culture. The enzyme was released into the medium during lysis. Polyimine was then added to eliminate the bacterial nucleic acids and facilitate the subsequent purification steps. No activity could be recovered after concentration using ultrafiltration membranes. The enzyme was therefore precipitated with ammonium sulfate at 80 % saturation. The specific activity of the preparation was 0.062 U/mg and the enzyme was recovered with a 40 % yield. Anion exchange chromatography (Figure 2) was then carried out; the enzyme was recovered with no loss and had a specific activity of 0.3 U/mg.
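The figures quoted above can be arranged as a purification summary. The sketch below is illustrative only: it assumes that the 40 % yield refers to the 25 U found in the lysate and that "no loss" means a step yield of 1.0; the protein amounts and the purification factor are derived values, not data reported by the authors.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    step_yield: float         # fraction of activity recovered from the previous step
    specific_activity: float  # U/mg, as reported in the text


def summarize(crude_units, steps):
    """Return one summary row per step; fold is relative to the first step."""
    rows = []
    units = crude_units
    reference = steps[0].specific_activity
    for step in steps:
        units *= step.step_yield
        rows.append({
            "step": step.name,
            "units": units,
            "protein_mg": units / step.specific_activity,  # derived, not measured
            "overall_yield": units / crude_units,
            "fold": step.specific_activity / reference,
        })
    return rows


steps = [
    Step("ammonium sulfate (80 %)", 0.40, 0.062),
    Step("Mono Q anion exchange", 1.00, 0.3),
]
rows = summarize(25.0, steps)
for row in rows:
    print(f"{row['step']:<26} {row['units']:5.1f} U  "
          f"{row['protein_mg']:6.1f} mg  "
          f"{row['overall_yield']:4.0%}  {row['fold']:4.1f}x")
```

Under these assumptions the anion exchange step gives a purification factor of about 4.8 relative to the ammonium sulfate fraction.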

Figure 2. Anion exchange chromatography of recombinant amylosucrase (x-axis: fraction number). 200 µL of ammonium sulfate precipitated amylosucrase (0.3 U/ml), in 50 mM imidazole buffer pH 7, were loaded at 0.5 ml/min. Elution was carried out by applying a gradient of NaCl. The enzyme was collected in 1 ml fractions (vertical bars).


3.3. Polysaccharide synthesis and characterization

When incubated with sucrose (30 g/l) and traces of glycogen (0.1 g/l), amylosucrase (0.1 U/ml) catalyzed the synthesis of a white insoluble polysaccharide, which precipitated in the tube, and the release of an equimolar amount of fructose. MALLS analysis revealed that the polysaccharide is homogeneous in size and has a molecular weight of 2 × 10⁷ g/mol. 13C NMR chemical shifts (Table 1) confirmed that it was a glucopolysaccharide composed of α-(1→4) linkages. No trace of α-(1→6) branched linkages could be detected using this method. This implies that the degree of branching is lower than 5 % and indicates that the polysaccharide is less branched than the polysaccharide synthesized by direct culture of N. polysaccharea or produced in vitro by N. perflava amylosucrase [8-12]. However, complementary analyses are required to provide additional information about the structure.
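To put the reported molecular weight and the equimolar fructose release into perspective, the arithmetic can be sketched as follows. This is a back-of-the-envelope illustration: the molar masses are standard values supplied here, and complete conversion of sucrose into glucan and fructose is an assumption made for the example, not a result from the paper.

```python
# Standard molar masses in g/mol (supplied for this sketch, not from the paper).
M_SUCROSE = 342.30
M_FRUCTOSE = 180.16
M_ANHYDROGLUCOSE = 162.14  # one glucosyl residue within a glucan chain


def degree_of_polymerization(polymer_mw):
    """Approximate number of glucosyl residues in a glucan of the given mass."""
    return polymer_mw / M_ANHYDROGLUCOSE


def products_at_full_conversion(sucrose_g_per_l):
    """Return (fructose, glucan) in g/l if every sucrose is split and polymerized."""
    sucrose_mol_per_l = sucrose_g_per_l / M_SUCROSE
    return (sucrose_mol_per_l * M_FRUCTOSE,
            sucrose_mol_per_l * M_ANHYDROGLUCOSE)


dp = degree_of_polymerization(2e7)
fructose, glucan = products_at_full_conversion(30.0)
print(f"DP of a 2e7 g/mol glucan: about {dp:,.0f} residues")
print(f"30 g/l sucrose -> {fructose:.2f} g/l fructose + {glucan:.2f} g/l glucan")
```

A molecular weight of 2 × 10⁷ g/mol thus corresponds to roughly 1.2 × 10⁵ glucosyl residues per molecule, and 30 g/l sucrose could yield at most about 14 g/l of polysaccharide.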

Table 1
13C NMR chemical shifts of the amylosucrase-synthesized polysaccharide

Chemical shift, ppm    Assignment
99.8                   C-1→4
78.8                   C-4→1
73                     C-2
71.8                   C-3
71.4                   C-5
60.4                   C-6

3.4. Recombinant amylosucrase properties

3.4.1. Effect of sucrose on recombinant amylosucrase initial rate

As shown in Figure 3, the initial rate decreases above a sucrose concentration of 30 g/l. This inhibitory effect was also observed for the N. perflava amylosucrase [8-9].

Figure 3. Sucrose effect on amylosucrase initial rate (x-axis: sucrose concentration, g/L). Reactions were carried out in the presence of glycogen (0.1 g/l).

Figure 4. Activator effect of glycogen, soluble potato starch and maltooligosaccharides (x-axis: time, h). Sucrose 30 g/l, activator 0.1 g/l, amylosucrase 0.3 U/mL.

3.4.2. Activating effect of glycogen, starch and maltooligosaccharides

Reactions were carried out, as described in Materials and Methods, either in the presence of glycogen, starch or maltooligosaccharides at 0.1 g/l, or without exogenous activator. The enzyme is activated by all the compounds assayed (Fig. 4). Fructose is the only reducing sugar detected in the presence of glycogen and starch. When maltooligosaccharides were added, a small production of glucose was also observed. It represents 10 % and 5 % of the total reducing sugars in the reactions carried out with maltopentaose and maltoheptaose respectively. This is probably due to a transglycosylase activity. The level of activation increased with the degree of polymerisation of the maltooligosaccharides added. However, glycogen is the only molecule that eliminates the lag phase. This phase may correspond to the time required by the enzyme to synthesize its own activator.

3.4.3. Maltooligosaccharide elongation

To evaluate the transferase activity of amylosucrase onto maltopentaose, a reaction was carried out with a sucrose/maltopentaose ratio (mol/mol) of 1 at an initial sucrose concentration of 30 g/l with 0.15 U/ml enzyme. After total consumption of sucrose, it was found that the maltopentaose concentration had decreased and that maltohexaose and maltoheptaose had been synthesized (Figure 5).

Figure 5. Chromatograms of the maltooligosaccharides synthesized from sucrose and maltopentaose (solid line) and without sucrose (dashed line). Conditions: sucrose (0 or 30 g/l), maltopentaose (72 g/l), sodium maleate buffer (50 mM, pH 6.4), amylosucrase (0.15 U/ml), temperature 30 °C.

However, oligosaccharides with a degree of polymerization lower than five were also synthesized. The reaction carried out without sucrose (Fig. 5) showed that a transglycosylase activity was present in the preparation. The origin of this activity is still uncertain. A transglycosylase was already found to be produced by an N. perflava strain [9]. In our case, it is possible that the cloned fragment (14 kb) also codes for a transglycosylase. However, this activity may also be a property of amylosucrase itself or may be due to the amylomaltase from E. coli, which could have been released during the lysis of the cells. In any case, comparison of the chromatograms obtained with and without sucrose clearly shows that the recombinant enzyme elongates maltopentaose and can catalyze the transfer of glucopyranosyl residues onto this acceptor (Fig. 6).
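The stated sucrose/maltopentaose molar ratio of 1 can be checked against the mass concentrations given in the Figure 5 legend (30 g/l sucrose, 72 g/l maltopentaose). The sketch below uses standard molar masses supplied here for illustration; the helper name is hypothetical.

```python
# Standard molar masses in g/mol (supplied for this sketch).
M_GLUCOSE = 180.156
M_WATER = 18.015
M_SUCROSE = 342.30


def maltooligosaccharide_molar_mass(dp):
    """Molar mass of a linear maltooligosaccharide with dp glucose units.

    Each of the (dp - 1) glycosidic bonds eliminates one water molecule.
    """
    if dp < 1:
        raise ValueError("dp must be at least 1")
    return dp * M_GLUCOSE - (dp - 1) * M_WATER


sucrose_mol_per_l = 30.0 / M_SUCROSE
maltopentaose_mol_per_l = 72.0 / maltooligosaccharide_molar_mass(5)
ratio = sucrose_mol_per_l / maltopentaose_mol_per_l

print(f"sucrose:       {sucrose_mol_per_l * 1000:.1f} mmol/l")
print(f"maltopentaose: {maltopentaose_mol_per_l * 1000:.1f} mmol/l")
print(f"molar ratio:   {ratio:.2f}")
```

Both concentrations come out close to 87 mmol/l, which is consistent with the equimolar ratio stated in section 3.4.3.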

Figure 6. Comparison of maltooligosaccharide concentrations obtained in the presence of maltopentaose and sucrose or without sucrose (x-axis: degree of polymerization).

4. CONCLUSION

Sucrose is a good substrate for the recombinant amylosucrase. Its consumption leads to the release of fructose and the synthesis of an amylose-type polysaccharide which appears to be less branched than the polysaccharide produced by N. perflava [8-9] and by direct culture of N. polysaccharea [12]. The enzyme is inhibited by sucrose concentrations higher than 30 g/l and is activated by glycogen, starch and maltooligosaccharides. Amylosucrase was also shown to use maltopentaose as an acceptor. It can thus be used to increase the molecular weight of maltooligosaccharides. A transglycosylase activity was also identified in our preparation. From our study, it is not yet possible to determine its origin. As the present study demonstrates the potential of this recombinant amylosucrase, it appears necessary to subclone the gene in order to overproduce the enzyme and to use it as a glucosylation tool. Subcloning and sequencing of the gene have already been initiated. The identification of the amino acid sequence will allow the amylosucrase to be compared with other glucosyltransferases such as glycogen and starch synthases, glucosyltransferases from L. mesenteroides or Streptococcus sp. and cyclodextrin glucanotransferases. This work should be of particular interest for the general study of structure/function relationships of glucosyltransferases.

5. ACKNOWLEDGEMENTS

We thank Professor J.Y. Riou (Pasteur Institute) for kindly providing the Neisseria polysaccharea strain and Dr. M. Vignon (CERMAV, Grenoble) for the 13C NMR analysis.

6. REFERENCES

1 Y. Ichikawa, G.C. Look and C. Wong, Anal. Biochem., 202 (1992) 215.
2 F. Paul, E. Oriol, D. Auriol and P. Monsan, Carbohydr. Res., 149 (1986) 433.
3 M. Remaud, F. Paul, P. Monsan, A. Heyraud and M. Rinaudo, J. Carbohydr. Chem., 10 (1991) 861.
4 M. Remaud, F. Paul, P. Monsan, A. Lopez-Munguia and M. Vignon, J. Carbohydr. Chem., 11 (1992) 359.
5 M. Remaud-Simeon, A. Lopez-Munguia, V. Pelenc, F. Paul and P. Monsan, Applied Biochem. Biotech., 44 (1994) 101.
6 H.J. Koepsell, H.M. Tsuchiya, N.N. Hellman, A. Kasenko, C.A. Hoffman, E.S. Sharpe and R.W. Jackson, J. Biol. Chem., 200 (1952) 793.
7 E.J. Hehre and D.M. Hamilton, J. Biol. Chem., 166 (1946) 77.
8 G. Okada and E.J. Hehre, J. Biol. Chem., 249 (1974) 126.
9 B.Y. Tao, P.J. Reilly and J.F. Robyt, Carbohydr. Res., 181 (1988) 163.
10 C.R. MacKenzie, M.B. Perry, I.J. McDonald and K.G. Johnson, Can. J. Microbiol., 24 (1978) 1419.
11 J.Y. Riou, M. Guibourdenche and M.Y. Popoff, Ann. Microbiol., 134 (1983) 257.
12 J.Y. Riou, M. Guibourdenche, M.B. Perry, L.L. MacLean and D.W. Griffith, Can. J. Microbiol., 32 (1986) 909.
13 D.J. Brenner, A.C. McWhorter, J.K. Leete-Knutson and A.G. Steigerwalt, J. Clin. Microbiol., 15 (1982) 1133.
14 M. Guibourdenche, M.Y. Popoff and J.Y. Riou, Ann. Inst. Pasteur/Microbiol., 137B (1986) 177.
15 R.R.B. Russel, D. Coleman and G. Dougan, J. Gen. Microbiol., 131 (1985) 195.
16 R.R.B. Russel, P. Morissey and G. Dougan, FEMS Microb. Lett., 30 (1985) 37.
17 J.S. Sambrook, E.F. Fritsch and T. Maniatis, Molecular cloning: a laboratory manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1989.
18 J.B. Sumner and S.F. Howell, J. Biol. Chem., 108 (1935) 51.
19 O.H. Lowry, N.J. Rosebrough, A.L. Farr and R.J. Randall, J. Biol. Chem., 193 (1951) 265.


Application of cloned monocomponent carbohydrases for modification of plant materials

L. V. Kofod, T. E. Mathiasen, H. P. Heldt-Hansen and H. Dalbøge

Novo Nordisk A/S, Novo Allé, DK 2880 Bagsværd, Denmark

Abstract

Several plant cell wall degrading enzymes have been cloned by the expression cloning technique. These enzymes can be used to degrade isolated plant cell wall polysaccharides into oligomers or to extract poly- or oligosaccharides from insoluble and complex plant cell wall material, thereby providing soluble dietary fibre or oligosaccharides with potentially beneficial physiological effects. The cloned enzymes can also be used to control, for example, viscosity in the industrial processing of plant material. This is illustrated by the degradation of various arabinoxylans or arabinoxylan-containing plant material with cloned xylanases and by the degradation of rhamnogalacturonans or rhamnogalacturonan-containing plant material with cloned rhamnogalacturonases and accessory enzymes.

1. INTRODUCTION

Due to the abundance of plant material in nature and the diversity of its carbohydrate components, the plant cell wall is a rich source of polysaccharides. In the primary wall, cellulose microfibrils form a structural network which is embedded in a matrix of hemicelluloses and pectic substances [1]. In grasses, such as wheat, corn and barley, the content of pectic substances is usually very low whereas the content of hemicelluloses is high. The major hemicelluloses are the arabinoxylans, which are composed of a backbone of β-1,4-linked xylose units with different sidechains attached [1, 2]. The sidechains are usually single-unit α-1,2 or α-1,3 linked arabinofuranose or α-1,2 linked 4-O-methylglucuronic acid [2]. Some xylose residues can be substituted at both C-2 and C-3, and the degrees of mono- and disubstitution vary within different populations of arabinoxylans [3, 4]. Xylans can be either soluble or insoluble. The reasons for the insolubility of arabinoxylans have not been fully elucidated, since alkali-extractable water-insoluble arabinoxylans seem to have the same structures as soluble arabinoxylans [4, 5].

In dicotyledons (and in monocotyledons other than the grasses) xyloglucans are the dominating hemicelluloses [1] and the content of pectic substances is relatively high (10-60 % of the wall polysaccharides). The pectic substances are characterized by a high content of galacturonic acid, which is present in homogalacturonan as well as rhamnogalacturonan polysaccharides [1]. In homogalacturonan, long stretches of α-1,4-linked galacturonic acid residues are only occasionally interrupted by a rhamnose residue, whereas rhamnogalacturonan is a polymer of alternating rhamnose and galacturonic acid residues [1]. In rhamnogalacturonan the rhamnose residues often carry arabinan, galactan or arabinogalactan sidechains [1]. Because of the abundant sidechains, the term "hairy region" is often used to describe the rhamnogalacturonan-rich regions of the pectic substances [6].

In the primary wall the cellulose fibrils give the cell the strength necessary to resist turgor pressure, while the hemicelluloses and pectic substances regulate the flexibility and porosity of the wall, which are necessary for cell expansion during growth [1]. Different models for the interlinkage of the wall polymers have been proposed, but it is generally believed that xyloglucans interlace the cellulose fibrils through strong hydrogen bonds [1]. It has also been suggested that the arabinogalactan or arabinan sidechains of pectic substances are covalently linked to cellulose or xyloglucan [7, 8], but this is not generally accepted [1, 9]. The rhamnogalacturonan (or "hairy") regions of the pectic substances supposedly alternate with homogalacturonan regions, and the pectic substances are crosslinked due to the ability of the homogalacturonan regions to form interchain "egg box junctions" through Ca2+ ions [1, 9]. Recently it was suggested that the "hairy" regions are not composed solely of rhamnogalacturonan, but that, for example, xylogalacturonan regions are an integral part of the "hairy" region, alternating with rhamnogalacturonan [10].

The ability of saprophytic filamentous fungi to produce plant cell wall degrading enzymes has been utilized in the production of industrial enzyme products [11]. These products usually contain varying amounts of glucanases, xylanases and pectinases [11-13]. The glucanase and xylanase enzyme systems have been thoroughly described [14]. Glucanases will not be described any further. Xylanases hydrolyse the β-1,4-linkage between unsubstituted xylose residues in arabinoxylans, but for complete degradation of arabinoxylans exo-enzymes are necessary in order to remove substituents [2, 15]. One example is the arabinofuranosidases, which remove the arabinose substituents and thereby make more sites available to the xylanase [2, 15]. Pectic enzymes working on homogalacturonan regions, as well as arabinanases and galactanases, have been studied for years [6, 14, 16-21]. In contrast, enzymes cleaving within the rhamnogalacturonan regions of pectic substances have been described only very recently. Analogous to the glucanase, xylanase and pectinase enzyme systems, a set of endo- and exo-enzymes exists which synergistically degrade the rhamnogalacturonan [22]. These enzymes include rhamnogalacturonases [23-26], rhamnogalacturonan acetyl esterase [27, 28] and rhamnopyranohydrolase [22].

The industrial multi-enzyme complexes have found application in wine and juice production, the pulp and paper industry, baking, animal feed, the textile industry, vegetable oil extraction, and the production of indigestible oligo- and polysaccharides [12, 29-31]. A battery of enzymes is often necessary for complete degradation of the plant cell wall material, and commercial carbohydrases, e.g. pectinases produced from fungi, can contain a large number of different activities [13]. However, in new as well as existing applications it has sometimes been realized that only a few enzyme activities are necessary to achieve the desired effect. At best the additional activities are superfluous, but in some applications they are even undesirable [11]. Some initiatives have been taken to purify selected enzyme activities, however not on a commercial scale. As an alternative to large scale purification of enzymes, cloned monocomponent enzymes have the potential of offering a better use of resources in the fermentation, a better control of the industrial enzyme reaction and a more economical and ecological dosage of enzyme protein. Of particular interest is the controlled degradation or modification of specific components of the plant cell wall, whereby selected functional properties might be enhanced. Application of monocomponent enzymes enables an understanding of the relationship between the structure of plant cell wall components and their functionality, e.g. effect on viscosity, water binding capacity, mouthfeel etc.

The usual way of obtaining monocomponent enzymes is to identify the enzyme component in the enzyme mixture, purify the enzyme, determine the amino acid sequence, use this information to construct a labeled DNA probe, isolate the gene by hybridization from a cDNA or genomic library constructed from the fungus in question and finally transform the gene into an expression host for production of large amounts of monocomponent enzyme [32]. Recently, an alternative method for isolation and expression of fungal genes was introduced. In the expression cloning technique (fig. 1) the gene is isolated by virtue of its expression in yeast as an active enzyme. The yeast harboring the gene is identified by the activity of the enzyme, visualized by a sensitive plate screening assay [32]. By use of expression cloning, the steps of enzyme purification, amino acid sequencing, construction of probes and hybridization can be omitted [32]. Additionally, more than one enzyme with the activity in question can be isolated simultaneously [32]. The technique has been shown to be a powerful tool for the isolation of plant cell wall degrading enzymes from filamentous fungi such as Humicola insolens and Aspergillus aculeatus [21, 24, 32, 33].

The cell wall polysaccharides of plants are the main contributors to the intake of dietary fibre by humans [34]. Dietary fibre escapes digestion by human digestive enzymes but is fermented in the large bowel to varying degrees [34, 35]. Insoluble dietary fibre is only slightly fermented and mainly serves the physiological purpose of adding bulk to the faeces and decreasing transit time [35]. Soluble types of dietary fibre are fermented more extensively and serve to increase the viscosity of gastrointestinal fluids as well as to regulate lipid metabolism [35]. In particular, the beneficial effects of soluble types of dietary fibre on blood glucose and cholesterol regulation and on the control of intestinal flora have caused an increasing interest in the addition of these types of dietary fibre to foods [36-38]. Oligosaccharides, e.g. xylooligosaccharides, have also received some attention due to their possible beneficial effect on intestinal flora [39, 40].

In the present study the possible applications of cloned enzymes for the production of poly- or oligosaccharides and for the processing of different plant materials are described.

2. MATERIALS AND METHODS

2.1. Enzyme isolation

A cDNA library from A. aculeatus was constructed and transformed into S. cerevisiae as described [24]. For identification of xylanase-producing yeast colonies, AZCL-xylan (MegaZyme, Australia) was incorporated into the agar plates. Xylanase activity was visualized by a blue halo surrounding the yeast colony. Rhamnogalacturonases and galactanase were cloned as described [21, 24]. Arabinanase-producing yeast colonies were identified by incorporation of AZCL-arabinan into the plates, whereas α-arabinofuranosidase-producing colonies were identified with an overlayer of methylumbelliferyl-α-arabinofuranoside giving rise to a fluorescent zone. The genes were isolated and transformed into A. oryzae as described [32, 41]. A. oryzae transformants were fermented as described [24] and recombinant enzymes were purified from the culture supernatant by ion exchange chromatography.

Figure 1. The principle of expression cloning. A cDNA library is constructed in an E. coli/yeast shuttle vector from an enzyme-producing fungus. The library is amplified in E. coli and subsequently transformed into yeast. Yeast colonies which produce fungal enzymes are detected by appropriate enzyme assays. Vector DNA is isolated from the positive yeast colony and the gene encoding the enzyme is inserted into an Aspergillus vector. After transformation of Aspergillus, large amounts of essentially monocomponent enzyme can be produced.

2.2. Substrates

Birch xylan was obtained from Roth, soluble wheat arabinoxylan from MegaZyme. Insoluble wheat arabinoxylan was produced by treatment of wheat flour with Termamyl® and Alcalase® and recovery of insolubles by centrifugation and sieving. Corn cell wall material (Corn CWM) was isolated by successive treatments of dehulled corn kernels with Alcalase® and Termamyl® and recovery of the insoluble cell wall material by sieving. Modified hairy regions from apples were isolated according to Schols et al. 1990 [42]. Soy cell wall material (Soy CWM) was isolated by Alcalase® treatment and jet cooking (115 °C, 4 minutes) of soy meal followed by centrifugation and recovery of insolubles.

2.3. Small scale enzyme treatments

Enzyme reactions were carried out at 30 °C in 1.5 ml Eppendorf® tubes in temperature-controlled Eppendorf Thermomixers using varying amounts of enzyme and incubation times. The enzyme reaction was stopped by raising the temperature to 95 °C for 20 minutes. Insoluble substrates were centrifuged after incubation and the supernatants recovered for analyses. Soluble substrates could be analysed with no further purification.

2.4. Viscosity reduction of wheat flour slurries Suspensions of commercial wheat flour (45 % w/w) in water were treated with enzymes (2.35 mg enzyme protein / g wheat flour) for 1 minute at 35 ~ The viscosity was measured at 40 rpm in a Brookfield viscosimeter. 2.5. Production of cloud stable apple juice Apples (Red Belle de Boskop) were cut and milled. Enzyme preparations (25 mg enzyme protein / kg mash) were added to the mash and incubated for 2 hours at 20 ~ whereafter the mash was pressed. The resulting apple juice was pasteurised to discontinue further enzyme degradation. The cloud was measured as turbidity in EF/F units [43]. The cloud stability was determined by a centrifugation test as the amount of turbidity remaining after centrifugation for 4169 x g for 15 minutes [43]. 2.6. HPLC analysis of enzyme digests The molecular weight distribution of enzyme digests was determined by high pressure size exclusion chromatography (HPSEC) which implied separation on three TSK gelfiltration columns (PW G3500, PW G3000 and PW G2000 obtained from TosoHaas) connected in series followed by refractive index detection (RID) on a RID6A (Shimadzu). The saccharides were eluted with 0.4M Sodium acetate buffer pH 3.0 at a flow rate of 0.8ml/min using a Dionex gradient pump (Dionex Corporation). The chromatograms were processed by Dionex software AI450 and Dextran standards (Serva) were used for estimation of the molecular weight (Mw) and degree of polymerization (DP). The amount of soluble saccharide in the sample could be estimated from the area of the chromatogram. Oligomers obtained from the different substrates after enzyme digestion were separated by High Pressure Anion Exchange Chromatography (HPAEC). Oligomers were eluted from a CarboPac PAl column (Dionex Corporation) with a gradient of sodium acetate in 0.1M NaOH. Gradient mixing was controlled by the Dionex gradient pump. 
25 µl were injected and eluting saccharides were detected by Pulsed Amperometric Detection (PAD) [44]. Xylooligomers were eluted with 0.1 M NaOH from 0-10 min followed by a linear gradient of 0-0.2 M sodium acetate over 40 minutes. Rhamnogalacturonan oligomers were eluted with an acetate gradient according to Schols et al. [45].
For the determination of monosaccharide composition, enzyme digests were hydrolysed in 2 M trifluoroacetic acid (TFA) for 1 hour at 121 °C followed by evaporation. The hydrolysate was redissolved in water and 25 µl was injected onto the CarboPac PA1 column. The monosaccharides were eluted with a step gradient of 5 mM NaOH from 0-12 min, water from 12-28 min and 0.1 M NaOH from 28-35 min, followed by a linear gradient of 0-300 mM sodium acetate in 0.1 M NaOH from 35-54 min. The column was rinsed from 54-64 min with 0.5 M NaOH and equilibrated from 64-70 min in 5 mM NaOH. The eluting saccharides were detected by Pulsed Amperometric Detection (PAD). For calibration of the detector response, standard solutions of 0.25 mM, 0.5 mM and 1 mM rhamnose, fucose, arabinose, galactose, glucose, mannose, xylose, galacturonic acid and glucuronic acid (all obtained from Sigma) were hydrolysed in TFA and analysed as described. The content of the individual monosaccharides in the enzyme digests was calculated by linear regression against these standards.
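The calibration step described above can be sketched as a least-squares line through the three standard concentrations. This is only an illustration: the peak areas below are hypothetical numbers, and the function names are not taken from the Dionex software.

```python
# Minimal sketch of a three-point detector calibration by linear regression.
# The peak areas are hypothetical; only the standard concentrations
# (0.25, 0.5 and 1 mM) come from the text.

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to get a concentration from a peak area."""
    return (area - intercept) / slope


standards_mM = [0.25, 0.5, 1.0]
peak_areas = [12.0, 24.5, 49.0]  # hypothetical PAD peak areas

slope, intercept = fit_line(standards_mM, peak_areas)
sample_mM = concentration_from_area(30.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, sample={sample_mM:.3f} mM")
```

One such line would be fitted per monosaccharide, since the PAD response differs between sugars.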

3. RESULTS AND DISCUSSION

3.1. Cloning of plant cell wall degrading enzymes from A. aculeatus
When an A. aculeatus cDNA library in yeast was screened for xylanase activity on AZCL-xylan, several clones were obtained representing three different xylanases, Xyl I, Xyl II and Xyl III¹. Thus, by expression cloning, three different enzymes sharing the same activity could be cloned simultaneously, which demonstrates the advantage of this technique. The same library was screened for rhamnogalacturonase activity [24] and a new rhamnogalacturonase (RGase B) was identified, whereas the previously described RGase A [23] was cloned by the PCR technique due to lack of expression in yeast [24]. The rhamnogalacturonan acetyl esterase (RGAE) from A. aculeatus was also cloned by the PCR technique [28], because no suitable plate screening assay was available. The galactanase (Gal), arabinanase (Ara) and α-arabinofuranosidase (Ara.f) from A. aculeatus were cloned from the cDNA library as described² [21]. After expression in A. oryzae the enzymes were purified by ion exchange chromatographic methods essentially as described [21, 24, 33].

3.2. Composition of xylan substrates
The arabinose to xylose ratio of the different xylan substrates used in this study has been determined and the results are shown in table 1. The birch xylan contains no arabinose sidechains, whereas the soluble wheat arabinoxylan on average contains one arabinose substituent for every second xylose residue. In the insoluble wheat arabinoxylan the arabinose to xylose ratio is the same as in soluble wheat arabinoxylan, in accordance with previously reported results [5]. However, most likely some of the xylose residues will be substituted with arabinose at C-3 as well as at C-2. In the corn cell wall material the arabinose content was higher, which indicates a high level of disubstitution of xylose residues with arabinose in this substrate.

Table 1
The arabinose to xylose ratio in arabinoxylan substrates

Substrate                       Ara/xyl
Birch xylan                     0.0025
Soluble wheat arabinoxylan      0.51
Insoluble wheat arabinoxylan    0.52
Corn CWM                        0.78
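As a quick check on how the ratios in table 1 translate into substitution frequency, the reciprocal of the ara/xyl ratio gives the average number of xylose residues per arabinose. The sketch below assumes nothing beyond the four table values; the function name is illustrative.

```python
# Ara/xyl ratios copied from table 1.
ARA_XYL = {
    "Birch xylan": 0.0025,
    "Soluble wheat arabinoxylan": 0.51,
    "Insoluble wheat arabinoxylan": 0.52,
    "Corn CWM": 0.78,
}


def xylose_per_arabinose(ratio):
    """Average number of xylose residues per arabinose substituent."""
    return 1.0 / ratio


for substrate, ratio in ARA_XYL.items():
    print(f"{substrate}: {xylose_per_arabinose(ratio):.1f} xylose per arabinose")
```

For the wheat arabinoxylans this gives about two xylose residues per arabinose, which matches the "every second xylose residue" statement in the text. The average says nothing about how the substituents are distributed along the backbone.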

3.3. Degradation of soluble birch xylan
When birch xylan was degraded by Xyl I, Xyl II or Xyl III and analysed by HPAEC, the xylooligomers eluted as seen in fig. 2. The hydrolysis of the substrate was followed by time course studies involving different dosages of enzyme. In all time course studies for all three xylanases the oligomer profile showed a minimum at DP 10 throughout the hydrolysis. This strongly suggests that oligomers of around 10 residues are the preferred substrate for the xylanases, because they seem to be degraded as soon as they are produced. From the time course studies it was possible to find a degree of depolymerization of the birch xylan substrate which was identical for the three enzymes. At this identical degree of depolymerization the oligomer patterns obtained with the three enzymes were almost identical. The only difference was seen in the amount of xylose and xylobiose produced¹. Xyl I produced no xylose but small amounts of xylobiose. Xyl II, which is shown in fig. 2, produced large amounts of xylose and xylobiose, whereas Xyl III produced smaller amounts of xylose and xylobiose¹.

3.4. Degradation of soluble and insoluble wheat arabinoxylans
The chromatograms resulting from the HPAEC analysis of soluble wheat arabinoxylan degradation products, fig. 3, were slightly more complex than those obtained for the unsubstituted soluble birch xylan. Some extra peaks emerged when compared to the birch xylan oligomers. The additional peaks are expected to be xylooligomers with arabinose substituents. As for the soluble birch xylan, the three xylanases produced identical oligomers from soluble wheat arabinoxylan (except for xylose and xylobiose), indicating that the preferred points of cleavage are identical. With wheat arabinoxylan, DP 6-8 were produced in very small amounts and instead DP 9-11 accumulated.
These results are in agreement with previously reported results [10] and indicate that arabinose substituents prevent the otherwise preferred degradation of xylooligomers with DP around 10. Therefore, if xylooligomers with high DPs are desirable, an arabinoxylan substrate should be chosen instead of an unsubstituted xylan.
The HPAEC chromatograms become more complicated when the insoluble wheat arabinoxylan is used as substrate, fig. 4. As opposed to the soluble wheat arabinoxylan, the oligomers produced by the three xylanases were no longer identical. The differences in the degradation products have not been identified in this study. An increase in disubstituted xylose residues adds yet another source of variation in arabinoxylooligomer structures, which can explain the more complex oligomer pattern. Studies on insoluble wheat arabinoxylan have been carried out with two different xylanases isolated from A. niger [46]. In those studies the oligomer structures were identified by NMR and the xylanases were shown to differ in their sensitivity to arabinose substitution [46, 47]. Thus, the differences in the degradation products obtained with the three xylanases from A. aculeatus probably result from differences in the preferred sites of attack in the highly substituted xylan backbone.
The degradation of the insoluble wheat arabinoxylan was also followed by HPSEC. The amount of solubilised material could be estimated from the area under the curve in the chromatogram. In fig. 5 three chromatograms were chosen in which the enzyme dosage and time of hydrolysis would give a degree of depolymerization of soluble birch xylan which was identical for the three xylanases. It is clearly seen that Xyl II was not capable of solubilising the same amount of arabinoxylan as Xyl I and Xyl III and that the solubilised material had a lower

[Figure 2 image: stacked HPAEC chromatograms (PAD response in µC against retention time, 5-40 min) after 15, 60, 120 and 240 min and 24 h of incubation; labelled peaks include xylose, xylobiose and DP5.]

Figure 2. HPAEC of birch xylan degradation products. In a time course study, 4 mg of Xyl II was added to 1.5 ml aliquots of a 1 % solution of birch xylan in 0.1 M acetate buffer pH 5.0, which were incubated at 30 °C for 15, 60, 120 or 240 minutes or 24 hours. The oligomers produced were eluted from a CarboPac PA1 column with an acetate gradient, resulting in the chromatograms shown. Similar time course studies were performed with Xyl I and Xyl III.

[Figure 3 image: stacked HPAEC chromatograms (PAD response in µC against retention time, 5-40 min) after 15, 60, 120 and 240 min and 24 h of incubation.]

Figure 3. HPAEC of wheat arabinoxylan degradation products. The experimental conditions were as described in fig. 2, except that the substrate was 1 % wheat arabinoxylan.

[Figure 4 image: HPAEC chromatograms for Xyl III, Xyl II and Xyl I (PAD response in µC against retention time, 5-40 min).]

Figure 4. HPAEC of insoluble wheat arabinoxylan degradation products. 3 % suspensions of insoluble wheat arabinoxylan were incubated with each of the three xylanases in time course studies. The three chromatograms shown represent an enzyme dosage and time of hydrolysis which, with soluble birch xylan as the substrate, would give identical degradation for the three xylanases.

molecular weight. This is in accordance with the finding that Xyl II has a very low activity on insoluble wheat arabinoxylan compared to Xyl I and Xyl III¹. Prolonged degradation with Xyl II did not increase the amount of solubilised material to the level seen with Xyl I and Xyl III, and the time course studies showed that at no stage of the hydrolysis could chromatograms of identical appearance be obtained for the three enzymes. This contrasts with the results on soluble birch xylan and soluble wheat arabinoxylan, and the HPSEC results confirm the differences seen between the three enzymes in the HPAEC oligomer analysis.

[Figure 5 image: HPSEC chromatograms for Xyl III, Xyl II, Xyl I and a blank (RI response in mV). X-axis marks: Mw >500,000; 125,000; 8,000; 500, corresponding to DP >3,200; 800; 50; 3.]

Figure 5. HPSEC of insoluble wheat arabinoxylan degradation products. The molecular weight distributions of the arabinoxylans released from insoluble wheat arabinoxylan by the action of xylanases were determined by HPSEC. The estimated molecular weight (Mw) and degree of polymerisation (DP) are shown on the X-axis. The chromatograms correspond to the HPAEC chromatograms shown in fig. 4. The amount of material released from the substrate can be estimated from the area: Xyl I: 22 %; Xyl II: 9 %; Xyl III: 20 %.
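Because the Mw scale is calibrated with dextran standards, a DP estimate can be sketched by dividing Mw by the mass of one anhydroglucose residue. That residue mass (162.14 g/mol) is an assumption introduced here for illustration; the text does not state how the DP scale of the figure was derived, and the marks printed on the axis are close to, but not identical with, the values this gives.

```python
# Sketch: DP estimated from a dextran-calibrated Mw.
# Assumption (not from the text): one residue = anhydroglucose,
# i.e. glucose (180.16 g/mol) minus water (18.02 g/mol).
ANHYDROGLUCOSE_MASS = 162.14  # g/mol


def estimate_dp(mw, residue_mass=ANHYDROGLUCOSE_MASS):
    """Approximate degree of polymerisation for a given molecular weight."""
    return mw / residue_mass


# Mw marks taken from the X-axis of the HPSEC figures.
for mw in (500_000, 125_000, 8_000, 500):
    print(f"Mw {mw:>7}: DP about {estimate_dp(mw):.0f}")
```

For pentose polymers such as xylan the true residue mass is lower, so a dextran-based DP is best read as a relative size scale rather than an absolute chain length.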

3.5. Degradation of corn CWM
The material liberated from corn CWM by the prolonged action of xylanases with and without the addition of arabinofuranosidase was studied. From the HPSEC chromatograms in fig. 6 the amount of solubilised material has been calculated, and the results are shown in table 2 together with the arabinose/xylose ratio. For all three xylanases the addition of arabinofuranosidase increased the amount of solubilised material and reduced its molecular weight. The monosaccharide composition shows that the solubilised material has a higher ara/xyl ratio than the intact CWM, which indicates that the xylan remaining in the wall has a lower degree of substitution than the liberated arabinoxylan. Thus arabinose substitution is not a major determinant of how tightly the polymers are fixed in the cell wall matrix.

Table 2
Amount and composition of solubilised material from corn CWM treated with xylanases and arabinofuranosidase

Enzyme              Amount released*, %    Ara/xyl ratio (mol/mol)
Xyl I               14                     1.04
Xyl I + Ara.f.      18                     1.09
Xyl II               5                     1.01
Xyl II + Ara.f.      9                     1.09
Xyl III              4                     1.13
Xyl III + Ara.f.    10                     1.14
A. aculeatus        35                     0.92

* Estimated from the areas of the HPSEC chromatograms in fig. 6.
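The effect of adding arabinofuranosidase can be read off table 2 directly. The short sketch below uses only the table values; the function and variable names are illustrative.

```python
# Amount released (%) from corn CWM, copied from table 2:
# (xylanase alone, xylanase + arabinofuranosidase)
released = {
    "Xyl I": (14, 18),
    "Xyl II": (5, 9),
    "Xyl III": (4, 10),
}
SUPERNATANT = 35  # % released by the A. aculeatus culture supernatant


def arabinofuranosidase_effect(alone, combined):
    """Return (gain in percentage points, fold increase)."""
    return combined - alone, combined / alone


for enzyme, (alone, combined) in released.items():
    gain, fold = arabinofuranosidase_effect(alone, combined)
    share = combined / SUPERNATANT
    print(f"{enzyme}: +{gain} points, {fold:.2f}-fold, "
          f"{share:.0%} of the supernatant yield")
```

Even the best combination (Xyl I with arabinofuranosidase) reaches only about half of the yield obtained with the culture supernatant.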

In accordance with the results on insoluble wheat arabinoxylan, Xyl II solubilises less material than Xyl I from the insoluble corn CWM. Xyl II releases only half the amount of material of Xyl I even when arabinofuranosidase is added. In contrast to the results obtained on insoluble wheat arabinoxylan, Xyl III does not release the same amount of material from corn CWM as Xyl I. When compared to the action of the complex A. aculeatus supernatant, the cloned enzymes release less material. Thus, from a solubilisation point of view, the cloned enzymes are inferior to the enzyme complex. The high solubilising power of the enzyme complex is probably due to the presence and action of several exo-enzymes which work in synergy with the xylanases [2, 15]. However, the many side activities result in a degradation of the released material into mainly mono- and dimers. This is not desirable if the extracted material is to be used for incorporation into foods as a functional food ingredient or soluble dietary fibre. Therefore, the intended application of the enzyme degradation products determines whether cloned monocomponent enzymes or the entire enzyme complex is preferable.

3.6. Viscosity reduction in wheat slurries

The purpose of wheat separation is to separate wheat gluten from wheat starch. Industrially this is accomplished in a wet milling process where a slurry of wheat flour in water is centrifuged by means of hydrocyclones or decanters, yielding several fractions. The fractions obtained are enriched in gluten, starch and wheat water solubles, respectively. The viscosity of the slurry determines the capacity of the wheat separation plant as well as the quality of the separation.

[Figure 6 image: HPSEC chromatograms (RI response in mV) for the A. aculeatus product, Xyl III + Ara.fur, Xyl III, Xyl II + Ara.fur, Xyl II, Xyl I + Ara.fur, Xyl I and a blank. X-axis marks: Mw >500,000; 125,000; 8,000; 500, corresponding to DP >3,200; 800; 50; 3.]

Figure 6. HPSEC of corn cell wall degradation products. Suspensions of corn CWM (3 % in 0.1 M acetate buffer pH 5.0) were incubated with xylanase (100 mg to 1.5 ml of substrate) at 30 °C for 24 hours. In some experiments α-arabinofuranosidase (100 mg to 1.5 ml of substrate) was used in combination with xylanase. Also, an experiment was performed in which the cell wall material was degraded by a culture supernatant of A. aculeatus. The molecular weight distribution of the released polysaccharides was determined by HPSEC.

In fig. 7 the viscosities of wheat slurries to which equal amounts of Xyl I, II or III have been added are shown. Wheat suspensions contain both soluble and insoluble xylan, of which the former contributes the most to viscosity [48]. The viscosity reduction obtained with Xyl II is considerably higher than that obtained with Xyl I and Xyl III. As previously described, Xyl II has a very low activity on insoluble wheat arabinoxylan, contrary to Xyl I and Xyl III. Thus, Xyl II does not release more xylan into the soluble phase but instead causes an immediate depolymerisation of the soluble xylan, leading to a reduction in viscosity, which is advantageous for separation of wheat components. The disadvantage of Xyl II for solubilization of insoluble xylan for production of xylooligomers or polymers is turned into an advantage when the xylanases are to be used for wheat separation.

Figure 7. Viscosity reduction in wheat flour slurries treated with xylanases. The viscosity after 1 minute of incubation is measured relative to the viscosity of a wheat flour suspension to which no enzyme was added.

3.7. Degradation of rhamnogalacturonan substrates
Previously it has been shown that the rhamnogalacturonases RGase A and RGase B, cloned from A. aculeatus, are functionally different [24]. Besides marked differences in pH optima and stability, the enzymes were shown to have different ratios of activity towards rhamnogalacturonan from apples, potatoes, lupins and sugar beets. When rhamnogalacturonan from apples was saponified and degraded with the RGases, the degradation products obtained after prolonged incubation were shown by HPSEC to be of identical molecular weight. However, analysis by HPAEC showed that the oligomers produced by the two enzymes eluted very differently from the CarboPac column. Therefore, it was anticipated that the new enzyme RGase B cleaves the linkage between rhamnose and galacturonic acid in the rhamnogalacturonan backbone, as opposed to RGase A, which has previously been shown to hydrolyse the linkage between galacturonic acid and rhamnose [24, 45]. In a very recent study it has also been shown that RGase B as well as RGase A acts in synergy with the cloned rhamnogalacturonan acetylesterase (RGAE) from A. aculeatus in the degradation of apple rhamnogalacturonan in which the acetyl esters have not been removed by saponification [28].

3.8. Degradation of soy CWM
In the present study the action of the RGases on soy CWM has been investigated. Soy CWM is known to have a very high content of galactan [49-51], which is present as sidechains in the rhamnogalacturonan polymers. Therefore it was interesting to study the degradation of soy CWM with the RGases in combination with RGAE and galactanase. Arabinanase and arabinofuranosidase were also included in order to obtain as complete a degradation of the sidechains as possible. A pH of 5.0 was chosen as a compromise between the acidic RGase A and the neutral RGase B. At pH 5.0 both enzymes retain 25 % of the activity they show at their optimal pH. The results of the HPSEC of solubilised material can be seen in figs. 8 and 9 for RGase A and B, respectively. The amount of solubilised material has been estimated from the area of the chromatograms, and the results are presented in table 3 together with the monosaccharide compositions.

Table 3
Amount and composition of material released from soy CWM by the action of cloned monocomponent enzymes.

                                      Amount         Monosaccharide composition
Enzyme                                released, %    Gal.A   Rha   Gal   Ara
Soy CWM (untreated)                    0             19      4     38    19
RGase A                                7              6      6     40    34
RGase A + RGAE                        17              5      5     51    35
RGase A + RGAE + Gal                  44              4      4     59    30
RGase A + RGAE + Gal + Ara + Ara.f    46              4      4     55    32
RGase B                               37              4      5     55    34
RGase B + RGAE                        52              3      4     58    33
RGase B + RGAE + Gal                  48              4      3     58    32
RGase B + RGAE + Gal + Ara + Ara.f    47              4      4     56    33

The HPSEC analysis revealed that RGase A in combination with RGAE released substantial amounts (17 %) of high molecular weight material from soy CWM. RGase B alone was capable of releasing 37 % high molecular weight material, a yield which could be increased to about 50 % by the addition of RGAE. The high molecular weight material has a DP, estimated from dextran standards, of about 300. The composition of the extracted polymers, seen in table 3, shows an almost 1:1 ratio of rhamnose and galacturonic acid and a very high content of galactose and arabinose. Thus, it must be anticipated that the solubilized material is almost entirely composed of fragments of rhamnogalacturonan backbone with long sidechains of arabinogalactans and arabinans attached. This is supported by the fact that all the released material is degraded completely to rhamnogalacturonan oligomers and galactose and arabinose mono- and dimers by the concerted action of RGase, RGAE, galactanase, arabinanase and arabinofuranosidase (figs. 8 and 9).
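The "almost 1:1" statement can be checked against table 3 by computing the rhamnose to galacturonic acid ratio for the untreated cell wall material and for the released fractions. The sketch below copies a subset of the table values; the dictionary layout and function name are illustrative only.

```python
# Gal.A and Rha values copied from table 3 (same units as the table).
composition = {
    "Soy CWM (untreated)": {"GalA": 19, "Rha": 4},
    "RGase A": {"GalA": 6, "Rha": 6},
    "RGase A + RGAE": {"GalA": 5, "Rha": 5},
    "RGase B": {"GalA": 4, "Rha": 5},
    "RGase B + RGAE": {"GalA": 3, "Rha": 4},
}


def rha_to_gala(sample):
    """Rhamnose to galacturonic acid ratio of one sample."""
    return sample["Rha"] / sample["GalA"]


for name, sample in composition.items():
    print(f"{name}: Rha/Gal.A = {rha_to_gala(sample):.2f}")
```

The untreated material has a ratio of about 0.2, whereas the released fractions lie at or near 1, which is the ratio expected for an alternating rhamnogalacturonan backbone. With table values given as whole numbers, ratios built from entries as small as 3 to 6 are only approximate.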

[Figure 8 image: HPSEC chromatograms (RI response in mV) for RGase A + RGAE + Gal + Ara + Ara.f., RGase A + RGAE + Gal, RGase A + RGAE and RGase A. X-axis marks: Mw >500,000; 125,000; 8,000; 500, corresponding to DP >3,500; 800; 50; 3.]

Figure 8. HPSEC of soy CWM released by RGase A in combination with different cloned monocomponent enzymes. Aliquots of 1 % suspensions of soy CWM in 0.1 M acetate buffer pH 5.0 were incubated with enzymes (40 mg of each to 1.5 ml of substrate) at 30 °C for 24 hours and the solubilized material was analysed.

[Figure 9 image: HPSEC chromatograms (RI response in mV) for RGase B + RGAE + Gal + Ara + Ara.f., RGase B + RGAE + Gal, RGase B + RGAE and RGase B. X-axis marks: Mw >500,000; 125,000; 8,000; 500, corresponding to DP >3,200; 800; 50; 3.]

Figure 9. HPSEC of soy CWM released by RGase B in combination with different cloned monocomponent enzymes. Soy CWM was incubated with RGase B as described in fig. 8 for RGase A.

If, as suggested [7, 8], the sidechains of rhamnogalacturonan were covalently attached to the xyloglucan or cellulose of the plant cell wall, then enzymes cleaving in the rhamnogalacturonan backbone should not alone be able to release large amounts of material from the wall. Therefore, soy rhamnogalacturonan does not seem to be attached to other plant cell wall constituents by means of the sidechains, which is in accordance with reports on other plant materials [52-54]. It could be argued, though, that only a few galactan or arabinan chains were involved in covalent crosslinks and that these are not released. In that case the addition of galactanase and/or arabinanase should increase the amount of solubilized polymers. This was not observed. The addition of galactanase, arabinanase and arabinofuranosidase to RGase B combined with RGAE did not increase the solubilization (table 3). The only effect of the addition of sidechain degrading enzymes was to depolymerise the soluble material extensively (fig. 9).
The results obtained with RGase A were slightly different. Addition of sidechain degrading activities to RGase A and RGAE increased the solubilization, although not to a level exceeding that of RGase B combined with RGAE. The most likely explanation for the results with RGase A is that the sidechains sterically hinder the action of RGase A and that the addition of galactanase or arabinanases reduces this hindrance. This explanation is supported by the monosaccharide composition, which shows that the galactose content is lower and the rhamnose and galacturonic acid contents are higher in the material released by RGase A (+/- RGAE) compared to RGase B. Thus, RGase A preferentially cleaves in rhamnogalacturonan which is not extensively substituted with sidechains. As for RGase B, the addition of sidechain degrading enzymes had the effect of converting the released high molecular weight fragments into mono-, di- and oligomers. RGases are therefore the enzymes of choice if high molecular weight polysaccharides are desired, whereas the addition of sidechain degrading enzymes is necessary if small oligosaccharides are the preferred end products.
Besides a high content of galactan, soy cell wall has also been reported to contain substantial amounts of xylogalacturonan [49, 51]. Xylogalacturonan has been suggested to be an integral part of the "hairy" regions of apple pectin, rhamnogalacturonan regions being interspersed with xylogalacturonan regions [10]. If, accordingly, the rhamnogalacturonan regions of soy cell walls were interspersed with xylogalacturonan regions, then polymers released by RGases would also contain xylose. However, the content of xylose was negligible in the material released by the RGases and accessory enzymes in this study. Also, no homogalacturonan was released along with the rhamnogalacturonan. This indicates that rhamnogalacturonan in soy cell walls is not interspersed with homogalacturonan either. The results with RGases therefore indicate that rhamnogalacturonan in soy cell walls exists in a matrix separate from a matrix of homogalacturonan or xylogalacturonan. This is in accordance with the finding that no rhamnogalacturonan could be extracted from soy CWM by polygalacturonases (results not shown), in contrast to the results obtained by Schols et al., who used pectinases to extract rhamnogalacturonans from various plant sources [42, 55].

3.9. Cloud stable apple juice
In several countries, cloudy fruit juices are produced. The quality of these juices lies in their pulpy appearance, and the stability of the cloud is of paramount importance. The cloud stability is influenced by the size and composition of the particles and the viscosity of the juice [43, 56]. Experiments with various pectin degrading enzymes for production of cloud stable apple juice demonstrated that some enzymes attacking the hairy regions of pectin result in increased cloud stability of the juice. RGase B could not be used for apple juice production because of the low pH. In accordance with the results on soy CWM, RGase A alone had almost no effect (results not shown), but when combined with galactanase and RGAE a cloud stable juice could be obtained (table 4). One explanation for the cloud stability can be the increased viscosity which was found in the juice due to a large solubilization of pectic substances (results not shown). In a study on apple protopectin, RGase alone was shown to solubilise some pectic material (homogalacturonan as well as rhamnogalacturonan) and no synergism was seen with galactanase [57]. The deviating results in the present study can possibly be explained by the use of a differently treated substrate with a different origin. The stabilizing effect of the galactanase used alone (table 4) might be due to modifications of the composition of already soluble pectic material or of the cloud particles rather than to the effect of material solubilized by the galactanase.

Table 4
Production of cloudy apple juice from Red Belle de Boskop

Enzyme                  Turbidity before    Increase in turbidity relative    Cloud stability
                        centrifugation      to untreated control, %           ATz, %
Untreated               1061 ± 112          100                               56 ± 3
RGase A + Gal + RGAE    1333 ± 102          125                               86 ± 10
Galactanase             1212 ± 28           114                               77 ± 24

4. CONCLUSIONS
In the present study it has been shown that cloned monocomponent enzymes, used alone or in combination, can be used for the production of soluble oligo- or polysaccharides from different plant polysaccharides or complex plant cell wall material. First, the choice of plant polysaccharide or material determines the type of saccharide which can be extracted. Secondly, the choice of enzymes determines the composition and molecular weight of the resulting degradation products. Thus, by careful selection of plant material and enzymes it is possible to obtain a wide range of saccharide products, pectic as well as hemicellulosic, with high or low molecular weight, depending on the preference. It has also been shown that the cloned monocomponent enzymes are valuable tools for control of the processing of plant material, such as in the wheat separation or apple juice processes. A similar regulation and control of enzyme reaction products cannot be obtained with multi-enzyme complexes. Finally, it has become evident that an enzyme with inferior properties for one particular purpose, when compared to enzymes of the same class, can be superior for other purposes.

5. ACKNOWLEDGEMENTS
Thanks are due to Sakari Kauppinen, Lene Nonboe Andersen, Stephan Christgau, Tina S. Jacobsen, Torben Halkier, Kurt Dörreich, Susanne Hüttel, Lotte R. Henriksen and Flemming M. Christensen for their contribution to this work. Also, we thank Susanne G. Jacobsen, Marianne Rohde and Margit T. Kjaer for skillful technical assistance.

6. FOOTNOTES
1 Sandal, T. et al., manuscript in preparation
2 Andersen, L. N. et al., manuscript in preparation

7. REFERENCES

1. N.C. Carpita and D.M. Gibeaut, Plant J., 3 (1993) 1.
2. M.P. Coughlan and G.P. Hazlewood, Biotechnol. Appl. Biochem., 17 (1993) 259.
3. R.A. Hoffmann, B.R. Leeflang, M.M.J. de Barse, J.P. Kamerling and J.F.G. Vliegenthart, Carbohydr. Res., 221 (1991) 63.
4. H. Gruppen, F.J.M. Kormelink and A.G.J. Voragen, J. Cereal Sci., 18 (1993) 111.
5. H. Gruppen, R.J. Hamer and A.G.J. Voragen, J. Cereal Sci., 16 (1992) 41.
6. J.A. de Vries, F.M. Rombouts, A.G.J. Voragen and W. Pilnik, Carbohydr. Polym., 2 (1982) 25.
7. T. Sakai, T. Sakamoto, J. Hallaert and E.J. Vandamme, Adv. Appl. Microbiol., 39 (1993) 213.
8. J. Hwang, Y.R. Pyun and J.L. Kokini, Food Hydrocolloids, 7 (1993) 39.
9. S.C. Fry, Ann. Rev. Plant Physiol., 37 (1986) 165.
10. A.G.J. Voragen, H.A. Schols and H. Gruppen, in Plant Polymeric Carbohydrates, F. Meuser, D.J. Manners and W. Seibel (eds.), 3-17, Royal Society of Chemistry, Cambridge (1992).
11. O.P. Ward and M. Moo-Young, CRC Crit. Rev. Biotechnol., 8 (1989) 237.
12. T. Godfrey and J. Reichelt, Industrial Enzymology. The Applications of Enzymes in Industry, Stockton Press, New York (1983).
13. A. Schönfeld and U. Behnke, Die Nahrung, 35 (1991) 395.
14. C.A. White and J.F. Kennedy, in Carbohydrate Chemistry, J.F. Kennedy (ed.), Oxford University Press (1988).
15. F.J.M. Kormelink and A.G.J. Voragen, Appl. Microbiol. Biotechnol., 38 (1993) 688.
16. W. Pilnik and F.M. Rombouts, in Polysaccharides in Food, J.M.V. Blanshard and J.R. Mitchell (eds.), 109-126, Butterworths, London (1979).
17. G. Beldman, M.J.F. Searle-van-Leeuwen, G.A. De Ruiter, H.A. Siliha and A.G.J. Voragen, Carbohydr. Polym., 20 (1993) 159.
18. P. Lerouge, M.A. O'Neill, A.G. Darvill and P. Albersheim, Carbohydr. Res., 243 (1993) 373.
19. J.M. Labavitch, L.E. Freeman and P. Albersheim, J. Biol. Chem., 251 (1976) 5904.
20. J.W. Van De Vis, M.J.F. Searle-van-Leeuwen, H.A. Siliha, F.J.M. Kormelink and A.G.J. Voragen, Carbohydr. Polym., 16 (1991) 167.
21. S. Christgau, T. Sandal, L.V. Kofod and H. Dalbøge, Curr. Genet., 27 (1995) 135.
22. M. Mutter, G. Beldman, H.A. Schols and A.G.J. Voragen, Plant Physiol., 106 (1994) 241.
23. H.A. Schols, C.C.J.M. Geraeds, M.F. Searle-van-Leeuwen, F.J.M. Kormelink and A.G.J. Voragen, Carbohydr. Res., 206 (1990) 105.
24. L.V. Kofod, S. Kauppinen, S. Christgau, L.N. Andersen, H.P. Heldt-Hansen, K. Dörreich and H. Dalbøge, J. Biol. Chem., 269 (1994) 29182.
25. T. Sakamoto and T. Sakai, Carbohydr. Res., 259 (1994) 77.
26. J. An, M.A. O'Neill, P. Albersheim and A.G. Darvill, Carbohydr. Res., 264 (1994) 83.
27. M.J.F. Searle-van Leeuwen, L.A.M. van den Broek, H.A. Schols, G. Beldman and A.G.J. Voragen, Appl. Microbiol. Biotechnol., 38 (1992) 347.
28. S. Kauppinen, S. Christgau, L.V. Kofod, T. Halkier, K. Dörreich and H. Dalbøge, (submitted for publication).
29. G.R. Beldman, A.G.J. Voragen and W. Pilnik, Enzyme Microb. Technol., 6 (1984) 503.
30. A.G.J. Voragen, H.A. Schols and G. Beldman, Fruit Processing, 2 (1992) 98.
31. E.-M.B.A.W. Düsterhöft, J.C. Venekamp and A.G.J. Voragen, World J. Microbiol. & Biotechnol., 9 (1993) 544.
32. H. Dalbøge and H. Heldt-Hansen, Mol. Gen. Genet., 243 (1994) 253.
33. S. Christgau, S. Kauppinen, J. Vind, L.V. Kofod and H. Dalbøge, Biochem. Mol. Biol. Int., 33 (1994) 917.
34. P.S. Selvendran, Amer. J. Clin. Nutri., 39 (1984) 320.
35. M.L. Dreher, Handbook of Dietary Fibre. An Applied Approach, Marcel Dekker Inc., New York (1987).
36. S.A. Andon, Food Technology, Jan., (1987) 74.
37. J. Frank and V. Wheelock, British Food Journal, 90 (1988) 22.
38. M. Glicksman, Food Technology, Oct., (1991) 94.
39. K. Koga and S. Fujikawa, Jap. Technol. Rev. Biotechnol., 3 (1990) 124.
40. A.J. Morgan, A.J. Mul, G. Beldman and A.G.J. Voragen, Agro-Food-Industry Hi-Tech, Nov/Dec., (1992) 35.
41. T. Christensen, H. Wöldike, E. Boel, S.B. Mortensen, K. Hjortshøj, L. Thim and M.T. Hansen, Bio/Technology, 6 (1988) 1419.
42. H.A. Schols, M.A. Posthumus and A.G.J. Voragen, Carbohydr. Res., 206 (1990) 117.
43. S. Hamatschek, Dissertation Hohenheim, (1989).
44. K. Koizumi, Y. Kubota, T. Tanimoto and Y. Okada, J. Chrom., 464 (1989) 365.
45. H.A. Schols and A.G.J. Voragen, Carbohydr. Res., 256 (1994) 97.
46. H. Gruppen, F.J.M. Kormelink and A.G.J. Voragen, J. Cereal Sci., 18 (1993) 129.
47. H. Gruppen, R.A. Hoffmann, F.J.M. Kormelink, A.G.J. Voragen, J.P. Kamerling and J.F.G. Vliegenthart, Carbohydr. Res., 233 (1992) 45.
48. H. Gruppen, F.J.M. Kormelink and A.G.J. Voragen, Enzymes in Animal Nutrition Symposium, Kartause Ittingen (1993).
49. A.M. Stephen, in The Polysaccharides, G.O. Aspinall (ed.), 97-193, Academic Press (1983).
50. J.-M. Brillouet and B. Carré, Phytochemistry, 22 (1983) 841.
51. H.A. Schols, G. Lucas-Lokhorst and A.G.J. Voragen, Carbohydrates in the Netherlands, 9 (1993) 7.
52. J.-F. Thibault, R. De Dreu, C.C.J.M. Geraeds and F.M. Rombouts, Carbohydr. Polym., 9 (1988) 119.
53. C.M.G.C. Renard, M.J.F. Searle-van-Leeuwen, A.G.J. Voragen, J.-F. Thibault and W. Pilnik, Carbohydr. Polym., 14 (1991) 295.
54. C.M.G.C. Renard, H.A. Schols, A.G.J. Voragen, J.-F. Thibault and W. Pilnik, Carbohydr. Polym., 15 (1991) 13.
55. H.A. Schols and A.G.J. Voragen, Carbohydr. Res., 256 (1994) 83.
56. D.L. McKenzie and T. Beveridge, Food Microstructure, 7 (1988) 195.
57. C.M.G.C. Renard, J.-F. Thibault, A.G.J. Voragen, L.A.M. van den Broek and W. Pilnik, Carbohydr. Polym., 22 (1993) 203.

S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © Elsevier Science B.V. All rights reserved.

Fatty acid esters of ethyl glucoside, a unique class of surfactants
Otto Andresen and Ole Kirk
Novo Nordisk A/S, DK 2880 Bagsværd, Denmark

Abstract
Fatty acid monoesters of carbohydrates have been notoriously difficult to synthesize. The development of a simple procedure, utilizing the regioselectivity of a lipase, which makes commercial production of such compounds in bulk quantities feasible, is described. The unique performance of the products in selected applications is briefly touched upon. The products differ significantly from petrochemical nonionic surfactants in structure and purity. An attempt is made to connect these differences to the differences in behaviour between the two classes of nonionic surfactants.

1. SYNTHESIS In 1983 we at Novo initiated a search for enzymes to be applied on an industrial scale for synthesis of organic compounds. Among the target molecules were carbohydrate monoesters of fatty acids. This was an obvious choice for several reasons. 9 The products were expected to have unique surface active properties which might lead to a large commercial potential. 9 Enzymes may provide the specificity, required for mono-acylation of sugars. 9 Enzymes may perform at conditions mild enough to prevent unwanted side reactions of the carbohydrate. 9 With the product made from renewable raw materials, an enzymatic synthesis would make not only the product but also the production environmentally friendly. Lipases are well established as catalysts for ester synthesis under mild conditions [1]. Furthermore, lipases with selectivity for primary alcohol groups are known [2], which would seem ideal for selective formation of 6-O-monoesters of e.g. glucose. However, all attempts to perform a synthesis along these lines met with one difficult problem, the strongly limited solubility of either raw material in solvents suitable for the other. In the few solvents which dissolved both, like pyridine and DMF, only limited conversion was obtained [3, 4, 5]. In our work to overcome this problem, we at Novo surprisingly found that the minor modification of converting the glucose to a glucoside, e.g. ethyl D-glucopyranoside, had a dramatic effect on the solubility in organic solvents, even in melted fatty acids. Based on this

discovery, a simple and very efficient synthesis (Figure 1) was developed [6, 7, 8]. Using a solution of ethyl glucoside in melted fatty acid as substrate and, at the same time, reaction medium, and applying an immobilized lipase from Candida antarctica as catalyst, more than 95% yield of the 6-O-monoester could be obtained. To drive the reaction, the liberated water was removed under vacuum and a small stoichiometric surplus of fatty acid was applied. The synthesis could be carried out with fatty acids from C8 to C18 and also with C22:1. The lower limit is set by the aggressiveness of the acid towards the enzyme, and the upper limit by the melting point of the fatty acid, which must be below the temperature stability limit of the enzyme.

Figure 1. One-pot synthesis of fatty acid 6-O-monoester of ethyl glucoside. [Reaction scheme: glucose + ROH (H+) gives the alkyl glucoside, which is esterified with the fatty acid by immobilised lipase.]

Besides ethyl glucoside, the synthesis worked well with propyl, iso-propyl, butyl, iso-butyl and even with phenyl glucoside. Methyl glucoside gave solubility problems but these were finally overcome [9]. For reasons of acceptability by the market, ethyl glucoside was selected as the raw material of choice. The transfer of the process to pilot plant scale caused no major problems, and production on a 20 kg scale was set up in order to prepare material for application tests and, not least, for tests in relation to EINECS registration.
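The amount of water that has to be removed to drive the esterification can be estimated from the stoichiometry alone. The following Python sketch is an illustration added here, not part of the original work. It assumes a saturated straight-chain fatty acid (CnH2nO2), ethyl D-glucopyranoside as C8H16O6 and rounded standard atomic masses; all function names are hypothetical.

```python
# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

ETHYL_GLUCOSIDE = {"C": 8, "H": 16, "O": 6}  # ethyl D-glucopyranoside
WATER = {"C": 0, "H": 2, "O": 1}


def molar_mass(formula):
    """Molar mass (g/mol) of a compound given as {element: atom count}."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())


def saturated_fatty_acid(n_carbons):
    """Formula of a saturated straight-chain fatty acid, CnH2nO2 (assumption)."""
    return {"C": n_carbons, "H": 2 * n_carbons, "O": 2}


def monoester(acid):
    """Formula of the monoester: glucoside + acid - one water (condensation)."""
    return {el: ETHYL_GLUCOSIDE[el] + acid[el] - WATER[el] for el in ATOMIC_MASS}


def water_per_kg_ester(n_carbons):
    """Mass of water (kg) liberated per kg of monoester formed."""
    ester = monoester(saturated_fatty_acid(n_carbons))
    return molar_mass(WATER) / molar_mass(ester)


for n in (8, 12, 18):
    ester = monoester(saturated_fatty_acid(n))
    print(f"C{n}: ester {ester}, M = {molar_mass(ester):.1f} g/mol, "
          f"water = {1000 * water_per_kg_ester(n):.0f} g per kg ester")
```

For lauric acid (C12) the sketch gives the ester C20H38O7, about 390.5 g/mol, so roughly 46 g of water is liberated per kg of monoester. The unsaturated C22:1 acid mentioned above is not covered by the CnH2nO2 assumption.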

2. APPLICATIONS

These ethyl glucoside monoesters, in the following called EGEs, were first tested in skin creams and shampoos. In skin cream a surprisingly good moisturizing effect was observed. In shampoos good cleaning was obtained and the clean hair even acquired a silky lustre. When it turned out that it was possible to produce the EGEs cheaply enough to be considered for application in household detergents, laboratory washing tests were carried out in standard formulations. These demonstrated a performance matching that of the conventional alcohol ethoxylates. Tests in connection with the EINECS registration demonstrated that the EGEs are non-toxic to warm-blooded animals, do not irritate skin or eyes, have low toxicity to aquatic organisms and possess a remarkably rapid biodegradability.

In applications they are unique in several ways. They are very effective for removal of oil, fat and grease and, in contrast to other surfactants, in most cases they have the best effect alone, without builders and co-surfactants. The cleaning is good even at room temperature. The removed oil separates quickly from the washing fluid if not too much mechanical agitation is applied. The EGEs lower the surface tension of water to a very low level; they are poor foamers alone, but show foam synergy together with anionic surfactants, and they can be very good emulsion stabilizers under the right conditions. On the negative side, it should be mentioned that they are not very good at removing particulate dirt and protein stains.

3. PROPERTIES

Structurally, the EGEs differ from the alkyl ethoxylates in three main aspects:

- They are esters.
- The hydrophilic group is a carbohydrate.
- They are single, well-defined molecular species.

The first property is responsible for the limited stability in aqueous solution and, combined with the fact that they are synthesized from natural and easily digestible raw materials, for the high biodegradability. The fact that the hydrophilic group is a carbohydrate means that the hydrophilicity is much more concentrated than in alcohol ethoxylates, as can be seen in figure 2. The difference in size has two reasons. One is that in the ethoxylate side chain there are two carbon atoms for

each oxygen, while there is only one in the carbohydrate. The other, which concentrates the hydrophilic oxygen atoms even more, is the pyranose ring conformation of the sugar. It is highly probable that this is one reason for the remarkable grease removing capacity of the EGE which is demonstrated in figure 3, where the removal of a cutting oil from the surface of iron grains by a number of different surfactants is graphically depicted [10].

Figure 2. Computer image of C12 EGE compared with a C12 alcohol ethoxylate (panels labelled "Alkylethoxylate" and "C12-EGE"). Both contain seven oxygen atoms, represented in the figure by heavy lines.

Figure 3. Removal of Ilobroach 11G (Castrol) from iron powder by different surfactants (0.5% in water). At left the EGEs are arranged according to chain length. They are followed by four alkylpolyglucosides, four sucrose esters, two commonly applied alcohol ethoxylates and four nonylphenolethoxylates. TRI stands for trichloroethylene.

Figure 4. Mechanism for oil removal by a surfactant from a hard surface.

If one considers how oil is removed from a surface by a surfactant, as shown in figure 4, it is reasonable that a surfactant with a more concentrated hydrophilic head more easily penetrates between the surface and the oil and thus more quickly releases the oil stain. The poor emulsifying ability of the EGE can also be understood, because it is generally accepted that emulsification is facilitated when the hydrophilic part of the surfactant is large. The EGE, with its small hydrophilic head, does not readily emulsify oil during the washing process, with the consequence that the oil separates quickly. This is an advantage in many cleaning operations with regard to the pollution potential of the washing liquid. On the other hand, experiments carried out at the Swedish Institute for Surface Chemistry (Ytkemiska Institutet (YKI), Stockholm) [11] have demonstrated creation of very stable oil-in-water emulsions when sufficient homogenisation has been applied. So, once an emulsion has been created, the EGE's ability to stabilize it is good.

The solubility of the EGEs, both in water and in oil, is poor. This may also be understood from the molecular structure: in water because the hydrophilic part is so small compared to the hydrophobic part, and in oil because the hydrophilic part is so concentrated, making it more difficult to "accommodate" in the oil phase.

The fact that the EGE, due to the specificity of the enzymatic esterification, is composed mainly of one molecular species is probably part of the explanation for another unique property of the C12 EGE: its ability to organize itself in a so-called lamellar liquid crystalline phase. In this arrangement the surfactant molecules are organized in laminar double-layers with the hydrophobic parts facing each other. These double-layers are arranged sandwich-like, alternating with water layers. Among professionals engaged in formulation of cleaning agents it is universally accepted that the maximum cleaning efficiency of a surfactant is obtained when it is present in the form of lamellar phase, and it is one of the aims of formulation work to attain such lamellar phase in the complete detergent formulation.

While alcohol ethoxylates normally only form lamellar phase at higher (e.g. 70-90%) concentrations [12], the C12 EGE is apparently "born" with the ability to form lamellar phase over a very wide concentration interval, as can be seen in figure 5 [11]. It stands to reason that a composition consisting of identical molecules forms (liquid) crystals much more easily than one which contains a range of molecules closely related but differing in molecular weight/chain length. That homogeneous molecules are not a sufficient precondition for formation of lamellar phase is seen from the fact that the C10 EGE, differing from the C12 EGE only by two methylene groups in the hydrophobic part, behaves like alcohol ethoxylate in this respect, as seen in figure 6 [11]. Nevertheless, it is even better than the C12 EGE for oil solubilization. So, at this stage these new surfactants pose far more questions than answers.

The EGEs are now, after 8 years of development, ripe for test marketing. They have potential for use in a wide range of applications, from cleaning of oil-contaminated soil through metal degreasing, industrial cleaning and household detergents to personal care and food uses. If the test marketing is successful, they will constitute a handsome example of how enzymes can be used to produce bulk type chemicals based on carbohydrates to the benefit of the environment.

Figure 5. Phase diagram for C12 EGE (BioSurf 12 (techn.)/water as a function of temperature).

Figure 6. Phase diagram for C10 EGE (BioSurf 10 (techn.)/water as a function of temperature).

In the above binary phase diagrams, D denotes lamellar phase, L stands for spherical micelles and W for water.

4. ACKNOWLEDGEMENTS The authors thank our colleague Ole Hvilsted Olsen for the creation of the QUANTA plot of Figure 2.

5. REFERENCES

1. G.G. Haraldsson, The Application of Lipases in Organic Synthesis. In: S. Patai (ed.), The Chemistry of Acid Derivatives, Vol. 2, 1395-1473. Wiley & Sons Ltd. (1992).
2. Z.S. Derewenda, Advances in Protein Chemistry, 45 (1994) 1.
3. H.M. Sweers and C-H. Wong, J. Am. Chem. Soc., 108 (1986) 6421.
4. M. Therisod and A.M. Klibanov, ibid., 108 (1986) 5638.
5. S. Riva, J. Chopineau, A.P.G. Kieboom and A.M. Klibanov, ibid., 110 (1988) 584.
6. K. Adelhorst, F. Björkling, S.E. Godtfredsen and O. Kirk, Synthesis, 2 (1990) 112.
7. F. Björkling, S.E. Godtfredsen and O. Kirk, J. Chem. Soc., Chem. Commun., 934 (1989).
8. Monoesters of Glycosides and a Process for Enzymatic Preparation Thereof, USP 5,191,071, to Novo Nordisk A/S.
9. Process for Producing Methyl Glycoside Esters, USP 5,200,328, to Novo Nordisk A/S.
10. "Oil Removal", patent application (abandoned) to Novo Nordisk A/S, Int. Publ. No. WO91/12305.
11. I. Blute, M. Jansson, K. Rydén and M. Sjöberg, YKI, unpublished results.
12. M.R. Porter, Handbook of Surfactants, Blackie & Son Ltd., London, p. 31 (1991).


S.B. Petersen, B. Svensson, and S. Pedersen (Eds), Carbohydrate Bioengineering © 1995 Elsevier Science B.V. All rights reserved.


A wide range of carbohydrate modifications by a single microorganism: Leuconostoc mesenteroides

W. Soetaert a, D. Schwengers a, K. Buchholz b, E.J. Vandamme c

a Pfeifer & Langen, Frankenstrasse 25, 41539 Dormagen, Germany.
b Institute for Carbohydrate Technology (Zuckerinstitut), Technical University Braunschweig, Langer Kamp 5, 38106 Braunschweig, Germany.
c Laboratory for Industrial Microbiology and Biocatalysis, Department of Biochemical and Microbial Technology, University of Gent, Coupure links 653, 9000 Gent, Belgium.

Abstract Leuconostoc mesenteroides, a lactic acid bacterium, possesses a wide range of biocatalytic properties that are potentially useful in carbohydrate modifications. The use of L. mesenteroides for production of dextran in whole cell fermentation, enzymatic leucrose synthesis by dextran sucrase, mannitol fermentation with viable L. mesenteroides cells, and the use of sucrose phosphorylase for enzymatic synthesis of α-glucose-1-phosphate from sucrose are discussed. The applications of the various products are also briefly mentioned.

1. INTRODUCTION

Leuconostoc mesenteroides, a heterofermentative lactic acid bacterium, is particularly well adapted to sugary niches and consequently possesses a wide spectrum of biocatalytic properties useful in carbohydrate modifications. Research conducted at Pfeifer & Langen, in collaboration with universities, has been directed towards industrial utilisation of these useful properties. As a result, several new processes have been optimized up to pilot scale and, in some cases, commercial processes have resulted. Leuconostoc mesenteroides and its enzymes can be used to produce carbohydrates and derivatives as diverse as dextran (biopolymer), fructose, mannitol (polyol), leucrose (non-cariogenic disaccharide), glucose-1-phosphate, and many others. The various industrial applications and possibilities of this potent organism will be discussed below.

2. DEXTRAN FERMENTATION

Dextran is a well known glucan, produced by L. mesenteroides strains when cultured on sucrose as a carbon source. It was probably the first biopolymer produced on an industrial scale by fermentation when production started around 1948. Commercial dextran consists of an α-1,6-glucan backbone (95 %) with 5 % α-1,3 linkages. The sole enzyme involved in its synthesis is a glucosyltransferase, dextran sucrase, which has been characterized in detail [1]. The enzyme polymerises the glucose moiety of sucrose into dextran, thereby releasing the fructose moiety. So essentially the enzyme converts sucrose into dextran and fructose. In this polysaccharide synthesis no ATP or cofactors are involved, all energy being delivered by the bond between glucose and fructose in sucrose. The enzyme is secreted by the cells during their growth phase, so the dextran synthesis occurs completely extracellularly.

Dextran fermentation has been performed on a commercial scale at Pfeifer & Langen in Dormagen (Germany) since 1951. The industrially applied production method uses a whole cell fermentation. In this anaerobic fermentation the bacteria are cultured on a medium consisting of an excess of sucrose (10-12 %) and limited quantities of nitrogen source and trace elements. During the fermentation crude dextran, having a high molecular weight of several millions, accumulates besides fructose, low molecular weight dextran and small quantities of mannitol. In a following step the dextran is separated from the broth by precipitation in ethanol. The crude dextran obtained is the starting product for a whole range of dextrans and derivatives. In a first step the dextran is partially hydrolysed by acid hydrolysis, followed by fractionation to obtain a specific molecular weight fraction. An important fraction obtained is clinical dextran, a blood plasma substitute having a molecular weight of 40,000-70,000. Dextran can also be chemically modified for several other applications:

- in veterinary medicine (Fe-dextran as a source of Fe2+)
- in human medicine as a cholesterol lowering agent (DEAE-dextran)
- in separation technology, as molecular sieve (crosslinked dextran) and in aqueous two phase separation systems (dextran/polyethyleneglycol)
- as a microcarrier in tissue/cell culture
- for biotechnological applications (dextran sulphate)
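Because dextran sucrase simply splits sucrose into a polymerised glucosyl unit and free fructose, the theoretical product split follows from molar masses. The sketch below is an added illustration, not a description of the industrial process: it assumes that "10-12 %" means 100-120 g/l sucrose and that all sucrose is converted by the enzyme, ignoring the sucrose used for cell growth and the by-products named above. The function name is hypothetical.

```python
M_SUCROSE = 342.30         # g/mol, C12H22O11
M_FRUCTOSE = 180.16        # g/mol, C6H12O6
M_ANHYDROGLUCOSE = 162.14  # g/mol, C6H10O5, one glucosyl unit in the chain


def theoretical_split(sucrose_g_per_l):
    """Upper-bound dextran and fructose concentrations (g/l) from sucrose."""
    moles = sucrose_g_per_l / M_SUCROSE
    dextran = moles * M_ANHYDROGLUCOSE   # glucose moiety, minus water, in polymer
    fructose = moles * M_FRUCTOSE        # released fructose moiety
    return dextran, fructose


for sucrose in (100.0, 120.0):
    dextran, fructose = theoretical_split(sucrose)
    print(f"{sucrose:.0f} g/l sucrose: at most {dextran:.1f} g/l dextran "
          f"and {fructose:.1f} g/l fructose")
```

Under these assumptions at most about 47 % of the sucrose mass can end up as dextran; the remainder is released as fructose.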

Recently, enzymatic (cell-free) processes have also been developed [2]. In this approach the enzyme is first produced by fermentation without concomitant dextran synthesis. The cell-free enzyme is then added to a pure sucrose solution to synthesize the dextran. This approach yields a purer dextran solution, not contaminated by cells and medium components. However, industrial practice still favours the whole cell fermentation, as the higher purity of the dextran obtained does not outweigh the difficulty of obtaining the enzyme.

3. LEUCROSE BIOSYNTHESIS

Leucrose is a disaccharide that, like sucrose, consists of a glucose and a fructose moiety, but these are linked differently (5-O-(α-D-glucopyranosyl)-β-D-fructopyranose). Leucrose was known to occur in the dextran fermentation broth in small quantities [3]. In 1986 Pfeifer & Langen was granted a patent for a biotechnical process that permits leucrose to be produced in large quantities from sucrose [4]. The enzymatic process is based on

systematic studies of the reaction mechanism of dextran sucrase [1]. Leucrose is formed from sucrose by the action of dextran sucrase, the same glucosyltransferase as above that normally catalyses the formation of dextran from sucrose. In the normal reaction, glucose is transferred by the enzyme from sucrose to the non-reducing end of the growing dextran chain.

Glu-Fru + Glu → Glu-Glu + Fru
Glu-Fru + Glu-Glu → Glu-Glu-Glu + Fru
n Glu-Fru (sucrose) + Glu-Glu → Glu-(Glu)n-Glu (dextran) + n Fru

In the presence of high concentrations of fructose, however, fructose acts as the acceptor instead of the growing dextran chain and thus a new glucose-fructose disaccharide is formed (leucrose), its bond being different from that of the starting sucrose molecule. In the scheme below the indices label individual fructose molecules.

Glu-Fru(1) (sucrose) + Fru(0) → Glu-Fru(0) (leucrose) + Fru(1)
Glu-Fru(2) (sucrose) + Fru(1) → Glu-Fru(1) (leucrose) + Fru(2)

As one can see from the reaction sequence, no net fructose is converted in the process. In essence sucrose is thus converted to leucrose, in the presence of a high concentration of fructose.

The production of leucrose involves quite a number of steps. First, the enzyme dextran sucrase is obtained by fermentation with an L. mesenteroides strain. After the fermentation, the extracellularly secreted enzyme must be separated from the cells by centrifugation, and the enzyme is further concentrated and purified by ultrafiltration. The enzymatic reaction can be performed batchwise or continuously with immobilized enzyme. The batch reaction occurs at 25 °C in a concentrated solution of 65 % consisting of 1/3 sucrose and 2/3 fructose. The conversion efficiency is about 90 %. After the conversion is complete, the leucrose is separated from the fructose by large scale chromatography. The leucrose-containing fraction is then concentrated and crystallized. The final product is pure crystalline leucrose, in a physical form very similar to normal sugar.
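The batch figures quoted above can be turned into a rough mass balance. The following sketch is an added illustration under stated assumptions: the 65 % is read as weight per weight dry matter, the 90 % conversion efficiency is read as the fraction of sucrose that ends up as leucrose, and the fate of the remaining sucrose is left unspecified. Leucrose is an isomer of sucrose, so the conversion is mass-neutral. The function name and dictionary keys are hypothetical.

```python
def leucrose_batch(solution_kg, dry_matter=0.65, sucrose_share=1 / 3,
                   conversion=0.90):
    """Rough mass balance (kg) for one batch of the leucrose reaction."""
    solids = solution_kg * dry_matter
    sucrose_in = solids * sucrose_share
    fructose_in = solids - sucrose_in
    leucrose_out = sucrose_in * conversion  # isomerisation, same molar mass
    return {
        "sucrose_in": sucrose_in,
        "fructose_in": fructose_in,
        "leucrose_out": leucrose_out,
        # No net fructose is consumed by the acceptor reaction itself.
        "fructose_acceptor_pool": fructose_in,
        "sucrose_not_converted_to_leucrose": sucrose_in - leucrose_out,
    }


balance = leucrose_batch(1000.0)
for name, kg in balance.items():
    print(f"{name}: {kg:.1f} kg")
```

Under these assumptions, 1000 kg of reaction solution carries about 217 kg of sucrose and yields about 195 kg of leucrose, while the fructose pool of about 433 kg must be removed by the chromatography step.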

Leucrose crystallizes as a monohydrate with a melting point of 156-158 °C. The purity of the crystals is at least 99 %. Leucrose displays desirable characteristics as a bulk sweetener. It has a sweetness of about 50 % compared to sucrose. It is essentially non-cariogenic, as there appears to be no enzyme system present in the human mouth capable of breaking the α-1,5 linkage. Leucrose is resorbed easily without any incompatibility problems even at high daily intake, as it is broken down to its glucose and fructose units by enzymes in the small intestine in the same manner as sucrose. Leucrose is a reducing sugar whose glucose-fructose bond has an unusually high stability towards acid, in contrast to the acid-labile sucrose. This may allow its use in acidic foods and for chemical synthesis reactions.

4. MANNITOL FERMENTATION

Mannitol is a common polyol derived from mannose and is extensively used for various applications. Apart from its use in foods, mannitol finds wide application in the non-food sector. Because of its desirable properties, mannitol is commonly used in the pharmaceutical formulation of chewable tablets and granulated powders [5]. The complex of boric acid with mannitol is used in the production of dry electrolytic capacitors. It is an extensively used polyol for the production of resins and surfactants. Mannitol is used in medicine as a powerful osmotic diuretic and in many types of surgery for the prevention of kidney failure and to reduce eye and brain oedema. Mannitol hexanitrate is a well known vasodilator, used in the treatment of hypertension.

Mannitol is now produced by catalytic hydrogenation of fructose using a nickel catalyst and hydrogen gas. This hydrogenation yields mannitol as well as its isomer sorbitol in about equal amounts, due to the poor selectivity of the nickel catalyst used. This leads to a less efficient production process, as sorbitol can be produced more cheaply by hydrogenation of glucose. Numerous process improvements to increase the ratio of mannitol/sorbitol formation have been suggested and patented [6]. Mannitol is a common reserve product of many fungi and yeasts and its production by fermentation has often been attempted, but the yields and productivities were too low to compete with the chemical hydrogenation process [7-13].

Recently, a new fermentation process capable of converting fructose quantitatively to mannitol has been developed [14]. The process makes use of the capability of L. mesenteroides to use fructose as an alternative electron acceptor, thereby reducing it to mannitol with the enzyme mannitol dehydrogenase. In the process the reducing equivalents are generated by the conversion of glucose into D-lactic acid and acetic acid.
Based on the hydrogen balance, the following (theoretical) fermentation equation can be derived.

2 fructose + 1 glucose → 2 mannitol + D-lactic acid + acetic acid + CO2
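The equation above is stated in terms of the hydrogen (redox) balance. A quick elemental bookkeeping, sketched below as an added illustration with hypothetical function names, shows that carbon balances exactly and that the product side exceeds the substrate side by the atoms of one water molecule. On this reading, one H2O per glucose is taken up from the aqueous medium and is simply not written in the equation as printed.

```python
from collections import Counter

FORMULAS = {
    "fructose": {"C": 6, "H": 12, "O": 6},
    "glucose": {"C": 6, "H": 12, "O": 6},
    "mannitol": {"C": 6, "H": 14, "O": 6},
    "D-lactic acid": {"C": 3, "H": 6, "O": 3},
    "acetic acid": {"C": 2, "H": 4, "O": 2},
    "CO2": {"C": 1, "O": 2},
}

SUBSTRATES = {"fructose": 2, "glucose": 1}
PRODUCTS = {"mannitol": 2, "D-lactic acid": 1, "acetic acid": 1, "CO2": 1}


def atoms(side):
    """Total atom counts for one side of the equation."""
    total = Counter()
    for species, coeff in side.items():
        for element, n in FORMULAS[species].items():
            total[element] += coeff * n
    return total


def imbalance(substrates, products):
    """Atoms present in the products but missing from the substrates."""
    left, right = atoms(substrates), atoms(products)
    return {el: right[el] - left[el]
            for el in sorted(set(left) | set(right))
            if right[el] != left[el]}


print("substrates:", dict(atoms(SUBSTRATES)))
print("products:  ", dict(atoms(PRODUCTS)))
print("missing on substrate side:", imbalance(SUBSTRATES, PRODUCTS))
```

The difference reported by `imbalance` is two hydrogen atoms and one oxygen atom, i.e. one water molecule.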

In this process no sorbitol is formed, only limited quantities of D-lactic acid. D-lactic acid is an interesting by-product that finds application as a chiral synthon for organic synthesis, more particularly in the industrial synthesis of chiral phenoxy herbicides. The influence of various factors on the fermentation has been studied in detail. A key factor was the conversion efficiency, defined as the ratio of the produced amount of mannitol versus the

consumed fructose. The conversion efficiency could be markedly increased to near quantitative conversion by choosing appropriate fermentation conditions. An optimized batch fermentation resulted in a conversion efficiency of 92 %. Fundamental studies of the fermentation mechanism made it possible to devise an optimal fed batch fermentation procedure with an automatic feeding strategy. A very fast and complete conversion is reached in less than 24 hours. The conversion efficiency is 94 % and the maximal conversion rate is 11 g mannitol/l·h. The final mannitol concentration is 150 g/l, close to its solubility limit of 180 g/l (25 °C). Thus a very high mannitol concentration can be produced in high yield using a fed batch strategy in less than 24 hours of fermentation. By selection of an engineered strain even better process characteristics could be obtained, resulting in quantitative conversion and a further concentration increase up to 185 g/l mannitol by increasing the fermentation temperature to 35 °C (fig. 1).
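The fed batch figures can be cross-checked with a few lines of arithmetic. The sketch below is an added illustration that assumes the conversion efficiency is a molar ratio (mol mannitol per mol fructose consumed) and uses 24 h as the fermentation time, although the text states "less than 24 hours", so the mean productivity is a lower bound. The function names are hypothetical.

```python
M_FRUCTOSE = 180.16  # g/mol, C6H12O6
M_MANNITOL = 182.17  # g/mol, C6H14O6


def fructose_consumed(mannitol_g_per_l, efficiency):
    """Fructose consumed (g/l), with efficiency as mol mannitol per mol fructose."""
    mol_mannitol = mannitol_g_per_l / M_MANNITOL
    return mol_mannitol / efficiency * M_FRUCTOSE


def mean_productivity(final_g_per_l, hours):
    """Average volumetric productivity (g/l·h) over the whole fermentation."""
    return final_g_per_l / hours


print(f"fructose consumed: {fructose_consumed(150.0, 0.94):.1f} g/l")
print(f"mean productivity: at least {mean_productivity(150.0, 24.0):.2f} g/l·h")
```

Under these assumptions about 158 g/l of fructose is consumed, and the mean productivity of at least 6.25 g/l·h is consistent with the quoted maximal rate of 11 g/l·h.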

Figure 1. Fermentation profile of a fed batch mannitol fermentation: concentrations (g/l) of mannitol, fructose and glucose, and cell dry mass (CDM, g/l), versus time (h).

The downstream processing of the fermentation broth has also been optimized. The use of electrodialysis followed by crystallization results in cost-effective recovery of highly pure

crystalline D-mannitol and D-lactic acid. The new process thus offers an attractive alternative to the presently used industrial synthesis routes of mannitol (Table 1).

5. ENZYMATIC SYNTHESIS OF GLUCOSE-1-PHOSPHATE

Sucrose phosphorylase is an intracellular enzyme obtained from L. mesenteroides [15]. It is a glucosyltransferase transferring glucose from sucrose to a number of acceptor molecules, phosphate being the most effective acceptor, forming α-glucose-1-phosphate.

sucrose + Pi ⇌ α-glucose-1-phosphate + fructose

Table 1
Comparison of mannitol fermentation process versus catalytic hydrogenation process

Fermentation | Catalytic hydrogenation
all fructose converted to mannitol | only half of fructose converted to mannitol
co-product: D-lactic acid, one quarter of mannitol | co-product: sorbitol in large excess (3 X)
glucose is hydrogen source in hydrogenation | highly pure hydrogen gas necessary
nitrogen source (CSL) essential for growth | nickel catalyst essential
electrodialysis for removing organic acids | ion exchanger for nickel ions removal
use of less pure substrates poses no problem | highly pure substrates necessary to avoid catalyst inactivation

Glucose-1-phosphate is at this moment an expensive fine chemical. The technology based on sucrose phosphorylase can turn glucose-1-phosphate into a cheap commodity chemical. Glucose-1-phosphate is essentially a C1-protected glucose molecule. It can be easily converted enzymatically into glucose-6-phosphate, a C6-protected glucose molecule. Both can be used in glucose derivatisation reactions. For instance, glucose-1-phosphate can be oxidized at the C6 position to glucuronic acid-1-phosphate. This can then be easily hydrolysed to glucuronic acid, an interesting base for further synthesis reactions. Glucose-1-phosphate is used in infusion solutions as a supplier of glucose (energy) and phosphate. It can be used in cell cultures with calcium alginate cell carriers to supply the cells with phosphate, which would otherwise (in a free form) complex the calcium, thus destabilizing the beads.

The process to produce glucose-1-phosphate starts with a fermentation using L. mesenteroides. After this the cells containing the enzyme are collected by centrifugation. The cells are then immobilized in gelatin beads. An immobilized sucrose phosphorylase is thus obtained that has good mechanical stability and a half-life of 40 days. These beads are used in a continuous mode in a column. The substrate consisting of sucrose and phosphate is

converted to glucose-1-phosphate and fructose. The phosphate conversion efficiency is about 80 %. The product mixture is then separated by batch semi-continuous chromatography. This results in three product streams. The first stream consists mainly of fructose, which can be marketed as fructose syrup. The second is the glucose-1-phosphate stream, from which crystalline glucose-1-phosphate is readily obtained by crystallisation in high yield. The third stream consists of unconverted sucrose and phosphate, which are recirculated, resulting in a very efficient process. Thus, provided a large production base is established, the price of glucose-1-phosphate, currently a significantly more expensive fine chemical, can be as low as 5-10 DM/kg.

The enzyme can also be used for the synthesis of new disaccharides. Fructose and other similar carbohydrates such as L-sorbose and D-xylitol behave as good acceptors of the glucose moiety [16]. Sucrose phosphorylase for instance also catalyses the following reaction:

α-G-1-P + D-xylitol → 4-O-α-D-glucopyranosyl-xylitol + Pi

In this way such complex disaccharides can be conveniently synthesized with the use of a single enzyme.
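Two of the process figures quoted for the glucose-1-phosphate column, the per-pass phosphate conversion of about 80 % and the enzyme half-life of 40 days, lend themselves to a small numerical illustration. The sketch below is added here under explicit assumptions that are not in the original: the chromatographic separation is ideal and all unconverted phosphate is recycled without loss, and the immobilized enzyme deactivates by first-order kinetics. The function names are hypothetical.

```python
def overall_conversion(per_pass, passes):
    """Fraction of the original phosphate converted after a number of passes,
    assuming lossless recycling of all unconverted phosphate."""
    return 1.0 - (1.0 - per_pass) ** passes


def residual_activity(days, half_life_days=40.0):
    """Relative enzyme activity left after `days`, assuming first-order decay."""
    return 0.5 ** (days / half_life_days)


for passes in (1, 2, 3):
    print(f"{passes} pass(es): {100 * overall_conversion(0.80, passes):.1f} % "
          "of phosphate converted")

for days in (20, 40, 80):
    print(f"day {days}: {100 * residual_activity(days):.0f} % activity left")
```

With these idealisations, two passes already convert 96 % of the phosphate fed, which illustrates why recirculating the third stream makes the process efficient despite the 80 % per-pass figure.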

6. CONCLUSION

Although Leuconostoc mesenteroides has well recognized importance in food fermentations, these bacteria have rarely been used as production strains in industrial bioconversions, dextran synthesis being the only well documented industrial use of the microorganism. The processes presented show the so far undiscovered potential of this lactic acid bacterium as a potent tool in carbohydrate modification. Highly selective modifications can be performed with the organism and its enzymes, resulting in very efficient, high yielding and very fast bioconversions of high concentrations of carbohydrates.

7. REFERENCES

1. J.F. Robyt and H. Taniguchi, Arch. Biochem. Biophys., 174 (1976) 129.
2. D.E. Brown and A.J. McAvoy, Chem. Technol. Biotechnol., 48 (1990) 405.
3. F.H. Stodola, H.I. Koepsell and E.S.J. Sharpe, J. Am. Chem. Soc., 74 (1952) 3202.
4. D. Schwengers and H. Benecke, European patent No. 185 302 (1985). Chem. Abstr., 105 (1986) 77815.
5. B. Debord, C. Lefebvre, A.M. Guyot-Hermann, J. Hubert, R. Bouchem and J.C. Guyot, Drug Dev. Ind. Pharmacy, 30 (1987) 1533.
6. M. Makkee, A.P.G. Kieboom and H. Van Bekkum, Starch/Stärke, 37 (1985) 136.
7. H.J. Blumenthal, p. 292-307. In: J.E. Smith and D.R. Berry (eds.), Vol. II. Edward Arnold Publishers, London (1976).
8. K. Hattori and T. Suzuki, Agric. Biol. Chem., 38 (1974) 1203.
9. H.V. Hendriksen, T.E. Mathiasen, J. Adler-Nissen, J.C. Frisvad and C. Emborg, J. Chem. Technol. Biotechnol., 43 (1988) 223.
10. W.H. Lee, Appl. Microbiol., 15 (1967) 1206.
11. H. Onishi and T. Suzuki, Biotechnol. Bioeng., 12 (1970) 913.
12. S.C. Prescott and C.G. Dunn, p. 644-646. In: S.C. Prescott and C.G. Dunn (eds.), Industrial Microbiology, McGraw Hill Book Co., New York (1959).
13. J.F.T. Spencer and P.A.J. Gorin, Progr. Ind. Microbiol., 7 (1968) 1.
14. W. Soetaert, Ph.D. thesis, University of Ghent, Belgium (1991).
15. E.J. Vandamme, J. Van Loo, L. Machtelinck and A. Delaporte, Adv. Appl. Microbiol., 32 (1987) 163.
16. S. Kitao and H. Sekine, Biosci. Biotech. Biochem., 56 (1992) 2011.

INDEX

1,3-1,4-β-D-glucanase: 113
1,3-1,4-β-glucanase: 85, 86, 88, 91
4-methylumbelliferyl: 87, 88, 91, 94
α-amylase: 113, 120, 125, 134, 137, 141, 165, 168, 170, 171, 172, 175
β-1,4-glycanase: 261, 262, 263, 267
β-amylase: 125, 126
β-galactosidase: 77, 78, 81, 82
β-glycosidase: 77, 78, 80, 81
(β/α)8-barrel: 126, 134, 135, 136, 137, 140, 141
(β/α)8-barrel structure: 168, 170
acarbose: 127, 128, 129, 131, 134, 135, 136, 138, 139
acceptor reaction of dextransucrase: 295, 299, 307, 308, 309
acid α-amylase: 181, 187
ADP-glucose pyrophosphorylase: 271, 272, 273
adsorption: 265, 266, 267
affinity chromatography: 253, 255, 258
affinity labeling: 104, 138
aglycon binding subsite: 137, 138, 139, 140
alternan: 295, 302, 303
alternansucrase: 298, 303, 307
amylolytic enzymes: 125, 139, 142
amylopectin: 271, 273, 277
amylose: 271, 273, 277
anti-adhesion therapy: 9

antisense RNA: 272, 273, 274
apple: 325, 334, 335, 338, 339
arabinoxylans: 321, 327, 331
Aspergillus glucoamylase: 125
autolysis: 80
autoselection: 79, 80
avidin-biotin technology: 251, 253, 254
Bacillus circulans: 165, 169, 172, 173
Bacillus licheniformis: 86, 87, 88, 95
barley α-amylase, 1 and 2 (AMY1 and AMY2): 135, 136, 137, 140, 141
barley α-amylase/subtilisin inhibitor (BASI): 126, 135, 136, 139, 140, 141
BASI-AMY2 complex formation: 137, 140
binding and catalytic domain: 71, 72, 73, 74
binding energy: 89
binding site: 86, 87, 88, 89, 92, 93
branch points: 30
branching enzyme: 271, 272, 273, 276, 277
caco-2 cells: 288, 289
cancer: 1
carbodiimides: 157
catalytic acid: 126, 130, 135
catalytic base: 126, 130
catalytic domain: 239, 261, 262, 263
catalytic mechanism: 165, 170, 171
catalytic nucleophile: 126, 130, 135, 138
catalytic residues: 157, 171
CDase: 165, 166
cellobiohydrolase: 211, 261, 263
cellobiose: 227, 228, 229, 230, 231, 232, 233

cellohexaose: 227, 230, 231, 232
cellulase: 113, 116, 120, 226, 227, 231, 233, 235, 236, 261, 264, 267, 279, 284, 286, 288, 290
Cellulomonas: 261, 262, 267
cellulose: 261, 262, 263, 264, 265, 266, 267
cellulose hydrolysis: 262, 267
cellulose-binding domain: 251, 252, 261, 262
cellulosome (functional domains): 251, 252, 253, 255, 257
CGTase: 165
chemical exchange: 16, 17
chemical modification: 29, 30, 31, 46, 147, 155, 156, 160
chimeric protein: 81, 82, 83
chitinase: 71, 72, 73
Clostridium cellulolyticum: 239
Clostridium thermocellum: 252, 279, 281, 286, 290
cohesin-dockerin interaction: 253
condensation: 131, 133, 141
conformation: 15, 16, 17, 18, 24, 26
conformational change: 229, 231, 232
containment technology: 49, 59
continuous glucose monitoring ex vivo: 49, 51
corn: 321, 323, 326, 332, 333
cyclization reaction: 165, 167, 171, 172, 173
cyclodextrin glycosyltransferase: 165
cyclodextrins: 165, 166
cyclomaltodextrinase: 165
D-gluco-dihydroacarbose: 127, 128, 131
debranching enzyme: 272
dextran: 295, 296, 298, 299, 351, 352, 353, 357

dextran sucrase: 295, 298, 299, 301, 302, 303, 304, 305, 351, 352, 353
dietary fibre: 321, 323, 333
difference ultraviolet absorbance spectroscopy: 147
differential labelling: 155, 156, 157
disproportionating enzyme: 272
dynamics: 15, 24, 26
electrodialysis: 355, 356
electrospray: 100, 101, 102
electrostatics: 181, 182
electrostatics, carbohydrate modulation of: 181, 184, 186
electrostatics, effect of charge mutation: 181, 189
electrostatics, pH dependence: 181, 182, 185
electrostatics, salt dependence: 181
endoglucanase: 225, 226, 227, 233, 234, 236, 261, 263, 264
endoglucanase A: 239
endoglucanase E: 279, 286, 289, 290, 291
enzymatic: 343, 347
enzymes: 2, 4, 7, 8, 10
epithelial cells: 288, 289
erythropoietin: 1, 6
Escherichia coli: 2, 10
esters: 343, 345, 346
ethylglucoside: 344
exchange spectroscopy: 18, 26
exo-cellobiase: 78
exocrine pancreas: 279, 281, 284, 286, 287, 291
exoglucanase: 261
fast atom bombardment mass spectrometry: 30

fibronectin type III

262

flexibility

16, 18, 26

flow-through system

49

fusion

77, 80, 81, 82, 83

ganglioside

15, 24, 34

gas chromatography - mass spectrometry

30

gastrointestinal tract

280, 286, 288, 291

Gaucher's disease

9

gelatinisation

273, 275, 278

gene

78, 80, 81, 82

gene rescue

279

general acid catalyst

132, 133, 135

genetic defects

2

glucansucrase

295, 305

glucoamylase

113, 115, 116, 117, 129, 130, 135, 142

glucocerebrosidase

9

glucose-1-phosphate

351, 356, 357

glucuronic acid

356

glutathione-S-transferase

77, 80

glycoconjugates

24

glycogen

271, 275, 276, 277

glycolipids

1, 2, 9, 24

glycoproteins

1, 2, 3, 6, 29, 33, 37, 38

glycosidase

97, 99, 100, 101, 103, 104, 107, 108, 110

glycosphingolipids

29, 33, 34

glycosyl hydrolase

77, 78, 79, 225, 226, 227, 233, 235

glycosylation

205

glycosylphosphatidylinositol anchor [GPI]

279, 288, 289

granule-bound starch synthase

272, 275

hairy regions

324, 338

Helicobacter pylori

9

heparan sulfate

6, 7

heparin

6, 7

heterologous expression

77, 83

heteronuclear coupling constants

20

high-mannose

29, 33, 38, 40, 41, 46

homology modelling

181, 190

hormones

4, 5, 8, 9

hydrogen bonding

16, 18, 19

hydrophobic cluster analysis

158, 160

inclusion complexes

166

infection

1, 7, 9, 10

inflammation

1, 10

inhibition

85, 87, 89, 91

inhibitors

113, 114, 120

internal motions

15, 24, 26

intestinal enterocytes

279, 280, 286, 288, 291

introns

283, 284

inversion

225, 226, 227, 228, 232

isolation and purification

149

isomaltose

129, 130, 132

isozyme hybrids

125, 135

J-fitting

23

kinetics

86, 87, 88, 89, 91

lectin-like chitin-binding protein

72, 73

lectinophagocytosis

9

lectins

4, 7, 8, 10, 11, 12

Leuconostoc mesenteroides

295, 351

leucrose

351, 352, 353

leukocytes

7, 8, 11

linkage analysis

30, 41

lipase

343, 344

long-range coupling constants

20, 22

maltodextrin phosphorylase

59, 60, 61, 64, 66

maltooligodextrin

138, 139

maltose

127, 130, 132, 133, 134

maltose binding site

169, 170

Man-6-phosphate

8

mannitol

351, 352, 354, 355, 356

mechanism

97, 98, 99, 100, 101, 107, 108, 109, 110, 225, 226, 228, 229, 231, 232, 233, 235, 236

mechanism of synthesis of alternan

302

mechanism of synthesis of dextran

295, 304

mechanism of synthesis of polysaccharides

299

mechanism-based inactivators

97, 99, 100

metastasis

7

micelle

15, 24, 25, 26

microbial

279

microdialysis system

49, 51, 57

microorganisms

7, 9

model-free approach

15, 24

modular proteins

261, 262

molecular recognition

126, 130, 141

monoesters

343

MUCase

284, 285

mutagenesis

97

mutan

295, 298

mutansucrase

299, 301, 304, 306

N-bromosuccinimide

155

N-Linked glycans

29, 32

NMR

15, 16, 17, 18, 20

NOE

16, 17, 18, 24, 26, 27

nutrition

279, 280, 281, 286, 290, 291

O-Linked glycans

32, 38

oil, removal

345, 346

oligosaccharides

1, 3, 6, 8, 11, 29, 30, 31, 33, 34, 35, 37, 85, 86, 87, 92, 93, 94, 95, 225, 227, 228, 230, 231, 232, 235, 236, 321, 323, 338

oligosaccharide binding

59, 61, 62, 64, 65, 67

oxidation

175

p-nitrophenyl-state kinetics

136, 137

pectinases

322, 338

peracetylation

29, 31, 32, 41, 43

periodate oxidation

29, 31, 33, 34, 35

permethylation

34, 35, 36, 37, 39, 40

peroxidases

205

phagocytosis

9

phase diagram

348

phosphoglucomutase

181, 189

phosphoglucomutase, structure function relationship

181, 189

phosphorolysis

60, 61

photoreactive maltodextrin

138, 139

plant cell wall

261, 262, 321, 322, 326, 337, 339

plasmid

79, 80

polysaccharide

297, 298, 299

poultry trial

279, 280, 291

product inhibition

166, 173

production

77, 78, 79, 80, 81, 82, 83

protein engineering

165, 172, 175, 181, 189

protein folding

205

protein solubility

205

protein sorting

279, 288, 289

protein stability

205

protein targeting

279, 288, 289, 291

proteinases

280, 281

purification

78, 79, 80, 81, 83

random mutagenesis in binding loop

137, 141

raw-starch binding motif

170

receptors

6

recombinant amylosucrase

313

relaxation parameters

15, 17, 24

retention

225, 226, 233, 235

retrogradation

274

rhamnogalacturonases

321, 322, 323, 325, 334, 335, 336, 337, 338

ROESY

16, 18, 19, 27

rotation correlation time

24, 27

saccharification

133, 134

Schizophyllum commune

147

secretory

284, 286

selectins

8, 10, 11

sequence

29, 30, 34, 35, 37, 39, 40, 43, 44

sialyl-Lewis X

10

sialyllactose

15, 22

site-directed mutagenesis

59, 60, 62, 125, 126, 128, 134, 135, 165, 170, 172

soluble starch synthase

271, 274, 275, 276, 277

soy

323, 335, 336, 337, 338

stability

175

starch

271

starch granule size

274, 275, 277

starch hydrolysis

170

starch phosphorylase

271, 272

Streptococcus mutans

295

structure

225, 226, 227, 228, 229, 230, 231, 233, 234, 236, 237

structure-function

77, 78

structure-function relationship

148

subsite

59, 62, 64, 69, 85, 86, 88, 89, 90, 93, 94, 126, 127, 128, 129, 131, 132, 133, 134, 137

substrate binding residues

160

sucrose

15, 17, 18, 19, 295, 296, 300-309

sucrose phosphorylase

351,356

supercooled water

15, 18

surfactants, carbohydrate

343, 345, 346, 347

synthesis

343, 344

synthesis of branch linkages of dextran

295, 297, 298, 304, 305

T-enzyme

272

TAKA α-amylase

181, 187

tetranitromethane

155, 157

thermodynamics of binding

126, 127, 129, 132, 141

thermophilic Archaeon

77

thermostable

77

thiooligosaccharides

113, 114, 123

three-dimensional structure

168, 173

thrombin

81, 82, 83

tissue plasminogen activator

4

torsion angles

26

transgenic animals

279, 284

transgenic plants

271, 276, 277

transglycosylation

78, 165, 173

transition state stabilization

85, 87, 90, 93

transition-state stabilization energy

126, 128, 129, 130, 131, 134

Trichoderma reesei

211

trifluoroacetolysis

29, 32, 41, 43, 46

viruses

1, 3, 10

viscosity

273, 274, 321, 323, 325, 333, 334, 338

waxy starch

274

wheat

321, 323, 325, 326, 327, 329, 330, 331, 332, 333, 334

xylanase

148, 261, 262, 263, 264, 265, 266, 279, 280, 281, 282, 290, 291, 292, 321, 322, 326, 327, 330, 331, 334

yeast

79, 80, 83
