Progress in Nucleic Acid Research and Molecular Biology, Volume 19


Editors: Waldo E. Cohn | Elliot Volkin


PROGRESS IN

Nucleic Acid Research and Molecular Biology Volume 19


mRNA: The Relation of Structure to Function

edited by

WALDO E. COHN and ELLIOT VOLKIN
Biology Division
Oak Ridge National Laboratory
Oak Ridge, Tennessee

1976

ACADEMIC PRESS New York San Francisco London
A Subsidiary of Harcourt Brace Jovanovich, Publishers

COPYRIGHT © 1976, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

ACADEMIC PRESS, INC.

111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by ACADEMIC PRESS, INC. (LONDON) LTD. 24/28 Oval Road, London NW1

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 63-15847
ISBN 0-12-540019-5
PRINTED IN THE UNITED STATES OF AMERICA


Contents

LIST OF CONTRIBUTORS
PREFACE
DEDICATION: JACQUES MONOD
ABBREVIATIONS AND SYMBOLS
SOME ARTICLES PLANNED FOR FUTURE VOLUMES

I. The 5'-Terminal Sequence ("Cap") of mRNAs

Caps in Eukaryotic mRNAs: Mechanism of Formation of Reovirus mRNA 5'-Terminal m7GpppGm-C
Y. FURUICHI, S. MUTHUKRISHNAN, J. TOMASZ AND A. J. SHATKIN
    I. Introduction
    II. Results
    III. Discussion
    IV. Summary
    References

Nucleotide Methylation Patterns in Eukaryotic mRNA
FRITZ M. ROTTMAN, RONALD C. DESROSIERS AND KAREN FRIDERICI
    I. Introduction
    II. Materials and Methods
    III. Results
    IV. Discussion
    V. Summary
    References

Structural and Functional Studies on the "5'-Cap": A Survey Method for mRNA
HARRIS BUSCH, FRIEDRICH HIRSCH, KAUSHAL KUMAR GUPTA, MANCHANAHALLI RAO, WILLIAM SPOHN AND BENJAMIN C. WU
    I. Introduction
    II. Results
    III. Discussion
    References

Modification of the 5'-Terminals of mRNAs by Viral and Cellular Enzymes
BERNARD MOSS, SCOTT A. MARTIN, MARCIA J. ENSINGER, ROBERT F. BOONE AND CHA-MER WEI
    I. Introduction
    II. 5'-Terminal RNA Modification Enzymes of Vaccinia Virus
    III. Isolation of a GpppN-Specific Guanine-7-methyltransferase from Uninfected HeLa Cells
    IV. Summary and Conclusions
    References

Blocked and Unblocked 5' Termini in Vesicular Stomatitis Virus Product RNA in Vitro: Their Possible Role in mRNA Biosynthesis
RICHARD J. COLONNO, GORDON ABRAHAM AND AMIYA K. BANERJEE
    Text
    References

The Genome of Poliovirus Is an Exceptional Eukaryotic mRNA
YUAN FON LEE, AKIO NOMOTO AND ECKARD WIMMER
    Text
    Summary
    References

II. Sequences and Conformations of mRNAs

Transcribed Oligonucleotide Sequences in HeLa Cell hnRNA and mRNA
MARY EDMONDS, HIROSHI NAKAZATO, E. L. KORWEK AND S. VENKATESAN
    I. Introduction
    II. A Transcribed Oligo(A) Sequence in hnRNA
    III. Oligo(U) Sequences in hnRNA and mRNA
    References

Polyadenylylation of Stored mRNA in Cotton Seed Germination
BARRY HARRIS AND LEON DURE III
    Text
    References

mRNAs Containing and Lacking Poly(A) Function as Separate and Distinct Classes during Embryonic Development
MARTIN NEMER AND SAUL SURREY
    Text
    Summary
    References

Sequence Analysis of Eukaryotic mRNA
N. J. PROUDFOOT, C. C. CHENG AND G. G. BROWNLEE
    I. Introduction
    II. Complementary DNA Sequence Analysis
    III. mRNA Sequences
    IV. Discussion
    V. Summary
    References

The Structure and Function of Protamine mRNA from Developing Trout Testis
P. L. DAVIES, G. H. DIXON, L. N. FERRIER, L. GEDAMU AND K. IATROU
    Text
    References

The Primary Structure of Regions of SV40 DNA Encoding the Ends of mRNA
KIRANUR N. SUBRAMANIAN, PRABHAT K. GHOSH, RAVI DHAR, BAYAR THIMMAPPAYA, SAYEEDA B. ZAIN, JULIAN PAN AND SHERMAN M. WEISSMAN
    I. Introduction
    II. Material and Methods
    III. Results
    IV. Discussion
    V. Summary
    References

Nucleotide Sequence Analysis of Coding and Noncoding Regions of Human β-Globin mRNA
CHARLES A. MAROTTA, BERNARD G. FORGET, MICHAEL COHEN-SOLAL AND SHERMAN M. WEISSMAN
    I. Introduction
    II. Materials and Methods
    III. Results
    IV. Discussion
    V. Summary
    References

Determination of Globin mRNA Sequences and Their Insertion into Bacterial Plasmids
WINSTON SALSER, JEFF BROWNE, PAT CLARKE, HOWARD HEINDELL, RUSSELL HIGUCHI, GARY PADDOCK, JOHN ROBERTS, GARY STUDNICKA AND PAUL ZAKAR
    I. Introduction
    II. Sequence Analysis by in Vitro Transcription from cDNA Templates
    III. Cloning cDNA Sequences on Bacterial Plasmids
    IV. Sequence of a 79-Nucleotide Region in the Beta Chain mRNA
    V. The Relation of Globin mRNA Structure to Function
    References

Mutation Rates in Globin Genes: The Genetic Load and Haldane's Dilemma
WINSTON SALSER AND JUDITH STROMMER ISAACSON
    I. The Use of Silent Mutations to Measure Mutation Rates
    II. Haldane's Dilemma Magnified
    III. Constraints on the Maintenance of Single-Copy DNA Sequences
    IV. How Are Multiple Copy DNA Sequences Kept Accurate?
    References

The Chromosomal Arrangement of Coding Sequences in a Family of Repeated Genes
G. M. RUBIN, D. J. FINNEGAN AND D. S. HOGNESS
    Text
    References

Heterogeneity of the 3' Portion of Sequences Related to Immunoglobulin κ-Chain mRNA
URSULA STORB
    Text
    References

Structural Studies on Intact and Deadenylylated Rabbit Globin mRNA
JOHN N. VOURNAKIS, MARCIA S. FLASHNER, MARY ANN KATOPES, GARY A. KITOS, NIKOS C. VAMVAKOPOULOS, MATTHEW S. SELL AND REGINA M. WURST
    I. Introduction
    II. The Controversy between Structure and Function
    III. Eukaryotic mRNA Structure
    IV. Purification and Deadenylylation of Rabbit Globin mRNA
    V. Optical Studies
    VI. Carbodiimide Binding to Globin mRNA
    VII. Polynucleotide Phosphorylase Digestion of mRNA
    VIII. Specific Hydrolysis of mRNA by S1 Nuclease
    IX. Summary and Conclusions
    References

Molecular Weight Distribution of RNA Fractionated on Aqueous and 70% Formamide Sucrose Gradients
HELGA BOEDTKER AND HANS LEHRACH
    I. Introduction
    II. Molecular Weight Distribution on Aqueous Sucrose Gradients
    III. Molecular Weight Distribution on 70% Formamide Sucrose Gradients
    IV. Discussion
    References

III. Processing of mRNAs

Bacteriophages T7 and T3 as Model Systems for RNA Synthesis and Processing
J. J. DUNN, C. W. ANDERSON, J. F. ATKINS, D. C. BARTELT AND W. C. CROCKETT
    I. Introduction
    II. Properties of RNase III
    III. Synthesis of T7 and T3 Early RNAs
    IV. Fidelity of RNase III Cleavage in Vitro
    V. Effect of Cleavage on Translation
    VI. Summary
    References

The Relationship between hnRNA and mRNA
ROBERT P. PERRY, ENZO BARD, B. DAVID HAMES, DAWN E. KELLEY AND UELI SCHIBLER
    I. Introduction
    II. Transcriptional Units and the Physical Size of Precursors
    III. Sequence Properties
    IV. Kinetic Considerations
    V. Studies of the 5' Termini of hnRNA and mRNA
    VI. Summary
    References

A Comparison of Nuclear and Cytoplasmic Viral RNAs Synthesized Early in Productive Infection with Adenovirus 2
HESCHEL J. RASKAS AND ELIZABETH A. CRAIG
    Text
    References

Biogenesis of Silk Fibroin mRNA: An Example of Very Rapid Processing?
PAUL M. LIZARDI
    I. Introduction
    II. Experimental Procedure
    III. Results
    IV. Discussion
    V. Summary
    References

Visualization of the Silk Fibroin Transcription Unit and Nascent Silk Fibroin Molecules on Polyribosomes of Bombyx mori
STEVEN L. MCKNIGHT, NELDA L. SULLIVAN AND OSCAR L. MILLER, JR.
    Text
    References

Production and Fate of Balbiani Ring Products
B. DANEHOLT, S. T. CASE, J. HYDE, L. NELSON AND L. WIESLANDER
    I. Introduction
    II. Transcription Complexes in Balbiani Rings
    III. BR Granules in Nuclear Sap, Nuclear Pores and Cytoplasm
    IV. Polysomes of Large Size in Chironomus Salivary Glands
    V. BR RNA in Polysomes
    VI. Concluding Remarks
    References

Distribution of hnRNA and mRNA Sequences in Nuclear Ribonucleoprotein Complexes
ALAN J. KINNIBURGH, PETER B. BILLINGS, THOMAS J. QUINLAN AND TERENCE E. MARTIN
    I. Introduction
    II. Distribution of RNA Sequences in Nuclear Extracts
    III. Concluding Remarks
    References

IV. Chromatin Structure and Template Activity

The Structure of Specific Genes in Chromatin
RICHARD AXEL
    I. Introduction
    II. The Nucleosomal Subunit
    III. Nucleosomes in Metaphase Chromosomes
    IV. Analysis of the DNA of Monomeric Particles
    V. Structure of the Globin Genes in Chromatin
    VI. In Vitro Transcription as a Probe of the Globin Genes
    VII. Recognition of DNA Restriction Endonuclease Sites in Nucleosomes
    VIII. Conclusions
    References

The Structure of DNA in Native Chromatin as Determined by Ethidium Bromide Binding
J. PAOLETTI, B. B. MAGEE AND P. T. MAGEE
    I. Introduction
    II. Methods
    III. Results
    References

Cellular Skeletons and RNA Messages
RONALD HERMAN, GARY ZIEVE, JEFFREY WILLIAMS, ROBERT LENK AND SHELDON PENMAN
    I. Cytoplasmic Skeleton
    II. The Nuclear Skeleton and hnRNA
    III. Low-Molecular-Weight RNA Species
    IV. Summary
    References

The Mechanism of Steroid-Hormone Regulation of Transcription of Specific Eukaryotic Genes
BERT W. O'MALLEY AND ANTHONY R. MEANS
    I. Introduction
    II. Control Theories
    III. The Oviduct as a Model for Steroid Hormone Action
    IV. Is Ovalbumin Synthesis Regulated at the Translational Level?
    V. Is Ovalbumin Synthesis Regulated at the Posttranscriptional Level?
    VI. Is Ovalbumin Synthesis Regulated at the Level of Transcription?
    VII. A Model for Steroid Hormone Action
    VIII. Directions of Future Research
    References

Nonhistone Chromosomal Proteins and Histone Gene Transcription
GARY STEIN, JANET STEIN, LEWIS KLEINSMITH, WILLIAM PARK, ROBERT JANSING AND JUDITH THOMSON
    I. Introduction
    II. Evidence for Transcriptional Regulation of Histone Gene Expression in Continuously Dividing HeLa S3 Cells
    III. Regulation of Histone Gene Transcription in Continuously Dividing HeLa S3 Cells by Nonhistone Chromosomal Proteins
    IV. Regulation of Histone Gene Transcription Following Stimulation of Nondividing Cells to Proliferate
    V. Activation of Histone Gene Transcription by Nonhistone Chromosomal Phosphoproteins
    VI. Conclusions
    References

Selective Transcription of DNA Mediated by Nonhistone Proteins
TUNG Y. WANG, NINA C. KOSTRABA AND RUTH S. NEWMAN
    I. Introduction
    II. A Nonhistone Protein from Ehrlich Ascites Tumor That Inhibits Transcription from DNA
    III. The Nonhistone-Protein Fraction That Stimulates Transcription from DNA
    IV. Conclusion
    References

V. Control of Translation

Structure and Function of the RNAs of Brome Mosaic Virus
PAUL KAESBERG
    I. Introduction
    II. Structural Considerations
    III. Regulation of Translation
    IV. Summary
    References

Effect of 5'-Terminal Structures on the Binding of Ribopolymers to Eukaryotic Ribosomes
S. MUTHUKRISHNAN, Y. FURUICHI, G. W. BOTH AND A. J. SHATKIN
    Text
    References

Translational Control in Embryonic Muscle
STUART M. HEYWOOD AND DORIS S. KENNEDY
    Text
    References

Protein and mRNA Synthesis in Cultured Muscle Cells
R. G. WHALEN, M. E. BUCKINGHAM AND F. GROS
    Text
    References

VI. Summary

mRNA Structure and Function
JAMES E. DARNELL
    I. Introduction
    II. Definition of mRNA and Brief Survey of Recent Progress in mRNA Structure
    III. Average Size of mRNA
    IV. mRNA Methylation
    V. Addition of Poly(A) to mRNA
    VI. Noncoding Regions and Repetitive Oligonucleotide in mRNA
    VII. Role of Caps in Translation
    VIII. Nuclear Transcripts and the Origin of mRNA
    IX. Chromatin Transcription and Gene Regulation
    References

SUBJECT INDEX

CONTENTS OF PREVIOUS VOLUMES

List of Contributors

Numbers in parentheses indicate the pages on which the authors' contributions begin.

GORDON ABRAHAM (83), Roche Institute of Molecular Biology, Nutley, New Jersey
C. W. ANDERSON (263), Biology Department, Brookhaven National Laboratory, Upton, New York
J. F. ATKINS (263), Department of Molecular Biology, University of Edinburgh, Edinburgh, Scotland
RICHARD AXEL (355), Institute of Cancer Research and Department of Pathology, Columbia University, College of Physicians & Surgeons, New York, New York
AMIYA K. BANERJEE (83), Roche Institute of Molecular Biology, Nutley, New Jersey
ENZO BARD (275), The Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania
    Present address: Department of Biology, University of Ottawa, Ottawa, Ontario, Canada.
D. C. BARTELT (263), Biology Department, Brookhaven National Laboratory, Upton, New York
PETER B. BILLINGS (335), Department of Biology, University of Chicago, Chicago, Illinois
HELGA BOEDTKER (253), Department of Biochemistry and Molecular Biology, Harvard University, Cambridge, Massachusetts
ROBERT F. BOONE (63), National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland
G. W. BOTH (473), Roche Institute of Molecular Biology, Nutley, New Jersey
JEFF BROWNE (177), Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California
G. G. BROWNLEE (123), MRC Laboratory of Molecular Biology, Hills Road, Cambridge, England
M. E. BUCKINGHAM (485), Département de Biologie Moléculaire, Institut Pasteur, Paris, France
HARRIS BUSCH (39), Department of Pharmacology, Baylor College of Medicine, Houston, Texas
S. T. CASE (319), Department of Histology, Karolinska Institutet, Stockholm, Sweden
C. C. CHENG (123), MRC Laboratory of Molecular Biology, Hills Road, Cambridge, England
PAT CLARKE (177), Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California

MICHAEL COHEN-SOLAL (165), The Division of Hematology-Oncology of the Department of Medicine, Children's Hospital Medical Center, and the Department of Pediatrics, Harvard Medical School, Boston, Massachusetts
RICHARD J. COLONNO (83), Roche Institute of Molecular Biology, Nutley, New Jersey
ELIZABETH A. CRAIG (293), Department of Pathology, Washington University School of Medicine, St. Louis, Missouri
    Present address: Department of Microbiology, University of California, San Francisco, California.
W. C. CROCKETT (263), Biology Department, Brookhaven National Laboratory, Upton, New York
B. DANEHOLT (319), Department of Histology, Karolinska Institutet, Stockholm, Sweden
JAMES E. DARNELL (493), Rockefeller University, New York, New York
P. L. DAVIES (135), Division of Medical Biochemistry, Faculty of Medicine, The University of Calgary, Calgary, Alberta, Canada
RONALD C. DESROSIERS (21), Department of Biochemistry, Michigan State University, East Lansing, Michigan
    Present address: Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut.
RAVI DHAR (157), Department of Human Genetics, Yale University School of Medicine, New Haven, Connecticut
G. H. DIXON (135), Division of Medical Biochemistry, Faculty of Medicine, The University of Calgary, Calgary, Alberta, Canada
J. J. DUNN (263), Biology Department, Brookhaven National Laboratory, Upton, New York
LEON DURE III (113), Department of Biochemistry, University of Georgia, Athens, Georgia
MARY EDMONDS (99), Life Science Department, University of Pittsburgh, Pittsburgh, Pennsylvania
MARCIA J. ENSINGER (63), National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland
L. N. FERRIER (135), Division of Medical Biochemistry, Faculty of Medicine, The University of Calgary, Calgary, Alberta, Canada
D. J. FINNEGAN (221), Department of Biochemistry, Stanford University School of Medicine, Stanford, California
    Present address: Department of Molecular Biology, University of Edinburgh, Edinburgh EH9 3JR, Scotland.
MARCIA S. FLASHNER (233), Department of Biology, Syracuse University, Syracuse, New York

BERNARD G. FORGET (165), The Division of Hematology-Oncology of the Department of Medicine, Children's Hospital Medical Center, and the Department of Pediatrics, Harvard Medical School, Boston, Massachusetts
KAREN FRIDERICI (21), Department of Biochemistry, Michigan State University, East Lansing, Michigan
Y. FURUICHI (3, 473), Roche Institute of Molecular Biology, Nutley, New Jersey
L. GEDAMU (135), Division of Medical Biochemistry, Faculty of Medicine, The University of Calgary, Calgary, Alberta, Canada
PRABHAT K. GHOSH (157), Department of Internal Medicine, Yale University School of Medicine, New Haven, Connecticut
F. GROS (485), Département de Biologie Moléculaire, Institut Pasteur, Paris, France
KAUSHAL KUMAR GUPTA (39), Department of Pharmacology, Baylor College of Medicine, Houston, Texas
B. DAVID HAMES (275), The Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania
    Present address: Department of Biology, University of Essex, Wivenhoe Park, Colchester, England.
BARRY HARRIS (113), Department of Biochemistry, University of Georgia, Athens, Georgia
HOWARD HEINDELL (177), Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California
RONALD HERMAN (379), Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts
STUART M. HEYWOOD (477), Genetics and Cell Biology Section, The University of Connecticut, Storrs, Connecticut
RUSSELL HIGUCHI (177), Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California
FRIEDRICH HIRSCH (39), Department of Pharmacology, Baylor College of Medicine, Houston, Texas
D. S. HOGNESS (221), Department of Biochemistry, Stanford University School of Medicine, Stanford, California
J. HYDE (319), Department of Histology, Karolinska Institutet, Stockholm, Sweden
K. IATROU (135), Division of Medical Biochemistry, Faculty of Medicine, The University of Calgary, Calgary, Alberta, Canada
JUDITH STROMMER ISAACSON (205), Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California

ROBERT JANSING, Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, Florida
PAUL KAESBERG (465), Biophysics Laboratory of the Graduate School and Biochemistry Department, College of Agricultural and Life Sciences, University of Wisconsin, Madison, Wisconsin
MARY ANN KATOPES (233), Department of Biology, Syracuse University, Syracuse, New York
DAWN E. KELLEY (275), The Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania
DORIS S. KENNEDY (477), Genetics and Cell Biology Section, The University of Connecticut, Storrs, Connecticut
ALAN J. KINNIBURGH (335), Department of Biology, University of Chicago, Chicago, Illinois
GARY A. KITOS (233), Department of Biology, Syracuse University, Syracuse, New York
LEWIS KLEINSMITH (421), Division of Biological Sciences, University of Michigan, Ann Arbor, Michigan
E. L. KORWEK (99), Life Science Department, University of Pittsburgh, Pittsburgh, Pennsylvania
NINA C. KOSTRABA (447), Division of Cell and Molecular Biology, State University of New York at Buffalo, Buffalo, New York
YUAN FON LEE (89), Department of Microbiology, School of Basic Health Sciences, State University of New York at Stony Brook, Stony Brook, New York
HANS LEHRACH (253), Department of Biochemistry and Molecular Biology, Harvard University, Cambridge, Massachusetts
ROBERT LENK (379), Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts
PAUL M. LIZARDI (301), The Rockefeller University, New York, New York
STEVEN L. MCKNIGHT (313), Department of Biology, University of Virginia, Charlottesville, Virginia
B. B. MAGEE (373), Department of Human Genetics, Yale University School of Medicine, New Haven, Connecticut
P. T. MAGEE (373), Department of Human Genetics, Yale University School of Medicine, New Haven, Connecticut
CHARLES A. MAROTTA (165), The Psychiatric Research Laboratories, and The Department of Psychiatry, Massachusetts General Hospital, and the Harvard Medical School, Boston, Massachusetts
SCOTT A. MARTIN (63), National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland
    Present address: Department of Pathology, Washington University School of Medicine, St. Louis, Missouri.

TERENCE E. MARTIN (335), Department of Biology, University of Chicago, Chicago, Illinois
ANTHONY R. MEANS (403), Department of Cell Biology, Baylor College of Medicine, Houston, Texas
OSCAR L. MILLER, JR. (313), Department of Biology, University of Virginia, Charlottesville, Virginia
BERNARD MOSS (63), National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland
S. MUTHUKRISHNAN (3, 473), Roche Institute of Molecular Biology, Nutley, New Jersey
HIROSHI NAKAZATO (99), Life Science Department, University of Pittsburgh, Pittsburgh, Pennsylvania
L. NELSON (319), Department of Histology, Karolinska Institutet, Stockholm, Sweden
MARTIN NEMER (119), The Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania
RUTH S. NEWMAN (447), Division of Cell and Molecular Biology, State University of New York at Buffalo, Buffalo, New York
AKIO NOMOTO (89), Department of Microbiology, School of Basic Health Sciences, State University of New York at Stony Brook, Stony Brook, New York
BERT W. O'MALLEY (403), Department of Cell Biology, Baylor College of Medicine, Houston, Texas
GARY PADDOCK (177), Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California
JULIAN PAN (157), Department of Human Genetics, Yale University School of Medicine, New Haven, Connecticut
J. PAOLETTI (373), Department of Human Genetics, Yale University School of Medicine, New Haven, Connecticut
    Present address: Laboratoire de Pharmacologie Moléculaire no 147 du CNRS, Institut Gustave Roussy, 16 bis Av. Paul Vaillant Couturier, 94800 Villejuif, France.
WILLIAM PARK (421), Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, Florida
SHELDON PENMAN (379), Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts
ROBERT P. PERRY (275), The Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania
N. J. PROUDFOOT (123), MRC Laboratory of Molecular Biology, Hills Road, Cambridge, England

THOMAS J. QUINLAN (335), Department of Biology, University of Chicago, Chicago, Illinois
    Present address: Department of Pathology and Anatomy, Mayo Clinic, Rochester, Minnesota.
MANCHANAHALLI RAO (39), Department of Pharmacology, Baylor College of Medicine, Houston, Texas
HESCHEL J. RASKAS (293), Department of Pathology, Washington University School of Medicine, St. Louis, Missouri
JOHN ROBERTS (177), Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California
FRITZ M. ROTTMAN (21), Department of Biochemistry, Michigan State University, East Lansing, Michigan
G. M. RUBIN (221), Department of Biochemistry, Stanford University School of Medicine, Stanford, California
    Present address: Sidney Farber Cancer Center, Harvard Medical School, 35 Binney Court, Boston, Massachusetts.
WINSTON SALSER (177, 205), Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California
UELI SCHIBLER (275), The Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania
MATTHEW S. SELL (233), Department of Biology, Syracuse University, Syracuse, New York
A. J. SHATKIN (3, 473), Roche Institute of Molecular Biology, Nutley, New Jersey
WILLIAM SPOHN (39), Department of Pharmacology, Baylor College of Medicine, Houston, Texas
GARY STEIN (421), Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, Florida
JANET STEIN (421), Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, Florida
URSULA STORB (227), Department of Microbiology and Immunology, University of Washington, Seattle, Washington
GARY STUDNICKA (177), Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California
KIRANUR N. SUBRAMANIAN (157), Department of Human Genetics, Yale University School of Medicine, New Haven, Connecticut
NELDA L. SULLIVAN (313), Department of Biology, University of Virginia, Charlottesville, Virginia
SAUL SURREY (119), The Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania

BAYAR THIMMAPPAYA (157), Department of Human Genetics, Yale University School of Medicine, New Haven, Connecticut
JUDITH THOMSON (421), Department of Biochemistry and Molecular Biology, University of Florida, Gainesville, Florida
J. TOMASZ (3), Institute of Biophysics, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary
NIKOS C. VAMVAKOPOULOS (233), Department of Biology, Syracuse University, Syracuse, New York
S. VENKATESAN (99), Life Science Department, University of Pittsburgh, Pittsburgh, Pennsylvania
JOHN N. VOURNAKIS (233), Department of Biology, Syracuse University, Syracuse, New York
TUNG Y. WANG (447), Division of Cell and Molecular Biology, State University of New York at Buffalo, Buffalo, New York
CHA-MER WEI (63), National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland
SHERMAN M. WEISSMAN (157, 165), Department of Human Genetics, Yale University School of Medicine, New Haven, Connecticut
R. G. WHALEN (485), Département de Biologie Moléculaire, Institut Pasteur, Paris, France
L. WIESLANDER (319), Department of Histology, Karolinska Institutet, Stockholm, Sweden
JEFFREY WILLIAMS (379), Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts
ECKARD WIMMER (89), Department of Microbiology, School of Basic Health Sciences, State University of New York at Stony Brook, Stony Brook, New York
BENJAMIN C. WU (39), Department of Pharmacology, Baylor College of Medicine, Houston, Texas
REGINA M. WURST (233), Department of Biology, Syracuse University, Syracuse, New York
SAYEEDA B. ZAIN (157), Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
PAUL ZAKAR (177), Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California
GARY ZIEVE (379), Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts

Preface

This is not the first time that this series has broken with the tradition of including only a group of invited review-type essays, none of which is directly related to its neighbors, inasmuch as the first half of Volume 17 is comprised of papers presented at a symposium honoring Erwin Chargaff on the occasion of his retirement, and that volume was dedicated to him. However, the present volume is the first to be devoted to a symposium on a single topic, including only papers presented at that symposium, and the first to be dedicated ex post facto, for the reasons given in the statement that follows, to the memory of one who first proposed its subject.

The decision to make a volume out of this symposium¹ was quite opportunistic. The subject matter obviously falls within the purview of this series, and the principal invited participants included many whose fields and ideas would qualify them to be invited as contributors to any of its volumes. The nearness of the site, which enabled the undersigned to attend and to discuss matters with the individual contributors, was a factor, as was the long-time association with the chief organizer, the co-editor of this volume.

One of the problems encountered was how to fit in the shorter contributions, both planned and spontaneous, which are the natural concomitants of all such symposia. Decisions were made by the two editors on the basis of length, pertinence, and the submission of manuscript; no attempt was made to record and include the brief questions and answers or other give-and-take that occurred. If this omission loses some flavor or detail, it is more than made up by the fact that each contributor tells his story as he wishes; no attempt has been made to eliminate overlap, or to summarize the results of parallel or complementary investigations (with the single exception of the final paper, which does attempt an overview of the entire symposium: papers, discussions and all). Each of the several major sections may be considered a minisymposium in itself, rather than the single-author review.

A second editorial decision involved nomenclature. Here the freedom allowed each contributor in the telling of his story was sharply abrogated in order that identical substances should be named identically, and, insofar as possible, in accord with international Recommendations and Rules.

¹ “mRNA: The Relation of Structure to Function,” 29th Annual Symposium of the Biology Division of the Oak Ridge National Laboratory, held at Gatlinburg, Tennessee, April 5-8, 1976.


While most nomenclatural items are covered in “Abbreviations and Symbols” (see p. xxix), there were a few regarding which arbitrary decisions were made. The term “cap” for “5’-terminal sequence” was allowed throughout Section I. However, the terms used by various authors to describe RNA with or without a poly(A) sequence at the 3’ terminal were so diverse and potentially confusing, as well as quite long and often repeated, that we converted them into the short and explicit terms “RNA(Aₙ)” for RNA containing a poly(A) sequence, and “RNA(no Aₙ)” for RNA devoid of a poly(A) sequence, where the differentiation is important. (By the Rules, RNA-Aₙ and RNA should suffice, but the hyphen may be confused with a minus sign.) We also converted X and Y (for “unspecified nucleotide residues”) to N’ and N’’, and SAM and SAH to AdoMet and AdoHcy, in accordance with the Rules (p. xxix).

It is our hope that readers will find this volume a proper addition to the series, and we welcome, as always, comments, critical or otherwise, on it or on any facet of the series.

W. E. C.
E. V.

Dedication - Jacques Monod

Fifteen years ago, Jacob and Monod presented their ideas on the regulation of gene activity (1, 2), introducing several concepts and terms that were as novel as they were prescient (operon and operator; structural gene and regulatory gene; and messenger RNA). They pointed out the necessity for conceiving the existence of a short-lived, rapidly-turning-over chemical entity that could carry the information encoded in the genetic material to the ribosomal “factory,” until then considered to be programmed directly by the genetic material, and indicated that the RNA fraction demonstrated several years earlier by Volkin and Astrachan (3) had the required properties to make it the logical “candidate” for the messenger role their concept demanded.

In discussing their concept, Jacob and Monod stated “This model may appear rather abstract and complex. It is, however, precise enough to imply very distinctive predictions by which its validity can be tested.” Although their ideas were based entirely on results from the study of prokaryote systems, they asked “to what extent are the mechanisms that operate in bacteria also present in tissues of higher organisms; what functions may such mechanisms perform in this different context; and may the new concepts and experimental approaches derived from the study of micro-organisms be transferred to the analysis and interpretation of the far more complex controls involved in the functioning and differentiation of tissue cells?” The explosion in recent years of research along these lines with eukaryotes enables one to say with confidence that these questions were essentially rhetorical.

The messenger RNA-regulator gene concept has had influence far beyond molecular biology. It has had a major impact on developmental biology and embryology. The vistas it opened up, by its brilliant combination of experiment, deduction, and prediction, underlie the Nobel award of 1965 to Lwoff, Monod, and Jacob (4, 5).

Rather early in the development of the concept of this volume, the subject and content of which flow so naturally out of the above, we asked Jacques Monod to contribute a short historical essay on the conception and development of the mRNA hypothesis. This he agreed to do, but shortly thereafter he became gravely ill and advised us that “I have to admit that this serious state of health is not yet over, and that its being a constant preoccupation has prevented me from meeting many of my obligations.” Within days, he was dead.

As a scientist, Monod was blessed with the ability to conceive of the right experiment at the right time, but however thoroughly devoted to


his scientific work, he always had time to help, inspire and discuss with students and other beginning scientists, from the time of his first chairmanship in 1954 through to his later years (from 1971 on) as director of the Pasteur Institute, an appointment that, to his regret, took him out of the laboratory to serve science and help others in another capacity. A generation of now well-known scientists, mainly American, benefited from his training and collaboration from the earliest years onwards.

“In Jacques Monod one met a rare combination of eminent gifts: a harmony between his natural gifts and the task at hand, a balance between intuition, creative imagination, and reason; a logical exigency in all the great undertakings; and a rigorous concern for his responsibilities” (6). It is in appreciation of the high esteem in which he is held, both as a scientist and as a man (6), and of the seminal influence he has had on the subject of this symposium and, indeed, on the subject of this series, that we dedicate this volume to the memory of Jacques Monod.

W. E. C.
E. V.

1. F. Jacob and J. Monod, JMB 3, 318-356 (1961).
2. F. Jacob and J. Monod, CSHSQB 26, 193 (1961); J. Monod and F. Jacob, ibid. 26, 389 (1961).
3. E. Volkin and L. Astrachan, Virology 2, 149 (1956).
4. G. S. Stent, Science 150, 462 (1965).
5. J. Monod, Science 154, 475 (1966).
6. Articles by A. Lwoff, F. Gros, and others in Le Monde (Paris), pp. 1, 19 (June 2, 1976). See also Trends in Biochemical Sciences 1, N-208 (1976).

Abbreviations and Symbols

All contributors to this Series are asked to use the terminology (abbreviations and symbols) recommended by the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) and approved by IUPAC and IUB, and the Editor endeavors to assure conformity. These Recommendations have been published in many journals (1, 2) and compendia (3) in four languages and are available in reprint form from the NAS-NRC Office of Biochemical Nomenclature (OBN), as stated in each publication, and are therefore considered to be generally known. Those used in nucleic acid work, originally set out in section 5 of the first Recommendations (1) and subsequently revised and expanded (2, 3), are given in condensed form (I-V) below for the convenience of the reader. Authors may use them without definition, when necessary.

I. Bases, Nucleosides, Mononucleotides

1. Bases (in tables, figures, equations, or chromatograms) are symbolized by Ade, Gua, Hyp, Xan, Cyt, Thy, Oro, Ura; Pur = any purine, Pyr = any pyrimidine, Base = any base. The prefixes S-, H₂, F-, Br, Me, etc., may be used for modifications of these.

2. Ribonucleosides (in tables, figures, equations, or chromatograms) are symbolized, in the same order, by Ado, Guo, Ino, Xao, Cyd, Thd, Ord, Urd (Ψrd), Puo, Pyd, Nuc. Modifications may be expressed as indicated in (1) above. Sugar residues may be specified by the prefixes r (optional), d (= deoxyribo), a, x, l, etc., to these, or by two three-letter symbols, as in Ara-Cyt (for aCyd) or dRib-Ade (for dAdo).

3. Mono-, di-, and triphosphates of nucleosides (5’) are designated by NMP, NDP, NTP. The N (for “nucleoside”) may be replaced by any one of the nucleoside symbols given in II-1 below. 2’-, 3’-, and 5’- are used as prefixes when necessary. The prefix d signifies “deoxy.” [Alternatively, nucleotides may be expressed by attaching P to the symbols in (2) above. Thus: P-Ado = AMP; Ado-P = 3’-AMP.] cNMP = cyclic 3’:5’-NMP; Bt₂cAMP = dibutyryl cAMP; etc.

II. Oligonucleotides and Polynucleotides

1. Ribonucleoside Residues

(a) Common: A, G, I, X, C, T, O, U, Ψ, R, Y, N (in the order of I-2 above).
(b) Base-modified: sI or M for thioinosine = 6-mercaptopurine ribonucleoside; sU or S for thiouridine; brU or B for 5-bromouridine; hU or D for 5,6-dihydrouridine; i for isopentenyl; f for formyl. Other modifications are similarly indicated by appropriate lower-case prefixes (in contrast to I-1 above) (2, 3).
(c) Sugar-modified: prefixes are d, a, x, or l as in I-2 above; alternatively, by italics or boldface type (with definition) unless the entire chain is specified by an appropriate prefix. The 2’-O-methyl group is indicated by suffix m (e.g., -Am- for 2’-O-methyladenosine, but -mA- for N-methyladenosine).
(d) Locants and multipliers, when necessary, are indicated by superscripts and subscripts, respectively, e.g., -m⁶₂A- = 6-dimethyladenosine; -s⁴U- or -⁴S- = 4-thiouridine; -ac⁴Cm- = 2’-O-methyl-4-acetylcytidine.
(e) When space is limited, as in two-dimensional arrays or in aligning homologous sequences, the prefixes may be placed over the capital letter, the suffixes over the phosphodiester symbol.

2. Phosphoric Acid Residues [left side = 5’, right side = 3’ (or 2’)]

(a) Terminal: p; e.g., pppN . . . is a polynucleotide with a 5’-triphosphate at one end; Ap is adenosine 3’-phosphate; C>p is cytidine 2’:3’-cyclic phosphate (1, 2, 3); p<A is adenosine 3’:5’-cyclic phosphate.
(b) Internal: hyphen (for known sequence), comma (for unknown sequence); unknown sequences are enclosed in parentheses. E.g., pA-G-A-C(C₂,A,U)A-U-G-C>p is a sequence with a (5’) phosphate at one end, a 2’:3’-cyclic phosphate at the other, and a tetranucleotide of unknown sequence in the middle. (Only codon triplets are written without some punctuation separating the residues.)

3. Polarity, or Direction of Chain

The symbol for the phosphodiester group (whether hyphen or comma or parentheses, as in 2b) represents a 3’-5’ link (i.e., a 5’ . . . 3’ chain) unless otherwise indicated by appropriate numbers. “Reverse polarity” (a chain proceeding from a 3’ terminus at left to a 5’ terminus at right) may be shown by numerals or by right-to-left arrows. Polarity in any direction, as in a two-dimensional array, may be shown by appropriate rotation of the (capital) letters so that 5’ is at left, 3’ at right when the letter is viewed right-side-up.

4. Synthetic Polymers

The complete name or the appropriate group of symbols (see II-1 above) of the repeating unit, enclosed in parentheses if complex or a symbol, is either (a) preceded by “poly,” or (b) followed by a subscript “n” or appropriate number. No space follows “poly” (3, 5). The conventions of II-2b are used to specify known or unknown (random) sequence, e.g.,

polyadenylate = poly(A) or (A)ₙ, a simple homopolymer;
poly(3 adenylate, 2 cytidylate) = poly(A₃,C₂) or (A₃,C₂)ₙ, an irregular copolymer of A and C in 3:2 proportions;
poly(deoxyadenylate-deoxythymidylate) = poly[d(A-T)] or poly(dA-dT) or (dA-dT)ₙ or d(A-T)ₙ, an alternating copolymer of dA and dT;
poly(adenylate,guanylate,cytidylate,uridylate) = poly(A,G,C,U) or (A,G,C,U)ₙ, a random assortment of A, G, C, and U residues, proportions unspecified.

The prefix copoly or oligo may replace poly, if desired. The subscript “n” may be replaced by numerals indicating actual size, e.g., (A)ₙ·(dT)₁₂₋₁₈.
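
The punctuation conventions of II-2 are regular enough to be read mechanically. The Python sketch below is an illustration only and is not part of the CBN Recommendations: the function names `parse_sequence` and `residue_count` are hypothetical, and the sketch assumes a simplified input in which hyphens separate residues of known sequence and a parenthesized, comma-separated block (with optional numeric multipliers) stands for a stretch of unknown order. Modified-residue prefixes and locants inside such a block are not handled.

```python
import re


def parse_sequence(text):
    """Split shorthand such as 'pA-G-A-C(C2,A,U)A-U-G-C>p' into tokens.

    Residues of known sequence are returned as strings. A parenthesized
    block of unknown order is returned as ("unknown", {symbol: count}).
    """
    tokens = []
    # The capturing group keeps the parenthesized blocks in the split result.
    for part in re.split(r"(\([^()]*\))", text.replace(" ", "")):
        if part.startswith("("):
            composition = {}
            for item in part[1:-1].split(","):
                match = re.fullmatch(r"([A-Za-z]+)(\d*)", item)
                if match is None:
                    raise ValueError(f"unrecognized residue: {item!r}")
                symbol, count = match.groups()
                # A missing multiplier means a single residue.
                composition[symbol] = composition.get(symbol, 0) + int(count or 1)
            tokens.append(("unknown", composition))
        else:
            # Hyphens separate residues of known sequence.
            tokens.extend(piece for piece in part.split("-") if piece)
    return tokens


def residue_count(tokens):
    """Count residues, expanding the multipliers of unknown-order blocks."""
    total = 0
    for token in tokens:
        if isinstance(token, tuple):
            total += sum(token[1].values())
        else:
            total += 1
    return total


# Illustrative string written in the style of the example in II-2(b).
example = "pA-G-A-C(C2,A,U)A-U-G-C>p"
tokens = parse_sequence(example)
print(tokens)
print(residue_count(tokens))
```

For the illustrative string, the tokens fall into four residues of known sequence, one block of composition C2, A, U, and four further residues of known sequence, so the count is twelve. Terminal phosphate symbols (the leading p and the trailing >p) are simply carried along with their residue.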

III. Association of Polynucleotide Chains

1. Associated (e.g., H-bonded) chains, or bases within chains, are indicated by a center dot (not a hyphen or a plus sign) separating the complete names or symbols, e.g.:

poly(A)·poly(U) or (A)ₙ·(U)ₘ
poly(A)·2 poly(U) or (A)ₙ·2(U)ₘ
poly(dA-dC)·poly(dG-dT) or (dA-dC)ₙ·(dG-dT)ₘ.


2. Nonassociated chains are separated by the plus sign, e.g.:

3. Unspecified or unknown association is expressed by a comma (again meaning “unknown”) between the completely specified chains. Note: In all cases, each chain is completely specified in one or the other of the two systems described in II-4 above.

IV. Natural Nucleic Acids

RNA                              ribonucleic acid or ribonucleate
DNA                              deoxyribonucleic acid or deoxyribonucleate
mRNA; rRNA; nRNA                 messenger RNA; ribosomal RNA; nuclear RNA
hnRNA                            heterogeneous nuclear RNA
D-RNA; cRNA                      “DNA-like” RNA; complementary RNA
mtDNA                            mitochondrial DNA
tRNA                             transfer (or acceptor or amino-acid-accepting) RNA; replaces sRNA, which is not to be used for any purpose
aminoacyl-tRNA                   “charged” tRNA (i.e., tRNA’s carrying aminoacyl residues); may be abbreviated to AA-tRNA
alanine tRNA or tRNA^Ala, etc.   tRNA normally capable of accepting alanine, to form alanyl-tRNA
alanyl-tRNA or alanyl-tRNA^Ala   The same, with alanyl residue covalently attached.

[Note: fMet = formylmethionyl; hence tRNA^fMet, identical with tRNA_f^Met]
Isoacceptors are indicated by appropriate subscripts, i.e., tRNA_1^Ala, tRNA_2^Ala, etc.

V. Miscellaneous Abbreviations

Pi, PPi            inorganic orthophosphate, pyrophosphate
RNase, DNase       ribonuclease, deoxyribonuclease
t_m (not T_m)      melting temperature (°C)

Others listed in Table II of Reference 1 may also be used without definition. No others, with or without definition, are used unless, in the opinion of the editor, they increase the ease of reading.

Enzymes

In naming enzymes, the 1972 recommendations of the IUPAC-IUB Commission on Biochemical Nomenclature (CBN) (4) are followed as far as possible. At first mention, each enzyme is described either by its systematic name or by the equation for the reaction catalyzed or by the recommended trivial name, followed by its EC number in parentheses. Thereafter, a trivial name may be used. Enzyme names are not to be abbreviated except when the substrate has an approved abbreviation (e.g., ATPase, but not LDH, is acceptable).

REFERENCES*

1. JBC 241, 527 (1966); Bchem 5, 1445 (1966); BJ 101, 1 (1966); ABB 115, 1 (1966), 129, 1 (1969); and elsewhere.†
2. EJB 15, 203 (1970); JBC 245, 5171 (1970); JMB 55, 299 (1971); and elsewhere.†
3. “Handbook of Biochemistry” (H. A. Sober, ed.), 2nd ed. Chemical Rubber Co., Cleveland, Ohio, 1970, Section A and pp. H130-133.
4. “Enzyme Nomenclature,” Elsevier Scientific Publ. Co., Amsterdam, 1973, and Supplement No. 1, BBA 429, 1 (1976).
5. “Nomenclature of Synthetic Polypeptides,” JBC 247, 323 (1972); Biopolymers 11, 321 (1972); and elsewhere.†

* Contractions for names of journals follow.
† Reprints of all CBN Recommendations are available from the NRC Office of Biochemical Nomenclature (W. E. Cohn, Director), Biology Division, Oak Ridge National Laboratory, Box Y, Oak Ridge, Tennessee 37830, USA.

Abbreviations of Journal Titles

Journals                                   Abbreviations used
Annu. Rev. Biochem.                        ARB
Arch. Biochem. Biophys.                    ABB
Biochem. Biophys. Res. Commun.             BBRC
Biochemistry                               Bchem
Biochem. J.                                BJ
Biochim. Biophys. Acta                     BBA
Cold Spring Harbor Symp. Quant. Biol.      CSHSQB
Eur. J. Biochem.                           EJB
Fed. Proc.                                 FP
J. Amer. Chem. Soc.                        JACS
J. Bacteriol.                              J. Bact.
J. Biol. Chem.                             JBC
J. Chem. Soc.                              JCS
J. Mol. Biol.                              JMB
Nature, New Biology                        Nature NB
Proc. Nat. Acad. Sci. U.S.                 PNAS
Proc. Soc. Exp. Biol. Med.                 PSEBM
Progr. Nucl. Acid Res. Mol. Biol.          This Series

Some Articles Planned for Future Volumes

The Transfer RNAs of Cellular Organelles
W. E. BARNETT, L. I. HECKER AND S. D. SCHWARTZBACH

Mechanisms in Polypeptide Chain Elongation on Ribosomes
E. BERMEK

Mechanism of Action of DNA Polymerases
L. M. S. CHANG

Initiation of Protein Synthesis
M. GRUNBERG-MANAGO

Integration vs. Degradation of Exocellular DNA: An Open Question
P. F. LURQUIN

The Messenger RNA of Immunoglobulin Chains
B. MACH

Bleomycin, an Antibiotic Removing Thymine from DNA
W. MULLER AND R. ZAHN

Vertebrate Nucleolytic Enzymes and Their Localization
D. SHUGAR AND H. SIERAKOWSKA

Regulation of the Synthesis of Aminoacyl-tRNAs and tRNAs
D. SOLL

Physical Structure, Chemical Modification and Functional Role of the Acceptor Terminus of tRNA
M. SPRINZL AND F. CRAMER

The Biochemical and Microbiological Action of Platinum Compounds
A. J. THOMSON AND J. J. ROBERTS

Transfer RNA in RNA Tumor Viruses
L. C. WATERS AND B. C. MULLIN

Structure and Functions of Ribosomal RNA
R. ZIMMERMANN


I. The 5’-Terminal Sequence (“Cap”) of mRNAs

Caps in Eukaryotic mRNAs: Mechanism of Formation of Reovirus mRNA 5’-Terminal m⁷GpppGm-C
Y. FURUICHI, S. MUTHUKRISHNAN, J. TOMASZ AND A. J. SHATKIN

Nucleotide Methylation Patterns in Eukaryotic mRNA
FRITZ M. ROTTMAN, RONALD C. DESROSIERS AND KAREN FRIDERICI

Structural and Functional Studies on the “5’-Cap”: A Survey Method for mRNA
HARRIS BUSCH, FRIEDRICH HIRSCH, KAUSHAL KUMAR GUPTA, MANCHANAHALLI RAO, WILLIAM SPOHN AND BENJAMIN C. WU

Modification of the 5’-Terminals of mRNAs by Viral and Cellular Enzymes
BERNARD MOSS, SCOTT A. MARTIN, MARCIA J. ENSINGER, ROBERT F. BOONE AND CHA-MER WEI

Blocked and Unblocked 5’-Termini in Vesicular Stomatitis Virus Product RNA in Vitro: Their Possible Role in mRNA Biosynthesis
RICHARD J. COLONNO, GORDON ABRAHAM AND AMIYA K. BANERJEE

The Genome of Poliovirus Is an Exceptional Eukaryotic mRNA
YUAN FON LEE, AKIO NOMOTO AND ECKARD WIMMER


Caps in Eukaryotic mRNAs: Mechanism of Formation of Reovirus mRNA 5’-Terminal m⁷GpppGm-C

Y. FURUICHI, S. MUTHUKRISHNAN, J. TOMASZ* AND A. J. SHATKIN
Roche Institute of Molecular Biology, Nutley, New Jersey, and *Institute of Biophysics, Biological Research Centre of the Hungarian Academy of Sciences, Szeged, Hungary

I. Introduction

Messenger RNAs from a variety of eukaryotic cells and viruses have been found to contain a 5’-terminal “cap” structure,¹ m⁷GpppN’(m)-N’’(m)-,² exemplified by the reovirus mRNA cap shown in Fig. 1 (1). The widespread distribution of caps in eukaryotic mRNAs suggested that they have a role in protein synthesis. Further studies showed that caps influence mRNA translation at the level of initiation (2-4), since reovirus mRNA containing m⁷GpppGm . . . binds efficiently to ribosomes in cell-free extracts whereas mRNA with 5’-terminal ppG . . . or GpppG . . . binds poorly (5). Because 5’-terminal m⁷GpppNm may be a recognition signal for mRNA·ribosome initiation-complex formation, it represents an important functional as well as structural feature of many eukaryotic mRNAs. Caps are also present in heterogeneous nuclear RNA (hnRNA) of mammalian cells (6, 7) and may be conserved during the maturation of various species of cytoplasmic mRNA (8). The unique 5’-5’ linkage in caps, first observed in low-molecular-weight RNA from nuclei of Novikoff hepatoma cells (9), implies that unusual mechanism(s) are available for modification of 5’-termini of eukaryotic RNAs.

5’-Terminal methylation of a viral mRNA in vitro was first demonstrated with purified insect cytoplasmic polyhedrosis virus (CP virus) (10). In this system, mRNA synthesis by the virion-associated RNA polymerase depends upon the presence of S-adenosylmethionine (AdoMet) (10) and the resulting mRNAs contain 5’-terminal m⁷GpppAmG . . . (11). Formation of the cap structure apparently occurs at the initiation of transcription, with AdoMet acting as a “trigger” for mRNA synthesis (10). In contrast to CP virus, mRNA synthesis with the core particles of reovirus or vaccinia virus does not depend upon AdoMet, but the 5’ termini of the mRNAs synthesized in its presence contained cap structures (12-14). In order to clarify the mechanism of cap synthesis and its relation to transcription, we studied cap formation by enzymes associated with viral cores of purified reovirus (15).

FIG. 1. Structure of m⁷GpppGm-C-.

¹ See also articles by Busch et al., Rottman et al. and Moss et al. in this volume.
² It is understood that m⁷GpppN . . . is m⁷G(5’)pppN . . . The (5’) is omitted for clarity. The parentheses around the m’s indicate that 2’-O-methylation of N’ and N’’ does not occur in all cases.

II. Results

A. Cap Formation at an Early Stage in Reovirus Transcription

Purified reovirus cores were incubated under conditions of mRNA synthesis in a reaction mixture that contained [Me-³H]AdoMet and [α-³²P]GTP (16). Nascent RNAs and oligonucleotides formed during short periods of incubation were separated from the labeled precursors by gel filtration in a calibrated column (Fig. 2A). Oligonucleotides of chain length ~1-188 residues were detected after 30 seconds (Fig. 2B). RNAs of ~4 S, a size sufficient to be excluded from Sephadex G-75, were present after 1 minute (Fig. 2C) and by 2 minutes had increased both in size (~12 S) and amount (Fig. 2D). The short oligonucleotides (fraction I of Fig. 2B,C) were pooled and analyzed for the presence of caps. They contained the dimethylated cap, m⁷GpppGm (Fig. 3-I). Oligonucleotides smaller than dodecanucleotides (fraction II, Fig. 2B and C) also contained predominantly m⁷GpppGm, but small amounts (~10%) of monomethylated (incomplete) cap, m⁷GpppG, were present as well (Fig. 3-II). The results indicate that cap formation occurs at an early stage in the transcription and suggest further that 7-methylation of the 5’-terminal guanosine precedes 2’-O-methylation of the penultimate residue in caps. In support of these observations, caps were also found in nascent 5’-terminal oligonucleotides produced in incomplete transcription reaction mixtures containing AdoMet, GTP and CTP but no ATP and UTP. Short oligonucleotides of net negative charge about -4 contained almost exclusively the monomethylated cap, m⁷GpppG. Longer nascent chains had an increasing proportion of dimethylated caps (15).

FIG. 2. Analysis by gel filtration of nascent mRNA products synthesized with reovirus cores. Synthesis of mRNA by reovirus cores was stopped at early stages of the transcription reaction (B, C, D = 30, 60, and 120 seconds, respectively) by addition of phenol. The deproteinized extracts were applied to a calibrated column (1 x 70 cm) of Sephadex G-75. Elution was carried out with 0.02 M TrisCl buffer (pH 7.6), and aliquots (0.1 ml) of fractions (0.5 ml) were monitored for radioactivity. Each incubation mixture (0.5 ml) contained 70 mM TrisCl (pH 8), 50 mM KCl, 2 mM ATP, 2 mM CTP, 2 mM UTP, 0.5 mM GTP, 24 µCi [α-³²P]GTP (83 Ci/mmol, New England Nuclear), 40 µCi [³H]AdoMet, 7 mM MgCl₂, and 600 µg of washed reovirus cores. Incubation mixtures without MgCl₂ were warmed for 1 minute at 35°C, and transcription at 35°C was initiated by the addition of MgCl₂. Dashed line, ³H; solid line, ³²P. Abscissa: fraction number.

FIG. 3. Analysis of [³H]methylated 5’-terminal structures of nascent reovirus mRNA. Fractions 35-45 (I) and 46-50 (II) were pooled from the 30-second and 60-second reactions (Fig. 2B and C). The products were concentrated by adsorption to DEAE-cellulose (0.6 x 3 cm column), elution with 2 M NH₄HCO₃ and lyophilization. Pools I (upper) and II (lower) were digested in 0.2 ml of sodium acetate buffer (2 mM, pH 6.0) with 20 µg of P1 nuclease at 37°C for 1 hour, adjusted to pH 8.0, and incubated with 1 unit of alkaline phosphatase (Worthington) at 37°C for 1 hour. Digests were analyzed by paper chromatography in isobutyric acid/0.5 M NH₄OH (10:6 v/v) with authentic markers, m⁷GpppG and m⁷GpppGm. ³²P radioactivity (not shown in the figure) migrated in the position of Pi, i.e., faster than the pG marker.

B. Conversion of ppG-C to Blocked and Methylated Cap, m⁷GpppG(m)-C

The dinucleotide ppG-C, which corresponds in sequence to unblocked 5’-termini of reovirus mRNA (5), was efficiently converted to Gp*ppG-C (p* = ³²P) by reovirus cores incubated with [α-³²P]GTP, ppp*G, in the presence of AdoHcy (15). Blocking of ppG-C by reovirus cores was also studied under conditions of methylation, i.e., in the presence of [α-³²P]GTP and [Me-³H]AdoMet. The alkaline-phosphatase-resistant ³²P- and ³H-labeled products were separated from the labeled precursors by paper electrophoresis (Fig. 4A). The material indicated by the bracket was eluted, digested with P1 nuclease and resolved into three radioactive peaks by paper chromatography (Fig. 4B). Peak I, which comprised the predominant ³²P-labeled product, contained no ³H radioactivity and migrated with marker GpppG, indicating that it was derived from the blocked, unmethylated Gp*ppG-C. Further analysis of peak II by paper electrophoresis after treatment with nucleotide pyrophosphatase revealed the presence of ³²P- and ³H-labeled 7-methylguanosine monophosphate as the only radioactive constituent (Fig. 4C). Thus, the structure of peak II is m⁷Gp*ppG, obtained from m⁷Gp*ppG-C. Peak III was further purified by paper electrophoresis to separate the contaminating residual Pi (73% of the ³²P) and pG (21%, from incompletely hydrolyzed [α-³²P]GTP) from the ³H- and ³²P-containing constituent that migrated in the position of pG. The phosphatase-resistant ³²P-labeled material (6%) migrating in the position of the m⁷GpppGm also contained [³H]methyl radioactivity and was identified as m⁷GpppGm since it yielded ³H- and ³²P-labeled pm⁷G and ³H-labeled Gm after nucleotide pyrophosphatase treatment (Fig. 4D). Thus, the structures in peak I, II and III were derived from cap structures Gp*ppG-C, m⁷Gp*ppG-C and m⁷Gp*ppGm-C, respectively, that were formed by the action of core-associated guanylyltransferase and methyltransferases. Low levels of Gp*pp*G and m⁷Gp*pp*G were also detected in the phosphatase-treated products (in fractions 11-12 and fraction 8, respectively, of Fig. 4A). These compounds presumably were formed in a limited reaction involving condensation of two molecules of ppp*G and subsequent methylation.


FIG. 4. Modification of ppG-C to form methylated, blocked structures. (A) The capping and methylation of ppG-C was done in a reaction mixture (0.2 ml) containing 75 mM TrisCl pH 8, 4 mM MgCl₂, 0.25 mM ppG-C, 30 mM KCl, 25 µCi [³H]AdoMet (specific activity 7.5 Ci/mmol), 0.8 mM GTP, 35 µCi of [α-³²P]GTP and 800 µg of washed reovirus cores. Incubation was at 45°C for 5 hours; the mixture was extracted with phenol followed by ether, digested with alkaline phosphatase and analyzed by paper electrophoresis at pH 3.5. (B) Fractions indicated by the bracket in panel A were extracted, digested with P1 nuclease, and analyzed with marker compounds by paper chromatography in isobutyric acid/0.5 M NH₄OH (10:6 v/v). (C) Peak II component in panel B, which migrated in the position of m⁷GpppG, was extracted, digested for 30 minutes at 37°C with 0.05 unit of venom nucleotide pyrophosphatase per milliliter in 0.1 ml of 0.02 M Tris buffer pH 7.5 and 1 mM Mg²⁺, and analyzed by paper electrophoresis. (D) Peak III component, migrating with marker m⁷GpppGm in panel B, was extracted, further purified by paper electrophoresis to remove Pi, treated with alkaline phosphatase to remove pG, again separated by electrophoresis, digested with venom nucleotide pyrophosphatase, and finally analyzed by paper electrophoresis as for compound II in panel C.
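
As a rough guide to the scale of the reaction in panel A, the molar amounts of the main components follow from concentration times volume and from radioactivity divided by specific activity. The Python sketch below is an illustration under stated assumptions, not a calculation taken from the authors: the helper names are hypothetical, the 0.8 mM GTP is taken to be the total GTP pool (the mass contributed by the labeled GTP is neglected), and the [³H]AdoMet is taken to be the only AdoMet present.

```python
def nmol_from_concentration(conc_mM, volume_ml):
    """mM x ml gives micromoles; multiply by 1000 to obtain nmol."""
    return conc_mM * volume_ml * 1000.0


def nmol_from_activity(activity_uCi, specific_activity_Ci_per_mmol):
    """Ci/mmol is numerically equal to microCi/nmol, so the ratio is in nmol."""
    return activity_uCi / specific_activity_Ci_per_mmol


VOLUME_ML = 0.2  # reaction volume given in the legend to Fig. 4

ppgc_nmol = nmol_from_concentration(0.25, VOLUME_ML)  # ppG-C acceptor
gtp_nmol = nmol_from_concentration(0.8, VOLUME_ML)    # GTP donor pool
adomet_nmol = nmol_from_activity(25.0, 7.5)           # [3H]AdoMet (assumed sole AdoMet)

# Assumption: all 35 microCi of label is spread over the 0.8 mM GTP pool.
gtp_specific_activity = 35.0 / gtp_nmol               # microCi per nmol

print(f"ppG-C:  {ppgc_nmol:.1f} nmol")
print(f"GTP:    {gtp_nmol:.1f} nmol")
print(f"AdoMet: {adomet_nmol:.2f} nmol")
print(f"GTP specific activity: {gtp_specific_activity:.3f} microCi/nmol")
```

Under these assumptions the mixture holds about 50 nmol of ppG-C, 160 nmol of GTP and roughly 3.3 nmol of AdoMet, and the GTP pool carries about 0.22 µCi per nmol.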


TABLE I

Product            Amount produced (nmol)
Gp*ppG-C           5.6s
m⁷Gp*ppG-C         0.35
m⁷Gp*ppGm-C        0.04
Gp*pp*G            0.2
m⁷Gp*pp*G          0.02

ᵃ Values shown were calculated from the data in Fig. 4A and B.

The quantities of the different 5’-terminal structures synthesized by reovirus cores in the presence of AdoMet are summarized in Table I. Conversion of ppG-C to GpppG-C appears to be the most efficient reaction and is at least 20-fold more effective than pppG condensation for the formation of the blocked structure, GpppG. 7-Methylation of the terminal G in GpppG-C (or GpppG) was incomplete in these partial reaction mixtures; only about 10% of the products were methylated. The second methylation, i.e., to form 2’-O-methylguanosine, of the blocked trinucleotide was even more limited (
C. 5’-Diphosphate and Phosphodiester Bond Requirement for Cap Formation

A variety of different substrates were tested as precursors for the synthesis of blocked structures by reovirus cores. As shown in Table II, condensation of GTP (α-³²P-labeled) to form GpppG occurred to a limited extent and was not affected by addition of GDP, UTP or ATP (Table II, Expt. 1). However, when cores were incubated with GTP plus CTP, the formation of labeled blocked structures increased by 20-fold. Addition of GDP or UTP partially inhibited the reaction. Gp*pp*G formation from [α-³²P]GTP was also decreased upon addition of CTP, i.e., under conditions that promoted GpppG-C synthesis (Table II, Expt. 2). The


Blocked structures (pmol)

Expt. no.   Substrates                         GpppG-Cᵇ    GpppGᵇ

1           GTP
            GTP + GDP
            GTP + UTP
            GTP + ATP
2           GTP + CTP
            GTP + CTP + GDP
            GTP + CTP + UTP
3           GTP + ppG-C
            GTP + pppG-C
            GTP + pG-C
4           GTP + ppG-C
            GTP + ppG-C + pG-C
            GTP + ppG-C + pA-G
            GTP + ppA-G
5           ppG-C + GDP
            ppG-C + m7GTP
            ppG-C + [β,γ-32P]GTP
6           GTP + pyrophosphate
            GTP + CTP + pyrophosphate
            GTP + ppG-C + pyrophosphate
            GTP + CTP + phosphate

0 0 0 0

12 12 < 10 < 10

280 222 218

< 10

910 72 0

< 10

8 6 8 11

270 207 261 21c

8 15 7 15

0 0 0

0 0 10

0 0 0 203

0 0 0 < 10

ᵃ Incubation conditions and product analysis were the same as in Fig. 4 and Ref. 15, except for Expt. 4, in which the reaction time was 30 minutes and the concentrations of ppA-G, pA-G and pG-C were 0.05, 0.1 and 0.1 mM, respectively. The levels of [α-32P]GDP, [α-32P]m7GTP and [β,γ-32P]GTP in Expt. 5 were 2, 0.8 and 5.3 µM, respectively. All incubations were performed in the presence of 0.1 mM AdoMet containing 20 µCi of [3H]AdoMet (7.5 Ci/mmol) except the experiment with ppG-C and m7GTP. The concentrations of pyrophosphate and phosphate were 2 mM and 0.1 M, respectively.

ᵇ The fractions containing GpppG-C also include smaller amounts of m7GpppG-C (~6% of total pmol formed) and m7GpppGm-C (~1% of total); the GpppG fractions contained m7GpppG (10% of total).

ᶜ Product identification only on basis of electrophoretic mobility at pH 3.5.


high yield of GpppG-C as compared to GpppG suggests that phosphodiester bond formation occurs before transfer of the 5'-terminal pG in the synthesis of blocked structures. Further evidence for this suggestion was obtained by comparing the dinucleotides pppG-C, ppG-C and pG-C as precursors of caps. The highest yield of blocked structures was obtained with GTP plus ppG-C (Table II, Expt. 3). The dinucleotide with a 5'-triphosphate end functioned to a lesser extent, presumably because prior conversion to ppG-C by the virion-associated nucleotide phosphohydrolase (17-19) was required. (pppG-C was converted to ppG-C by reovirus cores at a rate of 9 nmol/mg of cores/hour at 45°C.) The 5'-monophosphate-containing pG-C was not utilized as a substrate for the blocking reaction. ppA-G, which has the same general structure but differs in nucleotide sequence from the 5'-terminal ppG-C of unblocked reovirus mRNA, also was not used (Table II, Expt. 4). When pG-C or pA-G was added to the blocking reaction, only the former inhibited cap formation from ppG-C plus GTP (23% reduction in the presence of a 2:1 molar ratio of pG-C to ppG-C). These results are consistent with blocking occurring on nascent mRNA chains that are base-paired with the template minus-strand of reovirus genome RNA.

Other compounds were also tested as substrates for the reovirus core guanylyltransferase. [α-32P]GDP and [α-32P]m7GTP were not active, confirming the requirement for GTP (Table II, Expt. 5). No 32P-labeled GpppG-C was synthesized when [β,γ-32P]GTP rather than [α-32P]GTP was used as donor, consistent with transfer of only the α-phosphate of pppG to the diphosphate-terminated acceptor, ppG-C.
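The comparisons drawn in this section can be rechecked from the Table II yields. The short Python sketch below is illustrative only: the dictionary keys and helper names are hypothetical, and the pairing of each yield with its substrate mixture follows the statements in the text rather than any additional data.

```python
# Yields of blocked structures (pmol) as quoted from Table II.
# Keys and function names are illustrative assumptions; the assignment of
# each value to a substrate mixture follows the accompanying text.
TABLE_II_PMOL = {
    "GTP -> GpppG (Expt. 1)": 12,
    "GTP + CTP -> GpppG-C (Expt. 2)": 280,
    "GTP + ppG-C -> GpppG-C (Expt. 3)": 910,
    "GTP + pppG-C -> GpppG-C (Expt. 3)": 72,
    "GTP + ppG-C -> GpppG-C (Expt. 4)": 270,
    "GTP + ppG-C + pG-C -> GpppG-C (Expt. 4)": 207,
}


def fold_change(test_pmol, reference_pmol):
    """Ratio of the test yield to the reference yield."""
    if reference_pmol <= 0:
        raise ValueError("reference yield must be positive")
    return test_pmol / reference_pmol


def percent_reduction(control_pmol, treated_pmol):
    """Percentage by which the treated yield falls below the control yield."""
    if control_pmol <= 0:
        raise ValueError("control yield must be positive")
    return 100.0 * (control_pmol - treated_pmol) / control_pmol


if __name__ == "__main__":
    t = TABLE_II_PMOL
    stimulation = fold_change(t["GTP + CTP -> GpppG-C (Expt. 2)"],
                              t["GTP -> GpppG (Expt. 1)"])
    di_vs_tri = fold_change(t["GTP + ppG-C -> GpppG-C (Expt. 3)"],
                            t["GTP + pppG-C -> GpppG-C (Expt. 3)"])
    inhibition = percent_reduction(t["GTP + ppG-C -> GpppG-C (Expt. 4)"],
                                   t["GTP + ppG-C + pG-C -> GpppG-C (Expt. 4)"])
    print(f"GTP + CTP vs. GTP alone: {stimulation:.1f}-fold")
    print(f"ppG-C vs. pppG-C as acceptor: {di_vs_tri:.1f}-fold")
    print(f"Inhibition by pG-C: {inhibition:.0f}%")
```

With these inputs the stimulation by CTP comes out at roughly 23-fold, in line with the approximately 20-fold increase stated above, and the inhibition by pG-C reproduces the quoted 23% reduction.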

D. Effect of Pyrophosphate on Blocked-Structure Synthesis

Pyrophosphate, which is formed by reovirus cores in both the RNA polymerase and guanylyltransferase reactions, is an effective inhibitor of blocked-structure synthesis. At concentrations of 0.05 and 0.5 mM, the yield of GpppG-C from ppG-C and GTP was reduced by 70% and 95%, respectively. At 2 mM pyrophosphate, a concentration that reduced mRNA chain elongation by only 50%, no blocked structures were detected using ppG-C as substrate (Table II, Expt. 6). By contrast, a 50-fold higher concentration of inorganic phosphate (0.1 M) decreased GpppG-C synthesis by 25% and mRNA synthesis by 28%. Thus, PPi but not Pi differentially inhibits cap synthesis.

The RNA products synthesized by the reovirus-associated RNA polymerase in the presence of AdoHcy included molecules with 5'-terminal GpppG-C (27%) and ppG-C (71%) (5). The ratio of RNA chains with blocked vs. unblocked 5' termini was reversed when methylated RNA products were synthesized: m7GpppGm-C (75%) and ppG-C (25%). Some unblocked ends probably result from progressive inhibition of guanylyl transfer by pyrophosphate accumulated during polymerization. An increase in the proportion of unblocked termini in unmethylated mRNA would also be expected if pyrophosphate promoted reversal of the capping reaction (ppG-C + GTP ⇌ GpppG-C + PPi) but not the pyrophosphorolysis of the methylated cap, m7GpppG-C. This possibility was tested by comparing GpppG-C and m7GpppG-C as substrates for pyrophosphorolysis. GpppG-C was hydrolyzed by reovirus cores to the extent of 19% after incubation for 75 minutes in the presence of 2.5 mM pyrophosphate (Fig. 5A). The resulting GDP (Fig. 5B) probably was derived from GTP products by the action of the highly active core-associated nucleotide phosphohydrolase. In contrast, m7GpppG-C was not hydrolyzed under these conditions (Fig. 5C). GpppG also was not degraded by pyrophosphorolysis, consistent with the above results showing that (i) cap formation probably does not occur by GTP condensation, and (ii) the substrate for reovirus mRNA cap synthesis includes a minimum of one phosphodiester bond, i.e., ppG-C. The finding that 7-methylation prevents cap pyrophosphorolysis probably accounts for the greater proportion of capped molecules in methylated reovirus mRNA.

FIG. 5. Pyrophosphorolysis of GpppG-C. (A) [32P]GpppG-C (1.2 nmol, 1.1 × 10^6 cpm), synthesized by reovirus cores from [α-32P]GTP and ppG-C, was incubated for 75 minutes at 45°C with 200 µg of washed reovirus cores in 0.1 ml of mixture containing 70 mM TrisCl (pH 8), 4 mM MgCl2, and 2.5 mM sodium pyrophosphate. After incubation the mixture was phenol-extracted and analyzed by paper electrophoresis. (B) Fractions indicated by the bracket in panel A were extracted, and the components were identified by paper chromatography in isobutyric acid/0.5 M NH4OH (10:6 v/v). (C) [3H]Methyl-labeled m7GpppG-C (34 pmol, 1.3 × 10" cpm) was incubated with 2.5 mM sodium pyrophosphate as described in panel A except that the time of incubation was 60 minutes.

E. Proposed Mechanism of Reovirus Cap Synthesis

Together with previous findings (20), the data shown are consistent with cap synthesis occurring during mRNA formation on reovirus double-stranded template RNA as follows:

A. Reovirus Transcription

    Genome RNA:  (+) 5' m7GpppGm-C-U-A ... U-C 3'
                 (-) 3'       C-G-A-U ... A-Gpp 5'
    mRNA:        (+) 5' m7GpppGm-C-U ... C 3'

B. Reovirus mRNA Cap Synthesis

    (1)  pppG + pppC → pppG-C + PPi                     [RNA polymerase]
    (2)  pppG-C → ppG-C + Pi                            [nucleotide phosphohydrolase]
    (3)  pppG + ppG-C ⇌ GpppG-C + PPi                   [guanylyltransferase]
    (4)  GpppG-C + AdoMet → m7GpppG-C + AdoHcy          [methyltransferase 1]
    (5)  m7GpppG-C + AdoMet → m7GpppGm-C + AdoHcy       [methyltransferase 2]
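The sequential character of the proposed series can be made explicit by encoding reactions 1 to 5 as data. The Python sketch below is a minimal illustration, not a kinetic model: the class and function names are hypothetical, and the enzyme labels "methyltransferase 1" and "methyltransferase 2" are used here only to tell the two methylation steps apart.

```python
from typing import NamedTuple, Optional, Tuple


class Reaction(NamedTuple):
    number: int
    enzyme: str
    substrates: Tuple[str, ...]
    products: Tuple[str, ...]  # first entry is the RNA species carried forward


# Reactions 1-5 of the proposed series (enzyme labels are illustrative).
SCHEME = [
    Reaction(1, "RNA polymerase", ("pppG", "pppC"), ("pppG-C", "PPi")),
    Reaction(2, "nucleotide phosphohydrolase", ("pppG-C",), ("ppG-C", "Pi")),
    Reaction(3, "guanylyltransferase", ("pppG", "ppG-C"), ("GpppG-C", "PPi")),
    Reaction(4, "methyltransferase 1", ("GpppG-C", "AdoMet"), ("m7GpppG-C", "AdoHcy")),
    Reaction(5, "methyltransferase 2", ("m7GpppG-C", "AdoMet"), ("m7GpppGm-C", "AdoHcy")),
]


def is_sequential(scheme):
    """True if the carried-forward product of each step is a substrate of the next."""
    return all(prev.products[0] in nxt.substrates
               for prev, nxt in zip(scheme, scheme[1:]))


def end_product(scheme, blocked_step: Optional[int] = None):
    """Last RNA species formed; stops before blocked_step when one is given."""
    species = None
    for rxn in scheme:
        if rxn.number == blocked_step:
            break
        species = rxn.products[0]
    return species


if __name__ == "__main__":
    print(is_sequential(SCHEME))
    print(end_product(SCHEME))
    print(end_product(SCHEME, blocked_step=3))
```

Under this encoding, blocking reaction 3 leaves ppG-C as the last species formed, which is the species one would expect to accumulate if guanylyl transfer alone were inhibited.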

This proposed mechanism of reovirus mRNA cap synthesis implies that formation of the first phosphodiester bond on the genome double-stranded template RNA precedes guanylyl transfer to the 5'-diphosphate end, i.e., it requires that ppG-C, but not ppG, be a substrate for blocking by reovirus cores in the presence of GTP. As shown in Table II, a small amount of GpppG was formed by GTP condensation. However, it appears not to be a precursor of caps but a side-product formed in the incomplete reaction mixture, because when caps were synthesized from ppG-C and [α-32P]GTP, or from [α-32P]GTP plus CTP, GpppG formation decreased as compared to the quantity synthesized from GTP alone. Furthermore, the rate of GpppG formation is insufficient to account for the amount of cap structures made during mRNA synthesis.

FIG. 6. Identification of intermediate, ppG-C, in the synthesis of GpppG-C. (A) A reaction mixture (0.4 ml) containing 50 mM TrisCl (pH 8.0), 4 mM MgCl2, 0.85 mM CTP, 20 µCi of [14C]CTP (465 mCi/mmol, Amersham/Searle), 0.8 mM GTP, 66 µCi of [α-32P]GTP, 12.5 mM phosphoenolpyruvate, 0.6 units of pyruvate kinase, 2 mM sodium pyrophosphate, and 1.5 mg washed reovirus cores was incubated for 6 hours at 45°C. After incubation, the mixture was analyzed by paper electrophoresis. (B) Adjacent fractions in panel A were combined in pairs and extracted to obtain the radioactive compounds. Aliquots (10 µl of each extract) were digested in 0.1 ml of 10 mM TrisCl buffer with 0.5 unit of alkaline phosphatase at 37°C for 2 hours, and analyzed again by paper electrophoresis for [14C]G-C with authentic marker G-C. The cpm profile shows the yield of [14C]G-C from the fractions in panel A. (C) An aliquot (50 µl) of the material from the region indicated by the bracket in panel A was applied to a column (0.6 × 25 cm) of Dowex-1 × 8 resin equilibrated with 0.01 M HCl, and the chromatography was performed with a 0 to 0.4 M NaCl linear gradient in 0.01 M HCl (total = 400 ml). Fractions (2 ml) were collected, and 0.1 ml of each was counted in Aquasol scintillant. Solid line, 32P; dashed line, 14C.

F. Identification of the Intermediate ppG-C in the Synthesis of GpppG-C

Reovirus cores synthesize the 5'-terminal blocked structure of mRNA, GpppG-C, from GTP plus CTP (Table II). The proposed reaction series for its synthesis includes ppG-C and pppG-C as intermediates. Since 2 mM pyrophosphate almost completely inhibits reaction 3 in the series, i.e., the formation of GpppG-C from ppG-C plus GTP, it was of interest to test for the accumulation of ppG-C as an intermediate when cores were incubated with [α-32P]GTP, [14C]CTP and 2 mM pyrophosphate. As shown in Fig. 6A, electrophoresis and Dowex-1-column chromatographic analysis of a reaction mixture incubated under these conditions revealed the presence of the intermediate compound, ppG-C. The calculated amount of 14C-labeled ppG-C synthesized in the presence of 2 mM pyrophosphate was 57% of the amount of GpppG-C synthesized under similar conditions but without added pyrophosphate (Table II, Expt. 2). These results support the proposed mechanism of reovirus mRNA cap synthesis, i.e., blocking of the dinucleotide ppG-C to form GpppG-C followed by two sequential methylations.

G. Specificity of Reovirus Methyltransferases

Although GpppG is not hydrolyzed by reovirus cores in the presence of pyrophosphate, it is methylated at the 7-position of one guanosine residue by core-associated methyltransferase (Table III). The 7-methyltransferase activity requires as substrate two G residues linked through three phosphates. GTP and blocked structures containing other nucleosides or a different number of phosphates in the bridge are methylated poorly or not at all. As expected on the basis of the proposed mechanism of synthesis of blocked structures, GpppG-C was severalfold better as a substrate for methylation than GpppG.

The reovirus methyltransferase that modifies the 7-position of guanosine in caps (7-G-methylase) has a strict specificity for the 5'-terminal G, since it does not methylate internal nucleotides in nascent mRNA and recognizes only blocked structures consisting of two guanosine residues linked 5'-5' through three phosphates. Such a strict substrate recognition may be a characteristic property of the reovirus-associated enzyme. The 7-G-methylase solubilized and purified from vaccinia virus methylates GpppG(G)n and GpppA(A)n, giving rise to m7GpppG(G)n and m7GpppA(A)n in the absence of viral genome DNA (21, 22).¹ The purified vaccinia methylase does not require metal ions for activity. In contrast, the reovirus core-associated 7-G-methylase, one of several activities in the transcription complex, is completely inhibited by 2 mM EDTA and functions optimally at 2 mM Mg2+. The reovirus cation requirement may reflect a difference in template-associated vs. solubilized 7-G-methylases.

Compound          [3H]Methyl incorporated (pmol)

0.3 0 1 1 0 2 2 8ᵇ 0 151ᶜ

ᵃ Reaction mixtures (0.1 ml) contained 70 mM TrisCl (pH 8), 50 mM KCl, 2 mM MgCl2, 25 µCi of [3H]AdoMet, 100 µg of washed reovirus cores and 10 nmol of each compound. The mixtures were incubated at 45°C for 30 minutes and stopped by the addition of an equivalent volume of phenol. The aqueous layer recovered after centrifugation was analyzed by electrophoresis.

ᵇ m7GpppG.

ᶜ m7GpppG-C (80%) and m7GpppGm-C (20%).

H. Ability of Reovirus Methylase to Modify mRNA Posttranscriptionally

Methylation of blocked 5'-termini of reovirus mRNA occurs at an early stage of transcription. However, the core-associated methylases can methylate incomplete RNA chains posttranscriptionally at both the 7-position of the terminal guanosine and the 2'-OH of the penultimate guanosine (Fig. 7). Unmethylated, immature RNAs of different lengths were accumulated in situ in viral cores during brief transcription incubation times. The cores were washed to remove ribonucleoside triphosphates, to prevent further chain elongation, and then incubated with [methyl-3H]AdoMet in the absence of ribonucleoside triphosphates. Nascent, preexisting mRNAs were methylated at their 5' termini, yielding the dimethylated cap, m7GpppGm. Under these conditions, the core-associated RNA polymerase apparently remains on the genome template RNA with the incomplete mRNA at the position where chain elongation stops, since nascent mRNA molecules are quantitatively converted to complete chains during the "chase" with nonradioactive ribonucleoside triphosphates. Thus, the core methylases can function independently of the RNA polymerase to modify 5'-termini of nascent mRNA during transcription. These results suggest that the reovirus RNA polymerase and methylases within viral cores are functionally and physically separated.

¹ See also Moss et al., this volume.

FIG. 7. Methylation of nascent mRNA by cores in situ. Incomplete mRNA chains of variable size were synthesized in a reaction mixture (0.25 ml) containing 70 mM Tris buffer (pH 8), 10 mM MgCl2, 2 mM ATP, 2 mM UTP, 2 mM CTP, 0.3 mM GTP, 23 µCi of [α-32P]GTP (17.7 Ci/mmol), 0.4 mM AdoHcy, 10 mM phosphoenolpyruvate, 1 unit of phosphopyruvate kinase and 800 µg of washed reovirus cores. Incubations were carried out at 45°C for 1, 2, 3, and 5 minutes and stopped by freezing. After thawing at 4°C, the cores containing template-associated, incomplete nascent mRNA chains were collected by centrifugation, suspended in 0.2 ml of 50 mM Tris buffer (pH 8) containing 50 mM KCl and recentrifuged. This washing procedure was repeated at 4°C. The washed cores were suspended in a reaction mixture (0.1 ml) containing only 70 mM Tris buffer (pH 8), 5 mM MgCl2 and 20 µM [3H]AdoMet (10 Ci/mmol) and incubated for 15 minutes at 45°C. The mixtures were phenol-extracted and aliquots (50%) were analyzed by velocity sedimentation (18 hours, 30,000 rpm, SW 41 rotor) in 5 to 30% glycerol density gradients in 20 mM Tris buffer (pH 8), 0.1 M NaCl and 5 mM EDTA. Acid-insoluble radioactivity in the fractions was collected on Millipore filters and counted in toluene-based scintillant. Dashed line, 3H; solid line, 32P. The 32P scale in the 2-minute profile also applies to the 3-minute and 5-minute samples.

III. Discussion

Synthesis of mRNA caps during reovirus transcription is catalyzed by several core-associated enzymes that function sequentially in the blocking, 7-methylation and 2'-O-methylation reactions, as proposed above. The blocking reaction, i.e., guanylyl transfer to ppG-C, occurs at a very early step in transcription, and caps are formed during brief incubation times or when chain elongation is limited by deleting ATP and UTP from reaction mixtures. However, cap synthesis probably depends upon the prior activity of core-associated RNA polymerase and of nucleotide phosphohydrolase, because ppG-C is more efficiently converted to GpppG-C than is pppG-C or GTP plus CTP (Table II). The RNA polymerase and phosphohydrolase activities may be functionally coupled in reovirus cores; i.e., the pppG-C may be converted to ppG-C as the first phosphodiester bond is formed.

pG-C and ppA-G are not substrates for the guanylyltransferase activity, although pG-C (but not pA-G) inhibits utilization of ppG-C for cap synthesis. These findings indicate that the virion-associated guanylyltransferase requires oligonucleotides with diphosphate 5' termini as pG acceptors. The diphosphate requirement establishes a possible role for the phosphohydrolase activity that was described previously in animal DNA and RNA viruses including vaccinia (23), frog virus 3 (24), reovirus (25) and CP virus (26). The inactivity of ppA-G and pA-G in this reaction also suggests that the core-mediated blocking reaction is nucleotide-specific and possibly template-dependent, since G-C, but not A-G, is complementary to the 3'-terminal sequence of the "minus" strand of reovirus genome RNA (27). Among the four ribonucleoside triphosphates, only GTP (not GDP) is used for capping of ppG-C. This is accomplished for all reovirus mRNA classes by pG transfer from [α-32P]GTP. Furthermore, when [β,γ-32P]GTP was tested as substrate, the β-phosphate was incorporated into mRNA at the 5'-terminus as ppG..., but conversion of unlabeled ppG-C to GpppG-C in the presence of [β,γ-32P]GTP did not yield radioactive products.

Among the in vitro mRNA products of reovirus cores are molecules with 5'-terminal ppG-C. They comprise 75% of the total in preparations of unmethylated RNA made in the absence of AdoMet or the presence of AdoHcy, but only 25% under conditions of methylation. Some molecules could escape the blocking reaction if polymerization were faster than capping and if the RNA polymerase and guanylyltransferase both moved away from the transcription initiation sites. The unblocked 5' ends may also arise from one or more effects of pyrophosphate on the blocking reaction. First, the guanylyltransferase appears to be inhibited by pyrophosphate, as is the core-associated RNA polymerase. Because the transferase activity is more sensitive to the inhibitor, capping would be differentially and progressively reduced as polymerization proceeded with concomitant release of pyrophosphate. Second, pyrophosphorolysis of GpppG-C 5' ends, i.e., reversal of the capping reaction, would yield ppG-C plus GTP. The back reaction would be prevented in 5' termini that contain m7G. Since AdoMet does not stimulate the capping reaction, these results probably account for the 3-fold increase in the proportion of 5'-blocked molecules in preparations of methylated vs. unmethylated mRNA (5).

Like the guanylyltransferase, 7-G-methylase is also nucleotide-specific. It methylates the 7-position of only one of the guanosine residues in GpppG, the structure homologous to reovirus-blocked ends. It does not significantly methylate GpppA, GpppC or GpppU. GpppG-C is a more effective substrate than GpppG for the 7-G-methylase, consistent with cap formation at the dinucleotide level (Table III). Substrate specificity is also a feature of the core-associated 2'-O-methylating activity. It modifies only those oligonucleotides that already contain the 5'-terminal m7G and also requires at least one phosphodiester linkage in the substrate; i.e., m7GpppG or GpppG are not 2'-O-methylated by cores.

Caps are present in HeLa and L cell hnRNA (6, 7), in the fast-sedimenting RNA as well as in nuclear RNA molecules that are similar in size to cytoplasmic mRNA. However, it is not known whether the large capped hnRNAs, which have poly(A) at the 3'-terminal end, consist of molecules that have already been cleaved from precursors or whether they are primary transcripts. Our studies with reovirus cores demonstrate that, at least for some eukaryotic viral mRNAs, 5'-terminal caps are formed during initiation of transcription.
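The greater pyrophosphate sensitivity of the guanylyltransferase relative to the RNA polymerase, invoked above, can be given a rough numerical footing. The sketch below assumes a simple hyperbolic inhibition model, activity = 1/(1 + [PPi]/Ki); that model is an assumption introduced here for illustration and is not claimed in the text, and the function name is hypothetical. The inputs are the reductions reported in Section II, D (70% and 95% loss of GpppG-C synthesis at 0.05 and 0.5 mM pyrophosphate; 50% loss of chain elongation at 2 mM).

```python
def apparent_ki(inhibitor_mM, fraction_remaining):
    """Apparent Ki (mM) under the assumed model: activity = 1 / (1 + [I]/Ki).

    Rearranging gives Ki = [I] * f / (1 - f), where f is the fraction of
    activity remaining at inhibitor concentration [I].
    """
    if inhibitor_mM <= 0:
        raise ValueError("inhibitor concentration must be positive")
    if not 0.0 < fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must lie strictly between 0 and 1")
    return inhibitor_mM * fraction_remaining / (1.0 - fraction_remaining)


if __name__ == "__main__":
    # Capping: 70% reduction at 0.05 mM PPi, 95% reduction at 0.5 mM PPi.
    ki_capping = [apparent_ki(0.05, 0.30), apparent_ki(0.5, 0.05)]
    # Chain elongation: 50% reduction at 2 mM PPi.
    ki_elongation = apparent_ki(2.0, 0.50)
    for ki in ki_capping:
        print(f"capping Ki ~ {ki:.3f} mM; elongation/capping ratio ~ {ki_elongation / ki:.0f}")
```

Under this assumed model the two capping measurements give apparent Ki values of a few hundredths of a millimolar, nearly two orders of magnitude below the 2 mM value for chain elongation. The estimate is only as good as the model and should be read as an order-of-magnitude illustration.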

IV. Summary

Blocked and methylated 5'-termini of reovirus mRNA are formed by viral cores at an early stage of transcription. Cores incubated in a complete transcription reaction mixture for 30 seconds synthesize the "cap" structure, m7GpppGm-C. The dinucleotide ppG-C functions as substrate for a core-associated guanylyltransferase and is converted to GpppG-C by addition of pG from pppG (GTP). For optimal conversion both the diphosphate terminus and phosphodiester bond are required. pG-C is not a substrate, but pppG-C is utilized after removal of the γ-phosphate by a core nucleotide phosphohydrolase. Methyltransferases, also present in cores, transfer methyl groups sequentially from S-adenosylmethionine to the 7-position of the 5'-terminal G of GpppG-C and to the 2'-OH of the penultimate G. GpppG-C is hydrolyzed by cores in the presence of pyrophosphate to ppG-C, the predominant 5'-terminal structure of reovirus mRNA made in the absence of S-adenosylmethionine. 7-Methylation prevents pyrophosphorolysis of m7GpppG-C, which may explain the increased proportion of blocked, methylated 5' termini in viral mRNA synthesized in the presence of S-adenosylmethionine. On the basis of these findings, a series of reactions is proposed (Section II, E) for the synthesis of reovirus mRNA caps, and some characteristic features of the enzymes involved in these reactions are discussed (Sections II, G and H, III).

ACKNOWLEDGMENTS

We thank Dr. A. Simoncsits of the Institute of Biophysics, Szeged, Hungary for the preparation of pppG-C and ppG-C (28) and A. LaFiandra and M. Morgan of the Roche Institute for assistance in these studies.

REFERENCES

1. A. J. Shatkin, Cell (1976) (in press).
2. G. W. Both, A. K. Banerjee and A. J. Shatkin, PNAS 72, 1189-1193 (1975).
3. S. Muthukrishnan, G. W. Both, Y. Furuichi and A. J. Shatkin, Nature 255, 33-37 (1975).
4. E. D. Hickey, L. A. Weber and C. Baglioni, PNAS 73, 19-23 (1976).
5. G. W. Both, Y. Furuichi, S. Muthukrishnan and A. J. Shatkin, Cell 6, 185-195 (1975).
6. R. P. Perry, D. E. Kelley, K. H. Friderici and F. M. Rottman, Cell 6, 13-19 (1975).
7. M. Salditt-Georgieff, W. Jelinek, J. E. Darnell, Y. Furuichi, M. Morgan and A. J. Shatkin, Cell 7, 227-237 (1976).
8. S. Cory and J. M. Adams, JMB 99, 519-547 (1975).
9. T. S. Ro-Choi, Y. C. Choi, D. Henning, J. A. McCloskey and H. Busch, JBC 250, 3921-3928 (1975).
10. Y. Furuichi, NARes 1, 809-822 (1974).
11. Y. Furuichi and K.-I. Miura, Nature 253, 374-375 (1975).
12. Y. Furuichi, M. Morgan, S. Muthukrishnan and A. J. Shatkin, PNAS 72, 362-366 (1975).
13. T. Urushibara, Y. Furuichi, C. Nishimura and K. Miura, FEBS Lett. 49, 385-389 (1974).
14. C. M. Wei and B. Moss, PNAS 72, 318-322 (1975).
15. Y. Furuichi, S. Muthukrishnan, J. Tomasz and A. J. Shatkin, JBC 251, 5043-5053 (1976).
16. A. J. Shatkin, PNAS 71, 3204-3207 (1974).
17. D. H. Levin, G. Acs and S. C. Silverstein, Nature 227, 603-604 (1970).
18. J. Borsa, J. Grover and J. D. Chapman, J. Virol. 6, 295-302 (1970).
19. A. K. Banerjee, R. L. Ward and A. J. Shatkin, Nature NB 230, 169-172 (1971).
20. Y. Furuichi, S. Muthukrishnan, G. W. Both and A. J. Shatkin, Abstr. Int. Congr. Virol. 3rd (1975).
21. S. A. Martin, E. Paoletti and B. Moss, JBC 250, 9
22. S. A. Martin and B. Moss, JBC 250, 9330-9335 (1975).
23. P. Gold and S. Dales, PNAS 60, 845-852 (1968).
24. R. Vilagines and B. R. McAuslan, J. Virol. 7, 619-624 (1971).
25. A. J. Shatkin and G. W. Both, Cell 7, 305-313 (1976).
26. G. B. Storer, M. G. Shepherd and J. Kalmakoff, Intervirology 2, 87-94 (1973/74).
27. S. Muthukrishnan and A. J. Shatkin, Virology 64, 96-105 (1975).
28. A. Simoncsits, J. Tomasz and J. E. Allende, NARes 2, 257-263 (1975).

Nucleotide Methylation Patterns in Eukaryotic mRNA

FRITZ M. ROTTMAN, RONALD C. DESROSIERS¹ AND KAREN FRIDERICI

Department of Biochemistry
Michigan State University
East Lansing, Michigan

I. Introduction

After the description of methylation in eukaryotic cellular mRNA (1, 2) and viral RNA (3-5), it was noted that a large portion of the methylnucleotides were present in these RNA molecules as an alkali-stable oligonucleotide. Based on the data obtained from several laboratories, a general structure for this oligonucleotide was proposed in which a 7-methylguanosine residue was joined by 5'-5' pyrophosphate linkage to a 2'-O-methylnucleoside (6). These 5'-terminal "cap" structures have been found in a wide variety of cellular mRNAs (7-10) and viral RNAs (11-16).² Particularly in cellular mRNA, the 5'-terminal cap can be either m7GpppN'm-N'' (cap-1) or m7GpppN'm-N''m-N''' (cap-2), containing one and two 2'-O-methylnucleosides, respectively (8-10). Additional methylation of cellular mRNA molecules occurs internally, between the 5'-cap and the 3'-poly(A) segment, yielding 6-methyladenosine (m6A) (2, 17). Recent studies on hnRNAs indicate that these molecules also contain 5'-caps and internal m6A (18, 19).

Interest in cellular mRNA and viral RNA methylation has led to further refinement in the characterization of the methylnucleotides present in these molecules. Recently there has been an accumulation of evidence suggesting that the methylation events may be separated in the cell by time and place. These patterns of methylation, reflecting such parameters as labeling times and altered physiological cell culture conditions, can be exposed by an enzymic dismantling of cap structures coupled with an analysis of nucleosides and nucleotides by high-resolution chromatography.

The data presented here concentrate on mRNA from Novikoff cells grown in suspension culture. These rapidly growing cells, with a doubling time of approximately 12 hours, synthesize a complex array of proteins and consequently an equally complex mixture of mRNAs. Under conditions of continuous labeling with L-[methyl-3H]methionine, the relative labeling of cap-2 structures, at very early times, greatly exceeds that of cap-1. This relative enhancement of cap-2 labeling is due almost entirely to incorporation of methyl groups at the second nucleoside position to form N''m. At longer times of labeling, the level of radioactivity in each position approaches equivalence. Determination of methylnucleosides in internal positions and within cap structures permits several interesting comparisons of mRNA methylation patterns as a function of labeling time. The distribution data suggest a selective preservation of certain mRNA sequences and are consistent with a model in which the second cap methylation, to yield N''m, is a cytoplasmic event that follows earlier nuclear methylations producing m7G, N'm and internal m6A.

¹ Present address: Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut.

² See also articles by Moss et al., Furuichi et al., and Busch et al. in this volume.

II. Materials and Methods

A. Cell Culture and Labeling Conditions

Novikoff hepatoma cells (N1S1 strain) were grown in Swim's S-77 medium (GIBCO) containing 10% calf serum essentially as described (2). For labeling with L-[methyl-3H]methionine (Amersham/Searle, 5 Ci/mmol), cells in midlogarithmic growth phase were harvested aseptically and resuspended in fresh warm medium at a concentration of approximately 7.5 × 10" cells/ml. Labeling for the time-course study was performed in the presence of 20 mM sodium formate and 20 µM each of adenosine and guanosine to suppress nonmethyl purine ring-labeling; normal methionine levels were present for all labeling times except for 20 minutes, in which medium without methionine was used. Labeling conditions for the distributional analysis experiments were altered to increase the specific activity of the methylated nucleotides in mRNA. Adenosine, guanosine and formate were omitted and the 5- and 24-hour samples were labeled in 5 mM methionine, one-half the normal concentration. Under these conditions, there was no discernible change either in cell growth or in cell appearance when examined by phase microscopy. In most experiments, 15-20 mCi L-[methyl-3H]methionine was used. At each time point, cells were harvested aseptically and the radioactive medium was returned to the growing culture. Final cell concentrations at the time of harvest never exceeded 1.3 × 10"/ml.

B. Isolation and Characterization of Poly(A)-Containing Cytoplasmic mRNA

Total cytoplasmic RNA was isolated as previously described (2). Poly(A)-containing mRNA was isolated by oligo(dT)-cellulose chromatography, including a heat step prior to a second passage over the column (10). This step is necessary to eliminate traces of rRNA that otherwise interfere with the analysis.


C. Enzymic, Alkaline and Acid Digestion of mRNA

Poly(A)-containing cytoplasmic mRNA, essentially free of tRNA and rRNA contamination, was digested either enzymically or with alkali to obtain internal methylnucleotides and the cap. RNA was digested enzymically with RNase T2 (Sigma) at 2 units/A260 unit of RNA in 0.9 M NaCl, 0.15 M sodium acetate (pH 4.5), 0.01 M EDTA, for 2 hours at 37°C. The reaction mixture was then adjusted to pH 8 with 1 M NaOH and made 0.017 M in magnesium acetate. Bacterial alkaline phosphatase (PL Biochemicals, electrophoretically pure) dialyzed against 0.05 M NH4HCO3 was added (0.25 unit/A260 unit of RNA) and the reaction was continued for 30 minutes at 37°C. After resolution of the reaction products on DEAE-Sephadex in the presence of 7 M urea (7), the oligonucleotides were desalted on Bio-Gel P-2 (100-200 mesh). Mononucleosides were adsorbed to charcoal and eluted with 20% pyridine.

Alternatively, mRNA was first treated with nucleotide pyrophosphatase from Crotalus atrox (Sigma). A 200-µl reaction contained 0.25 unit of enzyme, 9 A260 units of RNA, 20 µmol of TrisCl pH 7.8 and 0.2 µmol of magnesium acetate. After incubation at 37°C for 35 minutes the reaction was stopped by placing in a boiling water bath for 5 minutes. The high level of carrier RNA was added to suppress nonspecific diesterase activity. The RNA was separated from the released pm7G by chromatography on Bio-Gel P-2 (100-200 mesh, 1.5 × 22 cm column) and treated with 0.25 unit of phosphatase in 0.05 M TrisCl pH 7.8, 1 mM magnesium acetate for 45 minutes at 37°C to remove the newly exposed 5'-terminal phosphates. The dephosphorylated RNA was then digested for 18 hours at 37°C with 0.4 N KOH to obtain N'm-N''p from cap-1 and N'm-N''m-N'''p from cap-2 structures.

Cap-2 structures (m7GpppN'm-N''m-N''') were further digested with penicillium nuclease (Yamasa Shoyu Co., Ltd.) at 160 µg/ml in 0.01 M sodium acetate, pH 6.1. After 45 minutes at 37°C, the sample was made 0.05 M in TrisCl pH 7.8 and 1 mM in magnesium acetate, and 0.3 unit of phosphatase/100 µl was added; incubation was continued for 30 minutes at 37°C. This procedure yielded N''m plus the "core" oligonucleotide, m7GpppN'm. Both cap structures and the core oligonucleotides generated from cap-2 were completely digested to nucleosides by addition of 0.25 unit of nucleotide pyrophosphatase and 0.4 unit of phosphatase to 100-µl reactions containing 0.1 M TrisCl pH 7.8, 0.1 mM magnesium acetate, and incubation for 45 minutes at 37°C.

Acid hydrolysis of whole mRNA, 5'-terminal oligonucleotides, and mononucleotides can be used to cleave the N-glycosyl bond of purine-containing nucleotides, releasing free purine bases. The procedure of Munns et al. (20) was followed. Generally, 1.5 A260 unit of RNA was dissolved in 0.5 ml of concentrated formic acid, the tube was sealed, and hydrolysis was carried out at 100°C for 2 hours. [14C]Adenosine was added as an internal standard to the 3H-labeled RNA before hydrolysis to permit determination of 14C:3H ratios after digestion and thus provide a measure of the methanol lost from 2'-O-methyl groups.
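Because the enzyme additions above are specified per A260 unit of RNA and the buffer components of the pyrophosphatase reaction are given as absolute amounts, a short calculation helps when scaling a digest. The Python sketch below is illustrative only: the function and constant names are hypothetical, and it assumes the buffer amounts in the 200-µl reaction are to be read as micromoles.

```python
# Enzyme-to-RNA ratios quoted in the protocol (units per A260 unit of RNA).
RNASE_T2_UNITS_PER_A260 = 2.0
PHOSPHATASE_UNITS_PER_A260 = 0.25


def enzyme_units(a260_units_rna, units_per_a260):
    """Enzyme units needed for a given amount of RNA expressed in A260 units."""
    if a260_units_rna < 0:
        raise ValueError("RNA amount must be non-negative")
    return a260_units_rna * units_per_a260


def concentration_mM(amount_umol, volume_ul):
    """Final concentration (mM) of amount_umol dissolved in volume_ul.

    1 µmol per µl equals 1 mol per litre, i.e. 1000 mM.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_umol * 1000.0 / volume_ul


if __name__ == "__main__":
    rna = 9.0  # A260 units, used here purely as an example amount
    print("RNase T2 units:", enzyme_units(rna, RNASE_T2_UNITS_PER_A260))
    print("Phosphatase units:", enzyme_units(rna, PHOSPHATASE_UNITS_PER_A260))
    # Buffer of the 200-µl pyrophosphatase reaction (amounts assumed to be µmol).
    print("TrisCl (mM):", concentration_mM(20, 200))
    print("Magnesium acetate (mM):", concentration_mM(0.2, 200))
```

On that reading, the 200-µl pyrophosphatase reaction is 100 mM in TrisCl and 1 mM in magnesium acetate, which is of the same order as the buffers given for the subsequent phosphatase steps.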

D. Liquid Chromatography

The chromatographic system used in these separations has been completely described in earlier publications (21, 10). Basically, it consisted of assembled components including a Milton-Roy Mini Pump, a Gilford Model 2000 spectrophotometer and columns made from stainless steel tubing and Swagelok fittings. The dimensions of each column, containing either Aminex A-5 resin (Bio-Rad) or Pellionex WAX (Reeve Angel), the buffers employed, and the conditions for each separation are given in the legends to the figures and tables. Buffers used for resolving nucleosides containing unmethylated ribose or free bases consisted of only ammonium formate, while buffers for resolution of 2'-O-methylnucleosides contained ammonium formate in 40% ethylene glycol (21).

III. Results

A. Isolation of Cellular mRNA and Detection of Methylation

The 3'-terminal poly(A) segment on most mRNA molecules provides a convenient method for mRNA purification, allowing the mRNA to be retained on covalently bound poly(U) or oligo(dT) columns. It should be emphasized, however, that mRNA purified by standard oligo(dT) procedures is not completely free of contaminating rRNA sequences and must be subjected to a brief heat treatment to eliminate traces of rRNA (10). Presumably this contaminating rRNA, composed mainly of 18 S sequences, results from a specific interaction between rRNA and mRNA (22). Messenger RNA purified in this fashion contains no significant alkali-stable dinucleotide, a product expected from internally located 2'-O-methylated nucleotides.

Acid hydrolysis with formic acid of whole, methyl-labeled mRNA produces free purine bases, pyrimidine-containing nucleotides and 2'-O-methylated material. Separation of these on Aminex A-5 columns produces only three major fractions: 2'-O-methyl-containing material, eluting at the solvent front; 7-methylguanine; and 6-methyladenine (Fig. 1). Thus in one analysis the distribution of methyl groups between the base-methylated 5'-terminal m7G, m6A and total 2'-O-methyl groups can be determined.

FIG. 1. Aminex A-5 chromatography of an acid hydrolyzate of whole, poly(A)-containing mRNA. The mRNA was hydrolyzed with acid as described in Section II and the formic acid was removed with a stream of N2. The sample was dissolved in column buffer and applied to the column (1/8 inch x 90 cm), which was developed with 0.4 M ammonium formate pH 5.6 at 42°C and 30 ml/hr (3500 psi); 20-drop fractions (~0.8 ml) were collected. The position of the arrows indicates the location of purine bases included as markers and detected by absorption at 260 nm.

When [3H]methyl-labeled poly(A)-containing mRNA was subjected to periodate oxidation and β-elimination, only a single methylated purine was detected (Fig. 2). This base was 7-methylguanine, consistent with its unique linkage at the 5'-terminal end of the mRNA. Little of the 7-methylguanine was converted to its ring-opened form (Fig. 2).

FIG. 2. Aminex A-5 chromatography of material released from whole poly(A)-containing mRNA by periodate oxidation and β-elimination. The RNA remaining after β-elimination was precipitated with ethanol, and the ethanol supernatant was dried with N2 and treated as in Fig. 1.

Chromatography on Pellionex WAX was used to resolve alkali-produced oligonucleotides, including the cap structures (10) (Fig. 3). Two items of note in this figure are the additional oligonucleotide peak eluting near the (U7)p standard, and the extremely small peak of radioactivity at N'm-N"p. The former is a possible result of resolution due to differences in the base composition of cap structures, and the latter is indicative of the low levels of rRNA contamination in heat-treated poly(A)-containing mRNA. Owing to the sensitivity of the m7G component to alkaline pH, alternative conditions for hydrolyzing mRNA molecules were employed. Combined hydrolysis of mRNA by T2 ribonuclease and phosphatase produces two major methylated fractions: internal base-methylated mononucleosides and the cap structures. Analysis of the methylated mononucleoside fraction on Aminex A-5 shows only m6A (data not shown). Further characterization of the cap structures is discussed in detail below.

FIG. 3. Resolution of products from 0.4 N KOH digestion of whole poly(A)-containing mRNA on Pellionex WAX. The KOH hydrolyzate was neutralized with HClO4, the precipitate was removed, and the supernatant was dried by lyophilization. The dried sample was dissolved in 5 mM sodium phosphate/7 M urea at pH 7.8 and injected onto a 1/8 inch x 53 cm Pellionex WAX column. The column was developed at room temperature at ~30 ml/hr (600 psi) with a 100-ml 0 - 0.2 M (NH4)2SO4 gradient in 5 mM sodium phosphate/7 M urea at pH 7.8; 1-ml fractions were collected. Oligonucleotide standards were added as markers and detected by absorption at 260 nm.

B. Analysis of Cap-1 and Cap-2 Structures

Cap-1 and cap-2 structures differ from each other basically in the presence of one and two 2'-O-methylnucleosides, respectively. These structures can be resolved in the presence of 7 M urea on DEAE-Sephadex columns or on Pellionex WAX as discussed above. Frequently the resolution obtained on whole cap structures on Pellionex WAX is not satisfactory, even when 7 M urea is added to suppress base-composition differences. Removal of the terminal m7G with nucleotide pyrophosphatase followed by treatment with alkaline phosphatase leaves the mRNA with a 5'-terminal N'm-N"p . . . or N'm-N"m-N'''p . . . , corresponding to cap-1 and cap-2 structures, respectively. Subsequent alkaline hydrolysis of the remaining portion of mRNA produces N'm-N"p and N'm-N"m-N'''p, which are easily separated on Pellionex WAX. Novikoff mRNA, labeled for varying times with L-[methyl-3H]methionine and rigorously purified to eliminate rRNA as described in Section II, was treated (as described above) to yield the internal base-methylated mononucleotide m6Ap, plus N'm-N"p and N'm-N"m-N'''p (Fig. 4).

One of the main objectives of these studies was to examine the relative distribution of cap-1 and cap-2 structures as a function of time of continuous labeling with L-[methyl-3H]methionine. Early experiments were performed at a single labeling time of 13 hours, and indicated a ratio of N'm-N"m-N'''p to N'm-N"p of 1.3 (20). However, after an exposure of only 20 minutes, most of the label in cytoplasmic mRNA was contained in cap-2 structures (Fig. 4). Similar determinations were made at later time points, whereupon the ratio of labeling between cap-1 and cap-2 structures changed. The results of these experiments, plotted in Fig. 5, include times from 20 minutes to 24 hours. It is also instructive to consider the distribution of label between N'm-N"m-N'''p and N'm-N"p relative to m7G after different times of labeling (Table I). In comparison to m7G, most of the label at 20 minutes is found in the 2'-O-methylnucleosides of cap-2. After a few hours, the

TABLE I

Labeling time    m7G    N'm-N"p    N'm-N"m-N'''p
20 min           1      3.1        11.7
3.75 hr          1      1.3        1.0
5.0 hr           1      1.2        0.8
6.25 hr          1      1.0        0.7
12.75 hr         1      0.6        1.0
24.0 hr          1      0.6        1.2

The cpm in pm7G released by nucleotide pyrophosphatase from whole mRNA was quantitated and compared to the cpm in N'm-N"p and N'm-N"m-N'''p produced by phosphatase and KOH hydrolysis of the remaining RNA (cf. Fig. 4).
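The relation between the Table I entries and the labeling ratio discussed in the text can be illustrated with a short calculation. The sketch below uses the 20-minute, 5-hour and 24-hour rows as read from Table I (label relative to m7G = 1); the dictionary layout and function name are illustrative assumptions, and the result is a ratio of radioactivity, not of molecules.

```python
# Relative label (m7G = 1) for the cap-1 product N'm-N"p and the
# cap-2 product N'm-N"m-N'''p, as read from three rows of Table I.
TABLE_I = {
    "20 min": {"cap1": 3.1, "cap2": 11.7},
    "5 hr": {"cap1": 1.2, "cap2": 0.8},
    "24 hr": {"cap1": 0.6, "cap2": 1.2},
}


def cap2_to_cap1_ratio(entry: dict) -> float:
    """Ratio of label in the cap-2 oligonucleotide to that in the cap-1 oligonucleotide."""
    return entry["cap2"] / entry["cap1"]


ratios = {time: round(cap2_to_cap1_ratio(entry), 2) for time, entry in TABLE_I.items()}

for time, ratio in ratios.items():
    print(f"{time}: cap-2/cap-1 label ratio = {ratio}")
```

With these values the ratio is highest at 20 minutes, falls below 1 by 5 hours, and rises again by 24 hours, which is the pattern described for Fig. 5.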

[FIG. 4 panels: mRNA labeled for 20 minutes, 5 hours and 24 hours; abscissa, drop number.]

FIG. 5. Change in ratio of N'm-N"p to N'm-N"m-N'''p in mRNA with time. The data were obtained from mRNA labeled with L-[methyl-3H]methionine for the times indicated and treated as described in the legend to Fig. 4. Dinucleotide contamination from rRNA was determined (cf. Fig. 3), and appropriate corrections were made. The ratio of counts per minute in the two oligonucleotides was calculated.

distribution more nearly approaches, but does not reach, equivalent labeling of each cap component.

C. Distribution of Methylnucleosides in Specific Positions in mRNA

To characterize the 5'-terminus of intact mRNA, methods to release m7G without prior treatment of the RNA were explored. As mentioned earlier and shown in Fig. 2, the 5'-terminal m7G of mRNA can be released by periodate oxidation and β-elimination. However, this reaction in our hands has not been reproducible: all the methylated base removed is m7G, but its removal often does not exceed 75% of the total m7G present, as determined by acid hydrolysis. We have found that nucleotide pyrophosphatase gives more consistent results, releasing >85% of the m7G

FIG. 4. Resolution on Pellionex WAX of KOH digestion products from mRNA previously treated with nucleotide pyrophosphatase and phosphatase. A 1/8 inch x 40 cm column was developed at room temperature with a 100-ml gradient of 0 to 0.2 M (NH4)2SO4 in 7 M urea, 5 mM sodium phosphate (pH 7.7) at a flow rate of 25 ml/hr.

FIG. 6. Acid hydrolysis of the nucleotides released from whole mRNA by nucleotide pyrophosphatase. Whole poly(A)-containing mRNA was treated with nucleotide pyrophosphatase, and the released mononucleotide was separated from the RNA by chromatography on Bio-Gel P-2 (0.02 M NH4HCO3, pH 7.1). The mononucleotide fraction was made 20% in ethanol and evaporated. Acid hydrolysis and Aminex A-5 chromatography were performed as described for Fig. 1.

from intact mRNA after short periods of treatment. Novikoff mRNA, labeled for 24 hours with L-[methyl-3H]methionine, was treated with nucleotide pyrophosphatase, and the released mononucleotide was hydrolyzed with formic acid. Only m7G was found (Fig. 6).

Comparison of the amount of internal m6A relative to m7G in mRNAs indicated that the content of m6A decreased with longer labeling times (data not shown). Since the number of m6A residues present in hnRNA and mRNA at any single labeling time appears to be proportional to the size of the RNA molecule (23, 24), the methyl-labeled RNA molecules isolated after 20 minutes, 5 hours and 12 hours of labeling were examined on denaturing sucrose gradients. As shown in Fig. 7, there is a decrease in the average size of these methyl-labeled mRNAs as a function of labeling time.

FIG. 7. Size of methyl-labeled mRNAs as a function of labeling time. Sedimentation analysis of poly(A)-containing mRNA was performed using 4.8-ml gradients of 5 to 20% sucrose in 99% Me2SO, 10 mM LiCl, 1 mM EDTA. The mRNA was suspended in 100 µl of 91% Me2SO in 10 mM LiCl/1 mM EDTA and heated at 60°C for 2 minutes prior to layering onto the gradient. Centrifugation was for 14.5 hours at 25°C and 45,000 rpm in a Beckman SW 50.1 rotor.

To study intact cap-1 and cap-2 structures, total mRNA was first hydrolyzed with T2 RNase and bacterial alkaline phosphatase and then resolved on DEAE-Sephadex (7 M urea) or on Pellionex WAX columns. Earlier studies indicated that, after a 13-hour label, a large amount (~50%) of the labeled methylnucleotides were located in the mononucleotide fraction as m6A (10), the balance being distributed between cap-1 and cap-2 structures. The total methylnucleoside content of cap-1 and the P1-resistant oligonucleotide from cap-2 structures can be readily assayed by treatment of the isolated cap structures with a mixture of nucleotide pyrophosphatase and bacterial alkaline phosphatase. The separation of methylnucleosides derived from cap-1 structures is presented in Fig. 8. Only results from mRNA obtained at 5 and 24 hours are included, since the amount of radioactivity in cap-1 at 20 minutes is too small to analyze (Fig. 4). The distribution data are presented in Table II. Two important aspects should be mentioned: first, the major change in methylnucleoside composition as a function of time is the increase in Cm content; and

FIG. 8. The distribution of methylnucleosides in cap-1 structures. Cap-1 structures produced by RNase T2 and phosphatase treatment were eluted from DEAE-Sephadex (with 7 M urea) in a volume of 10-20 ml and desalted by absorption on a 1.9 x 42 cm Bio-Gel P-2 column and elution with 0.02 M NH4HCO3. Material in the void volume was made 20% with ethanol and evaporated. Cap-1 structures were then digested with nucleotide pyrophosphatase and phosphatase as described in Section II. The reaction mixture was dried with N2 and dissolved in 125 µl of column buffer. Chromatography on Aminex A-5 (1/8 inch x 90 cm) utilized 0.4 M ammonium formate (pH 4.25), 40% ethylene glycol at 40°C. Flow rate was ~7 ml/hr (2500 psi) until Am was eluted; the rate was then increased to ~12 ml/hr (4750 psi) for the remainder of the run. Fraction size was 10 drops (~0.4 ml) until Cm was eluted; the fraction size was then doubled. (A) Cap-1 from mRNA labeled for 5 hours. Inset is the acid hydrolysis of the same 5-hour cap-1 structure analyzed as in Fig. 1 except that the ammonium formate was at pH 5.3. (B) Cap-1 from mRNA labeled for 24 hours. 2'-O-Methylnucleosides and m7G were added as markers and detected at 260 nm.

second, it appears that all the Am is present as a doubly methylated derivative, N6,2'-O-dimethyladenosine (m6Am). Verification of the m6A content in this nucleoside was obtained by subjecting the same cap to

TABLE II

                                              Percent of total label in
Structure                   Labeling time   Um     Gm     Am     m6Am   Cm     m7G
Cap-1 (m7GpppN'm-N")        5 hr            4.6    14.9   0      28.8   8.7    42.2
                            24 hr           2.4    9.4    3.3    19.6   21.1   44.0
Cap-2 "core" (m7GpppN'm)    5 hr            4.7    11.0   4.1    25.2   13.0   42.0
                            24 hr           1.8    8.0    9.3    20.4   18.4   42.1

                                                                                      Percent of cap-2
Structure                   Labeling time   Um     Gm     Am     m6Am   Cm     m7G    as N"m
Cap-2 (N"m)                 20 min          44     11     28     ND*    15     2      80
                            5 hr            35     21     26     ND*    18     0      3[?]
                            24 hr           36     18     23     0      23     0      26

Whole poly(A)-containing mRNA was digested with RNase T2 and phosphatase. Cap-1 and cap-2 structures were separated on DEAE-Sephadex (7 M urea). After digestion of cap structures with penicillium nuclease, the "core" oligonucleotide and N"m were resolved on a Pellionex WAX column (cf. Fig. 9). The distribution of nucleosides in N"m was determined as in Fig. 10. Core oligonucleotide and cap-1 structures were digested with nucleotide pyrophosphatase and analyzed as in Fig. 8. The data are presented as percentage of the total radioactivity present in the structural position indicated.
* The presence of m6A was not determined for nucleosides in the N"m position.

acid hydrolysis and isolating the free bases produced, as described earlier. Of the label present in m6Am, 50% was detected as m6A (cf. Fig. 8 inset). Similar results were obtained with acid hydrolysis of cap-1 derived from 24-hour mRNA; i.e., most of the Am is found in the form of m6Am.

The methylnucleoside distribution in the N"m position of cap-2 was determined by digesting cap-2 structures with penicillium nuclease to produce m7GpppN'm + N"m. The released N"m was separated from the remainder of the cap structure on Pellionex WAX (Fig. 9), and subsequently assayed on Aminex A-5. After 20 minutes, over 80% of the label in cap-2 is in N"m (Table II). The distribution of methylnucleosides in N"m from cap-2 structures labeled for 20 minutes is shown in Fig. 10. The N"m position of 20-minute-labeled cap-2 appears to be particularly rich in Um, and it contains a significant amount of Am. The overall distribution of methylnucleosides at each specific site of

FIG. 9. Separation of products of penicillium nuclease and phosphatase digestion of a cap-2 structure. The enzymic digestion was performed as described in Section II, diluted to 0.5 ml with H2O and injected onto a Pellionex WAX column (1/8 inch x 32 cm). Nucleosides (N"m) were eluted with 0.1 M ammonium acetate; the buffer was then switched to 6 M ammonium acetate, and the "core" oligonucleotide was eluted. Fraction size was 1 ml, and flow rate was 1 ml/1.7 minutes.

FIG. 10. Separation of N"m nucleosides by Aminex A-5 chromatography. The N"m nucleoside fraction from Fig. 9 was lyophilized, dissolved in 0.4 M ammonium formate pH 4.25 in 40% ethylene glycol, and chromatographed as described for Fig. 8.

Novikoff mRNA methylation after 20 minutes, 5 hours and 24 hours of continuous labeling with L-[methyl-3H]methionine is shown in Table II.

IV. Discussion

Earlier studies provided information on the qualitative composition of methylnucleosides in cytoplasmic mRNA at fixed labeling times. The presence of two different types of cap structures, however, raised the possibility of a time-dependent formation (and degradation) of specific methylated sequences. In an attempt to examine this possibility, we labeled Novikoff cells for various periods of time with L-[methyl-3H]methionine, purified the cytoplasmic mRNA and determined the level of methylnucleoside labeling at specific sites within the mRNA.

Acid hydrolysis releases only two methylated purines from mRNA, m7G and m6A. When total mRNA is used, the m6A obtained is a sum of the m6A located internally in the mRNA molecule plus that present in the cap as m6Am. Alternatively, the mRNA can first be hydrolyzed to mononucleotides and methylated oligonucleotide cap structures, each of which can subsequently be analyzed separately for methylnucleoside content and distribution. High-speed, high-resolution column chromatography is an efficient analytical technique for these determinations, since most separations can be accomplished in 60-90 minutes, recoveries are quantitative and individual labeled methylnucleosides or methylnucleotides can be identified by the inclusion of appropriate UV-absorbing standards. Methyl-labeled components are determined by collecting samples directly into scintillation vials.

Care must be exercised to avoid alkaline pHs, since m7G readily forms isocytosine derivatives under such conditions. Excessive periods of enzymic digestion with venom diesterase or alkaline phosphatase near pH 8 are sufficient to cause partial ring opening of m7G, which will then appear near Um on the cation-exchange resin, Aminex A-5. Acid hydrolysis to the level of the free base followed by chromatography on Aminex A-5 provides an accurate measurement of the ring-opened form of m7G (Fig. 2).

The separation of intact cap-1 and cap-2 on Pellionex WAX occasionally yields more than the two major oligonucleotide peaks predicted, even when alkaline pH is avoided. One likely explanation for this result is the high resolving power of Pellionex WAX resin. Even in the presence of 7 M urea, base compositional effects do not appear to be completely suppressed. This results in partial separations of individual cap structures. Therefore, the preferred method for determining the relative amounts of cap-1 and cap-2 structures in a mixture of mRNA molecules involves converting them to N'm-N"p and N'm-N"m-N'''p, respectively, followed by column separation (Fig. 4).

Controlled digestion with nucleotide pyrophosphatase can be used to remove the terminal m7G from the intact mRNA. This indicates that the caps are exposed and accessible to the enzyme. Removal of m7G by this enzymic method was faster and more reproducible than by periodate oxidation and β-elimination. This method of cap analysis has proved to be reliable and, in addition, provides a separate analysis of radioactivity in the m7G portion of the cap structure.

Care must be taken to eliminate rRNA accompanying mRNA purified on oligo(dT)-cellulose, since the rRNA is a source of extraneous N'm-N"p oligonucleotides.

The rapid labeling of cap-2 structures relative to cap-1 (Fig. 4) and relative to m7G (Table I) is interesting. Earlier studies on hnRNA methylation (18) show that these nuclear molecules contain internal m6A and only one type of cap structure, cap-1, which appears to be identical to the cap-1 found in cytoplasmic mRNA. These results on hnRNA and the data on the kinetics of labeling of cytoplasmic mRNA cap-1 and cap-2 structures are compatible with a model in which m7G, N'm and the internal m6A are all products of nuclear methylation events, followed by cytoplasmic methylation of N" to yield N"m. Thus, after short labeling times, mRNA molecules bearing cap-1 termini methylated earlier in the nucleus with nonradioactive methyl precursors reach the cytoplasm and are there methylated with radioactive methyl groups at N". With longer labeling times, the cap-1 structures reaching the cytoplasm also contain radioactive methyl in m7G and N'm, and the ratio of radioactive cap-2 to cap-1 decreases. The slow but eventual rise of the ratio of cap-2 to cap-1 (Fig. 5) in the continuing presence of label could represent mRNA turnover with selective preservation of cap-2-containing structures or a slow cytoplasmic conversion of cap-1 to cap-2.

It is also interesting to note that the average size of a mixed population of methyl-labeled cytoplasmic mRNAs apparently becomes smaller at longer labeling times (Fig. 7). Such a reduction in average size of mRNA probably reflects a loss of large mRNAs containing a proportionately higher number of internal m6A residues per molecule.

Treatment of cap structures with nucleotide pyrophosphatase plus alkaline phosphatase produces nucleosides that are readily resolved on Aminex A-5 (Fig. 8). This procedure permits a comparison of methylnucleoside distribution between cap structures as a function of time, as shown for cap-1 structures in Fig. 8, and also enables one to compare the composition of N'm in cap-1 to that in cap-2 at a given time. As a function of increased time of continuous labeling, it is apparent from Fig. 8 that the relative distribution between Um, Gm and m6Am is nearly the same while the amount of Cm increases to the point where it represents a significant amount of the label in N'm. Thus the distribution of methylnucleosides in the N'm position of cap-1 structures changes as a function of time. Also it should be noted that essentially all the material eluting as Am present in cap-1 structures at both 5 and 24 hours exists as the doubly methylated nucleoside, m6Am. This can be concluded from parallel experiments in which a portion of the cap structure is hydrolyzed with acid, producing free purines. The N6-methyladenine resulting from this hydrolysis accounts for 50% of the label initially present in m6Am. Whether or not this modification is exclusively a nuclear event is difficult to determine at this time. It should be pointed out, however, that m6Am appears in cap structures obtained from hnRNA, suggesting that at least part of this modification occurs in the nucleus (18).

The distribution of methylnucleosides in N" after 20 minutes indicates that all four nucleosides are represented at this site in cap-2 (Fig. 10 and Table II). There appears to be a significant amount of labeled Am in this position of Novikoff cap-2 structures at early times. Also, the predominant N"m at early times is Um with only small amounts of Cm. It is interesting to compare the methylnucleoside composition in cap-1 to that of cap-2 "core," generated by removal of N"m from cap-2. If a completely different subgroup of mRNA molecules with a unique pattern of methylation at N'm were being selected for conversion to cap-2 structures, one might expect to see differences in N'm composition between cap-1 and cap-2. As can be seen from the data in Table II, no significant differences were observed in the N'm position at either 5 or 24 hours. In fact, the correspondence between the methylnucleosides in cap-1 and cap-2 core at each time point is remarkably close, even reflecting the increase in Cm composition with labeling time. This close correspondence in N'm composition was not observed in similar studies on L-cell mRNA (23).
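The 50% figure cited above for m6Am can be rationalized with a simple label-accounting sketch. It assumes, as a simplification not stated in the text, that every methyl group carries the same specific activity and that acid hydrolysis recovers only the N6-methyl label with the free base, while the 2'-O-methyl label stays with the ribose-derived material. The function names are illustrative.

```python
def base_label_fraction(m6am_fraction: float) -> float:
    """Expected fraction of label recovered as N6-methyladenine.

    m6am_fraction is the fraction of the Am-eluting nucleoside that is
    doubly methylated (m6Am, two methyl groups); the rest is singly
    methylated Am (one 2'-O-methyl group). Per molecule, the base carries
    m6am_fraction labeled methyls out of (1 + m6am_fraction) in total.
    """
    if not 0.0 <= m6am_fraction <= 1.0:
        raise ValueError("m6am_fraction must lie between 0 and 1")
    return m6am_fraction / (1.0 + m6am_fraction)


def m6am_fraction_from_base_label(base_fraction: float) -> float:
    """Invert the relation: infer the m6Am fraction from the observed base label."""
    if not 0.0 <= base_fraction <= 0.5:
        raise ValueError("base_fraction must lie between 0 and 0.5")
    return base_fraction / (1.0 - base_fraction)


print(base_label_fraction(1.0))            # all Am present as m6Am
print(m6am_fraction_from_base_label(0.5))  # observed 50% of label in the base
```

Under these assumptions, recovery of half of the label as N6-methyladenine corresponds to all of the Am-eluting material being m6Am, whereas a mixture containing singly methylated Am would give less than half.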

V. Summary

The use of enzymes for selective hydrolysis, coupled with high-resolution liquid chromatography for assay of products, provides an efficient means of determining the specific patterns of methylation in eukaryotic mRNA molecules. Continuous labeling with levels of L-[methyl-3H]methionine that permit normal growth of Novikoff cells was used to examine the methylation of specific sites of cytoplasmic mRNA as a function of time. After only brief exposure, the main site of cytoplasmic mRNA labeling is at the second position (N") of the 5'-terminal sequence. Data obtained by comparing the methylnucleoside composition of these sequences and the ratio of doubly to singly O-methylated termini (cap-2 to cap-1) as a function of labeling time are consistent with a model in which m7G, N'm and the m6A located in the mRNA molecule are all products of nuclear methylation events. Subsequently there is a cytoplasmic methylation of some singly O-methylated structures at the second (N") position yielding the doubly O-methylated structure. The kinetics of methyl labeling and the changing composition within the caps show a distinct pattern, possibly reflecting a selection or enrichment of a stable class of mRNA molecules, many of which contain the doubly labeled structure at their 5'-terminus and are of smaller size.

ACKNOWLEDGMENTS

We wish to express our sincere appreciation to Marian Dovmberg for her contribution to portions of this work, and to Sarah Stuart and Arlen Thomason for their critical reading of the manuscript. This work was supported by Public Health Service Research Grant CA 13175 from the National Cancer Institute.

REFERENCES

1. R. P. Perry and D. E. Kelley, Cell 1, 37-42 (1974).
2. R. Desrosiers, K. Friderici and F. Rottman, PNAS 71, 3971-3975 (1974).
3. A. J. Shatkin, PNAS 71, 3204-3207 (1974).
4. C. M. Wei and B. Moss, PNAS 71, 3014-3018 (1974).
5. Y. Furuichi, NARes. 1, 809-822 (1974).
6. F. Rottman, A. J. Shatkin and R. Perry, Cell 3, 197-199 (1974).
7. R. P. Perry, D. E. Kelley, K. Friderici and F. Rottman, Cell 4, 387-394 (1975).
8. J. M. Adams and S. Cory, Nature 255, 28-33 (1975).
9. Y. Furuichi, M. Morgan, A. J. Shatkin, W. Jelinek, M. Salditt-Georgieff and J. E. Darnell, PNAS 72, 1904-1908 (1975).
10. R. Desrosiers, K. Friderici and F. Rottman, Bchem 14, 4367-4374 (1975).
11. Y. Furuichi and K. Miura, Nature 253, 373-375 (1975).
12. C. M. Wei and B. Moss, PNAS 72, 318-322 (1975).
13. Y. Furuichi, S. Muthukrishnan and A. J. Shatkin, PNAS 72, 742-745 (1975).
14. S. Moyer, G. Abraham, R. Adler and A. K. Banerjee, Cell 5, 59-67 (1975).
15. J. Keith and H. Fraenkel-Conrat, FEBS Lett. 57, 31-33 (1975).
16. Y. Furuichi, A. J. Shatkin, E. Stavnezer and J. M. Bishop, Nature 257, 618 (1975).
17. C. M. Wei, A. Gershowitz and B. Moss, Bchem 15, 397-401 (1976).
18. R. P. Perry, D. E. Kelley, K. H. Friderici and F. M. Rottman, Cell 6, 13-19 (1975).
19. M. Salditt-Georgieff, W. Jelinek, J. E. Darnell, Y. Furuichi, M. Morgan and A. Shatkin, Cell 7, 227-237 (1976).
20. T. Munns, K. Podratz and P. Katzman, Bchem 13, 4409-4416 (1974).
21. L. Pike and F. Rottman, Anal. Biochem. 61, 367-378 (1974).
22. J. A. Steitz and K. Jakes, PNAS 72, 4734-4738 (1975).
23. R. P. Perry and D. E. Kelley, Cell (1976) (in press).
24. M. Salditt-Georgieff, W. Jelinek, J. E. Darnell, Y. Furuichi, M. Morgan and A. Shatkin, Cell 7, 227-237 (1976).

Structural and Functional Studies on the "5'-Cap": A Survey Method for mRNA¹

HARRIS BUSCH, FRIEDRICH HIRSCH, KAUSHAL KUMAR GUPTA, MANCHANAHALLI RAO, WILLIAM SPOHN AND BENJAMIN C. WU

Department of Pharmacology
Baylor College of Medicine
Houston, Texas

I. Introduction

The extensive literature on the "5'-cap" (cap) has already been the subject of major reviews (1) and of other reports in this symposium.² In our laboratory, a series of investigations on the types and structures of low-molecular-weight RNA species of the nucleolus and the nucleus led to the discovery of the structure of some of these molecules (Fig. 1), which in turn clarified some findings with respect to the 5'-terminal structure of mRNA of viruses and eukaryotic cells (1-4). The results obtained thus far have led to a much clearer view of the cap as a special region that may be important for controls of cell function, as a target for future drug development for chemotherapy and hormone action, and for understanding of the incredible fidelity of the translational systems involved in protein synthesis.

The present report deals with the following points: (a) a comparison of the nuclear and cytoplasmic messenger RNAs with respect to translational activities and their content of the cap; (b) the probable allosteric nature of the interaction of the cap and its associated protein; (c) the potential usefulness of the information derived from these studies in the development of a survey system for quantitative and qualitative analysis of mRNAs in tissues; (d) some new approaches to studies on mRNA(r-prot) (r-prot = ribosomal proteins); and (e) some new studies on inhibitors of cap function.

¹ These studies were supported by the Cancer Research Center Grant CA-10893 awarded by the National Cancer Institute, the Davidson Fund, the Wolff Memorial Foundation and a generous gift from Mrs. Jack Hutchins.
² See articles by Moss et al., Furuichi et al., and Rottman et al. in this volume.

FIG. 1. Derivation of the primary sequence of U2 low-molecular-weight nuclear RNA. The cap was derived from fragments T27 and P25 (3-6).

The translational activity of nuclear RNA(A,,) is less than that of cytoplasmic mRNA ( 7 ) . Low translational activities of hnRNA have also been reported (8-11). The hnRNA of liver has a low translational capacity in the Krcbs ascites cell-free system (10, 11);the corresponding activity of cytoplasmic mRNA was not determined. The findings ( 8 ) that hnRNA from adenovirus-infected cells could be translated only in the presence of nuclei suggested that further processing by the nuclei was required. It has not been generally shown that the nuclear RNA(A,)3 has a low translational activity. O h difficult problem in such studies has been contamination of nuclear preparations with cytoplasmic elements. After Perry e t al. (12) reported that 5040%of hnRNA contained a cap, Brandhorst and McConkey (13) indicated that the nuclear isolation method of Perry et al. (12) resulted in contamination of the hnRNA with cytoplasmic RNA(A,,). They also indicated that 90% of hnRNA turns over within the nuclei and does not enter the cytoplasm. Accordingly, it seemed important in these present studies to compare results with nuclei of normal liver, for which elegant methods have been developed ( 1 4 ) , with those from the Novikoff hepatoma, for which several nuclear isolation methods have been described (IP17). In the present studies, the nuclear RNA(A,,) from Novikoff hepatoma cells labeled with 32Pwas found to contain only 16-25% as much labeled cap as does cytoplasmic RNA( A,,). Similarly, cytoplasmic RNA( A,, ) labeled with KBJH, after periodate oxidation had a 4-7-fold greater percentage of the cap than the nuclear RNA( A,,); it would appear that the difference in translational activity of the cytoplasmic and nuclear RNA( A,) is related in part to the higher content of the cap in the former. Highly reproducible translational studies have been made recently with a variety of mRNA species using the 30,000 x g supernatant fraction of wheat germ extracts (18-21). 
With improved methods for isolation of polysomal RNA(An) on poly(A)-binding columns, it has been possible to demonstrate differences in translational capacities of RNA of differing tissues (18-21) and cellular fractions, such as the nuclear and cytoplasmic RNA(An) (22). Inasmuch as earlier studies from this laboratory have shown differences in the nuclear proteins of tumors and other tissues (23), the present studies were designed to determine whether differences in the mRNA translation products would be demonstrable in tumors and other tissues. For this purpose, the two-dimensional system developed earlier (24) for separation of nucleolar and ribosomal proteins

³ RNA(An) is a shorthand notation for RNA-An or RNA-poly(A) [RNA with poly(A) at the 3' terminus], sometimes referred to as poly(A)+ RNA, in contrast to poly(A)- RNA, which is RNA without a poly(A) at the 3' terminus.

HARRIS BUSCH ET AL.

was employed. It had a number of advantages in this study because many of the proteins synthesized with tumor mRNA in vitro comigrated with previously mapped proteins of the 40 S ribosomal subunits (25).

II. Results

A. Comparison of the Nuclear and Cytoplasmic Messenger RNAs with Respect to Translational Activities and Their Content of the Cap

Table I shows that cytoplasmic RNA(An) from both normal liver and Novikoff hepatoma cells had approximately 20 times the translational activity of the nuclear RNA(An). Similar results were obtained with nuclei isolated by the NP-40 method (7), the citric acid method, and the method employing Ivory detergent (16). The material isolated from liver nuclei prepared by the sucrose/Ca2+ method (14) had essentially the same translational activity as that from the tumor nuclei. Thus, nuclear RNA(An) had a translational activity of 3-4 pmol of [3H]leucine incorporated per microgram of RNA irrespective of the isolation procedure (7); this value agrees well with that of Grozdanovic and Hradec (10, 11). The activity of liver nuclear RNA(An) did not increase after Pronase treatment and reextraction with phenol or rebinding to poly(U)-Sepharose columns after heating and quick cooling. Accordingly, the low translational activity is not due to associated inhibitor proteins or double-stranded RNA, which have been shown to be inhibitory for translation in vitro (26, 27).

TABLE I
Translational activity (pmol of [3H]leucine incorporated/µg RNA)(a)

Source of RNA(An) and treatment     Novikoff     Liver
Cytoplasm                             62.0        85.0
Nuclei
  Control                              3.2         3.8
  Pronase                               -          3.2
  Heat and quick-cool                   -          3.6

(a) The RNA samples were incubated with 0.2 mg protein of wheat germ S30 fraction (18) in a total volume of 41.5 µl containing: ATP, 1 mM; GTP, 0.02 mM; creatine phosphate, 8 mM; creatine phosphokinase, 1.6 µg; KCl, 84 mM; Mg(OAc)2, 3 mM; 19 amino acids, 0.02 mM each; [3H]leucine, 20 µCi; dithiothreitol, 2 mM; and Hepes buffer (pH 7.6), 28 mM. Incubations were at 25° for 90 minutes; 12.5-µl samples were then spotted on filter paper disks for radioactivity analysis. One portion (50 µg) of the liver nuclear RNA was treated with 10 µg of Pronase at 37°C for 15 minutes, reextracted with phenol and reprecipitated with 2 volumes of ethanol before assay for translational activity. Another equivalent portion of liver nuclear RNA was dissolved in 0.2% Na dodecyl sulfate and heated at 60°C for 2 minutes, quick-cooled on ice-alcohol (-20°) and repurified by rechromatography on a column of poly(U)-Sepharose 4B.

B. Quantitative Analysis of the Cap

Since the cap is necessary for the translational activity of mRNA (22, 28), quantitative analysis of the cap in both the nuclear and cytoplasmic RNA isolates was made. For this purpose, Novikoff hepatoma cells were labeled in vitro for 4 hours with [32P]orthophosphate (15) and RNA(An) was isolated from both the detergent-treated nuclei and cytoplasm. The RNA(An), purified by heating to 60°C (2 minutes) and rebinding to a second poly(U)-Sepharose column (29), was digested completely with T2 RNase and U2 RNase (22), and the oligonucleotides were separated on DEAE-Sephadex columns. Figure 2A shows the profile of the combined T2 and U2 RNase digest of cytoplasmic RNA(An), eluted with a 0.05 to 0.5 M NaCl gradient; 0.41% of the total radioactivity was eluted in the -4.5 to -5 charge region (Table II), which contains the cap (22, 29). Less than 0.1% of the radioactivity was eluted in the -3 and -4 charge peaks, which are characteristic of rRNA (29). When heating and quick-cooling was not used to purify the mRNA, the radioactivity in the -3 and -4 charge peaks was similar to that of the peak containing the cap. The 0.41% of the radioactivity in the cap region corresponds to the presence of the cap in all molecules with an average chain length of approximately 1000-1500 nucleotides, assuming that the cap contains five phosphate residues [m7G(5')ppp(5')N'mpN''p]. This result is in agreement with the range of sedimentation of the RNA(An) in the 8-18 S region of the sucrose density gradient (Fig. 3A). The corresponding elution pattern of the combined T2 and U2 RNase digest of the nuclear RNA(An) labeled for 4 hours with [32P]orthophosphate is shown in Fig. 2B. Only 0.11% of the total radioactivity eluted in the peaks containing the cap (Table II). This nuclear material had a sedimentation in the 8-18 S region in the formamide gradient similar to that of the cytoplasmic RNA(An) (Fig. 3). Thus, the nuclear material contains only 27% of the theoretical amount of isotope in the cap, by comparison with 100% for the cytoplasmic material from molecules of chain lengths of 1000-1500 nucleotides.
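The stoichiometry quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical and not part of the original study, and it assumes uniform 32P labeling of every phosphate, one phosphate per nucleotide, and five phosphates per cap as stated in the text.

```python
# Illustrative arithmetic only; names are hypothetical, values are those quoted in the text.
CAP_PHOSPHATES = 5  # phosphates assumed per cap, as stated in the text


def chain_length_per_cap(cap_percent, cap_phosphates=CAP_PHOSPHATES):
    """Phosphates (roughly one per nucleotide) per cap, assuming uniform 32P labeling."""
    return cap_phosphates / (cap_percent / 100.0)


def relative_cap_content(sample_percent, reference_percent):
    """Cap content of a sample as a percentage of a fully capped reference."""
    return 100.0 * sample_percent / reference_percent


cytoplasmic = 0.41  # % of radioactivity in the cap region, cytoplasmic RNA(An)
nuclear = 0.11      # % of radioactivity in the cap region, nuclear RNA(An)

print(round(chain_length_per_cap(cytoplasmic)))           # about 1220 nucleotides
print(round(relative_cap_content(nuclear, cytoplasmic)))  # about 27 (% of cytoplasmic)
```

Under these assumptions the cytoplasmic value falls inside the 1000-1500 nucleotide range, and the nuclear value reproduces the 27% figure.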

FIG. 2. DEAE-Sephadex chromatography of T2 and U2 RNase digests of nuclear RNA(An) from Novikoff hepatoma cells. Cytoplasmic (A) and nuclear (B) RNA(An) labeled for 4 hours in cells in tissue culture medium with 200 mCi of [32P]orthophosphate was purified on poly(U)-Sepharose 4B and digested with T2 RNase and U2 RNase (5 and 4 units, respectively, per 50 µg of RNA) at 37°C for 6 hours in 0.1 M ammonium acetate buffer, pH 4.5. The digests were chromatographed on a 0.5 x 30 cm column of DEAE-Sephadex in 0.05 M Tris-Cl, pH 7.6, and 7 M urea. The oligonucleotides were eluted (4-ml fractions) with a 0.05 M to 0.5 M gradient of NaCl in 0.05 M Tris-Cl, pH 7.6, and 7 M urea. The dotted line represents the elution pattern of a complete T1 RNase digest of yeast tRNA and A-A, which elutes at -1 charge. (A) Cytoplasmic RNA(An) labeled for 4 hours. (B) Nuclear RNA(An) labeled for 4 hours.

TABLE II
Percentage radioactivity(a)

                              Charge
Source of RNA         -2       -3       -4     -4.5 to -5.5
Cytoplasm (4 hr)     99.48     0.1      0.01       0.41
Nuclear (4 hr)       99.78     0.08     0.03       0.11

(a) Nuclear and cytoplasmic DEAE-Sephadex columns as described for Fig. 1. Percentages of the total radioactivity were calculated for the individual peaks.

FIG. 3. Sucrose density gradient profile of cytoplasmic and nuclear RNA(An) in aqueous and formamide gradients, respectively. (A) Cytoplasmic RNA(An) labeled with 32P for 4 hours was centrifuged in a 5 to 45% sucrose gradient in 0.14 M NaCl, 0.01 M acetate buffer, pH 5.1, and 0.01 M EDTA in a SW 27 rotor at 4°C for 17 hours at 26,000 rpm. (B) Nuclear RNA(An) labeled with 32P for 4 hours was centrifuged on 5 to 25% sucrose in 70% formamide, 3 mM EDTA and 3 mM Tris-Cl, pH 7.5, in a SW 41 rotor for 20 hours at 39,000 rpm at 20°C. The positions of the peaks of 4 S, 18 S and 28 S are shown as markers.

C. Effects of S-Adenosylmethionine and S-Adenosylhomocysteine on Translation

Figures 4 and 5 show the lack of effect of exogenous S-adenosylmethionine (AdoMet) or S-adenosylhomocysteine (AdoHcy) on the translational activities of varying concentrations of nuclear and cytoplasmic RNA(An) from liver and Novikoff hepatoma cells. Addition of AdoMet at a concentration of 4 µM to the wheat-germ incubation system did not enhance the translational activity of either RNA (Fig. 4). This result showed that the translatable mRNA species were already methylated or that the endogenous AdoMet in the wheat-germ S-30 fraction itself is sufficient for methylation of the mRNA (30). AdoHcy (160 µM) produced no significant reduction in translational activity of either nuclear or cytoplasmic RNA(An). Similarly, AdoMet and AdoHcy had no effect on the translational activities of the liver and tumor cytoplasmic substances at different times of incubation (Fig. 5). These results indicate that there is no significant pool of unmethylated cytoplasmic RNA(An) in either liver or Novikoff hepatoma cells, that little if any metabolic turnover of the cap occurs during these translational assays, and that the low translational activity of nuclear RNA(An) is not due to undermethylation.

FIG. 4. Effects of S-adenosylmethionine (AdoMet) and S-adenosylhomocysteine (AdoHcy) on the translational activity of nuclear and cytoplasmic RNA(An). The translational activity of nuclear and cytoplasmic mRNA(An) was assayed in the wheat-germ S-30 fraction as described in the legend of Table I in the presence of either 4 µM S-adenosylmethionine or 160 µM S-adenosylhomocysteine. 0-0, Control; O--O, AdoMet; A--A, AdoHcy. (A) Novikoff cytoplasmic RNA(An). (B) Liver cytoplasmic RNA(An). (C) Liver nuclear RNA(An).

FIG. 5. Kinetics of translational activity of cytoplasmic RNA(An) from liver and Novikoff hepatoma cells in the presence of S-adenosylmethionine (AdoMet) and S-adenosylhomocysteine (AdoHcy). The translational activities of cytoplasmic RNA(An) were assayed for different time periods in the wheat-germ system under the conditions described in the legend of Table I. 0-0, Control; 0-0, with 4 µM AdoMet; A-----A, with 160 µM AdoHcy. (A) Novikoff cytoplasmic RNA(An). (B) Liver cytoplasmic RNA(An).

D. The Cap of Ovalbumin mRNA

For studies on a specific mRNA, highly purified ovalbumin mRNA, generously provided by Drs. S. L. C. Woo and B. O'Malley, was labeled in vitro with KB3H4 after periodate oxidation. DEAE-Sephadex chromatography of the T2 and U2 RNase digest from the labeled RNA (Fig. 6) shows a zero-charge peak, in which 30-40% of the isotope was in the trialcohol derivative of adenosine (6); the peaks of charge -5 and -6 contain the cap (6). The nucleotide pm7G was released with nucleotide pyrophosphatase or venom phosphodiesterase (Figs. 6 and 7).

FIG. 6. DEAE-Sephadex chromatography of ovalbumin mRNA digested with T2 and U2 RNase. RNA, 100 µg, was oxidized with 2.5 mM NaIO4 in 0.1 M potassium acetate buffer (final volume 100 µl), pH 5.1, at 0°C for 60 minutes in the dark. After incubation, the excess periodate was destroyed by adding ethylene glycol to a final concentration of 1%. The oxidized RNA was precipitated with ethanol, redissolved in 0.1 M potassium phosphate buffer, pH 7.7, and reduced with a 2000-fold molar excess of KB3H4 (18.6 Ci/mmol) at 0°C for 3.5 hours in the dark. The RNA was precipitated with ethanol, then separated from the unreacted KB3H4 by passing it through a Sephadex G-25 column (0.5 x 40 cm) in 0.01 M sodium acetate buffer, pH 5.1. The purified RNA was digested with T2 and U2 RNase (5 and 4 units, respectively, per 50 µg RNA) at 37°C for 6 hours in 0.1 M ammonium acetate buffer, pH 4.5. The digests were chromatographed on a 0.5 x 30 cm column of DEAE-Sephadex in 0.05 M Tris-Cl, pH 7.6, and 7 M urea. The dotted line represents the elution pattern of a complete T1 RNase digest of yeast tRNA.

E. The Role of the Cap

At present, the combination of methylated nucleotides and pyrophosphate linkages that characterize the cap of low-molecular-weight RNA and RNA(An) provides little insight as to the role of this structure. One suggestion is that the cap may serve for interaction with allosteric binding sites of initiation factors (1, 31, 32). Another possibility is that either cleavage of the pyrophosphate linkage or methylation of the cap might be involved in control of initiation. The present experiments show that in the wheat-germ system the addition of neither AdoMet nor AdoHcy alters translational activity; this evidence indicates that there is little, if any, metabolic turnover of the cap compared to the remainder of the cytoplasmic RNA(An). Thus, these studies support the concept that the cap associates with an allosteric binding site during initiation of protein synthesis.

FIG. 7 (left). Identification of pm7G in cap-1 of ovalbumin mRNA by electrophoresis at pH 3.5. Portions of the radioactive material from the cap-1 peak (-5 charge) (Fig. 6) were digested either with snake venom phosphodiesterase or nucleotide pyrophosphatase and subjected to electrophoresis on Whatman No. 3 paper at pH 3.5. The paper was cut into 1.0-cm strips from the origin and counted in the liquid scintillation counter after digestion with NCS solubilizer. The positions of a 2' and 3' mixture of known mononucleotides and a sample of 7-methylguanosine monophosphate are shown. pm7G comigrates with its trialcohol derivative after labeling with KB3H4 (data not shown). (A) Untreated cap-1 peak. (B) Cap-1 peak treated with nucleotide pyrophosphatase. (C) Cap-1 peak treated with venom phosphodiesterase.

FIG. 7 (right). Identification of pm7G in cap-2 of oviduct mRNA by electrophoresis at pH 3.5 after nucleotide pyrophosphatase and venom phosphodiesterase digestion. Portions of the radioactive material from the cap-2 peak (-6 charge) (Fig. 6) were treated and analyzed as described in the legend for Fig. 7 (left). (A) Untreated cap-2 peak. (B) Cap-2 peak treated with nucleotide pyrophosphatase. (C) Cap-2 peak treated with snake venom phosphodiesterase.

F. Survey System for Quantitative and Qualitative Analysis of mRNAs in Tissues

The translational activities of the RNA(An) of the Novikoff hepatoma and normal liver ranged from 60 to 80 pmol of leucine incorporated per microgram of mRNA. These values are comparable to those obtained with highly purified ovalbumin mRNA, provided by Drs. B. O'Malley and S. L. C. Woo. To visualize the products of translation of the mRNA, autoradiography of the [35S]methionine-labeled proteins was carried out for varying times ranging from 3 to 6 days (Figs. 8 and 9). Figure 9 shows many dense spots in the tumor pattern, mainly in the A region. The resemblance of these patterns to those obtained earlier (Fig. 10) for ribosomal proteins led to studies on their comigration. Many of the spots observed in the 40 S ribosomal subunit pattern (25) were readily observed in the tumor pattern, but only faintly discerned in the pattern for normal liver (Fig. 8). With shorter time exposures (Fig. 8), three major spots were visualized in the liver pattern, i.e., M1, M2 and BX. Some dense spots were also noted in the C region. Two dense spots, M1 and M2, of the liver pattern (Fig. 8) were found in the tumor pattern (Fig. 9). The dense spot BX was not observed in the tumor pattern, but in this region there were two small spots (26 and 27) that are part of an oblong ring noted in this figure (Fig. 9).

FIG. 8. Autoradiogram of a two-dimensional gel electrophoresis run of [35S]methionine-labeled proteins synthesized in vitro by the wheat-germ system to which was added mRNA(An) of rat liver. (A) Three-day exposure. (B) Six-day exposure.

FIG. 9. Two-dimensional gel patterns of [35S]methionine-labeled proteins synthesized in vitro by the wheat-germ system to which mRNA(An) of Novikoff hepatoma was added; 6-day exposure.

FIG. 10. Diagrammatic representation of the Coomassie Brilliant Blue-stained spot patterns of the comigrating proteins from ribosomal 40 S subunit of Novikoff hepatoma. The methods are the same as those of Prestayko et al. (25). The migration of the dotted spots S1 to S6 corresponded to the radioactive spots numbered S1-S6 in Fig. 9.

G. Quantitative Comparisons of Labeling of Specific Proteins of Liver and Tumor

In view of the apparent fidelity of the wheat-germ translational system and the linearity of labeling with respect to mRNA concentration (2, 18-21, 33), the quantitative differences in the labeling of the various protein spots for the tumor and liver mRNA presumably reflect differences in the amounts of the various mRNA species present. In the liver pattern, the largest percentage of total isotope, i.e., 40%, was in proteins M1 and M2 (Table III). This value is significantly greater than the percentage, 17%, of the total isotope in proteins M1 and M2 of the tumor. In the proteins comigrating with those of 40 S ribosomal subunits (Fig. 10, S1-S6), the tumor pattern contained 26% of the total isotope incorporated, compared to 5% in the liver. The 5-fold greater labeling of proteins S1-S6 in the tumor is statistically significant (Table III); i.e., the P value is <0.001. In control incubations with the wheat-germ system without added RNA(An), incorporation of radioactivity was only 2-5% that of systems containing RNA(An). Autoradiography of the protein gels showed only a few faint spots at exposure times of 3-6 days. These results indicate that wheat germ has a very low endogenous mRNA content. To determine whether other growing tissues exhibit similar increases in mRNA(r-prot), studies were made with 18-hour regenerating liver (Fig. 11). A substantial increase was observed in labeling of rapidly moving

TABLE III
% [35S]Methionine(a)

Spot no.      Liver             Novikoff hepatoma     P value
M1            29.33 ± 3.35      12.98 ± 1.75          <0.01
M2            10.82 ± 0.92       4.58 ± 0.56          <0.01
BX (27)        8.05 ± 0.94       1.36 ± 0.27          <0.01
S1             0.69 ± 0.06       3.21 ± 1.28           0.05
S2             1.27 ± 0.34       6.37 ± 1.34          <0.01
S3             1.53 ± 0.31       6.22 ± 0.53
S4             0.91 ± 0.21       3.44 ± 0.81
S5             0.87 ± 0.19       4.28 ± 0.21
S6             0.33 ± 0.07       2.41 ± 0.49
S1-6           5.02 ± 0.61      25.92 ± 2.42

(a) [35S]Methionine-labeled proteins synthesized in the wheat-germ cell-free system in the presence of liver or Novikoff hepatoma RNA(An) were precipitated with 2 volumes of ethanol from the postribosomal fraction of the incubation mixture. Proteins were separated on 10% and 12% two-dimensional polyacrylamide gels. Radioactive spots were cut from the gels and digested completely in 30% H2O2 plus 60% HClO4 (2:1, v/v) at 70°C for 48 hours. Aquasol was used as a scintillation fluid. An internal standard was used to check for possible quenching. Spots were cut out of the gels with a sharp blade and used for radioactive counting. Each value is the mean ± SE of 4-6 experiments.

FIG. 11. Autoradiogram of a two-dimensional gel electrophoresis run of [35S]methionine-labeled proteins synthesized in vitro by the wheat-germ system to which was added mRNA(An) of 18-hour regenerating liver.

proteins, as was also noted in the Novikoff hepatoma pattern. A key question currently under study is the kinetics of these events. It will be interesting to determine the earliest times at which mRNA(r-prot) begins to increase.
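The tumor-to-liver comparisons in this section reduce to simple ratios of the percentages of total isotope quoted in the text. The sketch below is illustrative only: the dictionary layout and function name are hypothetical, and it relies on the text's own assumption that labeling is proportional to the amount of each mRNA species.

```python
# Illustrative arithmetic only; names are hypothetical, percentages are those quoted in the text.
percent_of_isotope = {
    "M1 + M2": {"liver": 40.0, "tumor": 17.0},
    "S1-S6": {"liver": 5.0, "tumor": 26.0},
}


def tumor_to_liver_ratio(group, table=percent_of_isotope):
    """Ratio of tumor to liver labeling for one group of protein spots."""
    row = table[group]
    return row["tumor"] / row["liver"]


for group in percent_of_isotope:
    # A ratio above 1 means the spots are more heavily labeled with tumor mRNA.
    print(group, f"{tumor_to_liver_ratio(group):.2f}")
```

With these figures the S1-S6 group gives a ratio of about 5, in line with the 5-fold difference stated above, whereas M1 and M2 are labeled less heavily with tumor mRNA than with liver mRNA.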

H. Inhibition of Translation of mRNA

In view of the report of Hickey et al. (34) that the translation of mRNA can be inhibited by pm7G, experiments were undertaken to determine whether synthetic cap and U1 RNA (the smallest, U-rich, low-MW RNA) inhibit mRNA translation. Figure 12A shows the translational activities of liver polysomal RNA(An) in the presence of various concentrations of pm7G, m7G(5')ppp(5')Am, pG and m7G(5')ppp(5')Am oxidized with sodium periodate. Both pm7G and m7G(5')ppp(5')Am inhibited the translational activity of the liver polysomal RNA(An); their IC50 values were 30 µM each (Table IV). On the other hand, pG had no effect on translation, as noted by Hickey et al. (34) with globin mRNA and HeLa cell RNA(An). They found 80% inhibition with pm7G of the translation of HeLa cell RNA(An). If pm7G competes specifically for the cap binding site, it is possible that all the mRNA(An) of liver cytoplasm contain the cap. With m7G(5')ppp(5')Am oxidized with periodate to yield the corresponding dialdehyde, the IC50 was 60 µM. The affinity of the dialdehyde derivative for initiation factors was reduced by approximately the same percentage as the translational activity of mRNA after periodate oxidation.

FIG. 12. Effects of various cap analogs on translational activity of liver mRNA in the wheat-germ system. (A) pm7G, synthetic cap, oxidized cap and 5'-GMP (pG). (B) tRNA, m2,2,7G, U1 RNA.

The results of corresponding experiments using U1 RNA (4), which has m2,2,7Gpp(p)Am as the cap, in translational studies are shown in Fig. 12B. U1 RNA was 15 times more inhibitory than either m7GpppAm or pm7G in decreasing the translational activity of liver polysomal RNA(An), i.e., the IC50 was 2.1 µM (Table IV). The nucleoside m2,2,7G had no effect, indicating that the nucleotide is essential for binding to the initiation factors. The inhibition observed with U1 RNA does not result from trapping of proteins on polynucleotide chains, since tRNA, at similar concentrations, did not inhibit translation but actually was stimulatory at lower levels. The greater affinity of U1 RNA for the binding site on the initiation factors may be due either to (a) the two additional methyl groups on the N2 of guanosine, which may provide more hydrophobic binding sites and thus enhance the affinity for protein binding, or to (b) the additional polynucleotide chain next to the cap, which may induce a greater binding affinity. These inhibitors, along with the cap structure of U1 RNA, should provide more information on recognition factors for the cap during initiation of protein synthesis and may offer important therapeutic opportunities in the future.

TABLE IV(a)

Inhibitor                         IC50 (µM)
pm7G                                 30
m7G(5')ppp(5')Am                     30
Oxidized(b) m7G(5')ppp(5')Am         60
U1 RNA                                2.1

(a) Translational activity was determined using liver polysomal RNA(An) in the wheat-germ system as described in the footnote to Table I. IC50 was calculated for various inhibitors from the data presented in Fig. 12. Each value is an average of three separate experiments.
(b) With NaIO4.
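The relative potencies discussed in this section follow directly from the IC50 values quoted in the text. The sketch below is illustrative only: the dictionary and function names are hypothetical, and a lower IC50 is simply taken to mean a more potent inhibitor.

```python
# Illustrative arithmetic only; names are hypothetical, IC50 values are those quoted in the text.
ic50_uM = {
    "pm7G": 30.0,
    "m7G(5')ppp(5')Am": 30.0,
    "oxidized m7G(5')ppp(5')Am": 60.0,
    "U1 RNA": 2.1,
}


def relative_potency(reference, inhibitor, table=ic50_uM):
    """How many times lower the inhibitor's IC50 is than the reference's."""
    return table[reference] / table[inhibitor]


u1_vs_cap = relative_potency("m7G(5')ppp(5')Am", "U1 RNA")
oxidized_vs_cap = relative_potency("m7G(5')ppp(5')Am", "oxidized m7G(5')ppp(5')Am")

print(f"U1 RNA vs synthetic cap: {u1_vs_cap:.1f}-fold")
print(f"oxidized cap vs synthetic cap: {oxidized_vs_cap:.1f}-fold")
```

The first ratio comes out a little above 14, which is roughly the 15-fold difference stated in the text; the second shows that periodate oxidation halves the apparent potency of the synthetic cap.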

III. Discussion

The demonstration of the 5'-cap has led rapidly to an important series of findings relating to the broad distribution of this structure in eukaryotic mRNA and, in addition, a clear demonstration of its virtually uniform presence in RNA(An) of liver and tumor cells. The functional role of the 5'-cap now seems quite clearly to be as an allosteric binding site for one or more proteins involved in the initiation reactions in protein synthesis. Although the AUG codon is as close as 10 nucleotides to the cap, as reported by Dasgupta et al. (35) for brome mosaic RNA IV, it appears that the distance may be much greater in other molecular species. The reasons for these differences are unknown, as are also the reasons for the differences in the chain lengths of the 5' segments that are not part of the codes for specific proteins.

Although the studies on nuclear RNA(An) indicate that many of these molecules, in fact, contain no cap, it is uncertain what their fate may actually be. That they contain poly(A) segments was shown by analysis and by the method of isolation (7). Whether the molecules lacking the cap are further metabolized and destroyed, or whether with time they are "capped" or have other functions, remains for future analysis. At the present time, the differences in the translational activities of nuclear and cytoplasmic molecules will partially or completely be accounted for on the basis of the differences in their content of the cap. Our recent studies suggest that there may be some qualitative differences in the proteins coded for by nuclear and cytoplasmic RNA(An).

With the development of further information about the cap, the possibility arose that a survey method could be developed for comparisons of the proteins produced by different tissues. The requisites for this are fidelity of the wheat-germ system, which has been frequently demonstrated, and the intactness of the RNA used as template. Since the RNA employed was isolated as a poly(A)-containing product on poly(U)-Sepharose, and since the molecules do not function if the cap is not intact, it appears that the basic requisites for such a survey exist.

It is apparent that in the Novikoff hepatoma and in 18-hour regenerating liver a series of proteins is produced that comigrate on two-dimensional gels with ribosomal proteins. These products require further chemical identification to establish their identity unequivocally. The translation by mRNA of Novikoff hepatoma and regenerating liver as compared to that by normal liver mRNA has important implications for coupled synthesis of rRNA and ribosomal proteins (Figs. 13 and 14) for ribosome production in eukaryotic cells. An increasing amount of evidence suggests that as part of any growth process, such as that in regenerating liver (Fig. 13B), there is a rapid increase in ribosome synthesis (14). In hormone-treated systems as well, a corresponding increase in production of both rRNA and ribosomal protein occurs as an early event (14) (Fig. 13B).

It is well recognized now that in the cancer process a variety of "fetal products" are produced, some of which reflect events at specified times in fetal development that are usually temporally activated in a normal progression and deactivated as the fetus progresses (Figs. 13C and 14). Following exposure of cells destined to become cancer cells to carcinogens, it is apparent that at least three types of events can occur: (a) a direct effect of carcinogen (viral, chemical or physical) on the genome, (b) an inhibitor effect of the carcinogens on many types of hydrolytic enzymes and/or (c) an activation of functional genes both for synthesis of ribosomes and for synthesis of other components including "fetal" elements that are released out of "ordered progression" of fetal development.

A key aspect of cancer is the self-generation of cells, i.e., no extrinsic stimuli are required to produce their activation for further cell replication (Fig. 14). An important part of the self-stimulatory aspects of cancer apparently are products of fetal genes that have been activated in the carcinogenic process and that will become fixed protein constituents of these cells. The persistence of their polysomes during mitosis as reported by Hodge et al. (36) continues the process without remission.

FIG. 13. (A) Elements of cellular response to external stimuli. It is envisioned that "growth" factors and other stimuli are almost continuously present in the cellular periphery and possibly in continuous equilibrium with intracellular elements. The symbols are R1-Rn = receptors, G1-GR = structural genes including genes for receptor proteins (GR), Gf1-Gfn = fetal genes with functions indicated, GI = inhibitor genes, No = rDNA genes and Gr-prot = genes for ribosomal proteins. (B) Response of cell to stimulus, S1, by formation of stimulus-receptor complex (S1R1) which impinges upon a group of genes, G1-GR, Gr-prot and GNo-rDNA, to produce a series of messenger RNAs, ribosomes and polysomes that produce specific products including R1 (37). The "battery" of fetal genes is not involved in these normal responses. (C) A similar pattern is shown for the fetal state when a variety of fetal genes are activated by stimulus Sf (fetal "stimulus" factors), which are presumed to act in much the same way as the factors controlling structural genes of the adult.

FIG. 14. Effects of carcinogenic agents on cellular responses. (A) It is envisioned that carcinogens permit structural genes to function in production of normal products but that through several mechanisms fetal genes are activated to produce a variety of fetal products, including Pf1 and Pf2, which are important to invasiveness and metastasis. The carcinogen may act with a fetal receptor to directly interact with the genome or may cause a new stimulus within the cell to interact with a receptor that will interact with the genome. Alternatively, the carcinogen may interfere with degradative reactions that are involved in normal growth controls. (B) This diagram indicates the expression of cancer, which is a continuous production of gene products involved in growth, invasiveness and metastasis. Such cells no longer produce receptors 1 and 2 or others that may have phenotypic specificity. It is envisioned that, as described by Hodge et al. (36), these gene products and their stimulatory activators are produced through mitosis and hence keep these genes activated during new cell formation. Moreover, the lack of fetal extracellular regulatory mechanisms does not permit these genes to be regulated as they would be during fetal growth and development.

REFERENCES

1. H. Busch. Perspect. Biol. Med. 19, 549-567 (1976).
2. M. R. S. Rao, B. C. Wu, J. Waxman and H. Busch. BBRC 66, 1186-1193 (1975).
3. T. S. Ro-Choi, R. Reddy, Y. C. Choi, N. R. Raj and D. Henning. FP 33, 1548 (1974).
4. R. Reddy, T. S. Ro-Choi, D. Henning and H. Busch. JBC 249, 6486-6494 (1974).
5. H. Shibata, T. S. Ro-Choi, R. Reddy, Y. C. Choi, D. Henning and H. Busch. JBC 250, 3909-3920 (1975).
6. T. S. Ro-Choi, Y. C. Choi, D. Henning, J. McCloskey and H. Busch. JBC 250, 3921-3928 (1975).
7. S. Sakamoto, M. S. Rao, B. Wu, W. H. Spohn and H. Busch. Physiol. Chem. Phys. 7, 309-324 (1975).
8. N. K. Chatterjee and H. Weissbach. PNAS 71, 3129-3133 (1974).
9. A. Carillo-Ruiz, M. Beats, G. Schwartz, P. Fiegelson and V. G. Allfrey. PNAS 70, 3641-3645 (1973).
10. J. Grozdanovic and J. Hradec. NAR 2, 821-830 (1975).
11. J. Grozdanovic and J. Hradec. BBA 402, 69-82 (1975).
12. R. P. Perry, D. E. Kelley, K. H. Friderici and F. M. Rottman. Cell 6, 13-20 (1975).
13. B. P. Brandhorst and E. H. McConkey. PNAS 72, 4450-4454 (1975).
14. H. Busch and K. Smetana. "The Nucleolus." Academic Press, New York, 1970.
15. C. M. Mauritzen, Y. C. Choi and H. Busch. In "Methods in Cancer Research" (H. Busch, ed.), Vol. 6, pp. 253-282. Academic Press, New York, 1970.
16. Y. Daskal, S. Ramirez, N. R. Ballal, B. Wu and H. Busch. Cancer Res. 36, 1026-1034 (1976).
17. C. W. Taylor, L. C. Yeoman and H. Busch. Exp. Cell Res. 82, 215-226 (1973).
18. B. E. Roberts and B. M. Paterson. PNAS 70, 2330-2334 (1973).
19. B. E. Roberts, M. Gorecki, R. C. Mulligan, K. J. Danna, S. Rozenblatt and A. Rich. PNAS 72, 1922-1926 (1975).
20. I. Gozes, H. Schmitt and U. Z. Littauer. PNAS 72, 701-705 (1975).
21. M. N. Thang, D. C. Thang, E. De Mayer and L. Montagnier. PNAS 72, 3975-3977 (1975).
22. M. S. Rao, F. Hirsch, B. C. Wu, W. H. Spohn and H. Busch. Mol. Cell. Biochem. in press (1976).
23. L. C. Yeoman, C. W. Taylor, J. J. Jordan and H. Busch. Exp. Cell Res. 91, 207-215 (1975).
24. L. R. Orrick, M. O. J. Olson and H. Busch. PNAS 70, 1316-1320 (1973).
25. A. W. Prestayko, G. R. Klomp, D. J. Schmoll and H. Busch. Biochem. 13, 1945-1951 (1974).
26. G. A. Evans and M. G. Rosenfeld. FP 34, 1913 (1975).
27. T. Hunter, T. Hunt, R. J. Jackson and H. Robertson. JBC 250, 409-417 (1975).
28. S. Muthukrishnan, G. W. Both, Y. Furuichi and A. J. Shatkin. Nature 255, 33-37 (1975).
29. R. C. Derosiers, K. H. Friderici and F. M. Rottman. Biochem. 14, 4367-4374 (1975).
30. N. K. Chatterjee and H. Weissbach. PNAS 71, 3129-3133 (1974).
31. H. Busch, D. Henning, F. W. Hirsch, M. S. Rao, T. S. Ro-Choi, W. H. Spohn and B. C. Wu. International Symposium on Molecular Biology of the Mammalian Genetic Apparatus-Its Relationship to Cancer and Medical Genetics. Pasadena, California (1975).
32. H. Busch, D. Henning, F. W. Hirsch, M. S. Rao, T. S. Ro-Choi, W. H. Spohn and B. C. Wu. In "Control Mechanisms in Cancer" (W. E. Criss, T. Ono and J. R. Sabine, eds.), pp. 241-267. Raven Press, New York, 1976.
33. S. L. C. Woo, J. M. Rosen, C. D. Liarakos, Y. C. Choi, H. Busch, A. R. Means and B. W. O'Malley. JBC 250, 7027-7039 (1975).
34. E. D. Hickey, L. A. Weber and C. Baglioni. PNAS 73, 19-23 (1976).
35. R. Dasgupta, S. D. Shih, C. Saris and P. Kaesberg. Nature 256, 624-628 (1975).
36. L. D. Hodge, E. Robbins and M. D. Scharff. J. Cell Biol. 40, 497-507 (1969).
37. M. E. Cohen and T. H. Hamilton. PNAS 72, 4346-4350 (1975).


Modification of the 5'-Terminals of mRNAs by Viral and Cellular Enzymes

BERNARD MOSS, SCOTT A. MARTIN,* MARCIA J. ENSINGER, ROBERT F. BOONE AND CHA-MER WEI

National Institute of Allergy and Infectious Diseases
National Institutes of Health
Bethesda, Maryland

I. Introduction

In contrast to prokaryotic mRNAs, which contain a 5'-terminal triphosphate or monophosphate, numerous eukaryotic viral (1-10) and cellular (11-15) mRNAs as well as "heterogeneous nuclear" RNAs (hnRNAs) (16, 17) contain a terminal 7-methylguanosine residue linked from its 5'-position through a triphosphate bridge to the 5'-position of a 2'-O-methylribonucleoside (Fig. 1). While some eukaryotic mRNAs may lack the 2'-O-methylribonucleoside (18-21), others have a doubly methylated N6,O2'-dimethyladenosine residue in the penultimate position (22, 23), and many have two consecutive 2'-O-methylribonucleosides. Sequences of the same general type containing a trimethylated guanosine in the terminal position are present in a class of low-molecular-weight nuclear RNAs (24-26a).¹ The presence of internal N6-methyladenosine (m6A) residues is another characteristic of some viral (6, 8, 10) and many cellular (11-13, 15-17, 26b) mRNAs and hnRNAs. In HeLa cells, these residues are found predominantly in G-m6A-C and A-m6A-C sequences (27) that are not clustered at either the 3'- (17) or 5'- (Wei and Moss, in preparation) ends of the mRNA.

We have initiated a search for 5'-terminal modification enzymes in order to understand the steps involved in the synthesis and processing of viral and cellular mRNAs. A description of three enzymic activities isolated from vaccinia virions and one from uninfected HeLa cells forms the major part of this report.

* Present address: Department of Pathology, Washington University School of Medicine, St. Louis, Missouri 63110.
¹ See also articles by Busch et al., Furuichi et al., and Rottman et al., in this volume.

FIG. 1. Representation of HeLa cell mRNA. [Structure diagram. Labels: 7-methylguanosine; 2'-O-methylribonucleoside, base (m6Ade, Ade, Gua, Cyt, Ura); second nucleoside, base (Ade, Gua, Cyt, Ura), 2'-O-(CH3 or H); N6-methyladenosine; poly(A), (A)150-200 A-OH.]

II. 5'-Terminal RNA Modification Enzymes of Vaccinia Virus

A. Synthesis of Methylated mRNA by Vaccinia Virus Cores

Vaccinia, a member of the poxvirus group, is a large DNA-containing virus that replicates within the cytoplasm of infected cells. The virus particle consists of a lipoprotein envelope, a pair of lateral bodies and a core (Fig. 2). The core contains, in addition to the genome, a transcriptase activity that can be detected by incubating purified virions with a nonionic detergent and a sulfhydryl reducing agent to disrupt the envelope, the four ribonucleoside triphosphates, and Mg2+ (28, 29). The RNA is transcribed from the early gene sequences, has a modal sedimentation coefficient of 10-12 S, and contains a poly(A) tract at the 3' end (30, 31). When S-adenosylmethionine (AdoMet) is added to the reaction mixture, there is no change in the rate of synthesis or size of the mRNA formed, but methyl groups are incorporated into the 5'-terminal sequences, which have been identified as m7GpppAm and m7GpppGm² (1, 4, 32). Considerable information pertaining to the terminal modification reactions has been obtained using this in vitro RNA-synthesizing system.

² The (5'), which should follow m7G (see Fig. 2), is omitted for simplicity [Eds.].

FIG. 2. In vitro synthesis of mRNA by vaccinia virus cores. [Diagram. Labels: vaccinia virion; lateral body; core; DNA (120 x 10^6); detergent; ATP, GTP, CTP, UTP; AdoMet; 10-12 S RNA; N = A or G.]

The first question to be considered here is the nature of the guanylyltransferase reaction. We considered two alternatives: (1) transfer of a GMP residue from GTP to a diphosphate at the 5' end of the RNA, and (2) transfer of a GDP residue from GTP to a monophosphate at the 5' end of the RNA. Discrimination between these mechanisms was achieved by using specific 32P-labeled NTPs for RNA synthesis and then locating the position of the radioactive label in the 5'-terminal fragments of the RNA, as outlined in Fig. 3. The results of such experiments indicated that (i) the labeled phosphate of [α-32P]ATP was incorporated exclusively into the position adjacent to Am in m7GpppAm; (ii) [α-32P]GTP was incorporated next to m7G in both m7GpppAm and m7GpppGm as well as in the position adjacent to Gm; (iii) [β,γ-32P]ATP was incorporated into the middle phosphate position of m7GpppAm only; and (iv) [β,γ-32P]GTP was incorporated only into the middle phosphate position of m7GpppGm (33). The incorporation of [γ-32P]GTP or [γ-32P]ATP was not detected. These results are compatible with the transfer of a GMP residue from GTP to a diphosphate at the 5' end of the mRNA. The alternative mechanism, transfer of a GDP residue to a monophosphate at the end of the RNA chain, can be excluded for the formation of m7GpppAm, but from these data alone such a transfer cannot be excluded for the formation of m7GpppGm, since in the latter structure all three of the phosphates are derived from GTP. Although a mechanism involving GMP transfer is also used during the formation of m7GpppAm by cytoplasmic polyhedrosis virus (3), GDP transfer has been reported for vesicular stomatitis virus (5).

FIG. 3. Origin of each of the phosphates in the 5'-terminal structure of vaccinia virus mRNA. [Flow scheme: vaccinia virus cores, ATP, GTP, CTP, UTP and S-adenosyl[methyl-3H]methionine, with (1) [α-32P]ATP, (2) [α-32P]GTP, (3) [β,γ-32P]ATP or (4) [β,γ-32P]GTP; 37°, 30 min; nucleotide pyrophosphatase; paper electrophoresis.]

The analog of GTP, guanosine [β,γ-methylene]triphosphate (GMP-P(CH2)P), cannot be hydrolyzed between the β and γ phosphates and thus provides a second means for examining the nature of the guanylyltransferase reaction. The methyl-labeled RNA made by vaccinia virus cores when GMP-P(CH2)P was substituted for GTP contained m7GpppAm terminals but very few m7GpppGm terminals (33) (Fig. 4). This result indicates that a GMP residue is transferred from GMP-P(CH2)P to the ppA end of an RNA chain to form m7GpppAm. Absence of m7GpppGm terminals was anticipated since RNA chains initiated with GMP-P(CH2)P would not lose their γ-phosphate terminals and therefore would not condense with GMP.

FIG. 4. Paper chromatography of the methyl-labeled 5' terminals of vaccinia virus mRNA synthesized with GTP or GMP-P(CH2)P. RNA was digested with P1 nuclease and bacterial alkaline phosphatase prior to chromatography (33). [Abscissa: centimeters.]

Additional studies with vaccinia virus cores suggested the sequence of reactions indicated in Fig. 5. Transfer of a GMP residue to a diphosphate at the end of the nascent RNA was proposed as the first step, since RNAs containing ppN and GpppN ends are formed in the absence of AdoMet (33). [Evidence for the reversibility of this step is discussed in Section C, 2 below.] As the majority of termini are methylated only at the 7-position of the terminal guanosine when limiting concentrations of AdoMet are provided, methylation at this position appears to be the second step, followed by methylation of the penultimate ribonucleoside at the 2'-position (33). Additional studies indicate that methylation occurs prior to addition of the poly(A) sequence to the 3'-terminal of the RNA (34).
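The labeling argument used above to discriminate between the two guanylyltransferase mechanisms can be restated as simple bookkeeping. The following Python sketch is illustrative only: the mechanism names, the position labels p1, p2 and p3 (p1 next to the guanosine, p3 next to the penultimate nucleoside in G-p1-p2-p3-N) and all function names are assumptions introduced here, and the observation table is a transcription of results (i) to (iv). It predicts which cap phosphates would carry label under each candidate mechanism and compares the prediction with what was reported.

```python
# Cap written as G-p1-p2-p3-N: p1 is adjacent to the guanosine, p3 to nucleoside N.
# Names and data layout are hypothetical; only results (i)-(iv) come from the text.

def cap_phosphate_origins(mechanism, initiating_ntp):
    """Return the (NTP, phosphate) origin of p1, p2 and p3 under one mechanism."""
    if mechanism == "GMP_transfer":    # Gp (from GTP) + ppN-RNA -> GpppN-RNA
        return (("GTP", "alpha"), (initiating_ntp, "beta"), (initiating_ntp, "alpha"))
    if mechanism == "GDP_transfer":    # Gpp (from GTP) + pN-RNA -> GpppN-RNA
        return (("GTP", "alpha"), ("GTP", "beta"), (initiating_ntp, "alpha"))
    raise ValueError(f"unknown mechanism: {mechanism}")


def labeled_positions(mechanism, initiating_ntp, labeled_ntp, labeled_phosphates):
    """Set of cap positions predicted to be radioactive for a given labeled precursor."""
    origins = cap_phosphate_origins(mechanism, initiating_ntp)
    return {
        f"p{index}"
        for index, (ntp, phosphate) in enumerate(origins, start=1)
        if ntp == labeled_ntp and phosphate in labeled_phosphates
    }


# (initiating NTP, labeled NTP, labeled phosphates) -> labeled cap positions reported
OBSERVED = {
    ("ATP", "ATP", ("alpha",)): {"p3"},            # (i)
    ("ATP", "GTP", ("alpha",)): {"p1"},            # (ii), m7GpppAm
    ("GTP", "GTP", ("alpha",)): {"p1", "p3"},      # (ii), m7GpppGm
    ("ATP", "ATP", ("beta", "gamma")): {"p2"},     # (iii)
    ("GTP", "ATP", ("beta", "gamma")): set(),      # (iii), "m7GpppAm only"
    ("ATP", "GTP", ("beta", "gamma")): set(),      # (iv), "only ... m7GpppGm"
    ("GTP", "GTP", ("beta", "gamma")): {"p2"},     # (iv)
}


def consistent(mechanism, initiating_ntp):
    """True if the mechanism reproduces every tabulated result for this initiating NTP."""
    return all(
        labeled_positions(mechanism, init, ntp, phosphates) == observed
        for (init, ntp, phosphates), observed in OBSERVED.items()
        if init == initiating_ntp
    )


if __name__ == "__main__":
    for mechanism in ("GMP_transfer", "GDP_transfer"):
        for init in ("ATP", "GTP"):
            print(mechanism, init, consistent(mechanism, init))
```

Under these assumptions, GMP transfer reproduces the results for both A- and G-initiated chains, whereas GDP transfer fails for A-initiated chains and is indistinguishable from GMP transfer for G-initiated chains, which mirrors the conclusion drawn in the text.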

B. Soluble Guanylyl- and Methyltransferases of Vaccinia Virus

Further progress in understanding the nature of the 5'-terminal modification reactions required the disruption of the virus core and release of the enzymes in a soluble form. Solubilization was achieved by incubation of the virus core with 0.1% sodium deoxycholate, 0.3 M NaCl and 0.05 M dithiothreitol, followed by high-speed centrifugation to remove insoluble structural proteins and subsequent filtration through a DEAE-cellulose column to remove DNA (34). This procedure had been previously used in our laboratory as the first step in the purification of a number of other enzymes from vaccinia virus cores, including a poly(A) polymerase (35, 36), two nucleic acid-dependent nucleoside triphosphatases (37, 38), a single-strand-specific DNase (39, 40) and a protein kinase (41, 42).

FIG. 5. Proposed sequence of reactions in the formation of the modified 5'-terminal structure of vaccinia virus mRNA. [Reactions legible in the scheme: (2) GpppN- + AdoMet -> m7GpppN- + AdoHcy; (3) m7GpppN- + AdoMet -> m7GpppNm- + AdoHcy.]

Unmethylated vaccinia virus mRNA synthesized in vitro by vaccinia virus cores in the absence of S-adenosylmethionine contains ppA, ppG, GpppG and GpppA terminals and was used as an acceptor in assays for the 5'-terminal modification enzymes. That all the activities needed to modify the 5' terminals of RNA are solubilized was demonstrated by the incorporation of radioactivity from [14C]GTP and Ado[Me-3H]Met into RNA, as shown in Fig. 6. After digestion of the RNA, 14C was detected in m7G, and 3H was detected in m7G, Gm and Am (Fig. 7A). Similarly, the enzymes could incorporate radioactivity into poly(A) containing a 5'-terminal diphosphate (Fig. 7B). Furthermore, the methylated nucleosides were present exclusively at the 5' terminals of mRNA and of poly(A) in m7GpppGm and/or m7GpppAm sequences. As anticipated, the enzymes catalyzed the incorporation of label from [α-32P]GTP but not from [β,γ-32P]GTP into the RNA.

FIG. 6. Incorporation of methyl groups from Ado[Me-3H]Met and of GMP residues from [14C]GTP into vaccinia virus mRNA by soluble enzymes from vaccinia virus cores. [Abscissa: minutes.]

FIG. 7. Thin-layer chromatographic identification of [3H]methyl- and [14C]guanosine-labeled nucleosides in RNA modified by soluble enzymes from vaccinia virus cores. Acceptor RNA (unmethylated vaccinia virus mRNA in panel A or the synthetic polyribonucleotide pp(A)n in panel B) was incubated with soluble enzymes from vaccinia virus cores, [14C]GTP, Mg2+, and Ado[Me-3H]Met. The RNA was purified and digested with RNase, snake venom phosphodiesterase and alkaline phosphatase prior to thin-layer chromatographic separation of the resulting nucleosides (34). [Abscissa: distance from origin (cm).]

Heterologous RNAs also serve as suitable substrates for the vaccinia enzymes. The 5' end of satellite tobacco necrosis virus RNA, which is (p)ppA (43, 44a), was converted to m7GpppAm (Fig. 8). The soluble vaccinia enzymes thus provide a unique and specific method of labeling those RNAs that have di- or triphosphate 5' terminals with [α-32P]-, [14C]-, or [3H]GTP or with Ado[Me-3H]Met. [Those RNAs containing triphosphate terminals are converted into molecules with diphosphate terminals by an "RNA terminal triphosphatase." This enzyme, present both in the extract from vaccinia cores and in our preparations of the purified complex of guanylyltransferase and RNA (guanine-7-)-methyltransferase (see Section II, C), has been purified by Tutas and Paoletti (personal communication).] Furthermore, 5'-terminal fragments obtained by limited RNase digestion of such enzymically modified RNAs would exhibit free 2'- and 3'-hydroxyl groups and therefore could be isolated on borate-derivatized adsorbents, as has been done for 3'-terminal fragments (44b).

FIG. 8. Analysis of satellite tobacco necrosis virus (STNV) RNA modified by soluble vaccinia virus enzymes. Following incubation of STNV RNA with GTP, Ado[Me-3H]Met and soluble vaccinia virus enzymes, the RNA was purified and digested with P1 nuclease and alkaline phosphatase (upper). This material was then digested with venom phosphodiesterase and alkaline phosphatase (lower). Digests were analyzed by paper chromatography (33). [Markers: m7GpppAm (upper); m7G and Am (lower).]

C. Characterization of a Complex of an RNA Guanylyltransferase and an RNA (guanine-7-)-methyltransferase from Vaccinia Virus

1. ISOLATION AND PHYSICAL PROPERTIES OF THE COMPLEX

The previous studies indicated that three activities, namely, an RNA guanylyltransferase, an RNA (guanine-7-)-methyltransferase, and an RNA (ribonucleoside-2'-)-methyltransferase, were released in a soluble form by deoxycholate treatment of vaccinia virus cores. Efforts were made to fractionate these enzymes by column chromatographic procedures (45, 46). As outlined in Fig. 9, the guanylyl- and guanine-7-methyltransferases were separated by DNA-agarose chromatography from the adsorbed ribonucleoside-2'-methyltransferase. The guanylyl- and guanine-7-methyltransferases, however, were inseparable by chromatography on poly(U)- or poly(A)-agarose or on a variety of ion-exchangers, and consequently appear to be a complex. After a 150-fold purification, approximately 0.2 mg of this complex was obtained from 60 mg of vaccinia virus.

The guanylyl- and guanine-7-methyltransferase activities cosediment in a sucrose density gradient. From the sedimentation coefficient and the Stokes radius obtained by gel filtration, a molecular weight of 127,000 was calculated for the complex (46). After treatment with sodium dodecyl sulfate and mercaptoethanol, two major polypeptide peaks with molecular weights of 95,000 and 31,400 were detected by polyacrylamide gel electrophoresis.

FIG. 9. Outline of procedure used to purify vaccinia virus guanylyl- and methyltransferases (45). [Flow scheme: vaccinia virus, treated with Nonidet P-40 and dithiothreitol, gives cores and envelopes; cores, treated with Na deoxycholate, dithiothreitol and NaCl, give soluble proteins + DNA and insoluble proteins; DEAE-cellulose gives soluble proteins; DNA-agarose separates mRNA (guanine-7-)-methyltransferase and mRNA guanylyltransferase from mRNA (nucleoside-2'-)-methyltransferase; poly(U)-Sepharose and poly(A)-Sepharose give mRNA (guanine-7-)-methyltransferase and mRNA guanylyltransferase.]

2. RNA GUANYLYLTRANSFERASE ACTIVITY OF THE COMPLEX

Detailed studies of both the guanylyltransferase and guanosine-7-methyltransferase were carried out with the purified enzyme complex (46; Martin and Moss, in preparation). The guanylyltransferase reaction is readily reversible and can be measured in either direction using the appropriately labeled precursor. GTP is a better donor than dGTP, and the other three ribonucleoside triphosphates are completely ineffective. Significantly, m7GTP is not a substrate for the guanylyltransferase, eliminating a possible alternative mechanism of formation of the m7GpppN structure. A divalent cation is essential, and Mg2+ is nearly 10 times as effective as Mn2+.

Although synthetic polyribonucleotides with a single phosphate at the 5' terminal, such as those generally formed by polynucleotide phosphorylase, are not acceptors for the guanylyltransferase, they may be activated by chemical addition of a second phosphate. When equimolar amounts of pp(N)n were tested, the order of efficiency was pp(A)n > pp(I)n > pp(G)n > pp(U)n > pp(C)n. The greater activity with purine polyribonucleotides may reflect the fact that the 5' terminals of vaccinia mRNAs are m7GpppAm and m7GpppGm. Polyribonucleotides containing a 5'-terminal triphosphate were also acceptors because of the RNA triphosphatase activity in our preparations.

We were particularly interested in determining whether GTP could condense with ribonucleoside diphosphates to form GpppN structures, which conceivably could serve as primers for RNA synthesis. Although the condensation reaction was demonstrated with ADP, GDP, UDP and CDP, the reactions were considerably less efficient than with the corresponding diphosphate-terminated homopolyribonucleotide and may not be biologically significant. For example, at identical concentrations of GTP, the dissociation constant for pp(A)n was 5.2 µM and for ADP was 360 µM, suggesting that guanylylation normally occurs after the initiation of the RNA chain.

Inorganic pyrophosphate at concentrations equimolar with GTP is a potent inhibitor of the guanylyltransferase reaction. This inhibition can be partly relieved by AdoMet, presumably because it methylates the terminal guanosine and thereby prevents the reverse reaction. This explanation is supported by our finding that GpppA but not m7GpppA is a substrate for the guanylyltransferase-catalyzed pyrophosphorolysis.
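The size of the preference for a polynucleotide acceptor follows directly from the two dissociation constants quoted above. The short Python sketch below only performs that division; the dictionary layout and the function name are assumptions made for illustration.

```python
# Dissociation constants quoted in the text (micromolar), at identical GTP concentrations.
KD_MICROMOLAR = {"pp(A)n": 5.2, "ADP": 360.0}


def fold_difference(kd_weaker, kd_tighter):
    """Ratio of two dissociation constants; a larger ratio means a stronger preference."""
    return kd_weaker / kd_tighter


ratio = fold_difference(KD_MICROMOLAR["ADP"], KD_MICROMOLAR["pp(A)n"])
print(f"pp(A)n is favored over ADP by roughly {ratio:.0f}-fold")
```

A difference of nearly two orders of magnitude is what underlies the suggestion that guanylylation acts on an initiated chain rather than on a free nucleoside diphosphate.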

3. RNA (GUANINE-7-)-METHYLTRANSFERASE ACTIVITY OF THE COMPLEX

The guanosine-7-methylation reaction may be studied independently of guanylylation by using appropriate acceptors (46; Martin and Moss, in preparation). Most experiments were carried out with Gppp(A)n prepared from pp(A)n using the vaccinia virus guanylyltransferase. With Gppp(A)n, neither GTP nor divalent cations were required for methylation. Gppp(N)n is a much better acceptor than the dinucleotide GpppN, which in turn is better than GTP > GDP > GMP > G. With identical concentrations of S-adenosylmethionine, the dissociation constants for Gppp(A)n, GpppG and GTP were 0.21 µM, 120 µM and 530 µM, respectively. The high acceptor efficiency of the polyribonucleotide Gppp(A)n suggests that methylation normally occurs after guanylylation of the nascent RNA chain. Significantly, nucleosides carrying a 3'-phosphate or a 2'-phosphate are totally ineffective acceptors, thereby providing an explanation for the absence of detectable internal methylation.
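One way to picture what these three dissociation constants imply is to ask how much of the enzyme each acceptor would occupy at a common concentration. The Python sketch below is a rough illustration, not the authors' analysis: it assumes simple single-site equilibrium binding, f = [S] / (Kd + [S]), and the 1 µM acceptor concentration is an arbitrary value chosen for the example.

```python
# Dissociation constants quoted in the text (micromolar), at identical AdoMet concentrations.
KD_MICROMOLAR = {"Gppp(A)n": 0.21, "GpppG": 120.0, "GTP": 530.0}

# Hypothetical acceptor concentration, chosen only for illustration.
ACCEPTOR_MICROMOLAR = 1.0


def fraction_bound(acceptor_um, kd_um):
    """Fractional saturation for single-site equilibrium binding (an assumed model)."""
    return acceptor_um / (kd_um + acceptor_um)


occupancy = {
    name: fraction_bound(ACCEPTOR_MICROMOLAR, kd)
    for name, kd in KD_MICROMOLAR.items()
}

for name, fraction in occupancy.items():
    print(f"{name}: {fraction:.4f}")
```

Under this assumed model the polyribonucleotide occupies most of the enzyme at a concentration where the dinucleotide and GTP occupy less than 1%, in line with the ranking of acceptors given in the text.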

D. RNA (ribonucleoside-2'-)-methyltransferase

Preparations of ribonucleoside-2'-methyltransferase free of the guanine-7-methyltransferase have recently been obtained, and it is possible to employ polyribonucleotides of the type m7Gppp(N)n to specifically study 2'-O-methylation. Such synthetic polyribonucleotides can be made enzymically using pp(N)n and the purified complex of guanylyltransferase and guanine-7-methyltransferase. Alternatively, naturally occurring RNA, such as tobacco mosaic virus RNA, which has the sequence m7GpppG (19, 20), may be used. Further methylation of m7GpppG-terminated molecules by the vaccinia 2'-methyltransferase does not require either GTP or divalent cations, and the product has been identified as m7GpppGm by procedures like those described in Fig. 8. Studies with synthetic homopolyribonucleotides indicate that m7GpppAm, m7GpppGm, m7GpppIm, m7GpppUm and m7GpppCm can all be formed. Thus far we have not observed the formation of GpppNm and therefore suspect that only m7G-terminated structures are substrates for the 2'-methyltransferase. Accordingly, methylation at the 2'-position appears to be the third step, as indicated in Fig. 5.
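The ordering argument developed in Sections II, A to D can be summarized as a set of substrate rules. The following Python sketch is a toy model under stated assumptions: the class, field and function names are invented for illustration, and each rule encodes only what the text reports (monophosphate and triphosphate ends are not direct acceptors for guanylylation, and GpppNm has not been observed).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FivePrimeEnd:
    """Hypothetical description of the 5' end of one RNA chain."""
    free_phosphates: int = 3          # phosphates on the first nucleoside N before capping
    guanylylated: bool = False        # G(5')ppp has been added
    m7g: bool = False                 # terminal guanosine methylated at the 7-position
    ribose_methylated: bool = False   # penultimate nucleoside 2'-O-methylated

    def name(self):
        if not self.guanylylated:
            return "p" * self.free_phosphates + "N"
        return ("m7" if self.m7g else "") + "GpppN" + ("m" if self.ribose_methylated else "")


def triphosphatase(end):
    """pppN -> ppN; any other end is returned unchanged."""
    if not end.guanylylated and end.free_phosphates == 3:
        return replace(end, free_phosphates=2)
    return end


def guanylyltransferase(end):
    """Only diphosphate-terminated chains are accepted: ppN -> GpppN."""
    if not end.guanylylated and end.free_phosphates == 2:
        return replace(end, guanylylated=True)
    return end


def guanine7_methyltransferase(end):
    """GpppN -> m7GpppN."""
    if end.guanylylated and not end.m7g:
        return replace(end, m7g=True)
    return end


def ribose2_methyltransferase(end):
    """Acts only on m7G-terminated ends (GpppNm was not observed): m7GpppN -> m7GpppNm."""
    if end.m7g and not end.ribose_methylated:
        return replace(end, ribose_methylated=True)
    return end


def apply_in_order(end, enzymes):
    """Apply each enzyme once, in the order given."""
    for enzyme in enzymes:
        end = enzyme(end)
    return end


if __name__ == "__main__":
    proposed = [triphosphatase, guanylyltransferase,
                guanine7_methyltransferase, ribose2_methyltransferase]
    print(apply_in_order(FivePrimeEnd(), proposed).name())
```

In this model, exposing a GpppN end to the 2'-methyltransferase before the guanine-7-methyltransferase leaves the ribose unmethylated, which is the behavior that places 2'-O-methylation third in the proposed sequence.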

E. Synthesis of RNA Guanylyl- and RNA Methyltransferases after Vaccinia Virus Infection

Changes in the levels of RNA guanylyl- and RNA methyltransferase activities in HeLa cells were measured after vaccinia virus infection to investigate whether the enzymes packaged into vaccinia cores are of cellular or of viral origin. A rapid rise in guanylyltransferase activity in cell extracts was detected using either [14C]- or [α-32P]GTP as a donor and pp(A)n as an acceptor (Fig. 10A). The specificity of this assay was verified by P1 nuclease digestion of the product and identification of labeled GpppA. The rise in activity occurred even when viral DNA synthesis was prevented with arabinocytidine (araC), but not when protein synthesis was inhibited with cycloheximide, indicating that synthesis of this enzyme was an early viral function.

FIG. 10. Levels of guanylyltransferase and methyltransferase activities following vaccinia virus infection of HeLa cells. At the times indicated, samples of uninfected, vaccinia virus-infected, and arabinosylcytosine-treated infected cells were lysed with deoxycholate and mercaptoethanol and then sonicated. The total cell lysates were assayed for guanylyltransferase (A) and methyltransferase (B) activities using pp(A)n as acceptor. The efficacy of arabinosylcytosine was confirmed by measuring the inhibition of synthesis of the DNA-dependent ATPase, a "late" vaccinia virus enzyme (50). [Abscissa: hours after infection.]

Similar experiments using Ado[Me-3H]Met as the donor and, as the acceptor, either Gppp(A)n or pp(A)n in the presence of GTP and Mg2+ indicated that methyltransferase synthesis is also an early viral function (Fig. 10B). Evidence that both the guanine-7-methyltransferase and the nucleoside-2'-methyltransferase are made after vaccinia virus infection has been obtained by analysis of the enzyme reaction products and by column chromatographic separation of the enzymes from infected cell homogenates.
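The inference drawn from the inhibitor experiments can be written out as a small decision rule. The Python sketch below is an illustration under stated assumptions: it presumes that the activity rises in untreated infected cells, the function name and return strings are invented here, and the "late" branch follows the behavior the Fig. 10 legend attributes to the DNA-dependent ATPase.

```python
def classify_induced_activity(rises_with_arac, rises_with_cycloheximide):
    """Classify an activity that increases after infection.

    rises_with_arac: the rise still occurs when viral DNA synthesis is blocked.
    rises_with_cycloheximide: the rise still occurs when protein synthesis is blocked.
    """
    if rises_with_cycloheximide:
        # No new protein is needed, so the rise cannot reflect enzyme synthesis.
        return "not newly synthesized"
    if rises_with_arac:
        # Requires protein synthesis but not viral DNA replication.
        return "early"
    # Requires both protein synthesis and viral DNA replication.
    return "late"


# Guanylyltransferase, as described in the text: rise persists with araC,
# but is abolished by cycloheximide.
print(classify_induced_activity(rises_with_arac=True, rises_with_cycloheximide=False))
```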

F. Host Modification of Vaccinia Virus mRNA

Vaccinia virus mRNA isolated from infected HeLa cells is more highly methylated than mRNA synthesized in vitro by vaccinia virus cores. The additional modifications include methylation of the penultimate Am in the N6-position to give m6Am and methylation of the third ribonucleoside in the 2'-position (Boone and Moss, in preparation). Similar methylated ribonucleosides are found in HeLa cell mRNA (11, 13, 17, 22, 27), and it seems probable that HeLa cell enzymes catalyze the further methylation of vaccinia virus mRNA.

III. Isolation of a GpppN-Specific Guanine-7-methyltransferase from Uninfected HeLa Cells

Little is presently known about the mechanism of 5'-terminal modification of cellular RNA. Isolated HeLa cell nuclei can form methylated 5' termini under conditions of RNA synthesis (47), and extracts of L cells and wheat germ can methylate added viral RNAs (48, 49). We chose HeLa cells for our studies because the methylated nucleosides of HeLa cell mRNA have been well characterized (11, 13, 17, 27), extensive studies on RNA synthesis and processing have been carried out with this system, and they support the growth of many viruses, including vaccinia.

To detect enzymes from HeLa extracts capable of modifying the 5'-terminal of RNA, it was first necessary to select an appropriate acceptor. Ideally, this acceptor would not be a substrate for the numerous tRNA and rRNA methyltransferases. Since unmodified HeLa cell mRNA or heterogeneous nuclear RNA was unavailable, we began our studies with unmethylated vaccinia virus mRNA, which, as previously indicated, has both free diphosphate (ppG and ppA) and guanylylated (GpppG and GpppA) ends and can be made in relatively large amounts in vitro. Addition of this RNA to reaction mixtures containing an extract of HeLa cells led to an 8-fold stimulation of incorporation of radioactivity from Ado[Me-3H]Met into macromolecular material (51). Of the total mRNA-stimulated activity, one-third was associated with the nucleus, and the remainder, perhaps because of nuclear leakage, was found in the cytoplasmic fraction when the cells were disrupted by a variety of methods utilizing aqueous buffers. The cytoplasmic activity was separated from mitochondria and lysosomes by centrifugation, and although about 30% sedimented with ribosomes, it could be removed from the ribosomes by washing with 0.5 M KCl. Whether enzyme associated with nuclei, postribosomal supernatant, or unwashed ribosomes was used, approximately 90% of the methyl label was recovered as m7G following digestion of the labeled RNA.

Approximately a 200-fold purification of the HeLa cell guanine-7-methyltransferase was achieved by phase partition with aqueous dextran-polyethylene glycol, ammonium sulfate precipitation, DEAE-cellulose chromatography, DNA-agarose chromatography, and CM-Sephadex chromatography (51). On both DNA-agarose and CM-Sephadex columns, only a single peak of methyltransferase activity was detected using vaccinia virus mRNA as acceptor. The molecular weight of the guanine-7-methyltransferase, calculated from the sedimentation coefficient and Stokes radius, was estimated to be approximately 56,000.

After P1 nuclease and alkaline phosphatase digestion of vaccinia mRNA methylated by the purified HeLa cell enzyme, all the labeled material was adsorbed to a DEAE-cellulose column in 7 M urea and eluted with a net charge of -2.5. The labeled oligonucleotide isolated by DEAE-cellulose chromatography was identified as m7GpppG and m7GpppA by paper chromatography. After further digestion with snake venom phosphodiesterase and alkaline phosphatase, exclusively m7G was identified by thin-layer chromatography.

The purified HeLa cell guanine-7-methyltransferase requires neither GTP nor divalent cations for activity. In addition to unmethylated vaccinia virus mRNA, both the synthetic polyribonucleotide Gppp(A)n and the dinucleotide GpppG are good acceptors. The dinucleotide GppG is a poor acceptor, and GppppG, GTP and GDP are totally ineffective. The diphosphate-terminated polyribonucleotide pp(A)n in the presence of GTP and Mg2+ was also not an acceptor, indicating that a guanylyltransferase similar to that associated with the vaccinia virus guanine-7-methyltransferase is not associated with the HeLa cell enzymes.

The features that distinguish the HeLa cell guanine-7-methyltransferase from the enzyme isolated either from vaccinia virus cores or from infected cells include: (i) the inability of the HeLa enzyme to methylate GTP or GDP; (ii) the absence of guanylyltransferase activity associated with the HeLa enzyme; (iii) differences in molecular weight; and (iv) differences in chromatographic properties. Although the possibility that the HeLa cell guanine-7-methyltransferase actually is a subunit of the complex of guanylyltransferase and guanine-7-methyltransferase isolated from vaccinia virus cannot be eliminated until the HeLa enzyme has been further purified, we consider it unlikely that the cellular and viral enzymes are related.
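The molecular weight differences mentioned in point (iii) can be checked with simple arithmetic, using the values given in this section and in Section II, C. The Python sketch below is only a numerical cross-check: the 10% tolerance is a hypothetical threshold chosen for illustration, and a mismatch in estimated molecular weight is not, by itself, evidence about whether two proteins are related.

```python
# Molecular weights quoted in the text.
VACCINIA_COMPLEX_MW = 127_000            # native complex (sedimentation and gel filtration)
VACCINIA_SUBUNIT_MWS = (95_000, 31_400)  # polypeptides seen after dodecyl sulfate treatment
HELA_ENZYME_MW = 56_000                  # HeLa guanine-7-methyltransferase

# Hypothetical tolerance for calling two estimates "similar".
TOLERANCE = 0.10


def relative_difference(value, reference):
    """Absolute difference expressed as a fraction of the reference value."""
    return abs(value - reference) / reference


subunit_sum = sum(VACCINIA_SUBUNIT_MWS)
sum_vs_complex = relative_difference(subunit_sum, VACCINIA_COMPLEX_MW)

similar_subunits = [
    mw for mw in VACCINIA_SUBUNIT_MWS
    if relative_difference(HELA_ENZYME_MW, mw) <= TOLERANCE
]

print(f"subunit sum: {subunit_sum} ({sum_vs_complex:.1%} from the native estimate)")
print(f"vaccinia subunits within {TOLERANCE:.0%} of the HeLa enzyme: {similar_subunits}")
```

The two vaccinia polypeptides account for the native complex to within about half a percent, while the HeLa estimate falls well away from either polypeptide under the assumed tolerance.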

IV. Summary and Conclusions Three enzyme activities, namely, an RNA guanylyltransferase, an RNA ( guanine-7- ) -methyltransferase, and an RNA ( ribonucleoside-2’-) methyltransferase, have been isolated from vaccinia virus cores and extcnsively characterized with respect to donor and acceptor specificities. Acting sequentially, the enzymes are capable of completely modifying the

78

BERNARD MOSS ET AL.

5’ terminals of homologous or heterologous RNAs or of synthetic polyribonucleotides containing a 5’-terminal diphosphate. Two of the activities, the guanylyltransferase and guanine-7-methyltransferase, were isolated as a complex with a molecular weight of 127,000. After dodecyl sulfate/polyaciylamide gel electrophoresis, two putative subunits with molecular weights of 95,000 and 31,400 were detected. In infected cells, synthesis of the guanylyltransferase and both methyltransferases are directed by vaccinia virus and appear to belong to the class of “early” or prcreplicative gene products. A scheme for the synthesis and modification of early vaccinia virus mKNA is shown in Fig. 11. W e propose that, prior to completion of the RNA chain, the y-phosphate is removed from the 5’ terminal and a GMP residue is added. Methylation follows sequentially, first at the 7-position of the terminal guanosine and then at the 2’-position of the penultimate ribonucleoside. After completion of the KNA chain, adenylate residues are added and the RNA is then extruded from the core into the cytoplasm where the final two methylation steps occur. This sequence is in accord with ( i ) the products obtained by incubating virus cores either in the absence of S-adcnosylmethionine or in the presence of both limiting and saturating concentrations of the methyl donor, ( i i ) the localization within the 5’-terminal sequence of 32P transferred from a- or p,y-”P-labeled ribonucleoside triphosphates, (iii) the effects of a GTP analog GMPP( CEI,)P, and ( i v ) the individual reactions carried out at highest efficiency by the isolated guanylyltransferase and guanine-7-methyltransferase and by ribonucleoside-2’-methyltransferase. 
Two more enzymes, a 2’-O-methyladenosine-N”-methyltransferase and another 2’-0-1nethyltransferase are postulated to exist in the cytoplasm, to account for the observation that vaccinia virus mRNA isolated from HeLa cells is more highly methylated than mRNA made in vitro by vaccinia virus cores. We presume that these are host enzymes, since HeLa cell inRNA is similarly modified and neither of these activities is present in vaccinia cores. Characterization of the enzymes involved in the 5’ terminal modification of cellular mRNAs or hcterogeneous nuclear RNAs are at an early stage. Thus far, a single enzyme, a guaniiie-7-methyltransferase specific for the terminal guanosine of G ( 5’)ppp( N),,, where n is 1 or more, has been isolated from HeLa cells. This enzyme has a molecular weight of about 56,000 and has no detectable guanylyltransferase activity associated with it. The isolation and charactcrization of the guanylyltransferase as well as the other methyltransferases involved in the modification of HcLa cell mRNA remains to be accomplished. It is expected that progress in

ENZYMIC MODIFICATION

OF

79

5’-TERMINALS 1

[Figure 11 scheme. Vaccinia core-associated enzymes, in order: RNA polymerase (pppA..., pppG...); RNA terminal triphosphatase (ppA..., ppG...); RNA guanylyltransferase (GpppA..., GpppG...); RNA (guanine-7-)-methyltransferase (m7GpppA..., m7GpppG...); RNA (nucleoside-2'-)-methyltransferase (m7GpppAm..., m7GpppGm...); poly(A) polymerase; ? ATP-dependent extrusion factor. Cytoplasmic enzymes: ? RNA (2'-O-methyladenosine-N6-)-methyltransferase (m7Gpppm6Am...); ? RNA (nucleoside-2'-)-methyltransferase (m7Gpppm6Am-Nm..., m7GpppAm-Nm..., m7GpppGm-Nm...).]

FIG. 11. Scheme for the synthesis and modification of early vaccinia virus mRNA. The reactions within the rectangle can be carried out in vitro by vaccinia virus cores. Those enzymes preceded by a question mark are inferred but have not been directly demonstrated.

this area will further our understanding of the sequence of events in the transcription and processing of eukaryotic mRNA.
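The order of the core-associated reactions proposed above (removal of the γ-phosphate, addition of GMP, methylation at the 7-position of the terminal guanosine, then methylation at the 2'-position of the penultimate nucleoside) can be traced on a text label for the 5' terminal. The following is only an illustrative sketch: the function names and the string notation are hypothetical conveniences, not part of the original work, and the later polyadenylylation, extrusion and cytoplasmic methylation steps are left out.

```python
# Illustrative sketch only: the function names and the string representation of
# a 5' terminal (e.g. "pppA") are hypothetical conveniences for this example.

def remove_gamma_phosphate(end):
    """Triphosphatase step: pppN -> ppN."""
    assert end.startswith("ppp"), "expects a 5'-triphosphate terminal"
    return end[1:]

def add_gmp(end):
    """Guanylyltransferase step: ppN -> GpppN."""
    assert end.startswith("pp") and not end.startswith("ppp")
    return "Gp" + end

def methylate_guanine_7(end):
    """Guanine-7-methyltransferase step: GpppN -> m7GpppN."""
    assert end.startswith("Gppp"), "expects an unmethylated blocked terminal"
    return "m7" + end

def methylate_2_prime(end):
    """Nucleoside-2'-methyltransferase step: m7GpppN -> m7GpppNm."""
    assert end.startswith("m7Gppp"), "expects a 7-methylated blocked terminal"
    return end + "m"

# Order of the core-associated steps as proposed in the text.
CORE_STEPS = [
    remove_gamma_phosphate,
    add_gmp,
    methylate_guanine_7,
    methylate_2_prime,
]

def modify_5_prime_terminal(end):
    """Apply the core-associated steps in order; return every intermediate."""
    history = [end]
    for step in CORE_STEPS:
        end = step(end)
        history.append(end)
    return history

if __name__ == "__main__":
    for start in ("pppA", "pppG"):
        print(" -> ".join(modify_5_prime_terminal(start)))
```

The assertions inside each step make the ordering explicit: for example, `add_gmp` refuses a terminal that still carries its γ-phosphate.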

ACKNOWLEDGMENT Satellite tobacco necrosis virus RNA and tobacco mosaic virus RNA were generously provided by Abraham Marcus of the Institute for Cancer Research, Philadelphia, Pennsylvania.


REFERENCES
1. C.-M. Wei and B. Moss, PNAS 72, 318 (1975).
2. Y. Furuichi, M. Morgan, S. Muthukrishnan and A. J. Shatkin, PNAS 72, 362 (1975).
3. Y. Furuichi and K. Miura, Nature 253, 374 (1975).
4. T. Urushibara, Y. Furuichi, C. Nishimura and K. Miura, FEBS Lett. 49, 385 (1975).
5. G. Abraham, D. P. Rhodes and A. K. Banerjee, Cell 5, 51-58 (1975).
6. S. Lavi and A. J. Shatkin, PNAS 72, 2012 (1975).
7. J. K. Rose, JBC 250, 8098 (1975).
8. Y. Furuichi, A. J. Shatkin, E. Stavnezer and J. M. Bishop, Nature 257, 618 (1975).
9. J. Keith and H. Fraenkel-Conrat, PNAS 72, 3347 (1975).
10. B. Moss and F. Koczot, J. Virol. 17, 385 (1976).
11. C.-M. Wei, A. Gershowitz and B. Moss, Cell 4, 379 (1975).
12. R. P. Perry, D. E. Kelley, K. Friderici and F. Rottman, Cell 4, 387 (1975).
13. Y. Furuichi, M. Morgan, A. J. Shatkin, W. Jelinek, M. Salditt-Georgieff and J. E. Darnell, PNAS 72, 1904 (1975).
14. J. M. Adams and S. Cory, Nature 255, 28 (1975).
15. D. T. Dubin and R. H. Taylor, NARes 2, 1653 (1975).
16. R. P. Perry, D. E. Kelley, K. H. Friderici and F. M. Rottman, Cell 6, 13 (1975).
17. M. Salditt-Georgieff, W. Jelinek, J. E. Darnell, Y. Furuichi, M. Morgan and A. J. Shatkin, Cell 7, 227 (1976).
18. R. Dasgupta, D. S. Shih, C. Saris and P. Kaesberg, Nature 256, 624 (1975).
19. J. Keith and H. Fraenkel-Conrat, FEBS Lett. 57, 31 (1975).
20. D. Zimmern, NARes 2, 1189 (1975).
21. E. Hefti, D. H. L. Bishop, D. T. Dubin and V. Stollar, J. Virol. 17, 149 (1976).
22. C.-M. Wei, A. Gershowitz and B. Moss, Nature 257, 251 (1975).
23. S. A. Moyer, G. Abraham, R. Adler and A. K. Banerjee, Cell 5, 59 (1975).
24. T. S. Ro-Choi, R. Reddy, Y. C. Choi, N. B. Raj and D. Henning, FP 33, 1548 (1974).
25. H. Shibata, T. S. Ro-Choi, R. Reddy, Y. C. Choi, D. Henning and H. Busch, JBC 250, 3909 (1975).
26a. S. Cory and J. M. Adams, Mol. Biol. Rep. 2, 287 (1975).
26b. R. Desrosiers, K. Friderici and F. Rottman, PNAS 71, 3971 (1974).
27. C.-M. Wei, A. Gershowitz and B. Moss, Bchem 15, 397 (1976).
28. J. R. Kates and B. R. McAuslan, PNAS 58, 134 (1967).
29. W. Munyon, E. Paoletti and J. T. Grace, Jr., PNAS 58, 2280 (1967).
30. J. Kates and J. Beeson, JMB 50, 1 (1970).
31. J. Kates and J. Beeson, JMB 50, 19 (1970).
32. C.-M. Wei and B. Moss, PNAS 71, 3014 (1974).
33. B. Moss, A. Gershowitz, C.-M. Wei and R. Boone, Virology 72, 341 (1976).
34. M. J. Ensinger, S. A. Martin, E. Paoletti and B. Moss, PNAS 72, 3385 (1975).
35. B. Moss, E. N. Rosenblum and E. Paoletti, Nature 254, 59 (1973).
36. B. Moss, E. N. Rosenblum and A. Gershowitz, JBC 250, 4722 (1975).
37. E. Paoletti, H. Rosemond-Hornbeak and B. Moss, JBC 249, 3273 (1974).
38. E. Paoletti and B. Moss, JBC 249, 3281 (1974).
39. H. Rosemond-Hornbeak, E. Paoletti and B. Moss, JBC 249, 3287 (1974).
40. H. Rosemond-Hornbeak and B. Moss, JBC 249, 3292 (1974).


41. J. H. Kleiman and B. Moss, JBC 250, 2420 (1975).
42. J. H. Kleiman and B. Moss, JBC 250, 2430 (1975).
43. E. Wimmer, A. Y. Chang, J. M. Clark, Jr., and M. E. Reichmann, JMB 78, 59 (1968).
44a. J. Horst, H. Fraenkel-Conrat and S. Mandeles, Bchem 10, 4748 (1971).
44b. M. Rosenberg, NARes 1, 653 (1974).
45. S. A. Martin, E. Paoletti and B. Moss, JBC 250, 9322 (1975).
46. S. A. Martin and B. Moss, JBC 250, 9330 (1975).
47. Y. Groner and J. Hurwitz, PNAS 72, 2930 (1975).
48. S. Muthukrishnan, G. W. Both, Y. Furuichi and A. J. Shatkin, Nature 255, 33 (1975).
49. G. W. Both, Y. Furuichi, S. Muthukrishnan and A. J. Shatkin, Cell 6, 185 (1975).
50. E. Paoletti, N. Cooper and B. Moss, J. Virol. 14, 578 (1974).
51. M. J. Ensinger and B. Moss, JBC (in press).

ADDENDUM

Activities that can carry out the final two steps in the modification of vaccinia mRNA (Fig. 11) have been demonstrated in extracts of uninfected HeLa cells.


Blocked and Unblocked 5' Termini in Vesicular Stomatitis Virus Product RNA in Vitro: Their Possible Role in mRNA Biosynthesis

RICHARD J. COLONNO, GORDON ABRAHAM AND AMIYA K. BANERJEE
Roche Institute of Molecular Biology
Nutley, New Jersey

Vesicular stomatitis virus (VSV) is a rhabdovirus that contains a single-stranded RNA genome of MW approximately 4 × 10⁶ and five structural proteins designated as L, G, M, NS and N (1). The virus also contains a virion-associated RNA polymerase that transcribes, in vitro, the negative-strand genome RNA into five monocistronic mRNA species (2), which have been identified by translation in an in vitro protein-synthesizing system (3). Although the various species synthesized in vitro and in vivo have been well characterized (4), the precise mechanism of RNA synthesis on the genome template is not known. In an effort to understand the mechanism of this viral mRNA transcription and its possible relation to replication, we determined the 5'-terminal sequences of the mRNA species synthesized in vitro and compared them with the 3'-terminal sequence present in the genome RNA of the virus. The VSV mRNAs contain a 5'-terminal structure consisting of a guanosine residue connected to an adenosine residue through a 5'-5' pyrophosphate linkage as G(5')ppp(5')A... (5). The α-phosphate of ATP and the α- and β-phosphates of GTP are incorporated in the blocked structure. In the presence of the methyl donor, S-adenosylmethionine, the structure is methylated at two positions as m7G(5')ppp(5')Am... (6). The base sequences at the 5' termini of both the unmethylated [G(5')ppp(5')A-A-C-A-G-...] and methylated [m7G(5')ppp(5')Am-A-C-A-G-...] mRNAs are identical, and all the mRNA species contain poly(A) at their 3' ends (7, 8). However, the 3'-terminal sequence of the VSV genome RNA is ...Y-G-UOH (9) (Y = pyrimidine nucleoside). Since this 3'-terminal sequence is not complementary to the 5'-terminal sequence of the VSV mRNAs, two possible models for mRNA biosynthesis can be envisaged.


The first model involves independent initiation of RNA transcription with pppA at multiple sites on the genome RNA far removed from the 3' end. Blocking of the 5' termini of the mRNAs then occurs with ppG after the removal of the β- and γ-phosphates of pppA. The second model involves initiation of RNA synthesis at a single site at the 3' end of the genome and the sequential release of mRNAs from the transcript by a processing mechanism, exposing 5'-monophosphates that are subsequently blocked. This model predicts that a small initiating RNA molecule, complementary to the 3'-terminal sequence of the genome RNA, should be synthesized and released during RNA synthesis. In addition, the synthesis of each mRNA will depend on the prior transcription of its 3'-proximal genes. To investigate the first possibility, i.e., the synthesis of an initiating RNA segment, RNA was synthesized in vitro using [β,γ-32P]ATP to label the 5' terminus and [3H]UTP to label internal UMP residues. The purified

[Figure 1 plot; x-axis: fraction number.]

FIG. 1. Velocity sedimentation analysis of vesicular stomatitis virus (VSV) product RNA synthesized in vitro. A standard transcriptase reaction containing 200 µCi/ml [β,γ-32P]ATP (40.0 Ci/mmol) and 40 µCi/ml [3H]UTP (16 Ci/mmol) was incubated with 0.1 mg/ml purified VSV for 5 hours at 30°C. The resulting product RNA was extracted twice with phenol/dodecyl sulfate at pH 7 and purified by Sephadex G-50 chromatography. A portion of the product RNA was sedimented through a linear 12-ml 15 to 30% (w/v) sucrose gradient in 0.5% dodecyl sulfate, 0.1 M NaCl, 10 mM TrisCl, 1 mM EDTA (pH 8.0) in a Spinco SW 40 rotor at 23°C for 16 hours at 34,000 rpm. Fractions (0.4 ml) were collected from the bottom of the tube and assayed in 0.5 ml of H2O and 7 ml of Hydromix (Yorktown) to determine the [32P]ATP (○---○) and [3H]UMP (●---●) radioactivity.


RNA was analyzed by velocity sedimentation, and a small but distinct peak of 32P radioactivity was present at the top of the gradient (Fig. 1). In contrast, no 32P was associated with the mRNA species, which contained only 3H. This result suggested that a small RNA product containing 5'-polyphosphorylated ATP was being synthesized in vitro. In order to characterize this RNA species, VSV RNA products were synthesized in the presence or in the absence of [3H]AdoMet using [α-32P]ATP as the labeled precursor for RNA synthesis. The RNA sedimenting between 2 S and 7 S (data not shown) was further analyzed by electrophoresis in 20% polyacrylamide gel. As shown in Fig. 2A, some of the 32P migrated heterogeneously in the upper fractions of the gel, but a discrete RNA species migrating slightly faster than tRNA (data not shown) was clearly seen. Similarly, when the methylated RNA products were analyzed (Fig. 2B), a 32P-labeled RNA product migrating at the same position as in Fig. 2A was observed, but it contained no detectable [3H]methyl radioactivity. The heterogeneous methylated RNA species containing both 3H and 32P labels were subsequently shown to be incomplete mRNA species (10), whereas the discrete RNA product was found to be a unique

[Figure 2 plots, panels A and B; x-axis: migration (mm).]

FIG. 2. Polyacrylamide gel electrophoresis of small in vitro RNA products. Product RNA was labeled with [32P]AMP in the presence (B) or in the absence (A) of [3H]AdoMet, and analyzed by velocity sedimentation. Labeled RNA products sedimenting from 2 S to 7 S were isolated and analyzed by electrophoresis in 10-cm 20% polyacrylamide gels (10). The gels were fractionated into 1-mm slices and the [32P]AMP (●---●) and [3H]methyl (○---○) radioactivities in each were determined.


RNA which contained the following interesting features. The 5'-terminal structure of this RNA, in contrast to the mRNAs, is unblocked, unmethylated and has the sequence ppA-C-G-..., which appears to be complementary to the 3'-terminal sequence of the genome RNA. The γ-phosphate of ATP is presumably removed by the virion-associated phosphohydrolase. The small RNA product (approximately 70 bases) is rich in A (47%) but does not bind to oligo(dT)-cellulose (11). These results indicate that this RNA molecule probably represents an initiated "leader" RNA segment transcribed prior to the synthesis of VSV mRNAs, the capping of the mRNAs presumably occurring subsequently. Thus, in the VSV system, the capping may not be coupled with the initiation of RNA transcription, as has been observed in cytoplasmic polyhedrosis virus, reovirus and vaccinia virus (11, 12). In order to demonstrate that the individual VSV mRNA species are synthesized sequentially, we exposed VSV to ultraviolet radiation and studied the kinetics of the in vitro synthesis of the individual mRNA species. With an oligo(dT)-cellulose fractionation procedure, each completed individual poly(A)-containing mRNA species was quantitated after different times of UV exposure (13). The rate of decrease of each mRNA species followed single-hit kinetics; from these results, the target sizes of the corresponding genes were calculated. As shown elsewhere (13), the target size of the N-protein gene corresponded with the physical size of its mRNA. However, the target sizes of the other genes did not correlate with the molecular weights of their corresponding mRNAs. In fact, the target size of a particular gene appeared to include the sum of the molecular weights of its 3'-proximal genes. For example, the calculated target size of the L-protein gene, 4 × 10⁶ daltons, was nearly equal to the sum of the molecular weights of all five messages. This suggested that a single UV hit anywhere on the genome RNA prevents the synthesis of a completed L-protein message. Thus, the synthesis of a particular message is dependent on the prior synthesis of its 3'-proximal genes. These results indicate a compulsory order in which VSV mRNAs are synthesized, with the N-gene (located closest to the 3' terminus of the genome RNA) transcribed first and the L-gene (located close to the 5' terminus) transcribed last. In conclusion, we propose the following model for VSV mRNA biosynthesis (Fig. 3). First, the virion-associated RNA polymerase initiates transcription at the 3' terminus of the template genome RNA with the 5'-terminal sequence ppA-C-G-.... Second, the remainder of the genome RNA continues to be transcribed as the RNA polymerase moves toward the 5' end of the genome RNA. Third, a processing enzyme recognizes specific sequences on the transcribed product RNA, cleaving it to release


[Figure 3 diagram; labels: leader RNA, cleavage, cap, poly(A).]

FIG. 3. Model for biosynthesis of vesicular stomatitis virus mRNAs.

the leader RNA and cistron-sized mRNAs, which retain a free 5'-monophosphate. Fourth, the capping of the 5' termini of the mRNAs with GDP takes place with concomitant polyadenylylation at the 3' end of the cleaved RNA. Thus, in the VSV system, capping of the mRNAs appears to occur subsequent to the initiation of RNA synthesis. Finally, from the sequential synthesis of mRNA species, the genetic map of the VSV genome has been established as (5')L-G-M-NS-N(3') (13).
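The target-size argument can be made concrete with a short sketch. Under the sequential model, the UV target for a gene is its own message plus every message lying between it and the 3' end of the genome, so the predicted target sizes are running sums along the map. The message sizes used below are hypothetical placeholders, chosen only so that their total is close to the 4 × 10⁶ daltons quoted for the L-protein gene; they are not values reported in the text, and the hit density is likewise arbitrary.

```python
import math

# Gene order read from the 3' end of the genome, following the map
# (5')L-G-M-NS-N(3') given in the text.
GENE_ORDER_FROM_3_PRIME = ["N", "NS", "M", "G", "L"]

# HYPOTHETICAL message sizes in kilodaltons (placeholders for illustration,
# not measurements from the text); they sum to 3910, close to 4 x 10^6 daltons.
MRNA_SIZE_KDA = {"N": 550, "NS": 280, "M": 280, "G": 700, "L": 2100}

def predicted_target_sizes(order, sizes):
    """Sequential model: a gene's target is its own message plus all
    messages located 3'-proximal to it."""
    targets = {}
    running_total = 0
    for gene in order:
        running_total += sizes[gene]
        targets[gene] = running_total
    return targets

def surviving_fraction(target_kda, hits_per_kda):
    """Single-hit kinetics: synthesis survives only if the target received
    no hit, which has probability exp(-target * hit density)."""
    return math.exp(-target_kda * hits_per_kda)

targets = predicted_target_sizes(GENE_ORDER_FROM_3_PRIME, MRNA_SIZE_KDA)

if __name__ == "__main__":
    hit_density = 1e-3  # arbitrary UV dose, in hits per kilodalton
    for gene in GENE_ORDER_FROM_3_PRIME:
        fraction = surviving_fraction(targets[gene], hit_density)
        print(gene, targets[gene], round(fraction, 3))
```

With any set of sizes, the N target equals the size of its own message, while the L target equals the sum of all five, which is the pattern the UV data are reported to show.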

REFERENCES
1. R. R. Wagner, in "Comprehensive Virology" (H. Fraenkel-Conrat and R. R. Wagner, eds.), Vol. 4, pp. 1-161. Plenum Press, New York, 1975.
2. S. A. Moyer and A. K. Banerjee, Cell 4, 37 (1975).
3. G. W. Both, S. A. Moyer and A. K. Banerjee, PNAS 72, 274 (1975).
4. S. A. Moyer, M. J. Grubman, E. Ehrenfeld and A. K. Banerjee, Virology 67, 463 (1975).
5. G. Abraham, D. P. Rhodes and A. K. Banerjee, Nature 255, 37 (1975).
6. G. Abraham, D. P. Rhodes and A. K. Banerjee, Cell 5, 51 (1975).
7. D. P. Rhodes and A. K. Banerjee, J. Virol. 17, 33 (1976).
8. A. K. Banerjee, S. A. Moyer and D. P. Rhodes, Virology 61, 547 (1974).
9. A. K. Banerjee and D. P. Rhodes, BBRC 68, 1387 (1976).
10. R. J. Colonno and A. K. Banerjee, Cell 8, 197 (1976).
11. Y. Furuichi, S. Muthukrishnan, J. Tomasz and A. J. Shatkin. This volume, p. 3.
12. B. Moss, S. A. Martin, M. J. Ensinger and R. Boone. This volume, p. 63.
13. G. Abraham and A. K. Banerjee, PNAS 73, 1504 (1976).


The Genome of Poliovirus Is an Exceptional Eukaryotic mRNA

YUAN FON LEE, AKIO NOMOTO AND ECKARD WIMMER
Department of Microbiology
School of Basic Health Sciences
State University of New York at Stony Brook
Stony Brook, New York

The primary structure of a species of mRNA determines its unique function in translation. Structural features common to mRNAs indicate functions shared by mRNAs. Such a common structure is the "capping group" [m7G(5')ppp(5')Np], which has been found at the 5' end of most eukaryotic mRNAs and has been implicated in protein chain initiation (see preceding papers, also Muthukrishnan et al. in this volume). The presence of the capping group is not obligatory for initiation of protein synthesis in eukaryotes, since two RNAs that function as mRNAs without it have been identified so far. These are the genome of satellite tobacco necrosis virus (1; see also the paper by Moss et al. in this volume) and the genome and mRNA of poliovirus (2-4). The RNA in poliovirus is a plus-strand (5) and it serves as mRNA immediately after infection [Fig. 1; for a review, see Levintow (6)]. One of several proteins synthesized in the infected cell is a virus-specific RNA polymerase that, together with the incoming genome RNA, forms a replication complex. Newly synthesized progeny RNA may serve as template in replication, or as mRNA in translation, or may enter the pool of RNA that is encapsidated (Fig. 1). We have previously studied the 5' end of poliovirion RNA and failed to identify a capping group. Based on analyses of alkaline digests of [32P]RNA or labeling with polynucleotide kinase, it was concluded that the RNA is 5'-terminated with pNp (7; Jacobi and Wimmer, unpublished). The available evidence suggests that the viral mRNA is not a precursor for progeny virus (6). Thus, structural modifications of poliovirus mRNA, such as a "capping" during polyribosome formation, would not be evident in virion RNA. We have, therefore, analyzed the 5' end of poliovirus-specific mRNA (2). [32P]mRNA, identified by two-dimensional gel electrophoresis, was


[Figure 1 diagram; legible labels: uncoating, replication complex (membrane bound), procapsid, provirion, RF.]

FIG. 1. Replicative cycle of poliovirus in HeLa cells. This graph is reproduced with permission from K. Dorsch-Häsler, Ph.D. Thesis, St. Louis University, St. Louis, Missouri, 1976.

digested with RNase T2. The products were separated by column chromatography on DEAE-cellulose at pH 5 (2). Under these conditions, the capping group of reovirus mRNA (a generous gift from Drs. Shatkin and Furuichi) elutes together with GTP (Fig. 2A). No radioactive material eluting with GTP or thereafter was observed in digests of poliovirus mRNA (Fig. 2B), an observation indicating that the capping group is absent from viral messenger (2). Hewlett et al. (3) and Fernandez-Muñoz and Darnell (4) have come to the same conclusion using different methods of analysis. Absence of the capping group is evident also when virion [32P]RNA is analyzed by column chromatography at pH 5 (Fig. 2C), which confirms previous studies (7; Jacobi and Wimmer, unpublished). Labeled material eluting ahead of GDP in Fig. 2B has been identified as pUp (2). Owing to the specificity of RNase T2, a nucleoside 3',5'-bisphosphate can originate only from the 5' end of an RNA. We therefore suggest that pUp is the 5' terminus of poliovirus mRNA (see Fig. 4, also refs. 2-4).

[Figure 2 plots, panels A to C; x-axis: fractions.]

FIG. 2. Column chromatography of nucleotides on DEAE-cellulose at pH 5. Reovirus [methyl-3H]mRNA (A), poliovirus [32P]mRNA (B) and poliovirus genome [32P]RNA (C) were digested with RNase T2. The products were applied to the column and eluted with a linear gradient of triethylammonium acetate, pH 5. For details, see Nomoto et al. (2).


pUp is an unlikely terminus for a viral genome (for a discussion, see ref. 2 and below). It is an exceptional terminus also for a eukaryotic mRNA. How does poliovirus mRNA function in vivo without the involvement of a capping group? The most plausible explanation is that the virus has evolved a sequence of nucleotides that binds to ribosomes with high efficiency and serves as an initiation site in protein synthesis. Since the poliovirus genome is messenger as well as template for replication, the ribosome binding site may be preceded by a long "leader sequence" of nucleotides (8) and therefore be distal to the 5' end. Although the poliovirus genome is currently thought to contain only one ribosome binding site (6), recent evidence leaves open the possibility of a second binding site (9) that, if internal, is likely to function also without the involvement of the 5' end. Poliovirus effectively inhibits host cell protein synthesis in HeLa cells (6). It has been considered that the mechanism of inhibition is a virus-induced degradation of capping groups at the 5' end of the host-cell mRNAs. However, such modification of HeLa cell mRNAs after infection with poliovirus has not been detected (4; Hewlett, Rose and Baltimore, unpublished results; Nomoto, Lee and Wimmer, unpublished results). The elution profiles of RNase T2 digests of poliovirus mRNA (Fig. 2B) and of poliovirus genome RNA (Fig. 2C) show differences. First, a distinct peak of pUp as in Fig. 2B is absent in Fig. 2C. Instead, digests of virion RNA produce several compounds that elute at the position of ppN. The quantities of these compounds in Fig. 2C differ from preparation to preparation of virion RNA. They are presumably nucleoside 3',5'-bisphosphates (2) that originate from spurious ends of degraded RNA (10). Second, 32P-labeled material elutes in the first column fraction (at void volume) of digests of virion RNA but not of polio mRNA. This material has been designated as X (Fig. 2C). In the following, we present evidence that suggests that X is a protein that, prior to digestion with RNase T2, was covalently linked to the poliovirus genome. The properties of X, which are summarized in Table I, are consistent with it being a small, phosphorylated protein (10). In a 20% polyacrylamide gel in the presence of dodecyl sulfate, X migrates as a single band slightly slower than the tracking dye (Fig. 3). It elutes from the gel during staining in 10% acetic acid. We conclude that X has a molecular weight below 10,000. Experiments to determine its size more accurately are currently being carried out in our laboratory. X is derived from poliovirus genome RNA, which was isolated as follows. 32P-labeled virus was sedimented through a sucrose gradient in 0.1 M NaCl/0.05 M TrisCl (pH 7.5)/2 mM EDTA/0.5% dodecyl sulfate (7), and subsequently banded in CsCl (11). Viral particles were then


TABLE I

1. X is labeled with 32P (approximately 0.015% of the label in virion RNA). Its yield corresponds to 1-2 phosphates of the total RNA.
2. X adsorbs rapidly to glass, siliconized glass and plastic. It can be eluted from surfaces with 0.1% sodium dodecyl sulfate, but not with chloroform/methanol ( I : ).
3. Up to 90% of the label of X can be precipitated with 5% Cl3CCO2H.
4. Treatment of X with Pronase in 0.1% dodecyl sulfate renders all 32P-label acid-soluble.
5. Extraction of a solution of X in 0.1% dodecyl sulfate with chloroform/methanol (3:1) does not remove acid-insoluble counts from the aqueous phase.
6. X remains at the origin after chromatography on cellulose thin-layer plates, and migrates slightly toward the cathode on Whatman 3 MM paper at pH 3.7 or 7.5.
7. X can be isolated from genome RNA as a 14C-labeled compound after the virions are labeled in vivo with a mixture of 15 14C-labeled amino acids.
8. X migrates in 20% polyacrylamide gels in the presence of dodecyl sulfate slightly more slowly than bromphenol blue (Fig. 3).

a Experimental details will be published elsewhere (cf. 10).

disrupted either by two to three treatments with phenol/chloroform (1:1) in the previous buffer or by treatment with EDTA, citric acid and dodecyl sulfate at pH 3.5 (2, 12) followed by treatment with phenol. Virion [32P]RNA was subsequently sedimented through a sucrose gradient in the same buffer and collected by precipitation with ethanol. Normally, this procedure should result in a complete deproteinization of the RNA. However, to test whether protein X is covalently linked to the RNA or tightly bound by secondary binding forces resisting deproteinization, we subjected polio [32P]RNA to one of the following treatments (10): (i) sedimentation through sucrose in 0.5 M NaCl, 0.05% dodecyl sulfate at 24°C; (ii) heating to 100°C for 2 minutes in dodecyl sulfate, 5 mM EDTA, followed by rapid cooling to room temperature, mixing with Me2SO and sedimentation through a sucrose gradient containing 80% Me2SO, 0.2% dodecyl sulfate at 24°C; or (iii) isopycnic centrifugation in the denaturing salt cesium trichloroacetate, in which the RNA bands at a density corresponding to approximately 4 M salt (13). Undegraded RNA was recovered from the gradients and analyzed for X as described in Fig. 2C (10). In each case, X was found with almost identical yields. Since any of the treatments i-iii strongly interferes with secondary binding forces, these data suggest that X is covalently linked to poliovirion RNA. If X were a conventional, phosphorylated protein, its 32P-label should


FIG. 3. Autoradiogram of a slab gel (6% stacking gel, 20% separating gel). Components were electrophoresed in the presence of dodecyl sulfate. (A) [32P]X (see Table I). (B) [Leu-14C]proteins from a cytoplasmic extract of poliovirus-infected HeLa cells. The viral proteins were labeled 3 hours after infection for 5 minutes. VP1-4 are viral capsid proteins (6).

be linked to an amino acid. If, on the other hand, X is indeed covalently bound to the polio genome prior to treatment with RNase T2, then some or all of its 32P-label should be linked to a nucleoside, possibly to the 5'-terminal nucleoside of the RNA, and to an amino acid. An analysis of the nature of the phosphorus moiety was complicated by the fact that X adheres to surfaces, a property that makes it impossible to treat it with nucleases or phosphatases. Incubation of X in 0.25 N NaOH at 37°C for 1 hour did not render additional counts acid-soluble. Incubation for 24 hours rendered 60% of the control acid-soluble; the acid-soluble material was not inorganic [32P]phosphate


as would be expected if X contained Ser-P or Thr-P. Brief treatment with hot acid (10% trichloroacetic acid, 20 minutes at 100°C) or hot alkali (1 N NaOH, 15 minutes at 100°C) released some but not all label as acid-soluble material. Pronase released one major 32P-labeled product (yield: 90%), which migrates differently from Thr-P, Ser-P or Arg-P during electrophoresis on Whatman 3 MM or DEAE-paper at pH 3.5. The nature of the Pronase product is currently under investigation. The terminal structures of poliovirus-specific RNAs, as known to date, are represented in Fig. 4. The genome RNA and mRNA contain 3'-terminal poly(A), which is thought to be synthesized by transcription from poly(U) of polio minus-strands (12, 14 and literature cited therein). The 5' end of poliovirus mRNA is pUp (2-4). The data presented here suggest that the genome of poliovirus is covalently bound to a small protein. It is attractive to speculate that the 5' end of the polio genome is bound to the protein; however, the nature of the linkage between RNA and protein is as yet unknown. Our finding leaves us with many interesting questions. What is the nature of the protein? Is it virus coded or host-cell specific? At what stage in the replicative cycle of poliovirus is it attached to virion RNA? Is it involved in replication of the genome and/or in the morphogenesis of the virion? Does it play a role in the sensitivity of poliovirus replication to guanidinium chloride (5)? Why is it absent from poliovirus mRNA? Does the protein interfere with translation (initiation), which would make it necessary to remove it prior to complex formation with ribosomes? Does absence of the protein make the viral mRNA incompetent as precursor for virions? Does virion RNA remain infectious after the protein has been removed? The answers to these and other questions will shed light on some of the many problems yet unsolved in poliovirus replication.

[Figure 4 diagram; legible labels: 5' end, 3' end, 7700 nucleotides, genome RNA, protein, messenger RNA, minus strand RNA, pUp, poly(U).]

FIG. 4. Schematic representation of the terminal structures of poliovirus-specific RNAs.
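The statement in Table I that the label in X "corresponds to 1-2 phosphates of the total RNA" can be checked with a line of arithmetic, using the figure of about 0.015% of the virion RNA label and the genome length of 7700 nucleotides shown in Fig. 4. The sketch below assumes uniform 32P labeling and one phosphate per nucleotide; both are simplifying assumptions made here for illustration, and the function name is hypothetical.

```python
# Back-of-envelope check of Table I, item 1.
# Assumptions made for this sketch (not stated in the text): the RNA is
# uniformly labeled with 32P and carries one phosphate per nucleotide.

GENOME_LENGTH_NT = 7700            # genome length shown in Fig. 4
LABEL_FRACTION_IN_X = 0.015 / 100  # about 0.015% of the label in virion RNA

def phosphate_equivalents(genome_length_nt, label_fraction):
    """Number of phosphates that a given fraction of the total label represents."""
    return genome_length_nt * label_fraction

phosphates_in_x = phosphate_equivalents(GENOME_LENGTH_NT, LABEL_FRACTION_IN_X)

if __name__ == "__main__":
    print(f"X carries roughly {phosphates_in_x:.1f} phosphate equivalents")
```

The result, a little over one phosphate, falls inside the 1-2 range given in the table, as expected if X is attached to the RNA through a single terminal linkage.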


Summary

Previous studies from three laboratories show that the mRNA of poliovirus, when isolated from polyribosomes of infected HeLa cells, is not "capped" with m7G(5')ppp(5')Np. Implications of this finding are discussed. Evidence is presented suggesting that polio genome RNA, but not polio mRNA, is covalently linked to a small protein (molecular weight below 10,000).

ACKNOWLEDGMENTS

We thank our colleagues from the Department of Microbiology for numerous suggestions and discussions that were invaluable for us during the course of the investigation reported here. In addition, we are grateful to Barbara Morgan Detjen for advice in writing the manuscript. This work was supported by Grant No. CA16879, awarded by the National Cancer Institute, and by a Postdoctoral Fellowship CA-0.1180 of the National Institutes of Health to Y.F.L.

REFERENCES
1. E. Wimmer, A. Y. Chang, J. M. Clark and M. E. Reichmann, JMB 38, 59 (1968).
2. A. Nomoto, Y. F. Lee and E. Wimmer, PNAS 73, 375 (1976).
3. M. J. Hewlett, J. K. Rose and D. Baltimore, PNAS 73, 327 (1976).
4. R. Fernandez-Muñoz and J. Darnell, J. Virol. 18, 719 (1976).
5. D. Baltimore, Bacteriol. Rev. 35, 235 (1971).
6. L. Levintow, in "Comprehensive Virology" (H. Fraenkel-Conrat and R. R. Wagner, eds.), Vol. 2, p. 109. Plenum, New York, 1974.
7. E. Wimmer, JMB 68, 537 (1972).
8. C. Weissman, et al., ARB 42, 303 (1973).
9. M. L. Celma and E. Ehrenfeld, JMB 98, 761 (1975).
10. Y. F. Lee, A. Nomoto and E. Wimmer, PNAS, in press (1976).
11. C. N. Cole, D. Smoler, E. Wimmer and D. Baltimore, J. Virol. 7, 478 (1971).
12. K. Dorsch-Häsler, Y. Yogo and E. Wimmer, J. Virol. 16, 1512 (1975).
13. R. L. Burke and W. R. Bauer (1976). In preparation.
14. Y. Yogo and E. Wimmer, JMB 92, 467 (1975).

II. Sequences and Conformations of mRNAs

Transcribed Oligonucleotide Sequences in HeLa Cell hnRNA and mRNA
MARY EDMONDS, HIROSHI NAKAZATO, E. L. KORWEK AND S. VENKATESAN

Polyadenylylation of Stored mRNA in Cotton Seed Germination
BARRY HARRIS AND LEON DURE III

mRNAs Containing and Lacking Poly(A) Function as Separate and Distinct Classes during Embryonic Development
MARTIN NEMER AND SAUL SURREY

Sequence Analysis of Eukaryotic mRNA
N. J. PROUDFOOT, C. C. CHENG AND G. G. BROWNLEE

The Structure and Function of Protamine mRNA from Developing Trout Testis
P. L. DAVIES, G. H. DIXON, L. N. FERRIER, L. GEDAMU AND K. IATROU

The Primary Structure of Regions of SV40 DNA Encoding the Ends of mRNA
KIRANUR N. SUBRAMANIAN, PRABHAT K. GHOSH, RAVI DHAR, BAYAR THIMMAPPAYA, SAYEEDA B. ZAIN, JULIAN PAN AND SHERMAN M. WEISSMAN

Nucleotide Sequence Analysis of Coding and Noncoding Regions of Human β-Globin mRNA
CHARLES A. MAROTTA, BERNARD G. FORGET, MICHEL COHEN-SOLAL AND SHERMAN M. WEISSMAN

Determination of Globin mRNA Sequences and Their Insertion into Bacterial Plasmids
WINSTON SALSER, JEFF BROWNE, PAT CLARKE, HOWARD HEINDELL, RUSSELL HIGUCHI, GARY PADDOCK, JOHN ROBERTS, GARY STUDNICKA AND PAUL ZAKAR

Mutation Rates in Globin Genes: The Genetic Load and Haldane's Dilemma
WINSTON SALSER AND JUDITH STROMMER ISAACSON

The Chromosomal Arrangement of Coding Sequences in a Family of Repeated Genes
G. M. RUBIN, D. J. FINNEGAN AND D. S. HOGNESS

Heterogeneity of the 3' Portion of Sequences Related to Immunoglobulin κ-Chain mRNA
URSULA STORB

Structural Studies on Intact and Deadenylylated Rabbit Globin mRNA
JOHN N. VOURNAKIS, MARCIA S. FLASHNER, MARYANN KATOPES, GARY A. KITOS, NIKOS C. VAMVAKOPOULOS, MATTHEW S. SELL AND REGINA M. WURST

Molecular Weight Distribution of RNA Fractionated on Aqueous and 70% Formamide Sucrose Gradients
HELGA BOEDTKER AND HANS LEHRACH

Transcribed Oligonucleotide Sequences in HeLa Cell hnRNA and mRNA

MARY EDMONDS, HIROSHI NAKAZATO, E. L. KORWEK AND S. VENKATESAN
Life Science Department
University of Pittsburgh
Pittsburgh, Pennsylvania

I. Introduction

The poly(A) sequences at the 3' ends of many of the hnRNA molecules of animal cell nuclei provided the first evidence for the existence of common sequences within this highly heterogeneous RNA population. Additional sequence homologies were revealed when high concentrations of two shorter single-base sequences were found in the hnRNA of HeLa cells. One is an internal sequence of about 25 AMPs that, in contrast to the longer poly(A) at the 3' end, is transcribed (1). The other is a poly(U) sequence of 30-40 nucleotides concentrated in the largest hnRNA molecules in regions distant from the poly(A) terminus (2, 3). We have been examining the properties and distribution of these sequences in the nucleus and cytoplasm of HeLa cells in the expectation that insights could be gained on mechanisms of mRNA production through hnRNA processing.

II. A Transcribed Oligo(A) Sequence in hnRNA

Poly(A) preparations from ribonuclease digests of hnRNA purified on oligo(dT)-cellulose contain a small AMP-rich component that readily separates from the large poly(A) sequence during electrophoresis. While investigating the location of the large poly(A) in periodate-oxidized nuclear RNA molecules that had been reduced with labeled sodium borohydride, it was noted that, in contrast to large poly(A), this rapidly migrating species did not become labeled (4). It was concluded that, unlike large poly(A), this small AMP-rich species is not at the 3' end of hnRNA molecules. A comparative study of the effects of controlled doses of actinomycin D and 3'-deoxyadenosine on the biosynthesis of the two sequences clearly showed that different mechanisms were involved (1). Synthesis of the large poly(A) was, as expected, inhibited markedly by 3'-deoxyadenosine, while that of small poly(A) was not. Conditions of actinomycin treatment were also found that could abolish small-poly(A) synthesis without greatly reducing large-poly(A) synthesis (1). It was apparent that the small poly(A) is a transcribed sequence not derived from the large poly(A), nor is it a special size class of 3'-terminal poly(A).

A. Localization of Oligo(A) in RNA of the Nucleus and Cytoplasm

We have developed conditions that allow an essentially quantitative binding of denatured poly(A)-containing hnRNA molecules to oligo(dT)-cellulose without significant degradation (5 and Fig. 1B). A brief heat treatment at 63°C in Me2SO has relatively small effects on the sedimentation properties of total nuclear RNA of HeLa cells labeled for 4 hours with 32P (Fig. 1B). Subsequent binding of this RNA to oligo(dT)-cellulose has little effect on the sedimentation properties of either bound or unbound fractions (Fig. 1A). Figure 2A shows the large and small poly(A) sequences recovered from the ribonuclease digests of the unfractionated total nuclear RNA. Figure 2B shows that oligo(dT)-cellulose produces nearly a complete separation of the large and small poly(A) sequences. Since there is little evidence for degradation of the RNA during these manipulations (Fig. 1A), we conclude that the small transcribed poly(A)'s are primarily in hnRNA molecules that do not contain the large poly(A) sequence.

As seen in Figs. 2C and 2D, there is little evidence for a small poly(A) sequence of this size in the cytoplasm after 4 hours of labeling. This had been noted earlier in short labeling times as well (1). The diffuse spread of rapidly migrating radioactivity on these gels is primarily composed of AMP-rich sequences arising from the shortening of poly(A) during this extended labeling period (6).

FIG. 1. Sedimentation profile of nuclear RNA(A_n) and RNA(no A_n).* Nuclear RNA of HeLa cells labeled for 4 hours with 32P was separated into (A_n) and (no A_n) RNA and centrifuged through a 15% to 30% linear sucrose density gradient in a Spinco SW 40 rotor for 13 hours at 18,800 rpm after treatment with Me2SO as described (5). Total nuclear RNA with and without Me2SO treatment was also sedimented. (A) ○-○, nuclear RNA(A_n); ○---○, nuclear RNA(no A_n). (B) Total nuclear RNA with (○-○) and without (○---○) Me2SO treatment.

* See footnote 1.

FIG. 2. Distribution of large and small poly(A) in cytoplasmic and nuclear RNA of HeLa cells. Four-hour 32P-labeled cytoplasmic and nuclear RNAs of HeLa were separated into (A_n) and (no A_n) fractions (5). The poly(A) isolated from each fraction after nuclease treatment was further purified by a second binding and elution from oligo(dT)-cellulose. Poly(A) from (A) total nuclear RNA; (B) nuclear RNA(A_n) (●---●) and nuclear RNA(no A_n) (○-○); (C) total cytoplasmic RNA; and (D) cytoplasmic RNA(A_n) (○-○) and cytoplasmic RNA(no A_n) (○---○) were electrophoresed in 10% polyacrylamide gels. |-|: bromphenol blue dye marker. [Panels A-D; abscissa, fraction number.]

B. Size and Composition of Small Poly(A)

The small poly(A) sequence is often contaminated with RNA fragments most easily removed by a second binding to oligo(dT)-cellulose, as seen in Fig. 3. The fraction retained by oligo(dT)-cellulose is now considerably more homogeneous (Fig. 4A) and is highly enriched in AMP, as the data of Table I, experiment 1 show. Figure 4B shows that the sequence is not detectably shortened by a second treatment with a 50-fold increase in the RNase A concentration, suggesting that it is uninterrupted by pyrimidine nucleotides. The sequence purified from an RNase-T1-plus-RNase-A digest invariably contains one GMP residue per 25 AMP residues. Since the GMP disappears on treatment with alkaline phosphatase (Table I), we conclude that the oligo(A) derived from this digest of hnRNA has a GMP at its 3' end and contains 25 AMP residues, a length compatible with its electrophoretic mobility. A minority of the oligo(A) sequences may be terminated by UMP, since a small drop in UMP content follows phosphatase treatment (Table I).

To determine the nucleotides that surround the oligo(A) sequence, a fragment released by RNase A treatment was purified. A majority of these sequences contain 2 GMPs and a UMP at the 3' terminus, since most of the UMP disappeared after phosphatase treatment. Again, a minority of these sequences may contain a CMP at the 3' end, since the relatively low initial CMP content was also reduced by phosphatase treatment. Two observations derived from 3'-exonuclease treatment of this sequence with snake venom phosphodiesterase allow us to assign the extra GMP in this sequence to the 3' side of oligo(A). In one case, both GMPs and one UMP were released before oligo(A) digestion was more than 50% completed. Significantly, all guanine was recovered as 5'-GMP rather than guanosine when complete digestion was achieved. Although the presence of one or more AMP residues within the GGU sequence remains to be determined, we tentatively propose the following sequence for a majority of the oligo(A) sequences in hnRNA: --[A-A---A]_n-G-G-U--.

It has been suggested by Scherrer (7) that oligo(A) sequences might serve as primers for poly(A) synthesis if they are at 3' ends. Dictyostelium discoideum does in fact contain a small transcribed oligo(A) in both hnRNA and mRNA that is separated by several nucleotides from a longer nontranscribed poly(A) at the 3' end (8). Although it has been suggested (8) that the oligo(A) is a recognition site for the posttranscriptional addition of poly(A), it obviously cannot serve as the actual primer site in this case.

FIG. 3. Purification of small poly(A) by a second oligo(dT) binding. The poly(A) fraction (●---●) isolated from nuclear RNA(no A_n) (after nuclease treatment) by binding and eluting from oligo(dT)-cellulose was again subjected to oligo(dT)-cellulose binding and separated into the bound (○-○, 58.8% of "poly(A) fraction") and unbound (●-●, 31.2%) fractions described in Fig. 2. Electropherograms run in parallel are plotted in one figure. |-|: dye marker. Base compositions of sample ○-○ were 2.3% C, 89.5% A, 3.9% G and 4.3% U; of sample ●-●, 32.2% C, 9.8% A, 44.0% G and 14% U.

FIG. 4. Inability of RNase A to hydrolyze purified oligo(A). Oligo(A) isolated from nuclear RNA(no A_n) was recovered from 10% gel and was purified by binding and eluting from oligo(dT)-cellulose. Two aliquots were incubated for 30 minutes at 37°C with and without 4 μg of RNase A in 1.2 ml of 0.025 M Tris-Cl pH 7.4 containing 0.2 M NaCl and 1 OD-unit of yeast 4 S RNA. After adding sodium dodecyl sulfate to 0.7%, 40-μl aliquots of each reaction were electrophoresed on 10% polyacrylamide gel with [3H]adenosine-labeled HeLa cytoplasmic RNA as a marker. (A) Without RNase; (B) with RNase. ○-○: small poly(A); ●---●: cytoplasmic [3H]RNA; |-|: dye marker. [Abscissa: fraction number.]

TABLE I

Treatments                                         Nucleotide composition (%)     Total 32P
1              2              3                    C       A       G       U      analyzed (cpm)

RNase A + T1   KOH                                 0.7a    93.9    3.8     1.5      7,180
RNase A + T1   Phosphatase    KOH                  0.1a    99.0    0.2     0.7     17,400
RNase A + T1   Phosphatase    Phosphodiesterase    1.3     93.5    3.8     1.3     40,400
RNase A        KOH                                 1.5a    85.4    9.5     3.6     14,000
RNase A        Phosphatase    KOH                  0.1a    88.4   10.1     0.6     15,500
RNase A        Phosphatase    Phosphodiesterase    1.8     84.7   10.0     3.6     15,900

a Corrected for a contaminant derived from alkaline hydrolyzates of poly(A) that comigrates with CMP (18).

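
As a rough consistency check on the proposed sequence, the compositions expected for the two digestion products can be computed and set beside Table I. The sketch below is not the authors' calculation. It assumes uniform 32P labeling (so that the percent of label in each nucleotide tracks nucleotide composition), takes the RNase A plus T1 product to be A25-G and the RNase A product to be A25-G-G-U, and uses a hypothetical helper name, composition_percent.

```python
from collections import Counter

def composition_percent(sequence):
    """Return the percent of each nucleotide (C, A, G, U) in an RNA sequence."""
    counts = Counter(sequence)  # missing bases count as 0
    return {base: 100.0 * counts[base] / len(sequence) for base in "CAGU"}

# Assumed digestion products of the proposed --[A]25-G-G-U-- region.
t1_plus_a_product = "A" * 25 + "G"    # RNase T1 cleaves after the first G
rnase_a_product = "A" * 25 + "GGU"    # RNase A cleaves after the pyrimidine

for name, seq in [("RNase A + T1", t1_plus_a_product), ("RNase A", rnase_a_product)]:
    percents = composition_percent(seq)
    print(name, {base: round(value, 1) for base, value in percents.items()})
```

Under these assumptions the A25-G product would carry 1/26 of its label in G (about 3.8%), close to the value in Table I, while the A25-G-G-U product would carry about 7.1% in G and 3.6% in U, to be compared with the measured 9.5% and 3.6%.
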
Some observations on the distribution of oligo(A) and poly(A) sequences in HeLa cell RNA led us to reconsider this possibility for HeLa oligo(A) sequences, primarily because it should be possible to test it experimentally. The facts that oligo(A) and poly(A) sequences are not present in the same hnRNA molecules of any size class, and that mRNA appears to lack oligo(A), could be accounted for by the series of steps outlined in Fig. 5. According to this scheme, oligo(A) occurring at some unspecified site within hnRNA undergoes cleavage at or near its 3' end to produce a 3'-hydroxyl group on the terminal AMP, which subsequently becomes the site for the polymerization of new AMP residues by poly(A) polymerase. Sequences released by the endonuclease, as well as others from the 5' ends of hnRNA not destined for export to cytoplasm as mRNA, are then degraded.

[Scheme: hnRNA -> ("endonuclease") -> ----[A-A---A-OH + released sequences -> (poly(A) polymerase) -> ----[A-A---A-]A-A-A--A-OH -> (processing? transport) -> polysomal mRNA]

FIG. 5. Scheme showing cleavage of oligo(A) within hnRNA and steps in degradation of released sequences.

III. Oligo(U) Sequences in hnRNA and mRNA

The abundant oligo(U) sequences of 30-40 nucleotides in hnRNA tend to be concentrated in the largest size classes of hnRNA and are relatively sparse in smaller hnRNAs (<28 S) (2). In the case of poly(A)-containing hnRNA, where it has been readily tested, the oligo(U) sequences are far removed from the 3' end (2). Polysome-bound mRNA was also reported (2) to be very low in oligo(U) sequences, an observation compatible with models for mRNA biogenesis from the 3' ends of hnRNA, since oligo(U) content is very low in this region. In our experiments, we have repeatedly found as much as 20-25% of the total poly(U) sequences of HeLa cells in the cytoplasm. Since we thought it unlikely that cytoplasm would invariably be contaminated with this large quantity of poly(U)-containing nuclear RNA, we have made more detailed studies of the localization and properties of the RNA molecules containing these sequences.

A. Poly(U) Sequences in Messenger RNA

A large fraction of the poly(U) sequences in the cytoplasm of HeLa cells is found in the polysome fraction. The fact that most of these sequences are shifted to regions containing more slowly sedimenting components when EDTA is included in the gradient suggested that poly(U) is in mRNA, not in an undefined polydisperse class of RNA sedimenting in the polysome region of these gradients (9). Ribosomal RNA had previously been eliminated as a source of poly(U) sequences when the latter were not found in the 45 S pre-rRNA of the nucleolus of HeLa cells (10).

The detection of poly(U) sequences in polysomal RNA indicated that a detailed quantitative study of the distribution of poly(U) sequences in all classes of cellular RNA was needed to assess the significance of this observation. Figure 6 and Table II contain the results of analyses carried out on both poly(A)-containing and poly(A)-free RNA of the nucleus and cytoplasm of cells labeled for 2.5 hours with [32P]phosphate. Each of these four RNA fractions was denatured by heating in the presence of an excess of unlabeled poly(U) for 3 minutes at 64°C in 70% Me2SO before it was applied to the sucrose gradients described in Fig. 6. The characteristic distribution of radioactivity in each of the four RNA fractions, sedimented in the same rotor, is seen in Figs. 6A and B. Each gradient was divided into 5 identical pools, and the poly(U) content of the cytoplasmic RNA of each pool is shown in the hatched bars and that of nuclear RNA in the open bars. A numerical summary of the poly(A)-containing RNAs is included in Table II.

It is evident from Figs. 6A and B and Table II that poly(U) sequences are found in all size classes of poly(A)-containing mRNA and that a striking correlation is found between the distribution of the poly(A) sequences and the poly(U) sequences in each of the five fractions. A calculation of the average number of poly(U) sequences per mRNA molecule can be obtained from these data since both the poly(A) and poly(U) sequences are rather homogeneous (see gel patterns of Fig. 2 and Fig. 7) and each mRNA contains only one poly(A) sequence. If estimates of 120 and 30 are used as the average lengths of the labeled poly(A) and poly(U) sequences, approximately 0.20 mole of poly(U) is present per mole of poly(A) in all size classes of mRNA. The deviations found in pools I and V can be ignored for this calculation since they constitute only 7% of the total poly(A)-containing mRNA. It can be concluded that about 20% of such mRNAs could contain one poly(U) sequence, although the possibility of more than one poly(U) per mRNA cannot be excluded. This is quite different from nuclear RNA, where the largest hnRNA molecules contain as many as 3 or 4 poly(U) sequences per molecule, while the smallest poly(A)-containing hnRNAs of the size of mRNA contain less than 0.20 per RNA molecule (3).

Many of the poly(U) sequences in the cytoplasm and especially those in the nucleus are found in RNA molecules lacking poly(A). Those of the cytoplasmic RNA are of particular interest, since many are in an RNA fraction associated with polysomes that has properties similar to mRNA (10). This poly(A)-free RNA in HeLa cytoplasm has recently been characterized as a class of mRNA (11) apparently unrelated structurally to mRNA(A_n),¹ since it failed to hybridize to cDNA transcribed from the mRNA(A_n) by reverse transcriptase. More than 50% of the poly(U) sequences of cytoplasm are usually found in RNA(no A_n),¹ although mRNA(no A_n) constitutes only 30% of the total mRNA of HeLa cells regardless of the length of the labeling period (12). Although poly(U) sequences may be present in higher concentrations in poly(A)-free than in poly(A)-containing mRNA, the possibility of contamination with poly(U)-containing nuclear RNA is also greater for the former because of the excess of poly(U) sequences in hnRNA(no A_n) (Fig. 6).

FIG. 6. Poly(U) content of mRNA and hnRNA. The RNAs, isolated and separated into RNA(A_n) and RNA(no A_n) as described in the legend to Fig. 1, were each mixed with 25 μg of unlabeled commercial poly(U) in the presence of 0.1 M NaCl and were precipitated with ethanol. The precipitates collected by centrifugation were dissolved in 0.12 ml of 0.01 M EDTA, 0.01 M Tris (pH 7.4) and 0.2% Na dodecyl sulfate. After adding 0.28 ml of Me2SO, the solutions were heated for 3 minutes at 64°C before adding 0.40 ml of the same buffer (plus 0.1 M NaCl) and rapidly cooled in a water bath at 23°C. This RNA solution was layered over 12 ml of a 15 to 30% linear sucrose gradient in the latter solution and centrifuged for 15 hours at 29,000 rpm in an IEC SB-283 rotor at 21°C. After fractionation, 5-μl aliquots were taken for acid-precipitable radioactivity, and the remainder of the fractions were pooled as designated by Roman numerals. The RNA that had precipitated (fraction = 0) was recovered by dissolving in the NaCl/EDTA/Tris/dodecyl sulfate solution. It was added to pool I. After ethanol precipitation in the presence of 200 μg of yeast RNA, each pool was assayed for poly(U) and poly(A) content as described in Section III, A. Radioactivity in 5-μl aliquots (about 1/100 of a fraction volume) of nuclear RNA (solid line) and cytoplasmic RNA (dashed line) run in the same rotor are plotted. (A) RNA(A_n); (B) RNA(no A_n). Open bars show poly(U) content of nuclear RNA; hatched bars, poly(U) in cytoplasmic RNA. [Abscissa: fraction number.]

TABLE II

                    Poly(A) (cpm)           Poly(U) (cpm)           Poly(U)/poly(A) (molar ratio)a
Pool    Size        Nucleus    Cytoplasm    Nucleus    Cytoplasm    Nucleus    Cytoplasm

I       >28 S         261         35          62.7       3.?         1.7        0.27
II       28 S        1640        686         139.0      29.6         0.59       0.17
III      18 S        2370       2360          89.3     105.0         0.26       0.18
IV       >4 S         337        505          10.2      29.3         0.21       0.23
V        <4 S         109        171          10.6      16.7         0.68       0.39

Total cpm:           4717       3757         312.0     184.0
% in cytoplasm:                 44.3                    37.0

a Oligo(U) = ~30 UMPs; poly(A) = ~120 AMPs (cytoplasm), ~210 AMPs (nucleus).
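
The molar ratios quoted for Table II can be re-derived from the counts, which may make the "0.20 mole of poly(U) per mole of poly(A)" estimate easier to follow. The sketch below is an illustration, not the authors' worksheet. It assumes uniform 32P labeling, so that cpm divided by sequence length is proportional to the number of sequences, and it uses the lengths given in the table footnote (about 30 UMPs per oligo(U), 120 AMPs per cytoplasmic poly(A), 210 AMPs per nuclear poly(A)). The function name poly_u_per_poly_a is hypothetical, and the cpm values are those of pools II to IV as read from Table II.

```python
POLY_U_LENGTH = 30                                   # ~30 UMPs per oligo(U)
POLY_A_LENGTH = {"nucleus": 210, "cytoplasm": 120}   # ~AMPs per poly(A)

def poly_u_per_poly_a(poly_u_cpm, poly_a_cpm, poly_a_length, poly_u_length=POLY_U_LENGTH):
    """Moles of poly(U) per mole of poly(A), from the 32P cpm in each sequence."""
    return (poly_u_cpm / poly_u_length) / (poly_a_cpm / poly_a_length)

# (poly(U) cpm, poly(A) cpm) for pools II-IV, as read from Table II.
pools = {
    "nucleus": {"II": (139.0, 1640), "III": (89.3, 2370), "IV": (10.2, 337)},
    "cytoplasm": {"II": (29.6, 686), "III": (105.0, 2360), "IV": (29.3, 505)},
}

ratios = {
    site: {pool: poly_u_per_poly_a(u, a, POLY_A_LENGTH[site]) for pool, (u, a) in values.items()}
    for site, values in pools.items()
}
for site, by_pool in ratios.items():
    print(site, {pool: round(ratio, 2) for pool, ratio in by_pool.items()})
```

With these inputs the cytoplasmic pools II to IV fall between roughly 0.17 and 0.23, which is the basis for the figure of about 0.20 used in the text.
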

B. Contamination of mRNA with Poly(U) Sequences from the Nucleus

Since the concentration of poly(U)-containing RNA of the nucleus always exceeds that in cytoplasm, the question of nuclear contamination as the source of the cytoplasmic poly(U) must be considered. It would be reasonable to expect as much as 5% of the RNA of the nucleus to be recovered in cytoplasmic fractions prepared by homogenization of hypotonically swollen HeLa cells, since 2-4% of the nuclear DNA is usually found in cytoplasm under these conditions (13). The experiment summarized in Fig. 6A and Table II shows not only that the concentration of poly(U) in the cytoplasm relative to that in the nucleus exceeds this amount by an order of magnitude, but also that the poly(U) sequences are concentrated in different size classes of RNA in cytoplasm from those in the nucleus.

¹ RNA(A_n) is "poly(A)-containing RNA," sometimes referred to as "poly A(+) RNA." RNA(no A_n) is used for "poly(A)-free RNA," sometimes referred to as "poly A(-) RNA," or simply as RNA. [Eds.]


A comparison of poly(U) sequences in RNA(A_n) from the two sites provides compelling evidence for localization of poly(U) sequences in cytoplasmic mRNA. In this experiment, the fraction of the total poly(U) sequences in the RNA(A_n) present in cytoplasm is similar to the fraction of the total cellular poly(A) in cytoplasm (i.e., 37% vs. 44%). The poly(U) sequences are also distributed in a population of RNA molecules sedimenting after Me2SO denaturation with velocities characteristic of mRNA rather than of hnRNA(A_n) (Fig. 6A).

The question of nuclear contamination of the RNA(no A_n) of cytoplasm is more serious because of the very high levels of poly(U) in hnRNA(no A_n) (Fig. 6B). However, in this case also, the distribution of poly(U) in various size classes of RNA in the nucleus in no way resembles the distribution in cytoplasm. The RNA of pool I of the nucleus, for example, contains 25 times more poly(U) than does pool I of cytoplasm, suggesting that, for RNA of this size class, the nuclear contamination did not exceed 4% even if all RNA in this cytoplasmic pool resulted from nuclear contamination. These data exclude any simple form of nuclear contamination as the source of most of the poly(U)-containing RNA of cytoplasm. This is particularly clear in the case of the poly(U) sequences in mRNA(A_n) in our experiments, where labeled poly(U) sequences are present in quantities similar to those in the nuclear hnRNA(A_n) molecules of similar size.
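
The two numerical arguments in this section (the 4% ceiling for pool I, and the 37% vs. 44% comparison) are simple ratios, sketched below for illustration. The function names are hypothetical; the totals are those of Table II, and the 25-fold figure is the one quoted in the text for pool I of RNA(no A_n).

```python
def max_contamination_fraction(nuclear_over_cytoplasmic):
    """Upper bound on nuclear carry-over into a cytoplasmic pool, assuming
    every count in that cytoplasmic pool came from the nucleus."""
    return 1.0 / nuclear_over_cytoplasmic

def cytoplasmic_share(cytoplasm_cpm, nucleus_cpm):
    """Fraction of the total cellular label that is found in the cytoplasm."""
    return cytoplasm_cpm / (cytoplasm_cpm + nucleus_cpm)

pool_i_bound = max_contamination_fraction(25)     # pool I, RNA(no A_n)
poly_u_share = cytoplasmic_share(184.0, 312.0)    # poly(U) totals, Table II
poly_a_share = cytoplasmic_share(3757, 4717)      # poly(A) totals, Table II

print(f"pool I contamination ceiling: {pool_i_bound:.0%}")
print(f"poly(U) in cytoplasm: {poly_u_share:.0%}; poly(A) in cytoplasm: {poly_a_share:.0%}")
```

Both cytoplasmic shares are far above the roughly 5% carry-over that leakage from nuclei would be expected to give, which is the point of the comparison.
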

C. Properties of Poly(U) Sequences

Poly(U) sequences from RNA(A_n) and RNA(no A_n) of the nucleus and cytoplasm display similar electrophoretic mobilities (Fig. 7), although those from cytoplasmic RNAs migrate somewhat more heterogeneously. The sequences electrophoresed in Fig. 7 were obtained from a pool of each RNA that contained a large amount of poly(U). Poly(U) always migrated with the bromphenol blue dye marker slightly ahead of transfer RNA. A poly(A) fraction from pulse-labeled yeast mRNA, measured by end-group analysis as 50 nucleotides in length, comigrated with transfer RNA in similar gels (14). The minimum size of these sequences can be estimated from the composition data for poly(U) fractions obtained from RNA(A_n)'s (Table III), since each sequence in a complete RNase T1 digest should be terminated by a single G residue. Sequences of 25-30 nucleotides would be calculated from the ratio of 32P in U to that in G for the poly(U) obtained from each size class. This is in close agreement with the length obtained previously for the poly(U) sequences in HeLa nuclear RNA (2). This value increases to 35 or 40 if this poly(U) fraction is further purified on DEAE-Sephadex (10).
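
The length estimate from the U-to-G ratio can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: each sequence is assumed to end in exactly one G, labeling is assumed uniform, and the contaminating C and A counts are ignored. The function name is hypothetical, and the percentages are those of the poly(U) components as read from Table III.

```python
def min_length_from_u_to_g(percent_u, percent_g):
    """Estimate chain length as (U residues per G) plus the single terminal G."""
    return percent_u / percent_g + 1.0

# Percent of 32P in U and in G for the poly(U) components, as read from Table III.
poly_u_components = {"nucleus": (79.5, 3.5), "cytoplasm": (80.3, 3.0)}

estimates = {
    site: min_length_from_u_to_g(u, g) for site, (u, g) in poly_u_components.items()
}
for site, length in estimates.items():
    print(f"{site}: about {length:.0f} nucleotides")
```

With the Table III values this gives lengths in the mid-to-upper twenties, in the neighborhood of the 25-30 nucleotides quoted in the text.
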

FIG. 7. Electrophoretic mobilities of poly(U) fractions from the nucleus and cytoplasm. Poly(U) isolated from one pool of each of the nuclear and cytoplasmic gradients shown in Fig. 2 was electrophoresed with [3H]adenosine-labeled RNA(no A_n) from the cytoplasm of HeLa cells to provide a 4 S RNA marker. ○-○, Poly(U) from (A) cytoplasmic RNA(no A_n) III; (B) cytoplasmic RNA(A_n) III; (C) nuclear RNA(no A_n) II; (D) nuclear RNA(A_n) III. [3H]Adenosine-labeled marker for 4 S RNA is shown by dashed line in (C) and by arrows in A, B, and D. Poly(U) from cytoplasmic poly(A)-free RNA V electrophoresed on a parallel gel is plotted as a solid line in (A). Fractions were pooled as indicated by bars and were analyzed for base composition as shown in Table III. [Abscissa: fraction number.]

It is apparent from Fig. 7 that the poly(U) fractions from RNA(A_n)'s bound to a poly(A) adsorbent² contain components not found in RNA lacking poly(A). The composition analysis of Table III shows that this slowly migrating heterogeneous material is poly(A). In these experiments, it represents about 10% of the poly(A) sequences in the RNA sample. Its separation from poly(U) during electrophoresis shows that it is not covalently linked to poly(U), but may be hybridized to the unlabeled poly(U) added to the RNase T1 digest to ensure a maximal reproducible binding of the poly(U) to the poly(A) resin (Venkatesan et al., 19). Because of the large excess of poly(U) in the mixture, it is reasonable to expect that many single-stranded regions remain that hybridize to the poly(A) adsorbent and carry along segments of labeled poly(A) annealed to other regions of the same poly(U) chains.

² The poly(A) adsorbent is poly(A) linked by diazotization to a methylene dianiline derivative of starch (19).

TABLE III
COMPOSITION OF SEQUENCES BOUND TO THE POLY(A) ADSORBENT a

                        Nucleotide composition (%)        32P analyzed
                        C       A       G       U         (cpm)

Nucleus
  Component A           2.5    94.9     0.2     2.4       4500
  Component U           7.0    10.2     3.5    79.5       4000
Cytoplasm
  Component A           2.4    95.6     0.2     1.9       4800
  Component U           5.6    11.2     3.0    80.3       2240

a From Figs. 7B and 7D.

We did not find in HeLa cell cytoplasm any small UMP-rich RNA molecules similar to those isolated from differentiated tissues that alter the translation of specific mRNAs in cell-free systems (15, 16). We would have expected these translational control RNAs to be in pool V of the RNA gradients of Fig. 6. The poly(U) sequences recovered from the cytoplasmic RNA(no A_n) of this pool migrated during electrophoresis with the same velocity as did the poly(U) from all other size classes of RNA (shown in the solid line of Fig. 7A). A poly(U) sequence of 15 or 20 nucleotides, as has been estimated to be in such RNA, would migrate well ahead of the dye in this system.

What is known of the location of poly(U) sequences in the nuclear RNA has been readily accommodated by models for the biogenesis of mRNA in which the mRNA sequences are derived from the 3' ends of larger completed hnRNA molecules that have been polyadenylylated posttranscriptionally (3). Such observations include the much higher concentrations of poly(U) sequences in the larger hnRNA molecules and take into account the fact that limited fragmentation of large hnRNA(A_n) results in a 90% decrease in the poly(U) content of the poly(A)-containing fragments of sizes similar to mRNA (3). These data are compatible also with the original report of the very low levels of poly(U) in polysome-bound mRNA (2). While our detection of poly(U) sequences in some mRNA species is not inconsistent with this model, since some poly(U) sequences are found in poly(A)-containing hnRNA molecules of the size of mRNA (Fig. 2 and ref. 3), the finding is also consistent with some mRNA originating from the 5' regions of hnRNA. Support for the presence of mRNA at the 5' end of hnRNA has recently been provided by the detection of methyl-blocked 5'-terminal sequences (so-called "caps") in very large (>32 S) hnRNA molecules of mouse L cells (17).

Techniques allowing the selection of poly(U)-containing RNA molecules from mixed RNA populations would be especially helpful in answering the many questions raised in these experiments about the structure and function of poly(U)-containing RNA. The relatively short length of the poly(U) sequence coexisting with an excess of much longer poly(A) sequences has made this a more difficult problem to solve than was the case for poly(A)-containing RNA molecules. Studies of the function and metabolism of poly(U)-containing mRNA molecules will depend on the development of this capability.

REFERENCES

1. H. Nakazato, M. Edmonds and D. W. Kopp, PNAS 71, 200 (1974).
2. G. R. Molloy, W. L. Thomas and J. E. Darnell, PNAS 69, 3684 (1972).
3. G. R. Molloy, W. Jelinek, M. Salditt and J. E. Darnell, Cell 1 (1974).
4. H. Nakazato, D. W. Kopp and M. Edmonds, JBC 248, 1472 (1973).
5. H. Nakazato and M. Edmonds, in "Methods in Enzymology," Vol. 29: Nucleic Acids and Protein Synthesis, Part E (L. Grossman and K. Moldave, eds.), pp. 431-443. Academic Press, New York, 1974.
6. D. Sheiness and J. E. Darnell, Nature NB 241, 265 (1973).
7. K. Scherrer, in "Control of Gene Expression" (A. Kohn and A. Shatkay, eds.), p. 169. Plenum, New York, 1973.
8. A. Jacobson, R. Firtel and H. F. Lodish, PNAS 71, 1607 (1974).
9. S. Penman, C. Vesco and M. Penman, JMB 34, 49 (1968).
10. E. L. Korwek, Ph.D. Thesis, Univ. of Pittsburgh, 1974.
11. E. L. Korwek, H. Nakazato, S. Venkatesan and M. Edmonds, Bchem, in press.
12. C. Milcarek, R. Price and S. Penman, Cell 3, 1 (1974).
13. T. Borun, E. Robbins and M. D. Scharff, BBA 149, 302 (1967).
14. C. S. McLaughlin, J. R. Warner, H. Nakazato, M. Edmonds and M. H. Vaughan, JBC 248, 1466 (1973).
15. A. J. Bester, D. S. Kennedy and S. M. Heywood, PNAS 72, 1523 (1975).
16. D. Bogdanovsky, W. Hermann and G. Schapira, BBRC 54, 25 (1973).
17. R. P. Perry, D. E. Kelley, K. Friderici and F. Rottman, Cell 4, 387 (1975).
18. G. R. Molloy and J. E. Darnell, Bchem 12, 2324 (1973).
19. S. Venkatesan, H. Nakazato and M. Edmonds, NARes 3, 1925 (1976).

Polyadenylylation of Stored mRNA in Cotton Seed Germination

BARRY HARRIS AND LEON DURE III
Department of Biochemistry, University of Georgia, Athens, Georgia

During the past several years, we have accumulated evidence that much of the mRNA used by germinating cotton cotyledons in the first several days of germination is transcribed in embryogenesis but not used until germination begins (1-4). We consider this body of mRNA to code for proteins unique to the germination process rather than representing mRNA functioning in embryogenesis and carried over undegraded into germination. In fact, a specific enzyme, unique to germination, arises from this "stored" mRNA (3).

Part of the evidence for the existence of stored mRNA in cotton cotyledons rests on the insensitivity of gross protein synthesis (measured by isotope incorporation) and of the advent of "germination" enzyme activity to actinomycin D during the first 3 days of germination. With 20 μg of actinomycin per milliliter, 95% of rRNA synthesis and 70% of mRNA synthesis is prevented in germinating cotyledons, as judged by the incorporation of isotope into the mRNA(A_n) fraction obtained by oligo(dT)-cellulose chromatography. However, gross protein synthesis is drastically reduced and germination enzyme activity is totally inhibited by 3'dAdo (cordycepin) (5). This latter observation suggests that the processing of the stored mRNA does not occur until germination commences. An examination of the time course of 3'dAdo sensitivity indicates that the sensitive process takes place within the first 30 hours of germination (5).

Against this background, we have attempted to determine by several methods whether preexisting mRNA is polyadenylylated during the first 24 hours of germination. The first of these involved measuring the relative extent of incorporation of [32P]phosphate and [2-3H]adenosine into the mRNA portion and poly(A) portion of mRNA(A_n) of cotyledons during hours 10-20 of germination. This was done by isolating mRNA(A_n) from extracts of the tissue on oligo(dT)-cellulose columns using standard techniques. The mRNA portion of the molecules was digested with ribonucleases A and T1, and the poly(A) portion was recovered by oligo(dT)-cellulose chromatography. From these fractions, the following data were obtained: (1) poly(A) (A_n) chain-length, by measurements of the nucleoside/nucleotide ratio after alkaline hydrolysis and by polyacrylamide gel electrophoresis on 10% gels (6); (2) the % AMP in mRNA(A_n); (3) the % AMP in mRNA after removal of A_n; (4) the relative % [3H]AMP of mRNA(A_n) contained in the mRNA and A_n fractions; (5) the % 32P of mRNA(A_n) contained in the mRNA and A_n fractions. In all cases, the A_n chain-length proved to be between 100 and 110 nucleotides.

Using this value and a knowledge of the distribution of isotopes between the two portions of the mRNA(A_n) molecules, it is possible to calculate an apparent average mRNA chain-length by three different means. The rationale behind such calculations is that if poly(A) chains are incorporated onto existing mRNA molecules as well as onto newly synthesized mRNA molecules during the labeling period, the mRNA fraction will be underlabeled relative to the A_n fraction. This would cause the calculated apparent mRNA chain-length to be shorter than the true average chain-length. Furthermore, the difference between the calculated apparent chain-length and the true average chain-length will reflect the relative amounts of preexisting and newly synthesized mRNA in the mRNA(A_n) fraction. Also, if these measurements are made on mRNA(A_n) obtained from cotyledons germinated in actinomycin (in which mRNA synthesis is 70% inhibited), the calculated apparent chain-length will be even shorter since the amount of preexisting mRNA in the total mRNA will be larger. Of course, this rationale is valid only if the specific radioactivity of the nucleotide triphosphate pools used in mRNA synthesis and in poly(A) synthesis is identical.

As a control for this experimental approach, the same measurements made on mRNA(A_n) obtained from cotyledons further advanced in germination, when all the putative stored mRNA has been polyadenylylated, should give calculated apparent chain-length values for mRNA that are commensurate with the true average chain-length. Table I presents the data obtained in a typical experiment for mRNA(A_n) from normally germinated cotyledons and from those germinated in actinomycin, and compares them with the theoretical values expected for mRNA of 1000 nucleotides using the experimentally determined values of 100 for the A_n chain-length, i.e., n = 100, and 22 for the percent AMP in mRNA. The data show that the amount of adenosine, as determined by [³²P]AMP in mRNA(A_n), in germinating cotyledons is much larger than would be found were the mRNA chain-length 1000 nucleotides. So are the values for the ³²P and ³H content of A_n. Consequently, when the experimentally determined values are used to calculate the apparent mRNA size, very low values result. The calculations from the data obtained for mRNA(A_n) from cotyledons germinated in actinomycin yield even smaller mRNA sizes. Table II gives the averages of a large number of values gained from such experiments involving cotyledons germinated 1 day ± actinomycin.

TABLE I

                                            Theoretically expected       Measured values for
                                            values, assuming             germinating cotyledons
                                            mRNA = 1000 nucleotides(a)   -Act D       +Act D

% AMP in mRNA(A_n)                                                       39           45
% ³²P of mRNA(A_n) in A_n                                                20           49
% [³H]Adenosine of mRNA(A_n) in A_n                                      49           68

Calculated mRNA chain length
  Based on the A_n chain length, the
    % AMP in mRNA(A_n), and the
    % AMP in mRNA                           1000                         364          296
  Based on the A_n chain length and the
    % ³²P of mRNA(A_n) in A_n               1000                         400          100
  Based on the poly(A) chain length, the
    % AMP in mRNA, and the % [³H]adenosine
    of mRNA(A_n) in A_n                     1000                         473          218

(a) Using the measured values of poly(A) chain length = 100 (i.e., A_100) and the % AMP in mRNA = 22.
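The three ways of calculating an apparent chain length are not written out as formulas in the text. The sketch below is one plausible reading, assuming a poly(A) tract of n residues attached to an mRNA body of L residues whose AMP fraction is a, with the same specific radioactivity throughout. The function names and the algebra are illustrative assumptions, not the authors' stated procedure; under these assumptions the 20% and 49% entries of Table I give back the tabulated 400 and 473 nucleotides.

```python
# Sketch: apparent mRNA chain length from isotope distribution in mRNA(A_n).
# Assumed model (not given as formulas in the text): a poly(A) tract of n
# residues on an mRNA body of L residues with AMP fraction a, uniformly labeled.

def length_from_amp(p_amp, n=100, a=0.22):
    """Solve p_amp = (n + a*L) / (n + L) for L.

    p_amp is the AMP fraction of the whole mRNA(A_n) molecule.
    """
    return n * (1.0 - p_amp) / (p_amp - a)


def length_from_32p(f_an, n=100):
    """Solve f_an = n / (n + L) for L.

    f_an is the fraction of total 32P recovered in the A_n portion.
    """
    return n * (1.0 - f_an) / f_an


def length_from_3h(h_an, n=100, a=0.22):
    """Solve h_an = n / (n + a*L) for L.

    h_an is the fraction of [3H]adenosine recovered in the A_n portion;
    only adenosine residues carry the label.
    """
    return n * (1.0 - h_an) / (h_an * a)


def expected_fractions(length, n=100, a=0.22):
    """Forward calculation: (AMP fraction, 32P fraction in A_n, 3H fraction in A_n)."""
    total_a = n + a * length
    return total_a / (n + length), n / (n + length), n / total_a


if __name__ == "__main__":
    print(round(length_from_32p(0.20)))  # 400
    print(round(length_from_3h(0.49)))   # 473
    print(expected_fractions(1000))      # fractions expected for a 1000-nucleotide mRNA
```

Any underlabeling of the mRNA body relative to A_n raises all three measured fractions and therefore lowers every calculated L, which is the effect the argument above relies on.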

TABLE II

Germination     % [³²P]AMP   % [³²P]AMP      % ³²P of        % [³H]Adenosine   Average
pulse time      in mRNA      in mRNA(A_n)    mRNA(A_n)       of mRNA(A_n)      mRNA
(hours)                                      in A_n          in A_n            length

10-22           22           38.3 (416)      18.8 (475)      43.5 (600)        497
10-22 +Act D    22           43.1 (297)      35.7 (298)      51.9 (464)        320
32-40           28.6         33.4 (1528)     7.5 (1357)      26.3 (1510)       1462

(a) Numbers in parentheses indicate calculated mRNA chain lengths, which are based on the averages of a large number of separate experiments. These averages are the figures outside parentheses.


The average mRNA size calculated from these experiments is about 500 nucleotides for normally germinating cotyledons and about 300 for those germinated in actinomycin. However, when the same determinations are made on mRNA(A_n) incubated in the isotopes during hours 32-40 of germination, the values obtained are quite different and yield a calculated mRNA size of about 1500 nucleotides. These experiments suggest that poly(A) chains are put on preexisting (and hence nonradioactive) pre-mRNA molecules during the first day of germination. Of course, another explanation of these data would be that large pieces of the mRNA molecules are lost during purification owing to nuclease action, so that the mRNA portion of the mRNA(A_n) is artifactually small. This possibility was ruled out by gel electrophoresis of the mRNA(A_n) fractions in 99% formamide, which is considered to show the true size of RNA molecules by destroying secondary structure and interstrand aggregation (7). The profiles of radioactivity show that the mass-average size of these molecules is between 1800 and 2200 nucleotides, which is roughly consistent with the average size found by isotope distribution calculations for the mRNA(A_n) from cotyledons further along in germination (Table II). If the mass-average size of mRNA is considered to be about 1500 nucleotides, the fact that the calculated apparent average chain-length is only 500 nucleotides in 1-day germinated cotyledons seems to indicate that twice as much preexisting mRNA is polyadenylylated during the first day of germination as is newly synthesized mRNA. The quantitative relationship between the amounts of preexisting and newly synthesized mRNA polyadenylylated can be calculated from the differential inhibition of mRNA and A_n synthesis by actinomycin (as measured by the diminution of ³²P in mRNA and in poly(A) compared to the isotope levels in those fractions from untreated cotyledons).

Actinomycin inhibits incorporation into mRNA by 70% but incorporation into poly(A) by only 30%. This indicates again that much of the mRNA that is polyadenylylated during the first day of germination already exists in the dry seed cotyledons. Calculations based on these data indicate that about 58% of the mRNA that is polyadenylylated during this period is preexistent and 42% is newly synthesized. These experiments, to be meaningful, depend upon the assumption that mRNA synthesis and polyadenylylation draw upon precursor pools of the same specific radioactivity. The fact that the apparent mRNA chain-length calculated from isotope distribution in mRNA(A_n) obtained from cotyledons more advanced in germination is roughly that given by gel electrophoresis suggests that this assumption is valid. Nevertheless, we have attempted to substantiate the indications that poly(A) is added to preexisting mRNA in early germination by measuring optically the actual amounts of mRNA(A_n) existing in dry seed cotyledons and in cotyledons germinated 24 hours (± actinomycin). Table III presents the data obtained. There is no change in the total amount of high-molecular-weight RNA in cotyledons during the first day of germination, as indicated by the amount of material precipitable with 2 M LiCl. However, the amount of RNA that is mRNA(A_n) increases during germination. This increase is less in the presence of actinomycin. However, the increase in both preparations is actually larger than perceived, since the residual mRNA(A_n) present in the dry seed cotyledons is destroyed or deadenylylated during early germination, as shown by its disappearance in cotyledons germinated in 3′dAdo. These data show that in actinomycin the mRNA(A_n) fraction has increased to 75% of that in untreated cotyledons, but, since actinomycin inhibits only 70% of the synthesis of new mRNA, some of the increase in actinomycin is new mRNA. Compensating for this, these data suggest that about two thirds of the mRNA(A_n) existing by 24 hours of germination was preexisting and one third newly synthesized. A true mRNA chain-length estimation of 1900 nucleotides can also be obtained from the data in Table III, based on the fact that the poly(A), which is about 100 nucleotides long in all cases, comprises 5% of the mRNA(A_n) fraction. Table IV summarizes the values obtained in these experiments for the true mass-average mRNA chain-length and for the relative amounts of preexisting and newly synthesized mRNA that is polyadenylylated during the first day of germination. Although there is considerable variation in the latter values depending on the method of evaluation used, in every case the polyadenylylation of preexisting RNA is suggested.

TABLE III

Cotyledon source                 LiCl-precipitated RNA   mRNA(A_n) fraction   A_n

Dry seed                         500                     4 (0.8%)             0.2 (5%)
24-Hour germinated               500                     8 (1.6%)             0.4 (5%)
24-Hour germinated +Act D        500                     6 (1.2%)             0.3 (5%)
24-Hour germinated +3′dAdo       500                     0                    -

Actinomycin value = 75% of control (64%). Mass average mRNA chain length = 1900 nucleotides.

TABLE IV

Mass average mRNA(A_n) chain length
  Gel                                                          2000
  Isotope incorporation (32-40 hours germinated)               1550
  A260 units [% A_n of mRNA(A_n)]                              2000

                                                               Stored mRNA   New mRNA
  Low-level isotope incorporation in mRNA relative
    to poly(A)                                                 67%           33%
  Differential inhibition by actinomycin of isotope
    incorporation in mRNA and A_n                              57.5%         42.5%
  Increase in mRNA(A_n) in actinomycin relative to
    untreated cotyledons                                       64%           36%
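The stored/new split in Table IV can be approximated from the numbers quoted in the text. The sketch below is a simplified model, not the authors' calculation: it assumes that polyadenylylation of stored mRNA is unaffected by actinomycin and that only the newly synthesized share is reduced by the 70% inhibition of mRNA synthesis. The function names are hypothetical.

```python
# Sketch: fraction of polyadenylylated mRNA that is newly synthesized (x),
# estimated three ways. Assumed model: in actinomycin the stored share (1 - x)
# is untouched and the new share x is cut by the mRNA-synthesis inhibition.

def new_fraction_from_labeling(apparent_length, true_length):
    """Underlabeling estimate: only new mRNA bodies carry isotope."""
    return apparent_length / true_length


def new_fraction_from_actinomycin(level_in_act, mrna_inhibition=0.70):
    """Solve (1 - x) + x * (1 - mrna_inhibition) = level_in_act for x.

    level_in_act is the poly(A) labeling, or the mRNA(A_n) amount, in
    actinomycin expressed as a fraction of the untreated control.
    """
    return (1.0 - level_in_act) / mrna_inhibition


def mrna_length_from_poly_a_share(poly_a_length, poly_a_share):
    """Length of the mRNA body when poly(A) makes up poly_a_share of mRNA(A_n)."""
    return poly_a_length / poly_a_share - poly_a_length


if __name__ == "__main__":
    # Apparent length about 500 against a true length of about 1500 nucleotides.
    print(round(100 * (1 - new_fraction_from_labeling(500, 1500))))  # 67
    # Poly(A) labeling inhibited by 30%, i.e. 70% of control remains.
    print(round(100 * (1 - new_fraction_from_actinomycin(0.70))))    # 57
    # mRNA(A_n) in actinomycin is 75% of the untreated value.
    print(round(100 * (1 - new_fraction_from_actinomycin(0.75))))    # 64
    # Poly(A) of 100 nucleotides making up 5% of mRNA(A_n).
    print(round(mrna_length_from_poly_a_share(100, 0.05)))           # 1900
```

Under this model the three estimates of stored mRNA come out near 67%, 57% and 64%, close to the values reported; the small difference from the reported 57.5% presumably reflects the exact inhibition figures used in the original calculation.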

REFERENCES

1. J. N. Ihle and L. S. Dure III, BBRC 36, 705 (1969).
2. J. N. Ihle and L. S. Dure III, BBRC 38, 995 (1970).
3. J. N. Ihle and L. S. Dure III, JBC 247, 5034 (1972).
4. J. N. Ihle and L. S. Dure III, JBC 247, 5048 (1972).
5. V. Walbot, A. Capdevilla and L. S. Dure III, BBRC 60, 103 (1974).
6. V. Walbot, B. Harris, and L. S. Dure III, in "Symposium on Developmental Biology of Reproduction" (C. Market, ed.), Soc. Develop. Biol., pp. 165-187. Academic Press, New York, 1975.
7. P. H. Duesberg and P. K. Vogt, J. Virol. 12, 594 (1973).

mRNAs Containing and Lacking Poly(A) Function as Separate and Distinct Classes during Embryonic Development

MARTIN NEMER AND SAUL SURREY

The Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania

Messenger RNAs fall into at least three classes: the histone mRNAs and the nonhistone mRNAs either containing or lacking poly(A) [i.e., mRNA(A_n) and mRNA(no A_n), respectively].¹ Since the latter two classes comprise distinct nucleotide sequences, according to the results of cDNA hybridization experiments (1, 2), there exist three classes of genes producing these mRNAs. We can ask: Are these genes differentially expressed under different physiological conditions, especially as manifest during embryonic development?

¹ See articles in Part I of this volume; also by Edmonds et al. (p. 99).

The early sea urchin embryo is an appropriate system for such a question. The period from cleavage to early blastula (approximately the first 10 hours) is one of rapid cell division and replication with minimal cellular differentiation. Later (from about the 20th hour), gastrulation represents the beginning of cellular specialization. Between these periods of development, the mid-blastula stages seem to be a transition.

Measured as the appearance of newly labeled RNA in polyribosomes, the relative amount of synthesis of each of these three mRNA classes changes during development in a characteristic way (Fig. 1). During the earliest period, putative histone mRNA accounts for 60% of the labeled RNA in free polyribosomes. At this time mRNA(A_n) represents only approximately 10%. However, through subsequent development, the proportion that is histone mRNA decreases considerably and the mRNA(A_n) increases to more than 50% as the gastrula stages are reached. During this whole course of development, the proportion of nonhistone mRNA(no A_n) remains fairly constant at approximately 30% of the labeled polysomal RNA. Therefore, the syntheses of these mRNA classes appear to be under separate controls, describing distinct changes as a function of development. These changes during development can be interpreted as shifts in emphasis toward different kinds of genetic information represented by the mRNA classes. It is thus a regulation of gene expression that sets the strategy for calling into play the members of broad gene classes.

[Figure 1: graph; horizontal axis, Hours Post Fertilization (0 to 40).]

FIG. 1. The relative amounts of newly synthesized nonhistone mRNA(A_n) and mRNA(no A_n) and putative histone mRNA in free polyribosomes as a function of embryonic development. From the quantitative oligo(dT)-cellulose fractionation and sedimentation analyses of this RNA from several embryonic stages, the relative amounts of non-4 S RNA labeled in 60 minutes present in free polyribosomes were calculated. The fraction bound to oligo(dT)-cellulose was the mRNA(A_n) (●). The unbound, nonhistone mRNA(no A_n) was the sedimentation mode at 22 S (○), corrected for the 9 S mRNA, the putative histone mRNA (×).

If these mRNA classes can be distinguished on the basis of intrinsic properties, then they may be accorded differential treatment in the cytoplasm. Indeed, the (A_n) and (no A_n) nonhistone mRNAs appear to be loaded with ribosomes to different extents. This conclusion is based on the observation that the nonhistone mRNAs in small polyribosomes (approximately 100-250 S) are approximately the same size as the mRNAs from large polyribosomes (approximately 250-400 S) and that the ratio in large/small polysomes is greater for the (A_n) than the (no A_n) mRNA. The mRNAs in the small polysomes thus appear to be underloaded, with proportionately more of the (no A_n) underloaded than the (A_n) mRNA. The degree of ribosomal loading changes for each class as a separate function of development (Fig. 2). Both classes are underloaded in the early embryo. However, with development to the late gastrula stages, loading of the mRNA(A_n) approaches theoretical levels, whereas that of the mRNA(no A_n) increases only slightly. If these changes can be attributed to changes in the relative rates of initiation, then we might speculate that there are developmental changes in the activity of initiation agents, to which mRNA(A_n) is more responsive than mRNA(no A_n).

[Figure 2: graph in two panels, (A) and (B); horizontal axis, Hours Post Fertilization (0 to 40).]

FIG. 2. The relative amounts of newly synthesized nonhistone mRNA(A_n) and mRNA(no A_n) in large and small polyribosomes as a function of embryonic development. The relative amounts of nonhistone mRNA(A_n) and mRNA(no A_n) were quantitated as in Fig. 1. (A) Percent of labeled non-4 S RNA in total polyribosomes: (●) mRNA(A_n) in large polyribosomes; (×) mRNA(A_n) in small polyribosomes; (○) mRNA(no A_n) in large polyribosomes; (△) mRNA(no A_n) in small polyribosomes. (B) Ratio of labeled nonhistone mRNA in large to small polyribosomes: (●) mRNA(A_n); (○) mRNA(no A_n).

It is not immediately apparent how differences in poly(A)-content at the 3′ end can influence initiation at the 5′ end. One possibility is that poly(A) interacts with sequences at the 5′ end or with so-called "translation control RNA," which specifically influences initiation (3). Alternatively, the 5′ termini of the mRNA(A_n) species may differ from those of the mRNA(no A_n) species and thus provide different sites for ribosomal binding or initiation agents. We examined the 5′ termini and found that, for each of the three mRNA classes, greater than 80% of the population contained the "cap" (4-7) structure. Analyzing the methylated derivatives after RNase T2 hydrolysis and treatment with bacterial alkaline phosphatase, RNase P1, and nucleotide pyrophosphatase, we detected only "cap-1", m⁷GpppN′m-N″p. Evidence for the existence of "cap-2" (i.e., containing N″m) was sought among the three classes and in mRNAs from early and late-stage embryos. We concluded that cap-2 is never present and that the sea urchin embryo lacks the ability to methylate the N″ position. Like the other two classes, the histone mRNAs have only the cap-1 structure, but in contrast, there is no internal methylation, i.e., no N⁶-methyladenosine, in the histone mRNAs. In order to investigate the apparent differences in initiation of (A_n) and (no A_n) mRNAs, we are now examining the nucleotide compositional differences between these mRNA species at the 5′ terminus.

Summary

The histone mRNAs and the nonhistone mRNAs either containing or lacking poly(A) [i.e., mRNA(A_n) and mRNA(no A_n), respectively], comprise distinctly different nucleotide sequences. Therefore, they represent three separate classes of genes. The synthesis of mRNA corresponding to each of these gene classes changes during early development of the sea urchin embryo in a different and characteristic way. Therefore, these syntheses appear to be under separate controls, seemingly related to developmental changes in cellular requirements. Furthermore, translation of mRNA(A_n) appears to be controlled differently from that of mRNA(no A_n), especially as a function of embryonic development. Messenger RNAs of both classes are underloaded with ribosomes in the early embryo. However, with development to the late gastrula stages, loading of the mRNA(A_n) approaches theoretical levels, whereas that of the mRNA(no A_n) increases only slightly. A developmental change in initiation activity may be operative here, with one mRNA class being affected more than the other. These translational differences are not accounted for by differences in the general 5′ terminal "cap" structure, since all three mRNA classes contain the cap-1, m⁷GpppN′m-N″p, on nearly all polyribosomal mRNAs, and completely lack, at all embryonic stages, methylation in the N″ nucleotide. General gene classes are distinguished by the presence or absence of poly(A) in their product mRNAs. These distinct gene classes are separately regulated in the production and translation of their mRNAs.

ACKNOWLEDGMENT

Supported by grants from NIH and NSF and by an appropriation from the Commonwealth of Pennsylvania.

REFERENCES

1. M. Nemer, M. Graham, and L. M. Dubroff, JMB 89, 435 (1974).
2. C. Milcarek, R. Price, and S. Penman, Cell 3, 1 (1974).
3. S. M. Heywood and D. S. Kennedy, This volume, p. 477.
4. F. M. Rottman et al., This volume, p. 21.
5. B. Moss, S. A. Martin, M. J. Ensinger, R. F. Boone and C.-M. Wei, This volume, p. 63.
6. Y. Furuichi, S. Muthukrishnan, J. Tomasz and A. J. Shatkin, This volume, p. 3.
7. H. Busch, F. Hirsch, K. Gupta, Manchanahalli Rao, W. Spohn and B. Wu, This volume, p. 39.

Sequence Analysis of Eukaryotic mRNA

N. J. PROUDFOOT, C. C. CHENG AND G. G. BROWNLEE

MRC Laboratory of Molecular Biology, Hills Road, Cambridge, England

I. Introduction

The determination of sequences in eukaryotic messenger RNAs has been revolutionized by the discovery of reverse transcriptase in RNA tumor viruses (1) and as an inherent activity of DNA polymerase I of Escherichia coli (2, 3). With this activity, it is possible to obtain DNA transcripts of mRNA, called complementary DNA (cDNA), by hybridization of an oligo(dT) primer to the 3′-terminal poly(A) sequence present in most eukaryotic mRNAs (4). This cDNA may then be directly sequenced by the introduction of ³²P into the transcript using α-³²P-labeled deoxyribonucleoside triphosphates (5, 6). Alternatively, it may be "back-copied" into complementary RNA (cRNA), using RNA polymerase (E. coli) and ribonucleoside [α-³²P]triphosphates, which may be sequenced using RNA sequencing procedures (7, 8).

We have been determining the sequences of several eukaryotic mRNAs by studying the cDNA produced both with reverse transcriptase and DNA polymerase. A direct result of sequencing cDNA primed with oligo(dT) is that the RNA sequences obtained derive from the 3′-terminal regions of the mRNA, in particular from the 3′ noncoding region. We have therefore determined the sequences of substantial sections of the 3′ noncoding regions of six eukaryotic mRNAs: rabbit α- and β-globin, human α- and β-globin, mouse immunoglobulin light-chain, and chicken ovalbumin mRNAs. We outline first the procedures used [more detailed accounts are presented elsewhere (6, 9)] and then describe the sequences so far determined and their possible biological significance.

II. Complementary DNA Sequence Analysis

Sequence analysis of cDNA obtained from the six mRNAs listed above was carried out in three stages, each stage involving the analysis of cDNA molecules of different size. The procedures used for the synthesis of cDNA and its subsequent sequence analysis are described in detail elsewhere (6, 9).

A. Limited Synthesis

To establish the sequence immediately adjacent to the oligo(dT) primer of the cDNA, the procedure of limited synthesis was employed (6, 9). This technique relies on the "phasing" of the oligo(dT) primer onto the 5′ terminus of the poly(A) sequence and extending it several nucleotides with reverse transcriptase and one of the four deoxyribonucleoside triphosphates, α-³²P-labeled. The short cDNA molecules produced may then be fractionated in one dimension using homochromatography and the products obtained then subjected to the standard DNA sequencing procedures of partial exonuclease digestion, depurination and nearest neighbor analysis (10-12).

Two methods were used to phase the oligo(dT) primer. The reverse transcriptase or DNA polymerase in the absence of dTTP produces a phased transcript up to the first dT residue in the cDNA sequence (2). With very low concentrations of dTTP (5-50 nM), the oligo(dT) primer is still phased at the 5′ end of the poly(A) sequence, but the cDNA produced extends beyond the first dT residue. A series of products results representing "pile-ups" before each successive dT residue in the DNA sequence (6). More recently, an alternative method of phasing has been employed. The oligonucleotide p(dT)_n-dG-dC has been synthesized (9), and this primer hybridizes with the 5′ end of the poly(A) sequence and the first two nucleotides beyond this (in the case of globin and ovalbumin mRNAs, see Fig. 1). With this primer, it is possible to limit any of the four nucleotides, thereby obtaining a series of products (as before) suitable for sequence analysis (9).

B. Complementary DNA Synthesized by DNA Polymerase

We have reported that DNA polymerase I (E. coli) in the presence of Mn²⁺ with oligo(dT) as primer synthesizes short DNA transcripts of mRNA, about 50-100 nucleotides in length (2, 5, 13). The cDNA so obtained was purified and subjected to endonuclease IV digestion (12, 6). The digestion products were then fractionated in two dimensions (14), isolated, and subjected to DNA sequencing procedures as above (10-12). Partial venom exonuclease digestion proved to be a very valuable procedure, and a great majority of the cDNA sequences were established with this technique. The other DNA sequencing procedures were largely employed to confirm these "partial venom" sequences.

Using cDNA derived with DNA polymerase, sequences of 34, 75, 43, 70, 35 and 75 nucleotides have been established for rabbit α- and β-globin (6), human α- and β-globin (15), mouse immunoglobulin light-chain (13) and chicken ovalbumin (9) mRNAs, respectively. The mRNA sequences predicted from these cDNA sequences are shown in Figs. 1 and 3.

[Figure 1: sequence diagram. Position numbering and underlining are not reproduced; the boxed A-A-U-A-A-A is shown in square brackets, and "..." marks stretches that are not legible.]

Rabbit α-globin mRNA (34):
...G-G-U-C-U-U-U-G-[A-A-U-A-A-A]-G-U-C-U-G-A-G-U-G-A-...-G-G-C-poly(A)

Rabbit β-globin mRNA (75):
...G-G-C-U-[A-A-U-A-A-A]-G-G-A-A-A-U-U-U-A-U-U-U-U-C-A-U-U-G-C-poly(A)

Human α-globin mRNA (43):
...U-G-G-U-C-U-U-U-G-[A-A-U-A-A-A]-G-U-C-U-G-A-G-U-G-G-G-C-G-G-C-poly(A)

Human β-globin mRNA (70):
...G-C-C-U-[A-A-U-A-A-A]-A-A-A-C-A-U-U-U-A-U-U-U-U-C-A-U-U-G-C-poly(A)

Mouse immunoglobulin light chain mRNA (35):
...U-U-C-[A-A-U-A-A-A]-G-U-G-A-G-U-C-U-U-U-G-C-A-C-U-U-G-poly(A)

Chicken ovalbumin mRNA (75):
C-C-U-U-G-U-A-C-C-C-A-U-A-U-G-U-A-A-U-G-G-G-U-C-U-U-G-U-G-A-A-U-G-U-G-C-U-C-U-U-U-U-G-U-U-C-C-U-U-U-A-A-U-C-A-U-[A-A-U-A-A-A]-A-A-C-A-U-G-U-U-U-A-A-G-C-poly(A)

FIG. 1. Comparison of sequences adjacent to poly(A) in six eukaryotic mRNAs: rabbit α- and β-globin, human α- and β-globin, mouse immunoglobulin light-chain, and chicken ovalbumin mRNAs. The full sequences of globin mRNAs are shown in Fig. 3. Numbers in parentheses denote number of nucleotides away from poly(A) that have been sequenced. Boxed sequence denotes homology, and underlined sequences denote partial homology. Numbers indicate distance of adjacent nucleotides from poly(A).

C. Complementary DNA Synthesized by Reverse Transcriptase

Unlike the relatively short cDNA synthesized by DNA polymerase I, reverse transcriptase cDNA is much larger and often represents a complete transcript of the mRNA template (16, 17). The reverse-transcriptase cDNA transcribed from rabbit globin mRNA was therefore studied with a view to extending the cDNA sequences already established from DNA polymerase-derived rabbit-globin cDNA. Endonuclease IV chromatograms were obtained, and the new oligonucleotides (not present in DNA-polymerase-derived cDNA chromatograms) were isolated and subjected to DNA sequencing procedures as before. Using such data, it was possible to establish a sequence of 36 nucleotides in rabbit α-globin mRNA, corresponding to the termination region of the mRNA (Fig. 3). Also, a sequence of 92 nucleotides was established in the coding region of rabbit β-globin mRNA, coding for amino acids 107-137 in the β-globin polypeptide (Fig. 4, 18). However, the nucleotide sequence joining these two regions to the 3′-terminal sequences described above (Fig. 3) remains undetermined. It appears that this region of rabbit globin cDNA is particularly resistant to endonuclease IV (both in α- and β-globin cDNA), so that few endonuclease products have been isolated from this region of the cDNA.

III. mRNA Sequences

A. Comparison of mRNA Sequences Adjacent to Poly(A)

Figure 1 shows the RNA sequences adjacent to poly(A) for the six mRNAs described in this paper. These sequences were deduced by sequence analysis of limited synthesis and DNA-polymerase-derived cDNA (see preceding section). All six mRNAs contain the sequence A-A-U-A-A-A (enclosed within a box) between 14 and 20 nucleotides from the poly(A). This homology is the most strikingly conserved sequence among the six different mRNA molecules. In addition, human β-globin mRNA and chicken ovalbumin mRNA have a more extensive region of sequence homology, including the A-A-U-A-A-A sequence (see Fig. 2). In detail, a region of 13 nucleotides is completely homologous, save one nucleotide, between the two mRNAs.

[Figure 2: aligned sequences of human β-globin mRNA and chicken ovalbumin mRNA; the sequence lines are not recoverable.]

FIG. 2. Comparison of a small section of human β-globin and chicken ovalbumin mRNAs. Outside box encloses homologous sequence; inside box encloses sequence common to all six mRNAs. Numbers denote distance of that nucleotide from poly(A) sequence.

It is also interesting, though possibly less significant, that the other four mRNA sequences have an additional G residue conserved on the 3′ side of the homologous sequence, -A-A-U-A-A-A-G. This particular sequence homology has been previously noted (5).

Another region of partial sequence homology between the six mRNAs is the sequence directly adjacent to the poly(A). Thus five out of the six mRNAs possess the sequence G-C-poly(A) (underlined sequences). However, the immunoglobulin light-chain mRNA does not possess this sequence, although in a previous paper (13) it was thought otherwise. Recent studies have corrected this error and show that the mRNA has a sequence U-G-poly(A) rather than G-C-poly(A) (N. J. Proudfoot and G. G. Brownlee, unpublished observations).

In a previous paper (5) it was reported that both rabbit β-globin mRNA and the mouse immunoglobulin light-chain mRNA could be drawn as double hairpin-loop structures. However, the four other mRNA sequences described here cannot be drawn in such a way, so that the previously suggested structural homology appears to have been fortuitous. Indeed, it was previously mentioned (5) that these double hairpin-loop structures might be unstable in vivo.
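The position of the conserved hexanucleotide can be checked mechanically. The sketch below locates A-A-U-A-A-A in four 3′-terminal stretches transcribed from Fig. 1 as far as they are legible, and reports the distance of its 3′-most residue from poly(A), counting the nucleotide adjacent to poly(A) as position 1. The counting convention and the function names are assumptions made for illustration, and the sequences should be treated as illustrative transcriptions rather than authoritative data.

```python
# Sketch: distance of the A-A-U-A-A-A hexamer from poly(A).
# Sequences are written 5' -> 3' and end at the nucleotide adjacent to poly(A).
# They are transcribed from Fig. 1 where legible (illustrative only).

HEXAMER = "AAUAAA"

THREE_PRIME_ENDS = {
    "human alpha-globin": "UGGUCUUUGAAUAAAGUCUGAGUGGGCGGC",
    "human beta-globin": "GCCUAAUAAAAAACAUUUAUUUUCAUUGC",
    "mouse Ig light chain": "UUCAAUAAAGUGAGUCUUUGCACUUG",
    "chicken ovalbumin": "UUUAAUCAUAAUAAAAACAUGUUUAAGC",
}


def hexamer_position(sequence, motif=HEXAMER):
    """Position, counted from poly(A), of the 3'-most residue of the motif.

    Position 1 is the nucleotide directly adjacent to poly(A).
    Returns None when the motif is absent.
    """
    start = sequence.rfind(motif)
    if start < 0:
        return None
    nucleotides_after_motif = len(sequence) - (start + len(motif))
    return nucleotides_after_motif + 1


def adjacent_dinucleotide(sequence):
    """The two nucleotides immediately preceding poly(A)."""
    return sequence[-2:]


if __name__ == "__main__":
    for name, seq in THREE_PRIME_ENDS.items():
        print(name, hexamer_position(seq), adjacent_dinucleotide(seq))
```

With this convention the four positions fall inside the 14 to 20 nucleotide window quoted above, and the dinucleotide next to poly(A) is G-C for the globin and ovalbumin sequences and U-G for the immunoglobulin light-chain sequence.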

B. Comparison and Size of Human and Rabbit Globin mRNA 3′ Noncoding Sequences

One direct test of the functional importance of a particular nucleotide sequence is to establish how conserved that sequence is between different species. If the sequence has completely diverged, as in the case of satellite DNA (19), its function may be regarded as nonspecific. Alternatively, if the sequence has been very highly conserved, its function can be considered to be highly specific and central to the overall biological role of the molecule. Such may be the case with the 3′ noncoding regions of globin mRNA.

Figure 3 shows and compares the nucleotide sequences determined for rabbit α- and β-globin and human α- and β-globin mRNAs. The termination sequence for the human α-globin mRNA has been determined by others (20, 21, 8). As indicated, both the α-globin and β-globin mRNA sequences are extensively conserved between human and rabbit. Indeed, out of 129 nucleotides from the 3′ noncoding sequences, only 13 base changes, a deletion of one and another of two nucleotides, and finally an insertion of two and another of three nucleotides, occur. This may be regarded as 16.3% nucleotide-sequence variation between the two species. The variation in nucleotide sequence between the coding regions of rabbit and human globin mRNAs may also be estimated. Thus 13.6% amino-acid variation exists between human and rabbit α- and β-globins (22). This predicts that the mRNA sequence coding for these proteins varies by a minimum of 4.5% between the two species. However, Salser (23) finds that the rate of neutral mutation between the coding regions of human and rabbit globin mRNAs is about 9-fold higher than the rate of base substitutions indicated by amino-acid-sequence variation. If this value is representative of the whole coding region, both for α- and β-globin mRNAs, the degree of nucleotide sequence variation between the coding regions of the two species may be as large as 30%. Thus it follows that the 3′ noncoding regions of human and rabbit globin mRNAs are at least in part more conserved in sequence than are the coding regions.

[Figure 3: aligned human (H) and rabbit (R) α- and β-globin mRNA sequences; the alignment is not recoverable.]

FIG. 3. Comparison of human and rabbit globin mRNA sequences (α- and β-mRNAs). (H) denotes human and (R) rabbit. X denotes base change between human and rabbit; other symbols denote addition of sequence and deletion of sequence. With α-globin mRNA sequences, termination regions for the two species are also compared (N. J. Proudfoot, unpublished data; 20, 21, 8) and the position of the carboxyl-terminal amino-acid sequence of α-globin (the same for both species; see ref. 22) is indicated. The mRNA sequence connecting the termination regions with the 3′-terminal regions is indicated by a line. In the case of human α-globin, the connecting sequence is 50 nucleotides [denoted by (50)]. The terminator used by human α-globin Constant Spring (20) is marked. Numbers denote distance of that nucleotide from the poly(A) sequence.

An important result to have emerged from the sequence determination of the 3′-terminal region of human α-globin mRNA is that the exact size of the 3′ noncoding region of this mRNA may now be calculated. A chain termination mutant of human α-globin, Constant Spring, was described several years ago (20). This mutant protein was 31 amino acids longer on its carboxyl-terminal side. The sequence of these extra amino acids was established (20), thereby partially predicting the sequence of the 3′ noncoding region of the α-globin mRNA. The human α-globin 3′-terminal sequence corresponds exactly with this partially predicted RNA sequence. Furthermore, the U-A-A sequence (marked in Fig. 3) corresponds to the terminator for Constant Spring. From this result, the size of the 3′ noncoding region for human α-globin mRNA may be calculated as 112 nucleotides. A similar size can be ascribed to rabbit α-globin mRNA, as the two mRNA sequences are very similar (see Fig. 3).

As described elsewhere (15), the 5′ noncoding sequence of human, and therefore probably of rabbit, α-globin mRNA must be considerably smaller than the 3′ noncoding sequence. In fact, its size has been estimated, by consideration of the sizes of the coding region, the poly(A) sequence, and the total mRNA length, in addition to the size of the 3′ noncoding sequence (15), to be from a few to a hundred nucleotides. Similarly, the 5′ noncoding sequences of β-globin mRNA and immunoglobulin light-chain mRNA are likely to be smaller than their corresponding 3′ noncoding sequences (6, 13).
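The two variation figures quoted for the human/rabbit comparison follow from simple arithmetic, sketched below. The sketch assumes that every inserted or deleted nucleotide counts as one difference, and that each amino-acid replacement requires at least one base change spread over the three nucleotides of a codon; both assumptions are a reading of the text rather than something it states, and the function names are hypothetical.

```python
# Sketch: reproducing the quoted sequence-variation percentages.
# Assumptions: each inserted or deleted nucleotide is one difference, and an
# amino-acid replacement implies at least one change per three-nucleotide codon.

def noncoding_variation_percent(base_changes, deleted, inserted, compared):
    """Percent of compared nucleotides that differ between the two species."""
    return 100.0 * (base_changes + deleted + inserted) / compared


def minimum_coding_variation_percent(amino_acid_variation_percent):
    """Lower bound on nucleotide variation implied by amino-acid variation."""
    return amino_acid_variation_percent / 3.0


if __name__ == "__main__":
    # 13 base changes, deletions of 1 and 2, insertions of 2 and 3, in 129 nucleotides.
    noncoding = noncoding_variation_percent(13, 1 + 2, 2 + 3, 129)
    print(round(noncoding, 1))  # 16.3
    # 13.6% amino-acid variation between human and rabbit globins.
    coding_minimum = minimum_coding_variation_percent(13.6)
    print(round(coding_minimum, 1))  # 4.5
```

The upper estimate of about 30% for the coding regions depends on the neutral-mutation rate cited from Salser (23) and is not reproduced here.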


C. The mRNA Sequence Coding for Amino Acids 107-137 of Rabbit β-Globin

A long sequence of 92 nucleotides has been determined for the coding sequence of rabbit β-globin mRNA (see Fig. 4). Of particular interest is the observation that the RNA sequence coding for residue 112 in β-globin is variable and can be either AUU or GUU. This predicts that the amino acid at this position in the polypeptide chain is polymorphic and can be either isoleucine or valine. Indeed, such a variant of rabbit β-globin has been described previously (24). The fact that these cDNA sequencing procedures have proved sufficiently sensitive to pick up this sequence heterogeneity strongly vindicates the technique. Another conclusion that can be made from this sequence is that there does appear to be some degree of codon selection. In particular, valine occurs six times in the sequence and is coded by either GUU or GUG but not GUA or GUC. Similar levels of codon selection have been suggested by others (7, 8).
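The two observations in this section, the polymorphism at residue 112 and the restricted use of valine codons, can be restated as a small lookup. This is a minimal sketch: the table holds only the codons named in the text (their assignments are those of the standard genetic code), and the function and variable names are illustrative.

```python
# Standard genetic code assignments for the codons named in the text only.
VALINE_CODONS = {"GUU", "GUC", "GUA", "GUG"}
CODON_TABLE = {"AUU": "Ile"}
CODON_TABLE.update({codon: "Val" for codon in VALINE_CODONS})


def translate_codon(codon):
    """Translate one codon, accepting either 'GUU' or 'G-U-U' notation."""
    return CODON_TABLE[codon.upper().replace("-", "")]


# The two sequences found at the position coding for residue 112.
variants_112 = ["A-U-U", "G-U-U"]
residues_112 = [translate_codon(codon) for codon in variants_112]

# Valine codons reported as used in the 107-137 sequence, and those not used.
observed_valine_codons = {"GUU", "GUG"}
unused_valine_codons = sorted(VALINE_CODONS - observed_valine_codons)

print(residues_112)          # isoleucine / valine polymorphism
print(unused_valine_codons)  # synonymous codons absent from this sequence
```

The lookup confirms that a single first-position change (A to G) accounts for the isoleucine/valine polymorphism, and that two of the four synonymous valine codons are absent from this stretch.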

IV. Discussion

The sequence determination of eukaryotic mRNAs by analysis of ³²P-labeled cDNA produced by reverse transcription has proved to be a successful approach. Thus mRNA sequences totaling 460 nucleotides have been established using this procedure. One of the principal objectives of these studies was to compare the sequences adjacent to poly(A) for a variety of eukaryotic mRNAs. We have previously reported sequence and possibly structural homology in the nucleotide sequence adjacent to poly(A) between rabbit β-globin mRNA and immunoglobulin light-chain mRNA. By studying four more mRNAs, we have demonstrated that the previously observed homologies are less extensive than originally suggested. The sequence A-A-U-A-A-A between 14 and 20 nucleotides from poly(A) appears to be the only significantly homologous feature of all six mRNA sequences. The function of this sequence remains obscure at this time, although several possibilities may be suggested. For instance, this sequence might be involved in termination of transcription if the mRNA sequence is at the 3' terminus of the primary gene transcript (25) (heterogeneous nuclear RNA, or hnRNA). Alternatively, this sequence might be recognized by a ribonuclease processing enzyme, if the mRNA sequence is at an internal position in the hnRNA, as has recently been argued (26). Suggestive evidence that this homologous sequence has a role in termination of transcription comes from the 3'-terminal sequence

[Figure 4: codon-by-codon mRNA sequence with the corresponding amino-acid residues 107-137; graphic not reproduced.]

FIG. 4. Sequence from the coding region of rabbit β-globin mRNA coding for amino-acid residues 107-137 of rabbit β-globin. Vertical lines divide the mRNA sequence into codons. Residue 112 is polymorphic (24). The mRNA sequence at this position is therefore variable (18).

[Figure 5: aligned 3'-terminal sequences, each ending in poly(A), for SV40 early mRNA, rabbit β-globin mRNA, human β-globin mRNA, chicken ovalbumin mRNA and immunoglobulin light-chain mRNA; graphic not reproduced.]

FIG. 5. Comparison of rabbit and human β-globin, chicken ovalbumin and mouse immunoglobulin light-chain mRNA sequences with SV40 early mRNA (27, 28). Sequences in boxes are homologous to the SV40 sequence. Dashed lines indicate continuation of known nucleotide sequence (not drawn). Symbols in the figure mark additions and deletions of nucleotide sequence.


of SV40 "early" mRNA (27, 28). This region of the early mRNA is likely to possess sequences signaling termination of transcription. A comparison of this SV40 sequence with the 3'-terminal regions of rabbit and human β-globin, chicken ovalbumin, and mouse immunoglobulin light-chain mRNAs is shown in Fig. 5. As indicated by the boxed areas, all four mRNAs are significantly homologous to the SV40 early mRNA sequence (including the A-A-U-A-A-A sequence). The β-globin mRNA sequences are especially so, having 20 out of 25 homologous nucleotides. Similar observations have been made by Subramanian et al. (29). These homologies suggest that the 3'-terminal regions of the five mRNAs have a common function and that this function may be related to termination of transcription. However, the apparent absence of homology with the SV40 early mRNA shown by the two α-globin mRNA 3'-terminal regions (other than the A-A-U-A-A-A sequence) may argue against the above model. In the cytoplasm the homologous sequence A-A-U-A-A-A might be recognized by various regulatory proteins. Also it is interesting to note that this sequence contains a termination codon (U-A-A) and that this sequence actually functions as a terminator for human α-globin Constant Spring (20).

The sequence directly adjacent to poly(A) does not appear to be homologous [as was previously incorrectly suggested (5)]. However, the sequence G-C-poly(A) is present in five out of the six mRNA sequences, so that some sequence conservation for this region appears to exist. In agreement with this, Nichols and Eiden (30) have found that two sequences adjacent to poly(A) are predominant in total HeLa mRNA: G-C-poly(A) and G-U-poly(A). However, Winters and Edmonds (31) have demonstrated that poly(A) polymerase isolated from calf thymus nuclei does not appear to display any sequence specificity in vitro.

As described above, human and rabbit globin mRNAs possess very similar 3' noncoding region sequences. Indeed, it seems possible that the 3' noncoding regions of human and rabbit globin mRNAs may be more conserved in sequence than their coding regions. This result suggests that the noncoding sequences of globin and therefore other eukaryotic mRNAs possess a defined, sequence-specific function. Even though the six mRNA sequences described above possess few features in common, the sequence conservation between human and rabbit globin mRNAs still suggests an important role for this region of eukaryotic mRNA. However, this role may be specific to individual mRNAs rather than common to all.

In conclusion, the direct sequence analysis of cDNA has provided, and no doubt will continue to provide, a powerful method for the sequence determination of eukaryotic mRNA. However, this method, as


with others, has certain limitations. In particular, the 5'-terminal regions of mRNAs are particularly inaccessible when priming the synthesis of cDNA with oligo(dT) hybridized to the 3'-terminal poly(A). However, it may be feasible, by using chemically synthesized primers complementary to other regions of mRNA, to sequence the whole of an mRNA molecule. Also it will soon be possible to sequence cDNA that has been inserted into a bacterial plasmid (32, 33). Such cDNA sequences would be in a double-stranded state and therefore susceptible to restriction enzyme digestion and to the new rapid DNA sequencing procedure that has recently been described by Sanger and Coulson (34) of this laboratory.

V. Summary

Sequences of 34, 75, 43, 70, 35 and 75 nucleotides adjacent to the 3'-terminal poly(A) of rabbit α- and β-globin, human α- and β-globin, mouse immunoglobulin light-chain and chicken ovalbumin mRNAs, respectively, have been established, as well as a sequence of 92 nucleotides from the coding region of rabbit β-globin mRNA and 36 nucleotides from the termination region of rabbit α-globin mRNA, by analysis of ³²P-labeled complementary DNA. All six mRNAs possess the homologous A-A-U-A-A-A between 14 and 20 nucleotides from the poly(A). This sequence may be a signal for termination of transcription. The 3' noncoding sequences of human and rabbit globin mRNAs are in part 84% homologous, more so than the coding region sequences. This result emphasizes the functional importance of the nucleotide sequence adjacent to poly(A) in globin and probably other eukaryotic mRNAs.

REFERENCES

1. H. M. Temin and D. Baltimore, Advan. Virus Res. 17, 129 (1972).
2. N. J. Proudfoot and G. G. Brownlee, FEBS Lett. 38, 179 (1974).
3. L. A. Loeb, K. D. Tartof and E. C. Travaglini, Nature NB 242, 66 (1973).
4. G. Brawerman, ARB 43, 621 (1974).
5. N. J. Proudfoot and G. G. Brownlee, Nature 252, 359 (1974).
6. N. J. Proudfoot, JMB (1976). In press.
7. R. Poon, G. V. Paddock, H. Heindell, P. Whitcome, W. Salser, D. Kacian, A. Bank, R. Gambino and F. Ramirez, PNAS 71, 3502 (1973).
8. C. A. Marotta, B. G. Forget, S. M. Weissman, I. M. Verma, R. P. McCaffrey and D. Baltimore, PNAS 71, 2300 (1974).
9. C. C. Cheng, G. G. Brownlee, N. H. Carey, M. T. Doel, S. Gillam and M. Smith, JMB (1976). In press.
10. V. Ling, JMB 64, 87 (1972).
11. F. Sanger, J. E. Donelson, A. R. Coulson, H. Kössel and D. Fischer, JMB 90, 315 (1974).
12. F. Galibert, J. W. Sedat and E. B. Ziff, JMB 87, 377 (1974).
13. C. Milstein, G. G. Brownlee, E. M. Cartwright, J. M. Jarvis and N. J. Proudfoot, Nature 252, 354 (1974).
14. G. G. Brownlee and F. Sanger, EJB 11, 395 (1969).
15. N. J. Proudfoot and J. I. Longley, Cell (1976). In press.
16. J. Ross, H. Aviv, E. Scolnick and P. Leder, PNAS 69, 264 (1972).
17. T. H. Rabbitts and C. Milstein, EJB 52, 125 (1975).
18. N. J. Proudfoot, NARes 3, 1811 (1976).
19. E. M. Southern, JMB 94, 51 (1975).
20. J. B. Clegg, D. J. Weatherall and P. F. Milner, Nature 234, 337 (1971).
21. M. Seid-Akhavan, W. P. Winter, R. K. Abramson and D. L. Rucknagel, PNAS 73, 882 (1976).
22. M. O. Dayhoff, in "Atlas of Protein Sequence and Structure," Vol. 4, pp. D-42 and D-54. Natl. Biomed. Res. Found., Silver Spring, Maryland, 1969.
23. W. Salser, S. Bowen, D. Browne, F. El Adli, N. Federoff, K. Fry, H. Heindell, G. Paddock, R. Poon, B. Wallace and P. Whitcome, FP 35, 23 (1976). (See also Salser et al., this volume, p. 177.)
24. J. Bricker and M. D. Garrick, BBA 351, 437 (1974).
25. J. E. Darnell, W. R. Jelinek and G. R. Molloy, Science 181, 1215 (1973).
26. R. P. Perry, D. E. Kelly, K. H. Friderici and F. M. Rottman, Cell 6, 13 (1975).
27. R. Dhar, S. M. Weissman, R. S. Zain, J. Pan and A. M. Lewis, Jr., NARes 1, 595 (1974).
28. R. Dhar, S. Zain, S. M. Weissman, J. Pan and K. N. Subramanian, PNAS 71, 371 (1974).
29. K. N. Subramanian, P. K. Ghosh, R. Dhar, B. Thimmappaya, S. Zain, J. Pan and S. M. Weissman, this volume, p. 165.
30. J. L. Nichols and J. J. Eiden, Biochem. 13, 4629 (1974).
31. M. A. Winters and M. Edmonds, JBC 248, 4763 (1973).
32. F. Rougeon, P. Kourilsky and B. Mach, NARes 2, 2365 (1975).
33. T. H. Rabbitts, Nature 260, 221 (1976).
34. F. Sanger and A. R. Coulson, JMB 94, 441 (1975).

The Structure and Function of Protamine mRNA from Developing Trout Testis

P. L. DAVIES,¹ G. H. DIXON,² L. N. FERRIER, L. GEDAMU AND K. IATROU³

Division of Medical Biochemistry
Faculty of Medicine
The University of Calgary
Calgary, Alberta, Canada

An unusual and characteristic set of small, highly basic polypeptides, the protamines, appears in the nuclei of spermatid cells of salmonid fishes at a specific stage in the terminal differentiation of the testis (1). The amino-acid sequences of three separated components of rainbow trout protamine have been determined by Ando and Watanabe (2, 3). The three components are closely related and of unusual structure (Fig. 1); they contain two arginines out of every three amino acids and are only 32-33 residues in length. The range of other amino acids is very limited, namely, serine, proline, glycine, valine, isoleucine and alanine. The three components differ only in the length of the five arginine tracts present in each molecule or by single amino-acid interchanges such as Val/Ile or Pro/Ala (underlined in Fig. 1).

Figure 2 summarizes our present knowledge of the life history of rainbow trout protamine molecules, from the time of their synthesis on a specific class of cytoplasmic diribosomes in trout spermatid cells (4-6), to their final, tight complex with DNA to form nucleoprotamine in the mature trout sperm nucleus. The transport into the nucleus (4) and binding of protamine to nucleohistone are accompanied by a series of phosphorylations and dephosphorylations of the four serine residues in the molecule (7-9). The final product of this series of reactions is an insoluble, highly condensed form of chromatin known as nucleoprotamine. Chromatin in this state is inactive as a template for exogenous E. coli RNA polymerase (10). Recently Honda et al. (11, 12) showed

¹ Medical Research Council of Canada postdoctoral fellow.
² To whom reprint requests and enquiries should be sent.
³ Scholar of the Greek State Scholarships Foundation.


(1a) Pro-Arg,-Ser,-Arg-Pro-Val-Arg,-Pro-Arg,-Val-Ser-Arg,-Gly,-Arg,-

(1b) Pro-Arg,-Ser,-Arg-Pro-Ile-Arg,-Pro-Arg,-Val-Ser-Arg,-Gly,-Arg,-

(2) Pro-Arg,-Ser,-Arg-Pro-Val-Arg,-Ala-Arg,-Val-Ser-Arg,-Gly,-Arg,-

FIG. 1. Amino acid sequences of rainbow trout protamines (iridines) determined by Ando and Watanabe (2).

that the ubiquitous "beaded" or nucleosome structure of chromatin, present in early testis cell nuclei, is readily digested by micrococcal nuclease to single nucleosomes (containing a 200-base-pair length of DNA), whereas, after replacement of the histones by protamine, the nucleoprotamine becomes totally resistant to high concentrations of nuclease. This result indicates that nucleoprotamine cannot have the "beaded" structure implied by the nucleosome hypothesis (13-15), but must exist in a totally different conformation in which the entire length of the DNA is covered by protamine molecules and has no exposed regions susceptible to nuclease attack.

Considerable interest attaches to the control of synthesis of protamines and particularly to the specific mechanism of activation of the protamine genes during terminal differentiation of the sperm. Studies of protamine biosynthesis must take into account the possibility of controls exerted either upon the transcription of protamine mRNA (or its precursor), upon its translation on cytoplasmic diribosomes (4, 5), or possibly at both levels. Crucial to an understanding of both levels of control is the purification and characterization of the messenger RNA for

[Figure 2: schematic diagram; legible labels include histones, poly(A) addition, precursor and ribosomal subunits; graphic not reproduced.]

FIG. 2. A scheme outlining the life history of protamine in trout spermatid cells (4-9).


protamine. This paper reports the recent progress in our studies of this problem.

The small size of protamines predicts that the coding portion of the mRNA should contain only 99-102 nucleotides. The observation that the major site of protamine synthesis is on diribosomes was consistent with a very small messenger size (5). Gilmour and Dixon (16) showed that a crude low-molecular-weight RNA fraction from trout testis polysomes possesses mRNA activity for protamine synthesis when assayed in a preincubated cell-free trout liver ribosomal system. More recent studies (17) show that protamine mRNA can be translated efficiently in heterologous cell-free systems derived either from Krebs II ascites cells (18) or rabbit reticulocyte lysates (19) to yield polypeptides identical with the three natural protamine components. Successful translation (20) has also been achieved in vitro with the wheat-germ cell-free system (21) and in vivo by injection into oocytes or fertilized eggs of Xenopus laevis (22).

With the sensitive assay provided by the incorporation of [³H]- or [¹⁴C]arginine into protamine in the Krebs II ascites S-30 system (17), it has been possible to purify extensively the protamine mRNA present in trout testis cells (20). In contrast to histone mRNAs, which protamine mRNA might have been expected to resemble, a short poly(A) tract is present at the 3'-hydroxyl end of the molecule, and this has allowed purification by binding to oligo(dT)-cellulose columns (23). If either the total polysomal or postribosomal RNA fractions are passed through such columns, a fraction comprising 1-3% of the total RNA is strongly adsorbed at high salt concentrations (20) and can be eluted by reducing the salt concentration to zero. While this fraction contains the major portion of the total mRNA activity, there is always another fraction of protamine mRNA activity that passes through the column unretarded even if the pass-through fraction is recycled several times. These and other studies (17) indicate that there is a second form of protamine mRNA, present in both polysomal and, to a lesser extent, postribosomal supernatant RNA, that does not bind to oligo(dT)-cellulose and in which, therefore, the poly(A) tract is either absent or too short to allow stable hydrogen bonding. This poly(A)-free protamine mRNA [mRNA(no An)] in the pass-through peak has recently been purified by elution from polyacrylamide gels by comparison with artificially deadenylylated poly(A)-containing protamine mRNA [mRNA(An)] (24), and this material is also able to hybridize with a complementary DNA (cDNA) prepared from protamine mRNA(An) (25).

The translation products with both purified and crude protamine mRNA(An) and mRNA(no An) species appear to be qualitatively identical in each system tested (17) and in turn identical with the three


authentic protamine components as judged by both gel electrophoresis (17) and chromatography on CM-cellulose columns (17). However, it has been noted that messenger fractions prepared from batches of trout testis at different stages of development show considerably different ratios of arginine incorporation into the three protamine components (26). This result indicates that the three mRNAs for the three protamine components may not be synthesized in a completely coordinate fashion during testis development.

Further purification of protamine mRNA(An) has been achieved by sucrose density gradient centrifugation of the mRNA(An) fraction from oligo(dT)-cellulose. A major component of this fraction sediments at 6 S (20) and, after rebinding of this 6 S material to oligo(dT)-cellulose and its elution in low salt, an essentially homogeneous fraction of messenger RNA is produced. In Fig. 3, this highly purified protamine mRNA (20) is compared with trout testis 5 S ribosomal RNA following electrophoresis in a highly denaturing 99% formamide/polyacrylamide gel (27), and a single band migrating at a position consistent with a size of 200 ± 25 nucleotides is seen. Careful determination of the s20,w value of protamine mRNA(An) in the analytical ultracentrifuge yields a value of 6.5 S [observed s = 5.6 S × correction factor of 1.17 for 1.0 M NaCl/0.05 M Na2HPO4 buffer (28)]. This value is slightly higher than that previously published (20) but is consistent with independent data from the catalog of oligopyrimidines after depurination and nearest-neighbor analysis of the cDNA prepared from protamine mRNA. When s20,w values for a series of small RNAs are plotted against the log10 of the number of nucleotides (20), the s20,w value of 6.5 S extrapolates to a polynucleotide size of 230 residues.

The base composition of the most highly purified mRNA (eluted from the major band on a polyacrylamide gel such as that seen in Fig. 3) has been determined by separation (29) of the 3'-ended mononucleotides after complete digestion of the mRNA with ribonuclease T2 (30). The results are shown in Table I. Before the base composition of the non-poly(A) portion (coding plus noncoding regions) of the mRNA can be calculated, it is necessary to determine the length of the poly(A) region. The T4-induced polynucleotide kinase first described by Richardson (31) has been employed for labeling the 5' ends of unlabeled oligonucleotides (32), and we have used this procedure to label the poly(A) tract obtained by the digestion of unlabeled protamine mRNA with a mixture of pancreatic and T1 ribonucleases in 0.45 M sodium chloride. Under these digestion conditions, the poly(A) tract remains intact, but there is complete digestion of the remainder of the RNA at pyrimidine (RNase A) and guanine (RNase T1) residues. The resulting

[Figure 3: gel photograph with slots 1 and 2; band positions marked PmRNA, 5 S and 4 S; graphic not reproduced.]

FIG. 3. Formamide gel electrophoresis of RNA (20). RNA fractions were analyzed on a 10% slab polyacrylamide gel containing 98% formamide as follows: 20 µg of 5 S ribosomal RNA from trout testis (slot 1) and 10 µg of protamine mRNA(An) (slot 2). The position where tRNA migrates is shown by the 4 S arrow. (Protamine mRNA is abbreviated to PmRNA.) The technique of Staynov et al. (27) was employed for running formamide gels. Formamide (99%) was deionized by stirring with 5 g of mixed-bed ion-exchange resin (Bio-Rad AG 501-X8) per 100 ml for 1 hour, and then the resin was removed by filtration. One milliliter of a solution containing 100 mg of ammonium persulfate, 170 mg of Na2HPO4 (anhydrous) and 40 mg of NaH2PO4·H2O was mixed with 74 ml of deionized formamide containing 6.38 g of acrylamide and 1.12 g of bisacrylamide to give a final phosphate concentration of 20 mM at pH 7.5. The solution was degassed and polymerized by the addition of 0.15 ml of N,N,N',N'-tetramethylethylenediamine. The reservoir buffer was 20 mM sodium phosphate, pH 7.5, and the gel was prerun at 200 V (constant voltage) for 1 hour. Lyophilized RNA samples were dissolved in 10 µl of 99% formamide, heated at 60°C for 5 minutes, and cooled in ice. After addition of bromphenol blue, the samples were applied and electrophoresis took place for 8 hours at 150 V at room temperature until the marker dye was 2 cm from the bottom of the gel.

mixture of oligonucleotides was then labeled with very high specific activity [γ-³²P]ATP (0.5 to 1.0 × 10[exponent illegible] cpm/pmol) in the presence of polynucleotide kinase (33). This phosphorylated digest was passed through an oligo(dT)-cellulose column to separate the labeled poly(A) tract, which is retained on the column until the ionic strength is reduced


TABLE I
BASE COMPOSITION OF PROTAMINE mRNA(An)

Base    % of total    Residues,         Residues,             % base composition in coding
        molecule      total molecule    excluding An tail     plus untranslated region
A       29.9          69                51                    24.1
C       25.9          60                60
G       25.8          59                59
U                                       42
(G + C)                                                       56.1%

The mRNA was digested completely to 3'-mononucleotides with T2 ribonuclease (30), and aliquots of the digest were applied to a column of Partisil SAX (25 × 0.46 cm) in a "high-pressure" liquid chromatography apparatus (Varian LCH-1000). The solvent was 0.015 M KH2PO4, pH 3.35, and elution was at room temperature at a flow rate of 49 ml/hr. The area under the A260 trace for each mononucleotide was integrated, and the concentration was determined by comparison with standards. The authors are indebted to Dr. L. Brox, McEachern Laboratory for Cancer Research, University of Alberta, Edmonton, Alberta, for his cooperation in performing these analyses.

to zero. When this material was subjected to electrophoresis in an 8 M urea/15% polyacrylamide gel (34) together with markers of dTn and rAn (35), the autoradiograph seen in Fig. 4 was obtained. The gel

[Figure 4: autoradiograph; marker lane labeled with lengths of the poly(dT)/poly(rA) standards (18, 27, 36, 45) and sample lane labeled "Panc & T1 digest of protamine mRNA(An)"; graphic not reproduced.]

FIG. 4. Polyacrylamide gel electrophoresis of 5'-³²P-labeled oligo(A) tracts derived from protamine mRNA by digestion with pancreatic and T1 ribonuclease in 0.45 M NaCl. Following digestion, the 5' termini of the resultant oligonucleotides were labeled with polynucleotide kinase in the presence of high specific activity [γ-³²P]ATP. The poly(A) tracts were then isolated by binding to oligo(dT)-cellulose. The gel was prepared according to Sanger and Coulson (34) and was run for 6 hours at 750 V. Calibration mixtures (35) of dTn and rAn and the polynucleotide kinase were kindly supplied by Dr. Hans van de Sande, Division of Medical Biochemistry, Faculty of Medicine, University of Calgary.


system is so highly resolving that nucleotides differing in length by only one residue are separated (34), and it may be seen that the poly(A) tract is highly polydisperse in length, ranging from 10 to 40 residues with the peak of the distribution at 18 residues of A. Thus, the base composition in Table I may be corrected for an average content of A18 in the poly(A) tract. In summary, the (G + C)-content of the mRNA [corrected for the poly(A)] is 56.1% and G = C = 60 residues per molecule, while A is slightly higher than U (51 versus 42 residues per molecule).

Two points concerning the length of the poly(A) segment on protamine mRNA are worthy of note. First, the poly(A) is very significantly shorter than the 40-100 residues reported for globin mRNA (for review, see 36) or the 150-200 residue length in unfractionated mRNA from HeLa cells in culture (36, 37). It is not clear yet why the poly(A) tract should be so short on protamine mRNA, but it seems possible that three distinct classes of mRNA may exist in eukaryotic cells. The predominant class would comprise a number of mRNAs (36) with 100-200 residues in the poly(A), the second would have as yet only protamine mRNA as an example with 10-40 residues, and the third class, that containing the histone mRNAs (38), is apparently devoid of a poly(A) region. Second, Gedamu and Dixon (17), as noted above, found a fraction of protamine mRNA that could not be bound to the oligo(dT)-cellulose column even after recycling, and this fraction was termed poly(A)(-) protamine mRNA [mRNA(no An)]. Although this material is not fully characterized, it is clear that it possesses a significantly smaller S value, and it seems probable that the mRNA(no An) fraction may represent that portion of mRNA possessing a poly(A) of less than about 10 A residues, which can no longer bind to oligo(dT)-cellulose.
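The poly(A) correction described above is a subtraction followed by a ratio. The sketch below is illustrative only: the A, C and G residue counts are those legible in Table I, the U count of 42 is taken from the running text, and the tail length of 18 is the peak of the poly(A) length distribution; the function names are hypothetical.

```python
# Residues per molecule: A, C, G as legible in Table I; U from the running text.
TOTAL_RESIDUES = {"A": 69, "C": 60, "G": 59, "U": 42}
MEAN_POLY_A_LENGTH = 18  # peak of the poly(A) length distribution


def remove_poly_a(counts, tail_length):
    """Return base counts after subtracting the poly(A) tail from A."""
    corrected = dict(counts)
    corrected["A"] -= tail_length
    return corrected


def gc_percent(counts):
    """Return the (G + C) content as a percentage of all residues."""
    return 100.0 * (counts["G"] + counts["C"]) / sum(counts.values())


corrected = remove_poly_a(TOTAL_RESIDUES, MEAN_POLY_A_LENGTH)
print(corrected)
print(f"(G + C) content without poly(A): {gc_percent(corrected):.1f}%")
```

With these inputs the corrected A count is 51 and the (G + C) content comes to 56.1%, in agreement with the values quoted in the text; using the uncorrected counts instead would understate the (G + C) content of the coding plus untranslated region.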
There is evidence from two separate directions that protamine mRNA possesses an extensive secondary structure in solution consistent with the DNA-like equivalence of C and G in the base composition, and to a lesser extent that of A and U (Table I). First, yeast tRNAPhe shows a single melting transition at 56°C (Fig. 5). It is known that 42 out of the total 76 bases in this transfer RNA molecule are hydrogen-bonded in two double-helical regions of the molecule, one nine base-pairs long and the other twelve base-pairs long (39). Protamine mRNA, in contrast, shows a pronounced biphasic melting curve with the first transition at 49°C and the second at 76°C. A preliminary evaluation of this biphasic curve suggests one region or set of regions melting at 49°C, which are either shorter or slightly more mismatched than the corresponding region of yeast tRNAPhe, and a second region, melting at 76°C, which is either very long or very precisely base-paired to produce

[Figure 5: melting curves for protamine mRNA and yeast tRNAPhe; horizontal axis: temperature, °C; graphic not reproduced.]

FIG. 5. The melting profile of protamine mRNA(An) compared with that of yeast tRNAPhe. RNA samples (20 µg) were dissolved in 1 ml of 0.15 M NaCl/0.015 M Na citrate, and the melting profile was followed in a Gilford 2400S spectrophotometer.

an unusually stable structure. It is also likely that this high-melting region would be rich in G·C base-pairs. The second piece of evidence comes from the observation that protamine mRNA is particularly resistant to digestion by ribonuclease T1 (40), and only a very limited range of products, predominantly the dinucleotides A-G, C-G and U-G, appears after exposure to high concentrations of the enzyme at 37°C. If the protamine mRNA is first denatured by exposure to 100°C for 2 minutes and then rapidly cooled before the addition of T1, the digestion is enhanced approximately 3-fold although it is still very limited. This behavior might indicate that G-rich regions of the molecule are particularly tightly structured and unavailable to T1 digestion. In addition, the continued resistance even after denaturation tends to show that these G-rich regions can renature very rapidly upon cooling.

While the T1 ribonuclease digestion of protamine mRNA was very incomplete, the three major products were A-G, C-G and U-G in approximately equal proportions (40). Together they comprise 82% of the 5'


ends labeled by the polynucleotide kinase when the mRNA was digested at 37°C, and 65% if the mRNA was denatured at 100°C prior to digestion. Since protamine is so arginine-rich, two-thirds of the codons in the coding region must be for arginine, and since the coding region of about 100 nucleotides comprises 50% of the total mRNA, oligonucleotides derived from arginine codons should predominate in the chromatogram. There are six possible codons for arginine, in two sets, CGN and AG(A/G). The high proportion of A-G and C-G in the T1 digest fits well with the presence of CGG and AGG as frequent codons. However, U-G cannot be derived from arginine codons directly, and although some could arise from sequences such as Arg-Val (C-G↓U-G↓U-G), the large amount of U-G indicates that much of it must come from the noncoding regions of the mRNA.

The polynucleotide kinase labeling technique has also been applied to pancreatic ribonuclease digests of unlabeled mRNA. In this case, the digestion is more extensive, but the chromatogram (41) is still relatively simple. The sequences of the major oligonucleotides are presented in Table II. Three oligonucleotide sequences, G-G-U, G-G-G-U and G-G-G-C, fit with CGG codons for arginine in known protamine amino-acid sequences and are thus consistent with the high amount of C-G in the T1 digest. Two other oligonucleotides, A-G-A-U and A-G-A-G-U, fit into sequences involving an AGA arginine codon. In summary, the T1 and pancreatic ribonuclease digests indicate that several different arginine codons from both of the two sets are present in protamine mRNA, there being good evidence for CGG and AGA, and less-strong evidence for AGG and CGU.

A very effective method of sequencing mRNA is through the synthesis of labeled complementary DNAs using either DNA polymerase I (42) or avian myeloblastosis reverse transcriptase (43, 44) and a primer of oligo(dT) to base-pair with the 3' poly(A) region. Using reverse transcriptase purified from avian myeloblastosis virus,⁴ cDNA was prepared by incubating protamine mRNA with a (dT)12-18 primer, high-specific-activity [α-³²P]dTTP plus the three unlabeled deoxynucleoside triphosphates (25). In Fig. 6, it may be seen that the cDNA migrates at precisely the same mobility as protamine mRNA, thus indicating that the cDNA represents a full-length copy of the protamine mRNA. It is interesting to reflect that despite the apparently tight and extensive secondary structure of protamine mRNA, the reverse transcriptase must be able to melt out the structure without difficulty in the process of copying the entire sequence.
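The ribonuclease digests discussed earlier in this section rest on two cleavage rules stated in the text: ribonuclease T1 cuts after guanine residues and pancreatic ribonuclease cuts after pyrimidines. A minimal sketch of those rules follows; it models complete digestion of a fully accessible strand (unlike the structured mRNA described above), and the function names and test sequences are illustrative, not taken from the protamine mRNA sequence.

```python
def digest(rna, cut_after):
    """Split an RNA string on the 3' side of every base listed in cut_after."""
    fragments, current = [], ""
    for base in rna.upper():
        current += base
        if base in cut_after:
            fragments.append(current)
            current = ""
    if current:  # keep any 3'-terminal fragment that lacks a cleavage site
        fragments.append(current)
    return fragments


def t1_digest(rna):
    """Ribonuclease T1: cleavage after G."""
    return digest(rna, "G")


def pancreatic_digest(rna):
    """Pancreatic ribonuclease (RNase A): cleavage after C or U."""
    return digest(rna, "CU")


def combined_digest(rna):
    """Both enzymes together: only runs of A survive intact."""
    return digest(rna, "GCU")


# Arg-Val written as CGU GUG gives the C-G, U-G, U-G products cited in the text.
print(t1_digest("CGUGUG"))
# Arg-Val written as AGA GUC yields A-G-A-G-U under pancreatic digestion.
print(pancreatic_digest("AGAGUC"))
# A hypothetical 3' end: the poly(A) run is left as a single fragment.
print(combined_digest("GCU" + "A" * 18))
```

The last call illustrates why the combined digest can be used to isolate the poly(A) tract: every other fragment ends at a G, C or U, while an uninterrupted run of A contains no cleavage site.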

⁴ We are greatly indebted to Dr. Joseph W. Beard, Life Sciences-Research Laboratories, Gulfport Laboratory, 1509½ 49th St. South, St. Petersburg, Florida 33707, for supplying us with highly purified reverse transcriptase from avian myeloblastosis virus as part of the N.I.H.-N.C.I. program.


Oligonucleotide sequence    Possible assignment in coding region

pG-C                        Arg or Ser
pG-U                        Arg or Ser
pG-G-U                      Arg-Ser
pG-G-G-U                    Arg-Val
pG-G-G-C                    Arg-Ala
pA-G-A-G-U                  Arg-Val
pA-G-U                      Ser
pA-G-A-U                    Arg-Ser
pA-A-U                      Met (Initiation)

A corresponding cDNA was also prepared from high-specific-activity [α-32P]dCTP and mixed with the [α-32P]dTTP-labeled product. The mixture was then depurinated by the method of Burton and Petersen (45), and the oligopyrimidine products were separated on the two-dimensional system described by Barrell (41) as modified by Ling (46). An autoradiograph of this separation is shown in Fig. 7. Each spot was eluted and counted so that a complete catalog of the number of copies of each pyrimidine cluster per molecule of cDNA could be calculated (40). In addition, sequence determination was possible on a number of oligopyrimidines by sequential digestion from the 3' end with snake venom diesterase (46) and from the 5' end with spleen diesterase. The largest

FIG. 6. Polyacrylamide gel electrophoresis of cDNA to protamine mRNA. Protamine mRNA (slot 1) was prepared according to Gedamu and Dixon (20), and protamine cDNA (slot 2) labeled with [α-32P]dTTP was prepared and purified [Iatrou and Dixon (25)]. The radioactive markers dTn (slot 3) were a gift from Dr. H. van de Sande. Samples were lyophilized, dissolved in 99% formamide and applied on a 15% polyacrylamide/8 M urea gel prepared according to Sanger and Coulson (34). Electrophoresis was performed as described in the legend to Fig. 4. At the end of the electrophoresis the gel was stained overnight with Stains-All, destained and then autoradiographed.


FIG. 7. Fingerprint of the depurination products from a mixture of 32P-C- and -T-labelled protamine cDNA. Separation of the depurination products from a mixture of two protamine cDNAs, one labeled with [α-32P]dCTP and the other with [α-32P]dTTP. In each case, the cDNA was purified by elution of the 6 S band from a polyacrylamide gel (53). Depurination was performed under the conditions of Burton (45), and the oligopyrimidines were separated by the two-dimensional technique of Ling (46): horizontally, electrophoresis in 7 M urea, pH 3.5; vertically, homochromatography.

oligopyrimidine tract, an undecanucleotide, was present as one copy per mole and had the composition C7T4 (see Fig. 7). The sequence determination of this oligonucleotide is illustrated in Fig. 8; the sequential digestion from both 5' and 3' ends for seven residues provided an overlap of three nucleotides and a total sequence of 5'-CCTCCTCTCCT-3'. In Fig. 9, this sequence is seen to correspond with a messenger sequence of 5'-AGG-AGA-GGA-GG(C/U), which, in turn, codes unequivocally for -Arg-Arg-Gly-Gly-, a polypeptide sequence present in all three protamine components near the C-terminus (see Fig. 1). Further depurination experiments with [α-32P]dATP- and [α-32P]dGTP-labeled cDNA showed that C7T4 was labeled on its 3'-phosphate when derived from G-labeled cDNA but not A-labeled cDNA, so that nearest-neighbor considerations

FIG. 8. Sequence analysis of C7T4 (5'-C-C-T-C-C-T-C-T-C-C-T-3'). Separation of the products of snake venom diesterase (right) and spleen diesterase (left) digestion of C7T4. The oligodeoxynucleotide C7T4, labeled with [α-32P]dCTP, was prepared as described in the legend to Fig. 6. Conditions for the partial digestion of dephosphorylated C7T4 using the two diesterases were based on those of Galibert et al. (50). The digestion products were separated in two dimensions according to Brownlee (49) (electrophoresis at pH 3.5 horizontally and homochromatography vertically) and made visible by autoradiography.


cDNA          3'-G-TCCTCTCCTCC-5'
mRNA          5'-(CG)C-AGG-AGA-GGA-GGY-3'
Polypeptide   -Arg-Arg-Gly-Gly- (Pos. 26-29)

FIG. 9. Coding sequence of mRNA corresponding to d-C7T4. The G at the 3' end of C7T4 was determined by labeling cDNA with [α-32P]dGTP and demonstrating incorporation of 32P at the 3' end after depurination. [α-32P]dATP-cDNA did not yield labeled C7T4 after depurination.

would place G at the 3' end (Fig. 9). The amino-acid residue on the amino-terminal side of -Arg-Arg-Gly-Gly- is Arg in all protamine components, indicating that the next codon must be CGC (Fig. 9). The sequences of seven other pyrimidine clusters (extended in some cases as described above by nearest-neighbor labeling with A or G) correspond with sequences in the coding region of the mRNA and are depicted in Fig. 10. One clear conclusion from the depurination products is that, of the 21 or 22 arginine positions in the three protamines, at least 11 are coded by either AGA or AGG, the remainder being coded by CG(N) codons. From the sequence of C7T4, it is clear that both CG(N) and AG(A/G) codons can

FIG. 10. A catalog of oligopyrimidines derived from depurination of [α-32P]dCTP- and [α-32P]dTTP-labeled cDNA that yield mRNA sequences in the coding region. For each oligopyrimidine the figure lists its composition, the number of moles per mole, the cDNA sequence, and the corresponding mRNA sequence with its amino-acid assignment.


FIG. 11. Interrelationships of the codons for the limited range of amino acids present in protamine.

occur in a single arginine tract. This point is supported by the observation of the sequences C-AGA-AGA- and C-AGG-AGG-, each of which must code for a different -Arg-Arg- sequence. Since there is only a single sequence, -Arg-Arg-Val-, in the molecule containing a tract of only two adjacent arginines, one of the two sequences must come from a longer arginine tract of at least four residues, in which case CG(N) codons must also occur in the tract.

An interesting relationship becomes apparent when the codons for the very limited range of amino acids in protamine are examined (Fig. 11). The AGA and AGG codons for arginine are related by a single transition interchange, whereas CGA and CGG are related to AGA and AGG by single transversion interchanges. Further, single transversions can produce CGU and CGC from either CGG or CGA. Thus, the complete set of six arginine codons forms the core of the diagram in Fig. 11. All the other amino acids in protamine that occur more than once, i.e., Ser, Pro and Gly, have codons related to one of the arginine codons by a single base change, and those that are less frequent, i.e., Val, Ala or Ile, would require 2 to 3 base changes to be derived from arginine codons.

These interrelationships have suggested a possible origin of the coding portion of the protamine genes outlined in Fig. 12. It is proposed that the archetypal protamine gene might have arisen from a repetitive portion of DNA, (TCT)n·(AGA)n. Studies by Morgan (47) on synthetic duplex DNAs of this type have shown that the transcription of such a duplex is overwhelmingly from the pyrimidine strand in a ratio of at least 100:1,


leading in the case of the putative archetypal protamine gene to a mRNA of repeating structure (AGA)n. As indicated in Fig. 12, stage 1, this mRNA would be translated as poly(Arg), a product that presumably possessed some selective advantage in the packing and condensation of DNA in the sperm heads of primitive vertebrates. Subsequent mutation of A in the third position leading to AGG codons could be tolerated since poly(Arg) would still be synthesized (stage 2). Further mutation in the first position to C is also conservative and would lead to the second set of arginine codons. However, an equal possibility of AG(A/G) mutation to AG(U/C) exists, which would lead to the introduction of serine residues into the poly(Arg) (stage 3). Serine is found as a constant constituent of all protamines so far described (3), and as outlined in Fig. 2, the phosphorylation and dephosphorylation of 3 to 4 serines plays an apparently obligatory role in the binding of newly synthesized protamines to nucleohistone, the displacement of the histones and the formation of nucleoprotamine. Thus, the appearance of serine residues may well have possessed selective advantage in the function of the primitive poly(Arg) protamines. In stages 4 and 5 (Fig. 12), the remaining codons of present-day protamine for Pro, Gly, Ala, Val and Ile would have arisen by further single-base changes. As seen in the lower two lines of Fig. 12, there is an inverse correlation between the frequency of occurrence of a particular amino acid in protamine and the number of mutations required to derive its codon from AGA; i.e., the 3 to 4 serine codons require only a single base change while the isoleucine codon, which occurs only once in one


FIG. 13. Separation of an endonuclease IV (48) digest of [α-32P]dCTP-labeled protamine cDNA. [α-32P]dCTP-labeled protamine cDNA was prepared, purified (25) and dissolved in endo-IV buffer (48) prior to digestion with endonuclease IV (kindly supplied by Dr. P. D. Sadowski). Two equal aliquots of this solution, each containing 425 ng of cDNA (5.4 x 10 Cerenkov cpm), were mixed with 5 units and 10 units of enzyme, respectively. The digests were incubated for 3 hours at 38°C, combined, lyophilized, dissolved in 10 µl of H2O and separated in two dimensions (49). In the first dimension, electrophoresis at pH 3.5 in pyridine/acetate/urea buffer was for 3 hours at 4.5 kV. Separation in the second dimension was performed using a 5%, 15-minute hydrolyzed "homomix" (49). Autoradiography was for 12 hours.


of the three protamine sequences, requires four mutations. A test of this hypothesis is available by hybridizing a labeled synthetic polynucleotide, (TCT)n, to protamine mRNA, and this experiment is now under way.

Further sequencing of the cDNA for protamine mRNA has been carried out by digestion with endonuclease IV (48) to produce oligonucleotides of the general structure pC(Nn)-NOH, which are then isolated by two-dimensional chromatography (49). Such oligonucleotides from a digest of [32P]dCTP-labeled protamine cDNA (Fig. 13) were thus separated. The sequence analysis of endonuclease IV spot No. 8 after digestion with snake venom diesterase (50) is depicted in Fig. 14. Analyses of several of the spots as well as of a depurination product (CT5, with the sequence 5'-TCTTTT-3') yielded mRNA sequences that do not correspond to coding regions.

FIG. 14. Sequence analysis of spot No. 8 from the endonuclease IV digest illustrated in Fig. 13. Spot No. 8, obtained by partial digestion of cDNA by endonuclease IV as described in the legend to Fig. 13, was eluted from the DEAE-cellulose thin-layer plate (49) and lyophilized three times. A partial snake venom digest (50) of the [32P]-labeled oligonucleotide comprising spot No. 8 was separated in two dimensions (49) and autoradiographed.

Proudfoot and Brownlee (42) have established an extended sequence of 52 residues from the 3' poly(A) region into the 3' noncoding region of β-globin, and we have found that the protamine mRNA sequences derived from the four oligonucleotides mentioned above show significant homology with regions in this 3' noncoding region of β-globin mRNA, as shown in Fig. 15. Two sequences in protamine mRNA are of particular interest; the first is AAUAAA, which has been shown (51) to be present in this region in all six eukaryotic mRNAs they have examined, namely, rabbit α- and β-globin, human α- and β-globin, immunoglobulin light-chain, and ovalbumin. The second sequence of interest is the U-rich region, which also seems to be a common feature of both eukaryotic and some viral mRNAs (52) and may be related to the termination signal for mRNA transcription from the DNA template. Also in Fig. 15 the sequence of β-globin mRNA is arranged in one of the two H-bonded, looped structures proposed for β-globin mRNA by Proudfoot and Brownlee (42), and it may be seen that the protamine mRNA sequences are compatible with the maintenance of this double-loop structure.

FIG. 15. Comparison of noncoding sequences in protamine mRNA with the 3' noncoding region of β-globin mRNA (42). The vertical arrows indicate base substitutions. The looped structure is one of those proposed by Proudfoot and Brownlee (42) for the 3' region of β-globin mRNA, and the bases marked with a dot indicate those positions that are homologous in protamine mRNA.

The small size and pronounced secondary structure of protamine mRNA make it a favorable candidate for a complete primary structure determination, with the further possibility of crystallization leading to the elucidation of the three-dimensional structure of a eukaryotic messenger RNA. Such studies should illuminate important questions such as: What features of the mRNA are important for ribosome binding? Do the 5' and 3' noncoding termini fold into specific structures? Are specific folded structures present that include the initiation and termination codons? Are there common sequences or secondary structures that are recognized by the mRNA-binding proteins present in mRNP particles?

ACKNOWLEDGMENT

This work has been supported by the Medical Research Council of Canada through a Negotiated Development Grant to the Faculty of Medicine, University of Calgary.

REFERENCES

1. A. J. Louie and G. H. Dixon, JBC 247, 5490 (1972).
2. T. Ando and S. Watanabe, Int. J. Prot. Res. 1, 221 (1969).
3. T. Ando, M. Yamasaki and K. Suzuki, "Protamines-Isolation, Characterisation, Structure and Function." Springer-Verlag, Berlin and New York, 1973.
4. V. Ling, J. R. Trevithick and G. H. Dixon, Can. J. Biochem. 47, 51 (1969).
5. V. Ling and G. H. Dixon, JBC 245, 3035 (1970).
6. V. Ling, B. Jergil and G. H. Dixon, JBC 246, 1168 (1971).
7. C. J. Ingles and G. H. Dixon, PNAS 58, 1011 (1967).
8. A. J. Louie and G. H. Dixon, JBC 247, 7962 (1972).
9. A. J. Louie and G. H. Dixon, Can. J. Biochem. 52, 536 (1974).
10. K. Marushige and G. H. Dixon, Dev. Biol. 19, 397 (1969).
11. B. M. Honda, D. L. Baillie and E. P. M. Candido, FEBS Lett. 48, 156 (1974).
12. B. M. Honda, D. L. Baillie and E. P. M. Candido, JBC 250, 4643 (1975).
13. D. R. Hewish and L. A. Burgoyne, BBRC 52, 504 (1973).
14. A. L. Olins and D. E. Olins, Science 183, 330 (1974).
15. R. D. Kornberg and J. O. Thomas, Science 184, 868 (1974).
16. R. S. Gilmour and G. H. Dixon, JBC 247, 4621 (1972).
17. L. Gedamu and G. H. Dixon, JBC 251, 1446 (1976).
18. M. B. Mathews and A. Korner, EJB 17, 328 (1970).
19. H. Lamfrom and P. M. Knopf, JMB 9, 558 (1964).
20. L. Gedamu and G. H. Dixon, JBC 251, 1455 (1976).
21. B. E. Roberts and B. M. Peterson, PNAS 70, 2330 (1973).
22. L. Gedamu, J. B. Gurdon, R. S. Gilmour, T. W. Wu and G. H. Dixon, in preparation (1976).
23. G. Brawerman, ARB 43, 621 (1974). See also Volume 17 of this series.
24. L. Gedamu, K. Iatrou and G. H. Dixon, submitted for publication (1976).
25. K. Iatrou and G. H. Dixon, submitted for publication (1976).
26. L. Gedamu and G. H. Dixon, in preparation (1976).
27. D. Z. Staynov, J. C. Pinder and W. B. Gratzer, Nature NB 235, 108 (1972).
28. W. Bauer and J. Vinograd, Proced. Nucl. Acid Res. 2, 297 (1971).
29. P. Brown, "High Pressure Liquid Chromatography." Academic Press, New York, 1973.
30. G. W. Rushizky and H. A. Sober, JBC 238, 371 (1963).
31. C. C. Richardson, JMB 15, 49 (1966).
32. M. Székely and F. Sanger, JMB 43, 607 (1969).
33. G. Chaconas, H. van de Sande and R. B. Church, Anal. Biochem. 69, 312 (1975).
34. F. Sanger and A. R. Coulson, JMB 94, 441 (1975).
35. B. W. Kalisch and H. van de Sande, in press (1976).
36. B. Lewin, Cell 4, 77 (1975).
37. E. H. Davidson and R. J. Britten, Q. Rev. Biol. 48, 565 (1973).
38. M. Grunstein, S. Levy, P. Schedl and L. Kedes, CSHSQB 38, 717 (1973).
39. S. H. Kim, G. J. Quigley, F. L. Suddath, A. McPherson, D. Sneden, J. J. Kim, J. Weinzierl and A. Rich, Science 179, 285 (1973).
40. G. H. Dixon, P. L. Davies, L. N. Ferrier, L. Gedamu and K. Iatrou, in "Symposium on Molecular Biology of the Mammalian Genetic Apparatus" (P. O. P. Ts'o, ed.), Vol. A, Part I. Associated Scientific Publishers, Elsevier-Excerpta Medica-North Holland, Amsterdam and New York, in press (1976).
41. B. G. Barrell, Proced. Nucl. Acid Res. 2, 751 (1971).
42. N. J. Proudfoot and G. G. Brownlee, Nature 252, 359 (1974). See (51) below.
43. I. M. Verma, G. F. Temple, H. Fan and D. Baltimore, Nature NB 235, 163 (1972).
44. D. L. Kacian, S. Spiegelman, A. Bank, M. Terada, S. Metafora, L. Dow and P. A. Marks, Nature NB 235, 167 (1972).
45. K. Burton and G. B. Petersen, BJ 75, 17 (1960).
46. V. Ling, JMB 64, 87 (1972).
47. A. R. Morgan, JMB 52, 441 (1970).
48. P. D. Sadowski and I. Bakyta, JBC 247, 405 (1972).
49. G. G. Brownlee, in "Laboratory Techniques in Biochemistry and Molecular Biology" (T. S. Work and E. Work, eds.), North-Holland Publ., Amsterdam, 1972.
50. F. Galibert, J. W. Sedat and E. B. Ziff, JMB 87, 377 (1974).
51. N. J. Proudfoot, C. C. Cheng and G. G. Brownlee, this volume, p. 123.
52. K. N. Subramanian, P. K. Ghosh, R. Dhar, B. Thimmappaya, S. B. Zain and S. M. Weissman, this volume, p. 157.
53. S. Levy, P. Wood, M. Grunstein and L. Kedes, Cell 4, 239 (1975).


The Primary Structure of Regions of SV40 DNA Encoding the Ends of mRNA

KIRANUR N. SUBRAMANIAN, PRABHAT K. GHOSH, RAVI DHAR, BAYAR THIMMAPPAYA, SAYEEDA B. ZAIN, JULIAN PAN AND SHERMAN M. WEISSMAN

Departments of Human Genetics and Internal Medicine
Yale University School of Medicine
New Haven, Connecticut
and
Cold Spring Harbor Laboratory
Cold Spring Harbor, New York

I. Introduction

The small DNA virus SV40 provides an excellent model for the analysis of the structure and mechanism of expression of genes functioning in the nucleus of an animal cell. The virus can code for a protein of mass no more than 170,000 daltons; over one-third of this consists of capsid proteins that are formed only after DNA replication has begun. “Early” transcripts can be made even in the absence of new protein synthesis, and therefore must be produced by preexisting host cell machinery. Presumably, the processes involved in “early” gene expression and most of those for “late” gene expression are available and utilized for host genes also, so that detailed knowledge of SV40 could provide appropriate analogs for the function of host cell genes.

We have analyzed the cytoplasmic polyadenylate-containing RNA in cells infected with SV40, to locate the precise sequences of the genome encoding the 3' and 5' ends of cytoplasmic RNA. We then examined the nucleotide sequences of the SV40 DNA overlapping the regions. Analyses were performed both on RNA transcripts of restriction fragments of the virus DNA and directly on the DNA. Interpretation of the results is limited, in that the sites of initiation and termination of transcription on viral DNA may not correspond to the ends of the cytoplasmic RNA because of possible nucleolytic processing of precursors. Unfortunately, no system is available for faithful initiation or termination of transcription


with animal-cell RNA polymerase II. However, certain analogies can be noted between untranslated sequences in SV40 and sequences at the ends of prokaryotic DNA transcripts.

II. Material and Methods

These have been reviewed previously (1, 2, et op. cit.). Restriction fragments were prepared by acrylamide gel electrophoretic fractionation of digests of SV40 DNA. The authors are particularly grateful to Dr. Richard Roberts of Cold Spring Harbor Laboratory for providing them with information and samples of many restriction endonucleases. Sequence analyses were performed both on RNA transcripts of DNA fragments and directly on the fragments.

Radioactive RNA was prepared from African green monkey cell lines (RSC1, VERO or CV-1), grown in the presence of [32P]phosphate and harvested at times varying from 18 to 48 hours after infection with SV40. The cells were disrupted and nuclei were removed by centrifugation. RNA was extracted from the cytoplasm and fractionated on oligo(dT)-cellulose columns by standard methods. Polyadenylate-containing RNA was annealed to SV40 DNA fragments immobilized on filters, and the retained RNA was analyzed by T1 or pancreatic nuclease digestion and chromatography of the oligonucleotides.

III. Results

The location and sequences of the DNA about the 3' end of “late” RNA were described some time ago (1, 2). This site is particularly easy to localize because of the relative abundance of “late” RNA from this region and the distinctive size and sequence of the last T1 oligonucleotides within the “late” message and of the first T1 RNase product missing from “late” RNA. Specifically, the sequence in the DNA immediately around the 3' end of “late” mRNA is

1       10         20         30
CAAUAAACAA-GUUAACAACA-ACAAUUGCAU-UCAUUUUUAU

where the first guanylic acid is contained within the sequence of “late” RNA while the second is not. The sequence proceeds as follows:

40        50
GUUUCAGGA....GGUUUUUUAAAC

There is a sequence beyond residue 48 that resembles sequences at the 3' end of several prokaryotic transcripts such as the λ 6 S RNA


transcribed in vivo or in vitro (3-8), the λ 4 S RNA in vitro (8-10), the 3' end of a Bacillus subtilis precursor (11), inter alia, and also the recently reported attenuator sequence for the tryptophan operon (12). These sequences are shown here with arrows showing where transcription of 4 S and 6 S RNA and the tryptophan “leader” sequence terminate in vitro. The nucleotide sequences beyond the (conditional) termination point are included.

SV40

(;-(;-U 6-A-A-A-C

1

X 6

S IlNA

C-(:-(;-(;-A-UG-A-U-A-U-C-U-G-C-A-C-A-A-C . . .

1 X 4 S IlNA

H . subtilis precursor

(;-C;-C-(:-UG-A-U-U-G-CI-U-G-A-C;

.

..

(:-(:-U8-G-U-U-U-U-U-u-(;

1 Trp attcnuator

C-C;-(;-G-C-Ua-Us-(:-A-A-A
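The feature these sequences share, a run of six or more uridylates near the 3' end, is easy to look for mechanically. The following is a minimal sketch; the function name and the six-residue default are illustrative choices taken from the wording of the text, and the input is the SV40 “late” segment quoted above (GGUUUUUUAAAC).

```python
import re


def uridylate_runs(rna, min_len=6):
    """Return (start, length) for each run of at least min_len consecutive U.

    Positions are 1-based and relative to the string that is passed in.
    """
    pattern = re.compile("U{%d,}" % min_len)
    return [(m.start() + 1, m.end() - m.start()) for m in pattern.finditer(rna)]


if __name__ == "__main__":
    # SV40 "late" segment beyond residue 48, as quoted in the text
    print(uridylate_runs("GGUUUUUUAAAC"))  # [(3, 6)]
    # The preceding segment contains only a short U run and gives no hit
    print(uridylate_runs("GUUUCAGGA"))     # []
```

Lowering `min_len` would also report shorter runs, which may be useful when comparing with sequences in which the U tract is interrupted.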

The 3' end of the “early” RNA of SV40 cannot be located quite so precisely, since there is much less “early” than “late” RNA, there is overlap of the 3' ends of “early” and “late” RNA, and several of the “early” strand T1 RNase products from the region of the 3' end of message have the same chain length and composition as products from internal sequences of the “early” message that lie near the 3' end. The end of “early” message includes as a minimum the sequence complementary to the SV40 “late” strand RNA sequence presented above and an additional 11 nucleotides. As a maximum, it extends about another 45 nucleotides beyond that sequence, so that in the “early” strand sequence (1)

CACAAAUAAAGCAUUUUUUUC-ACUG...

the first guanylic acid could possibly be included in “early” message. The T1 RNase product CAUUUUUUUCACUG has not consistently been detected in “early” message, so we believe “early” message stops near that guanylic acid. Again there is a sequence containing 6 uridylates at or beyond the 3' end of the transcript.

We have examined the sequences of over 66% of SV40 and found a total of 12 stretches of 6 or more deoxyadenylates, including the two already referred to. Of the additional deoxyadenylate stretches, five occur on the “late” strand of DNA from the “early” region of the virus, five occur on the “early” strand of DNA from the “late” region, and one occurs within the sequence at the origin of DNA replication.

Location of the 5' Ends of SV40 Cytoplasmic RNA

The position of the 5’ end of SV40 “early” RNA was studied with preparations of RNA from cells infected in the presence of arabinosylcytosine and also with RNA from cells “late” in the infectious cycle in


the absence of inhibitors. No attempt was made to analyze possible “capped” structures in this RNA, since there was so little radioactivity present and even the presence of caps would not exclude the existence of other longer RNA specimens overlapping the region.

To analyze the 5' ends of “early” and “late” RNA, we annealed the preparation to SV40 DNA fragment EcoRII-G (13). This fragment contains the origin of DNA replication and also encodes the 5' ends of the longest cytoplasmic RNA species that we have been able to detect. A portion of its sequence is presented in Fig. 1. Oligonucleotides corresponding to T1 RNase products of the “early” strand transcript could be detected from position 1 up to about position 150 of the EcoRII-G sequence. Beyond 150 no “early” strand products were found. Preceding the 5' end of “early” RNA in EcoRII-G is a sequence of 17 consecutive A·T base-pairs, bounded by relatively G·C-rich sequences. There are also (A+T)-rich segments (16-18 A·T base-pairs) in the DNA lying 30 base-pairs to the 5' end of the “late” strand sequence of EcoRII-G.

The sequences defining the origin of DNA replication presumably lie principally within the DNA fragment Hind-C, since Nathans and his colleagues (14) have obtained a deletion variant of SV40 that lacks the Hind-A fragment but still replicates in the presence of helper virus. Shenk, Carbon and Berg (15) have obtained a viable deletion mutant of SV40 that lacks a sequence rightward from somewhere between positions 198 and 207 to about position 230. Other deletion mutants lacked sequences farther to the right. Therefore, sufficient sequence to specify the origin of replication must lie between about positions 85 and 200 in this DNA fragment.

The end of “late” strand RNA was more difficult to locate. There were clearly “late” strand products corresponding to much of the sequence. The products complementary to nucleotides 1-55 of the sequence in Fig. 1 were missing or present only at undetectably low levels. The products CA,G and CU,G complementary to positions 77-90 of the sequence were present; CA,G was more difficult to detect, but pancreatic RNase digestion gave A4GC, indicating the presence of this product. An oligonucleotide in the position of C,UC,A,,G was present. The next large T1 product was not clearly detected but could have been lost in the radioactive material we always had at the bottom of the chromatogram. The product CCG was present in more than molar amounts. This is significant because it was absent from “early” RNA maps. There was also only one source for an additional mole of CCG in the sequence 250 nucleotides preceding or following EcoRII-G. Therefore, the presence of

FIG. 1. Nucleotide sequence of a portion of an SV40 DNA fragment containing the 5' ends of “early” and “late” cytoplasmic RNA. The paired structure is drawn to emphasize symmetries in the DNA and probably does not represent a physical entity. “Late” and “early” refer to the DNA strand with the sequence of “late” and “early” message, respectively. The portions lying in Hind II and III fragments A and C are marked on the drawing.

CCG indicates that the complementary sequences in adjacent portions of the RNA from the region did not prevent RNA·DNA hybridization. There is much more abundant cytoplasmic transcript from most of the “late” region than from the sequences in EcoRII-G and adjacent DNA. The sequence also indicates that the “late” strand RNA from EcoRII-G may not be translated (if UGA is consistently a termination codon in animal cells). Perhaps the RNA represents a penultimate precursor to translated mRNA.
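The argument that a stretch of RNA "may not be translated" rests on finding termination codons in every reading frame. A minimal sketch of that check follows. It assumes the three standard termination codons (UAA, UAG, UGA), which is exactly the assumption the text flags for UGA in animal cells; the function name is illustrative and the test sequence is a made-up example, not the EcoRII-G sequence.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}  # assumed standard termination codons


def stops_by_frame(rna):
    """Map each reading frame (0, 1, 2) to the 0-based positions of its stop codons."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    result = {}
    for frame in range(3):
        result[frame] = [
            i
            for i in range(frame, len(rna) - 2, 3)
            if rna[i:i + 3] in STOP_CODONS
        ]
    return result


if __name__ == "__main__":
    toy = "AUGUGACCUAAGG"  # hypothetical sequence for illustration only
    print(stops_by_frame(toy))  # {0: [3], 1: [], 2: [8]}
```

A segment in which all three lists are non-empty cannot be read through in any frame; a frame with an empty list remains open over the stretch examined.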

IV. Discussion

A number of factors probably play a role in determining where transcription terminates in vivo in bacteria. There is clearly heterogeneity in termination signals, since some function in vitro with purified RNA polymerase while others require rho factor. The rho factor presumably serves as a general positive effector for the termination of transcription in vivo. Whether the transcript is being translated may also affect termination. There are probably numerous site-specific negative effectors of the termination of transcription. For example, the N-gene product of λ permits read-through into the delayed “early” genes of λ. Direct analysis of message RNA has shown that late in λ infection in the presence of Q-gene product, the phage transcript contains oligonucleotides extending from the last G in 6 S RNA to the first G beyond it. The tryptophan operon may be positively regulated by a mechanism that involves counteraction of a termination signal in the DNA preceding the sequences coding for the first enzyme (12).

Certain features of the sequences may be relevant to the termination of transcription. Runs of 6 or more uridylates, often followed or preceded by a purine, have been found at the 3' terminus of the transcripts described above. A wide range of examples will be needed to determine, for example, whether runs of pyrimidines that contain cytidylic acid also play a role in termination. Several sites already sequenced have either purine-rich or (G+C)-rich sequences preceding the uridylate runs. Conversely, the sequences beyond the site of termination in λ 6 S, λ 4 S and the tryptophan operon appear to diverge and may not play a role in termination. Another factor that may be of importance is that in several cases (but not the tryptophan operon) the subterminal sequences can base-pair extensively with sequences preceding them. As first suggested for the 6 S RNA, this could facilitate release of the transcript RNA from the template.

Base-pairing is not very favored for the terminal sequence, either of the major cytoplasmic “early” and “late” RNA of SV40 or of transcripts


that extend through the U-rich scqucnce presented above. However, the primary structure analogies to bacterial termination sites are fairly striking, and the distribution of deoxyadenylate stretches is consistent with their acting as tcrmination signals ( either for message transcription or for RNA primers for Okazaki fragments formed during DNA replication ). This could be biased if, for examplc, SV40 protcin had some excess of lysine sequenccs. If the scquences of uridylates do represent sites of attenuation of transcription, thcn some other features must determine the exact 3’ ends of the stable “late” mRNA, since it precedes the sequence ( d A ) , by about 30 nuclcotides. One component of such a signal could be the sequence A,UA,G or A,UA ,C mentioned in Section 111. A major reservation to the hypothesis that these sequences function as tcrmination signals in uivo is the fact that nuclear RNA “late” in the infectious cycle has been reported to contain RNA thc length of an entire SV40 genome ( 1 5 ) . Also nuclei contain anti-late RNA not found in the cytoplasm (16, our unpublished observation). Therefore, few if any of the deoxyadenylates are likely to serve as ahsolute stops for transcription in vivo. More than a dozen promoters for E . coli RNA polymerase have been sequenced. Although the necessary and sufficient sequence conditions for such promoters and the mechanism of initiation of RNA replication are incomplctely understood, certain featurcs appear in most known promoters. There is often a sequence similar to ATA,T, five to eight base-pairs preceding the polymerase start. In many cases, this sequence may be varied by a transition of one or more bases. The sequence, over 30 bases bcforc the start of transcription, may play a role in initiation and sometimes contains a cleavage site for the restriction endonuclease Hind-11. 
In view of the variability in the sequence of bacterial promoters, any attempt to identify new promoters from sequence alone, even in prokaryotic DNA, would lead to uncertain conclusions. However, there is at least one feature of the sequences preceding the 5' ends of SV40 “early” and “late” mRNA that is of interest. As discussed above, in each case there is a block of (A + T)-rich sequences between 30 and 100 nucleotides preceding the 5' ends as defined above. The location of the sequence preceding the 5' end of “early” RNA seems rather comparable to the sequence distance seen for bacterial promoters. No “early” strand sequence ATAAT is present near the 5' end of “early” RNA, although the (A + T)-rich block is preceded immediately by the sequence GCUGAC, which differs by one nucleotide from the Hind-II cleavage site that precedes, by about 30 nucleotides, the initiation site for transcription from several bacterial promoters. The structure of the (A + T)-rich segments on the “late” strand within the 100 nucleotides preceding EcoRII-G is complex. They also do not contain copies of the sequence ATA₂T. It is uncertain whether this represents a site for initiation of “late” mRNA, and unlikely that these could be the only site for initiation, since a deletion mutant of SV40 lacking these sequences (and the rest of the Hind-A fragment) still complements “late” functions. Alternative possibilities include the presence of more than one “late” mRNA promoter, a “late” promoter within the “late” message segment, or the possibility that sequences within EcoRII-G could function as a “late” (or bidirectional) promoter.
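The one-nucleotide difference between GCUGAC and the Hind-II site can be checked mechanically. The sketch below is illustrative only: it assumes the Hind-II recognition sequence GTYRAC (written here in RNA letters as GUYRAC), and the helper `mismatches` is a hypothetical name that does not come from the original work.

```python
# IUPAC ambiguity codes needed here (RNA alphabet); an illustrative subset.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG",    # purine
    "Y": "CU",    # pyrimidine
    "N": "ACGU",  # any nucleotide
}

def mismatches(seq: str, pattern: str) -> int:
    """Count positions where seq is not permitted by the IUPAC pattern."""
    if len(seq) != len(pattern):
        raise ValueError("sequence and pattern must have equal length")
    return sum(base not in IUPAC[code] for base, code in zip(seq, pattern))

# Assumption: Hind-II recognizes GTYRAC; written in RNA letters for comparison.
HIND_II = "GUYRAC"

print(mismatches("GCUGAC", HIND_II))  # the sequence discussed in the text
print(mismatches("GUUGAC", HIND_II))  # one sequence that satisfies the pattern
```

Under that assumption the text's sequence differs from the site only at its second position (C in place of U).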


V. Summary

We have determined the nucleotide sequences in SV40 DNA encoding the 5' and 3' ends of the longest species of cytoplasmic polyadenylylated RNA. At or shortly beyond the 3' ends of “early” and “late” RNA, there are stretches of 6 or more deoxyadenylates in the DNA. Preceding the 5' end of “early” RNA is a sequence including 17 successive A·T base-pairs and 8 successive deoxyadenylates. Some of the features of these sequences show similarities to sequences in prokaryotic DNA that function to initiate or terminate transcription.

REFERENCES

1. R. Dhar, S. Zain, S. M. Weissman, J. Pan and K. N. Subramanian, PNAS 71, 371 (1974).
2. R. Dhar, S. M. Weissman, B. S. Zain, J. Pan and A. M. Lewis, Jr., NARes 1, 595 (1974).
3. C. Larsen, P. Lebowitz, S. M. Weissman and B. Dubuy, CSHSQB 35, 35 (1970).
4. P. Lebowitz, S. M. Weissman and C. M. Radding, JBC 246, 5120 (1970).
5. J. Sklar, P. Yot and S. M. Weissman, PNAS 72, 1817 (1975).
6. F. R. Blattner and J. E. Dahlberg, Nature NB 237, 227 (1972).
7. F. Lee, C. L. Squires, C. Squires and C. Yanofsky, JMB 103, 383 (1976).
8. J. E. Dahlberg and F. R. Blattner, FP 32, 664 (1973).
9. D. Kleid, Z. Humayun, A. Jeffrey and M. Ptashne, PNAS 73, 293 (1976).
10. M. Rosenberg, B. de Crombrugghe and R. Musso, PNAS 73, 717 (1976).
11. M. L. Sogin, N. R. Pace, M. Rosenberg and S. M. Weissman, JBC (1976). In press.
12. K. Bertrand, C. Squires and C. Yanofsky, JMB 103, 319 (1976).
13. K. N. Subramanian, J. Pan, S. Zain and S. M. Weissman, NARes 1, 727 (1974).
14. C. J. Lai and D. Nathans, JMB 89, 179 (1973).
15. T. Shenk, J. Carbon and P. Berg, JMB, in press.
16. G. Walter and A. M. Martin, J. Virol. 16, 1236 (1975).
16a. J. Sklar and S. M. Weissman, FP 35, 1537 (1976).
17. O. Laub and Y. Aloni, J. Virol. 16, 1171 (1975).

Nucleotide Sequence Analysis of Coding and Noncoding Regions of Human β-Globin mRNA

CHARLES A. MAROTTA
The Psychiatric Research Laboratories, and The Department of Psychiatry, Massachusetts General Hospital, and the Harvard Medical School, Boston, Massachusetts

BERNARD G. FORGET, MICHEL COHEN-SOLAL
The Division of Hematology-Oncology of the Department of Medicine, Children's Hospital Medical Center, and The Department of Pediatrics, Harvard Medical School, Boston, Massachusetts

SHERMAN M. WEISSMAN
The Department of Human Genetics, Yale University School of Medicine, New Haven, Connecticut

I. Introduction

The analysis of human hemoglobin messenger RNA was undertaken to determine the exact nucleotide sequence of translated and untranslated regions. As isolated from reticulocytes, the molecular weight of the mixed pool of α- and β-globin mRNAs has been estimated to be 200,000-220,000 (1-4). RNA molecules of this size range contain 650-670 nucleotides. The coding regions of the α- and the β-chain require 423 and 444 nucleotide residues, respectively, and the poly(A) segment at the 3' end has been estimated to account for 50-75 residues (5, 6). Thus other, untranslated sequences may contain approximately 150-175 additional nucleotides.

Knowledge of the translated sequences of globin mRNA has important implications for determination of specific codon usage by eukaryotic cells and for measurements of the rate of mutational events when human and subhuman species are compared. Untranslated animal cell mRNA sequences have been shown to contain conserved homologous sequences at the 3' end (7, 8); thus, the biological significance of this region is implicit. Both coding and noncoding sequences have direct bearing on the structural characteristics of eukaryotic mRNAs.

The nucleotide structure of human hemoglobin α- and β-mRNA has particular importance since there exist many mutations in man, including five α-chain and two β-chain variants that contain abnormally long amino-acid sequences that may arise by means of chain-termination or frameshift mutations (9). The origin of three of these variants has been explained by detailed analysis of nucleotide sequences adjacent to the termination codon of the β chain or within the termination codon of the α chain (9, 10). Similar structural studies on human globin mRNA have demonstrated that in the thalassemia syndromes there is a quantitative deficiency of the affected chain-specific mRNA (11, 12).

Continuing our previous investigations on the structure of human hemoglobin mRNA, we report here the assignment of 275 nucleotides, derived from unique digestion products of complementary RNA (cRNA), to the coding region of β-mRNA, and a sequence of 43 nucleotides adjacent to the termination codon. Extensive sequences are tentatively assigned to the translated region by analysis of nonunique oligonucleotides. A number of assignments were confirmed by analysis of complementary DNA (cDNA). Sequences containing over 350 nucleotides are reported.
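The length budget quoted in the Introduction (total length minus coding region minus poly(A)) can be worked through explicitly. The following is a minimal sketch that uses only the figures given in the text; the function name `untranslated_range` is hypothetical, and treating the estimates as independent minimum and maximum bounds is an assumption made for illustration.

```python
def untranslated_range(total, coding, poly_a):
    """Return (min, max) untranslated length.

    total and poly_a are (min, max) estimates; coding is a single value.
    The smallest remainder pairs the shortest mRNA with the longest poly(A).
    """
    low = total[0] - coding - poly_a[1]
    high = total[1] - coding - poly_a[0]
    return low, high

TOTAL = (650, 670)                    # nucleotides per mRNA, from the text
POLY_A = (50, 75)                     # poly(A) residues, from the text
CODING = {"alpha": 423, "beta": 444}  # coding nucleotides, from the text

for chain, coding in CODING.items():
    print(chain, untranslated_range(TOTAL, coding, POLY_A))
```

On these bounds the α-chain message leaves 152-197 and the β-chain message 131-176 untranslated nucleotides; the two ranges overlap at 152-176, close to the approximate figure of 150-175 given above.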

II. Materials and Methods

Materials and methods are the same as described in previous reports (9-12). Reticulocyte RNA was derived from subjects with sickle cell disease, α- or β-thalassemia, or hemolytic anemia with Hb A. Globin mRNA was purified on sucrose gradients, Me₂SO-sucrose gradients, aqueous acrylamide gels, affinity chromatography columns, or 99% formamide-acrylamide gels; on the latter, the α- and β-mRNAs separate (12). After precipitation from gradients or elution from gels, the mRNA was transcribed into cDNA by RNA-dependent DNA polymerase of avian myeloblastosis virus; ³²P-labeled cRNA was transcribed from single-stranded cDNA by means of E. coli RNA polymerase in the presence of the four nucleoside triphosphates, one of which was labeled in each experiment by an [α-³²P]NTP (15). [³²P]cRNA was digested with T₁ or pancreatic RNase, two-dimensional maps were prepared and the resulting spots were analyzed (10). Native mRNA was iodinated prior to digestion (13). Sequence analysis was carried out on double-stranded human globin cDNA that was labeled with [³²P]dCTP (14, 23). The preparation of restriction endonuclease fragments was previously described (15).

III. Results

RNA was isolated from the 10 S region of a 5 to 20% sucrose gradient. In some cases, it was further purified by passage through a Me₂SO-sucrose gradient, a poly(U)-cellulose column, or an oligo(dT)-cellulose column (12). The resulting material served as a substrate for labeling procedures. Alternatively, 10 S RNA was separated into its component α- and β-mRNA moieties on 99% formamide-acrylamide gels. The more rapidly migrating (fast) band and the more slowly migrating (slow) band were previously identified to be α-mRNA and β-mRNA, respectively (12).

T₁ RNase digestion of cRNA transcribed from single-stranded cDNA (itself transcribed from sickle-cell 10 S mRNA in the presence of actinomycin D) and labeled with [α-³²P]GTP resulted in a chromatogram with a complex but well-defined nonrandom appearance that is characteristic for α- plus β-globin cRNA. With the use of the three other nucleoside [α-³²P]triphosphates, similar patterns were obtained, but the relative intensities of the spots varied. In order to select for β-mRNA oligonucleotides, cRNA was prepared from slow-band mRNA and from α-thalassemic mRNA previously shown to contain nearly exclusively β-globin mRNA (11, 12). The oligonucleotide patterns obtained from α-thalassemic and slow-band mRNA with T₁ RNase were simpler and demonstrated the absence or decreased yield of a number of the largest and most characteristic products present in sickle-cell cRNA. Many of the oligonucleotides retained on all three maps code for β-globin amino acids (11, 12). The absent digestion products were demonstrated to include α-mRNA oligonucleotides by preparation of a chromatogram derived from β-thalassemic cRNA, which contains mostly α-mRNA. These results are consistent with the hybridization studies of Housman et al. (17) and Forget et al. (12). T₁ RNase products from the above were matched by chromatographic position and by analysis of the products derived after pancreatic RNase digestion.

Table I lists the oligonucleotides coding for a unique region of human β-globin. Using the same techniques, primary pancreatic RNase digestion products were prepared and analyzed; the unique products found are listed in Table II. Oligonucleotides that code for more than one region of the β-globin chain were also catalogued (18).


TABLE I

Sequence                          Amino acid position

(G)UCUACCCUUG(G)                  34-37
(G)ACCCAG(A)                      37-40
(G)CAACCCUAAG(G)                  56-60
(G)CUCAUG(G)                      62-64
(G)CCUUUAG(U)                     70-72
(G)CAAG(A)                        64-66
(G)CUCACCUG(G)                    76-79
(G)ACAACCUCAAG(G)                 79-83
(G)CACCUUUG(C)                    83-86
(G)CCACACUG(A)                    86-89
(G)ACAAG(Y)                       94-96
(G)AACUUCAG(G)                    101-104
(G)CCCAUCACUUUG(G)                115-119
(G)AAUUCACCCCACCAG(U)             121-126
(G)CCUAUCAG(A)                    129-132
(G)CUAAUG(C)                      138-140
(G)CCCACAAG(U)                    142-146
(G)UAUCACUAAG(C)                  144-148

Untranslated sequences

T41a   (G)CYUUYUUG(Y)             149-152
T23    (G)UUCCCUAAG(U)            159-162
T6     (G)UCCAAUUUCUAUUAAAG(G)    153-159

Y indicates a pyrimidine nucleoside.

TABLE II
HUMAN β-GLOBIN mRNA UNIQUE PANCREATIC RNase OLIGONUCLEOTIDESᵃ

No.     Sequences            Amino acid position

P6      (U)GGAGAAGU(C)ᵇ      6-9
P27a    (Y)GAGGC(C)          26-27
P16     (C)AGAGGU(U)         39-41
P15     (Y)GGGGAU(C)         45-48
P19a    (Y)GAAGGC            60-62
P8      (C)AAGAAAGU          64-67
P19b    (C)AAGGGC(A)         81-84
P20     (Y)GAGAAC(U)         100-103
P13     (Y)AAAGAAU(U)        119-122
P12     (C)AGAAAGU(G)        130-133

ᵃ Y indicates a pyrimidine nucleoside, and N an undetermined nucleoside.
ᵇ Unique to sickle-cell β-mRNA.


To rule out the possibility that transcripts were incomplete or contained transcription errors, or that RNA polymerase did not initiate transcription precisely at the 3' end of cDNA, studies were carried out on native mRNA. When a T₁ RNase digest of iodinated native α-thalassemic mRNA was compared to one of cRNA, it became apparent that, although most products were identical, the iodinated mRNA contained some products that were not prominent in cRNA digests (see Discussion). To obtain further specific β-mRNA sequence information, sickle cell double-stranded cDNA was digested with a series of DNA restriction endonucleases (16); the data are listed in Table III. Size estimates of the fragments and comparison of the endonuclease-specific sequences with the results described above establish tentatively, and in some cases conclusively, the nucleotide arrangement at a number of sites along the β-globin mRNA chain. Partial venom diesterase digestion of cDNA fragments labeled at the 5' end provides confirmatory data (18).

TABLE III
RESTRICTION ENZYME FRAGMENTS OF SICKLE CELL cDNAᵃ

Enzyme     Length of fragments                      Sequence     β-Globin amino acid positions

Hae III    (i) 142  (ii) 114  (iii) 95  (iv) 73     GGCC         26-28, 74-75, 114-115ᵇ, 141-142
Eco RII    (i) 138  (ii) 98   (iii) 76              CC(A/U)GG    27-29, 75-76ᵇ, 77-79, 140-142
Hinf I     (i) 110  (ii) 44   (iii) >300            GANUC        43-44, 3-5ᵇ
Alu I      (i) 292  (ii) 167  (iii) 145             AGCU         90-91ᵇ, 95-96, 147-148
Mbo I      (i) 296  (ii) 153                        GAUC         47-48, 99-100
Bam HIᶜ                                             GGAUCC       98-100

ᵃ Preparation and characterization of fragments were the same as those described by Subramanian et al. (16). Length of fragments is expressed in number of nucleotides ±8. N indicates an unspecified nucleoside.
ᵇ Tentative assignment.
ᶜ Fragment size not determined.
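The step from an endonuclease-specific sequence to amino acid positions can be sketched as a search for the recognition sequence along the message, followed by conversion of nucleotide offsets to codon numbers. The example is illustrative rather than the authors' procedure: the helper `site_codons` is a hypothetical name, and the nine-nucleotide segment is codons 98-100 (Val-Asp-Pro) as read from Fig. 1.

```python
import re

# IUPAC letters used in Table III, expressed as regular-expression classes.
IUPAC_RE = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "[ACGU]", "W": "[AU]"}

def site_codons(mrna, first_codon, site):
    """Return (first, last) codon numbers spanned by each occurrence of site.

    mrna must start at the first nucleotide of codon number first_codon.
    A lookahead is used so that overlapping occurrences are all reported.
    """
    pattern = re.compile("(?=(" + "".join(IUPAC_RE[c] for c in site) + "))")
    spans = []
    for match in pattern.finditer(mrna):
        start = match.start()
        end = start + len(site) - 1
        spans.append((first_codon + start // 3, first_codon + end // 3))
    return spans

segment = "GUGGAUCCU"  # codons 98, 99, 100 as read from Fig. 1

print(site_codons(segment, 98, "GAUC"))    # Mbo I sequence
print(site_codons(segment, 98, "GGAUCC"))  # Bam HI sequence
```

For this segment the Mbo I sequence spans codons 99-100 and the Bam HI sequence codons 98-100, in agreement with the corresponding entries of Table III.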


IV. Discussion

Based upon the available sequence information, the primary structure of human sickle-cell β-mRNA was constructed as shown in Fig. 1; nucleotides tentatively assigned are placed in parentheses. Unique T₁ and pancreatic RNase digestion products (Tables I and II) allow placement of about 300 nucleotides, a number of which are contained in overlapping oligonucleotides. Most of the unique products cluster at sites distant from the 5' end of mRNA, indicating that RNA polymerase most often initiates transcription internally in cDNA. The unique pancreatic RNase product P6 (Table II) appeared as a faint spot on chromatograms prepared from sickle-cell cRNA and was absent on those prepared from Hb A cRNA; on the latter, a new and larger digestion product was seen. This observation is accounted for by the unique placement of P6 at the sickle cell mutation site (amino acid 6), in which valine replaces glutamic acid and the second-position nucleotide of the codon is changed from A to U. Because pancreatic RNase cleaves after pyrimidines, the transversion creates an additional cleavage site and results in a smaller pancreatic RNase digestion product on the sickle-cell chromatogram.

In addition to the above unique assignments, other sequences were tentatively positioned by analysis of excluded nucleotides. In this analysis the nonunique tri-, tetra- and pentanucleotides, determined from sequence analysis data, were catalogued. Small nucleotides that were not present were also listed. For example, (G)UAG, (G)UCG, (G)ACG, (G)AAAG(C), (Y)GGGC(G) and (Y)GGGU(A) are not prominent within α-cRNA transcripts. A third catalog was prepared containing the small oligonucleotides that can be accommodated by the β-globin amino-acid sequence. By comparison of the three catalogs, we could exclude alternative sequences at many consecutive codons; in most cases only one permitted sequence was allowable. The codons for over 40 amino acids were assigned in this manner.
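The effect of the sickle-cell transversion on the pancreatic RNase pattern can be reproduced with a short simulation of complete digestion. This is a sketch under stated assumptions: pancreatic RNase is taken to cleave after every C or U, the function names are hypothetical, and the undetermined third positions of codon 5 (CCN) and codon 9 (UCY) are filled with U purely for illustration.

```python
def digest(rna, cut_after):
    """Split rna after every base found in cut_after (complete digestion)."""
    fragments, current = [], ""
    for base in rna:
        current += base
        if base in cut_after:
            fragments.append(current)
            current = ""
    if current:  # trailing piece that does not end at a cleavage site
        fragments.append(current)
    return fragments

def pancreatic(rna):
    """Pancreatic RNase: cleavage after pyrimidines (C or U)."""
    return digest(rna, "CU")

# Codons 5-9 (Pro, Glu or Val, Glu, Lys, Ser).
# Assumption: the third bases of codons 5 and 9 are set to U arbitrarily.
HB_A = "CCU" + "GAG" + "GAG" + "AAG" + "UCU"  # normal: codon 6 is GAG
HB_S = "CCU" + "GUG" + "GAG" + "AAG" + "UCU"  # sickle: codon 6 is GUG

print(pancreatic(HB_A))
print(pancreatic(HB_S))
```

With these inputs the sickle-cell digest contains GGAGAAGU, the sequence of P6 in Table II, whereas the Hb A digest contains the longer product GAGGAGAAGU in its place.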
This analysis not only predicted nearly all the unique T₁ and pancreatic RNase products, but also was consistent with restriction enzyme data (Table III). Both unique and nonunique nucleotide assignments were uniformly consistent with β-globin mutation data (19), which restrict codon assignments (indicated by underlined nucleotides in Fig. 1). The chromatogram prepared from iodinated native α-thalassemic mRNA revealed five spots that are not prominent on cRNA maps. From the results of pancreatic RNase digestion and estimates of the overall base composition and chain length, two of the additional oligonucleotides contain sequences consistent with β-globin amino acids 3-5. This confirms the observation that RNA polymerase initiates transcription only infrequently among the first 75 nucleotides of the coding region of the mRNA. By

[Fig. 1: codon-by-codon nucleotide sequence of sickle-cell β-globin mRNA, positions 1-162.]

FIG. 1. The nucleotide sequence of sickle cell β-globin mRNA. Tentatively assigned sequences are enclosed in parentheses. Untranslated sequences are indicated by dots and termination codons are indicated by solid lines over the nucleotides. Y represents a pyrimidine and N an undetermined nucleotide. Those nucleotides that are confirmed by β-globin mutation data are underscored by bars (see text). Codon assignments that can be deduced from the amino acid sequence alone are not included.


determining the chain length of the three other additional oligonucleotides, it could be estimated that an untranslated region containing a minimum of 30 nucleotides separates the 5'-terminal cap¹ from the coding region of human β-globin mRNA.

The available codon assignments (Fig. 1) appear to indicate that the choice of the third-position nucleotide is nonrandom among degenerate codons. When there is a choice between only two pyrimidines in the third position of a neutral amino acid, uridine is most frequently chosen. Glutamic acid and valine most often contain guanosine in the third position. Thus the sickle-cell mutation may be accounted for by a single-base transversion in the second position only (A → U).

Salser et al. (20)² recently reported nine oligonucleotide fragments that code for unique sites of rabbit β-globin mRNA. When these regions are aligned with human β-globin mRNA, 109 nucleotides can be compared (Fig. 1 and ref. 20). Among the comparable sequences, there are 99 homologous nucleotides, and 11 transitions and transversions occur: two in the first position, one in the second, eight in the third position of nonhomologous codons; there are six silent mutations and three mutations that lead to a different amino acid.

In human β-globin mRNA the termination codon at position 147 is UAA; previously we showed the same termination triplet in human α-globin mRNA (10). The choice of UAA may be a general feature of many eukaryotic mRNAs, although more data are needed to establish this tentative conclusion. There is an out-of-phase termination codon at positions 157-158 (indicated by a bar over the nucleotides in Fig. 1). The region around the first UAA and the untranslated sequence between it and the second UAA is confirmed by the two abnormally long β-globin variants Tak and Cranston (9). This sequence is consistent with the predicted origin of the additional β-chain amino acids.
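The reading-frame argument for the Tak and Cranston variants can be illustrated by locating the first in-frame termination codon before and after a two-nucleotide insertion. The sketch below is hedged in two ways: the function name is hypothetical, and the positions that Fig. 1 and Table I give only as N or Y (within codons 149-152) are filled with arbitrary bases, so only the position of the terminator, not the identity of every added amino acid, should be read from it.

```python
STOPS = {"UAA", "UAG", "UGA"}

def first_stop_codon(rna, first_codon):
    """Return the number of the first in-frame stop codon, or None."""
    for i in range(0, len(rna) - 2, 3):
        if rna[i:i + 3] in STOPS:
            return first_codon + i // 3
    return None

# Codons 143-158 of the normal message.
# Assumption: N and Y positions in codons 149-152 are filled arbitrarily.
codons = ["CAC", "AAG", "UAU", "CAC", "UAA", "GCU", "CGC", "UUU",
          "CUU", "GCU", "GUC", "CAA", "UUU", "CUA", "UUA", "AAG"]
normal = "".join(codons)

# Cranston: the last two nucleotides (AG) of codon 144 are reduplicated.
cranston = normal[:6] + "AG" + normal[6:]
# Tak: the last two nucleotides (AC) of codon 146 are reduplicated.
tak = normal[:12] + "AC" + normal[12:]

for name, rna in (("normal", normal), ("Cranston", cranston), ("Tak", tak)):
    print(name, first_stop_codon(rna, 143))
```

In the normal message the first terminator is codon 147. After either insertion the UAA that straddles triplets 157-158 is read in phase, so translation continues through residue 157.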
In the case of Cranston, the variant can arise by reduplication of the last two nucleotides (AG) of the lysine codon at position 144; and in the case of Tak, by reduplication of the last two nucleotides (AC) of the histidine codon at position 146. In both instances insertion of the two extra nucleotide residues brings the reading frame of the untranslated sequence into the proper register to code for the additional amino acids, and the out-of-phase UAA is brought into phase to terminate the chains, as previously described (9). Digestion product T23 (Table I) was tentatively positioned at noncoding triplets 159-162. The assignment was based upon an overlapping

¹ Re “cap,” see articles by Furuichi et al., Rottman et al., Busch et al., and Moss et al. in this volume.
² See Salser et al. in this volume.


[Fig. 2: aligned cDNA sequences, written 5' to 3' from poly(dT), for rabbit β-globin, rabbit α-globin and human globin.]

FIG. 2. Comparison of untranslated sequences of rabbit (7) and human globin cDNA transcribed from the region adjacent to the 3'-end poly(A) segment of mRNA. Homologous sequences are enclosed by brackets. The rabbit sequences were determined by N. J. Proudfoot (personal communication).

sequence derived from sickle cell [³²P]cDNA after endonuclease-IV digestion followed by snake-venom diesterase digestion of the resulting fragments. T23 contains an additional in-phase termination triplet, UAA, at site 161. If there were a β-globin termination codon mutation at position 147, the additional amino acids would terminate at the UAA of position 161. This type of β-globin variant has not been reported; however, in the case of the α-globin variant Constant Spring, the 31 additional amino acids can be accounted for by a point mutation of the first position of the termination codon of α-mRNA (10).

Dhar et al. found that in SV40 transcripts there is the recurrent sequence AAUAAAG near the 3'-end of early mRNA (21). Proudfoot and Brownlee (7)³ later demonstrated the same heptanucleotide to be present toward the 3'-end of rabbit α- and β-globin and mouse immunoglobulin mRNAs. We have recently isolated a dodecanucleotide fragment from human globin cDNA with the sequence (5')-CTCAGACTTTAT-(3') that shows homology with rabbit α- and β-globin cDNA transcribed from the region adjacent to poly(A) of the mRNAs (Fig. 2). Thus the mRNA sequence AAUAAAG may represent a conserved untranslated signal with a unique biological function.
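The relation between the cDNA dodecanucleotide and the mRNA signal becomes visible once the cDNA is converted to the complementary mRNA strand. A minimal sketch, in which the function name `cdna_to_mrna` is hypothetical:

```python
# Watson-Crick partners, giving RNA letters for a DNA template.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}

def cdna_to_mrna(cdna):
    """Return the mRNA (5'->3') complementary to a cDNA strand (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(cdna))

mrna = cdna_to_mrna("CTCAGACTTTAT")
print(mrna)
print("AUAAAG" in mrna)
```

The complementary mRNA reads AUAAAGUCUGAG, so it begins with AUAAAG, the last six residues of the heptanucleotide AAUAAAG. The first A of the heptanucleotide would pair with the cDNA residue that follows the 3' end of the isolated fragment, which is therefore not covered by this fragment.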

V. Summary

Sequence analysis studies were carried out on human β-globin mRNA. Thalassemic, sickle-cell and Hb A mRNA served as substrates for the preparation of complementary DNA by RNA-dependent DNA polymerase. cDNA was transcribed by E. coli RNA polymerase, and the resulting cRNA was analyzed. Additional sequence information was obtained by analysis of iodinated native α-thalassemic mRNA and by digestion of cDNA with restriction endonucleases. The data derived allow assignment of over 350 nucleotides to the coding region and 43 nucleotides to the noncoding region adjacent to the termination codon. There are extensive regions of homology between the translated regions of rabbit and human β-mRNA; a short homologous region can be demonstrated between the untranslated 3'-terminal sequence of human mRNA and other animal cell mRNAs.

³ See Proudfoot et al. in this volume.

ACKNOWLEDGMENTS

We thank Drs. I. Verma, R. P. McCaffrey and D. Baltimore for providing the cDNA used in many of these experiments, Dr. W. Prensky for carrying out several iodination procedures and Dr. W. M. Fitch for providing globin mutation data. The helpful encouragement of Drs. D. G. Nathan and S. S. Kety is gratefully acknowledged. A. Manschreck, D. Paci, B. Parks and L. Prusoff provided excellent technical assistance. This work was supported by the following grants: National Institutes of Mental Health Grant MH 16674; the Ethel D. Dupont-Warren Award, the William F. Milton Fund Award, Harvard Medical School; and a grant from the Vance Fund, Massachusetts General Hospital to Charles A. Marotta. Michel Cohen-Solal is the recipient of a fellowship from I.N.S.E.R.M. (France) and Bernard G. Forget is the recipient of a Research Career Development Award AM-70234 of the U.S. Public Health Service; a portion of this work was supported by the following grants of the National Institutes of Health: CA-13472, AM-15929, AM-05581 and AM-15035. Sherman M. Weissman is the recipient of grants from the American Cancer Society and the National Cancer Institute.

REFERENCES

1. F. Labrie, Nature 221, 1217 (1969).
2. R. Williamson, M. Morrison, G. Lanyon, R. Eason and J. Paul, Biochem 10, 3014 (1971).
3. P. Gaskill and D. Kabat, PNAS 68, 72 (1971).
4. H. J. Gould and P. H. Hamlyn, FEBS Lett. 30, 301 (1973).
5. J. Gorski, M. H. Morrison, C. C. Merkel and J. B. Lingrel, JMB 86, 303 (1974).
6. J. N. Mansbridge, J. A. Crossley, W. G. Lanyon and R. Williamson, EJB 44, 261 (1974).
7.⁴ N. J. Proudfoot and G. G. Brownlee, Nature 252, 359 (1974).
8. R. Dhar, K. N. Subramanian, B. S. Zain, A. Levine, C. Patch and S. M. Weissman, “In Vitro Transcription and Translation of Viral Genomes,” Vol. 47, pp. 25-31. INSERM, Paris, 1975.
9. B. G. Forget, C. A. Marotta, S. M. Weissman and M. Cohen-Solal, PNAS 72, 3614 (1975).
10. C. A. Marotta, B. G. Forget, S. M. Weissman, I. M. Verma, R. P. McCaffrey and D. Baltimore, PNAS 71, 2300 (1974).
11. B. G. Forget, D. Baltimore, E. J. Benz, Jr., D. Housman, P. Lebowitz, C. A. Marotta, R. P. McCaffrey, A. Skoultchi, P. S. Swerdlow, I. M. Verma and S. M. Weissman, Ann. N.Y. Acad. Sci. 232, 70 (1974).
12. B. G. Forget, D. Housman, E. J. Benz, Jr. and R. P. McCaffrey, PNAS 72, 984 (1975).
13. B. G. Forget, C. A. Marotta, S. M. Weissman, I. M. Verma, R. P. McCaffrey and D. Baltimore, Ann. N.Y. Acad. Sci. 240, 290 (1974).
14. F. Galibert, J. Sedat and E. Ziff, JMB 52, 377 (1974).
15. C. A. Marotta, P. Lebowitz, R. Dhar, B. S. Zain and S. M. Weissman, in “Methods in Enzymology,” Vol. 29E, pp. 254-272. Academic Press, New York, 1974.
16. K. N. Subramanian, J. Pan, S. Zain and S. M. Weissman, NARes 1, 727 (1974).
17. D. Housman, B. G. Forget, A. Skoultchi and E. Benz, Jr., PNAS 70, 1809 (1973).
18. C. A. Marotta, B. G. Forget, S. M. Weissman and M. Cohen-Solal, in preparation.
19. M. O. Dayhoff, in “Atlas of Protein Sequence and Structure,” Natl. Biomed. Res. Found., Washington, D.C., 1974.
20. W. Salser, S. Bowen, D. Browne, F. El Adli, N. Federoff, K. Fry, H. Heindell, G. Paddock, R. Poon, B. Wallace and P. Whitcome, FP 35, 23 (1976).
21. R. Dhar, S. Zain, S. M. Weissman, J. Pan and K. Subramanian, PNAS 71, 371 (1974).
22. G. G. Brownlee and F. Sanger, EJB 31, 395 (1969).
23. T. Maniatis, A. Jeffrey and D. G. Kleid, PNAS 72, 1184 (1975).

⁴ See Proudfoot et al., this volume, pp. 127, 130.


Determination of Globin mRNA Sequences and Their Insertion into Bacterial Plasmids

WINSTON SALSER, JEFF BROWNE, PAT CLARKE, HOWARD HEINDELL, RUSSELL HIGUCHI, GARY PADDOCK, JOHN ROBERTS, GARY STUDNICKA AND PAUL ZAKAR
Department of Biology and Molecular Biology Institute, University of California, Los Angeles, California

I. Introduction

Since 1972, we have used the rabbit globin mRNAs as a model system to develop general methods for the sequence analysis of polyadenylylated mRNAs (1-3). As starting material, we have used cDNA made with reverse transcriptase. This cDNA has served as the template for synthesis of a variety of products used in various phases of this research. Figure 1 lists those of most interest here. The synthesis products include ³²P-labeled RNA for sequence analysis of fragments resulting from a cleavage at G residues, ³²P-labeled dC- or dT-substituted RNAs for specific cleavage at U or at C residues, and duplex gene copies for insertion into bacterial plasmids. It is through molecular cloning that we have succeeded in making our approach applicable to a wide variety of eukaryotic mRNAs.

Although more and more pure mRNAs are becoming available, when we started this work globin mRNA was one of the few mammalian mRNAs readily obtainable in the purity and quantity required for sequence studies. One of the advantages of the technique we developed for the insertion of eukaryotic mRNA copies into bacterial plasmids is that a rigorous purification is not necessary; the act of cloning such material definitively eliminates all contaminating eukaryotic nucleic acid sequences. This should be especially useful to the researcher who, for instance, has a mixture of 10 different mRNA species that are difficult to separate. In order to obtain the corresponding sequences in pure form, it suffices to make cDNA copies of the mixture of mRNAs, convert this mixture of cDNAs to duplex gene copies, insert the DNA into bacterial plasmids and introduce the mixture of plasmids thus obtained into bacteria. Since each bacterial transformant can receive only one of the eukaryotic gene inserts, analysis of a few dozen bacterial clones should suffice to provide one with a clone for each of the interesting sequences. With such clones, it is possible to produce milligram quantities of these sequences in duplex form, sufficient for the most rapid sequencing techniques.

FIG. 1. Globin cDNA, synthesized from rabbit globin mRNA as shown, was used as template for in vitro synthesis of the following products: (1) ³²P-labeled RNA for determination of sequences of G cleavage fragments (RNase T₁); (2) [³²P]dC-substituted RNA for determination of sequences of U cleavage fragments (RNase A); (3) [³²P]dT-substituted RNA for determination of sequences of C cleavage fragments (RNase A); (4) duplex DNA for insertion into bacterial plasmids.
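The estimate that a few dozen clones suffice for a mixture of 10 species can be examined with a simple sampling model. The sketch below is not part of the original argument: it assumes that every clone carries exactly one insert and that all species are equally likely to be cloned, which real mixtures of unequal abundance will not satisfy, and the function names are hypothetical.

```python
from math import comb

def prob_all_species(n_species, n_clones):
    """P(every species appears at least once among n_clones).

    Inclusion-exclusion over the species that could be missing,
    assuming equal abundance and independent clones.
    """
    return sum(
        (-1) ** k * comb(n_species, k) * (1 - k / n_species) ** n_clones
        for k in range(n_species + 1)
    )

def expected_clones(n_species):
    """Expected number of clones needed to see every species once."""
    return n_species * sum(1 / i for i in range(1, n_species + 1))

print(round(expected_clones(10), 1))
for n_clones in (24, 36, 48, 60):
    print(n_clones, round(prob_all_species(10, n_clones), 2))
```

Under these assumptions the expected number of clones needed to recover all 10 species is about 29, which is in line with "a few dozen"; species of low abundance would require proportionally more clones.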

II. Sequence Analysis by in Vitro Transcription from cDNA Templates

A. Agreement of Nucleotide and Amino-Acid Sequences Confirms Fidelity

The first problem we faced was that of the fidelity of in vitro synthesis, especially that carried out by reverse transcriptase. As a preliminary check of this fidelity, we chose to examine the agreement between known amino acid sequences of the globin proteins and our nucleotide sequence data for the fragments resulting from a cleavage at G residues. The sequences resulting from this cleavage are listed in Table I (4), and it has been possible to assign 15 of these fragments to locations within the structural genes on the basis of their unique fits with the amino-acid sequence as shown in Fig. 2. Five other assignments are shown in Fig. 6.

We have also found sequences of more than 70 nucleotides that do not fit the structural genes and have assigned them to the untranslated regions of the mRNAs. These we must examine especially carefully for mistakes. The fact that such fragments do not fit within the structural genes could indicate either that they are from the untranslated regions or that they represent mistakes made in transcribing the translated regions. If they represented mistakes, we expected that some of them should have significant “almost-fits” at their places of origin in the structural genes. We have found no significant resemblances of this sort to suggest a lack of fidelity in the system. More recently Proudfoot (23, and this symposium) has confirmed, by entirely independent means, all but three of our sequences and their assignment to the untranslated region. Our other three assignments to the untranslated region are also consistent with his data, since there is a substantial portion of this region for which he lacks sequence information.

In Section III we indicate some other approaches using our ability to clone globin cDNA inserted into bacterial plasmids; these techniques have enabled us to check the strand assignments of three-fourths of our fragments. So far, every one of the assignments to the alpha or beta strand thus checked has proved to be correct. Taken together, these observations suggest that the fidelity of the sequencing techniques used is quite high.

[Fig. 2 map. ALPHA chain and BETA chain, each drawn 5' (N terminus) to 3' (C terminus, untranslated region, poly(A)), with a scale in amino-acid residues (0 to 140) and bars labeled with the spot numbers of the assigned fragments.]

FIG. 2. Rabbit hemoglobin mRNA. Map showing the unique fits that can be deduced by comparing the G cleavage data in Table I with the globin amino-acid sequences. More complete sequence assignments that rely on other data as well are given in Figs. 6 and 8. The RNA studied was obtained by transcribing the cDNA in vitro with Escherichia coli RNA polymerase. The lack of unique fits prior to beta-chain amino acid 40 or alpha-chain amino acid 80 suggests that RNA polymerase preferentially initiates at specific points on the single-strand cDNA template.
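The comparison behind such unique fits can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the function names are hypothetical, the short peptide VPSEFTPA is pieced together from the alpha-chain entries for residues 113 and 116 in Table I rather than taken from a full globin sequence, and partial codons at the ends of a fragment are simply ignored.

```python
# Standard genetic code, built from the usual UCAG ordering of codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(rna, frame):
    """Translate the complete codons of `rna`, starting at offset `frame`."""
    codons = [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]
    return "".join(CODON[c] for c in codons)


def find_fits(fragment, protein):
    """Return (frame, protein_index, peptide) for every place where the
    fragment, read in one of its three frames, matches the protein."""
    hits = []
    for frame in range(3):
        peptide = translate(fragment, frame)
        start = protein.find(peptide) if peptide else -1
        while start != -1:
            hits.append((frame, start, peptide))
            start = protein.find(peptide, start + 1)
    return hits


# Spot 47 of Table I against a peptide assembled from the Table I assignments.
fits = find_fits("GAAUUCACCCCUG", "VPSEFTPA")
print(fits)  # [(0, 3, 'EFTP')]
```

A fragment counts as a unique fit in this sketch when exactly one hit is returned over the whole protein; a fuller check would also test the partial codons at both ends of the fragment against the neighboring residues.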

TABLE I

Spot no.   Sequence(s)
1          GUG
2          GCUG
3          GAUG
4          GCCUG(C)
5a         GACUG
5b         GCAUG
5c         GAUCG
6          GAAUG(C)
7          GCCCUG(C)
8          GCACUG
9          GACCUG
10         GAACUG
11         GCCAAUG
12         GACCCUG
13         GCCCUCCG (α113 VPSE)b
14         GCCACCUG
15         GCAAAUG
16         GUUG
17         GUCCG(C)
19         GCUUG
20         GUAUG
21a        GCUCUG
21b        GUCCUG(C)
22         GCUCCUG
23         GAUUG
24a        GACUUG
24b        GAUCUG
25         GUUAAG
26         GCCUCCCUG (α123 ASL)b
27a        GUCACUG
27b        GCUCACG
27c        GCAUCUG
28         GAUCCUG
29         GCCCCUUG (untranslated)
30a        GUCACCUG
30b        GCCUUCAG (β70 AFS)b
31         GUCCCACUG (α101 LSHC)b
33         GCCUAUCAG (β129 AYQ)b
34a        GACACCAUG (untranslated)
34b        GCAAUCAUG (untranslated)
34c        GAACUUCAG (β101 ENFR)b
34d        GAAUACCUG (untranslated)
35         GCAAUCCUAAG (β56 SNPK)b
36         GCUAAUAAAG (untranslated)
37         GUUUG
38         GUUAUG
39         CUCUUG
40         GUUCCUG
42         GUCCUCUG
44         GACUUCUG
45         GCACCUUUG (β83 GTFA)b
47         GAAUUCACCCCUG (α116 EFTPA)b
48         GAAUUUCAAG (α96 VNFK)b
49         GUUUUG
50         GUUAUUG
51         GUCUUUG
52         GUUCUUCG (β40 RFFE)b
53         GUCCUUUG (β43 ESFG)b
54         GUCUACUCUCAG (α80 LSTLS)b
55         GCAAAAAUUAUG (untranslated)
57         GAAUUCACUCCUCAG(G) (β121 EFTPQV)b
58a        GAAACCUAUUUUCAUUG(C) (untranslated)
58b        AUC(Yn)CUCUG(C) (t) (untranslated)
58c        GUCUCAUCAUUUUG(G) (β114 LSHHFG)b
61         GCG
62         GAG
63         GCCG
64         GCAG
65         GACG
66         GAAG
67         GCCCG
68a        GCACG
68b        GACCG
68c        GCCAG(C)
69         GCAAG(C)
70         GAAAG(C)
71         GCAACG

a Catalog of sequences resulting from cleavage at G residues of RNA synthesized in vitro from a rabbit globin cDNA template. Missing numbers in the series are in all cases oligonucleotides that occur in low yields or do not appear consistently (e.g., number 18). Number 56 is actually three oligonucleotides, which separate on DEAE-cellulose homochromatography; they occur in low yield and little sequence information has been derived. Of the other sequences only 58b remains tentative and may in fact be a mixture of two sequence isomers. We also have some data for isomers of C3AG, A2CG, C2A2G and CA3G, which are not listed in the table. The techniques used to determine these sequences will be described in full elsewhere (4) but include the secondary digestion techniques listed below, followed by appropriate tertiary analyses for full sequence and nearest-neighbor labeling information: (a) RNase U2 was used to cleave the fragments at A residues; (b) pancreatic RNase A was used to cleave at pyrimidines; (c) in some experiments, the original cleavage at G residues was carried out on dC-substituted RNA so that secondary digestion with RNase A would cleave only at U residues; (d) similarly, use of RNase A to carry out secondary digestions on dT-substituted material resulted in cleavage only at C residues; (e) partial spleen diesterase treatment was used as a secondary digestion procedure in cases where label was introduced as [α-32P]GTP. This ensures that only the 3' ends of the fragments are labeled so that endonucleolytic contamination of the exonuclease preparation cannot introduce artifacts.
b Amino acids are indicated in the one-letter code [see J. Biol. Chem. 243, 3557 (1968)]; (t) indicates "tentative sequence."


WINSTON SALSER ET AL.

B. Start and Stop Signals for E. coli RNA Polymerase

It must be kept in mind that all the data we have discussed thus far have been obtained by analysis of RNA transcribed in vitro by E. coli RNA polymerase from a cDNA template. The clustering of fragment assignments shown in Fig. 2 strongly suggests that E. coli RNA polymerase initiates RNA synthesis on the cDNAs near the sequences corresponding to amino-acid residue 40 on the beta chain and to amino-acid residue 80 on the alpha chain. Moreover, it is clear from comparison with the results reported by Proudfoot et al. (this volume) that transcription on the beta chain continues to the end of the sequence, but that the alpha chain cDNA sequence contains some feature that effectively terminates transcription near the end of the structural gene. We have considered the possibility that the regions for which no fragment assignments have been made might be transcribed in our in vitro synthesis, but that cleavage at G residues might yield no large fragments that could be recognized by unique fits in these "silent" regions. This alternative is ruled out by detailed examination of the possible nucleotide sequences consistent with the amino-acid sequences in these regions. In each case, it can be shown that transcription should have yielded easily recognizable products that are absent from the list shown in Table I. The available physical data characterizing the globin cDNAs suggest that they contain at least a substantial fraction of full-size copies of the mRNA sequences. If so, our data demonstrate that E. coli RNA polymerase has a strong preference for specific entry points in the interior of the cDNA template. However, we feel that the available data on the completion of the cDNAs are not compelling. Detailed analysis of cDNA molecules propagated on bacterial plasmids should soon resolve the uncertainty.

C. Deoxysubstitution to Permit Cleavage Specifically at U or at C Residues

Any large-scale sequencing project of this sort is limited by the ability to obtain overlaps that permit the entire sequence to be fitted together. The remainder of this paper is primarily concerned with the ways in which we hope to accomplish this for the globin mRNAs. With smaller sequencing projects, it has been traditional to use pancreatic RNase A digestions to provide overlaps. Since the average size of pancreatic RNase A fragments is so small (2 nucleotides), such data are of limited usefulness in a sequencing project of this magnitude. Therefore, we have developed deoxysubstitution techniques to cleave specifically at U residues or at C residues (5).
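The effect of restricting the cleavage specificity can be illustrated with a small Python sketch. It is an idealization under stated assumptions: digestion is taken to be complete, the helper name cleave_after is hypothetical, deoxysubstitution is modeled simply by removing a base from the set of cleavable residues, and the test sequence is spot 57 of Table I with its 3' neighbor.

```python
def cleave_after(seq, bases):
    """Split `seq` on the 3' side of every residue found in `bases`."""
    fragments, current = [], ""
    for base in seq:
        current += base
        if base in bases:
            fragments.append(current)
            current = ""
    if current:  # trailing piece that does not end in a cleavable base
        fragments.append(current)
    return fragments


seq = "GAAUUCACUCCUCAGG"  # spot 57 of Table I, including its 3' neighbor

g_cleavage = cleave_after(seq, "G")        # cleavage at G residues
pyrimidines = cleave_after(seq, "CU")      # RNase A on ordinary RNA
u_only = cleave_after(seq, "U")            # RNase A on dC-substituted RNA
c_only = cleave_after(seq, "C")            # RNase A on dT-substituted RNA

print(pyrimidines)  # ['GAAU', 'U', 'C', 'AC', 'U', 'C', 'C', 'U', 'C', 'AGG']
print(u_only)       # ['GAAU', 'U', 'CACU', 'CCU', 'CAGG']
print(c_only)       # ['GAAUUC', 'AC', 'UC', 'C', 'UC', 'AGG']
```

On this one stretch, cleavage at all pyrimidines gives ten very short pieces, while cleavage at U alone or at C alone leaves longer fragments that are more useful as overlaps.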


FIG. 3. Separation of rabbit hemoglobin dC-substituted cRNA, cleaved at U residues by pancreatic RNase A. The digest was electrophoresed on cellulose acetate strips at pH 3.5 and then transferred to a polyethyleneimine thin-layer plate for homochromatography. Individual spots were eluted and subjected to secondary analysis, usually by digestion with RNase T1 or alkaline hydrolysis, and finally tertiary digestions. In each experiment label was introduced on only one of the four nucleotide precursors so that nearest-neighbor sequence information could be obtained. For a more complete analysis of the results, see Table II.

In RNA in which every C is replaced by dC, RNase A will cleave specifically at U residues. The U cleavage pattern is shown in Fig. 3. Since U residues and G residues have similar effects on the mobility of oligonucleotide fragments in such a system, the pattern is reminiscent of that obtained when an RNase T1 digest is separated, except that U is substituted for G. Instead of separate graticules for fragments with 0, 1, 2 or 3 G residues, one finds separate graticules for fragments with 0, 1, 2 or 3 U residues, arranged in the same general pattern. Table II lists our preliminary catalog of fragment sequences resulting from the U cleavage.

TABLE II

Spot no.   Sequence(s)
1          UC*
2, 3       UU(C); UU(G); too light
4, 5       UAC(G); UCC?
12, 13     UCCCU(F); UCACU(C); UCACU(G); UCCAU(C); UACCU(F)
14         UAACU(C); UAACU(G)
15         UACCCU; UCACCU
16         UACCAU(C); UAACCU
17         UCACCCU and other isomers
18         UAACCCU and other isomers
19         UGU(C); UGU(G)
20         UCGU(C); UCGU(G); UGCU(C); UGCU(G)

F and Z, respectively, denote that there are no G or C nearest neighbors. Nn denotes sequences of C, G, and A residues of unspecified order and length. We have not attempted to represent our partial sequence data in these cases. Spots 1, 4, 5, and 6 are "illegal" cleavage products produced in small amount as a result of incomplete deoxysubstitution.

Similarly, by synthesizing RNA in which dT replaces U, we can use RNase A to cleave specifically at C residues to produce the chromatographic pattern shown in Fig. 4. Our preliminary catalog of the fragment sequences from this C cleavage pattern is in Table III. We emphasize that this is only a progress report on the C and U cleavage catalogs. In fact, we have just recently assembled the data shown in Tables II and III, and there has not been time to consider carefully how much more of the total sequence the data already collected will give us. Obviously, too, the data are not nearly so complete as with the G cleavage catalog. While more data on the U and C cleavages must be gathered, we do not intend to bring those catalogs to the state of completion attained for the G cleavage catalog. It has been our experience


FIG.4. Separation of rabbit hemoglobin dT-substituted cRNA, cleaved at C residues by digestion with pancreatic RNase A. Electrophoresis, chromatography and subsequent analysis were carried out as described for Fig. 3.

that much more than half the work in establishing such catalogs is in working out the sequences of the last few difficult fragments or mixtures of fragments. With conventional sequencing approaches, relying mainly on RNase T1 and pancreatic RNase A, this was frequently essential: the techniques available were barely adequate to give enough overlaps to solve a sequence, and it was essential to have every bit of data the techniques could provide. With the variety of new techniques available, the situation has changed, and it may be possible to obtain compelling evidence for a particular sequence more rapidly by using several different approaches rather than by taking the special trouble required to work out the most troublesome spots from any particular cleavage. We describe below an example in which our ability to clone globin cDNA sequences on bacterial plasmids has allowed us to apply new "ladder" sequencing techniques and to sequence very rapidly an interesting region bounded by Eco RI and Hae III restriction sites.


[Table III. Catalog of fragment sequences from the C cleavage (columns: Spot no., Sequence(s)); only the legend follows.]

V denotes that nearest neighbors of C, G and U have been demonstrated, and A may be present. Nn denotes a sequence of U, G and A residues of unspecified order and length. We have not attempted to represent our partial sequence data in these cases.

III. Cloning cDNA Sequences on Bacterial Plasmids

A. cDNA Synthesized by AMV Reverse Transcriptase Contains a Fold-back Region that Primes the Synthesis of DNA by DNA Polymerase I

In the course of some earlier attempts to combine the power of the sequence-specific T4 endonuclease IV digestion with ribosubstitution sequencing, we became acutely aware that cDNAs made with reverse transcriptase have a short double-stranded "hook" at the 3' end (6, 7) (see Fig. 1). This "hook" serves as a remarkably effective primer for E. coli polymerase I, so that all duplex DNA synthesized is in the form of rapidly renaturing "hairpins," which occur even if large amounts of exogenous primer are added in an attempt to synthesize DNA not covalently linked to the cDNA template (7). Surprisingly, although actinomycin D blocks extensive synthesis of duplex DNA by reverse transcriptase, it does not appear to block synthesis of a "hook" on the ends of a large fraction of the molecules. This is shown by the observation that


cDNAs made with or without the drug behave similarly in priming the synthesis of a covalently linked second strand by DNA polymerase I (7). Such behavior was a nuisance when we desired to use DNA polymerase I to synthesize ribosubstituted DNA suitable for digestion by the single-strand-specific T4 endonuclease IV. For this purpose, we developed conditions in which the single-strand-specific S1 nuclease efficiently opened the hairpin structures (Nina Fedoroff, unpublished results). But this same "hook" also offers an impressive advantage in permitting us to make a complete duplex copy of the cDNA sequence. After suitable treatment with S1 nuclease to open the "hairpin," such a gene copy could be inserted into bacterial plasmids.

B. Experimental Approach for the Insertion of cDNA Copies of Polyadenylylated mRNAs

A general method for cloning the cDNA sequences corresponding to any polyadenylylated mRNAs should provide important advantages for several areas of research. As mentioned above, it should permit the rapid sequencing of any polyadenylylated mRNA and provide pure probes in large quantity for the quantitative analysis or purification of such mRNAs. Perhaps more important for our future understanding of eukaryotic gene function, the cloning of cDNA sequences will provide pure probes to facilitate the isolation and cloning of the larger DNA sequences surrounding the corresponding structural genes.

The approach that we have followed is outlined in Fig. 5. Globin cDNA is prepared from globin mRNA using avian myeloblastosis virus (AMV) reverse transcriptase. After purification, this cDNA is used as the template for second-strand synthesis by E. coli DNA polymerase I. As noted above, AMV reverse transcriptase leaves a "hook" on the 3'-OH end of its product, so that this template is self-priming. A potentially full-sized duplex gene copy results, but this duplex is a "hairpin" structure with only one "open" end. Treatment with S1 nuclease is used to open the hairpin loop so that the gene copy can be inserted into bacterial plasmids. This insertion is usually accomplished by the methods of Lobban and Kaiser (8), in which phage lambda exonuclease is used to expose 3'-OH termini for homopolymer addition by polynucleotide terminal transferase. Terminal transferase is used to add poly(dA) tails to the S1-treated globin gene copies and to add poly(dT) tails to plasmid DNA (prepared by cleavage with Eco RI restriction endonuclease and treatment with lambda exonuclease). The poly(dA)-tailed globin gene copies are then mixed with the poly(dT)-tailed plasmids to give circular complexes that can be "repaired" and replicated in E. coli. Transformation is carried out by a modification of the method of Mandel and Higa (9).

[Fig. 5 flow sheet. Hb mRNA with poly(A); AMV reverse transcriptase + dNTPs; Hb cDNA; DNA polymerase I + dNTPs; S1 nuclease; λ exonuclease; terminal transferase. In parallel: pSC101 DNA; Eco RI; λ exonuclease; terminal transferase. Then: anneal; heterogeneous population of hybrid DNAs; transformation of E. coli to tetracycline resistance; selection of clones by hybridization with radioactive Hb cRNA; isolation of plasmid DNA from a transformed clone; homogeneous population of hybrid DNAs.]

FIG. 5. Flow sheet of steps in the synthesis of artificial globin gene copies and their insertion into bacterial plasmids.

Both plasmids used for this work (pSC101 and pMB9) carry a tetracycline resistance marker, so that transformants can be isolated by plating on tetracycline plates. We find that most of the tetracycline-resistant transformants carry globin-gene sequences. It is convenient to confirm this by the colony hybridization assay of Grunstein and Hogness (10), in which a radioactive RNA probe is hybridized to DNA immobilized by alkali treatment of colonies grown on nitrocellulose filters. Autoradiography of the hybridized probe provides proof of globin gene insertion and gives a rough estimate of the size of the insert and/or the frequency of the complementary sequences in the RNA probe preparation.


The description above outlines the essential features of our approach.¹ In the following paragraphs we give a summary of some of the experimental details, emphasizing our experience with variations of the basic approach outlined above.

1. PURIFICATION OF GLOBIN mRNA

We are using a new large-scale purification technique for which we are indebted to Dr. Randolph Wall (11). In this procedure the cells (washed reticulocytes from phenylhydrazine-treated rabbits) are broken by discharge from a Parr nitrogen disruption bomb. Globin mRNA is purified by repeated phenol/CHCl3 extractions of magnesium-precipitated polysomes followed by poly(U)-Sepharose selection of poly(A)-containing mRNA and two gradient velocity sedimentations in dodecyl sulfate/sucrose. Up to 1 liter of cell suspension, approximately 200 g of cells, can be broken in a single operation by this method.

2. SYNTHESIS OF cDNA

To synthesize cDNA of maximal length, the reaction was carried out with millimolar concentrations of the four deoxynucleoside triphosphates. We find that these reaction conditions produce a significant increase (about 15%) over the level of synthesis observed at the half-millimolar concentrations used by Efstratiadis et al. (12). Reactions were carried out either with or without actinomycin D. The cDNAs so synthesized have been compared by chromatographic separation of nuclease digests of RNA synthesized in vitro from each template. The patterns obtained were identical, but the yields of cDNA obtained were substantially higher in the absence of actinomycin D. As we have reported earlier, actinomycin does not interfere with synthesis of a double-stranded "hook" of size sufficient to make this template self-priming in subsequent reactions with DNA polymerase I (7). In the absence of actinomycin, we obtain about 0.285 µg of cDNA for every microgram of poly(U)-Sepharose-selected globin mRNA.

3. SYNTHESIS OF THE SECOND DNA STRAND

Usually the cDNA is treated with 0.3 M NaOH at 100°C for 10 minutes to degrade the RNA template and is passed over a Sephadex G-100 column to remove salt. However, in one experiment with cDNA transcribed from immunoglobulin mRNAs, it was observed that synthesis of the second DNA strand by DNA polymerase I can proceed normally even if the mRNA sequences have not been removed by alkali treatment (13). This observation is consistent with the known ability of the 5' to 3' exonuclease activity of this enzyme to degrade RNA "primers" base-paired with the DNA template. Our yields of duplex DNA have ranged from 0.4 to 0.8 µg of duplex DNA per microgram of input cDNA.
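Combining the yields quoted for the two synthetic steps gives a rough overall figure. The short Python calculation below is only a back-of-the-envelope sketch: it assumes that the two yields simply multiply and that nothing is lost between steps (for example on the Sephadex column), and the names used are illustrative.

```python
# Yields quoted in the text (micrograms of product per microgram of input).
CDNA_PER_UG_MRNA = 0.285          # cDNA per ug of selected globin mRNA
DUPLEX_PER_UG_CDNA = (0.4, 0.8)   # range of duplex DNA per ug of input cDNA


def duplex_yield_per_ug_mrna(cdna_yield=CDNA_PER_UG_MRNA,
                             duplex_range=DUPLEX_PER_UG_CDNA):
    """Overall duplex DNA yield per ug of mRNA, assuming no other losses."""
    low, high = duplex_range
    return cdna_yield * low, cdna_yield * high


low, high = duplex_yield_per_ug_mrna()
print(f"{low:.3f} to {high:.3f} ug duplex DNA per ug mRNA")
# prints: 0.114 to 0.228 ug duplex DNA per ug mRNA
```

On these assumptions, one microgram of selected mRNA would give roughly 0.1 to 0.2 µg of hairpin duplex DNA before the S1 nuclease step.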

¹ The experimental details have already been circulated informally through the mechanism of the Nucleic Acid Recombinant Scientific Memoranda as part of the report on the December 1975 La Jolla meetings on biocontainment techniques and will be published in detail elsewhere. Experimenters working in this area of research can subscribe to the Nucleic Acid Recombinant Scientific Memoranda (NARSM) by writing to Dr. E. C. Chamberlyne, Project Officer, NIAID, Bldg. 31, room 7A50, National Institutes of Health, Bethesda, Maryland 20014.

4. CLEAVAGE OF HAIRPIN STRUCTURES

We have used the conditions of Shenk et al. (14) with 135 units of S1 nuclease per microgram of DNA for 1 hour at room temperature (in 0.28 M NaCl, 0.0045 M MgSO4, 0.03 M NaOAc (pH 4.6), 1 µg of duplex DNA in 200 µl of reaction mixture). These conditions are designed to minimize cleavages of duplex DNA. The progress of the S1 reaction can be followed by heating an aliquot, cooling it, and again digesting it with S1 to measure the fraction no longer rapidly renaturing. Apparently the hairpin loop made by AMV reverse transcriptase contains only a few unpaired bases, because we find that this relatively mild S1 treatment usually cleaves less than half of the hairpin structures. We prefer not to use more extensive digestions, since it is known that S1 treatment can progressively attack the ends of duplex molecules even under conditions where internal cleavage is suppressed, and because we obtain good yields of globin gene-carrying transformants using these conditions (see Section III, B, 6).

5. ADDITION OF HOMOPOLYMER TAILS TO THE DUPLEX GLOBIN GENE COPY AND PLASMID VECTOR DNAs

In theory we could add either poly(dA) or poly(dT) tails to the globin gene copy, as long as the complementary sequence is added to create cohesive ends on the recipient plasmid. In fact, however, the duplex gene copies synthesized as described above should already contain a short poly(dA) tract corresponding to the poly(A) of the mRNA sequence, at the 3' end of one of the duplex strands. We have reasoned that addition of poly(dT) tails to the 3' ends of the globin gene copy DNA might not provide proper cohesive ends, since this poly(dT) could conceivably fold back to pair with the adjacent poly(dA). Such interactions would be especially likely if this adjacent poly(dA) had been rendered single-stranded by treatment with lambda exonuclease. Following this line of reasoning, we have always attached the poly(dA) tails to the gene insert and the poly(dT) tails to the plasmid vector.

Polynucleotide terminal transferase, from calf thymus, is used to add homopolymer tails to the 3'-OH termini to prepare DNA molecules for joining. This enzyme prefers a substrate with protruding 3' ends, and the conventional approach to achieve this addition involves treatment of DNA fragments with lambda exonuclease to remove approximately 50 nucleotides from the 5'-OH ends of the double-stranded globin and plasmid molecules (8). It appears that this exonuclease treatment may be unnecessary under the conditions we use.

Addition of poly(dA) to lambda-exonuclease-treated globin DNA was carried out in 150 µl of 50 mM KH2PO4 (pH 6.8), 8 mM MgCl2, 1 mM 2-mercaptoethanol, with globin DNA at 3 µg/ml and a total of 8 nmol [α-32P]dATP. Terminal transferase (150 units) was added to initiate the reaction at 37°C. At 5-minute intervals, the reaction was halted by placing the mixture on ice, and acid-precipitable counts were measured. If the desired level of incorporation (about 2.5-5%, corresponding to tails 50-100 nucleotides in length) had not been reached, the reaction could be reinitiated by warming to 37°C and adding an additional 150 units of terminal transferase. As mentioned, pretreatment with lambda exonuclease appears to be superfluous, owing to destruction of any protruding 5' ends by the nuclease S1 treatments.

The addition of poly(dT) to lambda-exonuclease-treated plasmid DNA was carried out in a similar fashion, except in 0.2 M cacodylic acid (pH 7.2), 1 mM CoCl2, 2.5 mM 2-mercaptoethanol made fresh immediately prior to each use. Here the substrate is derived by Eco RI cleavage of the plasmid DNA and contains 5'


termini that protrude by four nucleotides. We assume that the ability to carry out this reaction successfully without exonuclease pretreatment may depend upon the presence of cobalt ions.

It is important to use terminal transferase preparations free of nucleolytic activity (this is even more important when one desires to clone very large molecules), since homopolymer tails added at breaks can cause a variety of complications. (In this regard we were fortunate to enjoy the hospitality of Dr. Ratliff's laboratory and his expert advice and assistance in making a large preparation of this enzyme.) The enzyme we have used has been tested for nucleolytic activity on twisted circles about 14,000 base-pairs in size; under the conditions of Lobban and Kaiser (8) we observed about one single-strand break per molecule during 2 hours of incubation. Some commercial preparations of this enzyme have much higher levels of endonuclease activity (15) and are unsuitable for cloning experiments.

6. TRANSFORMATION

The bacterial hosts used for this work were E. coli C600 and HB101. With strain C600 we obtain roughly 1600 transformants from 0.1 µg of duplex globin DNA; with strain HB101 we obtain levels of transformation about a twentieth of this. The lower efficiency with HB101 may be due to the fact that it carries the rec A mutation. This character is deemed useful in providing some protection against rearrangements of the inserted sequence while it is carried in the bacterial host, but we are unaware of evidence for such rearrangements having occurred except in the case of repeated sequences, such as satellite DNAs, and in some such cases the observed rearrangements appear to occur in both rec A+ and rec A- cells.

Both of the above-mentioned strains are defined as EK-1 according to the pending (unofficial) guidelines of the N.I.H. recombinant DNA committee. We carried out the experiments described in a P-3 biocontainment facility with the approval of biocontainment committees at our institution and the relevant grant administrator at the National Institutes of Health. Recently we received two new bacterial strains from Dr. Roy Curtiss III (16). These strains, designated χ1849 and χ1776, are presumptive EK-2 hosts as defined by the unofficial guidelines and are now pending approval. In preliminary experiments, χ1849 gives levels of transformation approximately equal to C600, while χ1776 gives levels of transformation roughly a fourth of this. It should be emphasized that neither of these strains has yet received any official approval as an EK-2 strain. Nevertheless, they do appear to afford a great additional margin of safety, and we intend to utilize them in our future experiments, consistent with the Asilomar guidelines (17), which recommend that bacterial strains with self-destruct features be employed wherever possible even if not explicitly required.

C. Characterization of the Plasmids Carrying Globin Gene Sequences

1. COLONY HYBRIDIZATION ASSAY

We have used the colony hybridization assay (10) as a means of carrying out a rapid preliminary characterization of the plasmid inserts obtained. In this procedure the transformant colonies are transferred by toothpick onto nitrocellulose filters placed over nutrient agar for growth. Cells from the resulting colonies are lysed, and their DNA is denatured


and fixed in situ by treatment with alkali. They are then hybridized with a 32P-labeled globin-RNA probe obtained by in vitro transcription of globin cDNA. Colonies containing sequences complementary to the globin mRNAs are then identified by autoradiography. Low-intensity images indicate either short inserts or inserts of DNA from minor RNA species.

2. ANALYSIS OF PLASMID INSERTS BY CHROMATOGRAPHY OF cRNA DIGESTS

Because the sequence analysis of the fragments resulting from cleavage at G residues of RNA synthesized in vitro from globin cDNA has now been completed (Table I), electropherograms of the RNA sequences that hybridize to the pHb plasmids provide a powerful way to obtain a detailed analysis of the globin sequences inserted. The results of such experiments are shown in Fig. 6: plasmid pHb13 contains most or all of the regions of the globin beta chain mRNA for which we have sequence information, while plasmid pHb14 contains sequences from the latter two-thirds of the structural gene plus the untranslated sequences at the 3' end of the mRNA. (Note that pHb denotes plasmids with globin inserts.) Such experiments can be carried out in a variety of ways in order to answer different questions. We chose to synthesize the 32P-labeled RNA probe by in vitro transcription of globin cDNA using E. coli RNA polymerase, so that it would yield only those G cleavage products we have

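[Fig. 6 map. Alpha and beta globin mRNA chains drawn 5' (N terminus) to 3', with bars for the assigned G cleavage fragments; the symbols +, − and ? beneath the bars score each fragment in plasmid pHb13, and a bracket marks the region carried by plasmid pHb14.]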

FIG. 6. Map of G cleavage fragment assignments in the rabbit alpha and beta globin mRNA chains. The presence or the absence of individual sequences in plasmid pHb13 is indicated by the symbols + or − below the bar showing the position of each fragment. The symbol ? indicates those fragments that could not be scored unambiguously, either because they are not labeled with the radioactive precursor used in scoring the plasmids, [α-32P]CTP, or because they occur as part of a complex mixture of fragments. See text for details of the technique. The fragment assignments shown here do not include those obtained from "ladder" sequencing techniques (see Fig. 8) or with the aid of the U cleavage or C cleavage data shown in Tables II and III.


already fully characterized (listed in Table I). This RNA probe was then hybridized to DNA from globin plasmid chimeras that had been denatured and immobilized on nitrocellulose filters. After treatment with RNase, those RNA sequences complementary to the plasmid DNA were recovered, cleaved at G residues and chromatographed. Each of the radioactive spots was then eluted and subjected to secondary digestions with RNase U2 and/or pancreatic RNase A to obtain unequivocal identification. The radioactivity was introduced as [α-32P]CTP in these experiments, but there are some fragments that cannot be scored by this technique, either because they are not labeled when 32P is introduced on C, or because the fragment is not well separated from others that give similar secondary digestion products. Such uncertainties are indicated by the question marks in Fig. 6. For plasmid pHb13, fragments that are unmistakably present are shown by + and those clearly absent are shown by −. Similar results were obtained for plasmid pHb23, while the results for plasmid pHb14 are shown by the bracket in Fig. 6.

A number of other plasmids have been similarly analyzed with a variety of results. Some give simple chromatograms indicating incorporation of a shorter gene fragment, while others suggest incorporation of alpha-chain sequences. Since we have not yet performed the necessary secondary digestions to characterize these chromatograms unambiguously, it is not possible to draw firm conclusions at this time. Obviously, characterization of such "partial" chromatograms could provide a powerful tool for making new sequence assignments. Even faster progress may be possible by using the new sequencing techniques discussed below.


3. RESTRICTION ENZYME CLEAVAGES

Both of the plasmid vectors used in this work (pSC101 and pMB9) have a single Eco RI cleavage point, which we have used as the site for inserting the globin gene sequences. Such insertions using terminal transferase destroy the RI site. Our previous sequence studies (2, 3) had revealed that both the alpha- and the beta-globin gene sequences contain Eco RI restriction sites (corresponding to amino-acid residues 121-122 in the beta chain and 116-117 in the alpha chain). Consistent with this, we find that the majority of the pHb plasmids that we have tested are cleaved by Eco RI. The globin genes also contain Hae III restriction sites that we have used to determine a 79-nucleotide region of the beta mRNA sequence (see below). Figure 7 shows a comparison of the Hae III digestion patterns of pSC101 and pHb23. The fragments have been separated by electrophoresis on an 8% acrylamide gel. Note that the pattern obtained with pHb23 is missing one small fragment present in the parent plasmid

194

WINSTON SALSER ET AL,

FIG. 7 . DNAs from plasmid pIIb23 (left track) and the parent plasmid pSClOl ( right track) were digested with Hue111 restriction endoxiuclease and electrophoresed on 8%acrylamide gels. The gels were then stained with ethidium bromide and the banding pattern of UV-induced fluorescence was photographed.

INSERTION OF GLOBIN

mRNA

SEQUENCES

195

and has at least two new larger fragmcnts. The small llae I11 fragment present in the pSCl0l but not the pHb23 pattern must contain the Eco RI insertion point. The two new bands appearing in the pHb23 pattern contain at least pnrt of thc inserted globin sequences. Siniultaneous digestion with both llae I11 and Eco RI (not shown) reveals that the slower migrating of the two “globin inscrt bands” seen in Fig. 7 contains the Eco RI site present in the beta mRNA sequence. Eco RI digestion of this band yields two fragments, the smaller of which is only about 50 basc-pairs. Inspection of the amino-acid sequence for potential restriction sitcs enabled us to establish the orientation of thc Hue 111 fragment within the beta globin g e m a s follows: measuring from the Eco HI site, there are potential Hae I11 cleavage sites 51 and 63 nucleotidcs toward the 3’ end of the mRNA sequence, but in the other direction the nearest amino-acid sequencc consistent with a Hue I11 site is 141 nucleotides away. Consequently, the small fragment must code for the sequence stretching from amino-acid 121 to amino-acid 138 or 142. Subscquent sequence analysis reveals that the Hue I11 site corresponding to ainino-acid residue 138 is the correct one.
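The distance argument above can be checked against the 79-nucleotide sequence reported in Fig. 8. The following is a minimal sketch, not part of the original analysis. The codon list is our transcription of Fig. 8, "N" is our own placeholder for the four nucleotides not definitely assigned, and the helper name `cut_points` is hypothetical. The sites are written in the RNA alphabet (GAAUUC for Eco RI, GGCC for Hae III), with the usual cut positions G^AATTC and GG^CC assumed.

```python
# Codons transcribed from Fig. 8 (rabbit beta-globin mRNA, residues 114-140).
# "N" marks the four nucleotides not definitely assigned (our notation).
CODONS = [
    "UG", "UCU", "CAU", "CAU", "UUU", "GGN", "AAN",   # 114-120
    "GAA", "UUC", "ACU", "CCU", "CAG", "GUG", "CAG",  # 121-127
    "GCU", "GCC", "UAU", "CAG", "AAN", "GUN", "GUG",  # 128-134
    "GCU", "GGU", "GUG", "GCC", "AAU", "GC",          # 135-140
]
MRNA = "".join(CODONS)


def cut_points(seq, site, offset):
    """Return 0-based indices of the first base on the 3' side of each cut."""
    return [
        i + offset
        for i in range(len(seq) - len(site) + 1)
        if seq[i:i + len(site)] == site
    ]


eco_cuts = cut_points(MRNA, "GAAUUC", 1)  # Eco RI: G^AAUUC
hae_cuts = cut_points(MRNA, "GGCC", 2)    # Hae III: GG^CC

print("sequence length:", len(MRNA))
print("Eco RI cut at:", eco_cuts, "Hae III cut at:", hae_cuts)
print("Eco RI to Hae III distance:", hae_cuts[0] - eco_cuts[0])
```

With this transcription the single Eco RI site falls in codons 121-122 and the single Hae III site in codons 137-138, 51 nucleotides apart, in line with the fragment of "about 50 base-pairs" described above.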

IV. Sequence of a 79-Nucleotide Region in the Beta Chain mRNA

We have determined the sequence of the Eco RI-Hae III fragment mentioned in the preceding section. In addition, we have other sequence data from overlapping or nearby regions that enable us to assemble the continuous stretch of 79 nucleotides shown in Fig. 8, with only 4 positions uncertain. Our ability to determine rapidly the sequence of this region illustrates one of the main advantages of the cDNA cloning technique. Once the mRNA sequence has been cloned as duplex DNA, it is possible to apply the new "ladder" sequencing techniques developed by Maxam and Gilbert (18, 19). In this approach, a restriction fragment is dephosphorylated and then treated with polynucleotide kinase and [γ-32P]dATP to phosphorylate the 5' termini. The fragment is then cleaved with a second restriction enzyme, and acrylamide gel electrophoresis is used to separate the two labeled ends. The labeled fragment whose sequence is to be determined is then subjected to one of four cleavage procedures, each of which is designed to cut more or less preferentially at one of the four bases. A critical feature of the technique is that only a mild treatment is used, sufficient to cleave perhaps one base in 40. Thus, when preferentially cleaving at dG residues, one ends up with labeled fragments corresponding to all lengths that reach from the labeled 5' terminus to dG residues in the sequence under study.

[Fig. 8, sequence panel: residue number, amino acid and mRNA codon, 5' to 3']

Residue     114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131
Amino acid  Leu Ser His His Phe Gly Lys Glu Phe Thr Pro Gln Val Gln Ala Ala Tyr Gln
Codon        UG UCU CAU CAU UUU GGX AA? GAA UUC ACU CCU CAG GUG CAG GCU GCC UAU CAG

Residue     132 133 134 135 136 137 138 139 140
Amino acid  Lys Val Val Ala Gly Val Ala Asn Ala
Codon       AA? GU? GUG GCU GGU GUG GCC AAU GC

Eco RI site: GAAUUC (residues 121-122). Hae III site: GGCC (residues 137-138).
Bracket labels under the sequence: sequence deduced from G cut fragment 58c; purine "ladder" data (see text); sequence deduced from G cut fragment 57; pyrimidine "ladder" data (see text); G cut fragment 33; G cut fragment 13; C cut fragment 13; U cut fragment 33.

When such a digestion mixture is electrophoresed on 20% acrylamide gels, the labeled products, up to 70-100 nucleotides in length, can be resolved as a series of bands (regularly spaced like rungs in a ladder, hence the term "ladder technique" which we have colloquially applied to these methods). With cleavage preferentially at dG residues, the rungs of the ladder corresponding to the positions of dG residues in the sequence will be intense, the others faint or absent. By running the appropriate reactions side by side, it is possible to read off the nucleotide sequence directly, relying upon the relative intensities of the rungs of the ladder patterns in each track.

In practice, methylation-induced depurination with dimethyl sulfate is used to reveal the pattern of purines. Since dG is methylated about 7 times as rapidly as dA, and since the methylated bases are released, the methylation-induced depurination cleavage yields a pattern of heavy bands corresponding to dG residues and light bands corresponding to dA residues (18, 19). It is also possible to obtain confirmation of such results by obtaining a separate pattern in which the dA bands are more intense than dG bands. To do this, Maxam has taken advantage of the fact that the methylated A residues can be preferentially released under appropriate acid conditions (18), so that the subsequent beta elimination cleaves preferentially at dA residues. The pattern of pyrimidines is investigated using hydrazine-induced cleavage. To distinguish between dC and dT residues one takes advantage of the observation that, while dT is attacked somewhat more rapidly than dC under normal conditions, addition of 1 M NaCl largely suppresses the attack on dT residues (18). The use of piperidine to cleave wherever bases have been modified has been found to be superior to the use of other agents in giving a complete reaction so that the products can be resolved clearly on the electropherograms (18).
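The logic of reading a sequence from the four tracks can be illustrated with a small idealized model. This is a sketch under stated assumptions, not the published procedure. The lane names, the function names `run_lanes` and `read_gel`, and the example fragment are ours. The 1/7 relative intensity for dA in the dimethyl sulfate lane follows the rate ratio quoted above, whereas the 0.3 value for dG in the acid-release lane is purely hypothetical. Cleavage is treated as perfectly base-specific and every rung as resolvable.

```python
# Relative band intensity produced by each base in each reaction (idealized).
LANES = {
    "G>A": {"G": 1.0, "A": 1.0 / 7.0},  # dimethyl sulfate: heavy dG, light dA
    "A>G": {"A": 1.0, "G": 0.3},        # acid release: dA stronger (0.3 is hypothetical)
    "C+T": {"C": 1.0, "T": 1.0},        # hydrazine
    "C":   {"C": 1.0},                  # hydrazine + 1 M NaCl (dT suppressed)
}


def run_lanes(dna):
    """Map each lane to {rung index: intensity}.

    The rung index is the number of nucleotides between the 5' label and
    the cleaved base, i.e. the length of the labeled fragment.
    """
    gel = {lane: {} for lane in LANES}
    for rung, base in enumerate(dna):
        for lane, intensities in LANES.items():
            if base in intensities:
                gel[lane][rung] = intensities[base]
    return gel


def read_gel(gel, n_rungs):
    """Read the sequence rung by rung from the relative intensities."""
    bases = []
    for rung in range(n_rungs):
        g = gel["G>A"].get(rung, 0.0)
        a = gel["A>G"].get(rung, 0.0)
        if rung in gel["C"]:
            bases.append("C")       # band in both pyrimidine lanes
        elif rung in gel["C+T"]:
            bases.append("T")       # band only without NaCl
        elif g > a:
            bases.append("G")
        elif a > g:
            bases.append("A")
        else:
            bases.append("N")       # no usable band
    return "".join(bases)


fragment = "GAATTCACTCCT"  # arbitrary example fragment
print(read_gel(run_lanes(fragment), len(fragment)))
```

In this idealized setting the read-out reproduces the input fragment. Real gels add the complications described in the text, such as smeared regions and unequal band intensities.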
The brackets under the sequence illustrated in Fig. 8 indicate the different sets of data that contributed to its determination. As shown, our pyrimidine "ladder" data are complete for the nucleotide sequence corresponding to amino-acid residue 125 (Gln) through to the restriction cleavage at 138 (Ala). Portions of the purine "ladder" data were difficult to read because of artifacts that smeared some regions of the gel. The region from 124 (Pro) through 129 (Ala) was of high quality and has been utilized here. It should be emphasized that the purine "ladder" data beyond this region, although of lower quality, do not conflict with the sequence assignments dictated by our other data. The remainder of the sequence is established by unique fits for fragments 58c, 57, 33 and 11 from the G cleavage catalog (Table I) and by U cleavage fragment 33 (Table II) and C cleavage fragment 13 (Table III).

FIG. 8. Determination of a 79-nucleotide sequence in the rabbit beta globin mRNA. Brackets below the nucleotide sequence indicate the nature of the data supporting the sequence assignments. Four nucleotides not definitely assigned are indicated by X or ?. [Addendum: Additional ladder sequencing data obtained as this was being submitted establishes the entire sequence from 121 Glu through 137 Val independently of the data presented here. These results (Browne, Clarke, Paddock, Liu and Salser, unpublished) establish the sequence shown in Fig. 9.]

Attempts to clone cDNA sequences on bacterial plasmids have been pursued independently in a number of laboratories in addition to our own. Efstratiadis et al. (20) have independently discovered the efficacy of the "hook" reported earlier (7) in priming synthesis of a second DNA strand by DNA polymerase-I. They have inserted the duplex DNA molecules thus formed into bacterial plasmids using a procedure similar to that which we report here, apparently with similar efficiencies of transformation (21). Rougeon et al. (22) have used a somewhat different approach. First, they attempted to recreate the Eco RI insertion sites through the use of DNA polymerase-I and homopolymeric dG and dC tails. Second, their procedure for making the cDNA double-stranded ignored its self-priming characteristics: poly(dT) was added to the 3' end and then oligo(dA) primer was used to prime the second strand synthesis. Perhaps owing to these differences in approach, the transformation was several hundred times less efficient than what we routinely observe in the bacterial strain used; only one transformant carrying a partial globin gene-sequence was obtained. It is unclear whether they have recreated the RI restriction sites as intended. They report that the plasmid contains at least one Eco RI site, but this is to be expected since both of the gene sequences for adult rabbit globins carry an internal Eco RI site (2, 3).
Rabbitts (31) has used a still different approach, again with a low efficiency in creating globin-carrying transformants. From the identification of one clone carrying a globin plasmid, it was estimated that about 7% of the selected transformants actually contained a globin plasmid. The length of the inserted sequences has not been examined. As in the method of Rougeon et al., the technique does not take advantage of the self-priming "hook" found on the cDNA. In fact, both approaches use polynucleotide terminal transferase to add a homopolymer to the 3' end of the cDNA. The behavior of this "hook" (7, 20) suggests that it might act as an indented 3' terminus, a situation known to inhibit the action of polynucleotide terminal transferase (as we mentioned above, use of cobalt may overcome this inhibition, but cobalt was not used in Rabbitts' procedure). Thus the hook may cause low yields by interfering with the tailing procedures required in these alternative approaches.


V. The Relation of Globin mRNA Structure to Function

A. Overall Status of Globin mRNA Sequencing

The sequence determination of the rabbit globin mRNAs can probably be completed soon. Within the beta globin mRNA, for instance, we have completed 172 nucleotides from the structural gene and assigned fragments totaling more than 90 nucleotides to the untranslated regions at the 3' end of the mRNA. Proudfoot, using a different approach, has determined a 76-nucleotide sequence within the untranslated region at the 3' end that correlates completely with the fragments we had assigned to this region and allows us to make two additional assignments as well (our G cut fragments 27c and 44). He has also sequenced a 92-nucleotide region within the beta structural gene, including a 22-nucleotide sequence corresponding to amino-acid residues 107-114, for which we do not yet have data (23). Similar work is being carried out with the human globin mRNA sequences and, again considering the case of the beta chain sequence, about 81 nucleotides in the structural gene (24) and 37 nucleotides in the untranslated region (25) have been reported. Work on both rabbit and human alpha globin mRNA sequences is proceeding in the same laboratories but is not yet as far advanced.

With the advent of the cDNA cloning techniques we describe above, we can confidently expect that the rapid completion of the globin mRNA sequences will be followed by a rush of other mRNA sequences. Now that cloning can supply almost arbitrarily large quantities of a pure sequence as duplex DNA (hence suitable for the ultra-fast "ladder" sequencing approaches), such sequence determinations become easier by an order of magnitude.

B. Signal Sequences We May Expect to Find

It is therefore useful to review briefly the sort of questions we would like to answer from such sequence studies and to emphasize how little we presently know about eukaryotic mRNA function. The role of poly(A) is still obscure although it has been studied extensively, and roles in transport and in stabilization in the nucleus and/or the cytoplasm have been proposed. It will be helpful to understand how certain sequences may signal addition of poly(A), and data presented by Proudfoot et al. and by Subramanian et al. in this volume provide us with important clues about such signals. At the 5' end of the mRNA we will be searching for sequences that might signal the addition of the m7Gppp (cap) structure, or signal a processing cleavage from a larger hnRNA precursor. The cap structure may directly interact with initiating ribosomes (26, 27), but it has been suggested that there may be 50 or more untranslated nucleotides at the 5' terminus of the beta globin mRNA (23). If so, it is possible that there is also a ribosomal attachment site distinct from the cap and more nearly analogous to that seen in bacterial mRNAs.

As with studies of signals in bacterial sequences, answers to these questions will ultimately have to be sought by looking for direct interactions between sequences and the relevant binding proteins, and by examining the molecular basis of genetic defects (for instance, there are a small number of thalassemias in which mRNA appears to be present but not translated). Such studies may take some time. It is therefore likely that in the immediate future we shall have to rely on the more indirect methods of looking for features common to a number of different mRNAs where conservation of the sequence (or of a particular base-pairing pattern) suggests a functionally important signal.

C. Other Signals?

That there may be signals we cannot now anticipate is obliquely suggested by several lines of evidence.

1. Globin mRNA exists in the cytoplasm as a ribonucleoprotein complex (28, 29). Are the proteins bound specifically and present for a purpose, such as transport from the nucleus, regulation of mRNA breakdown, etc., or are they adventitious?

2. There are large untranslated sequences present in the globin mRNAs [roughly 150 nucleotides in the alpha chain and 240 nucleotides in the beta chain, not including poly(A)]. Do these sequences serve a precise function (e.g., provide protein binding sites), or are they functionless and evolutionarily drifting?

Such signals are likely to be of two or more broad types corresponding to agents that interact with all or many mammalian mRNAs, and those specific for one or a few messenger species. Signals of the first sort, which would have to occur in large classes of mRNAs, will be recognizable as being common not only to the globin mRNAs but to other mRNAs as well, for example, the immunoglobulin mRNAs. Such functions, if they not only have to be carried out for all mRNA sequences but are carried out using similar signals in each case, might properly be described as "housekeeping" functions, insofar as this indicates that they are not involved in differential control of different mRNAs. Signals specific for a single mRNA type, such as the beta globin mRNA, can best be recognized by comparing the nucleotide sequences of the same mRNA in different organisms separated by an appropriate span of evolution. The roughly 200 million years of evolution separating rabbits and man (twice the time since their divergence) is sufficient to have introduced changes in at least 44% of the nucleotide positions which are not under selection pressure (Ref. 3 and Salser and Isaacson, this volume) so that even rather small signals should be recognizable if they are strongly conserved.

D. mRNA Secondary Structure

Consideration of mRNA base-pairing and folding could be very important in determining the availability of binding sites for any "regulatory proteins" as well as ribosomal attachment, even though proteins known to be bound to the mRNA (28, 29) may have strong effects in stabilizing some base-paired configurations and destabilizing others. By comparing several mammalian globin mRNA sequences that have diverged over long periods of time, it may be possible to determine which base-pairing arrangements are physiologically important and hence conserved. A search for the most stable base-pairing pattern expected in solution (without the unknown but possibly important protein interactions) is only the first necessary step. If alternative base-pairing arrangements can equally well explain the conservation of a sequence in mRNAs from evolutionarily distant sources, both alternatives may be physiologically significant. Moreover, the switch from one configuration to the other may be an essential feature of some control process.

Our sequence analysis of a 79-nucleotide sequence from beta-chain 114 (Leu) through 140 (Ala) provides an opportunity to illustrate such a possible base-pairing arrangement (Fig. 9). It should be emphasized that we have not made a systematic computer-assisted search for the most stable base-pairing configuration that this sequence could adopt. Any conclusions we draw at present should be treated with skepticism and the understanding that we anticipate obtaining sufficient additional sequence information to justify a more systematic analysis of the overall secondary structure. Nevertheless, it seems worthwhile to discuss the possible base-pairing configuration shown in Fig. 9 not only because of the amount of regular base-pairing which it exhibits, but also because it illustrates a plausible explanation for the clustering of silent mutations we have observed in this region.

FIG. 9. A possible base-pairing configuration for the nucleotide sequence shown in Fig. 8. The rabbit sequence is shown, and base substitutions thought to occur in the human sequence are indicated by arrows as based on personal communications from B. Forget and S. M. Weissman.

The base-pairing configuration shown in Fig. 9 was derived by inspection of the rabbit globin mRNA sequence without regard to the corresponding sequence in human globin mRNA. The stability of the proposed structure was then evaluated using the rules of Tinoco et al. (30). According to these rules, proposed for RNA free in solution, the structure shown has a stability of -11.3 kcal/mol. We can test the probable biological significance of the structure by asking how mutations differentiating rabbit and human sequences are distributed with respect to the proposed base-pairing. In the structure shown, 43% of the bases are paired. If the postulated base-pairing is not related to what occurs in vivo, we would expect that, on the average, 43% of the mutations in this sequence should occur in base-paired regions. In fact, of the 9 mutations that have occurred in this sequence, only one (the change of a G·C pair to G·U) is in a region of postulated pairing. Such a result suggests that the pairing illustrated may have biological significance. The addition of firm sequences over larger regions of the gene, permitting a thorough computer-assisted analysis of the relative stabilities of all theoretically possible alternatives of RNA molecules free in solution, followed by a detailed mutational analysis, will help us decide which of the possible structures seems to be most significant in the complex milieu of the cell.
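The comparison of 1 paired-region mutation out of 9 against the 43% expected by chance can be put on a rough quantitative footing. The sketch below is an illustration only; no such test is reported in the text. It assumes that the 9 mutations fall independently and with equal probability on every nucleotide, and it computes the one-sided binomial probability of seeing one or fewer of them in paired regions. The function name `prob_at_most` is ours.

```python
from math import comb


def prob_at_most(k_max, n, p):
    """Binomial probability of k_max or fewer successes in n trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1))


fraction_paired = 0.43   # fraction of bases paired in the Fig. 9 structure
n_mutations = 9          # rabbit/human differences in the 79-nucleotide region
n_in_paired = 1          # the single G.C to G.U change

p_value = prob_at_most(n_in_paired, n_mutations, fraction_paired)
print(f"expected in paired regions: {fraction_paired * n_mutations:.1f}")
print(f"P(<= {n_in_paired} of {n_mutations} in paired regions) = {p_value:.3f}")
```

Under these assumptions about 3.9 of the 9 mutations would be expected in paired regions, and the chance of observing one or none is close to 5%. This is suggestive rather than conclusive, which is consistent with the call below for data from additional species.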
Indeed, for the mutational analysis to be convincing it may be necessary to compare beta-chain mRNA sequences from more than two species, each new species providing an independent set of mutations for testing any particular pattern of secondary structure. With the advent of cDNA cloning techniques and rapid methods for determining the cloned sequences, such an approach, which would have seemed impossible a few years ago, is becoming practical.

ACKNOWLEDGMENTS

We would like to thank S. Weissman, N. Proudfoot and their collaborators for providing us with data in advance of publication; F. Ramirez, A. Bank and D. Kacian for supplying us with the globin cDNA used in the earlier parts of this work; K. Toth and R. Wall for their assistance with globin mRNA preparations; A. Maxam for detailed descriptions of the ladder sequencing techniques in advance of publication; T. Maniatis and A. Efstratiadis for helpful discussions and making available unpublished results; and J. Isaacson for valuable discussions and comments on the manuscript. Special thanks go to R. Ratliff for providing advice, the hospitality of his laboratory and help of all sorts during our preparation of polynucleotide terminal transferase. We would like to thank R. Firtel for gifts of the phage lambda exonuclease and endonuclease RI, R. Roberts and T. Maniatis for gifts of endonuclease Hae III and lambda exonuclease, and B. Wallace for his help with the photography. The reverse transcriptase used in this work was provided by the Office of Program Resources and Logistics, Viral Cancer Program Viral Oncology, Division of Cancer Cause and Prevention, and the National Cancer Institute. The research reported was supported in part by USPHS grants GM 18586 and CA 15940. W. S. is a recipient of Public Health Service Career Development Award GM 70045. J. B. was supported in part by USPHS Molecular Biology Training Grant GM 1531, R. H. and H. H. by training grant CA 09056, and P. Z. by Training Grant GM 7104. G. P. has been supported in part by a Helen Hay Whitney Postdoctoral Fellowship.

REFERENCES

1. W. Salser, R. Poon, P. Whitcome and K. Fry, in "Virus Research" (C. F. Fox and W. S. Robinson, eds.), p. 545. Academic Press, New York, 1973.
2. R. Poon, G. V. Paddock, H. Heindell, P. Whitcome, W. Salser, D. Kacian, A. Bank, R. Gambino and F. Ramirez, PNAS 71, 3502 (1974).
3. W. Salser, S. Bowen, D. Browne, F. El Adli, N. Fedoroff, K. Fry, H. Heindell, G. Paddock, R. Poon, B. Wallace and P. Whitcome, FP 35, 23 (1976).
4. G. V. Paddock, R. Poon, H. Heindell, J. Isaacson and W. Salser, in preparation.
5. G. V. Paddock, H. Heindell and W. Salser, PNAS 71, 5017 (1974).
6. N. Fedoroff and W. Salser, unpublished results.
7. W. Salser, ARB 43, 923 (1974).
8. P. E. Lobban and A. D. Kaiser, JMB 78, 453 (1973).
9. M. Mandel and A. Higa, JMB 53, 159 (1970).
10. M. Grunstein and D. Hogness, PNAS 72, 3961 (1975).
11. R. Wall, S. Lippman, K. Toth and N. Fedoroff, in preparation.
12. A. Efstratiadis, T. Maniatis, F. C. Kafatos, A. Jeffrey and J. N. Vournakis, Cell 4, 367 (1975).
13. K. Toth, R. Wall, G. V. Paddock, R. Higuchi and W. Salser, unpublished results.
14. T. E. Shenk, C. Rhodes, P. Rigby and P. Berg, PNAS 72, 989 (1975).
15. N. Davidson, personal communication.
16. R. Curtiss III, D. Pereira, J. Clark, S. Hull, R. Goldschmidt, J. C. Hsu, L. Maturin, R. Moody and M. Inoue, personal communication.
17. P. Berg, D. Baltimore, S. Brenner, R. Roblin and M. Singer, Science 188, 991 (1975).
18. A. Maxam, personal communication.
19. W. Gilbert, A. Maxam and A. Mirzabekov, in "Control of Ribosome Synthesis," Alfred Benzon Symp. IX (N. O. Kjeldgaard and O. Maaløe, eds.), Munksgaard, Copenhagen, 1976. In press.
20. A. Efstratiadis, F. C. Kafatos, A. Maxam and T. Maniatis, Cell 7, 279 (1976).
21. T. Maniatis, personal communication.
22. F. Rougeon, P. Kourilsky and B. Mach, NARes 2, 2365 (1975).
23. N. J. Proudfoot and G. G. Brownlee, Br. Med. Bull. (1976). In press.
24. C. A. Marotta, B. G. Forget, S. M. Weissman, I. Verma, R. McCaffrey and D. Baltimore, PNAS 71, 2300 (1974).
25. B. G. Forget, C. A. Marotta, S. M. Weissman and M. Cohen-Solal, PNAS 72, 3614 (1975).
26. G. W. Both, A. K. Banerjee and A. J. Shatkin, PNAS 72, 1189 (1975).
27. S. Muthukrishnan, G. W. Both, Y. Furuichi and A. J. Shatkin, Nature 255, 33 (1975).
28. G. Blobel, PNAS 70, 924 (1973).
29. C. Morel, E. S. Gander, M. Herzberg, J. Dubochet and K. Scherrer, EJB 36, 445 (1973).
30. I. Tinoco, P. N. Borer, B. Dengler, M. Levine, O. Uhlenbeck, D. Crothers and J. Gralla, Nature NB 246, 40 (1973).
31. T. H. Rabbitts, Nature 260, 221 (1976).

Mutation Rates in Globin Genes: The Genetic Load and Haldane's Dilemma

WINSTON SALSER AND JUDITH STROMMER ISAACSON

Department of Biology and Molecular Biology Institute
University of California
Los Angeles, California

I. The Use of Silent Mutations to Measure Mutation Rates

Mutation rates for mammalian genomes have often been computed from rates of amino-acid substitution (1, 2). The fact that rates differing by manyfold are obtained for different proteins emphasizes that in some proteins, such as the histones, most amino-acid substitutions are sufficiently deleterious to be eliminated during the course of evolution. In such an experiment, of course, one is measuring not the rate at which mutations occur, but instead the rate at which they are being fixed in the population (we shall refer to this as the acceptance rate). Accurate estimates of the rates at which mutations occur form the basis for calculations of considerable current interest, and to obtain such estimates it is useful to restrict one's attention to those mutations most likely to be truly neutral, that is, those with no selective advantage or disadvantage.

Direct nucleotide sequence analysis permits us to come closer to this ideal in a way not possible before, by allowing specific consideration of mutations that are silent. Note that by "silent" we mean causing no amino-acid substitution and therefore exerting no effect on the fitness of the protein. It should not be assumed that silent mutations are necessarily neutral; they may disrupt important base-pairing relationships, change the binding of control proteins to signal sequences, or have other effects that are more or less strongly selected (4). But it seems reasonable that silent mutations are no more likely than other base-substitution mutations to have selective effects of this particular sort, and they are certain to have none of the well-known and frequently strong selective effects resulting from amino-acid substitutions.

Thus, by measuring acceptance rates of silent mutations through direct nucleotide sequence comparisons between species, we can remove one of the most important limitations of previous estimates of the mutation rate, which have been based on amino-acid substitution data.


In a previous analysis (3), we compared 53 nucleotides of the rabbit and human globin mRNA sequences and observed five silent mutations, representing a mutation acceptance rate of 44% per base-pair. With such a high figure there is a substantial correction due to the possibility that two or more successive mutations have occurred at the same site. Such a correction can be made by use of the Poisson distribution, but one must treat separately the sites where there are one, two or three possible silent mutations. When the 44% figure is corrected in this way, one obtains a figure of 61%, not 66% as was reported earlier (3). We stressed that these figures were based upon a small sample and, even if based on a large sample, they could represent only a minimum estimate of the real base substitution rate, since some silent mutations may be strongly selected against.

Our determination of the 79-nucleotide sequence from beta chain 114 (Leu) through 140 (Ala) (4) gives us new data, which provide an opportunity to examine these questions further. We have compared our sequence, which is in agreement with the data Proudfoot has obtained for a portion of the same region (6), with preliminary sequence data for the corresponding portion of the human beta chain (5).¹ Since the first 16 nucleotides of this region were included in our earlier calculation, we have restricted these calculations to the new data, from 121 (Glu) through 140 (Ala). In this region the known rabbit and human sequences differ by 7 base substitutions, of which 6 are silent.

Considering these 6 silent mutations as a fraction of all silent mutations possible in this sequence, we have computed a mutation rate of 43% per nucleotide before correcting for multiple events, and 60% with the correction. Thus, for an amino acid having four possible codons so that the third nucleotide has three possible silent mutations, these data suggest that there has been acceptance of an average of 0.60 silent mutations during the rabbit-human divergence. For an amino acid with only two codons, such as phenylalanine, only one-third of the mutations in the third position would have been silent, so there would have been an average of 0.60/3 = 0.20 silent mutations accepted per such codon during this same period. This result (raw value 43%, corrected value 60%) is, fortuitously, in very good agreement with the values obtained with the older, independent set of data (raw value 44%, corrected value 61%). The result is especially interesting because the measured rate of acceptance of silent mutations in the globin genes is about 10 times higher than the rate of acceptance of base substitutions that result in amino-acid changes (about 5.9% per nucleotide for the evolutionary time span separating rabbits and humans; see 3). Thus, by using silent mutations in the expectation that they will be more nearly neutral, we have taken a significant step toward a measure of the true mutation rate for mammalian genomes.

¹ See Marotta et al. in this volume.
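The idea behind the multiple-event correction can be sketched in its simplest form. The example below is a simplified illustration and not the calculation used in the text. It assumes a single class of sites, Poisson-distributed mutation events and no reversions, so that a site shows a difference with probability 1 - exp(-m) when it has accepted m mutations on average. The function name `poisson_corrected` is ours.

```python
import math


def poisson_corrected(raw_fraction):
    """Mean accepted mutations per site, m, given the observed fraction
    of changed sites p = 1 - exp(-m) (single site class, no reversions)."""
    return -math.log(1.0 - raw_fraction)


for raw in (0.43, 0.44):
    print(f"raw {raw:.2f} -> corrected {poisson_corrected(raw):.3f}")

# Arithmetic quoted in the text, using the authors' corrected value of 0.60:
silent_per_fourfold_codon = 0.60
silent_per_twofold_codon = silent_per_fourfold_codon / 3   # one of three changes is silent
ratio_to_replacement = silent_per_fourfold_codon / 0.059   # versus 5.9% per nucleotide
print(f"{silent_per_twofold_codon:.2f} per two-codon amino acid, "
      f"about {ratio_to_replacement:.0f}-fold the replacement rate")
```

This single-class version gives somewhat lower corrected values than the 60% and 61% quoted above, presumably because the authors treat separately the sites with one, two or three possible silent mutations. It is meant only to show the direction and rough size of the correction.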

II. Haldane’s Dilemma Magnified Haldane computed the number of genetic deaths (by which we mean the amount of selective mortality due to the expression of genetic characters) required for a genc substitution to occur according to strict Darwinian principles ( 7). With reasonable assumptions about permissive levels of mortality associated with Darwinian evolution, he estimated that there could be about one gene substitution every 300 gener at‘ions. Kimura ( 1 ) however, pointed out that if all mammalian DNA is evolving at the same rate as that indicated by the rate of amino-acid substitutions in hemoglobin and cytochrome c, this is equivalent to a gene substitution every year or two in each genome. Crow (8) and others have pointed out that these are two ways around the dilemma thus posed. These can be restated as follows: 1. Truticiition selection. Classically, it has been assumed that genes have independent effects upon fitness. J. M. Smith ( 9 ) postulated that in many cases this may not b e so, and that two criteria may operate together to reduce the genetic load drastically. The first of these criteria is that many distinct genes must contribute cumulatively to some underlving variable. The second is that there must be a truncation, so that survival is not linearly proportional to this variable, but instead most individuals above a certain threshold level survive while most of those below this value do not. Selections for yield in plant or livestock breeding usually approximate this procedure. The result is that, with proper adjustment of the threshold, the genetic dcath of each individual below the threshold eliminates not one but many deleterious gene copies. The genetic load involved in keeping that portion of the genome accurate is correspondingly reduced. 2. Neutral mutations. 
The second route around Haldane’s dilemma is to imagine that the nucleotide sequences that must be kept accurate make up only a small fraction of the genome and that the rest is genetically drifting. In this case, most of the niutations that occur will be selectively neutral and will not contributc to the genetic load. Our own results based on the frequency of “silent” nucleotide substitutions suggest that the real mutation rate is much highcr than that estimated by Kimura from the amino-acid replacement data. W e estimate at least 5 mutations per year per genome rather than the 0.05-1.0 per year estimated by Kimura. This proportionally increases the magnitude

208

WINSTON SALSEH AND JUDITH STROMMER ISAACSON

of the dilemma that Kimura pointed out. If our results still underestimate the mutation rate, as seems possible, the magnitude will be increased further.

To account for the observed maintenance of mammalian genomes in the face of Haldane’s dilemma, we propose to invoke both of the solutions mentioned above. Truncation selection provides a useful way of explaining accurate maintenance of the repetitive gene clusters and satellite DNAs, but it is difficult to imagine that truncation selection could effectively maintain the accuracy of the single-copy DNA coding for most proteins. Indeed, we suggest that only a small fraction of the genome, perhaps less than 2%, is kept accurate as single-copy DNA. Since most mammalian genomes contain large amounts of single-copy DNA, the implication is that most such sequences must be genetically drifting.

Finally, there is a special problem in understanding how the interspersed repetitive nucleotide sequences can be maintained with a degree of accuracy consistent with their proposed function as control elements. Experimental evidence suggesting that they are kept accurate points to the possibility of a rather special mechanism for this purpose.

III. Constraints on the Maintenance of Single-Copy DNA Sequences

As pointed out by Kimura (10), the total genetic load (defined as the proportion by which the population fitness or survival is decreased in comparison with that of an optimal genotype) is the sum of many components. One is the substitution load, which is the cost of evolution, that is, of replacing all of the original genotypes with genotypes carrying a new advantageous mutation. Another is the mutation load, the cost of keeping the genome accurate by eliminating deleterious mutations. Numerous other components are described by Wallace (11).

Because of the way in which Haldane originally posed the problem (the cost of gene replacement), most mathematical treatments of this matter have primarily considered the substitution load (1, 7, 9). In fact, however, if it can be accepted that many of the mutations are neutral, as strongly argued by Kimura and others (1, 2, 8), then it will be virtually impossible to determine experimentally the crucial parameter for calculations dealing with the substitution load. This parameter is the rate of occurrence (and fixation) of mutations having a positive selection value. At the molecular level, these alterations will be extremely difficult to distinguish from the larger number of neutral mutations. At the level of anatomy or behavior, it is usually impossible to say how many individual mutational events have gone to create a new characteristic. Thus, at


MUTATION RATES IN GLOBIN GENES

the present time, calculations of the substitution load cannot be meaningfully related to experimental data. Our main interest here will therefore be in the mutation load.

We share the reservations noted by Crow (8) and others about the effectiveness of truncation selection as a complete solution to Haldane’s dilemma. Such reservations should be further strengthened by any increases in the estimates of the mutation rate such as we have argued from our data above; and for simplicity in the calculation below we will assume that truncation selection does not play a significant role in keeping the single-copy DNA accurate. The results of the calculation can be modified in accordance with the degree to which the reader believes that truncation selection does play a role.

The basic truth upon which we wish to focus is that, in the absence of truncation selection, there must be one “genetic death” to eliminate each deleterious mutation, a concept stressed by Haldane (12) and by Muller (13). In the steady-state condition, such genetic deaths must occur at the same rate as mutations that cause deleterious effects. If the entire genome were to be kept accurate, the number of genetic deaths would have to equal the total mutation rate, but, as pointed out by Kimura (1), at known mutation rates each zygote would carry several lethal mutations and there would be virtually no viable births. This possibility is therefore excluded.

We compute the fraction of the genome that can be kept accurate (F) as follows. A deleterious mutation is defined as one that changes a portion of the sequence normally kept accurate. Therefore, the number of deleterious mutations per genome per generation (L) is equal to F times the total number of mutations per genome per generation (m), i.e., L = Fm. Solving for F and substituting for m we obtain the expression

$$F = \frac{L}{m} = \frac{2TL}{f\,t\,\mathit{BP}}$$

where:

m = Total mutations occurring per haploid genome per generation (m = ftBP/2T).

L = Genetic load: deleterious mutations eliminated per haploid genome per generation (roughly equal to half the fraction of diploid zygotes failing to reproduce because of genetic defects). Muller (14) estimated that there were between 0.1 and 0.2 new deleterious mutations per haploid genome per generation for man, corresponding to the genetic deaths of nearly 0.2 to 0.4 of all zygotes. Owing to the longer reproductive span and greater number of germ cell divisions in man, these values are perhaps high for an estimate of the lagomorph genetic load, but we will assume they are typical for the evolutionary sequence in question. We have not applied the Poisson correction for multiple events because the difference is not appreciable for values of L less than 0.5. We are assuming that the population is in a steady state, so that the rate of elimination of deleterious mutations by “genetic deaths” is equal to the rate of occurrence of new deleterious mutations. Moreover, we are using the total genetic load figure estimated by Muller. To restrict ourselves to the mutation load, the more appropriate parameter for this calculation, we should subtract substitution, segregation and other components of the total genetic load from the figures given. Kimura (11) estimates that the substitution load is twice as great as the mutation load. If so, such corrections will cause a 3-fold decrease in the estimates of F shown in Table I, further increasing the magnitude of Haldane’s dilemma and strengthening the conclusions we wish to draw.

T = Time in years since evolutionary divergence of the two species. For the rabbit-man comparison, estimates have recently been increased substantially due to the discovery of lagomorph fossils dated at 80-90 million years (15). On this basis, McKenna estimates that the divergence can reasonably be judged to have occurred about 100 million years ago (16). The factor of two converts time since divergence of lagomorphs and primates to total years of evolution separating the two.

f = Fraction of bases mutated during the evolutionary divergence as corrected for multiple events. Our current best minimum estimate for this number is 0.60 (see text above). If some silent mutations are in fact selected against, then the real number could be substantially higher. There is, in fact, increasing evidence that many silent mutations are deleterious and consequently eliminated during the course of evolution. Perhaps the strongest evidence comes from a comparison of the rate of silent mutations in the untranslated regions of globin mRNAs (where all mutations are by definition silent) with the rate of silent mutations in the structural genes. While we find a 44% substitution rate for silent mutations in the structural genes, Proudfoot (this symposium) estimates that rabbit and human sequences differ at only 15% of the positions in the untranslated regions sequenced. The difference implies that silent mutations in the untranslated region are more likely to be deleterious than those in the translated region, perhaps reflecting a role in mRNA secondary structure or in protein binding. There remains the question of the fraction of silent mutations in the structural gene that are themselves deleterious, but the observed correlation of silent mutations with non-base-paired regions in the hypothetical structure depicted in Fig. 7 of the preceding paper (4) suggests such an effect may be large. Consequently the real value of f could be at least two times greater than the measured minimum value of 0.60. Moreover, our data are limited to base substitutions. The real value of f to be used in this calculation should be increased to include other types of mutations contributing to the genetic load (e.g., deletions and chromosome rearrangements). Any such increases would have the effect of further decreasing estimates of F in Table I, again magnifying Haldane’s dilemma.

BP = Size of haploid genome in base-pairs (3.0 × 10^9 for mammals) (17).

t = Average generation time since the divergence of the species. The generation time for primitive man was probably somewhat shorter than at present, perhaps 20 years to median birth. For lagomorphs it is about 1.5 years, although probably more in the wild. The average generation time is what is needed, however. Note that the average will be dominated by the smallest values. Thus, if we assume that the generation time has been 1.5 years along the entire lagomorph branch and 25 years along the entire primate line, then the average generation time would be 2.8 years. If man’s long generation time is a recent adaptation, then 1.5 years would be a better average.

F = Fraction of genome that can be kept accurate through selection without invoking truncation selection.

TABLE I
CALCULATION OF F, THE FRACTION OF THE MAMMALIAN GENOME THAT CAN BE KEPT ACCURATE WITHOUT INVOKING TRUNCATION SELECTION^a

| | Most probable F | Maximum F | Minimum F | Units |
|---|---|---|---|---|
| m | 18.6 | 5.0 | 50 | Mutations per genome per generation |
| L | 0.15 | 0.2 | 0.1 | Genetic load: deleterious mutations per genome per generation |
| T | 10^8 | 1.2 × 10^8 | 0.8 × 10^8 | Years since divergence of species compared |
| f | 0.6 | 0.4 | 1.2 | Fraction of bases mutated (during divergence of the species compared) |
| BP | 3.0 × 10^9 | 3.0 × 10^9 | 3.0 × 10^9 | Base-pairs per genome |
| t | 2.0 | 1.0 | 3.0 | Years per average generation |
| F | 0.008 (0.8%) | 0.040 (4.0%) | 0.002 (0.2%) | Fraction of genome |
| No. of genes | 4.8 × 10^4 | 2.4 × 10^5 | 1.2 × 10^4 | Globin gene equivalents |

^a Successive columns list our values for most probable, maximum and minimum values of F along with the input parameters used in each calculation. See text for details of the calculation.
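The calculation defined above (F = L/m, with m = ftBP/2T) can be sketched numerically. The following Python fragment is a minimal sketch and not part of the original analysis: the function names are hypothetical, and treating the “average generation time” as the harmonic mean of the two lineages is an assumption. It is chosen because m depends on the number of generations elapsed rather than the number of years, and because it reproduces the 2.8-year figure quoted for 1.5 and 25 years. The inputs are the “most probable” values of Table I.

```python
def average_generation_time(t_a, t_b):
    """Harmonic mean of two generation times in years (assumed averaging rule).

    Equal spans of years on the two branches contribute generations in
    proportion to 1/t, so the shorter generation time dominates.
    """
    return 2.0 / (1.0 / t_a + 1.0 / t_b)


def mutations_per_generation(f, t, bp, T):
    """m = f * t * BP / (2T): mutations per haploid genome per generation."""
    return f * t * bp / (2.0 * T)


def fraction_kept_accurate(L, f, t, bp, T):
    """F = L / m = 2TL / (f * t * BP)."""
    return L / mutations_per_generation(f, t, bp, T)


# Lagomorph branch at 1.5 years, primate branch at 25 years (values from the text).
t_avg = average_generation_time(1.5, 25.0)

# "Most probable" inputs of Table I.
m = mutations_per_generation(f=0.6, t=2.0, bp=3.0e9, T=1.0e8)
F = fraction_kept_accurate(L=0.15, f=0.6, t=2.0, bp=3.0e9, T=1.0e8)

print(f"average generation time: {t_avg:.1f} years")
print(f"m = {m:.1f} mutations per genome per generation")
print(f"F = {F:.4f} of the genome")
```

With these inputs the formula gives m of about 18 and F of about 0.008, close to the most-probable column of Table I. Substituting the maximum-F inputs (L = 0.2, f = 0.4, t = 1.0, T = 1.2 × 10^8) gives m = 5.0 and F = 0.040.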

Results

When the calculation is carried out with the values that seem most reasonable to us, F is found to be 0.008 (0.8%) of the genome, enough to code for 4.8 × 10^4 genes the size of globin. Column 2 of Table I shows the result of this calculation along with the values chosen for each of the input variables. In the next column, we give the “maximum F” computed by choosing extreme values of the input variables so as to maximize F. In the last column, the same procedure is followed to compute minimum F. Note that even when input variables are chosen to maximize F one finds that no more than 4.0% of the genome can be kept accurate.

The least certain values listed in Table I are almost definitely those for L, the number of deleterious mutations per genome per generation that can be eliminated by genetic death without truncation selection. We have used Muller’s estimate of 0.1-0.2 (14). Muller argued that the dominance of deleterious mutations is about 5%, sufficiently high so that most will be eliminated as heterozygotes, and concluded that for L = 0.1-0.2, genetic death must strike 20-40% of all zygotes. We can better assess the plausibility of Muller’s figures if we ask how high a value of L is consistent with reasonable levels of mammalian fertility. For L = 0.5 there would have to be an average of one deleterious mutation eliminated per zygote, and the Poisson distribution would predict a 37% survival rate for zygotes. In this case, the “most probable” value of F would be 3.0% of the genome, and the maximum value would be 10.0%. For L = 1.0, again assuming sufficient dominance that most deleterious mutations are eliminated as heterozygotes, the corresponding zygote survival figure would be about 14%, the “most probable” value of F would be 6.0% of the genome, and the maximum value of F would be 20.0% of the genome. Thus, even by straining every input parameter and assuming genetic viability of only 14% (a figure that seems very low), classical Darwinian selection cannot account for the accurate maintenance of more than 20% of the genome, and 0.8% appears to be a more likely figure.

Our conclusion is not far from that reached by others (18) on the basis of nucleotide substitution estimates obtained by Kohne (20) from measurements of the depression of the t_m in interspecific hybrids between nonrepetitive DNAs. Our data provide a more direct measurement, coming as they do from the determination of actual nucleotide sequences in a well-characterized structural gene rather than from an extrapolation of the number of substitutions indicated by t_m depressions in total single-copy DNA.
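The sensitivity to L discussed above can be checked with a few lines of arithmetic. This sketch assumes, as the text does, that a diploid zygote carries on average 2L deleterious mutations (L per haploid genome), that any one of them causes genetic death, and that their number per zygote is Poisson-distributed, so the surviving fraction is exp(-2L). The values of m used here (18 and 5 mutations per genome per generation) are rounded figures for the most-probable and maximum-F cases of Table I, and the function names are illustrative only.

```python
import math


def zygote_survival(L):
    """Poisson probability that a diploid zygote carries no deleterious mutation.

    The mean number per zygote is taken as 2L (L per haploid genome).
    """
    return math.exp(-2.0 * L)


def fraction_kept_accurate(L, m):
    """F = L / m, the fraction of the genome that selection can keep accurate."""
    return L / m


M_MOST_PROBABLE = 18.0  # rounded most-probable m (assumption stated above)
M_FOR_MAXIMUM_F = 5.0   # m for the maximum-F case of Table I

for L in (0.15, 0.5, 1.0):
    print(
        f"L = {L:4.2f}: survival = {zygote_survival(L):.2f}, "
        f"most probable F = {fraction_kept_accurate(L, M_MOST_PROBABLE):.3f}, "
        f"maximum F = {fraction_kept_accurate(L, M_FOR_MAXIMUM_F):.3f}"
    )
```

Under these assumptions L = 0.5 and L = 1.0 give survival rates of about 37% and 14%, most-probable F values of about 2.8% and 5.6% (close to the 3.0% and 6.0% quoted), and maximum F values of 10% and 20%.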

IV. How Are Multiple Copy DNA Sequences Kept Accurate?

A. Clustered Repetitive Sequences

The repetitive DNA sequences, and especially the highly repetitious satellite DNAs, seem to violate the principles stated above. For instance, we have sequenced the major (and many of the minor) repeats of three satellites which make up 52% of the total genome of the kangaroo rat Dipodomys ordii (21, 3). The major repeat sequences are accompanied by numerous variants, but on the whole it is impressive that from 40 million to 500 million copies of the major repeats can be carried in the genome with the observed degree of accuracy.

George Smith has postulated that it should be possible to maintain homogeneity of satellite and other repeated sequences without contributing to the genetic load by a process of random unequal crossovers between such sequences (21, 22). He has shown by computer simulations that a series of unequal crossovers, such as have been demonstrated in the


ribosomal gene cluster (bobbed locus) of Drosophila (23, 24), will result in a decrease in the amount of variability within the repeat pattern, so that ultimately all the repeats will be identical, except for mutational events that continue to introduce new variability.

Smith’s work is important in demonstrating that unequal crossing-over, a mechanism for which there is biological evidence, can be a potent force in maintaining the homogeneity of repeated sequences. We believe, however, that his theory makes some incorrect predictions about the evolution of satellite DNA sequences and must be substantially modified. In his model, the sequence heterogeneity will be radically decreased by unequal crossing-over, but, at least to a first approximation, any of the elements of the original cluster of repeats has an equal chance of being the dominant element in the final gene cluster. Thus the sequence is not conserved, it is only kept homogeneous. The novelty of Smith’s proposal, then, is in suggesting how random genetic drift may lead to homogeneity in such cases. This is in good agreement with the proposals by Southern (25) and Walker (26) that satellite sequences arise de novo over a time short relative to the age of the species, so that satellite sequences in different species should have no resemblance to each other except by chance convergent evolution.

Kirk Fry, in his sequence analysis of the Dipodomys ordii HS-alpha satellite, has shown that this simple picture cannot be correct (3). The HS-alpha satellite of Dipodomys ordii very closely resembles the guinea pig alpha-satellite sequenced by Southern. We do not think it likely that this is a chance convergence. When the guinea pig alpha and D. ordii HS-beta satellites were analyzed by the same technique, the results indicated a strong similarity not only in the major repeat (which is identical) but also in the whole range of minor variants present (27). Similar results have been obtained with a third satellite from a pocket gopher.

Southern and Walker’s proposal was based on their observation that the closely related species they examined had very different satellite DNA patterns. This is also true in the genus Dipodomys, as shown by the studies of Hatch and Mazrimas (28, 29). The extreme case is D. deserti, in which none of the three satellites which make up 52% of the genome of D. ordii can be detected by isopycnic banding in the analytical ultracentrifuge. Therefore, our theory of satellite DNA evolution must explain the seeming paradox that satellite sequences can show very drastic changes, at least in level, over short evolutionary times, yet they can persist, be accurately maintained, and reappear in different species separated by much greater evolutionary spans.

Salser et al. (3) attempted to account for these observations by proposing that the rodents (and perhaps other mammals) share a common


“library” of satellite sequences present at levels lower than the lO’-or-so tandem repeats necessary for detection in density-gradient centrifugation. According to this model, the rapid evolutionary changes that satellites undergo are for the most part quantitative, resulting from saltatory replications of different parts of the satellite library and from deletions of portions of amplified satellites.

The roles proposed for satellite DNA sequences include possible action as recognition sites for spindle-fiber attachment or for meiotic pairing. Insofar as such roles might involve sequence-specific protein binding, they would seem to be ruled out according to the view advanced by Smith (21, 22), for it is difficult to see how recognition sequences in the proteins could evolve at a pace rapid enough to match the rapid random drift of DNA sequences that Smith’s model would involve. On the other hand, if many satellites are similar to the D. ordii HS-beta/guinea pig alpha in having a long evolutionary persistence, then one can imagine that along with the “library” of satellite sequences there may be a cognate library of genes for binding proteins.

This brings us back to the question of how the accuracy of the clustered repetitious DNAs can be maintained in the context of genetic load. Smith’s model makes no use of truncation selection or any means by which a particular sequence might be maintained other than by classical Darwinian selection according to which, on the average, only one deleterious mutation is eliminated per genetic death. According to the calculations in Section III, however, such classical Darwinian selection can keep accurate no more than about 0.8% of the genome, much less than the amount of satellite DNA accurately maintained. We therefore favor the idea that a form of truncation selection may work in combination with the mechanism so elegantly proposed by Smith.
This would achieve accurate maintenance of any functionally important repeated sequence, with a contribution to the genetic load that could be a very small fraction of that demanded by the classical mechanisms of Wright (30) and Fisher (31).

This is perhaps best illustrated in the case of a gene cluster for products of known function such as those for the ribosomal RNAs or the tRNAs. We imagine that deleterious mutations are continually appearing in such gene clusters and that unequal crossing-over is continually operating to produce, as discussed by Smith, a broad range in the number of bad gene copies per genome. Consider, for example, a cluster of 1000 ribosomal RNA genes. Let us suppose that individuals can function relatively well as long as they have more than a certain level of good gene copies, which we set at 75% for the purposes of argument. However, let us now suppose that in those individuals whose bad gene copies exceed


this level there is progressively greater difficulty, so that individuals with 50% defective genes very seldom reproduce in a competitive natural environment. If so, virtually all the genetic deaths due to defective ribosomal genes will be of individuals who carry from 250 to 500 defective gene copies. In the absence of truncation selection, each genetic death eliminates only one excess defective gene from the population, so the truncation that we propose has the effect of reducing the genetic load by 250- to 500-fold for this particular gene cluster.

It should be stressed that the specific unequal-crossover mechanism proposed by Smith is only one of many ways by which homogeneity can be introduced into a clustered repetitious gene family. For instance, it could be proposed that one or more repeats are circularized by a recombinational event and that a rolling-circle mode of replication then produces many gene copies to replace the original cluster. Regardless of the details of the process, the essential features of the model that we propose are the same: first, that clustered repetitious gene families will have the characteristics required for truncation selection, and second, that any mechanism introducing homogeneity should greatly increase the effectiveness of the truncation.

So far as we know, the special relevance of truncation selection mechanisms to repeated gene families has not been proposed before. Crow and others have argued that the notion of truncation selection, while qualitatively correct, can apply in only a limited number of cases and should not be expected to have a large effect in reducing the genetic load. In fact, these criticisms are relevant to typical single-copy genes, but do not apply to the clustered repetitive sequences that interest us here. In the first place, Crow (8) points out that for truncation selection to operate on a family of different genes, it must be shown that all these genes act cumulatively on a single trait.
He found it difficult to believe that this could be a very common phenomenon. Obviously, this condition is automatically fulfilled in the case of a repetitive gene family, since all members are producing the same products. Second, Crow found it difficult to imagine that truncation would apply in natural situations, with selective retention of all individuals above a certain level and rejection of those below. It is difficult to address this latter question in cases for which it is imagined that many different kinds of gene products are interacting. In the case of multiple copies of a single gene, however, one may simply inspect the shape of the dose-response curve relating viability to the percentage of functional gene copies. Such experiments are difficult to carry out in eukaryotic systems, but an experiment performed by M. Fluck, R. Epstein and W. Salser (32, 33) measures this relationship for the rIIB cistron of bacteriophage T4 and


serves to illustrate a phenomenon that could be general. In these experiments, the amount of active rII gene product was varied over a range of several hundredfold by infecting with various ratios of rII-producing to rII-deletion phage. In order to cover a greater range of effects, the producing phage were sometimes rIIB nonsense mutants grown in a weakly suppressing host rather than the wild type. The results revealed that if even one rIIB cistron out of seven entering the cell is fully active, the viability (phage yield) is nearly normal. However, when the production of the rII gene product is further reduced, phage yield begins to fall off very rapidly. Beyond this point, successive 2-fold decreases in the rII gene product input result in 4-fold decreases in phage yield over a wide range. Such results can be interpreted to mean either that the rII gene product is normally produced in a roughly 10-fold excess, or that the gene product autogenously regulates its own synthesis with a feedback mechanism boosting synthesis up to 10-fold when gene product is limiting. In either case, the dose-response curve is such that there would be very efficient truncation selection were there a repetitive eukaryotic gene family with these characteristics. Any gene whose synthesis is under autogenous regulation (34) or whose product acts in a nonlinear fashion (as, for instance, if the active gene product is an oligomer) may be expected to show a certain cooperativity. This will result in a dose-response curve appropriate for more or less effective truncation selection should the gene occur as a member of a clustered repetitive gene family.
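The ribosomal gene-cluster argument earlier in this section can be illustrated with a small calculation. The sketch below is hypothetical in every detail beyond the numbers quoted in the text (a cluster of 1000 copies, good function up to 25% defective copies, essentially no reproduction at 50%). The linear fall-off of viability between the two thresholds, the uniform spread of defective-copy counts among individuals, and all names are assumptions made only for illustration.

```python
def viability(defective, total=1000, ok_fraction=0.25, dead_fraction=0.50):
    """Assumed dose-response curve: full viability up to ok_fraction defective
    copies, none at or beyond dead_fraction, linear in between."""
    frac = defective / total
    if frac <= ok_fraction:
        return 1.0
    if frac >= dead_fraction:
        return 0.0
    return (dead_fraction - frac) / (dead_fraction - ok_fraction)


def copies_removed_per_death(defective_counts):
    """Average number of defective copies eliminated per genetic death.

    defective_counts holds one defective-copy number per individual.
    Each individual dies with probability 1 - viability and, if it dies,
    removes all of its defective copies from the population.
    """
    expected_deaths = 0.0
    expected_removed = 0.0
    for d in defective_counts:
        p_death = 1.0 - viability(d)
        expected_deaths += p_death
        expected_removed += p_death * d
    return expected_removed / expected_deaths


# Hypothetical population: one individual at each count from 0 to 500
# defective copies, standing in for the broad range produced by unequal
# crossing-over.
population = range(0, 501)
gain = copies_removed_per_death(population)
print(f"defective copies eliminated per genetic death: {gain:.0f}")
```

For this assumed distribution each genetic death removes roughly 417 defective copies, inside the 250- to 500-fold range given in the text; without truncation the corresponding figure would be one. A steeper viability curve or a different spread of copy numbers would shift the value within that range.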

B. Interspersed Repetitive Sequences

Britten, Davidson and their colleagues have shown that a wide variety of organisms contain substantial amounts of intermediate repetitive sequences that are interspersed in a regular fashion throughout the unique-sequence DNA (35). They have suggested that families of interspersed repetitive sequences act together as control signals for the induction or repression of transcription of the neighboring single-copy DNA. Most versions of such models require that these interspersed repetitive sequences be kept relatively accurate; random genetic drift would otherwise destroy their ability to function. This poses a problem since such sequences constitute an appreciable fraction of most genomes (roughly 18% in the sea urchin), much more than the fraction we estimate can be kept accurate without truncation selection.

In the preceding section we discussed how repetitive sequences may be kept accurate with only a small contribution to the genetic load by a combination of unequal crossing-over and truncation selection. But the unequal crossing-over mechanism, at least as discussed by Smith (21, 22), is applicable only to clustered repeats. Other more drastic models for


saltatory replication have been proposed and would accomplish similar or even more rapid reductions of the diversity in repeated gene families, but insofar as such mechanisms have been described in molecular detail, they too would appear to apply only to clustered repeat sequences.

How then can we imagine that the interspersed repetitive sequences are kept accurate, distributed as they are in an orderly array of roughly 300 base-pair sequences throughout the single-copy portion of the genome? Three alternatives come to mind.

a. We may suppose that such sequences are kept accurate without any substantial reliance on truncation selection in the same way as the conserved portion of the single-copy DNA. That would mean one genetic death for every one or two mutations in such sequences. However, as argued in Section III, we believe only about 0.8% of a mammalian genome can be kept accurate in this way. Assuming that mammals and sea urchins are similar in their maintenance of interspersed repetitive sequences, then we have to imagine how more than 18% of the genome in addition to that coding for protein can be kept accurate. As discussed earlier, the 0.8% figure may be in error, but in order to reach a figure as high as 18% one must make a series of assumptions, as outlined earlier, that seem in the aggregate to be quite improbable. Consequently this alternative seems unlikely.

b. We may assert that, contrary to most models, the interspersed sequences need only be kept homogeneous to retain their control function. This is possible if control sequences and recognition elements evolve in parallel so that the ability to interact properly is maintained. But such models pose problems no less severe, since the mechanisms previously postulated for maintaining homogeneity apply only to clustered repeats.

c. Finally, we may suppose that interspersed repetitive sequences do play an important role in control, and that their sequences must therefore be conserved with relative accuracy. But since the accurate maintenance of such a large fraction of the genome seems to demand truncation selection, the problem is to understand how truncation selection could work effectively to maintain interspersed repetitive sequences. We believe that control sequences of the sort postulated will have properties permitting truncation selection. However, to optimize the truncation effect, an additional mechanism is invoked, one capable of introducing homogeneity into the families of interspersed repetitive sequences.

We feel that this last alternative provides an attractive model for the maintenance of interspersed repetitive sequences. Due to space limitations, a detailed exposition of this model will be presented elsewhere (Salser and Isaacson, in preparation), but we attempt to set forth its major aspects here.

As discussed earlier, truncation selection demands first that a family


of genes contribute to a common parameter significant in natural selection. This criterion seems to be satisfied in the case of control signals for a family of genes turned on at a particular point in differentiation or in response to a particular environmental stress. Second, it is essential that there be actual truncation, that genetic death become much more likely above a critical number of mutations. For the postulated control regions to meet this latter criterion, at least two features seem essential: first, single-base changes in the relatively long control regions should result in quantitative rather than qualitative changes in regulated gene products; and second, gene products turned on by a particular family of control sequences should work synergistically toward a common goal.

Let us illustrate by considering a group of ten genes coding for ten essential proteins that together form a critical enzymic or structural complex. If a single base substitution in one control sequence eliminates an essential protein, then that one mutation is lethal and truncation selection cannot occur. If, on the other hand, several deleterious mutations are necessary to eliminate formation of the essential complex, truncation selection can occur. If, in addition, the gene products exhibit synergistic behavior, the system will function reasonably well until a certain degree of control is lost, deteriorating rapidly beyond this threshold. For the case in which products combine in a common structure, efficiency of polymerization would drop off drastically as several constituent protein concentrations fell below critical levels, and truncation would be effected. There are many possible forms of such synergistic interaction between gene products turned on in response to a shared signal.

Although such behavior fulfills the formal requirements for truncation selection, the expected efficiency is much less than that of truncation for the case of clustered genes. The missing element is a mechanism for reducing heterogeneity, a mechanism analogous to the unequal crossing-over postulated to produce homogeneity in clustered repetitive sequences. To maximize the effect of truncation, one could achieve similar effects by invoking an enzyme system designed to carry out gene conversions among the members of a particular family. It is not necessary to invoke an “intelligent” mechanism that recognizes erroneous sequences and corrects them; random conversions, by analogy with unequal crossovers, would result in some individuals with many more, and others with many fewer, defective copies than the average. Such a process sets the stage for efficient truncation selection in which many more deleterious mutations are eliminated per genetic death than otherwise possible.

Insofar as one believes that substantial fractions of the genome are accurately maintained as interspersed repeated sequences, models of this


sort should be seriously considered. The enzymic machinery required, while complex, need be no more elaborate than that proposed to account for the strange behavior of “insertion sequences” in prokaryotic cells, to name one example. It should be pointed out, however, that there is as yet no experiment clearly demonstrating that all sequences cross-hybridizing with an interspersed repetitive family are either interspersed or accurately maintained. There is evidence, moreover, that although most single-copy DNA is adjacent to interspersed “signals,” most such single-copy DNA is not active (35) and consequently probably not dependent upon functional control sequences. One therefore wonders why interspersed repetitive sequences should be kept as homogeneous as is indicated by cross-hybridization.

The cDNA cloning technique described in Salser et al. (4) should provide a powerful tool to help us test the various models proposed and answer some of the questions generated. This technique produces probes that should permit the isolation and cloning of DNA sequences surrounding the structural genes for globin and other interesting genes. Once quantities of such sequences can be obtained and analyzed in detail to detect those conserved in different species, it should be possible to design experiments enabling us to discover how eukaryotic gene expression is regulated within the framework of constraints imposed by the genetic load and to appreciate the ramifications of Haldane’s dilemma.

ACKNOWLEDGMENTS

We wish to thank, in addition to those mentioned in Salser et al. (4), R. Angerer, R. Britten, W. Fitch, M. McKenna and B. E. Wallace for helpful discussions; and J. Browne, P. Clark, H. Heindell, R. Higuchi, G. Paddock, J. Roberts, G. Studnicka and P. Zakar for research contributions. Research in the laboratory has been supported in part by USPHS grants GM 18586 and CA 15940. WS is a recipient of Public Health Service Career Development Award GM 70045. JI is supported in part by USPHS Molecular Biology Training Grant GM 1531.

REFERENCES

1. M. Kimura, Nature 217, 624 (1968).
2. J. L. King and T. H. Jukes, Science 164, 788 (1969).
3. W. Salser, S. Bowen, D. Browne, F. El Adli, N. Fedoroff, K. Fry, H. Heindell, G. Paddock, R. Poon, B. Wallace and P. Whitcome, FP 35, 23 (1976).
4. W. Salser, J. Browne, P. Clarke, H. Heindell, R. Higuchi, G. Paddock, J. Roberts, G. Studnicka and P. Zakar, this volume, p. 177.
5. S. M. Weissman, personal communication.
6. N. J. Proudfoot and G. G. Brownlee, Br. Med. Bull., in press (1976).
7. J. B. S. Haldane, J. Genet. 55, 511 (1957).
8. J. F. Crow, Proc. Berkeley Symp. Math. Statist. Probability, 6th, pp. 1-22 (1972).
9. J. Maynard Smith, Nature 219, 1114 (1968).
10. M. Kimura, J. Genet. 57, 21 (1960).
11. B. Wallace, “Genetic Load: Its Biological and Conceptual Aspects.” Prentice-Hall, Englewood Cliffs, New Jersey, 1970.
12. J. B. S. Haldane, Am. Nat. 71, 337 (1937).
13. H. J. Muller, Am. J. Hum. Genet. 2, 111 (1950).
14. H. J. Muller, Acta Genet. Statist. Med. 6, 157 (1956).
15. M. McKenna, in “Phylogeny of the Primates” (F. Szalay and W. P. Luckett, eds.), pp. 21-46. New York, 1976.
16. M. McKenna, personal communication.
17. “Handbook of Biochemistry: Selected Data for Molecular Biology,” 2nd ed., H. Sober, ed., H58ff. Chemical Rubber Co., Cleveland, Ohio, 1968.
18. L. LeCam, J. Neyman and E. Scott, eds., Proc. Berkeley Symp. Math. Statist. Probability, 6th (1972).
19. D. E. Kohne, Q. Rev. Biophys. 3, 327 (1970).
20. K. Fry, R. Poon, P. Whitcome, J. Idriss, W. Salser, J. Mazrimas and F. Hatch, PNAS 70, 2642 (1973).
21. G. P. Smith, CSHSQB 38, 507 (1973).
22. G. P. Smith, Science 191, 528 (1976).
23. A. Schalet, Genetics 63, 133 (1969).
24. K. Tartof, CSHSQB 38, 491 (1973).
25. E. M. Southern, Nature 227, 794 (1970).
26. P. M. B. Walker, Prog. Biophys. Mol. Biol. 23, 145 (1971).
27. K. Fry and W. Salser, in preparation.
28. F. Hatch and J. Mazrimas, BBA 244, 291 (1970).
29. F. Hatch and J. Mazrimas, NARes 1, 559 (1974).
30. S. Wright, Genetics 16, 97 (1931).
31. R. A. Fisher, “The Genetical Theory of Natural Selection,” Oxford Univ. Press (Clarendon), London and New York, 1930; Dover Press, New York, 1958 (rev. ed.).
32. W. Salser, M. Fluck and R. Epstein, CSHSQB 34, 513 (1969).
33. M. Fluck, W. Salser and R. Epstein, in preparation.
34. R. F. Goldberger, Science 183, 810 (1974).
35. E. Davidson and R. J. Britten, QRB 48, 565 (1973).

The Chromosomal Arrangement of Coding Sequences in a Family of Repeated Genes

G. M. RUBIN,¹ D. J. FINNEGAN² AND D. S. HOGNESS

Department of Biochemistry
Stanford University School of Medicine
Stanford, California

We are interested in studying the chromosomal arrangement of DNA sequences that code for mRNAs in Drosophila melanogaster, and in analyzing adjacent sequences that may control their expression. In order to isolate individual segments coding for particular mRNAs, we have constructed a set of hybrid DNA molecules by joining [with the aid of terminal transferase (1)] the bacterial plasmid ColE1 to sheared fragments of D. melanogaster embryonic DNA. From among these, we have identified a single DNA segment that contains sequences homologous to approximately 1% of the mass of cytoplasmic poly(A)-containing RNA from D. melanogaster tissue culture cells. Some of the properties of this hybrid, which we have called cDm412, are shown in Fig. 1 (2, 3). The mRNA species complementary to cDm412 is 6000-7000 nucleotides long as determined by polyacrylamide gel electrophoresis in 96% formamide. The sequences on cDm412 homologous to this message are confined to the restriction fragments A, B, C, D, E and F (Fig. 1). These are internal fragments and span a distance of 9500 nucleotides, indicating that cDm412 can carry only one copy of this mRNA sequence. The poly(A)-containing end of the RNA lies in fragment A, suggesting that transcription is from right to left on the map (Fig. 1; 3). In addition to sequences complementary to an abundant mRNA, cDm412 also contains sequences representative of several families of moderately repetitive DNA sequences. Studies on the distribution and interrelationships of these moderately repetitive sequences are described elsewhere (2).

¹ Present address: Sidney Farber Cancer Center, Harvard Medical School, 35 Binney Street, Boston, Massachusetts 02115.
² Present address: Department of Molecular Biology, University of Edinburgh, Edinburgh EH9 3JR, Scotland.

FIG. 1. Physical map of cDm412. The thin horizontal line represents Drosophila DNA. The thick line represents DNA of the plasmid vector ColE1. The circular map has been opened at a SmaI restriction enzyme cleavage site within the ColE1 DNA to produce the linear map shown. The vertical lines represent the cleavage sites for the restriction enzymes EcoRI and HindIII. Fragments A, B, C, D, E and F are labeled on the map. A scale in kilobases (1000 nucleotides = 1 kb) is shown. The approximate location of the mRNA sequences is shown above the map.

We have determined how the sequences of cDm412 are arranged within the D. melanogaster genome by in situ hybridization of polytene salivary gland chromosomes with [3H]RNA complementary to cDm412. About 70 sites on the chromosome arms as well as the chromocenter were labeled. The cDm412 sequences homologous to mRNA must lie at one or more of these sites. In fact, about 30 sites on the chromosome arms were labeled after in situ hybridization using a probe made from fragment E, which contains only mRNA sequences. A similar pattern of labeling is seen after in situ hybridization using [3H]RNA complementary to the HindIII endonuclease fragment containing sequences C, D, E and F (Fig. 2). The entire sequence coding for the mRNA may be present at each of these sites, or alternatively the labeling at some of them might be due to homology to fragment E alone. Two lines of evidence suggest that the entire mRNA sequence is represented at most, perhaps at all, of these sites. The first comes from in situ hybridization of [3H]cRNA to fragment A. If the entire mRNA is present at each site labeled by fragment E, all these sites should also be labeled by fragment A. With the help of M. Young, we have mapped the 10 sites on the X chromosome labeled by fragment E and have compared them with those sites labeled by fragment A. Indeed, all sites on the X chromosome labeled by fragment E are also labeled by fragment A. In order to examine in more detail the sequences present at several of the 30 or so chromosomal sites, we have screened several thousand independently cloned hybrid plasmids for those that contain sequences present in fragment E. The screen was carried out by the colony hybridization method of Grunstein and Hogness (4), using [32P]cRNA to fragment E as the probe. Four of the desired hybrids were isolated, and two of these, cDm454 and cDm468, are compared with cDm412 in Fig. 3.


FIG. 2. In situ hybridization of Drosophila melanogaster polytene chromosomes with [3H]cRNA to the HindIII endonuclease fragment containing sequences C, D, E and F (Fig. 1). In situ hybridization was carried out as described previously (1). The arrow indicates labeling of the 3C2-7 region of the X chromosome.

cDm454 and cDm468 both yield fragments identical in size to fragments B, C, D and E of cDm412 after digestion with the restriction enzymes HindIII and EcoRI, and in each case these fragments show sequence homology to mRNA. By contrast, the two fragments that contain DNA complementary to the ends of the mRNA, as well as adjacent non-mRNA sequences (for example fragments A and F of cDm412), are different in each case. We conclude that: (i) the three cloned segments are derived from different chromosomal sites; (ii) these sites contain the same or very similar mRNA sequences; and (iii) the different sites contain different sequences adjacent or close to the ends of these mRNA sequences. Taken together, this evidence strongly suggests that the structural gene carried by cDm412 is present at each of approximately 30 sites in the genome. How faithful is this repetition? We have not yet obtained a quantitative measure of the degree of mismatch among the mRNA sequences at different sites. The mRNA regions could exhibit small differences due either to third-position variation in codon sequences that are otherwise identical, or to slight perturbations in the amino-acid sequences coded from each mRNA. However, the observation that the EcoRI and HindIII cleavage sites are identically distributed within the mRNA

FIG. 3. A comparison of three independent hybrid plasmids (cDm412, cDm454 and cDm468) that have sequence homology to fragment E. The left-hand portion of each panel shows the fragments generated by digesting the plasmids with both of the restriction endonucleases EcoRI and HindIII. The fragments were separated by electrophoresis through 1.4% agarose gels containing 0.09 M Tris-borate, 3 mM EDTA, 2 µg/ml ethidium bromide, pH 8.4. The gels were photographed under UV illumination. The DNA in each gel was transferred to a nitrocellulose filter by the procedure of Southern (5). In order to determine which restriction enzyme fragments contained sequences complementary to mRNA, 32P-labeled poly(A)-containing cytoplasmic RNA from D. melanogaster tissue culture cells (6) was hybridized to each filter. Hybridization reactions were carried out for 16 hours at 43°C in 0.1 M sodium phosphate, 0.6 M NaCl, 0.06 M sodium citrate, 50% formamide, pH 7, containing 200 µg of poly(A) per milliliter. The right-hand portion of each panel shows the autoradiograph of the filter after unhybridized RNA had been removed by RNase treatment and successive washes in hybridization buffer and then with 0.3 M NaCl, 0.03 M sodium citrate. In the digest of cDm412, fragment F comigrates with fragment C; these have been distinguished by digestion with other restriction enzymes (3).

sequences at three chromosomal sites indicates that in this small sample there is little if any variation in the number of nucleotides in each of the regions defined by these cleavages. The following experiment indicates that the region corresponding to fragment C in cDm412 is likewise invariant for most, if not all, of the chromosomal sites. Total D. melanogaster DNA was digested by EcoRI and HindIII restriction endonucleases and the resulting fragments fractionated according to length by electrophoresis in a 1.4% agarose gel. These fragments were then denatured and transferred to a nitrocellulose filter (5), and those that contain sequences homologous to fragment C were assayed by hybridization with 32P-labeled fragment C DNA. Figure 4 shows that more than 95% of this hybridization was restricted to a single class of fragments identical in length to fragment C.

FIG. 4. Homogeneity of fragment C sequences within the genome. The left-hand panel shows the products of a combined HindIII, EcoRI digest of Drosophila melanogaster embryo DNA after separation by electrophoresis as described in the legend to Fig. 3. The DNA was transferred to nitrocellulose as before and the nitrocellulose filter was then treated by the procedure of Denhardt (7). DNA from fragment C was labeled with 32P to a specific activity of ca. 5 x 10’ cpm/µg by the “nick translation” reaction of DNA polymerase I (8) and hybridized to the filter by a modification of the Denhardt (7) procedure. The hybridization was carried out at 65°C at a salt concentration of 0.75 M NaCl, 0.075 M sodium citrate. After 36 hours of incubation, the filters were washed exhaustively with 0.3 M NaCl, 0.03 M sodium citrate at 65°C. A radioautograph of the filter is shown in the right-hand panel.

We wish to know which of these repeated genes are transcribed in which cell types, and to map the sequences that are transcribed for at least some of these genes. Restriction fragments adjacent to the mRNA sequence should be useful probes for these purposes. We know that sequences adjacent or very close to some copies of the mRNA sequence are different. If this is generally true, and if these sequences are transcribed, then we should be able to solve these problems by the use of such probes. One of the sites labeled by fragment E is within the genetically well defined 3C2-7 region of the X chromosome (9), and we anticipate that this will allow a combined genetic and biochemical attack on the genes in this region.
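The reasoning used earlier for the three clones (internal fragments of identical size, flanking fragments that differ) amounts to comparing the fragment lengths of double digests. The sketch below illustrates that comparison in Python. The site coordinates are invented for illustration and are not the measured maps of cDm412, cDm454 or cDm468; each cloned segment is treated as a linear molecule, the vector is ignored, and the names `digest` and `shared_fragments` are hypothetical.

```python
from collections import Counter

def digest(length, sites):
    """Fragment lengths obtained by cutting a linear molecule of the given
    length at every listed restriction site position."""
    cuts = sorted({s for s in sites if s in range(1, length)})
    edges = [0] + cuts + [length]
    return [b - a for a, b in zip(edges, edges[1:])]

def shared_fragments(clones):
    """Fragment sizes (with multiplicity) present in the digest of every clone."""
    digests = [Counter(digest(length, sites)) for length, sites in clones.values()]
    common = digests[0]
    for other in digests[1:]:
        common = common & other  # multiset intersection
    return sorted(common.elements())

# Hypothetical combined EcoRI + HindIII site positions (nucleotides).
# The same internal spacing (1500, 900, 1200, 1800) is embedded in
# three different flanking contexts.
clones = {
    "clone_1": (14000, [1000, 3000, 4500, 5400, 6600, 8400, 11000]),
    "clone_2": (12000, [600, 2500, 4000, 4900, 6100, 7900, 9300]),
    "clone_3": (13000, [1700, 4000, 5500, 6400, 7600, 9400, 10500]),
}

if __name__ == "__main__":
    for name, (length, sites) in clones.items():
        print(name, digest(length, sites))
    print("shared:", shared_fragments(clones))  # prints shared: [900, 1200, 1500, 1800]
```

Fragments of equal length need not have equal sequence, which is why the gel comparison in the text is backed by hybridization with labeled RNA; the sketch covers only the size comparison.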


REFERENCES

1. P. C. Wensink, D. J. Finnegan, J. E. Donelson and D. S. Hogness, Cell 3, 315 (1974).
2. D. J. Finnegan, G. M. Rubin, D. J. Bower and D. S. Hogness (1976). In preparation.
3. G. M. Rubin, D. J. Finnegan and D. S. Hogness (1976). In preparation.
4. M. Grunstein and D. S. Hogness, PNAS 72, 3961 (1975).
5. E. M. Southern, JMB 98, 503 (1975).
6. G. M. Rubin and D. S. Hogness, Cell 6, 207 (1975).
7. D. T. Denhardt, BBRC 23, 641 (1966).
8. F. H. Schachat and D. S. Hogness, CSHSQB 38, 371 (1973).
9. G. Lefevre and M. M. Green, Chromosoma 36, 391 (1972).

Heterogeneity of the 3’ Portion of Sequences Related to Immunoglobulin κ-Chain mRNA

URSULA STORB

Department of Microbiology and Immunology
University of Washington
Seattle, Washington

Immunoglobulin κ-chains consist of a v-region, the portion that combines with antigen, located in the NH2-half of the molecule, and a c-region in the COOH-half, each comprising approximately 110 residues. The c-region is generally identical in all κ-chains (1) and is coded for by unique genes (2-4; Wilson and Storb, unpublished). In order to determine whether T-lymphocytes have the potential for immunoglobulin synthesis, we searched for the presence in T-cells of RNA molecules related to cκ mRNA by the use of a cDNA probe (5). In the course of these experiments, evidence was obtained that cκ genes, although “unique,” represent a small family of sequences that diverge by as much as 12%. Immunoglobulin κ-chain mRNA was prepared from membrane-bound ribosomes of MOPC-41 mouse myeloma tumors by hot phenol extraction, oligo(dT)-cellulose chromatography, sucrose gradient centrifugation, and polyacrylamide gel electrophoresis (5). The κ-mRNA appeared as a single band of approximately 13 S in polyacrylamide gel electrophoresis and was translationally pure in the wheat-germ system. The mRNA was transcribed by the reverse transcriptase of avian myeloblastosis virus (AMV) into cDNA labeled with [3H]dCTP (6). The κ-cDNA was pure as determined by comparing its hybridization kinetics with its template mRNA with those of a kinetic standard, mouse globin cDNA-mRNA; the Crt1/2 values were 4.3 x 10^-3 and 4.6 x 10^-3, respectively (5). From alkaline sucrose gradients, the κ-cDNA was approximately 400 nucleotides long.
Abbreviations: cκ, c region of κ-chains; Crt, concentration of RNA in hybridization reaction x the time of incubation, expressed as moles nucleotide x sec/liter; Crt1/2, Crt at which the hybridization reaction has proceeded to one half completion; T- (B-) cells = thymus (bursa)-dependent lymphocytes; v- (c-) region = variable (constant) region.

Since the transcription of mRNA was performed in an approximately 60-fold molar excess of oligo(dT) over poly(A), it is presumed that the cDNA did not contain a transcript of the poly(A) region, and therefore probably corresponded to the 3’ untranslated region and part of the sequences coding for the cκ region, but not the vκ region of the mRNA (7).

Figure 1 shows the hybridization kinetics of κ-cDNA with whole cell RNAs extracted by a hot phenol method (8) from various tissues and cells. All RNA preparations had been treated with DNase and were essentially free of DNA contamination (0-2%). Except for RNA of liver and of a tissue culture thymoma where the reactions had not been carried to completion, the RNAs tested protected the κ-cDNA almost completely from S1 nuclease digestion. The reactions with liver and thymoma RNA proceeded at the same rate as with the other RNAs. The results indicate that lymphocytes of both the B- and T-cell variety [the thymus and thymoma cell preparations contained 99.8% and 100% theta-positive cells, respectively (5)] contain RNA sequences homologous to approximately

FIG. 1. Hybridization of κ-cDNA with whole cell RNAs. κ-[3H]cDNA was hybridized with an excess of whole cell RNAs (MOPC-41; Swiss thymus; C3H thymoma tissue culture; BALB/c spleen; liver) in 0.24 M Pi (pH 6.8), 0.5 mM EDTA at 67°C. Hybrids were analyzed by treatment with S1 nuclease (12). Background S1 nuclease resistance is shown at Crt 0. The Crt1/2 values are indicated by horizontal bars. For the calculation of the Crt1/2 of liver RNA, it was assumed that 90% of the hybridization would be completed within 100 Crt.

400 nucleotides of the 3' portion of κ-mRNA. The liver RNA was positive probably due to circulating lymphocytes; no precautions had been taken to remove blood from this very vascular organ when the RNA was extracted. The Crt1/2 of the hybridization with whole cell MOPC-41 myeloma RNA was 1.05, indicating that the tumor contains approximately 0.4% κ-mRNA. The κ-cDNA was also hybridized with whole-cell RNAs of a variety of other mouse myelomas and lymphomas (data not shown). All tumor RNAs tested protected the cDNA completely; the Crt1/2 of the hybridization reactions corresponded to the quantities of κ-chains produced by the various tumors, respectively. The hybrids between the κ-cDNA and MOPC-41 κ-mRNA had a sharp thermal denaturation profile with tm of 93°C (Fig. 2) as evidence that the κ-cDNA was a faithful transcript of the κ-mRNA. The same sharp thermal transition and tm was also observed in hybrids between the κ-cDNA and whole-cell MOPC-41 RNA, the tumor from which the κ-mRNA had been prepared, indicating that in the presence of 99.6% other cellular RNAs κ-mRNA formed completely matched hybrids. There is also a small lower-melting component seen with MOPC-41 whole cell RNA, which is discussed below.

FIG. 2. Thermal stability of hybrids between κ-cDNA and various RNAs or DNAs. Hybridization mixtures sealed in capillaries were incubated at 67°C long enough to achieve maximal hybridization. Melting profiles were obtained by raising the temperature in 5°C increments and quick-freezing duplicate capillaries at each step for the determination of S1-nuclease-resistant cpm. The BALB/c DNA (MOPC-41) and Swiss DNA (Krebs ascites cells) were sheared in the French press to a single-strand size sedimenting at 10.1 S in alkaline sucrose gradients; DNA/cDNA ratios were approximately 4 x 10'; at C0t 5000, 56.8 and 59.5% of the cDNA was hybridized with the DNAs. The tm's are given in parentheses: Swiss thymus (82°); C3H thymoma (84.5°); Swiss spleen (81.3°); BALB/c spleen (82°); MOPC-41 (92°); κ-mRNA (93°). Temperature axis: 65 to 100°C.

All heterologous RNAs tested formed hybrids with κ-cDNA, which were apparently mismatched by maximally 8.5 to 11.7% (9) as indicated by the lowered tm's of their thermal denaturation profiles (Fig. 2). This finding may be due to the following factors: impurity of the κ-cDNA, smaller size of the heterologous RNAs, differences in the 3’-terminus of κ-mRNA of different mouse strains, and the presence of several different cκ genes. It appears unlikely that the hybrids of heterologous RNAs were mainly with sequences unrelated to κ-chains, which may be present in the κ-cDNA. The hybridization kinetics of κ-cDNA with κ-mRNA indicated a relatively high purity when compared with a globin cDNA standard. Furthermore, the heterologous RNAs hybridized with the total hybridizable κ-cDNA, whereas lower plateau levels would be expected if only contaminants of the cDNA had reacted. Control experiments showed that the heterologous RNAs were large enough to form completely stable hybrids (5). It was also found that mismatched hybrids were formed with RNAs from the same mouse strain (BALB/c) in which the MOPC-41 tumor originated. Figure 2 shows that hybrids with BALB/c spleen RNA were mismatched. In addition, the RNAs of other BALB/c myelomas formed mismatched hybrids with the MOPC-41 κ-cDNA (data not shown). Finally, hybrids between the κ-cDNA and genomic DNA of MOPC-41 tumors (BALB/c) and genomic DNA of Krebs ascites cells (Swiss) give identical thermal stability profiles with tm's of 90°C (Fig. 2). These results indicate that the mismatched hybrids are due to nucleic acid sequences present in the BALB/c strain as well as in outbred Swiss mice.
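The 0.4% abundance figure quoted earlier for κ-mRNA in whole-cell MOPC-41 RNA follows from a ratio of Crt1/2 values: in RNA-excess hybridization the Crt1/2 scales inversely with the fraction of the driver RNA that is complementary to the probe. A minimal Python sketch of that arithmetic follows. It takes the Crt1/2 of the pure κ-cDNA/κ-mRNA reaction as 4.3 x 10^-3, which is an assumption about the exponent (chosen because it reproduces the stated 0.4%), and the function name `mrna_fraction` is hypothetical.

```python
def mrna_fraction(crt_half_pure, crt_half_mixture):
    """Fraction of a total RNA preparation made up of the probed mRNA,
    estimated as the ratio of the Crt1/2 of the pure mRNA reaction to the
    Crt1/2 observed with the unfractionated RNA."""
    if crt_half_pure <= 0 or crt_half_mixture <= 0:
        raise ValueError("Crt1/2 values must be positive")
    return crt_half_pure / crt_half_mixture

if __name__ == "__main__":
    pure = 4.3e-3      # assumed Crt1/2 of the pure kappa-mRNA reaction
    whole_cell = 1.05  # Crt1/2 with whole-cell MOPC-41 RNA, from the text
    fraction = mrna_fraction(pure, whole_cell)
    print(f"estimated kappa-mRNA content: {100 * fraction:.2f}%")  # 0.41%
```

The complementary figure, about 99.6% other cellular RNA, is the one used in the discussion of matched hybrids above.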
Considering the fact that the hybrids with MOPC-41 DNA were also mismatched, the results suggest the existence of several cκ genes with sequences divergent by maximally 11.7%. Approximately 10% of the κ-cDNA-DNA hybrids melted above 95°C as expected for well-matched DNA·DNA hybrids. Thus there may exist as many as ten cκ genes. The divergence appears to be present in sequences of both the c-region and the 3’-untranslated region of κ-mRNA because the hybrids with spleen and thymus RNAs lack any high melting components. It must be assumed that the spleen and thymus contain a mixed population of cells that express different cκ genes. Possibly the low-melting component seen in hybrids with whole-cell myeloma RNA is due to infiltration of the tumor by circulating lymphocytes that express cκ genes other than the MOPC-41 tumor. There is also some recent evidence, from amino-acid sequencing data of immunoglobulin L-chains, that c-regions may not be unique. As many as 8 different human λ-chain c-regions (10) and multiple cκ regions in rat (11) have been reported. These findings and the hybridization data described here have interesting implications for the organization of immunoglobulin genes and the control of their expression. DNA excess hybridization of mouse genomic DNA with the κ-cDNA had indicated that the cκ-region is coded for by unique genes (2-4; Wilson and Storb, unpublished). Obviously, DNA excess hybridization does not allow the distinction between one and maybe ten genes, since ideal kinetic standards are almost impossible to obtain. The results reported here indicate that a unique gene may be a member of a relatively small family of closely related genes.
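The mismatch estimates quoted earlier (8.5 to 11.7%) are consistent with the common rule of thumb that each 1°C lowering of the melting temperature corresponds to roughly 1% mismatched bases. The Python sketch below applies that rule to the tm's listed with Fig. 2, using the 93°C of the κ-cDNA/κ-mRNA hybrid as the reference. The 1%-per-degree factor is an assumed approximation rather than a value stated in the text, the decimal placement of two of the tm's (84.5 and 81.3) is inferred from the quoted range, and `percent_mismatch` is a hypothetical name.

```python
def percent_mismatch(tm_reference, tm_hybrid, percent_per_degree=1.0):
    """Approximate percent base mismatch from the depression of the melting
    temperature relative to a perfectly matched reference hybrid."""
    return (tm_reference - tm_hybrid) * percent_per_degree

# tm's in degrees C as listed with Fig. 2 (decimal placement partly inferred)
TM_REFERENCE = 93.0  # kappa-cDNA with purified kappa-mRNA
tm_values = {
    "Swiss thymus": 82.0,
    "C3H thymoma": 84.5,
    "Swiss spleen": 81.3,
    "BALB/c spleen": 82.0,
    "MOPC-41 whole-cell RNA": 92.0,
}

if __name__ == "__main__":
    for source, tm in tm_values.items():
        estimate = round(percent_mismatch(TM_REFERENCE, tm), 1)
        print(f"{source}: about {estimate}% mismatch")
```

With these inputs the extremes among the heterologous RNAs come out at 8.5% (C3H thymoma) and 11.7% (Swiss spleen), matching the range given in the text.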

ACKNOWLEDGMENTS

I thank Lisa Hager for excellent technical assistance and Dr. Joseph Beard for reverse transcriptase. Supported by NIH grants AI 10685 and DE 02600.

REFERENCES

1. N. Hilschmann and L. C. Craig, PNAS 53, 1403 (1965).
2. P. Leder, T. Honjo, S. Packman, D. Swan, M. Nau and B. Norman, PNAS 71, 5109 (1974).
3. J. Stavnezer, R. C. C. Huang, E. Stavnezer and J. M. Bishop, JMB 88, 43 (1974).
4. C. H. Faust, H. Diggelmann and B. Mach, PNAS 71, 2491 (1974).
5. U. Storb, L. Hager, D. Putnam, L. Buck, F. Farin and J. Clagett, PNAS 73, 2467 (1976).
6. I. M. Verma, G. F. Temple, H. Fan and D. Baltimore, Nature NB 235, 163 (1972).
7. C. Milstein, G. G. Brownlee, E. M. Cartwright, J. M. Jarvis and N. J. Proudfoot, Nature 252, 354 (1974).
8. U. Storb, J. Immunol. 108, 755 (1972).
9. R. J. Britten, D. E. Graham and B. R. Neufeld, in “Methods in Enzymology,” Vol. 29, p. 363. Academic Press, New York, 1974.
10. J. W. Fett and H. F. Deutsch, Immunochemistry 13, 149 (1976).
11. G. A. Gutman, E. Loh and L. Hood, PNAS 72, 5046 (1975).
12. V. Vogt, EJB 33, 192 (1973).


Structural Studies on Intact and Deadenylylated Rabbit Globin mRNA¹

JOHN N. VOURNAKIS, MARCIA S. FLASHNER, MARYANN KATOPES, GARY A. KITOS, NIKOS C. VAMVAKOPOULOS, MATTHEW S. SELL AND REGINA M. WURST

Department of Biology
Syracuse University
Syracuse, New York

I. Introduction

It is a remarkable testament to the rate of progress in molecular biology that this compendium of current research on various aspects of messenger RNA (mRNA) structure and function can be published only 20 years since an RNA with the biochemical properties of mRNA was first reported (1), and 15 years since the original mRNA hypothesis was stated by Jacob and Monod (1a) and supported by Brenner et al. (2). One of the major unanswered questions that has received attention throughout the short history of mRNA research concerns the detailed relationship between the molecular structure and the function of messenger RNA. Messenger RNA, in solution, may assume specific conformational properties, e.g., stable helical regions, that can be important in aspects of its function. Eukaryotic messenger RNA has a complex life cycle (3). It must interact with a rather large number of different proteins from the moment of initial transcription, through processing and packaging for transport to the cytoplasm, during initiation, elongation and termination of protein synthesis, until it is finally degraded by cytoplasmic nucleases. Unique primary structural properties, e.g., the 5’-cap structures² (4) and the 3’-OH poly(adenylic acid) sequences³ (5), most likely play a major determinative role in some of these processes. It is plausible that specific secondary and/or tertiary structural features are also involved as recognition sites for the interaction of some of these proteins with mRNA.

¹ See Dedication, p. xxvii [Eds.].
² See articles in Part I of this volume.
³ See article by Edmonds et al. in this volume.


This paper presents data that suggest that specific helical regions of discrete length exist in rabbit globin mRNA, that the thermal stability of the molecule is enhanced by the presence of Mg2+ ions, and that the degree of helicity is sensitive to rather small changes in pH. Some studies, comparing intact mRNAs with mRNA that has had the 3’ polyadenylate sequence specifically removed, provide evidence that poly(A) is not a major determinant of secondary structure.

II. Is mRNA Structure Random or Specific?

The detailed mechanisms of mRNA function cannot be understood, at the molecular level, until information concerning the conformation and stability of mRNA in solution is obtained. This view is not held by some who suggest that mRNA secondary structure will prove to be nonspecific and random, with no particular functional significance. It is really this issue that must be addressed: Do specific structural features exist? Are they functional in the regulation of the interaction of proteins with mRNA? Early studies on the relationship of mRNA structure to function focused on the mRNA of the bacteriophages R17, MS2 and f2. These molecules were used as templates for the study of in vitro protein synthesis (6, 7). During the mid-1960s, it was suggested, from physical studies, that these phage mRNAs have extensive secondary structure, with between 63 and 82% of their nucleotides in helical regions (8-12). The postulate that secondary structure in these molecules is functional in the control of protein synthesis is supported by the work of Adams et al. (13, 14) on sequences of proposed ribosome binding and protein synthesis initiation sites. These papers suggest, based on sequence analysis of a 57-nucleotide-long region of the coat-protein cistron of R17, that genetic code degeneracy is employed in such a way as to maximize the predicted extent of base-pairing. The degeneracy in the triplet code allows for base-pair formation without altering the amino-acid sequence of the coat protein. This idea is supported by Ball (15), who suggests that the amino-acid sequence of a protein evolves in a direction to enhance the secondary structure of its mRNA. Ball’s analysis (16) of the codons in predicted helical and single-stranded regions of the MS2 coat protein gene (17) demonstrates that codons for the most conserved amino acids tend to exist in proposed base-paired regions.
The above lines of inquiry argue in favor of the existence of specific secondary structure in mRNA, with the implication that the role of such structure is to enhance its functional stability. A contrary viewpoint exists, based on the theoretical considerations


of Fitch (18) and of Gralla and DeLisi (19). These workers demonstrated that computer-generated random RNA sequences of various lengths can have an average of 50-60% secondary structure. This implies that no evolutionary pressure toward base-pairing need be postulated, and that any mRNA molecule should have “random” secondary structure that may be of no particular functional interest. Richard and Salser (20) compare the thermal stabilities of chemically synthesized polyribonucleotides of random sequence and 16 S E. coli rRNA. Both RNAs are highly helical with 50-60% base-pairing. But the helical regions of the random-sequence RNA are of a disorderly type, probably including a high degree of base mismatching as measured by lower cooperativity of melting and higher sensitivity to T1 RNase compared to rRNA. Similar results were obtained by Holder and Lingrel (21) in studies comparing the thermal transitions of rabbit globin mRNA and random sequence RNA. It seems unlikely, given the limited data currently available, that natural mRNAs have only secondary structure of the random, disorderly type. Several studies provide evidence that specific helical regions exist in natural mRNAs, and that the secondary structure may be functional. Gralla et al. (22) and Hilbers et al. (23) have isolated and performed physical studies on the 59-nucleotide fragment from R17 bacteriophage mRNA that is protected from ribonuclease digestion by the binding of R17 coat-protein. This piece contains the ribosome binding site and the initiator codon for the replicase gene, previously sequenced by Steitz (24). Melting curves derived by temperature-jump procedures and high resolution proton magnetic resonance spectroscopy demonstrate that the fragment contains two stable helices, as predicted from the sequence data of Bernardi and Spahr (25). Also, convincing evidence is presented that the R17 coat-protein is able to bind to the helix at the replicase ribosome binding site.
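The random-sequence argument discussed above, that arbitrary RNA sequences can by chance form a good deal of base-pairing, is easy to explore computationally. The Python sketch below generates random sequences and finds the maximum number of nested base pairs with the Nussinov dynamic-programming recurrence. This is a swapped-in, purely combinatorial technique, not the procedure used by Fitch or by Gralla and DeLisi: it ignores stacking energies and loop penalties, so it gives an upper bound on pairing rather than an estimate of stable structure. The allowed pairs (A-U, G-C, G-U), the minimum hairpin loop of three bases and all function names are assumptions made for illustration.

```python
import random

# Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (Nussinov recurrence), requiring
    at least `min_loop` unpaired bases inside every hairpin."""
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i + 1][j]  # case 1: base i stays unpaired
            for k in range(i + min_loop + 1, j + 1):
                if (seq[i], seq[k]) in PAIRS:  # case 2: base i pairs with k
                    inside = best[i + 1][k - 1]
                    outside = best[k + 1][j] if k + 1 <= j else 0
                    score = max(score, 1 + inside + outside)
            best[i][j] = score
    return best[0][n - 1]

def paired_fraction(seq):
    """Fraction of bases involved in the maximal set of pairs."""
    return 2 * max_pairs(seq) / len(seq)

def random_rna(length, rng):
    """Random sequence with equal base frequencies."""
    return "".join(rng.choice("ACGU") for _ in range(length))

if __name__ == "__main__":
    rng = random.Random(0)
    fractions = [paired_fraction(random_rna(60, rng)) for _ in range(10)]
    print("mean maximal paired fraction:", sum(fractions) / len(fractions))
```

Because it counts every compatible pair regardless of stability, the recurrence will generally report more pairing than the 50-60% quoted in the text; the point of the sketch is only that substantial pairing is available to sequences with no evolutionary history.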
These papers present a strong argument that R17 mRNA secondary structure is involved both in mRNA recognition by ribosomes, and in the control of translation via repression by the binding of coat protein. Lodish (26) also obtained support for the notion that translation of mRNA is controlled by secondary structure. Upon disruption of secondary structure by reaction with formaldehyde, f2 mRNA would initiate the synthesis of some artifact polypeptides whose sequence does not correspond to that of any known f2 proteins. Also, formaldehyde-treated f2 mRNA initiates the synthesis of a great excess of certain f2-specific proteins, implying that accurate initiation of protein synthesis requires secondary structure. Richard and Salser (27) have reported that at least two ribosomal binding sites exist in the lysozyme cistron of T4 phage, one of which is masked and cannot be located without treatment with both RNase and heat denaturation. This implies that lysozyme mRNA


contains double-stranded regions that may be involved in the initiation of translation. Scherrer (28) proposes that there are specific helical regions in duck globin mRNA that act as sites for the binding of messenger-specific proteins involved in the formation of polysome-derived mRNP particles. This model, based on some electron microscopic studies and on ethidium bromide binding data, suggests that these proteins associate with specific regions in the mRNA high in secondary structure. It can be concluded that there exists at the present time some information, derived from a limited number of systems, that relates mRNA structure to its function. It is our overwhelming impression, however, that the structure-function issue will not become clarified until a great deal of evidence is accumulated that relates primary sequences to the conformational properties of many mRNAs.

III. Eukaryotic mRNA Structure

Recent advances in the ability to purify relatively large amounts of eukaryotic mRNA (5) have resulted in some progress in understanding their primary and secondary structure. Brownlee et al. (29)⁴ developed a technique for obtaining highly radioactive cDNA copies of the 3’-hydroxyl regions of several mRNAs immediately adjacent to the terminal poly(A) sequences. It is now well established that there are noncoding sequences in mammalian mRNAs at their 5’ and 3’ ends (30). Proudfoot⁴ (30) and Milstein et al. (31) have sequenced portions of the 3’ noncoding regions of rabbit α and β globin, and mouse immunoglobulin light-chain mRNAs. A striking result is that sequence and proposed secondary-structure homologies exist among the three mRNAs. It is possible that the noncoding regions contain binding sites for specific proteins involved in mRNA function. The predicted helical regions are short, approximately 5 to 8 base-pairs. A small number of attempts to study the structure of eukaryotic mRNAs directly with biophysical methods have been published. Direct evidence for the occurrence of secondary structure in rabbit globin mRNA was obtained by Holder and Lingrel (21) from studies of thermal transitions by ultraviolet spectroscopy. They find that 58-63% of the bases are in helical regions, which melt in a highly cooperative manner. Similar results were obtained by Favre (34). Differential melting curves indicate that globin mRNA may have three distinguishable temperature-transition regions that are sensitive to the concentration of sodium chloride (21), and that seem to denature independently of one another.

⁴ See article by Proudfoot et al. in this volume.

INTACT AND DEADENYLYLATED GLOBIN mRNA

There are no definitive results, beyond those mentioned above, concerning the extent and organization of secondary structure in eukaryotic messenger RNA. The data presented below represent the first attempts of this laboratory to contribute information to this area.

IV. Purification and Deadenylylation of Rabbit Globin mRNA

Rabbit globin mRNA was obtained from New Zealand female albino rabbits as described by Nienhuis et al. (35). All preparations were purified by using a sequence of oligo(dT)-cellulose (Collaborative Res., Inc.) affinity chromatography and sucrose-gradient centrifugation steps (36). Some highly purified mRNA samples were radioiodinated (Na¹²⁵I, Amersham Searle, Inc.) as described (37). Deadenylylated globin mRNA was prepared by specifically removing the 3’-hydroxyl polyadenylate sequences using hybrid nuclease (RNase H; EC 3.1.4.34) (38). mRNA that contains poly(A) sequences we refer to as “intact.” Vournakis et al. (39) have demonstrated that electrophoretic homogeneity of individual mRNA species increases following deadenylylation, as a result of the removal of poly(A) sequences that are inherently variable in length. Samples of ¹²⁵I-labeled intact and deadenylylated globin mRNA were denatured and analyzed by high-resolution polyacrylamide/formamide slab gel electrophoresis. Figure 1 is a composite of such data. Mobility in these gels is linearly related to the logarithm of the RNA chain length, and all RNAs are assumed to be completely denatured in the presence of 98% formamide (40). The gels were calibrated by Maniatis et al. (40) using DNA restriction enzyme fragments of specific length as markers. RNA does not have exactly the same mobility as DNA in these gels, hence the lengths of RNA cannot be established exactly. However, if it is assumed that α and β rabbit globin mRNA have nucleotide lengths 630 and 710, respectively (41), then the deadenylylated lengths are calculated to be 600 and 680, respectively. These values are in good agreement with the most recent estimates of Proudfoot (30). It is clear that both intact and deadenylylated species are free of major contamination by other RNAs.
Similar results are obtained by staining gels in which nonlabeled mRNAs have been electrophoresed. These results indicate that our purification method yields highly purified intact and deadenylylated mRNA.⁵ It is such material that is used in the structural studies described below.

⁵ Referred to as mRNA(Aₙ) and mRNA(no Aₙ) in other papers in this volume.
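
The gel calibration described in this section (mobility linearly related to the logarithm of chain length) can be illustrated with a small least-squares fit. This is only a sketch: the marker lengths and migration distances are invented for illustration and are not taken from Maniatis et al. (40) or from Fig. 1, and the function names are hypothetical.

```python
import math


def fit_mobility_vs_log_length(lengths, mobilities):
    """Least-squares fit of mobility = intercept + slope * log10(length)."""
    xs = [math.log10(n) for n in lengths]
    ys = list(mobilities)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def length_from_mobility(mobility, intercept, slope):
    """Invert the calibration line to estimate a chain length."""
    return 10 ** ((mobility - intercept) / slope)


# Hypothetical DNA markers (assumed values, for illustration only):
# chain length in nucleotides and migration distance in cm.
MARKER_LENGTHS = [1000, 500, 250, 125]
MARKER_MOBILITIES = [2.0, 5.0, 8.0, 11.0]

intercept, slope = fit_mobility_vs_log_length(MARKER_LENGTHS, MARKER_MOBILITIES)
estimated = length_from_mobility(6.5, intercept, slope)
print(f"band at 6.5 cm ~ {estimated:.0f} nucleotides")
```

As the text notes, RNA and DNA do not migrate identically in these gels, so a length read from a DNA calibration of this kind is an approximation.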


FIG. 1. Autoradiogram of electrophoretically analyzed intact (slot 1) and RNase-H-digested (slot 2) ¹²⁵I-labeled rabbit globin mRNA. Electrophoresis was in a 5% polyacrylamide/98% formamide slab gel (40), and was run for 6.5 hours at 300 V; 123,400 cpm and 64,560 cpm were layered on slots 1 and 2, respectively. The positions of α and β intact and deadenylylated globin mRNAs, and the tracking dye, xylene cyanol (XC), are indicated.

V. Optical Studies

Absorbance and circular dichroism spectroscopic measurements were used to probe the secondary structure of rabbit globin mRNA(Aₙ). Thermal denaturation data were obtained in two buffers: borate buffer (0.10 M Na borate, 1 mM EDTA, pH 8.0), or S1 nuclease buffer (0.30 M NaCl, 0.02 M Na acetate, 2 mM ZnSO4, 1 mM EDTA, 5% glycerol, pH 4.5). These particular buffer systems were chosen because the carbodiimide binding studies and S1 nuclease digestion experiments, described below, are performed in the borate and S1 buffers, respectively. The ionic strengths of these buffers are nearly the same. Table I is a summary of data derived from the thermal transition studies. Two striking facts emerge. The extent of base-pairing, estimated from the hyperchromicity values (21), is significantly greater at pH 4.5 than in the pH 8.0 buffer. The addition of Mg²⁺ ions tends to stabilize the structure of mRNA at either pH. Temperature transition curves (unpublished results) are biphasic at pH 8.0, and are cooperative, similar to the results of Holder and Lingrel (21) at comparable ionic strength. The pH 4.5 curves are also biphasic, showing a marked transition between 40° and 50°C. These curves all become more cooperative and lose their biphasic appearance in the presence of Mg²⁺. Figure 2 is a collection of circular dichroism spectra of globin mRNA obtained in the S1 nuclease buffer at several temperatures. There is a large decrease in molar ellipticity between 22° and 40°C. Much of the ellipticity is regained upon addition of Mg²⁺. This result is consistent with the results obtained from the thermal transition data. The data suggest that these molecules have extensive secondary structure that is sensitive to pH and is stabilized by Mg²⁺ ions.

TABLE Iᵃ

Bufferᵇ                 tm (°C)                   Maximal % h258ᶜ    Estimated % double-stranded basesᵈ
Borate                  61.2                      25.8               52
Borate + 5 mM MgCl2     78.0                      26.2               52
S1                      [illegible: “:is, 6”]     36.9               74
S1 + 5 mM MgCl2         78.9                      37.5               75

ᵃ Data were obtained using a Beckman 25 spectrophotometer, a Digitec Model 5810 thermometer equipped with a Yellow Springs Instruments Model 701 thermistor probe, a Neslab TP-2 thermal programmer and a Lauda K-21R water bath. The temperature was varied at a rate of 0.5°C/minute.
ᵇ See Table II and Fig. 4 for buffer compositions.
ᶜ The percentage of h258 was calculated using the following equation:

$$\%\,h_{258} = \frac{A_{258,\,95^\circ} - A_{258,\,5^\circ}}{A_{258,\,5^\circ}} \times 100$$

where $A_{258,\,5^\circ}$ is the absorbance at 258 nm at 5°C, etc. All data were corrected for thermal expansion.
ᵈ The percentage of double-stranded bases was estimated according to the procedure of Holder and Lingrel (21).
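
The hyperchromicity calculation given in footnote c of Table I can be written as a short function. This is a minimal sketch: the absorbance readings are hypothetical values chosen for illustration, and the further conversion to percent double-stranded bases (the procedure of Holder and Lingrel, 21) is not reproduced here because its details are not given in the text.

```python
def percent_hyperchromicity(a_high_temp, a_low_temp):
    """Percent increase in A258 between the low (5 C) and high (95 C) readings."""
    if a_low_temp <= 0:
        raise ValueError("low-temperature absorbance must be positive")
    return (a_high_temp - a_low_temp) / a_low_temp * 100.0


# Hypothetical absorbance readings at 258 nm, assumed to be already
# corrected for thermal expansion of the sample.
A258_AT_5C = 0.500
A258_AT_95C = 0.629

h258 = percent_hyperchromicity(A258_AT_95C, A258_AT_5C)
print(f"% h258 = {h258:.1f}")
```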

FIG. 2. Circular dichroism spectra of intact rabbit globin mRNA. Samples were in S1 nuclease digestion buffer (see text). Spectra were obtained using a Jasco-Durrum model J-20 spectropolarimeter. Molar ellipticities (deg cm² mol⁻¹) were calculated using a molar extinction coefficient of 7700. [Horizontal axis: wavelength (nm), 220-300.]

VI. Carbodiimide Binding to Globin mRNA

The water-soluble reagent N-cyclohexyl-N'-[2-(4-methylmorpholinium)]ethyl carbodiimide, commonly referred to as CMEC, can be used to probe the structure of RNA. This compound reacts specifically with unpaired guanine and uracil residues (42), forming covalent bonds. Several laboratories have used it to study the secondary structure of transfer RNA (43-46). The pattern of chemical modification with specific tRNAs seems to obey predictions based on the known three-dimensional structure of yeast phenylalanine tRNA (45, 46). There is a consistent correlation between the regions of chemical reactivity and the “exposed” (unpaired) bases in this molecule. We have preliminary evidence that mixed deacylated tRNA (Miles, Inc.) may be induced to undergo a transition from its native three-dimensional state to a partially denatured state by altering the ratio of the diimide to tRNA in the reaction mixture. It is possible that the partially denatured state obtained in this way represents a tertiary-to-secondary (cloverleaf) structural transition. It is also possible that the transition represents the unpairing of certain tertiary and hU-loop hydrogen bonds (47). Table II summarizes the tRNA carbodiimide binding data. The kinetics of these reactions has been studied (48). All reactions reach completion between 8 and 12 hours. A partial denaturation of tRNA, measured by an increase in diimide binding, occurs when the diimide/tRNA ratio is high, when the reaction temperature is 40°C, or when 10% ethanol is present. If Mg²⁺ ions are added prior to incubating at a high ratio, the tRNA is protected from denaturation. Viscosity data support the conclusion that the partial denaturation occurs between 30° and 40°C, although absorbance measurements indicate little hyperchromicity in that temperature range. The kinetics of the diimide reaction with rabbit globin mRNA at pH 8.0 in borate buffer are summarized in Table III. The total number of bases reacting is quite small, considering that approximately 40% of the molecule appears, by optical measurements, to be unpaired (Table I).

TABLE IIᵃ

Column headings: Reaction ratio (diimide per nucleotide)ᵇ; Addition of MgCl2; % EtOH; Number of bases reacted per tRNAᶜ

[The body of this table is garbled in the source and the assignment of values to rows cannot be recovered. Legible entries: reaction ratios of 5.5 and 11.0, including one sample described as “5.5, 24 hours; 11.0, 24 hours”; 10% EtOH; numbers of bases reacted per tRNA of 7.1, 15.0, 3.1, 6.3, 8.2ᵈ, 14.4ᵉ, 13.1ᶠ, 7.8, 9.9 and 13.9.]

ᵃ Borate buffer (0.10 M Na borate, pH 8.0) was used in all experiments. In some cases 0.02 M MgCl2 was added. All reactions were incubated at 30°C for 24 hours unless otherwise indicated. The diimide was N-cyclohexyl-N'-[2-(4-methylmorpholinium)]ethyl carbodiimide.
ᵇ A molar ratio of either 5.5 or 11.0 (diimide/nucleotide) was used in all cases. Reactions contained 1.8-2.2 mg/ml deacylated yeast tRNA in a total volume of 100 µl. The [¹⁴C]diimide used had a specific activity of 2.07 Ci/mol.
ᶜ The number of nucleotide bases reacted per tRNA molecule (assuming an average value of 80 nucleotides per tRNA) was determined by separating the reacted complex from free diimide. Samples were fractionated on a column (1 cm × 24 cm) containing 6.0 ml of Sephadex G-25 (Pharmacia) laid over 4.0 ml C-50 CM-Sephadex (Pharmacia). The column was equilibrated and eluted with 0.01 M phosphate buffer (pH 7.0). Fractions were counted by liquid scintillation and cpm were converted to dpm to determine the number of bases reacted.
ᵈ The incubation time was 48 hours.
ᵉ This sample was incubated at a ratio of 5.5 for 24 hours and at a ratio of 11.0 for an additional 24 hours.
ᶠ The reaction temperature was 40°C.

TABLE III

Reaction conditionsᵃ     Number of carbodiimide reacted per mRNAᵇ
Standard, 1 hour         8.4
Standard, 4 hours        20.7
Standard, 8 hours        20.1
Standard, 24 hours       23.2
40°C, 24 hours           [illegible: “:4x 4”]
10% EtOH, 24 hours       38.9

ᵃ Standard reaction conditions were as follows: borate buffer (0.10 M Na borate, pH 8.0), 1.25 mg/ml mRNA, 40:1 ratio of carbodiimide to moles of nucleotide, total volume 80 µl, 30°C incubation temperature. The carbodiimide used was the same as listed under Table II.
ᵇ The number of carbodiimide reacted per globin mRNA was calculated assuming an average length of 630 nucleotides.

A conservative estimate of the number of unpaired G’s and U’s potentially available for reaction with the diimide is about 100. The number of bases reacting increases significantly when the reaction is carried out at either 40°C or in the presence of 10% ethanol, but does not approach 100. It is premature to speculate as to what structural changes generate such carbodiimide binding behavior. However, it is unlikely that all unpaired guanines and uracils will react with the diimide since the reagent is bulky and steric hindrance, particularly in neighboring bases, may inhibit the reaction. Also, folding of the mRNA, either in a specific or nonspecific manner, may bury some available nucleotides. It is quite clear, from these data, that globin mRNA has extensive structure that inhibits most G’s and U’s from reacting with the diimide, and that the structure can be partially denatured (i.e., permitting more extensive reaction) by small temperature changes or by adding small amounts of ethanol.⁶
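
The conversion from counted radioactivity to bases reacted per RNA molecule, as outlined in the footnotes to Table II, can be sketched as follows. This is an illustration under stated assumptions, not the authors' actual worksheet: the average residue mass of 330 g/mol, the example dpm and RNA mass, and the function names are all assumptions introduced here. Only the specific activity (2.07 Ci/mol), the chain length of 80 nucleotides for tRNA, the value 23.2 from Table III, and the estimate of about 100 available G and U residues come from the text.

```python
DPM_PER_CURIE = 2.22e12  # disintegrations per minute in one curie


def bases_reacted_per_molecule(dpm, rna_mass_g, chain_length,
                               specific_activity_ci_per_mol=2.07,
                               nucleotide_mw=330.0):
    """Moles of bound diimide divided by moles of RNA chains.

    nucleotide_mw is an assumed average residue mass (g/mol); it is not a
    value given in the paper.
    """
    mol_diimide = dpm / (DPM_PER_CURIE * specific_activity_ci_per_mol)
    mol_rna = rna_mass_g / (chain_length * nucleotide_mw)
    return mol_diimide / mol_rna


def fraction_of_available_sites(bases_reacted, available_sites=100):
    """Share of the roughly 100 unpaired G and U residues estimated in the text."""
    return bases_reacted / available_sites


# Hypothetical recovery: 250,000 dpm bound to 0.2 mg of tRNA (80 nucleotides).
per_trna = bases_reacted_per_molecule(2.5e5, 2.0e-4, 80)
print(f"{per_trna:.1f} bases reacted per tRNA")

# Table III, standard conditions at 24 hours: 23.2 bases reacted per mRNA.
print(f"{fraction_of_available_sites(23.2):.0%} of the estimated available sites")
```

The sketch assumes that all of the RNA is recovered from the column and that counts have already been converted to dpm.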

⁶ Analogous to the bisulfite reaction of tRNAs (see Hayatsu in Volume 16 of this series).

VII. Polynucleotide Phosphorylase Digestion of mRNA

The 3’-exonucleolytic activity of polynucleotide phosphorylase (EC 2.7.7; Worthington) can be used to probe the composition of the 3’ end of an RNA molecule (49, 50). Beginning at a 3’-OH terminus, the enzyme digests in a 3’ to 5’ direction in a processive manner, producing nucleoside diphosphates (50). The products of the reaction can be quantified by ascending thin-layer chromatography on PEI-cellulose (39). In earlier studies with ³²P-labeled silk-moth chorion mRNAs (39), it was observed that, at 5°C, the poly(A) sequence of intact species is readily digested whereas deadenylylated mRNAs are completely resistant to attack by this enzyme, and also that sequences adjacent to the 5’ end of the poly(A) tails in intact message are resistant to the enzyme. Exposure of the deadenylylated mRNAs to the enzyme at 20°C results in a slow digestion. These observations imply that structural features adjacent to the poly(A) tail may inhibit the movement of the enzyme at low temperature. Similar results have been obtained with both intact and deadenylylated ¹²⁵I-labeled globin mRNA. Figure 3 shows a comparison of the rate of digestion of these molecules at 24°C. These are surprising results, since the enzyme must traverse the 30 to 50 nucleotides of the poly(A) of the intact molecules prior to generating ¹²⁵I-labeled CDP. At 5°C, no release of ¹²⁵I is observed in either case. The results suggest that the initiation of the digestion of deadenylylated molecules is inhibited when poly(A) is removed. It is possible that structural features, such as the helices predicted by Proudfoot (30), which should be stabilized at low temperature, act to retard the initial binding and progress of the enzyme.

FIG. 3. Polynucleotide phosphorylase digestion of intact and deadenylylated ¹²⁵I-labeled rabbit globin mRNA. Samples of RNA (<1 µg) dissolved in 20 µl of water were digested by mixing an equal volume of 2X buffer (0.10 M Tris-Cl, pH 7.5, 0.03 M MgCl2, 0.03 M potassium phosphate) with 10 µl of enzyme solution (5 mg/ml in 1X of the above buffer). Digestion products were analyzed on polyethyleneimine (PEI)-cellulose sheets as described (39). Total radioactivity per reaction was 146,000 and 140,600 for intact and deadenylylated mRNA, respectively. This is a typical example of results obtained in six separate experiments. [Horizontal axis: time (min), 0-120.]

VIII. Specific Hydrolysis of mRNA by S1 Nuclease

Nuclease digestion can be applied to study the structural properties of RNA (51, 52). The single-strand-specific nuclease S1, isolated from Aspergillus oryzae (EC 3.1.4.21), was used in this study to probe the conformation of both rabbit globin mRNA(Aₙ) and mRNA(no Aₙ). The enzyme was purified from crude α-amylase powder as described by Vogt (53), with the modifications of Rushizky et al. (52), which remove contaminating T1 and T2 ribonucleases from the preparation. The enzyme activity is assayed by the Vogt procedure (53) using sonicated and heat-denatured calf thymus DNA as substrate. Enzyme units are calculated based on the amount of denatured DNA solubilized in 10 minutes at 45°C in a standard assay. A series of S1 nuclease digestions were performed using ¹²⁵I-labeled intact and deadenylylated globin mRNA. Figure 4 summarizes kinetic data obtained on the resistance of intact globin mRNA to S1 digestion at several enzyme concentrations, between 0.6 and 2.4 units/ml, incubated at 24°C. Resistance to the enzyme was obtained by determining the (trichloroacetic) acid-precipitable radioactivity by a standard filter assay procedure (36). Data were normalized to zero-time counts (the enzyme is actually added after the zero point is obtained) and are presented as percent resistance. It is seen that in the absence of enzyme the ¹²⁵I-mRNA is stable during the 4 hours of the reaction. The mRNA is approximately 70-74% resistant to S1 digestion at the three enzyme concentrations studied. In other experiments, at a greater enzyme concentration (5 units/ml) and at higher temperatures (37° and 42°C), the percent resistance decreased significantly (results not shown). Figure 5 summarizes the effect of adding Mg²⁺ ions to a standard reaction containing 2.4 units of enzyme per milliliter. The mRNA resistance to S1 nuclease digestion appears to be insensitive to the addition of Mg²⁺, with about 65-72% of the nucleotides resistant to hydrolysis. It is possible that the enzyme is losing activity during the reaction. To test this possibility, more mRNA was added to a reaction about 75 minutes after the initial addition of enzyme. It was found that the enzyme is capable of digesting the new intact mRNA with about the same kinetics and to about the same extent as the original mRNA. This implies that the digestion slows down owing to a depletion of substrate (single-stranded regions of the mRNA molecule) rather than to inactivation of the enzyme.
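
The normalization described above (acid-precipitable counts at each time point expressed relative to the zero-time counts) amounts to the following. The time course used here is hypothetical and is not read from Fig. 4; the function name is likewise an assumption made for illustration.

```python
def percent_resistance(precipitable_cpm, zero_time_cpm):
    """Acid-precipitable counts at each time point as a percent of zero time."""
    if zero_time_cpm <= 0:
        raise ValueError("zero-time counts must be positive")
    return [100.0 * cpm / zero_time_cpm for cpm in precipitable_cpm]


# Hypothetical time course (hours and cpm per aliquot); not data from Fig. 4.
time_points_h = [0, 1, 2, 4]
cpm_per_aliquot = [6200, 4900, 4600, 4500]

resistance = percent_resistance(cpm_per_aliquot, cpm_per_aliquot[0])
for hours, value in zip(time_points_h, resistance):
    print(f"{hours} h: {value:.0f}% resistant")
```

Because only iodinated cytosine residues carry the label, a value computed this way measures the resistance of labeled positions rather than of all nucleotides.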

FIG. 4. S1 nuclease digestion of intact ¹²⁵I-labeled rabbit globin mRNA at several enzyme concentrations. The standard reaction mixture included the following components: 0.03 M Na acetate, pH 4.4; 0.30 M NaCl; 0.002 M ZnCl2; 5% glycerol; 1 µl samples of ¹²⁵I-labeled mRNA containing 1 µg of deacylated yeast transfer RNA; 1-10 µl of enzyme solution in storage buffer (0.02 M Na acetate, pH 4.6; 0.1 mM ZnSO4; 0.3 M NaCl; 50% glycerol) containing 2.5 units/µl enzyme; total volume was 100 µl. Reactions were incubated at 24°C for various lengths of time. Total radioactivity per reaction was: no enzyme, 92,400; 0.6 units/ml enzyme, 105,000 cpm; 1.1 units/ml enzyme, 57,000 cpm; 2.4 units/ml enzyme, 62,000 cpm. Aliquots (9 µl) were analyzed at each time point by precipitation with trichloroacetic acid on Whatman 3 MM filter paper disks (39) to determine the amount of nuclease-resistant RNA. [Plot symbols illegible in the source. Horizontal axis: time (hours), 0-4.]

FIG. 5. S1 nuclease digestion of intact ¹²⁵I-labeled rabbit globin mRNA at several Mg²⁺ concentrations. Standard reaction conditions (as in Fig. 4) and 2.4 units of enzyme per milliliter were used. Total radioactivity per reaction was: no enzyme, 57,300; no Mg²⁺, 62,000 cpm; 4 mM Mg²⁺, 60,500; 20 mM Mg²⁺, 57,900. Aliquots (9 µl) were analyzed for nuclease resistance. [Plot symbols illegible in the source. Horizontal axis: time (hours).]


Further tests to determine the reliability of our S1 nuclease are described in Fig. 6. It is known that synthetic polyadenylate (Miles Laboratories) is double-stranded at pH 4.5 (54), which is the pH optimum of S1 nuclease and the pH of the standard S1 reaction. Polyuridylate (Miles Laboratories) is single-stranded at this pH. It is seen that S1 completely hydrolyzes poly(U) whereas poly(A) remains intact. Samples of poly(A) were treated with formaldehyde (2% formaldehyde, 80°C, 15 minutes) prior to S1 digestion. Such treatment is known to disrupt helical regions in RNA molecules. Formaldehyde-treated poly(A) is completely digested by the enzyme. These results indicate that our S1 nuclease is free of contaminants capable of hydrolyzing double-stranded material. Our enzyme preparations are, therefore, specific for single-stranded nucleotides. Samples of intact mRNA were treated with formaldehyde and then with S1 nuclease, with the results shown in Fig. 6. The formaldehyde-treated mRNA is much more susceptible to S1 hydrolysis, indicating that a significant decrease in secondary structure occurs in comparison to untreated mRNA. The digestion by S1 of the HCHO-mRNA is not complete. Some particular structural features, e.g., stable helical segments, may not have been disrupted by the formaldehyde reaction. The formaldehyde reaction conditions are fairly mild, hence it is possible that complete denaturation does not occur during the reaction. A comparison of the S1 nuclease resistance of intact and deadenylylated ¹²⁵I-globin mRNA is seen in Fig. 7. Data obtained at two enzyme concentrations are presented. All experiments are done in the absence of Mg²⁺ ions, and are incubated at 24°C. It is striking that the percentage of nuclease resistance is insensitive to the presence or absence of the poly(A) sequences.

We began to characterize the S1 nuclease digestion products. Our initial objective was to determine the extent to which fragments of discrete lengths are generated, and to estimate their lengths and relative amounts. We subjected samples of digested intact and deadenylylated ¹²⁵I-labeled globin mRNA to polyacrylamide gel electrophoresis, using a denaturing gel (20% acrylamide, 7 M urea) capable of resolving short RNAs (40). An autoradiogram of such a gel is shown in Fig. 8. Although the photograph tends to blur the bands, it is seen that the S1 digestion generates specific, discrete-length fragments, the pattern being almost identical for both intact and deadenylylated mRNA. Samples not treated with enzyme stay, essentially, at the origin. The lengths of these fragments are estimates based on the mobilities of bromphenol blue and xylene cyanol. Maniatis et al. (40) show that, in such gels, the mobilities of small DNA markers of defined length are a linear function of the logarithm of the number of nucleotides. They also show that xylene cyanol and bromphenol blue coelectrophorese with DNA pieces 29 and 10 nucleotides long, respectively. Although our estimated lengths are uncertain, it is unlikely that they are more than a few nucleotides in error. These results imply that the secondary structure in globin mRNA, at pH 4.5, is organized in several specific helical regions of variable length. The removal of poly(A) sequences seems to have little effect on the general pattern of this structure.

FIG. 6. S1 digestion of poly(A), poly(U), and formaldehyde-treated poly(A) and ¹²⁵I-labeled rabbit globin mRNA. Standard conditions, at 24°C and 2.4 units of enzyme per milliliter, contained the following RNA samples: [³H]poly(A), specific activity 18.7 Ci/mol; [³H]poly(U), specific activity 54.2 Ci/mol; and formaldehyde-reacted [³H]poly(A), specific activity 18.6 Ci/mol. Total radioactivity was: 20,700 cpm, 2700 cpm and 8200 cpm for poly(A), poly(U) and HCHO-poly(A), and 70,300 and 60,800 for native and HCHO-mRNA, respectively. [Two panels; the curves in the second panel are labeled “native mRNA” and “HCHO-mRNA”. Plot symbols illegible in the source. Horizontal axes: time (hours).]

FIG. 7. S1 digestion of intact and deadenylylated ¹²⁵I-labeled rabbit globin mRNA at two enzyme concentrations. Standard reaction conditions (see Fig. 4) at 24°C were used with 2.4 units of enzyme per milliliter. Total radioactivity was: intact, 50,900 cpm; deadenylylated, 7331 cpm; and at 5.0 units/ml enzyme: intact, 55,800; deadenylylated, 61,700 cpm. [Plot symbols illegible in the source. Horizontal axis: time (hours).]

FIG. 8. Autoradiogram of electrophoretically analyzed S1 nuclease digestion products of intact and deadenylylated ¹²⁵I-labeled rabbit globin mRNA. Electrophoresis was in a 20 cm 20% acrylamide/7 M urea slab gel (40), run at 250 V for 14 hours. Slots 1 and 2 were loaded with undigested intact and deadenylylated, and slots 3 and 4 with S1-digested (standard reaction at 24°C with 2.4 units/ml enzyme) intact and deadenylylated samples, respectively. Total radioactivity loaded was: slot 1, 90,000 cpm; slot 2, 148,000 cpm; slot 3, 950,000 cpm; slot 4, 740,000 cpm. Estimated nucleotide lengths, and the final positions of the marker dyes bromphenol blue (BP) and xylene cyanol (XC), are indicated.
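
The fragment-length estimate described above rests on two reference points (xylene cyanol at 29 nucleotides, bromphenol blue at 10 nucleotides) and a linear relation between mobility and the logarithm of length. A two-point interpolation of that kind might look like the sketch below. The migration distances are hypothetical, the function name is an assumption, and this is not presented as the procedure actually used to label Fig. 8.

```python
import math

XC_LENGTH = 29  # nucleotides comigrating with xylene cyanol (40)
BP_LENGTH = 10  # nucleotides comigrating with bromphenol blue (40)


def fragment_length(distance, xc_distance, bp_distance):
    """Interpolate log10(length) linearly between the two dye positions."""
    if xc_distance == bp_distance:
        raise ValueError("dye positions must differ")
    fraction = (distance - xc_distance) / (bp_distance - xc_distance)
    log_length = math.log10(XC_LENGTH) + fraction * (
        math.log10(BP_LENGTH) - math.log10(XC_LENGTH)
    )
    return 10 ** log_length


# Hypothetical migration distances (cm from the origin), for illustration only.
xc_cm, bp_cm = 8.0, 16.0
for band_cm in (8.0, 12.0, 16.0):
    nt = fragment_length(band_cm, xc_cm, bp_cm)
    print(f"band at {band_cm} cm: about {nt:.0f} nucleotides")
```

Bands that lie outside the interval between the two dyes are extrapolated rather than interpolated, and the dye calibration was established with DNA markers, so such estimates carry more uncertainty.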

IX. Summary and Conclusions

The data presented in this paper represent our first attempts to study the structure of a purified eukaryotic mRNA. Several approaches, including the use of optical, chemical and enzymic probes, are described. These studies focus on mixed populations of α and β rabbit globin mRNA. In some instances, comparisons between intact and deadenylylated mRNA species are made in order to begin to assess the structural role of poly(A) sequences. Eukaryotic messenger RNA has not been extensively studied from a structural perspective. The experimental information that will allow confidence in proposing structural models may not be available for many years. However, even though the results presented above are in most cases incomplete, it is possible to draw some general conclusions and to speculate concerning the structure of rabbit globin mRNA. The optical studies presented here and those published from other laboratories suggest that, at pH 7.0-8.0 and at relatively high salt, approximately 55-60% of the nucleotides are in helical regions. This number increases by about 10% at slightly acidic pH. The secondary structure is stabilized by the addition of Mg²⁺ ions. There appear to be a small number of subclasses of stable structure. Thermal denaturation studies imply that these subclasses melt independently and generate multiphasic denaturation curves. This phenomenon may represent the differential thermodynamic stabilities of helical regions, or the existence of some tertiary interactions that are less stable than regular helices. It may also reflect the coexistence of specific stable secondary structures, which may be functional, with nonspecific “random” secondary structures that may be nonfunctional. Since RNA can spontaneously form poorly matched secondary structures, it is not unlikely that the long coding regions of these mRNAs may generate “random” structures in solution.


The chemical binding experiments with the water-soluble carbodiimide gave results consistent with the optical data. A small number of presumably single-strand guanines and uracils react. Estimates of the extent of helical structure cannot be based on diimide binding, as it is probable that a large portion of the mRNA is unable to react. It is, therefore, likely that at pH 8 significant secondary, and perhaps tertiary, structure exists. Small perturbations of the structure by slight temperature changes or by the addition of small amounts of ethanol result in nearly a doubling of the number of reactive sites available to the diimide. This result is similar to that obtained with transfer RNA (48) and is consistent with the existence of unstable tertiary and/or secondary structural regions.

Structure-specific enzymes were also used to probe the conformation of globin mRNA. The polynucleotide phosphorylase results, in which the digestion is inhibited by removal of the poly(A) sequences, imply that there are temperature-dependent structural features in the 3’ terminus adjacent to the poly(A) that might sterically interfere with the initiation of the enzyme reaction. This is consistent with published data on the digestion of intact and deadenylylated silk-moth chorion mRNA (39). The ³²P-labeled poly(A) sequences from these messages are removed by polynucleotide phosphorylase at 5°C, whereas the remainder of the molecule remains intact. Deadenylylated chorion mRNA is completely undigested by the enzyme at 5°C, but can be hydrolyzed at higher temperatures. It is possible that the polynucleotide phosphorylase results reflect the existence of stable secondary structure of the type proposed by Proudfoot (30). The other enzyme used, S1 nuclease, gave perhaps the most intriguing results in this paper. Digestion of radioiodinated globin mRNA with this enzyme indicates that approximately 70-75% of the nucleotides in these molecules are resistant to digestion. This value is an estimate since only the resistance of iodinated cytosine residues is being measured. It is possible that the relative distribution of this base in helical regions is nonuniform, hence the percent resistance determined may not be an accurate measure of the true fraction of bases protected from S1 digestion. However, the numbers obtained are in agreement with the estimates from optical studies in S1 nuclease buffer. Mg²⁺ ions do not appear to alter the enzyme resistance of the mRNAs. This may reflect the mechanism of enzyme digestion, i.e., it may be that the molecule loses its globular nature when the enzyme begins to cleave single-stranded regions. The removal of poly(A) sequences also does not affect the S1 results. The attempt to study the nature of the fragments generated by S1, by gel electrophoresis, provides evidence for the apparent existence of helical regions of discrete length. Several rather short pieces were obtained in addition to longer, less well defined, fragments. It is possible that the short, discrete fragments represent stable well-matched helical segments, whereas the longer diffuse bands may represent “random”, poorly matched helices. These results are encouraging and provide substantial motivation for the further exploration of mRNA structure using S1 and other structure-specific enzymes.

ACKNOWLEDGMENTS

We thank Drs. J. Stavrianopoulos, A. Maxam, W. Gilbert, A. Efstratiadis, J. Lebowitz and F. C. Kafatos for materials; Drs. F. Kafatos and R. Woodruff for use of facilities; P. Given for help with the carbodiimide binding studies; B. Yates, M. Frishman, L. Rieser and L. Lawrence for technical assistance; B. Gingell and K. A. Vournakis for help with figures; and S. Petrarca and G. Ventura for typing the manuscript. This research was supported by grants from N.I.H. (GM-22280) and from the Syracuse University Equipment and Research Fund.


Molecular Weight Distribution of RNA Fractionated on Aqueous and 70% Formamide Sucrose Gradients

HELGA BOEDTKER AND HANS LEHRACH
Department of Biochemistry and Molecular Biology
Harvard University
Cambridge, Massachusetts

I. Introduction

Sucrose gradient centrifugation has been used for many years to fractionate RNA according to size. It has also been known for many years that, under nondenaturing conditions, there is no simple relation between RNA sedimentation constants and molecular weights (1, 2). This observation has been confirmed more recently in studies of eukaryotic mRNAs, which exhibit anomalous sedimentation rates relative to ribosomal RNA on aqueous sucrose gradients (3-6). Nevertheless, aqueous sucrose gradients are still being used to analyze the size distribution of nuclear RNA (7-9). Recently we analyzed the MW distribution of pulse-labeled RNA and found that RNAs with MWs equal to or greater than 2 × 10⁶ sediment as slowly as 20 S on both aqueous and 70% formamide sucrose gradients. Thus, quite contrary to the observation that large nuclear RNA sediments too slowly relative to rRNA on fully denaturing Me₂SO gradients (10, 11), the former actually sediments too slowly on nondenaturing or partially denaturing sucrose gradients.
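The size of the discrepancy described above can be illustrated with a short calculation. The sketch below assumes Spirin's empirical relation between sedimentation constant and molecular weight, M ≈ 1550 × S^2.1; this relation and its constants are an outside assumption introduced only for illustration and are not used by the authors. The function names are likewise hypothetical.

```python
# Illustrative only. Spirin's empirical relation M ~ 1550 * S**2.1 is an
# assumption brought in from outside this article; the authors' point is
# precisely that no such simple relation holds under nondenaturing conditions.

def apparent_mw(s_value, k=1550.0, exponent=2.1):
    """Molecular weight implied by a sedimentation constant (in Svedbergs)."""
    return k * s_value ** exponent


def underestimate_factor(true_mw, s_observed):
    """Ratio of the true MW to the MW implied by the observed S value."""
    return true_mw / apparent_mw(s_observed)


if __name__ == "__main__":
    # rRNA markers mentioned in the text
    for s in (18, 27, 28):
        print(f"{s} S -> apparent MW {apparent_mw(s):.2e}")
    # Pulse-labeled RNA of MW 2e6 that sediments at only 20 S
    print(f"underestimate: {underestimate_factor(2e6, 20):.1f}-fold")
```

With these assumed constants, 20 S corresponds to an apparent MW of roughly 0.8 × 10⁶, so an RNA of MW 2 × 10⁶ sedimenting at 20 S would be underestimated more than twofold if sedimentation alone were used to assign its size.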

II. Molecular Weight Distribution on Aqueous Sucrose Gradients

Low-salt NaDodSO₄ sucrose gradients have been widely used in the isolation of mRNA. In the expectation that such gradients would prove useful for the isolation of procollagen mRNAs, we examined their fidelity by measuring the MW distribution of pooled fractions from 18 S to >30 S by denaturing gel electrophoretic analysis. [³H]Uridine-pulse-labeled RNA was prepared from 15-day-old chick embryo calvaria as described previously (12). The distribution of pulse-labeled and stable rRNA obtained is shown in Fig. 1. Although the 27 S¹ and 18 S rRNA species appear as sharp, well separated peaks, there is an almost continuous distribution of radioactivity from 10 S to 30 S.

To determine the MWs of the pulse-labeled RNA, fractions were pooled as indicated and analyzed by polyacrylamide gel electrophoresis in 99% formamide. The radioactivity profiles obtained for six fractions (fractions I and II were combined) are shown in Fig. 2. In each fraction sedimenting faster than 18 S rRNA (fractions I-V), a peak of radioactivity is located at 20-25 mm from the top of the gel, corresponding to the approximate location of 27 S rRNA. More important, however, is that this also corresponds to the location of the MW-independent migration of RNA molecules larger than 1.6 × 10⁶ on these gels. This phenomenon is documented by the results presented and discussed below. At this point, we only point out that pulse-labeled RNA that sediments at 22 S on aqueous gradients electrophoreses on 99% formamide gels with RNA molecules having MWs equal to or greater than 1.6 × 10⁶.

¹ The large chick ribosomal RNA has been "renamed" 27 S, rather than the 28 S previously used, because its molecular weight is 1.5 × 10⁶ and corresponds to a sedimentation constant of 27 S. 28 S should be reserved for the large mammalian rRNA of MW 1.65 × 10⁶.

FIG. 1. Aqueous sucrose gradient fractionation of [³H]uridine-pulse-labeled RNA. RNA (300 µg) with a specific activity of 200 cpm/µg was dissolved in 90% formamide, heated at 70°C for 1 minute, diluted with 19 volumes of buffer, and sedimented on a preformed 4 to 40% sucrose gradient containing 0.2% NaDodSO₄, 0.01 M NaCl, 0.001 M Na acetate, 0.2 mM Na₂EDTA (pH 5.0) for 23 hours at 25,000 rpm in a Beckman SW 27 rotor at 20°C. Approximately 1-ml fractions were collected, and the absorbance of each was read at 260 nm in a Gilford 2400S spectrophotometer; 0.1 ml of each fraction was then counted in 10 ml of Scintiverse (Fisher) in a Beckman LS-250 liquid scintillation counter. Solid line, A₂₆₀; ○- - -○, cpm.

FIG. 2. Polyacrylamide gel electrophoretic analysis in 99% formamide of aqueous sucrose gradient fractions. Electrophoresis was carried out as described by Pinder et al. (13) with the modifications described previously (14) except that electrophoresis was started within 1 hour after the gels set. Gel slices were counted after solubilizing in 90% Protosol for 2 hours at 60°C and then adding 3 ml of Scintilene (Fisher).

FIG. 3. 70% Formamide sucrose gradient fractionation of [³H]uridine pulse-labeled RNA. Of the RNA, 150 µg with a specific activity of 5000 cpm/µg were dissolved in 70% formamide, heated at 70°C for 1 minute, fast-cooled to room temperature, and then sedimented on a preformed 3 to 20% sucrose gradient in 70% formamide containing 0.05 M TrisCl, 5 mM Na₂EDTA titrated to pH 7.8 for 24 hours at 36,000 rpm in a Beckman SW 40Ti rotor at 20°C. Fractions of 0.3 ml were collected, diluted to 1 ml with distilled deionized water, and then read and counted as described for Fig. 1. Solid line, cpm ([³H]uridine); ○----○, A₂₆₀.

FIG. 4. Gel electrophoretic analysis of 70% formamide-sucrose gradient fractions under denaturing conditions. (A) 3% Polyacrylamide gels in 99% formamide. Electrophoresis was carried out as described in Fig. 2 except that the acrylamide concentration was 3%. RNA samples were heated at 65°C for 1 minute before electrophoresis. Gels were run at 3 mA per tube for 4 hours at room temperature, and then stained overnight in ethidium bromide (1 µg/ml) in 0.1 M NH₄ acetate (16) to locate the molecular weight standards before freezing and slicing the gels. (B) 1% Agarose gels in 6% formaldehyde. RNA fractions were reacted with formaldehyde as described previously (17) except that the formaldehyde concentration was increased from 3 to 6%, and the reaction was carried out in 50% formamide at 60°C for 5 minutes. Agarose gels, 1%, were used instead of polyacrylamide, and electrophoresis was for 4 hours at 2 mA per tube at room temperature. The gels were stained and counted as described above. (C) 1% Agarose gels in 6 M urea. RNA fractions were dissolved in 25 µl of 99% formamide and heated at 65°C for 30 seconds. To this, 25 µl of 6 M urea, 0.025 M citrate buffer, pH 3.5, was added and the sample was applied to 1% agarose gels in 6 M urea, 0.25 M citrate, pH 3.5, as described by Rosen et al. (18). Electrophoresis was for 4 hours at 3 mA per tube at 4°C. The gels were stained and counted as described above.

III. Molecular Weight Distribution on 70% Formamide Sucrose Gradients

Since we had failed to achieve a satisfactory fractionation of pulse-labeled RNA on aqueous sucrose gradients, we examined the MW distribution of such RNA on 70% formamide sucrose gradients similar to those described by Suzuki et al. (15). The distribution of the absorbance and radioactivity obtained is shown in Fig. 3. As in the aqueous gradients, the 27 S and 18 S rRNA species appear as sharp, well separated peaks and the distribution of radioactivity is quite broad. However, a significantly greater fraction of the radioactivity cosediments with the rRNA species on the formamide gradients. To determine the size distribution of pulse-labeled RNA across the gradient, five fractions were pooled as indicated, and equal aliquots of each fraction except fraction II were analyzed by electrophoresis on three different denaturing gel systems: 99% formamide, 6% formaldehyde, and 6 M urea, pH 3.5. The distribution of pulse-labeled RNA obtained in each case is shown in Figs. 4A, 4B and 4C.

A large fraction of the pulse-labeled RNA sedimenting at approximately 20 S (fraction IV) electrophoreses with a mobility less than that of 27 S rRNA on all three denaturing gels. While most of the radioactivity is located in the 27 S formamide sucrose gradient fraction (fraction II) in each case, a significant fraction of the labeled species is larger than 27 S rRNA and TMV RNA (2 × 10⁶ MW) when analyzed on either formaldehyde or urea gels. Therefore, pulse-labeled RNA sediments much more slowly than stable rRNA on 70% formamide sucrose gradients.

Finally, if one compares the appearance of pulse-labeled RNA sedimenting faster than 30 S (fraction I) on 3% acrylamide gels in 99% formamide with that on either formaldehyde or urea agarose gels, there is a striking difference. In the former case, all the radioactivity is confined to a single band with a mobility somewhat lower than that of 27 S rRNA, while a broad distribution of radioactivity ranging from one to five million appears for this same fraction when analyzed on either urea or formaldehyde gels. Furthermore, there is a peak of radioactivity at the same position in the 99% formamide gel analysis of fractions II, III and IV. Since this corresponds to a molecular weight of about 2 × 10⁶ while molecules larger than this are clearly found in these fractions when analyzed on agarose gels, the lowest mobility peak seen on the formamide gels must represent those RNA species that are equal to or larger than 2 × 10⁶ and that move through the gels with a mobility independent of their MW. This interpretation was confirmed by demonstration that lambda DNA and silk fibroin mRNA (~6 × 10⁶ MW) travel to the same position on 3% acrylamide gels in 99% formamide (19).

IV. Discussion

Pulse-labeled RNA was fractionated on both aqueous and 70% formamide gradients, and fractions were pooled and analyzed by gel electrophoresis under denaturing conditions. A significant fraction of the labeled RNA sedimenting more slowly than 27 S rRNA, and thus appearing to be of lower MW, actually electrophoreses more slowly than 27 S rRNA and thus must have MWs greater than the latter. Since the labeled RNA species include rRNA, mRNA, and nuclear RNA, and since only nuclear RNA would contain molecules as large as 2 × 10⁶, it seems clear that large nuclear RNA sediments much more slowly than ribosomal RNA of the same MW. Since the RNA samples analyzed on these gradients were heated in formamide at 70°C for 1 minute before fractionation, the simplest explanation of their anomalous sedimentation behavior is that rRNA renatures to some degree on both aqueous and 70% formamide gradients, while nuclear and messenger RNA do so to a much smaller extent or not at all. The partial renaturation of rRNA at room temperature in formamide has been reported (20, 21). In view of these results, it seems clear that the distribution of sedimentation constants observed for nuclear RNA on aqueous gradients cannot be transformed into molecular weight distributions, even when the RNA is denatured in Me₂SO prior to centrifugation.

ACKNOWLEDGMENTS

We thank Tricia Bredbury for her invaluable assistance in slicing and counting gels. This research was supported by NIH Grant HD-01229. H. Lehrach was the recipient of a postdoctoral fellowship from the Jane Coffin Childs Memorial Fund for Medical Research.

REFERENCES

1. R. F. Gesteland and H. Boedtker, JMB 8, 496 (1964).
2. H. Boedtker, in "Methods in Enzymology," Vol. 12, Nucleic Acids, Part B (L. Grossman and K. Moldave, eds.). Academic Press, New York, 1968.
3. G. E. Morris, E. A. Buzash, A. W. Rourke, K. Tepperman, W. C. Thompson and S. M. Heywood, CSHSQB 37, 535 (1972).
4. D. J. Shapiro and R. T. Schimke, JBC 250, 1759 (1975).
5. M. E. Haines, N. H. Carey and R. D. Palmiter, EJB 43, 549 (1974).
6. M. C. MacLeod, Anal. Biochem. 68, 299 (1975).
7. R. C. Herman, J. G. Williams and S. Penman, Cell 7, 429 (1976).
8. S. Bachenheimer and J. E. Darnell, PNAS 72, 4445 (1975).
9. R. P. Perry, D. E. Kelley, K. H. Friderici and F. M. Rottman, Cell 6, 13 (1975).
10. N. H. Acheson, E. Buetti, K. Scherrer and R. Weil, PNAS 68, 2231 (1971).
11. T. Imaizumi, H. Diggelmann and K. Scherrer, PNAS 70, 1122 (1973).
12. H. Boedtker, R. B. Crkvenjakov, J. A. Last and P. Doty, PNAS 71, 4208 (1974).
13. J. C. Pinder, D. Z. Staynov and W. B. Gratzer, Bchem 13, 5373 (1974).
14. H. Boedtker, R. B. Crkvenjakov, K. F. Dewey and K. Lanks, Bchem 12, 4356 (1973).
15. Y. Suzuki, L. P. Gage and D. D. Brown, JMB 70, 637 (1972).
16. J. M. Bailey and N. Davidson, Anal. Biochem. 70, 75 (1976).
17. H. Boedtker, BBA 240, 448 (1971).
18. J. M. Rosen, S. L. C. Woo, J. M. Holder, A. R. Means and B. W. O'Malley, Bchem 14, 69 (1975).
19. H. Lehrach, unpublished data.
20. P. K. Wellauer and I. B. Dawid, JMB 89, 379 (1974).
21. H. Boedtker, unpublished data.

III. Processing of mRNAs

Bacteriophages T7 and T3 as Model Systems for RNA Synthesis and Processing
J. J. DUNN, C. W. ANDERSON, J. F. ATKINS, D. C. BARTELT AND W. C. CROCKETT

The Relationship between hnRNA and mRNA
ROBERT P. PERRY, ENZO BARD, B. DAVID HAMES, DAWN E. KELLEY AND UELI SCHIBLER

A Comparison of Nuclear and Cytoplasmic Viral RNAs Synthesized Early in Productive Infection with Adenovirus 2
HESCHEL J. RASKAS AND ELIZABETH A. CRAIG

Biogenesis of Silk Fibroin mRNA: An Example of Very Rapid Processing?
PAUL M. LIZARDI

Visualization of the Silk Fibroin Transcription Unit and Nascent Silk Fibroin Molecules on Polyribosomes of Bombyx mori
STEVEN L. MCKNIGHT, NELDA L. SULLIVAN AND OSCAR L. MILLER, JR.

Production and Fate of Balbiani Ring Products
B. DANEHOLT, S. T. CASE, J. HYDE, L. NELSON AND L. WIESLANDER

Distribution of hnRNA and mRNA Sequences in Nuclear Ribonucleoprotein Complexes
ALAN J. KINNIBURGH, PETER B. BILLINGS, THOMAS J. QUINLAN AND TERENCE E. MARTIN

Bacteriophages T7 and T3 as Model Systems for RNA Synthesis and Processing

J. J. DUNN,* C. W. ANDERSON,* J. F. ATKINS,† D. C. BARTELT* AND W. C. CROCKETT*

* Biology Department, Brookhaven National Laboratory, Upton, New York
† Department of Molecular Biology, University of Edinburgh, Edinburgh, Scotland

I. Introduction

In Escherichia coli, site-specific cleavages by RNase III¹ are the initial events in the processing of rRNAs and are also responsible for generating the individual "early" mRNAs of bacteriophages T7 and T3 (1-6). Using purified E. coli RNA polymerase and RNase III, it is possible to synthesize in vitro early T7 and T3 mRNAs that are identical to those found in infected cells. RNase III cleavage of rRNA and early mRNA takes place at specific processing signals that are present in the RNA itself. Have processing signals been preserved during evolution? If so, enzymes similar to E. coli RNase III might be found in other organisms. One approach to identifying an RNase III-type enzyme in a source other than E. coli is to ask whether extracts prepared from the organism contain an activity that will faithfully process T7 and T3 early RNAs. A complementary approach would be to test a putative precursor RNA with RNase III from E. coli to see whether the RNA contains processing signals and whether specific products are generated. This report summarizes some aspects of T7 and T3 early RNA processing and the properties of E. coli RNase III that might be useful in studies of this type.

II. Properties of RNase III

RNase III was originally characterized as an endonuclease that degrades double-stranded RNA to acid-soluble fragments (7). The rather nonspecific manner by which the enzyme digests double-stranded RNA is in marked contrast to its specificity for unique sites in single-stranded RNAs. A number of lines of evidence suggest that a single enzyme catalyzes both reactions: (a) the ratio of activities against single-stranded and double-stranded RNA remains constant during purification to homogeneity; (b) an RNase III-deficient mutant of E. coli lacks both activities (2, 3, 8); (c) the digestion products of either single-stranded or double-stranded RNA have 5'-P and 3'-OH ends (4, 5, 9, 10); and (d) double-stranded RNA is a potent competitive inhibitor of the specific cleavages of single-stranded RNA (2, 9).

The purification of RNase III has been greatly simplified by the finding that RNase III binds tightly to columns containing immobilized poly(I)·poly(C) (11). 1 M NH₄Cl elutes essentially all other E. coli proteins that also bind to the column, and subsequent passage of 2 M NH₄Cl through the column elutes RNase III. If enzymes of similar specificity from other sources bind as tightly to poly(I)·poly(C), then columns of this type should also be useful in their purification.

In its native form, RNase III seems to be composed of two 25,000 MW subunits that fail to resolve when electrophoresed on polyacrylamide gels in the presence of sodium dodecyl sulfate (12, 12). It remains to be established whether the subunits are identical. The enzyme requires divalent cations for activity (Mg²⁺ or Mn²⁺), but binds to poly(I)·poly(C) in the absence of divalent cations (11).

¹ EC 3.1.4.24 (10). [Eds.]

III. Synthesis of T7 and T3 Early RNAs

The early regions of bacteriophages T7 and T3 are transcribed by the E. coli RNA polymerase and comprise the left-hand 20% of the phage DNAs (13-15). As shown in Fig. 1, the early region of each virus is a single unit of transcription consisting of a promoter region, five early genes and a termination signal. When the infecting DNA enters E. coli, the host RNA polymerase initiates transcription at three closely spaced sites within the promoter region, and then transcribes the five early genes in order from left to right (1, 16). The primary transcripts are cut at five specific sites by RNase III to produce five early mRNAs plus three overlapping RNAs from the promoter region. Uncleaved primary transcripts from the early regions of T7 and T3 are not observed in RNase III⁺ hosts, probably because of the close coupling of cleavage with transcription. However, when T7 or T3 DNA is transcribed by purified E. coli RNA polymerase, or when RNase III⁻ strains are employed as hosts, polycistronic primary transcripts covering the entire early region of each virus are produced (1, 2). Cleavage does not have to be coupled with transcription since, as noted above, the processing signals are present in the RNA itself. Thus, when polycistronic early RNA is incubated with RNase III, it is cleaved at the same five sites that are normally cut in vivo (1, 2, 4, 5).

[Figure 1: map of the early regions. DNA scale from 0 to 20 T7 units; RNA transcripts for genes 0.3, 0.7, 1, 1.1 and 1.3; sizes and functions of the corresponding proteins as tabulated.]

Gene    T7 protein                           T3 protein
0.3     8,700; overcome host restriction     11,500; same (SAMase)
0.7     42,000; protein kinase               40,000; same
1       100,000; RNA polymerase              97,000; same
1.1     8,000; ?                             ?
1.3     40,000; DNA ligase                   37,000; same

FIG. 1. The early regions of T7 and T3 DNAs. The map of the RNAs and the sizes and functions of T7 early proteins have been described (17, 18). The corresponding information for T3 is from Studier and Movva (19). Approximate distance from the left end of the genome, in units of percent the length of T7 DNA, is given above the double line representing DNA; the gene numbers are given above the single line representing the RNA transcripts. Escherichia coli RNA polymerase transcribes this region starting at three sites to the left of gene 0.3 and stopping just to the right of gene 1.3. E. coli RNase III cuts RNA from this region at five specific sites, between each of the early genes and at the left of gene 0.3.

The positions of the cleavage sites in the early RNAs of T7 and T3 have been mapped, and mutants that delete each site have been isolated (17-19). These mutants are also useful in identifying individual early mRNAs. The ability to locate cleavage sites with these mutants makes it possible to test whether processing enzymes from other sources cut T7 and T3 early RNAs at the same sites as does E. coli RNase III.
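The bookkeeping behind "five early mRNAs plus three overlapping RNAs from the promoter region" can be sketched in a few lines. In the example below every coordinate is hypothetical (the text gives no nucleotide positions), and the names `START_SITES`, `CLEAVAGE_SITES`, `TERMINATOR`, `cleave` and `early_rna_species` are invented for illustration; only the counts (three start sites, five cleavage sites, one terminator) come from the description above.

```python
# Hypothetical coordinates in nucleotides; the real map positions are not
# given in the text. Only the numbers of sites follow the description.
START_SITES = [0, 100, 200]                     # three closely spaced initiation sites
CLEAVAGE_SITES = [500, 1000, 2500, 5500, 6000]  # five RNase III sites
TERMINATOR = 7000                               # end of the early transcription unit


def cleave(start, end, sites):
    """Cut one transcript (start, end) at every site lying strictly inside it."""
    cuts = [s for s in sorted(sites) if start < s < end]
    bounds = [start] + cuts + [end]
    return list(zip(bounds[:-1], bounds[1:]))


def early_rna_species(starts, sites, terminator):
    """Distinct RNA fragments obtained by cleaving the transcript from each start."""
    species = set()
    for start in starts:
        species.update(cleave(start, terminator, sites))
    return sorted(species)


if __name__ == "__main__":
    species = early_rna_species(START_SITES, CLEAVAGE_SITES, TERMINATOR)
    leaders = [f for f in species if f[1] == CLEAVAGE_SITES[0]]
    messengers = [f for f in species if f[0] >= CLEAVAGE_SITES[0]]
    print(len(leaders), "promoter-region RNAs:", leaders)
    print(len(messengers), "early mRNAs:", messengers)
```

Because all three transcripts share every cleavage site, the downstream fragments coincide and only the fragments upstream of the first site differ, which is why the promoter-region RNAs overlap while each early mRNA appears as a single species.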

IV. Fidelity of RNase III Cleavage in Vitro

The individual T7 and T3 early RNAs form a characteristic pattern when they are electrophoresed on polyacrylamide gels (Fig. 2). Any deviations from the normal pattern can be readily detected, thereby providing a sensitive method for analyzing the fidelity of cleavage in vitro. Deviations from the normal polyacrylamide gel patterns were repeatedly seen with certain in vitro conditions. The changes observed suggested that cleavages were occurring at sites in addition to those normally cut in vivo. For the sake of discussion, the five sites cleaved in vivo are referred to as primary sites, and other sites are termed secondary sites.

Cleavage at secondary sites was found to be influenced strongly by the concentration of monovalent salt in the digestion mixture (11). Figure 2 shows electrophoretic patterns typical of those obtained when polycistronic T7 early RNA is incubated over a wide range of monovalent salt (NH₄Cl) concentrations with a constant amount of RNase III. Similar patterns are obtained using NaCl or KCl in place of NH₄Cl. At the enzyme concentration used in the experiment shown in Fig. 2, an RNA pattern corresponding to that produced in vivo is generated only at monovalent salt concentrations between 150 and 300 mM. Salt concentrations lower than 150 mM promote extensive cleavage of secondary sites, as evidenced by the loss of the larger mRNA species from the patterns and the appearance of discrete smaller RNAs. At monovalent salt concentrations above 300 mM, cleavage of primary sites, as well as of secondary sites, is inhibited and a number of partial digestion products are observed.

The extent of cleavage at secondary sites depends upon enzyme concentration (11). Even at the lowest salt concentration used (5 mM), primary sites are the preferred sites of cleavage, and the only sites cleaved if enzyme is limiting. However, at 5 mM salt only a slight (2- to 4-fold) increase in enzyme concentration results in significant cleavage at secondary sites. At moderate salt concentrations, the enzyme's preference for primary sites is more pronounced, and secondary sites are cleaved only at high enzyme to substrate ratios.

Both primary and secondary cleavage sites occur at specific locations within polycistronic T7 early RNA, and it should be possible eventually to determine what features of structure or nucleotide sequence are required for cleavage. One possibility is that all cleavage sites have regions of helical structure. Helical structure at cleavage sites would be consistent with the ability of RNase III to digest double-stranded RNA (7), and the observation that double-stranded RNA inhibits all cleavages of polycistronic T7 early RNA in vitro (2, 9, 11). Cleavage might also require the presence of particular nucleotide sequences. Table I lists some of the termini generated by RNase III cleavages. It is clear that no one sequence is mandatory for cleavage, although certain sequences seem to predominate. Perhaps several sequences may be preferred by RNase III (6).

FIG. 2. Effect of monovalent salt concentration on the fidelity of RNase III cleavage. Approximately 10 ng of ³²P-labeled polycistronic T7 early RNA was incubated with 1.5 units of RNase III at 37°C in 50 µl of buffer containing: 5% sucrose, 0.02 M TrisCl (pH 7.9), 0.005 M MgCl₂, 0.1 mM EDTA, 0.1 mM dithiothreitol and NH₄Cl as indicated. After 20 minutes, reactions were terminated and the RNA was analyzed by electrophoresis on polyacrylamide gels. Equal portions of each digestion mixture were applied to a 2% polyacrylamide plus 0.5% agarose gel, to resolve RNAs greater than 600 nucleotides long, and to a 3 to 20% polyacrylamide gradient gel, to resolve RNAs of less than 600 nucleotides. The RNA applied to the tracks marked "control" was from an incubation mixture which received no RNase III. The positions of the early T7 RNAs are indicated to the right of each gel pattern.

[TABLE I. Termini generated by RNase III cleavages. Primary cleavage sites: T7 RNAs (junctions I-0.3, 0.3-0.7, 0.7-1, 1-1.1 and 1.1-1.3); E. coli rRNA, 3' end of p16S (6). Secondary cleavage sites: T7 RNA near the end of 1.1 RNA; T4 RNA species I (20). Table footnote: H. D. Robertson, E. Dickson and J. J. Dunn, in preparation.]

As noted above, deletions affecting each primary site within T7 and T3 early RNA are available. It should be feasible, using specific deletion strains, to determine extended regions of sequence at each cleavage site. By comparing the sequences of a number of sites, it might then begin to be possible to draw conclusions regarding the exact nature of cleavage sites.

V. Effect of Cleavage on Translation

Why are T7 and T3 early RNAs cut by RNase III? The possibility that cleavage is necessary for efficient translation of T7 early RNA was examined by comparing the rate of synthesis of T7 early proteins in RNase III⁻ and RNase III⁺ hosts (21). Early proteins were labeled with [³⁵S]methionine and then resolved by electrophoresis on polyacrylamide gels in the presence of sodium dodecyl sulfate. Individual proteins were identified by the effects that various deletion and nonsense mutations of T7 have on the protein patterns. The results demonstrated that synthesis of only one of the five early proteins is affected by RNase III cleavage of the RNA. Much less 0.3 protein is synthesized in an RNase III⁻ host as compared to its synthesis in an RNase III⁺ strain. In addition, when T7 early RNA is used to program a cell-free protein-synthesizing system prepared from RNase III⁻ cells, RNase III cleavage of the RNA stimulates synthesis of 0.3 protein, but not the other early proteins. In the RNase III⁻ cell-free system, the stimulation of 0.3 protein synthesis by prior cleavage of the RNA is at least 10- to 20-fold. The most straightforward interpretation of the data is that RNase III cleavage is not required for efficient translation of most regions of polycistronic T7 early RNA, but cleavage is required for efficient translation of the 0.3 mRNA portion.

Two eukaryotic cell-free protein-synthesizing systems, one derived from mammalian cells and one from wheat germ, were also examined for ability to translate polycistronic or RNase III-cleaved early RNA (22). Both systems were stimulated to incorporate [³⁵S]methionine into polypeptides by the addition of either cut or uncut early RNAs. As can be seen from Fig. 3, more incorporation was obtained with RNase III-cleaved RNA than with an equivalent amount of the uncut RNA. However, unlike the situation in the E. coli cell-free system, all the early proteins made in the eukaryotic systems are stimulated by cleavage. Polyacrylamide gel electrophoresis in the presence of sodium dodecyl sulfate indicates that both eukaryotic systems synthesize authentic T7 and T3 early proteins. The identity of individual proteins was verified by observing the effects of various deletions and nonsense mutations on the protein patterns.

Only certain early mRNAs seem to be translated. The mammalian system efficiently translated the gene 0.3, 1 and 1.3 mRNAs of T7 and T3 but not the gene 0.7 and 1.1 mRNAs. Sufficient T3 0.3 protein was synthesized in the mammalian system so that it could be detected readily by its enzymic activity (cleavage of S-adenosylmethionine) (23). The wheat-germ system failed to synthesize detectable quantities of the T3 gene 0.3 protein but appeared to translate efficiently the corresponding T7 mRNA. The gene 1 and 1.3 proteins of T7 and T3 were also synthesized efficiently in the wheat-germ system. The reason for the apparent lack of synthesis of certain proteins in these systems is not clear. Perhaps it is due to poor initiation or the presence of a codon in the mRNA that is seldom used in wheat and mammalian cells. Synthesis of T7 gene 0.7 protein is also inefficient in E. coli cell-free systems (21).

In the wheat-germ system there is a very low background incorporation of [³⁵S]methionine, indicating that the system contains little endogenous mRNA and that added mRNAs do not have to compete for available ribosomes. Most individual early mRNAs begin with pG-A-U (4, 5), while many eukaryotic mRNAs are "capped" with 7-methylguanosine in 5'-to-5' linkage with the first encoded base of the RNA (see Rottman, Busch, Furuichi, and Moss in this volume). In order to determine whether early mRNA could effectively compete with authentic "capped" eukaryotic mRNA, the ability to translate T3 mRNA in the presence of brome-mosaic virus and globin mRNAs was examined. Figure 4 shows the polyacrylamide gel patterns that were obtained when increasing amounts of T3 mRNA were added to the wheat-germ system along with a previously determined, saturating amount of brome-mosaic virus RNAs. Not only are T3 proteins synthesized, but the addition of T3 mRNA appears to suppress the synthesis of some prominent brome-mosaic virus polypeptides. Also shown in Fig. 4 is the analogous experiment in the mammalian system but with globin mRNA as the added homologous eukaryotic mRNA. Again, T3 early proteins were synthesized.

Neither eukaryotic cell-free system seems to contain an activity that will faithfully process T7 or T3 early RNA, at least not under the conditions used here for protein synthesis. This conclusion is based on the observation that, when ¹⁴C-labeled polycistronic T7 or T3 early RNA was used to program either system, specific cleavage of primary sites was not detected by electrophoresis of the RNA on polyacrylamide gels. In an E. coli RNase III⁺ cell-free system, added polycistronic early RNA is rapidly processed to individual early mRNAs (21).

FIG. 3. Stimulation of [³⁵S]methionine incorporation by T3 RNA in a fractionated mammalian cell-free protein-synthesizing system. Synthesis was performed as described elsewhere (22), after which aliquots were removed, treated with 1 M KOH for 10 minutes, and precipitated with 5% cold trichloroacetic acid for liquid scintillation counting. Similar results were obtained using T3 RNA to program a wheat-germ system. The ionic conditions for optimum translation of uncut and RNase III-cut early RNAs were the same as those for optimal translation of eukaryotic mRNAs (brome-mosaic virus in the wheat-germ system; globin and adenovirus type 2 in the mammalian system). In the mammalian system, maximal incorporation was obtained at 100 mM KCl, 2.2 mM MgCl₂ and 8 mM putrescine. Incorporation of [³⁵S]methionine was reduced by more than 50% at MgCl₂ concentrations below 1.5 mM and above 3.5 mM. In the wheat-germ system the divalent cation requirement for maximal translation of T7 and T3 mRNAs was essentially identical to that for the mammalian system (2.2 mM MgCl₂, 86 µM spermine). However, in the wheat-germ system maximum incorporation was obtained at KCl concentrations between 56 and 76 mM and decreased rapidly at higher KCl concentrations. Abscissa: RNA (µg/25 µl). Polycistronic T3 early RNA (△); RNase III-cut T3 early RNA (○).

VI. Summary

In E. coli, RNase III cleavages are part of the normal pathway that produces individual rRNAs and bacteriophage T7 and T3 early mRNAs (1-6). The late RNAs of T7 and T3 are synthesized by RNA polymerases specified by the gene 1 of each virus (24, 25), and some late transcripts are also processed by RNase III (26). At this time, it is not known whether any E. coli mRNAs are cleaved by RNase III. Transcribing specialized transducing phages in vitro and then incubating the transcripts with RNase III should provide information concerning the role, if any, of RNase III in E. coli mRNA biosynthesis. The finding that, under appropriate conditions, RNase III can cleave T7 early RNA at secondary sites makes it obvious that caution must be exercised in studies of this type. Presumably the incubation conditions that promote a high fidelity of cleavage of T7 early RNA would also apply for other RNAs. In determining whether an RNase III cleavage observed in vitro would most likely also occur in vivo, the RNA to be tested should probably be incubated in parallel with polycistronic T7 early RNA. The T7 digestion pattern could then be used as an index of cleavage at primary sites and the absence of cleavage at secondary sites.

Cleavage of primary sites does not appear to be a prerequisite for translation of early RNA in vivo or in E. coli cell-free protein-synthesizing systems, although cleavage greatly stimulates the synthesis of the 0.3 protein of T7 and T3 (21, 27). Selective stimulation of 0.3 protein synthesis by cleavage of early RNA was not observed in two eukaryotic cell-free protein-synthesizing systems. In the eukaryotic systems, all the proteins synthesized seem to be stimulated by cleavage of the RNA. The

FIG. 4. Competition between homologous RNA and T3 early RNA. In vitro protein synthesis was carried out in 25-µl reactions as described elsewhere (22) for 90 minutes at 30°C (wheat) or 37°C (mammalian). The products were labeled with [35S]methionine, denatured in sodium dodecyl sulfate plus dithiothreitol and electrophoresed on a 17.5% polyacrylamide gel, which was then dried and autoradiographed. The wheat-germ system was programmed with: (b) rRNA to protect endogenous synthesis; (c) 2 µg of uncleaved T3 RNA; (d) 2 µg of RNase III-cleaved T3 RNA; (e-h) 2.5 µg of brome-mosaic virus (BMV) RNA plus (e) 0.4 µg, (f) 0.8 µg, (g) 2 µg, or (h) 0 µg of RNase III-cleaved T3 RNA. The mammalian system was programmed with (i) rRNA, or (j-m) 0.5 µg of 9 S RNA (globin) from rabbit reticulocytes plus (j) 0 µg, (k) 0.4 µg, (l) 0.8 µg, or (m) 2.0 µg of RNase III-cleaved T3 RNA. An extract of UV-irradiated T3-infected E. coli labeled 0-8 min after infection with [35S]methionine (gift of F. W. Studier) is shown in (a).

synthesis of T7 and T3 early proteins in cell-free systems from wheat germ and mammalian cells is not very surprising, since it has been previously shown that these same two systems will translate the coat and synthetase regions of the RNA bacteriophages of E. coli (28-32). The finding that the eukaryotic systems also translate T7 and T3 early mRNAs provides another sophisticated prokaryotic genetic system for probing the mechanism of protein synthesis in eukaryotes.

ACKNOWLEDGMENT

This research has been carried out at Brookhaven National Laboratory under the auspices of the U.S. Energy Research and Development Administration.

REFERENCES

1. J. J. Dunn and F. W. Studier, PNAS 70, 1559 (1973).
2. J. J. Dunn and F. W. Studier, PNAS 70, 3296 (1973).
3. N. Nikolaev, L. Silengo and D. Schlessinger, PNAS 70, 3361 (1973).
4. R. A. Kramer, M. Rosenberg and J. A. Steitz, JMB 89, 767 (1974).
5. M. Rosenberg, R. A. Kramer and J. A. Steitz, JMB 89, 777 (1974).
6. D. Ginsburg and J. A. Steitz, JBC 250, 5647 (1975).
7. H. D. Robertson, R. E. Webster and N. D. Zinder, JBC 243, 82 (1968).
8. P. Kindler, T. U. Keil and P. H. Hofschneider, Mol. Gen. Genet. 126, 53 (1973).
9. H. D. Robertson and J. J. Dunn, JBC 250, 3050 (1975).
10. R. J. Crouch, JBC 249, 1314 (1974).
11. J. J. Dunn, JBC 251, 3807 (1976).
12. J. L. Darlix, EJB 51, 369 (1975).
13. R. B. Siegel and W. C. Summers, JMB 49, 115 (1970).
14. R. W. Hyman, JMB 61, 369 (1971).
15. F. W. Studier, Science 176, 367 (1972).
16. E. G. Minkley and D. Pribnow, JMB 77, 255 (1973).
17. N. M. Simon and F. W. Studier, JMB 79, 249 (1973).
18. F. W. Studier, JMB 94, 283 (1975).
19. F. W. Studier and N. R. Movva, J. Virol. 19, 136 (1976).
20. G. Paddock and J. Abelson, Nature 246, 2 (1973).
21. J. J. Dunn and F. W. Studier, JMB 99, 487 (1975).
22. C. W. Anderson, J. F. Atkins and J. J. Dunn, PNAS 73, 2752 (1976).
23. M. Gefter, R. Hausmann, M. Gold and J. Hurwitz, JBC 241, 1995 (1966).
24. M. Chamberlin, J. McGrath and L. Waskell, Nature 228, 227 (1970).
25. J. J. Dunn, F. A. Bautz and E. K. F. Bautz, Nature 230, 94 (1971).
26. J. J. Dunn and F. W. Studier, Brookhaven Symp. Biol. 26, 267 (1975).
27. K. Hercules, M. Schweiger and W. Sauerbier, PNAS 71, 840 (1974).
28. H. Aviv, I. Boime, B. Loyd and P. Leder, Science 178, 1293 (1972).
29. M. H. Schreier, T. Staehelin, R. F. Gesteland and P. F. Spahr, JMB 75, 575 (1973).
30. T. G. Morrison and H. F. Lodish, PNAS 70, 315 (1973).
31. J. W. Davies and P. Kaesberg, J. Virol. 12, 1434 (1973).
32. J. F. Atkins, J. B. Lewis, C. W. Anderson and R. F. Gesteland, JBC 250, 5688 (1975).


The Relationship between hnRNA and mRNA

ROBERT P. PERRY, ENZO BARD,¹ B. DAVID HAMES,² DAWN E. KELLEY AND UELI SCHIBLER

The Institute for Cancer Research
Fox Chase Cancer Center
Philadelphia, Pennsylvania

I. Introduction

There is considerable evidence indicating that the precursors of eukaryotic mRNA belong to the class of relatively large molecules termed heterogeneous nuclear RNA (hnRNA). This evidence includes hybridization studies with mRNA-specific probes and with defined portions of certain viral genomes; kinetic studies with and without transcriptional inhibitors; and striking similarities in the types of posttranscriptional modifications that are characteristic of hnRNA and mRNA molecules (1). It is our purpose here to present an overview of the current concepts of hnRNA → mRNA processing and to analyze some of the major issues that have emerged from investigations of this problem. In the concluding portion we describe some of our recent studies of the 5' termini of hnRNA and mRNA, which have provided important new insights into the nature of mRNA precursors.

II. Transcriptional Units and the Physical Size of Precursors

There are two principal factors that determine whether or not a cell contains a significant quantity of oversized precursor molecules for any given mRNA species. First, the particular mRNA must be transcribed as part of a large unit, and second, cleavage of the precursor should generally not occur until after the completion of transcription. The importance of this second factor is illustrated by a comparison of ribosomal RNA production in prokaryotes and eukaryotes (Fig. 1). In prokaryotic

¹ Present address: University of Ottawa, Department of Biology, Ottawa, Ontario, Canada.
² Present address: University of Essex, Department of Biology, Wivenhoe Park, Colchester, England.

[Figure 1 diagram; labels: P16S, P23S, P5S, 16S, 23S, 5S, secondary trimming.]

FIG. 1. Schematic diagram depicting the transcription and processing of ribosomal RNA precursors. Conserved portions of precursors are shown as heavy bars; nonconserved portions are shown as thin lines. In the eukaryotic scheme, the four major cleavage sites are designated by numerals 1-4.

cells, a processing scission by RNase III separates the 5', 16 S RNA-containing portion from the remainder of the growing transcript. In eukaryotic cells, the analogous scission does not occur until after the entire transcript is released from its DNA template. This same principle may well apply to mRNA, and could account for the relative scarcity of large precursors to some of the highly abundant mRNA species (2-5). Such "precocious" processing of mRNA is suggested by electron micrographs of transcriptionally active DNA in insects (6, 7). Simultaneous processing and transcription may also characterize the production of mRNA in certain eukaryotic viruses, as was indicated by comparisons of UV target sizes and messenger lengths for the various mRNAs of vesicular stomatitis virus (8, 9). Thus, the consideration of biological importance is the size of the transcriptional unit relative to the size of the


mature, functional mRNA component, irrespective of whether a full-length transcript of this unit is ever produced. More or fewer of such transcripts could be expected depending on the relative rates of transcription and processing.

Another important consideration concerns the actual size distribution of hnRNA and mRNA molecules. Pulse-labeled hnRNA, extracted from rapidly proliferating mammalian cells (e.g., mouse L cells) by very gentle procedures that minimize degradation, and exposed to stringent denaturation conditions that discourage artifactual aggregations, displays a broad range of sizes from about 0.5 to 30 kilobases (kb) with a mass average of about 13 kb (Fig. 2A). In comparison, mRNA(An) from the polyribosomes of such cells is much smaller, ranging in size from about 0.5 to 10 kb with a mass average of about 3 kb. Such mass plots are somewhat misleading, however, because they do not reflect the actual molecular distributions of hnRNA and mRNA molecules. Approximate molecular distributions can be obtained by dividing the ordinates of a mass plot by their estimated size, based on calibrations of the abscissa with an appropriate group of markers. When this is done and plotted cumulatively (Fig. 2B), it is seen that there is a considerable overlap in the molecular size distributions. More than half of the hnRNA molecules, which have a number-average molecular weight of about 5.7 kb, are in the same size range as the mRNA [number-average molecular weight exclusive of the poly(A) of about 2.5 kb]. Since the denaturation treatment may have exposed "cryptic" nicks in regions of the molecules that were protected by secondary structure, there is a possibility that we have underestimated the true molecular sizes, especially in the case of hnRNA (10). However, these estimates are in fact consistent with the results of analyses of the 5' termini of primary transcripts and capped molecules, which suggest that some hnRNA might become mRNA with relatively little reduction in size (see below).
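The conversion from a mass plot to a molecular (number) distribution described above can be illustrated with a minimal sketch. The gradient values below are hypothetical and chosen only for illustration; they are not the data of Fig. 2, and the function names are assumptions introduced here.

```python
def number_fractions(sizes_kb, mass_fractions):
    """Convert mass fractions to number (molecular) fractions.

    Each mass ordinate is divided by the molecular size at that point,
    and the result is renormalized so the fractions sum to 1.
    """
    weights = [m / s for s, m in zip(sizes_kb, mass_fractions)]
    total = sum(weights)
    return [w / total for w in weights]


def mass_average(sizes_kb, mass_fractions):
    """Mass-weighted mean size (kb)."""
    return sum(s * m for s, m in zip(sizes_kb, mass_fractions)) / sum(mass_fractions)


def number_average(sizes_kb, mass_fractions):
    """Number-weighted mean size (kb), computed from the mass distribution."""
    return sum(mass_fractions) / sum(m / s for s, m in zip(sizes_kb, mass_fractions))


def percent_at_least(sizes_kb, fractions, threshold_kb):
    """Cumulative percent of molecules whose size is >= threshold_kb."""
    return 100.0 * sum(f for s, f in zip(sizes_kb, fractions) if s >= threshold_kb)


# Hypothetical gradient: size assigned to each fraction (kb) and its share of total cpm.
sizes = [1.0, 2.0, 5.0, 10.0, 20.0]
mass = [0.05, 0.10, 0.25, 0.35, 0.25]

numbers = number_fractions(sizes, mass)
print(round(mass_average(sizes, mass), 2))      # mass-average size, kb
print(round(number_average(sizes, mass), 2))    # number-average size, kb
print(round(percent_at_least(sizes, numbers, 5.0), 1))
```

Because small molecules contribute little mass per molecule, the number average always falls below the mass average for a heterogeneous population, which is why the mass plots overstate the size difference between hnRNA and mRNA.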

III. Sequence Properties

Hybridization studies in a variety of systems have demonstrated that both hnRNA and mRNA exist in multiple-frequency classes of varying analytical complexity [for refs. cf. Lewin (11)]. The mRNAs in the high-complexity class represent the products of about 8,000-10,000 genes and are present in only a few copies per cell, whereas those of the low-complexity class come from about fifty to a few hundred genes and are present in hundreds to thousands of copies per cell. Cross-hybridization experiments in which enzymically synthesized cDNA probes to mRNA

[Figure 2 plots, panels (A) and (B); abscissa: molecular size (kb).]

FIG. 2. Size distributions of hnRNA and mRNA from mouse L cells. (A) Two different preparations of hnRNA (circles and triangles) from cells labeled for 25 minutes with [14C]uridine were denatured by heating for 2 minutes at 60°C in 80% Me2SO and sedimented through 15 to 30% sucrose gradients in aqueous buffer containing 0.5% sodium dodecyl sulfate (36). The gradients were fractionated into 44 samples, and the cpm in each sample were assayed and divided by the total cpm on the gradient to give percent distribution of RNA mass. For one hnRNA preparation, the fractions sedimenting slower than 50 S were concentrated and sedimented on a second aqueous gradient, which allowed better resolution of smaller RNA components (inset). Polyadenylylated mRNA from polyribosomes of cells labeled for 4 hours with [3H]uridine (circles) was heated for 2 minutes in Me2SO/dimethylformamide/1 mM EDTA (1:2:1) and sedimented through 5 to 20% chloral hydrate in 99% Me2SO with [14C]rRNA markers (shown in the inset). The abscissas were converted from fraction number to molecular size in kilobases (kb) using calibration curves constructed with 18 S and 28 S rRNA, 45 S pre-rRNA and 70 S Rous sarcoma virus RNA, assumed to be 2, 5, 12.5 and 22 kb, respectively. (B) The mass distributions shown in panel (A) were converted to molecular distributions by dividing each ordinate value by its abscissa value. The data were plotted cumulatively as percent molecules greater than a particular size. The triangular symbol represents the point at which data for the gradient of <50 S hnRNA were normalized to data for large hnRNA.

are annealed with hnRNA or in which probes to hnRNA are annealed with mRNA clearly indicate sequence homology between these two classes of molecules, both for particular mRNAs of defined coding specificity and for complex mixtures of mRNA [see Perry (1) for refs.]. We have also made similar homology determinations with a probe consisting of the DNA sequences that code for the mRNA of mouse L cells (10). This probe, termed mDNA, was isolated by hybridizing single-copy DNA to a large quantity of polyadenylylated mRNA, and exhaustively purifying the hybrids. When increasing amounts of the mDNA probe were hybridized back to mRNA or to various fractions of pulse-labeled hnRNA, data such as those illustrated in Fig. 3 were obtained. These measurements indicate that the large molecules of hnRNA, whether or not polyadenylylated, and small nonadenylylated hnRNA, contain about 12-25% mRNA sequences, and that small polyadenylylated hnRNA contains about twice that proportion. Although exact quantitation is not possible owing to the difficulty of establishing the absolute level of purity of the probe, it seems clear that all size classes of hnRNA, including the large molecules, contain substantial amounts of mRNA sequences.

The sequence complexities of both frequency classes of hnRNA are from 5 to 10 times greater than those of the respective mRNA classes,

[Figure 3 plot; abscissa: concentration of complementary-sequence DNA (µg/ml); one curve is labeled S+ hnRNA.]

FIG. 3. Determination of the proportion of mRNA sequences in hnRNA. A DNA probe complementary to mRNA(An) of L cells was prepared by hybridizing single-copy L-cell DNA to excess mRNA and purifying the hybrid DNA by a multistep process (10). Increasing amounts of this "mDNA" probe were then hybridized with the following 32P-labeled RNAs: mRNA(An), large (>45 S) hnRNA containing or lacking poly(A), and small (<28 S) hnRNA containing or lacking poly(A). The data for different mRNA preparations are shown by different symbols. S+ and L- represent small hnRNA(An) and large nonadenylylated hnRNA, respectively. The data for S- and L+ hnRNA were similar to those of the L- hnRNA. In one experiment, the L- hnRNA was hybridized in the presence of a 125-fold excess of unlabeled mRNA(An).

indicating that an appreciable fraction of the hnRNA sequences are not converted to mRNA. These "non-mRNA" sequences include regions adjacent to the 3'-terminal poly(A) segments and probably other parts of the hnRNA molecules as well (12, 13). To a certain extent, one can explain such results in terms of a processing scheme analogous to that established for ribosomal RNA, in which certain parts of the hnRNA molecules are conserved and others discarded. Whether one can attribute the complexity differences entirely to mRNA processing, or whether the transcripts of some genes are confined to the nucleus without ever being processed into mRNA, remains an open question. Attempts to provide a conclusive answer have been hampered by the above-mentioned uncertainties concerning the true size of the transcriptional units and the purity of probes, and by possible compositional differences between the total transcriptional output and the steady-state hnRNA population (see below).

IV. Kinetic Considerations

In order to comprehend fully the relationship between hnRNA and mRNA, it is important to consider the kinetic parameters of these molecules. Experiments performed with a variety of cell types indicate that most mRNAs decay in the cytoplasm by a stochastic process, and that the decay constants may differ for different mRNA species within the same cell (10, 14-17). In mouse L cells growing exponentially with a doubling time of 12-13 hours, the decay of polyadenylylated mRNA can be described in terms of two mRNA classes, an α-class with a half-life of about 2 hours [mean lifetime (half-life/ln 2), t_m = 3.2 hours] and a β-class with a half-life of about 18 hours (t_m = 27 hr) (Fig. 4). In these experiments, the cells were given a 1-hour pulse with [3H]uridine and then chased with unlabeled uridine under conditions in which the pools of radioactive precursors were reduced to negligible levels within about 3 hours, as judged by the constant specific activity of the rRNA components. Earlier experiments performed under conditions of continuous labeling, in which the mRNA decay was determined by the rate of approach to constant specific activity, indicated a single decay constant for the mRNA(An) (18, 19). However, as illustrated in the inset of Fig. 4, this method is intrinsically less sensitive to the contribution of the more rapidly decaying mRNA class, so that one might fail to detect its contribution to the overall decay, although such a contribution is readily observable in pulse-chase experiments. The calculated ratio of labeled α-mRNA to labeled β-mRNA for the experiment of Fig. 4 was 3.6, which indicates a steady-state ratio of

[Figure 4 plot; abscissa: duration of chase (hr).]

FIG. 4. Decay of mRNA(An) in mouse L cells as determined by the pulse-chase method. Cultures of cells in strict exponential growth with doubling times of 14 hours were labeled for 1 hour with [3H]uridine and then chased for varying periods of time in the presence of 0.40 mM unlabeled uridine. mRNA(An) from appropriate aliquots of cells was extracted from polyribosomes, purified by chromatography on oligo(dT)-cellulose, and assayed by hybridization to poly(U)-glass-fiber filters. The ordinate represents the natural logarithm of the total radioactive mRNA in a constant volume of culture. Data from three separate experiments covering essentially the same time spans were normalized to the 10-hour point (square symbol) and plotted together. The triangular symbol represents two overlapping data points. The data were fed to a computer with instructions to obtain the best decay parameters for single (I) and two-component (II) stochastic curves, to select the better-fitting model and to provide a confidence level for the selection. The one-component model with a mean lifetime (half-life/ln 2) of 18.5 hours (dashed line) was rejected at the 1% level of significance in favor of the two-component model (solid line) with mean lifetimes of 3.2 hours (α) and 27 hours (β). The computed ratio of radioactively labeled α to β was 3.6. From this value, a steady-state ratio, α/β, of 1.2 was calculated with the help of Greenberg's equation for continuous labeling experiments (18), using 2 hours as the average labeling time. The inset shows theoretical curves for the one- and two-component models as would be obtained in a continuous labeling experiment.

α/β of 1.2. In order for the α class of mRNA to exist in approximately the same amount per cell as the β class, it must be produced about 9 times more rapidly in order to keep pace with its fast decay. This means, in turn, that the precursors of α-mRNAs, i.e., the α-hnRNAs, must be produced 9 times faster than the corresponding β-hnRNAs. Thus, in short-term labeling experiments, the α-class of both mRNA and hnRNA will normally predominate among the radioactively labeled species.
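The steady-state argument above can be checked with a short sketch. It assumes only that, at steady state, the amount of each class equals its production rate multiplied by its mean lifetime; the function names are hypothetical and introduced here for illustration. With the fitted mean lifetimes of 3.2 and 27 hours, equal steady-state amounts imply a production ratio of about 8.4, which the text rounds to "about 9".

```python
import math

TAU_ALPHA = 3.2   # mean lifetime of the alpha class, hours (value quoted in the text)
TAU_BETA = 27.0   # mean lifetime of the beta class, hours (value quoted in the text)


def mean_lifetime(half_life):
    """Mean lifetime of a first-order decay: half-life / ln 2."""
    return half_life / math.log(2)


def production_ratio(steady_state_ratio, tau_a=TAU_ALPHA, tau_b=TAU_BETA):
    """Relative production rate alpha/beta.

    Assumes steady state, where amount = production rate * mean lifetime.
    """
    return steady_state_ratio * tau_b / tau_a


def labeled_fraction_remaining(t, labeled_ratio=3.6, tau_a=TAU_ALPHA, tau_b=TAU_BETA):
    """Fraction of pulse-labeled mRNA left after t hours of chase.

    Two independently decaying classes with an initial labeled
    alpha/beta ratio of `labeled_ratio`.
    """
    a0 = labeled_ratio / (1.0 + labeled_ratio)
    b0 = 1.0 / (1.0 + labeled_ratio)
    return a0 * math.exp(-t / tau_a) + b0 * math.exp(-t / tau_b)


print(round(production_ratio(1.0), 2))   # equal steady-state amounts
print(round(production_ratio(1.2), 2))   # steady-state ratio of 1.2
for hours in (0, 2, 8, 24):
    print(hours, round(labeled_fraction_remaining(hours), 3))
```

The decay function also shows why the α class dominates the early part of a pulse-chase curve and the β class the later part, the behavior that a single-component fit cannot reproduce.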


The relative sizes of the steady-state nuclear pools of α-hnRNAs and β-hnRNAs will depend on the relative efficiencies and rates of processing of these classes of molecules. If the processing rates are similar, as suggested by kinetic data on the decay of total L-cell hnRNA (20), and if the processing efficiencies were comparable, then one would expect that the α-hnRNAs would be about 9 times more abundant than the β-hnRNAs. On the other hand, if the processing rates were different, as is the case for 18 S vs. 28 S rRNA (21), or if the processing efficiencies were different, as is the case for rRNAs and mRNAs in resting vs. growing cells (22, 23), then the steady-state quantities of α- and β-hnRNA might be more similar or even the same.

From the foregoing discussion, it should be evident that the steady-state distribution of hnRNA need not necessarily reflect that of the mRNA for which it is a precursor. Whether it does or not depends on the extent of coupling between transcription, processing and turnover rates. These considerations, which are applicable to any determination of the structural similarities between kinetically complex mixtures of molecules, have important consequences for the evaluation of sequence comparisons between mRNA and hnRNA populations. For example, one might conceive of situations in which a very abundant and relatively stable species of mRNA is transcribed and processed at rates similar to that of other less abundant, more labile species. In such a case, the hnRNA population would not be particularly enriched in precursors of this abundant mRNA species.

The complexities of such a multiple-component system also make it almost impossible to arrive at unique and definitive conclusions about hnRNA → mRNA processing from kinetic measurements of the synthesis and turnover of a common element, such as poly(A). Any set of measurements can usually be explained by a variety of alternative models (17, 24, 25), and, in the case of poly(A), the situation is further complicated by the existence of an independent cytoplasmic system capable of incorporating adenylate into poly(A) (26, 27). Nevertheless, from all the various studies of poly(A) metabolism, it seems reasonable to conclude that although poly(A) is conserved to a greater extent than are total nucleotides in the processing of hnRNA (28, 29), there is also a measurable poly(A) turnover, which may vary according to the proliferative state of the cell (12, 13, 23, 24, 29, 30).

A further point to be emphasized in evaluating the relationship between hnRNA and mRNA is the distinction between quantitative and qualitative selection. Quantitative selection may be a means of determining the number of copies of a particular mRNA molecule by modulating


the efficiency of processing. An example of reduced processing efficiency is seen in the so-called "wastage" of ribosomal and messenger precursors in nongrowing cells (21-23). If this concept is extended to include a selective diminution in the efficiency of processing of a particular class of transcripts, which may even approach all-or-none proportions (31, 32), it then provides an example of qualitative selection. Structural and compositional comparisons provide information only about qualitative selection, whereas kinetic measurements will monitor both types.

V. Studies of the 5' Termini of hnRNA and mRNA

Recent studies of the 5' termini of hnRNA and mRNA have caused us to revise some of our ideas about the location of mRNAs within the primary transcription products and have given us some new insights into possible pathways of mRNA processing. From the common occurrence of poly(A) on the 3'-OH ends of most mRNAs and some large hnRNA molecules, as well as from other considerations (cf. reviews 33-35 for references), it was generally believed that the mRNA segments were located at the 3' ends of the hnRNA molecules. However, the finding of similar modified "cap" structures at the 5' termini of large hnRNA and mRNA (36) led us to question this original idea, and together with more recent analyses of the phosphorylated termini in hnRNA has indicated that some mRNAs may even be derived from the 5' portions of initial transcripts.

The general structure of the 5' terminal caps [m7G(5')pppN'mN''(m)p] has been covered elsewhere in this symposium.³ The one important difference between the cap structures of hnRNA and mRNA is that the mRNA caps may sometimes contain a 2'-O-methylated nucleotide at position N'', whereas in hnRNA the 2'-O-methylation is found only at position N' (36). This feature is illustrated in Fig. 5, which compares the chromatographic profiles on DEAE-Sephadex (urea) columns of methyl-labeled cap derivatives. After removal of the 3'-phosphates with alkaline phosphatase, the mRNA caps are readily resolved into two distinct peaks corresponding to the cap I (m7GpppN'm-N'') and cap II (m7GpppN'm-N''m-N''') structures, whereas the hnRNA caps exhibit only type I structures. The reason for this difference became apparent when it was found that the 2'-O-methylation at position N'' occurs as a secondary modification after the mRNA has entered the cytoplasm (37, 38).

³ Discussed in Part I of this volume.

[Figure 5 chromatograms, upper and lower panels; abscissa: fraction number.]

FIG. 5. Analyses by DEAE-Sephadex chromatography of alkaline phosphatase-treated cap oligonucleotides of hnRNA (hn) and mRNA(An) (m). Preparations of >32 S hnRNA and mRNA(An) were obtained from L cells labeled for 1 hour with [methyl-3H]methionine. Cap oligonucleotides isolated by discontinuous elution on DEAE-Sephadex from KOH hydrolyzates of the RNAs were treated with alkaline phosphatase and rechromatographed on DEAE-Sephadex (urea) columns together with suitable charge markers (36). The horizontal bar in the upper panel marks the hnRNA cap I structure; the bars designated m(I) and m(II) in the lower panel represent the mRNA cap I and cap II structures, respectively.

The distribution of the five types of modified nucleotides (Am, m6Am, Gm, Um and Cm) that can occupy position N' in the cap I structures of the various hnRNAs was found to be remarkably similar to that observed for the cap I structures of the mRNA (36). Since the distributions are clearly nonrandom, these results suggested that the hnRNA caps might indeed be precursors of the mRNA caps, and consequently that at least some mRNA might be derived from 5' portions of hnRNA. This structural similarity in 5' termini was extended when we observed that an additional sequence characteristic, namely the spectrum of segment lengths between the cap and the first nonmodified guanylate residue, was again nonrandom, and indistinguishable in hnRNA and mRNA (U. Schibler et al., unpublished).

Further indication of a precursor-product relationship between 5' portions of hnRNA and mRNA has come from a pulse-chase kinetic analysis

of methyl-labeled cap structures (37). All the labeled cap structures in >50 S hnRNA from cells pulsed for 15 minutes with [methyl-3H]methionine are chased out of the nucleus by 3 hours (Fig. 6). Owing to the rapid equilibration of the S-adenosylmethionine pools (39), the flow of radioactivity from the hnRNA can be observed almost immediately after the start of the chase; the undesirable delay encountered in nucleoside pulse-chase experiments is thus avoided. During the first half-hour of the chase, essentially all the radioactive cap material lost from the hnRNA can be accounted for by labeled cap material appearing in the cytoplasmic mRNA (Table I). In this experiment, we measured both cap-I and cap-II radioactivities for the cytoplasmic mRNA, but have excluded the contribution of the cytoplasmic 2'-O-methylation at position N'' of cap-II by measuring only the radioactivity in the "cores" produced by digesting

[Figure 6 chromatograms, including a panel labeled "3 HR CHASE"; abscissa: fraction number.]

FIG. 6. Relative amounts of radioactive m6A and cap in >50 S hnRNA from cells labeled for 15 minutes with [3H]methionine and chased for varying periods with unlabeled methionine. Chromatography on DEAE-Sephadex (urea) of RNA digested with T2 ribonuclease (37).

TABLE I
LABELING OF METHYLATED COMPONENTS IN CELLS PULSED FOR 15 MINUTES WITH [3H]METHIONINE AND CHASED FOR VARYING PERIODS WITH AN EXCESS OF UNLABELED METHIONINE (a)

                           Duration   Amount of label (cpm)   Ratio      Label as     Net accumulation,
                           of chase   ---------------------   cap I:     m7GpppN'm    cytoplasm or
RNA fraction               (hr)       m6A        cap          cap II     (cpm) (b)    nucleus (cpm) (c)

mRNA(An) (polyribosomal)   0          3820       3607         0.56       1320         -
                           0.5        8530       8870         2.8        7310         10,800 (+)
                           1.5        5660       8850         2.7        7440         11,020 (+)
                           3.0        3080       6350         2.3        5530         7550 (+)
hnRNA >50 S                0          7750       3140
hnRNA 18-50 S              0          24,000     16,900                  20,040       -
hnRNA >50 S                0.5        3400       1880
hnRNA 18-50 S              0.5        12,500     6390                    8270         11,770 (-)
hnRNA, total               1.5        3400       3680                    3680         16,360 (-)
hnRNA >50 S                3.0        1150       0
hnRNA 18-50 S              3.0        3850       1170                    1170         18,870 (-)

(a) Labeling was done in the presence of 90 ng/ml actinomycin. Chasing was done in normal growth medium.
(b) For mRNA the values are calculated as the total cpm in cap I plus the fraction of cap II label which is in m7GpppN'm. The hnRNA values represent the sum of cap radioactivity in all size fractions >18 S.
(c) The net accumulation is the amount of m7GpppN'm at any particular chase time minus the amount at the beginning of the chase. For the cytoplasm the values for the polyribosomal mRNA are multiplied by 1.8 to account for the fact that the mRNA(An) fraction contains 56% of the total cytoplasmic mRNA. (+) represents a net gain, (-) represents a net loss.
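The bookkeeping behind the last column of Table I can be reproduced with a small sketch. The cpm values are those read from Table I for the m7GpppN'm label; the dictionary layout and function name are assumptions made for illustration. The factor of 1.8 for the cytoplasm follows footnote c of the table.

```python
CYTOPLASM_FACTOR = 1.8  # scales polyribosomal mRNA(An) to total cytoplasmic mRNA (footnote c)

# Label as m7GpppN'm (cpm) at each chase time (hr), as read from Table I.
mrna_core_cpm = {0.0: 1320, 0.5: 7310, 1.5: 7440}
hnrna_cap_cpm = {0.0: 20040, 0.5: 8270, 1.5: 3680, 3.0: 1170}


def net_change(series, t, factor=1.0):
    """Net accumulation at chase time t relative to the start of the chase."""
    return (series[t] - series[0.0]) * factor


gain = net_change(mrna_core_cpm, 0.5, CYTOPLASM_FACTOR)   # cytoplasmic gain
loss = -net_change(hnrna_cap_cpm, 0.5)                     # nuclear loss
print(round(gain), loss, round(gain / loss, 2))
```

On these figures, the cytoplasmic gain during the first half-hour of the chase is roughly nine-tenths of the nuclear loss, which is the basis for the statement that essentially all of the cap material leaving the hnRNA reappears in cytoplasmic mRNA.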


cap-II structures with penicillium nuclease and alkaline phosphatase [m7GpppN'm-N''mp → m7GpppN'm ("core") + N''m]. In addition to cap methylation, 3-6 internal adenylate residues of hnRNA are methylated at the N6 position of the base. After a 30-minute chase, about half of the m6A residues could be accounted for; the remainder was presumably located on portions of the hnRNA that were degraded during processing. As the chase period is extended, the labeled cap material in the mRNA eventually diminishes owing to mRNA turnover. There is also a gradual shift toward a higher proportion of cap-II structures as the cap-I-terminated mRNAs continue to receive the secondary 2'-O-methylation at position N''.

These pulse-chase data for methyl-labeled cap were compared to the predictions of a theoretical model based on total cap conservation and on previously determined half-lives of 2 hours for the α-class of mRNA and 23 minutes for L-cell hnRNA (20). As may be seen by comparing the dashed and solid curves in Fig. 7, the agreement was rather good. The actual decay of hnRNA cap is somewhat slower than that predicted for a single kinetic component with a 23-minute half-life, suggesting the existence of a second more stable class of capped hnRNA, possibly that associated with the β-class of mRNA. The foregoing data, which suggest a precursor-product relationship

FIG. 7. Flow of methyl label from hnRNA caps to mRNA caps: comparison of simple model with experimental data. The data for hnRNA (△) and total cytoplasmic mRNA (○) from Table I are plotted as $hn/hn_0$ and $m/hn_0$, respectively, where $hn$ represents the total cpm in cap I of hnRNA, and $m$ the net accumulation of cpm in the m7GpppN'm portions of both caps of cytoplasmic mRNA. $hn_0$ is the cpm in the hnRNA cap at the beginning of the chase. The theoretical curves (dashed lines) are derived according to the model $hn \xrightarrow{k_1} m \xrightarrow{k_2}$, which predicts $hn/hn_0 = e^{-k_1 t}$ and $m/hn_0 = k_1 (e^{-k_2 t} - e^{-k_1 t})/(k_1 - k_2)$. The values of $k_1$ (1.8 hr$^{-1}$) and $k_2$ (0.35 hr$^{-1}$) correspond to half-lives of 23 minutes for hnRNA and 2 hours for the rapidly decaying class of mRNA. Abscissa: duration of chase (hr).
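The two-step model in the legend to Fig. 7 can be evaluated numerically. The following is a minimal sketch, assuming only the first-order scheme and the two half-lives quoted in the legend; the function names and the chosen time points are illustrative and do not come from the original work.

```python
import math


def rate_constant(half_life_hr):
    """First-order rate constant (hr^-1) from a half-life in hours."""
    return math.log(2) / half_life_hr


def hn_fraction(t, k1):
    """Fraction of the initial hnRNA cap label remaining at time t (hr)."""
    return math.exp(-k1 * t)


def m_fraction(t, k1, k2):
    """Net label in mRNA caps at time t (hr), relative to the initial hnRNA cap label."""
    return k1 * (math.exp(-k2 * t) - math.exp(-k1 * t)) / (k1 - k2)


# Half-lives quoted in the legend: 23 minutes (hnRNA) and 2 hours (mRNA).
k1 = rate_constant(23 / 60)
k2 = rate_constant(2.0)

if __name__ == "__main__":
    print(f"k1 = {k1:.2f} per hr, k2 = {k2:.2f} per hr")
    for t in (0.0, 0.5, 1.0, 2.0, 3.0):  # illustrative chase times in hours
        print(f"t = {t:.1f} hr  hn/hn0 = {hn_fraction(t, k1):.3f}  m/hn0 = {m_fraction(t, k1, k2):.3f}")
```

Under this scheme the mRNA curve rises while the hnRNA label is being converted and then declines at the slower rate, which is the qualitative behavior described in the text for long chase periods.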


ROBERT P. PERRY ET AL.

between the 5'-capped termini of hnRNA and mRNA, prompted us to examine the other types of 5' termini of hnRNA in order to relate the cap structures to primary transcription products (5' triphosphate termini) and other possible processing products, e.g., mono- and diphosphate termini (39a). For these experiments we used total hnRNA >18 S from L cells labeled for 30 minutes with 32Pi, and separated the 5'-capped termini from internal nucleotides by chromatography on acetylated dihydroxyborylaminoethyl-(DBAE)-cellulose (40, 41) (Fig. 8). The various phosphorylated 5' termini were further separated by chromatography on diethylaminoethyl-(DEAE)-cellulose (Fig. 9) and submitted to compositional analysis on polyethyleneimine (PEI)-cellulose thin-layer plates. These measurements allowed us to compare the composition of pppN'..., ppN'..., pN'... and cap N'm of m7GpppN'm termini. The results of such compositional analyses (Table II) indicate that all hnRNA molecules are initiated by purines, and consequently that the monophosphorylated pyrimidine termini and the cap termini containing

FIG. 8. Chromatography of a ribonuclease T2 digest of hnRNA on DBAE-cellulose. The hnRNA from cells labeled for 30 minutes with [32P]orthophosphate was denatured with 80% Me2SO at 60°C and subjected to sucrose-gradient centrifugation. The sedimentation profile is shown in the inset (○) together with the profile of ribosomal RNA run on a parallel gradient (●). Molecules sedimenting faster than 18 S rRNA were collected and digested exhaustively with ribonuclease T2. The digest was diluted with 10 volumes of DBAE-application buffer (total volume 3.3 ml) (41) and applied to a 0.4 x 4 cm DBAE-cellulose column. After the column was washed with 60 ml (30 fractions) of application buffer, the retained material was eluted with 1 M sorbitol in application buffer. Beginning with fraction 11 the counts were multiplied by 50 so that they could be plotted together with nonretained eluate. Abscissa: fraction number.

FIG. 9. Analysis of DBAE-nonretained and retained nucleotides on DEAE-Sephadex columns. The nucleotides not retained by DBAE (Fig. 8, fractions 1-6) and those retained by DBAE (Fig. 8, fractions 31-36) were diluted, adjusted to 20 mM TrisCl pH 7.4, 4 mM EDTA and 7 M urea with stock solutions and bound to a 0.4 x 50 cm DEAE-Sephadex column. Nucleotides were eluted with a 160-ml gradient of 0.1-0.5 M NaCl in 20 mM TrisCl, pH 7.4, 7 M urea. In all separations, a ribonuclease A digest of 3H-labeled rRNA was included as a series of charge markers. (A) Nucleotides not retained by DBAE-cellulose; (B) nucleotides retained by DBAE-cellulose; (C) alkaline phosphatase digest of nucleotides eluting at -5 as in panel (A); (D) nuclease S1 digest of nucleotides eluting at -5 in panel (A); (E) alkaline phosphatase digest of nucleotides eluting at -5 as in panel (B). In panel (A) the peak near -2 has been plotted on a 400-fold reduced scale (○). Abscissa: fraction number.

2'-O-methylpyrimidine must originate by cleavages at internal sites of hnRNA molecules. The relatively low proportions of G and A in the pN'... termini, compared to their predominance in caps and diphosphate termini, suggest that the capped termini might be derived from the 5'-initiated portions of the hnRNA as well as from internal cleavage sites. This would imply that mRNAs can be located both at transcriptionally initiated and internal portions of the precursor molecules. The prevalence of pU at the internal cleavage sites is noteworthy. The analysis of 5' termini also offers a clue as to the nature of the capping reactions. The ppN'... termini, which have a composition similar to that of the caps and quite distinct from the compositions of either the pppN'... or the pN'... termini, could be readily understood


Distribution (% radioactivity)

Base at                                pN' (a)               N'm (cap)
position N'    pppN'        ppN'    pN'p    pN'    m7GpppN'm (32P-labeled)    [3H]CH3-cap (b)
Guanine        62           44      6       13     35                         [illegible]
Adenine        36           29      17      18     36                         22
Cytosine       [illegible]  7       21      39     11                         11
Uracil         [illegible]  20      56      30     17                         [illegible]

a The composition of pN' termini was measured as the 3',5'-bisphosphate (pNp) and as the 5'-phosphate (pN) in order to assess the effects of varying specific activities of precursor pools. Inclusion of the 3' nearest-neighbor phosphate, which tends to diminish the effect of precursor specific activity, results in lower G:A and C:U ratios, indicating that the molar composition of G and C may be somewhat overestimated in these determinations.
b Taken from compositions of methyl-labeled N'm constituents of caps of hnRNA [Table VI (37)]. Values are adjusted to account for 1.8 mol of methyl per 1 mol of A in m6Am (45).
Qualitatively similar results were obtained in another experiment, i.e., purine/pyrimidine for pppN, ppN and pNp were >20, 1.7, and 0.51, respectively.

in terms of the scheme depicted in Fig. 10. According to this model, the diphosphorylated termini, formed both by phosphorylation of the monophosphate termini and dephosphorylation of the triphosphate termini, would be capped by the mechanism described for vaccinia and reovirus in which GTP condenses with a diphosphorylated terminus to yield cap plus pyrophosphate (42, 43). Although this model necessarily has to invoke a hitherto unknown enzyme, a nucleotide 5'-monophosphate kinase

FIG. 10. Model for processing and capping of L-cell hnRNA. The hnRNA molecules terminated by nucleotide 5'-diphosphates are derived from two sources: from dephosphorylation of initial transcripts (pppN'p... → Pi + ppN'p...; N' = purine), and from cleavage and subsequent phosphorylation (pN'p... + Pi → ppN'p...). The diphosphate termini are then condensed with GTP and methylated by S-adenosylmethionine (AdoMet) to form the cap (m7GpppN'm-N''p...). (R = purine nucleoside.)


with polynucleotide specificity, it seems the most reasonable way to explain the existence of the pyrimidine diphosphates, as well as the paucity of pG... termini in spite of the high proportion of G at the N' position of caps. A mechanism like that described for vesicular stomatitis virus, i.e., one involving Gppp + pN' → GpppN' + Pi (44), could in principle also be used for capping the termini produced by the internal cleavages. However, with this mechanism there would be no obvious explanation for the existence of pyrimidine diphosphates.
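The comparison of terminus compositions that underlies this argument can be made explicit by computing purine/pyrimidine ratios. The sketch below is illustrative only: it uses the three columns of Table II whose four values are all legible in this copy (ppN', pN'p and pN'), and the dictionary layout and function name are assumptions, not part of the original analysis.

```python
# Percent of radioactivity at position N', as read from Table II.
# Only columns with all four values legible are included.
TABLE_II = {
    "ppN'": {"G": 44, "A": 29, "C": 7, "U": 20},
    "pN'p": {"G": 6, "A": 17, "C": 21, "U": 56},
    "pN'": {"G": 13, "A": 18, "C": 39, "U": 30},
}


def purine_pyrimidine_ratio(distribution):
    """Ratio of purine (G + A) to pyrimidine (C + U) radioactivity."""
    purines = distribution["G"] + distribution["A"]
    pyrimidines = distribution["C"] + distribution["U"]
    return purines / pyrimidines


if __name__ == "__main__":
    for terminus, distribution in TABLE_II.items():
        ratio = purine_pyrimidine_ratio(distribution)
        print(f"{terminus}: purine/pyrimidine = {ratio:.2f}")
```

On these numbers the diphosphate termini are purine-rich while the monophosphate termini are pyrimidine-rich, in line with the contrast drawn in the text.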

VI. Summary

Several types of structural and kinetic evidence indicate that hnRNA contains mRNA precursors in a variety of forms. In some cases the precursors consist of hnRNA molecules that are considerably larger than the mature mRNA. In other cases the size of the pre-mRNA is similar to that of the mature mRNA, either because processing events have closely followed transcription, or because the transcriptional units are not significantly larger than the mature mRNA. The mRNA segments may be located at the 5' termini of initial transcripts as well as at internal regions that are exposed by appropriate processing cleavages. Nuclear processing of the mRNA precursors is completed by cap formation on the 5' terminus and, for most mRNA species, methylation of internal adenylate residues and polyadenylylation of the 3'-OH terminus. The cellular content of any particular mRNA species and its precursors is determined by the relative rates of transcription, processing and turnover.

ACKNOWLEDGMENTS

Research was supported by a grant to Dr. R. P. Perry from the National Science Foundation, grants from the U.S. Public Health Service, and an appropriation from the Commonwealth of Pennsylvania. We also gratefully acknowledge fellowship awards from the Damon Runyon Memorial Fund to E. Bard, from the Science Research Council of Great Britain to B. D. Hames, and from the Swiss Science Foundation to U. Schibler.

REFERENCES

1. R. P. Perry, ARB 45, 605 (1976).
2. T. Imaizumi, H. Diggelmann and K. Scherrer, PNAS 70, 1122 (1973).
3. G. S. McKnight and R. T. Schimke, PNAS 71, 4327 (1974).
4. P. M. Lizardi, This volume, p. 301.
5. B. Daneholt et al., This volume, p. 319.
6. C. D. Laird and W. Y. Choi, Chromosoma, in press.
7. S. L. McKnight, N. L. Sullivan and O. L. Miller, Jr., This volume, p. 313.
8. L. A. Ball and C. N. White, PNAS 73, 442 (1976).
9. R. J. Colonno, G. Abraham and A. K. Banerjee, This volume, p. 83.
10. R. P. Perry, E. Bard, B. D. Hames and D. E. Kelley, in "Proceedings of the 9th FEBS Meeting" (E. J. Hidvegi, J. Sumegi and P. Venetianer, eds.), Vol. 33, pp. 17-34. Akademiai Kiado, Budapest, 1975.
11. B. Lewin, Cell 4, (1975).
12. M. J. Getz, G. D. Birnie, B. D. Young, E. MacPhail and J. Paul, Cell 4, 121 (1975).
13. R. C. Herman, J. G. Williams and S. Penman, Cell 7, 429 (1976).
14. R. H. Singer and S. Penman, JMB 78, 321 (1973).
15. M. E. Buckingham, D. Caput, A. Cohen, R. G. Whalen and F. Gros, PNAS 71, 1466 (1974).
16. A. Spradling, H. Hui and S. Penman, Cell 4, 131 (1975).
17. L. Puckett, S. Chambers and J. E. Darnell, PNAS 72, 389 (1975).
18. J. Greenberg, Nature 240, 102 (1972).
19. R. P. Perry and D. E. Kelley, JMB 79, 681 (1973).
20. B. P. Brandhorst and E. H. McConkey, JMB 85, 451 (1974).
21. R. P. Perry, Nat. Cancer Inst. Monogr. 18, 325 (1965).
22. H. L. Cooper and E. M. Gibson, JBC 246, 5059 (1971).
23. L. F. Johnson, H. T. Abelson, H. Green, J. Williams and S. Penman, Cell 4, 89 (1975).
24. R. P. Perry, D. E. Kelley and J. LaTorre, JMB 82, 315 (1974).
25. B. P. Brandhorst and E. H. McConkey, PNAS 72, 3580 (1975).
26. I. Slater, D. Gillespie and D. W. Slater, PNAS 70, 406 (1973).
27. J. Diez and G. Brawerman, PNAS 71, 4091 (1974).
28. W. Jelinek, M. Adesnik, M. Salditt, M. Sheiness, R. Wall, G. Molloy, L. Phillipson and J. E. Darnell, JMB 75, 515 (1973).
29. J. LaTorre and R. P. Perry, BBA 335, 93 (1974).
30. R. S. Wu and F. H. Wilt, Develop. Biol. 41, 352 (1974).
31. J. Paul, R. S. Gilmour, P. R. Harrison, J. Windass and N. Affara, in "Proceedings of the 9th FEBS Meeting" (E. J. Hidvegi, J. Sumegi and P. Venetianer, eds.), Vol. 33, pp. 447-459. Akademiai Kiado, Budapest, 1975.
32. S. Humphries, J. Windass and R. Williamson, Cell 7, 267 (1976).
33. G. Brawerman, ARB 43, 621 (1974).
34. J. R. Greenberg, J. Cell Biol. 64, 269 (1975).
35. J. E. Darnell, W. R. Jelinek and G. R. Molloy, Science 181, 1215 (1973).
36. R. P. Perry, D. E. Kelley, K. H. Friderici and F. M. Rottman, Cell 6, 13 (1975).
37. R. P. Perry and D. E. Kelley, Cell 8, 433 (1976).
38. F. M. Rottman, R. C. Desrosiers and K. Friderici, This volume, p. 21.
39. H. Greenberg and S. Penman, JMB 21, 527 (1966).
39a. U. Schibler and R. P. Perry, Cell 9, 121 (1976).
40. Y. Furuichi, A. Shatkin, E. Stavnezer and J. Bishop, Nature 257, 618 (1975).
41. T. F. McCutchan, P. T. Gilham and D. Soll, NARes 2, 853-863 (1975).
42. B. Moss, S. A. Martin, M. J. Ensinger, R. F. Boone and C.-M. Wei, This volume, p. 63.
43. Y. Furuichi, S. Muthukrishnan, J. Tomasz and A. J. Shatkin, This volume, p. 3.
44. G. Abraham, D. P. Rhodes and A. K. Banerjee, Cell 5, 51 (1975).
45. R. P. Perry, D. E. Kelley, K. Friderici and F. M. Rottman, Cell 4, 387 (1975).

A Comparison of Nuclear and Cytoplasmic Viral RNAs Synthesized Early in Productive Infection with Adenovirus 2

HESCHEL J. RASKAS AND ELIZABETH A. CRAIG*
Department of Pathology
Washington University School of Medicine
St. Louis, Missouri

During the early phase of infection with adenoviruses, prior to the onset of viral DNA synthesis, a limited portion of the genome is expressed as cytoplasmic mRNA. The genome sites coding for the adenovirus 2 mRNAs have been determined by hybridization of viral RNA with specific viral DNA fragments (1-5). The segments of the genome coding for individual viral mRNA species are shown in Fig. 1; this figure also shows the set of viral DNA fragments produced by cleavage with endo R·Eco RI (6). Four regions of the viral genome code for early messenger RNA, two regions from each strand. The DNA strand that generates rightward transcripts is designated the r strand. The extreme left-hand end of this strand is transcribed to give three RNAs, two 13 S and one 11 S species. Two mRNAs, one 19 S and one 13 S, are products of the region corresponding to Eco RI fragments D and E. The l strand yields mRNAs from regions of the genome included in Eco RI fragments C and B. A 20 S RNA is specified by the sequences in Eco RI-C, and 19 S and 11 S RNAs are detected by hybridization with Eco RI-B. To compare the nuclear and cytoplasmic RNAs transcribed from early genes, nuclear RNA was prepared from cultures labeled with [3H]uridine, and cytoplasmic RNA was obtained from cultures incubated in the presence of 32P. Since the cytoplasmic viral RNAs are polyadenylylated (7, 8), the poly(A)-containing molecules were isolated from each of these preparations. The nuclear and cytoplasmic RNAs were mixed and subjected to electrophoresis in gels containing 98% formamide. The RNA from each gel slice was eluted and then hybridized to specific DNA fragments. This procedure allows a direct comparison of the size distribution of nuclear and cytoplasmic viral RNAs transcribed from specific genes.

* Present address: Department of Microbiology, University of California, San Francisco, California 94101.

FIG. 1. Genome location of cytoplasmic adenovirus 2 RNAs synthesized early in productive infection. The genome sites coding for cytoplasmic viral RNA species synthesized early in infection were established by hybridization of size-fractionated RNA to specific adenovirus 2 DNA fragments (3-5, 9). The five adenovirus 2 DNA sites cleaved by endo R·Eco RI are indicated by the dashed vertical lines (6); the six fragments produced have the genome order A, B, F, D, E, C (13). The space between each pair of small vertical lines represents 10% of the genome, and the length of the RNAs is proportional to their molecular weights as calculated from electrophoretic mobility. The polarity of the strands was determined by analysis of the DNA termini (1).

In one experiment two adjacent fragments, Eco RI-B and F, were used as hybridization probes (Fig. 2). Hybridization with Eco RI-B DNA revealed the 19 S and 11 S cytoplasmic viral RNAs specified by this fragment. There was no detectable hybridization of cytoplasmic RNA to Eco RI-F DNA; this fragment codes exclusively for late genes. The profiles of hybridizable nuclear RNA were considerably different. Nuclear sequences specified by Eco RI-B included two peaks that were larger than the cytoplasmic viral RNAs, one migrating as 25 S RNA and a smaller but reproducible peak migrating at the rate expected for 28 S RNA. Additional molecules migrated between the 11 S and 19 S cytoplasmic RNAs. The profile of Eco RI-F transcripts included two peaks that have the same migration rate as the 28 S and 25 S RNAs that hybridized to Eco RI-B DNA. From this and other observations (4), we suggest that the 28 S nuclear RNAs detected by hybridization to Eco RI-B and F DNAs are actually a single molecular species that includes transcripts from both these fragments. To compare nuclear and cytoplasmic transcripts from a different region of the genome, Eco RI fragments D and E were utilized (Fig. 3). At early times in infection these two adjacent fragments code for 19 S and 13 S mRNAs, which are transcribed from the r strand. The 13 S RNA that hybridizes to Eco RI-D and E is a single species transcribed from segments of both of these contiguous fragments. Hybridization of the nuclear RNA revealed a discrete peak larger than the cytoplasmic mRNAs;


FIG. 2. Hybridization of poly(A)-containing nuclear and cytoplasmic RNAs to Eco RI-B and F DNA. Cytoplasmic [32P]RNA and nuclear [3H]RNA were fractionated by electrophoresis, and the RNA eluted from gel slices was hybridized to 0.5-μg equivalents of Eco RI-B (upper panel) and F (lower panel) DNA. For experimental details see Craig and Raskas (9).

FIG. 3. Hybridization of poly(A)-containing nuclear and cytoplasmic RNAs to Eco RI-D and E DNA. Cytoplasmic [32P]RNA and nuclear [3H]RNA were fractionated by electrophoresis, and the RNA from gel slices was hybridized to 0.5-μg equivalents of Eco RI-D (upper panel) and E (lower panel) DNA. For experimental details see Craig and Raskas (9).

25 S RNA hybridized to both Eco RI-D and Eco RI-E. In this instance also, it seems likely that a single 25 S molecule includes transcripts from the two adjacent fragments, D and E. For each of the four regions of the genome that code for early mRNA, we have detected nuclear transcripts that are larger than the cytoplasmic mRNAs (9). To determine the sequence relationship between the larger nuclear RNAs and cytoplasmic mRNA, hybridization-inhibition experiments were performed. In such experiments the relatedness of two RNA preparations can be determined by the ability of an unlabeled RNA to prevent the hybridization of a second (radioactive) RNA preparation. The procedure used for these experiments is outlined in Fig. 4.


Nuclear RNA was labeled with [3H]uridine and the poly(A)-containing molecules were purified and subjected to electrophoresis in formamide gels. The RNA eluted from each gel slice was then hybridized to three membranes, each containing the same DNA fragment. Each of these membranes had been incubated with a large excess of nonradioactive RNA, either nuclear RNA from infected cells, cytoplasmic RNA from infected cells, or RNA from uninfected cells. Thus, if a nuclear peak contains sequences identical to those present in cytoplasmic RNA, its hybridization should be prevented by the prehybridization with nonradioactive cytoplasmic RNA.


FIG. 4. Procedure for hybridization-inhibition studies with [3H]RNA fractionated by size. Following electrophoresis, the [3H]RNA was eluted from each gel slice and then annealed to three filters. These filters contained the same DNA, but had been incubated with saturating amounts of three different nonradioactive RNA preparations, as shown. For the actual experiments, the [3H]RNA from each gel slice was divided into three aliquots, and each aliquot was hybridized with one of the previously incubated membranes (4, 9).

FIG. 5. Ability of early cytoplasmic RNA to inhibit hybridization of nuclear RNA transcribed from Sma I-E DNA. Poly(A)-containing nuclear [3H]RNA was fractionated by electrophoresis, and the RNA eluted from each gel slice was analyzed by hybridization-inhibition studies as outlined in Fig. 4. One aliquot was hybridized to filters previously hybridized with whole-cell RNA from uninfected cells (○-○); a second aliquot was incubated with a filter similarly hybridized with nuclear RNA from cultures harvested 6 hours after infection (●-●); the third aliquot was hybridized to filters already annealed with early cytoplasmic RNA (symbol illegible).
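The quantity read off from such an experiment is the percent inhibition of hybridization on a competitor-saturated filter relative to the control filter saturated with RNA from uninfected cells. The following is a minimal sketch of that arithmetic; the function name and all cpm values are hypothetical and were chosen only to mirror the inhibition levels reported in the text for the 22 S peak.

```python
def percent_inhibition(control_cpm, competitor_cpm):
    """Percent reduction in hybridized cpm relative to the control filter."""
    if control_cpm <= 0:
        raise ValueError("control_cpm must be positive")
    return 100.0 * (control_cpm - competitor_cpm) / control_cpm


# Hypothetical counts for one gel slice (not data from the chapter).
control = 1000.0          # filter presaturated with RNA from uninfected cells
with_nuclear = 100.0      # filter presaturated with nuclear RNA from infected cells
with_cytoplasmic = 500.0  # filter presaturated with early cytoplasmic RNA

if __name__ == "__main__":
    print(f"nuclear competitor: {percent_inhibition(control, with_nuclear):.0f}% inhibition")
    print(f"cytoplasmic competitor: {percent_inhibition(control, with_cytoplasmic):.0f}% inhibition")
```

A peak that is strongly inhibited by nuclear RNA but only partly inhibited by cytoplasmic RNA is the pattern interpreted in the text as evidence for sequences that remain in the nucleus.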


An example of this type of experiment is presented in Fig. 5. These hybridizations were performed with an adenovirus 2 DNA fragment (Sma I-E) that includes sequences present in the left-hand 3-11% of the genome (Mulder, Greene and Delius, in preparation). These sequences code for two early cytoplasmic RNAs that migrate in electrophoresis as 13 S and 11 S RNAs (see Fig. 1). Polyadenylylated nuclear RNA transcribed from Sma I-E includes 22 S RNA as well as 11-13 S molecules. When nuclear [3H]RNA was hybridized with Sma I-E DNA which had been preincubated with RNA from uninfected cells, the profile was identical to that seen in the absence of prehybridization. Hybridization to filters annealed in advance with nuclear RNA was reduced about 90%, as expected when the filters are pretreated with a homologous RNA. The filters presaturated with cytoplasmic RNA yielded a very different profile of hybridization. Hybridization of the 11-13 S molecules was reduced approximately 75-80%. Hybridization of the larger 22 S peak was inhibited about 50%. This result demonstrates that the 22 S peak contains sequences transcribed from the same strand and the same region of the strand as the early cytoplasmic RNA. That the inhibition is only 50% suggests that the 22 S RNA contains sequences present in the nuclear transcripts but not transported to the cytoplasm. Similar results were obtained when hybridization-inhibition experiments were performed to analyze the larger nuclear RNAs transcribed from the other regions of the genome coding for early mRNA (Craig, Sayavedra and Raskas, in preparation). These findings are compatible with previous studies demonstrating the existence of nuclear-specific transcripts early in infection (1, 3, 10). Our results to date are summarized in Fig. 6. The upper part of the figure shows the relevant cleavage sites for the three restriction endonucleases used in these studies. The lower part of the figure compares the early cytoplasmic RNAs with the larger nuclear RNAs. Each region of the genome coding for early mRNA also specifies a polyadenylylated nuclear RNA large enough to include the sequences in the mRNAs as well as additional sequences. By comparing molecular weights and also from the results of the hybridization-inhibition studies, we can calculate the percent of the nucleotides in these RNAs that are nuclear-specific; for example, only 15% of the large nuclear transcript from Eco RI-C DNA appears to be restricted to the nucleus, whereas nearly 50% of the large RNA species from Eco RI-B is nuclear-specific. From the l strand there is a 22 S nuclear RNA specified by Eco RI-C DNA as compared to the 20 S cytoplasmic species, and a 28 S nuclear RNA from Eco RI fragments B and F as compared to the 19 S and 11 S cytoplasmic RNAs. The r strand of Eco RI-D and E fragments specifies 25 S nuclear RNA and 19 S and 13 S cytoplasmic RNAs. Transcripts from the left-hand end of the

FIG. 6. A comparison of nuclear and cytoplasmic RNAs transcribed from regions of the genome containing early adenovirus 2 genes. The upper part of the figure illustrates the five genome sites cleaved by endo R·Eco RI as well as one of the endo R·Hin dIII sites (R. J. Roberts and collaborators, personal communication) and four of the endo R·Sma I sites (Mulder, Greene and Delius, in preparation). The cleavage sites are identified as the percent of the distance from the left-hand end of the genome; the calculated sites of cleavage were deduced from the molecular weights of fragment DNAs and therefore are subject to the uncertainties in molecular-weight determinations. Fragments prepared by cleavage at these sites have been used in the analysis of nuclear RNA synthesized early in adenovirus 2 infection. The lower part of the figure has been prepared as described for Fig. 1. The nuclear RNAs shown in this figure correspond to the largest molecules that have been detected as transcripts from specific DNA fragments. When the same size class of RNA is hybridized to two adjacent fragments, we have assumed that a single transcript overlaps the cleavage sites. [Map positions legible in the figure: Eco RI cleavage sites at 58.5, 70.7, 75.9, 83.4 and 89.7, delimiting fragments A, B, F, D, E and C.]
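The fragment map described in the legend can be written down as a small data structure, which makes it easy to see which Eco RI fragments a given stretch of the genome would hybridize to. This is only a sketch: the Eco RI site positions are those legible on the map in Fig. 6, the fragment order is that given in the legend to Fig. 1, and the function names and the 65-74 example interval are hypothetical.

```python
# Eco RI cleavage sites, as percent of genome length from the left-hand end.
ECO_RI_SITES = [58.5, 70.7, 75.9, 83.4, 89.7]
# Left-to-right order of the six Eco RI fragments.
FRAGMENT_ORDER = ["A", "B", "F", "D", "E", "C"]


def fragment_map(sites, order):
    """Map each fragment name to its (start, end) interval in percent of genome."""
    bounds = [0.0] + sorted(sites) + [100.0]
    if len(order) != len(bounds) - 1:
        raise ValueError("number of fragments must be number of sites plus one")
    return {name: (bounds[i], bounds[i + 1]) for i, name in enumerate(order)}


def fragments_overlapping(fmap, start, end):
    """Names of the fragments that overlap the interval (start, end)."""
    return [name for name, (lo, hi) in fmap.items() if lo < end and start < hi]


if __name__ == "__main__":
    fmap = fragment_map(ECO_RI_SITES, FRAGMENT_ORDER)
    for name, (lo, hi) in fmap.items():
        print(f"Eco RI-{name}: {lo:.1f}-{hi:.1f} ({hi - lo:.1f}% of genome)")
    # The Sma I-E region (left-hand 3-11% of the genome) lies within Eco RI-A.
    print(fragments_overlapping(fmap, 3.0, 11.0))
    # A hypothetical transcript spanning 65-74 would hybridize to both B and F.
    print(fragments_overlapping(fmap, 65.0, 74.0))
```

A transcript that overlaps a cleavage site is detected with both adjacent fragments, which is the situation the authors assume when the same size class hybridizes to two neighboring fragments.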

genome have been studied using Sma I fragments. Using Sma I-E DNA, 22 S and 11 S-13 S nuclear RNAs were identified as compared to the 11 S-13 S cytoplasmic RNAs. We have not yet detected a larger nuclear RNA transcribed from the extreme left-hand region, the first 3% of the genome (Sma I-J). The larger nuclear RNAs we have identified may be precursors of early adenovirus mRNA. Certainly the structural analysis of these RNAs is compatible with this possibility, for they are transcribed from the "sense" strand of the DNA and include sequences present in cytoplasmic mRNA. Metabolic studies to investigate the possibility of a precursor-product relationship are now in progress. However, in proposing models for the transcription and processing of early adenovirus RNA, three categories of transcripts need to be considered. In the present studies, we have compared the cytoplasmic mRNAs and discrete size classes of nuclear RNA that are polyadenylylated, like mRNA, but that are larger

FIG. 7. Two possible models (Model 1 and Model 2) for the synthesis and processing of early adenovirus RNA. The schematic drawings A, B and C represent three groups of viral RNAs present early in infection. A represents the self-complementary RNAs transcribed from much, if not all, of the genome (1, 11) (Zimmer, Craig, Carlson and Raskas, in preparation). B illustrates the larger nuclear RNAs transcribed from the early genes. C is a representation of the early mRNAs transcribed from the adenovirus 2 genome.

than the mRNAs. Nuclei harvested early in infection contain a third category of RNA, complementary transcripts of large regions of the genome (1, 11). Taking into account these three categories of RNA, two possible models for the synthesis of early mRNA are presented in Fig. 7. Primary transcription at early times may encompass most, if not all, of the genome (Model 1, A); after rapid processing, relatively stable intermediates may accumulate (B), followed by slower processing to yield the functional mRNAs (C). Even with labeling intervals as short as 10 minutes, we have not succeeded in isolating viral RNAs that migrate more slowly than 28 S RNA in formamide gels. Preliminary analysis of the nonpolyadenylylated viral RNAs has not revealed RNAs larger than 28 S. However, this evidence is not compelling, and viral RNAs larger than those we have detected may be the primary transcripts. An alternative model (Model 2) assumes that the complementary viral RNAs present in low concentrations may be unrelated to mRNA formation. For example, these RNAs might be transcribed from viral genomes integrated into cellular chromosomes (12). If the complementary RNAs are not precursors of mRNA, there may indeed be only four discrete primary transcripts from the four regions of the genome coding for early mRNA (B), and these transcripts may be processed subsequently to yield the functional mRNAs (C).

REFERENCES

1. P. A. Sharp, P. H. Gallimore and J. Flint, CSHSQB 39, 457 (1975).
2. L. Philipson, U. Pettersson, U. Lindberg, C. Tibbetts, B. Vennström and T. Persson, CSHSQB 39, 447 (1975).
3. E. A. Craig, J. Tal, T. Nishimoto, S. Zimmer, M. McGrogan and H. J. Raskas, CSHSQB 39, 483 (1975).
4. E. A. Craig, S. Zimmer and H. J. Raskas, J. Virol. 15, 1202 (1975).
5. E. A. Craig, M. McGrogan, C. Mulder and H. J. Raskas, J. Virol. 16, 905 (1975).
6. U. Pettersson, C. Mulder, H. Delius and P. Sharp, PNAS 70, 200 (1973).
7. U. Lindberg, T. Persson and L. Philipson, J. Virol. 10, 909-919 (1972).
8. E. A. Craig and H. J. Raskas, J. Virol. 14, 751 (1974).
9. E. A. Craig and H. J. Raskas, Cell 8, 205 (1976).
10. R. Wall, L. Philipson and J. E. Darnell, Virology 50, 27 (1972).
11. S. Zimmer and H. J. Raskas, Virology 70, 118 (1976).
12. H. Burger and W. Doerfler, J. Virol. 13, 975 (1972).
13. C. Mulder, J. R. Arrand, H. Delius, W. Keller, U. Pettersson, R. J. Roberts and P. A. Sharp, CSHSQB 39, 397 (1975).

Biogenesis of Silk Fibroin mRNA: An Example of Very Rapid Processing?¹

PAUL M. LIZARDI
The Rockefeller University
New York, New York

I. Introduction

It is currently believed that posttranscriptional modifications play an important role in the generation of functional mRNA molecules in metazoa. Terminal modifications, such as polyadenylylation and "capping,"² are of widespread occurrence in the RNA of eukaryotic cells (1-13). The available evidence suggests that endonucleolytic cleavage of large nuclear precursor molecules may also be an important mechanism in mRNA biogenesis (14). However, the definitive characterization of large nuclear precursor molecules has not yet been achieved for any specific cellular messenger RNA. The posterior silk gland of the silkworm Bombyx mori, which synthesizes exceptionally large amounts of fibroin mRNA (14, 15), provides an excellent experimental system for the study of the biogenesis of a specific messenger species. Recently, it has become possible to perform a sequence-specific enrichment of fibroin mRNA by affinity chromatography. This method has permitted the isolation and partial characterization of newly synthesized fibroin mRNA molecules (16). Here I present additional data on the characterization of pulse-labeled RNA from the posterior silk gland, including some preliminary evidence for the possible existence of a very rapidly processed precursor.

II. Experimental Procedure

Methods for raising silkworms and for labeling larval RNA in vivo have been described (14). Procedures for RNA extraction using Sarkosyl-phenol/chloroform, and for poly(C-C-A)-Sephadex chromatography have been published (16, 17). Gel electrophoresis was performed in a buffer system consisting of 3.0% formaldehyde, 1.33 mM EDTA, 40 mM triethanolamine-PO4, pH 7.5. The gel composition was 1.7% acrylamide, 0.8% agarose. The RNA sample buffer consisted of 50% formamide, 4% formaldehyde, 6% sucrose, 1 mM EDTA, 30 mM triethanolamine-PO4, pH 7.5. The sample was heated to 58°C for 10 minutes just before loading. The gels were run at 6 V/cm for about 8 hours, stained for 2 hours in Stains-all (16, 18) and destained for 2 hours in water. Gel slices were counted as described previously (16).

Chromatography on poly(U)-Sepharose was as follows: a small column (1.5 ml bed volume of poly(U)-Sepharose type 6A, from PL Biochemicals, Milwaukee, Wisconsin) was equilibrated with NETS-25 buffer (0.25 M NaCl, 0.5 mM EDTA, 10 mM TrisCl pH 7.4, 0.4% dodecyl sulfate). One milliliter of posterior gland RNA (3 mg/ml) in NETS-25 was slowly passed through the column at 24°C. Unbound RNA was removed by washing with 15 bed volumes of ETS (salt-free NETS). Bound RNA was eluted with a buffer containing 60% formamide, 0.5 mM EDTA, 10 mM TrisCl pH 7.4, 0.4% dodecyl sulfate, at a temperature of 35°C. The RNA was recovered by ethanol precipitation from 0.25 M NaCl.

¹ This work was partially supported by Grant No. GM 22865-01 from the National Institutes of Health, U.S. Public Health Service.
² See articles by Busch et al., Furuichi et al., Moss et al., Rottman et al., in this volume.

III. Results

It has been demonstrated that B. mori tissues synthesize hnRNA molecules with molecular weights as large as 10⁷ (17). Figure 1A shows an electrophoretic pattern (in formaldehyde-acrylamide/agarose gels) of double-labeled posterior gland RNA previously fractionated in a Bio-Gel A-50m column to select molecules larger than 30 S. The mass (³²P) profile shows a major peak of fibroin mRNA and minor peaks at 40 S, 32 S and 18 S. (Insect 28 S rRNA contains a hidden internal nick and generates two 18 S fragments in denaturing gels.) The distribution of 12-minute pulse-labeled material shows a major peak at the position of 40 S ribosomal RNA precursor (off scale). The profile also shows heterogeneous material that extends to sizes even larger than fibroin mRNA. Figure 1B shows RNA obtained from the same silk glands but purified instead by chromatography in poly(U)-Sepharose to select poly(A)-containing RNA molecules. Again the main component in terms of mass (³²P) is fibroin mRNA, which is the major polyadenylylated species in the posterior silk gland. The profile of pulse-labeled RNA shows a 40 S peak, which probably represents residual contamination by 40 S ribosomal RNA. A bump at the position of fibroin mRNA can now be observed, as well as heterogeneous material extending toward both the heavy and light sides of the fibroin peak. After a second passage through poly(U)-Sepharose (not shown), the polyadenylylated material heavier than fibroin mRNA is still in evidence. This experiment shows

[Figure 1: gel profiles, panels (A) and (B); x-axis: fraction number]

FIG. 1. Polyacrylamide/agarose gel electrophoresis of double-labeled RNA from posterior silk glands; 12-minute pulse label. Two larvae on day 4 of the fifth instar were each labeled with 0.8 mCi of [³²P]orthophosphate. Twenty-four hours later, each larva received an injection of 5 mCi of [5-³H]uridine. After 12 minutes of uridine incorporation, RNA was extracted from the pooled posterior silk glands using phenol/chloroform. The preparation was divided into two equal aliquots and treated as follows: (A) fractionation of RNA in Bio-Gel A-50m to select RNA larger than 30 S (19), followed by gel electrophoresis; (B) fractionation of the material in poly(U)-Sepharose, followed by gel electrophoresis of bound RNA. In both (A) and (B), 20% of the material obtained after chromatography was used for gel-electrophoretic analysis. The radioactivity in gel slices has been corrected for 0.8% ³²P spill into the tritium channel. Note that the tritium scale is slightly different in (A) and (B).

that a large fraction of the newly synthesized nonribosomal RNA of the silk gland consists of a heterogeneous population of polyadenylylated components of high molecular weight and relatively short half-life.

Suzuki and Brown (14) have shown that sequences of the type G-G-(A/U)-G-C-U occur with high frequency throughout most of the length of the fibroin mRNA molecule. This peculiarity in the mRNA sequence has provided the basis for a simple method for fibroin mRNA isolation based on affinity chromatography. The method, described in more detail elsewhere (16), consists of chromatography in a column of Sephadex G-10 containing bound synthetic polyribonucleotides in which the sequence C-C-A occurs with very high frequency. One passage through a poly(C-C-A)-Sephadex column results in about 80% binding of fibroin messenger RNA, with a 7- to 8-fold enrichment over rRNA and an even greater enrichment over other contaminants of lower (G + C)-content.

Figure 2 shows the results of an experiment in which a portion of the material shown in Fig. 1B was further purified by one cycle of poly(C-C-A)-Sephadex chromatography before performing the gel-electrophoretic analysis. The mass profile shows a major peak of fibroin messenger RNA and a small peak of 18 S rRNA contaminant. The pulse-labeled material shows a peak that comigrates with [³²P]mRNA and a small amount of 40 S rRNA contamination. The amount of radioactivity recovered in the [³H]mRNA peak shows that fibroin mRNA represents a fairly large fraction of the original high-molecular-weight material. In fact, it has been estimated (16) that at least 19% (and in some experiments up to 25%) of the pulse-labeled material larger than 40 S rRNA is fibroin mRNA. Considering the level of resolution of the acrylamide-agarose gel shown in Fig. 2, it can be estimated that the comigrating peaks of “new” and “old” mRNA do not differ in size by more than 200,000 daltons. A striking feature of this gel profile is the absence of the heterogeneous poly(A)-containing material evident in Fig. 1B. However, the profile in Fig. 2 is not representative of the total population of fibroin gene transcripts,

[Figure 2: gel profile; x-axis: fraction number]

FIG. 2. Polyacrylamide/agarose gel electrophoresis of double-labeled RNA after chromatography in poly(U)-Sepharose followed by poly(C-C-A)-Sephadex. Part of the bound RNA recovered after poly(U)-Sepharose chromatography (see legend to Fig. 1B) was subjected to chromatography in poly(C-C-A)-Sephadex (16). Twenty-five percent of the bound RNA was used for gel electrophoresis, in order to make the gel load comparable to Fig. 1A and B. Counts have been corrected for ³²P spill.

since it contains only those molecules that are polyadenylylated within the 12-minute labeling period. The method of choice for studying the size distribution of the totality of newly synthesized fibroin mRNA molecules consists of two consecutive steps of poly(C-C-A)-Sephadex chromatography. Figure 3 illustrates the purification of fibroin gene transcripts from total RNA after a 10-minute pulse label. The mass profile (³²P) in 3B and 3C shows the progressive enrichment after one (B) or two (C) passages through the affinity

[Figure 3: gel profiles, panels (A), (B) and (C); x-axis: fraction number]

FIG. 3. Polyacrylamide/agarose gel electrophoresis of double-labeled RNA (10-minute pulse) before and after poly(C-C-A)-Sephadex chromatography. A single larva on day 5 of the fifth instar received an injection of 0.5 mCi [³²P]orthophosphate. After 24 hours, the larva received 6 mCi of [³H]uridine, and the posterior gland was dissected 10 minutes later. Portions of the extracted RNA were used for one or two cycles of poly(C-C-A)-Sephadex chromatography. (A) Total RNA before chromatography. (B) RNA after one cycle of poly(C-C-A)-Sephadex. (C) RNA after two cycles of chromatography. The arrow denotes the position of fibroin mRNA.

column. The profile of crude pulse-labeled RNA (³H) in Fig. 3A shows the 40 S rRNA precursor peak as well as larger heterogeneous material (gel slices 7 through 20). After two cycles of chromatography (3C), the bulk of the material peaks close to the electrophoretic mobility of fibroin mRNA. The actual peak of the ³H profile is displaced toward the light side of [³²P]mRNA. There is a nearly continuous trailing of material toward smaller sizes, and a small elevation at the position of 40 S rRNA, which is the major contaminant after two rounds of chromatography (16). The trailing toward smaller sizes can be explained on the basis of the shortness of the labeling period. It has been estimated that it should take about 5.6 minutes to transcribe the 16,000-nucleotide fibroin mRNA sequence at 24°C (16). The 10-minute in vivo labeling period is estimated to represent an effective incorporation phase of about 8 minutes (16). Since the incorporation period is not much longer than the transcription time, and since the specific activity of the UTP pool must be increasing with time, one would expect more than 50% of the uridine label to be present in nascent fibroin mRNA molecules. The ability of the poly(C-C-A)-Sephadex column to bind mRNA fragments has been documented (16). Therefore, the material that trails toward smaller sizes should reflect the distribution of nascent fibroin gene transcripts. Nascent molecules are not polyadenylylated, which explains why the trailing material is not to be seen in Fig. 2. When the two-cycle poly(C-C-A)-Sephadex fractionation is used to analyze RNA pulse-labeled for 35 minutes, no trailing material is observed (16). Of course, after 35 minutes of incorporation, most of the label should be in full-length molecules. An important feature of the tritium profile shown in Fig. 3C is the small amount of material present as a shoulder on the heavy side of the mRNA peak.
This shoulder, although very small, is reproducible, and is slightly more pronounced at shorter pulse-labeling times (16). It does not disappear after a third passage through poly(C-C-A)-Sephadex (not shown). One possible explanation for the presence of large molecules in this shoulder is that they represent a highly unstable mRNA precursor species. If most processing cuts in such a precursor occurred very rapidly, perhaps even before the termination of transcription, the longer molecules would never accumulate in significant amounts. Since processing cuts would be expected to occur stochastically, a few molecules could transiently escape the cutting mechanism and give rise to a small subset of uncut primary transcripts. Such molecules would obviously be very difficult to study unless it were possible to slow down the processing cuts.

With this possibility in mind, I have experimented with conditions that may slow down processing cuts. There have been a number of reports in the literature (20, 21) that document the accumulation of ribosomal RNA precursor molecules during labeling at subnormal temperatures. I therefore performed a number of pulse-labeling experiments using silkworms cooled to temperatures in the range of 7°C to 14°C. At temperatures below 10°C the levels of uridine incorporation were too low to permit adequate analysis. At temperatures between 10°C and 12°C, it is possible to obtain substantial levels of incorporation. Figure 4A shows the results of an experiment in which an animal was cooled to approximately 11.5°C and pulse-labeled for 15 minutes with [³H]uridine. After two cycles of poly(C-C-A)-Sephadex chromatography, the electrophoretic profile shows a broad size distribution and a number of small peaks in the high-molecular-weight range. The largest peak is at the position of 40 S rRNA, and is probably due to residual contamination with rRNA. There are two small peaks close to the [³²P]mRNA peak, one of somewhat slower mobility, the other a little faster. The overall broadness of the ³H-labeled profile suggests that the labeling time in this experiment was shorter than the transcription time of the fibroin gene. It probably takes more than 15 minutes to transcribe this gene at 11.5°C.

Figures 4B and 4C show profiles of RNA pulse-labeled for 30 minutes at 11.5°C. In 4B, the animal was cooled slowly over a 20-hour period before it was labeled; in 4C, the cooling period took only 35 minutes. The profile of uridine-labeled material in 4B looks very similar to that of a 10-minute pulse at 24°C, with the characteristic peak present at a position about 2 gel slices smaller than ³²P-labeled fibroin mRNA. In contrast, the profile in 4C shows a peak at about 2.5 gel slices larger than authentic fibroin mRNA. Both profiles also show the ubiquitous 40 S rRNA contaminant peak, but in 4C there is in addition a broad peak in the 28-32 S region of the gel, the nature of which has not yet been determined. The material that peaks at about 2.5 gel slices to the left of fibroin mRNA in Figs. 4C and 4A should be about 0.8 × 10⁶ daltons larger than authentic mRNA. The molecular weight of fibroin mRNA has been estimated to be about 5.8 × 10⁶ (17). This implies that the large peak observed in these low-temperature experiments has a molecular weight in the range of 6.6 ± 0.3 × 10⁶ (18.2 ± 0.8 × 10³ nucleotides).

It is important to point out that these observations cannot be explained on the basis of animal heterogeneity, since the results are the same whether the experiments are done with a single larva or with a group of two or three. The possibility that different genes are active at high or low temperature is eliminated by the fact that the fibroin gene occurs only once per haploid genome (22, 23).
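The expectation stated earlier in this section, that more than 50% of the uridine label should sit in nascent chains after a 10-minute pulse at 24°C, can be illustrated with a small numerical model. The Python sketch below is not the author's calculation. It assumes polymerases are evenly spaced along the gene, that elongation proceeds at a constant rate, that completed molecules are not degraded during the pulse, and (when `rising_pool` is true) that the specific activity of the UTP pool rises linearly from zero. The function and parameter names are hypothetical.

```python
def nascent_label_fraction(pulse_min, transcription_min, rising_pool=True, steps=2000):
    """Fraction of incorporated label still in growing chains at the end of a pulse.

    Assumptions (illustrative, not from the source): uniform polymerase density,
    constant elongation rate, no turnover of finished chains, and a precursor
    pool whose specific activity is either constant or rises linearly from zero.
    """
    dt = pulse_min / steps
    total = 0.0
    nascent = 0.0
    for i in range(steps):
        s = (i + 0.5) * dt                    # midpoint of this time slice
        activity = s if rising_pool else 1.0  # relative specific activity of the pool
        remaining = pulse_min - s             # time left until the end of the pulse
        # Share of polymerases active at time s that have not yet terminated
        # when the pulse ends (those within `remaining` minutes of the end are done).
        still_growing = max(0.0, 1.0 - remaining / transcription_min)
        total += activity * dt
        nascent += activity * dt * still_growing
    return nascent / total


# Values quoted in the text: 5.6 min transcription time, ~8 min effective pulse.
rising = nascent_label_fraction(8.0, 5.6, rising_pool=True)
constant = nascent_label_fraction(8.0, 5.6, rising_pool=False)
print(f"rising pool: {rising:.2f}, constant pool: {constant:.2f}")
```

Under these assumptions the nascent fraction exceeds one half only when the pool specific activity rises during the pulse, which matches the qualitative reasoning in the text. The exact value depends on how the pool actually fills, which this sketch does not model.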

IV. Discussion

The data on the labeling of the 6.6 × 10⁶-dalton material at low temperature are clearly preliminary and do not unequivocally show that

[Figure 4: gel profiles, panels (A), (B) and (C)]

FIG. 4. Gel-electrophoretic analysis of poly(C-C-A)-Sephadex-purified RNA from animals pulse-labeled at low temperature. (A) RNA from an animal that was labeled with ³²P for 21 hours, placed at 11°-12°C for 1 hour, and subsequently labeled with 5 mCi of [³H]uridine for 15 minutes at 11.5°C. Electrophoretic analysis was performed after 2 cycles of poly(C-C-A)-Sephadex chromatography. (B) RNA from an animal that was labeled for 12 hours with ³²P, then kept 20 hours at 15°C, and finally cooled to 11.5°C for pulse-labeling. Labeling with [³H]uridine was for 30 minutes at 11.5°C. (C) RNA from an animal labeled with ³²P for 32 hours, cooled to 11.5°C during a period of 35 minutes, and labeled with [³H]uridine for 30 minutes at 11.5°C. The amounts of radioactive isotope used in these experiments are similar to those specified in previous figures. The gel in (A) was run for a shorter time than that in (B) and (C). The arrow denotes the position of fibroin mRNA. The radioactivity has been corrected for 0.8% ³²P spill into the tritium channel.

the larger molecules are in fact precursors of fibroin mRNA. Present efforts in this laboratory are directed toward further characterization of this material by RNA sequencing procedures. Experiments are also in progress to determine whether the radioactivity present in the large peak can be chased into authentic fibroin mRNA in the presence of actinomycin D, which causes rapid inhibition of fibroin mRNA synthesis (P. Lizardi, unpublished results).

Observations on the size profile of fibroin gene transcripts at different temperatures provide a basis for comparing alternative models for the transcription of this gene. Figure 5 shows some schematic diagrams of various possible modes of transcription. The diagrams on the left side illustrate the arrangement of nascent RNA molecules; the patterns on the right show the expected mass distribution when the nascent molecules are analyzed by gel electrophoresis. For simplicity, the exponential nature of gel mobility has been ignored. The gel profiles include only

[Figure 5: schematic diagrams; left panels labeled "Growing chains" (16 kb scale), right panels labeled "Electrophoresis"]

FIG. 5. Various possible models for the arrangement of nascent transcripts in the fibroin gene, and predicted gel-electrophoretic profiles of the nascent RNA. The drawings on the left side, A through E, show growing molecules between the initiation (i) and terminator (t) loci. The dark shading denotes the portions of the growing chains that contain the conserved (structural) mRNA sequence. Abrupt changes in chain length denote the occurrence of endonucleolytic cuts. The drawings on the right side are highly schematic representations of the expected distribution of RNA mass in a polyacrylamide gel. The exponential nature of gel mobilities has been ignored for simplicity. The gel profile contains only molecules that would be selected after two cycles of chromatography in poly(C-C-A)-Sephadex, which presumably binds only those molecules containing parts of the conserved mRNA sequence.

those nascent molecules that would be selected by an affinity column recognizing exclusively the conserved (structural) mRNA sequence. The first model (A) shows the simplest transcriptional unit, in which the primary transcriptional product has the same size as mature mRNA. Model B shows a transcriptional unit containing an extra 2.2 × 10³ nucleotides, which give rise to a large precursor RNA. Model C is similar to B except that a processing cut is occurring before the termination of transcription, resulting in a discarded 5'-terminal fragment. This situation gives rise to a peculiar gel profile in which the distribution of mass shows a peak at a size somewhat smaller than mature mRNA. Model D is similar to C but involves a much longer transcriptional unit and a larger discarded piece. Model E differs from the others in having an additional 3'-terminal piece that is cut and discarded.

Models A, B and E give rise to gel profiles in which the peak of the mass distribution is as large as or larger than fibroin mRNA. Such profiles have never been observed after pulse-labeling for 6-10 minutes at 24°C. Models C and D predict mass distributions with a peak of lower molecular weight than fibroin mRNA; Model C in particular resembles the in vivo profiles. Note, for example, the resemblance between Model C and Figs. 3C and 4B, ignoring of course the 40 S rRNA contaminant peak. Model C predicts that if the processing cut fails to occur rapidly enough, the situation should become similar to Model B, where the peak of the gel profile is a little larger than fibroin mRNA. This situation resembles what has been observed in experiments where the animals are rapidly cooled to 11.5°C (see Fig. 4C, again ignoring the 40 S rRNA and 28-32 S material).
It seems, therefore, that the data are most compatible with Model C at 24°C and Model B at 11.5°C (after rapid cooling). Model D seems less likely, since it predicts the accumulation of extremely long molecules in the low-temperature experiments. Other more complex models could also be compatible with the available data, such as a combination of C and E, in which processing cuts occur at both the 3'-terminal and 5'-terminal ends of the transcript. It is premature to discuss such models further in view of the paucity of data.

It is interesting that, when the animals are cooled slowly, the 30-minute pulse resembles a 10-minute pulse at room temperature. This could mean that the animals have compensatory mechanisms to prevent a slowing of mRNA processing steps during normal temperature fluctuations.

A disturbing feature of the RNA profile obtained in rapidly cooled animals is the presence of a large amount of material in the 28-32 S region of the gel. It is still not clear whether these are fibroin gene transcripts or ribosomal RNA contaminants. One possibility is that at 11.5°C the speed of propagation of transcribing polymerases is discontinuous, resulting in a piling-up effect of transcripts of intermediate size. Another possibility is that incorrect processing cuts are being made. The nature of the 28-32 S material is currently under investigation using RNA sequencing procedures.

Careful inspection of the profile of pulse-labeled RNA in Fig. 4C shows that the presumptive mRNA precursor peak is somewhat broad on the heavy side. One must keep in mind the possibility that the primary transcriptional product could show length heterogeneity due to multiple initiation or termination loci.

The labeling of fibroin mRNA at low temperatures may permit the isolation of substantial amounts of what may well be the primary transcript of a eukaryotic mRNA gene. It may thus become possible to probe the finer structural features of this specific transcript. In addition, it should be possible to design experiments to test at what time these transcripts become capable of forming specific translational complexes with ribosomes. Hopefully, such experiments will improve our understanding of the functional significance of mRNA processing.

At the time this manuscript was written, I learned of the elegant work of McKnight, Sullivan and Miller (24), who have been able to obtain electron micrographs of transcribing genes from the posterior silk gland of B. mori. They have identified a transcriptional unit about 18 × 10³ nucleotides long, which is considered to correspond to an active silk fibroin gene. The gene length measured by McKnight et al. is in excellent agreement with the biochemical data presented in this paper. We are now in a position to combine electron microscopic and biochemical techniques to probe the finer structural features of this functioning gene.
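The agreement between the electron-microscopic and biochemical size estimates can be checked with simple arithmetic. The Python sketch below converts molecular weight to chain length using the mass per nucleotide implied by the figures quoted for fibroin mRNA itself (5.8 × 10⁶ daltons for 16,000 nucleotides). Treating the putative precursor as having the same average mass per nucleotide is an assumption made here for illustration, and the names in the code are hypothetical.

```python
MRNA_DALTONS = 5.8e6        # molecular weight of fibroin mRNA quoted in the text (17)
MRNA_NUCLEOTIDES = 16_000   # length of the fibroin mRNA sequence quoted in the text (16)


def daltons_to_nucleotides(daltons):
    """Scale a molecular weight to chain length using the mRNA's own mass per nucleotide.

    Assumes the larger species has the same average nucleotide mass as the
    mature mRNA; this is an illustrative simplification.
    """
    return daltons * MRNA_NUCLEOTIDES / MRNA_DALTONS


precursor_nt = daltons_to_nucleotides(6.6e6)    # large peak seen at 11.5 degrees C
uncertainty_nt = daltons_to_nucleotides(0.3e6)  # quoted uncertainty on that peak
em_estimate_nt = 18e3                           # transcription unit length from ref. 24

print(round(precursor_nt), round(uncertainty_nt))
print(abs(precursor_nt - em_estimate_nt) <= uncertainty_nt)
```

With these inputs the conversion reproduces the 18.2 ± 0.8 × 10³ nucleotides given in the Results, and the electron-microscopic estimate falls within that range.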

V. Summary

An affinity column containing a bound synthetic polynucleotide has been used as a tool for the isolation of pulse-labeled fibroin mRNA. Two passages through the column are sufficient to purify the pulse-labeled fibroin gene transcripts from the total hnRNA population. The fibroin mRNA molecules isolated after pulse-labeling in vivo are not demonstrably larger than fibroin mRNA. In fact, for short pulse-labeling times (10 minutes) the peak of the gel profile of pulse-labeled mRNA is at a position one to two gel slices smaller than mature mRNA. To investigate the possibility that putative RNA processing cuts may be occurring even before the termination of transcription, experiments have been done under conditions in which processing may be expected to be slower. In vivo pulse-labeling of animals at 11.5°C, followed by affinity chromatography and gel electrophoresis, results in a different size profile of newly synthesized fibroin mRNA. In this case, the peak of the size distribution is a few gel slices larger than the mature mRNA peak. This larger material, which accumulates at low temperature, could represent a very short-lived precursor species, and may in fact be the “primary” transcript of the silk fibroin gene.

ACKNOWLEDGMENTS

I thank Alan Engelberg for his excellent technical assistance. I am indebted to Drs. James L. Vaughn, Keizo Hayashiya and S. Shibata for their assistance in obtaining silkworm eggs and food. Special thanks are due to Donald D. Brown for introducing me to the silk gland system and for many productive discussions.

REFERENCES

1. G. Brawerman, ARB 43, 621 (1974).
2. J. E. Darnell, Harvey Lect., Ser. 69, 1 (1975).
3. J. R. Greenberg, J. Cell Biol. 64, 269 (1975).
4. B. Lewin, Cell 4, 11-20 (1975).
5. T. S. Ro Choi, Y. C. Choi, D. Henning, J. McCloskey and H. Busch, JBC 250, 3921-3928.
6. C.-M. Wei, A. Gershowitz and B. Moss, Cell 4, 379-386 (1975).
7. Y. Furuichi, M. Morgan, A. J. Shatkin, W. Jelinek, M. Salditt-Georgieff and J. E. Darnell, PNAS 72, 1904 (1975).
8. J. M. Adams and S. Cory, Nature 255, 28 (1975).
9. R. Desrosiers, K. Friderici and F. Rottman, PNAS 71, 3971-3975 (1974).
10. R. P. Perry, D. E. Kelley, K. Friderici and F. Rottman, Cell 4, 387-394 (1975).
11. R. P. Perry, D. E. Kelley, K. H. Friderici and F. M. Rottman, Cell 6, 13-19 (1975).
12. M. Salditt-Georgieff, W. Jelinek, J. E. Darnell, Y. Furuichi, M. Morgan and A. Shatkin, Cell 7, 227-237 (1975).
13. N.-S. Yang, R. F. Manning and L. P. Gage, Cell 7, 339-347 (1975).
14. Y. Suzuki and D. D. Brown, JMB 70, 637-649 (1972).
15. Y. Suzuki and E. Suzuki, JMB 88, 393-407 (1974).
16. P. M. Lizardi, Cell 7, 239-245 (1976).
17. P. M. Lizardi, R. Williamson and D. D. Brown, Cell 4, 199 (1975).
18. A. E. Dahlberg, C. W. Dingman and A. C. Peacock, JMB 41, 139 (1969).
19. P. M. Lizardi and D. D. Brown, CSHSQB 38, 701 (1973).
20. R. H. Stevens and H. Amos, J. Cell Biol. 50, 818 (1971).
21. Y. Kuriyama and D. J. L. Luck, JMB 73, 425 (1973).
22. Y. Suzuki, L. P. Gage, and D. D. Brown, JMB 70, 637 (1972).
23. L. P. Gage and R. F. Manning, JMB 101, 327 (1976).
24. S. L. McKnight, N. L. Sullivan and O. L. Miller, Jr., This volume, p. 313.

Visualization of the Silk Fibroin Transcription Unit and Nascent Silk Fibroin Molecules on Polyribosomes of Bombyx mori

STEVEN L. MCKNIGHT, NELDA L. SULLIVAN AND OSCAR L. MILLER, JR.

Department of Biology, University of Virginia, Charlottesville, Virginia

The unique physical properties of the silk fibroin gene of Bombyx mori, its complementary mRNA molecule and the fibroin polypeptide have allowed biochemical probes to define the kinetics of fibroin production. Late in larval development, the highly polyploid posterior silk gland cells contribute over 80% of their protein synthesis to the production of silk fibroin (1). By modifying chromosome spreading techniques first adapted for visualizing extrachromosomal nucleolar genes of amphibian oocytes (2), we have examined the transcriptional (McKnight) and translational (Sullivan) organization of silk-producing Bombyx cells.

To spread the chromosomes, 5 mg of the gland tissue were dispersed with jewelers' forceps in 3 ml of a 0.05% Joy detergent solution adjusted to pH 8.5 with NaOH-borate buffer. The suspension was then centrifuged through a formalin-sucrose cushion onto a carbon-coated electron microscope grid and prepared for transmission electron microscopy by techniques previously reported (3).

Adequately dispersed Bombyx posterior silk-gland genomes show: (a) inactive “beaded” chromatin regions; (b) active ribosomal genes (Fig. 1a); (c) variously sized non-nucleolar genes that are typically populated with low densities of RNA polymerase molecules; and (d) a distinct population of ribonucleoprotein fibril gradients between 5 and 6 µm long that are packed with almost as many RNA polymerases per unit length as the rRNA genes, but are not present in tandem array (Figs. 1b, 1c and 2). The distribution of the three categories of ribonucleoprotein fibril gradients changes as larvae proceed through the fifth instar, with the category of long, polymerase-dense gradients becoming more prominent.

We identify these long, polymerase-dense gradients as active silk fibroin genes for the following reasons. First, the mean length of these distinct transcription units is 5.43 ± 0.24 µm (N = 14). If the B-conformation length of the gene is foreshortened by the amount we have estimated for rRNA genes (~12%), the observed length would correspond to a template slightly longer than 1.8 × 10⁴ base-pairs, a length that is very close to the gene size estimated by biochemical probes (~17,000 base-pairs; 4). Second, preliminary observations indicate that the middle portion of the silk gland, which synthesizes little or no fibroin mRNA, does not contain the very long polymerase-dense gradients observed in the posterior portion. Third, the ~5.4 µm gradients are not tandemly repeated, which is consistent with data showing the fibroin gene to exist only once per haploid genome equivalent (5). And fourth, these gradients are present on loci that are essentially space-filled with RNA polymerases. Analysis of fibroin mRNA production in the fifth larval instar indicates that such single-copy genes would have to be loaded with polymerases in order to account for the large number of messages synthesized per unit time during this stage of development (Y. Suzuki, personal communication).

In most instances, the terminal regions of the fibroin transcription units are obscured because of the extraordinary length of the more terminal ribonucleoprotein fibrils, and the presence of overlapping chromatin strands. In one case, however, we have observed that a portion of the most distal fibrils appear to have been cleaved (Figs. 1b and 1c). This may represent primary transcript processing and, if so, would indicate that the fibroin gene produces a short-lived precursor molecule somewhat larger than the fibroin mRNA (see Lizardi, page 301). Similar ribonucleoprotein fibril processing has been reported by Laird and Chooi (6), which suggests that cleavage of nascent ribonucleoprotein molecules occurs on the nurse-cell genome of Drosophila melanogaster.
FIG. 1a. Electron micrograph of Bombyx mori ribosomal ribonucleoprotein matrices. The sample was prepared from mid-5th instar posterior silk gland tissue.

FIGS. 1b and 1c. Electron micrograph of a putative silk fibroin transcription unit. The sample was prepared from mid-5th instar posterior silk gland tissue. Arrows in 1b point to sites of initiation (i) and termination (t) of transcription. Contour length (i)→(t) measures ~5.8 µm. Figure 1c is a tracing of Fig. 1b and shows the putative endonucleolytic cleavage site (large arrow). Of the most terminal nascent transcripts, 5 appear cleaved (small arrows) and have a mean length of 0.19 µm, which is just under one-fifth the mean length (1.02 µm) of the most terminal unprocessed transcripts. Endonucleolytic cleavage site (ecs) occurs ~4.8 µm from initiation site (i), thus accounting for slightly less than four-fifths of the full gradient length. The density of RNA polymerases bound near the terminus of the gradient appears less than that in the more proximal regions.

FIG. 2. Electron micrograph of a putative silk fibroin transcription unit. The sample was prepared from mid-5th instar posterior silk gland tissue. Arrows point to sites of initiation (i) and termination (t) of transcription. Contour length (i)→(t) measures ~5.3 µm. The template is complexed with ~40 RNA polymerase molecules per micrometer of contour length.

FIGS. 3a, 3b, and 3c. Electron micrographs of polysomes isolated from the late 5th instar of posterior silk gland cells. The polysomes shown in Figs. 3a and 3b were isolated as described in the text, except that no cycloheximide was used. Translational polarity is indicated by 5' and 3' symbols. Arrows point to putative nascent silk fibroin polypeptides. Bars represent 0.1 µm.

The polyribosomes of posterior silk-gland cells were inspected using methods similar to those described for genome preparation except that the tissue was suspended in a solution of cycloheximide (20 µg/ml) for up to 24 hours prior to homogenization. The large majority of the polysomes observed after this treatment (Fig. 3) exhibit an array of extended thin fibrils singly attached to individual ribosomes. The attached fibrils reach a maximum length of ~0.1 µm and have a distinctive beaded appearance. Tangling and shearing of these very long polysomes has so far prevented an accurate determination of their size, but the range probably lies between 50 and 80 ribosomes.

The attached fibrils are identified as nascent silk fibroin polypeptides because: (a) the late fifth instar posterior silk-gland cells synthesize almost exclusively silk fibroin; (b) they generate gradients of increasing length along the polysome, as expected from the known mechanism of mRNA translation, and establish its polarity; (c) the contents of the lumen of the posterior portion of the silk gland, dispersed in the same way, consist primarily of molecules having essentially the same distinctive morphology and size as the longest fibrils found at the 3' end of the polyribosomes; and (d) the extreme size of the silk fibroin molecule [estimated to range between 170,000 daltons (7) and 370,000 daltons (8)] requires polysomes in this size range. Visualization of these nascent fibroin polypeptides is possible both because of their size and because they contain repeating amino-acid sequences that take the form of folded antiparallel β-pleated sheets that extend in a linear, rather than a globular, conformation.

To our knowledge, these observations mark the first time that a specific structural eukaryotic gene has been visualized and the first visual confirmation of the accepted biochemical interpretation of protein synthesis.
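The conversion from measured contour length to template size used in the first argument above can be written out explicitly. The Python sketch below assumes that "foreshortened by ~12%" means the observed length is 88% of the B-form length, and uses the standard axial rise of 0.34 nm per base pair for B-form DNA. It also multiplies the polymerase density and contour length given in the legend to Fig. 2 to estimate the number of polymerases on one transcription unit. The function name and this reading of the foreshortening factor are assumptions made for illustration.

```python
RISE_PER_BP_NM = 0.34  # axial rise per base pair of B-form DNA


def template_base_pairs(contour_um, foreshortening=0.12):
    """Estimate template length in base pairs from a measured contour length.

    Assumes observed length = B-form length * (1 - foreshortening); this
    interpretation of the ~12% figure is an illustrative assumption.
    """
    b_form_um = contour_um / (1.0 - foreshortening)
    return b_form_um * 1000.0 / RISE_PER_BP_NM


mean_bp = template_base_pairs(5.43)   # mean contour length of the 14 units measured
polymerases_per_unit = 40 * 5.3       # ~40 per micrometer over the ~5.3 um unit of Fig. 2

print(round(mean_bp), round(polymerases_per_unit))
```

Under these assumptions the mean contour length corresponds to slightly more than 1.8 × 10⁴ base-pairs, in line with the value stated in the text, and a single active unit would carry on the order of two hundred polymerases.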

ACKNOWLEDGMENTS

We thank Dr. D. D. Brown for his suggestion that we initiate this study, and are grateful to Dr. Y. Suzuki and Paul Giza for generously supplying us with Bombyx mori embryos and larvae. We acknowledge and appreciate the communication of unpublished data and stimulating discussion provided by Drs. Suzuki, Brown, and Paul Lizardi. We also thank Ms L. Blanks for her excellent technical assistance. Supported by NSF Grant BMS73-01131-AOl and USPHS-NIGMS Grant 1 R01 GM21020-01.

REFERENCES

1. Y. Tashiro, T. Morimoto, S. Matsuura and S. Nagata, J. Cell Biol. 38, 574 (1968).
2. O. L. Miller, Jr. and B. R. Beatty, Science 164, 955 (1969).
3. S. L. McKnight and O. L. Miller, Jr., Cell 8, 305 (1976).
4. P. M. Lizardi and D. D. Brown, Cell 4, 207-215 (1975).
5. P. M. Lizardi and D. D. Brown, CSHSQB 38, 701 (1973).
6. C. D. Laird and W. Y. Chooi, Chromosoma, in press (1976).
7. Y. Tashiro and E. Otsuki, BBA 214, 265 (1970).
8. K. U. Sprague, Bchem 14, 925 (1975).

Production and Fate of Balbiani Ring Products

B. DANEHOLT, S. T. CASE, J. HYDE, L. NELSON AND L. WIESLANDER
Department of Histology, Karolinska Institutet, Stockholm, Sweden

I. Introduction

The salivary glands in the dipteran Chironomus tentans offer a suitable experimental system for analysis of the synthesis and processing of RNA. Two large chromosomal puffs, Balbiani rings 1 and 2 (BR-1 and -2), are particularly amenable to such studies. These puffs are likely to be connected to the production of salivary polypeptides, presumably by generating the corresponding messenger RNAs (1, 2). Owing to their exceptionally large size, the Balbiani rings can be isolated by microdissection (3, 4) and their products subsequently analyzed. Moreover, since nuclear sap and cytoplasm can be collected, the further intracellular fate of Balbiani ring RNA can be monitored (for review, see 5).

In Balbiani ring 2, an RNA species of 75 S size is synthesized (6), and this product is transported via nuclear sap into the cytoplasm without being measurably changed in size (7, 8). The BR-1 RNA product is of about the same size as BR-2 RNA (9, 10) and also appears in the cytoplasm without a major size reduction (8). As shown by in situ hybridization, however, BR-1 RNA has a nucleotide sequence different from that of BR-2 RNA (11). The 75 S RNA has a high stability in cytoplasm and constitutes as much as 1.5% of total cellular RNA (7). Recently, 75 S RNA has been obtained in a reasonably pure and undegraded form (Fig. 1) and can be made available in larger quantities for further structural analysis (12). For example, reelectrophoresis of 75 S RNA that had been subjected to various denaturing conditions suggests that the 75 S RNA fraction corresponds to unbroken, single-stranded RNA molecules of giant size (Case and Daneholt, unpublished).

The flow of BR products from the chromosomal level into cytoplasm has now also been analyzed on the ribonucleoprotein level. We report here the features of early stages in this transfer process, as visualized in

FIG. 1. Electrophoretic analysis of long-term labeled salivary gland RNA (A) and reelectrophoresis of 75 S RNA (B). Chironomus tentans larvae were kept at 18°C for 3 days in culture medium supplemented with radioactive RNA precursors. Four salivary glands were isolated, and the labeled RNA was extracted in a Sarkosyl/Pronase solution and analyzed in a 1% agarose gel (A). The gel segment containing 75 S RNA was cut out, and the RNA was eluted electrophoretically and rerun in a 1% agarose gel (B). Abscissa: slice number. For further experimental details, see Case and Daneholt (12).

the electron microscope, as well as the final appearance of the BR products in polysomes of large sizes.

II. Transcription Complexes in Balbiani Rings

In the electron microscope, Balbiani rings display a characteristic morphology with looped, brushlike configurations, recognizable in particular in the periphery of the BRs (Fig. 2A, B). The most conspicuous elements of these loops are large granules, many of which can be seen to have a stalk connecting the granule to an ill-defined loop axis (Fig. 2B). These observations are in agreement with earlier studies on BR morphology in Chironomus (13-15). From cytochemical tests, Stevens and Swift (14) concluded that the granules and their stalks consist of ribonucleoproteins while the axis is composed of deoxyribonucleoproteins. Owing to the analogies between this structure of BR loops and that of spread transcription complexes reported in other eukaryotes (e.g., 16), it seems plausible that the BR loops do represent such transcription complexes. This interpretation of the electron micrographs is in agreement with biochemical data for BR RNA, which strongly suggest that by far most of the high-molecular-weight RNA molecules present in BRs are in a nascent state (for review, see 5). It is then interesting to note that the growing ribonucleoprotein (RNP) fibrils are wound up into granular structures before the synthesis of the fibrils is completed. Thus, there appears to be a close coupling, at least in time, between the actual synthesis of the RNA molecule, the formation of the ribonucleoprotein fibril, and its subsequent packaging into a granular structure.

III. Balbiani-ring Granules in Nuclear Sap, Nuclear Pores and Cytoplasm

Granules corresponding in size to the largest in the BRs, but lacking stalks, are abundant throughout the nuclear sap (Fig. 2B, 3A). Furthermore, some of them are closely associated with nuclear pore complexes and often exhibit an altered configuration. We have recognized granules with a more or less pronounced projection through the center of the pore (Fig. 3A), rodlike structures traversing the pore, and finally, but more rarely, spherical or cone-shaped particles on the cytoplasmic side of the pore. As pointed out by Stevens and Swift (14), on the basis of similar findings in Chironomus thummi, these various configurations suggest that the large granules are translocated through the pore complexes.

In Chironomus tentans, a small number of large granules are also regularly recorded in cytoplasm (Fig. 3B). It was estimated that the total number of cytoplasmic granules observed in any cell is less than 5% of the number of granules contained within the corresponding nucleus. The cytoplasmic granules are mainly observed within a 1-µm broad zone adjacent to the nucleus (Fig. 4). If the position of the granules within this zone is determined relative to the nuclear envelope, it can be observed that there are more granules close to the envelope than further out in the zone (Fig. 5). A steep concentration gradient is therefore evident. It should be stressed that in the vast part of the cytoplasm the large granules are virtually absent. The electron micrographs showed good preservation of the various cellular components, including an intact nuclear envelope, and displayed no signs of redistribution of nucleoprotein material within either the nucleus or the cytoplasm. It seems therefore likely that the few large granules observed in cytoplasm reflect the in vivo situation rather than an artifactual leakage of granules from nucleus into cytoplasm during preparation for electron microscopy.

Since BRs are the only puffs containing conspicuous granules, the large granules in sap probably represent products released from the BRs. At least some granules seem to pass through the nuclear pores and appear in the cytoplasm close to the nuclear envelope. This flow scheme of granules is, however, only tentative: the granules observed in the various cellular compartments contain RNA and are of about the same size and stainability, but it has not been directly demonstrated, for example, that they also contain BR 75 S RNA.

The most important finding is perhaps not that a few granules are present in cytoplasm, but rather that the major part of cytoplasm seems essentially to lack large granules. The vast majority of the granules in a cell (more than 95%) are recorded within the nucleus. On the other hand, by far most 75 S RNA molecules in the cell are present in cytoplasm (7) and cannot be accommodated in the low number of large granules detected there. One then has to look for another cellular component harboring most 75 S RNA in cytoplasm. Since the BR RNA sequences are thought to code for the salivary polypeptides (1, 2), it would seem logical to test directly whether or not 75 S RNA enters polysomes.

IV. Polysomes of Large Size in Chironomus Salivary Glands

The polysomes can be extracted in a high yield from the salivary gland cells in a detergent solution at high ionic strength, using a homogenization procedure (17). However, the large RNA in cytoplasm is then to a large extent degraded. A new technique avoiding homogenization as well as a precentrifugation step has recently been developed and applied in analysis of the polysomes in these cells (18). The glands are placed in a cooled DOC-Tween solution and torn open by dissection needles. This technique ensures an efficient release of polysomes as well as a reduced degree of degradation of cytoplasmic 75 S RNA (Fig. 6B). The undissolved material, including the nuclei with the giant chromosomes, is removed by the dissection needles, and the extract can be directly loaded on top of a sucrose gradient for sedimentation analysis.

FIG. 2. Electron micrograph of chromosome IV in Chironomus tentans (A) and of putative transcription complexes in a Balbiani ring (B). In panel A the three Balbiani rings (BR-1, -2 and -3) can be recognized, the location of BR-2 being specifically indicated. The Balbiani rings represent expanded chromosome regions with characteristic loop structures, which are particularly apparent in the periphery of the rings. In panel B, a few conspicuous loops are displayed at a high magnification. Each of them shows a large number of granules, many of which have a stalk connecting the granule to an axis in the center of the loop. A BR loop is likely to represent a transcription complex, each granule with its stalk corresponding to a growing ribonucleoprotein fibril and the axis to the deoxyribonucleoprotein template. The upper half of B corresponds to nuclear sap containing spherical granules lacking stalks. The salivary glands were fixed in glutaraldehyde and stained in uranyl acetate followed by lead citrate. The scale line represents 2 µm in A and 0.2 µm in B.

In a 15 to 60% (w/w) sucrose gradient, the polysomal extract showed a bimodal distribution of particles, one sharp peak corresponding to monosomes and a broad distribution (200-2000 S) of polysomes, with an average peak at about 700 S (Fig. 6). The polysomal nature of the rapidly sedimenting material was confirmed when the polysomes in the extract were dissociated by EDTA: the rapidly sedimenting material shifted almost completely from the polysome region toward the top of the gradient (Fig. 6A). The remarkable feature of the polysome profile is the high sedimentation values observed. They indicate that many of the polysomes are of very large size or, alternatively, that smaller polysomes are aggregated with each other or contaminated with cellular debris. These various possibilities were investigated by electron microscopic analysis of one fraction from the heavy part of the gradient (fraction 7) and one from the light part (fraction 19). The results are shown in Fig. 7: rapidly sedimenting polysomes are presented in panels A, B and C, and more slowly sedimenting ones are displayed in D, E and F. Most of the polysomes appeared more or less collapsed with the ribosomes tightly packed (A, D), but sometimes more untangled ones could be observed (B, E). In the most favorable cases, we recorded well-extended polysomes with a distinct fibril connecting the ribosomes to each other (C, F). The EM pictures of the best-spread polysomes therefore suggest that we are dealing with true polysomes rather than with aggregates of smaller ones.
This is further supported by the fact that we never observed more than two free ends of the well-extended polysomes. Finally, the electron micrographs demonstrate that the high sedimentation values can hardly be explained on the basis of a major contamination such as membranous material or clumps of secretion bound to the polysomes.

FIG. 3. Electron micrographs showing the location of large granules in nuclear sap, within nuclear pores and in cytoplasm. In panel A the nuclear envelope with a few nuclear pore complexes can be seen in the center. One of the pores harbors a large granule with a projection heading toward the cytoplasm. It can also be observed that the nuclear sap (upper half) contains several large granules, while the cytoplasm displays an abundant number of ribosomes. The salivary glands were fixed in glutaraldehyde and osmium tetroxide and stained with uranyl acetate followed by lead citrate. The scale line represents 0.2 µm. Panel B exhibits a few large granules in nuclear sap, one in a nuclear pore complex and four in the cytoplasm. A well-developed granular endoplasmic reticulum as well as frequent microtubules can also be noted in the cytoplasm. The glands were fixed in glutaraldehyde and stained in uranyl acetate followed by lead citrate. The magnification is the same as in A.

To obtain a better idea of the polysome sizes along the gradient, a large number of polysomes in fractions 7 and 19 were studied as to the number of ribosomes per polysome. In fraction 19, polysomes containing 11-13 ribosomes were most frequent, but the range of polysome sizes was large (5-40 ribosomes per polysome). The most abundant size class in fraction 7 corresponded to about 60 (55-64) ribosomes per polysome. Again the size variation was considerable, with some polysomes in fraction 7 exceeding one hundred ribosomes in size.
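The size-class tally described above can be sketched in a few lines of Python. This is only an illustration: the ribosome counts are hypothetical and are not data from this study, and the function names and the 10-ribosome class width (aligned so that one class spans 55-64, as in the text) are assumptions made for the example.

```python
from collections import Counter


def size_class(n_ribosomes, width=10, offset=5):
    """Return (low, high) bounds of the size class containing n_ribosomes.

    With width=10 and offset=5 the classes are 5-14, 15-24, ..., 55-64, ...
    """
    low = ((n_ribosomes - offset) // width) * width + offset
    return (low, low + width - 1)


def modal_class(counts, width=10, offset=5):
    """Tally polysomes by size class and return (most frequent class, tally).

    Ties are resolved in favor of the smaller size class.
    """
    tally = Counter(size_class(n, width, offset) for n in counts)
    best = max(tally.items(), key=lambda item: (item[1], -item[0][0]))[0]
    return best, tally


# Hypothetical ribosome counts scored on micrographs of a heavy fraction.
heavy_fraction_counts = [58, 61, 47, 60, 72, 63, 55, 104, 39, 66]

most_frequent, tally = modal_class(heavy_fraction_counts)
print("most abundant size class:", most_frequent)
print("largest polysome scored:", max(heavy_fraction_counts))
```

A narrower class width would be needed to resolve a modal class such as 11-13 ribosomes in a light fraction; the width is left as a parameter for that reason.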

V. Balbiani-ring RNA in Polysomes

In order to characterize the polysomes further, RNA was recovered from the heavy as well as from the light polysome region and analyzed by electrophoresis (Fig. 8). In both fractions a broad size range of high-molecular-weight RNA (10-75 S) is observed, but the rapidly sedimenting polysomes contain proportionately more of the highest-molecular-weight RNA (50-75 S RNA). In EDTA-shift experiments (18), it was furthermore demonstrated that more than 60% of the 30-75 S RNA cosedimenting with the polysomes could be shifted to the postpolysomal region of the gradient and is therefore likely to be located in polysomes. For 75 S RNA in particular, it was calculated that at least 40% of all cytoplasmic 75 S RNA is present in polysomes. This figure should be regarded as a minimum value, mainly because some 75 S RNA may have been released from the polysomes but still have sedimented as part of a ribonucleoprotein particle in the polysome region of the gradient and accordingly been recorded as EDTA-insensitive (cf. distribution of label in the EDTA experiment in Fig. 6).

FIG. 4. Survey electron micrograph of a salivary gland cell showing the location of the narrow zone in cytoplasm with most of the cytoplasmic granules of BR type. A segment of the nucleus appears in the upper left corner of the figure. The zone adjacent to the nucleus, with most of the cytoplasmic granules of BR type, is demarcated with dashed lines. The cytoplasm is dominated by granular endoplasmic reticulum except for a basal zone filled with mitochondria (lower right corner). Large, electron-dense secretion granules scattered in the cytoplasm can also be readily observed. The abundant granular endoplasmic reticulum as well as a large number of Golgi complexes and secretion granules suggest that the main function of these cells is to synthesize and secrete proteins. The glands were fixed and stained as described for Fig. 3A. The scale line represents 1 µm.

FIG. 5. The distribution of large granules within a narrow cytoplasmic zone adjacent to the nuclear envelope. The zone is 1 µm broad and is visualized in Fig. 4. Ten cells from ten larvae were chosen for statistical analysis. One section passing through the center of the nucleus was scanned per cell, and the position of each granule within the cytoplasmic zone was measured relative to the nuclear envelope. Abscissa: distance from nuclear envelope (nm).

The in situ hybridization technique was also applied in the analysis of RNA extracted from rapidly as well as from slowly sedimenting polysomes (19). It was observed that RNA from the heavy (Fig. 9A) as well as from the light (Fig. 9B) polysome region contained sequences complementary to BR-1 and BR-2 DNA. Most of this RNA was present in EDTA-sensitive material (Fig. 9C and D), thus presumably originating from polysomes. From these results the size of the RNA responsible for the hybridization reaction cannot be decided. However, since 75 S RNA was shown to be the only cytoplasmic RNA fraction to hybridize in situ to the BRs (8), our in situ experiments are compatible with the idea that RNA molecules of 75 S size enter polysomes.

On the basis of the described experiments, we have concluded that at least some 75 S RNA molecules synthesized in BR-1 and BR-2 end up in large-sized polysomes. It should be noted that the polysome sizes may well have been underestimated, as it was not possible to completely abolish degradation of the highest-molecular-weight RNA during polysome extraction and analysis (cf. Figs. 1 and 8). This latter circumstance also precludes conclusions from the finding that BR sequences are also present in slowly sedimenting polysomes.

FIG. 6. Sucrose-gradient sedimentation of Chironomus polysomes (A) and electrophoresis of labeled RNA in a polysomal extract (B). (A) Eight salivary glands were labeled in vivo for 3 days. The polysomes were extracted at 2°-4°C in a detergent solution containing deoxycholate and Tween 80. One half of the extract was treated with EDTA (final concentration 0.02 M). Each sample was layered on a 15-60% sucrose gradient and spun at 40,000 rpm for 30 minutes at 4°C. Untreated sample, crosses; EDTA-treated, filled circles. Abscissa: fractions. (B) In parallel experiments, four glands were labeled and the polysomes extracted as in A, but subsequently Sarkosyl/Pronase was added, and the RNA was precipitated in ethanol, redissolved and analyzed by electrophoresis in a 1% agarose gel. Abscissa: slice number. For further experimental details, consult (18).

Our finding that 75 S RNA molecules from BR-1 and BR-2 are incorporated into polysomes strongly suggests that they act as messenger molecules in cytoplasm. Since the corresponding polysomes are so large, the coding segments of the molecules should be of considerable size and far exceed that required for average-sized polypeptides (a polypeptide of 30,000-dalton size corresponds to polysomes containing about 10 ribosomes). It is true that indirect cytogenetic and biochemical information suggests that the BR products contain genetic information for one or more of the salivary polypeptides, the main protein product of the salivary gland cells (for discussion, see 20), but we still lack more direct information. The successful translation of the 75 S RNA messages in a suitable system is probably needed for firm conclusions on the nature of the sequences in these long coding segments.
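The arithmetic behind this comparison can be made explicit. The Python sketch below takes the ratio quoted above (a 30,000-dalton polypeptide for about 10 ribosomes) as its only calibration and scales it linearly. The average residue mass of 110 daltons and all function names are assumptions introduced for the example, and uniform ribosome packing along the message is assumed throughout.

```python
# Assumption (not from the text): average mass of one amino-acid residue.
AVG_RESIDUE_MASS_DA = 110.0
NT_PER_CODON = 3

# Calibration quoted in the text: 30,000 daltons <-> about 10 ribosomes.
REFERENCE_MASS_DA = 30_000
REFERENCE_RIBOSOMES = 10


def coding_length_nt(protein_mass_da, residue_mass_da=AVG_RESIDUE_MASS_DA):
    """Approximate length of the coding segment for a polypeptide of given mass."""
    residues = protein_mass_da / residue_mass_da
    return residues * NT_PER_CODON


def nt_per_ribosome():
    """Ribosome spacing implied by the calibration, in nucleotides."""
    return coding_length_nt(REFERENCE_MASS_DA) / REFERENCE_RIBOSOMES


def implied_polypeptide_mass_da(n_ribosomes):
    """Polypeptide mass implied by a fully loaded polysome of n_ribosomes."""
    return REFERENCE_MASS_DA * n_ribosomes / REFERENCE_RIBOSOMES


for n in (10, 60, 100):
    mass = implied_polypeptide_mass_da(n)
    print(f"{n:>3} ribosomes -> ~{mass:,.0f} daltons, "
          f"~{coding_length_nt(mass):,.0f} nucleotides of coding sequence")
```

Under these assumptions a polysome of about 60 ribosomes, the most abundant class in the heavy fraction, would correspond to a polypeptide of roughly 180,000 daltons. The mass estimate does not depend on the assumed residue mass; only the nucleotide figures do.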

VI. Concluding Remarks

In the early studies of the behavior of BR RNA, the microdissection technique was instrumental in providing us with well-defined and pure

FIG. 7. Electron micrographs of rapidly (A-C) and slowly (D-F) sedimenting polysomes. Polysomes were analyzed in sucrose gradients as shown in Fig. 6A. Samples from the heavy (fraction 7) as well as from the light (fraction 19) polysome region were collected. Grids were placed at 2°-3°C for 5 minutes in droplets prepared from the samples. Subsequently, the material attached to the grids was fixed in 4% formaldehyde, treated with Photoflo and stained in phosphotungstic acid according to Miller and Bakken (24).

FIG. 8. Electrophoretic analysis of RNA extracted from heavy (HP) and light (LP) polysomes. Polysomes were sedimented in sucrose gradients as shown in Fig. 6A. Gradient fractions 4, 6, 8, 10 and 12 were pooled and formed a heavy polysome sample (HP), while fractions 14, 16, 18, 20 and 22 constituted a light polysome sample (LP). RNA was released by Sarkosyl/Pronase, precipitated in ethanol, redissolved and analyzed in 1% agarose gels according to Daneholt et al. (18). Abscissa: slice number.

cellular components, such as specific chromosome segments, nuclear sap, and cytoplasm. In the present investigation, we have applied electron microscopy and biochemical cell fractionation techniques, which have permitted us to locate the BR RNA in defined ribonucleoprotein components within the nucleus and cytoplasm and to obtain strong support for a messenger function for the BR RNA.

The present information on the BR products shows that at least some 75 S RNA molecules synthesized in BR-1 and in BR-2 are delivered to cytoplasm and become incorporated into polysomes without a major reduction in size during the transfer process. However, two possibilities must be kept open. First, it has not been excluded that the primary transcript is cleaved before it is completed. Second, the product delivered from the BRs (75 S) could also subsequently be diminished in size to a minor extent, since previous size estimates made of these giant molecules are not sufficiently accurate to preclude such a possibility. Other processing steps also have to be considered. Addition of a poly(A) segment is indicated (21), but experiments testing for capping¹ and methylation¹ are lacking. Furthermore, in view of the rapid degradation of a large portion of heterogeneous nuclear RNA in eukaryotic cells (e.g., 22), it is also important to analyze the BR system as to the degree of conservation of the sequences synthesized. Recent experiments by Egyházi (23) indicate that only a minor portion of the 75 S RNA molecules (about 5%) ever reach cytoplasm. Although not conclusive, this result hints at the possibility that coding sequences might be degraded to a large extent in this system. The possible significance of this observation for a posttranscriptional regulation of gene expression has to be further explored.

¹ See papers in Part I of this volume.

FIG. 9. In situ hybridization of RNA from heavy and light polysomes. Polysomes were prepared and analyzed as described for Fig. 6A, one gradient containing nontreated polysomes, the other EDTA-treated ones. RNA was released from the heavy polysome (HP) and the light polysome (LP) region in both gradients as described in Fig. 8 and subsequently studied by hybridization in situ. For further experimental details, see Wieslander and Daneholt (19). Non-EDTA-treated polysome extract: (A) HP and (B) LP. EDTA-treated: (C) HP and (D) LP.

In eukaryotes, the transfer of genetic information from the DNA template to polysomes seems to be a complex process involving multiple steps. It is evident that the transcript must be associated permanently or transiently with a number of proteins, including various enzymes, before it can become functionally active in cytoplasm. It is conceivable that the BR system might offer some advantages in the search for and characterization of components intimately coupled to the transcript. The exceptional size of the BR granules in the nucleus as well as of the BR RNA-containing polysomes in the cytoplasm might be useful in attempts to isolate the various proteins associated with one specific gene product during its transfer from the chromosomes via the nuclear sap into polysomes in the cytoplasm.

ACKNOWLEDGMENTS

The technical assistance of Mrs. B. Hyde, Miss Eva Mårtenzon and Mrs. Sigrid Sahlén is gratefully acknowledged. The present work was supported by the Swedish Cancer Society, Magnus Bergvalls Stiftelse and Karolinska Institutet (Reservationsanslaget). S.T.C. is a recipient of a National Research Service Award from the National Institutes of Health (U.S.A.).

REFERENCES

1. W. Beermann, Chromosoma 12, 1 (1961).
2. U. Grossbach, CSHSQB 38, 619 (1973).
3. J.-E. Edström, in "Methods in Cell Physiology" (D. M. Prescott, ed.), Vol. 1, p. 417. Academic Press, New York, 1964.
4. B. Lambert and B. Daneholt, in "Methods in Cell Biology" (D. M. Prescott, ed.), Vol. 10, p. 17. Academic Press, New York, 1975.
5. B. Daneholt, Cell 4, 1 (1975).
6. B. Daneholt, Nature NB 240, 229 (1972).
7. B. Daneholt and H. Hosick, PNAS 70, 442 (1973).
8. B. Lambert and J.-E. Edström, Mol. Biol. Rep. 1, 457 (1974).
9. B. Daneholt, J.-E. Edström, E. Egyházi, B. Lambert and U. Ringborg, Chromosoma 28, 418 (1969).
10. E. Egyházi, PNAS 72, 947 (1975).
11. B. Lambert, JMB 72, 65 (1972).
12. S. T. Case and B. Daneholt, Anal. Biochem. 74, 198 (1976).
13. W. Beermann and G. F. Bahr, Exp. Cell Res. 6, 195 (1954).
14. B. J. Stevens and H. Swift, J. Cell Biol. 31, 55 (1966).
15. G. Vazquez-Nin and W. Bernhard, J. Ultrastruct. Res. 36, 842 (1971).
16. B. A. Hamkalo, O. L. Miller, Jr. and A. H. Bakken, CSHSQB 38, 915 (1973).
17. H. Hosick and B. Daneholt, Cell Diff. 3, 273 (1974).
18. B. Daneholt, K. Anderson and M. Fagerlind, J. Cell Biol., in press.
19. L. Wieslander and B. Daneholt, in preparation.
20. B. Daneholt, Int. Rev. Cytol. Suppl. 4, 417 (1974).
21. J.-E. Edström and R. Tanguay, JMB 84, 569 (1974).
22. B. P. Brandhorst and E. H. McConkey, JMB 85, 451 (1974).
23. E. Egyházi, Cell 7, 507 (1976).
24. O. L. Miller and A. H. Bakken, Acta Endocrinol. 168, 155 (1972).

Distribution of hnRNA and mRNA Sequences in Nuclear Ribonucleoprotein Complexes

ALAN J. KINNIBURGH, PETER B. BILLINGS, THOMAS J. QUINLAN¹ AND TERENCE E. MARTIN
Department of Biology, University of Chicago, Chicago, Illinois

I. Introduction

The past few years have seen the accumulation of a wealth of new information on the nature of the sequences that exist in the nuclear and cytoplasmic RNAs of higher eukaryotes (as witness the other chapters in this volume). Our knowledge of the details of "marker" sequences in hnRNA and mRNA has been greatly expanded, and although the precise relationship between the large nuclear transcripts and functioning cytoplasmic messenger is still debated, there is considerable reason to believe that this detailed information will provide essential clues to the mechanism and selectivity of mRNA processing.

In parallel with studies on purified RNA over the past decade, there have been attempts to isolate from cells the native forms of RNA molecules at various stages of processing. The investigations of these extractable ribonucleoprotein complexes (RNP) have led to a subset of supporting and conflicting data, and it should be stressed that the results presented in this paper are interpreted in terms of the previous work from our laboratory; we have not set out to provide a general review or inclusive citation list for an expanding field. What can be stated fairly of this field at the time of writing is that the experiments have generally followed behind the observations of detectable structural elements in purified RNA molecules and have necessarily drawn on techniques developed in the latter studies. The present paper for the most part is of this pattern in that we have begun to relate more recently described sequence elements, such as oligo(A), oligo(U), double-stranded RNA and methylated nucleotides, to RNP subcomplexes previously purified and partially characterized in our laboratory. However, it is hoped that novel information concerning nuclear RNA will also be provided by this approach; we report here the presence of a highly abundant class of RNA sequences associated with a purified nuclear RNP complex. Further, we believe that a parallel examination of the proteins of these complexes, and their interaction with specific RNA sequence classes, will eventually provide the missing links in the understanding of the processing of nuclear RNA with the concomitant selection and transport of mRNA molecules.

¹ Present address: Department of Pathology and Anatomy, Mayo Clinic, Rochester, Minnesota 55901.

II. Distribution of RNA Sequences in Nuclear Extracts

A. Pulse-Labeled Nuclear "DNA-like" RNA and Nuclear Poly(A)

The rapidly synthesized, heterogeneous nuclear RNA (hnRNA) of eukaryotes is found complexed with proteins (1-3). The experiments of Georgiev and his colleagues (1) indicated that the nuclei of some mammalian tissues, if extracted with a simple salt buffer (0.1 M NaCl, 0.01 M TrisCl pH 8.0, 0.001 M MgCl₂) in the presence of rat-liver-supernatant RNase inhibitor, could yield large RNP complexes containing RNA molecules approaching the size of hnRNA molecules extracted directly from nuclei by deproteinization. In the absence of the RNase inhibitor, or with the addition of small amounts of RNase, the bulk of the pulse-labeled nuclear RNA was extracted in a 30 S RNP form. We had made similar observations concerning the pulse-labeled RNA of mouse liver nuclei (2), and an experiment of this kind is shown in Fig. 1. These results imply that the 30 S RNP particle is a substructure of larger hnRNP fibers, but it will be noticed that, in the case of our mouse liver experiments (Fig. 1), the labeled RNA of large structures is not quantitatively converted by RNase to the 30 S form. Because of the loss of some RNA, we feel that we cannot conclude, along with Samarina et al. (1), that the large structures are entirely composed of 30 S subcomplexes, although the 30 S RNP do contain a major fraction of the pulse-labeled "DNA-like" RNA (D-RNA) of the nucleus. Other extraction procedures have also implied the existence of 30 S substructures in hnRNP complexes (see, e.g., ref. 3), and it was a natural extension of the above observations to attempt to characterize both the protein and RNA sequence components of these relatively homogeneous, and thus readily purified, particles.
In order to do this on a scale providing sufficient material for chemical, and eventually immunological, studies, while still allowing the use of high-activity radiolabeling techniques, we have chosen to analyze the nuclear RNP components of mouse ascites

FIG. 1. Sedimentation distribution of pulse-labeled RNA in an extract of mouse liver nuclei: the effect of a low concentration of RNase. A mouse was injected intraperitoneally with 200 µCi of [³H]uridine; after 45 minutes, liver nuclei were isolated, and a nuclear extract was prepared in the presence of rat liver supernatant RNase inhibitor and analyzed on sucrose gradients as described by Martin and McCarthy (2). The extract was analyzed with (0-0) and without (0-0) prior addition of 1 µg of pancreatic RNase per milliliter. Abscissa: fraction number; the 30 S and 60 S positions are marked.

cells (Taper liver tumor) (2), with occasional comparisons made to other tissues and species (4). The experiments described here employed these tumor cells. Our previously described studies (2, 4) show that the RNA of the 30 S RNP subcomplex, which is the major extractable species in our mouse ascites cells, even in the presence of rat liver RNase inhibitor, is DNA-like and not ribosomal in character, in agreement with Samarina et al. (1). Nucleic acid hybridization-competition demonstrated that the 30 S complex contained nucleus-restricted RNA sequences (2), and the kinetics of labeling and turnover of the RNA (2) were in the range expected for hnRNA (5, 6). We could not prove the presence of mRNA sequences, although in vitro experiments indicated the binding of mRNA sequences to the 30 S RNP proteins under conditions in which tRNA and rRNA were not bound (4). The elucidation of the linkage of an approximately 200-nucleotide sequence of adenylate residues to some hnRNA and the majority of mRNA molecules, and the inferred precursor-product relationship between nuclear poly(A)-containing species and cytoplasmic messenger (7), led us to seek the poly(A) "tail" in 30 S RNP subcomplexes. Instead, we found a new subparticle, which contains the bulk of nuclear poly(A)

338

ALAN J, KINNIHURCH ET AL.

in a ribonucleoprotein form sedimenting at approximately 15 S in sucrose gradients ( 4 , 8, 9). Collating these observations, we have the simple model of the riboIiucleoprotein complex containing nuclear hnRNA ( Fig. 2 ) , which provides the working frame for our subsequent attempts to locate other nuclear RNA marker sequences with respect to identifiable KNP substructures. Along with others ( 4 , 10, II), we presume that proteins associate with hnRNA during transcription and that the completed molecuIe is released in an RNP fiber form that has a distinctive subunit structure. W e suspect that this is the natural template for RNA modifica-

0,0003

POSSIBLE

hn RNP

STRUCTURES

. .,.: ... .:. .;::; . .:;:. .::.: :. .:....: :....::..:F .. .: :::..' ;. .: --------

,

, , , , ,

RNase -sensitive

,

'

' ''

sites

subparticle

30 S

subparticles

FIG. 2. Simple models of hnRNP structure indicating the 30 S subcomplexes known to 'contain rapidly synthesized DNA-like RNA including nucleus-restricted sequences ( 2 ) hound to a simple set of polypeptides ( 4 ) , and the 15 S subcomplex containing poIy(A) ( 4 , 10) associated with a completely distinct group of polypeptides ( 8 ) . The lower model showing RNP filaments connecting the particle structures takes into consideration the failure to completely recover the labeled RNA from large structures in particle form (Fig. l ) ,and the possibility that the large structures may contain a greater diversity of polypeptides (see, for example, ref. 3 ) . An alternative explanation of these observations would be the coexistence in the nucleus of completely particulate and completely filamentous structures. The electron micrographs of Miller and his colleagues give suggestive visual evidence of the partially particulate character of hnRNA transcription fibers (12, see also ref. 11 ).

hnHNA

AND

mHNA

IN NUCLEAR HIRONUCLEOPROTEIN

339

tion and processiiig. The bulk of the newly synthesized RNA sequences is associated with the 30 S substructure, which is composed of a very simple set of polypeptides ( 4 ) , whereas the 200-nucleotide poly( A ) added posttranscriptionally is associated with a completely different group of protcins ( 8 ) . The aggravating thing to us about the model was that it did not indicate the RNP associations of mRNA sequences. The following sections essentially present a summary of our present efforts to define the interaction of thcse and other sequcnces of interest with nuclear proteins responsible for RNA processing and transport.

B. mRNA Sequences in 30 S Ribonucleoprotein Subcomplexes

Although the 30 S RNP particle contains most of the pulse-labeled nuclear RNA and represents a basic subunit of the larger hnRNP, the bulk of hnRNA turns over in the nucleus, with only a small proportion entering the cytoplasm as mRNA (5, 6). Therefore, it was not clear from labeling studies that the 30 S RNP structure contained mRNA sequences. Hybridization-competition experiments could demonstrate nucleus-restricted sequences but could not prove the presence of mRNA sequences (2), despite the fact that the proteins of 30 S RNP can bind mRNA (4). We have recently assayed the RNA from 30 S RNP of mouse ascites cells for mRNA(An) sequences by hybridizing particle RNA with complementary DNA (cDNA) synthesized from an mRNA(An) template (13). RNA from crude 30 S RNP hybridized with this cDNA, although the kinetics were slower than for the mRNA-driven reaction, and complete hybridization was not achieved (Fig. 3). When RNA from 30 S RNP purified by sedimentation through a second sucrose gradient was hybridized with cDNA, the reaction proceeded faster than that of RNA from crude 30 S RNP and was nearly complete (85% relative to the homologous reaction). Therefore most, and possibly all, of the mRNA(An) species are present in RNA of 30 S RNP. Since the amount of RNA in a single 30 S RNP particle is insufficient for the length of most mRNA sequences, a given sequence must span more than one 30 S RNP subcomplex. From the kinetics of hybridization, we estimate that 10-15% of the RNA of 30 S RNP is homologous to mRNA(An). The remainder of the sequences in 30 S RNP probably represent nucleus-restricted sequences, though a small proportion may be mRNA (no An) sequences (14). We therefore must conclude that mRNA sequences are associated, at some stage of their processing in the nucleus (presumably just after synthesis), with the proteins that interact to form the 30 S subcomplex. Since these proteins are not readily detectable in the cytoplasm (15), they are probably replaced by other proteins before or immediately after transport. Our data cannot exclude completely the possibility that there exist in the nuclei of mammalian cells mRNA copies that are never transported to the cytoplasm.

[Figure: hybridization curves; abscissa, log R0t]

FIG. 3. Hybridization kinetics of crude and purified RNA from 30 S RNP and cytoplasmic mRNA(An) with cDNA specific for the latter. cDNA was synthesized from RNA with avian myeloblastosis virus reverse transcriptase (34). Samples for hybridization (in 0.9 M NaCl, 0.09 M Na-citrate) were sealed in 5-µl disposable pipettes and hybridized at 67°C. At various times samples were quenched in an ice-water bath and the contents expelled into 0.4 ml of S1 nuclease buffer (37) containing 100 µg/ml of heat-denatured mouse DNA. Samples were digested with S1 nuclease, precipitated with cold CCl3CO2H, collected on Whatman GF/A glass fiber filters and counted. Background S1-resistant radioactivity (unhybridized, nuclease-digested samples) has been subtracted from each determination. All assays were done in duplicate. Symbols: circles, mRNA(An) homologous reaction; triangles (two styles), RNA from crude 30 S RNP and RNA from purified 30 S RNP, each with mRNA(An)-specific cDNA. R0t is the product of RNA concentration (moles nucleotide per liter) and time (seconds), i.e., [RNA-phosphate] × sec.
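The R0t scale used in Fig. 3 can be illustrated with a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes an average nucleotide residue mass of 340 g/mol to convert an RNA mass concentration into moles of nucleotide per liter, and it assumes ideal pseudo-first-order kinetics for an RNA-excess reaction. The function names and the example concentration are hypothetical.

```python
import math

# Assumption: average mass of one nucleotide residue in RNA (g/mol).
AVG_NUCLEOTIDE_G_PER_MOL = 340.0


def rot(rna_g_per_liter, seconds, residue_mass=AVG_NUCLEOTIDE_G_PER_MOL):
    """R0t = (moles nucleotide per liter) x (seconds)."""
    molar_nucleotide = rna_g_per_liter / residue_mass
    return molar_nucleotide * seconds


def fraction_hybridized(rot_value, rot_half):
    """Fraction of cDNA in hybrid for an ideal single-component,
    RNA-excess (pseudo-first-order) reaction with half-point rot_half."""
    return 1.0 - math.exp(-math.log(2.0) * rot_value / rot_half)


# Hypothetical example: 1 mg/ml RNA (1 g/liter) incubated for 1 hour.
example_rot = rot(1.0, 3600.0)
example_log_rot = math.log10(example_rot)
print(f"R0t = {example_rot:.2f} mol.sec/liter, log R0t = {example_log_rot:.2f}")
```

Under these assumptions a reaction reaches half completion when R0t equals its half-point, which is why curves for more complex RNA populations are displaced to higher log R0t.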

C. Methylated Nucleotides

(See articles by Moss et al., Furuichi et al., Perry et al., Rottman et al., and Busch et al. in this volume.)

The recent discovery of low levels of methylated nucleotides in mRNA and hnRNA of L cells (16), and the rapid progress that has been made in the characterization of these methylated derivatives, would appear to provide a basis for the assessment of the maturation state of sequences in nuclear RNP complexes. Eukaryotic messenger contains both internal methylated nucleotides, largely if not entirely N6-methyladenylate, and "blocked" structures at the 5' termini containing the general sequence proposed for viral mRNA (17, 18), consisting of m7GpppNm- - (19-22). Blocking groups of messenger RNA have been shown to contain either one or two 2'-O-methylnucleotides (22); however, only "caps" of the first type were isolated from hnRNA (23).

In the preceding section it was shown that the 30 S RNP in the nuclear extracts contain mRNA sequences. However, under the extraction conditions, poly(A) sequences contained in nuclear RNA are cleaved and recovered from sucrose gradients as a separate, smaller RNP complex in the region of 15 S (8). We sought to determine whether the methylated nucleotides and blocked 5' termini of hnRNA are contained within the 30 S complex by assaying the distribution of methyl-labeled nucleotides in our nuclear extracts. Conditions were chosen so as to reduce incorporation of [3H]methyl groups from methionine into ribosomal precursor RNA. Control experiments (not shown) with labeled adenosine indicated no decrease in incorporation into hnRNA under conditions of methionine starvation for the short labeling times employed. Ascites cells were incubated in methionine-free medium in the presence of a low concentration of actinomycin D for 15 minutes prior to addition of [Me-3H]methionine for 30 minutes. Extracts were prepared from isolated nuclei and fractionated on sucrose gradients (2). RNA extracted from three regions of the gradients (supernatant, 10-20 S and 30 S) was digested exhaustively with RNases A, T1 and T2, and analyzed by DEAE-cellulose (urea) chromatography (24).
Methyl-labeled derivatives were eluted in fractions corresponding to a charge of -2 (mononucleotides), -3 (dinucleotides) and -4.8 to -5 (capped oligonucleotides). The capping group was recovered largely in the gradient supernatant (4-10 S), with only 17% and 5% found in the 15 S and 30 S regions, respectively (Table I, experiment 1). A slightly higher proportion of the dinucleotides were in the larger complexes; 25% and 19% were recovered in the 15 S and 30 S fractions. However, methylated mononucleotides were distributed largely at the top of the gradient and in the 30 S region, with only 21% recovered in the 15 S region. The recovery of high proportions of di- and trinucleotides in the extracts implied that the suppression of 45 S rRNA transcription may be incomplete under the conditions of labeling used. We therefore subsequently included a preliminary incubation with a low concentration of actinomycin D in complete medium for 15 minutes prior to resuspending in methionine-free medium. Even under these conditions, methylated dinucleotides were still present in all fractions of the gradient (Fig. 4 and Table I, experiment 2). Although we cannot exclude a slight contamination of the 30 S region by rRNA from adjacent regions, 30 S RNP have been shown to have little affinity for tRNA or mature rRNA (4). It has been suggested (25) that labeled dinucleotides in hnRNA preparations may not result entirely from incomplete suppression of rRNA synthesis; a comparison of the ratios of the various 2'-O-methylnucleotides recovered in the dinucleotide peak with those of caps and rRNA indicates that they may in fact represent precursors to caps.

We have also included the rat liver supernatant RNase inhibitor (26) during the extraction procedure in an attempt to prevent cleavage of exposed caps from the 30 S RNP. Increased proportions of the capping group were recovered in the 15 and 30 S regions, amounting to 41% and 14%, respectively, of the total assayed (Table I). In the presence of the nuclease inhibitor, there is a considerable reduction in the labeled base-methylated mononucleotides at the top of the gradient, with 47% of the total now retained in the 30 S region. Further analysis is required to determine whether the methylated mononucleotides not contained in 30 S RNP under these conditions are derived from tRNAs. Sequences either associated with, but not specifically bound to, the proteins of 30 S RNP (free tails), or associated with distinct proteins [for example, poly(A)], have a high probability of being cleaved from the large hnRNP during the extraction process in this system. While it appears that internal base-methylated nucleotides are contained in 30 S RNP, the important question of whether proteins distinct from those of 30 S RNP are associated with the capping group remains to be answered. It is also quite possible that caps are added subsequent to a transfer of mRNA to other proteins involved directly in transport to the cytoplasm.

[Figure: DEAE-cellulose elution profiles by gradient region (panels include sup. and 15 S); marker charges -1 to -6 indicated along the top]

FIG. 4. Analysis of methylated derivatives in sucrose gradient fractionated nuclear extracts by DEAE-cellulose (urea) chromatography. Nuclear extracts (2) were prepared from cells labeled with [Me-3H]methionine for 30 minutes in low-dose actinomycin D (described in the legend to Table I, Expt. 2) and fractionated on sucrose gradients. RNA was extracted from selected regions of the gradient and digested with RNases A, T1 and T2 (as described in Table I). Samples were pooled in 7 M urea, 20 mM Tris pH 7.6 with oligonucleotide markers prepared by limited pancreatic digestion of yeast tRNA and applied to a 0.9 × 20 cm column of DEAE-cellulose (Whatman DE 52) equilibrated with 7 M urea/20 mM Tris. Oligonucleotides were eluted with a 200-ml linear gradient of 0.05-0.4 M NaCl in urea at a flow rate of 0.0: ml/min; 2-ml fractions were collected and counted in Aquasol. Positions of oligonucleotide markers, determined by continuous absorbance monitoring, are indicated by charge (-2, mono-; -3, di-; etc.). Fractions of the gradient assayed were 4-10 S (supernatant), 10-20 S (15 S) and the 30 S RNP absorption peak (30 S).

TABLE I

                                  Cpm recovered from DEAE-cellulose in         Cpm relative to cap
Expt. no.   Region of gradient    Mono (-2)   Di (-3)   Tri (-4)   Cap (-4.8)   Mono-    Di-     Tri-
1 (b)       Supernatant             8790       9880      1380       14,710       0.60    0.67    0.09
            15 S                    4250       4490         0         3270       1.30    1.37    -
            30 S                    6670       3400         0         1020       6.54    3.34    -
2 (c)       Supernatant             4610       2000         0         2080       2.22    0.96    -
            15 S                    4130        990         0         2200       1.88    0.45    -
            30 S                    8530        540         0          780      10.96    0.70    -
            >30 S                    720        240         0          240       3.03    1.00    -

a Nuclear extracts (in 0.1 M NaCl, 0.01 M Tris-Cl (pH 9), 1 mM MgCl2) were prepared from ascites cells labeled 30 minutes in methionine-free medium with 25 µCi/ml [Me-3H]methionine in the presence of 0.04 µg/ml actinomycin D, in 20 mM sodium formate, 20 µM each adenosine and guanosine. Extracts were fractionated on sucrose gradients, and RNA was extracted from pooled fractions [supernatant (4-10 S), 15 S (10-20 S) and 30 S regions] and digested with a mixture of ribonucleases (25 µg/ml RNase A, 15 U/ml RNase T1 in H2O for 2 hours at 37°C, followed by 15 U/ml RNase T2 in 10 mM sodium acetate pH 4.5 for 8 hours at 37°C). Digested RNA along with 1 mg of yeast tRNA oligonucleotides from an RNase A digest was bound to DEAE-cellulose in 7 M urea, 20 mM Tris pH 7.6. Oligonucleotides were eluted from the column with a linear gradient of 0.05 to 0.4 M NaCl in urea/Tris, and aliquots were counted in Aquasol.
b Cells were first incubated for 15 minutes at 37°C in minimal essential medium lacking methionine but containing 0.04 µg/ml actinomycin D.
c Cells were incubated 10 minutes in complete minimal essential medium containing 0.04 µg/ml actinomycin D and then suspended in methionine-free medium for 15 minutes prior to introduction of label. Extraction of nuclei was performed in the presence of the RNase inhibitor in a 100,000 × g supernatant of rat liver cytoplasm concentrated by 35-55% saturated ammonium sulfate fractionation.
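The percentages quoted in this section follow from the counts in Table I by dividing each region's cpm by the total over all gradient regions assayed. The sketch below recomputes them; the cpm values are those read from Table I, while the dictionary layout and function name are illustrative assumptions, not part of the original analysis.

```python
# Cap (-4.8) and mononucleotide (-2) cpm by gradient region, as read from Table I.
CAP_CPM = {
    "expt 1": {"supernatant": 14710, "15 S": 3270, "30 S": 1020},
    "expt 2": {"supernatant": 2080, "15 S": 2200, "30 S": 780, ">30 S": 240},
}
MONO_CPM = {
    "expt 1": {"supernatant": 8790, "15 S": 4250, "30 S": 6670},
    "expt 2": {"supernatant": 4610, "15 S": 4130, "30 S": 8530, ">30 S": 720},
}


def percent_by_region(cpm_by_region):
    """Percent of the summed cpm found in each gradient region."""
    total = sum(cpm_by_region.values())
    return {region: 100.0 * cpm / total for region, cpm in cpm_by_region.items()}


cap_pct = {expt: percent_by_region(cpm) for expt, cpm in CAP_CPM.items()}
mono_pct = {expt: percent_by_region(cpm) for expt, cpm in MONO_CPM.items()}

for expt, pct in cap_pct.items():
    summary = ", ".join(f"{region}: {value:.1f}%" for region, value in pct.items())
    print(f"cap, {expt}: {summary}")
```

The computed values agree with the figures in the text when the latter are read as truncated rather than rounded percentages (for example, 41.5% of the cap in the 15 S region in experiment 2 is quoted as 41%).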

D. Oligo(A), Oligo(U) and Double-Stranded RNA

In an attempt to analyze in more detail the nucleus-restricted sequences in 30 S RNP, we have assayed particles for the small adenylate-rich sequence [oligo(A)] found in hnRNA but not detected in cytoplasmic mRNA (27, 28). When adenosine pulse-labeled RNA was prepared from mouse ascites 30 S RNP and digested with T1 and pancreatic RNase, a prominent peak of 20-40 nucleotides was observed after electrophoresis on acrylamide gels (Fig. 5). When similar samples were further treated with T2 RNase, this peak was removed, leaving only the more heterogeneous higher-molecular-weight sequences, which are presumably double-stranded RNA sequences, dsRNA (31, 32). It was also possible to separate oligo(A) sequences in 30 S RNP from other RNase-resistant material by chromatography on oligo(dT)-cellulose (33). The unbound fraction consisted mainly of heterogeneously migrating RNA in acrylamide gels, whereas the bound material was a single peak of 20-40 nucleotides (data not shown). When undigested [3H]adenosine-labeled 30 S RNP-RNA was similarly analyzed, RNA identical in size to the total RNA of 30 S RNP (40-90 nucleotides) was found in the bound fraction (data not shown). The oligo(A) sequence is therefore most likely linked to RNase-sensitive sequences in 30 S RNP.

[Figure: gel radioactivity profile; abscissa, slice no.]

FIG. 5. Acrylamide gel electrophoresis of pancreatic and T1-RNase-resistant [3H]adenosine-labeled RNA from 30 S RNP. Cells were labeled for 20 minutes with 10 µCi/ml of [3H]adenosine and the nuclear extract prepared and centrifuged on a 15 to 30% sucrose gradient (2). RNA was extracted (29) from the 30 S region of this gradient. The RNA was digested for 40 minutes with 2 µg/ml of pancreatic and 10 U/ml of T1 RNase, purified, and electrophoresed on a 12% acrylamide gel for 2.5 hours at 5 mA/gel (30). The 4 S and 5 S RNA were included in this gel as molecular-weight markers, and nuclear poly(A) [from nuclear poly(A)-containing RNP] was run in a parallel gel. (BPB, bromphenol blue marker.)

We next wished to determine the quantitative distribution of oligo(A) in our nuclear extracts. An adenosine-labeled preparation was centrifuged on a 15 to 30% sucrose gradient and RNA prepared from the separate fractions. After pancreatic and T1 RNase digestion and electrophoresis on 12% acrylamide gels, the radioactivity in the 20-40 nucleotide region was summed and expressed as percent oligo(A) per fraction (Fig. 6). Most of the oligo(A) sequences in the nuclear extract were found at the top of the gradient. However, at least 20-30% of the extracted oligo(A) sequences were associated with 30 S RNP complexes. The relative sensitivity of oligo(A) sequences to cleavage may result from a lower affinity for the 30 S RNP proteins, partially exposing these regions in large hnRNP complexes.

[Figure: sucrose gradient profile; abscissa, fraction no.]

FIG. 6. Distribution of total pulse-labeled RNA and of oligo(A) sequences in a sucrose gradient of mouse ascites cell nuclear extract. Cells were labeled for 20 minutes with 10 µCi/ml of [3H]adenosine as described (8). Nuclei were prepared and extracted, and the extract was centrifuged on a 15 to 30% sucrose gradient. This gradient was fractionated into 5-ml aliquots and the RNA from each was extracted with chloroform/phenol (29). Total acid-precipitable radioactivity in each fraction was determined, and the oligo(A) content estimated by summation of the [3H]adenosine in the 20-40 nucleotide region of acrylamide gels after electrophoresis. Absorption at 254 nm (dotted line); total radioactivity of gradient in each fraction and percent oligo(A) per fraction (circles, two styles). Sedimentation is from left to right.

To further demonstrate the presence of oligo(A) sequences in the RNA of 30 S RNP, we have attempted to transcribe the sequences adjacent to the 5' end of oligo(A) by oligo(dT)-primed cDNA synthesis using avian myeloblastosis virus reverse transcriptase. The addition of oligo(dT) primer stimulates the template activity of the RNA of 30 S RNP to a degree similar to that obtained with mRNA(An) as template (Table II). Neither of these templates can direct cDNA synthesis primed with oligo(dA), indicating that the cDNA transcribed with oligo(dT) is almost certainly complementary to the sequence adjacent to the 5' end of the adenylate-rich sequences and is not a random representation of sequences.

We have also analyzed the RNA of 30 S RNP and other RNP regions of the gradient for double-stranded RNA (dsRNA) and oligo(U) sequences. Double-stranded RNA (as judged by pancreatic, T1, and T2 RNase resistance) was found in two peaks on RNP gradients, one in the 10-20 S region and the other in the 30 S region. Labeling 30 S RNP with [3H]adenosine or 32Pi gave different patterns of double-stranded RNA on 12% acrylamide gels, possibly indicating that two populations of dsRNA distinct in base composition (35) may be present in 30 S RNP. We have as yet been unable to demonstrate oligo(U) sequences in the RNA of 30 S RNP in terms of T1 RNase-resistant [3H]uridine radioactivity binding to oligo(dA)-cellulose. We have shown that oligo(U) is heterodisperse on our gradients of nuclear extracts; it is present in the 15 S RNP gradient region and becomes associated with the poly(A) component during RNA extraction (9). Our failure to detect oligo(U) sequences in some fractions by binding to oligo(dA)-cellulose may be due to prior association of the oligo(U) with adenylate-rich sequences.

TABLE II

RNA (a)                     Primer (b)     [3H]dCTP incorporated (c) (cpm × …)   Stimulation by primer (×)
Cytoplasmic RNA(An)         None             26.6                                  1.0
                            (dA)12-18        24.4                                  -
                            (dT)12-18       217.2                                  8.2
Total RNA from 30 S RNP     None              6.0                                  1.0
                            (dA)12-18         5.8                                  -
                            (dT)12-18        38.6                                  6.4

a 0.? µg RNA(An) and 1.5 µg RNA from 30 S RNP per 100 µl reaction were assayed by the procedure of Verma et al. (34).
b Primer was present at a final concentration of 7.6 µg/ml.
c [3H]dCTP specific activity was 2500 cpm/pmol.
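The stimulation values in Table II are the ratio of incorporation with primer to incorporation without primer for the same template. A minimal sketch of that arithmetic follows, using the incorporation values as read from Table II; the data structure and function name are illustrative assumptions.

```python
# Incorporation values as read from Table II (in the table's cpm units).
INCORPORATION = {
    "cytoplasmic RNA(An)": {"none": 26.6, "oligo(dA)": 24.4, "oligo(dT)": 217.2},
    "RNA from 30 S RNP": {"none": 6.0, "oligo(dA)": 5.8, "oligo(dT)": 38.6},
}


def fold_stimulation(values, primer):
    """Incorporation with the given primer relative to the unprimed reaction."""
    return values[primer] / values["none"]


stimulation = {
    template: {
        primer: fold_stimulation(values, primer)
        for primer in ("oligo(dA)", "oligo(dT)")
    }
    for template, values in INCORPORATION.items()
}

for template, folds in stimulation.items():
    print(template, {primer: round(fold, 1) for primer, fold in folds.items()})
```

With these numbers oligo(dT) gives roughly 8-fold and 6-fold stimulation for the two templates, whereas oligo(dA) leaves incorporation at or slightly below the unprimed level, which is the basis for the conclusion that priming occurs at adenylate-rich sequences.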

E. Oligo(A)-Linked Sequences in 30 S Ribonucleoprotein Subcomplexes

The oligo(A) sequence present in the RNA of 30 S RNP allows the transcription of 5'-linked sequence into cDNA (Table II). We have tested the complexity of the homologous RNA sequences of 30 S RNP by RNA-excess cDNA hybridization (36). A very rapidly hybridizing component was observed that comprises approximately 22% of the cDNA (Fig. 7). After the hybridization of this component, R0t values several orders of magnitude higher were needed before further hybridization was obtained (R0t is the product of RNA concentration, in moles nucleotide per liter, and time in seconds, i.e., [RNA-phosphate] × sec). The absolute saturation value for hybridizable cDNA has not been achieved owing to the relatively high concentrations of RNA required to drive the reaction. The complexity of the oligo(A)-linked sequences was assessed by comparison of the hybridization kinetics with those of a globin mRNA standard. The rapidly hybridizing sequences appear to have a complexity of 500-1000 nucleotides. More data are needed to make a similar estimate of complexity for the slower hybridizing component, but these sequences must be 10^3-10^4 times more complex than the rapidly hybridizing sequences.

[Figure: hybridization curve; abscissa, log R0t]

FIG. 7. Hybridization kinetics of RNA from 30 S RNP and homologous cDNA. RNA from 30 S RNP was prepared from pelleted complexes and cDNA prepared with avian myeloblastosis virus reverse transcriptase and oligo(dT) primer. Hybridization was performed as described for Fig. 3.

Oligo(A) has not been detected in mRNA and is internally located in hnRNA (27, 28). It therefore was important to establish whether the oligo(A)-linked sequences are homologous to mRNA or are completely nucleus-restricted. Cytoplasmic mRNA(An) was therefore allowed to hybridize with cDNA specific for sequences adjacent to oligo(A). It can be seen that mRNA(An) does hybridize with this cDNA (Fig. 8). The rate is only a fraction of that of the homologous reaction with the RNA of 30 S RNP, but is as fast as that of the most abundant class of mRNA(An) when reacting with its own cDNA. It appeared that the mRNA is hybridizing to the cDNA of simple complexity; to substantiate this directly, the rapidly hybridizing cDNA was purified by hybridization with the RNA of 30 S RNP to a R0t of 2.4 × 10^(-?) M × sec, followed by nuclease S1 digestion and alkaline hydrolysis of the hybridized RNA. When this cDNA was rehybridized to RNA(An), 35% of the input cDNA hybridized with kinetics similar to the rapidly hybridizing fraction of total cDNA specific for sequences adjacent to oligo(A) (data not shown). The failure to achieve complete hybridization may have been due to degradation of this small cDNA (approximately 50 nucleotides in length) during purification, so that much of it was unable to form a stable hybrid under our conditions. However, the results suggest that these simple oligo(A)-linked sequences are homologous to some of the most abundant mRNA(An) sequences found in the cytoplasm. This is surprising, since mRNA(An) is presumed to be derived from the 3' end of hnRNA(An) molecules (38), yet oligo(A) sequences are found internally in hnRNA (27). It is possible that in the case of mRNA species homologous to oligo(A)-linked sequences, the cytoplasmic mRNA(An) molecules may be derived from internal regions of hnRNA transcripts.

[Figure: hybridization curves; abscissa, log R0t]

FIG. 8. Hybridization kinetics of cytoplasmic RNA(An) with cDNA specific for RNA from 30 S RNP. Cytoplasmic RNA(An) was hybridized to this cDNA as described for Fig. 3. RNA(An) hybridization with cDNA specific for RNA of 30 S RNP (circles); hybridization of RNA from 30 S RNP with homologous cDNA from Fig. 7 (dotted line); RNA(An) hybridization with homologous cDNA from ref. 13 (dashed line).
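The comparison with a globin mRNA standard mentioned in this section rests on the proportionality between sequence complexity and the R0t at half-reaction under fixed hybridization conditions. The sketch below shows that scaling only; the source gives no half-reaction values, so every number in it is a hypothetical placeholder, and the optional correction for the fraction of the driver RNA belonging to the component is an assumption about how such estimates are commonly refined, not a statement of the authors' calculation.

```python
def complexity_from_rot_half(sample_rot_half, standard_rot_half,
                             standard_complexity, sample_fraction=1.0):
    """Estimate sequence complexity (nucleotides) from half-reaction R0t values.

    sample_fraction is the assumed fraction of the driver RNA that belongs to
    the hybridizing component; 1.0 means no correction is applied.
    """
    pure_rot_half = sample_rot_half * sample_fraction
    return standard_complexity * pure_rot_half / standard_rot_half


# Hypothetical placeholders, not values from the chapter.
HYPOTHETICAL_STANDARD_COMPLEXITY = 1200.0   # nucleotides, assumed for the standard
HYPOTHETICAL_STANDARD_ROT_HALF = 6.0e-4     # mol.sec/liter
HYPOTHETICAL_SAMPLE_ROT_HALF = 4.0e-4       # mol.sec/liter

estimate = complexity_from_rot_half(
    HYPOTHETICAL_SAMPLE_ROT_HALF,
    HYPOTHETICAL_STANDARD_ROT_HALF,
    HYPOTHETICAL_STANDARD_COMPLEXITY,
)
print(f"estimated complexity: {estimate:.0f} nucleotides")
```

Because the relation is linear, a component whose half-reaction R0t is 10^3 to 10^4 times higher corresponds to a complexity 10^3 to 10^4 times greater, which is the form of argument used for the slowly hybridizing component.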

III. Concluding Remarks

The data we have presented in the preceding sections constitute preliminary information with regard to the organization of RNA sequences of current interest within nuclear ribonucleoprotein complexes. We have concentrated our studies so far on the identifiable subparticles of the structures that contain heterogeneous nuclear RNA; these can be purified and the polypeptide composition unequivocally determined. However, it is clear that certain important nucleotide structures are not tightly associated with the proteins of these subcomplexes. The 5' capping group containing methylated nucleotides appears to be an important example of a structure not contained in the 15 S and 30 S particles. Messenger RNA sequences are present in 30 S RNP, and so is at least a fraction of the oligo(A) of hnRNA, the latter being linked to a highly abundant RNA sequence in the nuclear RNP. The significance of this sequence remains to be determined, but its homology to a fraction of the cytoplasmic mRNA population is suggestive. It is necessary, of course, to extend the analysis to intact hnRNP complexes to determine the complete range of specific protein-nucleotide interactions, and finally, the native form in which mature mRNA is transported from nucleus to cytoplasm remains to be identified and characterized.

ACKNOWLEDGMENTS

We thank Ms. Ljerka Urbas for excellent technical assistance, which greatly facilitated the research described here. A.J.K. and P.B.B. are trainees supported by USPHS Training Grant HD-00174. T.J.Q. was a trainee of USPHS Training Grant GM-780. The research was supported by USPHS Grant CA-12550, and the University Cancer Research Center Grant CCRC Project IIIB.

REFERENCES

1. O. P. Samarina, E. M. Lukanidin, J. Molnar and G. P. Georgiev, JMB 33, 251 (1968).
2. T. E. Martin and B. J. McCarthy, BBA 277, 351 (1972).
3. T. Pederson, JMB 83, 163 (1974).
4. T. Martin, P. Billings, A. Levey, S. Ozarslan, T. Quinlan, H. Swift and L. Urbas, CSHSQB 38, 921 (1973).
5. R. Soeiro, M. Vaughan, J. R. Warner and J. E. Darnell, J. Cell Biol. 39, 112 (1968).
6. B. P. Brandhorst and E. H. McConkey, JMB 85, 451 (1974).
7. W. Jelinek, M. Adesnik, M. Salditt, D. Sheiness, R. Wall, G. Molloy, L. Philipson and J. E. Darnell, JMB 75, 515 (1973).
8. T. J. Quinlan, P. B. Billings and T. E. Martin, PNAS 71, 2632 (1974).
9. T. J. Quinlan, A. J. Kinniburgh and T. E. Martin, submitted for publication.
10. O. P. Samarina, E. M. Lukanidin, and G. P. Georgiev, in "Protein Synthesis in Reproductive Tissue" (E. Diczfalusy and A. Diczfalusy, eds.), Karolinska Symp. Res. Methods Reproductive Endocrinol., 6th Symp., p. 130. Karolinska Inst., Stockholm, 1973.
11. D. B. Malcolm and J. Sommerville, Chromosoma 48, 137 (1974).
12. O. L. Miller, Jr. and A. H. Bakken, in "Gene Transcription in Reproductive Tissue" (E. Diczfalusy and A. Diczfalusy, eds.), Karolinska Symp. Res. Methods Reproductive Endocrinol., 5th Symp., p. 155. Karolinska Inst., Stockholm, 1972.
13. A. J. Kinniburgh and T. E. Martin, PNAS 73, 272 (1976).
14. C. Milcarek, R. Price and S. Penman, Cell 3, 1 (1974).
15. E. M. Lukanidin, S. Olsnes and A. Pihl, Nature NB 240, 90 (1972).
16. R. P. Perry and D. E. Kelley, Cell 1, 37 (1974).
17. Y. Furuichi, M. Morgan, S. Muthukrishnan and A. J. Shatkin, PNAS 72, 362 (1975).
18. C. M. Wei and B. Moss, PNAS 72, 318 (1975).
19. J. M. Adams and S. Cory, Nature 255, 28 (1975).
20. Y. Furuichi, M. Morgan, A. J. Shatkin, W. Jelinek, M. Salditt-Georgieff and J. E. Darnell, PNAS 72, 1904 (1975).
21. R. P. Perry, D. E. Kelley, K. Friderici and F. Rottman, Cell 4, 387 (1975).
22. C. M. Wei, A. Gershowitz and B. Moss, Cell 4, 379 (1975).
23. R. P. Perry, D. E. Kelley, K. H. Friderici and F. M. Rottman, Cell 6, 13 (1975).
24. G. M. Tener, in "Methods in Enzymology," Vol. 12, Part A (S. P. Colowick and N. O. Kaplan, eds.), p. 398. Academic Press, New York, 1967.
25. M. Salditt-Georgieff, W. Jelinek, J. E. Darnell, Y. Furuichi, M. Morgan and A. Shatkin, Cell 7, 227 (1976).
26. J. S. Roth, JBC 231, 1085 (1958).
27. H. Nakazato, M. Edmonds and D. W. Kopp, PNAS 71, 200 (1974).
28. Edmonds, this volume.
29. R. P. Perry, J. LaTorre, D. E. Kelley and J. H. Greenberg, BBA 262, 220 (1972).
30. U. E. Loening, BJ 102, 251 (1967).
31. W. Jelinek and J. E. Darnell, PNAS 69, 2537 (1972).
32. W. Jelinek, G. Molloy, R. Fernandez-Munoz, M. Salditt and J. E. Darnell, JMB 82, 361 (1974).
33. H. Aviv and P. Leder, PNAS 69, 1408 (1972).
34. I. M. Verma, G. F. Temple, H. Fan and D. Baltimore, Nature 235, 163 (1972).
35. A. P. Ryskov, G. F. Saunders, V. R. Farashyan and G. P. Georgiev, BBA 312, 152 (1973).
36. J. O. Bishop, J. G. Morton, M. Rosbash, and M. Richardson, Nature 250, 199 (1974).
37. W. D. Sutton, BBA 240, 522 (1971).
38. G. R. Molloy, W. L. Thomas and J. E. Darnell, PNAS 69, 3684 (1972).

IV. Chromatin Structure and Template Activity

The Structure of Specific Genes in Chromatin
RICHARD AXEL

The Structure of DNA in Native Chromatin as Determined by Ethidium Bromide Binding
J. PAOLETTI, B. B. MAGEE AND P. T. MAGEE

Cellular Skeletons and RNA Messages
RONALD HERMAN, GARY ZIEVE, JEFFERY WILLIAMS, ROBERT LENK AND SHELDON PENMAN

The Mechanism of Steroid-Hormone Regulation of Transcription of Specific Eukaryotic Genes
BERT W. O'MALLEY AND ANTHONY R. MEANS

Nonhistone Chromosomal Proteins and Histone Gene Transcription
GARY STEIN, JANET STEIN, LEWIS KLEINSMITH, WILLIAM PARK, ROBERT JANSING AND JUDITH THOMSON

Selective Transcription of DNA Mediated by Nonhistone Proteins
TUNG Y. WANG, NINA C. KOSTRABA AND RUTH S. NEWMAN

The Structure of Specific Genes in Chromatin

RICHARD AXEL
Institute of Cancer Research and Department of Pathology
Columbia University, College of Physicians & Surgeons
New York, New York

I. Introduction

Control of gene expression in eukaryotes is thought to operate, at least in part, at the level of RNA synthesis, in such a way as to permit the transcription of specific genes in one tissue and to restrict their expression in other tissues. The molecules responsible for regulating specific gene expression probably interact with specific nucleotide sequences or structures that reside in the exceedingly complex genome. The genomic DNA of virtually all eukaryotes is associated with a stable complement of histone proteins and a variable level of nonhistone proteins to form a complex that we define as chromatin. The vast amount of information in the form of nucleotide sequence is therefore afforded a higher order of complexity resulting from the specific interaction of the chromosomal proteins both among themselves and with DNA. We have been concerned with the arrangement of proteins along the DNA and the possible role of these structures in regulating gene expression.

Physicochemical studies show that chromatin proteins alter both the long-range and short-range interactions of DNA, resulting in the formation of a relatively compact coil in which there is local distortion of the DNA secondary structure. Considerable effort has been expended to determine the role of various protein components in maintaining this structure, and to relate the structural changes to changes in template activity of the nucleoprotein complex. More recent biochemical and microscopic data have elucidated an elegant array of basic repeating units in chromatin, consisting of about 190 base-pairs of DNA and 8 histone molecules, that generate a 7:1 shortening of the DNA fiber (1-8). The elucidation of this subunit structure in chromatin immediately poses the question as to the possible role of this structure in those biological processes in which chromatin participates.


II. The Nucleosomal Subunit

During the past year, an experimentally consistent model of chromatin structure has emerged that postulates the presence of regularly repeating nucleoprotein subunits joined by short segments of DNA. Biochemical evidence for a periodic protein-DNA complex was introduced by Hewish and Burgoyne (1) and was subsequently confirmed in several laboratories (2-6). This evidence demonstrates that mild nuclease digestion of nuclei results in the liberation of a series of nucleoprotein particles, called nucleosomes or nu (ν) bodies, containing DNA fragments whose molecular weights are all multiples of a unit fragment 150-200 base-pairs in length. Subsequent studies have revealed that the nucleosome represents a transient intermediate in the digestion process, and the cleavage of DNA within the nucleosome results in the generation of a true limit digest reflecting the internal structure of the monomeric subunit. Electron microscopic observations (7, 8) are in accord with the biochemical data and reveal the existence of spherical nucleoprotein particles joined by short filaments of DNA. In gently disrupted nuclei, these particles can be observed closely apposed to one another, providing evidence that they reflect a level of structure characteristic of the chromatin fiber in vivo.

We have used the enzyme staphylococcal nuclease to probe the accessibility of DNA in both chromatin and intact nuclei. Digestion of chromatin from a variety of sources liberates about half the DNA as acid-soluble oligonucleotides. Analysis of the DNA liberated early in the digestion process allows us to discern the repeating subunit profile (Fig. 1). As mentioned above, at early times in the digestion process, DNA fragments are generated whose molecular weights are integral multiples of a single unit, about 185 base-pairs in length. At 3% solubilization, DNA bands corresponding to 185, 370, 550, 735 and 910 base-pairs are observed.
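The band sizes just listed can be compared with exact multiples of the 185 base-pair unit. The following is a minimal sketch of that comparison; the function names are illustrative, and the observed values are simply those quoted in the text.

```python
UNIT_BP = 185                             # repeat length quoted in the text
OBSERVED_BP = [185, 370, 550, 735, 910]   # band sizes reported at 3% solubilization


def ideal_ladder(unit_bp, n_bands):
    """Sizes expected if every band is an exact multiple of the unit length."""
    return [unit_bp * n for n in range(1, n_bands + 1)]


def relative_deviation(observed, ideal):
    """Signed fractional difference between each observed and ideal band size."""
    return [(o - i) / i for o, i in zip(observed, ideal)]


ideal = ideal_ladder(UNIT_BP, len(OBSERVED_BP))
deviations = relative_deviation(OBSERVED_BP, ideal)
for n, (obs, exp, dev) in enumerate(zip(OBSERVED_BP, ideal, deviations), start=1):
    print(f"{n}-mer: observed {obs} bp, ideal {exp} bp, deviation {dev:+.1%}")
```

The reported bands lie within about 2% of exact multiples, which is the sense in which the fragments are "integral multiples of a single unit".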
It is obvious that, as the digestion proceeds, the relative proportion of the smallest DNA fragment increases at the expense of the larger ones. These observations suggest that the proteins of chromatin arrange the DNA in a regular conformation that repeats itself every 185 base-pairs. The repeating subunit is punctuated by regularly spaced nuclease-cleavage sites, presumably due to a region relatively free of protein and sensitive to nuclease attack. It should be noted that, as the digestion proceeds, a series of DNA fragments of lower molecular weights is observed that provides information reflecting the internal structure of the nucleosomal subunit. A prominent fragment is observed at 140 base-pairs in length. Although considerable variation exists in the size of the larger subunit, this 140


FIG. 1. Polyacrylamide gel electrophoresis of purified DNA fragments liberated upon digestion of metaphase chromosomes and interphase nuclei. Nuclei and chromosomes were digested with staphylococcal nuclease. At various times, aliquots were removed and the DNA purified. DNA, 10 µg, was applied to each slot of a 2.5% acrylamide/0.5% agarose slab and electrophoresed at 200 V for 2 hours. Gels were stained with ethidium bromide and photographed. Samples from right to left represent 2, 4 and 10% digestion of metaphase chromosomes and 2, 4 and 10% digestion of interphase nuclei, respectively.

base-pair fragment appears in chromatin digests from virtually all eukaryotic sources and may represent the internal core of the basic repeating unit. The difference in size between the largest observable fragment (185 base-pairs) and this constant 140 base-pair intermediate presumably reflects the presence of the more accessible, interparticle DNA. As digestion proceeds to a limit at 50% solubilization of DNA, a second set of highly reproducible low-molecular-weight bands ranging in size from 140 down to 40 base-pairs is observed (Fig. 2). Similar patterns of DNA fragments are obtained whether we use intact nuclei, nucleosomes or isolated chromatin as the initial substrate for the enzyme (9, 10). The precision with which the enzymic process produces these discrete DNA fragments suggests that there is a highly specific arrangement of the


FIG. 2. Polyacrylamide gel electrophoresis (6%) of DNA liberated upon digestion of metaphase chromosomes and interphase nuclei. Interphase nuclei and metaphase chromosomes were digested with staphylococcal nuclease. The resistant DNA was freed of protein and 10 µg applied to each slot of a 6% polyacrylamide gel. The gel was run as described previously, stained with ethidium bromide, and photographed. Samples from right to left represent 14 and 28% digestion of metaphase chromosomes and 10 and 30% digestion of nuclei.

histones on the surface of the DNA in which certain well-defined portions of the polypeptide chains are in intimate contact with the DNA, or have folded the DNA in such a way that precisely determined lengths of nucleic acid are protected from digestion. The sizes of these segments are probably not determined by the presence of any long special sequences of nucleotides; similar patterns are obtained when nucleoprotein complexes are constructed using eukaryotic, bacterial, or viral DNA, which do not share such sequences to any significant extent. These results suggest that the majority of chromatin proteins bound to DNA are randomly distributed with respect to base sequence. The determining factor in the production of the DNA fragments apparently lies in the structure of the histones themselves and the way in which the histones attach themselves to DNA in the native chromatin structure.


III. Nucleosomes in Metaphase Chromosomes

The 10-nm unit fiber of interphase chromatin is likely to be generated by an array of apposing nucleosomal subunits. It is generally assumed that this unit fiber of interphase chromatin is maintained during the process of condensation that results in the formation of the metaphase chromosomes. Support for this concept could be provided if we were able to demonstrate that the basic nucleosomal subunit structure is retained in the mitotic chromosome. To this end, metaphase chromosomes were prepared from colchicine-arrested cultures of mouse L-cells and were subjected to digestion with staphylococcal nuclease (11). In Fig. 1, we see that the larger fragments generated early in the digestion of mitotic chromosomes are virtually indistinguishable from those liberated from interphase nuclei. Similarly, the pattern of lower-molecular-weight bands observed upon limit digestion of nuclei is observed with metaphase chromosomes as well. These observations indicate that, although significant morphological and biochemical changes occur within the chromatin fiber during mitosis, the basic subunit structure of the interphase fiber is retained by the mitotic chromosome.

IV. Analysis of the DNA of Monomeric Particles

The previous studies merely demonstrate that a nuclease cleavage site exists at regular intervals along the DNA backbone. However, evidence that such a repeating unit results from a regular arrangement of nucleoprotein subunits requires that we demonstrate the liberation of such particles, containing both protein and DNA 185 base-pairs in length. To this end, rat liver nuclei were partially digested with nuclease and the resulting nucleoprotein fragments sedimented in sucrose gradients (Fig. 3). This pattern results from only a 4% digestion of nuclear DNA, and reveals at least five absorbance peaks. The DNA from each of these components represents a multimeric form of the 185 base-pair fragment (4). Analysis of the proteins within each nucleoprotein peak reveals the presence of histones H2A, H2B, H3 and H4 in proportions identical to those observed in unfractionated chromatin. Histone H1 is present in equal proportions in the dimeric through pentameric fractions, but appears to be reduced in the monomeric subunit. The degree of loss of histone H1 appears to be related to the extent of digestion, suggesting that excision of the interparticle DNA results in a concomitant loss of histone H1. The availability of purified nucleosomes has enabled us to characterize the kinetic complexity of the DNA organized into nucleosomal subunits.


[Figure 3 plot: absorbance profiles against gradient volume in milliliters (0 to 30); the top of the gradient is marked.]

FIG. 3. Sucrose gradient centrifugation of chromatin subunits. The nucleoprotein particles generated by 2, 4 and 10% digestion of rat liver nuclei were subjected to centrifugation in a 5 to 20% sucrose gradient at 110,000 x g for 6 hours. The direction of sedimentation proceeds from right to left.

Perhaps the most direct experimental approach to this problem involves an analysis of the reassociation kinetics of isolated monomeric and total nuclear DNA (12). To this end, tritium-labeled total nuclear DNA ([3H]nDNA) was prepared from cultures of normal rat liver cells. This


[Figure 4 plot: horizontal axis, log C₀t (0 to 4).]

FIG. 4. Kinetics of annealing 3H-labeled rat-liver total nuclear DNA to monomeric and total nuclear DNA fragments. Rat-liver total nuclear DNA (10,000 cpm, 0.25 µg) was annealed with 25 or 750 µg of unlabeled total nuclear DNA ( 0 ) or monomeric nuclear DNA fragments (0) in a reaction volume of 30 µl. Aliquots were removed at various times, and duplex formation was monitored by S1-nuclease digestion.

DNA was then annealed in the presence of a large excess of either total nuclear or isolated nucleosomal DNA. The reassociation of the [3H]nDNA to the excess quantities of total nuclear DNA is seen in Fig. 4. As expected, two transitions are observed over the range of C₀t values studied. Thirty percent of the DNA consists of repeated sequences. The final transition reflects the reassociation of the unique sequences of the rat genome, which have a C₀t1/2 of 820. A similar pattern is obtained when the [3H]nDNA is annealed with DNA isolated from purified nucleosomes, with the reaction saturating at 85% duplex formation. This concordance indicates that virtually all sequences present in the rat genome are represented in the nucleosomal DNA and that the reiteration frequency of these sequences is identical in the two DNA populations. These findings are consistent with the kinetics of appearance of monomeric DNA, which suggest that 85-90% of the nuclear DNA is involved in the basic repeat structure.
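The two-transition curve described above can be sketched with ideal second-order reassociation kinetics, in which each kinetic component reaches half-reaction at its own C₀t1/2. This is only an illustrative model, not the fitting procedure used in the work cited: the repeated fraction (30%) and the unique-sequence C₀t1/2 (820) come from the text, whereas the C₀t1/2 of the repeated component is a hypothetical placeholder, and the function and variable names are invented for the example.

```python
def fraction_duplex(c0t, components, plateau=1.0):
    """Ideal second-order reassociation summed over kinetic components.

    components: iterable of (fraction_of_dna, c0t_half) pairs.
    plateau: maximum duplex fraction actually reached in the assay.
    """
    total = sum(frac * c0t / (c0t + half) for frac, half in components)
    return plateau * total


UNIQUE_C0T_HALF = 820.0     # from the text
REPEATED_FRACTION = 0.30    # from the text
REPEATED_C0T_HALF = 1.0     # hypothetical placeholder, not given in the text

COMPONENTS = [
    (REPEATED_FRACTION, REPEATED_C0T_HALF),
    (1.0 - REPEATED_FRACTION, UNIQUE_C0T_HALF),
]

for log_c0t in range(0, 5):
    c0t = 10.0 ** log_c0t
    print(f"log C0t = {log_c0t}: duplex fraction = {fraction_duplex(c0t, COMPONENTS):.2f}")
```

Passing `plateau=0.85` scales the curve to the 85% saturation reported for the nucleosomal DNA driver.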

V. Structure of the Globin Genes in Chromatin

The observation that the kinetic complexity of nucleosomal DNA is identical to that of total nuclear DNA does not exclude the possibility that a small but biologically significant subset of sequences is excluded from the monomeric DNA fragments. Interesting candidates for such a population of DNA sequences are those actively expressed within the genome. Perhaps the system most amenable to this sort of analysis is the synthesis of globin RNA by avian erythroid cells (12).


It was of obvious interest to determine whether proteins reside along the globin genes in populations of avian reticulocytes in which more than 75% of the cells are actively synthesizing globin RNA. Two features of chromatin structure are likely to result solely from the specific interaction of histones with DNA: the pattern of multimeric nucleosomal subunits and the generation of unique fragments of DNA upon limit digestion of chromatin. Both of these structural indicators are observed upon reconstitution of DNA and pure histone and are not dependent on the presence of nonhistone protein. The observation that the globin genes exist in a nucleosomal array would therefore provide convincing evidence for the presence of histones along transcriptionally active regions of the genome. One approach to this problem simply involves titrating the globin genes in monomeric nucleosomal DNA particles by molecular hybridization with highly radioactive globin cDNA. To this end, a fixed amount of globin cDNA is titrated with increasing quantities of either total nuclear or monomeric DNA fragments from avian erythroblasts under conditions that allow completion of the annealing reactions. From the initial slope of the titration curve, we determine what proportion of the cDNA is annealed to a known quantity of cold genomic DNA. The concentration of the globin genes in a population of DNA can be determined directly from these measurements. This approach assumes only that the cDNA and the corresponding genomic minus strand share the same rate constant of reaction to the natural plus strand. The titration of 0.2 ng of duck globin cDNA with total or nucleosomal DNA of reticulocytes is shown in Fig. 5. The curves for the two DNA fractions are identical and demonstrate that the reiteration frequency of the globin genes is the same in DNA obtained from the monomeric particle as it is in total DNA, even in a tissue actively expressing these

[Figure 5 plot: horizontal axis, µg DNA.]

FIG. 5. Titration of globin genes in monomeric and total nuclear DNA of duck reticulocytes. Globin cDNA (800 cpm, 0.03 ng) was annealed with increasing quantities (5-300 µg) of either monomeric ( x ) or total nuclear ( 0 ) erythroblast DNA in a volume of 25 µl. Reactions were incubated at 69°C for 96 hours and assayed by treatment with S1 nuclease. 0 = E. coli DNA.


sequences. From the titrations, we calculate the presence of two copies of each of the globin genes per haploid genome, a finding consistent with work from other laboratories. Further support for the existence of nucleosomes over the globin genes is provided by experiments in which the globin genes are quantitated in multimeric as well as monomeric nucleosomal forms. A sucrose gradient profile of a 7% nuclease digest of reticulocyte nuclei is shown in Fig. 6. DNA was purified from each of the gradient fractions and annealed to globin cDNA under conditions where the extent of annealing is linearly related to the globin gene content of the individual fractions. The concordance between the absorbance profile and annealing data is quite good, with peaks of annealing corresponding to the regions of trimeric, dimeric, and monomeric nucleoprotein subunits. Similar results have been obtained demonstrating the presence of the ovalbumin gene in monomeric subunits obtained from estrogen-stimulated oviducts. Furthermore, virtually all sequences expressed as mRNA in liver cells are present in nucleosomal DNA at frequencies identical to those in unfractionated total nuclear DNA. Therefore, the template-active

[Figure 6 plot: horizontal axis, fraction number (0 to 30).]

FIG. 6. Titration of globin genes in multimeric nucleosomal subunits. Duck reticulocyte nuclei were digested with staphylococcal nuclease, and the resultant nucleoprotein particles were fractionated by sucrose gradient centrifugation (-). DNA was prepared from equal aliquots of each gradient fraction and annealed to globin cDNA as described in the legend to Fig. 5 ( ). The DNA content of the 25 µl of annealing reaction ranged from 5 to 20 µg such that the extent of annealing was linearly related to the globin gene content of the gradient fractions.


regions of the genome appear to participate in the basic repeat structure characteristic of chromatin. Hence, transcription through protein-covered regions of the genome must occur, and the presence of histone on a specific gene does not appear to be sufficient to restrict polymerase action. A note of caution should be added here. It is possible that cleavage by nuclease results in distortion of the native structure over transcriptionally active genes. If these regions of the chromatin are physically restrained in an extended conformation, nuclease action may relieve this restraint, thereby allowing nucleosomes to reform over the liberated fragments. Such a model still requires the presence of histone over active genes, but implies that they are not organized in a compact subunit. The more stable nucleosomal conformation may therefore be generated only after nuclease cleavage.

It is apparent from these studies that the arrangement of the nucleosomes along the DNA is not related in a straightforward way to the specific transcription observed with chromatin. These studies, however, do not exclude the possibility that the proteins within the monomeric subunit of active genes are organized in such a way as to facilitate the transcription of specific genes, and that this structure is not discernible by simple analysis of nuclease digestion products. Therefore, other probes were used in conjunction with nuclease to examine the distribution of proteins along the globin genes (13). To this end, "protein-free" DNA was prepared by titration of chromatin with poly(D-lysine), followed by proteinase and nuclease treatment. The globin genes were then quantitated in this fraction of "open" DNA as well as in "covered" DNA prepared by limit digestion of reticulocyte chromatin.
In these annealing reactions, we found that, whereas all globin gene sequences are represented in covered DNA, a specific portion of the globin gene is absent from "open" DNA, corresponding to about 20% of the gene length. It appears, therefore, that specific regions of the globin genes of reticulocyte chromatin are covered in such a manner as to render them inaccessible to the polylysine. In contrast, no difference exists in the annealing properties of globin cDNA to open and covered DNA fractions from erythrocyte chromatin. These findings suggest that, although there is a random distribution of proteins along the globin genes of erythrocytes, there exists in erythroblasts a specific class of sequences that are always covered or chemically restricted by protein and along which the protein distribution is nonrandom.

VI. In Vitro Transcription as a Probe of the Globin Genes

The above observations (Section V) on the distribution of proteins along the globin genes suggest that subtle differences exist between


template-active and -inactive regions of chromatin, but that these differences are not recognizable at the gross level of nucleosomal structure. Structural alterations within the nucleosome itself may occur without disruption of this particle and are therefore not detectable by simple nuclease cleavage. An alternative approach, examining the structure of specific genes in chromatin, involves the accessibility of these genes to transcription by RNA polymerase. Transcription of reticulocyte chromatin in vitro results in the synthesis of globin RNA, at levels constituting 0.01% of the total RNA transcript. Similar reactions with erythrocyte chromatin result in the synthesis of one-seventh the amount of globin RNA obtained from reticulocyte templates. We have also examined RNA transcribed from duck liver chromatin and from protein-free DNA, and detect no globin RNA in the transcript, a result consistent with the concept of tissue-specific transcription (14-16). These results establish that the proteins of chromatin restrict transcription in a tissue-specific manner and are in accord with the proposed role of chromatin in control of gene expression. The nature of this restriction, however, remains elusive. It is unlikely that the bacterial polymerase is capable of recognizing a natural promoter sequence adjacent to the globin genes. Nevertheless, the globin genes appear to be recognized and transcribed at least 100 times more frequently than we would expect if the polymerase were copying a 1000-nucleotide sequence of the genome at random. Therefore, the nucleosomes about the globin gene appear to be organized in such a way as to facilitate the binding of polymerase throughout the length of the gene. Polymerase binding must occur in the presence of a nucleosomal array and may in fact require this sort of structural organization for recognition.
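The "at least 100 times" comparison above follows from dividing the observed globin fraction of the transcript by the fraction expected for random copying. The sketch below makes that arithmetic explicit. The 0.01% figure and the 1000-nucleotide gene length come from the text; the genome size of 10^9 nucleotides is an assumption introduced here for illustration and is not stated in the chapter.

```python
OBSERVED_GLOBIN_FRACTION = 0.01 / 100   # 0.01% of the total RNA transcript (from the text)
GENE_LENGTH_NT = 1000                   # length of the sequence considered (from the text)
ASSUMED_GENOME_NT = 1e9                 # assumption: genome size, not given in the text


def random_expectation(gene_nt, genome_nt):
    """Fraction of transcript expected if the genome were copied uniformly at random."""
    return gene_nt / genome_nt


def enrichment(observed_fraction, gene_nt, genome_nt):
    """Fold excess of the observed transcript fraction over the random expectation."""
    return observed_fraction / random_expectation(gene_nt, genome_nt)


fold = enrichment(OBSERVED_GLOBIN_FRACTION, GENE_LENGTH_NT, ASSUMED_GENOME_NT)
print(f"random expectation: {random_expectation(GENE_LENGTH_NT, ASSUMED_GENOME_NT):.1e}")
print(f"fold enrichment over random copying: {fold:.0f}")
```

With a larger assumed genome the fold enrichment rises in proportion, so under this reading the figure of 100 is a lower bound for any genome of at least 10^9 nucleotides.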

VII. Recognition of DNA Restriction Endonuclease Sites in Nucleosomes

Virtually all models of transcriptional control require the presence of regulatory molecules capable of recognizing specific sites within the chromosome. Our observations suggest that nucleosomes are distributed randomly with respect to DNA sequence, and are present on transcriptionally active segments of the genome. If such regulatory factors exist, it is possible that they exert their effect by recognition at specific sites within a preformed nucleosome. An experimentally feasible approach to this problem involves an analysis of the efficiency with which specific nucleotide sequences within nucleosomes are recognized and cleaved by DNA restriction endonucleases. If a given sequence is structurally organized in a nucleosomal array, then the extent to which this sequence is cleaved by specific endonucleases provides a measure of the accessibility of nucleosomal DNA to proteins that require sequence recognition for activity.

A eukaryotic system amenable to this sort of analysis is the cleavage of the bovine genome with the restriction endonuclease Eco RI. The major satellite of this organism, satellite I, comprises 7% of the genome and is tandemly repetitious with the smallest identifiable repeat unit (300 base pairs) present at about 10,000 copies per autosomal chromosome. Higher order repeat units exist, which must obviously be internally repetitious. If a given restriction sequence exists within this tandem array, then endonuclease digestion should liberate the entire repeat structure as unit-length fragments. The length of the fragment is determined by the distance between the specific repeat sequences. An Eco RI cleavage site,

(17), exists at 1400-nucleotide intervals within bovine satellite DNA; cleavage at these sites liberates 95% of this satellite as 1400 base-pair fragments (18, 19). This fragment represents about 7% of the total DNA and is readily isolated by agarose gel electrophoresis (Fig. 7A). The ease with which the generation of this sequence can be measured now permits analysis of the accessibility of this site when organized into a nucleosomal structure within the nucleus of bovine cells. The kinetics of generation of this 1400-nucleotide fragment from isolated calf-thymus DNA is shown in Fig. 7A. As progressively more Eco RI is added to the reaction, increasing proportions of this specific fragment are liberated until a limit is reached with about 7.3% of the DNA represented in the 1400-nucleotide band. Early in the digestion process, multimeric forms of the unit-length fragment are observed (see also Fig. 7B), which permit us to determine accurately the length of this fragment using known molecular-weight markers derived from restriction endonuclease treatment of Ad2 and Ad5 DNA. The value we obtain is 1393 ± 20 base-pairs. Other discrete bands are observed at 984, 1515 and 2120 base-pairs, which presumably represent cleavage products of three of the four remaining calf DNA satellites.

It was now of interest to examine the susceptibility of these sequences in their native chromosomal conformation to Eco RI cleavage. Increasing quantities of enzyme were added to purified calf thymus nuclei (Fig. 7B). We observe that the 1400-nucleotide fragment is generated, but even under limit digestion conditions, with enzyme to DNA ratios of 20

[Figure 7 gel photographs: Panel A, lanes A to G; Panel B, lanes A to E.]

FIG. 7. Restriction endonuclease Eco RI digestion of calf thymus DNA and nuclei. Panel A: Calf thymus DNA (5 µg) was incubated in 3% sucrose, 0.01 M Tris-Cl (pH 7.9), 0.05 M NaCl, 0.0005 M MgCl2 for 2 hours at 37°C in the presence of increasing amounts of Eco RI: A, no enzyme; B, 0.12 U; C, 0.23 U; D, 0.47 U; E, 1.2 U; F, 4.7 U; G, 11.7 U. Panel B: Purified calf thymus nuclei (5 µg) were incubated with increasing amounts of Eco RI under the conditions described above: A, no enzyme; B, 2.3 U; C, 11.7 U; D, 23.3 U; E, 46.7 U. DNA was purified from the reaction mix by phenol extraction and analyzed by 1.5% agarose gel electrophoresis. Gels were stained in ethidium bromide and photographed.

units per microgram of DNA, the proportion of this fragment is far less than that observed upon digestion of DNA. The digestion shown in slot 5 represents a true limit, since the addition of a second dose of enzyme at the end of a 2-hour reaction results in no change in the gel profile. At the limit digestion, 2.5% of the DNA is present within the unit-length satellite fragment, with prominent multimeric fragments present at molecular weights consistent with dimeric to pentameric forms of the monomeric 1400-nucleotide fragment. As expected, these multimers are not observed in the limit profile of protein-free DNA. These data indicate that, although cleavage can occur in the native nucleoprotein complex consisting of satellite I sequences, this cleavage is restricted, presumably owing to the distribution of protein along the satellite sequences in chromatin. Any attempt to relate these observations to the conformation of these sequences in chromatin requires determining the organization of proteins in this region of the genome.

[Figure 8 plot: horizontal axis, units RI/µg DNA.]

FIG. 8. Kinetics of endonuclease cleavage of satellite I DNA from calf thymus DNA and nuclei. DNA ( 0 ) and nuclei (0) of calf thymus were incubated with increasing amounts of Eco RI, and the resultant DNA fragments were analyzed by 1.5% agarose-gel electrophoresis. Gels stained with ethidium bromide were photographed and the negatives were scanned. The amount of DNA generated as the 1400 base-pair, unit-length satellite fragment was then calculated.

We then asked whether nucleosomes exist over satellite sequences in calf nuclei. The experimental approach was analogous to that used to demonstrate the presence of nucleosomes over the globin genes. Calf-thymus nuclei were digested with staphylococcal nuclease, and the resulting nucleoprotein was subjected to sedimentation velocity fractionation on sucrose gradients. DNA was purified from monomeric, dimeric, and trimeric nucleosomes and was assayed for satellite-I content by hybridization with highly radioactive pure satellite-I DNA (Fig. 8). These results clearly indicate that the concentration of satellite in nucleosomes is identical to that in total DNA. Satellite sequences are therefore organized in a nucleosomal array in a manner analogous to that observed for bulk genomic DNA.

The DNA fragments obtained on restriction endonuclease cleavage of whole nuclei can now be interpreted with respect to nucleosomal structure. A repeating subunit in calf-thymus chromatin consists of 185 base-pairs of DNA with a 140 base-pair core and 45 base-pairs of interparticle DNA. If we assume that restriction cleavage can occur only in this interparticle DNA, the probability that a given Eco RI site will exist within the segment is 0.24. However, the generation of unit-length fragments requires that two adjacent sites both reside in this interparticle DNA, a phenomenon that should occur with a probability of 0.24 x 0.24, or 0.057.


A model that postulates a random distribution of nucleosomes, in which the core DNA is restricted from sequence recognition, predicts that the amount of unit-length fragments of satellite generated upon digestion of nuclei should be 5.7% that obtained with DNA. However, about one-third as much satellite is liberated as unit-length DNA upon digestion of nuclei as opposed to protein-free DNA. This suggests that sequence recognition and cleavage must occur within the nucleosomal core itself, and that about 60% of the nucleosomes must be accessible to endonuclease action.
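The probability argument of this section can be laid out as a short calculation. The sketch below uses only figures quoted in the text (185 and 140 base-pairs; 2.5% and 7.3% yields). The last step, which takes the square root of the observed yield ratio to obtain a per-site accessibility, is one possible reading of how a figure near 60% could arise; it is an assumption made here, not a statement of the author's method.

```python
REPEAT_BP = 185                      # nucleosomal repeat (from the text)
CORE_BP = 140                        # protected core (from the text)
LINKER_BP = REPEAT_BP - CORE_BP      # interparticle DNA, 45 base-pairs

# Probability that one randomly placed Eco RI site falls in interparticle DNA.
p_linker = LINKER_BP / REPEAT_BP

# A unit-length fragment needs both flanking sites to be cleavable.
p_both_linker = p_linker ** 2

# Observed yield of unit-length satellite: nuclei relative to protein-free DNA.
observed_ratio = 2.5 / 7.3

# Assumption: if each site is cut independently with probability p,
# then p * p equals the observed ratio.
p_site_needed = observed_ratio ** 0.5

print(f"P(site in interparticle DNA)   = {p_linker:.2f}")
print(f"P(both sites in interparticle) = {p_both_linker:.3f}")
print(f"observed nuclei/DNA yield      = {observed_ratio:.2f}")
print(f"per-site accessibility needed  = {p_site_needed:.2f}")
```

The observed ratio is several times the linker-only prediction, which is the basis for concluding that cleavage must also occur within the core.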

VIII. Conclusions

The elucidation of a regular array of subunits in chromatin immediately poses the question as to the possible role of this structure in those biological processes in which chromatin participates. Attempts to relate structural changes to changes in the template activity of the nucleoprotein complex require that we consider all discernible levels of structure in the hierarchical organization of the chromosome. The smallest units examined in our studies are the nucleoprotein fragments generated upon cleavage of the nucleosome itself. These fragments of precisely determined lengths perhaps represent the points of intimate association of histones with DNA. The specific organization of these fragments generates the larger basic subunit or nucleosome. Although the precise nature of the interaction between DNA and protein in the nucleosome remains unclear, current physicochemical studies support the view that the DNA resides on the external surface of an apolar core of specific histone aggregates. The core of the subunits consists of about 140 base-pairs of DNA with a 45-base fiber of DNA connecting adjacent particles. A higher-order coil is likely to exist, which reflects an aspect of nucleosome-nucleosome interaction not yet elucidated.

Analysis of the organization of this subunit structure with respect to DNA sequence reveals that no long special sequence of nucleotides is necessary for nucleosome formation. Virtually all genomic sequences are involved in the repeat structure, with the particles randomly distributed along the genomic DNA. Furthermore, analysis of the distribution of nucleosomes over transcribing sequences reveals that the template-active regions of the genome are similarly organized in this basic structure. The mere presence of nucleosomes along a given sequence is not sufficient to restrict transcription.
It is possible that more subtle alterations characterize the template-active regions of chromatin, and that these aspects of structure are not apparent in a simple analysis of the nucleosomal array. In support of this,


studies that utilize polylysine in concert with nucleases reveal that the distribution of protein along the globin genes in erythroblasts differs from that observed in the transcriptionally inert erythrocyte. In addition, bacterial polymerase specifically transcribes globin RNA from erythroblast chromatin templates: a phenomenon likely to result from structural differences rather than from recognition of eukaryotic promoters. More recently, it has been shown (20) that DNase I is capable of specific digestion of the template-active regions of the genome. These observations are in accord with the notion that structural differences exist between template-active and -inactive regions of chromatin and suggest that these alterations may reside within the internal organization of the nucleosome itself.

It is possible that regulatory factors exert their effect by recognizing specific sites within a preformed subunit. In accord with this, we find that specific nucleotide sequences within the core of the nucleosome can be identified and cleaved by restriction endonucleases that require sequence recognition for activity. If some of the proteins of chromatin act as transcriptional control factors, we would predict that proteins should exist that recognize specific regulatory sites and interact with the nucleosome itself, thus generating the differences observed with a variety of structural probes. It is further possible that those aspects of DNA structure dictated by nucleosomal organization may be required for the fidelity of regulatory interactions.

ACKNOWLEDGMENTS

This work was supported by the National Institutes of Health, National Cancer Institute, Research Grants CA-2332 and CA-16346.

REFERENCES 1. D. R. Hewish and L. A. Brirgoyne, BBRC 52, 501 ( 1973). 2. hi. Noll, Nature 251, 249 ( 1974). 3. C. G . Sahasralmddhe, and K. E. Van Holde, IBC 249, 152 (1974). 4. R. Axel, Bchem 14, 2921 ( 1975). 5. B. Sollner-Webb, and G. Felsenfeld, Bchem 14, 2916 (1975). 6. R. Kornberg and J. 0. Thomas, Science 184, 865 (1974). 7. A. L. Olins, R. D. Carlson and D. E. Olins, J . Cell B i d . 64, 528 ( 1975). 8. 1’. Oudet, hl. Gross-Bellard, and P. Chambon, Cell 4,281 ( 1975). 8. R. Axel, W. Melchior, B. Sollner-Webb and G . Felsenfeld, €“AS 71, 4101 (1974). 10. H. Weintraub and F. Van Lente, P N A S 71, 4249 (1974). 11. M. Wigler and R. Axel, NARes 3, 1463 (1976). 12. E. Lacy and R. Axel, P N A S 72,3978 ( 1975). 13. R. Axel, H. Cedar and G. Felsenfeld, Bchem 14,2489 (1975).

14. R. Axel, H. Cedar and G. Felsenfeld, PNAS 70, 3440 (1973).
15. R. S. Gilmour and J. Paul, PNAS 70, 2029 (1973).
16. A. Steggles, G. Wilson, J. Kantor, D. Picciano, A. Falvey and W. F. Anderson, PNAS 71, 1219 (1974).
17. J. Hedgpeth, H. Goodman and H. Boyer, PNAS 69, 3448 (1972).
18. M. Botchan, Nature 251, 288 (1974).
19. S. Mowbray and A. Landy, PNAS 71, 1920 (1975).
20. H. Weintraub and M. Groudine, Science 193, 848 (1976).

The Structure of DNA in Native Chromatin as Determined by Ethidium Bromide Binding

J. PAOLETTI,¹ B. B. MAGEE AND P. T. MAGEE

Department of Human Genetics
Yale University School of Medicine
New Haven, Connecticut

I. Introduction

The currently favored model of chromatin structure was first suggested by the electron micrographs of Olins and Olins (1) and is supported by various biophysical and biochemical studies (2-4). This model proposes that a “core,” containing two molecules each of histones 2a, 2b, 3 and 4, is spaced at 200-nucleotide distances along the DNA backbone. The fifth histone, H1, associates more loosely with the DNA at some point as yet undetermined, possibly between the cores or ν-bodies (nucleosomes). Axel (5) has produced evidence that both active and inactive genes are present in the ν-body fraction isolated from chromatin. The state of the DNA and the nature of its interaction with histones and nonhistone proteins have important implications for the mechanism(s) by which transcription of this highly organized structure is controlled. We chose to use the intercalating dye ethidium bromide to investigate this problem. Specifically, we wished to ask: what is the tertiary structure of the DNA on the ν-bodies?

¹ Present address: Laboratoire de Pharmacologie Moléculaire No. 147 du CNRS, Institut Gustave Roussy, 16bis Av. Paul Vaillant Couturier, 94800 Villejuif, France.

II. Methods

Chromatin was prepared from WIL2 lymphoblastoid cell nuclei (6) according to Noll (7), with digestion times and enzyme concentrations varied as described in the figure legends. Ethidium bromide binding was followed by fluorescence enhancement (8). Fluorescence polarization was determined on a spectrofluorimeter as described by Yguerabide et al. (9). Binding data were plotted as a Scatchard plot, where r = ethidium bromide bound per nucleotide of DNA, and c = unbound ethidium bromide (10). The fluorescence polarization data were plotted as cos²ω (the square of the average cosine of the angle ω swept by the emission oscillator between the time of absorption and emission of light) against r.
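The Scatchard treatment named in the Methods can be illustrated with a short sketch. It assumes the simplest case, a single class of independent sites, for which r/c = K(n - r), so that a plot of r/c against r is a straight line of slope -K that meets the r axis at n. The function names and the noise-free synthetic data are illustrative assumptions and are not taken from the experiments described here; the curved plots reported below for chromatin would not be described by a single line.

```python
# Sketch of a Scatchard analysis for ONE class of independent binding sites.
# Assumption (not from the text): r/c = K * (n - r), so r/c plotted against r
# is a straight line with slope -K and an intercept on the r axis equal to n.

def scatchard_fit(r_values, c_values):
    """Fit a straight line to the points (r, r/c) by least squares.

    Returns (K, n): the binding constant and the value of r at r/c = 0.
    """
    ratios = [r / c for r, c in zip(r_values, c_values)]
    count = len(r_values)
    mean_r = sum(r_values) / count
    mean_ratio = sum(ratios) / count
    sxx = sum((r - mean_r) ** 2 for r in r_values)
    sxy = sum((r - mean_r) * (y - mean_ratio) for r, y in zip(r_values, ratios))
    slope = sxy / sxx
    intercept = mean_ratio - slope * mean_r
    binding_constant = -slope                      # K, in M^-1 when c is in M
    sites_per_nucleotide = intercept / binding_constant  # n
    return binding_constant, sites_per_nucleotide


def free_dye_concentration(r, binding_constant, sites_per_nucleotide):
    """Invert r/c = K * (n - r) to generate synthetic, noise-free data."""
    return r / (binding_constant * (sites_per_nucleotide - r))


# Illustrative inputs only: K = 2e6 M^-1 and n = 0.20 are hypothetical values
# chosen to resemble the order of magnitude discussed for free DNA.
K_TRUE, N_TRUE = 2.0e6, 0.20
r_points = [0.02, 0.05, 0.08, 0.11, 0.14, 0.17]
c_points = [free_dye_concentration(r, K_TRUE, N_TRUE) for r in r_points]

K_fit, n_fit = scatchard_fit(r_points, c_points)
print(f"K = {K_fit:.3g} M^-1, n = {n_fit:.3f}")
```

With real data the high- and low-affinity regions would have to be fitted separately, and the extrapolation to r/c = 0 is sensitive to curvature, as noted in the Results.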

III. Results

Figure 1 shows the Scatchard plot of ethidium bromide binding to free DNA, to minimally digested chromatin (30 seconds), and to extensively digested chromatin (10 minutes). A number of very striking features are evident in the chromatin curves. First, the extrapolation to r/c = 0, a measure of the total amount of dye bound per nucleotide, is much less for both kinds of chromatin than for free DNA. The difference (0.11-0.12 vs. 0.20) could be characteristic of that found for circular DNA (11) or could correspond to a complete masking by proteins of one-half of the DNA, and has been found previously for chromatin extracted by mechanical shearing (12). Second, the binding to minimally digested chromatin falls into two classes, one with a high binding constant (k = 2 × 10⁶ M⁻¹), not too different from that for free DNA under these conditions (Fig. 1), and one with a low binding constant ( k = 2 X ' loi M - I ). The latter constant is very similar to that found for DNA in high salt, when the phosphate groups are largely neutralized. The region of greater affinity, while observable in extensively digested chromatin, has a much lower constant than in minimally digested chromatin. Two such classes were also found in sheared chromatin (12). An extrapolation of the high-affinity region to r/c = 0, while difficult owing to the curvature of the line, gives a value of about 0.03 to 0.05, corresponding to 25-40% of the total DNA, or 50 to 80 base-pairs. This number is in fairly good agreement with the fraction of DNA in a ν-body that is highly susceptible to nuclease digestion, 50-60 base-pairs. A third important characteristic of the Scatchard plot in Fig. 1 is the transition that separates the high- and low-affinity binding domains in minimally digested chromatin. This transition is typical of cooperative binding. We interpret these results to mean that ν-bodies do contain DNA in two different states, differing in their affinity for ethidium bromide (EtdBr). The fractions of DNA in these states correspond reasonably well with the fractions in the nuclease-sensitive and nuclease-resistant parts of the ν-body. Further support for this interpretation comes from the similarity between the affinity constant for EtdBr of the fraction corresponding to the limit digest of the DNA and the affinity constant of DNA in high salt, since DNA tightly complexed with histones is, so to speak, in high salt.

FIG. 1. Scatchard plot of ethidium bromide binding to chromatin and DNA. Ethidium bromide binding to chromatin and DNA was determined as described (8). The DNA standard was from calf thymus (Worthington) (-). Chromatin was prepared by digestion for 30 seconds with 1 µg of staphylococcal nuclease (Worthington) per 1.5 x 10' nuclei at 37°C (0-0) or by 10 minutes' digestion under identical conditions (A-A). The 30-second (minimally) digested chromatin had >1% monomer as determined by gel electrophoresis of the extracted DNA; the 10-minute (extensively) digested chromatin had 11% monomer. The abscissa is r. [See Note Added in Proof, p. 377.]
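The conversion from the extrapolated binding values to a fraction of the DNA, and then to base-pairs, can be written out as a small worked example. This is only a sketch of the arithmetic: it assumes that the total binding to chromatin is taken as r = 0.12 (the upper end of the 0.11-0.12 range quoted above) and that the repeat length is the 200 nucleotides mentioned in the Introduction. The function names are hypothetical.

```python
# Worked arithmetic for the high-affinity fraction of nu-body DNA.
# Assumptions (labelled, not stated as such in the text):
#   - total dye bound per nucleotide in chromatin, r_total = 0.12
#   - repeat length of 200 base-pairs per nu-body

def dna_fraction(r_class, r_total):
    """Fraction of the dye-accessible DNA belonging to one binding class."""
    return r_class / r_total


def base_pairs_per_repeat(fraction, repeat_length_bp=200):
    """Convert a fraction of the repeat into a length in base-pairs."""
    return fraction * repeat_length_bp


R_TOTAL = 0.12
fractions = [dna_fraction(r, R_TOTAL) for r in (0.03, 0.05)]
lengths = [base_pairs_per_repeat(f) for f in fractions]

for f, bp in zip(fractions, lengths):
    print(f"fraction = {f:.2f}, length = {bp:.0f} base-pairs")
```

Under these assumptions the two limits come out close to the 25-40% and 50-80 base-pairs quoted in the text; using 0.11 instead of 0.12 for the total would raise both figures slightly.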
The cooperativity of binding to the low-affinity region may indicate that the unwinding associated with intercalation of the dye causes a partial relaxation of the structure so that further binding is facilitated. The fact that the total amount bound is characteristic of a closed circular configuration could argue that even in the partially relaxed structure the DNA of ν-bodies still acts as though highly constrained.

In order to substantiate further the interpretation that the high-affinity fraction of DNA is in a separate region from the low-affinity fraction, we looked at the polarization of fluorescence of EtdBr bound to chromatin. When EtdBr is excited with fully polarized light, the emitted light is partially depolarized. This depolarization occurs when the molecule can rotate during the lifetime (20 nsec) of the excited state or when the energy of excitation can be transferred to a molecule at an angle with the first (for example, a second EtdBr intercalated nearby). Thus the polarization of fluorescence extrapolated to low values of r gives a measure of the rigidity of the DNA into which the dye is intercalated, and the rate of depolarization as a function of r is proportional to the frequency with which molecules of dye intercalate in proximity. Figure 2 shows that in free DNA, we find (extrapolating to r = 0) a value of cos²ω = 0.835, while in chromatin the value is 1.0. If the polarization of EtdBr in DNA is extrapolated to r = 0 at infinite viscosity, the value is very close to 1.0. (Infinite viscosity would prevent the rotation of the DNA around the long axis, accompanied by the transient opening or “breathing” of the base-pairs.) Thus, the DNA that binds the dye at very low r (therefore, the high-affinity fraction) seems to be highly constrained and unable to undergo rotational motion. Figure 2 also shows that the depolarization of fluorescence in ν-body DNA as r increases is very rapid. We interpret this to be due to rapid saturation of the high-affinity DNA with EtdBr, accompanied by energy transfer to nearby molecules. The possibility that the depolarization is due to energy transfer to externally bound dye, rather than to intercalated molecules, seems to be eliminated by the fact that the lifetime of fluorescence is constant with increasing r (19 ns) and is very close to that of EtdBr in free DNA (20.7 ns). After r reaches about 0.03-0.035, the depolarization of fluorescence in ν-bodies declines at a rate roughly parallel to that in free DNA, indicating that the dye molecules are able to intercalate over larger fractions of the DNA in the chromatin.

FIG. 2. Polarization of fluorescence of ethidium bromide bound to chromatin. The chromatin (O---O) was minimally digested as described for Fig. 1. The DNA (0-0) was native calf thymus (Worthington). The abscissa is r.

The following ideas about the structure of ν-bodies can be drawn from these studies: (1) DNA in ν-bodies is heterogeneous, and both fractions seem to be constrained; (2) the fraction of DNA with high affinity for EtdBr corresponds in amount to the one that is highly sensitive to nuclease and that is decreased when digestion time is increased; (3) in order to bind ethidium bromide, the DNA fraction with low affinity must undergo some structural alterations. These ideas, leading as they do to a picture of chromatin with a highly organized and tightly constrained structure, indicate that transcription may be a very complex process, requiring a number of relaxing proteins or other structural entities to permit binding and progression of RNA polymerase.

ACKNOWLEDGMENTS

This research was supported by USPHS grants GM 19481 (E. A. Adelberg, principal investigator) and GM 21012 (P. T. Magee, principal investigator).

REFERENCES

1. A. L. Olins and D. E. Olins, Science 183, 330 (1974).
2. K. E. van Holde, C. G. Sahasrabuddhe, B. R. Shaw, E. F. J. van Bruggen and H. Arnberg, BBRC 60, 1365 (1974).
3. D. R. Hewish and L. A. Burgoyne, BBRC 52, 504 (1973).
4. M. Noll, J. O. Thomas and R. D. Kornberg, Science 187, 1203 (1975).
5. R. Axel, This volume, p. 355.
6. B. B. Magee, J. Paoletti and P. T. Magee, PNAS 72, 4830 (1975).
7. M. Noll, Nature 251, 249 (1974).
8. J. B. Le Pecq and C. Paoletti, JMB 27, 87 (1967).
9. J. Yguerabide, H. F. Epstein and L. Stryer, JMB 51, 573 (1970).
10. G. Scatchard, Ann. N.Y. Acad. Sci. 51, 660 (1949).
11. J. B. Le Pecq, Methods Biochem. Anal. 20, 41 (1972).
12. L. M. Angerer, S. Georghiou and E. N. Moudrianakis, Bchem 13, 1073 (1974).

NOTE ADDED IN PROOF

In later experiments we have found that the time of digestion, up to 10 minutes or 20% monomer, does not affect the Scatchard plot of ethidium bromide binding to nuclease-digested chromatin. We attribute the difference shown in Fig. 1 to a degradation of the 10-minute digested chromatin, possibly due to protease action. This does not, of course, affect the conclusions drawn about the structure of nuclease-digested chromatin.

Cellular Skeletons and RNA Messages

RONALD HERMAN, GARY ZIEVE, JEFFREY WILLIAMS, ROBERT LENK AND SHELDON PENMAN

Department of Biology
Massachusetts Institute of Technology
Cambridge, Massachusetts

Progress in the study of eukaryotic, and especially metazoan, cell biology is quite apparent from the contributions at this conference. Nevertheless, the challenge remains to elucidate those properties of gene expression, presumably through RNA metabolism, that serve to make a metazoan animal the complex arrangement of biological materials that it is. In particular, morphogenesis involves a bewildering variety of cell growth, movement, changes in architecture and the development of special biochemical pathways. Although the impressive work on the RNA metabolism of higher organisms has established profound differences from the metabolism of prokaryotes, so far, with few exceptions, little relates our studies to the obvious problems of metazoan biology. In this report, we describe some of our first tentative efforts to relate RNA metabolism to the unique properties of a metazoan cell. We present suggestive evidence that the architecture of the metazoan cell, in this case mammalian, is intimately involved with RNA metabolism.

I. Cytoplasmic Skeleton

We describe two preparations (“skeletons”) from HeLa cells, one cytoplasmic and one nuclear. [The term “skeleton” is currently used to suggest a number of different cellular structures; perhaps in the near future a more definitive terminology will be adopted.] The choice of HeLa cells for a study of cell architecture may seem odd since these cells have little in the way of morphologically distinct features. Nevertheless, for other reasons, this work was started using this rather nondescript workhorse of a cultured cell, and a quite remarkable degree of internal structure was found.

Figure 1 is a low-power electron micrograph of a HeLa cell gently lysed with Triton X-100 in an isotonic buffer. The general structure of the cell and nucleus is clearly visible, and even such specialized morphological entities as surface microspikes seem to be partially preserved. What is remarkable about this preparation is how few of the cellular constituents remain. The procedure extracts most of the phospholipids and all of the soluble components, such as proteins and transfer RNA, and mitochondrial constituents have apparently been leached out. Furthermore, the cold extraction procedure breaks down microtubules, and tubulin is quantitatively removed. Actin filaments are not visible; actin is found largely in extracted proteins either because of its unpolymerized state in these cells or owing to an instability of nonmuscle F-actin under these extraction conditions. What remains consists of a network of filaments that at higher magnification are seen to be the intermediate filaments previously described (1-3). These apparently interconnect and possibly mesh with the as yet undefined proteins, which appear condensed in rather diffuse blotches. Thus, a major component of the structure that maintains morphology in the absence of microtubules or microfilaments appears to be the intermediate filaments. The analysis of “skeleton” proteins shows that the 53,000-dalton subunit of these filaments described by Shelanski and co-workers (4) is, in fact, a major component and is quantitatively retained in the preparation. Most startling is the retention of most of the active polyribosomes of the cell. These are quite apparent in the electron micrograph, and biochemical measurements show that at least 75% of the active cellular polyribosomes remain attached by some linkage to the cytoskeleton. In contrast to the active polyribosomes, the inactive monomers are largely extracted by the lysis procedure.
The polyribosomes always appear to be associated with the blotches of condensed protein apparent in the cellular network. Where the intermediate fibers are particularly dense, the polyribosomes appear excluded. The most difficult thing to establish at this point is whether the association of polyribosomes with the cytoskeleton represents a true in vivo state or is some artifact of the extraction procedure. The major evidence that this association may represent the true distribution of polyribosomes in the intact cell is the observation that ribosomal monomers do not stick to the “skeleton” to any significant degree. Of course, the active ribosomes may very well have components that lead to an artifactual association; at present, this possibility cannot be ruled out. Experiments are in progress to determine whether the products of the extracted polyribosomes are in any way different from those that remain attached to the skeleton.

FIG. 1. Cytoskeleton of HeLa cells. HeLa cells in suspension culture were harvested, washed and resuspended in isotonic, low ionic strength buffer (0.25 M sucrose, 0.01 M NaCl, 0.003 M MgCl2, 0.01 M Tris, pH 7.4) containing 1% Triton X-100. The cells were vortexed briefly and centrifuged into a pellet. Fixation was with 2.5% glutaraldehyde followed by 1% osmium tetroxide. The sample was dehydrated in alcohol, embedded in Epon-Araldite and sectioned. Micrographs were taken with a JEM 100B at 80 keV. The microscopy was carried out by Elaine Lenk.

Taken at face value, and assuming that the preparation is not plagued by artifacts, the electron-microscope and biochemical studies suggest strongly that the protein synthetic machinery of the cell does not float about freely but, even in the case of “free” (as opposed to membrane-bound) polyribosomes, is localized within the cytoplasm. This would serve to explain such puzzling morphological observations as the apparent exclusion of ribosomes from certain regions of the cell, such as the vicinity of the centriole. Also, the localization of the protein synthetic machinery on relatively spatially stable structures makes some teleological sense. In such cases as cell division or extensive cell movement, a randomly diffusing protein synthetic system would be out of cellular control and the partition of polyribosomes between daughter cells or parts of an extended cell would be left to chance. Also, there may be situations similar to those involving products of membrane-bound polyribosomes, where the spatial location of polyribosome products is important. All this is speculative and requires considerable further effort to demonstrate the reality of the suggested topological control of the protein-synthesizing components of the mammalian cell.

II. The Nuclear Skeleton and hnRNA

We return to the cytoskeleton below, but consider here another “skeleton”, this time the one associated with the nucleus and presumably related to the structures previously described by Berezney and Coffey (5). Figure 2 shows an early preparation in which about SM of the DNA has been removed by micrococcal nuclease digestion. Much of the remaining DNA appears localized in the perinuclear heterochromatin, which appears to be relatively less accessible to nuclease. Later preparatory techniques, for which electron micrographs are not presently available, remove the remaining plasma membrane components surrounding the nuclear shell and more than 95% of the total nuclear DNA. Even in this early electron micrograph (Fig. 2), it is possible to see that the nucleus retains its shape (somewhat distorted in this preparation by high centrifugation forces) and that suggestions of internal structure are visible through the empty space left by the removed chromatin. A prominent nucleolus is visible; it is possible that it is simply trapped by the nuclear shell. We postulate a relation between nuclear metabolism, chromatin organization, and this nuclear skeleton. This interrelation is apparently accomplished by a class of large hnRNA molecules that are relatively long-lived, many terminating with poly(A), and that appear to be attached to both the skeleton and the chromatin.

FIG. 2. Nuclear skeleton of HeLa cells. HeLa cells were broken in hypotonic buffer (0.01 M NaCl, 0.003 M MgCl2, 0.01 M Tris, pH 7.4) to which 1% Triton X-100 was added. Nuclei were separated by centrifugation and resuspended in digestion buffer (5% sucrose, 10.' CaCl2, 0.1 M Tris, pH 7.4). Micrococcal nuclease was added to 4 µg/ml, and the mixture was incubated at 25° for 9 minutes. EDTA was added to 0.001 M and the remnant nuclei were centrifuged into a pellet. Fixation and electron microscopy were as in Fig. 1. Electron microscopy was by Elaine Lenk.

The existence of this “quasi-stable” hnRNA population is strongly suggested by two observations. First, the amount of polyadenylylated molecules in the nucleus at steady state is far greater than would be expected on the basis of the purely short-lived hnRNA component, as shown below. Second, “chase” experiments conducted in the presence of high concentrations of glucosamine clearly indicate the presence of a multicomponent hnRNA population, a portion of which appears to decay with approximately a 20-minute half-life while the remainder has a half-life of about 100 minutes (see pp. 384-385). However, the glucosamine technique has a number of unanticipated pitfalls and is not completely understood at present. Therefore, we infer, at present, the existence of the long-lived hnRNA component principally from the large steady-state content of polyadenylylated molecules.
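The behaviour expected of such a two-component population during a chase can be sketched as the sum of two first-order decays. The half-lives of 20 and 100 minutes are those quoted above; the proportion of labeled RNA in each component is not given in the text, so the value used below (one-half) is a purely hypothetical parameter, as are the function and argument names.

```python
# Sketch: labeled hnRNA remaining during a chase, modelled as two
# independent first-order decays (half-lives of 20 and 100 minutes).
# The split between the two components (short_fraction) is NOT given in
# the text; 0.5 is a hypothetical value used only for illustration.

def fraction_remaining(t_min, short_fraction,
                       half_life_short=20.0, half_life_long=100.0):
    """Fraction of the initial label still present after t_min minutes."""
    long_fraction = 1.0 - short_fraction
    short_part = short_fraction * 0.5 ** (t_min / half_life_short)
    long_part = long_fraction * 0.5 ** (t_min / half_life_long)
    return short_part + long_part


for t in (0, 20, 60, 100, 200):
    print(t, round(fraction_remaining(t, short_fraction=0.5), 3))
```

On a semilogarithmic plot such a curve shows an initial steep limb followed by a shallower one, which is the kind of multicomponent behaviour the chase experiments are said to indicate.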

A. mRNA Sequences in Steady-State hnRNA

A major problem in the study of hnRNA has been the isolation of nuclear RNA free of cytoplasmic contamination. Total HeLa hnRNA prepared from nuclei washed using the double detergent procedure (6) remains contaminated with a small but significant amount of cytoplasmic species. This is concluded from the presence in the nuclear fraction of up to 3% of the total cellular 18 S ribosomal RNA and thus, presumably, 3% of cellular polyribosomes. To reduce the cytoplasmic contamination of the hnRNA, an additional step was added to the cellular fractionation procedure. The nuclear structure can be disrupted by exposing the detergent-washed nuclei to high ionic strength (0.4 M ammonium sulfate) (7). However, the bulk of the hnRNA remains associated with the chromatin and thus can be separated from contaminating polyribosomes, which dissociate under these conditions. A small subfraction of hnRNA is released by the ammonium sulfate, but this accounts for very little of the steady-state material (7). The hnRNA is then extracted with phenol/chloroform and extensively treated with DNase (8). Nuclear RNA prepared in this way retains 1% (or less) of the total cellular 18 S rRNA, most of which is probably nascent 18 S RNA in the nucleolus (6). The presence of the ribonuclease inhibitors poly(vinyl sulfate), spermine and N-ethylmaleimide during the purification results in the isolation of very large hnRNA. More than half the molecules carrying poly(A) sediment faster than 45 S when isolated in this way (Fig. 3a).

Having achieved an extensive purification of hnRNA, we measured the relative cytoplasmic and nuclear poly(A) content of HeLa cells by hybridizing [3H]poly(U) to each fraction (9). In several different steady-state cytoplasmic and nuclear preparations, 20% of the cellular poly(A) (by weight) is in nuclear RNA. [Less than one-fifth of the nuclear [3H]poly(U) binding is in the oligo(A) fraction.] This ratio of cytoplasmic to nuclear poly(A) in HeLa cells is comparable to that obtained recently by Johnson et al. (8) from growing mouse fibroblast 3T6 cells (2:1), using both steady-state labeling with 32PO4 and [3H]poly(U) hybridization.

A cDNA copy of the 3' terminus of the purified steady-state HeLa hnRNA(An) was synthesized using avian myeloblastosis virus reverse transcriptase and oligo(dT) as primer. However, hnRNA contains short stretches of 30-50 adenylate residues [oligo(A)], which can be distinguished from the 3'-poly(A) (approximately 200 AMP residues) by their internal positions in the molecules and by their transcription from the cellular DNA (10, 11). After limited alkaline hydrolysis of the hnRNA, the oligo(A)-containing fragments were removed by differential affinity chromatography using poly(U)-Sepharose (9); the remaining poly(A)-adjacent fragments are shown in Fig. 3b.

FIG. 3. Sedimentation distribution of purified nuclear RNA. Steady-state nuclear RNA was isolated from HeLa cells. The RNA was analyzed by sedimentation in a 15-30% sucrose gradient in dodecyl sulfate (0.1%) buffer. The gradient was assayed by hybridizing [3H]poly(U) to a portion of each fraction. The locations of the 28 S, 18 S and 4 S RNA markers were taken from a parallel gradient. (a) Native hnRNA. Sedimentation was at 18 K rpm in the Spinco SW 41 rotor for 15 hours at 25°C. (b) Alkaline-cleaved, poly(A)-containing hnRNA fragments. Native hnRNA was alkaline-cleaved for 15 minutes at 0°C. The poly(A)-containing fragments were isolated by poly(U)-Sepharose chromatography as described (23). Sedimentation was at 25 K rpm in the SW 41 rotor for 16 hours at 25°C. The abscissa is fraction number. For details, see Herman et al. (23).

FIG. 4. Hybridization of cDNA transcribed from hnRNA(An) fragments compared to cDNA from mRNA. Nuclear RNA(An) fragments and mRNA(An) were purified, and cDNA was prepared from each as described (23). RNA excess hybridizations were performed using a 1000-2000-fold excess of driver RNA. mRNA concentrations (R0) were calculated from the poly(A) content of the preparation assuming that the poly(A) is 4% of the chain length. Nuclear RNA concentrations were calculated as explained in the text. (A, A) nuclear cDNA driven by mRNA(An); (X) nuclear cDNA driven by hnRNA(An) fragments; (0, 0) message cDNA driven by mRNA(An). The abscissa is log Rot.

Figure 4 shows the results of an experiment in which the cDNA transcript of the cleaved nuclear RNA(An) was hybridized to an excess of mRNA(An). The saturation value obtained in these hybridizations provides a measure of the fraction of hnRNA molecules sharing sequences with cytoplasmic mRNA. At saturation, approximately 45-50% of the input nuclear cDNA hybridizes to the mRNA(An). Figure 4 also shows that by log Rot = 2, 67% of this same cDNA has hybridized to the hnRNA(An) fragments. Thus the nuclear cDNA anneals under similar conditions to a significantly greater extent with nuclear RNA than with cytoplasmic mRNA. The low saturation value achieved using cytoplasmic mRNA as “driver” is therefore due not to an inherent inability of the nuclear cDNA to hybridize, but rather to the presence of poly(A)-containing sequences that have no detectable counterparts in the cytoplasm. Thus at most 70% (46/67) of the nuclear cDNA transcribed from the 3' terminus of the hnRNA(An) fragments is complementary to mRNA(An). Assuming that reverse transcriptase copies RNA sequences in proportion to their relative abundance within the population, only 70% of the polyadenylylated HeLa nuclear RNA contains sequences at the 3' terminus related to those in mRNA(An). The most complex transition, extending from a log Rot of 0.7-2, contains approximately 33% of the cytoplasmic sequences but 65% of the hybridizable nuclear cDNA. If selective copying of the complex nuclear sequences has not occurred, these complex sequences constitute a relatively larger proportion of the nuclear RNA than of the cytoplasmic mRNA. Nevertheless, the actual number of RNA molecules in the nucleus containing the scarce sequences is lower than in the cytoplasm.

It has been suggested that the large hnRNA molecules result from the artifactual aggregation of smaller nuclear molecules. To show that mRNA sequences are in truly large molecules, we selected hnRNA molecules sedimenting faster than 45 S in an aqueous sucrose gradient, and then denatured these large molecules in Me2SO (12). The fraction of these large molecules (20%) that still sedimented faster than 45 S in a 5 to 20% sucrose gradient in Me2SO was recovered and treated as outlined above to obtain poly(A)-containing fragments. The hybridization of the cDNA prepared from this denatured, cleaved hnRNA(An) is shown in Fig. 5. Approximately 40% (uncorrected) of this cDNA hybridizes to the mRNA(An). Kinetics of the hybridization are essentially identical to those for the hybridization of the cDNA prepared from the total cleaved hnRNA(An) (Fig. 4).
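The correction behind the figure of 70% (46/67) is a simple normalization: the saturation reached with cytoplasmic mRNA as driver is divided by the saturation reached with the cDNA's own template, which sets the ceiling for hybridizable cDNA. A minimal sketch follows; the function name is hypothetical, and 46% is used as the mid-range of the 45-50% quoted for the mRNA-driven reaction.

```python
# Sketch: normalizing a heterologous saturation value to the homologous one.
# Values are the percentages quoted in the text for Fig. 4.

def normalized_saturation(heterologous_percent, homologous_percent):
    """Fraction of hybridizable cDNA that anneals to the heterologous driver."""
    return heterologous_percent / homologous_percent


shared = normalized_saturation(46.0, 67.0)
print(f"{100 * shared:.0f}% of hybridizable nuclear cDNA anneals to mRNA")
```

Taking the full 45-50% range for the numerator instead of 46% would give values on either side of the quoted 70%, which is why the text describes it as an upper limit ("at most").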

FIG. 5. Hybridization of cDNA transcribed from large hnRNA molecules. Nuclear RNA sedimenting faster than 45 S in an aqueous 15 to 30% sucrose gradient was isolated. This large RNA was denatured with Me2SO and then centrifuged in a 5 to 20% sucrose gradient in Me2SO at 40 K rpm for 20 hours in the Spinco SW 40 rotor. Those molecules again sedimenting faster than 45 S were pooled and alkali-cleaved, and the poly(A)-containing fragments were purified by poly(U)-Sepharose chromatography. cDNA was prepared from the (An)-containing fragments and then hybridized to an excess of mRNA(An). The abscissa is log Rot, where Rot = concentration of RNA nucleotide x time.

The results of the hybridization of nuclear cDNA to mRNA(An) suggest that at least a portion of the nuclear RNA molecules contain message sequences at their 3' terminus adjacent to the poly(A). However, these experiments do not indicate how many of the cytoplasmic sequences are found in nuclear RNA. To answer this question, cDNA was prepared from the cytoplasmic mRNA(An) and annealed to a large excess of hnRNA(An). The cDNA from cytoplasmic mRNA hybridizes much more slowly to hnRNA than to its own template (Fig. 6). This suggests that the rapidly hybridizing (abundant) sequences in the cytoplasm are much reduced in the nucleus relative to the scarce sequences. This agrees with the conclusions drawn from the hybridization of nuclear cDNA to cytoplasmic RNA (Fig. 4). An unambiguous interpretation of the hybridization of cytoplasmic cDNA to nuclear RNA requires that the contribution by cytoplasmic mRNA contamination be negligible. Most of the message sequences contaminating the nuclear RNA should be from the abundant classes. HeLa message cDNA was therefore separated into two fractions, one containing the transcripts of the abundant, and the other of the scarce, mRNA. Each cDNA fraction was then hybridized to the hnRNA separately. Fractionation was accomplished by annealing the HeLa message cDNA with mRNA(An) to an Rot value of approximately 1.5 and then separating the hybridized (abundant) from the unhybridized (scarce) message cDNA by chromatography on hydroxylapatite (13). Separation was confirmed by the hybridization of each cDNA fraction to the mRNA (Fig. 6b). When the abundant message cDNA is annealed to cleaved, oligo(dT)-cellulose-purified hnRNA, hybridization occurs at values of Rot approximately 10 times those at which the same cDNA hybridizes to messenger RNA (Fig. 6b).
This displacement can be used to establish an absolute maximum level of cytoplasmic contamination of the nuclear RNA preparation (that is, 10%) by assuming, in the extreme, that all the observed hybridization of the abundant message cDNA is with cytoplasmic RNA contaminating the hnRNA. Similarly, the scarce message cDNA should hybridize 10-fold more slowly to nuclear RNA than to messenger RNA if only cytoplasmic contamination were driving the reaction. The data in Fig. 6 show that the scarce message cDNA hybridizes 4 times faster than was predicted by assuming 10% cytoplasmic contamination. This shows that most of the scarce mRNA sequences are present in hnRNA. The rate at which scarce message cDNA is driven into hybrid form by nuclear RNA is, however, 2.5-fold slower than when this cDNA is driven by messenger RNA. This is further evidence that the message sequences adjacent to poly(A) in the nucleus are diluted by poly(A)-containing nonmessage sequences.
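The reasoning in this paragraph rests on the proportionality, in an RNA-excess hybridization, between the rate of reaction and the concentration of the complementary sequence in the driver. A short sketch of the arithmetic follows; it assumes that simple proportionality and uses hypothetical function names, with the 10-fold and 4-fold factors taken from the text.

```python
# Sketch of the contamination argument, assuming that in RNA excess the
# hybridization rate is proportional to the concentration of the
# complementary sequence in the driver RNA.

def max_contamination(rot_displacement):
    """Upper limit on contamination if ALL hybridization of the abundant
    cDNA were driven by contaminating cytoplasmic mRNA."""
    return 1.0 / rot_displacement


def fold_slower_than_mrna(predicted_displacement, observed_speedup):
    """Displacement actually observed for the scarce cDNA, relative to
    its reaction with messenger RNA."""
    return predicted_displacement / observed_speedup


contamination_limit = max_contamination(10)           # abundant cDNA, 10-fold shift
scarce_displacement = fold_slower_than_mrna(10, 4)    # scarce cDNA, 4x faster than predicted
relative_concentration = 1.0 / scarce_displacement    # scarce sequences in hnRNA vs. mRNA

print(contamination_limit, scarce_displacement, relative_concentration)
```

On this reading, the 2.5-fold displacement corresponds to scarce message sequences being present in the hnRNA preparation at about 40% of their concentration in messenger RNA, four times more than a 10% contamination could supply.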

FIG. 6. Hybridization of HeLa “message” cDNA to hnRNA. (a) Total message cDNA. Unfractionated HeLa message cDNA was hybridized to an excess of cleaved, oligo(dT)-cellulose-bound hnRNA (X-X). [See Herman et al. (23) for an explanation of how nuclear RNA concentrations were determined.] The kinetics of hybridization of total message cDNA with mRNA(An) are reproduced from Fig. 4 for comparison (----). (b) Abundant and scarce message cDNA. Total HeLa message cDNA was annealed with mRNA(An) to an Rot value of 1.5. The hybridized (abundant) cDNA was separated from the unhybridized (scarce) by chromatography on hydroxylapatite. Each fraction was then hybridized to an excess of mRNA(An) or cleaved, oligo(dT)-cellulose-bound hnRNA: (A--A) abundant cDNA driven by total mRNA(An); (A-A) abundant cDNA driven by nuclear RNA; (.--a) scarce cDNA driven by total mRNA(An); (0-0) scarce cDNA driven by nuclear RNA. The abscissa is log Rot.

The number of copies of hnRNA molecules containing the scarce message sequences can be estimated from the data presented here. We have assumed that the HeLa cell has a total of 5 × 10⁵ mRNA molecules. One third of these are in the scarce class, which has a sequence complexity of ~10⁴. Thus there are about 15 copies of each scarce message per cell. Since 7% of the poly(A)-adjacent message sequences are in the nucleus and approximately 65% of these are “scarce” sequences, there is about 0.7 copy of each per nucleus. Considering the approximations used, it is very possible that the actual number is one copy per nucleus. Most important, there appear to be transcripts in the nucleus corresponding to most or all of the active DNA regions.
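The copy-number estimate above is simple arithmetic and can be retraced in a few lines. This is a sketch under stated assumptions: the scarce-class complexity is taken as roughly 10⁴ different sequences (a value consistent with the quoted 15 copies per cell), and the way the 7% and 65% figures are multiplied together is one reading of the text that happens to reproduce the quoted 0.7. Variable names are illustrative.

```python
# Figures taken from the text.
total_mrna_per_cell = 5e5      # assumed total mRNA molecules per HeLa cell
scarce_fraction = 1 / 3        # share of mRNA molecules in the scarce class
scarce_complexity = 1e4        # ASSUMED number of different scarce sequences

# Average copies of each scarce message per cell (about 16.7;
# the text rounds this to "about 15").
copies_per_cell = total_mrna_per_cell * scarce_fraction / scarce_complexity

# The text then works from the rounded value of 15.
quoted_copies_per_cell = 15
nuclear_fraction = 0.07        # poly(A)-adjacent message sequences in the nucleus
scarce_share_in_nucleus = 0.65 # portion of those that are scarce sequences

copies_per_nucleus = (quoted_copies_per_cell
                      * nuclear_fraction
                      * scarce_share_in_nucleus)

print(f"copies per cell:    {copies_per_cell:.1f}")
print(f"copies per nucleus: {copies_per_nucleus:.2f}")
```

Given the rounding at each step, the result is compatible with the suggestion in the text that the true value may be one copy per nucleus.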

B. Association of hnRNA with Other Structures

To estimate the physical association of these hnRNA molecules with the nuclear skeleton and with chromatin, hnRNA was labeled for 3 hours, a length of time sufficient to approach a steady state. We have already seen that most of this RNA is tightly associated with the chromatin-nuclear skeleton complex that sediments after ammonium sulfate treatment. Part of this linkage is to the nuclear skeleton itself. The DNA of the chromatin can be almost quantitatively removed while the hnRNA remains associated with a rapidly sedimenting structure with properties of the remnant nuclear skeleton. The data in Table I indicate that after removal of over 95% of nuclear DNA, at least 80% of hnRNA remains associated with the nuclear skeleton even after ammonium sulfate fractionation.

The double-stranded regions of the hnRNA are apparently involved in the linkage of these molecules to the nuclear skeleton. Digestion of isolated nuclei with pancreatic ribonuclease removes upward of 80% of hnRNA but leaves 50-80% of the double-stranded regions intact and attached to the nuclear skeleton-chromatin complex. The data in Table I show the retention of the protected double-strand pieces and their relative resistance to subsequent elution by ammonium sulfate. Removal of most of the chromatin with DNase at the same time as RNase digestion has very little effect on the final result, either with respect to double-strand yield or the resistance to ammonium sulfate. The results suggest that a significant portion of the double-strand loops are firmly attached to the nuclear skeleton, but not to the chromatin to a significant degree, at least by the criteria applied here.

A very similar result is obtained when the poly(A) segment of hnRNA is examined. The results in Table I show also that the poly(A) segment remains attached to the nuclear skeleton as long as it is covalently linked to the hnRNA molecules and the chromatin remains intact. In parallel to the double strands, the digestion of the chromatin with DNase and the digestion of the hnRNA with RNase do not result in the extensive liberation of the poly(A) segment. Thus it appears that the attachment of poly(A) also does not require the integrity of chromatin and suggests that this segment is, in fact, affixed to the nuclear superstructure.

TABLE I
ASSOCIATION OF hnRNA WITH THE NUCLEAR SKELETON AND CHROMATINᵃ

A. Distribution of hnRNA (%)

                              Control            +DNase
                            hnRNA    DNA      hnRNA    DNA
Released from nucleus          1       1         4      70
Released in (NH₄)₂SO₄          4       4         9      26
Pellet                        95      94        87       4

B. Distribution of double-stranded RNA (%)

                            Control   +RNase   +RNase +DNase
Released from nucleus          9        15           5
Released in (NH₄)₂SO₄          5        37          55
Pellet                        86        48          40

C. Distribution of poly(A) (%)

                            Control   +RNase   +DNase +RNase
Released from nucleus          4        23          29
Released in (NH₄)₂SO₄         15        10          16
Pellet                        81        67          55

ᵃ HeLa cells were labeled overnight with [¹⁴C]thymidine (Part A only) and for 3 hours with [³H]uridine or [³H]adenosine. The cells were broken in low ionic strength buffer (0.01 M NaCl, 0.01 M Tris, pH 7.4, 1.5 mM MgCl₂) by the addition of NP-40 to 1%. Isolated nuclei were treated, as indicated, in the same buffer with pancreatic RNase (10 µg/ml) and pancreatic DNase (120 µg/ml) for 10 minutes at 25°C. Nuclei were fractionated by resuspension in 0.4 M (NH₄)₂SO₄ as described previously (7). Fractions were extracted with phenol and digested with RNase at high ionic strength (0.25 M NaCl, 0.01 M MgCl₂, 0.01 M Tris, pH 7.4) and assayed for double-stranded RNA content by electrophoresis on 14% polyacrylamide gels or for poly(A) content by electrophoresis on 10% gels.

The linkage is apparently a relatively firm one since ribonuclease followed by ammonium sulfate leaves the poly(A) segment with the skeleton. The picture that emerges from these results is of relatively long-lived RNA transcripts attached to the nuclear skeleton by both the double-stranded RNA regions and the 3′ poly(A) tails. The double strands and poly(A) segment are not necessarily the only sites of attachment of hnRNA to subcellular structures. Rather, these are portions that are resistant to ribonuclease digestion and thus their location in the nuclear skeleton-chromatin complex is easily determined. We suggest that these transcripts play a role in organizing chromatin and keeping active chromatin, or euchromatin, in its native state. This hypothesis would be consistent with the observation that chromatin undergoes significant reorganization after the inhibition of RNA metabolism with a drug such as actinomycin.

We therefore suggest that there are two classes of hnRNA, one that is short-lived and serves, at least in part, as the precursor to cytoplasmic mRNA, and a second class that is relatively long-lived and contains long transcripts terminating in poly(A). These long-lived transcripts appear to be attached to the nuclear skeleton and perhaps also to the chromatin. A most plausible assumption would be that the sites of attachment are related to the region producing the quasi-stable transcript. Since these regions are active, one would conclude that transcription continues in the region where the quasi-stable hnRNA is attached. This raises the possibility that the transcript that is not used for morphological purposes is not of the same size or from exactly the same region as the quasi-stable transcript. Certainly there are suggestions that message may arise from the 5′ end of hnRNA, as indicated by conservation of 5′ caps¹ from the nucleus to the cytoplasm (Perry, private communication). Nevertheless, all the cytoplasmic mRNA appears to be present in sequences at the 3′ ends of the quasi-stable transcripts. There is also the observation of a subfraction of hnRNA consisting of smaller than average nuclear transcripts (although still considerably larger than cytoplasmic mRNA) that behaves as a major precursor to cytoplasmic message (7). This subfraction was removed in the present procedures, and yet the remaining hnRNA still contains most if not all of the sequences of cytoplasmic mRNA. Such pieces of evidence are merely tantalizing and certainly do not prove that a particular transcription site may have more than one size or type of transcript. However, the data presented here do suggest a possible structural role for the quasi-stable hnRNA molecules and raise the possibility of several types of transcripts with different functions at a given locus.

¹ Re caps, see articles in Part I of this volume.

III. Low-Molecular-Weight RNA Species

In this section we turn to another form of gene expression in eukaryotic cells and use the word “message” in a broader sense than simply a sequence coding for polypeptides. It was established nearly 10 years ago that there are low-molecular-weight RNA molecules found in the nucleus and, more recently, in the cytoplasm of the cells of higher organisms that do not serve any role in protein synthesis (references to much of the earlier work are given in Table II). These molecules have no known analogy in prokaryotic RNA metabolism and are among the things that clearly distinguish eukaryotic RNA metabolism from that of lower organisms. Since their initial discovery and characterization, the work on these RNA species has been rather desultory, since no very plausible proposal has been made concerning their possible function. Our recent results indicate that these low-molecular-weight RNA species, found in very specific subcellular locations, appear to be involved in cellular structure. In particular, a number of species are specifically associated with the nuclear skeleton. Of the two cytoplasmic species, one at least appears, in part, to be associated with membranes and appears in oncornaviruses derived from mammalian cells. What follows is a listing of the small, stable nuclear and cytoplasmic RNAs and a brief description of what is presently known about the metabolism and localization of each.

TABLE II

Species   Alternative nomenclatureᵃ,ᵇ   Localization
A         U3                            Nucleolar
B                                       Nuclear skeleton
C         U2                            Nuclear skeleton
D         U1B                           Nuclear
E         5.8 S                         Ribosomal
F         U1A                           Nucleoplasmic
G         5 S I, II                     Ribosomal
G′        5 S III                       Nuclear
H         4.5 S I, II, III              Nuclear skeleton
K                                       Cytoplasmic and nuclear
L         “Viral 7 S”                   Cytoplasmic/membrane

Other entries (row assignment not recoverable): alternative names C1, C2, C3, C4, C5, C6; half-lifeᶜ,ᵈ, stable or 20 hr.

ᵃ Ro-Choi and Busch (21).
ᵇ Walker et al. (22).
ᶜ Weinberg and Penman (20).
ᵈ Frederiksen et al. (25).
ᵉ Marzluff et al. (26).

A. SnA

The position of SnA relative to SnB and SnC is altered in this gel system compared to the electrophoresis system used previously (14). Its identification, however, is unambiguous owing to its being located exclusively in the nucleolar fraction (as shown in lane 3 of Fig. 7). It also remains in the nucleus after the release of chromatin in the form of nucleosomes (nu bodies²). Labeled SnA is not found associated with the nucleolus until 15 minutes after the addition of the label to the cells. During the first 15 minutes of labeling, SnA is found transiently in the cytoplasmic fraction. Thus for a brief period after its synthesis, SnA is not fixed in the nucleolus, and either goes through an initial cytoplasmic stage before entering the nucleolus or leaks into the cytoplasm during fractionation. After 2 hours of continuous labeling, equal amounts of SnA are found in the cytoplasmic and nuclear fractions. After a 16-hour label, >90% of the SnA in the cells is found in the nucleolus.

FIG. 7. Identification of the small-molecular-weight RNAs in the HeLa cell. Electrophoresis in 6 to 15% dodecyl sulfate polyacrylamide slab gels. Each fraction was prepared from 2 × 10⁶ cells labeled for 16 hours with 5 µCi of [³H]uridine per milliliter under normal growth conditions. 1, Nuclear fraction; 2, cytoplasmic fraction; 3, nucleolar fraction. For details, see Zieve and Penman (24).

² See papers by Axel and by Paoletti et al. in this volume.


Several results suggest that SnA is involved in nucleolar processing. A fraction of SnA is hydrogen-bonded to 32-28 S nucleolar intermediates, as first reported by Prestayko et al. (15). After 6 hours of labeling, we find 5% of the SnA in the cell hydrogen-bonded to nucleolar 32-28 S material. If these cells are allowed to grow for an additional 12 hours or longer, 30% of the total cellular SnA is found hydrogen-bonded to the nucleolar material. SnA is never found hydrogen-bonded to any cytoplasmic material. Significant amounts of SnA (25%) are still synthesized in the presence of 0.04 µg/ml of actinomycin D, a concentration that is sufficient to inhibit ribosomal precursor formation completely (16). In addition, the SnA synthesized in the presence of low levels of actinomycin is found only in the cytoplasmic fraction of the cell. A similar result is found if cells are labeled with an RNA precursor in the presence of 5-fluorouridine, an inhibitor of ribosomal processing (17, 18). No SnA is found in the cell if cells are labeled for 16 hours in the presence of 5-fluorouridine. The fixation of SnA in the nucleus and its stability in the cell appear to be dependent upon normal nucleolar functioning.

B. SnB, SnC

These species are associated predominantly with the nuclear skeleton. In our gel system, they run very close to each other, SnC having only a slightly faster mobility. When cells labeled for 16 hours are fractionated into nucleus and cytoplasm, between 25% and 40% of the labeled species SnB and SnC appear in the cytoplasmic fraction (Fig. 7). All the SnB that remains in the nuclei is found in the nucleoplasmic fraction. Most of the SnC that remains in the nucleus is also found in the nucleoplasmic fraction; however, between 5% and 10% is found in the nucleolar preparations. When chromatin is released from the nuclei by enzymic digestion, all the SnB found in the nucleus pellets with the nuclear skeleton. A small amount of SnC is released from the nuclei by both micrococcal and pancreatic DNase digestions.

C. SnD

SnD is the most abundant of the small RNA species found in the nuclei of mammalian cells. In the cold fractionation procedure, up to 10% of this species is found in the cytoplasmic fraction, at least some attributable to mitotic cells (19). (On our gels, SnD often runs as a curved band, as can be seen in the gel patterns.) It is the only small species released from the nucleus during brief warming as, for example, occurs during the digestion for nu bodies. The SnD released is in very large structures that sediment at 40 S and contain a large number of distinct polypeptides.

D. SnF, SnH

These species are the least abundant of the major small nuclear RNAs, and their behavior has not been exhaustively studied. Both are found only in the nucleoplasm. SnF is partially released from the nuclei by micrococcal nuclease digestion but not by pancreatic DNase, which suggests that RNase activity is required to detach it from the nuclear skeleton. SnH is not released from the nuclei by either of the enzyme digestions. SnH has a half-life of 30 hours; however, SnH disappears in the presence of high levels of actinomycin D.

E. SnG′

SnG′ is a stable molecule associated with the nucleoplasm. It was originally identified as a form of cytoplasmic 5 S RNA but was later shown to be a methylated 5 S RNA distinct from cytoplasmic ribosomal 5 S (20). Its electrophoretic mobility is slightly different from that of cytoplasmic 5 S on our gels, and a recent report shows that a nuclear 5 S RNA actually has a base sequence very different from that of cytoplasmic 5 S (21).

F. SnK

SnK is the only species that has not been definitely localized. It appears to varying degrees in both nucleus and cytoplasm. After a 10-minute period of labeling, radioactive SnK is found totally in the cytoplasmic fraction. After 90 minutes of labeling, about 10% of it becomes associated with the nucleus. After 18 hours of labeling, the distribution of SnK is quite variable and in different experiments ranged from 60% cytoplasmic/40% nuclear to 95% cytoplasmic/5% nuclear. The most interesting aspect of the metabolism of SnK is its migration into the nuclear fraction after exposure of the cells to high levels of actinomycin D. When prelabeled cells are treated with 5 µg/ml of actinomycin D, increasing amounts of SnK become associated with the nuclear fraction, and by 4 hours 90% of the prelabeled SnK is found in the nucleus. This behavior could not be induced by any other inhibitor investigated. Unlike the other small nuclear species, SnK is not methylated (25). We have confirmed this and have also found that species K is metabolically stable. In addition, the synthesis of SnK is the most resistant to actinomycin D of all the small RNAs, with the exception of 5 S rRNA and tRNA. At 0.1 µg/ml actinomycin D, SnK is still synthesized at 40% of the control rate.


G. SnP

SnP is a stable species, found only in the nucleus, which we tentatively identify as a new low-molecular-weight RNA. SnP is obvious in Fig. 7. This RNA often runs as a very diffuse band in the aqueous gels. It becomes labeled very slowly, and its behavior has not been studied extensively.

H. ScL or Oncornavirus 7 S RNA

This RNA species has been previously identified as a nuclear species (20). In our fractionation, it is totally cytoplasmic and is the small cytoplasmic RNA that has recently been described as occurring in cells and in the virion of oncornaviruses (22). It was also reported that ScL is associated with polyribosomes. We have been unable to reproduce this result using either hypotonic or isotonic lysis buffers. More than 90% of ScL sediments more slowly than ribosomal subunits in a sucrose gradient when detergent lysis is used to fractionate the cells. A significant portion of ScL is associated with membranes. A membrane fraction contains 30% of the ScL of a mechanically prepared cytoplasmic extract. If the membrane fraction is prepared from cytoplasm exposed to EDTA, the amount of ScL is reduced to 15%. If membranes are dissolved with 0.5% deoxycholate and 0.5% Brij 58, none of the ScL is found in rapidly sedimenting structures. It therefore seems probable that ScL is, in part, found in membranes, and it can be concluded that this portion is truly cytoplasmic in localization.

The ScL RNA from HeLa cells and from oncornavirus are probably identical molecules (Fig. 8). Both have the same electrophoretic mobility for the main band, and each has a family of conformers that run more rapidly than the principal band. Conformers have been isolated from gels and rerun in urea. Under denaturing conditions, the conformers from both the cells and the virus migrate with the same mobility, which is identical to that of the main band of ScL. In addition, ScL from both cell and virus can be shifted in mobility by heating prior to electrophoresis. The data in Fig. 8 show that both viral and cellular RNA are shifted in mobility to a single band whose migration rate is intermediate between the original main band and the fastest running conformers. The number of properties the viral and cellular RNA molecules have in common is sufficient to suggest their identity, at least with respect to physical behavior. Indeed, it seems very possible that the molecule found in the virion is of cellular origin. The presence of this molecule in membranes suggests that it may become associated with the virion during the process of maturation and budding.


FIG. 8. Equivalent mobility of ScL and viral 7 S RNA under nondenaturing and denaturing conditions. Electrophoresis was carried out in a 6 to 15% gradient slab gel as described (24). 1, Cytoplasmic fraction from 6 × 10⁶ cells labeled for 2 hours; 2, isolated ScL; 3, isolated ScL heated for 7 minutes at 70°C in sample buffer before application to the gel; 4, MLV-M RNA; 5, MLV-M RNA heated for 7 minutes at 70°C in sample buffer before application to the gel.

ScL resembles SnK in that it is not methylated and is metabolically stable. Its synthesis is inhibited 80% by 0.1 µg/ml actinomycin D.

The results of this survey are summarized in Table II and Fig. 9 together with the nomenclature of other workers. The specific subcellular location suggests that these RNA species are involved in cellular structure either as passive participants in subcellular organization or as active directors of architectural construction. More recent studies have gone further and show that the cytoplasmic species K and L are highly enriched in preparations of the cytoplasmic “skeleton,” further supporting the notion that these RNA molecules are always found in association with specific cellular structures. The role of these small RNA molecules is only a subject of speculation at present, but it may be noted that they represent a flow of information from the genome that does not go through the process of protein synthesis. Thus, especially with regard to cellular organization, we must consider the possibility that there are forms of gene expression in eukaryotes that affect cell behavior directly and not through the mediation of protein synthesis. The amount of such information may be rather small if, in fact, we have completely described all the small RNA species. If, however, many of the “conformers” seen in the electrophoretic pattern should prove to be molecules of different base sequence, the amount of genetic information flowing through this pathway could be considerable indeed.

FIG. 9. Localizations of the small RNA species of the HeLa cell. The schematic diagram of the HeLa cell shows the locations of the small cytoplasmic (Sc) and small nuclear (Sn) RNA species in the cytoplasm, nucleoplasm and nucleolus (nu).

IV. Summary

This report presents experimental results that suggest an interrelation between cellular topology and nucleic acid metabolism. Gentle lysis of HeLa cells with a nonionic detergent reveals an elaborate remnant structure that retains much of the cellular morphology. Electron micrographs and biochemical studies show that a majority of the protein synthetic apparatus remains affixed to this “cytoskeleton.” The nature and even the reality of this attachment remain to be elucidated. However, the suggestion that protein synthesis takes place in a topologically ordered manner would help explain numerous observations showing that polyribosomes appear not to be randomly distributed in the cytoplasm of cells. Thus, the skeleton structure possibly plays an important role in cytoplasmic RNA metabolism. Two of the small RNA species found in mammalian cells, ScK and ScL, are intimately associated with the cytoskeleton, and ScL is, in part, associated with membranes. The function of these molecules is unknown, but there is the possibility that they serve a structural role and may help determine the architecture of the cytoskeleton. If so, then RNA metabolism could play a direct role in determining cellular topology.


The nucleus also has a “skeletal” structure, although much less can be visualized in this case. We have suggested that hnRNA molecules are attached by both their double-stranded segments and 3′ poly(A) tails to this nuclear framework. This suggests a possible structural role for these transcripts in keeping chromatin properly organized in the interphase nucleus. Also, many of the small RNA species are found localized in this nuclear skeleton, suggesting that they also may play a role in either structure or function determination. The experiments described here, while not definitive, indicate an aspect of the metazoan cell with no obvious counterpart in bacteria, that is, an interrelation between spatial organization and RNA metabolism.

ACKNOWLEDGMENTS

We would like to thank Carol Hahnfeld and Laura Ransom for their excellent technical assistance. This work was supported by grants from the National Institutes of Health (NIH 2 R01 CA08416; NIH CA12174) and from the National Science Foundation (BMS 73 06859). R. Herman is a recipient of a National Institutes of Health postdoctoral fellowship (6 F22 DE01655). J. Williams was a Harkness Foundation Fellow. G. Zieve is the recipient of a predoctoral fellowship from the National Science Foundation.

REFERENCES

1. M. L. Shelanski and H. Feif, in “Structure and Function in the Nervous System” (C. Bourne, ed.), Vol. 6, pp. 47-80. Academic Press, New York, 1972.
2. R. V. Rice and A. C. Brady, CSHSQB 37, 439 (1972).
3. R. P. Goldman and D. M. Knipe, CSHSQB 37, 523 (1972).
4. S. Yen, D. Dahl, M. Schachmer and M. L. Shelanski, PNAS 73, 529 (1976).
5. R. Berezney and D. Coffey, BBRC 60, 1410 (1974).
6. S. Penman, JMB 17, 177 (1966).
7. R. Price, L. Ransom and S. Penman, Cell 2, 253 (1974).
8. L. F. Johnson, J. G. Williams, H. T. Abelson, H. Green and S. Penman, Cell 4, 69 (1975).
9. J. O. Bishop, M. Rosbash and D. Evans, JMB 85, 75 (1974).
10. G. R. Molloy, W. Jelinek, M. Salditt and J. E. Darnell, Cell 1, 43 (1974).
11. H. Nakazato, M. Edmonds and B. W. Kopp, PNAS 71, 200 (1974).
12. E. Derman and J. E. Darnell, Cell 3, 255 (1974).
13. J. G. Williams and S. Penman, Cell 6, 197 (1975).
14. R. Weinberg and S. Penman, JMB 38, 289 (1968).
15. A. W. Prestayko, M. Tonato and H. Busch, JMB 47, 505 (1970).
16. S. Penman, C. Vesco and M. Penman, JMB 39, 49 (1968).
17. D. S. Wilkinson and H. C. Pitot, JBC 248, 63 (1973).
18. B. Brdr, D. B. Rifkin and E. Reich, FEBS Lett. 24, 345 (1972).
19. A. Rein, BBA 232, 306 (1970).
20. R. Weinberg and S. Penman, BBA 190, 10 (1969).
21. T. S. Ro-Choi and H. Busch, in “The Cell Nucleus” (H. Busch, ed.), 3rd ed., pp. 151-208. Academic Press, New York, 1974.
22. T. A. Walker, N. R. Pace, R. L. Erikson, E. Erikson and F. Behr, PNAS 71, 3390 (1974).
23. R. C. Herman, J. G. Williams and S. Penman, Cell 7, 429 (1976).
24. G. Zieve and S. Penman, Cell 8, 19 (1976).
25. S. Frederiksen, I. R. Pederson, P. Hellung-Larsen and J. Engberg, BBA 340, 64 (1974).
26. M. F. Marzluff, E. L. White, R. Benjamin and R. C. Huang, Chromatin Biochem. 14, 3715 (1975).


The Mechanism of Steroid Hormone Regulation of Transcription of Specific Eukaryotic Genes

BERT W. O’MALLEY AND ANTHONY R. MEANS

Department of Cell Biology
Baylor College of Medicine
Houston, Texas

I. Introduction

Growth and differentiation is a complex process that requires an integrated synthesis of DNA along with transcription of a full complement of RNA and expression of protein for each new cell type. Differentiation can refer either to morphological changes or to the appearance of a new biochemical function. In a variety of eukaryotic model systems, tissue growth and differentiation can be induced with specific chemical effectors. Initiation of active metabolism in cell division of these tissues allows a precise temporal analysis of the biochemical events that result in the differentiation response. These chemical effectors are exemplified by several steroid hormones acting upon their target tissues (1-11).

Hormone-mediated biochemical differentiation refers to the initiation of a new cell capacity that normally appears during the maturation process. Thus, these responses are specific for either embryonic or immature tissues. The effect of glucocorticoids on the differentiation of chick embryo retinal cells provides one example of hormone-dependent biochemical differentiation (1, 3). The action of the insect steroid ecdysone in the initiation of maturation of insect larvae provides another example (4). Dihydrotestosterone is the hormone responsible for the expression of differentiation of the male phenotype in most animals (5, 6). Similarly, the decidual cell reaction in the uterus in response to progesterone is manifested by proliferation of cells similar to that occurring on implantation of a blastocyst (7). Estrogen stimulates the growth and differentiation of the immature rat uterus (8) and is one of the hormones responsible for the appearance of differentiated mammary gland function (9). It also causes a remarkable cytodifferentiation and biochemical differentiation of the oviduct of the immature chick (8, 10, 11). Thus, steroid hormones are capable of stimulating prematurely the appearance of new proteins and target cell functions under well-defined conditions. Moreover, these events are a primary part of the normal adult response during maturation and presumably are under the direction of the endogenous hormones.

II. Control Theories

The ultimate response to a hormone that causes growth or differentiation of its target tissue is the change in the complement of cellular proteins. A variety of steroid hormones stimulate protein synthesis within hours after a single injection into hormone-deficient animals. One of the primary interests of investigators studying steroid hormone action has been to determine in chemically precise terms the intracellular levels at which this control is exerted. Since the description of the original model for control of bacterial gene expression at the transcriptional level (12), it has been hypothesized that steroid hormones stimulate protein synthesis in a similar manner. However, alternate hypotheses exist, such as regulation of specific protein synthesis at the posttranscriptional level (13). This theory requires that messenger RNAs or their precursors, heterogeneous nuclear RNAs, be synthesized at all times; but in the hormone-deficient state, they would not be processed to the cytoplasm for translation because they are undergoing simultaneous degradation. Hormones, then, would act either by promoting the processing mechanism or by inhibiting the degradation process, thereby causing an increase in the net cellular levels of specific messenger RNAs.

Yet another level of possible regulation is at the translation of the messenger RNA into protein. In this case, one could envision a variety of potential regulatory steps. These would include the interaction of the messenger RNA with protein-synthesis initiation factors and/or ribosomes. The presence of a specific structure on the 5′ end of the messenger RNA (such as the “cap” structure, described in detail in Part I of this volume) appears to be required for the initiation of translation of a variety of eukaryotic messenger RNAs; this is another potential example of translational control. However, the posttranscriptional theory and, to a lesser extent, the translational control theory require that the messenger RNAs or precursors thereof are continuously synthesized regardless of the hormonal state of the animal.

In order to elucidate the precise control mechanisms in eukaryotic organisms, the presence of hormone effects on messenger RNA production had to be demonstrated. Measurements of indirect changes in gene transcription indicate that several hormones do, in fact, act at the transcriptional level of cell regulation. As an example, estrogen acting on uterus stimulates the synthesis of rapidly labeled nuclear RNA (14), and RNA polymerase activity is altered (15). In addition, the template capacity of nuclear chromatin is increased (16). Examination of these RNA products by nucleic acid hybridization and nearest-neighbor base analysis has also revealed marked changes in response to the steroid hormone (17). However, none of these studies constitutes direct proof of alterations in the transcription of specific structural genes. Indeed, obtaining direct evidence that the synthesis of messenger RNA is a rate-limiting step in the action of steroid hormones has until recently been a most elusive problem.

The only definitive way to prove the existence of any messenger RNA is to isolate an RNA fraction from the cell or tissue under investigation and demonstrate that this RNA supports the unambiguous translation of its specific protein in a cell-free protein-synthesis system. Application of this procedure to studies concerning the mechanism of hormone action is made much easier if the hormone results in a stimulation of a specific and easily quantitated protein in its target cell. The presence of such proteins has made the chick oviduct a unique model in which to study the level at which hormones regulate the synthesis of cell-specific proteins (10, 11).
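The distinction between the transcriptional and posttranscriptional theories described in this section can be illustrated with a deliberately simple steady-state sketch. It assumes, purely for illustration, that the level M of a specific mRNA follows dM/dt = s - d·M, with a constant synthesis rate s and a first-order loss rate d; neither the model nor the numbers come from the text, and the function name is hypothetical.

```python
def steady_state_level(synthesis_rate, loss_rate):
    """Steady-state mRNA level for dM/dt = synthesis_rate - loss_rate * M."""
    if loss_rate <= 0:
        raise ValueError("loss_rate must be positive")
    return synthesis_rate / loss_rate


# Arbitrary illustrative units for the hormone-deficient state.
basal = steady_state_level(synthesis_rate=1.0, loss_rate=1.0)

# Transcriptional control: the hormone raises synthesis 10-fold.
transcriptional = steady_state_level(synthesis_rate=10.0, loss_rate=1.0)

# Posttranscriptional control: synthesis is unchanged, but the hormone
# lowers the rate of loss (degradation) 10-fold.
posttranscriptional = steady_state_level(synthesis_rate=1.0, loss_rate=0.1)

print(basal, transcriptional, posttranscriptional)
```

In this toy model both mechanisms produce the same rise in the net level of the message, which is one way to see why the text asks for direct measurement of specific mRNA production rather than inference from protein accumulation alone.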

III. The Oviduct as a Model for Steroid Hormone Action

Two steroid hormones, estrogen and progesterone, affect the chick oviduct (10, 18-20). Estrogen is required for the cytodifferentiation of the glandular tissue and is subsequently required for the maintenance of optimal metabolic activity. One of the most striking features of the differentiated function of this organ is the synthesis and secretion of egg-white proteins (10, 11, 20-24). One of these proteins, ovalbumin, comprises nearly 60% of the protein synthesized in the mature hen oviduct and a similar proportion in the oviduct of chicks treated with estrogen for 15-18 days. Thus, ovalbumin, because of its abundance, has been relatively easy to isolate and purify to homogeneity. Unlike estrogen, progesterone causes neither cytodifferentiation nor major changes in total metabolic activity in the oviduct (10, 11, 18, 19, 21). However, progesterone has been shown to control specifically the synthesis of the egg-white protein avidin (10, 24-28), which represents no more than 0.1% of the total egg-white protein (25). This protein, too, has been relatively easy to quantify because of its unique ability to bind biotin with high affinity (29). In order to demonstrate whether estrogen and progesterone regulate the synthesis of ovalbumin and avidin, respectively, at the level of RNA production, it was necessary to determine the intracellular concentration of the specific mRNAs under various conditions of hormone stimulation. To accomplish this task, these mRNAs have been isolated from oviduct tissue and proven to be authentic by demonstrating the capacity of these molecules to direct the de novo synthesis of their corresponding egg-white proteins in heterologous protein-synthesizing systems (11, 30-35). Using such a system, it has been demonstrated that oviduct from laying hens, in which ovalbumin is being synthesized at its maximal rate, contains the greatest amount of ovalbumin mRNA. On the other hand, there is no detectable mRNA for ovalbumin in nucleic acids extracted from the unstimulated immature oviduct of the 7-day-old chick. Stimulation of these animals with estrogen for 4, 10 or 16 days leads to increasing activity of the extractable messenger for ovalbumin synthesis. However, when chicks treated with estrogen for 16 days are subsequently withdrawn from hormone treatment for 16 days, the ovalbumin mRNA activity again becomes very low. Finally, the administration of estrogen to these animals for 1, 2 or 4 days after the 16-day withdrawal period leads once more to a progressive increase in the amount of ovalbumin messenger (11, 32-34). These observations reveal that the amounts of extractable ovalbumin mRNA from oviduct are indeed directly dependent upon estrogen stimulation. When ovalbumin and ovalbumin mRNA are measured in the same tissue samples, a striking correlation over a period of 0-17 days can be demonstrated between ovalbumin accumulation in the stimulated oviduct and its mRNA activity (33, 34). Further studies, as illustrated in Fig. 1, revealed a remarkable parallelism between the changes in the rate of ovalbumin synthesis and the available mRNA. Prior to injection of estrogen to hormone-withdrawn chicks, very little ovalbumin mRNA was detected.
Maximal induction occurred at 18 hours, and mRNA content returned to barely detectable levels at 72 hours. From the decline in the activity of translatable mRNA it can be calculated that the half-life of ovalbumin mRNA is about 8-10 hours under conditions of estrogen withdrawal. These studies indicate that estrogen acts at the level of gene transcription, leading to the accumulation of a specific mRNA during differentiation of the oviduct. The appearance of the message seems to be a rate-limiting factor in determining the rate and extent of synthesis of this tissue-specific protein, ovalbumin. Ovalbumin mRNA was chosen for our initial studies of the hormonal regulation of tissue-specific mRNAs because ovalbumin is present in such high concentrations in the oviduct. This is a very specialized instance; in most cases proteins are present at much lower concentrations. In this regard, the egg-white protein avidin, which is

FIG. 1. Correlation of ovalbumin mRNA activity and oviduct albumin (abscissa: days of estrogen). Ovalbumin and ovalbumin mRNA were measured as described previously (33).

specifically induced by progesterone, represents only about 0.1% of the total oviduct protein. It followed that the mRNA for this protein might also be present in considerably smaller amounts than that for ovalbumin. Indeed, extraction of total RNA from estrogen-stimulated hen oviduct proved to be less than satisfactory as a means of quantitating avidin mRNA. When these RNA preparations were tested in the protein-synthesis system, it was not always possible to demonstrate avidin synthesis by a specific immunoprecipitation procedure. Subsequently, it was demonstrated that reproducible results could be assured by effecting partial purification of the mRNA fraction. We were able to take advantage of the fact that most mRNAs, including those for avidin and ovalbumin, contain an extensive sequence of adenylate residues at the 3'-terminal end. This poly(A) sequence was shown initially by Lee et al. (36) to allow the mRNA to be selectively adsorbed to nitrocellulose filters. Application of this procedure to oviduct RNA results in a one-step, 50-fold purification of avidin mRNA. This simple procedure allowed us to measure routinely and consistently the avidin mRNA activity that appears in oviduct in response to progesterone (30). Avidin mRNA activity is highest in oviducts of mature laying hens, where progesterone stimulation is maximal (31). On the other hand, no activity can be demonstrated in the unstimulated immature chick or in oviducts from animals that have received multiple injections of estrogen. However, avidin mRNA activity was first detected at 6 hours after a


single injection of progesterone, and it continued to increase up to approximately 24 hours. The avidin mRNA levels increased prior to the accumulation of avidin and coincident with its increased rate of synthesis. In contrast to the estrogen-mediated changes in ovalbumin mRNA, progesterone induction of avidin mRNA and avidin synthesis occurs with little or no additional change in net cellular RNA and protein synthesis over that resulting from estrogen alone (10, 24, 26). However, these results suggest that both estrogen and progesterone act in the oviduct to alter gene transcription in a manner that leads to the production of specific mRNAs. Although the results described above directly support the hypothesis that steroid hormones regulate specific gene expression in target tissues, it was necessary to quantitate directly and precisely the number of specific mRNA sequences (intact or partially degraded) present in target tissues under various hormonal states. Cell-free translation assays suffer from two serious shortcomings. First, they can quantitate only intact mRNA molecules; second, they are not sufficiently sensitive, since several hundred mRNA molecules per cell are required even at maximal sensitivity (10, 37, 38). RNA-dependent DNA polymerase (reverse transcriptase) has been isolated from RNA viruses and can use an RNA template to catalyze the synthesis of a DNA strand complementary to the RNA molecule (39, 40). However, the enzyme cannot initiate DNA synthesis without a primer. Since most eukaryotic mRNAs contain a stretch of polyadenylate covalently linked at the 3' termini, oligothymidylic acid can be added to serve as primer for the reverse transcriptase by forming a hybrid with the polyadenylate in the mRNAs (41). In the presence of radioactively labeled deoxyribonucleoside triphosphates, a complementary DNA of very high specific radioactivity can be synthesized.
This radioactive complementary DNA may then be utilized to assay for the mRNA sequences by molecular hybridization. Since the complementary DNA can form stable hybrids with short tracts of complementary oligonucleotides, partially degraded mRNA molecules will also be detected using this technique. This assay is also much more sensitive than the translation assay, since it can detect as little as one mRNA molecule per 20 cells. In order to employ this assay successfully to quantitate ovalbumin mRNA sequences in the chick oviduct during estrogen-mediated induction of ovalbumin synthesis, it was necessary to obtain pure ovalbumin mRNA in large quantities. Preparation of milligram amounts of purified ovalbumin mRNA was accomplished by a sequential combination of precise sizing techniques with the selective purification of the poly(A)-containing RNA by either affinity chromatography or adsorption to nitrocellulose filters (42-44). Several new techniques were applied to the purification of ovalbumin mRNA, including Sepharose 4B chromatography (43) and agarose gel electrophoresis in the presence of 6 M urea at pH 3.5 (44). All the procedures used were adapted on a preparative scale to the fractionation of large quantities of RNA. The purity of the ovalbumin mRNA was assessed by several independent criteria, and it was shown to be homogeneous (42, 44). This purified ovalbumin mRNA was next used as a template for the synthesis of a radioactively labeled complementary DNA (cDNAov) using reverse transcriptase isolated from avian myeloblastosis virus (45, 46). The cDNAov was allowed to hybridize with oviduct RNA extracts. The rate of hybridization is a measure of the concentration of ovalbumin mRNA sequences (intact or partially degraded) in the extract, and this value can be converted to the number of ovalbumin mRNA sequences present in each cell (47). Using this procedure, it was determined that there are approximately 92,000 ovalbumin mRNA sequences in each tubular gland cell in mature hen oviducts, but that ovalbumin mRNA sequences are virtually absent in oviducts of immature chicks. Chronic estrogen treatment of immature chicks resulted in an increase in the number of ovalbumin mRNA sequences per cell from 0 to over 30,000 after 18 days (Table I). Withdrawal from estrogen treatment for 16 days caused the number of sequences to diminish to a level of 0-15 sequences per cell. A single dose of estrogen administered at this time resulted in

TABLE I
ESTROGEN INDUCTION OF mRNAov DURING PRIMARY AND SECONDARY STIMULATION OF THE OVIDUCT

Hormonal state(a)     No. of molecules mRNAov per tubular gland cell(b)
Unstimulated          0
4 Days                12,500
9 Days                28,500
18 Days               30,550
Withdrawn             10-15
0.5 Hr                8-30
1 Hr                  30-60
4 Hr                  1,500
8 Hr                  3,200
29 Hr                 10,800

(a) Stimulated with diethylstilbestrol for times shown.
(b) RNA was extracted from oviduct and assayed for ovalbumin mRNA sequences as previously described (47).


an acute increase in ovalbumin mRNA sequences, first detectable within 30 minutes of the injection; by 29 hours, they had increased to >10,000 molecules per cell. Moreover, there was a remarkable parallelism between the increase in the number of ovalbumin mRNA sequences per cell, as determined by analyses using the cDNAov probe, and the ovalbumin mRNA activity analyzed in the in vitro translation assay (37, 38, 47). The transcriptional control theory is further supported by recent experiments in which the [3H]cDNAov hybridization technique was utilized to quantify the number of mRNAov sequences in the RNA products transcribed from chick oviduct chromatin in vitro (48). Using both bacterial and oviduct DNA-dependent RNA polymerases, ovalbumin mRNA sequences were readily detectable in RNA products transcribed from oviduct chromatin of estrogen-stimulated chicks, but not from unstimulated chicks or from spleen chromatin. Withdrawal of the stimulated animals from estrogen treatment for 12 days resulted in a 20-fold reduction of mRNAov levels in the in vitro RNA transcripts. These results indicate that the isolated chromatin retains specificity for transcription in vitro, and that the levels of mRNAov in the oviducts under various hormonal states are determined by whether the gene for ovalbumin is available for transcription. The evidence accumulated suggests that steroid hormones regulate specific protein synthesis in target cells via direct transcriptional control. However, the molecular mechanism involved in regulation of specific gene transcription by hormone-receptor complexes has still not been completely defined in chemically precise terms. In an effort to elucidate the effect of hormone-receptor complex on chromatin transcription in vitro, a method has been developed that accurately measures the formation of initiation complexes between chromatin and RNA polymerase molecules (49).
This method has enabled us to quantitate the number of initiation sites available for transcription in oviduct chromatins of chicks at various developmental states (50). The number of initiation sites increased in a manner directly correlated with the enhanced growth and differentiation of the oviduct during chronic estrogen stimulation. Furthermore, the change in the number of chromatin initiation sites also correlated with the change in the level of estrogen receptors present in the nucleus of withdrawn chick oviducts during secondary estrogen stimulation (51). These results strongly suggest that estrogen-mediated differentiation of the chick oviduct may involve regulation by hormone-receptor complexes of the number of chromatin initiation sites available for transcription. Moreover, our total results now allow us to discriminate between the various control theories discussed above.


IV. Is Ovalbumin Synthesis Regulated at the Translational Level?

Translational control as a primary mechanism for the regulation of ovalbumin synthesis presupposes that mRNAov exists in the target cell in an untranslatable or “inactive” form. A steroid hormone would then exert its effect by “activating” the mRNAov so that it would bind to available ribosomes and be translated. Although this interpretation has been frequently applied to various induction systems over the past decade, no strong evidence supports the hypothesis. Such an explanation for estrogen and/or progesterone induction of ovalbumin synthesis in the chick oviduct appears to be unequivocally eliminated. The hormone-mediated increase in ovalbumin mRNA activity, as assayed by in vitro translation of oviduct RNA in cell-free heterologous systems, decreased the likelihood of such an explanation but did not eliminate the possibility of a hormone-induced modification of the primary or secondary structure of the mRNAov (52). Upon developing a specific hybridization probe ([3H]cDNAov) capable of detecting even partial sequences of mRNAov (45, 46), we were able to demonstrate that, following hormone withdrawal, the basal level of oviduct mRNAov is ~0-15 molecules/tubular gland cell (Table I) (47). A single injection of either estrogen or progesterone (2° stimulation) resulted in a rapid increase in the level of mRNAov sequences, reaching a level of 10,000 molecules/tubular gland cell over the next 29 hours (see above, Table I). It thus appears that steroid regulation of ovalbumin synthesis at the level of mRNA translation is eliminated as a viable alternative for steroid hormone action.


V. Is Ovalbumin Synthesis Regulated at the Posttranscriptional Level?

Much attention has been given to the possibility that regulation at the posttranscriptional level might be the primary locus of steroid hormone action. Data interpreted to support this hypothesis have been garnered from a large series of experiments dealing with protein induction kinetics, mRNA turnover, and drug inhibition of hormone-mediated protein synthesis (13). In its strictest sense, this hypothesis states that the ovalbumin gene would be constantly transcribed (“open”) but that the mRNAov product would be inactivated and degraded prior to exit from the nucleus by repressor or “degradative” regulatory proteins. By this mechanism, the levels of mRNAov would be prevented from rising in the cell and ovalbumin synthesis would remain at basal noninduced levels. During the course of induction of ovalbumin synthesis, the hormone may have little effect on the rate of transcription of the structural gene, but


rather, would act to block inactivation and degradation of mRNAov. The result would be a rise in the level of effective mRNAov and increased ovalbumin synthesis in the absence of gene derepression. Our previous experiments (described above, Table I) revealed a rise in the net level of cellular mRNAov following steroid hormone stimulation. The rate of this rise in mRNAov, taken together with estimates of the half-life of this mRNA, appeared to argue against posttranscriptional control as a primary mechanism for induction of ovalbumin synthesis. Nevertheless, we could not completely eliminate the possibility that the ovalbumin gene was in fact open and continually transcribed, but that the nuclear degradation rate was so rapid that not even a small number of partially degraded mRNAov sequences could be detected in the absence of hormonal stimulation. Our most recent experiments appear to eliminate this possibility. In these experiments, as illustrated in Table II, we have prepared chromatins from hormonally withdrawn chicks. Transcription of these chromatins by bacterial (or eukaryotic) RNA polymerase resulted in few detectable mRNAov sequences (53). In other words, the ovalbumin gene was “closed” or unavailable to the enzyme for transcription. Within 2 hours of an injection of progesterone (or estrogen), oviduct chromatin was capable of supporting synthesis of mRNAov sequences. Our conclusion was that the steroid hormone altered the chromatin template of target cells in such a manner that the ovalbumin gene was now “open” or available to be transcribed by RNA polymerase. This result is inconsistent with the theory of posttranscriptional control and supports the hypothesis that the primary site of steroid hormone induction of protein synthesis is at the locus of the template for transcription.

TABLE II
In Vitro SYNTHESIS OF mRNAov FROM CHROMATIN PREPARED FROM CHICKS SECONDARILY STIMULATED WITH PROGESTERONE in vivo

Oviduct         Chromatin      RNA              Percent        pg mRNAov      pg mRNA/
chromatin at    in reaction    synthesized(a)   mRNAov in      synthesized    µg DNA
                                                RNA(b)         (x 10^-3)
0 Hr            600            201              -              -              -
2 Hr            600            219              0.014          3.4            6.7
6 Hr            600            221              0.030          6.8            11.4
24 Hr           600            261              0.056          15.3           25.6

(a) RNA was synthesized at room temperature.
(b) Data have been corrected by subtracting background (chromatin plus carrier RNA omitting the polymerase).

This is not to say, however, that no regulation whatsoever occurs following transcription. Rather, evidence has already been published that suggests that the half-life of mRNAov is greater under conditions of estrogenic stimulation than under conditions of hormone withdrawal (34, 37, 47). Nevertheless, it can easily be calculated that such changes alone cannot account for the rapid increase in mRNAov observed during hormonal stimulation (47). At present, we do not understand the mechanism of such changes in mRNA turnover. Such observations, however, do serve to remind us of the complexity of eukaryotic cell function.
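
The argument that stabilization of the message cannot by itself explain the induction can be made concrete with a rough bound. The sketch below is illustrative only and rests on assumptions that are not stated in the text: that the withdrawn oviduct is at steady state, that synthesis is zero-order and decay is first-order, and that the values used (about 15 molecules per cell after withdrawal, a half-life of 8 hours, and 10,800 molecules per cell at 29 hours) can be taken at face value from Table I and the half-life estimate given earlier. The function names are hypothetical.

```python
import math


def synthesis_rate_at_steady_state(n_steady, half_life_h):
    """Synthesis rate (molecules/cell/hour) that balances first-order decay.

    Assumes dN/dt = k_s - k_d * N with k_d = ln(2) / half-life,
    so that at steady state k_s = k_d * N.
    """
    return n_steady * math.log(2) / half_life_h


def upper_bound_without_decay(n_start, synthesis_rate, hours):
    """Most molecules that could accumulate if degradation stopped entirely."""
    return n_start + synthesis_rate * hours


# Values taken from the text; treating them as exact is an assumption.
basal_molecules = 15        # upper end of the withdrawn level
half_life_hours = 8.0       # shorter half-life gives the more generous bound
observed_at_29_h = 10_800   # Table I, 29 hours after a single estrogen dose

rate = synthesis_rate_at_steady_state(basal_molecules, half_life_hours)
bound = upper_bound_without_decay(basal_molecules, rate, 29)
shortfall = observed_at_29_h / bound

print(f"basal synthesis rate: {rate:.2f} molecules/cell/hour")
print(f"bound with no degradation for 29 h: {bound:.0f} molecules/cell")
print(f"observed level exceeds the bound by a factor of about {shortfall:.0f}")
```

Under these assumptions, even a complete halt of degradation would yield only some tens of molecules per cell in 29 hours, roughly two orders of magnitude below the observed level, so an increase in the rate of appearance of new message is required.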

VI. Is Ovalbumin Synthesis Regulated at the Level of Transcription?

We predicted some years ago that steroid hormones would be shown to exert their regulatory influences at the transcriptional level (10). Our cumulative indirect evidence over the past 8 years was always consistent with such an interpretation (11). Nevertheless, we suggested that such a conclusion could only be supported by demonstrating that a steroid hormone, in its active form as a hormone-receptor complex, could “turn on” gene transcription in a reconstituted cell-free system composed of purified components. These experiments have recently been completed. Our first task was to purify the progesterone receptor to homogeneity (54). In its native form, the receptor is a dimer composed of two distinct subunits (A and B) (55). In parallel fashion, we developed a biochemically defined system to study eukaryotic chromosomal transcription in vitro. Nucleic acid hybridization techniques with [3H]cDNAov were used to detect the production of minute amounts of ovalbumin mRNA sequences synthesized in vitro from a chromatin template. It was possible to show that ovalbumin mRNA was synthesized in vitro only from oviduct chromatin and not from nontarget chromatins isolated from animals that had received steroid hormone stimulation in vivo (53). Thus, it appeared technically possible to examine the in vitro effects of purified progesterone receptor on the production of ovalbumin messenger RNA from chromatin isolated from hormonally withdrawn animals. Bulk amounts of RNA were synthesized from both control withdrawn chromatin and from withdrawn chromatin incubated in the presence of saturating amounts of purified progesterone receptor ( M ). The hybridization data showed that the RNA synthesized in the presence of pure receptor-hormone complex contained a 10- to 50-fold enrichment of ovalbumin mRNA sequences as compared to control chromatins (Table III).
This experiment strongly supports the hypothesis that steroid hormone-receptor complexes act directly on target cell chromatin to derepress specific genes.


TABLE III
In Vitro SYNTHESIS OF OVALBUMIN mRNA FROM CHICK OVIDUCT CHROMATIN

Source of            Progesterone      Chromatin      RNA            pg of mRNAov    pg mRNAov/
chromatin            receptor          in reaction    synthesized    synthesized     µg DNA
                     (1 x 10^-8 M)     (µg DNA)       (µg)           (x 10^-3)
Withdrawn oviduct    -                 400            125            0-1.9           0-4.8
Withdrawn oviduct    +                 400            135            20.0            50.0

VII. A Model for Steroid Hormone Action

Our previously published data are consistent with the following model of steroid hormone action (Fig. 2). Steroids (S) enter target cells, probably by passive diffusion, and bind to cytoplasmic receptor dimers. An unusual feature of this model is the requirement of two bound hormone molecules per intact functional receptor dimer. Following translocation to the cell nucleus, the receptor dimer binds through its specifier B subunit to chromatin acceptor sites consisting of chromatin-associated nonhistone proteins and DNA. This mechanism allows the concentration of active receptor molecules in areas of the genome under hormonal control.

FIG. 2. Schematic representation of our current concept of the molecular mechanism of steroid hormone action (diagram labels: cell membrane, nuclear membrane, histones, acceptor protein, transcription, transcript).


Because the DNA binding site of the regulatory A subunit is apparently occluded when it is combined with the B subunit in the intact dimer (56), it may be necessary to postulate the release of the A subunit from the dimer after its localization in chromatin. The A subunit would then be free to search the adjacent genome for specific effector sites, which presumably lie in the neighborhood of the acceptor sites. Binding of the A subunit to specific effector sites could then promote a destabilization of the DNA duplex and thus create new potential RNA polymerase binding and initiation sites. As a result, the information contained within previously repressed structural genes could be expressed. The subsequent translation of these newly induced mRNAs provides the proteins required for the functional response.

VIII. Directions of Future Research

In order to focus our studies on the regulation of transcription of a specific genetic locus rather than the entire chromatin, we have attempted to synthesize, isolate and amplify the ovalbumin gene. The complete single-stranded complementary DNA (cDNAov, complementary to mRNAov) was employed for the synthesis of the double-stranded ovalbumin gene. The product was treated with S1 nuclease and analyzed on a neutral sucrose gradient. The DNA sedimented at 10 S, corresponding to a mean length of 1600 base-pairs. More than 30% of the final product was 1800 base-pairs in length and was thus the complete coding portion of the ovalbumin structural gene (57). Tracts of poly(dA) and poly(dT) were added to the 3’ termini of the synthetic ovalbumin gene and the ColE1 plasmid DNA, respectively. The chimeric DNA molecules formed by union of these DNA preparations were employed to transform calcium-treated E. coli hosts. The bacterial clones that contained chimeric DNA molecules were detected by in situ hybridization using [32P]RNA transcribed from cDNAov (56). The positive clones were subsequently cultured, and large quantities of the ovalbumin structural gene were obtained. A separate approach was taken to obtain segments of DNA in which the sequences adjacent to the portion of the ovalbumin gene coding for the structural protein remained intact. Total chick DNA was sheared to a mean length of 4000-5000 base-pairs, and the “naturally occurring” ovalbumin gene sequences were partially purified from the bulk of DNA by affinity chromatography using either an mRNAov-phosphocellulose or a cDNAov-phosphocellulose exchanger, prepared as described by Shih and Martin (58). After repeated chromatography (59), the resultant DNA fraction bound to the mRNAov affinity column hybridized with

FIG. 3. Hybridization of various DNA fractions from repeated mRNAov or cDNAov affinity chromatography to 125I-labeled mRNAov (panel A) and [3H]cDNAov (panel B), respectively (abscissa: Cot). DNA bound to the affinity resins was eluted at denaturation temperature, diluted with hybridization buffer and rechromatographed. The procedure was repeated once again, and the DNA bound to the resins was used for hybridization (O-O); total chick DNA sheared to a mean length of 5000 base pairs (A-A); and DNA not bound to the affinity resins (O-O).

125I-mRNAov with a Cot1/2 value of approximately 1.25 (Fig. 3). By comparing this value with the Cot1/2 value of 12,000 observed in the hybridization reaction using 125I-mRNAov and total chick DNA, we can calculate that a 9600-fold enrichment of the coding strand of the ovalbumin gene had been accomplished. Similarly, the DNA fraction bound to the cDNAov affinity column hybridized with [3H]cDNAov with a Cot1/2 value of approximately 0.23 (Fig. 3). Since total DNA reacted with a value of about 2300, a 10,000-fold enrichment of the anticoding strand


had also been effected. Upon subsequent reannealing, a double-stranded ovalbumin gene with sequences adjacent to the coding portion of the gene was obtained. It is hoped that sequences located in tandem to the 3’ end of the natural structural gene will play an important regulatory role in the expression of this gene. Amplification of these DNA preparations using bacterial plasmids will now enable us to use large quantities of the natural ovalbumin gene to purify putative chromosomal regulatory proteins by affinity chromatography. Our eventual goal will be to reconstitute a “minichromosome” containing the regulatory and structural elements of the ovalbumin gene (DNA plus nonhistone proteins) and to study the interaction between this “minichromosome” and RNA polymerase and pure steroid hormone-receptor complexes. Such studies should lead to a definitive description of the molecular mechanism of steroid hormone action and the regulation of gene expression in eukaryotes.
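
The enrichment factors quoted above for the affinity-purified strands are ratios of half-hybridization Cot values: for a given labeled probe, a DNA preparation enriched n-fold in the complementary sequence should reach half-hybridization at a Cot roughly n-fold lower than total DNA does. A minimal sketch of that arithmetic follows, using only the values given in the text. The function name is hypothetical, and the simple ratio assumes that both hybridizations were run under comparable conditions.

```python
def fold_enrichment(cot_half_total_dna, cot_half_fraction):
    """Ratio of half-hybridization Cot values, total DNA over purified fraction."""
    if cot_half_fraction <= 0:
        raise ValueError("Cot values must be positive")
    return cot_half_total_dna / cot_half_fraction


# Coding strand: 125I-mRNAov probe against total DNA vs. mRNAov-column fraction.
coding_strand = fold_enrichment(12_000, 1.25)

# Anticoding strand: [3H]cDNAov probe against total DNA vs. cDNAov-column fraction.
anticoding_strand = fold_enrichment(2_300, 0.23)

print(f"coding strand enrichment:     {coding_strand:,.0f}-fold")
print(f"anticoding strand enrichment: {anticoding_strand:,.0f}-fold")
```

The two ratios reproduce the 9600-fold and 10,000-fold figures stated in the text.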

REFERENCES

1. A. A. Moscona and R. Piddington, BBA 121, 409 (1966).
2. L. Reif-Lehrer and H. Amos, BJ 106, 425 (1968).
3. R. J. Schwartz, Nature NB 237, 121 (1972).
4. P. Karlson and C. E. Sekeris, Rec. Prog. Horm. Res. 22, 473 (1966).
5. J. D. Wilson and I. Lasnitzki, Endocrinology 89, 659 (1971).
6. J. D. Wilson, New Engl. J. Med. 287, 1284 (1972).
7. V. J. DeFeo, in “Cell Biology of the Uterus” (R. M. Wynn, ed.), p. 191. Appleton, New York, 1966.
8. A. R. Means and B. W. O’Malley, Metabolism 21, 357 (1972).
9. R. W. Turkington, C. C. Majumder, N. Kadohama, J. H. MacIndoe and W. L. Frantz, Rec. Prog. Horm. Res. 29, 417 (1973).
10. B. W. O’Malley, W. L. McGuire, P. O. Kohler and S. G. Korenman, Rec. Prog. Horm. Res. 25, 105 (1969).
11. B. W. O’Malley and A. R. Means, Science 183, 610 (1974).
12. F. Jacob and J. Monod, JMB 3, 318 (1961).
13. G. M. Tomkins, T. D. Gelehrter, D. Granner, D. Martin, Jr., H. H. Samuels and E. G. Thompson, Science 166, 1474 (1969).
14. A. R. Means and T. H. Hamilton, PNAS 56, 1549 (1966).
15. T. H. Hamilton, Science 161, 649 (1968).
16. C. S. Teng and T. H. Hamilton, PNAS 60, 1410 (1968).
17. R. H. Church and B. J. McCarthy, BBA 199, 103 (1970).
18. S. H. Socher and B. W. O’Malley, Dev. Biol. 30, 411 (1973).
19. R. Oka and R. T. Schimke, J. Cell Biol. 41, 816 (1969).
20. B. W. O’Malley, W. L. McGuire and S. G. Korenman, BBA 145, 204 (1967).
21. P. O. Kohler, P. M. Grimley and B. W. O’Malley, J. Cell Biol. 40, 8 (1969).
22. R. D. Palmiter, R. Oka and R. T. Schimke, JBC 246, 724 (1971).
23. A. R. Means, I. B. Abrass and B. W. O’Malley, Bchem 10, 1561 (1971).
24. A. R. Means and B. W. O’Malley, Bchem 10, 1570 (1971).
25. S. G. Korenman and B. W. O’Malley, Endocrinology 83, 11 (1968).
26. B. W. O’Malley, Bchem 6, 2546 (1967).
27. B. W. O’Malley, W. L. McGuire and P. A. Middleton, Endocrinology 81, 677 (1967).
28. W. L. McGuire and B. W. O’Malley, BBA 157, 187 (1968).
29. S. G. Korenman and B. W. O’Malley, BBA 140, 174 (1967).
30. G. C. Rosenfeld, J. P. Comstock, A. R. Means and B. W. O’Malley, BBRC 47, 387 (1972).
31. B. W. O’Malley, G. C. Rosenfeld, J. P. Comstock and A. R. Means, Nature 240, 45 (1972).
32. A. R. Means, J. P. Comstock, G. C. Rosenfeld and B. W. O’Malley, PNAS 69, 1146 (1972).
33. J. P. Comstock, G. C. Rosenfeld, B. W. O’Malley and A. R. Means, PNAS 69, 2377 (1972).
34. L. Chan, A. R. Means and B. W. O’Malley, PNAS 70, 1870 (1973).
35. R. E. Rhoads, B. S. McKnight and R. T. Schimke, JBC 246, 7407 (1972).
36. S. Y. Lee, J. Mendecki and G. Brawerman, PNAS 68, 1331 (1971).
37. A. R. Means, S. L. C. Woo, S. E. Harris and B. W. O’Malley, Cell. Mol. Biochem. 7, 33 (1975).
38. B. W. O’Malley, S. L. C. Woo, S. E. Harris, J. M. Rosen and A. R. Means, J. Cell. Physiol. 85, 343 (1975).
39. H. M. Temin and S. Mizutani, Nature 226, 1211 (1970).
40. D. Baltimore, Nature 226, 1309 (1970).
41. M. Edmonds, M. H. Vaughan, Jr., and H. Nakazato, PNAS 68, 1336 (1971).
42. J. M. Rosen, S. L. C. Woo, J. W. Holder, A. R. Means and B. W. O’Malley, Bchem 14, 69 (1975).
43. S. L. C. Woo, S. E. Harris, J. M. Rosen, L. Chan, P. Sperry, A. R. Means and B. W. O’Malley, Prep. Biochem. 4, 555 (1974).
44. S. L. C. Woo, J. M. Rosen, C. D. Liarakos, D. L. Robberson, Y. C. Choi, H. Busch, A. R. Means and B. W. O’Malley, JBC 250, 7027 (1975).
45. S. E. Harris, A. R. Means, W. M. Mitchell and B. W. O’Malley, PNAS 70, 3776 (1973).
46. J. J. Monahan, S. E. Harris, S. L. C. Woo, D. L. Robberson and B. W. O’Malley, Bchem 15, 225 (1976).
47. S. E. Harris, J. M. Rosen, A. R. Means and B. W. O’Malley, Bchem 14, 2072 (1975).
48. S. E. Harris, R. J. Schwartz, A. K. Roy, M.-J. Tsai and B. W. O’Malley, JBC 251, 524 (1976).
49. M.-J. Tsai, R. J. Schwartz, S. Y. Tsai and B. W. O’Malley, JBC 250, 5164 (1975).
50. R. J. Schwartz, M.-J. Tsai, S. Y. Tsai and B. W. O’Malley, JBC 250, 5175 (1975).
51. S. Y. Tsai, M.-J. Tsai, R. J. Schwartz, M. Kalimi, J. H. Clark and B. W. O’Malley, PNAS 72, 4228 (1975).
52. N. T. Van, J. W. Holder, S. L. C. Woo, A. R. Means and B. W. O’Malley, Bchem (1976). In press.
53. S. E. Harris, R. J. Schwartz, M.-J. Tsai, B. W. O’Malley and A. K. Roy, JBC 251, 524 (1976).
54. R. W. Kuhn, W. T. Schrader, R. G. Smith and B. W. O’Malley, JBC 250, 4220 (1975).
55. W. T. Schrader, S. S. Heuer and B. W. O’Malley, Biol. Reprod. 12, 134 (1975).
56. W. A. Coty, W. T. Schrader and B. W. O’Malley, JBC (1976). In press.
57. B. W. O’Malley, S. L. C. Woo, J. J. Monahan, L. McReynolds, S. E. Harris, M.-J. Tsai, S. Y. Tsai and A. R. Means, in “Molecular Mechanisms in the Control of Gene Expression” (D. P. Neerlich, W. J. Rutter and C. F. Fox, eds.), Academic Press, New York, 1976.
58. T. Y. Shih and M. W. Martin, Bchem 13, 3411 (1974).
59. T. Y. Shih and M. W. Martin, PNAS 70, 1697 (1973).
60. S. L. C. Woo, R. G. Smith, A. R. Means and B. W. O’Malley, JBC (1976). In press.
61. S. L. C. Woo, J. J. Monahan and B. W. O’Malley, PNAS (1976). In press.


Nonhistone Chromosomal Proteins and Histone Gene Transcription

GARY STEIN, JANET STEIN, LEWIS KLEINSMITH, WILLIAM PARK, ROBERT JANSING AND JUDITH THOMSON

University of Florida, Department of Biochemistry and Molecular Biology, Gainesville, Florida and Division of Biological Sciences, University of Michigan, Ann Arbor, Michigan

I. Introduction

Throughout the cell cycle of continuously dividing cells, as well as after the stimulation of nondividing cells to proliferate, a complex and interdependent series of biochemical events occurs, requiring modifications in the expression of information encoded in the genome. Hence, the cell cycle provides an effective model system for studying the regulation of gene readout. For the past several years, our laboratory has been focusing on the cell-cycle, stage-specific regulation of a defined set of genetic sequences: those coding for the histones. In the present article, several lines of evidence are presented suggesting that, in continuously dividing cells as well as after stimulation of nondividing cells to proliferate, (a) regulation of histone gene expression resides, at least in part, at the transcriptional level, and (b) a subset of the nonhistone chromosomal proteins associated with the genome during the S-phase of the cell cycle is responsible for activation of histone gene transcription when DNA replication occurs.

II. Evidence for Transcriptional Regulation of Histone Gene Expression in Continuously Dividing HeLa S3 Cells

A. Hybridization Analysis of Histone Messenger RNA Association with Polyribosomes during the Cell Cycle (1)

A functional relationship between histone synthesis and DNA replication is suggested by the fact that, in many biological systems, the synthesis of these proteins and their deposition on the DNA is restricted to the S-phase of the cell cycle (2-4). Further support for the coupling of histone and DNA synthesis comes from the observation that inhibition of DNA replication results in a rapid shutdown of histone synthesis (2, 4-6). It has been shown, utilizing cell-free protein-synthesizing systems derived from reticulocytes, Ehrlich ascites cells, and HeLa cells, that the RNA isolated from the polyribosomes of S-phase HeLa cells supports the synthesis of histones while the RNA from polysomes of G1-phase cells or of S-phase cells treated with inhibitors of DNA synthesis does not (7-10). These findings indicate that translatable histone mRNAs are associated with polyribosomes of HeLa cells exclusively during the S-phase of the cell cycle. However, the possibility still exists that histone messenger RNAs are components of the polyribosomes during other periods of the cell cycle, but have in some way been rendered nontranslatable. Such a possibility would have important implications for the mechanism operative in the regulation of histone gene expression.

Therefore, to establish that histone messenger RNA sequences are associated with polyribosomes only during S-phase, we examined G1, S and G2 polyribosomal RNAs from synchronized HeLa cells for their ability to hybridize with histone cDNA (1). Isolation of histone mRNAs from polyribosomes of S-phase HeLa S3 cells as well as synthesis of the 3H-labeled complementary DNA using [3H]dCTP and [3H]dGTP were carried out as previously described (11, 12). Poly(A) was added to the 3'-OH termini of the histone messenger RNAs with an ATP-polynucleotidylexotransferase isolated from maize seedlings (13), and the polyadenylylated mRNAs were then transcribed with RNA-dependent DNA polymerase from avian myeloblastosis virus or Rous sarcoma virus using dT10 as a primer. The characteristics of the histone cDNA probe have been reported (11, 12, 14).
The extent of hybrid formation between histone cDNA and total polysomal RNA of G1, S and G2 cells is compared in Fig. 1. The hybridization observed between S-phase polyribosomal RNA and the cDNA indicates the presence of histone-specific sequences associated with the polyribosomes of S-phase cells. In contrast, the absence of G1 polyribosomal RNA hybridization demonstrates that histone mRNA sequences are not components of G1 polyribosomes. Comparison of the kinetics of the hybridization reaction between S-phase polyribosomal RNA and cDNA (Crot1/2 = 1.8) with those of histone mRNA and cDNA (Crot1/2 = 1.7 × 10⁻²) indicates that histone mRNA sequences account for 0.9% of the S-phase total polysomal RNA. This figure is consistent with the in vivo situation where approximately 10-12% of the protein synthesis in S-phase HeLa cells is histone synthesis (2). Additionally, the complete absence of hybrid formation between G1 polysomal RNA and histone cDNA establishes the absence of ribosomal RNA (5 S, 18 S and 28 S) and tRNA sequences in the histone mRNA preparations, as well as in the cDNA probe.

FIG. 1. [3H]cDNA (27,000 dpm/ng) and unlabeled RNA were hybridized at 52°C in sealed glass capillary tubes containing, in a volume of 15 μl, 50% formamide, 0.5 M NaCl, 25 mM Hepes (pH 7.0), 1 mM EDTA, 0.04 ng cDNA and 3.75 or 7.5 μg of polysomal RNA from G1 (*), S (0) or G2 (0) HeLa S3 cells. Crot = moles of ribonucleotide × sec × liter⁻¹. Samples were removed at various times and incubated for 20 minutes in 2.0 ml of 30 mM sodium acetate, 0.3 M NaCl, 1 mM ZnSO4, 5% glycerol (pH 4.6) containing S1 nuclease at a concentration sufficient to degrade at least 96% of the single-stranded nucleic acids present. The amount of radioactive DNA resistant to digestion was determined by trichloroacetic acid precipitation. S- and G2-phase cells were obtained by synchronization with 2 cycles of 2 mM thymidine block. S-phase cells were harvested 3 hours after release from the second thymidine block, at which time 98% of the cells were in S-phase. G2 cells were harvested 7.5 hours after release from thymidine. G1 cells were obtained 3 hours after selective detachment of mitotic cells from semiconfluent monolayers; 97% of the cells were in the G1-phase of the cell cycle and S-phase cells were not detected. Polyribosomal RNA was isolated as reported (1). From Stein et al. (12).

Determination of the presence or the absence of histone mRNA sequences on G2 polysomes is complex. The kinetics of the hybridization reaction between G2 polyribosomal RNA and histone cDNA (Crot1/2 = 8.5) suggest that the amount of histone mRNA sequences present on the polyribosomes of G2 cells is 21% of that present on S-phase polyribosomes. However, the data in Fig. 2 clearly indicate that 20% of the G2 cell population consists of cells undergoing DNA replication. It is therefore reasonable to conclude that the histone mRNA sequences present in the G2 polyribosomal RNA are due to the presence of S-phase cells in the G2 cell population. This implies that histone mRNA sequences are not associated with polyribosomes during the G2-phase of the cell cycle. Unfortunately, no effective methodologies are available for obtaining a pure population of G2-phase HeLa S3 cells to establish this point definitively.

FIG. 2. Percentage of cells in DNA synthesis and mitotic index at various times following release of HeLa S3 cells from 2 cycles of 2 mM thymidine block. Cells were labeled with 5 μCi/ml of [3H]thymidine for 15 minutes and the percentage of cells in DNA synthesis was determined autoradiographically. The mitotic index was determined from the autoradiographic preparations. From Stein et al. (12).

These results indeed demonstrate that, in HeLa cells, histone mRNA sequences are associated with polyribosomes only during the S-phase of the cell cycle. It therefore follows that regulation of histone gene expression in this system does not reside at the translational level, and transcriptional control is strongly implied. However, this type of regulation of histone gene expression may not be universal. For example, there is evidence that during the early stages of embryonic development, control of histone synthesis may be mediated, at least in part, posttranscriptionally (15, 16).
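The abundance estimates quoted in this section follow from ratios of Crot1/2 values. The sketch below reproduces that arithmetic, assuming that under RNA excess the Crot1/2 of a mixture scales inversely with the fraction of probe-complementary sequences it contains. The function and variable names are illustrative only and do not come from the original study.

```python
def sequence_fraction(crot_half_reference, crot_half_sample):
    """Estimate the abundance of hybridizing sequences in a sample
    relative to a reference RNA.

    Assumption (not spelled out in the text): Crot1/2 is inversely
    proportional to the abundance of the hybridizing sequence, so the
    two reactions differ only by a dilution factor.
    """
    if crot_half_reference <= 0 or crot_half_sample <= 0:
        raise ValueError("Crot1/2 values must be positive")
    return crot_half_reference / crot_half_sample


# Crot1/2 values quoted in the text
CROT_HALF_HISTONE_MRNA = 1.7e-2  # purified histone mRNA vs. cDNA
CROT_HALF_S_POLYSOMAL = 1.8      # S-phase polysomal RNA vs. cDNA
CROT_HALF_G2_POLYSOMAL = 8.5     # G2 polysomal RNA vs. cDNA

# Fraction of S-phase polysomal RNA that is histone mRNA
s_fraction = sequence_fraction(CROT_HALF_HISTONE_MRNA, CROT_HALF_S_POLYSOMAL)

# Histone mRNA on G2 polysomes relative to S-phase polysomes
g2_relative_to_s = sequence_fraction(CROT_HALF_S_POLYSOMAL, CROT_HALF_G2_POLYSOMAL)

print(f"histone mRNA in S-phase polysomal RNA: {s_fraction:.1%}")     # 0.9%
print(f"G2 histone mRNA relative to S phase:   {g2_relative_to_s:.0%}")  # 21%
```

The second ratio (about 21%) is close to the 20% of the G2 population still replicating DNA in Fig. 2, which is the basis for attributing the G2 signal to contaminating S-phase cells.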

B. Cell-Cycle, Stage-Specific Transcription of Histone Genes (14, 17)

To ascertain directly that the genes containing the information for histone synthesis are transcribed during a restricted period of the cell cycle in continuously dividing HeLa S3 cells, chromatin from G1- and S-phase cells was transcribed in a cell-free system, the RNAs were isolated, and their ability to form S1 nuclease-resistant, acid-precipitable hybrids with histone [3H]cDNA was determined. The kinetics of this hybridization are shown in Fig. 3. While transcripts from S-phase chromatin hybridized with histone cDNA at a Crot1/2 of 0.2 compared with a Crot1/2 of 0.017 for the histone mRNA-cDNA reaction, there is no evidence of hybrid formation between histone cDNA and G1 transcripts even at a Crot of 100. The maximal level of hybrid formation (65%) between histone cDNA and S-phase transcripts was the same as that observed between histone cDNA and histone mRNA. Fidelity of the hybrids formed between histone cDNA and transcripts from S-phase chromatin is suggested by the fact that the tm of these hybrids is identical to the tm of histone mRNA-cDNA hybrids (65°C in 50% formamide/0.5 M NaCl/25 mM Hepes (pH 7.0)/1 mM EDTA). It should be noted that the tm obtained under these conditions is consistent with an RNA-DNA hybrid having a (G + C)-content of 54%, the nucleotide composition of histone messenger RNA reported by Adesnik and Darnell (18).

FIG. 3. Kinetics of annealing of histone cDNA to in vitro transcripts of chromatin from G1- and S-phase HeLa S3 cells. [3H]cDNA (0.04 ng; 27,000 dpm/ng) was annealed at 52°C to either 0.15 or 1.5 μg of RNA transcripts from G1- (A) or S-phase (0) chromatin. cDNA (0.04 ng) was also annealed to 1.5 μg of E. coli RNA isolated in the presence of S-phase chromatin (W). E. coli RNA was included in each reaction mixture so that the final amount of RNA was 3.5 μg. Chromatin was isolated as reported (2). From Stein et al. (12).
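The legend to Fig. 1 defines Crot as moles of ribonucleotide × sec × liter⁻¹. The sketch below works through that definition for the Fig. 1 reaction conditions (3.75 μg of RNA in 15 μl). It assumes an average ribonucleotide residue mass of about 340 g/mol, a conventional approximation that is not given in the text, and it ignores any correction for salt or formamide concentration. All names are hypothetical.

```python
AVG_NUCLEOTIDE_MW = 340.0  # g/mol; assumed average mass of one ribonucleotide residue


def crot(rna_micrograms, volume_microliters, seconds,
         nucleotide_mw=AVG_NUCLEOTIDE_MW):
    """Return Crot in (mol ribonucleotide) x sec / liter."""
    moles = rna_micrograms * 1e-6 / nucleotide_mw   # mol of nucleotide
    liters = volume_microliters * 1e-6              # reaction volume
    return moles / liters * seconds


def hours_to_reach(target_crot, rna_micrograms, volume_microliters,
                   nucleotide_mw=AVG_NUCLEOTIDE_MW):
    """Incubation time (hours) needed to reach a given Crot."""
    crot_per_second = crot(rna_micrograms, volume_microliters, 1.0, nucleotide_mw)
    return target_crot / crot_per_second / 3600.0


# Fig. 1 conditions: 3.75 ug RNA in a 15 ul reaction
concentration = crot(3.75, 15.0, 1.0)  # numerically the molar nucleotide concentration
hours_for_crot_100 = hours_to_reach(100.0, 3.75, 15.0)

print(f"nucleotide concentration: {concentration:.2e} mol/liter")
print(f"time to reach Crot = 100: {hours_for_crot_100:.1f} hours")
```

Under these assumptions a Crot of 100, the highest value at which G1 transcripts were tested, corresponds to an incubation of roughly a day and a half.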


RNA synthesized in intact cells may remain associated with chromatin during isolation and in part account for hybrid formation between in vitro RNA transcripts and complementary DNAs for specific genes. It is possible that the extent to which this phenomenon occurs varies significantly with the tissue or cell and the method of chromatin preparation. To determine whether such endogenous RNA sequences account for histone-specific sequences detected in transcripts from S-phase chromatin, the following control was executed. S-phase chromatin was placed in the in vitro transcription mixture without RNA polymerase, and an amount of E. coli RNA equivalent to the amount of RNA transcribed from S-phase chromatin was added. RNA was immediately extracted by the same procedure utilized for the isolation of in vitro RNA transcripts. When this control RNA was annealed with histone cDNA, no significant level of hybridization was observed (Fig. 3). Additionally, RNA isolated from S-phase chromatin in the absence of carrier RNA shows no hybrid formation with the histone cDNA. These results establish that endogenous histone-specific sequences associated with S-phase chromatin are not contributing significantly to the hybridization observed with S-phase in vitro transcripts. It is therefore reasonable to conclude that the histone sequences present in S-phase transcripts can be accounted for by in vitro synthesis.

If purified histone mRNA equivalent to the amount of histone mRNA sequences transcribed from S-phase chromatin is added to the transcription mixture of G1 chromatin at the beginning of the incubation, the mixture of G1 transcripts and histone mRNA subsequently isolated hybridizes with histone cDNA with the expected Crot1/2 (0.2) (19). This result suggests that the absence of histone mRNA sequences among RNA transcripts from G1 chromatin is not attributable to a specific nuclease associated with chromatin during the G1-phase of the cell cycle. The possibility that histone sequences are present in G1 transcripts but are not detected because they are in a double-stranded form due to symmetric transcription is unlikely, since heating the hybridization mixture to 100°C for 10 minutes before incubation has no effect on the hybridization of [3H]cDNA to the transcripts (19).

The results from these in vitro transcription studies clearly indicate that histone sequences are available for transcription during S-phase and not during G1. Such findings are consistent with the restriction of histone synthesis to the S-phase of the cell cycle and the association of histone messenger RNAs with polysomes only during S-phase. Taken together this evidence suggests that, in continuously dividing HeLa S3 cells, the expression of histone genes is regulated, at least in part, at the transcriptional level, and that the readout of these genetic sequences occurs only during the period of DNA replication.


III. Regulation of Histone Gene Transcription in Continuously Dividing HeLa S3 Cells by Nonhistone Chromosomal Proteins (14, 19, 20)

Although evidence has been presented (2, 21-30) strongly suggesting that among the nonhistone chromosomal proteins there are macromolecules responsible for the regulation of transcription during the cell cycle, it is primarily of a correlative nature. To examine directly the involvement of nonhistone chromosomal proteins in the control of cell-cycle stage-specific gene readout, chromatin isolated from G1- and S-phase cells was dissociated, fractionated and reconstituted as outlined in Fig. 4. In vitro RNA transcripts from chromatin reconstituted with G1 nonhistone chromosomal proteins and from chromatin reconstituted with S-phase nonhistone chromosomal proteins were annealed with histone [3H]cDNA. Figure 5 indicates that RNA transcripts from chromatin reconstituted with S-phase nonhistone chromosomal proteins hybridize with histone cDNA, while those from chromatin reconstituted with G1 nonhistone chromosomal proteins do not exhibit a significant degree of hybrid formation. It should be emphasized that the kinetics and extent of hybridization with the cDNA are the same for transcripts of native S-phase chromatin and transcripts of chromatin reconstituted with S-phase nonhistone chromosomal proteins (Fig. 5). Furthermore, the amounts of RNA transcribed and the recovery during isolation of these transcripts from native and reconstituted chromatin preparations are essentially identical.

FIG. 4. Flow diagram of experimental protocol for chromatin reconstitution. From Stein et al. (12). [Diagram labels recoverable from the original: nonhistone proteins; gradient dialysis; chromatin reconstituted with nonhistone chromosomal proteins from S-phase cells; nonhistone chromosomal proteins from G1 cells.]

FIG. 5. Kinetics of annealing of histone cDNA to in vitro transcripts from native and reconstituted chromatin. [3H]cDNA (0.04 ng) was annealed at 52°C with either 0.15 μg or 1.5 μg of RNA transcripts from chromatin reconstituted with S-phase nonhistone chromosomal proteins (0), chromatin reconstituted with G1 nonhistone chromosomal proteins (w), native S-phase chromatin ( ) and native G1 chromatin (0). E. coli RNA was included in each reaction mixture so that the final amount of RNA was 3.5 μg. From Stein et al. (12).

These results clearly imply a functional role for nonhistone chromosomal proteins in regulating the availability of histone sequences for transcription during the cell cycle. Such a regulatory role for nonhistone chromosomal proteins is in agreement with results from several laboratories indicating that these proteins are responsible for the tissue-specific transcription of globin genes (31-33). However, the present results represent the first demonstration that nonhistone chromosomal proteins regulate the transcription of genes that are transiently expressed.

An important question that then arises is whether the difference in the in vitro transcription of histone genes from G1- and S-phase chromatin is due to an activator of histone gene transcription present in the S-phase nonhistone chromosomal proteins, or alternatively to a specific repressor of histone gene transcription present among the G1 nonhistone chromosomal proteins. If the difference in histone gene activity of G1- and S-phase chromatin were due to an activator which is present or operative only in S-phase, one would anticipate that dissociation of G1 chromatin with high salt and urea, followed by reconstitution in the presence of S-phase nonhistone chromosomal protein, would result in an increase in the availability of histone genes for transcription. One would not anticipate any major effect on histone gene transcription if S-phase chromatin were reconstituted in the presence of G1-phase nonhistone chromosomal proteins. In contrast, if the difference in histone gene expression in G1- and S-phase chromatin can be accounted for by a repressor of histone gene expression associated with chromatin during the G1-phase of the cell cycle, one would anticipate that dissociation of S-phase chromatin, followed by reconstitution in the presence of increasing amounts of G1 nonhistone chromosomal proteins, would result in a progressive decrease in the availability of histone genes for transcription. If the latter alternative prevails, the presence of S-phase chromosomal proteins during reconstitution would not be expected to affect significantly the expression of histone genes from G1 chromatin. If the regulation of histone genes involves both repressors and activators acting in an antagonistic manner, one would anticipate a more complex, intermediate result.

As shown in Fig. 6, when G1 chromatin is dissociated and then reconstituted in the presence of increasing amounts of S-phase nonhistone chromosomal proteins, hybrid formation between transcripts from these chromatins and histone cDNA is seen at progressively lower Crot values, indicating a dose-dependent activation of the histone genes of the G1 chromatin by the S-phase nonhistone chromosomal proteins.
It can be seen that the histone genes from G1 chromatin can be activated to approximately the same degree as in native S-phase chromatin by comparing the kinetics of the hybridization of histone cDNA with transcripts from S-phase chromatin (Crot1/2 = 0.2) and the kinetics of the hybridization of histone cDNA with transcripts from G1 chromatin reconstituted with a 1:1 ratio of S-phase nonhistone chromosomal protein to DNA (Crot1/2 = 0.3). The fidelity of the hybrids formed between the transcripts and histone cDNA as well as the validity of comparing Crot1/2 values is suggested by the fact that the tm of the hybrids in all cases is identical to the tm of the hybrids formed between histone mRNA and histone cDNA. Also, the maximal hybridization as estimated by a double-reciprocal plot is equal in all cases to that of the histone mRNA-cDNA hybridization reaction (65%). In contrast, when G1 chromatin is dissociated and then reconstituted in the presence of S-phase histones, even at a 1:1 ratio of S-phase histone to DNA, a significant stimulation of the transcription of histone genes is not observed. It should be noted that there were no significant differences among the various chromatin preparations in the yield or recovery of RNA during isolation, even though the presence of S-phase nonhistone chromosomal proteins during reconstitution could cause a greater than 1000-fold stimulation in the amount of histone sequences transcribed from G1 chromatin. This apparent stimulation of histone gene transcription is not observed when G1 chromatin is dissociated and then reconstituted in the presence of additional G1 chromosomal protein, even at a 1:1 ratio of additional G1 protein to DNA (Fig. 6).

FIG. 6. Kinetics of annealing of histone cDNA to in vitro transcripts from G1 chromatin reconstituted in the presence of various amounts of S-phase nonhistone chromosomal proteins. [3H]cDNA (0.04 ng) was annealed to RNA transcripts from G1 chromatin reconstituted in the presence of 0 (x), 0.01 (A), 0.10 (A) or 1.00 (0) mg of S-phase nonhistone chromosomal protein per milligram of G1 chromatin DNA. cDNA (0.04 ng) was also annealed to RNA transcripts from G1 chromatin reconstituted in the presence of 1.00 mg of G1 total chromosomal protein per milligram of G1 chromatin DNA (m) and RNA transcripts from chromatin isolated from S-phase cells (0). E. coli RNA was included in each reaction mixture so that the total amount of RNA was 3.75 μg. [Abscissa: log Crot.]

To eliminate the possibility that a small amount of nucleic acid present in the S-phase chromosomal proteins is responsible for the observed hybridization, either by containing histone sequences or by having the ability to render the histone genes transcribable, the residual nucleic acid was removed from the S-phase chromosomal proteins by buoyant-density centrifugation in cesium chloride and urea. As shown in Fig. 7, there is no significant difference in the kinetics of hybridization with histone cDNA of transcripts from G1 chromatin reconstituted with equal amounts of either CsCl-treated S-phase chromosomal proteins or untreated S-phase chromosomal proteins.

FIG. 7. Kinetics of annealing of histone cDNA to in vitro transcripts from G1 chromatin reconstituted in the presence of S-phase total chromosomal protein from which nucleic acid has been removed by centrifugation in 0.41 mg/ml CsCl/5 M urea/10 mM Tris-Cl (pH 8.3) in an SW 50.1 rotor at 35,000 rpm for 48 hours at 4°C. [3H]cDNA (0.04 ng) was annealed to RNA transcripts from G1 chromatin reconstituted in the presence of 1.00 mg of CsCl-treated S-phase total chromosomal protein (0) or 1.00 mg of untreated S-phase total chromosomal protein (0) per milligram of G1 chromatin DNA. E. coli RNA was added to each reaction mixture so that the total amount of RNA was 3.75 μg. [Abscissa: log Crot.]

In order to determine whether G1 chromatin contains an inhibitor of histone gene transcription that is degraded or inactivated as cells progress from the G1- to the S-phase of the cell cycle, chromatin from S-phase cells was dissociated and then reconstituted in the presence of total chromosomal proteins from G1-phase cells. The ability of transcripts from this reconstituted chromatin preparation to hybridize with histone cDNA was then determined. As shown in Fig. 8, the presence of G1 total chromosomal proteins, even at a 1:1 ratio of G1 total chromosomal protein to DNA, does not significantly inhibit histone gene transcription from S-phase chromatin. This is not to say that there is nothing in G1 chromosomal proteins that can inhibit histone gene transcription. [As we have reported elsewhere (20), histones inhibit the transcription of histone genes from naked DNA, although to the same degree to which they inhibit total RNA synthesis.] Rather, there is nothing in the G1 chromosomal proteins that can inhibit in vitro histone gene transcription in the presence of S-phase chromosomal proteins. This suggests that any additional specific repressor of histone gene expression is lost during isolation, dissociation, fractionation or reconstitution, or that any inhibition of histone gene transcription by G1 chromosomal proteins can be overridden by S-phase chromosomal proteins. Similar results are obtained when S-phase chromatin is dissociated and then reconstituted in the presence of G1-phase nonhistone chromosomal proteins. Again the tm of the hybrids formed and the maximal hybridization are the same as seen in the histone mRNA-cDNA reaction.

FIG. 8. Kinetics of annealing of histone cDNA to in vitro transcripts from S-phase chromatin reconstituted in the presence of G1-phase total chromosomal proteins. [3H]cDNA (0.04 ng) was annealed to RNA transcripts from S-phase chromatin reconstituted in the presence of 0.10 (A) or 1.00 (0) mg of G1-phase total chromosomal proteins per milligram of S-phase DNA. cDNA was also annealed to transcripts from native S-phase chromatin (0). E. coli RNA was included in each reaction mixture such that the total amount of RNA was 3.75 μg. [Abscissa: log Crot.]

These results provide support for the contention that the difference in the in vitro transcription of histone genes from G1- and S-phase chromatin is due to the nonhistone chromosomal protein portion of the genome. Further, this difference can be accounted for by a component (or components) of the S-phase nonhistone chromosomal proteins that has the ability to render the histone genes of G1-phase chromatin available for transcription in a dose-dependent fashion. These results do not indicate which component (or components) of the S-phase nonhistone chromosomal proteins is responsible for the observed activation or by what mechanism the activation is achieved, but they do provide an assay by which this histone gene activator can be purified and characterized.
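The text states that maximal hybridization was estimated from a double-reciprocal plot but does not give the procedure. As a rough sketch of one way such an estimate can be made, the code below assumes a hyperbolic saturation curve, H = Hmax · Crot / (Crot1/2 + Crot), which is the form that a double-reciprocal plot linearizes; this functional form, the helper names, and the data points are all assumptions for illustration. The data are synthetic and noise-free, generated from the 65% plateau and the Crot1/2 of 0.3 quoted above, and are not measurements from the study.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def double_reciprocal_estimate(crot_values, percent_hybridized):
    """Estimate the plateau and Crot1/2 from a plot of 1/H against 1/Crot.

    For H = Hmax * C / (K + C):  1/H = 1/Hmax + (K / Hmax) * (1/C),
    so the intercept gives 1/Hmax and slope * Hmax gives K (= Crot1/2).
    """
    inv_c = [1.0 / c for c in crot_values]
    inv_h = [1.0 / h for h in percent_hybridized]
    slope, intercept = linear_fit(inv_c, inv_h)
    plateau = 1.0 / intercept
    return plateau, slope * plateau


# Synthetic illustration only (not data from the study)
TRUE_PLATEAU = 65.0    # percent hybridized at saturation, as quoted in the text
TRUE_CROT_HALF = 0.3   # Crot1/2 quoted for G1 chromatin + S-phase proteins (1:1)
crots = [0.05, 0.1, 0.3, 1.0, 3.0, 10.0]
hybridized = [TRUE_PLATEAU * c / (TRUE_CROT_HALF + c) for c in crots]

plateau, crot_half = double_reciprocal_estimate(crots, hybridized)
print(f"estimated plateau: {plateau:.1f}%   estimated Crot1/2: {crot_half:.2f}")
```

With real, noisy data the reciprocal transform weights the low-Crot points heavily, so an estimate of this kind is best read as approximate.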

IV. Regulation of Histone Gene Transcription Following Stimulation of Nondividing Cells to Proliferate (34)

To determine whether the mode of histone gene regulation observed in continuously dividing HeLa S3 cells is of broader biological relevance, we examined the control of histone gene expression following stimulation of nondividing WI-38 human diploid fibroblasts to proliferate. Confluent monolayers of WI-38 human diploid fibroblasts can be induced to proliferate by replacing exhausted growth medium with fresh medium containing 20% fetal calf serum (35, 36). The addition of serum to such cells triggers a complex and interdependent series of biochemical events (reviewed in 22). Activation of DNA synthesis as measured by incorporation of [3H]thymidine into DNA is evident at 10 hours after stimulation of WI-38 cells and reaches a maximum at 12 hours (Fig. 9a). The activation of DNA synthesis in WI-38 cells is supported by a similar (600-fold) increase in the percentage of nuclei labeled with [3H]thymidine as determined autoradiographically (Fig. 9b). An increase in mitotic activity is observed beginning at 20 hours (Fig. 9c). Concomitant with the activation of DNA synthesis there is a stimulation of histone synthesis. The tight coupling between histone synthesis and DNA replication in WI-38 cells is suggested by the rapid and complete shutdown of histone synthesis by inhibition of DNA replication (37).

FIG. 9. (a) DNA synthesis at various times (hours) following serum stimulation of WI-38 human diploid fibroblasts. Cells were labeled with [3H]thymidine. To determine the rate of DNA synthesis (cpm/μg DNA), cells were harvested and nuclei were isolated. Nuclei were washed twice with cold (4°C) 0.3 M perchloric acid and nucleic acids were extracted with hot (90°C) 1 M perchloric acid. The amount of DNA present in nucleic acid extracts was assayed by the diphenylamine reaction. Each point represents an average of at least four determinations, and the range of values does not exceed 5%. (b) Labeled nuclei per 1000 cells at various times (hours) following serum stimulation of WI-38 human diploid fibroblasts. Cells were labeled with [3H]thymidine. To determine the percentage of cells with [3H]thymidine-labeled nuclei, cells were harvested, smeared on acid-washed microscope slides and prepared for autoradiography. Autoradiographs were exposed for 14 days and stained with hematoxylin after development. The values of [3H]thymidine-labeled nuclei per 1000 cells were obtained by counting 2000 cells. Each value represents an average of four determinations, and the range of values did not exceed 7%. (c) Mitotic figures per 1000 cells at various times (hours) after serum stimulation of WI-38 human diploid fibroblasts. Colcemid was added 12 hours after serum stimulation, and at the indicated times cells were harvested, smeared on acid-washed microscope slides, fixed in alcohol/acetic acid (3:1) and stained with hematoxylin. The values for mitotic figures per 1000 cells were obtained by counting 2000 cells. Each point represents an average of at least four determinations, and the range of values did not exceed 7%. [Abscissa: time (hours).]

To determine the availability of histone genes for transcription as a function of time following stimulation of WI-38 cells to proliferate, we examined in vitro transcripts of chromatin from confluent WI-38 cells, from WI-38 cells during the prereplicative phase (1, 4 and 7 hours after stimulation) and from cells at 10 and 12 hours after stimulation (S phase). The presence of histone mRNA sequences was assayed by hybrid formation with histone cDNA. The kinetics of hybridization of the histone cDNA with RNA transcripts from chromatin of WI-38 cells at various times following serum stimulation are shown in Fig. 10. There is a significant increase in the rate of hybridization of histone cDNA to RNA transcripts 10 hours after stimulation (Crot1/2 = 1.0) with a maximal rate of hybridization observed at 12 hours (Crot1/2 = 4.0 × 10⁻¹). In contrast to the limited extent of hybrid formation between histone cDNA and RNA transcripts from chromatin of confluent cells and cells 1, 4 and 7 hours after stimulation (Crot1/2 = 180), the kinetics of the hybridization reaction of histone cDNA and RNA transcripts from S-phase (12-hour) chromatin revealed a 500-fold activation of histone mRNA sequence transcription following stimulation of WI-38 cells to proliferate. A comparison of the Crot1/2 values of the hybridization reactions between histone cDNA and RNA transcripts from chromatin as a function of stimulation to proliferate clearly demonstrates that activation of histone gene transcription parallels the onset of DNA synthesis in WI-38 fibroblasts (Fig. 9). The low level of hybridization between histone cDNA and RNA transcripts from chromatin of G1 and unstimulated cells is most likely attributable to the few proliferating cells that escape "contact inhibition" and hence continue to synthesize DNA and histones. This interpretation is supported by the observation that stimulation of semiconfluent WI-38 cells results in a time course and maximal level for activation of histone genes similar to that observed when confluent cells are stimulated. However, in these semiconfluent cells, an elevated level of histone gene transcription from chromatin is detected prior to stimulation and during the prereplicative period (Crot1/2 = 14). Control experiments were carried out to eliminate the possibility that endogenous RNAs associated with chromatin from S-phase (12-hour) cells account for hybrid formation of RNA transcripts with histone cDNA.

FIG. 10. Kinetics of annealing of histone [3H]cDNA to in vitro transcripts of chromatin from unstimulated (x) WI-38 cells, and WI-38 cells at 1 hour (O), 4 hours ( ), 7 hours (A), 10 hours (0) and 12 hours (0) after serum stimulation. Histone cDNA was also annealed to endogenous RNA isolated from S-phase chromatin (A). Annealing reactions were carried out in a volume of 15 μl and the extent of histone cDNA-histone mRNA hybrid formation was determined by S1 nuclease digestion.

The role of chromosomal proteins in regulating the transcription of histone genes was directly examined by a series of chromatin reconstitution experiments. To assess the involvement of nonhistone chromosomal proteins in rendering histone genes transcribable, chromatin from confluent WI-38 cells was dissociated and reconstituted in the presence of S-phase (12-hour) nonhistone chromosomal proteins.
RNA transcripts from the reconstituted chromatin were tested for ability to hybridize with histone ["H]cDNA. The data in Fig. 11 indicate that the Crotl/? of the hybridization reaction between histone cDNA and RNA transcripts from this reconstituted chromatin preparation ( Crotl/L'= 0.4) is indistinguishable from that of the hybridization reaction between histone cDNA and S-phase chromatin RNA transcripts. Transcription of histone mRNA sequences from chromatin of confluent WI-38 cells is unchanged following dissociation and reconstitution in the presence of the histone fraction of S-phase ( 12-hour) chromatin. These results suggest that nonhistone chromosomal proteins are responsible for determining the availability of histone genes for transcription in chromatin of WI-38 cells and that a component of the S-phase nonhistone chromosomal proteins serves to activate the transcriptoin of histone mRNA sequences. To ex-

NONHISTONE CHROMOSOMAL PROTEINS IN TRANSCRIPTION

437

Log Crot FIG. 11. Kinetics of annealing of histone cDNA to in vitro transcripts of reconstituted chromatin. ["HIcDNA was annealed to RNA transcripts from unstimulated chromatin rcconstituted in the presence of S-phase ( 12-hour) nonhistone proteins (0) and from S-phase chromatin reconstituted in the presence of total chromosomal proteins from unstimulated chromatin ( ). Hylxid formation was assayed using S1 nuclease.

amine the possibility that a component of the chromosomal proteins of confluent cells specifically restricts the availability of histone genes for transcription, S-phase ( 12-hour) chromatin was dissociated and then reconstituted in the presence of total chroniosomal proteins from confluent cells. Transcripts from such reconstituted chromatin preparations exhibit kinetics of hybridization with histone cDNA identical to those of native S-phase chromatin transcripts. A specific repressor of histone genes associated with chromatin of confluent WI-38 cells is therefore unlikely.
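The comparison of Cr0t1/2 values used above can be illustrated numerically. The following is a minimal sketch, assuming ideal pseudo-first-order kinetics (RNA in large excess over cDNA), under which the abundance of the hybridizing sequence is inversely proportional to Cr0t1/2. The function names are illustrative and not taken from the source; the values 180 and 0.4 are those quoted in the text.

```python
import math


def fraction_hybridized(crot, crot_half):
    """Fraction of cDNA in hybrid at a given Cr0t.

    Assumes ideal pseudo-first-order kinetics (RNA in excess), so that
    half of the cDNA is hybridized when Cr0t equals Cr0t1/2.
    """
    return 1.0 - math.exp(-math.log(2) * crot / crot_half)


def relative_abundance(crot_half_reference, crot_half_sample):
    """Fold enrichment of the hybridizing sequence in the sample
    relative to the reference (inverse ratio of Cr0t1/2 values)."""
    return crot_half_reference / crot_half_sample


# Values quoted in the text.
crot_half_unstimulated = 180.0  # confluent cells and cells 1, 4, 7 hours after stimulation
crot_half_s_phase_like = 0.4    # reconstituted chromatin, indistinguishable from S-phase

fold = relative_abundance(crot_half_unstimulated, crot_half_s_phase_like)
print(f"fold enrichment: {fold:.0f}")
print(f"fraction hybridized at Cr0t1/2: {fraction_hybridized(0.4, 0.4):.2f}")
```

With these two values the ratio is about 450, which is of the same order as the roughly 500-fold activation cited in the text.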

V. Activation of Histone Gene Transcription by Nonhistone Chromosomal Phosphoproteins (38, 39)

Results from the studies described above suggest that in continuously dividing HeLa S3 cells, as well as in WI-38 human diploid fibroblasts after stimulation to proliferate, the cell-cycle stage-specific transcription of histone genes is regulated by a component of the S-phase nonhistone chromosomal proteins. One aspect of a possible mechanism by which histone gene transcription is regulated may involve the phosphate groups on the nonhistone chromosomal proteins. Modifications in the phosphorylation of nonhistone chromosomal proteins have been observed throughout the cell cycle in continuously dividing cells and following stimulation of nondividing cells to proliferate (27, 28, 40, 41). Such changes in the metabolism of phosphate groups provide correlative evidence for a functional role of phosphorylation in gene regulation during the cell cycle.

More direct evidence that phosphorylation of nonhistone chromosomal proteins is important in determining the availability of defined genes (histone genes) for transcription can be gleaned from the results of two recent studies. In one series of experiments, chromatin-associated phosphoproteins were isolated from HeLa S3 cells, and this subset of the nonhistone chromosomal proteins was compared with other nonhistone chromosomal protein fractions for ability to activate histone mRNA sequence transcription from chromatin (38). Phosphoproteins were isolated from HeLa S3 cell chromatin as schematically illustrated in Fig. 12. Concomitantly, protein fractions were isolated in an identical manner from cells pulse-labeled with 32Pi for 1 hour. The histone gene activating ability of each fraction was correlated with the degree of phosphorylation, and the fractions were also examined by means of polyacrylamide gel electrophoresis. The phosphoprotein fractionation scheme employed in the present studies subdivides the chromosomal proteins into three electrophoretically distinguishable fractions as demonstrated in Fig. 13. These fractions also differ as to their specific activities with respect to 32Pi, with the proteins bound to calcium phosphate gel exhibiting a 10-fold enhancement in phosphorylation (3.2 x 10” cpm/mg) compared with the nonbinding proteins (2.3 x 10’ cpm/mg).

Each of the four protein fractions was analyzed in the following manner for its ability to activate, in vitro, the transcription of histone mRNA sequences from G1 chromatin, which is ineffective as a template for histone gene transcription. G1 chromatin was dissociated in 5 M urea/3 M NaCl and then reconstituted in the presence of one of the four chromosomal protein fractions. The reconstituted chromatins were transcribed with E.
coli RNA polymerase and the RNA transcripts were assayed for their abilities to form S1 nuclease-resistant, acid-precipitable hybrids with histone [3H]cDNA. As shown by the hybridization curves in Fig. 14, dissociated G1 chromatin reconstituted alone or in the presence of “80,000 x g pellet proteins” or “calcium phosphate nonbinding proteins” does not serve as a template for the in vitro transcription of RNA sequences that hybridize with histone cDNA. However, dissociated G1 chromatin reconstituted in the presence of “80,000 x g supernatant proteins” or the “phosphoproteins” is capable of transcribing RNA that hybridizes to histone cDNA. The kinetics of the hybridization reaction between histone cDNA and RNA transcripts from native S-phase chromatin (Cr0t1/2 = 0.2) are similar to those of the hybridization reaction between histone cDNA and RNA transcripts from G1 chromatin reconstituted with “80,000 x g supernatant proteins” or the “phosphoproteins” (Cr0t1/2 = 0.25).

Chromatin dispersed in 1.0 M NaCl, reduced to 0.4 M NaCl; centrifuged at 80,000 x g
    Pellet: dispersed in 5 M urea/3 M NaCl; centrifuged at 250,000 x g
        Supernatant: “80,000 x g pellet proteins”
        Pellet: discarded
    Supernatant: “80,000 x g supernatant proteins”; treated with Bio-Rex 70
        Supernatant: treated with CaPO4 gel
            Supernatant: “CaPO4 nonbinding proteins”
            CaPO4 pellet: gel solubilized, yielding the “phosphoproteins”

FIG. 12. Schematic diagram for the fractionation of chromosomal proteins from HeLa cells. Chromatin was prepared as described previously (2) and then suspended in a Dounce homogenizer in 1.0 M NaCl/50 mM Tris (pH 7.5) at a concentration of 2 mg/ml; 1.5 volumes of 20 mM Tris (pH 7.5) were added dropwise, and the mixture was briefly homogenized and centrifuged at 80,000 x g for 1 hour. The 80,000 x g pellet was dispersed in 5 M urea/3 M NaCl/10 mM Tris (pH 8.3), and the mixture was centrifuged at 250,000 x g for 24 hours. The proteins in the supernatant are referred to as the “80,000 x g pellet proteins.” Bio-Rex 70 (previously equilibrated with 0.4 M NaCl/20 mM Tris-HCl, pH 7.5) was added to the 80,000 x g supernatant proteins at a ratio of 20 mg of Bio-Rex per milligram of protein. The suspension was stirred for 5-10 minutes, then centrifuged at 6000 x g. Calcium phosphate gel was added to the resulting supernatant in a ratio of 0.46 mg of gel per milligram of protein, stirred for 5-10 minutes and then centrifuged at 7000 x g. The proteins remaining in the supernatant are referred to as “CaPO4 nonbinding proteins.” The pellet of calcium phosphate gel was washed in 40 ml of 1.0 M (NH4)2SO4/50 mM Tris (pH 7.5) and solubilized in 0.3 M EDTA (pH 7.5)/0.33 M (NH4)2SO4 in a ratio of 0.2 ml of solution per milligram of gel. The insoluble residue was removed by centrifugation for 15 minutes at 33,000 x g, and the supernatant constituted the protein fraction referred to as the “phosphoproteins.”

When RNA polymerase is omitted from the transcription reaction and RNA is isolated (with an amount of carrier E. coli RNA equivalent to the amount of RNA transcribed in the presence of polymerase) from G1 chromatin reconstituted with “phosphoproteins,” the isolated RNA shows no significant extent of hybridization with histone cDNA. This experiment indicates


FIG. 13. Electrophoretic profiles (ordinate: OD at 590; abscissa: migration in centimeters) of the total nuclear proteins (top), the “80,000 x g pellet proteins” (a), the “CaPO4 nonbinding proteins” (b), and the “phosphoproteins” (c) isolated by the methods described in Fig. 12 from exponentially growing HeLa cells. Aliquots from each sample were dialyzed against 0.1% dodecyl sulfate/10 mM sodium phosphate (pH 7.0)/0.1% β-mercaptoethanol, and subsequently fractionated electrophoretically according to molecular weight on 7.5% acrylamide/0.28% bisacrylamide gels.
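The salt reduction in the fractionation scheme of Fig. 12 (1.0 M NaCl reduced to 0.4 M NaCl by adding 1.5 volumes of 20 mM Tris) can be checked with a simple mixing calculation. The sketch below assumes ideal, additive volumes and a diluent that contains no NaCl; the helper name `mix` is hypothetical.

```python
def mix(conc_a, vol_a, conc_b, vol_b):
    """Final concentration after combining two solutions.

    Assumes volumes are additive; concentrations may be in any unit
    as long as both inputs use the same one.
    """
    return (conc_a * vol_a + conc_b * vol_b) / (vol_a + vol_b)


# One volume of chromatin in 1.0 M NaCl/50 mM Tris plus
# 1.5 volumes of 20 mM Tris (assumed to contain no NaCl).
nacl_molar = mix(1.0, 1.0, 0.0, 1.5)
tris_millimolar = mix(50.0, 1.0, 20.0, 1.5)

print(f"NaCl: {nacl_molar:.2f} M, Tris: {tris_millimolar:.0f} mM")
```

Under these assumptions the final NaCl concentration is 0.4 M, in agreement with the scheme, and the final Tris concentration is 32 mM.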

that endogenous histone-specific sequences associated with the “phosphoprotein” fraction do not contribute significantly to the hybridization observed between histone cDNA and transcripts of G1 chromatin reconstituted in the presence of “phosphoproteins.” These results clearly suggest that the ability to activate histone mRNA sequence transcription

FIG. 14. Kinetics of annealing of histone [3H]cDNA to in vitro transcripts of chromatin from G1 HeLa cells (abscissa: log Cr0t) dissociated and reconstituted (m), or dissociated and reconstituted in the presence of “80,000 x g supernatant proteins” (a), “80,000 x g pellet proteins” (A), “CaPO4 nonbinding proteins” (0), “phosphoproteins” (O), or “phosphoproteins” where no RNA polymerase was added to the transcription assay (x). G1 chromatin was dissociated and 1-mg aliquots were reconstituted in the presence of 1 mg of “80,000 x g supernatant proteins,” “80,000 x g pellet proteins,” “CaPO4 nonbinding proteins” or “phosphoproteins.”

resides in a component of the nonhistone chromosomal proteins soluble in 0.4 M NaCl and with a high affinity for calcium phosphate gel.

We also examined the effects on histone gene transcription of dephosphorylating nonhistone chromosomal proteins (39). Using a calf thymus protease-free nuclear phosphatase covalently linked to agarose, S-phase nonhistone chromosomal proteins from HeLa S3 cells were partly dephosphorylated. This procedure is effective in removing up to 60% of the phosphate groups from S-phase nonhistone chromosomal proteins. Dephosphorylation is carried out in the presence of 5 M urea, thus maintaining complete solubility, and the procedure yields proteins that, though partially dephosphorylated, are quantitatively and qualitatively identical to native S-phase nonhistone chromosomal proteins. To assay the influence of phosphate groups associated with nonhistone chromosomal proteins on histone gene transcription, chromatin was reconstituted utilizing DNA, S-phase histones and either native S-phase nonhistone chromosomal proteins or partially dephosphorylated S-phase nonhistone chromosomal proteins. The data in Fig. 15 clearly indicate that dephosphorylation results in a 75-80% decrease in the transcription of histone


FIG. 15. Hybridization of RNA transcripts from reconstituted chromatin to histone [3H]cDNA (abscissa: log Cr0t). [3H]cDNA (0.04 ng) was annealed with RNA transcripts from chromatin reconstituted with DNA, histones and nonhistone chromosomal proteins from S-phase HeLa S3 cells (0) or from chromatin reconstituted with DNA, histones and partially dephosphorylated nonhistone chromosomal proteins from S-phase cells (0).

mRNA sequences. Such enzymic dephosphorylation of S-phase nonhistone chromosomal proteins brings about less than 50% reduction in overall chromatin template activity and binding sites for E. coli RNA polymerase. Therefore it appears that genes are not affected at random and that histone genes are among those selectively inhibited. These two lines of evidence provide support for a direct and functional involvement of nonhistone chromosomal protein phosphorylation in the regulation of histone gene transcription.

Further elucidation of the involvement of phosphorylation in the regulation of histone gene transcription requires: (1) fractionation of the genome-associated phosphoproteins, which constitute a complex and heterogeneous class of macromolecules; (2) determination of whether histone gene transcription is activated by a G1 protein modified at the onset of S-phase, or a protein synthesized and phosphorylated concomitant with the initiation of DNA synthesis; and (3) resolution of whether control of phosphorylation resides with the nonhistone chromosomal protein substrate or with the phosphorylating enzyme system.
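The selectivity argument above can be made explicit by comparing residual activities. This is a minimal sketch, taking the decrease in histone mRNA transcription as 75 to 80% and the reduction in overall template activity as its stated upper bound of 50%; the function name is illustrative, and the conversion to a Cr0t1/2 shift assumes that transcript abundance is inversely proportional to Cr0t1/2.

```python
def residual_activity(decrease_percent):
    """Fraction of activity remaining after a given percentage decrease."""
    return 1.0 - decrease_percent / 100.0


# Histone mRNA transcription: decrease taken as 75 to 80%.
histone_residual = [residual_activity(p) for p in (75.0, 80.0)]

# Overall template activity: less than 50% reduction, so at least half remains.
overall_residual_min = residual_activity(50.0)

for r in histone_residual:
    crot_half_shift = 1.0 / r                 # expected fold increase in Cr0t1/2
    selectivity = overall_residual_min / r    # lower bound on preferential loss
    print(f"residual {r:.2f}: Cr0t1/2 shift about {crot_half_shift:.1f}-fold, "
          f"histone transcription reduced at least {selectivity:.1f}-fold more than bulk")
```

On these numbers, histone gene transcription falls at least 2 to 2.5 times further than overall template activity, which is the sense in which the inhibition is described as selective.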


VI. Conclusions

While the specific regulatory elements that dictate the availability of histone genes for transcription have not been identified, it is possible, predicated on several observations presented here and elsewhere, to speculate as to how these genes may be rendered effective templates for transcription of mRNA sequences. DNA is an effective template for the transcription of histone mRNA sequences, and histones by themselves inhibit histone gene transcription from DNA in a dose-dependent, nonspecific manner (20). When complexed with DNA alone, nonhistone chromosomal proteins (G1- or S-phase) do not affect the transcription of histone mRNA sequences (20). However, when associated with DNA in the presence of histones, the nonhistone chromosomal proteins are capable of rendering histone genes transcribable selectively (14). Chromatin reconstituted with S-phase nonhistone chromosomal proteins is an effective template for transcription of histone mRNA sequences whereas chromatin reconstituted with nonhistone chromosomal proteins from G1 cells is not. Hence, it appears that the cell-cycle, stage-specific transcription of histone genes depends upon the source of nonhistone chromosomal proteins. Also, histone gene transcription during S-phase appears to be “activated” by a component of the S-phase nonhistone chromosomal proteins rather than being “repressed” during the G1-phase of the cell cycle by a component of the G1 nonhistone chromosomal proteins (19, 34).

Taken together, these results suggest that a component of the S-phase nonhistone chromosomal proteins modifies the interaction of histones with DNA in a specific manner to render histone genes transcribable. It is not clear how such modifications in the association of histones with DNA are achieved. Partial displacement of histones from DNA may be brought about by competition of nonhistone chromosomal proteins for specific sites on the DNA molecule. Alternatively, interaction of nonhistone chromosomal proteins with specific DNA sites may result in conformational modifications in adjacent DNA sequences where histone binding may be altered. Previous data suggesting that nonhistone chromosomal proteins are responsible for cell cycle stage-specific variations in the binding of histones to DNA in chromatin are consistent with such reasoning (42). One may envision “regulatory” proteins being complexed with regions of chromatin packaged as “nu-bodies” or with regions of the genome between the “beads.”

In the specific situation of histone gene activation during S-phase, it remains to be established whether the regulatory protein or proteins are: (a) newly synthesized and associated with the genome at the time of DNA replication; (b) recruited from the cytoplasm or nucleoplasm during S-phase; or (c) preexisting chromosomal proteins enzymically modified at the onset of S-phase to alter their structural and functional properties. Within this context, it should be noted that the evidence discussed above (38, 39) suggests that nonhistone chromosomal protein phosphorylation influences the transcription of histone genes.

ACKNOWLEDGMENTS

These studies were supported by research grants from the National Science Foundation (GB38349, BMS75-18583 and BMS74-23418), the National Institutes of Health (GM20535) and the American Cancer Society (F73UF-6 and F75UF-4).

REFERENCES

1. J. L. Stein, C. L. Thrall, W. D. Park, R. J. Mans and G. S. Stein, Science 189, 557 (1975).
2. G. S. Stein and T. W. Borun, J. Cell Biol. 52, 292 (1972).
3. J. Spalding, K. Kajiwara and G. Mueller, PNAS 56, 1535 (1966).
4. E. Robbins and T. W. Borun, PNAS 57, 409 (1967).
5. T. W. Borun, M. D. Scharff and E. Robbins, PNAS 58, 1977 (1967).
6. D. Gallwitz and G. C. Mueller, JBC 244, 5947 (1969).
7. D. Gallwitz and M. Breindl, BBRC 47, 1106 (1972).
8. M. Jacobs-Lorena, C. Baglioni and T. W. Borun, PNAS 69, 2095 (1972).
9. T. W. Borun, F. Gabrielli, K. Ajiro, A. Zweidler and C. Baglioni, Cell 4, 59 (1975).
10. M. Breindl and D. Gallwitz, EJB 45, 91 (1974).
11. C. L. Thrall, W. D. Park, W. H. Rashba, J. L. Stein, R. J. Mans and G. S. Stein, BBRC 61, 1443 (1974).
12. G. Stein, J. Stein, C. Thrall and W. Park, in “Chromosomal Proteins and Their Role in the Regulation of Gene Expression” (G. S. Stein and L. J. Kleinsmith, eds.), p. 1. Academic Press, New York, 1975.
13. R. Mans and N. Huff, JBC 250, 3672 (1975).
14. G. Stein, W. Park, C. Thrall, R. Mans and J. Stein, Nature 257, 764 (1975).
15. M. Farquhar and B. McCarthy, BBRC 53, 515 (1973).
16. A. Skoultchi and P. R. Gross, PNAS 70, 2840 (1973).
17. G. S. Stein, W. D. Park, C. L. Thrall, R. J. Mans and J. L. Stein, BBRC 63, 945 (1975).
18. M. Adesnik and J. Darnell, JMB 67, 397 (1972).
19. W. D. Park, J. L. Stein and G. S. Stein, Bchem 15, 3296 (1976).
20. J. L. Stein, K. Reed and G. S. Stein, Bchem 15, 3291 (1976).
21. G. S. Stein, T. C. Spelsberg and L. J. Kleinsmith, Science 183, 817 (1974).
22. R. Baserga, Life Sci. 15, 1057 (1974).
23. G. S. Stein and J. Farber, PNAS 69, 2918 (1972).
24. G. S. Stein, S. C. Chaudhuri and R. Baserga, JBC 247, 3918 (1972).
25. T. W. Borun and G. S. Stein, J. Cell Biol. 52, 308 (1972).
26. G. S. Stein and D. Matthews, Science 181, 71 (1973).
27. R. Platz, G. S. Stein and L. J. Kleinsmith, BBRC 51, 735 (1973).
28. J. Karn, E. M. Johnson, G. Vidali and V. G. Allfrey, JBC 249, 667 (1974).
29. J. Bhorjee and T. Pederson, PNAS 69, 3345 (1972).
30. E. Gerner and R. Humphrey, BBA 331, 117 (1973).
31. T. Barrett, D. Maryanka, P. H. Hamlyn and H. J. Gould, PNAS 71, 5057 (1974).
32. J. Paul, R. S. Gilmour, N. Affara, G. Birnie, P. Harrison, A. Hell, S. Humphries, J. Windass and B. Young, CSHSQB 38, 885 (1973).
33. J.-F. Chiu, Y. H. Tsai, K. Sakuma and L. S. Hnilica, JBC 250, 9431 (1975).
34. R. L. Jansing, J. L. Stein and G. S. Stein, submitted.
35. S. L. Rhode and K. A. O. Ellem, Exp. Cell Res. 53, 184 (1968).
36. G. Rovera and R. Baserga, J. Cell. Physiol. 77, 201 (1971).
37. G. S. Stein and C. L. Thrall, FEBS Lett. 34, 35 (1973).
38. J. A. Thomson, J. L. Stein, L. J. Kleinsmith and G. S. Stein, Science, in press.
39. L. J. Kleinsmith, J. L. Stein and G. S. Stein, PNAS 70, 1174 (1976).
40. D. Pumo, G. S. Stein and L. J. Kleinsmith, BBA 402, 125 (1975).
41. D. Pumo, G. S. Stein and L. J. Kleinsmith, “Cell Differentiation,” in press (1976).
42. G. S. Stein, G. Hunter and L. Lavie, BJ 139, 71 (1974).


Selective Transcription of DNA Mediated by Nonhistone Proteins

TUNG Y. WANG, NINA C. KOSTRABA AND RUTH S. NEWMAN

Division of Cell and Molecular Biology
State University of New York at Buffalo
Buffalo, New York

I. Introduction

Transcription of DNA in chromatin is severely restricted, yet tissue-specific (1-4). Since histones, which suppress most of the template activity of DNA in chromatin, do not qualitatively exhibit tissue differences, the specific restriction of transcription from chromatin lies in essence with the nonhistone chromosomal proteins. This deduction has found support from observations by many investigators. Gilmour and Paul (3) showed, by hybridization of DNA with RNA, that calf thymus DNA, complexed with homologous nonhistone proteins, transcribes 40% fewer RNA copies than “naked” DNA. By comparison, insignificant amounts of hybridizable RNA are transcribed from nucleoprotein complexes formed with DNA and histones alone. Similar findings were also reported by Spelsberg et al. (5). Furthermore, the transcriptional template activity of chromatin devoid of all histones is only 67-76% of that of naked DNA (5). These observations strongly implicate a specific restriction action of nonhistone proteins in the transcription of chromatin.

Additional evidence supporting a repressor role for the nonhistone proteins was provided by the works of Farber et al. (6) and Stein et al. (7). In mitotic cells, RNA synthesis is suppressed, yet there are no quantitative differences in histones as compared with S-phase cells (7). The inhibited RNA synthesis appears to be partly caused by a reduced template activity of mitotic chromatin for RNA synthesis. This depression of template activity of the mitotic chromatin is dependent upon specific nonhistone proteins (6).

The transcriptional template activity of chromatin reconstituted from DNA, histones and nonhistone proteins is always higher than that of chromatin reconstituted from DNA and histones (3, 8-11). This, together with the observation that nonhistone proteins determine the specificity of chromatin transcription (12), suggests a regulatory role for the nonhistone proteins in specific gene activation. Support for this was first provided by Teng et al. (13, 14) and by Kleinsmith and co-workers (15, 16). The former, using phenol extraction, and the latter, using 1.0 M NaCl extraction and calcium phosphate gel fractionation, isolated from rat liver a nonhistone phosphoprotein fraction that stimulates transcription from DNA. A similar nonhistone-protein fraction has subsequently been isolated from Ehrlich ascites tumor chromatin using DNA-affinity chromatography (17). The activation of transcription from DNA by these nonhistone-protein fractions is specific in that they bind selectively to, and preferentially stimulate, transcription of homologous DNA. The activation reaction apparently depends on the phosphoprotein components and requires a eukaryotic RNA polymerase system. Teng et al. (14) suggested that the nonhistone proteins may function as does a sigma factor, recognizing and combining with specific polynucleotide sequences to promote transcription of particular gene loci.

The work reported here aims to delineate the regulatory role of the nonhistone proteins in both positive and negative control of gene activity. The studies were designed to determine the restrictive and stimulatory properties of those nonhistone proteins involved in specific transcription. We first describe the purification of a near-homogeneous nonhistone protein that specifically inhibits transcription from DNA in a homologous RNA polymerase II system. Next, we describe the isolation of a nonhistone-protein fraction that selectively stimulates in vitro transcription from homologous DNA. It is also shown that the activator nonhistone-protein fraction stimulates the initiation of transcription of only unique sequences in DNA, supporting the hypothesis advanced by Teng et al. (14).

In the present studies, Ehrlich ascites tumor cells were used. A homogeneous RNA polymerase II was purified from the tumor cells to provide the in vitro RNA-synthesizing system (18). As detailed elsewhere (18), this eukaryotic RNA polymerase II catalyzes the transcription of limited but selective DNA sequences when compared with bacterial RNA polymerase.

II. A Nonhistone Protein from Ehrlich Ascites Tumor That Inhibits Transcription from DNA

A. Isolation of the Inhibitor Nonhistone Protein

The inhibitor nonhistone protein was isolated from the DNA-protein complex of chromatin. Details of the isolation procedure have been given elsewhere (19). Briefly, the Ehrlich ascites tumor chromatin was extracted with 2 M NaCl and the salt extract was diluted to 0.14 M NaCl to precipitate the DNA-protein complex. The complex was extracted with 0.2 M H2SO4 to remove the histones, followed by phenol extraction as described by Teng et al. (13, 14). The phenol-soluble protein was further passed through a Bio-Rex (Na+) column and dialyzed against 0.01 M Tris-Cl, pH 8.0, before use.

B. Some Properties of the Inhibitor Nonhistone Protein

The inhibitor nonhistone protein isolated from Ehrlich ascites tumor chromatin contained 2.7% phosphorus by weight and appeared as a single, stainable band when subjected to polyacrylamide gel electrophoresis either in the presence or in the absence of sodium dodecyl sulfate, as well as by two-dimensional polyacrylamide gel electrophoresis (Fig. 1). It was electrofocused at pH 5.3, and contained a preponderance of aspartic acid and glutamic acid over basic amino-acid residues (Table I). The inhibitor nonhistone protein is therefore an acidic protein. The molecular weight of the inhibitor nonhistone protein, calculated from its amino acid composition and from sodium dodecyl sulfate polyacrylamide gel electrophoresis with known markers, is estimated to be 10,300.

FIG. 1. (A) Polyacrylamide gel electrophoresis of inhibitor nonhistone-protein isolated from Ehrlich ascites tumor chromatin. Five micrograms of the nonhistone protein were subjected to electrophoresis on nondenaturing 5% polyacrylamide gel for 5.5 hours at 0.2 mA as described elsewhere (23). (B) Ehrlich ascites tumor inhibitor nonhistone-protein (15 µg) was subjected to electrophoresis for 24 hours at 0.5 mA in a 20% polyacrylamide/0.1% sodium dodecyl sulfate gel as described by Laemmli (24). The value indicates molecular weight x 10^-3. Sodium dodecyl sulfate buffer front is indicated by B.F. (C) Two-dimensional gel electrophoresis of 30 µg of nonhistone inhibitor protein was performed as described by Suria and Liew (25). The first dimension (top gel) consisted of isoelectric focusing in 8 M urea with ampholine, pH range 3.5 to 10, at 100 V for 6 hours. The second dimension consisted of electrophoresis of the isoelectric focused gel onto a sodium dodecyl sulfate gel slab for 3.75 hours at 30 mA per slab. The gels of A, B and the slab gel were stained as described elsewhere (19).

C. Binding of the Inhibitor Nonhistone-Protein to DNA

The inhibitor nonhistone-protein isolated from Ehrlich ascites tumor chromatin binds to homologous DNA (Fig. 2). To ascertain whether such interaction is selective for specific DNA sequences, Ehrlich ascites tumor DNA was fragmented and separated into fractions containing reiterated sequences (C0t < 100) and unique sequences (C0t > 850). Utilizing the nitrocellulose-filter binding assay (20), these DNA fractions were tested for their ability to interact with the inhibitor nonhistone-protein. As shown in Fig. 3, the protein binds only to the reiterated sequences. There was very little binding of the inhibitor nonhistone-protein to the unique DNA. The preferential interaction between the inhibitor nonhistone-protein and the reiterated sequences of DNA was verified by the competitive binding assay as described by Johnson et al. (21). As shown in Fig. 4, binding of the protein to reiterated DNA sequences is effectively inhibited by the low-C0t DNA fragments, but not by high-C0t DNA.

FIG. 2. Electrophoresis of 125I-labeled inhibitor nonhistone-protein isolated from Ehrlich ascites tumor (0.2 µg) and DNA-binding 131I-nonhistone protein (approximately 0.05 µg) on 10% sodium dodecyl sulfate polyacrylamide gel as described in Fig. 1. Samples were subjected to electrophoresis for 24 hours at 0.5 mA/tube. The gels were sliced into 1-mm sections, and each slice was counted in a Beckman gamma counter. The inhibitor DNA-binding protein was prepared as described elsewhere (19).

FIG. 3. Retention of inhibitor Ehrlich ascites tumor nonhistone-protein-DNA complex on nitrocellulose filters. Increasing amounts of inhibitor nonhistone-protein were interacted with 1 µg of 125I-containing low-C0t DNA (<100) and high-C0t DNA (>850) (0-0). The assay procedure was as described by Riggs et al. (20) as modified by Sevall et al. (26). One microgram of input DNA was adjusted with corresponding unlabeled DNAs to give 15,000 cpm/µg. The abscissa represents micrograms of inhibitor nonhistone-protein added, and the ordinate represents percentage of DNA retained on the filter. Background of approximately 1000 cpm was obtained by passing 1 µg of 125I-DNA without the inhibitor nonhistone-protein through the nitrocellulose filter.

FIG. 4. Retention of a complex between inhibitor nonhistone-protein (Ehrlich ascites tumor) and low-C0t DNA on nitrocellulose filters in the presence of competing DNAs. Various amounts of low-C0t DNA and high-C0t DNA were each initially treated with 0.5 µg of inhibitor nonhistone-protein. Subsequently, 20 µg of low-C0t 125I-DNA was added to each reaction. Increasing amounts of competing DNAs are shown on the abscissa. The ordinate represents percentage of DNA retained on the nitrocellulose filter after competition with low-C0t DNA (0-0) and high-C0t DNA (0-0). Total input DNA represents 35,000 cpm, and background represents approximately 1500 cpm.
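The filter-binding results are reported as percentage of labeled DNA retained. A minimal sketch of one way such a percentage could be computed from raw counts is given below. It assumes that the protein-free background is subtracted from both the filter counts and the input counts; the source does not state the exact formula, and the function name and the example filter count of 8000 cpm are hypothetical. The input (15,000 cpm) and background (approximately 1000 cpm) are those given in the legend to Fig. 3.

```python
def percent_retained(filter_cpm, input_cpm, background_cpm):
    """Percentage of labeled DNA retained on the filter.

    Assumption: background (counts retained without protein) is
    subtracted from both numerator and denominator.
    """
    if input_cpm <= background_cpm:
        raise ValueError("input counts must exceed background")
    net = max(filter_cpm - background_cpm, 0.0)
    return 100.0 * net / (input_cpm - background_cpm)


input_cpm = 15000.0      # from the legend to Fig. 3
background_cpm = 1000.0  # from the legend to Fig. 3
filter_cpm = 8000.0      # hypothetical measurement, for illustration only

print(f"{percent_retained(filter_cpm, input_cpm, background_cpm):.0f}% retained")
```

The same function can be applied to each point of a competition series, where a fall in the retained percentage with increasing unlabeled competitor indicates that the competitor shares binding sites with the labeled DNA.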

D. Inhibition of Transcription from DNA in Vitro by the Nonhistone Protein

Addition of the nonhistone protein to a native homologous DNA-templated RNA-polymerase-II reaction resulted in inhibition of RNA synthesis (Fig. 5). This protein was not effective in inhibiting in vitro transcription when denatured DNA was used as a template (Fig. 5). The nonhistone protein was also ineffective when tested in an E. coli RNA polymerase reaction (Fig. 6). Since the inhibitor nonhistone protein binds to reiterated, but not to unique, DNA sequences, the question arises whether the inhibitory effect of the nonhistone protein is a result of its interaction with reiterated DNA sequences. The data in Table II show that the nonhistone protein inhibits RNA synthesis templated by low- and middle-C0t DNAs, but not high-C0t DNA. These results suggest that inhibition of transcription in vitro by the nonhistone protein may involve formation of a specific ternary complex of the nonhistone protein, reiterated DNA sequences and a eukaryotic RNA polymerase II.

FIG. 5. Inhibition of DNA-dependent RNA synthesis in vitro by the inhibitor nonhistone protein isolated from Ehrlich ascites tumor. Ehrlich ascites tumor RNA polymerase II was used in the in vitro assay system. Varied amounts of inhibitor protein as indicated on the abscissa were added to 5 µg of native Ehrlich ascites DNA (0-0) or denatured DNA (.--.) in the in vitro RNA synthesizing system as described elsewhere (19). The ordinate represents RNA synthesis, expressed as percentage of control. The control consisted of cpm of RNA synthesized using 5 µg of various DNAs in the in vitro RNA-polymerase system.

FIG. 6. Preference for Ehrlich-ascites-tumor RNA polymerase in the inhibition of DNA-templated RNA synthesis in vitro by Ehrlich-ascites-tumor inhibitor nonhistone-protein. Ehrlich ascites tumor RNA polymerase II (O---O) or E. coli RNA polymerase (.---.) was used in the in vitro RNA synthesis system. Various amounts of inhibitor nonhistone protein as indicated on the abscissa and 5 µg of native Ehrlich ascites DNA were used in the RNA polymerase reaction as described elsewhere (19). The ordinate represents RNA synthesis, expressed as percentage of control. The control consisted of cpm of RNA synthesized using 5 µg of Ehrlich ascites DNA in the in vitro RNA synthesis system in the absence of the inhibitor nonhistone-protein.


Amino acid                            Molar %    Minimal no. of residues

Lysine                                  6.7              6
Histidine                               1.2              1
Arginine                                7.6              6
Aspartic acid                           9.1              8
Threonine                               4.9              4
Serine                                  7.5              6
Glutamic acid                          12.9             11
Proline                                 7.8              6
Glycine                                11.6             10
Alanine                                10.8              9
Valine                                  7.7              6
Isoleucine                              2.8              2
Leucine                                 5.1              4
Tyrosine                                1.3              1
Phenylalanine                           3.5              3
(Glutamic acid + aspartic acid)/
(lysine + histidine + arginine)         1.42
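The two derived quantities in the amino acid table above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the molar percentages are transcribed from the table (the aspartic acid and leucine entries are partly illegible in this copy and are taken here as 9.1 and 5.1), and it assumes that the "minimal number of residues" was obtained by scaling every amino acid to the least abundant one, histidine, counted as one residue. That normalization is an assumption; the text does not state how the column was calculated.

```python
# Molar percentages as read from the table (Asp and Leu are assumed readings).
MOLAR_PERCENT = {
    "Lys": 6.7, "His": 1.2, "Arg": 7.6, "Asp": 9.1, "Thr": 4.9,
    "Ser": 7.5, "Glu": 12.9, "Pro": 7.8, "Gly": 11.6, "Ala": 10.8,
    "Val": 7.7, "Ile": 2.8, "Leu": 5.1, "Tyr": 1.3, "Phe": 3.5,
}


def acidic_to_basic_ratio(composition):
    """(Glu + Asp) / (Lys + His + Arg), the last row of the table."""
    acidic = composition["Glu"] + composition["Asp"]
    basic = composition["Lys"] + composition["His"] + composition["Arg"]
    return acidic / basic


def minimal_residue_estimates(composition):
    """Scale each molar % to the least abundant amino acid (assumed = 1 residue).

    Returns unrounded estimates; the published column lists whole numbers.
    """
    reference = min(composition.values())
    return {aa: pct / reference for aa, pct in composition.items()}


ratio = acidic_to_basic_ratio(MOLAR_PERCENT)
estimates = minimal_residue_estimates(MOLAR_PERCENT)

print(f"(Glu + Asp)/(Lys + His + Arg) = {ratio:.2f}")
for amino_acid, estimate in estimates.items():
    print(f"{amino_acid}: {estimate:.2f} residues per histidine")
```

Under this assumption every unrounded estimate lies within half a residue of the whole number listed in the table, and the acidic-to-basic ratio comes out slightly above 1.4, which fits an acidic protein.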

E. Mode of the Inhibitory Action of the Nonhistone Protein in Transcription from DNA

The inhibition of RNA synthesis in vitro by the nonhistone protein could be at either the initiation or the elongation step of RNA chain growth. The specific binding to DNA by the inhibitor nonhistone-protein suggests that the inhibitory action of the nonhistone protein is likely to be on the initiation of RNA synthesis. In the reaction of RNA polymerase II from Ehrlich ascites tumor employing naked DNA as template, initiation of RNA synthesis starts with the deposition of ATP and GTP as the 5′-terminal nucleotides (18). As can be seen in Fig. 7, incorporation of [γ-³²P]ATP and -GTP into RNA in the presence of the nonhistone protein is approximately half that obtained in its absence. The suppression of initiation is directly proportional to the reduction of [³H]UMP incorporated into RNA caused by the inhibitor nonhistone-protein. Consequently, there is no significant change in the average chain-length of the RNA product synthesized in the presence or in the absence of the nonhistone protein. The data indicate that the action of the nonhistone protein is inhibition of RNA chain initiation rather than elongation. Since the nonhistone protein preferentially binds to and inhibits transcription of reiterated DNA sequences, inhibition of RNA initiation by the nonhistone protein should be observed with DNA of reiterated sequences and not with high-C₀t DNA. This is borne out by the results shown in Table II.

FIG. 7. Effect of the Ehrlich ascites inhibitor nonhistone-protein on RNA synthesis and chain initiation in the Ehrlich ascites tumor RNA polymerase II reaction. The reaction mixture, as described elsewhere (19), included 10 μg of DNA, with and without 10 μg of the inhibitor protein; [³H]UTP; and either (a) [γ-³²P]ATP or (b) [γ-³²P]GTP. The reaction mixtures were incubated at 37°C for the time periods indicated. After incubation, the product RNA was extracted with phenol and prepared for counting as described elsewhere (19). Circles, [³H]UMP incorporation in the absence and in the presence of the nonhistone protein; triangles, [γ-³²P]ATP (a) and [γ-³²P]GTP (b) incorporation in the absence and in the presence of the nonhistone protein.
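The reasoning used above to separate an effect on initiation from an effect on elongation can be sketched as a small calculation. This is an illustration, not the authors' procedure: the function names, the 20% tolerance and the example numbers are all hypothetical. The idea is that [γ-³²P] label counts chains started, [³H]UMP counts nucleotides polymerized, and their quotient is proportional to the average chain length.

```python
def relative_chain_length(ump_pmol, chain_starts_pmol):
    """UMP incorporated per chain started.

    Proportional to (not equal to) average chain length, because UMP is
    only one of the four nucleotides in the product.
    """
    if chain_starts_pmol <= 0:
        raise ValueError("chain starts must be positive")
    return ump_pmol / chain_starts_pmol


def classify_inhibition(control, treated, tolerance=0.2):
    """Classify an inhibitor from (UMP pmol, chain-start pmol) pairs.

    tolerance is a hypothetical cutoff for "no significant change".
    """
    control_length = relative_chain_length(*control)
    treated_length = relative_chain_length(*treated)
    length_change = abs(treated_length - control_length) / control_length
    starts_reduced = treated[1] < control[1] * (1 - tolerance)

    if starts_reduced and length_change <= tolerance:
        return "initiation"   # fewer chains, each of normal length
    if not starts_reduced and length_change > tolerance:
        return "elongation"   # same number of chains, but shorter
    return "mixed or unclear"


# Hypothetical numbers: both labels fall by half, so chain length is unchanged.
print(classify_inhibition(control=(30.0, 0.72), treated=(15.0, 0.36)))
# Hypothetical numbers: chain starts unchanged, UMP halved.
print(classify_inhibition(control=(30.0, 0.72), treated=(15.0, 0.72)))
```

The first case mirrors the pattern described in the text (initiation and total synthesis reduced in parallel); the second shows what an elongation block would look like under the same bookkeeping.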

Incorporation of radioactive nucleotides into RNA (pmol) from

Low-C₀t DNA    +I-NHP    Middle-C₀t DNA    +I-NHP    High-C₀t DNA    +I-NHP

0.40           0.17      0.4?              0.18      0.3?            0.40
0.32           0.1?      0.32              0.?0      0.36            0.34
30             17        ??                17        31              31

Incubation time 20 minutes. I-NHP = inhibitor nonhistone protein.

F. Comparison of Some of the Properties of the Inhibitor Nonhistone-Proteins Isolated from Ehrlich Ascites Tumor, Calf Thymus and Chicken Erythrocytes

The above results, showing selective interaction with reiterated sequences in DNA and inhibition of the initiation of their transcription by the nonhistone protein, suggest possible involvement of the nonhistone protein in the negative control of gene activity. For such a regulatory role, the inhibitor nonhistone-protein should exhibit tissue-specificity, or, as a first approximation, tissue-variations. Corresponding nonhistone proteins have been prepared from chromatin of calf thymus and chicken erythrocytes using the same isolation procedure employed in the isolation of the tumor nonhistone protein. As shown in Fig. 8, the nonhistone proteins isolated from chicken erythrocyte and calf thymus are capable of inhibiting RNA synthesis from their corresponding homologous native templates in an RNA polymerase II reaction. These inhibitor nonhistone proteins, when subjected to polyacrylamide gel electrophoresis under nondenaturing conditions, exhibit different electrophoretic mobilities (Fig. 9). In sodium dodecyl sulfate gel electrophoresis (Fig. 10), the inhibitor nonhistone-proteins isolated from calf thymus and chicken erythrocytes each show two subunits, differing from one tissue to the other. Both of these differ from the inhibitor nonhistone-protein prepared from Ehrlich ascites tumor cells, which contains only one subunit. Further, these nonhistone proteins preferentially inhibit transcription from homologous DNA. As shown in Fig. 11, 1.7 μg of tumor nonhistone protein, which inhibits transcription from 5 μg of tumor DNA by 60%, reduces transcription from 5 μg of calf thymus DNA and 5 μg of chicken erythrocyte DNA by only 15% and 20%, respectively. These data suggest a tissue-specificity for the inhibitor nonhistone-protein.

FIG. 8. Inhibition of DNA-dependent RNA synthesis in vitro by the inhibitor nonhistone-protein isolated from Ehrlich ascites tumor (circles), calf thymus (triangles) and chicken erythrocyte ( ). Varied amounts of inhibitor nonhistone-protein as indicated on the abscissa were added to the in vitro RNA synthesizing system using homologous RNA polymerase II (except that calf thymus RNA polymerase II was used for the erythrocyte nonhistone-protein). In all cases, 5 μg of homologous DNA template were used. The control represents RNA synthesis in the absence of the nonhistone protein, expressed as 100% RNA synthesis.

FIG. 9. Polyacrylamide gel electrophoresis of inhibitor nonhistone-protein isolated from Ehrlich ascites tumor (A), calf thymus (CT) and chicken erythrocyte chromatins (CE). Approximately 4 μg of the appropriate nonhistone proteins were subjected to electrophoresis on a nondenaturing 5% polyacrylamide gel for 5.5 hours at 0.4 mA as described elsewhere (23).

FIG. 10. ¹²⁵I-Labeled calf thymus and chicken erythrocyte inhibitor nonhistone proteins were subjected to electrophoresis for 24 hours at 0.5 mA in a 20% polyacrylamide/0.1% sodium dodecyl sulfate gel as described by Laemmli (24). The values indicate the corresponding molecular weights. The ordinate represents cpm of ¹²⁵I-protein determined with a Beckman gamma counter.
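The preference for the homologous template can be put in numbers from the three inhibition values quoted above (60%, 15% and 20% at 1.7 μg of tumor nonhistone protein). The sketch below is only a convenience calculation; the "fold preference" measure is an illustrative choice, not a quantity defined by the authors.

```python
# Percent inhibition by 1.7 ug of tumor inhibitor protein, as quoted in the text.
INHIBITION_PERCENT = {
    "Ehrlich ascites tumor": 60.0,
    "calf thymus": 15.0,
    "chicken erythrocyte": 20.0,
}
HOMOLOGOUS = "Ehrlich ascites tumor"


def residual_activity(inhibition):
    """Percent of control RNA synthesis that remains for each template."""
    return {template: 100.0 - pct for template, pct in inhibition.items()}


def fold_preference(inhibition, homologous):
    """Inhibition on the homologous template divided by that on each other one."""
    reference = inhibition[homologous]
    return {
        template: reference / pct
        for template, pct in inhibition.items()
        if template != homologous
    }


print(residual_activity(INHIBITION_PERCENT))
print(fold_preference(INHIBITION_PERCENT, HOMOLOGOUS))
```

On these figures the tumor protein is about four times as inhibitory on tumor DNA as on calf thymus DNA and three times as inhibitory as on chicken erythrocyte DNA.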

III. The Nonhistone-Protein Fraction That Stimulates Transcription from DNA

A. Isolation of the Activator Nonhistone-Protein Fraction

The nonhistone-protein fraction that selectively stimulates transcription from homologous DNA in vitro was isolated from a 0.35 M NaCl extract of Ehrlich ascites tumor chromatin. The salt-extracted proteins were treated with Bio-Rex 70 (Na⁺) and passed through an E. coli DNA-cellulose column. The unabsorbed proteins were then chromatographed on an Ehrlich ascites tumor DNA-cellulose column. The activator nonhistone-protein fraction was eluted with 0.6 M NaCl. The detailed procedure has been reported elsewhere (17). This fraction was heterogeneous and contained 0.9% phosphorus by weight.

FIG. 11. Preference for homologous template by the inhibitor nonhistone protein. Ehrlich ascites inhibitor nonhistone protein (1.7 μg) was added to tumor RNA polymerase II reactions containing 5 μg of different DNA templates. The abscissa indicates the origin of the DNA template; the ordinate represents picomoles of [³H]UMP incorporated with the respective DNA template.

B. The Nonhistone Protein Fraction Selectively Stimulates Transcription from DNA

In the Ehrlich-ascites-tumor RNA-polymerase reaction templated by homologous DNA, the addition of the nonhistone proteins stimulates RNA synthesis in vitro (Fig. 12). This activation appears to be specific, as DNA templates prepared from calf thymus, rat liver and chicken erythrocytes showed no stimulation of RNA synthesis. The nonhistone proteins were also ineffective when tested in a Micrococcus luteus RNA polymerase reaction. The preferential stimulation of transcription from only homologous DNA by the activator nonhistone-proteins, and the fact that the nonhistone proteins were prepared by selective binding to homologous DNA, suggest that the active components of the nonhistone proteins recognize and bind to specific sequences in DNA. Accordingly, DNA prepared from Ehrlich ascites tumor was sheared and fractionated into low-C₀t (<0.02), middle-C₀t (0.02-100), and high-C₀t (>850) fragments of approximately 500-600 nucleotides, and tested for the binding affinity of the nonhistone proteins toward these DNAs of reiterated and unique DNA sequences. As shown in Fig. 13, the activator nonhistone-proteins bind only to unique sequences in DNA by the nitrocellulose-filter method (20, 21). There is no, or insignificant, binding of the nonhistone proteins to DNAs of reiterated sequences. In competition experiments, DNAs of C₀t less than 100 do not compete in the above binding of the activator fraction with DNAs of C₀t greater than 850 (Fig. 14).

FIG. 12. Template specificity in activated RNA synthesis in vitro stimulated by the DNA-binding nonhistone proteins. Each reaction tube contained 5 μg of the respective DNA in an Ehrlich ascites tumor RNA polymerase II reaction, and the amounts of activator nonhistone proteins used are indicated on the abscissa. The amount of [³H]UMP incorporated into RNA in the absence of added nonhistone proteins is considered as 100% template activity. Assay conditions and procedures for isolating the activator proteins are described elsewhere (17). The DNA templates used in this study were from Ehrlich ascites tumor and rat liver (circles) and from calf thymus and chicken erythrocyte (triangles).

FIG. 13. Retention of a complex of DNA and activator nonhistone protein (Ehrlich ascites tumor) on nitrocellulose filters. Increasing amounts of activator nonhistone proteins as shown on the abscissa were treated with 1 μg of ¹²⁵I-labeled low-C₀t (<0.02) (○-○), middle-C₀t (0.02-100) (△-·-△) or high-C₀t (>850) (- - ○) DNA. The assay procedure was as described by Riggs et al. (20) as modified by Sevall et al. (26). One microgram of input DNA was adjusted to approximately 15,000 cpm with corresponding unlabeled DNAs to give 15,000 cpm/μg. The abscissa represents micrograms of activator nonhistone-proteins added (μg NHP) and the ordinate represents percentage of DNA retained on the filter. Background of approximately 1000 cpm was obtained by passing 1 μg of ¹²⁵I-DNA without the activator nonhistone-protein through the filter.

FIG. 14. Retention of a complex of high-C₀t DNA and activator nonhistone-protein on nitrocellulose filters in the presence of competing DNAs. The method of Riggs et al. (20) as modified by Johnson et al. (21) was followed. Various amounts of low-C₀t, middle-C₀t and high-C₀t Ehrlich ascites DNAs were each initially treated with 2 μg of the activator protein. High-C₀t ¹²⁵I-DNA (20 μg) was subsequently added to each reaction. Increasing amounts of competing DNAs are shown on the abscissa. The ordinate represents percentage of DNA retained on the nitrocellulose filter after competition with low-C₀t (○-○) and high-C₀t (●-●) DNA. Total input DNA represents 35,000 cpm, and background represents approximately 1500 cpm.
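The C₀t cutoffs used for the three DNA fractions, and the binding pattern reported for the two protein fractions, can be summarized in a short lookup. This is a reading aid under stated assumptions: the function and dictionary names are hypothetical, and the interval between C₀t 100 and 850 is labeled "unassigned" because the text does not place it in any fraction.

```python
def classify_cot(cot):
    """Assign a DNA fragment population to a fraction by its C0t value.

    Cutoffs follow the text: low < 0.02, middle 0.02-100, high > 850.
    """
    if cot < 0:
        raise ValueError("C0t cannot be negative")
    if cot < 0.02:
        return "low"
    if cot <= 100:
        return "middle"
    if cot > 850:
        return "high"
    return "unassigned"  # 100 < C0t <= 850 is not used in the text


# Binding reported in the text: the inhibitor protein binds reiterated
# (low- and middle-C0t) DNA; the activator fraction binds unique (high-C0t) DNA.
REPORTED_BINDING = {
    "inhibitor": {"low": True, "middle": True, "high": False},
    "activator": {"low": False, "middle": False, "high": True},
}


def binds(protein, cot):
    """Return the reported binding for a protein fraction, or None if unassigned."""
    fraction = classify_cot(cot)
    return REPORTED_BINDING[protein].get(fraction)


for value in (0.01, 5.0, 500.0, 1000.0):
    print(value, classify_cot(value), binds("activator", value))
```

The two protein fractions are complementary on this scheme, which is the basis for the positive and negative control roles proposed later in the chapter.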

C. The Nonhistone Protein Fraction Stimulates the Initiation of RNA Transcribed from Unique Sequences in DNA

The specific binding of the nonhistone proteins to high-C₀t DNA and the resulting enhanced transcription suggest that the nonhistone proteins may act at the initiating step in RNA synthesis. To ascertain this, γ-³²P-labeled ATP and GTP and [³H]UTP were used as the labeled substrates in the RNA polymerase reaction, and RNA synthesis was allowed to proceed in the presence or in the absence of the nonhistone-protein fraction. The results are shown in Fig. 15. It is seen that the rate of incorporation of [γ-³²P]ATP was doubled, paralleling the incorporation of [³H]UMP, when RNA was synthesized in the presence of the activator nonhistone proteins. However, incorporation of [γ-³²P]GTP into RNA was not enhanced by the nonhistone proteins. The average chain length of the RNA product was approximately the same whether or not the nonhistone proteins were present in the RNA-synthesizing reaction.

FIG. 15. Effect of the Ehrlich ascites activator nonhistone-proteins on RNA synthesis and chain initiation in an Ehrlich-ascites-tumor RNA-polymerase-II reaction. The radioactive precursors and conditions of the reactions were the same as those described in Fig. 7 except that the activator nonhistone-protein fraction instead of the inhibitor nonhistone-protein was used in this experiment. Abscissa: minutes.

The results indicate that the activator nonhistone-protein fraction stimulates the initiation but not the elongation of RNA chain growth. Since the activator nonhistone-proteins selectively bind to unique sequences in DNA, it is necessary to establish that the nonhistone proteins stimulate transcription from only high-C₀t DNA. Moreover, such activated transcription should also be reflected in a stimulated RNA chain initiation, involving only the incorporation of [γ-³²P]ATP, but not that of [γ-³²P]GTP, into RNA. The data in Table III illustrate that such is indeed the case. These results, taken together, show that the activator nonhistone-protein fraction recognizes and binds specifically to the structural genes and stimulates initiation of RNA synthesis.

IV. Conclusion

The dependence of specific transcription from chromatin on the nonhistone chromosomal proteins, the high turnover rate of and the synthesis of unique nonhistone proteins in relation to cellular activity, and the tissue variations of these proteins, have led to the belief that the nonhistone proteins play a key role in the control of gene expression (12).

Incorporation of radioactive nucleotides into RNA (pmol) from

Low-C₀t DNA    +NHP    Middle-C₀t DNA    +NHP    High-C₀t DNA    +NHP

-              -       -                 -       0.?9            0.60
-              -       -                 -       0.37            0.44
97.6           96.3    83.8              81.1    98.7            151.2

Incubation time 60 minutes. NHP = nonhistone protein.

One approach toward establishing the nonhistone proteins as potential regulatory molecules is to demonstrate, as a prerequisite, that they affect selective transcription from DNA. The original demonstration that nonhistone proteins bind to and stimulate transcription of homologous DNA by Teng et al. (13, 14) and by Kleinsmith and associates (15, 16) provided a direct experimental basis toward this direction. The work described here was carried out using this approach and elaborates on several aspects of the specificities of the nonhistone proteins affecting transcription, among which is the activation of transcription from unique DNA sequences by nonhistone proteins. In this regard, the finding of a nonhistone protein that inhibits only reiterated sequences in DNA is not inconsistent with the multilevel control mechanism proposed by Britten and Kohne (22). Thus, the present data underline the potential significance of the chromosomal proteins in both positive and negative control of gene expression.

ACKNOWLEDGMENT

The work reported here was supported by a research grant from The National Foundation (No. 1-488).

REFERENCES

1. I. Bekhor, G. M. Kung and J. Bonner, JMB 39, 351 (1969).
2. R. C. C. Huang and P. C. Huang, JMB 39, 365 (1969).
3. R. S. Gilmour and J. Paul, JMB 40, 137 (1969).
4. T. C. Spelsberg and L. S. Hnilica, BJ 120, 435 (1970).
5. T. C. Spelsberg, L. S. Hnilica and A. T. Ansevin, BBA 228, 550 (1971).
6. J. Farber, G. Stein and R. Baserga, BBRC 47, 790 (1972).
7. G. Stein, G. Hunter and L. Lavie, BJ 139, 71 (1974).
8. M. O. J. Olson, W. C. Starbuck and H. Busch, in "The Molecular Biology of Cancer" (H. Busch, ed.), p. 309. Academic Press, New York, 1973.
9. T. C. Spelsberg, J. A. Wilhelm and L. S. Hnilica, Sub-cell. Biochem. 1, 107 (1972).
10. A. J. MacGillivray, J. Paul and G. Threlfall, Advan. Cancer Res. 15, 19 (1972).
11. G. S. Stein and R. Baserga, Advan. Cancer Res. 15, 287 (1972).
12. V. G. Allfrey, in "Acidic Proteins of the Nucleus" (I. L. Cameron and J. R. Jeter, eds.), p. 1. Academic Press, New York, 1974.
13. C. T. Teng, C. S. Teng and V. G. Allfrey, BBRC 41, 690 (1970).
14. C. S. Teng, C. T. Teng and V. G. Allfrey, JBC 246, 3597 (1971).
15. L. J. Kleinsmith, JBC 248, 5648 (1973).
16. M. Shea and L. J. Kleinsmith, BBRC 50, 473 (1973).
17. N. C. Kostraba, R. A. Montagna and T. Y. Wang, JBC 250, 1548 (1975).
18. J. Chan, R. M. Loor and T. Y. Wang, ABB 173, 564 (1976).
19. N. C. Kostraba and T. Y. Wang, JBC 250, 8938 (1975).
20. A. D. Riggs, S. Bourgeois, R. F. Newby and M. Cohn, JMB 34, 365 (1968).
21. E. M. Johnson, J. W. Hadden, A. Inoue and V. G. Allfrey, Bchem 14, 3873 (1975).
22. R. J. Britten and D. E. Kohne, Science 165, 349 (1969).
23. J. S. Krakow, in "Methods in Enzymology," Vol. 21: Nucleic Acids, Part D (L. Grossman and K. Moldave, eds.), p. 520. Academic Press, New York, 1971.
24. U. K. Laemmli, Nature 227, 680 (1970).
25. D. Suria and C. C. Liew, Can. J. Biochem. 52, 1143 (1974).
26. J. S. Sevall, A. Cockburn, M. Savage and J. Bonner, Bchem 14, 782 (1975).

V. Control of Translation

Structure and Function of the RNAs of Brome Mosaic Virus
PAUL KAESBERG

Effect of 5′-Terminal Structures on the Binding of Ribopolymers to Eukaryotic Ribosomes
S. MUTHUKRISHNAN, Y. FURUICHI, G. W. BOTH AND A. J. SHATKIN

Translational Control in Embryonic Muscle
STUART M. HEYWOOD AND DORIS S. KENNEDY

Protein and mRNA Synthesis in Cultured Muscle Cells
R. G. WHALEN, M. E. BUCKINGHAM AND F. GROS

Structure and Function of the RNAs of Brome Mosaic Virus

PAUL KAESBERG

Biophysics Laboratory of the Graduate School and Biochemistry Department, College of Agricultural and Life Sciences, University of Wisconsin, Madison, Wisconsin

I. Introduction

Almost all plant viruses known have either mRNA or an RNA having messenger sense as their genome. Among the questions that involve regulation of expression are: What functions are encoded in the viral RNA? How is this information put into a form suitable for translation? What are the biochemical interactions coupling RNA replication, pretranslational processing of messenger RNA, translation itself and posttranslational processing of products of translation?

Until recently, it was thought that all plant viral RNA is messenger RNA and that it is translated, in unmodified form, in susceptible cells. Failure to reproduce this translation in vitro was attributed to incompatibility of bacterial cell-free systems with RNA as foreign as that of plants, to inefficiency of animal cell-free systems, or to failure of cell-free systems from plants to synthesize anything. All this was true, but the problem is more basic. The principal product (indeed, the only identifiable product) is coat protein, and with some of the most prominent plant viruses, viral RNA is not its messenger. In the case of tobacco mosaic virus, coat-protein messenger is a small RNA molecule whose replication and life cycle are still obscure (1).

The breakthrough in the understanding of the translation of plant viral RNAs came from the development of an efficient cell-free protein-synthesizing system from plants (2, 3) and its use with viral RNA containing a translatable coat-protein cistron: the wheat-embryo protein-synthesizing system and brome-mosaic viral (BMV) RNA (4).

II. Structural Considerations

A. Properties of Brome Mosaic Virus

BMV consists of three virions, each carrying a portion of the genome. When we first characterized the BMV virions a decade ago, BMV was thought to be a small, simple virus quite similar to the RNA phages that had just been discovered. The virions contain 180 identical protein subunits, of MW 20,000, in a symmetrical array on a T = 3 lattice, giving a nearly spherical particle about 300 Å in diameter. The protein subunit is almost one-fifth alanine residues. The mass of the RNA in the virions is approximately 10⁶ daltons. However, when the viral RNA was isolated, it was found that there exist RNAs of mass 1.1 × 10⁶, 1.0 × 10⁶, 0.75 × 10⁶ and 0.28 × 10⁶ daltons, designated 1, 2, 3 and 4 in the order of decreasing mass. These four RNAs are contained in three virions designated H, M and L. Virion H contains RNA 1, virion M contains both RNA 3 and RNA 4, and virion L contains RNA 2. All the virions have the same complement of 180 identical coat-protein subunits. (For an extensive review, see ref. 5.)

All three virions are needed for infectivity (6). When the viral RNAs themselves are checked for infectivity, a remarkable result is obtained. RNAs 1, 2 and 3, together, are required for infectivity. Addition of RNA 4 neither enhances nor inhibits infectivity. Complementation studies show that RNA 3 contains the coat-protein cistron. Moreover, as the infectivity data would predict, RNAs 1, 2 and 3 are independent in sequence. However, RNA 4 is a part of the sequence of RNA 3 (7). Again this is in concert with the infectivity data, which demand that the RNA 4 sequences be preserved in one of the RNAs that contributes to infectivity.

Another important characteristic will be discussed more fully in Section III, C. All four RNAs are able to bind tyrosine when exposed to aminoacyl-tRNA synthetases under appropriate conditions (8).
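The packaging figures quoted above fit together arithmetically, as the following sketch shows. It uses only numbers given in the text plus the standard rule that an icosahedral capsid of triangulation number T has 60T subunits; the variable names and the derived RNA fraction are illustrative and not values reported by the author.

```python
# RNA masses (daltons) and virion contents as stated in the text.
RNA_MASS_DA = {1: 1.1e6, 2: 1.0e6, 3: 0.75e6, 4: 0.28e6}
VIRION_CONTENTS = {"H": (1,), "L": (2,), "M": (3, 4)}

SUBUNIT_MASS_DA = 20_000   # coat-protein subunit, from the text
TRIANGULATION_NUMBER = 3   # T = 3 lattice, from the text


def capsid_subunits(t_number):
    """Icosahedral capsids contain 60 * T subunits."""
    return 60 * t_number


def rna_mass_per_virion():
    """Total RNA mass carried by each of the three virions."""
    return {
        virion: sum(RNA_MASS_DA[rna] for rna in rnas)
        for virion, rnas in VIRION_CONTENTS.items()
    }


subunits = capsid_subunits(TRIANGULATION_NUMBER)
capsid_mass = subunits * SUBUNIT_MASS_DA

print(f"subunits per capsid: {subunits}")
print(f"capsid protein mass: {capsid_mass:.2e} daltons")
for virion, rna_mass in rna_mass_per_virion().items():
    fraction = rna_mass / (rna_mass + capsid_mass)
    print(f"virion {virion}: RNA {rna_mass:.2e} daltons, {fraction:.0%} of particle mass")
```

Each virion ends up with close to 10⁶ daltons of RNA: RNA 3 and RNA 4 together weigh about as much as RNA 1 or RNA 2 alone, which is consistent with their sharing virion M.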

B. Structure of BMV RNA 4

BMV RNA 4 is about 900 bases long, based on its electrophoretic mobility on polyacrylamide gels relative to marker RNAs. The value is subject to some uncertainty due both to uncertainties in the markers and in the basic assumptions underlying the method. Its translation in vitro shows that RNA 4 is a monocistronic messenger for the coat protein. The 5′ terminus contains 7-methylguanosine (m⁷G) (9). Ribosome binding experiments show that the coat-protein cistron is only 10 bases removed from the 5′ end of the RNA (10). Then follows the coding sequence for the coat protein, about 550 bases long. Finally there is an untranslated region, estimated to be about 300 bases long, terminating in -C-C-A. The primary structure and configuration of this portion of the RNA are somewhat tRNA-like because it can be joined to tyrosine by the action of tyrosine-tRNA synthetase. Indeed, it is possible to obtain, with ribonuclease T1, a fragment 160 bases long that contains the original 3′ terminus and that will accept tyrosine (11).
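The segment lengths given for RNA 4 can be tallied against the overall length estimate and against the size of the coat protein. The sketch below assumes that the figure of about 550 refers to bases in the coat-protein cistron, and it uses a mean amino acid residue mass of 110 daltons; both are assumptions made for this illustration, and all names are hypothetical.

```python
MEAN_RESIDUE_MASS_DA = 110.0  # assumed average mass of one amino acid residue

# Approximate segment lengths of RNA 4, in bases, as quoted in the text.
SEGMENTS = {
    "5' leader": 10,
    "coat-protein cistron": 550,   # assumed to be bases, not codons
    "3' untranslated region": 300,
}


def total_length(segments):
    """Sum of the segment lengths in bases."""
    return sum(segments.values())


def encoded_protein_mass(coding_bases):
    """Protein mass implied by a coding region (three bases per codon)."""
    codons = coding_bases // 3
    return codons, codons * MEAN_RESIDUE_MASS_DA


length = total_length(SEGMENTS)
codons, protein_mass = encoded_protein_mass(SEGMENTS["coat-protein cistron"])

print(f"sum of segments: {length} bases (gel estimate: about 900)")
print(f"codons: {codons}, implied protein mass: {protein_mass:.0f} daltons")
```

The segments add up to somewhat under the gel estimate of 900 bases, within the uncertainty the author notes, and roughly 180 codons correspond to a protein of about 20,000 daltons, the size given for the coat-protein subunit.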


C. Structure of BMV RNA 3

BMV RNA 3 is about 2200 nucleotides long. Complementation studies show that it contains the coat-protein cistron. The lack of participation of RNA 4 in infectivity together with sequence analyses led to the conclusion that the complete RNA 4 sequence exists in RNA 3. Since the large oligonucleotides we have obtained from RNA 4 exist in only one copy in RNA 3, it follows that the RNA 4 sequence and the coat-protein cistron reside in a 3′ portion of RNA 3. RNA 3 also contains m⁷G, but this is at its 5′ terminus. RNA 3 serves as a messenger for a protein (of unknown function) we call protein 3a, but it does not induce synthesis of coat protein (see Section III, A). Protein 3a is not related in sequence to coat protein. Thus RNA 3 begins with a blocked 5′ terminus, presumably followed by a ribosome binding site and a protein 3a cistron. Next come an intercistronic region, the coat-protein cistron and the rest of RNA 4. RNA 3 is thus a dicistronic messenger in which one of the cistrons is cryptic.

D. Structure of BMV RNAs 1 and 2

RNAs 1 (~3300 bases) and 2 (~3100 bases) are monocistronic messengers for proteins 1a and 2a, respectively. These proteins are dissimilar in sequence. Their function is unknown. RNAs 1 and 2 have m⁷G at their 5′ termini and, except for two bases and one base, respectively, have sequences identical to that of RNA 4 extending at least 160 bases from their 3′ termini. The sequences differ, of course, in the region of their cistrons.
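The base counts quoted for the four RNAs can be compared with their molecular masses by a simple conversion. The sketch below assumes an average mass of 330 daltons per nucleotide residue, a common rule of thumb that is not taken from the text; the names are hypothetical.

```python
AVERAGE_NUCLEOTIDE_MASS_DA = 330.0  # assumed mean mass of one RNA residue

# Masses (daltons) and approximate lengths (bases) as quoted in the text.
RNA_MASS_DA = {1: 1.1e6, 2: 1.0e6, 3: 0.75e6, 4: 0.28e6}
QUOTED_LENGTH_BASES = {1: 3300, 2: 3100, 3: 2200, 4: 900}


def estimated_length(mass_da):
    """Convert an RNA mass to an approximate number of bases."""
    return mass_da / AVERAGE_NUCLEOTIDE_MASS_DA


for rna, mass in RNA_MASS_DA.items():
    estimate = estimated_length(mass)
    quoted = QUOTED_LENGTH_BASES[rna]
    deviation = (estimate - quoted) / quoted
    print(f"RNA {rna}: about {estimate:.0f} bases from mass, "
          f"{quoted} quoted ({deviation:+.0%})")
```

With this conversion factor the masses and the quoted lengths agree to within a few percent, which is well inside the uncertainty of gel-based size estimates.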

III. Regulation of Translation

We have noted, above, that the BMV RNAs comprise three monocistronic messengers, RNA 1, RNA 2 and RNA 4, and a dicistronic messenger, RNA 3, which contains an "eager" cistron and a "reluctant" cistron. We also have found that all four RNAs have the same distinctive 5′ terminus and all four have practically the same sequence extending for 160 bases from their 3′ termini. What do these features have to do with regulation of translation of these RNAs? Some of these things can be brought into focus by in vitro studies, and these I discuss in the sections that follow.

A. Messenger Competition Experiments

We know from complementation experiments that RNA 3 contains the coat-protein cistron. However, when it is used as a messenger in the wheat-embryo system, the principal product is protein 3a, unrelated to the coat protein. Sometimes we can detect a minor electrophoretic band corresponding to coat protein. We had thought that this was translated from RNA 3. However, we find, from competition experiments involving various combinations of RNA 3 and RNA 4, that the existence of even a small admixture of RNA 4 has more than a proportional effect in causing synthesis of coat protein. Thus we cannot be sure that we have ever observed synthesis of coat protein translated from cistron 3b of RNA 3. The appearance of coat-protein synthesis may merely reflect the contamination of RNA 3 with a small amount of RNA 4.

In the competition experiments, RNA 4 was mixed with RNA 3 in various proportions, and the synthesized products were analyzed on gels (12). RNA 4, even at a fraction of the molar concentration of RNA 3, caused substantial inhibition of the translation of protein 3a. When an equal molar amount of RNA 4 was added to the reaction mixture, the synthesis of protein 3a was barely detectable even though RNA 3 was present at its optimal amount for synthesis in the absence of RNA 4. The main product was coat protein. The inhibition of the translation of RNA 3 by RNA 4 is probably not caused by the coat protein synthesized from RNA 4 because the addition to the incorporation mixture of BMV coat-protein in large amounts did not at all affect the activity of the RNA.

RNA 4 also affects translation of RNA 1 but less markedly. At an optimal concentration of RNA 1, an equal molar amount of RNA 4 does not inhibit the translation of RNA 1. Normal amounts of protein 1a are synthesized. For obtaining substantial inhibition, a ratio of two-to-three moles of RNA 4 to one mole of RNA 1 was required. At a ratio of four to one, a nearly complete inhibition of the synthesis of protein 1a was observed.

These results provide a scenario of what might take place in vivo. Infection with BMV requires the presence of BMV RNAs 1, 2 and 3 at a site in a susceptible cell. Since RNA 4 is not needed for infectivity, we may assume that its presence is not required. Translation commences, a viral replicase is synthesized and RNA replication occurs. In some way, RNA 4 is made as well as the other three RNAs, but perhaps not immediately, because it derives from RNA 3. Translation of RNAs 1, 2 and 3 continues for a while, unencumbered except by the effect of each on the others. Gradually the amount of RNA 4 builds up and its translation becomes preferred. Coat protein is made in abundance, encapsidation of the RNAs occurs, resulting in the formation of virions and removal of the RNAs from the translating milieu.

B. Effect of the 5′ Structure on Translation Efficiency

BMV RNA 4 was chemically modified with sodium periodate and aniline to remove the terminal m⁷G structure, and the modified RNA was tested for messenger activity. In order to obtain better completion of the chemical modification reactions, the RNAs were treated with large excess amounts of modifying agents. Chromatographic analyses indicated that the reaction conditions used were effective for complete removal of m⁷G. The RNAs were intact after the chemical treatment as judged by polyacrylamide gel electrophoresis analyses.

The abilities of modified and unmodified RNA 4 to induce leucine incorporation were compared. At a concentration of 10 μg/ml, modified RNA 4 induces a small amount of incorporation, about 12% of that induced by unmodified RNA 4. However, at a concentration of 50 μg/ml, which is near the saturation concentration of normal RNA 4, modified RNA 4 can induce an incorporation about half that induced by unmodified RNA 4. Data for the two RNA 4 samples at various RNA concentrations show that incorporation increases with increasing concentration for both the modified and unmodified RNA, but the amounts of incorporation from modified RNA 4 are lower in all cases. The products synthesized from modified RNA 4 are normal BMV coat proteins.

The ribosome binding efficiencies of modified and unmodified RNA 4 were compared. The reaction mixtures were the same as the amino acid incorporation assay mixture except that the antibiotic anisomycin was added to prevent peptide chain elongation. After incubation, the binding reaction mixtures were cooled, applied to 10 to 40% sucrose density gradients, and centrifuged to reveal ribosome patterns. These showed mostly monosomes and disomes. The ratios of the binding efficiencies of the two forms of RNA at several concentrations corresponded very well to the amino-acid incorporation data. With modified RNA 4 as messenger, large polyribosomes were formed if anisomycin was omitted from the binding reaction mixture. This indicates again that translation can take place with modified RNA.

It is not clear how m⁷G is involved in RNA translation. However, we know that the terminal dinucleotide m⁷GpppGp by itself is not sufficient for ribosome recognition, as this structure does not bind to wheat-embryo ribosomes. Possibly m⁷G serves to protect the RNA from exonucleolytic degradation.

C. Function of the 3′-Terminal Regions of the BMV RNAs

These ends of the BMV RNAs serve as a substrate for tyrosyl-tRNA synthetase and can be charged with tyrosine. Is this structural feature in some way involved with translation? We believe not. The modifying reaction described in the preceding section also affects the 3′ end of RNA 4 and destroys its ability to bind tyrosine. Nevertheless, translation can occur, although at a reduced rate as we have stated. However, even this differential effect on translation is absent if the 3′ end (and the 5′ end as well) is modified (to destroy tyrosine binding) but not treated with aniline (13). With such treatment, RNA 4 is fully active in translation. Whatever role tyrosine binding may play is not revealed by our in vitro experiments.

Whatever the function of the 3′ region of the BMV RNAs, it requires a precise structure. RNAs 1, 2 and 4 are quite independent RNAs in terms of their susceptibility to mutation. Nevertheless their 3′ sequences have remained remarkably similar. Two possible functions, applicable to all the BMV RNAs, merit careful attention, one being that the structures serve as a recognition or priming site for the BMV RNA replicating machinery, the other that these ends are involved in the initial events in virion assembly.

IV. Summary

Brome mosaic virus (BMV) is a small, multicomponent plant virus. It consists of three virions, designated H, L and M, which contain RNAs 1, 2, and 3-plus-4, respectively. These RNAs have molecular weights of 1.1 × 10^6, 1.0 × 10^6, 0.75 × 10^6 and 0.28 × 10^6, respectively. RNAs 1, 2 and 4 are monocistronic messengers for proteins of molecular weight approximately 110,000, 105,000 and 20,000; the latter is BMV coat protein. RNA 3 is a dicistronic messenger; it encodes a protein of MW 35,000 and also coat protein. All four RNAs are excellent messengers in cell-free extracts derived from wheat embryo. However, only the cistron for the 35,000 MW protein is translated efficiently from RNA 3. All four RNAs have m7GpppG at the 5’ terminus, and all four have nearly identical sequences extending for 160 nucleosides from their 3’ terminus. The latter region is somewhat resistant to nucleases and can be obtained intact by partial digestion of the RNAs with ribonuclease T1. These 160-nucleoside fragments, as well as the RNAs from which they were obtained, are chargeable with tyrosine in reactions catalyzed by wheat-germ tRNA synthetases.

ACKNOWLEDGMENTS

This work was supported by U.S. Public Health Service Grants AI-01466 and AI-21,942 from the National Institute of Allergy and Infectious Diseases and CA15,613 from the National Cancer Institute and by contract AT-(11-1)-1633 from the Biology Division of the Energy Research and Development Administration. I am grateful for the collaboration and encouragement of Drs. Marcel Bastin, Ranjit Dasgupta, Jeffrey Davies, Ding Shih and Wlodek Zagorski.

REFERENCES

1. J. Knowland, T. Hunter, T. Hunt and D. Zimmern, in “In Vitro Translation and Transcription of Viral Genomes” (L. Haenni and G. Beaud, eds.), p. 211 (1975).
2. A. Marcus, B. Luginbill and J. Feeley, PNAS 59, 1243 (1968).
3. W. H. Klein, C. Nolan, J. M. Lazar and J. M. Clark, Jr., Bchem 11, 2009 (1972).
4. D. S. Shih and P. Kaesberg, PNAS 70, 1799 (1973).
5. L. C. Lane, Adv. Virus Res. 19, 151 (1974).
6. L. C. Lane and P. Kaesberg, Nature NB 232, 40 (1971).
7. D. S. Shih, L. C. Lane and P. Kaesberg, JMB 64, 353 (1972).
8. T. C. Hall, D. S. Shih and P. Kaesberg, BJ 129, 969 (1972).
9. R. Dasgupta, F. Harada and P. Kaesberg, J. Virol. 18, 260 (1976).
10. R. Dasgupta, D. S. Shih, C. Saris and P. Kaesberg, Nature 256, 624 (1975).
11. M. Bastin, R. Dasgupta, T. C. Hall and P. Kaesberg, JMB 103, 737 (1976).
12. D. S. Shih and P. Kaesberg, JMB 103, 77 (1976).
13. D. S. Shih, P. Kaesberg and T. C. Hall, Nature 249, 353 (1974).


Effect of 5’-Terminal Structures on the Binding of Ribopolymers to Eukaryotic Ribosomes

S. MUTHUKRISHNAN, Y. FURUICHI, G. W. BOTH AND A. J. SHATKIN

Roche Institute of Molecular Biology
Nutley, New Jersey

The presence of the methylated 5’-terminal “cap” sequence, m7GpppN’m-N’’m, in most but not all eukaryotic cellular and viral mRNAs has stimulated numerous investigations concerning its possible role(s) in mRNA function (1).¹ In particular, the m7G portion of the cap has been implicated in an initiation step in the cell-free translation of reovirus, vesicular stomatitis virus (2, 3), tobacco and alfalfa mosaic virus (4, 5), globin (4, 6) and Artemia salina mRNAs (7). If the cap structure is an important structural signal for mRNA recognition by ribosomes and/or initiation factors (8), its presence in polyribonucleotides would be expected to alter the ribosome-binding properties of these polymers. Since picornavirus RNA (9-11) and satellite tobacco necrosis virus RNA are uncapped (12) but function efficiently as messengers (5, 13, 14), initiation of protein synthesis by some mRNAs apparently depends upon nucleotide sequence, and not upon 5’-terminal m7G.

In order to determine how cap structures and base compositions of ribopolymers affect their ability to interact with ribosomes, we have synthesized and studied a variety of capped and uncapped polymers. The polymers were synthesized with polynucleotide phosphorylase under primer-dependent conditions with 32P-labeled GpppG-C, ppG-C(125I), [methyl-3H]m7GpppGm-C (m7G-cap) and its ring-opened m7G derivative, m7G*pppGm-C. The products ranged in size from 100 to 200 nucleotides (15). They were tested at low concentrations (0.3-0.6 µg/ml) in either wheat-germ extract (S23) or rabbit-reticulocyte lysate under optimal ionic conditions for eukaryotic mRNA binding. As shown in Table I, the presence of the methylated cap, but not the ring-opened m7G-cap derivative, improves the polymer binding efficiency (measured

¹ See papers in Section I of this volume.

TABLE I
[Ribosome binding of ribopolymers.ᵃ Columns: 5’ terminus; polymer; % input cpm bound in wheat germ (40 S, 80 S) and in reticulocyte lysate (40 S, 80 S). The individual entries cannot be aligned with their rows and are not reproduced here.]

ᵃ Ribosome binding in wheat-germ extract (S23) was measured by incubating 1 pmol of each polymer in a 100-µl assay for 10 minutes at 23°C (15). For the rabbit reticulocyte lysates, 50-µl assays containing 0.6 pmol of ribopolymer were incubated at 30°C. The mixtures were then analyzed by velocity sedimentation in glycerol gradients. Fractions were collected and counted directly (15). Specific activities were: 3H-labeled m7GpppGm-C- and m7G*pppGm-C-primed polymers = 67*i0 cpm/pmol; 32P-labeled GpppG-C-primed polymers = 1200 cpm/pmol; ppG-C(125I)-primed polymers = 5000 cpm/pmol.

as percentage of input cpm bound). For example, the extent of binding of the uncapped U homopolymer or GpppG-C-(U)n to 40 S ribosomal subunits increases from 1 and 5%, respectively, to 38% when the m7G-cap is present. However, some polymers failed to bind in a stable fashion even when they had a methylated cap [m7GpppGm-C-(C)n, for example]. Further, some capped polymers, including m7GpppGm-C-(C)n, -(U,C)n and -(A,C)n, formed complexes only with 40 S subunits and were not converted to 80 S complexes, suggesting that structural features in addition to caps influence the extent and/or nature of binding to ribosomes.

Further inspection of the tabulated data reveals that only those polymers rich in both A and U are found in 80 S complexes. A good correlation exists between the A-plus-U content of a particular polymer and its efficiency of binding in either the wheat-germ or reticulocyte system. Qualitative differences exist for individual polymers tested in the two systems; for example, m7GpppGm-C-(A)n binds only in the reticulocyte lysate and m7GpppGm-C-(U)n in wheat germ. (A-plus-U)-rich polymers show significant levels of binding even when they do not contain 5’-terminal methylated caps, but in each case the extent of binding is increased by the presence of a methylated cap.

In addition to an effect on the extent of ribopolymer binding to ribosomes, the presence of the methylated cap increases the rate of binding. The kinetics of binding of m7GpppGm-C-(A2,U2,G)n and ppG-C-(A2,U2,G)n are shown in Fig. 1. These two polymers presumably have the same average base composition but differ in their 5’-terminal structure. In the reticulocyte lysate, the methylated polymer binds to ribosomes with a t1/2 of ~1 minute as compared with the t1/2 of ~3 minutes for the uncapped polymer. Furthermore, the steady-state binding level of the capped polymer is twice that of its uncapped counterpart.

[Figure 1. Abscissa: time (minutes).]

FIG. 1. Ribosome binding of capped and uncapped (A2,U2,G)n. Polymers were assayed in reticulocyte lysates as described in the footnote of Table I. At the indicated times, samples were analyzed by gradient centrifugation, and the percentage of input cpm sedimenting in the 80 S region was determined.

The increase in binding of polymers with methylated caps may be due to a greater affinity for ribosomes or a slower rate of dissociation of ribosome complexes that contain methylated vs. unmethylated ribopolymers. This effect may be mediated by an m7G-specific protein factor (8). We favor a model of ribosome-mRNA interaction that is multivalent: (a) recognition of the cap; (b) interaction of the (A-plus-U)-rich sequence with some component(s) of ribosomes, either RNA or protein; and (c) association of initiator Met-tRNA with the AUG codon. Each of these interactions may either position the mRNA on the ribosome or stabilize the ribosome-mRNA complex. Their combined interactions may determine the overall efficiency of initiation. When one or more of these features is absent, the mRNA may function with a lower efficiency. Thus, each interaction may be more or less critical, depending upon the strength of the others. It should be possible for future studies to generate ribopolymers with a range of ribosome binding efficiencies by suitably varying the contribution of each of these components.
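The kinetic comparison quoted above (a t1/2 of about 1 minute for the capped polymer against about 3 minutes for the uncapped one, with a twofold difference in steady-state binding) can be illustrated with a short calculation. The sketch below is only an illustration and not the authors' analysis: it assumes a simple first-order approach to a plateau, which the text does not state, and the plateau values (2.0 and 1.0, in arbitrary units) are hypothetical numbers chosen only to reproduce the reported twofold ratio.

```python
# Illustrative sketch only. ASSUMPTION: binding approaches its plateau with
# first-order kinetics; the chapter reports half-times and a twofold plateau
# ratio but does not state a kinetic model.

def fraction_bound(t_min, t_half_min, plateau):
    """Bound signal at time t_min (minutes) for a first-order rise to `plateau`."""
    return plateau * (1.0 - 2.0 ** (-t_min / t_half_min))

# Half-times are those quoted in the text; plateaus are hypothetical
# arbitrary units chosen so that capped/uncapped = 2 at steady state.
CAPPED = {"t_half_min": 1.0, "plateau": 2.0}
UNCAPPED = {"t_half_min": 3.0, "plateau": 1.0}

def binding_ratio(t_min):
    """Ratio of capped to uncapped polymer bound at time t_min."""
    capped = fraction_bound(t_min, **CAPPED)
    uncapped = fraction_bound(t_min, **UNCAPPED)
    return capped / uncapped

if __name__ == "__main__":
    for t in (1.0, 3.0, 10.0, 60.0):
        print(f"t = {t:5.1f} min  capped/uncapped = {binding_ratio(t):.2f}")
```

Under these assumptions the apparent advantage of the capped polymer is largest at early times and relaxes toward the twofold steady-state value, which is why a single early time point would overstate the difference in final binding level.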

REFERENCES

1. A. J. Shatkin, Cell (1976). In press.
2. G. W. Both, A. K. Banerjee and A. J. Shatkin, PNAS 72, 1189 (1975).
3. G. W. Both, Y. Furuichi, S. Muthukrishnan and A. J. Shatkin, Cell 6, 185 (1975).
4. E. D. Hickey, L. E. Weber and C. Baglioni, PNAS 73, 19 (1976).
5. R. Roman, J. D. Brooker, S. N. Seal and A. Marcus, Nature (London) 260, 359 (1976).
6. S. Muthukrishnan, G. W. Both, Y. Furuichi and A. J. Shatkin, Nature (London) 255, 33 (1975).
7. S. Muthukrishnan, W. Filipowicz, J. M. Sierra, G. W. Both, A. J. Shatkin and S. Ochoa, JBC 250, 9336 (1975).
8. W. Filipowicz, Y. Furuichi, J. M. Sierra, S. Muthukrishnan, A. J. Shatkin and S. Ochoa, PNAS 73, 1559 (1976).
9. A. Nomoto, Y. F. Lee and E. Wimmer, PNAS 73, 375 (1976).
10. M. J. Hewlett, J. K. Rose and D. Baltimore, PNAS 73, 327 (1976).
11. R. Fernandez-Munoz and J. E. Darnell, J. Virol. 18, 719 (1976).
12. J. Lesnaw and M. Reichman, PNAS 66, 140 (1970).
13. L. Villa-Komaroff, N. Guttman, D. Baltimore and H. F. Lodish, PNAS 72, 4157 (1975).
14. B. F. Oberg and A. J. Shatkin, PNAS 69, 3589 (1972).
15. G. W. Both, Y. Furuichi, S. Muthukrishnan and A. J. Shatkin, JMB 104, 637 (1976).

Translational Control in Embryonic Muscle

STUART M. HEYWOOD AND DORIS S. KENNEDY

Genetics and Cell Biology Section
The University of Connecticut
Storrs, Connecticut

Differentiating muscle in embryonic chick undergoes rapid and distinct morphological changes. The molecular events both determining and accompanying the changes are not well understood. We have been attempting to study the mechanisms involved in gene expression of differentiating muscle cells for a number of years. Our approach has been to isolate and characterize the mRNAs found in muscle and to study both their transcription and their translational properties. Myosin mRNA, coding for the myosin heavy chain of 200,000 daltons, has been our major probe into these investigations. It has been isolated from both polysomes (1) and more recently from nonribosomal bound ribonuclear protein particles (mRNPs) (2). The occurrence of stored myosin mRNA in nondifferentiated muscle cells was first suggested by Buckingham et al. (3), Yaffe (4) and Chacko (5). We have found that in embryonic chick leg muscle the largest amount of stored myosin mRNPs is present in 13-day embryos and that this decreases as development proceeds. In contrast, the amount of myosin mRNA found in polysomes increases from day 14 to day 19 of development (6).

Lodish (7, 8) has suggested that translation is controlled simply by the efficiency of the mRNA in itself to initiate protein synthesis. This degree of efficiency results from the structural aspects of the molecule. In this manner, “better” mRNAs would have a higher competitive advantage over “poorer” mRNAs. These “poorer” mRNAs may even be nonassociated with ribosomes under conditions when the cell is not actively synthesizing protein. The existence of mRNA not associated with ribosomes is well documented (9-11). Although it has not been demonstrated that all stored messengers are precursors to polysomal messengers, it is clear that, in the case of myosin mRNA, the stored and polysomal forms are identical as determined by hybridization experiments using myosin cDNA (12) and the fact that both have the same translational efficiency in either the reticulocyte or the wheat-germ cell-free systems (unpublished results). Therefore, it is unlikely that structural alterations occur between the stored (poorer?) and the polysomal (better?) forms of myosin mRNA. Thus, the simple efficiency hypothesis of Lodish (7) does not seem to be applicable in this case, and other factors may be involved in altering the efficiency of translation of mRNAs in eukaryotes.

We have suggested that in embryonic muscle the mechanism by which certain mRNAs are maintained in the inactive form in the cytoplasm is by complexing with a small oligo(U)-containing RNA that we have termed “translational control” RNA (tcRNA). A model concerning the manner by which the interaction of tcRNA and mRNA maintains stored mRNAs in an inactive state has been published (13). This model is based on our observations that poly(A)-containing mRNPs have a U-rich RNA molecule that is small, that contains an oligo(U)-rich region capable of hybridization to poly(A), and that inhibits the translation of the poly(A)-containing mRNAs isolated from the same mRNPs. A similar RNA molecule, but lacking the U-rich region, can be isolated from muscle polysomes. This polysomal tcRNA does not inhibit the translation of mRNA and, in fact, gives a small but reproducible stimulatory effect on mRNA translation.

A number of predictions resulting from our model (13) concerning tcRNA regulation of protein synthesis have been experimentally tested. If mRNP tcRNA, but not polysomal tcRNA, contains an oligo(U) region and this U-rich region is required to hybridize with the poly(A) of mRNA in order to inactivate the messenger, then the two classes of tcRNA should have different molecular weights and deadenylylated mRNA should be unaffected by mRNP tcRNA. It has been demonstrated that polysomal tcRNA migrates in front of mRNP tcRNA on dodecyl sulfate/polyacrylamide gel electrophoresis (14).
Moreover, the specific activity of mRNP tcRNA, when labeled with [3H]uridine, is approximately twice that of polysomal tcRNA, confirming the differences in the number of U’s in these molecules. These results are consistent with, but certainly are not proof of, the hypothesis that polysomal tcRNA arises from mRNP tcRNA by removal of the oligo(U) region. In addition, when either purified myosin mRNA or mRNA obtained from small mRNPs is deadenylylated, the translation of these messengers is not inhibited by tcRNA preparations previously active in inhibiting the translation of these same poly(A)-containing messengers. Therefore, as suggested by our model (13), the poly(A) segment of mRNA may be required for tcRNA inhibition of messenger translation as a result of its interaction with the oligo(U) region of mRNP tcRNA.

In addition to the oligo(U)-rich portion, tcRNA may contain a region that can specifically recognize a series of bases near the 5’-OH end of mRNA. If such is the case, tcRNA should be specific in its interaction with different messengers. By using different mRNPs, those containing myosin mRNA and those sedimenting at less than 40 S from muscle, a degree of specificity has been shown between the tcRNA and mRNA isolated from the same ribonuclear protein particle (14). This interaction has been demonstrated both by the ability of each tcRNA to specifically inhibit the translation of its respective mRNA and by its ability to cause structural changes in that mRNA during hybridization. The degree of nuclease protection of mRNA induced by the interaction with tcRNA is somewhat surprising. The amount of RNase resistance in the complex is much greater than can be accounted for by simple intermolecular hybrid formation (14). Therefore, the circularization of mRNA by tcRNA as suggested by our model (13) is probably an oversimplification.

It is clear from the above discussion that mRNP tcRNA can be characterized by (a) its location in mRNPs, (b) its oligo(U) region capable of forming hybrids with poly(A), (c) its ability to inhibit the translation of certain messengers, and (d) its capacity to alter the structure of mRNAs, thereby increasing their nuclease resistance. Utilizing these properties, it has been possible for us to isolate myosin tcRNA from myosin mRNPs and to purify it to the extent that it appears as a single band on formamide/acrylamide gel electrophoresis (15). The stoichiometry of the interaction of mRNA and this highly purified tcRNA can now be determined. Our model (13) predicts that this interaction will occur at a one-to-one mole ratio.

In order to assure that the mRNP tcRNA did not arise from the degradation of larger RNA molecules found in the myosin mRNP fraction of the sucrose gradients, a number of controls were performed (15).
The ability of the oligo(U) portion of the molecule to form T1 and T2 RNase-resistant hybrids with [3H]poly(A) allows for a rapid and simple screening procedure for the presence of putative mRNP tcRNA. Utilizing this procedure (13) we have ascertained that myosin tcRNA is not a fragment of myosin mRNA. Neither T1 RNase fragments nor fragments of myosin mRNA produced by limited alkaline hydrolysis are capable of forming nuclease-resistant hybrids with [3H]poly(A). Furthermore, when [3H]uridine-labeled myosin mRNA was added to the myosin mRNA fraction obtained from sucrose gradients and subsequently dialyzed in EDTA buffer, no degradation of myosin mRNA resulted, thereby indicating that myosin tcRNA is not a product of myosin mRNA degradation. Myosin mRNPs may be separated from cosedimenting ribosomes by affinity chromatography on poly(U)-Sepharose. When mRNPs are purified by two passes through the poly(U)-Sepharose, no tcRNA can be recovered from the ribosome-containing fraction. In addition, rRNA was not found to contain regions that would substantially protect [3H]poly(A) from nuclease digestion after hybridization, as is the case with mRNP tcRNA. Therefore, it appears unlikely that tcRNA arises from rRNA breakdown.

From these results, we suggest that tcRNA is not an artifact produced during isolation, but that mRNP tcRNA is a bona fide entity controlling the utilization of messengers in embryonic muscle cells. Therefore, further purification of this oligo(U)-rich molecule to establish its stoichiometric relationship to mRNA was undertaken.

Our previous results (13, 14) dealt with tcRNA preparations obtained by EDTA buffer dialysis and subsequent chromatography on DEAE-cellulose. The dialysis procedure may be circumvented by chromatography on Sephadex G-50 (Pharmacia) utilizing the same EDTA buffer (15). Myosin tcRNA can be further purified by chromatography of the DEAE-cellulose fraction on Dowex AG1-X2 (Bio-Rad). As shown in Fig. 1, two distinct peaks (A, fractions 20-22; C, fractions 79-80) as well as a diffuse peak (B, fractions 58-71) are obtained. (For details of chromatography, see 15.) By testing the ability of these fractions to protect [3H]poly(A) from T1 and T2 RNase digestion, fraction C can be shown to contain an oligo(U)-rich area. In addition, if the tcRNA material obtained by DEAE-cellulose chromatography is applied to an oligo(dA)-cellulose column, the material eluting in low salt is found to be only in fractions B and C after subsequent chromatography on the Dowex column (15). Further analysis indicates that only fraction C from the Dowex column is capable of interacting with and inhibiting the translation of myosin mRNA. Therefore, it is likely that fraction C contains myosin tcRNA.

[Figure 1. Abscissa: fraction number.]

FIG. 1. Dowex chromatography of myosin mRNP tcRNA. DEAE-cellulose preparations of tcRNA were prepared from myosin mRNPs as previously described (14). After ethanol precipitation, the RNA precipitate was dissolved in neutralized H2O and applied to a 0.5 cm × 4 cm Dowex column (Bio-Rad AG1-X2). After the column was washed with 8 ml H2O, the RNA was rapidly eluted using a 0.01 to 1.0 N HCl gradient. The column was run at 2°C, and chromatography was completed in 20 minutes. After elution the samples were precipitated in ethanol with 0.24 M NH4 acetate.

Myosin mRNA has a molecular weight of approximately 2 × 10^6. This is reasonable for an mRNA coding for a protein of 200,000 molecular weight. Such a molecule must contain approximately 6000 coding nucleotides, a poly(A) tail, and nonreading portions at both the 5’-OH and 3’-OH ends of the molecule. In order to determine the molecular weight of myosin tcRNA, five muscle cell cultures (8 × 10^6 cells each) were labeled with 32P for 4 hours before extensive cell fusion had occurred. Radioactively labeled mRNP tcRNA was obtained from these cultures by Dowex chromatography in the presence of carrier tcRNA from muscle tissue. Formamide/acrylamide gel analysis of the purified tcRNA indicated its molecular weight to be 10,000 (15). This molecular weight was used for determining the stoichiometry in the interaction of myosin mRNA and myosin mRNP tcRNA (assuming a molecular weight of 2 × 10^6 for myosin mRNA).

The data presented in Table I illustrate the stoichiometric relationship of myosin mRNP tcRNA with myosin mRNA as measured by the increase in T1 and T2 RNase resistance after hybridization. Increasing amounts of added tcRNA increase the nuclease resistance until a one-to-one mole ratio of mRNP tcRNA and mRNA is reached. Further increases in amounts of tcRNA do not result in additional nuclease resistance. These data suggest that one molecule of myosin tcRNA induces considerable structural changes in one molecule of myosin mRNA. Such a 3- to 4-fold increase in RNase resistance could arise from bringing two distal portions of the messenger into close proximity and in so doing allowing additional areas of the mRNA to form double-stranded regions that would otherwise be unstable.

TABLE I^a

Mole ratio tcRNA/mRNA    RNase resistance
0                        5.1
0.25                     7.6
0.5                      11.2
1.0                      18.3
2.0                      19.1
16.0                     17.8

^a Hybridization reactions and nuclease digestion were performed and analyzed as previously described (14).

Further experiments designed to study the stoichiometric relationship between myosin tcRNA and myosin mRNA are shown in Table II. As can be seen, when increasing amounts of myosin tcRNA are premixed with the messenger, a progressive decrease in translation is observed when the mixture is added to a cell-free amino-acid-incorporating system. Again, when a one-to-one mole ratio is achieved, the translation of myosin mRNA is completely inhibited. The results of this experiment and those presented in Table I indicate that the interaction of the messenger and its tcRNA occurs in a stoichiometric fashion, not catalytically.

A number of our predictions about the interaction of messenger and tcRNA have been shown to be correct. Therefore, the model dealing with tcRNA regulation of protein synthesis (13) is strengthened. However, questions concerning (a) the precise manner of the interaction between mRNA and tcRNA, (b) the synthesis and precursor-product relationship of the molecules, (c) the degree of specificity of the reactions, (d) the role of mRNP proteins on the activity of the tcRNA, and (e) the function of polysomal tcRNA remain to be answered.

We have suggested that the translation of mRNA can be modulated by the presence of specific initiation factors (16-18).
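The stoichiometric argument built on Tables I and II can be followed with a few lines of arithmetic. The sketch below is an illustration only: the molecular weights are those quoted in the text (2 × 10^6 for myosin mRNA, 10,000 for myosin tcRNA), the table values are as read from Tables I and II, and the function names are hypothetical helpers introduced here, not part of the original analysis.

```python
# Illustrative arithmetic only; helper names are hypothetical.
MRNA_MW = 2.0e6    # myosin mRNA, molecular weight assumed in the text
TCRNA_MW = 1.0e4   # myosin mRNP tcRNA, from formamide/acrylamide gels (15)

# Table I: mole ratio tcRNA/mRNA -> T1 and T2 RNase resistance (as read)
RNASE_RESISTANCE = {0.0: 5.1, 0.25: 7.6, 0.5: 11.2, 1.0: 18.3, 2.0: 19.1, 16.0: 17.8}

# Table II: mole ratio tcRNA/mRNA -> myosin synthesized (cpm, as read)
MYOSIN_CPM = {0.0: 1920, 0.1: 1430, 0.3: 810, 0.7: 410, 1.0: 105, 8.0: 75}

def mass_ratio_at_equimolarity():
    """Mass of tcRNA per unit mass of mRNA at a one-to-one mole ratio."""
    return TCRNA_MW / MRNA_MW

def fold_increase(table, mole_ratio):
    """Value at `mole_ratio` relative to the value with no tcRNA added."""
    return table[mole_ratio] / table[0.0]

def percent_inhibition(table, mole_ratio):
    """Percent decrease relative to the value with no tcRNA added."""
    return 100.0 * (1.0 - table[mole_ratio] / table[0.0])

if __name__ == "__main__":
    print(f"tcRNA/mRNA mass ratio at 1:1 = {mass_ratio_at_equimolarity():.3f}")
    print(f"RNase resistance, fold increase at 1:1 = "
          f"{fold_increase(RNASE_RESISTANCE, 1.0):.1f}")
    print(f"Translation, percent inhibition at 1:1 = "
          f"{percent_inhibition(MYOSIN_CPM, 1.0):.1f}")
```

On these figures a one-to-one mole ratio corresponds to only 0.005 mass units of tcRNA per mass unit of mRNA, the RNase resistance rises between 3- and 4-fold, and translation falls by roughly 95%, with little further change at higher ratios in either table.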
TABLE II
STOICHIOMETRIC RELATIONSHIP BETWEEN MYOSIN tcRNA AND MYOSIN mRNA AS DETERMINED BY THE INHIBITION OF MYOSIN mRNA TRANSLATION^a

Mole ratio tcRNA/mRNA    Myosin (cpm)
0                        1920
0.1                      1430
0.3                      810
0.7                      410
1.0                      105
8.0                      75

^a Conditions for cell-free protein synthesis and determination of myosin synthesis have been described (14).

Initiation factor three (IF3) is a large, complex protein made up of 10-12 subunits not present in stoichiometric amounts (19). The heterogeneity of this factor may represent its heterogeneity in function, as we have reported. In this manner, a nonspecific IF3 could be competed for by all cellular messengers, as predicted by Lodish (7), or a modulator protein could become associated with a number of the IF3 molecules, thereby making those IF3 units specific for an mRNA or class of mRNAs. This would in turn allow a competitive advantage to these messengers. Additional evidence in support of this hypothesis has been obtained by translating equimolar concentrations of myosin and globin mRNAs in a wheat-germ cell-free system (Table III). Both myosin and globin messengers are translated when added to the incubation mixtures. However, when IF3, obtained from native 40 S ribosomal subunits of embryonic chick muscle, is added, a preferential increase in myosin mRNA translation is observed. On the other hand, when a similar preparation of IF3 obtained from rabbit reticulocyte is added, an increase in the translation of both messengers is observed. The larger stimulation of myosin synthesis compared to globin synthesis in this case is probably a result of a nonspecific enhancement of translation of a less efficient mRNA by the addition of a limiting factor, as predicted by Lodish’s hypothesis (7). Therefore, the specific effect of muscle IF3 on the translation of myosin mRNA can be explained only by a portion of these factors being specific for myosin mRNA. In all probability, the specificity will not ultimately be found to be messenger-specific, but to involve classes of mRNA. It will be of interest to determine eventually whether these classes of mRNA code for functionally related proteins, or can be related in terms of their translational efficiency. Alternatively, mRNAs coding for functionally related proteins may have messengers with similar translational efficiencies.
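The comparison of muscle and reticulocyte IF3 rests on the myosin/globin ratios of Table III, and the argument is easier to follow when the stimulation of each messenger is computed separately. The sketch below is an illustration only, using the counts as read from Table III; the row label "none" for the incubation without added IF3 is an assumption (that label is not legible in the table), and the function names are hypothetical.

```python
# Illustrative arithmetic on the Table III counts (cpm). ASSUMPTION: the
# first data row is the incubation without added IF3, labeled "none" here.
TABLE_III = {
    "none":         {"myosin": 4225,  "globin": 26200},
    "muscle":       {"myosin": 14290, "globin": 22185},
    "reticulocyte": {"myosin": 7125,  "globin": 31100},
}

def myosin_to_globin(condition):
    """Myosin/globin ratio for one IF3 condition."""
    row = TABLE_III[condition]
    return row["myosin"] / row["globin"]

def fold_stimulation(condition, messenger):
    """Synthesis with the given IF3 relative to synthesis with none added."""
    return TABLE_III[condition][messenger] / TABLE_III["none"][messenger]

if __name__ == "__main__":
    for condition in TABLE_III:
        print(f"{condition:13s} myosin/globin = {myosin_to_globin(condition):.2f}  "
              f"myosin x{fold_stimulation(condition, 'myosin'):.2f}  "
              f"globin x{fold_stimulation(condition, 'globin'):.2f}")
```

Read this way, muscle IF3 raises myosin synthesis more than 3-fold while globin synthesis does not increase, whereas reticulocyte IF3 raises both, myosin somewhat more than globin; this is the pattern the text interprets as a specific effect of muscle IF3 superimposed on a nonspecific one.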
TABLE III
SPECIFICITY OF MUSCLE IF3 FOR MYOSIN SYNTHESIS IN WHEAT-GERM CELL-FREE SYSTEM^a

IF3             Myosin (cpm)    Globin (cpm)    Myosin/globin
None            4,225           26,200          0.16
Muscle          14,290          22,185          0.64
Reticulocyte    7,125           31,100          0.23

^a These experiments will be reported in detail elsewhere.

The control of gene expression in embryonic muscle must involve many different levels of control. There appears to be a temporal differential between transcription and translation, for muscle-specific messengers are transcribed and stored in the cytoplasm without being translated. Studies in tissue culture suggest these stored messengers may turn over at a faster rate than polysomal mRNA (3); however, caution should be used in extrapolating in vitro turnover studies to what actually occurs in vivo. These stored mRNAs are subsequently activated and must now compete with other cellular mRNAs for the very active constituents of the protein synthetic apparatus. Specific modulator factors associated with initiation factor 3 may be involved in this aspect of gene expression. Finally, utilizing myosin cDNA as a probe will enable a thorough study of transcriptional events and allow a verification of the impact of translational control on the overall gene expression in muscle cells.

ACKNOWLEDGMENT

This research was supported by an NIH Research Grant (HD 06316), and the eggs used in these experiments were purchased on an NCI Grant (CA 14733).

REFERENCES

1. S. M. Heywood and M. Nwagwu, Bchem 8, 3839 (1969).
2. S. M. Heywood, D. S. Kennedy and A. Bester, FEBS Lett. 53, 69 (1975).
3. M. E. Buckingham, D. Caput, A. Cohen, R. G. Whalen and F. Gros, PNAS 71, 1466 (1974).
4. D. Yaffe and H. Dym, CSHSQB 37, 543 (1972).
5. S. Chacko and J. Xavier, Dev. Biol. 40, 340 (1974).
6. S. M. Heywood and A. Rich, PNAS 59, 590 (1968).
7. H. F. Lodish, Nature 251, 385 (1974).
8. H. F. Lodish, T. Alton, J. Margolskee, R. Dottin and A. Weiner, ICN-UCLA Symp. Mol. Cell Biol. 2, 366 (1975).
9. G. Brawerman, ARB 43, 621 (1974).
10. E. S. Gander, A. Stewart, C. Morel and K. Scherrer, EJB 58, 587 (1973).
11. R. Williamson, FEBS Lett. 37, 1 (1973).
12. J. Robbins and S. M. Heywood, BBRC 68, 918 (1976).
13. A. J. Bester, D. S. Kennedy and S. M. Heywood, PNAS 72, 1523 (1975).
14. S. M. Heywood, D. S. Kennedy and A. J. Bester, EJB 58, 587 (1975).
15. S. M. Heywood and D. S. Kennedy, Bchem 15, 3314 (1976).
16. A. Rourke and S. M. Heywood, Bchem 11, 2061 (1972).
17. W. C. Thompson, E. A. Buzash and S. M. Heywood, Bchem 12, 4559 (1973).
18. S. M. Heywood, D. S. Kennedy and A. J. Bester, PNAS 71, 2428 (1974).
19. I. C. Sundkvist and T. Staehelin, JMB 99, 401 (1975).

Protein and mRNA Synthesis in Cultured Muscle Cells

R. G. WHALEN, M. E. BUCKINGHAM AND F. GROS

Département de Biologie Moléculaire
Institut Pasteur
Paris, France

When embryonic muscle cells are grown in tissue culture, the appearance of several differentiated characteristics and functions is observed. This system has been useful in the study of the protein components involved in muscle cell differentiation and in determining how the genetic expression of these proteins might come about (for review, see ref. 1). In particular, the ability to carry out pulse-labeling and chase experiments in the tissue culture system has permitted studies of messenger RNA synthesis and stability during muscle cell differentiation (2, 3). A pulse-labeled mRNA molecule sedimenting at 26 S on sucrose gradients has been identified in calf-muscle cells, and several lines of indirect evidence have suggested that this molecule is the mRNA coding for the heavy subunit of myosin, a protein molecule of 200,000 molecular weight. The striking finding was that this mRNA is present as an unstable (t1/2 = 10 hours), mainly untranslated molecule prior to muscle cell fusion, when a basal level of myosin heavy chain is being synthesized. After cell fusion, and in correlation with the increase in rate of heavy-chain synthesis, the 26 S mRNA is found in polysomes and is now a molecule of considerably greater stability, with a half-life of 50-60 hours (2). Further experiments have indicated that stabilization and translation of the 26 S mRNA found after fusion may be two discrete steps. Figure 1 shows the results of experiments in which muscle cells are pulse labeled with [3H]uridine at a time corresponding to the beginning of cell fusion and the cytoplasm is fractionated into two regions, the subunit to trisome region (S) and the remainder of the polysomes (P). If the RNA from these two regions is analyzed immediately after a 2-hour pulse, the 26 S mRNA is found almost exclusively in the subunit-trisome fraction, not in the polysomes.
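The distinction between the unstable (t1/2 of about 10 hours) and stable (t1/2 of 50-60 hours) forms of the 26 S mRNA is drawn from the label remaining in the peak after a chase. The sketch below illustrates that kind of calculation under the assumption of simple first-order decay, which is an assumption of this illustration rather than a statement of the authors' method; the function names are hypothetical.

```python
import math

# Illustrative sketch. ASSUMPTION: the pulse-labeled mRNA decays with
# first-order kinetics during the chase.

def fraction_remaining(chase_h, half_life_h):
    """Fraction of the initial label left after `chase_h` hours of chase."""
    return 0.5 ** (chase_h / half_life_h)

def half_life_from_fraction(chase_h, fraction):
    """Half-life (hours) implied by the fraction of label left after a chase."""
    return chase_h * math.log(2.0) / -math.log(fraction)

if __name__ == "__main__":
    for half_life in (10.0, 50.0, 60.0):
        left = fraction_remaining(10.0, half_life)
        print(f"t1/2 = {half_life:4.0f} h: {100 * left:.0f}% of label left after a 10-h chase")
```

With these assumptions an unstable species would retain half of its label after a 10-hour chase, whereas a species with a half-life of 50 to 60 hours would retain close to 90%, so the two forms are readily distinguished by a chase of that length.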
The 26 S mRNA sediments in the small polysome region because it is in the form of a ribonucleoprotein particle, as shown by experiments in which its sedimentation position is essentially unchanged after dissociation of the polysomes by treatment with EDTA 485

486

R. G . WIIALEN ET AL.

1 0 H Chase

10

20 30

10 2 0 30

10 20

30

10 20 30

F r o c t i o n number: 5-20% sucrose g r a d i e n t s

FIG.1. Sucrose-gradient analysis of pulse-labeled RNA from niuscle-cell cnlturcs. Cultures were labeled for 2 hours with ["Hluridine at the onset of cell fusion (after 48 hours in culture), followed by unlaljeled nridine for the times indicated in the figure. After the cytoplasmic extracts were separated into the subunit-trisome ( S ) and polysonie ( P ) regions, thc RNA was extracted ( 2 ) and analyzed on sucrosr gradicnts, which were fractionated; the radioactivity in total (0-0) and RNA( A,) (.---a) was then determined. The position of the 26 S mRNA is indicated by an arrow. See refcrcnces 2 and 3 for details.

(2). If RNA is analyzed after a 10-hour chase with unlabeled uridine, the 26 S mRNA is still found only in the subunit-trisome fraction, and from the radioactivity remaining in the 26 S peak, it is possible to calculate that the half-life is that of the stable form, i.e., about 50 hours. Only after a total chase of 24 hours can the 26 S molecule be found in part in the polysomal compartment, and, once again, half-life calculations indicate that the molecule is the stable species. These experiments and those reported previously (2) suggest three metabolic forms of the 26 S mRNA: (a) an unstable, mostly untranslated form existing in dividing muscle cell cultures, (b) a stable form, synthesized at about the time of the onset of cell fusion, that is not yet translated, and (c) a stable, translated form found to enter polysomes at a time coincident with the increased rate of myosin heavy-chain synthesis. A definitive interpretation of these results awaits the precise identification of the pulse-labeled 26 S peak as the myosin heavy-chain mRNA and a demonstration that the several forms are in fact the same molecule.

A second protein of interest in muscle cells is actin. Although actin is found in many if not all higher cells (4), its quantity and arrangement into an interdigitating contractile apparatus is very likely unique to muscle tissue. During the course of studies on the proteins being synthesized in calf muscle cells during differentiation in culture (5), it became apparent that actin from these cells could be resolved into three bands by isoelectric focusing in the presence of urea (Fig. 2B and C). However, only one form was obtained when actin was prepared from embryonic calf tissue (Fig. 2A), and it was this form that was present in small quantities in prefusion muscle cell cultures and that became predominant in the fused cultures. In addition, actin isolated from cultured calf-kidney cells was composed exclusively of the two other forms (Fig. 2D), suggesting that these may be specific for nonmuscle tissue and prefusion muscle cells. Thus it seems clear that the actin found in muscle tissue (called α-actin) is but one of several possible isozymic forms and that

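[Figure 2. Isoelectric focusing gels running from base to acid, with the γ-, β- and α-actin bands marked and lanes labeled by letter.]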

FIG. 2. The analysis of actin by isoelectric focusing in the presence of urea. Actin was purified from several sources by extraction of acetone powders and polymerization to yield F-actin. The samples were analyzed by isoelectric focusing in the presence of urea and the nonionic detergent, NP-40, as described by O'Farrell (8). Only the middle parts of the original 11 cm gels are shown. The proteins have been stained with Coomassie blue. The three actin species, designated α-, β- and γ-actin, are separated by approximately 1 mm and their isoelectric points differ by about 0.03 pH units; the three forms focus at about pH 6.25 under the conditions used. The samples analyzed are from fetal muscle tissue (A), prefusion muscle cell cultures (B), fused muscle cell cultures (C) and cultured kidney cells (D). The precise location of α-actin in gels B and D has been established by mixing the samples with muscle tissue actin (5).


two others (called β- and γ-actin) may be characteristic of nonmuscle tissue. It is possible that several genes coding for the multiple actin forms may exist or, alternatively, that the three are interrelated by secondary modifications of the same primary gene product. Recent sequence data of Elzinga et al. (6) on the human platelet and cardiac actins have provided definitive evidence for two genes in this case. The following results obtained from the cell-free synthesis of calf actin suggest that multiple genes might be responsible for the situation shown in Fig. 2. When RNA extracted from pre- and postfusion muscle cells is translated in the cell-free protein-synthesizing system described by Schreier and Staehelin (7), proteins that cofocus and coelectrophorese with the calf actins are produced. RNA from prefusion cells gave virtually only the β- and γ-actins, as judged from analysis by isoelectric focusing (Fig. 3). However, the RNA from fused cultures directed the synthesis of a considerable amount of α-actin in addition to β- and γ-actin. Since the cell-free system was the same in both experiments and only the RNA was

[Figure 3. y-axis: cpm; x-axis: slice number.]

FIG. 3. Analysis by isoelectric focusing of the products of cell-free protein synthesis directed by RNA from pre- and postfusion muscle cell cultures. Total cytoplasmic RNA was extracted from muscle cell cultures as described (2) and translated in the cell-free protein-synthesizing system described by Schreier and Staehelin (7). The products of the cell-free system, labeled with [35S]methionine, were analyzed by isoelectric focusing as in Fig. 2. The region corresponding to the position of the actins was sliced into 0.5-mm slices and the radioactivity was determined. This region of the focusing gel was substantially free of other labeled species as shown by the complete two-dimensional analysis (8).


changed, the most likely explanation for this result is that the mRNA coding for α-actin was present only in postfusion RNA and not in prefusion RNA. The existence of at least two mRNAs for actin strongly suggests that multiple genes exist for the calf actin species. In addition, the lack of α-actin translation when using prefusion RNA would seem to indicate that this mRNA is not preexistent, at least not in quantity, in muscle cells prior to fusion. The foregoing results illustrate how, in the case of actin, isozymic changes may accompany the processes of terminal differentiation in cultured embryonic muscle cells. Furthermore, the ability to study the messenger RNAs in cultured muscle cells, either by pulse-labeling or by cell-free translation, has provided information on the control of protein synthesis during differentiation.

ACKNOWLEDGMENTS

R. G. W. is a Fellow in Cancer Research supported by Grant DRG-34-F of the Damon Runyon-Walter Winchell Cancer Fund. This work was supported by grants from the Fonds de Développement de la Recherche Scientifique et Technique, the Centre National de la Recherche Scientifique, the Institut National de la Santé et de la Recherche Médicale, the Commissariat à l'Energie Atomique, the Ligue Nationale Française contre le Cancer, the Fondation pour la Recherche Médicale Française, the Fondation Philippe, and the Muscular Dystrophy Associations of America.

REFERENCES

1. J. P. Merlie, M. E. Buckingham and R. G. Whalen, Curr. Top. Develop. Biol. (1976). In press.
2. M. E. Buckingham, A. Cohen, D. Caput, R. G. Whalen and F. Gros, PNAS 71, 1466 (1974).
3. M. E. Buckingham, A. Cohen and F. Gros, JMB 103, 611 (1976).
4. T. D. Pollard and R. R. Weihing, CRC Crit. Rev. Biochem. 2, 1 (1974).
5. R. G. Whalen, G. S. Butler-Browne and F. Gros, PNAS 73, 2018 (1976).
6. M. Elzinga, B. J. Maron and R. S. Adelstein, Science 191, 94 (1976).
7. M. H. Schreier and T. Staehelin, JMB 73, 329 (1973).
8. P. H. O'Farrell, JBC 250, 4007 (1975).

VI. Summary


mRNA Structure and Function¹

JAMES E. DARNELL

Rockefeller University
New York, New York

I. Introduction

The papers dealing with eukaryotic mRNA bring together information at every level of development, which is one reason why this conference was both satisfying and stimulating. There are some descriptions of work carried out to the last detail, or at least to the primary sequence detail, if not to three-dimensional structure as yet. There are some reports that have shown inspiring new progress in areas that have been active for quite some time, and some completely new ideas and experiments. In considering "mRNA: The relationship of structure to function," we clearly want to address ourselves first to the question of structure in mRNA. A few years ago it would have been presumptuous to speak of the general structure of mRNA in eukaryotic cells, but that is no longer true. It appears that there is a general structure to eukaryotic mRNA, at least for the majority of the molecules, so the first part of this summary will concern structural studies. We must turn then to what the structure tells us, and there are some definite things that the structure tells us about the "cellular biology" of mRNA: about the function in translation and about the origin of mRNA. A number of gaps in our knowledge about the function of mRNA still exist, for example, the mechanism and regulation of mRNA turnover. Perhaps the structural studies will eventually help unravel these points, but clearly metabolic studies will also be required before we finally understand such complex points. Another area of unfinished business (in fact, of business just begun) focuses on the question: How does the structure of the functioning mRNA relate to the origin of mRNA in the nucleus? Much of the work in the last few years of those of us who attended this conference has involved struggling with this question of mRNA biosynthesis.
¹ This paper represents a summary of the conference entitled mRNA: The Relation of Structure to Function held in Gatlinburg, Tennessee, April 4-7, plus selected literature references as well as references to work in progress in the author's and Dr. Warren Jelinek's laboratories at Rockefeller University.


Finally there is a section about which I will have relatively few comments to make because the experiments are so astounding to me that I hardly know what to say about them; I am referring to the "brute force" attempts to discover and study the nuclear proteins responsible for genetic regulation. In these experiments the investigator unceremoniously yanks the chromatin out of the nucleus, divests it of its protein, puts the protein back again on the DNA, and the resulting mixture is still transcribed by Escherichia coli polymerase in a manner that reflects the cell from which the chromatin came. This approach is so amazing to many of us that, rather than try to recite for you a summary of the evidence, I will discuss the questions this line of work raises for me.

II. Definition of mRNA and Brief Survey of Recent Progress in mRNA Structure

The operational definition of messenger RNA² in mammalian cells came after the discovery of polyribosomes (Fig. 1) and the isolation from HeLa cells of polyribosomes containing labeled mRNA (1). This polysome-associated labeled mRNA had a low guanosine-cytosine content, like DNA, and was about 1500 nucleotides long. These observations, which now date back a dozen years, have stood us in good stead because all the purifications of the specialized mRNA molecules that have been

[Figure 1. Diagram with successive rows labeled POLYRIBOSOMES, mRNA ISOLATION, POLY(A), METHYLATION and COMMON (REPETITIVE) SEQUENCES. Each row shows a schematic mRNA drawn 5' to 3' (about 1500 nucleotides; poly(A) of A200 to A20 at the 3' end; m7GpppNm-Nm cap at the 5' end; repeated regions R1 and R2). COMMENTS column entries: low G+C, "DNA-like"; monocistronic(?); role in mRNA transport and/or turnover(?); role in ribosome binding; role in transcription and/or processing.]

FIG. 1. Steps in the study of eukaryotic mRNAs since 1963.

² See Preface and Dedication, pp. xxiii and xxvii [Eds].


important in the last 5-6 years began with polyribosomes. In the last five years or so, several important advances in the study of eukaryotic mRNAs have occurred that allow us to discuss a general structure; I would like to outline these advances. Two posttranscriptional events in messenger formation have been clearly documented. Edmonds and Caramela (2; see also Edmonds in this volume) and Kates (3) refocused attention on the fact that poly(A) exists in cells and is not just a laboratory curiosity produced by polynucleotide phosphorylase; it is a real substance from living cells. Their work stimulated attempts to connect the poly(A) in mRNA and the poly(A) also found in the nuclear RNA, the so-called heterogeneous nuclear RNA (hnRNA) (4, 5). Evidence was obtained that poly(A) is synthesized in the nucleus and is added to hnRNA after transcription is completed (see 6 for review, also p. 379). Then it was found (7) that methyl groups exist in mRNA, and the work, particularly of Moss, Rottman, and Furuichi (Part I, this volume), reveals the existence and unexpected nature of the 5'-terminal methylated structure ("cap") in virtually all virus and cell mRNA. Earlier such a "blocked" 5'-terminal structure had been described by Busch et al. (8, and this volume) in the low-molecular-weight nuclear RNA, but no connection with mRNA was apparent from that study. Finally, there are sequences that are conserved, i.e., that appear in a variety of mRNA molecules. This area is by no means settled, but it is very interesting and suggestive at this point. Therefore we will consider the nature and possible origin of these repetitious regions in mRNA. To summarize: we can diagram the mRNA molecule as a capped structure with nontranslated sequences, probably containing short repetitive regions perhaps at both ends, and with poly(A) at the 3' terminus. This would be the basic structure of mRNA (last line of Fig. 1).
There are now a number of details I would like to add to this general picture.

III. Average Size of mRNA

Turning to the original findings that the messenger is relatively small, ~1500 nucleotides, I discuss briefly some recent evidence that strongly supports the old conclusions on the average length of the total mRNA from cultured cells. 32P-labeled mRNA, purified by poly(U)-Sepharose chromatography, affords a chance to measure accurately the minimum average length of mRNA because an assay for structures at both ends exists. Every such molecule contains a poly(A), which, after 3-4 hours of labeling, averages about 150 nucleotides in length. In addition, every internal nucleotide should be equally labeled, and at the 5' terminus a structure containing an average of 5.5 phosphates is present. We measured the radioactivity in poly(A), in the caps, and the total


[Figure 2. Plot residue: axis ticks 25 to 150; elution positions of charge markers -2 to -6.]

FIG. 2. Characterization of 32P-labeled mRNA. Total cpm (1440-1770), gel electrophoresis of poly(A) (140-160), and DEAE-Sephadex column analysis of structures of charge -5 to -6 (5-6) indicate that in mRNA there is one blocked methylated "cap" structure and one unit of poly(A) (~5 hours old) per 1500 nucleotides.

radioactivity and found that for every 1500 cpm of total radioactivity, we had about 150 cpm in poly(A) and 5.5 cpm in caps. Thus it appears that the number-average chain length is 1500 (Fig. 2). From the outset, such a small size suggested that mRNA in mammalian cells might be monocistronic; 1500 nucleotides could code maximally for one peptide of 500 amino acids. In 1967, Kuff and Roberts (9) showed that the polyribosomes in rat liver cells contain larger nascent polypeptides associated with the ribosomes in larger polyribosomes, a finding most easily explained by assuming that the ribosomes were translating mRNA that was monocistronic and that the larger mRNA encoded larger polypeptides. Kaesberg (this volume) describes an interesting new use of the rule for monocistronicity of eukaryotic mRNA. In brome-mosaic virus, four separate mRNA molecules exist, each one apparently used to make one type of protein product even though two of these mRNAs might be used to make at least two different proteins. Thus the monocistronic character of translation is perhaps a possibility in all eukaryotes, and it certainly seems to be the case for mRNA in mammalian cells.
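The chain-length estimate above is a simple proportion, and it can be written out as a minimal sketch. It assumes, as the text does, uniform 32P labeling and one poly(A) (about 150 nucleotides) plus one cap (about 5.5 phosphates) per molecule; the function name is hypothetical and the cpm values are the rounded figures quoted in the text.

```python
def chain_length(total_cpm, marker_cpm, marker_phosphates):
    """Number-average length when one marker of known size occurs per molecule.

    With uniform labeling, cpm is proportional to phosphate count, so
    total_cpm / marker_cpm is the ratio of whole-molecule size to marker size.
    """
    return total_cpm / marker_cpm * marker_phosphates


# Rounded values quoted in the text.
from_poly_a = chain_length(1500, 150, 150)   # poly(A) of ~150 nucleotides
from_caps = chain_length(1500, 5.5, 5.5)     # cap carrying ~5.5 phosphates

# Upper bound on coding capacity: three nucleotides per amino acid.
max_residues = round(from_poly_a) // 3

print(from_poly_a, from_caps, max_residues)
```

Both end markers give the same estimate of about 1500 nucleotides, which is the sense in which the two independent assays agree, and a message of that length could encode at most about 500 amino acids.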

IV. mRNA Methylation

There are basically two different biochemical pathways for capping that seem well established now, and the importance of understanding


the two is emphasized in considerations of how cleavage of nuclear RNA followed by capping might possibly occur (see Part I, this volume). The first pathway, exemplified by reovirus and vaccinia virus, involves adding a terminal phosphate from pppG in the capping reaction. The first (5') nucleotide laid down by RNA polymerase has one phosphate removed by a nucleotide phosphohydrolase, followed by a donation from the incoming GTP of only one phosphate. A second mechanism envisages cleavages by an enzyme(s) that leaves a 5'-terminal phosphate. Ribonuclease III (see Dunn et al., this volume) is one interesting example of an enzyme with such activity. The mechanism of capping in VSV utilizes such a 5'-phosphate with the donation of two phosphates by the incoming pppG (Colonno et al., this volume). In fact, it is suggested (loc. cit.) that VSV transcription may actually involve synthesis of very long molecules followed by internal cleavage and cap addition. Perry et al. (this volume) suggest an amalgamation of these two reactions in the formation of caps by L-cells. Analysis of the phosphate termini at the 5' ends of nuclear RNA shows a considerable number of ppNp structures, where N can be any of the four bases. The distribution of the bases in these structures is 30 A, 46 G, 16 U and 8 C, which is approximately the same as the distribution of nucleotides to which the terminal m7G is attached in caps from cytoplasmic mRNA (36 A, 35 G, 18 U and 10 C). In contrast, only G and A were found in pppNp. Thus Perry et al. suggest an internal cleavage and stepwise addition of one phosphate to give the ppNp structures, followed by capping with one phosphate from GTP and finally methylation. Thus we may not have the complete story yet on the enzymology of capping. There are two accepted described routes of capping and one new proposed route.
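The three capping routes just described differ only in where the phosphates of the GpppN bridge come from. The bookkeeping can be sketched as follows; the dictionary layout and function name are hypothetical, the triphosphate start for the polymerase-initiated end is the standard pppN assumption, and the 5'-monophosphate start for the proposed L-cell route is inferred from the "internal cleavage" wording rather than stated outright in the text.

```python
# Phosphate bookkeeping for the capping routes described in the text.
# start:    phosphates on the 5' end of the RNA before capping
# removed:  phosphates taken off that end (nucleotide phosphohydrolase)
# added:    phosphates added stepwise to that end before capping
# from_gtp: phosphates donated together with the guanosine of the incoming GTP
ROUTES = {
    "reovirus/vaccinia": {"start": 3, "removed": 1, "added": 0, "from_gtp": 1},
    "VSV": {"start": 1, "removed": 0, "added": 0, "from_gtp": 2},
    "L-cell (proposed)": {"start": 1, "removed": 0, "added": 1, "from_gtp": 1},
}


def bridge_phosphates(route):
    """Total phosphates in the 5'-5' bridge of the finished cap."""
    on_rna = route["start"] - route["removed"] + route["added"]
    return on_rna + route["from_gtp"]


for name, route in ROUTES.items():
    print(name, bridge_phosphates(route))
```

Each route ends with the same three-phosphate bridge, which is why the finished cap alone does not reveal which pathway produced it.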
An important point to emphasize is that if cleavage and capping occur, not every cap necessarily signals an RNA polymerase starting point. A final note about the experiments examining cap formation (described by Rottman et al., by Perry with cellular mRNA, and by both Furuichi et al. and Moss et al. for viral RNAs) concerns the final methylation. The so-called cap-2 structure, m7GpppN'm-N''mp, apparently is formed exclusively in the cytoplasm by adding a 2'-O-methyl to the second (N'') nucleotide. In addition, Rottman indicates that N'' is frequently a cytidine in Novikoff hepatoma cells. The result of this late methylation is that the earliest appearing cytoplasmic methyl label in mRNA is overrepresented in this position. The studies on methylation of mRNA have raised another totally unanswered point. In addition to methylation in caps, there is a large amount of N6-methyladenosine (m6A) (and some 5-methylcytidine) present in the total mRNA preparation, enough for several m6A's per


molecule. However, not every nucleus-derived mRNA has m6A; e.g., hemoglobin mRNA apparently lacks it. Nevertheless, the average content of m6A suggests that most mRNA arrives in the cytoplasm with m6A in it. A point of interest is that all the viruses that produce mRNA, presumably through use of the cellular nuclear mechanisms, have cytoplasmic virus-specific mRNA, which contains m6A. However, many viruses that grow exclusively in the cytoplasm lack it. Apparently, no one has a sound idea as to what m6A may do physiologically. Most of the evidence indicates that most of the residues are at a distance from the 3' end, suggesting a 5' location, but there is not enough evidence yet to say that m6A might not in fact be in the middle of the coding region. A role for m6A might be postulated somewhat better if we knew more about its exact location.

V. Addition of Poly(A) to mRNA

The slime mold Dictyostelium discoideum synthesizes an oligoadenylate approximately 25 residues long, which is copied from genome DNA and also appears in mRNA. It appears that virtually every gene is "marked" in Dictyostelium by such a sequence. Short poly(A) segments transcribed from the genome definitely exist in HeLa cell hnRNA (Edmonds, this volume). The composition of such units derived by T1 RNase is approximately A25, G1, U1, and by pancreatic RNase digestion A25, G2, U, suggesting a nonterminal, internal position and probably no interspersed base among the A's. Most interestingly, these oligo(A) regions are not found in mRNA(An), unlike what seems to be the case for slime molds, and they are found in highest concentration in hnRNA (no An). The possibility exists that these sequences are precursors that are extended to yield the longer (~250 nucleotide) poly(A) contained in mRNA. There is evidence (Sawicki, Jelinek and Darnell, unpublished) that the unit is synthesized only in the nucleus, an issue that has been argued repeatedly in the past couple of years. Originally it was shown that if labeling times were shortened to less than 1-2 minutes, almost all discernible A200 was associated with hnRNA (6). To further clarify this issue, advantage was taken of the fact (10) that terminal addition of An can occur in the cytoplasm. This addition is readily detectable in the cytoplasm in 1 minute in all size classes of preexisting mRNA, whereas only in the nucleus is the size class of 200 predominantly found. Thus both the nuclear and the cytoplasmic nucleotide pools are labeled, but only the nucleus contains the largest poly(A) after brief labeling periods. The subsequent rapid appearance of cytoplasmic A200, detectable by a rising internal labeling with a plateau of terminal label, must mean rapid


nucleocytoplasmic transport for some mRNA molecules once poly(A) is added.

VI. Noncoding Regions and Repetitive Oligonucleotides in mRNA

What are some of the other details of mRNA structure that will help us think more deeply about the control of translation and/or the origin of mRNA? Clearly some of the most dramatic information is coming out of the sequence analyses. At the outset of the sequence studies, there was no guarantee that knowledge of the primary sequence of a single or a few mRNAs would supply any very interesting information; but it has. Proudfoot et al.³ described the 150 3'-terminal nucleotides that occur next to poly(A) in α- and β-chains of rabbit hemoglobin mRNA, as well as human hemoglobin, mouse immunoglobulin and chick ovalbumin mRNA molecules. Weissman discusses the probable sequences in the region of the 3' termini of "early" and "late" mRNA molecules made on SV40 DNA. Salser et al.³ and Proudfoot et al. compare data on sequences from the protein-coding and noncoding regions of several hemoglobin mRNAs, and Davies et al.³ show almost complete sequences on the short mRNAs that code for the 30-31 amino-acid residues of trout-sperm protamines. Several powerful conclusions come from all these studies. There are as many as 130-150 residues of noncoding sequence past the carboxyl terminus in every hemoglobin mRNA chain; i.e., the coding sequences for the protein have been located, and 150 nucleotides plus the poly(A) remain untranslated past the UAA or UAG terminator at the 3' end. In these cases, relatively little, perhaps 30-50 nucleotides, could remain for noncoding purposes at the 5' end; e.g., in the 650 nucleotides of rabbit α-globin the arrangement might be, from 5' to 3': (50 noncoding)(423 protein-coding)(100 noncoding)(75 poly(A)). In other mRNAs, considerably longer noncoding regions may exist; Schimke suggests (unpublished) that perhaps several hundred 5' nucleotides can be removed from chick ovalbumin mRNA without destroying its coding capacity. What are these noncoding regions doing?
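The tentative layout proposed above for rabbit α-globin mRNA can be checked with a few lines of arithmetic. This is only a sketch: the segment lengths are the approximate values the text offers ("might be"), and the variable names are hypothetical.

```python
# Segment lengths (nucleotides) as proposed in the text, listed 5' to 3'.
segments = [
    ("5' noncoding", 50),
    ("protein-coding", 423),
    ("3' noncoding", 100),
    ("poly(A)", 75),
]

total = sum(length for _, length in segments)

# The coding segment must be a whole number of triplets.
coding = dict(segments)["protein-coding"]
codons, leftover = divmod(coding, 3)

print(total, codons, leftover)
```

The segments add up to 648, in line with the quoted total of about 650 nucleotides, and the 423 coding nucleotides correspond to 141 codons, the length of the α-globin chain.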
One of the points that is brought out by Proudfoot and by Salser is that the genetic drift that has occurred in the 3' terminus of most mRNA molecules, for example, between human and rabbit α-globin, is less than the drift that has occurred in some of the coding regions where amino-acid sequences have remained the same. So these terminal regions are highly conserved, for reasons that we do not understand, but we must presume some function that would be perturbed by sequence drift. The terminal conservations are obviously not an inviolate sequence of hundreds of nucleotides or even fifty nucleotides in a row. There are, however, certain rigidly preserved regions. For example, one hexanucleotide, AAUAAA, is present in the 3' noncoding region of every mRNA so far sequenced. The presence of such a sequence within 100 or so nucleotides of the 3' terminus in every mRNA is far more than chance would predict. In addition, as particularly pointed out by Weissman, there are U-rich sequences between the AAUAAA and the poly(A) near the SV40 mRNA terminus as well as A- and U-rich regions near the 5' terminus of SV40 mRNA. Clearly we do not yet know whether such sequences have something to do with initiation or termination, something to do with cleavage and poly(A) addition, or something to do with some unsuspected function at this point, but the elucidation of the function of these things cannot be far behind their identification.

³ In Part II of this volume.

[Figure 3. Diagram of numbered sequences 1 to 5 in hnRNA, with regions marked PROTECTED, PARTIALLY PROTECTED and SUSCEPTIBLE.]

FIG. 3. Repetitious sequences in hnRNA: 1 and 1', etc., are complementary sequences and might exist in a variety of configurations. They are detectable as prominent oligonucleotides in double-stranded RNA obtained by nuclease digestion (Robertson, Jelinek, and Dickson; and Jelinek, unpublished).

I would like to carry forward this discussion of possible repetitive sequences in mRNA. Jelinek and I, for the last several years, since finding double-stranded regions in hnRNA, have been trying to find a home for these double-stranded regions at one end or the other, or both, of the mRNA (Figs. 3 and 4). Jelinek and Robertson have considerable sequence data on the prominent T1 oligonucleotides from hnRNA labeled in vivo. Jelinek finds a number (8-10) of prominent repeating oligonucleotides in these double-stranded structures, which follows our older evidence

[Figure 4. Two alternative schematic mRNAs, each carrying the numbered sequences 1 to 5 and a 3'-terminal A200.]

FIG. 4. Relation of repetitious sequences in hnRNA to mRNA. Oligonucleotides present in hnRNA survive in cytoplasmic mRNA but are not in double-stranded form. The model in the first line is favored as the most likely at the moment (Jelinek, Evans, Salditt-Georgieff and Darnell, unpublished).


that the dsRNA comes from intermediate repetitious DNA (6). In addition, Jelinek finds that E. coli RNA polymerase copies "snap-back" DNA to produce a T1 digest pattern very similar to that from total double-stranded regions of hnRNA. Finally Jelinek, Evans, Salditt-Georgieff, and I have shown that the rapidly hybridizing mRNA fraction contains, albeit in a lower concentration, the same oligonucleotide sequences. This evidence made us believe that a large fraction of the HeLa cell mRNAs might have some sort of repetitive sequence at one or both ends, but not necessarily at the end of every molecule. Another type of experiment does suggest, however, that some portion of most HeLa cell mRNA is related to a double-stranded, and therefore possibly repetitive, nucleotide sequence. This experiment repeats and extends with HeLa cells what Naora and Whitelam (11) showed with rat liver RNA last year. The nuclear double-stranded regions from hnRNA were prepared, melted by boiling, and used to drive [3H]uridine-labeled mRNA into a hybrid. About 7 or 8% of the mRNA was converted to RNase resistance, and the driver was alkali-sensitive (Fig. 5). Moreover, when 32P-labeled RNase-resistant RNA·RNA hybrid was compared to nonhybridized RNA, the protected oligonucleotides gave a chromatogram like that of double-stranded RNA. In addition to these recent results on conserved or repetitive oligonucleotides in HeLa cell mRNA, evidence from Crippa et al. (11a) indicated that hydroxyapatite chromatography detected a rapid hybridization of whole mRNA to DNA, whereas fragments of mRNA did not hybridize rapidly. Crippa's interpretation, which recent more extensive experiments have confirmed (Crippa, personal communication), is that Xenopus laevis mRNA molecules have terminal repetitive oligonucleotides. Thus it appears that this may be a regular feature of eukaryotic mRNAs.

[Figure 5. x-axis: mg/ml × hr.]

FIG. 5. HeLa cell mRNA: nuclear double-stranded RNA hybridization. [3H]Uridine-labeled mRNA was hybridized to purified double-stranded RNA from HeLa cell nuclei, and hybrids were detected by RNase digestion. Background hybridization was about 1% of input, and the ability of double-stranded RNA to promote hybridization was alkali sensitive (△) (from Jelinek, Evans, Salditt-Georgieff, and Darnell, unpublished).

Before turning to the insights on mRNA translation provided by structural studies on mRNA, I comment briefly on the evolutionary origin of one class of mRNAs, drawn from speculations about the mRNA sequences of trout protamines. Protamines are, as Davies et al. (this volume) describe them, simply minimally changed polyarginines, the two major codons for which are AGA and AGG, with an additional two codons available by single base changes, CGA and CGG, and two further codons, CGU and CGC, arising by one additional base change. Serine can be derived by a single base change (AGU or AGC), and single additional mutational changes can produce codons for proline, isoleucine and alanine. These changes correspond exactly to the amino acids occasionally substituted in the "polyarginines," the protamines. Dixon's speculations³ are backed up by the actual sequence data thus far obtained on the protamine mRNAs, which show a preponderance of AGAs and AGGs, other arginine codons being more infrequent. Thus it appears in this case that the evolutionary history of these simple proteins can be read backward, one base change at a time, and the original mRNA may have been simply AGA repeats.
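The codon relationships described above can be checked by enumerating single-base changes in the standard genetic code. The following is a minimal sketch; the helper names are hypothetical, and it counts only mutational steps, not any claim about the actual order of events in protamine evolution.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def neighbors(codon):
    """All codons that differ from `codon` at exactly one position."""
    return {
        codon[:i] + b + codon[i + 1:]
        for i, old in enumerate(codon)
        for b in BASES
        if b != old
    }


def reachable(start, steps):
    """Codons reachable from the starting set in at most `steps` base changes."""
    seen = set(start)
    frontier = set(start)
    for _ in range(steps):
        frontier = {n for c in frontier for n in neighbors(c)} - seen
        seen |= frontier
    return seen


start = {"AGA", "AGG"}
one_step = reachable(start, 1)
two_steps = reachable(start, 2)

arg_one = sorted(c for c in one_step if CODE[c] == "R")
arg_two = sorted(c for c in two_steps if CODE[c] == "R")
ser_one = sorted(c for c in one_step if CODE[c] == "S")

print(arg_one, arg_two, ser_one)
```

One change from AGA or AGG reaches the arginine codons CGA and CGG and the serine codons AGU and AGC; a second change brings in CGU and CGC, completing the six arginine codons, and proline, isoleucine and alanine codons all lie within two changes.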

VII. Role of Caps in Translation

The ultimate function of mRNA is to be translated, and the recent studies on mRNA structure have clearly given us an important new understanding about the initiation of translation. There is evidence that mRNAs without caps are translated much less well than mRNAs with caps (see Furuichi et al., Colonno et al., Kaesberg in this volume). However, from Kaesberg's results with brome-mosaic-virus mRNAs, it is clear that translation can occur without a cap. In addition, polio RNA definitely lacks caps, but is, nevertheless, translated. Muthukrishnan (this volume) has outlined his studies on homopolymers constructed by elongating the G-C dinucleotide that occurs at the 5' end of the transcribed portion of reovirus mRNAs. This work shows that the cap added to a polynucleotide promotes binding to the 40 S ribosomal subunit. However, formation of the 80 S ribosomal complex by addition of the larger ribosomal subunit requires that adenylic and uridylic acids be included in the nucleotides near the 5' end. Thus we get a picture of facilitated binding by the cap plus an A-and-U-rich 5' region. Such a


picture clearly accords with the idea of moderately well-conserved, but not necessarily perfectly conserved, 5' regions of mRNA discussed in the preceding section. One area perhaps related to the translation of mRNA, which the recent structural studies have left a mystery, but which is an area of tremendous importance to the physiology of cells, is the question of turnover of mRNA. In spite of continued attention to this problem since the beginning of our ability to measure mRNA, we have learned nothing about the connection of mRNA structure to mRNA turnover. This area is very important because mammalian cells probably cannot shift transcriptional gears as quickly as bacterial cells. At least some of the messenger RNA of mammalian cells is known to be long-lived; it has recently been confirmed that most of the message has a half-life as long as 16-20 hours. However, as Perry et al. mention (see p. 275) and as has been demonstrated (12), there is mRNA that turns over much more rapidly. In fact, there is probably a continuous spectrum of mRNA half-lives from an hour or so to those that are virtually as stable as rRNA. Two presentations about the effective beginning of myosin synthesis were given at this conference, and both are perhaps tangentially related to the question of mRNA turnover; they are certainly related to the question of mRNA stability during differentiation. In the course of muscle-cell-syncytium formation, cells pass from single cells to myotubes producing large amounts of myosin. The production of myosin mRNA, however, begins earlier than does the production of myosin. Thus 26 S poly(A)-terminated, presumptive myosin mRNA is not stabilized in the polysomes until differentiation begins. These results, earlier reported by Buckingham, Gros, and colleagues, were more fully described here and extended to include actin, probably, as well (Whalen et al., this volume).
The basis for restraint of translation, and thus probably destruction of this early-appearing unused myosin mRNA, is thought by Heywood and Kennedy (this volume) to be hybridization with a small RNA segment called “translational control RNA.” Such emergence of unused mRNA to the cytoplasm may be a phenomenon of great importance in understanding the transition from determined to differentiating cells. Likewise, such mRNA, destined for destruction rather than use, might be a very important class of RNA to study in trying to reconcile various facets of turnover and total RNA synthesis studies in cultured cells. We might expect that the mechanism of messenger turnover is somehow related to the terminal repetitious mRNA sequences and the proteins that interact with them. In spite of the importance of this area and the attention it must receive in the future, it is obviously a very complex area. Even with bacterial cells, we still do

JAMES E. DARNELL

not know the biochemical basis for mRNA turnover. The question of turnover may be one of the last things to be resolved in molecular cell biology because it may take a totally coupled in vitro system to finally get metabolically meaningful turnover. I think any understanding of this problem that can be drawn from a study of terminal sequences now would be a real contribution. For example, we might be able to find proteins that recognize the terminal sequences and that have nothing to do with positive events like translation. A library of such proteins might be good candidates for participation in mRNA turnover.

Hence our consideration of mRNA structure has led us to a better understanding of the initiation of translation, if not yet of translational regulation. Has the understanding of the structure of the mRNA helped us to understand where the mRNA comes from?

VIII. Nuclear Transcripts and the Origin of mRNA

If we agree that most eukaryotic mRNA in the cytoplasm of mammalian cells has the general structure including poly(A) and caps as shown in Fig. 6, we can then argue that such molecules have come from the nucleus itself, because we recognize the posttranscriptionally added markers in the nuclear RNA as well (Fig. 7). Also, because there is no nuclear protein synthesis, we can definitely state that, prior to the utilization of a coding region of nuclear RNA, there are important biochemical steps that intervene between transcription and utilization. Therefore, all future considerations on regulation in mammalian cells must take this into account. Since posttranscriptional modification is thoroughly established, the possibility of posttranscriptional regulation (a decision to carry out or not to carry out these modifications) should be borne in mind. Do the posttranscriptional modifications include simply the biochemical additions at each end of the molecule, which means that the transcript will be a direct product of exactly the right size, ready

FIG. 6. Diagram of cytoplasmic mRNA structure. [Diagram: m7GpppN’m-N’’(m)-N ... N-poly(A), with 20-200 adenylate residues.]

FIG. 7. Modified nuclear molecules similar to cytoplasmic mRNA. Note that second 2’-O-methylation and shortening of poly(A) are cytoplasmic. [Diagram: POSTTRANSCRIPTIONAL MODIFICATION yielding capped, poly(A)-terminated nuclear molecules; poly(A) length label: 200.]

to be modified at both ends and used? Or, do the modifications also include some sort of RNA chain cleavage (Fig. 8)? We should not recoil at the possibility of cleavage, at least of a short piece of RNA, at this point in our study of RNA metabolism. As Dunn et al. (this volume) discuss, there is ample evidence for RNA cleavage in the production of T-odd bacteriophage mRNA.

FIG. 8. Nuclear modifications in mRNA formation and two major transcriptional alternatives. i_hn represents a possible initiation site for hnRNA transcription; i_m, a possible mRNA initiation site; STRUC indicates structural gene region. The cytoplasmic stage of mRNA unassociated with ribosomes is widely accepted but not critically proved. [Diagram labels: INITIATION (PROMOTER-OPERATOR?); MODIFICATIONS; “TRANSPORT”; m7GpppN’m-N’’(m)-poly(A).]

Cleavages are important in the processing of pre-rRNA and pre-tRNA in mammalian cells, and such reactions have their counterparts in bacterial cells (6). To some investigators, the bacterial RNA cleavages make legitimate the whole idea of RNA processing because the enzymes that do the cleaving have been discovered and the cleaving reactions have been carried out in vitro. So one should not hesitate to believe that cleavage might take place in eukaryotic mRNA formation.

What then are the types of facts that we need to prove how mRNA arises, and through what transcriptional and posttranscriptional pathways does mRNA arise? Clearly the most important type of information to discover would be what a transcript from a functioning gene really looks like. In the absence of the ability to study single-gene products, several types of studies have examined nuclear RNA for its content of sequences related to those found in mRNA. We showed (6) that SV40- and later adeno-2-transformed cells contain virus-specific high-molecular-weight RNA in the nucleus, and smaller discrete-sized virus-specific mRNA in the cytoplasm. Penman (this volume) and Perry (this volume) and colleagues have demonstrated that, in both L-cells and HeLa cells, sequence overlap exists between mRNA (or the cDNA from it) and hnRNA (or the cDNA made from it). Thus what was an issue for many years, we may now regard as closed. It is clear that the high-molecular-weight molecules do contain coding sequences, and we now know that the posttranscriptional markers [poly(A), caps] are also present in hnRNA. But what is the form of transcript that is effectively converted into the mRNA? Is it possible that there are two kinds of transcripts, effective and ineffective hnRNA? One approach (see Lizardi et al. and Daneholt et al., this volume) is to study RNA synthesis in highly specialized cells, such as the silk gland and the salivary glands of insects.
In these cases the mRNAs for secretory proteins are made in such high amounts that a detectable fraction of total RNA synthesis corresponds to specific mRNA. Lizardi’s evidence, based on pulse-labeled RNA in animals subjected acutely to low temperature, and supported by the electron micrographs of McKnight et al. (this volume), suggests that in all likelihood the true transcript from the silk genes in Bombyx mori is from a somewhat longer DNA transcriptional unit, perhaps a couple of thousand nucleotides longer, than the messenger RNA itself. Processing may occur normally during RNA synthesis. In the case of Daneholt’s studies, no evidence exists for or against a larger precursor even though nascent chains can be observed during pulse labels. Again, however, if the transcriptional unit is only a few thousand nucleotides longer, detection of a larger precursor would be most difficult.

If we believe that most HeLa cell mRNAs, Xenopus mRNAs (11a) and rat liver mRNAs (12) have some repetitious sequences on their 5' ends, which are also present in, and possibly derived from, cleavage of double strands, we are led to the conclusion that at least a short piece of RNA may be removed from the majority of original transcripts, because cytoplasmic mRNAs lack double strands of the type found in nuclear RNA [(6); also Jelinek, unpublished (Figs. 3 and 4)]. At this conference, we heard about two cases in which gene-specific direct transcripts have been studied, the results of which at least leave open the possibility that an RNA molecule larger than mRNA is, in one instance at least, the direct transcription product.

In order to try to illuminate this question further, I consider briefly some new information from my laboratory concerning the true transcript size of hnRNA from HeLa cells. Figure 9 is similar in some ways to a diagram presented by Perry et al. (this volume). The curve of open triangles is simply the sedimentation profile of hnRNA at the steady state. If we make the assumption that the 18 S molecules are about 2,000 nucleotides long, the 32 S molecules 8,000, and the 45 S molecules 14,000 (recognizing that this may introduce errors of as much as ±20% due to inaccuracy of the sedimentation/molecular-weight relationship; see Boedtker and Lehrach, this volume), we can obtain a molar profile of the molecules, indicated by the filled circles (Fig. 9). We have never known whether this molar profile represents molecules originally all bigger than 10,000 nucleotides, and the small molecules were primarily breakdown products, or whether the smaller molecules were nascent chains, or smaller transcripts. We have never been able in the past to deal with this issue. Eva Derman, a

FIG. 9. Sedimentation analysis of HeLa cell hnRNA. Cells labeled for 180 minutes with [label illegible] were labeled for 10, 20, 30 seconds with [3H]uridine. hnRNA was extracted, denatured and sedimented. [Axis: sedimentation; markers at 45 S, 32 S and 18 S.]

graduate student in our group, realized that sedimentation profiles of very briefly labeled hnRNA reproducibly showed molecules sedimenting more slowly than the long-labeled hnRNA. Figure 9 shows that the hnRNA from cells labeled for 10, 20 and 30 seconds is not as long as RNA that is formed in steady state. Derman and Goldberg then went on to analyze this sedimentation distribution to predict what the chain size would be when synthesis was completed. This analysis is possible if no chain destruction occurs during a 10-second label and is based on the fact that chains of all eventual sizes contribute to shorter chains, but only chains greater than a given length contribute to chains that long or longer. The results of this analysis show that half the chains are shorter than 5000 nucleotides and half the chains are longer. This predicted chain length is what will be achieved when the nascent profile has reached maturity, and this is the first time we have ever been able to make such an estimate. While a good number of the chains in this distribution might overlap the 1500-nucleotide chain length of the average mRNA, over 85% of the hnRNA transcripts are larger than mRNA.

From the work of Lizardi (this volume) and from the considerations just made about transcription units in HeLa cells, perhaps the general rule will be that the posttranscriptional processing will include not only capping and poly(A) addition, but also (at least very frequently) the removal of relatively short sequences, up to a few thousand nucleotides. It is quite possible, however, that only a portion of the hnRNA is related to mRNA formation. Sehgal, in our laboratory, has observed that 5’,6’-dichlororibofuranosylbenzimidazole is a chain-initiation inhibitor of only a portion of hnRNA, but at the same time prevents labeling of any poly(A)-terminated mRNA. I would further consider the results of Price et al.
(13) suggesting that there is a direct messenger precursor of smaller size detectable in an ammonium-sulfate-soluble fraction of HeLa cell nuclei. This material may indeed include the short nascent-chain material or short direct-transcript material that now has also been observed in brief labeling periods and that may be the target of the above inhibitor.

This division of the hnRNA into functional classes seems to me to be one of the most important types of questions to be attacked in the next few years. The experiments reported by Penman (see Herman et al., this volume) may represent a significant step in that direction. This new work has apparently reproducibly separated the nucleus into two fractions, those bound to structural elements and those free of structural elements, that may have two different functions. The chromatin-associated structural hnRNA may not play an mRNA-precursor role, while that hnRNA either released from the nuclear skeleton or never a part of it may contain the mRNA precursor. While more work is necessary to test these ideas, a new fractionation technique is always an important adjunct to progress in understanding such complex possibilities. One can only speculate, wildly at the moment, why a gene operative for producing messenger may produce two kinds of molecules. Possibly a long molecule, perhaps even one terminated by poly(A), may be necessary to keep a gene in the right configuration to be transcribed effectively, but mRNA would be produced from a shorter transcript.

To explore the nature of transcripts from specific genes will be the problem of the next few years. An important step toward being able to examine the RNA transcripts from functioning genes is discussed in this volume by Rubin et al., Salser et al., and O’Malley and Means. Inserting DNA from specific genes in bacterial plasmids will allow the production of enough specific gene DNA to explore, by RNA·DNA hybridization, the properties of transcripts from specific genes. The outline of how a specific DNA can be used for such a purpose has been provided by work with type 2 adenovirus, Ad-2 (13). This virus contains 35,000 nucleotides, and its genome is divided into six fragments by Eco RI restriction endonuclease. We explored the nature of the primary transcript from adenovirus by limiting the label incorporation time to 1 or 2 minutes and hybridizing the resulting RNA to fragments of DNA. The largest RNA (over 20,000 nucleotides) was complementary to fragments at one end of the genome, and successively shorter RNA molecules were complementary to successive fragments toward the other end of the genome. This type of experiment both orders the transcription for the regions distant from a promoter and suggests that a long transcript is necessary in adenovirus mRNA formation. This experiment is the prototype of what can be done with various kinds of gene-specific plasmid-purified DNA, especially with the already existing Drosophila clones.
The question of whether transcripts from specific genes exist as large as well as small molecules can be answered. Finally, what is the reason to go through all this exercise and years of work to learn something about the structure and the possible pathway of transcription of mRNA molecules? The reason we want to know all these biochemical details is finally to understand regulation in mammalian cells. It is through understanding details of mRNA manufacture that we may eventually understand the regulation of the mRNA.
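The two quantitative steps in the hnRNA size argument of this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure actually used by Derman and Goldberg. The calibration points (18 S, 32 S, 45 S corresponding to about 2,000, 8,000 and 14,000 nucleotides) are those quoted in the text; the log-log interpolation between them, the function names, and the idea that a briefly labeled nascent profile counts chains in proportion to the number of transcription units at least that long are all assumptions made for illustration.

```python
import math

# Calibration quoted in the text: sedimentation coefficient (S) -> chain length (nucleotides).
CALIBRATION = [(18.0, 2000.0), (32.0, 8000.0), (45.0, 14000.0)]


def chain_length(s_value):
    """Estimate chain length for an S value by log-log interpolation between
    the calibration points (the interpolation scheme is an assumption)."""
    if s_value <= CALIBRATION[1][0]:
        (s0, n0), (s1, n1) = CALIBRATION[0], CALIBRATION[1]
    else:
        (s0, n0), (s1, n1) = CALIBRATION[1], CALIBRATION[2]
    slope = math.log(n1 / n0) / math.log(s1 / s0)
    return n0 * (s_value / s0) ** slope


def molar_profile(s_values, mass_signal):
    """Convert a steady-state mass profile (e.g. radioactivity per fraction)
    into relative numbers of molecules: mass divided by chain length,
    normalized so the fractions sum to 1."""
    raw = [m / chain_length(s) for s, m in zip(s_values, mass_signal)]
    total = sum(raw)
    if total <= 0:
        raise ValueError("mass profile must contain some positive signal")
    return [r / total for r in raw]


def median_final_length(lengths, nascent_counts):
    """Hypothetical reading of the nascent-chain argument: if only units whose
    final length is at least x contribute nascent chains of length x, then
    nascent_counts[i] / nascent_counts[0] is the fraction of units that end up
    at least lengths[i] long. Return the length where that fraction is 0.5."""
    top = nascent_counts[0]
    for i in range(1, len(lengths)):
        hi_frac = nascent_counts[i - 1] / top
        lo_frac = nascent_counts[i] / top
        if lo_frac <= 0.5 <= hi_frac:
            if hi_frac == lo_frac:
                return lengths[i - 1]
            t = (hi_frac - 0.5) / (hi_frac - lo_frac)
            return lengths[i - 1] + t * (lengths[i] - lengths[i - 1])
    raise ValueError("profile never falls to half of its starting value")
```

With equal mass in an 18 S and a 45 S fraction, `molar_profile([18.0, 45.0], [1.0, 1.0])` assigns seven-eighths of the molecules to the 18 S fraction, which shows why a mass profile dominated by large RNA can still correspond to a population made up mostly of shorter chains.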

IX. Chromatin Transcription and Gene Regulation

The mechanism of regulation of gene expression was the focus of the work of those laboratories concerned with chromatin transcription.

These workers are going straight ahead to search for proteins responsible for regulation without knowing the nature of nuclear RNA transcripts from functioning genes. The dramatic findings from the laboratories of Stein, O’Malley, Axel, and Gilmour are basically similar: cells making large amounts of specific proteins, e.g., histones, ovalbumin, and hemoglobin, serve as sources from which corresponding mRNAs can be purified. Probes constructed by reverse transcription of the mRNA allow measurements of the specific gene output by quantitative hybridization of all newly formed RNA when chromatin is transcribed by E. coli polymerase. In all cases thus far tested, both chromatin and reconstituted chromatin are transcribed in a pattern similar to RNA production of the cell of origin. The specificity for allowing proper transcription after reconstitution of the chromatin resides in the chromosomal nonhistone proteins. In the case of ovalbumin transcription, O’Malley et al. (this volume) have evidence that nonhistone proteins from oviduct cells can be increased in their ability to cause the ovalbumin sequences to be transcribed by adding progesterone, a hormone that induces ovalbumin synthesis.

In the general interpretation of these results, these workers have adopted the following approach. Studies of bacterial physiology tell us that there are certain proteins that can interact with DNA either positively or negatively, that is, to promote or impede RNA transcription. By studying the proteins found in chromatin, one is studying proteins that interact with DNA and that act either by starting or stopping specific gene transcription. If the argument is to be carried through the last phase, I would have some hesitation to accept it. What seems clear to me is that proteins that are part of the chromatin have been isolated and that clearly they have something to do with transcription.
However, I would hesitate to believe that a unique configuration of the chromatin structure as it exists inside the cell can be achieved by removing the proteins with salts and urea, dialyzing out the urea, and achieving spontaneous perfect reassociation. Magee’s brief presentation (this volume) suggests one type of reason for this skepticism: the first ethidium bromide molecules to interact with the nuclear chromatin are bound very tightly. What this suggests is that the chromatin structure as it really exists inside the cell is probably totally tightly packed. It is hard for me to believe that the in vitro chromatin is reassembled in exactly the same structure that existed inside the cell. Nevertheless, it is clearly true that the regions that were functional for producing messages inside the cell have been reestablished as regions accessible to added E. coli RNA polymerase. Axel (this volume) favors the interpretation that the chromatin, either as one takes it out of the cell or as one reconstitutes it after taking it out of the

cell, has left the transcribing genes accessible but not specifically promoted, i.e., the actual initiation event for RNA synthesis may not be in the same site or at the same rate as occurs in the cell. This differs from specific gene recognition signals or positive transcriptional proteins that exist in E. coli, which do promote starts specifically at the correct place in vitro. Nevertheless, we may learn a great deal about specific regulation by studying the presently isolated chromatin proteins, which may operate in a more general manner.

I think it is clear that the study of eukaryotic mRNA is truly coming of age. We are now learning molecular details about messenger manufacture which I, at least, felt 10 years ago we would never know. The prospects appear bright to truly learn about regulation within the next 10 years.

REFERENCES

1. S. Penman, K. Scherrer, Y. Becker and J. E. Darnell, PNAS 49, 654 (1963).
2. M. Edmonds and M. G. Caramela, JBC 244, 1314 (1969).
3. J. Kates, CSHSQB 35, 743 (1970).
4. J. Mendecki, Y. Lee and G. Brawerman, Bchem 11, 1792 (1972).
5. J. E. Darnell, R. Wall and R. J. Tushinski, PNAS 68, 1321 (1971).
6. J. E. Darnell, Harvey Lect. Ser. 69, 1 (1975).
7. R. P. Perry and D. E. Kelly, Cell 1, 37 (1974).
8. T-S. Ro-Choi, Y. C. Choi, D. Henning, J. McCloskey and H. Busch, JBC 250, 3921 (1975).
9. E. L. Kuff and N. E. Roberts, JMB 26, 211 (1967).
10. J. Diez and G. Brawerman, PNAS 71, 4091 (1974).
11. H. Naora and J. M. Whitelam, Nature 256, 756 (1975).
11a. M. Crippa, I. Meza and D. Dina, CSHSQB 38, 933 (1973).
12. L. Puckett, S. Chambers and J. E. Darnell, PNAS 72, 389 (1975).
13. R. E. Price, L. Ransom, and S. Penman, Cell 2, 253 (1974).

Subject Index

A

Adenovirus 2 infection, nuclear and cytoplasmic RNAs, 293-300

B

Bacteriophages T7 and T3 early ribonucleic acids
  effect of cleavage on translation, 268-271
  synthesis of, 264-265
Balbiani rings
  granules in nuclear sap, nuclear pores and cytoplasm, 321-323
  RNA in polysomes, 327-329
  transcription complexes in, 320-321
Brome mosaic virus
  properties of, 465-466
  regulation of translation
    effect of 5’ structure on translation efficiency, 468-469
    function of 3’-terminal region, 469-470
    messenger competition experiments, 457-458
  RNAs 1 and 2, structure, 467
  RNA 3, structure, 467
  RNA 4, structure, 466

C

Caps
  blocking and methylation of, 7-9
  5’-diphosphate and phosphodiester bonds and, 9-11
  formation, reovirus transcription and, 4-7
  intermediates in synthesis, 15
  occurrence of, 3-4
  pyrophosphate effect on, 11-13
  reovirus, proposed mechanism of synthesis, 13-15
  role in translation, 502-504
  structural and functional studies, 39-42
    comparison of nuclear and cytoplasmic mRNAs, 42-43
    effects of S-adenosylmethionine and S-adenosylhomocysteine on translation, 45-46
    inhibition of translation, 53-55
    ovalbumin mRNA, 46-47
    quantitative analysis, 43-45
    quantitative comparison of labeling of specific proteins, 51-53
    role of cap, 47-48
    survey system for quantitative and qualitative analysis of mRNAs, 48-51
Chromatin
  globin gene structure in, 361-364
  nucleosomal subunits, 356-358
  transcription, gene regulation and, 509-511
Chromosomes
  arrangement of coding sequences, 221-225
  metaphase, nucleosomes in, 359
Cloning
  cDNA sequences on bacterial plasmids
    characterization of plasmids carrying globin gene sequences, 191-195
    fold-back region as primer, 186-187
    insertion of polyadenylated mRNAs, 187-191
  sequence analysis by in vitro transcription
    agreement of nucleotide and amino acid sequences, 178-182
    deoxysubstitution for specific cleavage, 182-186
    start and stop signals for E. coli RNA polymerase, 182
Coding sequences, chromosomal arrangement, 221-225
Cytoplasm, skeletons, 379-382

D

Deoxyribonucleic acid
  accuracy of multiple copy sequences
    clustered sequences, 212-216
    interspersed sequences, 216-219
  monomeric particles, analysis, 359-361
Deoxyribonucleic acid
  selective transcription mediated by nonhistone proteins, 447-448
    binding of inhibitory protein, 450-451
    comparison of inhibitory proteins from different sources, 455-456
    inhibition of transcription in vitro, 451-453
    isolation of activator protein, 456-457
    isolation of inhibitory protein, 448-449
    mode of action of inhibitor, 453-454
    nonhistone stimulation of initiation, 459-460
    properties of inhibitory protein, 449-450
    selective stimulation by activator, 457-459
  single-copy sequences, constraints on maintenance, 208-212
  structure in native chromatin by ethidium bromide binding
    methods, 373-374
    results, 374-377

G

Globin genes, in vitro transcription as probe, 364-365

H

Haldane’s dilemma, magnification of, 207-208
Heterogeneous nuclear ribonucleic acid
  distribution in nuclear ribonucleoprotein, 335-336
  methylated nucleotides, 340-344
  mRNA sequences in 30 S ribonucleoprotein complexes, 339-340
  oligo(A)-linked sequences in 30 S ribonucleoprotein, 347-349
  oligo(A), oligo(U) and double-stranded RNA, 344-347
  pulse-labeled nuclear “DNA-like” RNA and nuclear poly(A), 336-339
  transcribed oligo(A) sequences, 99-100
    localization in nucleus and cytoplasm, 100-102
    size and composition of small poly(A), 102-105
  poly(U) sequences and, 105-112
  relationship to messenger RNA
    kinetic considerations, 280-283
    sequence properties, 277-280
    studies of 5’ termini, 283-291
    transcriptional units and physical size of precursors, 275-277
Histone gene expression
  evidence for transcriptional regulation
    cell cycle, stage-specific, 424-426
    hybridization analysis of messenger, 421-424
    nonhistone chromosomal phosphoproteins, 437-442
    nonhistone chromosomal proteins, 427-433
    stimulation of nondividing cells to proliferate, 433-437

M

Muscle
  embryonic, translational control in, 477-484
Muscle cells
  cultured, protein and mRNA synthesis in, 485-489
Mutations
  silent, mutation rate measurement and, 205-207

N

Nucleosomes, recognition of DNA restriction endonuclease sites in, 365-370

Nucleus
  skeleton, 382-384
    heterogeneous nuclear RNA association with other structures, 390-392
    messenger RNA sequences in steady-state heterogeneous nuclear RNA, 384-390

O

Ovalbumin
  regulation of synthesis
    level of transcription, 413-414
    posttranscriptional level, 411-413
    translational level, 411
Oviduct, as model for steroid hormone action, 405-410

P

Poliovirus genome, an exceptional eukaryotic mRNA, 81-1-96
Polysomes, Chironomus salivary glands, 323-327
Poly(U) sequences, properties of, 109-112

R

Reovirus methyltransferase
  ability to modify messenger ribonucleic acid, 16-17
  specificity, 15-16
Ribonuclease III
  fidelity of cleavage in vitro, 265-268
  properties of, 263-264
Ribonucleic acid
  low-molecular weight species, 392-393
    ScL or oncornavirus 7 S RNA, 397-399
    SnA, 393-395
    SnB and SnC, 395
    SnD, 395-396
    SnF and SnH, 396
    SnG, 396
    SnK, 396
    SnP, 387
  molecular weight distribution
    aqueous sucrose gradients, 253-258
    formamide sucrose gradients, 258-259
  vesicular stomatitis virus, role of blocked and unblocked 5'-termini, 83-87
Ribopolymers
  5'-terminal structures, effect on binding to ribosomes, 473-476

S

Silk fibroin
  nascent, visualization of, 313-318
  transcription unit, visualization of, 313-318
Steroid hormones
  model for action, 414-415
  regulation of transcription of specific genes, 403-404
    control theories, 404-405

Contents of Previous Volumes Volume 1 "Primer"

N u c l e a r Ribonucfeic Acid

HENRYHARRIS

in DNA Polymerase Reactions

Plant Virus Nucleic Acids

F. J. BOLLUM

ROY h'hRKHAA1

The Biosynthesis of Ribonucleic Acid i n Animol Systems

R . hl.

The Nucleoses of Ercherichia coli

s. ShiELLIE

I. R. LEHMAN

The Role of DNA in RNA Synthesis JEHAHD

HunwIrz

AND

Specificity of Chemical Mulogenesis

J. T. AUGUST

DAVIDR.

Polynucleotide Phosphorylase

KmEG

Column Chromatography of Oligonucleotides

hl. GRUNBERG-MANAGO

and Polynucleotides

MATTHYSSTAEHELIN

Messenger Ribonucleic Acid

FRITZLIPMANN

Mechanism

of

Action

and

Application

of

Azopyrim idines The Recent Excitement in the Coding Problem

J.

F. 11. C. CRICK

SKODA

The Function of the Pyrimidine Base i n the Some

Thoughts on the Double-Stranded Model of Deoxyribonucleic Acid

BENDICHA N D HERBERT s. ROSENKIIANZ -.-

Ribonuclease Reoction

HERBERTWITZEL

AARON

Denoturotion

and

Renaturation

of

Preparation, Fractionation. and Properties of sRNA

Deoxy-

G . L. BROWN

ribonucleic Acid

A w r ~ r o INDEX-SUB i~ JECT INDEX

J. M A R h f u n , R. ROWND,AND C. L. SCIfILDKHAUT Some Problems Concerning the Mocromoleculor Structure of Ribonucleic Acids

Volume 3

A. S. SPIRIN The Structure of DNA os Determined by X-Ray Scattering Techniques

Isolation and Fractionation of Nucleic Acids

K. S. KIRBY

VIrronro LUZZATI Molecular Mechanisms of Radiation Effects

A.

Cellulor Sites of RNA Synthesis

DAVIDM. PRESCOTT

WACKER

Ribonucleoses i n Taka-Diastase:

AWHOR INDEX-SUBJECT INDEX

FUJIOEGAMI,KENJI TAKAHASHI, AND TSUNEKO UCHIDA

Volume 2

Chemical Effects of Ionizing Radiations on

Nucleic Acids and Infomation Transfer

LIEBEF. CAVALIERI AND ROSENREHG

Properties,

Chemical Nature, and Applications

BARBARA

Nucleic Acids and Related Compounds

H.

JOSEPH

517

J. WEISS

518

CONTENTS OF PREVIOUS VOLUMES

The Regulation of RNA Synthesis in Bacteria

FIWERICY c.

NE1I)HARDT

Actinomycin a n d Nucleic Acid Function

E. REICIIAVI) I. €1. GOLDBERG De ~ o v oFrotein Synthesis in Vifro

8.N I S ~ I AAXN D J.

cinogens o n Nucleic Acids

1’. D. LAWLEY Nucleic Acids in Chloroplasts a n d Metabolic DNA

IWAhlUllA

TATSrlICIII

PELsioxI

Free Nucleotides in Animal Tissues

1’.

Effects of Some Chemicol Mutagens a n d Car-

hfANDEL

Enzymatic Alteration Structure

of

Macromolecular

P. R. SIIINIVASAN A N D ERNESTBOREK

AUTHOR INDEXSUI~JECT INDEX

Hormones o n d the Synthesis a n d Utilization

of Ribonucleic Acids

1. R.

Volume 4

Txra

Nucleoside Antibiotics Fluorinated Pyrimidines

CIIAIKESHEIDELDERGER Genetic Recombination in Bacteriophage

JACKJ. Fox, KYOICHI A. WATANABE, ANI) ALEXANDER BLWH Recombination o f D N A Molecules

CHARLES A. THOMAS, Jn.

E. VOLKIN

Appendix I. Recombination of a Pool of D N A Fragments w i t h Complementary Single-Chain Ends

D N A Polymerases from Mommalion Cells

H. M. KEIR The Evolution of

Base Sequences in Poly.

nucleotides

G.

s. WATSON,w. K. ShfITH, AND

CHARLES

B. J . MCCARTHY Biosynthesis of Ribosomes i n Bacterial Cells

SYOZOOSAWA

A.

THOhIAS, J R .

Appendix 11. Proof That Sequences of A, C, G, a n d T Can Be Assembled t o Produce Chains of Ultimate length, Avoiding Repetitions Everywhere

5-Hydroxymethylpyrimidines

and Their De-

rivatives

T. L. V. ULBRICHT Amino Acid Esters o f RNA, Nucleotides, a n d Related Compounds

H. G. ZACHAUAND H. FELDMANN Uptoke of D N A b y l i v i n g Cells

L. LEDOUX

A. S. FRAENKEL A N D J. GILLIS The Chemistry o f Pseudouridine

ROBEIITWARNEH CHAMBERS The Biochemistry of Pseudouridine

EUGENEGOLDWASSER AND ROBERT L. HEINRIKSON AUTHOR INI)EX-SUBJECT INDEX

AUTHOR INDEX-SUBJECT INDEX Volume 6 Volume 5

Nucleic Acids a n d M u t a b i l i t y

STEPHEN Introduction t o the Biochemistry o f 4 - A r a binosyl Nucleosides

SEYMOUR S. COHEN

ZAhlENHOF

Specificity i n the Structure of Transfer RNA

KIN-ICIIIIIO MIURA

519

CONTENTS OF I’HEVIOUS VOLUMES Synthetic Polynucleotides

A. hf.

Enzymatic Reduction of Ribonucleotides

J. ~ $ A S \ O U L IA~ ,~ W. GU~CHLDAUER h$ICIIELSON,

D

The DNA of Chloroplosts, Mitochondria, and

ACNE LARSSON AND

PETER

REICHARD

The Mutagenic Action of Hydroxylamine

1.

H. PHILLIPSA N D D. M. BROWN

Centrioles

s.

<:RANICK

AND

Mammalian Nucleolytic Enzymes and Their Localization

AHAROVG I B O R

Behauior, Neural Function, and RNA

DAVID SHUCARANI) HALINASIERAKOWSKA

H. HYDCN The

Nucleolus Ribosomes

ROBERT P.

and

the

Synthesis

of

AUTIIOH INDEX-SUBJECT INDEX

PERRY

The Nature and Biosynthesis of Nuclear Ribo-

Volume 8

nucleic Acids

G. P. GEORCIEV

Nucleic Acids-The

First Hundred Years

J. N. DAVIDSON

Replication of Phage RNA C H A R L E S W E I S S h l A N N AND

Nucleic Acids and Protamine i n Salmon Testes

SEVER0 OCHOA

GORDON H. Dixos

AUTHOH INDEX-SUBJECT INDEX

AND

M I C H A E L SMITH

Experimental Approaches to the Determination of the Nucleatide Sequences of large Oligonucleotides and Small Nu-

Volume 7

cleic Acids

Autoradiographic

Studies on DNA Replica-

tion in Normal and leukemic Human Chromosomes

FELICEGAVOSTO

Bacteriophage q5Xl74 and Related Viruses

CAHLR. WOESE the

Messenger

RNA

of

Hemoglobin

Ribonucleic Acids and Information Tronrfer in Animal Cells

A. A. HADJIOLOV

hfAl7TIN N E h r E R

ROBERTL. SINSHEIXIER The

H. CHANTRENNE, A. BURNY,AND G. MARBAIX

Transfer of Genetic Embryogenesis

Ease Composition i n

R O B E R T SHAPIRO

The Present Status of the Genetic Code

for

DNA

Chemistry of Guanine and its Biologically Significant Derivatives

LUSOMIH S. HNILICA

Search

Alterotions of Bacteria

G. F. CAUSE

Proteins of the Cell Nucleus

The

ROBERT W. HOLLEY

Information

Preparation

and

of

GEORGE W. RUSHIZKY AND HEHBERT A. SOBER

Purine N-Oxides ond Cancer GEORGE BOSWORTH

during

Characterization

Large Oligonucleotides

The Photochemistry,

BROWN

Photobiology, and Re-

pair of Polynucleotides

R. B. SETLOW

520

CONTENTS OF PREVIOUS VOLUMES

What Really Is DNA? Remarks on the Changing Aspects of a Scientific Concept

ERWIN CHARGAFF

Recent Nucleic Acid Research in China

TIEN-HSI CHENG AND ROY H. DOI

AUTHOR INDEX-SUBJECT INDEX

Volume 9

The Role of Conformation in Chemical Mutagenesis

B. SINGER AND H. FRAENKEL-CONRAT

Polarographic Techniques in Nucleic Acid Research

E. PALEČEK

RNA Polymerase and the Control of RNA Synthesis

JOHN P. RICHARDSON

Radiation-Induced Alterations in the Structure of Deoxyribonucleic Acid and Their Biological Consequences

D. T. KANAZIR

Optical Rotatory Dispersion and Circular Dichroism of Nucleic Acids

JEN TSI YANG AND TATSUYA SAMEJIMA

The Specificity of Molecular Hybridization in Relation to Studies on Higher Organisms

P. M. B. WALKER

Quantum-Mechanical Investigations of the Electronic Structure of Nucleic Acids and Their Constituents

BERNARD PULLMAN AND ALBERTE PULLMAN

The Chemical Modification of Nucleic Acids

N. K. KOCHETKOV AND E. I. BUDOWSKY

AUTHOR INDEX-SUBJECT INDEX

Volume 10

Induced Activation of Amino Acid Activating Enzymes by Amino Acids and tRNA

ALAN H. MEHLER

Transfer RNA and Cell Differentiation

NOBORU SUEOKA AND TAMIKO KANO-SUEOKA

N⁶-(Δ²-Isopentenyl)adenosine: Chemical Reactions, Biosynthesis, Metabolism, and Significance to the Structure and Function of tRNA

ROSS H. HALL

Nucleotide Biosynthesis from Preformed Purines in Mammalian Cells: Regulatory Mechanisms and Biological Significance

A. W. MURRAY, DAPHNE C. ELLIOTT, AND M. R. ATKINSON

Ribosome Specificity of Protein Synthesis in Vitro

ORIO CIFERRI AND BRUNO PARISI

Synthetic Nucleotide-peptides

ZOE A. SHABAROVA

The Crystal Structures of Purines, Pyrimidines and Their Intermolecular Complexes

DONALD VOET AND ALEXANDER RICH

AUTHOR INDEX-SUBJECT INDEX

Volume 11

The Induction of Interferon by Natural and Synthetic Polynucleotides

CLARENCE COLBY, JR.

Ribonucleic Acid Maturation in Animal Cells

R. H. BURDON

Liporibonucleoprotein as an Integral Part of Animal Cell Membranes

V. S. SHAPOT AND S. YA. DAVIDOVA


Uptake of Nonviral Nucleic Acids by Mammalian Cells

PUSHPA M. BHARGAVA AND G. SHANMUGAM

The Relaxed Control Phenomenon

ANN M. RYAN AND ERNEST BOREK

Molecular Aspects of Genetic Recombination

CEDRIC I. DAVERN

Principles and Practices of Nucleic Acid Hybridization

DAVID E. KENNELL

Recent Studies Concerning the Coding Mechanism

THOMAS H. JUKES AND LILA GATLIN

The Ribosomal RNA Cistrons

M. L. BIRNSTIEL, M. CHIPCHASE, AND J. SPEIRS

Three-Dimensional Structure of tRNA

FRIEDRICH CRAMER

Current Thoughts on the Replication of DNA

ANDREW BECKER AND JERARD HURWITZ

Reaction of Aminoacyl-tRNA Synthetases with Heterologous tRNA's

K. BRUCE JACOBSON

On the Recognition of tRNA by Its Aminoacyl-tRNA Ligase

ROBERT W. CHAMBERS

AUTHOR INDEX-SUBJECT INDEX

Volume 12

Ultraviolet Photochemistry as a Probe of Polyribonucleotide Conformation

A. J. LOMANT AND JACQUES R. FRESCO

Some Recent Developments in DNA Enzymology

MEHRAN GOULIAN

Minor Components in Transfer RNA: Their Characterization, Location, and Function

SUSUMU NISHIMURA

The Mechanism of Aminoacylation of Transfer RNA

ROBERT B. LOFTFIELD

Regulation of RNA Synthesis

EKKEHARD K. F. BAUTZ

The Poly(dA-dT) of Crab

M. LASKOWSKI, SR.

The Chemical Synthesis and the Biochemical Properties of Peptidyl-tRNA

YEHUDA LAPIDOT AND NATHAN DE GROOT

SUBJECT INDEX

Volume 13

Reactions of Nucleic Acids and Nucleoproteins with Formaldehyde

M. YA. FELDMAN

Synthesis and Functions of the -C-C-A Terminus of Transfer RNA

MURRAY P. DEUTSCHER

Mammalian RNA Polymerases

SAMSON T. JACOB

Poly(adenosine diphosphate ribose)

TAKASHI SUGIMURA

The Stereochemistry of Actinomycin Binding to DNA and Its Implications in Molecular Biology

HENRY M. SOBELL

Resistance Factors and Their Ecological Importance to Bacteria and to Man

M. H. RICHMOND

Lysogenic Induction

ERNEST BOREK AND ANN RYAN


Recognition in Nucleic Acids and the Anticodon Families

JACQUES NINIO

Translation and Transcription of the Tryptophan Operon

FUMIO IMAMOTO

Lymphoid Cell RNA's and Immunity

ARTHUR A. GOTTLIEB

SUBJECT INDEX

Volume 14

DNA Modification and Restriction

WERNER ARBER

Mechanism of Bacterial Transformation and Transfection

NIHAL K. NOTANI AND JANE K. SETLOW

DNA Polymerases II and III of Escherichia coli

MALCOLM L. GEFTER

The Primary Structure of DNA

KENNETH MURRAY AND ROBERT W. OLD

RNA-Directed DNA Polymerase-Properties and Functions in Oncogenic RNA Viruses and Cells

MAURICE GREEN AND GRAY F. GERARD

SUBJECT INDEX

Volume 15

Information Transfer in Cells Infected by RNA Tumor Viruses and Extension to Human Neoplasia

D. GILLESPIE, W. C. SAXINGER, R. C. GALLO

Mammalian DNA Polymerases

F. J. BOLLUM

Eukaryotic RNA Polymerases and the Factors That Control Them

B. B. BISWAS, A. GANGULY, AND D. DAS

Structural and Energetic Consequences of Noncomplementary Base Oppositions in Nucleic Acid Helices

A. J. LOMANT AND JACQUES R. FRESCO

The Chemical Effects of Nucleic Acid Alkylation and Their Relation to Mutagenesis and Carcinogenesis

B. SINGER

Effects of the Antibiotics Netropsin and Distamycin A on the Structure and Function of Nucleic Acids

CHRISTOPH ZIMMER

SUBJECT INDEX

Volume 16

Initiation of Enzymic Synthesis of Deoxyribonucleic Acid by Ribonucleic Acid Primers

ERWIN CHARGAFF

Transcription and Processing of Transfer RNA Precursors

JOHN D. SMITH

Bisulfite Modification of Nucleic Acids and Their Constituents

HIKOYA HAYATSU

The Mechanism of the Mutagenic Action of Hydroxylamines

E. I. BUDOWSKY

Diethyl Pyrocarbonate in Nucleic Acid Research

L. EHRENBERG, I. FEDORCSÁK, AND F. SOLYMOSY

SUBJECT INDEX


Volume 17

The Enzymic Mechanism of Guanosine 5',3'-Polyphosphate Synthesis

FRITZ LIPMANN AND JOSE SY

Effects of Polyamines on the Structure and Reactivity of tRNA

TED T. SAKAI AND SEYMOUR S. COHEN

Information Transfer and Sperm Uptake by Mammalian Somatic Cells

AARON BENDICH, ELLEN BORENFREUND, STEVEN S. WITKIN, DELIA BEJU, AND PAUL J. HIGGINS

Studies on the Ribosome and Its Components

PNINA SPITNIK-ELSON AND DAVID ELSON

Classical and Postclassical Modes of Regulation of the Synthesis of Degradative Bacterial Enzymes

BORIS MAGASANIK

Characteristics and Significance of the Polyadenylate Sequence in Mammalian Messenger RNA

GEORGE BRAWERMAN

Polyadenylate Polymerases

MARY EDMONDS AND MARY ANN WINTERS

Three-Dimensional Structure of Transfer RNA

SUNG HOU KIM

Insights into Protein Biosynthesis and Ribosome Function through Inhibitors

SIDNEY PESTKA

Interaction with Nucleic Acids of Carcinogenic and Mutagenic N-Nitroso Compounds

W. LIJINSKY

Biochemistry and Physiology of Bacterial Ribonuclease

ALOK K. DATTA AND SALIL K. NIYOGI

SUBJECT INDEX

Volume 18

The Ribosome of Escherichia coli

R. BRIMACOMBE, K. H. NIERHAUS, R. A. GARRETT AND H. G. WITTMANN

Structure and Function of 5 S and 5.8 S RNA

VOLKER A. ERDMANN

High-Resolution Nuclear Magnetic Resonance Investigations of the Structure of tRNA in Solution

DAVID R. KEARNS

Premelting Changes in DNA Conformation

E. PALEČEK

Quantum-Mechanical Studies on the Conformation of Nucleic Acids and Their Constituents

BERNARD PULLMAN AND ANIL SARAN

SUBJECT INDEX
