Volume 147 Number 1 September 30, 2011
www.cell.com
Genomic Basis for Oncogenic Rearrangements Frontiers in Human Genetics
Editor Emilie Marcus Senior Deputy Editor Elena Porro Deputy Editor Robert Kruger Scientific Editors Karen Carniol Kara Cerveny Michaeleen Doucleff Kara Lassen Fabiola Rivas Niki Scaplehorn Lara Szewczak Managing Editor Andy Smith Art Program Manager Andrew A. Tang Senior Illustrator/Designer Yvonne Blanco Illustrator Kate Mahan Production Staff Reyna Clancy Editorial Assistant Anna Hofvander
Editorial Board C. David Allis Geneviève Almouzni Uri Alon Angelika Amon Johan Auwerx Richard Axel Cori Bargmann Konrad Basler Bonnie Bassler David Baulcombe Jeffrey Benovic Carolyn Bertozzi Wendy Bickmore Elizabeth Blackburn Joan Brugge Lewis Cantley Joanne Chory David Clapham Andrew Clark Hans Clevers Stephen Cohen Pascale Cossart George Daley Jeff Dangl Ted Dawson Pier Paolo di Fiore Marileen Dogterom Julian Downward Bruce Edgar Steve Elledge Anne Ephrussi Ronald Evans Witold Filipowicz Marco Foiani Elaine Fuchs Yukiko Goda Stephen Goff Joe Goldstein
Douglas Green Leonard Guarente Taekjip Ha Daniel Haber Ulrike Heberlein Mark Hochstrasser Erika Holzbaur Arthur Horwich Tony Hunter James Hurley Richard Hynes Thomas Jessell Tarun Kapoor Narry Kim Mary-Claire King David Kingsley Frank Kirchhoff John Kuriyan Robert Lamb Mark Lemmon Beth Levine Wendell Lim Jennifer Lippincott-Schwartz Dan Littman Richard Losick Scott Lowe Tom Maniatis Matthias Mann Kelsey Martin Joan Massagué Iain Mattaj Satyajit Mayor Ruslan Medzhitov Craig Mello Tom Misteli Tim Mitchison Danesh Moazed Alex Mogilner Paul Nurse
Roy Parker Dana Pe’er Kathrin Plath Carol Prives Klaus Rajewsky Venki Ramakrishnan Rama Ranganathan Anne Ridley Alexander Rudensky Helen Saibil Joshua Sanes Charles Sawyers Ueli Schibler Joseph Schlessinger Hans Schöler Trina Schroer Geraldine Seydoux Kevan Shokat Pamela Sklar Nahum Sonenberg James Spudich Paul Sternberg Bruce Stillman Azim Surani Keiji Tanaka Craig Thompson Robert Tjian Ulrich von Andrian Gerhard Wagner Jonathan Weissman Matthew Welch Tian Xu Shinya Yamanaka Marino Zerial Xiaowei Zhuang Huda Zoghbi
Cell Office Cell, Cell Press, 600 Technology Square, 5th Floor, Cambridge, Massachusetts 02139 Phone: (+1) 617 661 7057, Fax: (+1) 617 661 7061, E-mail:
[email protected] Online Publication: http://www.cell.com Cell (ISSN 0092-8674) is published biweekly by Cell Press, 600 Technology Square, 5th Floor, Cambridge, Massachusetts 02139. The institutional subscription rate for 2011 is $1,605 (US and Canada) or $1,847 (elsewhere). The individual subscription rate is $320 (US and Canada) or $363 (elsewhere). The individual copy price is $50. Periodicals postage paid at Boston, Massachusetts and additional mailing offices. Postmaster: send address changes to Elsevier Customer Service Americas, Cell Press Journals, 3251 Riverport Lane, Maryland Heights, MO 63043, USA. The paper used in this publication meets the requirements of ANSI/NISO Z39.48-1992 (Permanence of Paper). Printed by Dartmouth Printing Company, Hanover, NH.
Cell Press President & CEO Lynne Herndon Editor in Chief, Vice President of Content Development Emilie Marcus Vice President of Business Development Joanne Tracy Vice President of Web Development and Operations Keith Wollman Publishing Directors Peter Lee Deborah Sweet Editorial Director, Reviews Strategy Katja Brose Editorial Director, Content Development Elena Porro Director of Marketing Jonathan Atkinson Production Manager Meredith Adinolfi Press Office Elisabeth (Lisa) Lyons Mary Beth O’Leary
© 2011 Elsevier Inc. All rights reserved. This journal and the individual contributions contained in it are protected under copyright by Elsevier Inc., and the following terms and conditions apply to their use: Photocopying: Single photocopies of single articles may be made for personal use as allowed by national copyright laws. Permission of the Publisher and payment of a fee are required for all other photocopying, including multiple or systematic copying, copying for advertising or promotional purposes, resale, and all forms of document delivery. Special rates are available for educational institutions that wish to make photocopies for nonprofit educational classroom use. For information on how to seek permission, visit www.elsevier.com/permissions or call (+44) 1865 843830 (UK) / (+1) 215 239 3804 (US). Permissions: For information on how to seek permission, visit www.elsevier.com/permissions or call (+44) 1865 843830 (UK) / (+1) 215 239 3804 (US). Derivative Works: Subscribers may reproduce tables of contents or prepare lists of articles including summaries for internal circulation within their institutions. Permission of the Publisher is required for resale or distribution outside the institution. Permission of the Publisher is required for all other derivative works, including compilations and translations (please consult www.elsevier.com/permissions). Electronic Storage or Usage: Permission of the Publisher is required to store or use electronically any material contained in this journal, including any article or part of an article (please consult www.elsevier.com/permissions). Except as outlined above, no part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior written permission of the Publisher.
Display Advertising
Key Accounts Manager: Victoria Macomber, ph: 508 928 1255; fax: 508 928 1256; e-mail: [email protected]
Northeast/Mid-Atlantic: Gordon Sheffield, ph: 617 386 2189; fax: 617 397 2805; e-mail: g.sheffi [email protected]
Midwest/Southeast/Eastern Canada: Inez Herrero-Redman, ph: 585 678 4395; fax: 585 678 4722; e-mail: [email protected]
Northwest/Southwest/Western Canada: Mike Walker, ph: 925 648 3101; fax: 213 232 3245; e-mail: [email protected]
California: Jonathan Sismey, ph: 845 987 8128; fax: 845 544 2049; e-mail: [email protected]
UK/Europe: Darryl Freeman, ph: +44 13 9228 5827; e-mail: [email protected]
Asia: Kevin Partridge, ph: +44 18 6584 3717; fax: +44 18 6584 3010; e-mail: [email protected]
Classified Advertising
United States and Canada: Gordon Sheffield, Key Account Manager, ph: 617 386 2189; fax: 617 397 2805; e-mail: g.sheffi [email protected]
UK, Europe, and Asia: Sabrina Dodge, Key Account Manager, ph: +44 20 7424 4997; fax: +44 18 6585 3136; e-mail: [email protected]
Notice: No responsibility is assumed by the Publisher for any injury and/or damage to persons or property as a matter of products liability, negligence, or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made. Although all advertising material is expected to conform to ethical (medical) standards, inclusion in this publication does not constitute a guarantee or endorsement of the quality or value of such product or of the claims made of it by its manufacturer. Reprints: Article reprints are available through Cell’s reprint service; for information, contact Nicholas Pavlow (e-mail:
[email protected]; ph: (+1) 212 633 3960). Subscription Orders and Inquiries: Mail, fax, or e-mail address changes to Elsevier Customer Service Americas, allowing 4–6 weeks for processing. Lost or damaged issues will be replaced, subject to availability, if Cell Press is notified within the claim period (US and airmail delivery: 3 months from issue date; surface delivery: 4 months from issue date). Periodical delivery in the US can take up to 3 weeks. Airmail delivery can take 2–4 weeks. The price of a single copy of Cell is $50 (excluding special issues). All orders must be prepaid and in writing. Please include the volume and issue number, payment (check or credit card, MasterCard, Visa, or American Express only), and a delivery address. Allow 4–6 weeks for delivery. Mailing address: Elsevier Customer Service Americas, Cell Press Journals, 3251 Riverport Lane, Maryland Heights, MO 63043, USA. Toll-free phone within USA/Canada: 866 314 2355; phone for outside US/Canada: (+1) 314 447 8880; fax: (+1) 314 447 8029; e-mail:
[email protected]; internet: www.cellpress.com or <www.cell.com>. Funding Body Agreements and Policies: Elsevier has established agreements and developed policies to allow authors whose articles appear in journals published by Elsevier to comply with potential manuscript archiving requirements as specified as conditions of their grant awards. To learn more about existing agreements and policies, visit http://www.cell.com/cellpress/FundingBodyAgreements. Guide for Authors: For a full and complete guide for authors, please go to www.cell.com/authors.
Leading Edge
Cell Volume 147 Number 1, September 30, 2011
IN THIS ISSUE
SELECT 5
Cancer’s Epigenome
VOICES 9
Human Genome: What’s Been Most Surprising?
ANALYSIS 11
Genomics in Africa: Avoiding Past Pitfalls
M. Kaplan
COMMENTARIES 14
Genomics Reaches the Clinic: From Basic Discoveries to Clinical Impact
T.A. Manolio and E.D. Green
17
Genetics and Genomics to the Clinic: A Long Road ahead
D. Ginsburg
PREVIEWS 20
Translocation Mapping Exposes the Risky Lifestyle of B Cells
R.P. McCord and J. Dekker
22
Splicing up Pluripotency
B.R. Graveley
24
Unweaving the Autism Spectrum
C. Lord
ESSAY 26
A Blueprint for Advancing Genetics-Based Cancer Therapy
W.R. Sellers
PERSPECTIVES 32
Clan Genomics and the Complex Architecture of Human Disease
J.R. Lupski, J.W. Belmont, E. Boerwinkle, and R.A. Gibbs
44
Metagenomics and Personalized Medicine
H.W. Virgin and J.A. Todd
PRIMER 57
Mapping Rare and Common Causal Alleles for Complex Human Diseases
S. Raychaudhuri
REVIEW 70
Modeling Human Disease in Humans: The Ciliopathies
G. Novarino, N. Akizu, and J.G. Gleeson
SNAPSHOT 248
Human Biomedical Genomics
E.E. Kenny and C.D. Bustamante
Articles
81
The Lin28/let-7 Axis Regulates Glucose Metabolism
H. Zhu, N. Shyh-Chang, A.V. Segre, G. Shinoda, S.P. Shah, W.S. Einhorn, A. Takeuchi, J.M. Engreitz, J.P. Hagan, M.G. Kharas, A. Urbach, J.E. Thornton, R. Triboulet, R.I. Gregory, DIAGRAM Consortium, MAGIC Investigators, D. Altshuler, and G.Q. Daley
95
Translocation-Capture Sequencing Reveals the Extent and Nature of Chromosomal Rearrangements in B Lymphocytes
I.A. Klein, W. Resch, M. Jankovic, T. Oliveira, A. Yamane, H. Nakahashi, M. Di Virgilio, A. Bothmer, A. Nussenzweig, D.F. Robbiani, R. Casellas, and M.C. Nussenzweig
107
Genome-wide Translocation Sequencing Reveals Mechanisms of Chromosome Breaks and Rearrangements in B Cells
R. Chiarle, Y. Zhang, R.L. Frock, S.M. Lewis, B. Molinie, Y.-J. Ho, D.R. Myers, V.W. Choi, M. Compagno, D.J. Malkin, D. Neuberg, S. Monti, C.C. Giallourakis, M. Gostissa, and F.W. Alt
120
A DNA Repair Complex Functions as an Oct4/Sox2 Coactivator in Embryonic Stem Cells
Y.W. Fong, C. Inouye, T. Yamaguchi, C. Cattoglio, I. Grubisic, and R. Tjian
132
An Alternative Splicing Switch Regulates Embryonic Stem Cell Pluripotency and Reprogramming
M. Gabut, P. Samavarchi-Tehrani, X. Wang, V. Slobodeniuc, D. O’Hanlon, H.-K. Sung, M. Alvarez, S. Talukder, Q. Pan, E.O. Mazzoni, S. Nedelec, H. Wichterle, K. Woltjen, T.R. Hughes, P.W. Zandstra, A. Nagy, J.L. Wrana, and B.J. Blencowe
147
Selective Translation of Leaderless mRNAs by Specialized Ribosomes Generated by MazF in Escherichia coli
O. Vesper, S. Amitai, M. Belitsky, K. Byrgazov, A.C. Kaberdina, H. Engelberg-Kulka, and I. Moll
158
Regulatory Control of the Resolution of DNA Recombination Intermediates during Meiosis and Mitosis
J. Matos, M.G. Blanco, S. Maslen, J.M. Skehel, and S.C. West
173
Saturated Fatty Acids Induce c-Src Clustering within Membrane Subdomains, Leading to JNK Activation
R.G. Holzer, E.-J. Park, N. Li, H. Tran, M. Chen, C. Choi, G. Solinas, and M. Karin
185
Conformation-Sensing Antibodies Stabilize the Oxidized Form of PTP1B and Inhibit Its Phosphatase Activity
A. Haque, J.N. Andersen, A. Salmeen, D. Barford, and N.K. Tonks
199
Crystal Structure of the Mammalian GIRK2 K+ Channel and Gating Regulation by G Proteins, PIP2, and Sodium
M.R. Whorton and R. MacKinnon
209
A Pseudoatomic Model of the Dynamin Polymer Identifies a Hydrolysis-Dependent Powerstroke
J.S. Chappie, J.A. Mears, S. Fang, M. Leonard, S.L. Schmid, R.A. Milligan, J.E. Hinshaw, and F. Dyda
223
Beclin1 Controls the Levels of p53 by Regulating the Deubiquitination Activity of USP10 and USP13
J. Liu, H. Xia, M. Kim, L. Xu, Y. Li, L. Zhang, Y. Cai, H.V. Norberg, T. Zhang, T. Furuya, M. Jin, Z. Zhu, H. Wang, J. Yu, Y. Li, Y. Hao, A. Choi, H. Ke, D. Ma, and J. Yuan
235
Absence of CNTNAP2 Leads to Epilepsy, Neuronal Migration Abnormalities, and Core Autism-Related Deficits
O. Peñagarikano, B.S. Abrahams, E.I. Herman, K.D. Winden, A. Gdalyahu, H. Dong, L.I. Sonnenblick, R. Gruver, J. Almajano, A. Bragin, P. Golshani, J.T. Trachtenberg, E. Peles, and D.H. Geschwind
CORRECTION 247
AKT/FOXO Signaling Enforces Reversible Differentiation Blockade in Myeloid Leukemias
S.M. Sykes, S.W. Lane, L. Bullinger, D. Kalaitzidis, R. Yusuf, B. Saez, F. Ferraro, F. Mercier, H. Singh, K.M. Brumme, S.S. Acharya, C. Scholl, Z. Tothova, E.C. Attar, S. Fröhling, R.A. DePinho, D.G. Gilliland, S.A. Armstrong, and D.T. Scadden
ANNOUNCEMENTS POSITIONS AVAILABLE
On the cover: Chromosomal rearrangements disrupt the integrity of the genome and are involved in producing leukemias and lymphomas. Klein et al. (pp. 95–106) implement translocation capture sequencing (TC-Seq) to document genome-wide chromosomal rearrangements in primary B cells. Their results reveal that double-strand break location, transcriptional activity, chromosome territories, and activation-induced cytidine deaminase (AID) activity influence the extent of genome rearrangement and favor recurrent oncogenic translocations. The cover image depicts AID as a spider and a TC-Seq-generated translocation map to the immunoglobulin heavy-chain locus as its web. The cover was designed by Rafael Casellas and Ethan Tyler.
Leading Edge
In This Issue
Splicing Up ES Cell Pluripotency PAGE 132
Gabut et al. identify a splicing switch that leads to an ESC-specific form of the transcription factor FOXP1, with an altered DNA-binding domain. This isoform stimulates expression of pluripotency genes, represses differentiation genes, and is also required for efficient reprogramming.
Linking mTOR, let-7, and Diabetes PAGE 81
Connecting cellular energy sensing with type 2 diabetes (T2D), Zhu et al. show that muscle-specific overexpression of the let-7 miRNA-binding protein Lin28 ameliorates insulin resistance and improves glucose tolerance in mice via upregulation of mTOR signaling. Genome-wide association studies reveal that let-7 target genes are enriched for SNPs associated with T2D, extending the relevance of these findings to human disease.
Making Translocation Spots Hot PAGE 95 and PAGE 107
Aberrant fusions between the c-myc and IgH loci are frequently associated with B-cell lymphoma. In this issue, Klein et al. and Chiarle et al. report high-throughput techniques to investigate how these oncogenic rearrangements occur. By analyzing the repair of inducible DNA double-strand breaks in millions of primary B cells, the authors find that AID activity, chromosome topology, and transcription all conspire to increase the probability of translocation. The findings provide a molecular understanding of translocation hot spots.
DNA Repair Complex Meets Pluripotency Factors PAGE 120
Fong et al. uncover a stem cell-selective function of a DNA repair complex comprising XPC, RAD23B, and CETN2. This complex coactivates Oct4/Sox2-directed transcription of key pluripotency genes and is important for stem cell maintenance and reprogramming.
Remodeling Ribosomes under Stress PAGE 147
Vesper et al. show that, when under stress, E. coli generates functionally specialized ribosomes by truncating 16S rRNA with the endoribonuclease MazF. Moreover, MazF specifically removes 5′ UTRs of distinct transcripts to form leaderless mRNAs, which are selectively translated by the altered ribosomes to adjust the translational program in response to stress.
Timed Triggers for Crossovers PAGE 158
The timely resolution of DNA recombination intermediates is essential for chromosome segregation. Matos et al. show that the activation of two crossover-promoting endonucleases is temporally coordinated with cell-cycle progression to enable the specialized meiotic and mitotic chromosome segregation programs. In meiosis, the nucleases are hyperactivated to promote crossovers, whereas in mitosis, their activities are restrained to favor noncrossover products.
Cell 147, September 30, 2011 ©2011 Elsevier Inc. 1
Targeting Reversible Oxidation PAGE 185
The phosphatase PTP1B regulates insulin and leptin signaling. Haque et al. identify an antibody that selectively recognizes and stabilizes the oxidized, inactive form of PTP1B to promote insulin signaling. The findings indicate that the targeting of oxidized PTP1B is a promising therapeutic approach for diabetes and obesity.
Fatty Acids Congeal Signaling Pathways PAGE 173
What are the molecular mechanisms behind the ill effects of a high-fat diet? Holzer et al. provide evidence that saturated fatty acids (FA) trigger stress signaling and insulin resistance through their effects on membrane fluidity. Saturated FA, but not unsaturated FA, induce clustering of c-Src into membrane subdomains, which in turn, leads to JNK signaling and resistance to insulin. The findings provide a new model for how membrane composition can trigger different signaling cascades, distinct from a ligand-sensing mechanism.
Channel Gating by Coincidence PAGE 199
G protein-gated potassium channels control electrical excitability in cells in response to G protein activation and the signaling lipid PIP2. Whorton and MacKinnon report structures of wild-type and constitutively active mutant channels, providing mechanistic insights into multiligand regulation of channel gating. Their findings reveal that, in the absence of PIP2, G proteins appear to open only one of two sequential gates, whereas the presence of both molecules opens both gates, leading to ion conduction.
Stopping Autophagy on a Dime PAGE 223
Liu et al. report a small-molecule inhibitor of autophagy that blocks two ubiquitin-specific peptidases. Using the inhibitor, the authors elucidate a regulatory pathway linking Beclin1 and p53 that is controlled by ubiquitination. As reduced levels of Beclin1 lead to reduced p53, these findings provide insights into how Beclin1 functions as a tumor suppressor.
New Model for Autism Treatment PAGE 235
Peñagarikano et al. report that mice lacking the neurexin Cntnap2, a gene strongly associated with autism and related neurodevelopmental disorders, exhibit striking neuropathological and behavioral parallels to humans with mutations in the equivalent gene. As in human autism patients, the repetitive, but not the social behavioral, deficits of the mice are rescued by the drug risperidone, suggesting promise for the model in facilitating the identification of new therapies.
Dynamin Powerstrokes PAGE 209
The dynamin GTPase mediates the release of clathrin-coated vesicles from the plasma membrane by assembling into collars around the necks of invaginated coated pits. Chappie et al. combine cryo-EM, X-ray diffraction, and crosslinking data to describe the structure and topology of assembled dynamin collars, suggesting that energy from dimerization and GTP hydrolysis is converted into large structural movements that drive membrane fission.
Leading Edge
Select
Cancer’s Epigenome
Large-scale cancer genomics projects are just starting to trickle into the literature, and the flow will only grow in size and impact in the upcoming year. Nevertheless, the studies so far are already reshaping our view of tumor progression, especially in terms of how cancer manipulates the epigenome for fast growth and adaptability. This Select discusses recent genomics articles that, collectively, offer a new mechanism for cancer evolution and highlight the rising importance of chromatin remodeling factors in cancer.
A Cellular Galápagos?
Cancer cells have remarkable plasticity. They quickly respond to changes in their environment by switching phenotypes and even back-tracking out of differentiated states. Now, a landmark study by Hansen et al. (2011) uncovers a new mechanism driving cancer cells’ adaptability: increased stochasticity in DNA methylation levels across the genome. Methylating cytosines in promoters generally silences genes, and when a cytosine is immediately next to a guanine (i.e., a “CpG” dinucleotide), the cytosine is often methylated. Thus, studies on DNA methylation in cancer have focused primarily on gene promoters and genomic regions with a high concentration of CpG dinucleotides. Hansen et al. take a broader approach and examine regions with lower CpG density. Using a custom-made Illumina microarray, the authors measure methylation levels at 384 sites in 290 tumor and normal samples, including colon, lung, thyroid, breast, and Wilms’ tumors. Compared to normal tissues, all five cancers display significant increases in methylation variability at 70%–92% of CpG sites; that is, the standard deviations for methylation levels are strikingly higher in cancer than in normal tissue. The authors then use bisulfite sequencing on three matched colorectal tumors to characterize the methylation pattern on a genomic scale. This analysis reveals large blocks of hypomethylated DNA, which include genes associated with tumor heterogeneity and progression. Although 80% of the genes in these blocks are silenced in both cancer and normal samples, the variability in gene expression across samples is substantially higher in cancer than in normal tissue, as the authors observed for methylation levels. Together, these findings suggest a provocative new model for how cancer cells can adapt so quickly to fluctuating environments: increased randomness in methylation levels enhances the variability of gene expression across a uniform population of cells, allowing selective pressures to favor the epigenetic configurations most suitable for a given condition. In other words, the epigenetic instability provides the opportunity for Darwin-like evolution to occur at the cellular level. Hansen, K.D., et al. (2011). Nat. Genet. 43, 768–775.
A multidimensional scaling of methylation for selected CpGs in cancer, showing the distribution of normal tissues (left) and cancer (right). A few cancer samples fall within the normal range but do not define a tight distribution themselves. Dotted lines indicate the typical variation for normal tissues. Image courtesy of K.D. Hansen and W. Timp.
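The comparison of per-site methylation variability described above can be illustrated with a small sketch. This is not the analysis from Hansen et al., which tested for statistical significance; it only counts the sites where the standard deviation is higher in cancer than in normal samples. The function name, the site IDs, and every methylation value below are invented for illustration.

```python
from statistics import stdev

def fraction_more_variable(normal, cancer):
    """Fraction of CpG sites whose methylation standard deviation is
    higher across cancer samples than across normal samples.

    normal, cancer: dict mapping a site ID to a list of methylation
    levels (0 to 1), one value per sample. Hypothetical toy input.
    """
    shared = [site for site in normal if site in cancer]
    higher = sum(1 for site in shared if stdev(cancer[site]) > stdev(normal[site]))
    return higher / len(shared)

# Made-up values: two sites are more variable in cancer, one is not.
normal = {
    "site_1": [0.50, 0.52, 0.49, 0.51],
    "site_2": [0.80, 0.79, 0.81, 0.80],
    "site_3": [0.20, 0.30, 0.10, 0.40],
}
cancer = {
    "site_1": [0.20, 0.70, 0.45, 0.90],
    "site_2": [0.40, 0.95, 0.60, 0.85],
    "site_3": [0.25, 0.26, 0.24, 0.25],
}
frac = fraction_more_variable(normal, cancer)
print(f"{frac:.2f} of sites are more variable in cancer")
```

A real analysis would also need a significance test per site and a correction for multiple comparisons, which this sketch leaves out.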
Massive Sequencing Strikes Suppressor Gold
If epigenetic patterns, such as DNA and histone modifications, contribute to cancer’s evolution and tumorigenesis, then proteins involved in shaping the chromatin landscape should be key drivers in a broad range of cancers. Now, four independent exome sequencing studies identify multiple chromatin remodeling factors as tumor suppressors and oncogenes in non-Hodgkin lymphoma (Morin et al., 2011; Pasqualucci et al., 2011). Until now, only a few genetic drivers have been identified for the most common forms of non-Hodgkin lymphoma: diffuse large B cell lymphoma (DLBCL) and follicular lymphoma. To hunt for more tumor suppressors in lymphoma, Morin et al. (2011) sequence the genomes or exomes of 15 lymphoma cases (14 DLBCLs and 1 follicular lymphoma) and then perform RNA sequencing on these cases plus 113 more samples. Independently, Pasqualucci et al. combine exome sequencing and copy number analysis on six DLBCL cases. Both groups identify 100 genes recurrently mutated in multiple tumors, and both pools of genes are enriched with enzymes involved in DNA methylation and histone modifications, such as histone methyltransferases and acetyltransferases. In both studies, 10% of the lymphomas carry mutations in the MEF2B gene, which encodes a factor involved with histone acetylation. More importantly, Morin et al. find that 89% of the follicular lymphoma cases carry mutations in the MLL2 gene, and both groups observe MLL2 mutations in 24%–32% of the DLBCL tumors, making MLL2 the most commonly mutated gene in non-Hodgkin lymphoma. MLL2 encodes an H3K4-specific histone methyltransferase involved in gene activation. The mutation patterns of MLL2 (that is, a high percentage of inactivating mutations) strongly indicate that MLL2 is a critical tumor suppressor in non-Hodgkin lymphoma. Moreover, both MLL2 and MEF2B display signs that selective pressures accelerated the acquisition of nonsynonymous point mutations, providing further evidence that these chromatin remodelers are core drivers of non-Hodgkin lymphoma. Morin, R.D., et al. (2011). Nature 476, 298–303. Pasqualucci, L., et al. (2011). Nat. Genet. 43, 830–837.
Genome-wide visualization of somatic mutation targets in non-Hodgkin lymphoma from Morin et al. (2011). Purple circles mark genes mutated in more than one case, with their diameters proportional to the number of cases with single-nucleotide variants observed. Image courtesy of R. Morin.
Remodeling Tumor Suppressor Theory
Using a strategy similar to that of the two lymphoma studies above, Gui et al. (2011) and Li et al. (2011) also identify chromatin remodelers as new tumor suppressors in bladder carcinoma and hepatitis C-associated liver cancers (HCC), respectively. In both studies, the authors start by sequencing the exomes of 10 different tumors (to 90-fold coverage); identify genes with nonsilent mutations in more than one tumor; and then sequence these genes in 100 more samples. For bladder cancer, 8 of the 49 most frequently mutated genes encode chromatin remodeler proteins, including a SWI/SNF-related gene (ARID1A), a histone demethylase (UTX), acetyltransferases (CREBBP and EP300), and methyltransferases (MLL and MLL3). For HCC, 18.2% of the samples harbor inactivating mutations in the ARID2 gene, which encodes a subunit of the chromatin remodeling complex PBAF (polybromo and BRG1-associated factor). In both studies, the mutational patterns observed for these seven chromatin remodelers strongly implicate these genes as tumor suppressors in bladder cancer or HCC.
Histology image of a hepatitis C-associated liver cancer case with an ARID2 mutation. Image courtesy of M. Torbenson.
A major strength of the study by Hansen et al. (see first summary) is that the increased variability in methylation and gene expression was observed across diverse types of cancers, suggesting that this epigenetic instability is a general property of cancer. Now, a related trend seems to be arising from exome sequencing projects: disrupting enzymes that maintain the proper epigenetic landscape drives malignant transformation in diverse types of cancers, from non-Hodgkin lymphoma to bladder and liver cancers. Collectively, these studies also suggest that targeting chromatin remodeler genes may be a particularly productive approach for finding new tumor suppressors and oncogenes in future sequencing projects. Gui, Y., et al. (2011). Nat. Genet. 43, 875–878. Li, M., et al. (2011). Nat. Genet. 43, 828–829.
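The discovery step described above (keep only genes that carry nonsilent mutations in more than one tumor, then follow them up in a larger cohort) can be illustrated with a minimal sketch. The function name, the data layout, the effect labels, and the example mutation calls below are all hypothetical and are not taken from either study; real pipelines also correct for gene length and background mutation rate.

```python
from collections import Counter

def recurrently_mutated_genes(tumor_mutations, min_tumors=2):
    """Return genes with a nonsilent mutation in at least `min_tumors` tumors.

    tumor_mutations maps a tumor ID to a list of (gene, effect) pairs.
    This layout is an assumption made for illustration only.
    """
    tumors_per_gene = Counter()
    for mutations in tumor_mutations.values():
        # Count each gene once per tumor, ignoring silent changes.
        nonsilent_genes = {gene for gene, effect in mutations if effect != "silent"}
        tumors_per_gene.update(nonsilent_genes)
    return sorted(g for g, n in tumors_per_gene.items() if n >= min_tumors)

# Made-up discovery set of three tumors (not real data).
discovery_set = {
    "T1": [("ARID1A", "nonsense"), ("UTX", "missense"), ("GENE_X", "silent")],
    "T2": [("ARID1A", "frameshift"), ("GENE_X", "silent")],
    "T3": [("UTX", "missense"), ("UTX", "nonsense")],
}

# Genes passing the filter would go forward to the larger validation cohort.
candidates = recurrently_mutated_genes(discovery_set)
```

Counting tumors rather than mutations keeps a single heavily mutated tumor (T3 here) from promoting a gene on its own.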
BRCA1 Breaks into Chromatin Remodeling
Mutations in the breast cancer susceptibility gene 1 (BRCA1) increase the risk for developing breast and ovarian cancers up to 95%, and it is generally accepted that BRCA1 stops tumorigenesis by promoting the repair of DNA breaks through homologous recombination. Now, Zhu et al. (2011) challenge this classic view by presenting evidence that BRCA1 maintains genomic stability through the ubiquitination of histone H2A in heterochromatin. Centromeres are surrounded by heterochromatin, which contains large arrays of repeating sequences called satellite DNA. These repeats are thought to be noncoding and largely silenced by heterochromatin formation. The BRCA1 protein contains a domain with ubiquitin E3 ligase activity, and previous studies have shown that, in vitro, BRCA1 preferentially ubiquitinates H2A, a histone variant implicated in gene silencing. Now, Zhu et al. demonstrate that deleting BRCA1 disrupts the organization of heterochromatin, decreases levels of ubiquitinated H2A histones at heterochromatin, and increases the expression of transcripts from the satellite DNA. Next, the authors show that boosting transcription of the satellite DNA surprisingly causes many abnormalities seen in BRCA1-deficient cells, such as deficiencies in homologous recombination and impaired chromosomal segregation. Then, when the authors overexpress an H2A-ubiquitin fusion protein, the satellite DNA is resilenced and many of the BRCA1-linked defects improve, including p53-induced apoptosis, growth arrest, and impaired homologous recombination. Zhu and colleagues find that these satellite transcripts are also increased in breast cancer tumors from both humans and mice harboring BRCA1 mutations. Together, these findings suggest that BRCA1 exerts its tumor-suppressive effects by maintaining the integrity of heterochromatin.
Although many questions still remain, such as how the satellite transcripts alter DNA stability and integrate with BRCA1’s known role at DNA breaks, the study may have unlocked an entirely new route for tumor evolution and adaptability. It also reminds us that, clearly, we are just beginning to become privy to the dangerous cabal that cancer has with chromatin and the epigenome. Zhu, Q., et al. (2011). Nature 477, 179–184.
The confocal image in the Sputnik satellite shows a cell overexpressing satellite DNA and with abnormally amplified centrosomes (red). At center is a mathematical deconvolution of the confocal image to resolve the tubulin fibers (green) and the individual chromosomes (blue). Image courtesy of Q. Zhu, J. Fitzpatrick, and J. Simon.
Michaeleen Doucleff
Cell 147, September 30, 2011 ©2011 Elsevier Inc. 7
Leading Edge
Voices
Human Genome: What’s Been Most Surprising?
Let’s Remember the Chromosomes
David Page
Massachusetts Institute of Technology
What surprises me most about the state of the human genome in 2011 is how much of the sequence remains to be assembled accurately and how important that achievement would be for fulfilling the Human Genome Project’s original goal: a comprehensive reading of the book of life. Achieving this aim will require conquering the most structurally ornate and dynamic regions of the genome, including the essential but elusive elements (such as centromeres and telomeres) that are responsible for faithful transmission of the genome from one generation to the next. Students of human biology and medicine would finally be able to see the genome as an orchestra of chromosomes, not merely a “parts list” of genes. We would, at last, be positioned to address some of the longstanding mysteries of human biology and medicine, such as the fragile and tenuous nature of human reproduction, and we would understand why a sizeable proportion of all human conceptions (and half of spontaneously lost pregnancies) display dramatic anomalies of one or more chromosomes. Just as high-resolution crystal structures of macromolecular complexes enable unforeseen insights into function in health and disease, a complete and accurate assembly of the human genome will answer questions that we do not even know to ask.
Variation and Complexity
Vivian Cheung
University of Pennsylvania
The simplicity of A, C, G, and Ts as building blocks of the human genome is deceptive. Although various genome-scale projects seek to identify functional units based on DNA sequences, it is surprisingly difficult to find genes and regulatory elements, and such information is critical for determining how sequence variants affect disease risks. DNA sequences are highly processed and modified during transcription and translation. The same DNA sequence can code for different proteins by alternate promoter usage or splicing, and base modification such as methylation and chromatin modifications also affect how DNA sequences are converted into functional units. In addition, DNA sequences are not always copied exactly into RNA and proteins; processes such as RNA editing lead to proteins that are not encoded by the underlying DNA sequences. Even after the proteins are synthesized, different modifications affect their functions. Although DNA is viewed as the template for transcripts and proteins, there is a lack of a direct relationship between DNA sequence and functional elements. A deeper understanding of the fundamental yet complex relationships between DNA and RNA (as well as proteins) is necessary to assess the functional significance of genome variation. One approach may involve studying individual variation in the transcriptome and proteome in addition to DNA polymorphisms to determine which genetic variants affect disease susceptibility and response to therapy, and how.
A Hidden Ecosystem
David Haussler
University of California, Santa Cruz
The role of functional, noncoding DNA has been simultaneously the biggest surprise and the biggest mystery emerging from our first glimpse of the human genome. Comparative sequencing of the genomes of other species revealed that, since the inception of placental mammals, more than 5% of the human genome has been under selective constraint to preserve some presumably fitness-enhancing function. Careful gene structure analysis proved that only about 1.2% codes for protein. Many of the most dramatically constrained genome segments are noncoding. What does this “dark matter” do, and how did it evolve? Ample experimental evidence suggests that the bulk of the evolutionarily constrained segments will have gene regulatory functions either as DNA elements or as parts of noncoding RNA genes. It has become clear that these elements turn over and are reinvented anew on a much faster timescale than the protein-coding genes. The other big surprise is the mounting evidence that transposons (mobile elements within a cell’s DNA that are viewed by most as almost exclusively parasitic) play a critical role in this turnover. Previously viewed as a library of cellular information infested with a few parasites, the genome has turned out to be a lively and complex ecosystem unto itself, with many unusual denizens partaking in complex evolutionary alliances and battles that remain to be deciphered.
Huge Heterogeneity
Richard Gibbs Baylor College of Medicine
It is remarkable how much of our perception of the human genome is shaped by the nature of its “accessible” portions. The regions that are unique, more easily finished, and accessible to high-throughput genotyping are naturally the best studied. This excludes many repetitive regions and segments that are otherwise recalcitrant, such as those with high GC content. Many interesting and disease-related alleles fall in these regions. It is a shock to hear from colleagues involved in medical diagnostics that, for some regions, “we ignore the reference sequence.” This is because there is enormous population heterogeneity, and the different alleles involve complex rearrangements. The assessment in the diagnostic laboratory is bootstrapped by all of the data from the patients that are tested with standard arrays, with only modest dependence on the reference. Another surprise is that there are many more private, even “unique,” and potentially functionally significant variants in populations than we expected. This means that it will be a long time before all of the important alleles that contribute to human health have been discovered. And in general, when new sequences are examined, there will be newly discovered alleles for which the greatest challenge will be assessing their functional significance.
Analysis
Genomics in Africa: Avoiding Past Pitfalls
A landmark genomics project is taking shape in Africa that shifts the power and prominence to local scientists. If successful, the program will offer valuable insights into the inheritance of common diseases and reshape the paradigm of foreign-funded research.
In the past few years, large-scale genomics studies have made one point clear: scientists must screen the genomes of diverse populations around the world to decipher the genetic basis of common diseases. But the relationship between genomicists and indigenous populations is often contentious. Although biomedical projects are more positively received than evolutionary studies, the prevailing structure still bears elements of neocolonialism; European and American scientists gather samples and return home to analyze and publish the results. This imbalance breeds skepticism for projects and limits participation. Starting this October, the US’s NIH and the UK’s Wellcome Trust are joining forces to restructure this paradigm by developing a genomics research network across Africa. Named The Human Heredity and Health in Africa Project, or H3Africa, the program plans to give African researchers an opportunity to bid for research grants to study diverse topics in genomics, from the human microbiome and pharmacogenomics to the genetics of communicable and noncommunicable diseases. Those selected for funding will establish or enhance local research facilities in their home country to generate modern sequencing and phenotyping laboratories. These centers will also serve as training facilities and function within a network of clinics and bioinformatics laboratories. The hope is that the genetic work will prove useful in combating diseases that are specifically problematic in Africa but will also yield insight into the genes behind global diseases, like diabetes and cancer. The project naturally elicits the question of “why Africa.” Indeed, it would be less
expensive and politically far simpler to have western researchers continue to sample western genes, but that would yield an incomplete picture of human genetic variation. The reason is that Africa is a genetically special place.
Why Africa
In May 2009, a study published in Science sent shockwaves through the field of human genetics when it reported the most comprehensive analysis of African genetic diversity to date (Tishkoff et al., 2009, Science 324, 1035–1044). Led by Sarah Tishkoff at the University of Pennsylvania, the study looked at 1,327 nuclear microsatellite and insertion/deletion markers in 2,400 individuals, a diminutive scale and scope compared to sequencing and genome-wide association studies reported today. But what made Tishkoff’s study so extraordinary were the individuals themselves. Most genetic studies in Africa had sampled only a handful of groups and then generalized their findings. In contrast, Tishkoff and her team acquired samples from 113 distinct populations across the continent, from the Mozabite Berbers of Morocco to the hunter-gatherer San of the Kalahari Desert. The findings made it clear that Africa was home to the highest levels of human genetic diversity on the planet. The reason for this vast diversity is that migration has reduced genetic variation in populations outside of Africa. Just as people today move to take advantage of low housing prices or new jobs, ancient humans moved to find new forms of shelter and food. A major threat faced by these migratory populations was genetic isolation; inbreeding decreased genetic
diversity and increased the population’s vulnerability to disease. Today, human populations decrease in genetic diversity the further away they get from Africa because founder populations that were formed from other founder populations shrank the gene pool further. For instance, a study last year reported the complete genome (or exome) sequences of four individuals from a hunter-gatherer population in Southern Africa called the Khoisan. Despite their geographic proximity, these individuals were genetically quite far apart, with more differences in their genes than those observed between Europeans and Asians (Schuster et al., 2010, Nature 463, 943–947). Researchers keen to find genes that contribute to diseases are limiting the variations that they have available for study if they restrict their search to individuals outside of Africa. To date, however, genomic studies have focused almost exclusively on westerners. Approximately 96% of the subjects in genome-wide association studies so far are of European descent. Findings from these studies may not generalize to other populations, and they may also miss many alleles that are rare in Europeans but more frequent in other ethnic groups. “A lot of past work has been Eurocentric, and that is not good. We need to know much more about the genetics of humans worldwide, particularly Africa,” says geneticist Nick Patterson at the Broad Institute. Genetically, in comparison to most other human populations, Africans have more haplotypes (combinations of alleles in different loci) and lower levels of linkage disequilibrium (alleles on the same chromosome are more likely to be inherited independently of each other). “These allow for fine-mapping and better localization of risk variants in genomic loci,” says Charles Rotimi, president of the African Society of Human Genetics and director of the Center for Research on Genomics and Global Health at the NIH. But the reasons for going to Africa are not solely genetic ones.
“In the developing countries that I’ve visited, there is a large community with tremendous intellectual capacity. We really want to reverse the history of not involving African communities and see to it that the people of Africa reap the health benefits that will result from analysis of their genetics,” says Jane Peterson, associate director of extramural research at the NIH. “Over the years, it has been very frustrating working in genomics and seeing so few Africans participate. Creating a research environment where young African men and women can pursue this research on their own is everything,” says Dr Rotimi. Yet the path forward is not without obstacles.
Phenetic tree of human populations around the world. African branches are color coded according to language classifications. The two insets show principal components analysis created on the basis of individual genotype. Adapted from Tishkoff et al., 2009, Science 324, 1035–1044.
Colonial Ghosts
Whether justified or not, a connection could be made between historic colonial mining of African resources and the western funding of projects that will mine African genomes. Indeed, Africans need not look to colonial times to have
concerns about misuse of collected biological resources. In 1983, researchers collected blood samples from members of the Nuu-chah-nulth tribe on Vancouver Island. Rheumatoid arthritis was a serious problem among the indigenous population. The tribe had agreed to the scientists using their blood to search for genes associated with rheumatoid arthritis, with the hopes of alleviating pain in their community. The researchers were unable to find any genetic basis for the arthritis during the 1980s, and when the lead researcher behind this work moved to a new university in 1986, so did the Nuu-chah-nulth blood samples. Over time, the samples were shared with other researchers, and studies were conducted on them that did
not involve rheumatoid arthritis at all. When the tribe ultimately discovered that studies, including one on viruses spread by intravenous drug use, were being conducted on their biological samples without their permission, tensions rose and an international debate emerged over the proper care of genetic samples. “This was all about expectations not being met and the people not being involved in the decisions to carry out secondary research after the initial research could not be completed,” explains Laura Arbour at the University of British Columbia, a medical geneticist who was not involved with the Nuu-chah-nulth work. “People who live with health disparity want research money spent in ways that can make a difference for them. But research is a dynamic process.
Things change. Results may unexpectedly lead to a different path or the research intended cannot be completed. If new research is being proposed for their samples, they should be included in that discussion too,” she says.
Making African Research African
To avoid creating a Nuu-chah-nulth debacle and to promote independence from NIH support, keeping research inside of Africa is a priority for the new project. During the next five years, considerable funding (five million dollars a year from the NIH and eight million pounds a year from the Wellcome Trust) is going to be awarded in the form of grants to African researchers by an NIH-selected peer review team. However, collaboration with labs outside of Africa is undoubtedly going to happen, and this means that some African samples may need to leave the continent. Keenly aware that even sending some blood and DNA out of Africa might generate the wrong message, H3Africa is leaving control of biological materials collected in Africa to researchers from the continent. “African researchers will develop the policy for sample distribution,” says Peterson. Yet because the project is still in its infancy, the funders
behind H3Africa will not guarantee that an African biorepository for material storage and management will be built. ‘‘A biorepository remaining on African soil and under the governance of African researchers and other local stakeholders is an ideal way to reduce the risk of unapproved secondary research,’’ comments Dr Arbour. ‘‘But if the samples need to be analyzed elsewhere, it remains possible that the protective processes can still be effective as long as all parties are on board and the governance remains with the African partners,’’ she says. It also remains to be determined what diseases will get studied and to what extent they will include diseases that are specifically problematic in Africa, as the project’s future revolves around the research proposals that come in during the months ahead. The funding bodies behind H3Africa are not formally declaring any diseases that they are keen to see fought. ‘‘I would guess that we will see a mix chosen since one of the key criteria that will be used by the peer reviewers is medical relevance, and there are certainly a lot of highly relevant diseases shared by both Africa and the rest of the world,’’ says Peterson. But this silence from the funding agency is not stopping individuals from expressing what they would like to see studied.
‘‘A distressingly high proportion of young children in Africa pick up lethal diarrhea. I suspect that this has been persisting in Africa for a long time. Since it sets in before reproductive age and selection probably has been playing a role in selecting for infants that can survive the disease, there might be some genetic resistance there, and this may be well worth looking for,’’ says Patterson. As for Rotimi, he is keen to see sickle cell anemia dealt with. ‘‘In some areas of western Africa, heterozygotes can be as common as 20% of the population, while actual sickle cell patients (sufferers) can be up to 3%. Many sufferers never make it to medical facilities alive. In addition, it is often assumed that carriers are entirely normal, but we really do not know that! Research directed at understanding this disease, its complications, and treatment has not received enough scientific attention, and this needs to change,’’ says Rotimi. Indeed, the project looks well aimed to avoid past pitfalls, and with the right project proposals, it could have a profound impact on our understanding of disease inheritance and human evolution. But there is still a long way to go to narrow the gap between Africans’ genetic richness and their power to use it for their own benefit.
Matt Kaplan London, UK DOI 10.1016/j.cell.2011.09.018
Commentary
Genomics Reaches the Clinic: From Basic Discoveries to Clinical Impact
Teri A. Manolio1 and Eric D. Green1,*
1National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.09.012
Today, more than ever, basic science research provides significant opportunities to advance our understanding about the genetic basis of human disease. Close interactions among laboratory, computational, and clinical research communities will be crucial to ensure that genomic discoveries advance medical science and, ultimately, improve human health. The potential for the burgeoning knowledge of genome structure and function to improve medical care has long been anticipated (Collins, 1999), but until very recently, the actual clinical application of genomics has been limited (Green and Guyer, 2011). Despite concerns about the pace of medically relevant genomic discoveries and the implementation of genomic medicine (clinical care based on or influenced by knowledge of a patient’s specific genomic variants) (Varmus, 2010), growing numbers of encouraging examples are now in hand. Early case reports of genomic-based diagnoses leading to altered treatment and an improved clinical course, facilitated by advancing genomic technologies such as whole-exome and -genome sequencing, illustrate the potential of genomically informed medicine for improving clinical care. Such reports also demonstrate the critical role that basic science approaches play in characterizing implicated variants and pointing toward more effective treatments. Here, we describe several recent successes in genomic medicine that illustrate the critical interplay between basic and translational researchers that will be required to make the routine use of genomic medicine a reality. The potential of whole-exome sequencing for identifying the genetic cause of a mysterious and disabling disease and, in some cases, for illuminating a path toward effective treatment was vividly demonstrated by the desperate case of a young boy with severe, intractable, and atypical inflammatory bowel disease (Worthey et al., 2011). He failed
to respond to conventional treatment and progressively worsened. The only treatment option remaining was hematopoietic stem cell transplantation. In the absence of a clear diagnosis, clinicians were concerned about subjecting the boy to an invasive procedure with unknown chances for survival. Whole-exome sequencing provided the key diagnostic clue: a nonsynonymous change in XIAP, a gene involved in apoptosis. Functional studies quickly confirmed that the mutation caused aberrant XIAP function. These findings, and the known morbidity risk of XIAP deficiency-related hemophagocytic lymphohistiocytosis, tipped the balance in favor of stem cell transplantation. The patient survived the procedure and is now well 1 year later (D. Dimmock, personal communication). There is little doubt among the clinical and laboratory teams that this course of treatment, a high-risk gamble justified only upon identification of the presumed causal mutation, saved the patient’s life. Genomic analysis is a cornerstone of the National Institutes of Health’s Undiagnosed Diseases Program (http://rarediseases.info.nih.gov/Resources.aspx?PageID=31), and it recently proved invaluable for studying three families with severe, symptomatic arterial calcifications (St Hilaire et al., 2011). Several of the affected individuals had disabling intermittent claudication with extensive occlusion of the iliofemoral arterial system due to heavy calcification. One family was consanguineous (a third-cousin marriage), so genome-wide single-nucleotide polymorphism arrays were used
to identify genomic regions that were homozygous in all affected siblings but heterozygous in the unaffected parents. The only such region included three genes implicated in cellular pathways potentially involved in calcification. One of these, NT5E, codes for the protein CD73, which is involved in the same pathway as a gene associated with generalized arterial calcification of infancy. Targeted sequencing of the affected siblings revealed a homozygous nonsense mutation in NT5E, and quantitative PCR analysis demonstrated decreased NT5E expression in cultured fibroblasts from two of the affected siblings. Studies of two other affected families detected missense and nonsense NT5E mutations in homozygous or compound heterozygous states. A series of elegant experiments revealed markedly reduced CD73 levels and absent CD73 enzymatic activity in fibroblasts from affected patients, with the latter rescued by transfection with a CD73-encoding lentiviral vector. Fibroblasts carrying the NT5E mutation also showed excessive staining for tissue-nonspecific alkaline phosphatase (TNAP), a key enzyme for calcification, as well as abundant calcium phosphate crystal formation. These phenotypes were ameliorated by CD73 transfection or treatment with either adenosine or an inhibitor of alkaline phosphatase. Elucidating the precise molecular defect in this condition enables consideration of treatments affecting other components of this calcification pathway and may shed light on potential treatments for ectopic tissue calcification in other disorders.
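The mapping step described above (finding regions homozygous in all affected siblings but heterozygous in the unaffected parents) can be sketched in a few lines. This is a simplified illustration and not the method used by St Hilaire et al.: the 0/1/2 genotype encoding, the function names, and the toy genotypes are assumptions, and real analyses work with chromosomal positions and tolerate genotyping errors.

```python
# Assumed encoding per SNP: 0 = homozygous reference, 1 = heterozygous,
# 2 = homozygous alternate. Each person is a list of calls in marker order.

def candidate_snps(affected, parents):
    """Indices of SNPs where all affected siblings share one homozygous
    genotype and every parent is heterozygous."""
    hits = []
    for i in range(len(affected[0])):
        sibling_calls = {person[i] for person in affected}
        siblings_match = len(sibling_calls) == 1 and sibling_calls <= {0, 2}
        parents_het = all(person[i] == 1 for person in parents)
        if siblings_match and parents_het:
            hits.append(i)
    return hits

def consecutive_runs(indices, min_len=2):
    """Group sorted SNP indices into (start, end) runs of adjacent markers."""
    runs = []
    start = prev = None
    for i in indices:
        if start is None:
            start = prev = i
        elif i == prev + 1:
            prev = i
        else:
            if prev - start + 1 >= min_len:
                runs.append((start, prev))
            start = prev = i
    if start is not None and prev - start + 1 >= min_len:
        runs.append((start, prev))
    return runs

# Toy genotypes for two affected siblings and two parents (not real data).
affected = [[2, 2, 2, 1, 0, 0], [2, 2, 2, 0, 0, 0]]
parents = [[1, 1, 1, 1, 1, 0], [1, 1, 1, 1, 1, 1]]

hits = candidate_snps(affected, parents)
regions = consecutive_runs(hits)
```

Requiring a run of adjacent markers, rather than accepting single SNPs, reflects the idea that a shared ancestral segment spans many neighboring markers.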
These two notable successes are encouraging, but it is sobering to recognize that whole-genome analysis has failed to reveal the cause of a rare genetic disease in the majority of cases studied to date. More robust approaches for genome analysis are being developed to study the thousands of genetic disorders for which the molecular basis remains unknown. Pharmacogenomics is another area where genomic discoveries can be leveraged to improve clinical care. Genotype-targeted treatment with clopidogrel represents a prototypic pharmacogenomic advance facilitated by basic science investigation of the effect of specific genetic variants. Clopidogrel is a widely prescribed antiplatelet drug that binds to the platelet P2Y12 receptor with wide interindividual variability in response (Roden and Shuldiner, 2010). Further study of clopidogrel’s mechanism of action showed that it is a pro-drug highly dependent on cytochrome P450 2C19 for activation. Up to 30% of individuals carrying CYP2C19 variants are unable to generate the active form, and inhibition of platelet aggregation was diminished in these individuals. Some, but not all, studies also point to an associated higher risk for thrombotic cardiovascular events among these CYP2C19 variant carriers. An alternative but more costly anti-P2Y12 drug does not require bioactivation, raising the potential for genotype-targeted selection of individuals needing the higher-cost alternative. Another common 2C19 polymorphism can actually increase clopidogrel metabolism, whereas the effects of several rarer 2C19 variants remain to be studied. Several pilot studies are now underway examining the effectiveness of pre-emptive genotyping in patients.
The effects of other 2C19 variants, the role of genotyping versus phenotypic platelet inhibition assays, and the therapeutic potential of other antiplatelet drugs provide fertile ground for basic science investigations that can generate more effective treatments to reduce the risk of thrombotic events. Clinical outcome improvements like these, in conjunction with appropriate changes in physician and patient behavior, will be essential for promoting adoption of such genomic approaches in routine clinical care. Findings from these
and related studies will also inform policy development and regulatory oversight, as illustrated by the Food and Drug Administration’s promotion of drug-labeling changes (www.fda.gov/drugs/scienceresearch/researchareas/pharmacogenetics/ucm083378.htm). Other pharmacogenomic successes include the documented increased risk for life-threatening adverse reactions to carbamazepine in persons carrying the HLA-B*1502 allele and the reduction of that risk following genotyping and drug avoidance by carriers of the risk allele (Wilke and Dolan, 2011). HLA-mediated risk for adverse drug reactions remains poorly understood but is suspected to involve HLA-allele-specific presentation of key drug moieties to immune-activating cells, an area ripe for basic investigation. Another potent pharmacogenomic example is the BRAF kinase inhibitor vemurafenib. Patients with metastatic melanoma whose tumors carry the activating BRAF V600E somatic mutation respond dramatically, with improved rates of survival (Chapman et al., 2011). The potential for vemurafenib and other BRAF kinase inhibitors to improve health spans of other cancer patients carrying the BRAF V600E mutation is being investigated and may be an initial step in the long-anticipated classification of cancers based on molecular taxonomy rather than organ of origin and histopathology. These examples demonstrate the enormous potential of basic research to contribute key insights about the phenotypic consequences of disease-associated variants that, in turn, lead to changes in patient care. Although genetic variants of large effect with clear functional impact may be more readily identified in familial or isolated cases of severe disease, such as those described above, recent studies have demonstrated the important new areas of research catalyzed by studying smaller-effect loci identified in genome-wide association studies (GWAS) (Ernst et al., 2011).
For example, 80% or more of the genomic regions implicated in the GWAS conducted to date are intronic or intergenic (http://www.genome.gov/gwastudies), with disease-associated variants being significantly enriched in enhancer elements
(Ernst et al., 2011). Together, these findings make detailed investigation of genetically associated noncoding genomic regions a high priority, yet our functional understanding of noncoding regions is in its relative infancy and will be complicated by their greater genomic variation compared to coding regions (1000 Genomes Project Consortium, 2010). Variants in gene regulatory regions will undoubtedly prove to be important in disease causation, but their role in pathogenesis will also be complex and difficult to define compared to the coding variants represented by the recent discoveries described above and the overwhelming majority of disease-causing mutations reported to date. For genomic medicine to be successful, basic science advances are also needed to promote development of low-cost, rapid, and clinically available technologies for detecting genomic variants. Although the advent of genome-wide genotyping arrays revolutionized identification of disease-associated loci, this and other genomic technologies (such as genome sequencing) remain largely unavailable outside major research laboratories. The requisite data analysis and quality control will likely keep these technologies out of the typical clinical laboratory for some time. This is unfortunately also true for methods to detect variants in targeted genes recognized to have significant clinical implications. Although becoming increasingly available, whole-exome and -genome sequencing presents significant challenges with respect to data analysis, interpretation, and display. Robust yet easy-to-use bioinformatic tools are urgently needed for analyzing genome sequence data, providing to clinicians only the information about genomic variants that is relevant to a patient’s care. Our ability to define the role of genomic variation in human disease is growing at an ever-accelerating pace.
As revealed in the above examples, using these advances to directly improve patient care will require close interactions between the basic and clinical research communities. The insights resulting from genomic knowledge moving freely between the laboratory and the clinic hold great promise for the implementation of genomic medicine.
Cell 147, September 30, 2011 ©2011 Elsevier Inc. 15
ACKNOWLEDGMENTS

We thank William Gahl, Geoffrey Ginsburg, Mark Guyer, Laura Rodriguez, Jeffery Schloss, and Marc Williams for helpful input in preparing this paper.

REFERENCES

1000 Genomes Project Consortium. (2010). Nature 467, 1061–1073.
Chapman, P.B., Hauschild, A., Robert, C., Haanen, J.B., Ascierto, P., Larkin, J., Dummer, R., Garbe, C., Testori, A., Maio, M., et al; BRIM-3 Study Group. (2011). N. Engl. J. Med. 364, 2507–2516.
Collins, F.S. (1999). Ann. N Y Acad. Sci. 882, 42–55, discussion 56–65.
Ernst, J., Kheradpour, P., Mikkelsen, T.S., Shoresh, N., Ward, L.D., Epstein, C.B., Zhang, X., Wang, L., Issner, R., Coyne, M., et al. (2011). Nature 473, 43–49.
Green, E.D., and Guyer, M.S.; National Human Genome Research Institute. (2011). Nature 470, 204–213.
Roden, D.M., and Shuldiner, A.R. (2010). Circulation 122, 445–448.
St Hilaire, C., Ziegler, S.G., Markello, T.C., Brusco, A., Groden, C., Gill, F., Carlson-Donohoe, H., Lederman, R.J., Chen, M.Y., Yang, D., et al. (2011). N. Engl. J. Med. 364, 432–442.
Varmus, H. (2010). N. Engl. J. Med. 362, 2028–2029.
Wilke, R.A., and Dolan, M.E. (2011). JAMA 306, 306–307.
Worthey, E.A., Mayer, A.N., Syverson, G.D., Helbling, D., Bonacci, B.B., Decker, B., Serpe, J.M., Dasu, T., Tschannen, M.R., Veith, R.L., et al. (2011). Genet. Med. 13, 255–262.
Leading Edge
Commentary

Genetics and Genomics to the Clinic: A Long Road Ahead

David Ginsburg1,*
1Howard Hughes Medical Institute, Departments of Internal Medicine, Human Genetics, and Pediatrics, and the Life Sciences Institute, University of Michigan, Ann Arbor, MI 48109, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.09.013
Advances in genomic technology have produced an explosion of new information about the genetic basis for human disease, fueling extraordinarily high expectations for improved treatments. This perspective will take brief stock of what genetics/genomics have brought to clinical practice to date and what we might expect for the future.
Improved Diagnosis for Mendelian Genetic Disorders

First the “good news”: the contribution of modern genetics and genomics to the diagnosis of Mendelian genetic disorders has been nothing short of spectacular. The number of human single-gene disorders with a known molecular genetic cause has risen from fewer than 5 in 1982, to approximately 150 in 1990, and to nearly 3,000 in 2011 (http://omim.org). Precise DNA diagnosis in approved clinical testing laboratories is available for many of these disorders, and testing on a research basis can be obtained for a significant subset of the remaining diseases. Although ambiguity still remains for some sequence findings (e.g., “variants of unknown significance”), definitive diagnosis can often be established by DNA testing, and once a familial mutation is known, testing of nearly perfect sensitivity and specificity is then available for other at-risk family members, including prenatal and preimplantation diagnoses. These advances have led to the near elimination of select autosomal-recessive diseases in specific populations, such as β-thalassemia in parts of the Mediterranean and Tay-Sachs disease among Ashkenazi Jews (Zlotogora, 2009). Though the power of genetic diagnosis has not yet been fully realized, major progress is clear, and the impact of DNA testing on single-gene disorders, particularly with the advent of next-generation sequencing, is likely to expand dramatically over the next decade.
Treatment for Mendelian Genetic Disorders

Now for the “bad news”: in contrast to the remarkable impact on diagnosis, the contribution of modern genetics and genomics to the treatment of most Mendelian genetic disorders has been, with a few notable exceptions, a disappointing failure. Among the 3,000 single-gene disorders for which the responsible gene has been identified, only a handful (<1%) have been translated directly into new therapies. Most of this success has been restricted to diseases due to enzyme deficiencies, such as hemophilia and the lysosomal storage diseases. However, for most genetic diseases, little has changed in treatment since the discovery of the responsible gene. Where modest improvement in therapy has been seen, this has generally been empiric and not the direct result of knowing the underlying genetic defect. Examples include muscular dystrophy, whose gene was identified in 1986, cystic fibrosis (1989), and Huntington disease (1993).

Sickle cell anemia provides a particularly humbling example for the biomedical research community. Its molecular basis has been understood at the 5.5 Å level since 1960 (Perutz et al., 1960). However, current therapy is still largely empiric and not derived from a sophisticated understanding of hemoglobin structure and sickle cell anemia molecular pathogenesis.

Chronic myelogenous leukemia (CML) offers a notable exception for which understanding genetic pathogenesis has indeed had a striking impact on treatment. Imatinib (Gleevec), a targeted therapeutic against the unique fusion gene product (BCR/ABL) found in all CML patients, has profoundly altered the prognosis for patients with this disease. Despite the initial enthusiasm that designer drugs of this type would soon become a routine follow-on to the discovery of any pathogenic gene, few such therapies with an impact of this magnitude have yet emerged. It has become increasingly clear that the path for translation of basic genetic findings into new therapeutics is not an easy one.

Although 3,000 Mendelian diseases can now be diagnosed by DNA testing, is this always of direct clinical value to the patient? Medical test ordering by physicians often follows a logic similar to Mallory’s decision to climb Mount Everest, simply “because it’s there.” This approach has made unrestrained diagnostic testing a key contributor to runaway health care costs. A transition toward evidence-based medicine will increasingly demand that diagnostic tests, including DNA testing, be restricted to those settings wherein a significant impact on treatment and medical management is anticipated. It is here that our disappointing success rate in translating genetic discovery into improved medical treatment dampens the significance of our triumph in the diagnostic arena. Unfortunately, a significant fraction of DNA diagnostic tests currently performed does not meet this standard of evidence-based medical justification.
For example, one of the most commonly performed genetic diagnostic tests in the US today is for factor V Leiden, a common human polymorphism that predisposes to venous thrombosis. However, the results of this test are of limited or no value in guiding current medical therapy (Middeldorp, 2011).

Though modern genetics and genomics have had a transforming impact on our understanding of Mendelian human genetic disorders, direct translation to improved treatment remains elusive for most patients. This relative failure to date should not be a cause for resignation or despair, but rather a call to redouble our efforts. The path from basic discovery to effective therapy is usually a long and arduous one, and generally unpredictable. Recent history simply demonstrates that identifying something as broken is much easier than fixing it.

Complex Genetic Disorders

Single-gene disorders are estimated to account for approximately 9% of childhood mortality and probably less than 2% of overall hospital admissions in the US (Korf et al., 2007). The vast majority of health care costs are devoted to the treatment of common complex disorders such as coronary artery disease, stroke, diabetes, hypertension, and cancer, which all appear to have large, heritable components. The “common disease/common variant” hypothesis has been tested by genome-wide association studies (GWAS) for over 220 diseases or traits (Hindorff et al., 2011). Over 1,300 highly significant genetic risk loci have been identified, though nearly all with very modest effect (allelic odds ratios generally < 1.5) and in aggregate accounting for only a small fraction of the overall genetic risk (Manolio et al., 2009). Is information from GWAS clinically useful in guiding medical management or choice of therapy?
At present, the clear consensus among clinical genetic experts is a resounding “no.” One prime example is type 2 diabetes, where extensive single-nucleotide polymorphism (SNP) testing adds little to the much greater predictive value of family history and body mass index (Lyssenko et al., 2008). The lack of demonstrated diagnostic or therapeutic utility has not dissuaded the enthusiastic direct-to-consumer (DTC) marketing of whole-genome SNP analysis for self-diagnosis of risk for a broad array of diseases and common human traits. Though the technical validity of SNP calling seems high, risk prediction from similar data varies dramatically between different DTC companies, often in opposite directions (Ng et al., 2009). The potential negative impact of DTC genetic testing on medical practice and health care cost containment has led to calls for increased regulation and oversight.

Despite limited direct clinical payoff to date, GWAS have nonetheless provided important advances in our understanding of several complex genetic diseases. Examples include entirely unanticipated insight into the role of complement factor genes in the pathogenesis of age-related macular degeneration (de Jong, 2006) and the discovery of the BCL11A gene as a critical regulator of fetal hemoglobin levels and the long-sought-after basis for the fetal to adult globin switch (Sankaran et al., 2009). In addition, the observation that common SNPs in aggregate generally account for <10%–20% of overall genetic risk for most complex diseases has sparked new efforts to identify the genetic basis for the remaining “missing” heritability (Manolio et al., 2009).

Pharmacogenomics

There has been considerable enthusiasm that the identification of common human genetic variants would accurately predict drug response and toxicity, allowing precise tailoring of individualized or “personalized” treatments for patients based on their specific disease and genetic makeup. Nonetheless, progress in this area remains limited and has not yet proven as straightforward as initially thought. Although there are several clear-cut associations of specific genetic variants with drug toxicity, genetic screening prior to treatment has not yet become a standard part of medical care. As in disease diagnostics, pharmacogenomic testing has often been greeted with naive enthusiasm and advocated well in advance of solid evidence for clinical utility. A case in point is CYP2C9 and VKORC1 genotyping to predict response to warfarin, a widely used oral anticoagulant for which precise dosing is critical (too low a dose leads to inadequate efficacy as an anticoagulant and too high
a dose to excessive bleeding). Common variants in these two genes account for 30% of the variability in the individual response to warfarin, leading the U.S. Food and Drug Administration to modify warfarin labeling to suggest CYP2C9 and VKORC1 genotyping. However, this testing has not yet been demonstrated to be of practical clinical value in decreasing bleeding complications or increasing anticoagulant efficacy and is rarely used in clinical practice (Rosove and Grody, 2009).

Where Do We Go from Here?

The impact of next-generation sequencing on human genetics, both diagnostically and therapeutically, is likely to be transformational. With decreased cost and improved quality over the next 5–10 years, whole-genome sequencing may replace conventional newborn screening in much of the developed world. Initially, only those genes for which immediate diagnosis in a newborn is of clear value for directing medical management are likely to be evaluated, including the 20–30 genes comprising current newborn screening panels, with the rest of the sequence archived for later use. The list of such actionable Mendelian genetic disorders will undoubtedly expand with time. The availability of full genome sequences for large fractions of the patient population, together with implementation of a uniform electronic medical record, should enable an entirely new scale of genetic epidemiologic health outcomes research. Along with these advances, issues of informed consent and genetic privacy, as well as ethical considerations related to prenatal and preimplantation intervention, will pose increasingly complex challenges.

The anticipated flood of sequencing data, along with the development of new computational and conceptual tools to analyze it, is likely to yield profound new insights into the pathogenesis of human disease, the diagnosis of complex genetic disorders, and the accurate prediction of drug efficacy and toxicity, leading to improved treatment selection. The near future promises to be a very exciting time for biomedical research, perhaps finally providing the tools required to realize the promise of translating basic scientific discovery into improved outcomes for patients.
ACKNOWLEDGMENTS

D.G. is a Howard Hughes Investigator and is also supported by National Institutes of Health grants RO1HL039693 and PO1HL057346. D.G. is also a member of the Board of Directors for Shire plc.
REFERENCES

de Jong, P.T. (2006). N. Engl. J. Med. 355, 1474–1485.
Hindorff, L.A., Junkins, H.A., Hall, P.N., Mehta, J.P., and Manolio, T.A. (2011). http://www.genome.gov/gwastudies.
Korf, B.R., Rimoin, D.L., Connor, J.M., and Pyeritz, R.E. (2007). Nature and Frequency of Genetic Disease. In Emery and Rimoin’s Principles and Practice of Medical Genetics, D.L. Rimoin, J.M. Connor, R.E. Pyeritz, and B.R. Korf, eds. (Philadelphia: Churchill Livingstone Elsevier), pp. 49–52.
Lyssenko, V., Jonsson, A., Almgren, P., Pulizzi, N., Isomaa, B., Tuomi, T., Berglund, G., Altshuler, D., Nilsson, P., and Groop, L. (2008). N. Engl. J. Med. 359, 2220–2232.
Manolio, T.A., Collins, F.S., Cox, N.J., Goldstein, D.B., Hindorff, L.A., Hunter, D.J., McCarthy, M.I., Ramos, E.M., Cardon, L.R., Chakravarti, A., et al. (2009). Nature 461, 747–753.
Middeldorp, S. (2011). J. Thromb. Thrombolysis 31, 275–281.
Ng, P.C., Murray, S.S., Levy, S., and Venter, J.C. (2009). Nature 461, 724–726.
Perutz, M.F., Rossmann, M.G., Cullis, A.F., Muirhead, H., Will, G., and North, A.C. (1960). Nature 185, 416–422.
Rosove, M.H., and Grody, W.W. (2009). Ann. Intern. Med. 151, 270–273, W95.
Sankaran, V.G., Xu, J., Ragoczy, T., Ippolito, G.C., Walkley, C.R., Maika, S.D., Fujiwara, Y., Ito, M., Groudine, M., Bender, M.A., et al. (2009). Nature 460, 1093–1097.
Zlotogora, J. (2009). Hum. Genet. 126, 247–253.
Leading Edge
Previews

Translocation Mapping Exposes the Risky Lifestyle of B Cells

Rachel Patton McCord1 and Job Dekker1,*
1Programs in Systems Biology and Gene Function and Expression, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, 364 Plantation Street, Worcester, MA 01605-0103, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.09.005
Recurrent chromosomal translocations can drive oncogenesis, but how they form has remained elusive. Now, Chiarle et al. (2011) and Klein et al. (2011) characterize the genome-wide spectrum of translocations that form from a single double-stranded break, revealing that specific loci have an intrinsic predisposition for frequent chromosomal rearrangements.

In the face of damage from everyday metabolic processes or external ionizing radiation, a cell must maintain chromosomal integrity. The most dangerous event that it can face is a DNA double-strand break (DSB), which can lead to chromosomal translocations (Mills et al., 2003). Chromosomal translocations are a common feature of many cancers, with specific, recurrent translocations occurring in nearly 40% of all human tumors (Shaffer and Pandolfi, 2006). Recurrent translocations can promote tumorigenesis by creating novel gene products or by altering the regulation of genes involved in cell proliferation and differentiation. For example, the Bcr/Abl translocation, observed in 90% of chronic myelogenous leukemia cases, creates a novel oncogene that is sufficient to transform cells in vitro.

Why do recurrent translocations occur so frequently? They could result from random and rare events that are subsequently selected due to their proliferative potential. Alternatively, molecular processes could predispose specific loci to engage frequently in translocations. In this issue, Klein et al. and Chiarle et al. present powerful new methods to address such questions by capturing the genome-wide landscape of translocations (a translocatome), which result from a single defined DSB in primary cells in the absence of confounding effects of growth selection (Chiarle et al., 2011; Klein et al., 2011).

The two methods induce a DSB at a defined position in the genome. After allowing the cells to repair this DSB for a relatively short time, translocation junctions between the induced DSB and endogenous DSBs are identified by PCR amplification and deep sequencing. Both studies find that most cells repair the induced DSB by rejoining the ends without causing major genomic rearrangements. However, a significant fraction of cells join the induced DSB ends with endogenous DSBs elsewhere in the genome, creating intra- and interchromosomal rearrangements.

Though these technologies can be widely applied to any cellular system, both groups use them to examine genomic rearrangements in murine B cells. These immune cells can give rise to lymphomas as a result of a recurrent interchromosomal translocation between the c-myc gene and the IgH locus (Küppers and Dalla-Favera, 2001). In these two studies, the authors conditionally introduce a DSB at either the IgH or the c-myc locus in B cells activated for class switch recombination (CSR), a process that generates a spectrum of endogenous DSBs. They find that B cells minimize the risk of deleterious translocations from all of these DSBs by repairing almost half of the induced DSBs locally, within 1 kb of the original break site. Even within 100 kb of the break site, some junctions can be attributed to the local repair process of resection and resealing. Outside of local repair, however, the authors observe a multitude of translocation partners of c-myc and IgH that could generate almost every chromosomal aberration imaginable, including dicentric and acentric chromosomes.
This diversity shows the power of capturing an early, unselected translocatome, as many of these rearrangements would have been lost if cells were required to undergo cell division. Importantly, these studies reveal that the frequency of translocations derived from a single DSB is 0.4%–1%. In a living organism, where millions of B cells per day experience endogenous DSBs during CSR, the observed translocation rate predicts that nearly 10^3 cells per day could form inadvertent translocations.

With distinct yet similar experimental protocols, these two studies paint a picture of some of the underlying mechanisms that predispose certain regions to translocations in B cells (Figure 1). For instance, both groups find that translocations occur more frequently on the chromosome carrying the induced break, even up to 50 Mb away. Such a phenomenon cannot be explained by local repair activity but, rather, suggests that translocations between pairs of DSBs occurring on the same chromosome are strongly preferred over interchromosomal events. Given that loci on the same chromosome tend to be located closer to each other in the nucleus than loci on different chromosomes, the spatial organization of the genome may directly impact the formation of translocations. Despite the enrichment for intrachromosomal translocations, the authors find that interchromosomal translocations occur frequently, comprising 60% of the nonlocal repair events. Some chromosomal regions translocate so frequently that they are classified as “hot spots.” Among these, the c-myc/IgH translocation is among the most frequent, regardless of whether the original break was introduced at c-myc or IgH.

Figure 1. Factors Predisposing Loci to Double-Strand Breaks and Translocations
Genome-wide translocation mapping identifies translocations that form when a DNA double-strand break (DSB) is artificially introduced in a defined location by the I-SceI meganuclease (center). The majority of induced DSBs are repaired locally, but many form translocation junctions with endogenous DSBs. These endogenous DSBs, often mediated by AID activity, occur at transcription start sites (left), class switch recombination sites (top), and elsewhere throughout the genome (right).

What mechanisms might explain the recurrent formation of certain translocations? Both studies find that many hot spots depend on activation-induced cytidine deaminase (AID). AID initiates CSR and somatic hypermutation by inducing DNA damage at immunoglobulin (Ig) loci (Yamane et al., 2011). Importantly, Klein et al. (2011) and Chiarle et al. (2011) show that this enzyme not only acts at canonical target loci, but also at additional sites throughout the genome, initiating DSBs that can lead to translocations. Thus, B cells pay a price for their programmed ability to rearrange
Ig loci by acquiring DNA damage elsewhere in the genome. Even in the absence of AID-dependent DSBs, there are numerous nonrandom translocations throughout the genome. Both studies find that translocations are much more likely to happen near transcription start sites (TSSs) of actively transcribed genes. Though this effect is pronounced when AID is present (AID induces DSBs at exposed cytosines revealed by stalled RNA polymerases; Pavri et al., 2010), transcription alone creates a risk of DSBs. Together, these studies show that transcription, the generation of breaks by AID, and physical linkage along the DNA all affect translocation frequency
(Figure 1). Importantly, the presence of translocation hot spots in the absence of selection suggests that recurrent translocations observed in disease may have mechanistic causes rather than solely being observed due to selection. Recurrent translocations in cancer may result from intrinsic mechanisms and thus may or may not be drivers of disease. The broad distribution of translocations, along with the likelihood that these rearrangements occur near gene promoters, explains the high risk of acquiring novel gene products and translocation-driven cancers.

These studies highlight the molecular processes that lead to formation of DSBs at particular genomic sites. However, for a translocation to occur, two DSBs must also come into close proximity. How and when this contact occurs has long been debated in terms of two contrasting models. The “contact first” model proposes that only regions of chromosomes already in contact prior to DSB formation can form translocations, whereas the “breakage first” model hypothesizes that DSBs can move in the nuclear space to contact translocation partners (Meaburn et al., 2007). The results of the studies from Klein et al. (2011) and Chiarle et al. (2011) suggest that spatial genome organization may affect translocation frequency, but neither study resolves which of the two models applies. The tendency of translocations to occur on the same chromosome as the engineered break site suggests that close spatial proximity can contribute to formation of specific translocations. Indeed, other studies have reported that IgH and c-myc are closely juxtaposed even before a DSB occurs (Roix et al., 2003), possibly facilitating translocation formation when DSBs are introduced. However, the extent to which loci move, especially when CSR and somatic hypermutation are activated, remains unresolved.
We now have the ability to comprehensively determine the initiating translocatomes; combining such techniques with other genome-wide assays that probe chromatin state and spatial conformation (Lieberman-Aiden et al., 2009) will contribute to a better understanding of the mechanisms that lead to recurrent formation of disease-causing genomic rearrangements.
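The junction categories used throughout this preview (local repair near the induced break, intrachromosomal events, and interchromosomal events) can be illustrated with a minimal Python sketch. It is not the analysis pipeline of either study: the `Junction` record, the `classify_junction` and `summarize` helpers, the example coordinates, and the decision to treat the 1 kb window mentioned above as a single hard cutoff are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Window taken from the text ("within 1 kb of the original break site");
# using it as a hard cutoff is an assumption of this sketch.
LOCAL_WINDOW_BP = 1_000


@dataclass(frozen=True)
class Junction:
    """One sequenced junction partner (hypothetical record layout)."""
    chrom: str
    pos: int


def classify_junction(bait_chrom: str, bait_pos: int, junction: Junction,
                      local_window: int = LOCAL_WINDOW_BP) -> str:
    """Label a junction relative to the induced break (the 'bait')."""
    if junction.chrom != bait_chrom:
        return "interchromosomal"
    if abs(junction.pos - bait_pos) <= local_window:
        return "local"
    return "intrachromosomal"


def summarize(bait_chrom: str, bait_pos: int,
              junctions: list[Junction]) -> Counter:
    """Count junctions per category."""
    return Counter(classify_junction(bait_chrom, bait_pos, j)
                   for j in junctions)


# Made-up example data; positions are placeholders, not real coordinates.
BAIT = ("chr15", 5_000_000)
EXAMPLE = [
    Junction("chr15", 5_000_400),   # close to the bait
    Junction("chr15", 4_999_200),   # close to the bait
    Junction("chr15", 30_000_000),  # same chromosome, far away
    Junction("chr12", 1_000_000),   # different chromosome
    Junction("chr3", 9_000_000),    # different chromosome
]

counts = summarize(*BAIT, EXAMPLE)
nonlocal_total = counts["intrachromosomal"] + counts["interchromosomal"]
inter_fraction = counts["interchromosomal"] / nonlocal_total
print(counts, f"interchromosomal share of nonlocal events: {inter_fraction:.2f}")
```

With real data, the interchromosomal share computed this way would correspond to the kind of figure quoted above for nonlocal repair events, although the actual studies may define "local" and filter junctions differently.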
REFERENCES

Chiarle, R., Zhang, Y., Frock, R.L., Lewis, S.M., Molinie, B., Ho, Y.-J., Myers, D.R., Choi, V.W., Compagno, M., Malkin, D.J., et al. (2011). Cell 147, this issue, 107–119.
Klein, I.A., Resch, W., Jankovic, M., Oliveira, T., Yamane, A., Nakahashi, H., Di Virgilio, M., Bothmer, A., Nussenzweig, A., Robbiani, D.F., et al. (2011). Cell 147, this issue, 95–106.
Küppers, R., and Dalla-Favera, R. (2001). Oncogene 20, 5580–5594.
Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al. (2009). Science 326, 289–293.
Meaburn, K.J., Misteli, T., and Soutoglou, E. (2007). Semin. Cancer Biol. 17, 80–90.
Mills, K.D., Ferguson, D.O., and Alt, F.W. (2003). Immunol. Rev. 194, 77–95.
Pavri, R., Gazumyan, A., Jankovic, M., Di Virgilio, M., Klein, I., Ansarah-Sobrinho, C., Resch, W., Yamane, A., Reina San-Martin, B., Barreto, V., et al. (2010). Cell 143, 122–133.
Roix, J.J., McQueen, P.G., Munson, P.J., Parada, L.A., and Misteli, T. (2003). Nat. Genet. 34, 287–291.
Shaffer, D.R., and Pandolfi, P.P. (2006). Nat. Med. 12, 14–15.
Yamane, A., Resch, W., Kuo, N., Kuchen, S., Li, Z., Sun, H.W., Robbiani, D.F., McBride, K., Nussenzweig, M.C., and Casellas, R. (2011). Nat. Immunol. 12, 62–69.
Splicing up Pluripotency

Brenton R. Graveley1,*
1Department of Genetics and Developmental Biology, University of Connecticut Stem Cell Institute, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, CT 06030, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.09.004
In this issue of Cell, Gabut and colleagues (2011) identify a new splice variant of FOXP1 that directly regulates the expression of pluripotency genes. It endows human embryonic stem cells with their pluripotent nature and is required for the reprogramming of somatic cells to induced pluripotent stem cells.

The past few years have seen remarkable progress in our understanding of the mechanistic basis of pluripotency, including the identification of key factors required for maintaining the pluripotent state of human embryonic stem cells (ESCs) (Chen et al., 2008; Kim et al., 2008; Silva et al., 2009). Moreover, one of the great breakthroughs of this decade was the discovery that only a few critical transcription factors, such as the combination of Oct4, Sox2, Klf4, and c-Myc, are sufficient to reprogram somatic cells into induced pluripotent stem (iPS) cells (Takahashi et al., 2007). These factors appear to activate a transcriptional network that endows cells with pluripotency (Samavarchi-Tehrani et al., 2010), but gene expression can be regulated by numerous processes other than transcription, including chromatin modifications, RNA stability, and pre-mRNA splicing. How these processes contribute to pluripotency has been largely understudied in human ESCs. Now in this issue of Cell, Gabut et al. (2011) break this field wide open by identifying an alternative splicing “switch” at the top of the pluripotency transcriptional network.

Alternative splicing, the process by which exons can be joined together in different patterns such that a single gene can give rise to multiple transcripts, is known to regulate key developmental decisions in a number of systems (Nilsen and Graveley, 2010). Perhaps the best known example is the sex-determination pathway in Drosophila (Salz, 2011). This pathway consists of five genes encoding pre-mRNAs that are spliced in a sex-specific manner (Figure 1A). The genes are organized in a hierarchy in which the splicing of an upstream gene regulates that of downstream genes. The genes at the bottom of this hierarchy, dsx (doublesex) and fru (fruitless), encode transcription factors, and the male-specific and female-specific protein variants of each factor regulate distinct sets of target genes. Thus, these regulated splicing events act in a switch-like manner to specify nearly all aspects of sex determination and courtship behavior.
To explore the role of alternative splicing in human ESC pluripotency, Gabut et al. use microarrays that can detect different splicing variants. These experiments reveal numerous splicing events that change as human ESCs differentiate into neural precursor cells, including one in the FOXP1 gene. This event involves a previously unannotated exon that is included in human ESCs but skipped in differentiated cells (Figure 1B). Strikingly, the exon’s sequence and its stem cell specificity is conserved in mouse, suggesting that it might play a significant role in stem cell biology. FOXP1 encodes a member of the forkhead family of transcription factors, which recognize particular DNA sequences through a ‘‘forkhead domain.’’ FOXP1 is an essential gene that is broadly expressed and required for the establishment of specific cell types. Fusions of FOXP1 with other genes or loss of FOXP1 function are associated with many different types of cancer (Wang et al., 2004; Dasen et al., 2008). Intriguingly, the
Figure 1. Alternative Splicing Regulatory Switches (A) A cascade of alternative splicing regulatory switches control sex determination in Drosophila; Sxl (Sex lethal), tra (transfer), msl-2 (male-specific lethal-2), dsx (doublesex), and fru (fruitless) genes are all differentially spliced in males and females. Sxl encodes a female-specific RNA-binding protein that autoregulates itself and represses the male-specific isoforms of tra and msl-2. The female-specific isoform of tra encodes a female-specific RNA-binding protein that activates expression of the female-specific isoforms of both dsx and fru. These isoforms encode female-specific transcription factors that activate expression of genes specifying female physical traits and sexual behavior. (B) An alternative splicing regulatory switch in FOXP1 regulates pluripotency and reprogramming (Gabut et al., 2011). FOXP1 contains two exons, 18 and 18b, which are spliced in a mutually exclusive manner. In embryonic stem cells (ESCs), exon 18b is included. This results in the production of FOXP1-ES, which binds to and activates pluripotency genes while simultaneously repressing differentiation genes. In differentiated cells, only exon 18 is included, resulting in the production of FOXP1, which activates the expression of differentiation genes.
ES-specific exon is located within the forkhead domain, suggesting that the FOXP1 splice variants may encode proteins with distinct DNA-binding specificities. To examine this possibility, Gabut and colleagues determine the DNA-binding specificity of both the traditional FOXP1 and the ES-specific FOXP1 (FOXP1-ES) with microarrays that contain oligonucleotides with all possible 8-mers. Whereas FOXP1 preferentially recognizes the sequence GTAAACA, FOXP1-ES preferentially binds to AATAAACA and CGATACAA. These results suggest that alternative splicing of FOXP1 could regulate the activation of distinct transcriptional programs. Next Gabut and colleagues use two complementary approaches to determine whether FOXP1 and FOXP1-ES control different sets of genes. First, they deplete either FOXP1 or FOXP1-ES by RNA interference (RNAi) and then sequence the resulting transcriptome. This allows them to identify genes that increase or decrease expression upon disruption of a specific FOXP1 isoform. Additionally, the authors use chromatin immunoprecipitation experiments to identify where FOXP1 and FOXP1-ES bind in the genome. The
results of these experiments are striking. FOXP1-ES, but not FOXP1, enhances expression of many pluripotency genes, including OCT4, NR5A2, and NANOG, by directly binding to their promoters. Simultaneously, FOXP1-ES represses the expression of genes that control differentiation. These intriguing observations prompt the authors to investigate the role and requirement for FOXP1-ES in stem cell pluripotency and reprogramming. Increasing the expression of FOXP1-ES, but not FOXP1, prevents differentiation of mouse ESCs under conditions that promote efficient differentiation. Conversely, depleting FOXP1-ES, but not FOXP1, inhibits reprogramming of mouse embryonic fibroblasts via the activation of Oct4, Klf4, c-Myc, and Sox2. The article by Gabut and colleagues is a landmark study that shifts the paradigm for mechanisms regulating embryonic stem cell pluripotency and reprogramming in mammals. Instead of a transcriptional network being at the top of the hierarchy, an alternative splicing switch can now be placed upstream of this network, as FOXP1-ES activates expression of pluripotency genes and represses expression of differentiation genes. Although it changes our understanding of the regulatory network controlling pluripotency and reprogramming, this work also raises many questions to be addressed in future studies. For instance, what controls the FOXP1-FOXP1-ES splicing switch? What splicing factors are responsible for flipping this switch, and how are their expression and activities regulated? Answering these questions is like hunting down the “chicken-or-the-egg” paradox, but the answers will ultimately uncover the master regulator of stem cell pluripotency.
REFERENCES Chen, X., Xu, H., Yuan, P., Fang, F., Huss, M., Vega, V.B., Wong, E., Orlov, Y.L., Zhang, W., Jiang, J., et al. (2008). Cell 133, 1106–1117. Dasen, J.S., De Camilli, A., Wang, B., Tucker, P.W., and Jessell, T.M. (2008). Cell 134, 304–316. Gabut, M., Samavarchi-Tehrani, P., Wang, X., Slobodeniuc, V., O’Hanlon, D., Sung, H.-K., Alvarez, M., Talukder, S., Pan, Q., Mazzoni, E.O., et al. (2011). Cell 147, this issue, 132–146. Kim, J., Chu, J., Shen, X., Wang, J., and Orkin, S.H. (2008). Cell 132, 1049–1061.
Cell 147, September 30, 2011 ª2011 Elsevier Inc. 23
Nilsen, T.W., and Graveley, B.R. (2010). Nature 463, 457–463.
Salz, H.K. (2011). Curr. Opin. Genet. Dev. 21, 395–400.
Samavarchi-Tehrani, P., Golipour, A., David, L., Sung, H.K., Beyer, T.A., Datti, A., Woltjen, K., Nagy, A., and Wrana, J.L. (2010). Cell Stem Cell 7, 64–77.
Silva, J., Nichols, J., Theunissen, T.W., Guo, G., van Oosten, A.L., Barrandon, O., Wray, J., Yamanaka, S., Chambers, I., and Smith, A. (2009). Cell 138, 722–737.
Takahashi, K., Tanabe, K., Ohnuki, M., Narita, M., Ichisaka, T., Tomoda, K., and Yamanaka, S. (2007). Cell 131, 861–872. Wang, B., Weidenfeld, J., Lu, M.M., Maika, S., Kuziel, W.A., Morrisey, E.E., and Tucker, P.W. (2004). Development 131, 4477–4487.
Unweaving the Autism Spectrum
Catherine Lord1,*
1Department of Psychiatry, Weill-Cornell Medical College, New York, NY 10605, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.09.017
Although genes associated with human autism spectrum disorders have been identified, bridging the gap between genetics and the patchwork of behavioral deficits associated with the disease remains an enormous challenge. Peñagarikano et al. (2011) now show that mice lacking CNTNAP2, a gene that causes a rare form of epilepsy associated with autistic features and language impairment, display similar phenotypes to their human counterparts, raising hopes that such models may speed the identification of neuronal circuitries underlying the core features of autism.

Disorders that affect behavior, including both psychiatric conditions and developmental disabilities, provide challenging opportunities and pitfalls for neuroscientists. In autism, a three-domain model describing deficits in communication, social interaction, and fixated or repetitive behaviors and interests has proven useful as a “grammar” to represent the nature of the deficits and to yield reliable diagnoses (Figure 1A). This model does not, however, necessarily reflect functional relationships between behaviors (Gotham et al., 2007). Such limitations underlie both the strengths and weaknesses of bold, integrative approaches such as those found in this issue in Peñagarikano et al. (2011), which reports a comprehensive and ambitious series of experimental behavioral, neuropathological, and neurophysiological studies of CNTNAP2 knockout mice. CNTNAP2, a gene on chromosome 7q35, is of particular interest because it has been shown to cause a rare form of epilepsy (Strauss et al., 2006). These patients have severe intellectual disabilities and, like most individuals with severe cognitive deficits, are described as having
features of autism. Although it likely accounts for fewer than 1% of cases of autism spectrum disorder (ASD) (Sanders et al., 2011), CNTNAP2 has also been associated with specific language impairment (SLI), which is characterized by difficulties with grammatical aspects of language acquisition in the absence of related causes such as hearing loss (Bishop, 2010). CNTNAP2 is also a downstream target of FOXP2, one of the first genes to have been identified as a cause of SLI. In common with some human CNTNAP2 patients, CNTNAP2 mutant mice have epileptic seizures and display impaired migration of cortical projection neurons, reduced cortical neuronal synchrony, and reduced numbers of GABAergic interneurons. To quantify the behavioral impact of these anatomical and electrophysiological defects, the authors assess knockout and wild-type mice for behaviors considered analogous to the three domains of autism. Standards for interpreting behavioral data in mouse models have become more sophisticated (Silverman et al., 2010). However, leaps made from findings to interpretations are still often substantial (Minshew and McFadden, 2011). Peñagarikano et al. (2011) report multiple measures and address several confounding factors such as potential olfactory impairment and the effect of sedation. Most striking, though not obviously anticipated, is that the mutant mice have deficits across diverse contexts and domains. On average, mice lacking CNTNAP2 make fewer social approaches and engage in less vocalization and nesting. In contrast, perseveration, grooming, and digging (used to indicate repetitive behaviors) are enhanced, as are overall levels of activity. Treatment with risperidone, an atypical antipsychotic drug licensed for the treatment of autism, increases nesting and decreases grooming, perseveration, and hyperactivity. However, risperidone has no effect on social approach or vocalization. The authors propose that these specific responses to pharmacological intervention are likely a result of the behaviors being driven by distinct neural circuits. Though the idea that social deficits and repetitive behaviors in autism are separable on an anatomical level is appealing, the absence of any attempt to address functional relationships between these deficits within individuals in
Figure 1. The Changing Landscape of Autism (A and B) The three-domain model of autism in the Diagnostic and Statistical Manual of Mental Disorders IV (DSM-IV) (A), compared with the two-domain model of DSM-V (B).
analyses of multilevel behavioral data makes conclusions about differences in circuits across these domains very questionable. For example, juvenile play is treated as an independent measure of social behavior from nesting and grooming, a repetitive behavior, and is analyzed separately, when these behaviors seem unlikely to be independent measures. On a broader level, associations among language delay, SLI, and ASD, which account for much of the interest in CNTNAP2, are also much more complex than is typically acknowledged in genetic studies (Bishop, 2010; Vernes et al., 2008). This is, in part, because simple, less interesting explanations, such as associations between genetic findings and overall level of ability or severity of ASD, are often not ruled out (Hus et al., 2007). Language difficulties in autism, in which delays in comprehension and onset are common, are not the same as those found in specific language impairment, in which grammatical aspects of expressive language are most affected. Thus, overlapping findings regarding CNTNAP2 in autism and SLI are not necessarily evidence of direct links between particular behavioral deficits and specific genetic loci. As is the case for similar parallels between autism and schizophrenia or ADHD, more developmental or functional models are required. This is most relevant for the findings of Peñagarikano et al. (2011) because recent conceptualizations of ASD contain only
two autism-specific domains (social communication; fixated/repetitive behaviors and interests), with language delays and structural language deficits now treated as separate (though important) diagnoses that may or may not co-occur with ASD, as is the case for intellectual disabilities (http://www.dsm5.org) (Figure 1B). Social aspects of communication are considered within a single social communication domain. Fixated interests, over- or underreactions to sensory input, and simple repetitive behaviors continue to be categorized within a single domain. However, in contrast to the lumping together of social communication deficits, there is mounting evidence that different types of repetitive/restricted behaviors are no more related to each other than they are to social communicative deficits (Richler et al., 2007). If mouse models are justified in terms of the analogs that they provide to human behavior (Silverman et al., 2010), the evolving specification of the nature of autism in humans must be considered. There is strong evidence that different genes are associated with ASD, but in almost all instances, these findings as yet have no clinical implications. Whereas standards in statistical genetics and for the biological aspects of the genetics are high, standards in genotype-phenotype studies for clarity in what is hypothesized, what is treated as a replication, and how different findings are considered in light of one another often seem comparatively low. Peñagarikano et al. (2011)
represents some of the most impressive research in this area through its attempts to integrate genetics, behavioral studies, and neural function and anatomy. It also highlights the need for more complex analyses, as well as for restraint in making claims about different behaviors that are not as simple or as distinct as represented.

REFERENCES
Bishop, D.V.M. (2010). Behav. Genet. 40, 618–629.
Gotham, K., Risi, S., Pickles, A., and Lord, C. (2007). J. Autism Dev. Disord. 37, 613–627.
Hus, V., Pickles, A., Cook, E.H., Jr., Risi, S., and Lord, C. (2007). Biol. Psychiatry 61, 438–448.
Minshew, N., and McFadden, K. (2011). Autism Res. 4, 1–4.
Peñagarikano, O., Abrahams, B.S., Herman, E.I., Winden, K.C., Gdalyahu, A., Dong, H., Sonnenblick, L.I., Gruver, R., Almajano, J., Bragin, A., et al. (2011). Cell 147, this issue, 235–246.
Richler, J., Bishop, S.L., Kleinke, J.R., and Lord, C. (2007). J. Autism Dev. Disord. 37, 73–85.
Sanders, S.J., Ercan-Sencicek, A.G., Hus, V., Luo, R., Murtha, M.T., Moreno-De-Luca, D., Chu, S.H., Morrow, M.P., Gupta, A.R., Thomson, S.A., et al. (2011). Neuron 70, 863–885.
Silverman, J.L., Yang, M., Lord, C., and Crawley, J.N. (2010). Nat. Rev. Neurosci. 11, 490–502.
Strauss, K.A., Puffenberger, E.G., Huentelman, M.J., Gottlieb, S., Dobrin, S.E., Parod, J.M., Stephan, D.A., and Morton, D.H. (2006). N. Engl. J. Med. 354, 1370–1377.
Vernes, S.C., Newbury, D.F., Abrahams, B.S., Winchester, L., Nicod, J., Groszer, M., Alarcón, M., Oliver, P.L., Davies, K.E., Geschwind, D.H., et al. (2008). N. Engl. J. Med. 359, 2337–2345.
Leading Edge
Essay
A Blueprint for Advancing Genetics-Based Cancer Therapy
William R. Sellers1,*
1Novartis Institutes for BioMedical Research, 250 Massachusetts Avenue, Cambridge, MA 02139, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.09.016
In the era of next-generation sequencing, there are significant challenges to harnessing cancer genome information to develop novel therapies. Key research thrusts in both academia and industry will speed this transition, and lessons learned for cancer will more broadly shape the process for genetic contributions to the therapy of disease.

Introduction

Breakthrough advances in the treatment of medical illness require the elucidation of the pathogenic mechanisms initiating and driving disease states. Cancer is largely a disease of the genome, and through recent technological advances, we are now able to separate the “diseased” cancer genome from the normal genome. As a consequence, the next decade should see dissection of disease-relevant somatic mutations and the completion of the “pathogenetic” landscape of cancer, paving the way for further therapeutic development. The discovery of key pathogenetic cancer alterations has already transformed the treatment of specific cancer types. The introduction of all-trans retinoic acid to the treatment of acute promyelocytic leukemia harboring translocations involving the RARα gene has led to curative responses in the majority of patients (Huang et al., 1988). Treatment of chronic myelogenous leukemia, bearing the BCR-ABL fusion gene, with imatinib, an inhibitor of the Abelson kinase, has led to a staggering 80% decline in disease mortality (Figure 1). More recently, inhibitors targeting EGFR mutations in adenocarcinoma of the lung, BRAF mutations in melanoma, and ALK translocations in lung cancer highlight the potential to alter the course of previously untreatable disease (reviewed in Haber et al., 2011). These encouraging results strongly suggest that the next step toward the goal of curing cancer is to more fully exploit this genetic-therapeutic strategy. Though access to ever-increasing amounts of sequence information holds
great promise, there are at least five significant barriers to this goal. First, we do not yet have a complete picture of the genetic alterations comprising the disordered cancer genome. Second, the discovery of specific mutations that lead to cancer, so-called “driver” mutations, is too often accompanied by an inability to create a relevant therapeutic molecule. In other words, there is a “druggability gap.” Third, most cancers, particularly those that are late stage, are genetically heterogeneous or capable of rapid genetic evolution. Either scenario can lead to the rapid emergence of therapeutic resistance. Fourth, as a corollary to the problem of resistance, combination therapy will be required to achieve therapeutic cure or long-lasting disease control. Finally, the ability to make more definitive predictions about clinical therapeutic outcome is severely limited by the lack of robust preclinical disease models.

“Completing” the Human Cancer Genome

The daunting complexity of the human cancer genome is amply illustrated by the emerging fully characterized cancer genomes. For example, a recent study looking at prostate cancer identified substantial variation between the patient tumors (Figure 2) (Berger et al., 2011). This complexity appears both as the diversity of genetic alterations across many cancer types and samples and the complexity within single tumors, in which the background mutation frequency (“carrier” mutations) often exceeds the frequency of “driver” mutations. Thus, in order to distinguish the critical genetic
alterations of therapeutic interest, repetitive mutations or repetitively mutated genes across many samples from individual tumor types must be identified and correlated with disease in a relevant manner. One can envision a broader goal requiring larger sample sets whereby we would define not only the cancer gene alterations, but also the cancer gene interactions. As examples, early efforts to describe the glioblastoma genome identified three core interacting genetic pathways in this disease (The Cancer Genome Atlas Research Network, 2008), whereas low-frequency single-gene mutations appear to, in aggregate, target the NFKB pathway in myeloma (Chapman et al., 2011). Therefore, the vision of ‘‘completeness’’ is to characterize cancer genomes at a sufficient scale so that genetic interactions can be used to directly define the key functional cancer pathways and potentially the therapeutic targets. To this end, the goal should be to describe three interaction attributes beyond single-gene mutation frequency, specifically: (1) to identify those genetic alterations that co-occur at a greater frequency than by chance alone and are hence cooperative, (2) to identify those genes in which alterations anticorrelate and thus provide a similar function or where co-mutation is incompatible with cancer cell progression, and (3) to identify genes whose coding sequence is notable for the absence of background mutations. In this latter case, the absence of background loss-of-function mutations could potentially define genes whose function is absolutely required for cancer cell
survival. This so-called class of “never-mutated” genes will likely be conditioned by driver genetic events and could be an important new class of “conditional lethality” genes particularly relevant to therapeutic development. Advances in sequencing technology have created the opportunity to achieve these goals; however, high-priority efforts must now be made to address the remaining rate-limiting steps of tumor sample acquisition, data handling, and data analysis.

Figure 1. CML Mortality Has Declined in the United States, and the Annual Incidence Is Unchanged Shown are the estimated US-based CML incidence and mortality rates for the years 1997, 1998, and 2000–2011. These data were abstracted from the annual cancer statistics publications published by CA: A Cancer Journal for Clinicians in the years 1997, 1998, and 2000–2011 (Siegel et al., 2011).

From Genetics to Therapy: Bridging the Druggability Gap

The identification of protein kinases, activated by somatic mutation, has led the recent wave of therapeutic breakthroughs outlined above. In sharp contrast is the realization that most of the known critical oncogenes and tumor suppressors remain beyond the reach of current therapeutic modalities. Sequencing projects have reaffirmed TP53 as the most commonly mutated tumor suppressor gene and the RAS oncogenes (K-RAS and N-RAS) as key drivers of a variety of devastating cancers. Yet, the elucidation of p53 and RAS biology has not been coupled with major revolutions in therapeutics based on these genetic alterations. Similarly, the discovery of key oncogenic transcription factors, including MYC, NMYC, ERG, and others, has not been paralleled by the development of transcription factor inhibitors. To address this druggability gap, it is first necessary that we challenge the notion of what is “undruggable” so that drug targets are not rejected outright based on prior assumptions (or past failures). Similar prejudices led to the premature conclusions that kinases and other ATP-utilizing enzymes were themselves undruggable. The development of drugs disrupting protein-protein interactions (PPI) between IAPs and SMAC (Sun et al., 2004), between p53 and HDM2 (Vassilev et al., 2004), and, most notably, along the remarkably extended interface between BH3-containing proteins and BCL2 family members belie the notion that such PPI targets are inherently intractable (Oltersdorf et al., 2005).

Further broadening of the scope of druggable targets can be achieved through a number of parallel mechanisms. In the arena of “typical” synthetic low-molecular weight molecules, we will benefit from increases in diversity and scale of available small-molecule libraries. In academia, renewed interest in direct participation in drug discovery has enabled establishment of modest-scale low-molecular weight screening facilities. Such efforts may focus on phenotype-oriented drug discovery, and hopefully breakthroughs against difficult targets will follow. But centers must be bolstered with robust medicinal chemistry resources, or it will become difficult to make advances beyond the relatively weak nonspecific inhibitors typically identified through screening. The most challenging drug discovery projects (e.g., PPIs) typically require structure-based guidance (X-ray and NMR) and an investment in biophysical analysis platforms. Indeed, in tackling challenging drug targets, it is likely that these resources will be of more value than the screening platforms themselves.

In industry, time-dependent pressure to deliver molecules ready for clinical trials drives early-stage drug discovery efforts focused on tractable targets, including those with often tenuous links to disease pathogenesis over those focused on difficult targets with incontrovertible disease linkage. Perversely, this can reward a scenario of costly late-clinical failure based on the invalidation of the therapeutic hypothesis over a scenario of early-research failure based on druggability. Notably, the current biotechnology venture-based funding model suffers from the same short-term reward structure. Somehow, a proportion of the biopharmaceutical drug discovery resource must be dedicated over the longer term to the discovery of compounds that break new ground against the most challenging of targets. To expand these efforts on the small molecule front for particular targets, we need research funding mechanisms in academia or industry that are compatible with the 5–10 year timescale required for tackling difficult drug discovery problems.

Alternative therapeutic modalities, including therapeutic siRNA, intracellular peptide therapies, and gene therapy, hold promise for a broad attack on the druggability gap but are all beset by a common issue: difficult drug delivery. The challenge to the delivery of therapeutic siRNA remains essentially unchanged from that faced by antisense oligonucleotides. The molecules are large (>10,000 MW) and highly charged, making distribution across biological membranes problematic (Shim and Kwon, 2010). To realize the promises for these novel therapeutic classes, a renewed focus on systematic and robust approaches to the delivery problem is warranted.

In the short-term, certain key cancer genes are likely to remain relatively difficult to drug. This is a notable problem for the
Figure 2. The Chromosomal Rearrangements Found in Seven Prostate Cancer Genomes Copy number alterations are represented by the colors of the chromosome segments depicted along the inner ring (red, copy gain; blue, copy loss). Intrachromosomal rearrangements and interchromosomal translocations are shown with green and purple lines, respectively. Reprinted by permission from Macmillan Publishers Ltd: Nature 470, 214–220, copyright 2011.
tumor suppressor genes, in which the gene product is often completely absent. Here, the hope is to exploit the concept of “synthetic lethality” or “conditional lethality” in order to define key therapeutic targets whose requirement for the maintenance of the cancer state is conditioned by a specific genetic event (Kaelin, 2005). Such conditional lethality nodes can be conceived of as existing within a pathway downstream of an oncogene or tumor suppressor or in a parallel pathway. In the case of downstream conditional lethality, preclinical and clinical proof of concept has been obtained from the inhibition of the SMO receptor downstream of tumor suppressor mutations in the PTCH gene (Von Hoff et al., 2009), from the inhibition of mTORC1 downstream of germline mutations in the TSC genes (Krueger et al., 2010), and from the inhibition of MEK downstream of mutant oncogenic BRAF in melanoma (Solit et al., 2006). In the case of parallel pathway conditional lethality, clinical proof of concept has been obtained from the inhibition of PARP1 in the context of BRCA1 loss-of-function mutations in breast cancer (Farmer et al., 2005; Fong et al., 2009). In each case, the key finding is that mutant-bearing tumors are far more susceptible to the relevant inhibitor than either nonmutant tumors or the normal host tissue. The tools for further discovery of such conditional lethal nodes are only just becoming available with improvements in larger-scale shRNA libraries. However, the robust realization of such discoveries remains plagued by the noise inherent in larger-scale shRNA screens and, in some cases, the overreliance on screens in isogenic cell line pairs that can often lead to cell line-specific hits. In this regard, the alternative approach of cell line panels should be considered (Brough et al., 2011). An investment in the development of highly validated shRNAs for each human and murine gene could go a long way in helping to reduce the notion of conditional lethality to practical discovery.

The Development of Resistance

The evolutionary nature of cancer and its mutable genome make emergent therapeutic resistance a serious and often unnerving problem. No doubt this is a major hurdle in moving from therapeutic efficacy to curative cancer therapy. Though there has been a notable lack of progress in defining clinically relevant mechanisms of resistance to classical cytotoxics, more rapid progress has been seen in understanding resistance to genetically directed targeted therapeutics. In particular, it is clear that, in the latter case, the preclinical discovery of resistance mechanisms can be directly predictive of the resistance features seen in patients. A first example was the elucidation of mechanisms of resistance to imatinib, in which the preclinical definition of resistance alleles through a random mutagenesis and selection process was, in retrospect, highly correlated with resistance alleles uncovered in patient samples (Azam et al., 2003; Shah et al., 2002). Similarly, clinically relevant mechanisms of resistance to EGFR, SMO, and RAF inhibitors have been uncovered through a variety of preclinical studies employing the relevant addicted cell line or animal
models. In parallel, the application of next-generation sequencing technologies to the discovery of very low-abundance pre-existing resistance alleles can allow confirmation of preclinical discoveries prior to treatment with relevant inhibitors. The utility of predictive preclinical resistance studies together with the advent of deep sequencing raise the possibility that resistance mechanisms could be identified before the clinical trials of novel therapeutics begin. In fact, ideally such an approach would be used to guide improvements in the first-generation inhibitors during lead optimization. For example, choices among distinct small-molecule scaffold classes with alternate binding modes to a target of interest could be prioritized based on the frequency and propensity of resistance mechanisms and upon which resistance alleles are detectable in patients prior to treatment. Presumably, this approach could accelerate the generation of best-in-class targeted therapeutics. The rapid emergence of resistance during targeted therapy has raised the specter of an endless chase of resistance alleles, with ever more specific inhibitors dealing with an increasingly complex spectrum of resistance alleles and nongenetic tumor-adaptive responses. Though this so-called “whack-a-mole” approach can lead to the rapid clinical development of second-generation or third-generation inhibitors applied in mutation-specific settings, this will not be the ultimate strategy for achieving curative therapy. Rather, defining the emergent principles of resistance should be used to both (1) guide improvements to the first-line targeted therapeutics to gain greater efficacy and (2) elucidate rational combinations based on the understanding of escape mechanisms. What can we say about the principles of resistance to targeted therapeutics that are now evident from the study of ABL, EGFR, KIT, BRAF, and SMO inhibitors?
The common theme is a fairly remarkable and consistent finding of persistent target and pathway addiction. Given the complexity of the cancer genome and the well-recognized mutability, one might have imagined that hundreds of distinct resistance mechanisms would have emerged in response to any given targeted therapy. Instead, the observation is that of a consistent pattern of resistance
mechanisms, acting in large part to restore the activity of the original “addicting” pathway. In the case of imatinib-based treatment of CML, the great majority of resistance alleles are found in the ABL kinase domain itself. Based on these data, ABL inhibitors capable of both directly suppressing such mutations and more potently inhibiting the wild-type kinase have substantially improved the molecular response rates (Kantarjian et al., 2010; Saglio et al., 2010). Lung adenocarcinomas bearing activating mutations in EGFR show dramatic response to catalytic EGFR inhibitors but relapse through direct mutations in the EGFR kinase domain or through coamplification of MET (Kobayashi et al., 2005; Pao et al., 2005). In both instances, a common feature is the restoration of downstream signaling through phosphoinositide-3 kinase. Based on these data, second-generation inhibitors, novel combinations with MET inhibitors and with PI3K inhibitors, are in development. Melanomas bearing activating mutations in BRAF similarly manifest significant responses to BRAF or MEK inhibitors, and through heterogeneous mechanisms, escape is manifest largely by reactivation of the MEK-ERK cascade. Finally, the study of resistance to the androgen receptor pathway inhibitors in prostate cancer for two decades has focused on so-called androgen-independent means of resistance. The recent clinical success of CYP17 inhibitors (Attard et al., 2008) and novel AR antagonists (Scher et al., 2010) has shown, however, that such “androgen-independent” tumors remain dependent on both AR and AR ligands. In all, the emerging data from these examples and others strongly support the notion that cancers remain highly dependent on these initial dominant oncogenic pathways and support the elaboration of improved inhibitors or of “vertical” pathway combinations, where inhibitors target the same pathway, as at least one mechanism by which we can improve the first-line treatment of cancer.
Such combinations are also likely to be effective in the case of nongenetic or adaptive resistance mechanisms involving pharmacologic activation of homeostatic feedback loops exemplified by the finding of RTK activation induced by AKT inhibition (Chandarlapaty et al., 2011).
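The earlier point about using deep sequencing to find very low-abundance pre-existing resistance alleles can be made concrete with a little arithmetic. The sketch below is only an illustration under simplifying assumptions that are not in the essay: reads are treated as independent draws, sequencing error is ignored, and the function name, the 0.1% allele fraction, and the five-read calling threshold are all hypothetical choices.

```python
from math import comb


def prob_at_least_k_variant_reads(depth: int, allele_fraction: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(depth, allele_fraction).

    Assumes reads are independent draws and ignores sequencing error.
    """
    if depth < 0 or k < 0 or not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("depth and k must be non-negative; allele_fraction must lie in [0, 1]")
    # Sum the probabilities of seeing 0 .. k-1 variant reads, then take the complement.
    p_fewer_than_k = sum(
        comb(depth, i) * allele_fraction**i * (1.0 - allele_fraction) ** (depth - i)
        for i in range(min(k, depth + 1))
    )
    return 1.0 - p_fewer_than_k


if __name__ == "__main__":
    # Hypothetical scenario: a resistance allele carried by 0.1% of reads,
    # with at least 5 supporting reads required before the allele is called.
    for depth in (100, 1_000, 10_000, 100_000):
        p = prob_at_least_k_variant_reads(depth, allele_fraction=0.001, k=5)
        print(f"depth {depth:>7}: P(detect) = {p:.3f}")
```

Under these assumptions the chance of detection rises steeply with read depth, which is one way to see why the approach only became practical with deep sequencing. A real variant caller would also have to separate true alleles from sequencing errors, which this sketch does not attempt.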
Toward the Discovery and Development of Curative Combinations

The majority of curative cancer treatment regimens involve multidrug combinations. The emergence of therapeutic resistance together with the frequent incomplete response to primary therapy underscores the importance of developing novel highly efficacious combinations. Here, we face substantial challenges in both the research arena and the clinic. The preclinical discovery of combination therapies is significantly limited by experimental throughput and by the lack of a consistent understanding of which measures of preclinical combination activity might be predictive of clinical combination activity. The throughput of combination profiling in vitro might be solved through automation; however, animal costs rapidly become prohibitive if systematic in vivo combination testing is required. Measures of combination effect, i.e., the interaction between two drugs to produce a cellular outcome, have become increasingly sophisticated, yet there is little understanding of how such improved methods will relate to the clinic. Efforts to solve this latter problem are urgently needed. The clinical challenge is to move beyond the paradigm that requires the testing of new agents in combination with existing standard of care therapies (SOCs). This incremental strategy ignores the possibilities that novel and existing agents might be antagonistic and that two novel agents might be more effective as a combination without an SOC component. Fortunately, recognition that this paradigm is inadequate has led to early phase II clinical trials in which two novel agents are being tested in combination prior to the full demonstration of single-agent activity. Notably, the observation that the treatment of RAS-driven tumors may require blockade of both the PI3K and RAS pathways (Engelman et al., 2008) has led to multiple trials of the “horizontal” combination of AKT or PI3K inhibitors in combination with MEK inhibitors.
In addition, the ‘‘vertical’’ combination of MEK and RAF inhibitors is being tried in melanoma (Infante et al., 2011). If future results of such phase II trials were truly distinctive, including high partial and perhaps complete response rates in relatively unresponsive cancers, such combinations might be directly
Cell 147, September 30, 2011 ª2011 Elsevier Inc. 29
submitted as new drug applications or might proceed to phase III trials comparing new combinations against the accepted standard of care.
The Development of Preclinical Disease Model Systems
Until recently, preclinical therapeutic profiling was typically confined to the analysis of a handful of human cancer cell lines. As such, preclinical testing is conducted on fewer cancer cell lines than the number of human patients treated in even exploratory trials. Under such conditions, it is unreasonable to expect either robust preclinical prediction of the clinical utility of a candidate therapeutic or the preclinical discovery and validation of predictive biomarkers. In simple terms, the test sets are too small. Ideally, preclinical models would be of sufficient robustness and used at sufficient scale so that clinical failure would be reduced to zero. The complexity of human cancer makes it unlikely that this landmark will be attained; nonetheless, progress toward this goal will be of unquestionable benefit. The ‘‘required’’ elements of a robust preclinical infrastructure include molecularly defined model systems that are directly reflective of their human counterparts and sufficient model numbers for one to approach the disease diversity found in humans. Additionally, the preservation of stromal-epithelial interactions, in particular those occurring through ligand-receptor pairs, along with the ability to study cancer in the setting of a functioning immune system, are vitally important. Lastly, the ability to replicate disease progression and to examine sufficient intratumoral heterogeneity to enable the study of resistance are desirable for such an infrastructure. It is self-evident that no single class of preclinical models will satisfy such requirements. Moreover, each model system will have distinct advantages and weaknesses. Thus, the monolithic view that there is a single best system for the preclinical study of therapeutic effect is naive.
It is reasonable, therefore, to take a multipronged approach with the paramount commonality that each model is related to human cancer through detailed molecular characterization. The in vitro study of cancer cell lines is the only current method for characterizing
therapeutic effect across hundreds of representative disease models derived from bona fide human tumors. The artificial nature of cell culture systems has many limitations, including a limit to the spectrum of human cancer that can be adequately represented. Nonetheless, cell-autonomous growth inhibitory effects can be robustly studied in many instances. Prior efforts at systematic profiling, including the NCI60, were limited by the small sample size and the lack of molecular characterization. Two efforts to rectify this deficit have now characterized both the molecular constituents and drug sensitivity of many hundreds of cell lines. The Cancer Cell Line Encyclopedia project (http://www.broadinstitute.org/ccle) has completed the expression profiling, the copy number analysis, and the exon sequencing of 1600 genes across 949 cell lines available through commercial sources. The MGH-Sanger cell line project (Genomics of Drug Sensitivity in Cancer; http://www.cancerRxgene.org) has completed the expression and copy number analysis of 800 cancer cell lines, as well as exon sequencing of 65 genes commonly mutated in cancer. Full genomic exome sequencing will be completed by the end of 2011. These genomic parameters are being correlated with drug responses to a panel of 135 anticancer small molecules. Renewed interest in studying human tumors without the requirement for in vitro growth has led to the development of primary human tumor explants for propagation in immunocompromised animals. This approach allows for the development of tumor models characterized by molecular alterations that appear difficult to maintain during in vitro growth. In addition, the function of key developmental pathways (e.g., the Hedgehog pathway) is much more likely to be preserved. Finally, the role of the murine stroma in supporting the growth of human cancers can in some instances reflect the human stromal response.
For example, in pancreatic cancers, tumor production of Hedgehog ligands leads to stromal Hh pathway activation. In turn, inhibition of stromal Hh pathway activation by SMO inhibitors leads to an antitumor response (Yauch et al., 2008). Genetically engineered mouse (GEM) models have the intrinsic advantage of preserving stromal-to-epithelial interactions and a competent immune system. GEM models are increasingly reflective of the genetic alterations seen in some human cancers and likely harbor a high degree of genetic heterogeneity such that relevant resistance mechanisms can be rapidly identified (Buonamici et al., 2010). These systems remain limited by the necessity of staging large numbers of mice for even rudimentary preclinical trials, and hence transplantable versions of GEM-derived tumors might provide a more tractable approach. Finally, the rapid advances in next-generation sequencing throughput make it worth revisiting spontaneous or carcinogen-induced murine cancers, along with cancers arising in mice with targeted disruption of DNA repair pathways. Here, tumors can be collected in large numbers, and as sequencing costs drop, the full complement of genetic alterations will be understood. Such models could then be accepted or rejected as representatives of human disease based on a genetic comparison. This might allow the rapid development of a large array of genetically complex transplantable murine tumors. The growing evidence that preclinical models, when used in robust numbers and when accurately characterized, can be helpful in guiding clinical development should renew investment in building these models as a commonly available resource for wider use.
Implications for Therapeutic Development in Nononcologic Diseases
The treatment of the majority of noninfectious medical illness remains largely based on phenotype. For example, diabetic therapy is largely confined to the lowering of blood glucose, the most readily measured ‘‘phenotype’’ of this complex disease. The progress toward understanding genetic mechanisms of disease has been notable in those medical illnesses caused by Mendelian inherited genetic alterations; however, progress in defining causally associated genetic variants has been slower.
Limitations to progress in the genetic definition of medical disease have come in two forms: (1) the need to more precisely define distinct subdisease phenotypes so that genetically diverse diseases are not admixed and (2) an incomplete ability
to analyze genetic variation, including rare variants across large sample sets. As is the case for cancer, we can see the near-term end of the second roadblock. And with sufficient attention to the first, the future discovery of new genetically defined pathways is likely. The discovery of complement pathway genetic variants as the largest attributable factor in the pathogenesis of macular degeneration is emblematic of the transformation that awaits (Edwards et al., 2005; Haines et al., 2005; Klein et al., 2005). Following the definition of disease-causing genetic variants, the same challenges seen in cancer are likely to arise. First, the druggability gap will pose a substantial challenge. As an example, the identification of genetic alterations in hemoglobin has not led to a major advancement in medical therapy of sickle cell anemia and is reflective of the same difficulty faced in treating loss-of-function mutations in tumor suppressor genes. Though therapeutic resistance through a hypermutable genome is unlikely to be a common theme in nononcologic medical disease, adaptive resistance through pathway and endocrine remodeling likely suggests that combination therapy of medical illness will be essential. Lastly, though cancer models are not ideal, the situation in the medical disease area is far worse. For most commonly used therapeutic models (collagen-induced arthritis, chemical-induced fibrosis), we have little to no evidence that they mimic and/or reflect the human disease. Furthermore, in most cases, only one model system is used at all. Again, the likelihood that this testing paradigm will be predictive of human therapeutic success is low. A key challenge in these areas is to take emerging genetic data derived from patient-based studies and create models (likely starting with GEMs) based on the most relevant genetic alterations both alone and in combination.
The completion of the human genome project provided a roadmap to the eventual understanding of disease genomes. Genetic drivers of disease are now being elucidated and will allow us to transform the treatment of human illness. Though substantial, the hurdles to fully realizing this transformation can be overcome to drive the next stages of innovation and investment.
ACKNOWLEDGMENTS
I am most grateful to my colleagues Levi Garraway, Chatanika Stoop, and Mariellen Gallagher for their critical comments and many suggestions. W.R.S. is an employee and shareholder of Novartis Pharmaceuticals.
REFERENCES
Attard, G., Reid, A.H., Yap, T.A., Raynaud, F., Dowsett, M., Settatree, S., Barrett, M., Parker, C., Martins, V., Folkerd, E., et al. (2008). J. Clin. Oncol. 26, 4563–4571.
Azam, M., Latek, R.R., and Daley, G.Q. (2003). Cell 112, 831–843.
Berger, M.F., Lawrence, M.S., Demichelis, F., Drier, Y., Cibulskis, K., Sivachenko, A.Y., Sboner, A., Esgueva, R., Pflueger, D., Sougnez, C., et al. (2011). Nature 470, 214–220.
Brough, R., Frankum, J.R., Sims, D., Mackay, A., Mendes-Pereira, A.M., Bajrami, I., Costa-Cabral, S., Rafiq, R., Ahmad, A.S., Cerone, M.A., et al. (2011). Cancer Discov. 1, 260–273.
Buonamici, S., Williams, J., Morrissey, M., Wang, A., Guo, R., Vattay, A., Hsiao, K., Yuan, J., Green, J., Ospina, B., et al. (2010). Sci. Transl. Med. 2, 51ra70.
Cancer Genome Atlas Research Network. (2008). Nature 455, 1061–1068.
Chandarlapaty, S., Sawai, A., Scaltriti, M., Rodrik-Outmezguine, V., Grbovic-Huezo, O., Serra, V., Majumder, P.K., Baselga, J., and Rosen, N. (2011). Cancer Cell 19, 58–71.
Chapman, M.A., Lawrence, M.S., Keats, J.J., Cibulskis, K., Sougnez, C., Schinzel, A.C., Harview, C.L., Brunet, J.P., Ahmann, G.J., Adli, M., et al. (2011). Nature 471, 467–472.
Edwards, A.O., Ritter, R., III, Abel, K.J., Manning, A., Panhuysen, C., and Farrer, L.A. (2005). Science 308, 421–424.
Engelman, J.A., Chen, L., Tan, X., Crosby, K., Guimaraes, A.R., Upadhyay, R., Maira, M., McNamara, K., Perera, S.A., Song, Y., et al. (2008). Nat. Med. 14, 1351–1356.
Farmer, H., McCabe, N., Lord, C.J., Tutt, A.N., Johnson, D.A., Richardson, T.B., Santarosa, M., Dillon, K.J., Hickson, I., Knights, C., et al. (2005). Nature 434, 917–921.
Fong, P.C., Boss, D.S., Yap, T.A., Tutt, A., Wu, P., Mergui-Roelvink, M., Mortimer, P., Swaisland, H., Lau, A., O’Connor, M.J., et al. (2009). N. Engl. J. Med. 361, 123–134.
Haber, D.A., Gray, N.S., and Baselga, J. (2011). Cell 145, 19–24.
Haines, J.L., Hauser, M.A., Schmidt, S., Scott, W.K., Olson, L.M., Gallins, P., Spencer, K.L., Kwan, S.Y., Noureddine, M., Gilbert, J.R., et al. (2005). Science 308, 419–421.
Huang, M.E., Ye, Y.C., Chen, S.R., Chai, J.R., Lu, J.X., Zhoa, L., Gu, L.J., and Wang, Z.Y. (1988). Blood 72, 567–572.
Infante, J.R., Falchook, G.S., Lawrence, D.P., Weber, J.S., Kefford, R.F., Bendell, J.C., Kurzrock, R., Shapiro, G., Kudchadkar, R.R., Long, G.V., et al. (2011). J. Clin. Oncol. 29, CRA8503.
Kaelin, W.G., Jr. (2005). Nat. Rev. Cancer 5, 689–698.
Kantarjian, H., Shah, N.P., Hochhaus, A., Cortes, J., Shah, S., Ayala, M., Moiraghi, B., Shen, Z., Mayer, J., Pasquini, R., et al. (2010). N. Engl. J. Med. 362, 2260–2270.
Klein, R.J., Zeiss, C., Chew, E.Y., Tsai, J.Y., Sackler, R.S., Haynes, C., Henning, A.K., SanGiovanni, J.P., Mane, S.M., Mayne, S.T., et al. (2005). Science 308, 385–389.
Kobayashi, S., Boggon, T.J., Dayaram, T., Jänne, P.A., Kocher, O., Meyerson, M., Johnson, B.E., Eck, M.J., Tenen, D.G., and Halmos, B. (2005). N. Engl. J. Med. 352, 786–792.
Krueger, D.A., Care, M.M., Holland, K., Agricola, K., Tudor, C., Mangeshkar, P., Wilson, K.A., Byars, A., Sahmoud, T., and Franz, D.N. (2010). N. Engl. J. Med. 363, 1801–1811.
Oltersdorf, T., Elmore, S.W., Shoemaker, A.R., Armstrong, R.C., Augeri, D.J., Belli, B.A., Bruncko, M., Deckwerth, T.L., Dinges, J., Hajduk, P.J., et al. (2005). Nature 435, 677–681.
Pao, W., Miller, V.A., Politi, K.A., Riely, G.J., Somwar, R., Zakowski, M.F., Kris, M.G., and Varmus, H. (2005). PLoS Med. 2, e73.
Saglio, G., Kim, D.W., Issaragrisil, S., le Coutre, P., Etienne, G., Lobo, C., Pasquini, R., Clark, R.E., Hochhaus, A., Hughes, T.P., et al; ENESTnd Investigators. (2010). N. Engl. J. Med. 362, 2251–2259.
Scher, H.I., Beer, T.M., Higano, C.S., Anand, A., Taplin, M.E., Efstathiou, E., Rathkopf, D., Shelkey, J., Yu, E.Y., Alumkal, J., et al; Prostate Cancer Foundation/Department of Defense Prostate Cancer Clinical Trials Consortium. (2010). Lancet 375, 1437–1446.
Shah, N.P., Nicoll, J.M., Nagar, B., Gorre, M.E., Paquette, R.L., Kuriyan, J., and Sawyers, C.L. (2002). Cancer Cell 2, 117–125.
Shim, M.S., and Kwon, Y.J. (2010). FEBS J. 277, 4814–4827.
Siegel, R., Ward, E., Brawley, O., and Jemal, A. (2011). CA Cancer J. Clin. 61, 212–236.
Solit, D.B., Garraway, L.A., Pratilas, C.A., Sawai, A., Getz, G., Basso, A., Ye, Q., Lobo, J.M., She, Y., Osman, I., et al. (2006). Nature 439, 358–362.
Sun, H., Nikolovska-Coleska, Z., Yang, C.Y., Xu, L., Tomita, Y., Krajewski, K., Roller, P.P., and Wang, S. (2004). J. Med. Chem. 47, 4147–4150.
Vassilev, L.T., Vu, B.T., Graves, B., Carvajal, D., Podlaski, F., Filipovic, Z., Kong, N., Kammlott, U., Lukacs, C., Klein, C., et al. (2004). Science 303, 844–848.
Von Hoff, D.D., LoRusso, P.M., Rudin, C.M., Reddy, J.C., Yauch, R.L., Tibes, R., Weiss, G.J., Borad, M.J., Hann, C.L., Brahmer, J.R., et al. (2009). N. Engl. J. Med. 361, 1164–1172.
Yauch, R.L., Gould, S.E., Scales, S.J., Tang, T., Tian, H., Ahn, C.P., Marshall, D., Fu, L., Januario, T., Kallop, D., et al. (2008). Nature 455, 406–410.
Leading Edge
Perspective
Clan Genomics and the Complex Architecture of Human Disease
James R. Lupski,1,2,3,* John W. Belmont,1,2 Eric Boerwinkle,4,5 and Richard A. Gibbs1,5,*
1Department of Molecular and Human Genetics
2Department of Pediatrics
Baylor College of Medicine, Houston, TX 77030, USA
3Texas Children’s Hospital
4Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX 77030-1501, USA
5The Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
*Correspondence: [email protected] (J.R.L.), [email protected] (R.A.G.)
DOI 10.1016/j.cell.2011.09.008
Human diseases are caused by alleles that encompass the full range of variant types, from single-nucleotide changes to copy-number variants, and these variations span a broad frequency spectrum, from the very rare to the common. The picture emerging from analysis of whole-genome sequences, the 1000 Genomes Project pilot studies, and targeted genomic sequencing derived from very large sample sizes reveals an abundance of rare and private variants. One implication of this realization is that recent mutation may have a greater influence on disease susceptibility or protection than is conferred by variations that arose in distant ancestors.
Genetic Contributions to Disease
Common chronic diseases such as diabetes, coronary heart disease, stroke, neuropsychiatric illness (including schizophrenia, autism, and developmental disabilities), chronic respiratory disease, and cancer account for an overwhelmingly large fraction of mortality, morbidity, and health care expenditure (http://www.cdc.gov/nchs/). These diseases disproportionately affect aging populations and burden the health care systems and economies of industrialized nations throughout the world. Understanding the underlying causes of such disorders is a key step toward enabling earlier and more precise diagnosis, prognosis, interventional therapy, and potentially prevention. Most common diseases are complex or multifactorial with both environmental and genetic contributions along with their nearly intractable interaction effects. In general, the environmental components are challenging to identify and quantitate. In contrast, as a result of the emergence of powerful genomic technologies, the analysis of the genetic components is becoming increasingly tractable and relatively inexpensive to investigate. These technical improvements have fueled a pipeline of discovery of the genes and variants that predispose to human maladies.
The technical improvements have also impacted genetic diagnostics, as it is now practical to sequence an individual’s entire genome for less than the cost of a comprehensive set of whole-body imaging scans. Furthermore, whole-genome sequencing is rapidly becoming less expensive than current clinically implemented ‘‘multigene panel testing’’ for molecular diagnosis of disease traits with even modest genetic heterogeneity.
Allele Frequency Distributions and Human Disease
A ‘‘common disease/common variant’’ (CDCV) hypothesis has been popularized as an explanation for common disorders and
has garnered much support (Reich and Lander, 2001). This model presupposes that different combinations of common alleles aggregate in specific individuals to increase disease risk. The CDCV hypothesis was a major intellectual impetus for the International Haplotype Mapping (HapMap) project and ultimately led to a proliferation of genome-wide association studies (GWAS) identifying regions influencing disease status or risk factor levels (http://www.genome.gov/gwastudies). As a consequence, insights into potential new pathways underlying common disease have emerged. Manolio et al. (2009) suggested that the genetic variance explained per se is of interest for what it might suggest about effective research paradigms. Accounting for the genetic variance is not the same as achieving utility and impact. If the goal of our shared research program is to achieve mechanistic understanding and lessen the impact of human disease, then the magnitude of the genetic effect is less important than the possible insight provided by the newly identified loci. Genes identified through their weakly acting common alleles may give important clues about pathways and/or be excellent targets for therapeutic and preventive strategies. However impressive the information gleaned from GWAS has been, the results have explained only a few percent of the apparent genetic variance contributing to common diseases. Furthermore, these studies have not yet delivered medically actionable variants that inform medical decision making by helping to establish an etiological diagnosis and lead to a more efficacious treatment or prevention plan. Both diagnostic utility and classification of pathogenicity are closely associated with the magnitude of each variant-specific effect. For most complex diseases, unless a variant clearly partitions the affected into distinct biological subgroups or can be incorporated into
risk-prediction models, there is likely to be limited diagnostic usefulness. Morrison et al. (2007) proposed the use of a composite ‘‘genetic risk score’’ for risk assessment. However, it remains to be determined whether variants identified by GWAS have a role as biomarkers in risk assessment and clinical decision making. Overall, these data do not support a simple additive version of the CDCV model as an explanation for the majority of the genetic component underlying risk for common disease. Research efforts have shifted to exploration of less frequent variants in common disorders. It has been noted for decades that the mutational changes that underlie rare and highly penetrant Mendelian disease may share features with genetic factors that underlie more common forms of the disease (Boerwinkle and Utermann, 1988; Goldstein and Brown, 2001). Clearly, the relationship between rare and common disease is not a simple one, but there are emerging examples wherein specific loci that cause ‘‘Mendelian disease’’ are contributing to the background risk to a parallel common disorder. Although a common disease/rare variant (CDRV) hypothesis is attractive, it demands tenable and complete explanations as to how the functional roles of individual alleles can work to produce the ultimate phenotypic effects. The models must consider the range of variant types, including single-base or simple-nucleotide variants (SNV), short insertions or deletions (indels), structural variants, and copy-number variants (CNV), the penetrance of individual alleles, and allelic and locus interactions (dominance and epistasis, respectively) and show how these all combine to produce the population frequency and the phenotypic complexity of different disorders. Fortunately, the current state of knowledge of key examples supports models that close the gap between complex and Mendelian traits.
These examples show how mutations in single genes can fulfill the definition of Mendelian disease but, in a different context, form part of the menu of causal contributors to complex disorders (Greeley et al., 2010; Voight et al., 2010). As we begin to observe instances wherein variation at more than one locus contributes to perturbations of networks and ultimate phenotype, the relevance of assessing genome-wide variation becomes more apparent. Thus, inferences about individual mutation burden by geneticists in the last century are now open to direct observation (Muller, 1950).
Variation and Disease Susceptibility: We Are All Truly Unique
Interpreting the interplay between different types of variation and their contribution to disease depends heavily on our understanding of the normal patterns of genetic variation. For rare variants, this has been a particular challenge, as highly accurate data need to be generated from many samples in order to properly determine the frequency and population distribution of the genetic variants. To illustrate this: the successful HapMap project (International HapMap 3 Consortium, 2010; International HapMap Consortium, 2005; Frazer et al., 2007) that provided an early survey of single-base variation across major human populations cataloged only a fraction of the genetic variation above a frequency of 5%. Even the 1000 Genomes Project pilot studies comprehensively captured only variation at greater than 1% frequency (1000 Genomes Project Consortium, 2010).
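The dependence of rare-variant discovery on sample size can be illustrated with a simple binomial calculation. The sketch below is not from the text: it assumes that chromosomes are sampled independently, that each individual contributes two chromosomes, and that sequencing is error-free. The function name and the sample sizes are hypothetical; the frequencies echo the 5% and 1% thresholds mentioned above.

```python
def detection_probability(allele_freq: float, n_individuals: int) -> float:
    """Chance that a variant appears at least once among 2n sampled chromosomes."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if n_individuals < 0:
        raise ValueError("number of individuals must be non-negative")
    n_chromosomes = 2 * n_individuals  # diploid samples
    # Probability of missing the variant on every chromosome is (1 - p)^(2n).
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


# Hypothetical cohort sizes, chosen only for illustration.
for freq in (0.05, 0.01, 0.001):
    for n in (50, 500):
        p_seen = detection_probability(freq, n)
        print(f"freq={freq:<6} n={n:<4} P(seen at least once)={p_seen:.3f}")
```

Seeing a variant once is a weaker requirement than estimating its frequency accurately, so these probabilities are an optimistic bound on what a given sample size can deliver.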
Our view of the site frequency spectrum of these rare variants (<1%) has been more influenced recently as a result of the generation of personal genome data using whole-exome sequencing and whole-genome sequencing (Gonzaga-Jauregui et al., 2011). The number of diploid human genome sequences available for analyses is growing rapidly. Remarkably, from the small number determined and publicly available to date, it is apparent that even more genetic variation exists between individuals than was previously expected (Ahn et al., 2009; Bentley et al., 2008; Kim et al., 2009; Levy et al., 2007; Lupski et al., 2010; Schuster et al., 2010; Wang et al., 2008; Wheeler et al., 2008). When compared with the haploid reference, each individual human genome on average contains some three and a half million SNV and about 1,000 CNV (>450 bp) (Conrad et al., 2010), many of which appear to be rare in the population from which the individual was sampled. In addition, each individual personal genome sequence still reveals 200,000–500,000 SNV that have not been observed in other publicly available personal genomes, many of which may be unique to that individual’s family or clan. In parallel, recent studies that deeply sequence relatively large samples (hundreds to thousands of individuals) show that the rate of identification of variants that have not been seen (private variants) continues unabated with every new individual sampled (Coventry et al., 2010; Turner et al., 2008). The extent of some of this nucleotide variation may have been anticipated from human genetic studies during the previous three decades that established that an SNV occurs about every 1 Kb; however, the extent of rare and ‘‘private’’ SNV was not anticipated, and the extent of CNV was unexpected. 
There are technical limitations to some of these studies, including a background of variation introduced in cultured cells, as well as in the mutation detection methods themselves, particularly for CNV in the 100 bp to 500 bp range, low-copy repeat sequences, and simple repeats. Nevertheless, the enormous extent of private variation has been clearly established.
Rare Variants and New Mutation
A number of factors may have led to the observed skewing of the allele frequency spectrum toward rare and private variants. The explosion of human populations in the current historical epoch could, by itself, account for the short branch lengths and low frequencies of the most distal segments of human variant genealogies (Boyko et al., 2008; Coventry et al., 2010; Turner et al., 2008). In addition, secular factors that have enabled the explosion of the population, such as abundant food supplies, improved sanitation, and routine vaccinations, may directly participate in the relaxation of the most important selective pressures that have constrained the population in the past. Even the widespread availability of minimal routine health care may be artificially slowing negative selection. Dramatic reductions in maternal death and infant mortality, properly celebrated in the last 100 years, may be influencing the distribution of genetic variation and contribute to relaxed selection. Finally, mutation rate, perhaps partially driven by increased paternal age (Crow, 2008) and undiscovered environmental factors, may contribute to the observed rare variant spectra. The conceptual shift to emphasizing studies of abundant, rare, and heterogeneous variants profoundly impacts our approaches
to studying the genetic architecture of human disease, leading to a genome-wide, versus a locus- or gene-specific, emphasis. Genetic architecture here refers to the types of variation (SNV, CNV, etc., both coding and noncoding), their allele frequency distribution (common, rare, intermediate), the size of an allele’s effects, and new mutation rates. For a given individual, what is important to know is not only the number and location of pathogenic variants taken one at a time but also the unique composition of his or her genome-wide mutational burden. If this is the case, then the risk conferred by any particular allele estimated from the population risk would be much less relevant than the personal risk emerging from the total mutational burden in each individual. The shift in emphasis to a whole-genome view changes how we should consider the way in which harmful combinations of mutant alleles assemble or accumulate in each genome. Each personal genome has a collection or ‘‘ecology’’ of deleterious and protective variations, which in combination (not necessarily in sum) dictate the health of the individual. Understanding this genome ecology will be a substantial challenge in human genetics and has ramifications for the extent to which genetic information can be maximized for medical utility. Each personal genome combines inherited alleles and new variation introduced by de novo mutation. Interestingly, CNV may contribute in a significant way, from both the novel combinations inherited from each parent and the new mutations. This is the very type of variation not fully taken into account when previous mutation models were being considered. Locus-specific mutation rates for SNV are $2.0$–$2.5 \times 10^{-8}$ and have recently been shown to potentially differ in male versus female germ cells (Conrad et al., 2011); for CNV, new mutation rates can be substantially higher: between $10^{-6}$ and $10^{-4}$, 100 to 10,000 times more frequent than in SNV (Lupski, 2007a).
The latter figures implicate CNV in sporadic traits (Lupski, 2007a) including birth defects (Lu et al., 2008) and highlight the contribution of new mutation to individual mutational burden (Potocki et al., 1999). Either new or recent (i.e., arising in close relatives or ‘‘clan members’’) de novo mutations could substantially contribute to phenotypic extremes, such as birth defects and disease. Although de novo CNV have been detectable now for some time with microarray technologies, identifying smaller de novo events (e.g., SNV) has become feasible only recently with the advent of large-scale DNA sequencing technologies. Recent exome sequencing studies of family trios with patients manifesting sporadic intellectual disability (previously more frequently referred to as mental retardation [MR]) identified a high frequency of de novo mutations in ‘‘MR genes’’ (Vissers et al., 2010). Such studies support established theory that if the mutational target is large (and hence the observed gene mutation rate is high), de novo mutations may account for a high incidence of disease even when the selection coefficient is close to 1.0. These early studies suggest a resolution to the question of why the frequency of neurodevelopmental disabilities is high despite near genetic lethality for such traits. Relatedly, sequencing studies of multiple ion channel genes in patients with epilepsy (Klassen et al., 2011) and of known autism susceptibility genes in subjects with high-functioning autism (Schaaf et al., 2011) reveal many rare variants and also de novo mutations that may be contributing to disease.
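The argument that a large mutational target can sustain a high disease incidence under strong selection follows from the classical mutation-selection balance for a dominant allele, in which the equilibrium allele frequency is approximately the mutation rate divided by the selection coefficient. The sketch below is an illustration under that textbook approximation, not a calculation from the text. The per-gene mutation rate and the number of target genes are hypothetical placeholders.

```python
def dominant_incidence(mu_per_gene: float, n_genes: int, s: float) -> float:
    """Approximate equilibrium incidence of a dominant trait.

    Each gene contributes an allele frequency of about mu / s, and each
    individual carries two copies of every gene, so the incidence is about
    2 * n_genes * mu / s. Valid only while the result is small.
    """
    if not 0.0 < s <= 1.0:
        raise ValueError("selection coefficient s must lie in (0, 1]")
    if mu_per_gene < 0.0 or n_genes < 0:
        raise ValueError("mutation rate and gene count must be non-negative")
    return 2.0 * n_genes * mu_per_gene / s


def fraction_de_novo(s: float) -> float:
    """Share of affected individuals whose mutation is new rather than inherited.

    New cases arise at rate 2 * n_genes * mu per birth; dividing by the
    equilibrium incidence above leaves s.
    """
    if not 0.0 < s <= 1.0:
        raise ValueError("selection coefficient s must lie in (0, 1]")
    return s


MU = 1e-6  # hypothetical per-gene, per-generation rate of disease-causing mutation
for n_genes in (1, 1000):  # hypothetical sizes of the mutational target
    incidence = dominant_incidence(MU, n_genes, s=1.0)
    print(f"genes={n_genes:<5} incidence={incidence:.1e} de novo={fraction_de_novo(1.0):.0%}")
```

With a selection coefficient of 1.0, every case is a new mutation, and the incidence scales linearly with the number of genes in the target, which is the sense in which a large target offsets near genetic lethality.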
The concept of new mutation in X-linked lethal disorders was well established by Haldane (Haldane, 1935). However, the new mutation contribution to many human disease traits may be greater than anticipated (Hoischen et al., 2010), particularly for genetically heterogeneous conditions in which hundreds of genes could be involved but only one or a few loci are responsible in an individual patient. The developmental timing of new somatic mutations is perhaps underappreciated (Lupski, 2010) as previous studies have emphasized germline events. New mutations may occur in the germline, during any stage of development of the organism, in stem cells, or in differentiated somatic cells. Chromothripsis in cancer (Stephens et al., 2011) and complex genomic rearrangements (CGR) associated with selected genomic disorders (Liu et al., 2011) both illustrate the complex alterations of one or more genes that new mutation CNV events can bring about. In each case, a single mutational event can result in a cataclysmic chromosomal catastrophe and alter the copy number or structure of several different genes.
‘‘Clan Genomics’’
Most sites of variation have low minor allele frequencies (that is, are rare) and are of recent origin, and therefore the major contributors to inherited disease susceptibility are likely to be those alleles that arose recently in an extended pedigree. Purifying natural selection is expected to eliminate highly deleterious variants before they reach a high frequency, such that disease risk alleles with large effects should be enriched at the lower frequencies (Marth et al., 2011). The idea that there are unique combinations of rare variants characteristic of a recent family lineage and that these combinations can have a causative role in disease is encapsulated by what we refer to as ‘‘clan genomics’’ (Figure 1).
The population from which one comes and its collection of older common variants may have less influence on an individual’s disease susceptibility than the collection of recently arisen rare variants and de novo mutations (Figures 2A and 2B). The most important thing that an individual needs to consider in terms of their genetic variation with relation to disease susceptibility is therefore the recent “genetic history” of their extended pedigree or clan. From the standpoint of delivering personalized genomic medicine, the medically actionable alleles are the ones of most interest; and these may be highly weighted toward recent rare variants. Nevertheless, the most important thing is not to focus disproportionately on specific variants, but rather to integrate across all classes of risk-associated variants. In some individuals, risk may be caused by an unusual combination of common variants, whereas in others it will be due to a smaller number of large effect rare variants.

Figure 1. Clan Genomics Heat map and extended pedigree showing the conceptual relationship among de novo mutations leading to disease (red), recent mutations with moderate effects arising within a clan (yellow and green), and older common variants with small effects segregating in the population (blue). An individual’s genetic disease risk emerges from the collection of variants he or she has inherited from both parental lineages of distant ancestors (typically common and of individually small effect), more recent ancestors (rare, but potentially larger effect), and de novo mutations.

Mendelian Disease and Complex Traits

Resequencing studies of genes that can cause rare Mendelian forms of common complex traits reveal that rare variants can contribute to hypertension (Ji et al., 2008; Wagner, 2008), hypercholesterolemia (Kotowski et al., 2006), hypertriglyceridemia (Romeo et al., 2009), and nonalcoholic fatty liver disease (Romeo et al., 2008) in the population at large. These examples inform models where individual alleles with high penetrance contribute to common complex traits. In addition, when GWAS signals have identified variants for common traits, their molecular mechanistic underpinnings often support those already established by Mendelian forms of the condition (Sankaran et al., 2008, 2009; Vernimmen et al., 2009). The idea that genes responsible for Mendelian disease can also have a role in the common form of the same or a similar condition is not new. For example, the pioneering studies of Michael Brown and Joseph Goldstein showed that individuals with compound heterozygous mutations in the low-density lipoprotein receptor (LDLR) gene manifest the Mendelian disorder familial hypercholesterolemia (FH) (Brown and Goldstein, 1986; Goldstein and Brown, 1987). FH patients have extremely high cholesterol levels and can have coronary atherosclerotic heart disease and myocardial infarctions in their teenage years. Interestingly, the type of LDLR gene mutation predicts cardiovascular risk in children with familial hypercholesterolemia (Guardamagna et al., 2009). Heterozygous rare variant mutations at the LDLR locus can also cause the complex traits of early onset hypercholesterolemia, coronary atherosclerotic heart disease, and myocardial infarctions in carriers with disease manifesting in the fourth or fifth decades of life.

Recessive Mendelian Mutations Can Increase Complex Disease Risk in Carriers

Heterozygous carriers for recessive disease genes do not manifest the recessive disease but may be susceptible to a milder or related malady, which may consist of a complex trait with a similar phenotype. For example, heterozygote carriers of mutations in the ataxia telangiectasia locus are susceptible to breast
cancer (Athma et al., 1996), and similar heterozygous carrier susceptibilities are also manifest for other recessive human cancer predisposition syndromes (Heim et al., 1991). Carriers for mutations in the Gaucher disease causative gene, GBA encoding glucocerebrosidase, are at increased risk for Parkinson disease (Goker-Alpan et al., 2004; Sidransky et al., 2009). Heterozygous carriers of mutations in the cystic fibrosis transmembrane regulator gene, CFTR, can be susceptible to idiopathic pancreatitis (Cohn et al., 1998; Sharer et al., 1998; Weiss et al., 2005), chronic obstructive pulmonary disease (COPD) (Divac et al., 2004), and even chronic rhinosinusitis (Wang et al., 2000, 2005). Carriers of α-1-antitrypsin (AAT) deficiency can also be susceptible to COPD (Hersh et al., 2004; Poller et al., 1990). Interestingly, even such common traits as age-related macular degeneration (AMD) and carpal tunnel syndrome are associated with heterozygous carrier status for mutations in ABCA4, the gene responsible for Stargardt macular dystrophy (Bacq et al., 2009), and Charcot-Marie-Tooth neuropathy genes (Lupski et al., 2010), respectively (Figures 2C and 2D). In the latter case, haploinsufficiency due to either heterozygous SNV (Lupski et al., 2010) or heterozygous CNV (Del Colle et al., 2003) can convey the trait. Whereas most carrier states may have rare allele frequencies, others will actually have a significant carrier frequency in selected populations (e.g., CFTR 4% in European descendants).

Genes and Single Loci Implicated in Mendelian Disease and in Complex Disease Risk

In addition to variants that cause Mendelian disease-informing complex traits, there is a striking reciprocity of genes implicated by GWAS that are also known to underlie rare Mendelian diseases. For example, 11 of 30 genes associated with serum lipid levels are implicated in single-gene disorders of lipid metabolism (Kathiresan et al., 2009).
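The reciprocity described here amounts to asking how many genes appear both in a list of GWAS-implicated genes and in a list of genes known to cause single-gene disorders. A minimal sketch of that comparison follows. The function name is an assumption, and the gene symbols are hypothetical placeholders rather than entries from any GWAS catalog or disease gene database; real input would need curated symbols from both sources.

```python
def overlap_summary(gwas_genes, mendelian_genes):
    """Compare two gene lists after normalizing symbols to upper case."""
    gwas = {g.strip().upper() for g in gwas_genes if g.strip()}
    mendelian = {g.strip().upper() for g in mendelian_genes if g.strip()}
    shared = gwas & mendelian  # genes present in both lists
    return {
        "shared": sorted(shared),
        "n_shared": len(shared),
        "n_gwas": len(gwas),
        # Guard against an empty GWAS list to avoid division by zero.
        "fraction_of_gwas": len(shared) / len(gwas) if gwas else 0.0,
    }


# Placeholder symbols only; these are not real gene names.
toy_gwas = ["GENE_A", "GENE_B", "gene_c", "GENE_D"]
toy_mendelian = ["GENE_B", "GENE_C", "GENE_X"]
print(overlap_summary(toy_gwas, toy_mendelian))
```

Normalizing case and whitespace before intersecting is a small safeguard, because symbols exported from different resources are often formatted inconsistently. It does not resolve gene aliases, which a real analysis would have to handle separately.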
We reviewed the current listing of annotated genes with significant associations in 891 GWAS (http://www.genome.gov/gwastudies/). We found that at least 268 genes implicated by GWAS are also known to bear mutations in rare single-gene disorders. Some of these associations are intuitive, such as those associated with biochemical traits and related inborn errors of metabolism. There are also a significant number of genes that underlie developmental disorders that harbor common variants affecting risk of cancer, body growth, and cardiovascular traits (Table S1 available online). This raises the testable hypothesis that genetic influences on human diseases can largely be accounted for by a subset of genes that play roles in a restricted set of pathways. Immune and inflammatory pathways provide a robust example as do those genes involved in lipid metabolism. It is important to note that in our survey, for most cases of GWAS the causal gene underlying a given GWAS signal is unknown. Whereas GWAS can indirectly implicate “Mendelian genes” in complex disease risk, different mutations of a single gene, or a CNV at a single locus, are directly implicated in complex disease risk. A poignant example of the different phenotypic consequences of distinct allelic variants at a locus is provided by the fragile X mental retardation 1 (FMR1) locus. Triplet repeat expansion of the CGG repeat element in the 5′ untranslated region (UTR) of the FMR1 gene (an especially unstable form of indel mutation) causes severe X-linked mental retardation in both males and females. Alleles with lower numbers of CGG repeats (55–200 repeats; called premutation alleles), however, cause adult onset tremor/ataxia syndrome (FXTAS) in approximately 33% of males and 10% of females (Hagerman et al., 2004; Jacquemont et al., 2004). Thus, premutation variants that have been considered nonpathogenic can have phenotypic consequences for common complex traits such as tremor and ataxia. Rare CNV at different loci have also recently been associated with complex traits including Alzheimer disease (Rovelet-Lecrux et al., 2006), Parkinson disease (Farrer et al., 2004; Singleton et al., 2003), lupus glomerulonephritis (Aitman et al., 2006), Crohn disease (Fellermann et al., 2003, 2006; McCarroll et al., 2008), psoriasis (Hollox et al., 2008), pancreatitis (Le Maréchal et al., 2006), and obesity (Bochukova et al., 2010). Many rare CNV have also been associated with intellectual disability (Stankiewicz and Beaudet, 2007) and with some forms of neuropsychiatric illness, including schizophrenia (Consortium, 2008; Lupski, 2008; McCarthy et al., 2009; Stefansson et al., 2008) and autism (Kumar et al., 2008; Shinawi et al., 2010; Weiss et al., 2008).

Figure 2. Phenotypic Consequences of Allele Combinations This figure demonstrates “clan genomics,” wherein the combinations of alleles one inherits from his or her nearest relatives profoundly affect clinical outcome. In these illustrative pedigrees, different combinations of ABCA4 alleles can affect age-of-onset of Stargardt macular dystrophy (STGD; MIM 601691) (A), Mendelian versus multifactorial trait (i.e., Stargardt disease versus age-related macular degeneration [AMD]) (B), or retinal disease type (i.e., Stargardt disease versus retinitis pigmentosa [RP; MIM 601718]) (C). (D) Differing SH3TC2 alleles result in recessive Charcot-Marie-Tooth disease (CMT; MIM 601596), dominant axonal neuropathy, or the complex trait of carpal tunnel syndrome.

Same Gene, Different Mutations, Diseases, and Modes of Inheritance

A further illustrative example of the connection between Mendelian and complex traits is provided by variants at the MECP2
locus and their contribution to disease. Heterozygous loss-of-function SNV in MECP2 result in the X-linked dominant trait of Rett Syndrome in girls (Amir et al., 1999); however, hemizygous loss-of-function mutations are thought to be lethal in males. Recently, duplication CNV including MECP2 have been associated with an intellectual disability plus seizure disorder in males (Carvalho et al., 2009; del Gaudio et al., 2006; Friez et al., 2006; Meins et al., 2005; Van Esch et al., 2005) and autism spectrum disorder (Ramocki et al., 2009; Schaaf et al., 2011). Male patients with triplication of MECP2 have a more severe phenotype (del Gaudio et al., 2006). Of note, maternal carriers of the MECP2 duplication (CNV) appear more susceptible to psychiatric symptoms unrelated to having a child with a disability (Ramocki et al., 2009). Thus, at a single locus, the genetic variation can cause an X-linked dominant disorder in females and an X-linked recessive trait in males and can be associated with susceptibility to a common complex trait in carrier mothers. For mutations at a single locus, allelic interactions can profoundly affect clinical phenotype. At the ABCA4 locus, the disease severity is related to the residual activity of encoded transporter protein (Figures 2A–2C). Recessive Stargardt disease is caused by compound heterozygous mutations at this locus (Allikmets et al., 1997b). Homozygous or compound heterozygous mutations, if both null, result in retinitis pigmentosa. Within a single pedigree or clan, different combinations of alleles can result in differing ages of onset (Lewis et al., 1999), completely different diseases (Shroyer et al., 2001a), or both a recessive Stargardt disease and susceptibility to a complex trait, age-related macular degeneration, due to a heterozygous carrier state (Figure 2) (Shroyer et al., 1999, 2001b).

Figure 3. Models of Disease Allele Transmission (A) In classical Mendelian disease, for a recessive, monogenic disease, at that single locus there is biallelic inheritance (highlighted in box). Examples could be either Stargardt macular dystrophy or cystic fibrosis, which are both due to point mutations in ATP-binding cassette (ABC) transporter genes. However, at some loci in the human genome, imprinting results in monoallelic expression, and the disease phenotype will occur in a manner dependent on the parent of origin of the specific mutation, either by deletion copy number variants (CNV) or uniparental disomy (UPD). The example given is the Angelman syndrome with point mutations in the UBE3A gene. The CMT1A locus (17p12) represents a triallelic locus whereby because of the duplication, there are three copies of the PMP22 gene. None of the copies have point mutations in them, but it takes three copies to convey the clinical phenotype. Other examples of disease allele transmission include interactions between two or potentially more genes. In the classic model of digenic inheritance, the phenotype of retinitis pigmentosa has been shown to be due to heterozygous point mutations in the ROM1 gene in combination with heterozygous point mutations at the RDS locus. Thus there is biallelic digenic inheritance. Note that a genomic deletion CNV renders a locus monoallelic, whereas a duplication CNV results in a triallelic locus. (B) Bardet-Biedl syndrome (BBS), traditionally thought of as a recessive trait, can sometimes result from three mutant alleles, two of which come from one locus, and one from another locus. This is an example of digenic triallelic inheritance. (C) A single pedigree illustrates triallelic inheritance for BBS. Standard pedigree symbols are used; filled squares, affected with BBS. Alleles segregating at two distinct loci (BBS2 and BBS6) are shown, one in each pedigree. WT, wild-type or normal allele.

Figure 4. Totality of Pathogenic Variants, Disease Severity, and Clan Genomics Pedigrees of families segregating Charcot-Marie-Tooth (CMT) neuropathy, illustrating that disease severity is directly related to pathogenic mutational burden. (A–C) Mutations at two different CMT loci result in a more severe phenotype. These double heterozygotes may be due to either a single-nucleotide variant (SNV) + copy-number variant (CNV) (A and B) or two SNV (C) (Chung et al., 2005; Hodapp et al., 2006; Meggouh et al., 2005). (D) In a single family, disease results from homozygous MTMR2 mutation (likely related to consanguinity) or de novo CNV, the CMT1A duplication (PMP22) (Verny et al., 2004); an example of clan genomics.
Multiple Mutated Genes Underlying Clinical Phenotypes

Rare point mutations (either functional noncoding SNV [Kurotaki et al., 2005] or coding SNV with incomplete penetrance [Shy et al., 2006]) in combination with a deletion CNV have been shown to contribute together to particular phenotypes. A combination of a rare deletion CNV with a de novo duplication CNV can also result in a phenotype that appears to be a complex trait (Potocki et al., 1999). Sometimes SNV mutations at two different loci, i.e., digenic inheritance, are required to manifest a trait segregating as a recessive disease, and the mutational load required may have a single mutant allele at each of the two loci (Kajiwara et al., 1994) (double heterozygous) or two mutant alleles at one locus and one at the other (triallelic inheritance) (Figure 3) (Katsanis et al., 2001). With respect to models for Mendelian transmission, a deletion CNV renders a locus monoallelic, whereas a duplication CNV results in a triallelic locus (Figure 3). It is now well established that even simple Mendelian traits can have modifier loci (Badano and Katsanis, 2002; Dipple and McCabe, 2000), demonstrating the potential importance of nonhomologous allelic interaction and epistasis. For example, severity of disease for CMT can be due to a combination of mutations at more than one CMT locus (Chung et al., 2005; Hodapp et al., 2006; Meggouh et al., 2005) (Figures 4A–4C).
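The allele-counting logic in this section can be written down as a small sketch. It assumes an autosomal locus with two copies at baseline and a heterozygous CNV that removes or adds one copy. The function and label names are hypothetical and serve only to restate the models named in the text (monoallelic, biallelic, and triallelic loci, plus double heterozygous and triallelic two-locus inheritance).

```python
# Change in copy number caused by a heterozygous CNV at an autosomal locus.
COPY_CHANGE = {"none": 0, "deletion": -1, "duplication": 1}
STATE_NAMES = {1: "monoallelic", 2: "biallelic", 3: "triallelic"}


def allelic_state(cnv="none", baseline_copies=2):
    """Name the allelic state of a locus given the CNV that affects it."""
    if cnv not in COPY_CHANGE:
        raise ValueError(f"unknown CNV type: {cnv!r}")
    copies = baseline_copies + COPY_CHANGE[cnv]
    return STATE_NAMES.get(copies, f"{copies} copies")


def two_locus_model(mutant_alleles_locus1, mutant_alleles_locus2):
    """Label the mutational load carried across two loci."""
    counts = sorted((mutant_alleles_locus1, mutant_alleles_locus2))
    if counts == [1, 1]:
        return "double heterozygous"  # one mutant allele at each locus
    if counts == [1, 2]:
        return "triallelic"  # two mutant alleles at one locus, one at the other
    return "other"


print(allelic_state("deletion"), allelic_state("duplication"))
print(two_locus_model(1, 1), two_locus_model(2, 1))
```

The sketch only counts alleles. It says nothing about penetrance, modifier loci, or which combinations are pathogenic, all of which depend on the specific genes involved.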
Figure 5. Bridging the Gap between Chromosomal Syndromes and Mendelian Disorders (A) Chromosomal duplication mapping wherein chromosomally visible duplication abnormalities, as evidenced by altered G-banding patterns, are used to delineate the portion of the genome responsible for the reduced motor nerve conduction velocities that accompany the demyelinating form of Charcot-Marie-Tooth disease (CMT1A; MIM 118220). Several different chromosomal abnormalities have been reported in association with a CMT1 phenotype. Note, different chromosome 17 abnormalities including direct duplications, inverted duplications, and inherited as well as de novo translocations have been reported with complex phenotypes that include CMT. If the duplicated genomic interval encompasses the 17p12 dark G-band where the PMP22 gene maps (*), then the patient will have a demyelinating neuropathy, as evidenced by decreased motor nerve conduction velocities, as part of their clinical phenotype. (B) Submicroscopic genomic rearrangements associated with neuropathy. Vertical lines represent a “blow-up” of the genomic interval within 17p12 containing the PMP22 gene (filled rectangle). The horizontal parentheses delimit the rearranged interval for the common deletion (depicted by absence of vertical line) and duplication (two copies of gene and interval). To the right are rare-sized copy-number variants (CNV) depicting genomic deletion (green dots on array CGH) versus duplication (red dots on array). (C) shows genotype/phenotype correlations between PMP22 point mutations associated with neuropathy. The T118M missense amino acid substitution in PMP22 appears to be a reduced penetrance loss-of-function allele. As a heterozygous mutation it can result in a mild hereditary neuropathy with liability to pressure palsies (HNPP) phenotype in some individuals; as a homozygous allele it can convey a severe axonal neuropathy. Interestingly, when the T118M allele occurs in combination with the HNPP deletion, a severe demyelinating phenotype results. Of further interest, when the T118M allele occurs in combination with the CMT1A duplication, the loss-of-function missense amino acid substitution appears to mitigate some of the consequences of the gain-of-function duplication CNV.
Large CNV or Aneuploidy Can Simultaneously Affect Multiple Genes

In contrast to the trans-genetics of Mendelism (Figures 2–4), genetic interactions occurring on the same chromosome or in cis (Figures 5A–5B) can also have profound consequences as exemplified at the alpha globin locus. For structural variants, the genomic mutational load can reflect the size of the CNV and inclusion of additional dosage-sensitive genes or genomic segments in cis (Bi et al., 2009; Lupski et al., 1991, 1992; Roa et al., 1996). Two extreme examples of this “cis-genetics” effect are segmental aneuploidy (Figure 5) and complete aneuploidy (e.g., trisomy 21) that convey complex phenotypes related to the size of the CNV and number of dosage-sensitive genes and/or genomic segments involved. For Down syndrome associated with trisomy 21, this includes an endophenotype of early onset Alzheimer disease; the amyloid precursor protein (APP) gene maps to chromosome 21, and duplications involving this gene have indeed been associated with Alzheimer disease (Rovelet-Lecrux et al., 2006). For intellectual disability, recent studies suggest the possibility that two independent CNV (El-Hattab et al., 2010; Girirajan et al., 2010; Potocki et al., 1999) can contribute to the ultimate phenotype, as shown in individual patients and as predicted by previous models (Lupski, 2007b). In aggregate these data show that rare variants and the genome-wide totality of pathogenic alleles contribute to complex traits (Allikmets, 2000; Allikmets et al., 1997a; Douros
et al., 2008; Hersh et al., 2004; Lupski, 2007b; Poller et al., 1990; Wittrup et al., 1997, 2006). Unfortunately, such rare variants are not being accounted for in many current GWAS, and CNV and noncoding SNV are not detected by typical whole-exome sequencing approaches.

A Unified Genetic Model for Human Disease

In the past, focused, locus-specific, single-gene analyses have elucidated genetic etiologies for disease, but it is now emerging that whole-genome sequencing will produce a more complete assessment of genetic variation contributing to personal health. The genome of each individual contains the inherited contribution of common variants that segregate within the population, the inherited contributions of rare variants that emerged in recent history in the clan, the new combinations of such recently arising variants from both parents, and the new mutation contributions yielding the total mutational burden (Figure 1). Highly penetrant rare variants, and often de novo mutations, contribute medically actionable alleles to Mendelian disease and perhaps extremes of phenotypes in common disease. Common variants can contribute to medically actionable variants for pharmacogenomics traits. What emerges is a unified picture whereby previously distinct entities or categories of human diseases, chromosomal syndromes, genomic disorders, Mendelian traits, and common diseases or complex traits, can now be considered as part of one continuum (Figure 6), whereby common and rare variants including de novo mutations in the context of environmental influences result in perturbation of the biological balance of a restricted set of networks activating final common pathways that ultimately cause disease. Even though there may be many loci that contribute to interindividual inherited susceptibility of a phenotype in a population, in any one individual rare or common variants from just a few may be responsible for the trait (i.e., oligogenic inheritance). Extreme genetic heterogeneity and the contributions of new mutation may underlie some of the apparent complexity of complex traits. A unified genetic model for human disease breaks down the artificial boundaries between categories of human disease (Figure 6). It views all human disease categories including complex traits, Mendelian disease, genomic disorders, and even chromosomal syndromes as representing a spectrum of phenotypic manifestations reflecting the totality of pathogenic variants: ancestral alleles, those arising in recent ancestors (clan), unique combinations inherited from parents, and de novo variants (Figure 1). A full accounting of individual mutational load genome-wide and expansion of the current genocentric, locus-specific model opens the door to reinvestigation of classic problems in human genetics. These challenges include understanding the molecular basis of incomplete penetrance and variable expressivity of monogenic traits, clinical manifestations of “recessive alleles” (i.e., weak semidominance), homologous allelic interaction and nonhomologous allelic interaction, and their effects on disease and health. This new synthesis is required to interpret the ecology of individual genomes in the context of complete individual genetic variation data, population genetics, and evolution. Genome-wide assays including whole-genome sequencing, copy-number arrays, and transcriptional profiling are among the current technologies that can be used to further explore and test the “genome-wide totality of pathogenic variants” hypothesis. These genome analysis methods can now generate a massive data flow, opening up to experimental exploration fundamental questions that have occupied the minds of generations of scientists and philosophers. Yet, such genome-wide experimental assays alone will be insufficient. Other challenges include: How many types of variants (repeat expansions, CNV between 100 and 500 bp, etc.) are we missing with current techniques? How will we validate the phenotypic effects of variants observed in a single individual or family? What analytical approaches should clinical genome sequencing projects adopt given the sheer complexity of some of the gene-disease associations described herein? How can we integrate disease risk emerging from common and rare variants in an individual genome? Can disease phenotypes be refined and redefined by molecular correlates such as gene expression, chromatin conformation, DNA methylation, and all of the other 'omics? Can individual serial observation of molecular phenotypes, much as we currently do for routine lab measures such as glucose and lipids, show us stronger effects of underlying genetic variation that are otherwise poorly captured by cross-sectional studies and lead us to yet new models?

Figure 6. A Continuum for the Genetics of Human Disease The square (center) represents genomic variation that can influence the different categories of genetic disease. The circles represent the overlapping categories of human disease with darker regions depicting intersection with greater overlap in the underlying genetic influences on these given disease categories. A unified model for human genetic disease proposes that all major categories of disease with genetic influence (Mendelian disease, common disease or complex traits, genomic disorders, and chromosomal syndromes) can be explained by variation in DNA sequence (SNV) or copy number (CNV) from a “wild-type” diploid state. Whereas trans-genetic interactions at a single locus (alleles) or between loci may contribute to Mendelian disease and complex traits, cis-genetic interactions can be important to phenotypic manifestations in genomic disorders (CNV) and chromosomal syndromes (segmental aneuploidy). Digenic and triallelic inheritance bridge Mendelian traits and complex disease; each represents an oligogenic inheritance model.

SUPPLEMENTAL INFORMATION
Supplemental Information includes one table and can be found with this article online at doi:10.1016/j.cell.2011.09.008.

ACKNOWLEDGMENTS

This work was supported in part by the National Human Genome Research Institute (5 U54 HG003273) to R.A.G. and the National Institute of Neurological Disorders and Stroke (R01NS058529) to J.R.L. J.R.L. is a consultant for Athena Diagnostics, has stock ownership in 23andMe and Ion Torrent Systems, and is a coinventor on multiple United States and European patents for DNA diagnostics. R.A.G. and J.W.B. are founding shareholders in SeqWright, Inc. The Department of Molecular and Human Genetics derives revenue from clinical testing by high-resolution human genome analyses.

REFERENCES

1000 Genomes Project Consortium. (2010). A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073.
Ahn, S.M., Kim, T.H., Lee, S., Kim, D., Ghang, H., Kim, D.S., Kim, B.C., Kim, S.Y., Kim, W.Y., Kim, C., et al. (2009). The first Korean genome sequence and analysis: full genome sequencing for a socio-ethnic group. Genome Res. 19, 1622–1629.
Aitman, T.J., Dong, R., Vyse, T.J., Norsworthy, P.J., Johnson, M.D., Smith, J., Mangion, J., Roberton-Lowe, C., Marshall, A.J., Petretto, E., et al. (2006). Copy number polymorphism in Fcgr3 predisposes to glomerulonephritis in rats and humans. Nature 439, 851–855.
Allikmets, R.; The International ABCR Screening Consortium. (2000). Further evidence for an association of ABCR alleles with age-related macular degeneration. Am. J. Hum. Genet. 67, 487–491.
Allikmets, R., Shroyer, N.F., Singh, N., Seddon, J.M., Lewis, R.A., Bernstein, P.S., Peiffer, A., Zabriskie, N.A., Li, Y., Hutchinson, A., et al. (1997a). Mutation of the Stargardt disease gene (ABCR) in age-related macular degeneration. Science 277, 1805–1807.
Allikmets, R., Singh, N., Sun, H., Shroyer, N.F., Hutchinson, A., Chidambaram, A., Gerrard, B., Baird, L., Stauffer, D., Peiffer, A., et al. (1997b). A photoreceptor cell-specific ATP-binding transporter gene (ABCR) is mutated in recessive Stargardt macular dystrophy. Nat. Genet. 15, 236–246.
Amir, R.E., Van den Veyver, I.B., Wan, M., Tran, C.Q., Francke, U., and Zoghbi, H.Y. (1999). Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2. Nat. Genet. 23, 185–188.
Athma, P., Rappaport, R., and Swift, M. (1996). Molecular genotyping shows that ataxia-telangiectasia heterozygotes are predisposed to breast cancer. Cancer Genet. Cytogenet. 92, 130–134.
Bacq, Y., Gendrot, C., Perrotin, F., Lefrou, L., Chrétien, S., Vie-Buret, V., Brechot, M.C., and Andres, C.R. (2009). ABCB4 gene mutations and single-nucleotide polymorphisms in women with intrahepatic cholestasis of pregnancy. J. Med. Genet. 46, 711–715.
Badano, J.L., and Katsanis, N. (2002). Beyond Mendel: an evolving view of human genetic disease transmission. Nat. Rev. Genet. 3, 779–789.
Bentley, D.R., Balasubramanian, S., Swerdlow, H.P., Smith, G.P., Milton, J., Brown, C.G., Hall, K.P., Evers, D.J., Barnes, C.L., Bignell, H.R., et al. (2008). Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53–59.
Bi, W., Sapir, T., Shchelochkov, O.A., Zhang, F., Withers, M.A., Hunter, J.V., Levy, T., Shinder, V., Peiffer, D.A., Gunderson, K.L., et al. (2009). Increased LIS1 expression affects human and mouse brain development. Nat. Genet. 41, 168–177.
Bochukova, E.G., Huang, N., Keogh, J., Henning, E., Purmann, C., Blaszczyk, K., Saeed, S., Hamilton-Shield, J., Clayton-Smith, J., O’Rahilly, S., et al. (2010). Large, rare chromosomal deletions associated with severe early-onset obesity. Nature 463, 666–670.
Boerwinkle, E., and Utermann, G. (1988). Simultaneous effects of the apolipoprotein E polymorphism on apolipoprotein E, apolipoprotein B, and cholesterol metabolism. Am. J. Hum. Genet. 42, 104–112.
Boyko, A.R., Williamson, S.H., Indap, A.R., Degenhardt, J.D., Hernandez, R.D., Lohmueller, K.E., Adams, M.D., Schmidt, S., Sninsky, J.J., Sunyaev, S.R., et al. (2008). Assessing the evolutionary impact of amino acid mutations in the human genome. PLoS Genet. 4, e1000083.
Brown, M.S., and Goldstein, J.L. (1986). A receptor-mediated pathway for cholesterol homeostasis. Science 232, 34–47.
Carvalho, C.M., Zhang, F., Liu, P., Patel, A., Sahoo, T., Bacino, C.A., Shaw, C., Peacock, S., Pursley, A., Tavyev, Y.J., et al. (2009). Complex rearrangements in patients with duplications of MECP2 can occur by fork stalling and template switching. Hum. Mol. Genet. 18, 2188–2203.
Chung, K.W., Sunwoo, I.N., Kim, S.M., Park, K.D., Kim, W.K., Kim, T.S., Koo, H., Cho, M., Lee, J., and Choi, B.O. (2005). Two missense mutations of EGR2 R359W and GJB1 V136A in a Charcot-Marie-Tooth disease family. Neurogenetics 6, 159–163.
Cohn, J.A., Friedman, K.J., Noone, P.G., Knowles, M.R., Silverman, L.M., and Jowell, P.S. (1998). Relation between mutations of the cystic fibrosis gene and idiopathic pancreatitis. N. Engl. J. Med. 339, 653–658.
Conrad, D.F., Pinto, D., Redon, R., Feuk, L., Gokcumen, O., Zhang, Y., Aerts, J., Andrews, T.D., Barnes, C., Campbell, P., et al; Wellcome Trust Case Control Consortium. (2010). Origins and functional impact of copy number variation in the human genome. Nature 464, 704–712.
Conrad, D.F., Keebler, J.E., DePristo, M.A., Lindsay, S.J., Zhang, Y., Casals, F., Idaghdour, Y., Hartl, C.L., Torroja, C., Garimella, K.V., et al; 1000 Genomes Project. (2011). Variation in genome-wide mutation rates within and between human families. Nat. Genet. 43, 712–714.
Consortium, I.S.; International Schizophrenia Consortium. (2008). Rare chromosomal deletions and duplications increase risk of schizophrenia. Nature 455, 237–241.
Coventry, A., Bull-Otterson, L.M., Liu, X., Clark, A.G., Maxwell, T.J., Crosby, J., Hixson, J.E., Rea, T.J., Muzny, D.M., Lewis, L.R., et al. (2010). Deep resequencing reveals excess rare recent variants consistent with explosive population growth. Nat. Commun. 1, 131.
Crow, J.F. (2008). Maintaining evolvability. J. Genet. 87, 349–353.
Del Colle, R., Fabrizi, G.M., Turazzini, M., Cavallaro, T., Silvestri, M., and Rizzuto, N. (2003). Hereditary neuropathy with liability to pressure palsies: electrophysiological and genetic study of a family with carpal tunnel syndrome as only clinical manifestation. Neurol. Sci. 24, 57–60.
del Gaudio, D., Fang, P., Scaglia, F., Ward, P.A., Craigen, W.J., Glaze, D.G., Neul, J.L., Patel, A., Lee, J.A., Irons, M., et al. (2006). Increased MECP2 gene copy number as the result of genomic duplication in neurodevelopmentally delayed males. Genet. Med. 8, 784–792.
Dipple, K.M., and McCabe, E.R. (2000). Phenotypes of patients with “simple” Mendelian disorders are complex traits: thresholds, modifiers, and systems dynamics. Am. J. Hum. Genet. 66, 1729–1735.
Divac, A., Nikolic, A., Mitic-Milikic, M., Nagorni-Obradovic, L., Petrovic-Stanojevic, N., Dopudja-Pantic, V., Nadaskic, R., Savic, A., and Radojkovic, D. (2004). High frequency of the R75Q CFTR variation in patients with chronic obstructive pulmonary disease. J. Cyst. Fibros. 3, 189–191.
Douros, K., Loukou, I., Doudounakis, S., Tzetis, M., Priftis, K.N., and Kanavakis, E. (2008). Asthma and pulmonary function abnormalities in heterozygotes for cystic fibrosis transmembrane regulator gene mutations. Int. J. Clin. Exp. Med. 1, 345–349.
El-Hattab, A., Zhang, F., Maxim, R., Christensen, K.M., Ward, J.C., Scaglia, F., Lupski, J.R., and Cheung, S.W. (2010). Deletion and duplication of 15q24: molecular mechanisms and potential modification by additional copy number variants. Genet. Med. 12, 573–586.
Farrer, M., Kachergus, J., Forno, L., Lincoln, S., Wang, D.S., Hulihan, M., Maraganore, D., Gwinn-Hardy, K., Wszolek, Z., Dickson, D., and Langston, J.W. (2004). Comparison of kindreds with parkinsonism and alpha-synuclein genomic multiplications. Ann. Neurol. 55, 174–179.
Fellermann, K., Wehkamp, J., Herrlinger, K.R., and Stange, E.F. (2003). Crohn’s disease: a defensin deficiency syndrome? Eur. J. Gastroenterol. Hepatol. 15, 627–634.
Fellermann, K., Stange, D.E., Schaeffeler, E., Schmalzl, H., Wehkamp, J., Bevins, C.L., Reinisch, W., Teml, A., Schwab, M., Lichter, P., et al. (2006). A chromosome 8 gene-cluster polymorphism with low human beta-defensin 2 gene copy number predisposes to Crohn disease of the colon. Am. J. Hum. Genet. 79, 439–448.
Frazer, K.A., Ballinger, D.G., Cox, D.R., Hinds, D.A., Stuve, L.L., Gibbs, R.A., Belmont, J.W., Boudreau, A., Hardenbol, P., Leal, S.M., et al; International HapMap Consortium. (2007). A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861.
Friez, M.J., Jones, J.R., Clarkson, K., Lubs, H., Abuelo, D., Bier, J.A., Pai, S., Simensen, R., Williams, C., Giampietro, P.F., et al. (2006). Recurrent infections, hypotonia, and mental retardation caused by duplication of MECP2 and adjacent region in Xq28. Pediatrics 118, e1687–e1695.
40 Cell 147, September 30, 2011 ª2011 Elsevier Inc.
Girirajan, S., Rosenfeld, J.A., Cooper, G.M., Antonacci, F., Siswara, P., Itsara, A., Vives, L., Walsh, T., McCarthy, S.E., Baker, C., et al. (2010). A recurrent 16p12.1 microdeletion supports a two-hit model for severe developmental delay. Nat. Genet. 42, 203–209. Goker-Alpan, O., Schiffmann, R., LaMarca, M.E., Nussbaum, R.L., McInerneyLeo, A., and Sidransky, E. (2004). Parkinsonism among Gaucher disease carriers. J. Med. Genet. 41, 937–940. Goldstein, J.L., and Brown, M.S. (1987). Regulation of low-density lipoprotein receptors: implications for pathogenesis and therapy of hypercholesterolemia and atherosclerosis. Circulation 76, 504–507. Goldstein, J.L., and Brown, M.S. (2001). Molecular medicine. The cholesterol quartet. Science 292, 1310–1312. Gonzaga-Jauregui, C., Lupski, J.R., and Gibbs, R. (2011). Human genome sequencing in health and disease. Ann. Rev. Med. 10.1146/annurev-med051010-162644.
Greeley, S.A., Tucker, S.E., Worrell, H.I., Skowron, K.B., Bell, G.I., and Philipson, L.H. (2010). Update in neonatal diabetes. Curr. Opin. Endocrinol. Diabetes Obes. 17, 13–19. Guardamagna, O., Restagno, G., Rolfo, E., Pederiva, C., Martini, S., Abello, F., Baracco, V., Pisciotta, L., Pino, E., Calandra, S., et al. (2009). The type of LDLR gene mutation predicts cardiovascular risk in children with familial hypercholesterolemia. J. Pediatr. 155, 199–204, e192. Hagerman, R.J., Leavitt, B.R., Farzin, F., Jacquemont, S., Greco, C.M., Brunberg, J.A., Tassone, F., Hessl, D., Harris, S.W., Zhang, L., et al. (2004). Fragile-X-associated tremor/ataxia syndrome (FXTAS) in females with the FMR1 premutation. Am. J. Hum. Genet. 74, 1051–1056. Haldane, J.B.S. (1935). The rate of spontaneous mutation of a human gene. J. Genet. 31, 317–326. Heim, R.A., Lench, N.J., and Swift, M. (1991). Heterozygous manifestations in four autosomal recessive human cancer-prone syndromes: ataxia telangiectasia, xeroderma pigmentosum, Fanoni anemia, and Bloom syndrome. Mutat. Res. 284, 25–36. Hersh, C.P., Dahl, M., Ly, N.P., Berkey, C.S., Nordestgaard, B.G., and Silverman, E.K. (2004). Chronic obstructive pulmonary disease in alpha1-antitrypsin PI MZ heterozygotes: a meta-analysis. Thorax 59, 843–849. Hodapp, J.A., Carter, G.T., Lipe, H.P., Michelson, S.J., Kraft, G.H., and Bird, T.D. (2006). Double trouble in hereditary neuropathy: concomitant mutations in the PMP-22 gene and another gene produce novel phenotypes. Arch. Neurol. 63, 112–117. Hoischen, A., van Bon, B.W., Gilissen, C., Arts, P., van Lier, B., Steehouwer, M., de Vries, P., de Reuver, R., Wieskamp, N., Mortier, G., et al. (2010). De novo mutations of SETBP1 cause Schinzel-Giedion syndrome. Nat. Genet. 42, 483–485. Hollox, E.J., Huffmeier, U., Zeeuwen, P.L., Palla, R., Lascorz, J., RodijkOlthuis, D., van de Kerkhof, P.C., Traupe, H., de Jongh, G., den Heijer, M., et al. (2008). 
Psoriasis is associated with increased beta-defensin genomic copy number. Nat. Genet. 40, 23–25. International HapMap 3 Consortium. (2010). Integrating common and rare genetic variation in diverse human populations. Nature 467, 52. International HapMap Consortium. (2005). A haplotype map of the human genome. Nature 437, 1299–1320. Jacquemont, S., Hagerman, R.J., Leehey, M.A., Hall, D.A., Levine, R.A., Brunberg, J.A., Zhang, L., Jardini, T., Gane, L.W., Harris, S.W., et al. (2004). Penetrance of the fragile X-associated tremor/ataxia syndrome in a premutation carrier population. JAMA 291, 460–469.
plasma levels of low-density lipoprotein cholesterol. Am. J. Hum. Genet. 78, 410–422. Kumar, R.A., KaraMohamed, S., Sudi, J., Conrad, D.F., Brune, C., Badner, J.A., Gilliam, T.C., Nowak, N.J., Cook, E.H., Jr., Dobyns, W.B., and Christian, S.L. (2008). Recurrent 16p11.2 microdeletions in autism. Hum. Mol. Genet. 17, 628–638. Kurotaki, N., Shen, J.J., Touyama, M., Kondoh, T., Visser, R., Ozaki, T., Nishimoto, J., Shiihara, T., Uetake, K., Makita, Y., et al. (2005). Phenotypic consequences of genetic variation at hemizygous alleles: Sotos syndrome is a contiguous gene syndrome incorporating coagulation factor twelve (FXII) deficiency. Genet. Med. 7, 479–483. Le Mare´chal, C., Masson, E., Chen, J.M., Morel, F., Ruszniewski, P., Levy, P., and Fe´rec, C. (2006). Hereditary pancreatitis caused by triplication of the trypsinogen locus. Nat. Genet. 38, 1372–1374. Levy, S., Sutton, G., Ng, P.C., Feuk, L., Halpern, A.L., Walenz, B.P., Axelrod, N., Huang, J., Kirkness, E.F., Denisov, G., et al. (2007). The diploid genome sequence of an individual human. PLoS Biol. 5, e254. Lewis, R.A., Shroyer, N.F., Singh, N., Allikmets, R., Hutchinson, A., Li, Y., Lupski, J.R., Leppert, M., and Dean, M. (1999). Genotype/Phenotype analysis of a photoreceptor-specific ATP-binding cassette transporter gene, ABCR, in Stargardt disease. Am. J. Hum. Genet. 64, 422–434. Liu, P., Erez, A., Nagamani, S.C.S., Dhar, S.U., Kolodziejska, K.E., Dharmadhikari, A.V., Cooper, M.L., Wiszniewska, J., Zhang, F., Withers, M.A., Bacino, C.A., et al. (2011). Chromosome catastrophes involve replication mechanisms generating complex genomic rearrangements. Cell 146, 889–903. Lu, X.Y., Phung, M.T., Shaw, C.A., Pham, K., Neil, S.E., Patel, A., Sahoo, T., Bacino, C.A., Stankiewicz, P., Kang, S.H., et al. (2008). Genomic imbalances in neonates with birth defects: high detection rates by using chromosomal microarray analysis. Pediatrics 122, 1310–1318. Lupski, J.R. (2007a). Genomic rearrangements and sporadic disease. 
Nat. Genet. Suppl. 39, S43–S47. Lupski, J.R. (2007b). Structural variation in the human genome. N. Engl. J. Med. 356, 1169–1171. Lupski, J.R. (2008). Schizophrenia: Incriminating genomic evidence. Nature 455, 178–179. Lupski, J.R. (2010). New mutations and intellectual function. Nat. Genet. 42, 1036–1038. Lupski, J.R., Wise, C.A., Kuwano, A., Pentao, L., Parke, J.T., Glaze, D.G., Ledbetter, D.H., Greenberg, F., and Patel, P.I. (1992). Gene dosage is a mechanism for Charcot-Marie-Tooth disease type 1A. Nat. Genet. 1, 29–33.
Ji, W., Foo, J.N., O’Roak, B.J., Zhao, H., Larson, M.G., Simon, D.B., NewtonCheh, C., State, M.W., Levy, D., and Lifton, R.P. (2008). Rare independent mutations in renal salt handling genes contribute to blood pressure variation. Nat. Genet. 40, 592–599.
Lupski, J.R., de Oca-Luna, R.M., Slaugenhaupt, S., Pentao, L., Guzzetta, V., Trask, B.J., Saucedo-Cardenas, O., Barker, D.F., Killian, J.M., Garcia, C.A., et al. (1991). DNA duplication associated with Charcot-Marie-Tooth disease type 1A. Cell 66, 219–232.
Kajiwara, K., Berson, E.L., and Dryja, T.P. (1994). Digenic retinitis pigmentosa due to mutations at the unlinked peripherin/RDS and ROM1 loci. Science 264, 1604–1608.
Lupski, J.R., Reid, J.G., Gonzaga-Jauregui, C., Rio Deiros, D., Chen, D.C., Nazareth, L., Bainbridge, M., Dinh, H., Jing, C., Wheeler, D.A., et al. (2010). Whole-genome sequencing in a patient with Charcot-Marie-Tooth neuropathy. N. Engl. J. Med. 362, 1181–1191.
Kathiresan, S., Willer, C.J., Peloso, G.M., Demissie, S., Musunuru, K., Schadt, E.E., Kaplan, L., Bennett, D., Li, Y., Tanaka, T., et al. (2009). Common variants at 30 loci contribute to polygenic dyslipidemia. Nat. Genet. 41, 56–65. Katsanis, N., Ansley, S.J., Badano, J.L., Eichers, E.R., Lewis, R.A., Hoskins, B.E., Scambler, P.J., Davidson, W.S., Beales, P.L., and Lupski, J.R. (2001). Triallelic inheritance in Bardet-Biedl syndrome, a Mendelian recessive disorder. Science 293, 2256–2259. Kim, J.I., Ju, Y.S., Park, H., Kim, S., Lee, S., Yi, J.H., Mudge, J., Miller, N.A., Hong, D., Bell, C.J., et al. (2009). A highly annotated whole-genome sequence of a Korean individual. Nature 460, 1011–1015. Klassen, T., Davis, C., Goldman, A., Burgess, D., Chen, T., Wheeler, D., McPherson, J., Bourquin, T., Lewis, L., Villasana, D., et al. (2011). Exome sequencing of ion channel genes reveals complex profiles confounding personal risk assessment in epilepsy. Cell 145, 1036–1048. Kotowski, I.K., Pertsemlidis, A., Luke, A., Cooper, R.S., Vega, G.L., Cohen, J.C., and Hobbs, H.H. (2006). A spectrum of PCSK9 alleles contributes to
Manolio, T.A., Collins, F.S., Cox, N.J., Goldstein, D.B., Hindorff, L.A., Hunter, D.J., McCarthy, M.I., Ramos, E.M., Cardon, L.R., Chakravarti, A., et al. (2009). Finding the missing heritability of complex diseases. Nature 461, 747–753. Marth, G.T., Yu, F., Indap, A.R., Garimella, K., Gravel, S., Leong, W.F., TylerSmith, C., Bainbridge, M., Blackwell, T., Zheng-Bradley, X., et al. (2011). The functional spectrum of low-frequency coding variation. Genome Biol. Published online September 14 2011. 10.1186/gb-2011-12-9-r84. McCarroll, S.A., Huett, A., Kuballa, P., Chilewski, S.D., Landry, A., Goyette, P., Zody, M.C., Hall, J.L., Brant, S.R., Cho, J.H., et al. (2008). Deletion polymorphism upstream of IRGM associated with altered IRGM expression and Crohn’s disease. Nat. Genet. 40, 1107–1112. McCarthy, S.E., Makarov, V., Kirov, G., Addington, A.M., McClellan, J., Yoon, S., Perkins, D.O., Dickel, D.E., Kusenda, M., Krastoshevsky, O., et al; Wellcome Trust Case Control Consortium. (2009). Microduplications of 16p11.2 are associated with schizophrenia. Nat. Genet. 41, 1223–1227.
Cell 147, September 30, 2011 ª2011 Elsevier Inc. 41
Meggouh, F., de Visser, M., Arts, W.F., De Coo, R.I., van Schaik, I.N., and Baas, F. (2005). Early onset neuropathy in a compound form of CharcotMarie-Tooth disease. Ann. Neurol. 57, 589–591. Meins, M., Lehmann, J., Gerresheim, F., Herchenbach, J., Hagedorn, M., Hameister, K., and Epplen, J.T. (2005). Submicroscopic duplication in Xq28 causes increased expression of the MECP2 gene in a boy with severe mental retardation and features of Rett syndrome. J. Med. Genet. 42, e12. Morrison, A.C., Bare, L.A., Chambless, L.E., Ellis, S.G., Malloy, M., Kane, J.P., Pankow, J.S., Devlin, J.J., Willerson, J.T., and Boerwinkle, E. (2007). Prediction of coronary heart disease risk using a genetic risk score: the Atherosclerosis Risk in Communities Study. Am. J. Epidemiol. 166, 28–35. Muller, H.J. (1950). Our load of mutations. Am. J. Hum. Genet. 2, 111–176. Poller, W., Meisen, C., and Olek, K. (1990). DNA polymorphisms of the alpha 1-antitrypsin gene region in patients with chronic obstructive pulmonary disease. Eur. J. Clin. Invest. 20, 1–7. Potocki, L., Chen, K.S., Koeuth, T., Killian, J., Iannaccone, S.T., Shapira, S.K., Kashork, C.D., Spikes, A.S., Shaffer, L.G., and Lupski, J.R. (1999). DNA rearrangements on both homologues of chromosome 17 in a mildly delayed individual with a family history of autosomal dominant carpal tunnel syndrome. Am. J. Hum. Genet. 64, 471–478. Ramocki, M.B., Peters, S.U., Tavyev, Y.J., Zhang, F., Carvalho, C.M., Schaaf, C.P., Richman, R., Fang, P., Glaze, D.G., and Lupski, J.R. (2009). Autism and other neuropsychiatric symptoms are prevalent in individuals with MeCP2 duplication syndrome. Ann. Neurol. 66, 771–782. Reich, D.E., and Lander, E.S. (2001). On the allelic spectrum of human disease. Trends Genet. 17, 502–510. Roa, B.B., Greenberg, F., Gunaratne, P., Sauer, C.M., Lubinsky, M.S., Kozma, C., Meck, J.M., Magenis, R.E., Shaffer, L.G., and Lupski, J.R. (1996). 
Duplication of the PMP22 gene in 17p partial trisomy patients with Charcot-MarieTooth type-1 neuropathy. Hum. Genet. 97, 642–649. Romeo, S., Kozlitina, J., Xing, C., Pertsemlidis, A., Cox, D., Pennacchio, L.A., Boerwinkle, E., Cohen, J.C., and Hobbs, H.H. (2008). Genetic variation in PNPLA3 confers susceptibility to nonalcoholic fatty liver disease. Nat. Genet. 40, 1461–1465. Romeo, S., Yin, W., Kozlitina, J., Pennacchio, L.A., Boerwinkle, E., Hobbs, H.H., and Cohen, J.C. (2009). Rare loss-of-function mutations in ANGPTL family members contribute to plasma triglyceride levels in humans. J. Clin. Invest. 119, 70–79. Rovelet-Lecrux, A., Hannequin, D., Raux, G., Le Meur, N., Laquerrie`re, A., Vital, A., Dumanchin, C., Feuillette, S., Brice, A., Vercelletto, M., et al. (2006). APP locus duplication causes autosomal dominant early-onset Alzheimer disease with cerebral amyloid angiopathy. Nat. Genet. 38, 24–26. Sankaran, V.G., Menne, T.F., Xu, J., Akie, T.E., Lettre, G., Van Handel, B., Mikkola, H.K., Hirschhorn, J.N., Cantor, A.B., and Orkin, S.H. (2008). Human fetal hemoglobin expression is regulated by the developmental stage-specific repressor BCL11A. Science 322, 1839–1842. Sankaran, V.G., Xu, J., Ragoczy, T., Ippolito, G.C., Walkley, C.R., Maika, S.D., Fujiwara, Y., Ito, M., Groudine, M., Bender, M.A., et al. (2009). Developmental and species-divergent globin switching are driven by BCL11A. Nature 460, 1093–1097. Schaaf, C.P., Sabo, A., Sakai, Y., Crosby, J., Muzny, D., Hawes, A., Lewis, L., Akbar, H., Varghese, R., Boerwinkle, E., et al. (2011). Oligogenic heterozygosity in individuals with high-functioning autism spectrum disorders. Hum. Mol. Genet. 20, 3366–3375. Schuster, S.C., Miller, W., Ratan, A., Tomsho, L.P., Giardine, B., Kasson, L.R., Harris, R.S., Petersen, D.C., Zhao, F., Qi, J., et al. (2010). Complete Khoisan and Bantu genomes from southern Africa. Nature 463, 943–947.
delay, behavioural problems, dysmorphism, epilepsy, and abnormal head size. J. Med. Genet. 47, 332–341. Shroyer, N.F., Lewis, R.A., Allikmets, R., Singh, N., Dean, M., Leppert, M., and Lupski, J.R. (1999). The rod photoreceptor ATP-binding cassette transporter gene, ABCR, and retinal disease: from monogenic to multifactorial. Vision Res. 39, 2537–2544. Shroyer, N.F., Lewis, R.A., Yatsenko, A.N., and Lupski, J.R. (2001a). Null missense ABCR (ABCA4) mutations in a family with stargardt disease and retinitis pigmentosa. Invest. Ophthalmol. Vis. Sci. 42, 2757–2761. Shroyer, N.F., Lewis, R.A., Yatsenko, A.N., Wensel, T.G., and Lupski, J.R. (2001b). Cosegregation and functional analysis of mutant ABCR (ABCA4) alleles in families that manifest both Stargardt disease and age-related macular degeneration. Hum. Mol. Genet. 10, 2671–2678. Shy, M.E., Scavina, M.T., Clark, A., Krajewski, K.M., Li, J., Kamholz, J., Kolodny, E., Szigeti, K., Fischer, R.A., Saifi, G.M., et al. (2006). T118M PMP22 mutation causes partial loss of function and HNPP-like neuropathy. Ann. Neurol. 59, 358–364. Sidransky, E., Nalls, M.A., Aasly, J.O., Aharon-Peretz, J., Annesi, G., Barbosa, E.R., Bar-Shira, A., Berg, D., Bras, J., Brice, A., et al. (2009). Multicenter analysis of glucocerebrosidase mutations in Parkinson’s disease. N. Engl. J. Med. 361, 1651–1661. Singleton, A.B., Farrer, M., Johnson, J., Singleton, A., Hague, S., Kachergus, J., Hulihan, M., Peuralinna, T., Dutra, A., Nussbaum, R., et al. (2003). alphaSynuclein locus triplication causes Parkinson’s disease. Science 302, 841. Stankiewicz, P., and Beaudet, A.L. (2007). Use of array CGH in the evaluation of dysmorphology, malformations, developmental delay, and idiopathic mental retardation. Curr. Opin. Genet. Dev. 17, 182–192. Stefansson, H., Rujescu, D., Cichon, S., Pietila¨inen, O.P., Ingason, A., Steinberg, S., Fossdal, R., Sigurdsson, E., Sigmundsson, T., Buizer-Voskamp, J.E., et al; GROUP. (2008). 
Large recurrent microdeletions associated with schizophrenia. Nature 455, 232–236. Stephens, P.J., Greenman, C.D., Fu, B., Yang, F., Bignell, G.R., Mudie, L.J., Pleasance, E.D., Lau, K.W., Beare, D., Stebbings, L.A., et al. (2011). Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144, 27–40. Turner, D.J., Miretti, M., Rajan, D., Fiegler, H., Carter, N.P., Blayney, M.L., Beck, S., and Hurles, M.E. (2008). Germline rates of de novo meiotic deletions and duplications causing several genomic disorders. Nat. Genet. 40, 90–95. Van Esch, H., Bauters, M., Ignatius, J., Jansen, M., Raynaud, M., Hollanders, K., Lugtenberg, D., Bienvenu, T., Jensen, L.R., Gecz, J., et al. (2005). Duplication of the MECP2 region is a frequent cause of severe mental retardation and progressive neurological symptoms in males. Am. J. Hum. Genet. 77, 442–453. Vernimmen, D., Marques-Kranc, F., Sharpe, J.A., Sloane-Stanley, J.A., Wood, W.G., Wallace, H.A., Smith, A.J., and Higgs, D.R. (2009). Chromosome looping at the human alpha-globin locus is mediated via the major upstream regulatory element (HS -40). Blood 114, 4253–4260. Verny, C., Ravise´, N., Leutenegger, A.L., Pouplard, F., Dubourg, O., Tardieu, S., Dubas, F., Brice, A., Genin, E., and LeGuern, E. (2004). Coincidence of two genetic forms of Charcot-Marie-Tooth disease in a single family. Neurology 63, 1527–1529. Vissers, L.E., de Ligt, J., Gilissen, C., Janssen, I., Steehouwer, M., de Vries, P., van Lier, B., Arts, P., Wieskamp, N., del Rosario, M., et al. (2010). A de novo paradigm for mental retardation. Nat. Genet. 42, 1109–1112.
Sharer, N., Schwarz, M., Malone, G., Howarth, A., Painter, J., Super, M., and Braganza, J. (1998). Mutations of the cystic fibrosis gene in patients with chronic pancreatitis. N. Engl. J. Med. 339, 645–652.
Voight, B.F., Scott, L.J., Steinthorsdottir, V., Morris, A.P., Dina, C., Welch, R.P., Zeggini, E., Huth, C., Aulchenko, Y.S., Thorleifsson, G., et al; MAGIC investigators; GIANT Consortium. (2010). Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat. Genet. 42, 579–589.
Shinawi, M., Liu, P., Kang, S.-H.L., Shen, J.J., Belmont, J.W., Scott, D.A., Probst, F.J., Craigen, W.J., Graham, B.H., Pursley, A., et al. (2010). Recurrent reciprocal 16p11.2 rearrangements associated with global developmental
Wagner, C.A. (2008). How much is blood pressure in the general population determined by rare mutations in renal salt-transporting proteins? J. Nephrol. 21, 632–634.
42 Cell 147, September 30, 2011 ª2011 Elsevier Inc.
Wang, J., Wang, W., Li, R., Li, Y., Tian, G., Goodman, L., Fan, W., Zhang, J., Li, J., Zhang, J., et al. (2008). The diploid genome sequence of an Asian individual. Nature 456, 60–65. Wang, X., Moylan, B., Leopold, D.A., Kim, J., Rubenstein, R.C., Togias, A., Proud, D., Zeitlin, P.L., and Cutting, G.R. (2000). Mutation in the gene responsible for cystic fibrosis and predisposition to chronic rhinosinusitis in the general population. JAMA 284, 1814–1819. Wang, X., Kim, J., McWilliams, R., and Cutting, G.R. (2005). Increased prevalence of chronic rhinosinusitis in carriers of a cystic fibrosis mutation. Arch. Otolaryngol. Head Neck Surg. 131, 237–240. Weiss, F.U., Simon, P., Bogdanova, N., Mayerle, J., Dworniczak, B., Horst, J., and Lerch, M.M. (2005). Complete cystic fibrosis transmembrane conductance regulator gene sequencing in patients with idiopathic chronic pancreatitis and controls. Gut 54, 1456–1460. Weiss, L.A., Shen, Y., Korn, J.M., Arking, D.E., Miller, D.T., Fossdal, R., Saemundsen, E., Stefansson, H., Ferreira, M.A., Green, T., et al; Autism
Consortium. (2008). Association between microdeletion and microduplication at 16p11.2 and autism. N. Engl. J. Med. 358, 667–675. Wheeler, D.A., Srinivasan, M., Egholm, M., Shen, Y., Chen, L., McGuire, A., He, W., Chen, Y.J., Makhijani, V., Roth, G.T., et al. (2008). The complete genome of an individual by massively parallel DNA sequencing. Nature 452, 872–876. Wittrup, H.H., Andersen, R.V., Tybjaerg-Hansen, A., Jensen, G.B., and Nordestgaard, B.G. (2006). Combined analysis of six lipoprotein lipase genetic variants on triglycerides, high-density lipoprotein, and ischemic heart disease: cross-sectional, prospective, and case-control studies from the Copenhagen City Heart Study. J. Clin. Endocrinol. Metab. 91, 1438–1445. Wittrup, H.H., Tybjaerg-Hansen, A., Abildgaard, S., Steffensen, R., Schnohr, P., and Nordestgaard, B.G. (1997). A common substitution (Asn291Ser) in lipoprotein lipase is associated with increased risk of ischemic heart disease. J. Clin. Invest. 99, 1606–1613.
Cell 147, September 30, 2011 ª2011 Elsevier Inc. 43
Leading Edge
Perspective

Metagenomics and Personalized Medicine

Herbert W. Virgin1,* and John A. Todd2,*
1Department of Pathology and Immunology, Department of Molecular Microbiology, and Midwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research, Washington University School of Medicine, St. Louis, MO, 63110, USA
2Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0XY, UK
*Correspondence: [email protected] (H.W.V.), [email protected] (J.A.T.)
DOI 10.1016/j.cell.2011.09.009
The microbiome is a complex community of Bacteria, Archaea, Eukarya, and viruses that infect humans and live in our tissues. It contributes the majority of genetic information to our metagenome and, consequently, influences our resistance and susceptibility to diseases, especially common inflammatory diseases, such as type 1 diabetes, ulcerative colitis, and Crohn’s disease. Here we discuss how host-gene-microbial interactions are major determinants for the development of these multifactorial chronic disorders and, thus, for the relationship between genotype and phenotype. We also explore how genome-wide association studies (GWAS) on autoimmune and inflammatory diseases are uncovering mechanism-based subtypes for these disorders. Applying these emerging concepts will permit a more complete understanding of the etiologies of complex diseases and underpin the development of both next-generation animal models and new therapeutic strategies for targeting personalized disease phenotypes.

Recent advances in diverse areas of science and technology make this a unique time to study the genetics and pathogenesis of complex diseases, such as type 1 diabetes (T1D) and inflammatory bowel disease (IBD), which includes Crohn’s disease (CD) and ulcerative colitis (UC). These distinct diseases are now understood to share important common characteristics and aspects of their disease mechanisms. In all three diseases, the immune system damages tissues: T1D is likely an autoimmune disease, whereas CD and UC are likely caused by inappropriate inflammatory responses to components of our microbiome (see Box 1 for definition of key terms). Many genetic loci regulate the risk for each disease. Although a threshold dose of these susceptibility alleles provides the foundation for developing the disease, these alleles are not sufficient to cause the disease.
It has been obvious for decades that complex gene-gene and gene-environment interactions govern these diseases, but not surprisingly, untangling this web of interactions has been extremely difficult (Figure 1). Despite the failure to identify single causal agents for each disease, there is strong evidence that microbes contribute to pathogenesis. Furthermore, genome-wide association studies (GWAS), which use large study populations and careful replication of results, have effectively identified many important loci in the host that increase one’s risk for the disease, and these results have fundamentally altered how we conceptualize these diseases (Stappenbeck et al., 2011; Khor et al., 2011; Anderson et al., 2011; Franke et al., 2010; Todd, 2010). Correlation of GWAS data with genome-wide gene expression analyses (eQTLs), in combination with protein-protein interaction data, is greatly assisting the identification of candidate causal genes within these loci (Anderson et al., 2011; Franke et al., 2010; Cotsapas et al., 2011; Rossin et al., 2011; Fehrmann et al., 2011). Recently, numerous approaches
have been developed to start defining mechanisms for complex inflammatory diseases by using leads from GWAS and analyses of the microbiome. These promising approaches include the following: the introduction of mutations in GWAS-identified loci into the mouse genome (Cadwell et al., 2010; Bloom et al., 2011); the creation of induced pluripotent stem cells (iPSCs) from patients and their differentiation into relevant cell types (e.g., Rashid et al., 2010); and humanized mouse models in which the murine immune system is replaced by transplantation (e.g., Brehm et al., 2010; Esplugues et al., 2011) or human microbial communities are transplanted into formerly germ-free mice (Goodman et al., 2011).

Currently the great challenges in this field are to (1) understand how both microbiome and GWAS-identified genes contribute to disease; (2) elucidate the molecular mechanisms by which causal genes act during pathogenesis; and (3) validate biomarkers and druggable pathways via genotype-phenotype studies (e.g., Dendrou et al., 2009; Bloom et al., 2011; Cadwell et al., 2010).

By peering through the lens of recent studies on CD, UC, and T1D, this review seeks to delineate emerging concepts in research on complex inflammatory diseases and to comment on the implications of these concepts for the interpretation of genetic and pathogenetic data. Two concepts are emphasized and integrated herein: (1) that single disease diagnoses are unlikely to be single phenotypes and may instead be the sum of multiple mechanism-based disease subsets, and (2) that the interactions of individual microorganisms and their genomes with specific host genes or pathways underpin the relationship between genotype and phenotype in these complex diseases. In this view, disease genetics may be combinatorial, with different host-gene-microbial interactions contributing to the pathogenesis of disease in subsets of patients. These two interrelated concepts, therefore, define T1D, CD, and UC as metagenetic (Box 1), rather than simply “genetic,” diseases. These concepts will guide the design and interpretation of future experiments that seek to dissect the pathophysiologic mechanisms underlying a number of complex diseases and to identify more effective approaches for their treatment and prevention.

Box 1. Definition of Terms

Dysbiosis: Most commonly refers to a disruption in the normal homeostatic and beneficial relationship between microbes and their host, including disruptions in microbial community structure and function. Alterations in microbial community structure, involving Bacteria, Archaea, and/or Eukarya, can occur in any body habitat but have been best described in the gut where they have been associated with a number of disease states including, for example, inflammatory bowel disease.

GWAS: Analysis of common alleles (mostly single-nucleotide polymorphisms, SNPs) in a population that associates genetic loci with disease susceptibility. These loci contain “candidate” disease genes.

Familial clustering: If a family member is diagnosed with a disease such as type 1 diabetes, ulcerative colitis, or Crohn’s disease, then the risk of other first-degree family members is much greater (perhaps as much as 50-fold for some multifactorial disorders) than that for a person taken at random from the general population. Familial clustering is caused by a combination of inherited genetic variants from the parents to the children and shared environmental factors within the families. Susceptibility variants are being discovered rapidly by GWAS, but the environmental factors remain unknown, although numerous candidates are recognized, most particularly a role for the microbiome and infections.

Metagenetics: Approaching genetic and genomic studies by considering all of the genes in the metagenome as opposed to considering, in isolation, host genes or genes that confer particular properties (e.g., virulence or commensalism) upon an individual microbe. The history of microbial inputs into the metagenomic profile of an individual is important for identifying the causes of complex disease, requiring expensive but essential longitudinal studies, including information from maternal and gestational exposures and phenotypes.

Metagenome: As used here, metagenome is the sum of all genes and genetic elements and their modifications in the somatic and germ cells of a host plus all genes and genetic elements in all microorganisms that live on or in that host at a given time. The metagenome has transient elements (e.g., during infection with a pathogen) and more persistent elements (e.g., infection with a latent eukaryotic virus; presence of commensal bacteria).

Microbiome: As used here, the microbiome is the sum of all microbial organisms that live in or on the host at a given time. The microbiome includes members of Bacteria, Archaea, Eukarya, and the viruses of these organisms. In other articles this term may be used to refer to the genes of these organisms.

Virome: The sum of all viruses living in the tissues of the host or infecting organisms in the microbiome. These viruses may be further divided into viruses that infect members of each of the three domains of life (e.g., the bacterial virome, or bacterial phages, or the eukaryotic virome).

Host Genetic Grist for the Metagenetic Mill

Recently, meta-analyses of GWAS of large cohorts of patients of European descent with UC or CD have been performed (Franke et al., 2010; Anderson et al., 2011; Khor et al., 2011). These
studies identified 98 loci, and candidate genes within these loci, that have a putative role in IBD. Similar studies of T1D identified 53 disease susceptibility loci (Barrett et al., 2009) (http://www.t1dbase.org). Importantly, many disease susceptibility loci are shared among common autoimmune and inflammatory diseases, including T1D, Graves’ disease, celiac disease, CD, UC, psoriasis, rheumatoid arthritis, alopecia areata, multiple sclerosis, and systemic lupus erythematosus (Cotsapas et al., 2011; Khor et al., 2011). It is striking that T1D and CD share 13/52 (25%) risk loci outside the human leukocyte antigen (HLA) gene complex despite the fact that these diseases are neither thought to be related diseases nor reported to be shared within families more often than expected by chance (http://www.t1dbase.org). Notably, the candidate causal genes in these 13 susceptibility loci regulate immunity. These include (Khor et al., 2011) PTPN22, which is involved in T and B cell signaling; IL10, encoding a powerful cytokine that suppresses inflammatory responses (including in specialized T regulatory cells in the gut) (Maloy and Powrie, 2011); BACH2, which regulates B cell gene expression and possibly IgA production; TAGAP, which is involved in T cell activation; IKZF1, which negatively regulates B cells; IL2RA, which controls T regulatory lymphocyte development and function; GSDMB/GSDMA/ORMDL3, which is involved in stress responses; FUT2, which controls microbial susceptibility (Smyth et al., 2011; Franke et al., 2010; McGovern et al., 2010); and IL27, which suppresses inflammatory responses and regulates IL-10 signaling (Imielinski et al., 2009; Barrett et al., 2009). This is a remarkable concordance of involved genes for two unrelated diseases, indicating that different diseases can have common mechanistic components and that the immune system is key for both diseases.
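The 13/52 overlap quoted above is a set intersection, and a minimal Python sketch may make the bookkeeping concrete. The nine named entries are the shared candidate genes listed in the text. Every other locus label, and the number of CD-only loci, is a hypothetical placeholder added only so that the totals match the reported counts.

```python
def locus_overlap(loci_a, loci_b):
    """Return the shared loci and the fraction of loci_a that they represent."""
    a, b = set(loci_a), set(loci_b)
    shared = a & b
    return shared, len(shared) / len(a)


# Shared candidate genes named in the text (9 of the 13 shared loci).
named_shared = [
    "PTPN22", "IL10", "BACH2", "TAGAP", "IKZF1",
    "IL2RA", "GSDMB/GSDMA/ORMDL3", "FUT2", "IL27",
]

# Hypothetical placeholder labels: the text does not name these loci.
other_shared = [f"shared_locus_{i}" for i in range(4)]
t1d_only = [f"t1d_locus_{i}" for i in range(39)]
cd_only = [f"cd_locus_{i}" for i in range(10)]  # arbitrary count, for illustration

t1d_non_hla = named_shared + other_shared + t1d_only  # 52 non-HLA T1D loci
cd = named_shared + other_shared + cd_only

shared, fraction = locus_overlap(t1d_non_hla, cd)
print(len(shared), f"{fraction:.0%}")
```

The fraction is taken relative to the first argument, which mirrors the way the text reports the overlap against the 52 non-HLA T1D loci.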
However, notwithstanding all insights into disease mechanisms that the GWAS approach has already provided, the inheritance and the strong clustering of these multifactorial diseases within families (Box 1), which encompass both inherited genetic variants and intrafamilial environmental factors, remain only partially explained. Assuming a simple statistical model of gene interaction (Clayton, 2009), the numerous identified loci account for not more than 25% of the familial clustering of CD and UC (Anderson et al., 2011; Franke et al., 2010). This contrasts with T1D, in which the HLA effect is uniquely large and, together with 52 non-HLA loci, can account for almost all of the familial clustering (Clayton, 2009). For T1D, the massive effect of the HLA region, owing to functional polymorphisms in the HLA class II and class I genes, contributes almost 50% of familial clustering on its own (Clayton, 2009; Todd, 2010). There are, however, probably hundreds of non-HLA loci affecting the risk of CD, UC, and T1D that remain unmapped owing to their very small effect sizes (Barrett et al., 2009; Anderson et al., 2011; Franke et al., 2010). These putative loci will be difficult to map unless they contain rare mutations of higher penetrance, an occurrence that is just beginning to yield informative findings (Nejentsev et al., 2009; Rivas et al., 2011) and holds continued promise with the rapid use of high-throughput next-generation sequencing. In humans, the HLA locus contains a large number of genes encoding the major histocompatibility complex (MHC) molecules (which are responsible for presenting antigens to cells of the immune system), along with a number of other genes that modulate immune responses. The remarkable contribution of HLA variations (Todd, 2010) to T1D risk is an unusual feature of a common disease. Nevertheless, HLA genotypes that greatly predispose individuals to T1D are not sufficient to cause the disease because only 5% of high-risk HLA carriers develop T1D. HLAs are expressed by antigen-presenting cells (APCs), such as macrophages, B lymphocytes, and dendritic cells (DCs). DCs are highly potent APCs that reside in the pancreas and its islets (i.e., collections of insulin-producing beta cells and other endocrine cells) and could initiate the autoimmune destruction of beta cells by T cells (Calderon et al., 2011a, 2011b). Interestingly, the pancreatic lymph nodes, where DC priming of T cells for the induction of T1D may occur, also drain parts of the intestine, providing a site where the microbiome might influence the genesis of T1D (Turley et al., 2005; Wen et al., 2008). Because the insulin gene is one of the strongest non-HLA T1D susceptibility loci in the genome (Todd, 2010) (http://www.t1dbase.org), insulin and its precursors are likely primary autoantigens. These very strong associations with both HLA and this autoantigen gene are not a feature of CD or UC, in which no particular antigen is known to be targeted, hence their classification as inflammatory rather than autoimmune diseases. GWAS point to several other immunologic components of T1D etiology, including IL-2 production and receptor signaling (IL-2 gene, IL-2 receptors IL2RA [CD25] and IL2RB [CD122]; Todd, 2010), immune tolerance and T cell receptor signaling (PTPN2 [Long et al., 2011], PTPN22 [Arechiga et al., 2009; Bottini et al., 2006]), and recently, the immune response to viral infections and the type 1 interferon responses (IFIH1 [encoding MDA5], GPR183 [EBI2] [Heinig et al., 2010], TLR7 and TLR8, and FUT2 [Smyth et al., 2011]).
Figure 1. Perfect Storms for Developing Crohn’s Disease and Type 1 Diabetes A series of overlapping events and phenotypes driven by metagenetic and environmental processes that, in sum, contribute to the development and pathogenesis of type 1 diabetes (A) and Crohn’s disease (B).
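The statements about how much familial clustering a set of loci "accounts for" can be made concrete with a small calculation. The sketch below assumes one common textbook formulation, which may differ from the model actually used in the cited analyses: each biallelic locus acts multiplicatively (genotype relative risks 1, gamma, gamma squared), loci combine multiplicatively, and the fraction explained is measured on the log scale of the sibling relative risk. The function names, allele frequencies, effect sizes, and the overall sibling relative risk of 15 are all hypothetical illustration values, not data from the studies discussed here.

```python
import math


def locus_lambda_s(p: float, gamma: float) -> float:
    """Locus-specific sibling relative risk under a multiplicative model.

    p     : risk-allele frequency (hypothetical input)
    gamma : per-allele relative risk (genotype risks 1, gamma, gamma**2)
    """
    q = 1.0 - p
    # w is the ratio of the allelic-effect variance to the squared mean effect.
    w = p * q * (gamma - 1.0) ** 2 / (p * gamma + q) ** 2
    return (1.0 + w / 2.0) ** 2


def fraction_explained(loci, lambda_s_total: float) -> float:
    """Share of log(lambda_s_total) contributed by the listed loci,
    assuming locus effects multiply (an assumption of this sketch)."""
    log_sum = sum(math.log(locus_lambda_s(p, gamma)) for p, gamma in loci)
    return log_sum / math.log(lambda_s_total)


if __name__ == "__main__":
    # Hypothetical: 50 common loci, each with a modest per-allele relative risk.
    example_loci = [(0.3, 1.2)] * 50
    share = fraction_explained(example_loci, lambda_s_total=15.0)
    print(f"Fraction of familial clustering explained: {share:.1%}")
```

Under these assumptions, many small-effect loci together explain only a modest share of clustering, which is the qualitative pattern described above for CD and UC; a single locus with a large effect, as for HLA in T1D, contributes far more on this scale.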
Twenty-eight of the 71 CD risk loci (39%) are shared with UC, indicating that a set of core mechanisms participate in these diseases (Figure 2) (Khor et al., 2011). These disease genes implicate numerous processes in both CD and UC, including T cell differentiation and function, autophagy, endoplasmic reticulum stress, oxidative stress, and mucosal immune defenses, among others. There are important gene-gene and pathway-pathway interactions within this core set of processes. For example, the CD risk gene NOD2 links to autophagy through interactions with the Nod2 protein and with Atg16L1, induction of proinflammatory cytokines, control of bacterial infection, and sensing of pathogen-associated molecular patterns (Levine et al., 2011). Particularly notable are pathways involving the cytokines IL-23 and IL-12, which regulate the development of TH1 and TH17 CD4 T cells, and IL-10, which is essential in the function of certain regulatory T cells (Tregs) via its anti-inflammatory activity. Rare mutations in genes encoding IL-10 receptors confer susceptibility to early-onset IBD (Glocker et al., 2009). These genetic clues point to a key role for regulating the balance between pro- and anti-inflammatory T cells in CD and UC. The regulation of T cell differentiation is also a key target for host-gene-microbial interactions, as discussed below.
Mechanism-Based Disease Subtypes
GWAS have revealed a wealth of genes potentially involved in T1D, UC, and CD, but no single gene or set of genes is prognostic. How can we interpret this observation? Here, we argue for an important contributor to this observation: the concept that ‘‘diagnosis’’ does not equal ‘‘single phenotype.’’ Without a distinct phenotype, genetic results are often difficult to interpret. This basic principle comes into sharp focus as one considers current genetic and pathogenesis studies of CD, UC, and T1D.
Figure 2. Refining the Relationship between Genotype and Phenotype in Complex Inflammatory Diseases (A) Traditionally, a disease is considered as a single phenotype, with genes or loci conferring risk to two diseases shown as overlapping in a Venn diagram. (B) We propose a new view of the genotype-phenotype relationship in which different sets of loci are responsible for mechanistically distinct subtypes of diseases, and the sum of these subtypes constitutes the overall diagnosis. Here two disease subtypes are indicated for simplicity, but many such subtypes may exist, and sets of overlapping risk loci may be associated with these multiple mechanistically distinct disease phenotypes.
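The view in Figure 2B implies that a risk allele acting in only one subtype will look weaker when all patients with the diagnosis are pooled. The following sketch illustrates that dilution with a toy mixture calculation. All numbers (subtype fractions, allele frequencies, and the control frequency) and the function names are hypothetical and chosen only for illustration; they are not estimates for any real locus.

```python
def pooled_allele_freq(subtypes):
    """Risk-allele frequency among all cases with the diagnosis.

    subtypes: list of (fraction_of_cases, allele_freq_in_that_subtype);
    the fractions must sum to 1.
    """
    total = sum(fraction for fraction, _ in subtypes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("subtype fractions must sum to 1")
    return sum(fraction * freq for fraction, freq in subtypes)


def allelic_odds_ratio(case_freq: float, control_freq: float) -> float:
    """Allelic odds ratio comparing cases with controls."""
    return (case_freq / (1.0 - case_freq)) / (control_freq / (1.0 - control_freq))


# Hypothetical scenario: the allele is enriched only in a subtype that makes
# up 20% of cases; the remaining 80% of cases resemble controls at this locus.
control_freq = 0.50
subtypes = [(0.20, 0.70), (0.80, 0.50)]

pooled_freq = pooled_allele_freq(subtypes)
or_in_subtype = allelic_odds_ratio(0.70, control_freq)
or_in_pooled_diagnosis = allelic_odds_ratio(pooled_freq, control_freq)

print(f"odds ratio within the subtype:       {or_in_subtype:.2f}")
print(f"odds ratio in the pooled diagnosis:  {or_in_pooled_diagnosis:.2f}")
```

In this toy setting the association is strong within the subtype but small in the pooled diagnosis, which is one way that grouping mechanistically distinct subtypes could reduce the power of a genetic study.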
Why is a diagnosed ‘‘disease’’ an imprecise phenotype? It is not because patients have been misdiagnosed; the diagnoses of UC, CD, or T1D have stood the test of time to predict patient prognosis. However, we believe that there are many pathways to the same diagnosis. A diagnosis may be ‘‘clinically’’ precise but ‘‘mechanistically’’ imprecise. Thus, clinical diagnoses are poor phenotypes for genetic studies unless a single mechanism is responsible for the diagnosis, as in the case of a rare gene mutation in a monogenic disease. The complexity of GWAS results is consistent with the existence of multiple disease subtypes within T1D, UC, or CD, each based on a specific mechanism (Figure 2). Support for this idea comes from the observation that subsets of IBD patients respond differentially to mechanistically distinct interventions (Melmed and Targan, 2010). Why do diagnostic categories group different mechanistic processes under the same moniker? Over many decades, pathologists have lumped patients with similar but nonidentical clinical and pathological signs and symptoms into diagnostic categories that predict outcome and complications. Indeed, this has enormous value clinically, but it emphasizes similarities between patients in outcome rather than differences in pathways that lead to a common endpoint. Complex diseases are diagnosed by summing up multiple factors that may be causes or mere consequences of the disease process.
Disease ‘‘diagnosis’’ does not require the presence in the tissue of all of the abnormalities that may be ‘‘classically’’ seen in a given disease (Gianani et al., 2010; Odze, 2003). For example, at the polar extreme, CD is easily distinguished from UC by its classical ileal involvement (i.e., involvement of tissue at the end of the small intestine), fissures, granulomas, transmural inflammation (i.e., inflammation through the entire intestinal wall), fat wrapping of the intestine, patchy pathology, skip lesions, and patient presentation with bowel strictures or percutaneous fistulae. However, like UC, CD can be restricted to the colon, and the inflammatory infiltrates of CD and UC overlap. UC can be patchy, and the patient presentations of the two diseases can overlap extensively. Similarly, the genetics, pathology, and pathogenesis of IBD may differ between young and old patients with the same diagnosis (Imielinski et al., 2009; Odze, 2003). Even when all classical aspects of a disease are present, the mechanism responsible for the pathology observed may differ from one person to another. Based on these considerations, it is no surprise that the genetics of T1D, CD, and UC are complex because different phenotypes may have been grouped into a single analysis. This putative mechanistic heterogeneity is reflected in sometimes subtle, but quantifiable, characteristics of the disease process and pathology. Taking such differences into account can be used to identify disease subtypes that are more recognizable as molecularly defined pathological conditions and that more closely relate to specific pathogenetic mechanisms underpinned by distinct sets of genetic risk loci (Figure 3). For example, variations in the ATG16L1 gene (i.e., hypomorphic expression in the mouse and homozygosity for the T300A variant in humans) result in abnormalities in Paneth cell granules and secretion (Cadwell et al., 2008, 2010).
Paneth cells are innate immune epithelial cells positioned at the base of small intestinal crypts, where they secrete antimicrobial peptides and other factors that help shape the configuration of the intestine’s bacterial community. Abnormalities in Paneth cells are observed in the subset of CD patients homozygous for the T300A allele, thus defining a pathologic subtype of CD (Figure 2B). If one used criteria including Paneth cell abnormalities in CD diagnosis, the frequency of the ATG16L1 T300A allele would be higher in patients with the ‘‘Paneth cell subtype’’ of CD than in the CD population as a whole (Figure 3). If multiple risk loci contribute to such Paneth cell changes, one might be able to detect gene-gene interactions in this subset of patients compared to other subsets.
A similar situation exists in T1D. Biopsy specimens of the pancreas are virtually impossible to obtain. Therefore, T1D is defined clinically by the downstream consequences of destroying the insulin-secreting β cells of the pancreatic islets, namely, high blood glucose and absolute insulin dependence, rather than by the mechanisms for their destruction. It is, therefore, possible that several different pathologic processes result in this disease. T1D patients diagnosed under age 10 years frequently exhibit islet inflammation or insulitis, whereas patients diagnosed over age 10 years exhibit insulitis less frequently. More recently, this histopathological heterogeneity has become even more evident (Gianani et al., 2010), thanks to the Juvenile Diabetes Research Foundation nPOD project (http://www.jdrfnpod.org). As for CD, the diagnosis of T1D may reflect the presence of more than one pathogenetic mechanism and, thus, represent more than one disease subtype, although in T1D the HLA effects are an essential common pathway.
The concept that disease diagnoses include mechanism-based disease subtypes has many implications for interpreting human genetic studies and for understanding the relationship between the microbiome and genetic susceptibility, as discussed below. Including disease subtypes within a single diagnosis would decrease the power to define causal alleles and to detect gene-gene interactions that contribute to a single disease subtype. In this view, the difficulty of interpreting how multiple small genetic effects sum to predispose an individual to a clinical diagnosis may partly reflect insufficient precision in selection of specific phenotypes to study.
It is important to recognize that it is the power and informativeness of GWAS themselves that drive the concept of mechanism-based disease subtypes (Figure 2). In the absence of candidate genetic mechanisms for defining disease subtypes, there is limited clinical utility in focusing on low-frequency characteristics or subtypes within a larger diagnostic category that predicts patient outcome. We, therefore, argue for iterative high-precision phenotyping of patients into mechanism-based subtypes in future studies; this will allow more accurate interpretation of genetic, pathogenesis, outcome, and therapeutic studies (Figure 3). Such definitions must be iteratively reassessed as risk alleles are defined and disease mechanisms are delineated so that the field is not limited by inflexible definitions of disease that may obscure mechanistic heterogeneity. This type of approach is a necessary presage to so-called stratified or personalized medicine. The genetic and pathological complexity of T1D, CD, and UC is particularly well suited for testing whether iteratively redefining disease diagnoses can enhance the value of genetic and pathogenesis studies. Importantly, precision in disease categorization would make defining the impact of host-gene-microbial interactions on disease processes more robust.
Figure 3. The Iterative Redefinition of Mechanism-Based Disease Subtypes Here we present a conceptual workflow for breaking a broad disease diagnosis into its component subtypes by the iterative application of genetics and mechanistic studies. One output would be therapeutics based on disease subtype and patient stratification into groups more likely to respond to a given therapy or preventive strategy (A). (B) shows specific challenges for this process for type 1 diabetes and Crohn’s disease.
Host-Gene-Microbial Interactions in Metagenetics
Metazoan organisms are complex communities that include a core organism in combination with a veritable zoo of other organisms that live on or in the body—our microbiome. The microbiome includes eukaryotic viruses, Eukarya, bacterial viruses, Bacteria, Archaea, and, for many, helminths (Virgin et al., 2009; Kau et al., 2011; Garrett et al., 2010b; Spor et al., 2011). The importance of understanding the microbiome has been repeatedly emphasized, giving rise to a large number of international human microbiome projects (e.g., https://commonfund.nih.gov/hmp/, http://www.metahit.eu/) that have focused initially on the bacterial component of the microbiome. The host plus non-host genes of this polyglot and interactive community constitute our metagenome (Box 1). A critical emerging concept is that bacterial and viral interactions in the pathogenesis of inflammatory disease occur in a host gene-specific fashion (see below; Virgin et al., 2009; Cadwell et al., 2010; Bloom et al., 2011; Elinav et al., 2011). Understanding the metagenome is, therefore, highly relevant to understanding T1D, UC, CD, and other common multifactorial diseases. Intestinal bacteria play a role in driving IBD, and emerging data support a similar view for T1D (Wen et al., 2008; Giongo et al., 2011; Roesch et al., 2009). The evidence that bacteria play a role in IBD includes two major observations: that surgical diversion of the fecal stream ameliorates inflammation (Sartor, 2008), and that antibiotics help some patients. In mouse models of colitis, viruses, bacteria, or both acting together can contribute to the pathology via signaling through innate immune sensors and regulation of pro- and anti-inflammatory cytokines (Levine et al., 2011; Maloy and Powrie, 2011; Khor et al., 2011).
Figure 4. Microbe Plus Gene Interactions Determine Inflammatory Bowel Disease Phenotypes (A) Two recent studies analyzed the capacity of two different strains of murine norovirus, MNV strain CR6 versus MNV strain CW3, to trigger phenotypes when orally inoculated into mice with a mutation in the Crohn’s disease risk gene Atg16L1 (Cadwell et al., 2008, 2010). This mutation results in decreased expression of Atg16L1 protein (hypomorphic, Atg16L1HM). Even though MNV CW3 and MNV CR6 are closely related, they have different effects on intestinal pathology in Atg16L1HM mice. Some of these interactions are observed only when mice are fed the chemical dextran sodium sulfate (DSS). (B) Two other studies analyzed the capacity of two different species of Bacteroides to trigger phenotypes in combination with mutations in the IL-10 receptor and T cell expression of a dominant-negative form of the TGF-β receptor (dnKO mice) (Bloom et al., 2011; Kang et al., 2008). dnKO mice are cured of their spontaneous colitis by treatment with antibiotics, but oral feeding of ‘‘cured’’ mice with fecal contents or specific bacteria reinduces disease. Even though Bacteroides thetaiotaomicron and Bacteroides sp. TP5 are closely related, they induce different forms of inflammation when fed to antibiotic-cured dnKO mice. In the same studies, dysbiosis (Box 1) with increases in the numbers of Enterobacteriaceae was noted in dnKO mice prior to curing the mice with antibiotics. However, E. coli inoculation did not trigger the pathologies seen with either Bacteroides species.
For many years, enterovirus infection has been associated with T1D (e.g., Yeung et al., 2011; Oikarinen et al., 2011; Stene et al., 2010), and the major sensor for enterovirus RNA is the T1D susceptibility gene IFIH1, encoding MDA5 (Nejentsev et al., 2009; McCartney et al., 2011). The mechanisms for these associations between components of the microbiome and T1D, CD, or UC have proven elusive. The lack of integration among scientific disciplines and among training programs, together with limitations in technology, has substantially limited the understanding of metagenomic contributions to disease. Adult mammals are permanently infected by many viruses, and they are populated by large site-specific bacterial and phage communities without overt negative effects (Virgin et al., 2009; Foxman and Iwasaki, 2011; Spor et al., 2011). Thus, the bacterial microbiota (and their phages) and the eukaryotic virome are two major (but not the only) contributors to the metagenome. The intestinal microbiome plays a critical role in mammalian physiology by synthesizing vitamins and harvesting energy from food (Spor et al., 2011; Kau et al., 2011). Further, the normal function of the innate immune system, which is critically involved in the pathogenesis of T1D, UC, and CD, is regulated by both chronic viral infections and resident bacterial communities (Barton et al., 2007; White et al., 2010; Virgin et al., 2009; Spor et al., 2011; Kau et al., 2011). The microbiome and metagenome vary from person to person based on host genetics, diet, exposure, geography (including westernization as approximated by the gross national product of a country), socioeconomic status, mode of delivery, gestational age at birth, breast feeding, antibiotic use, and additional factors (Virgin et al., 2009; Benson et al., 2010; Spor et al., 2011; Kau et al., 2011; Penders et al., 2006). Such variations could certainly provide environmental inputs that contribute to the incidence of T1D, UC, and CD, within the genetic foundation revealed by
GWAS (Bach, 2002; Vehik and Dabelea, 2011; Ehlers and Kaufmann, 2010). There are extensive interactions between host and non-host genes within the metagenome, and bacteria and eukaryotic viruses alter our physiology and fitness (Spor et al., 2011; Virgin et al., 2009; Hansen et al., 2010). These genetic interactions within the metagenome create a complex and poorly understood host-gene-microbial interaction matrix that can define phenotype. Host genes, such as those involved in innate and adaptive immunity (e.g., NOD2, NLRP6, HLA, TLR2, and MYD88), shape the bacterial microbiota (Spor et al., 2011; Elinav et al., 2011; Wen et al., 2008). Forward genetic screens in mice suggest that resistance to individual viruses involves hundreds of genes (Virgin et al., 2009), making it likely that many host genes regulate the microbiome (and thus the metagenome). Importantly, key interactions between members of the microbiome have been and are increasingly being reported. For example, murine norovirus can trigger an intestinal inflammatory process in mice with a mutation in Atg16L1. This process can be treated with antibiotics and is thus presumed to be bacteria dependent (Cadwell et al., 2010) (Figure 4). The existence of such interactions indicates that it will be important to consider that the disease contributions of microbiome members (e.g., helminths and bacteria) are potentially dependent on each other.
Metagenetic influences on disease could occur in various ways. Familial disease clustering may reflect intrafamilial behavioral and dietary factors that define the metagenome. A major influence on the human metagenome may be vertical transmission of the maternal microbiome. In the controlled environment of mouse colonies, the bacterial microbiota is clearly maternally inherited. Furthermore, this microbiome can have profound pathological effects in mice carrying specific mutations, such that studies of host-gene functions must now consider contributions of the metagenome (see below). The situation in humans is more complex (Hansen et al., 2011; Benson et al., 2010), although the importance of early environmental exposures has been well documented, including studies with mono- and dizygotic twins (Turnbaugh et al., 2009). Children delivered vaginally initially acquire a distinctly different intestinal bacterial microbiota than those born by caesarean section (Penders et al., 2006; Dominguez-Bello et al., 2010). In a meta-analysis, delivery by caesarean section increases the risk of T1D by 20% (Cardwell et al., 2008). An association with increased risk for T1D has also been reported for higher birth weight (Cardwell et al., 2010) and early infant diet (Pflüger et al., 2010) (Figure 5). Furthermore, changing microbial exposures and infections likely has a major influence on the incidence of other diseases. The dramatic rise in the incidence of allergy, asthma, and T1D in the last 60 years correlates with vast improvements in healthcare and sanitation (Bach, 2002; Vehik and Dabelea, 2011; Ehlers and Kaufmann, 2010). For example, severe rhinovirus infection before the age of 3 years, coupled with an asthma-predisposing inherited host genome, has been associated with increased risk of asthma (Foxman and Iwasaki, 2011). Thus, the metagenome could contribute to disease susceptibility and potentially explain a proportion of the familial clustering, the so-called ‘‘missing heritability’’ of multifactorial diseases.
Figure 5. A Metagenetic View of Developing Normal and Pathological Immune Responses This flowchart depicts stages in the development of normal immune responses or autoimmune and inflammatory diseases at which metagenetic interactions (i.e., gene-gene and gene-microbe interactions) might play a determining role. ‘‘Microbial products’’ refers to molecules that interact with host innate immune sensors and initiate inflammation.
Metagenetic Effects on Immunity and Autoimmunity
GWAS point to a fundamental role for the immune system in the pathogenesis of T1D, UC, and CD. An emerging concept is that bacterial and viral interactions contribute to both normal immune physiologies and abnormal pathologic responses that occur in a host gene-specific fashion (Virgin et al., 2009; Cadwell et al., 2010; Bloom et al., 2011; Garrett et al., 2007; Elinav et al., 2011). The microbiome has significant effects on the development of the immune system (Lee et al., 2011; Mazmanian et al., 2008; Sartor, 2008; Ivanov et al., 2009) and on physiology, including susceptibilities to obesity and the metabolic syndrome (Kau et al., 2011). Serum IgE responses to antigenic challenge are lower in mice colonized with Clostridium, confirming that bacteria can have profound effects on systemic immune responses involved in allergy (Atarashi et al., 2011). Central nervous system (CNS) inflammation induced by autoantigen is limited in germ-free mice, but it can be restored by colonization with specific bacteria (Lee et al., 2011). In the non-obese diabetic (NOD) mouse model, autoimmune diabetes is regulated by a MyD88-dependent interaction of intestinal microbes with the innate immune system (Wen et al., 2008). Germ-free K/BxN T cell receptor transgenic mice are resistant to arthritis caused by autoantibodies to the self-antigen glucose-6-phosphate isomerase. When these animals are colonized with segmented filamentous bacteria (SFB), they regenerate TH17 responses in the small intestine, autoantibody production, and arthritis (Wu et al., 2010). Furthermore, the normal intestinal microbiome is essential for effective resistance to oral inoculation with Toxoplasma gondii and for generating appropriate CD8+ T cell responses to influenza (Ichinohe et al., 2011; Benson et al., 2009). The mechanisms responsible for these observations are under intensive investigation.
One recent study shows that intestinal bacteria induce Tregs in an antigen-specific and T cell receptor-dependent fashion (Lathrop et al., 2011). This is a key observation because it provides a mechanism, in addition to thymic exposure to self-antigens, for how regulatory responses can be generated to blunt inflammation. Given the continuous
presence of the stimulating antigens for these Tregs in the normal intestinal microbiome, such cells could have profound effects on both intestinal and systemic immune responses, including responses to self-antigens. This is particularly important because it has been reported that T lymphocytes migrate to the intestine to accept differentiation signals regulating autoimmune responses (Esplugues et al., 2011). It was also shown that injection of Staphylococcus aureus or its superantigen S. aureus enterotoxin B (SEB) was able to induce these intestinal regulatory TH17 cells, which is consistent with SEB injection being immune tolerogenic (Esplugues et al., 2011). These studies suggest that variation in the metagenome between individual humans, between mice in different research facilities, or even between animals from different cages within the same facility could have profound effects on many aspects of the immune response. This concept has key implications for the interpretation of mouse studies. The microbiome is maternally inherited in mice, but it can differ among research facilities; there may even be significant microenvironmental variation between cages of mice or between mice born of different dams. Given that the microbiome influences immunity so extensively, experiments must control for these factors. Currently, this is neither consistently performed nor required by peer reviewers.
Host-Gene-Metagenome Interactions in UC and CD
Correlations between communities of intestinal bacteria and CD or UC have led to the concept of dysbiosis (Box 1) as a contributor to these diseases (e.g., Sartor, 2008). This important hypothesis emphasizes the potential role that changes in the bacterial microbiota have on disease.
However, this hypothesis now needs to expand to include both nonbacterial components of the metagenome and highly specific interactions between individual bacteria or viruses and host genes, which have recently been identified as contributors to disease pathogenesis. The relative contribution of dysbiosis versus the contribution of single organisms within the microbiome to the etiology of complex inflammatory diseases is unresolved. A confounding element has been the reliance on antibiotic treatment to assess bacteria as causes for intestinal disease. Because antibiotics can treat enteric inflammatory disease triggered by viruses (Figure 4) (Cadwell et al., 2010), a broader approach, including proof that specific bacteria or viruses are both necessary and sufficient for a phenotype, will be required to understand metagenetics of disease. Specific risk alleles for CD or UC could affect IBD by altering bacterial populations or individual bacterial types (Maloy and Powrie, 2011; Garrett et al., 2010b; Spor et al., 2011). Data from numerous mouse models of transmissible colitis confirm this point and are discussed below. The complexity of these reciprocal interactions between host and non-host genes within the metagenome underlines the critical need for new concepts and methodologies in computational and systems biology that can deal with individual host-gene-microbial interactions in the broader context of the metagenome. IBD in humans and mice is associated with alterations in the balance between TH1, TH17, and Treg cells, and this balance is dependent on the metagenome (Garrett et al., 2010b; Maloy and Powrie, 2011). The relevance of these studies to human CD and UC is strongly supported by the identification of genes
regulating these pathways in GWAS on IBD (see above). The role of specific bacteria and helminths in regulating these T cell responses in both the small and large intestine is highly relevant to understanding the genetics and pathogenesis of IBD. Polysaccharide A synthesized by the common colonic commensal Bacteroides fragilis induces Tregs that secrete IL-10 and inhibit intestinal inflammation (Round and Mazmanian, 2010; Mazmanian et al., 2008). Similarly, a protein antigen secreted by the intestinal helminth Heligmosomoides polygyrus induces Foxp3+ Treg cells in vitro and in vivo in mice (Grainger et al., 2010). Furthermore, enteric carriage of a community of Clostridium species induces IL-10-secreting Foxp3+ Tregs in the colon, likely via induction of TGF-β (Atarashi et al., 2011). These findings are interesting in light of the ubiquity of Bacteroides and Clostridia as commensal organisms in human and mouse, and differences in human carriage of helminths across the world. In mice, the presence of distant relatives of Clostridia, called SFB, drives resistance to the enteric pathogen Citrobacter rodentium and the induction of CD4+ TH17 cells in the lamina propria of the small intestine (Ivanov et al., 2009; Gaboriau-Routhiau et al., 2009). The discovery that SFB influence CD4+ T cell differentiation was made when investigators noticed differences in intestinal TH17 cell numbers between mice of the same strain purchased from different vendors, followed by the demonstration that co-housing of these mice resulted in induction of TH17 cells (Ivanov et al., 2009). SFB are highly evolved for their commensal relationship with the mouse intestine (Sczesnak et al., 2011). Similar organisms have not yet been reported in humans, but it seems likely that similarly coevolved organisms will play a role in human intestinal biology and immunoregulation.
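The vendor and co-housing observations above, together with the earlier point that experiments must control for cage and dam effects, suggest that mice sharing a cage should not be treated as independent replicates. One possible way to respect this, sketched below as an assumption rather than as a method used in the cited studies, is to summarize each cage by its mean and run a permutation test with the cage as the unit of analysis. The cage labels and measurement values are invented for illustration only.

```python
import random
from statistics import mean


def cage_means(measurements):
    """Collapse per-mouse values to one mean per cage.

    measurements: dict mapping a cage ID to a list of per-mouse values.
    """
    return [mean(values) for values in measurements.values()]


def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in means.

    group_a, group_b: lists of cage-level summaries (one value per cage).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign cages to groups at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical readout (for example, percent TH17 cells) per mouse, by cage.
vendor_a = {"A1": [1.0, 1.2, 0.9], "A2": [1.1, 1.3],
            "A3": [0.8, 1.0, 1.1], "A4": [1.2, 1.0]}
vendor_b = {"B1": [4.8, 5.1], "B2": [5.0, 5.4],
            "B3": [4.6, 4.8], "B4": [4.9, 5.1]}

p_value = permutation_p_value(cage_means(vendor_a), cage_means(vendor_b))
print(f"cage-level permutation p-value: {p_value:.3f}")
```

With only four cages per group, the smallest attainable p-value is limited by the number of distinct cage assignments, which makes explicit how few independent units such an experiment really contains.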
The discovery of the role for SFB in CD4+ T cell responses is similar to the discovery of a virus-plus-gene trigger for an intestinal disease in mice with symptoms similar to those in CD (Cadwell et al., 2010). This finding occurred by comparing intestinal phenotypes in one strain of mice bred in two different facilities. Both of these findings underline the critical importance of directly analyzing the contributions of the entire microbiome, rather than individual components, in animal models of diseases.
Transmissible Colitis and Host-Gene-Metagenome Interactions
Recent studies have made the striking observation that genetically determined colitis is transmissible, revealing a key role for host genes in defining the microbiome and for metagenomic contributions to enteric disease. Mice lacking both Rag2 and the transcription factor T-bet develop colitis that can be transmitted from a mutant mother to wild-type fosterling mice (Garrett et al., 2007, 2010a). Although there are expansions of specific bacterial types in these mice, including Klebsiella pneumoniae and Proteus mirabilis, another cofactor, in addition to these bacteria, is required to generate the colitis phenotype. This cofactor is not yet identified. Similarly, mice deficient in NLRP6, caspase-1, IL-18, or ASC (all proteins that regulate the expression of proinflammatory cytokines such as IL-18) develop colitis that is transmissible to co-housed wild-type mice (Elinav et al., 2011). Recent studies in another mouse model of transmissible colitis, which has similarities to UC, provide an example of the specificity of host-gene-bacterial relationships and IBD
(Figure 4) (Bloom et al., 2011; Kang et al., 2008). Mice lacking the IL-10 receptor and expressing a dominant-negative form of the TGF-β receptor in T lymphocytes develop IFN-γ- and TNF-α-dependent colitis (Kang et al., 2008). The disease is cured by antibiotic treatment and reinduced by co-housing diseased and cured animals or by simply feeding cured mice the common commensal bacterium Bacteroides thetaiotaomicron (B. theta) (Bloom et al., 2011). In the same mice, the related Bacteroides sp. TP5 induced a lymphocytic inflammatory infiltrate different from that induced by B. theta, indicating the remarkable specificity of host-gene-bacterial interactions (Figure 4). The authors noted dysbiosis in diseased animals with increased numbers of Enterobacteriaceae, but these bacteria did not induce disease despite being present in higher numbers in sick mice. This study shows that a single bacterial type can cause IBD-like pathology in the proper genetic setting, a bacteria-plus-gene interaction that triggers intestinal inflammation. Importantly, the observation that two closely related bacteria induce different pathologies in the same genetically susceptible host provides support for the concept that genes present in the non-host metagenome can determine a host phenotype. A similar observation, in this case of a virus-plus-gene interaction that triggers IBD-like pathology, has been described in mice mutant for the CD risk gene Atg16L1 (Figure 4) (Cadwell et al., 2008, 2010). Abnormal Paneth cells were observed in humans carrying the ATG16L1 T300A allele and mice hypomorphic for expression of Atg16L1 raised in a conventional clean barrier (Cadwell et al., 2008). Importantly, the phenotype of the mice varied between different facilities and could be induced in mutant mice, but not wild-type mice, by inoculation with a specific strain of murine norovirus (Karst et al., 2003; Thackray et al., 2007).
When these mice were challenged with dextran sodium sulfate (DSS), they developed inflammatory phenotypes specific to the combination of Atg16L1 mutation and an individual norovirus strain (Figure 4). Virus-triggered pathologies could be treated by blocking TNF-α or IFN-γ or by treatment with antibiotics. Interestingly, infection with murine norovirus enhances signaling through Nod1 and Nod2 via the induction of type 1 IFN, potentially providing a direct link between enteric viral infection and NOD signaling pathways implicated in IBD risk (Kim et al., 2011). These data raise the possibility that patterns of viral infection and specific components of the bacterial metagenome act together to influence the penetrance of UC and CD susceptibility risk alleles in humans. Furthermore, these data show that closely related viruses can have quite different effects on the phenotype of a host genetically prone to a disease process. This finding further supports the concept that genes in the non-host metagenome can determine host phenotypes.
Host-Gene-Metagenome Interactions in T1D
For T1D, recent observations fit with a ‘‘perfect storm’’ scenario in which numerous events combine to increase susceptibility to disease development in early childhood (Figure 1). These events include susceptibility alleles in HLA class II genes and INS that cause increased autoreactivity against insulin, its precursors, and other islet antigens; lowered IL-2, IL-10, and IL-27 production and signaling; altered T cell receptor signaling and regulation (via, for example, susceptibility alleles in PTPN2, PTPN22, CTLA4,
and IL2RA); and increased type 1 IFN production and responsiveness (Todd, 2010; Robinson et al., 2011; Bluestone et al., 2010). The ‘‘perfect T1D storm’’ is generated when these factors combine with a permissive, modern environment of widespread vitamin D deficiency (Cooper et al., 2011) and other still unidentified environmental factors (Figure 5). In particular, the T1D susceptibility genes and candidates IFIH1 (Nejentsev et al., 2009), GPR183 (EBI2) (Heinig et al., 2010), TLR7, TLR8 (Barrett et al., 2009), and FUT2 (Smyth et al., 2011) strongly suggest an etiological role for virus-induced, type 1 interferon production. A common knockout mutation of FUT2 in several populations causes the nonsecretor status (i.e., a lack of shedding of the A and B blood group antigens into saliva and intestinal secretions). This T1D-predisposing FUT2 genotype is also associated with increased risk of CD (McGovern et al., 2010; Franke et al., 2010), providing another direct mechanistic link between these two diseases and microbial infections. The FUT2 nonsecretor genotype is associated with resistance to certain strains of norovirus and Helicobacter pylori (Smyth et al., 2011). Investigations of the mechanisms involved in the FUT2 associations with chronic and infectious disease are urgently required, as is the case for many of the newly identified GWAS candidate genes.
Defining the Metagenome Now and in the Future
Technologies for analyzing human loci involved in complex diseases have, until recently, outstripped technologies for analyzing the metagenome. For example, single-nucleotide polymorphism (SNP)-based GWAS cover the entire human genome, although at low resolution, whereas most common tools and methods applied to the non-host metagenome focus on only one component, such as particular bacteria, viruses, or phages.
The non-host metagenome is so complex that researchers have focused on DNA sequencing, even though many organisms relevant to disease—including enteroviruses that have been linked to T1D and viruses that cause intestinal disease—have RNA genomes. Although our knowledge of the human gut metagenome is in its infancy, this metagenome can now be explored in detail by deep, next-generation sequencing of both RNA and DNA, then stratified by host genotype, disease risk, or disease status. Investigators are increasingly using shotgun sequencing of RNA + DNA, which theoretically can detect any organism (e.g., Finkbeiner et al., 2008). However, studies to date have often relied on the DNA sequencing of 16S rRNA genes of bacteria. This standard and reliable method has identified dysbiosis in IBD and T1D (Wen et al., 2008; Roesch et al., 2009; Giongo et al., 2011; Sartor, 2008; Garrett et al., 2010b). Whether these changes are causal or secondary to disease is unclear. An outstanding example of consequences of relying on the analysis of only a subset of the metagenome is the recent appreciation that bacterial phage viruses are a major and dynamic part of the intestinal microbiome (Reyes et al., 2010). This adds an enteric bacterial ‘‘virome’’ to the eukaryotic virome that lives in our tissues (Virgin et al., 2009). Bacteria are not the only cells, in addition to host cells, that can be infected by viruses with consequent changes in biology. For example, an RNA virus infects the eukaryotic pathogen Leishmania and regulates the host inflammatory responses during parasite infection (Ives et al., 2011). Thus, like bacteria and their phages, all Eukarya in
the microbiome are candidates for viral infection that might alter biological processes. The tools to detect and quantify the entire non-host metagenome at a reasonable cost will undoubtedly develop rapidly as metagenomic sequencing technologies and computational approaches to phylogeny and microbe detection are developed and applied. Similarly, sequencing the entire host genome is becoming more cost efficient and practical. This wealth of data will set the stage for metagenetics, but meaningful and robust analyses of the complex interactions within the metagenome will require new computational tools and new conceptualizations of gene-gene and gene-microbe interactions.
Conclusion: The Metagenetics of Mechanism-Based Disease Subtypes
Here we have argued that two factors need to be considered as key contributors to the genetics and pathogenesis of complex inflammatory diseases, such as T1D, CD, and UC: specific host-gene-microbial interactions and the mechanistic heterogeneity of phenotypes that constitute complex diseases. Although we have used the lens of T1D, CD, and UC research to support these concepts, it is clear that these ideas may apply to a broader array of diseases as well. The striking effects of the microbiome on systemic immunity and on diseases that affect both visceral and mucosal tissues suggest that any physiologic process may be altered by the microbiome and gene-specific interactions of the microbiome with the host. At a minimum, the diverse diseases that have been revealed by GWAS to share risk alleles are strong candidates for considering the metagenome, rather than only the host genome, as contributing to health or disease. The concepts of mechanism-defined disease subtypes and host-gene-microbial interactions cooperate in important ways.
For example, if the single diagnosis of CD or T1D includes multiple mechanistic phenotypes (Figure 2 and Figure 3), a specific host-gene-microbial interaction (Figure 4) might contribute to only one of these phenotypes. In this setting, the impact of interactions between genes in the metagenome, of either microbial or host origin, would be obscured. This could, for example, obscure the role of a single microbe in causing one mechanism-based disease subtype rather than causing all cases of a disease. Failure to identify such an agent would prevent the use of approaches that treat or vaccinate against the agent (Figure 2 and Figure 5). It is logical and anticipated that stratifying patients for treatment with pathway-specific drugs will improve outcomes and success of phase II and III clinical trials (Figure 3). This paradigm is highly effective and increasingly used in the treatment of cancer, but it also seems likely to benefit those with germline-based predisposition to disease as well. Deconvoluting the complex matrix of interactions within the metagenome that contribute to disease will require more complete analyses of the metagenome. It also requires an iterative redefinition of disease subtypes using markers that distinguish between patients based on the mechanism responsible for injury rather than the presence of tissue injury per se. This ambitious goal is daunting to consider, but data discussed herein from human studies, animal studies, and analyses of the microbiome lead us to the inescapable conclusion that complex interactions within the metagenome control phenotypes. We must face this
complexity head-on to solve the puzzle of the etiology and pathogenesis of complex diseases. We, therefore, argue for the inclusion of the metagenome in human genetic studies for these diseases. We view complex diseases as ‘‘metagenetic,’’ reflecting the contributions of both host and non-host genes within the metagenome. The non-host genes in the metagenome that are relevant to a disease might be viral, bacterial, or derived from additional members of the microbiome, which are still largely uncharacterized. Parasites likely play a critical role in some populations. These metagenetic interactions probably contribute to the development of disease at two levels (Figure 5). First, we envision the normal immune system developing via harmonious relationships within the metagenome. For example, the level of innate immunity in mice is regulated by chronic herpesvirus infection (Barton et al., 2007; White et al., 2010), and therefore acquisition of a specific chronic virus might predispose the host to either helpful or harmful responses to other components of the microbiome. It will be important to develop quantitative and robust ways to identify such a ‘‘normal’’ immune system. Second, once a poorly balanced immune system is generated, host gene interactions, with either other host genes or the non-host metagenome, likely synergize to generate inappropriate levels of inflammation in response to microbial products (e.g., CD and UC) or to set the stage for development of HLA-dependent autoimmunity (T1D). Understanding this level of biological complexity will require the involvement of statisticians, computational biologists, geneticists, pathogenesis experts, virologists, bacteriologists, and parasitologists in an integrated fashion to identify mechanistically important interactions. Such an integrated approach can then perhaps make sense of the metagenetics of complex diseases, to the advantage of us all.
ACKNOWLEDGMENTS
The authors would like to acknowledge helpful conversations with Thad Stappenbeck, Ramnik Xavier, Emil Unanue, Jeff Gordon, Balfour Sartor, Adolfo Garcia-Sastre, and Dermot McGovern. We thank Tom Smith for providing the images of noroviruses used in Figure 4. H.W.V. is supported by the NCI, NIAID, NCRR, the Crohn’s and Colitis Foundation of America, and the Broad Medical Foundation. J.A.T. is supported by the NIDDK, NIHR, the Wellcome Trust, the Juvenile Diabetes Research Foundation International, and the European Union.
REFERENCES
Anderson, C.A., Boucher, G., Lees, C.W., Franke, A., D’Amato, M., Taylor, K.D., Lee, J.C., Goyette, P., Imielinski, M., Latiano, A., et al. (2011). Meta-analysis identifies 29 additional ulcerative colitis risk loci, increasing the number of confirmed associations to 47. Nat. Genet. 43, 246–252.
Arechiga, A.F., Habib, T., He, Y., Zhang, X., Zhang, Z.Y., Funk, A., and Buckner, J.H. (2009). Cutting edge: the PTPN22 allelic variant associated with autoimmunity impairs B cell signaling. J. Immunol. 182, 3343–3347.
Atarashi, K., Tanoue, T., Shima, T., Imaoka, A., Kuwahara, T., Momose, Y., Cheng, G., Yamasaki, S., Saito, T., Ohba, Y., et al. (2011). Induction of colonic regulatory T cells by indigenous Clostridium species. Science 331, 337–341.
Bach, J.F. (2002). The effect of infections on susceptibility to autoimmune and allergic diseases. N. Engl. J. Med. 347, 911–920.
Barrett, J.C., Clayton, D.G., Concannon, P., Akolkar, B., Cooper, J.D., Erlich, H.A., Julier, C., Morahan, G., Nerup, J., Nierras, C., et al; Type 1 Diabetes Genetics Consortium. (2009). Genome-wide association study and
meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nat. Genet. 41, 703–707.
Barton, E.S., White, D.W., Cathelyn, J.S., Brett-McClellan, K.A., Engle, M., Diamond, M.S., Miller, V.L., and Virgin, H.W., 4th. (2007). Herpesvirus latency confers symbiotic protection from bacterial infection. Nature 447, 326–329.
Benson, A., Pifer, R., Behrendt, C.L., Hooper, L.V., and Yarovinsky, F. (2009). Gut commensal bacteria direct a protective immune response against Toxoplasma gondii. Cell Host Microbe 6, 187–196.
Benson, A.K., Kelly, S.A., Legge, R., Ma, F., Low, S.J., Kim, J., Zhang, M., Oh, P.L., Nehrenberg, D., Hua, K., et al. (2010). Individuality in gut microbiota composition is a complex polygenic trait shaped by multiple environmental and host genetic factors. Proc. Natl. Acad. Sci. USA 107, 18933–18938.
Bloom, S.M., Bijanki, V.N., Nava, G.M., Sun, L., Malvin, N.P., Donermeyer, D.L., Dunne, W.M., Jr., Allen, P.M., and Stappenbeck, T.S. (2011). Commensal Bacteroides species induce colitis in host-genotype-specific fashion in a mouse model of inflammatory bowel disease. Cell Host Microbe 9, 390–403.
Bluestone, J.A., Herold, K., and Eisenbarth, G. (2010). Genetics, pathogenesis and clinical interventions in type 1 diabetes. Nature 464, 1293–1300.
Bottini, N., Vang, T., Cucca, F., and Mustelin, T. (2006). Role of PTPN22 in type 1 diabetes and other autoimmune diseases. Semin. Immunol. 18, 207–213.
Brehm, M.A., Bortell, R., Diiorio, P., Leif, J., Laning, J., Cuthbert, A., Yang, C., Herlihy, M., Burzenski, L., Gott, B., et al. (2010). Human immune system development and rejection of human islet allografts in spontaneously diabetic NODRag1null IL2rgammanull Ins2Akita mice. Diabetes 59, 2265–2270.
Cadwell, K., Liu, J.Y., Brown, S.L., Miyoshi, H., Loh, J., Lennerz, J.K., Kishi, C., Kc, W., Carrero, J.A., Hunt, S., et al. (2008). A key role for autophagy and the autophagy gene Atg16l1 in mouse and human intestinal Paneth cells. Nature 456, 259–263.
Cadwell, K., Patel, K.K., Maloney, N.S., Liu, T.C., Ng, A.C., Storer, C.E., Head, R.D., Xavier, R., Stappenbeck, T.S., and Virgin, H.W. (2010). Virus-plus-susceptibility gene interaction determines Crohn’s disease gene Atg16L1 phenotypes in intestine. Cell 141, 1135–1145.
Calderon, B., Carrero, J.A., Miller, M.J., and Unanue, E.R. (2011a). Cellular and molecular events in the localization of diabetogenic T cells to islets of Langerhans. Proc. Natl. Acad. Sci. USA 108, 1561–1566.
Calderon, B., Carrero, J.A., Miller, M.J., and Unanue, E.R. (2011b). Entry of diabetogenic T cells into islets induces changes that lead to amplification of the cellular response. Proc. Natl. Acad. Sci. USA 108, 1567–1572.
Cardwell, C.R., Stene, L.C., Joner, G., Cinek, O., Svensson, J., Goldacre, M.J., Parslow, R.C., Pozzilli, P., Brigis, G., Stoyanov, D., et al. (2008). Caesarean section is associated with an increased risk of childhood-onset type 1 diabetes mellitus: a meta-analysis of observational studies. Diabetologia 51, 726–735.
Cardwell, C.R., Stene, L.C., Joner, G., Davis, E.A., Cinek, O., Rosenbauer, J., Ludvigsson, J., Castell, C., Svensson, J., Goldacre, M.J., et al. (2010). Birthweight and the risk of childhood-onset type 1 diabetes: a meta-analysis of observational studies using individual patient data. Diabetologia 53, 641–651.
Clayton, D.G. (2009). Prediction and interaction in complex disease genetics: experience in type 1 diabetes. PLoS Genet. 5, e1000540.
Cooper, J.D., Smyth, D.J., Walker, N.M., Stevens, H., Burren, O.S., Wallace, C., Greissl, C., Ramos-Lopez, E., Hyppönen, E., Dunger, D.B., et al. (2011). Inherited variation in vitamin D genes is associated with predisposition to autoimmune disease type 1 diabetes. Diabetes 60, 1624–1631.
Cotsapas, C., Voight, B.F., Rossin, E., Lage, K., Neale, B.M., Wallace, C., Abecasis, G.R., Barrett, J.C., Behrens, T., Cho, J., et al; on behalf of the FOCiS Network of Consortia. (2011). Pervasive sharing of genetic effects in autoimmune disease. PLoS Genet. 7, e1002254.
Dendrou, C.A., Plagnol, V., Fung, E., Yang, J.H., Downes, K., Cooper, J.D., Nutland, S., Coleman, G., Himsworth, M., Hardy, M., et al. (2009). Cell-specific protein phenotypes for the autoimmune locus IL2RA using a genotype-selectable human bioresource. Nat. Genet. 41, 1011–1015.
Dominguez-Bello, M.G., Costello, E.K., Contreras, M., Magris, M., Hidalgo, G., Fierer, N., and Knight, R. (2010). Delivery mode shapes the acquisition and structure of the initial microbiota across multiple body habitats in newborns. Proc. Natl. Acad. Sci. USA 107, 11971–11975.
Ehlers, S., and Kaufmann, S.H.; Participants of the 99(th) Dahlem Conference. (2010). Infection, inflammation, and chronic diseases: consequences of a modern lifestyle. Trends Immunol. 31, 184–190.
Elinav, E., Strowig, T., Kau, A.L., Henao-Mejia, J., Thaiss, C.A., Booth, C.J., Peaper, D.R., Bertin, J., Eisenbarth, S.C., Gordon, J.I., and Flavell, R.A. (2011). NLRP6 inflammasome regulates colonic microbial ecology and risk for colitis. Cell 145, 745–757.
Esplugues, E., Huber, S., Gagliani, N., Hauser, A.E., Town, T., Wan, Y.Y., O’Connor, W., Jr., Rongvaux, A., Van Rooijen, N., Haberman, A.M., et al. (2011). Control of TH17 cells occurs in the small intestine. Nature 475, 514–518.
Fehrmann, R.S.N., Jansen, R.C., Veldink, J.H., Westra, H.J., Arends, D., Bonder, M.J., Fu, J., Deelen, P., Groen, H.J.M., Smolonska, A., et al. (2011). Trans-eQTLs reveal that independent genetic variants associated with a complex phenotype converge on intermediate genes, with a major role for the HLA. PLoS Genet. 7, e1002197.
Finkbeiner, S.R., Allred, A.F., Tarr, P.I., Klein, E.J., Kirkwood, C.D., and Wang, D. (2008). Metagenomic analysis of human diarrhea: viral detection and discovery. PLoS Pathog. 4, e1000011.
Foxman, E.F., and Iwasaki, A. (2011). Genome-virome interactions: examining the role of common viral infections in complex disease. Nat. Rev. Microbiol. 9, 254–264.
Franke, A., McGovern, D.P., Barrett, J.C., Wang, K., Radford-Smith, G.L., Ahmad, T., Lees, C.W., Balschun, T., Lee, J., Roberts, R., et al. (2010). Genome-wide meta-analysis increases to 71 the number of confirmed Crohn’s disease susceptibility loci. Nat. Genet. 42, 1118–1125.
Gaboriau-Routhiau, V., Rakotobe, S., Lécuyer, E., Mulder, I., Lan, A., Bridonneau, C., Rochet, V., Pisi, A., De Paepe, M., Brandi, G., et al. (2009). The key role of segmented filamentous bacteria in the coordinated maturation of gut helper T cell responses. Immunity 31, 677–689.
Garrett, W.S., Lord, G.M., Punit, S., Lugo-Villarino, G., Mazmanian, S.K., Ito, S., Glickman, J.N., and Glimcher, L.H. (2007). Communicable ulcerative colitis induced by T-bet deficiency in the innate immune system. Cell 131, 33–45.
Garrett, W.S., Gallini, C.A., Yatsunenko, T., Michaud, M., DuBois, A., Delaney, M.L., Punit, S., Karlsson, M., Bry, L., Glickman, J.N., et al. (2010a). Enterobacteriaceae act in concert with the gut microbiota to induce spontaneous and maternally transmitted colitis. Cell Host Microbe 8, 292–300.
Garrett, W.S., Gordon, J.I., and Glimcher, L.H. (2010b). Homeostasis and inflammation in the intestine. Cell 140, 859–870.
Gianani, R., Campbell-Thompson, M., Sarkar, S.A., Wasserfall, C., Pugliese, A., Solis, J.M., Kent, S.C., Hering, B.J., West, E., Steck, A., et al. (2010). Dimorphic histopathology of long-standing childhood-onset diabetes. Diabetologia 53, 690–698.
Giongo, A., Gano, K.A., Crabb, D.B., Mukherjee, N., Novelo, L.L., Casella, G., Drew, J.C., Ilonen, J., Knip, M., Hyöty, H., et al. (2011). Toward defining the autoimmune microbiome for type 1 diabetes. ISME J. 5, 82–91.
Glocker, E.O., Kotlarz, D., Boztug, K., Gertz, E.M., Schäffer, A.A., Noyan, F., Perro, M., Diestelhorst, J., Allroth, A., Murugan, D., et al. (2009). Inflammatory bowel disease and mutations affecting the interleukin-10 receptor. N. Engl. J. Med. 361, 2033–2045.
Goodman, A.L., Kallstrom, G., Faith, J.J., Reyes, A., Moore, A., Dantas, G., and Gordon, J.I. (2011). Extensive personal human gut microbiota culture collections characterized and manipulated in gnotobiotic mice. Proc. Natl. Acad. Sci. USA 108, 6252–6257.
Grainger, J.R., Smith, K.A., Hewitson, J.P., McSorley, H.J., Harcus, Y., Filbey, K.J., Finney, C.A., Greenwood, E.J., Knox, D.P., Wilson, M.S., et al. (2010). Helminth secretions induce de novo T cell Foxp3 expression and regulatory function through the TGF-β pathway. J. Exp. Med. 207, 2331–2341.
Hansen, E.E., Lozupone, C.A., Rey, F.E., Wu, M., Guruge, J.L., Narra, A., Goodfellow, J., Zaneveld, J.R., McDonald, D.T., Goodrich, J.A., et al. (2011). Pan-genome of the dominant human gut-associated archaeon, Methanobrevibacter smithii, studied in twins. Proc. Natl. Acad. Sci. USA 108 (Suppl 1), 4599–4606.
Hansen, J., Gulati, A., and Sartor, R.B. (2010). The role of mucosal immunity and host genetics in defining intestinal commensal bacteria. Curr. Opin. Gastroenterol. 26, 564–571.
Heinig, M., Petretto, E., Wallace, C., Bottolo, L., Rotival, M., Lu, H., Li, Y., Sarwar, R., Langley, S.R., Bauerfeind, A., et al; Cardiogenics Consortium. (2010). A trans-acting locus regulates an anti-viral expression network and type 1 diabetes risk. Nature 467, 460–464.
Ichinohe, T., Pang, I.K., Kumamoto, Y., Peaper, D.R., Ho, J.H., Murray, T.S., and Iwasaki, A. (2011). Microbiota regulates immune defense against respiratory tract influenza A virus infection. Proc. Natl. Acad. Sci. USA 108, 5354–5359.
Imielinski, M., Baldassano, R.N., Griffiths, A., Russell, R.K., Annese, V., Dubinsky, M., Kugathasan, S., Bradfield, J.P., Walters, T.D., Sleiman, P., et al; Western Regional Alliance for Pediatric IBD; International IBD Genetics Consortium; NIDDK IBD Genetics Consortium; Belgian-French IBD Consortium; Wellcome Trust Case Control Consortium. (2009). Common variants at five new loci associated with early-onset inflammatory bowel disease. Nat. Genet. 41, 1335–1340.
Ivanov, I.I., Atarashi, K., Manel, N., Brodie, E.L., Shima, T., Karaoz, U., Wei, D., Goldfarb, K.C., Santee, C.A., Lynch, S.V., et al. (2009). Induction of intestinal Th17 cells by segmented filamentous bacteria. Cell 139, 485–498.
Ives, A., Ronet, C., Prevel, F., Ruzzante, G., Fuertes-Marraco, S., Schutz, F., Zangger, H., Revaz-Breton, M., Lye, L.F., Hickerson, S.M., et al. (2011). Leishmania RNA virus controls the severity of mucocutaneous leishmaniasis. Science 331, 775–778.
Kang, S.S., Bloom, S.M., Norian, L.A., Geske, M.J., Flavell, R.A., Stappenbeck, T.S., and Allen, P.M. (2008). An antibiotic-responsive mouse model of fulminant ulcerative colitis. PLoS Med. 5, e41.
Karst, S.M., Wobus, C.E., Lay, M., Davidson, J., and Virgin, H.W., 4th. (2003). STAT1-dependent innate immunity to a Norwalk-like virus. Science 299, 1575–1578.
Kau, A.L., Ahern, P.P., Griffin, N.W., Goodman, A.L., and Gordon, J.I. (2011). Human nutrition, the gut microbiome and the immune system. Nature 474, 327–336.
Khor, B., Gardet, A., and Xavier, R.J. (2011). Genetics and pathogenesis of inflammatory bowel disease. Nature 474, 307–317.
Kim, Y.G., Park, J.H., Reimer, T., Baker, D.P., Kawai, T., Kumar, H., Akira, S., Wobus, C., and Núñez, G. (2011). Viral infection augments Nod1/2 signaling to potentiate lethality associated with secondary bacterial infections. Cell Host Microbe 9, 496–507.
Lathrop, S.K., Bloom, S.M., Rao, S.M., Nutsch, K., Lio, C.W., Santacruz, N., Peterson, D.A., Stappenbeck, T., and Hsieh, C.S. (2011). Peripheral education of the immune system by colonic commensal microbiota. Nature 10.1038/nature10434.
Lee, Y.K., Menezes, J.S., Umesaki, Y., and Mazmanian, S.K. (2011). Proinflammatory T-cell responses to gut microbiota promote experimental autoimmune encephalomyelitis. Proc. Natl. Acad. Sci. USA 108 (Suppl 1), 4615–4622.
Levine, B., Mizushima, N., and Virgin, H.W. (2011). Autophagy in immunity and inflammation. Nature 469, 323–335.
Long, S.A., Cerosaletti, K., Wan, J.Y., Ho, J.C., Tatum, M., Wei, S., Shilling, H.G., and Buckner, J.H. (2011). An autoimmune-associated variant in PTPN2 reveals an impairment of IL-2R signaling in CD4(+) T cells. Genes Immun. 12, 116–125.
Maloy, K.J., and Powrie, F. (2011). Intestinal homeostasis and its breakdown in inflammatory bowel disease. Nature 474, 298–306.
Mazmanian, S.K., Round, J.L., and Kasper, D.L. (2008). A microbial symbiosis factor prevents intestinal inflammatory disease. Nature 453, 620–625.
McCartney, S.A., Vermi, W., Lonardi, S., Rossini, C., Otero, K., Calderon, B., Gilfillan, S., Diamond, M.S., Unanue, E.R., and Colonna, M. (2011). RNA sensor-induced type I IFN prevents diabetes caused by a β cell-tropic virus in mice. J. Clin. Invest. 121, 1497–1507.
McGovern, D.P., Jones, M.R., Taylor, K.D., Marciante, K., Yan, X., Dubinsky, M., Ippoliti, A., Vasiliauskas, E., Berel, D., Derkowski, C., et al; International IBD Genetics Consortium. (2010). Fucosyltransferase 2 (FUT2) non-secretor status is associated with Crohn’s disease. Hum. Mol. Genet. 19, 3468–3476.
Melmed, G.Y., and Targan, S.R. (2010). Future biologic targets for IBD: potentials and pitfalls. Nat. Rev. Gastroenterol. Hepatol. 7, 110–117.
Nejentsev, S., Walker, N., Riches, D., Egholm, M., and Todd, J.A. (2009). Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes. Science 324, 387–389.
Odze, R. (2003). Diagnostic problems and advances in inflammatory bowel disease. Mod. Pathol. 16, 347–358.
Oikarinen, S., Martiskainen, M., Tauriainen, S., Huhtala, H., Ilonen, J., Veijola, R., Simell, O., Knip, M., and Hyöty, H. (2011). Enterovirus RNA in blood is linked to the development of type 1 diabetes. Diabetes 60, 276–279.
Penders, J., Thijs, C., Vink, C., Stelma, F.F., Snijders, B., Kummeling, I., van den Brandt, P.A., and Stobberingh, E.E. (2006). Factors influencing the composition of the intestinal microbiota in early infancy. Pediatrics 118, 511–521.
Pflüger, M., Winkler, C., Hummel, S., and Ziegler, A.G. (2010). Early infant diet in children at high risk for type 1 diabetes. Horm. Metab. Res. 42, 143–148.
Rashid, S.T., Corbineau, S., Hannan, N., Marciniak, S.J., Miranda, E., Alexander, G., Huang-Doran, I., Griffin, J., Ahrlund-Richter, L., Skepper, J., et al. (2010). Modeling inherited metabolic disorders of the liver using human induced pluripotent stem cells. J. Clin. Invest. 120, 3127–3136.
Reyes, A., Haynes, M., Hanson, N., Angly, F.E., Heath, A.C., Rohwer, F., and Gordon, J.I. (2010). Viruses in the faecal microbiota of monozygotic twins and their mothers. Nature 466, 334–338.
Rivas, M.A., Beaudoin, M., Gardet, A., Stevens, C., Sharma, Y., Zhang, C.K., Boucher, G., Ripke, S., Ellinghaus, D., Burtt, N., et al. (2011). Deep resequencing of GWAS loci identifies independent rare variants associated with inflammatory bowel disease. Nat. Genet. 10.1038/ng.952.
Robinson, T., Kariuki, S.N., Franek, B.S., Kumabe, M., Kumar, A.A., Badaracco, M., Mikolaitis, R.A., Guerrero, G., Utset, T.O., Drevlow, B.E., et al. (2011). Autoimmune disease risk variant of IFIH1 is associated with increased sensitivity to IFN-α and serologic autoimmunity in lupus patients. J. Immunol. 187, 1298–1303.
Roesch, L.F., Lorca, G.L., Casella, G., Giongo, A., Naranjo, A., Pionzio, A.M., Li, N., Mai, V., Wasserfall, C.H., Schatz, D., et al. (2009). Culture-independent identification of gut bacteria correlated with the onset of diabetes in a rat model. ISME J. 3, 536–548.
Rossin, E.J., Lage, K., Raychaudhuri, S., Xavier, R.J., Tatar, D., Benita, Y., Cotsapas, C., and Daly, M.J.; International Inflammatory Bowel Disease Genetics Consortium. (2011). Proteins encoded in genomic regions associated with immune-mediated disease physically interact and suggest underlying biology. PLoS Genet. 7, e1001273.
Round, J.L., and Mazmanian, S.K. (2010). Inducible Foxp3+ regulatory T-cell development by a commensal bacterium of the intestinal microbiota. Proc. Natl. Acad. Sci. USA 107, 12204–12209.
Sartor, R.B. (2008). Microbial influences in inflammatory bowel diseases. Gastroenterology 134, 577–594.
Sczesnak, A., Segata, N., Qin, X., Gevers, D., Petrosino, J.F., Huttenhower, C., Littman, D.R., and Ivanov, I.I. (2011). The genome of Th17 cell-inducing segmented filamentous bacteria reveals extensive auxotrophy and adaptations to the intestinal environment. Cell Host Microbe 10, 260–272.
Smyth, D.J., Cooper, J.D., Howson, J.M.M., Clarke, P., Downes, K., Mistry, T., Stevens, H., Walker, N.M., and Todd, J.A. (2011). FUT2 non-secretor status links type 1 diabetes susceptibility and resistance to infection. Diabetes. Published online October 24, 2011. 10.2337/db11-0638.
Spor, A., Koren, O., and Ley, R. (2011). Unravelling the effects of the environment and host genotype on the gut microbiome. Nat. Rev. Microbiol. 9, 279–290.
Stappenbeck, T.S., Rioux, J.D., Mizoguchi, A., Saitoh, T., Huett, A., Darfeuille-Michaud, A., Wileman, T., Mizushima, N., Carding, S., Akira, S., et al. (2011).
Crohn disease: a current perspective on genetics, autophagy and immunity. Autophagy 7, 355–374.
Stene, L.C., Oikarinen, S., Hyöty, H., Barriga, K.J., Norris, J.M., Klingensmith, G., Hutton, J.C., Erlich, H.A., Eisenbarth, G.S., and Rewers, M. (2010). Enterovirus infection and progression from islet autoimmunity to type 1 diabetes: the Diabetes and Autoimmunity Study in the Young (DAISY). Diabetes 59, 3174–3180.
Thackray, L.B., Wobus, C.E., Chachu, K.A., Liu, B., Alegre, E.R., Henderson, K.S., Kelley, S.T., and Virgin, H.W., 4th. (2007). Murine noroviruses comprising a single genogroup exhibit biological diversity despite limited sequence divergence. J. Virol. 81, 10460–10473.
Todd, J.A. (2010). Etiology of type 1 diabetes. Immunity 32, 457–467.
Turley, S.J., Lee, J.W., Dutton-Swain, N., Mathis, D., and Benoist, C. (2005). Endocrine self and gut non-self intersect in the pancreatic lymph nodes. Proc. Natl. Acad. Sci. USA 102, 17729–17733.
Turnbaugh, P.J., Hamady, M., Yatsunenko, T., Cantarel, B.L., Duncan, A., Ley, R.E., Sogin, M.L., Jones, W.J., Roe, B.A., Affourtit, J.P., et al. (2009). A core gut microbiome in obese and lean twins. Nature 457, 480–484.
Vehik, K., and Dabelea, D. (2011). The changing epidemiology of type 1 diabetes: why is it going through the roof? Diabetes Metab. Res. Rev. 27, 3–13.
Virgin, H.W., Wherry, E.J., and Ahmed, R. (2009). Redefining chronic viral infection. Cell 138, 30–50.
Wen, L., Ley, R.E., Volchkov, P.Y., Stranges, P.B., Avanesyan, L., Stonebraker, A.C., Hu, C., Wong, F.S., Szot, G.L., Bluestone, J.A., et al. (2008). Innate immunity and intestinal microbiota in the development of type 1 diabetes. Nature 455, 1109–1113.
White, D.W., Keppel, C.R., Schneider, S.E., Reese, T.A., Coder, J., Payton, J.E., Ley, T.J., Virgin, H.W., and Fehniger, T.A. (2010). Latent herpesvirus infection arms NK cells. Blood 115, 4377–4383.
Wu, H.J., Ivanov, I.I., Darce, J., Hattori, K., Shima, T., Umesaki, Y., Littman, D.R., Benoist, C., and Mathis, D. (2010). Gut-residing segmented filamentous bacteria drive autoimmune arthritis via T helper 17 cells. Immunity 32, 815–827.
Yeung, W.C., Rawlinson, W.D., and Craig, M.E. (2011). Enterovirus infection and type 1 diabetes mellitus: systematic review and meta-analysis of observational molecular studies. BMJ 342, d35.
56 Cell 147, September 30, 2011 ª2011 Elsevier Inc.
Leading Edge
Primer

Mapping Rare and Common Causal Alleles for Complex Human Diseases

Soumya Raychaudhuri1,2,3,4,*
1Division of Genetics
2Division of Rheumatology
Brigham & Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
3Partners HealthCare Center for Personalized Genetic Medicine, Boston, MA 02115, USA
4Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.09.011
Advances in genotyping and sequencing technologies have revolutionized the genetics of complex disease by locating rare and common variants that influence an individual’s risk for diseases, such as diabetes, cancers, and psychiatric disorders. However, to capitalize on these data for prevention and therapies requires the identification of causal alleles and a mechanistic understanding for how these variants contribute to the disease. After discussing the strategies currently used to map variants for complex diseases, this Primer explores how variants may be prioritized for follow-up functional studies and the challenges and approaches for assessing the contributions of rare and common variants to disease phenotypes.

Most common diseases are complex: many genetic and environmental factors mediate the risk for developing the disease, and each individual factor explains only a small proportion of population risk (Cardon and Abecasis, 2003). Genome-wide genotyping with high-throughput approaches has led to the identification of >2,600 associated common risk alleles, with convincing associations in >350 different complex traits (most with modest effect size of odds ratio <1.5) (Hindorff et al., 2009). More recently, low-cost, high-throughput sequencing of exomes and whole genomes is giving investigators access to the spectrum of rare inherited variants and de novo mutations.

Once an associated allele is discovered, a critical step to characterizing pathogenesis is the definition of the causal allele, that is the functional allele that influences disease susceptibility and explains the observed association. However, for the vast majority of associated alleles, the identities of causal genes and variants, as well as the function of these variants, remain uncertain. This Primer discusses the population genetics features of rare and common alleles, strategies for connecting these alleles to disease, and strategies to prioritize them for functional follow-up studies.
Population Genetics of Rare and Common Alleles

Geneticists have long debated the extent to which rare and common alleles contribute to complex disease (Pritchard, 2001; Pritchard and Cox, 2002; Reich and Lander, 2001). Although there is evidence of susceptibility alleles across the frequency spectrum in many complex diseases, it is important to realize that rare alleles and common alleles have different population characteristics that are relevant to medical genetics.

The exact distinction between rare and common alleles is to an extent an arbitrary one. We define common alleles as those with frequencies >1%; these alleles are frequent enough that they can be queried by genotyping in standard marker panels. Rare alleles are polymorphic alleles with <1% frequency that might be most effectively studied with sequencing technologies. The rarest alleles are seen in only a handful of individuals or are private to a single individual and can only be observed by sequencing.

The Origin of Polymorphic Alleles

De novo mutations occurring spontaneously in individuals are constantly and rapidly introduced into any population. These mutations are initially ‘‘private’’ to the individual that they occurred in but might then be passed on to progeny. Most of these mutations are quickly filtered out or lost by genetic drift and will never achieve appreciable allele frequencies. I illustrate this concept by a simulation in which de novo neutral mutations (conferring no effect on fitness) are introduced into a population of 2,000 diploid individuals. In 31 generations, 95% of these mutations disappear from the general population, and not one of these mutations achieves an allele frequency of >1% in 200 generations (see Figure S1 available online). Mutations that are deleterious are even more rapidly purged from populations. Although any de novo mutation is very unlikely to become a common allele, even a somewhat deleterious mutation may persist for a few subsequent generations as a rare allele before disappearing.

Thus populations harbor many rare alleles, most of which have been derived recently, but relatively few common ones. In fact, there is only about one common variant on average per 500 bp in European populations (1000 Genomes Project Consortium, 2010). On the other hand, recent and rapid expansion of human populations has resulted in the presence of many rare alleles. At the extreme of the allele frequency spectrum are de novo mutations; each individual harbors 40
Figure 1. Linkage Disequilibrium and Haplotype Lengths
(A) Linkage disequilibrium metrics. Left: For two markers that are random with respect to each other, each with a 0.5 allele frequency, there is no linkage between them; each resulting haplotype has a frequency of 0.25. Middle left: Here the two markers are not entirely random, and alleles at one marker correlate partially with alleles at the other marker. The A allele on the left is observed more frequently with the C allele on the right, and the T allele on the left is observed more frequently with the G allele on the right. Middle right: Here the two alleles are more tightly linked or have tighter LD than in the previous case. In this instance, the presence of the T allele on the left predicts with certainty the G allele on the right. This could be the case if the T allele arose de novo on a haplotype with the G allele on the right. Right: For instances of tight LD, an allele at one marker predicts perfectly the allele at the other marker; in this case, these two markers form only two haplotypes.

(B) Changing LD properties of a persistent de novo mutation. A de novo event (circle), when it first occurs on a chromosome (bottom), is on one haplotypic background defined by the chromosomal markers on which it forms (red). As generations pass (moving upward), the event propagates through the population. Recombination events (Xs) occur, reducing the common haplotype (red) on which a variant is present and decoupling it from distal markers (blue).

(C) Simulating LD structure of a de novo event as it becomes a common variant. Here a computer simulation depicts a chromosome with 10,000 common markers with 1,000 randomly assigned hot spots. Random mating occurs here with an average of one recombination event per generation. A single rare variant is introduced in the middle of the chromosome on one individual (bottom) and allowed to propagate through the population. The left panel depicts the allele frequency as it increases through the generations (upwards). In the middle panel, all markers in LD with that variant (with D′ = 1) are indicated with a red dot. Initially that variant is in LD with every common marker that it is in phase with on that chromosome, revealed by the red band stretching across the bottom of the plot. As random recombination events occur and the allele becomes more frequent, the number of markers in phase decreases, revealed by the shrinking red band in the middle. On the right panel, a gray dot indicates markers for which the genotypes correlate with the rare variant (r² > 0.5). For the first few generations, there are no other variants that correlate with the de novo mutation as it becomes a rare allele. As time progresses and the allele becomes more common, it begins to develop genotypic correlations with nearby variants that remain on the same haplotype.
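The drift simulation described in the main text (neutral de novo mutations introduced into a population of 2,000 diploid individuals) and the allele frequency trajectory in panel C can be approximated with a Wright-Fisher model. The following is a minimal sketch under that assumption; it is not the author's simulation code, it ignores recombination and selection, and the function names and defaults are illustrative only.

```python
import random


def simulate_neutral_mutation(n_diploid=2000, generations=200, rng=None):
    """Follow one new neutral mutation and return its allele counts per generation.

    Assumes a Wright-Fisher population of constant size with random mating.
    """
    rng = rng or random.Random()
    n_chrom = 2 * n_diploid
    count = 1  # a de novo mutation starts on a single chromosome
    trajectory = [count]
    for _ in range(generations):
        if count == 0 or count == n_chrom:
            break  # the mutation has been lost or has fixed
        p = count / n_chrom
        # Each chromosome of the next generation inherits the mutant
        # allele independently with probability p (binomial resampling).
        count = sum(1 for _ in range(n_chrom) if rng.random() < p)
        trajectory.append(count)
    return trajectory


def fraction_lost(n_mutations, by_generation, n_diploid=2000, seed=0):
    """Fraction of independent neutral mutations lost within `by_generation`."""
    rng = random.Random(seed)
    lost = 0
    for _ in range(n_mutations):
        trajectory = simulate_neutral_mutation(n_diploid, by_generation, rng)
        if trajectory[-1] == 0:
            lost += 1
    return lost / n_mutations
```

Calling `fraction_lost(1000, 31)` estimates the share of mutations that disappear within 31 generations, and the maximum of `trajectory` divided by `2 * n_diploid` gives the highest allele frequency a single mutation reached. Exact values depend on the random seed and on the modeling assumptions above.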
de novo point mutations that may not be present in any other individuals (Conrad et al., 2011). Common alleles tend to be more ancient than rare ones as it takes many generations for a rare allele to rise to a reasonable allele frequency. There are important exceptions to these generalizations. An ancient allele may be rare because it is being depleted from the population. A common allele may be recent if it confers a critical survival advantage or has emerged after a rapid population expansion from a small founder population.

Linkage Disequilibrium and Haplotypes

Genetic linkage is the tendency of alleles at nearby loci to be transmitted together; two nearby loci are in linkage disequilibrium (LD) when recombination events occur between them very infrequently. Two common metrics quantify pairwise LD between biallelic markers (see Figure 1A). The R-squared (r²) between two markers is their correlation across chromosomes within a population. If two markers have r² = 1, then alleles are always in phase (or in cis) with each other; in a genetic study, their association statistics will be identical. The D-prime (D′) between two markers is inversely related to the fraction of chromosomes that have had historical recombination between them. If D′ = 1, two biallelic variants constitute only two or three haplotypes, whereas if D′ < 1, all four possible haplotypes are present in the population. If D′ = 0 or r² = 0, then the two markers are unlinked and statistically independent of each other.

Recombination events break down pairwise linkage between markers over time and reduce the lengths of haplotypes in a population. Recombination events are much more likely to occur in hot spot regions in the genome than in other regions (Myers et al., 2005). As a result, markers without a recombination hot
Figure 2. Common Variants and Fine-Mapping with Conditional Haplotype Analysis

(A) Common variants. This image illustrates the structure of common variants and LD blocks. The top lists a reference genome spanning 10 kb and the reference genotypes of the polymorphic variants. The haplotype structure is broken up into two blocks by a recombination hot spot. Each block contains a set of markers in tight LD, which can be phased into a small number of haplotypes. Below that, a limited number of genotypes are depicted for a hypothetical individual because a commercial array would assay only a limited collection of all of the common variants in a region. The bottom row demonstrates how data for those genotypes can be phased using reference population data and how missing genotypes can be imputed if the haplotype can be inferred accurately. In some instances, imputed genotypes may be uncertain.

(B) Fine-mapping with conditional haplotype analysis. The left-hand side lists genotypes at ten variant sites (numbered) that define seven common haplotypes. Each row represents a haplotype, and genotypes at variant sites are listed in each column. Assuming that a common variant association is observed at marker 1, identical associations will be observed at the markers 2, 3, and 5 because their genotypes are correlated across haplotypes. In the first step, haplotypes are grouped by marker 1. The result is that the seven haplotypes form two subgroups (indicated by purple and red bars on the right). The purple group demonstrates association with disease (right). Including marker 7 breaks the groups up further into four haplotypes (indicated by purple, green, blue, and red bars on the far right). By adding marker 7, differential risk association between haplotypes is apparent. Whereas the T/G haplotype confers risk, the T/T haplotype confers even more risk. Thus marker 1 alone does not parsimoniously explain all of the risk at that locus.
spot between them are often linked over long periods of time and have high pairwise D′. Those markers can often be grouped into a limited number of common haplotypes (see Figure 2A). Phasing algorithms can be applied to determine markers in cis and to define the most likely haplotypes.

Rare alleles generally sit on long haplotypes whereas common alleles sit on shorter ones. When a mutation first occurs de novo on a chromosome, it occurs on the background of a single rare haplotype defined by all markers on that chromosome (see Figure 1B). Because the de novo mutation appeared as a random event, it initially has no correlation with other markers on that chromosome (r² = 0). In initial generations, prior to a recombination event, the mutation has D′ = 1 with other markers across the chromosome. But, if the mutation survives generations and becomes a common allele, repeated recombination events fragment that haplotype and reduce its length. The allele retains high D′ to only proximate markers that are not separated from it by a recombination hot spot. As the variant becomes more frequent, so does the haplotype that it occurs on; over time the emerging variant develops correlation (r² >> 0) with the markers on that short haplotype (see Figure 1C).

Finding Pathogenic Variants, Both Rare and Common

Common variant associations to phenotype are often facile to find. Their high frequencies allow case-control studies to be adequately powered to detect even modest effects. Their high r² to other proximate common variants allows for association signals to be discovered by genotyping the marker itself or other nearby correlated markers. But mapping those associated variants to the specific causal variant that functionally influences disease risk can be challenging because the statistical signals invoked by intercorrelated variants are difficult to disentangle.

On the other hand, individual rare variant associations are challenging to find. Their low frequency renders current cohorts underpowered to detect all but the strongest effects, and lack of correlation to other markers often prevents them from being picked up by standard genotyping marker panels. But, once a rare associated variant is observed, mapping the causal rare variants is relatively facile because recent ancestry is likely to limit the number of intercorrelated markers.

Functional Properties of Pathogenic Variants, Both Rare and Common

Because common alleles tend to be ancient, they have weathered the influences of purifying negative selection. Therefore, common variants that influence disease risk are likely to have functionally modest effects that are compatible with their high population frequency. There are two possibilities outlined by Kryukov et al. that might allow for this (Kryukov et al., 2007). First, common variants that are medically detrimental act subtly or specifically to confer disease without altering evolutionary fitness. As an example, consider a variant that confers risk of
Box 1. Glossary

Associated allele: An allele that, in a genetic study, is observed to have differential allele frequencies in cases compared to controls. The presence of an association suggests that it, or some other variant in LD, is influencing disease susceptibility.

Causal allele: The functional allele that influences disease susceptibility and explains the observed associated allele.

Common alleles: Alleles with a high population frequency, typically defined as >1%. Standard marker panels can often be used to identify common allele associations.

Rare alleles: Alleles with a lower allele frequency of <1%. These alleles can be polymorphic in the population, being seen in multiple distantly related individuals; alternately they might be alleles that are private to an individual or seen in a small number of closely related individuals.

De novo mutations: A mutation that has occurred in an individual and that was not inherited from a parent. These mutations are initially private. If a de novo mutation is passed on and persists through generations, it can become a polymorphic allele.

Linkage disequilibrium (LD): Two polymorphic loci are in LD when they are colocated, and alleles at those loci are distributed nonrandomly with respect to each other on chromosomes in the population. Linkage disequilibrium is present when recombination events between two loci occur infrequently. Two metrics for LD are r² and D′ (see Figure 1A).

Recombination hot spots: Individual regions within the genome that have frequent recombination events.

Negative selection: Selection acting to remove new deleterious mutations that reduce evolutionary fitness of an individual. Also known as purifying selection.

Positive selection: Selection acting to propagate new advantageous mutations that increase evolutionary fitness of an individual.

Balancing selection: Selection acting to increase allelic variability at a locus.

Genotype imputation: A statistical technique to infer missing genotypes in a set of individuals using a reference panel of genotyped individuals. Imputation exploits LD between genotyped and ungenotyped variants.

Genome-wide significance: A level of statistical significance typically used to establish association for a common variant in genome-wide association studies (p = 5 × 10⁻⁸), which assumes that there are 1,000,000 effective independent tests genome-wide.

Stratification: A genetic confounder if there are differences in the ancestral origin of cases and controls. The resulting systematic allele frequency differences can result in false-positive associations.

Genomic inflation factor (λ): The ratio of the median of the observed chi-square statistics for an association study and the expected median chi-square statistic. If there is stratification, the test statistic is inflated, causing the genomic inflation factor to be substantially greater than 1, resulting in inappropriately significant p values.

Fine-mapping: The use of dense genotyping data around an associated allele to identify the causal allele(s) to account for the observed statistical signal in the region.

Second-generation sequencing: Recent sequencing technologies not using Sanger chemistry that characteristically generate many short read sequences.

Targeted region: The region of the genome selected for a sequencing experiment.

Whole-genome sequencing: A sequencing experiment where the full 3 Gbp of whole genome is sequenced. Does not require DNA capture. For most medical genetic studies, the sequencing data are not reassembled but mapped to a reference genome sequence.

Whole-exome sequencing: A sequencing experiment where the protein-coding sequences of all known genes are targeted, captured, and sequenced (30 Mbp).

Coverage: In a sequencing experiment, coverage at a genomic position is the total number of reads mapped to that position.
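To make the glossary definition of coverage concrete, the sketch below illustrates two quantities under simple assumptions: the binomial probability of seeing at least a given number of nonreference reads at a heterozygous site, and the fraction of a targeted region that reaches a coverage threshold. It assumes error-free reads sampled evenly from both chromosomes, which real data violate. The function names and the `min_alt_reads`-style cutoff are hypothetical, and the default threshold of 20 simply mirrors the coverage guideline given later in this Primer.

```python
from math import comb


def prob_at_least_k_alt_reads(depth, k, alt_fraction=0.5):
    """P(at least k reads carry the nonreference allele) at a heterozygous site.

    Assumes each of `depth` reads independently samples the nonreference
    chromosome with probability `alt_fraction` (0.5 for an ideal diploid site).
    """
    return sum(
        comb(depth, i) * alt_fraction**i * (1 - alt_fraction) ** (depth - i)
        for i in range(k, depth + 1)
    )


def fraction_covered(depths, threshold=20):
    """Fraction of target positions whose coverage is >= threshold."""
    if not depths:
        return 0.0
    return sum(1 for d in depths if d >= threshold) / len(depths)
```

Comparing `prob_at_least_k_alt_reads(8, 3)` with `prob_at_least_k_alt_reads(20, 3)` shows how the chance of missing a heterozygote shrinks as depth grows, while `fraction_covered` summarizes nonuniform coverage more informatively than an average depth would.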
addiction to tobacco (Thorgeirsson et al., 2008). Such a variant might have little impact on survival historically but might have specific neuropsychiatric effects that mediate the risk of 21st century diseases such as lung cancer or coronary artery disease that play a role later in life after reproduction. Second, forces that select specifically for these common variants counteract their medically detrimental qualities; the variant, although causing disease, also offers evolutionary benefit simultaneously. For example, common ApoL1 variants that confer high risk of chronic kidney disease in African Americans protect from Trypanosoma brucei rhodesiense infection at the same time (Genovese et al., 2010).

Because rare alleles are typically more recent, they may not have been subjected to the same negative selective pressures yet and may include among them more relatively deleterious mutations. Rare alleles therefore often are enriched for those variants more likely to have more dramatic functional consequences. This is supported by data indicating that rare deletions are more likely than more common deletions to remove entire genes, exons, promoters, or stop codons (Conrad et al., 2010). Similarly, rare variants are twice as likely as common ones to be nonsynonymous (1000 Genomes Project Consortium, 2010). Because rare variants are relatively unrestricted in terms of their functional impact in general, a subset of rare pathogenic variants with large effect might offer more obvious insight about disease mechanism.

Common Variants

Detecting Common Variants with High-Throughput SNP Arrays

High-throughput genotyping of standard marker panels of common single-nucleotide polymorphisms (SNPs) has become possible with microarrays (Gunderson et al., 2005). Their application to large case-control sample collections has facilitated detection of even the most modest risk alleles, with odds ratios of 1.1 or less.
There are a finite number of common variants present in the general population, i.e., <6 million are estimated in European populations (1000 Genomes Project Consortium, 2010). But nearby common SNPs are in LD with one another and define a limited number of haplotypes (see Figure 2A), so the effective number of independent variants is much fewer. Thus, genotyping a limited number of common variants genome-wide has the effect of covering many more common variants. In European populations, the Affymetrix 5.0 array with 440K SNPs has r² > 0.8 for 57% of common variants, and the Affymetrix 6.0 array with roughly double the number of SNPs (900K) has r² > 0.8 for 66% of common variants (Bhangale et al., 2008).
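The r² values quoted for array coverage are pairwise LD statistics between a genotyped marker and an ungenotyped variant. As a minimal sketch, the two LD metrics illustrated in Figure 1A can be computed from the four haplotype frequencies of a pair of biallelic markers using the standard definitions of D, D′, and r². The function name and argument order below are illustrative assumptions, not part of any cited software.

```python
def ld_metrics(f11, f12, f21, f22):
    """Return (r2, d_prime) for two biallelic markers.

    f11: frequency of haplotypes with allele 1 at both markers
    f12: allele 1 at the first marker, allele 2 at the second
    f21: allele 2 at the first marker, allele 1 at the second
    f22: allele 2 at both markers
    """
    if abs(f11 + f12 + f21 + f22 - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p1 = f11 + f12  # frequency of allele 1 at the first marker
    q1 = f11 + f21  # frequency of allele 1 at the second marker
    if p1 in (0.0, 1.0) or q1 in (0.0, 1.0):
        raise ValueError("both markers must be polymorphic")
    d = f11 - p1 * q1  # deviation from independence
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p1 * (1 - p1) * q1 * (1 - q1))
    return r2, d_prime
```

With four equally frequent haplotypes (0.25 each) both metrics are zero. With haplotype frequencies of 0.5, 0.25, 0, and 0.25, one haplotype is absent, so D′ is 1 while r² stays well below 1. This is why a marker can share a haplotype with a variant and still tag it poorly.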
Genome-wide genotyping also allows investigators to use imputation to estimate genotypes of markers not directly genotyped; in doing so, it becomes possible to combine samples genotyped on different platforms. Probabilistic multipoint imputation algorithms, using a limited number of genotyped common variants, can determine the genotypes of ungenotyped common variants by comparing to a reference panel of comprehensively genotyped individuals (see Figure 2A). Most of these methods currently use probabilistic Hidden Markov Model approaches to infer the local LD structure (Browning, 2008; de Bakker et al., 2008).

Selecting Populations for Study

Initial efforts to map complex traits emphasized selected isolated populations, for example the Finnish populations (Peltonen et al., 2000). These populations can offer the advantage of increased inbreeding, more uniform genetic and environmental backgrounds, detailed genealogical records, availability of intact extended families, and longer LD intervals. Populations that have undergone rapid population expansion may be of particular use because LD intervals are longer. The most successful validation of this approach is represented by deCODE genetics and their study of a wide range of complex diseases in Iceland.

Now, investigators are increasingly focused on the inclusion of individuals from multiple ethnic backgrounds in order to enhance the ability of studies to discover risk alleles with variable allele frequencies across different backgrounds (Rosenberg et al., 2010). Different ethnic backgrounds might highlight different mechanisms of disease pathogenesis, including differences in environmental exposures, as well as reflect different degrees of genetic diversity and LD patterns.
A striking example of this is the discovery of an IL28B variant that predicts response to hepatitis C treatment with equivalent effect in European, African, and Hispanic American patients; allele frequency differences of the variant explain about half of the differences in treatment response across populations (Ge et al., 2009).

Genome-wide Association Studies

In a case-control genome-wide association study (GWAS), samples are genotyped for a set of 100,000–2,000,000 markers; case and control allele frequencies are compared directly to each other. Statistical significance is assessed with a simple 2 × 2 chi-square test or with logistic regression when genotypes are probabilistic (e.g., from imputation).

Critical to the success of GWAS has been the application of stringent statistical significance thresholds that result in reproducible associations that account for the large number of simultaneous tests (Risch and Merikangas, 1996). Testing for common variant associations throughout the genome represents 1 million independent tests (Hoggart et al., 2008). Thus investigators routinely use a genome-wide significance threshold representing a Bonferroni correction for multiple tests (p = 0.05/10⁶ = 5 × 10⁻⁸).

Because effect sizes for most common variants are modest, large sample sizes and careful adjustment for subtle technical artifacts that can easily obscure results or produce false-positive associations are of paramount importance (Balding, 2006; Clayton et al., 2005; McCarthy et al., 2008). The genomic inflation factor is an important metric that indicates the extent of inflation due to stratification and other technical confounders. Fortuitously, the strength of genome-wide genotyping goes beyond simply measuring case-control allele frequency differences throughout the genome. It also allows investigators to look at patterns in the genotyping data to identify key technical confounders. For instance, patterns of excessive ‘‘missing’’ genotype data for an individual indicate that intensity data could not be clustered into genotypes, likely as a function of low DNA quality or concentration. Another key confounder is population stratification, that is, the presence of systematic allele frequency differences observed in a population as a consequence of ancestry rather than case-control status. As a dramatic example, Campbell et al. showed, even in studies using only European populations, that not carefully adjusting for an individual’s country of origin results in a highly statistically significant false-positive association for height at a lactase SNP (Campbell et al., 2005). Genome-wide genotype data allow investigators to identify and correct for case-control population stratification.

Once markers are identified as having statistically significant allele frequency differences in cases and controls, they are ideally replicated in independent populations. Replicating in an independent population not only adds statistical confidence to the results but also adds confidence that the results of the initial study are not the consequence of technical confounding or stratification.

Identifying an associated marker rarely clarifies whether the marker itself is the functional allele that causes altered disease susceptibility. The observed association at a marker might be the result of an underlying causal allele with high r² with the associated variant, a rare functional allele on a haplotypic background shared with the associated variant, or multiple functional alleles that cause an apparent association. Nevertheless, the causal alleles must closely correlate and be in LD with associated variants.
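The statistics described in this section can be sketched in a few lines. The example below assumes allele counts (not genotype counts) in a 2 × 2 table, uses the Pearson chi-square statistic with one degree of freedom, and derives the genome-wide threshold and genomic inflation factor as defined in the text and in Box 1. All function and constant names are illustrative, and the sketch omits the quality control and stratification corrections that a real study needs.

```python
from math import erfc, sqrt
from statistics import median

# Bonferroni threshold for ~1 million independent tests (5e-8).
GENOME_WIDE_ALPHA = 0.05 / 1_000_000

# Median of a chi-square distribution with 1 degree of freedom.
MEDIAN_CHI2_1DF = 0.4549364


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 df) on a 2 x 2 table of allele counts.

    Returns (chi2, p_value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column must contain observations")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper tail of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Allelic odds ratio for the alternate allele in cases versus controls."""
    return (case_alt * control_ref) / (case_ref * control_alt)


def genomic_inflation_factor(chi2_stats):
    """Ratio of the observed median chi-square to its expected median."""
    return median(chi2_stats) / MEDIAN_CHI2_1DF
```

A marker would be declared genome-wide significant in this sketch when its `p_value` falls below `GENOME_WIDE_ALPHA`, and a `genomic_inflation_factor` substantially above 1 across all tested markers would suggest stratification or another systematic confounder.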
Fine-Mapping Common Variant Loci

Dense genotyping of markers in the region, followed by fine-mapping, can identify the causal allele, or at least reduce the number of potential candidates. The underlying assumption is that the causal allele will most parsimoniously explain the entirety of the evidence of association. In many instances, however, fine-mapping is complicated if the association is not being driven by a marker that has been genotyped; in those instances, it might be possible to identify a risk haplotype defined by genotyped markers and to then sequence selected individuals to identify the causal allele. Thus in order to fine-map effectively, dense genotyping to include all known markers in the region is key. Additionally, in many instances there might be multiple causal alleles, and in order to be powered to detect multiple effects, it is often necessary to densely genotype a large number of samples, perhaps more than those used to discover the association.

After densely genotyping a large number of samples, there are two major statistical tools utilized in fine-mapping common variants. The first is conditional regression. If a single lead marker (or another marker in perfect LD with it) is causal, then applying conditional regression adjusting for that lead marker should obviate all other association in the region. The second statistical tool is conditional haplotype analysis (Figure 2B). With conditional haplotype analyses, investigators start with data from a subset of the genotyped markers and phase genotypes to define haplotypes. If the selected markers are causal, then the defined haplotypes should parsimoniously explain the risk at that locus. That is, the addition of further markers (and thus creation of more haplotypes) should not explain risk better, and removal of any marker (and thus removal of haplotypes) should reduce the explained risk. With both approaches, if the causal allele is in perfect LD (r² = 1) with other markers, then distinguishing between statistically identical associations may not be possible.

One striking example of fine-mapping was an effort by Pereyra et al. where they used GWAS to demonstrate that multiple HLA-B classical alleles are associated with long-term viral load control in HIV-infected individuals (Pereyra et al., 2010). Then, with conditional haplotype analysis, they demonstrated that allelic risk was best defined by amino acid variation at a few sites along the binding groove of HLA-B.

Data from multiple ethnic populations may be particularly useful to fine-map associations (Rosenberg et al., 2010). Ideally a single allele might explain risk across multiple ethnic groups. This approach is effective only if the same causal allele is present with a high allele frequency in both, and there are ethnic differences in local LD structure. The inclusion of African populations might be particularly useful because LD patterns are generally shorter. This approach might be complicated if multiple different alleles in populations influence disease susceptibility within the same locus. Adrianto et al. looked at SNPs associated with systemic lupus erythematosus (SLE) spanning the TNFAIP3 gene (Adrianto et al., 2011). When they looked at markers associated in Asian and European populations, they were able to fine-map the associated region from a span of 100 kb to 50 kb. Subsequent sequencing identified a novel AA > T single base pair deletion polymorphism that acts to disrupt an NF-κB binding site.
This single variant explained the associated risk of the locus. Rare Variants It is possible that associated rare variants for complex diseases will be more facile to fine-map and to evaluate for functional impact. The discovery of a rare variant near a common variant might be particularly informative. A rare variant that clearly impacts one of the candidate genes implicated by a common variant might clarify which of the candidate genes is pathogenic. Furthermore, the rare variant’s function might offer clues about the mechanism of the common variant. There have been several examples of this phenomena reported in the literature already. Common alleles associated with type II diabetes are near five genes, PPARG, HNF1A, KCNJ11, WFS1, and HNF1B, that have rare mutations that cause familial forms of diabetes (Voight et al., 2010). Similarly, 18 of the 95 known common variants associated with serum lipid levels are near genes that have been implicated in monogenic lipid disorders (Teslovich et al., 2010). Indeed studies to find rare coding variants near common risk loci have already shown success in type I diabetes (Nejentsev et al., 2009), age-related macular degeneration (S.R. and J. Seddon, unpublished data), and Crohn’s disease (Momozawa et al., 2011). The extent to which rare variants explain complex disease susceptibility in general remains an open question. It has been speculated that the gap between the heritability explained by known common variants and that which might be predicted from family studies might be explained by rare variants (Bansal 62 Cell 147, September 30, 2011 ª2011 Elsevier Inc.
et al., 2010), and that even many observed common variant associations might be the consequence of functional undiscovered rare variants (Anderson et al., 2011; Dickson et al., 2010). Other investigators have suggested that undiscovered common variants themselves might explain much of that missing heritability (Purcell et al., 2009; Yang et al., 2010).

Identifying Rare Variants with High-Throughput Sequencing

Advances in DNA capture and sequencing technology have greatly facilitated targeted, exome, and whole-genome sequencing (Maxmen, 2011; Ng et al., 2010) and have in the process enhanced the search for rare variants. Whereas the cost of sequencing is rapidly dropping, the computational and statistical challenges of rapidly aligning reads to reference sequences, separating variant calls (SNPs, indels, and structural variants) from sequencing artifacts, storing data, and establishing associations are mounting (McKenna et al., 2010). Second-generation sequencing technologies have now come online and are distinct from prior approaches in that they do not use Sanger chemistry and are characterized by high sequencing yield with shorter reads (Shendure and Ji, 2008). The Illumina HiSeq 2000 system, for example, generates >1 billion 100 bp paired-end usable reads per run. Accurately and efficiently mapping a large volume of short reads to the reference genome has been an important area of methodological progress (Li and Homer, 2010). Look-up (or hash-table) based methods map reads quickly but are not as accurate as less-efficient alignment-based methods. Accurate alignment is especially important in regions with short insertions or deletions (indels); poor alignment in such regions can result in false-positive SNP calls and false-negative indel calls. Repetitive genomic regions and regions with homology can be challenging to map and, in some instances, may not be possible to query effectively.
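As a rough illustration of the look-up idea and of why repeats are hard to map, the toy sketch below indexes every k-mer of a reference in a hash table and looks up a read's first k-mer as a seed. It is a simplified teaching example, not the algorithm of any published aligner; the function names, the toy reference string, and the choice of k = 8 are all assumptions made for illustration.

```python
from collections import defaultdict

def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the 0-based positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return dict(index)

def candidate_positions(read, index, k):
    """Return candidate start positions for a read, using exact matches of
    its first k bases (the seed) against the hash table."""
    return index.get(read[:k], [])

# Hypothetical toy reference in which the segment ACGTACGT occurs twice.
reference = "TTACGTACGTGGCCACGTACGTAA"
index = build_kmer_index(reference, k=8)

unique_hits = candidate_positions("GTGGCCAC", index, k=8)  # seed occurs once
repeat_hits = candidate_positions("ACGTACGT", index, k=8)  # seed occurs in both copies
```

In this toy case the first read has a single candidate placement, whereas the second read falls in the repeated segment and has two equally good placements, so the seed alone cannot decide where it belongs.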
Paired-end sequencing generates two sequence reads from opposite ends of the same contiguous genomic fragment and helps overcome some of these alignment issues. To sensitively and accurately call a heterozygous nonreference base, a minimum of 20× coverage is necessary to overcome the uncertainty resulting from sampling short sequence reads across a diploid genome. Additional coverage may be necessary to compensate for random and nonrandom sequencing error, which may vary across technologies. Even with a high-coverage sequencing experiment, the coverage is typically nonuniform across the targeted region. Nonuniform coverage can be related to biases in DNA capture technologies, to unequal pooling of amplicon products from different genomic regions or individuals, and to intrinsic sequence properties (Harismendy et al., 2009). Careful experimental technique and sample normalization can minimize some biases in coverage. Average coverage of an experiment is thus not as useful a metric as the percentage of the target genomic region achieving more than a prespecified coverage threshold. Independently genotyping a set of SNPs is a useful way to verify sequence-based genotype calls and to assess the accuracy of sequencing studies. Sequencing can be applied to a set of samples to discover variants or to genotype variants. For variant discovery, sequence data can be pooled across multiple samples to boost power to detect a nonreference base. After application of sequencing to
Figure 3. Power to Find Rare Variants and Burden Testing (A) Power to find rare variants. Here is a plot of 80% power to discover rare associated alleles at $p < 10^{-7}$ and $p < 10^{-11}$ for cohorts of both 500 and 5000 cases and controls. The control allele frequency and odds ratio (OR) are plotted along the x axis and the y axis, respectively. Diagonal lines indicate corresponding case allele frequencies. (B) Burden testing. Here data from sequenced cases (top) and controls (bottom) are depicted around a gene of interest. Each horizontal line represents an individual. Variants are shown as red Xs. Certain variants are rare (i.e., seen once), and others are more common (vertical line). In this example, the case variants within the candidate gene (arrow at bottom and blue shading) are seen more frequently than in controls. If common variants are excluded, there are five case chromosomes with a rare variant compared to one control chromosome. This pattern of enrichment is not evident outside the gene. A burden test of association for rare variants within the gene might be statistically significant.
discover rare variants, confirming the presence of the variant in discovery samples with TaqMan or capillary electrophoresis sequencing is useful before exploring in independent samples to establish disease association.

Power Considerations and Significance Testing

One of the challenges to establishing a rare variant disease association is that in any given study, few copies of a rare variant are observed. Therefore, genetic studies are more poorly powered to detect a rare SNP association than they are to detect a more common association with the same effect size (see Figure 3). Thus, to detect associations at the same statistical threshold, sample collections larger than those currently used might be necessary. Establishing association of de novo or private mutations may not be possible at all because they may be seen only once in an entire study.

For rare variant associations, the field has not yet defined accepted standards for statistical significance that account for the burden of multiple hypothesis testing. Because there are many more rare variants than common ones, and they are not typically correlated with each other, a more stringent threshold may be necessary than that applied for common variants. One conservative approach is to correct for the total number of bases genome-wide, i.e., to use $p = 0.05 / (3 \times 10^{9}) \approx 10^{-11}$ as a significance threshold. Most recent studies have limited themselves to exomes or to a subset of targeted genes; in these instances the multiple-hypothesis testing burden might be significantly less. But with the prospect of genetic studies based on whole-genome sequencing in the very near future, this conservative threshold may ultimately turn out to be appropriate.

Despite limitations in power and the need for achieving greater significance, rare variant associations with strong effects might be readily detectable. For instance, as part of a genomewide study, Holm et al. 
were able to identify a rare variant for sick sinus syndrome (Holm et al., 2011); the coding variant that explained the association was highly statistically significant in a modestly sized cohort as it had such a large effect size (odds ratio [OR] > 12). One strategy to further enhance the prospects of discovery is to identify those individuals most likely to have
highly penetrant rare mutations. For example, individuals with younger onset or more severe disease, those with familial forms of disease, or those who have disease despite a lack of other clinical or genetic risk factors might be promising candidates for rare variant association studies.

Burden Testing

If a genomic region is critical to disease pathogenesis, rare mutations within it may modulate disease susceptibility. Affected individuals may then carry rare mutations in that region more frequently, though the mutations may be different from and unrelated to one another. This concept has sparked interest in the genetics community, and workers in statistical genetics have devised strategies to examine rare variants in aggregate across a target region (Bansal et al., 2010). These "burden" tests assess whether rare variants within a specific region are distributed in a nonrandom way, suggesting that they might be playing a role in disease pathogenesis (see Figure 3B). For example, a simple burden test might assess whether cases are enriched for rare variants compared to controls. More sophisticated tests account for the possibility that the region contains both protective and risk-conferring mutations. The target region might be a specific subregion of a gene, an entire gene transcript, or the entire genome. This approach is an important alternative to the challenging task of establishing the association of individual rare variants; using these approaches to test multiple variants simultaneously might enhance power over testing individual variants. For instance, a burden test might be able to identify nonrandom distributions even of private mutations. In an early application of rare variant burden testing, Cohen et al. 
examined individuals from the general population with high and low high-density lipoprotein (HDL) levels and assessed the burden of rare variation in three candidate genes known to harbor Mendelian mutations that cause familial low serum HDL levels (Cohen et al., 2004). They found that individuals with low HDL levels were significantly more likely to carry rare nonsynonymous mutations than those with high HDL levels; of the low HDL individuals, 16% had at least one rare mutation, compared to 2% of high HDL individuals. This suggested
strongly that about 14% of individuals with low HDL levels may have mutations in these three genes mediating the phenotype. The idea of comparing the proportion of case individuals with rare alleles to the proportion of control individuals with rare alleles was formalized into a statistical test, the "Cohort Allelic Sums Test" (CAST) (Morgenthaler and Thilly, 2007). Subsequently, more sophisticated tests have been proposed that allow investigators to combine association testing of rare and common alleles by either testing for association together in multivariate tests (Li and Leal, 2008) or combining rare and common alleles weighted inversely to their allele frequency (Madsen and Browning, 2009). One very powerful way of enhancing burden testing is to filter variants that are more likely to be causal from those that are unlikely to be causal. For example, investigators may focus their studies on nonsynonymous alleles. Alternative approaches might include filtering variants based on sequence conservation properties or other bioinformatics approaches (Adzhubei et al., 2010; Ng and Henikoff, 2003). A successful test, where statistical significance is obtained, can be used to argue that (1) the tested rare variants play a role in a specific disease, and (2) the target region tested plays an important role in disease pathology. However, it fails to implicate specific variants, and ambiguity about the causal variants might remain. For example, if rare variants are enriched in a gene 2-fold in cases compared to controls, then roughly half the variants seen in cases might be pathogenic, but the other half are part of the background distribution of rare variation in that gene and may not influence disease risk. 
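The carrier-proportion comparison underlying a simple burden test can be sketched in a few lines. The example below is a minimal illustration only: it uses a one-sided Fisher's exact test as a stand-in for the comparison of carrier proportions and is not claimed to be the published CAST statistic. The function names are hypothetical, and the counts (16 of 100 versus 2 of 100) are invented to match the percentages quoted above, not the actual sample sizes of Cohen et al.

```python
from math import comb

def fisher_exact_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact p-value: the hypergeometric probability of
    seeing at least `case_carriers` carriers among cases, given the margins."""
    total_carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    denom = comb(total, n_cases)
    p = 0.0
    for k in range(case_carriers, min(total_carriers, n_cases) + 1):
        # k carriers fall among the cases; the remaining carriers are controls
        p += comb(total_carriers, k) * comb(total - total_carriers, n_cases - k) / denom
    return p

def pathogenic_fraction(fold_enrichment):
    """Rough share of case variants in excess of the background rate,
    e.g. 2-fold enrichment implies about half."""
    return 1.0 - 1.0 / fold_enrichment

# Hypothetical counts: 16% of 100 low-HDL versus 2% of 100 high-HDL individuals.
p_value = fisher_exact_one_sided(16, 100, 2, 100)
excess_carriers = 16 / 100 - 2 / 100   # the 14% excess discussed in the text
half = pathogenic_fraction(2.0)        # 0.5 for a 2-fold enrichment
```

With these invented counts the enrichment is nominally significant, but the test says nothing about which individual variants are responsible, which is the ambiguity described above.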
Structural Variants

Rare structural variants have gained recent interest; the frequency and size of structural variants have repeatedly shown enrichment in schizophrenia and other neuropsychiatric diseases (International Schizophrenia Consortium, 2008; Sebat et al., 2007; Walsh et al., 2008). However, except for a few specific regions such as 22q11 and 16p11, most rare events have uncertain pathogenicity. For instance, although the rates of >100 kb deletion events are significantly increased in cases compared to controls, there is great uncertainty as to which individual events are pathogenic and which ones are nonpathogenic events that might occur in the general healthy population. This is analogous to the circumstance that might occur with a statistically significant burden test for point mutations, described above.

Extended Haplotypes

As previously discussed, many rare variants are recent and occur on extended haplotypes that can be identified using common variant markers. Thus GWAS datasets may be used to identify long-range haplotypes based on common markers and to then assess whether they are associated with phenotype. If this is the case, the phenotypic association might be driven by a highly penetrant rare variant. We used this approach to find an extended haplotype in the CFH gene that conferred high risk of age-related macular degeneration; subsequent sequencing identified the causal mutation to be an arginine to cysteine change in the C terminus of the protein (S.R. and J. Seddon, unpublished data). This approach might be most effective in isolated populations where reduced genetic diversity and founder effects make it possible to identify long-range haplotypes (Kong et al., 2008).
One recently published method to identify long and rare haplotypes, and to then test for association to phenotype, has been successfully applied to multiple phenotypes in out-bred populations (Gusev et al., 2011).

From Variants to Function

Translating rare and common variants to function can be challenging. In many instances the presence of an association does not clarify which variants are functionally causing disease susceptibility. For common variants, fine-mapping might be stymied by local LD. For rare variants, burden testing might be able to identify a genomic region enriched for rare variants but may not be able to specifically distinguish the individual causal rare variants from spurious nonpathogenic variants. Here we describe broad approaches that might be pursued to clarify pathogenic functions and causality, in the absence of genetic mapping that has clearly identified a single causal variant.

Evaluating Nonsynonymous Coding Variants

About 1% of the genome consists of protein-coding sequences. Variants in this portion of the genome are potentially the most amenable to follow up by biochemical characterization of the protein product in vitro, characterization in cell lines, or evaluation in transgenic model organisms. Only a minority of associated common variants can be explained by a nonsynonymous coding variant (10%) (Hindorff et al., 2009). Currently, most studies of rare variation emphasize nonsynonymous coding variants; in many cases, noncoding variants are altogether ignored even if they are sequenced. An important challenge in the field is to prioritize discovered coding variants for potentially time-consuming functional follow up. Computational approaches can be effective at assessing the degree to which a specific amino acid substitution in a protein, induced by a variant, might disrupt function. 
The functional impact of a substitution can often be estimated by using information about sequence conservation at the mutated site from comparative sequence analysis of a gene with orthologs and paralogs. If an amino acid site in a protein sequence is functionally critical, then most de novo mutations are deleterious and are subject to purifying selection; these sites then are expected to show little variation. Thus, a nonsynonymous allele found in a study at a highly conserved site is likely to be deleterious. Sequence conservation in organisms more closely related to human is particularly informative because more distantly related organisms may have divergent biology and protein function. Many software tools using these principles to assess coding variants have now been devised (Cooper and Shendure, 2011). One example of such a program is Polymorphism Phenotyping 2 (or PolyPhen 2) (Adzhubei et al., 2010). The most predictive features in this method are the estimated likelihood that the mutant allele fits the substitution pattern observed in the multiple-sequence alignment; the evolutionary distance to the organism with a protein harboring a similar nonsynonymous substitution; and whether the mutant allele occurs at a site that is hypermutable. The method uses these features and others, including information from the three-dimensional protein structure, to define a statistical model that includes the probability of disease based on a catalog of known pathogenic Mendelian mutations. The functional importance of an amino acid replacement is predicted
from these features based on a naive Bayes classifier. PolyPhen 2 and other related methods demonstrate similar performance in their ability to predict pathogenic mutations, achieving an area under the curve (AUC) of 75%–80% (Hicks et al., 2011).

Experimental approaches to individually interrogate rare variants with functional assays can also be very powerful. But, for an approach to be effective, it is critical that the functional assay is high throughput, and that the assayed function is relevant to the phenotype. Otherwise, mutations that affect the assayed gene function might not in fact be pathogenic. In one application of this approach, Davis et al. examined individual mutations within the TTC21B gene and showed that they cause human ciliopathies (Davis et al., 2011). First they demonstrated that a translation-blocking morpholino specific for TTC21B resulted in gastrulation defects in zebrafish that were consistent with ciliary dysfunction. Then, when they resequenced TTC21B in a large, clinically diverse ciliopathy cohort and matched controls, they observed a similar frequency of rare variants. But, when they tested those rare alleles to identify those that caused gastrulation defects in zebrafish, they observed a significant enrichment of functional alleles in cases compared to controls.

Evaluating Noncoding Variants

Noncoding variants pose a particular challenge to the field at the moment. The noncoding genome represents 99% of the genome and at present is poorly annotated (Alexander et al., 2010). About 10% of the noncoding genome is under purifying selection, suggesting that it harbors critical processes that if disrupted could be pathogenic (Davydov et al., 2010). Many common variants, if they contribute to disease, likely act by impacting the noncoding genome. As one example, an associated Crohn's disease SNP in LD with a polymorphic deletion overlaps the IRGM gene promoter and modulates gene expression (McCarroll et al., 2008). 
In the last several years, however, several promising approaches have emerged to evaluate noncoding variants that might point the way to causality, such as analyzing sequence conservation, gene expression, and chromatin state.

Sequence Conservation

A computational approach to prioritizing noncoding variants is to identify those that are at sites with a high degree of sequence conservation across mammalian organisms and are thus under purifying negative selection (Cooper et al., 2005; Miller et al., 2007). These approaches differ from those used to prioritize coding substitutions, as they can only use nucleotide sequence similarity. Indeed, investigators have argued that the conservation information from nucleotide sequences is as predictive as the information gained by peptide sequence similarity and protein structural features (Cooper et al., 2010). The value of assessing common variants with sequence conservation approaches is uncertain, as common variants are presumably not under purifying negative selection. But, rare noncoding variants that have dramatic effects on disease susceptibility might be effectively prioritized with this approach.

eQTL Data Can Suggest Causal Genes and Mechanism

Expression quantitative trait loci (eQTL) are genetic variants that correlate with the transcript level of a gene (Jansen and Nap, 2001). To date, most reported eQTLs are cis-effects, acting on nearby genes by encoding variants that modulate promoter
activity, enhancer activity, or mRNA stability. Expression QTL acting in trans have been largely unexplored thus far. Although most recently discovered eQTL have been common variants, there is evidence of rare eQTL also (Montgomery et al., 2011). Identifying rare eQTL might be challenging given the limited power of currently sized cohorts. In the future, burden tests previously described might be able to effectively identify small genomic regions where rare variants dramatically impact transcript levels. It has been shown that common trait-associated variants have a significant overlap with eQTL, suggesting the possibility that many common disease variants act by altering transcript levels (Nicolae et al., 2010). Thus, it might be insightful to assess whether a specific disease-associated common variant is itself an eQTL. If it is, then the gene whose transcript is influenced by the risk allele might be the causal gene. Furthermore, if the risk allele is increasing the transcript level, then the gene may increase disease risk by magnifying gene function; alternatively, if the risk allele reduces transcript level, then the gene may cause disease by mitigating gene function. A convincing eQTL effect can be isolated by transfecting constructs with risk haplotype fragments, as was done to identify the causal variant in the SORT1 lipid locus (Musunuru et al., 2010). Another compelling example of an eQTL that influences disease susceptibility is a type II diabetes-associated variant upstream of the KLF14 transcription factor. Investigators showed that this variant acts not only as a cis-eQTL influencing KLF14 levels in adipose tissue but also as a trans-eQTL for many genes regulated by KLF14 that are important in metabolic traits (Small et al., 2011). There are a few important caveats about this seemingly straightforward approach. 
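Before turning to those caveats, the basic check described above (is a disease-associated variant itself an eQTL, and in which direction does the risk allele move the transcript?) can be sketched as a regression of expression on risk-allele dosage. This is a bare-bones illustration under stated assumptions: the function name and the six data points are hypothetical, and a real analysis would also assess statistical significance and adjust for covariates such as ancestry and batch.

```python
def eqtl_slope(dosages, expression):
    """Ordinary least-squares slope and Pearson correlation of transcript
    level on risk-allele dosage (0, 1, or 2 copies per individual)."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Hypothetical data: risk-allele dosage and normalized transcript level
# for six individuals (values invented for illustration).
dosages = [0, 0, 1, 1, 2, 2]
expression = [5.0, 5.2, 6.1, 5.9, 7.0, 7.2]

slope, r = eqtl_slope(dosages, expression)
if slope > 0:
    direction = "risk allele raises transcript level"
else:
    direction = "risk allele lowers transcript level"
```

A positive slope would be read, per the reasoning above, as the risk allele increasing transcript level; a negative slope as the risk allele reducing it.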
First, because eQTL are spread throughout the genome, spurious overlap between disease-associated variants and eQTL is possible (Nica et al., 2010). If a risk variant confers risk by modulating transcript levels, and it is itself causal (or in LD with the causal variant), then it should also be consistent with the strongest eQTL effect in the region. Checking to ensure that the disease-associated variant is consistent with the strongest eQTL effect itself mitigates the risk of spurious overlap. However, it is still possible that the causal allele and the strongest eQTL effect are strongly correlated by chance, and that the eQTL association is unrelated to disease risk. Second, although many eQTL act generically, most are tissue specific (Dimas et al., 2009; Price et al., 2011). In fact, certain eQTL may not be detectable unless the cell has responded to a specific stimulus or stress. In order to understand the transcriptional impact of disease alleles most effectively, identifying eQTL in the pathogenic tissues is key. Current eQTL databases are based on a small number of resting cell types, for example lymphoblastoid cell lines (Stranger et al., 2007). Many important pathogenic tissues are not easily accessible for eQTL studies. In the near future, the catalog of available tissues profiled will expand dramatically with the NIH-sponsored Genotype-Tissue Expression (GTEx) project, aiming to profile >60 separate tissues (https://commonfund.nih.gov/GTEx/). Finally, although eQTL data can offer potential in identifying the likely causal gene and provide hints about mechanism for common variants, they may not clarify ambiguity about the
causal variant if there are multiple variants in LD. Certain variants may seem more promising, for example structural variants or SNPs overlapping a regulatory variant. As with disease-associated common variants, eQTL datasets often face challenges in fine-mapping signals.

Chromatin Modifications

Identifying regions of the genome that act as regulatory elements can offer important complementary information to eQTL data in evaluating noncoding variants. Specific functional regulatory elements can be identified from genome-wide profiles of key histone modifications: H3K4me3 marks active promoters; H3K4me1 marks enhancers; H3K4me2 and most histone acetylation mark both promoters and enhancers (Barski et al., 2007; Heintzman et al., 2007; Wang et al., 2008). Similarly, DNase I hypersensitive sites also flag open chromatin regions harboring promoters and enhancers (Sabo et al., 2006). With the advancement of high-throughput sequencing technologies and development of techniques such as ChIP-seq (Park, 2009) and DNase-seq (John et al., 2011), there are mounting public data on genome-wide chromatin profiles. For instance, histone mark ChIP-seq and DNase-seq data on over 100 cell lines and tissues have now been generated through the ENCODE and Roadmap Epigenomics projects (Bernstein et al., 2010; Birney et al., 2007). Although computational approaches to identify putative binding sites based on sequence data alone are nonspecific, recent reports suggest that the prediction of active regulatory sites within assayed tissues is possible by including ChIP-seq and DNase-seq data (Ernst and Kellis, 2010; He et al., 2010; Pique-Regi et al., 2011; Song et al., 2011). One potential approach then to prioritize noncoding variants for follow up is to identify those that are in regions that have been predicted to be regulatory elements. These variants might, for example, disrupt or enhance transcription factor binding at an enhancer or a promoter. 
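The prioritization step just described amounts to an interval-overlap query. The sketch below is a minimal illustration assuming a single chromosome and half-open coordinates; the function name, the element coordinates and labels, and the variant names and positions are all invented, and real peak calls would come from ChIP-seq or DNase-seq data in the relevant cell type.

```python
def overlapping_elements(variant_pos, elements):
    """Return the labels of regulatory elements whose half-open interval
    [start, end) contains the variant position."""
    return [label for start, end, label in elements if start <= variant_pos < end]

# Hypothetical regulatory annotations on one chromosome: (start, end, label).
elements = [
    (1000, 1500, "H3K4me3 peak (promoter-like)"),
    (4000, 4800, "H3K4me1 peak (enhancer-like)"),
    (4500, 5000, "DNase I hypersensitive site"),
]

# Hypothetical candidate noncoding variants in LD at an associated locus.
variants = {"var_A": 1200, "var_B": 4600, "var_C": 9000}

annotations = {name: overlapping_elements(pos, elements) for name, pos in variants.items()}
# Keep only variants that fall in at least one predicted regulatory element.
prioritized = [name for name, labels in annotations.items() if labels]
```

Variants with no overlapping element are not ruled out by this filter; they are simply given lower priority for follow up.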
Particularly promising variants might be those that have eQTL activity in the same cell type. Histone mark locations and DNase hypersensitive sites have been shown to be enriched near associated variants (Ernst et al., 2011; McDaniell et al., 2010). A key limitation of this approach is that, like eQTL data, it requires genome-wide chromatin data from the same or similar cell types as those that are pathogenic.

Identifying Causal Processes with Integrative Analyses

In many instances where the specific causal variant within a locus cannot be identified, examination of the genes implicated may still help to suggest the key underlying functional networks and pathways that might be active in a disease. For instance, age-related macular degeneration associations have implicated the complement pathway without necessarily identifying causal variants. This task can be challenging in general because for any given associated allele, 20 or more genes might be implicated by LD, and any of them may harbor the causal mutation. But despite that, statistically significant connectivity between genes in different associated loci can often be identified. We and others have devised strategies to look for functional connections or similarity between genes across implicated loci. These networks can predict novel gene loci and offer insight about disease mechanism. Gene Relationships Across Implicated Loci (GRAIL) uses >400,000 published scientific PubMed texts to assess pairwise gene similarity between genes across loci
(Raychaudhuri et al., 2009a). In addition to repeatedly showing highly statistically significant connectivity between genes across loci in multiple diseases, GRAIL has been used to prospectively predict and prioritize associated variants (Raychaudhuri et al., 2009b) and prioritize disease genes within a locus (Beroukhim et al., 2010). Investigators used a similar approach, the Disease Association Protein-Protein Link Evaluator (DAPPLE) algorithm, to demonstrate that protein-protein interactions are enriched among genes within disease loci more than expected by chance alone (Rossin et al., 2011). They demonstrated enrichment most convincingly in autoimmune diseases and furthermore demonstrated that the enrichment of interactions was often between genes within the same immune cell types. These networks offer insight as to how protein products of genes across many loci might be interacting together to initiate disease. We note importantly that pathway analyses can be easily confounded, in particular in neuropsychiatric diseases, because there is a correlation between the sizes of transcripts and the likelihood that they will have brain function (Raychaudhuri et al., 2010).

Conclusions

The advances in genotyping and sequencing technologies over the last few years have revolutionized genetics. Only a few years ago, researchers were still tackling the challenges of gene mapping and discovery of complex diseases. Now we face an embarrassment of riches in which the ability to map loci has become quick and reproducible. The next important challenge is streamlining functional validation, which in most cases is still a critical bottleneck. Rare variant discovery has the potential to yield more obviously functional variants with larger effect sizes because they are less constrained by purifying selection. The discovery of rare variant associations might shed light on those loci discovered by common variant mapping. 
However, strategies to prioritize functional follow-up studies will be key at those loci where common variants cannot be effectively fine-mapped or individual rare variants (beyond the presence of case enrichment) cannot be identified. Strategies to use regulatory variants, chromatin state data, and sequence conservation offer a potential path forward to prioritize candidate variants.

SUPPLEMENTAL INFORMATION

Supplemental Information includes one figure and can be found with this article online at doi:10.1016/j.cell.2011.09.011.

ACKNOWLEDGMENTS

The author would like to acknowledge helpful discussions and feedback from colleagues including Drs. Mark Daly, Paul I.W. de Bakker, X. Shirley Liu, Cynthia Sandor, Eli A. Stahl, Barbara E. Stranger, and Shamil Sunyaev.

REFERENCES

1000 Genomes Project Consortium. (2010). A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073.
Adrianto, I., Wen, F., Templeton, A., Wiley, G., King, J.B., Lessard, C.J., Bates, J.S., Hu, Y., Kelly, J.A., Kaufman, K.M., et al; BIOLUPUS and GENLES Networks. (2011). Association of a functional variant downstream of TNFAIP3 with systemic lupus erythematosus. Nat. Genet. 43, 253–258.
Adzhubei, I.A., Schmidt, S., Peshkin, L., Ramensky, V.E., Gerasimova, A., Bork, P., Kondrashov, A.S., and Sunyaev, S.R. (2010). A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249.
Cooper, G.M., Goode, D.L., Ng, S.B., Sidow, A., Bamshad, M.J., Shendure, J., and Nickerson, D.A. (2010). Single-nucleotide evolutionary constraint scores highlight disease-causing mutations. Nat. Methods 7, 250–251.
Alexander, R.P., Fang, G., Rozowsky, J., Snyder, M., and Gerstein, M.B. (2010). Annotating non-coding regions of the genome. Nat. Rev. Genet. 11, 559–571.
Cooper, G.M., Stone, E.A., Asimenos, G., Green, E.D., Batzoglou, S., and Sidow, A.; NISC Comparative Sequencing Program. (2005). Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 15, 901–913.
Anderson, C.A., Soranzo, N., Zeggini, E., and Barrett, J.C. (2011). Synthetic associations are unlikely to account for many common disease genome-wide association signals. PLoS Biol. 9, e1000580.
Balding, D.J. (2006). A tutorial on statistical methods for population association studies. Nat. Rev. Genet. 7, 781–791.
Bansal, V., Libiger, O., Torkamani, A., and Schork, N.J. (2010). Statistical analysis strategies for association studies involving rare variants. Nat. Rev. Genet. 11, 773–785.
Barski, A., Cuddapah, S., Cui, K., Roh, T.Y., Schones, D.E., Wang, Z., Wei, G., Chepelev, I., and Zhao, K. (2007). High-resolution profiling of histone methylations in the human genome. Cell 129, 823–837.
Bernstein, B.E., Stamatoyannopoulos, J.A., Costello, J.F., Ren, B., Milosavljevic, A., Meissner, A., Kellis, M., Marra, M.A., Beaudet, A.L., Ecker, J.R., et al. (2010). The NIH Roadmap Epigenomics Mapping Consortium. Nat. Biotechnol. 28, 1045–1048.
Beroukhim, R., Mermel, C.H., Porter, D., Wei, G., Raychaudhuri, S., Donovan, J., Barretina, J., Boehm, J.S., Dobson, J., Urashima, M., et al. (2010). The landscape of somatic copy-number alteration across human cancers. Nature 463, 899–905.
Bhangale, T.R., Rieder, M.J., and Nickerson, D.A. (2008). Estimating coverage and power for genetic association studies using near-complete variation data. Nat. Genet. 40, 841–843.
Birney, E., Stamatoyannopoulos, J.A., Dutta, A., Guigó, R., Gingeras, T.R., Margulies, E.H., Weng, Z., Snyder, M., Dermitzakis, E.T., Thurman, R.E., et al; ENCODE Project Consortium; NISC Comparative Sequencing Program; Baylor College of Medicine Human Genome Sequencing Center; Washington University Genome Sequencing Center; Broad Institute; Children's Hospital Oakland Research Institute. (2007). Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799–816.
Browning, S.R. (2008). 
Missing data imputation and haplotype phase inference for genome-wide association studies. Hum. Genet. 124, 439–450. Campbell, C.D., Ogburn, E.L., Lunetta, K.L., Lyon, H.N., Freedman, M.L., Groop, L.C., Altshuler, D., Ardlie, K.G., and Hirschhorn, J.N. (2005). Demonstrating stratification in a European American population. Nat. Genet. 37, 868–872. Cardon, L.R., and Abecasis, G.R. (2003). Using haplotype blocks to map human complex trait loci. Trends Genet. 19, 135–140. Clayton, D.G., Walker, N.M., Smyth, D.J., Pask, R., Cooper, J.D., Maier, L.M., Smink, L.J., Lam, A.C., Ovington, N.R., Stevens, H.E., et al. (2005). Population structure, differential bias and genomic control in a large-scale, case-control association study. Nat. Genet. 37, 1243–1246. Cohen, J.C., Kiss, R.S., Pertsemlidis, A., Marcel, Y.L., McPherson, R., and Hobbs, H.H. (2004). Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science 305, 869–872. Conrad, D.F., Pinto, D., Redon, R., Feuk, L., Gokcumen, O., Zhang, Y., Aerts, J., Andrews, T.D., Barnes, C., Campbell, P., et al; Wellcome Trust Case Control Consortium. (2010). Origins and functional impact of copy number variation in the human genome. Nature 464, 704–712. Conrad, D.F., Keebler, J.E., DePristo, M.A., Lindsay, S.J., Zhang, Y., Casals, F., Idaghdour, Y., Hartl, C.L., Torroja, C., Garimella, K.V., et al; 1000 Genomes Project. (2011). Variation in genome-wide mutation rates within and between human families. Nat. Genet. 43, 712–714. Cooper, G.M., and Shendure, J. (2011). Needles in stacks of needles: finding disease-causal variants in a wealth of genomic data. Nat. Rev. Genet. 12, 628–640.
Davis, E.E., Zhang, Q., Liu, Q., Diplas, B.H., Davey, L.M., Hartley, J., Stoetzel, C., Szymanska, K., Ramaswami, G., Logan, C.V., et al; NISC Comparative Sequencing Program. (2011). TTC21B contributes both causal and modifying alleles across the ciliopathy spectrum. Nat. Genet. 43, 189–196. Davydov, E.V., Goode, D.L., Sirota, M., Cooper, G.M., Sidow, A., and Batzoglou, S. (2010). Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 6, e1001025. de Bakker, P.I., Ferreira, M.A., Jia, X., Neale, B.M., Raychaudhuri, S., and Voight, B.F. (2008). Practical aspects of imputation-driven meta-analysis of genome-wide association studies. Hum. Mol. Genet. 17(R2), R122–R128. Dickson, S.P., Wang, K., Krantz, I., Hakonarson, H., and Goldstein, D.B. (2010). Rare variants create synthetic genome-wide associations. PLoS Biol. 8, e1000294. Dimas, A.S., Deutsch, S., Stranger, B.E., Montgomery, S.B., Borel, C., AttarCohen, H., Ingle, C., Beazley, C., Gutierrez Arcelus, M., Sekowska, M., et al. (2009). Common regulatory variation impacts gene expression in a cell typedependent manner. Science 325, 1246–1250. Ernst, J., and Kellis, M. (2010). Discovery and characterization of chromatin states for systematic annotation of the human genome. Nat. Biotechnol. 28, 817–825. Ernst, J., Kheradpour, P., Mikkelsen, T.S., Shoresh, N., Ward, L.D., Epstein, C.B., Zhang, X., Wang, L., Issner, R., Coyne, M., et al. (2011). Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49. Ge, D., Fellay, J., Thompson, A.J., Simon, J.S., Shianna, K.V., Urban, T.J., Heinzen, E.L., Qiu, P., Bertelsen, A.H., Muir, A.J., et al. (2009). Genetic variation in IL28B predicts hepatitis C treatment-induced viral clearance. Nature 461, 399–401. Genovese, G., Friedman, D.J., Ross, M.D., Lecordier, L., Uzureau, P., Freedman, B.I., Bowden, D.W., Langefeld, C.D., Oleksyk, T.K., Uscinski Knob, A.L., et al. (2010). 
Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science 329, 841–845. Gunderson, K.L., Steemers, F.J., Lee, G., Mendoza, L.G., and Chee, M.S. (2005). A genome-wide scalable SNP genotyping assay using microarray technology. Nat. Genet. 37, 549–554. Gusev, A., Kenny, E.E., Lowe, J.K., Salit, J., Saxena, R., Kathiresan, S., Altshuler, D.M., Friedman, J.M., Breslow, J.L., and Pe’er, I. (2011). DASH: a method for identical-by-descent haplotype mapping uncovers association with recent variation. Am. J. Hum. Genet. 88, 706–717. Harismendy, O., Ng, P.C., Strausberg, R.L., Wang, X., Stockwell, T.B., Beeson, K.Y., Schork, N.J., Murray, S.S., Topol, E.J., Levy, S., and Frazer, K.A. (2009). Evaluation of next generation sequencing platforms for population targeted sequencing studies. Genome Biol. 10, R32. He, H.H., Meyer, C.A., Shin, H., Bailey, S.T., Wei, G., Wang, Q., Zhang, Y., Xu, K., Ni, M., Lupien, M., et al. (2010). Nucleosome dynamics define transcriptional enhancers. Nat. Genet. 42, 343–347. Heintzman, N.D., Stuart, R.K., Hon, G., Fu, Y., Ching, C.W., Hawkins, R.D., Barrera, L.O., Van Calcar, S., Qu, C., Ching, K.A., et al. (2007). Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat. Genet. 39, 311–318. Hicks, S., Wheeler, D.A., Plon, S.E., and Kimmel, M. (2011). Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Hum. Mutat. 32, 661–668. Hindorff, L.A., Sethupathy, P., Junkins, H.A., Ramos, E.M., Mehta, J.P., Collins, F.S., and Manolio, T.A. (2009). Potential etiologic and functional
Cell 147, September 30, 2011 ª2011 Elsevier Inc. 67
implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. USA 106, 9362–9367. Hoggart, C.J., Clark, T.G., De Iorio, M., Whittaker, J.C., and Balding, D.J. (2008). Genome-wide significance for dense SNP and resequencing data. Genet. Epidemiol. 32, 179–185. Holm, H., Gudbjartsson, D.F., Sulem, P., Masson, G., Helgadottir, H.T., Zanon, C., Magnusson, O.T., Helgason, A., Saemundsdottir, J., Gylfason, A., et al. (2011). A rare variant in MYH6 is associated with high risk of sick sinus syndrome. Nat. Genet. 43, 316–320. International Schizophrenia Consortium. (2008). Rare chromosomal deletions and duplications increase risk of schizophrenia. Nature 455, 237–241. Jansen, R.C., and Nap, J.P. (2001). Genetical genomics: the added value from segregation. Trends Genet. 17, 388–391. John, S., Sabo, P.J., Thurman, R.E., Sung, M.H., Biddie, S.C., Johnson, T.A., Hager, G.L., and Stamatoyannopoulos, J.A. (2011). Chromatin accessibility pre-determines glucocorticoid receptor binding patterns. Nat. Genet. 43, 264–268. Kong, A., Masson, G., Frigge, M.L., Gylfason, A., Zusmanovich, P., Thorleifsson, G., Olason, P.I., Ingason, A., Steinberg, S., Rafnar, T., et al. (2008). Detection of sharing by descent, long-range phasing and haplotype imputation. Nat. Genet. 40, 1068–1075. Kryukov, G.V., Pennacchio, L.A., and Sunyaev, S.R. (2007). Most rare missense alleles are deleterious in humans: implications for complex disease and association studies. Am. J. Hum. Genet. 80, 727–739. Li, B., and Leal, S.M. (2008). Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am. J. Hum. Genet. 83, 311–321. Li, H., and Homer, N. (2010). A survey of sequence alignment algorithms for next-generation sequencing. Brief. Bioinform. 11, 473–483. Madsen, B.E., and Browning, S.R. (2009). A groupwise association test for rare mutations using a weighted sum statistic. PLoS Genet. 5, e1000384. Maxmen, A. (2011). 
Exome sequencing deciphers rare diseases. Cell 144, 635–637. McCarroll, S.A., Huett, A., Kuballa, P., Chilewski, S.D., Landry, A., Goyette, P., Zody, M.C., Hall, J.L., Brant, S.R., Cho, J.H., et al. (2008). Deletion polymorphism upstream of IRGM associated with altered IRGM expression and Crohn’s disease. Nat. Genet. 40, 1107–1112. McCarthy, M.I., Abecasis, G.R., Cardon, L.R., Goldstein, D.B., Little, J., Ioannidis, J.P., and Hirschhorn, J.N. (2008). Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat. Rev. Genet. 9, 356–369. McDaniell, R., Lee, B.K., Song, L., Liu, Z., Boyle, A.P., Erdos, M.R., Scott, L.J., Morken, M.A., Kucera, K.S., Battenhouse, A., et al. (2010). Heritable individualspecific and allele-specific chromatin signatures in humans. Science 328, 235–239. McKenna, A., Hanna, M., Banks, E., Sivachenko, A., Cibulskis, K., Kernytsky, A., Garimella, K., Altshuler, D., Gabriel, S., Daly, M., and DePristo, M.A. (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing nextgeneration DNA sequencing data. Genome Res. 20, 1297–1303. Miller, W., Rosenbloom, K., Hardison, R.C., Hou, M., Taylor, J., Raney, B., Burhans, R., King, D.C., Baertsch, R., Blankenberg, D., et al. (2007). 28-way vertebrate alignment and conservation track in the UCSC Genome Browser. Genome Res. 17, 1797–1808. Momozawa, Y., Mni, M., Nakamura, K., Coppieters, W., Almer, S., Amininejad, L., Cleynen, I., Colombel, J.F., de Rijk, P., Dewit, O., et al. (2011). Resequencing of positional candidates identifies low frequency IL23R coding variants protecting against inflammatory bowel disease. Nat. Genet. 43, 43–47. Montgomery, S.B., Lappalainen, T., Gutierrez-Arcelus, M., and Dermitzakis, E.T. (2011). Rare and common regulatory variation in population-scale sequenced human genomes. PLoS Genet. 7, e1002144. Morgenthaler, S., and Thilly, W.G. (2007). 
A strategy to discover genes that carry multi-allelic or mono-allelic risk for common diseases: a cohort allelic sums test (CAST). Mutat. Res. 615, 28–56.
68 Cell 147, September 30, 2011 ª2011 Elsevier Inc.
Musunuru, K., Strong, A., Frank-Kamenetsky, M., Lee, N.E., Ahfeldt, T., Sachs, K.V., Li, X., Li, H., Kuperwasser, N., Ruda, V.M., et al. (2010). From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature 466, 714–719. Myers, S., Bottolo, L., Freeman, C., McVean, G., and Donnelly, P. (2005). A fine-scale map of recombination rates and hotspots across the human genome. Science 310, 321–324. Nejentsev, S., Walker, N., Riches, D., Egholm, M., and Todd, J.A. (2009). Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes. Science 324, 387–389. Ng, P.C., and Henikoff, S. (2003). SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 31, 3812–3814. Ng, S.B., Buckingham, K.J., Lee, C., Bigham, A.W., Tabor, H.K., Dent, K.M., Huff, C.D., Shannon, P.T., Jabs, E.W., Nickerson, D.A., et al. (2010). Exome sequencing identifies the cause of a mendelian disorder. Nat. Genet. 42, 30–35. Nica, A.C., Montgomery, S.B., Dimas, A.S., Stranger, B.E., Beazley, C., Barroso, I., and Dermitzakis, E.T. (2010). Candidate causal regulatory effects by integration of expression QTLs with complex trait genetic associations. PLoS Genet. 6, e1000895. Nicolae, D.L., Gamazon, E., Zhang, W., Duan, S., Dolan, M.E., and Cox, N.J. (2010). Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet. 6, e1000888. Park, P.J. (2009). ChIP-seq: advantages and challenges of a maturing technology. Nat. Rev. Genet. 10, 669–680. Peltonen, L., Palotie, A., and Lange, K. (2000). Use of population isolates for mapping complex traits. Nat. Rev. Genet. 1, 182–190. Pereyra, F., Jia, X., McLaren, P.J., Telenti, A., de Bakker, P.I., Walker, B.D., Ripke, S., Brumme, C.J., Pulit, S.L., Carrington, M., et al; International HIV Controllers Study. (2010). The major genetic determinants of HIV-1 control affect HLA class I peptide presentation. Science 330, 1551–1557. 
Pique-Regi, R., Degner, J.F., Pai, A.A., Gaffney, D.J., Gilad, Y., and Pritchard, J.K. (2011). Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data. Genome Res. 21, 447–455. Price, A.L., Helgason, A., Thorleifsson, G., McCarroll, S.A., Kong, A., and Stefansson, K. (2011). Single-tissue and cross-tissue heritability of gene expression via identity-by-descent in related or unrelated individuals. PLoS Genet. 7, e1001317. Pritchard, J.K. (2001). Are rare variants responsible for susceptibility to complex diseases? Am. J. Hum. Genet. 69, 124–137. Pritchard, J.K., and Cox, N.J. (2002). The allelic architecture of human disease genes: common disease-common variant.or not? Hum. Mol. Genet. 11, 2417–2423. Purcell, S.M., Wray, N.R., Stone, J.L., Visscher, P.M., O’Donovan, M.C., Sullivan, P.F., and Sklar, P.; International Schizophrenia Consortium. (2009). Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460, 748–752. Raychaudhuri, S., Plenge, R.M., Rossin, E.J., Ng, A.C.Y., Purcell, S.M., Sklar, P., Scolnick, E.M., Xavier, R.J., Altshuler, D., and Daly, M.J.; International Schizophrenia Consortium. (2009a). Identifying relationships among genomic disease regions: predicting genes at pathogenic SNP associations and rare deletions. PLoS Genet. 5, e1000534. Raychaudhuri, S., Thomson, B.P., Remmers, E.F., Eyre, S., Hinks, A., Guiducci, C., Catanese, J.J., Xie, G., Stahl, E.A., Chen, R., et al; BIRAC Consortium; YEAR Consortium. (2009b). Genetic variants at CD28, PRDM1 and CD2/CD58 are associated with rheumatoid arthritis risk. Nat. Genet. 41, 1313–1318. Raychaudhuri, S., Korn, J.M., McCarroll, S.A., Altshuler, D., Sklar, P., Purcell, S., and Daly, M.J.; International Schizophrenia Consortium. (2010). Accurately assessing the risk of schizophrenia conferred by rare copy-number variation affecting genes with brain function. PLoS Genet. 6, e1001097. Reich, D.E., and Lander, E.S. (2001). 
On the allelic spectrum of human disease. Trends Genet. 17, 502–510.
Risch, N., and Merikangas, K. (1996). The future of genetic studies of complex human diseases. Science 273, 1516–1517. Rosenberg, N.A., Huang, L., Jewett, E.M., Szpiech, Z.A., Jankovic, I., and Boehnke, M. (2010). Genome-wide association studies in diverse populations. Nat. Rev. Genet. 11, 356–366. Rossin, E.J., Lage, K., Raychaudhuri, S., Xavier, R.J., Tatar, D., Benita, Y., Cotsapas, C., Daly, M.J., and Daly, M.J.; International Inflammatory Bowel Disease Genetics Constortium. (2011). Proteins encoded in genomic regions associated with immune-mediated disease physically interact and suggest underlying biology. PLoS Genet. 7, e1001273. Sabo, P.J., Kuehn, M.S., Thurman, R., Johnson, B.E., Johnson, E.M., Cao, H., Yu, M., Rosenzweig, E., Goldy, J., Haydock, A., et al. (2006). Genome-scale mapping of DNase I sensitivity in vivo using tiling DNA microarrays. Nat. Methods 3, 511–518. Sebat, J., Lakshmi, B., Malhotra, D., Troge, J., Lese-Martin, C., Walsh, T., Yamrom, B., Yoon, S., Krasnitz, A., Kendall, J., et al. (2007). Strong association of de novo copy number mutations with autism. Science 316, 445–449. Shendure, J., and Ji, H. (2008). Next-generation DNA sequencing. Nat. Biotechnol. 26, 1135–1145. Small, K.S., Hedman, A.K., Grundberg, E., Nica, A.C., Thorleifsson, G., Kong, A., Thorsteindottir, U., Shin, S.Y., Richards, H.B., Soranzo, N., et al; GIANT Consortium; MAGIC Investigators; DIAGRAM Consortium; MuTHER Consortium. (2011). Identification of an imprinted master trans regulator at the KLF14 locus related to multiple metabolic phenotypes. Nat. Genet. 43, 561–564. Song, L., Zhang, Z., Grasfeder, L.L., Boyle, A.P., Giresi, P.G., Lee, B.K., Sheffield, N.C., Gra¨f, S., Huss, M., Keefe, D., et al. (2011). Open chromatin defined by DNaseI and FAIRE identifies regulatory elements that shape cell-type identity. Genome Res. Published online August 19, 2011. 10.1101/gr.121541.111.
Stranger, B.E., Nica, A.C., Forrest, M.S., Dimas, A., Bird, C.P., Beazley, C., Ingle, C.E., Dunning, M., Flicek, P., Koller, D., et al. (2007). Population genomics of human gene expression. Nat. Genet. 39, 1217–1224. Teslovich, T.M., Musunuru, K., Smith, A.V., Edmondson, A.C., Stylianou, I.M., Koseki, M., Pirruccello, J.P., Ripatti, S., Chasman, D.I., Willer, C.J., et al. (2010). Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713. Thorgeirsson, T.E., Geller, F., Sulem, P., Rafnar, T., Wiste, A., Magnusson, K.P., Manolescu, A., Thorleifsson, G., Stefansson, H., Ingason, A., et al. (2008). A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature 452, 638–642. Voight, B.F., Scott, L.J., Steinthorsdottir, V., Morris, A.P., Dina, C., Welch, R.P., Zeggini, E., Huth, C., Aulchenko, Y.S., Thorleifsson, G., et al; MAGIC investigators; GIANT Consortium. (2010). Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat. Genet. 42, 579–589. Walsh, T., McClellan, J.M., McCarthy, S.E., Addington, A.M., Pierce, S.B., Cooper, G.M., Nord, A.S., Kusenda, M., Malhotra, D., Bhandari, A., et al. (2008). Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 320, 539–543. Wang, Z., Zang, C., Rosenfeld, J.A., Schones, D.E., Barski, A., Cuddapah, S., Cui, K., Roh, T.Y., Peng, W., Zhang, M.Q., and Zhao, K. (2008). Combinatorial patterns of histone acetylations and methylations in the human genome. Nat. Genet. 40, 897–903. Yang, J., Benyamin, B., McEvoy, B.P., Gordon, S., Henders, A.K., Nyholt, D.R., Madden, P.A., Heath, A.C., Martin, N.G., Montgomery, G.W., et al. (2010). Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569.
Cell 147, September 30, 2011 ª2011 Elsevier Inc. 69
Leading Edge
Review
Modeling Human Disease in Humans: The Ciliopathies
Gaia Novarino,1 Naiara Akizu,1 and Joseph G. Gleeson1,*
1Neurogenetics Laboratory, Institute for Genomic Medicine, Howard Hughes Medical Institute, Department of Neurosciences and Pediatrics, University of California, San Diego, La Jolla 92093, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.09.014
Soon, the genetic basis of most human Mendelian diseases will be solved. The next challenge will be to leverage this information to uncover basic mechanisms of disease and develop new therapies. To understand how this transformation is already beginning to unfold, we focus on the ciliopathies, a class of multi-organ diseases caused by disruption of the primary cilium. Through a convergence of data involving mutant gene discovery, proteomics, and cell biology, more than a dozen phenotypically distinguishable conditions are now united as ciliopathies. Sitting at the interface between simple and complex genetic conditions, these diseases provide clues to the future direction of human genetics.

Until a few years ago, identifying the genetic basis of an inherited human disease was an arduous undertaking, requiring potentially a decade or more of work in ascertainment of families for linkage analysis, followed by endless fine mapping of the locus and, finally, sequencing of candidate genes one by one until that “eureka” moment when the likely causative gene was identified. The newly discovered disease gene was often entirely novel, without recognizable domains or a path to understand the disease mechanism. A mouse model was then generated, in which the disease gene was inactivated. The mouse faithfully recapitulated the human phenotype in some cases but, more often, showed no phenotype or phenotypes not clearly related to the human disease. Once established, the model was studied from multiple perspectives to understand the cell biological and biochemical basis of disease, culminating in attempts to test potential therapies. Although successful in a few instances such as losartan treatment for Marfan syndrome (Habashi et al., 2006), this path has not fulfilled the promises of genomic medicine.
This strategy has begun to change over the past 10 years due to increased knowledge of human genetic diseases, annotation of the human genome, and an amazing suite of tools to explore disease mechanisms. It is not uncommon now to open a journal and find that geneticists have solved the molecular basis of a dozen or more conditions. And because we now know much more about the function of genes, protein domains, and networks, the discovery of the molecular cause of a disease can often partially explain its mechanism. For instance, the discovery that the Rett syndrome gene encodes a methyl-CpG-binding protein (Amir et al., 1999) immediately set the stage for a host of important discoveries in epigenetics related to brain function. The types of mutations displayed by patients, known as allelic diversity (Figure 1), can tell us something about the effect of these disease-causing variants on protein function. By identifying patients with different phenotypes due to specific types of mutations in the same gene (i.e., genocopies), we can understand human disease as a network of related signs and symptoms. For example, specific types of mutations in the gene encoding p53 predispose to very different types of cancers. By comparing the genes mutated in phenotypically related human diseases, we can learn about the disturbed protein networks that underlie them. Finally, by exploring gene-gene and gene-environment interactions, we can begin to characterize genetic and epigenetic modifiers of disease. Perhaps the best example is age-related macular degeneration, in which a substantial part of the risk of disease can be quantified based on gene-environment interactions (Chen et al., 2010).

The Ciliopathies: One Organelle, Many Disorders
Although they are individually rare conditions, the ciliopathies have emerged as a dynamic new field of biology that exemplifies how genetics can be employed to drive research in basic cell biology and vice versa. The primary cilium is structured with a basal body at its base and a 9-paired microtubule axoneme, surrounded by plasma membrane but lacking the central pair of microtubules and outer dynein arms that define its cousin, the motile cilium. Primary cilia were first observed more than a century ago and were initially thought to be evolutionary remnants. How is it that biologists missed their importance for so long?

More than a dozen disorders are now considered to be within the ciliopathy spectrum, including Joubert syndrome (JBTS), nephronophthisis (NPHP), Senior-Loken syndrome (SLS), orofaciodigital syndrome (OFD), Jeune syndrome, autosomal dominant and recessive polycystic kidney disease (ADPKD and ARPKD), Leber congenital amaurosis (LCA), Meckel-Gruber syndrome (MKS), Bardet-Biedl syndrome (BBS), Usher syndrome (US), and some forms of retinal dystrophy (RD). Between them, these conditions involve nearly every major body organ, including kidney, brain, limb, retina, liver, and bone (Figure 2A), highlighting
the important role of the primary cilium in development and homeostasis. These conditions were largely defined by clinical geneticists in the middle of the last century, who did their best to ascribe syndromes to unique combinations of clinical features. Individual diseases are known for the most commonly involved or diseased organ: BBS patients display the triad of obesity, polydactyly, and retinopathy but can display a host of other pathologies. MKS is a lethal condition at birth, with occipital encephalocele, PKD, and polydactyly. JBTS is defined by a distinctive radiographic finding known as the “molar tooth sign,” which consists of elongated superior cerebellar peduncles, a deepened interpeduncular fossa, and cerebellar vermis hypoplasia. For each of these conditions, significant phenotypic variability has been observed even between members of the same family, making clinical diagnosis a challenge.

Now that these conditions have been united by their underlying cell biology, we can begin to see commonalities between individual syndromes (Figure 2A). For instance, low muscle tone, cystic kidney disease, agenesis of the brain’s corpus callosum, mental retardation, and hyperpnea/apnea are additional clinical features often present in ciliopathy patients. Individuals affected by BBS can have several clinical features common to JBTS, such as mental retardation, hypotonia, and apnea. However, they also show other symptoms that are usually not present in JBTS but are common to other ciliopathies, such as polydactyly and retinal dystrophy (RD). Moreover, BBS patients are usually obese, a trait that is absent from the other ciliopathies.

Figure 1. Human Genetic Interactions in the Ciliopathy Spectrum
Diseased individuals are in color, with severity represented by darker shading. “Phenocopies” refers to the finding that patients with homozygous or compound heterozygous mutations in two different genes (e.g., AHI1 or INPP5E) can show indistinguishable JBTS phenotypes. Multiple allelism at the same locus indicates that mutations in a single gene (e.g., CEP290) can lead to various distinguishable phenotypes. “Modifiers” refers to evidence that potentially deleterious sequence changes in a gene like AHI1 can modify in quantifiable ways the phenotype observed in patients with NPHP1 mutations. (Black) Normal chromosome; (red) mutant chromosome. AHI1, abelson helper integration site 1; INPP5E, inositol polyphosphate-5-phosphatase E; CEP290, centrosomal protein 290 kDa; NPHP1, nephrocystin 1. BBS, Bardet-Biedl syndrome; JBTS, Joubert syndrome; LCA, Leber congenital amaurosis; MKS, Meckel syndrome; NPHP, nephronophthisis; RD, retinal dystrophy; SLS, Senior-Loken syndrome.

The first few genes identified from positional cloning of human or mouse phenotypes for what would eventually become the ciliopathies initially did not point in an obvious way to the cilium as the site of action.
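The genetic relationships summarized in Figure 1 can be restated as a small data structure. The sketch below is illustrative only: the gene-phenotype pairs are a hand-picked subset assumed for the example (AHI1, INPP5E, and NPHP1 follow the Figure 1 legend; the CEP290 phenotypes shown are a partial, assumed selection), and the function names are hypothetical. It indexes the pairs in both directions and reports genes linked to more than one phenotype and phenotypes linked to more than one gene.

```python
from collections import defaultdict

# Illustrative gene-phenotype pairs (assumed subset, not a curated list).
ASSOCIATIONS = [
    ("AHI1", "JBTS"),
    ("INPP5E", "JBTS"),
    ("NPHP1", "NPHP"),
    ("CEP290", "JBTS"),
    ("CEP290", "MKS"),
    ("CEP290", "LCA"),
]


def index_associations(pairs):
    """Build gene -> phenotypes and phenotype -> genes lookups."""
    by_gene = defaultdict(set)
    by_phenotype = defaultdict(set)
    for gene, phenotype in pairs:
        by_gene[gene].add(phenotype)
        by_phenotype[phenotype].add(gene)
    return dict(by_gene), dict(by_phenotype)


def multi_phenotype_genes(by_gene):
    """Genes linked to more than one phenotype ("multiple allelism")."""
    return {g: sorted(p) for g, p in by_gene.items() if len(p) > 1}


def multi_gene_phenotypes(by_phenotype):
    """Phenotypes linked to more than one gene ("phenocopies" in Figure 1)."""
    return {p: sorted(g) for p, g in by_phenotype.items() if len(g) > 1}


by_gene, by_phenotype = index_associations(ASSOCIATIONS)
print(multi_phenotype_genes(by_gene))
print(multi_gene_phenotypes(by_phenotype))
```

With this toy input, CEP290 is flagged as a gene with several phenotypes and JBTS as a phenotype with several genes. A real analysis would start from a curated mutation database rather than a hard-coded list.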
It was not until evidence began to accumulate that the encoded proteins localize specifically near the cilium that the field was born (Ansley et al., 2003; Barr and Sternberg, 1999; Kim et al., 2004; Otto et al., 2003; Taulman et al., 2001). Although the localization data are now incontrovertible, the functions of the encoded proteins at the cilium for the most part still remain a mystery, and some of the effects of these proteins do
not seem to have direct relevance to ciliary function (Yen et al., 2006). The question that emerges is how a single subcellular organelle can mediate such diverse clinical features.

About 50 genes encoding predominantly ciliary-localized proteins have now been identified that are mutated in these partially overlapping syndromes (Figure 2B). In a series of positional cloning studies, each of these ciliopathy genes was initially found as causative for a restricted phenotype. Surprisingly, in most instances, this gene identification was followed by reports of mutations in the same gene in a different ciliopathy category (Baala et al., 2007a, 2007b; den Hollander et al., 2006). This occurred with such regularity that the field began to wonder whether any genotype-phenotype correlations would stand the test of time. How could mutations in a single gene produce such pleiotropic phenotypic effects in patients? Were these observations exceptions to a rule of strict genotype-phenotype correlation, or were they instances of a more general rule of widespread phenotypic pleiotropy?

Figure 2. Phenotypic and Interactome Diversity of the Ciliopathies Based upon Major Organ Involvement
(A) Diseases are listed below by abbreviation, and the major organs involved are listed above.
(B) Major ciliopathy diseases (color coded by severity) and the genes mutated in each condition (red), linked by black bars, with more common causes shown by thicker lines.
(C) The same gene map, now indicating evidence for direct interaction between protein products. Protein interaction networks were identified from published data and show major clustering of interactions corresponding to disease networks. Note that genes causing a particular disorder tend to have products that interact, although there are many genes without known connections. Note also that genes such as CEP290 that can cause several different diseases should serve as hubs but, to date, have few demonstrated physical interactions.
BBS, Bardet-Biedl syndrome; JBTS, Joubert syndrome; LCA, Leber congenital amaurosis; MKS, Meckel syndrome; NPHP, nephronophthisis; OFD, orofaciodigital syndrome; SLS, Senior-Loken syndrome.

The Primary Cilium Network
The primary cilium is a hair-like, immotile cellular organelle protruding from almost all eukaryotic cells, frequently described as the cell’s “antenna” for transducing extracellular signals. For the purposes of this Review, we focus exclusively on diseases involving nonmotile cilia and do not include diseases like primary ciliary dyskinesia that involve motile cilia, as there is little phenotypic or genetic overlap.

Cilia are generated during interphase from the mother centriole by coalescence of vesicles at its distal end that fuse with the plasma membrane. After this docking of the mother centriole, the microtubule axoneme protrudes out of the cell, concurrent with recruitment of a host of ciliary-specific proteins. The basal body (i.e., the docked mother centriole) possesses several specialized accessory structures termed transition fibers, basal feet, and ciliary rootlets and is surrounded by the pericentriolar matrix. For many years, the molecular determinants of these structures were unknown, but recent work has begun to hint at the molecular architecture of these anatomically defined structures. The 9+0 microtubule arrangement of the axoneme emerges from the basal body in a triplet configuration and shifts to a doublet configuration at the ciliary transition zone (TZ). The Y-shaped microtubule extensions that define the TZ require the Cep290 protein for their formation in Chlamydomonas (Craige et al., 2010), although it is not clear whether Cep290 constitutes or otherwise contributes to these structures or whether this function is conserved in vertebrates. The location of the TZ marks the ciliary diffusion barrier: a septin-2 cytoskeleton that separates the ciliary and cytoplasmic compartments (Hu et al., 2010). Attention has shifted to the TZ as the site of action of proteins mutated in several ciliopathies (Garcia-Gonzalo et al., 2011), but clear structure-function relationships are still lacking. Protein synthesis and vesicular transport do not occur inside cilia, so the assembly of
this organelle, its maintenance, and its function are totally dependent upon an intraflagellar transport system by which proteins track bidirectionally along the polarized microtubules of the axoneme (Kozminski et al., 1993).

Although the involvement of primary cilia in human diseases is now well established, many questions about their function remain. The current paradigm describes the primary cilium as an organelle for detecting and modulating the response to extracellular signaling molecules and as a location for organizing their cytoplasmic effectors. It is well poised to mediate both effects, as cilia typically protrude in a polarized fashion, which can help the cell interpret the context of extracellular signals. Many cellular receptors are localized to the primary cilium, and the basal body is itself a signaling hub, probably serving as an efficient transit point to transmit signals into the nucleus. In fact, a number of transcription factors undergo processing within or near the cilium prior to nuclear entry. Numerous critical developmental signaling pathways have been directly linked to primary cilia, such as Hedgehog (Hh), canonical and noncanonical Wnt, and some forms of PDGF signaling, highlighting the cilium’s role as a signaling hub (Huangfu et al., 2003; Schneider et al., 2005; Simons et al., 2005). In addition to modulating these signaling pathways, primary cilia are essential for mechanosensation, olfaction, and photoreception. The interpretation of the ciliopathies as a unique group of disorders associated with defects in a single organelle gave a new direction to the investigation of these human diseases.

Proteomics Merges with Genomics
The marriage of proteomics with genomics in the area of ciliary biology can be traced to an influential paper showing that the RFX-type transcription factor DAF-19 is essential for assembly of cilia in C. elegans sensory neurons and regulates several genes encoding intraflagellar transport proteins (Swoboda et al., 2000).
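RFX-type factors such as DAF-19 act through a short promoter motif, the X box, so candidate target genes can in principle be nominated by scanning promoter sequences for that motif. The following is a minimal sketch of such a scan and is not the method of the cited studies: the motif string `PLACEHOLDER_XBOX`, the toy promoter, and the function names are assumptions made for illustration, and a curated consensus or position weight matrix should replace the placeholder in practice. Only the strand supplied is scanned.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Hypothetical placeholder pattern, NOT a curated X-box consensus.
PLACEHOLDER_XBOX = "GTTNCCATGGNAAC"


def motif_to_regex(motif):
    """Translate an IUPAC motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead with a capture group reports matches that overlap each other.
    return re.compile("(?=(" + body + "))")


def scan_promoter(sequence, motif):
    """Return (0-based start, matched sequence) for every hit on the given strand."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Synthetic toy sequence, made up for this example.
toy_promoter = "AAAGTTGCCATGGCAACTTT"
hits = scan_promoter(toy_promoter, PLACEHOLDER_XBOX)
print(hits)
```

A comparative genomics screen would apply such a scan to the promoters of orthologous genes in several species and keep the genes in which the motif is conserved; that step is left out here.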
The RFX transcription factor family emerged in ciliated proto-eukaryotes, but it was only later that the RFX genes were co-opted to regulate expression of cilia-specific genes based upon the presence of an X box in their promoter (Piasecki et al., 2010). This work set the stage for a comparative genomics approach to search for X-box-containing genes that were likely to encode proteins relevant to primary cilia. Two follow-up studies demonstrated the potential of this approach in identifying important components of the primary cilium (Avidor-Reiss et al., 2004; Li et al., 2004) and raised the idea of building a ciliary protein database. The ciliome (Gherman et al., 2006; Inglis et al., 2006) now consists of more than 3000 genes encoding proteins either localized to cilia or essential for their assembly or function (Arnaiz et al., 2009). The ciliary proteome has already proved to be a powerful resource, accelerating the identification of candidate human ciliopathy genes by short-listing positional candidates. For example, the cloning of MKS1, BBS3, BBS5, and the BBS modifier MGC1203 was achieved by sequencing a reduced set of candidate genes (Badano et al., 2006; Chiang et al., 2004; Kyttälä et al., 2006; Li et al., 2004). The natural extension of this work was to expand the ciliary proteome into a ciliary “interactome” through the identification of proteins that physically interact as part of specific complexes (Eley et al., 2008; Gorden
et al., 2008). Through pair-wise testing of potential yeast two-hybrid interactions (Otto et al., 2005) and identification of binding partners by serial mass spectrometry (Nachury et al., 2007; Sang et al., 2011), a number of discrete, functionally relevant complexes have now emerged as the likely minimal disease-causing modules (Figure 2C). An important observation, which has stood the test of time, is that the composition of these specific protein modules could have been predicted based upon the phenotype observed in patients. Specifically, the proteins from seven of the eight most conserved genes mutated in BBS form a core complex termed the BBSome (Nachury et al., 2007), found to play important roles in ciliary protein and vesicular transport. Importantly, this complex does not contain proteins encoded by other ciliopathy disease genes such as NPHP, JBTS, or MKS. This is perhaps surprising considering that mutations in MKS genes can cause BBS (Leitch et al., 2008). Three separate complexes containing many of the proteins mutated in NPHP, JBTS, and MKS were recently identified, and although their function is still under investigation, genes for two of the copurifying proteins, ATXN10 and TCTN2, were found to be mutated in patients matching the same “module” phenotype (Sang et al., 2011). In general, the complexes also display specificity in subcellular localization and function: the MKS/JBTS complex transduces hedgehog signaling and localizes to the ciliary transition zone, whereas the BBS complex forms a coat complex to target vesicles to the cilium and localizes to the ciliary membrane (Jin et al., 2010). Proteins have nonredundant function within a given module and do not associate or function in other modules. This work has been further corroborated by analyzing a series of C.
elegans double mutants, which demonstrate worsened synthetic phenotypes (i.e., functional interactions) only when mutations occur in two different modules (Williams et al., 2011), but not with two different mutations in the same module. For instance, the B9 domain proteins of the MKS module functionally interact with the NPHP module (Williams et al., 2008), but not with most other genes in the MKS module. The conclusion is that each module probably mediates partially separate ciliary functions. Inactivating one component in a module is probably sufficient to fully inactivate the module, so functional interaction is only observed by inactivating a component in a different module. Taken together, these examples show that the availability of various disease proteomes, in combination with the explosion of currently available genomic and transcriptomic data sets, is driving forward biological network analysis in human disease (Figure 3).

Genetic Complexity of Human Ciliopathies

Allelic Diversity

What can the study of the ciliopathies teach us about the future direction of human genetic disease? Most obviously, that human genetics will be a lot more complex than many of us would have predicted. One obvious example is in the degree of multiple allelism at particular genetic loci. Although some of the ciliopathy genes are associated with only a single phenotypic class to date, other gene mutations can result in phenotypes along the entire ciliopathy clinical spectrum. For instance, mutations in CEP290 are reported in MKS, JBTS, NPHP, BBS, and LCA, spanning
Figure 3. Discovery Pipeline for Mechanism of Disease: Past and Future Paradigms (A) Former strategy to identify disease mechanisms, starting with ascertainment of pedigrees for linkage, disease mapping to a particular chromosomal locus, and candidate gene Sanger sequencing. Once a mutant gene was identified, animal models were created to understand the mechanism. (B) The future paradigm bypasses the need for informative pedigrees and disease mapping, instead going directly from patient ascertainment to genome sequencing and then to variant identification and expanding to identification of modifiers and mutational load. In parallel, patient-specific disease modeling using human cells coupled with protein interaction networks and therapeutic drug screening can further uncover disease mechanisms and help develop better treatments.
the full breadth of severity (Coppieters et al., 2010b), whereas ARL13B mutations are restricted to patients with JBTS. For ARL13B, it is not clear whether this gene is only capable of causing a restricted phenotype or whether there are additional mutations to be identified in other ciliopathy class disorders. Current data would suggest the latter, as only hypomorphic mutations in the ARL13B gene were identified in humans, whereas comparably more severe phenotypes were observed in mouse and zebrafish (Caspary et al., 2007; Sun et al., 2004), suggesting that the full spectrum of disease-causing alleles has not yet been reported. In the case of CEP290, despite the identification of over 100 unique disease-causing mutations, the ability to predict phenotype based upon genotype is extremely limited. The mechanism underlying the different clinical outcomes of distinct mutations is not always clear. One possibility might relate to the type of mutations and their locations as predictors of phenotypic severity. Though there are some types of mutations associated with particular phenotypes, such as the c.2991+1655A>G variant present in 21% of all LCA patients (den Hollander et al., 2006), the exact same mutation can be seen in two different ciliopathy classes (Coppieters et al., 2010b).
There are other examples of pleiotropy in the ciliopathies. NPHP1 mutations are found in pure NPHP and NPHP with RD (a combination known as SLS); INPP5E mutations are found in JBTS as well as BBS-like conditions. The case of TMEM67 deserves special attention in that mutations cause a broad range of phenotypic combinations that comprise renal cysts, liver fibrosis, central nervous system malformations, retinal manifestations, and postaxial polydactyly. The gene was first identified as both a cause of MKS3 and the origin of the multiple phenotypes of the Wpk rat, which include PKD, agenesis of the corpus callosum, and hydrocephalus (Smith et al., 2006). Thereafter, more than 80 TMEM67 mutations were identified, not only associated with MKS, but also with a peculiar form of JBTS involving liver fibrosis and several renal and liver ciliopathy syndromes in which missense mutations predominate (Brancati et al., 2009; Iannicelli et al., 2010). Interestingly, the presence of two truncating alleles or a missense mutation within exons 8–15 associates with the lethal MKS phenotypes, suggesting an essential function of this region of the encoded protein that is yet to be identified.

Modifiers

Whereas allelism offers one perspective with which to view phenotypic pleiotropy, epistatic interactions and mutation load offer another important perspective that deserves consideration. Such reports are starting to emerge across the full spectrum of human disease (Gu et al., 2009; Oprea et al., 2008) but are still limited in number due to underpowered studies. Within the ciliopathies, recent studies of families with more than one affected child and larger population screenings have highlighted an important role of epistatic interactions, whereby the effect of one gene modifies the phenotypic attributes of a different gene (Leitch et al., 2008; Wiszniewski et al., 2011) (Figure 1C). For this Review, we distinguish this from mutational load, whereby
the total genetic burden from accumulated deleterious variants sums to produce the phenotype. It is now clear that the precise disease manifestations and probably the timing of the appearance of symptoms are subject to modification, which could be in the form of stochastic, environmental, or genetic inputs. The identification of genetically encoded modifiers offers the potential to understand gene networks, improve prognostic information, and identify targetable biochemical processes for the development of therapeutic treatments. Evidence that heritable factors in humans can alter the course of ciliopathies came initially from an elegant series of experiments involving the role of the MGC1203 gene (Badano et al., 2006). The encoded protein, CCDC28B, contains a coiled-coil domain and was identified in a yeast two-hybrid screen as a BBS4-interacting protein. After demonstrating CCDC28B interaction with several BBS proteins, the authors screened a BBS cohort for potential MGC1203 mutations. Though none of the patients carried a mutation known to cause BBS, a C-to-T transition (C430T) in MGC1203 generated a splice defect in about 10% of gene products. To test the involvement of C430T as a potential modifier, they screened BBS and control cohorts for this variant, finding that 6.2% of patients versus 1.4% of controls carried the variant, representing an over-transmission of the variant in transmission disequilibrium testing. The authors reported three families with affected siblings carrying a homozygous mutation in BBS1 in which the RD severity was associated with the presence of the transition, suggesting that this variant increases the likelihood of developing more severe BBS symptoms when associated with a known BBS mutation. Since then, other examples of epistatic effects on RD associated with ciliopathies have emerged. Mutations in either AHI1 or RPGRIP1L, which both encode ciliary-localized proteins, were initially reported to cause JBTS.
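As a rough illustration of the effect size implied by the MGC1203 C430T carrier frequencies quoted above (6.2% of patients versus 1.4% of controls), the following sketch converts the two percentages into an odds ratio. It is a back-of-the-envelope calculation on the percentages alone, not the transmission disequilibrium test used in the original study. The function name `odds_ratio` is ours, and the cohort sizes that a confidence interval would require are not given here.

```python
def odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Odds ratio from two carrier frequencies (each strictly between 0 and 1)."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Carrier frequencies as quoted in the text for MGC1203 C430T.
patients = 0.062
controls = 0.014

c430t_or = odds_ratio(patients, controls)
print(f"Approximate odds ratio for C430T carriers: {c430t_or:.2f}")
```

With these inputs the odds ratio comes out at roughly 4.7, which says only that carriers are several-fold over-represented among patients. It says nothing about statistical significance.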
The AHI1 gene product physically interacts with nephrocystin-1 (Eley et al., 2008), the product of the most commonly mutated NPHP gene, NPHP1. Because more than 50% of patients with AHI1 mutations display RD/LCA in addition to JBTS, investigators considered that AHI1 might be mutated in isolated RD/LCA. However, screening a cohort of 176 mixed ancestry patients with LCA failed to demonstrate any causative mutations, indicating that AHI1 mutations do not lead to isolated LCA in the absence of diagnostic JBTS features (Louie et al., 2010). However, Ahi1 mutant mice predominantly displayed an LCA/RD phenotype, which was more severe when synthetically combined with an Nphp1 mutant allele, prompting investigation of epistatic interactions. Among a cohort of 153 NPHP1 mutant patients, there was a significant enrichment for a heterozygous c.C2488T change, leading to a functional p.R830W substitution in those with RD. This translated into a 7.5-fold increase in RD in this population, which represents one of the highest known risk alterations so far described for any human disease. The c.C2488T allele therefore significantly increases the risk of developing RD in the presence of an NPHP1 mutation. This association is not restricted to those with RD, as the same c.C2488T allele was more commonly found in individuals with NPHP1 mutations displaying a more severe neurological phenotype (Tory et al., 2007). Nor are AHI1 variants solely restricted to modifying NPHP1 phenotypes, as three unrelated patients with
the exact same CEP290 genotype (p.R1465X) presenting with different clinical phenotypes showed variants in AHI1 that might explain this discordance. AHI1 heterozygous transversions (p.N811K and p.H758P) were found associated with greater severity of nephrological and neurological phenotypes (Coppieters et al., 2010a). Intriguingly, the AHI1 variants in this case had no effect on the presence of RD. Thus, these epistatic interactions are both genotype specific and phenotype specific. In studies that use model systems, it is possible to manipulate gene structure or expression through a variety of techniques, but investigation of genetic modifiers in humans is limited to naturally occurring variations. This might seem like a huge drawback initially, but it has tremendous benefits in the long run because large numbers of patients can be analyzed for the more common variants. Several heterozygous variants have been reported in RPGRIP1L in ciliopathy cohorts, and though their individual rarity precluded further investigation, one remarkable exception, the heterozygous p.A229T, is common enough to apply statistical analysis. The RPGRIP1L protein interacts with RPGR, a ciliary protein frequently mutated in RD, and this interaction is disrupted in the presence of the p.A229T transversion (Khanna et al., 2009). Individuals with this variant alone have normal vision, suggesting that this heterozygous variant is silent in isolation. However, it is enriched in ciliopathy patients with RD compared to those without RD. There are probably many such gene-gene interactions, but defining those of highest effect size and the mechanisms by which these epistatic effects are manifested will require new experimental strategies.

Mutational Load

Is it possible that such second-site mutations might modify not only the expressivity, but also the penetrance of disease?
BBS has long provided the classic example of oligogenicity (Katsanis et al., 2001), representing the idea that second-site mutations are necessary to produce an observable phenotype. The demonstration that three mutant alleles at two different loci were required for pathogenicity was among the first examples of oligogenic inheritance. In this example, investigators observed that both affected and healthy children of an outbred family carried two different nonsense mutations in BBS2 (p.Y24X and p.Q59X). In an effort to identify the differential genetic trait that caused the disease in the affected sibling, they identified a heterozygous nonsense potentially deleterious sequence variant (PDSV) in BBS6 (Q147X) that was absent in the healthy sibling, suggesting that the three different mutant alleles were necessary to produce this phenotype. Subsequently, additional examples were described in which BBS patients carried three mutant alleles in two different loci (Eichers et al., 2004). However, in such examples, the segregation does not exclude the possibility that two mutations in the same gene are sufficient to cause disease in a different genetic background (Mykytyn et al., 2003). Moreover, the lack of recurrence of such combinations in BBS cohorts studied complicates the validation of the oligogenic hypothesis. Although oligogenic inheritance in BBS is still hotly contested, several reports of possible oligogenic inheritance in other ciliopathies have emerged more recently (Hoefele et al., 2007; Hopp et al., 2011), in which only single heterozygous PDSVs are detected or in which patients with two deleterious mutations
in one gene also carry a heterozygous PDSV in a second gene. The evidence for oligogenicity is somewhat tempered by the natural limitations of the technology used to support the findings, and whether such variants are functionally relevant and whether they are overrepresented in patients compared with similarly investigated controls is still a matter of debate. Nevertheless, the observed phenotypic and genetic variability, together with an apparent overabundance of PDSVs in ciliopathy genes, has led to the general acceptance of a “mutational load” hypothesis. Defining the mutational load required for the manifestation of a given phenotypic combination or expressivity is now the main challenge. One potential reason for optimism is the emergence of next-generation sequencing, which will allow genome-wide unbiased exploration of the mutational load hypothesis. For instance, it would be fascinating to test whether patients with ciliopathy phenotypes carry greater burdens or particular patterns of mutations in cilia-specific genes compared with housekeeping genes on a genome-wide scale.

IPSCs, Cell-Based Screens, and Treatments for Ciliopathies

What does the future of human genetics hold, and how can disorders like the ciliopathies benefit from new technological advances? Whole-genome sequencing (WGS) and whole-exome sequencing (WES) are clearly the breakthroughs most likely to impact the study of Mendelian disorders like ciliopathies. The exons, accounting for 1% of the genome, harbor most disease-causing mutations of high-effect size; therefore, WES is a reasonable approach for finding novel disease-causing genes. This is evident from the abundance of novel disease-causing genes identified in recent years.
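Once exome data are available, the burden comparison proposed above (cilia-specific versus housekeeping genes, in patients versus controls) might be set up as a simple 2x2 enrichment test. The sketch below is one minimal way to do this, using a one-sided Fisher's exact test written with the standard library only. The variant counts are invented purely for illustration, and `fisher_exact_greater` is a hypothetical helper, not a method from any of the cited studies. A real analysis would also need matched controls, consistent variant filtering, and correction for gene length and multiple testing.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, under the hypergeometric null, of seeing
    a count of at least `a` in the top-left cell with all margins fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts of rare, potentially deleterious variants (toy data).
patients_cilia, patients_housekeeping = 30, 70
controls_cilia, controls_housekeeping = 15, 85

p_value = fisher_exact_greater(
    patients_cilia, patients_housekeeping,
    controls_cilia, controls_housekeeping,
)
print(f"One-sided p-value for cilia-gene enrichment in patients: {p_value:.4f}")
```

A small p-value here would indicate only that, in this toy table, a larger share of the patients' variants falls in cilia genes than expected by chance. It would not by itself establish a mutational load effect.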
The limitations of WES can be overcome with WGS, with its ability to identify not just variants in coding regions of the genes, but also variants in introns, untranslated regions, and noncoding RNAs, such as the recent discovery of mutations in the U4atac snRNA in microcephalic osteodysplastic primordial dwarfism type I (Edery et al., 2011; He et al., 2011). Additionally, with genome-wide approaches, these data sets offer the possibility to directly test the mutational load hypothesis and identify a host of epistatic modifiers. At this point, we still require better bioinformatic tools to predict resultant human phenotypes, but this technology is already greatly benefiting the field of human genetics, with an explosion in the number of newly discovered human disease genes. The ciliopathies are no exception (Gilissen et al., 2010; Hopp et al., 2011). Combining these genetic discoveries with proteomic research is another huge area for the future as molecular geneticists take advantage of the explosion of new human disease genes. Identifying functional complexes, modules, and genetic pathways through the identification of protein-binding partners and animal modeling will yield more robust platforms from which to consider therapeutics. An example is in PKD, one of the most common Mendelian diseases. Based upon the knowledge of similarities in phenotype between PKD and tuberous sclerosis complex (TSC), investigators wondered whether there was a possible functional connection. The two genes mutated in TSC are involved in the mTOR pathway, and some symptoms are treatable with mTOR inhibitors like rapamycin. Investigators hypothesized that PKD1 and tuberin (the product of one of the genes mutated in TSC) might interact. After confirming this hypothesis, they tested whether, like TSC, PKD might be abrogated by mTOR inhibitors. Strikingly, rapamycin resulted in a significant reversal of renal cystogenesis in two different mouse models (Shillingford et al., 2006), and these findings led to a clinical trial in humans. More recent work hints at the possibility that mTOR inhibitors may benefit other ciliopathy conditions (Tobin and Beales, 2008). Although the human trials were not successful and will need to be repeated with different endpoints, the story exemplifies the streamlining of approaches as investigators move from human disease genes to new treatments. Because animal models can frequently be significantly different from their human counterparts in terms of organ specificity and severity, they have been of limited benefit in directly exploring disease pathogenesis. For instance, both published mouse models for the most commonly mutated gene in human NPHP, NPHP1, show no detectable kidney phenotype. This could be a result of functional differences in the kidney in humans versus mice, differences in the types of mutations, or differences in the genetic background. Although NPHP1 mutant mice have some evidence of ciliopathy-like features, including aberrant sperm maturation and a background-dependent genetic interaction with AHI1 in RD (Jiang et al., 2009; Jiang et al., 2008; Louie et al., 2010), in general, the use of animal models of human disease often requires careful assessment of relevance. As in many other diseases, investigators who continue to return to patients and patient-derived samples for clues to pathogenesis are more apt to make disease-relevant discoveries. Induced pluripotent stem cells (IPSCs) offer such an opportunity, especially for diseases like ciliopathies, in which the genetic background is probably critical for phenotypic expressivity.
For this reason, it might be preferable to work with a patient sample carrying a known disease-causing mutation in the disease-causing genetic background rather than a sample from a knockout mouse, in which the species or the species’ genetic background may not be appropriate for full expressivity. IPSCs and other cellular reprogramming strategies offer an opportunity to make discoveries in disease-relevant cells. For example, a recent application of the spheroid cellular assay to probe mechanistic insights of kidney cysts (Otto et al., 2010) might greatly benefit from the use of patient-derived cells. Functional genomic cell-based screens also offer a complementary approach to the study of human disease by leapfrogging past animal models in the identification of potential treatment-relevant targets. In the past year, a number of such cell-based high-throughput or high-content screening systems to identify genes required for cilium assembly and maintenance or signaling pathways known to depend upon the cilium have emerged. By combining stable isotope labeling with amino acids in cell culture (SILAC) with BAC transgenesis in human cells, a list of 135 new centriolar components that are specific to either the mother or daughter centriole was identified (Jakobsen et al., 2011). Given the important coordination of centriole function with ciliogenesis, this list is likely to lead to many new functional discoveries in the future. Genome-wide RNAi screening in cells has identified the multitasking kinase Stk11 (a.k.a. Lkb1) as a key factor in cilium stability and an integrator of Shh and Wnt
signals in cells, two key pathways modulated by the cilium (Jacob et al., 2011). Finally, two recent papers established the genetic requirements of cilium assembly and length in mammalian cells, highlighting the important contribution of the actin cytoskeleton and showing that it is possible to uncouple ciliary cargo transport from cilia formation in vertebrates (Kim et al., 2010; Lai et al., 2011). Rapidly evolving sequencing methods combined with the underlying growth of informatic algorithms can provide the power to uncover both new causes as well as new potential treatments in human disease. Using the relatively simple example of the ciliopathies, we can see how the paradigm is shifting in experimental biology to one that is less reliant on the pure study of animal models, in favor of using humans as a model to study human disease. Further integration of these new approaches has the potential to yield improved diagnosis and opens the window for new therapies across the field of genetics.

ACKNOWLEDGMENTS

N.A. is supported by a training grant from the California Institute for Regenerative Medicine. G.N. is supported by a grant from the Deutsche Forschungsgemeinschaft (DFG). Work in the Gleeson lab is supported by grants from the NIH (R01NS041537, P01HD070494, R01NS052455, P30NS047101, and R01NS048453). J.G.G. is an Investigator of the Simons Foundation Autism Research Initiative (SFARI) and the Howard Hughes Medical Institute.

REFERENCES

Amir, R.E., Van den Veyver, I.B., Wan, M., Tran, C.Q., Francke, U., and Zoghbi, H.Y. (1999). Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2. Nat. Genet. 23, 185–188.
Ansley, S.J., Badano, J.L., Blacque, O.E., Hill, J., Hoskins, B.E., Leitch, C.C., Kim, J.C., Ross, A.J., Eichers, E.R., Teslovich, T.M., et al. (2003). Basal body dysfunction is a likely cause of pleiotropic Bardet-Biedl syndrome. Nature 425, 628–633.
Arnaiz, O., Malinowska, A., Klotz, C., Sperling, L., Dadlez, M., Koll, F., and Cohen, J. (2009). Cildb: a knowledgebase for centrosomes and cilia. Database (Oxford) 2009, bap022.
Avidor-Reiss, T., Maer, A.M., Koundakjian, E., Polyanovsky, A., Keil, T., Subramaniam, S., and Zuker, C.S. (2004). Decoding cilia function: defining specialized genes required for compartmentalized cilia biogenesis. Cell 117, 527–539.
Baala, L., Audollent, S., Martinovic, J., Ozilou, C., Babron, M.C., Sivanandamoorthy, S., Saunier, S., Salomon, R., Gonzales, M., Rattenberry, E., et al. (2007a). Pleiotropic effects of CEP290 (NPHP6) mutations extend to Meckel syndrome. Am. J. Hum. Genet. 81, 170–179.
Baala, L., Romano, S., Khaddour, R., Saunier, S., Smith, U.M., Audollent, S., Ozilou, C., Faivre, L., Laurent, N., Foliguet, B., et al. (2007b). The Meckel-Gruber syndrome gene, MKS3, is mutated in Joubert syndrome. Am. J. Hum. Genet. 80, 186–194.
Badano, J.L., Leitch, C.C., Ansley, S.J., May-Simera, H., Lawson, S., Lewis, R.A., Beales, P.L., Dietz, H.C., Fisher, S., and Katsanis, N. (2006). Dissection of epistasis in oligogenic Bardet-Biedl syndrome. Nature 439, 326–330.
Barr, M.M., and Sternberg, P.W. (1999). A polycystic kidney-disease gene homologue required for male mating behaviour in C. elegans. Nature 401, 386–389.
Brancati, F., Iannicelli, M., Travaglini, L., Mazzotta, A., Bertini, E., Boltshauser, E., D’Arrigo, S., Emma, F., Fazzi, E., Gallizzi, R., et al; International JSRD Study Group. (2009). MKS3/TMEM67 mutations are a major cause of COACH Syndrome, a Joubert Syndrome related disorder with liver involvement. Hum. Mutat. 30, E432–E442.
Caspary, T., Larkins, C.E., and Anderson, K.V. (2007). The graded response to Sonic Hedgehog depends on cilia architecture. Dev. Cell 12, 767–778.
Chen, Y., Bedell, M., and Zhang, K. (2010). Age-related macular degeneration: genetic and environmental factors of disease. Mol. Interv. 10, 271–281.
Chiang, A.P., Nishimura, D., Searby, C., Elbedour, K., Carmi, R., Ferguson, A.L., Secrist, J., Braun, T., Casavant, T., Stone, E.M., and Sheffield, V.C. (2004). Comparative genomic analysis identifies an ADP-ribosylation factor-like gene as the cause of Bardet-Biedl syndrome (BBS3). Am. J. Hum. Genet. 75, 475–484.
Coppieters, F., Casteels, I., Meire, F., De Jaegere, S., Hooghe, S., van Regemorter, N., Van Esch, H., Matuleviciene, A., Nunes, L., Meersschaut, V., et al. (2010a). Genetic screening of LCA in Belgium: predominance of CEP290 and identification of potential modifier alleles in AHI1 of CEP290-related phenotypes. Hum. Mutat. 31, E1709–E1766.
Coppieters, F., Lefever, S., Leroy, B.P., and De Baere, E. (2010b). CEP290, a gene with many faces: mutation overview and presentation of CEP290base. Hum. Mutat. 31, 1097–1108.
Craige, B., Tsao, C.C., Diener, D.R., Hou, Y., Lechtreck, K.F., Rosenbaum, J.L., and Witman, G.B. (2010). CEP290 tethers flagellar transition zone microtubules to the membrane and regulates flagellar protein content. J. Cell Biol. 190, 927–940.
den Hollander, A.I., Koenekoop, R.K., Yzer, S., Lopez, I., Arends, M.L., Voesenek, K.E., Zonneveld, M.N., Strom, T.M., Meitinger, T., Brunner, H.G., et al. (2006). Mutations in the CEP290 (NPHP6) gene are a frequent cause of Leber congenital amaurosis. Am. J. Hum. Genet. 79, 556–561.
Edery, P., Marcaillou, C., Sahbatou, M., Labalme, A., Chastang, J., Touraine, R., Tubacher, E., Senni, F., Bober, M.B., Nampoothiri, S., et al. (2011). Association of TALS developmental disorder with defect in minor splicing component U4atac snRNA. Science 332, 240–243.
Eichers, E.R., Lewis, R.A., Katsanis, N., and Lupski, J.R. (2004).
Triallelic inheritance: a bridge between Mendelian and multifactorial traits. Ann. Med. 36, 262–272.
Eley, L., Gabrielides, C., Adams, M., Johnson, C.A., Hildebrandt, F., and Sayer, J.A. (2008). Jouberin localizes to collecting ducts and interacts with nephrocystin-1. Kidney Int. 74, 1139–1149.
Garcia-Gonzalo, F.R., Corbit, K.C., Sirerol-Piquer, M.S., Ramaswami, G., Otto, E.A., Noriega, T.R., Seol, A.D., Robinson, J.F., Bennett, C.L., Josifova, D.J., et al. (2011). A transition zone complex regulates mammalian ciliogenesis and ciliary membrane composition. Nat. Genet. 43, 776–784.
Gherman, A., Davis, E.E., and Katsanis, N. (2006). The ciliary proteome database: an integrated community resource for the genetic and functional dissection of cilia. Nat. Genet. 38, 961–962.
Gilissen, C., Arts, H.H., Hoischen, A., Spruijt, L., Mans, D.A., Arts, P., van Lier, B., Steehouwer, M., van Reeuwijk, J., Kant, S.G., et al. (2010). Exome sequencing identifies WDR35 variants involved in Sensenbrenner syndrome. Am. J. Hum. Genet. 87, 418–423.
Gorden, N.T., Arts, H.H., Parisi, M.A., Coene, K.L., Letteboer, S.J., van Beersum, S.E., Mans, D.A., Hikida, A., Eckert, M., Knutzen, D., et al. (2008). CC2D2A is mutated in Joubert syndrome and interacts with the ciliopathy-associated basal body protein CEP290. Am. J. Hum. Genet. 83, 559–571.
Gu, Y., Harley, I.T., Henderson, L.B., Aronow, B.J., Vietor, I., Huber, L.A., Harley, J.B., Kilpatrick, J.R., Langefeld, C.D., Williams, A.H., et al. (2009). Identification of IFRD1 as a modifier gene for cystic fibrosis lung disease. Nature 458, 1039–1042.
Habashi, J.P., Judge, D.P., Holm, T.M., Cohn, R.D., Loeys, B.L., Cooper, T.K., Myers, L., Klein, E.C., Liu, G., Calvi, C., et al. (2006). Losartan, an AT1 antagonist, prevents aortic aneurysm in a mouse model of Marfan syndrome. Science 312, 117–121.
He, H., Liyanarachchi, S., Akagi, K., Nagy, R., Li, J., Dietrich, R.C., Li, W., Sebastian, N., Wen, B., Xin, B., et al. (2011).
Mutations in U4atac snRNA, a component of the minor spliceosome, in the developmental disorder MOPD I. Science 332, 238–240.
Hoefele, J., Wolf, M.T., O’Toole, J.F., Otto, E.A., Schultheiss, U., Deschênes, G., Attanasio, M., Utsch, B., Antignac, C., and Hildebrandt, F. (2007). Evidence of oligogenic inheritance in nephronophthisis. J. Am. Soc. Nephrol. 18, 2789–2795.
Hopp, K., Heyer, C.M., Hommerding, C.J., Henke, S.A., Sundsbak, J.L., Patel, S., Patel, P., Consugar, M.B., Czarnecki, P.G., Gliem, T.J., et al. (2011). B9D1 is revealed as a novel Meckel syndrome (MKS) gene by targeted exon-enriched next-generation sequencing and deletion analysis. Hum. Mol. Genet. 20, 2524–2534.
Hu, Q., Milenkovic, L., Jin, H., Scott, M.P., Nachury, M.V., Spiliotis, E.T., and Nelson, W.J. (2010). A septin diffusion barrier at the base of the primary cilium maintains ciliary membrane protein distribution. Science 329, 436–439.
Huangfu, D., Liu, A., Rakeman, A.S., Murcia, N.S., Niswander, L., and Anderson, K.V. (2003). Hedgehog signalling in the mouse requires intraflagellar transport proteins. Nature 426, 83–87.
Iannicelli, M., Brancati, F., Mougou-Zerelli, S., Mazzotta, A., Thomas, S., Elkhartoufi, N., Travaglini, L., Gomes, C., Ardissino, G.L., Bertini, E., et al; International JSRD Study Group. (2010). Novel TMEM67 mutations and genotype-phenotype correlates in meckelin-related ciliopathies. Hum. Mutat. 31, E1319–E1331.
Inglis, P.N., Boroevich, K.A., and Leroux, M.R. (2006). Piecing together a ciliome. Trends Genet. 22, 491–500.
Jacob, L.S., Wu, X., Dodge, M.E., Fan, C.W., Kulak, O., Chen, B., Tang, W., Wang, B., Amatruda, J.F., and Lum, L. (2011). Genome-wide RNAi screen reveals disease-associated genes that are common to Hedgehog and Wnt signaling. Sci. Signal. 4, ra4.
Jakobsen, L., Vanselow, K., Skogs, M., Toyoda, Y., Lundberg, E., Poser, I., Falkenby, L.G., Bennetzen, M., Westendorf, J., Nigg, E.A., et al. (2011). Novel asymmetrically localizing components of human centrosomes identified by complementary proteomics methods. EMBO J. 30, 1520–1535.
Jiang, S.T., Chiou, Y.Y., Wang, E., Lin, H.K., Lee, S.P., Lu, H.Y., Wang, C.K., Tang, M.J., and Li, H. (2008). Targeted disruption of Nphp1 causes male infertility due to defects in the later steps of sperm morphogenesis in mice. Hum. Mol. Genet. 17, 3368–3379. Jiang, S.T., Chiou, Y.Y., Wang, E., Chien, Y.L., Ho, H.H., Tsai, F.J., Lin, C.Y., Tsai, S.P., and Li, H. (2009). Essential role of nephrocystin in photoreceptor intraflagellar transport in mouse. Hum. Mol. Genet. 18, 1566–1577. Jin, H., White, S.R., Shida, T., Schulz, S., Aguiar, M., Gygi, S.P., Bazan, J.F., and Nachury, M.V. (2010). The conserved Bardet-Biedl syndrome proteins assemble a coat that traffics membrane proteins to cilia. Cell 141, 1208–1219. Katsanis, N., Ansley, S.J., Badano, J.L., Eichers, E.R., Lewis, R.A., Hoskins, B.E., Scambler, P.J., Davidson, W.S., Beales, P.L., and Lupski, J.R. (2001). Triallelic inheritance in Bardet-Biedl syndrome, a Mendelian recessive disorder. Science 293, 2256–2259. Khanna, H., Davis, E.E., Murga-Zamalloa, C.A., Estrada-Cuzcano, A., Lopez, I., den Hollander, A.I., Zonneveld, M.N., Othman, M.I., Waseem, N., Chakarova, C.F., et al. (2009). A common allele in RPGRIP1L is a modifier of retinal degeneration in ciliopathies. Nat. Genet. 41, 739–745. Kim, J.C., Badano, J.L., Sibold, S., Esmail, M.A., Hill, J., Hoskins, B.E., Leitch, C.C., Venner, K., Ansley, S.J., Ross, A.J., et al. (2004). The Bardet-Biedl protein BBS4 targets cargo to the pericentriolar region and is required for microtubule anchoring and cell cycle progression. Nat. Genet. 36, 462–470. Kim, J., Lee, J.E., Heynen-Genel, S., Suyama, E., Ono, K., Lee, K., Ideker, T., Aza-Blanc, P., and Gleeson, J.G. (2010). Functional genomic screen for modulators of ciliogenesis and cilium length. Nature 464, 1048–1051.
Lai, C.K., Gupta, N., Wen, X., Rangell, L., Chih, B., Peterson, A.S., Bazan, J.F., Li, L., and Scales, S.J. (2011). Functional characterization of putative cilia genes by high-content analysis. Mol. Biol. Cell 22, 1104–1119. Leitch, C.C., Zaghloul, N.A., Davis, E.E., Stoetzel, C., Diaz-Font, A., Rix, S., Alfadhel, M., Lewis, R.A., Eyaid, W., Banin, E., et al. (2008). Hypomorphic mutations in syndromic encephalocele genes are associated with BardetBiedl syndrome. Nat. Genet. 40, 443–448. Li, J.B., Gerdes, J.M., Haycraft, C.J., Fan, Y., Teslovich, T.M., May-Simera, H., Li, H., Blacque, O.E., Li, L., Leitch, C.C., et al. (2004). Comparative genomics identifies a flagellar and basal body proteome that includes the BBS5 human disease gene. Cell 117, 541–552. Louie, C.M., Caridi, G., Lopes, V.S., Brancati, F., Kispert, A., Lancaster, M.A., Schlossman, A.M., Otto, E.A., Leitges, M., Gro¨ne, H.J., et al. (2010). AHI1 is required for photoreceptor outer segment development and is a modifier for retinal degeneration in nephronophthisis. Nat. Genet. 42, 175–180. Mykytyn, K., Nishimura, D.Y., Searby, C.C., Beck, G., Bugge, K., Haines, H.L., Cornier, A.S., Cox, G.F., Fulton, A.B., Carmi, R., et al. (2003). Evaluation of complex inheritance involving the most common Bardet-Biedl syndrome locus (BBS1). Am. J. Hum. Genet. 72, 429–437. Nachury, M.V., Loktev, A.V., Zhang, Q., Westlake, C.J., Pera¨nen, J., Merdes, A., Slusarski, D.C., Scheller, R.H., Bazan, J.F., Sheffield, V.C., and Jackson, P.K. (2007). A core complex of BBS proteins cooperates with the GTPase Rab8 to promote ciliary membrane biogenesis. Cell 129, 1201–1213. Oprea, G.E., Kro¨ber, S., McWhorter, M.L., Rossoll, W., Mu¨ller, S., Krawczak, M., Bassell, G.J., Beattie, C.E., and Wirth, B. (2008). Plastin 3 is a protective modifier of autosomal recessive spinal muscular atrophy. Science 320, 524–527. 
Otto, E.A., Schermer, B., Obara, T., O’Toole, J.F., Hiller, K.S., Mueller, A.M., Ruf, R.G., Hoefele, J., Beekmann, F., Landau, D., et al. (2003). Mutations in INVS encoding inversin cause nephronophthisis type 2, linking renal cystic disease to the function of primary cilia and left-right axis determination. Nat. Genet. 34, 413–420. Otto, E.A., Loeys, B., Khanna, H., Hellemans, J., Sudbrak, R., Fan, S., Muerb, U., O’Toole, J.F., Helou, J., Attanasio, M., et al. (2005). Nephrocystin-5, a ciliary IQ domain protein, is mutated in Senior-Loken syndrome and interacts with RPGR and calmodulin. Nat. Genet. 37, 282–288. Otto, E.A., Hurd, T.W., Airik, R., Chaki, M., Zhou, W., Stoetzel, C., Patil, S.B., Levy, S., Ghosh, A.K., Murga-Zamalloa, C.A., et al. (2010). Candidate exome capture identifies mutation of SDCCAG8 as the cause of a retinal-renal ciliopathy. Nat. Genet. 42, 840–850. Piasecki, B.P., Burghoorn, J., and Swoboda, P. (2010). Regulatory Factor X (RFX)-mediated transcriptional rewiring of ciliary genes in animals. Proc. Natl. Acad. Sci. USA 107, 12969–12974. Sang, L., Miller, J.J., Corbit, K.C., Giles, R.H., Brauer, M.J., Otto, E.A., Baye, L.M., Wen, X., Scales, S.J., Kwong, M., et al. (2011). Mapping the NPHPJBTS-MKS protein network reveals ciliopathy disease genes and pathways. Cell 145, 513–528. Schneider, L., Clement, C.A., Teilmann, S.C., Pazour, G.J., Hoffmann, E.K., Satir, P., and Christensen, S.T. (2005). PDGFRalphaalpha signaling is regulated through the primary cilium in fibroblasts. Curr. Biol. 15, 1861–1866. Shillingford, J.M., Murcia, N.S., Larson, C.H., Low, S.H., Hedgepeth, R., Brown, N., Flask, C.A., Novick, A.C., Goldfarb, D.A., Kramer-Zucker, A., et al. (2006). The mTOR pathway is regulated by polycystin-1, and its inhibition reverses renal cystogenesis in polycystic kidney disease. Proc. Natl. Acad. Sci. USA 103, 5466–5471.
Kozminski, K.G., Johnson, K.A., Forscher, P., and Rosenbaum, J.L. (1993). A motility in the eukaryotic flagellum unrelated to flagellar beating. Proc. Natl. Acad. Sci. USA 90, 5519–5523.
Simons, M., Gloy, J., Ganner, A., Bullerkotte, A., Bashkurov, M., Kro¨nig, C., Schermer, B., Benzing, T., Cabello, O.A., Jenny, A., et al. (2005). Inversin, the gene product mutated in nephronophthisis type II, functions as a molecular switch between Wnt signaling pathways. Nat. Genet. 37, 537–543.
Kytta¨la¨, M., Tallila, J., Salonen, R., Kopra, O., Kohlschmidt, N., Paavola-Sakki, P., Peltonen, L., and Kestila¨, M. (2006). MKS1, encoding a component of the flagellar apparatus basal body proteome, is mutated in Meckel syndrome. Nat. Genet. 38, 155–157.
Smith, U.M., Consugar, M., Tee, L.J., McKee, B.M., Maina, E.N., Whelan, S., Morgan, N.V., Goranson, E., Gissen, P., Lilliquist, S., et al. (2006). The transmembrane protein meckelin (MKS3) is mutated in Meckel-Gruber syndrome and the wpk rat. Nat. Genet. 38, 191–196.
78 Cell 147, September 30, 2011 ª2011 Elsevier Inc.
Sun, Z., Amsterdam, A., Pazour, G.J., Cole, D.G., Miller, M.S., and Hopkins, N. (2004). A genetic screen in zebrafish identifies cilia genes as a principal cause of cystic kidney. Development 131, 4085–4093.
Williams, C.L., Winkelbauer, M.E., Schafer, J.C., Michaud, E.J., and Yoder, B.K. (2008). Functional redundancy of the B9 proteins and nephrocystins in Caenorhabditis elegans ciliogenesis. Mol. Biol. Cell 19, 2154–2168.
Swoboda, P., Adler, H.T., and Thomas, J.H. (2000). The RFX-type transcription factor DAF-19 regulates sensory neuron cilium formation in C. elegans. Mol. Cell 5, 411–421.
Williams, C.L., Li, C., Kida, K., Inglis, P.N., Mohan, S., Semenec, L., Bialas, N.J., Stupay, R.M., Chen, N., Blacque, O.E., et al. (2011). MKS and NPHP modules cooperate to establish basal body/transition zone membrane associations and ciliary gate function during ciliogenesis. J. Cell Biol. 192, 1023–1041.
Taulman, P.D., Haycraft, C.J., Balkovetz, D.F., and Yoder, B.K. (2001). Polaris, a protein involved in left-right axis patterning, localizes to basal bodies and cilia. Mol. Biol. Cell 12, 589–599. Tobin, J.L., and Beales, P.L. (2008). Restoration of renal function in zebrafish models of ciliopathies. Pediatr. Nephrol. 23, 2095–2099. Tory, K., Lacoste, T., Burglen, L., Morinie`re, V., Boddaert, N., Macher, M.A., Llanas, B., Nivet, H., Bensman, A., Niaudet, P., et al. (2007). High NPHP1 and NPHP6 mutation rate in patients with Joubert syndrome and nephronophthisis: potential epistatic effect of NPHP6 and AHI1 mutations in patients with NPHP1 mutations. J. Am. Soc. Nephrol. 18, 1566–1575.
Wiszniewski, W., Lewis, R.A., Stockton, D.W., Peng, J., Mardon, G., Chen, R., and Lupski, J.R. (2011). Potential involvement of more than one locus in trait manifestation for individuals with Leber congenital amaurosis. Hum. Genet. 129, 319–327. Yen, H.J., Tayeh, M.K., Mullins, R.F., Stone, E.M., Sheffield, V.C., and Slusarski, D.C. (2006). Bardet-Biedl syndrome genes are important in retrograde intracellular trafficking and Kupffer’s vesicle cilia function. Hum. Mol. Genet. 15, 667–677.
Cell 147, September 30, 2011 ª2011 Elsevier Inc. 79
The Lin28/let-7 Axis Regulates Glucose Metabolism

Hao Zhu,1,2,3,4,14 Ng Shyh-Chang,1,2,8,14 Ayellet V. Segrè,5,6 Gen Shinoda,1,2 Samar P. Shah,1,2 William S. Einhorn,1,2,4 Ayumu Takeuchi,1,2 Jesse M. Engreitz,7 John P. Hagan,1,2,8,9 Michael G. Kharas,1,2,4 Achia Urbach,1,2 James E. Thornton,1,2,8 Robinson Triboulet,1,2,8 Richard I. Gregory,1,2,8 DIAGRAM Consortium,13 MAGIC Investigators,13 David Altshuler,5,6,10 and George Q. Daley1,2,4,8,11,12,*

1Stem Cell Transplantation Program, Stem Cell Program, Division of Pediatric Hematology/Oncology, Children’s Hospital Boston and Dana Farber Cancer Institute, Boston, MA, USA
2Harvard Stem Cell Institute, Boston, MA, USA
3Division of Medical Oncology, Dana Farber Cancer Institute, Boston, MA, USA
4Division of Hematology, Brigham and Women’s Hospital, Boston, MA, USA
5Department of Molecular Biology, Diabetes Unit Department of Medicine, and Center for Human Genetics Research, Massachusetts General Hospital, Boston, MA, USA
6Program in Medical and Population Genetics, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA
7Division of Health Sciences and Technology, MIT, Cambridge, MA, USA
8Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA, USA
9Department of Molecular Virology, Immunology and Medical Genetics, The Ohio State University Medical Center, Columbus, OH, USA
10Departments of Genetics and of Medicine, Harvard Medical School, Boston, MA, USA
11Howard Hughes Medical Institute, Boston, MA, USA
12Manton Center for Orphan Disease Research, Boston, MA, USA
13Memberships of the consortia are provided in Text S1 available online
14These authors contributed equally to this work
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.08.033
Cell 147, 81–94, September 30, 2011 ©2011 Elsevier Inc.

SUMMARY

The let-7 tumor suppressor microRNAs are known for their regulation of oncogenes, while the RNA-binding proteins Lin28a/b promote malignancy by inhibiting let-7 biogenesis. We have uncovered unexpected roles for the Lin28/let-7 pathway in regulating metabolism. When overexpressed in mice, both Lin28a and LIN28B promote an insulin-sensitized state that resists high-fat-diet induced diabetes. Conversely, muscle-specific loss of Lin28a or overexpression of let-7 results in insulin resistance and impaired glucose tolerance. These phenomena occur, in part, through the let-7-mediated repression of multiple components of the insulin-PI3K-mTOR pathway, including IGF1R, INSR, and IRS2. In addition, the mTOR inhibitor, rapamycin, abrogates Lin28a-mediated insulin sensitivity and enhanced glucose uptake. Moreover, let-7 targets are enriched for genes containing SNPs associated with type 2 diabetes and control of fasting glucose in human genome-wide association studies. These data establish the Lin28/let-7 pathway as a central regulator of mammalian glucose metabolism.

INTRODUCTION

Metabolic disease and malignancy are proposed to share common biological mechanisms. Reprogramming toward glycolytic metabolism can increase a cancer cell’s ability to generate biomass, a phenomenon termed the ‘‘Warburg Effect’’ (Denko, 2008; Engelman et al., 2006; Gao et al., 2009; Guertin and Sabatini, 2007; Laplante and Sabatini, 2009; Vander Heiden et al., 2009; Yun et al., 2009). Likewise, many genes identified in type 2 diabetes (T2D) genome wide association studies (GWAS) are proto-oncogenes or cell cycle regulators (Voight et al., 2010). MicroRNAs (miRNAs) are also emerging as agents of metabolic and malignant regulation in development and disease (Hyun et al., 2009; Peter, 2009).

The let-7 miRNA family members act as tumor suppressors by negatively regulating the translation of oncogenes and cell cycle regulators (Johnson et al., 2005; Lee and Dutta, 2007; Mayr et al., 2007; Kumar et al., 2008). Widespread expression and redundancy among the well-conserved let-7 miRNAs raise the question of how cancer and embryonic cells are able to suppress this miRNA family to accommodate rapid cell proliferation. In human cancers, loss of heterozygosity, DNA methylation, and transcriptional suppression have been documented as mechanisms to reduce let-7 (Johnson et al., 2005; Lu et al., 2007). Another mechanism for let-7 downregulation involves the RNA-binding proteins Lin28a and Lin28b (collectively referred to as Lin28a/b), which are highly expressed during normal embryogenesis and upregulated in some cancers to potently and selectively block the maturation of let-7 (Heo et al., 2008; Newman et al., 2008; Piskounova et al., 2008; Rybak et al., 2008; Viswanathan et al., 2008). By repressing the biogenesis of let-7 miRNAs and in some cases through direct mRNA binding and enhanced translation (Polesskaya et al., 2007; Xu and Huang, 2009; Xu et al., 2009; Peng et al., 2011), Lin28a/b regulate an array of targets involved in cell proliferation and
[Figure 1 graphic omitted. Panels A–P plot lean and fat weight, % initial weight, glucose (mg/dl) and % initial glucose versus minutes after IP injection, LIN28B mRNA expression, and let-7 fold change (let-7a, let-7b, let-7g) for Lin28a Tg, iLIN28B Tg, and Myf5-Cre;Lin28a muscle-specific knockout mice and their controls; see the Figure 1 caption.]
differentiation in the context of embryonic stem cells and cancer (Viswanathan and Daley, 2010). Little is known about the in vivo function of the Lin28/let-7 axis. The pathway was first revealed in a screen for heterochronic mutants in C. elegans, where loss of lin-28 resulted in precocious vulval differentiation and premature developmental progression (Ambros and Horvitz, 1984; Moss et al., 1997; Nimmo and Slack, 2009), whereas loss of let-7 led to reiteration of larval stages and delayed differentiation (Abbott et al., 2005; Reinhart et al., 2000). We previously showed that Lin28a gain of function promotes mouse growth and delays sexual maturation, recapitulating the heterochronic effects of lin-28 and let-7 in C. elegans, as well as the height and puberty phenotypes linked to human genetic variation at the Lin28b locus identified in GWAS (Zhu et al., 2010). The conservation of Lin28 and let-7’s biochemical and physiological functions throughout evolution suggests an ancient mechanism for Lin28 and let-7’s effects on growth and developmental timing. In this report we found that both Lin28a and LIN28B transgenic mice were resistant to obesity and exhibited enhanced glucose tolerance. In contrast, muscle-specific Lin28a knockout and inducible let-7 transgenic mice displayed glucose intolerance, suggesting that the Lin28/let-7 pathway plays a specific and tightly regulated role in modulating glucose metabolism in mammals. In vitro experiments revealed that Lin28a enhances glucose uptake via an increase in insulin-PI3K-mTOR signaling due in part to the derepression of multiple direct let-7 targets in the pathway, including IGF1R, INSR, IRS2, PIK3IP1, AKT2, TSC1 and RICTOR. Experiments with the mTOR-specific inhibitor rapamycin demonstrate that Lin28a regulates growth, glucose tolerance, and insulin sensitivity in an mTOR-dependent manner in vivo. 
In addition, analysis of T2D and fasting glucose whole genome associations suggests a genetic connection between multiple genes regulated by let-7 and glucose metabolism in humans. These metabolic functions for Lin28a/b and let-7 in vivo provide a mechanistic explanation for how this pathway might influence embryonic growth, metabolic disease and cancer.

RESULTS

Lin28a Tg Mice Are Resistant to Obesity and Diabetes

We previously described a tetracycline-inducible Lin28a transgenic (Lin28a Tg) mouse that showed leaky constitutive Lin28a expression in the absence of induction (Zhu et al., 2010). In that study, we showed that these mice cleared glucose more efficiently during glucose and insulin tolerance testing (GTT and ITT), classic metabolic tests used for the characterization of whole animal glucose handling. Given that young Lin28a Tg mice exhibited enhanced glucose metabolism, we tested if old Lin28a Tg mice were also resistant to age-induced obesity. Compared to Lin28a Tg mice, wild-type mice fed a normal diet gained significantly more fat mass with age (Figure 1A). Dual Energy X-ray Absorptiometry scans showed increased percentage lean mass and reduced percentage body fat in the Lin28a Tg mice (Figures 1B and 1C). To rule out behavioral alterations, we measured activity over three days in isolation cages and found no differences in horizontal activity, O2/CO2 exchange, and food/water intake between wild-type and Tg mice (Figures S1A and S1B available online). To determine if these mice were resistant to HFD-induced obesity, we fed mice a diet containing 45% kcals from fat, and observed resistance to obesity in the Lin28a Tg mice (Figure 1D). Lin28a Tg mice consumed as much high-fat food as their wild-type littermates, ruling out anorexia (data not shown). Furthermore, we inquired if Lin28a Tg mice were resistant to HFD-induced diabetes and found that they had markedly improved glucose tolerance and insulin sensitivity under HFD conditions (Figures 1E and 1F). Lin28a Tg mice also showed resistance to HFD-induced hepatosteatosis (Figure 1G). Taken together, leaky Lin28a expression in the muscle, skin and connective tissues (Zhu et al., 2010) protected against obesity and diabetes in the context of aging and HFD.

iLIN28B Tg Mice Are Resistant to Diabetes

Although Lin28a and Lin28b both block let-7 miRNAs, they are differentially regulated, resulting in distinct expression patterns during normal development and malignant transformation (Guo et al., 2006; Viswanathan et al., 2009).
Given that LIN28B is overexpressed more frequently than LIN28A in human cancer, we sought to determine if LIN28B exerts a similar effect on glucose metabolism. Thus, we generated a mouse strain carrying an inducible copy of human LIN28B driven by a tetracycline transactivator rtTA placed under the control of the Rosa26 locus (iLIN28B mouse, see Experimental Procedures). After 14 days of treatment with the tetracycline analog doxycycline (dox), high levels of LIN28B were induced and mature let-7’s were
Figure 1. Lin28a Tg and iLIN28B Tg Mice Are Resistant to Obesity and Diabetes and Lin28a Is Physiologically Required for Normal Glucose Homeostasis (A) Aged wild-type (left) and Lin28a Tg mice (right) fed a normal diet, at 20 weeks of age. (B) Percentage body fat and (C) lean mass as measured by DEXA. (D) Weight curve of mice fed a HFD containing 45% kcals from fat. (E) Glucose tolerance test (GTT) and (F) Insulin tolerance test (ITT) of mice on HFD. (G) Liver histology of mice fed HFD. (H) Human LIN28B mRNA expression in a mouse strain with dox inducible transgene expression (named iLIN28B). (I) Mature let-7 expression in gut, spleen, liver, muscle and fat. (J) Kinetics of fed state glucose change after induction. (K) GTT and (L) ITT under normal diets. (M) iLIN28B growth curve under HFD. (N) GTT after 14 days of HFD and induction. (O) GTT and (P) ITT of Myf5-Cre; Lin28a fl/fl mouse. Controls for Lin28a Tg mice are WT. Controls for iLIN28B Tg mice carry only the LIN28B transgene. Controls for muscle knockout mice are Lin28a fl/fl mice. The numbers of experimental animals are listed within the charts. Error bars represent SEM. *p < 0.05, **p < 0.01.
repressed in metabolically important organs (Figures 1H and 1I), resulting in hypoglycemia with an average fasting glucose of <50 mg/dL in induced mice compared to >150 mg/dL in control mice (p < 0.01). To determine the kinetics of this effect, we measured fed state glucose daily and noted falling glucose levels after 5 days (Figure 1J). Glucose and insulin tolerance tests on dox-induced animals on normal diets showed considerable improvements in glucose tolerance and insulin sensitivity (Figures 1K and 1L). When assessing islet β cell hyperactivity, we found that iLIN28B mice produced no more insulin than control littermates during glucose challenge (data not shown). Under HFD, we found that induced iLIN28B mice were surprisingly resistant to weight gain (Figure 1M) despite a trend toward increased food intake (9.9 versus 4.8 g/mouse/day; p = 0.075). These mice continued to exhibit superior glucose tolerance after 14 days of dox induction under HFD (Figure 1N), when average weights were 34.5 ± 1.05 g for controls and 27.1 ± 0.99 g for iLIN28B mice, demonstrating that HFD had a strong obesogenic and diabetogenic effect on control but not on LIN28B induced animals. Unlike the Lin28a Tg mice, expression was not leaky in the iLIN28B mice (Figures 1H and 3F) and uninduced mice exhibited no growth or glucose phenotypes (Figures S1C and S1D), making this a better model for inducible Lin28 hyperactivation. These data show that both Lin28 homologs have similar effects on glucose metabolism and obesity, suggesting that these effects are mediated through common mRNA or miRNA targets of the Lin28 family.

Lin28a Is Physiologically Required for Normal Glucose Homeostasis

We then asked if Lin28a is physiologically required for normal glucose metabolism in one specific adult tissue compartment, skeletal muscle, since previous studies have found low but significant levels of Lin28a expression in the muscle tissues of mice (Yang and Moss, 2003; Zhu et al., 2010).
We generated a skeletal muscle-specific knockout of Lin28a (see Experimental Procedures). These muscle-specific knockout mice showed impaired glucose tolerance (Figure 1O) and insulin resistance (Figure 1P) relative to wild-type littermates, demonstrating that Lin28a activity in skeletal muscles is required for normal glucose homeostasis. We analyzed miRNA expression in muscle tissue by qRT-PCR and found no significant difference in let-7 levels during adult (data not shown) or embryonic stages (Figure S1E), suggesting that Lin28a loss of function affects glucose homeostasis either through let-7-independent mRNA binding or through changes in the spatiotemporal distribution of let-7 miRNA. Together, these data show that Lin28 isoforms are important and essential regulators of glucose homeostasis.

iLet-7 Mice Are Glucose Intolerant

In addition to their ability to suppress let-7 biogenesis, Lin28a and Lin28b also regulate mRNA targets such as Igf2, HMGA1, OCT4, histones and cyclins through non-let-7 dependent mechanisms of mRNA binding and enhanced translation (Polesskaya et al., 2007; Xu and Huang, 2009; Xu et al., 2009; Peng et al., 2011). To test if altered let-7 expression might produce the opposite phenotypes of Lin28a/b gain of function, we generated a mouse strain in which let-7g can be induced with dox under
the control of the Rosa26 locus (iLet-7 mouse, see Experimental Procedures). To ensure that endogenous Lin28 would not block pri- or pre-let-7g biogenesis, we used a chimeric let-7g species called let-7S21L (let-7g Stem, mir-21 Loop), in which the loop region of the precursor miRNA derives from mir-21 and cannot be bound by Lin28, thus allowing for let-7 processing despite Lin28 expression (Piskounova et al., 2008). Global transgene induction from three weeks of age onward increases mature let-7g levels in liver (>50-fold), skin (>20-fold), fat (4-fold) and muscle (4-fold) (Figure 2A). This level of let-7 overexpression led to reduced body size and growth rates in induced animals (Figures 2B and 2C). Growth retardation was proportional and not manifested as preferential size reduction in any particular organs (Figure S2A). Similar to the iLIN28B mice, leaky expression was not detected and uninduced male mice exhibited no growth or glucose phenotypes (Figures S2B–S2D). After 5 days of let-7 induction, these iLet-7 mice showed an increase in fed state glucose (Figure 2D). GTT revealed glucose intolerance in mice fed normal (Figure 2E) or HFD (Figure 2F). Surprisingly, ITT failed to detect a difference in insulin sensitivity (Figure 2G). The decreased glucose tolerance in the setting of comparable insulin sensitivity suggested either decreased insulin production from islet β cells in response to glucose, or higher insulin secretion to compensate for peripheral insulin resistance. Thus, we measured insulin production following glucose challenge, and found that iLet-7 mice produced more insulin than controls (Figure 2H). These results demonstrated that broad overexpression of let-7 results in peripheral glucose intolerance and compensatory overproduction of insulin from islet β cells. To test if let-7 induction could abrogate the glucose uptake phenotype of LIN28B overexpression, we crossed the iLIN28B to the iLet-7 inducible mice.
After 10 days of induction, simultaneous induction of LIN28B and let-7g did not result in any differences in glucose tolerance (Figures 2I and 2J), in contrast to LIN28B or let-7g induction alone. Taken together, the opposing effects of Lin28 and let-7 expression on glucose regulation show that Lin28 overexpression influences metabolism in part by suppressing let-7, and that let-7 alone is sufficient to regulate glucose metabolism in vivo.

Insulin-PI3K-mTOR Signaling Is Activated by Lin28a/b and Suppressed by let-7

To dissect the molecular mechanism of the effects of Lin28 and let-7 on glucose regulation, we turned to the C2C12 cell culture system. Overexpression of Lin28a in C2C12 myoblasts resulted in protein levels of Lin28a similar to that observed in mouse embryonic stem cells (ESCs) (Figure 3A), and led to robust let-7 suppression (Figure 3B). In C2C12 myotubes differentiated for 3 days, Lin28a promoted Ser473 phosphorylation of Akt and Ser235/236 phosphorylation of S6 ribosomal protein, suggesting activation of the PI3K-mTOR pathway (Figure S3A). In this setting, Lin28a increased myotube glucose uptake by 50% (Figure 3C). Lin28a-dependent glucose uptake was abrogated by 24 hr treatment with the PI3K/mTOR inhibitor LY294002 or the mTOR inhibitor rapamycin (Figure 3C), but not the MAPK/ERK inhibitor PD98059 (Figure S3B), demonstrating that Lin28a-dependent glucose uptake requires the PI3K-mTOR pathway.
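The let-7 fold changes reported by qRT-PCR are given without a stated quantification method in this section. As a minimal sketch, assuming the widely used 2^-ΔΔCt approach with a reference small RNA (the Figure 3 legend names sno142 as the normalizer), relative expression could be computed as follows. The function name and all Ct values are hypothetical and are not taken from the study.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ΔΔCt method.

    Assumes roughly 100% amplification efficiency for both the target
    (e.g., a let-7 family member) and the reference RNA.
    """
    # ΔCt: target normalized to the reference within each condition
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    # ΔΔCt: treated/overexpressing sample relative to control
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 2 cycles later in the
# Lin28a-overexpressing sample, so ΔΔCt = 2 and expression is 0.25 of control.
fc = fold_change_ddct(ct_target_sample=24.0, ct_ref_sample=20.0,
                      ct_target_control=22.0, ct_ref_control=20.0)
print(f"let-7 relative to control: {fc:.2f}")
```

A value below 1 corresponds to suppression of the target relative to control, and a value above 1 to induction.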
[Figure 2 graphic omitted. Panels A–J plot mature let-7 fold change in liver, skin, fat, and muscle; weight (g) versus age (days); fed state glucose post-dox; glucose (mg/dl), % initial glucose, and insulin (ng/ml) versus minutes after IP injection; and the AUC of GTTs for iLIN28B/Let-7 mice with and without dox for 10 days.]
Figure 2. iLet-7 Mice Are Glucose Intolerant (A) let-7g and let-7a qRT-PCR in tissues of dox induced iLet-7 mice (n = 3) and controls (n = 3). (B) Reduced size of induced animals. (C) iLet-7 growth curve for males and females. (D) Fed state glucose in iLet-7 mice induced for 5 days. GTTs performed on mice fed with either (E) normal diet or (F) HFD. (G) ITT on normal diet. (H) Insulin production during a glucose challenge. (I) GTT of LIN28B/Let-7 compound heterozygote mice before (blue) and after (red) induction with dox. (J) Area under the curve (AUC) analysis for this GTT. Controls for iLet-7 Tg mice carry either the Let-7 or Rosa26-M2rtTa transgene only. The numbers of experimental animals are listed within the charts. Error bars represent SEM. *p < 0.05, **p < 0.01.
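The legend above refers to an area under the curve (AUC) analysis of the GTT and to error bars as SEM, without spelling out the arithmetic. The sketch below shows one common way to do both, assuming trapezoidal integration of each animal's glucose curve followed by mean and SEM across animals. The time points, glucose values, and group sizes are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def auc_trapezoid(times, values):
    """Area under a sampled curve by the trapezoidal rule."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired time/value points")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += width * (values[i] + values[i - 1]) / 2.0
    return area


def sem(samples):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(samples) / math.sqrt(len(samples))


# Hypothetical GTT sampling times (minutes after IP injection)
times = [0, 15, 30, 60, 120]

# Hypothetical glucose curves (mg/dl), one list per animal
groups = {
    "control": [[90, 250, 220, 160, 100], [95, 260, 230, 150, 105],
                [85, 240, 210, 155, 95]],
    "induced": [[100, 320, 300, 240, 150], [105, 310, 290, 230, 140],
                [98, 330, 310, 250, 160]],
}

for name, curves in groups.items():
    aucs = [auc_trapezoid(times, curve) for curve in curves]
    print(f"{name}: AUC mean = {mean(aucs):.0f}, SEM = {sem(aucs):.0f} (mg/dl x min)")
```

Computing one AUC per animal before averaging keeps each animal as the unit of replication, so the SEM reflects between-animal variability.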
To exclude myotube differentiation-dependent phenomena, we tested the effects of Lin28a on PI3K-mTOR signaling in undifferentiated myoblasts under serum-fed, serum-starved, and insulin-stimulated conditions (Figure 3D). In the serum-fed state, we found that Lin28a promoted the activation of PI3K/Akt signaling by increasing Akt phosphorylation at both Ser473 and Thr308, compared to the pBabe control. Furthermore, we found
that Lin28a robustly increased the phosphorylation of mTORC1 signaling targets S6 and 4EBP1 in the serum-fed state. Serum starvation for 18 hr abrogated the phosphorylation of Akt, S6 and 4EBP1, indicating that Lin28a-induction of PI3K-mTOR signaling requires exogenous growth factor stimulation. Upon insulin stimulation, Akt phosphorylation increased dramatically, and both phospho-S6 and phospho-4EBP1 levels were
[Figure 3 graphic omitted. Panels A–G show western blots for Lin28a, LIN28B, Igf1r, Insr, p-Akt (S473), p-Akt (T308), Akt, p-S6, S6, p-4EBP1, 4EBP1, and tubulin under serum-fed (Fed), serum-starved (SS), and insulin-stimulated (Ins) conditions, together with plots of let-7 fold change and relative glucose uptake in C2C12 cells.]
Figure 3. Insulin-PI3K-mTOR Signaling Is Activated by Lin28a/b and Suppressed by let-7 (A) Western blot analysis of Lin28a protein expression in C2C12 myoblasts infected with control pBabe or Lin28a overexpression vector, and mouse ESCs, with tubulin as the loading control. (B) Quantitative PCR for let-7 isoforms in C2C12 myoblasts, normalized to sno142, after Lin28a overexpression. (C) 2-deoxy-D-[3H] glucose uptake assay on 3-day-differentiated C2C12 myotubes with and without Lin28a overexpression, treated with DMSO, the PI3K inhibitor LY294002, and the mTOR inhibitor rapamycin for 24 hr. (D) Western blot analysis of the effects of Lin28a overexpression on PI3K-mTOR signaling in C2C12 myoblasts, under serum-fed (fed), 18 hr serum starved (SS) or insulin-stimulated (Ins) conditions. Insulin stimulation was performed in serum-starved myoblasts with 10 μg/mL insulin for 5 min. Prior to insulin stimulation, serum-starved myoblasts were treated with either DMSO or 20 ng/mL rapamycin for 1 hr. (E) Western blot analysis of the effects of let-7f or control miRNA on PI3K-mTOR signaling in C2C12 myoblasts under serum-fed (fed), 18 hr serum starved (SS) or insulin-stimulated (Ins) conditions. (F) Western blot analysis of the effects of LIN28B induction by dox on PI3K-mTOR signaling in quadriceps muscles in vivo (n = 3 iLIN28B Tg mice and 3 LIN28B Tg only mice). (G) Insr and p-4EBP1 protein levels in wild-type and Lin28a muscle-specific knockout adults. Error bars represent SEM. *p < 0.05, **p < 0.01.
increased even further by Lin28a overexpression, suggesting that Lin28a increases the insulin sensitivity of C2C12 myoblasts. Importantly, we found that rapamycin abrogated the Lin28a-induction of phospho-S6 and phospho-4EBP1 upon insulin stimulation, but did not affect let-7 levels (Figure S3C) or Lin28a itself (Figure 3D), indicating that the mTOR dependence is occurring downstream of Lin28a.
To test if the effects of Lin28a on insulin-PI3K-mTOR signaling are let-7-dependent, we transfected either mature let-7f duplex or a negative control miRNA into both Lin28a-overexpressing and pBabe control myoblasts (Figure S3D and Figure 3E).
Because mature let-7 duplexes cannot be bound and inhibited by Lin28a protein, this experiment tests whether PI3K-mTOR activation is occurring downstream of let-7. Transfection with control miRNA did not affect Lin28a-induction of the phosphorylation of Akt, S6, or 4EBP1 in serum-starved myoblasts upon insulin stimulation. Transfection with let-7f, however, attenuated the Lin28a-induction of phospho-Akt (Ser473), and abrogated the increase in S6 and 4EBP1 phosphorylation upon insulin stimulation in Lin28a-overexpressing myoblasts (Figure 3E). In pBabe control myoblasts, let-7 duplex still suppressed S6 and 4EBP1 phosphorylation in the serum-fed, serum-starved, and
[Figure 4 image. Recoverable panel labels: "# of Conserved let-7 target sequences" (genes shown include IGF1, IGF1R, INSR, IRS2, PIK3IP1, AKT2, TSC1, RICTOR, RPS6KA3, EIF4EBP2, EIF4G2, IGF2BP1, IGF2BP2, IGF2BP3, HMGA1, HMGA2, and LIN28A); "Luciferase Assays for Insulin-PI3K mTOR components" (y axis: Relative luminescence; Control microRNA versus Mature let-7f); western blots in C2C12 (pBabe Control versus Lin28a) and HEK293T (let-7f or LIN28B shRNA) for Lin28a, LIN28B, total Irs2/IRS2, INSR, and Tubulin.]
Figure 4. Lin28a/b and let-7 Regulate Genes in the Insulin-PI3K-mTOR Pathway (A) Shown are the numbers of conserved let-7 binding sites within 3′ UTRs found using the TargetScan algorithm. (B) Putative let-7 binding sites in 16 genes of the insulin-PI3K-mTOR pathway and in Lin28a/b. (C) 3′ UTR luciferase reporter assays performed to determine functional let-7 binding sites. Bar graphs show relative luciferase reporter expression in human HEK293T cells after transfection of mature let-7f duplex normalized to negative control miRNA. Shown also are mutations in the seed sequence of the let-7 binding sites for INSR, IGF1R and IRS2. (D) Western blot analysis of Lin28a, Irs2, and tubulin in C2C12 myoblasts with and without Lin28a overexpression. (E) Western blot analysis of LIN28B, total IRS2, INSR and TUBULIN in HEK293T cells with either let-7f transfection or shRNA knockdown of LIN28B. Error bars represent SEM. *p < 0.05, **p < 0.01.
insulin-stimulated conditions, relative to total S6 and 4EBP1 protein. The suppression of mTOR signaling by let-7 even in the absence of Lin28a implies that let-7 can act independently downstream of Lin28a. Together with data indicating that let-7 abrogates Lin28a-specific induction of p-Akt, p-S6 and p-4EBP1 upon insulin stimulation, this demonstrates that the effects of Lin28a on PI3K-mTOR signaling are at least in part due to let-7 and that Lin28 and let-7 exert opposing effects on PI3K-mTOR signaling.
To test if these effects of Lin28 on insulin-PI3K-mTOR signaling are also relevant in vivo, we examined the quadriceps muscles of iLIN28B mice and found that dox-induction led to increases in the phosphorylation of Akt (S473), S6 and 4EBP1, the targets of PI3K-mTOR signaling (Figure 3F). Furthermore, the Insulin-like growth factor 1 receptor (Igf1r) and the Insulin receptor (Insr) proteins were also upregulated in the muscles upon LIN28B induction, reinforcing the conclusion that Lin28a/b drives insulin-PI3K-mTOR signaling in C2C12 myoblasts and within mouse tissues. On the other hand, similar analysis of the Lin28a muscle-specific knockout mice revealed reduced Insr and p-4EBP1 expression (Figure 3G), demonstrating that Lin28a is both necessary and sufficient to influence glucose metabolism through the regulation of insulin-PI3K-mTOR signaling in vivo.
Lin28a/b and let-7 Regulate Genes in the Insulin-PI3K-mTOR Pathway
On the RNA level, Lin28a overexpression in C2C12 myoblasts leads to an increase in mRNA levels of multiple genes in the
insulin-PI3K-mTOR signaling pathway (Figure S4A). Although both Lin28a suppression of let-7 and direct Lin28a binding to mRNAs could increase mRNA stability and thus increase mRNA levels, it is possible that these increases do not reflect direct interactions. To find direct targets, we performed a bioinformatic screen using the TargetScan 5.1 algorithm (Grimson et al., 2007), and found that 16 genes in the insulin-PI3K-mTOR pathway contained evolutionarily conserved let-7 binding sites in their respective 3′ UTRs (Figures 4A and 4B). Next, we performed 3′ UTR luciferase reporter assays to determine if these genes were bona fide and direct targets of let-7. To do this, we generated luciferase reporters with twelve human 3′ UTR fragments containing conserved let-7 sites. Luciferase reporter expression in human HEK293T cells after transfection of either mature let-7f duplex or a negative control miRNA demonstrated that the 3′ UTRs of INSR, IGF1R, IRS2, PIK3IP1, AKT2, TSC1 and RICTOR were targeted by let-7 for suppression (Figure 4C). Three-base mismatch mutations in the seed region of the let-7 binding sites abrogated let-7’s suppression of INSR, IGF1R and IRS2. To confirm that the luciferase reporters predicted actual changes in protein expression mediated by let-7, we assayed the endogenous expression of some of these proteins upon Lin28a/b overexpression. We found that an increase in Lin28a upregulated Irs2 (Figure 4D) in vitro, and that an increase in LIN28B upregulated Igf1r and Insr protein in skeletal muscles in vivo (Figure 3F). Conversely, INSR and IRS2 are reduced upon both let-7f transfection and LIN28B shRNA knockdown in HEK293T cells, demonstrating that these
regulatory mechanisms hold in both mouse and human cells, and in the setting of both LIN28B gain and loss of function (Figure 4E). This establishes a direct mechanism for let-7’s repression and Lin28’s derepression of multiple components in the insulin-PI3K-mTOR signaling cascade.
Previously, Lin28a has been shown to enhance Igf2 translation independently of let-7 (Polesskaya et al., 2007), offering an alternative mechanism by which Lin28a might activate the insulin-PI3K-mTOR pathway. To determine the relative contribution of this mechanism, we performed in vitro and in vivo loss of function experiments. Following siRNA knockdown of Igf2 in C2C12 (the efficacy of knockdown is shown in Figure S4B), we found only minimal changes in S6 and 4EBP1 phosphorylation (Figure S4C). In these C2C12 myotubes, glucose uptake was unaffected by Igf2 knockdown, but significantly decreased by let-7a (Figure S4D). In addition, we crossed the Lin28a Tg mice with Igf2 knockout mice and found that the absence of Igf2 did not abrogate enhanced glucose uptake, insulin sensitivity, or the anti-obesity effect mediated by Lin28a (Figure S4E–H). Taken together, these data indicate that the metabolic phenotypes we have observed are not solely due to the ability of Lin28a/b to promote translation of Igf2 mRNA, but do not rule out the possibility that Lin28a/b might modulate other mRNAs in the insulin-PI3K-mTOR signaling pathway.
mTOR Mediates Lin28a’s Enhancement of Growth and Glucose Metabolism In Vivo
Given that Lin28a activates the insulin-PI3K-mTOR pathway both in vitro and in vivo, we asked whether the metabolic effects of Lin28a in vivo could be abrogated by pharmacological inhibition of the mTOR pathway. To do this, we injected Lin28a Tg and wild-type littermates with rapamycin 3 times per week beginning when mice were 18 days old.
Rapamycin abrogated the growth enhancement in Lin28a Tg mice at doses that had minimal growth suppressive effects on wild-type mice (Figures 5A and 5B), suggesting that Lin28a promotes growth in an mTOR-dependent manner. Selective suppression of Lin28a-driven growth was observed using several parameters: weight (Figures 5B and 5C), crown-rump length (Figure 5D), and tail width (Figure 5E). We also tested if the enhanced glucose uptake phenotype in vivo was likewise dependent on mTOR. Indeed, glucose tolerance testing showed that short-term rapamycin reversed the enhanced glucose uptake effect of Lin28a (Figures 5F and 5G) and reduced the insulin sensitivity of Lin28a Tg mice to wild-type levels (Figures 5H and 5I). These data indicate that the glucose uptake, insulin sensitivity and animal growth phenotypes of Lin28a overexpression in vivo are dependent on mTOR signaling.
let-7 Target Genes Are Associated with Type 2 Diabetes in Human GWAS
Finally, we sought to assess the relevance of the Lin28/let-7 pathway to human disease and metabolism, using human genetic studies of T2D and fasting glucose levels. Because the Lin28/let-7 pathway has not been previously implicated in T2D, we first asked whether any of the genes that lie in T2D association regions identified in T2D GWAS and meta-analyses (Voight et al., 2010) are known or predicted let-7 targets. We used TargetScan 5.1 to computationally predict let-7 targets (Grimson et al., 2007), and found that 14 predicted let-7 target genes lie in linkage disequilibrium with 39 validated common variant associations with T2D, including IGF2BP2, HMGA2, KCNJ11 and DUSP9 (strength of T2D association signals p < 4 × 10⁻⁹) (Table 1). Of the computationally predicted let-7 targets associated with T2D, IGF2BP1/2/3 and Hmga2 have been verified as let-7 targets in several studies (Boyerinas et al., 2008; Mayr et al., 2007). To validate the connection between Lin28 and GWAS candidate genes, we analyzed the expression of Igf2bp and Hmga family members in C2C12 cells with and without Lin28a overexpression, and observed increases in Igf2bp1, Igf2bp2, and Hmga2 mRNA following Lin28a overexpression (Figure 6A). To ensure that this was not a C2C12- or muscle-specific phenomenon, we confirmed the upregulation of these genes in 3T3 cells following human LIN28A or LIN28B overexpression at the mRNA level (Figure 6B) and, for the Igf2bp family, at the protein level (Figure 6C). We also observed increased expression of Igf2bp2 and Igf2bp3 (Figure 6D) in Lin28a Tg muscle, confirming this link in vivo.
We next asked whether there is a more widespread connection between T2D susceptibility and let-7 targets, in addition to the targets in validated T2D association regions (p < 5 × 10⁻⁸). To address this, we applied a computational method called MAGENTA (Meta-Analysis Gene-set Enrichment of variaNT Associations) (Segrè et al., 2010) to GWAS meta-analyses of T2D and fasting glucose blood levels, and tested whether the distributions of disease or trait associations in predefined let-7 target gene sets are skewed toward highly ranked associations (including ones not yet reaching a level of genome-wide significance) compared to matched gene sets randomly sampled from the genome (Table 1).
We tested three types of let-7 target definitions with increasing levels of target validation, from in silico predicted let-7 targets using TargetScan 5.1 (Grimson et al., 2007) to experimentally defined targets. For the latter, we used (i) a set of genes with at least one let-7 site in their 3′ UTR and whose mRNA was downregulated by let-7b overexpression in primary human fibroblasts (Legesse-Miller et al., 2009), and (ii) a set of genes whose protein levels were most strongly downregulated by let-7b overexpression in HeLa cells (Selbach et al., 2008). We first tested the let-7 target sets against the latest T2D meta-analysis of eight GWAS (called DIAGRAM+) (Voight et al., 2010), and found significant enrichment (Table 1). The enrichment rose from 1.05-fold for the broadest definition of let-7 targets predicted using TargetScan (1800 genes; p = 0.036) to 1.92-fold for the experimentally validated target set based on protein level changes in response to let-7 overexpression (100 genes; p = 1 × 10⁻⁶). In the latter case, an excess of about 20 genes regulated by let-7 at the protein level is predicted to contain novel SNP associations with T2D. Notably IGF2BP2, which is a canonical let-7 target that lies in a validated T2D association locus, was found in all types of let-7 target definitions in Table 1. Furthermore, the genes driving the T2D enrichment signals for the different let-7 target sets include both functionally redundant homologs of T2D-associated genes, such as IGF2BP1 (IGF2BP2), HMGA1 (HMGA2), DUSP12 and DUSP16 (DUSP9), and genes in the insulin-PI3K-mTOR pathway, including IRS2, INSR, AKT2 and TSC1 (best local SNP association p = 10⁻⁴ to 4 × 10⁻³).
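The enrichment test described above can be illustrated with a small permutation sketch. This is not the MAGENTA implementation: the function name `magenta_like_enrichment`, the convention that a higher score means a stronger association, and the size-matched random sampling are all assumptions made for illustration, and the sketch omits any gene score adjustments or collapsing of neighboring genes that a full analysis would apply.

```python
import random


def magenta_like_enrichment(gene_scores, gene_set, percentile=75,
                            n_permutations=10000, seed=0):
    """Hypothetical sketch of a gene-set enrichment test against random gene sets.

    gene_scores: dict mapping gene name -> association score (higher = stronger).
    gene_set:    iterable of gene names, e.g. a let-7 target set.
    """
    rng = random.Random(seed)
    all_genes = sorted(gene_scores)
    ranked = sorted(gene_scores.values())
    # Enrichment cutoff: the chosen percentile of all gene scores in the genome.
    cutoff = ranked[min(int(len(ranked) * percentile / 100), len(ranked) - 1)]
    # Genes absent from the genome-wide list are dropped from the analysis.
    members = [g for g in gene_set if g in gene_scores]

    def count_above(genes):
        return sum(gene_scores[g] > cutoff for g in genes)

    observed = count_above(members)
    # By construction, about (100 - percentile)% of genes exceed the cutoff.
    expected = len(members) * (100 - percentile) / 100
    # Null distribution: random gene sets of the same size.
    hits = sum(
        count_above(rng.sample(all_genes, len(members))) >= observed
        for _ in range(n_permutations)
    )
    return {
        "observed": observed,
        "expected": expected,
        "fold": observed / expected,
        # Add-one correction so the estimate is never exactly zero.
        "p_value": (hits + 1) / (n_permutations + 1),
    }
```

With real data, `gene_scores` would hold one association score per gene from a GWAS meta-analysis, and `gene_set` one of the let-7 target definitions; the returned fold corresponds to the observed-to-expected ratio reported in Table 1.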
[Figure 5 image. Recoverable panel labels: "Rapamycin treatment growth curve" (Relative weight change versus Age in days); Weight (g), Height (cm), and Tail width (mm) for WT and Lin28a Tg mice treated with vehicle (VEH) or rapamycin (RAP); "Vehicle GTT", "Rapamycin GTT", "Vehicle ITT", and "Rapamycin ITT" (% of initial glucose versus Minutes after IP injection). Group sizes shown in the panels range from n = 5 to n = 10.]
Figure 5. mTOR Is Required for Lin28a’s Effects on Growth and Glucose Metabolism In Vivo (A) Rapamycin-treated (left 2 mice) and vehicle-treated (right 2 mice) wild-type and Lin28a Tg mice, showing relative size differences. (B) Curves showing relative growth (normalized to weight on first day of treatment) for mice treated from 3 weeks to 6.5 weeks of age. Blue and red represent wild-type and Lin28a Tg mice, respectively. Solid and dotted lines represent vehicle- and rapamycin-treated mice, respectively. Growth was measured by several other parameters: (C) weight, (D) crown-rump length or height, and (E) tail width. (F) GTT performed after 2 doses of vehicle or (G) rapamycin. (H) ITT performed after 1 dose of vehicle or (I) rapamycin. Controls for Lin28a Tg mice are WT. The numbers of experimental animals are listed within the charts. Error bars represent SEM. *p < 0.05, **p < 0.01.
We next tested for enrichment of let-7 target gene associations with fasting glucose levels, using data from the MAGIC (Meta-Analysis of Glucose and Insulin-related traits Consortium) study of fasting glucose levels (Dupuis et al., 2010). We observed an over-representation of multiple genes modestly associated with fasting glucose at different levels of significance for the different let-7 target gene sets (Table 1). The strongest enrichment was found in the genes downregulated at the mRNA level by let-7, an enrichment of 1.20-fold over expectation (p = 1 × 10⁻⁴). Taken together, our human genetic results support the hypothesis that genes regulated by let-7 influence human metabolic disease and glucose metabolism.
Recently it has also become clear that Lin28a/b has important let-7-independent roles in RNA metabolism, as evidenced by
numerous direct mRNA targets whose translation is enhanced by LIN28A (Peng et al., 2011). Using GSEA, we found that this list of direct mRNA targets is also significantly enriched for glucose, insulin and diabetes-related genes (Table S1). Thus, Lin28a/b may regulate metabolism through direct mRNA-binding as well as let-7 targets.
DISCUSSION
Lin28 and let-7 Are Mutually Antagonistic Regulators of Growth and Metabolism
Our work defines a new mechanism of RNA-mediated metabolic regulation. In mice, Lin28a and LIN28B overexpression results in insulin sensitivity, enhanced glucose tolerance, and resistance
Table 1. MAGENTA Analysis of T2D and Fasting Glucose Associations in Different let-7 Target Gene Set Definitions
| let-7 Target Gene Set | Number of Genes Analyzed† | Nominal Gene Set Enrichment p Value | Expected Number of Genes above Enrichment Cutoff | Observed Number of Genes above Enrichment Cutoff | Enrichment Fold | Number of Genes Linked to Validated GWAS SNPs | Genes Linked to Validated GWAS SNPs |
|---|---|---|---|---|---|---|---|
| Type 2 Diabetes (DIAGRAM+ Meta-Analysis) | | | | | | | |
| All targets predicted by TargetScan | 1763 | 0.036 | 441 | 462 | 1.05 | 14 | IGF2BP2, DUSP9, SLC5A6, TP53INP1, YKT6, ZNF512, HMGA2, KCNJ11, MAN2A2, MEST, NOTCH2, ZNF275, FAM72B, RCCD1 |
| Conserved targets predicted by TargetScan | 789 | 0.089 | 197 | 212 | 1.08 | 7 | IGF2BP2, DUSP9, SLC5A6, HMGA2, KCNJ11, MAN2A2, ZNF275 |
| Downregulated mRNAs following let-7 OE | 795 | 0.055 | 199 | 216 | 1.09 | 9 | IGF2BP2, DUSP9, SLC5A6, TP53INP1, YKT6, ZNF512, HHEX, IRS1, TLE4 |
| Downregulated mRNAs following let-7 OE + TargetScan | 502 | 0.061 | 126 | 140 | 1.11 | 6 | IGF2BP2, DUSP9, SLC5A6, TP53INP1, YKT6, ZNF512 |
| Downregulated proteins following let-7 OE | 97 | 1.0E-06* | 24 | 46 | 1.92 | 2 | IGF2BP2, CDKAL1 |
| Downregulated proteins following let-7 OE + TargetScan | 37 | 0.011 | 9 | 16 | 1.78 | 1 | IGF2BP2 |
| Fasting Glucose (MAGIC Meta-Analysis) | | | | | | | |
| All targets predicted by TargetScan | 1708 | 0.015 | 427 | 450 | 1.05 | 3 | CRY2, SLC2A2, GLIS3 |
| Conserved targets predicted by TargetScan | 759 | 0.042 | 190 | 207 | 1.09 | 1 | CRY2 |
| Downregulated mRNAs following let-7 OE | 750 | 1.0E-04* | 188 | 226 | 1.20 | 2 | CRY2, FADS1 |
| Downregulated mRNAs following let-7 OE + TargetScan | 484 | 0.013 | 121 | 141 | 1.17 | 1 | CRY2 |
| Downregulated proteins following let-7 OE | 96 | 0.632 | 24 | 23 | 0.96 | 0 | - |
| Downregulated proteins following let-7 OE + TargetScan | 35 | 0.245 | 9 | 11 | 1.22 | 0 | - |
The statistical enrichment for genes associated with T2D and fasting glucose among let-7 targets using the MAGENTA algorithm. The TargetScan algorithm was used to define the “All human let-7 targets” and the “Conserved let-7 targets” gene sets (http://www.targetscan.org/). mRNA downregulation following let-7 overexpression (OE) was measured in primary human fibroblasts (Legesse-Miller et al., 2009), and protein downregulation following let-7 OE was measured in HeLa cells (Selbach et al., 2008). The enrichment cutoff used is the 75th percentile of all gene association scores in the genome. The enrichment fold is the ratio between the observed and expected number of genes above the enrichment cutoff. Genes linked to validated GWAS SNPs (39 SNPs for T2D and 14 SNPs for fasting glucose) were ordered according to the number of target gene sets they appear in and then alphabetically.
† The following genes were removed from the analysis: (i) genes absent from the full human gene list used in the analysis, (ii) genes that had no SNPs within 110 kb upstream or 40 kb downstream to their most extreme transcript boundaries, or (iii) to correct for potential inflation of enrichment due to physical proximity of let-7 target genes along the genome, subsets of proximal genes assigned the same best local SNP were collapsed to one gene and assigned the score of the most significant gene p-value in that subset.
* Gene sets that pass a Bonferroni corrected cutoff (p < 0.004).
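As a worked check of the legend's definitions, the short sketch below recomputes the enrichment fold for two T2D rows of Table 1 and a Bonferroni cutoff. The row values are copied from the table; the significance level of 0.05 and the count of 12 tests (six gene sets for each of two traits) are assumptions chosen because they reproduce the stated cutoff of roughly 0.004, not values given in the text.

```python
# (gene set, nominal p value, expected above cutoff, observed above cutoff),
# copied from the T2D section of Table 1.
rows = [
    ("All targets predicted by TargetScan", 0.036, 441, 462),
    ("Downregulated proteins following let-7 OE", 1.0e-6, 24, 46),
]

# Enrichment fold = observed / expected number of genes above the cutoff.
folds = {name: round(observed / expected, 2)
         for name, _, expected, observed in rows}

# Excess of protein-level let-7 targets above the cutoff ("about 20 genes").
excess = 46 - 24

# Assumed Bonferroni correction: alpha = 0.05 over 12 gene-set tests.
alpha, n_tests = 0.05, 12
bonferroni_cutoff = alpha / n_tests
passes = {name: p < bonferroni_cutoff for name, p, _, _ in rows}

print(folds)
print(excess, round(bonferroni_cutoff, 4), passes)
```

Under these assumptions only the protein-level target set passes the corrected cutoff, which matches the asterisk placement in the T2D section of the table.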
to diabetes. Our analysis of iLet-7 Tg mice shows that let-7 upregulation is also sufficient to inhibit normal glucose metabolism, supporting the idea that gain of Lin28a/b exerts effects on whole animal glucose metabolism at least in part through
let-7 suppression. Previously, we showed that transgenic overexpression of Lin28a causes enhanced growth and delayed puberty, phenotypes that mimicked human traits linked to genetic variation in the Lin28/let-7 pathway in GWAS (Zhu
[Figure 6 image. Recoverable panel labels: "Igf2bp and Hmga family in C2C12" and "Igf2bp family in 3T3" (y axis: mRNA fold change; pBABE control versus Lin28a OE, LIN28A OE, or LIN28B OE); western blot for Lin28a, Igf2bp1/2/3, and Actin; "Igf2bp family in Lin28a Tg muscle" (y axis: mRNA fold change; Control versus Lin28a Tg).]
Figure 6. let-7 Target Genes Are Associated with Type 2 Diabetes Mellitus and a Model of the Lin28/let-7 Pathway in Glucose Metabolism. mRNA expression of Igf2bp and Hmga family members in (A) C2C12 with and without Lin28a overexpression and in (B) 3T3 cells with and without LIN28A or LIN28B overexpression. (C) Western blot of NIH 3T3 cells with Lin28a overexpression showing Lin28a and Igf2bp1/2/3 protein levels (n = 3 biological replicates). (D) Igf2bp2 and Igf2bp3 mRNA in Lin28a Tg muscle. (E) Model of Lin28/let-7 pathway in glucose metabolism. Error bars represent SEM. *p < 0.05, **p < 0.01.
et al., 2010). Given that Lin28a/b is downregulated in most tissues after embryogenesis, while let-7 increases in adult tissues, lingering questions from our earlier report were first, whether let-7 was sufficient to influence organismal growth, and second, what function let-7 has in adult physiology. Our observation that the iLin28a, iLIN28B, and iLet-7 Tg gain of function mice, as well as muscle-specific Lin28a loss of function mice, manifest complementary phenotypes supports the notion that Lin28a/b and let-7 are both regulators of growth and developmental maturation. We propose that different developmental time-points demand distinct metabolic needs, and that global regulators such as Lin28 temporally coordinate growth with metabolism. The dynamic relationship between Lin28, let-7 and metabolic states during major growth milestones in mammals is reminiscent of the heterochronic mutant phenotypes originally defined in C. elegans (Ambros and Horvitz, 1984; Moss et al., 1997; Boehm and Slack, 2005), and suggests that metabolism, like differentiation, is temporally controlled.
Lin28a/b and let-7 Influence Glucose Metabolism through the Insulin-PI3K-mTOR Pathway
We have shown that Lin28a/b and let-7 regulate insulin-PI3K-mTOR signaling, a highly conserved pathway that regulates growth and glucose metabolism throughout evolution. PI3K/Akt signaling is known to promote Glut4 translocation to upregulate glucose uptake, while mTOR signaling can promote glucose
uptake and glycolysis by changing gene expression independently of Glut4 translocation (Brugarolas et al., 2003; Buller et al., 2008; Duvel et al., 2010). Previous studies have shown that Lin28a directly promotes Igf2 (Polesskaya et al., 2007) and HMGA1 translation (Peng et al., 2011), and that let-7 suppresses IGF1R translation in hepatocellular carcinoma cells (Wang et al., 2010). Consistent with these findings, our results define a model whereby Lin28a/b and let-7 coordinately regulate the insulin-PI3K-mTOR pathway at multiple points (Figure 6E), a concept that is consistent with the hypotheses that miRNAs and RNA binding proteins regulate signaling pathways by tuning the production of a broad array of proteins rather than switching single components on or off (Kennell et al., 2008; Hatley et al., 2010; Small and Olson, 2011). Coordinated regulation is important because negative feedback loops exist within the insulin-PI3K-mTOR pathway. Loss-of-function and pharmacological inhibition studies have shown that the mTOR target S6K1, for instance, inhibits and desensitizes insulin-PI3K signaling by phosphorylating IRS1 protein and suppressing IRS1 gene transcription (Harrington et al., 2004; Shah et al., 2004; Tremblay et al., 2007; Um et al., 2004). Conversely, TSC1-2 promotes insulin-PI3K signaling by suppressing mTOR signaling (Harrington et al., 2004; Shah et al., 2004). Although the effects of let-7 and Lin28a/b on the expression of individual genes are modest, simultaneous regulation of multiple components such as IGF2, IGF1R, INSR, IRS2, PIK3IP1, AKT2, TSC1, RICTOR in the
insulin-PI3K-mTOR signaling pathway could explain how this RNA processing pathway coordinately regulates insulin sensitivity and glucose metabolism by effectively bypassing these negative feedback loops.
Whereas our work has implicated let-7 as a regulator of insulin-PI3K-mTOR signaling, we do not exclude a parallel role for direct mRNA targets of Lin28a/b in glucose metabolism, a hypothesis supported by the recent findings that HMGA1 is translationally regulated by LIN28A and mutated in 5%–10% of T2D patients (Peng et al., 2011; Chiefari et al., 2011). Such non-let-7 functions are also suggested by the fact that muscle-specific loss of Lin28a results in glucose derangement without significant let-7 changes. Nevertheless, it remains likely that during other developmental stages or in other tissues, let-7 suppression by Lin28a or Lin28b is required for normal glucose homeostasis. The effects of the Lin28/let-7 pathway on glucose metabolism in our murine models, together with our observation that genes regulated by let-7 are associated with T2D risk in humans, indicate important functional roles for both Lin28a/b and let-7 in human metabolism.
let-7 Targets Are Relevant to Disparate Human Diseases: Cancer and T2D
Metabolic reprogramming in malignancy is thought to promote a tumor’s ability to produce biomass and tolerate stress in the face of uncertain nutrient supplies (Vander Heiden et al., 2009). During their rapid growth phase early in development, embryos may utilize similar programs to maintain a growth-permissive metabolism. Dissecting the genetic underpinnings of embryonic metabolism would likely provide important insights into the nutrient uptake programs that are co-opted in cancer. While loss of function studies in the early embryo would help define the metabolic roles of oncofetal genes in their physiologic context, classical in vivo metabolic assays are difficult to perform in embryos.
Lin28a and Lin28b are oncofetal genes, and thus highly expressed in early embryogenesis and then silenced in most adult tissues, but reactivated in cancer (Yang and Moss, 2003; Viswanathan et al., 2009). Cancer cells may utilize the embryonic function of Lin28a/b to drive a metabolic shift toward increased glucose uptake and glycolysis, a phenomenon termed the “Warburg effect.” Previously, we showed that Lin28a expression promotes glycolytic metabolism in muscle in vivo and in C2C12 myoblasts in vitro (Zhu et al., 2010). Though we cannot yet readily determine the metabolic effects of shutting off Lin28a/b within the embryo, we have dissected the potent effects of reactivating and inactivating this oncofetal program in adults. Conversely, in normal adult tissues that do not express high levels of Lin28a or Lin28b, one might ask if a role for the highly abundant let-7 is to lock cells into the metabolism of terminally differentiated cells to prevent aberrant reactivation of embryonic metabolic programs. Further studies are required to understand how this pathway may link mechanisms of tumorigenesis and diabetogenesis.
Our report implicates Lin28a/b and let-7 as important modulators of glucose metabolism through interactions with the insulin-PI3K-mTOR pathway and T2D-associated genes identified in GWAS. Although it is likely that additional mechanisms and feedback loops exist, our data suggest a model whereby Lin28a/b
and let-7 coordinate the GWAS-identified genes and the insulin-PI3K-mTOR pathway to regulate glucose metabolism (Figure 6E). It also suggests that enhancing Lin28 function or abrogating let-7 may be therapeutically promising for diseases like obesity and diabetes. Likewise, results from this work might shed light on the physiology of aging and, specifically, how the accumulation of let-7 in aging tissues may contribute to the systemic insulin resistance that accompanies aging.
EXPERIMENTAL PROCEDURES
Mice
All animal procedures were based on animal care guidelines approved by the Institutional Animal Care and Use Committee. Mouse lines used in this study are described in the Extended Experimental Procedures and Figure S5.
Indirect Calorimetry
The apparatus used was a set of 16 OxyMax Metabolic Activity Monitoring chambers (Columbus Instruments; Columbus, OH, USA). Each chamber consisted of a self-contained unit capable of providing continuous measurements of an individual mouse’s total activity and feeding behavior. Monitoring occurred over a 3-day period. Each subject was placed into an individual chamber on day 1, with free access to food and water during the course of the experiment. Subjects were maintained under a normal 12:12 hr light:dark cycle. All measurements were sampled periodically (at approximately 12 min intervals) and automatically recorded via the OXYMAX Windows V3.22 software. Activity measures over the final 24 hr period were parceled into 2 hr bins and these were used to express diurnal activity levels.
Quantitative RT-PCR
Performed with standard methods, which are described in detail in the Extended Experimental Procedures.
Histology
Tissue samples were fixed in 10% buffered formalin or Bouin’s solution and embedded in paraffin.
Glucose and Insulin Tolerance Tests
Overnight-fasted mice were given i.p. glucose (2 mg/g body weight). For the insulin tolerance test, 5 hr fasted mice were given 0.75 U insulin/kg body weight by i.p. injection (Humulin).
Blood glucose was determined with a Lifescan One Touch glucometer. Insulin levels were measured by ELISA (Crystal Chem).
Cloning
Murine Lin28a and human LIN28B cDNAs were subcloned into pBabe.Puro and pMSCV.Neo retroviral vectors. LIN28B and control shRNAs in lentiviral plasmids were purchased from Sigma-Aldrich and were previously reported (Viswanathan et al., 2009). UTR cloning for luciferase reporters is described in Table S2.
Cell Culture, Viral Production, and Transfection
Performed using standard methods as described in the Extended Experimental Procedures.
Glucose Uptake Assay
In vitro glucose uptake assays were performed as described (Berti and Gammeltoft, 1999).
Drug Treatments
Rapamycin was injected i.p. 3 times a week for mouse experiments. For cell culture, C2C12 myotubes differentiated for 3 days were incubated with inhibitors for 1 day prior to glucose uptake assays. See Extended Experimental Procedures for further details.
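The tolerance-test doses given under Glucose and Insulin Tolerance Tests scale with body weight, and the tolerance curves in Figure 5 are plotted as a percentage of initial glucose. A minimal sketch of that arithmetic follows; the function names and the example readings are hypothetical, and the assumption that the first reading is the pre-injection baseline is ours.

```python
def glucose_dose_mg(body_weight_g, mg_per_g=2.0):
    """Glucose dose in mg for a GTT at 2 mg per g body weight."""
    return body_weight_g * mg_per_g


def insulin_dose_units(body_weight_g, units_per_kg=0.75):
    """Insulin dose in U for an ITT at 0.75 U per kg body weight."""
    return body_weight_g / 1000 * units_per_kg


def percent_of_initial(readings):
    """Express each glucose reading as a percentage of the first reading."""
    baseline = readings[0]  # assumed to be the pre-injection value
    return [100 * r / baseline for r in readings]


# Hypothetical 25 g mouse and made-up glucometer readings (mg/dL).
print(glucose_dose_mg(25.0))      # mg of glucose to inject
print(insulin_dose_units(25.0))   # units of insulin to inject
print(percent_of_initial([120.0, 300.0, 180.0, 132.0]))
```

Normalizing each animal to its own baseline lets curves from mice with different fasting glucose levels be compared on one axis.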
Western Blot Assay
Performed using standard methods. Detailed methods and reagents used are described in the Extended Experimental Procedures.
Luciferase Reporter Assay
10 ng of each construct was co-transfected with 10 nM miRNA duplexes into HEK293T cells in a 96-well plate using Lipofectamine 2000 (Invitrogen). After 48 hr, the cell extract was obtained; firefly and Renilla luciferase activities were measured with the Promega Dual-Luciferase reporter system.
MAGENTA Analysis
See Results, Table 1 Legend, and Extended Experimental Procedures for detailed methods.
Statistical Analysis
Data are presented as mean ± SEM, and Student’s t test (two-tailed distribution, two-sample unequal variance) was used to calculate p values. Statistical significance is displayed as p < 0.05 (one asterisk) or p < 0.01 (two asterisks). The tests were performed using Microsoft Excel where the test type is always set to two-sample equal variance.
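The reporter normalization and the summary statistics named above can be sketched with the Python standard library. This is an illustration under stated assumptions, not the authors' workflow: which luciferase serves as the reporter and which as the internal control is not specified here, so the arguments are named generically, and all readings are invented. Because the text mentions both unequal-variance and equal-variance settings, the sketch computes the t statistic both ways; converting it to a p value would additionally need a t distribution, which is left out.

```python
from math import sqrt
from statistics import mean, stdev, variance


def normalized_ratios(reporter, internal_control):
    """Divide each well's reporter signal by its internal-control signal."""
    return [r / c for r, c in zip(reporter, internal_control)]


def sem(values):
    """Standard error of the mean, using the sample standard deviation."""
    return stdev(values) / sqrt(len(values))


def t_statistic(a, b, equal_variance):
    """Return (t, degrees of freedom) for a two-sample t test.

    equal_variance=True  gives the pooled-variance Student's t test;
    equal_variance=False gives Welch's unequal-variance t test.
    """
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)
    if equal_variance:
        pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
        se = sqrt(pooled * (1 / na + 1 / nb))
        df = na + nb - 2
    else:
        se = sqrt(va / na + vb / nb)
        # Welch-Satterthwaite approximation of the degrees of freedom.
        df = (va / na + vb / nb) ** 2 / (
            (va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1)
        )
    return (mean(a) - mean(b)) / se, df


# Invented triplicate readings for one hypothetical 3' UTR reporter.
control_mirna = normalized_ratios([980.0, 1010.0, 1005.0], [100.0, 101.0, 99.0])
let7f = normalized_ratios([610.0, 640.0, 590.0], [102.0, 100.0, 98.0])

# Reporter expression with let-7f relative to the negative control miRNA.
relative = mean(let7f) / mean(control_mirna)
print(round(relative, 2), round(sem(let7f), 3))
print(t_statistic(let7f, control_mirna, equal_variance=False))
print(t_statistic(let7f, control_mirna, equal_variance=True))
```

With equal group sizes the two variants give the same t statistic and differ only in the degrees of freedom, so the choice matters most when group sizes or variances are unbalanced.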
SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures, a list of DIAGRAM and MAGIC consortia members with affiliations, and five figures and can be found with this article online at doi:10.1016/j.cell.2011.08.033.
ACKNOWLEDGMENTS
We thank John Powers, Harith Rajagopalan, Jason Locasale, Abdel Saci, Akash Patnaik, Charles Kaufman, Christian Mosimann and Lewis Cantley for invaluable discussions and advice, Roderick Bronson and the Harvard Medical School Rodent Histopathology Core for mouse tissue pathology, and the Harvard Neurobehavior Laboratory for CLAMS experiments. This work was supported by grants from the US NIH to G.Q.D., a Graduate Training in Cancer Research Grant and an American Cancer Society Postdoctoral Fellowship to H.Z., the NSS Scholarship from the Agency for Science, Technology and Research, Singapore for N.S.C., an NIH NIDDK Diseases Career Development Award to M.G.K., and an American Diabetes Association Postdoctoral Fellowship for A.V.S. J.M.E. was supported by the National Human Genome Research Institute (NHGRI). R.I.G. was supported by US National Institute of General Medical Sciences (NIGMS) and is a Pew Research Scholar. D.A. is a Distinguished Clinical Scholar of the Doris Duke Charitable Foundation. G.Q.D. is a recipient of Clinical Scientist Awards in Translational Research from the Burroughs Wellcome Fund and the Leukemia and Lymphoma Society, and an investigator of the Howard Hughes Medical Institute and the Manton Center for Orphan Disease Research. H.Z. and N.S.C. designed and performed the experiments, and wrote the manuscript. A.V.S. and D.A. performed bioinformatic analysis on let-7 targets in GWAS. G.S. and S.P.S. performed expression analysis, metabolic assays and mouse husbandry. G.S., W.S.E. and A.T. generated the mouse strains. J.E.T., R.T. and R.I.G. assisted with the luciferase assays. J.P.H., R.I.G., G.S., and A.T. generated the conditional knockout mice. M.G.K. helped to design the experiments. G.Q.D. designed and supervised experiments, and wrote the manuscript. The authors declare no competing financial interests.
Received: December 17, 2010 Revised: May 9, 2011 Accepted: August 5, 2011 Published: September 29, 2011
REFERENCES
Abbott, A.L., Alvarez-Saavedra, E., Miska, E.A., Lau, N.C., Bartel, D.P., Horvitz, H.R., and Ambros, V. (2005). The let-7 MicroRNA family members mir-48, mir-84, and mir-241 function together to regulate developmental timing in Caenorhabditis elegans. Dev. Cell 9, 403–414. Ambros, V., and Horvitz, H.R. (1984). Heterochronic mutants of the nematode Caenorhabditis elegans. Science 226, 409–416. Beard, C., Hochedlinger, K., Plath, K., Wutz, A., and Jaenisch, R. (2006). Efficient method to generate single-copy transgenic mice by site-specific integration in embryonic stem cells. Genesis 44, 23–28. Berti, L., and Gammeltoft, S. (1999). Leptin stimulates glucose uptake in C2C12 muscle cells by activation of ERK2. Mol. Cell. Endocrinol. 157, 121–130. Boehm, M., and Slack, F. (2005). A developmental timing microRNA and its target regulate life span in C. elegans. Science 310, 1954–1957. Boyerinas, B., Park, S.M., Shomron, N., Hedegaard, M.M., Vinther, J., Andersen, J.S., Feig, C., Xu, J., Burge, C.B., and Peter, M.E. (2008). Identification of let-7-regulated oncofetal genes. Cancer Res. 68, 2587–2591. Brugarolas, J.B., Vazquez, F., Reddy, A., Sellers, W.R., and Kaelin, W.G., Jr. (2003). TSC2 regulates VEGF through mTOR-dependent and -independent pathways. Cancer Cell 4, 147–158. Buller, C.L., Loberg, R.D., Fan, M.H., Zhu, Q., Park, J.L., Vesely, E., Inoki, K., Guan, K.L., and Brosius, F.C., 3rd. (2008). A GSK-3/TSC2/mTOR pathway regulates glucose uptake and GLUT1 glucose transporter expression. Am. J. Physiol. Cell Physiol. 295, C836–843. Chiefari, E., Tanyolac, S., Paonessa, F., Pullinger, C.R., Capula, C., Iiritano, S., Mazza, T., Forlin, M., Fusco, A., Durlach, V., et al. (2011). Functional variants of the HMGA1 gene and type 2 diabetes mellitus. J. Am. Med. Assoc. 305, 903–912.
Denko, N.C. (2008). Hypoxia, HIF1 and glucose metabolism in the solid tumour. Nat. Rev. Cancer 8, 705–713. Dupuis, J., Langenberg, C., Prokopenko, I., Saxena, R., Soranzo, N., Jackson, A.U., Wheeler, E., Glazer, N.L., Bouatia-Naji, N., Gloyn, A.L., et al. (2010). New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nat. Genet. 42, 105–116. Duvel, K., Yecies, J.L., Menon, S., Raman, P., Lipovsky, A.I., Souza, A.L., Triantafellow, E., Ma, Q., Gorski, R., Cleaver, S., et al. (2010). Activation of a metabolic gene regulatory network downstream of mTOR complex 1. Mol. Cell 39, 171–183. Engelman, J.A., Luo, J., and Cantley, L.C. (2006). The evolution of phosphatidylinositol 3-kinases as regulators of growth and metabolism. Nat. Rev. Genet. 7, 606–619. Gao, P., Tchernyshyov, I., Chang, T.C., Lee, Y.S., Kita, K., Ochi, T., Zeller, K.I., De Marzo, A.M., Van Eyk, J.E., Mendell, J.T., et al. (2009). c-Myc suppression of miR-23a/b enhances mitochondrial glutaminase expression and glutamine metabolism. Nature 458, 762–765. Grimson, A., Farh, K.K., Johnston, W.K., Garrett-Engele, P., Lim, L.P., and Bartel, D.P. (2007). MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Mol. Cell 27, 91–105. Guertin, D.A., and Sabatini, D.M. (2007). Defining the role of mTOR in cancer. Cancer Cell 12, 9–22. Guo, Y., Chen, Y., Ito, H., Watanabe, A., Ge, X., Kodama, T., and Aburatani, H. (2006). Identification and characterization of lin-28 homolog B (LIN28B) in human hepatocellular carcinoma. Gene 384, 51–61. Harrington, L.S., Findlay, G.M., Gray, A., Tolkacheva, T., Wigfield, S., Rebholz, H., Barnett, J., Leslie, N.R., Cheng, S., Shepherd, P.R., et al. (2004). The TSC1-2 tumor suppressor controls insulin-PI3K signaling via regulation of IRS proteins. J. Cell Biol. 166, 213–223. Hatley, M.E., Patrick, D.M., Garcia, M.R., Richardson, J.A., Bassel-Duby, R., van Rooij, E., and Olson, E.N. (2010). 
Modulation of K-Ras-dependent lung tumorigenesis by MicroRNA-21. Cancer Cell 18, 282–293. Heo, I., Joo, C., Cho, J., Ha, M., Han, J., and Kim, V.N. (2008). Lin28 mediates the terminal uridylation of let-7 precursor MicroRNA. Mol Cell 32, 276–284. Hyun, S., Lee, J.H., Jin, H., Nam, J., Namkoong, B., Lee, G., Chung, J., and Kim, V.N. (2009). Conserved MicroRNA miR-8/miR-200 and its target USH/ FOG2 control growth by regulating PI3K. Cell 139, 1096–1108.
Cell 147, 81–94, September 30, 2011 © 2011 Elsevier Inc.
Johnson, S.M., Grosshans, H., Shingara, J., Byrom, M., Jarvis, R., Cheng, A., Labourier, E., Reinert, K.L., Brown, D., and Slack, F.J. (2005). RAS is regulated by the let-7 microRNA family. Cell 120, 635–647. Kennell, J.A., Gerin, I., MacDougald, O.A., and Cadigan, K.M. (2008). The microRNA miR-8 is a conserved negative regulator of Wnt signaling. Proc. Natl. Acad. Sci. USA 105, 15417–15422. Kumar, M.S., Erkeland, S.J., Pester, R.E., Chen, C.Y., Ebert, M.S., Sharp, P.A., and Jacks, T. (2008). Suppression of non-small cell lung tumor development by the let-7 microRNA family. Proc Natl Acad Sci USA 105, 3903–3908. Laplante, M., and Sabatini, D.M. (2009). An emerging role of mTOR in lipid biosynthesis. Curr. Biol. 19, R1046–R1052. Lee, Y., and Dutta, A. (2007). The tumor suppressor microRNA let-7 represses the HMGA2 oncogene. Genes & Development 21, 1025–1030. Legesse-Miller, A., Elemento, O., Pfau, S.J., Forman, J.J., Tavazoie, S., and Coller, H.A. (2009). let-7 Overexpression leads to an increased fraction of cells in G2/M, direct down-regulation of Cdc34, and stabilization of Wee1 kinase in primary fibroblasts. J. Biol. Chem. 284, 6605–6609. Lu, L., Katsaros, D., de la Longrais, I.A., Sochirca, O., and Yu, H. (2007). Hypermethylation of let-7a-3 in epithelial ovarian cancer is associated with low insulin-like growth factor-II expression and favorable prognosis. Cancer Res. 67, 10117–10122. Mayr, C., Hemann, M.T., and Bartel, D.P. (2007). Disrupting the pairing between let-7 and Hmga2 enhances oncogenic transformation. Science 315, 1576–1579. Moss, E.G., Lee, R.C., and Ambros, V. (1997). The cold shock domain protein LIN-28 controls developmental timing in C. elegans and is regulated by the lin-4 RNA. Cell 88, 637–646. Newman, M.A., Thomson, J.M., and Hammond, S.M. (2008). Lin-28 interaction with the Let-7 precursor loop mediates regulated microRNA processing. RNA 14, 1539–1549. Nimmo, R.A., and Slack, F.J. (2009). 
An elegant miRror: microRNAs in stem cells, developmental timing and cancer. Chromosoma 118, 405–418. Peng, S., Chen, L.L., Lei, X.X., Yang, L., Lin, H., Carmichael, G.G., and Huang, Y. (2011). Genome-wide studies reveal that lin28 enhances the translation of genes important for growth and survival of human embryonic stem cells. Stem Cells 29, 496–504. Peter, M. (2009). Let-7 and miR-200 microRNAs: Guardians against pluripotency and cancer progression. Cell Cycle. 8, 843–852. Piskounova, E., Viswanathan, S.R., Janas, M., LaPierre, R.J., Daley, G.Q., Sliz, P., and Gregory, R.I. (2008). Determinants of microRNA processing inhibition by the developmentally regulated RNA-binding protein Lin28. J. Biol. Chem. 283, 21310–21314. Polesskaya, A., Cuvellier, S., Naguibneva, I., Duquet, A., Moss, E.G., and Harel-Bellan, A. (2007). Lin-28 binds IGF-2 mRNA and participates in skeletal myogenesis by increasing translation efficiency. Genes & Development 21, 1125–1138.
Reinhart, B.J., Slack, F.J., Basson, M., Pasquinelli, A.E., Bettinger, J.C., Rougvie, A.E., Horvitz, H.R., and Ruvkun, G. (2000). The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403, 901–906. Rybak, A., Fuchs, H., Smirnova, L., Brandt, C., Pohl, E.E., Nitsch, R., and Wulczyn, F.G. (2008). A feedback loop comprising lin-28 and let-7 controls pre-let-7 maturation during neural stem-cell commitment. Nat. Cell Biol. 10, 987–993. Segrè, A.V., DIAGRAM Consortium, MAGIC investigators, Groop, L., Mootha, V.K., Daly, M.J., and Altshuler, D. (2010). Common inherited variation in mitochondrial genes is not enriched for associations with type 2 diabetes or related glycemic traits. PLoS Genet. 6, e1001058. Selbach, M., Schwanhausser, B., Thierfelder, N., Fang, Z., Khanin, R., and Rajewsky, N. (2008). Widespread changes in protein synthesis induced by microRNAs. Nature 455, 58–63. Shah, O.J., Wang, Z., and Hunter, T. (2004). Inappropriate activation of the TSC/Rheb/mTOR/S6K cassette induces IRS1/2 depletion, insulin resistance, and cell survival deficiencies. Curr. Biol. 14, 1650–1656. Small, E.M., and Olson, E.N. (2011). Pervasive roles of microRNAs in cardiovascular biology. Nature 469, 336–342. Tremblay, F., Brule, S., Hee Um, S., Li, Y., Masuda, K., Roden, M., Sun, X.J., Krebs, M., Polakiewicz, R.D., Thomas, G., et al. (2007). Identification of IRS-1 Ser-1101 as a target of S6K1 in nutrient- and obesity-induced insulin resistance. Proc. Natl. Acad. Sci. USA 104, 14056–14061. Um, S.H., Frigerio, F., Watanabe, M., Picard, F., Joaquin, M., Sticker, M., Fumagalli, S., Allegrini, P.R., Kozma, S.C., Auwerx, J., et al. (2004). Absence of S6K1 protects against age- and diet-induced obesity while enhancing insulin sensitivity. Nature 431, 200–205. Vander Heiden, M.G., Cantley, L.C., and Thompson, C.B. (2009). Understanding the Warburg effect: the metabolic requirements of cell proliferation. Science 324, 1029–1033. Viswanathan, S.R., and Daley, G.Q. (2010). Lin28: A microRNA regulator with a macro role. Cell 140, 445–449. Viswanathan, S.R., Daley, G.Q., and Gregory, R.I. (2008). Selective blockade of microRNA processing by Lin28. Science 320, 97–100. Viswanathan, S.R., Powers, J.T., Einhorn, W., Hoshida, Y., Ng, T.L., Toffanin, S., O’Sullivan, M., Lu, J., Phillips, L.A., Lockhart, V.L., et al. (2009). Lin28 promotes transformation and is associated with advanced human malignancies. Nat. Genet. 41, 843–848. Voight, B.F., Scott, L.J., Steinthorsdottir, V., Morris, A.P., Dina, C., Welch, R.P., Zeggini, E., Huth, C., Aulchenko, Y.S., Thorleifsson, G., et al. (2010). Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat. Genet. 42, 579–589. Wang, Y.C., Chen, Y.L., Yuan, R.H., Pan, H.W., Yang, W.C., Hsu, H.C., and Jeng, Y.M. (2010). Lin-28B expression promotes transformation and invasion in human hepatocellular carcinoma. Carcinogenesis 31, 1516–1522. Xu, B., and Huang, Y. (2009). Histone H2a mRNA interacts with Lin28 and contains a Lin28-dependent posttranscriptional regulatory element. Nucleic Acids Res. 37, 4256–4263. Xu, B., Zhang, K., and Huang, Y. (2009). Lin28 modulates cell growth and associates with a subset of cell cycle regulator mRNAs in mouse embryonic stem cells. RNA 15, 357–361. Yang, D.H., and Moss, E.G. (2003). Temporally regulated expression of Lin-28 in diverse tissues of the developing mouse. Gene Expr. Patterns 3, 719–726. Yun, J., Rago, C., Cheong, I., Pagliarini, R., Angenendt, P., Rajagopalan, H., Schmidt, K., Willson, J.K., Markowitz, S., Zhou, S., et al. (2009). Glucose deprivation contributes to the development of KRAS pathway mutations in tumor cells. Science 325, 1555–1559. Zhu, H., Shah, S., Shyh-Chang, N., Shinoda, G., Einhorn, W.S., Viswanathan, S.R., Takeuchi, A., Grasemann, C., Rinn, J.L., Lopez, M.F., et al. (2010). Lin28a transgenic mice manifest size and puberty phenotypes identified in human genetic association studies. Nat. Genet. 42, 626–630.
Translocation-Capture Sequencing Reveals the Extent and Nature of Chromosomal Rearrangements in B Lymphocytes
Isaac A. Klein,1 Wolfgang Resch,5 Mila Jankovic,1 Thiago Oliveira,1,6 Arito Yamane,3,5 Hirotaka Nakahashi,3,5 Michela Di Virgilio,1 Anne Bothmer,1 Andre Nussenzweig,4 Davide F. Robbiani,1 Rafael Casellas,3,5,7,* and Michel C. Nussenzweig1,2,7,*
1Laboratory of Molecular Immunology
2Howard Hughes Medical Institute
The Rockefeller University, New York, NY 10065, USA
3Center for Cancer Research
4Experimental Immunology Branch
National Cancer Institute
5Genomics and Immunity, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, MD 20892, USA
6Medical School of Ribeirao Preto/USP, Department of Genetics, and National Institute of Science and Technology for Stem Cells and Cell Therapy and Center for Cell-based Therapy, Ribeirao Preto, SP 14051-140, Brazil
7These authors contributed equally to this work
*Correspondence: [email protected] (R.C.), [email protected] (M.C.N.)
DOI 10.1016/j.cell.2011.07.048
SUMMARY
Chromosomal rearrangements, including translocations, require formation and joining of DNA double strand breaks (DSBs). These events disrupt the integrity of the genome and are frequently involved in producing leukemias, lymphomas and sarcomas. Despite the importance of these events, current understanding of their genesis is limited. To examine the origins of chromosomal rearrangements we developed Translocation Capture Sequencing (TC-Seq), a method to document chromosomal rearrangements genome-wide, in primary cells. We examined over 180,000 rearrangements obtained from 400 million B lymphocytes, revealing that proximity between DSBs, transcriptional activity and chromosome territories are key determinants of genome rearrangement. Specifically, rearrangements tend to occur in cis and to transcribed genes. Finally, we find that activation-induced cytidine deaminase (AID) induces the rearrangement of many genes found as translocation partners in mature B cell lymphoma. INTRODUCTION Lymphomas, leukemias, and solid tumors frequently carry gross genomic rearrangements, including chromosomal translocations (Kuppers, 2005; Nussenzweig and Nussenzweig, 2010; Tsai and Lieber, 2010; Tsai et al., 2008; Zhang et al., 2010). Recurrent chromosomal translocations are key pathogenic events in hematopoietic tumors and sarcomas; they may juxtapose proto-oncogenes to constitutively active promoters, delete
tumor suppressors, or produce chimeric oncogenes (Rabbitts, 2009). For example, the c-myc/IgH translocation, a hallmark of human Burkitt’s lymphoma and mouse plasmacytomas, deregulates the expression of c-myc by bringing it under the control of Immunoglobulin (Ig) gene transcriptional regulatory elements (Gostissa et al., 2009; Kuppers, 2005; Potter, 2003). Alternatively, in chronic myeloid leukemia, the Bcr/Abl translocation fuses two disparate coding sequences to produce a novel, constitutively active tyrosine kinase (Goldman and Melo, 2003; Wong and Witte, 2004). Chromosome translocation requires formation and joining of paired DNA double strand breaks (DSBs), a process that may be limited in part by the proximity of two breaks in the nucleus (Nussenzweig and Nussenzweig, 2010; Zhang et al., 2010). B lymphocytes are particularly prone to translocation-induced malignancy, and mature B cell lymphomas are the most common lymphoid cancer (Kuppers, 2005). This enhanced susceptibility appears to be the direct consequence of activation-induced cytidine deaminase (AID) expression in activated B cells (Nussenzweig and Nussenzweig, 2010). AID normally diversifies antibody genes by initiating Ig class switch recombination (CSR) and somatic hypermutation (SHM) (Muramatsu et al., 2000; Revy et al., 2000). It does so by deaminating cytosine residues in single-stranded DNA (ssDNA) exposed by stalled RNA polymerase II during transcription (Chaudhuri and Alt, 2004; Pavri et al., 2010; Storb et al., 2007). The resulting U:G mismatches are then processed by one of several repair pathways to yield mutations or DSBs, which are obligate intermediates in CSR, but may also serve as substrates for translocation (Di Noia and Neuberger, 2007; Honjo, 2002; Peled et al., 2008; Stavnezer et al., 2008). 
Although AID has a strong preference for targeting Ig genes, it also mutates a large number of non-Ig loci, including Bcl6, Pax5, miR142, Pim1, and c-myc (Gordon et al., 2003; Liu et al., 2008; Pasqualucci et al., 2001; Pavri et al., 2010; Robbiani et al., 2009; Shen et al., 1998; Yamane et al., 2011). While non-Ig gene mutation frequencies are low, it has been estimated that AID mutates as many as 25% of all genes expressed in germinal center B cells (Liu et al., 2008). The full spectrum of potential AID targets was revealed by AID-chromatin immunoprecipitation studies, which showed AID occupancy at more than 5000 gene promoters bearing stalled RNA polymerase II (Yamane et al., 2011). AID is targeted to these genes through its interaction with Spt5, an RNA polymerase stalling factor (Pavri et al., 2010). Consistent with its genome-wide distribution, mice that overexpress AID exhibit chromosomal instability and develop translocation-associated lymphomas (Okazaki et al., 2003; Robbiani et al., 2009). Yet, c-myc is the only gene conclusively shown to translocate as a result of AID-induced DSBs (Ramiro et al., 2007; Robbiani et al., 2008). It has been estimated that up to 5% of activated primary B lymphocytes carry IgH fusions to unidentified partners which may or may not be selected during transformation (Franco et al., 2006; Jankovic et al., 2010; Ramiro et al., 2006; Robbiani et al., 2009; Wang et al., 2009; Yan et al., 2007). Additionally, recent deep-sequencing studies have revealed hundreds of genomic rearrangements within human cancers and documented their propensity to involve genes (Campbell et al., 2008; Pleasance et al., 2010a; Pleasance et al., 2010b; Stephens et al., 2009). However, the role of selection or other physiologic constraints in the genesis of these events is unclear because methods for mapping chromosomal translocations in primary cells do not yet exist. Here, we describe a novel, genome-wide strategy to document primary chromosomal rearrangements. We provide insight into the effects of genomic position and transcription on the genesis of chromosomal rearrangements and DSB resolution. Our data also reveal the extent of recurrent AID-mediated translocations in activated B cells.
Cell 147, 95–106, September 30, 2011 © 2011 Elsevier Inc.
Figure 1. TC-Seq Schematic
IgHI or MycI primary B cells are infected with retroviruses encoding I-SceI with or without AID. Genomic DNA is fragmented, blunted, A-tailed, ligated to T-tailed asymmetric linkers and native loci are eliminated by I-SceI digestion. Rearrangements are amplified by semi-nested ligation-mediated PCR followed by linker cleavage and paired-end deep sequencing.
Schematic steps: Infect I-SceI +/- AID; Genomic DNA; Sonicate & Polish; Ligate Linkers & Cut I-SceI; Semi-Nested LM-PCR; Linker Cleavage; Paired-End Solexa Seq.
RESULTS
Translocation Capture Sequencing
To discover the extent and nature of chromosomal rearrangements in activated B lymphocytes we developed an assay to capture and sequence rearranged genomic DNA (TC-Seq). In this system, DSBs are induced at the c-myc (chromosome 15) or IgH (chromosome 12) loci, which were engineered to harbor the I-SceI meganuclease target sequence (Robbiani et al., 2008). c-mycI-SceI/I-SceI or IgHI-SceI/I-SceI (hereafter referred to as MycI and IgHI) B cells were stimulated and infected with a retrovirus expressing I-SceI, in the presence or absence of AID. Rearrangements to I-SceI sites were recovered by semi-nested ligation-mediated PCR from genomic DNA that had been fragmented, A-tailed (to prevent intramolecular ligation) and ligated to asymmetric DNA linkers (Figure 1). Site-specific primers were placed at least 150 bp from the I-SceI site allowing for the capture of rearrangements involving moderate end-processing. PCR products were submitted for high-throughput paired-end sequencing and reads were aligned to the mouse genome. Identical reads were clustered as single events. Since sonication generates unique linker ligation points in each cell, this method allows for the study of independent events without sequencing through rearrangement breakpoints.
AID-Independent Translocations
In the absence of AID, DSBs arise as by-products of normal cellular metabolism including transcription and DNA replication (Branzei and Foiani, 2010). Consistent with a global distribution of DSBs, we mapped 28,548 unique rearrangements between the I-SceI site and every chromosome in MycIAID-/- B cells (100 million cells assayed, Figure 2A). To determine whether there is a genome-wide bias for rearrangement, these events were characterized based on location, transcription and histone modification of the locus. We found a marked enrichment of intrachromosomal rearrangements on chromosome 15, with approximately 125 events per mappable megabase (11,066 rearrangements), or 40% of all events (Figure 2B). Translocations between MycI and other chromosomes were evenly distributed throughout the genome (Figure 2B and Table S1). Notably, 86.7% (9591 of 11,066) of all intrachromosomal rearrangements were localized within a 350 kb domain surrounding the I-SceI site (from -50 kb to +300 kb; Figure 2C).
This is consistent with the observation that 92% of intrachromosomal rearrangements in the breast cancer genome involve aberrant joining of DSBs within 2 Mb of each other (Stephens et al., 2009), and that 87% of RAG-mediated intrachromosomal rearrangements in Abl-transformed pre-B cells lie within 200 kb of a recombination substrate (Mahowald et al., 2009). The asymmetrical distribution of events in the direction of c-myc transcription and the adjacent Pvt1 gene is also consistent with the idea that gene expression facilitates rearrangement (Thomas and Rothstein, 1989). I-SceI-proximal events may be the result of either resection and rejoining of I-SceI breaks, bona fide rearrangements between I-SceI and random DSBs, or a combination of DNA end resection and balanced translocations. Regardless of the precise molecular mechanism, the abundance of these events reveals a strong preference for DSBs to be resolved by ligation to a proximal sequence, a DNA repair strategy that may minimize gross genomic alterations.
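The proportions quoted for the AID-deficient MycI sample can be checked directly from the counts given in the text. The sketch below is only a worked calculation; the variable and function names are illustrative and are not part of the TC-Seq analysis pipeline.

```python
# Counts quoted in the text for MycI AID-deficient B cells.
TOTAL_EVENTS = 28_548       # unique rearrangements mapped genome-wide
INTRACHROMOSOMAL = 11_066   # rearrangements on chromosome 15
PROXIMAL = 9_591            # events inside the 350 kb domain around I-SceI


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Share of all events that are intrachromosomal (about 38.8%, which the
# text reports as roughly 40%).
intra_share = percent(INTRACHROMOSOMAL, TOTAL_EVENTS)

# Share of intrachromosomal events falling inside the 350 kb domain.
proximal_share = percent(PROXIMAL, INTRACHROMOSOMAL)
```

Rounded to one decimal place, `proximal_share` reproduces the 86.7% reported for the 350 kb domain.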
Figure 2. Rearrangements to a DSB Site Documented by TC-Seq
(A) Genome-wide view of rearrangements to MycI in AID-/- B cells. (B) Rearrangements per mappable megabase to each chromosome in MycIAID-/- B cells (x axis: chromosomes; y axis: rearrangements per Mb). (C) Profile of rearrangements around the I-SceI site in 5 kb intervals (x axis: distance from the I-SceI site, -500 kb to 500 kb; y axis: rearrangements per 5 kb).
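A profile of the kind described for panel C can be built by counting events in fixed-width bins of distance from the I-SceI site. The sketch below assumes each rearrangement has already been reduced to a signed offset in base pairs (negative values on one side of the break, positive on the other); the function name and the example offsets are hypothetical and the authors' actual binning code is not given in the text.

```python
from collections import Counter


def bin_profile(offsets_bp, bin_size=5_000):
    """Count events per bin, keyed by the bin's start offset in bp.

    Floor division keeps negative offsets in the correct bin, so an
    offset of -1 falls in the bin covering [-5000, 0).
    """
    counts = Counter()
    for offset in offsets_bp:
        counts[(offset // bin_size) * bin_size] += 1
    return dict(sorted(counts.items()))


# Hypothetical offsets (bp) relative to the I-SceI site.
example = bin_profile([-5_001, -5_000, -1, 0, 4_999, 5_000])
```

For this toy input, `example` is `{-10000: 1, -5000: 2, 0: 2, 5000: 1}`.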
Recent cancer genome sequencing experiments uncovered a modest but highly significant preference for cancer-associated rearrangements to occur in genes, which compose only 41% of the human genome. For example, in 24 sequenced breast cancer genomes, 50% of all rearrangements involved genes (Stephens et al., 2009). Whether this bias resulted from selection or some inherent feature of DSB formation and repair specific to cancer cells could not be determined. To ascertain whether a similar bias is seen in primary cells in short term cultures, AID-independent rearrangements in MycIAID-/- B lymphocytes (excluding 1 Mb of DNA around the I-SceI site) were classified as genic or intergenic. Consistent with the human tumor studies, 50.3% (9677 of 19,246) of the events were associated with genes (Figure 3A). Because only 40% of the mouse genome is genic, this represents a small (1.25-fold) but significant difference (permutation test p < 0.001) relative to intergenic regions. Moreover, the genic rearrangements were particularly enriched at transcription start sites (Figure 3B). Consistent with the preference for genic rearrangements, we also observed a bias to transcribed genes. Fewer rearrangements than expected occurred at silent (fe = 0.74, p < 0.001) and trace (fe = 0.95, p < 0.001) transcribed genes, while more than expected occurred at low (fe = 1.08, p < 0.001), medium (fe = 1.13, p < 0.001), and highly (fe = 1.14, p < 0.001) transcribed genes (Figure 3C and Figure S1). Additionally, rearrangements were enriched in genes bearing PolII and activating histone marks such as H3K4 trimethylation, H3 acetylation, and H3K36 trimethylation (p < 0.001, Figure 3D). Thus, there is a propensity for a DSB to recombine with gene rich regions of the genome and more specifically to transcription start sites of actively transcribed genes.
Figure 3. Rearrangements to MycI Occur Near the TSSs of Actively Transcribed Genes
(A) Rearrangements, excluding the 1 Mb around I-SceI, were categorized as genic or intergenic.
Total Captured Rearrangements
Sample          Genic    Intergenic    % genic
MycI AID-/-     9677     9569          50.3
MycI AIDRV      26221    25101         51.1
IgHI AIDWT      5651     5723          49.7
IgHI AIDRV      9475     8026          53.9
P (permutation test) < 0.001 for all
(B) Composite density profile of genomic rearrangements from MycI to genes (red line) and intergenic regions (blue line). TSS = transcription start site, TTS = transcription termination site. (C) Relative frequency (fe) of rearrangements in genes that are either silent or display trace, low, medium or high levels of transcription in activated B cells as determined by RNA-Seq (Figure S1). Dashed line indicates the expected rearrangement frequency based on a random model. p < 0.001 for all (permutation test). (D) Relative frequency of rearrangements in PolII- and activating histone mark-associated gene groups (Yamane et al., 2011). Dashed line indicates expected frequency based on a random model. p < 0.001 for all samples (permutation test). Also see Figure S1.
AID-Mediated Lesions Captured by TC-Seq
Processing of AID-induced U:G mismatches can result in DSBs in Ig and non-Ig genes such as c-myc (Robbiani et al., 2008). To determine whether AID-mediated DSBs can be captured by TC-Seq we examined the IgH and c-myc loci in B cells expressing retrovirally encoded AID (IgHIAIDRV or MycIAIDRV). IgHI B cells expressing both I-SceI and AID showed extensive AID-dependent rearrangement between the I-SceI site and downstream switch (S) regions (Figure 4A). The frequency of rearrangements resembled the pattern of AID-mediated CSR in LPS+IL-4 cultures (e.g., IgG1>IgG3>IgE), with 18,686 mapping to Sγ1, 3192 to Sγ3, and 1433 to Sε (Table S2). Furthermore, translocations between c-myc and IgH were entirely dependent on AID (Figures 4B and 4C). In two biological replicate samples totaling 100 million B cells, we observed 45 translocations from IgHI to c-myc (the I-SceI DSB was in IgH), and 5963 from MycI to IgH (the I-SceI DSB was in c-myc) (Table S2). Additionally, TC-Seq tags mapping to c-myc from IgHI correlate well with c-myc/IgH translocation breakpoints sequenced from primary B cells (Figure 4C) (Robbiani et al., 2008). This suggests that TC-Seq reads are an accurate proxy for breakpoints. Furthermore, the data corroborate previous findings showing that AID induced breaks at c-myc are rate limiting for c-myc/IgH translocations (Robbiani et al., 2008) and suggest that AID-dependent IgH breaks are two orders of magnitude more frequent than those at c-myc. We conclude that TC-Seq captures rearrangements and translocations between DSBs in IgHI or MycI and known AID targets.
As was the case for AID deficient samples, MycIAIDRV and IgHIAIDRV libraries were enriched in intrachromosomal rearrangements: 17% (10,633 of 63,772 total events) for MycI and 70% (36,019 of 51,312) for IgHI (Table S1 and Figures S2A and S2B). The difference in enrichment between the two was mostly the result of AID activity on chromosome 12, which generated a large number of rearrangements to IgH variable and constant domains (Figure 4A). Expression of AID did not alter the distribution of events around MycI, with 72.5% (7707 of 10,633) mapping within -50 kb to +300 kb of the break (Figure S2C). A notable exception was an additional cluster of rearrangements associated with Pvt1 exon 5 (Figure S2C). These events coincided precisely with documented chromosomal translocations isolated from AID sufficient mouse plasmacytomas (Cory et al., 1985; Huppi et al., 1990) and likely represent an AID hot spot. In agreement with the MycIAID-/- samples, translocations between IgHI or MycI and other chromosomes were evenly distributed throughout the genome, except for the MycI capture sample, which displayed a marked bias for chromosome 12 due to creation of DSBs at the IgH locus by AID (Figure S2A). Similar to MycIAID-/- samples, rearrangements in both cases were more likely to occur in regions that are genic, transcriptionally active, recruiting PolII, and associated with activating histone marks (Figure 3A and Figures 4D and 4E). Furthermore, intragenic rearrangements were enriched at transcription start sites of genes (Figure S2D). In contrast to recent studies that used Nbs1 as an indirect marker of AID mediated damage (Staszewski et al., 2011), we found little or no difference in rearrangements to genomic repeats in the presence of AID (Table S3). Thus, AID does not dramatically alter the general profile of rearrangements.
Next, we examined whether IgHI and MycI capture DSB targets at similar rates. Indeed, in AID sufficient samples, total translocations from a given chromosome to IgHI or MycI occurred at roughly similar frequencies (Figure 4F). This similarity could be explained by the close physical proximity of IgH and c-myc, as suggested by studies with EBV-transformed B lymphoblastoid cells (Roix et al., 2003). Alternatively, the correlation in transchromosomal joining might represent random ligation between I-SceI DSBs in IgH or c-myc and DSBs on other chromosomes. We conclude that extra-chromosomal DSBs ligate DSBs in IgHI and MycI at similar rates.
Figure 4. Rearrangements to IgHI or MycI in Primary B Cells Expressing AID
(A and B) (A) Rearrangements per kb to I-SceI sites (indicated with an asterisk) in IgH or (B) c-myc in AID-/- (top panel) or AIDRV cells (bottom panel). White boxes in the schematics below each graph represent Ig switch domains while black boxes depict constant regions. (C) Translocations per 100 bp from IgHI to c-myc in AID-/- (top panel) or AIDRV cells (bottom panel). Green arrows indicate c-myc/IgH translocation breakpoints sequenced from primary B cells (Robbiani et al., 2008). (D) Relative frequency of rearrangements in transcription-level gene groups (Figure S1), dashed line indicates expected frequency based on a random model. Asterisks highlight values with a p < 0.001 (permutation test). (E) Relative frequency of rearrangements in PolII-associated or activating histone mark-associated gene groups (Yamane et al., 2011). Dashed line indicates the expected frequency based on a random model. p < 0.001 for all samples (permutation test). (F) Graph comparing the number of translocations per mappable megabase to each chromosome from IgHI (y axis) or MycI (x axis). (G) Ratio of IgHI captured to MycI captured events in 500 kb bins moving away from the I-SceI capture site (both directions combined). Dotted line represents the average transchromosomal joining rate computed on all chromosomes other than 12 or 15. Gray areas show 2 standard deviations around the mean. Also see Figure S2.
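The comparison in panel G can be sketched as a per-bin ratio of IgHI-captured to MycI-captured events, judged against a band of two standard deviations around a transchromosomal baseline. The code below is a minimal illustration under stated assumptions: counts are already tallied per 500 kb bin, bins without MycI events are skipped, and the baseline is a plain list of ratios from other chromosomes. All names and numbers are hypothetical, and the normalization actually used by the authors is not described in this section.

```python
from statistics import mean, stdev


def bin_ratios(igh_counts, myc_counts):
    """Per-bin ratio of IgHI to MycI events, skipping bins with no MycI events."""
    return {
        b: igh_counts.get(b, 0) / n_myc
        for b, n_myc in myc_counts.items()
        if n_myc > 0
    }


def outside_band(ratios, baseline, n_sd=2.0):
    """Return bins whose ratio lies outside baseline mean +/- n_sd * SD."""
    mu, sd = mean(baseline), stdev(baseline)
    return sorted(b for b, r in ratios.items() if abs(r - mu) > n_sd * sd)


# Hypothetical counts keyed by bin index (bin 0 is closest to the I-SceI site).
igh = {0: 50, 1: 10, 2: 9}
myc = {0: 10, 1: 10, 2: 10, 3: 0}
ratios = bin_ratios(igh, myc)
flagged = outside_band(ratios, baseline=[0.9, 1.0, 1.1, 1.0])
```

In this toy input only bin 0 is flagged, mirroring the qualitative pattern of a strong local excess that fades with distance from the break.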
To examine the nature of the intrachromosomal rearrangement bias we calculated the ratio of IgHI to MycI captured events for each 500 kb segment of the genome and compared the values for chromosome 12 and 15 to the transchromosomal average (Figure 4G and Figure S2E). This analysis revealed that DSBs are preferentially captured intrachromosomally and this effect diminishes at a rate inversely proportional to the distance from the I-SceI site (d^(-1.29)) (Figure S2F). This effect was most prominent locally but was evident at up to 50 Mb away from the I-SceI break. We conclude that paired DSBs are preferentially joined intrachromosomally and that the magnitude of this effect decreases with increasing distance between the two lesions.
Translocation Hot Spots
To determine whether there are hot spots for rearrangement, we searched the B cell genome for local accumulations of reads in AID-deficient and AID-sufficient samples. TC-Seq hot spots were defined as a localized enrichment of rearrangements above what is expected from a uniform genomic distribution. We
Figure 5. AID-Dependent Rearrangement Hot Spots
(A) Screenshots of translocations per 100 bp present at Il4i1 and Pax5 genes in all samples (MycIAIDRV, MycIAID-/-, IgHIAIDRV, and IgHIAIDWT). (B) Overlap of AID-dependent hot-spot-bearing genes in IgHIAIDRV and IgHIAIDWT experiments. (C) Overlap of AID-dependent hot-spot-bearing genes in IgHIAIDRV, MycIAID-/-, and MycIAIDRV experiments. (D) Empirical cumulative distribution showing transcript abundance in genes displaying (red) or lacking (black) rearrangement hot spots. Filled-in gray slice represents 2000 highly transcribed unrearranged genes. (E) Total rearrangements as a function of gene expression (RNA-Seq) (Yamane et al., 2011) in genes bearing rearrangement hot spots. Also see Figure S3.
removed likely artifacts; namely, hot spots containing >80% of reads within DNA repeats, and those with footprints of <100 nt (because translocations are amplified from randomly sonicated DNA [Figure 1], deep-sequence tags associated with bona fide rearrangements are unlikely to map within a small region).
We identified 34 hot spots captured by MycI in the absence of AID (Table S4). There were 31 hot spots in 17 genes and three in nongenic regions (Table S4). Seventeen of the hot spots were in Pvt1, within 500 kb of the I-SceI site. In addition, two hot spots occurred within 5 kb of cryptic I-SceI sites (each bearing one mismatch to the 18-base pair recognition sequence). For example, one such hot spot at chr15:16219195-16219312 containing a 1-off I-SceI recognition sequence bore 8 rearrangements (Figure S3). A genome-wide search for rearrangements within 5 kb of cryptic I-SceI sites (83 within the mouse genome with 1 or 2 mismatches to the canonical I-SceI recognition sequence) yielded 57 events, seven times more than expected in a random distribution model. When allowing up to 6 mismatches we find a total of five out of 34 AID-independent hot spots near putative cryptic I-SceI sites. Although I-SceI has been used to generate a unique DSB in gene targeting and DNA repair experiments, our data suggest that DNA recognition by I-SceI can be promiscuous in the mouse genome, as demonstrated for other yeast endonucleases (Argast et al., 1998).
In contrast to AID-/-, we found 157 AID-dependent hot spots in 83 genes captured by IgHIAIDRV and 60 hot spots in 37 genes by MycIAIDRV in 100 million B cells (Table S4). 80% of the hot spots captured by c-myc and 90% of those captured by IgH were within genes. For example, we found robust AID-dependent hot spots on Il4i1 and Pax5 (a recurring IgH translocation partner in lymphoplasmacytoid lymphoma [Kuppers, 2005]) (Figure 5A and Table S4). AID-dependent hot spots were similar for IgHI B cells expressing wild-type levels (WT) or retrovirally overexpressed (RV) AID; however, the number of events was decreased in the former (Figures 5A and 5B and Table S4). Therefore, translocations to AID targets occur in cells expressing physiological levels of AID and hotspots are not dependent on AID overexpression. We conclude that AID produces substrates for translocations in a number of discrete sites throughout the genome, and these sites are mainly in genes.
Genes containing AID-dependent hotspots overlapped between IgHIAIDRV and MycIAIDRV samples (Figure 5C). Consistent with the similar capture rates observed for transchromosomal targets (Figure 4F), we found that 28 of the frequently translocated targets were shared (Table S5). In contrast, we found a number of unique intrachromosomal AID-dependent hot spots. For example, rearrangements to Inf2 on chromosome 12 (850 kb from IgHI) were only found by IgHI capture while rearrangements near Pvt1 on chromosome 15 (up to 350 kb from MycI) were only found by MycI capture (Table S5). Thus, there is
Figure 6. Characterization of AID-Dependent Hot Spots
(A) Composite density graph showing the distribution of rearrangements in genes associated with AID-dependent hotspots relative to the TSS. (B) Spt5 and AID recruitment at genomic sites associated with translocation hotspots (Pavri et al., 2010; Yamane et al., 2011). (C) Somatic hypermutation frequency versus number of rearrangements in genes bearing AID-dependent translocation hotspots (ρ = 0.84). (D) Distribution of AID, Spt5, and translocations in Rohema and Hist1 genes. Also see Figure S4.
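The rank correlation reported for panel C (hypermutation frequency versus rearrangement count) can be illustrated with a short, self-contained Python sketch of Spearman's coefficient. The function names are hypothetical and the implementation is a generic one (Pearson correlation of average ranks, which handles ties); it is offered as an illustration, not as the code used in the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Given per-gene mutation rates and rearrangement counts as two parallel lists, `spearman(mutation_rates, rearrangements)` returns a value between -1 and 1; a rank-based coefficient is a natural choice here because rearrangement counts span several orders of magnitude.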
a bias toward recombination between I-SceI breaks and AID hot spots within the same chromosome. Additionally, the finding that some hot spots are only captured in cis indicates that TC-Seq underestimates the number of AID-mediated DSBs in the genome and suggests that we have not reached saturation. Combined analysis of the IgHI and MycI TC-Seq data sets shows that AID-dependent hot spots are primarily found in transcribed genes (Figure 5D). However, although nearly all of the translocated genes are actively transcribed, there is no clear correlation between transcript abundance and rearrangement frequency (Figure 5E). Furthermore, 2000 highly transcribed genes are not rearranged (Figure 5D, shaded area). Therefore, transcription is necessary but not sufficient for AID targeting, and transcription levels alone cannot account for AID-dependent DSBs. AID-dependent hot spots are biased to the region around the transcription start site (Figure 6A). This finding is consistent with the accumulation of AID and Spt5 around the promoters of stalled genes and the distribution of somatic hypermutation (Pavri et al., 2010; Yamane et al., 2011). Indeed, AID-dependent TC-Seq hot spots overlap with regions of AID (Figure S4A) and Spt5 accumulation (Figures 6B and 6D). This correlation prompted us to explore the relationship between AID activity and accumulation of chromosomal translocations by measuring somatic hypermutation at TC-Seq captured AID targets (Yamane et al., 2011 and Table S6). We found a positive correlation (Spearman coefficient = 0.84) between hypermutation and rearrangement frequency (Figure 6C). All genes analyzed with a mutation rate over 10 × 10^-5 bear rearrangements, and all genes with AID-dependent TC-Seq hot spots show mutations (Figure 6C). Rearrangements were only seen rarely in genes with lower rates of mutation (Figure 6C). This suggests that the rate of hypermutation and the frequency of AID-induced DSBs are directly proportional.
We conclude that AID-dependent TC-Seq hot spots occur on stalled genes that accumulate Spt5, AID, and high rates of hypermutation. Among AID-dependent hot spot containing genes we find several that are translocated or deleted in mature B cell lymphoma. These include Pax5/IgH, Pim1/Bcl6, Il21r/Bcl6, Gas5/Bcl6, and Ddx6/IgH translocations and Junb and Socs1 deletions in diffuse large B cell lymphoma, Birc3/Malt1 translocation in MALT lymphoma, Ccnd2/IgK translocation and Bcl2l11 deletion in mantle cell lymphoma, Aff3/Bcl2 and Grhpr/Bcl6 translocations in follicular lymphoma, mir142/c-myc translocation in B cell prolymphocytic leukemia as well as c-myc/IgH and Pvt1/IgK translocations in Burkitt’s lymphoma (Table 1). Interestingly, we find that AID is capable of inducing DSBs in Fli1 (Table S4), which is translocated to EWS in 90% of Ewing’s sarcomas, a malignant tumor of uncertain origin (Riggi and Stamenkovic, 2007). We conclude that in addition to mutating many genes, AID also initiates DSBs in numerous non-Ig genes. These genes serve as substrates for translocations associated with mature B cell lymphoma, strongly implicating AID as a source of genomic instability in these cancers.
DISCUSSION
To date, the study of chromosomal aberrations has been primarily limited to events identified in tumors and tumor cell lines. Although we have learned a great deal about the importance of genomic rearrangements in cancer, it has not been possible to develop an understanding of the cellular and molecular requirements that govern their genesis. To examine genomic rearrangements in primary cells in short-term cultures, we developed a technique to catalog these events by deep sequencing, TC-Seq. Our results and analysis reveal the importance of transcription and physical proximity in recombinogenesis and identify hotspots for AID-mediated translocations in mature B cells.
Nuclear Proximity and Chromosomal Position
The existence of chromosome territories, regions in which individual chromosomes segregate, has been long proposed (Cremer and Cremer, 2001) and recently shown to be a key feature of genome organization (Lieberman-Aiden et al., 2009). Our analysis provides evidence that physical proximity and chromosome territories are partial determinants for joining of specific rearrangement partners. The effects of physical proximity are most evident in the 350 kb region around the DSB. In the absence of AID the plurality of rearrangements fall in this region.
Table 1. AID-Dependent Translocations in Human B Cell Lymphoma
TC-Seq Gene | Translocation in Mature BCL | Mature BCL Type | Reference
Birc3 (Api2) | t(11;18)(q21;q21) | MALT | (Rosebeck et al., 2011)
Il21r | t(3;16)(q27;p11) | DLBCL | (Ueda et al., 2002)
Pax5 | t(9;14)(p13;q32) | DLBCL | (Iida et al., 1999)
Pim1 | t(3;6)(q27;p21.2) | DLBCL | (Yoshida et al., 1999)
Aff3 | t(2;18)(q11.2;q21) | FL | (Impera et al., 2008)
Gas5 | t(1;3)(q25;q27) | DLBCL | (Nakamura et al., 2008)
Ccnd2 | t(2;12)(p12;p13) | MCL | (Gesk et al., 2006)
c-myc | t(8;14)(q23;q32) | BL | (Kuppers, 2005)
Ddx6 (Rck) | t(11;14)(q23;q32) | DLBCL | (Lu and Yunis, 1992)
Grhpr | t(3;9)(q27;p11) | FL | (Akasaka et al., 2003)
Bcl2l11 | Deleted | MCL | (Bea et al., 2009)
Socs1 | Deleted | DLBCL | (Mottok et al., 2009)
Junb | Deleted | DLBCL | (Mao et al., 2002)
mir142 | t(8;17) | B-PLL | (Gauwerky et al., 1989)
Pvt1 | t(2;8)(p11.2;q24.1) | BL | (Einerson et al., 2006)
IgH | several | several | (Kuppers, 2005)
IgK | several | several | (Kuppers, 2005)
IgL | several | several | (Kuppers, 2005)
Genes bearing AID-dependent TC-Seq hotspots are shown with the associated translocation or deletion observed in human mature B cell lymphoma (BCL). BCL types are abbreviated as follows: MALT, mucosa-associated lymphoid tissue lymphoma; DLBCL, diffuse large B cell lymphoma; FL, follicular lymphoma; MCL, mantle cell lymphoma; BL, Burkitt’s lymphoma; B-PLL, B cell prolymphocytic leukemia.
This observation is consistent with the analysis of rearrangements in the breast cancer genome and suggests that the abundance of these events is independent of cancer-specific selection (Stephens et al., 2009). Additionally, a preference for DSB repair within 350 kb matches the range of gamma-H2AX spreading from a DSB (Bothmer et al., 2011). This is consistent with the idea that the DNA damage response facilitates proximal rearrangement, a phenomenon most prominent at the IgH locus during CSR. The magnitude of the effect of chromosome territories on rearrangement is far less prominent than proximal joining, but is consistent with recent genome mapping data obtained by high-throughput chromosome conformation capture (Hi-C) (Lieberman-Aiden et al., 2009). Intrachromosomal joining bias is evident in the preferential joining of AID hotspots and nonhotspots on Chr12 and Chr15 with their respective I-SceI breaks. When compared to transchromosomal joining, the bias to intrachromosomal rearrangements is evident even when DSBs are separated by as much as 50 Mb. In mouse, the mean autosome size is 130 Mb, so a 50 Mb preference for intrachromosomal joining on either side of a DSB will encompass nearly the entire average chromosome. We conclude that intrachromosomal joining is preferred to transchromosomal joining. Since this effect diminishes with distance, it is mediated by proximity, a likely consequence of local chromosome packing and nuclear chromosomal territories. A strong preference for
proximal intrachromosomal rearrangement minimizes gross genomic alterations. We propose that this may be an important feature of DSB repair regulation that maintains genomic integrity.
Transcription
Transcription is associated with increased rates of DNA damage and genome instability; these effects are likely mediated by a number of different mechanisms (Gottipati and Helleday, 2009). Transcription may expose ssDNA, which is susceptible to chemical or oxidative damage (Aguilera, 2002). Additionally, head-on collision of the replication and transcription machinery has been implicated in fork stalling and genomic instability (Takeuchi et al., 2003). Consistent with these ideas, TC-Seq reveals that transcription facilitates DNA rearrangement. In the case of the c-myc locus, transcription increases the size of the local area around a DSB that is available for recombination from 50 kb to 300 kb. Moreover, I-SceI breaks rearrange predominantly to transcribed genes genome-wide and more specifically to the TSS. Thus, exposed ssDNA may serve as a primary source of genomic instability. AID expression further reinforces this phenomenon by creating U:G mismatches in ssDNA at sites of PolII stalling downstream of the TSS (Pavri et al., 2010). A bias for rearrangement between genic regions was also reported in recent studies of the cancer genome, but the role of transcription, transformation or selection in these events could not be evaluated (Stephens et al., 2009). Our experiments demonstrate that transcribed genic regions are over-represented in chromosomal rearrangements in primary cells in short-term cultures. In addition to being more susceptible to damage, this effect may be due to the increased physical proximity of transcribed regions to each other in the nucleus (Lieberman-Aiden et al., 2009). We speculate that this phenomenon may have consequences for tumorigenesis.
The rearrangement of protooncogenes to transcribed regions may lead to their deregulation or produce hybrid entities that alter cellular metabolism.
AID and Chromosome Translocation
AID initiates SHM, CSR, and chromosome translocation by deaminating cytosine residues in ssDNA exposed by transcription (Chaudhuri and Alt, 2004; Di Noia and Neuberger, 2007; Nussenzweig and Nussenzweig, 2010; Peled et al., 2008; Stavnezer et al., 2008). AID targets the IgH locus and the TSSs of stalled genes through direct interaction with Spt5, a PolII stalling factor (Pavri et al., 2010), resulting in widespread somatic mutations (Yamane et al., 2011). Additionally, AID has been shown to initiate DSBs in non-Ig targets such as c-myc, and generates diverse translocations and chromosome breaks (Robbiani et al., 2008; Robbiani et al., 2009). However, the precise relationships between AID and Spt5 occupancy, mutation, and translocations have not previously been investigated. By capturing and sequencing chromosomal rearrangements, a readout for aberrantly resolved DSBs, we have gained insight into the mechanisms by which AID targets DNA for chromosomal rearrangement. First, we show that AID targets discrete sites in the genome for DSB. These sites are predominantly genic and actively transcribed. A recent study using Nbs1-ChIP as a surrogate for DNA damage suggested that AID targets repeat-rich sequences (Staszewski et al., 2011). In contrast, we find no
AID-dependent increase in rearrangements to repeats. Moreover, AID-dependent rearrangement hotspots predominantly occur in genes, not in or near repeat regions that are not transcribed. Hotspots that do fall in repeats (Figure S4B) are not AID-dependent and do not suffer somatic hypermutation (Table S6). While it is difficult to map short reads to repetitive sequences, these data suggest that rearrangements to repeats may be from AID-independent DSBs. While genes rearranged by AID are largely transcribed, expression and PolII accumulation do not correlate directly with rearrangement frequency, suggesting that transcription is necessary but not rate-limiting for rearrangement. Reflecting the distribution of AID and its co-factor Spt5 in the genome (Pavri et al., 2010; Yamane et al., 2011), AID-dependent rearrangements occur mainly on transcription start sites of stalled genes that carry high levels of the PolII stalling factor Spt5. In addition, we find a strong and direct correlation between hypermutation and rearrangements, suggesting that genes susceptible to AID-mediated recombinogenesis are a subset of the most highly mutated genes in the genome. Consistent with this notion, we show that Pax5, Il21r, Gas5, Ddx6, Birc3, Ccnd2, Aff3, Grhpr, c-myc, Pvt1, Bcl2l11, Socs1, mir142, Junb, and Pim1, which are translocated or deleted in mature B cell lymphomas (Table 1), are among the more highly mutated AID targets and bear AID-dependent translocation hotspots. Our experiments were performed on in vitro stimulated B cells. Germinal center B cells will have an alternate gene expression profile that might influence the number and position of AID target sites. We conclude that in addition to hypermutation, AID is also a source of genomic instability in mature B cell lymphomas. Finally, we note that TC-Seq can be adapted for use in other cell types to study translocation biology in any tissue.
EXPERIMENTAL PROCEDURES
B Cell Cultures, Infections, and Sorting
Resting B lymphocytes were isolated from mouse spleens by immunomagnetic depletion with anti-CD43 MicroBeads (Miltenyi Biotec) and cultured at 0.5 × 10^6 cells/ml in RPMI supplemented with L-glutamine, sodium pyruvate, antibiotic/antimycotic, HEPES, 50 µM 2-mercaptoethanol (all from GIBCO-BRL), and 10% fetal calf serum (Hyclone). B cells were stimulated in the presence of 500 ng/ml RP105 (BD PharMingen), 25 µg/ml lipopolysaccharide (LPS) (Sigma) and 5 ng/ml mouse recombinant IL-4 (Sigma). Retroviral supernatants were prepared by cotransfection of BOSC23 cells with pCL-Eco and pMX-IRES-GFP-derived plasmids encoding I-SceI-mCherry or AID-GFP with Fugene 6, 72 hr before infection. At 20 and 44 hr of lymphocyte culture, retroviral supernatants were added, and B cells were spinoculated at 1150 g for 1.5 hr in the presence of 10 µg/ml polybrene. For dual infection, separately prepared retroviral supernatants were added simultaneously on both days. After 4 hr at 37°C, supernatants were replaced with LPS and IL-4 in supplemented RPMI. At 96 hr from the beginning of their culture, singly infected B cells were collected and frozen in 10 million cell pellets at -80°C. Dually infected B cells were sorted for double-positive cells with a FACSAria instrument (Becton Dickinson) and then frozen down.
Translocation Capture Sequencing
Genomic DNA Library Preparation
5 × 10 million B cell aliquots were lysed in Proteinase K buffer (100 mM Tris [pH 8], 0.2% SDS, 200 mM NaCl, 5 mM EDTA) and 50 µl of 20 mg/ml Proteinase K. Genomic DNA was extracted by phenol chloroform precipitation and fragmented by sonication (Bioruptor, Diagenode) to yield a 500–1350 bp distribution of DNA fragments. DNA was divided into 5 µg aliquots in 1.5 ml Eppendorf tubes. Each experiment consisted of genomic DNA from 50 million B cells in 50 × 5 µg aliquots for a total of 250 µg of fragmented genomic DNA per experiment. Subsequent reactions were performed individually on 5 µg aliquots. DNA was blunted by End-It DNA Repair Kit (Epicenter), purified, then adenosine-tailed by Klenow fragment 3′→5′ exo- (NEB) and purified. Fragments were ligated to 200 pmol of annealed linkers (pLT + pLB) (Table S8A) and unrearranged loci were eliminated by I-SceI digestion. Reactions were purified and pooled.
Rearrangement Amplification
Pooled linker-ligated DNA was divided into two equal parts for semi-nested ligation-mediated PCR using either forward or reverse primers (to capture rearrangements to either side of the I-SceI break). All PCRs were performed using the Phusion Polymerase system (NEB). DNA was divided into 1 µg aliquots and subjected to single-primer PCR with biotinylated pMycF1, pMycR1, pIghF1 or pIghR1 [1× (98°C, 1 min); 12× (98°C, 15 s; 65°C, 30 s; 72°C, 45 s); 1× (72°C, 1 min)] (Table S8A). Each reaction was spiked with pLinker and subjected to additional cycles of PCR [1× (98°C, 1 min); 35× (98°C, 15 s; 65°C, 30 s; 72°C, 45 s); 1× (72°C, 5 min)]. Forward and reverse PCR reactions were pooled separately. Higher molecular weight products were isolated by agarose gel electrophoresis and magnetic streptavidin bead purification. Semi-nested PCR was performed on the magnetic beads with pMycF2, pMycR2, pIghF2, or pIghR2 and pLinker (Table S8A) [1× (98°C, 1 min); 35× (98°C, 10 s; 65°C, 30 s; 72°C, 40 s); 1× (72°C, 5 min)]. Higher molecular weight products were isolated by agarose gel electrophoresis.
Paired-End Library Preparation
Linkers were removed by AscI digestion. Fragments were blunted by End-It DNA Repair Kit (Epicenter), purified, adenosine-tailed and ligated to Illumina paired-end adapters.
Higher molecular weight products were isolated by agarose gel electrophoresis and adaptor-ligated fragments were enriched by 25 cycles of PCR with Illumina primers PE1.0 and PE2.0. Forward and reverse libraries for the same sample were mixed in equimolar ratios and sequenced by 36 × 36 or 54 × 54 paired-end deep sequencing on an Illumina GAII.
TC-Seq Computational Analysis
Read Alignment
Each end of the paired-end sequences was matched against the relevant bait primer plus genomic sequences allowing up to two mismatches with bowtie (version 0.12.5; command line options: -v 2). For read pairs longer than 2 × 36 nt, 10 nt were trimmed off the 3′ end of each read. Each read pair with a single match to one of the primers was then checked for a perfect match to the linker on the second arm. If the linker was present, this arm was designated a target arm, linker sequence was trimmed, and the remainder was aligned against the mouse genome (NCBI 37/mm9) with bowtie allowing up to 2 mismatches and requiring unique alignments in the best alignment stratum (command line options: -v 2 --all --best --strata -m 1). Exactly identical alignments (same position, same strand) were combined into a single putative translocation event, and events supported by a single alignment were not considered in any analyses. We also removed putative translocation events closer than 1 kb to their respective bait. For hotspot analyses the exclusion limit was increased to 50 kb. Translocation positions were given as the position of the 5′ end of the read in the alignment. Data from technical and biological repeats were pooled to increase saturation (Table S7).
Mapping of Translocation Hot Spots
A translocation hotspot was defined as a localized enrichment of translocation events above what is expected from the null hypothesis of uniform distribution of translocation events along the genome.
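The event-calling rules described under Read Alignment (collapse identical alignments, discard events supported by a single alignment, and exclude events near the bait) can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions: the function name, the tuple representation of an alignment, and the example coordinates are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter


def call_events(alignments, bait_chrom, bait_pos, exclusion=1_000, min_support=2):
    """Collapse target-arm alignments into putative translocation events.

    alignments: iterable of (chrom, pos, strand) tuples (assumed format).
    exclusion:  events on the bait chromosome closer than this many bp to the
                bait are dropped (1 kb by default; 50 kb for hotspot analyses).
    min_support: minimum number of identical alignments required, so that
                events supported by a single alignment are not considered.
    Returns a dict mapping (chrom, pos, strand) to its supporting read count.
    """
    support = Counter(alignments)  # same position and strand -> one event
    events = {}
    for (chrom, pos, strand), n in support.items():
        if n < min_support:
            continue
        if chrom == bait_chrom and abs(pos - bait_pos) < exclusion:
            continue
        events[(chrom, pos, strand)] = n
    return events
```

Calling the same function with `exclusion=50_000` mirrors the wider exclusion window stated for hotspot analyses.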
To identify such hotspots, candidate regions were defined as locations containing consecutive translocations with distances shorter than expected from the mappable size of the mm9 genome assembly (p < 0.01 each as determined by a negative binomial test). For a candidate region to be called a hotspot it had to (1) have more than 3 translocations, (2) have at least one read from each of the two sides of the bait, (3) have at least 10% of the translocations come from each side of the bait, and (4) have a combined p value less than 10^-9 given the number of translocations and length of the region as determined by a negative binomial test. Hotspots with a large degree (>80%) of overlap with repeat regions, small footprints (<100 nt) or less than 10-fold enrichment over the AID-/- control were removed. Analyses of RNA-Seq, chromatin modifications, AID-, PolII-, and Spt5-ChIP as well as the identification of cryptic I-SceI sites, TSSs, genic and intergenic domains were carried out in R (http://www.R-project.org).
Hypermutation Analysis
CD43- splenocytes from IgkAID-Ung-/- or Aicda-/- mice were cultured at 0.1 × 10^6 cells/ml with LPS+IL-4, and 0.5 µg/ml of αCD180 (RP105) antibody (RP/14, BD PharMingen). At 72 hr cells were diluted 1:4 and cultured for another 48 hr. 50 ng of genomic DNA was amplified for 30 cycles with Phusion DNA polymerase (New England Biolabs) and specific primers (Table S8B). For nested PCR, two 20-cycle amplifications were performed with DMSO. The amplicon was cloned using PCR Zero blunt (Invitrogen) and sequenced.
ACCESSION NUMBERS
The TC-Seq datasets are deposited in SRA (http://www.ncbi.nlm.nih.gov/sra) under accession number SRA039959.
SUPPLEMENTAL INFORMATION
Supplemental Information includes four figures and eight tables and can be found with this article online at doi:10.1016/j.cell.2011.07.048.
ACKNOWLEDGMENTS
I.A.K. designed and performed experiments and analysis and wrote the manuscript. W.R. designed and performed data analysis. M.J. performed TC-Seq experiments. T.O. designed and performed data analysis. A.Y. and H.N. performed hypermutation sequencing and analysis. M.D.V. and A.B. assisted with TC-Seq experiments. D.F.R. assisted with TC-Seq experiments and contributed mice. A.N. made suggestions on the manuscript. R.C. and M.C.N. designed experiments and analysis and wrote the manuscript. We thank all the members of the Nussenzweig and Casellas labs for valuable input and advice; Klara Velinzon and Svetlana Mazel for FACSorting; and David Bosque and Thomas Eisenreich for animal management.
We also thank Scott Dewell of the Rockefeller Genomics Resource Center and Gustavo Gutierrez of the NIAMS genome facility for high-throughput sequencing and guidance; as well as Christopher Mason of the Weill Cornell Medical College for assistance with data analysis. I.A.K. was supported by NIH MSTP grant GM07739, and is a Cancer Research Institute Predoctoral Fellow and a William Randolph Hearst Foundation Fellow. A.B. is a Cancer Research Institute Predoctoral Fellow. This work was supported by NIH grant #AI037526 to M.C.N., NYSTEM #C023046, The Starr Cancer Consortium and the Intramural Research Program of the National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health. M.C.N. is an HHMI investigator. Received: May 24, 2011 Revised: July 14, 2011 Accepted: July 27, 2011 Published: September 29, 2011 REFERENCES
Aguilera, A. (2002). The connection between transcription and genomic instability. EMBO J. 21, 195–201.
Akasaka, T., Lossos, I.S., and Levy, R. (2003). BCL6 gene translocation in follicular lymphoma: a harbinger of eventual transformation to diffuse aggressive lymphoma. Blood 102, 1443–1448.
Argast, G.M., Stephens, K.M., Emond, M.J., and Monnat, R.J., Jr. (1998). I-PpoI and I-CreI homing site sequence degeneracy determined by random mutagenesis and sequential in vitro enrichment. J. Mol. Biol. 280, 345–353.
Bea, S., Salaverria, I., Armengol, L., Pinyol, M., Fernandez, V., Hartmann, E.M., Jares, P., Amador, V., Hernandez, L., Navarro, A., et al. (2009). Uniparental disomies, homozygous deletions, amplifications, and target genes in mantle cell lymphoma revealed by integrative high-resolution whole-genome profiling. Blood 113, 3059–3069.
Bothmer, A., Robbiani, D.F., Di Virgilio, M., Bunting, S.F., Klein, I.A., Feldhahn, N., Barlow, J., Chen, H.T., Bosque, D., Callen, E., et al. (2011). Regulation of DNA End Joining, Resection, and Immunoglobulin Class Switch Recombination by 53BP1. Mol. Cell 42, 319–329.
Branzei, D., and Foiani, M. (2010). Maintaining genome stability at the replication fork. Nat. Rev. Mol. Cell Biol. 11, 208–219.
Campbell, P.J., Stephens, P.J., Pleasance, E.D., O’Meara, S., Li, H., Santarius, T., Stebbings, L.A., Leroy, C., Edkins, S., Hardy, C., et al. (2008). Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nat. Genet. 40, 722–729.
Chaudhuri, J., and Alt, F.W. (2004). Class-switch recombination: interplay of transcription, DNA deamination and DNA repair. Nat. Rev. Immunol. 4, 541–552.
Cory, S., Graham, M., Webb, E., Corcoran, L., and Adams, J.M. (1985). Variant (6;15) translocations in murine plasmacytomas involve a chromosome 15 locus at least 72 kb from the c-myc oncogene. EMBO J. 4, 675–681.
Cremer, T., and Cremer, C. (2001). Chromosome territories, nuclear architecture and gene regulation in mammalian cells. Nat. Rev. Genet. 2, 292–301.
Di Noia, J.M., and Neuberger, M.S. (2007). Molecular mechanisms of antibody somatic hypermutation. Annu. Rev. Biochem. 76, 1–22.
Einerson, R.R., Law, M.E., Blair, H.E., Kurtin, P.J., McClure, R.F., Ketterling, R.P., Flynn, H.C., Dogan, A., and Remstein, E.D. (2006). Novel FISH probes designed to detect IGK-MYC and IGL-MYC rearrangements in B-cell lineage malignancy identify a new breakpoint cluster region designated BVR2. Leukemia 20, 1790–1799.
Franco, S., Gostissa, M., Zha, S., Lombard, D.B., Murphy, M.M., Zarrin, A.A., Yan, C., Tepsuporn, S., Morales, J.C., Adams, M.M., et al. (2006). H2AX prevents DNA breaks from progressing to chromosome breaks and translocations. Mol. Cell 21, 201–214.
Gauwerky, C.E., Huebner, K., Isobe, M., Nowell, P.C., and Croce, C.M. (1989). Activation of MYC in a masked t(8;17) translocation results in an aggressive B-cell leukemia. Proc. Natl. Acad. Sci. USA 86, 8867–8871.
Gesk, S., Klapper, W., Martin-Subero, J.I., Nagel, I., Harder, L., Fu, K., Bernd, H.W., Weisenburger, D.D., Parwaresch, R., and Siebert, R. (2006). A chromosomal translocation in cyclin D1-negative/cyclin D2-positive mantle cell lymphoma fuses the CCND2 gene to the IGK locus. Blood 108, 1109–1110.
Goldman, J.M., and Melo, J.V. (2003). Chronic myeloid leukemia–advances in biology and new approaches to treatment. N. Engl. J. Med. 349, 1451–1464.
Gordon, M.S., Kanegai, C.M., Doerr, J.R., and Wall, R. (2003). Somatic hypermutation of the B cell receptor genes B29 (Igbeta, CD79b) and mb1 (Igalpha, CD79a). Proc. Natl. Acad. Sci. USA 100, 4126–4131.
Gostissa, M., Yan, C.T., Bianco, J.M., Cogne, M., Pinaud, E., and Alt, F.W. (2009). Long-range oncogenic activation of Igh-c-myc translocations by the Igh 3′ regulatory region. Nature 462, 803–807.
Gottipati, P., and Helleday, T. (2009). Transcription-associated recombination in eukaryotes: link between transcription, replication and recombination. Mutagenesis 24, 203–210.
Honjo, T. (2002). Does AID need another aid? Nat. Immunol. 3, 800–801.
Huppi, K., Siwarski, D., Skurla, R., Klinman, D., and Mushinski, J.F. (1990). Pvt1 transcripts are found in normal tissues and are altered by reciprocal (6;15) translocations in mouse plasmacytomas. Proc. Natl. Acad. Sci. USA 87, 6964–6968.
Iida, S., Rao, P.H., Ueda, R., Chaganti, R.S., and Dalla-Favera, R. (1999). Chromosomal rearrangement of the PAX-5 locus in lymphoplasmacytic lymphoma with t(9;14)(p13;q32). Leuk. Lymphoma 34, 25–33.
Impera, L., Albano, F., Lo Cunsolo, C., Funes, S., Iuzzolino, P., Laveder, F., Panagopoulos, I., Rocchi, M., and Storlazzi, C.T. (2008). A novel fusion 5′ AFF3/3′ BCL2 originated from a t(2;18)(q11.2;q21.33) translocation in follicular lymphoma. Oncogene 27, 6187–6190.
Jankovic, M., Robbiani, D.F., Dorsett, Y., Eisenreich, T., Xu, Y., Tarakhovsky, A., Nussenzweig, A., and Nussenzweig, M.C. (2010). Role of the translocation partner in protection against AID-dependent chromosomal translocations. Proc. Natl. Acad. Sci. USA 107, 187–192.
Ramiro, A.R., Jankovic, M., Callen, E., Difilippantonio, S., Chen, H.T., McBride, K.M., Eisenreich, T.R., Chen, J., Dickins, R.A., Lowe, S.W., et al. (2006). Role of genomic instability and p53 in AID-induced c-myc-Igh translocations. Nature 440, 105–109.
Kuppers, R. (2005). Mechanisms of B-cell lymphoma pathogenesis. Nat. Rev. Cancer 5, 251–262.
Ramiro, A., Reina San-Martin, B., McBride, K., Jankovic, M., Barreto, V., Nussenzweig, A., and Nussenzweig, M.C. (2007). The role of activation-induced deaminase in antibody diversification and chromosome translocations. Adv. Immunol. 94, 75–107.
Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293. Liu, M., Duke, J.L., Richter, D.J., Vinuesa, C.G., Goodnow, C.C., Kleinstein, S.H., and Schatz, D.G. (2008). Two levels of protection for the B cell genome during somatic hypermutation. Nature 451, 841–845. Lu, D., and Yunis, J.J. (1992). Cloning, expression and localization of an RNA helicase gene from a human lymphoid cell line with chromosomal breakpoint 11q23.3. Nucleic Acids Res. 20, 1967–1972. Mahowald, G.K., Baron, J.M., Mahowald, M.A., Kulkarni, S., Bredemeyer, A.L., Bassing, C.H., and Sleckman, B.P. (2009). Aberrantly resolved RAG-mediated DNA breaks in Atm-deficient lymphocytes target chromosomal breakpoints in cis. Proc. Natl. Acad. Sci. USA 106, 18339–18344. Mao, X., Lillington, D., Child, F., Russell-Jones, R., Young, B., and Whittaker, S. (2002). Comparative genomic hybridization analysis of primary cutaneous B-cell lymphomas: identification of common genomic alterations in disease pathogenesis. Genes Chromosomes Cancer 35, 144–155. Mottok, A., Renne, C., Seifert, M., Oppermann, E., Bechstein, W., Hansmann, M.L., Kuppers, R., and Brauninger, A. (2009). Inactivating SOCS1 mutations are caused by aberrant somatic hypermutation and restricted to a subset of B-cell lymphoma entities. Blood 114, 4503–4506. Muramatsu, M., Kinoshita, K., Fagarasan, S., Yamada, S., Shinkai, Y., and Honjo, T. (2000). Class switch recombination and hypermutation require activation-induced cytidine deaminase (AID), a potential RNA editing enzyme. Cell 102, 553–563. Nakamura, Y., Takahashi, N., Kakegawa, E., Yoshida, K., Ito, Y., Kayano, H., Niitsu, N., Jinnai, I., and Bessho, M. (2008). 
The GAS5 (growth arrest-specific transcript 5) gene fuses to BCL6 as a result of t(1;3)(q25;q27) in a patient with B-cell lymphoma. Cancer Genet. Cytogenet. 182, 144–149. Nussenzweig, A., and Nussenzweig, M.C. (2010). Origin of chromosomal translocations in lymphoid cancer. Cell 141, 27–38. Okazaki, I.M., Hiai, H., Kakazu, N., Yamada, S., Muramatsu, M., Kinoshita, K., and Honjo, T. (2003). Constitutive expression of AID leads to tumorigenesis. J. Exp. Med. 197, 1173–1181. Pasqualucci, L., Neumeister, P., Goossens, T., Nanjangud, G., Chaganti, R.S., Kuppers, R., and Dalla-Favera, R. (2001). Hypermutation of multiple protooncogenes in B-cell diffuse large-cell lymphomas. Nature 412, 341–346. Pavri, R., Gazumyan, A., Jankovic, M., Di Virgilio, M., Klein, I., Ansarah-Sobrinho, C., Resch, W., Yamane, A., Reina San-Martin, B., Barreto, V., et al. (2010). Activation-induced cytidine deaminase targets DNA at sites of RNA polymerase II stalling by interaction with Spt5. Cell 143, 122–133. Peled, J.U., Kuang, F.L., Iglesias-Ussel, M.D., Roa, S., Kalis, S.L., Goodman, M.F., and Scharff, M.D. (2008). The biochemistry of somatic hypermutation. Annu. Rev. Immunol. 26, 481–511. Pleasance, E.D., Cheetham, R.K., Stephens, P.J., McBride, D.J., Humphray, S.J., Greenman, C.D., Varela, I., Lin, M.L., Ordonez, G.R., Bignell, G.R., et al. (2010a). A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463, 191–196. Pleasance, E.D., Stephens, P.J., O’Meara, S., McBride, D.J., Meynert, A., Jones, D., Lin, M.L., Beare, D., Lau, K.W., Greenman, C., et al. (2010b). A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463, 184–190.
Revy, P., Muto, T., Levy, Y., Geissmann, F., Plebani, A., Sanal, O., Catalan, N., Forveille, M., Dufourcq-Labelouse, R., Gennery, A., et al. (2000). Activation-induced cytidine deaminase (AID) deficiency causes the autosomal recessive form of the Hyper-IgM syndrome (HIGM2). Cell 102, 565–575. Riggi, N., and Stamenkovic, I. (2007). The Biology of Ewing sarcoma. Cancer Lett. 254, 1–10. Robbiani, D.F., Bothmer, A., Callen, E., Reina-San-Martin, B., Dorsett, Y., Difilippantonio, S., Bolland, D.J., Chen, H.T., Corcoran, A.E., Nussenzweig, A., et al. (2008). AID is required for the chromosomal breaks in c-myc that lead to c-myc/IgH translocations. Cell 135, 1028–1038. Robbiani, D.F., Bunting, S., Feldhahn, N., Bothmer, A., Camps, J., Deroubaix, S., McBride, K.M., Klein, I.A., Stone, G., Eisenreich, T.R., et al. (2009). AID produces DNA double-strand breaks in non-Ig genes and mature B cell lymphomas with reciprocal chromosome translocations. Mol. Cell 36, 631–641. Roix, J.J., McQueen, P.G., Munson, P.J., Parada, L.A., and Misteli, T. (2003). Spatial proximity of translocation-prone gene loci in human lymphomas. Nat. Genet. 34, 287–291. Rosebeck, S., Madden, L., Jin, X., Gu, S., Apel, I.J., Appert, A., Hamoudi, R.A., Noels, H., Sagaert, X., Van Loo, P., et al. (2011). Cleavage of NIK by the API2-MALT1 fusion oncoprotein leads to noncanonical NF-kappaB activation. Science 331, 468–472. Shen, H.M., Peters, A., Baron, B., Zhu, X., and Storb, U. (1998). Mutation of BCL-6 gene in normal B cells by the process of somatic hypermutation of Ig genes. Science 280, 1750–1752. Staszewski, O., Baker, R.E., Ucher, A.J., Martier, R., Stavnezer, J., and Guikema, J.E. (2011). Activation-induced cytidine deaminase induces reproducible DNA breaks at many non-Ig Loci in activated B cells. Mol. Cell 41, 232–242. Stavnezer, J., Guikema, J.E., and Schrader, C.E. (2008). Mechanism and regulation of class switch recombination. Annu. Rev. Immunol. 26, 261–292.
Stephens, P.J., McBride, D.J., Lin, M.L., Varela, I., Pleasance, E.D., Simpson, J.T., Stebbings, L.A., Leroy, C., Edkins, S., Mudie, L.J., et al. (2009). Complex landscapes of somatic rearrangement in human breast cancer genomes. Nature 462, 1005–1010. Storb, U., Shen, H.M., Longerich, S., Ratnam, S., Tanaka, A., Bozek, G., and Pylawka, S. (2007). Targeting of AID to immunoglobulin genes. Adv. Exp. Med. Biol. 596, 83–91. Takeuchi, Y., Horiuchi, T., and Kobayashi, T. (2003). Transcription-dependent recombination and the role of fork collision in yeast rDNA. Genes Dev. 17, 1497–1506. Thomas, B.J., and Rothstein, R. (1989). Elevated recombination rates in transcriptionally active DNA. Cell 56, 619–630. Tsai, A.G., and Lieber, M.R. (2010). Mechanisms of chromosomal rearrangement in the human genome. BMC Genomics 11 (Suppl 1), S1. Tsai, A.G., Lu, H., Raghavan, S.C., Muschen, M., Hsieh, C.L., and Lieber, M.R. (2008). Human chromosomal translocations at CpG sites and a theoretical basis for their lineage and stage specificity. Cell 135, 1130–1142.
Potter, M. (2003). Neoplastic development in plasma cells. Immunol. Rev. 194, 177–195.
Ueda, C., Akasaka, T., Kurata, M., Maesako, Y., Nishikori, M., Ichinohasama, R., Imada, K., Uchiyama, T., and Ohno, H. (2002). The gene for interleukin-21 receptor is the partner of BCL6 in t(3;16)(q27;p11), which is recurrently observed in diffuse large B-cell lymphoma. Oncogene 21, 368–376.
Rabbitts, T.H. (2009). Commonality but diversity in cancer gene fusions. Cell 137, 391–395.
Wang, J.H., Gostissa, M., Yan, C.T., Goff, P., Hickernell, T., Hansen, E., Difilippantonio, S., Wesemann, D.R., Zarrin, A.A., Rajewsky, K., et al. (2009). Mechanisms promoting translocations in editing and switching peripheral B cells. Nature 460, 231–236.
Wong, S., and Witte, O.N. (2004). The BCR-ABL story: bench to bedside and back. Annu. Rev. Immunol. 22, 247–306.
Yamane, A., Resch, W., Kuo, N., Kuchen, S., Li, Z., Sun, H.W., Robbiani, D.F., McBride, K., Nussenzweig, M.C., and Casellas, R. (2011). Deep-sequencing identification of the genomic targets of the cytidine deaminase AID and its cofactor RPA in B lymphocytes. Nat. Immunol. 12, 62–69.
Yan, C.T., Boboila, C., Souza, E.K., Franco, S., Hickernell, T.R., Murphy, M., Gumaste, S., Geyer, M., Zarrin, A.A., Manis, J.P., et al. (2007). IgH class switching and translocations use a robust non-classical end-joining pathway. Nature 449, 478–482.
Yoshida, S., Kaneita, Y., Aoki, Y., Seto, M., Mori, S., and Moriyama, M. (1999). Identification of heterologous translocation partner genes fused to the BCL6 gene in diffuse large B-cell lymphomas: 5′-RACE and LA-PCR analyses of biopsy samples. Oncogene 18, 7994–7999.
Zhang, Y., Gostissa, M., Hildebrand, D.G., Becker, M.S., Boboila, C., Chiarle, R., Lewis, S., and Alt, F.W. (2010). The role of mechanistic factors in promoting chromosomal translocations found in lymphoid and other cancers. Adv. Immunol. 106, 93–133.
Genome-wide Translocation Sequencing Reveals Mechanisms of Chromosome Breaks and Rearrangements in B Cells
Roberto Chiarle,1,2,7 Yu Zhang,1,7,* Richard L. Frock,1,7 Susanna M. Lewis,1,7 Benoit Molinie,3 Yu-Jui Ho,1 Darienne R. Myers,1 Vivian W. Choi,1 Mara Compagno,1,2 Daniel J. Malkin,1 Donna Neuberg,4 Stefano Monti,5,6 Cosmas C. Giallourakis,3,* Monica Gostissa,1,* and Frederick W. Alt1,*
1Howard Hughes Medical Institute, Immune Disease Institute, Program in Cellular and Molecular Medicine, Children’s Hospital Boston and Departments of Genetics and Pediatrics, Harvard Medical School, Boston, MA 02115, USA
2Department of Biomedical Sciences and Human Oncology and CERMS, University of Torino, 10126 Turin, Italy
3Gastrointestinal Unit, Center for Study of Inflammatory Bowel Disease, Massachusetts General Hospital, Boston, MA 02114, USA
4Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA
5Broad Institute, 5 Cambridge Center, Cambridge, MA 02142, USA
6Section of Computational Biomedicine, Boston University School of Medicine, Boston, MA 02118, USA
7These authors contributed equally to this work
*Correspondence: [email protected] (Y.Z.), [email protected] (C.C.G.), [email protected] (M.G.), [email protected] (F.W.A.)
DOI 10.1016/j.cell.2011.07.049
SUMMARY
Whereas chromosomal translocations are common pathogenetic events in cancer, mechanisms that promote them are poorly understood. To elucidate translocation mechanisms in mammalian cells, we developed high-throughput, genome-wide translocation sequencing (HTGTS). We employed HTGTS to identify tens of thousands of independent translocation junctions involving fixed I-SceI meganuclease-generated DNA double-strand breaks (DSBs) within the c-myc oncogene or IgH locus of B lymphocytes induced for activation-induced cytidine deaminase (AID)-dependent IgH class switching. DSBs translocated widely across the genome but were preferentially targeted to transcribed chromosomal regions. Additionally, numerous AID-dependent and AID-independent hot spots were targeted, with the latter comprising mainly cryptic I-SceI targets. Comparison of translocation junctions with genome-wide nuclear run-ons revealed a marked association between transcription start sites and translocation targeting. The majority of translocation junctions were formed via end-joining with short microhomologies. Our findings have implications for diverse fields, including gene therapy and cancer genomics.
INTRODUCTION
Recurrent oncogenic translocations are common in hematopoietic malignancies including lymphomas (Küppers and Dalla-Favera, 2001) and also occur frequently in solid tumors such
as prostate and lung cancers (Shaffer and Pandolfi, 2006). DNA double-strand breaks (DSBs) are common intermediates of these genomic aberrations (Stratton et al., 2009). DSBs are generated by normal metabolic processes, by genotoxic agents including some cancer therapeutics, and by V(D)J and immunoglobulin (Ig) heavy (H) chain (IgH) class switch recombination (CSR) in lymphocytes (Zhang et al., 2010). Highly conserved pathways repair DSBs to preserve genome integrity (Lieber, 2010). Nevertheless, repair can fail, resulting in unresolved DSBs and translocations. Recurrent translocations in tumors usually arise as low-frequency events that are selected during oncogenesis. However, other factors influence the appearance of recurrent translocations including chromosomal location of oncogenes (Gostissa et al., 2009). Chromosomal environment likely affects translocation frequency by influencing mechanistic factors, including DSB frequency at translocation targets, factors that contribute to juxtaposition of broken loci for joining, and mechanisms that circumvent repair functions that promote intrachromosomal DSB joining (Zhang et al., 2010). IgH CSR is initiated by DSBs that result from transcription-targeted AID-cytidine deamination activity within IgH switch (S) regions that lie just 5′ of various sets of CH exons. DSBs within the donor Sμ region and a downstream acceptor S region are fused via end-joining to complete CSR and allow expression of a different antibody class (Chaudhuri et al., 2007). Clonal translocations in human and mouse B cell lymphomas often involve IgH S regions and an oncogene, such as c-myc (Küppers and Dalla-Favera, 2001; Gostissa et al., 2011). In this regard, AID-generated IgH S region DSBs directly participate in translocations to c-myc and other genes (Franco et al., 2006; Ramiro et al., 2006; Wang et al., 2009).
Through its role in somatic hypermutation (SHM) of IgH and Ig light (IgL) variable region exons, AID theoretically might generate lower frequency DSBs in Ig loci that serve as translocation intermediates (Liu and Schatz, 2009). In addition, AID mutates many non-Ig genes in activated
B cells at far lower levels than Ig genes (Liu et al., 2008); such off-target AID activity also may contribute to translocations of non-Ig genes (Robbiani et al., 2008). Indeed, AID even has been suggested to initiate lesions leading to translocations in nonlymphoid cancers, including prostate cancer (Lin et al., 2009). However, potential roles of AID in generating DSBs genome-wide have not been addressed. In this regard, other sources of translocation-initiating DSBs could include intrinsic factors, such as oxidative metabolism, replication stress, and chromosome fragile sites, or extrinsic factors such as ionizing radiation or chemotherapeutics (Zhang et al., 2010). DSBs lead to damage response foci formation over 100 kb or larger flanking regions, promoting DSB joining and suppressing translocations (Zhang et al., 2010; Nussenzweig and Nussenzweig, 2010). IgH class switching in activated B cells can be mediated by yeast I-SceI endonuclease-generated DSBs without AID or S regions, suggesting general mechanisms promote efficient intrachromosomal DSB joining over at least 100 kb (Zarrin et al., 2007). In somatic cells, classical nonhomologous end-joining (C-NHEJ) repairs many DSBs (Zhang et al., 2010). C-NHEJ suppresses translocations by preferentially joining DSBs intrachromosomally (Ferguson et al., 2000). Deficiency for C-NHEJ leads to frequent translocations, demonstrating that other pathways fuse DSBs into translocations (Zhang et al., 2010). Correspondingly, an alternative end-joining pathway (A-EJ) that prefers ends with short microhomologies (MHs) supports CSR in the absence of C-NHEJ (Yan et al., 2007) and joins CSR DSBs to other DSBs to generate translocations (Zhang et al., 2010). Indeed, C-NHEJ suppresses p53-deficient lymphomas with recurrent IgH/c-myc translocations catalyzed by A-EJ (Zhu et al., 2002). Various evidence suggests that A-EJ may be translocation prone (e.g., Simsek and Jasin, 2010).
The mammalian nucleus is occupied by nonrandomly positioned genes and chromosomes (Meaburn et al., 2007). Fusion of DSBs to generate translocations requires physical proximity; thus, spatial disposition of chromosomes might impact translocation patterns (Zhang et al., 2010). Cytogenetic studies revealed that certain loci involved in oncogenic translocations are spatially proximal (Meaburn et al., 2007). Studies of recurrent translocations in mouse B cell lymphomas suggested that aspects of particular chromosomal regions, as opposed to broader territories, might promote proximity and influence translocation frequency (Wang et al., 2009). Nonrandom position of genes and chromosomes in the nucleus led to two general models for translocation initiation. “Contact-first” poses that translocations are restricted to proximally positioned chromosomal regions, whereas “breakage-first” poses that distant DSBs can be juxtaposed (Meaburn et al., 2007). In-depth evaluation of how chromosomal organization influences translocations requires a genome-wide approach. To elucidate translocation mechanisms, we developed approaches that identify genome-wide translocations arising from a specific DSB in vivo. Our studies isolate large numbers of translocations from primary B cells, which were activated for CSR, and provide a comprehensive analysis of the relationships among particular classes of DSBs, transcription, chromosome domains, and translocation events.
RESULTS
Development of High-Throughput Genomic Translocation Sequencing
We developed high-throughput genomic translocation sequencing (HTGTS) to isolate junctions between a chromosomal DSB introduced at a fixed site and other sequences genome-wide. Such junctions, other than those involving breaksite resection, mostly should result from end-joining of introduced DSBs to other genomic DSBs. Thus, HTGTS will identify other genomic DSBs capable of joining to the test DSBs. With HTGTS, we isolated from primary mouse B cells junctions that fused IgH or c-myc DSBs to sequences distributed widely across the genome (Figures 1A and 1B). We chose c-myc and IgH as targets because they participate in recurrent oncogenic translocations in B cell lymphomas. To generate c-myc- or IgH-specific DSBs, we employed an 18 bp canonical I-SceI meganuclease target sequence, which is absent in mouse genomes (Jasin, 1996). One c-myc target was a cassette with 25 tandem I-SceI sites, to increase cutting efficiency, within c-myc intron 1 on chromosome (chr) 15 (termed c-myc25xI-SceI; Figure 1C; Wang et al., 2009). For comparison, we employed an allele with a single I-SceI site in the same position (termed c-myc1xI-SceI) (Figure 1C; see Figures S1A–S1C available online). For IgH, we employed an allele with two I-SceI sites in place of endogenous Sγ1 (termed ΔSγ12xI-SceI) on chr12 (Zarrin et al., 2007). As a cellular model, we used primary splenic B cells activated in culture with αCD40 plus IL4 to induce AID, transcription, DSBs, and CSR at Sγ1 (IgG1) and Sε (IgE) during days 2–4 of activation. At 24 hr, we infected B cells with I-SceI-expressing retrovirus to induce DSBs at I-SceI targets (Zarrin et al., 2007). Cells were processed at day 4 to minimize doublings and potential cellular selection.
As high-titer retroviral infection can impair C-NHEJ (Wang et al., 2009), we also assayed B cells that express from their Rosa26 locus an I-SceI-glucocorticoid receptor fusion protein (I-SceI-GR) that can be activated via triamcinolone acetonide (TA) (Figure 1D; Figures S1D–S1F). The c-myc25xI-SceI cassette was frequently cut in TA-treated c-myc25xI-SceI/ROSAI-SceI-GR B cells (Figure S1G). We employed two HTGTS methods. For the adaptor-PCR approach (Figure 1E; Siebert et al., 1995), genomic DNA was fragmented with a frequently cutting restriction enzyme, ligated to an asymmetric adaptor, and further digested to block amplification of germline or unrearranged target alleles. We then performed nested PCR with adaptor- and locus-specific primers. Depending on the locus-specific PCR primers, one or the other side of the I-SceI DSB provides the “bait” translocation partner (Figure 1C), with the “prey” provided by DSBs at other genomic sites. As a second approach, we employed circularization PCR (Figure 1E) (Mahowald et al., 2009), in which enzymatically fragmented DNA was intramolecularly ligated and digested with blocking enzymes and nested PCR was performed with locus-specific primers. Following sequencing of PCR products, we aligned HTGTS junctions to reference genomes and scripted filters to remove artifacts from aligned databases. We experimentally controlled for potential background by generating HTGTS libraries from mixtures of human DNA and mouse DNA from activated I-SceI-infected c-myc25xI-SceI or ΔSγ12xI-SceI B
[Figure 1, panels A–E: Circos plots, allele maps, and method schematics are images and are not recoverable from the extracted text; see the caption.]
Figure 1F values (human DNA: K562; mouse DNA sources listed in the panel: c-myc WT, c-myc AID−/−, and ΔSγ1 WT):
method                 mouse junc.   human junc.   background
ad-PCR 25x I-SceI      2257          12            0.53%
ad-PCR 25x I-SceI      1957          11            0.56%
ad-PCR 2x I-SceI       1837          6             0.33%
circ-PCR 25x I-SceI    316           2             0.63%
Figure 1. High-Throughput Genomic Translocation Sequencing (A and B) Circos plots of genome-wide translocation landscape of representative c-myc (A) or IgH (B) HTGTS libraries. Chromosome ideograms comprise the circumference. Individual translocations are represented as arcs originating from specific I-SceI breaks and terminating at partner site. (C) Top: a cassette containing either 25 or one I-SceI target(s) was inserted into intron 1 of c-myc (see Figures S1A–S1C). Bottom: a cassette composed of a 0.5 kb spacer flanked by I-SceI targets replaced the IgH Sγ1 region. Relative orientation of I-SceI sites is indicated by red arrows. Position of primers for generation and sequencing of HTGTS libraries is shown. (D) An expression cassette for I-SceI fused to a glucocorticoid receptor (I-SceI-GR) was targeted into Rosa26 (see Figures S1D–S1G). The red fluorescent protein Tomato (tdT) is coexpressed via an IRES. (E) Schematic representation of HTGTS methods; left: circularization-PCR, right: adaptor-PCR. See text for details. (F) Background for HTGTS approaches, calculated as percent of artifactual human:mouse hybrid junctions when human DNA was mixed 1:1 with mouse DNA from indicated samples.
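The background percentages in Figure 1F can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that background is the number of human:mouse hybrid junctions divided by the number of mouse junctions, and that the four method labels pair with the counts in the order in which they appear in the panel. Dividing by the sum of both counts instead gives the same values after rounding to two decimals.

```python
# Junction counts as they appear in Figure 1F: (method, mouse junctions, hybrid junctions).
# The pairing of method labels with counts is assumed from their order in the panel.
LIBRARIES = [
    ("ad-PCR 25x I-SceI", 2257, 12),
    ("ad-PCR 25x I-SceI", 1957, 11),
    ("ad-PCR 2x I-SceI", 1837, 6),
    ("circ-PCR 25x I-SceI", 316, 2),
]


def background_percent(mouse_junctions: int, hybrid_junctions: int) -> float:
    """Percent of artifactual hybrid junctions relative to mouse junctions (assumed definition)."""
    if mouse_junctions <= 0:
        raise ValueError("mouse_junctions must be positive")
    return 100.0 * hybrid_junctions / mouse_junctions


if __name__ == "__main__":
    for method, mouse, hybrid in LIBRARIES:
        print(f"{method:22s} {background_percent(mouse, hybrid):.2f}%")
```

Every library stays below the 1% level quoted in the text.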
Cell 147, 107–119, September 30, 2011 ª2011 Elsevier Inc. 109
cells; junctions fusing mouse and human sequences were less than 1% of the total (Figure 1F). We identified nearly 150,000 independent junctions from numerous libraries from different mice (Table S1). Resulting genome-wide junction maps are shown either as colored dot plots of overall distribution of translocation numbers in selected size bins (useful for visualizing hot spots) or bar plots that compress hot spots and illustrate translocation site density. HTGTS yields an average of 1 unique junction/5 ng of DNA, corresponding to about one junction/1000 genomes. Major findings were reproduced with both HTGTS methods (e.g., Figure S2A). Moreover, while the largest portion of data was obtained with c-myc25xI-SceI alleles cut via retroviral I-SceI, major findings were reproduced via HTGTS from the c-myc25xI-SceI allele cleaved by I-SceI-GR and the c-myc1xI-SceI allele cleaved by retroviral I-SceI (Figures S2C and S2D).
Analysis of Genome-wide Translocations from c-myc DSBs
For HTGTS of c-myc25xI-SceI or c-myc1xI-SceI alleles, we used primers about 200 bp centromeric to the cassette (Figure 1C) to detect junctions involving broken ends (BEs) on the 5′ side of c-myc I-SceI DSBs (“5′ c-myc-I-SceI BEs”). Based on convention, prey sequences joined to 5′ c-myc-I-SceI BEs are in (+) orientation if read from the junction in centromere to telomere direction and in (−) orientation if read in the opposite direction (Figures S3A–S3D). Joins in which 5′ c-myc-I-SceI BEs are fused to resected 3′ c-myc-I-SceI BEs would be (+) (Figure S3A). Intrachromosomal joins to DSBs centromeric or telomeric to 5′ c-myc-I-SceI BEs would be (+) or (−) depending on the side of the second DSB to which they were joined, with potential outcomes including deletions, inversions, and extrachromosomal circles (Figures S3B and S3C). Junctions to DSBs on different chromosomes could be (+) or (−) and derivative chromosomes centric or dicentric (Figure S3D).
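The orientation convention described above can be expressed as a small helper. This is a minimal sketch, not the authors' pipeline: it assumes that reference coordinates increase from centromere to telomere (as for the acrocentric chromosomes of the mouse reference assembly) and that each prey alignment is summarized by two hypothetical fields, the junction-proximal coordinate and the coordinate of the far end of the aligned prey sequence.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def junction_orientation(junction_pos: int, prey_far_end: int) -> str:
    """Return '+' if the prey reads from the junction toward the telomere, else '-'.

    Assumes coordinates increase from centromere to telomere.
    """
    if prey_far_end == junction_pos:
        raise ValueError("prey alignment must span at least one base beyond the junction")
    return "+" if prey_far_end > junction_pos else "-"


def orientation_fractions(junctions: Iterable[Tuple[int, int]]) -> Dict[str, float]:
    """Fraction of junctions in each orientation for (junction_pos, prey_far_end) pairs."""
    counts = Counter(junction_orientation(pos, far) for pos, far in junctions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no junctions supplied")
    return {sign: counts[sign] / total for sign in ("+", "-")}
```

Applied to a library, `orientation_fractions` gives a quick view of whether (+) and (−) junctions are roughly balanced in a region of interest.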
Analyses of over 100,000 independent junctions from 5′ c-myc-I-SceI BEs from WT and AID−/− backgrounds revealed prey to be distributed widely throughout the genome with similar general distribution patterns (Figure 2; Figures S2B, S2E, and S2F). Other than 200 kb downstream of the bait DSB, intrachromosomal and interchromosomal junctions were evenly distributed into (+) and (−) orientation (Figure 2; Figure S3I). This finding implies that extrachromosomal circles and acentric fragments are represented similarly to other translocation classes, suggesting little impact of cellular selection on junction distribution. The junctions of 5′ c-myc-I-SceI BEs from c-myc25xI-SceI, c-myc1xI-SceI, and c-myc25xI-SceI/ROSAI-SceI-GR models were all consistent with end-joining, and most (75%–90%) had short junctional MHs (Table S1). WT and AID−/− HTGTS maps for 5′ c-myc-I-SceI BEs had other common features. First, the majority of junctions (75%) arose from joining 5′ c-myc-I-SceI BEs to sequences within 10 kb, with most lying 3′ of the breaksite (Figure 3A; Figure S4A). The density of joins remained relatively high within a region 200 kb telomeric to the breaksite (Figure 3A; Figure S4A). Notably, most junctions within this 200 kb region, but not beyond, were in the (+) orientation, consistent with joining to resected 3′ c-myc-I-SceI BEs (Figure 3A; Figure S4A). About 15% of junctions occurred within the region 100 kb centromeric to the breaksite. As these could not have resulted from resection (due
to primer removal), they may reflect the known propensity for joining intrachromosomal DSBs separated at such distances (Zarrin et al., 2007). Compared with other chromosomes, chr15 had a markedly high density of translocations along its 50 Mb telomeric portion and also a high density along its centromeric portion (Figure 2). Many chromosomes had smaller regions of relatively high or low translocation density, with such overall patterns conserved between WT and AID−/− backgrounds (Figure 2; Figures S2A–S2F). Finally, although the majority of hot spots were WT specific, a number were shared between WT and AID−/− backgrounds (see below).
Analysis of HTGTS Libraries from IgH DSBs
For HTGTS of the ΔSγ12xI-SceI alleles, we used primers about 200 bp telomeric to the I-SceI cassette (Figure 1C), allowing detection of junctions involving BEs on the 5′ side of Sγ1 I-SceI DSBs (“5′ Sγ1-I-SceI BEs”). Intra- and interchromosomal joins involving 5′ Sγ1-I-SceI BEs result in (+) or (−) junctions with the range of potential chromosomal outcomes including deletions, inversions, extrachromosomal circles, and acentrics (Figures S3E–S3H). We isolated and analyzed approximately 9000 and 8000 5′ Sγ1-I-SceI BE junctions from WT and AID−/− libraries, respectively (Figures S2G and S2H). Reminiscent of the 5′ c-myc-I-SceI junctions, about 75% of these junctions were within 10 kb of the breaksite, with a larger proportion on the 3′ side and predominantly in the (−) orientation, consistent with joining to resected 3′ Sγ1-I-SceI BEs (Figures S4B–S4D). Outside the breaksite region, the general 5′ Sγ1-I-SceI BE translocation patterns resembled those observed for 5′ c-myc-I-SceI BEs, with both (+) and (−) translocations occurring on all chromosomes (Figures S3J and S2G).
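The breaksite-proximal categories used in these analyses (within 10 kb of the breaksite, up to 200 kb telomeric, up to 100 kb centromeric) can be tallied with a simple classifier. The sketch below rests on stated assumptions: `BREAK_POS` is a hypothetical placeholder rather than the real chr15 coordinate, "within 10 kb" is read as ±10 kb, and larger coordinates are taken to be telomeric, as in the mouse reference assembly.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

BREAK_CHROM = "chr15"
BREAK_POS = 1_000_000  # hypothetical placeholder, not the actual breaksite coordinate


def classify_junction(chrom: str, pos: int,
                      break_chrom: str = BREAK_CHROM,
                      break_pos: int = BREAK_POS) -> str:
    """Assign a prey junction to a distance category relative to the bait breaksite."""
    if chrom != break_chrom:
        return "interchromosomal"
    offset = pos - break_pos  # positive values are telomeric under the stated assumption
    if abs(offset) <= 10_000:
        return "within_10kb"
    if 0 < offset <= 200_000:
        return "telomeric_200kb"
    if -100_000 <= offset < 0:
        return "centromeric_100kb"
    return "same_chrom_distal"


def category_fractions(junctions: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Fraction of junctions per category for (chrom, pos) pairs."""
    counts = Counter(classify_junction(chrom, pos) for chrom, pos in junctions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no junctions supplied")
    return {category: n / total for category, n in counts.items()}
```

The categories are checked in order, so a junction 5 kb telomeric of the breaksite counts as `within_10kb` and not as `telomeric_200kb`.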
Though we analyzed more limited numbers of 5′ Sγ1-I-SceI BE junctions (Table S2 and Figures S2G and S2H), the broader telomeric region of chr12 had a notably large number of hits, and within this region, there were IgH hot spots in WT, but not AID−/−, libraries (Figure 3B). Sμ and Sε are major targets of AID-initiated DSBs in B cells activated with αCD40/IL4. Correspondingly, substantial numbers of 5′ Sγ1-I-SceI BE junctions from WT, but not AID−/−, B cells joined to either Sμ or to Sε, which, respectively, lie approximately 100 kb upstream and downstream of the ΔSγ12xI-SceI cassette (Figure 3B; Figures S4B–S4D). These findings support the notion that DSBs separated by 100–200 kb can be joined at high frequency by general repair mechanisms (Zarrin et al., 2007). We also observed frequent junctions from WT libraries specifically within Sγ3, which lies about 20 kb upstream of the breaksite, a finding of interest as joining Sγ3 to donor Sμ DSBs during CSR in αCD40/IL4-activated B cells occurs at low levels (see below). Notably, in WT, but not in AID−/− libraries, we found numerous junctions within Sγ1 (Figure S4D), which is also targeted by AID in αCD40/IL4-activated B cells. As Sγ1 is present only on the non-targeted chr12 homolog due to the ΔSγ12xI-SceI replacement, these findings demonstrate robust translocation of 5′ Sγ1-I-SceI BEs to AID-dependent Sγ1 DSBs on the homologous chromosome, consistent with trans-CSR (Reynaud et al., 2005). Finally, while AID deficiency greatly reduced junctions into S regions, we observed a focal cluster of five 5′ Sγ1-I-SceI BE junctions in or near Sμ in AID−/− ΔSγ12xI-SceI libraries (Figure 3B; Figure S4C).
[Figure 2: genome-wide dot plot on chromosome ideograms 1–19 and X, drawn centromere (Cen) to telomere (Tel) with (−) orientation on the left and (+) orientation on the right; the image is not recoverable from the extracted text. See the caption.]
Hot spots labeled in the figure, by chromosome: chr1: Arid5a, Aff3; chr2: Rapgef1, Traf1, Mmp24, Bcl2l1; chr4: Pax5, Hivep3; chr6: Clec2d; chr7: Il4i1, Apbb1, Il4ra, Il21r; chr8: Fcer2a; chr9: Fli1, Kirrel3; chr10: Tnfaip3, Socs2; chr11: Sfi1, Bcl11a, Ebf1, Grap, Mpdu1; chr12: Rad51l1, Dync1h1, IgH; chr13: CD83, Mef2c; chr14: Fermt2; chr16: Lrrc33; chr17: Pim1, miR-715; chr18: Zfp608; chr19: Scd2; chrX: Gucy2f.
Figure 2. Genome-wide Distribution and Orientation of Translocations from c-myc DSBs Genome-wide map of translocations originating from the c-myc25xI-SceI cassette (chr15) in αCD40/IL4-activated and I-SceI-infected B cells. Single junctions are represented by dots located at corresponding chromosomal position. The dot scale is 2 Mb. Clusters of translocations are indicated with color codes, as shown in legend. (+) and (−) orientation junctions (see Figure S3) are plotted on right and left side of each ideogram, respectively. Hot spots (see Figure 4A) are listed in blue on top, with notation on the left side of chromosomes to indicate position. Data are from HTGTS libraries from seven different mice. Centromere (Cen) and telomere (Tel) positions are indicated. See also Figure S2.
Figure 3. Distribution of IgH- and c-myc-Proximal Junctions (A) Distribution of junctions around chr15 breaksite in the pooled c-myc25xI-SceI HTGTS library. Top: 10 kb around breaksite (represented as a split). Middle: 250 kb around breaksite (represented by red bar). Bottom: 2.5 Mb around breaksite. (+)- and (−)-oriented junctions are plotted on top and bottom of chromosome diagrams, respectively. (B and C) Distribution of translocation junctions at IgH in the pooled ΔSγ12xI-SceI (B) or c-myc25xI-SceI (C) HTGTS libraries. Translocations in WT (top) and AID−/− (bottom) B cells are shown. Positions of S regions within the 250 kb IgH CH region are indicated. Color codes are as in Figure 2. Dot size, position of centromere (red oval) and telomere (green rectangle), and orientation of the sequencing primer are indicated. See also Figure S4.
Most c-myc Translocation Hot Spots Are Targeted by AID

To identify 5′ c-myc-I-SceI BE translocation hot spots in an unbiased manner, we separated the genome into 250 kb bins and identified bins containing a statistically significant enrichment of translocations (Extended Experimental Procedures). This approach identified 55 hot spots in WT libraries and 15 in AID−/− libraries (Table S3; Figure 4A). Among the 43 most significant hot spots, 39 were in genes and 4 were in intergenic regions. Of these 43 hot spots, 21 were present at significantly greater levels in WT versus AID−/− backgrounds and were therefore classified as AID dependent, while 9 more were enriched (from 3- to 6-fold) in the WT background and were potentially AID dependent (Table S3; Figure 4A). The other 13 were equally represented between WT and AID−/− backgrounds (Table S3; Figure 4A). Of these 13, two exist in multiple copies (Sfi1 and miR-715), which may have contributed to their classification as hot spots (Quinlan et al., 2010; Ira Hall, personal communication); five reached hot spot significance in only one of the two backgrounds (Table S3; Figure 4A).
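The binning step described above can be illustrated with a minimal sketch. It counts junctions per 250 kb bin and flags bins that reach a hit threshold. The function name `find_hot_spot_bins`, the example coordinates, and the plain count threshold are assumptions made for illustration; the statistical enrichment test actually used is described in the Extended Experimental Procedures and is not reproduced here.

```python
from collections import Counter

BIN_SIZE = 250_000  # 250 kb bins, as stated in the text


def find_hot_spot_bins(junctions, min_hits=5, bin_size=BIN_SIZE):
    """Return {(chrom, bin_start): hits} for bins holding at least min_hits junctions.

    junctions: iterable of (chrom, position) tuples.
    min_hits is a simple count cutoff (an assumption standing in for the
    significance test used in the study).
    """
    counts = Counter(
        (chrom, (pos // bin_size) * bin_size) for chrom, pos in junctions
    )
    return {key: n for key, n in counts.items() if n >= min_hits}


# Hypothetical junction coordinates, for illustration only.
example = [("chr7", 132_700_000 + i * 1_000) for i in range(6)]
example.append(("chr4", 44_000_000))
hot = find_hot_spot_bins(example)
```

In this toy input, the six chr7 junctions fall into one bin and pass the cutoff, while the single chr4 junction does not.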
The Sμ, Sγ1, and Sε regions, which are targeted for CSR DSBs by αCD40/IL4 treatment, were by far the strongest AID hot spots for 5′ c-myc-I-SceI BEs, with other non-IgH AID-dependent hot spots ranging from 1% to 10% of Sμ levels (Figure 4A). Translocation specificity to these three S regions, which together comprise less than 20 kb, was striking; there were only a few junctions in the remainder of the CH locus, which includes four other S regions not substantially activated by αCD40/IL4 (Figure 3C). Notably, there was only one 5′ c-myc-I-SceI BE junction with Sγ3, even though Sγ3 was a marked hot spot for 5′ Sγ1-I-SceI BEs. In this regard, while AID-dependent DSBs in Sγ3 likely are much less frequent than in Sμ, Sγ1, and Sε under αCD40/IL4 stimulation conditions, Sγ3 DSBs may be favored targets of 5′ Sγ1-I-SceI BEs because of linear proximity. Finally, translocations occurred in Sμ and Sγ1 in AID−/− B cells at much lower levels than in WT, but frequently enough to qualify them as AID-independent hot spots (Figure 4A).

Several top AID SHM or binding targets in activated B cells (Liu et al., 2008; Yamane et al., 2011) were translocation hot spots for 5′ c-myc-I-SceI BEs, including our top three non-IgH hot spots (Il4ra, CD83, and Pim1) and probable AID-dependent translocation targets (e.g., Pax5 and Rapgef1) (Figure 4A; Table S3). We also identified other AID-dependent translocation hot spots, including the Aff3, Il21r, and Socs2 genes and a nonannotated intergenic transcript on chr4 (Gm12493; Figure 4A; Table S3). We confirmed the ability of such hot spots to translocate to the c-myc25xI-SceI cassette by direct PCR (Table S4). We conclude that AID not only binds and mutates numerous non-Ig target genes but also acts on them to cause DSBs and translocations.
Translocations Genome-wide Frequently Occur Near Active Transcription Start Sites

To quantify transcription genome-wide, we applied unbiased global run-on sequencing (GRO-seq; Core et al., 2008) to αCD40/IL4-activated, I-SceI-infected B cells. GRO-seq measures elongating Pol II activity and distinguishes transcription on both strands. For all analyses, we excluded junctions within 1 Mb of the c-myc breaksite to avoid biases from this dominant class of junctions. To analyze remaining junctions from WT and AID−/− backgrounds, we determined nearest transcription start sites (TSSs) and divided translocations based on whether or not the TSS had promoter-proximal activity based on GRO-seq (Extended Experimental Procedures). Strikingly, both WT and AID−/− junctions, when dominant IgH translocations were excluded, showed a distinct peak that reached a maximum about 300–600 bp on the sense side of the active TSSs and spanned from about 600 bp on the antisense side to about 1 kb on the sense side (Figures 4B and 4C). Translocation hot spot genes, including Il4ra, CD83, Gm12493, and Pim1, as well as potential hot spots including Pax5 and Bcl11a, had a substantial proportion of their translocations within 1–2 kb regions starting 200–400 bp in the sense direction from their bidirectional TSSs (Figures 5A and 5B). In one striking example of TSS-proximal translocation targeting, there were distinct translocation peaks downstream of the TSSs of Il4ra and Il21r, which lies just 20 kb downstream; yet, there were no detected translocations into the 3′ portion of Il4ra even though it was highly transcribed (Figure 5A). While lower level translocations into some AID hot spot genes in AID−/− mice had less correlation with TSS proximity (Figures 5A and 5B), the overall correlation of translocations and active TSSs appeared similar in WT and AID−/− mice (Figures 4B and 4C; Figures S5A and S5B). Together, our findings indicate a relationship between active TSSs and AID-dependent and AID-independent translocations genome-wide. In this context, we did not find a marked TSS correlation for translocations into nontranscribed genes (Figures 4B and 4C).
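The TSS-distance analysis described above can be sketched as follows. Each junction is assigned a signed distance to its nearest TSS (positive on the sense side, negative on the antisense side), and distances are binned at 100 bp intervals as stated in the Figure 4 legend. The function names, the `(position, strand)` TSS representation, and the example coordinates are assumptions for illustration, not the authors' pipeline.

```python
from bisect import bisect_left
from collections import Counter


def signed_distance_to_nearest_tss(pos, tss_list):
    """Signed distance (bp) from a junction to the nearest TSS on one chromosome.

    tss_list: non-empty list of (position, strand) tuples sorted by position,
    with strand "+" or "-". Positive values lie on the sense (downstream) side
    of the TSS, negative values on the antisense side.
    """
    positions = [p for p, _ in tss_list]
    i = bisect_left(positions, pos)
    # Only the TSS just before and just after the junction can be nearest.
    candidates = tss_list[max(i - 1, 0):i + 1]
    tss_pos, strand = min(candidates, key=lambda t: abs(pos - t[0]))
    offset = pos - tss_pos
    return offset if strand == "+" else -offset


def bin_distances(distances, bin_size=100):
    """Histogram of distances keyed by the lower edge of each bin."""
    return Counter((d // bin_size) * bin_size for d in distances)


# Hypothetical TSSs and junctions, for illustration only.
tss = [(1_000, "+"), (5_000, "-")]
junctions = [1_450, 4_800, 5_300]
distances = [signed_distance_to_nearest_tss(j, tss) for j in junctions]
histogram = bin_distances(distances)
```

Floor division keeps negative distances in the correct antisense bin (for example, a distance of -50 bp falls in the bin starting at -100).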
When the dominant IgH hot spots were included in the translocation/transcription analyses, the translocation peak shifted from about 300–600 bp to about 1.5 kb downstream of the TSS in the sense direction (compare Figures 4B and 4C to Figures S5C and S5D). In B cells, transcription through Sμ initiates from the V(D)J exon and Iμ exon promoters upstream of Sμ. B cell activation with αCD40/IL4 stimulates CSR between Sμ and Sγ1 or Sε by inducing AID and by activating Iγ1 and Iε promoters upstream of Sγ1 and Sε. Indeed, most translocations into germline CH genes in WT αCD40/IL4-activated B cells were tightly clustered 1–2 kb downstream in the 5′ portion of Sμ, Sγ1, and Sε, consistent with transcription robustly targeting AID to S regions (Figure 5C). Finally, AID-independent IgH translocations were scattered more broadly through S and C regions, suggesting that DSBs that initiate them arise by a different, AID-independent mechanism of S region instability (Figure 5C).

For 5′ c-myc-I-SceI BEs (outside the breaksite region), 55% of translocations were within genes, whereas genes account for only 36% of the genome (Table S5). Therefore, we asked whether translocations from 5′ c-myc-I-SceI BEs varied with gene density. For this purpose, we compared translocation densities to available gene density maps and to our GRO-seq transcription maps of all genes (Figure 6; Figures S6 and S7). Strikingly, translocation distribution was highly correlated with gene density and transcription level. In general, chromosomal regions with highest transcriptional activity had highest translocation density. In contrast, regions with very low or undetectable transcription generally were very low in translocations (Figure 6; Figures S6 and S7). Notably, we found no obvious regions with high overall transcription and low translocation levels, supporting a direct relationship between active transcription and translocation targeting genome-wide. In this context, we observed several robust AID-independent hot spot peaks that were relatively distant to the TSS and/or occurred in nonactive genes (Figures 4B and 4C, asterisks); these hot spots were generated by I-SceI activity at cryptic endogenous I-SceI sites as discussed next.

HTGTS Libraries Reveal Numerous Cryptic Genomic I-SceI Target Sites

Eleven AID-independent translocation targets for 5′ c-myc-I-SceI BEs were in genes and two were in intergenic regions (Table S3). Eight of these hot spot regions, in which junctions were tightly clustered, contained potential I-SceI-related sites, many of which were very near (within 50 bp) or actually contributed to translocation junctions. These putative cryptic I-SceI sites had from 1 to 5 divergent nucleotides with respect to the canonical 18 bp target site (Figure 7A). We scanned the mouse genome for potential cryptic I-SceI sites that diverged at up to three positions and identified ten additional sites that map within 400 bp of one or more 5′ c-myc-I-SceI BE translocation junctions (Figure 7A). In vitro I-SceI digestion of PCR-amplified genomic fragments demonstrated that all eight putative I-SceI targets at hot spots, and six of seven tested additional putative I-SceI targets, were bona fide I-SceI substrates (Figures 7A and 7B). We performed direct translocation PCRs with three selected cryptic I-SceI sites and confirmed I-SceI-dependent translocation to the c-myc25xI-SceI cassette (Figure 7C).
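A genome scan for near-matches to the canonical site, as described above, can be sketched with a sliding-window mismatch count on both strands. The 18 bp sequence below is the commonly cited I-SceI recognition sequence; the function names and the brute-force approach are assumptions for illustration (a real mouse-genome scan would need an indexed or vectorized search), and this is not the authors' implementation.

```python
CANONICAL_ISCEI = "TAGGGATAACAGGGTAAT"  # commonly cited 18 bp I-SceI site

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_cryptic_sites(sequence, target=CANONICAL_ISCEI, max_mismatches=3):
    """Return [(start, strand, mismatches)] for windows close to the target.

    Both strands are checked by comparing each window against the target and
    against its reverse complement. start is 0-based on the given sequence.
    """
    sequence = sequence.upper()
    k = len(target)
    motifs = (("+", target), ("-", reverse_complement(target)))
    hits = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        for strand, motif in motifs:
            mismatches = hamming(window, motif)
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return hits
```

The default of three mismatches mirrors the scan threshold given in the text; raising `max_mismatches` to five would cover the full divergence range reported for the hot spot sites, at the cost of many more candidate windows.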
Finally, GRO-seq analyses showed that five of eight cryptic I-SceI translocation hot spots were in transcriptionally silent areas and that two I-SceI-generated hot spots in transcribed genes were distant from the TSS (Figures 4B and 4C, asterisks; Figures 7D and 7E), highlighting the distinction between the I-SceI-generated hot spots and most other genomic translocation hot spots.

DISCUSSION

With HTGTS, we have identified the genome-wide translocations that emanate from DSBs introduced into c-myc or IgH in activated B cells. A substantial percentage of these translocations (80%–90%) join introduced DSBs to sequences on the same chromosome proximal to the breaksite, likely reflecting the strong preference for C-NHEJ to join DSBs intrachromosomally (Ferguson et al., 2000; Zarrin et al., 2007; Mahowald et al., 2009). The remaining 10%–20% translocate broadly across all chromosomes, with translocation density correlating with transcribed gene density. Translocations are most often near TSSs within individual genes. Despite c-myc and IgH DSBs translocating broadly, there are translocation hot spots, with the majority being generated by cellular AID activity and most of the rest by ectopically expressed I-SceI activity at cryptic genomic I-SceI target sequences. Notably, targeted DSBs join at similar levels to both (+) and (−) orientations of hot spot sequences, arguing against a role for cellular selection in their appearance. This finding also suggests that both sides of hot spot DSBs have similar opportunity to translocate to a DSB on another chromosome.
Figure 4. Identification of Specific and General Translocation Hot Spots
(A) Graph representing translocation numbers in frequently hit genes and non-annotated chromosomal regions. Only hot spots with more than five hits are shown and are ordered based on frequency of translocations in the pooled c-myc25xI-SceI/WT HTGTS library (top bars). Respective frequencies of translocations in the pooled c-myc25xI-SceI/AID−/− HTGTS library are displayed underneath (bottom bars). Green bars represent frequent hits involving cryptic I-SceI sites. Blue and yellow portions of top bars represent translocations found in c-myc1xI-SceI and c-myc25xI-SceI/ROSAI-SceI-GR libraries, respectively. Genes translocated in human and mouse lymphoma or leukemia are in red. The dashed line represents the cutoff for significance over random occurrence for each of the two groups (see Table S3). (B and C) Genome-wide distribution of translocations relative to TSSs. Junctions from c-myc25xI-SceI/WT (B) or c-myc25xI-SceI/AID−/− (C) libraries (excluding 2 Mb around the chr15 breaksite and IgH S regions) are assigned a distance to the nearest TSS and separated into “active” and “inactive” promoters as determined by GRO-seq. Translocation junctions are binned at 100 bp intervals. n represents the number of junctions within 20 kb (upper panels) or 2 kb (lower panels) of the TSS. Asterisks indicate cryptic genomic I-SceI sites. See also Figure S5.

Figure 5. Translocations Preferentially Occur Near TSSs
WT and AID−/− c-myc25xI-SceI HTGTS libraries were analyzed. In each panel, translocation junctions are in the first and second rows (WT and AID−/− as indicated). The third and fourth rows represent sense and antisense nascent RNA signals from GRO-seq. The IgH μ, γ1, and ε genes are shown in (C), the next most frequently hit hot spots in (A), and three selected oncogene hot spots in (B). The transcriptional start site (arrow) is at the bottom of each panel. The size of each genomic region and the number of junctions in each are shown.
[Figure 5 image not reproduced. Panels A–C; tracks: WT junctions, AID−/− junctions, sense and antisense GRO-seq signal; loci labeled in the figure: Il4ra, Il21r, Gm12493, CD83, Pim1, Pax5, Bcl11a, and IgH.]

The majority of HTGTS junctions from the c-myc I-SceI DSBs are mediated by end-joining and contain short MHs, reminiscent of joins in cancer genomes (Stratton et al., 2009) and consistent with roles for either (or both) C-NHEJ or A-EJ (Zhang et al., 2010). Recurrence of translocations in cancer genomes is a characteristic used to consider them as potential oncogenic “drivers.” Our HTGTS studies establish that many recurrent translocations form in the absence of selection and, thus, are caused by factors intrinsic to the translocation mechanism (Wang et al., 2009; Lin et al., 2009). HTGTS also provides a method to discover recurrent genomic DSBs, as evidenced by the ability of HTGTS to find known DSBs, such as AID-initiated DSBs in S regions, and previously unrecognized genomic I-SceI targets. HTGTS should be readily applicable for genome-wide screens for translocations and recurrent DSBs in a wide range of cell types.

AID Has a Dominant Role in Targeting Recurrent Translocations Genome-wide

Prior studies demonstrated that AID binds to and mutates non-Ig genes (Pasqualucci et al., 2001; Liu et al., 2008; Yamane et al., 2011). We find that AID also induces DSBs and translocations in non-Ig genes, with the peak of translocation junctions spanning the region of the TSS. Thus, processes closely associated with transcription and, potentially, transcriptional initiation may attract AID activity to these non-Ig gene targets, consistent with ectopically expressed AID mutating yeast promoter regions (Gómez-González and Aguilera, 2007). IgH translocation junctions mostly fall 1.5–2 kb downstream of the activated I region TSSs within S regions, which are known to be specialized AID targets. Thus, transcription through S regions attracts and focuses AID activity, at least in part via pausing mechanisms and by generating appropriate DNA substrates, such as R-loops, for this single-strand DNA-specific cytidine deaminase (Yu et al., 2003; Pavri and Nussenzweig, 2011; Chaudhuri et al., 2007). Notably, S regions still qualified as translocation hot spots for 5′ c-myc-I-SceI BEs in AID−/− B cells, supporting suggestions that these regions, perhaps via transcription, may be intrinsically prone to DSBs (Dudley et al., 2002; Kovalchuk et al., 2007; Unniraman et al., 2004). Given the differential targeting of CSR and SHM (Liu and Schatz, 2009), application of HTGTS to germinal center (GC) B cells, in which AID initiates SHM within variable region exons, may reveal novel AID genomic targets not observed in B cells activated for IgH CSR in culture, potentially including genes that could contribute to GC B cell lymphoma (Küppers and Dalla-Favera, 2001).

A General Role for Transcription and Transcription Initiation in Targeting Translocations

We find a remarkable genome-wide correlation between transcription and translocations even in AID−/− cells, with a peak of translocation junctions lying near active TSSs. In this context, while the majority of junctions were located in the sense transcriptional direction, junctions also occurred at increased levels close to the TSS on the antisense side (e.g., Figures 4B and 4C; Figure 5), correlating with focal antisense transcription in the immediate vicinity of active promoters (Core et al., 2008) (Figure 5). Notably, we observed a number of regions genome-wide that were quite low in or devoid of translocations and transcription, but few, if any, that were low in translocations but high in transcription (Figure 6). On the other hand, we found that transcription is not required for high-frequency translocations, since many I-SceI-dependent hot spots are in nontranscribed regions. Together, our observations are consistent with transcription mechanistically promoting translocations by promoting DSBs. Thus, our findings strongly support the long-standing notion of a mechanistic link between transcription, DSBs, and genomic instability (Aguilera, 2002; Haffner et al., 2011; Li and Manley, 2006).

Potential Influences of Genome Organization on Translocations

The high level of translocations of 5′ c-myc-I-SceI BEs to other sequences along much of the length of chr15, while generally correlated with transcription, is likely further promoted by the high relative proximity of many intrachromosomal regions (Lieberman-Aiden et al., 2009). Proximity might also contribute to the apparently increased frequency of translocation of 5′ c-myc-I-SceI BEs to certain regions of various chromosomes (e.g., Figure 2). In this regard, the relative frequencies of chr15 5′ c-myc-I-SceI BE translocations to the Sμ and Sε regions on chr12 were only 5- and 7-fold lower, respectively, than levels of intra-IgH 5′ Sγ1-I-SceI BE joins to Sμ and Sε (Figure 3C). Thus, even though DSBs are rare in c-myc, their translocation to IgH when they do occur is driven at a high rate by other mechanistic aspects, most likely proximal position (Wang et al., 2009). However, we also note that sequences lying in regions across all chromosomes translocate to DSBs in c-myc on chr15 and IgH on chr12, suggesting the possibility that, in some cases, DSBs might move into proximity before joining, perhaps during the cell cycle or via other mechanisms (e.g., Dimitrova et al., 2008).

Figure 6. Translocations Cluster to Transcribed Regions
Translocation density maps from pooled c-myc25xI-SceI/WT and c-myc25xI-SceI/AID−/− HTGTS libraries are aligned with combined sense and antisense nascent RNA signals for chr 15, 11, and 17 using the UCSC genome browser. Chromosome gene densities are displayed below GRO-seq traces. Chromosomal orientation from left to right is centromere (C) to telomere (T). See also Figures S6 and S7.
HTGTS Reveals an Unexpectedly Large Number of Genomic I-SceI Targets

Our HTGTS studies revealed 18 cryptic genomic I-SceI sites as translocation targets. There could potentially be more cryptic I-SceI sites; to find the full spectrum, bait sequences may need to be introduced into a variety of chromosomal locations to neutralize position effects. Beyond I-SceI, the HTGTS approach could readily be extended through the use of zinc finger nucleases (Händel and Cathomen, 2011), meganucleases (Arnould et al., 2011), or TALENs (Christian et al., 2010) designed to cleave specific endogenous sites, thereby obviating the need to introduce a cutting site and greatly facilitating the process. The above three classes of endonucleases are being developed for targeted gene correction of human mutations in stem cells for gene therapy. One major concern with such nucleases is relative activity on the specific target versus off-target activity, with the latter being difficult to assess. HTGTS provides a means for identifying off-target DSBs generated by such enzymes, for assessing the ability of such off-target DSBs to translocate, and for identifying the sequences to which they translocate.

Figure 7. Identification of Cryptic I-SceI Sites in the Mouse Genome by HTGTS
(A) Cryptic I-SceI site translocation targets. The canonical I-SceI recognition sequence is on top; nucleotides divergent from the consensus are in red. Chromosomal position and gene location of each cryptic site are indicated. “Hits” represent the total number of unique junctions in a 4 kb region centered around each site in the pool of all HTGTS libraries (see also Table S6). In vitro cutting efficiency, evaluated as in Extended Experimental Procedures, is indicated. NA, intergenic or not annotated; nd, not determined. (B) In vitro cutting of PCR products encompassing indicated cryptic I-SceI sites. C+, positive control: PCR fragment containing a canonical I-SceI site. U, uncut; I, I-SceI-digested. (C) PCR to detect translocations between c-myc25xI-SceI and cryptic I-SceI sites in Scd2, Dmrt1, and Mmp24 genes. (Top) Position of primers used for PCR amplification. (Middle) Average frequency of translocations ± SEM. (Bottom) Number of translocations/10⁵ cells from three independent c-myc25xI-SceI WT mice. (D) Transcription in genes containing I-SceI sites determined by GRO-seq. Translocation junctions are in the first (AID−/−) and second (WT) rows; sense and antisense nascent RNA signals are in the third and fourth rows. (E) Distance of cryptic I-SceI hot spots from the nearest TSS in pooled HTGTS libraries from WT and AID−/− c-myc25xI-SceI B cells.

EXPERIMENTAL PROCEDURES

Mouse Strains Utilized
ΔSγ12xI-SceI, c-myc25xI-SceI, and AID−/− mice were described (Zarrin et al., 2007; Wang et al., 2009; Muramatsu et al., 2000). c-myc1xI-SceI mice were generated similarly to c-myc25xI-SceI mice (see Extended Experimental Procedures). ROSAI-SceI-GR mice were generated by targeting an I-SceI-GR/IRES-tdTomato expression cassette into Rosa26 (Extended Experimental Procedures). All mice used were heterozygous for modified alleles containing I-SceI cassettes. The Institutional Animal Care and Use Committee of Children’s Hospital, Boston approved all animal work.

Splenic B Cell Purification, Activation in Culture, and Retroviral Infection
All procedures were performed as previously described (Wang et al., 2009). c-myc25xI-SceI/ROSAI-SceI-GR B cells were cultured in medium containing charcoal-stripped serum, and I-SceI-GR was activated with 10 μM triamcinolone acetate (TA, Sigma).

Generation of HTGTS Libraries
Genomic DNA was digested with HaeIII for c-myc25xI-SceI samples or MspI for ΔSγ12xI-SceI samples. For adaptor-PCR libraries, an asymmetric adaptor was ligated to cleaved genomic DNA. Ligation products were incubated with restriction enzymes chosen to reduce background from germline and unrearranged targeted alleles. Three rounds of nested PCR were performed with adaptor- and locus-specific primers. For circularization-PCR libraries, HaeIII- or MspI-digested genomic DNA was incubated at 1.6 ng/μl to favor intramolecular ligation, and samples were treated with blocking enzymes as above. Two rounds of nested PCR were performed with primers specific for sequences upstream of the I-SceI cassette. Libraries were sequenced by Roche-454. See Extended Experimental Procedures for details.

Data Analysis

Alignment and Filtering
Sequences were aligned to the mouse reference genome (NCBI37/mm9) with the BLAT program. Custom filters were used to purge PCR repeats and multiple types of artifacts, including those caused by in vitro ligation and PCR mispriming.

Hot Spot Identification
Translocations from WT or AID−/− libraries, minus those on chr15 or the IgH locus, were pooled. The adjusted genome was then divided into 250 kb bins, and bins containing five or more hits constituted a hot spot (details in Extended Experimental Procedures).

In Vitro Testing of Putative Cryptic I-SceI Sites
A genomic region encompassing each candidate I-SceI site was PCR-amplified, and 500 ng of purified products were incubated with 5 units of I-SceI for 3 hr. Reactions were separated on an agarose gel, and the relative intensity of uncut and I-SceI-digested bands was calculated with the FluorchemSP program (Alpha Innotech) (see Extended Experimental Procedures).

PCR Detection of Translocations between c-myc and Cryptic I-SceI Sites
Translocation junctions between c-myc and cryptic I-SceI targets were PCR-amplified according to the standard protocol (Wang et al., 2009). Primers and PCR conditions are detailed in Extended Experimental Procedures.

GRO-Seq
Nuclei were isolated from day 4 αCD40/IL4-stimulated and I-SceI-infected c-myc25xI-SceI B cells as described (Giallourakis et al., 2010). GRO-seq libraries were prepared from 5 × 10⁶ cells from two independent mice using a described protocol (Core et al., 2008). Both libraries were sequenced on the Hi-Seq 2000 platform with single-end reads and analyzed as described (see Extended Experimental Procedures). After filtering and alignment, we obtained 34,212,717 reads for library 1 and 15,913,244 reads for library 2. As results between libraries were highly correlated, we show results only from replicate 1.
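One way to compare two GRO-seq libraries of different depth, as in the replicate comparison above, is to normalize counts to reads per million (RPM) and compute a Pearson correlation on log-transformed values. The sketch below uses the two library sizes reported in the text; the per-gene counts, the log2(RPM + 1) transform, and the function names are assumptions for illustration and do not describe the authors' analysis.

```python
import math

LIB1_READS = 34_212_717  # aligned reads, library 1 (reported in the text)
LIB2_READS = 15_913_244  # aligned reads, library 2 (reported in the text)


def rpm(count, library_size):
    """Reads per million aligned reads."""
    return count * 1_000_000 / library_size


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical per-gene read counts in the two libraries.
counts_lib1 = [3400, 120, 15, 980, 0]
counts_lib2 = [1600, 50, 9, 470, 1]

log_lib1 = [math.log2(rpm(c, LIB1_READS) + 1) for c in counts_lib1]
log_lib2 = [math.log2(rpm(c, LIB2_READS) + 1) for c in counts_lib2]
r = pearson(log_lib1, log_lib2)
```

Normalizing by library size matters here because library 1 is roughly twice as deep as library 2, so raw counts are not directly comparable.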
SUPPLEMENTAL INFORMATION

Supplemental Information includes Extended Experimental Procedures, seven figures, and seven tables and can be found with this article online at doi:10.1016/j.cell.2011.07.049.

ACKNOWLEDGMENTS

We thank Barry Sleckman for providing unpublished information about circular PCR translocation cloning of RAG-generated DSBs. This work was supported by NIH grant 5P01CA92625 and a Leukemia and Lymphoma Society of America (LLS) SCOR grant to F.W.A., grants from AIRC and grant FP7 ERC-2009StG (Proposal No. 242965, “Lunely”) to R.C., an NIH KO8 grant AI070837 to C.C.G., and a V Foundation Scholar award to M.G. Y.Z. was supported by a CRI postdoctoral fellowship and R.L.F. by NIH training grant 5T32CA070083-13. F.W.A. is an Investigator of the Howard Hughes Medical Institute. F.W.A. is a member of the scientific advisory board of Cellectis Pharmaceuticals.

Received: June 1, 2011
Revised: July 22, 2011
Accepted: July 29, 2011
Published: September 29, 2011

REFERENCES

Aguilera, A. (2002). The connection between transcription and genomic instability. EMBO J. 21, 195–201.
Arnould, S., Delenda, C., Grizot, S., Desseaux, C., Pâques, F., Silva, G.H., and Smith, J. (2011). The I-CreI meganuclease and its engineered derivatives: applications from cell modification to gene therapy. Protein Eng. Des. Sel. 24, 27–31.
Chaudhuri, J., Basu, U., Zarrin, A., Yan, C., Franco, S., Perlot, T., Vuong, B., Wang, J., Phan, R.T., Datta, A., et al. (2007). Evolution of the immunoglobulin heavy chain class switch recombination mechanism. Adv. Immunol. 94, 157–214.
Christian, M., Cermak, T., Doyle, E.L., Schmidt, C., Zhang, F., Hummel, A., Bogdanove, A.J., and Voytas, D.F. (2010). Targeting DNA double-strand breaks with TAL effector nucleases. Genetics 186, 757–761.
Core, L.J., Waterfall, J.J., and Lis, J.T. (2008). Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848.
Dimitrova, N., Chen, Y.C., Spector, D.L., and de Lange, T. (2008). 53BP1 promotes non-homologous end joining of telomeres by increasing chromatin mobility. Nature 456, 524–528.
Dudley, D.D., Manis, J.P., Zarrin, A.A., Kaylor, L., Tian, M., and Alt, F.W. (2002). Internal IgH class switch region deletions are position-independent and enhanced by AID expression. Proc. Natl. Acad. Sci. USA 99, 9984–9989.
Ferguson, D.O., Sekiguchi, J.M., Chang, S., Frank, K.M., Gao, Y., DePinho, R.A., and Alt, F.W. (2000). The nonhomologous end-joining pathway of DNA repair is required for genomic stability and the suppression of translocations. Proc. Natl. Acad. Sci. USA 97, 6630–6633.
Franco, S., Gostissa, M., Zha, S., Lombard, D.B., Murphy, M.M., Zarrin, A.A., Yan, C., Tepsuporn, S., Morales, J.C., Adams, M.M., et al. (2006). H2AX prevents DNA breaks from progressing to chromosome breaks and translocations. Mol. Cell 21, 201–214.
Giallourakis, C.C., Franklin, A., Guo, C., Cheng, H.L., Yoon, H.S., Gallagher, M., Perlot, T., Andzelm, M., Murphy, A.J., Macdonald, L.E., et al. (2010). Elements between the IgH variable (V) and diversity (D) clusters influence antisense transcription and lineage-specific V(D)J recombination. Proc. Natl. Acad. Sci. USA 107, 22207–22212.
Gómez-González, B., and Aguilera, A. (2007). Activation-induced cytidine deaminase action is strongly stimulated by mutations of the THO complex. Proc. Natl. Acad. Sci. USA 104, 8409–8414.
Gostissa, M., Alt, F.W., and Chiarle, R. (2011). Mechanisms that promote and suppress chromosomal translocations in lymphocytes. Annu. Rev. Immunol. 29, 319–350.
Gostissa, M., Ranganath, S., Bianco, J.M., and Alt, F.W. (2009). Chromosomal location targets different MYC family gene members for oncogenic translocations. Proc. Natl. Acad. Sci. USA 106, 2265–2270.
Haffner, M., De Marzo, A.M., Meeker, A.K., Nelson, W.G., and Yegnasubramanian, S. (2011). Transcription-induced DNA double strand breaks: both an oncogenic force and potential therapeutic target? Clin. Cancer Res. 17, 3858–3864.
Händel, E.M., and Cathomen, T. (2011). Zinc-finger nuclease based genome surgery: it’s all about specificity. Curr. Gene Ther. 11, 28–37.
Jasin, M. (1996). Genetic manipulation of genomes with rare-cutting endonucleases. Trends Genet. 12, 224–228.
Kovalchuk, A.L., duBois, W., Mushinski, E., McNeil, N.E., Hirt, C., Qi, C.F., Li, Z., Janz, S., Honjo, T., Muramatsu, M., et al. (2007). AID-deficient Bcl-xL transgenic mice develop delayed atypical plasma cell tumors with unusual Ig/Myc chromosomal rearrangements. J. Exp. Med. 204, 2989–3001.
Küppers, R., and Dalla-Favera, R. (2001). Mechanisms of chromosomal translocations in B cell lymphomas. Oncogene 20, 5580–5594.
Lieber, M.R. (2010). The mechanism of double-strand DNA break repair by the nonhomologous DNA end-joining pathway. Annu. Rev. Biochem. 79, 181–211.
Li, X., and Manley, J.L. (2006). Cotranscriptional processes and their influence on genome stability. Genes Dev. 20, 1838–1847.
Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293.
Lin, C., Yang, L., Tanasa, B., Hutt, K., Ju, B.G., Ohgi, K., Zhang, J., Rose, D.W., Fu, X.D., Glass, C.K., and Rosenfeld, M.G. (2009). Nuclear receptor-induced chromosomal proximity and DNA breaks underlie specific translocations in cancer. Cell 139, 1069–1083.
Liu, M., and Schatz, D.G. (2009). Balancing AID and DNA repair during somatic hypermutation. Trends Immunol. 30, 173–181.
Liu, M., Duke, J.L., Richter, D.J., Vinuesa, C.G., Goodnow, C.C., Kleinstein, S.H., and Schatz, D.G. (2008). Two levels of protection for the B cell genome during somatic hypermutation. Nature 451, 841–845.
Mahowald, G.K., Baron, J.M., Mahowald, M.A., Kulkarni, S., Bredemeyer, A.L., Bassing, C.H., and Sleckman, B.P. (2009). Aberrantly resolved RAG-mediated DNA breaks in Atm-deficient lymphocytes target chromosomal breakpoints in cis. Proc. Natl. Acad. Sci. USA 106, 18339–18344.
Meaburn, K.J., Misteli, T., and Soutoglou, E. (2007). Spatial genome organization in the formation of chromosomal translocations. Semin. Cancer Biol. 17, 80–90.
Muramatsu, M., Kinoshita, K., Fagarasan, S., Yamada, S., Shinkai, Y., and Honjo, T. (2000). Class switch recombination and hypermutation require activation-induced cytidine deaminase (AID), a potential RNA editing enzyme. Cell 102, 553–563.
Nussenzweig, A., and Nussenzweig, M.C. (2010). Origin of chromosomal translocations in lymphoid cancer. Cell 141, 27–38.
Pasqualucci, L., Neumeister, P., Goossens, T., Nanjangud, G., Chaganti, R.S., Küppers, R., and Dalla-Favera, R. (2001). Hypermutation of multiple proto-oncogenes in B-cell diffuse large-cell lymphomas. Nature 412, 341–346.
Pavri, R., and Nussenzweig, M.C. (2011). AID Targeting in Antibody Diversity. Adv. Immunol. 110, 1–26.
Quinlan, A.R., Clark, R.A., Sokolova, S., Leibowitz, M.L., Zhang, Y., Hurles, M.E., Mell, J.C., and Hall, I.M. (2010). Genome-wide mapping and assembly of structural variant breakpoints in the mouse genome. Genome Res. 20, 623–635.
Ramiro, A.R., Jankovic, M., Callen, E., Difilippantonio, S., Chen, H.-T., McBride, K.M., Eisenreich, T.R., Chen, J., Dickins, R.A., Lowe, S.W., et al. (2006). Role of genomic instability and p53 in AID-induced c-myc-Igh translocations. Nature 440, 105–109.
Reynaud, S., Delpy, L., Fleury, L., Dougier, H.L., Sirac, C., and Cogné, M. (2005). Interallelic class switch recombination contributes significantly to class switching in mouse B cells. J. Immunol. 174, 6176–6183.
Robbiani, D.F., Bothmer, A., Callen, E., Reina-San-Martin, B., Dorsett, Y., Difilippantonio, S., Bolland, D.J., Chen, H.T., Corcoran, A.E., Nussenzweig, A., and Nussenzweig, M.C. (2008). AID is required for the chromosomal breaks in c-myc that lead to c-myc/IgH translocations. Cell 135, 1028–1038.
Shaffer, D.R., and Pandolfi, P.P. (2006). Breaking the rules of cancer. Nat. Med. 12, 14–15.
Siebert, P.D., Chenchik, A., Kellogg, D.E., Lukyanov, K.A., and Lukyanov, S.A. (1995). An improved PCR method for walking in uncloned genomic DNA. Nucleic Acids Res. 23, 1087–1088.
Simsek, D., and Jasin, M. (2010). Alternative end-joining is suppressed by the canonical NHEJ component Xrcc4-ligase IV during chromosomal translocation formation. Nat. Struct. Mol. Biol. 17, 410–416.
Stratton, M.R., Campbell, P.J., and Futreal, P.A. (2009). The cancer genome. Nature 458, 719–724.
Unniraman, S., Zhou, S., and Schatz, D.G. (2004). Identification of an AID-independent pathway for chromosomal translocations between the Igh switch region and Myc. Nat. Immunol. 5, 1117–1123.
Wang, J.H., Gostissa, M., Yan, C.T., Goff, P., Hickernell, T., Hansen, E., Difilippantonio, S., Wesemann, D.R., Zarrin, A.A., Rajewsky, K., et al. (2009). Mechanisms promoting translocations in editing and switching peripheral B cells. Nature 460, 231–236.
Yamane, A., Resch, W., Kuo, N., Kuchen, S., Li, Z., Sun, H.W., Robbiani, D.F., McBride, K., Nussenzweig, M.C., and Casellas, R. (2011). Deep-sequencing identification of the genomic targets of the cytidine deaminase AID and its cofactor RPA in B lymphocytes. Nat. Immunol. 12, 62–69.
Yan, C.T., Boboila, C., Souza, E.K., Franco, S., Hickernell, T.R., Murphy, M., Gumaste, S., Geyer, M., Zarrin, A.A., Manis, J.P., et al. (2007). IgH class switching and translocations use a robust non-classical end-joining pathway. Nature 449, 478–482.
Yu, K., Chedin, F., Hsieh, C.L., Wilson, T.E., and Lieber, M.R. (2003). R-loops at immunoglobulin class switch regions in the chromosomes of stimulated B cells. Nat. Immunol. 4, 442–445.
Zarrin, A.A., Del Vecchio, C., Tseng, E., Gleason, M., Zarin, P., Tian, M., and Alt, F.W. (2007). Antibody class switching mediated by yeast endonuclease-generated DNA breaks. Science 315, 377–381.
Zhang, Y., Gostissa, M., Hildebrand, D.G., Becker, M.S., Boboila, C., Chiarle, R., Lewis, S., and Alt, F.W. (2010). The role of mechanistic factors in promoting chromosomal translocations found in lymphoid and other cancers. Adv. Immunol. 106, 93–133.
Zhu, C., Mills, K.D., Ferguson, D.O., Lee, C., Manis, J., Fleming, J., Gao, Y., Morton, C.C., and Alt, F.W. (2002). Unrepaired DNA breaks in p53-deficient cells lead to oncogenic gene amplification subsequent to translocations. Cell 109, 811–821.
Cell 147, 107–119, September 30, 2011 ª2011 Elsevier Inc. 119
A DNA Repair Complex Functions as an Oct4/Sox2 Coactivator in Embryonic Stem Cells

Yick W. Fong,1 Carla Inouye,1 Teppei Yamaguchi,1 Claudia Cattoglio,1 Ivan Grubisic,2 and Robert Tjian1,3,*
1Howard Hughes Medical Institute, Department of Molecular and Cell Biology
2UC Berkeley-UCSF Graduate Program in Bioengineering
3Li Ka Shing Center for Biomedical and Health Sciences
University of California, Berkeley, Berkeley, CA 94720, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.08.038
Cell 147, 120–131, September 30, 2011 ©2011 Elsevier Inc.

SUMMARY
The transcriptional activators Oct4, Sox2, and Nanog cooperate with a wide array of cofactors to orchestrate an embryonic stem (ES) cell-specific gene expression program that forms the molecular basis of pluripotency. Here, we report using an unbiased in vitro transcription-biochemical complementation assay to discover a multisubunit stem cell coactivator complex (SCC) that is selectively required for the synergistic activation of the Nanog gene by Oct4 and Sox2. Purification, identification, and reconstitution of SCC revealed this coactivator to be the trimeric XPC-nucleotide excision repair complex. SCC interacts directly with Oct4 and Sox2 and is recruited to the Nanog and Oct4 promoters as well as a majority of genomic regions that are occupied by Oct4 and Sox2. Depletion of SCC/XPC compromised both pluripotency in ES cells and somatic cell reprogramming of fibroblasts to induced pluripotent stem (iPS) cells. This study identifies a transcriptional coactivator with diversified functions in maintaining ES cell pluripotency and safeguarding genome integrity.

INTRODUCTION
The molecular events leading to the maintenance of pluripotency in embryonic stem (ES) cells and reacquisition of a stem-like state in induced pluripotent stem (iPS) cells during somatic reprogramming represent mechanistically distinct processes that converge on a set of remarkably similar transcriptional events that underpin the pluripotent state. Both ES and iPS cells depend on fundamental transcription frameworks that are governed by a common set of “core” stem cell-specific transcription factors, namely Oct4, Sox2, and Nanog (Jaenisch and Young, 2008). These activators, in turn, collaborate with both ubiquitous and cell type-specific transcription factors to orchestrate complex gene expression programs that confer upon stem cells the
unique ability to safeguard stemness while remaining poised to execute a broad range of developmental programs that drive lineage specification (Boyer et al., 2005; Chen et al., 2008; Kim et al., 2008; Marson et al., 2008). Proper execution of these highly regulated processes by sequence-specific transcription factors often requires the coordinated recruitment of coactivator proteins to their cognate promoters. For example, transcriptional activators direct histone modifiers (e.g., CBP/p300) and chromatin remodelers (e.g., PBAF/BAF) to gene promoters to alter chromatin structure toward a state that is more permissive to transcriptional activation (Näär et al., 2001). Independent of chromatin, a variety of activators recruit other classes of coactivators, such as the multisubunit Mediator, various TBP/TAF complexes, SRC, etc., via direct protein-protein interactions to execute specific transcriptional programs. This class of coactivators often serves as molecular “adaptors” by bridging activators to the general transcription machinery, thereby mediating the synergistic response by these activators (Näär et al., 1999). Interestingly, subunits of Mediator have also been shown to interact with cohesin possibly to promote DNA looping and thereby facilitate long-distance interactions between enhancers and core promoters in vivo (Kagey et al., 2010). Indeed, such coactivators are often multifunctional and can activate transcription through chromatin-dependent as well as independent mechanisms. Further expanding the transcriptional repertoire of coactivator complexes, their protein levels and subunit compositions are frequently modulated in a developmental stage and cell type-specific manner (Roeder, 2005; Taatjes et al., 2004).
Additionally, these protein-protein-driven coactivator-activator transactions are often critical nodes in various signal transduction pathways and can serve as molecular “sensors” by integrating cell-intrinsic and -extrinsic cues, thereby coupling gene networks with specific cellular responses to produce complex biological programs of gene expression (Rosenfeld et al., 2006). Totipotent ES cells employ these same sets of coactivators in conjunction with special activators such as Oct4 and Sox2 to regulate transcription of a large number of genes, including Nanog, that form the molecular basis of pluripotency (Gao et al., 2008; Kagey et al., 2010; Kidder et al., 2009; Tutter et al., 2009). The transcription of Nanog is exquisitely dependent on
Oct4 and Sox2 (Kuroda et al., 2005; Rodda et al., 2005). However, coexpression of Oct4 and Sox2 failed to robustly activate a Nanog promoter reporter construct in differentiated cells like 293 or NIH 3T3 cells, even though Mediator, p300/CBP, and PBAF/BAF complexes remain abundantly expressed and active (Rodda et al., 2005). This led us to speculate that one or more as yet unidentified stem cell-specific cofactors may be required to activate the transcription of Nanog and other Oct4/Sox2 target genes in ES cells. Indeed, recent studies of germ cells and differentiated somatic cells revealed that even parts of the general transcriptional machinery may be radically altered in a tissue- or cell-specific context (Goodrich and Tjian, 2010; Müller et al., 2010). Diversification of the transcriptional apparatus may therefore represent a fundamental strategy, particularly in ES cells, to cope with the multidimensional nature of transcription programs that must be precisely tuned to both maintain pluripotency and, at the same time, allow for lineage-specific programs of differentiation (Liu et al., 2011). The human Nanog promoter contains a prototypic composite oct-sox cis-acting regulatory element located immediately upstream of the transcription start site that is conserved across several mammalian species (Kuroda et al., 2005; Rodda et al., 2005). A Nanog promoter-GFP reporter construct containing a DNA fragment encompassing this promoter-proximal oct-sox element is sufficient to recapitulate the robust expression pattern of endogenous Nanog in ES cells in an Oct4-, Sox2-dependent manner (Kuroda et al., 2005; Rodda et al., 2005). Unbiased genome-wide motif searching analyses of Oct4 in both mouse and human ES cells identified an oct-sox composite consensus sequence element, confirming that Oct4 likely orchestrates an ES-specific gene expression program primarily through cooperation with Sox2 (Chen et al., 2008; Loh et al., 2006).
Because the oct-sox cis-control element in the Nanog promoter represents a common configuration that is present in the promoters of many other Oct4- and Sox2-activated genes in ES cells, the well-characterized Nanog proximal promoter provided us with a useful model template for identifying uncharacterized transcriptional cofactors required for Oct4- and Sox2-directed activation. Therefore, we took advantage of a fully reconstituted in vitro transcription system in which one can unambiguously and systematically test and identify transcriptional cofactors that may be directly required to potentiate Oct4- and Sox2-dependent gene activation of Nanog. Here, we report the biochemical purification and identification of a multisubunit stem cell coactivator (SCC) that is required for the synergistic activation of Nanog by Oct4 and Sox2 in vitro. After extensive biochemical characterization, we surprisingly found that SCC is none other than the XPC-RAD23B-CETN2 (XPC) nucleotide excision repair (NER) complex. SCC/XPC interacts directly with Oct4 and Sox2 and co-occupies a majority of Oct4 and Sox2 targets genome-wide in mouse ES cells. Importantly, SCC/XPC is required for stem cell self-renewal and efficient somatic cell reprogramming. Thus, our findings unmask an unanticipated selective coactivator role of an NER complex in transcription in the context of ES cells and may provide a previously unknown molecular link that couples stem cell-specific transcription to DNA damage response with potential implications for enhanced ES cell genome stability.
RESULTS

Detection of an Oct4- and Sox2-Dependent Coactivator Activity in EC and ES Cells
Having chosen the Nanog promoter as our model template, we next set out to develop an in vitro reconstituted transcription assay that could recapitulate the Oct4- and Sox2-dependent transactivation at the Nanog promoter observed in vivo. To enhance the sensitivity of the assay, we inserted four copies of the Nanog oct-sox-binding sites immediately upstream of the native oct-sox element found in the human Nanog promoter. Our basal in vitro transcription assay consisted of purified recombinant TFIIA, -B, -E and -F together with immunoaffinity-purified native RNA polymerase II, TFIID, and TFIIH (Figure S1A available online). When purified Oct4 and Sox2 were added to this reconstituted transcription system, only a very weak activation of the Nanog promoter was detected (Figure 1A, lanes 1 and 2). As a control, we could show that the same complement of general transcription factors (GTFs) was able to support strong Sp1-dependent activation from a GC box-containing “generic” transcription template (G3BCAT) (Figure 1A, lanes 5 and 6). This initial result suggested that efficient activation of Nanog by Oct4 and Sox2 may require additional cofactors to potentiate a full activator-dependent response. We reasoned that such a putative coactivator ought to be selectively active in pluripotent cell types that express Nanog under the control of Oct4 and Sox2. For example, NTERA-2 (NT2) is a pluripotent human embryonal carcinoma (EC) cell line that expresses Oct4, Sox2, and Nanog and shares with ES cells core molecular mechanisms that govern self-renewal (Pal and Ravindran, 2006). Detailed expression profiling of NT2 and bona fide human ES cell lines revealed many similarities, including robust expression of Nanog (Schwartz et al., 2005; Sperger et al., 2003).
However, unlike human ES cells, NT2 cell culture can be more readily scaled up, a prerequisite to generating sufficient quantities of starting materials for the biochemical purification of putative Oct4/Sox2 coactivators. We therefore chose extracts derived from NT2 cells as our starting material in our efforts to develop a “biochemical complementation” assay to hunt for pluripotent stem cell-selective cofactors. We first fractionated NT2 nuclear extracts by conventional phosphocellulose ion exchange chromatography. Next, we supplemented our “basal” reconstituted transcription reactions with various salt-eluted fractions from the phosphocellulose column to see whether there was any activity that could restore Oct4/Sox2-dependent activation of our Nanog promoter. This strategy allowed us to unmask an activity in the high salt phosphocellulose fraction (P1M) prepared from NT2 nuclear extracts (but not HeLa extracts) (Figure S1B) that strongly potentiated transcription of the Nanog promoter in an Oct4- and Sox2-dependent manner using either a naked (Figure 1A, lanes 3 and 4) or a Nanog chromatin template assembled with a crude Drosophila cytosolic extract (data not shown). This new cofactor activity is selectively required for transcription of Nanog, as it had no effect on either basal- or Sp1-activated transcription from a control G3BCAT template (Figure 1A, lanes 5–8). Importantly, this P1M fraction also stimulated the Oct4/Sox2-dependent transcription from a native Nanog promoter template (Figure 1B),
[Figure 1, panels A–G: in vitro transcription assays on NanogCAT and G3BCAT templates, and immunoblots probed with α-OCT4, α-BRG1, α-MED23, and α-MED7]
Figure 1. Transcriptional Activation of Nanog by Oct4 and Sox2 Requires a Stem Cell-Specific Cofactor
(A) Reconstituted in vitro transcription reactions supplemented with Oct4 and Sox2 (lanes 2 and 4) or Sp1 (lanes 6 and 8) plus a phosphocellulose 1 M KCl fraction derived from NT2 nuclear extracts (NT2 P1M, lanes 3, 4, 7, and 8) and programmed with either a Nanog template engineered with four extra copies of the oct-sox composite element (NanogCAT, lanes 1–4), or a GC box-containing template (G3BCAT, lanes 5–8). Oct4/Sox2, NT2 P1M-dependent transcripts are indicated by filled arrowheads and Sp1-dependent transcripts by open arrowheads.
(B) Transcription of the native Nanog promoter requires Oct4, Sox2, and NT2 P1M fraction (lane 4).
(C) TFIID and NT2 P1M fraction are needed to potentiate Oct4/Sox2-dependent activation. Transcription reactions contain Oct4 and Sox2 (lanes 1–6), NT2 P1M fraction (lanes 2, 4, and 6) with increasing amounts of recombinant TBP (1× or 2×, lanes 1–4), or TFIID (lanes 5 and 6).
(D) Synergistic activation of Nanog by Oct4 and Sox2 requires P1M fractions prepared from NT2 or mouse ES cell line D3 nuclear extracts. In vitro transcription reactions contain equal amounts (0.7 μg) of NT2 (lanes 3–6) or D3 P1M fractions (lanes 7–10), with Oct4 alone (lanes 4 and 8), Sox2 alone (lanes 5 and 9), or both activators (lanes 2, 6, and 10).
(E) Immunoblotting analysis of Oct4 levels in whole-cell extracts (WCE) prepared from pluripotent D3 cells (D3, lane 1) and cells treated with retinoic acid for 6 days (RA, lane 2).
(F) P1M fractions prepared from pluripotent (D3, lanes 1 and 2) and differentiated (RA, lanes 3 and 4) D3 nuclear extracts were added to transcription reactions with or without Oct4 and Sox2.
(G) Western blots (2-fold titration) of P1M fractions prepared from pluripotent (D3) and differentiated (RA) D3 nuclear extracts using anti-BRG-1, anti-MED23, and anti-MED7 antibodies. Asterisk indicates a nonspecific band or a breakdown product recognized by anti-MED7 antibody.
See also Figure S1.
as well as two other Oct4/Sox2-dependent templates derived from the mouse Fbxo15 promoter (Tokuzawa et al., 2003) (mFbxo15CAT) (Figure S1C, lanes 1–4) and the human HESX1 promoter (Chakravarthy et al., 2008) (HESX1CAT) (Figure S1C, lanes 5–8). Thus, our in vitro complementation assay programmed with naked DNA templates revealed at least one potential coactivator activity that directs Oct4/Sox2-dependent activation of Nanog. We decided to pursue characterization of this cofactor that does not appear to require chromatin-based functions. To the best of our knowledge, this finding also demonstrates for the first time a fully reconstituted, in vitro transcription system that can faithfully recapitulate stem cell-specific gene activation. We next investigated the relative requirements for other cofactors in our assay system. Consistent with previous studies demonstrating that TAFs in the TFIID complex are often required for transcriptional activation by a variety of activators, including nuclear receptors (Lemon et al., 2001), Sp1 (Ryu et al., 1999), and SREBP-1 (Näär et al., 1998), substituting holo-TFIID with recombinant human TBP resulted in a near complete loss of activation by Oct4 and Sox2 (Figure 1C). The very weak residual activation that we see using TBP (Figure 1C, lanes 2 and 4) is most likely due to trace amounts of TFIID present in the NT2 P1M fraction (data not shown). These findings suggest that TAFs/holo-TFIID and the putative cofactor detected in the NT2 P1M fraction are both required for optimal transcription of Nanog elicited by Oct4 and Sox2. Interestingly, in this reconstituted system, the addition of CRSP/Mediator complex was not required to obtain robust Oct4/Sox2 activation at the Nanog promoter.
However, it is likely that some CRSP/Mediator is present in the P1M fraction, and it remains possible that some other component of the reconstituted system (i.e., Pol II) may have some residual amount of CRSP/Mediator contamination (Näär et al., 2002). We found, however, that adding purified CRSP/Mediator instead of the NT2 P1M factor to these reactions completely failed to enhance Oct4/Sox2-dependent activation of Nanog transcription (Figure S1D). This finding indicates that the NT2 cofactor must be distinct from Mediator. Furthermore, addition of other transcriptional activators implicated in Nanog expression (i.e., Nanog, Sall4 [Zhang et al., 2006], Klf4 [Jiang et al., 2008] and Esrrb [van den Berg et al., 2008; Zhang et al., 2008]) also did not replace or enhance Oct4/Sox2-dependent transcription of Nanog in vitro (Figure S1E). To confirm that this newly detected cofactor activity in NT2 cells is also present in bona fide ES cells, P1M fractions were prepared from the pluripotent D3 mouse ES cell line and assayed for transcription. We found that the D3 P1M fraction was as active as the NT2 P1M fraction in potentiating Oct4/Sox2-activated transcription of Nanog (Figure 1D, compare lane 2 to 6 and 10). Interestingly, the highest levels of transactivation by the NT2 or D3 P1M fractions were observed only when both activators were added to the transcription reaction, whereas no activation was detected with Oct4 alone and a moderate level of activation was seen with Sox2 alone (Figure 1D, lanes 3–10). Apparently, this cofactor mediates the synergistic activation of Nanog by Oct4 and Sox2. If, as we postulated, this new coactivator functions selectively in pluripotent cells, one might expect that its presence or activity would need to be downregulated
upon differentiation, as is the case for Oct4. To investigate whether the cofactor activity is restricted to the pluripotent state of ES cells, D3 cells were induced to differentiate by removal of LIF and treatment with retinoic acid (RA). The extent of differentiation was monitored by the loss of Oct4 expression that was complete after 6 days (Figure 1E). Nuclear extracts and P1M fractions were then prepared from D3 cells before and after differentiation. When compared to pluripotent D3 P1M fractions, an equivalent amount of P1M fraction prepared from differentiated D3 nuclear extracts showed significantly decreased cofactor activity in our in vitro transcription assay (Figure 1F, compare lanes 1 and 3). This decrease is not due to a wholesale loss of transcription factors and other cofactors during stem cell differentiation because the levels of PBAF/BAF (BRG-1) and the Mediator complex (MED23 and MED7) were largely unchanged in the two extracts (Figure 1G).

[Figure 2, panels A–E: chromatography scheme (ammonium sulfate precipitation, P11, Ni-NTA, Poros-HQ, Poros-HS, HAP, Superose 6, Poros-HE, Mono S), transcription assays of Poros-HQ, Superose 6, and Mono S fractions, and silver-stained gel of Mono S fractions]
Figure 2. Purification of Stem Cell Coactivator
(A) Chromatography scheme for partial purification of Q0.3 and purification of SCC from NT2 nuclear extracts (NT2 NE). NT2 NE is first subjected to ammonium sulfate precipitation (55% saturation) followed by a series of chromatographic columns as indicated.
(B) Buffer (-) and fractions containing SCC eluted from a Poros-HQ anion exchanger (top) assayed in the presence of Oct4 and Sox2 in in vitro transcription assays.
(C) Coactivator SCC migrates as a large complex. Input (IN), buffer (-), and Superose 6 fractions (top) assayed as in (B) except that all reactions are supplemented with Q0.3 (A). Mobilities of peak activity (500–700 kDa) and gel filtration protein standards are shown (bottom).
(D) Transcription profile of stem cell coactivator (SCC) activity after the final Mono S chromatography step. Reactions contain input (IN) and Mono S fractions (top) and are assayed as in (C).
(E) Silver-stained SDS-PAGE gel of the active Mono S fractions. Filled arrowheads indicate polypeptides that comigrate with SCC activity.
See also Figure S2.
Purification and Identification of a Stem Cell Coactivator
Starting with 200–400 L of NT2 cells, we were able to separate the cofactor activity into two distinct chromatographic fractions. One cofactor activity eluted from an anion exchanger (Poros-HQ) at 0.3 M KCl (Q0.3; data not shown), whereas a second distinct activity eluted at 0.6 M KCl (stem cell coactivator [SCC]) (Figures 2A and 2B). Full synergistic Oct4/Sox2-dependent activation of Nanog required both fractions in our in vitro reconstituted transcription reactions (Figure S2). Using this biochemical complementation system, we sequentially purified the more robust activity, SCC, over eight chromatographic columns, resulting in a >50,000-fold increase in specific activity (Figure 2A). Because SCC activity migrated with an apparent native molecular mass (Mr) of 600 kDa during size-exclusion chromatography (Figure 2C), it seemed likely that this coactivator was a multiprotein complex. Accordingly, SDS-polyacrylamide gel electrophoresis (SDS-PAGE) of the most purified Mono S fractions revealed a distinct pattern of four major polypeptides (along with multiple breakdown products) that consistently copurified with the SCC activity (Figures 2D and 2E). For the remainder of this report, we focus on the identification and functional characterization of SCC in vitro and in vivo. To identify polypeptides comprising the SCC complex, peak Mono S-purified fractions were pooled and separated by SDS-PAGE. Surprisingly, tryptic digests of excised gel bands followed by high-sensitivity mass spectrometry revealed all detectable constituents of SCC to be none other than the Xeroderma pigmentosum group C (XPC)-RAD23B-Centrin 2 (CETN2) nucleotide excision repair (NER) complex (Araki et al., 2001) (Figure 3A). We next carried out western blot analysis with antibodies specific to XPC, RAD23B, and CETN2 to confirm the identities of the purified SCC subunits (Figure 3B).
As expected, these three polypeptides were highly enriched in the purified SCC Mono S peak fractions when compared to the crude NT2 P1M fraction (Figure 3B). Because identification of SCC as being identical to the XPC-NER complex was so unexpected, particularly as this repair complex has not been associated with any cell type-specific function nor linked to stem cell transcription, we next wanted to compare the relative amounts of this factor in different cell types. Consistent with the notion that SCC may be functioning in an unusual way in pluripotent stem cells, we found that these three proteins are highly enriched in ES and EC cells. For example, the levels of XPC, RAD23B, and CETN2 in the NT2 P1M fraction are much higher than in an equivalent amount of P1M fraction prepared from HeLa nuclear extracts (Figure 3B). Accordingly, in in vitro transcription reactions, Oct4/Sox2-dependent activation of Nanog by HeLa P1M fraction is much lower than that of NT2 P1M fraction (Figure S1B). XPC and RAD23B were rapidly downregulated upon RA-induced differentiation of mouse D3 ES cells, whereas CETN2, components of the basal transcription machinery (TBP and TFIIE), and other NER factors (XPA and XPB) decreased only slightly while
[Figure 3, panels A–C: mass spectrometry identification of XPC, RAD23B, and CETN2; immunoblots of HeLa P1M, NT2 P1M, and purified SCC fractions; immunoblots of D3 whole-cell extracts at 0, 1, 3, 7, and 10 days of RA treatment]
Figure 3. SCC Is the XPC-RAD23B-CETN2 Nucleotide Excision Repair Complex
(A) Mass spectrometry analysis of Mono S peak activity fractions (16–18) in Figure 2E with protein identities indicated.
(B) SCC is highly enriched in NT2 P1M fraction. Comparative western blot analysis of HeLa and NT2 P1M fractions (1.5 μg each) and purified Mono S SCC fraction (Purif, 30 ng) using anti-XPC, anti-RAD23B, and anti-CETN2 antibodies.
(C) Downregulation of XPC and RAD23B upon RA-induced differentiation of mouse D3 ES cells. Western blot analysis of whole-cell extracts prepared from D3 cells (D3 WCE) collected at indicated days post-RA treatment using antibodies against XPC, RAD23B, CETN2, OCT4, XPB, XPA, TFIIEβ, TBP, and loading control β-actin (ACTB).
the loading control β-actin remained unchanged (Figure 3C). This finding is consistent with our previous observation that the D3 P1M fraction from differentiated cells is significantly less active than the pluripotent D3 P1M fraction in potentiating Nanog transcription (Figure 1F).

Reconstitution and Mechanism of Coactivation by SCC
While we were in the process of further characterizing the role of the XPC-RAD23B-CETN2 complex in transcription, Le May et al. reported that XPC and other components of the NER apparatus can be recruited to a gene promoter (e.g., RARβ2) upon nuclear hormone induction (Le May et al., 2010). Although the mechanism by which XPC and other NER factors mediate gene activation remains unclear, these recent studies and our new findings have unmasked a hitherto unknown and potentially important role for XPC that is directly linked to transcription. In our case, the most striking finding was the direct requirement for the SCC/XPC complex in selectively potentiating the transcriptional activation of Nanog by Oct4 and Sox2 in ES cell extracts. However, to more firmly establish this exciting new connection, we first needed to eliminate the possibility that trace amounts of contaminants present in our purified SCC fraction were responsible for the coactivator activity detected in our in vitro transcription assays. Therefore, we set about to reconstitute the heterotrimeric XPC-RAD23B-CETN2 complex from recombinant gene products expressed in insect (Sf9) cells following co-infection with baculoviruses expressing His-tagged XPC, FLAG-tagged RAD23B, and untagged CETN2. Using an efficient two-step affinity purification procedure, we were able to purify the recombinant heterotrimeric complex to near homogeneity (Figure 4A). Our ability to generate pure polypeptide subunits, as well as various combinations of dimeric and trimeric complexes, allowed us to address a number of important questions, such as whether known functional domains of XPC required for NER are also necessary for the cofactor activity. It is well established that XPC’s ability to interact nonspecifically with DNA is essential for its NER function. Indeed, a single point mutation in the DNA-binding domain (W690S) of XPC, identified in an XP patient (XP13PV), abolishes binding to damaged (and undamaged) DNA and is defective in repair in vivo and in vitro (Maillard et al., 2007; Yasuda et al., 2007). To address whether XPC’s nonspecific DNA-binding activity is also important for its coactivator function, a mutant DNA-binding-defective XPC (W690S) complex (that had been independently confirmed to be compromised for DNA binding in vitro; Figures S3A and S3B) was reconstituted in Sf9 cells and tested along with the wild-type complex for their ability to support Oct4/Sox2-dependent transcriptional activation of Nanog in vitro. Surprisingly, both the recombinant wild-type and mutant complexes exhibited specific activities for coactivation comparable to that observed for purified native endogenous SCC from NT2 cells (Figure 4B). Taken together, these results confirm that the XPC-RAD23B-CETN2 complex is indeed SCC and suggest that its DNA binding (and repair) activity is dispensable and functionally separable from its transcriptional cofactor activity at least in vitro. It has also been reported that XPC can interact directly with TFIIH (Uchida et al., 2002) and thus might provide a DNA-independent mechanism by which SCC can be recruited to gene promoters.
To test this possibility, a C-terminal truncation of XPC that abolishes TFIIH (and CETN2) binding but retains RAD23B binding (amino acids 1–813, C814St) (Bernardes de Jesus et al., 2008) was used in our in vitro assay and was found to have no adverse effect on the ability of an XPC (C814St)-RAD23B heterodimer to mediate Oct4/Sox2-activated transcription of Nanog (Figures S3C and S3D). We therefore speculate that SCC/XPC is most likely targeted to its cognate promoters via potential interactions with specific activators such as Oct4 and Sox2. To probe for a potential direct interaction between SCC and Oct4 and/or Sox2, mouse SCC subunits were overexpressed with Oct4, Sox2, Klf4, and c-Myc (STEMCCA) (Sommer et al., 2009) in 293T cells. SCC coimmunoprecipitated with Oct4, but not with control IgG (Figure 4C). To examine whether the DNA-binding property of SCC is required for its interaction with Oct4 and other activators, both the wild-type (WT) and DNA-binding-defective (W683S in mouse) XPC/SCC complexes were coexpressed with STEMCCA. Immunoprecipitation of WT and mutant SCC complexes using an anti-RAD23B antibody pulled down both Oct4 and Sox2, but not Klf4 or XPA (Figure 4D). These data indicate a direct and specific protein-protein binding between SCC and select activators, thus providing a mechanism by which SCC may serve as a transcriptional coactivator for Oct4 and Sox2 (but not Klf4; see Figure S1E) in potentiating Nanog transcription. These findings may also explain why the DNA-binding activity of the XPC subunit of SCC is dispensable for transcription in vitro. However, we were unable to reproducibly detect a stable interaction between SCC and Oct4/Sox2 in D3 ES cell extracts. It is worth noting, though, that other coactivators implicated in Oct4/Sox2-directed transcriptional activation
Figure 4. Reconstitution of Recombinant SCC Complexes
(A) Silver-stained SDS-PAGE gel of purified NT2 SCC (NT2), recombinant wild-type (WT), and DNA-binding-defective mutant (W690S) XPC-containing SCC complexes reconstituted in insect Sf9 cells by coinfection with baculoviruses expressing His-tagged XPC, FLAG-tagged RAD23B, and untagged CETN2. Major proteolytic fragments of mutant XPC are indicated by asterisks.
(B) Recombinant SCC complex enhances Oct4/Sox2-activated transcription of Nanog independent of DNA binding. Buffer (−), NT2 (Mono S peak activity fractions; lanes 2 and 3), recombinant WT (lanes 4 and 5), and W690S mutant (lanes 6 and 7) SCC complexes are assayed (over a 3-fold concentration range). All transcription reactions contain Oct4, Sox2, and Q0.3 (lanes 1–7).
(C) Oct4 interacts with SCC. Western blot analysis of input lysates (2%) and coimmunoprecipitated proteins from extracts of 293T cells transfected with a polycistronic expression plasmid encoding all three subunits of mouse SCC (mSCC) with or without a polycistronic plasmid expressing mouse Oct4, Sox2, Klf4, and c-Myc (STEMCCA) using normal IgG or anti-Oct4 antibody.
(D) SCC-B interacts directly with Oct4 and Sox2 independent of DNA binding. Control vector (−), plasmids expressing wild-type (WT), or mutant (W683S) XPC-containing mSCC complexes were cotransfected with STEMCCA into 293T cells and immunoprecipitated with anti-RAD23B antibody. Input lysates (2%) and RAD23B-bound proteins were detected by immunoblotting.
(E) Coomassie-stained SDS-PAGE gel of purified recombinant XPC, RAD23B, dimeric (XPC-RAD23B and XPC-CETN2), and holo-SCC (XPC-RAD23B-CETN2) complexes.
(F) Titrations (over a 4-fold concentration range) of XPC (lanes 2–4), RAD23B (lanes 5–7), XPC-RAD23B (lanes 8–10), XPC-CETN2 (lanes 11–13), and XPC-RAD23B-CETN2 (lanes 14–16) in in vitro transcription reactions supplemented with Q0.3 (lanes 1–16) and assayed as in (B).
See also Figure S3.
(e.g., Mediator and p300/CBP) have not been identified in recent “interactome” studies on Oct4-, Sox2-, or Nanog-associating factors (Engelen et al., 2011; van den Berg et al., 2010; Wang et al., 2006), supporting the idea that functional coactivator-activator interactions can often be weak and transient.

The ability to reconstitute active SCC from purified recombinant subunits also provided us with a unique opportunity to examine the contribution of individual subunits, as well as different dimeric combinations, in supporting Oct4/Sox2 transcriptional activation. Purified individual subunits (XPC or RAD23B), partial dimeric complexes (XPC-RAD23B or XPC-CETN2), and holo-SCC complexes (Figure 4E) were assayed over a 4-fold dose-response range in our fully reconstituted in vitro transcription reactions containing Oct4, Sox2, and a partially purified Q0.3 fraction (Figure 4F). The large XPC subunit alone only slightly activated transcription above background at the highest concentrations tested (Figure 4F, compare lanes 1 and 4), whereas RAD23B alone was essentially inactive. The XPC-CETN2 dimer was slightly more active than XPC alone. By contrast, a marked gain in specific activity was observed with the XPC-RAD23B dimeric complex, which was nearly as active as the holo-complex (Figure 4F). These results suggest that the minimal active complex likely consists of XPC and RAD23B, whereas CETN2 may enhance the activity of the complex by providing structural support or stability.

SCC Coactivator Function in ES Cell Self-Renewal and Somatic Cell Reprogramming

We next set out to determine the role of the SCC/XPC complex in gene expression and Nanog transcription by loss-of-function studies in ES cells. Lentiviruses containing two independent short hairpin RNAs (shRNAs) specifically targeting XPC, RAD23B, and CETN2 were used to infect mouse D3 ES cells to selectively deplete SCC (Figures 5A, S4A, and S4B). Knockdown of SCC subunits resulted in pronounced cellular morphological abnormalities and decreased alkaline phosphatase (AP) activity
Cell 147, 120–131, September 30, 2011 ©2011 Elsevier Inc. 125
Figure 5. SCC Is Required for ES Cell Maintenance
(A) Efficiency of shRNA-mediated depletion of SCC in mouse ES cell line D3. Whole-cell extracts of mouse D3 cells infected with nontarget (NT) lentiviruses (MOI of 300) or with an equal mixture of three lentiviruses (MOI of 100 each) targeting XPC, RAD23B, and CETN2 (SCC KD) are analyzed by western blotting. Specific bands recognized by their respective antibodies are indicated by filled arrowheads. Asterisks denote nonspecific signals.
(B) ES cell colony morphology and alkaline phosphatase (AP) activity (red) are maintained in control D3 cells (NT, top) but are compromised in SCC-depleted D3 cells (SCC KD, bottom). See also Figure S4C.
(C) Clonal assays on SCC-depleted D3 ES cells. Stable nontarget (NT) and SCC-depleted (SCC KD) D3 cell pools were plated at 300 cells per well in 6-well plates, and emerging colonies were stained for AP activity. Differentiation status (undifferentiated, mixed, or differentiated; plotted as colony type, %) was scored based on AP staining intensity, ES cell morphology, and colony integrity after 6 days.
(D) Two nonoverlapping sets of shRNAs targeting SCC (SCC #1 and SCC #2) are used to deplete SCC. Nanog, Utf1, Fgf4, and Zfp42 mRNA levels are quantified by real-time quantitative PCR (qPCR) and normalized to Actb. Data from representative experiments are shown; error bars represent standard deviations. n = 3.
See also Figure S4.
(Figures 5B and S4C). These knockdown cells also showed reduced proliferation rates when compared to control ES cells infected with nontarget viruses, indicating that the self-renewal capacity of ES cells depleted of SCC may also be compromised (data not shown). Indeed, prolonged depletion of SCC resulted in the apoptosis of flattened, fibroblastic AP-negative cells surrounding the collapsing ES cell colonies (Figure 5B and data not shown). Therefore, knockdown of SCC in ES cells likely promotes differentiation followed by rapid apoptosis, two processes that are often coupled. Quantification of colony assays revealed that ES cells depleted of SCC formed fewer undifferentiated colonies, with a corresponding increase in partially and fully differentiated colonies (Figure 5C). Consistent with the observed morphological changes associated with compromised stem cell identity, double and triple knockdown of XPC, RAD23B, and CETN2 resulted in a 2- to 3-fold reduction in the mRNA level of Nanog (Figures 5D and S4D) as well as
several other stem cell markers (Fgf4, Zfp42, and Utf1) (Figure 5D). Knockdown of individual subunits of SCC resulted in only mild effects on Nanog expression (Figure S4D). Accordingly, we did not observe overt defects in self-renewal in these single-subunit knockdown ES cells (data not shown).

To further probe the molecular mechanism underpinning the function of SCC as a transcriptional coactivator for Oct4 and Sox2 in vivo, we investigated whether regulatory regions of Nanog and Oct4 might serve as direct SCC targets by performing chromatin immunoprecipitation (ChIP) assays in D3 cells using a RAD23B antibody. ChIP-qPCR analysis revealed that RAD23B (and presumably XPC/SCC) occupancy sites coincide with those of Oct4 (Boyer et al., 2005; Chen et al., 2008; Kim et al., 2008) and Sox2 (Figures 6A and S5A). By contrast, we failed to detect any significant enrichment of RAD23B at housekeeping genes β-actin (Actb) (Figure 6A) and dihydrofolate reductase (Dhfr) (Figure S5B) or an intergenic region on chromosome 1 (Figure S5B).

To evaluate the extent to which Oct4 and Sox2 target sites overlap those of RAD23B on a genome-wide scale, we performed RAD23B ChIP assays followed by high-throughput sequencing (ChIP-seq) to identify an entire range of RAD23B/SCC-bound genomic regions in D3 cells. RAD23B ChIP-seq results were then compared with published Oct4 and Sox2 ChIP-seq data, along with those of Nanog and Tcf3 (Marson et al., 2008), to assess any potential bias in RAD23B occupancy in relation to these transcription factors. This analysis revealed a striking binding preference of RAD23B/SCC to genomic sites that are also co-occupied by Oct4 and Sox2, but not Nanog or Tcf3 only (70% versus 28%, p < 10^-15, ANOVA).
This strong bias is maintained whether the ChIP-seq data sets are analyzed by the degree of peak overlap (defined by any two peaks with at least one nucleotide of overlap) (Figure 6B) or base pair coverage (Figure 6C), indicating that the majority of RAD23B/SCC-binding sites align with those of Oct4 and Sox2. Importantly, the same analyses performed on ChIP-seq samples obtained from control IgG immunoprecipitations yielded only background correlation (between 4% and 8%), confirming the specificity of the RAD23B/SCC association. We further validated the colocalization among RAD23B/SCC, Oct4, and Sox2 by measuring the distance between overlapping RAD23B/SCC and Oct4/Sox2 peaks (see Extended Experimental Procedures). The majority of them (76%) lie within close proximity (≤200 base pairs) of each other (Figure 6D). Even though most RAD23B/SCC-bound regions overlap poorly with those bound by Nanog/Tcf3 (28%), those that do are still largely (64%) positioned within 200 base pairs of each other but with a noticeably different distribution pattern than that of Oct4/Sox2 (p < 10^-15, ANOVA, Figure 6D). However, upon a closer look at the Nanog/Tcf3 “only” genomic coordinates that overlap with RAD23B-bound sites, we found that many of them (40%) could, in fact, contain Oct4 and/or Sox2 when an alternative peak calling strategy (MACS) was used. Taken together, these data strongly suggest a classical coactivator function rather than a purely NER function of SCC both in vitro with naked DNA and in the context of chromatin in ES cells, as XPC/RAD23B-mediated DNA damage repair generally involves transient interactions with DNA (Camenisch et al., 2009) that would not show either sequence or promoter specificity.
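The two overlap measures described above (peak overlap, counted when two peaks share at least one nucleotide, and base pair coverage) can be illustrated with a short Python sketch. This is not the authors' pipeline: the function names, the half-open (chromosome, start, end) interval convention, and the toy coordinates are assumptions made here purely for illustration. Percentages are expressed relative to the reference set (for example, Oct4/Sox2 peaks), which is how Figures 6B and 6C report them.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed convention: (chromosome, start, end) with half-open coordinates,
# i.e. the peak covers positions start <= pos < end.
Peak = Tuple[str, int, int]


def merge_by_chrom(peaks: Iterable[Peak]) -> Dict[str, List[Tuple[int, int]]]:
    """Group peaks by chromosome and merge overlapping or touching intervals."""
    grouped: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in peaks:
        grouped[chrom].append((start, end))
    merged: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, intervals in grouped.items():
        intervals.sort()
        out = [intervals[0]]
        for start, end in intervals[1:]:
            last_start, last_end = out[-1]
            if start <= last_end:
                out[-1] = (last_start, max(last_end, end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def shared_bp(chrom: str, start: int, end: int,
              merged_query: Dict[str, List[Tuple[int, int]]]) -> int:
    """Number of base pairs of [start, end) covered by the merged query peaks."""
    return sum(
        max(0, min(end, q_end) - max(start, q_start))
        for q_start, q_end in merged_query.get(chrom, [])
    )


def overlap_stats(reference: Iterable[Peak],
                  query: Iterable[Peak]) -> Tuple[int, int, int, int]:
    """Return (reference peaks hit, reference peaks total, covered bp, total bp).

    A reference peak counts as hit if it shares >= 1 nucleotide with any query peak.
    Base-pair totals are computed on merged reference intervals so that
    overlapping reference peaks are not counted twice.
    """
    reference = list(reference)
    merged_query = merge_by_chrom(query)
    hits = sum(1 for c, s, e in reference if shared_bp(c, s, e, merged_query) >= 1)
    merged_ref = merge_by_chrom(reference)
    total_bp = sum(e - s for ivs in merged_ref.values() for s, e in ivs)
    covered_bp = sum(
        shared_bp(c, s, e, merged_query)
        for c, ivs in merged_ref.items()
        for s, e in ivs
    )
    return hits, len(reference), covered_bp, total_bp


if __name__ == "__main__":
    # Toy coordinates, invented for this example only.
    reference_peaks = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 150)]
    query_peaks = [("chr1", 199, 250), ("chr1", 400, 500),
                   ("chr2", 0, 60), ("chr2", 55, 70)]
    hits, n_ref, covered, total = overlap_stats(reference_peaks, query_peaks)
    print(f"peak overlap: {hits}/{n_ref} ({100 * hits / n_ref:.1f}%)")
    print(f"base pair coverage: {covered}/{total} bp ({100 * covered / total:.1f}%)")
```

In the toy data, the first reference peak shares a single nucleotide with a query peak and therefore counts as overlapping, whereas the second merely abuts a query peak and does not. Dedicated interval tools would normally be used on real ChIP-seq peak files; the quadratic scan here is kept only for readability.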
[Figure 6 graphics not reproduced. Panel A: ChIP-qPCR bar charts (y axis: % input, 0.05–0.35) comparing IgGs with α-RAD23B at enhancer (enh), upstream (-950, -5000), TSS, and intron positions of the Nanog, Oct4, and Actb loci, with Sox2- and Oct4-binding sites marked. Panels B and C are tabulated below. Panel D: see the Figure 6 caption.]

B  Peak Overlap
                       OCT4/SOX2 (n=10,138)    NANOG/TCF3 (n=6,424)
RAD23B (n=33,259)      70.9% (7,196)           27.9% (1,794)
IgGs (n=15,677)        8.5% (865)              4.1% (265)

C  Base-pair Coverage
                       OCT4/SOX2 (5,030,849 bp)    NANOG/TCF3 (1,151,570 bp)
RAD23B                 75.1% (3,776,285 bp)        28.5% (327,992 bp)
IgGs                   8.3% (420,442 bp)           3.7% (42,672 bp)
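As a quick consistency check, the percentages in panels B and C of Figure 6 can be recomputed from the counts reported alongside them. The Python sketch below does only that arithmetic. The dictionary layout and names are illustrative choices, and the 0.1-point tolerance is an assumption that allows for rounding in the published values.

```python
# (overlapping count, reference total, percentage as printed in the figure)
# Counts are peaks for panel B and base pairs for panel C.
FIGURE_6_VALUES = {
    ("B", "RAD23B", "OCT4/SOX2"): (7_196, 10_138, 70.9),
    ("B", "RAD23B", "NANOG/TCF3"): (1_794, 6_424, 27.9),
    ("B", "IgGs", "OCT4/SOX2"): (865, 10_138, 8.5),
    ("B", "IgGs", "NANOG/TCF3"): (265, 6_424, 4.1),
    ("C", "RAD23B", "OCT4/SOX2"): (3_776_285, 5_030_849, 75.1),
    ("C", "RAD23B", "NANOG/TCF3"): (327_992, 1_151_570, 28.5),
    ("C", "IgGs", "OCT4/SOX2"): (420_442, 5_030_849, 8.3),
    ("C", "IgGs", "NANOG/TCF3"): (42_672, 1_151_570, 3.7),
}


def recompute(values):
    """Map each table cell to (recomputed percent, printed percent)."""
    return {
        key: (100.0 * part / whole, printed)
        for key, (part, whole, printed) in values.items()
    }


def max_deviation(values):
    """Largest absolute gap, in percentage points, between recomputed and printed."""
    return max(abs(calc - printed) for calc, printed in recompute(values).values())


if __name__ == "__main__":
    for (panel, sample, reference), (calc, printed) in recompute(FIGURE_6_VALUES).items():
        print(f"6{panel} {sample:>6} vs {reference:<10} recomputed {calc:6.2f}%  printed {printed}%")
    print(f"largest deviation: {max_deviation(FIGURE_6_VALUES):.3f} points")
```

Every cell agrees with its printed percentage to within 0.1 percentage point, and the RAD23B values exceed the corresponding IgG background in both panels.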
Given the importance of SCC in stem cell maintenance, we next asked whether it might also play a role in the reacquisition of pluripotency during somatic cell reprogramming. Downregulation of either XPC or RAD23B in Oct4-GFP mouse embryonic fibroblasts (MEFs), which express some SCC, albeit at significantly lower levels than ES cells, led to a dramatic reduction in the reprogramming efficiency. We observed a significant decrease in the number of AP-positive colonies, as well as a marked reduction in the percentage of partially (SSEA-1+, GFP−) and fully (SSEA-1+, GFP+) reprogrammed cells, as determined by FACS sorting (Figures 7A, 7B, and S6A). Consistent with our in vitro reconstitution result showing that the CETN2 subunit may not be essential for the transcriptional activity of SCC (Figure 4F), knockdown of CETN2 had minor effects on iPS cell derivation efficiency. As expected, reprogramming efficiency using MEFs derived from XPC and RAD23B knockout (KO) mice (Ng et al., 2003) was also highly compromised. Surprisingly, RAD23A KO MEFs were nearly as efficient as wild-type or RAD23A and B double-heterozygous MEFs in generating AP-positive colonies
Figure 6. SCC Is Recruited to the Nanog and Oct4 Promoters and Genomic Regions Occupied by Oct4 and Sox2
(A) Co-occupancy of SCC, Oct4, and Sox2 on the promoters of Nanog and Oct4. ChIP analysis of RAD23B occupancy on distal enhancers (enh), proximal promoter (transcription start site, TSS), and upstream (positions indicated by numbers) and downstream intronic regions of the Nanog (left), Oct4 (middle), and Actb (right) gene loci. Representative data (n > 5) showing the enrichment of RAD23B (black bars) compared to normal IgGs (white bars) are analyzed by qPCR and expressed as percentage of input chromatin. Schematic diagrams of Oct4- and Sox2-binding sites on the Nanog and Oct4 regulatory regions (TSS and enhancers; see also Figure S5A) are indicated at the bottom. Error bars represent standard deviations. n = 3.
(B) Percent peak overlap between RAD23B and control IgG ChIP-seq data relative to published Oct4/Sox2 and Nanog/Tcf3 peak data.
(C) Percent base pair overlap between RAD23B and control IgG ChIP-seq data relative to Oct4/Sox2 and Nanog/Tcf3 ChIP-seq data sets.
(D) Distribution of distance (in base pairs) of RAD23B and control IgG peaks from Oct4/Sox2 and Nanog/Tcf3 peaks.
See also Figure S5.
upon iPS cell induction (Figures 7C and S6B). This result may point to a nonredundant function of RAD23B in somatic reprogramming independent of its role in DNA repair, as RAD23B KO (and RAD23A KO) MEFs are NER proficient (Ng et al., 2003). Importantly, depletion of XPC (knockdown and knockout) and CETN2 in MEFs did not affect proliferation rates when compared to nontarget or Oct4 knockdown MEFs. However, RAD23B-depleted MEFs displayed noticeable changes in growth rates, which may partially account for the marked reduction in reprogramming efficiency (data not shown). These data suggest that efficient reprogramming may require SCC/XPC in conjunction with Oct4 and Sox2 to re-establish ES-specific gene expression programs.

DISCUSSION

Establishment of ground state pluripotency in embryonic stem cells represents one of the most remarkable events in development. Stem cells have evolved a subset of cell type-specific activators among a constellation of previously identified transcription factors and cofactors to resolve the dichotomy between self-renewal and differentiation. Our de novo purification of the SCC/XPC complex as a potent coactivator for Oct4 and Sox2 was unanticipated but may, in part, reflect the need for stem cells to robustly expand and diversify their transcriptional repertoire while also maintaining genome integrity. Indeed, other NER factors have been shown to participate in transcriptional regulation both at the basal and activated levels.
Figure 7. SCC Is Required for Efficient Somatic Cell Reprogramming
(A) Depletion of SCC blocks somatic cell reprogramming. Oct4-GFP mouse embryonic fibroblasts infected with lentiviruses expressing STEMCCA and rtTA together with nontarget shRNA (NT), shRNAs against Oct4, individual subunits of SCC, or all three subunits simultaneously at low or high multiplicity of infection (SCC LO or HI) are plated in 6-well plates for colony counting and FACS or in 24-well plates for AP staining. AP-positive (red) cells are stained and counted 17 days (14 days +dox, 3 days −dox) postinduction (dpi). Results from two separate experiments are shown.
(B) Single cell suspensions of 17 dpi Oct4-GFP MEFs as described in (A) are stained with anti-mouse SSEA-1 antibodies and analyzed by FACS.
(C) Wild-type (WT) and RAD23A/RAD23B double-heterozygous (23A/B d-Het) MEFs, together with XPC, RAD23A, and RAD23B knockout (KO) MEFs, are induced with STEMCCA. AP-positive colonies are stained and counted as in (A).
See also Figure S6.
For instance, the general transcription factor TFIIH is a classic example with established roles in both transcription initiation and NER (Schaeffer et al., 1993). Interestingly, it has recently been reported that, in HeLa cells, the entire NER complex can be assembled onto promoters of activated genes in an XPC-dependent manner. However, XPC alone is not sufficient, as other NER components appear to be responsible for RA-activated transcription (Le May et al., 2010). This finding in HeLa cells is distinct from our observation that the XPC-NER (SCC) complex plays a direct and critical role in Nanog transcription in vitro and in ES cells. In our studies, optimal activation of Nanog by Oct4/Sox2 potentiated by SCC requires a second activity present in the Q0.3 fraction. However, preliminary mass spectrometry analyses of the partially purified Q0.3 fraction failed to detect any other XP or NER factors or factors previously identified to copurify with Nanog or Oct4 in ES cells (van den Berg et al., 2010; Wang et al., 2006) (data not shown). Therefore, the SCC/XPC complex can potentiate Nanog transcription and likely other Oct4/Sox2-directed promoters in the absence of additional XP and NER factors in vitro. Taken together, these results suggest that the mechanism by which the SCC/XPC complex coactivates transcription in ES cells may be distinct from its function in HeLa cells. Although XPC plays a critical role in DNA lesion recognition, XPC is not universally required for NER, as certain types of bulky DNA lesions (e.g., cholesterol-DNA adducts) can be repaired
without XPC (Mu et al., 1996). Intriguingly, even though XPC is recruited to gene promoters irrespective of DNA damage signals (Le May et al., 2010), the XPC-NER complex is the only factor in the XP family that is dispensable for transcription-coupled repair (TCR) (Venema et al., 1990). Indeed, our findings suggest that the coactivator and NER duties carried out by SCC are mechanistically distinct processes, as SCC can function as part of the transcriptional cofactor apparatus via a direct interaction with Oct4 and Sox2 without requiring either DNA or TFIIH binding mediated by XPC. It is worth noting that the effect of single knockdown of XPC or RAD23B was much more pronounced in the reprogramming of MEFs than in the maintenance of ES cells. We surmise that perhaps other redundant regulatory mechanisms in established ES cells can partially compensate for the loss of SCC. Such robust regulatory circuitries are likely to be less developed during the early phase of reprogramming in MEFs and are thus more susceptible to perturbation by SCC depletion. It is conceivable that SCC/XPC may also contribute to the process of chromatin reorganization and facilitate changes in the epigenetic landscape that are conducive to iPS conversion (Le May et al., 2010). Also in agreement with our in vitro and cell-based studies, a mouse double KO of RAD23B and its homolog RAD23A was found to be early embryonic lethal (Ng et al., 2003). This previously puzzling phenotype can now be more readily rationalized in light of the functional role of XPC in transcriptional coactivation revealed here. Taken together, these results strongly suggest that loss of the SCC/XPC complex may indeed compromise the transcriptional integrity of pluripotent stem cells, as well as the ability of somatic cells to re-establish pluripotency. However, XPC KO mice are UV sensitive but otherwise normal, with no obvious developmental defects (Sands et al., 1995). 
It has been shown that RAD23B is in vast excess relative to XPC (Sugasawa et al., 1996), suggesting that RAD23B may exist in other complexes independent of XPC that functionally replace SCC.

Embryonic stem cells are thought to be under strong selective pressure to maintain genome fidelity because accumulation and propagation of DNA errors to progenitor cells would be lethal during development; therefore, DNA damage response factors and pathways are often upregulated in ES cells (e.g., XPC, RAD23B, and ERCC5) (Cervantes et al., 2002; Ramalho-Santos et al., 2002). Should DNA repair fail, UV-damaged ES cells can be eliminated first by repressing Nanog expression through p53 upregulation, which in turn promotes spontaneous differentiation and efficient apoptosis (Lin et al., 2005). It is interesting to note that, upon UV-induced DNA damage in HeLa cells, recruitment of XPC to non-UV-inducible genes, as well as their expression, is dramatically delayed (Le May et al., 2010). This suggests that a redistribution mechanism may redirect XPC from transcription duty at promoter targets to the NER pathway in response to DNA damage. In light of these observations, it is tempting to speculate that redistribution of XPC-RAD23B-CETN2 from Nanog and presumably other Oct4/Sox2-regulated promoters to DNA damage sites may provide an efficient sensing mechanism to perturb stem cell-specific gene expression programs and thus provide a window of opportunity for ES cells to either repair the lesions or commit to differentiation and apoptosis. The SCC/XPC complex may therefore act as a molecular link to couple stem cell-specific gene expression programs and genome surveillance in ES cells.

EXPERIMENTAL PROCEDURES

DNA Constructs, Cell Lines, and Cell Culture
Construction of in vitro transcription templates and protein expression plasmids is described in Extended Experimental Procedures. HeLa, 293T, NTERA-2 (NT2), and mouse ES cell line D3 were maintained in standard conditions.
Differentiation of D3 ES cells was carried out by LIF removal followed by retinoic acid treatment (5–10 µM, Sigma).

Purification and Identification of SCC
Nuclear extracts from 400 L of NT2 cells were purified over eight chromatographic steps to homogeneity. Methods for purification and mass spectrometry analyses of SCC are detailed in Extended Experimental Procedures.

Western Blotting, Immunoprecipitation, and Affinity Purification
Antibodies used are described in Extended Experimental Procedures. Transcriptional activators were expressed in transiently transfected HeLa cells and affinity purified using anti-FLAG (M2) agarose (Sigma) as described in Extended Experimental Procedures. Recombinant SCC complexes were purified from Sf9 cells infected with baculoviruses (BAC-to-BAC system, Invitrogen) expressing N-terminal His6-tagged or FLAG-tagged XPC, N-FLAG-tagged RAD23B, and untagged CETN2. Sf9 cells were harvested 48 hr after infection, and protein complexes were purified by incubating cell lysates with Ni-NTA resin (QIAGEN) and anti-FLAG (M2) agarose (Sigma), followed by elution with FLAG peptides.

shRNA-Mediated Knockdown of SCC by Lentiviral Infection
Control nontarget and pLKO shRNA plasmids targeting XPC, RAD23B, and CETN2 (Sigma) were transfected with packaging vectors into 293T cells using FuGENE 6 (Roche). Supernatants were concentrated by ultracentrifugation and resuspended in PBS. Viral titer was determined by a QuickTiter Lentivirus Titer Kit (Cell Biolabs). SCC knockdown was performed by incubating lentiviral concentrates with D3 cells in the presence of 8 µg/ml polybrene followed by puromycin selection (1.5 µg/ml).
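The Figure 5 legend reports that control cells received nontarget virus at an MOI of 300, while SCC knockdown cells received three targeting viruses at an MOI of 100 each. The dosing arithmetic behind such a design can be sketched as follows. This is an illustration only: the function name is hypothetical, the titer is assumed to be expressed in viral units per ml in whatever unit the titer assay reports, and the cell number and titer in the example are invented rather than taken from the paper.

```python
def virus_volume_ul(moi: float, n_cells: float, titer_per_ml: float) -> float:
    """Volume of viral concentrate (in microliters) needed to reach a given MOI.

    MOI is taken here as viral units per cell, so the units required are
    moi * n_cells, and the volume is that number divided by the titer.
    """
    if moi < 0 or n_cells < 0:
        raise ValueError("moi and n_cells must be non-negative")
    if titer_per_ml <= 0:
        raise ValueError("titer_per_ml must be positive")
    return moi * n_cells / titer_per_ml * 1000.0


if __name__ == "__main__":
    cells = 1e5   # hypothetical number of cells at infection
    titer = 1e9   # hypothetical titer, viral units per ml

    control = virus_volume_ul(300, cells, titer)      # nontarget virus, MOI 300
    per_target = virus_volume_ul(100, cells, titer)   # each targeting virus, MOI 100
    print(f"nontarget control: {control:.1f} ul")
    print(f"each of three targeting viruses: {per_target:.1f} ul "
          f"(combined {3 * per_target:.1f} ul)")
```

With equal titers, three viruses at an MOI of 100 each deliver the same total viral load as a single control virus at an MOI of 300, which is presumably why the control was dosed that way.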
Gene Expression Analysis and ChIP
Total RNA from shRNA-mediated knockdown D3 ES cells was isolated using RNeasy Plus Kit (QIAGEN) and analyzed by qRT-PCR. Chromatin immunoprecipitation (ChIP) assays were performed in D3 cells as described in Extended Experimental Procedures. Precipitated DNA was measured by qPCR or sequenced using an Illumina HiSeq 2000 sequencing platform. Methods for gene expression and ChIP analyses are detailed in Extended Experimental Procedures.

Somatic Cell Reprogramming
Oct4-GFP MEFs (The Jackson Laboratory) were infected with lentiviruses containing STEMCCA and rtTA, followed by infection with pLKO shRNA lentiviral supernatants targeting SCC. Expression of Oct4, Sox2, Klf4, and c-Myc was induced by doxycycline, and SCC knockdown MEFs were selected with puromycin. Reprogrammed cells were either detected by alkaline phosphatase activity or stained with anti-SSEA-1 antibodies conjugated to Alexa Fluor 647 (BioLegend) and analyzed by FACS. XPC, RAD23A, and RAD23B knockout MEFs were generous gifts from Dr. Hoeijmakers (Rotterdam, The Netherlands).

SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures and six figures and can be found with this article online at doi:10.1016/j.cell.2011.08.038.

ACKNOWLEDGMENTS
The authors wish to thank A. Fischer and M. Richner at the Tissue Culture Facility (University of California, Berkeley); S. Zheng, G. Dailey, M. Haggart, and E. Bourbon for technical assistance; S. Chen at National Institute of Biological Sciences (Beijing, China) for mass spectrometry analysis; S. Zhou for initial mass spectrometry analysis; J. Hoeijmakers for knockout MEFs; G. Mostoslavsky for STEMCCA; O. Puig for pGL3-CAT plasmid; S. Ryu for purified rTBP and G3BCAT plasmid; K. Hochedlinger, M. Stadtfeld, J. de Wit, and M. Holmes for technical advice; and M. Botchan, M. Holmes, M. Levine, S. Martin, M. Rape, D. Rio, and all members of our laboratory for critical reading of the manuscript. Y.W.F. was a California Institute for Regenerative Medicine Scholar (CIRM training grant T1-00007). T.Y. was supported by the Swiss National Science Foundation and the Siebel Stem Cell Institute and is a Jane Coffin Childs fellow.

Received: July 26, 2011
Revised: August 23, 2011
Accepted: August 25, 2011
Published: September 29, 2011

REFERENCES
Araki, M., Masutani, C., Takemura, M., Uchida, A., Sugasawa, K., Kondoh, J., Ohkuma, Y., and Hanaoka, F. (2001). Centrosome protein centrin 2/caltractin 1 is part of the xeroderma pigmentosum group C complex that initiates global genome nucleotide excision repair. J. Biol. Chem. 276, 18665–18672.
Bernardes de Jesus, B.M., Bjørås, M., Coin, F., and Egly, J.M. (2008). Dissection of the molecular defects caused by pathogenic mutations in the DNA repair factor XPC. Mol. Cell. Biol. 28, 7225–7235.
Boyer, L.A., Lee, T.I., Cole, M.F., Johnstone, S.E., Levine, S.S., Zucker, J.P., Guenther, M.G., Kumar, R.M., Murray, H.L., Jenner, R.G., et al. (2005). Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947–956.
Camenisch, U., Träutlein, D., Clement, F.C., Fei, J., Leitenstorfer, A., Ferrando-May, E., and Naegeli, H. (2009). Two-stage dynamic DNA quality check by xeroderma pigmentosum group C protein. EMBO J. 28, 2387–2399.
Cervantes, R.B., Stringer, J.R., Shao, C., Tischfield, J.A., and Stambrook, P.J. (2002). Embryonic stem cells and somatic cells differ in mutation frequency and type. Proc. Natl. Acad. Sci. USA 99, 3586–3590.
Cell 147, 120–131, September 30, 2011 ª2011 Elsevier Inc. 129
Chakravarthy, H., Boer, B., Desler, M., Mallanna, S.K., McKeithan, T.W., and Rizzino, A. (2008). Identification of DPPA4 and other genes as putative Sox2: Oct-3/4 target genes using a combination of in silico analysis and transcription-based assays. J. Cell. Physiol. 216, 651–662. Chen, X., Xu, H., Yuan, P., Fang, F., Huss, M., Vega, V.B., Wong, E., Orlov, Y.L., Zhang, W., Jiang, J., et al. (2008). Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell 133, 1106–1117. Engelen, E., Akinci, U., Bryne, J.C., Hou, J., Gontan, C., Moen, M., Szumska, D., Kockx, C., van Ijcken, W., Dekkers, D.H., et al. (2011). Sox2 cooperates with Chd7 to regulate genes that are mutated in human syndromes. Nat. Genet. 43, 607–611.
Mu¨ller, F., Zaucker, A., and Tora, L. (2010). Developmental regulation of transcription initiation: more than just changing the actors. Curr. Opin. Genet. Dev. 20, 533–540. Na¨a¨r, A.M., Beaurang, P.A., Robinson, K.M., Oliner, J.D., Avizonis, D., Scheek, S., Zwicker, J., Kadonaga, J.T., and Tjian, R. (1998). Chromatin, TAFs, and a novel multiprotein coactivator are required for synergistic activation by Sp1 and SREBP-1a in vitro. Genes Dev. 12, 3020–3031. Na¨a¨r, A.M., Beaurang, P.A., Zhou, S., Abraham, S., Solomon, W., and Tjian, R. (1999). Composite co-activator ARC mediates chromatin-directed transcriptional activation. Nature 398, 828–832. Na¨a¨r, A.M., Lemon, B.D., and Tjian, R. (2001). Transcriptional coactivator complexes. Annu. Rev. Biochem. 70, 475–501.
Gao, X., Tate, P., Hu, P., Tjian, R., Skarnes, W.C., and Wang, Z. (2008). ES cell pluripotency and germ-layer formation require the SWI/SNF chromatin remodeling component BAF250a. Proc. Natl. Acad. Sci. USA 105, 6656–6661.
Na¨a¨r, A.M., Taatjes, D.J., Zhai, W., Nogales, E., and Tjian, R. (2002). Human CRSP interacts with RNA polymerase II CTD and adopts a specific CTD-bound conformation. Genes Dev. 16, 1339–1344.
Goodrich, J.A., and Tjian, R. (2010). Unexpected roles for core promoter recognition factors in cell-type-specific transcription and gene regulation. Nat. Rev. Genet. 11, 549–558.
Ng, J.M., Vermeulen, W., van der Horst, G.T., Bergink, S., Sugasawa, K., Vrieling, H., and Hoeijmakers, J.H. (2003). A novel regulation mechanism of DNA repair by damage-induced and RAD23-dependent stabilization of xeroderma pigmentosum group C protein. Genes Dev. 17, 1630–1645.
Jaenisch, R., and Young, R. (2008). Stem cells, the molecular circuitry of pluripotency and nuclear reprogramming. Cell 132, 567–582. Jiang, J., Chan, Y.S., Loh, Y.H., Cai, J., Tong, G.Q., Lim, C.A., Robson, P., Zhong, S., and Ng, H.H. (2008). A core Klf circuitry regulates self-renewal of embryonic stem cells. Nat. Cell Biol. 10, 353–360. Kagey, M.H., Newman, J.J., Bilodeau, S., Zhan, Y., Orlando, D.A., van Berkum, N.L., Ebmeier, C.C., Goossens, J., Rahl, P.B., Levine, S.S., et al. (2010). Mediator and cohesin connect gene expression and chromatin architecture. Nature 467, 430–435. Kidder, B.L., Palmer, S., and Knott, J.G. (2009). SWI/SNF-Brg1 regulates selfrenewal and occupies core pluripotency-related genes in embryonic stem cells. Stem Cells 27, 317–328. Kim, J., Chu, J., Shen, X., Wang, J., and Orkin, S.H. (2008). An extended transcriptional network for pluripotency of embryonic stem cells. Cell 132, 1049–1061. Kuroda, T., Tada, M., Kubota, H., Kimura, H., Hatano, S.Y., Suemori, H., Nakatsuji, N., and Tada, T. (2005). Octamer and Sox elements are required for transcriptional cis regulation of Nanog gene expression. Mol. Cell. Biol. 25, 2475–2485. Le May, N., Mota-Fernandes, D., Ve´lez-Cruz, R., Iltis, I., Biard, D., and Egly, J.M. (2010). NER factors are recruited to active promoters and facilitate chromatin modification for transcription in the absence of exogenous genotoxic attack. Mol. Cell 38, 54–66. Lemon, B., Inouye, C., King, D.S., and Tjian, R. (2001). Selectivity of chromatinremodelling cofactors for ligand-activated transcription. Nature 414, 924–928. Lin, T., Chao, C., Saito, S., Mazur, S.J., Murphy, M.E., Appella, E., and Xu, Y. (2005). p53 induces differentiation of mouse embryonic stem cells by suppressing Nanog expression. Nat. Cell Biol. 7, 165–171. Liu, Z., Scannell, D.R., Eisen, M.B., and Tijan, R. (2011). Control of embryonic stem cell lineage commitment by core promoter factor, TAF3. Cell 146, 720–731. 
Loh, Y.H., Wu, Q., Chew, J.L., Vega, V.B., Zhang, W., Chen, X., Bourque, G., George, J., Leong, B., Liu, J., et al. (2006). The Oct4 and Nanog transcription network regulates pluripotency in mouse embryonic stem cells. Nat. Genet. 38, 431–440. Maillard, O., Solyom, S., and Naegeli, H. (2007). An aromatic sensor with aversion to damaged strands confers versatility to DNA repair. PLoS Biol. 5, e79. Marson, A., Levine, S.S., Cole, M.F., Frampton, G.M., Brambrink, T., Johnstone, S., Guenther, M.G., Johnston, W.K., Wernig, M., Newman, J., et al. (2008). Connecting microRNA genes to the core transcriptional regulatory circuitry of embryonic stem cells. Cell 134, 521–533. Mu, D., Hsu, D.S., and Sancar, A. (1996). Reaction mechanism of human DNA repair excision nuclease. J. Biol. Chem. 271, 8285–8294.
Pal, R., and Ravindran, G. (2006). Assessment of pluripotency and multilineage differentiation potential of NTERA-2 cells as a model for studying human embryonic stem cells. Cell Prolif. 39, 585–598.
Ramalho-Santos, M., Yoon, S., Matsuzaki, Y., Mulligan, R.C., and Melton, D.A. (2002). “Stemness”: transcriptional profiling of embryonic and adult stem cells. Science 298, 597–600.
Rodda, D.J., Chew, J.L., Lim, L.H., Loh, Y.H., Wang, B., Ng, H.H., and Robson, P. (2005). Transcriptional regulation of nanog by OCT4 and SOX2. J. Biol. Chem. 280, 24731–24737.
Roeder, R.G. (2005). Transcriptional regulation and the role of diverse coactivators in animal cells. FEBS Lett. 579, 909–915.
Rosenfeld, M.G., Lunyak, V.V., and Glass, C.K. (2006). Sensors and signals: a coactivator/corepressor/epigenetic code for integrating signal-dependent programs of transcriptional response. Genes Dev. 20, 1405–1428.
Ryu, S., Zhou, S., Ladurner, A.G., and Tjian, R. (1999). The transcriptional cofactor complex CRSP is required for activity of the enhancer-binding protein Sp1. Nature 397, 446–450.
Sands, A.T., Abuin, A., Sanchez, A., Conti, C.J., and Bradley, A. (1995). High susceptibility to ultraviolet-induced carcinogenesis in mice lacking XPC. Nature 377, 162–165.
Schaeffer, L., Roy, R., Humbert, S., Moncollin, V., Vermeulen, W., Hoeijmakers, J.H., Chambon, P., and Egly, J.M. (1993). DNA repair helicase: a component of BTF2 (TFIIH) basic transcription factor. Science 260, 58–63.
Schwartz, C.M., Spivak, C.E., Baker, S.C., McDaniel, T.K., Loring, J.F., Nguyen, C., Chrest, F.J., Wersto, R., Arenas, E., Zeng, X., et al. (2005). NTera2: a model system to study dopaminergic differentiation of human embryonic stem cells. Stem Cells Dev. 14, 517–534.
Sommer, C.A., Stadtfeld, M., Murphy, G.J., Hochedlinger, K., Kotton, D.N., and Mostoslavsky, G. (2009). Induced pluripotent stem cell generation using a single lentiviral stem cell cassette. Stem Cells 27, 543–549.
Sperger, J.M., Chen, X., Draper, J.S., Antosiewicz, J.E., Chon, C.H., Jones, S.B., Brooks, J.D., Andrews, P.W., Brown, P.O., and Thomson, J.A. (2003). Gene expression patterns in human embryonic stem cells and human pluripotent germ cell tumors. Proc. Natl. Acad. Sci. USA 100, 13350–13355.
Sugasawa, K., Masutani, C., Uchida, A., Maekawa, T., van der Spek, P.J., Bootsma, D., Hoeijmakers, J.H., and Hanaoka, F. (1996). HHR23B, a human Rad23 homolog, stimulates XPC protein in nucleotide excision repair in vitro. Mol. Cell. Biol. 16, 4852–4861.
Taatjes, D.J., Marr, M.T., and Tjian, R. (2004). Regulatory diversity among metazoan co-activator complexes. Nat. Rev. Mol. Cell Biol. 5, 403–410.
Tokuzawa, Y., Kaiho, E., Maruyama, M., Takahashi, K., Mitsui, K., Maeda, M., Niwa, H., and Yamanaka, S. (2003). Fbx15 is a novel target of Oct3/4 but is dispensable for embryonic stem cell self-renewal and mouse development. Mol. Cell. Biol. 23, 2699–2708.
Tutter, A.V., Kowalski, M.P., Baltus, G.A., Iourgenko, V., Labow, M., Li, E., and Kadam, S. (2009). Role for Med12 in regulation of Nanog and Nanog target genes. J. Biol. Chem. 284, 3709–3718.
Uchida, A., Sugasawa, K., Masutani, C., Dohmae, N., Araki, M., Yokoi, M., Ohkuma, Y., and Hanaoka, F. (2002). The carboxy-terminal domain of the XPC protein plays a crucial role in nucleotide excision repair through interactions with transcription factor IIH. DNA Repair (Amst.) 1, 449–461.
van den Berg, D.L., Snoek, T., Mullin, N.P., Yates, A., Bezstarosti, K., Demmers, J., Chambers, I., and Poot, R.A. (2010). An Oct4-centered protein interaction network in embryonic stem cells. Cell Stem Cell 6, 369–381.
van den Berg, D.L., Zhang, W., Yates, A., Engelen, E., Takacs, K., Bezstarosti, K., Demmers, J., Chambers, I., and Poot, R.A. (2008). Estrogen-related receptor beta interacts with Oct4 to positively regulate Nanog gene expression. Mol. Cell. Biol. 28, 5986–5995.
Venema, J., van Hoffen, A., Natarajan, A.T., van Zeeland, A.A., and Mullenders, L.H. (1990). The residual repair capacity of xeroderma pigmentosum complementation group C fibroblasts is highly specific for transcriptionally active DNA. Nucleic Acids Res. 18, 443–448.
Wang, J., Rao, S., Chu, J., Shen, X., Levasseur, D.N., Theunissen, T.W., and Orkin, S.H. (2006). A protein interaction network for pluripotency of embryonic stem cells. Nature 444, 364–368.
Yasuda, G., Nishi, R., Watanabe, E., Mori, T., Iwai, S., Orioli, D., Stefanini, M., Hanaoka, F., and Sugasawa, K. (2007). In vivo destabilization and functional defects of the xeroderma pigmentosum C protein caused by a pathogenic missense mutation. Mol. Cell. Biol. 27, 6606–6614.
Zhang, J., Tam, W.L., Tong, G.Q., Wu, Q., Chan, H.Y., Soh, B.S., Lou, Y., Yang, J., Ma, Y., Chai, L., et al. (2006). Sall4 modulates embryonic stem cell pluripotency and early embryonic development by the transcriptional regulation of Pou5f1. Nat. Cell Biol. 8, 1114–1123.
Zhang, X., Zhang, J., Wang, T., Esteban, M.A., and Pei, D. (2008). Esrrb activates Oct4 transcription and sustains self-renewal and pluripotency in embryonic stem cells. J. Biol. Chem. 283, 35825–35833.
An Alternative Splicing Switch Regulates Embryonic Stem Cell Pluripotency and Reprogramming
Mathieu Gabut,1,2 Payman Samavarchi-Tehrani,3,5 Xinchen Wang,1,2 Valentina Slobodeniuc,1,2 Dave O’Hanlon,1,2 Hoon-Ki Sung,4 Manuel Alvarez,2,6 Shaheynoor Talukder,1,2 Qun Pan,1,2 Esteban O. Mazzoni,7 Stephane Nedelec,7 Hynek Wichterle,7 Knut Woltjen,4 Timothy R. Hughes,1,2 Peter W. Zandstra,2,6 Andras Nagy,4,5 Jeffrey L. Wrana,3,5 and Benjamin J. Blencowe1,2,5,*
1Banting and Best Department of Medical Research
2The Donnelly Centre
University of Toronto, 160 College Street, Toronto, Ontario M5S 3E1, Canada
3Center for Systems Biology, Samuel Lunenfeld Research Institute
4Center for Stem Cells and Tissue Engineering, Samuel Lunenfeld Research Institute
Mount Sinai Hospital, 600 University Avenue, Toronto, Ontario M5G 1X5, Canada
5Department of Molecular Genetics, University of Toronto, 1 Kings College Circle, Toronto, Ontario M5S 1A8, Canada
6Institute of Biomaterials and Biomedical Engineering, University of Toronto, 164 College Street, Toronto, Ontario M5T 1P7, Canada
7Columbia University Medical Center, 630 West 168th Street, New York, NY 10032, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.08.023
SUMMARY
Alternative splicing (AS) is a key process underlying the expansion of proteomic diversity and the regulation of gene expression. Here, we identify an evolutionarily conserved embryonic stem cell (ESC)-specific AS event that changes the DNA-binding preference of the forkhead family transcription factor FOXP1. We show that the ESC-specific isoform of FOXP1 stimulates the expression of transcription factor genes required for pluripotency, including OCT4, NANOG, NR5A2, and GDF3, while concomitantly repressing genes required for ESC differentiation. This isoform also promotes the maintenance of ESC pluripotency and contributes to efficient reprogramming of somatic cells into induced pluripotent stem cells. These results reveal a pivotal role for an AS event in the regulation of pluripotency through the control of critical ESC-specific transcriptional programs.
Cell 147, 132–146, September 30, 2011 ©2011 Elsevier Inc.
INTRODUCTION
During the past several years, great strides have been made in our understanding of the regulatory processes responsible for maintenance of the pluripotent state of embryonic stem cells (ESCs) and for the reprogramming of somatic cells to induced pluripotent stem cells (iPSCs). A core set of transcription factors that includes Oct4, Nanog, Sox2, and Tcf3 functions in ESC maintenance, with the first three of these factors cross-regulating each other’s expression, as well as genes that stabilize the ESC state (Chen et al., 2008; Kim et al., 2008; Silva et al., 2009). Indeed, exogenous Oct4 and Sox2, together with Klf4 and c-Myc, reprogram somatic cells to iPSCs (Takahashi and Yamanaka, 2006) by remodeling the transcriptome through successive stages (Samavarchi-Tehrani et al., 2010) that culminate in activation of the core pluripotency transcriptional regulatory network.
In contrast to our understanding of transcriptional networks regulating pluripotency, the role of alternative splicing (AS) in this process is not well understood. Recent studies have identified AS differences between ESC and differentiated cell populations (Atlasi et al., 2008; Kunarso et al., 2008; Pritsker et al., 2005; Rao et al., 2010b; Salomonis et al., 2010; Wu et al., 2010; Yeo et al., 2007), and two such events have been implicated in changing the activities of Tcf3 and Sall4, transcription factors that function in pluripotency (Rao et al., 2010b; Salomonis et al., 2010). Therefore, specific AS events may modulate transcriptional networks involved in pluripotency maintenance versus cell-type specification.
Forkhead box (FOX) transcription factors regulate a large number of genes involved in cell proliferation, differentiation, and development (Wijchers et al., 2006). The forkhead box forms a winged helix domain of 80 to 100 amino acids that binds to DNA (Li et al., 2004). FOXP1 is one of four FOXP subfamily members that contain a C-terminal forkhead domain together with N-terminal zinc finger and leucine zipper domains. FOXP1 is widely expressed, and its loss or fusion with other proteins through chromosomal translocations is associated with several cancers (Koon et al., 2007). Knockout of murine Foxp1 disrupts the establishment of specific cell types (Dasen et al., 2008; Zhang et al., 2010) and results in early embryonic lethality (Wang et al., 2004). Several splice variants of FOXP1 have been identified (Brown et al., 2008), yet the functions of these are not well understood.
In this study, we identify a highly conserved AS event in FOXP1 transcripts that is activated in ESCs and silenced during cell differentiation. This AS event modifies critical amino acid residues within the forkhead domain and alters its DNA-binding specificity. In ESCs, this switches the transcriptional output of FOXP1 such that the pluripotency genes OCT4, NANOG, GDF3, and NR5A2 are stimulated, while genes involved in cell-lineage specification and differentiation are repressed. Induced expression of the ESC-specific isoform of FOXP1 promotes self-renewal and the maintenance of pluripotency, whereas its silencing inhibits iPSC programming. An evolutionarily conserved AS event thus reconfigures transcriptional regulatory networks required for transitions between ESC pluripotency maintenance and differentiation.
RESULTS
An Embryonic Stem Cell-Specific Splice Variant from the FOXP1 Gene
To identify AS events that might control stem cell pluripotency, we used microarray profiling to compare patterns of AS in undifferentiated and differentiated H9 human (h)ESCs (Extended Experimental Procedures and Figure S1 available online). Whereas few AS changes were detected at day 2, 165 (2.85%) of the profiled exons were predicted to undergo inclusion level changes (Table S1 and data not shown) at day 10 following neural lineage induction. Genes containing these predicted AS changes were represented by diverse functional Gene Ontology (GO) categories.
In this study, we focus on a previously unidentified AS change detected in transcripts from the FOXP1 gene. Our analysis indicated that FOXP1 exon 18 had increased inclusion in day 10 neural progenitor-enriched cells compared to undifferentiated H9 hESCs (96% versus 79% inclusion; Table S1). Reverse-transcription-polymerase chain reaction (RT-PCR) assays confirmed this but also detected two unexpected additional bands that are 50 nt and 170 nt longer than the transcripts containing exon 18 (Figure 1A).
Sequencing of the +50 nt band revealed the inclusion in hESCs of a previously uncharacterized exon, which we refer to as exon 18b, in FOXP1 transcripts in place of exon 18, whereas the +170 nt band (asterisk in Figure 1) contained exons 18 and 18b. Consistent with low or undetectable (see below) expression of this isoform, inclusion of both exons introduces a termination codon 121 nt downstream of exon 18b that likely elicits nonsense-mediated mRNA decay. However, inclusion of exon 18b instead of exon 18 preserves the open reading frame but modifies the forkhead domain (see below, Figure S2A). Exon 18b is efficiently included in undifferentiated H9 hESCs (>64%, Figure 1B, lanes 1 and 2) and in H9 cells 2 days after differentiation induction (>58%, Figure 1B, lanes 3 to 5), relative to the neural lineage-enriched cell population at day 10 (11%, Figure 1B, lane 6). Consistent with this observation, a high proportion of H9 hESCs still expressed pluripotency markers at day 2 compared to day 10 post-induction of differentiation (Figure S1D).
Next, we used RT-PCR to investigate exon 18 and 18b inclusion levels in another hESC line, CA1 (Figure 1B, lane 7), and in a panel of partially or fully differentiated human cell lines (Figure 1B, lanes 8–15; refer to legend). Similar to H9 hESCs, exon 18b was highly included in CA1 (62%), whereas exon 18 was the only exon detected in differentiated cell lines. Immunoblotting confirmed that FOXP1 protein containing exon 18b is more highly expressed in hESCs and is not expressed in differentiated cells (see below and Figure S4A). These results show that FOXP1 exon 18b switches from efficient inclusion in hESCs to almost complete skipping in differentiated cells.
To further confirm that exon 18b is specifically included in self-renewing, pluripotent ESCs, we assessed exon 18b and 18 inclusion levels in H9 cells sorted for expression of the pluripotency markers TRA1-81 and SSEA-3 (Figure 1C). Partially differentiated H9 hESCs were used, such that reduced exon 18b inclusion was present (compare Figure 1C, lane 1 with lanes 1 and 2 in Figure 1B). However, in sorted cells expressing TRA1-81 and SSEA-3, exon 18b inclusion was high (Figure 1C, lane 3), whereas only minor levels of exon 18b inclusion were detected in the TRA1-81/SSEA-3-negative population (Figure 1C, lane 2). Thus, inclusion of FOXP1 exon 18b is specific to self-renewing, pluripotent hESCs. Hereafter we refer to the exon 18b splice variant as “FOXP1-ES” and the exon 18 variant as “FOXP1.”
Evolutionary Conservation of FOXP1-ES Regulation
Comparison across species reveals that human FOXP1 exons 18 and 18b are located within an ~1,000 nt genomic region that is highly conserved (PhastCons mean 0.959, variance 0.029) in 46 vertebrates (Figure 1D). This region includes 120 nt upstream of exon 18, 373 nt between exons 18 and 18b, and 205 nt downstream of exon 18b. This observation suggests that exons 18 and 18b likely have conserved patterns of AS in diverse vertebrate species. To test this, we analyzed the regulation of the orthologous exons (exons 16 and 16b) in mouse Foxp1 transcripts (Figure 1D and Figure S2B). The AS levels of exons 16 and 16b were analyzed in three undifferentiated mouse (m)ESC lines, CGR8, Hb9, and R1, and following induction into different lineages (Figure 1E and Figure S2C).
Similar to human cells, exon 16b displayed the highest inclusion in undifferentiated mESCs (Figure 1E, lanes 1 and 7; Figure S2C, lane 1), and its inclusion level progressively decreased when CGR8- or R1-derived embryoid bodies (EBs) were induced to form cardiomyocytes over a 14 day period (Figure 1E, lanes 3 to 6; Figure S2C, lanes 2 to 4). Furthermore, exon 16b inclusion decreased in day 14 CGR8- or R1-derived neural and glial progenitor-enriched neurospheres (Figure 1E, lane 2; Figure S2C, lane 5), or when Hb9 mESCs were induced to form motor neuron (MN) precursors (Figure 1E, lane 8), and was almost entirely skipped in sorted, differentiated MNs and in the neuroblastoma cell line Neuro2A (Figure 1E, lanes 9 and 10). Similar to human exon 18, mouse exon 16 displayed inclusion in all of the samples but at reduced levels relative to exon 16b in undifferentiated mESCs. Consistent with the high sequence conservation associated with exons 18b/16b and 18/16 and the surrounding intronic regions, these exons thus display conserved patterns of regulation.
FOXP1 and FOXP1-ES Have Distinct DNA-Binding Specificities
The forkhead domain of human FOXP1 overlaps exons 16 to 19. FOXP forkhead domains are highly homologous and bind
[Figure 1 graphic not reproduced. Panels: (A) exon schematic of FOXP1 exons 16 to 21 showing the FOXP1 (NM_032682), FOXP1-ES, and exon 18 plus 18b (*) isoforms; (B) and (C) RT-PCR gels with exon 18b % inclusion and ACTB controls; (D) vertebrate conservation plot across FOXP1 exons 17 to 19 and Foxp1 exons 15 to 17 (scale bar, 1 kb); (E) RT-PCR gels for CGR8 and Hb9 mESC samples with % inclusion and Gapdh controls.]
Figure 1. Identification of an Embryonic Stem Cell-Specific Splice Variant from the Human and Mouse FOXP1/Foxp1 Genes
(A) Schematic representation of exons 16 to 21 of the human FOXP1 gene. Transcripts including alternative exon 18 (blue; “FOXP1”; NM_032682) encode the widely expressed, canonical form of FOXP1, and transcripts including alternative exon 18b (red; “FOXP1-ES”) are specifically detected in hESCs. Transcripts simultaneously including exons 18 and 18b (indicated by an asterisk) are predicted to be targeted by nonsense-mediated mRNA decay and are detected at low levels in hESCs. See also Figure S2A.
(B) RT-PCR assays using primers annealing to FOXP1 exons 17 and 19 (arrows) were used to analyze FOXP1 splice isoform levels in H9 hESCs grown in the presence of MEF feeder cells (lane 1) or matrigel (lane 2), H9 hESCs induced to differentiate for 2 days toward primitive endoderm (lane 3), primitive mesoderm (lane 4), neural lineages (lane 5), or neural progenitor cells (NPCs) at day 10 (lane 6) post-induction (see also Figure S1). FOXP1 splice isoforms were also analyzed in a second hESC line (CA1, lane 7) and in eight human immortalized cell lines of diverse origin as indicated in lanes 8–15. HeLa, cervical carcinoma; IMR32, neuroblastoma; H538, lung carcinoma; A549, lung adenocarcinoma; Colo 205, colorectal carcinoma; Raji, B lymphoblastoma; Jurkat, T lymphoblastoma; 293T, “embryonic kidney.” *, isoform containing both exons 18 and 18b. ACTB mRNA levels are shown for comparison.
(C) RT-PCR analysis (as performed in panel B) of FOXP1 splice isoform levels in unsorted H9 hESCs (lane 1) and, following fluorescence-activated cell sorting (FACS), in H9 hESCs that are either double negative (lane 2) or double positive (lane 3) for the cell surface-expressed pluripotency markers TRA1-81 and SSEA-3. *, isoform containing both exons 18 and 18b. ACTB mRNA levels are shown for comparison.
(D) Conservation analysis of sequences surrounding FOXP1 human exons 18 and 18b (orthologous to exons 16 and 16b in mouse Foxp1) across 46 vertebrate species. The conservation plot was generated from the UCSC browser using the hg19 genome assembly. See also Figure S2B.
a canonical consensus motif GTAAACA as monomers and homo- and/or heterodimers (Koh et al., 2009; Li et al., 2004), and a FOXP2-DNA (Stroud et al., 2006) costructure reveals that residues directly contacting DNA are conserved in FOXP1 (highlighted in green in Figure 2A). Exon 18b is predicted to substitute 35 residues (highlighted in red in Figure 2A), none of which are predicted to alter secondary structure or dimerization (black dots show residues involved in dimerization in Figure 2A). However, of four residues that contact DNA in FOXP2, two (Asn510 and His514) are substituted in FOXP1-ES. These residues form critical hydrogen bonds with the adenine-thymine (A-T) base pair at the fourth position in the canonical FOXP site (the second A in GTAAACA), and their substitution may therefore affect binding affinity and/or specificity.
We therefore investigated the DNA-binding properties of FOXP1 and FOXP1-ES using protein-binding microarrays (PBMs; Berger et al., 2006, 2008). The PBM analysis revealed that FOXP1 and FOXP1-ES forkhead domains fused to glutathione S transferase (GST) preferentially recognize distinct DNA-binding motifs (Figure 2B and Figure S3A). The canonical binding motif GTAAACAA was represented by the majority of the highest-scoring GST-FOXP1-bound sequences (blue dots), whereas GST-FOXP1-ES preferentially bound CGATACAA or closely related sequences (red dots). Other sequences preferentially bound by FOXP1-ES contained specific C/A-rich motifs (orange dots), whereas other C/A-rich motifs were bound by both proteins (green dots) (Figure 2B and Figure S3A). We confirmed these binding preferences by gel mobility shift assays (Figure 2C and Figure S3B). For example, GST-FOXP1-ES, when compared to GST-FOXP1, preferentially bound dsDNA probes containing AATAAACA and CGATACAA (orange and red dots in Figure 2B, respectively), whereas GST-FOXP1 preferentially bound the consensus GTAAACAA.
Furthermore, GST-FOXP1 and GST-FOXP1-ES did not bind mutant versions of each of the analyzed PBM-derived binding sites (Figure 2C and Figure S3B; mutant positions underlined).
These results show that hESC-specific inclusion of exon 18b changes the DNA-binding specificity of FOXP1. Moreover, consistent with the prediction that substitution of Asn510 and His514 in FOXP1-ES would affect recognition of the fourth A-T base pair in the consensus site, FOXP1-ES bound a T-A base pair at this position in a subset of the preferentially bound PBM sequences. Additional substitutions of conserved residues at the DNA-binding interface of FOXP1-ES presumably account for other changes in the DNA-binding properties of this splice isoform, including its preferential binding to specific C/A-rich motifs. Additionally, the results from the PBM experiments and gel mobility shift assays reveal that GST-FOXP1-ES binds a broader spectrum of sequences than does GST-FOXP1, although with apparent reduced affinity, as at similar concentration ranges GST-FOXP1-ES bound less efficiently to its high-scoring PBM sequences than did GST-FOXP1 (Figure 2 and Figure S3A). Collectively, these findings suggest that FOXP1-ES and FOXP1 direct different gene expression programs in ESCs.
FOXP1 and FOXP1-ES Regulate Distinct Programs of Gene Expression in hESCs
To investigate whether FOXP1 and FOXP1-ES control different sets of genes, we performed knockdowns using custom siRNA pools targeting either exon 18 or exon 18b in undifferentiated H9 cells, followed by RNA-Seq profiling. Relative to a control siRNA pool (Figure 3A, lane 1), each siRNA pool resulted in efficient (>80%) knockdown of only the expected FOXP1 isoform (RT-PCR in Figure 3A, lanes 2 and 3, and immunoblotting in Figure S4A, lanes 4 and 5). RNA-Seq reads from each sample were then mapped to RefSeq cDNAs to establish counts of unique-mapping reads per kb per million mapped reads (RPKM; Mortazavi et al., 2008), and genes with at least 2-fold differences were further analyzed. Knockdown of FOXP1 caused changes in expression of 153 genes, whereas FOXP1-ES knockdown was more dramatic, resulting in altered expression of 472 genes, 76 of which overlapped with the FOXP1-dependent gene set (Figure 3B; Table S2 for a full analysis). Analysis by qRT-PCR of a representative set of 19 genes with predicted changes ranging from 2- to 20-fold agreed well with the RNA-Seq-derived estimates (r = 0.941; Figure S4B; see below). Of the affected genes, a significantly higher proportion showed increased expression upon FOXP1-ES knockdown versus FOXP1 (86% versus 58.2%; p = 1.63E-05, Chi-square test). Moreover, of the 76 genes affected in both knockdowns (Figure 3B), 61 (80.3%) displayed increased expression. These results suggest that in undifferentiated hESCs, FOXP1 and FOXP1-ES control distinct but overlapping sets of genes, with a substantially larger set of genes controlled by FOXP1-ES compared to FOXP1 in hESCs. Moreover, FOXP1-ES predominantly acts to suppress gene expression.
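The RPKM normalization, the 2-fold cutoff, and the gene-set overlap described above can be illustrated with a short sketch. It is a minimal illustration rather than the authors' pipeline: the function names and the read counts in the usage note are hypothetical, and only the gene counts (153, 472, 76, and 61) are taken from the text.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    kilobases = transcript_length_bp / 1_000
    millions_mapped = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions_mapped)


def differs_at_least_2_fold(rpkm_a, rpkm_b):
    """True when the larger value is at least twice the smaller one.

    Comparing without dividing avoids a zero denominator. How the study
    treated unexpressed genes is not stated, so this guard is an assumption.
    """
    low, high = sorted((rpkm_a, rpkm_b))
    return high > 0 and high >= 2 * low


# Gene counts reported in the text for the two knockdowns.
FOXP1_DEPENDENT = 153
FOXP1_ES_DEPENDENT = 472
SHARED = 76
SHARED_INCREASED = 61

foxp1_only = FOXP1_DEPENDENT - SHARED                      # 77 genes
foxp1_es_only = FOXP1_ES_DEPENDENT - SHARED                # 396 genes
affected_in_either = foxp1_only + foxp1_es_only + SHARED   # 549 genes
shared_increased_pct = round(100 * SHARED_INCREASED / SHARED, 1)  # 80.3
```

For a hypothetical 2 kb transcript with 500 unique-mapping reads in a library of 20 million mapped reads, `rpkm(500, 2000, 20_000_000)` evaluates to 12.5. The overlap arithmetic reproduces the 80.3% figure quoted for the 76 shared genes.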
A GO enrichment analysis of genes decreased upon knockdown of FOXP1 or FOXP1-ES revealed significant enrichment of terms related to early development (p < 1.21E-05, Figure 3C; Table S3 for a full analysis). Interestingly, a subset of the FOXP1-ES-dependent genes are involved in ESC pluripotency maintenance (see below and Figure 3D). Genes upregulated upon knockdown of FOXP1 were not significantly enriched in any GO category, whereas genes upregulated upon knockdown of FOXP1-ES were highly significantly enriched in GO annotations associated with development, transmembrane receptor activity, and cell differentiation (p < 2.24E-06, Figure 3C; Table S3). qRT-PCR validation confirmed that knockdown of FOXP1-ES results in an ~2-fold or greater decrease in the expression of the pluripotency genes OCT4, NANOG, NR5A2, GDF3, and TDGF1 and an ~2-fold or greater increase in expression of differentiation-associated genes including GAS1, HESX1, SFRP4, and
(E) RT-PCR analysis of Foxp1 splice isoforms in self-renewing CGR8 and Hb9 mouse (m)ESC lines (lanes 1 and 7), in CGR8 mESCs differentiated toward neural and glial progenitors (lane 2), in CGR8 mESCs aggregated to form embryoid bodies (EB) grown in conditions favoring differentiation into cardiomyocytes (EB days 2–10, lanes 3–5), and in beating cardiomyocytes (EB day 14, lane 6). Hb9 mESCs were differentiated into motor neuron (MN) progenitors (lane 8) and into mature MNs, which were FACS sorted (lane 9). Analysis of Neuro2a cells is shown in lane 10. *, isoform containing both exons 16 and 16b. Gapdh mRNA levels are shown for comparison. See also Figure S2C.
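The percent inclusion values reported for exons 18b and 16b in the RT-PCR analyses of Figure 1 can be read as the share of total isoform signal carried by transcripts containing that exon. The sketch below works under that assumption only; this section does not give the exact quantification formula, and the function name and band intensities are hypothetical.

```python
def percent_inclusion(exon_signal, other_isoform_signals):
    """Share (in percent) of total isoform signal carried by one exon's band.

    exon_signal: band intensity for transcripts containing the exon of interest.
    other_isoform_signals: intensities of the remaining isoform bands.
    """
    total = exon_signal + sum(other_isoform_signals)
    if total <= 0:
        raise ValueError("total isoform signal must be positive")
    return 100.0 * exon_signal / total
```

With hypothetical intensities of 64 for the exon 18b band and 36 for the exon 18 band, `percent_inclusion(64.0, [36.0])` evaluates to 64.0.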
[Figure 2 graphic not reproduced. Panels: (A) exon schematic (exons 15 to 19) and multiple alignment of FOXP1 and FOXP1-ES forkhead domain sequences (hFOXP1-ES, mFoxp1-ES, H. sapiens, P. troglodytes, B. taurus, M. musculus, R. norvegicus, D. rerio; residues 465 to 570); (B) scatterplot of 8-mer E scores for GST-FOXP1 versus GST-FOXP1-ES, colored by preferentially bound isoform (FOXP1, FOXP1-ES, or both); (C) EMSA gels for GST-FOXP1 and GST-FOXP1-ES with probes GTAAACAA, AATAAACA, and CGATACAA and mutant probes AATGGACA, GGACACAA, and CGCGACAA (lanes 1 to 27).]
Figure 2. FOXP1-ES Has a Distinct DNA-Binding Specificity Compared to the Canonical Form of FOXP1 (A) Multiple alignment of amino acid sequences encoding the FOXP1 and FOXP1-ES forkhead DNA-binding domains from different vertebrate species. Amino acid sequences conserved across all species analyzed are indicated in white. Amino acid changes introduced as a consequence of splicing of exon 18b in FOXP1-ES are highlighted in red. Species-specific amino acid differences occurring in the canonical form of FOXP1 are highlighted in orange. Amino acids
WNT1 (Figure 3D). Several other genes that function in pluripotency maintenance and reprogramming, including KLF4, KLF5, SOX2, C-MYC, ZSCAN10, ESRRB, REXO1, and TBX3, displayed negligible or less pronounced changes in mRNA expression upon FOXP1-ES knockdown (Figure S4C and data not shown), indicating that the decreases in OCT4, NANOG, NR5A2, GDF3, and TDGF1 expression are a specific consequence of reduced FOXP1-ES rather than an indirect effect arising from induction of differentiation. Further, consistent with the RNA-Seq analysis, knockdown of FOXP1 resulted in negligible (<1.5-fold) changes in the expression levels of these and many other FOXP1-ES-regulated genes (Figure 3D and Figure S4C). These results thus provide evidence that expression of FOXP1-ES in hESCs suppresses a large number of genes with important functions in cell differentiation and development, while promoting the expression of a specific subset of genes that support pluripotency.
Direct Binding of FOXP1-ES and FOXP1 to Regulated Target Genes
We next performed chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-Seq) to identify genes that are potentially directly regulated by FOXP1-ES and FOXP1 in H9 ESCs. Using an antibody that efficiently immunoprecipitates both isoforms, >3,400 significant ChIP-Seq peaks were detected across the human genome (Tables S4A and S4B). To assess whether these peaks are sites of FOXP1 and FOXP1-ES occupancy, we determined whether they are significantly enriched in individual PBM-derived 8-mers (Figure 2) that bind to either or both isoforms. Scatterplots directly comparing the relative under-peak enrichment and PBM scores for individual 8-mers are shown in Figure 4A (see Tables S5A and S5B for a full analysis). PBM 8-mers that bind preferentially to FOXP1-ES (orange dots) or FOXP1 (blue dots), and other 8-mers that bind to both proteins (green dots), are significantly enriched under the ChIP-Seq peaks.
In contrast, the CGATACA consensus and closely related sequences preferentially bound by FOXP1-ES in vitro do not appear to be widely utilized by this factor in vivo (Figure 4A). Previous studies have revealed examples of transcription factors that preferentially bind lower-affinity sites in vivo (Jaeger et al., 2010; Rowan et al., 2010), and this property may be important to facilitate dynamic changes in transcriptional output mediated by FOXP1-ES and FOXP1 upon induction of ESC differentiation.
Together with the analysis of genes differentially expressed upon knockdown of FOXP1 and FOXP1-ES (Table S2), the ChIP-Seq analysis provides a list of 116 candidate direct in vivo targets of these proteins (Table S5C for a full list). GO enrichment analysis showed that these target genes are significantly enriched in terms associated with early development (p < 7.4E-11) and cell differentiation (p < 1.7E-06) (Table S5D). Examples of such direct target candidate genes are shown in Figure 4B. Importantly, the data support OCT4 and NANOG as possible direct targets of FOXP1-ES, as the promoters of these genes are proximal to peaks containing 8-mers that preferentially bind FOXP1-ES in vitro.
Further supporting a possible direct role for FOXP1-ES in regulating OCT4, we observed a statistically significant overlap between the RNA-Seq-profiled genes that are dependent on FOXP1-ES for expression in H9 hESCs (Figure 3D) and a set of genes previously reported (Kunarso et al., 2010) to be both directly bound and regulated by OCT4 in H1 hESCs (Figure 4C; p = 0.0016, Chi-square test). The majority (26/33) of these overlapping genes show changes in the same direction upon knockdown of either factor (data not shown). In contrast, genes stimulated or repressed by FOXP1 in H9 hESCs did not significantly overlap with the previously reported OCT4 target genes (Figure 4C). Collectively, the results suggest that FOXP1-ES may regulate ESC self-renewal and pluripotency maintenance by directly controlling the expression of a subset of key pluripotency genes.
Foxp1-ES Expression Promotes mESC Self-Renewal and Pluripotency
Mouse Foxp1-ES, like human FOXP1-ES, is specifically expressed in mESCs and stimulates the expression of Oct4 and Nanog (Figure 1 and data not shown).
Therefore, we hypothesized that Foxp1-ES is required for mESC self-renewal and pluripotency. To test this, we first asked whether ectopic Foxp1-ES expression suppresses mESC differentiation. CGR8 mESC lines stably expressing 3×Flag-Foxp1-ES or 3×Flag-Foxp1 isoforms at levels comparable to endogenous protein and under Doxycycline (Dox)-inducible control (Figure S5A)
predicted to contact DNA (based on the cocrystal structure of FOXP2 bound to its recognition site; Stroud et al., 2006) are indicated in green, residues that are the most highly conserved across Forkhead protein family members are indicated by black arrowheads above the alignment, and residues indicated with a black dot are involved in dimerization. (B) Protein-binding microarray (PBM) analysis of the DNA-binding preferences of GST-FOXP1 and GST-FOXP1-ES forkhead domain fusion proteins. Relative binding preferences measured as anti-GST fluorescence signal intensity are represented as "E scores" (Berger et al., 2006). The scatterplot directly compares E scores for GST-FOXP1 and GST-FOXP1-ES, after averaging data from two independent experiments. Sequences of probes with E scores > 0.45 in at least one of the two repeat experiments were clustered to derive consensus binding sites. Blue dots represent all probe sequences containing the consensus GTAAACA, which is preferentially bound by GST-FOXP1; red dots represent all probe sequences containing the consensus sequences CGATACA, CAATACA, or TGATACA, which are preferentially bound by GST-FOXP1-ES; orange and green dots represent C/A-rich motifs preferentially bound by GST-FOXP1-ES or by both isoforms, respectively; gray dots indicate all other probe sequences with E scores < 0.45. A full version of the scatterplot is shown in Figure S3A. (C) Electrophoretic mobility shift assay (EMSA) validating PBM-derived consensus DNA-binding sites for FOXP1 and FOXP1-ES. Radiolabeled dsDNA probes containing two copies of GTAAACAA (top left panel), AATAAACA (top middle panel), or CGATACAA (top right panel) or two copies of mutant versions of these sequences (bottom panels) were incubated in the absence (lanes 1, 10, and 19) or in the presence of increasing amounts (0.2 to 3.2 pmol) of recombinant GST-FOXP1 or GST-FOXP1-ES proteins, as indicated. Positions mutated in the probe sequences are highlighted in black and underlined.
Shifted protein-dsDNA complexes are indicated by arrows, and free dsDNA probe is indicated by an asterisk. Additional EMSA experiments assaying other PBM-derived preferred binding sites for FOXP1 and FOXP1-ES are shown in Figure S3B.
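The probe-selection rule described in (B), in which sequences with E scores > 0.45 in at least one of the two repeat experiments are retained and replicate data are averaged, can be sketched in Python as follows. This is only a minimal illustration: the function name, the dictionary layout, and the example E scores are hypothetical and are not taken from the study, and the clustering step that derives consensus sites is not shown.

```python
def select_significant_8mers(rep1, rep2, threshold=0.45):
    """Keep 8-mers whose E score exceeds `threshold` in at least one replicate.

    rep1, rep2: dicts mapping an 8-mer string to its E score in each replicate
    (an assumed data layout). Returns a dict mapping each retained 8-mer to the
    mean of its two replicate E scores.
    """
    selected = {}
    # Only 8-mers measured in both replicates are considered.
    for kmer in rep1.keys() & rep2.keys():
        if rep1[kmer] > threshold or rep2[kmer] > threshold:
            selected[kmer] = (rep1[kmer] + rep2[kmer]) / 2
    return selected


# Hypothetical E scores, for illustration only.
rep1 = {"GTAAACAA": 0.49, "CGATACAA": 0.44, "ACGTACGT": 0.10}
rep2 = {"GTAAACAA": 0.48, "CGATACAA": 0.47, "ACGTACGT": 0.12}
hits = select_significant_8mers(rep1, rep2)
```

In this toy input, CGATACAA is retained because it passes the threshold in one replicate even though it falls below it in the other, which mirrors the "at least one of the two" wording of the legend.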
Cell 147, 132–146, September 30, 2011 ©2011 Elsevier Inc. 137
[Figure 3 graphic. Panel A: RT-PCR gel of exon 17-18b-19 and exon 17-18-19 products with ACTB control (lanes 1–3) and % knockdown. Panel B: Venn diagram and bar graph of % of genes with increased or decreased expression for exon 18 siRNAs, exon 18b siRNAs, and exons 18 and 18b siRNAs. Panel C: enriched Gene Ontology categories with p values for genes up- or down-regulated by exon 18 and exon 18b siRNAs. Panel D: log2 expression fold change relative to H9 ctl siRNAs for exon 18 and exon 18b siRNAs.]
Figure 3. Knockdown of FOXP1 and FOXP1-ES Affects the Expression of Distinct Sets of Genes in hESCs (A) RT-PCR analysis of FOXP1 and FOXP1-ES splice isoforms in H9 hESCs transfected with a control, nontargeting siRNA pool (lane 1), an siRNA pool targeting exon 18b (lane 2), and an siRNA pool targeting exon 18 (lane 3). ACTB mRNA levels are shown as a loading/recovery control. The corresponding western blot analysis is shown in Figure S4A. (B) Top: Venn diagram showing numbers of genes with estimated 2-fold to 10.8-fold transcript level changes between the FOXP1 (blue circle) or FOXP1-ES (red circle) knockdowns and the control knockdown samples shown in (A). Bottom: Bar graph showing proportions of genes with up- (black fill) or downregulation (white fill) in the gene sets affected by siRNA knockdown of exon 18- or exon 18b-containing transcripts. Genes with transcript changes affected in both knockdowns are also indicated (bar with gray outline).
were aggregated to form EBs under conditions that favor neural cell differentiation. In the absence of Dox, all three cell lines supported neural differentiation, as revealed by the appearance of cells with neuronal morphology that immunostained with an antibody to the neuronal marker β-III tubulin (Figures 5Aa, 5Ac, and 5Ae). β-III tubulin-positive neurons were also observed in the control line and in 3xFlag-Foxp1-expressing cells after Dox stimulation (Figures 5Ab and 5Ad, respectively). In marked contrast, overexpression of 3xFlag-Foxp1-ES almost completely abolished neural cell differentiation (Figure 5Af), and only the 3xFlag-Foxp1-ES cells showed prominent Oct4 immunostaining (compare Figure 5Al with Figures 5Ag–5Ak). Furthermore, knockdown of Foxp1 did not significantly impact proliferation, whereas knockdown of Foxp1-ES reduced formation of CGR8 mESC colonies by 3-fold (Figures S5B–S5D). Altogether, these results provide evidence that Foxp1 promotes mESC differentiation, whereas expression of Foxp1-ES prevents differentiation and is required for mESC self-renewal.
To further establish whether Foxp1-ES expression is required for the maintenance of stem cell identity, the 3xFlag-Foxp1- or 3xFlag-Foxp1-ES-expressing CGR8 cell lines were cultured in the presence of different amounts of leukemia inhibitory factor (LIF), a cytokine that is required for pluripotency maintenance of mESCs. Both cell lines were cultured with excess LIF, which supports mESC self-renewal (Figure 5B, LIF 1:1, continuous lines), or with 10% of this amount, which is insufficient to prevent cell differentiation (Figure 5B, LIF 1:10, dashed lines). In the absence of Dox, reduced LIF led to a decrease (50% of total) in the number of Oct4-positive cells after four cell passages (Figure 5B, right panels) and reduced cell division rates (Figure 5B, white dashed lines in left panels).
However, Dox-induced overexpression of either Foxp1 isoform in the presence of standard LIF concentrations resulted in increased rates of cell division (Figure 5B, solid black lines), and the majority (>80%) of cells remained Oct4 positive. In contrast, in reduced LIF, 3xFlag-Foxp1 expression did not prevent cell differentiation, with cell division declining after two passages, and only 40% of the cells remaining Oct4 positive after four passages (Figure 5B, blue panel, black dashed lines). Under the same reduced LIF conditions, overexpression of 3xFlag-Foxp1-ES prevented loss of pluripotency characteristics, as more than 90% of the cells remained Oct4 positive after four passages and cell division rates were comparable to controls grown in standard amounts of LIF (Figure 5B, red panel, black dashed lines).
To further assess whether Foxp1-ES but not Foxp1 promotes pluripotency, we cultured the 3xFlag-Foxp1- and 3xFlag-Foxp1-ES-expressing cell lines in the absence of exogenous LIF, with or without Dox. As expected, the two cell lines rapidly differentiated in the absence of Dox, and the 3xFlag-Foxp1
line could not be maintained in culture beyond five or six passages even in the presence of Dox (data not shown). Strikingly, the 3xFlag-Foxp1-ES-expressing cells continued to grow for over 30 passages in the absence of LIF. We refer to these cells as 3xFlag-Foxp1-ESΔLIF. qRT-PCR analysis confirmed that these cells express Oct4, Nanog, and Nr5a2 at levels comparable to the parental CGR8 cells, but they display reduced levels of Sox2, Klf4, and LifR (Figure 5C). The 3xFlag-Foxp1-ESΔLIF cells were then aggregated to form EBs and cultured under conditions that favor neural differentiation. As before, in the absence of Dox, the cells adopted neuronal morphology, expressed β-III tubulin, and displayed negligible Oct4 expression (Figure S5E, left panel). Finally, when injected subcutaneously in mice, the 3xFlag-Foxp1-ESΔLIF CGR8 mESCs formed teratomas that reproduce all three germ cell types in vivo (Figure 5D and Figure S5F). Collectively, these results support the conclusion that increased expression of Foxp1-ES, but not of Foxp1, promotes the maintenance of CGR8 mESCs in a pluripotent state.
Foxp1-ES Is Required for Efficient iPSC Formation
We next asked whether Foxp1-ES expression is important for the formation of iPSCs from mouse embryonic fibroblasts (MEFs). For this experiment, secondary MEFs (2°-6C MEFs) were employed that contained integrated piggyBac transposons expressing, under Dox-inducible control, the four Yamanaka transcription factors "OKMS" (Oct4, Klf4, c-Myc, and Sox2) required for iPSC reprogramming (Takahashi and Yamanaka, 2006; Woltjen et al., 2009). In the presence of Dox, the 2°-6C MEFs efficiently form secondary iPSCs (2°-6C iPSCs) that are pluripotent (Woltjen et al., 2009).
Consistent with a key role for Foxp1-ES in the maintenance of mESC pluripotency, RT-PCR assays showed that Foxp1 exon 16b is almost completely skipped in primary MEFs but is included to 32% in iPSCs, which is comparable with its inclusion level in mESCs (Figure 6A and Figure S6A, lanes 1 and 2). During 2°-6C MEF reprogramming, Foxp1 exon 16 is predominantly included at early stages but displays progressively decreased inclusion toward the end of reprogramming (compare days 2–21 in Figures 6A and 6B and Figure S6A). Conversely, exon 16b is weakly included (<4%) at the earliest stages of reprogramming (lanes 3–5) but is efficiently included at later stages (days 5–16), reaching the highest level of inclusion (37%) in 2°-6C iPSCs (Figures 6A and 6B and Figure S6A).
We next investigated whether Foxp1 and Foxp1-ES are important for iPSC formation. Each isoform was selectively knocked down using siRNA pools specific for either exon 16b or exon 16 sequences, and siRNAs comprising these pools that produced the most efficient isoform-specific knockdown (Figure S6B) were then used in pairs to validate results. Each
(C) Gene Ontology (GO) annotations enrichment analysis performed on sets of genes displaying increased or decreased transcript levels following siRNA knockdown of exon 18- and exon 18b-containing splice isoforms. The top four most enriched annotations are shown for each gene set with corresponding p values, corrected using the Benjamini false-discovery rate. The full analysis is shown in Table S3. (D) qRT-PCR assays validating RNA-Seq predictions (see Table S2) of 2-fold or greater changes in transcript levels from the pluripotency-associated genes (OCT4, TDGF1, NR5A2, NANOG, GDF3, and FGF4) and differentiation-associated genes (GAS1, CITED2, WNT1, HESX1, BIK, and SFRP4) following siRNA knockdown of FOXP1-ES and FOXP1 in H9 hESCs. Measurements are relative to levels detected with a control siRNA pool and represent averages from three independent analyses; standard deviations (SDs) are indicated. See also Figures S4B and S4C.
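The multiple-testing correction mentioned in (C) can be illustrated with a short Python sketch of the standard Benjamini-Hochberg step-up adjustment. That the enrichment software used exactly this variant is an assumption, and the function name and the example p values below are hypothetical rather than values from Table S3.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    The i-th smallest raw p value (1-based rank i of m tests) is scaled by
    m / i, and monotonicity is enforced by taking a running minimum from the
    largest rank downward.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values for four GO annotations.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

For these four inputs the adjusted values work out to 0.02, 0.04, 0.04, and 0.02, so annotations that are close in raw significance end up sharing an adjusted value.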
[Figure 4 graphic. Panel A: scatterplots of 8-mer enrichment (Z-score) against FOXP1 E score and FOXP1-ES E score, with labeled 8-mers including GTAAACAA, GGTAAACA, TGTAAACA, TAAACAAA, ATACAAAA, and ACAAAACA. Panel B: ChIP-Seq tracks (2 kb scale bars) at OCT4 (POU5F1), NANOG, BIK, HOXC6, GAS1, and CITED2. Panel C: bar graph of % of genes co-regulated by OCT4 among genes with increased or decreased expression after exon 18 or exon 18b siRNA treatment.]
Figure 4. Chromatin Immunoprecipitation and High-Throughput Sequencing Analysis of FOXP1/FOXP1-ES Target Genes in hESCs (A) ChIP-Seq analysis in H9 hESCs was performed using a pan-FOXP1 isoform-specific antibody. The scatterplots compare relative enrichment scores for PBM-derived FOXP1 and FOXP1-ES 8-mer binding sequences under ChIP-Seq peaks and PBM-derived binding strengths. Z scores were calculated by counting motif occurrences in peak sequences relative to occurrences after randomizing the same peak sequences 100,000 times. PBM 8-mer sequences that bind preferentially to FOXP1, FOXP1-ES, or both proteins are colored as in Figure 2B. See Table S5A for a full analysis. (B) Representative tracks showing locations of FOXP1/FOXP1-ES ChIP-Seq peaks proximal (± 20 kb of the transcription start site) to genes that display a 2-fold or greater change in mRNA expression upon knockdown of FOXP1 isoforms. See also Table S4 and Table S5. (C) Bar graph representing the percentage of genes up- or downregulated in response to FOXP1 or FOXP1-ES siRNA knockdown in H9 hESCs, which are experimentally supported (based on combined ChIP and knockdown-expression analysis; Kunarso et al., 2010) targets of OCT4. OCT4 target genes only significantly overlap those genes showing decreased but not increased expression following knockdown of FOXP1-ES (p = 0.0016; Chi-square test).
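The Z-score calculation summarized in (A) can be approximated by the following Python sketch, which compares the observed motif count in a set of peak sequences with the counts obtained after shuffling those same sequences. Several details are assumptions made for illustration: the legend does not state how sequences were randomized (a simple per-sequence nucleotide shuffle is used here), only the given strand is scanned, and the function names and the default of 1,000 shuffles (rather than 100,000) are hypothetical choices to keep the example fast.

```python
import random
import statistics


def count_motif(sequences, motif):
    """Count occurrences of `motif` (overlaps allowed) across all sequences."""
    k = len(motif)
    return sum(
        1
        for seq in sequences
        for i in range(len(seq) - k + 1)
        if seq[i:i + k] == motif
    )


def motif_z_score(sequences, motif, n_shuffles=1000, seed=0):
    """Z score of the observed motif count against a shuffled-sequence null.

    Each shuffle permutes the nucleotides within every sequence, preserving
    length and base composition (an assumed randomization scheme).
    """
    rng = random.Random(seed)
    observed = count_motif(sequences, motif)
    null_counts = []
    for _ in range(n_shuffles):
        shuffled = ["".join(rng.sample(seq, len(seq))) for seq in sequences]
        null_counts.append(count_motif(shuffled, motif))
    mean = statistics.fmean(null_counts)
    sd = statistics.pstdev(null_counts)
    if sd == 0:
        return float("nan")  # Z score is undefined when the null never varies
    return (observed - mean) / sd
```

With real peak sets, the number of shuffles trades run time against the stability of the null mean and standard deviation, which is presumably why a large number of randomizations was used in the analysis.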
[Figure 5 graphic. Panel A: immunostaining for β-III tubulin, Oct4, and Hoechst in CGR8-rtTA, 3xFlag-Foxp1, and 3xFlag-Foxp1-ES lines with and without Dox (images a–l). Panel B: cumulative difference in cell cycle numbers versus number of passages, and % Oct4-positive cells versus LIF concentration (1:1 and 1:10), for 3xFlag-Foxp1 and 3xFlag-Foxp1-ES with and without Dox. Panel C: gene expression relative to mES CGR8 cells for Oct4, Nanog, Nr5a2, Sox2, Klf4, and LifR in 3xFlag-Foxp1-ES ΔLIF cells. Panel D: teratoma sections showing endodermal, mesodermal, and ectodermal tissues (images a–f).]
Figure 5. Expression of Foxp1-ES but Not Foxp1 Promotes Pluripotency Maintenance of mESCs (A) CGR8 mESC lines expressing 3xFlag-Foxp1 or 3xFlag-Foxp1-ES under Doxycycline (Dox)-inducible control (see Figure S5A) and the parental line used to generate these two cell lines (CGR8-rtTA) were aggregated to form embryoid bodies (EBs) and then cultured under conditions promoting neural differentiation. The cultured EBs were treated with or without Dox and then immunostained for β-III tubulin (neural marker) or Oct4 (pluripotency marker). Nuclei were stained with Hoechst. (B) Quantification of CGR8 mESC proliferation in response to Dox-induced expression of 3xFlag-Foxp1 or 3xFlag-Foxp1-ES in the presence of excess LIF (LIF 1:1), which promotes mESC self-renewal, or in the presence of concentrations of LIF that are insufficient for promoting mESC self-renewal (LIF 1:10). Left panels show cell growth rates calculated as the cumulative difference in cell-cycle numbers relative to the control condition (LIF 1:1) without Dox-induced expression of the 3xFlag-Foxp1(-ES) transgenes. Right panel: Quantification of the proportions of cells expressing Oct4 under the different growth conditions indicated after four cell passages. Quantifications represent four independent analyses and SDs are indicated.
isoform-specific pool resulted in a selective reduction (>65%) of only its respective target mRNA isoform (Figure 6C). Subsequently, siRNAs were transfected into 2°-6C MEFs at day 0 or day 13 during reprogramming, and the cells were harvested 5 and 3 days later, respectively. Reprogramming colonies were either collected and analyzed by flow cytometry to quantify cells positive for both SSEA-1 and GFP, which marks the reprogramming population (Figure 6D), or fixed and imaged by confocal microscopy (Figure 6E). Although not all SSEA-1-expressing cells eventually progress to iPSCs, SSEA-1 expression during the initiation phase of reprogramming made it ideal for assessing the early effects of knockdown of Foxp1-ES and Foxp1. This also afforded more reliable quantification of reprogramming initiation compared to markers such as Nanog, which are expressed at later stages (data not shown). As expected, in the absence of OKMS expression, no SSEA-1/GFP-positive cells formed compared to Dox-induced cultures, which showed robust initiation of SSEA-1 expression. However, transfection of 2°-6C MEFs with siRNAs targeting Oct4 reduced the population of SSEA-1/GFP-positive cells by 5-fold (Figures 6D and 6E). Importantly, knockdown of Foxp1-ES resulted in a comparable (4-fold) reduction in SSEA-1/GFP-positive cells, whereas knockdown of Foxp1 had little to no effect (Figures 6D and 6E). Knockdown of Foxp1-ES (or Oct4) at day 13 also significantly reduced the proportion of SSEA-1/GFP-positive cells (Figure S6C).
Finally, we asked whether overexpression of either Foxp1-ES or Foxp1, together with OKMS, differentially affects primary MEF reprogramming. Although overexpression of Foxp1-ES with OKMS factors did not substantially alter the efficiency of formation of SSEA-1-positive colonies, overexpression of Foxp1 completely blocked OKMS induction of AP- and SSEA-1-positive colonies (Figure S6D; Extended Experimental Procedures and data not shown).
Taken together with the results described earlier, these data provide evidence that the AS-mediated switch controlling Foxp1-ES expression is critical for efficient iPSC formation, as well as for the maintenance of ESC self-renewal and pluripotency.
DISCUSSION
Previous investigations of gene regulatory networks that control ESC self-renewal and pluripotency and iPSC reprogramming have largely focused on the roles of transcription factors, chromatin remodeling, and noncoding RNAs in these processes. An important aspect of our findings is the observation that an AS switch controlling the expression of the FOXP1-ES splice isoform is integral to the control of the highly interconnected transcriptional regulatory network required for ESC pluripotency and iPSC reprogramming (Figure 7; refer to Introduction). This
regulatory paradigm is reminiscent of AS events with pivotal roles in the control of transcription factors involved in Drosophila sex determination, courtship behavior, and eye development (Demir and Dickson, 2005; Fic et al., 2007; Förch and Valcárcel, 2003). Moreover, additional AS events have recently been reported to influence the activity of transcription or signaling factors implicated in the control of pluripotency genes (Mayshar et al., 2008; Rao et al., 2010b; Salomonis et al., 2010; see Introduction). Thus, a small number of AS events have the capacity to dramatically impact the wiring of transcriptional networks and other processes with critical regulatory functions in pluripotency and early development.
Our results extend recent reports establishing critical roles for FOXP1/Foxp1 in the specification of cell lineages in early development. Foxp1 has been reported to coordinate the balance between cardiomyocyte proliferation and differentiation through lineage-specific regulation of Fgf ligands and the Hox protein Nkx2.5 (Zhang et al., 2010) and to promote midbrain identity in mESC-derived dopamine neurons through direct regulation of the homeobox protein Pitx3 (Konstantoulas et al., 2010). It also coordinates the expression of other Hox proteins required for columnar organization of spinal motor neurons (Rousso et al., 2008). Interestingly, FOXP1 together with several other transcription factors has been reported to promote the self-renewal and differentiation potential of mesenchymal stem cells (Kubo et al., 2009), and it has also been implicated in the transition between pro- and pre-B cells during B cell maturation (Hu et al., 2006; Rao et al., 2010a). Taken together with our findings, it is apparent that the differential regulation of FOXP1/Foxp1 and its isoforms can have a profound impact on transitions between cell proliferation, lineage specification, and differentiation in multiple biological contexts.
In future studies, it will be of considerable interest to elucidate the mechanisms responsible for the regulation of FOXP1-ES expression. In particular, it will be important to establish which splicing factors control the inclusion of FOXP1/Foxp1 exons 18/16 and 18b/16b, and how these factors themselves are differentially regulated in ESCs and differentiated cells, so as to govern the transcriptional networks that regulate ESC self-renewal and pluripotency.
EXPERIMENTAL PROCEDURES
Microarray Hybridization, Data Extraction, and Analysis
Total RNA was extracted from ESCs using TRI reagent (Sigma-Aldrich) as per the manufacturer's recommendations. Poly(A)+ mRNA was purified using Nucleotrap Midiprep kits (Clontech). cDNA was synthesized using the WT-Ovation RNA Amplification System (Nugen) and was hybridized to custom AS microarrays as described previously (Pan et al., 2004). Data analysis was performed essentially as described previously (Pan et al., 2004) (S. Mavadadi, J. Calarco, X.W., B.J.B., Q.P., and Q. Morris, unpublished data).
(C) qRT-PCR analysis of transcript expression from genes involved in pluripotency maintenance in Dox-treated CGR8 mESCs expressing 3xFlag-Foxp1-ES and grown in the absence of LIF (ΔLIF) for 24 passages. Average expression levels of Oct4, Nanog, Nr5a2, Sox2, Klf4, and LifR in CGR8 3xFlag-Foxp1-ES ΔLIF cells are shown relative to the average expression levels of the same genes in the parental CGR8 mESCs, cultured in parallel in the presence of 1:1 LIF. Expression ratios are average measurements from three independent analyses; positive SDs are indicated. (D) Teratoma assay assessing the pluripotency potential of mouse CGR8 3xFlag-Foxp1-ES ΔLIF cells (see panel C). Hematoxylin and eosin staining of teratoma sections detected all three embryonic germ layer-derived tissues. Endodermal derivatives: ciliated respiratory (a) and intestine-like epithelium (b); mesodermal derivatives: muscle (c) and cartilage (d); ectodermal derivatives: neuronal (e) and skin epithelial cell (f). Bar = 50 μm. See also Figures S5E and S5F.
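The growth measure plotted in panel (B), the cumulative difference in cell-cycle numbers relative to a control condition, can be sketched in Python as follows. How cell-cycle numbers were derived from cell counts is not described in this section, so the sketch assumes that the number of cycles in a passage is the base-2 logarithm of cells harvested over cells seeded; the function names and the counts in the example are hypothetical.

```python
import math
from itertools import accumulate


def cell_cycles(seeded, harvested):
    """Population doublings in one passage (assumed definition)."""
    return math.log2(harvested / seeded)


def cumulative_cycle_difference(condition, control):
    """Cumulative difference in cell-cycle numbers, passage by passage.

    condition, control: lists of (cells_seeded, cells_harvested) tuples, one
    tuple per passage. Negative values mean the condition divided more slowly
    than the control up to that passage.
    """
    cond_total = accumulate(cell_cycles(s, h) for s, h in condition)
    ctrl_total = accumulate(cell_cycles(s, h) for s, h in control)
    return [c - k for c, k in zip(cond_total, ctrl_total)]


# Hypothetical counts for two passages.
control = [(100_000, 800_000), (100_000, 800_000)]    # 3 doublings per passage
reduced_lif = [(100_000, 400_000), (100_000, 200_000)]  # 2, then 1 doubling
diff = cumulative_cycle_difference(reduced_lif, control)
```

Here the condition falls one cycle behind the control after the first passage and three cycles behind after the second, which is the kind of downward trajectory the reduced-LIF curves in the figure describe.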
[Figure 6 graphic. Panel A: RT-PCR gel of Foxp1, Foxp1-ES, Oct4, Sox2, and Gapdh in MEFs, iPSCs, and the 2°-6C reprogramming time course with and without Dox (lanes 1–9). Panel B: log2 of relative mRNA expression of Foxp1 and Foxp1-ES in 2°-6C MEFs, at days 2, 5, 11, 16, and 21, and in 2°-6C iPSCs. Panels C and D: bar graphs of relative mRNA expression (%) of Foxp1 and Foxp1-ES and of % of SSEA1/GFP-positive cells at day 5 for mock, ctl siRNA, exon16 siRNA, exon16b siRNA, and Oct4 siRNA conditions. Panel E: SSEA-1 and DAPI images (a–d).]
Figure 6. Foxp1-ES Is Required for Efficient Reprogramming of MEFs to iPSCs (A) Semiquantitative RT-PCR analysis of the endogenous expression levels of Foxp1, Foxp1-ES, Oct4, and Sox2 during the course of reprogramming of secondary MEF-6C cells into secondary iPSC colonies (2°-6C iPSCs). Induction of Oct4, Klf4, c-Myc, and Sox2 transcription factors by addition of Dox at day 0 (2°-6C MEFs) was followed by monitoring transcript levels 2, 5, 11, 16, 21, and 30 days (2°-6C iPSCs) post Dox induction. Gapdh mRNA levels are shown as a loading/recovery control. (B) Bar graph showing the relative levels of expression of endogenous transcripts encoding Foxp1 and Foxp1-ES during reprogramming of 2°-6C MEFs. The levels of expression of Foxp1 and Foxp1-ES were normalized to Gapdh expression levels at each time point and represented as log2 ratios relative to the levels of Foxp1 and Foxp1-ES detected in 2°-6C MEFs and 2°-6C iPSCs, respectively. Positive SDs are indicated. (C) Bar graph showing the relative expression of Foxp1 and Foxp1-ES isoforms following transfection of siRNA pools. Cells were either mock-transfected or transfected with siRNA pools specific for Foxp1 exon 16, Foxp1-ES exon 16b, or siRNA pools specific for Oct4. Expression levels were determined by
were aligned to generate consensus sequences using enoLOGOS (Workman et al., 2005).
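[Figure 7 graphic: schematic of FOXP1 exons 17, 18, 18b, and 19. The FOXP1-ES isoform is linked to pluripotency genes (e.g., OCT4, NANOG) and differentiation genes under PLURIPOTENCY MAINTENANCE; the FOXP1 isoform is linked to pluripotency genes and differentiation genes under DIFFERENTIATION.]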
Figure 7. Model for the Role of an AS Switch in Controlling Transcriptional Networks Required for the Regulation of ESC Pluripotency and Differentiation In pluripotent ESCs or iPSCs, inclusion of FOXP1 exon 18b results in the expression of FOXP1-ES, which preferentially binds to a distinct set of DNA motifs. This event promotes the expression of key transcription factors including OCT4 and NANOG required for the maintenance of pluripotency, and it represses genes required for ESC differentiation. During differentiation, exon 18b is entirely skipped, resulting in the exclusive inclusion of exon 18 and expression of the "canonical" FOXP1 isoform. This leads to a change in DNA recognition, a consequence of which is reduced expression of pluripotency genes and increased expression of genes required for differentiation.
RNA-Seq Data Generation and Analysis
H9 hESCs were transfected with siRNA pools (Dharmacon) using DharmaFECT as per the manufacturer's recommendations, transfected again after 2 days, then harvested 2 days later. Total RNA from two independent transfections was pooled and submitted to Illumina Inc. for mRNA sequencing. RNA-Seq analysis was performed essentially as described previously (Pan et al., 2008).
Reverse-Transcription-Polymerase Chain Reaction Assays
RT-PCR assays were performed using the OneStep kit (QIAGEN) as described previously (Calarco et al., 2007). For qRT-PCR assays, cDNA from 2 μg total RNA was synthesized using SuperScript III Reverse Transcriptase (Invitrogen) as per the manufacturer's recommendations. Reactions were performed in a 384-well format using 20 ng of cDNA and FastStart Universal SYBR Green Master (Rox) (Roche Applied Science). Primer sequences are available upon request.
Gel Mobility Shift Assays
dsDNA probes contained two copies of representative PBM-derived binding sequences separated by two cytosines, or mutated derivatives of these sequences. Gel shift assays were performed as described in Hellman and Fried (2007).
ChIP-Seq
ChIP-Seq experiments were performed as described previously (Schmidt et al., 2009), using an anti-FOXP1 (Abcam) antibody and 5 × 10^7 H9 hESCs per sample. Genomic DNA Sample Prep Kits (Illumina) were used to prepare dsDNA libraries from fragmented immunoprecipitated and total DNA as per the manufacturer's protocol, and libraries were sequenced using a HiSeq machine (Illumina).
Immunofluorescence Microscopy
CGR8 cells were analyzed by immunofluorescence microscopy using polyclonal anti-β-III tubulin (Sigma-Aldrich) and murine monoclonal anti-Oct4 (Pierce), were stained with Hoechst dye (Sigma-Aldrich), then mounted with Aqueous mounting Medium (Permafluor). Images were acquired by epi-fluorescence imaging as previously described (Samavarchi-Tehrani et al., 2010).
iPSC Reprogramming Assays and Imaging
Secondary (6C) MEFs harboring OKMS transgenes under tetracycline-inducible control were derived using the piggyBac system as previously described (Woltjen et al., 2009). In brief, 2°-6C MEFs were cultured on collagen-coated plates and expression of the transgenes was induced on day 0 of reprogramming using 1.5 μg/ml Dox in standard mouse ESC media. For knockdown experiments, single or pooled siRNAs (Dharmacon/ThermoFisher) were transfected into the 2°-6C MEFs at day 0 or at day 13 after induction, as previously described (Samavarchi-Tehrani et al., 2010). Cells were then cultured for another 3 or 5 days prior to analysis by flow cytometry, immunostaining, or isolation of total RNA for qPCR analysis, also as described in Samavarchi-Tehrani et al. (2010).
ACCESSION NUMBERS
Preprocessed probe intensity scores for all AS and PB microarray data, and short read sequence data, are available from the GEO database under accession number GSE30992.
SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures, six figures, and five tables and can be found with this article online at doi:10.1016/j.cell.2011.08.023.
ACKNOWLEDGMENTS
Protein-Binding Microarrays and Data Analysis GST-FOXP1 and GST-FOXP1-ES were analyzed on PBMs as described previously (Badis et al., 2009), and the resulting data were processed as described in Lam et al. (2011). 8-mers with an E score > 0.45 in at least one of the two experimental repeats were considered significant (Berger et al., 2008) and
B.J.B. dedicates this paper to the memory of S. Levine (1911–2011). We are grateful to D. Schmidt, D. Odom, Q. Morris, N. Barbosa-Morais, S. Mavadadi, and H. van Bakel for advice on data analysis and to A. Golipour for assistance with the iPSC reprogramming experiments. L. Attisano, D. Geschwind, and
semiquantitative RT-PCR assays, normalized to Gapdh levels and relative to the expression levels of the same transcripts in the mock-transfected control. Positive SDs are indicated. See also Figure S6B. (D) Bar graph showing relative proportions of flow cytometry-sorted, reprogramming 2°-6C MEFs that are double-positive for GFP and the ESC/iPSC marker SSEA-1. 2°-6C MEFs were Dox treated to induce OKMS factors and transfected with siRNA pools indicated in (C) at day 0, then were analyzed by flow cytometry and immunostaining 5 days later. Results from analyzing the effects of transfecting the same siRNA pools at day 13 of reprogramming are shown in Figure S6C. Range over mean values for two independent analyses are indicated. (E) Representative images of SSEA-1- and DAPI-stained cells at day 5 following Dox induction of OKMS factors and post-transfection of siRNA pools as described in (C) and (D).
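The normalizations described in panels (B) and (C), in which transcript levels are divided by Gapdh levels and then expressed relative to a reference sample, either as a log2 ratio or as a percentage, can be written as a short Python sketch. The function names, argument names, and signal values below are hypothetical; the legend does not specify how band or signal intensities were quantified.

```python
import math


def log2_relative_expression(target, gapdh, ref_target, ref_gapdh):
    """Log2 ratio of Gapdh-normalized expression versus a reference sample."""
    normalized = target / gapdh
    reference = ref_target / ref_gapdh
    return math.log2(normalized / reference)


def percent_of_control(target, gapdh, ctl_target, ctl_gapdh):
    """Gapdh-normalized expression as a percentage of the control sample."""
    return 100.0 * (target / gapdh) / (ctl_target / ctl_gapdh)


# Hypothetical signal intensities (arbitrary units).
day16_vs_mef = log2_relative_expression(
    target=400.0, gapdh=100.0, ref_target=100.0, ref_gapdh=100.0
)
knockdown_pct = percent_of_control(
    target=30.0, gapdh=100.0, ctl_target=100.0, ctl_gapdh=100.0
)
```

In this example a 4-fold increase over the reference gives a log2 ratio of 2, and a transcript at 30% of the mock-transfected level corresponds to a 70% reduction.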
K.-H. Krause kindly provided valuable reagents, and A. Saltzman, B. Raj, J. Calarco, J. Ellis, H. Han, N. Barbosa-Morais, and M. Q.-Vallie`res provided helpful comments on the manuscript. Our research was supported by grants from the Canadian Institutes for Health Research (CIHR) to B.J.B., J.L.W., A.N., P.W.Z., and T.R.H., a grant from the Canadian Cancer Society to B.J.B., a grant from Genome Canada (through the Ontario Genomics Institute) to B.J.B. and others, a grant from the Ontario Ministry of Research and Innovation to A.N., and a grant from the Ontario Research Fund to J.L.W., B.J.B., and others. E.O.M., S.N., and H.W. were supported by NIH grant P01 NS055923. E.O.M. is the David and Sylvia Lieb Fellow of the Damon Runyon Cancer Research Foundation (DRG-1937-07). M.G. was supported by postdoctoral fellowships from the C.H. Best Foundation and CIHR. J.L.W. is an International Scholar of the HHMI. Received: February 8, 2011 Revised: June 10, 2011 Accepted: August 4, 2011 Published online: September 15, 2011 REFERENCES Atlasi, Y., Mowla, S.J., Ziaee, S.A., Gokhale, P.J., and Andrews, P.W. (2008). OCT4 spliced variants are differentially expressed in human pluripotent and nonpluripotent cells. Stem Cells 26, 3068–3074. Badis, G., Berger, M.F., Philippakis, A.A., Talukder, S., Gehrke, A.R., Jaeger, S.A., Chan, E.T., Metzler, G., Vedenko, A., Chen, X., et al. (2009). Diversity and complexity in DNA recognition by transcription factors. Science 324, 1720–1723. Berger, M.F., Philippakis, A.A., Qureshi, A.M., He, F.S., Estep, P.W., 3rd, and Bulyk, M.L. (2006). Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities. Nat. Biotechnol. 24, 1429–1435. Berger, M.F., Badis, G., Gehrke, A.R., Talukder, S., Philippakis, A.A., Pen˜aCastillo, L., Alleyne, T.M., Mnaimneh, S., Botvinnik, O.B., Chan, E.T., et al. (2008). Variation in homeodomain DNA binding revealed by high-resolution analysis of sequence preferences. 
Cell 133, 1266–1276. Brown, P.J., Ashe, S.L., Leich, E., Burek, C., Barrans, S., Fenton, J.A., Jack, A.S., Pulford, K., Rosenwald, A., and Banham, A.H. (2008). Potentially oncogenic B-cell activation-induced smaller isoforms of FOXP1 are highly expressed in the activated B cell-like subtype of DLBCL. Blood 111, 2816–2824. Calarco, J.A., Xing, Y., Cáceres, M., Calarco, J.P., Xiao, X., Pan, Q., Lee, C., Preuss, T.M., and Blencowe, B.J. (2007). Global analysis of alternative splicing differences between humans and chimpanzees. Genes Dev. 21, 2963–2975. Chen, X., Xu, H., Yuan, P., Fang, F., Huss, M., Vega, V.B., Wong, E., Orlov, Y.L., Zhang, W., Jiang, J., et al. (2008). Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell 133, 1106–1117. Dasen, J.S., De Camilli, A., Wang, B., Tucker, P.W., and Jessell, T.M. (2008). Hox repertoires for motor neuron diversity and connectivity gated by a single accessory factor, FoxP1. Cell 134, 304–316. Demir, E., and Dickson, B.J. (2005). fruitless splicing specifies male courtship behavior in Drosophila. Cell 121, 785–794. Fic, W., Juge, F., Soret, J., and Tazi, J. (2007). Eye development under the control of SRp55/B52-mediated alternative splicing of eyeless. PLoS ONE 2, e253.
Jaeger, S.A., Chan, E.T., Berger, M.F., Stottmann, R., Hughes, T.R., and Bulyk, M.L. (2010). Conservation and regulatory associations of a wide affinity range of mouse transcription factor binding sites. Genomics 95, 185–195. Kim, J., Chu, J., Shen, X., Wang, J., and Orkin, S.H. (2008). An extended transcriptional network for pluripotency of embryonic stem cells. Cell 132, 1049–1061. Koh, K.P., Sundrud, M.S., and Rao, A. (2009). Domain requirements and sequence specificity of DNA binding for the forkhead transcription factor FOXP3. PLoS ONE 4, e8109. Konstantoulas, C.J., Parmar, M., and Li, M. (2010). FoxP1 promotes midbrain identity in embryonic stem cell-derived dopamine neurons by regulating Pitx3. J. Neurochem. 113, 836–847. Koon, H.B., Ippolito, G.C., Banham, A.H., and Tucker, P.W. (2007). FOXP1: a potential therapeutic target in cancer. Expert Opin. Ther. Targets 11, 955–965. Kubo, H., Shimizu, M., Taya, Y., Kawamoto, T., Michida, M., Kaneko, E., Igarashi, A., Nishimura, M., Segoshi, K., Shimazu, Y., et al. (2009). Identification of mesenchymal stem cell (MSC)-transcription factors by microarray and knockdown analyses, and signature molecule-marked MSC in bone marrow by immunohistochemistry. Genes Cells 14, 407–424. Kunarso, G., Wong, K.Y., Stanton, L.W., and Lipovich, L. (2008). Detailed characterization of the mouse embryonic stem cell transcriptome reveals novel genes and intergenic splicing associated with pluripotency. BMC Genomics 9, 155. Kunarso, G., Chia, N.Y., Jeyakani, J., Hwang, C., Lu, X., Chan, Y.S., Ng, H.H., and Bourque, G. (2010). Transposable elements have rewired the core regulatory network of human embryonic stem cells. Nat. Genet. 42, 631–634. Lam, K.N., van Bakel, H., Cote, A.G., van der Ven, A., and Hughes, T.R. (2011). Sequence specificity is obtained from the majority of modular C2H2 zinc-finger arrays. Nucleic Acids Res. 39, 4680–4690. Li, S., Weidenfeld, J., and Morrisey, E.E. (2004). 
Transcriptional and DNA binding activity of the Foxp1/2/4 family is modulated by heterotypic and homotypic protein interactions. Mol. Cell. Biol. 24, 809–822. Mayshar, Y., Rom, E., Chumakov, I., Kronman, A., Yayon, A., and Benvenisty, N. (2008). Fibroblast growth factor 4 and its novel splice isoform have opposing effects on the maintenance of human embryonic stem cell self-renewal. Stem Cells 26, 767–774. Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L., and Wold, B. (2008). Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628. Pan, Q., Shai, O., Misquitta, C., Zhang, W., Saltzman, A.L., Mohammad, N., Babak, T., Siu, H., Hughes, T.R., Morris, Q.D., et al. (2004). Revealing global regulatory features of mammalian alternative splicing using a quantitative microarray platform. Mol. Cell 16, 929–941. Pan, Q., Shai, O., Lee, L.J., Frey, B.J., and Blencowe, B.J. (2008). Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 40, 1413–1415. Pritsker, M., Doniger, T.T., Kramer, L.C., Westcot, S.E., and Lemischka, I.R. (2005). Diversification of stem cell molecular repertoire by alternative splicing. Proc. Natl. Acad. Sci. USA 102, 14290–14295. Rao, D.S., O’Connell, R.M., Chaudhuri, A.A., Garcia-Flores, Y., Geiger, T.L., and Baltimore, D. (2010a). MicroRNA-34a perturbs B lymphocyte development by repressing the forkhead box transcription factor Foxp1. Immunity 33, 48–59.
Förch, P., and Valcárcel, J. (2003). Splicing regulation in Drosophila sex determination. Prog. Mol. Subcell. Biol. 31, 127–151.
Rao, S., Zhen, S., Roumiantsev, S., McDonald, L.T., Yuan, G.C., and Orkin, S.H. (2010b). Differential roles of Sall4 isoforms in embryonic stem cell pluripotency. Mol. Cell. Biol. 30, 5364–5380.
Hellman, L.M., and Fried, M.G. (2007). Electrophoretic mobility shift assay (EMSA) for detecting protein-nucleic acid interactions. Nat. Protoc. 2, 1849–1861.
Rousso, D.L., Gaber, Z.B., Wellik, D., Morrisey, E.E., and Novitch, B.G. (2008). Coordinated actions of the forkhead protein Foxp1 and Hox proteins in the columnar organization of spinal motor neurons. Neuron 59, 226–240.
Hu, H., Wang, B., Borde, M., Nardone, J., Maika, S., Allred, L., Tucker, P.W., and Rao, A. (2006). Foxp1 is an essential transcriptional regulator of B cell development. Nat. Immunol. 7, 819–826.
Rowan, S., Siggers, T., Lachke, S.A., Yue, Y., Bulyk, M.L., and Maas, R.L. (2010). Precise temporal control of the eye regulatory gene Pax6 via enhancer-binding site affinity. Genes Dev. 24, 980–985.
Salomonis, N., Schlieve, C.R., Pereira, L., Wahlquist, C., Colas, A., Zambon, A.C., Vranizan, K., Spindler, M.J., Pico, A.R., Cline, M.S., et al. (2010). Alternative splicing regulates mouse embryonic stem cell pluripotency and differentiation. Proc. Natl. Acad. Sci. USA 107, 10514–10519.
Samavarchi-Tehrani, P., Golipour, A., David, L., Sung, H.K., Beyer, T.A., Datti, A., Woltjen, K., Nagy, A., and Wrana, J.L. (2010). Functional genomics reveals a BMP-driven mesenchymal-to-epithelial transition in the initiation of somatic cell reprogramming. Cell Stem Cell 7, 64–77.
Schmidt, D., Wilson, M.D., Spyrou, C., Brown, G.D., Hadfield, J., and Odom, D.T. (2009). ChIP-seq: using high-throughput sequencing to discover protein-DNA interactions. Methods 48, 240–248.
Silva, J., Nichols, J., Theunissen, T.W., Guo, G., van Oosten, A.L., Barrandon, O., Wray, J., Yamanaka, S., Chambers, I., and Smith, A. (2009). Nanog is the gateway to the pluripotent ground state. Cell 138, 722–737.
Stroud, J.C., Wu, Y., Bates, D.L., Han, A., Nowick, K., Paabo, S., Tong, H., and Chen, L. (2006). Structure of the forkhead domain of FOXP2 bound to DNA. Structure 14, 159–166.
Takahashi, K., and Yamanaka, S. (2006). Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676.
Wang, B., Weidenfeld, J., Lu, M.M., Maika, S., Kuziel, W.A., Morrisey, E.E., and Tucker, P.W. (2004). Foxp1 regulates cardiac outflow tract, endocardial cushion morphogenesis and myocyte proliferation and maturation. Development 131, 4477–4487.
Woltjen, K., Michael, I.P., Mohseni, P., Desai, R., Mileikovsky, M., Hämäläinen, R., Cowling, R., Wang, W., Liu, P., Gertsenstein, M., et al. (2009). piggyBac transposition reprograms fibroblasts to induced pluripotent stem cells. Nature 458, 766–770.
Workman, C.T., Yin, Y., Corcoran, D.L., Ideker, T., Stormo, G.D., and Benos, P.V. (2005). enoLOGOS: a versatile web tool for energy normalized sequence logos. Nucleic Acids Res. 33(Web Server issue), W389–W392.
Wu, J.Q., Habegger, L., Noisa, P., Szekely, A., Qiu, C., Hutchison, S., Raha, D., Egholm, M., Lin, H., Weissman, S., et al. (2010). Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing. Proc. Natl. Acad. Sci. USA 107, 5254–5259.
Wijchers, P.J., Burbach, J.P., and Smidt, M.P. (2006). In control of biology: of mice, men and Foxes. Biochem. J. 397, 233–246.
Yeo, G.W., Xu, X., Liang, T.Y., Muotri, A.R., Carson, C.T., Coufal, N.G., and Gage, F.H. (2007). Alternative splicing events identified in human embryonic stem cells and neural progenitors. PLoS Comput. Biol. 3, 1951–1967. Zhang, Y., Li, S., Yuan, L., Tian, Y., Weidenfeld, J., Yang, J., Liu, F., Chokas, A.L., and Morrisey, E.E. (2010). Foxp1 coordinates cardiomyocyte proliferation through both cell-autonomous and nonautonomous mechanisms. Genes Dev. 24, 1746–1757.
Selective Translation of Leaderless mRNAs by Specialized Ribosomes Generated by MazF in Escherichia coli
Oliver Vesper,1 Shahar Amitai,2 Maria Belitsky,2 Konstantin Byrgazov,1 Anna Chao Kaberdina,1 Hanna Engelberg-Kulka,2,* and Isabella Moll1,*
1Max F. Perutz Laboratories, Center for Molecular Biology, Department of Microbiology, Immunobiology and Genetics, University of Vienna, Dr. Bohrgasse 9/4, 1030 Vienna, Austria
2Department of Microbiology and Molecular Genetics, IMRIC, The Hebrew University-Hadassah Medical School, Jerusalem 91120, Israel
*Correspondence:
[email protected] (H.E.-K.),
[email protected] (I.M.) DOI 10.1016/j.cell.2011.07.047
SUMMARY
Escherichia coli (E. coli) mazEF is a stress-induced toxin-antitoxin (TA) module. The toxin MazF is an endoribonuclease that cleaves single-stranded mRNAs at ACA sequences. Here, we show that MazF cleaves at ACA sites at or closely upstream of the AUG start codon of some specific mRNAs and thereby generates leaderless mRNAs. Moreover, we provide evidence that MazF also targets 16S rRNA within 30S ribosomal subunits at the decoding center, thereby removing 43 nucleotides from the 3′ terminus. As this region comprises the anti-Shine-Dalgarno (aSD) sequence that is required for translation initiation on canonical mRNAs, a subpopulation of ribosomes is formed that selectively translates the described leaderless mRNAs both in vivo and in vitro. Thus, we have discovered a modified translation machinery that is generated in response to MazF induction and that probably serves for stress adaptation in Escherichia coli.
INTRODUCTION
Toxin-antitoxin modules are present in the chromosomes of many bacteria, including pathogens (Engelberg-Kulka and Glaser, 1999; Mittenhuber, 1999; Hayes, 2003; Pandey and Gerdes, 2005; Engelberg-Kulka et al., 2006; Agarwal et al., 2007; Ramage et al., 2009). Each of these modules consists of a pair of genes, usually cotranscribed as operons, in which, generally, the downstream gene encodes a stable toxin and the upstream gene encodes a labile antitoxin. In E. coli, seven toxin-antitoxin systems have been described (Metzger et al., 1988; Masuda et al., 1993; Aizenman et al., 1996; Mittenhuber, 1999; Christensen et al., 2001; Hayes, 2003; Pandey and Gerdes, 2005; Schmidt et al., 2007). Among these, one of the most studied is the chromosomal toxin-antitoxin system mazEF, which was the first to be described as regulatable and responsible for bacterial programmed cell death (Aizenman et al., 1996; Engelberg-Kulka et al., 2006). E. coli mazEF encodes the labile antitoxin MazE and the stable toxin MazF. Both mazE and mazF are coexpressed and negatively autoregulated at the transcriptional level (Marianovsky et al., 2001). E. coli mazEF is triggered by various stressful conditions, such as treatment with antibiotics affecting transcription or translation (Hazan et al., 2004), or by an increase of ppGpp upon severe amino acid starvation (Aizenman et al., 1996). Such stressful conditions prevent mazEF expression; thereby, the short-lived antitoxin MazE is degraded by the ATP-dependent ClpAP serine protease (Aizenman et al., 1996), permitting the stable MazF to exert its toxic effect (Engelberg-Kulka et al., 2006). mazEF-mediated cell death was also reported as a population phenomenon requiring a quorum-sensing factor called extracellular death factor (EDF) (Kolodkin-Gal et al., 2007). MazF is a sequence-specific endoribonuclease that preferentially cleaves single-stranded mRNAs at ACA sequences (Zhang et al., 2003, 2004). As previously reported (Christensen et al., 2003; Zhang et al., 2003), MazF induction causes inhibition of protein synthesis. However, we have recently shown that, surprisingly, this inhibition was not complete; though MazF led to inhibition of synthesis of most proteins (about 90%), it selectively enabled specific synthesis of about 10% of proteins (Amitai et al., 2009). Some of those proteins were required for death of most cells within a population. However, we also found that MazF enabled the synthesis of proteins that permitted survival of a small subpopulation under stressful conditions that cause mazEF-mediated cell death for the majority of the population.
Among the proteins involved in cell death were: (1) YfiD, a glycine radical protein known to be able to replace the oxidatively damaged pyruvate formate-lyase subunit (Wagner et al., 2001) and (2) YfbU, a protein of unknown function (Amitai et al., 2009). Among the proteins involved in cell survival were: (1) DeoC, a deoxyribose-phosphate aldolase known to participate in the catabolism of deoxyribonucleosides (Hammer-Jespersen et al., 1971), and (2) RsuA, a protein known to catalyze the pseudouridylation at position 516 in the 16S rRNA (Wrzesinski et al., 1995). Because several ACA sites, the potential MazF cleavage sites, are located in the corresponding mRNAs, we were intrigued by the mechanism that is responsible for selective synthesis of these proteins.
Cell 147, 147–157, September 30, 2011 ©2011 Elsevier Inc. 147
Here, we have elucidated the underlying molecular mechanism leading to selective translation of a particular set of mRNAs upon MazF induction in E. coli. We found that, due to its endoribonucleolytic activity, MazF cleaves at ACA sites at or closely upstream of the AUG start codon of specific mRNAs. Thereby, short-leadered and leaderless mRNAs (lmRNAs) are generated, respectively. Surprisingly, the 16S rRNA of the 30S ribosomal subunit is another target of MazF endoribonuclease. Specifically, the toxin cleaves at an ACA triplet in the 16S rRNA located 5′ of helix 45. Thus, this cleavage leads to loss of 43 nucleotides (nts) at the 3′ terminus of 16S rRNA, including helix 45 and the anti-Shine-Dalgarno (aSD) sequence. As the SD-aSD interaction is required for translation initiation on canonical ribosome-binding sites, truncation of 16S rRNA yields a specialized protein synthesis machinery designated as “stress-ribosome,” which selectively translates lmRNAs generated by MazF. Because MazF is triggered under stressful conditions, our results uncovered a hitherto uncharacterized stress adaptation mechanism in E. coli, which is based on generation of a heterogeneous ribosome population that provides a means for selective synthesis of a subclass of proteins.
RESULTS
MazF Cleaves yfiD and rpsU mRNAs at Specific Sites Directly Upstream of the AUG Start Codon In Vivo
Recent studies revealed that several mRNAs are translated upon induction of mazF in vivo (Amitai et al., 2009). However, the presence of potential MazF cleavage sites within these mRNAs would imply their immediate degradation following MazF cleavage.
As several lines of evidence indicate that MazF cleaves only at ACA sequences located within unstructured regions and that stable secondary structures can shield the sites of cleavage (Zhang et al., 2004; Zhu et al., 2008), we first examined whether some candidate mRNAs are protected from mRNA cleavage. Therefore, we performed primer extension analysis on total RNA prepared upon mazF overexpression in vivo using the same conditions that determined the synthesis of the respective proteins (Amitai et al., 2009). We studied (1) mRNAs encoding proteins YfiD and YfbU, which are selectively translated in the presence of increased levels of MazF (Amitai et al., 2009), and (2) rpsU mRNA encoding ribosomal protein S21, whose coding region does not contain an ACA cleavage site. In brief, mazF expression was induced in E. coli strain MC4100relA+ harboring plasmid pSA1 that bears an IPTG-inducible mazF gene. Fifteen minutes thereafter, total RNA was purified and primer extension analysis was performed employing specific primers for yfiD (R48) (Table S1 available online and Figure 1A, lanes 5 and 6), rpsU (Y50) (Table S1 and Figure 1C, lanes 5 and 6), and yfbU (B6) (Table S1 and Figures S1C and S1D) mRNAs. In the absence of mazF overexpression, primer extension reactions for yfiD mRNA (Figures 1A, lane 5, and 1B) and rpsU mRNA (Figures 1C, lane 5, and 1D) generate signals that correspond to the 5′ termini of the transcripts synthesized from the annotated promoters (PyfiD1 and PrpsU) (Green et al., 1998; Lupski et al., 1984). In addition, we determined a second transcriptional start
point for yfiD mRNA, giving rise to an mRNA harboring a 57 nt long 5′ untranslated region (5′ UTR) (PyfiD2) (Figures 1A, lane 5, and 1B). Surprisingly, upon induction of mazF expression, primer extension resulted in generation of cDNAs of 128 and 97 nts in length, which correspond to cleavage of yfiD mRNA at two ACA triplets 30 nts and directly upstream of the AUG start codon (Figures 1A, lane 6, and 1B). In contrast, we did not observe cleavage at two additional ACA sequences located 39 nts upstream and 3 nts downstream of the start codon (Figures 1A, lane 6, and 1B). We obtained similar results when primer extension was performed with the rpsU-specific primer (Figure 1C, lane 6). Again, MazF activity resulted in cleavage of rpsU mRNA at the ACA located directly upstream of the start codon (Figures 1C, lane 6, and 1D). Corresponding to the absence of ACAs in the coding sequence of rpsU mRNA, we did not observe MazF cleavage within this region (Figure 1C, lane 6). To further verify MazF cleavage at the sites determined in vivo, in vitro cleavage assays were performed on yfiD and rpsU mRNAs. In vitro-transcribed mRNAs were incubated with purified MazF protein at 37°C. Subsequent primer extension analyses revealed MazF cleavage at the same positions as observed in vivo (Figures S1A and S1B, lane 7). Surprisingly, primer extension using total RNA purified from cells with and without mazF overexpression employing a yfbU-specific primer (B6) (Table S1) yielded a stop signal that corresponds to the A of the AUG start codon in both cases (Figure S1C, lanes 5 and 6). These results indicate that, in vivo, the yfbU mRNA might be transcribed as lmRNA. To entirely exclude MazF cleavage at the ACA present in the yfbU mRNA just upstream of the start codon (Figure S1D), primer extension analysis was repeated on total RNA purified from strain MC4100relA+ΔmazEF (Figure S1C, lane 7).
Likewise, the extension signal indicates the presence of a leaderless yfbU mRNA (Figure S1C, lane 7). This result was verified by primer extension analysis at 55°C employing the heat-stable reverse transcriptase Superscript III (Invitrogen) using a second yfbU-specific primer (I6) (Table S1). As shown in Figure S1C, primer extension reactions generated cDNAs of 88 nts in length, again corresponding to transcription starting directly at the AUG codon (Figure S1C, lanes 8 and 9). Taken together, these results tempted us to speculate that mazF induction results in selective translation of lmRNAs, which are either generally present, like yfbU mRNA, or generated by MazF, as shown for yfiD and rpsU mRNAs.
Selective Translation of an lmRNA upon mazF Overexpression
lmRNAs are selectively translated in the absence of ribosomal proteins S1 and/or S2 (Moll et al., 2002). Moreover, we have recently shown that lmRNAs are selectively translated by protein-depleted ribosomes, which are formed upon treatment of E. coli cells with the aminoglycoside antibiotic kasugamycin (Ksg) in vivo (Kaberdina et al., 2009). In light of the fact that the mazEF module can be triggered by antibiotics targeting the ribosome (Sat et al., 2001) and that MazF activity results in formation of lmRNAs, we hypothesized that MazF could likewise affect the protein synthesis machinery, thus rendering it selective for lmRNAs. To verify this notion, pulse labeling was performed
Figure 1. Determination of MazF Cleavage Sites on yfiD and rpsU mRNAs In Vivo and Schematic Depiction of Genomic Organization, 5′ UTRs, and Proximal Coding Regions of the Respective mRNAs (A and C) Primer extensions employing primers (Table S1) specific for yfiD mRNA (A) and rpsU mRNA (C) performed on total RNA purified from E. coli strain MC4100relA+ comprising plasmid pSA1 without (lane 5) or with (lane 6) mazF overexpression. Extension signals corresponding to transcriptional start points of yfiD mRNA (PyfiD1 and PyfiD2) and rpsU mRNA (PrpsU) are indicated by black arrows (lane 5). Signals corresponding to MazF cleavage are indicated by black arrows and labeled analogous to Figures 1B and 1D, where “A” designates cleavage directly upstream of the AUG start codon, resulting in formation of an lmRNA, and “B” indicates cleavage further upstream (lane 6). White arrowheads indicate ACA triplets not cleaved by MazF (lane 6). (Lanes 1–4) Sequencing reactions of 16S rRNA employing primer V43 (Table S1) to determine length of primer extension (indicated to the left). (B and D) Schematic depictions of promoter positions and sequence of 5′ UTRs (in gray) and proximal coding regions (in black) of yfiD mRNA (B) and rpsU mRNA (D). MazF cleavage sites are underlined and indicated by arrows. The cleavage site directly upstream of the AUG start codon is marked with “A.” The cleavage site further upstream is indicated by “B.” Potential ACA triplets where MazF cleavage does not occur are indicated by white arrowheads. See also Figure S1.
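As an illustration of how candidate sites such as those in this figure can be enumerated, the following minimal sketch lists every ACA triplet in an mRNA and reports its position relative to the A of the AUG start codon. The sequence `TOY_MRNA`, its start index, and the function name are hypothetical and serve only as an example; they are not the yfiD or rpsU sequences. The sketch locates candidate triplets only and does not predict which of them MazF cleaves, which, according to the text, depends on local RNA structure.

```python
def aca_offsets(mrna: str, start_index: int) -> list[int]:
    """Return the offset of every ACA triplet relative to the A of the start codon.

    Negative offsets lie upstream of the AUG; positive offsets lie downstream.
    `start_index` is the 0-based index of the A of the AUG in `mrna`.
    """
    rna = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    if rna[start_index:start_index + 3] != "AUG":
        raise ValueError("start_index does not point at an AUG codon")

    offsets = []
    pos = rna.find("ACA")
    while pos != -1:
        offsets.append(pos - start_index)
        pos = rna.find("ACA", pos + 1)  # step by one to catch overlapping triplets
    return offsets


# Hypothetical toy transcript (not a real E. coli sequence); AUG starts at index 18.
TOY_MRNA = "GGACAUUGGAGGUUUACAAUGGCACAUAA"
TOY_START = 18

print(aca_offsets(TOY_MRNA, TOY_START))
```

In this toy sequence, the triplet at offset -3 lies directly upstream of the start codon, analogous to the position of site “A” in the schematic, whereas the triplet at offset -16 lies further upstream, analogous to site “B.”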
employing E. coli strain MC4100relA+ harboring plasmid pSA1, which encodes the inducible mazF gene, and plasmid pKTplaccI. Plasmid pKTplaccI encodes the leaderless cI-lacZ fusion gene, giving rise to the 122.4 kD CI-LacZ fusion protein (Grill et al., 2000). The strain was grown in M9 minimal medium as specified in the Experimental Procedures. At OD600 of 0.5, the culture was divided and IPTG was added to one half to induce mazF expression. Before and 15 and 30 min after induction, pulse labeling was performed. Upon precipitation and separation of labeled proteins by SDS-PAGE, the autoradiography shown in Figure 2 reveals selective translation of the leaderless
cI-lacZ mRNA upon overexpression of mazF. In contrast, translation of bulk mRNA is severely inhibited (Figure 2, lanes 5 and 6). It has to be noted that, despite induction of mazF overexpression, the protein cannot be detected (Figure 2, lanes 5 and 6) (Amitai et al., 2009).
MazF Cleaves 16S rRNA of 30S Subunits and 70S Ribosomes
Taken together, MazF-mediated generation and selective translation of lmRNAs suggest that the endoribonuclease might directly affect ribosomes, as rRNA could represent a potential
Figure 2. Overexpression of mazF Results in Stimulated and Selective Translation of the Leaderless cI-lacZ mRNA In Vivo Pulse labeling performed with E. coli strain MC4100relA+ harboring plasmids pKTplaccI, encoding the leaderless cI-lacZ fusion gene (Grill et al., 2000), and pSA1, encoding the mazF gene under control of the lac-operator. At time point 0 (lane 2), the culture was divided and one half remained untreated (lanes 3 and 4), whereas in the other half, mazF expression was induced with IPTG (lanes 5 and 6). At time points indicated, pulse labeling was performed. The position of the CI-LacZ fusion protein (122.4 kD) is indicated by an arrow to the right of the autoradiograph. The tentative position of MazF (12 kD) is marked by an asterisk. (Lane 1) Protein marker.
target. As indicated in the structure of the 30S ribosomal subunit (Figure 3A), several ACA triplets are present in the 16S rRNA, most of which are located within structured regions or are protected by ribosomal proteins. However, as shown in the secondary structure of the 3′-terminal 16S rRNA, two accessible ACA sites are located at positions 1500–1502 and 1396–1398 (Figure 3B). Cleavage at the latter site would be detrimental for ribosome activity, as it would result in loss of helix 44, which provides most intersubunit bridges (Gabashvili et al., 2000). However, given that this region is enclosed by tRNA and mRNA (Woodcock et al., 1991; Yusupova et al., 2006), we hypothesized that translationally active ribosomes might be protected at this position. In contrast, cleavage at position 1500–1502 in close proximity to the site for colicin E3 cleavage (between nts A1493 and G1494) (Figure 3B; Senior and Holland, 1971) would result in loss of the 3′ terminus of 16S rRNA containing helix 45 and the aSD sequence. The SD-aSD interaction is important for translation initiation complex formation on canonical mRNAs comprising structured 5′ UTRs. Particularly in vivo, when several mRNAs have to compete for 30S subunits (Hui and de Boer, 1987; Calogero et al., 1988), the small subunit captures mRNAs via SD-aSD interaction (Kaminishi et al., 2007; Yusupova et al., 2001). Therefore, it is feasible that MazF activity could result in formation of ribosomes selective for translation of lmRNAs. To pursue this idea, we treated 30S subunits (data not shown) and 70S ribosomes with purified MazF in vitro (Figure 3C). As specified in the Experimental Procedures, rRNAs within ribosomes were labeled 3′ terminally with pC-Cy3. Upon incubation with purified MazF protein, rRNA was extracted and separated on a denaturing gel. As shown in Figure 3C, upon MazF treatment, we observed formation of a fragment of about 44 nts in length (Figure 3C, lane 2).
As we employed purified ribosomes, which do not harbor tRNAs or mRNAs protecting 16S rRNA at the decoding site (Woodcock et al., 1991; Yusupova et al., 2006), in addition, a minor fragment was generated that corresponds to cleavage at position 1396 of 16S rRNA (Figure 3C, lane 2). Additional experiments employing 16S as well as 23S rRNA radioactively labeled at the 5′ (data not shown) and 3′ end (Figure S2) revealed no further MazF cleavage. Next, we tested for cleavage of 16S rRNA at A1500 in vivo. Therefore, total RNA was prepared from strain MC4100relA+ΔmazEF (Figure 3D, lane 1) or strain MC4100relA+ harboring plasmid pSA1 without (Figure 3D, lane 2) or upon induction of MazF synthesis (Figure 3D, lane 3) and subjected to northern blot analysis employing primers V7, specific for the 3′ end of 16S rRNA (Figure 3B and Table S1), and V43, binding to positions 939–955 in 16S rRNA (Table S1). As shown in Figure 3D, in contrast to total RNA prepared from untreated cells (Figure 3Db, lane 2), induction of MazF synthesis yielded the same fragment (Figure 3Db, lane 3) that we observed upon treatment of 70S ribosomes with MazF protein in vitro (Figure 3C, lane 2). Correspondingly, the signal for 16S rRNA obtained with primer V7 decreased upon MazF cleavage (Figure 3Da, lanes 2 and 3). However, the amount of total 16S rRNA, as verified with primer V43, remained constant (Figure 3Dc). To unequivocally determine the site of cleavage, primer extension analysis employing primer V7 was performed on total RNA used for northern blotting. The result reveals that, upon overexpression of mazF, cleavage occurs upstream of A1500 (Figure 3E, lane 7), thus removing 43 nts at the 3′ end of 16S rRNA.
MazF Activity Results in Formation of Stress Ribosomes In Vivo
Next, we aimed to confirm that MazF-mediated removal of the 3′ end of 16S rRNA comprising the aSD sequence and helix 45 results in formation of a distinct ribosome population that is functionally specific for translation of lmRNAs, here referred to as “stress ribosomes” (70SΔ43). E. coli strain MC4100relA+ harboring plasmid pSA1 was grown in LB medium at 37°C.
At OD600 of 0.5, the culture was divided and IPTG was added to one half to induce mazF expression, whereas the other half remained untreated. Intriguingly, the ribosome profile did not change upon overexpression of mazF except for reduction of polysome peaks, indicating inhibition of bulk mRNA translation (Figure S3). This result indicates that ribosomes are not degraded and that, in contrast to treatment with the antibiotic Ksg (Kaberdina et al., 2009), no protein-depleted ribosomes are formed upon mazF overexpression. Concomitantly, ribosomes were prepared and their 16S rRNA was analyzed. First, the overall amount of 16S rRNA was determined upon denaturing gel electrophoresis by ethidium bromide staining (Figure 4Aa). The same samples were subsequently probed by northern blotting employing primer V7 for presence or absence of the 3′-terminal fragment (Figure 4Ab). In contrast to rRNA derived from 70S ribosomes purified without mazF expression (Figure 4Ab, lane 1), northern blot analysis employing rRNA from ribosomes purified upon mazF overexpression revealed a reduced signal for probe V7 (Figure 4Ab, lane 2; 70S/70SΔ43). This result supports our hypothesis that mazF overexpression yields a heterogeneous ribosome population still containing a substantial fraction of canonical 70S. Thus, we included a second purification step to remove uncleaved 70S ribosomes with the help of a biotinylated SD-oligonucleotide
Figure 3. The 16S rRNA of Assembled 70S Ribosomes Represents a Target for MazF Activity (A) The structure of the 30S ribosomal subunit was modeled employing Polyview 3D (Porollo and Meller, 2007) and PyMOL molecular system software (DeLano, 2002) and PDB file 2AVY (Schuwirth et al., 2005). 16S rRNA (light gray), proteins (dark gray), helix 44 (cyan), and helix 45 (green) are shown. ACA sequences present in 16S rRNA, which are protected by proteins or structural features of rRNA, are indicated in yellow. Two potential MazF cleavage sites at positions 1396–1398 and 1500–1502 are indicated in blue and red, respectively. (B) Secondary structure of 16S rRNA. Decoding region and helices 44 and 45 are enlarged. Potential MazF cleavage sites are indicated in blue and red, as in (A). Site of Colicin E3 cleavage (AG 1493/1494) is boxed, and aSD sequence is shown in red. Primer V7 (indicated in green) complementary to positions 1511–1535 of 16S rRNA was used for northern blot and primer extension analyses. (C) Treatment of 70S ribosomes with MazF in vitro results in cleavage of 16S rRNA at positions indicated in (B). The 3′ end of rRNA was labeled with pC-Cy3. rRNA fragments obtained upon incubation of 70S ribosomes with (lane 2) or without (lane 3) MazF were separated by denaturing PAGE. (Lane 1) In vitro-synthesized Cy3-labeled RNA fragment of 43 nts in length (kindly provided by Dr. U. Bläsi) was used as a size marker. Red and blue arrows indicate fragments corresponding to MazF cleavage at positions shown in (B). The position of 5S rRNA is indicated by a black arrow.
(D) To verify MazF-mediated formation of the 43 nt fragment in vivo, total RNA prepared from untreated MC4100relA+ cells harboring plasmid pSA1 (lane 2) and upon induction of mazF expression with IPTG (lane 3) were separated by denaturing PAGE, blotted, and probed with oligonucleotide V7 (Figure 3B and Table S1) to determine the amount of 3′ terminus present in full-length 16S rRNA (a) and the cleaved 3′-terminal fragment (b). To determine the amount of total 16S rRNA, oligonucleotide V43 (Table S1; c) was used. Total RNA prepared from strain MC4100relA+ΔmazEF (lane 1) was included as control. Northern blot analysis of 5S rRNA employing primer R25 (Table S1; d) served as a loading control. (E) The same RNA (D) was used for primer extension analysis employing primer V7 (Figure 3B). In contrast to strains MC4100relA+ΔmazEF (lane 5) and untreated MC4100relA+pSA1 (lane 6), a signal was obtained employing RNA purified upon mazF overexpression (lane 7) that indicates unambiguously the site of MazF cleavage 5′ of ACA between nts A1499 and A1500, thereby removing 43 nts at the 3′ terminus. (Lanes 1–4) Sequencing reactions. The sequence is given to the left; the ACA triplet and the signal corresponding to stop of reverse transcription due to the m3U1498 modification are indicated by asterisks and an open arrowhead, respectively. See also Figure S2.
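The fragment sizes quoted for 16S rRNA follow from simple position arithmetic, which the short sketch below makes explicit. It assumes a mature E. coli 16S rRNA of 1,542 nt, a value that is not stated in the text, and the helper name is hypothetical. Cleavage between two nucleotides releases a 3′-terminal fragment that begins at the downstream nucleotide.

```python
# Assumption (not given in the text): mature E. coli 16S rRNA is 1,542 nt long.
E_COLI_16S_LENGTH = 1542


def three_prime_fragment_length(first_released_nt: int,
                                total_length: int = E_COLI_16S_LENGTH) -> int:
    """Length of the 3'-terminal fragment that starts at `first_released_nt`.

    Positions are 1-based, so a cut between N-1 and N releases N..total_length.
    """
    if not 1 <= first_released_nt <= total_length:
        raise ValueError("position lies outside the rRNA")
    return total_length - first_released_nt + 1


# MazF cut between A1499 and A1500: the released fragment starts at A1500.
mazf_fragment = three_prime_fragment_length(1500)

# Colicin E3 cut between A1493 and G1494: the released fragment starts at G1494.
colicin_fragment = three_prime_fragment_length(1494)

print(mazf_fragment, colicin_fragment)
```

Under this length assumption, the MazF cut yields the 43 nt fragment described in the caption, and primer V7 (complementary to positions 1511–1535) lies entirely within it, which is consistent with its use to detect the released fragment.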
(V5) (Table S1) immobilized on magnetic beads. This additional step allowed clear separation of stress ribosomes from uncleaved 70S, as verified by the lack of a signal employing primer V7 in northern blot analysis (Figure 4Ab, lane 3). Next, the purified ribosomes were tested in vitro for translational specificity using canonical and leaderless yfiD mRNA variants (Figure 4C, can and ll). Concomitantly, canonical rpsU mRNA containing a 135 nt 5′ UTR (Figure S1B) was included in all reactions as internal control. As shown in Figure 4B, 70S ribosomes purified from untreated cells (lanes 1 and 2), as well as the heterogeneous ribosome population containing 70SΔ43 and 70S ribosomes roughly in a 1:1 ratio (O.V., unpublished data) purified upon mazF overexpression, were proficient in translating both canonical (can) and leaderless (ll) yfiD mRNA variants as well as the canonical rpsU mRNA (Figure 4B, lanes 3 and 4). Nevertheless, translational efficiency of heterogeneous 70S/70SΔ43 ribosomes was reduced for the canonical mRNA and lmRNA variant, respectively, when compared to canonical
Figure 4. Stress Ribosomes Formed upon Overexpression of mazF In Vivo Selectively Translate Leaderless yfiD mRNA
(A) rRNA prepared from 10 pmoles of ribosomes purified from untreated cells (70S; lane 1), upon overexpression of mazF (70S/70SΔ43; lane 2), and upon further removal of uncleaved ribosomes employing a biotinylated SD-oligonucleotide (70SΔ43; lane 3), which were used for in vitro translation shown in (B), was separated by denaturing PAGE and stained with ethidium bromide (a) to determine amount and integrity of 16S rRNA. The same rRNA was probed using labeled oligonucleotide V7 (b) to verify presence and absence of the 3′-terminal 43 nt fragment in the individual ribosome preparations.
(B) In vitro translation of canonical (can; lanes 1, 3, and 5) and leaderless (ll; lanes 2, 4, and 6) variants of yfiD mRNA employing 70S ribosomes purified from untreated E. coli MC4100relA+ cells harboring plasmid pSA1 (70S; lanes 1 and 2), purified upon mazF overexpression (70S/70SΔ43; lanes 3 and 4) and upon removal of uncleaved ribosomes employing immobilized biotinylated SD oligonucleotides (70SΔ43; lanes 5 and 6). In all reactions, equimolar amounts of canonical rpsU mRNA were added as internal control. Positions of YfiD (14.3 kD) and RpsU (8.5 kD) proteins in the autoradiograph are indicated to the right.
(C) 5′ UTR and proximal coding region (underlined) of canonical and leaderless yfiD mRNAs used for in vitro translation. The SD sequence of the canonical mRNA is indicated in italics. See also Figure S3.
70S ribosomes. As expected and consistent with lack of the aSD sequence, purified 70SΔ43 ribosomes translated neither canonical yfiD mRNA (Figure 4B, lane 5) nor canonical rpsU mRNA (Figure 4B, lanes 5 and 6). In contrast, they were proficient in translation of the leaderless yfiD mRNA variant (Figure 4B, lane 6). These results clearly substantiate our notion that MazF activity generates a subpopulation of ribosomes that lack 43 nts of the 3′ end of 16S rRNA and thus selectively translate lmRNAs.
Induction of MazF Activity by Stress Conditions Leads to Formation of 70SΔ43 Ribosomes and lmRNAs
The results shown above were obtained upon artificial overexpression of mazF. Therefore, we next asked whether stress conditions that trigger the mazEF module (Kolodkin-Gal and Engelberg-Kulka, 2006) can likewise induce formation of stress ribosomes. To this end, strains MC4100relA+ and MC4100relA+ΔmazEF were treated with serine hydroxamate (SHX), which induces the stringent response, thus leading to ppGpp synthesis mediated by RelA (Wendrich et al., 2002), or chloramphenicol (Cam), an inhibitor of translation elongation (Moazed and Noller, 1987). Upon stress treatment, total RNA was purified and truncation of 16S rRNA was verified by northern blot analysis, again employing probe V7 (Figure 5A). The result clearly shows that the 3′-terminal fragment is cleaved upon stress treatment in the wild-type strain, which correlates with 80% and 30% reduction of the signal obtained for 16S rRNA with the same probe upon SHX (lane 2) and Cam (lane 4) treatment, respectively (Figures 5Aa and 5Ab). In contrast, we did not detect the fragment in the mazEF deletion strain after addition of SHX or Cam (Figure 5Ab, lanes 1 and 3) or, as expected, in either strain grown in LB medium without stress treatment (Figure 5Ab, lanes 5 and 6).
Because several lines of evidence indicate induction of MazF activity upon growth in minimal medium (Amitai et al., 2009), total RNA was isolated under this condition and likewise probed with primer V7. The result reveals that, in the absence of stress treatment, growth in M9 induces MazF-mediated cleavage in 16S rRNA (Figure 5A, lane 8), which was not observed in the mazF-deletion mutant (Figure 5A, lane 7). A probe specific for 5S rRNA was used as loading control (Figure 5Ac), as only one out of eight 5S rRNAs contains an ACA site located at the very 3′ terminus in a double-stranded region (Baik et al., 2009). Next, we tested whether these conditions concomitantly result in formation of lmRNAs. Primer extension analyses with primers specific for yfiD (Figure 5B, lanes 1–5) and rpsU mRNAs (data not shown) show that mazF-mediated cleavage of both mRNAs occurs directly upstream of the start codon upon treatment with SHX (Figure 5B, lane 3) or Cam (Figure 5B, lane 4). In contrast, without stress treatment, we were not able to detect the signal corresponding to the lmRNA (Figure 5B, lane 2).
Pulse-labeling experiments performed in the presence of SHX (Figures 5C and 5D) and Cam (data not shown) likewise support the selective translation of lmRNAs caused by MazF under adverse conditions. Strains MC4100relA+ (Figure 5C) and MC4100relA+ΔmazEF (Figure 5D) harboring plasmid pRB381cI encoding the leaderless cI-lacZ fusion gene (Moll et al., 2004) were grown in M9 minimal medium. At OD600 of
[Figure 5, panels A–F. Panels A–E are blot, primer extension, and pulse-labeling images. Values recoverable from panel F:]
Time after SHX addition: 0′ | 10′ | 20′ | 30′
(a) 5S rRNA, total ribosomes (pmoles): 3.91 | 4.02 | 3.63 | 3.75
(a) 43 nt fragment, cleaved ribosomes (pmoles): 0.08 | 0.16 | 0.8 | 1.91
(b) % stress ribosomes: 2% | 4% | 22% | 51%
Figure 5. Adverse Conditions Induce the MazF-Dependent Stress Response
(A) Northern blot analyses of total RNA prepared from strains MC4100relA+ (lanes 2, 4, 6, and 8) and MC4100relA+ΔmazEF (lanes 1, 3, 5, and 7) grown in LB medium (lanes 1–6) treated with SHX (lanes 1 and 2), Cam (lanes 3 and 4), untreated (w/o; lanes 5 and 6), or grown in M9 minimal medium without treatment (w/o; lanes 7 and 8). Removal of the 3′ end of 16S rRNA (a) and generation of the 43 nt fragment (b) by MazF was determined using oligonucleotide V7. Northern blot analysis of 5S rRNA served as a loading control (c).
(B) Primer extension analysis on total RNA purified from untreated strain MC4100relA+ (lane 2) and upon treatment with SHX (lane 3) or Cam (lane 4), used for northern blotting shown in (A), with a yfiD mRNA-specific primer (Table S1). Extension signals indicate MazF cleavage upstream of the start codon upon stress treatment, as shown in Figure 1A (black arrows; lanes 3 and 4). Primer extension of in vitro-transcribed canonical (open arrow; lane 1) and leaderless (open circle; lane 5) yfiD mRNAs serve as controls.
(C and D) Pulse labeling of strains MC4100relA+ (C) and MC4100relA+ΔmazEF (D) harboring plasmid pRB381cI encoding the leaderless cI-lacZ fusion gene. Strains were grown in M9 minimal medium and pulsed either in absence (lanes 2–5) or at time points indicated upon addition of SHX (lanes 6–8). (Lane 1) Pulse labeling of strain MC4100relA+ harboring the plasmid pRB381 without cI-lacZ fusion gene to determine the position of the CIΦLacZ fusion protein (indicated by arrows).
(E) At time points that pulse labeling was performed in (C), total RNA was isolated and subjected to northern blot analysis using primer V7 (a) and primer R25 (specific for 5S rRNA, b) to determine the amount of 43 nt fragment upon addition of SHX.
(F) Quantification of the 43 nt fragment and 5S rRNA present at time points of pulse labeling upon addition of SHX indicated in (E) (Figure S4) to estimate the amount of cleaved and total ribosomes, respectively, given in pmoles (a). The percentage of cleaved ribosomes is given (b).
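The percentages in (b) can be reproduced from the pmole values in (a). The Python sketch below does so, assuming that the larger value at each time point is the 5S rRNA signal (total ribosomes) and the smaller one the 43 nt fragment (cleaved ribosomes). That assignment is inferred from the caption rather than stated explicitly, and the function name `percent_cleaved` is hypothetical.

```python
# Panel F values in pmoles, as read from the figure; the row assignment
# (which series is 5S rRNA and which is the 43 nt fragment) is an assumption.
TIME_POINTS_MIN = [0, 10, 20, 30]        # minutes after SHX addition
TOTAL_PMOL = [3.91, 4.02, 3.63, 3.75]    # 5S rRNA, proxy for total ribosomes
CLEAVED_PMOL = [0.08, 0.16, 0.80, 1.91]  # 43 nt fragment, proxy for cleaved ribosomes


def percent_cleaved(cleaved_pmol, total_pmol):
    """Percentage of ribosomes carrying the MazF-cleaved 16S rRNA."""
    if total_pmol <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * cleaved_pmol / total_pmol


percentages = [
    round(percent_cleaved(cleaved, total))
    for cleaved, total in zip(CLEAVED_PMOL, TOTAL_PMOL)
]

for minutes, pct in zip(TIME_POINTS_MIN, percentages):
    print(f"{minutes:>2} min: {pct}% stress ribosomes")
```

Rounded to whole numbers, the ratios give 2%, 4%, 22%, and 51%, matching the values listed for panel F.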
0.25, the cultures were divided and one half was treated with SHX. Before and at time points indicated in Figures 5C and 5D, pulse labeling was carried out. Intriguingly, even before SHX treatment, translation of the leaderless cI-lacZ mRNA was more efficient in the WT strain (Figure 5C, lanes 2–5) when compared to the mazEF deletion strain (Figure 5D, lanes 2–5), consistent with the observed induction of MazF activity upon growth in minimal media (Figure 5A, lanes 7 and 8). Moreover, upon SHX treatment, expression of the leaderless reporter gene as well as other particular genes continued, whereas bulk
mRNA translation was reduced in strain MC4100relA+ (Figure 5C, lanes 6–8). In contrast, employing the mazF deletion strain, translation was reduced 10 min after addition of SHX (Figure 5D, lane 6). Surprisingly, upon prolonged SHX treatment, translation ceased completely and no specific translation was detectable (Figure 5D, lanes 7 and 8). Next, total RNA was isolated at the same time points that pulse labeling was performed in strain MC4100relA+pRB381cI (Figure 5C) and was subjected to northern blot analysis using primer V7 (specific for the 43 nt fragment) (Figure 5Ea) and R25
Figure 6. A Model for the Generation of Leaderless mRNAs and Stress Ribosomes by MazF
The mazEF module can be triggered by stressful conditions (i, indicated by an arrow) (Engelberg-Kulka et al., 2006; Christensen et al., 2003), which results in (ii) degradation of the antidote MazE by the ClpAP protease (Aizenman et al., 1996). The activity of released MazF leads to degradation of the majority of transcripts (iii). In addition, it removes the 5′ UTR of specific mRNAs, thus rendering them leaderless (iv), and moreover, specifically removes the 3′-terminal 43 nts of 16S rRNA comprising helix 45 as well as the aSD sequence (v), which is essential for the formation of a translation initiation complex on canonical ribosome-binding sites. Consequently, (vi) MazF activity leads to selective translation of a "leaderless mRNA regulon."
(specific for 5S rRNA) (Figure 5Eb). Corresponding to the generation of the 43 nt fragment upon growth in minimal medium, we observed a faint signal using primer V7 before SHX treatment (Figure 5Ea, lanes 1–4). However, upon stress treatment, the amount of the 43 nt fragment increased (Figure 5Ea, lanes 5–7), and further quantification revealed that, upon 20 min treatment, when translation is specific for a distinct pool of mRNAs (Figure 5C, lane 7), 22% of ribosomes are cleaved by MazF (Figure 5F and Figure S4). The fact that only a minor population of ribosomes has to be cleaved in order to result in selective protein synthesis might be explained by inhibition of canonical ribosomes by 5′ UTRs containing SD sequences that were cleaved off by MazF (Figure 1). As these RNA fragments were determined to be rather stable (O.V. and I.M., unpublished data), it is conceivable that these fragments bind to "uncleaved" 30S subunits via SD-aSD interaction, thus blocking translation of canonical mRNAs. This idea is supported by work of Mawn et al. (2002), who showed that overexpression of RNA fragments containing SD-like sequences is detrimental for cell viability, as it leads to depletion of free 30S ribosomal subunits. Experiments scrutinizing our hypothesis are currently ongoing.
To unambiguously verify that cleavage at position A1500 in the 16S rRNA is pivotal for the posttranscriptional stress response pathway presented here, we introduced mutations at positions A1500 and A1502 in the rrsB gene encoding the 16S rRNA in plasmid pKK3535 containing the rrnB ribosomal RNA operon (Brosius et al., 1981). However, in contrast to results obtained by Vila-Sanjurjo and Dahlberg (2001), strain SQZ10Δ7 (Cochella and Green, 2004) lacking all rRNA operons was not viable when
the only sources for 16S rRNA were plasmids containing mutant rrsB genes, supporting our notion that the presence of the ACA site is crucial for cell viability.
DISCUSSION
mazEF as a Master Regulatory Element that Leads to Alteration of the Translation Program under Stress Conditions in E. coli
We have previously shown that, though induction of the endoribonuclease MazF leads to translation inhibition of the majority of mRNAs, the synthesis of an exclusive group of about 50 proteins is still permitted (Amitai et al., 2009). Here, we discovered a molecular mechanism leading to selective translation of specific mRNAs upon MazF induction. We show for the first time that MazF activity leads to an alteration of the translation program by generating a modified translational apparatus composed of functionally specialized ribosomes on one hand and lmRNAs on the other, as illustrated in Figure 6. The E. coli mazEF module is located downstream of the relA gene (Metzger et al., 1988), whose product is responsible for ppGpp synthesis upon amino acid starvation (Wendrich et al., 2002). Stressful conditions that inhibit mazEF expression, such as antibiotics inhibiting transcription and/or translation or increased ppGpp concentration upon severe amino acid starvation (Figure 6, i, indicated by an arrow) (Engelberg-Kulka et al., 2006; Hazan et al., 2004; Christensen et al., 2003), prevent de novo synthesis of both MazE and MazF. Subsequently, the labile MazE is degraded by the ClpAP protease (Figure 6, ii), thereby permitting MazF to act freely as
endoribonuclease, which results in degradation of the majority of mRNAs (Figure 6, iii). In the work presented here, we show that, in addition, MazF removes the 5′ UTR of specific mRNAs, thus rendering them leaderless (Figure 6, iv). Moreover, we were able to demonstrate that another target of the endoribonuclease is the ribosome: MazF specifically removes the 3′-terminal 43 nts of 16S rRNA containing helix 45 as well as the aSD sequence (Figure 6, v), which is essential for formation of a translation initiation complex on canonical ribosome-binding sites (Hui and de Boer, 1987; Calogero et al., 1988). Consequently, (Figure 6, vi) MazF activity leads to selective translation of a "leaderless mRNA regulon" by a subpopulation of specialized ribosomes.
Stress Adaptation via the Generation of Functionally Specialized Ribosomes and the "Leaderless Stress Regulon"
Bacteria must cope with environments that undergo perpetual alterations in temperature, osmolarity, pH, availability of nutrients, and antibiotics, as well as a variety of other adverse agents and conditions. A strategy that bacteria have developed in order to cope with such environmental changes is called stress response. It involves activation of specific sets of genes, mainly at the level of transcription (Storz and Hengge-Aronis, 2000). This report provides a paradigm for posttranscriptional stress response in bacteria based on ribosome specialization: the heterogeneity of the translational machinery caused by MazF results in selective synthesis of proteins encoded by the "leaderless stress regulon." Thus, the destructive endonuclease MazF turned out to be an instructive element by its ability to generate a subpopulation of distinct ribosomes. Therefore, it should be emphasized that MazF does not cause a complete change but a crucial modulation of the translation program, thereby coupling protein synthesis to the physiological state of the cell.
As expected for every newly discovered mechanism, there still remain critical questions, some of which are under our current investigation and two of which are described here. First, our results show that, under stressful conditions that may be less drastic than the one obtained by overproduction of MazF, the ribosomal population is heterogeneous, including canonical ribosomes and the described specialized ribosomes. The question of whether the bifurcation of the ribosomal population occurs inside of individual cells or is distributed among subpopulations of cells still remains to be elucidated. Second, what is the cellular fate of specialized ribosomes lacking the 3′ end of 16S rRNA? One possibility is that accumulation of such ribosomes is part of the mazEF-mediated death program, whereby we found "a point of no return" in MazF lethality, particularly in minimal medium (Amitai et al., 2004; Kolodkin-Gal and Engelberg-Kulka, 2006). However, there is an initial stage in which the effect of MazF can still be reversed by the antitoxin MazE (Amitai et al., 2004; Kolodkin-Gal and Engelberg-Kulka, 2006). Therefore, a "ribosome repair system" might exist that enables recovering from stressful conditions. Further studies on the leaderless stress regulon will shed light on bacterial pathways in which this regulon is involved and their relation to general physiological phenomena in E. coli. These include cell death (Engelberg-Kulka et al., 2006; Kolodkin-Gal et al., 2007; Amitai et al., 2009), growth arrest (Gerdes et al., 2005), biofilm formation
(Kolodkin-Gal et al., 2009), and persistence (Keren et al., 2004), which were previously described as being related to the mazEF module.
EXPERIMENTAL PROCEDURES
Bacterial Strains and Plasmids Used in This Study
E. coli strains MC4100relA+ and MC4100relA+ΔmazEF (both Engelberg-Kulka et al., 1998), BL21 (DE3) (Invitrogen), TG1 (Gibson, 1984), and MG1655 (Blattner et al., 1997) have been described. Unless otherwise indicated, bacterial cultures were grown in LB broth at 37°C supplemented with 100 μg/ml ampicillin or 15 μg/ml tetracycline, as appropriate for plasmid maintenance. Growth of liquid cultures was monitored photometrically by measuring the optical density at 600 nm. Plasmids pKTplaccI (Grill et al., 2000) and pRB381cI (Moll et al., 2004) harbor the first 189 nts of the λ cI gene fused to the eighth codon of the lacZ gene under control of a constitutive lac promoter. Plasmid pSA1 is a derivative of pQE30 (QIAGEN) harboring the mazF gene under control of the T5 promoter and the lac operator (Amitai et al., 2009). Plasmid pET28a-mazEF(His)6 was constructed from pET28a (Novagen) to coexpress MazE and MazF(His)6 under the control of the T7 promoter, using the SD sequence of the mazEF operon.
Purification of Ribosomes upon In Vivo mazF Expression and Verification of MazF Cleavage
To verify MazF cleavage in 16S rRNA in ribosomes in vivo, E. coli strain MC4100relA+pSA1 was grown in LB. At OD600 of 0.5, the culture was divided, and in one half, mazF expression was induced by addition of 500 mM IPTG. Thirty minutes later, cells were harvested, resuspended in Tico buffer (20 mM HEPES-KOH [pH 7.4], 6 mM magnesium acetate, 30 mM ammonium acetate, and 4 mM β-mercaptoethanol), and lysed by the lysozyme freeze-thaw method. Upon separation of the S30 extract through a 10% sucrose cushion made up in Tico buffer, the pellet containing crude ribosomes was resuspended in Tico buffer.
70S ribosomes still containing the aSD sequence were removed employing biotinylated SD oligonucleotides (Table S1) that were immobilized on streptavidin-coated magnetic beads. The absence of the 3′-terminal 16S rRNA fragment from ribosomes purified upon MazF treatment was determined by northern blot analysis employing oligonucleotides V43 and V7, which bind to nts 939–955 in the central part of 16S rRNA and to nts 1511–1535 within the 3′-terminal fragment, respectively (Table S1). In brief, rRNA was prepared from ribosomes used for in vitro translation analysis (Figure 4C) by phenol-chloroform extraction. Upon ethanol precipitation, the rRNA (0.5 mg each) was fractionated on a 4% denaturing polyacrylamide gel, transferred to Hybond-membrane (Amersham) using the Trans-Blot Semi-Dry Transfer Cell (Bio-Rad), and then hybridized to [32P]-labeled oligos V7 and V43. The signals obtained with the labeled probes were visualized by a PhosphorImager (Molecular Dynamics) and quantified employing ImageQuant software. Likewise, total RNA prepared under the same conditions employed for ribosome purification was used to verify the presence of the 3′-terminal fragment upon induction of mazF expression. Northern blot analysis was performed exactly as described before to optimize for the short RNA fragment (Pall and Hamilton, 2008).
SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures, four figures, and one table and is available with this article online at doi:10.1016/j.cell.2011.07.047.
ACKNOWLEDGMENTS
This work was supported by grants P20112-B03 and P22249-B20 from the Austrian Science Fund to I.M., by grant number 66/10 from the Israel Science Foundation (ISF) administrated by the Israel Academy of Science and Humanities, by the USA Army grant W911NF0910212, and by NIH grant GM069509 to H.E.-K.
Received: November 2, 2010 Revised: March 11, 2011 Accepted: July 21, 2011 Published online: September 22, 2011
REFERENCES
Agarwal, S., Agarwal, S., and Bhatnagar, R. (2007). Identification and characterization of a novel toxin-antitoxin module from Bacillus anthracis. FEBS Lett. 581, 1727–1734.
Aizenman, E., Engelberg-Kulka, H., and Glaser, G. (1996). An Escherichia coli chromosomal "addiction module" regulated by guanosine [corrected] 3′,5′-bispyrophosphate: a model for programmed bacterial cell death. Proc. Natl. Acad. Sci. USA 93, 6059–6063.
Amitai, S., Yassin, Y., and Engelberg-Kulka, H. (2004). MazF-mediated cell death in Escherichia coli: a point of no return. J. Bacteriol. 186, 8295–8300.
Amitai, S., Kolodkin-Gal, I., Hananya-Meltabashi, M., Sacher, A., and Engelberg-Kulka, H. (2009). Escherichia coli MazF leads to the simultaneous selective synthesis of both "death proteins" and "survival proteins". PLoS Genet. 5, e1000390.
Baik, S., Inoue, K., Ouyang, M., and Inouye, M. (2009). Significant bias against the ACA triplet in the tmRNA sequence of Escherichia coli K-12. J. Bacteriol. 191, 6157–6166.
Blattner, F.R., Plunkett, G., III, Bloch, C.A., Perna, N.T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J.D., Rode, C.K., Mayhew, G.F., et al. (1997). The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1462.
Brosius, J., Ullrich, A., Raker, M.A., Gray, A., Dull, T.J., Gutell, R.R., and Noller, H.F. (1981). Construction and fine mapping of recombinant plasmids containing the rrnB ribosomal RNA operon of E. coli. Plasmid 6, 112–118.
Calogero, R.A., Pon, C.L., Canonaco, M.A., and Gualerzi, C.O. (1988). Selection of the mRNA translation initiation region by Escherichia coli ribosomes. Proc. Natl. Acad. Sci. USA 85, 6427–6431.
Christensen, S.K., Mikkelsen, M., Pedersen, K., and Gerdes, K. (2001). RelE, a global inhibitor of translation, is activated during nutritional stress. Proc. Natl. Acad. Sci. USA 98, 14328–14333.
Christensen, S.K., Pedersen, K., Hansen, F.G., and Gerdes, K. (2003). Toxin-antitoxin loci as stress-response-elements: ChpAK/MazF and ChpBK cleave translated RNAs and are counteracted by tmRNA. J. Mol. Biol. 332, 809–819.
Cochella, L., and Green, R. (2004). Isolation of antibiotic resistance mutations in the rRNA by using an in vitro selection system. Proc. Natl. Acad. Sci. USA 101, 3786–3791.
DeLano, W.L. (2002). The PyMOL Molecular System (San Carlos, CA: DeLano Scientific).
Engelberg-Kulka, H., and Glaser, G. (1999). Addiction modules and programmed cell death and antideath in bacterial cultures. Annu. Rev. Microbiol. 53, 43–70.
Engelberg-Kulka, H., Reches, M., Narasimhan, S., Schoulaker-Schwarz, R., Klemes, Y., Aizenman, E., and Glaser, G. (1998). rexB of bacteriophage lambda is an anti-cell death gene. Proc. Natl. Acad. Sci. USA 95, 15481–15486.
Engelberg-Kulka, H., Amitai, S., Kolodkin-Gal, I., and Hazan, R. (2006). Bacterial programmed cell death and multicellular behavior in bacteria. PLoS Genet. 2, e135.
Gabashvili, I.S., Agrawal, R.K., Spahn, C.M., Grassucci, R.A., Svergun, D.I., Frank, J., and Penczek, P. (2000). Solution structure of the E. coli 70S ribosome at 11.5 Å resolution. Cell 100, 537–549.
Gerdes, K., Christensen, S.K., and Løbner-Olesen, A. (2005). Prokaryotic toxin-antitoxin stress response loci. Nat. Rev. Microbiol. 3, 371–382.
Gibson, T.J. (1984). Studies on the Epstein-Barr virus genome. PhD thesis, University of Cambridge, UK.
Green, J., Baldwin, M.L., and Richardson, J. (1998). Downregulation of Escherichia coli yfiD expression by FNR occupying a site at -93.5 involves the AR1-containing face of FNR. Mol. Microbiol. 29, 1113–1123.
Grill, S., Gualerzi, C.O., Londei, P., and Bläsi, U. (2000). Selective stimulation of translation of leaderless mRNA by initiation factor 2: evolutionary implications for translation. EMBO J. 19, 4101–4110.
Hammer-Jespersen, K., Munch-Petersen, A., Schwartz, M., and Nygaard, P. (1971). Induction of enzymes involed in the catabolism of deoxyribonucleosides and ribonucleosides in Escherichia coli K 12. Eur. J. Biochem. 19, 533–538.
Hayes, F. (2003). Toxins-antitoxins: plasmid maintenance, programmed cell death, and cell cycle arrest. Science 301, 1496–1499.
Hazan, R., Sat, B., and Engelberg-Kulka, H. (2004). Escherichia coli mazEF-mediated cell death is triggered by various stressful conditions. J. Bacteriol. 186, 3663–3669.
Hui, A., and de Boer, H.A. (1987). Specialized ribosome system: preferential translation of a single mRNA species by a subpopulation of mutated ribosomes in Escherichia coli. Proc. Natl. Acad. Sci. USA 84, 4762–4766.
Kaberdina, A.C., Szaflarski, W., Nierhaus, K.H., and Moll, I. (2009). An unexpected type of ribosomes induced by kasugamycin: a look into ancestral times of protein synthesis? Mol. Cell 33, 227–236.
Kaminishi, T., Wilson, D.N., Takemoto, C., Harms, J.M., Kawazoe, M., Schluenzen, F., Hanawa-Suetsugu, K., Shirouzu, M., Fucini, P., and Yokoyama, S. (2007). A snapshot of the 30S ribosomal subunit capturing mRNA via the Shine-Dalgarno interaction. Structure 15, 289–297.
Keren, I., Shah, D., Spoering, A., Kaldalu, N., and Lewis, K. (2004). Specialized persister cells and the mechanism of multidrug tolerance in Escherichia coli. J. Bacteriol. 186, 8172–8180.
Kolodkin-Gal, I., and Engelberg-Kulka, H. (2006). Induction of Escherichia coli chromosomal mazEF by stressful conditions causes an irreversible loss of viability. J. Bacteriol. 188, 3420–3423.
Kolodkin-Gal, I., Hazan, R., Gaathon, A., Carmeli, S., and Engelberg-Kulka, H. (2007). A linear pentapeptide is a quorum-sensing factor required for mazEF-mediated cell death in Escherichia coli. Science 318, 652–655.
Kolodkin-Gal, I., Verdiger, R., Shlosberg-Fedida, A., and Engelberg-Kulka, H. (2009). A differential effect of E. coli toxin-antitoxin systems on cell death in liquid media and biofilm formation. PLoS ONE 4, e6785.
Lupski, J.R., Ruiz, A.A., and Godson, G.N. (1984). Promotion, termination, and anti-termination in the rpsU-dnaG-rpoD macromolecular synthesis operon of E. coli K-12. Mol. Gen. Genet. 195, 391–401.
Marianovsky, I., Aizenman, E., Engelberg-Kulka, H., and Glaser, G. (2001). The regulation of the Escherichia coli mazEF promoter involves an unusual alternating palindrome. J. Biol. Chem. 276, 5975–5984.
Masuda, Y., Miyakawa, K., Nishimura, Y., and Ohtsubo, E. (1993). chpA and chpB, Escherichia coli chromosomal homologs of the pem locus responsible for stable maintenance of plasmid R100. J. Bacteriol. 175, 6850–6856.
Mawn, M.V., Fournier, M.J., Tirrell, D.A., and Mason, T.L. (2002). Depletion of free 30S ribosomal subunits in Escherichia coli by expression of RNA containing Shine-Dalgarno-like sequences. J. Bacteriol. 184, 494–502.
Metzger, S., Dror, I.B., Aizenman, E., Schreiber, G., Toone, M., Friesen, J.D., Cashel, M., and Glaser, G. (1988). The nucleotide sequence and characterization of the relA gene of Escherichia coli. J. Biol. Chem. 263, 15699–15704.
Mittenhuber, G. (1999). Occurrence of mazEF-like antitoxin/toxin systems in bacteria. J. Mol. Microbiol. Biotechnol. 1, 295–302.
Moazed, D., and Noller, H.F. (1987). Chloramphenicol, erythromycin, carbomycin and vernamycin B protect overlapping sites in the peptidyl transferase region of 23S ribosomal RNA. Biochimie 69, 879–884.
Moll, I., Grill, S., Gründling, A., and Bläsi, U. (2002). Effects of ribosomal proteins S1, S2 and the DeaD/CsdA DEAD-box helicase on translation of leaderless and canonical mRNAs in Escherichia coli. Mol. Microbiol. 44, 1387–1396.
Moll, I., Hirokawa, G., Kiel, M.C., Kaji, A., and Bläsi, U. (2004). Translation initiation with 70S ribosomes: an alternative pathway for leaderless mRNAs. Nucleic Acids Res. 32, 3354–3363.
Pall, G.S., and Hamilton, A.J. (2008). Improved northern blot method for enhanced detection of small RNA. Nat. Protoc. 3, 1077–1084.
Pandey, D.P., and Gerdes, K. (2005). Toxin-antitoxin loci are highly abundant in free-living but lost from host-associated prokaryotes. Nucleic Acids Res. 33, 966–976.
Porollo, A., and Meller, J. (2007). Versatile annotation and publication quality visualization of protein complexes using POLYVIEW-3D. BMC Bioinformatics 8, 316.
Ramage, H.R., Connolly, L.E., and Cox, J.S. (2009). Comprehensive functional analysis of Mycobacterium tuberculosis toxin-antitoxin systems: implications for pathogenesis, stress responses, and evolution. PLoS Genet. 5, e1000767.
Sat, B., Hazan, R., Fisher, T., Khaner, H., Glaser, G., and Engelberg-Kulka, H. (2001). Programmed cell death in Escherichia coli: some antibiotics can trigger mazEF lethality. J. Bacteriol. 183, 2041–2045.
Schmidt, O., Schuenemann, V.J., Hand, N.J., Silhavy, T.J., Martin, J., Lupas, A.N., and Djuranovic, S. (2007). prlF and yhaV encode a new toxin-antitoxin system in Escherichia coli. J. Mol. Biol. 372, 894–905.
Schuwirth, B.S., Borovinskaya, M.A., Hau, C.W., Zhang, W., Vila-Sanjurjo, A., Holton, J.M., and Cate, J.H. (2005). Structures of the bacterial ribosome at 3.5 Å resolution. Science 310, 827–834.
Senior, B.W., and Holland, I.B. (1971). Effect of colicin E3 upon the 30S ribosomal subunit of Escherichia coli. Proc. Natl. Acad. Sci. USA 68, 959–963.
Storz, G., and Hengge-Aronis, R. (2000). Bacterial Stress Responses (Washington, DC: ASM Press).
Vila-Sanjurjo, A., and Dahlberg, A.E. (2001). Mutational analysis of the conserved bases C1402 and A1500 in the center of the decoding domain of Escherichia coli 16S rRNA reveals an important tertiary interaction. J. Mol. Biol. 308, 457–463.
Wagner, A.F., Schultz, S., Bomke, J., Pils, T., Lehmann, W.D., and Knappe, J. (2001). YfiD of Escherichia coli and Y06I of bacteriophage T4 as autonomous glycyl radical cofactors reconstituting the catalytic center of oxygen-fragmented pyruvate formate-lyase. Biochem. Biophys. Res. Commun. 285, 456–462.
Wendrich, T.M., Blaha, G., Wilson, D.N., Marahiel, M.A., and Nierhaus, K.H. (2002). Dissection of the mechanism for the stringent factor RelA. Mol. Cell 10, 779–788.
Woodcock, J., Moazed, D., Cannon, M., Davies, J., and Noller, H.F. (1991). Interaction of antibiotics with A- and P-site-specific bases in 16S ribosomal RNA. EMBO J. 10, 3099–3103.
Wrzesinski, J., Bakin, A., Nurse, K., Lane, B.G., and Ofengand, J. (1995). Purification, cloning, and properties of the 16S RNA pseudouridine 516 synthase from Escherichia coli. Biochemistry 34, 8904–8913.
Yusupova, G.Z., Yusupov, M.M., Cate, J.H., and Noller, H.F. (2001). The path of messenger RNA through the ribosome. Cell 106, 233–241.
Yusupova, G., Jenner, L., Rees, B., Moras, D., and Yusupov, M. (2006). Structural basis for messenger RNA movement on the ribosome. Nature 444, 391–394.
Zhang, J.J., Zhang, Y.L., and Inouye, M. (2003). Characterization of the interactions within the mazEF addiction module of Escherichia coli. J. Biol. Chem. 278, 32300–32306.
Zhang, J.J., Zhang, Y.L., Zhu, L., Suzuki, M., and Inouye, M. (2004). Interference of mRNA function by sequence-specific endoribonuclease PemK. J. Biol. Chem. 279, 20678–20684.
Zhu, L., Phadtare, S., Nariya, H., Ouyang, M., Husson, R.N., and Inouye, M. (2008). The mRNA interferases, MazF-mt3 and MazF-mt7 from Mycobacterium tuberculosis target unique pentad sequences in single-stranded RNA. Mol. Microbiol. 69, 559–569.
Cell 147, 147–157, September 30, 2011 ©2011 Elsevier Inc.
Regulatory Control of the Resolution of DNA Recombination Intermediates during Meiosis and Mitosis
Joao Matos,1,2 Miguel G. Blanco,1,2 Sarah Maslen,1 J. Mark Skehel,1 and Stephen C. West1,*
1London Research Institute, Cancer Research UK, Clare Hall Laboratories, South Mimms, Herts EN6 3LD, UK
2These authors contributed equally to this work
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.08.032
Cell 147, 158–172, September 30, 2011 ©2011 Elsevier Inc.

SUMMARY
The efficient and timely resolution of DNA recombination intermediates is essential for bipolar chromosome segregation. Here, we show that the specialized chromosome segregation patterns of meiosis and mitosis, which require the coordination of recombination with cell-cycle progression, are achieved by regulating the timing of activation of two crossover-promoting endonucleases. In yeast meiosis, Mus81-Mms4 and Yen1 are controlled by phosphorylation events that lead to their sequential activation. Mus81-Mms4 is hyperactivated by Cdc5-mediated phosphorylation in meiosis I, generating the crossovers necessary for chromosome segregation. Yen1 is also tightly regulated and is activated in meiosis II to resolve persistent Holliday junctions. In yeast and human mitotic cells, a similar regulatory network restrains these nuclease activities until mitosis, biasing the outcome of recombination toward noncrossover products while also ensuring the elimination of any persistent joint molecules. Mitotic regulation thereby facilitates chromosome segregation while limiting the potential for loss of heterozygosity and sister-chromatid exchanges.

INTRODUCTION
During mitosis and meiosis, cells commit to the transmission of a complete set of chromosomes to the next generation. Whereas the bipolar segregation of replicated sister chromatids keeps the chromosome complement unchanged during mitosis, meiosis generates haploid gametes from diploid germ cells through a single DNA replication phase followed by two consecutive rounds of chromosome segregation. Homologous chromosomes (homologs) segregate in meiosis I and sister chromatids disjoin in meiosis II.

The ability of meiotic cells to segregate homologs during meiosis I requires the coordination of a series of specialized events. Most organisms use reciprocal recombination between maternal and paternal chromatids to create crossovers (COs) that link homologs through cohesin-mediated sister-chromatid cohesion. When sister kinetochores attach to microtubules from the same pole, rather than from opposite poles as occurs in mitosis, the chiasmata enable the meiosis I spindle to pull maternal and paternal centromeres in opposite directions. Therefore, and in contrast to mitosis, the formation of meiotic COs provides the indispensable mechanical basis for accurate chromosome segregation.

The importance of CO formation during meiosis can be appreciated by the complex and potentially deleterious strategy that cells employ in their generation. Most organisms produce COs upon deliberate chromosome breakage, which is initiated by double-strand break (DSB) formation mediated by meiosis-specific expression of Spo11 (Keeney et al., 1997). Recombination with a homologous chromosome leads to the formation of joint molecule (JM) intermediates in which the interacting DNAs are linked by double Holliday junctions (dHJs) (Allers and Lichten, 2001; Hunter and Kleckner, 2001; Schwacha and Kleckner, 1995). Studies from various organisms indicate that there are at least three pathways by which HJs can be processed to generate COs. In budding yeast these involve the Mus81-Mms4, Slx1-Slx4, and Yen1 endonucleases (Fricke and Brill, 2003; Ip et al., 2008; Kaliraman et al., 2001). Different organisms, however, show a specific dependence on one pathway or another. For example, meiotic CO formation in Schizosaccharomyces pombe is dependent only upon Mus81-Eme1 (Eme1 is the ortholog of Mms4) (Boddy et al., 2001; Osman et al., 2003), and a Yen1 ortholog cannot be identified in its genome (Ip et al., 2008).
In contrast, Saccharomyces cerevisiae mus81Δ mutants show a small reduction in CO formation and form spores efficiently, albeit with reduced viability (50% of wild-type), suggesting that Mus81-Mms4 plays a relatively modest role in HJ processing and CO formation (de los Santos et al., 2001, 2003; Haber and Heyer, 2001). In budding yeast, Slx1-Slx4 appears to be relatively unimportant for CO formation, as meiotic defects are not observed in slx1 or slx4 mutants (Mullen et al., 2001), and the role of Yen1 has not been investigated. However, Yen1 and Mus81-Mms4 provide overlapping functions in promoting JM resolution and CO formation during mitotic DNA repair (Blanco et al., 2010; Ho et al., 2010; Tay and Wu, 2010). These observations highlight the possibility that a degree of functional redundancy between nucleases might obscure their specific contributions toward JM resolution and the completion of meiotic recombination.
The efficient and appropriate resolution of recombination intermediates is a key event in all cells. During meiosis, dHJs need to be resolved to form the COs necessary for the segregation of homologs, whereas in mitotic cells noncrossover (NCO) formation is favored to avoid the potential for loss of heterozygosity and high levels of sister-chromatid exchanges (SCEs). Indeed, during mitotic recombination CO formation is avoided either by the use of antirecombinogenic pathways that disengage JMs at an early stage, or by the actions of enzymes that promote dHJ dissolution. For example, in budding yeast, DNA helicases such as Srs2 and Sgs1 have been shown to suppress CO formation and to play important roles in recombinational DNA repair (Gangloff et al., 1994; Ira et al., 2003). The timing by which JMs are processed is also critical because unless they are disengaged/processed at the appropriate time, their presence will constitute a physical impediment to chromosome segregation. In budding yeast meiosis, the timing of JM resolution and CO formation is coordinated with cell-cycle progression through the NDT80-dependent expression of the Polo-like kinase Cdc5, to ensure the bipolar segregation of fully resolved DNA (Clyne et al., 2003; Sourirajan and Lichten, 2008). Ndt80-mediated transcription of Cdc5, and other components of the chromosome segregation machinery, is activated in late pachytene as recombination and synapsis checkpoints are satisfied and cells prepare for meiosis I (Chu et al., 1998). Interestingly, whereas Cdc5-depleted cells accumulate JMs, such as dHJs, they are still capable of expressing NDT80-dependent genes and assemble meiosis I spindles (Clyne et al., 2003; Lee and Amon, 2003). These observations indicate that the formation/resolution of late recombination intermediates might not be under checkpoint surveillance. 
Because a single unmonitored JM could result in chromosome nondisjunction and aneuploidy, cells need to eliminate dHJs prior to chromosome segregation, and it is possible that Cdc5 kinase may play a role in regulating the timing of such events.

In this work, we have analyzed the activities of cellular Holliday junction resolvases, in both yeast and human cells. We find that the activities of yeast Mus81-Mms4 (human MUS81-EME1) and Yen1 (GEN1) are tightly regulated throughout the cell cycle, in both meiotic and mitotic cells, in order to coordinate JM resolution with chromosome segregation. Regulation is mediated through cycles of phosphorylation/dephosphorylation that directly modulate HJ resolvase activity, and in the case of Mus81-Mms4 these regulatory cycles are dependent upon Cdc5. These results bring a new cell-cycle dimension to the completion of homologous recombination in both meiotic and mitotic cells. Moreover, the combination of tight activity regulation with multiple overlapping activities ensures the elimination of JMs, while also providing a flexible control that determines the outcome (CO versus NCO) of the recombination process according to the needs of meiotic or mitotic division.

RESULTS
Mus81-Mms4 and Yen1 Ensure the Completion of Meiotic Recombination
To determine the roles and potential interplay between Mus81-Mms4, Yen1, and Slx1-Slx4 during meiotic recombination, S. cerevisiae diploids were generated with deletions in one or more of the corresponding genes. The yen1Δ, slx1Δ, and slx4Δ single mutants formed spores with similar efficiency to wild-type cells (>90% sporulation efficiency), whereas mus81Δ exhibited a slightly reduced efficiency (80% sporulation) (Figure 1A). The defect in mus81Δ mutants was enhanced by the lack of YEN1, as only 30% of the mus81Δ yen1Δ cells formed mature spores. In contrast, other double mutant combinations, such as mus81Δ slx1Δ, mus81Δ slx4Δ, yen1Δ slx1Δ, or yen1Δ slx4Δ failed to show a synthetic effect in terms of sporulation efficiency.

Yen1 and Mus81-Mms4 play overlapping roles during DNA repair in proliferating cells (Blanco et al., 2010; Ho et al., 2010; Tay and Wu, 2010). To eliminate the possibility that defects in sporulation in mus81Δ yen1Δ mutants might result from the accumulation of toxic repair intermediates from the preceding mitosis, cells expressing PCLB2-MMS4 were generated to specifically deplete Mms4 from meiotic cells (Oh et al., 2008). The PCLB2-MMS4 yen1Δ mutants showed a similar defect in sporulation to that of mus81Δ yen1Δ (Figure 1A) or mms4Δ yen1Δ (see Figure S5C available online). Furthermore, using mus81Δ yen1Δ spo11Δ strains, we confirmed that this defect was a consequence of incomplete meiotic recombination (Figure 1A).

Measurements of spore viability revealed a complete defect in the formation of viable spores when the YEN1 deletion was combined with mus81Δ, PCLB2-MMS4, or mms4Δ (Figure 1B and Figure S5D). Examination of the resulting asci showed that the DNA from the majority of mus81Δ yen1Δ cells was concentrated in a single mass (Figure 1C). Deletion of SPO11 alleviated this problem, showing that the DNA segregation defect stems from the accumulation of unresolved recombination intermediates. In contrast, yen1Δ slx1Δ or yen1Δ slx4Δ exhibited wild-type levels of spore viability (Figure 1B).
Figure 1. Requirement for Mus81-Mms4 and Yen1 in Meiosis
(A and B) The efficiencies of spore formation and viability were measured after 24 hr at 30°C with the indicated strains. Bars indicate standard errors.
(C) Representative asci are shown.

Distinct Roles for Mus81-Mms4 and Yen1 in JM Resolution in Meiosis I and II
To determine whether Mus81-Mms4 and Yen1 play distinct or redundant functions, the stages of meiotic chromosome segregation were visualized in yen1Δ, mus81Δ, and mus81Δ yen1Δ mutants. For this, strains were constructed in which both homologs of chromosome V were marked by GFP at URA3 (homozygous URA3-GFP), a Myc18-tagged version of the anaphase inhibitor Pds1 was expressed, and the meiosis-specific cohesin subunit Rec8 was tagged with Ha3. In wild-type cells, the metaphase I to anaphase I transition is triggered by Pds1 degradation, followed by spindle elongation, segregation of DNA into two masses, and bipolar segregation of homologous URA3 sequences. The binucleates then reaccumulate Pds1 and form a pair of short meiosis II spindles. Next, a second round of Pds1 degradation initiates the metaphase II to anaphase II transition, resulting in the tetrapolar segregation of URA3-GFP and formation of four distinct nuclei (Figure 2A).

Whereas the meiotic segregation events in yen1Δ mutants were similar to wild-type (Figure 2A), mus81Δ mutants displayed an unanticipated pattern of chromosome segregation. Although meiosis I spindles were assembled with similar kinetics to control cells, most mus81Δ mutants failed to segregate homologous chromosomes upon Pds1 destruction at anaphase I (Figure 2A and Figure S1). Pds1 reaccumulated normally as meiosis II spindles were formed in mononucleated cells, followed by a second round of Pds1 destruction. At anaphase II, however, sister chromatids segregated efficiently in a tetrapolar fashion, indicating that JM resolution occurred after meiosis I, ultimately allowing chromosome segregation.

In contrast, mus81Δ yen1Δ cells formed meiosis I and II spindles, underwent two cycles of Pds1 accumulation and destruction, but failed to segregate the bulk of their DNA at either anaphase I or anaphase II (Figure 2A and Figure S1). Control experiments showed that this was not due to the persistence of proteinaceous interhomolog connections, as synaptonemal complex (SC) disassembly and cohesin cleavage occurred normally at meiosis I (Figures S1E and S1F). These results indicate that mus81Δ yen1Δ mutants fail to segregate their chromosomes during meiosis I due to the lack of Mus81-Mms4 activity, and that Yen1 is required for the single meiotic division that takes place in the absence of Mus81-Mms4.

The lack of chromosome segregation in the mus81Δ yen1Δ mutants was due to unresolved meiotic DNA joint molecules, as elimination of DSB formation in mus81Δ yen1Δ spo11Δ mutants restored both rounds of chromosome segregation (Figure 2B and Figure S1). To determine whether Yen1 plays a specific role in the resolution of intersister joint molecules, we analyzed dmc1Δ mek1Δ mutants where interhomolog recombination is suppressed in favor of intersister repair (Niu et al., 2005) and found that YEN1 was dispensable for the segregation of sister chromatids at anaphase II (Figure S1G). Taken together, these results indicate that Yen1 is responsible for the resolution of persistent JMs, possibly as a backup for Mus81-Mms4.
Yen1 Is Tightly Regulated and Activated at the Onset of Anaphase II
Our finding that most mus81Δ mutant cells fail to undergo meiosis I even in the presence of YEN1 indicates that the role of Yen1 in JM resolution is specific for meiosis II. One explanation for this surprising result would be that either Yen1 protein is absent during meiosis I, or that Yen1 activity is kept in check until the onset of meiosis II. We therefore developed a strategy that allowed us to analyze Yen1 activity throughout meiosis using synchronized cells expressing Myc-tagged Yen1 (YEN1-myc9). Following meiotic induction, samples were taken at different time points, Yen1-myc9 was affinity-purified from extracts using anti-Myc beads, and HJ resolvase activity was assayed directly by the addition of synthetic 32P-labeled HJ DNA (see scheme, Figure S2A).

Whereas the levels of Yen1 protein remained relatively constant throughout meiosis, the activity of the protein changed dramatically, as measured by the conversion of HJ DNA into nicked duplex products. Very little HJ resolvase activity was observed from S phase (Figure 3A, lower panel, lanes i and j, and quantification in Figure 3B) until the activity increased sharply as the cells entered meiosis II (Figure 3A, lanes l and m). The precise timing of Yen1 activation was monitored in highly synchronous meiotic cultures using an ndt80Δ arrest/release system, showing that Yen1 activation occurred after the accumulation of the meiosis II-specific cyclin Clb3, at the metaphase II to anaphase II transition (Figure S2B, lanes h–k and Figure S2C). Control anti-Myc immunoaffinity purification from cells expressing either untagged Yen1 (Figure 3A, lanes a–g) or Myc-tagged Yen1-EE (Figure S2D), a catalytic-dead derivative of Yen1 (Ip et al., 2008), showed that Yen1 was directly responsible for the HJ resolution activity.

The human ortholog of Yen1, GEN1, cleaves a variety of DNA substrates including HJs, 5′-flaps, and replication fork structures (Ip et al., 2008; Rass et al., 2010). To determine whether activation of Yen1 might be specific for one particular substrate, we prepared Myc-tagged Yen1 from cells taken at different stages of meiosis and compared their activities using the three DNA substrates. Similar activation profiles were obtained, indicating that the observed regulation most likely operates by a general inhibition of nuclease activity (Figure S2E).

Regulation of Yen1 Activity by Posttranslational Modification
Next, we determined whether the changes to Yen1 activity might be due to posttranslational modifications.
When analyzed by phosphoaffinity SDS-PAGE, a method that retards phosphorylated proteins, we observed that the inactivation of Yen1 during the early stages of meiosis correlated with a slow electrophoretic mobility of the protein (Figure 3C, lanes c–e). At the onset of meiosis II, however, the mobility of Yen1 increased significantly (Figure 3C, lanes f and g), and the timing of this event occurred together with the activation of Yen1's nuclease activity (Figure 3A, lanes k–m). This correlation was confirmed by treating "inactive" Yen1-myc9 (immunoaffinity purified from cells synchronized at prophase I by NDT80 deletion) with λ-phosphatase. Dephosphorylation of Yen1 resulted in a dramatic activation of the nuclease activity of the protein (Figure 3D, lane c), whereas the activity was unaffected by treatment with inactivated λ-phosphatase (Figure 3D, lane d).

These results show that Yen1 activity is directly controlled by its phosphorylation/dephosphorylation status, and provide a mechanistic basis for the modulation of Yen1 activity throughout meiosis. We suggest that the protein is held in an inactive phosphorylated state during DNA replication and meiosis I, favoring JM cleavage by Mus81-Mms4, and is then activated by dephosphorylation as cells undergo the second round of chromosome segregation. These results define a cellular role for Yen1 in meiosis II, in the resolution of persistent JMs prior to segregation, and explain how Yen1 specifically rescues the second meiotic division in mus81Δ mutants.

Slx1-Slx4 Does Not Display HJ Resolvase Activity during Meiosis
In contrast to Mus81-Mms4 and Yen1, the genetic analyses presented in Figure 1 indicate that Slx1-Slx4 plays a very minor role in processing meiotic recombination intermediates. To confirm this, synchronized meiotic cells expressing either Myc-tagged Slx1 or Slx4 were used to carry out immunoaffinity purifications similar to those with Yen1. We were unable to detect Slx1-Slx4 HJ resolvase activity at any stage of meiosis (Figure S3A), whereas resolvase activity was observed in immunoprecipitates from proliferating mitotic cells (Figure S3A, lane cc). These results indicate that Slx1-Slx4 activity might be downregulated in meiosis. We did, however, note that both Slx1 and Slx4 were phosphorylated in meiosis (Figures S3A and S3B), but the consequences of these posttranslational modifications are presently unclear.

Mus81-Mms4 Is Hyperactivated by the Polo-like Kinase Cdc5 at the Onset of Meiosis I
The resolution of interhomolog JMs at the onset of meiosis I depends on the accumulation of Cdc5 (Clyne et al., 2003), but the precise target of Cdc5 in this process remains to be identified. Given that Yen1 and Slx1-Slx4 do not appear to be activated at the time when Cdc5 accumulates (Figure 3A and Figure S3), we determined whether Cdc5 regulates Mus81-Mms4 activity. To do this, Mus81-Mms4, Mms4-myc9 (Figure 4A), or Mus81-myc9 (Figure 5A) were immunoaffinity purified from synchronized meiotic cultures and analyzed for HJ resolution. We observed significant variations in the levels of Mus81-Mms4 activity throughout meiosis. Whereas a basal level of activity was observed at all time points, an increase in activity occurred in meiosis I that was coincident with the expression of Cdc5 (Figure 4A, lanes k and l and Figure 5A, lanes c–e) and the formation of meiosis I spindles (Figures 4A and 4B and Figure 5A).
Control experiments, carried out with cells that carried untagged Mms4 (Figure 4A, lanes a–g) or a Myc-tagged catalytically impaired version of Mus81, Mus81-DD (de los Santos et al., 2003) (Figure S4A), confirmed that the HJ resolution activity measured in these experiments was dependent on Mus81-Mms4. PCLB2-CDC5 cells, depleted of Cdc5, formed meiosis I spindles but failed to activate Mus81-Mms4 (Figure 4A, lanes o–u and Figure 4B) whereas they activated Yen1 after a slight delay (Figure 3A, lanes o–u and Figure 3B). Consistent with the concept that Cdc5 is involved in the hyperactivation of Mus81-Mms4, we found that Mms4-myc9 underwent a posttranslational modification concurrent with Cdc5 expression, as determined by SDS-PAGE (Figure 4A, lanes k and l). Because Cdc5 is involved in the control of several aspects of chromosome segregation and cells lacking meiotic expression of Cdc5 fail to progress beyond meiosis I (Clyne et al., 2003; Lee and Amon, 2003), it was necessary to uncouple defects in meiotic progression from the requirement for Cdc5 in the modification and activation of Mus81-Mms4. This was achieved by meiotic depletion of the APC/C activator Cdc20 (PCLB2-CDC20) (Lee and Amon, 2003), which led to the accumulation of cells in metaphase I but failed to affect Cdc5-dependent phosphorylation and hyperactivation of Mus81-Mms4 (Figure S4B, left panel). In contrast, cells lacking both Cdc20 and Cdc5 failed to modify and hyperactivate Mus81-Mms4, despite accumulating with meiosis I spindles (Figure S4B, right panel).

Figure 2. Mus81-Mms4 and Yen1 Ensure Chromosome Segregation during Both Meiotic Divisions
(A) Immunofluorescence analysis of meiosis in WT and mutant strains expressing PDS1-myc18 (for securin visualization) and URA3-GFP (GFP marks chromosome V at URA3). Left: images taken at different stages of the cell cycle illustrate chromosome segregation patterns. Right: quantification and kinetics of meiotic progression, indicating the percentage of cells with two DNA masses (two nuclei), those that have undergone the second meiotic division (four nuclei), and those undergoing meiosis I (one spindle) or meiosis II (two spindles).
(B) Determination of the frequency of nuclear division during meiosis I and meiosis II, as described in (A). The percentage of cells at each stage that have undergone the first or second meiotic division (two or four DNA masses) are indicated.
See also Figure S1.

Figure 3. Activation of Yen1 by Dephosphorylation at the Onset of Meiosis II
(A) Extracts were prepared from WT, YEN1-myc9, and PCLB2-CDC5 YEN1-myc9 strains at 2 hr intervals after transfer into sporulation medium. Yen1-myc9, Cdc5, and Tub2 were detected by western blotting, and Yen1-myc9 was immunoaffinity purified from each extract and assayed for HJ resolution activity. Asterisks indicate 5′-32P labels.
(B) Quantification of Yen1 HJ resolvase activity relative to the kinetics of meiotic progression as determined from (A).
(C) Kinetics of meiotic progression (spindle morphology and nuclear divisions) and western blot analysis of protein extracts from cells expressing Yen1-myc9. Yen1-myc9 was analyzed by standard or phosphoaffinity (Phos-Tag) SDS-PAGE. Cc: sample from proliferating cells.
(D) Activation of Yen1 by phosphatase treatment. Yen1-myc9 was immunopurified from cells arrested in prophase I using ndt80Δ YEN1-myc9 cells (collected after 6 hr in SPM). Protein fractions were analyzed as in (C) and treated with λ-phosphatase, or inactivated phosphatase, as indicated.
See also Figure S2 and Figure S3.

These data indicate that Cdc5 regulates the activity of Mus81-Mms4 during meiosis I, and that Cdc5-dependent phosphorylation of Mms4 could be important for the disengagement of DNA joint molecules. Recent work has shown that ectopic expression of Cdc5 in ndt80Δ cells (arrested in pachytene of prophase I) is sufficient to promote the resolution of HJs (Sourirajan and Lichten, 2008). We therefore determined whether expression of Cdc5 in ndt80Δ mutants is sufficient to regulate Mus81-Mms4, by generating MMS4-myc9 ndt80Δ mutants in which CDC5 expression was under the control of GAL4.ER (Sourirajan and Lichten, 2008). Addition of β-estradiol results in the specific induction of CDC5, while leaving the remaining genes in the NDT80 regulon off. Remarkably, induction of CDC5 was sufficient to promote the modification of Mms4 and prematurely boost the activity of Mus81-Mms4 during prophase I (Figure 4C, lanes d and e).

To confirm that the Cdc5-mediated hyperactivation of Mus81-Mms4 was a direct consequence of its phosphorylation, fully modified and hyperactive Mus81-Mms4 was prepared from MMS4-myc9 cells synchronized at metaphase I (PCLB2-CDC20). When treated with λ-phosphatase, but not inactivated phosphatase, we observed that increased electrophoretic mobility of Mms4 was linked with a reduction of Mus81-Mms4 nuclease activity (Figure 4D, lane c). Moreover, using cells expressing Myc-tagged Mms4, we found that Cdc5 was present in anti-Myc immunoprecipitates (Figure 4E, lanes h and i). These data demonstrate that Cdc5-dependent phosphorylation hyperactivates Mus81-Mms4 during meiosis I, and that Cdc5 promotes JM resolution during meiosis through the phosphorylation and activation of Mus81-Mms4.

Figure 4. Regulation of Mus81-Mms4 Activity by Cdc5-Dependent Phosphorylation
(A) Upper panels: extracts were prepared from meiotic WT, MMS4-myc9, and PCLB2-CDC5 MMS4-myc9 cells and proteins were detected by western blotting. Lower panel: Mms4-myc9 was immunoaffinity purified and assayed for HJ resolution activity.
(B) Quantification of Mus81-Mms4 activity relative to the kinetics of meiotic progression as determined from (A).
(C) Cdc5 expression in prophase I-arrested cells is sufficient to promote the phosphorylation and activation of Mus81-Mms4. Extracts from ndt80Δ cells expressing Cdc5-ha3 from an estradiol-inducible GAL1 promoter (after 5 hr) were analyzed for the presence of the indicated proteins by western blotting, and immunoaffinity purified Mms4-myc9 was assayed for HJ resolvase activity.
(D) Inactivation of Mus81-Mms4 by dephosphorylation. Mms4-myc9 was immunopurified from cells arrested in metaphase I (PCLB2-CDC20 MMS4-myc9 cells, collected after 8 hr in SPM), treated with λ-phosphatase as indicated, and assayed for HJ resolution activity.
(E) Association of Mms4 with Cdc5. Extracts or anti-Myc immunoprecipitates prepared from PCLB2-CDC20 MMS4-myc9 cells were western blotted for Mms4-myc9 or Cdc5 as indicated.
See also Figure S4.
C
A
MMS4-WT YEN1
MUS81-myc9 MMS4-WT-ha3
Meta I
mms4-14A-ha3
0 2 4 5 6 8 10
Ana I
Meta II
Ana II DNA Homozygous URA3-GFP Tubulin
0 2 4 5 6 8 10 Hours in SPM
Extract
Mus81-myc9 Mms4-ha3
Pds1-myc18
Cdc5
α-Myc IP
83 / 17 Mus81-myc9
Meta I
Ana I
Meta II
Ana II DNA Homozygous URA3-GFP Tubulin
*
Resolution assay on α-Myc IPs
Nuclear division Yes / No (%)
mms4-14A YEN1
Mms4-ha3
Pds1-myc18
* 15 / 85 a b c d e
f g
h i
j k l m n
80
MMS4-WT-ha3 MMS4 WT T ha3
80
60
60
40
40
20
20
0
0 0
2
4
6
8
10
0
84 / 16
Ana I
Meta II
DNA Homozygous URA3-GFP Tubulin
Pds1-myc18 2
4
6
8
10
7 / 93
E
Control
Mms4
0 2 4 6 8 9 10 11 12
P
Mus81 binding
175
1
P P
691 aa
350
MMS4-WT-ha3
P PP PP PP P
Cdc5-GFP Mms4-ha3
dHJ-JM Cdc5-GFP
DNA
Ascus
DNA
mms4-14A-ha3
0 2 4 6 8 9 10 11 12
Ascus
8 9 10 11 12 Hours in SPM mc-JM
mms4Δ
mms4-14A yen1Δ
Estradiol
8 9 10 11 12 Hours in SPM dHJ-JM
0 2 4 6 8 9 10 11 12
MMS4-WT yen1Δ
Nuclear division Yes / No (%)
mc-JM
Putative Cdk site
D
18 / 82
ndt80∆ PGAL-CDC5-GFP
B
P PP P P
Ana II
mms4 14A ha3 mms4-14A-ha3
MI spindle MII spindle Resolution activity MI division
PP P P
Nuclear division Yes / No (%)
mms4-14A yen1∆ Meta I
% cells or % HJ cleavage
91 / 9
8 9 10 11 12 Hours in SPM mc-JM dHJ-JM Cdc5-GFP Mms4-ha3
Figure 5. Hyperactivation of Mus81-Mms4 Ensures Timely JM Resolution and Promotes Chromosome Segregation at Meiosis I (A) Analysis of Mus81-Mms4 activity from MUS81-myc9 cells expressing either MMS4-WT-ha3 or mms4-14A-ha3, as described for Figure 4. (B) Phosphorylation map of Mms4. Phosphorylated residues identified by mass spectrometry (red) and predicted CDK consensus sites (blue) are indicated. (C) Immunofluorescence analysis of meiosis in MMS4-WT YEN1, mms4-14A YEN1, and mms4-14A yen1D, as for Figure 2. (D) Representative asci from the indicated strains. (E) Cdc5 activates Mus81-Mms4 to promote JM resolution. Physical analysis of recombination at the HIS4LEU2 locus in ndt80D cells expressing CDC5-GFP from an estradiol-inducible promoter. Southern analysis of psoralen-crosslinked DNA prepared from meiotic time courses of MMS4, mms4D, or mms4-14A strains. The dynamics of JM accumulation and resolution, in the presence or absence of estradiol after 7 hr in SPM, are shown. Protein extracts were analyzed for the presence of the indicated proteins. mc-JM: multichromatid joint molecules; dHJ-JM: double Holliday junction joint molecules. See also Figure S5, Figure S6, and Table S1.
Cell 147, 158–172, September 30, 2011 © 2011 Elsevier Inc.
[Figure 6, panels A–D. Recoverable labels: time courses of MMS4-myc9 (A) and YEN1-myc9 (B) cells sampled 0–180 min after release, with asynchronous (As) controls; western blots (Mms4-myc9 or Yen1-myc9, Clb1, Cdc5); "Resolution assay on α-Myc IPs"; phosphatase and phosphatase-dead treatments (30°C, 15 min) of Mms4-myc9 from G2/M phase cells 60 min after release (C) and of Yen1-myc9 from S phase cells 25 min after release (D).]
Figure 6. Cell-Cycle Regulation of Mus81-Mms4 and Yen1 Activity in Mitotic Yeast
(A) Cells expressing Mms4-myc9 were synchronized by α-factor arrest/release, and extracts were analyzed by western blotting for the indicated proteins. Mms4-myc9 was immunoaffinity purified from extracts and assayed for HJ resolution activity. As, sample from asynchronous proliferating cells. (B) As in (A), but using cells expressing Yen1-myc9 instead of Mms4-myc9. (C) Inactivation of Mus81-Mms4 by dephosphorylation. Mms4-myc9 was immunoaffinity purified from G2/M cells collected 60 min after α-factor release. Proteins were treated with λ-phosphatase as indicated and assayed for HJ resolution activity. (D) Activation of Yen1 by dephosphorylation. Yen1-myc9 was immunoprecipitated from S phase cells collected 25 min after α-factor release. Proteins were analyzed as in (C). See also Figure S7.
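The HJ resolution activity reported in these assays is defined in the Experimental Procedures as the fraction of nicked duplex DNA product relative to the sum of the intact substrate and the resolution product. A minimal Python sketch of that calculation follows. It assumes that the two inputs are background-subtracted band intensities from one gel lane and that the result is expressed as a percentage; the function and parameter names are hypothetical.

```python
def resolution_activity(product_signal: float, substrate_signal: float) -> float:
    """Percent of Holliday junction substrate converted to nicked duplex product.

    Both arguments are band intensities from the same lane (assumed to be
    background-subtracted before they are passed in).
    """
    if product_signal < 0 or substrate_signal < 0:
        raise ValueError("band intensities must be non-negative")
    total = product_signal + substrate_signal
    if total == 0:
        raise ValueError("lane contains no signal")
    return 100.0 * product_signal / total


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    print(resolution_activity(product_signal=30.0, substrate_signal=70.0))
```

Normalizing to the total signal in each lane makes the value independent of how much labeled DNA was loaded, which is why lanes can be compared across a time course.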
Analysis of Phosphorylation-Defective Mutants of Mus81-Mms4

To identify the sites of phosphorylation on Mms4, we wished to immunoaffinity purify Mus81-Mms4 from large-scale cell cultures for mass spectrometry (MS). Parallel experiments, described later in this work, indicated that related Mus81-Mms4 phosphorylation events also take place in mitotic G2/M phase cells (Figure 6A and Figure S4C), allowing us to prepare large amounts of Mus81-Mms4 from synchronized mitotic cultures blocked at G2/M using the microtubule-depolymerizing drug benomyl.
MS analysis of purified Mms4 led to the identification of 12 phosphorylated residues. Although interactions with Mus81 occur at the C-terminal region of Mms4 (Fu and Xiao, 2003), the phosphorylated residues were all located toward the N-terminal half of Mms4 (Figure 5B, detailed in Table S1). Further sites were indicated by sequence prediction, as the preferred binding sequence for the Polo-box domain is serine-phosphoserine-proline (S-pS-P), which overlaps with CDK consensus sites (T/S-P-X-K/R, or the minimal consensus T/S-P) (Elia et al., 2003). The MMS4 sequence also contains five putative CDK-priming sites
(Table S1), one of which (S56) has a perfect Cdc5-binding motif and was phosphorylated in vivo as determined by MS analysis. Due to incomplete peptide coverage, we were unable to confirm whether the other four predicted CDK sites were phosphorylated.

To determine the effects of Mms4 phosphorylation, we generated an MMS4 allele, mms4-14A-ha3, in which 14 of the predicted/identified sites were mutated to alanine (Table S1). When Mus81-Mms4-14A was isolated from synchronized meiotic cultures, we observed basal levels of HJ resolvase activity that failed to be hyperactivated upon Cdc5 expression, consistent with reduced Cdc5-dependent phosphorylation (Figure 5A, lanes j–l). Additional control experiments confirmed that Mms4-14A was specifically defective in phosphorylation-dependent hyperactivation: (1) comparison of protein levels showed that Mms4-14A was expressed at similar levels to wild-type Mms4 (Figure 5A, Figure S4D, and Figure S5E); (2) immunoaffinity purification of Mus81-myc9 revealed normal associations with Mms4-14A (Figure 5A); (3) the nuclear localization of Mus81 throughout meiosis was normal in mms4-14A mutants (Figure S5A); (4) chromatin-associated nuclear foci that form during meiosis were present in cells expressing wild-type or phospho mutant Mms4-14A (Figure S5B); and (5) mms4-14A was able to partially suppress the DNA repair defects observed for mms4Δ mutants in response to DNA-damaging drugs such as methyl methanesulfonate (Figure S4E). Taken together, these data show that, in mms4-14A mutants, Mus81-Mms4 fails to become hyperactivated in meiosis I in response to Cdc5 accumulation and meiotic cell-cycle progression.

Joint Molecule Resolution and Chromosome Segregation Depend on Cdc5-Mediated Hyperactivation of Mus81-Mms4

Our data suggest a model in which Cdc5-mediated phosphorylation and hyperactivation of Mus81-Mms4 play a key role in coordinating the timely completion of meiotic recombination with chromosome segregation.
Consistent with this, cells expressing mms4-14A exhibited severe defects in chromosome segregation similar to those observed with mus81Δ (Figure 5C). We also found that the majority (82%) of mms4-14A yen1Δ double mutants failed to segregate their chromosomes during meiosis II, showing that this event is again largely dependent on the integrity of YEN1 (Figure 5C). Similarly, the efficiency of spore formation and the viability of mms4-14A and mms4-14A yen1Δ double mutants were comparable to those observed with mms4Δ and mms4Δ yen1Δ mutants, respectively (Figure 5D and Figures S5C and S5D). Finally, we found that Mms4-14A, despite being expressed at levels similar to Mms4-WT (Figure S5E), was unable to rescue the defect in mms4Δ yen1Δ mutants (Figure S5C).

To directly determine the effect of Cdc5-mediated hyperactivation of Mus81-Mms4 on JM resolution and CO formation, we monitored these events upon induction of Cdc5 expression in ndt80Δ cells carrying MMS4-WT, mms4Δ, or mms4-14A (Hunter and Kleckner, 2001; Sourirajan and Lichten, 2008). Physical analysis of recombination at the HIS4LEU2 locus showed that Cdc5 expression triggered the elimination of JMs and promoted the formation of CO recombinants in MMS4-WT but not mms4Δ cells (Figure 5E and Figure S6). Furthermore, mms4Δ mutants accumulated aberrant multichromatid JMs (mc-JMs). With mms4-14A cells, we observed a reproducible delay in the kinetics of JM elimination without any accumulation of mc-JMs (Figure 5E and Figure S6F). These results show that Cdc5-mediated activation of Mus81-Mms4 is required for the timely resolution of JMs, consistent with previous proposals (Sourirajan and Lichten, 2008), and also that Mus81-Mms4 operates independently of Cdc5 in the elimination of aberrant mc-JMs during prophase I.

Mitotic Regulation of Mus81-Mms4 and Yen1 Activity

To determine whether Mus81-Mms4 and Yen1 undergo analogous regulatory mechanisms during mitosis, anti-Myc immunoaffinity purifications of each protein were carried out using synchronized mitotic yeast carrying Mms4-myc9 or Yen1-myc9 after release from α-factor arrest (G1/S). We found that both enzymes were tightly regulated throughout the mitotic cell cycle (Figures 6A and 6B). The nuclease activities of Mus81-Mms4 and Yen1 were low during S phase (15–30 min after release, see also FACS profile in Figure S7). As cells accumulated Cdc5 and the M phase cyclin Clb1, the activity of Mus81-Mms4 increased sharply (Figure 6A, 30–45 min). The peak of Mus81-Mms4 activity correlated with the peak of Cdc5 expression and the initiation of Clb1 proteolysis, which marks entry into anaphase I (60 min). A decline in activity was then observed as cells exited the first cell cycle and re-entered S phase (75–90 min). In the case of Yen1, a similar activity profile was observed, although the sharpness of the cycle was even greater than that of Mus81-Mms4 (Figure 6B). Activation of Yen1 occurred at a slightly later stage of mitosis, and was coordinated with cyclin degradation and entry into anaphase I (Figure 6B, 60 min). These results indicate that Mus81-Mms4 is activated in response to M phase entry, and that Yen1 is activated later as cells initiate anaphase.
Western blot analysis of the electrophoretic mobility of Mms4 and Yen1 revealed a striking correlation between activity levels and posttranslational modification (Figures 6A and 6B). In the case of Mus81-Mms4, nuclease activation correlated with increased Mms4 phosphorylation (Figure 6A). Again, we found that phosphatase treatment of Mus81-Mms4 isolated from cells at the peak of activation during M phase (60 min) resulted in increased electrophoretic mobility and inactivation of the nuclease activity (Figure 6C). In the case of Yen1, nuclease inactivation correlated with its phosphorylation, as seen by the slightly reduced electrophoretic mobility of Yen1 during S phase (Figure 6B). Importantly, λ-phosphatase treatment of inactive Yen1 isolated from S phase cells resulted in a dramatic increase in activity (Figure 6D). These data demonstrate that the activities of Mus81-Mms4 and Yen1 are tightly regulated throughout the cell cycle in proliferating cells, and that the mechanism of regulation appears similar to that seen during meiosis.

Regulation of MUS81-EME1 and GEN1 in Human Cells

Finally, we explored the regulatory control of the human orthologs of Mus81-Mms4 and Yen1. To do this, HeLa cells were generated that expressed MUS81 or GEN1 at endogenous levels from a bacterial artificial chromosome (BAC). Both proteins carried C-terminal FLAP tag fusions that allowed us to GFP-affinity purify
the proteins. Immunoaffinity purified MUS81-EME1 and GEN1 were prepared from asynchronous (As) cells or from cultures blocked at various stages of the cell cycle using thymidine (Thy; G1/S phase arrest), camptothecin (CPT; S/G2 arrest), or nocodazole (NOC; prometaphase arrest). The proteins were then analyzed for their ability to cleave HJs in vitro, and again we found clear evidence for cell-cycle regulation. Whereas little MUS81-EME1 activity was seen in G1/S- or S/G2-arrested cells, a significant increase in HJ resolution activity was observed at prometaphase (Figure 7A, lane e).

The role that the Polo-like kinase Cdc5 plays in the hyperactivation of Mus81-Mms4 in yeast prompted us to determine whether human Polo-like kinase PLK1 might be involved in the M phase activation of MUS81-EME1. We found that activation of MUS81-EME1 was coincident with reduced mobility of both MUS81 and EME1, and that PLK1 kinase was present in the MUS81-FLAP pull-downs from nocodazole-treated cells (Figure 7A, lane e). These results indicate that Polo kinase-mediated phosphorylation is likely to regulate the activity of MUS81-EME1, as observed in yeast.

In contrast to MUS81-EME1, we did not observe such tight regulation of GEN1 activity, although again maximal activity was observed at M phase (Figure 7B, lane e). However, analysis of the subcellular localization of GEN1 by immunofluorescence microscopy (visualization of the GFP-tag) revealed regulation by a second level of control (Figure 7C). In this case, GEN1 was predominantly found in the cytoplasm, except in cells that had a mitotic spindle (marked with α-tubulin), in which case it was distributed throughout the entire cell. These results suggest that GEN1 protein is controlled by two levels of regulation, first by cell-cycle-mediated changes to its activity, and second by preventing access of the nuclease to DNA until breakdown of the nuclear envelope.
DISCUSSION

In this work, we have uncovered a remarkable regulatory system that directs the outcome of joint molecule resolution, by timing the actions of crossover-promoting nucleases according to cellular needs. Moreover, we show that the completion and outcome of homologous recombination are precisely coordinated with cell-cycle progression, by mechanisms that ensure the specialized chromosome segregation programs
of meiosis and mitosis. In meiotic cells, cell-cycle-regulated phosphorylation events control the enzymatic activities of Mus81-Mms4 and Yen1, which link the completion of recombination to CO generation and chromosome segregation. In mitotic cells, a similar regulatory network produces cycles of inactivation/activation that restrain the nuclease activities, biasing the engagement of JMs toward NCO-promoting pathways, and then releasing them to ensure that persistent intermediates are processed in a timely fashion for chromosome segregation.

Mus81-Mms4 Resolves Joint Molecules, whereas Yen1 Safeguards Chromosome Segregation

During meiosis, the processing of recombination intermediates by endonucleolytic resolution provides two essential functions. First, HJ resolvases sever the physical connections between chromosomes that would otherwise impede chromosome segregation. Second, the conversion of JMs to COs facilitates the bipolar segregation of homologous chromosomes. Therefore, segregation in general, and bipolar segregation in particular, are linked to the efficiency and outcome of JM resolution. These are known to be critical cellular events because resolution defects at meiosis I could contribute to the cosegregation of homologous chromosomes in human oocytes, which is directly associated with pregnancy loss and developmental disabilities (Hassold and Hunt, 2001).

Our study indicates that in yeast Mus81-Mms4 plays a leading role in the resolution of meiotic JMs and thereby promotes the timely segregation of homologs in meiosis I. mus81Δ and mms4Δ mutants undergo meiosis and generate spores, but show a small delay and reduction in CO formation, and low spore viability (de los Santos et al., 2001, 2003). Our analysis revealed that Mus81-Mms4 is required for JM resolution at meiosis I, and that in its absence a single round of chromosome segregation occurs at meiosis II.
Importantly, Yen1 is essential for the single chromosome segregation event in mus81Δ mutants, despite being dispensable for meiosis in otherwise wild-type cells. Yen1 therefore provides a safeguard activity that deals with JMs that have escaped the attention of Mus81-Mms4. As a consequence, the phenotype of mus81Δ is modest in comparison with fission yeast that lacks Yen1 and relies almost entirely on Mus81-Eme1 for JM resolution and CO formation (Boddy et al., 2001; Osman et al., 2003).
Figure 7. Cell-Cycle Regulation of MUS81-EME1 and GEN1 Activity in Human Cells
(A) Extracts were prepared from HeLa cells expressing MUS81-FLAP (CLJM6) after treatment with thymidine (Thy), camptothecin (CPT), or nocodazole (NOC), and analyzed by western blotting for the indicated proteins. MUS81-FLAP was affinity-purified from each sample and assayed for HJ resolution activity. Control samples were prepared from control asynchronous cells (As) or untransfected HeLa cells (U). (B) As in (A), except that the HeLa cells expressed GEN1-FLAP (CLJM4). Extracts and affinity-purified GEN1-FLAP were assayed by western blotting and for HJ resolution activity. (C) Immunofluorescence analysis of asynchronous HeLa cells expressing GEN1-FLAP. DNA was visualized by DAPI staining, and α-tubulin staining marks mitotic cells by decorating the mitotic spindle. (D) Model for the timing and control of HJ resolution. In meiosis, two waves of Holliday junction resolution ensure the elimination of recombination intermediates and promote the segregation of chromosomes at both meiotic divisions. Mus81-Mms4 activity is kept low until the onset of meiosis I, when Cdc5 expression is induced. Cdc5 binds and phosphorylates Mms4, hyperactivating Mus81-Mms4. At this time, Yen1 is inactive, but becomes activated by dephosphorylation at meiosis II, where it acts as a safeguard that ensures chromosome segregation. In mitosis, the activities of Mus81-Mms4 and Yen1, and their human homologs MUS81-EME1 and GEN1, are enhanced as cells enter M phase of the cell cycle. Prior to this time, noncrossover-promoting pathways of junction dissolution are likely to be dominant. The regulation of the Mus81 and Yen1/GEN1 pathways occurs through cycles of phosphorylation/dephosphorylation similar to that of meiotic cells.
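The meiotic part of the model in panel D has a simple logical core: phosphorylation switches Mus81-Mms4 on, whereas it keeps Yen1 off. The Python sketch below encodes only that logic, as a reading aid. It is a deliberate simplification: activity is reduced to a boolean (so the basal Mus81-Mms4 activity described in the text counts as "not hyperactivated"), the phosphorylation state of Mms4 at meiosis II is left out because the caption does not state it, and all class, constant, and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    PROPHASE_I = "prophase I"
    MEIOSIS_I = "meiosis I"
    MEIOSIS_II = "meiosis II"


@dataclass(frozen=True)
class Nuclease:
    name: str
    active_when_phosphorylated: bool

    def is_active(self, phosphorylated: bool) -> bool:
        # High activity only when the phosphorylation state is the activating one.
        return phosphorylated == self.active_when_phosphorylated


MUS81_MMS4 = Nuclease("Mus81-Mms4", active_when_phosphorylated=True)
YEN1 = Nuclease("Yen1", active_when_phosphorylated=False)

# Regulatory phosphorylation state per stage, as described for the model.
# Mms4 at meiosis II is intentionally absent (not stated in the caption).
PHOSPHORYLATED = {
    (MUS81_MMS4.name, Stage.PROPHASE_I): False,  # Cdc5 not yet expressed
    (MUS81_MMS4.name, Stage.MEIOSIS_I): True,    # Cdc5 phosphorylates Mms4
    (YEN1.name, Stage.PROPHASE_I): True,         # phosphoinhibited
    (YEN1.name, Stage.MEIOSIS_I): True,          # still phosphoinhibited
    (YEN1.name, Stage.MEIOSIS_II): False,        # dephosphorylated
}


def active_nucleases(stage: Stage) -> list[str]:
    """Names of nucleases whose known state at this stage is the active one."""
    active = []
    for nuclease in (MUS81_MMS4, YEN1):
        state = PHOSPHORYLATED.get((nuclease.name, stage))
        if state is not None and nuclease.is_active(state):
            active.append(nuclease.name)
    return active


if __name__ == "__main__":
    for stage in Stage:
        print(stage.value, active_nucleases(stage))
```

Written this way, the two waves of resolution in the model appear as two stages with different active nucleases, even though both enzymes are controlled by the same type of modification.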
Timing and Control of JM Resolution by Cell-Cycle-Regulated Phosphorylation

Chromosome segregation analyses revealed the existence of a degree of redundancy between Mus81-Mms4 and Yen1, but also highlighted a functional separation of the two JM resolution pathways in time. Consistent with our genetic data, we found that the activity of Mus81-Mms4 peaks at meiosis I, whereas Yen1 is activated to process HJs as cells undergo meiosis II (see model, Figure 7D).

Previous work had established that dHJ accumulation/resolution is coordinated with cell-cycle progression through the actions of Cdc5 (Clyne et al., 2003; Sourirajan and Lichten, 2008). The present work extends this model, by demonstrating that Mus81-Mms4 is a direct target of Cdc5 in JM processing. Physical analysis of recombination showed that Mus81-Mms4 was required for Cdc5-mediated JM resolution. Furthermore, the analysis of mms4 mutants defective for Cdc5-mediated phosphorylation confirmed that modification of Mms4 was important for the efficient disengagement of JMs. These results explain the chromosome segregation defects observed in mms4-14A cells and underscore the importance of the precise coupling of JM resolution with cell-cycle progression. However, Mms4-14A was still capable of supporting JM resolution, albeit with a delay, raising the possibility that either Mms4-14A is still partially activated by Cdc5, or that phosphorylation of a second Cdc5 target may facilitate JM resolution in the presence of a basal level of Mus81-Mms4 activity. Cdc5 is known to play a role in SC disassembly (Sourirajan and Lichten, 2008) and components of the SC have been proposed to counteract Sgs1-mediated dHJ dissolution (Jessop et al., 2006; Oh et al., 2007; Rockmill et al., 2003). It is therefore possible that Cdc5-mediated SC disassembly may facilitate JM resolution by Mus81-Mms4.

In contrast to Mus81-Mms4, the phosphorylation of Yen1 keeps its activity under tight control by phosphorylation-dependent inhibition.
Yen1 phosphoinhibition is initiated during premeiotic S phase and maintained until the onset of meiosis II, at which time dephosphorylation alleviates inhibition. Although the kinase responsible for Yen1 phosphorylation has yet to be identified, the phosphorylation profile observed would be compatible with several kinases involved in promoting S phase and controlling recombination (e.g., CDK, DDK). Previous studies proposed that Yen1 is a target of Clb5-CDK (Loog and Morgan, 2005), and its sequence encodes a multitude of predicted CDK consensus sites. Moreover, it has been shown that phosphorylated Yen1 is predominantly cytoplasmic in S phase, whereas in G1 or upon CDK downregulation it localizes to the nucleus (Kosugi et al., 2009). Most likely, phosphorylation-dependent subcellular relocalization provides a second level of regulatory control, similar to that observed with GEN1 in human mitotic cells.

Our studies indicate that in order to ensure that all JMs are resolved in time for chromosome segregation, the activities of Mus81-Mms4 and Yen1 are sequentially elevated. Therefore, if JMs escape the attention of Mus81-Mms4, a safeguard activity is in place. This back-up solution may have evolved to compensate for the inability of cells to delay cell-cycle progression once JMs mature into dHJs, suggesting that Yen1 acts as a checkpoint substitute and that, like most checkpoint proteins, it only becomes important when abnormal levels of stress are introduced into the
system. To our knowledge, our study also provides the first demonstration of the direct regulation of the activity of a structure-specific nuclease. It is remarkable that the same type of posttranslational modification regulates the activity of both nucleases, but whereas phosphorylation activates Mus81-Mms4, it inhibits Yen1 activity. How posttranslational modification controls the biochemical activities of these nucleases remains to be determined.

Resolvase Regulation and Suppression of Crossover Formation in Mitosis

In contrast to meiosis, the processing of dHJs in mitosis is generally associated with formation of NCO products. In budding yeast, a complex of Sgs1-Top3-Rmi1 promotes dHJ dissolution reactions that exclusively form noncrossover products (Gangloff et al., 1994; Ira et al., 2003). Similar reactions occur in human cells, where BLM-TopoIIIα-RMI1-RMI2 (the BTR complex) combine to ensure that dHJs that link sister chromatids are dissolved into noncrossovers, as indicated by the elevated sister chromatid exchange (SCE) phenotype that is characteristic of Bloom’s Syndrome (BS) cells (Chaganti et al., 1974; Wu and Hickson, 2003). Recently, we showed that the high frequency of SCE formation in BS cells, which are mutated for BLM, could be reduced by downregulation of MUS81 and GEN1 (Wechsler et al., 2011). These results indicated that the BTR complex plays a primary role in dHJ processing, and that the high frequency of SCE formation results from the actions of MUS81 and GEN1, which resolve the JMs that persist in BS cells into either crossovers or noncrossovers.

The results described in this work provide insights into the interplay between HJ dissolution and resolution in mitotic cells. First, in yeast, we found that the activities of Mus81-Mms4 and Yen1 are low in S phase, at the time when JMs are formed, and that similar regulatory events occur in human cells.
Thus, JM dissolution pathways are likely to have the upper hand in S phase cells, at least until Mus81-Mms4 (MUS81-EME1) is hyperactivated at G2/M, and Yen1 (GEN1) is activated and gains access to the DNA as cells enter mitosis (Figure 7D). One important prediction of this model is that the processing of a DSB generated at G2/M should have a lower bias toward NCO formation, as a consequence of activation of the nuclease pathways. Consistent with this prediction, increased CO formation is observed when DSBs are induced in nocodazole-treated cells (Ira et al., 2003). From these results, we propose that a primary function of Mus81 and Yen1 in dissolution-proficient cells is to capture and resolve any JM intermediates that have eluded the NCO-promoting pathways and persist until mitosis.

As detailed for meiotic and mitotic yeast, the MUS81-EME1 and GEN1 pathways are regulated by cycles of phosphorylation/dephosphorylation. Our results also indicate that PLK1 may play a key role in regulating MUS81-EME1 activity in a manner similar to that shown for Cdc5 in yeast. This is an interesting observation because PLK1 is overexpressed in many cancers and overexpression correlates with a poor prognosis and lower overall survival rate (Schöffski, 2009). One possibility is that the premature activation of MUS81-EME1 could result in an increased frequency of COs and loss of heterozygosity in these cancers.
EXPERIMENTAL PROCEDURES

All experimental procedures are described in detail in the Extended Experimental Procedures.

Yeast Strains and Cultures

All strains were derivatives of haploid BY4741 or diploid SK1, as detailed for each experiment in Table S2. Meiotic time courses and immunofluorescence microscopy of fixed cells were performed as described (Petronczki et al., 2006). Cells were grown for 11 hr (30°C), washed with sporulation medium (SPM, 2% potassium acetate) and inoculated into SPM to OD600 3.5. This time point was defined as t = 0 in all meiotic experiments. For synchronous release of mitotic cultures, BY4741 MATa derivatives were grown exponentially in YPD (OD600 0.3) at 30°C and synchronized by addition of α-factor (final concentration 3 µM). After 2 hr (>95% unbudded cells), cells were harvested, washed once in YPD, and released in one half volume of YPD (t = 0 hr).

BAC-Mediated Protein Expression in HeLa Cells

Modified BACs containing GEN1-FLAP and MUS81-FLAP were transfected into HeLa cell lines and selected for stable integration and endogenous levels of expression (Poser et al., 2008).

Protein Assays

Meiotic and mitotic cellular lysates and immunoaffinity-purified proteins were analyzed by SDS-PAGE through standard or phosphoaffinity gels. For nuclease assays, tagged proteins were immunoaffinity purified from yeast or human cell extracts using anti-Myc or GFP-Trap (Chromotek) beads and washed extensively. The beads (approximate volume 10 µl) were then mixed with 10 µl cleavage buffer (50 mM Tris-HCl, pH 7.5, 5 mM MgCl2, or 2.5 mM for Yen1) and 1 nM 5′-32P-end-labeled synthetic Holliday junction X0 or X26 DNA (Ip et al., 2008). After 30 min incubation (or 1 hr for Mms4) at 30°C with gentle rotation, reactions were stopped by addition of 2.5 µl of 10 mg/ml proteinase K and 2% SDS, followed by incubation for 45 min at 37°C.
Loading buffer (3 µl) was then added and radiolabeled products were separated by 10% native PAGE, dried on Whatman paper, and analyzed either by autoradiography with processing in ImageJ software, or by phosphorimaging using a Typhoon scanner and ImageQuant software. Resolution activity was calculated by determining the fraction of nicked duplex DNA product relative to the sum of the intact substrate and resolution product.

Dephosphorylation Assays

Myc-fusion proteins were affinity purified using anti-Myc agarose beads and washed extensively with buffer R (40 mM Tris, pH 7.5, 150 mM NaCl, 10% glycerol, 0.1% NP40). The beads were then incubated with or without λ-phosphatase (NEB), or λ-phosphatase inactivated by heating to 95°C for 10 min. Protein samples were then washed extensively with cleavage buffer and analyzed by western blotting and in Holliday junction resolution assays.

Physical Analysis of Recombination at the HIS4LEU2 Locus

DNA physical assays were performed essentially as described (Hunter and Kleckner, 2001; Kim et al., 2010).

SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures, seven figures, and two tables and can be found with this article online at doi:10.1016/j.cell.2011.08.032.

ACKNOWLEDGMENTS

We thank Wolfgang Zachariae and Nancy Kleckner for strains and plasmids, Tony Hyman for the BAC tagging cassettes, Keun Kim for help with the physical analysis of recombination, and members of our laboratory for comments and criticisms. This work was supported by Cancer Research UK, the European Research Council, the Louis-Jeantet Foundation, the Swiss Bridge Foundation, and the Breast Cancer Campaign. J.M. was a recipient of a Human Frontiers Science Program long-term Fellowship, and M.G.B. was supported by the Angeles Alvariño program of the Xunta de Galicia (Spain).

Received: February 24, 2011
Revised: June 8, 2011
Accepted: August 5, 2011
Published: September 29, 2011

REFERENCES

Allers, T., and Lichten, M. (2001). Differential timing and control of noncrossover and crossover recombination during meiosis. Cell 106, 47–57.
Blanco, M.G., Matos, J., Rass, U., Ip, S.C.Y., and West, S.C. (2010). Functional overlap between the structure-specific nucleases Yen1 and Mus81-Mms4 for DNA-damage repair in S. cerevisiae. DNA Repair (Amst.) 9, 394–402.
Boddy, M.N., Gaillard, P.H.L., McDonald, W.H., Shanahan, P., Yates, J.R., 3rd, and Russell, P. (2001). Mus81-Eme1 are essential components of a Holliday junction resolvase. Cell 107, 537–548.
Chaganti, R.S., Schonberg, S., and German, J. (1974). A manyfold increase in sister chromatid exchanges in Bloom’s syndrome lymphocytes. Proc. Natl. Acad. Sci. USA 71, 4508–4512.
Chu, S., DeRisi, J., Eisen, M., Mulholland, J., Botstein, D., Brown, P.O., and Herskowitz, I. (1998). The transcriptional program of sporulation in budding yeast. Science 282, 699–705.
Clyne, R.K., Katis, V.L., Jessop, L., Benjamin, K.R., Herskowitz, I., Lichten, M., and Nasmyth, K. (2003). Polo-like kinase Cdc5 promotes chiasmata formation and cosegregation of sister centromeres at meiosis I. Nat. Cell Biol. 5, 480–485.
de los Santos, T., Hunter, N., Lee, C., Larkin, B., Loidl, J., and Hollingsworth, N.M. (2003). The Mus81/Mms4 endonuclease acts independently of double-Holliday junction resolution to promote a distinct subset of crossovers during meiosis in budding yeast. Genetics 164, 81–94.
de los Santos, T., Loidl, J., Larkin, B., and Hollingsworth, N.M. (2001). A role for MMS4 in the processing of recombination intermediates during meiosis in Saccharomyces cerevisiae. Genetics 159, 1511–1525.
Elia, A.E., Rellos, P., Haire, L.F., Chao, J.W., Ivins, F.J., Hoepker, K., Mohammad, D., Cantley, L.C., Smerdon, S.J., and Yaffe, M.B. (2003). The molecular basis for phosphodependent substrate targeting and regulation of Plks by the Polo-box domain. Cell 115, 83–95.
Fricke, W.M., and Brill, S.J. (2003). Slx1-Slx4 is a second structure-specific endonuclease functionally redundant with Sgs1-Top3. Genes Dev. 17, 1768–1778.
Fu, Y., and Xiao, W. (2003). Functional domains required for the Saccharomyces cerevisiae Mus81-Mms4 endonuclease complex formation and nuclear localization. DNA Repair (Amst.) 2, 1435–1447.
Gangloff, S., McDonald, J.P., Bendixen, C., Arthur, L., and Rothstein, R. (1994). The yeast type I topoisomerase Top3 interacts with Sgs1, a DNA helicase homolog: a potential eukaryotic reverse gyrase. Mol. Cell. Biol. 14, 8391–8398.
Haber, J.E., and Heyer, W.D. (2001). The fuss about Mus81. Cell 107, 551–554.
Hassold, T., and Hunt, P. (2001). To err (meiotically) is human: the genesis of human aneuploidy. Nat. Rev. Genet. 2, 280–291.
Ho, C.K., Mazón, G., Lam, A.F., and Symington, L.S. (2010). Mus81 and Yen1 promote reciprocal exchange during mitotic recombination to maintain genome integrity in budding yeast. Mol. Cell 40, 988–1000.
Hunter, N., and Kleckner, N. (2001). The single-end invasion: an asymmetric intermediate at the double-strand break to double-Holliday junction transition of meiotic recombination. Cell 106, 59–70.
Ip, S.C.Y., Rass, U., Blanco, M.G., Flynn, H.R., Skehel, J.M., and West, S.C. (2008). Identification of Holliday junction resolvases from humans and yeast. Nature 456, 357–361.
Ira, G., Malkova, A., Liberi, G., Foiani, M., and Haber, J.E. (2003). Srs2 and Sgs1-Top3 suppress crossovers during double-strand break repair in yeast. Cell 115, 401–411.
Jessop, L., Rockmill, B., Roeder, G.S., and Lichten, M. (2006). Meiotic chromosome synapsis-promoting proteins antagonize the anti-crossover activity of sgs1. PLoS Genet. 2, e155.
Kaliraman, V., Mullen, J.R., Fricke, W.M., Bastin-Shanower, S.A., and Brill, S.J. (2001). Functional overlap between Sgs1-Top3 and the Mms4-Mus81 endonuclease. Genes Dev. 15, 2730–2740.
Keeney, S., Giroux, C.N., and Kleckner, N. (1997). Meiosis-specific DNA double-strand breaks are catalyzed by Spo11, a member of a widely conserved protein family. Cell 88, 375–384.
Kim, K.P., Weiner, B.M., Zhang, L.R., Jordan, A., Dekker, J., and Kleckner, N. (2010). Sister cohesion and structural axis components mediate homolog bias of meiotic recombination. Cell 143, 924–937.
Kosugi, S., Hasebe, M., Tomita, M., and Yanagawa, H. (2009). Systematic identification of cell cycle-dependent yeast nucleocytoplasmic shuttling proteins by prediction of composite motifs. Proc. Natl. Acad. Sci. USA 106, 10171–10176.
Lee, B.H., and Amon, A. (2003). Role of Polo-like kinase CDC5 in programming meiosis I chromosome segregation. Science 300, 482–486.
Loog, M., and Morgan, D.O. (2005). Cyclin specificity in the phosphorylation of cyclin-dependent kinase substrates. Nature 434, 104–108.
Mullen, J.R., Kaliraman, V., Ibrahim, S.S., and Brill, S.J. (2001). Requirement for three novel protein complexes in the absence of the Sgs1 DNA helicase in Saccharomyces cerevisiae. Genetics 157, 103–118.
Niu, H., Wan, L., Baumgartner, B., Schaefer, D., Loidl, J., and Hollingsworth, N.M. (2005). Partner choice during meiosis is regulated by Hop1-promoted dimerization of Mek1. Mol. Biol. Cell 16, 5804–5818.
Oh, S.D., Lao, J.P., Hwang, P.Y.H., Taylor, A.F., Smith, G.R., and Hunter, N. (2007). BLM ortholog, Sgs1, prevents aberrant crossing-over by suppressing formation of multichromatid joint molecules. Cell 130, 259–272.
Oh, S.D., Lao, J.P., Taylor, A.F., Smith, G.R., and Hunter, N. (2008). RecQ helicase, Sgs1, and XPF family endonuclease, Mus81-Mms4, resolve aberrant joint molecules during meiotic recombination. Mol. Cell 31, 324–336.
Osman, F., Dixon, J., Doe, C.L., and Whitby, M.C. (2003). Generating crossovers by resolution of nicked Holliday junctions: a role for Mus81-Eme1 in meiosis. Mol. Cell 12, 761–774.
Petronczki, M., Matos, J., Mori, S., Gregan, J., Bogdanova, A., Schwickart, M., Mechtler, K., Shirahige, K., Zachariae, W., and Nasmyth, K. (2006). Monopolar attachment of sister kinetochores at meiosis I requires casein kinase 1. Cell 126, 1049–1064.
Poser, I., Sarov, M., Hutchins, J.R., Hériché, J.K., Toyoda, Y., Pozniakovsky, A., Weigl, D., Nitzsche, A., Hegemann, B., Bird, A.W., et al. (2008). BAC TransgeneOmics: a high-throughput method for exploration of protein function in mammals. Nat. Methods 5, 409–415.
Rass, U., Compton, S.A., Matos, J., Singleton, M.R., Ip, S.C.Y., Blanco, M.G., Griffith, J.D., and West, S.C. (2010). Mechanism of Holliday junction resolution by the human GEN1 protein. Genes Dev. 24, 1559–1569.
Rockmill, B., Fung, J.C., Branda, S.S., and Roeder, G.S. (2003). The Sgs1 helicase regulates chromosome synapsis and meiotic crossing over. Curr. Biol. 13, 1954–1962.
Schöffski, P. (2009). Polo-like kinase (PLK) inhibitors in preclinical and early clinical development in oncology. Oncologist 14, 559–570.
Schwacha, A., and Kleckner, N. (1995). Identification of double Holliday junctions as intermediates in meiotic recombination. Cell 83, 783–791.
Sourirajan, A., and Lichten, M. (2008). Polo-like kinase Cdc5 drives exit from pachytene during budding yeast meiosis. Genes Dev. 22, 2627–2632.
Tay, Y.D., and Wu, L. (2010). Overlapping roles for Yen1 and Mus81 in cellular Holliday junction processing. J. Biol. Chem. 285, 11427–11432.
Wechsler, T., Newman, S., and West, S.C. (2011). Aberrant chromosome morphology in human cells defective for Holliday junction resolution. Nature 471, 642–646.
Wu, L., and Hickson, I.D. (2003). The Bloom’s syndrome helicase suppresses crossing over during homologous recombination. Nature 426, 870–874.
Saturated Fatty Acids Induce c-Src Clustering within Membrane Subdomains, Leading to JNK Activation
Ryan G. Holzer,1 Eek-Joong Park,1 Ning Li,1 Helen Tran,1 Monica Chen,1 Crystal Choi,1 Giovanni Solinas,2 and Michael Karin1,*
1Laboratory of Gene Regulation and Signal Transduction, Department of Pharmacology, School of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
2Laboratory of Metabolic Stress Biology, Department of Medicine, Physiology, University of Fribourg, 1700 Fribourg, Switzerland
*Correspondence: karinoffi [email protected]
DOI 10.1016/j.cell.2011.08.034
SUMMARY
Saturated fatty acids (FA) exert adverse health effects and are more likely to cause insulin resistance and type 2 diabetes than unsaturated FA, some of which exert protective and beneficial effects. Saturated FA, but not unsaturated FA, activate Jun N-terminal kinase (JNK), which has been linked to obesity and insulin resistance in mice and humans. However, it is unknown how saturated and unsaturated FA are discriminated. We now demonstrate that saturated FA activate JNK and inhibit insulin signaling through c-Src activation. FA alter the membrane distribution of c-Src, causing it to partition into intracellular membrane subdomains, where it likely becomes activated. Conversely, unsaturated FA with known beneficial effects on glucose metabolism prevent c-Src membrane partitioning and activation, which are dependent on its myristoylation, and block JNK activation. Consumption of a diabetogenic high-fat diet causes the partitioning and activation of c-Src within detergent-insoluble membrane subdomains of murine adipocytes.
INTRODUCTION
Insulin resistance is a pathophysiologic condition caused by defective insulin signaling that can cause type 2 diabetes. Although insulin resistance has a strong genetic component (Kahn et al., 1996), it can be initiated and exacerbated by obesity (Ford et al., 1997). Obesity is also associated with low-grade chronic inflammation (Hotamisligil, 2010), whose hallmarks include enhanced production of inflammatory mediators, infiltration of activated macrophages into adipose tissue, and chronic JNK activation in liver, muscle, and fat tissue of obese individuals (Gregor et al., 2009) and experimental animals (Hirosumi et al., 2002; Solinas et al., 2006). Mouse studies identified adipocytes as an important cell type within which JNK activation causes cell-autonomous interference with insulin signaling
(Sabio et al., 2008). Adipocytes store fat and exert both protective and adverse effects on glucose metabolism, depending on the quality and quantity of stored lipids (Virtue and Vidal-Puig, 2008). Not all lipids are equal in their metabolic and health effects; whereas saturated FA have a strong diabetogenic effect (Clandinin et al., 1991) and lead to JNK activation (Solinas et al., 2006), certain unsaturated FA and especially polyunsaturated FA (PUFA) are protective and can even reverse obesity-induced insulin resistance (Clandinin et al., 1991; Robinson et al., 2007; Storlien et al., 1987). The JNKs belong to the mitogen-activated protein kinase (MAPK) group and are activated by physical stresses, such as UV light and heat shock, and receptor-mediated mechanisms, including TNF receptor 1 (TNFR1) and Toll-like receptors (TLR) 2 and 4 (Karin and Gallagher, 2005). Following activation, JNKs participate in many physiological and pathophysiological processes, including apoptosis, cell proliferation, cell migration, and cytokine production. Many of these effects depend on transcription factor activation, but JNKs also affect cell physiology through other substrates (Karin and Gallagher, 2005). For instance, JNKs phosphorylate insulin receptor substrates (IRS) 1 and 2 at serine (Ser) or threonine (Thr) residues and thereby attenuate their insulin-induced tyrosine (Tyr) phosphorylation, resulting in downmodulation of insulin action and diminished AKT activation (Aguirre et al., 2002; Solinas et al., 2006). JNK1-deficient mice are protected from obesity-induced insulin resistance (Hirosumi et al., 2002) due to loss of cell-autonomous IRS1/2 phosphorylation within adipocytes (Sabio et al., 2008). JNKs also contribute to insulin resistance by stimulating production of inflammatory mediators by myeloid cells (Solinas et al., 2007; Vallerie et al., 2008) and have neuronal effects that influence obesity and energy metabolism (Sabio et al., 2010).
Several mechanisms were proposed to explain chronic JNK activation in obesity, including endoplasmic reticulum (ER) stress (Ozcan et al., 2004) and signaling through inflammation-associated receptors (Shi et al., 2006; Uysal et al., 1997). However, how obesity triggers ER stress remains to be determined and the mechanisms by which ER stress leads to JNK activation are not fully understood either, although they were proposed to depend on the RNA-dependent protein kinase PKR or TRAF2 (Hotamisligil, 2010).
Cell 147, 173–184, September 30, 2011 ©2011 Elsevier Inc. 173
Other studies have
implicated the phosphoinositide 3-kinase (PI3K) p85α regulatory subunit (Taniguchi et al., 2007), the scaffolding protein JIP1 (Jaeschke et al., 2004), the lipid chaperone aP2 (Erbay et al., 2009), and the mixed lineage kinase MLK3 (Jaeschke and Davis, 2007). These studies, too, poorly explain JNK activation in fat depots during obesity. In cultured cells, saturated FA such as palmitic acid (PA; C16:0) and stearic acid (SA; C18:0), which are elevated in plasma of obese individuals (Reaven et al., 1988), cause a spectrum of diabetes-related defects and activate JNK (Kharroubi et al., 2004; Solinas et al., 2006). Strong JNK activation is unique to long-chain saturated FA, whereas unsaturated FA are poor JNK activators and even inhibit JNK activation by saturated FA. These effects correlate with the pathophysiological effects of different FA types, suggesting that saturated FA may be physiologically relevant JNK activators. The exact mechanism through which saturated FA activate JNK in cells is unknown, although several studies suggest that FA may activate JNK via TLR2/4 (Shi et al., 2006; Tsukumo et al., 2007). Yet, JNK activation by PA does not require TAK1, a MAPK kinase kinase (MAP3K) that is essential for JNK activation by conventional TLR2/4 ligands (Jaeschke and Davis, 2007; Tseng et al., 2010). Moreover, saturated FA cause slow and sustained JNK activation, whereas JNK activation by TLR ligands is rapid and transient (Solinas et al., 2006; Tseng et al., 2010). The mechanism responsible for discrimination between saturated and unsaturated FA (some of which differ only by two hydrogen atoms) that accounts for their different effects on JNK activity and insulin signaling is even more mysterious. Because FA incorporate into cellular membranes either in their original form or after conversion into other lipid species, they may exert receptor-independent effects on cell signaling and physiology.
For instance, PA incorporation reduces membrane fluidity, whereas unsaturated FA and especially PUFA do not have such an effect (Clamp et al., 1997; Karnovsky et al., 1982; Luo et al., 1996; Rintoul et al., 1978; Stulnig et al., 1998, 2001; Webb et al., 2000). We therefore postulated that the membrane may play a key role in JNK regulation by FA and searched for membrane-associated protein kinases that are essential for JNK activation by saturated FA. We now show that a key mediator of JNK activation by saturated FA is c-Src, whose activation correlates with its FA-induced partitioning into intracellular membrane subdomains that can be isolated based on detergent insolubility (Karnovsky et al., 1982; Lingwood and Simons, 2007; Pike, 2004, 2009) or visualized by fluorescence microscopy. Our results provide a biochemical model that explains the differential effects of saturated and unsaturated FA on JNK activity and human health.
RESULTS
c-Src Is Required for JNK Activation by FA
Saturated FA that increase membrane order and melting temperature lead to JNK activation, whereas unsaturated FA capable of decreasing membrane order and melting temperature do not (Solinas et al., 2006). Optimal JNK activation by FA requires prolonged incubation periods and is much slower than receptor-mediated responses. We therefore hypothesized that
the membrane is the most proximal sensor of FA and that membrane alterations may activate an associated protein kinase that triggers a signaling cascade leading to JNK. Although MLK3 was identified as the most upstream protein kinase that mediates JNK activation by saturated FA (Jaeschke and Davis, 2007), it is a MAP3K that is not membrane associated (Handley et al., 2007). Because many MAP3Ks act downstream of tyrosine kinases, we tested whether the Src group of membrane-anchored tyrosine kinases is involved in JNK activation by FA. Indeed, immortalized fibroblasts derived from mice that are deficient in three broadly expressed Src kinases, c-Src, Yes, and Fyn (or SYF) (Stein et al., 1994), did not exhibit increased JNK1 activity upon incubation with either SA or PA (Figure 1A). A similar defect was exhibited by c-Src–/– fibroblasts, and reconstitution of SYF–/– cells with c-Src (SYF+Src) fully restored JNK1 activation. By contrast, reconstitution of SYF–/– cells with Yes or Fyn did not fully restore JNK activation by saturated FA (Figure 1B). c-Src was mainly required for JNK activation by saturated FA with acyl chains of at least 16 carbons, as lauric acid (LA; C12:0) and myristic acid (MA; C14:0) did not activate JNK in SYF+Src cells, and neither did the monounsaturated FA palmitoleic acid (POA; C16:1, n-7) or the PUFA eicosapentaenoic acid (EPA; C20:5, n-3) (Figure S1A available online). Knockdown of endogenous c-Src by shRNA in NIH 3T3 cells decreased JNK activation by PA (Figure 1C), and similar results were obtained in human HEK293T cells (Figure S1B). JNK activation by PA was restored by reconstitution of Src-deficient cells with wild-type (WT) Src, but not with an inactive (Y418F) mutant (Figure 1D).
SYF–/– cells reconstituted with c-Src(Y527F), lacking the negative phosphorylation site recognized by Csk (Okada et al., 1991), displayed modestly enhanced JNK activation in response to PA, ruling out involvement of Csk-mediated c-Src phosphorylation in the response to saturated FA. By contrast, a compound c-Src(Y527F/G2A) mutant that also lacks the myristoylation sequence necessary for membrane targeting (Patwardhan and Resh, 2010) was highly compromised in JNK activation (Figures 1D and S1C). Pretreatment of SYF+Src cells with the Src family kinase inhibitor PP2 prevented JNK activation by PA, but the structurally similar control compound PP3 was ineffective (Figure S1D). We also examined the role of c-Src in PA-induced MLK3 activation. NIH 3T3, SYF–/–, and SYF+Src cells were treated with PA for up to 8 hr, and MLK3 activation was assessed by immunoblotting with an antibody that recognizes Thr277/Ser281 phosphorylation. Within 2 hr of PA addition, sustained MLK3 phosphorylation was detected in NIH 3T3 and SYF+Src cells, but not in SYF–/– cells (Figure 1E). JNK activation in the same cells was also monitored by immunoblotting and was found to parallel MLK3 activation. Curiously, SYF–/– cells contained lower amounts of MLK3 than 3T3 or SYF+Src cells, but the basis and significance of this finding are unclear. We examined whether c-Src is needed for induction of ER stress, evaluated by induction of CHOP mRNA and IRE1α-catalyzed XBP1 mRNA splicing. Incubation of fibroblasts with either PA or SA induced CHOP mRNA and XBP1 mRNA splicing, but no role for Src kinases could be identified (Figure S1E). In accordance with these results, PP2 pretreatment of J774A.1 cells did not block PA-induced ER stress markers (Figure S1F). PP2,
Figure 1. c-Src Is Required for JNK Activation and Insulin Resistance by Saturated FA (A–D) JNK activation by FAs is c-Src dependent. (A) NIH 3T3, Src–/–, SYF–/–, and SYF fibroblasts reconstituted with c-Src were treated for 6 hr with 500 μM palmitic (PA) or stearic (SA) acids loaded onto BSA. Whole-cell lysates were prepared and subjected to JNK1 immunecomplex kinase assay using GST-c-Jun (1–79) as a substrate and immunoblotting with indicated antibodies. (B) NIH 3T3 and SYF–/– fibroblasts transduced with empty vector, Fyn, or Yes expression vectors were treated with SA or PA and analyzed as above. (C) NIH 3T3 cells were infected with lentiviral constructs carrying scrambled (Scr) or c-Src-specific shRNAs. After selection in puromycin-containing medium, cells were treated with PA, and JNK1 activity and c-Src expression were analyzed. (D) NIH 3T3, Src–/– cells, and Src–/– cells transduced with empty vector or WT c-Src or c-Src(Y418F) expression vectors or SYF–/– cells reconstituted with WT, Y527F, or Y527F/G2A c-Src vectors were treated with PA-loaded BSA or BSA alone and analyzed as above. Fold increase in JNK1 activity is shown below and was determined by densitometric analysis of three similar but separate experiments. (E) Src is required for MLK3 activation by PA. NIH 3T3, SYF–/–, and SYF+Src cells were treated with PA-loaded BSA for the indicated time periods. Cell lysates were prepared, and phosphorylation of MLK3 and JNK1/2 was monitored by immunoblotting. (F) PA-induced insulin resistance is c-Src dependent. NIH 3T3, SYF–/–, and SYF+Src cells were pretreated with or without PA for 6 hr before treatment with 100 ng/ml insulin for 7.5 and 15 min. JNK activation and insulin-induced AKT phosphorylation were analyzed by immunoblotting. See also Figure S1.
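The fold increase in JNK1 activity reported under Figure 1D comes from densitometry of three separate experiments. The caption does not give the exact calculation, so the following is only a minimal sketch of one common way to compute it, assuming each experiment yields one band intensity for BSA alone and one for PA-loaded BSA. The function name `fold_increase` and all intensity values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def fold_increase(treated, control):
    """Return the per-experiment ratio of treated to control band intensity."""
    if len(treated) != len(control):
        raise ValueError("need one control reading per treated reading")
    if any(c <= 0 for c in control):
        raise ValueError("control intensities must be positive")
    return [t / c for t, c in zip(treated, control)]


# Hypothetical densitometry readings (arbitrary units), three experiments.
bsa_alone = [100.0, 120.0, 80.0]
pa_loaded = [400.0, 360.0, 320.0]

ratios = fold_increase(pa_loaded, bsa_alone)
print("fold increase per experiment:", ratios)
print("mean fold increase:", mean(ratios))
```

Taking the ratio within each experiment before averaging keeps blot-to-blot differences in exposure from inflating or masking the effect.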
however, inhibited JNK activation and induction of TNF production in PA-incubated J774A.1 macrophages (Figures S1G and S1H). Importantly, in SYF–/– cells, PA treatment failed to inhibit insulin-induced AKT activation, but reconstitution with c-Src restored PA-induced inhibition of AKT Ser473 phosphorylation
to the same magnitude seen in 3T3 fibroblasts (Figure 1F). c-Src expression also restored PA-induced JNK1/2 phosphorylation. Collectively, these results establish a critical role for c-Src in JNK activation by FA and inhibition of insulin-mediated AKT activation.
Saturated FA Induce Src Partitioning into Detergent-Insoluble Membrane Microdomains
Next, we examined whether c-Src activation is linked to changes in membrane structure. Membrane microdomains enriched in cholesterol, sphingolipids, and other lipids with saturated acyl chains, which presumably reduce membrane fluidity, can be isolated based on resistance to solubilization with cold nonionic detergents (Pike, 2009). Such membrane microdomains, sometimes referred to as lipid rafts, are postulated to be enriched in signaling proteins that can coalesce into larger assemblies (Janes et al., 1999; Pike, 2009). To examine whether saturated FA alter the membrane distribution of c-Src, SYF+Src cells were treated with PA, SA, or BSA alone for 4 hr before preparation of lysates that were solubilized with Triton X-100 at 4°C. Detergent-resistant membranes (DRMs) were isolated by equilibrium density gradient centrifugation (Lingwood and Simons, 2007). Using this technique, DRMs and their associated proteins float to the gradient’s top (fraction 1), whereas solubilized membranes and cell debris remain in fraction 4 (Figure 2A). In BSA-treated cells, very little c-Src was present in the DRM fraction (fraction 1), and most of it was in fractions 3 and 4. However, upon incubation with PA- or SA-loaded BSA, the distribution of c-Src had significantly changed, and a substantial amount of the protein was in fraction 1 (Figures 2A and 2B). A similar enrichment within fraction 1 of cells treated with PA or SA was seen for c-Src phosphorylated at Tyr418, presumably an activated form of the kinase. Fraction 1 also contained flotillin-1 and flotillin-2, proteins that serve as DRM markers, but was devoid of calnexin, a membrane protein that is excluded from DRM (Lingwood and Simons, 2007). Unlike c-Src, the amounts of flotillin-1/2 within fraction 1 were not affected by incubation with either PA or SA (Figure 2B).
Incubation with PA-loaded BSA also increased the amounts of JNK1/2, MLK3, and MKK4 within fraction 1 (Figures 2A and 2B). However, no significant effect on the distribution of p85 PI3K, a protein suggested to be involved in JNK activation and insulin resistance (Taniguchi et al., 2007), was seen (Figure 2B). Immunecomplex kinase assay with enolase as a substrate confirmed that incubation of cells with PA or SA increased the amount of active c-Src within fraction 1 (Figure 2C). PA treatment also enhanced the amount of Yes, but not Fyn, within fraction 1, but the effect on Yes was more subtle than the effect on c-Src (Figures S2A and S2B). The nonmyristoylated c-Src(G2A) mutant was mostly confined to the soluble fraction 4 and did not shift into the DRM fraction after PA treatment (Figure S2C). c-Src(G2A) also exhibited less Tyr418 phosphorylation before and after PA treatment. In contrast, a dually palmitoylated c-Src(S3C/S6C) mutant (Sandilands et al., 2007) was present within the DRM fraction and was Tyr418 phosphorylated even in BSA-treated cells, and PA treatment led to only a modest increase in its amount in fraction 1 relative to fraction 4 (Figure S2D). Consistent with their poor effect on JNK activity (Figure S1A), MA treatment led to a small increase in c-Src within the DRM fraction, and LA had no effect on the membrane distribution of c-Src (Figure 2D). Likewise, MA and LA were poorer inducers of CHOP mRNA than PA and did not stimulate XBP1 mRNA splicing (Figure 2E). We also used a detergent-independent technique for isolation of putative lipid rafts that relies on physical disruption of cellular
membranes in the presence of basic sodium carbonate buffer (Ostrom and Insel, 2006). This procedure yielded similar results to those described above, demonstrating PA-induced accumulation of c-Src and phospho-Tyr418 c-Src within the presumed lipid raft fraction, although enrichment of phospho-Tyr418 c-Src was more pronounced relative to the more modest enrichment of c-Src (Figures S2E and S2F). Compared to Triton X-100, a strong detergent that tends to solubilize all but the most ordered membrane subdomains, detergent-independent lipid raft isolation tends to yield larger membrane fragments that include highly ordered (and Triton X-100-insoluble) and less-ordered microdomains (Pike, 2004). Our results may therefore suggest that c-Src activation occurs within less-ordered microdomains, and once c-Src is activated, it is concentrated within a more ordered membrane compartment. An in vitro kinase assay using a peptide substrate demonstrated that the c-Src that segregated within the putative lipid raft fraction generated by the detergent-independent method was activated if isolated from PA-treated cells (Figure S2G). PA treatment led to no change in c-Src activity within raft-devoid fractions (Figure S2H).
Unsaturated FA Inhibit Src Redistribution and Activation
Mono- and polyunsaturated FA increase membrane fluidity (Clamp et al., 1997; Karnovsky et al., 1982; Luo et al., 1996; Stulnig et al., 1998, 2001; Webb et al., 2000) and antagonize cellular effects of saturated FA (Akazawa et al., 2010). We pretreated SYF+Src cells with 300 μM of the monounsaturated POA or the PUFA EPA for 15 min prior to treatment with PA and examined c-Src membrane distribution and activity, JNK activation, and ER stress markers. Pretreatment with either POA or EPA blocked PA-induced partitioning of total c-Src and phospho-Tyr418 c-Src into the DRM fraction while having little to no effect on flotillin-1/2 and calnexin distribution (Figures 3A and 3B).
However, pretreatment with 300 μM SA did not prevent PA-induced c-Src redistribution, and as expected, even at this lower concentration, SA caused a small amount of c-Src redistribution on its own (Figure S3A). POA and EPA pretreatment also blocked PA-induced JNK activation (Figure 3C) as well as induction of ER stress markers by PA (Figure 3D). These results indicate that mono- and polyunsaturated FA block the effects of saturated FA on c-Src membrane distribution and activity, JNK activation, and ER stress. Given the opposing effects of unsaturated FA versus saturated FA on membrane fluidity, these findings support the hypothesis that altered membrane fluidity is key to the signaling effects of saturated FA. To determine whether saturated FA are preferentially incorporated into DRM, we treated cells for 2 hr with a mixture of 3H-labeled and cold PA, isolated DRMs by the detergent-dependent method, and measured radioactivity in the different fractions. Most of the radioactivity was present in the DRM fraction (Figure 4A), especially when normalized to the protein content of each fraction (Figure 4B). To examine whether unsaturated FA incorporate preferentially into the detergent-soluble membrane fraction, we incubated cells with 3H-labeled oleic acid (OA; C18:1). Most of the 3H-OA-derived radioactivity was present in the soluble fraction 4 (Figures 4C and 4D). When cells were pretreated with POA and EPA prior to incubation with 3H-PA, incorporation of 3H-PA-derived radioactivity into
Figure 2. Saturated FA Induce c-Src Segregation into Detergent-Resistant Membranes (A) SYF+Src cells were treated for 4 hr with BSA alone or PA- or SA-loaded BSA. Whole-cell lysates were solubilized with Triton X-100 at 4°C, and DRMs were isolated by equilibrium density gradient centrifugation. Fractions were collected from the gradient’s top, such that fraction 1 represents the DRM fraction. Presence of the indicated proteins and phospho-Tyr418 c-Src in the different fractions was analyzed by immunoblotting. Flotillin-1 and flotillin-2 are lipid raft markers, whereas calnexin is a membrane protein that is excluded from lipid rafts/DRM. The percentage of c-Src in Fr. 1 is indicated underneath and was calculated by densitometric analysis of four separate experiments. (B) FA-induced enrichment of different signaling proteins in the DRM fraction following PA or SA treatments. Results are averages ± SD of three to five experiments similar to the one in (A). *p < 0.05. (C) c-Src kinase activity in different membrane fractions from cells treated with BSA or BSA plus saturated FA. Cells were treated and their membranes were fractionated as in (A). c-Src was immunoprecipitated from the different fractions and its activity measured using acid-activated enolase as a substrate. Average fold increase in c-Src activity (n = 4 for PA; n = 3 for SA) is indicated below. (D) c-Src membrane redistribution is dependent on FA acyl chain length. SYF+Src cells were treated with 500 μM PA (C16), MA (C14), or LA (C12). The cells were fractionated as in (A), and the fractions were immunoblotted for the indicated proteins. (E) SYF+Src cells were treated for 6 hr with either BSA alone or BSA loaded with PA, MA, or LA. Induction of CHOP mRNA or XBP1 mRNA splicing was analyzed by Q-RT-PCR. Results are averages ± SD. n = 3. See also Figure S2.
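The percentage of c-Src in fraction 1 (Figure 2A) and the averages ± SD (Figure 2B) are derived from densitometry across gradient fractions. The sketch below illustrates that arithmetic under the assumption that each experiment gives one band intensity per fraction and that the four fractions together account for all c-Src signal. The function name `percent_in_fraction` and every number are hypothetical illustrations, not data from the paper.

```python
from statistics import mean, stdev


def percent_in_fraction(intensities, index=0):
    """Return the share (in percent) of total band intensity found in one fraction."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * intensities[index] / total


# Hypothetical c-Src band intensities for fractions 1-4 in four experiments
# on PA-treated cells (arbitrary units).
experiments = [
    [30.0, 10.0, 25.0, 35.0],
    [35.0, 5.0, 20.0, 40.0],
    [25.0, 15.0, 30.0, 30.0],
    [30.0, 10.0, 20.0, 40.0],
]

# Fraction 1 is the top of the gradient, i.e., the DRM fraction (index 0).
percents = [percent_in_fraction(e) for e in experiments]
print("percent of c-Src in fraction 1:", percents)
print("average:", mean(percents), "SD:", stdev(percents))
```

`stdev` here is the sample standard deviation; whether the authors used the sample or population form is not stated in the caption.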
Figure 3. Mono- and Polyunsaturated FA Block c-Src Membrane Redistribution and Activation as well as JNK Activation and Induction of ER Stress Markers (A–D) Unsaturated FA inhibit c-Src segregation and activation within DRM in response to PA. SYF+Src cells were pretreated with either monounsaturated palmitoleic acid (POA) (A) or polyunsaturated eicosapentaenoic acid (EPA) (B) for 15 min prior to addition of BSA or BSA loaded with PA. After 4 hr, membranes were solubilized and fractionated as in Figure 2, and distribution of the indicated proteins was determined by immunoblotting. c-Src and phospho-Tyr418 c-Src accumulated in the DRM fraction of PA-treated cells, and this was prevented by pretreatment with either POA (A) or EPA (B). (C) JNK activation and (D) induction of ER stress markers in the cells subjected to the above treatments were assessed as described in Figures 1 and 2. See also Figure S3.
the DRM fraction was significantly reduced (Figure 4E). In contrast, pretreatment with SA did not inhibit incorporation of 3H-PA into the DRM fraction. OA behaved similarly to POA and EPA in that it blocked c-Src partitioning into the DRM fraction in response to PA (Figure S3B) and prevented PA-induced ER stress (Figure S3C) and JNK activation (Figure S3D). These results suggest that exposure of cells to PA results in direct effects on membrane structure, which eventually affect the distribution and activity of membrane-anchored c-Src.
DRM and Src Activation in Adipose Tissue of Obese Mice
Next, we checked whether consumption of high-fat diet (HFD) also results in c-Src partitioning and activation within DRM microdomains. C57BL/6 mice were placed on HFD for 16 weeks, and their brown adipose tissues (BAT) and white adipose tissues (WAT) were collected. The distribution of membrane proteins, including c-Src, within these tissues was analyzed by the Triton X-100 method described above. Consumption of HFD, but not normal chow (low-fat diet [LFD]), resulted in a dramatic increase of total c-Src and phospho-Tyr418 c-Src within fraction 1 of Triton X-100-solubilized BAT (Figure 5A). Consumption of HFD also increased the amounts of flotillin-1/2 within DRMs, suggesting a major expansion of such membrane microdomains in BAT, but as expected, calnexin remained excluded from the DRM fraction. Consumption of HFD also increased JNK1/2 and MLK3 within the DRM fraction but had little effect on the distribution of MKK4 and the p85α subunit of PI3K (Figure 5A). A Src kinase assay confirmed that HFD consumption resulted in c-Src activation and induced a large increase in the amount of activated Src within fraction 1 (Figure 5B).
Similar results were obtained by analysis of WAT. Although HFD consumption did not lead to an expansion of the DRM fraction, as indicated by the nearly identical distribution of flotillin-1/2 in membrane fractions of WAT from lean and obese mice, the amount of phospho-Tyr418 Src was higher in fraction 1 of WAT membranes isolated from obese mice (Figure 5C). Immunecomplex kinase assay using enolase as a substrate confirmed that there was a clear and significant increase in the amount of c-Src and c-Src catalytic activity in the DRM fraction of WAT from obese mice (Figure 5D). As expected, HFD consumption resulted in JNK1 activation in both BAT and WAT (Figures S4A and S4B). Increased JNK activation was accompanied by decreased AKT Ser473 phosphorylation (Figures S4A and S4B). Dasatinib (Sprycel) is a broad-spectrum kinase inhibitor that potently inhibits c-Src (Agostino et al., 2010). Dasatinib is used for treatment of imatinib-resistant chronic myelogenous leukemia (CML), but several case reports indicated that it had beneficial effects on blood glucose in diabetic patients with CML (Agostino et al., 2010; Breccia et al., 2008). We examined whether dasatinib improves glucose tolerance in mice rendered obese by feeding with a HFD for 16 weeks. Indeed, 3 hr of treatment with 30 mg/kg dasatinib resulted in improved glucose tolerance in obese mice (Figure S5). However, given the ability of dasatinib to inhibit several different tyrosine kinases, including the entire Src family, it is difficult to attribute this effect to c-Src itself.
Subcellular Distribution of c-Src after FA Treatment
Detergent-dependent and -independent methods of DRM or lipid raft isolation are informative but provide no direct evidence
Figure 4. Saturated FA Are Preferentially Incorporated into DRM (A and B) SYF+Src cells were treated for 2 hr with 200 μM 3H-labeled PA and 300 μM cold PA loaded onto BSA. Cell lysates were prepared and membranes were solubilized and fractionated as in Figure 2. (A) The amount of 3H in each fraction was determined by scintillation counting and (B) normalized to protein content. Results are averages ± SD. n = 3. (C and D) SYF+Src cells were treated for 2 hr with 110 nM 3H-labeled OA and 300 μM cold OA loaded onto BSA. Cells were fractionated as above, and (C) the amount of 3H in each fraction was determined by scintillation counting and (D) normalized to protein content. Results are averages ± SD. n = 3. (E) SYF+Src cells were pretreated for 15 min with 300 μM EPA, POA, or SA, followed by a 2 hr incubation with 200 μM 3H-labeled PA and 300 μM cold PA loaded onto BSA. Cell lysates were solubilized and fractionated, and the relative amount of 3H in fraction 1 was determined as above. Results are averages ± SD. n = 3; *p < 0.05. See also Figure S3.
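Panels (B) and (D) of Figure 4 normalize the 3H signal in each fraction to that fraction's protein content. A minimal sketch of this normalization follows, assuming scintillation counts in cpm and protein amounts in micrograms per fraction. The function name `normalize_to_protein`, the units, and all values are hypothetical and chosen only to show why normalization can sharpen the apparent enrichment in a protein-poor fraction.

```python
def normalize_to_protein(cpm, protein_ug):
    """Return radioactivity per unit protein (cpm per microgram) for each fraction."""
    if len(cpm) != len(protein_ug):
        raise ValueError("need one protein amount per fraction")
    if any(p <= 0 for p in protein_ug):
        raise ValueError("protein amounts must be positive")
    return [c / p for c, p in zip(cpm, protein_ug)]


# Hypothetical values for fractions 1-4 (fraction 1 = DRM, fraction 4 = soluble).
cpm = [6000.0, 1000.0, 1500.0, 3000.0]
protein_ug = [20.0, 50.0, 150.0, 300.0]

specific_activity = normalize_to_protein(cpm, protein_ug)
for number, value in enumerate(specific_activity, start=1):
    print(f"fraction {number}: {value:.1f} cpm per microgram protein")
```

In this made-up example fraction 1 holds about half of the raw counts but very little protein, so its counts per microgram are far higher than in the soluble fractions.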
that the distribution of the protein in question within the membrane has changed prior to membrane solubilization and extraction. Furthermore, the membrane fractionation approach provides no information regarding the subcellular distribution of the protein in question. Previous studies of c-Src activation have demonstrated that c-Src is frequently targeted to ill-defined endosomal structures distinct from the plasma membrane (Sandilands et al., 2004; Seong et al., 2009; Wang et al., 2005). Furthermore, the lipid raft markers flotillin-1 and flotillin-2 are frequently found on organelles and endosomal membranes in addition to the plasma membrane (Browman et al., 2007; Rajendran et al., 2007). We used indirect three-color immunofluorescence and confocal microscopy to track the distribution of phospho-Tyr418 c-Src before and after treatment with PA or PA+POA relative to that of flotillin-1 and the lysosomal marker LAMP-1. The results indicated that, after 2.5 hr treatment with PA,
Tyr418-phosphorylated c-Src was present in aggregates associated with perinuclear and vesicular membranes (Figure 6A). Phospho-Tyr418 c-Src aggregates frequently presented as distended rings, which were infrequent in BSA-treated control cells, or cells that were pretreated with POA prior to addition of PA. Flotillin-1 and LAMP-1 strongly colocalized both in control cells and in PA-treated cells. However, phospho-Tyr418 c-Src in control cells exhibited only weak colocalization with flotillin-1 or LAMP-1; but upon PA treatment, colocalization of phospho-Tyr418 c-Src with flotillin-1 and LAMP-1 was strongly enhanced (Figure 6A). POA pretreatment inhibited PA-induced colocalization of activated c-Src with flotillin-1 or LAMP-1 but had no effect on the relative distribution of the two markers. To confirm these results, we used a detergent-independent postnuclear cell fractionation procedure and collected a fraction that was enriched for LAMP-1 (fraction 3) (Figure S6) and subjected it to lipid raft analysis using Triton X-100. This analysis indicated that the DRM fraction of the lysosomal membrane was strongly enriched for c-Src and phospho-Tyr418 c-Src after PA treatment (Figure 6B). However, only a small amount of c-Src and phospho-Tyr418 c-Src was present in fraction 1 of BSA-treated cells or cells preincubated with POA prior to PA addition. Consistent with the strong colocalization of LAMP-1 and flotillin-1 revealed by the immunofluorescence analysis, the DRM fraction of the lysosomal membrane contained both LAMP-1 and flotillin-1 (Figure 6B).
DISCUSSION
JNK activity is elevated in adipocytes of obese individuals (Gregor et al., 2009) and mice (Hirosumi et al., 2002; Solinas
Figure 5. Consumption of High-Fat Diet Results in c-Src Segregation and Activation within DRM of Adipose Tissue (A–D) Mice were kept on normal chow (LFD) or high-fat diet (HFD) for 16 weeks, after which brown (A and B, BAT) and white (C and D, WAT) adipose tissues were isolated. The tissues were homogenized, excess lipid was removed, and the lysates were adjusted to equal protein concentrations before cold Triton X-100 solubilization and fractionation as in Figure 2. (A) Density gradient fractions of BAT membranes were analyzed for the indicated proteins by immunoblotting. Results show two different mice for each dietary condition and are representative of separate experiments in which three mice were analyzed. (B) c-Src was immunoprecipitated from the gradient fractions of BAT isolated from mice LFD2 and HFD2, and its kinase activity was measured using enolase as a substrate. (C) Density gradient fractions of WAT membranes from the indicated mice were analyzed for the indicated proteins by immunoblotting. (D) c-Src was immunoprecipitated from fraction 1 of three different LFD- or HFD-fed mice, and its kinase activity was measured as above. See also Figures S4 and S5.
et al., 2006). Chronic JNK activation is of physiological significance, as it contributes to insulin resistance, obesity, and production of inflammatory cytokines (Hirosumi et al., 2002; Sabio et al., 2010; Solinas et al., 2007). The exact mediators by which obesity activates JNK are unknown, but saturated FA such as PA and SA are potent JNK activators in cultured cells, whereas unsaturated FA have only a marginal effect (Solinas et al., 2006). In fact, some unsaturated FA, including POA, which differs from PA by only two hydrogens, and especially PUFA, inhibit JNK activation by saturated FA. Such observations invoke the existence of a sensor that discriminates between saturated and unsaturated FA, at times detecting the presence of a single double bond or the absence of two hydrogen atoms. Our results suggest that this sensor/receptor is none else but the membrane. Incorporation of saturated FA, whose acyl chains assume a rigid and straight conformation, into cellular membranes increases membrane order, resulting in reduced fluidity and higher melting temperature (Karnovsky et al., 1982). By contrast, the bent tails of unsaturated FA reduce membrane order and increase fluidity. We now show that saturated FA stimulate c-Src partitioning into membrane subdomains with increased rigidity that can be isolated based on resistance to detergent solubilization or increased density and that the c-Src that resides within such microdomains is more active than the general pool of c-Src. We also provide genetic evidence that 180 Cell 147, 173–184, September 30, 2011 ª2011 Elsevier Inc.
c-Src is required for FA-induced JNK1 and MLK3 activation. Furthermore, mono- and polyunsaturated FA, POA, and EPA, respectively, do not alter the membrane distribution of c-Src on their own and block its partitioning and activation within DRMs in cells treated with PA. Correspondingly, POA and EPA prevent JNK activation by PA. Although these results do not exclude the existence of a protein receptor that can discriminate between different FA forms, they favor a hypothetical model according to which FA discrimination is carried out by the membrane. In addition to decreasing membrane fluidity, saturated FA expand the lipid raft compartment, defined as a region of the membrane with decreased fluidity and resistance to nonionic detergents due to enrichment with cholesterol, sphingolipids, and saturated phospholipids (Pike, 2009). By contrast, membrane supplementation with unsaturated FA decreases lipid raft formation (Clamp et al., 1997; Karnovsky et al., 1982; Stulnig et al., 1998, 2001). Although the in vivo existence of lipid rafts remains controversial, such experiments suggest that saturated and unsaturated FA have opposing effects on membrane fluidity and/or structure that result in differential susceptibility to solubilization with nonionic detergents. It was also noted that cellular membranes from obese individuals have abnormal composition and decreased fluidity (Faloia et al., 1999; Watala and Winocour, 1992). Yet, the pathophysiological implications of these observations remained unclear, and no direct effects on signaling proteins that lead to insulin resistance were found.
Figure 6. Subcellular Localization and Redistribution of Activated c-Src in FA-Treated Fibroblasts (A) SYF+Src cells were treated for 2.5 hr with BSA, PA, or PA+POA before fixation and staining with antibodies to phospho-Tyr418 c-Src, LAMP-1, or flotillin-1. Three-color confocal images were acquired on a Leica SPE-2 confocal microscope. Magnification, 63×. Arrows indicate aggregates of phospho-Tyr418 c-Src that colocalize with flotillin-1 and LAMP-1. (B) SYF+Src cells were treated as above. Cells were lysed in detergent-free buffer, and postnuclear lysates were separated on a Percoll gradient under high-speed centrifugation. A band enriched for LAMP-1 was collected, immediately resuspended in buffer, and solubilized with Triton X-100 at 4°C and then fractionated on density gradients. Fractions were collected and examined for the indicated proteins by immunoblotting. See also Figure S6.
c-Src is a myristoylated protein that can be found both within and outside of lipid rafts, as well as within endosomes, before translocating to the plasma membrane (Arcaro et al., 2007; Mukherjee et al., 2003; Sandilands et al., 2004; Seong et al., 2009). c-Src can be activated within all of the aforementioned compartments, including rapid activation at nonraft regions of the plasma membrane and slower activation within putative lipid rafts (Seong et al., 2009). Activation can also occur within endosomes that transit from the perinuclear region to the plasma membrane (Sandilands et al., 2004). Obviously, c-Src distribution within cellular membranes is dynamic and is modulated by environmental conditions. Our results demonstrate that incubation of fibroblasts with PA increases the amount of c-Src that is present within the DRM fraction or membrane microdomains
of increased density that can be isolated by a detergent-free method. Imaging studies confirm that PA induces clustering of Tyr-418 c-Src within endosomal and lysosomal membranes. However, it is unclear whether c-Src translocates to specific membrane microdomains before or after its activation. Nonetheless, given the presumed higher protein density of the c-Src-containing microdomains, it is plausible that increased molecular crowding facilitates c-Src autophosphorylation and activation. Enrichment of c-Src within DRM was also seen in BAT and, to a lesser extent, in WAT of mice kept on HFD. It is plausible that BSA and other FA-binding proteins are taken up by the cell via a pinocytotic mechanism and thereby deliver saturated FA to the same intracellular endosomal and lysosomal compartments within which PA-induced c-Src activation has been observed. Treatment of fibroblasts with PA also results in JNK1/2 and MLK3 recruitment to DRM. JNK1/2 were previously suggested to associate with lipid rafts in response to reactive nitrogen species (Wu et al., 2008) or following UV-C-induced ceramide accumulation (Charruyer et al., 2005), suggesting that such membrane microdomains may contain proteins that serve as platforms for formation of JNK-signaling modules. MLK3, the MAP3K responsible for JNK activation by FA (Jaeschke and Davis, 2007), is a soluble protein, and therefore its recruitment into the DRM fraction must be mediated through interactions with another protein, which could be either c-Src or a c-Src-binding protein. Importantly, c-Src activity is required for MLK3 and JNK1 activation and inhibition of insulin signaling in fibroblasts incubated with saturated FA. Moreover, incubation of fibroblasts with unsaturated FA prevents changes in c-Src distribution and inhibits both c-Src and JNK activation. 
These findings support the hypothesis that c-Src activation within membrane microdomains of reduced fluidity is a crucial event in the JNK-signaling cascade triggered by FA and are consistent with the model according to which the membrane is the primary sensor of FA structure. Dietary n-3 PUFAs, such as EPA and docosahexaenoic acid (DHA; 22:6, n-3), are common in fish and marine mammals (Simopoulos, 2002). Epidemiological studies have shown that consumption of large amounts of foods rich in n-3 PUFAs reduces the incidence of type 2 diabetes and heart disease and improves glycemic control even in the face of high body mass index (Jørgensen et al., 2006; Kagawa et al., 1982; Kromann and Green, 1980; Thorsdottir et al., 2004). Other studies have implicated DHA, EPA, or fish oil in protection from insulin resistance and type 2 diabetes in rodents and humans (Browning et al., 2007; Luo et al., 1996; Neschen et al., 2007; Oh et al., 2010; Storlien et al., 1987). Monounsaturated FA also enhance insulin sensitivity, and POA was suggested to function as a protective lipokine (Cao et al., 2008; Ryan et al., 2000). Mono- and polyunsaturated FA are biologically active and have many pleiotropic effects that could account for their antidiabetic actions, including reduced adipose tissue inflammation (Oh et al., 2010; Todoric et al., 2006). EPA or DHA also inhibit JNK/AP-1 activation by various stimuli (Liu et al., 2001; Oh et al., 2010; Todoric et al., 2006). Several hypotheses were proposed to explain the antidiabetic effects of mono- and polyunsaturated FA, including altered eicosanoid production (Culp et al., 1980) and modulation
of peroxisome proliferator-activated receptors (Neschen et al., 2007). In addition, unsaturated FA, but not saturated FA, activate anti-inflammatory G protein-coupled receptors such as GPR120 (Hirasawa et al., 2005; Oh et al., 2010). Activation of this receptor can decrease JNK activation in response to LPS in vitro, and GPR120-deficient mice are more insulin resistant on LFD and refractory to the insulin-sensitizing effects of PUFA. These data suggest that the anti-inflammatory activity of poly- and monounsaturated FAs may be independent of effects on membrane fluidity. However, involvement of GPR120 in the ability of unsaturated FA to block JNK activation by saturated FA has not been tested. Likewise, it is unknown whether saturated FA can competitively prevent GPR120 activation by unsaturated FA. Such a scenario would require GPR120 or similar receptors to bind many different FAs but be activated only by unsaturated FA. Until such a receptor is found, the opposing effects of saturated and unsaturated FA on c-Src, MLK3, and JNK1 activity are most parsimoniously explained by their differential effects on membrane fluidity and structure. Furthermore, it should be noted that the effects of saturated FA on JNK, MLK3, and Src distribution and activity are slow, requiring an hour or more to be detected, and therefore they seem inconsistent with standard receptor-mediated events that occur within much shorter time scales. The cellular uptake of saturated FA from FA-binding proteins and subsequent FA incorporation into biological membranes is also unlikely to be a rapid process.
EXPERIMENTAL PROCEDURES
Detailed experimental procedures are described in the Extended Experimental Procedures. In brief, most of the in vitro experiments were conducted using mouse fibroblasts that are wild-type, c-Src–/–, SYF–/–, or SYF–/– reconstituted with c-Src. Fibroblasts were incubated with low-endotoxin BSA that was delipidated and then loaded with different FA.
JNK activation and phosphorylation were analyzed as described (Solinas et al., 2006). c-Src activation was analyzed either by immunecomplex kinase assays or by immunochemical detection of Tyr418 phosphorylation. Lipid rafts were isolated as previously described (Lingwood and Simons, 2007; Ostrom and Insel, 2006). c-Src activation and subcellular distribution were also analyzed by indirect immunofluorescence of formaldehyde fixed cells. Mice were kept on low-fat or high-fat diets as described (Solinas et al., 2007).
SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures and six figures and can be found with this article online at doi:10.1016/j.cell.2011.08.034.
ACKNOWLEDGMENTS
We thank the individuals mentioned in the Extended Experimental Procedures who provided us with essential reagents. Research was supported by grants from the National Institutes of Health (ES006376, ES0100337) and the American Diabetes Association (ADA 7-08-MN-29) to M.K., who is an American Cancer Society Research Professor.
Received: October 6, 2010
Revised: May 26, 2011
Accepted: August 8, 2011
Published: September 29, 2011
REFERENCES
Agostino, N., Chinchilli, V.M., Lynch, C.J., Koszyk-Szewczyk, A., Gingrich, R., Sivik, J., and Drabick, J.J. (2010). Effect of the tyrosine kinase inhibitors (sunitinib, sorafenib, dasatinib, and imatinib) on blood glucose levels in diabetic and nondiabetic patients in general clinical practice. J. Oncol. Pharm. Pract. Published online August 4, 2010. 10.1177/1078155210378913.
Aguirre, V., Werner, E.D., Giraud, J., Lee, Y.H., Shoelson, S.E., and White, M.F. (2002). Phosphorylation of Ser307 in insulin receptor substrate-1 blocks interactions with the insulin receptor and inhibits insulin action. J. Biol. Chem. 277, 1531–1537.
Akazawa, Y., Cazanave, S., Mott, J.L., Elmi, N., Bronk, S.F., Kohno, S., Charlton, M.R., and Gores, G.J. (2010). Palmitoleate attenuates palmitate-induced Bim and PUMA up-regulation and hepatocyte lipoapoptosis. J. Hepatol. 52, 586–593.
Arcaro, A., Aubert, M., Espinosa del Hierro, M.E., Khanzada, U.K., Angelidou, S., Tetley, T.D., Bittermann, A.G., Frame, M.C., and Seckl, M.J. (2007). Critical role for lipid raft-associated Src kinases in activation of PI3K-Akt signalling. Cell. Signal. 19, 1081–1092.
Breccia, M., Muscaritoli, M., Cannella, L., Stefanizzi, C., Frustaci, A., and Alimena, G. (2008). Fasting glucose improvement under dasatinib treatment in an accelerated phase chronic myeloid leukemia patient unresponsive to imatinib and nilotinib. Leuk. Res. 32, 1626–1628.
Browman, D.T., Hoegg, M.B., and Robbins, S.M. (2007). The SPFH domain-containing proteins: more than lipid raft markers. Trends Cell Biol. 17, 394–402.
Browning, L.M., Krebs, J.D., Moore, C.S., Mishra, G.D., O’Connell, M.A., and Jebb, S.A. (2007). The impact of long chain n-3 polyunsaturated fatty acid supplementation on inflammation, insulin sensitivity and CVD risk in a group of overweight women with an inflammatory phenotype. Diabetes Obes. Metab. 9, 70–80.
Cao, H., Gerhold, K., Mayers, J.R., Wiest, M.M., Watkins, S.M., and Hotamisligil, G.S. (2008). Identification of a lipokine, a lipid hormone linking adipose tissue to systemic metabolism. Cell 134, 933–944.
Charruyer, A., Grazide, S., Bezombes, C., Müller, S., Laurent, G., and Jaffrézou, J.P. (2005). UV-C light induces raft-associated acid sphingomyelinase and JNK activation and translocation independently on a nuclear signal. J. Biol. Chem. 280, 19196–19204.
Clamp, A.G., Ladha, S., Clark, D.C., Grimble, R.F., and Lund, E.K. (1997). The influence of dietary lipids on the composition and membrane fluidity of rat hepatocyte plasma membrane. Lipids 32, 179–184.
Clandinin, M.T., Cheema, S., Field, C.J., Garg, M.L., Venkatraman, J., and Clandinin, T.R. (1991). Dietary fat: exogenous determination of membrane structure and cell function. FASEB J. 5, 2761–2769.
Culp, B.R., Lands, W.E., Lucches, B.R., Pitt, B., and Romson, J. (1980). The effect of dietary supplementation of fish oil on experimental myocardial infarction. Prostaglandins 20, 1021–1031.
Erbay, E., Babaev, V.R., Mayers, J.R., Makowski, L., Charles, K.N., Snitow, M.E., Fazio, S., Wiest, M.M., Watkins, S.M., Linton, M.F., and Hotamisligil, G.S. (2009). Reducing endoplasmic reticulum stress through a macrophage lipid chaperone alleviates atherosclerosis. Nat. Med. 15, 1383–1391.
Faloia, E., Garrapa, G.G., Martarelli, D., Camilloni, M.A., Lucarelli, G., Staffolani, R., Mantero, F., Curatola, G., and Mazzanti, L. (1999). Physicochemical and functional modifications induced by obesity on human erythrocyte membranes. Eur. J. Clin. Invest. 29, 432–437.
Ford, E.S., Williamson, D.F., and Liu, S. (1997). Weight change and diabetes incidence: findings from a national cohort of US adults. Am. J. Epidemiol. 146, 214–222.
Gregor, M.F., Yang, L., Fabbrini, E., Mohammed, B.S., Eagon, J.C., Hotamisligil, G.S., and Klein, S. (2009). Endoplasmic reticulum stress is reduced in tissues of obese subjects after weight loss. Diabetes 58, 693–700.
Handley, M.E., Rasaiyaah, J., Chain, B.M., and Katz, D.R. (2007). Mixed lineage kinases (MLKs): a role in dendritic cells, inflammation and immunity? Int. J. Exp. Pathol. 88, 111–126.
Hirasawa, A., Tsumaya, K., Awaji, T., Katsuma, S., Adachi, T., Yamada, M., Sugimoto, Y., Miyazaki, S., and Tsujimoto, G. (2005). Free fatty acids regulate gut incretin glucagon-like peptide-1 secretion through GPR120. Nat. Med. 11, 90–94.
Hirosumi, J., Tuncman, G., Chang, L., Görgün, C.Z., Uysal, K.T., Maeda, K., Karin, M., and Hotamisligil, G.S. (2002). A central role for JNK in obesity and insulin resistance. Nature 420, 333–336.
Hotamisligil, G.S. (2010). Endoplasmic reticulum stress and the inflammatory basis of metabolic disease. Cell 140, 900–917.
Jaeschke, A., Czech, M.P., and Davis, R.J. (2004). An essential role of the JIP1 scaffold protein for JNK activation in adipose tissue. Genes Dev. 18, 1976–1980.
Jaeschke, A., and Davis, R.J. (2007). Metabolic stress signaling mediated by mixed-lineage kinases. Mol. Cell 27, 498–508.
Janes, P.W., Ley, S.C., and Magee, A.I. (1999). Aggregation of lipid rafts accompanies signaling via the T cell antigen receptor. J. Cell Biol. 147, 447–461.
Jørgensen, M.E., Borch-Johnsen, K., and Bjerregaard, P. (2006). Lifestyle modifies obesity-associated risk of cardiovascular disease in a genetically homogeneous population. Am. J. Clin. Nutr. 84, 29–36.
Kagawa, Y., Nishizawa, M., Suzuki, M., Miyatake, T., Hamamoto, T., Goto, K., Motonaga, E., Izumikawa, H., Hirata, H., and Ebihara, A. (1982). Eicosapolyenoic acids of serum lipids of Japanese islanders with low incidence of cardiovascular diseases. J. Nutr. Sci. Vitaminol. (Tokyo) 28, 441–453.
Kahn, C.R., Vicent, D., and Doria, A. (1996). Genetics of non-insulin-dependent (type-II) diabetes mellitus. Annu. Rev. Med. 47, 509–531.
Karin, M., and Gallagher, E. (2005). From JNK to pay dirt: jun kinases, their biochemistry, physiology and clinical importance. IUBMB Life 57, 283–295.
Karnovsky, M.J., Kleinfeld, A.M., Hoover, R.L., and Klausner, R.D. (1982). The concept of lipid domains in membranes. J. Cell Biol. 94, 1–6.
Kharroubi, I., Ladrière, L., Cardozo, A.K., Dogusan, Z., Cnop, M., and Eizirik, D.L. (2004). Free fatty acids and cytokines induce pancreatic beta-cell apoptosis by different mechanisms: role of nuclear factor-kappaB and endoplasmic reticulum stress. Endocrinology 145, 5087–5096.
Kromann, N., and Green, A. (1980). Epidemiological studies in the Upernavik district, Greenland. Incidence of some chronic diseases 1950-1974. Acta Med. Scand. 208, 401–406.
Lingwood, D., and Simons, K. (2007). Detergent resistance as a tool in membrane research. Nat. Protoc. 2, 2159–2165.
Liu, G., Bibus, D.M., Bode, A.M., Ma, W.Y., Holman, R.T., and Dong, Z. (2001). Omega 3 but not omega 6 fatty acids inhibit AP-1 activity and cell transformation in JB6 cells. Proc. Natl. Acad. Sci. USA 98, 7510–7515.
Luo, J., Rizkalla, S.W., Boillot, J., Alamowitch, C., Chaib, H., Bruzzo, F., Desplanque, N., Dalix, A.M., Durand, G., and Slama, G. (1996). Dietary (n-3) polyunsaturated fatty acids improve adipocyte insulin action and glucose metabolism in insulin-resistant rats: relation to membrane fatty acids. J. Nutr. 126, 1951–1958.
Mukherjee, A., Arnaud, L., and Cooper, J.A. (2003). Lipid-dependent recruitment of neuronal Src to lipid rafts in the brain. J. Biol. Chem. 278, 40806–40814.
Neschen, S., Morino, K., Dong, J., Wang-Fischer, Y., Cline, G.W., Romanelli, A.J., Rossbacher, J.C., Moore, I.K., Regittnig, W., Munoz, D.S., et al. (2007). n-3 Fatty acids preserve insulin sensitivity in vivo in a peroxisome proliferator-activated receptor-alpha-dependent manner. Diabetes 56, 1034–1041.
Oh, D.Y., Talukdar, S., Bae, E.J., Imamura, T., Morinaga, H., Fan, W., Li, P., Lu, W.J., Watkins, S.M., and Olefsky, J.M. (2010). GPR120 is an omega-3 fatty acid receptor mediating potent anti-inflammatory and insulin-sensitizing effects. Cell 142, 687–698.
Okada, M., Nada, S., Yamanashi, Y., Yamamoto, T., and Nakagawa, H. (1991). CSK: a protein-tyrosine kinase involved in regulation of src family kinases. J. Biol. Chem. 266, 24249–24252.
Ostrom, R.S., and Insel, P.A. (2006). Methods for the study of signaling molecules in membrane lipid rafts and caveolae. Methods Mol. Biol. 332, 181–191.
Ozcan, U., Cao, Q., Yilmaz, E., Lee, A.H., Iwakoshi, N.N., Ozdelen, E., Tuncman, G., Görgün, C., Glimcher, L.H., and Hotamisligil, G.S. (2004). Endoplasmic reticulum stress links obesity, insulin action, and type 2 diabetes. Science 306, 457–461.
Patwardhan, P., and Resh, M.D. (2010). Myristoylation and membrane binding regulate c-Src stability and kinase activity. Mol. Cell. Biol. 30, 4094–4107.
Pike, L.J. (2004). Lipid rafts: heterogeneity on the high seas. Biochem. J. 378, 281–292.
Pike, L.J. (2009). The challenge of lipid rafts. J. Lipid Res. Suppl. 50, S323–S328.
Rajendran, L., Le Lay, S., and Illges, H. (2007). Raft association and lipid droplet targeting of flotillins are independent of caveolin. Biol. Chem. 388, 307–314.
Reaven, G.M., Hollenbeck, C., Jeng, C.Y., Wu, M.S., and Chen, Y.D. (1988). Measurement of plasma glucose, free fatty acid, lactate, and insulin for 24 h in patients with NIDDM. Diabetes 37, 1020–1024.
Rintoul, D.A., Sklar, L.A., and Simoni, R.D. (1978). Membrane lipid modification of chinese hamster ovary cells. Thermal properties of membrane phospholipids. J. Biol. Chem. 253, 7447–7452.
Robinson, L.E., Buchholz, A.C., and Mazurak, V.C. (2007). Inflammation, obesity, and fatty acid metabolism: influence of n-3 polyunsaturated fatty acids on factors contributing to metabolic syndrome. Appl. Physiol. Nutr. Metab. 32, 1008–1024.
Ryan, M., McInerney, D., Owens, D., Collins, P., Johnson, A., and Tomkin, G.H. (2000). Diabetes and the Mediterranean diet: a beneficial effect of oleic acid on insulin sensitivity, adipocyte glucose transport and endothelium-dependent vasoreactivity. QJM 93, 85–91.
Sabio, G., Cavanagh-Kyros, J., Barrett, T., Jung, D.Y., Ko, H.J., Ong, H., Morel, C., Mora, A., Reilly, J., Kim, J.K., and Davis, R.J. (2010). Role of the hypothalamic-pituitary-thyroid axis in metabolic regulation by JNK1. Genes Dev. 24, 256–264.
Sabio, G., Das, M., Mora, A., Zhang, Z., Jun, J.Y., Ko, H.J., Barrett, T., Kim, J.K., and Davis, R.J. (2008). A stress signaling pathway in adipose tissue regulates hepatic insulin resistance. Science 322, 1539–1543.
Sandilands, E., Brunton, V.G., and Frame, M.C. (2007). The membrane targeting and spatial activation of Src, Yes and Fyn is influenced by palmitoylation and distinct RhoB/RhoD endosome requirements. J. Cell Sci. 120, 2555–2564.
Sandilands, E., Cans, C., Fincham, V.J., Brunton, V.G., Mellor, H., Prendergast, G.C., Norman, J.C., Superti-Furga, G., and Frame, M.C. (2004). RhoB and actin polymerization coordinate Src activation with endosome-mediated delivery to the membrane. Dev. Cell 7, 855–869.
Seong, J., Lu, S., Ouyang, M., Huang, H., Zhang, J., Frame, M.C., and Wang, Y. (2009). Visualization of Src activity at different compartments of the plasma membrane by FRET imaging. Chem. Biol. 16, 48–57.
Shi, H., Kokoeva, M.V., Inouye, K., Tzameli, I., Yin, H., and Flier, J.S. (2006). TLR4 links innate immunity and fatty acid-induced insulin resistance. J. Clin. Invest. 116, 3015–3025.
Simopoulos, A.P. (2002). Omega-3 fatty acids in inflammation and autoimmune diseases. J. Am. Coll. Nutr. 21, 495–505.
Solinas, G., Naugler, W., Galimi, F., Lee, M.S., and Karin, M. (2006). Saturated fatty acids inhibit induction of insulin gene transcription by JNK-mediated phosphorylation of insulin-receptor substrates. Proc. Natl. Acad. Sci. USA 103, 16454–16459.
Solinas, G., Vilcu, C., Neels, J.G., Bandyopadhyay, G.K., Luo, J.L., Naugler, W., Grivennikov, S., Wynshaw-Boris, A., Scadeng, M., Olefsky, J.M., and Karin, M. (2007). JNK1 in hematopoietically derived cells contributes to diet-induced inflammation and insulin resistance without affecting obesity. Cell Metab. 6, 386–397.
Stein, P.L., Vogel, H., and Soriano, P. (1994). Combined deficiencies of Src, Fyn, and Yes tyrosine kinases in mutant mice. Genes Dev. 8, 1999–2007.
Storlien, L.H., Kraegen, E.W., Chisholm, D.J., Ford, G.L., Bruce, D.G., and Pascoe, W.S. (1987). Fish oil prevents insulin resistance induced by high-fat feeding in rats. Science 237, 885–888.
Stulnig, T.M., Berger, M., Sigmund, T., Raederstorff, D., Stockinger, H., and Waldhäusl, W. (1998). Polyunsaturated fatty acids inhibit T cell signal transduction by modification of detergent-insoluble membrane domains. J. Cell Biol. 143, 637–644.
Stulnig, T.M., Huber, J., Leitinger, N., Imre, E.M., Angelisova, P., Nowotny, P., and Waldhausl, W. (2001). Polyunsaturated eicosapentaenoic acid displaces proteins from membrane rafts by altering raft lipid composition. J. Biol. Chem. 276, 37335–37340.
Taniguchi, C.M., Aleman, J.O., Ueki, K., Luo, J., Asano, T., Kaneto, H., Stephanopoulos, G., Cantley, L.C., and Kahn, C.R. (2007). The p85alpha regulatory subunit of phosphoinositide 3-kinase potentiates c-Jun N-terminal kinase-mediated insulin resistance. Mol. Cell. Biol. 27, 2830–2840.
Thorsdottir, I., Hill, J., and Ramel, A. (2004). Omega-3 fatty acid supply from milk associates with lower type 2 diabetes in men and coronary heart disease in women. Prev. Med. 39, 630–634.
Todoric, J., Löffler, M., Huber, J., Bilban, M., Reimers, M., Kadl, A., Zeyda, M., Waldhäusl, W., and Stulnig, T.M. (2006). Adipose tissue inflammation induced by high-fat diet in obese diabetic mice is prevented by n-3 polyunsaturated fatty acids. Diabetologia 49, 2109–2119.
Tseng, P.H., Matsuzawa, A., Zhang, W., Mino, T., Vignali, D.A., and Karin, M. (2010). Different modes of ubiquitination of the adaptor TRAF3 selectively activate the expression of type I interferons and proinflammatory cytokines. Nat. Immunol. 11, 70–75.
Tsukumo, D.M., Carvalho-Filho, M.A., Carvalheira, J.B., Prada, P.O., Hirabara, S.M., Schenka, A.A., Araújo, E.P., Vassallo, J., Curi, R., Velloso, L.A., and Saad, M.J. (2007). Loss-of-function mutation in Toll-like receptor 4 prevents diet-induced obesity and insulin resistance. Diabetes 56, 1986–1998.
Uysal, K.T., Wiesbrock, S.M., Marino, M.W., and Hotamisligil, G.S. (1997). Protection from obesity-induced insulin resistance in mice lacking TNF-alpha function. Nature 389, 610–614.
Vallerie, S.N., Furuhashi, M., Fucho, R., and Hotamisligil, G.S. (2008). A predominant role for parenchymal c-Jun amino terminal kinase (JNK) in the regulation of systemic insulin sensitivity. PLoS ONE 3, e3151.
Virtue, S., and Vidal-Puig, A. (2008). It’s not how fat you are, it’s what you do with it that counts. PLoS Biol. 6, e237.
Wang, Y., Botvinick, E.L., Zhao, Y., Berns, M.W., Usami, S., Tsien, R.Y., and Chien, S. (2005). Visualizing the mechanical activation of Src. Nature 434, 1040–1045.
Watala, C., and Winocour, P.D. (1992). The relationship of chemical modification of membrane proteins and plasma lipoproteins to reduced membrane fluidity of erythrocytes from diabetic subjects. Eur. J. Clin. Chem. Clin. Biochem. 30, 513–519.
Webb, Y., Hermida-Matsumoto, L., and Resh, M.D. (2000). Inhibition of protein palmitoylation, raft localization, and T cell signaling by 2-bromopalmitate and polyunsaturated fatty acids. J. Biol. Chem. 275, 261–270.
Wu, Y.T., Zhang, S., Kim, Y.S., Tan, H.L., Whiteman, M., Ong, C.N., Liu, Z.G., Ichijo, H., and Shen, H.M. (2008). Signaling pathways from membrane lipid rafts to JNK1 activation in reactive nitrogen species-induced non-apoptotic cell death. Cell Death Differ. 15, 386–397.
Conformation-Sensing Antibodies Stabilize the Oxidized Form of PTP1B and Inhibit Its Phosphatase Activity
Aftabul Haque,1,2 Jannik N. Andersen,1,4 Annette Salmeen,3,5 David Barford,3 and Nicholas K. Tonks1,*
1Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
2Molecular and Cellular Biology Graduate Program, Stony Brook University, Stony Brook, NY 11790, USA
3Division of Structural Biology, Institute of Cancer Research, Chester Beatty Laboratories, 237 Fulham Road, London SW3 6JB, UK
4Present address: The Belfer Institute for Applied Cancer Science, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02115, USA
5Present address: DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.08.036
SUMMARY
Protein tyrosine phosphatase 1B (PTP1B) plays important roles in downregulation of insulin and leptin signaling and is an established therapeutic target for diabetes and obesity. PTP1B is regulated by reactive oxygen species (ROS) produced in response to various stimuli, including insulin. The reversibly oxidized form of the enzyme (PTP1B-OX) is inactive and undergoes profound conformational changes at the active site. We generated conformation-sensor antibodies, in the form of single-chain variable fragments (scFvs), that stabilize PTP1B-OX and thereby inhibit its phosphatase function. Expression of conformation-sensor scFvs as intracellular antibodies (intrabodies) enhanced insulin-induced tyrosyl phosphorylation of the β subunit of the insulin receptor and its substrate IRS-1 and increased insulin-induced phosphorylation of PKB/AKT. Our data suggest that stabilization of the oxidized, inactive form of PTP1B with appropriate therapeutic molecules may offer a paradigm for phosphatase drug development.
INTRODUCTION
In conjunction with the increased prevalence of obesity, diabetes has become a major cause of illness and premature death in most countries, mainly through the increased risk of cardiovascular disease. Type 2 diabetes, which is caused by insulin resistance resulting in loss of normal glucose homeostasis, accounts for >90% of all diabetes. It affects more than 220 million people worldwide, and this number is likely to double by 2030 (WHO, 2009). At this time therapeutic options for treating diabetes and obesity are inadequate, and effective approaches to counter the disease are urgently needed.
Considerable interest grew in the potential of protein tyrosine phosphatase 1B (PTP1B) as a therapeutic target for treating diabetes and obesity following the elucidation of its importance as a regulator of insulin and leptin signaling pathways. Gene-targeting studies demonstrated that PTP1B null mice are healthy, do not develop type 2 diabetes, and are resistant to obesity when fed with a high-fat diet (Elchebly et al., 1999; Klaman et al., 2000). PTP1B is also an inhibitor of leptin signaling (Cheng et al., 2002; Myers et al., 2001; Zabolotny et al., 2002). Furthermore, depletion of PTP1B expression with antisense oligonucleotides elicits antidiabetic and antiobesity effects in rodents (Rondinone et al., 2002; Zinker et al., 2002) as well as human subjects (Brandt et al., 2010). PTP1B was the first member of the protein tyrosine phosphatase (PTP) superfamily to be identified and was purified to homogeneity from human placenta as a catalytic domain of 37 kDa (Tonks et al., 1988). Later, it was characterized as an ~50 kDa protein (435 amino acids), consisting of an N-terminal catalytic domain followed by a C-terminal segment that serves a regulatory function and anchors the protein at the cytoplasmic face of the endoplasmic reticulum (ER) membrane (Tonks, 2003). It contains a signature catalytic motif, (I/V)HCXAGXXR(S/T)G, which is a highly conserved structural feature among PTPs, in which Cys215 and Arg221 are essential for catalytic activity (Tonks, 2003). PTP1B recognizes the activated insulin receptor as a substrate in vitro and in cells (Bandyopadhyay et al., 1997; Salmeen et al., 2000). Crystal structure and kinetic studies illustrate that PTP1B preferentially dephosphorylates the tandem tyrosine residues (pYpY1162/1163) in the activation loop of the β subunit of the insulin receptor (Salmeen et al., 2000). In addition, insulin receptor substrate-1 (IRS-1) is also a potential substrate of PTP1B (Goldstein et al., 2000).
Hence, PTP1B functions to downregulate insulin signaling. The activity of PTP1B is regulated at multiple levels. The architecture of the PTP-active site is such that the cysteinyl residue has a pKa of 4.5–5.0 and is predominantly in the thiolate form at neutral pH, unlike the normal pKa of cysteine, which is 8 (Lohse et al., 1997). This property makes the active-site cysteine a very good nucleophile but also renders it prone to oxidation. Cell 147, 185–198, September 30, 2011 ª2011 Elsevier Inc. 185
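The pKa argument above can be made concrete with the Henderson-Hasselbalch relation. The following is a minimal sketch, not part of the original study: the function name is hypothetical, pH 7.0 is taken as "neutral", and the pKa values are the ones quoted in the text (4.5–5.0 for the PTP active-site cysteine, 8 for a typical cysteine).

```python
def thiolate_fraction(pka: float, ph: float = 7.0) -> float:
    """Fraction of a thiol present as the deprotonated thiolate at a given pH.

    Uses the Henderson-Hasselbalch relation: [S-]/[SH] = 10**(pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# pKa values quoted in the text; pH 7.0 is an assumed "neutral" value.
for label, pka in [
    ("PTP active-site Cys, pKa 4.5", 4.5),
    ("PTP active-site Cys, pKa 5.0", 5.0),
    ("typical Cys, pKa 8", 8.0),
]:
    print(f"{label}: {thiolate_fraction(pka):.1%} thiolate")
```

Under these assumptions the active-site cysteine is about 99% thiolate at neutral pH, whereas a cysteine with a pKa of 8 is only about 9% deprotonated, which illustrates why the catalytic residue is both a strong nucleophile and unusually sensitive to oxidation.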
Several labs have demonstrated that PTPs, including PTP1B, are transiently oxidized and reversibly inactivated by H2O2, and that this is important for induction of an optimal tyrosine phosphorylation response to a variety of physiological stimuli (Lee et al., 2002; Meng et al., 2002; Savitsky and Finkel, 2002). Insulin stimulation of mammalian cells leads to enhanced production of intracellular H2O2, which causes reversible oxidization and inhibition of PTP1B activity (Mahadev et al., 2001; Meng et al., 2004). Nox4, a member of the family of nicotinamide adenine dinucleotide phosphate (NADPH) oxidases, was shown to mediate insulin-stimulated H2O2 generation and to regulate the insulin signaling cascade (Mahadev et al., 2004). Understanding the redox regulation of PTPs in a cellular context has been hampered by the absence of sensitive and robust methods for detecting the oxidized phosphatases and separating them from the background of reduced enzymes. Using antibody phage display, we have generated conformation-sensor antibodies in the form of single-chain variable fragments (scFvs) to the reversibly oxidized form of PTP1B (PTP1B-OX) and applied them to understand the redox regulation of this phosphatase. Mild oxidation of the active-site cysteine of PTP1B produces a sulfenic acid (S-OH) intermediate that undergoes a rapid condensation reaction to produce a 5-atom cyclic sulfenylamide species, in which the sulfur atom of the catalytic cysteine is covalently linked to the main-chain nitrogen of the adjacent serine residue (Salmeen et al., 2003; van Montfort et al., 2003). Formation of this sulfenyl-amide intermediate causes profound conformational changes in the active site that transiently inhibit substrate binding and catalysis. These structural changes, however, are reversible under reducing conditions. 
In order to maintain reversibility, the active-site Cys residue should be oxidized no further than sulfenic acid (S-OH), as higher oxidation to sulfinic (S-O2H) or sulfonic (S-O3H) acid is generally irreversible (Salmeen et al., 2003; van Montfort et al., 2003). The chemical change at the core of the catalytic site is accompanied by a conformational change in which the PTP loop containing the signature motif and Tyr46 of the phosphotyrosine loop, which are both normally buried in the structure, now adopt solvent-exposed positions. We hypothesized that a conformation-sensor antibody that recognizes the reversibly oxidized form of PTP1B (PTP1B-OX) may stabilize the inactive state and thereby inhibit phosphatase activity. We have exploited the fact that a mutant form of PTP1B (CASA), in which the catalytic Cys and adjacent Ser residues are mutated to Ala, adopts a stable conformation that is identical to PTP1B-OX. We have used this as an antigen, and in this report, we describe the characterization of conformation-sensor scFvs that recognize the reversibly oxidized form of PTP1B and stabilize the inactive state to inhibit its reactivation by reducing agent. Using the conformation-sensor scFvs as intracellular antibodies or “intrabodies,” we have demonstrated that the activity of PTP1B can be attenuated selectively by stabilizing its reversibly oxidized conformation in cells in response to insulin-induced production of reactive oxygen species (ROS). This results in enhanced and sustained signaling in response to insulin, suggesting that this strategy may provide an alternative approach to the design of PTP-directed inhibitors.
RESULTS
PTP1B-CASA Is Structurally Similar to PTP1B-OX
From comparative analysis of the crystal structure of PTP1B in the reduced and oxidized states (cyclic sulfenyl-amide), we hypothesized that mutation of both the catalytic Cys215 and the adjacent Ser216 to Ala would break two critical hydrogen bonds and thereby may induce a conformational change similar to the effects of oxidation. The crystal structure of the mutated PTP1B (PTP1B-CASA) illustrated that the Tyr46 of the pTyr loop became solvent exposed, and the PTP loop was also presented on the protein surface as seen in the oxidized enzyme (Figure 1 and Table S1 available online). PTP1B-CASA was, therefore, used as a stable antigen to generate conformation-specific antibodies to test their potential to specifically recognize the oxidized form of PTP1B.
Construction of Antibody Phage Display Library to Target PTP1B-OX
We have generated a phage display library displaying scFvs fused to phage surface protein pIII (Figure 2A and Figure S1), from antibody genes collected from chickens immunized with PTP1B-CASA. We generated two different scFv constructs, one with a short linker sequence (GQSSRSS) and one with a long linker (GQSSRSSSGGGSSGGGGS) and mixed them together to generate an scFv library of 2 × 10⁷ total clones. To select for scFvs specific for the PTP1B-CASA mutant, we employed a subtractive panning strategy (Figure 2A). Addition of a molar excess of reduced wild-type (WT) PTP1B to the library in solution would favor enrichment of CASA-specific scFvs over scFvs that recognize the common epitopes on both the oxidized and the reduced forms of the enzyme. Initially, we generated biotinylated CASA mutant for use in subtractive panning by an in vitro chemical modification. This biotinylation method, however, adds biotin to free amine groups at random, and we found that chemical biotinylation caused significant reduction of PTP activity (Figure 2B).
To alleviate this problem, we took the approach of site-specific homogeneous biotinylation of PTP1B in E. coli by using a biotinylation tag that we fused to the N terminus of the phosphatase (Figure 2C). Biotinylation of PTP1B by this method generated a soluble protein and caused no reduction in activity when compared to the phosphatase activity of the nonmodified enzyme (Figure 2B). This suggests that the in vivo biotinylation does not add biotin nonspecifically to important residues in or around the catalytic site of PTP1B, and N-terminally biotinylated PTP1B-CASA (NBT-PTP1B-CASA) was amenable to use for screening for conformation-sensor antibodies (Figure 2D). We performed five rounds of subtractive panning using NBT-PTP1B-CASA in the presence of a molar excess of nonmodified PTP1B-WT. Selective enrichment of phage expressing PTP1B-CASA-specific scFvs was observed in terms of increased phage output/input ratio after each round of panning. We observed 1000-fold enrichment of specific scFv-expressing phage particles after the fourth round of panning (Figure 2E). To isolate individual phage particles expressing functional scFvs, we sequenced 400 phagemid constructs from the enriched library. More than 95% of the sequences were full-length
Figure 1. The Conformation of PTP1B-CASA Resembles that of PTP1B-OX
(A) View of PTP1B-CASA superimposed onto PTP1B-OX (sulfenyl-amide species) (PDB code: 1OEM) (Salmeen et al., 2003). (B) View of PTP1B-CASA superimposed onto wild-type reduced PTP1B (PDB code: 2HNQ) (Barford et al., 1994a). In (A) and (B), the view is onto the catalytic site. (C) and (D) are the same superimpositions as (A) and (B), respectively, and show close-up views of the conformational changes at the catalytic site, indicating that the PTP loop and Tyr46 of the pTyr loop become similarly solvent-exposed in PTP1B-OX and PTP1B-CASA. In all panels, the PTP loop, pTyr loop, WPD loop, and Q loop of PTP1B-OX are shown as salmon, light green, yellow, and light blue, respectively. The equivalent loops for PTP1B-CASA (A and C) and reduced PTP1B (B and D) are red, dark green, orange, and blue. The Cα-Cβ bond of Tyr46 of the reduced pTyr loop is obscured by the superimposition of the oxidized pTyr loop in (B) and (D). Figures are produced using PyMOL. See also Table S1.
functional scFvs, indicating that the cloning strategy and library display worked. Interestingly, diversity among the functional scFv sequences was found to be 70% after the second round of panning, whereas after the fourth round of panning, it was 20%, suggesting that the library was enriched with specific scFvs after the panning steps. Sequences were sorted into groups on the basis of differences in their hypervariable regions. The selected scFv sequences (Figure S1B) also contain the 6-His and HA tags at the C termini.
Isolation of Candidate scFvs Specific for PTP1B-OX
We systematically analyzed individual bacterially purified scFvs from pools of phage enriched after the subtractive panning, by conducting a screen in solution, in order to preserve the conformational integrity of PTP1B-OX. In this screen we employed a phosphatase assay to assess the effect of scFv binding on the activity of PTP1B. At first, we tested whether any of the scFvs used in this screen had a direct inhibitory effect on phosphatase activity of PTP1B under reducing conditions. We observed, as expected, that none of the scFvs, from a randomly selected batch, had any such direct effect on activity (Figure 3A). To screen for PTP1B-OX-specific scFvs, we established conditions under which PTP1B was reversibly oxidized by H2O2 (Figure 3B). The ability of individual scFvs to stabilize the reversibly oxidized, inactive conformation of PTP1B was assessed by the ability of the scFv to inhibit the reactivation of the enzyme by reducing agent. Some of the scFvs demonstrated substantial inhibition of the restoration of PTP1B activity following addition of reducing agent (Figure 3C). In particular, scFvs 45 and 57 inhibited the reactivation by 70%. To generate the scFvs, we utilized a truncated (1–321) form of PTP1B, which comprised the catalytic domain. To investigate further two of the candidate conformation-sensor antibodies (scFvs 45 and 57), we tested their potential to inhibit reactivation of a longer form of recombinant PTP1B (1–394) that contained most of the noncatalytic C-terminal segment that is found in the protein in vivo. Neither scFv45 nor scFv57 demonstrated a significant direct inhibition of phosphatase activity of the reduced, active PTP1B (1–394), whereas, when incubated with the reversibly oxidized form, both of these scFvs caused almost complete inhibition of reactivation (Figures 3D and 3E; Figure S2). These results suggest that these scFvs can sequester oxidized PTP1B in its inactive conformation, thereby preventing reactivation by reducing agents with the effect of inhibiting phosphatase activity.
Candidate scFvs Bound to PTP1B-OX In Vitro with High Affinity and Specificity
We tested whether the candidate scFvs from the in vitro screen bound to PTP1B-OX. In a Ni-NTA precipitation experiment
Figure 2. Subtractive Panning to Enrich PTP1B-OX-Specific scFvs
(A) Single-chain variable fragments (scFvs) were cloned into the phagemid and transformed to E. coli. Infection with a helper phage (VCSM13) enabled bacteria to make phage particles expressing the scFvs fused to surface protein pIII. Molar excess (up to 50×) of wild-type PTP1B was added under reducing condition, followed by biotinylated PTP1B-CASA. Phage-scFvs were then isolated by magnetic beads. The phage-scFvs were eluted, amplified, and used for subsequent rounds of panning. (B) Phosphatase activity of PTP1B (1–321), N-terminally biotinylated (NBT) in vivo or chemically biotinylated in vitro (WT+B), was measured using 32P-RCML as the substrate. Chemical biotinylation was performed at three protein:biotin ratios ([1] = 1:10, [2] = 1:20, and [3] = 1:30). The activity of biotinylated PTP1B was compared to that of the untagged wild-type enzyme (1–321), which was set as 100%. The error bars represent standard deviation from four phosphatase assays. (C) PTP1B (1–321) (WT and CASA), N-terminally biotinylated in vivo in E. coli, was purified in a two-step purification scheme. (D) N-terminally biotinylated PTP1B-CASA mutant (NBT-CASA) or untagged wild-type PTP1B (1B-WT) was incubated with streptavidin-coated magnetic beads, and biotinylated proteins (and PTP1B) were detected by immunoblot in input (Inp), supernatant (Sup), and precipitated (Ppt) samples. (E) Enrichment of phage-expressing PTP1B-CASA-specific scFvs was estimated in terms of phage output/input ratio after each round of panning (y axis: fold change in output/input ratio, log scale; x axis: panning steps 1–5). See also Figure S1.
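The enrichment metric in Figure 2E (fold change in phage output/input ratio across panning rounds) can be sketched as follows. This is an illustrative calculation only: the function names are hypothetical, the normalization to the first round is an assumption about how the fold change was expressed, and the titers are invented placeholders rather than data from the study.

```python
from typing import List, Tuple


def output_input_ratio(input_titer: float, output_titer: float) -> float:
    """Fraction of the phage applied in a round that is recovered after elution."""
    if input_titer <= 0:
        raise ValueError("input titer must be positive")
    return output_titer / input_titer


def fold_enrichment(rounds: List[Tuple[float, float]]) -> List[float]:
    """Output/input ratio of each round relative to the first round.

    `rounds` holds (input_titer, output_titer) pairs in panning order.
    Normalizing to round 1 is an assumption made for this sketch.
    """
    ratios = [output_input_ratio(i, o) for i, o in rounds]
    return [r / ratios[0] for r in ratios]


# Placeholder titers (colony-forming units), NOT values from the paper.
example_rounds = [(1e12, 1e5), (1e12, 1e6), (1e12, 1e7), (1e12, 1e8)]
for n, fold in enumerate(fold_enrichment(example_rounds), start=1):
    print(f"round {n}: {fold:.0f}-fold")
```

With these placeholder numbers, a recovery that rises tenfold per round gives a roughly 1000-fold change by the fourth round, the order of magnitude reported in the text.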
Figure 3. Screening for PTP1B-OX-Specific scFvs
(A) Purified scFvs (750 nM) were incubated with PTP1B (1–321) (7.5 nM) with or without reducing agent TCEP (5 mM), and phosphatase activity was measured using 100 nM RCML, in which tyrosine was phosphorylated, as the substrate. (B) PTP1B (1–321) (15 nM) was incubated with increasing concentration of H2O2. Aliquots of H2O2-treated PTP1B (5 nM final) were used in the phosphatase assay following buffer exchange, without or with TCEP, to observe the inactivation and reactivation of the enzyme, respectively. (C) PTP1B (1–321) (15 nM) was reversibly oxidized by H2O2 (75 mM) and aliquots (7.5 nM) of this reversibly oxidized PTP1B were incubated with purified scFvs (750 nM). Reactivation with or without the presence of scFvs was measured by phosphatase assay in the presence of TCEP. (D) Direct inhibitory effect of scFvs 45 or 57 (750 nM each) was assessed using PTP1B (1–394) (7.5 nM) in a phosphatase assay similar to that in (C). (E) PTP1B (1–394) (15 nM) was reversibly oxidized with H2O2 (75 mM), and reactivation by TCEP with or without scFvs 45 and 57 was observed by phosphatase assay. Error bars show standard deviation from three phosphatase assays. In all panels the y axis is % phosphatase activity. See also Figure S2.
in vitro, scFv45 and scFv57 interacted with PTP1B-OX or PTP1B-CASA but not with reduced PTP1B (Figure 4A). A dose-response analysis indicated that scFvs 45 and 57 displayed IC50s of 19 nM and 10 nM, respectively, for suppressing the reactivation of PTP1B-OX (Figure 4B). This result is consistent with the effects of scFvs 45 and 57 on the activity of PTP1B (Figure 3). We used surface plasmon resonance (SPR) to measure the binding constants for the interaction between scFv45 and PTP1B-OX. Under reducing conditions, PTP1B did not show any significant binding to scFv45, further confirming
Figure 4. scFv45 Binds Specifically to PTP1B-OX In Vitro
(A) PTP1B (1–321), reduced (1B-R), reversibly oxidized with H2O2 (1B-OX), or the PTP1B-CASA mutant (1B-CASA; C215A/S216A), was incubated with purified scFvs 45 and 57, and the protein complexes were precipitated with Ni-NTA agarose beads. Equivalent amounts (2.5 ng of PTP1B) of input, supernatant, and eluate were subjected to immunoblot analysis. (B) Increasing concentrations of scFvs 45, 57, 34, and 20 (2.5 nM to 1 mM) were incubated with PTP1B-OX, and phosphatase activity was measured after adding TCEP (5 mM). IC50 values for the inhibition of PTP1B reduction and reactivation were determined using the Grafit software. The error bars represent standard deviation from three phosphatase assays. Inset, IC50 (nM): scFv45, 19.0 ± 1.9; scFv57, 10.2 ± 1.7; scFv34, 1217.7 ± 1737.1; scFv20, 1203.9 ± 757.4. (C) Comparative SPR sensorgram shows the interaction between scFv45 (500 nM) and either PTP1B-OX (1 mM) or reduced PTP1B (1 mM with 2 mM TCEP) (PTP1B-R). (D) Different concentrations of PTP1B-OX (0.05, 0.10, 0.25, 0.50, 0.75, 1.00, 5.00, and 10.0 μM) were injected on immobilized scFv45 (500 nM) on Ni-NTA sensor chip using BIACORE 2000. The kinetic constants for binding were calculated with the BIAevaluation 3.1 software. In (C) and (D), response units (RU) are plotted against time (s). See also Figure S3.
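The binding quantities associated with Figure 4 can be related to one another with two textbook formulas. The sketch below is illustrative and is not the fitting procedure used by Grafit or BIAevaluation: it assumes a simple one-site dose-response curve with a Hill slope of 1 and a 1:1 binding model in which KD = koff / kon. The function names are hypothetical; the numerical inputs are the values reported in the Results for scFv45 (IC50 of 19 nM, KD of 46 nM, and an off-rate read here as 2.3 × 10⁻³ s⁻¹).

```python
import math


def percent_reactivation(scfv_nM: float, ic50_nM: float, hill: float = 1.0) -> float:
    """Percent of PTP1B activity recovered at a given scFv concentration.

    One-site inhibition curve; a Hill slope of 1 is an assumption of this sketch.
    """
    return 100.0 / (1.0 + (scfv_nM / ic50_nM) ** hill)


def association_rate(kd_molar: float, koff_per_s: float) -> float:
    """kon (M^-1 s^-1) implied by KD = koff / kon for 1:1 binding."""
    return koff_per_s / kd_molar


def complex_half_life(koff_per_s: float) -> float:
    """Half-life (s) of the complex for first-order dissociation."""
    return math.log(2) / koff_per_s


kon = association_rate(kd_molar=46e-9, koff_per_s=2.3e-3)
print(f"implied kon: {kon:.2e} M^-1 s^-1")
print(f"complex half-life: {complex_half_life(2.3e-3):.0f} s")
print(f"reactivation at IC50: {percent_reactivation(19.0, 19.0):.0f}%")
```

Under the 1:1 assumption, the reported constants imply an association rate of about 5 × 10⁴ M⁻¹ s⁻¹ and a complex half-life of roughly five minutes, which is one way to read the "slow off-rate" described in the text.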
Figure 5. scFv45 Displays Specificity for PTP1B-OX over TCPTP-OX
(A) Recombinant TCPTP (1–317) was reversibly oxidized by H2O2 and reactivated by TCEP (5 mM). Phosphatase activity was determined using 100 nM 32P-RCML and 5 nM of enzyme from each sample. (B) Purified scFvs (750 nM) were incubated with oxidized TCPTP (7.5 nM), and phosphatase activity was determined in the presence of TCEP (5 mM). Error bars in (A) and (B) show standard deviation from six (A) and three (B) phosphatase assays. (C) TCPTP was reversibly oxidized with H2O2 and binding with scFv45 was assessed by anti-HA-Agarose pull-down. Equivalent amounts (4 ng of TCPTP) of input, supernatant, and precipitate were analyzed by immunoblot. As a control, a parallel pull-down was performed with reversibly oxidized PTP1B and scFv45. See also Figure S4.
that scFv45 binds specifically to PTP1B-OX but not to PTP1B in its reduced, active conformation (Figure 4C). A dose response between increasing concentrations of PTP1B-OX (50 nM to 10 μM) and a fixed amount of scFv45 (500 nM) indicated that scFv45 bound to PTP1B-OX with high affinity (dissociation constant [KD] = 46 nM), and the interaction had a slow off-rate (Koff = 2.3 × 10⁻³ s⁻¹) (Figure 4D). The interaction between the PTP1B-CASA mutant and scFv45 was comparable (KD = 52 nM, Koff = 4.6 × 10⁻³ s⁻¹) (Figure S3A). The interaction between PTP1B-CASA and scFv45 was almost identical with or without the reducing agent TCEP (Figure S3B), showing that the presence of the reducing agent does not affect PTP1B-scFv45 binding per se; rather it is the redox-dependent changes in conformation of the enzyme that are responsible for the specific interaction. TCPTP is the closest relative of PTP1B among the classical PTP family of enzymes and displays 75% sequence identity with the catalytic domain of PTP1B (Iversen et al., 2002). Because of their close structural similarity (Figure S4), the potential for agents that target PTP1B to display overlapping specificity toward TCPTP has been a concern. It was shown previously that TCPTP is reversibly oxidized in mammalian cells, together with PTP1B, following insulin stimulation (Meng et al., 2004). Therefore, we tested the specificity of scFv45 for PTP1B-OX over TCPTP. Bacterially purified recombinant TCPTP was incubated with increasing concentrations of H2O2, and reactivation by reducing agent was determined by measuring phosphatase activity (Figure 5A). Incubation with scFv45, or scFv57, exerted no obvious effect on the reduction and reactivation of oxidized TCPTP (Figure 5B). In a pull-down experiment with anti-HA antibody conjugated to agarose beads, we demonstrated that in vitro-oxidized TCPTP also did not bind to scFv45 (Figure 5C). These results demonstrate that conformation-specific scFvs to PTP1B-OX did not bind the oxidized, inactive form of TCPTP, highlighting their selectivity in recognition of specific epitope(s) on PTP1B-OX.
scFv45 Bound to PTP1B-OX in Mammalian Cells in Response to Insulin and H2O2
A major advantage of using phage display to generate PTP1B-OX-specific antibodies is that the selected scFvs can be expressed inside mammalian cells as functional intracellular antibodies or “intrabodies.” We observed robust and stable expression of the PTP1B-OX conformation-sensor scFv45 in 293T cells (Figure 6). We generated a “mini” mammalian expression library by cloning additional individual scFvs into the
Figure 6. scFv45 Bound to PTP1B-OX in H2O2- and Insulin-Treated 293T Cells
(A) scFv45-expressing 293T cells were treated with H2O2 (1 mM) or NAC (20 mM), and cell lysates were prepared with or without TCEP (2 mM). From 1 mg of cell lysate, scFv45 was immunoprecipitated and binding to PTP1B, under oxidized or reduced conditions, was detected by immunoblot. (B) In a similar experiment, 293T cells expressing scFv45 were treated with insulin (25 nM), H2O2 (1 mM), or NAC (20 mM), and intrabody was immunoprecipitated from 1 mg of lysate. The membrane was stripped and reprobed with anti-TCPTP mouse monoclonal antibody (CF4). (C) Cos1 cells transfected with scFv45 or scFv20 were treated with insulin (25 nM), or H2O2 (1 mM) or left untreated. Fixed cells were processed for immunofluorescence and visualized by confocal microscopy with oil immersion (63×). (D) Colocalization of PTP1B and scFv45 was analyzed by Zeiss (LSM 710) Colocalization Viewer Software. The numeric range for colocalization is set as 0–1, where “0” indicates no colocalization and “1” indicates colocalization of all pixels. Error bars indicate standard deviation from colocalization analysis of 25 individual cells. See also Figure S5.
pCDNA3.2/V5-GWD-TOPO vector from their corresponding phagemid constructs. Using a Ni-NTA pull-down assay with equivalent amounts of scFv-transfected, H2O2-treated lysates, we identified more candidate scFvs that bound to endogenous PTP1B-OX and did not display any binding to PTP1B under reducing conditions (Figure S5). Interestingly, consistent with selective binding to PTP1B-OX in vitro, scFv45 expressed as an intrabody also bound to PTP1B-OX. The negative control scFv20, which did not stabilize and inhibit reactivation of PTP1B-OX in vitro (Figure 3), showed no binding to PTP1B-OX in cells. This indicates that PTP1B undergoes similar oxidation-induced conformational change in mammalian cells when treated with H2O2, and scFvs that bind PTP1B-OX in vitro were also functional when expressed in 293T cells. Furthermore, we demonstrated that scFv45 efficiently immunoprecipitated PTP1B-OX from cells treated with H2O2 but showed little or no interaction with PTP1B when cells were treated under reducing conditions (Figure 6A). These results are consistent with the earlier observation in vitro that scFv45 recognized distinct conformational epitopes formed by oxidation of PTP1B by H2O2. Stimulation of cells with insulin causes rapid and transient oxidation and inhibition of PTP1B, which facilitates increased phosphorylation of the insulin receptor β subunit (Mahadev et al., 2001; Meng et al., 2004). We observed that scFv45 immunoprecipitated PTP1B from lysates of transfected 293T cells that were treated with insulin (Figure 6B). In cells that were untreated or were processed under reducing conditions, such interaction was absent. We demonstrated that scFv45 bound specifically to PTP1B-OX but not to TCPTP-OX in vitro (Figure 5). In mammalian cells, scFv45 displayed similar specificity toward endogenous PTP1B-OX and did not interact with endogenous TCPTP following either insulin stimulation or H2O2 treatment (Figure 6B). In order to visualize the interaction between PTP1B-OX and scFv45 in mammalian cells, we used immunofluorescence to examine whether there was colocalization of PTP1B and scFv45 in Cos1 cells following insulin stimulation or H2O2 treatment. Significant colocalization between PTP1B and scFv45 was observed, whereas such colocalization was absent between the negative control scFv20 and PTP1B (Figures 6C and 6D).
scFv45 Enhanced and Prolonged Tyrosine Phosphorylation in 293T Cells in Response to Insulin in an ROS-Dependent Manner
We demonstrated that the conformation-sensor antibody scFv45 sequestered reversibly oxidized PTP1B and prevented reactivation of the enzyme. Therefore, we tested whether such sequestration of a negative regulator of insulin receptor kinase had an impact on downstream signaling in mammalian cells. When 293T cells in which scFv45 was overexpressed were stimulated with insulin, the insulin receptor β subunit (IRβ) and IRS-1 displayed enhanced and sustained tyrosine phosphorylation in comparison to cells without the ectopically expressed intrabody or with the negative control intrabody scFv20 (Figure 7A). Similar stimulatory effects of scFv45 were observed on insulin- and EGF-induced signaling in HeLa cells (Figure S7).
Interestingly, when catalase, the enzyme that catalyzes the decomposition of H2O2 to water and oxygen, was co-overexpressed in the cytosol together with scFv45 or the negative control scFv20, the enhanced tyrosine phosphorylation of both IRβ and IRS-1 was diminished (Figure 7A and Figure S6B). Furthermore, we used a phospho-site-specific antibody to focus on the tandem residues (Y1162/Y1163) of the IRβ activation loop, which have been identified as substrates of PTP1B (Salmeen et al., 2000). Again, expression of scFv45 led to enhanced and sustained insulin-induced phosphorylation of these residues, and this effect was attenuated by coexpression with catalase (Figure 7B). In contrast, any effects of scFv45 on the phosphorylation of Y1328, from the C terminus of IRβ, were much less pronounced, suggesting preferential recognition of individual phosphorylation sites in IRβ by PTP1B (Figure S6C).
scFv45 Caused Increased AKT Phosphorylation in Response to Insulin
In order to test for effects of the intrabody on downstream signaling, we used a phospho-specific antibody that recognizes phosphorylation of T308 in PKB/AKT as a read-out. Cells overexpressing scFv45 displayed enhanced and sustained PKB/AKT phosphorylation when treated with insulin (Figure 7A). This effect was maintained over a time course of 60 min (Figure 7C). When catalase was overexpressed together with scFv45, the stimulatory effects of the intrabody on PKB/AKT activation were ablated (Figure 7C). Furthermore, expression of the negative control scFv20 had no effect on PKB/AKT activation (Figure S6A). We found no change in PTP1B expression in any of these conditions, indicating that changes in the levels of PTP1B do not underlie the enhancement in signaling. Overall, our data are consistent with a model in which scFv45 binds and stabilizes endogenous PTP1B-OX, thereby effectively attenuating PTP1B activity and removing its inhibitory effect, with resulting enhancement of insulin signaling.
DISCUSSION
Dysfunctional insulin signaling results in insulin resistance, which is ultimately associated with metabolic syndromes, including type 2 diabetes and obesity (Saltiel and Kahn, 2001). Insulin induces activation of the insulin receptor kinase through autophosphorylation (Saltiel and Pessin, 2002). Recruitment of IRS-1 to the receptor triggers activation of phosphatidylinositol 3-kinases (PI3K) and the stimulation of downstream effectors, such as phosphatidylinositol-dependent kinase 1 (PDK1) and PKB/AKT, leading to translocation of glucose transporter 4 (GLUT4) and glucose uptake, and inactivation of glycogen-synthase kinase 3 (GSK3) (Bryant et al., 2002). PTP1B is an important negative regulator of signaling; it dephosphorylates the tandem tyrosine residues (pY1162/pY1163) of activated IRβ (Salmeen et al., 2000) and IRS-1 (Goldstein et al., 2000), exerting a major influence on the duration and amplitude of the cellular response to insulin. Consequently, regulation of PTP1B activity would be expected to facilitate fine-tuning of insulin-induced signaling. Our data highlight the importance of reversible oxidation of PTP1B as one such regulatory mechanism and validate stabilization of the inactive phosphatase conformation as a potential therapeutic strategy.
A feature of the microbicidal function of phagocytes is the reduction of molecular oxygen by a NADPH oxidase (NOX) enzyme system, which generates superoxide (O2⁻) that is converted to H2O2 (Bokoch and Zhao, 2006). The NOX enzymes are multiprotein complexes, the activity of which is tightly controlled. Nonphagocytic cells are now also known to contain similar NOX enzymes, which are capable of generating ROS but at lower levels than those associated with the killing of invading micro-organisms (Lambeth, 2004). Controlled production of ROS, such as H2O2, has been observed in nonphagocytic cells in response to a number of ligands that act through receptor tyrosine kinases (Bae et al., 1997; Mahadev et al., 2001; Meng et al., 2004; Sundaresan et al., 1995). PTPs have been identified as direct targets of ROS, the presence of an essential low pKa catalytic cysteine residue rendering them exquisitely sensitive to oxidation and inactivation (Rhee, 2006). In particular, PTP1B is transiently inactivated by reversible oxidation of the active-site cysteine following insulin-induced generation of H2O2 in mammalian cells (Mahadev et al., 2001; Meng et al., 2004). PTP1B is targeted to the cytoplasmic face of membranes of the ER where it functions in a “dephosphorylation compartment” in which it acts to terminate signaling from receptor PTKs that have undergone endocytosis following ligand stimulation (Eden et al., 2010; Haj et al., 2002). This ER localization exposes PTP1B essentially to the entire cytoplasm, and there have
[Figure 7 panels. (A) Insulin time course (0, 10, 20, 30 min) in cells with no intrabody, scFv20, scFv45, or scFv45 + catalase; immunoblots (IB): pTyr (4G10) pIRS-1, pTyr (4G10) pIRβ, IRβ, pAKT (Thr 308), AKT, α-Tubulin, HA scFv, Catalase. (B) Same conditions and time course; IB: pYpY[1162/1163] IR-β, IR-β, pAKT (Thr 308), AKT, HA scFv, Catalase. (C) Insulin time course (0, 5, 10, 20, 30, 60 min) in cells with no intrabody, scFv45, or scFv45 + catalase; IB: pAKT (Thr 308), AKT, α-Tubulin, PTP1B, HA scFv45, Catalase.]
Figure 7. scFv45 Enhanced Insulin Signaling in 293T Cells in a Redox-Dependent Manner
In (A)–(C), cells transiently expressing scFv45 or scFv20 or cotransfected with scFv45 and catalase were treated with insulin (25 nM) for the indicated times, and total cell lysate (60 µg in A and C and 80 µg in B) was subjected to immunoblot analysis of the indicated proteins. See also Figure S6 and Figure S7.
been reports of it exerting its effects from the perinuclear compartment (Romsicki et al., 2004) all the way to interactions with substrate at the plasma membrane (Nievergall et al., 2010). It has been suggested that PTP1B exists in spatially distinct subpopulations on the surface of the ER, some of which are associated with a reversible low-activity state that may define different signaling functions (Yudushkin et al., 2007). Reversible oxidation would be one mechanism to achieve this
compartmentalization. Interestingly, NOX4, which is responsible for oxidation of PTP1B in response to insulin (Mahadev et al., 2004), was reported to colocalize with PTP1B on ER membranes (Martyn et al., 2006), and this colocalization was essential for signal-induced oxidation and inactivation of PTP1B (Chen et al., 2008). The reversibility of oxidation and inactivation is essential for this covalent modification to represent a means for regulation of tyrosine phosphorylation-dependent signaling. Our original structure of oxidized PTP1B suggests a molecular mechanism by which such reversibility may be achieved. We observed that oxidation induced profound conformational changes in PTP1B, in which the active-site cleft opens up both to expose critical catalytic residues and to present new binding surfaces on the protein (Salmeen et al., 2003). Upon soaking PTP1B crystals with stoichiometric quantities of H2O2, an unstable sulfenic acid (Cys–S-OH) is generated initially, which then undergoes a rapid condensation reaction to produce a cyclic sulfenyl-amide species (Salmeen et al., 2003). This modification, previously undetected in proteins, was generated when a covalent bond formed between the side-chain S atom of Cys215 and the main-chain N atom of Ser216, and resulted in disruption of critical H bonds that maintain the stability of the active-site cleft. Independently, Van Montfort et al. reported the formation of the same structure when PTP1B crystals were soaked with 2-phenyl-isoxazolidine-3,5-dione, which is a peroxide generator and, thereby, an inhibitor of PTP1B (Tjernberg et al., 2004; van Montfort et al., 2003). Incubation of crystals of the sulfenyl-amide form of PTP1B with reducing agent led to a quantitative reduction of the enzyme and a return to the catalytically active conformation. 
Thus, formation of this cyclic sulfenyl-amide intermediate may protect the enzyme from irreversible inactivation by higher-order oxidation and, by presenting the signature motif that contains the active-site Cys residue on the surface of the protein, may facilitate reactivation by reducing agents in the cell (Tonks, 2005). We exploited the new binding surfaces that are unique to the oxidized form of the enzyme to generate scFv antibodies that were selective for the oxidized conformation of PTP1B in vitro. Our data illustrate that these scFvs also bound to PTP1B in mammalian cells in response to insulin-mediated ROS production or treatment with exogenous H2O2. These results indicate that the reversible redox modification involving formation of a cyclic sulfenyl-amide also occurs in cells, thereby removing concerns that the structure was an artifact of crystallization and focusing attention on its potential physiological importance. Our study illustrates the importance of endogenously produced H2O2 and concomitant reversible inhibition of PTP1B activity in augmenting insulin signaling. We demonstrated that the conformation-sensor antibody scFv45 bound to the oxidized conformation of PTP1B that is produced following treatment with either insulin or H2O2, but no interaction was observed in untreated cells. Furthermore, scFv45 enhanced and prolonged tyrosine phosphorylation of both IRβ and IRS-1, and this effect was transduced downstream to yield a sustained increase in PKB/AKT phosphorylation and activation. Interestingly, scFv45 did not cause changes in basal tyrosine phosphorylation or downstream signaling, indicating that it does not function as an insulin mimetic to promote basal signaling in the absence of
hormone, rather it functions as an insulin sensitizer. This enhancement of insulin-induced signaling was attenuated when scFv45 was coexpressed with catalase, which has been shown previously to degrade H2O2 in cells (Irani et al., 1997), indicating that H2O2 production is required for the antibodies to exert their stimulatory effects on signaling. Therefore, our data suggest that insulin stimulates a rise in intracellular H2O2 that transiently oxidizes and inactivates PTP1B, tipping the delicate balance between PTP and PTK activity in favor of the kinase, to promote signaling. Expression of scFv45 stabilizes the oxidized conformation of PTP1B, delaying its reduction and reactivation and further shifting the balance toward tyrosine phosphorylation, thereby promoting enhanced and sustained insulin signaling. Antibody phage display is a powerful technique that allows target antigens to be preserved in their native state and, therefore, has the potential to yield antibodies that can recognize specific conformations of a target protein (Marasco, 1997). Furthermore, it permits the resulting antibodies to be expressed in cells as intrabodies, such as the scFvs used in this study, in which the variable heavy and light chains are linked directly, without the need for disulphide bonds. Therefore, not only are these intrabodies stable and functional in the overall reducing environment of the cell, but also the ability to express them overcomes the problems associated with delivering whole antibodies across the plasma membrane (Cardinale and Biocca, 2008). One example is the generation of intrabodies that detect specifically GTP-bound Rab6 in the Golgi and with which the dynamics of Rab6-GTP-positive transport intermediates could be followed by immunofluorescence (Nizak et al., 2003). The screening procedures in phage display permit great flexibility in selection conditions. 
In our study, we generated a library from a donor immunized with the mutant form of PTP1B that is locked in the oxidized conformation. We were able to enrich for conformation-sensor antibodies that recognized this structure specifically by screening with oxidized PTP1B in the presence of a molar excess of the reduced enzyme. Not only did the intrabodies recognize and stabilize the inactive conformation induced by a unique posttranslational inhibitory modification on PTP1B, thereby potentiating insulin signaling, but also they did so with remarkable specificity. There are a number of gene duplications that occurred in the evolution of the PTP family. One such event generated two closely related phosphatases, PTP1B and TCPTP, and one spliced isoform of TCPTP is localized to the ER in a similar manner to PTP1B (Iversen et al., 2002). In fact, insulin induces the oxidation of TCPTP (Meng et al., 2004), and there are data suggesting distinct, yet complementary roles for PTP1B and TCPTP in regulation of insulin signaling (Galic et al., 2005). Nevertheless, the scFvs generated in this study did not recognize TCPTP, highlighting their potential as specific probes of PTP1B function. Although TCPTP displays 75% sequence identity to PTP1B in its core catalytic domain, rising to 85% when conservative substitutions are considered, there are differences in surface residues between the two PTPs (Figure S4) that may contribute to this specificity. Quite apart from their significance as reagents, these antibodies may influence strategies for development of PTP1B-directed therapeutics. The phenotype of the knockout mouse, together with structural and biochemical data from various
groups, has established PTP1B as a key regulator of insulin and leptin signaling (Tonks, 2003). Consequently, it became a highly prized target in the pharmaceutical industry for therapeutic intervention in diabetes and obesity. Although there have been major programs in industry focused on developing small-molecule inhibitors of PTP1B, these efforts have been frustrated by technical challenges arising from the chemical properties of the PTP active site (Zhang and Zhang, 2007). The susceptibility of PTPs to oxidation causes problems in high-throughput screens. In addition, the tendency of potent active-site-directed inhibitors to be highly charged, such as nonhydrolyzable pTyr mimetics, presents problems with respect to oral bioavailability that limit drug development potential. Nevertheless, considering the importance of PTP1B as a therapeutic target, there is a compelling need to explore innovative ideas and schemes to target the enzyme for generating novel, potent, and selective inhibitors. The profound conformational change at the active site induced by oxidation provides new sites for developing chemical scaffolds or small-molecule inhibitors that could recognize the reversibly oxidized conformation of PTP1B and lock it in the inactive state. Stabilization of the inactive PTP1B-OX conformation may potentiate insulin signaling in a manner similar to inhibition of the catalytically active form of the enzyme, analogous to stabilization of the inactive form of p210 BCR-ABL by Gleevec/Imatinib (Schindler et al., 2000). Also, if one assumes that in responding to insulin the cell targets for oxidation the pool of PTP1B that is important for regulation of the signaling response, then this strategy will also target that pool specifically, possibly also reducing complications of side effects that may accompany inhibition of the native enzyme as a whole. 
Furthermore, most potent active-site-directed inhibitors of PTP1B show some degree of inhibition of TCPTP as well (Johnson et al., 2002). TCPTP is essential for normal hematopoiesis, and TCPTP null mice die as a result of hematopoietic defects within weeks of birth (You-Ten et al., 1997). Consequently, there was concern in drug discovery efforts to minimize the effects on TCPTP of any PTP1B inhibitors. The properties of our scFv intrabodies alleviate this concern. Problems of delivery to the appropriate tissues represent serious hurdles to the use of such intrabodies as therapeutic agents. Nevertheless, our data indicate that if it is possible to stabilize the oxidized, inactive form of PTP1B with an appropriate small molecule that mimics the effects of these antibodies, then this could provide an alternative strategy for PTP-directed drug development that would circumvent the difficulties that are faced when targeting the PTP-active site with highly charged inhibitors.
EXPERIMENTAL PROCEDURES
Construction of scFv Phage Display Library
Chickens were immunized with purified PTP1B-CASA. Total RNA was isolated from the spleen and bone marrow of the immunized animals. First-strand cDNA was synthesized from the total RNA and used for PCR amplification of the VL and VH genes, which were combined to form a full-length scFv construct.
Screening Individual scFvs as Conformation-Sensor Antibodies to PTP1B-OX
The scFv library was mixed with 10- to 50-fold molar excess of wild-type PTP1B over the biotinylated PTP1B-CASA under reducing conditions. Phage-scFvs bound to biotinylated PTP1B-CASA were captured by streptavidin-coated magnetic beads, eluted under acidic conditions, neutralized, and amplified. A total of five rounds of panning were performed accordingly. Individual scFvs were expressed without the pIII fusion component in a nonsuppressor E. coli (TOP10F′) and purified with Ni-NTA. Recombinant PTP1B was reversibly oxidized and inactivated with H2O2, then incubated with purified scFvs. The effect of individual scFvs on stabilizing the reversibly oxidized conformation was assessed by phosphatase assay under reducing conditions.
Effect of PTP1B-OX-Specific Intrabodies on Insulin Signaling
Intrabody constructs for scFv45 or scFv20 were transfected in 293T cells, which were stimulated with insulin for various times. Total proteins in the cell lysates were separated by SDS-PAGE, and global tyrosyl phosphorylation was detected by anti-phosphotyrosine antibody 4G10. To detect specific tyrosyl phosphorylation of the IRβ subunit, we used rabbit polyclonal anti-insulin receptor [pYpY1162/1163] phospho-specific antibody. To test the effect of suppressing H2O2 levels on intrabody function, catalase was ectopically coexpressed with scFv45. Catalase expression was detected in the cell samples with anti-catalase rabbit polyclonal antibody. Phosphorylation of the AKT activation loop at residue threonine 308 (T308) was observed with phospho-specific antibody. Details of methods and materials are described in the Extended Experimental Procedures in the Supplemental Information available online.
ACCESSION NUMBERS
Crystallographic coordinates and structure factors of the PTP1B-CASA mutant have been deposited with the PDB with accession ID codes 3zv2 and r3zv2sf, respectively.
SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures, seven figures, one table, and Supplemental References and can be found with this article online at doi:10.1016/j.cell.2011.08.036.
ACKNOWLEDGMENTS This work was supported by NIH grants CA53840 and GM55989 to N.K.T. The work in D.B.’s lab was supported by a grant from Cancer Research UK. Received: March 10, 2011 Revised: June 30, 2011 Accepted: August 15, 2011 Published: September 29, 2011 REFERENCES Bae, Y.S., Kang, S.W., Seo, M.S., Baines, I.C., Tekle, E., Chock, P.B., and Rhee, S.G. (1997). Epidermal growth factor (EGF)-induced generation of hydrogen peroxide. Role in EGF receptor-mediated tyrosine phosphorylation. J. Biol. Chem. 272, 217–221. Bandyopadhyay, D., Kusari, A., Kenner, K.A., Liu, F., Chernoff, J., Gustafson, T.A., and Kusari, J. (1997). Protein-tyrosine phosphatase 1B complexes with the insulin receptor in vivo and is tyrosine-phosphorylated in the presence of insulin. J. Biol. Chem. 272, 1639–1645. Barford, D., Flint, A.J., and Tonks, N.K. (1994a). Crystal structure of human protein tyrosine phosphatase 1B. Science 263, 1397–1404. Barford, D., Keller, J.C., Flint, A.J., and Tonks, N.K. (1994b). Purification and crystallization of the catalytic domain of human protein tyrosine phosphatase 1B expressed in Escherichia coli. J. Mol. Biol. 239, 726–730. Bokoch, G.M., and Zhao, T. (2006). Regulation of the phagocyte NADPH oxidase by Rac GTPase. Antioxid. Redox Signal. 8, 1533–1548.
Brandt, T.A., Crooke, S.T., Ackermann, E.J., Xia, X., Morgan, E.S., Liu, Q., Geary, R.S., and Bhanot, S. (2010). ISIS 113715, a novel PTP-1B antisense inhibitor, improves glycemic control and dyslipidemia and increases adiponectin levels in T2DM subjects uncontrolled on stable sulfonylurea therapy. Paper presented at: American Diabetes Association (Carlsbad, CA, USA). Brünger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., et al. (1998). Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr. D Biol. Crystallogr. 54, 905–921. Bryant, N.J., Govers, R., and James, D.E. (2002). Regulated transport of the glucose transporter GLUT4. Nat. Rev. Mol. Cell Biol. 3, 267–277. Cardinale, A., and Biocca, S. (2008). The potential of intracellular antibodies for therapeutic targeting of protein-misfolding diseases. Trends Mol. Med. 14, 373–380. Chen, K., Kirber, M.T., Xiao, H., Yang, Y., and Keaney, J.F., Jr. (2008). Regulation of ROS signal transduction by NADPH oxidase 4 localization. J. Cell Biol. 181, 1129–1139. Cheng, A., Uetani, N., Simoncic, P.D., Chaubey, V.P., Lee-Loy, A., McGlade, C.J., Kennedy, B.P., and Tremblay, M.L. (2002). Attenuation of leptin action and regulation of obesity by protein tyrosine phosphatase 1B. Dev. Cell 2, 497–503. Eden, E.R., White, I.J., Tsapara, A., and Futter, C.E. (2010). Membrane contacts between endosomes and ER provide sites for PTP1B-epidermal growth factor receptor interaction. Nat. Cell Biol. 12, 267–272. Elchebly, M., Payette, P., Michaliszyn, E., Cromlish, W., Collins, S., Loy, A.L., Normandin, D., Cheng, A., Himms-Hagen, J., Chan, C.C., et al. (1999). Increased insulin sensitivity and obesity resistance in mice lacking the protein tyrosine phosphatase-1B gene. Science 283, 1544–1548. Emsley, P., and Cowtan, K. (2004). Coot: model-building tools for molecular graphics. Acta Crystallogr.
D Biol. Crystallogr. 60, 2126–2132. Galic, S., Hauser, C., Kahn, B.B., Haj, F.G., Neel, B.G., Tonks, N.K., and Tiganis, T. (2005). Coordinated regulation of insulin signaling by the protein tyrosine phosphatases PTP1B and TCPTP. Mol. Cell. Biol. 25, 819–829. Goldstein, B.J., Bittner-Kowalczyk, A., White, M.F., and Harbeck, M. (2000). Tyrosine dephosphorylation and deactivation of insulin receptor substrate-1 by protein-tyrosine phosphatase 1B. Possible facilitation by the formation of a ternary complex with the Grb2 adaptor protein. J. Biol. Chem. 275, 4283– 4289. Haj, F.G., Verveer, P.J., Squire, A., Neel, B.G., and Bastiaens, P.I. (2002). Imaging sites of receptor dephosphorylation by PTP1B on the surface of the endoplasmic reticulum. Science 295, 1708–1711. Irani, K., Xia, Y., Zweier, J.L., Sollott, S.J., Der, C.J., Fearon, E.R., Sundaresan, M., Finkel, T., and Goldschmidt-Clermont, P.J. (1997). Mitogenic signaling mediated by oxidants in Ras-transformed fibroblasts. Science 275, 1649– 1652. Iversen, L.F., Moller, K.B., Pedersen, A.K., Peters, G.H., Petersen, A.S., Andersen, H.S., Branner, S., Mortensen, S.B., and Moller, N.P. (2002). Structure determination of T cell protein-tyrosine phosphatase. J. Biol. Chem. 277, 19982–19990. Johnson, T.O., Ermolieff, J., and Jirousek, M.R. (2002). Protein tyrosine phosphatase 1B inhibitors for diabetes. Nat. Rev. Drug Discov. 1, 696–709. Klaman, L.D., Boss, O., Peroni, O.D., Kim, J.K., Martino, J.L., Zabolotny, J.M., Moghal, N., Lubkin, M., Kim, Y.B., Sharpe, A.H., et al. (2000). Increased energy expenditure, decreased adiposity, and tissue-specific insulin sensitivity in protein-tyrosine phosphatase 1B-deficient mice. Mol. Cell. Biol. 20, 5479– 5489. Lambeth, J.D. (2004). NOX enzymes and the biology of reactive oxygen. Nat. Rev. Immunol. 4, 181–189. Lee, S.R., Yang, K.S., Kwon, J., Lee, C., Jeong, W., and Rhee, S.G. (2002). Reversible inactivation of the tumor suppressor PTEN by H2O2. J. Biol. Chem. 277, 20336–20342. 
Lohse, D.L., Denu, J.M., Santoro, N., and Dixon, J.E. (1997). Roles of aspartic acid-181 and serine-222 in intermediate formation and hydrolysis of the
mammalian protein-tyrosine-phosphatase PTP1. Biochemistry 36, 4568– 4575. Mahadev, K., Zilbering, A., Zhu, L., and Goldstein, B.J. (2001). Insulin-stimulated hydrogen peroxide reversibly inhibits protein-tyrosine phosphatase 1b in vivo and enhances the early insulin action cascade. J. Biol. Chem. 276, 21938–21942. Mahadev, K., Motoshima, H., Wu, X.D., Ruddy, J.M., Arnold, R.S., Cheng, G.J., Lambeth, J.D., and Goldstein, B.J. (2004). The NAD(P)H oxidase homolog Nox4 modulates insulin-stimulated generation of H2O2 and plays an integral role in insulin signal transduction. Mol. Cell. Biol. 24, 1844–1854. Marasco, W.A. (1997). Intrabodies: turning the humoral immune system outside in for intracellular immunization. Gene Ther. 4, 11–15. Martyn, K.D., Frederick, L.M., von Loehneysen, K., Dinauer, M.C., and Knaus, U.G. (2006). Functional analysis of Nox4 reveals unique characteristics compared to other NADPH oxidases. Cell. Signal. 18, 69–82. Meng, T.C., Fukada, T., and Tonks, N.K. (2002). Reversible oxidation and inactivation of protein tyrosine phosphatases in vivo. Mol. Cell 9, 387–399. Meng, T.C., Buckley, D.A., Galic, S., Tiganis, T., and Tonks, N.K. (2004). Regulation of insulin signaling through reversible oxidation of the protein-tyrosine phosphatases TC45 and PTP1B. J. Biol. Chem. 279, 37716–37725. Minor, W., Cymborowski, M., Otwinowski, Z., and Chruszcz, M. (2006). HKL-3000: the integration of data reduction and structure solution—from diffraction images to an initial model in minutes. Acta Crystallogr. D Biol. Crystallogr. 62, 859–866. Myers, M.P., Andersen, J.N., Cheng, A., Tremblay, M.L., Horvath, C.M., Parisien, J.P., Salmeen, A., Barford, D., and Tonks, N.K. (2001). TYK2 and JAK2 are substrates of protein-tyrosine phosphatase 1B. J. Biol. Chem. 276, 47771–47774. Nievergall, E., Janes, P.W., Stegmayer, C., Vail, M.E., Haj, F.G., Teng, S.W., Neel, B.G., Bastiaens, P.I., and Lackmann, M. (2010). PTP1B regulates Eph receptor function and trafficking. J. 
Cell Biol. 191, 1189–1203. Nizak, C., Monier, S., del Nery, E., Moutel, S., Goud, B., and Perez, F. (2003). Recombinant antibodies to the small GTPase Rab6 as conformation sensors. Science 300, 984–987. Rondinone, C.M., Trevillyan, J.M., Clampit, J., Gum, R.J., Berg, C., Kroeger, P., Frost, L., Zinker, B.A., Reilly, R., Ulrich, R., et al. (2002). Protein tyrosine phosphatase 1B reduction regulates adiposity and expression of genes involved in lipogenesis. Diabetes 51, 2405–2411. Rhee, S.G. (2006). Cell signaling. H2O2, a necessary evil for cell signaling. Science 312, 1882–1883. Romsicki, Y., Reece, M., Gauthier, J.Y., Asante-Appiah, E., and Kennedy, B.P. (2004). Protein tyrosine phosphatase-1B dephosphorylation of the insulin receptor occurs in a perinuclear endosome compartment in human embryonic kidney 293 cells. J. Biol. Chem. 279, 12868–12875. Salmeen, A., Andersen, J.N., Myers, M.P., Tonks, N.K., and Barford, D. (2000). Molecular basis for the dephosphorylation of the activation segment of the insulin receptor by protein tyrosine phosphatase 1B. Mol. Cell 6, 1401–1412. Salmeen, A., Andersen, J.N., Myers, M.P., Meng, T.C., Hinks, J.A., Tonks, N.K., and Barford, D. (2003). Redox regulation of protein tyrosine phosphatase 1B involves a sulphenyl-amide intermediate. Nature 423, 769–773. Saltiel, A.R., and Kahn, C.R. (2001). Insulin signalling and the regulation of glucose and lipid metabolism. Nature 414, 799–806. Saltiel, A.R., and Pessin, J.E. (2002). Insulin signaling pathways in time and space. Trends Cell Biol. 12, 65–71. Savitsky, P.A., and Finkel, T. (2002). Redox regulation of Cdc25C. J. Biol. Chem. 277, 20535–20540. Schindler, T., Bornmann, W., Pellicena, P., Miller, W.T., Clarkson, B., and Kuriyan, J. (2000). Structural mechanism for STI-571 inhibition of abelson tyrosine kinase. Science 289, 1938–1942. Sundaresan, M., Yu, Z.X., Ferrans, V.J., Irani, K., and Finkel, T. (1995). 
Requirement for generation of H2O2 for platelet-derived growth factor signal transduction. Science 270, 296–299.
Tjernberg, A., Hallén, D., Schultz, J., James, S., Benkestock, K., Byström, S., and Weigelt, J. (2004). Mechanism of action of pyridazine analogues on protein tyrosine phosphatase 1B (PTP1B). Bioorg. Med. Chem. Lett. 14, 891–895. Tonks, N.K. (2003). PTP1B: from the sidelines to the front lines! FEBS Lett. 546, 140–148. Tonks, N.K. (2005). Redox redux: revisiting PTPs and the control of cell signaling. Cell 121, 667–670. Tonks, N.K., Diltz, C.D., and Fischer, E.H. (1988). Purification of the major protein-tyrosine-phosphatases of human placenta. J. Biol. Chem. 263, 6722–6730. van Montfort, R.L.M., Congreve, M., Tisi, D., Carr, R., and Jhoti, H. (2003). Oxidation state of the active-site cysteine in protein tyrosine phosphatase 1B. Nature 423, 773–777. WHO. (2009). Diabetes Fact Sheet (World Health Organization). You-Ten, K.E., Muise, E.S., Itié, A., Michaliszyn, E., Wagner, J., Jothy, S., Lapp, W.S., and Tremblay, M.L. (1997). Impaired bone marrow microenvironment and immune function in T cell protein tyrosine phosphatase-deficient mice. J. Exp. Med. 186, 683–693. Yudushkin, I.A., Schleifenbaum, A., Kinkhabwala, A., Neel, B.G., Schultz, C., and Bastiaens, P.I. (2007). Live-cell imaging of enzyme-substrate interaction reveals spatial regulation of PTP1B. Science 315, 115–119. Zabolotny, J.M., Bence-Hanulec, K.K., Stricker-Krongrad, A., Haj, F., Wang, Y., Minokoshi, Y., Kim, Y.B., Elmquist, J.K., Tartaglia, L.A., Kahn, B.B., and Neel, B.G. (2002). PTP1B regulates leptin signal transduction in vivo. Dev. Cell 2, 489–495. Zhang, S., and Zhang, Z.Y. (2007). PTP1B as a drug target: recent developments in PTP1B inhibitor discovery. Drug Discov. Today 12, 373–381. Zinker, B.A., Rondinone, C.M., Trevillyan, J.M., Gum, R.J., Clampit, J.E., Waring, J.F., Xie, N., Wilcox, D., Jacobson, P., Frost, L., et al. (2002). PTP1B antisense oligonucleotide lowers PTP1B protein, normalizes blood glucose, and improves insulin sensitivity in diabetic mice. Proc. Natl. Acad. Sci. 
USA 99, 11357–11362.
Crystal Structure of the Mammalian GIRK2 K+ Channel and Gating Regulation by G Proteins, PIP2, and Sodium
Matthew R. Whorton1,2 and Roderick MacKinnon1,2,*
1Laboratory of Molecular Neurobiology and Biophysics, Rockefeller University, 1230 York Avenue, New York, NY 10065, USA
2Howard Hughes Medical Institute
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.07.046
SUMMARY
G protein-gated K+ channels (Kir3.1–Kir3.4) control electrical excitability in many different cells. Among their functions relevant to human physiology and disease, they regulate the heart rate and govern a wide range of neuronal activities. Here, we present the first crystal structures of a G protein-gated K+ channel. By comparing the wild-type structure to that of a constitutively active mutant, we identify a global conformational change through which G proteins could open a G loop gate in the cytoplasmic domain. The structures of both channels in the absence and presence of PIP2 suggest that G proteins open only the G loop gate in the absence of PIP2, but in the presence of PIP2 the G loop gate and a second inner helix gate become coupled, so that both gates open. We also identify a strategically located Na+ ion-binding site, which would allow intracellular Na+ to modulate GIRK channel activity. These data provide a structural basis for understanding multiligand regulation of GIRK channel gating. INTRODUCTION G protein-gated K+ (GIRK) channels are members of the inward rectifier (Kir) channel family, so named because the outward flow of K+ ions is inhibited by intracellular polyamines and Mg2+, which block the pore in a voltage-dependent manner. Kir channels play an essential role in many physiological processes, including neuronal signaling, kidney function, insulin secretion, and heart rate control. Mutations of Kir channels underlie numerous diseases including primary aldosteronism, Andersen syndrome, Bartter syndrome, and congenital hyperinsulinism (Choi et al., 2011; Hibino et al., 2010). All Kir channels share the same basic topology: four subunits combine to form a canonical K+ pore-forming transmembrane domain (TMD) and a large cytoplasmic domain (CTD) (Figure 1A). 
It is thought that ion conduction may be regulated by two gates in series: one formed by the inner helices of the TMD (Doyle et al., 1998; Jiang et al., 2002), and the other by the G loop at the apex of the CTD (Nishida et al., 2007; Pegan et al., 2005). Various regulatory molecules are thought to control these gates, but the control mechanisms are still unknown. The anionic lipid phosphatidylinositol 4,5-bisphosphate (PIP2) is essential for the activation of all Kir channels (Hibino et al., 2010). GIRK channels (Kir3.x) are unique in that they also require G proteins, in combination with PIP2, for activation (Huang et al., 1998; Logothetis et al., 1987; Reuveny et al., 1994; Sui et al., 1998; Wickman et al., 1994). Certain GIRK channel subtypes are also modulated by intracellular Na+ ions (Ho and Murrell-Lagnado, 1999a, 1999b; Lesage et al., 1995; Sui et al., 1996; Sui et al., 1998). GIRK channel activation elicits the flow of K+ ions across the cell membrane and thus drives the membrane voltage toward the Nernst potential for K+. Near the K+ Nernst potential, voltage-dependent Na+ and Ca2+ channels tend to be silenced and therefore electrical excitation is diminished. This is an important signaling mechanism by which hormone and neurotransmitter stimulation of G protein-coupled receptors (GPCRs) regulates many essential physiological processes (Lüscher and Slesinger, 2010). For example, acetylcholine secreted by the vagus nerve controls heart rate through stimulation of muscarinic GPCRs in cardiac pacemaker cells (Logothetis et al., 1987; Pfaffinger et al., 1985). Electrophysiological studies have sought to understand how the various ligands (G proteins, PIP2, and Na+) interact simultaneously with GIRK channels to regulate their gating. These studies suggest that G proteins and Na+ function in a codependent manner with PIP2 to open GIRK channels (Huang et al., 1998; Sui et al., 1998; Zhang et al., 1999). 
Here, we present crystal structures of a quiescent, closed GIRK2 channel and of a point mutant that is constitutively active, independent of G protein stimulation. Further, both structures are determined in the absence and presence of PIP2. These structures render a molecular mechanistic description of multi-ligand regulation of GIRK channels.
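For reference, the K+ Nernst potential invoked above is given by E_K = (RT/zF) ln([K+]out/[K+]in). The following minimal Python sketch evaluates that expression. The function name, the default temperature, and the example concentrations (4 mM extracellular and 140 mM intracellular K+) are illustrative assumptions and are not values reported in this study.

```python
import math

# Physical constants (CODATA values).
GAS_CONSTANT = 8.314462618      # J / (mol K)
FARADAY_CONSTANT = 96485.33212  # C / mol


def nernst_potential_mv(conc_out_mm, conc_in_mm, valence=1, temp_c=37.0):
    """Return the Nernst equilibrium potential in millivolts.

    conc_out_mm and conc_in_mm are the extracellular and intracellular
    ion concentrations; only their ratio matters, so any common unit works.
    """
    if conc_out_mm <= 0 or conc_in_mm <= 0:
        raise ValueError("concentrations must be positive")
    temp_k = temp_c + 273.15
    volts = (GAS_CONSTANT * temp_k) / (valence * FARADAY_CONSTANT) * math.log(
        conc_out_mm / conc_in_mm
    )
    return 1000.0 * volts


if __name__ == "__main__":
    # Hypothetical example concentrations, not taken from the study.
    e_k = nernst_potential_mv(conc_out_mm=4.0, conc_in_mm=140.0)
    print(f"E_K = {e_k:.1f} mV")
```

Under these assumed concentrations E_K is strongly negative (on the order of -95 mV), whereas symmetric K+ concentrations give E_K = 0 mV. This illustrates why opening a K+-selective conductance pulls the membrane voltage toward a hyperpolarized value.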
RESULTS Closed Structure of GIRK2 Four GIRK channel isoforms (Kir3.1–Kir3.4) associate into various homo-/heterotetrameric complexes. GIRK2 (Kir3.2) forms functional homotetramers and is thus a good candidate Cell 147, 199–208, September 30, 2011 ª2011 Elsevier Inc. 199
Figure 1. Structure and Function of the GIRK2 Channel
(A) Cartoon diagram of the GIRK2 structure. Each subunit of the tetramer is a different color. Unmodeled segments of the turret and N-terminal linker are drawn with dashed lines. The approximate boundary of the phospholipid bilayer is indicated by the thick black lines. The extracellular surface is on top. (B) A cartoon diagram of key residues that mediate the contacts at the interface between the cytoplasmic and transmembrane domains. The same coloring scheme as in (A) is used. (C) Representative example of a two-electrode voltage-clamp recording from Xenopus laevis oocytes expressing the truncated GIRK2 construct used for crystallography, held at −80 mV. The white bar indicates a physiological extracellular solution containing 96 mM NaCl and 2 mM KCl, whereas the gray bars represent a solution of 98 mM KCl only. The application of 10 mM acetylcholine (ACh) or 1 mM tertiapin-Q (TPN-Q) is also indicated. The dashed line represents zero current; all traces under this line represent negative, inward currents. (D) Voltage ramps of oocytes expressing the same truncated GIRK2 construct. Currents were measured using two-electrode voltage clamp with an extracellular solution of 98 mM KCl, in the presence of 1 mM ACh (total current, red) or 1 mM ACh + 100 nM TPN-Q (background current, green). GIRK2-specific current (total minus background) is shown as the blue trace. The inset graph is the same data, but with a different scale for the axes to illustrate the degree of rectification. See also Figure S1 and Table S1.
for crystallographic structure determination (Kofuji et al., 1995). To obtain suitably diffracting crystals, we modified the mouse GIRK2 complementary DNA (cDNA) to remove unstructured regions of the N and C termini (Nishida et al., 2007; Nishida and MacKinnon, 2002; Pegan et al., 2005; Tao et al., 2009). The resulting channel differs from the corresponding human ortholog by only one amino acid near the structured end of the C terminus (Asn377 is Ser in human GIRK2) (Figure S2 available online). This channel with unstructured regions of the N and C termini removed, which we refer to as wild-type in this study, exhibits the fundamental characteristics of the full-length GIRK2: G protein activation, inhibition by tertiapin-Q, and a strongly rectifying current-voltage curve (Figures 1C and 1D). Wild-type GIRK2 crystals diffracted X-rays to 3.6 Å resolution. Initial phases were determined by molecular replacement with a GIRK2 CTD structure, and a model was built and refined to working and free residuals (Rw/Rf) of 26.0%/27.3% (Figures 1A and 1B and Table S1) (Inanobe et al., 2007). The overall architecture of GIRK2 is similar to the G protein-independent “classical inward rectifier” channel Kir2.2 (Tao et al., 2009), but has two significant differences. First, the turrets surrounding the extracellular entryway to the pore form a wider, more open vestibule in GIRK2 (Figures S1B and S1C). This structural difference may provide a simple explanation for pharmacological differences between classical inward rectifiers and GIRK channels. Many GIRK channels, including GIRK2, are inhibited by certain pore-blocking toxins such as tertiapin, as shown in Figure 1C, whereas classical inward rectifier channels are not (Jin and Lu, 1999). The more open turrets in GIRK2 would allow tertiapin to fit into the vestibule, whereas the more restrictive turrets in classical inward rectifiers appear to prevent toxin binding. The second structural difference occurs at the interface between the TMD and CTD. In Kir2.2 the CTD is extended away from the TMD in the absence of PIP2, whereas in GIRK2 the two components are tightly juxtaposed (Figures 1A and 1B and Figure S1A) (Tao et al., 2009). The TMD-CTD interface in GIRK2 is mediated by both hydrophilic and hydrophobic interactions between the interfacial helices of the TMD, the TM-CTD linker, and the βC-βD loop of the CTD (Figure 1B). These interactions were absent in the more extended Kir2.2 structure (Tao et al., 2009). It seems likely that they play an important role in the control of GIRK2 channel activity because they are in close proximity to the two constrictions along the ion pathway that have been hypothesized to function as gates. One gate, the inner helix gate, is formed by the inner helices of the TMD, just inside the membrane, above the level of the interfacial helix (Figure 1A and Figure S1A). Another gate, the G loop gate, is formed by the G loop at the apex of the CTD, just outside the membrane, below the level of the interfacial helix (Figure 1A and Figure S1A). In this structure of GIRK2, both gates are tightly closed.
Binding Site for Regulation by Intracellular Sodium Ions
Near the TMD-CTD interface, immediately beneath the βC-βD loop, there is electron density that cannot be attributed to protein atoms (Figure 2A). Given the surrounding protein chemical groups, this density most likely represents either a metal ion, for example Na+, or a water molecule. To distinguish between these possibilities, we crystallized the channel in the presence of Tl+, a monovalent metal ion identifiable by its X-ray anomalous signal, which has been used previously to analyze the Na+ binding site in the transport protein LeuT (Boudker et al., 2007). An anomalous peak, the third strongest in the anomalous difference electron density map (4.6 σ, after sites in the selectivity filter and at the intracellular pore entryway), identifies this extra density as a metal cation (Figure 2B). In the native structure, the density is undoubtedly due to a Na+ ion coming from the crystallization solution of 1 M NaCl. The discovery of a Na+ binding site is interesting because GIRK channels that contain either Kir3.2 or Kir3.4 subunits are known to be activated by elevated levels of intracellular Na+ with an EC50 of 30–40 mM (Ho and Murrell-Lagnado, 1999a, 1999b; Lesage et al., 1995; Sui et al., 1996, 1998; Zhang et al., 1999). Sodium activation of GIRK channels is thought to serve an important physiological function by producing negative feedback on excessive electrical excitability: during excitation, Na+
Figure 2. The Na+ Binding Site in GIRK2 Is Located between the βC-βD and βE-βG Loops and Is Dependent on the Negatively Charged Residue Asp228 (A) The key side chain and main chain atoms involved in coordinating a Na+ ion (purple sphere) are shown. A weighted 2Fo-Fc Na+-omit electron density map is shown as a blue mesh, contoured at 1.2 σ. (B) A Tl+ anomalous difference electron density map is shown at the same site, contoured at 4 σ (magenta mesh). The data are derived from WT GIRK2 crystals grown in KNO3 and then soaked in TlNO3, all in the absence of any Na+. (C) The same site is shown for the structure of a D228N mutant. Electron density is also shown at the same contour level as for the wild-type, showing the lack of density for Na+. (D) A model of the whole channel is shown. Each subunit is a different color with the front subunit removed for clarity. The Na+ ions are again depicted as purple spheres to show their location relative to the whole channel. The region of the channel shown in (A)–(C) is highlighted by the black box. See also Figure S2.
entry can elevate intracellular Na+ concentrations above the normal range of 5–15 mM, enough to activate GIRK channels and drive the membrane potential negative again. Mutagenesis studies have pinpointed an aspartic acid in the βC-βD loop as a critical determinant of Na+ activation (Asp228 in Kir3.2), and the presence of aspartic acid rather than asparagine accounts for the Na+ sensitivity of GIRK channels with Kir3.2 or Kir3.4 subunits (Ho and Murrell-Lagnado, 1999a, 1999b; Zhang et al., 1999) (Figure S2). Asp228 is one of the coordinating residues of the Na+ site in the crystal structure, and when Asp228 is mutated to asparagine the Na+ density is no longer observed (Figure 2C). This direct correlation between structure and function observed when mutating Asp228 suggests that we have identified the regulatory Na+ binding site. In the refined model, Na+ is coordinated not only by the side chain carboxylate of Asp228, but also by main chain carbonyl oxygen atoms from Arg230, Asn231, and Ser232 (Figures 2A and 2D). Main chain carbonyls from Leu275 and Val276 in the βE-βG loop may also participate, as well as the flanking histidines (His69 and His233), which may help to coordinate a water molecule near this site. As we present additional data below, it will become clear why this Na+ ion is strategically located in the channel to modulate gating.
The R201A Mutant Appears to Stabilize the G Protein-Activated State
We suspected that the TMD-CTD interface plays an important role in controlling the channel’s gates, and in particular in transmitting conformational changes that allow G proteins to
regulate the gates. Using the closed structure of GIRK2 to guide our experiments, we introduced point mutations into the TMD-CTD interface and assessed their effects using a functional assay developed by Jan and coworkers (Kubo et al., 1993). In this assay, the GIRK2 channel was coexpressed in Xenopus oocytes with the M2 muscarinic G protein-coupled receptor (Kubo et al., 1993). Stimulation of the M2 receptor by acetylcholine (ACh) causes GIRK2 channel opening, mediated by endogenous G proteins in the oocyte. When the wild-type GIRK2 channel is expressed, acetylcholine application stimulates large K+ currents (Figure 1C). Subsequent application of tertiapin-Q (TPN-Q) distinguishes GIRK2 currents from endogenous oocyte channel currents. For the wild-type channel, the magnitude of current stimulated by acetylcholine nearly equals that inhibited by TPN-Q, which means there is little or no G protein-independent GIRK2 current (Figure 1C). Mutations in the TMD-CTD interface were introduced at many of the most conserved positions and in a number of cases the mutations we chose corresponded to naturally occurring mutants that underlie channelopathies (Decher et al., 2007; Donaldson et al., 2003; Lin et al., 2006; Plaster et al., 2001; Zhang et al., 1999) (Figure S2). Not surprisingly, the majority of these mutations yielded nonfunctional GIRK2 channels (listed in the Figure 3 legend). The mutations that did express are shown in Figure 3A. Gray bars show that functional expression levels in each case are comparable to those in the wild-type, while black bars show the fraction of GIRK2 current that is stimulated by the addition of acetylcholine. This fraction is near unity for the wild-type and four of the mutants (Figure 3A). In contrast, the R201A mutation stands out because its acetylcholine-stimulated fraction is less than 0.2, which means that its activity is largely independent of acetylcholine (Figure 3B).
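The two quantities plotted in Figure 3A can be written out explicitly. The sketch below follows the definitions given in the Figure 3 legend: GIRK2-specific current is the ACh-stimulated current minus the current remaining after TPN-Q block, and the ACh-stimulated fraction is the change in current on adding ACh divided by the GIRK2-specific current. The function names and all current values are hypothetical, chosen only to show how a fraction near 1 (wild-type-like) and a fraction near 0.2 (constitutively active) arise; they are not measurements from this study.

```python
def girk2_specific_current(i_with_ach, i_after_tpnq):
    """Total current with ACh minus the current remaining after TPN-Q block."""
    return i_with_ach - i_after_tpnq


def fraction_ach_stimulated(i_before_ach, i_with_ach, i_after_tpnq):
    """Fraction of the GIRK2-specific current that depends on ACh."""
    specific = girk2_specific_current(i_with_ach, i_after_tpnq)
    if specific == 0:
        raise ValueError("no GIRK2-specific current; fraction is undefined")
    return (i_with_ach - i_before_ach) / specific


# Hypothetical currents in microamperes (inward currents are negative).
wild_type_like = fraction_ach_stimulated(
    i_before_ach=-0.3, i_with_ach=-5.3, i_after_tpnq=-0.3)
constitutive_like = fraction_ach_stimulated(
    i_before_ach=-4.3, i_with_ach=-5.3, i_after_tpnq=-0.3)

print(f"wild-type-like fraction:          {wild_type_like:.2f}")
print(f"constitutively-active-like value: {constitutive_like:.2f}")
```

A channel whose current is already large before ACh is applied gives a small numerator and therefore a small fraction, which is the signature described for R201A.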
To analyze the structural alterations underlying constitutive activation in the R201A mutant, we determined its crystal structure at 3.1 Å resolution. Compared to wild-type, this channel shows a large rearrangement of the βC-βD loop, movement of the His233 side chain to fill the void left by deletion of the Arg201 side chain, and a shifting of the strand comprising residues 235–237 to interact more closely with residues 272–275 of the βE-βG loop (Figure 3C). These conformational changes are associated with a displacement and rotation of the G loop and a change in the positions of three amino acids that form the gate’s constriction: Met313, Gly318, and Met319 (Figures 3D and 3E and Movie S1). The net effect of these concerted changes is a widening of the G loop gate from about 6.0 Å to nearly 12.0 Å, and the appearance of oxygen atoms from the Gly318 main-chain carbonyl and the Thr320 side chain to face the pore (Figures 3D and 3E). These conformational changes effectively create a 9 Å diameter hydrophilic pore (as delimited by the van der Waals surfaces of the pore-lining oxygen atoms), which should be sufficient to allow passage of an 8 Å diameter hydrated K+ ion. The R201A mutant also shows a propagated conformational change that extends to the perimeter of the CTD, a distance of approximately 30 Å from the site of the R201A mutation, and encompasses the Na+ binding site (Figure 4 and Movie S1). This propagated conformational change is mediated by a domino-like displacement of β strands, which in turn leads to
a reorganization of several hydrophobic amino acids (Val67, Leu257, and Val276) and a rotamer switch in the conformation of the Tyr58 side chain near the CTD perimeter. Several independent lines of evidence suggest that this propagated conformational change is similar to the conformational change that occurs when G proteins bind to the channel. First, site-directed mutagenesis studies of GIRK/Kir2.1 chimeric channels have identified several buried and surface residues as being essential in mediating G protein activation of GIRK channels. The critical surface residues are located around the βL-βM loop of the CTD, especially residue Leu344, thus implicating this region as a binding site for G proteins (Finley et al., 2004; He et al., 1999). Therefore, it is reasonable to think that G protein binding to this region could regulate this conformational change, leading to G loop gate opening. It was also shown that mutation of the buried residue Leu273 to isoleucine eliminated G protein activation (He et al., 2002). Until now, it was unclear how this would affect channel function, but here we can see that Leu273 is located on the βE-βG loop next to Val276 and that it undergoes a significant conformational change in the R201A mutant. It is conceivable that a mutation at this position could alter the propagation of a conformational change from the surface of the protein to the G loop. Second, NMR studies of the interaction between G protein subunits and the GIRK1 CTD using transferred cross saturation and chemical shift perturbation experiments have identified regions on the CTD that either directly contact a G protein or undergo a conformational change in response to G protein binding (Yokogawa et al., 2011). These studies concluded that G proteins bind to a surface mainly comprised of the βL-βM and βD-βE loops (Figure S3B).
They also concluded that G protein binding elicits conformational changes near this surface as well as at locations within the N terminus, especially Tyr58 (Figures S3C and S3D). The NMR data match the conformational changes that we observe in the βL-βM loop, N terminus, and in particular Tyr58, which is the residue that undergoes a rotamer switch in the R201A mutant (Figure S3A and Figure 4). We will show below that the R201A mutant in the presence of PIP2 reveals an additional conformational change that is consistent with the NMR data on the βD-βE loop (Figures S3E and S3F). Taken together, the mutagenesis and NMR data support the hypothesis that constitutive activation in the R201A mutant channel results because this mutant favors a conformation similar to that induced by G protein binding. The G loop gate is open in the R201A mutant, but the inner helix gate remains closed. We note, however, that G protein stimulation alone is insufficient to achieve ion conduction in GIRK channels. In electrophysiological assays, the signaling lipid PIP2 is also required, in addition to G proteins, for channel activation.
The Role of PIP2 in GIRK2 Channel Activation
We next determined the crystal structure of wild-type GIRK2 in the presence of C8-PIP2 (eight carbon acyl chains) at 3.0 Å resolution. The structure shows one ordered PIP2 lipid molecule per subunit bound near the TMD-CTD interface. The negatively charged phosphates of the PIP2 molecule are coordinated by several positively charged residues, Lys64, Lys194, Lys199, and Lys200, along with backbone amides at the junction of the
Figure 3. The R201A Mutant Is Constitutively Active, which Is Partially the Result of a Rearrangement of the βC-βD Loop and a Widening of the G Loop Gate (A) A summary of oocyte currents from two-electrode voltage clamp experiments for various GIRK2 mutants. The experiments were performed the same as in Figure 1C. Mutations were selected based on the interactions shown in Figure 1B. Specific GIRK2 current (gray bars) is evaluated as the total ACh-stimulated current minus the remaining current after TPN-Q blockage. Fraction ACh stimulation (black bars) is evaluated as the ACh-stimulated current (the difference in the current before and after the addition of ACh) divided by the total GIRK2-specific current. (*) All currents were measured 16–24 hr after RNA injection, except for Y78L and R201A, which did not show appreciable current until after 3 days of expression. All RNAs were injected undiluted, except for the D228N mutant, which was diluted 10-fold. Error bars represent the standard deviation of the mean from three different oocytes for each mutant. Data are not shown for the following mutants that did not show any detectable currents: D81Y/N/R/A/E, Y78D/A, R230D/K/A, R201K/D/Q, D228R/E/A, R201D/D228D, R230D/D81R, and D81Y/Y78D. (B) Representative example of a two-electrode voltage-clamp recording of Xenopus laevis oocytes expressing the R201A GIRK mutant. The experiment was performed exactly as in Figure 1C. (C) A comparison of the wild-type (red with orange side chains) and R201A mutant structures (green). (D and E) A comparison of the conformation of the G loop gate between the wild-type (red, D) and R201A (green, E) structures. The top panels are a cross-section of the channel, showing the G loops of just two opposing subunits. Key residues that form the main constriction in the ion conduction pathway are highlighted (there was no electron density for M319 in (E), so it was modeled as alanine). The bottom panels show a top-down view of the cytoplasmic domain tetramer with the side chains that were highlighted in the top panels shown as space-filling models.
interfacial and outer helices (Figure 5 and Figures S4A and S4B). The side chains for residues Lys90 and Arg92 did not show any appreciable electron density, but may still contribute to an overall positive electrostatic potential of the binding site (Figure 5 and Figures S4A and S4B). The presence of PIP2 induced a modest displacement of the protein main chain near the PIP2 binding site, where the interfacial helix turns into the outer helix (Figure S4C). PIP2 also produced a slight rotation of the inner helices, accompanied by a weakening of electron density for the Phe192 side chain, which forms the most constricted region of the inner helix gate (Figures S4D-S4F). Clearly, however, both the G loop gate and inner helix gate remain closed in the presence of PIP2, consistent with the observation that PIP2 alone is insufficient to open GIRK channels in electrophysiological experiments.
Figure 4. The R201A Mutant Shows a Propagated Conformational Change through the Cytoplasmic Domain that Mimics the Same Changes Induced by G Proteins A comparison of the conformational changes over the whole channel between the wild-type (red) and the R201A mutant (green) structures. The front subunit has been removed for clarity. The intensity of the color is related to the absolute conformational change between identical residues (a combination of the distance between both the alpha and gamma carbons of the two structures, to account for both main-chain and rotamer conformational changes). The large deviations in the transmembrane domain arise from a slight rigid-body twist of the cytoplasmic domain relative to the transmembrane domain. The close-up view on the right highlights key side chains involved in propagating the conformational change through the channel. Important ligand-binding sites are identified by blue circles. See also Figure S3 and Movie S1.
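The per-residue measure used to color Figure 4 can be illustrated with a short sketch. The legend states only that the color reflects "a combination" of the Cα and Cγ distances between the two structures; the sketch below assumes that the combination is a simple sum, that the two structures have already been superimposed, and that residues lacking a Cγ atom in either structure (for example glycine or alanine) contribute their Cα distance alone. The data layout, the function name, and the coordinates are all hypothetical.

```python
import math


def residue_displacement(reference, mobile):
    """Sum of CA-CA and CG-CG distances for residues present in both structures.

    Each structure maps residue number -> {atom name: (x, y, z)} and is
    assumed to be pre-superimposed. Summing the two distances is an
    assumption; the source says only that they are combined.
    """
    scores = {}
    for res_id in reference.keys() & mobile.keys():
        total = 0.0
        for atom in ("CA", "CG"):
            if atom in reference[res_id] and atom in mobile[res_id]:
                total += math.dist(reference[res_id][atom], mobile[res_id][atom])
        scores[res_id] = total
    return scores


# Invented coordinates (in angstroms) for two residues.
wild_type = {
    58: {"CA": (0.0, 0.0, 0.0), "CG": (1.0, 1.0, 0.0)},
    201: {"CA": (10.0, 0.0, 0.0), "CG": (11.0, 1.0, 0.0)},
}
mutant = {
    58: {"CA": (0.0, 0.0, 0.0), "CG": (4.0, 5.0, 0.0)},  # side-chain rotamer change only
    201: {"CA": (11.0, 0.0, 0.0)},                       # alanine has no CG atom
}

scores = residue_displacement(wild_type, mutant)
for res_id in sorted(scores):
    print(res_id, round(scores[res_id], 2))
```

Including Cγ lets a pure rotamer switch register even when the backbone does not move, which is the stated purpose of combining the two distances.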
In contrast to the small effect of PIP2 on the conformation of the wild-type channel, PIP2 had a profound effect on the R201A mutant. A crystal structure in the presence of C8-PIP2, determined at 3.5 Å resolution, shows a large change in both the G loop and the inner helix gates (Figures 6A and 6B and Movie S2). Because of the way in which channel molecules are packed against each other in this orthorhombic crystal, only two of the four subunits are free to bind PIP2 and undergo the conformational change (Figure S5). In the PIP2 bound subunits, we observe conformational changes similar to those observed in the R201A mutant. We also observe a rigid body rotation of the CTDs, which necessitates a movement of the βD-βE loop (Figures 6C and 6D and Figures S3E and S3F). The rotation of the CTDs is propagated across the TMD-CTD interface and is associated with a rotation and splaying apart of the inner helices. The net effect of PIP2 in combination with the R201A mutation is an opening of the G loop gate to 15 Å and the inner helix gate to 11 Å (Figures 6A and 6B and Movie S2).
Figure 7A shows a surface rendering of the entire pore lining for the wild-type channel and for the R201A mutant in the presence of PIP2. Although we do not know for sure whether in a cell two or four PIP2 molecules bind to the channel in its G protein-activated state, this rendering shows the case in which we allow four subunits to bind PIP2 and undergo the conformational change, which we anticipate would likely occur in the unconstrained environment of the membrane. It is evident that in the setting of dual activation by G protein subunits (mimicked by the R201A mutant) and PIP2, the highly constricted pore of the quiescent wild-type channel opens wide enough to allow a hydrated K+ ion passage from the cytoplasm to the selectivity filter.
Figure 5. PIP2 Binds at the Interface between the Transmembrane and Cytoplasmic Domains and Is Coordinated by Several Positively Charged Residues PIP2 molecules are colored yellow, orange, and red (for carbon, phosphorus, and oxygen atoms) and are shown in the context of the whole channel (gray) on the left, and a close-up view on the right. The thick black lines indicate the approximate boundary of the plasma membrane and the black box highlights the region of the close-up view on the right. On the right, the main coordinating residues are shown as sticks. Residues Lys90 and Arg92 were modeled as alanines due to a lack of electron density, but probably still contribute to the positive electrostatics of the binding site. The important gating residue Phe192 is also shown for reference. See also Figure S4.
Figure 6. PIP2 Binding to the R201A Mutant Causes the Inner Helix and G Loop Gates to Open (A) The inner helix gate is shown as a Cα trace for the apo structure of R201A (thin, green ribbons) and the structure of the R201A mutant with PIP2 (thick, blue and purple ribbons). The side chain of Phe192 is also shown. For the R201A + PIP2 structure, the subunits that bound PIP2 are shown in blue, whereas the PIP2-free subunits are shown in purple. (B) The same coloring scheme is used as in (A), except now the G loop gate is highlighted. (C) One subunit of R201A + PIP2 is shown (blue) compared to the apo structure of R201A (green). The residues for the βL-βM loop are hidden to more clearly show the rigid-body twisting of the cytoplasmic domain. The PIP2 molecule is shown in ball and stick representation. (D) A top down, orthogonal view of (C). Only the cytoplasmic domain is shown to more clearly show the rigid-body twisting. See also Figures S3 and S5 and Movie S2.
DISCUSSION
This study presents the first molecular structures of a G protein gated K+ channel, GIRK2. Crystallization of the wild-type channel and of a constitutively active mutant, both in the absence and presence of the signaling lipid PIP2, reveals four distinct structures that we believe represent physiologically relevant conformations that underlie GIRK channel gating. In contrast to voltage-dependent and other ligand-gated K+ channels, GIRK channels clearly contain two functional gates, an inner helix gate and a G loop gate, that are regulated by a combination of cytoplasmic and membrane stimuli. Until this study, all previous structures of Kir family channels, including several bacterial family members, a chimera with a bacterial TM and eukaryotic CTD, and a eukaryotic “classical inward rectifier,” exhibited a tightly closed inner helix gate (Clarke et al., 2010; Kuo et al., 2003; Nishida et al., 2007; Tao et al., 2009). Three of the structures presented here (wild-type without PIP2, wild-type with PIP2, and the R201A mutant without PIP2) have a closed inner helix gate conformation similar to previous structures. In the R201A mutant in the presence of PIP2, we observe an open inner helix gate conformation. As discussed above, we think it is likely that crystal lattice contacts prevented PIP2 binding and opening in two of the subunits, but that in a membrane all four subunits would bind PIP2 and undergo the conformational change. If four subunits undergo the change, then the inner helix gate will open wide enough to permit passage of a hydrated K+ ion, but
not as wide as in voltage-gated K+ and MthK Ca2+ gated K+ channels (Jiang et al., 2002; Jiang et al., 2003; Long et al., 2005; Long et al., 2007). A narrower passageway from the cytoplasm to the selectivity filter in inward rectifier K+ channels might be functionally significant because strong voltage-dependent block by intracellular polyamines and Mg2+ gives rise to their namesake conduction property, inward rectification (Lopatin et al., 1994; Matsuda et al., 1987). The strong voltage dependence of these blocking ions is in part due to the coupled movement of blocking and permeant ions in the pore (Spassova and Lu, 1998). A very wide ion pathway would allow blocking and permeant ions to interchange their positions along the pathway (i.e., slip by each other), but a narrower pathway will force the ions to move in a queue, and thus the movement of K+ will impart excess voltage dependence onto the blocking ion and give rise to stronger rectification (Spassova and Lu, 1998). We discovered the constitutively active R201A mutant by evaluating the initial wild-type closed structure and then perturbing the TMD-CTD interface because it appeared as if it ought to be involved in regulating the gates. Not surprisingly, most mutations in this region abolished channel function. Fortuitously, in the R201A mutant, we observe constitutive activation associated with an open G loop gate and a propagated conformational change across the CTD. The extent and distribution of this conformational change match very well with previous mutational studies implicating the βL-βM loop as a site for binding G protein subunits, and with NMR studies on the effect of G protein binding to isolated CTD structures (Finley et al., 2004; He et al., 1999, 2002; Yokogawa et al., 2011). We thus hypothesize that the R201A mutant stabilizes a conformation that is similar to the G protein-activated state.
Electrophysiological experiments have shown that elevated levels of Na+, along with the presence of PIP2, can activate certain GIRK channels that have an aspartic acid residue at position 228 (Ho and Murrell-Lagnado, 1999a, 1999b; Lesage et al., 1995; Sui et al., 1996; Sui et al., 1998; Zhang et al., 1999). Our data may start to address the mechanism of this mode of activation. We show that a Na+ ion binds in a pocket between the βC-βD and βE-βG loops. This lies directly on the pathway of the propagated conformational changes that we observed in the R201A structure, which we believe to mimic the effects of a bound G protein. Thus, Na+ may work to influence the conformation of this pathway. One possible mechanism by which the Na+ ion could do this is by binding to Asp228 and thereby weakening the ionized hydrogen bond that it forms with Arg201. Through such an interaction, we imagine that Na+ could in part mimic the effect of the R201A mutation and, by extension, the effect of G proteins as well. There is a very good correlation between the four crystal structures and past electrophysiological studies showing a dual requirement for both G proteins and PIP2 to open GIRK channels (Huang et al., 1998; Sui et al., 1998; Zhang et al., 1999). We can now view this dual requirement through a thermodynamic cycle of the four structures: PIP2 alone binds but does not open either gate, G proteins (R201A) alone open the G loop gate but not the inner helix gate, but G proteins in addition to PIP2 open both gates (Figure 7B and Figure S6). In words, the relations suggest that PIP2 functions to couple tightly the two gates so that upon G protein stimulation both gates open to activate the channel. One well-established role of the CTD is to form an extended pore that is important for producing rectification (Nishida and MacKinnon, 2002; Tao et al., 2009; Yang et al., 1995). In this study, we observe how the CTD also has the ability to undergo conformational changes that allow cytoplasmic ligands to allosterically control gates in the pore.
In realizing that G proteins apparently regulate the gates through a pathway of conformational change that includes a modulatory Na+ binding site, we have begun to see the structural underpinnings of a very complex form of gating regulation.
EXPERIMENTAL PROCEDURES
Molecular Biology
A truncated GIRK2 cDNA (consisting of residues 52–380, hereafter referred to as the wild-type) was subcloned by PCR into pPICZ and pGEM vectors for expression in Pichia pastoris or Xenopus laevis oocytes, respectively. The open reading frame in the pPICZ vector contained a C-terminal PreScission protease site, followed by green fluorescent protein (GFP), and a His10 tag. The pGEM vector contained a C-terminal FCYENE tag (Ma et al., 2001) and then a c-Myc tag.
Figure 7. The R201A Mutant in the Presence of PIP2 Creates a Continuous Cavity from the Cytoplasm to the Selectivity Filter Large Enough for a Hydrated K+ Ion (A) Surface representations for the interior surfaces of the wild-type apo (left) versus a four-fold symmetrical model of R201A + PIP2 (right). The K+-accessible space was determined with the HOLLOW script (Ho and Gruswitz, 2008). The R201A + PIP2 model was generated by rotating the PIP2-bound subunits 90 degrees about the central tetrameric axis. The surfaces of the cavities are colored according to the hydrophobicity of the surrounding side chains. For reference, a cartoon representation of the channel is shown on the far left. The key residues of the inner helix (Phe192) and G loop gates (Met313 and Met319) are shown in cyan. (B) A schematic model depicting the effects that PIP2 and the R201A mutation have on GIRK2 channels. The three main constriction points are labeled in the top-left panel: a, selectivity filter; b, inner helix gate; c, G loop gate. PIP2 binding to the wild-type channel opens the inner helix gate slightly, but the G loop gate is still closed (top right). The R201A mutant induces a series of conformational changes that mimic the effect of G protein binding; this results in the opening of the G loop gate, but the inner helix gate is still closed (bottom left). The combination of the R201A mutant and PIP2 causes a rotation of the CTD, which in turn splays apart the inner helix gate and further opens the G loop gate (bottom right). The net effect of these conformational changes is an open channel. See also Figure S6.
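The construction of the four-fold symmetric R201A + PIP2 model described in the Figure 7 legend (rotating the PIP2-bound subunits 90 degrees about the central tetrameric axis) can be sketched as a plain coordinate rotation. The sketch assumes that the tetramer axis has been aligned with the z axis and passes through the origin, and that the two PIP2-bound subunits sit opposite each other so that their 90 degree copies fill the two remaining positions. Function names and coordinates are hypothetical, and this is not the authors' modeling procedure.

```python
import math


def rotate_about_z(coords, angle_deg):
    """Rotate a list of (x, y, z) points about the z axis by angle_deg."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


def build_fourfold_model(bound_subunits):
    """Return the two bound subunits plus their copies rotated by 90 degrees.

    Assumes the pore axis is the z axis through the origin and that the
    two input subunits are related to each other by a 180 degree rotation.
    """
    rotated = [rotate_about_z(subunit, 90.0) for subunit in bound_subunits]
    return list(bound_subunits) + rotated


# One invented atom per subunit, placed on opposite sides of the pore axis.
subunit_a = [(10.0, 0.0, 5.0)]
subunit_c = [(-10.0, 0.0, 5.0)]

model = build_fourfold_model([subunit_a, subunit_c])
for subunit in model:
    print([tuple(round(v, 3) for v in atom) for atom in subunit])
```

Applied to full subunit coordinates, the same rotation places the PIP2-bound conformation at all four positions around the pore, which is the model used for the surface calculation in Figure 7A.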
Protein Expression and Purification
Protein expression was induced in stable P. pastoris cell lines by the addition of methanol for 20–24 hr at 24°C. Cells were harvested by centrifugation, frozen in liquid N2, and stored at −80°C until needed. Frozen cells were lysed in a mixer mill, then solubilized for 1 hr at room temperature (RT) by resuspension in 50 mM HEPES (pH 7.35), 150 mM KCl, 4% (w/v) n-decyl-β-D-maltopyranoside (DM), and protease inhibitor cocktail. Clarified supernatant was then incubated with Talon metal affinity resin for 1 hr at RT with gentle mixing. The resin was washed in batch with 5 column volumes (cv) of buffer A (50 mM HEPES [pH 7.0], 150 mM KCl, 0.4% [w/v] DM), then loaded onto a column and further washed with 5 cv buffer A + 40 mM imidazole, then 2 cv buffer A + 80 mM imidazole. The column was then eluted with buffer A + 300 mM imidazole. Peak fractions were pooled, supplemented with 20 mM DTT, 3 mM TCEP, and 1 mM EDTA, and then cleaved with PreScission protease for either 2 hr at RT or overnight at 4°C. The cleaved protein was then concentrated and run on a Superdex-200 gel filtration column in 20 mM TRIS-HCl (pH 7.5), 150 mM KCl, 0.2% (w/v) DM (anagrade), 20 mM DTT, and 1 mM EDTA.
Crystallization and Structure Determination
Purified protein representing the peak tetramer fractions was pooled and concentrated in a 50 K MWCO concentrator to 6–7 mg/ml, mixed 1:1 with crystallization solution, then crystallized using the hanging-drop vapor diffusion method. For the PIP2 complexes, 1,2-dioctanoyl-sn-glycero-3-phospho-(1′-myo-inositol-4′,5′-bisphosphate) (C8-PIP2, Avanti Polar Lipids) was added to the concentrated protein at a final concentration of 2 mM right before setting up the drops. For R201A + PIP2, 10 mM spermine was also added, although it was not seen in the electron density maps. Crystallization drops were incubated at 20°C and crystals (D-shaped plates) usually appeared after 1–3 days.
The wild-type and R201A mutant crystals grew in 50 mM Na citrate (pH 6.0), 1 M NaCl, and 30%–35% PEG 400. The D228N mutant crystals grew in 50 mM Na citrate (pH 6.0), 1 M NaNO3, and 24%–26% PEG 400. The wild-type + PIP2 crystals grew in 50 mM Na citrate (pH 6.0), 1 M NaCl, and 20% PEG 400. The R201A mutant + PIP2 crystals grew in 50 mM Na HEPES (pH 7.25), 0.5 M NaCl, and 25% PEG 400. For collection of Tl+ anomalous diffraction data, wild-type protein was crystallized in the presence of KNO3 instead of NaCl. These crystals were then transferred to solutions containing TlNO3. For data collection, the crystals were cryoprotected and flash-frozen in liquid N2. The structures were solved by molecular replacement with the MOLREP (Vagin and Teplyakov, 2000) program, using the GIRK2 cytoplasmic domain structure as a search model (Inanobe et al., 2007). The models were built with COOT (Emsley and Cowtan, 2004) and refined with REFMAC (Murshudov et al., 1997). Data collection and refinement statistics are shown in Table S1.
Electrophysiology
X. laevis oocytes were injected with 50 nl cRNA and incubated for 1–3 days before recording of currents. All experiments were performed at RT with a two-electrode voltage clamp. For gap-free recordings, oocytes were held at −80 mV in ND96 (96 mM NaCl, 2 mM KCl, 0.3 mM CaCl2, 1 mM MgCl2, 10 mM HEPES [pH 7.6] with KOH), then perfused with a highK solution (98 mM KCl, 0.3 mM CaCl2, 1 mM MgCl2, 10 mM HEPES [pH 7.6] with KOH) that also contained either 10 mM acetylcholine (ACh) or 1 mM tertiapin-Q (TPN-Q). GIRK2-specific current was calculated by subtraction of the current remaining after TPN-Q blockage (background oocyte currents) from the maximum current elicited with highK + ACh. Basal GIRK2 current was calculated by subtraction of the current remaining after TPN-Q blockage from the maximum current elicited with highK only.
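The two background subtractions described above can be written out as a short Python sketch. The function names and the current values are hypothetical and chosen only for illustration; the source does not state what software was used to process the recordings.

```python
def girk2_specific_current(i_max_highk_ach, i_after_tpnq):
    """Agonist-evoked GIRK2 current: maximum current in highK + ACh
    minus the current remaining after TPN-Q block (oocyte background)."""
    return i_max_highk_ach - i_after_tpnq


def basal_girk2_current(i_max_highk, i_after_tpnq):
    """Basal GIRK2 current: maximum current in highK alone
    minus the current remaining after TPN-Q block."""
    return i_max_highk - i_after_tpnq


# Hypothetical values in microamps (negative = inward current at -80 mV).
i_highk = -2.0       # assumed peak current in highK only
i_highk_ach = -5.0   # assumed peak current in highK + ACh
i_tpnq = -0.5        # assumed residual current after TPN-Q block

specific = girk2_specific_current(i_highk_ach, i_tpnq)
basal = basal_girk2_current(i_highk, i_tpnq)
print(f"GIRK2-specific: {specific} uA, basal: {basal} uA")
```

Both quantities share the same TPN-Q-insensitive background term, so the basal and agonist-evoked currents are corrected by an identical offset.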
For measuring rectification, the membrane potential was ramped from −80 mV to +80 mV over 100 ms in the presence of either highK buffer + 1 mM ACh to measure total currents, or highK + 1 mM ACh + 100 nM TPN-Q to measure background oocyte currents. GIRK2-specific currents were calculated by subtracting the background currents from the total currents.
ACCESSION NUMBERS
Coordinates and structure factors have been deposited in the Protein Data Bank with the following accession numbers: WT, 3SYO; D228N, 3SYC; R201A, 3SYP; WT + PIP2, 3SYA; and R201A + PIP2, 3SYQ.
SUPPLEMENTAL INFORMATION Supplemental Information includes Extended Experimental Procedures, six figures, one table, and two movies and can be found with this article online at doi:10.1016/j.cell.2011.07.046.
ACKNOWLEDGMENTS
We thank P. Hoff and members of D. Gadsby’s laboratory (Rockefeller University) for assistance with oocyte preparation; K.R. Rajashankar and K. Perry at beamline 24ID-C (Advanced Photon Source, Argonne National Laboratory), H. Robinson at beamline X29 (National Synchrotron Light Source, Brookhaven National Laboratory), and M. Becker at beamline 23ID-B (Advanced Photon Source, Argonne National Laboratory) for assistance at the synchrotron; members of the MacKinnon laboratory for assistance; and A. Banerjee, J. Butterwick, X. Tao, and P. Yuan for comments on the manuscript. R.M. is an Investigator in the Howard Hughes Medical Institute.
Received: May 10, 2011 Revised: July 13, 2011 Accepted: July 20, 2011 Published: September 29, 2011
REFERENCES
Boudker, O., Ryan, R.M., Yernool, D., Shimamoto, K., and Gouaux, E. (2007). Coupling substrate and ion binding to extracellular gate of a sodium-dependent aspartate transporter. Nature 445, 387–393.
Choi, M., Scholl, U.I., Yue, P., Björklund, P., Zhao, B., Nelson-Williams, C., Ji, W., Cho, Y., Patel, A., Men, C.J., et al. (2011). K+ channel mutations in adrenal aldosterone-producing adenomas and hereditary hypertension. Science 331, 768–772.
Clarke, O.B., Caputo, A.T., Hill, A.P., Vandenberg, J.I., Smith, B.J., and Gulbis, J.M. (2010). Domain reorientation and rotation of an intracellular assembly regulate conduction in Kir potassium channels. Cell 141, 1018–1029.
Decher, N., Renigunta, V., Zuzarte, M., Soom, M., Heinemann, S.H., Timothy, K.W., Keating, M.T., Daut, J., Sanguinetti, M.C., and Splawski, I. (2007). Impaired interaction between the slide helix and the C-terminus of Kir2.1: a novel mechanism of Andersen syndrome. Cardiovasc. Res. 75, 748–757.
Donaldson, M.R., Jensen, J.L., Tristani-Firouzi, M., Tawil, R., Bendahhou, S., Suarez, W.A., Cobo, A.M., Poza, J.J., Behr, E., Wagstaff, J., et al. (2003). PIP2 binding residues of Kir2.1 are common targets of mutations causing Andersen syndrome. Neurology 60, 1811–1816.
Doyle, D.A., Morais Cabral, J., Pfuetzner, R.A., Kuo, A., Gulbis, J.M., Cohen, S.L., Chait, B.T., and MacKinnon, R. (1998). The structure of the potassium channel: molecular basis of K+ conduction and selectivity. Science 280, 69–77.
Emsley, P., and Cowtan, K. (2004). Coot: model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132.
Finley, M., Arrabit, C., Fowler, C., Suen, K.F., and Slesinger, P.A. (2004). betaLbetaM loop in the C-terminal domain of G protein-activated inwardly rectifying K(+) channels is important for G(betagamma) subunit activation. J. Physiol. 555, 643–657.
He, C., Zhang, H., Mirshahi, T., and Logothetis, D.E. (1999). Identification of a potassium channel site that interacts with G protein betagamma subunits to mediate agonist-induced signaling. J. Biol. Chem. 274, 12517–12524.
He, C., Yan, X., Zhang, H., Mirshahi, T., Jin, T., Huang, A., and Logothetis, D.E. (2002). Identification of critical residues controlling G protein-gated inwardly rectifying K(+) channel activity through interactions with the beta gamma subunits of G proteins. J. Biol. Chem. 277, 6088–6096.
Hibino, H., Inanobe, A., Furutani, K., Murakami, S., Findlay, I., and Kurachi, Y. (2010). Inwardly rectifying potassium channels: their structure, function, and physiological roles. Physiol. Rev. 90, 291–366.
Cell 147, 199–208, September 30, 2011 ª2011 Elsevier Inc. 207
Ho, B.K., and Gruswitz, F. (2008). HOLLOW: generating accurate representations of channel and interior surfaces in molecular structures. BMC Struct. Biol. 8, 49.
Ma, D., Zerangue, N., Lin, Y.F., Collins, A., Yu, M., Jan, Y.N., and Jan, L.Y. (2001). Role of ER export signals in controlling surface potassium channel numbers. Science 291, 316–319.
Ho, I.H., and Murrell-Lagnado, R.D. (1999a). Molecular determinants for sodium-dependent activation of G protein-gated K+ channels. J. Biol. Chem. 274, 8639–8648.
Matsuda, H., Saigusa, A., and Irisawa, H. (1987). Ohmic conductance through the inwardly rectifying K channel and blocking by internal Mg2+. Nature 325, 156–159.
Ho, I.H., and Murrell-Lagnado, R.D. (1999b). Molecular mechanism for sodium-dependent activation of G protein-gated K+ channels. J. Physiol. 520, 645–651.
Murshudov, G.N., Vagin, A.A., and Dodson, E.J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr. D Biol. Crystallogr. 53, 240–255.
Huang, C.L., Feng, S., and Hilgemann, D.W. (1998). Direct activation of inward rectifier potassium channels by PIP2 and its stabilization by Gbetagamma. Nature 391, 803–806.
Nishida, M., and MacKinnon, R. (2002). Structural basis of inward rectification: cytoplasmic pore of the G protein-gated inward rectifier GIRK1 at 1.8 Å resolution. Cell 111, 957–965.
Inanobe, A., Matsuura, T., Nakagawa, A., and Kurachi, Y. (2007). Structural diversity in the cytoplasmic region of G protein-gated inward rectifier K+ channels. Channels (Austin) 1, 39–45.
Nishida, M., Cadene, M., Chait, B.T., and MacKinnon, R. (2007). Crystal structure of a Kir3.1-prokaryotic Kir channel chimera. EMBO J. 26, 4005–4015.
Jiang, Y., Lee, A., Chen, J., Cadene, M., Chait, B.T., and MacKinnon, R. (2002). The open pore conformation of potassium channels. Nature 417, 523–526.
Pegan, S., Arrabit, C., Zhou, W., Kwiatkowski, W., Collins, A., Slesinger, P.A., and Choe, S. (2005). Cytoplasmic domain structures of Kir2.1 and Kir3.1 show sites for modulating gating and rectification. Nat. Neurosci. 8, 279–287.
Jiang, Y., Lee, A., Chen, J., Ruta, V., Cadene, M., Chait, B.T., and MacKinnon, R. (2003). X-ray structure of a voltage-dependent K+ channel. Nature 423, 33–41.
Pfaffinger, P.J., Martin, J.M., Hunter, D.D., Nathanson, N.M., and Hille, B. (1985). GTP-binding proteins couple cardiac muscarinic receptors to a K channel. Nature 317, 536–538.
Jin, W., and Lu, Z. (1999). Synthesis of a stable form of tertiapin: a high-affinity inhibitor for inward-rectifier K+ channels. Biochemistry 38, 14286–14293.
Plaster, N.M., Tawil, R., Tristani-Firouzi, M., Canún, S., Bendahhou, S., Tsunoda, A., Donaldson, M.R., Iannaccone, S.T., Brunt, E., Barohn, R., et al. (2001). Mutations in Kir2.1 cause the developmental and episodic electrical phenotypes of Andersen’s syndrome. Cell 105, 511–519.
Kofuji, P., Davidson, N., and Lester, H.A. (1995). Evidence that neuronal G-protein-gated inwardly rectifying K+ channels are activated by G beta gamma subunits and function as heteromultimers. Proc. Natl. Acad. Sci. USA 92, 6542–6546.
Kubo, Y., Reuveny, E., Slesinger, P.A., Jan, Y.N., and Jan, L.Y. (1993). Primary structure and functional expression of a rat G-protein-coupled muscarinic potassium channel. Nature 364, 802–806.
Kuo, A., Gulbis, J.M., Antcliff, J.F., Rahman, T., Lowe, E.D., Zimmer, J., Cuthbertson, J., Ashcroft, F.M., Ezaki, T., and Doyle, D.A. (2003). Crystal structure of the potassium channel KirBac1.1 in the closed state. Science 300, 1922–1926.
Reuveny, E., Slesinger, P.A., Inglese, J., Morales, J.M., Iñiguez-Lluhi, J.A., Lefkowitz, R.J., Bourne, H.R., Jan, Y.N., and Jan, L.Y. (1994). Activation of the cloned muscarinic potassium channel by G protein beta gamma subunits. Nature 370, 143–146.
Spassova, M., and Lu, Z. (1998). Coupled ion movement underlies rectification in an inward-rectifier K+ channel. J. Gen. Physiol. 112, 211–221.
Sui, J.L., Chan, K.W., and Logothetis, D.E. (1996). Na+ activation of the muscarinic K+ channel by a G-protein-independent mechanism. J. Gen. Physiol. 108, 381–391.
Lesage, F., Guillemare, E., Fink, M., Duprat, F., Heurteaux, C., Fosset, M., Romey, G., Barhanin, J., and Lazdunski, M. (1995). Molecular properties of neuronal G-protein-activated inwardly rectifying K+ channels. J. Biol. Chem. 270, 28660–28667.
Sui, J.L., Petit-Jacques, J., and Logothetis, D.E. (1998). Activation of the atrial KACh channel by the betagamma subunits of G proteins or intracellular Na+ ions depends on the presence of phosphatidylinositol phosphates. Proc. Natl. Acad. Sci. USA 95, 1307–1312.
Lin, Y.W., MacMullen, C., Ganguly, A., Stanley, C.A., and Shyng, S.L. (2006). A novel KCNJ11 mutation associated with congenital hyperinsulinism reduces the intrinsic open probability of beta-cell ATP-sensitive potassium channels. J. Biol. Chem. 281, 3006–3012.
Tao, X., Avalos, J.L., Chen, J., and MacKinnon, R. (2009). Crystal structure of the eukaryotic strong inward-rectifier K+ channel Kir2.2 at 3.1 Å resolution. Science 326, 1668–1674.
Logothetis, D.E., Kurachi, Y., Galper, J., Neer, E.J., and Clapham, D.E. (1987). The beta gamma subunits of GTP-binding proteins activate the muscarinic K+ channel in heart. Nature 325, 321–326.
Long, S.B., Campbell, E.B., and MacKinnon, R. (2005). Crystal structure of a mammalian voltage-dependent Shaker family K+ channel. Science 309, 897–903.
Long, S.B., Tao, X., Campbell, E.B., and MacKinnon, R. (2007). Atomic structure of a voltage-dependent K+ channel in a lipid membrane-like environment. Nature 450, 376–382.
Vagin, A., and Teplyakov, A. (2000). An approach to multi-copy search in molecular replacement. Acta Crystallogr. D Biol. Crystallogr. 56, 1622–1624.
Wickman, K.D., Iñiguez-Lluhi, J.A., Davenport, P.A., Taussig, R., Krapivinsky, G.B., Linder, M.E., Gilman, A.G., and Clapham, D.E. (1994). Recombinant G-protein beta gamma-subunits activate the muscarinic-gated atrial potassium channel. Nature 368, 255–257.
Yang, J., Jan, Y.N., and Jan, L.Y. (1995). Control of rectification and permeation by residues in two distinct domains in an inward rectifier K+ channel. Neuron 14, 1047–1054.
Lopatin, A.N., Makhina, E.N., and Nichols, C.G. (1994). Potassium channel block by cytoplasmic polyamines as the mechanism of intrinsic rectification. Nature 372, 366–369.
Yokogawa, M., Osawa, M., Takeuchi, K., Mase, Y., and Shimada, I. (2011). NMR analyses of the Gbetagamma binding and conformational rearrangements of the cytoplasmic pore of G protein-activated inwardly rectifying potassium channel 1 (GIRK1). J. Biol. Chem. 286, 2215–2223.
Lüscher, C., and Slesinger, P.A. (2010). Emerging roles for G protein-gated inwardly rectifying potassium (GIRK) channels in health and disease. Nat. Rev. Neurosci. 11, 301–315.
Zhang, H., He, C., Yan, X., Mirshahi, T., and Logothetis, D.E. (1999). Activation of inwardly rectifying K+ channels by distinct PtdIns(4,5)P2 interactions. Nat. Cell Biol. 1, 183–188.
A Pseudoatomic Model of the Dynamin Polymer Identifies a Hydrolysis-Dependent Powerstroke
Joshua S. Chappie,1 Jason A. Mears,2,3 Shunming Fang,3 Marilyn Leonard,4 Sandra L. Schmid,4 Ronald A. Milligan,4 Jenny E. Hinshaw,3,* and Fred Dyda1,*
1Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, NIH, Bethesda, MD 20892, USA
2Department of Pharmacology, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
3Laboratory of Cell Biochemistry and Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD 20892, USA
4Department of Cell Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
*Correspondence: [email protected] (J.E.H.), [email protected] (F.D.)
DOI 10.1016/j.cell.2011.09.003
SUMMARY
The GTPase dynamin catalyzes membrane fission by forming a collar around the necks of clathrin-coated pits, but the specific structural interactions and conformational changes that drive this process remain a mystery. We present the GMPPCP-bound structures of the truncated human dynamin 1 helical polymer at 12.2 Å and a fusion protein, GG, linking human dynamin 1’s catalytic G domain to its GTPase effector domain (GED) at 2.2 Å. The structures reveal the position and connectivity of dynamin fragments in the assembled structure, showing that G domain dimers only form between tetramers in sequential rungs of the dynamin helix. Using chemical crosslinking, we demonstrate that dynamin tetramers are made of two dimers, in which the G domain of one molecule interacts in trans with the GED of another. Structural comparison of GG(GMPPCP) to the GG transition-state complex identifies a hydrolysis-dependent powerstroke that may play a role in membrane-remodeling events necessary for fission.
INTRODUCTION
Clathrin-mediated endocytosis (CME) is a highly regulated pathway wherein nutrients, growth factors, and macromolecules are concentrated in invaginating clathrin-coated pits (CCPs) that pinch off to form vesicles to carry these cargo into the cell (McMahon and Boucrot, 2011). The large, multidomain GTPase dynamin assembles into collars at the necks of deeply invaginated CCPs to catalyze membrane fission in the final stages of CME (Mettlen et al., 2009; Schmid and Frolov, 2011). Purified dynamin exists as a tetramer (Muhlberg et al., 1997) that can self-assemble into helical structures reminiscent of collars observed in vivo (Hinshaw and Schmid, 1995). Dynamin encodes five domains (Figure S2A available online): a catalytic
G domain; a middle domain involved in self-assembly and oligomerization; a membrane-binding pleckstrin homology (PH) domain; a GTPase effector domain (GED); and a C-terminal proline- and arginine-rich domain (PRD) that binds SH3 domains of accessory proteins important for CME (Praefcke and McMahon, 2004; Mettlen et al., 2009) but is not essential for GTPase activities or oligomerization in vitro (Muhlberg et al., 1997). Aside from the PRD, structures of all of dynamin’s individual domains or their homologs have been solved by crystallography (Figure S2A). These include the human dynamin 1 PH domain (Ferguson et al., 1994; Timm et al., 1994), the G domains of rat dynamin (Reubold et al., 2005) and Dictyostelium dynamin A (Niemann et al., 2001), the middle domain and GED of the related interferon-induced GTPase MxA (Gao et al., 2010), and a fusion linking the C terminus of human dynamin 1’s GED (CGED) to its G domain (GG) (Chappie et al., 2010). Crystallographic and biochemical studies have shown that the CGED forms a three-helix bundle with the N and C termini of the G domain (NGTPase and CGTPase, respectively) (Figure S2B) and that this module, the bundle-signaling element (BSE), transmits the conformational changes associated with dynamin assembly to the G domain (Chappie et al., 2009, 2010). However, as the BSE was structurally characterized in the context of the GG fusion, it is not known whether CGED’s interaction with the G domain occurs in cis within the same polypeptide or in trans via another polypeptide in the dynamin tetramer. Dynamin has a low affinity for guanine nucleotides (10–100 μM) and a high basal turnover (0.4–1 min⁻¹) (Praefcke and McMahon, 2004). Assembly into helical oligomers stimulates dynamin’s basal GTPase activity >100-fold (Warnock et al., 1996; Stowell et al., 1999).
This enhancement arises from G domain dimerization, which optimally positions dynamin’s catalytic machinery and stabilizes conformationally flexible switch regions (Chappie et al., 2010). Mutations that impair GTP binding, assembly, or stimulated GTP hydrolysis also cause defects in endocytic uptake in vivo (reviewed in Schmid and Frolov, 2011), thus establishing the importance of dynamin’s GTPase activities in CME.
Cell 147, 209–222, September 30, 2011 © 2011 Elsevier Inc.
Despite its essential role in CME, the mechanism of dynamin-catalyzed membrane fission remains poorly understood. Efforts
to recapitulate these activities in vitro using synthetic membranes suggested that dynamin functions as a mechanochemical enzyme that actively severs the membrane via hydrolysis-dependent conformational changes (Sweitzer and Hinshaw, 1998; Stowell et al., 1999; Chen et al., 2004; Mears et al., 2007; Roux et al., 2006) that generate a constricted neck and impose strain on the membrane lipids (Bashkirov et al., 2008; Roux et al., 2010). GTP hydrolysis also promotes partial dissociation of dynamin subunits from membranes (Danino et al., 2004; Ramachandran and Schmid, 2008; Pucadyil and Schmid, 2008; Bashkirov et al., 2008). Loosening of the dynamin scaffold could allow local lipid rearrangements and an energetically favorable hemifission intermediate that promotes nonleaky membrane scission (Bashkirov et al., 2008; Schmid and Frolov, 2011). The hydrolysis-dependent conformational changes that trigger these membrane-remodeling events have yet to be elucidated. Unraveling the mechanisms governing dynamin-catalyzed membrane fission requires a detailed structural understanding of the architecture of assembled dynamin and the conformational changes induced by stimulated GTP hydrolysis. Dynamin’s propensity to form helical arrays in vitro has previously been exploited for cryo-electron microscopy (cryo-EM) structure determination. Three-dimensional reconstructions of truncated dynamin 1 (ΔPRD, Figure S2A) polymers assembled on anionic lipid scaffolds have been obtained both in the absence of nucleotides (Chen et al., 2004) and in the presence of the nonhydrolyzable GTP analog GMPPCP (Zhang and Hinshaw, 2001). In both cases, the asymmetric unit of assembly is a dimer that adopts a T shape when viewed in cross-section (“T view”). The structural differences between these maps suggest that rearrangements in the middle domain and GED mediate a nucleotide-dependent constriction of the ΔPRD assembly (Chen et al., 2004).
Constriction alone, however, is not sufficient for membrane fission (Ramachandran and Schmid, 2008; Bashkirov et al., 2008), suggesting that additional conformational changes are required. Although it has been inferred that the middle domain and GED form a coiled-coil “stalk” that connects the PH domain “leg” to the G domain “head” (Zhang and Hinshaw, 2001; Chen et al., 2004), neither their organization nor their connectivity in the polymer is known, owing to the low resolution (>20 Å) of the ΔPRD reconstructions and the lack of a complete, atomic-resolution dynamin structure. These limitations have also hindered our understanding of how assembly promotes G domain dimerization, leading to stimulated GTP hydrolysis and membrane fission. To address these issues, we have used cryo-EM to extend the resolution of the constricted ΔPRD polymer map and employed computational docking and biochemistry to define the underlying subunit interactions. We also present the crystal structure of GG in complex with GMPPCP, which identifies a major hydrolysis-dependent BSE conformational change. Our results provide insights into how dynamin assembly directly facilitates G domain dimerization and stimulated turnover and suggest how the energy of this dimerization and GTP hydrolysis can be converted into large structural movements that may play a role in precipitating membrane fission.
RESULTS
12.2 Å Cryo-EM Reconstruction of ΔPRD in the Constricted State Reveals Additional Structural Features of the Assembled Dynamin Polymer
Our initial attempt to characterize GMPPCP-bound, constricted ΔPRD tubes using cryo-EM and Fourier-Bessel synthesis produced an 18 Å resolution reconstruction (Wilson-Kubalek et al., 2010) that displayed only minor differences compared to previously published structures (Zhang and Hinshaw, 2001; Chen et al., 2004; Wilson-Kubalek et al., 2010). The resolution was limited by variations in the tube diameter, which produced long-range disorder and diminished the overall diffracting power. To circumvent this, we segmented the tubes into individual, overlapping particles that were then aligned, classified, sorted, and averaged with the iterative helical real-space reconstruction (IHRSR) algorithm (Egelman, 2007) (Figure S1A–S1C). This single-particle-based approach produced a 12.2 Å helical map (Figure 1A; Figure S1D) that has an inner lumenal diameter of 7 nm, an outer diameter of 40 nm, 13.2 subunits per turn, and a pitch of 99.3 Å. The improved resolution reveals additional structural features of the ΔPRD polymer. First, the stalk density, which constitutes the base of the characteristic “T view” (Figure 1B; Movie S1), appears to twist in a crisscross fashion (Figures 1B and 1C), intersecting just below the cleft that separates the “head” density regions along the exterior of the polymer. Second, there are two additional strips of density within the cleft that wrap around the tube (Figure 1D, highlighted with dashed boxes). Each strip forms a continuous connection with the alternating head densities of a single helical rung.
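The helical parameters quoted above (13.2 subunits per turn, 99.3 Å pitch) imply a fixed axial rise and azimuthal rotation per subunit. The following Python sketch derives both and places idealized subunit centers on a helix. The function names are hypothetical, and the 200 Å radius is an assumption taken as half of the reported 40 nm outer diameter, used purely for illustration; this is not the IHRSR procedure itself.

```python
import math


def helical_symmetry(subunits_per_turn, pitch):
    """Return (axial rise per subunit, rotation per subunit in degrees)."""
    rise = pitch / subunits_per_turn
    twist = 360.0 / subunits_per_turn
    return rise, twist


def subunit_positions(n, radius, subunits_per_turn, pitch):
    """Idealized (x, y, z) centers of n consecutive subunits on a helix."""
    rise, twist = helical_symmetry(subunits_per_turn, pitch)
    coords = []
    for i in range(n):
        phi = math.radians(i * twist)  # cumulative azimuthal angle
        coords.append((radius * math.cos(phi), radius * math.sin(phi), i * rise))
    return coords


# Values reported in the text: 13.2 subunits per turn, 99.3 Angstrom pitch.
rise, twist = helical_symmetry(13.2, 99.3)
print(f"rise = {rise:.2f} A per subunit, twist = {twist:.2f} degrees per subunit")

# Assumed radius: half of the 40 nm outer diameter, expressed in Angstroms.
points = subunit_positions(14, 200.0, 13.2, 99.3)
```

Because 13.2 is not an integer, a subunit does not sit directly above the one a full turn below it, which is why the sketch steps subunit by subunit rather than rung by rung.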
Docking of Crystallized Dynamin Fragments Illustrates Ambiguities in Structural Models
To decipher the subunit organization of the dynamin polymer, we docked the crystal structures of the GDP.AlF4-stabilized GG dimer (GG(GDP.AlF4); PDB 2X2E), the human MxA middle/GED stalk (PDB 3LJB), and the human dynamin 1 PH domain (PDB 1DYN) into our improved ΔPRD reconstruction (Figure 2A). The MxA stalk structure shares a high degree of sequence homology (19.5% identical, 54.9% similar) with dynamin’s middle domain and GED (Data S1) and currently represents the best structural model for these domains. Attempts to dock GG(GDP.AlF4) as a dimer failed as one monomer always grossly protruded from the density, regardless of its orientation (Figure S3A). The GG(GDP.AlF4) dimer from an alternate crystal form (PDB 2X2F) exhibited the same discrepancies (data not shown). We therefore selected only one monomer for docking (monomer A from PDB 2X2E), which allowed more degrees of freedom during the fitting procedures. We similarly positioned the MxA stalks individually, as the crystallized assembly could only be fit into a previously published 23 Å ΔPRD map after a significant rotation between adjacent pairs of monomers (Gao et al., 2010). Fitting was carried out using YUP (Tan et al., 2006, 2008) as described in the Experimental Procedures. In total, 8 GG monomers, 12 MxA monomers, and 8 PH domains were positioned into the cryo-EM density. In agreement with previous biochemical data and structural modeling (Chen et al., 2004; Mears et al., 2007), the PH domain is situated in the “leg” density adjacent to the
Figure 1. 12.2 Å Reconstruction of ΔPRD in the Constricted State Reveals New Structural Features of the Assembled Dynamin Polymer (A) Structure of the ΔPRD polymer. Two density thresholds of the ΔPRD map are shown: the lower threshold is colored gray, and the higher threshold is in mesh and colored radially to denote the locations of the “leg” (orange), “stalk” (blue), and “head” (green) regions. Left panel shows a side view of the decorated helical tube oriented perpendicular to the helical axis. A section of the tube’s outer surface has been removed to show the interior of the structure in cross-section. The membrane bilayer (M), inner lumenal diameter, outer tube diameter, and pitch are labeled. Black box denotes section of map highlighted in (B). Right panel is an end-on view of the tube looking down the helical axis that is rotated 90° relative to the view on the left. (B) Cross-section through ΔPRD polymer; the classical “T view” of dynamin subunits within individual helical rungs. The leg, stalk, head, and membrane bilayer (M) density regions are labeled and colored as in (A). The cleft separating head densities within the same helical rung is labeled. Dashed lines (1–4) indicate the locations of planar slices through a single helical rung that are shown in (C) with orientation defined by black arrow. (C) Sequential planar sections through a single helical rung show the crisscross twisting of the dynamin stalk density. Black circles highlight intersection point of stalk density. Black dashed boxes in section 1 highlight the additionally resolved strips of density visible in the cleft between G domains in the same helical rung. (D) Two additional strips of density (dashed red and black boxes) are visible in the cleft along the exterior of the structure and form continuous connections with the head densities of a single helical rung. Subunits belonging to alternating helical rungs are labeled in red and black for distinction. View is rotated 90° relative to (B). See also Figure S1 and Movie S1.
Figure 2. Computational Docking of Crystallized Fragments Derived from Dynamin Family Members (A) Crystal structures of isolated domains from different dynamin family members. From left to right: GDP.AlF4-stabilized dimer of the GTPase-GED fusion (GG) from human dynamin 1 (PDB 2X2E, green), including the G domain and the BSE; stalk dimer from human MxA (PDB 3LJB, blue), which includes the middle domain and GED; dynamin 1 PH domain (PDB 1DYN, orange). (B) Computational docking of structures in (A) into the ΔPRD map (gray). A section of the ΔPRD tube is shown in an end-on view looking down the helical axis to highlight the positions of the docked fragments relative to the membrane bilayer (M). Solid vertical black line indicates orientation of the cross-section plane that is rotated by 90° in (C). Dashed black lines denote the planar sections that are rotated by 90° and shown in (D), (H), and (F). (C) T view cross-section illustrating the positioning of domains within a single helical rung. M denotes membrane bilayer. (D and E) Docking of PH domain monomers. (D) shows orientation of PH domains within the same helical rung. (E) depicts asymmetry of PH domain fitting. Variable loop 1 (VL1) is labeled. Orientation is the same as in the T view in (C). (F and G) Fitting of MxA stalk monomers. (F) shows a zoomed-in top view perpendicular to the membrane bilayer. Four MxA stalk monomers (colored blue and purple) are shown. Note that portions of each MxA monomer protrude from the ΔPRD density map (yellow boxes). (G) is rotated 90° and shows a side view of the stalk monomers in the same orientation as shown in (B). Residues corresponding to the putative dynamin proline hinge are labeled in the MxA structure (yellow spheres). (H–J) Two possible fittings of GG(GDP.AlF4) monomers (green and red) viewed either from the top (H) or the side (I). The different orientations are related by a 180° rotation about an axis parallel to the membrane surface (H) and result in the CGED helix of the BSE (yellow) facing either up (green) or down (red). Each fitting generates a different connection with the stalk (I), resulting in subunits that are either “extended” or “kinked” (J). See also Figure S2, Data S1, and Figure S3.
plasma membrane, the middle/GED fragment inhabits the interior “stalk” density, and the G domain occupies the exterior “head” density of the tube (Figures 2B and 2C). It should be noted that, in our model, the density in the T view cross-section represents the interaction of four different MxA stalk monomers (Figure S3B).
At the membrane surface, the PH domains are arranged as dimers within the same helical rung (Figures 2C–2E). The density within this region, however, is asymmetric, resulting in nonequivalent orientations for each of the neighboring monomers (Figure 2E). Our confidence in this fitting is strengthened by the fact that in both PH domains variable loop 1—shown by
Figure 3. Structure of GG(GMPPCP) (A) Domain structure of the G domain-GED (GG) fusion protein from human dynamin 1 and the structure of its complex with GMPPCP. G domain cores are in yellow and light blue, and the three helices of the BSE are shown as NGTPase, purple; CGTPase, purple; CGED, green. A highly conserved flexible hinge region (BSE hinge, red) connects the BSE to the G domain core. GMPPCP molecules (green) and active-site waters (red spheres) are shown. (B and C) Structural comparison of the GG(GMPPCP) (B, yellow) and GG(GDP.AlF4) (C, dark blue, PDB 2X2E) active sites. The elements of the catalytic machinery are labeled in each structure along with the bound nucleotide (green), Mg2+ ion (green spheres), and catalytic and bridging waters (red spheres, H2Ocat and H2Obridge, respectively). The AlF4 (gray) and charge-compensating sodium ion (purple sphere) are shown for GG(GDP.AlF4). An additional water molecule (H2Occ, red sphere) occupies the ion-binding site in GG(GMPPCP). The trans-stabilizing loop and catalytic D180 residue from the adjacent monomer (colored light blue and labeled with subscript “B”) are shown at the top of each panel. Dashed black lines indicate hydrogen bonds. See also Figure S4.
fluorescence quenching experiments to penetrate the outer leaflet of PIP2-containing bilayers (Ramachandran and Schmid, 2008)—points into the lipid bilayer density as expected. Although MxA middle/GED monomers match the overall shape of the stalk region density, a portion of these structures protrudes from the map (Figures 2F and 2G, yellow boxes). Where they diverge, the human MxA model contains two prolines (P468 and P597) and a threonine (T416) in helices α2, α4, and α1C, respectively (Figure 2G, yellow spheres). Human dynamin 1 instead contains three highly conserved prolines (Data S1), and we speculate that these residues form a flexible hinge that would allow the dynamin stalk to kink downward into the density and connect to the PH domain below. We also observe an unfilled segment of density beneath each docked MxA stalk model that is continuous with the PH domain density below (Figure S3C). This is not unexpected as the dynamin fragment structures are missing the amino acids (58 in total) that link the middle domain to the PH domain (residues 487–517) and the PH domain to the GED (residues 631–657) (Data S1 and Figure S2), which likely occupy this density. The absence of these connections in our model prohibits us from defining the stalk-PH domain connectivity unambiguously. Our docking yields two equally viable fittings for the GG(GDP.AlF4) monomers (Figures 2H and 2I, green versus red). Although both place the globular G domain core into the head density and the BSE into the additional strips of density in the cleft (Figure 1D), their relative orientations differ by a 180° rotation around an axis parallel to the plasma membrane (Figure 2H). In one orientation, the CGED helix is on top (Figure 2H, green), whereas in the other, the CGTPase helix is on top (Figure 2H, red). Each orientation creates a different connectivity between the G domain and the stalk below (Figure 2I), producing two possible subunit arrangements (Figure 2J): long and extended (green) or short and kinked (red). Each imposes a different set of constraints on dynamin assembly and implies different structural contacts between neighboring subunits in the polymer.
Structure of GMPPCP-Bound GG Identifies a Major BSE Conformational Change
We hypothesized that the uncertainty associated with docking GG(GDP.AlF4) monomers into the ΔPRD map may reflect nucleotide-dependent conformational differences between the crystallized GG dimer, stabilized by the transition-state mimic GDP.AlF4, and ΔPRD dynamin in the assembled polymer, stabilized by the ground-state analog GMPPCP. To address this problem, we solved the crystal structure of GG in complex with GMPPCP (GG(GMPPCP)) at 2.2 Å (Figure 3A). Although GG(GMPPCP) is entirely monomeric when analyzed by size-exclusion chromatography (Chappie et al., 2010) and analytical ultracentrifugation (Figures S4A and S4B), in the crystal it forms a dimer similar to that of the transition-state complex, presumably due to the high protein concentration during crystallization. One molecule of GMPPCP is bound to each active site along with a single Mg2+ ion that is coordinated by S45, T65, and the β- and γ-phosphates (Figure 3B). As in the GG(GDP.AlF4) structure (PDB 2X2E) (Figure 3C), we resolve the catalytic water, appropriately positioned for an in-line nucleophilic attack on the γ-phosphate, and the adjacent bridging water, which contacts the conserved Q40 side chain (Figure 3B). Unlike many small
G proteins, dynamin does not use an ‘‘arginine finger’’ side chain to compensate for the developing negative charge in the transition state (Scheffzek et al., 1998); rather, the positive charge is supplied by a monovalent cation, whose binding is stabilized by G domain dimerization (Chappie et al., 2010). Significantly, this cation is absent in GGGMPPCP as GMPPCP’s b-g methylene connection does not provide the necessary hydrogen-bonding interactions required to complete the ion coordination sphere. Instead, a water molecule (H20cc, Figure 3B) occupies the ion-binding site but is shifted 1.7 A˚ relative to the sodium observed in GGGDP.AlF4 (Figure 3C). H20cc is coordinated by the carbonyls of G60 and G62 and the S41 side chain, which rotates 90 to accommodate the offset from the transition-state complex. As a consequence, the hydrogen bond across the dimer interface between S41 and D180 is broken. The other facets of the nucleotide-binding and catalytic machineries remain essentially unchanged. The major structural difference between the ground-state GGGMPPCP and GGGDP.AlF4 transition-state complexes (Figure 4A) is a 68.81 rigid-body rotation of the BSEs downward about an axis perpendicular to the CGTPase helix coupled with a slight counterclockwise twist (Figure 4B; Movie S2 and Movie S3). This brings each BSE close to the b sheet of the G domain core and results in a more compact transition-state dimer, reducing its radius of gyration from 32.9 A˚ to 30.9 A˚. Residues between H288 and G295 (Figure 3A and Figure 4, red)— previously identified as a flexible hinge (Chappie et al., 2010)— and residues at the start of the G domain core (P32 and Q33) serve as the pivot points for these motions. Whereas the P loop is essentially unchanged, helix a2 tilts toward the active site (Figure 4C). The downstream end of switch 1 (residues 59–68) shifts 1 A˚. 
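A rigid-body rotation such as the BSE swing reported above (roughly 68.8 degrees) is usually quantified by superposing matching atoms from the two conformations and reading the angle off the fitted rotation matrix. The sketch below shows one common way to do this, the Kabsch least-squares superposition, on synthetic coordinates. It is an illustration only and not the procedure used in this study; the function names and the test data are assumptions introduced here.

```python
import numpy as np


def kabsch_rotation(P, Q):
    """Least-squares rotation R such that R @ (p - mean(P)) ~ (q - mean(Q)).

    P and Q are (N, 3) arrays of corresponding coordinates (hypothetical
    inputs, e.g. matching C-alpha atoms from two conformations).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Pc = P - P.mean(axis=0)          # remove translation
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                    # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T


def rotation_angle_deg(R):
    """Rotation angle (degrees) encoded by a proper 3x3 rotation matrix."""
    cos_theta = (np.trace(R) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))


def axis_angle_matrix(axis, angle_deg):
    """Rodrigues formula: rotation by angle_deg about the given axis."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    t = np.radians(angle_deg)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(t) * K + (1.0 - np.cos(t)) * (K @ K)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = rng.normal(size=(40, 3))                       # synthetic "atoms"
    R_true = axis_angle_matrix([1.0, 2.0, 3.0], 68.8)  # known rotation
    Q = P @ R_true.T + np.array([5.0, -2.0, 1.0])      # rotate, then shift
    print(rotation_angle_deg(kabsch_rotation(P, Q)))
```

In practice the superposition would be computed on the G domain core and the fitted rotation then evaluated for the BSE atoms, so that the angle describes the motion of the BSE relative to the core rather than of the whole molecule.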
The size of the changes increases toward the β sheet with a 3.5 Å shift at the upstream end of switch 1 at G53 and culminating in a 4.5 Å shift at the tip of the sheet affecting the connecting β23 and β45 loops (Figure 4C, arrows). Moving toward the transition state, the net effect of these changes is a rotation of the central β sheet (Movie S2) and tightening of the hydrophobic packing within the G domain core (Figure S4C), which brings R54, E79, and S126 into hydrogen-bonding distance (Figure S4C). This may also help stabilize switch II as the cis-stabilizing loop (Chappie et al., 2010) shifts nearly 2 Å (Figure 4C). The repositioning of elements within the core reconfigures the outer face of the β sheet and facilitates the formation of salt bridges and hydrophobic interactions with the NGTPase helix that anchor the BSE (Figure 4D). Additional stabilization is provided by the NGTPase linker (residues 22–31), which partially reconfigures into a short helix and contacts the BSE’s hydrophobic core via residues I23, L29, and L31 (Figure 4E).

Docking of GGGMPPCP Reveals Putative G Domain-Stalk Connectivity

We next asked whether docking GGGMPPCP into our DPRD cryo-EM map could distinguish between the two possibilities for the G domain-stalk connection (Figure 2J). The fitting approach described above was expanded to include 48 GGGMPPCP monomers, 24 MxA stalk monomers, and 24 PH domains, nearly two complete turns of the DPRD helix (Figure 5A). The ambiguity we previously encountered when fitting the GGGDP.AlF4 monomers
(Figure 2J) is now absent in the resulting model, as GGGMPPCP adopts a single preferred orientation in the DPRD map (Figure 5B). This is due to the different BSE conformations relative to the G domain core in the two GG structures. The BSEs are oriented with the CGED helices on top (Figure 5B) and occupy the cleft density strips (Figures 5B–5D) that encircle the exterior of the map within each rung of the dynamin helix (Figure 1D). This positions the ends of the CGTPase and CGED helices close to N and C termini of the stalk (Figure 5C, Nstalk and Cstalk), allowing these segments to connect via two short stretches of amino acids that are missing from the docked crystal structures: residues 311–320 and residues 722–725. The physical constraints of these connections and the docking indicate that the underlying dynamin subunits must adopt an extended conformation within the assembled polymer (Figure 5C and Figure 2J). In this configuration, the G domains in adjacent helical rungs are poised to form the productive dimers that were identified by crystallography and are needed for dynamin’s stimulated GTPase activity (Chappie et al., 2010) (Figure 5D). Unlike the crystallized GG dimers, these docked GGGMPPCP monomers are slightly separated, consistent with our findings that G domain dimerization only occurs in the presence of transition-state mimics and not with ground-state analogs such as GMPPCP (Figures S4A and S4B) (Chappie et al., 2010). A similar docking procedure using a homology model for the dynamin 1 middle/GED stalk rather than the MxA structure yielded the same overall fitting and extended subunit arrangement (data not shown).

CGED Is Domain Swapped in Full-Length Dynamin

Although GG’s CGED helix mimics dynamin’s G domain-GED interactions, its minimal nature does not distinguish whether GED’s association with the G domain in the dynamin tetramer occurs in cis within the same polypeptide or is contributed by another polypeptide in trans (Figure 6A).
We therefore used chemical crosslinking to resolve this ambiguity. Two cysteine mutations (R15C in NGTPase/R730C in CGED), previously shown to enable efficient crosslinking of GG’s N and C termini by a short (3.6 Å), cysteine-reactive bifunctional crosslinker (MTS-1-MTS) (Chappie et al., 2009), were introduced into a reactive-cysteine-less version of dynamin (DynRCL) to examine G domain-GED interactions in the tetramer. The resulting protein (DynRCL R15C/R730C) shows normal GTPase activity (Figures S5A and S5B) and migrates similarly to wild-type (WT) Dyn when analyzed by nonreducing SDS-PAGE (Figure 6D). Like WT-Dyn (Muhlberg et al., 1997), DynRCL R15C/R730C predominantly generates a tetrameric species when incubated with the general amine-reactive bifunctional crosslinker BS3 (Figure 6B). In contrast, specific G domain-GED crosslinking of DynRCL R15C/R730C by cysteine-reactive MTS-1-MTS predominantly generates a dimer (Figure 6B). Importantly, we did not detect any faster-migrating species indicative of intrapolypeptide or in cis crosslinking. For both reagents, the crosslinking efficiency of the predominant species was unaffected by protein concentration (Figure 6B), consistent with intratetramer or in trans crosslinking. This was confirmed by size-exclusion chromatography of the crosslinked species, which eluted as a tetramer (Figures S5C–S5E). Finally, DynRCL R15C/R730C was subjected to limited proteolysis with Lys-C, which cleaves sites bordering the PH
Figure 4. Structure of GGGMPPCP Identifies a Hydrolysis-Dependent BSE Conformational Change
(A) Structural superposition of GGGMPPCP (yellow) and GGGDP.AlF4 (blue, PDB 2X2E). Note the different conformations of the BSE in each structure. The BSE hinge is colored red.
(B) Hydrolysis-dependent BSE conformational change. Left panel is superposition of GGGMPPCP and GGGDP.AlF4 monomers. The NGTPase, CGTPase, and CGED helices of the BSE are labeled. Black arrow depicts 68.81° downward rotation of the BSE in the transition-state complex. Middle and right panels depict different views of the monomer superposition. Small black arrow in middle panel describes the slight counterclockwise twist that is coupled to the downward rotation of the BSE; black arrow in right panel describes combined translocation of the BSE.
(C) BSE movement induces structural changes in the central β sheet of the G domain core. β strands 2–5, the α2 helix, and the β23 and β45 loops are labeled. Black arrows illustrate how these segments shift to accommodate the BSE. The GMPPCP (green), Mg2+ (green sphere), and active-site waters (red spheres) from the GGGMPPCP structure are shown.
(D) Structural interactions between the BSE and G domain core in the GGGDP.AlF4 transition-state complex. Residues contributing to salt bridges and hydrophobic interactions are shown. Black dashed lines are hydrogen bonds.
(E) Structural changes of the NGTPase linker. The linker reconfigures into a short helix, allowing I23, L29, and L31 to form stabilizing interactions with the BSE hydrophobic core. See also Figure S4 and Movie S2 and Movie S3.
domain (Figures 6C and 6D) (Muhlberg et al., 1997). Western blotting with G domain- or GED-specific antibodies confirmed that each of the higher-molecular-weight crosslinked species, but none of the lower-molecular-weight bands, contains both the G domain and the GED (Figure 6D, α-GTPase and α-GED, respectively). Together these data establish that the GED from one polypeptide docks on the G domain of an adjacent polypeptide to form a domain-swapped full-length dynamin dimer, two of
Figure 5. Docking of GGGMPPCP Reveals Putative G Domain-Stalk Connectivity (A) Docked model of assembled DPRD polymer in the constricted state. GGGMPPCP monomers are colored green, middle/GED stalk monomers are colored blue, PH domains are colored orange, and the GMPPCP-bound DPRD reconstruction is rendered in gray. A side view is shown perpendicular to the helical axis. (B) Comparison of GGGDP.AlF4 (top) and GGGMPPCP (bottom) monomer dockings. GGGMPPCP yields a single, preferred orientation where the CGED helix of the BSE (yellow) is on top. Monomers are shown in the same orientation as in (A). (C) Zoomed side (upper panel, perpendicular to helical axis) and end (lower panel, looking down helical axis) views highlighting the putative G domain-stalk connectivity. Labels: CGTPase, C terminus of G domain; Nstalk, N terminus of the middle domain; Cstalk, C-terminal portion of GED present in MxA crystal structure; CGED, beginning of C-terminal GED helix present in GGGMPPCP structure. The proximity of CGTPase to Nstalk and Cstalk to CGED suggests that dynamin subunits adopt an ‘‘extended’’ conformation. (D) GGGMPPCP monomers in adjacent helical rungs are poised for dimerization (dashed black lines). The BSEs occupy the cleft densities above the stalk on the exterior of the map (black arrows).
which associate through middle/GED stalk interactions to form the dynamin tetramer.

Membrane-Bound Structure of the Dynamin Tetramer

Our docking suggests two possible architectures for this full-length domain-swapped dynamin dimer (Figure 7A; Figure S6A). Swapping the entire GED would produce a long, m-shaped dimer (Figure 7A). Alternatively, exchanging only the CGED helix
would result in a short, x-shaped dimer (Figure S6A). The two dimers differ in the relative placement of the PH domains and the intermonomer interfaces. In the long dimer, the PH domains are close enough to allow complete GED exchange, whereas the stalks are separated from their partner in the other monomer (Figure 7A). In the short dimer, this situation is reversed: the structure is stabilized by a back-to-back stalk interaction that forces the PH domains to be splayed apart (Figure S6A). Despite
Figure 6. Dynamin Tetramer Is a Dimer of Domain-Swapped Dimers
(A) Cartoons illustrating possible G domain-GED interactions in full-length dynamin. Domains are colored as in (C). Black “X”s denote expected crosslinks for each scenario.
(B) Chemical crosslinking of DynRCL R15C/R730C. Targeted crosslinking of the engineered cysteine residues in the NGTPase and CGED by MTS-1-MTS produces a prominent dimeric species, whereas nonspecific crosslinking of surface-reactive amines by BS3 primarily yields a tetramer. Crosslinking efficiency of the predominant species was unaffected by protein concentration.
(C) Cleavage products of Lys-C limited proteolysis (Muhlberg et al., 1997).
(D) Lys-C limited proteolysis and MTS-1-MTS crosslinking of WT and DynRCL R15C/R730C. Left panel shows Coomassie-stained gel of proteolyzed and/or crosslinked products; right panel shows western blotting of the same species. α-GTPase and α-GED are primary antibodies recognizing dynamin’s N terminus (residues 2–17) and GED, respectively. See also Figure S5.
these differences, both dimers use the same stalk interface to form a tetramer (Figure 7B; Figures S6B and S6C). Mutations in this “assembly interface” (Figure 7C; Table S1), including R399 and I690 in dynamin 1 (Sever et al., 2006; Ramachandran et al., 2007), R408, G392, and Y440-R444 in human MxA (Gao et al., 2010), and G385 in S. cerevisiae Dnm1 (Ingerman et al., 2005), shift the tetrameric state of these dynamin family members to stable dimers. This interface also provides stabilizing interactions between tetramers in our polymer structure, which may explain the cooperativity observed for membrane-mediated dynamin assembly (Stowell et al., 1999) and the assembly defects exhibited by dynamin mutant dimers (Song et al., 2004; Ramachandran et al., 2007) (Table S1). Although both of these configurations are consistent with our crosslinking data and with mutagenesis studies defining assembly interfaces, we favor the long dimer for two reasons. First, its shape closely resembles the low-resolution structure of the R399A/I690K mutant dimer revealed by small-angle X-ray scattering (Kenniston and Lemmon, 2010). Second, recent crystallographic studies of the intact DPRD molecule show no
indication of an interpolypeptide exchange of the CGED helix at the top of the molecule (M. Ford and J. Nunnari, personal communication), arguing against the short dimer configuration.

Structural Constraints of G Domain Dimerization

Dynamin’s stimulated GTPase activity arises from the transition-state-dependent dimerization of its G domains (Chappie et al., 2010). This association has been proposed to occur between two dynamin tetramers and be driven by dynamin assembly on the plasma membrane (Chappie et al., 2010; Gao et al., 2010). Our docking model supports this hypothesis. The connectivity we derive from computational fitting (Figure 5A) precludes G domain interactions within a single tetramer (Figure S6C) and between tetramers in the same helical rung; instead, G domain dimers can only form between tetramers in adjacent rungs, regardless of the underlying subunit architecture (Figure 7D; Figure S6D). Assembly of the helical collar beyond a single rung thus primes the dynamin subunits for stimulated turnover. Surprisingly, only 5 long tetramers (10 subunits) (Figure 7D) or 6 short tetramers (12 subunits) (Figure S6D) are needed to partner the
Figure 7. Structural Constraints of G Domain Dimerization and the Dynamin Powerstroke (A) Model for the domain-swapped dynamin dimer. In this configuration, a long dimer is formed by a full GED domain swap. Monomers are colored purple and green. An alternative model also consistent with our crosslinking data is shown in Figure S6. (B) Putative arrangement of membrane-bound long dynamin tetramer derived from the docked model of assembled DPRD polymer (Figure 5). The tetramer is comprised of two of the domain-swapped dimers shown in (A) (colored blue and teal, labeled A and B). Black box indicates assembly interface between these dimers (see also Figure S6B). (C) Structural mapping of mutations that impair dynamin oligomerization. A stalk monomer (teal) is shown in two orientations. Mutations within the putative assembly interface (yellow) are labeled and colored magenta; mutations that also produce assembly defects but are localized outside this interface are also
G domains across helical rungs in the constricted DPRD polymer, indicating that a complete turn of the helix (13 subunits) is not required to facilitate G domain dimerization and stimulated GTPase activity. This observation may explain the inability to detect dynamin collars in vivo unless GTP hydrolysis has been inhibited (Marks et al., 2001; Takei et al., 1995).

DISCUSSION

The Building Blocks of Dynamin Assembly

Here we have combined cryo-EM, X-ray crystallography, computational docking, and biochemistry to provide detailed insights into the structure of assembled dynamin. Our 12.2 Å reconstruction of DPRD dynamin in the GMPPCP-bound constricted state revealed additional density features not observed in previous lower-resolution maps (Zhang and Hinshaw, 2001; Chen et al., 2004), which served as an improved structural framework for computational docking. Guided by this molecular envelope, we successfully localized the G domain, the BSE, the middle/GED stalk, and the PH domain within the polymer assembly. The resulting pseudoatomic model, which incorporates our 2.2 Å GGGMPPCP crystal structure, reveals the putative G domain-stalk connectivity and suggests that the individual dynamin subunits are extended rather than kinked when assembled on a lipid membrane. We cannot yet define the linkages between the middle/GED stalk and the PH domain, as the intervening sequences are absent from currently available crystallographic models.

Our chemical crosslinking demonstrates that the CGED helix from one dynamin polypeptide interacts in trans with the G domain of a second polypeptide, resulting in a domain-swapped dimer. Two of these domain-swapped dimers would then associate via their stalks to form a tetramer. Such an arrangement is consistent with mutations in the middle domain (R361S, R399A) and GED (I690K) that destabilize the dynamin tetramer but generate soluble dimers (Sever et al., 2006; Ramachandran et al., 2007).
An underlying domain-swapped dimer also explains how assembling tetramer subunits could generate a helical structure in which the asymmetric unit is a dimer (Zhang and Hinshaw, 2001). We therefore propose that a domain-swapped dimer is the minimal unit of dynamin assembly, serving as the basic building block for the tetramer in solution and, by extension, the helical assembly on the membrane.

We identified two possible configurations for the domain-swapped dimers and their resulting tetramer counterparts that are consistent with all available data. A caveat of these models is that they represent a membrane-bound, assembly-competent conformation that may be distinct from the conformation of the free tetramer in solution. It is possible that dynamin undergoes a major conformational change upon membrane binding that exposes the assembly interface, allowing the rapid and cooperative association of multiple tetramers. Structural studies suggest that the bacterial dynamin-like protein (BDLP) undergoes a self-propagating transition, where GTP- and membrane-induced expansion of compact diamond-shaped BDLP dimers promotes polymerization (Low and Löwe, 2006; Low et al., 2009). Interestingly, a subset of PH domain mutations linked to centronuclear myopathy (S619L, S619W, and V625 del) have been shown to promote higher-order assembly in the absence of a lipid scaffold (Kenniston and Lemmon, 2010). These changes also result in stimulated GTP hydrolysis (Kenniston and Lemmon, 2010), suggesting that they alleviate the inherent autoinhibition associated with the assembly-incompetent conformation of the tetramer in solution. Conversion between assembly-incompetent and assembly-competent conformations may thus represent a conserved regulatory mechanism common to dynamin family members.
Implications for Dynamin-Catalyzed Membrane Fission

Dynamin assembly and constriction generate high curvature and localized stress (Bashkirov et al., 2008; Ramachandran et al., 2009; Roux et al., 2010) that impose a greater strain on the inner monolayer lipids of a tightly squeezed neck than on those of the outer monolayer (Bashkirov et al., 2008; Schmid and Frolov, 2011). PH domain interactions with the phospholipid head groups and the membrane insertion of variable loop 1 maintain this energetically unfavorable configuration (Ramachandran and Schmid, 2008; Ramachandran et al., 2009), which can be
shown and colored red. Dashed black circle defines assumed location of R399ADyn1 and YRGR440AAAAMxA, which are disordered in the crystal structure. Phenotypes are in Table S1. (D) Assembly of long dynamin tetramer models within the GMPPCP-stabilized constricted DPRD map (gray). The numbering and rainbow coloring (red to blue) denote the sequential addition of tetramers and terminates when the first G domain dimer is formed. Upper panel depicts end view of the long assembly looking down the helical axis; lower panel is a side view perpendicular to this axis. The sequential rungs of the dynamin helix are marked in the lower panel with black brackets and numbered as ‘‘1’’ and ‘‘2.’’ Dashed black box highlights the partnering helical rungs facilitating G domain dimerization. (E) Proposed pathway of dynamin-catalyzed membrane fission. Dynamin tetramers exist in an assembly-incompetent conformation in solution (1). Membrane binding causes a conformational change in the tetramer that exposes the assembly interface, inducing the rapid, cooperative assembly of a helical dynamin collar at the neck of an invaginated clathrin-coated pit (CCP) (2). Initial constriction of the neck, triggered by GTP binding and structural changes in middle/GED stalk (Chen et al., 2004), promotes G domain dimerization between tetramers in adjacent helical rungs to optimally position dynamin’s catalytic machinery (3). Assembly-stimulated GTP hydrolysis drives a major rotation of the BSE in the transition state that constitutes the dynamin powerstroke (4). Propagation of this change through multiple turns of the helical dynamin collar causes further constriction of the neck (4). The resulting structural rearrangements might play a role in loosening the dynamin scaffold from the membrane surface, facilitating the membrane-remodeling events that contribute to membrane fission (5). 
The detached dynamin scaffold disassembles upon release of the hydrolyzed γ-phosphate (6), which stabilized the dimer interface. Coloring: G domains, green; middle/GED stalk, blue; PH domains, orange; membrane bilayer, gray; lipid head groups, gray circles. The large gray circle with cyan meshwork is the CCP. The assemblies are shown in the same orientations as in (D) (upper panels 3, 4) and in side view on the lower panels. Red arrows indicate movements associated with the dynamin powerstroke. The 7 nm measurement (3) corresponds to the inner lumenal diameter of our GMPPCP-stabilized DPRD reconstruction, which is poised for G domain dimerization and represents an intermediate along the fission pathway; the 4 nm measurement (4) indicates the theoretical inner lumenal diameter of the neck that would allow the spontaneous formation of a hemifission intermediate following partial detachment of the dynamin scaffold. See also Figure S6 and Table S1.
relaxed by partial detachment and/or disassembly of dynamin subunits following stimulated GTP hydrolysis (Ramachandran and Schmid, 2008; Pucadyil and Schmid, 2008; Bashkirov et al., 2008). Theoretical modeling indicates that a hemifission intermediate will form at this stage if the lumenal diameter of the neck is equivalent to the bilayer thickness (4 nm) (Bashkirov et al., 2008). Our GMPPCP-stabilized DPRD polymer reconstruction has an inner lumenal diameter of 7 nm, indicating that it is an intermediate along the fission pathway and that additional constriction is necessary to constrain the membrane neck in a manner that allows fission to occur spontaneously once it is released from the dynamin scaffold. Further compression of the polymer also favors G domain dimerization, as the longitudinal proximity between adjacent helical rungs would be increased as the inner lumenal diameter decreases.

Our structural data raise the tantalizing possibility of a BSE-mediated dynamin powerstroke (Figure 7E) that converts the energy of G domain dimerization and GTP hydrolysis into rearrangements affecting the entire dynamin collar. These changes could provide the mechanochemical force needed for constriction down to 4 nm and subsequent loosening of the dynamin scaffold from the membrane, thus precipitating the membrane-remodeling events required for fission (Figure 7E). Recently, large GTP hydrolysis-dependent conformational changes were also observed for the yeast mitochondrial dynamin-like protein Dnm1 (Mears et al., 2011) that did not occur upon the addition of GMPPCP, suggesting that the formation of a G domain transition-state complex may also play an important role in mitochondrial fission. It remains to be seen whether this system exhibits a similar BSE conformational change. It has been proposed that the assembly-dependent positioning of dynamin’s PH domains helps catalyze the lipid rearrangements needed for fission (Schmid and Frolov, 2011).
The PH domains are asymmetrically distributed in the long tetramer assembly with part of the membrane surface unoccupied (Figure 7D) and arranged uniformly around the neck in the short assembly (Figure S6D). As the number of turns required to catalyze fission has yet to be established, the significance of this differential distribution remains to be determined.

Intramolecular Conformational Coupling

Fluorescence studies have shown that PH domain binding to/dissociation from the plasma membrane is coupled to structural changes in the G domain’s nucleotide-binding pocket (Solomaha and Palfrey, 2005; Ramachandran and Schmid, 2008). The large distance between these two domains (Figure 2) suggests that a mechanism exists for long-range communication within the dynamin molecule. We previously showed that the BSE senses and transmits assembly-dependent conformational changes to the G domain in a back-to-front manner, i.e., from the membrane to the G domain (Chappie et al., 2009). The hydrolysis-dependent BSE conformational change described here (Figure 4) illustrates that this module can also function front-to-back (i.e., from the G domain to the membrane), amplifying nucleotide-dependent changes in the active site and relaying them through the stalk. These properties make the BSE an ideal regulator of intramolecular crosstalk. Recent evidence suggests that the C-terminal α helix of the PH domain
(CPH) also plays a role in conformational coupling, as mutations in this region can indirectly modulate dynamin’s GTPase activity (Kenniston and Lemmon, 2010). Being situated at opposing ends of the GED, the CPH and BSE could communicate back and forth via structural fluctuations in the stalk to coordinate membrane binding, dynamin assembly, stimulated GTP hydrolysis, and the subsequent disassembly of the polymer.

EXPERIMENTAL PROCEDURES

Protein Purification and Biochemical Assays

See Extended Experimental Procedures for detailed protocols describing the purification of dynamin and GG constructs, chemical crosslinking, and sedimentation velocity experiments.

Preparation of DPRD Dynamin Tubes

Liposomes containing 100% 1,2-dioleoyl-sn-glycero-3-phospho-L-serine (DOPS; Avanti Polar Lipids) were prepared by extrusion through polycarbonate membranes (Whatman) with a pore size of 0.4 μm using an Avanti MiniExtruder. Lipids were mixed, dried, rehydrated in buffer (20 mM HEPES, pH 7.5) to a final concentration of 2.5 mM (2 mg/ml), and sonicated prior to extrusion. The resulting unilamellar DOPS liposomes were diluted to 1 mg/ml and mixed 1:1 (v:v) with DPRD dynamin at 1 mg/ml in 20 mM HEPES (pH 7.5)/100 mM NaCl. The mixture was incubated at room temperature for 2 hr and then applied to plasma-cleaned C-flat holey grids. The sample was washed with 20 mM HEPES (pH 7.5), blotted, and frozen in liquid ethane. Grids were transferred to liquid nitrogen and stored until use.

Cryo-EM and Image Processing

Samples were visualized using a Philips Tecnai F20 electron microscope operating at 120 kV, and images were collected using Leginon (Potter et al., 1999; Suloway et al., 2005) in manual mode at 1.0–2.0 μm underfocus with a 4K × 4K Gatan CCD camera at a nominal magnification of 50,000×, corresponding to a resolution of 2.26 Å per pixel. Images were individually CTF corrected using ACE2 (Mallick et al., 2005).
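The concentrations quoted for the tube preparation can be cross-checked with simple arithmetic. The sketch below converts the lipid mass concentration to molarity and works out the concentrations after the 1:1 (v:v) mixing step. The DOPS molecular weight of about 810 g/mol (sodium salt) is an assumption that is not stated in the text, the helper names are hypothetical, and the final NaCl value assumes that the liposome buffer contains no NaCl.

```python
# Assumed molecular weight of DOPS (sodium salt), g/mol; not given in the text.
DOPS_MW_G_PER_MOL = 810.0


def mg_per_ml_to_mM(conc_mg_per_ml, mw_g_per_mol):
    """Convert a mass concentration (mg/ml, i.e. g/l) to millimolar."""
    return conc_mg_per_ml / mw_g_per_mol * 1000.0


def mix(volumes, concentrations):
    """Concentration of one component after combining several solutions."""
    total_volume = sum(volumes)
    amount = sum(v * c for v, c in zip(volumes, concentrations))
    return amount / total_volume


if __name__ == "__main__":
    # 2 mg/ml DOPS stock expressed in mM (text quotes 2.5 mM).
    print(round(mg_per_ml_to_mM(2.0, DOPS_MW_G_PER_MOL), 2))

    # 1:1 (v:v) mix of liposomes (1 mg/ml lipid) with dynamin (1 mg/ml protein).
    print(mix([1.0, 1.0], [1.0, 0.0]))    # lipid, mg/ml
    print(mix([1.0, 1.0], [0.0, 1.0]))    # protein, mg/ml
    print(mix([1.0, 1.0], [0.0, 100.0]))  # NaCl, mM (assumes none in liposome buffer)
```

Under these assumptions the rounded 2.5 mM figure is consistent with 2 mg/ml, and each component is halved by the mixing step.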
Ordered, straight ΔPRD tubes were manually selected for processing by the iterative helical real space reconstruction (IHRSR) methodology (Egelman, 2007; see Extended Experimental Procedures for details). The resolution of the final map was determined to be 12.2 Å by Fourier shell correlation (FSC = 0.5) (Figure S1D).
X-Ray Data Collection, Structure Solution, and Refinement
Native data on a GGGMPPCP crystal were collected at 95 K on a rotating anode source equipped with multilayer focusing optics using Cu Kα radiation and a Saturn A200 CCD detector. All data were integrated and scaled using XDS (Kabsch, 2010). The GGGMPPCP structure was solved by molecular replacement using PHASER (McCoy et al., 2007) and refined with CNS v1.3 (Brünger et al., 1998). See Extended Experimental Procedures for details. X-ray data collection and refinement statistics can be found in Table S2.
Computational Docking
All-atom structures were refined using the YUP.SCX method (Tan et al., 2008) of the YUP software package (Tan et al., 2006). See Extended Experimental Procedures and Figure S1E for details. Initial fitting was performed using GGGDP.AlF4- monomers (PDB 2X2E), MxA middle/GED stalk monomers (PDB 3LJB), and PH domain monomers (PDB 1DYN), representing 93% of the ΔPRD sequence. A similar procedure was used to fit GGGMPPCP. Orientations of the middle/GED and PH monomers were largely unchanged, and the GGGMPPCP placement refined to a single orientation that best matched the ΔPRD cryo-EM structure.
GTPase Assays
Basal and low-salt assembly-stimulated GTP hydrolysis was measured using a colorimetric malachite green assay described elsewhere (Leonard et al., 2005). In each case, reactions were carried out at 37°C using 2 μM of His-tagged full-length dynamin 1 constructs and 500 μM GTP in buffer containing 20 mM HEPES (pH 7.5), 1 mM MgCl2, and either 150 mM KCl (basal) or 0 mM KCl (low-salt assembly-stimulated).
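As an illustration of how readouts from a phosphate-release assay of this kind are commonly turned into rates, the sketch below converts released phosphate into an apparent turnover number and a fold stimulation. It is a minimal example under stated assumptions: the function names and the numbers are hypothetical, the reaction is assumed to be in its linear phase, and this is not the authors' analysis procedure.

```python
def turnover_rate(phosphate_released, enzyme_conc, minutes):
    """Apparent turnover (per minute): phosphate released per enzyme per minute.

    `phosphate_released` and `enzyme_conc` must be in the same concentration
    units. Assumes the time point lies in the linear phase of the reaction.
    """
    if enzyme_conc <= 0 or minutes <= 0:
        raise ValueError("enzyme_conc and minutes must be positive")
    return phosphate_released / (enzyme_conc * minutes)


def fold_stimulation(stimulated_rate, basal_rate):
    """Ratio of assembly-stimulated to basal turnover."""
    if basal_rate <= 0:
        raise ValueError("basal_rate must be positive")
    return stimulated_rate / basal_rate


# Hypothetical readings, not data from this study.
basal = turnover_rate(phosphate_released=20.0, enzyme_conc=2.0, minutes=5.0)
stimulated = turnover_rate(phosphate_released=100.0, enzyme_conc=2.0, minutes=5.0)
print(basal, stimulated, fold_stimulation(stimulated, basal))
```

Keeping the concentration units explicit in the docstring avoids the most common error in this calculation, which is mixing units between the phosphate standard curve and the enzyme stock.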
Ferguson, K.M., Lemmon, M.A., Schlessinger, J., and Sigler, P.B. (1994). Crystal structure at 2.2 Å resolution of the pleckstrin homology domain from human dynamin. Cell 79, 199–209.
ACCESSION NUMBERS
Gao, S., von der Malsburg, A., Paeschke, S., Behlke, J., Haller, O., Kochs, G., and Daumke, O. (2010). Structural basis of oligomerization in the stalk region of dynamin-like MxA. Nature 465, 502–506.
Atomic coordinates for the GGGMPPCP structure have been deposited in the Protein Data Bank under the accession number 3ZYC. The reconstructed density of GMPPCP-stabilized ΔPRD lipid tubes has been deposited in the EM Data Bank with accession code EMD-1949. Coordinates for the complete docked model consisting of GGGMPPCP, the human MxA stalk, and the human dynamin 1 PH domain have been deposited in the Protein Data Bank with accession code 3ZYS.
SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures, a sequence comparison data file, six figures, two tables, and three movies and can be found with this article online at doi:10.1016/j.cell.2011.09.003.
Hinshaw, J.E., and Schmid, S.L. (1995). Dynamin self-assembles into rings suggesting a mechanism for coated vesicle budding. Nature 374, 190–192. Ingerman, E., Perkins, E.M., Marino, M., Mears, J.A., McCaffery, J.M., Hinshaw, J.E., and Nunnari, J. (2005). Dnm1 forms spirals that are structurally tailored to fit mitochondria. J. Cell Biol. 170, 1021–1027. Kabsch, W. (2010). XDS. Acta Crystallogr. D Biol. Crystallogr. 66, 125–132. Kenniston, J.A., and Lemmon, M.A. (2010). Dynamin GTPase regulation is altered by PH domain mutations found in centronuclear myopathy patients. EMBO J. 29, 3054–3067. Leonard, M., Song, B.D., Ramachandran, R., and Schmid, S.L. (2005). Robust colorimetric assays for dynamin’s basal and stimulated GTPase activities. Methods Enzymol. 404, 490–503.
ACKNOWLEDGMENTS
Low, H.H., and Löwe, J. (2006). A bacterial dynamin-like protein. Nature 444, 766–769.
We thank Vasyl Lukiyanchuk and Sharmistha Acharya for assistance in cloning and purification, Rodolfo Ghirlando for assistance with sedimentation velocity experiments, Juha-Pekka Mattila for communication of unpublished data, Joshua Zimmerberg for insightful discussions, Alison B. Hickman for technical advice and critical reading of the manuscript, and Yihong Ye for critical reading of the manuscript. We especially thank Marijn Ford and Jodi Nunnari for the ongoing open dialogue regarding the intricacies of dynamin structure and assembly. This work was supported by the Intramural Program of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) and NIH grants GM52468 and GM75820 (to R.A.M.) and GM42455 (to S.L.S.). J.S.C. was supported by a Nancy Nossal Fellowship award from NIDDK. A portion of the work presented here was conducted at the National Resource for Automated Molecular Microscopy, which is supported by the National Institutes of Health through the National Center for Research Resources’ P41 program (RR017573).
Low, H.H., Sachse, C., Amos, L.A., and Löwe, J. (2009). Structure of a bacterial dynamin-like protein lipid tube provides a mechanism for assembly and membrane curving. Cell 139, 1342–1352.
Received: May 27, 2011
Revised: July 26, 2011
Accepted: September 1, 2011
Published: September 29, 2011
REFERENCES
Bashkirov, P.V., Akimov, S.A., Evseev, A.I., Schmid, S.L., Zimmerberg, J., and Frolov, V.A. (2008). GTPase cycle of dynamin is coupled to membrane squeeze and release, leading to spontaneous fission. Cell 135, 1276–1286. Brünger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., et al. (1998). Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr. D Biol. Crystallogr. 54, 905–921. Chappie, J.S., Acharya, S., Liu, Y.W., Leonard, M., Pucadyil, T.J., and Schmid, S.L. (2009). An intramolecular signaling element that modulates dynamin function in vitro and in vivo. Mol. Biol. Cell 20, 3561–3571. Chappie, J.S., Acharya, S., Leonard, M., Schmid, S.L., and Dyda, F. (2010). G domain dimerization controls dynamin’s assembly-stimulated GTPase activity. Nature 465, 435–440. Chen, Y.J., Zhang, P., Egelman, E.H., and Hinshaw, J.E. (2004). The stalk region of dynamin drives the constriction of dynamin tubes. Nat. Struct. Mol. Biol. 11, 574–575. Danino, D., Moon, K.H., and Hinshaw, J.E. (2004). Rapid constriction of lipid bilayers by the mechanochemical enzyme dynamin. J. Struct. Biol. 147, 259–267. Egelman, E.H. (2007). The iterative helical real space reconstruction method: surmounting the problems posed by real polymers. J. Struct. Biol. 157, 83–94.
Mallick, S.P., Carragher, B., Potter, C.S., and Kriegman, D.J. (2005). ACE: automated CTF estimation. Ultramicroscopy 104, 8–29. Marks, B., Stowell, M.H.B., Vallis, Y., Mills, I.G., Gibson, A., Hopkins, C.R., and McMahon, H.T. (2001). GTPase activity of dynamin and resulting conformation change are essential for endocytosis. Nature 410, 231–235. McCoy, A.J., Grosse-Kunstleve, R.W., Adams, P.D., Winn, M.D., Storoni, L.C., and Read, R.J. (2007). Phaser crystallographic software. J. Appl. Cryst. 40, 658–674. McMahon, H.T., and Boucrot, E. (2011). Molecular mechanism and physiological functions of clathrin-mediated endocytosis. Nat. Rev. Mol. Cell Biol. 12, 517–533. Mears, J.A., Ray, P., and Hinshaw, J.E. (2007). A corkscrew model for dynamin constriction. Structure 15, 1190–1202. Mears, J.A., Lackner, L.L., Fang, S., Ingerman, E., Nunnari, J., and Hinshaw, J.E. (2011). Conformational changes in Dnm1 support a contractile mechanism for mitochondrial fission. Nat. Struct. Mol. Biol. 18, 20–26. Mettlen, M., Pucadyil, T.J., Ramachandran, R., and Schmid, S.L. (2009). Dissecting dynamin’s role in clathrin-mediated endocytosis. Biochem. Soc. Trans. 37, 1022–1026. Muhlberg, A.B., Warnock, D.E., and Schmid, S.L. (1997). Domain structure and intramolecular regulation of dynamin GTPase. EMBO J. 16, 6676–6683. Niemann, H.H., Knetsch, M.L.W., Scherer, A., Manstein, D.J., and Kull, F.J. (2001). Crystal structure of a dynamin GTPase domain in both nucleotide-free and GDP-bound forms. EMBO J. 20, 5813–5821. Praefcke, G.J., and McMahon, H.T. (2004). The dynamin superfamily: universal membrane tubulation and fission molecules? Nat. Rev. Mol. Cell Biol. 5, 133–147. Potter, C.S., Chu, H., Frey, B., Green, C., Kisseberth, N., Madden, T.J., Miller, K.L., Nahrstedt, K., Pulokas, J., Reilein, A., et al. (1999). Leginon: a system for fully automated acquisition of 1000 electron micrographs a day. Ultramicroscopy 77, 153–161. Pucadyil, T.J., and Schmid, S.L. (2008). Real-time visualization of dynamin-catalyzed membrane fission and vesicle release. Cell 135, 1263–1275. Ramachandran, R., and Schmid, S.L. (2008). Real-time detection reveals that effectors couple dynamin’s GTP-dependent conformational changes to the membrane. EMBO J. 27, 27–37. Ramachandran, R., Surka, M., Chappie, J.S., Fowler, D.M., Foss, T.R., Song, B.D., and Schmid, S.L. (2007). The dynamin middle domain is critical for tetramerization and higher-order self-assembly. EMBO J. 26, 559–566.
Cell 147, 209–222, September 30, 2011 ©2011 Elsevier Inc. 221
Ramachandran, R., Pucadyil, T.J., Liu, Y.W., Acharya, S., Leonard, M., Lukiyanchuk, V., and Schmid, S.L. (2009). Membrane insertion of the pleckstrin homology domain variable loop 1 is critical for dynamin-catalyzed vesicle scission. Mol. Biol. Cell 20, 4630–4639. Reubold, T.F., Eschenburg, S., Becker, A., Leonard, M., Schmid, S.L., Vallee, R.B., Kull, F.J., and Manstein, D.J. (2005). Crystal structure of the GTPase domain of rat dynamin 1. Proc. Natl. Acad. Sci. USA 102, 13093–13098. Roux, A., Uyhazi, K., Frost, A., and De Camilli, P. (2006). GTP-dependent twisting of dynamin implicates constriction and tension in membrane fission. Nature 441, 528–531. Roux, A., Koster, G., Lenz, M., Sorre, B., Manneville, J.B., Nassoy, P., and Bassereau, P. (2010). Membrane curvature controls dynamin polymerization. Proc. Natl. Acad. Sci. USA 107, 4141–4146. Scheffzek, K., Ahmadian, M.R., and Wittinghofer, A. (1998). GTPase-activating proteins: helping hands to complement an active site. Trends Biochem. Sci. 23, 257–262. Schmid, S.L., and Frolov, V.A. (2011). Dynamin: Functional design of a membrane fission catalyst. Annu. Rev. Cell Dev. Biol. Published online May 18, 2011. 10.1146/annurev-cellbio-100109-104016. Sever, S., Skoch, J., Newmyer, S., Ramachandran, R., Ko, D., McKee, M., Bouley, R., Ausiello, D., Hyman, B.T., and Bacskai, B.J. (2006). Physical and functional connection between auxilin and dynamin during endocytosis. EMBO J. 25, 4163–4174. Solomaha, E., and Palfrey, H.C. (2005). Conformational changes in dynamin on GTP binding and oligomerization reported by intrinsic and extrinsic fluorescence. Biochem. J. 391, 601–611. Song, B.D., Yarar, D., and Schmid, S.L. (2004). An assembly-incompetent mutant establishes a requirement for dynamin self-assembly in clathrin-mediated endocytosis in vivo. Mol. Biol. Cell 15, 2243–2252.
Stowell, M.H., Marks, B., Wigge, P., and McMahon, H.T. (1999). Nucleotide-dependent conformational changes in dynamin: evidence for a mechanochemical molecular spring. Nat. Cell Biol. 1, 27–32. Suloway, C., Pulokas, J., Fellmann, D., Cheng, A., Guerra, F., Quispe, J., Stagg, S., Potter, C.S., and Carragher, B. (2005). Automated molecular microscopy: the new Leginon system. J. Struct. Biol. 151, 41–60. Sweitzer, S.M., and Hinshaw, J.E. (1998). Dynamin undergoes a GTP-dependent conformational change causing vesiculation. Cell 93, 1021–1029. Tan, R.K.-Z., Petrov, A.S., and Harvey, S.C. (2006). YUP: A molecular simulation program for coarse-grained and multi-scale models. J. Chem. Theory Comput. 2, 529–540. Tan, R.K., Devkota, B., and Harvey, S.C. (2008). YUP.SCX: coaxing atomic models into medium resolution electron density maps. J. Struct. Biol. 163, 163–174. Takei, K., McPherson, P.S., Schmid, S.L., and De Camilli, P. (1995). Tubular membrane invaginations coated by dynamin rings are induced by GTPγS in nerve terminals. Nature 374, 186–190. Timm, D., Salim, K., Gout, I., Guruprasad, L., Waterfield, M., and Blundell, T. (1994). Crystal structure of the pleckstrin homology domain from dynamin. Nat. Struct. Biol. 1, 782–788. Warnock, D.E., Hinshaw, J.E., and Schmid, S.L. (1996). Dynamin self-assembly stimulates its GTPase activity. J. Biol. Chem. 271, 22310–22314. Wilson-Kubalek, E.M., Chappie, J.S., and Arthur, C.P. (2010). Helical crystallization of soluble and membrane binding proteins. Methods Enzymol. 481, 45–62. Zhang, P., and Hinshaw, J.E. (2001). Three-dimensional reconstruction of dynamin in the constricted state. Nat. Cell Biol. 3, 922–926.
Beclin1 Controls the Levels of p53 by Regulating the Deubiquitination Activity of USP10 and USP13
Junli Liu,1,5 Hongguang Xia,1,5 Minsu Kim,2,6 Lihua Xu,1,6 Ying Li,1,2,6 Lihong Zhang,1,6 Yu Cai,1 Helin Vakifahmetoglu Norberg,2 Tao Zhang,1 Tsuyoshi Furuya,2 Minzhi Jin,1 Zhimin Zhu,2 Huanchen Wang,3 Jia Yu,1 Yanxia Li,1 Yan Hao,1 Augustine Choi,4 Hengming Ke,3 Dawei Ma,1,* and Junying Yuan2,*
1State Key Laboratory of Bioorganic and Natural Products Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, 354 Fenglin Lu, Shanghai 200032, China
2Department of Cell Biology, Harvard Medical School, 240 Longwood Avenue, Boston, MA 02115, USA
3Department of Biophysics and Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
4Brigham and Women’s Hospital, 75 Francis Street, Boston, MA 02115, USA
5These authors contributed equally to this work
6These authors contributed equally to this work
*Correspondence:
[email protected] (D.M.),
[email protected] (J.Y.)
DOI 10.1016/j.cell.2011.08.037
SUMMARY
Autophagy is an important intracellular catabolic mechanism that mediates the degradation of cytoplasmic proteins and organelles. We report a potent small molecule inhibitor of autophagy named “spautin-1” for specific and potent autophagy inhibitor-1. Spautin-1 promotes the degradation of Vps34 PI3 kinase complexes by inhibiting two ubiquitin-specific peptidases, USP10 and USP13, that target the Beclin1 subunit of Vps34 complexes. Beclin1 is a tumor suppressor and is frequently monoallelically lost in human cancers. Interestingly, Beclin1 also controls the protein stabilities of USP10 and USP13 by regulating their deubiquitinating activities. Since USP10 mediates the deubiquitination of p53, regulating the deubiquitination activity of USP10 and USP13 by Beclin1 provides a mechanism for Beclin1 to control the levels of p53. Our study provides a molecular mechanism involving protein deubiquitination that connects two important tumor suppressors, p53 and Beclin1, and a potent small molecule inhibitor of autophagy as a possible lead compound for developing anticancer drugs.
INTRODUCTION
Vps34 is the primordial member of the PI3 kinase family and the only known class III PI3 kinase that can phosphorylate the D-3 position on the inositol ring of phosphatidylinositol (PtdIns) to produce PtdIns3P (Schu et al., 1993). In contrast to class I PI3 kinase, which has been extensively studied, much less is known about the class III PI3 kinase or its regulation in mammalian cells. Emerging evidence indicates a central role of Vps34 PI3K activity and its protein partners in orchestrating both initiation and maturation of autophagosomes (Simonsen and Tooze, 2009). Thus, exploring the mechanisms that regulate the class III PI3 kinase has direct implications for our understanding of these important intracellular mechanisms as well as for developing therapies for treatment of human diseases.
As in yeast, Vps34 in mammalian cells is present in two complexes: Vps34 complex I and Vps34 complex II (Itakura et al., 2008; Liang et al., 2006; Matsunaga et al., 2009; Zhong et al., 2009). These two complexes share the core components Vps34, Beclin1, and p150; in addition, complex I contains Atg14L and complex II contains UVRAG. Interestingly, the stabilities of the different components of Vps34 complexes are codependent, as knockdown of one component often reduces the levels of others in the complexes (Itakura et al., 2008).
Beclin1 has been characterized as a tumor suppressor, and its importance is underscored by both the frequent monoallelic loss of beclin1 in human breast, ovarian, and prostate tumors and an increased rate of malignant tumors in BECN1+/− mice (Liang et al., 1999; Qu et al., 2003; Yue et al., 2003). Although autophagy deficiency has been proposed to be the mechanism for the increased tumorigenesis in BECN1+/− mice, a recent study using tissue-specific knockout mice of Atg5 and Atg7 suggests that autophagy deficiency may lead to benign tumors in livers, but not in other tissues (Takamura et al., 2011). Thus, the mechanism by which Beclin1 acts as a tumor suppressor remains a puzzle.
Small molecule inhibitors are important tools in exploring the cellular mechanisms in mammalian cells. However, the only available small molecule inhibitor of autophagy is 3-methyladenine (3-MA), which has a working concentration of 10 mM and inhibits multiple forms of PI3 kinases. Therefore, there is an urgent need to develop highly specific small molecule tools that can be used to facilitate the studies of autophagy in mammalian cells.
Cell 147, 223–234, September 30, 2011 ©2011 Elsevier Inc. 223
Using an imaging-based screen, we identified a small molecule inhibitor of autophagy and developed it into a highly potent autophagy inhibitor. We named it “spautin-1” for specific and potent autophagy inhibitor-1. We explored the
Figure 1. Isolation of a Series of Small Molecule Inhibitors of Autophagy
(A) The structure of MBCQ.
(B) MBCQ reduced the spot number (a), spot size (b), and spot intensity (c) of LC3-GFP+ puncta. H4-LC3-GFP cells were treated with rapamycin (0.2 μM) and MBCQ (5 μM) as indicated. The image data are expressed as % of control vehicle-treated cells. 1000 cells were analyzed per treatment condition.
(C) H4-LC3-GFP cells were treated with rapamycin (0.2 μM) and MBCQ (10 μM) as indicated for 2 hr, and the cell lysates were analyzed by western blotting using anti-LC3. β-tubulin was used as a control.
(D) An active (C43 = spautin-1) and an inactive (C71) derivative of MBCQ.
(E) MEF cells were treated with DMSO (1‰), rapamycin (0.2 μM) alone, or together with MBCQ (10 μM), C43 (10 μM), or C71 (10 μM) for 4 hr. The cell lysates were analyzed by western blotting using anti-LC3 antibody. β-tubulin was used as a loading control.
(F) Dose-response (in μM) of C43 and inactive C71. H4-LC3-GFP cells were treated with rapamycin (0.2 μM) for 12 hr with C43 or C71 as indicated. The LC3-GFP+ puncta were quantified as in (B). Autophagy index = 100% × {[total LC3-GFP+ spot intensity (compound + rapamycin treated) per cell] − [total LC3-GFP+ spot intensity (DMSO treated) per cell]} / {[total LC3-GFP+ spot intensity (rapamycin treated) per cell] − [total LC3-GFP+ spot intensity (DMSO treated) per cell]}. Rap = rapamycin.
(G) H4-LC3-GFP cells were treated with spautin-1 (10 μM) with or without E64D (5 μM) for the indicated periods of time. The cell lysates were analyzed by western blotting using anti-LC3 and anti-β-tubulin.
All error bars indicate STD. See also Figures S1 and S2.
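The autophagy index defined in (F) is a background-subtracted ratio: the DMSO signal is subtracted from both the compound-plus-rapamycin signal and the rapamycin-only signal before dividing. A minimal sketch of that arithmetic follows; the function and argument names are illustrative and the example intensities are hypothetical, not values from this study.

```python
def autophagy_index(compound_plus_rap, rap_only, dmso):
    """Percent of the rapamycin-induced LC3-GFP+ signal that remains with compound.

    All three arguments are total LC3-GFP+ spot intensities per cell,
    measured on the same scale.
    """
    induced = rap_only - dmso
    if induced == 0:
        raise ValueError("rapamycin and DMSO intensities are equal; index undefined")
    return 100.0 * (compound_plus_rap - dmso) / induced


# Hypothetical per-cell intensities (arbitrary units).
print(autophagy_index(compound_plus_rap=150.0, rap_only=200.0, dmso=100.0))
```

Under this definition, an index of 100 means the compound had no effect on the rapamycin-induced signal, 0 means the signal was returned to the DMSO baseline, and negative values indicate suppression below baseline.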
mechanism by which spautin-1 inhibits autophagy and found that it inhibits two ubiquitin-specific peptidases, USP10 and USP13, which regulate the deubiquitination of Beclin1 in Vps34 complexes. Using spautin-1 as a tool, we explored the interaction of USP10 and USP13 with Vps34 complexes. Interestingly, we found that Vps34 complexes interact with USP13 and the stabilities of USP10 and USP13 are coordinately regulated with that of Vps34 complexes. Since USP10 is a deubiquitinating enzyme for p53 and regulates the levels of p53 by controlling p53 ubiquitination and degradation (Yuan et al., 2010), regulating the stability of USP10 and USP13 by Vps34 complexes provides a molecular mechanism for class III PI3 kinase to control the levels of p53. Indeed, as predicted by our model, we found that the levels of p53 are reduced in the tissues of BECN1+/− mice, which provides a molecular mechanism for the increased tumorigenesis after monoallelic loss of beclin1. Our results demonstrate that class III PI3 kinase is an important tumor suppressor that can regulate the levels of p53 through controlling its deubiquitination.
RESULTS
Isolation of a Small Molecule Inhibitor of Autophagy by an Image-Based Screen
In an imaging-based screen using LC3-GFP as a marker for autophagy (Zhang et al., 2007), we identified a small molecule inhibitor of autophagy, MBCQ, from the ICCB known bioactive library (Figure 1A). MBCQ was previously known as an inhibitor
of phosphodiesterase type 5 (PDE5), an enzyme that degrades cGMP by hydrolysis (MacPherson et al., 2006). Stimulation of H4-LC3-GFP cells with rapamycin led to increases in the levels of LC3-GFP as expected. A quantitative analysis of LC3-GFP puncta using high-throughput microscopy showed that treatment with MBCQ reduced the spot number as well as the spot size and spot intensity of LC3-GFP dots compared to control or rapamycin treatment alone (Figure 1B). Thus, the presence of MBCQ inhibited both basal and rapamycin-induced LC3-GFP autophagic puncta. This result was further confirmed by LC3 western blot analysis (Figure 1C), and similar results were obtained using mouse embryonic fibroblast cells (MEFs) (Figure S1A). Inhibition of autophagy by MBCQ was rapid (Figure 1B) and dose-dependent with an IC50 of 0.8 μM (Figure S1B), which is significantly more potent than the commonly used class III PI3 kinase inhibitor, 3-methyladenine (3-MA). We have also confirmed the autophagy inhibitory activity of MBCQ by electron microscopic studies. Cells treated with rapamycin showed a large number of autophagosomes with the characteristic double membrane, which were conspicuously absent in cells treated with rapamycin and MBCQ (Figure S1C). Finally, treatment with MBCQ was able to reduce the autophagic puncta of LC3-GFP in the presence of rapamycin, under starvation conditions, or with bafilomycin, which blocks lysosomal degradation (Figure S1D). Thus, MBCQ is an upstream inhibitor of autophagy.
Autophagy-Inhibiting Activity of MBCQ Can Be Separated from Its PDE5-Inhibiting Activity
One hundred and twelve derivatives of MBCQ were synthesized and analyzed to determine if its activity in inhibiting autophagy could be separated from its inhibition of PDE5 (Table S1 and data not shown). The chemical synthetic schemes are shown in the Extended Experimental Procedures.
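The IC50 values quoted in this section are the concentrations at which the autophagy readout falls to half of its uninhibited level. The sketch below is illustrative only: it assumes a simple Hill-type dose-response with unit slope and estimates IC50 by log-linear interpolation between the two doses that bracket the 50% point. The function names and the dose series are hypothetical; this is not the authors' data or fitting procedure.

```python
import math


def hill_fraction(conc, ic50, hill_slope=1.0):
    """Fraction of the uninhibited signal remaining at inhibitor concentration `conc`.

    `conc` and `ic50` must share the same units; unit slope is an assumption.
    """
    if conc < 0 or ic50 <= 0:
        raise ValueError("conc must be >= 0 and ic50 must be > 0")
    return 1.0 / (1.0 + (conc / ic50) ** hill_slope)


def estimate_ic50(concs, fractions):
    """Log-linear interpolation between the two doses that bracket 50% signal."""
    points = sorted(zip(concs, fractions))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            weight = (f_lo - 0.5) / (f_lo - f_hi)
            log_ic50 = math.log10(c_lo) + weight * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("signal does not cross 50% within the tested doses")


# Hypothetical dose series (arbitrary units), generated from the Hill curve itself.
doses = [0.1, 0.4, 1.6, 6.4]
signal = [hill_fraction(d, ic50=0.8) for d in doses]
print(round(estimate_ic50(doses, signal), 3))
```

Interpolating on a logarithmic concentration axis matches how dose series are usually spaced. A full curve fit would be preferable when the doses do not tightly bracket the midpoint.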
We selected 9 MBCQ derivatives based on their efficacy in inhibiting autophagy and screened for their activities on PDE5 (Wang et al., 2008). We found that C43 (6-fluoro-N-[4-fluorobenzyl]quinazolin-4-amine), an effective autophagy inhibitor with an IC50 of 0.74 μM (Figures 1D–1F), which is comparable to that of MBCQ, has significantly reduced activity toward PDE5 and other PDEs (Figures S2A and S2B and Table S1). Thus, the PDE5-inhibiting activity of MBCQ can be chemically separated from its autophagy-inhibiting activity. Consistent with a separation of PDE5- and autophagy-inhibiting activities in MBCQ, there were a number of other known PDE5 inhibitors in the bioactive library that we screened, including MY-5445, dipyridamole, IBMX, and sildenafil (Viagra), which were not identified as autophagy inhibitors. To further confirm this conclusion, we treated H4-LC3-GFP cells with rapamycin and other PDE5 inhibitors including MY-5445, dipyridamole, IBMX, or sildenafil, using MBCQ as a positive control. None of the specific PDE5 inhibitors tested, including the most potent PDE5 inhibitor, sildenafil (Viagra), which has an IC50 of 2.5 nM for PDE5, had any activity on autophagy (data not shown). Thus, we conclude that the autophagy-inhibiting activity of MBCQ is not related to its PDE5-inhibiting activity. To further examine the specificity of C43 in inhibiting autophagy, we treated mouse embryonic fibroblast (MEF) cells with
C43 or C71, a negative control, in the presence of rapamycin, with the levels of autophagy determined by LC3 western blotting. Treatment with C43, but not with the negative control C71, inhibited autophagy induced by rapamycin (Figures 1D–1F) and starvation (Figure S2C). We also confirmed the inhibition of autophagy by C43 using electron microscopy (Figure S2D). Furthermore, treatment with C43 inhibited autophagy activated in the presence of E64D, a protease inhibitor that increases the accumulation of autophagosomes by blocking lysosomal degradation (Figure 1G). Based on these data, we conclude that C43 is a potent inhibitor of autophagy and named it “spautin-1” for specific and potent autophagy inhibitor-1.
Spautin-1 Promotes Cell Death under Starvation Condition and Inhibits Autophagic Cell Death
We first characterized the biological effects of spautin-1 at the cellular level in a selected subset of cancer cell lines. Spautin-1 had no effect on the growth and survival of Bcap-37 cells under normal culture conditions (Figure 2A) but dramatically enhanced cell death in glucose-free media (Figure 2B). Bcap-37 cells treated with spautin-1 under glucose-free conditions showed apoptotic morphology (Figure 2C) and characteristic PARP cleavage (Figure 2D). Western blotting for LC3 further confirmed that autophagy was induced under glucose-free conditions and that this induction was inhibited by spautin-1 (Figure 2E). Similar results were obtained with MCF-7 and BT549 cells (data not shown). Thus, spautin-1 can sensitize tumor cells to apoptosis under nutrient-deprived conditions. In contrast to the cancer cell lines analyzed above, MDCK cells, a normal cell line derived from Madin-Darby canine kidney, did not undergo apoptosis when treated with spautin-1 under glucose-free conditions (Figures S3A and S3B). Hs578Bst cells, a myoepithelial cell line established from normal tissue peripheral to a breast cancer, were also not sensitive to treatment with spautin-1 (Figures S3C and S3D).
These results are consistent with the proposal that cancer cells are under increased metabolic pressure and therefore more sensitive to inhibition of autophagy than normal cells (Karantza-Wadsworth et al., 2007). Increased activation of autophagy in apoptosis-deficient cells has been shown to mediate cell death (Shimizu et al., 2004). To test this possibility, we treated Bax/Bak double knockout (DKO) cells with etoposide to induce cell death by DNA damage in the presence or absence of spautin-1. We found that spautin-1 inhibited etoposide-induced autophagic cell death of Bax/Bak DKO cells (Figures S3E–S3G). Thus, spautin-1 can be used as a tool to explore the requirement of autophagy in cellular processes.
Spautin-1 Selectively Promotes the Degradation of Vps34 Complexes
To explore the mechanism by which spautin-1 inhibits autophagy, we first examined the effects of spautin-1 on FYVE-RFP, an indicator for the activity of class III PI3 kinase, because PtdIns3P, the product of class III PI3 kinase, is important for the formation of autophagosomes (Gaullier et al., 1998; Simonsen and Tooze, 2009). Treatment with spautin-1 (Figure 3A) and MBCQ (Figure S4A) reduced the levels of FYVE-RFP puncta,
Figure 2. The Biological Effects of Spautin-1 on Cellular Models of Cell Death
Bcap-37 cells were treated with the indicated compounds in normal DMEM with 10% bovine serum (A), under glucose-free conditions (B), or both (C–E) for 48 hr. Cell viability was determined by MTT assay (A and B), cells were imaged using a phase contrast microscope (C), or the cell lysates were analyzed by western blotting using anti-PARP (D) or anti-LC3 and anti-β-tubulin (as a control) (E).
All error bars indicate STD. See also Figure S3.
but had no effect on the protein levels of FYVE-RFP (Figure S4B), suggesting that spautin-1 reduced the levels of PtdIns3P. The reduction of PtdIns3P in spautin-1-treated cells was confirmed using lipid dot blot analysis (Gozani et al., 2003) (Figure 3B). However, spautin-1 does not inhibit the lipid kinase activity of Vps34 in vitro (data not shown). Thus, spautin-1 can reduce the levels of PtdIns3P in cells, but is not a direct inhibitor of class III PI3 kinase activity. Interestingly, we noted that the levels of Flag-Beclin1 and HA-Vps34 were considerably lower in spautin-1-treated cells than in control cells (Figure 3C). In addition, treatment with spautin-1 also reduced the levels of GFP-p150 and Myc-Atg14L (Figures 3D and 3E). On the other hand, treatment with spautin-1 had no effect on the protein levels of GFP alone, GFP-Arf1, GFP-MT, EGFR, HA-Hrs, HA-Atg3, or GFP-Atg7 (data not shown). Thus, spautin-1 selectively reduces the levels of exogenously expressed components of Vps34 complexes. To determine if spautin-1 has a similar effect on endogenous Vps34 complexes, we conducted a time course study of H4-LC3-GFP cells treated with spautin-1 by western blotting. We found that the levels of endogenous Beclin1, Vps34, p150, Atg14L, and UVRAG progressively decreased in a time-dependent manner in the presence of spautin-1, and the effect of spautin-1 on the levels of Vps34 complexes was strongly correlated with that on LC3II (Figure 3F). Other active derivatives such as MBCQ have similar activity profiles (data not shown). In contrast, treatment with 3-MA had no effect on the protein level of Beclin1 (Figure S4C). These data confirm that spautin-1 selectively reduces the levels of Vps34 complexes in mammalian cells. To explore the mechanism by which spautin-1 reduces the levels of Vps34 complexes, we treated H4-LC3-GFP cells with MBCQ or spautin-1 in the presence or absence of CHX.
As shown in Figure 3G, the addition of spautin-1 with CHX reduced the levels of Beclin1 and Vps34 compared to CHX alone, suggesting that spautin-1 may promote the degradation of the class III PI3 kinase complexes. To further examine this possibility, we treated H4-LC3-GFP cells with spautin-1 in the presence of MG132 or NH4Cl to inhibit proteasomal or lysosomal degradation, respectively. MG132 but not NH4Cl inhibited the reduction of Beclin1 induced by spautin-1 (Figure 3H). The addition of MG132 restored the levels of Vps34 complexes as well as that of autophagy (Figure S5A). Similar results were found with transfected GFP-Beclin1 in 293T cells (Figure S5B). These results suggest that spautin-1 promotes the degradation of Beclin1 through the proteasomal pathway. Since ubiquitination represents an essential step in mediating proteasomal degradation, we tested if ubiquitination of Beclin1 was increased in cells treated with spautin-1. As shown in Figure 3I, treatment with spautin-1 promoted the ubiquitination of Beclin1 without an obvious effect on the global levels of ubiquitination. Taken together, we conclude that spautin-1 inhibits autophagy by selectively promoting the degradation of the class III PI3 kinase complexes via the proteasomal pathway.
Identification of the Deubiquitinating Enzymes for Vps34 Complexes
Since ubiquitination of proteins plays a critical role in mediating proteasomal degradation, we hypothesized that spautin-1 targets deubiquitinating enzyme(s) (DUBs) that normally function to negatively regulate the ubiquitination of Vps34 complexes. This follows from the common finding that a small molecule is more likely to be an inhibitor than an activator. To directly test this hypothesis, we screened a collection of 127 siRNAs targeting human deubiquitinating enzymes from the Dharmacon library SMART pools for inhibition of autophagy using H4-LC3-GFP cells as an assay. We found that only knockdown of USP10 or USP13 showed a consistent effect of reducing the levels of endogenous Vps34, Beclin1, Atg14L, p150, and UVRAG (Figures 4A and 4B).
Interestingly, treatment with spautin-1 also reduced the levels of USP10 and USP13, but not USP14, a DUB involved in regulating proteasome function (Lee et al., 2010), or Rubicon, a negative regulator of class III PI3 kinase
Figure 3. Spautin-1 Reduces the Levels of PtdIns3P by Promoting the Degradation of Vps34 Complexes
(A) H4-FYVE-RFP cells were treated with rapamycin (0.2 μM) and/or spautin-1 (10 μM) as indicated. The image data are expressed as % of control vehicle-treated cells. 1000 cells were analyzed per treatment condition. Rap = rapamycin.
(B) MEF cells were treated with DMSO (1‰), rapamycin (0.2 μM), or spautin-1 (10 μM) as indicated for 4 hr. The lipids were extracted and applied onto polyvinylidene fluoride membrane. Commercial PtdIns3P was spotted as indicated for controls. The levels of PtdIns3P were detected using GST-PX-p40 domain protein, which binds to PtdIns3P, and anti-GST antibody (top panel). The levels of PtdIns4P, detected using GST-PH-FAPP-1 domain protein, which binds to PtdIns4P, and anti-GST antibody, were used as a loading control (bottom panel).
(C–E) 293T cells were transfected with expression vectors of HA-Vps34 and Flag-Beclin1 (C), GFP-p150 (D), or myc-Atg14L (E). Twenty-four hours after transfection, cells were treated with DMSO (1‰), MBCQ (10 μM), or spautin-1 (10 μM) as indicated for 24 hr. The cell lysates were analyzed by western blotting using anti-HA, anti-Flag, anti-GFP, or anti-myc as indicated, or anti-β-tubulin (as a control).
(F) H4-LC3-GFP cells were treated with spautin-1 (10 μM) as indicated, and the cell lysates were analyzed by western blotting using the indicated antibodies. β-tubulin was used as a control.
(G) H4-LC3-GFP cells were treated with CHX (10 μM) or spautin-1 at the indicated concentrations for 12 hr. DMSO (1‰) was used as a negative control. The cell lysates were analyzed by western blotting using anti-Beclin1, anti-Vps34, or anti-β-tubulin (as a control).
(H) H4-LC3-GFP cells were incubated with MG132 (10 μM) or NH4Cl (10 mM) with or without spautin-1 (10 μM) for 6 hr. The cell lysates were analyzed by western blotting using the indicated antibodies. β-tubulin was used as a control.
(I) 293T cells were transfected with GFP-Beclin1 and HA-Ub expression vectors. Twenty-four hours after transfection, cells were treated with spautin-1 (10 μM) for 24 hr, and MG132 (5 μM) was added in the last 6 hr. The cell lysates were immunoprecipitated with anti-GFP antibody, and the immunocomplexes were analyzed by western blotting using anti-HA antibody.
All error bars indicate STD. See also Figure S4.
(Matsunaga et al., 2009; Zhong et al., 2009) (Figure 4C). Similarly, the treatment of MEF cells with spautin-1 also led to a time-dependent reduction in the levels of USP10, USP13, Vps34 complexes and autophagy (Figure S5C). In addition, we compared the effects of spautin-1 on HeLa and Bcap-37 cells under normal culture conditions and autophagy induction conditions (Figures S5D and S5E). Interestingly, we found that the reduction in the levels of Vps34 complexes in Bcap-37 cells was significantly stronger under autophagy induction conditions than under normal culture conditions, where autophagy levels are low. Because the reductions in the levels of USP10 and USP13 in H4-LC3-GFP cells treated with spautin-1 appeared later than the reductions in the levels of Vps34 complexes and autophagy (Figure 4C), the reduced levels of USP10 and USP13 are unlikely
to be the primary reason for the ability of spautin-1 to reduce the levels of PtdIns3P and inhibit autophagy. Since the treatment with spautin-1 increases the ubiquitination levels of Beclin1 and knockdown of USP10 or USP13 reduces the levels of Vps34 complexes, we considered the possibility that spautin-1 targets the USP10- and USP13-mediated deubiquitination of Vps34 complexes. We first examined the ability of USP10 and USP13 to mediate the deubiquitination of Vps34 complexes. We found that the overexpression of USP10 was highly effective in reducing the levels of ubiquitinated Beclin1, and this effect was inhibited in the presence of spautin-1 (Figure 4D). Similarly, the overexpression of USP13 reduced the levels of ubiquitinated Beclin1, an effect that was inhibited by spautin-1 (Figure 4E). On the other hand, overexpression of USP10 or USP13 had no obvious effects on the ubiquitination levels of overexpressed Vps34,
Figure 4. Spautin-1 Inhibits the Deubiquitination of Vps34 Complexes (A and B) H4-LC3-GFP cells were transfected with the indicated siRNAs for 72 hr or treated with rapamycin (0.25 μM) or spautin-1 (10 μM) as indicated, and the cell lysates were analyzed by western blotting using the indicated antibodies. β-tubulin was used as a control. (C) H4-LC3-GFP cells were treated with spautin-1 (10 μM) as indicated, and the cell lysates were analyzed by western blotting using the indicated antibodies. β-tubulin was used as a control. (D and E) 293T cells were transfected with the indicated expression vectors for 12 hr and incubated with MG132 (10 μM), with or without spautin-1 (10 μM), for 4 hr. The cell lysates were immunoprecipitated with anti-Beclin1 and the immunocomplexes were analyzed by western blotting using anti-HA antibody. (F and G) Ubiquitinated Beclin1 was incubated with immunopurified Flag-USP10, Myc-USP13, or Flag-USP10CA, with or without spautin-1, for 2 hr in vitro in deubiquitinating buffer. The western blot was probed with anti-Beclin1 antibody. (H and I) The indicated proteins purified from 293T cells and different concentrations of spautin-1 (20 μM to 100 nM) were mixed and incubated for 30 min. Ub-AMC was then added to each well and incubated for another 45 min. The final concentrations of each protein and Ub-AMC were 20 nM and 0.8 μM, respectively. Ub-AMC hydrolysis was measured. All error bars indicate STD. See also Figure S5.
Atg14L, p150, and UVRAG (Figures S6A–S6H). Taken together, these results suggest that Beclin1 is the primary target of USP10 and USP13. To directly test if spautin-1 can inhibit the deubiquitinating activity of USP10 and USP13, we tested the activity of isolated USP10 and USP13 on ubiquitinated Beclin1 in vitro. As shown in Figures 4F and 4G, the coincubation of ubiquitinated Beclin1 with USP10 or USP13, but not with a catalytically inactive USP10 mutant, reduced the levels of Beclin1 ubiquitination. Furthermore, the presence of spautin-1 inhibited the deubiquitination of Beclin1 mediated by USP10 and USP13. In contrast, spautin-1 had no effect on CYLD-mediated deubiquitination of RIP1 in vitro (data not shown). To further confirm this result, we developed an in vitro deubiquitination assay using Ub-AMC (the C-terminal derivatization of ubiquitin with 7-amino-4-methylcoumarin), which is a fluorogenic substrate for deubiquitinating
enzymes (DUBs) (Dang et al., 1998). Using this assay, we found that spautin-1 inhibited USP10 and USP13 with an IC50 of 0.6–0.7 μM while having no inhibitory activity toward CYLD, which is also a member of the ubiquitin-specific peptidase family (Figures 4H and 4I). Thus, spautin-1 is an inhibitor of the deubiquitinating activity of USP10 and USP13. Our results suggest that inhibition of USP10 and USP13 by spautin-1 promotes the ubiquitination and degradation of Vps34 complexes, which in turn leads to a reduction in the levels of PtdIns3P and consequent inhibition of autophagy.
Regulation of USP10 and USP13 by Vps34 Complexes
Unexpectedly, we found that the knockdown of Beclin1 or Vps34 could also reduce the endogenous levels of USP10 and USP13 (Figures 5A and 5B). This suggests that Vps34 complexes may be able to regulate their own levels by stabilizing their cognate
Figure 5. Regulation of USP13 by Vps34 Complexes (A and B) H4-LC3-GFP cells were transfected with the indicated siRNAs for 72 hr or treated with rapamycin (0.25 μM) or spautin-1 (10 μM) for 4 hr, and the cell lysates were analyzed by western blotting using the indicated antibodies. β-tubulin was used as a control. (C) H4-LC3-GFP cells were treated with MG132 (10 μM) and spautin-1 (10 μM) for 6 hr. The cell lysates were immunoprecipitated with anti-USP13 antibody and the immunocomplexes were analyzed by western blotting using anti-Beclin1 antibody. (D) A schematic diagram of the Beclin1 truncation mutants used in (E). (E) 293T cells were transfected with Myc-USP13, Flag-Beclin1, Flag-Beclin1-ΔC-term, Flag-Beclin1-ΔBD, or Flag-Beclin1-ΔBD,CCD,CED as indicated for 24 hr. The cell lysates were immunoprecipitated with anti-flag antibody and the immunocomplexes were analyzed by western blotting using anti-USP13 antibody. (F and G) 293T cells were transfected with flag-Beclin1 or flag-ΔC-Beclin1 for 24 hr, and then treated with spautin-1 (10 μM) as indicated. The cell lysates were analyzed by western blotting using anti-flag, anti-Vps34, anti-p53, anti-LC3, or anti-β-tubulin (loading control) as indicated. (H) Flag-Beclin1, Flag-USP10 and Myc-USP13 proteins were isolated from 293T cells individually transfected with the relevant expression constructs by immunoprecipitation followed by extensive washing (12×) and elution with tag peptides. Deubiquitinating activities of the indicated proteins were analyzed using the Ub-AMC assay. Line 1: Myc-USP13, Flag-USP10 and Flag-Beclin1; Line 2: Myc-USP13 and Flag-Beclin1; Line 3: Flag-USP10 and Flag-Beclin1; Line 4: Myc-USP13 and Flag-USP10; Line 5: Myc-USP13; Line 6: Flag-USP10; Line 7: Flag-Beclin1. (I) 293T cells were transfected with the indicated expression vectors for 12 hr and incubated with MG132 (10 μM) in the presence or absence of spautin-1 (10 μM) for an additional 4 hr.
The cell lysates were immunoprecipitated with anti-USP10 antibody and the immunocomplexes were analyzed by western blotting using anti-HA antibody. See also Figure S6.
deubiquitinating enzymes, including USP10 and USP13. This effect is unlikely to be mediated through PtdIns3P, the product of Vps34 complexes, as treatment with 3-MA, which inhibits the kinase activity of class III PI3 kinase, had no effect on the levels of USP10 or USP13 (data not shown). Thus, Vps34 complexes have the surprising role of regulating the stability of USP10 and USP13. To determine the mechanism by which Vps34 complexes regulate the stability of USP13 and USP10, we examined the possibility that Beclin1 may interact with USP10 and USP13.
We found that endogenous Beclin1 can interact with USP13 and that the interaction was reduced in the presence of spautin-1 (Figure 5C). However, the interaction of Beclin1 and USP10 was considerably weaker (data not shown). These data suggest that Beclin1 may closely interact with USP13, whereas its interaction with USP10 is indirect or transient in nature. To further characterize the interaction of USP13 with Beclin1, we determined the domains of Beclin1 that interact with USP13 (Figures 5D and 5E). Different truncation mutants of Beclin1 were coexpressed with USP13 in 293T cells and the interaction of
USP13 with different Beclin1 mutants was analyzed by coimmunoprecipitation. A C-terminal deletion mutant of Beclin1 (ΔC-Beclin1) showed significantly reduced binding to USP13, suggesting that the C terminus of Beclin1 is important for the interaction. We further compared the effect of spautin-1 in 293T cells expressing the ΔC-Beclin1 mutant or full-length Beclin1 (Figures 5F and 5G). Interestingly, we found that not only was the ΔC-Beclin1 mutant resistant to spautin-1-induced degradation, but the expression of the ΔC-Beclin1 mutant also significantly blocked the effect of spautin-1 in inhibiting autophagy and inducing the degradation of Vps34. Since ΔC-Beclin1 can bind to Vps34 (Furuya et al., 2010) but not USP13, this experiment suggests that the interaction of Beclin1 and USP13 is critically important for regulating the stability of Vps34 complexes in response to spautin-1 treatment. To directly examine the mechanism by which Beclin1 regulates USP10 and USP13, we determined the effects of their interaction on the deubiquitinating activities in vitro using Ub-AMC as a substrate. As shown in Figure 5H, the DUB activities of USP10 or USP13 were comparatively low when either was incubated alone with Ub-AMC. Interestingly, the DUB activities were significantly increased when USP13 and USP10 were coincubated with each other, with Beclin1, or when all three proteins were incubated together, suggesting that the DUB activity can be significantly enhanced when USP13 interacts with its substrate Beclin1 or with USP10. Thus, reduced levels of USP10 and USP13 in the presence of spautin-1 or with Beclin1 knockdown may be due to their increased ubiquitination and degradation through the proteasome pathway. Consistent with this possibility, the levels of USP10 and Vps34 complexes in spautin-1-treated cells can be fully restored in the presence of MG132 (Figure S5A). Furthermore, the reduction in the levels of USP10 and USP13 caused by Beclin1 knockdown can also be inhibited by MG132 (Figure S7B).
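As an illustration of how a Ub-AMC comparison such as the one in Figure 5H could be quantified, the following minimal Python sketch estimates an initial hydrolysis rate as the least-squares slope of fluorescence against time, and expresses a combined incubation as a fold change over an enzyme incubated alone. It is not the analysis used in this study; the function names (`initial_rate`, `fold_activation`) and all readings are hypothetical values chosen for illustration.

```python
from statistics import mean


def initial_rate(times_min, fluorescence):
    """Least-squares slope of fluorescence vs. time (arbitrary units per minute)."""
    if len(times_min) != len(fluorescence) or len(times_min) < 2:
        raise ValueError("need at least two paired time/fluorescence readings")
    t_bar = mean(times_min)
    f_bar = mean(fluorescence)
    numerator = sum((t - t_bar) * (f - f_bar)
                    for t, f in zip(times_min, fluorescence))
    denominator = sum((t - t_bar) ** 2 for t in times_min)
    return numerator / denominator


def fold_activation(rate_combined, rate_alone):
    """Rate of a combined incubation relative to the enzyme incubated alone."""
    if rate_alone <= 0:
        raise ValueError("reference rate must be positive")
    return rate_combined / rate_alone


# Hypothetical fluorescence traces (arbitrary units); NOT data from this study.
times = [0, 5, 10, 15, 20]
usp13_alone = [100, 110, 120, 130, 140]
usp13_plus_beclin1 = [100, 130, 160, 190, 220]

rate_alone = initial_rate(times, usp13_alone)
rate_combined = initial_rate(times, usp13_plus_beclin1)
activation = fold_activation(rate_combined, rate_alone)
print(rate_alone, rate_combined, activation)
```

A linear fit of this kind is only appropriate over the early, approximately linear portion of the reaction, before substrate depletion curves the trace.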
Deubiquitination of USP10 by USP13
Since the treatment of spautin-1 also led to reduced levels of USP10, which was inhibited by the addition of MG132 (Figure S5A), it is likely that the levels of USP10 and USP13 are also regulated by ubiquitination. Interestingly, knockdown of either USP10 or USP13 led to reductions in the levels of the other (Figures 4A and 4B). Thus, we considered the possibility that USP10 and USP13 may regulate deubiquitination of each other. Consistent with this possibility, the ubiquitination levels of USP10 were reduced when cells were cotransfected with an expression vector of USP13, and the addition of spautin-1 inhibited the deubiquitination of USP10 by USP13 (Figure 5I). On the other hand, coexpression of USP10 with USP13 had a much less pronounced effect on ubiquitination of USP13 (data not shown). These results suggest that USP13 may directly regulate the deubiquitination of USP10; however, USP10 may regulate USP13 indirectly, perhaps by affecting the levels of Vps34 complexes. Since USP10 mediates the deubiquitination of Beclin1 and reduced levels of USP10 lead to increased ubiquitination and degradation of Vps34 complexes, reduced levels of Vps34 complexes as a result of USP10 reduction may in turn lead to destabilization of USP13. Our data support an interactive regulatory relationship of USP10 and USP13 with Vps34 complexes. We propose that USP10 and USP13 mediate the deubiquitination of Vps34
complexes to regulate the levels of class III PI3 kinase. Furthermore, Beclin1 also interacts with USP13 and regulates the stability of USP13. Since USP13 can also deubiquitinate USP10, regulating the stability of USP13 by Beclin1 provides a mechanism for Beclin1 to control the stability of USP10. Thus, our data suggest that the levels of Vps34 complexes may be coupled to the levels of USP10 and USP13.
Regulation of p53 via Vps34 Complexes and Deubiquitination
Since USP10 is known as a deubiquitinating protease of p53 (Yuan et al., 2010), inhibition of USP10 by spautin-1 may promote the degradation of p53. Consistent with this possibility, the treatment of spautin-1 led to a reduction in the levels of p53 that was inhibited in the presence of MG132 (Figure S5A). Furthermore, the spautin-1-induced reduction in the levels of p53 was inhibited by knockdown of MDM2, the major E3 ubiquitin ligase for p53 (Figure 6A). On the other hand, knockdown of MDM2 had no effect on the spautin-1-induced reduction of USP10, USP13, Vps34 or Beclin1. In addition, we found that knockdown of USP10, USP13, Beclin1, Vps34, p150, UVRAG, or Atg14L all led to a reduction in the levels of p53 (Figures 6B–6E; Figure S7A). Thus, the cellular levels of p53 may be coordinately regulated with those of Vps34 complexes via deubiquitinating enzymes such as USP10 and USP13. Finally, consistent with Beclin1 being the primary target of USP10 and USP13, the expression of ΔC-Beclin1, which behaves as a dominant negative in inhibiting the loss of Vps34 complexes and autophagy, also inhibited the reduction of p53 induced by spautin-1 (Figures 5F and 5G). Since our model predicts that the levels of class III PI3 kinase should be correlated with those of p53, we examined the levels of p53 in BECN1+/− mice. As shown in Figure 6F, the levels of Beclin1 in newborn BECN1+/− mice are approximately half of those in wt mice.
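One common way to express such a comparison numerically is to normalize each band to its loading control and then to the wild-type sample. The short Python sketch below assumes that densitometry values are already available; the function names (`normalized_level`, `relative_to_reference`) and all numbers are hypothetical and are not measurements from Figure 6F.

```python
def normalized_level(target_band, loading_band):
    """Target band intensity divided by the loading-control band intensity."""
    if loading_band <= 0:
        raise ValueError("loading-control intensity must be positive")
    return target_band / loading_band


def relative_to_reference(sample, reference):
    """Loading-normalized level of a sample relative to a reference sample.

    Each argument is a (target_band, loading_band) pair of intensities.
    """
    return normalized_level(*sample) / normalized_level(*reference)


# Hypothetical densitometry values (arbitrary units); NOT data from this study.
wild_type = (2000.0, 1000.0)     # (target band, loading-control band)
heterozygous = (900.0, 900.0)

ratio = relative_to_reference(heterozygous, wild_type)
print(ratio)
```

Normalizing to the loading control first keeps differences in the amount of lysate loaded per lane from being mistaken for differences in protein level.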
Consistent with a coordinated regulation of Vps34 complex components, the levels of Vps34, Atg14L, p150 and UVRAG are also significantly reduced in BECN1+/− tissues. Interestingly, as predicted by our model, the levels of USP10 and p53 in the heart, lung and liver of newborn BECN1+/− mice are correspondingly reduced. The levels of USP13 could not be examined at present due to the lack of an antibody that can recognize murine USP13. The levels of LC3II in BECN1+/− liver are reduced compared to those of wt. On the other hand, the reduction of LC3II in the heart and lung of BECN1+/− mice is not as obvious as that in the liver. The reduced levels of p53 provide an important molecular mechanism contributing to the increased tumorigenesis in BECN1+/− mice. To further characterize the effect of spautin-1 on p53, we examined the effect of spautin-1 on the cytoplasmic and the nuclear levels of p53 and found that the treatment of spautin-1 can reduce both nuclear and cytoplasmic p53 (Figure S7C). Furthermore, we found that the effects of spautin-1 on the levels of Vps34 complexes and autophagy could still be observed in the SKOV-3 ovarian cancer cell line, which is null for p53 (Figure S7D). These results are consistent with the target of spautin-1 being upstream and independent of p53. Taken together, our data suggest a model of the regulatory relationship between class III PI3 kinase and p53 via protein interaction and deubiquitination and the mechanism by which the
Figure 6. Regulation of p53 by Vps34 Complexes, USP10 and USP13 (A-E) H4-LC3-GFP cells were transfected with the indicated siRNAs for 72 hr and treated with rapamycin (0.25 μM) or spautin-1 (10 μM) for 4 hr. The cell lysates were analyzed by western blotting using the indicated antibodies. Anti-β-tubulin is a loading control. (F) Heart, lung and liver tissues of newborn BECN1+/+ and BECN1+/− mice were isolated and analyzed by western blotting using the indicated antibodies. Anti-actin was used as a loading control. Also see Figure S7.
treatment with spautin-1 leads to the reduced levels of Vps34 complexes and p53 (Figure 7).
DISCUSSION
In this study, we describe a potent small molecule inhibitor of autophagy, named spautin-1, that targets the deubiquitination activity of USP10 and USP13. Using spautin-1 as a tool, we demonstrate that the ubiquitination and degradation of Vps34 complexes are regulated by two ubiquitin-specific peptidases, USP10 and USP13. Inhibiting deubiquitination of Vps34 complexes by spautin-1 leads to increased ubiquitination and degradation of class III PI3 kinase complexes through the proteasomal pathway. Furthermore, our study demonstrates a physiological mechanism for regulating the class III PI3 kinase via protein deubiquitination. Since class III PI3 kinase plays an important role in regulating multiple intracellular vesicular trafficking events including autophagy and endocytosis, the ability of USP10 and USP13 to regulate the stability of Vps34 complexes provides a molecular mechanism for ubiquitination
and proteasomal degradation to control intracellular vesicular trafficking. Unlike class I and class II PI3 kinases, which need to be activated through receptor signaling, class III PI3 kinase is believed to be constitutively active (Lindmo and Stenmark, 2006). Thus, regulating the protein levels of class III PI3 kinase might provide an important mechanism for controlling the constitutively active class III PI3 kinase and intracellular levels of PtdIns3P. Our recent genome-wide siRNA screen on autophagy demonstrated a prominent role of class III PI3 kinase in regulating autophagy (Lipinski et al., 2010). It will be interesting in the future to examine whether regulating deubiquitination and ubiquitination of Vps34 complexes provides a general mechanism for controlling the constitutively active class III PI3 kinase activities under different physiological conditions. Consistent with the regulation of this deubiquitination mechanism, we found that the effect of spautin-1 on the levels of Vps34 complexes is dramatically enhanced in Bcap-37 cells under glucose-free conditions, suggesting that the deubiquitination of Vps34 complexes may be under the control of nutritional availability. Thus, inhibiting the deubiquitination of Vps34 complexes might
Figure 7. A Model: USP10 and USP13 Mediate the Deubiquitination of Vps34 Complexes and p53 (A) In cells treated with spautin-1, USP10 and USP13 are inhibited, which leads to increased ubiquitination and degradation of Beclin1 in Vps34 complexes and p53. (B) USP13 interacts with Beclin1 in Vps34 complexes, which provides a mechanism for Vps34 complexes to regulate the deubiquitination activity of USP13. USP13 also mediates the deubiquitination of USP10, which explains why knockdown of USP13 also leads to increased degradation of USP10. On the other hand, knockdown of USP10 leads to the loss of Vps34 complexes, which might in turn destabilize USP13.
provide a strategy for developing autophagy inhibitors as an anticancer therapy. USPs are cysteine proteases containing conserved regions in their amino acid sequence surrounding the Cys, His and Asp/Asn residues that form the catalytic triad. From the structural studies of a number of USP family members, it has been noted that the USP catalytic domains are often not appropriately aligned without binding to their substrates (Komander, 2010). For example, the catalytic Cys in USP7 shifts from a catalytically inactive position to an active position, where it interacts with the catalytic His, only upon binding to ubiquitin. On the other hand, although the catalytic machineries of USP14 and USP8 are properly aligned for catalysis in the absence of ubiquitin, the ubiquitin binding sites are blocked by the ubiquitin-binding surface loops (Hu et al., 2005). In addition, the Fingers domain of USP8, which is important for binding to ubiquitin, folds inward and blocks the ubiquitin binding site when not interacting with its substrates. Although no structural information for USP13 and USP10 is currently available, the enhanced DUB activity when USP13 interacts with USP10 or when they interact with Beclin1 suggests that the interaction of USP13 and USP10 with each other or with their substrates can lead to changes in their conformation which may be critical for the catalytic activities. This provides a possible model and mechanism for the close interactive regulatory relationship between USP13/USP10 and Vps34 complexes to explain why knockdown of USP13/USP10 or Vps34 complexes leads to reduced levels of the others. We demonstrate that USP10 and USP13 can both mediate the deubiquitination of Beclin1. Since the stabilities of the core components of Vps34 complexes are codependent upon each
other (Itakura et al., 2008), regulating deubiquitination of Beclin1 may be sufficient to control the levels of the whole complex. On the other hand, our study also demonstrates that Vps34 complexes can regulate their own levels by a feedback control of USP13. Since the interaction of USP13 and Beclin1 is detectable by coimmunoprecipitation, we propose that Beclin1 may have a close interaction with USP13. On the other hand, the interaction of Beclin1 and USP10 is consistent with that of enzyme/substrate, which is expected to be weak and transient in nature. Most DUBs with resolved structures show that the enzymes are in an unproductive conformation before binding to the substrates. These inactive states might result from the blocking of the active site by loops or a misalignment of catalytic triads. Thus, binding to Beclin1 may trigger a major change in the conformation of USP13 to allow catalysis. Our study demonstrates that class III PI3 kinase is an important tumor suppressor. The role of beclin1 as a haploinsufficient tumor suppressor is well-established; however, it has been unclear how a reduction in beclin1 expression can have such a dramatic impact on genomic instability and tumorigenesis (Karantza-Wadsworth et al., 2007). Our study demonstrates that a reduction of beclin1 expression leads to a reduced p53 level by increasing its ubiquitination, providing an important molecular mechanism contributing to the role of beclin1 as a haploinsufficient tumor suppressor that is frequently monoallelically lost in human breast, ovarian, and prostate cancers (Liang et al., 1999; Qu et al., 2003; Yue et al., 2003). Recently, using mutant mice with tissue-specific Atg5 or Atg7 deficiency, Takamura et al. showed that multiple benign tumors developed from autophagy-deficient liver, but not in other tissues
(Takamura et al., 2011). Thus, it is unlikely that the increased rate of tumorigenesis in BECN1+/− mice is due to autophagy deficiency as assumed originally. Since Beclin1 has been shown to interact with Bcl-2, it has also been proposed that decreased levels of Beclin1 in BECN1+/− cells may promote the activity of Bcl-2 to increase cell survival, which in turn promotes tumorigenesis (Pattingre et al., 2005). While our model does not rule out a potential contribution of Bcl-2 or autophagy deficiency to promoting tumorigenesis in BECN1+/− mice, the reduced levels of p53 as a result of the reduction in Beclin1 might provide a mechanism to promote genomic instability, which in turn leads to tumorigenesis. The ability of Beclin1 to regulate the levels of p53 provides a mechanism underlying the observations that monoallelic loss of beclin1 is sufficient to lead to DNA damage and genomic instability via gene amplification (Karantza-Wadsworth et al., 2007). Consistent with the contribution of p53 deficiency to increased tumorigenesis in the beclin1 heterozygous background, the tumor spectra of TP53+/− mice and BECN1+/− mice strongly overlap: the highest frequencies of tumors in both TP53+/− and BECN1+/− mice are lung carcinoma, hepatoma and lymphoma (Jacks et al., 1994; Qu et al., 2003). Furthermore, the beclin1 gene is frequently monoallelically deleted in human sporadic ovarian, prostate and breast cancers, similar to that of p53 mutations (http://www-p53.iarc.fr). The similarities in tumor spectra of BECN1+/− mice and TP53+/− mice suggest that reduced p53 levels play an important role in promoting tumorigenesis in BECN1+/− mice. Since the stabilities of the components of Vps34 complexes are largely codependent upon each other, reduced expression of other components of Vps34 complexes, including Vps34, p150, Atg14L and UVRAG, also leads to reduced levels of p53. Thus, our data suggest that all components of Vps34 complexes can regulate the levels of p53.
Curiously, although monoallelic loss of beclin1 is frequently observed in breast, ovarian and prostate cancers, the loss of heterozygosity of beclin1 was not commonly observed (Liang et al., 1999; Qu et al., 2003; Yue et al., 2003). Thus, beclin1 might not represent a “conventional” tumor suppressor such as Rb that satisfies the “Knudson two-hit hypothesis” criteria for classification as a tumor suppressor gene, which require demonstrating loss of both alleles, via either deletion or the presence of inactivating mutations (Knudson, 1971). Since TP53−/− mice are viable while BECN1−/− mice are early embryonic lethal, Vps34 complexes must provide a wider range of vital cellular functions than controlling p53 protein levels. Thus, while a reduction in beclin1 expression might promote tumorigenesis by reducing the levels of p53, a complete loss of beclin1 might negatively impact the development of certain tumors, as a complete loss of beclin1 leads to early embryonic lethality (Qu et al., 2003; Yue et al., 2003) and beclin1 might be required for cell viability, at least in certain cell types. Thus, unlike that of a “conventional” tumor suppressor, bi-allelic loss of beclin1 may not promote tumorigenesis and may lead to cell death. Consistent with this possibility, in contrast to normal cells, selected cancer cell lines demonstrate an increased sensitivity toward spautin-1 under starvation conditions, suggesting that spautin-1 may be used to synergize with selected chemotherapeutic agents to induce cancer cell death. Spautin-1 might
therefore provide a potential lead compound for developing a class of autophagy inhibitors as anticancer therapy.
EXPERIMENTAL PROCEDURES
High-Throughput Image Analysis
Cells were fixed with 4% paraformaldehyde (Sigma) and stained with 3 μg/ml DAPI (Sigma). Image data were collected with an ArrayScan HCS 4.0 Reader with a 20× objective (Cellomics ArrayScan VTI) for DAPI-labeled nuclei and GFP/RFP-tagged intracellular proteins.
Cell Lines and Culture Conditions
293T, MEF, HeLa, and Bcap-37 cells were cultured in DMEM with 10% NCS. H4-LC3-GFP, H4-FYVE-RFP and MDCK cells were cultured in DMEM supplemented with 10% FBS and 1× Na pyruvate (Invitrogen). Hs578Bst cells were cultured in Hybri-Care Medium (ATCC), supplemented with 30 ng/ml mouse EGF and 10% FBS. For starvation experiments, cells were cultured in DMEM without glucose (GIBCO) supplemented with 10% serum.
Antibodies
Rabbit polyclonal antibodies anti-USP10, anti-USP13, anti-UVRAG, anti-Vps34, and anti-p53 were from Abcam. Rabbit polyclonal antibody anti-LC3B and mouse monoclonal antibody anti-β-tubulin were from Sigma. Monoclonal antibodies anti-flag, anti-Myc and anti-HA were from Abmart. Rabbit polyclonal antibody anti-Beclin1 was from Santa Cruz. Polyclonal antibody anti-Atg14L was from MBL.
Protein-Lipid Blot Assay
Protein-lipid blot assays were carried out as reported (Dowler et al., 2002; Gozani et al., 2003). Briefly, lipids extracted from a 100 mm plate were spotted onto Hybond C-extra membrane (Amersham) and allowed to dry overnight in the dark. The membrane was incubated with lipid blocking buffer (1% BSA in TBST) for 1 hr, washed once in TBST for 30 min, and incubated with protein buffer (1 μg GST-tagged protein per 1 ml TBST with 1% BSA) overnight at 4°C.
Then the membrane was washed again in TBST four times for 30 min each, incubated with anti-GST (Sigma) in 1% BSA buffer for 4 hr, washed in TBST four times for 5 min each, incubated with secondary antibody for 1 hr, and washed in TBST four times for 5 min each. All incubations were at room temperature unless noted otherwise. The signals were visualized with ECL.
In Vitro Deubiquitination Assay
The in vitro deubiquitination assay was carried out using a protocol similar to that described in Yuan et al. (2010). Ubiquitinated Beclin1 was isolated from 293T cells transfected with expression vectors for HA-UB and FLAG-Beclin1. After incubation with the proteasome inhibitor MG132 (25 μM) and a pan-DUB inhibitor G5 (25 μM) for 6 hr, ubiquitinated Beclin1 was purified from the cell extracts with an anti-FLAG affinity column in FLAG-lysis buffer (50 mM Tris-HCl [pH 7.8], 137 mM NaCl, 10 mM NaF, 1 mM EDTA, 1% Triton X-100, 0.2% Sarcosyl, 1 mM DTT, 10% glycerol and fresh proteinase inhibitors). After extensive washing with the FLAG-lysis buffer, the proteins were eluted with FLAG peptides (Sigma). The recombinant Flag-USP10 and USP10CA were expressed in 293T cells, purified using a FLAG affinity column, and eluted with FLAG peptide. For the in vitro deubiquitination assay, ubiquitinated Beclin1 protein was incubated with recombinant USP10 in the deubiquitination buffer (50 mM Tris-HCl [pH 8.0], 50 mM NaCl, 1 mM EDTA, 10 mM DTT, 5% glycerol) for 2 hr at 37°C.
SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures, seven figures, and one table and can be found with this article online at doi:10.1016/j.cell.2011.08.037.
ACKNOWLEDGMENTS
We thank Dan Finley, Bruce Yankner, Zhujun Yao, Dana Christofferson, Dimitry Ofengeim, and Bénédicte Py for comments on the manuscript; Dr. Xin Xie of
the National Center for Drug Screening in Shanghai for help with the original compound screen for autophagy regulators; Dr. Wade Harper for providing expression vectors of USP10 and USP13; Dr. Zhenkun Lou for the mutant expression vector for USP10; Dr. Caroline Shamu (the director of the ICCB screening facility); and David Wrobel and Stewart Rudnicki for help with siRNA screening. This work was supported in part by an NIH Director’s Pioneer Award US (to J.Y.), grants from the Chinese Academy of Sciences (KGCX2-SW-209 and KJCX2-YW-H08 [to D.M.]), the National Natural Science Foundation of China (21020102037 [to D.M.], and 90813007 [to L.Z.]) and the National Institute on Aging US (R37 AG012859 and PO1 AG027916 [to J.Y.]). M.K. is a recipient of a Samsung Scholarship from South Korea. H.V.N. is supported in part by a fellowship from the Swedish Society for Medical Research (SSMF).
Received: November 29, 2010 Revised: June 24, 2011 Accepted: August 16, 2011 Published: September 29, 2011
REFERENCES
Dang, L.C., Melandri, F.D., and Stein, R.L. (1998). Kinetic and mechanistic studies on the hydrolysis of ubiquitin C-terminal 7-amido-4-methylcoumarin by deubiquitinating enzymes. Biochemistry 37, 1868–1879. Dowler, S., Kular, G., and Alessi, D.R. (2002). Protein lipid overlay assay. Sci. STKE 2002, pl6. Furuya, T., Kim, M., Lipinski, M., Li, J., Kim, D., Lu, T., Shen, Y., Rameh, L., Yankner, B., Tsai, L.H., et al. (2010). Negative regulation of Vps34 by Cdk mediated phosphorylation. Mol. Cell 38, 500–511. Gaullier, J.M., Simonsen, A., D’Arrigo, A., Bremnes, B., Stenmark, H., and Aasland, R. (1998). FYVE fingers bind PtdIns(3)P. Nature 394, 432–433. Gozani, O., Karuman, P., Jones, D.R., Ivanov, D., Cha, J., Lugovskoy, A.A., Baird, C.L., Zhu, H., Field, S.J., Lessnick, S.L., et al. (2003). The PHD finger of the chromatin-associated protein ING2 functions as a nuclear phosphoinositide receptor. Cell 114, 99–111.
Hu, M., Li, P., Song, L., Jeffrey, P.D., Chenova, T.A., Wilkinson, K.D., Cohen, R.E., and Shi, Y. (2005). Structure and mechanisms of the proteasome-associated deubiquitinating enzyme USP14. EMBO. J. 24, 3747–3756. Itakura, E., Kishi, C., Inoue, K., and Mizushima, N. (2008). Beclin 1 forms two distinct phosphatidylinositol 3-kinase complexes with mammalian Atg14 and UVRAG. Mol. Biol. Cell 19, 5360–5372. Jacks, T., Remington, L., Williams, B.O., Schmitt, E.M., Halachmi, S., Bronson, R.T., and Weinberg, R.A. (1994). Tumor spectrum analysis in p53-mutant mice. Curr. Biol. 4, 1–7. Karantza-Wadsworth, V., Patel, S., Kravchuk, O., Chen, G., Mathew, R., Jin, S., and White, E. (2007). Autophagy mitigates metabolic stress and genome damage in mammary tumorigenesis. Genes Dev. 21, 1621–1635. Knudson, A.G., Jr. (1971). Mutation and cancer: statistical study of retinoblastoma. Proc. Natl. Acad. Sci. USA 68, 820–823. Komander, D. (2010). Mechanism, specificity and structure of the deubiquitinases. Subcell Biochem. 54, 69–87. Lee, B.H., Lee, M.J., Park, S., Oh, D.C., Elsasser, S., Chen, P.C., Gartner, C., Dimova, N., Hanna, J., Gygi, S.P., et al. (2010). Enhancement of proteasome activity by a small-molecule inhibitor of USP14. Nature 467, 179–184. Liang, C., Feng, P., Ku, B., Dotan, I., Canaani, D., Oh, B.H., and Jung, J.U. (2006). Autophagic and tumour suppressor activity of a novel Beclin1-binding protein UVRAG. Nat. Cell Biol. 8, 688–699.
Liang, X.H., Jackson, S., Seaman, M., Brown, K., Kempkes, B., Hibshoosh, H., and Levine, B. (1999). Induction of autophagy and inhibition of tumorigenesis by beclin 1. Nature 402, 672–676. Lindmo, K., and Stenmark, H. (2006). Regulation of membrane traffic by phosphoinositide 3-kinases. J. Cell Sci. 119, 605–614. Lipinski, M.M., Hoffman, G., Ng, A., Zhou, W., Py, B.F., Hsu, E., Liu, X., Eisenberg, J., Liu, J., Blenis, J., et al. (2010). A genome-wide siRNA screen reveals multiple mTORC1 independent signaling pathways regulating autophagy under normal nutritional conditions. Dev. Cell 18, 1041–1052. MacPherson, J.D., Gillespie, T.D., Dunkerley, H.A., Maurice, D.H., and Bennett, B.M. (2006). Inhibition of phosphodiesterase 5 selectively reverses nitrate tolerance in the venous circulation. J. Pharmacol. Exp. Ther. 317, 188–195. Matsunaga, K., Saitoh, T., Tabata, K., Omori, H., Satoh, T., Kurotori, N., Maejima, I., Shirahama-Noda, K., Ichimura, T., Isobe, T., et al. (2009). Two Beclin 1-binding proteins, Atg14L and Rubicon, reciprocally regulate autophagy at different stages. Nat. Cell Biol. 11, 385–396. Pattingre, S., Tassa, A., Qu, X., Garuti, R., Liang, X.H., Mizushima, N., Packer, M., Schneider, M.D., and Levine, B. (2005). Bcl-2 antiapoptotic proteins inhibit Beclin 1-dependent autophagy. Cell 122, 927–939. Qu, X., Yu, J., Bhagat, G., Furuya, N., Hibshoosh, H., Troxel, A., Rosen, J., Eskelinen, E.L., Mizushima, N., Ohsumi, Y., et al. (2003). Promotion of tumorigenesis by heterozygous disruption of the beclin 1 autophagy gene. J. Clin. Invest. 112, 1809–1820. Schu, P.V., Takegawa, K., Fry, M.J., Stack, J.H., Waterfield, M.D., and Emr, S.D. (1993). Phosphatidylinositol 3-kinase encoded by yeast VPS34 gene essential for protein sorting. Science 260, 88–91. Shimizu, S., Kanaseki, T., Mizushima, N., Mizuta, T., Arakawa-Kobayashi, S., Thompson, C.B., and Tsujimoto, Y. (2004). 
Role of Bcl-2 family proteins in a non-apoptotic programmed cell death dependent on autophagy genes. Nat. Cell Biol. 6, 1221–1228. Simonsen, A., and Tooze, S.A. (2009). Coordination of membrane events during autophagy by multiple class III PI3-kinase complexes. J. Cell Biol. 186, 773–782. Takamura, A., Komatsu, M., Hara, T., Sakamoto, A., Kishi, C., Waguri, S., Eishi, Y., Hino, O., Tanaka, K., and Mizushima, N. (2011). Autophagy-deficient mice develop multiple liver tumors. Genes Dev. 25, 795–800. Wang, H., Yan, Z., Yang, S., Cai, J., Robinson, H., and Ke, H. (2008). Kinetic and structural studies of phosphodiesterase-8A and implication on the inhibitor selectivity. Biochemistry 47, 12760–12768. Yuan, J., Luo, K., Zhang, L., Cheville, J.C., and Lou, Z. (2010). USP10 regulates p53 localization and stability by deubiquitinating p53. Cell 140, 384–396. Yue, Z., Jin, S., Yang, C., Levine, A.J., and Heintz, N. (2003). Beclin 1, an autophagy gene essential for early embryonic development, is a haploinsufficient tumor suppressor. Proc. Natl. Acad. Sci. USA 100, 15077–15082. Zhang, L., Yu, J., Pan, H., Hu, P., Hao, Y., Cai, W., Zhu, H., Yu, A.D., Xie, X., Ma, D., et al. (2007). Small molecule regulators of autophagy identified by an image-based high-throughput screen. Proc. Natl. Acad. Sci. USA 104, 19023–19028. Zhong, Y., Wang, Q.J., Li, X., Yan, Y., Backer, J.M., Chait, B.T., Heintz, N., and Yue, Z. (2009). Distinct regulation of autophagic activity by Atg14L and Rubicon associated with Beclin 1-phosphatidylinositol-3-kinase complex. Nat. Cell Biol. 11, 468–476.
Absence of CNTNAP2 Leads to Epilepsy, Neuronal Migration Abnormalities, and Core Autism-Related Deficits

Olga Peñagarikano,1,2,3 Brett S. Abrahams,2,3,6 Edward I. Herman,2,7 Kellen D. Winden,1,2 Amos Gdalyahu,4 Hongmei Dong,2 Lisa I. Sonnenblick,2 Robin Gruver,4 Joel Almajano,2 Anatol Bragin,2 Peyman Golshani,2 Joshua T. Trachtenberg,4 Elior Peles,5 and Daniel H. Geschwind1,2,3,*

1Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine
2Department of Neurology, David Geffen School of Medicine
3Center for Autism Research and Treatment and Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior
4Department of Neurobiology, David Geffen School of Medicine
University of California, Los Angeles, CA 90095, USA
5Department of Molecular Cell Biology, The Weizmann Institute of Science, Rehovot 76100, Israel
6Present address: Departments of Genetics and Neuroscience, Price Center for Genetic and Translational Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA
7Present address: Yale MSTP Program, Yale School of Medicine, New Haven, CT 06511, USA
*Correspondence: [email protected]
DOI 10.1016/j.cell.2011.08.040
SUMMARY
Although many genes predisposing to autism spectrum disorders (ASD) have been identified, the biological mechanism(s) remain unclear. Mouse models based on human disease-causing mutations provide the potential for understanding gene function and novel treatment development. Here, we characterize a mouse knockout of the Cntnap2 gene, which is strongly associated with ASD and allied neurodevelopmental disorders. Cntnap2−/− mice show deficits in the three core ASD behavioral domains, as well as hyperactivity and epileptic seizures, as have been reported in humans with CNTNAP2 mutations. Neuropathological and physiological analyses of these mice before the onset of seizures reveal neuronal migration abnormalities, reduced number of interneurons, and abnormal neuronal network activity. In addition, treatment with the FDA-approved drug risperidone ameliorates the targeted repetitive behaviors in the mutant mice. These data demonstrate a functional role for CNTNAP2 in brain development and provide a new tool for mechanistic and therapeutic research in ASD.

INTRODUCTION

Autism spectrum disorders (ASD) form a heterogeneous neurodevelopmental syndrome characterized by deficits in language development, social interactions, and repetitive behavior/restricted interests (APA, 2000). Although not necessary for diagnosis, a number of other neurological or behavioral abnormalities are frequently associated with ASD, including hyperactivity, epilepsy, and sensory processing abnormalities (Geschwind, 2009). Research into the genetic basis for ASD has identified many genes, including common and rare variants (Sebat et al., 2007; Glessner et al., 2009; Weiss et al., 2009). Association, linkage, gene expression, and imaging data support the role of both common and rare variants of contactin associated protein-like 2 (CNTNAP2) in ASD. Originally, a recessive nonsense mutation in CNTNAP2 was shown to cause a syndromic form of ASD, cortical dysplasia-focal epilepsy syndrome (CDFE), a rare disorder resulting in epileptic seizures, language regression, intellectual disability, hyperactivity, and, in nearly two-thirds of the patients, autism (Strauss et al., 2006). Several reports have since linked this gene to an increased risk of autism or autism-related endophenotypes (Alarcón et al., 2008; Arking et al., 2008; Bakkaloglu et al., 2008; Vernes et al., 2008). Recently, we have shown that the same CNTNAP2 variant that increases risk for the language endophenotype in autism leads to abnormal functional brain connectivity in human subjects (Scott-Van Zeeland et al., 2010), consistent with emerging theories of ASD pathophysiology based on altered neuronal synchrony and disconnection (Belmonte et al., 2004).

Cntnap2 (also known as Caspr2) encodes a neuronal transmembrane protein member of the neurexin superfamily involved in neuron-glia interactions and clustering of K+ channels in myelinated axons (Poliak et al., 1999; 2003). However, the fact that the gene is expressed embryonically (Poliak et al., 1999; Abrahams et al., 2007; Alarcón et al., 2008) whereas myelination takes place postnatally, together with the increasing number of reports that link the gene to ASD, suggests an additional role for CNTNAP2 in early brain development.
This is supported by the imaging and pathology data in patients with CDFE, in whom nearly half manifest presumed neuronal migration abnormalities on MRI, confirmed by histological analysis of brain tissue resected from patients who underwent surgery for epilepsy (Strauss et al., 2006). Cell 147, 235–246, September 30, 2011 ª2011 Elsevier Inc. 235
The generation of valid animal models is critical for understanding the pathophysiology of ASD, assessing the potential of proposed treatments, and developing new, effective interventions. Ideally, mouse models should be based on a known genetic cause of the disease (construct validity), reflect key aspects of the human symptoms (face validity), and respond to treatments that are effective in the human disease (predictive validity) (Chadman et al., 2009; Nestler and Hyman, 2010). Here, we demonstrate that the Cntnap2 knockout mouse exhibits striking parallels to the major neuropathological features in CDFE and the core features of ASD. We observe defects in the migration of cortical projection neurons and a reduction in the number of GABAergic interneurons, as well as accompanying neurophysiological alterations. These data show that CNTNAP2 is involved in the development of cortical circuits and further support alterations in brain synchrony or connectivity in ASD pathophysiology. In addition, treating Cntnap2−/− mice with risperidone rescues the repetitive behavior, but not the social deficits, a dissociation parallel to what is seen in human patients. These data demonstrate the validity of the Cntnap2 KO as a mouse model for ASD and provide initial insight into the underlying mechanisms by which CNTNAP2 affects brain development and function.

RESULTS

Expression of Cntnap2 in Mouse Brain

Mutant mice lacking the Cntnap2 gene (Caspr2 null mice) were generated by Dr. Elior Peles (Poliak et al., 2003). We backcrossed the original ICR outbred strain onto the C57BL/6J background for 10–12 generations. Cntnap2−/− mice on the C57BL/6J background had a normal appearance; no differences in weight or growth rate were observed when compared with WT littermates. In WT brain, expression of CNTNAP2 was first detected by western blot around embryonic day 14 (E14).
As expected, CNTNAP2 was completely absent in the brain of homozygous mutant animals (Figure S1A available online). In situ hybridization demonstrated Cntnap2 expression in multiple adult brain regions, primarily cerebral cortex, hippocampus, striatum, olfactory tract, and cerebellar cortex (Figure S1B). Embryonic expression was also broad, including the ventricular proliferative zones of the developing cortex and ganglionic eminences (where excitatory projection neurons and inhibitory interneurons arise, respectively) overlapping with regions containing migrating neurons and postmigratory cells, indicating a possible role in neuron development and/or migration (Figure S1C).

Cntnap2−/− Mice Exhibit Epileptic Seizures and Abnormal Electroencephalogram Pattern

One of the major phenotypes of CDFE syndrome is the presence of epileptic seizures, which is associated with dense hippocampal astrogliosis (Strauss et al., 2006). In Cntnap2−/− mice, spontaneous seizures were commonly observed in animals older than 6 months of age. Seizures were consistently induced by mild stressors during routine handling (Movie S1). A behavioral study of the frequency and severity of the seizures (Racine, 1972) is presented in Table S1. Histological analysis of the hippocampal formation in these animals did not show any gross structural abnormalities, although a reduction in parvalbumin-positive interneurons was found (see below and Figure S3C). Reactive astrocytosis, as indicated by an enhanced expression of glial fibrillary acidic protein (GFAP), was observed throughout the hippocampus of mutant mice after the onset of seizures. Reactive astrocytosis was especially dense in the hilus but was not accompanied by neuronal loss in this structure as indicated by neuronal nuclei (NeuN) staining (Figure 1A). Electroencephalogram (EEG) recordings from freely moving mutant animals implanted with cortical electrodes at 8 months of age showed generalized interictal spike discharges during slow-wave sleep, whereas no electrical abnormalities were found in the EEG of mutant mice at times before the onset of seizures (Figure 1B). To avoid any confounding effect due to the presence of epileptic seizures, the following neuropathological, physiological, and behavioral characterization of Cntnap2 mutants was performed at an age before the onset of seizures.

Cntnap2−/− Mice Show Neuronal Migration Abnormalities

We performed detailed histological analyses of Cntnap2 KO brain and found no gross morphological changes in the brain structure of mutant animals by conventional staining techniques (cresyl violet staining), consistent with previous reports (Poliak et al., 2003). NeuN immunohistochemistry (IHC) revealed the presence of ectopic neurons in the corpus callosum of mutant mice at postnatal day 14 (P14), after neuronal migration is completed, which persisted through adulthood (Figure 2A). Interestingly, ectopic neurons of unknown origin in white matter were also reported in CDFE syndrome patients (Strauss et al., 2006). Patients with CDFE syndrome also show neuronal migration abnormalities, such as abnormal arrangements of neurons in clusters or migratory rows in the deep layers of cortex (Strauss et al., 2006).
We assessed laminar positioning of cortical projection neurons in WT and Cntnap2 KO mice with antibodies against CUX1, a marker for upper cortical layers, and FOXP2, a marker for deep cortical layers (Molyneaux et al., 2007). Cntnap2−/− mice show significantly higher numbers of CUX1+ cells in deep cortical layers (V and VI; Figure 2B). In contrast, deeper-layer FOXP2+ cells show the same pattern in both genotypes (Figure S2). We performed BrdU neuron birthdating at E16.5, after the birth of layer V neurons (Angevine and Sidman, 1961), to confirm that the CUX1+ ectopic neurons observed in deep cortical layers were being born concomitant with superficial cell identity and that their presence was not due to changes in cell fate. Sections of E16.5-labeled animals were analyzed at P7, when cortical lamination is essentially complete. As shown in Figure 2C, the distribution of BrdU+ cells is significantly different between WT and KO littermates, the latter showing a shortage of cells in upper cortical layers that are redistributed to lower cortical layers. These data indicate that CNTNAP2 is necessary for the normal migration of cortical projection neurons.

Reduced Number of Interneurons in Cntnap2−/− Mice

Since the identification of CNTNAP2 (CASPR2) in 1999 (Poliak et al., 1999), its expression has been reported mainly in
Figure 1. Cntnap2−/− Mice Show Epileptic Seizures and Abnormal EEG Pattern (A) Presence of reactive astrocytes in the hippocampal hilus (inset) of P180, but not P14, mutant mice without significant changes in neuronal density. GFAP, glial fibrillary acidic protein; NeuN, neuronal nuclei. Scale bar, 50 μm. GFAP quantification is shown as percentage of area occupied by reactive astrocytes. n = 4 mice/genotype for each age. Data are presented as mean ± SEM. ***p < 0.001. (B) EEG recording from mutant mice shows abnormal spike discharges (arrows) after seizure onset. n = 3 mice for each age. RF, right frontal; RP, right parietal. See also Table S1, Movie S1, and Figure S3C.
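The figure legends report group summaries as mean ± SEM. As a minimal sketch of that summary, assuming the conventional definition SEM = sample standard deviation / sqrt(n) (the legends do not spell out the formula), the calculation can be written as follows. The function name `mean_sem` and the input values are hypothetical and are not data from this study.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample SD / sqrt(n).

    Assumption: the SEM uses the sample (n - 1) standard deviation,
    which is the usual convention; the paper does not state this.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed for an SEM")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical measurements for four animals (illustrative only).
example_values = [2.0, 4.0, 4.0, 6.0]
group_mean, group_sem = mean_sem(example_values)
print(f"{group_mean:.2f} ± {group_sem:.2f}")
```

With n = 4 animals per genotype, as in Figure 1A, the SEM is half the sample standard deviation.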
excitatory pyramidal cells at the axon initial segment (Inda et al., 2006) and myelinated axons (Poliak et al., 2003; Horresh et al., 2008). The embryonic expression of Cntnap2 in the ganglionic eminence, where GABAergic interneurons arise, led us to analyze the expression of CNTNAP2 in interneurons. We analyzed available microarray data from single glutamatergic and GABAergic neuronal populations isolated from adult mice (Sugino et al., 2006) using weighted gene coexpression network analysis (WGCNA) (Zhang and Horvath, 2005). WGCNA groups genes into modules based on expression, such that genes showing similar expression patterns across samples are grouped together. Because genes within a module are usually coregulated or interact, modules are likely to represent biological pathways (Oldham et al., 2008). Interestingly, we found that Cntnap2 is part of a module with enriched expression in inhibitory relative to excitatory neurons (Figure S3A and Table S2). In addition, three of the most highly connected genes within the module, which are indicative of module function, are Gad1, Gad2, and Slc32a1 (VGAT), genes that are known to be necessary for GABAergic transmission (Figure 3A), suggesting the possibility of CNTNAP2 involvement in interneuron functioning. Because the potential relationship of CNTNAP2 to GABAergic interneuron function had not been previously explored and interneuron dysfunction has been associated with autism and
epilepsy (Levitt et al., 2004), we analyzed the number and distribution of interneurons in Cntnap2−/− mice. GAD1 immunostaining showed that Cntnap2−/− mice have a reduced number of GABAergic interneurons in all laminae (Figure 3B). To test whether the reduction in interneurons was subtype specific, we analyzed the number and distribution of the largely nonoverlapping subgroups of interneurons in rodents: parvalbumin (PVALB), calretinin (CALB2), and neuropeptide Y (NPY) (Wonders and Anderson, 2006). We observed that PVALB+ interneurons were the most affected (Figure 3C). Because striatal interneurons are also born in the ganglionic eminence (Marín et al., 2001), we next examined the number of interneurons in the striatum. We observed a reduction of striatal GABAergic interneurons in mutants (Figure 3D) without a change in cholinergic interneurons (Figure S3B). PVALB+ interneurons were also reduced in the hippocampus (Figure S3C). Together, these data indicate a role for CNTNAP2 in GABAergic interneuron development.

Cntnap2−/− Mice Show Reduced Cortical Neuronal Synchrony

The abnormal positioning and migration of excitatory principal neurons and reduction in inhibitory interneurons suggested that neuronal network activity might be abnormal. This was particularly important, as GABAergic interneurons are recognized as playing a crucial role in the precise timing of neuronal activity (Sohal et al., 2009) and there is increasing evidence of abnormal neural synchrony as a pathophysiological mechanism in ASD (Uhlhaas and Singer, 2006; Belmonte et al., 2004). In vivo two-photon calcium imaging of layer II/III neurons from somatosensory cortex (Figure 4A) indicated that the neuronal firing pattern of Cntnap2−/− mice was highly asynchronous relative to WT mice (Figure 4B).
The mean correlation coefficient of the firing timing between all cell pairs of neurons over the distance range analyzed was significantly lower in mutant mice
Figure 2. Cntnap2−/− Mice Show Neuronal Migration Abnormalities (A) Presence of ectopic neurons in the corpus callosum of Cntnap2−/− mice. NeuN, neuronal nuclei; CTX, cortex; STR, striatum. Scale bar, 20 μm. (B) Expression of Cux1, a marker for upper-layer projection neurons, in somatosensory cortex of WT and Cntnap2−/− mice. Note the abnormal distribution of CUX1-positive cells in groups (arrowheads) and rows (arrow) in deep cortical layers of mutant mice. Scale bar, 50 μm. (C) Neuronal birthdating analysis. BrdU injected at E16.5 was immunostained at P7. Note abnormal distribution of neurons in groups (arrowheads) and rows (arrow) in deep cortical layers of mutant animals. Scale bar, 50 μm. Data are presented as mean ± SEM. *p < 0.05, **p < 0.01, ***p < 0.001. n = 3 mice/genotype for each age. See also Figure S2.
(Figure 4C). In addition, neither the average firing amplitude (Figure 4D) nor the average firing rate (Figure 4E) changed significantly between genotypes, suggesting that the asynchronous firing observed in mutant animals is likely due to a network dysfunction rather than to abnormalities in neuronal activity or conduction per se. This is consistent with previous data that show no alterations in peripheral and central nerve conductance in Cntnap2−/− mice (Poliak et al., 2003).

Behavioral Characterization of Cntnap2−/− Mice

Because the diagnostic criteria in ASD are currently based on behavioral symptoms rather than molecular or neuroanatomical indicators, we performed a complete battery of behavioral tests to examine the potential effect of the absence of CNTNAP2 on behavior. A summary of the tests performed relevant to each behavioral domain related to ASD (Silverman et al., 2010) and the results obtained is presented in Table S3. Cntnap2−/− mice displayed significantly greater locomotor activity than their WT littermates in the open field test (Figures S4A and S4B). This increased activity was also noted when testing for motor coordination and balance with the rotorod, as mutant mice performed significantly better than WT (Figure S4C). Interestingly, several animal models of autism (Kwon et al., 2006;
Nakatani et al., 2009) and hyperactivity (Gerlai et al., 2000; Vitali and Clarke, 2004) have been reported to perform better in the rotorod test. To assess anxiety-related responses, we performed the light-dark exploration test and observed no significant differences between genotypes (Figure S4D). Potential sensory deficits were assessed with the hot plate test. As shown (Figure S4E), Cntnap2−/− mice demonstrated hyperreactivity to thermal sensory stimuli. To further characterize this sensory deficit, we measured the acoustic startle response (Figure S4F) and the degree of prepulse inhibition (Figure S4G) and found no significant differences between genotypes. In addition, olfaction analysis did not show any deficit in mutant mice, which instead appeared to perform better than WT mice in the buried food test (Figure S4H).

Cntnap2−/− Mice Show Stereotypic Motor Movements and Behavioral Inflexibility

Spatial learning and memory were evaluated in the Morris water maze (MWM). Both WT and mutant mice showed similar learning curves, represented as time to locate the platform in the training phase of the test (Figure 5A). Probe trials were performed the day after the last training trial, and an active search for the platform was evaluated. As expected from their learning curves, WT and KO animals performed similarly in the probe test (Figures 5B and 5C). To assess behavioral flexibility and perseveration, we first performed a classic reversal task using the MWM. We found that Cntnap2−/− mice showed impairment in learning the new location of the platform (Figure 5D) and performed poorly in the probe test (Figures 5E and 5F). To study perseveration in more detail, we performed the spontaneous alternation T maze test. As shown in Figure 5G, Cntnap2−/− mice showed a significantly higher number of no alternations in a standard 10 trial test,
Figure 3. Reduced Number of GABAergic Interneurons in Cntnap2−/− Mice (A) Network plot showing the top 300 connections within the Cntnap2 module. (B) Expression of Gad1 at P14 shows a reduced number of GABAergic interneurons in somatosensory cortex of KO mice. Scale bar, 50 μm. (C) Expression of the interneuron markers PVALB, CALB2, and NPY at P14. Scale bar, 50 μm. (D) Expression of the GABAergic markers PVALB and NPY in striatum. Scale bar, 100 μm. Interneuron quantification is shown as percentage of WT. *p < 0.05 and **p < 0.01. For (B)–(E), n = 4 mice/genotype. Data are presented as mean ± SEM. See also Table S2 and Figure S3.
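The idea behind the weighted coexpression network summarized in Figure 3A can be illustrated with a minimal sketch. In the unsigned formulation of Zhang and Horvath (2005), the adjacency between two genes is the absolute Pearson correlation of their expression profiles raised to a soft-thresholding power β, and a gene's connectivity is the sum of its adjacencies; highly connected genes are the module "hubs." The sketch below is not the authors' pipeline: the value β = 6, the gene labels, and the expression values are all assumptions for illustration, and module detection itself (topological overlap and hierarchical clustering) is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant profile: correlation is undefined")
    return sxy / math.sqrt(sxx * syy)


def adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    beta=6 is an assumed illustrative value, not one taken from the paper.
    """
    genes = list(expr)
    return {
        g: {h: abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g}
        for g in genes
    }


def connectivity(adj):
    """Connectivity k_i = sum over j of a_ij; high k marks hub genes."""
    return {g: sum(row.values()) for g, row in adj.items()}


# Hypothetical expression profiles across four samples (illustrative only).
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.0, -1.0, 1.0, -1.0],
}
k = connectivity(adjacency(toy_expr))
for gene, value in sorted(k.items(), key=lambda item: -item[1]):
    print(gene, round(value, 3))
```

Raising the correlation to a power suppresses weak correlations while preserving strong ones, which is why tightly coregulated genes such as geneA and geneB in this toy example dominate the connectivity ranking.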
confirming the perseveration observed in the reversal task on the MWM. In addition to perseveration, repetitive behavior also encompasses motor stereotypies, and both tend to co-occur in children with ASD. Consistent with our other observations, we found that Cntnap2−/− mice spent almost three times more time grooming themselves than their WT littermates (Figure 5H).

Cntnap2−/− Mice Show Communication and Social Behavior Abnormalities

Isolation-induced ultrasonic vocalizations (UsVs) are distress calls emitted by pups when separated from their mother, representing an infant-mother vocal communicative behavior that is thought to be relevant to ASD (Crawley, 2007; Silverman et al., 2010). We analyzed the pattern of UsV emission in WT and Cntnap2−/− mice through development (P3, P6, P9, and P12; Scattoni et al., 2009) and found that Cntnap2−/− mice emitted a significantly lower number of ultrasonic calls than WT littermates at all ages (Figure 6A). We next analyzed social behavior in pairs of unfamiliar mice at age P21 (juvenile play test). As shown in Figure 6B, Cntnap2−/−
mice spent significantly less time interacting with each other and instead showed increased repetitive behaviors such as grooming and digging. We confirmed that the abnormal social behavior was not due to an olfaction defect (Table S3), as Cntnap2 mutant mice actually perform significantly better than WT in the buried food olfaction test (Figure S4H). We also performed a three-chamber social interaction test in adults (Crawley, 2007). As expected for the highly social strain C57BL/6J, WT mice showed a highly significant preference for the cup with a mouse, whereas KO mice did not show a significant preference (Figure 6C). Finally, we examined nest-building behavior, which is relevant to home-cage social behaviors and mediated by dopaminergic function in mice (Szczypka et al., 2001). Cntnap2−/− mice were also significantly impaired in this task (Deacon, 2006), scoring less than half of the WT criterion (Figure 6D).

The Atypical Antipsychotic Risperidone Rescues the Hyperactivity, Repetitive Behavior/Perseveration, and Nesting Deficits in Cntnap2−/− Mice

An important goal in developing mouse models of neuropsychiatric diseases is testing therapeutic treatments. Risperidone was the first drug approved by the United States Food and Drug Administration (FDA) for symptomatic treatment of ASD, alleviating hyperactivity, repetitive behavior, aggression, and self-injurious behavior (McDougle et al., 2000, 2008). Cntnap2−/− mice and their WT littermates were treated with risperidone or vehicle for 7–10 days and were tested for improvement in behavior. There were no significant changes in open field activity in treated WT mice, indicating that the dose used for treatment was not sedating (Figures 7A and 7B). However, risperidone decreased the activity levels of Cntnap2−/− mice to normal WT levels (Figures 7A and 7B). Consistent with the lack of a sedative effect, treated KO mice improved their nest-building score as well,
Figure 4. Reduced Neuronal Synchronization in Cntnap2−/− Mice (A) Representative images of the calcium (top) and astrocytic (bottom) signals of a 3 min movie stack. OGB-1, Oregon green-1; SR-101, sulforhodamine-101. (B) Correlation coefficient of neuronal firing for every pair of neurons over cell distance. (C) Mean correlation coefficient over the distance range analyzed (240 μm). (D) Firing amplitude presented as the summed fluorescence change across all imaged cells in a 3 min time window. (E) Mean number of firing events per cell in a 3 min time window. n = 4 mice/genotype. Data are presented as mean ± SEM. ***p < 0.001.
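The synchrony measure in Figures 4B and 4C is a correlation coefficient computed for every pair of imaged neurons and then averaged. One plausible way to compute such a measure is sketched below, assuming each cell's activity has already been reduced to a binned event train and that the Pearson coefficient is used; the exact event detection and correlation procedure of the study is not described in this section, so the function names and the toy traces are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length activity traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("silent or constant trace: correlation is undefined")
    return sxy / math.sqrt(sxx * syy)


def mean_pairwise_correlation(traces):
    """Average the correlation coefficient over all distinct cell pairs."""
    pairs = list(combinations(traces, 2))
    if not pairs:
        raise ValueError("at least two cells are needed")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical binned event trains (1 = event in that time bin).
synchronous_cells = [
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
]
asynchronous_cells = [
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
]
print(mean_pairwise_correlation(synchronous_cells))
print(mean_pairwise_correlation(asynchronous_cells))
```

Both toy populations fire the same number of events per cell, so they differ only in timing. This mirrors the observation that firing rate and amplitude were unchanged between genotypes while synchrony was reduced.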
a task that depends upon sustained activity (Figure 7C). Risperidone also reversed the increased grooming behavior of Cntnap2−/− mice (Figure 7E) and perseveration in the T maze (Figure 7F). To assess any effect of risperidone on social behavior, we performed the three-chamber social interaction test and found no improvement with treatment (Figure 7G); nor did we find improvement in sensory hypersensitivity (Figure 7D).

DISCUSSION

The development of mouse models of ASD is crucial to study the disorder at the molecular level, gain insight into disease mechanisms, and test potential pharmacological interventions. Here, we show that the consequences of CNTNAP2 deficiency in the
mouse resemble many of the behavioral and cognitive features observed in patients with idiopathic autism and of the pathological features observed in patients with recessive CNTNAP2 mutations that cause a Mendelian form of syndromic autism (Strauss et al., 2006). Cntnap2−/− mice have normal anxiety-related responses, visual spatial memory, and sensorimotor integration but show abnormal vocal communication, repetitive and restrictive behaviors, and abnormal social interactions. In addition, they also show hyperactivity and epileptic seizures, both features described in CDFE patients and in many patients with ASD. Heterozygous animals did not show any of the behavioral (Figure S5) or neuropathological (data not shown) abnormalities observed in homozygous knockouts; nor did they develop epileptic seizures, consistent with the recessive nature of the pathology in humans.

Autism and epilepsy are neurodevelopmental syndromes with a high frequency of co-occurrence (Geschwind, 2009), which suggests shared underlying mechanisms. At a molecular level, CNTNAP2 is a single pass transmembrane protein with a short cytoplasmic region involved in clustering K+ channels at juxtaparanodes in myelinated axons (Horresh et al., 2008) and a long extracellular region that forms a neuron-glia cell adhesion complex with contactin 2 (CNTN2, also known as TAG-1), which is necessary for the proper localization of K+ channels in this structure (Poliak et al., 2003). Thus, defects in myelination and K+ channel mislocalization at the nodes of Ranvier could theoretically lead to epilepsy in Cntnap2−/− mice. However, this is not likely the case, as both light microscopic and ultrastructural analysis using electron microscopy in the peripheral and central nerves in CNTNAP2-deficient mice (Poliak et al., 2003) showed that nodal morphology and myelination were
Figure 5. Cntnap2−/− Mice Show Motor Stereotypic Movements and Behavioral Inflexibility (A–F) Morris water maze (MWM) test. (A–C) Learning. (D–F) Reversal learning. (A) Learning curve as indicated by the latency to locate a hidden platform (up to 60 s) during a 5 day training period. The average of four trials per day is presented. (B) Probe test (the platform is removed), showing the percentage of time spent in each pool quadrant (in 60 s). Note that both genotypes spend significantly more time in the target quadrant. TA, target; AR, adjacent right; AL, adjacent left; OP, opposite. (C) Number of platform site crossings during the probe test. (D) Learning curve for the reversal task of the MWM test, showing the latency to locate the new platform. (E) Probe test. Note that WT, but not KO, mice spend significantly more time in the target quadrant. (F) Number of platform site crossings during the probe test. (G) Number of no alternations in the T maze spontaneous alternation test (10 trials). (H) Time spent grooming over a 10 min period. *p < 0.05. n = 10 mice/genotype. Data are presented as mean ± SEM. See also Table S3 and Figure S4.
normal. In addition, electrophysiological investigation of nerve conductance revealed no abnormalities in conduction velocity, refractory period, or excitability (Poliak et al., 2003). In the current study, neuropathological analysis of Cntnap2−/− mice revealed two major mechanisms that have been shown to lead to epilepsy in humans: cortical neuronal migration abnormalities and a reduction in the number of GABAergic interneurons. Whereas neuronal migration abnormalities might have been expected based on observations in patients with CDFE syndrome, the
reduction in GABAergic neurons was unexpected, as CNTNAP2 has not been previously associated with GABAergic neuronal function and no such deficit has been demonstrated in CDFE. These data suggest that assessment of interneurons in patients with CDFE would be worthwhile. Further, whether this reduction in interneurons is also due to a migration defect or, rather, to a defect in neurogenesis, differentiation, and/or survival remains to be elucidated in future work. Nevertheless, the embryonic expression of the gene in the ganglionic eminences and in
Figure 6. Cntnap2−/− Mice Show Communication and Social Behavior Abnormalities (A) UsV. Number of calls from pups when separated from their mother at P3, P6, P9, and P12 (5 min). (B) Juvenile play. Time involved in social interaction, as well as repetitive behaviors (grooming and digging) in pairs of mice matched in genotype and sex at age P21 (10 min). (C) Three-chamber social interaction test. Time interacting with either an unfamiliar mouse or an inanimate object (empty cup) in 10 min. (D) Nesting behavior. The nesting score represents the amount of nesting material used after a 16 hr period (1, poor; 5, good). n = 10 mice/genotype. Data are presented as mean ± SEM. *p < 0.05, **p < 0.01, ***p < 0.001. See also Figure S5.
migrating interneurons supports a role for CNTNAP2 in the early development and migration of these cells. The large extracellular domain of CNTNAP2 is composed of a number of protein-protein interaction domains that are common to cell adhesion molecules, including laminin G, EGF repeats, and discoidin-like domains (Poliak et al., 1999). During myelination, CNTNAP2 localizes to the developing nodes of Ranvier, where, as previously mentioned, it interacts extracellularly with CNTN2. Interestingly, CNTN2 is also expressed embryonically, and blocking its function results in migration abnormalities of cortical pioneer neurons and GABAergic interneurons (Denaxa et al., 2001; Morante-Oria et al., 2003), although normal numbers of interneurons were reported in CNTN2-deficient mice, likely due to compensatory mechanisms (Denaxa et al., 2005). Thus, analysis of CNTNAP2 interactors, including CNTN2, during embryogenesis could provide insight into the role of CNTNAP2 in neuronal migration.
One physiological consequence of the observed neuropathology caused by Cntnap2 knockout is significantly reduced neuronal synchronization. It is generally accepted that most cognitive functions are based on the coordinated interactions of large neuronal ensembles within and across different specialized brain areas (Uhlhaas and Singer, 2006). Synchronization determines the pattern of neuronal interactions such that effective neuronal connectivity diminishes when synchronization is less precise (Womelsdorf et al., 2007). A number of functional neuroimaging studies have reported reduced connectivity in ASD (Just et al., 2004; Villalobos et al., 2005; Cherkassky et al., 2006; Damarla et al., 2010), supporting the notion that the deficits in cognition and behavior associated with ASD are most likely the result of a developmental disconnection (Geschwind and Levitt, 2007).
Interestingly, we have recently associated common genetic risk variants in CNTNAP2 with abnormal functional brain connectivity in humans (Scott-Van Zeeland et al., 2010). Our observations of migration abnormalities and reduced number of GABAergic interneurons in Cntnap2−/− mice, together
with the normal global neuronal activity as measured by the firing rate and amplitude, suggest an abnormal neuronal circuit architecture as the cause of the asynchronous firing pattern, rather than abnormalities in neuronal function per se. There are a number of factors that contribute to the precise timing of neural activity (reviewed in Wang, 2010). Interestingly, GABAergic interneurons, in particular PVALB interneurons, have been reported to play a crucial role in the rhythmic pacing of cortical neuronal activity (Sohal et al., 2009). Therefore, further studies analyzing the structure of neuronal networks and interneuron function in Cntnap2−/− mice may have important implications for both the potential understanding and treatment of ASD.
The ultimate goal of understanding the pathophysiology of the disorder is to develop therapeutic interventions that restore normal brain activity and, ultimately, ameliorate the associated cognitive and behavioral deficits. Recent studies in mouse models are very encouraging in this regard, including models of Rett syndrome (Guy et al., 2007), fragile X syndrome (Dölen et al., 2007), neurofibromatosis type 1 (Costa et al., 2002), Down’s syndrome (Fernandez et al., 2007), and tuberous sclerosis (Ehninger et al., 2008). Here, we have shown that risperidone efficiently reduces hyperactivity, motor stereotypies, and perseveration in Cntnap2−/− mice while having no effect on sociability. This observation provides evidence that different pathways lead to the ASD-associated core domains of social and repetitive behavior observed in this mouse model, parallel to the situation in humans, and supports the validity of this mouse model for testing new pharmacological treatments.
Repetitive behavior is recognized as reflecting a disruption of the coordinated function of the cortico-striatal circuit (Albin et al., 1989).
In brief, two main pathways compose this system: the direct pathway, which promotes motor behavior, and the indirect pathway, which inhibits it (Gerfen et al., 1990). In general, stereotypies indicate an unbalanced activity of this network favoring the direct pathway (Lewis et al., 2007). At the neuronal level, the ultimate firing output to the direct and indirect pathways is determined by excitatory inputs from cortex and thalamus and inhibitory inputs from local interneurons within the striatum (Kreitzer and Malenka, 2008). Interestingly, PVALB+ fast-spiking interneurons, which are the main source of inhibitory input in
Figure 7. Risperidone Rescues Hyperactivity and Repetitive Behavior/Perseveration in Cntnap2−/− Mice (A and B) Open field test. Distance traveled (A) and velocity (B) for vehicle- (PBS) and drug-treated WT and KO mice (20 min). (C) Nesting behavior. Risperidone improves the nesting score of Cntnap2−/− mice. (D) Hot plate. Risperidone had no effect on the hyperreactivity to thermal stimuli. (E) Grooming behavior is reduced in risperidone-treated mutant mice (10 min). (F) Risperidone improved spontaneous alternation of mutant mice. (G) Drug treatment had no effect in the three-chamber social interaction test. n = 10 mice/genotype and treatment condition. Data are presented as mean ± SEM. *p < 0.05, **p < 0.01, ***p < 0.001.
the cortico-striatal circuit, have recently been shown to target mainly the direct pathway (Gittis et al., 2010). Therefore, the reduced number of this type of interneuron in Cntnap2−/− mice likely results in overactivation of this pathway, leading to hyperactivity and repetitive behavior. Indeed, risperidone is known to potentiate the indirect pathway, which likely rebalances the activity of this network, alleviating these behaviors. Nest building has also been shown to be related to the dopaminergic pathway (Szczypka et al., 2001) and has been reported as disrupted in mice with hyperactivity (Kwon et al., 2006; Zhou et al., 2010). That risperidone also normalized the nesting ability in KO mice provides additional evidence for abnormal striatal dopaminergic function in this model. Exploration of cortico-striatal function in Cntnap2−/− mice will provide a better understanding of the neural basis of repetitive behavior in ASD. In addition, the dissociation between repetitive and social behavior with regard to treatment response suggests that Cntnap2−/− mice will be useful for dissecting the distinct circuitries involved in these core components of autistic-related abnormal behavior. Finally, because understanding of CNTNAP2 function was previously focused primarily on postnatal development, these data set a new direction for investigation of CNTNAP2’s role during development and in the formation and function of neuronal circuits.
EXPERIMENTAL PROCEDURES
Mice
Cntnap2 mutant and WT mice were obtained from heterozygous crossings and were born with the expected Mendelian frequencies. The day of vaginal plug detection was designated as E0.5 and the day of birth as P0. The three obtained genotypes were housed together with three to four same-sex mice per cage. They were kept in a 12 hr light/12 hr dark cycle and had ad libitum access to food and water. All procedures involving animals were performed in accordance with the UCLA animal research committee.
Western Blot, In Situ Hybridization, and Immunohistochemistry
Western blot, in situ hybridization, and immunohistochemistry were performed using standard methods. See Extended Experimental Procedures for details.
BrdU Labeling
Timed pregnant mice (E16.5) received a single i.p. injection of BrdU (50 mg/kg body weight, Sigma).
Imaging and Cell Counts
Images were acquired with a Zeiss LSM-510 laser-scanning confocal microscope. Eight anatomically matched sections from at least three mice per
genotype were selected for cell counting. For cortical images, cells were counted in matched areas of fixed size (1.2 mm wide) in each hemisphere of somatosensory cortex. The boundaries of the different cortical layers were determined by counterstaining each section with DAPI.
Weighted Gene Coexpression Network Analysis
The original microarray data set was deposited by Sugino et al. (2006) in the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE2882). Weighted gene coexpression network analysis (WGCNA) was performed as previously described (Winden et al., 2009). More detailed conditions are included in the Extended Experimental Procedures.
Electroencephalographic Recordings
Mice were deeply anesthetized with isoflurane, and microelectrodes (0.005 inch diameter) were implanted over right frontal and parietal cortices. Mice were allowed to recover for 48 hr, and EEG activity was recorded daily for up to 2 weeks.
In Vivo Two-Photon Calcium Imaging
Neuronal activity was imaged using a calcium indicator injected in layer II/III of somatosensory cortex in young adult mice (2–4 months of age). See the Extended Experimental Procedures for details.
Behavioral Tests
Ten mice per genotype were evaluated for each behavioral test. Mice were tagged with either an ear tag number or a toe tattoo (in the case of pups). Experimenters were blinded to the genotype during testing. Behavioral tests were performed in the UCLA behavioral test core and analyzed with TopScan (Clever Sys, Inc.) automated system. UsV were analyzed with Avisoft sound analysis and synthesis software for laboratory animals.
Morris Water Maze
The MWM test was performed as described elsewhere (Vorhees and Williams, 2006). In brief, mice were trained to locate a hidden platform based on distal visual cues to escape from the pool. Mice received four training trials per day (with different start points) for 5 consecutive days.
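The training summary described above (four trials per day, with the daily average plotted as a learning curve and group data reported as mean ± SEM) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the 60 s cap is taken from the Figure 5 legend, whereas the data layout, the function names, the choice to average within each mouse before averaging across mice, and all latency values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

MAX_LATENCY_S = 60.0  # trial cutoff, from the Figure 5 legend ("up to 60 s")


def daily_latency(trials):
    """Average one mouse's trial latencies for one day, capping each at the cutoff."""
    return mean(min(t, MAX_LATENCY_S) for t in trials)


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample SD / sqrt(n); needs at least two values."""
    return mean(values), stdev(values) / sqrt(len(values))


def learning_curve(latencies):
    """latencies[mouse][day] is a list of trial latencies in seconds.

    Returns one (mean, SEM) pair per training day, computed across mice.
    Averaging within mouse first is an assumption of this sketch.
    """
    n_days = len(latencies[0])
    return [
        mean_sem([daily_latency(mouse[day]) for mouse in latencies])
        for day in range(n_days)
    ]


# Hypothetical data: 3 mice x 2 days x 4 trials (seconds).
example = [
    [[60, 55, 50, 45], [30, 25, 20, 15]],
    [[60, 60, 58, 52], [35, 30, 28, 22]],
    [[58, 50, 47, 40], [28, 20, 18, 12]],
]

for day, (m, sem) in enumerate(learning_curve(example), start=1):
    print(f"day {day}: {m:.1f} ± {sem:.1f} s")
```

The same helper could be applied to the reversal-learning days, since the legend describes both curves as latencies to reach the platform.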
On day 6, the platform was removed and a probe test was performed. The next day, the platform was moved to the opposite quadrant and the reversal task of the test was started. Mice again received four training trials per day to locate the new platform. A probe test was performed on day 10.
T Maze Spontaneous Alternation
Mice were placed on the base of a T maze and were given the choice to explore either the right or left arm of the maze for ten consecutive trials. A choice was assumed to be made when the mice stepped with all four paws into an arm. At that moment, the gate to that arm was closed and the animal was allowed to explore the arm for 5 s.
Ultrasonic Vocalization
Pups were removed from the dam and placed in individual heated soundproof chambers equipped to record ultrasonic vocalization (UsV) for 5 min. To avoid potential confounding effects due to temperature, the room was maintained at 21°C and body temperature was measured with a rectal probe after 5 min of the test at P6 (35°C in both genotypes).
Juvenile Play
Mice at age P21 were placed in a cage (previously habituated to it) with an unfamiliar mouse matched in genotype and sex for 10 min. The time mice were engaged in social interaction (nose-to-nose sniffing, nose-to-anus sniffing, following or crawling on/under each other) and the time mice spent engaged in repetitive behaviors (grooming and digging) were measured by a human observer (Silverman et al., 2010).
Three-Chamber Social Interaction Test
The social interaction test was performed as previously described (Silverman et al., 2010). In brief, after habituation, a mouse was placed in the central chamber of a clear Plexiglas box divided into three interconnected chambers
and was given the choice to interact with either an empty wire cup (located in one side chamber) or a similar wire cup with an unfamiliar mouse inside (located in the opposite chamber). Time sniffing each cup was measured.
Drug Administration
Risperidone (0.2 mg/kg, Sigma) was administered by a daily i.p. injection in a volume of 10 ml/kg for 7 consecutive days. Behavioral tests were performed on days 8, 9, and 10. Mice also received drug treatment during these days approximately 1 hr prior to testing.
Statistical Analyses
All results are expressed as mean ± SEM. For cell quantifications and neuronal synchrony comparisons between groups, a one-way ANOVA was used. To compare cell distributions within the cortical layers, we used two-way ANOVA. For behavioral tests, either one- or two-way ANOVA with repeated measures was used, followed by Bonferroni-Dunn post hoc tests where applicable.
SUPPLEMENTAL INFORMATION
Supplemental Information includes Extended Experimental Procedures, five figures, three tables, and one movie and can be found with this article online at doi:10.1016/j.cell.2011.08.040.
ACKNOWLEDGMENTS
We thank the UCLA behavioral testing core and its supervisor, Dr. Ravi Ponnusamy, for assistance with behavioral testing. We also thank Dr. Alcino Silva, codirector of the core, for his critical discussions about mouse behavioral testing; Dr. Stephanie White for the UsV equipment and software; Dr. Carolyn Houser for assistance with GAD1 IHC; and Dr. William Yang and Dr. Istvan Mody for helpful discussions. We would also like to thank Jamee Bomar and Dr. Asami Oguro-Ando for help with floating IHC and helpful discussions; Greg Osborn for help with UsV analysis; Clark Rosensweig for help with mouse genotyping; Dr. Irina Voineagu and Lauren Kawaguchi for critically reading the manuscript; and Dr. Eric Wexler for useful discussions on drug treatment. This work was supported by grants NIH/NIMH R01 MH081754-02R to D.H.G.; NIH ACE Center 1P50-HD055784-01 to D.H.G. (Project II) and Network grant 5R01-MH081754-04 to D.H.G.; NIH/NS50220 to E.P.; and Dr. Miriam and Sheldon G. Adelson Medical Research Foundation to D.H.G. and E.P. E.P. is the Incumbent of the Hanna Hertz Professorial Chair for Multiple Sclerosis and Neuroscience.
Received: April 6, 2011
Revised: June 28, 2011
Accepted: August 26, 2011
Published: September 29, 2011
REFERENCES
Abrahams, B.S., Tentler, D., Perederiy, J.V., Oldham, M.C., Coppola, G., and Geschwind, D.H. (2007). Genome-wide analyses of human perisylvian cerebral cortical patterning. Proc. Natl. Acad. Sci. USA 104, 17849–17854.
Alarcón, M., Abrahams, B.S., Stone, J.L., Duvall, J.A., Perederiy, J.V., Bomar, J.M., Sebat, J., Wigler, M., Martin, C.L., Ledbetter, D.H., et al. (2008). Linkage, association, and gene-expression analyses identify CNTNAP2 as an autism-susceptibility gene. Am. J. Hum. Genet. 82, 150–159.
Albin, R.L., Young, A.B., and Penney, J.B. (1989). The functional anatomy of basal ganglia disorders. Trends Neurosci. 12, 366–375.
Angevine, J.B., Jr., and Sidman, R.L. (1961). Autoradiographic study of cell migration during histogenesis of cerebral cortex in the mouse. Nature 192, 766–768.
APA (American Psychiatric Association). (2000). Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (Washington, DC: American Psychiatric Publishing).
Arking, D.E., Cutler, D.J., Brune, C.W., Teslovich, T.M., West, K., Ikeda, M., Rea, A., Guy, M., Lin, S., Cook, E.H., and Chakravarti, A. (2008). A common genetic variant in the neurexin superfamily member CNTNAP2 increases familial risk of autism. Am. J. Hum. Genet. 82, 160–164.
Bakkaloglu, B., O’Roak, B.J., Louvi, A., Gupta, A.R., Abelson, J.F., Morgan, T.M., Chawarska, K., Klin, A., Ercan-Sencicek, A.G., Stillman, A.A., et al. (2008). Molecular cytogenetic analysis and resequencing of contactin associated protein-like 2 in autism spectrum disorders. Am. J. Hum. Genet. 82, 165–173.
Belmonte, M.K., Allen, G., Beckel-Mitchener, A., Boulanger, L.M., Carper, R.A., and Webb, S.J. (2004). Autism and abnormal development of brain connectivity. J. Neurosci. 24, 9228–9231.
Chadman, K.K., Yang, M., and Crawley, J.N. (2009). Criteria for validating mouse models of psychiatric diseases. Am. J. Med. Genet. B. Neuropsychiatr. Genet. 150B, 1–11.
Cherkassky, V.L., Kana, R.K., Keller, T.A., and Just, M.A. (2006). Functional connectivity in a baseline resting-state network in autism. Neuroreport 17, 1687–1690.
Costa, R.M., Federov, N.B., Kogan, J.H., Murphy, G.G., Stern, J., Ohno, M., Kucherlapati, R., Jacks, T., and Silva, A.J. (2002). Mechanism for the learning deficits in a mouse model of neurofibromatosis type 1. Nature 415, 526–530.
Crawley, J.N. (2007). Mouse behavioral assays relevant to the symptoms of autism. Brain Pathol. 17, 448–459.
Damarla, S.R., Keller, T.A., Kana, R.K., Cherkassky, V.L., Williams, D.L., Minshew, N.J., and Just, M.A. (2010). Cortical underconnectivity coupled with preserved visuospatial cognition in autism: Evidence from an fMRI study of an embedded figures task. Autism Res. 3, 273–279.
Deacon, R.M. (2006). Assessing nest building in mice. Nat. Protoc. 1, 1117–1119.
Denaxa, M., Chan, C.H., Schachner, M., Parnavelas, J.G., and Karagogeos, D. (2001). The adhesion molecule TAG-1 mediates the migration of cortical interneurons from the ganglionic eminence along the corticofugal fiber system. Development 128, 4635–4644.
Denaxa, M., Kyriakopoulou, K., Theodorakis, K., Trichas, G., Vidaki, M., Takeda, Y., Watanabe, K., and Karagogeos, D. (2005). The adhesion molecule TAG-1 is required for proper migration of the superficial migratory stream in the medulla but not of cortical interneurons. Dev. Biol. 288, 87–99.
Dölen, G., Osterweil, E., Rao, B.S., Smith, G.B., Auerbach, B.D., Chattarji, S., and Bear, M.F. (2007). Correction of fragile X syndrome in mice. Neuron 56, 955–962.
Ehninger, D., Han, S., Shilyansky, C., Zhou, Y., Li, W., Kwiatkowski, D.J., Ramesh, V., and Silva, A.J. (2008). Reversal of learning deficits in a Tsc2+/− mouse model of tuberous sclerosis. Nat. Med. 14, 843–848.
Fernandez, F., Morishita, W., Zuniga, E., Nguyen, J., Blank, M., Malenka, R.C., and Garner, C.C. (2007). Pharmacotherapy for cognitive impairment in a mouse model of Down syndrome. Nat. Neurosci. 10, 411–413.
Gerfen, C.R., Engber, T.M., Mahan, L.C., Susel, Z., Chase, T.N., Monsma, F.J., Jr., and Sibley, D.R. (1990). D1 and D2 dopamine receptor-regulated gene expression of striatonigral and striatopallidal neurons. Science 250, 1429–1432.
Gerlai, R., Pisacane, P., and Erickson, S. (2000). Heregulin, but not ErbB2 or ErbB3, heterozygous mutant mice exhibit hyperactivity in multiple behavioral tasks. Behav. Brain Res. 109, 219–227.
Geschwind, D.H. (2009). Advances in autism. Annu. Rev. Med. 60, 367–380.
Geschwind, D.H., and Levitt, P. (2007). Autism spectrum disorders: developmental disconnection syndromes. Curr. Opin. Neurobiol. 17, 103–111.
Gittis, A.H., Nelson, A.B., Thwin, M.T., Palop, J.J., and Kreitzer, A.C. (2010). Distinct roles of GABAergic interneurons in the regulation of striatal output pathways. J. Neurosci. 30, 2223–2234.
Glessner, J.T., Wang, K., Cai, G., Korvatska, O., Kim, C.E., Wood, S., Zhang, H., Estes, A., Brune, C.W., Bradfield, J.P., et al. (2009). Autism genome-wide copy number variation reveals ubiquitin and neuronal genes. Nature 459, 569–573.
Guy, J., Gan, J., Selfridge, J., Cobb, S., and Bird, A. (2007). Reversal of neurological defects in a mouse model of Rett syndrome. Science 315, 1143–1147.
Horresh, I., Poliak, S., Grant, S., Bredt, D., Rasband, M.N., and Peles, E. (2008). Multiple molecular interactions determine the clustering of Caspr2 and Kv1 channels in myelinated axons. J. Neurosci. 28, 14213–14222.
Inda, M.C., DeFelipe, J., and Muñoz, A. (2006). Voltage-gated ion channels in the axon initial segment of human cortical pyramidal cells and their relationship with chandelier cells. Proc. Natl. Acad. Sci. USA 103, 2920–2925.
Just, M.A., Cherkassky, V.L., Keller, T.A., and Minshew, N.J. (2004). Cortical activation and synchronization during sentence comprehension in high-functioning autism: evidence of underconnectivity. Brain 127, 1811–1821.
Kreitzer, A.C., and Malenka, R.C. (2008). Striatal plasticity and basal ganglia circuit function. Neuron 60, 543–554.
Kwon, C.H., Luikart, B.W., Powell, C.M., Zhou, J., Matheny, S.A., Zhang, W., Li, Y., Baker, S.J., and Parada, L.F. (2006). Pten regulates neuronal arborization and social interaction in mice. Neuron 50, 377–388.
Levitt, P., Eagleson, K.L., and Powell, E.M. (2004). Regulation of neocortical interneuron development and the implications for neurodevelopmental disorders. Trends Neurosci. 27, 400–406.
Lewis, M.H., Tanimura, Y., Lee, L.W., and Bodfish, J.W. (2007). Animal models of restricted repetitive behavior in autism. Behav. Brain Res. 176, 66–74.
Marín, O., Yaron, A., Bagri, A., Tessier-Lavigne, M., and Rubenstein, J.L. (2001). Sorting of striatal and cortical interneurons regulated by semaphorin-neuropilin interactions. Science 293, 872–875.
McDougle, C.J., Scahill, L., McCracken, J.T., Aman, M.G., Tierney, E., Arnold, L.E., Freeman, B.J., Martin, A., McGough, J.J., Cronin, P., et al. (2000). Research Units on Pediatric Psychopharmacology (RUPP) Autism Network. Background and rationale for an initial controlled study of risperidone. Child Adolesc. Psychiatr. Clin. N. Am. 9, 201–224.
McDougle, C.J., Stigler, K.A., Erickson, C.A., and Posey, D.J. (2008). Atypical antipsychotics in children and adolescents with autistic and other pervasive developmental disorders. J. Clin. Psychiatry 69 (Suppl 4), 15–20.
Molyneaux, B.J., Arlotta, P., Menezes, J.R., and Macklis, J.D. (2007). Neuronal subtype specification in the cerebral cortex. Nat. Rev. Neurosci. 8, 427–437.
Morante-Oria, J., Carleton, A., Ortino, B., Kremer, E.J., Fairén, A., and Lledo, P.M. (2003). Subpallial origin of a population of projecting pioneer neurons during corticogenesis. Proc. Natl. Acad. Sci. USA 100, 12468–12473.
Nakatani, J., Tamada, K., Hatanaka, F., Ise, S., Ohta, H., Inoue, K., Tomonaga, S., Watanabe, Y., Chung, Y.J., Banerjee, R., et al. (2009). Abnormal behavior in a chromosome-engineered mouse model for human 15q11-13 duplication seen in autism. Cell 137, 1235–1246.
Nestler, E.J., and Hyman, S.E. (2010). Animal models of neuropsychiatric disorders. Nat. Neurosci. 13, 1161–1169.
Oldham, M.C., Konopka, G., Iwamoto, K., Langfelder, P., Kato, T., Horvath, S., and Geschwind, D.H. (2008). Functional organization of the transcriptome in human brain. Nat. Neurosci. 11, 1271–1282.
Poliak, S., Gollan, L., Martinez, R., Custer, A., Einheber, S., Salzer, J.L., Trimmer, J.S., Shrager, P., and Peles, E. (1999). Caspr2, a new member of the neurexin superfamily, is localized at the juxtaparanodes of myelinated axons and associates with K+ channels. Neuron 24, 1037–1047.
Poliak, S., Salomon, D., Elhanany, H., Sabanay, H., Kiernan, B., Pevny, L., Stewart, C.L., Xu, X., Chiu, S.Y., Shrager, P., et al. (2003). Juxtaparanodal clustering of Shaker-like K+ channels in myelinated axons depends on Caspr2 and TAG-1. J. Cell Biol. 162, 1149–1160.
Racine, R.J. (1972). Modification of seizure activity by electrical stimulation. II. Motor seizure. Electroencephalogr. Clin. Neurophysiol. 32, 281–294.
Scott-Van Zeeland, A.A., Abrahams, B.S., Alvarez-Retuerto, A.I., Sonnenblick, L.I., Rudie, J.D., Ghahremani, D., Mumford, J.A., Poldrack, R.A., Dapretto, M., Geschwind, D.H., and Bookheimer, S.Y. (2010). Altered functional connectivity in frontal lobe circuits is associated with variation in the autism risk gene CNTNAP2. Sci. Transl. Med. 2, ra80.
Scattoni, M.L., Crawley, J., and Ricceri, L. (2009). Ultrasonic vocalizations: a tool for behavioural phenotyping of mouse models of neurodevelopmental disorders. Neurosci. Biobehav. Rev. 33, 508–515.
Sebat, J., Lakshmi, B., Malhotra, D., Troge, J., Lese-Martin, C., Walsh, T., Yamrom, B., Yoon, S., Krasnitz, A., Kendall, J., et al. (2007). Strong association of de novo copy number mutations with autism. Science 316, 445–449.
Silverman, J.L., Yang, M., Lord, C., and Crawley, J.N. (2010). Behavioural phenotyping assays for mouse models of autism. Nat. Rev. Neurosci. 11, 490–502.
Sohal, V.S., Zhang, F., Yizhar, O., and Deisseroth, K. (2009). Parvalbumin neurons and gamma rhythms enhance cortical circuit performance. Nature 459, 698–702.
Strauss, K.A., Puffenberger, E.G., Huentelman, M.J., Gottlieb, S., Dobrin, S.E., Parod, J.M., Stephan, D.A., and Morton, D.H. (2006). Recessive symptomatic focal epilepsy and mutant contactin-associated protein-like 2. N. Engl. J. Med. 354, 1370–1377.
Sugino, K., Hempel, C.M., Miller, M.N., Hattox, A.M., Shapiro, P., Wu, C., Huang, Z.J., and Nelson, S.B. (2006). Molecular taxonomy of major neuronal classes in the adult mouse forebrain. Nat. Neurosci. 9, 99–107.
Szczypka, M.S., Kwok, K., Brot, M.D., Marck, B.T., Matsumoto, A.M., Donahue, B.A., and Palmiter, R.D. (2001). Dopamine production in the caudate putamen restores feeding in dopamine-deficient mice. Neuron 30, 819–828.
Uhlhaas, P.J., and Singer, W. (2006). Neural synchrony in brain disorders: relevance for cognitive dysfunctions and pathophysiology. Neuron 52, 155–168.
Vernes, S.C., Newbury, D.F., Abrahams, B.S., Winchester, L., Nicod, J., Groszer, M., Alarcón, M., Oliver, P.L., Davies, K.E., Geschwind, D.H., et al. (2008). A functional genetic link between distinct developmental language disorders. N. Engl. J. Med. 359, 2337–2345.
Villalobos, M.E., Mizuno, A., Dahl, B.C., Kemmotsu, N., and Müller, R.A. (2005). Reduced functional connectivity between V1 and inferior frontal cortex associated with visuomotor performance in autism. Neuroimage 25, 916–925.
Vitali, R., and Clarke, S. (2004). Improved rotorod performance and hyperactivity in mice deficient in a protein repair methyltransferase. Behav. Brain Res. 153, 129–141.
Vorhees, C.V., and Williams, M.T. (2006). Morris water maze: procedures for assessing spatial and related forms of learning and memory. Nat. Protoc. 1, 848–858.
Wang, X.J. (2010). Neurophysiological and computational principles of cortical rhythms in cognition. Physiol. Rev. 90, 1195–1268.
Weiss, L.A., Arking, D.E., Gene Discovery Project of Johns Hopkins & the Autism Consortium, Daly, M.J., and Chakravarti, A. (2009). A genome-wide linkage and association scan reveals novel loci for autism. Nature 461, 802–808.
Winden, K.D., Oldham, M.C., Mirnics, K., Ebert, P.J., Swan, C.H., Levitt, P., Rubenstein, J.L., Horvath, S., and Geschwind, D.H. (2009). The organization of the transcriptional network in specific neuronal classes. Mol. Syst. Biol. 5, 291–308.
Wonders, C.P., and Anderson, S.A. (2006). The origin and specification of cortical interneurons. Nat. Rev. Neurosci. 7, 687–696.
Womelsdorf, T., Schoffelen, J.M., Oostenveld, R., Singer, W., Desimone, R., Engel, A.K., and Fries, P. (2007). Modulation of neuronal interactions through neuronal synchronization. Science 316, 1609–1612.
Zhang, B., and Horvath, S. (2005). A general framework for weighted gene coexpression network analysis. Stat. Appl. Genet. Mol. Biol. 4, 17.
Zhou, M., Rebholz, H., Brocia, C., Warner-Schmidt, J.L., Fienberg, A.A., Nairn, A.C., Greengard, P., and Flajolet, M. (2010). Forebrain overexpression of CK1δ leads to down-regulation of dopamine receptors and altered locomotor activity reminiscent of ADHD. Proc. Natl. Acad. Sci. USA 107, 4401–4406.
Correction
AKT/FOXO Signaling Enforces Reversible Differentiation Blockade in Myeloid Leukemias
Stephen M. Sykes, Steven W. Lane, Lars Bullinger, Demetrios Kalaitzidis, Rushdia Yusuf, Borja Saez, Francesca Ferraro, Francois Mercier, Harshabad Singh, Kristina M. Brumme, Sanket S. Acharya, Claudia Scholl, Zuzana Tothova, Eyal C. Attar, Stefan Fröhling, Ronald A. DePinho, D. Gary Gilliland, Scott A. Armstrong, and David T. Scadden*
*Correspondence:
[email protected] DOI 10.1016/j.cell.2011.09.015
(Cell 146, 697–708; September 2, 2011) In the author list of the article above, the last name of Claudia Scholl was inadvertently spelled as “Schöll”. The correct spelling now appears with the article online.
Cell 147, 247, September 30, 2011 ©2011 Elsevier Inc. 247
Reviews Editor, Cell Stem Cell We are seeking to appoint a new Reviews Editor for Cell Stem Cell, to be based in the Cell Press offices in Cambridge, MA. As the Cell Stem Cell Reviews Editor, your primary responsibility will be the development and management of the review section of the journal. In addition, you will have an opportunity to develop the journal’s online presence and to handle a subset of submitted research manuscripts. You will be acquiring and developing the very best editorial content, making use of a network of contacts in academia plus information gathered at international conferences, to help ensure that Cell Stem Cell maintains its leading position within the stem cell research community. This is an exciting and challenging role that provides an opportunity to stay close to the cutting edge of scientific advances while developing a new career away from the bench. The minimum qualification is a doctoral degree in a relevant discipline, and postdoctoral training is an advantage. Previous editorial experience is beneficial but not required—we will make sure you receive the training you need. The successful candidate will be highly motivated and creative and able to work in a team as well as independently. Good interpersonal skills are essential because this role involves networking in the wider scientific community and collaboration with other parts of the business.
To apply Please submit a CV and cover letter describing your qualifications, research interests, current salary, and reasons for pursuing a career in publishing at http://reedelsevier.taleo.net/careersection/51/jobdetail.ftl?lang=en&job=SCI000DE. No phone inquiries, please. Cell Press is an equal opportunity employer. Applications will be considered on an ongoing basis until the closing date of Friday, September 30th.
Scientific Editor, The American Journal of Human Genetics We are seeking to hire the next Scientific Editor for The American Journal of Human Genetics (AJHG). AJHG, a leading international genetics journal, is the monthly publication of The American Society of Human Genetics and is published by Cell Press. As Scientific Editor of the journal, you will be working closely with other members of the AJHG editorial team while overseeing the peer-review process and managing the editorial direction of the journal. This is an exciting and challenging role, and you will need a PhD in a relevant discipline. Excellent written and oral communication skills are essential, because the role involves constant interaction with authors and reviewers as well as with members of the editorial and publishing business. Previous editorial experience is helpful. This is an ideal opportunity to stay close to the cutting edge of scientific developments in the field while developing a career in an exciting publishing environment. Frequent contacts with scientists as authors and reviewers, and as colleagues at meetings, will contribute to building a strong network within the scientific community. The successful candidate will have the opportunity to work out of the AJHG editorial offices in the Department of Molecular and Human Genetics at Baylor College of Medicine in Houston, TX. AJHG offers an attractive salary and benefits package, as well as a stimulating working environment.
To apply Please submit a CV and cover letter describing your qualifications, research interests, and reasons for pursuing a career in scientific publishing to
[email protected]. No phone inquiries, please. The American Journal of Human Genetics and Baylor College of Medicine are equal opportunity/affirmative action employers, M/F/D/V. Applications will be considered on an ongoing basis until the position is filled.
Editor, Trends in Biochemical Sciences We are seeking to appoint a new Editor for Trends in Biochemical Sciences, to be based in the Cell Press offices in Cambridge, MA. As Editor of Trends in Biochemical Sciences, you will be responsible for the strategic development and content management of the journal. You will be acquiring and developing the very best editorial content, making use of a network of contacts in academia plus information gathered at international conferences, to ensure that Trends in Biochemical Sciences maintains its market-leading position. This is an exciting and challenging role that provides an opportunity to stay close to the cutting edge of scientific advances while developing a new career away from the bench. You will work in a highly dynamic and collaborative publishing environment that includes 14 Trends titles and 14 Cell Press titles. You will also collaborate with your Cell Press colleagues to maximize quality and efficiency of content commissioning and participate in exciting new non-journal-based initiatives. The minimum qualification is a PhD in a relevant discipline, and postdoctoral training is an advantage. Previous publishing experience is not necessary—we will make sure you get the training and development you need. Good interpersonal skills are essential because the role involves networking in the wider scientific community and collaboration with other parts of the business.
To apply Please submit a CV and cover letter describing your qualifications, research interests, current salary, and reasons for pursuing a career in publishing at http://reedelsevier.taleo.net/careersection/51/jobdetail.ftl?lang=en&job=SCI000DP. No phone inquiries, please. Cell Press is an equal opportunity employer. Applications will be considered on an ongoing basis until the closing date of Friday, October 21st.
EDITOR-IN-CHIEF
F.E. Bloom La Jolla, CA, USA
SENIOR EDITORS
J.F. Baker Chicago, IL, USA; P.R. Hof New York, NY, USA; G.R. Mangun Davis, CA, USA; J.I. Morgan Memphis, TN, USA; F.R. Sharp Sacramento, CA, USA; R.J. Smeyne Memphis, TN, USA; A.F. Sved Pittsburgh, PA, USA
ASSOCIATE EDITORS
G. Aston-Jones Charleston, SC, USA; J.S. Baizer Buffalo, NY, USA; J.D. Cohen Princeton, NJ, USA; B.M. Davis Pittsburgh, PA, USA; J. De Felipe Madrid, Spain; M.A. Dyer Memphis, TN, USA; M.S. Gold Pittsburgh, PA, USA; G.F. Koob La Jolla, CA, USA; T.A. Milner New York, NY, USA; S.D. Moore Durham, NC, USA; T.H. Moran Baltimore, MD, USA; T.F. Münte Magdeburg, Germany; K-C. Sonntag Belmont, MA, USA; R.J. Valentino Philadelphia, PA, USA; C.L. Williams Durham, NC, USA
Twenty-three to the Power of One.
• One re-unified journal, nine specialist sections, 23 receiving Editors
• Authors receive first editorial decision within 30 days of submission
• “Young Investigator Awards” for innovative work by a new generation of researchers
Brain Research take another look www.elsevier.com/locate/brainres
Announcements
Positions Available
Early Independent Scientists in the NIH Intramural Research Program
The National Institutes of Health, the nation’s premier agency for biomedical and behavioral research, is recruiting for Early Independent Scientists in the NIH Intramural Research Program (IRP). We are looking for new Ph.D., M.D., D.D.S. and equivalent doctoral researchers who have the creativity, intellect and maturity to flourish in an independent research position. Applicants may not already be affiliated with the IRP.
The IRP is home to approximately 1,200 tenured and tenure-track investigators and 5,000 trainees. We provide an environment that encourages and supports innovative, high-impact research. To enhance the development and early-stage careers of exceptional investigators, the NIH has developed this program to support recent doctoral graduates in independent positions without the need to perform a post-doctoral career fellowship. Thus, the graduate can immediately start an independent career after graduation.
Early Independent Scientists will be provided the resources to establish an independent research program, including salary and benefits, support for lab personnel, lab space, supplies, and start-up equipment. At the time of application, candidates must be within 12 months of completing their Ph.D., M.D. or D.D.S. degree, or for clinician-scientists within twelve months of completing their core clinical residency program. The NIH will support up to five Early Independent Scientists per year, with each scientist receiving up to five years of support. Successful candidates also will be eligible to apply for funds from the NIH Common Fund Early Independent Investigator program.
Complete applications must be received by November 18, 2011. Candidates should submit electronically a cover letter, curriculum vitae, and a 3-page statement of research interests and future plans, and arrange to have 3 to 5 letters of reference sent to: Charles Dearolf, Ph.D.; Assistant Director for Intramural Research; National Institutes of Health;
[email protected].
For more information about the IRP, visit http://irp.nih.gov
The NIH recognizes a unique and compelling need to promote diversity in the biomedical, behavioral, clinical and social sciences research workforce. The NIH expects its efforts to diversify the workforce to lead to the recruitment of the most talented researchers from all groups. We encourage applications from talented researchers from diverse backgrounds underrepresented in biomedical research, including underrepresented racial and ethnic groups, persons with disabilities, and women for participation in all NIH-funded research opportunities.
DIRECTOR, UNIVERSITY OF PITTSBURGH INSTITUTE FOR PERSONALIZED MEDICINE The University of Pittsburgh School of Medicine and UPMC (University of Pittsburgh Medical Center) invite applications and nominations for the position of director of the newly created Institute for Personalized Medicine. The University of Pittsburgh has emerged as a national research leader, ranked among the top 10 educational and research institutions in NIH funding. Within the School of Medicine, areas of research strength include structural biology; cell and molecular biology; drug discovery and design; vaccine development; organ transplantation/immunology; stem cell biology and tissue engineering; medical device development; cancer research and therapy; cardiology and cardiovascular biology; bioinformatics and computational/systems biology; psychiatry, neurobiology, systems neuroscience, and neurological surgery; reproductive and developmental biology; comparative effectiveness research; and clinical research/clinical trials. UPMC is an integrated global health enterprise and one of the nation’s leading academic health care systems, with $9 billion in revenues for fiscal year 2011. UPMC has more than 50,000 employees; nearly 5,000 affiliated physicians; more than 20 hospitals, as well as specialized outpatient facilities, cancer centers, imaging services, and international sites; and a health insurance plan. The mission of the University of Pittsburgh Institute for Personalized Medicine is to apply new knowledge in genetics, genomics, and other disciplines to advance evidence-based medicine, leading to improved models of disease prevention and treatment in individuals. As evidence-based practice decreases variation in population-based care for specific diseases, appropriate treatment variation then needs to be reintroduced on the basis of specific aberrant molecular pathways and manifestations of an individual’s genes and environment in a temporal context. 
Our goal is to become a major leader in personalized medicine research and to apply this knowledge to yield improved outcomes and decreased costs. The University of Pittsburgh and UPMC hope to attract an international leader in the emerging and rapidly advancing field of personalized medicine. The ideal candidate will have a recognized reputation for innovative scholarship, a distinguished track record of research support and publications, and leadership experience. Candidates should be committed to creative, team-based translational research, the exemplary education of students and trainees, and continued faculty development. Candidates should qualify for the academic rank of professor with tenure. The University of Pittsburgh and UPMC have pledged significant resources to this important new initiative. The director of the Institute for Personalized Medicine will report to the University’s senior vice chancellor for the health sciences and the president of UPMC. Interested and qualified applicants should submit their CV and a letter briefly describing the rationale for their interest to: Search Committee, Institute for Personalized Medicine c/o Office of Academic Affairs University of Pittsburgh School of Medicine Suite 401 Scaife Hall 3550 Terrace Street Pittsburgh, PA 15261 e-mail:
[email protected]; telephone: 412-383-7474 The University of Pittsburgh is an Affirmative Action/Equal Opportunity Employer.
Faculty Positions in Gene Regulation and Genomics
The Cecil H. and Ida Green Center for Reproductive Biology Sciences and the Division of Basic Reproductive Biology Research in the Department of Obstetrics and Gynecology at the University of Texas Southwestern Medical Center in Dallas invite applications from outstanding candidates for three tenure-track assistant or associate professor positions in signaling, gene regulation, and genome function, especially in the areas of chromatin and transcription, epigenetics, nuclear endpoints of cellular signaling pathways, nuclear receptors, RNA biology, genome organization and evolution, and DNA replication and repair. We are interested in a wide variety of model systems and experimental approaches, including biochemistry, molecular biology, structural biology, animal models, genomics, proteomics, bioinformatics, and computational biology. The Green Center’s research programs focus on, but are not limited to, female reproductive biology in a broad sense, including oocyte maturation, fertilization, development, pregnancy, parturition, stem cells, endocrinology, and oncology, as well as relevant aspects of metabolism, inflammation, immunity, and neurobiology.
• Position 1: Signaling, chromatin, and gene regulation – a broad search for candidates using a wide array of experimental approaches to address fundamental questions in nuclear signaling, chromatin, transcription, epigenetics, and RNA biology.
• Position 2: Genomic, bioinformatic, computational, and evolutionary approaches to understanding gene regulation - a more focused search in areas that will connect to broader genomic initiatives on campus.
• Position 3: Molecular biology of female reproductive systems - a search for candidates using cell-based or physiological models in combination with molecular or genomic approaches to address fundamental questions concerning female reproductive biology. The Green Center is an endowed basic science research center at UT Southwestern, which promotes and supports cutting-edge, integrative, and collaborative basic research in female reproduction and related areas of biology, as well as strong connections between basic and clinical research. This recruitment is part of a major university- and department-supported renovation and rejuvenation of the Green Center (upwards of 12 million dollars when completed and fully staffed). Successful candidates, who will be housed in a newly renovated state-of-the-art research facility and provided a generous start-up package, are expected to establish scientifically rigorous and externally funded research programs and participate in center, department, and university teaching and training programs. To learn more about the Green Center, visit: http://www.utsouthwestern.edu/utsw/home/research/greencenter/ Candidates must have a Ph.D. or M.D. or equivalent in a relevant field of study, postdoctoral or comparable experience, and a demonstrated record of research excellence. Applicants should send a letter of application, curriculum vitae, and a statement of planned research projects as pdf files to
[email protected], indicating the position of interest (1, 2, or 3) in the subject line. Applicants should also arrange for three letters of reference to be sent directly to the above e-mail address. Review of applications will begin on October 1, 2011 and continue during the 2011 – 2012 academic year or until the positions are filled, although applicants are encouraged to submit their materials as soon as possible. UT Southwestern is an Equal Opportunity/Affirmative Action Employer.
Department of Health and Human Services (DHHS) National Institutes of Health (NIH) National Institute of Arthritis and Musculoskeletal and Skin Diseases Intramural Research Program (NIAMS) NIH-Center for Regenerative Medicine (NIH CRM)
Staff Scientist Position
The NIH CRM conducts and supports a program of research to better understand the properties of stem cells and how best to utilize them for screening and therapy. NIH CRM has an opening for a Staff Scientist with expertise in stem cell biology. The successful applicant will have primary responsibility for managing the iPSC generation and screening activities of the laboratory. This will include tissue sourcing from patients at the NIH clinical facilities, developing protocols for integration-free iPSC generation, and developing protocols for differentiation and selection of differentiated progeny. An initial focus will be on neural derivatives such as oligodendrocytes and large projection neurons, and it is expected that the applicant will have some tissue culture experience related to primary cell culture of these cell types. The applicant is also expected to have some molecular biology expertise as related to cloning and generation of constructs and techniques used to insert genes into specific loci in culture. Additionally, there will be an opportunity for the successful applicant to design and implement projects either independently or in collaboration with other laboratories interested in stem cell biology. The position requires a Ph.D., at least three years of postdoctoral experience, a strong record of productive research supported by publications, and proven ability to work in a group setting, preferably with supervisory experience. Salary is commensurate with experience and accomplishment. The current salary range for Staff Scientists is $80,354 - $173,826. This position is located on the NIH campus in Bethesda, Maryland, a suburb of Washington, D.C., and is a term-limited/renewable appointment under Title 42. The NIH offers tremendous depth and breadth of intellectual and technological resources. Benefits include retirement, health insurance, life insurance, a thrift savings plan, and a flexible spending account. To apply, send curriculum vitae, bibliography, three letters of recommendation, and a short statement about your skill set, including how you see applying your skill set to the projects proposed in this advertisement, by November 1, 2011 to: Wanda White, Administrative Officer, NIAMS/NIH CRM, 9000 Rockville Pike, Building 31 Room 4C12, Bethesda, Maryland 20892, Email to
[email protected] Don’t forget to mention Cell jobs when applying. Please note that applications will be accepted until the position has been filled. DHHS and NIH are Equal Opportunity Employers
The Department of Pharmacology seeks highly qualified applicants for two tenure-track faculty positions at the Assistant, Associate, or full Professor level in the area of Neuroscience. Preferred areas of research include neurophysiology, cellular & molecular neuroscience, integrative neurobiology, pain and neurodegenerative disorders involving genetic model systems. The department has new leadership and is undergoing expansion of its faculty (http://www.medicine.uiowa.edu/pharmacology/). Pharmacology faculty direct an NIH-funded training program for pre-doctoral students in Pharmacological Sciences, and participate in interdisciplinary training programs in Neuroscience, Molecular & Cellular Biology, Pain Research, Genetics, and Medical Scientist Training Programs. Newly remodeled research space with exceptional shared instrumentation and core facilities is available. The positions include a 12-month salary, benefits and a competitive start-up package. All applicants must have a relevant doctoral degree, at least two years of postdoctoral training, and a commitment to biomedical education, and must demonstrate a high level of productivity and strong communication and interpersonal skills. Candidates will be judged on their potential to initiate and maintain a strong, independent research program, their desire to train students and postdoctoral fellows, and their willingness to participate in departmental teaching. To apply, visit The University of Iowa website at http://jobs.uiowa.edu, requisition #59967. Include a CV and a letter of interest that provides a summary of research accomplishments and planned research program. Applicants for Associate Professor and Professor should provide names of three referees. Applicants for Assistant Professor should ask three referees to directly submit letters on their behalf to Search Committee #59967 at pharmacology-search@uiowa.edu. Review of applicants will begin immediately and will continue until the positions are filled.
Anticipated start date is July 1, 2012. Questions may be directed to Curt D. Sigmund, Head, Department of Pharmacology, phone 319-335-7946,
[email protected]. The University of Iowa is an equal opportunity and affirmative action employer.
JUNIOR AND ESTABLISHED INVESTIGATOR FACULTY POSITIONS IN GENETICS
The Department of Genetics at Dartmouth Medical School seeks outstanding individuals with vigorous and innovative research programs to study fundamental questions in genetics. Particular areas of interest include cancer biology and neuroscience in humans or model organisms (mouse, fish, flies, worms, fungi, etc.). Applicants must have a PhD and/or MD degree and a strong publication record in the field. Evidence of extramural funding is desirable. The position carries a tenure-track or tenured faculty appointment at the rank of Assistant, Associate or Full Professor, commensurate with experience. These positions are part of a major campus-wide initiative to expand faculty in developmental biology, genomics/bioinformatics, molecular therapeutics and cancer biology. The Department of Genetics provides a rich multi-disciplinary collaborative environment and is closely affiliated with the Norris Cotton Comprehensive Cancer Center. Faculty members have access to two umbrella PhD programs (http://www.dartmouth.edu/~mcb/), (http://dms.dartmouth.edu/pemm/). The Dartmouth College campus is situated in the Upper Valley area of New Hampshire/Vermont, a vibrant community offering excellent schools and outstanding quality-of-life opportunities in a beautiful rural environment. Applicants should submit electronically a PDF file of the curriculum vitae, a 1-3 page description of research accomplishments and future research plans, and arrange to have 3 letters of reference sent directly to the search committee. Review of applications will commence Nov. 15th and proceed until positions are filled. Please email materials to:
[email protected] Search Committee Department of Genetics, HB7400 Hanover, NH 03755-7400, USA. http://www.dartmouth.edu/dms/genetics Dartmouth Medical School is an Equal Opportunity Employer. We will extend equal opportunity to all individuals without regard for gender, race, religion, color, national origin, sexual orientation, age, disability, handicap or veteran status.
Faculty Position in Cardiovascular/Respiratory/Autonomic Disorders
Department of Pharmacology and Physiology School of Medicine and Health Sciences The George Washington University
The Department of Pharmacology and Physiology is accepting applications for a tenure-eligible faculty member at the rank of Assistant, Associate or Full Professor with expertise in the genetic, cellular or molecular characterization of autonomic, cardiovascular and/or respiratory disorders. This individual will participate in medical and graduate education in the Department of Pharmacology and Physiology as well as the Institute for Biomedical Sciences. Basic Qualifications: Applicants must have a terminal degree (Ph.D. or M.D.) in an appropriate discipline and substantial accomplishments in biomedical research as demonstrated by a significant number of first and/or senior author publications in outstanding peer-reviewed journals as well as promise or success in obtaining external research support. Preferred Qualifications: Preference will be given to candidates with a growing research program focused on altered cardiorespiratory function with clinical relevance to cardiorespiratory diseases such as obstructive sleep apnea, hypertension, sudden infant death syndrome, arrhythmias and myocardial infarction. Preference will be given to candidates using a combination of genetic, optical imaging, electrophysiological and/or behavioral techniques. The successful candidate will participate in collaborative research activities including development of multi-investigator projects for extramural funding. Salary and start-up funds will be commensurate with experience. To be considered, please send a complete curriculum vitae plus names and contact information for 3 references electronically to David Mendelowitz, Ph.D., Professor and Vice-Chair, Department of Pharmacology and Physiology, at
[email protected]. If possible, please send this information in the PDF format. Review of applications will begin on October 15th, and will continue until the position is filled. Only complete applications will be considered. GW is an equal opportunity/affirmative action employer.
STANFORD UNIVERSITY DEPARTMENT OF CHEMICAL AND SYSTEMS BIOLOGY The Department of Chemical and Systems Biology at Stanford University School of Medicine invites applications for a tenure-track position at the ASSISTANT PROFESSOR level. We are particularly interested in candidates with a strong interdisciplinary record in the broad areas of chemical biology, systems biology, and/or cellular and molecular biology in normal and disease states. Stanford offers an outstanding environment for creative interdisciplinary biomedical research. The main criterion for appointment in the University Tenure Line is a major commitment to research and teaching. Candidates should have a Ph.D. and/or M.D. degree and postdoctoral research experience. Candidates should send their curriculum vitae, a description of future research plans and the names and addresses of three potential referees to the address below by November 1, 2011. Late applications will be considered. Jean Kavanagh, FAA Department of Chemical and Systems Biology 269 Campus Drive, CCSR Bldg, Room 4145A Stanford University School of Medicine Stanford CA 94305-5174 Stanford University is an equal opportunity employer and is committed to increasing the diversity of its faculty. It welcomes nominations of and applications from women and minority groups, as well as others who would bring additional dimensions to the university’s research, teaching, and clinical missions.
Children’s Hospital Boston Harvard Medical School
Faculty position in cardiovascular research The Department of Cardiology of Children’s Hospital Boston is conducting a search for a tenure-track scientist or clinician-scientist at the Assistant or Associate Professor level with a focus on cardiovascular research. Human genetics and animal/stem cell models of heart disease are of special interest, but all areas of cardiovascular research will be considered. Children’s Hospital Boston adjoins the Harvard Medical School complex and offers an unparalleled research environment. Interested applicants should submit their curriculum vitae, list of references, and brief statement of research accomplishments, interests, and future directions. Application materials should be submitted electronically to William T. Pu, MD, Chair, Search Committee, c/o
[email protected]. Children’s Hospital Boston and Harvard Medical School are Affirmative Action/Equal Opportunity Employers.
Yale University School of Medicine Interdepartmental CNNR Program Cellular Neuroscience, Neurodegeneration, and Repair PO Box 9812 New Haven, CT 06536-0812 http://medicine.yale.edu/cnnr/
Faculty Positions
The Yale Program in Cellular Neuroscience, Neurodegeneration, and Repair (CNNR) is searching for a scientist who uses molecular and cellular approaches to advance the understanding of nervous system function. Both outstanding applicants with research programs focused on understanding neurodegeneration or promoting neural repair, and applicants with a focus on basic aspects of neuronal function are encouraged to apply. The successful applicant will receive a primary appointment in one of the departments of the Yale School of Medicine and will be an active member of that department. Please see our website http://medicine.yale.edu/cnnr. Candidates must hold an M.D. and/or a Ph.D. degree, or equivalent degrees. We invite applications at the rank of assistant professor, but appointments at the rank of associate professor will be considered. Applications will be reviewed as they are received, but must be received before November 15, 2011. Please send a cover letter, curriculum vitae, up to 3 representative publications, a research plan (strictly limited to 2 pages), and arrange for submission of 3 letters of recommendation. All application materials should be sent electronically to Pietro De Camilli and Stephen M. Strittmatter, directors of the Program, exclusively at the following e-mail address:
[email protected]
60 years of leadership in human genetics research, education and service. 1948–2008 www.ashg.org
Applications from women and minority scientists are encouraged. Yale is an Affirmative Action/Equal Opportunity Employer.
Look Again. Discover More.
• Access to the 14 Cell Press primary research journals and 14 Trends reviews titles, all on the same platform
• Improved, more robust article and author search
• Easy to navigate home page, articles pages and archive
www.cell.com
BE THE FIRST
to read the latest issue of any Cell Press journal.
Register for Cell Press Email Alerts and get the complete table of contents as soon as the issue publishes online — FREE! Cell Press Email Alerts deliver the news, research, and commentaries featured in each journal’s latest issue, including the full title of every article, direct links to the articles, and the complete author list. Plus, to save you time, each research article has a brief summary highlighting its significant findings. You don’t have to be a subscriber to sign up for Cell Press Email Alerts. While subscribers have instant access to the full text of all articles listed in the Email Alerts, non-subscribers can read the abstracts of all articles as well as the full text of the issue’s Featured Article.
www.cellpress.com
Expand your stem cell library and save today on the latest books on stem cells and regenerative medicine

Stem Cells: Scientific Facts and Fiction
Christine Mummery, Ian Wilmut, Anja Van de Stolpe and Bernard Roelen
November 2010 | 400 pages | Paperback | $79.95 | €57.95 | £48.99 | ISBN: 9780123815354

Stem Cell Anthology: From Stem Cell Biology, Tissue Engineering, Regenerative Medicine, Cloning and Stem Cell Methods
Bruce M. Carlson
October 2009 | 450 pp. | Hardback | $150.00 | €100.00 | £95.00 | AU$222.00 | ISBN: 9780123756824

Principles of Regenerative Medicine, 2nd Edition
Anthony Atala and Robert Lanza
November 2010 | 1400 pages | Hardback | $199.95 | €143.00 | £125.00 | ISBN: 9780123814227

Essential Stem Cell Methods
A Volume in the Reliable Lab Solutions Series
Robert Lanza and Irina Klimanskaya
April 2009 | 628 pp. | Paperback | $75.00 | €50.95 | £45.99 | AU$111.00 | ISBN: 9780123750617

Heart Development and Regeneration, 2-Volume Set
Nadia Rosenthal and Richard P. Harvey
June 2010 | 1072 pp. | Hardback | $199.95 | €143.00 | £125.00 | AU$296.00 | ISBN: 9780123813329

Tissue Engineering
Clemens van Blitterswijk, Peter Thomsen, Jeffrey Hubbell, Ranieri Cancedda, Anders Lindahl Sahlgrenska, Jerome Sohier and David F. Williams
March 2008 | 760 pp. | Hardback | $115.00 | €76.95 | £69.99 | AU$170.00 | ISBN: 9780123708694

Essentials of Stem Cell Biology, 2nd Edition
Robert Lanza, Roger Pedersen, John Gearhart, E. Donnall Thomas, Brigid Hogan, James Thomson, Douglas Melton and Sir Ian Wilmut
June 2009 | 600 pp. | Hardback | $199.95 | €134.00 | £125.00 | AU$302.00 | ISBN: 9780123747297

Human Stem Cell Manual: A Laboratory Guide
Jeanne F. Loring, Robin L. Wesselschmidt and Philip H. Schwartz
June 2007 | 488 pp. | Spiral bound | $88.95 | €59.95 | £53.99 | AU$132.00 | ISBN: 9780123704658

Foundations of Regenerative Medicine: Clinical and Therapeutic Applications
Anthony Atala, Robert Lanza, James Thomson and Robert Nerem
September 2009 | 750 pp. | Hardback | $99.95 | €66.95 | £60.99 | AU$148.00 | ISBN: 9780123750853

Handbook of Stem Cells, 2-Volume Set with CD-ROM
Vol. 1 – Embryonic Stem Cells; Vol. 2 – Adult & Fetal Stem Cells
Robert Lanza, Roger Pedersen, Helen Blau, E. Donnall Thomas, John Gearhart, James Thomson, Brigid Hogan, Catherine Verfaillie, Douglas Melton, Irving Weissman, Malcolm Moore and Michael West
September 2004 | 1,760 pp. | Hardback | $566.00 | €380.00 | £345.00 | AU$817.00 | ISBN: 9780124366435
Cell Stem Cell subscribers save 25% on their book order Secure ordering online at elsevierdirect.com Enter promo code 28024 at check out Prices and publication dates subject to change without notice.
NEW
Find Your Ideal Job!
• Search jobs by keyword, location & type
• Post your resume anonymously
• Create a Job Alert and let your ideal job find you
careers.cell.com
Scopus is the largest abstract and citation database of peer-reviewed literature and quality web sources with smart tools to track, analyze and visualize research.
enrich your experience
www.scopus.com
SnapShot: Human Biomedical Genomics
GENOME–SCALE TRAIT MAPPING
POPULATION GENETICS
GENOMICS
Eimear E. Kenny1 and Carlos D. Bustamante1 1 Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5120, USA
Keyword
Definition
Single-nucleotide variation (SNV)
When a single base pair (A, C, T, G) is observed to vary in a DNA sequence. This variant may be private to an individual or family. For biallelic SNVs (i.e., an SNV that exists in two possible states), an individual can carry three possible allelic combinations or genotypes on their two chromosomes (e.g., for a single base pair position that could be A or G, the three possible genotypes are AA, AG, or GG).
Single-nucleotide polymorphism (SNP)
An SNV that is observed to vary among unrelated individuals and reaches an “appreciable” frequency in the population (classically defined to have a minor allele frequency [MAF, see below] > 5%, but today SNPs can be reliably assayed that are well below 0.1% MAF).
Structural variation (SV)
Variation that arises via deletion, insertion, or rearrangement of the DNA. Types of structural variation include: Indels: generated from the insertion or deletion of genetic segments, typically ranging from 1 base pair to >100 kilobases. Inversions: when a genetic segment is reversed with respect to the flanking DNA sequence. Translocations: when the genetic segment is rearranged between chromosomal locations.
Copy number variation (CNV)
A type of structural variation that results from the repeated gain or loss of a segment of DNA, leading to an ordinal number of segments (0, 1, 2, 3, etc.).
Copy number polymorphism (CNP)
CNV that segregates at the same location within or between populations.
Allele frequency
The proportion of chromosomes in a population carrying a particular allele. In the case of a biallelic SNP, alleles may be designated as: Minor: the allele observed in the smallest proportion of the population. Major: the allele observed in the largest proportion of the population. Ancestral: the allele that shares the same state as the last common ancestor of all humans. Derived: the allele that differs from the state in the last common ancestor of all humans.
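The allele-frequency definition can be illustrated with a short Python sketch. The sample genotypes and the function names `allele_frequencies` and `minor_allele` are assumptions made for illustration only; they do not come from the SnapShot.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotype strings such as "AG".

    Each individual contributes two chromosomes, so the denominator is the
    number of chromosomes sampled, not the number of individuals.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def minor_allele(freqs):
    """Return (allele, frequency) for the least common allele."""
    return min(freqs.items(), key=lambda item: item[1])


# Hypothetical sample: 10 individuals (20 chromosomes) at one biallelic SNP.
sample = ["AA"] * 6 + ["AG"] * 3 + ["GG"] * 1
freqs = allele_frequencies(sample)
maf_allele, maf = minor_allele(freqs)
print(freqs, maf_allele, maf)
```

In this made-up sample, "G" is carried on 5 of 20 chromosomes, so it is the minor allele with a MAF of 0.25.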
Haplotype
The linear combination of alleles at linked loci on chromosomal segments, in which the segments may vary in size from a few loci to the entire chromosome length depending on the number of recombination events that have occurred between the chromosomes.
Linkage disequilibrium (LD)
The covariance of allelic state at a pair of loci across a sample of chromosomes from a population (e.g., "A" allele at locus 1 is seen with "B" allele at locus 2, and "a" allele is seen with "b" allele more often than expected). LD can be caused by a lack of free recombination for loci on the same chromosome (e.g., tight genetic linkage for a pair of loci with multiple alleles) or by population substructure in the sample (e.g., genetic drift at unlinked loci and nonrandom mating lead to correlation in allelic state among loci).
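As a rough illustration of LD as a covariance, the sketch below computes D = P(AB) - P(A)P(B) from a sample of two-locus haplotypes. It also reports r², a commonly used normalization of D that the definition above does not mention. The haplotype sample and the function name `ld_stats` are hypothetical.

```python
def ld_stats(haplotypes):
    """Return (D, r2) for two biallelic loci.

    haplotypes: list of 2-character strings; the first character is the
    allele at locus 1 ("A" or "a"), the second the allele at locus 2
    ("B" or "b").
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "A" for h in haplotypes) / n
    p_b = sum(h[1] == "B" for h in haplotypes) / n
    p_ab = sum(h == "AB" for h in haplotypes) / n

    # D is the covariance of allelic state between the two loci.
    d = p_ab - p_a * p_b

    # r2 rescales D squared by the allelic variances at each locus;
    # it is undefined if either locus is monomorphic in the sample.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d ** 2 / denom if denom > 0 else float("nan")
    return d, r2


# Hypothetical sample of 10 chromosomes: "A" mostly travels with "B".
sample = ["AB"] * 4 + ["ab"] * 4 + ["Ab"] * 1 + ["aB"] * 1
d, r2 = ld_stats(sample)
print(d, r2)
```

A positive D here means the "A"/"B" and "a"/"b" combinations occur more often than expected if the loci were independent.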
Identical by descent (IBD)
Two chromosomal segments are IBD if they are inherited from the same recent ancestor. Although this is a matter of degree, we often define IBD as sharing above and beyond what would be expected by random mating in the population. Two individuals are deemed to be related if they share homologous genetic segments IBD (e.g., parent-offspring pairs share 50% of their genetic segments IBD).
Identical by state (IBS)
Two alleles at a locus are IBS if they share the same DNA sequence. IBS can indicate IBD, but it can also occur due to random pairing of chromosomal segments in the population (i.e., Hardy-Weinberg).
Quantitative trait
A continuous trait that exhibits variability in a population, e.g., height, body mass index, or plasma cholesterol levels.
Qualitative trait
A trait with two discrete designations is binary or dichotomous (e.g., “case” vs. “control”); a trait with multiple unordered categories is nominal (e.g., “green,” “blue,” or “brown” eyes); and a trait with natural ordering is ordinal (e.g., “low,” “normal,” or “high” resting heart rate).
Monogenic trait (Mendelian trait)
A dichotomous phenotype determined by a single locus, e.g., attached earlobes is a monogenic trait.
Polygenic trait
A common or complex phenotype that results from the combined effects of many loci and usually nongenetic factors, such as diet and the environment, e.g., type II diabetes is a polygenic trait.
Oligogenic trait
A phenotype that is the result of the combined effects of a few loci and often nongenetic factors, such as diet and the environment, e.g., age-related macular degeneration is an oligogenic trait.
Heritability (h²)
The proportion of the phenotypic variability that can be accounted for by a locus or loci segregating in the population. This can be near 100% for monogenic traits and usually much less than 100% for oligogenic and polygenic traits.
Prevalence
The proportion of individuals in a population exhibiting a trait.
Penetrance
The probability of having a phenotype given the genotype, e.g., for a hypothetical disease in which the A locus contributes to the disease, one might have penetrances of P(Disease | AA) = 1%, P(Disease | Aa) = 1.5%, and P(Disease | aa ) = 2.25%. The "a" allele is the predisposing or risk allele.
Genotype relative risk (GRR)
The ratio of penetrances for different genotypes at a locus. Under a multiplicative model, each allelic copy increases the chances of disease by a constant factor, e.g., for the example above, GRR = P(Disease | aa)/P(Disease | Aa) = 1.5 and P(Disease | Aa)/P(Disease | AA) = 1.5.
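The GRR arithmetic can be checked with a few lines of Python, using the hypothetical penetrances from the Penetrance entry (1%, 1.5%, and 2.25%). The function names are assumptions made for this sketch.

```python
import math

# Penetrances P(Disease | genotype) from the hypothetical example;
# "a" is the risk allele.
penetrance = {"AA": 0.01, "Aa": 0.015, "aa": 0.0225}


def genotype_relative_risks(pen):
    """Return the two stepwise risk ratios: (Aa vs. AA, aa vs. Aa)."""
    return pen["Aa"] / pen["AA"], pen["aa"] / pen["Aa"]


def is_multiplicative(pen, rel_tol=1e-9):
    """True if each added risk allele scales risk by the same factor."""
    first_copy, second_copy = genotype_relative_risks(pen)
    return math.isclose(first_copy, second_copy, rel_tol=rel_tol)


first_copy, second_copy = genotype_relative_risks(penetrance)
print(first_copy, second_copy, is_multiplicative(penetrance))
```

Both ratios come out at approximately 1.5, so the example is consistent with a multiplicative model.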
Genome-wide association studies (GWAS)
The simultaneous investigation of many genotypes (between 10⁵ and 10⁷) assayed on genome-scale SNP arrays for correlation with complex phenotypes in large quantitative or qualitative (case-control) population-based studies, resulting in a p value of association for each SNP. Larger GWAS are typically performed by combining summary statistics for each SNP test from multiple GWAS in a meta-analysis. Single GWAS are typically performed in two stages: Discovery GWAS stage: to identify associated SNPs. Replication stage: to confirm the association of top-ranking SNPs in independent samples.
Whole-exome association studies (WEAS)
The association of all DNA sequence mutations or variants in genome-wide coding regions (i.e., the exome) in large-scale population-based or family-based studies for either complex/oligogenic or Mendelian traits. In contrast to GWAS, the majority of variants in WEAS are rare and require methods that can detect multiple causal mutations or genetic heterogeneity at the trait-associated loci. Therefore, instead of analyzing individual variants, WEAS methods analyze variants within a region or gene as a group and usually rely on collapsing, resulting in a p value of association per locus.
Genetic model
Determines the risk (r) conferred by each allele to the phenotype. For example, an additive model increases the risk r- and 2r-fold for heterozygotes and homozygotes, respectively, and a multiplicative model r- and r²-fold for heterozygotes and homozygotes, respectively.
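The two genetic models named in this entry can be written out as a minimal Python sketch that follows the definition as stated. The function name `relative_risks` and the value r = 1.5 are illustrative assumptions.

```python
def relative_risks(r, model):
    """Return (heterozygote, homozygote) fold-increase in risk.

    Both values are relative to individuals carrying no risk allele,
    following the additive and multiplicative models as defined above.
    """
    if model == "additive":
        return r, 2 * r
    if model == "multiplicative":
        return r, r ** 2
    raise ValueError(f"unknown model: {model!r}")


# Illustrative per-allele risk of 1.5.
print(relative_risks(1.5, "additive"))        # (1.5, 3.0)
print(relative_risks(1.5, "multiplicative"))  # (1.5, 2.25)
```

With r = 1.5, the multiplicative output matches the hypothetical penetrances of 1%, 1.5%, and 2.25% given in the Penetrance entry.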
Multiple testing burden
Occurs when a large number of statistical tests of the same null hypothesis (i.e., “no association between genotype and phenotype”) are considered simultaneously. To overcome this, one may adjust the significance threshold to account for the number of tests performed (e.g., Bonferroni correction). For example, if one million independent tests were performed, the genome-wide significance threshold would be: nominal significance/number of tests = 0.05/1,000,000 = 5 × 10⁻⁸.
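The correction in this entry reduces to a single division, sketched here in Python. The p values and SNP labels are placeholders invented for illustration, and the function names are assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def genome_wide_hits(p_values, alpha=0.05, n_tests=1_000_000):
    """Return SNP labels whose p value falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [snp for snp, p in p_values.items() if p < threshold]


# Hypothetical p values; the labels are placeholders, not real SNP identifiers.
p_values = {"snp_1": 3e-9, "snp_2": 4e-6, "snp_3": 0.02, "snp_4": 4.9e-8}
hits = genome_wide_hits(p_values)
print(bonferroni_threshold(0.05, 1_000_000), hits)
```

Only the two placeholder SNPs with p values below 5 × 10⁻⁸ pass. "snp_2" would be significant at the nominal 0.05 level but not after correction.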
Cell 147, September 30, 2011 ©2011 Elsevier Inc.
DOI 10.1016/j.cell.2011.09.020
See online version for legend and references.