Gene Expression Profiling by Microarrays Clinical Implications
Microarray analysis is a highly efficient tool for assessing the expression of a large number of genes simultaneously, and offers a new means of classifying cancer and other diseases. Gene expression profiling can also be used to predict clinical outcome and response to specific therapeutic agents. This survey spans recent applications of microarrays in clinical medicine, covering malignant disease including acute leukemias, lymphoid malignancies, and breast cancer, together with diabetes and heart disease. Investigators in oncology, pharmacology, and related clinical sciences, as well as basic scientists, will value this review of a promising new diagnostic and prognostic technology.
Wolf-Karsten Hofmann, M.D., Ph.D. is Professor of Medicine in the Department of Hematology, Oncology and Transfusion Medicine, Charité – University Hospital Benjamin Franklin, Berlin. For four successive years he received the Young Investigator Award of the American Society of Hematology and, in addition to his research publications, has written many book chapters in hematology and oncology.
Gene Expression Profiling by Microarrays Clinical Implications
Edited by
Wolf-Karsten Hofmann
Department of Hematology and Oncology
Charité – University Hospital Benjamin Franklin
Berlin, Germany
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521853965
© Cambridge University Press 2006
This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.
First published in print format 2006
ISBN-13 978-0-511-22122-4 eBook (NetLibrary)
ISBN-10 0-511-22122-3 eBook (NetLibrary)
ISBN-13 978-0-521-85396-5 hardback
ISBN-10 0-521-85396-6 hardback
Cambridge University Press has no responsibility for the persistence or accuracy of urls for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Dedicated to Birgit, Konstantin and Franziska. Remembering my father, Heinz Hofmann.
Contents
List of contributors
Foreword (Eckhard Thiel)
Introduction (Wolf-Karsten Hofmann)
1 Technique of microarrays: microarray platforms (Sven de Vos)
2 Quantitative quality control of microarray experiments: toward accurate gene expression measurements (Xujing Wang and Martin J. Hessner)
3 Statistical analysis of gene expression data (David A. Elashoff)
4 Genomic stratification in patients with heart failure (Tara A. Bullard, Frédérick Aguilar, Jennifer L. Hall, and Burns C. Blaxall)
5 Gene expression profiling for the diagnosis of acute leukemias (Torsten Haferlach, Alexander Kohlmann, Susanne Schnittger, Claudia Schoch, and Wolfgang Kern)
6 Gene expression profiling can distinguish tumor subclasses of breast carcinomas (Ingrid A. Hedenfalk)
7 Gene expression profiling in lymphoid malignancies (Christof Burek, Elena Hartmann, Zhengrong Mao, German Ott, and Andreas Rosenwald)
8 mRNA profiling of pancreatic beta-cells: investigating mechanisms of diabetes (Leentje Van Lommel, Yves Moreau, Daniel Pipeleers, Jean-Christophe Jonas, and Frans Schuit)
9 Prediction of response and resistance to treatment by gene expression profiling (Philipp Kiewe and Wolf-Karsten Hofmann)
Index
Contributors
Editor
Wolf-Karsten Hofmann, Professor of Medicine, Department of Hematology and Oncology, Charité – University Hospital Benjamin Franklin, Hindenburgdamm 30, 12203 Berlin, Germany.
Foreword
Eckhard Thiel, Professor of Medicine, Head of the Department of Hematology and Oncology, Charité – University Hospital Benjamin Franklin, Hindenburgdamm 30, 12203 Berlin, Germany.
Contributors
Frederick Aguilar, Cardiovascular Research Institute, Division of Cardiology, Department of Medicine, University of Rochester Medical Centre, Rochester, NY.
Burns C. Blaxall, Cardiovascular Research Institute, Division of Cardiology, Department of Medicine, University of Rochester Medical Centre, Rochester, NY.
Tara A. Bullard, Cardiovascular Research Institute, Division of Cardiology, Department of Medicine, University of Rochester Medical Centre, Rochester, NY.
Christof Burek, Institute of Pathology, University of Wuerzburg, Josef-Schneider-Strasse 2, 97080 Wuerzburg, Germany.
Sven de Vos, Department of Hematology/Oncology, UCLA Medical Center, 9-631 Factor Building, 650 Charles E. Young Drive South, Los Angeles, CA 90095-1678, USA.
David A. Elashoff, Department of Biostatistics, UCLA School of Public Health, Los Angeles, CA 90095-1772, USA.
Torsten Haferlach, Munich Leukaemia Laboratory, Max-Lebsche-Platz 31, 81377 Munich, Germany.
Jennifer L. Hall, Lillehei Heart Institute, Division of Cardiology, Department of Medicine, University of Minnesota, Minneapolis, MN, USA.
Elena Hartmann, Institute of Pathology, University of Wuerzburg, Josef-Schneider-Strasse 2, 97080 Wuerzburg, Germany.
Ingrid A. Hedenfalk, Department of Oncology, Clinical Sciences, Lund University, Lund SE-22185, Sweden.
Martin J. Hessner, The Max McGee National Research Center for Juvenile Diabetes, The Medical College of Wisconsin and Children’s Research Institute of the Children’s Hospital of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA and The Human and Molecular Genetics Center, The Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA.
Jean-Christophe Jonas, Unit of Endocrinology and Metabolism, Faculty of Medicine, Université Catholique de Louvain, Brussels, Belgium.
Wolfgang Kern, Munich Leukaemia Laboratory, Max-Lebsche-Platz 31, 81377 Munich, Germany.
Philipp Kiewe, Department of Hematology and Oncology, Charité – University Hospital Benjamin Franklin, Hindenburgdamm 30, 12203 Berlin, Germany.
Alexander Kohlmann, Roche Molecular Systems, Pleasanton, CA, USA.
Zhengrong Mao, Institute of Pathology, University of Wuerzburg, Josef-Schneider-Strasse 2, 97080 Wuerzburg, Germany.
Yves Moreau, Department of Electrical Engineering, ESAT–SCD, K. U. Leuven, Leuven, Belgium.
German Ott, Institute of Pathology, University of Wuerzburg, Josef-Schneider-Strasse 2, 97080 Wuerzburg, Germany.
Daniel Pipeleers, Diabetes Research Center, Vrije Universiteit Brussel, Brussels, Belgium.
Andreas Rosenwald, Institute of Pathology, University of Wuerzburg, Josef-Schneider-Strasse 2, 97080 Wuerzburg, Germany.
Susanne Schnittger, Munich Leukaemia Laboratory, Max-Lebsche-Platz 31, 81377 Munich, Germany.
Claudia Schoch, Munich Leukaemia Laboratory, Max-Lebsche-Platz 31, 81377 Munich, Germany.
Frans C. Schuit, Gene Expression Unit, Department of Molecular Cell Biology, K. U. Leuven, Leuven, Belgium.
Leentje Van Lommel, Gene Expression Unit, Department of Molecular Cell Biology, K. U. Leuven, Leuven, Belgium.
Xujing Wang, The Max McGee National Research Center for Juvenile Diabetes, The Medical College of Wisconsin and Children’s Research Institute of the Children’s Hospital of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA and The Human and Molecular Genetics Center, The Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA.
Foreword
The introduction of light microscopy in 1872 by Ernst Abbe and Carl Zeiss was one of the first revolutionary steps in the diagnosis of human diseases. Morphological analysis of tissues and single cells made it possible to identify the structural defects associated with disease, which led to the development of major classifications and subgroup definitions. These have been revised many times during the last 130 years and continue to have a huge impact on modern diagnostics. For decades during the last century, light microscopy was one of the most important methods available for the clinical diagnosis of tumors and for describing morphological changes associated with widespread disorders, such as diabetes and heart disease.
Beginning with chromosomal analysis in the 1960s, further methods have been introduced into disease diagnosis: molecular biological methods, the polymerase chain reaction, and immunological methods such as immunofluorescence, which enables us to define the surface marker profile of single cells. Examples are: the discovery of the Philadelphia chromosome t(9;22) as the main feature of chronic myeloid leukemia (CML); the association of HLA subtypes with specific diseases (e.g., HLA-B27 in patients with Bechterew’s disease); and mutations of the APC gene in patients with colon cancer. The use of such single genetic markers has improved the risk stratification of these diseases, resulting in more specific treatment with better long-term clinical outcomes. For CML in particular, the well-characterized molecular defect made it possible to design the first target-specific drug (STI571, imatinib), dramatically improving the treatment options for patients with Ph+ leukemias.
Twenty years ago, in 1985, I was asked to provide a review article entitled “Cell surface markers in leukemia: biological and clinical correlations” (Thiel, E., Crit. Rev. Oncol. Hematol. 1985: 209–260). The introduction of cell surface marker analysis by flow cytometry for clinical bone marrow and peripheral blood samples from patients with acute leukemia, in particular those with acute lymphoblastic leukemia (ALL), extended the classification of ALL from three morphologically defined subgroups (L1–L3 as defined by the French–American–British Cooperative Group) to about ten different subtypes. Discrimination between B-lineage and T-lineage ALL, as well as the determination of high-risk ALL subtypes, enabled us to introduce risk-adapted treatment plans for patients with ALL. The consequence was, on the one hand, a diversity of therapeutic regimens in ALL but, on the other hand, an improvement in disease-free long-term survival to more than 60% in a number of patients. Successful immunophenotyping has thus triggered therapeutic advances in patients with ALL over the last two decades.
The recently available microarray technology is a powerful new tool for assessing the expression of a large number of genes in a single experiment. Its most important use in medical science is to characterize global gene expression profiles that are specific for disease subgroups or that have prognostic value (e.g., for disease progression). This can speed up the identification of diagnostic and prognostic markers. Furthermore, gene expression profiling can help predict responses to different pharmacological treatments, resulting in therapeutic stratification for individual patients. This book summarizes the most recent work on gene expression profiling in clinical medicine, including diabetes, heart disease and tumor diseases. From identifying previously known and unknown subtypes of disease and correlating them with survival, researchers have moved on to constructing distinct gene models for predicting clinical outcome or response to specific therapeutic agents. This may change treatment strategies in the future, resulting in more individualized therapy for all kinds of human diseases.
The clinical implications of microarray technology have been reflected recently in two further developments. First, Supplement 6 to vol. 37 of Nature Genetics in June 2005, entitled “The Chipping Forecast III,” summarizes recent technical advances and the predictive power of microarray data analysis. The second is a most important development for clinicians awaiting the introduction of gene expression profiling into the clinical routine. In July 2005 the first international multicenter trial (MILE, Microarray Innovations in Leukemia), conducted by one of the contributors to this volume (Professor Torsten Haferlach, Munich) and initiated by the European LeukemiaNet, started to analyze 4000 clinical samples from patients with all subtypes of leukemia, both by traditional diagnostic methods (including morphology, immunology, and molecular genetics) and by gene expression profiling using microarrays. This trial will be the first attempt to correlate results from standard diagnostics in leukemia with those from gene expression analysis. Diagnosis of leukemias by microarrays can be expected to reach the level of clinical application very shortly.
Just as the introduction of light microscopy revolutionized diagnosis in human disease, I would predict that gene expression profiling will change our understanding of disease classification and prognosis evaluation dramatically over the next few years.
Eckhard Thiel, M.D.
Berlin, June 2005
Introduction
Wolf-Karsten Hofmann
Department of Hematology and Oncology, University Hospital Benjamin Franklin, Berlin, Germany
Gene Expression Profiling by Microarrays: Clinical Implications, ed. Wolf-Karsten Hofmann. Published by Cambridge University Press 2006. © Cambridge University Press 2006.
The development of methods to measure gene expression was revolutionized by the polymerase chain reaction (PCR), which Kary Banks Mullis introduced in the 1980s and which was in widespread use for expression analysis by the early 1990s. Total RNA was reverse transcribed and amplified using specific primers, resulting in a gene-specific PCR product which could be visualized by gel electrophoresis. To detect specific gene expression in all kinds of human cells, millions of PCR reactions have been performed during the last 15 years. Today, PCR is a standard method for gene expression analysis, used for diagnostic purposes as well as for the analysis of physiological and pathophysiological gene expression in all organisms including humans. Conventional PCR detects the expression of a single gene within one reaction. By optimizing the technique, the number of genes which can be detected within one reaction has been increased to a maximum of six by using fluorescence-labeled primers or probes. High-throughput analysis of multiple genes by PCR, e.g., in hundreds of patient samples, is very time consuming and requires considerable technical and personnel resources. As an example, it would require about 625 days of work (24 hours a day) to analyze all the human genes known at this time by PCR using singleplex reactions.
In 1995, Patrick O. Brown and colleagues published the first paper about a new technique which could be used to analyze the expression of 45 genes simultaneously within one experiment, using a microarray prepared by high-speed robotic printing of complementary DNAs on glass slides. The detection of gene expression using such a “microarray,” commonly called a “DNA chip” or “RNA chip,” requires a high-resolution scanner which can detect the fluorescence signals from each of the cDNAs with high sensitivity and specificity. Finally, each of the fluorescence signals is computed into an expression value which is specific for the gene represented by the cDNA on the glass chip. Even with the relatively low number of 45 genes, it was possible for the first time to analyze the expression of multiple selected genes within one experiment in hundreds of human tissue samples.
Since 1995, the technique of microarrays has improved dramatically. In the late 1990s two different principles of microarrays could be distinguished. On the one hand, so-called “cDNA microarrays” were introduced, containing a few hundred to a few thousand cDNA targets of particular interest for hybridization with the RNA extracted from the tissue of interest. On the other hand, short single-stranded DNA segments (oligonucleotides) generated directly on the microarray surface by chemical synthesis resulted in the production of “oligonucleotide microarrays.” The number of genes which can be analyzed has increased from a few thousand to more than 50 000 different sequences on the latest generation of oligonucleotide microarrays. Now, ten years after the introduction of the microarray technique, we are able to analyze the expression of every known human gene (as annotated by the Human Genome Project) within one experiment using one oligonucleotide microarray. The chapter “Technique of microarrays: microarray platforms” by Sven de Vos gives an introduction to the commonly available microarray techniques, exploring in detail the principle of RNA hybridization and the technical settings for hybridization and signal scanning.
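The estimate of about 625 days for singleplex PCR quoted earlier in this Introduction can be reproduced with simple arithmetic. The text does not state its inputs, so the sketch below assumes roughly 30 000 genes and half an hour of work per reaction; both numbers, and the function name, are illustrative assumptions chosen only because they give the quoted figure.

```python
import math


def pcr_workload_days(n_genes, hours_per_reaction, genes_per_reaction=1):
    """Days of round-the-clock work needed to assay n_genes by PCR."""
    # Each reaction covers genes_per_reaction genes (1 = singleplex).
    n_reactions = math.ceil(n_genes / genes_per_reaction)
    return n_reactions * hours_per_reaction / 24.0


# Assumed inputs (not given in the text).
N_GENES = 30_000
HOURS_PER_REACTION = 0.5

singleplex_days = pcr_workload_days(N_GENES, HOURS_PER_REACTION)
# Six genes per reaction is the multiplex maximum mentioned in the text.
sixplex_days = pcr_workload_days(N_GENES, HOURS_PER_REACTION, genes_per_reaction=6)

print(f"singleplex: {singleplex_days:.0f} days")
print(f"six genes per reaction: {sixplex_days:.0f} days")
```

Under these assumptions even the best-case multiplex PCR still needs more than three months of continuous work, which is the gap that a single microarray hybridization closes.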
Since the detection of the fluorescence signal by the microarray scanner is one of the most important and critical technical issues, several optical systems are explained in that chapter, including novel microarray platforms with three-dimensional settings which increase the density of specific oligonucleotides and improve assay sensitivity by reducing non-specific binding and background noise.
In parallel with the evolution of ever higher density microarrays, representing 20 000 to 50 000 different sequences for gene expression detection, an increasing number of technical, quality and data management problems came to the attention of the scientific community. Besides technical issues (robustness of microarray equipment, handling of microarray systems, spotting of high-density microarrays), the quality control of microarray experiments plays a central role in bringing this technique into routine clinical settings. The technique has great potential in the study of the complex mechanisms regulating gene expression in human diseases, where a comprehensive evaluation is needed. However, the accumulation of high-quality microarray data is still a challenge for many laboratories. One major problem is the correlation of microarray data with gene expression data from other platforms, including polymerase chain reaction and Northern blot. Furthermore, the correlation of gene expression data between different microarray platforms is still problematic. In addition, microarray experiments involve a number of experimental steps, including the preparation of high-quality RNA, which results in a high variability of microarray data and significant background noise. Initially, this was a strong limitation for the introduction of microarray analyses into clinical settings for the classification of diseases or for new prognostic scoring systems based on gene expression profiles.
In the chapter “Quantitative quality control of microarray experiments: toward accurate gene expression measurements,” Xujing Wang, one of the experts in the field of quality assessment of microarray data, evaluates several mathematical approaches for extracting the most reliable information from gene expression measurements by microarrays. First, the image analysis of scanned microarrays is critically explored, followed by suggestions for quality-dependent filtering of expression data. Furthermore, the important issue of data normalization, which is necessary to compare fluorescence data measured in different hybridization series and/or from different materials, is discussed in detail, showing the potential of normalization algorithms to improve the comparability of microarray data.
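To make the idea of normalization concrete, the sketch below rescales each array so that all arrays share the same median intensity. This global median scaling is only one simple approach, chosen here for illustration; it is not necessarily among the algorithms evaluated in the chapter, and the function name and toy intensities are hypothetical.

```python
from typing import Optional

import numpy as np


def median_scale(intensities: np.ndarray, target: Optional[float] = None) -> np.ndarray:
    """Scale each array (column) so that its median equals a common target.

    intensities: rows are genes, columns are arrays (hybridizations).
    target: desired median; defaults to the mean of the per-array medians.
    """
    medians = np.median(intensities, axis=0)
    if target is None:
        target = float(np.mean(medians))
    # One multiplicative factor per array, broadcast across all genes.
    return intensities * (target / medians)


# Toy data: the second array was scanned twice as bright as the first.
raw = np.array([
    [100.0, 200.0],
    [200.0, 400.0],
    [300.0, 600.0],
])
normalized = median_scale(raw)
print(normalized)
```

After scaling, the two toy arrays become directly comparable because their overall brightness difference has been removed, while the relative differences between genes within each array are preserved.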
Our memory of the first microarray experiments we performed in our laboratory in late 1999 is dominated by the attempts to work through the normalized expression values of about 6800 genes, which were analyzed with the Affymetrix HuGene-FL microarray in about 25 tumor samples from patients with malignant lymphoma. At that time, we started by using common spreadsheet and database software (e.g., Microsoft Excel) to go step by step through all the genes in all of the samples. Using this “handmade” analysis strategy, we were able to find the most differentially expressed genes by comparing each tumor sample with each control sample derived from normal lymphatic tissue. Finally, we obtained a list containing the GenBank annotations of the genes we had found to be differentially expressed, but at this stage of the analysis we had no idea about the common name, the function or the importance of the genes we had detected. It took months of work to go through all the genes which were significantly regulated and finally to define a virtual network of genes which could be used for pathway analysis or to answer biologically relevant questions. Furthermore, we did not know the statistical significance of our results, because no data analysis restrictions or recommendations were available for a more specific analysis of gene expression data. Confirmation of highly differentially expressed genes using a second, non-microarray-based technique was necessary and time consuming.
In the chapter “Statistical analysis of gene expression data,” David A. Elashoff discusses the most common questions: “Which genes are correlated with specific characteristics of the samples?” and “Are there specific patterns of gene expression or combinations of multiple genes which can accurately predict the sample characteristics?” The analysis of the increasing amount of microarray data requires complex statistical methods and models, which are discussed and compared with each other by the author, who has extensive experience in the statistical analysis of large series of microarray data from patient samples. Analysis of gene expression by microarrays is expected to define new diagnostic criteria and to identify new risk parameters in human diseases which cannot be defined by conventional techniques. Therefore, the statistical methods of clustering and class membership prediction are evaluated in detail, to enable readers of scientific publications containing microarray data in a clinical context to understand how the data were generated, and also to recognize problems and pitfalls in the analysis of large series of microarray hybridizations.
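The step from a “handmade” comparison of tumor and control samples to a statistically ranked gene list can be sketched in a few lines. The example below ranks genes by a Welch t-statistic computed on log2 expression values; it is a minimal illustration under assumed toy data, not the method used in the laboratory work described above or in the statistics chapter, and a real analysis would also have to account for testing thousands of genes at once.

```python
import numpy as np


def rank_by_welch_t(tumor, control):
    """Rank genes by |Welch t| between two groups of log2 expression values.

    tumor, control: arrays with genes in rows and samples in columns.
    Returns (log2 fold change, t-statistic, gene indices sorted by |t|).
    """
    tumor = np.asarray(tumor, dtype=float)
    control = np.asarray(control, dtype=float)
    # On a log2 scale, the difference of means is the log2 fold change.
    log2_fold_change = tumor.mean(axis=1) - control.mean(axis=1)
    # Welch standard error: group variances are not assumed to be equal.
    standard_error = np.sqrt(
        tumor.var(axis=1, ddof=1) / tumor.shape[1]
        + control.var(axis=1, ddof=1) / control.shape[1]
    )
    t_stat = log2_fold_change / standard_error
    order = np.argsort(-np.abs(t_stat))  # most extreme genes first
    return log2_fold_change, t_stat, order


# Hypothetical log2 values for three genes in three tumors and three controls.
tumor = [[8.0, 8.2, 7.8], [5.0, 5.1, 4.9], [10.0, 10.4, 9.6]]
control = [[6.0, 6.2, 5.8], [5.0, 4.9, 5.1], [10.3, 9.9, 10.1]]

log2_fc, t_stat, order = rank_by_welch_t(tumor, control)
print("top-ranked gene index:", order[0])
```

In this toy data set only the first gene differs clearly between the groups, so it is ranked first; the other two genes show differences that are small relative to their sample-to-sample variability.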
The chapters on microarray techniques, quality control and statistical analysis cover a number of technical issues which are important for the critical interpretation of microarray data. After reading them, the question remains: what clinical implications can be derived from microarray results? The answer may be difficult at this time, but during the last 10 years considerable progress has been made in introducing microarray experiments for disease classification, risk stratification and the detection of pathomechanisms of certain human diseases. Even if the most encouraging results for the use of gene expression analysis by microarrays in the clinical routine come from the analysis of tumor cells, in particular leukemia and lymphoma cells, there are other well-accepted efforts to introduce gene expression profiles into the evaluation of common diseases such as diabetes and heart failure. Therefore, the introduction and the significance of gene expression analysis by microarrays for the management of human diseases are discussed in several chapters by well-known experts.
Genomic profiling using microarrays in both animals and humans has elucidated numerous novel genes and pathways associated with the development, progression and regression of heart failure (chapter by Tara A. Bullard). On the one hand, this has provided substantial insight into mechanisms of disease and has generated novel hypotheses about the complex nature of heart failure. On the other hand, gene expression profiling using RNA derived from the peripheral blood of patients with heart failure will certainly enhance the diagnostic capabilities for this disease.
As discussed previously, one of the main topics in the field of gene expression profiling by microarrays is the analysis of tumor samples. The chapter by Torsten Haferlach shows very clearly how gene expression signatures can correlate with high accuracy with certain subtypes of acute leukemia. The Haferlach group has the most experience worldwide with genomic analysis for diagnostic purposes in patients with acute leukemia. This group is conducting the first multicenter trial to evaluate the microarray technique for diagnostic use in patients with leukemias in comparison with standard diagnostics (MILE, Microarray Innovations in Leukemia). There are other tumor diseases which could clearly benefit from gene expression profiling experiments.
In breast cancer, it has been shown that gene expression profiling can distinguish new tumor subclasses, which may lead to different therapeutic strategies for each of the subclasses. The association of specific gene expression profiles with the metastatic potential and aggressiveness of breast cancer, and the possible prediction of clinical outcome, may have a strong impact on adjuvant treatment decisions in these patients. Furthermore, it seems possible to correlate specific gene expression profiles with the efficacy of several cytostatic and antihormonal drugs used for the treatment of patients with breast cancer. This may lead to the individualized tumor therapy which is required to improve the clinical outcome of patients with breast cancer.
No other field in oncology is as affected by the availability of gene expression techniques as the malignant lymphomas. Historically, the expression of single genes (such as Cyclin D1 in mantle cell lymphoma) has been used to define distinct subtypes of these malignancies, in addition to standard histology and morphology of lymph nodes or other lymphatic tissues. The classification of malignant lymphomas has been a topic of lively debate, and various conceptual frameworks have been used in the past to classify lymphomas in a clinically and biologically meaningful way. The chapter by Christof Burek summarizes recently developed classification systems based on the expression of several hundred genes, which can help on the one hand to establish new diagnostic subgroups and on the other hand to identify new prognostic parameters for malignant lymphomas. Several clinical trials are ongoing which use such new risk factors (which can only be determined by specific gene expression profiles and not by traditional molecular genetic analysis) to adapt the therapy schedule and the combination of cytotoxic drugs in the treatment of low- and high-grade malignant lymphomas. This may finally result in a comprehensive molecular classification of lymphomas with direct impact on clinical treatment decisions, but it may also help to resolve the biological and clinical heterogeneity that is present in many currently defined lymphoma entities.
How can we benefit from microarrays in the management of patients with diabetes mellitus, a disease more common than all the tumor diseases discussed above, but one which lacks a therapeutic concept going beyond the substitution of insulin and the management of late complications?
Leentje Van Lommel gives an overview of recent advances in applying gene expression profiling to pancreatic beta-cells to discover mechanisms of diabetes. In this chapter, the importance of selecting target cells of high purity for microarray experiments is explored in detail. To focus on biological and/or disease-related information, different strategies for data mining are discussed, demonstrating how difficult it is to find the key regulator genes for beta-cell malfunction in diabetes, even when data from a parallel analysis of more than 40 000 genes are available.
After the diagnosis has been established in a patient, the question has to be answered: “Whether and how to treat this patient?” For decades, drug therapy has been an empirical science largely based on trial and error. Even today we cannot predict the effectiveness of a particular drug in an individual patient. To find the most suitable antihypertensive agent, for example, it may take more than one attempt. Often, only the meticulous evaluation of a large number of clinical trials has improved treatment success and benefit for patients. With the ability to evaluate the expression of thousands of genes at a time using commercially available or customized gene arrays, and with the application of sophisticated statistical algorithms, a new era in the prognostic assessment of diseases and in therapeutic decision making has begun. In the chapter “Prediction of response and resistance to treatment by gene expression profiling,” Philipp Kiewe evaluates recent progress in correlating specific gene expression profiles with the clinical course of diseases, including the prediction of the sensitivity of target cells to different therapeutic options. As discussed earlier, the most recent advances have been made in the treatment of hematological malignancies, including acute leukemias and high-grade lymphomas, but other fields can also benefit from an approach which defines early, at diagnosis, which drug combination or treatment may have the highest success rate in the individual patient. Applying the microarray technique to answer this important question in advance is one of the most interesting challenges for the future of gene expression profiling in modern human medicine.
In conclusion, the present book, entitled “Gene expression profiling by microarrays: clinical implications,” is written to serve as a short guide to better understanding the overwhelming number of forthcoming scientific publications which use gene expression profiling to define and illustrate new disease classifications and prognostic factors.
Looking to the future, it seems possible that disease-specific gene expression profiles can be used in addition to (and later as a substitute for) traditional diagnostic tools including morphology, laboratory values and genetic analysis. They certainly cannot substitute for the evaluation of the clinical performance of each patient, which may be the largest variable in diagnostic and treatment decisions. Finally, gene expression profiling can be the starting point for the development of truly individualized and target-specific treatments in all kinds of human diseases.
1
Technique of microarrays: microarray platforms
Sven de Vos
UCLA Medical Center, Los Angeles, CA, USA
Introduction Within a few years of their inception, microarrays have become a widely used tool to study global gene expression of cells in culture or complex tissues in many different organisms. The major technical advance lies in the high throughput capability covering the RNA expression of whole genomes on a single chip, thereby transforming the classical paradigm of studying ‘‘one gene at a time.’’ With only modest efforts, an immense amount of raw data can be produced, which has created unique challenges for the analyses and interpretations of microarray experiments when attempting to distil meaningful conclusions from these large data sets. On the experimental side, quality problems of the sample materials, on the hardware side problems with probe sets, quality controls, and protocol standardization, and on the analysis side questions of the most suitable statistical analysis techniques soon surfaced. Stringent experimental planning and controlling is necessary to extract meaningful data from microarray experiments. In order to create reliable and comparable data sets, the minimal information about a microarray experiment (MIAME) [1] has been published and adherence to these guidelines increasingly is required by scientific journals. The landscape of high-throughput gene expression has continued to evolve and most recently has witnessed an onslaught of new and improved microarray platforms. The basic protocol starts with the hybridization of complementary strands of labeled DNA or RNA from cells or tissues with representations of known genes or expressed sequence tags (ESTs) spotted onto a solid support, usually glass or nylon. Several strategies for labeling are employed Gene Expression Profiling by Microarrays: Clinical Implications, ed. Wolf-Karsten Hofmann. Published by Cambridge University Press 2006. # Cambridge University Press 2006.
and include radioactivity, a hapten group, or fluorescent nucleotides. Detection is accomplished by autoradiography, chemiluminescence, or fluorescence scanning. Different microarray platforms are in use: some spot known genes/ESTs as PCR fragments, plasmids, or synthetic oligonucleotides onto a solid surface, whereas others synthesize oligonucleotide sequence tags of 20–60 nucleotides directly on glass chips. The signal intensity is then correlated with the relative expression of a known gene. Microarrays are applied to RNA profiling, polymorphism screening, mapping of genomic DNA clones, and the search for gene signaling pathways or novel drug targets [2–6], and are used increasingly to reclassify malignancies and improve clinical outcome predictions [7, 8]. DNA microarray technology was introduced in October 1995 by Patrick Brown et al., who used spotted cDNA microarrays [9]. Since then, several competing technologies for microarray probe implementation have emerged, including the use of full-length cDNAs and of pre-synthesized or in situ synthesized oligonucleotides as probes [10]. Although spotted cDNA microarrays are still in use today in dedicated individual laboratories, oligonucleotide-based arrays, first brought to the market by Affymetrix, have become increasingly popular and now dominate the market owing to their ready availability and reproducibility.
Microarray platforms
Several different arraying methods exist, ranging from on-chip photolithographic synthesis of 20–25-mer oligos onto silicon wafers [11] and printing of 20–25-mer oligos onto a solid support to printing of 500–5000 bp cDNAs onto either glass slides or membranes [12]. The two most used support materials are glass and nylon. The advantages of a glass surface are covalent attachment of DNA samples, excellent durability, and, because of its low fluorescence, no significant background noise [13].
cDNA microarrays
Conventional spotted microarrays are fabricated in dedicated laboratories and complement the use of commercially available microarrays such as GeneChips. The advantage of spotted microarrays containing a few
hundred to a few thousand cDNA targets of particular interest is that they can be printed at relatively low cost compared with commercial arrays. The production of spotted microarrays is a highly automated process, using either pin-based robotic arrayers or an inkjet micro-dispensing system to print cDNAs or oligonucleotides on glass slides [14]. Arrays utilizing cDNAs as probes have been used primarily in academic laboratories, and Incyte (Palo Alto, CA, USA) provided a commercial cDNA microarray service until 2001. Agilent (Palo Alto, CA, USA) employs in situ synthesis at the surface of the microarray slide by inkjet printing using phosphoramidite chemistry [14–18]. Inkjet technology is also used by Agilent to provide spotted cDNA arrays from polymerase chain reaction (PCR) amplicons. Microarrays containing large PCR-amplified cDNA fragments, ranging from 0.5 to 2.0 kb in size, are generated by physically depositing small amounts of each cDNA of interest onto known locations on glass surfaces (for review see [19]). In many cases, the targets are chosen directly from databases such as GenBank, dbEST, and UniGene. Additionally, full-length cDNAs, collections of partially sequenced cDNAs (or ESTs), randomly chosen cDNAs from any library of interest, or a specific set of genes of interest can be used. The microarrays are produced on poly-L-lysine (Sigma, St. Louis, MO, USA) coated microscope slides, and the cDNA fragments are cross-linked to the matrix by UV. A robotic arraying machine loads about 1 µL of PCR-amplified fragments from corresponding wells of 96-well plates and deposits about 5 nL of each sample onto each of up to 100 slides.
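The spotting volumes quoted above can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes that the loaded volume is about 1 microliter per sample and that each sample is spotted once per slide, and the function and constant names are hypothetical.

```python
# Volumes taken from the text; reading the loaded volume as about 1 microliter
# and spotting each sample once per slide are assumptions of this sketch.
LOADED_NL = 1_000   # about 1 microliter loaded per sample, in nanoliters
DEPOSIT_NL = 5      # about 5 nL deposited per spot
MAX_SLIDES = 100    # up to 100 slides per print run


def volume_used_nl(slides, deposit_nl=DEPOSIT_NL):
    """Total volume (nL) deposited when one sample is spotted once on each slide."""
    return slides * deposit_nl


used = volume_used_nl(MAX_SLIDES)
remaining = LOADED_NL - used
print(f"deposited: {used} nL, remaining: {remaining} nL")
```

Under these assumptions a full run of 100 slides consumes about half of the loaded volume, which is consistent with loading roughly 1 microliter per sample.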
Several commercial arraying machines can be used, including MicroGrid from BioRobotics (Cambridge, UK), GMS 417 from Genetic Microsystems (Woburn, MA, USA), Omni-Grid from GeneMachines (San Carlos, CA, USA), the PixSys PA series from Cartesian Technologies (Irvine, CA, USA), and the arrayer made by Beecher Instruments (Silver Spring, MD, USA).
Oligonucleotide microarrays
Affymetrix dominated the market for many years by applying photolithographic technologies derived from the semiconductor industry to the production of high-density microarrays. A GeneChip consists of short single-stranded DNA segments (oligonucleotides), which are generated directly on the microarray surface by chemical synthesis [20]. The underlying principle
is the combination of photolithography and solid-phase DNA synthesis. First, synthetic linkers with attached photochemically removable protecting groups are fixed to a silicon surface. Then, light is directed through a photolithography mask to specific areas of the chip surface, which results in photo-deprotection of only the illuminated areas. Finally, hydroxyl-protected deoxynucleotides are incubated with the surface, and chemical coupling occurs at those sites that have been photo-deprotected in the preceding step. This series of steps is then repeated to synthesize polynucleotides in a highly specific manner at defined locations. GeneChips are designed in silico and do not require the maintenance of clone libraries with their inherent risk of misidentifying tubes, cDNAs, or spots, thereby avoiding problems that have occurred with standard cDNA array production. Between 11 and 20 oligonucleotide pairs per gene, designed to hybridize to different regions of the same RNA, are distributed on a GeneChip. This provides probe redundancy, decreases the problem of cross-hybridization effects, and makes the data less sensitive to isolated quality problems of the chip surface such as scratches or bubbles. A feature employed only by GeneChips is the use of mismatch (MM) control probes, which are identical to the perfect match (PM) probes with the exception of a single base difference. The idea is for MM probes to control for non-specific hybridization and to allow background noise and cross-hybridization signals to be subtracted. The absolute need for such MM controls has been questioned, however, by several users. The omission of MM controls would double the “real estate” on the chip surface. In the earlier versions, a GeneChip contained about 10 000 genes/ESTs with 40 oligos per gene on the array. The latest generation GeneChip (Human Genome U133 Plus 2.0) contains only 11 probe pairs per gene with a smaller feature size of 11 µm per spot.
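How probe redundancy and MM subtraction work together can be illustrated with a small sketch. It is not the vendor's summarization algorithm: it simply subtracts each MM intensity from its PM partner and summarizes the 11 differences of one probe set, once by the mean and once by the median. The intensities are invented for illustration, and one PM value is set artificially high to mimic a surface defect such as a scratch.

```python
from statistics import mean, median


def probe_pair_differences(pm, mm):
    """Return PM - MM for every probe pair of one probe set."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("PM and MM lists must be non-empty and of equal length")
    return [p - m for p, m in zip(pm, mm)]


# Hypothetical intensities for one gene with 11 probe pairs.
# The sixth PM value mimics an isolated artifact on the chip surface.
pm = [1200, 980, 1500, 1100, 1310, 40000, 1250, 1050, 990, 1400, 1180]
mm = [300, 260, 410, 280, 350, 350, 330, 270, 250, 380, 310]

differences = probe_pair_differences(pm, mm)
mean_signal = mean(differences)      # pulled upward by the single artifact
median_signal = median(differences)  # barely affected by the single artifact

print(f"mean PM-MM:   {mean_signal:.1f}")
print(f"median PM-MM: {median_signal}")
```

With a single probe per gene the artifact would dominate the measurement; with 11 pairs, a robust summary such as the median remains close to the typical probe-pair difference.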
The smaller feature size enables the production of a “whole genome on a single chip,” querying the expression levels of >47 000 transcripts and splice variants derived from 38 500 human genes. The Agilent oligonucleotide microarrays use single 60-mers, in contrast to the multiple short 25-mer probes employed by Affymetrix, providing a good compromise between specificity and tightness of binding. A 60-mer oligo provides a larger hybridization area, which enhances sensitivity compared with a 25-mer setting. Whereas the Affymetrix multiple 25-mer PM/MM design is rather sensitive to sequence mismatches, the longer 60-mer
format is more tolerant of sequence mismatches, as they occur in genes with highly polymorphic regions. While others have used linkers to move the oligos physically away from the slide surface for better hybridization efficiency [21, 22], Agilent uses the actual 60-mer probe sequence as linker. The effect of single base differences on the hybridization efficiency was found to be related to their distance from the microarray surface, which implies that one benefit of additional oligo length is the displacement of the 5′ end from the surface [18]. Both Agilent and Affymetrix also provide custom arrays in addition to their “off-the-shelf” high-density oligonucleotide microarrays.
Novel microarray platforms
The CodeLink bioarray platform from Amersham Biosciences (Piscataway, NJ, USA) introduced a novel array design using a three-dimensional polyacrylamide gel on the slide surface, which provides an aqueous reaction environment and improves assay sensitivity through reduced non-specific binding and background noise. 30-mer oligos, one per queried gene, are pre-produced, validated, and immobilized on the microarray surface [23]. Applied Biosystems (Foster City, CA, USA) has updated its human microarrays and introduced the Human Genome Survey Microarray V2.0, featuring 32 878 probes for the interrogation of 29 098 genes. Most of the probes, which are 60-mers, are found within 1500 bases of the 3′ end of the source transcript, where labeling is more robust. This system uses chemiluminescence for detection. 60-mer oligonucleotides are produced, validated by mass spectrometry, and deposited onto a nylon microarray substrate, which is subsequently attached to a glass support. A new design using a digital micromirror device (DMD) was introduced by NimbleGen Systems (Madison, WI, USA). NimbleGen introduced a centralized production facility able to synthesize microarrays containing 195 000 features using a DMD that creates digital masks to synthesize specific polymers based on its proprietary maskless array synthesizer (MAS) technology [24]. At the core of the MAS technology is the DMD, similar to the digital light processor (DLP) created by Texas Instruments. The DMD is an array of 786 000 tiny aluminum mirrors, arranged on a computer chip, where each mirror is individually addressable. These tiny aluminum mirrors shine light in specific patterns and, coupled
with photodeposition chemistry, are used to produce arrays of oligonucleotide probes. In traditional high-density microarray development, physical masks are required to create the patterns of light on the slides, which is a time-consuming and expensive process. The DMD is “maskless” because it creates “virtual masks” that replace the physical chromium masks used in traditional arrays. The DMD patterns light by flipping mirrors on and off according to instructions in a “digital mask” file. It controls the pattern of UV light on the microscope slide in the reaction chamber, which is coupled to a DNA synthesizer. The UV light deprotects the oligo strand, allowing the synthesis of the appropriate DNA molecule, very similar to traditional oligonucleotide synthesis. The advantage of this system is that custom high-density arrays can be created cost-effectively and rapidly, in a process that takes less than 3 hours.
Target-labeling
In the standard microarray experiment, mRNA expression in two different biological samples is compared either on the same or on replicate microarrays. The Affymetrix GeneChip and CodeLink Bioarray systems utilize a single-color detection scheme, where only one sample is hybridized per chip. The classical spotted microarrays and the Agilent system employ a two-color scheme, in which the same array is hybridized with two samples, each labeled with a different fluorescent dye (usually Cy3 and Cy5). The ratio of fluorescent signals represents the relative transcript abundance in the two biological samples on the same microarray. Usually, a reference mRNA is labeled with one fluorescent dye, whereas the mRNA of the sample of interest is labeled with another dye. This strategy, depending on the size of the gene expression experiment, requires an abundant supply of identical reference mRNA.
These references are generated from normal cells or tissues, or from a mixture of transcriptomes of many cell types, to obtain reference signals of most of the microarray cDNAs [25]. However, due to the relative nature of two-color scheme chips, complex normalization algorithms are needed when comparing data from different chips [26, 27]. For spotted microarrays, several labeling methods are available. The dyes Cy3 and Cy5 can be incorporated during the first cDNA synthesis from a
total of polyA+ RNA. Alternatively, in a two-step process, first-strand cDNA is labeled initially with amino-allyl deoxyuridine triphosphate (AA-dUTP) and then chemically coupled with cyanine dyes. Although it generates good-quality image data, a disadvantage of methods employing direct incorporation of fluorescently modified nucleotides into the reverse transcription (RT) reaction is the requirement for large amounts of starting material (up to 50 µg of total RNA or 1 µg of mRNA). The 3DNA dendrimer system (www.genisphere.com) has been introduced recently [28]. This method involves first hybridizing unlabeled first-strand cDNA that contains a 5′ dendrimer-binding sequence to the microarray. The hybridized cDNAs are then detected by incubating the chips with dendrimers, prelabeled with Cy3 or Cy5, that carry a capture sequence complementary to the binding sequence on the cDNAs. Dendrimers are complexes of partially double-stranded oligonucleotides, which form stable, spherical structures with a determined number of free ends. Specificity of the dendrimer detection is accomplished through specific binding of a capture oligonucleotide on a free arm of the dendrimer. By synthesizing an RT primer consisting of an oligo dT sequence coupled to a sequence complementary to the capture sequence on the dendrimer, first-strand cDNA probes are generated, without modified nucleotides, that are capable of binding the dendrimers via the complementary primers. The dendrimer detection (3DNA) reagents provide a high-quality signal using low amounts of starting RNA material, and maintain a low background over increasing amounts of RNA used and over an increasing number of scans. As this system does not depend on the incorporation of fluorescent dNTPs into a reverse transcription reaction, it avoids the inefficient hybridization of cDNA to the microarray that results from incorporation of fluorescent dye-nucleotide conjugates into the reverse transcript.
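The ratio-based readout of the two-color scheme described in this section can be sketched in a few lines. The example assumes, purely for illustration, that the sample of interest is labeled with Cy5 and the reference with Cy3, that a local background value is available for every spot, and that a simple global median centering stands in for the more complex normalization algorithms cited above [26, 27]. All function names and intensities are hypothetical.

```python
import math
from statistics import median


def log2_ratios(cy5, cy3, bg5, bg3, floor=1.0):
    """Background-subtracted log2(Cy5/Cy3) for each spot.

    Signals are floored at `floor` so that spots at or below background
    do not produce a logarithm of zero or of a negative number.
    """
    ratios = []
    for s5, s3, b5, b3 in zip(cy5, cy3, bg5, bg3):
        sample = max(s5 - b5, floor)      # assumed: Cy5 = sample of interest
        reference = max(s3 - b3, floor)   # assumed: Cy3 = reference mRNA
        ratios.append(math.log2(sample / reference))
    return ratios


def median_center(values):
    """Global normalization: shift all log ratios so that their median is zero."""
    m = median(values)
    return [v - m for v in values]


# Invented raw intensities and local backgrounds for four spots.
cy5 = [2100, 1100, 4100, 600]
cy3 = [1100, 1100, 1100, 1100]
bg5 = [100, 100, 100, 100]
bg3 = [100, 100, 100, 100]

raw = log2_ratios(cy5, cy3, bg5, bg3)
normalized = median_center(raw)
print(raw)         # log2 ratios before normalization
print(normalized)  # log2 ratios after median centering
```

Median centering rests on the assumption that most genes are not differentially expressed between sample and reference; when that assumption does not hold, a different normalization strategy is needed.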
For the Agilent platform, several nucleic acid sample amplification/labeling procedures are in use. The direct labeling approach generates Cy3- or Cy5-labeled targets from 10 µg of total RNA or 200 ng of polyA+ RNA by synthesis of fluorescent-labeled cDNA by reverse transcription. This procedure can be completed quickly and involves three steps: synthesizing fluorescent-labeled cDNA using reverse transcriptase, digesting the RNA, and purifying the cDNA target. The Low RNA Input Fluorescent Linear Amplification Kit (www.chem.agilent.com) is designed to amplify and label either total RNA or poly(A)+ RNA to generate micrograms of
cyanine-labeled cRNA (antisense) or cDNA (sense). This approach generates cyanine 3- or cyanine 5-labeled cRNA for oligo microarrays or cyanine 3- or cyanine 5-labeled cDNA for cDNA microarrays from as little as 50 ng total RNA. The procedure consists of converting mRNA primed with an oligo(dT)-T7 primer into dsDNA with MMLV-RT, linear amplification using T7 RNA polymerase, and converting cRNA to fluorescent cDNA (for cDNA microarrays). According to the manufacturer, a detection limit of as low as 1 mRNA copy per 10⁴ cells can be achieved. The CodeLink system uses a single-color scheme in which biotin-labeled cRNA target is prepared by a linear amplification method (www1.amershambiosciences.com). The poly(A)+ RNA subpopulation (within the total RNA population) is primed for reverse transcription by a DNA oligonucleotide containing the T7 RNA polymerase promoter 5′ to a d(T)24 sequence. After second-strand cDNA synthesis, the cDNA serves as the template for an in vitro transcription (IVT) reaction to produce the target cRNA. The IVT is performed in the presence of biotinylated nucleotides to label the target cRNA. This method produces approximately 1000-fold to 5000-fold linear amplification of the input poly(A)+ RNA. Hybridization is performed overnight in a temperature-controlled shaking incubator. After hybridization, the assays are stained with a Cy5-streptavidin conjugate. The Affymetrix One-Cycle GeneChip® IVT Labeling Kit (www.affymetrix.com) is similar to the CodeLink assay, with total RNA (1 µg to 15 µg) or mRNA (0.2 µg to 2 µg) first reverse transcribed using a T7-Oligo(dT) promoter primer in the first-strand cDNA synthesis reaction. Following RNase H-mediated second-strand cDNA synthesis, the double-stranded cDNA is purified and serves as a template in the subsequent in vitro transcription reaction.
The IVT reaction is carried out in the presence of T7 RNA polymerase and a biotinylated nucleotide analog/ribonucleotide mix for cRNA amplification and biotin labeling. The biotinylated cRNA targets are then cleaned up, fragmented, and hybridized to the GeneChip expression arrays. After washing, the arrays are stained with streptavidin-phycoerythrin (SAPE) and counterstained with biotinylated anti-streptavidin and SAPE. For smaller amounts of starting total RNA, in the range of 10 ng to 100 ng, the Two-Cycle Eukaryotic Target Labeling Assay (www.affymetrix.com) was introduced. An additional cycle
Table 1.1. Comparison of selected microarray platforms
Platform | Probe type | Probes/gene | Printing method | Detection method | One- or two-color system
Spotted cDNA arrays | cDNA | Investigator dependent | Standard robotic printing | Cy-3 and Cy-5 | Two-color
Affymetrix | 25-mer | 11–20 | Photolithography | Fluorescent | One-color
Agilent | 60-mer | 1 (most genes) | In situ ink-jet printing and synthesis | Fluorescent | Two-color
ABI | 60-mer | 1 | Printing of prefabricated oligos | Digoxigenin (DIG) | One-color
CodeLink | 30-mer | 1 | Piezoelectric dispensing | Streptavidin-Alexa Fluor 647 | One-color
NimbleGen | 24-mer | Investigator dependent | In situ photo-directed (micromirror) synthesis and arraying | Fluorescent | One- or two-color
of cDNA synthesis and IVT amplification is required to obtain sufficient amounts of labeled cRNA target. After cDNA synthesis in the first cycle, an unlabeled ribonucleotide mix is used in the first cycle of IVT amplification. The unlabeled cRNA is then reverse transcribed in the first-strand cDNA synthesis step of the second cycle using random primers. Subsequently, the T7-Oligo(dT) promoter primer is used in the second-strand cDNA synthesis to generate a double-stranded cDNA template containing T7 promoter sequences. The resulting double-stranded cDNA is then amplified and labeled using a biotinylated nucleotide analog/ribonucleotide mix in the second IVT reaction. The labeled cRNA is then cleaned up, fragmented, and hybridized to GeneChip expression arrays. ABI introduced the Applied Biosystems Expression Array System, which exploits new detection chemistries based on chemiluminescence (www.appliedbiosystems.com). Two labeling schemes are available, depending on the amount of target RNA. Starting with abundant RNA, a very rapid protocol with single-round reverse transcription and labeling produces sufficient labeled cDNA for accurate gene expression analysis. For limited
starting RNA (0.5 µg of total RNA), an Eberwine-based IVT-amplification and labeling protocol [29, 30] increases the yield of cRNA from cDNA by about 1000-fold. This system is very sensitive, down to the 10 fM target detection level, which translates into a single copy of mRNA per 600 000 mRNA species or 0.5 copies per cell. The Digoxigenin (DIG) RT Labeling Kit converts mRNA into labeled cDNA. DIG-labeled cDNA or cRNA is incubated with the microarray in a hybridization chamber for 16 hours at 55 °C. After washing to remove unhybridized DIG-labeled molecules, an alkaline phosphatase-antibody conjugate is added to bind to the DIG-labeled target. The addition of substrate and a chemiluminescence enhancer initiates the chemiluminescent reaction. Chemiluminescence requires no excitation and results in very low background noise, thereby improving signal-to-noise ratios and enabling low-level detection.
Microarray image analysis
To determine which DNAs correlate with changes in gene expression, the microarrays are first scanned to produce visual images and to generate raw numerical data for each spot on the array. The microarray reader is basically a computer-controlled inverted scanning fluorescent confocal microscope with a double or multiple laser illumination system, such as ScanArrayer 4000 and 5000 from General Scanning (Watertown, MA, USA), Avalanche from Molecular Dynamics (Sunnyvale, CA, USA), GMS 418 from Genetic MicroSystems (Woburn, MA, USA), and GeneTAC from Genomic Solutions (Ann Arbor, MI, USA). In addition, some companies, such as Genometrix (The Woodlands, TX, USA) and Applied Precision (Seattle, WA, USA), are developing charge-coupled device (CCD) cameras to capture microarray images. The pin-and-ring array technology is capable of creating spots of extremely consistent size, shape, and volume. Typically, a 488-nm, 100-mW argon ion laser for exciting FITC, a 532-nm, 100-mW Nd:YAG laser for Cy3, and a 633-nm, 35-mW HeNe laser for Cy5 are employed.
The emitted light, after passing back through the objective and primary dichroic, is focused through a confocal pinhole and through a secondary dichroic onto two cooled photo-multiplier tubes (PMTs), which operate in parallel for the two different wavelengths. About 10 pg/mL of each species of cDNA can be detected reliably. After scanning, a combined color image is obtained
and is processed further using microarray image analysis programs. The objective of microarray image analysis is to extract probe intensities or ratios at each cDNA target location, and then to cross-link them with the printed clone information so that biologists can interpret the outcomes easily and perform further high-level analysis. However, microarray images do not come from a single print mode (different arrayers use different printing-tip arrangements) [14] or a single hybridization method (fluorescent [Stanford, NIH, etc.], radioactive probe, and others) [31], so the analysis methods differ considerably. Typically, microarray image analysis consists of cDNA target segmentation, target detection, local background intensity and probe fluorescent intensity measurement, ratio analysis, and data visualization.
Accuracy and reliability of microarray platforms
While microarrays have transformed the experimental approach to many research projects, a lingering question remains: how reliable are these measurements? A high degree of reliability is an absolute requirement for any plans to move microarrays into clinical use for diagnostic or prognostic purposes. However, determining the accuracy and reliability of microarray data is a difficult undertaking, as there are various experimental procedures, labeling protocols, microarray platforms, and analysis techniques in use. Problems of reproducibility of microarray results have been recognized to be due to biological and technical variations [26, 32, 33], and further technical confounding artifacts are still being uncovered [34–40]. The accuracy and sensitivity of microarrays are low compared to other techniques such as RT-PCR. One root of the problem lies in the accuracy of the sequences used to generate microarrays. For example, Mecham et al.
found that, for mammalian Affymetrix microarrays, an unexpectedly large number of probes (greater than 19%) did not correspond to their appropriate mRNA reference sequence (RefSeq)[41]. However, after exclusion of data derived from inaccurate probes, the data derived from sequence-verified probes demonstrated increased precision in technical replicates, increased accuracy when translating data from one generation microarray to another, and increased accuracy comparing data of oligonucleotide and cDNA microarrays [41]. Therefore, the identification and removal of inaccurate
probes can improve the performance of microarrays significantly. This problem could be addressed by requiring the submission of probe information along with expression data when reporting microarray results; the Microarray Gene Expression Data (MGED) Society (http://www.mged.org) is therefore recommending the inclusion of probe sequence information in the minimum information about a microarray experiment (MIAME) criteria [1]. An even more daunting question has been the comparability of different microarray platforms. Interplatform comparisons have been published with different levels of scrutiny regarding the experimental procedures and data analysis methods employed. Whereas initial platform comparisons appeared promising [42, 43], further studies revealed disappointingly poor correlations when comparing spotted cDNA arrays and the Affymetrix platform [44]. The authors concluded that the prognosis for the integration of gene expression measurements across platforms was poor and suggested that probe-specific factors were influencing the measurements on both platforms differently. However, in these comparisons the variables were not only different microarray platforms, but also different laboratories conducting the respective experiments. Yet even when a single laboratory compared three different platforms (Agilent, Affymetrix, and Amersham) using the same RNA preparation, a disturbingly poor data concordance was reported [45]. These authors suggested the need for establishing industrial manufacturing standards and for further independent and thorough validation of the technology. Attempting to verify that each platform indeed measured the genes it claimed to target, the authors obtained probe sequence data from Agilent, Affymetrix, and Amersham.
Several probe problems were discovered: (i) many gene transcripts exist in splice variants which may or may not be detected by different probes, (ii) probes can cross-hybridize with near matches, (iii) many probes did not correlate to annotated sequences in the public database RefSeq [46]. Next, Yauk et al. published a comprehensive comparison of six microarray technologies encompassing different reporter systems (short oligonucleotides, long oligonucleotides, and cDNAs), labeling techniques and hybridization protocols, using four oligonucleotide and two cDNA platforms to compare gene expression between two sample types [47]. They determined the overall consistency (reproducibility) within each platform, and correlation
among replicates within and between technologies. The investigators found that the top performing platforms showed low levels of technical variability, which translated into an increased ability to detect differential expression. They concluded that the top four platforms were very consistent and that biological, rather than technological, differences accounted for the majority of data variation [47]. In a recent review, van Bakel and Holstege addressed the assessment of microarray performance and proposed the use of external control RNAs as a versatile and robust method for achieving this goal [48]. The industry-led External RNA Control Consortium (ERCC; http://www.affymetrix.com/community/standards/ercc.affx) is attempting to establish a universal set of external controls for that purpose. To shed further light onto this issue, three more papers have been published recently [49–51]. Each study demonstrated that, with carefully designed and controlled experimentation and implementation of standardized protocols and data analyses, much improved reproducibility across platforms could be achieved [52]. The authors suggested that good standard operating procedures, rather than technology issues, determined reproducibility and cross-platform comparability. The issues raised about microarray data reliability should not discourage the use of this new technology. Rather, addressing these issues should help improve the quality of microarray-derived data to the level needed for future clinical applications.
Outlook
With the increasing number of microarray vendors offering a variety of whole-genome arrays in different species, the answer to the question of which array system to choose has become quite complex and depends on the specific experiments planned, the amount of available starting material, costs, and local expertise and availability of the various technologies [53].
The available microarray technologies have to agree on some common denominators in order to improve the current platform comparability and overall quality issues. The source of genetic information and its annotation need to be corroborated across different technologies in order to compare performance accurately across platforms. If the performance based on probe selection can be empirically determined, then the microarray data can
potentially be compared at the level of the raw signals. The establishment of gene expression standards will be of paramount importance for any cross-platform comparisons. All array manufacturers will need to include the same probe sequences in order to assess performance accurately across different technologies. Array manufacturers need to work together to provide an information resource describing probe set methodology (and sequences) for commercially available products that allows users to identify groups of genes that can be compared directly across platforms. Identifying optimal target preparation methodologies (i.e., cDNA vs. cRNA hybridization, small sample amplification) and normalizing sample starting material and hybridization cocktail sensitivity will allow for more efficient comparative analysis. Recent key trends in the microarray field have been the shift from cDNA to oligonucleotide-based microarrays and from “self-made” to commercial platforms, due to the better affordability and the improved sensitivity, specificity, and reproducibility of the latter. While Affymetrix with its GeneChip system, employing photolithographic techniques derived from semiconductor technology for the fabrication of microarrays, has been the market leader for several years and still is, competition has entered the arena. Furthermore, microarray manufacturers are aiming for lucrative clinical applications and, despite the current specificity and quality issues of microarray platforms, in December 2004 the FDA cleared Affymetrix’s GeneChip 3000Dx instrument platform and Roche Molecular Diagnostics’ CYP450 AmpliChip for in vitro diagnostic use. Novel applications of microarrays include whole-genome comparative genomic hybridization (CGH) arrays (e.g., NimbleGen, Madison, WI, USA). A challenging new direction is the capturing of mRNA splice variants on microarrays.
Some of the discrepancies seen when comparing different microarray platforms or validating microarray results with RT-PCR may relate to inadvertently measuring different splice variants on different microarray platforms when attempting to measure the same genes. Approximately 74% or more of all human genes express more than one splice isoform through alternative splicing, which is responsible for much of the protein diversity in humans [54]. Splice isoforms are often disease or tissue specific [55]. Some versions of whole-genome arrays incorporate some splice variants, whereas other vendors are offering cancer-specific human splice variant microarrays such as the Transexpress™ Cancer
array from ArrayIt (Sunnyvale, CA, USA). Another exciting new development is the use of microarrays for high-throughput microRNA profiling [56–58]. One of the latest improvements in this novel field has been the introduction of an RNA-primed, array-based Klenow enzyme (RAKE) assay [59]. This assay measures mature 22-nucleotide miRNA sequences, eliminates the systematic bias of reverse transcription, PCR amplification, ligation reactions, or enzymatic labeling, and is sensitive enough to discriminate between miRNAs that differ by a few nucleotides. In summary, the conduct of microarray studies warrants maximal disclosure of the experimental details. This includes not only adherence to minimum information about a microarray experiment (MIAME)-compliant annotations [1], but also making available the raw data set and, for quality control, the scanned image data. As noted above, the availability of the probe nucleotide sequences from microarray manufacturers will do more to enable researchers to remove uninformative data from the whole data set, improving the performance of “noise-reduced” microarray experiments.
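The idea of removing data derived from unverified probes can be sketched as a simple sequence check. The example below is not the procedure used in the cited studies: it assumes that probe sequences and annotated transcript sequences are available as plain strings, and it keeps a probe only if its sequence, or the reverse complement, occurs exactly in the transcript it is annotated to. All identifiers and sequences are invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def verified_probes(probes, transcripts):
    """Return the IDs of probes that match their annotated transcript exactly.

    probes:      {probe_id: (gene_id, probe_sequence)}
    transcripts: {gene_id: reference_transcript_sequence}
    """
    keep = set()
    for probe_id, (gene_id, seq) in probes.items():
        target = transcripts.get(gene_id, "").upper()
        seq = seq.upper()
        if target and (seq in target or reverse_complement(seq) in target):
            keep.add(probe_id)
    return keep


# Invented toy data: one reference transcript and four probes.
transcripts = {"GENE_A": "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"}
probes = {
    "p1": ("GENE_A", "GGCCATTGTAATGG"),  # exact match
    "p2": ("GENE_A", "CCCTTTCAGC"),      # matches as reverse complement
    "p3": ("GENE_A", "GGCCATTCTAATGG"),  # one mismatch, rejected
    "p4": ("GENE_B", "ATGGCCATTG"),      # no reference sequence, rejected
}
expression = {"p1": 8.2, "p2": 7.9, "p3": 3.1, "p4": 5.5}

keep = verified_probes(probes, transcripts)
filtered = {pid: value for pid, value in expression.items() if pid in keep}
print(sorted(filtered))
```

Exact matching is the strictest possible criterion; a real analysis would more likely rely on an alignment tool and a tolerance for known polymorphisms, which is beyond this sketch.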
REFERENCES
1. Brazma, A., Hingamp, P., Quackenbush, J. et al. Minimum information about a microarray experiment (MIAME) – toward standards for microarray data. Nat. Genet. 2001; 29(4): 365–71.
2. Marton, M. J., DeRisi, J. L., Bennett, H. A. et al. Drug target validation and identification of secondary drug target effects using DNA microarrays. Nat. Med. 1998; 4(11): 1293–301.
3. Tavazoie, S., Hughes, J. D., Campbell, M. J., Cho, R. J., and Church, G. M. Systematic determination of genetic network architecture. Nat. Genet. 1999; 22(3): 281–5.
4. Iyer, V. R., Eisen, M. B., Ross, D. T. et al. The transcriptional program in the response of human fibroblasts to serum. Science 1999; 283(5398): 83–7.
5. DeRisi, J. L., Iyer, V. R., and Brown, P. O. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 1997; 278(5338): 680–6.
6. Hughes, T. R., Marton, M. J., Jones, A. R. et al. Functional discovery via a compendium of expression profiles. Cell 2000; 102(1): 109–26.
7. Pomeroy, S. L., Tamayo, P., Gaasenbeek, M. et al. Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature 2002; 415(6870): 436–42.
8. van ’t Veer, L. J., Dai, H., van de Vijver, M. J. et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002; 415(6871): 530–6.
9. Schena, M., Shalon, D., Davis, R. W., and Brown, P. O. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995; 270(5235): 467–70.
10. Stafford, P. and Liu, P. Microarray technology comparison, statistical analysis, and experimental design. In Microarray Methods and Applications – Nuts and Bolts. DNA Press, 2003: 3273–324.
11. Lockhart, D. J., Dong, H., Byrne, M. C. et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat. Biotechnol. 1996; 14(13): 1675–80.
12. Shalon, D., Smith, S. J., and Brown, P. O. A DNA microarray system for analyzing complex DNA samples using two-color fluorescent probe hybridization. Genome Res. 1996; 6(7): 639–45.
13. Cheung, V. G., Morley, M., Aguilar, F. et al. Making and reading microarrays. Nat. Genet. 1999; 21(1 Suppl): 15–19.
14. Bowtell, D. D. Options available – from start to finish – for obtaining expression data by microarray. Nat. Genet. 1999; 21(1 Suppl): 25–32.
15. Hardiman, G. Microarray technologies 2003 – an overview. Pharmacogenomics 2003; 4(3): 251–6.
16. Knight, J. When the chips are down. Nature 2001; 410(6831): 860–1.
17. Lipshutz, R. J., Fodor, S. P., Gingeras, T. R., and Lockhart, D. J. High density synthetic oligonucleotide arrays. Nat. Genet. 1999; 21(1 Suppl): 20–4.
18. Hughes, T. R., Mao, M., Jones, A. R. et al. Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat. Biotechnol. 2001; 19(4): 342–7.
19. Xiang, C. C. and Chen, Y. cDNA microarray technology and its applications. Biotechnol. Adv. 2000; 18(1): 35–46.
20. Chee, M., Yang, R., Hubbell, E. et al. Accessing genetic information with high-density DNA arrays. Science 1996; 274(5287): 610–14.
21. Southern, E., Mir, K., and Shchepinov, M. Molecular interactions on microarrays. Nat. Genet. 1999; 21(1 Suppl): 5–9.
22. Shchepinov, M. S., Case-Green, S. C., and Southern, E. M. Steric factors influencing hybridisation of nucleic acids to oligonucleotide arrays. Nucl. Acids Res. 1997; 25(6): 1155–61.
23. Ramakrishnan, R., Dorris, D., Lublinsky, A. et al. An assessment of Motorola CodeLink microarray performance for gene expression profiling applications. Nucl. Acids Res. 2002; 30(7): e30.
24. Nuwaysir, E. F., Huang, W., Albert, T. J. et al. Gene expression analysis using oligonucleotide arrays produced by maskless photolithography. Genome Res. 2002; 12(11): 1749–55.
25. Puskas, L. G., Zvara, A., Hackler, L., Jr., Micsik, T., and van Hummelen, P. Production of bulk amounts of universal RNA for DNA microarrays. Biotechniques 2002; 33(4): 898–900, 902, 904.
26. Yang, Y. H., Dudoit, S., Luu, P. et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucl. Acids Res. 2002; 30(4): e15.
27. Dudley, A. M., Aach, J., Steffen, M. A., and Church, G. M. Measuring absolute expression with microarrays with a calibrated reference sample and an extended signal intensity range. Proc. Natl Acad. Sci. USA 2002; 99(11): 7554–9.
28. Stears, R. L., Getts, R. C., and Gullans, S. R. A novel, sensitive detection system for high-density microarrays using dendrimer technology. Physiol. Genomics 2000; 3(2): 93–9.
29. Van Gelder, R. N., von Zastrow, M. E., Yool, A. et al. Amplified RNA synthesized from limited quantities of heterogeneous cDNA. Proc. Natl Acad. Sci. USA 1990; 87(5): 1663–7.
30. Eberwine, J., Belt, B., Kacharmina, J. E., and Miyashiro, K. Analysis of subcellularly localized mRNAs using in situ hybridization, mRNA amplification, and expression profiling. Neurochem. Res. 2002; 27(10): 1065–77.
31. Chen, Y., Kamat, V., Dougherty, E. R. et al. Ratio statistics of gene expression levels and applications to microarray data analysis. Bioinformatics 2002; 18(9): 1207–15.
32. Churchill, G. A. Fundamentals of experimental design for cDNA microarrays. Nat. Genet. 2002; 32 Suppl: 490–5.
33. Quackenbush, J. Microarray data normalization and transformation. Nat. Genet. 2002; 32 Suppl: 496–501.
34. Diehl, F., Grahlmann, S., Beier, M., and Hoheisel, J. D. Manufacturing DNA microarrays of high spot homogeneity and reduced background signal. Nucl. Acids Res. 2001; 29(7): E38.
35. Ramdas, L., Coombes, K. R., Baggerly, K. et al. Sources of nonlinearity in cDNA microarray expression measurements. Genome Biol. 2001; 2(11): RESEARCH0047.
36. Chuaqui, R. F., Bonner, R. F., Best, C. J. et al. Post-analysis follow-up and validation of microarray experiments. Nat. Genet. 2002; 32 Suppl: 509–14.
37. Fare, T. L., Coffey, E. M., Dai, H. et al. Effects of atmospheric ozone on microarray data quality. Anal. Chem. 2003; 75(17): 4672–5.
38. Martinez, M. J., Aragon, A. D., Rodriguez, A. L. et al. Identification and removal of contaminating fluorescence from commercial and in-house printed DNA microarrays. Nucl. Acids Res. 2003; 31(4): e18.
39. ’t Hoen, P. A., de Kort, F., van Ommen, G. F., and den Dunnen, J. T. Fluorescent labelling of cRNA for microarray applications. Nucl. Acids Res. 2003; 31(5): e20.
40. Lyng, H., Badiee, A., Svendsrud, D. H. et al. Profound influence of microarray scanner characteristics on gene expression ratios: analysis and procedure for correction. BMC Genomics 2004; 5(1): 10.
41. Mecham, B. H., Wetmore, D. Z., Szallasi, Z. et al. Increased measurement accuracy for sequence-verified microarray probes. Physiol. Genomics 2004; 18(3): 308–15.
42. Cho, R. J., Campbell, M. J., Winzeler, E. A. et al. A genome-wide transcriptional analysis of the mitotic cell cycle. Mol. Cell 1998; 2(1): 65–73.
43. Spellman, P. T., Sherlock, G., Zhang, M. Q. et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 1998; 9(12): 3273–97.
44. Ross, D. T., Scherf, U., Eisen, M. B. et al. Systematic variation in gene expression patterns in human cancer cell lines. Nat. Genet. 2000; 24(3): 227–35.
45. Tan, P. K., Downey, T. J., Spitznagel, E. L., Jr. et al. Evaluation of gene expression measurements from commercial microarray platforms. Nucl. Acids Res. 2003; 31(19): 5676–84.
46. Marshall, E. Getting the noise out of gene arrays. Science 2004; 306(5696): 630–1.
47. Yauk, C. L., Berndt, M. L., Williams, A., and Douglas, G. R. Comprehensive comparison of six microarray technologies. Nucl. Acids Res. 2004; 32(15): e124.
48. van Bakel, H. and Holstege, F. C. In control: systematic assessment of microarray performance. EMBO Rep. 2004; 5(10): 964–9.
49. Larkin, J. E., Frank, B. C., Gavras, H., Sultana, R., and Quackenbush, J. Independence and reproducibility across microarray platforms. Nat. Methods 2005; 2(5): 337–44.
50. Irizarry, R. A., Warren, D., Spencer, F. et al. Multiple-laboratory comparison of microarray platforms. Nat. Methods 2005; 2(5): 345–50.
51. Bammler, T., Beyer, R. P., Bhattacharya, S. et al. Standardizing global gene expression analysis between laboratories and across platforms. Nat. Methods 2005; 2(5): 351–6.
52. Sherlock, G. Of fish and chips. Nat. Methods 2005; 2(5): 329–30.
53. Hardiman, G. Microarray platforms – comparisons and contrasts. Pharmacogenomics 2004; 5(5): 487–502.
54. Maniatis, T. and Tasic, B. Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature 2002; 418(6894): 236–43.
55. Garcia-Blanco, M. A., Baraniak, A. P., and Lasda, E. L. Alternative splicing in disease and therapy. Nat. Biotechnol. 2004; 22(5): 535–46.
56. Liu, C. G., Calin, G. A., Meloon, B. et al. An oligonucleotide microchip for genome-wide microRNA profiling in human and mouse tissues. Proc. Natl Acad. Sci. USA 2004; 101(26): 9740–4.
57. Miska, E. A., Alvarez-Saavedra, E., Townsend, M. et al. Microarray analysis of microRNA expression in the developing mammalian brain. Genome Biol. 2004; 5(9): R68.
58. Thomson, J. M., Parker, J., Perou, C. M., and Hammond, S. M. A custom microarray platform for analysis of microRNA gene expression. Nat. Methods 2004; 1(1): 47–53.
59. Nelson, P. T., Baldwin, D. A., Scearce, L. M. et al. Microarray-based, high-throughput gene expression profiling of microRNAs. Nat. Methods 2004; 1(2): 155–61.
2
Quantitative quality control of microarray experiments: toward accurate gene expression measurements
Xujing Wang and Martin J. Hessner
The Medical College of Wisconsin and Children’s Research Institute of the Children’s Hospital of Wisconsin, Milwaukee, WI, USA
The Human and Molecular Genetics Center, The Medical College of Wisconsin, Milwaukee, WI, USA
Gene Expression Profiling by Microarrays: Clinical Implications, ed. Wolf-Karsten Hofmann. Published by Cambridge University Press 2006. © Cambridge University Press 2006.
Introduction
Since its introduction in the 1990s, microarray technology has fundamentally transformed laboratory research and has become a widely used genetic tool [1]. The technology has great potential in the study of the networks that regulate gene expression and in the study of complex human diseases, where a comprehensive evaluation is needed. As it stands, however, acquiring high-quality microarray data is still a challenge for many laboratories. Noise and data variability are often high, while correlations with other methods, including RT-PCR and Northern blots, and between different microarray platforms are often unsatisfactory [2]. As a result, many consider gene expression analysis using microarrays not to be quantitative. This has limited the technology’s application largely to complex biological systems [3].
The major reason for the noise in microarrays is that there are many experimental steps and hence many sources of data variability. To reduce the noise efficiently, it is essential to have an information acquisition and analysis procedure that can properly dissect these sources and manage each of them accordingly. We have previously reported Matarray, a software package for processing microarray hybridization images, which uses an iterative procedure that exploits both spatial and intensity information for signal identification [4]. Most distinctively, it defines a set of quality scores that measure and quality control (QC) the major sources of data variability, including a high or non-uniform noise profile, low or saturated signal intensity, and irregular spot size and shape. Based on these individual measures, a composite score qcom is defined for each spot, which gives an overall assessment of the data quality [4].
Nevertheless, some sources of variability cannot be evaluated directly or quantitatively from the posthybridization images. One important example is the quality of array fabrication. Generating microarray slides involves coating the glass slides, printing up to tens of thousands of amplified cDNA “probes”, and fixing/blocking the slide. During this process, variable amounts of material can be deposited and/or retained on the array surface, depending on a number of factors. Furthermore, even on relatively high-quality arrays there are suboptimal spots due to PCR failures or pin misses during printing. We, and others, have shown that when the amount of immobilized probe is inadequate, the measurements made on such arrays are unreliable [5–8]. To enable array fabrication QC, we have recently developed a novel three-color cDNA microarray platform [5, 6, 9], which we termed third dye array visualization (TDAV) technology [10]. The approach labels the cDNA probes printed on the array slides with a non-invasive third dye (TD), fluorescein [5], and makes possible the prehybridization measurement of element/array morphology, surface DNA deposition/retention, and background levels [5, 6, 9].
Based on this work, we now describe several recent advances in our microarray data analysis and QC, including: (1) more accurate hybridization data acquisition assisted by the information from the TD images; (2) efficient and quantitative data filtering and normalization based on the ratio-quality score plot; and (3) statistical evaluation that uses the quality score as a weight factor, which avoids the missing value problem. With these technical and analytical developments, we show that accurate gene expression measurement by microarrays is achievable.
Accurate information acquisition from microarrays
Data acquisition from microarray images mainly comprises the following steps [4, 10]: (1) locating the pixel groups that make up the individual signal spots (grid alignment); (2) correctly separating signal pixels from local background (edge detection or segmentation); (3) accurately quantifying the intensity information; and (4) assigning an appropriate confidence measure to the results. Efficient, reliable microarray image processing is a prerequisite, since prediction of differential gene expression, as well as data clustering and data mining, all depend on the accuracy of the measurements derived at this step. For example, inaccurate background subtraction or saturation in pixel intensities can result in an intensity-dependent shift of the ratio distribution [11], and some background adjustment methods can substantially increase the variability in the data, especially for low-intensity spots [11]. Though many commercial and free packages are available, some fundamental issues remain unresolved. An important example is the question of which of the two dye channels should be used to define the signal region when they lead, as they often do, to different signal–background segmentations.
Many of the technical difficulties in image analysis are due to the noise and irregularities (in spot intensity, size, shape, etc.) that are commonplace on microarray images [4]. If images of the arrayed probes prior to hybridization were available to assist the definition of signal pixels, they would be preferable to the posthybridization images for three reasons: (1) they provide a criterion for judging which coordinates on the cyanine images should contain signal information, namely those that possess spotted probe material. Pixels that do not have a strong enough probe signal can be assigned to background; (2) the signal intensities should be much more uniform, since they depend on the amount of printed probe rather than on the abundance of the individual transcripts, which can vary tremendously. This advantage should be greatest for transcripts that are expressed at low levels; (3) the image should be much cleaner than the cyanine dye images, since the hybridization process is a major source of the noise and artifacts on images. These characteristics would make signal–background segmentation a much easier task.
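The idea in reason (1) above, letting the probe image decide which pixels count as signal, can be sketched in a few lines. This is a minimal illustration and not Matarray's segmentation algorithm: the fixed cut-off `td_threshold`, the function name `spot_signal`, and the choice of mean signal minus median background are all assumptions made for the example.

```python
from statistics import mean, median

def spot_signal(td_pixels, cy_pixels, td_threshold):
    """Background-corrected cyanine signal for one spot window.

    td_pixels and cy_pixels are equally sized 2-D lists covering the same
    window on the third-dye image and on one cyanine image.
    td_threshold is a hypothetical cut-off; the chapter does not give one.
    """
    signal, background = [], []
    for td_row, cy_row in zip(td_pixels, cy_pixels):
        for td_value, cy_value in zip(td_row, cy_row):
            # Only the TD image decides signal vs. background membership.
            if td_value >= td_threshold:
                signal.append(cy_value)
            else:
                background.append(cy_value)
    if not signal:
        return None  # no printed probe detected in this window
    local_background = median(background) if background else 0.0
    return mean(signal) - local_background

# Toy 3 x 3 window: the probe occupies the lower-right 2 x 2 block.
td = [[0, 0, 0], [0, 9, 9], [0, 9, 9]]
cy5 = [[10, 12, 11], [9, 100, 120], [10, 110, 130]]
print(spot_signal(td, cy5, td_threshold=5))
```

Because the mask comes from the TD image alone, the same pixel set is used for both Cy5 and Cy3, which sidesteps the question of which dye channel should define the signal region.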
We have compared the prehybridization TD image and the posthybridization cyanine images, using data from a set of 16 hybridizations that compared the thymus of day 65 and day 40 BB-DP (BioBreeding Diabetic Prone) rats. Each array possesses 18 432 elements. Figure 2.1 gives an example. The TD image is more uniform in intensity and spot size distribution, and has less noise and fewer artifacts. We have further quantified the difference in quality by examining the noise-to-signal ratios and the coefficients of variation (CV) of the spot log intensities and spot sizes; the results are reported in Table 2.1. The TD images are the most uniform and the most regular, and have the least noise. While the non-uniformity and noise measures for the cyanine images are all in double digits, the same measures for the TD image are all in single digits, roughly two- to fivefold lower.

Table 2.1. Comparison of the quality measures between the TD and cyanine images

Measure                        TD           Cy5          Cy3
CV of log intensities (%)      8.8 ± 1.4    15.6 ± 1.1   13.7 ± 1.3
Noise-to-signal ratio (%)      3.2 ± 2.4    15.3 ± 3.0   15.9 ± 3.1
CV in spot size (%)            4.6 ± 3.3    11.5 ± 1.6   10.7 ± 1.1

Fig. 2.1. A fluorescein TD image prior to hybridization (left), and the corresponding composite cyanine dye image after hybridization (right). The latter shows greater heterogeneity in intensity and spot size, and a higher noise level.

We have developed new algorithms to incorporate TDAV into Matarray for reliable image analysis. Matarray now accepts three input images (TD, Cy5, and Cy3) from each array. It performs grid alignment and signal–background segmentation on the TD image as described previously [4]. After all signal areas are defined, the hybridization data are acquired from the corresponding pixels on the cyanine images. This approach improves the accuracy and reproducibility of the acquired data, as judged both by the measured versus actual ratios of control clones spiked in at known ratios and by replicate consistency. The improvement is greater for weak cyanine spots, which correspond to genes expressed at low levels. The speed of image analysis has also improved. Using Matarray’s iterative, localized grid alignment algorithm [4], we found that on high-density cyanine images two to three iterations are often needed to reach a desirable level of signal area definition, and a single iteration can take up to 10 minutes for an array of 20 000 spots. Using the TD image, a single iteration is generally sufficient. This benefit becomes more significant when a large number of slides needs to be processed.
In summary, data acquisition that uses the prehybridization TD image allows unambiguous separation of signal pixels from background, and leads to improved accuracy and efficiency.
Quantitative data filtering and normalization: advantages of the quality score approach
After information acquisition, filtering and normalization are necessary to reduce noise and bias in microarray data. Many algorithms have been proposed to reduce noise, including filtering by intensity, size, shape, or heuristic methods [12, 13]. The efficiency of such approaches is often in question [14]. Statistical algorithms have been developed based on replicate consistency and/or the overall frequency distributions of the expression data [12, 15], but they are ineffective in detecting artifacts that affect only a small number of spots on the array, or those that affect all replicates equally.
The most frequent systematic bias in dual hybridization microarrays is the labeling difference between the two fluorescent cyanine dyes. Even in a homotypic experiment (also known as self–self hybridization, in which the same sample is used in both dye channels), the distribution of the intensity ratio measurements may not center around 1.
Many factors can cause this bias, including the physical properties of the dyes, labeling efficiency, probe coupling, scanner settings, and inappropriate data processing. Such bias hampers direct interpretation of the data, since the usual “fold change” reading of the ratio no longer holds. Averaging over replicates may even increase variation, since the bias can have different characteristics in each replicate. Furthermore, the bias can depend on factors such as the intensity and spatial location of the spot, which complicates its correction. Existing normalization procedures are considerably diverse [16–18]; a popular example is a localized approach that derives the normalization factor from the ratio-intensity plot using the LOWESS (locally weighted scatter plot smoothing) technique, often termed MA-LOWESS normalization [16]. Despite these efforts, the field still lacks a standardized scheme, and the performance, limitations, and appropriateness of these procedures for particular microarray designs remain open questions [18].
For efficient filtering and normalization, the major factors contributing to noise and bias need to be identified and properly dissected. We approach this problem by defining a set of quality scores from the information acquired from the three images of each array, each score measuring the effect of one major source of data variability.
Quality measures from cyanine images
The approach of defining quality scores for quantitative data QC was first introduced in Matarray for cyanine image analysis [4]. For each spot, non-redundant factors that affect data quality were identified, including spot size, signal-to-noise ratio, background level and uniformity, and saturation status, and an individual quality score is defined for each:
\[
\begin{aligned}
q_{\mathrm{size}} &= \exp\!\left(-\frac{|A - A_0|}{A_0}\right), &
q_{\mathrm{signoise}} &= \frac{\mathrm{sig}}{\mathrm{sig} + \mathrm{bkg}_l}, \\
q_{\mathrm{bkg1}} &= \frac{f_1}{\mathrm{CV}_{\mathrm{bkg}}}, &
q_{\mathrm{bkg2}} &= f_2\left(1 - \frac{\mathrm{bkg}_l}{\mathrm{bkg}_l + \mathrm{bkg}_0}\right), \\
q_{\mathrm{sat}} &= \begin{cases} 1, & \text{if \% of saturated pixels} < 10\% \\ 0, & \text{if \% of saturated pixels} \ge 10\% \end{cases}
\end{aligned}
\tag{2.1}
\]
where f_i are normalization factors, sig stands for signal, and bkg_l and bkg_0 are the local and global background levels, respectively. Several other factors, including the intensity level and the variation of the signal pixels (corresponding score q_sig = f_3/CV_sig), were also considered, but were found not to be independent of the above five scores and were therefore excluded from further consideration. Based on these, a composite score qcom is defined to give an overall assessment of the quality [4]:
\[
q_{\mathrm{com}} = \left(q_{\mathrm{size}} \times q_{\mathrm{signoise}} \times q_{\mathrm{bkg1}} \times q_{\mathrm{bkg2}}\right)^{1/4} \times q_{\mathrm{sat}}
\tag{2.2}
\]
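A short sketch may help make Eqs. (2.1) and (2.2) concrete. It is an illustration under stated assumptions, not Matarray's code: A and A0 are taken to be the measured and reference spot areas, the normalization factors f1 and f2 are passed in as plain numbers because the chapter does not give their values, and all function names are invented for the example.

```python
import math

def q_size(area, reference_area):
    # Penalizes spots whose area deviates from the reference area (assumed meaning of A, A0).
    return math.exp(-abs(area - reference_area) / reference_area)

def q_signoise(sig, bkg_local):
    return sig / (sig + bkg_local)

def q_bkg1(cv_bkg, f1=1.0):
    # Background uniformity; f1 is a placeholder normalization factor.
    return f1 / cv_bkg

def q_bkg2(bkg_local, bkg_global, f2=1.0):
    # Local background relative to global background; f2 is a placeholder.
    return f2 * (1.0 - bkg_local / (bkg_local + bkg_global))

def q_sat(fraction_saturated):
    return 1.0 if fraction_saturated < 0.10 else 0.0

def q_com(area, reference_area, sig, bkg_local, bkg_global, cv_bkg,
          fraction_saturated, f1=1.0, f2=1.0):
    """Composite score: geometric mean of four scores, gated by saturation (Eq. 2.2)."""
    product = (q_size(area, reference_area)
               * q_signoise(sig, bkg_local)
               * q_bkg1(cv_bkg, f1)
               * q_bkg2(bkg_local, bkg_global, f2))
    return product ** 0.25 * q_sat(fraction_saturated)

# Hypothetical spot: nominal size, strong signal, 2% saturated pixels.
good = q_com(100, 100, sig=900, bkg_local=100, bkg_global=100,
             cv_bkg=0.25, fraction_saturated=0.02, f1=0.2, f2=2.0)
# Same spot with 10% saturated pixels: the composite score collapses to zero.
saturated = q_com(100, 100, sig=900, bkg_local=100, bkg_global=100,
                  cv_bkg=0.25, fraction_saturated=0.10, f1=0.2, f2=2.0)
print(good, saturated)
```

Because qsat multiplies the geometric mean rather than entering it, a saturated spot gets a score of exactly zero regardless of how well it does on the other four measures.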
We have demonstrated through numerous experiments that qcom captures the inherent variability in microarray measurements very well. Spots with high qcom generate less data variability, and removing spots with low qcom can dramatically improve the reliability of the data [4, 19]. Figure 2.2(a) demonstrates this point using data from a microarray experiment that profiled and compared the thymus of day 40 BB-DP (BioBreeding Diabetic-Prone) and BB-DR (Diabetic-Resistant) rats. Four animals per strain were used and, for each animal pairing, four replicate hybridizations were performed, for a total of 16 hybridizations. For each pair of direct replicate hybridizations, we selected the genes that showed differential expression (DE) at P = 0.05 on at least one array and divided them into 25 bins. For the genes in each bin, we determined the mean qcom and the Pearson correlation of the log ratio measurements between the two replicates. The mean and standard deviation were then determined over all replicate pairs, and the results are plotted against qcom. Measurements from high-qcom spots are much more reproducible, whereas those from low-qcom spots are much more variable.
Quality measures from the TD image
Information from the TD image allows an evaluation of array fabrication, which is another major source of variability and cannot be assessed quantitatively from the posthybridization images. We and others have shown that it directly affects the accuracy of the expression measurements [5–9]. Specifically, when the amount of immobilized probe is inadequate, the measurements become insensitive to differential expression between the two samples: the dynamic range of detection is compressed, and the data variability increases in a probe quantity-dependent fashion [5–7]. In addition, the scale of compression can depend on the fold change of the transcript, leading to gene-specific artifacts in the data [20]. Other factors that can influence data quality include those studied previously on cyanine images. For example, noise on the prehybridization slide leads directly to noise in the posthybridization expression measurements [4]. Based on these observations, we recently formulated a quality measure for every spot from the TD image by defining
\[
q_{\mathrm{TD}} = q_{\mathrm{int}} \times q_{\mathrm{com}}(\mathrm{TD})
\tag{2.3}
\]
Fig. 2.2. [Figure not reproduced. Panel (a): replicate correlation (y-axis, 0.0 to 1.0) plotted against quality score (x-axis, 0.0 to 1.0), with separate series for qcom and qTD. Panel (b): ratio on a logarithmic scale (y-axis, 1E-3 to 1000) plotted against qTD × qcom (x-axis, 0.0 to 1.0), before and after normalization.] The benefit of data QC utilizing the log R-quality score plot. (a) Data variability depends on the quality scores qcom and qTD. Data with better scores exhibit higher replicate consistency. (b) Quality-dependent localized normalization.
where qcom(TD) is the composite score of the TD spot, defined from size, signal-to-noise ratio, and background level and uniformity as given in Eq. (2.2) [4] and normalized to the range [0.5, 1], and
\[
q_{\mathrm{int}} = \begin{cases} 1, & \text{intensity} \ge \text{threshold} \\ \text{intensity}/\text{threshold}, & \text{intensity} < \text{threshold} \end{cases}
\tag{2.4}
\]
In Matarray the default threshold is 5000 RFU/pixel [5, 6]. Setting up this quantitative measure from the TD signal requires a standard that ensures consistent collection of prehybridization TD images. For this purpose we have implemented a confocal laser scanner calibration method using FluorIS (CLONDIAG, Jena, Germany), a non-bleaching, reusable calibration/standardization tool [5].
We have found that qTD affects data variability in the same fashion as qcom, with high-qTD spots yielding less variation. Fig. 2.2(a) also shows the dependence of replicate consistency on qTD for the same set of data. Notice that the majority of the data are concentrated at the high-quality end, specifically in the high-qTD region. This is because all of the arrays used in our microarray experiments had passed QC with our TDAV technology as previously described [5, 6]. We have examined the correlation between qTD and qcom, and validated that they are two non-redundant quality measures, each capturing a different major source of data variability [10, 20]. QC by each is necessary. For experiments performed using our TDAV arrays, we define a final overall quality score for every spot by
\[
Q_f = q_{\mathrm{TD}} \times q_{\mathrm{com}}
\tag{2.5}
\]
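Equations (2.3) to (2.5) chain together into a per-spot score that can be used directly for filtering. The sketch below assumes that qcom(TD) has already been rescaled to [0.5, 1] (the chapter does not describe that rescaling) and that the cyanine-image qcom is supplied by the caller; the function names and the spot records are hypothetical.

```python
def q_int(td_intensity, threshold=5000.0):
    """Eq. (2.4): full score at or above the threshold (RFU/pixel), proportional below it."""
    return 1.0 if td_intensity >= threshold else td_intensity / threshold

def q_td(td_intensity, q_com_td, threshold=5000.0):
    """Eq. (2.3): q_com_td is the TD-image composite score, assumed pre-scaled to [0.5, 1]."""
    return q_int(td_intensity, threshold) * q_com_td

def q_final(q_td_value, q_com_value):
    """Eq. (2.5): overall spot quality from the TD and cyanine scores."""
    return q_td_value * q_com_value

# Hypothetical spots: (name, TD intensity, q_com(TD), cyanine q_com)
spots = [
    ("well printed", 8000.0, 0.9, 0.8),
    ("weak probe", 2500.0, 0.8, 0.5),
    ("saturated", 8000.0, 0.9, 0.0),
]
scores = {name: q_final(q_td(td, qc_td), qc) for name, td, qc_td, qc in spots}
retained = [name for name, score in scores.items() if score > 0]
print(scores, retained)
```

Since Qf is a product, a spot is dropped as soon as either score reaches zero, while a weakly printed spot is kept with a proportionally lower score.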
Using qcom, qTD, and Qf, data quality and characteristics can be evaluated conveniently using the ratio-q plot, and data filtering can be performed quantitatively according to the desired stringency. Normally in our analysis only spots with Qf > 0 are retained for further data mining and modeling.
R–Q LOWESS normalization and its efficiency
Based on our quality score definitions, we have developed an original quality-dependent normalization procedure [19]. For each spot we define a quality-dependent Z-score, used in place of the commonly used log ratio log R:
\[
Z = \frac{\log R(q) - \mathrm{mean\_log}\,R(q)}{\mathrm{SD\_log}\,R(q)}
\tag{2.6}
\]
and define the Z-method normalized log ratio to be
\[
\log R_Z = \log R(q) - \mathrm{mean\_log}\,R(q)
\tag{2.7}
\]
The local mean of the log ratio, mean_log R(q), is obtained using a LOWESS scheme as described previously [16, 19, 21]. The local standard deviation (SD) of the log ratio, SD_log R(q), is obtained using a moving-window LOWESS approach. First, the SD for every spot is determined from a proportion f of its neighboring spots, where f is the fraction of data used for smoothing in the LOWESS fit for the mean [16, 21]; in Matarray the default value of f is 0.05. A LOWESS fit is then performed on these SDs, and the fitted result is defined to be SD_log R(q) [19].
For experiments using our three-color microarray platform, normalization is performed sequentially over qTD and qcom. First, a localized LOWESS normalization as described by Eq. (2.7) is applied to the plot of log R against qTD. The qTD-normalized log ratio is then plotted against qcom and Z-normalization is performed [19]. Figure 2.2(b) shows a data set from the same thymus experiment before and after normalization, illustrating how efficiently our procedure corrects quality-dependent bias in the data.
To demonstrate the advantage of our approach, we recently compared it with the commonly used MA-LOWESS normalization. Data from three different experiments were processed both by our pipeline and by the MA-LOWESS procedure. The experiments were: (1) the aforementioned profiling and comparison of the thymus of day 40 BB-DP and BB-DR rats; (2) gene expression profiling of the kidney in a rat model of end-stage renal disease (ESRD), in which three pairs of 22-week-old fawn hooded hypertensive (FHH) rats and control August Copenhagen Irish (ACI) rats were compared, with two replicate hybridizations per animal pair; and (3) time course profiling of the progression of drug-induced (staurosporine [22]) apoptosis in pancreatic islet RIN-m5F cells, in which six replicate hybridizations between apoptotic cells and controls were performed at each of 2, 4, and 6 hours after drug treatment. All spots with Qf = 0 were dropped (10% of all spots). We calculated the correlation coefficient between replicate pairs for genes exhibiting DE at P = 0.05 in at least one replicate. The results are presented in Fig. 2.3 and reveal a better overall performance by our processing pipeline (P < 0.0001), as most of the data lie in the lower triangle below the 45° line.
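The quality-dependent normalization of Eqs. (2.6) and (2.7) can be sketched as follows. This is a simplified stand-in and not the Matarray implementation: the smoother is a plain tricube-weighted local linear fit without the robustness iterations of a full LOWESS, the local SD is taken over the nearest neighbors in q, and the function names are invented. The example uses a large smoothing fraction only because the toy data set is tiny; the chapter quotes a default of f = 0.05.

```python
import math

def lowess(x, y, frac=0.3):
    """Simplified LOWESS: tricube-weighted local linear fit evaluated at each x."""
    n = len(x)
    k = max(2, math.ceil(frac * n))  # number of neighbors in each local fit
    fitted = []
    for xi in x:
        # Distance to the k-th nearest point sets the local bandwidth.
        h = sorted(abs(xj - xi) for xj in x)[k - 1] or 1e-12
        w = [(1 - min(abs(xj - xi) / h, 1.0) ** 3) ** 3 for xj in x]
        sw = sum(w)
        mx = sum(wi * xj for wi, xj in zip(w, x)) / sw
        my = sum(wi * yj for wi, yj in zip(w, y)) / sw
        sxx = sum(wi * (xj - mx) ** 2 for wi, xj in zip(w, x))
        sxy = sum(wi * (xj - mx) * (yj - my) for wi, xj, yj in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (xi - mx))
    return fitted

def local_sd(x, y, frac=0.3):
    """Moving-window SD over the nearest neighbors in x, then smoothed by lowess."""
    n = len(x)
    k = max(2, math.ceil(frac * n))
    sds = []
    for xi in x:
        nearest = sorted(range(n), key=lambda j: abs(x[j] - xi))[:k]
        values = [y[j] for j in nearest]
        m = sum(values) / k
        sds.append(math.sqrt(sum((v - m) ** 2 for v in values) / (k - 1)))
    return lowess(x, sds, frac)

def z_normalize(q, log_r, frac=0.3):
    """Return (log R_Z, Z) following Eqs. (2.7) and (2.6)."""
    mean_fit = lowess(q, log_r, frac)
    sd_fit = local_sd(q, log_r, frac)
    log_rz = [r - m for r, m in zip(log_r, mean_fit)]
    z = [d / s if s > 0 else 0.0 for d, s in zip(log_rz, sd_fit)]
    return log_rz, z

# Toy data: log ratios biased upward at low quality scores.
q = [i / 39 for i in range(40)]
log_r = [0.8 * (1 - qi) + 0.05 * (-1) ** i for i, qi in enumerate(q)]
log_rz, z = z_normalize(q, log_r, frac=0.3)
```

For the sequential scheme described above, `z_normalize` would be called first with qTD as `q`, and then again with qcom as `q` and the qTD-normalized log ratios as `log_r`.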
Fig. 2.3. [Figure not reproduced. Scatter plot of replicate correlation under MA-LOWESS normalization (y-axis, 0.0 to 1.0) against replicate correlation under Z normalization (x-axis, 0.0 to 1.0).] The advantage of quality-dependent filtering and normalization. Data from three different microarray experiments were processed with our normalization and the MA-LOWESS normalization procedures. All spots with Qf = 0 were dropped. The correlation coefficients between all direct replicates (60 pairs) were compared between the two procedures, showing an overall better performance by our normalization approach (P < 0.0001).
Qf -weighted mean and t-test Having a quality definition for every spot not only allows efficient and quantitative data QC, it also leads to more convenient data processing strategies. Data filtering is essential in microarrays. However, different replicates in one experiment often have a different number and composition of low-quality spots, and hence a different set of spots retained after filtering. Many genes can have one or more replicate data points eradicated. This ‘‘missing value’’ problem makes combining data from replicates, and the down-stream statistical evaluation and data mining cumbersome. In our approach, data acquired from every spot is accompanied with a quality score Qf , and the spots to be filtered are those with Qf ¼ 0. Therefore, instead of physically removing the bad data from further analysis, we can adopt a
38
X. Wang and M. J. Hessner
weighted mean approach to combine results from replicates where Qf serves as the weighting factor. A gene will only be removed physically from further analysis if it failed QC on all replicates. In this approach, data filtering is built in, and the contribution from bad data is eliminated automatically through their vanishing weight. In addition, for data that has passed QC this approach automatically gives the best data a higher weight though their high quality scores, therefore it has the potential of more sensitive and accurate measurements. Using a set of control clones spiked in at known input ratio, we have confirmed that this is indeed the case. In the aforementioned BB rat thymus study, eight of the labeling reactions of total thymus RNA were spiked with four Arabidopsis in vitro transcripts (cellulose synthase, chlorophyll a/b binding protein, ribulose-1,5-bisphosphate and triosphosphate isomerase) at known input ratios of 30:1, 10:1, 5:1, and 1:1, respectively. For the remaining eight hybridizations, they were spiked in at 1:30, 1:10, 1:5, and 1:1, respectively. These clones enabled an evaluation of the accuracy of microarray measurements through the comparison of measured output ratios to the known RNA input ratios. In Fig. 2.4(a) we present the weighted and non-weighted mean of the microarray measurements, as a function of their actual input ratios. It shows that the Qf -weighted mean led to a better agreement with the actual ratio. In either the weighted or non-weighted mean approach, the last data point (corresponding to spiked-in ratio of 30:1, Cy5:Cy3) is an exception, which deviates from actual input ratio significantly. A close look of its data revealed that there is a significant amount of pixel intensity saturation in the Cy5 channel. Furthermore, we have shown that, at very high folds of change, the microarray measurement intrinsically is prone to compression, probably due to the technology’s limited dynamic range for linear detection [20]. 
Excluding this data point, we found a highly linear relationship between the measured and the actual log ratios (R² > 0.96, P < 0.01 in both cases), with less compression in the Qf-weighted ratio measurements, as the slope of the linear fit is closer to one (0.89 vs. 0.83). We have further investigated the incorporation of quality scores in statistical evaluation by implementing a weighted t-test in which the mean and standard error are replaced with their Qf-weighted counterparts [23]. In Fig. 2.4(b) we compare the P-values obtained using the weighted and the normal t-test for the whole thymus data sets (excluding the Arabidopsis
Quantitative quality control of microarray experiments
Fig. 2.4. Statistical evaluation of microarray data using the Qf-weighted approach. (a) The straight mean and the Qf-weighted mean of the measured log10 ratio (M) are compared with the input log10 ratio (I) for Arabidopsis clones that were spiked in at known ratios. Linear fits: non-weighted, M = 0.83*I - 0.03; weighted, M = 0.89*I - 0.002; a 45° line is shown for reference. The weighted mean shows an improvement in agreement with the input ratio over the straight mean. (b),(c) P-values from the weighted t-test plotted against P-values from the non-weighted t-test, on logarithmic axes from 10^-4 to 10^0. The weighted t-test leads to more sensitive detection of differentially expressed genes. (b) For an experiment that compared the day 40 BB-DP and BB-DR rat thymus, the P-value derived using the weighted t-test is compared with that from the normal t-test. At P = 0.01 (dashed lines) significantly more genes are detected using the weighted t-test. (c) P-values are compared for Arabidopsis positive control clones that were spiked in at known input ratios of 30:1, 10:1, 5:1, 1:5, 1:10 and 1:30, all significantly different from 1:1. The weighted t-test is able to detect more at P = 0.01.
data points). It is evident that the weighted t-test leads to more genes with significant P-values. To determine whether this is due to more sensitive detection or to a higher false positive rate, we again turn to the Arabidopsis control clones. Each of our rat arrays possesses 76 spots corresponding to the four Arabidopsis clones. Therefore, this experiment generates a total of 152 Arabidopsis data points. Forty of them correspond to the clone spiked in at a 1:1 ratio and serve as (non-DE) negative controls. The remaining 112 correspond to input ratios that are significantly different from 1 and serve as (DE) positive controls. We find that the type I error (false positive) rates are comparable between the weighted and non-weighted t-tests. At P = 0.01, seven of the 40 negative controls are significant according to the weighted t-test, whilst five are significant according to the non-weighted t-test. More interestingly, the type II error (false negative) rate is significantly reduced in the weighted approach (Fig. 2.4(c)). Of the 112 positive controls, 31 have at most one replicate that passes QC (Qf > 0), and there is no need to include them in the statistical test. For the remaining 81, the weighted t-test is able to detect all but one at P = 0.01 (type II error rate: 1.2%). In contrast, the non-weighted t-test misses 18, leading to a type II error rate of 22.2%. This result indicates indirectly that those data points in Fig. 2.4(b) with a weighted t-test P < 0.01 are highly likely to be true positives. Since the microarray technology is often utilized as an exploratory tool to be followed by confirmatory measures, more sensitive detection is highly desirable. In summary, we have found that the Qf-weighted statistics allow more accurate and sensitive detection of gene expression changes.
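The Qf-weighted mean and weighted t-test described above can be sketched in a few lines of Python. This is only an illustration: the chapter does not give the exact weighted standard error, so the version below assumes one common choice based on the effective sample size n_eff = (Σw)² / Σw². The function names and that formula are assumptions, not the authors' implementation.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of replicate log ratios; weight-0 spots contribute nothing."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one replicate must have a positive weight")
    return sum(w * x for x, w in zip(values, weights)) / total


def weighted_t_statistic(values, weights):
    """One-sample t statistic for H0: mean log ratio = 0, with weighted moments.

    Assumption: variance and degrees of freedom are based on the effective
    sample size n_eff = (sum w)^2 / sum(w^2); the source does not specify this.
    Returns (t, degrees_of_freedom).
    """
    mean = weighted_mean(values, weights)
    total = sum(weights)
    n_eff = total ** 2 / sum(w * w for w in weights)
    if n_eff <= 1:
        raise ValueError("need more than one effective replicate")
    variance = sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / total
    variance *= n_eff / (n_eff - 1)  # small-sample correction
    std_error = math.sqrt(variance / n_eff)
    if std_error == 0:
        raise ValueError("replicates are identical; t statistic is undefined")
    return mean / std_error, n_eff - 1


# Hypothetical replicate log ratios for one gene and their quality scores Qf.
log_ratios = [0.95, 1.05, 1.00, 0.20]
quality = [0.9, 0.8, 1.0, 0.0]  # the last spot failed QC and is ignored
t_value, dof = weighted_t_statistic(log_ratios, quality)
```

With equal weights the statistic reduces to the ordinary one-sample t statistic, and a replicate with Qf = 0 has the same effect as leaving that replicate out, which mirrors the built-in filtering described in the text.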
The accuracy of gene expression measurements by microarrays
The microarray technology is generally considered less accurate and less quantitative than alternative technologies such as quantitative real-time RT-PCR [2, 3]. Significant compression in the microarray measurements is often observed [2, 24]. Is accurate, quantitative measurement of gene expression changes feasible by microarrays? Our answer is yes. The results presented in Fig. 2.4(a) demonstrate a highly linear relationship between the measured and the actual ratio over a dynamic range of 300-fold change (excluding the last data point), with a small compression in the measured ratios. Recently, in a study that profiled and compared liver
Fig. 2.5. Accuracy of gene expression ratio measurements. Measurements by microarrays (y-axis, log10(ratio)) are compared with those by quantitative RT-PCR (x-axis, log10(ratio)) in a rat liver experiment. Spots with Qf > 0 and spots with Qf = 0 are plotted separately, together with a linear fit to the good spots. A highly linear relationship is observed, with log R_microarray = 0.86 log R_RT-PCR + 0.03, R² = 0.94 and P < 0.001. The compression in the microarray measurements is modest.
gene expression from day 65 BB-DR rats and day 65 Wistar-Furth (WF) rats, we have confirmed the microarray measurements for a set of genes using quantitative real-time RT-PCR (manuscript in preparation). Four animals from each strain were sacrificed and equal amounts of purified total RNA from the animals of the same strain were pooled. The two pools were then compared in six replicate hybridizations, three of them reverse labeled. Twenty-two genes of biological interest were selected for RT-PCR measurements. The result is given in Fig. 2.5, showing overall good agreement between the two technologies. Seven (open circles) of the 22 genes were identified as poor-quality spots, as their Qf = 0. After removal of these genes, a highly linear relationship
(R² ≈ 0.94, P < 0.001) existed for the remaining 15 genes. If we were able to calibrate microarray measurements using RT-PCR results as a standard, we would have R_corrected = (R_measured)^q, with a correction factor q ≈ 1/0.86 ≈ 1.16. This is much better than the q ≈ 1.88 previously reported by others [25]. If the seven poor-quality data points were to be included, the agreement between the two platforms drops to R² ≈ 0.90, with a correction factor q ≈ 1.27, which is still not bad. Our studies suggest that, with stringent, efficient QC protocols, cDNA microarrays are capable of generating high-quality, quantitative measurements comparable to those obtained by real-time RT-PCR. Furthermore, in a recent project we have compared measurements from cDNA microarrays with those from Affymetrix’s and Agilent Technologies’ oligonucleotide array platforms. We observed a high correlation among the three, with no significant difference in terms of data quality. Briefly, the rat liver pool sample described above was also hybridized to Affymetrix’s U34A array and Agilent’s G4130A array. The Affymetrix arrays were processed with MAS5.0, and both the Agilent and our in-house arrays were processed with Matarray. The three platforms share 2824 unique Unigene genes. After data filtering, 895 genes passed QC on all platforms. Most of the drop comes from Affymetrix, as more than 50% of the genes were labeled “absent” on at least one array. We then calculated the correlation between each pair of platforms for all of the 895 genes and for the genes showing DE (P = 0.01) in at least one platform, as well as the concordance rate in the DE predictions. The result is summarized in Table 2.2.

Table 2.2. Good agreement is observed between gene expression measurements by three different microarray platforms: Affymetrix, Agilent oligonucleotide arrays, and our in-house cDNA arrays

Comparison               Correlation for all genes   Correlation for DE genes   Concordance of DE genes
Affymetrix vs. Agilent   0.73                        0.95                       77%
Affymetrix vs. cDNA      0.68                        0.93                       55%
Agilent vs. cDNA         0.75                        0.95                       67%

DE genes are defined at P = 0.01.
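The calibration relation quoted above, R_corrected = (R_measured)^q with q = 1/0.86, can be checked with a short sketch. It assumes, as the text does, that the small intercept of the linear fit (0.03) is ignored; the function names are illustrative only.

```python
def correction_exponent(slope):
    """Exponent q such that R_corrected = R_measured ** q for a fitted slope."""
    if slope <= 0:
        raise ValueError("slope must be positive")
    return 1.0 / slope


def calibrate_ratio(measured_ratio, slope):
    """Undo compression in a measured expression ratio (intercept ignored)."""
    if measured_ratio <= 0:
        raise ValueError("expression ratios must be positive")
    return measured_ratio ** correction_exponent(slope)


q = correction_exponent(0.86)            # slope of the fit in Fig. 2.5
corrected = calibrate_ratio(2.0, 0.86)   # a measured 2-fold change, hypothetical
```

Because the slope is below one, the correction moves ratios away from 1: measured increases become slightly larger and measured decreases slightly smaller.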
Discussion and conclusions
The microarray technology allows a comprehensive examination of gene expression profiles at the whole-genome level and has become a widely used genetic tool [1, 2]. It has great potential in resolving complex biological issues. Important examples include investigation of mechanisms for complex human diseases [3], evaluation of drug toxicity in pharmacology [26], and revelation of how stem cells differentiate into cell types of specific functions [27]. Furthermore, the oligonucleotide array platform allows users to design probes for each gene to detect multivariant regions of a transcript (i.e., splice variants) and to avoid regions that are repetitive or similar to other genes [28, 29]. However, to fully utilize its potential, we need a comprehensive data QC scheme to ensure the quality of gene expression measurements obtained from microarrays. Many of the complex biological questions require accurate detection of expression changes. Through our technical and analytical developments, we have demonstrated that accurate gene expression measurements by cDNA microarrays are possible when all the major factors affecting data quality are properly dissected and managed [20]. We have invented a three-color microarray technology, which resolves the QC issue of array fabrication and ensures that only high-quality arrays are used in microarray experiments [5, 6]. In our microarray analysis platform, the prehybridization TD image is utilized to assist data acquisition from the posthybridization cyanine image, which leads to improved accuracy and efficiency [10]. With a set of quality scores defined according to quality measures from the three images for each microarray, and an overall score Qf, we have shown that efficient, quantitative data filtering and normalization can be achieved. Furthermore, by adopting a Qf-weighted mean and a Qf-weighted statistical test, our analysis platform allows highly convenient data processing with improved accuracy and sensitivity. In this approach data filtering is built in through each spot’s quality score, and the missing value problem is avoided. Lastly and most importantly, we show that our microarray technology and bioinformatics setup lead to gene expression measurements that are comparable in quality to those obtained by quantitative RT-PCR.
Through numerous experiments we have demonstrated the advantages of having a quantitative measure of data quality and of utilizing the log R–Q plot for data filtering and normalization [4, 19]. We have found that such a log R–Q plot, by revealing data structure and possible artifacts, provides insight for data quality evaluation and for deciding data filtering stringencies. In addition, we have found the log R–Q plot useful in the design and optimization of new protocols and algorithms, as it can differentiate the effect on good vs. bad spots, and thus points out means for improvement. Most of the QC issues are common to microarray laboratories, and most of our approaches are therefore applicable to many. An individual laboratory can identify the major quality-affecting factors and define the corresponding quantitative measures. Furthermore, the quality score approach can be generalized to a weighted mean approach, such as defining $q_{\mathrm{com}} = \prod_i q_i^{w_i}$, as the individual factors may affect data quality to a different degree in different labs. In this way investigators will be able to tailor our approach to their own microarray setup. Our algorithms, although initially developed for cDNA arrays, are also applicable to spotted oligonucleotide arrays. In a recent report we developed the three-color oligonucleotide array platform by introducing a third-dye labeled universal tracking oligonucleotide into the printing buffer. Thus the quality of array fabrication can be evaluated quantitatively through the measurements of the tracking oligonucleotide [30], and similar data filtering and normalization procedures can be developed for oligonucleotide arrays utilizing TDAV. A high-quality microarray platform, either oligonucleotide or cDNA based, will allow laboratory investigators to focus on their biological questions instead of the technical issues of the data, and will allow statisticians and bioinformatic investigators to develop more powerful, complex analysis approaches.
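The weighted composite score mentioned above, q_com = ∏_i q_i^(w_i), can be illustrated with a minimal sketch. It assumes that each individual score q_i lies between 0 and 1 and that the weights w_i are non-negative; the function name and the example values are hypothetical.

```python
import math


def composite_quality(scores, weights):
    """Combine individual quality scores as the product of q_i ** w_i.

    Assumptions: each score lies in [0, 1] and each weight is >= 0.
    A weight of 0 removes a factor from the composite; a larger weight
    makes the composite more sensitive to that factor.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(not 0.0 <= q <= 1.0 for q in scores):
        raise ValueError("quality scores are assumed to lie in [0, 1]")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return math.prod(q ** w for q, w in zip(scores, weights))


# Hypothetical scores for one spot (e.g. size, signal-to-noise, saturation).
q_com = composite_quality([0.9, 0.8, 1.0], [1.0, 2.0, 0.5])
```

A useful property of the product form is that any single score of 0 with a positive weight drives the composite to 0, so a spot that fails one criterion outright is still filtered out regardless of the other weights.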
Acknowledgments
This work is supported in part by a National Institute of Biomedical Imaging and Bioengineering Grant (EB001421) and by a special fund from the Children’s Hospital of Wisconsin Foundation. We thank Shuang Jia, Lisa Meyer, Rhonda Geoffrey, and Bixia Xiang for their analytical and technical contributions.
REFERENCES
1. Brown, P. O. and Botstein, D. Exploring the new world of the genome with DNA microarrays. Nat. Genet. 1999; 21(1 Suppl): 33–7.
2. Chuaqui, R. F., Bonner, R. F., Best, C. J. et al. Post-analysis follow-up and validation of microarray experiments. Nat. Genet. 2002; 32 Suppl: 509–14.
3. Miklos, G. L. and Maleszka, R. Microarray reality checks in the context of a complex disease. Nat. Biotechnol. 2004; 22(5): 615–21.
4. Wang, X., Ghosh, S., and Guo, S.-W. Quantitative quality control in microarray image processing and data acquisition. Nucl. Acids Res. 2001; 29: E75–82.
5. Hessner, M., Wang, X., Hulse, K. et al. Three color cDNA microarrays: quantitative assessment through the use of fluorescein-labeled probes. Nucl. Acids Res. 2003; 31: e14.
6. Hessner, M. J., Wang, X., Khan, S. et al. Use of a three-color cDNA microarray platform to measure and control support-bound probe for improved data quality and reproducibility. Nucl. Acids Res. 2003; 31: e60.
7. Wang, Y., Wang, X., Guo, S. W., and Ghosh, S. Conditions to ensure competitive hybridization in two-color microarray: a theoretical and experimental analysis. Biotechniques 2002; 32(6): 1342–6.
8. Yue, H., Eastman, P. S., Wang, B. B. et al. An evaluation of the performance of cDNA microarrays for detecting changes in global mRNA expression. Nucl. Acids Res. 2001; 29(8): E41–1.
9. Hessner, M. J., Meyer, L., Tackes, J., Muheisen, S., and Wang, X. Immobilized support-bound probe and glass surface chemistry as variables in microarray fabrication. BMC Genomics 2004; 5: 53.
10. Wang, X., Jiang, N., Feng, X. et al. A novel approach for high quality microarray processing using third-dye array visualization technology. IEEE Trans. NanoBioscience 2003; 2(4): 193–201.
11. Yang, Y. H., Buckley, M. J., and Speed, T. P. Analysis of cDNA microarray images. Brief Bioinform. 2001; 2(4): 341–9.
12. Kadota, K., Miki, R., Bono, H. et al. Preprocessing implementation for microarray (PRIM): an efficient method for processing cDNA microarray data. Physiol. Genomics 2001; 4(3): 183–8.
13. Mills, J. C. and Gordon, J. I. A new approach for filtering noise from high-density oligonucleotide microarray datasets. Nucl. Acids Res. 2001; 29(15): E72–2.
14. Tran, P. H., Peiffer, D. A., Shin, Y., Meek, L. M., Brody, J. P., and Cho, K. W. Microarray optimizations: increasing spot accuracy and automated identification of true microarray signals. Nucl. Acids Res. 2002; 30(12): e54.
15. Tseng, G. C., Oh, M. K., Rohlin, L., Liao, J. C., and Wong, W. H. Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects. Nucl. Acids Res. 2001; 29(12): 2549–57.
16. Yang, Y. H., Dudoit, S., Luu, P. et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucl. Acids Res. 2002; 30(4): e15.
17. Colantuoni, C., Henry, G., Zeger, S., and Pevsner, J. Local mean normalization of microarray element signal intensities across an array surface: quality control and correction of spatially systematic artifacts. Biotechniques 2002; 32(6): 1316–20.
18. Bilban, M., Buehler, L. K., Head, S., Desoye, G., and Quaranta, V. Normalizing DNA microarray data. Curr. Issues Mol. Biol. 2002; 4(2): 57–64.
19. Wang, X., Hessner, M. J., Wu, Y., Pati, N., and Ghosh, S. Quantitative quality control in microarray experiments and the application in data filtering, normalization and false positive rate prediction. Bioinformatics 2003; 19: 1341–7.
20. Wang, X., Jia, S., Meyer, L. et al. Accurate gene expression measurements by cDNA microarrays. Submitted.
21. Cleveland, W. S. and Devlin, S. J. Locally weighted regression: an approach to regression analysis by local fitting. J. Am. Statist. Assoc. 1988; 83(403): 596–610.
22. Sanchez-Margalet, V., Lucas, M., Solano, F., and Goberna, R. Sensitivity of insulin-secreting RIN m5F cells to undergoing apoptosis by the protein kinase C inhibitor staurosporine. Exp. Cell Res. 1993; 209(1): 160–3.
23. Young, M. J., Eisenberg, J. M., Williams, S. V., and Hershey, J. C. Comparing aggregate estimates of derived thresholds for clinical decisions. Health Serv. Res. 1986; 20(6 Pt 1): 763–80.
24. Rajeevan, M. S., Vernon, S. D., Taysavang, N., and Unger, E. R. Validation of array-based gene expression profiles by real-time (kinetic) RT-PCR. J. Mol. Diagn. 2001; 3(1): 26–31.
25. Yuen, T., Wurmbach, E., Pfeffer, R. L., Ebersole, B. J., and Sealfon, S. C. Accuracy and calibration of commercial oligonucleotide and custom cDNA microarrays. Nucl. Acids Res. 2002; 30(10): e48.
26. Guerreiro, N., Staedtler, F., Grenet, O., Kehren, J., and Chibout, S. D. Toxicogenomics in drug development. Toxicol. Pathol. 2003; 31(5): 471–9.
27. Costoya, J. A., Hobbs, R. M., Barna, M. et al. Essential role of Plzf in maintenance of spermatogonial stem cells. Nat. Genet. 2004; 36(6): 653–9.
28. Kane, M. D., Jatkoe, T. A., Stumpf, C. R., Lu, J., Thomas, J. D., and Madore, S. J. Assessment of the sensitivity and specificity of oligonucleotide (50mer) microarrays. Nucl. Acids Res. 2000; 28(22): 4552–7.
29. Wang, H. Y., Malek, R. L., Kwitek, A. E. et al. Assessing unmodified 70-mer oligonucleotide probe performance on glass-slide microarrays. Genome Biol. 2003; 4(1): R5.
30. Hessner, M. J., Singh, V. K., Wang, X. et al. Visualization and quality control of spotted 70-mer arrays using a labeled tracking oligonucleotide. BMC Genomics 2004; 5: 12.
3
Statistical analysis of gene expression data
David A. Elashoff
Department of Biostatistics, UCLA School of Public Health, Los Angeles, CA, USA
Gene Expression Profiling by Microarrays: Clinical Implications, ed. Wolf-Karsten Hofmann. Published by Cambridge University Press 2006. © Cambridge University Press 2006.

Abstract
Statistical analysis of the complex data sets produced in DNA microarray experiments presents substantial challenges to the experimenter and statistician alike. Due to the large number of genes and small number of samples, traditional statistical analysis methods alone are not typically sufficient to draw appropriate conclusions. This chapter introduces the reader to the basic concepts in the analysis of microarray data and provides a summary of some of the most commonly used techniques. The overall structure of a microarray data analysis can be divided into four distinct components: data preprocessing/quality control, identification of differentially expressed genes, unsupervised clustering/data visualization, and supervised classification/prediction. As the science of microarray analysis has advanced, a wide variety of methods have been developed to address each of these components. Guidance is provided as to the situations in which the various techniques can be applied most productively, and cautions are given about cases where these techniques will give inappropriate answers.

Introduction
The growth of microarray research has resulted in considerable interest in the statistical and computational communities in the development of methods for addressing these problems. The most common scientific questions asked in a microarray experiment are, “What genes are correlated with specific characteristics of the samples?” and “Are there specific
patterns of gene expression, or combinations of multiple genes, which can accurately predict the sample characteristics?” The characteristics can be simple, such as tumor tissue versus normal tissue, or more complex, such as survival times for the experimental subjects. The overall structure of a microarray data analysis can be described in terms of four distinct components: data preprocessing, gene filtering (identifying differentially expressed genes), unsupervised clustering, and supervised classification/prediction. These four components will be addressed in detail in the following sections. While not all of these components are present in all analyses, these categories give the general structure for most analyses. Data preprocessing with appropriate quality control is necessary to generate consistent expression values on the basis of the scanned microarrays. Gene filtering consists of methods of identifying genes that are differentially expressed with respect to the sample characteristics of interest. Clustering and prediction are used to identify large-scale patterns of gene expression and to utilize these patterns to predict biological outcomes. As the science of microarray analysis has progressed, a multitude of different techniques have been developed to address each of these analytic components. One of the most challenging problems for the analyst is to decide which set of techniques to use. One difficulty is that microarray experiments lack the replication and validation that have typically helped guide analysis strategies in other fields. The microarray analysis literature contains many assessments of the various methods, which will often show that one method or another is the “best” method of analysis based on a specific set of evaluation criteria. Unfortunately, different sets of evaluation criteria will select different best methods. Realistically, there is no one universal analysis strategy that is the overall best choice for each element of microarray analysis. Rather, there are a number of appropriate techniques for each element of the analysis, and the challenge is to use those techniques appropriately and to interpret correctly what the results mean. Concurrent with the development and adaptation of statistical methods for microarray analysis has been the development of software packages to implement these methodologies. Two freely available packages provide an environment to carry out most microarray data analysis.
The most complete tool for these investigations is Bioconductor (http://www.bioconductor.org), an open source and open development software project for R (http://www.r-project.org/) for the analysis and comprehension of genomic data. R is an open source statistical analysis software package implementing the S statistical computing language. Bioconductor implements all the data acquisition methods, along with many of the statistical methods for gene filtering, clustering and classification that are described in this chapter. DNA-Chip Analyzer (dChip) (http://www.dchip.org) is a free software package specifically developed to handle Affymetrix microarray data. This package performs data preprocessing, normalization, gene filtering and clustering. The major advantages of dChip are its ease of use and its suite of helpful visualization tools.

Statistical background and challenges with microarray data
The basic statistical challenge with microarray analysis is that we wish to draw conclusions about a large number of genes on the basis of a small number of samples. In the statistical literature this is referred to as the small n (n = the number of samples), large p (p = the number of variables, or in this case the number of genes) problem. This problem is also referred to as the multiple testing problem or multiple comparison problem. It can be explained in terms of the balance between two fundamental concepts of statistical hypothesis testing: the significance level and statistical power. The first concept is the significance level, also referred to as the alpha level. The significance level is the preset probability that a statistical test will result in a false positive. Typically in the scientific literature this probability is set to 0.05. Another way to consider the significance level is that it is the proportion of statistical tests that will yield a significant result even when there is no correlation between the gene expression and the characteristics of interest in our samples. False positive statistical test results, that is, genes that we will incorrectly observe to be differentially expressed, are a special concern for microarray analysis. A simple calculation demonstrates the magnitude of this problem. The most recent Affymetrix human GeneChip, the Human Genome U133 Plus 2.0 array, contains
over 47,000 transcripts. If a microarray experiment were run using these arrays and we performed a standard statistical test to compare the expression values for each transcript against the outcome measure, we would expect, assuming independence and no truly differentially expressed genes, 47,000 × 0.05 = 2350 false positive results. If we suppose that only a small number of genes are truly differentially expressed, then the false positive genes would dramatically outnumber the true positives. This is a problem, as there is no a priori way to determine which statistical results are true positives and which are false positives. The second concept is that of statistical power. The power of a statistical test is the probability that the test will detect a gene that is truly differentially expressed; it equals one minus the probability of a false negative. False negatives in microarray studies will result in the study failing to find genes that are truly related to the biological differences in the samples. The power of a hypothesis test depends on the chosen level of significance, the strength of the relationship between the expression level and the outcome, the amount of variability in the expression level, and, most importantly, the sample size n. The power increases with larger differences, sample sizes, and significance levels. The small sample sizes typically found in microarray studies give rise to low levels of statistical power (i.e., high false negative rates). Further, if we lower the level of significance to reduce the number of false positives, we will also reduce the statistical power for detecting differentially expressed genes. The combination of low power and a large number of false positives in microarray analyses can lead to low reproducibility between studies and a large chance of failing to detect key differences, or of failing to evaluate them in depth, due to the large numbers of chance correlations.
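The trade-off between false positives and power described above can be made concrete with a small Python sketch. The false positive count follows the calculation in the text; the power function is an assumption made for illustration only, using a normal approximation to a two-sided, two-sample comparison with a known standard deviation rather than any specific test from this chapter.

```python
import math
from statistics import NormalDist


def expected_false_positives(n_tests, alpha, n_true_de=0):
    """Expected number of false positives among the truly null tests."""
    return (n_tests - n_true_de) * alpha


def approximate_power(effect, sd, n_per_group, alpha):
    """Approximate power of a two-sided, two-sample z-test.

    Assumption: normal approximation with known standard deviation `sd`
    and equal group sizes; real microarray tests will differ.
    """
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    shift = abs(effect) / (sd * math.sqrt(2.0 / n_per_group))
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


false_positives = expected_false_positives(47000, 0.05)

# Hypothetical gene: a difference of one standard deviation between groups.
power_small_n = approximate_power(1.0, 1.0, n_per_group=5, alpha=0.05)
power_large_n = approximate_power(1.0, 1.0, n_per_group=20, alpha=0.05)
power_strict = approximate_power(1.0, 1.0, n_per_group=5, alpha=0.001)
```

Comparing the three power values shows both effects named in the text: power rises with the number of samples per group and falls when the significance level is made stricter.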
Data processing
Data acquisition and computation of expression indices
The first step of the analysis is to compute numerical summaries for each gene from the microarrays. Microarrays must be scanned and the image files loaded into a software package. These software packages perform image segmentation and subsequently compute expression values for each gene. A number of different software packages exist for both spotted and
Affymetrix arrays; these packages implement various algorithms for computing the expression value for each gene and performing normalization on these values.
Spotted arrays
Spotted arrays consist of a grid of spotted cDNAs or oligonucleotides. Typically, each spot corresponds to an individual gene and is roughly circular in shape. These arrays can be either two-color (red/green) arrays, in which two samples are hybridized simultaneously to the array, or one-color arrays with a single sample. The goal of the data preprocessing for spotted arrays is the computation of the expression value for each spot or gene for the red and green channels in the region of the spot. Background intensities are also computed for each channel. One of the areas of controversy in spotted array preprocessing is the methodology for identifying which pixels in the array image file belong to a given spot. There are a number of methods to perform this identification, the most commonly used of which fall into two broad categories: fixed circles or adaptive shape. The fixed circles methodology is used in a number of software packages, including the free package ScanAlyze (http://rana.lbl.gov/EisenSoftware.htm). This technique draws a circle around each spot and considers all pixels within the circle to be included in the spot. The adaptive shape method found in the package Spot [1] is implemented in R, with additional details available online at http://experimental.act.cmis.csiro.au/Spot/index.php. Instead of assuming that each spot is circular, Spot uses a statistical algorithm, seeded region growing (SRG), that finds the center of each spot and then “grows” to encompass the entire area of the spot whose hybridized intensity is significantly brighter than the surrounding background. This procedure more accurately identifies pixels that should and should not be included in the computation of the spot intensity. An automatic grid-finding procedure also minimizes manual intervention. The expression value for an individual gene is typically reported using the log ratio of the background-subtracted red and green intensities.
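As a minimal sketch of the last step, the log ratio for one spot can be computed as follows. The base-2 logarithm and the choice to return no value when a background-subtracted intensity is not positive are assumptions made here for illustration; individual software packages handle these cases differently.

```python
import math


def log_ratio(red, green, red_background=0.0, green_background=0.0):
    """Log2 ratio of background-subtracted red and green spot intensities.

    Returns None when either corrected intensity is not positive, because
    the logarithm is then undefined (an illustrative convention only).
    """
    red_signal = red - red_background
    green_signal = green - green_background
    if red_signal <= 0 or green_signal <= 0:
        return None
    return math.log2(red_signal / green_signal)


# Hypothetical spot: red is twice as bright as green after background removal.
m_value = log_ratio(1100.0, 600.0, red_background=100.0, green_background=100.0)
```

On this scale a value of 1 corresponds to a two-fold higher signal in the red channel, 0 to equal signal, and -1 to a two-fold higher signal in the green channel.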
The next step is data normalization which typically consists of a two-stage process, within slide and between slide normalization. Within slide normalization is performed using the M-A normalization procedure [2], which seeks to
remove the effect of dye bias. Between slide normalization can be performed in a number of different ways and is essential to ensure data consistency between arrays.
Affymetrix arrays
Affymetrix expression arrays represent each gene by a set of 10–20 oligonucleotide probe pairs, where each probe pair consists of a perfect match (PM) probe and a mismatch (MM) probe. The first step of an analysis of Affymetrix data is to summarize the intensities of these probes by an overall expression index. In principle, this index should directly correspond to the concentration of RNA corresponding to that gene in the sample. The earliest algorithm (MAS4) [3] simply computed the trimmed mean of the differences between perfect match and mismatch over the set of probe pairs to obtain a measure of the gene expression. One downside of the MAS4 algorithm is that it can produce negative expression values when the MM probes have higher intensities than the PM probes for a given probe set. These negative values lack a straightforward biological interpretation. This technique has been superseded by three popular techniques for generating gene expression values (Table 3.1): Affymetrix Microarray Suite (MAS) 5.0 [4], dChip (Li–Wong algorithm) [5], and RMA [6]. Continued research on expression indices has resulted in a number of additional methods intended to model the probe specific effects more accurately; for example, PDNN [7] uses physical modeling to determine probe weights, and two GCRMA [8] methods use the GC content of the probe sequences to reduce variance in the mismatch probes.
Table 3.1. Commonly used expression metrics

Reference name   Probes used   Summary method                                      Probe specific effects?
MAS4             PM-MM         Affymetrix average difference (trimmed mean)        No
MAS5             PM-IM         Affymetrix one-step Tukey’s biweight                No
PMonly           PM            Li and Wong reduced perfect match only model        Yes
Diff             PM-MM         Li and Wong reduced difference model                Yes
RMA              PM            Robust multi-array analysis (log transformation)    Yes
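The one-step Tukey’s biweight listed for MAS5 in Table 3.1 can be sketched as a robust weighted mean. The sketch below is an illustration under stated assumptions: the tuning constant c = 5 and the small constant ε = 0.0001 follow the commonly cited description of the method, the input is taken to be a list of already transformed probe-level values, and the function name is hypothetical. It is not a reimplementation of the MAS 5.0 software.

```python
import statistics


def tukey_biweight(values, c=5.0, epsilon=1e-4):
    """One-step Tukey biweight estimate of the center of `values`.

    Each value is weighted by (1 - u^2)^2, where u is its distance from the
    median scaled by c times the median absolute deviation (MAD); values
    with |u| >= 1 receive weight 0. `c` and `epsilon` are assumed defaults.
    """
    center = statistics.median(values)
    mad = statistics.median([abs(x - center) for x in values])
    scale = c * mad + epsilon  # epsilon avoids division by zero when MAD is 0
    weighted_sum = 0.0
    weight_total = 0.0
    for x in values:
        u = (x - center) / scale
        weight = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        weighted_sum += weight * x
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical probe-level values for one probe set, with one outlier.
probe_values = [10.0, 10.2, 9.8, 10.1, 9.9, 50.0]
expression_index = tukey_biweight(probe_values)
```

In this example the outlying probe lies far outside the scaled distance and receives zero weight, so the estimate stays close to 10, whereas the ordinary mean of the six values would be pulled well above it.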
MAS5
The Affymetrix Microarray Suite 5.0 (MAS 5.0) software package computes two summary measures for each gene. The first measure is the expression value computed using the one-step Tukey’s biweight algorithm, which calculates an expression level over the set of probe pairs. This is a robust averaging technique whereby the expression index is a weighted mean of the individual probes. The weight of each probe decreases with the distance between that probe and the median intensity of the probe set. This method effectively reduces the influence of outlier expression values by down-weighting them in the expression computation. To avoid the possibility of negative expression values, the MAS5 algorithm truncates the mismatch probes to the level of the perfect match probes. Finally, MAS5 scales the expression data from various arrays so that the mean expression value is the same across arrays. The second measure is the presence call: present, marginally present, or absent. The presence call is a decision rule utilizing a number of details of the probe pairs to determine whether mRNA corresponding to that particular gene is in the sample.
dChip
An alternative method for data acquisition is the dChip software, which implements the Li–Wong algorithm. This software also computes a measure of the expression level akin to the average difference, as well as a presence/absence call. The Li–Wong algorithm fits a model that includes probe-specific terms and an overall expression index for each gene. Because there is considerable variability in the intensities of the various probes in each probe set, including probe-specific effects lets the model better reflect the true nature of the data in Affymetrix chips. The algorithm also allows the computation of a standard error for the expression index of each gene on each array, so the user can determine the relative variability of each observation. Another key feature of this software is the use of a statistical model to detect outliers and artifacts on the arrays. This helps eliminate both large- and small-scale array artifacts caused by scratches, dust and mishandling. The software also normalizes the expression values across multiple arrays using the invariant subset normalization procedure, which ensures that the measurement scale is the same in all experimental conditions. Two versions of the Li–Wong algorithm are implemented in dChip: the difference model, which models the differences between the PM and MM probes, and the PM only model, which discards the MM probes from the computation of the expression values (Fig. 3.1).
Fig. 3.1. Consistency of probe specific effects. Each panel shows the gene expression data for a single probe set from a single Affymetrix array. The left portion of each panel plots the PM (blue line) and MM (green line) intensities (y-axis) vs. the probes in linear order. The right portion of each panel shows the actual image of the probe set from the array. The probes are displayed with the PM probes in the top line and the MM probes in the bottom line. Brighter shades of yellow indicate higher intensities/more expression. The four panels on the left side show an individual gene in four different samples. The four panels on the right show a different gene in the same four samples. We note that the pattern of probe intensities remains roughly constant, yet the overall level appears to vary between samples. In the left panels the gene appears to be overexpressed in the middle two samples, while in the right panels the gene appears overexpressed in the bottom three samples. These images were created in the dChip software package.
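The Li–Wong model is commonly written in multiplicative form, with the signal for probe j on array i approximated by the product of an expression index theta_i and a probe effect phi_j. The sketch below fits that form by alternating least squares on noise-free hypothetical data. The function name `fit_li_wong`, the starting values, and the data are assumptions for illustration; dChip additionally performs outlier detection and standard error estimation, which are omitted here.

```python
def fit_li_wong(y, n_iter=20):
    """Alternating least squares for y[i][j] ~ theta[i] * phi[j].

    y: rows are arrays, columns are probes (PM - MM differences, or PM only).
    Returns (theta, phi) with phi scaled so that sum(phi_j ** 2) == n_probes.
    """
    n_arrays, n_probes = len(y), len(y[0])
    phi = [1.0] * n_probes
    for _ in range(n_iter):
        # Expression index per array, given the current probe effects.
        ss_phi = sum(p * p for p in phi)
        theta = [sum(y[i][j] * phi[j] for j in range(n_probes)) / ss_phi
                 for i in range(n_arrays)]
        # Probe effects, given the current expression indices.
        ss_theta = sum(t * t for t in theta)
        phi = [sum(y[i][j] * theta[i] for i in range(n_arrays)) / ss_theta
               for j in range(n_probes)]
        # Identifiability constraint: squared probe effects sum to n_probes.
        scale = (n_probes / sum(p * p for p in phi)) ** 0.5
        phi = [p * scale for p in phi]
    # Final expression indices under the rescaled probe effects.
    ss_phi = sum(p * p for p in phi)
    theta = [sum(y[i][j] * phi[j] for j in range(n_probes)) / ss_phi
             for i in range(n_arrays)]
    return theta, phi

# Hypothetical probe set: 3 arrays x 4 probes with a shared probe pattern.
probe_effects = [0.5, 1.0, 1.5, 1.0]
levels = [100.0, 200.0, 400.0]
y = [[t * p for p in probe_effects] for t in levels]

theta, phi = fit_li_wong(y)
print([round(t, 1) for t in theta])
print([round(p, 3) for p in phi])
```

The fitted indices preserve the 1:2:4 ratio between arrays, while the probe effects capture the pattern that stays constant across samples, as in Fig. 3.1.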
RMA
The RMA method has some similarities to the Li–Wong method in that it fits probe-specific effects for each gene. However, RMA models the logarithm of the probe intensities rather than the absolute intensities. RMA ignores the mismatch probes and instead uses a novel method to compute background intensities over regions of the arrays. The log transformation aids in stabilizing the variance across the range of expression values as well as reducing the influence of outlier probes. RMA has been shown to be the expression index with the highest correlation to spike-in genes, although these experiments do not necessarily represent the typical performance with biological samples [9]. RMA also shows better technical properties as a measure of gene expression, as it is the only metric that does not show a strong correlation between expression and variability.
Summary
In practice, each of the methods has its relative advantages and disadvantages, with none being the clear choice. The perfect match only models (RMA, PMonly) avoid the extra error associated with the MM probes but, at the same time, do not adjust well for non-specific binding. The models that fit probe-specific effects (RMA, Diff, PMonly) produce less variable expression values but require a larger sample size to ensure adequate estimation of those effects. The dChip methods use explicit outlier detection, while the other methods derive robustness from their computational algorithms. Added to this confusion is the fact that, if we want to identify differentially expressed genes by any given comparison metric, say the t-test, each of the data acquisition methods will produce a different list of significant genes, with disagreement ranging as high as 90% [10]. At the same time, preliminary reports indicate that genes identified as differentially expressed are confirmed at approximately the same rate by RT-PCR [11]. This leaves us in the uncomfortable situation in which the different expression indices produce very different results that nevertheless appear to be equally useful.
Quality control (QC)
Quality control (QC) issues must be addressed in microarray experiments. Microarrays suffer from a wide variety of quality problems, ranging from defects on individual chips, which can arise from dust, scratches and bubbles on the microarray, to variability in sample quality due to RNA degradation and sample inadequacy. In practice, these defects produce increased signal for probes in regions with dust spots and scratches and decreased signal in regions with bubbles. Poor quality samples or poor sample handling can have negative effects on the statistical analyses, including an increase in false positives due to influential outliers and an increase in false negatives due to the increased variability of gene expression.
To reduce the effect of defects and sample handling, a series of quality control checks is recommended. The first step is a visual inspection of the array images. Figure 3.2 demonstrates a variety of defects that can be identified by visual inspection and the results of dChip outlier detection. Beyond visual examination, there are a number of overall chip QC metrics that can be useful in identifying chips that should be removed from the analysis.
There are two main ways to make use of these QC metrics. First, we can apply specific thresholds for each metric to identify problematic chips. Thresholding works best with dChip outliers. Second, we can consider the consistency of these metrics across the chips in the study. It is important to note that consistency should only be considered within individual tissue/cell types, as there can be very significant differences in the number and type of genes expressed between tissues. Further, it has been noted that the relative probe intensities in Affymetrix arrays vary substantially across tissue types, which will result in inflated estimates of variability when using models that fit probe-specific effects. Four useful quality control metrics are dChip outliers, the present/absent percentage, the 5′/3′ ratio and the Affymetrix scale factor.
dChip outliers
If more than 5% of probe sets are determined to be outliers, the software recommends that the chip be discarded. This criterion is somewhat liberal, and a more stringent criterion of, say, 1–2% outliers is probably warranted.
Present/absent percentage
The percent of genes called present on each chip can be readily computed and provides a useful QC metric. Samples that are outliers in the overall present percentage, especially those with low overall present percentages, should be considered for removal. These types of samples often result from inadequate amounts of RNA or significant RNA degradation. Outlying, low overall present percentages also present problems for the data normalization procedures used by the various data acquisition software packages, as these procedures are based on the assumption that there is roughly the same quantity of RNA present in each sample.
Fig. 3.2. These four panels show images of Affymetrix microarrays. (a) shows an array with no apparent defects. (b) shows an array with a large bubble on the right side of the chip. (c) shows an array with a large dust spot defect. (d) shows this same array after the application of the dChip outlier detection. White lines indicate probe outliers that will be removed from the analysis. These images were created in the dChip software package.
Brightness/dimness
This QC metric is based on the scale factor used in the Affymetrix MAS5 software. The concept is to compute the trimmed mean of the unnormalized expression values for each array separately and to remove arrays that have outlying values. Essentially, this criterion removes arrays that are substantially brighter or dimmer than the other arrays in the experiment, as these arrays have likely been mishandled or contain poor quality sample.
5′/3′ ratio
Expression arrays generally have one or more control genes, as well as many genes represented by multiple probe sets, that contain both a 5′ fragment and a 3′ fragment. The ratio of the 5′ expression to the 3′ expression is a good marker of RNA degradation and hence of data quality. Arrays with low overall 5′/3′ ratios, or ratios that are significantly lower than in the other samples, should be considered for removal.
Based on our visual inspection and examination of QC metrics we can decide which arrays should not be used. Typically, these quality control steps should be run in an iterative manner; that is, after the first set of problematic arrays is removed, the data acquisition and normalization steps should be re-run on the reduced data set. Then the QC metrics can be recomputed and it can be determined whether additional arrays should be removed. While the removal of chips would seem to reduce our ability to detect differentially expressed genes by lowering our sample size, this effect is more than compensated for by the reduction of within-group variability obtained by removing bad data.
Gene filtering: identifying differentially expressed genes
The first scientific question considered in a typical array analysis is which genes are differentially expressed relative to some characteristic of the samples. These characteristics could be grouping characteristics such as cancer/normal or treated/untreated, phenotypic measures such as growth rate, or clinical outcome measures such as treatment response or survival time. Different types of experimental designs will require different analytic tools.
Gene filtering methodology for identifying differentially expressed genes typically involves the use of one or more observational and/or statistical metrics to create lists of differentially expressed genes. Each of these observational and statistical criteria requires a prespecified threshold for assessing whether individual genes are differentially expressed. Genes that fail to surpass the threshold for any one of the filtering criteria are removed, so the final list of differentially expressed genes consists of the intersection of the genes that surpass the thresholds for all criteria. We can contrast this approach with classical statistical filtering, which would use a single statistical criterion, the t-test for example, with a significance threshold of P < 0.05.
Observational filtering metrics
These filtering metrics are based on the observed expression values produced by the data preprocessing step. They use filtering thresholds that are usually based on the experience of the microarray analyst and not on any universally accepted criteria.
Present/absent calls
The simplest gene filtering method is based on the presence call in Affymetrix data or on similar metrics derived for other array types. The typical usage is to remove genes that fall below a certain percent present threshold. This threshold can be set very low, say 10%, so that only genes that are absent in all or nearly all samples are removed. In this case the genes removed are likely those that are either not expressed or expressed at a very low level across the experiment. These genes are unlikely to have biological significance, yet they will produce occasional false positives due to their large numbers. Alternatively, a higher threshold can be used, say 50–80%, to produce lists of genes that are more consistently expressed in the samples under consideration.
Fold change
The next, and most commonly used, filtering criterion is the fold change statistic. The fold change must be computed on the basis of two experimental conditions. It is the difference between the mean expression values in the two conditions divided by the smaller of the two means, with 1 added or subtracted according to the sign of the difference. For example, group means of 200 and 100 give a fold change of 2, while means of 100 and 200 give a fold change of −2. As in all filtering, each gene is examined separately and genes that do not exceed a minimum threshold are removed. This criterion was the first, and is still the most extensively used, in the microarray literature for identifying differentially expressed genes. Generally, a fold change threshold of greater than 2 or less than −2 is used. It should be noted that this threshold does not work well with expression indices that use only perfect match probes (PMonly, RMA) or when transformations are applied to the expression data (RMA, LogMAS5). In these cases we expect only a tiny fraction of genes (<0.5%) to have fold changes greater than 2 (compared to 5–10% for the other expression indices) [10]. The fold change criterion can still be used in these cases with lower thresholds (1.2–1.8).
Absolute difference
A third observational criterion is the absolute difference between the mean expression values for each group. Support for the use of this criterion is reported in Cole (2003) [12], where the combination of the ratio (fold change) and the absolute difference resulted in good performance relative to conventional statistical filtering. The value of the absolute difference threshold will depend on the type of array and the preprocessing/normalization methodology.
Statistical filtering metrics
Confidence interval for the fold-change
We can compute the 95% confidence interval for the fold change based on the assumption that the fold change is a ratio of two normally distributed variables. This technique allows us to incorporate the standard errors of the expression indices computed by dChip and RMA to gain a degree of confidence about whether the observed fold change is different from a predefined threshold. The advantage of estimating the confidence interval is that we obtain upper and lower bounds on the fold change. The filtering threshold for this method is two sided, in that it specifies an absolute value threshold for the upper or lower bound. If we use unity as our threshold, this results in a filtering criterion that is similar to the t-statistic. A threshold of 2 would result in a list of genes that we can say, with 95% confidence, are at least twofold different between the groups.
Table 3.2. Common statistical models for microarray analysis
Type of sample characteristic(s) of interest                          Statistical model
Categorical (e.g. group, treatment type), single sample per subject   ANOVA (t-test if two-group design)
Categorical, multiple samples per subject (replicates, time series)   Mixed effects ANOVA (paired t-test if pre/post design)
Quantitative (e.g. growth rate, tumor size)                           Linear regression
Time-to-event (e.g. survival, remission time)                         Cox proportional hazards regression
Classical statistical testing
The next filtering criterion involves the use of classical statistical analysis methods on a gene-by-gene basis. Table 3.2 lists some common statistical analysis methods appropriate to the type of sample characteristic under study. These methods look for statistical correlations between the sample characteristics and the expression values for each gene. For the ANOVA and linear regression models, the gene expression value is considered the outcome or dependent variable and the subject characteristics are the independent variables. In the case of factorial experiments there may be multiple categorical characteristics in the models (cell type, treatment, etc.). This model structure is reversed for time-to-event analyses, where the gene expression values are the independent variables. Regardless of which type of statistical model is used, a P-value is obtained that describes the significance of the correlation between the gene expression and the characteristic(s) of interest. The filtering threshold for these models is the P-value. The next section discusses ways to choose specific P-value thresholds, since the traditional 0.05 threshold may not be appropriate when multiple comparisons are being performed. Alternatively, non-parametric statistical methods can be used if we do not want to make any distributional assumptions concerning the expression data. For example, the Wilcoxon rank sum test can be used instead of the two independent sample t-test.
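As a minimal sketch of how the observational and statistical criteria described in this section might be combined by intersection, the example below applies the fold change, the absolute difference, and a Wilcoxon rank sum test to each gene. The function names, the thresholds (2, 100 and 0.05) and the expression values are hypothetical, and the rank sum P-value uses a normal approximation without tie or continuity corrections, which is crude for groups this small.

```python
from statistics import mean, NormalDist

def signed_fold_change(a, b):
    """Difference of group means over the smaller mean, plus or minus 1
    according to the sign of the difference (assumes positive means)."""
    ma, mb = mean(a), mean(b)
    diff = ma - mb
    return diff / min(ma, mb) + (1 if diff >= 0 else -1)

def rank_sum_p(a, b):
    """Two-sided Wilcoxon rank sum P-value, normal approximation, midranks."""
    values = sorted(a + b)
    rank, i = {}, 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        rank[values[i]] = (i + 1 + j) / 2   # average of ranks i+1 .. j
        i = j
    n1, n2 = len(a), len(b)
    w = sum(rank[v] for v in a)
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (w - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))

def filter_genes(expr, group_a, group_b, fc_min=2.0, diff_min=100.0, p_max=0.05):
    """Keep genes that pass all three criteria (intersection of filters)."""
    keep = []
    for gene, values in expr.items():
        a = [values[i] for i in group_a]
        b = [values[i] for i in group_b]
        passes_fc = abs(signed_fold_change(a, b)) >= fc_min
        passes_diff = abs(mean(a) - mean(b)) >= diff_min
        passes_test = rank_sum_p(a, b) <= p_max
        if passes_fc and passes_diff and passes_test:
            keep.append(gene)
    return keep

# Hypothetical expression values: samples 0-4 are group A, 5-9 are group B.
expr = {
    "gene_up":   [100, 120, 110, 90, 105, 260, 240, 250, 230, 270],
    "gene_flat": [500, 510, 495, 505, 490, 498, 512, 502, 488, 507],
    "gene_low":  [10, 12, 11, 9, 10, 30, 28, 31, 29, 32],
}
selected = filter_genes(expr, [0, 1, 2, 3, 4], [5, 6, 7, 8, 9])
print(selected)
```

Here `gene_low` has a large fold change and a small P-value but fails the absolute difference criterion, which illustrates why the criteria are applied jointly.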
Correction for multiple testing/false discovery rate
As discussed earlier, one of the primary statistical concerns with microarray analysis, especially with respect to gene filtering, is that of multiple testing and false positives. A number of techniques have been designed to address this problem, and they fall into two general categories. The first type of technique attempts to control the overall false positive rate for the entire set of comparisons. These methods control the familywise error rate (FWER) or the per comparison error rate (PCER). The FWER is defined as the probability of having one or more false positives in our full set of comparisons, and the PCER is defined as the expected proportion of comparisons that will be false positives. These techniques are discussed in Dudoit et al. (2000) [13]. The most commonly used and least appropriate correction technique is the Bonferroni correction. This controls the FWER by using a significance level computed as 0.05 divided by the number of comparisons performed. For microarray data sets with 10 000–50 000 genes this results in P-value thresholds of between 0.000005 and 0.000001. These strict thresholds will typically not even produce the desired FWER, due to the failure of the normality assumption in the tails of the distribution of gene expression values. Further, changing our level of significance to these values will result in a dramatic loss of power to detect genes that are truly differentially expressed. Other corrections, such as the Holm step-down method and the Westfall–Young method, provide more sensible, albeit more computationally intense, methods for controlling the FWER. These techniques will also result in substantial reductions in power. Given that the power is typically low anyway in microarray studies due to the small sample sizes, these techniques are not particularly appropriate unless we only wish to identify a very small number of differentially expressed genes with very high confidence from our microarray experiment.
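The two simplest FWER corrections mentioned above can be sketched in a few lines. The function names and the four P-values are hypothetical; the Westfall–Young method is not shown because it requires resampling the expression data.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests

def holm_reject(pvalues, alpha=0.05):
    """Holm step-down method: True where the hypothesis is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for step, i in enumerate(order):
        # The k-th smallest P-value is compared with alpha / (m - k + 1).
        if pvalues[i] <= alpha / (m - step):
            reject[i] = True
        else:
            break   # once one test fails, all larger P-values fail too
    return reject

# Thresholds for array sizes of 10 000 and 50 000 genes.
print(bonferroni_threshold(0.05, 10_000), bonferroni_threshold(0.05, 50_000))

# Hypothetical P-values from four tests.
pvals = [0.001, 0.012, 0.014, 0.2]
bonferroni_hits = [p <= bonferroni_threshold(0.05, len(pvals)) for p in pvals]
holm_hits = holm_reject(pvals)
print(bonferroni_hits, holm_hits)
```

In this small example Holm rejects one more hypothesis than Bonferroni at the same FWER, although with tens of thousands of genes both remain very strict.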
The more typical approach is to treat array experiments as exploratory science and to accept that a certain percentage of our significant results will be false positives. This naturally leads us to consider the second type of multiple testing correction technique: controlling the false discovery rate (FDR) [14]. The false discovery rate is the proportion of the differentially expressed genes generated by our statistical comparison that we expect to be false positives. For example, if our microarray analysis consists of a two-group comparison using t-tests with a significance level of 0.05 and we identify 5000 out of 50 000 genes as differentially expressed, then the FDR is approximated as (50 000 × 0.05)/5000 = 2500/5000, or 50%. This means that we expect 50% of the genes identified as differentially expressed to be false positives. This computation has been simplified for ease of understanding. It also assumes that the level of significance for the test is exactly equal to the probability of a false positive. Unfortunately, with real data this assumption is not necessarily accurate.
Significance analysis of microarrays (SAM)
One easily accessible approach to filtering with an FDR threshold is the SAM methodology, which is implemented in the SAM software package (http://www-stat.stanford.edu/tibs/SAM/) [15]. This software can handle analyses involving any of the experimental designs in Table 3.2. SAM uses permutation-type tests to determine whether genes are significantly differentially expressed. To begin, the software computes the relevant test statistic for the relationship between each gene and the sample characteristic of interest. Next, the method permutes the data and re-computes the gene-by-gene test statistics. The test statistics from the permuted data are ranked from largest to smallest. This permutation procedure is then repeated many times. Next, the average test statistic for each rank is computed over all the permutations. The average test statistic for rank r, or expected test statistic, essentially tells us how strong a relationship we expect to see by chance alone between the outcome and the gene with the rth strongest correlation. The software then allows the user to specify a threshold for significance based on the difference between the observed and expected test statistics. Based on the chosen threshold, the software computes an estimated FDR for the set of genes that exceed that threshold.
Clustering/data visualization: unsupervised learning
Clustering methods are used to examine the global expression relationships between samples as well as to identify groups of genes that have similar expression patterns across the samples. A variety of clustering methods has been proposed to analyze expression array data. Three methods that yield graphical representations of the global expression structure are hierarchical clustering, self-organizing maps (SOM), and principal components analysis (PCA). These techniques help the user to visualize the data and aid in summarizing the predominant patterns in the data. More recent methods have also been developed to identify small gene clusters that tend to be associated with important differences between groups of subjects. Clustering methods can only go so far, however, since they involve no inferences and do not identify individual genes that may be differentially expressed across experimental conditions.
Hierarchical clustering
Hierarchical clustering is useful as a rapid screen for correlated mRNA expression when little is known about the number of clusters that might be present. A gene expression dendrogram is constructed from similarity scores using hierarchical clustering software such as Bioconductor, dChip or CLUSTER (http://rana.lbl.gov/EisenSoftware.htm). To construct the dendrogram, a similarity measure between genes (across samples) and between samples (across genes) is required. Typical similarity metrics in microarray analysis are the Pearson correlation coefficient and Euclidean distance. The first step of clustering is to compute all pair-wise similarities between objects (arrays or genes). After evaluating the similarities for all pairs of objects, we can construct an n × n distance matrix (where n is the number of objects). The hierarchical algorithm proceeds as follows. First, we look for the pair of objects with the shortest distance, that is, the most similar gene expression pattern. We then construct a “composite object” by averaging (in average-linkage hierarchical clustering) all gene expression values from the two objects. This process is then iterated using a new data set containing n − 1 objects (the n − 2 objects not averaged and the composite object). The procedure is repeated until the distance matrix is reduced to a single element. The result of the hierarchical algorithm is visualized as a dendrogram, a binary tree in which each merger is represented by a branch point and the length of each branch is indicative of the distance between the two objects merged. The resulting tree and expression values can be plotted with TREEVIEW (http://rana.lbl.gov/EisenSoftware.htm). An example is displayed in Fig. 3.3.
Fig. 3.3. Hierarchical clustering of a data set. Each column represents one array and each row represents an individual gene. The binary tree structure on the left hand side indicates the relationships between the genes. A red colored cell in the grid indicates that the gene is highly expressed in the particular sample relative to the other samples. Blue indicates low relative expression. This image is based on a gene list of 630 genes filtered on the basis of present percentage, coefficient of variation, and mean expression. One prominent feature of the data that can be easily observed is that the nine arrays plotted on the left hand side have consistently lower expression over the most variable and highly expressed genes in the data set. This should prompt an investigation of the characteristics of those samples to determine what features differentiate them from the other samples. This image was created in the dChip software package.
Hierarchical clustering can be useful in two ways. First, the binary tree for the samples gives an overall representation of their relationships. This can be used to discover new relationships between samples or to determine if the global expression patterns differ across different types of samples. Second, the binary tree for the genes can identify sets of genes with similar expression patterns across the set of samples. The dendrogram, or heat map as it is often referred to in the microarray literature, can help the user to identify sets of genes that appear to be regulated differentially across the grouping of samples. One important consideration in hierarchical cluster analysis is the initial choice of the subset of genes to be included in the analysis. It is atypical to include all the genes from the arrays in the analysis, due to the massive size and complexity of the resulting output. To obtain a useful analysis while at the same time preserving the expression relationships in the array data set, it is standard practice to filter the genes. Typical gene filtering criteria for hierarchical clustering include lower bounds on the present percentage, coefficient of variation, and mean intensity. These filters remove genes that are not expressed, are expressed at a low level, or have little variability across the samples. Genes in these categories are unlikely to be relevant to the experimental objectives of the array experiment. It is recommended that these filters be adapted to produce filtered gene lists containing 500–5000 genes. Clearly, the fewer genes considered, the more accessible the information will be, while with more genes we will be able to examine larger numbers of gene-to-gene relationships. It is not recommended to filter the genes based on differential expression, as this will create hierarchical trees that contain only those genes related to our outcome of interest rather than the array data set as a whole. Often we may wish to determine if the overall gene expression patterns reflect the known experimental groupings. The Rand index [16] measures the level of agreement between predicted group assignment and the true group information.
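The merging procedure and the Rand index can be sketched together as follows. This is an illustration under stated assumptions rather than the algorithm of any particular package: Euclidean distance is used, the composite profile is a size-weighted average of the merged objects, merging stops at a chosen number of clusters instead of building the full tree, and the function names and sample profiles are hypothetical.

```python
from itertools import combinations
from math import dist

def cluster_samples(profiles, n_clusters):
    """Repeatedly merge the two closest objects into a composite object
    whose profile is the average of its members."""
    clusters = [([i], list(p)) for i, p in enumerate(profiles)]
    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ab: dist(clusters[ab[0]][1], clusters[ab[1]][1]))
        (members_a, prof_a), (members_b, prof_b) = clusters[a], clusters[b]
        na, nb = len(members_a), len(members_b)
        merged = [(na * x + nb * y) / (na + nb) for x, y in zip(prof_a, prof_b)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((members_a + members_b, merged))
    labels = [None] * len(profiles)
    for k, (members, _) in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels

def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two groupings agree, that is, pairs
    placed together in both groupings or apart in both."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)

# Hypothetical data: six samples measured on three genes, two known groups.
profiles = [[1.0, 1.2, 0.9], [1.1, 1.0, 1.0], [0.9, 1.1, 1.1],
            [3.0, 3.2, 2.9], [3.1, 2.9, 3.0], [2.9, 3.1, 3.2]]
true_groups = [0, 0, 0, 1, 1, 1]

predicted = cluster_samples(profiles, n_clusters=2)
agreement = rand_index(predicted, true_groups)
print(predicted, agreement)
```

Because the Rand index compares pairs of samples, it does not depend on how the clusters happen to be numbered.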
The Rand index varies between 0 and 1, where 1 corresponds to perfect agreement between the true and predicted classes. The expected value of the Rand index based on a class permutation analysis is approximately 0.5. We can use the Rand index to assess the level of agreement between the distinct groupings generated by the cluster analysis and the true classes of the samples.
Self-organizing maps (SOM)
An attractive alternative to hierarchical clustering for analyzing expression array data is to use self-organizing maps (SOM) [17] to generate gene clusters. SOMs are a type of mathematical cluster analysis that is particularly well suited for recognizing and classifying features in complex, multidimensional data. A SOM creates a two-level clustering of the expression values. In the lower level clustering, the genes are categorized into m nodes, where m is much smaller than the overall number of genes. Each node contains genes with similar expression patterns across the set of samples. The nodes are then arranged in a two-dimensional structure such that adjacent nodes share similar overall expression patterns. The structure of the nodes gives the higher level clustering of the gene expression patterns. This technique works best if the genes are prefiltered to exclude those that have minimal variance across the set of samples. This can be achieved by computing the coefficient of variation (the standard deviation of the expression values divided by the overall mean) for each gene and then setting a minimum coefficient of variation for inclusion in the SOM. Beyond data visualization, this technique can be useful for combining sets of genes, which can then be analyzed together as a unit. This can reduce the gene expression data set to a small number of patterns for comparison with the outcome measures of interest, and so can reduce the stringency of the multiple testing corrections needed in gene filtering. SOMs can be created using the program GENECLUSTER [18].
Principal components (PCA)
PCA is a multivariate technique that can be used to visualize multidimensional data sets. It can be used as a graphical tool to help visualize the overall gene expression relationships between the samples. PCA works by reducing the dimensionality of a data set. A simple example illustrates the technique: consider a data set with two variables. If we would like to determine how the samples relate to each other based on those variables, we make a two-dimensional scatter plot and examine how our samples group together. To perform the analogous task directly with microarray data would be impossible, as we would have to plot the samples in a space of 10–50 thousand dimensions. As this is not feasible, we must try to reduce the data to two or three dimensions that still preserve the gene expression relationships between the samples.
Mathematically, PCA involves the identification of a series of linear combinations of genes, or principal components, that best “explain” the variability in the gene expression values of a microarray data set. While PCA can be based on any set of genes, in microarray data sets it is usually performed on either the full set of genes or the subset of genes that have been found to be differentially expressed. Once the principal components are computed, we can use the components as new axes to display the locations of the samples in the new lower dimensional space. Samples appear as individual points in these plots. An example of a PCA plot is given in Fig. 3.4.
Fig. 3.4. Principal components plot of a microarray data set (axes X1.1, X1.2 and X1.3). In the figure there are six different groups represented by the different color/letter combinations. This figure was generated based on all genes on the array, indicating that there are substantial genome-wide expression differences between the samples from the different groups. We note that two of the groups, green and orange, are very distinct from the other groups. The arrow points to a purple sample that clusters with the green group. Given that the groups are otherwise quite distinct, this could be an indication of an error in sample handling.
The PCA based on all genes, or on a subset filtered as recommended for hierarchical clustering, will show the large-scale patterns in the gene expression data, which may or may not correspond to the study groupings. Poor quality or mishandled samples will typically manifest as outlier points in the PCA plot. Sample grouping patterns on the PCA plot that are unrelated to the study groupings should be investigated for relationships with any of the sample characteristics. If any relationships are found, then it is advisable to control for the effect of these characteristics in any statistical analysis. The groupings in the PCA plots can result from large-scale patterns based on large numbers of genes, or from a small number of highly variable genes.
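The dimension reduction step can be sketched without any numerical library by working with the sample-by-sample Gram matrix, which stays small even when there are tens of thousands of genes. The function name `pca_scores`, the power iteration with a fixed start vector, and the data are assumptions for illustration; a production analysis would normally use a singular value decomposition from a numerical package.

```python
def pca_scores(x, n_components=2, n_iter=500):
    """Sample coordinates on the leading principal components.

    x: rows are samples, columns are genes. Each gene is centered, then the
    leading eigenvectors of the n_samples x n_samples Gram matrix are found
    by power iteration with deflation.
    """
    n, p = len(x), len(x[0])
    col_means = [sum(row[j] for row in x) / n for j in range(p)]
    xc = [[row[j] - col_means[j] for j in range(p)] for row in x]
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in xc] for r1 in xc]
    scores = [[0.0] * n_components for _ in range(n)]
    for k in range(n_components):
        # Fixed start vector for reproducibility; a random start is more robust.
        v = [float(i + 1) for i in range(n)]
        for _ in range(n_iter):
            w = [sum(g * vi for g, vi in zip(row, v)) for row in gram]
            norm = sum(wi * wi for wi in w) ** 0.5
            if norm == 0:
                break
            v = [wi / norm for wi in w]
        # Eigenvalue estimate (Rayleigh quotient) for the current direction.
        lam = sum(vi * sum(g * vj for g, vj in zip(row, v))
                  for vi, row in zip(v, gram))
        for i in range(n):
            scores[i][k] = v[i] * max(lam, 0.0) ** 0.5
        # Deflate so the next pass finds the next component.
        gram = [[gram[i][j] - lam * v[i] * v[j] for j in range(n)]
                for i in range(n)]
    return scores

# Hypothetical data: six samples (two groups of three) measured on four genes.
data = [[1.0, 1.2, 0.9, 5.0], [1.1, 1.0, 1.0, 5.2], [0.9, 1.1, 1.1, 4.9],
        [3.0, 3.2, 2.9, 1.0], [3.1, 2.9, 3.0, 1.1], [2.9, 3.1, 3.2, 0.8]]
scores = pca_scores(data)
for sample_scores in scores:
    print([round(s, 2) for s in sample_scores])
```

Plotting the two columns of `scores` against each other gives the kind of display shown in Fig. 3.4; the sign of each component is arbitrary.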
Statistical analysis of gene expression data
The plot for a PCA that is based only on differentially expressed genes will generally show groupings that reflect the criteria used to identify those genes. For example, if we have a data set with cancer and normal samples and we use the gene filtering steps identified above to create a gene list for the PCA, we expect that the PCA plot will show substantial differences between the cancer and normal groups. Because of this, these plots should not be used to assess whether the expression data differentiate the samples by their groupings. The main utility of this type of analysis is to identify individual samples that do not fit the pattern (for example, cancer samples that group with the normal samples).
Additional clustering techniques
Two additional unsupervised clustering approaches are the “gene shaving” technique suggested by Tibshirani and Hastie (1999) [19] and the plaid models suggested by Lazzeroni and Owen (2000) [20]. Gene shaving seeks to find subsets of genes whose expression levels vary most across experimental conditions. This method has, in previous studies, isolated small clusters of genes that exhibit high correlation with clinical outcomes. The small gene clusters identified by this method can be analyzed more extensively through a correlation study with the experimental conditions used in the design. The plaid model technique seeks to find sets of genes that have similar expression over blocks of subjects. This technique can be used to establish which sets of genes have differential expression across the groups of subjects.
Classification/prediction: supervised learning
Many prediction techniques are discussed in the literature on the statistical analysis of microarrays. The most prominent classes of techniques are regression, classification trees, nearest neighbor prediction, linear discriminant analysis (LDA), support vector machines (SVM) and neural networks (NN). These are general categories of prediction methodologies, and for each category there are many types of models and options to select from. Regardless of the type of prediction model chosen, it is essential to perform model validation by cross-validation, by resampling, or by testing on independent samples. Each modeling method consists of two stages, model building and model assessment.
Prediction models
Logistic regression
Logistic regression is a technique that uses linear combinations of genes to predict the probability that the samples have a certain characteristic. This technique is most relevant to microarray experiments with two-group experimental designs, such as cancer/normal or treated/untreated. The models can only be built with a small number of genes and therefore require careful gene selection and filtering. The standard method for creating prediction models using logistic regression is to first construct the model using the full set of genes of interest. Stepwise model selection criteria can then be used to remove variables one by one to try to find the optimal set of predictors in the model. The user must be careful to avoid including genes that are highly correlated with each other in these models, as these can invalidate the statistical tests used to assess the significance of the individual genes. An advantage of this approach is that it produces a simple, interpretable model with statistical inferences for each of the genes in the model. A disadvantage of regression modeling is that it relies on a variety of modeling assumptions that may not be correct. Despite this reliance, logistic regression has been shown to be a very effective classification technique in comparison to other multivariate prediction techniques [21]. Another disadvantage is that these models require the user to specify the (small) set of genes to be considered in the model as well as any gene-by-gene interaction terms. In the case of continuous outcomes, linear regression techniques can easily be substituted for logistic regression, with much the same set of advantages and disadvantages.
Classification and regression trees (CART)
CART [22] is a decision tree-based classification method that searches the predictors for cut-point values that best separate the samples into the two (or more) groups. This technique creates a classifier that can easily handle non-linearity and multiple interactions which cannot be easily accounted for by regression models. Furthermore, the CART method produces the most easily interpretable model of the classification techniques under consideration. The CART analysis can be done with the specialty software CART (Salford Systems) or using the function rpart in R. The CART
software integrates cross-validation and model assessment tools that can be used to compute the error rates for each model.
Random forests
The random forest [23] method is somewhat similar to CART in that it constructs tree-based classification models by iteratively searching for cut-points in the set of predictor variables. Instead of a single tree, however, this technique generates large numbers of trees by randomly sampling from the observed data. Each random sample omits some of the original samples, and each tree then “votes” on the predictions for the samples that were left out when it was built. The majority vote across the set of trees for these left-out samples allows us to compute the error rate for the forest predictor. Prediction of test data is then done by majority vote over the predictions from the set of trees. Advantages of this technique include high accuracy, robustness to noise and outliers, and internal estimates of error. A disadvantage of forest predictors is reduced interpretability compared with CART models.
K-nearest neighbor (KNN)
Nearest neighbor classification methods produce classification decisions based on the classes of other samples with similar expression patterns. A simple but useful algorithm of this type is KNN [24], in which we identify the k nearest neighbors to a sample and then decide the classification for that sample by a majority vote of the neighbors. The challenge for implementation is how these neighbors are defined. A practical method for microarrays is to use gene filtering first to select genes that are differentially expressed between the groups. Next, we can compute the Euclidean distance between the sample of interest and all other samples based on the filtered genes. The k samples that have the smallest distance to the sample of interest form the set of nearest neighbors. The choice of k has a substantial effect on the resulting classification decisions. General tips for choosing k include using odd-valued k to limit the possibility of tied voting and basing the selection on the sample sizes in each group. Values for k that range from 10% to 50% of the group sample sizes are likely to be reasonable choices. Alternatively, cross-validation can be used to assess the optimal value for k.
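The KNN procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only: the two-gene expression values are invented, the function names are hypothetical, and the genes are assumed to have been filtered already. Because the filtering would have used all samples, the leave-one-out error computed here is optimistic in the same way discussed for cross-validation in general.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(sample, training, labels, k=3):
    """Classify `sample` by majority vote among its k nearest training samples."""
    ranked = sorted(range(len(training)),
                    key=lambda i: euclidean(sample, training[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

def loo_error(training, labels, k):
    """Leave-one-out cross-validated error rate for a given k."""
    errors = 0
    for i in range(len(training)):
        rest = training[:i] + training[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_classify(training[i], rest, rest_labels, k) != labels[i]:
            errors += 1
    return errors / len(training)

# Invented toy data: 8 samples measured on 2 pre-filtered genes.
training = [
    [8.0, 2.1], [7.6, 2.5], [8.3, 1.9], [7.9, 2.2],  # cancer
    [4.1, 6.0], [3.8, 5.7], [4.4, 6.2], [4.0, 5.9],  # normal
]
labels = ["cancer"] * 4 + ["normal"] * 4

new_sample = [7.5, 2.8]
prediction = knn_classify(new_sample, training, labels, k=3)
print("Predicted class:", prediction)

# Compare odd values of k by leave-one-out error.
errors_by_k = {k: loo_error(training, labels, k) for k in (1, 3, 5)}
print("Leave-one-out error by k:", errors_by_k)
```

Odd values of k are used so that a two-class vote cannot tie, following the guidance in the text.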
Linear discriminant analysis (LDA)
Linear discriminant analysis is a well-established statistical classification technique that is very useful for developing microarray prediction models. The method identifies linear combinations of genes that have large ratios of between-group to within-group variability. Put another way, this criterion finds weighted sums of genes in which the difference between groups is large in comparison to the within-group variance. The linear combinations produced by this analysis are referred to as discriminant variables. The number of discriminant variables created is the number of distinct groups minus one. Classification is performed by computing the Euclidean distance between each sample and the mean of each group in the space of the discriminant variables and then assigning the sample to the group whose mean is closest. It is suggested that some filtering be performed prior to application of this technique to remove genes with low expression and/or low present percentages. An advantage of this technique is that it can make use of large numbers of genes to make predictions. A disadvantage is the lack of interpretability of linear combinations of large numbers of genes.
Support vector machines (SVMs)
Another approach for creating prediction models is the use of support vector machines (SVMs) [25, 26]. SVMs are a multivariate statistical technique used to analyze large numbers of variables that are observed in a relatively small number of samples. This method finds the optimal hyperplane in the space of the gene expression values for differentiating the samples based on the characteristic of interest. Support vector regression allows for the analysis of gene expression in relation to survival as a continuous variable. Advantages of SVMs are that they can make use of large numbers of genes simultaneously and will generally produce strong prediction models. Disadvantages include lack of interpretability and a tendency to overfit the data.
Neural networks
Neural networks are machine learning classification tools that represent the relationship between the expression values and the true classes by a network of connections and nodes. The network consists of the inputs, in this case gene expression data, which are then connected to one or more hidden layers
of nodes which are then connected to the output layer of units, one for each of the possible outcome categories. The connections among inputs, nodes and outputs have weights which are iteratively adjusted to improve the overall prediction of the outcome categories. This process is repeated until a vector of weights is found that best fits the data. There is considerable flexibility in the construction of these networks, in that the user can select the number of hidden layers to use and the number of nodes per layer. These networks tend to produce very accurate models; however, they have a strong tendency to overfit the data. They also tend not to be biologically interpretable.
Summary of prediction methods
As with each element of microarray analysis, we have many possible choices, each with its own advantages and disadvantages. Table 3.3 summarizes some of the key attributes of the prediction methods we have discussed. The proper choice of prediction method will depend on the scientific questions under consideration and the types of error that can be tolerated. For example, if the objective is to isolate a small set of genes that, when used in combination, can predict an outcome measure, then regression or classification trees would be appropriate. If, instead, the objective is to determine whether we can make outcome predictions on the basis of the overall gene expression patterns, then LDA, SVM or neural networks might be appropriate.
Model assessment and validation
Once we have constructed our prediction model by whichever technique is appropriate, the next step is model assessment and validation. In the model assessment stage we can evaluate metrics such as sensitivity, specificity, accuracy, prediction error, and the area under the curve (AUC) of a receiver operating characteristic (ROC) curve to determine how well the prediction model is able to classify our samples. Because these multivariate techniques have thousands of genes to choose from, we can expect them to create a model that appears to produce strong predictions on the samples used to build it. To truly assess microarray prediction models there are a number of methods we can use, including cross-validation, permutation tests, and training/test sets.
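The assessment metrics named above can be computed directly from the true classes, the predicted classes and the model's scores. The Python sketch below is a minimal illustration: the labels and scores are invented, the 0.5 cut-off is an arbitrary assumption, and the function names are hypothetical. The AUC is computed as the proportion of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
def classification_metrics(truth, predicted, positive):
    """Sensitivity, specificity and accuracy for a two-class problem."""
    tp = fn = tn = fp = 0
    for t, p in zip(truth, predicted):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(truth),
    }

def roc_auc(truth, scores, positive):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(truth, scores) if t == positive]
    neg = [s for t, s in zip(truth, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example: predicted probability of "cancer" for 10 samples.
truth = ["cancer"] * 5 + ["normal"] * 5
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.5, 0.1, 0.35]
predicted = ["cancer" if s >= 0.5 else "normal" for s in scores]

metrics = classification_metrics(truth, predicted, positive="cancer")
auc = roc_auc(truth, scores, positive="cancer")
print(metrics)
print("AUC:", auc)
```

Computed on the samples used to build the model, these values describe apparent performance only, which is why the validation methods that follow are needed.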
Table 3.3. Summarization of the properties of prediction techniques for microarray analysis
Method                Filtering required   Predictive ability   Tendency to overfit   Interpretability
Clustering            Medium               Low                  Low                   Low
Logistic regression   High                 Medium               Medium                High
Classification tree   Medium–high          Medium               Medium                High
KNN                   Medium–high          Medium               Low                   Medium
LDA                   Low                  Medium               Low                   Low
SVM                   Low                  High                 High                  Low
Neural networks       Low                  High                 High                  Low
The filtering required column indicates whether a technique can build models on the basis of many genes (low filtering), or on the basis of a highly selected subset of genes (high filtering). Predictive ability indicates how well these methods are able to create a decision rule that will accurately predict the training samples. Tendency to overfit indicates the likelihood that the model created will overfit the data. Interpretability indicates how easy the resulting prediction model will be to interpret and whether it will help to identify small combinations of genes that can be used to predict the sample characteristics.
Cross-validation involves an iterative process that starts by randomly removing a fixed number of samples from our data set. Next, the prediction model is fit on the remaining set of samples. This prediction model is then applied to the samples that have been left out. This process is then iterated many times. The percentage of times that the prediction model fails to accurately predict the left-out samples is the cross-validated error rate of the model. The number of samples to be left out is typically much smaller than the overall sample size, with 1, 5, and 10 being popular choices. While the cross-validated error rate is a more relevant model assessment metric than the error rate computed using all samples, it is not a perfect solution. One problem is that the prediction models are typically based on a filtered gene list. If we filter genes after sample removal, then each prediction model will be constructed from a different set of genes. If we filter prior to sample removal, then we do not have a perfect test of the model, as the left-out samples still contribute to the determination of the filtered set.
Permutation tests are a method by which we can compute reference distributions for model assessment metrics. For example, we can use permutation tests to assess whether the observed accuracy is larger than we
would expect by chance. This allows us to determine whether seemingly strong prediction results are due to overfitting and data mining or to actual relationships between gene expression and the sample characteristics. The implementation of a permutation test begins by randomly rearranging the sample characteristics while leaving the gene expression values unchanged. This removes any true relationships between the gene expression and the sample characteristics. Next, we perform our gene filtering and prediction modeling steps and compute the values of our model assessment metrics. This procedure is then repeated many times. We can then compare our observed value for the assessment metric with the distribution of values obtained from the random permutations. If our observed value is larger than the 95th percentile of the permutation distribution, we can be confident that it is unlikely to have arisen by chance alone. We can compute a permutation P-value for our observed assessment metric as the percentage of permuted values that exceed the observed value.
The last and most powerful method for unbiased model assessment is to divide the samples randomly into training and test sets. In this paradigm, we use the training samples to construct a prediction model. The next step is to apply the prediction model to the data from the test set and compute the predicted classification for each sample. Based on the predicted classifications and the true outcomes, we can assess the sensitivity, specificity and accuracy of the prediction model. One important note is that we should ensure that each group is proportionally represented in the training and test sets. The percentage of the samples in the training set is typically larger than 50% (67% and 75% are popular choices), while a sufficient number of test cases must be retained to obtain some degree of precision in our estimate of the prediction accuracy.
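The permutation procedure described above can be sketched as follows. This Python example rests on several assumptions made purely for illustration: the data are invented, a nearest-centroid classifier stands in for whichever prediction model is actually used, leave-one-out accuracy is the assessment metric, and no gene filtering is performed. In a real analysis the filtering step would be repeated inside every permutation, as the text specifies. The sketch also counts permuted values that equal or exceed the observed value, a slightly more conservative convention than counting only those that exceed it.

```python
import random

def nearest_centroid_predict(train, train_labels, sample):
    """Assign `sample` to the class whose mean expression profile is closest."""
    centroids = {}
    for label in set(train_labels):
        rows = [r for r, l in zip(train, train_labels) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # sorted() makes tie-breaking deterministic.
    return min(sorted(centroids), key=lambda lab: sq_dist(sample, centroids[lab]))

def loo_accuracy(data, labels):
    """Leave-one-out accuracy of the nearest-centroid classifier."""
    correct = 0
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train, train_labels, data[i]) == labels[i]:
            correct += 1
    return correct / len(data)

def permutation_p_value(data, labels, n_permutations=200, seed=0):
    """Observed accuracy and the share of label shuffles that match or beat it."""
    rng = random.Random(seed)
    observed = loo_accuracy(data, labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break any true link between labels and expression
        if loo_accuracy(data, shuffled) >= observed:
            at_least_as_good += 1
    return observed, at_least_as_good / n_permutations

# Invented toy data: 8 samples by 3 genes.
data = [
    [8.0, 2.0, 5.1], [7.7, 2.3, 4.9], [8.2, 1.8, 5.0], [7.9, 2.1, 5.2],
    [4.1, 6.0, 5.0], [3.9, 5.8, 4.8], [4.3, 6.1, 5.1], [4.0, 5.9, 5.0],
]
labels = ["cancer"] * 4 + ["normal"] * 4

observed, p_value = permutation_p_value(data, labels)
print("Observed leave-one-out accuracy:", observed)
print("Permutation P-value:", p_value)
```

A small P-value indicates that the observed accuracy is rarely matched when the labels carry no information about expression.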
The training/test approach is the best method for obtaining independent model assessment, although it is not always practical because of the small sample sizes in many microarray studies.
Experimental design
Given the low power and multiple comparisons in microarray studies, proper consideration of the experimental design prior to running the arrays is an absolute necessity. Although experimental design issues deserve a chapter of
their own, here we briefly discuss only three issues: dye-swap designs for spotted arrays, sample allocation, and sample size computations.
Spotted array design
Experimental design is most complex for two-color spotted array experiments because the two dyes respond differently and because each array yields relative rather than stand-alone measurements. The simplest and earliest design for these experiments consisted of running each sample in one channel of its own array and using a common reference sample in the other channel of every array. The advantage of this design is that it effectively normalizes the expression of every gene across the set of samples. The disadvantage is that the arrays are used inefficiently: we obtain one measurement of each sample, which is the measure of interest, and many measurements of the common reference, in which we have no interest. Alternatives to this simple design are a dye-swap design, in which every sample is run on two arrays, once in the red channel and once in the green, and a more complex dye-swap method called the loop design [27]. These designs correct for dye biases while giving multiple measures of the samples of interest, which decreases the experimental variability.
Sample allocation
There are a few simple principles for obtaining the best power given limitations in the overall number of arrays that can be run. First, in almost every case biological replication (multiple subjects per experimental condition) results in higher statistical power than technical replication (multiple arrays per subject). Second, it is usually best to have balanced designs in which the subjects (or arrays) are equally divided between the experimental conditions. For experiments with multiple experimental variables (for example, cell type and treatment), factorial designs can improve balance and simplify statistical approaches.
Sample size
The number of samples per group is typically based on funding/sample limitations rather than statistical considerations due to the exploratory nature of these studies. However, if we are looking for specific effects or
specific genes, running a power analysis can aid in determining the likelihood (power) of obtaining the results of interest given our likely restrictions in sample size. If we can make some assumptions about what we expect or hope to observe in the study, then the general sample size and power software nQuery Advisor 6.0 (www.statsolusa.com) has two tables for the computation of power and sample size for detection of specified fold-changes. If our array study is more exploratory in nature, there are still a number of practical rules of thumb for sample size selection. First, it is advisable to have no fewer than three chips per group to enable minimally adequate variance estimation. Second, if prediction modeling will be a goal of the microarray analysis, a minimum of 12 samples per group should be used. Unless we have at least 12 per group, there is a greater than 5% chance that at least one gene will appear to be not only a significant predictor but a perfect predictor, even if no genes are correlated with the sample grouping.
REFERENCES
1. Yang, Y. H., Buckley, M. J., Dudoit, S., and Speed, T. P. Comparison of methods for image analysis on cDNA microarray data. UC Berkeley Technical Report 584, November 2000.
2. Yang, Y. H., Dudoit, S., Luu, P., and Speed, T. P. Normalization for cDNA microarray data. UC Berkeley Technical Report, December 2000.
3. Affymetrix. Affymetrix Microarray Suite User Guide, Version 4 edn. Affymetrix, Santa Clara, CA, 1999.
4. Affymetrix. Statistical Algorithms Description Document. Affymetrix, Santa Clara, CA, 2002.
5. Li, C. and Wong, W. H. Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc. Natl Acad. Sci. 2001; 98: 31.
6. Irizarry, R., Bolstad, B., Collin, F., Cope, L., Hobbs, B., and Speed, T. Summaries of Affymetrix GeneChip probe level data. Nucl. Acids Res. 2003; 31(4): e15.
7. Zhang, L., Miles, M., and Aldape, K. A model of molecular interactions on short oligonucleotide microarrays. Nat. Biotechnol. 2003; 21(7): 818–21.
8. Wu, Z., Irizarry, R., Gentleman, R., Murillo, F., and Spencer, F. A model based background adjustment for oligonucleotide expression arrays. J. Am. Statist. Assoc. 2005; 99(468): 909–17.
9. Shedden, K., Chen, W., Kuick, R. et al. Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data. BMC Bioinformatics 2005; 6: 26.
10. Elashoff, D., Oh, M., Brown, N., Li, Y., Wong, D. T., and Horvath, S. Empirical study of the influence of expression index on the standard statistical analysis of oligonucleotide microarray data. Manuscript under review.
11. Rosati, B., Frau, F., Kuehler, A., Rodriguea, S., and Mckinnon, D. Comparison of different probe-level analysis techniques for oligonucleotide microarrays. BioTechniques 2004; 36(2): 316–22.
12. Cole, S. W., Galic, Z., and Zack, J. A. Controlling false-negative errors in microarray differential expression analysis: a PRIM approach. Bioinformatics 2003; 19(14): 1808–16.
13. Dudoit, S., Yang, Y. H., Callow, M. J., and Speed, T. P. Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. UC Berkeley Technical Report 578, 2000.
14. Benjamini, Y. and Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc., Series B 1995; 85: 289–300.
15. Tusher, V., Tibshirani, R., and Chu, G. Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl Acad. Sci. USA 2001; 98(9): 5116–21.
16. Hubert, L. and Arabie, P. Comparing partitions. J. Classifications 1985; 2: 194–218.
17. Tamayo, P., Slonim, D., Mesirov, J. et al. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc. Natl Acad. Sci. USA 1999; 96: 2907–12.
18. Golub, T. R., Slonim, D. K., Tamayo, P. et al. Molecular classification of cancer: class prediction by gene expression monitoring. Science 1999; 286: 531–7.
19. Tibshirani, R., Hastie, T., Eisen, M., Ross, D., Botstein, D., and Brown, P. Clustering methods for the analysis of DNA microarray data. Stanford Tech. Report October 1999.
20. Lazzeroni, L. and Owen, A. B. Plaid models for gene expression data. Statist. Sinica 2002; 12: 61–86.
21. Terrin, N., Schmid, C., Griffith, J., D’Agostino, R., and Selker, H. External validity of predictive models: a comparison of logistic regression, classification trees, and neural networks. J. Clin. Epidemiol. 2003; 56: 721–729.
22. Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. Classification and Regression Trees. New York: Chapman & Hall, 1984.
23. Breiman, L. Random forests, random features. Department of Statistics, UC Berkeley Technical Report 567, 1999.
24. Cover, T. and Hart, P. Nearest neighbor pattern classification. IEEE Trans. Information Theory 1967; 13(1): 21–7.
25. Vapnik, V. N. The nature of statistical learning theory. 2nd edn. In Statistics for Engineering and Information Science. New York: Springer, 2000, xix, 314 p.
26. Guyon, I., Weston, J., Barnhill, S., and Vapnik, V. Gene selection for cancer classification using support vector machines. Machine Learning 2002; 46: 389–422.
27. Kerr, M. K. and Churchill, G. A. Experimental design for gene expression microarrays. Biostatistics 2001; 2(2): 183–20.
4
Genomic stratification in patients with heart failure
Tara A. Bullard,1 Frédérick Aguilar,1 Jennifer L. Hall,2 and Burns C. Blaxall1
1 Cardiovascular Research Institute, University of Rochester Medical Centre, NY
2 Lillehei Heart Institute, University of Minnesota, Minneapolis, MN
Introduction
Cardiovascular disease, or heart failure (HF), continues to be the leading cause of death worldwide, having surpassed infectious disease in the 1990s. Current estimates indicate that chronic HF affects 1–2% of the total population of developed countries [1]. Patients with HF face a dismal prognosis: 5-year survival following diagnosis of any HF is approximately 50%, and 1-year survival for those with end-stage disease is less than 50% [2–4]. In the United States alone, there are approximately 550 000 newly diagnosed cases of HF per year, with numbers continually rising [4]. Furthermore, recent predictions suggest that HF will become the leading cause of all disability by 2020. HF is a complex and progressive disease with numerous etiologies that involve environmental, genetic and genomic factors. While progress has been made in identifying components that may contribute to HF, our current understanding of the molecular underpinnings of HF remains remarkably limited. Although treatment modalities of recent years have improved disease prognosis, novel insights are required to further enhance duration and quality of life for patients suffering from this debilitating disease. Compounding the complex nature of HF is the recent suggestion that the adult heart expresses as many as 10 000 genes. To unravel the molecular complexities of HF, the rapidly developing field of gene expression profiling by microarrays provides an excellent means by which to investigate genome-wide differences in gene expression profiles associated with cardiovascular disease. In this chapter we review some of the recent insights into HF in both animal models and human disease achieved by microarray gene expression profiling.
Gene Expression Profiling by Microarrays: Clinical Implications, ed. Wolf-Karsten Hofmann. Published by Cambridge University Press 2006. © Cambridge University Press 2006.
Animal models of HF
Numerous genetically engineered animal models of HF and cardiomyopathy have been described, often discovered serendipitously following knockout or overexpression of a particular gene. Often, these mouse models closely recapitulate many of the aspects of human HF, such as hypertrophy, dilation, and aberrations of cardiac signal transduction cascades, including the β-adrenergic receptor (β-AR) signaling system. These models provide powerful tools to investigate the molecular underpinnings of HF at the level of gene expression profiling. Importantly, a limited number of genetic, pharmacologic and surgical interventions have been described in animal models that can cause or reverse HF, providing important insight into the development, progression and rescue of HF. Below, we discuss animal models of HF in which gene expression patterns have been explored by microarray technology.
Hypertrophy and HF
Redfern and colleagues [5] were among the first to report on the use of microarrays in gene expression profiling of HF. They had created an inducible, synthetic ligand-stimulated Gi-coupled receptor that could lead to the induction or regression of hypertrophy and HF. Using oligonucleotide microarrays, they demonstrated significant changes in overall gene expression in both the development and regression of HF. This group later developed GenMAPP, a tool to visualize and analyze gene expression data at the level of manually curated signal transduction pathways and gene ontology families [6]. Subsequently, Aronow and colleagues [7] reported on gene expression profiling in four different animal models of hypertrophy and HF using microarrays. From this study, in particular from the Gq cardiac overexpression model of HF, they went on to implicate the mitochondrial “death protein,” NIX, in apoptotic cardiomyopathy and death [7]. Utilizing transgenic mice, they unexpectedly found that sNIX, a previously unknown splice variant of NIX, protected the animals against myocyte apoptosis [8].
Akt and apoptosis
Excessive apoptosis (programmed cell death) may be a component in the pathogenesis of numerous cardiovascular diseases and particularly HF.
Indeed, myocyte cell loss through apoptosis has been reported in end-stage heart failure [9]. Chronically elevated plasma catecholamines lead to overstimulation of cardiomyocyte β1-ARs and subsequent apoptosis [10]. The serine-threonine kinase Akt is activated by many cardioprotective ligand-receptor systems, including insulin, insulin-like growth factor and estrogen [11–13]. Acute activation of Akt protects cardiomyocytes from apoptosis [14]. Cardiac-specific overexpression of a constitutively active mutant of Akt (myr-Akt) led to dramatic cardiac hypertrophy with preserved systolic function and protection from ischemia-reperfusion injury [15]. The investigative group of Antony Rosenzweig recently used microarrays to analyze the effects of cardiac Akt overexpression on the expression of 11 000 genes in the hearts of myr-Akt mice compared with non-transgenic littermates [16]. In this preliminary study, they used pooled RNA from whole heart. It is important to note that the use of pooled RNA samples may dilute detection of significant low-expressing genes and increase false positive and false negative results. They showed that chronic Akt activation modified the expression of more than 40 genes in the heart. Interestingly, they found an Akt-dependent upregulation of the anti-apoptotic molecule IGFBP-5 in their transgenic mice. This result suggests that IGFBP-5 could play a role in the cytoprotective effect of Akt in the heart.
Tumor necrosis factor-α (TNF-α)
Tumor necrosis factor-α (TNF-α) demonstrates elevated expression in failing human hearts. Cardiac-specific TNF-α overexpressing mice developed dilated cardiomyopathy (DCM) and were determined to be a suitable model for congestive heart failure [17]. Later studies demonstrated that there was extensive extracellular matrix (ECM) remodeling in the TNF-α transgenic mice, namely increased matrix metalloproteinase-2 (MMP-2) and MMP-9 activity, increased collagen synthesis, deposition and denaturation, and decreased undenatured collagens [18]. Recently, microarray technology was utilized to compare gene profiles of compensated and decompensated hearts in TNF-α overexpressing mice [19]. Analysis showed that more than 50 genes were differentially expressed in the TNF-α mice, and there was a significant increase in immune response genes as well as genes regulating collagens, MMPs, and transforming growth factor-β [19].
α1b-adrenergic receptor (α1b-AR)
Both α- and β-AR have been implicated in the pathological onset and progression of HF. To date, only the α-AR transgenic animal model of HF has been investigated utilizing microarrays. Cardiac-directed expression of a constitutively active form of the α1b-AR has been demonstrated to result in myocardial hypertrophy [20]. Wang et al. [21] found that transgenic mice overexpressing the constitutively active form of α1b-AR and subjected to pressure-overload hypertrophy resulting from trans-aortic coarctation (TAC) had increased heart weights, lung weights and atrial natriuretic peptide mRNA, suggesting that α1b-AR regulates hypertrophy. Recently, Yun et al. [22] examined gene expression in these mice before the onset of disease and when the animals achieved significant hypertrophy. The transgenic hearts demonstrated a 60% increase in Src gene expression. Cardiac gene expression was decreased for common hypertrophy-inducing proteins such as actin, collagen and GP130 pathways.
Atrial natriuretic peptide (ANP) ablation
Atrial natriuretic peptide (ANP) is synthesized and released by the heart during pressure and volume overload, pathologic and physiological cardiac hypertrophy, HF and hypertension, and its elevated expression is a hallmark of HF. ANP deficient mice demonstrated the physiologically protective role of ANP elevation, in that the mice exhibit cardiac hypertrophy at baseline and display an exaggerated pathological response to TAC. Microarrays were utilized to identify genes associated with pressure overload resulting from TAC in the presence or absence of ANP. The data revealed increased expression of several ECM molecules and regulatory proteins in ANP deficient mice following TAC as compared to sham operated controls [23].
Development, progression and rescue of HF in animal models
Several animal models of DCM that demonstrate hallmark aberrations of β-AR signaling known to occur in human HF have been described. Ablation of the muscle LIM protein (MLP), a cytoarchitectural gene, results in the disruption of cardiomyocyte cytoarchitecture and DCM [24]. Cardiac-directed overexpression of calsequestrin (CSQ), a high-capacity calcium binding protein, leads to rapid onset DCM and enhanced
T. A. Bullard, F. Aguilar, J. L. Hall, and B. C. Blaxall
mortality [25]. Both animal models develop HF concordant with hallmark abnormalities in β-AR signaling, including decreased β-AR expression and increased expression of the β-AR kinase (βARK), an enzyme responsible for phosphorylating and desensitizing β-ARs in animal and human HF. Previous work from our group had demonstrated that inhibiting the activity of βARK by a dominant negative peptide (βARKct) could normalize β-AR signaling, cardiac function and survival in these animal models [26, 27]. As such, the MLP-/-/βARKct and CSQ/βARKct animals provided excellent models with which to investigate the development, progression and “rescue” of HF. We recently performed a robust analysis of LV gene expression by oligonucleotide microarray in these mouse models of HF and transgenic “rescue.” We were able to clearly distinguish between normal, HF and “rescue” based on gene expression profile. Importantly, the substantial sample size (n = 53) allowed us to develop a novel statistical algorithm that accurately predicted cardiac phenotype based solely on gene expression. Furthermore, we were able to accurately predict both early- and late-stage HF based on gene expression (Fig. 4.1) [28]. With an appropriate sample size, this exciting method has identified specific genes predictive of a particular cardiac phenotype that can be further evaluated as novel diagnostic or therapeutic targets in HF. Bearing in mind the important caveat that these studies were performed in genetically engineered animal models, they suggest that early-stage HF gene expression is highly similar to and potentially predictive of advanced-stage HF, indicating early activation of the HF gene expression profile in the progression of the disease. In another study of cardiac gene expression in the development of HF, Ueno and colleagues [29] investigated gene-expression changes in Dahl salt-sensitive hypertensive rats during progression from hypertension to hypertrophy and HF.
They reported that a transcription factor, apoptosis-related D-binding protein (DBP), was significantly downregulated during the progressive development of HF, leading them to propose that this gene may be cardioprotective.
Fig. 4.1. Classification and prediction of both early and end-stage HF as end-stage HF phenotype by gene expression. (a) Multidimensional scaling image (axes D1 and D2; series CSQa, CSQy, HF, MLPa, MLPy, Normal) demonstrating Normal (green) and HF (red) distinction, along with the novel HF classification of both the early stage HF MLP-/- (purple) and CSQ (light blue), as well as the end-stage HF MLP-/- (dark blue) and CSQ (yellow). (b) Blind prediction (classification probability, 0.0 to 1.0, by sample) of both early stage and advanced stage HF as end-stage HF phenotype based on the normal vs. HF data set; a, end-stage HF; Y, early stage HF; Normal, dark blue; HF, light blue; early stage HF, red and yellow; end-stage HF, magenta and rust. Reproduced with permission from [28].
Myocardial infarction
Sehl et al. [30] utilized microarrays to determine changes in gene expression in response to myocardial infarction (MI) in rats [30]. Ventricular samples
were obtained from rats at multiple time points up to 84 days following MI induced by coronary artery ligation [30]. The MI model has been previously characterized to display an early wound healing response followed by a progression to HF [31]. During the early phase, matrix related proteins such as osteopontin, collagen III and fibronectin were elevated, but then returned to baseline at 6 weeks postinjury [30]. Conversely, ANP and brain natriuretic peptide (BNP) were elevated during the early phase but remained elevated at 12 weeks following injury [30]. These results demonstrate the utility of gene expression profiling to investigate the development and progression of HF.
Pressure overload
Pressure overload, induced experimentally by TAC, results in left ventricular hypertrophy acutely and can lead to DCM and HF chronically. Several groups have examined changes in gene expression following TAC by microarray analysis. Zhao et al. [32] induced compensated pressure overload hypertrophy in mice by TAC and collected tissue 48 hours, 10 days, and 3 weeks postsurgery. Analysis revealed an upregulation of 38 genes at 48 hours, 269 genes at 10 days, and 203 genes at 3 weeks and downregulation of 15 genes at 48 hours, 160 genes at 10 days, and 124 genes at 3 weeks, with the altered genes being derived from 12 different functional groupings [32]. Using a more comprehensive approach, Wagner et al. [33] examined changes in gene expression within each of the four chambers of the heart following TAC in mice. The gene expression profile of post-TAC left atria clustered with the gene expression profiles of left ventricles from post-TAC mice, and the post-TAC right ventricles clustered with those in the sham-operated control mice [33]. Therefore, it is generally preferable to use a consistent sample location, preferably left ventricular free wall samples, instead of whole heart, due to chamber-specific heterogeneity of gene expression.
Pharmacologic induction of heart failure
Friddle et al. [34] investigated genes associated with the onset and regression of isoproterenol- and angiotensin II-induced hypertrophy. They found that 25 known genes and 30 novel genes were altered during induction and regression of hypertrophy. Upon further analysis they found that 32 of those
genes were specific to induction of hypertrophy and 8 genes were specific to regression of hypertrophy. The pyrrolizidine alkaloid monocrotaline (MCT) is another method used to pharmacologically induce hypertrophy, although it is often associated with pulmonary right ventricular dysfunction. MCT treatment causes compensated or decompensated hypertrophy in a dose-dependent manner in Wistar rats. Following 2 weeks of MCT treatment, gene expression profiles demonstrated that decompensated hearts showed activation of pro-apoptotic pathways, whereas compensated hearts showed a reduction in pro-apoptotic signalling via p38 MAPK, through upregulation of MKP1 [35].
Animal models of cardiac conductance abnormality
Numerous microarray studies have been performed in diseases related to HF, such as coronary artery disease and myocardial conductance abnormalities. To demonstrate the utility of microarray technologies in cardiovascular disease beyond congestive HF, we will discuss a single microarray study in a porcine model of atrial fibrillation. Analysis revealed 387 genes from the left atrium and 81 genes in the right atrium were altered, with the greatest overall change in myosin regulatory light chain (MLC-2V) [36]. Microarrays have also been used to study an experimental mouse model of Chagas’ Disease [37]. Differentially expressed genes were divided into their functional groups, based on gene ontology, which included transcription, intracellular transport, structure/junction/adhesion or extracellular matrix, signaling, host defense, energetics, metabolism, cell shape, and death [37].
A mouse heart gene expression database
The mouse transcriptome microarray was developed by Tabibiazar et al. [38] to investigate and construct a chamber-specific database of gene expression in the normal mouse heart. The microarray consisted of 25 000 unique genes and expressed sequence tags (ESTs) that were obtained from the National Institute on Aging clone set, the RIKEN clone set and approximately 5000 genes from other investigators [38]. Pooled RNA from each of the four chambers and the interventricular septum (IVS) of female C57BL/6 mice was used. The significance analysis of microarrays (SAM) algorithm was used to identify genes that were unique to each
chamber. Genes were classified into the following functional categories: transcription factors, cytoskeletal genes, extracellular matrix molecules, growth-related proteins, metabolism-related molecules, several ESTs, and uncharacterized genes. The IVS expressed numerous genes similar to the left and right ventricles, but also had a unique gene expression pattern [38]. A comparison between the atria and the ventricles resulted in a difference of 5430 genes, with 2460 genes being expressed higher in the atria and 2970 genes being expressed higher in the ventricles [38]. The authors compiled their results into a searchable online database of the genes expressed in the normal mouse heart (http://mousedevelopment.org) [38]. These data underscore the important divergence of chamber-specific gene expression, and urge caution in utilizing biopsy samples from remote areas of the heart (e.g., right ventricular septal biopsy) as a surrogate measure of left ventricular gene expression.
Cardiac development and the fetal gene program of HF
Re-activation of a fetal gene program has been generally accepted as a hallmark of the development of HF. Therefore, understanding cardiac development at the level of gene expression could provide substantial insight into congenital cardiac defects and potentially into the gene expression pattern of HF. To investigate the developmental gene expression program of the heart, Sehl et al. [30] utilized ventricular samples obtained from E13 embryos and 1-day-old neonates, and samples from each time point were pooled for the developmental study. Of the 86 known genes examined, 46 were detected in the embryonic samples while 81 were detected in the neonatal samples. The normal adult cardiac gene expression pattern more closely resembled that of the neonatal rather than fetal gene expression profile, in that the embryonic samples differentially expressed 76%, whereas the neonatal samples only differentially expressed 22% of the genes detected [1, 30]. Chen et al. [39] have also examined changes in gene expression in the developing mouse heart by using cDNA microarrays. Of the 6000 genes examined, those associated with cell cycle progression and growth factors were decreased, and those associated with structural proteins and stress response factors were increased during the transition from neonate to adult mouse heart [39]. These results document changes in gene expression
associated with the development and differentiation of the heart, and may provide insight into how the manipulation of these genes in HF may hold potential therapeutic efficacy.
Microarrays have been used to define further the molecular profile of animal models of HF. Several reports have utilized microarray analysis to construct databases of normal animal hearts during the course of development and during the progression of disease. Having a defined frame of reference will be a valuable asset in designing future experiments aimed at understanding the mechanisms underlying HF. Importantly, the gene expression profiling experiments have identified numerous novel individual gene targets and groups of genes that are currently under investigation. Ultimately, it will be important to validate the changes seen in animal models in human HF in a bidirectional fashion.
Human HF
Gene expression analysis of non-failing (NF) and failing human cardiac tissue has identified numerous human HF gene candidates and corroborated results of earlier single gene studies. In 2000, Meredith Bond and colleagues [40] were the first to publish microarray data in end-stage human HF. Using two NF and two HF human hearts (one ischemic and one DCM) and simple fold-change comparisons, they found 19 genes with coordinately altered expression in both types of HF, including genes involved in cytoskeletal and myofibrillar organization, protein turnover and energy metabolism. Among these were decreased expression of SLIM1 (striated muscle LIM protein-1), a cytoarchitectural gene known to play a role in myoblast spreading and migration [41], and increased expression of gelsolin, which regulates thin filament assembly and turnover in cardiac myocytes [42]. Their data, along with data from others, suggest that the myocyte cytoskeleton could play an important role in the development and progression of HF. Subsequently, the same group reported a larger microarray study using 8 NF and 8 DCM hearts [43].
They found 103 genes differentially expressed between failing and NF hearts (this time not including SLIM1 and gelsolin), notably genes encoding ECM, cytoskeleton and proteolysis/stress related proteins, including the hallmark genes of HF, such as ANP, ANP precursor and BNP.
These results confirm that failing human myocytes revert to a pattern of fetal gene expression. Interestingly, they reported differential gene-expression profiles between alcoholic, dilated and familial cardiomyopathies, confirming that microarrays have potential utility as diagnostic tools for HF. The reduced expression of signal-transduction genes in the latter two states with minimal change in the alcoholic DCM led the authors to suggest that alcoholic cardiomyopathy might not involve neurohormonal or cytokine pathways [43].
In 2002, Barrans and colleagues constructed the first cardiovascular-based cDNA microarray, called “CardioChip” [44]. This CardioChip contains 10 848 redundant and randomly selected sequenced expressed sequence tags (ESTs) derived from several human heart and artery cDNA libraries. Total RNA was extracted from five non-failing human adult hearts and seven DCM hearts. Secondary validation of numerous candidate genes found by microarray was performed by quantitative real time PCR. They found >100 transcripts differentially expressed in DCM. Not surprisingly, ANP was their top candidate gene. They observed consistent upregulation of sarcomeric and ECM proteins. They also found a down-regulation of calcium cycling genes previously associated with reduced cardiac contractility in HF patients [44].
Hypertrophic cardiomyopathy (HCM), ischemic cardiomyopathy (ICM), and DCM are a few of the many forms of cardiomyopathy. However, HCM and DCM result from very different molecular pathways and pathophysiological remodeling [45]. Phenotypically, the left ventricle in DCM is dilated and hypocontractile, whereas the left ventricle in HCM is hypertrophied and hypercontractile. Hwang et al. [46] investigated the molecular portraits of DCM- and HCM-related end-stage HF using an in-house spotted cDNA microarray with 10 272 unique clones from various cardiovascular cDNA libraries sequenced and annotated in their laboratory.
RNA samples were obtained from pools of left ventricular free wall (HCM: n = 3, DCM: n = 2 and normal adult heart: n = 3). Considering heterogeneity of gene expression in human samples, among various other factors, pooling RNA samples from different patients could lead to numerous false positive or false negative results. They found commonly up-regulated or down-regulated genes, notably up-regulation of ANP or down-regulation of the sarcoplasmic/endoplasmic reticulum calcium-ATPase (SERCA).
They also showed that expression of some genes was modified specifically in either HCM (CSQ, lipocortin or lumican) or DCM (αB-crystallin, desmin or -dystrobrevin) [46]. By using DNA microarray technology, they found several candidate genes differentially expressed in DCM and HCM that may improve diagnostic and therapeutic approaches to these diseases.
Recently, Yung and collaborators [47] identified differential gene expression in end-stage idiopathic DCM. They compared gene expression of six failing left ventricular midmyocardium (LVM) samples to five NF LVM samples. They found significantly regulated genes belonging to various functional categories, including a number of novel apoptotic and cytoskeletal genes not previously implicated in DCM.
Boheler and colleagues [48] recently reported HF-associated gene expression profiling in seven NF and eight failing hearts, which they subsequently validated and further investigated in a total of 34 hearts. They confirmed the role of mitogen activated protein kinases (MAPK) in HF, and found that HF gene expression profiles differed considerably among patients of different age and gender, indicating the importance of cautious sample selection for microarray studies.
Left ventricular assist devices in human HF
For patients with end-stage HF, there are few options for effective treatment. Although cardiac transplantation remains the optimal treatment for end-stage HF, substantial limitations of this surgical intervention include perioperative ventricular dysfunction and an extremely limited supply of acceptable donor hearts (<3000/year worldwide). Left ventricular unloading in end-stage HF patients via left ventricular assist device (LVAD) support can lead to normalization of myocardial structure and function, including increased β-AR responsiveness [49], decreased myocyte volume [50], width [51] and diameter [52], improved myocyte contractility [53], decreased QRS duration and increased QT duration [54].
This process has been termed reverse remodeling. A number of these molecular changes appear to be time-dependent, requiring 15–30 days for normalization [52]. Recently, LVADs have proven to be an effective treatment of end-stage HF patients as a “bridge to transplant” [55, 56]. Furthermore, the investigators of the REMATCH trial recently published their landmark
results, demonstrating that long-term LVAD support provides significantly improved morbidity and mortality when compared to optimized medical management in patients with end-stage HF who are ineligible for transplant [56]. Along these lines, the recently formed LVAD Working Recovery Group (LVAD WRG) has envisioned the use of LVADs as a potential “bridge-to-recovery,” where the extent of recovery in certain patients would be sufficient to allow weaning from LVAD support and obviate the necessity of transplant [57, 58]. The paired nature of LV samples obtained from patients at the time of LVAD placement and subsequently at the time of transplant provides a powerful opportunity to investigate cardiac gene expression changes that occur in an individual patient in the development and regression of HF, obviating the confounding factor of vast human heterogeneity.
Our laboratories have been at the forefront of differential gene expression investigation before and after salutary LVAD support. We were the first to report gene expression profiles associated with LVAD-mediated reverse remodeling of paired LV tissue. We set out to study mechanically unloaded failing human hearts to delineate potential genes that might be involved in the reverse remodeling process secondary to LVAD support. We assessed the LV gene expression of six male patients before and after 2 months of LVAD support using Affymetrix oligonucleotide microarrays (Fig. 4.2) [59]. Importantly, we were able to segregate patients according to pre- or post-LVAD status using the entire gene expression profile, demonstrating that there is indeed a genomic signature of changes that occur with LVAD support. We also found that a representative majority of the differentially regulated genes were involved with energy and metabolism.
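The segregation of pre- and post-LVAD samples by whole-profile similarity, together with the exact (randomization) test mentioned in the legend to Fig. 4.2, can be illustrated with a minimal Python sketch. This is not the analysis code used in [59]. The function names, the separation statistic (mean between-group minus mean within-group Pearson dissimilarity) and the toy four-gene profiles are all assumptions made here purely to show the idea.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def dissimilarity_matrix(profiles):
    """Pairwise Pearson dissimilarity (1 - r) between sample profiles."""
    n = len(profiles)
    return [[1.0 - pearson(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


def separation(d, group_a):
    """Mean between-group minus mean within-group dissimilarity."""
    a = set(group_a)
    between, within = [], []
    for i, j in combinations(range(len(d)), 2):
        (within if (i in a) == (j in a) else between).append(d[i][j])
    return sum(between) / len(between) - sum(within) / len(within)


def exact_randomization_p(d, group_a):
    """Fraction of all equal-size relabelings that separate at least as well."""
    observed = separation(d, group_a)
    stats = [separation(d, g)
             for g in combinations(range(len(d)), len(group_a))]
    return sum(s >= observed - 1e-12 for s in stats) / len(stats)


# Invented profiles (hypothetical): samples 0-2 stand in for "pre",
# samples 3-5 for "post".
profiles = [
    [1.0, 2.0, 3.0, 4.0], [1.1, 2.0, 3.2, 4.0], [0.9, 2.1, 3.0, 4.1],
    [4.0, 3.0, 2.0, 1.0], [4.1, 3.0, 2.2, 1.0], [3.9, 3.1, 2.0, 1.1],
]
d = dissimilarity_matrix(profiles)
p_value = exact_randomization_p(d, (0, 1, 2))
print(p_value)
```

In this toy case only the true labeling and its mirror image achieve the observed separation, so the test returns 2 of 20 possible labelings. By the same arithmetic, six samples per group give C(12, 6) = 924 labelings, so the smallest p-value such a test can return is roughly 0.001 to 0.002, depending on whether mirror-image labelings are counted. This is an observation about the counting only, not a statement about how the value reported in the legend was computed.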
Fig. 4.2. Distinction of pre- and post-left ventricular assist device (LVAD) status and HF etiology by gene expression. (a) Gene expression data for pre-LVAD (yellow and blue) and post-LVAD (red and green) patients were subjected to multidimensional scaling (MDS; axes D1 and D2; series Post1, Post2, Pre1, Pre2) using Pearson dissimilarities. Distinction of the pre- and post-LVAD grouping by an exact test (randomization experiment) demonstrates the pre- and post-LVAD distinction has a significance value of 0.002, illustrating the significant effect of LVAD support on gene expression. Data are also classified according to segregating patient clinical characteristics: Pre-1 (yellow) and post-1 (red) represent non-ischemic patients, whereas pre-2 (blue) and post-2 (green) represent ischemic patients. (b) Hierarchical clustering of pre- and post-LVAD samples (leaf order: Pre (1), Pre (2), Pre (3), Pre (4), Pre (5), Pre (6), Post (3), Post (5), Post (4), Post (2), Post (6), Post (1)); 1–3 represent non-ischemic patients and 4–6 represent ischemic patients. Note the segregation of the non-ischemic patients and relative similarity of ischemic patients post-left ventricular assist device (LVAD). Reproduced with permission from [59].
We uncovered the unexpected finding that patients were blindly stratified by myocardial gene expression profile concordant with their HF etiology (i.e., ischemic or idiopathic dilated cardiomyopathy), see Fig. 4.2. Our results and those of others underscore that specific genomic footprints are associated with distinct phenotypic changes in cardiac function. We subsequently examined the effects of LVAD support on gene expression, initially in a similar paired sample set (n = 7) [60]. In this study, we found 1374 genes were increased and 1629 genes were decreased following LVAD implantation. Of the genes associated with LVAD-mediated reverse
remodeling, genes encoding transcription factors, cell growth/apoptosis/DNA repair, cell structure proteins, metabolism, and cell signaling were up-regulated, while genes for cytokines were down-regulated [60]. We proceeded to study a substantially larger pre- and post-LVAD patient cohort (n = 19). Analysis revealed that 22 genes were down-regulated and 85 genes were up-regulated in response to LVAD support. Further analysis revealed that a majority of these genes were involved in the regulation of vascular signaling networks. Specifically, we found that mechanical unloading of the heart with the LVAD led to an increase in sprouty-1, an inhibitor of vessel sprouting and ERK signaling [61]. We have recently reported that up-regulation of sprouty-1 is sufficient to inhibit ERK signaling and decrease endothelial cell proliferation [62]. Flesch and colleagues and recent work by Baba et al. have demonstrated a significant reduction in phosphorylated ERK1/2 following ventricular support in six patients [63, 64]. Our data confirm a significant reduction in pERK1/2 activity and suggest that dampening of the p21Ras/MEK/ERK pathway post-LVAD may be modulated in part through the up-regulation of sprouty-1 that we have described [62]. Ongoing studies in our laboratory are continuing investigation of the role of this gene in modulating cardiac function.
Microarrays have been used in combination with other methods such as immunohistochemistry to identify protein expression and localization of uncharacterized genes and their roles in mechanical unloading of the heart. Chen et al. [65] recently investigated gene expression in 11 patients pre- and post-LVAD support using a 12 000 feature cDNA microarray, and found 85 genes that were significantly upregulated post-LVAD. The majority of their study focused on the most significantly up-regulated gene they found post-LVAD, which was the apelin-angiotensin receptor-like 1 (APJ), receptor for the potent cardiac inotrope apelin.
Using immunohistochemistry they found increased expression of apelin in vascular endothelium post-LVAD. They also found increased serum levels of apelin in HF patients [65]. Subsequent studies from their laboratory demonstrated that apelin/APJ signaling in mice indeed functioned as a potent cardiac inotrope that decreased cardiac preload and afterload with little hypertrophy [66].
Steenbergen and colleagues [67] recently explored the activities of pro- and anti-apoptotic pathways in human idiopathic DCM, as well as in pre- and post-LVAD samples. In particular, anti-apoptotic genes were
down-regulated in the TNFα–NFκB signaling pathway in the idiopathic DCM patients, which is consistent with a role for apoptosis in HF; samples from patients who had been on heart assist devices did not show decreased expression of NFκB, which the authors identify as a target of potential therapeutic interest [67].
Recently, Margulies et al. [68] used a novel analytical strategy to compare gene expression between failing and LVAD supported hearts. They found that over 3088 transcripts were altered, but only 238 transcripts had a consistent response following LVAD support. Of those genes consistently expressed following LVAD support, only 11% exhibited partial normalization and 5% exhibited full normalization. Importantly, a majority of pre- and post-LVAD samples in this study were unpaired, limiting the ability to investigate within-patient gene expression normalization. The sample size of this study was substantially larger than in previous studies; however, the unpaired nature of the majority of the samples may limit our ability to compare the studies clearly. Further analysis will be required to clarify the discrepancies between data sets; however, this study raises important questions regarding the extent of LVAD-mediated reverse remodeling at the level of gene expression.
Final common pathway?
Recently, Kittleson et al. [69] sought to validate our prior findings of clear differentiation between ICM and non-ischemic cardiomyopathy (NICM, a.k.a. DCM). They investigated whether microarray data from different institutions could be utilized as a tool to determine cardiac gene expression profiles of ICM and NICM. In their study they examined 48 samples (a majority of them being historical data from other institutions) from various disease stages, including LVAD placement/cardiac transplantation, after LVAD support, and endomyocardial biopsies from newly diagnosed patients [69]. The representative genes from the prediction profile were associated with signal transduction, metabolism and cell growth and maintenance. Of the samples investigated, ICM was associated with the greatest changes in gene expression, with most of the genes being up-regulated [69]. They showed that gene expression profiling could distinguish between ICM and NICM [69], thereby validating our original distinction of ICM and NICM in pre- and post-LVAD patients [59].
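To make concrete what predicting etiology from an expression profile involves, the following Python sketch trains a plain nearest-centroid classifier and scores it by leave-one-out cross-validation. It is a stand-in chosen for brevity and is not the prediction method of [69] or of our own studies. The three-gene profiles are invented, and the labels merely reuse the ICM/NICM terminology of the text.

```python
from math import dist  # Euclidean distance, Python 3.8+
from statistics import mean


def centroid(profiles):
    """Gene-wise mean of a list of expression profiles."""
    return [mean(values) for values in zip(*profiles)]


def predict(train, sample):
    """Return the label whose class centroid lies closest to `sample`."""
    by_label = {}
    for label, profile in train:
        by_label.setdefault(label, []).append(profile)
    return min(by_label,
               key=lambda lab: dist(centroid(by_label[lab]), sample))


def leave_one_out_accuracy(dataset):
    """Hold each sample out in turn, train on the rest, score the call."""
    hits = 0
    for i, (label, profile) in enumerate(dataset):
        train = dataset[:i] + dataset[i + 1:]
        hits += predict(train, profile) == label
    return hits / len(dataset)


# Invented three-gene profiles (hypothetical values, for illustration only).
dataset = [
    ("ICM", [5.0, 1.0, 3.0]),
    ("ICM", [5.5, 1.2, 2.9]),
    ("ICM", [4.8, 0.9, 3.2]),
    ("NICM", [2.0, 4.0, 3.1]),
    ("NICM", [2.2, 4.3, 2.8]),
    ("NICM", [1.9, 3.8, 3.0]),
]
accuracy = leave_one_out_accuracy(dataset)
new_call = predict(dataset, [5.1, 1.1, 3.0])
print(accuracy, new_call)
```

Holding each sample out before training is what makes the prediction blind: a sample never contributes to the centroid it is compared against. With real data the number of genes far exceeds the number of samples, so any gene selection step would also need to be repeated inside each held-out round to avoid an optimistic estimate.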
Many authors have suggested that there is a “final common pathway” in end-stage HF, and thus have predicted there is little difference, particularly at the molecular level, between various HF etiologies. Indeed, two microarray reports have indicated that, while they could distinguish pre- and post-LVAD state, or NF and HF, they were unable to distinguish ICM from DCM [65, 70]. Unfortunately, these studies have not indicated the data or methods by which they arrived at a lack of distinction between ICM and DCM. The study of Kittleson et al. [69], as well as our own studies ([59] and unpublished data), demonstrate that there appear to be striking differences between ICM and DCM. While both may present clinically with a relatively similar end-stage HF phenotype, the data suggest that investigation of etiology-specific diagnosis and treatment may be warranted. Continuing work in this area will require collaborative efforts from multiple investigators for comparison in larger sample sizes to gain the increased power needed for these types of analyses. Furthermore, identifying gene expression changes from blood or tissue samples collected from patients in early stages of disease is a next logical step, as the goal is to design effective diagnostic and therapeutic strategies for early intervention to slow, stop, or reverse the disease process.
The Framingham risk score is currently the most widely used approach to identify individuals at risk for cardiovascular disease and is based on a variety of standard risk factors identified decades previously: age, diabetes, smoking, blood pressure, and cholesterol. Unfortunately, this approach has proven incomplete, as it excludes a high percentage of individuals at risk of developing heart disease who do not smoke, do not have high blood pressure or cholesterol, and do not have diabetes.
This approach also does not provide an early diagnostic marker of disease, but is based upon a cascade of factors that are consistent with late stages of the disease process. Given the poor prognosis for HF patients, there is a pressing need for early detection of HF to allow for effective therapeutic interventions to reduce, prevent or reverse the course of disease. In humans it is apparent that numerous genes are altered in the course of HF and following LVAD support. Microarrays have been utilized in a number of ways to investigate changes in gene expression, and they have been used alone or in combination with other techniques to elucidate the possible functional cardiac role of novel or uncharacterized genes.
Microarray analysis has also identified several functional groups of genes as having a potential role in HF as well as LVAD-mediated reverse remodeling. Finally, microarray technology can be sensitive enough to distinguish between different etiologies of HF and can accurately predict HF phenotype and outcome. Microarrays have become a valuable asset in assessing HF and may soon be reliably used to predict HF prognosis and response to therapy and to identify therapeutic targets.
Peripheral blood mononuclear cell profiling in HF
Availability of human cardiac tissue is relatively limited to surgical procedures, thus substantially limiting its utility as a prognostic tool. Therefore, establishing the ability to determine patient prognosis and response to therapy utilizing more amenable tissue samples represents the holy grail of microarray gene expression profiling as a diagnostic tool. Since HF is generally associated with inflammatory processes, peripheral blood mononuclear cells (PBMC) may permit discovery of new genes associated with the disease and identify a novel diagnostic/prognostic tool. Yndestad et al. [71] were the first to report the use of cDNA arrays to investigate gene expression of a large number of cytokines and other inflammatory mediators in PBMC from chronic HF patients (n = 8) and healthy blood donors (n = 8) using pooled samples. Real time quantitative PCR was used to confirm the differential expression of identified genes of interest. They found differential gene expression in PBMC of several members of the cytokine network in chronic HF, notably an upregulation of several ligands in the TNF superfamily.
Horwitz et al. [72] attempted to correlate the gene-expression profiles obtained from peripheral blood with histological cardiac allograft rejection on serial endomyocardial biopsies. In their study, they collected blood samples from 189 consecutive cardiac transplant patients for RNA isolation and performed 21 Affymetrix microarrays (control (n = 7), rejection (n = 7), and postrejection (n = 7)). To avoid variability in hybridization conditions, all arrays were hybridized on the same day by a single technician. They found 40 candidate markers of allograft rejection. Although fold change of the expression of each marker is relatively low (<2.5-fold), it is interesting to note that the expression of these 40 transcripts returns toward normal levels after rejection treatment with immunosuppressive agents. Their results raise the possibility that peripheral
blood gene-expression profiles could serve as a non-invasive method to screen for cardiac allograft rejection. As a further demonstration of the utility of PBMC to determine cardiovascular disease and stroke outcome, Moore et al. [73] recently reported that the PBMC gene expression profile could identify patients suffering from acute ischemic stroke, and found a number of predictive genes associated with an adaptive response to acute neurological damage. Landmark studies by Hill et al. and Vasa and colleagues have shown that the number of circulating endothelial progenitor cells is decreased in patients with cardiovascular disease [74, 75]. Thus changes in gene expression in PBMC may be in part a reflection of changes in cell types that could be predictive of patient prognosis. Although the PBMC studies are relatively preliminary, they all suggest that clinical diagnosis and prognosis of HF and cardiovascular disease through peripheral blood assays may not be such a distant hope.

Conclusions

Microarray profiling of HF in both animals and humans has elucidated numerous novel genes and pathways associated with the development, progression and regression of HF. Importantly, many of these microarray studies have blindly confirmed previous single-gene studies, validating the technology as a means to accurately profile HF gene expression. Shortly after the advent of microarray technology, which resulted in publications with n = 1–2 experiments, we heard a timely quote from Dr. John Weinstein of the NIH at a conference in 2000, who said: “Never before has the publication record been so far ahead of reality.” Indeed, that was the truth at the time. Substantial progress has been made in performing and analyzing microarray experiments since that time. However, it is important to note that there is still no gold-standard consensus for statistically analyzing the daunting volumes of microarray data.
Several methods have emerged that are much more meritorious than others, and only time will tell which of these methods holds the greatest promise. The discrepancy in methods may be the prime explanation for the minimal overlap in results from numerous microarray experiments of the same disease process.
Secondary validation of the differential expression and functional cardiac role of a candidate diagnostic or therapeutic gene is highly recommended. Microarray analysis can yield numerous false positives; thus confirmation of differential expression with conventional technologies, such as quantitative real-time PCR, Northern blot, Western blot, or immunostaining analyses, is warranted. Furthermore, mRNA expression may not necessarily reflect protein expression, nor can it represent post-translational modifications such as phosphorylation and glycosylation. Therefore, HF expression profiling by microarrays can be complemented by other genetic, genomic and proteomic tools, such as positional cloning, single nucleotide polymorphism (SNP) analysis, and proteomic profiling, to understand more clearly the biological pathways contributing to HF. Furthermore, follow-up studies with targeted overexpression or knockout of the gene of interest in cell and animal models are required for functional validation of therapeutic targets. Microarray analysis has provided, and will continue to provide, substantial insight and novel hypotheses regarding the complex nature of the development, progression and regression of HF. Further studies utilizing molecular, cellular and systems biology in both animal and human HF will validate these hypotheses and insights. Diagnostic microarray gene expression profiling, particularly with PBMCs or other peripheral-blood-derived RNA, will certainly enhance our diagnostic capabilities for this debilitating disease. Microarray-based investigation of HF will eventually identify novel diagnostic and therapeutic targets that promise direct clinical application to the management and prognosis of cardiovascular disease.
REFERENCES

1. Berry, C., Murdoch, D. R., and McMurray, J. J. Economics of chronic heart failure. Eur. J. Heart Fail. 2001; 3(3): 283–91. 2. 2001 Heart and Stroke Statistical Update. Dallas: American Heart Association; 2001. 3. Califf, R. M., Adams, K. F., McKenna, W. J. et al. A randomized controlled trial of epoprostenol therapy for severe congestive heart failure: The Flolan International Randomized Survival Trial (FIRST). Am. Heart J. 1997; 134(1): 44–54.
4. Jessup, M. and Brozena, S. Heart failure. N. Engl. J. Med. 2003; 348(20): 2007–18. 5. Redfern, C. H., Degtyarev, M. Y., Kwa, A. T. et al. Conditional expression of a Gi-coupled receptor causes ventricular conduction delay and a lethal cardiomyopathy. Proc. Natl Acad. Sci. USA 2000; 97(9): 4826–31. 6. Dahlquist, K. D., Salomonis, N., Vranizan, K. et al. GenMAPP, a new tool for viewing and analyzing microarray data on biological pathways. Nat. Genet. 2002; 31(1): 19–20. 7. Aronow, B. J., Toyokawa, T., Canning, A. et al. Divergent transcriptional responses to independent genetic causes of cardiac hypertrophy. Physiol. Genom. 2001; 6(1): 19–28. 8. Yussman, M. G., Toyokawa, T., Odley, A. et al. Mitochondrial death protein Nix is induced in cardiac hypertrophy and triggers apoptotic cardiomyopathy. Nat. Med. 2002; 8(7): 725–30. 9. Narula, J., Haider, N., Virmani, R. et al. Apoptosis in myocytes in end-stage heart failure. N. Engl. J. Med. 1996; 335(16): 1182–9. 10. Communal, C., Singh, K., Pimentel, D. R. et al. Norepinephrine stimulates apoptosis in adult rat ventricular myocytes by activation of the beta-adrenergic pathway. Circulation 1998; 98(13): 1329–34. 11. Aikawa, R., Nawano, M., Gu, Y. et al. Insulin prevents cardiomyocytes from oxidative stress-induced apoptosis through activation of PI3 kinase/Akt. Circulation 2000; 102(23): 2873–9. 12. Buerke, M., Murohara, T., Skurk, C. et al. Cardioprotective effect of insulin-like growth factor I in myocardial ischemia followed by reperfusion. Proc. Natl Acad. Sci. USA 1995; 92(17): 8031–5. 13. Camper-Kirby, D., Welch, S., Walker, A. et al. Myocardial Akt activation and gender: increased nuclear activity in females versus males. Circ Res. 2001; 88(10): 1020–7. 14. Matsui, T., Tao, J., del Monte, F. et al. Akt activation preserves cardiac function and prevents injury after transient cardiac ischemia in vivo. Circulation 2001; 104(3): 330–5. 15. Matsui, T., Li, L., Wu, J. C. et al. 
Phenotypic spectrum caused by transgenic overexpression of activated Akt in the heart. J. Biol. Chem. 2002; 277(25): 22896–901. 16. Cook, S. A., Matsui, T., Li, L. et al. Transcriptional effects of chronic Akt activation in the heart. J. Biol. Chem. 2002; 277(25): 22528–33. 17. Kubota, T., McTiernan, C. F., Frye, C. S. et al. Dilated cardiomyopathy in transgenic mice with cardiac-specific overexpression of tumor necrosis factor-alpha. Circ. Res. 1997; 81(4): 627–35. 18. Li, Y. Y., Feng, Y. Q., Kadokami, T. et al. Myocardial extracellular matrix remodeling in transgenic mice overexpressing tumor necrosis factor alpha can be
modulated by anti-tumor necrosis factor alpha therapy. Proc. Natl Acad. Sci. USA 2000; 97(23): 12746–51. 19. Tang, Z., McGowan, B. S., Huber, S. A. et al. Gene expression profiling during the transition to failure in TNF-alpha over-expressing mice demonstrates the development of autoimmune myocarditis. J. Mol. Cell Cardiol. 2004; 36(4): 515–30. 20. Milano, C. A., Dolber, P. C., Rockman, H. A. et al. Myocardial expression of a constitutively active alpha 1B-adrenergic receptor in transgenic mice induces cardiac hypertrophy. Proc. Natl Acad. Sci. USA 1994; 91(21): 10109–13. 21. Wang, B. H., Du, X. J., Autelitano, D. J. et al. Adverse effects of constitutively active alpha(1B)-adrenergic receptors after pressure overload in mouse hearts. Am. J. Physiol. Heart Circ. Physiol. 2000; 279(3): H1079–86. 22. Yun, J., Zuscik, M. J., Gonzalez-Cabrera, P. et al. Gene expression profiling of alpha(1b)-adrenergic receptor-induced cardiac hypertrophy by oligonucleotide arrays. Cardiovasc. Res. 2003; 57(2): 443–55. 23. Wang, D., Oparil, S., Feng, J. A. et al. Effects of pressure overload on extracellular matrix expression in the heart of the atrial natriuretic peptide-null mouse. Hypertension 2003; 42(1): 88–95. 24. Arber, S., Hunter, J. J., Ross, J., Jr. et al. MLP-deficient mice exhibit a disruption of cardiac cytoarchitectural organization, dilated cardiomyopathy, and heart failure. Cell 1997; 88(3): 393–403. 25. Jones, L. R., Suzuki, Y. J., Wang, W. et al. Regulation of Ca2+ signaling in transgenic mouse cardiac myocytes overexpressing calsequestrin. J. Clin. Invest. 1998; 101(7): 1385–93. 26. Cho, M. C., Rapacciuolo, A., Koch, W. J. et al. Defective beta-adrenergic receptor signaling precedes the development of dilated cardiomyopathy in transgenic mice with calsequestrin overexpression. J. Biol. Chem. 1999; 274(32): 22251–6. 27. Rockman, H. A., Chien, K. R., Choi, D. J. et al. 
Expression of a beta-adrenergic receptor kinase 1 inhibitor prevents the development of myocardial failure in gene-targeted mice. Proc. Natl Acad. Sci. USA 1998; 95(12): 7000–5. 28. Blaxall, B. C., Spang, R., Rockman, H. A. et al. Differential myocardial gene expression in the development and rescue of murine heart failure. Physiol. Genom. 2003; 15(2): 105–14. 29. Ueno, S., Ohki, R., Hashimoto, T. et al. DNA microarray analysis of in vivo progression mechanism of heart failure. Biochem. Biophys. Res. Commun. 2003; 307(4): 771–7. 30. Sehl, P. D., Tai, J. T., Hillan, K. J. et al. Application of cDNA microarrays in determining molecular phenotype in cardiac growth, development, and response to injury. Circulation 2000; 101(16): 1990–9.
31. Fishbein, M. C., Maclean, D., and Maroko, P. R. Experimental myocardial infarction in the rat: qualitative and quantitative changes during pathologic evolution. Am. J. Pathol. 1978; 90(1): 57–70. 32. Zhao, M., Chow, A., Powers, J. et al. Microarray analysis of gene expression after transverse aortic constriction in mice. Physiol. Genom. 2004; 19(1): 93–105. 33. Wagner, R. A., Tabibiazar, R., Powers, J. et al. Genome-wide expression profiling of a cardiac pressure overload model identifies major metabolic and signaling pathway responses. J. Mol. Cell. Cardiol. 2004; 37(6): 1159–70. 34. Friddle, C. J., Koga, T., Rubin, E. M. et al. Expression profiling reveals distinct sets of genes altered during induction and regression of cardiac hypertrophy. Proc. Natl Acad. Sci. USA 2000; 97(12): 6745–50. 35. Buermans, H. P., Redout, E. M., Schiel, A. E. et al. Micro-array analysis reveals pivotal divergent mRNA expression profiles early in the development of either compensated ventricular hypertrophy or heart failure. Physiol. Genom. 2005. 36. Lai, L. P., Lin, J. L., Lin, C. S. et al. Functional genomic study on atrial fibrillation using cDNA microarray and two-dimensional protein electrophoresis techniques and identification of the myosin regulatory light chain isoform reprogramming in atrial fibrillation. J. Cardiovasc. Electrophysiol. 2004; 15(2): 214–23. 37. Mukherjee, S., Belbin, T. J., Spray, D. C. et al. Microarray analysis of changes in gene expression in a murine model of chronic chagasic cardiomyopathy. Parasitol Res. 2003; 91(3): 187–96. 38. Tabibiazar, R., Wagner, R. A., Liao, A. et al. Transcriptional profiling of the heart reveals chamber-specific gene expression patterns. Circ. Res. 2003; 93(12): 1193–201. 39. Chen, H. W., Yu, S. L., Chen, W. J. et al. Dynamic changes of gene expression profiles during postnatal development of the heart in mice. Heart. 2004; 90(8): 927–34. 40. Yang, J., Moravec, C. S., Sussman, M. A. et al. 
Decreased SLIM1 expression and increased gelsolin expression in failing human hearts measured by high-density oligonucleotide arrays. Circulation. 2000; 102(25): 3046–52. 41. Robinson, P. A., Brown, S., McGrath, M. J. et al. Skeletal muscle LIM protein 1 regulates integrin-mediated myoblast adhesion, spreading, and migration. Am. J. Physiol. Cell. Physiol. 2003; 284(3): C681–95. 42. Matsudaira, P. and Janmey, P. Pieces in the actin-severing protein puzzle. Cell 1988; 54(2): 139–40. 43. Tan, F. L., Moravec, C. S., Li, J. et al. The gene expression fingerprint of human heart failure. Proc. Natl Acad. Sci. USA 2002; 99(17): 11387–92. 44. Barrans, J. D., Allen, P. D., Stamatiou, D. et al. Global gene expression profiling of end-stage dilated cardiomyopathy using a human cardiovascular-based cDNA microarray. Am. J. Pathol. 2002; 160(6): 2035–43.
45. Schonberger, J. and Seidman, C. E. Many roads lead to a broken heart: the genetics of dilated cardiomyopathy. Am. J. Hum. Genet. 2001; 69(2): 249–60. 46. Hwang, J. J., Allen, P. D., Tseng, G. C. et al. Microarray gene expression profiles in dilated and hypertrophic cardiomyopathic end-stage heart failure. Physiol. Genom. 2002; 10(1): 31–44. 47. Yung, C. K., Halperin, V. L., Tomaselli, G. F. et al. Gene expression profiles in end-stage human idiopathic dilated cardiomyopathy: altered expression of apoptotic and cytoskeletal genes. Genomics 2004; 83(2): 281–97. 48. Boheler, K. R., Volkova, M., Morrell, C. et al. Sex- and age-dependent human transcriptome variability: implications for chronic heart failure. Proc. Natl Acad. Sci. USA 2003; 100(5): 2754–9. 49. Ogletree-Hughes, M. L., Stull, L. B., Sweet, W. E. et al. Mechanical unloading restores beta-adrenergic responsiveness and reverses receptor downregulation in the failing human heart. Circulation 2001; 104(8): 881–6. 50. Zafeiridis, A., Jeevanandam, V., Houser, S. R. et al. Regression of cellular hypertrophy after left ventricular assist device support. Circulation 1998; 98(7): 656–62. 51. Zafeiridis, A., Jeevanandam, V., Houser, S. R. et al. Regression of cellular hypertrophy after left ventricular assist device support. Circulation 1998; 98(7): 656–62. 52. Madigan, J. D., Barbone, A., Choudhri, A. F. et al. Time course of reverse remodeling of the left ventricle during support with a left ventricular assist device. J. Thorac. Cardiovasc. Surg. 2001; 121(5): 902–8. 53. Dipla, K., Mattiello, J. A., Jeevanandam, V. et al. Myocyte recovery after mechanical circulatory support in humans with end-stage heart failure. Circulation 1998; 97(23): 2316–22. 54. Harding, J. D., Piacentino, V., 3rd, Gaughan, J. P. et al. Electrophysiological alterations after mechanical circulatory support in patients with advanced cardiac failure. Circulation 2001; 104(11): 1241–7. 55. Hosenpud, J. D., Bennett, L. E., Keck, B. M. et al. 
The Registry of the International Society for Heart and Lung Transplantation: eighteenth Official Report-2001. J. Heart Lung Transpl. 2001; 20(8): 805–15. 56. Rose, E. A., Gelijns, A. C., Moskowitz, A. J. et al. Long-term use of a left-ventricular assist device for end-stage heart failure. N. Engl. J. Med. 2001; 345: 1435–43. 57. Hetzer, R., Muller, J. H., Weng, Y. et al. Bridging-to-recovery. Ann. Thorac. Surg. 2001; 71(3 Suppl.): S109–13; discussion S114–15. 58. Terracciano, C. M., Hardy, J., Birks, E. J. et al. Clinical recovery from end-stage heart failure using left-ventricular assist device and pharmacological therapy correlates with increased sarcoplasmic reticulum calcium content but not with regression of cellular hypertrophy. Circulation 2004; 109(19): 2263–5.
59. Blaxall, B. C., Tschannen-Moran, B. M., Milano, C. A. et al. Differential gene expression and genomic patient stratification following left ventricular assist device support. J. Am. Coll. Cardiol. 2003; 41(7): 1096–106. 60. Chen, Y., Park, S., Li, Y. et al. Alterations of gene expression in failing myocardium following left ventricular assist device support. Physiol. Genom. 2003; 14(3): 251–60. 61. Hall, J. L., Grindle, S., Han, X. et al. Genomic profiling of the human heart before and after mechanical support with a ventricular assist device reveals alterations in vascular signaling networks. Physiol. Genom. 2004; 17(3): 283–91. 62. Huebert, R. C., Li, Q., Adhikari, N. et al. Identification and regulation of Sprouty1, a negative inhibitor of the ERK cascade, in the human heart. Physiol. Genom. 2004; 18(3): 284–9. 63. Baba, H. A., Stypmann, J., Grabellus, F. et al. Dynamic regulation of MEK/Erks and Akt/GSK-3beta in human end-stage heart failure after left ventricular mechanical support: myocardial mechanotransduction-sensitivity as a possible molecular mechanism. Cardiovasc. Res. 2003; 59(2): 390–9. 64. Flesch, M., Margulies, K. B., Mochmann, H. C. et al. Differential regulation of mitogen-activated protein kinases in the failing human heart in response to mechanical unloading. Circulation 2001; 104(19): 2273–6. 65. Chen, M. M., Ashley, E. A., Deng, D. X. et al. Novel role for the potent endogenous inotrope apelin in human cardiac dysfunction. Circulation 2003; 108(12): 1432–9. 66. Ashley, E. A., Powers, J., Chen, M. et al. The endogenous peptide apelin potently improves cardiac contractility and reduces cardiac loading in vivo. Cardiovasc. Res. 2005; 65(1): 73–82. 67. Steenbergen, C., Afshari, C. A., Petranka, J. G. et al. Alterations in apoptotic signaling in human idiopathic cardiomyopathic hearts in failure. Am. J. Physiol. Heart Circ. Physiol. 2003; 284(1): H268–76. 68. Margulies, K. B., Matiwala, S., Cornejo, C. et al. Mixed Messages. 
Transcription patterns in failing and recovering human myocardium. Circ. Res. 2005. 69. Kittleson, M. M., Ye, S. Q., Irizarry, R. A. et al. Identification of a gene expression profile that differentiates between ischemic and nonischemic cardiomyopathy. Circulation 2004; 110(22): 3444–51. 70. Steenman, M., Chen, Y. W., Le Cunff, M. et al. Transcriptomal analysis of failing and nonfailing human hearts. Physiol Genom. 2003; 12(2): 97–112. 71. Yndestad, A., Damas, J. K., Geir Eiken, H. et al. Increased gene expression of tumor necrosis factor superfamily ligands in peripheral blood mononuclear cells during chronic heart failure. Cardiovasc. Res. 2002; 54(1): 175–82. 72. Horwitz, P. A., Tsai, E. J., Putt, M. E. et al. Detection of cardiac allograft rejection and response to immunosuppressive therapy with peripheral blood gene expression. Circulation 2004; 110(25): 3815–21.
73. Moore, D. F., Li, H., Jeffries, N. et al. Using peripheral blood mononuclear cells to determine a gene expression profile of acute ischemic stroke: a pilot investigation. Circulation 2005; 111(2): 212–21. 74. Hill, J. M., Zalos, G., Halcox, J. P. et al. Circulating endothelial progenitor cells, vascular function, and cardiovascular risk. N. Engl. J. Med. 2003; 348(7): 593–600. 75. Vasa, M., Fichtlscherer, S., Aicher, A. et al. Number and migratory activity of circulating endothelial progenitor cells inversely correlate with risk factors for coronary artery disease. Circ. Res. 2001; 89(1): E1–7.
5

Gene expression profiling for the diagnosis of acute leukemias

Torsten Haferlach,1 Alexander Kohlmann,2 Susanne Schnittger,1 Claudia Schoch,1 and Wolfgang Kern1
1 MLL – Munich Leukemia Laboratory, Germany
2 Roche Molecular Systems, Pleasanton, CA, USA
Gene Expression Profiling by Microarrays: Clinical Implications, ed. Wolf-Karsten Hofmann. Published by Cambridge University Press 2006. © Cambridge University Press 2006.

Introduction

Malignant diseases are diagnosed and classified based on cytologic and histologic findings. In particular, acute leukemias are identified based on the cytomorphologic examination of peripheral blood smears and bone marrow aspirates supplemented by cytochemical parameters such as myeloperoxidase (MPO) and non-specific esterase (NSE). Additional diagnostic methods include multiparameter immunophenotyping, which enables a lineage assignment and a subclassification according to the maturational stage, as well as cytogenetics, supplemented by fluorescence in situ hybridization (FISH), and polymerase chain reaction (PCR). These latter methods have provided deep insights into the biology of different acute leukemia entities. Disease-specific chromosomal aberrations and molecular alterations have been identified for a variety of leukemia subtypes. As a consequence, modern diagnostics in acute leukemias include these methods in combination to allow an optimum characterization of the respective disease. An algorithm for a variety of diagnostic questions using these methods in varying combinations is helpful in order to gather all relevant information in an effective way [1]. Progress in acute leukemia research not only includes the identification and characterization of biologic subgroups. Application of different methods now also allows the selection of disease-specific therapeutic approaches, e.g., the use of all-trans retinoic acid in acute promyelocytic leukemia [2] or the early application of allogeneic transplantation strategies in AML with complex aberrant karyotypes. The significant efficacy of imatinib in BCR-ABL-positive ALL and CML patients, and the use of specific antibodies against CD20 or CD52, demonstrate the impressive advances in developing tailored disease-specific therapeutic approaches, based on a molecular rationale [3]. 
Beyond the large body of existing biologic knowledge about disease-specific alterations of genetic pathways, microarray technologies give an additional impetus by providing comprehensive data on the expression patterns of large numbers of genes. These methods may therefore become an essential tool for optimizing the classification of acute leukemias and thus may be used routinely for diagnostic purposes in the near future [4, 5]. In addition, they will also lead to the detection of new biologically defined and clinically relevant subtypes in acute leukemias and may be the basis for tailored therapeutic decisions. Furthermore, a recommendation of drugs to be applied to an individual patient and the identification of targets for newly developed drugs may be within the focus of such functional genomic approaches in acute leukemia. As other chapters in this book focus in much more detail on technical aspects of microarray platforms (Chapter 1), and on data mining strategies, interpretation, and storage of microarray data (Chapter 3), we will not discuss these very important aspects here, but refer the readers to the respective chapters.

Characterization of acute myeloid leukemia by microarray analyses

Pivotal work in gene expression profiling in acute leukemias has been reported by Golub et al., providing data on the applicability of microarrays and new biostatistical analysis methods [6]. These analyses have provided not only a “class prediction” (prediction of a tumor entity based on specific gene expression profiles of selected informative genes) [6–8] but also a “class discovery” (discovery of new subentities within groups formerly regarded as homogeneous entities) [9–12].
This discovery is not limited to the pure identification of new biological entities of leukemia but also includes the definition of prognostically different groups [9, 11], which is anticipated to influence therapeutic strategies. Thus, it seems a realistic goal to predict response to therapy, relapse risk, and even the risk of developing a secondary AML as suggested in children treated for acute lymphoblastic leukemia (ALL) [11].
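The notion of class prediction introduced above can be made concrete with a small sketch. The text does not specify an algorithm at this point, so the Python example below assumes one widely used scheme for two-class problems: genes are ranked by a signal-to-noise score, and each selected gene casts a weighted vote for one class. All function names and expression values are hypothetical and serve illustration only.

```python
"""Sketch of two-class prediction from expression data (hypothetical values)."""
from statistics import mean, stdev


def signal_to_noise(a, b):
    """(mean_a - mean_b) / (sd_a + sd_b): positive when a gene is higher in class A."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def train(class_a, class_b, n_genes):
    """Keep the n_genes with the largest absolute signal-to-noise score.

    class_a / class_b: lists of samples, each sample a list of expression values.
    Returns a list of (gene_index, weight, midpoint) tuples.
    """
    model = []
    for g in range(len(class_a[0])):
        a = [s[g] for s in class_a]
        b = [s[g] for s in class_b]
        model.append((g, signal_to_noise(a, b), (mean(a) + mean(b)) / 2))
    model.sort(key=lambda item: abs(item[1]), reverse=True)
    return model[:n_genes]


def predict(model, sample, labels=("A", "B")):
    """Each gene votes weight * (value - midpoint); the sign of the sum decides."""
    total = sum(w * (sample[g] - mid) for g, w, mid in model)
    return labels[0] if total > 0 else labels[1]


# Hypothetical training data with three genes: gene 0 is high in AML,
# gene 1 is high in ALL, and gene 2 is uninformative.
aml = [[9.0, 2.0, 5.0], [8.5, 2.5, 4.0], [9.5, 1.5, 6.0]]
all_ = [[3.0, 8.0, 5.5], [2.5, 7.5, 4.5], [3.5, 8.5, 5.0]]

model = train(aml, all_, n_genes=2)
selected_genes = sorted(g for g, _, _ in model)
prediction = predict(model, [8.8, 2.2, 5.1], labels=("AML", "ALL"))
```

In this toy setting the two informative genes are selected and the uninformative one is discarded, which mirrors the idea of basing a prediction on a small set of informative genes. Class discovery, by contrast, would use no labels and would typically rely on an unsupervised method such as clustering.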
Class prediction in AML

Distinction of acute myeloid leukemia from acute lymphoblastic leukemia based on gene expression profiles
For both AML and ALL, distinct and specific gene expression profiles were shown in the first study applying microarray technology to acute leukemia samples [6]. This is highly relevant, since the distinction of ALL and AML is part of routine daily practice and is necessary for therapeutic decisions. In current practice this distinction is based on cytomorphology, cytochemistry, and immunophenotyping. Golub et al. showed that it is also possible solely on the basis of gene expression profiles. In bone marrow samples of 27 patients with ALL and 11 patients with AML, a total of only 50 discriminatory genes were demonstrated to allow the separation of these large and heterogeneous entities from each other. In 36 out of the 38 cases the molecular diagnosis of leukemia was made correctly based on the gene expression profile as analyzed on the microarray. In a further set of 34 unknown samples that had not been used to build the classification model, the classification was also correct in 29 cases. These analyses represented the first, and a major, step towards molecular diagnostics of acute leukemias [6]. A confirmatory study [13] could discriminate, in 51 childhood leukemias, between AML and ALL as well as subdivide B-precursor from T-precursor ALL using cDNA arrays representing 4608 genes.

Class prediction of cytomorphologically defined subgroups of AML
Focusing on cytomorphology as the classification criterion, our group approached the identification of distinct gene expression patterns in FAB subtypes of AML and demonstrated that the FAB subtype can be predicted solely from the expression status of a limited set of genes [14]. The expression of 22 000 genes was analyzed using microarrays (U133A, Affymetrix) in patients with AML M0 (n = 8), M1 (n = 23), M2 (n = 28), M3 (n = 10), M3v (n = 9), M4 (n = 13), M4eo (n = 11), M5a (n = 10), M5b (n = 12), and M6 (n = 6). All 130 cases were characterized by cytomorphology and cytogenetics. RT-PCR was carried out in all cases with AML M2/t(8;21), APL, and M4eo. Immunophenotyping was performed in all M0 cases. Based on the expression signature of only one to three genes, it was possible to separate M3, M3v, M4eo, and M6 from all other subtypes with 100% accuracy (ac).
Fig. 5.1. Comparison between cytochemistry and gene expression in AML (n = 130). (a) Percentage of myeloperoxidase positive cells as measured by cytochemistry on bone marrow smears in the different FAB subtypes. (b) Signal of MPO (myeloperoxidase) measured on the microarray in the same cases as analyzed in (a). (c) Percentage of non-specific esterase positive cells as measured by cytochemistry on bone marrow smears in the different FAB subtypes. (d) Signal of CES1 (carboxyl-esterase-1) measured on the microarray in the same cases as analyzed in (c). [Panels plot values for the FAB subtypes M0, M1, M2, M3, M3v, M4, M4eo, M5a, M5b, and M6; the correlations shown in the figure are r = 0.803, P < 0.001 and r = 0.723, P < 0.001.]
As shown in Fig. 5.1(a)–(d), the percentage of myeloperoxidase or non-specific esterase positive cells, as measured by cytochemistry, correlates significantly with microarray signal intensities across all FAB subtypes, which supports the validity of these analyses.
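The kind of correlation reported here, between a cytochemical percentage and a microarray signal, can be computed as a Pearson coefficient. The following minimal Python sketch shows the calculation; the paired values are hypothetical placeholders and are not data from the study, and the function name is an assumption made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements for six cases:
# percentage of MPO-positive cells (cytochemistry) and MPO array signal.
mpo_positive_percent = [2, 35, 60, 95, 40, 5]
mpo_array_signal = [400, 5200, 9000, 15500, 6100, 900]

r = pearson_r(mpo_positive_percent, mpo_array_signal)
```

A value of r close to 1 indicates that cases with a high percentage of positive cells also tend to show a high array signal. A significance test (the P value) would require an additional step that is not shown in this sketch.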
Fig. 5.2. In the hierarchical cluster analysis and the principal component analysis (PCA) the feature space consisted of measured expression data from genes differentially expressed between FAB M3 and M3v. Visualization of data based on n = 14 probe sets discovered at 1% false discovery rate (FDR).
Going into further detail, gene expression profiles were analyzed in samples from patients with acute promyelocytic leukemia of the two morphologically different subtypes, AML M3 and AML M3v [15]. Both subtypes harbor a translocation t(15;17), resulting in PML/RARα fusion transcripts. However, their cytomorphology differs: the abnormal promyelocytes in M3 have heavy granulation and bundles of Auer rods, while M3v cases have non- or hypogranular blasts with a typical bi-lobulated nuclear configuration. The white blood cell (WBC) count at diagnosis is highly elevated in M3v cases, while normal WBC counts are generally observed in M3, despite a packed bone marrow in both subtypes. Gene expression profiling identified several genes that were strongly increased in M3v and decreased in M3 (Fig. 5.2). Most of these genes are (a) involved in cytoskeleton organization, cell adhesion, and migration, (b) involved in signal transduction and cell cycle control, or (c) differentially expressed during differentiation, which may be the basis for the morphologic and clinical differences described above.

While cytogenetic and molecular genetic differences had been reported previously in 126 cases with AML M5a and M5b [16], this analysis has been
extended by gene expression profiling to further refine molecular genetic patterns in 10 AML M5a and 12 AML M5b cases. Irrespective of the presence of 11q23 aberrations and of the FLT3-LM status, both groups were separated with 100% accuracy, based on the expression status of only two genes, HLX1 and PPP1R14B, which are differentially expressed during monocytic maturation. Further genes that were found to discriminate M5a and M5b were APOC2, which is markedly enhanced in the process of differentiation into macrophage-like cells, and RGS2, which may play a role in leukemogenesis. The HOXB family members HOXB2, HOXB5, and HOXB6 were expressed in M5/11q23-negative cases but not, or only at low levels, in M5/11q23-positive cases. Thus, the morphologically different leukemias AML M5a and M5b could be shown to reveal different cytogenetic and molecular genetic patterns [17].

Identification of specific genetic abnormalities based on gene expression profiles
Cytogenetic aberrations are considered to be disease defining in a large number of AML cases and represent the most important prognostic parameter in AML. Therefore, these entities have been incorporated into the new WHO classification of AML [18]. Today, chromosome banding analysis as well as FISH and RT-PCR are applied to identify these genetic abnormalities. In this context, Schoch et al. demonstrated that the cytogenetically defined subtypes, AML with t(8;21), AML with t(15;17), and AML with inv(16), are characterized by different and specific gene expression profiles. The underlying basic genetic alterations lead to patterns of gene expression that can be detected unequivocally by microarrays. A minimum set of only 13 genes was sufficient to predict the karyotypes accurately in the respective AML samples [19]. To examine the diagnostic applicability of microarray analyses further, this work was extended using the U133A microarray. All hybridization cocktails used in the initial profiling study and 13 additional patient samples were hybridized to the newly designed and improved U133A microarray. Genes represented on U95Av2 microarrays are also present on U133A microarrays. Next, using the NetAffx web analysis center, the corresponding U133A counterparts for the presented U95Av2 probesets were determined [20]. This search resulted in a total of 58 U133A counterparts for the 36 designated U95Av2 probesets. In order to visualize that the presented
diagnostic composition of genes accurately separates the three subtypes, all samples were subsequently analyzed by two-dimensional hierarchical clustering. Even with the U133A microarray expression data, which also included additional samples of each of the three cytogenetically defined AML subgroups, all AML cases were separated repeatedly according to their underlying chromosomal aberration. More importantly, when support vector machines (SVM), a common supervised machine learning algorithm, were trained with two-thirds of the data set based on these differentially expressed marker genes, the remaining one-third independent cohort of patients was classified robustly and accurately. Even following the inclusion of normal bone marrow samples into this analysis, the expression profiles of 35 genes were sufficient to predict with 100% accuracy whether the sample contained normal bone marrow, AML M2 with t(8;21), APL with t(15;17), or AML M4eo with inv(16). A further step was the addition of samples with AML carrying aberrations of chromosome 11q23, i.e., MLL rearrangements, representing an analysis of AML subtypes with recurring chromosomal aberrations as defined by the WHO [21]. Based on the gene expression profiles, a minimum set of 39 genes was sufficient to classify samples as normal bone marrow or as AML with one of the aberrations t(8;21), t(15;17), inv(16), or 11q23-rearrangements. A principal component analysis calculated from the expression intensities of these 39 genes visualizes this separation (Fig. 5.3). Thus, the differential expression of these 39 candidate genes is sufficient to classify cytogenetically defined subtypes of AML and to separate these from normal bone marrow. The accuracy of this classification as determined by a leave-one-out cross-validation amounts to 100%. Approaching this issue similarly, Debernardi et al.
[22] investigated 28 AMLs with t(8;21), t(15;17), and inv(16) and also included cases with different 11q23 aberrations as well as ten AML patients with a normal karyotype using U95A arrays (Affymetrix). In comparison to the data set described above, many discriminating genes were confirmed in this independent cohort of AMLs. Furthermore, the expression status of specific genes correlated with 11q23 aberrations as well as with AML with normal karyotype. The latter group was characterized by a distinctive up-regulation of members of the class I homeobox A and B gene families, implying a common underlying genetic lesion for AML with normal karyotype.
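The leave-one-out cross-validation mentioned above can be illustrated with a minimal sketch. It is not the procedure used in the cited studies: a simple nearest-centroid classifier is swapped in for the support vector machines, and the expression values, the three marker genes, and all function names are invented for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+

def centroid(rows):
    """Mean expression vector of a list of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def predict(train_x, train_y, sample):
    """Assign the sample to the class whose centroid is nearest."""
    classes = sorted(set(train_y))
    centroids = {
        c: centroid([x for x, y in zip(train_x, train_y) if y == c])
        for c in classes
    }
    return min(classes, key=lambda c: dist(centroids[c], sample))

def loocv_accuracy(xs, ys):
    """Leave-one-out cross-validation: hold out each sample exactly once,
    train on the rest, and count how often the held-out label is recovered."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)

# Toy data (assumption): three hypothetical marker genes, three subtypes.
samples = [
    [9.1, 2.0, 1.8], [8.7, 2.3, 2.1], [9.4, 1.7, 2.0],  # t(8;21)
    [2.2, 8.9, 1.9], [1.8, 9.3, 2.2], [2.1, 8.6, 1.7],  # t(15;17)
    [2.0, 2.1, 9.0], [1.9, 1.8, 8.8], [2.3, 2.2, 9.2],  # inv(16)
]
labels = ["t(8;21)"] * 3 + ["t(15;17)"] * 3 + ["inv(16)"] * 3
print(loocv_accuracy(samples, labels))
```

Because every sample is predicted by a model that never saw it, the resulting accuracy is a less optimistic estimate than the accuracy measured on the training data itself. In a real analysis the marker genes would also have to be reselected inside each fold.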
Gene expression profiling for the diagnosis of acute leukemias
Fig. 5.3. Principal component analysis based on U133A expression data of WHO classified AML subtypes with recurrent chromosome aberrations and normal bone marrow mononuclear cells from healthy volunteers. Sixty AML samples comprising the color-coded subgroups t(15;17) (n = 20), t(8;21) (n = 13), inv(16) (n = 12), and t(11q23)/MLL gene rearrangement positive samples (n = 15) can accurately be discriminated and are different from normal bone marrow, nBM (n = 9).
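To make the idea behind such a principal component plot concrete, the following sketch projects samples onto their first two principal components. It is an illustration under stated assumptions, not the software used for Fig. 5.3: the eigenvectors are approximated by power iteration with deflation, and the toy expression values and function names are invented.

```python
def center(rows):
    """Subtract each gene's mean so that every column has mean zero."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix (genes x genes) of centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_axis(matrix, iterations=500):
    """Power iteration: approximate the eigenvector with the largest eigenvalue."""
    p = len(matrix)
    vec = [1.0 / (k + 1) for k in range(p)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sum(v * v for v in nxt) ** 0.5
        if norm < 1e-12:  # no variance left in any direction
            break
        vec = [v / norm for v in nxt]
    return vec

def pca_scores(rows, k=2):
    """Coordinates of each sample on the first k principal components."""
    centered = center(rows)
    cov = covariance(centered)
    p = len(cov)
    axes = []
    for _ in range(k):
        axis = leading_axis(cov)
        eigenvalue = sum(axis[i] * cov[i][j] * axis[j]
                         for i in range(p) for j in range(p))
        axes.append(axis)
        # Deflation: remove the variance already explained by this axis.
        cov = [[cov[i][j] - eigenvalue * axis[i] * axis[j] for j in range(p)]
               for i in range(p)]
    return [[sum(a * v for a, v in zip(axis, row)) for axis in axes]
            for row in centered]

# Toy data (assumption): two groups of samples measured on three genes.
toy = [[1.0, 1.0, 5.0], [1.2, 0.9, 5.1], [5.0, 5.0, 1.0], [5.1, 4.8, 1.2]]
for sample_scores in pca_scores(toy):
    print([round(s, 3) for s in sample_scores])
```

Samples with similar expression profiles receive similar coordinates, so groups that differ in many genes at once separate along the first component. The sign of each component is arbitrary.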
Ross et al. analyzed gene expression profiles in 130 children with AML. They achieved prediction accuracies of 100% in acute promyelocytic leukemia, in the core binding factor (CBF) leukemias, and in acute megakaryocytic leukemia, the last of which is clinically relevant particularly in childhood AML. In addition, the prediction accuracy for AML with t(11q23)/MLL was higher than 90% and amounted to 98% in the training set. An even higher accuracy can probably be achieved by increasing the sample numbers to overcome the degree of biologic heterogeneity in this group, which is due to the different fusion partners of the MLL gene. These analyses also contribute strongly to the development of microarray technology as a diagnostic tool in acute leukemias [23].
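Subtype-specific prediction accuracies of this kind are usually derived by comparing predicted with true class labels one subtype at a time. The sketch below shows one common way to do this, using sensitivity and specificity in a one-versus-rest view; the labels and the function name are illustrative assumptions and do not reproduce data from the cited study.

```python
def per_class_metrics(true_labels, predicted_labels):
    """Return {class: (sensitivity, specificity)}, treating each class in
    turn as 'positive' and all other classes together as 'negative'."""
    pairs = list(zip(true_labels, predicted_labels))
    metrics = {}
    for cls in sorted(set(true_labels)):
        tp = sum(t == cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        tn = sum(t != cls and p != cls for t, p in pairs)
        sensitivity = tp / (tp + fn)
        # Specificity is undefined if no sample belongs to another class.
        specificity = tn / (tn + fp) if (tn + fp) else None
        metrics[cls] = (sensitivity, specificity)
    return metrics

# Invented example: one MLL case is misclassified as CBF leukemia.
truth = ["APL", "APL", "MLL", "MLL", "CBF"]
calls = ["APL", "APL", "MLL", "CBF", "CBF"]
for subtype, (sens, spec) in per_class_metrics(truth, calls).items():
    print(subtype, sens, spec)
```

A heterogeneous group such as t(11q23)/MLL would show up here as a class with reduced sensitivity, while the well-defined subtypes keep both values at 1.0.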
Although the prognostic relevance of AML with trisomy 8 as the sole cytogenetic abnormality is controversial, it provides a useful model for analyses of gene dosage effects. Virtaneva et al. focused on this subtype and asked whether this subgroup can be separated from AML with a normal karyotype based on its gene expression profile [24]. CD34+ cells were analyzed after Ficoll–Hypaque gradient centrifugation, labeled with a secondary antibody conjugated to magnetic beads, and purified over a magnetic column. However, in this analysis using HuGeneFL arrays (Affymetrix), a completely correct separation of the two cytogenetically different AML subgroups was not possible, possibly because trisomy 8 is not the disease-defining abnormality but rather a secondary aberration in addition to pathogenetically essential molecular events. Nonetheless, a gene dosage effect was demonstrated, since many genes encoded on chromosome 8 were expressed at higher levels in the group of AML with trisomy 8. This was confirmed for +8 and was also shown for other gains (+11, +13) and losses (–7 and 5q–) by Schoch et al. [25].
Because of its very poor clinical course and its characteristic, but not yet well-understood, biologic features, AML with a complex aberrant karyotype is an outstanding subtype of AML, accounting for 10% to 20% of all cases. Despite intensive treatment including allogeneic stem cell transplantation, long-term survival is achieved in less than 10% of patients. Schoch et al. analyzed AML with complex aberrant karyotype (n = 36) in comparison to AML with t(8;21) (n = 13), with inv(16) (n = 12), with rearrangement of the MLL gene (n = 15), with trisomy 8 as the sole cytogenetic abnormality (n = 10), and with a normal karyotype (n = 64) [26]. In pairwise comparisons the discrimination of AML with complex aberrant karyotype from every other subgroup was possible with 100% accuracy as assessed by leave-one-out cross-validation.
For each pairwise comparison, one to seven genes were sufficient to reach this optimal accuracy. The expression of HOXA9 and HOXA7 was discriminative between AML with complex aberrant karyotype and AML with t(8;21), t(15;17), and inv(16) as both HOX genes were expressed in complex aberrant karyotype but showed no, or a very low, expression in the other prognostically favorable subgroups. Compared to all other AML subtypes, AML with complex aberrant karyotype had a significantly higher expression of RAD21 (1.7-fold), which is involved in double-strand break repair and is anti-apoptotic (P = 0.0001).
In addition, expression of the following genes, which are involved in DNA repair and DNA damage-induced checkpoint signaling (categorized by GeneOntology terms), was also elevated 1.5- to 3-fold in AML with complex aberrant karyotype compared to all other subtypes (for all P < 0.0005): RAD1, RAD9, RAD23B, PIR51 (RAD51 interacting protein), NBS1, MSH6, UBL1, and ADPRTL2. It can be speculated that the high expression of these genes plays an important role in resistance to chemotherapeutic agents causing DNA damage.
AML with genetic aberrations detectable on the molecular level can be identified, based on their gene expression profiles
In the largest subgroup of AML, which includes 40% to 45% of all cases, no chromosomal rearrangements are present when applying chromosome banding analysis. Therefore, molecular studies are carried out to identify the underlying genetic defects in these AML. Several mutations have been identified. The most frequent ones are length and point mutations within the FLT3 gene and partial tandem duplications within the MLL gene. In a cohort of 1992 unselected newly diagnosed acute myeloid leukemia (AML) cases, 125 (6.3%) had a partial tandem duplication within the MLL gene (MLL-PTD) [27]. This mutation occurs mainly in cytogenetically normal AML and, like MLL-translocations (tMLL), is characterized by an unfavorable prognosis. Similar to translocations of 11q23 involving the MLL gene, the MLL-PTD occurs significantly more frequently in AML secondary to a previous cytotoxic therapy. Schnittger et al. asked whether it is possible to discriminate MLL-PTD (n = 10) from AML with a normal karyotype without this aberration (n = 30) and from AML with tMLL (n = 15) by different gene expression patterns using U133A microarrays (Affymetrix) [27]. By pairwise comparison it was not possible to define a specific expression profile discriminating the MLL-PTD positive from the negative cases. However, a specific expression profile was found for the various MLL-translocations as compared to the MLL-PTD group and normal karyotype group (Fig. 5.4). Many of the discriminative genes encoded DNA binding proteins that are involved in transcriptional regulation and developmental processes. Similarities were found between MLL-PTD and tMLL in the HOX gene expression pattern (HOXA7, HOXA9, HOXA4, HOXA5, HOXA10) although HOXB2, HOXB5, HOXB6, and HOXB7 are
Fig. 5.4. Principal component analysis based on U133A expression data to separate AML with translocation of the MLL gene t(11q23) from AML patient samples with detected partial tandem duplications within the MLL gene (MLL-PTD). This analysis includes twenty-five AML samples. Based on a subset of the 88 most discriminatively expressed genes, t(11q23)/MLL samples (n = 15), represented by turquoise spheres, and MLL-PTD samples (n = 10), represented by pink spheres, can accurately be discriminated.
expressed at lower levels in the tMLL group. These data suggest that, in tMLL and MLL-PTD positive AML, the HOX-gene regulation is altered in a common way; however, the overall expression pattern is markedly different between the two groups. The MLL-PTD pattern cannot be differentiated accurately from that of AML with normal karyotype and thus these two groups may be more closely related to each other than MLL-PTD is to tMLL. Focusing on the most frequent molecular alteration in AML (23% of all cases), Schnittger et al. evaluated AML with FLT3-length mutation (FLT3-LM), which is particularly frequent in AML with a normal karyotype (40%) [28]. The expression profile of AML with normal karyotype and FLT3-LM (n = 21) was compared to normal bone marrow samples (n = 9), AML with
t(8;21) (n = 13), inv(16) (n = 12), t(15;17) (n = 19), MLL-translocations (n = 15), trisomy 8 (n = 10), and complex aberrant karyotype (n = 31) [29]. The FLT3-LM group was discriminated from trisomy 8 with 97% accuracy and from all other karyotypically aberrant AML groups with 100% accuracy. The confidence was 0.85 for the comparison to AML with complex aberrant karyotype and 1.0 for all other comparisons. Using this data analysis algorithm, it was not possible to discriminate between AML with normal karyotype and FLT3-LM (n = 27) and those without FLT3-LM (n = 21), nor was it possible after including the patients with FLT3D835/I836 (n = 6) in the FLT3-mutation positive group. However, the same analysis within each FAB subgroup resulted in a clear distinction between FLT3-LM positive and FLT3-negative cases. The 20 top genes found to be discriminatively expressed in each analysis varied substantially between the FAB subtypes, although many are downstream target genes of the FLT3 pathway. These data suggest that the effects of a mutationally activated FLT3 receptor may be different, depending on a primary genetic alteration or on the composition of different genetic alterations in addition to the FLT3-LM. These additional alterations may vary between the several morphological subtypes and cause a differentiation stop at different levels. In a similar approach Lacayo et al. could identify cases with FLT3-LM, FLT3 activating loop mutations, and those without either mutation in a series of 81 children with AML [30], although there were significant overlaps between the respective groups. In addition, genes indicative of the prognosis were identified; however, they have not been validated in an independent set of patients, which is particularly needed given the relatively small number of samples used in this analysis.
Approaching class discovery in AML
Given the large amount of data generated by gene expression profiling, a main focus in future studies will be the refinement of the subcategorization in subgroups with so far identical patterns of diagnostic results but different clinical outcomes. This is urgently needed, especially in the two major groups of AML, i.e., cases with normal karyotypes (45% of patients) and AML with complex aberrant karyotype (15% of patients). Approaching this line of research, Qian et al. were able to identify distinct subtypes of therapy-related AML (t-AML) [31] when analyzing CD34+ hematopoietic
progenitor cells from 14 patients using U95A microarrays (Affymetrix). Although gene expression patterns typical of arrested differentiation in early progenitor cells commonly were detected in all cases, two specific characteristics could be found. Cases with a –5/del(5q) had a higher expression of genes involved in cell cycle control, checkpoints, or growth and a loss of expression of the gene encoding IFN consensus sequence-binding protein. Another subgroup was characterized by the down-regulation of transcription factors involved in early hematopoiesis and the overexpression of proteins involved in signal transduction like FLT3 and cell survival like BCL2. Targeting these different pathways may lead to more specific treatment approaches in this very poor prognostic group of AML. Additional studies tried to unravel the biologic heterogeneity in AML with normal karyotype and assessed the relation of their findings to the clinical course of the patients [32–34]. Taken together, the available gene expression analyses in AML indicate that, based on the expression status of a small number of genes, AML can be discriminated from ALL and that the disease-defining genetic abnormalities can be predicted in cytogenetically defined AML subtypes, due to their highly specific gene expression profiles. The basic requirements for a diagnostic tool are thus fulfilled [5].
Comparison of protein expression and gene expression levels as assessed by flow cytometry and microarrays
Expression data obtained by microarray analyses were correlated to protein expression data generated by multiparameter flow cytometry, which is used as a standard method for diagnosing and subclassifying AML and ALL. Kern et al. analyzed 39 relevant markers in 113 patients with newly diagnosed AML and ALL and 4 normal bone marrow samples by both methods in parallel (Affymetrix U133A microarray) [35]. A high degree of correlation between protein expression and RNA abundance was observed with regard to both positivity/negativity and quantitative data. Thus, in 1512 of 2187 (69.1%) comparisons congruent results were obtained with regard to positivity or negativity of expression, respectively. Moreover, in genes most relevant for diagnosing and subclassifying AML and ALL, i.e., CD13 (as an example see Fig. 5.5), CD33, MPO, CD22, CD79a, CD19, CD10, and TdT, congruent results were obtained in 75% to 100% of comparisons. These
Fig. 5.5. Correlation of protein and mRNA levels in AML. The percentages of cells stained positive for CD13 (ANPEP) by multiparameter flow cytometry (MFC) are plotted against signal intensities for CD13 obtained by microarray analysis (MA) in 812 cases with AML (P < 0.0001, Spearman rank correlation). Axes: % positive cells by flow cytometry (1 to 100) versus mean fluorescence intensity by microarray (1 to 10000).
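The Spearman rank correlation named in the legend of Fig. 5.5 can be sketched as follows. This is a generic textbook formulation (Pearson correlation of the ranks, with tied values sharing their average rank), not the statistics package used by the authors; the paired values are invented, and the P value is not computed here.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Invented example: % CD13-positive cells versus microarray signal.
percent_positive = [5, 20, 60, 95, 40]
array_signal = [30, 200, 1500, 9000, 450]
print(spearman(percent_positive, array_signal))
```

Because only ranks enter the calculation, the coefficient is insensitive to the very different scales of the two measurements, which is why it suits a comparison of percentages with signal intensities.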
data are considered as evidence that protein expression is correlated highly to mRNA abundance in AML and ALL. By taking advantage of these observations, new antigens may be identified, which are expressed on the cell surface of AML and ALL cells and which may be promising targets to monitor minimal residual disease.
Prognostic studies in AML based on genetic profiling
Yagi et al. [36] analyzed 54 pediatric AML cases using Affymetrix U95A arrays and focused on the reproducibility of some FAB subtypes and especially on gene patterns to predict outcome. After unsupervised clustering, they were able to differentiate patients with t(8;21) from those with inv(16), and from those demonstrating an AML M4/5, or AML M7 phenotype or immunophenotype by specific gene expression signatures. Within this unsupervised analysis, no specific profile was found that correlated to the prognosis of the patients. Since the inclusion of further cases with other FAB subtypes and
cytogenetic abnormalities (no karyotype available in 9 of 54 cases) resulted in an increased heterogeneity, the authors restricted their further analyses to the genetically and morphologically better defined subentities. For further calculation, data were analyzed in a supervised manner with respect to outcome and prognosis. A subset of 35 genes was selected that was independent of the morphology or karyotype of the patients; some of them are associated with regulation of the cell cycle or with apoptosis. By hierarchical cluster analysis, patients could be classified into high-risk and low-risk groups with highly significant differences in EFS (P < 0.001). However, demonstrating the difficulties in finding prognostic markers using gene expression profiling, these markers described by Yagi et al. had no prognostic impact in an independent analysis in another data set [23]. Bullinger et al. analyzed 65 peripheral blood and 54 bone marrow samples in patients with AML [33]. Based on the 6283 most variably expressed genes, they were able to reproduce cytogenetically defined AML subgroups and, in addition, to define two different groups with highly differing prognosis based on gene expression profiles. While both groups mainly included AML cases with normal karyotypes without differences in many prognostic parameters, it is noteworthy that the group with the poorer prognosis included more patients with monosomy 7, complex aberrant karyotypes, and length mutations of FLT3, while the group with the better prognosis included more patients with inv(16). Thus, the observed differences in the prognosis between the two groups may be largely due to imbalances in profiles of established prognostic factors rather than due to the identification of a new biologically characterized subgroup of AML. The genes published by Bullinger et al. should be tested in independent cohorts of AML patients to support their prognostic power further.
Similar results have been reported by Valk et al., who discovered, based on microarray analysis, 16 groups of AML featuring distinct gene expression profiles which, in addition, showed significant differences in clinical course [32]. However, while many of the identified groups were characterized by specific cytogenetic aberrations known to be highly predictive of outcome, none of the groups was restricted to cases without cytogenetic abnormalities. Thus, the task remains to identify markers capable of discriminating prognostically different cases out of the heterogeneous group of AML with normal karyotype.
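The unsupervised grouping of patients referred to in these studies can be illustrated with a small agglomerative clustering sketch. The cited papers do not specify their settings in this chapter, so the choices here are assumptions: average linkage, a distance of 1 minus the Pearson correlation, and invented expression profiles and function names.

```python
def pearson(a, b):
    """Pearson correlation of two expression profiles (non-constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

def correlation_distance(a, b):
    """0 for perfectly correlated profiles, 2 for anti-correlated ones."""
    return 1 - pearson(a, b)

def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.
    Returns clusters as sorted lists of sample indices."""
    clusters = [[i] for i in range(len(profiles))]

    def linkage(c1, c2):
        # Average of all pairwise distances between the two clusters.
        total = sum(correlation_distance(profiles[i], profiles[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        a, b = min(
            ((a, b) for a in range(len(clusters))
             for b in range(a + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Invented profiles: samples 0 and 1 rise across genes, 2 and 3 fall.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.0, 2.2],
]
print(average_linkage(profiles, 2))
```

Note that the clustering itself uses no outcome data. Whether the resulting groups differ in prognosis has to be tested afterwards, which is why imbalances in known prognostic factors between clusters, as discussed above, matter for interpretation.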
An improvement in this direction has been reported by Kern et al., who analyzed gene expression profiles in 205 patients with AML and normal karyotype [34]. In order to identify genetically defined subgroups, an unsupervised principal component analysis was performed, which revealed 79% of cases clustering together, while a subgroup comprising 21% of cases formed another cluster. Importantly, the analysis of known genetic markers including presence of length mutations and point mutations of FLT3, partial tandem duplications of MLL, or mutations of CEBPA, NRAS, or CKIT, did not reveal differences between the two groups. Significant differences were found, however, in their phenotypes, with more monocytic features in the smaller group. Analysis of differentially regulated genetic pathways revealed CD14, WT1, MYCN, HCK, and SPTBN1 as discriminating genes. Stressing the potential impact of this analysis on the clinical management of AML, these two groups differed significantly in event-free survival. Thus, it was demonstrated here too that, within the group of AML with normal karyotype, much-needed genetic markers with prognostic impact can be identified by use of gene expression profiling. Regarding the biologic heterogeneity of AML in general, and of AML with normal karyotype in particular, it is anticipated that further large-scale studies in the context of clinical trials are needed to fully characterize and validate novel and clinically relevant subgroups in AML.
Characterization of acute lymphoblastic leukemia
Representing pivotal and basic analyses, Golub et al. showed that ALL can be distinguished from AML based on the expression status of a small number of genes [6] and Moos et al. divided 51 childhood leukemia samples into AML, B-lineage ALL, and T-lineage ALL using cDNA arrays with 4608 genes [13]. In addition, in the latter study, cases with low- and high-risk ALL, as well as patients having a TEL-AML1 fusion transcript, were identified.
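How a small number of discriminating genes might be picked for a two-class problem such as ALL versus AML can be sketched with a signal-to-noise statistic, the difference of the class means divided by the sum of the class standard deviations. This statistic is commonly associated with the analysis of Golub et al. [6], but the sketch below is only an illustration: the use of the sample standard deviation, the gene names, and the expression values are assumptions.

```python
from statistics import mean, stdev

def signal_to_noise(class_a, class_b):
    """(mean_a - mean_b) / (sd_a + sd_b); positive if higher in class A.
    Assumes at least two values per class and non-zero total spread."""
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))

def rank_genes(expression, labels, class_a, class_b):
    """Order genes by the absolute signal-to-noise score, best first.
    expression maps a gene name to its values across all samples."""
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == class_a]
        b = [v for v, lab in zip(values, labels) if lab == class_b]
        scores[gene] = signal_to_noise(a, b)
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return [(gene, scores[gene]) for gene in ranked]

# Invented data: three ALL and three AML samples, hypothetical genes.
labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
expression = {
    "gene_lymphoid": [10.0, 12.0, 14.0, 1.0, 2.0, 3.0],
    "gene_myeloid": [2.0, 3.0, 2.5, 9.0, 11.0, 10.0],
    "gene_flat": [5.0, 6.0, 4.0, 5.5, 4.5, 6.5],
}
for gene, score in rank_genes(expression, labels, "ALL", "AML"):
    print(gene, round(score, 2))
```

Genes with large absolute scores differ strongly between the classes relative to their within-class spread, so a short list taken from the top of this ranking is a natural candidate set for class prediction.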
Specific gene expression patterns are associated with distinct subtypes of B-precursor and T-precursor ALL
Armstrong et al. were the first to demonstrate that childhood ALL with chromosomal aberrations involving the MLL gene can be regarded as a molecularly defined entity, distinct from other ALL [10]. MLL-positive ALL
had a distinct gene expression profile consistent with an early hematopoietic progenitor cell expressing multilineage markers and specific HOX genes (assessed on U95A, Affymetrix). The comparison of these ALL with MLL aberrations to other ALL cases as well as to AML resulted in a clear separation of all three groups from each other. A milestone in microarray analysis with respect to class discovery, class prediction, and prediction of outcome was the report by Yeoh et al. on 327 childhood ALL cases analyzed by Affymetrix U95A arrays [11]. Patients were discriminated according to their cytogenetic and immunological as well as their molecular subtype of ALL: T-ALL, E2A-PBX1, BCR/ABL, TEL/AML1, MLL gene rearrangements, and hyperdiploid ALL. Many of the relevant genes were also verified in an analysis of adult ALL patients [37]. As some patients were not classified according to these categories, a novel subgroup of ALL was postulated, characterized by high expression of genes including the receptor phosphatase PTPRM and LHFPL2. Most surprisingly, not only could the therapeutic outcome in most children with ALL be predicted, but specific genes in ALL blasts at diagnosis also appeared to indicate an increased risk of developing a therapy-induced AML after successful treatment of ALL. Kohlmann et al. analyzed adult ALL samples comprising three precursor B-ALL subtypes, t(9;22) (n = 15), t(4;11) (n = 9), or t(8;14) (n = 4), and precursor T-ALL (n = 9). All of these subgroups were also discriminated accurately [21]. In addition, the global gene expression profiles of ALL (n = 10) and AML (n = 15), both t(11q23)/MLL-positive, were analyzed by Kohlmann et al. (U133A arrays, Affymetrix) [38]. Based on 20 top-ranked genes, both leukemias were discriminated with 100% accuracy (by permutation-based neighborhood analysis).
In an expanded analysis, the top 200 up-regulated and down-regulated genes were identified using the SAM program [39] and functionally annotated by means of NetAffx [20]. Out of the first set of 20 genes, 17 were contained in this larger set of 400 genes. Besides IGHM (IgM heavy chain), VPREB1 (surrogate light chain), and CD22, the genes PAX5 and RRAS2 were highly or very highly expressed in t(11q23)/MLL-ALL, while in t(11q23)/MLL-AML their expression was low or absent. PAX5 mediates B-lineage commitment by repressing the transcription of non-lymphoid genes and by simultaneously activating the expression of B-lineage-specific genes. RRAS2 (TC21), a Ras-like guanosine triphosphatase (GTPase) with high oncogenic potential, might be an
interesting candidate with respect to leukemogenesis. Genes with higher expression in t(11q23) AML as compared to t(11q23) ALL included CST3 and APLP2, to date related only to Alzheimer’s disease. A third candidate gene, CLN2, has a role in lysosomal storage disorders. Although both acute leukemias show rearrangement of the MLL gene, different expression patterns lead to the designation of differentially expressed genes. Considering the clear B-lineage commitment, the recently reported finding that t(11q23)/MLL in childhood ALL represents a distinct disease termed ‘‘mixed-lineage leukemia’’ [10] may not be generalizable. Rather, the observed changes in expression patterns in MLL-positive leukemias may depend strongly on the differing cellular background, myeloid or lymphoid. A recent analysis aimed at analyzing ALL cases in even more detail, combining the immunologic classification proposed by the EGIL group [40] with cytogenetic findings [41]. Based on this classification, gene expression signatures were assessed in cases with Pro-B-ALL/t(4;11) (n = 25), c-ALL/Pre-B-ALL (with t(9;22) n = 35, without t(9;22) n = 30), mature B-ALL/t(8;14) (n = 13), Pro-T-ALL (n = 6), Pre-T-ALL (n = 13), and cortical T-ALL (n = 20). Applying various support vector machines (SVM) with a tenfold cross-validation approach, the prediction accuracy for discriminating T-precursor from B-precursor ALL was 100% in this analysis. While principal component analysis of B-precursor ALL cases yielded distinct clusters for Pro-B-ALL, c-ALL/Pre-B-ALL, and mature B-ALL, c-ALL/Pre-B-ALL with t(9;22) were not discriminated completely from those without t(9;22). Accordingly, classifying B-precursor ALL with SVM resulted in an accuracy of 87.4%, with misclassification occurring mainly between c-ALL/Pre-B-ALL with and without t(9;22). This is in accordance with previous analyses demonstrating the difficulty of identifying a specific gene expression signature for B-precursor ALL with t(9;22) [11].
Furthermore, cortical T-ALL were found to cluster distinctly from immature T-ALL cases; however, there was a large overlap between Pro-T-ALL and Pre-T-ALL and even between Pro-T-ALL and biphenotypic acute leukemia cases carrying both T-lymphatic and myeloid features. These analyses will help to optimize the classification of ALL. While for well-defined subgroups with specific genetic alterations a 100% prediction accuracy could be achieved (Pro-B-ALL and mature B-ALL), it was lower for other subgroups, suggesting that a
restructured and even more comprehensive classification may be necessary in this disease. A first step in this direction has been suggested by Ferrando et al. who identified distinct gene expression signatures in T-precursor ALL and related them to normal thymocyte development [42, 43]. Accordingly, Pro-T and LYL1-positive cases were distinguished from early cortical thymocyte and HOX11-positive cases as well as from TAL1-positive late cortical thymocyte cases. This approach may result in a classification which is based on more constantly detectable changes between different maturational stages and therefore may be more reproducible. The major role of cytogenetic aberrations in the characterization of acute leukemia entities has been proven in an analysis of ALL cases comparing gene expression signatures in children and adults [37]. Using genes differentially expressed in different childhood ALL subgroups comprising t(11q23), t(9;22), and T-ALL [10] for classification of adult ALL cases resulted in a 100% prediction accuracy. This analysis not only stresses the pathogenetic importance of genetics in leukemia even when considering both childhood and adult cases but also further proves the reproducibility of microarray analyses in general, yielding identical data in two analyses performed at different sites on independent samples.
Prediction of response in ALL
Hofmann et al. analyzed 25 bone marrow samples using HuGeneFL arrays (Affymetrix) from 19 patients with Philadelphia-positive ALL who were treated with the BCR/ABL tyrosine kinase inhibitor imatinib [44, 45]. Patients were selected according to their cytogenetic response to the drug. Ninety-five genes were identified to predict the treatment outcome in all cases; another 56 genes were found to predict leukemia cells that had secondary resistance to the drug after remission. Resistant cells expressed adenosine triphosphate (ATP) synthetases such as ATP5A1 and ATP5C1 and had reduced expression of the proapoptotic gene BAK1 and the cell-cycle control gene p15 INK4b. Cheok et al. elucidated the genomics of cellular responses to treatment with methotrexate and mercaptopurine, alone or in combination, before and after treatment in 60 childhood ALL cases (U95A, Affymetrix) [46]. A total of 124 differentially expressed genes accurately discriminated the four
possible treatment groups. Genes included those involved in apoptosis, mismatch repair, cell-cycle control, and stress response. Leukemia cells in different patients appeared to react in a similar manner after specific treatments and therefore to share common pathways of genomic responses to a variety of drug schedules. In a similar approach Holleman et al. analyzed 173 children with B-precursor ALL for in vitro sensitivity to the four major drugs used to treat the disease and identified genes differentially expressed between sensitive and resistant cases [47]. Combining their gene expression findings into a score, they were able to create a highly powerful and independent predictor consisting of 124 genes, 121 of which had not been identified in this context before. With regard to adult T-precursor ALL, genes differentially expressed between clinically responsive and refractory cases were identified in 33 patients and validated in an additional 18 patients [48]. Overall, these results point to further applications of microarray profiling to tailor therapy, including the avoidance of drugs unlikely to induce remission. Further expression profile studies with independent test cohorts are especially needed for this purpose. Ross et al. rehybridized 132 childhood ALL probes from their previous cohort to the Affymetrix U133 set and identified almost 60% of new discriminating genes in comparison to their previous analysis with U95A [11, 49]. As a proportion of these new genes were highly ranked as class discriminators and led to an overall diagnostic accuracy of 97% in several analyses, the authors proposed to assess these gene expression profiles in a prospective clinical trial. The main clinical focus should be on the diagnosis of ALL with respect to accuracy, practicability, and cost-effectiveness in comparison with standard diagnostic techniques.
A global approach for the diagnosis in leukemia using gene expression profiling
Focusing on the diagnostic application of microarray technology, the essential role for an accurate diagnosis and classification of leukemias must be recognized as the basis for the adequate management of patients. The diagnostic accuracy and efficiency of present methods may be significantly improved by gene expression profiling. Haferlach et al. analyzed gene expression profiles in bone marrow and peripheral blood samples from 937 patients with all clinically relevant leukemia subtypes (n = 892) and
healthy donors (n = 45) by U133A and B GeneChips (Affymetrix) [5]. For each of the 12 most relevant leukemia subgroups, differentially expressed genes were calculated. Class prediction was performed using support vector machines. Prediction accuracies were estimated by tenfold cross-validation and assessed for robustness in a 100-fold resampling approach using randomly chosen test-sets consisting of one-third of the samples. Applying the top 100 genes of each subgroup, an overall prediction accuracy of 95.1% was achieved. In addition, this very high accuracy has been confirmed in a resampling approach (median, 93.8%; range, 91.4% to 95.8%). In particular, AML with t(15;17), t(8;21), or inv(16), CLL, and Pro-B-ALL with t(11q23) were classified with 100% sensitivity and 100% specificity. Accordingly, cluster analysis completely separated all of the 13 subgroups analyzed. This analysis demonstrates that, by use of gene expression profiling, all clinically relevant subentities of leukemia can be predicted with a very high accuracy. It forms an important basis for the further development of microarray technology as a diagnostic tool for leukemias.
Conclusions and future directions
The introduction of microarray technology has been a major step towards the comprehensive biologic characterization of various diseases and will allow the identification of yet unknown subentities and even new biologically defined entities. In particular, it has become clear that distinct cytogenetically defined subtypes in acute leukemia carry highly specific gene expression profiles, which can be used to identify these subtypes based on microarray analyses with very high accuracy [5]. Therefore, it is expected that the routine application of microarrays will significantly improve molecular diagnostics in acute leukemia [50] and will provide deep insights into the pathogenetic alterations of malignant and non-malignant hematopoietic cells.
In addition, these comprehensive data are anticipated to allow the identification of prognostically relevant markers as well as disease-specific markers, which can be applied for programs of monitoring minimal residual disease. Of highest clinical relevance is the capability of microarray approaches to identify pathogenetically essential structures and alterations, which can be targeted by future drugs that hopefully will lead to an improved management of these diseases.
Gene expression profiling for the diagnosis of acute leukemias
Adequate diagnosis and subclassification of leukemias today is based on a comprehensive combination of various methods including cytomorphology, cytochemistry, multiparameter immunophenotyping, cytogenetics, fluorescence in-situ hybridization, and quantitative and non-quantitative molecular genetics. This is costly and time-consuming, and requires highly skilled personnel in centralized reference laboratories. Using microarray methods, substantial steps forward may be made towards both optimizing diagnostic capabilities and reducing financial investment. A significant number of today’s diagnostic approaches can already be reproduced by gene expression profiling; however, further large trials are needed to establish the validity of this approach for diagnosis, prognosis, and treatment decisions. Such trials (Microarray Innovation in LEukemia, MILE) will be conducted by the European Leukemia Network (ELN) using ten sites and 4000 arrays and will hopefully answer the question of which techniques will be used for diagnosis, treatment decisions, and prognosis in acute leukemias in the genomics era.
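The resampling scheme described earlier in this chapter (repeated random splits with one-third of the samples held out as a test set, summarized by the median and range of the accuracies) can be illustrated with a minimal Python sketch. It is not the published analysis: a simple nearest-centroid classifier stands in for the support vector machines, the expression data are simulated, and all function names and parameter values are hypothetical choices made for illustration.

```python
import random
import statistics


def nearest_centroid_fit(samples, labels):
    """Return {label: centroid} computed from the training samples."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def nearest_centroid_predict(centroids, x):
    """Assign x to the class with the closest centroid (squared Euclidean)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def resampled_accuracies(samples, labels, n_rounds=100, test_fraction=1 / 3, seed=0):
    """Repeat random train/test splits and collect the test-set accuracies."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_test = round(len(samples) * test_fraction)
    accuracies = []
    for _ in range(n_rounds):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        centroids = nearest_centroid_fit(
            [samples[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        correct = sum(
            nearest_centroid_predict(centroids, samples[i]) == labels[i]
            for i in test_idx
        )
        accuracies.append(correct / n_test)
    return accuracies


def make_toy_data(n_per_class=30, n_genes=20, seed=1):
    """Simulate two subgroups whose genes differ by a fixed mean shift (assumption)."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label, shift in (("subgroup_A", 0.0), ("subgroup_B", 1.5)):
        for _ in range(n_per_class):
            samples.append([rng.gauss(shift, 1.0) for _ in range(n_genes)])
            labels.append(label)
    return samples, labels


samples, labels = make_toy_data()
accs = resampled_accuracies(samples, labels)
print(f"median accuracy: {statistics.median(accs):.3f}")
print(f"range: {min(accs):.3f} to {max(accs):.3f}")
```

Reporting the median and range over many random splits, rather than a single accuracy figure, shows how strongly the estimate depends on which samples happen to fall into the test set.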
REFERENCES
1. Haferlach, T., Kern, W., Schnittger, S., and Schoch, C. Modern diagnostics in acute leukemias. Crit. Rev. Hematol. Oncol. 2005; 56: 223–34.
2. Fenaux, P., Le Deley, M. C., Castaigne, S. et al. Effect of all transretinoic acid in newly diagnosed acute promyelocytic leukemia. Results of a multicenter randomized trial. European APL 91 Group. Blood 1993; 82: 3241–9.
3. Kantarjian, H., Sawyers, C., Hochhaus, A. et al. Hematologic and cytogenetic responses to imatinib mesylate in chronic myelogenous leukemia. N. Engl. J. Med. 2002; 346: 645–52.
4. Haferlach, T., Kohlmann, A., Kern, W. et al. Gene expression profiling as a tool for the diagnosis of acute leukemias. Semin. Hematol. 2003; 40: 281–95.
5. Haferlach, T., Kohlmann, A., Schnittger, S. et al. A global approach to the diagnosis of leukemia using gene expression profiling. Blood 2005; 106: 1189–98.
6. Golub, T. R., Slonim, D. K., Tamayo, P. et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999; 286: 531–7.
7. Ramaswamy, S., Tamayo, P., Rifkin, R. et al. Multiclass cancer diagnosis using tumor gene expression signatures. Proc. Natl Acad. Sci. USA 2001; 98: 15149–54.
8. Ramaswamy, S. and Golub, T. R. DNA microarrays in clinical oncology. J. Clin. Oncol. 2002; 20: 1932–41.
9. Alizadeh, A. A., Eisen, M. B., Davis, R. E. et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000; 403: 503–11.
10. Armstrong, S. A., Staunton, J. E., Silverman, L. B. et al. MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nat. Genet. 2002; 30: 41–7.
11. Yeoh, E. J., Ross, M. E., Shurtleff, S. A. et al. Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell 2002; 1: 133–43.
12. Zhan, F., Hardin, J., Kordsmeier, B. et al. Global gene expression profiling of multiple myeloma, monoclonal gammopathy of undetermined significance, and normal bone marrow plasma cells. Blood 2002; 99: 1745–57.
13. Moos, P. J., Raetz, E. A., Carlson, M. A. et al. Identification of gene expression profiles that segregate patients with childhood leukemia. Clin. Cancer Res. 2002; 8: 3118–30.
14. Haferlach, T., Kohlmann, A., Dugas, M. et al. Gene expression profiling is able to reproduce different phenotypes in AML as defined by the FAB classification. Blood 2002; 100: 195a.
15. Haferlach, T., Kohlmann, A., Schnittger, S. et al. AML M3 and AML M3 variant each have a distinct gene expression signature but also share patterns different from other genetically defined AML subtypes. Genes Chromosomes Cancer 2005; 43: 113–27.
16. Haferlach, T., Schoch, C., Schnittger, S. et al. Distinct genetic patterns can be identified in acute monoblastic and acute monocytic leukaemia (FAB AML M5a and M5b): a study of 124 patients. Br. J. Haematol. 2002; 118: 426–31.
17. Haferlach, T., Schnittger, S., Kohlmann, A. et al. Genetic profiling in acute monoblastic versus acute monocytic leukemia: a gene expression study on 22 patients. Blood 2002; 100: 195a.
18. Jaffe, E. S., Harris, N. L., Stein, H., and Vardiman, J. W. World Health Organization Classification of Tumours. Pathology and Genetics of Tumours of Haematopoietic and Lymphoid Tissues. Lyon: IARC Press, 2001.
19. Schoch, C., Kohlmann, A., Schnittger, S. et al. Acute myeloid leukemias with reciprocal rearrangements can be distinguished by specific gene expression profiles. Proc. Natl Acad. Sci. USA 2002; 99: 10008–13.
20. Liu, G., Loraine, A. E., Shigeta, R. et al. NetAffx: Affymetrix probesets and annotations. Nucl. Acids Res. 2003; 31: 82–6.
21. Kohlmann, A., Schoch, C., Schnittger, S. et al. Molecular characterization of acute leukemias by use of microarray technology. Genes Chromosomes Cancer 2003; 37: 396–405.
22. Debernardi, S., Lillington, D. M., Chaplin, T. et al. Genome-wide analysis of acute myeloid leukemia with normal karyotype reveals a unique pattern of homeobox gene expression distinct from those with translocation-mediated fusion events. Genes Chromosomes Cancer 2003; 37: 149–58.
23. Ross, M. E., Mahfouz, R., Onciu, M. et al. Gene expression profiling of pediatric acute myelogenous leukemia. Blood 2004; 104: 3679–87.
24. Virtaneva, K., Wright, F. A., Tanner, S. M. et al. Expression profiling reveals fundamental biological differences in acute myeloid leukemia with isolated trisomy 8 and normal cytogenetics. Proc. Natl Acad. Sci. USA 2001; 98: 1124–9.
25. Schoch, C., Kohlmann, A., Dugas, M. et al. Genomic gains and losses influence expression levels of genes located within the affected regions: a study on acute myeloid leukemias with trisomy 8, 11, or 13, monosomy 7, or deletion 5q. Leukemia 2005; 19: 1224–8.
26. Schoch, C., Kern, W., Kohlmann, A. et al. Acute myeloid leukemia with a complex aberrant karyotype is a distinct biological entity characterized by genomic imbalances and a specific gene expression profile. Genes Chromosomes Cancer 2005; 43: 227–38.
27. Schnittger, S., Kohlmann, A., Haferlach, T. et al. Acute myeloid leukemia (AML) with partial tandem duplication of the MLL-gene (MLL-PTD) can be discriminated from MLL-translocations based on specific gene expression profiles. Blood 2002; 100: 310a.
28. Schnittger, S., Schoch, C., Dugas, M. et al. Analysis of FLT3 length mutations in 1003 patients with acute myeloid leukemia: correlation to cytogenetics, FAB subtype, and prognosis in the AMLCG study and usefulness as a marker for the detection of minimal residual disease. Blood 2002; 100: 59–66.
29. Schnittger, S., Kohlmann, A., Dugas, M. et al. Acute myeloid leukemia (AML) with FLT3-length mutations (FLT3-LM) can be discriminated from AML without FLT3-LM in distinct AML-subtypes based on specific gene expression profiles. Blood 2002; 100: 311a.
30. Lacayo, N. J., Meshinchi, S., Kinnunen, P. et al. Gene expression profiles at diagnosis in de novo childhood AML patients identify FLT3 mutations with good clinical outcomes. Blood 2004; 104: 2646–54.
31. Qian, Z., Fernald, A. A., Godley, L. A., Larson, R. A., and Le Beau, M. M. Expression profiling of CD34+ hematopoietic stem/progenitor cells reveals distinct subtypes of therapy-related acute myeloid leukemia. Proc. Natl Acad. Sci. USA 2002; 99: 14925–30.
32. Valk, P. J., Verhaak, R. G., Beijen, M. A. et al. Prognostically useful gene-expression profiles in acute myeloid leukemia. N. Engl. J. Med. 2004; 350: 1617–28.
33. Bullinger, L., Döhner, K., Bair, E. et al. Use of gene-expression profiling to identify prognostic subclasses in adult acute myeloid leukemia. N. Engl. J. Med. 2004; 350: 1605–16.
34. Kern, W., Schoch, C., Kohlmann, A. et al. Identification of biologically distinct and clinically relevant subentities in patients with acute myeloid leukemia and normal karyotypes by use of gene expression profiling. Blood 2004; 104: 3078–85.
35. Kern, W., Kohlmann, A., Wuchter, C. et al. Correlation of protein expression and gene expression in acute leukemia. Cytometry 2003; 55B: 29–36.
36. Yagi, T., Morimoto, A., Eguchi, M. et al. Identification of a gene expression signature associated with pediatric AML prognosis. Blood 2003; 102: 1849–56.
37. Kohlmann, A., Schoch, C., Schnittger, S. et al. Pediatric acute lymphoblastic leukemia (ALL) gene expression signatures classify an independent cohort of adult ALL patients. Leukemia 2004; 18: 63–71.
38. Kohlmann, A., Schoch, C., Dugas, M. et al. New insights into MLL gene rearranged acute leukemias using gene expression profiling: shared pathways, lineage commitment, and partner genes. Leukemia 2005; 19(6): 953–64.
39. Vapnik, V. Statistical Learning Theory. New York: Wiley, 1998.
40. Bene, M. C., Castoldi, G., Knapp, W. et al. Proposals for the immunological classification of acute leukemias. European Group for the Immunological Characterization of Leukemias (EGIL). Leukemia 1995; 9: 1783–6.
41. Kern, W., Kohlmann, A., Schoch, C. et al. Gene expression profiling in adult acute lymphoblastic leukemia, biphenotypic acute leukemia, and acute myeloid leukemia without differentiation: confirmation of immunophenotypic and cytogenetic diagnostic findings. Blood 2004; 104: 3078–85.
42. Ferrando, A. A., Neuberg, D. S., Staunton, J. et al. Gene expression signatures define novel oncogenic pathways in T cell acute lymphoblastic leukemia. Cancer Cell 2002; 1: 75–87.
43. Ferrando, A. A., Armstrong, S. A., Neuberg, D. S. et al. Gene expression signatures in MLL-rearranged T-lineage and B-precursor acute leukemias: dominance of HOX dysregulation. Blood 2003; 102: 262–8.
44. Hofmann, W. K., de Vos, S., Elashoff, D. et al. Relation between resistance of Philadelphia-chromosome-positive acute lymphoblastic leukaemia to the tyrosine kinase inhibitor STI571 and gene-expression profiles: a gene-expression study. Lancet 2002; 359: 481–6.
45. Hofmann, W. K., Komor, M., Hoelzer, D., and Ottmann, O. G. Mechanisms of resistance to STI571 (Imatinib) in Philadelphia-chromosome positive acute lymphoblastic leukemia. Leuk. Lymphoma 2004; 45: 655–60.
46. Cheok, M. H., Yang, W., Pui, C. H. et al. Treatment-specific changes in gene expression discriminate in vivo drug response in human leukemia cells. Nat. Genet. 2003; 34: 85–90.
47. Holleman, A., Cheok, M. H., den Boer, M. L. et al. Gene-expression patterns in drug-resistant acute lymphoblastic leukemia cells and response to treatment. N. Engl. J. Med. 2004; 351: 533–42.
48. Chiaretti, S., Li, X., Gentleman, R. et al. Gene expression profile of adult T-cell acute lymphocytic leukemia identifies distinct subsets of patients with different response to therapy and survival. Blood 2004; 103: 2771–8.
49. Ross, M. E., Zhou, X., Song, G. et al. Classification of pediatric acute lymphoblastic leukemia by gene expression profiling. Blood 2003; 102: 2951–9.
50. Staudt, L. M. Molecular diagnosis of the hematologic cancers. N. Engl. J. Med. 2003; 348: 1777–85.
6
Gene expression profiling can distinguish tumor subclasses of breast carcinomas
Ingrid A. Hedenfalk
Department of Oncology, Lund University, Sweden
Introduction
Breast cancer is known to be a heterogeneous and multifactorial disease, affecting approximately one in ten women in the Western world. While the majority of breast cancer patients respond to initial treatment (local and systemic), a large fraction of patients relapse over time and suffer through sometimes inefficient chemotherapeutic treatment regimens. A small number of biomarkers for targeted treatment exist, e.g., presence of the estrogen receptor (ER) is used to predict response to anti-estrogen treatment and patients whose tumors overexpress HER2/neu can be treated with Herceptin®. Nevertheless, little progress has been made when it comes to tailoring treatment and identifying novel therapeutic targets for the clinical management of breast cancer patients. With the advent of high-throughput genetic profiling techniques, simultaneous assessment of thousands of genes in single experiments is now possible, thereby augmenting the degree of complexity in genetic patterns that can be investigated, and increasing the likelihood of identifying potential therapeutic targets in primary breast carcinomas. Global gene expression profiling studies conducted over the last couple of years have shown that molecular profiling of breast cancers can be used to identify clinically and genetically significant subtypes of breast carcinomas [1–4] and subgroups of patients with different prognosis or disease outcome [5–7], and to predict therapeutic response to both endocrine and chemotherapeutic drugs [8–12]. Gene expression profiling may indeed become a general strategy for personalizing treatment choices and for predicting clinical outcome in individual patients in the not too distant future.
Gene Expression Profiling by Microarrays: Clinical Implications, ed. Wolf-Karsten Hofmann. Published by Cambridge University Press 2006. © Cambridge University Press 2006.
Distinguishing tumor subclasses of breast carcinomas
Gene expression profiles associated with breast cancer phenotypes
Breast cancer is a clinically, genetically and histopathologically heterogeneous disease. Numerous factors have been identified that are associated with patient survival and response to therapy, including age at diagnosis, tumor size, lymph node status, tumor histology, and cellular proliferation rate. In addition, several molecular markers have been found to correlate with patient prognosis, the most important being expression of the ER, with ER negativity associated with poor patient survival. None of these factors, however, alone or in combination, can predict the outcome for individual breast cancer patients with complete accuracy. As an example, the anti-estrogen tamoxifen is used to treat the approximately 75% of women whose breast tumors express the ER, but approximately half of these patients do not respond to this endocrine treatment [13]. Additional markers are needed that can predict prognosis and identify patients who will benefit from available treatments. Moreover, given the fairly poor overall survival rate for breast cancer patients, there is also a great need for the identification of novel therapeutic targets and for the development of tailored treatment regimens.
Sporadic breast cancer
Breast cancer has become one of the most studied human malignancies in the decade of microarray-based research. These mainly retrospective profiling studies have led to an increased understanding of the heterogeneity underlying breast cancer development and behavior. The early microarray studies were focused mainly on identifying gene expression profiles correlating with already known subclasses. Given the importance of the ER in breast cancer biology, the discovery of unique and separate gene expression profiles associated with ER positive and negative breast cancers, respectively, was not surprising [1, 3, 14]. Among the first microarray studies, the investigation of invasive ductal breast carcinomas conducted by Perou and Sorlie et al. largely separated tumors into those positive for ER and those negative for ER [1, 5]. In addition, based on a subset of expressed genes, these investigators also defined novel subgroups within the ER positive and negative subgroups, respectively, using a hierarchical clustering analysis approach (Fig. 6.1). The
I. A. Hedenfalk
[Figure 6.1, panels (a)–(g): hierarchical cluster diagram. Dendrogram labels: Basal-like, ERBB2+, Normal breast-like, Luminal subtype C, Luminal subtype B, Luminal subtype A.]
ER negative tumors were divided into three subgroups, one characterized by the expression of markers characteristic for basal epithelial cells, one characterized by overexpression of ERBB2, and one with a gene expression profile resembling normal breast tissue. Two subgroups within the ER positive breast cancers were also identified, both expressing markers of normal luminal epithelial cells. However, while one of the subgroups showed high levels of expression of ER itself and a number of associated genes, the second smaller group showed low to moderate expression of luminal specific genes, including the group of ER related genes, but was further defined by the expression of a group of genes with unknown function. These genes were also found to be similarly expressed in the basal-like and ERBB2-overexpressing ER negative subgroups. In general, it is believed that the majority of breast tumors arise from luminal epithelial cells, while tumors evolving from basal epithelial cells are believed to be less common. The suggestion that ductal breast carcinomas may be derived from two distinct cell types, basal or luminal, is intriguing, especially in the light of suggestions of the existence of a breast stem cell, and warrants further investigation [15, 16]. This potentially different origin of the tumor cells from basal vs. luminal epithelial cells was supported by immunohistochemical staining for cytokeratins (CKs) typical for these two cell types; CK 5/6 and 17 for basal-like tumors and CK 8/18 for luminal-like tumors [5]. While the clinical significance of these proposed novel subgroups remains an open question, the biological and clinical heterogeneity suggested illustrates the need for more targeted treatment regimens for
Fig. 6.1. Gene expression patterns of 85 experimental samples representing 78 carcinomas, three benign tumors, and four normal tissues, analyzed by hierarchical clustering using the 476 cDNA intrinsic clone set. (a) The tumor specimens were divided into five (or six) subtypes based on differences in gene expression. The cluster dendrogram showing the five (six) subtypes of tumors is colored as follows: luminal subtype A, dark blue; luminal subtype B, yellow; luminal subtype C, light blue; normal breast-like, green; basal-like, red; and ERBB2+, pink. (b) The full cluster diagram scaled down. The colored bars on the right represent the inserts presented in (c)–(g). (c) ERBB2 amplicon cluster. (d) Novel unknown cluster. (e) Basal epithelial cell-enriched cluster. (f) Normal breast-like cluster. (g) Luminal epithelial gene cluster containing ER. (Sorlie T. et al. (2001) Proc. Natl Acad. Sci. USA 98: 10869–74. Copyright © (2001) National Academy of Sciences USA.)
subgroups of breast cancer patients, but also demonstrates the potential for gene expression profiling in identifying these potentially significant subgroups. The actual number of molecular subgroups that exists in breast cancer remains unknown, but the existence of the above described, genetically distinct, subgroups has been corroborated in additional sets of gene expression data and by other investigators [17, 18]. Nevertheless, additional studies including breast carcinomas representing the whole spectrum of the disease are needed to accomplish a more refined and robust gene expression-based subclassification of breast cancer.
Hereditary breast cancer
Approximately 5–10% of all breast cancers are of hereditary origin, and two major breast cancer susceptibility genes have been identified to date, BRCA1 [19] and BRCA2 [20]. Although these high-penetrance syndromes account for a small proportion of all cancers, the identification of these genes and the investigation of their roles in breast cancer development and progression emphasize the genetic aspect of cancer in general, and provide a basis for extrapolating findings from “genetically defined” studies to the investigation of the more common forms of sporadic disease. While mutation screening in the two known breast cancer susceptibility genes for hereditary breast cancer families, allowing mutation carriers to make informed decisions regarding surveillance and/or prophylactic approaches, has become commonplace at oncogenetic clinics across the world, the techniques used for screening are time-consuming and expensive. Studies of the histopathological features, genomic alterations and hormone receptor levels in these tumors support the notion that breast cancers caused by germline mutations in BRCA1 and BRCA2 differ from each other and from tumors not caused by mutations in these genes at a molecular level. While certain characteristics, such as medullary histology and ER negativity, are more common among BRCA1 derived breast tumors, which make up a somewhat homogeneous group, BRCA2 derived breast tumors and, to an even greater extent, non-BRCA1/2 (BRCAx) breast tumors constitute considerably more heterogeneous groups. Therefore, an alternative means of classifying BRCA1, BRCA2 and non-BRCA1/2 associated tumors would greatly facilitate the identification of patients carrying mutations in these genes. Moreover, comprehensive understanding of the
underlying defects causing the development of hereditary breast cancers may greatly improve both treatment strategies and intervention options for the affected patients, and may give insights into breast cancer biology in general. The first investigation of gene expression profiles in BRCA1 and BRCA2 derived hereditary breast cancers illustrated that tumors derived from patients with BRCA1 mutations can be distinguished from those with BRCA2 mutations and sporadic breast cancers based on gene expression profiles [2]. The finding that BRCA1 and BRCA2 associated tumors display unique gene expression profiles could have clinical implications, in that it may become possible to perform gene expression profiling analyses based on a set of highly informative genes in order to determine if a potential hereditary breast cancer patient carries a BRCA1 or BRCA2 mutation. Closer examination of the identity of, and relationship between, the genes whose expression profiles differ between BRCA1 and BRCA2 associated breast cancers may also give insight into the fundamental basis of malignant transformation and tumor progression caused by mutations in these two genes. When a group of sporadic breast cancers was compared with the BRCA1 and BRCA2 derived tumors, a BRCA1-like gene expression profile was discovered in a tumor from a patient without a germline mutation in the gene. When this tumor biopsy was compared with other sporadic cancers, it was found to express the lowest level of the BRCA1 gene, and also displayed other clinical and pathological characteristics commonly seen in BRCA1 tumors. The BRCA1 promoter region was therefore analyzed for aberrant methylation, which is known to silence BRCA1 in sporadic breast tumors [21]. Upon investigation, in a blinded fashion, of all sporadic tumors included in the study, it was found that the sporadic tumor classified as a BRCA1-like tumor was the only one that showed hypermethylation of BRCA1 (Fig. 6.2), indicative of epigenetic inactivation of the gene [2]. Over the last couple of years it has become increasingly evident that epigenetic events such as promoter methylation can be important in tumorigenesis [22]. This finding therefore illustrates the use of global gene expression profiling for identifying such events in the absence of germline mutations. It also illustrates the high degree of sensitivity of transcriptional profiling, and demonstrates that defects in individual genes give rise to
[Figure 6.2: gel image, lanes 1–17. U and M lanes are shown for the patient samples, for normal lymphocyte DNA, and for in vitro methylated DNA; band sizes labeled 86 bp and 75 bp.]
Fig. 6.2.
Methylation analysis of the BRCA1 promoter in sporadic breast tumors. Methylation specific PCR was used to distinguish unmethylated from methylated alleles of BRCA1 on the basis of sequence changes produced following bisulfite treatment of DNA, which converts unmethylated, but not methylated, cytosines to uracil, and subsequent PCR by use of primers designed for either methylated or unmethylated DNA. The methylated product is 75 bp long, and the unmethylated product is 86 bp. DNA from normal lymphocytes (NL) was used as a negative control, and in vitro methylated DNA (IVD) was used as a positive control. U = unmethylated BRCA1. M = methylated BRCA1. (Hedenfalk, I. et al. (2001) N. Engl. J. Med. 344: 539–48. Copyright © (2001) Massachusetts Medical Society. All rights reserved.)
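The principle behind methylation-specific PCR described in this legend can be illustrated with a short Python sketch. It only models the sequence change: bisulfite converts unmethylated cytosines to uracil, and uracil is then copied as thymine during PCR. The sequence is a made-up toy example, not the BRCA1 promoter, and the function names are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Convert unmethylated C to U; methylated C (given by index) is left unchanged."""
    methylated = set(methylated_positions)
    converted = []
    for position, base in enumerate(sequence.upper()):
        if base == "C" and position not in methylated:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def after_pcr(converted_sequence):
    """During PCR, uracil in the template is copied as thymine."""
    return converted_sequence.replace("U", "T")


# Hypothetical toy sequence with three CpG sites and one non-CpG cytosine.
toy_sequence = "ACGTCTCGGACGA"
cpg_cytosines = [
    i for i in range(len(toy_sequence) - 1) if toy_sequence[i:i + 2] == "CG"
]

# Unmethylated allele: every cytosine is converted.
unmethylated_template = after_pcr(bisulfite_convert(toy_sequence, []))
# Methylated allele: CpG cytosines are protected, other cytosines are converted.
methylated_template = after_pcr(bisulfite_convert(toy_sequence, cpg_cytosines))

print("CpG cytosine positions:", cpg_cytosines)
print("unmethylated allele after bisulfite + PCR:", unmethylated_template)
print("methylated allele after bisulfite + PCR:  ", methylated_template)
```

After treatment the two alleles differ at every CpG site, which is why primers designed for one template do not amplify the other.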
unique and characteristic genetic signatures, an example of which is the important role played by the BRCA1 gene in breast tumorigenesis. In a later study of gene expression profiles from young breast cancer patients, tumors from 18 BRCA1 mutation carriers were included in the analysis, confirming a BRCA1 specific gene expression signature [6]. When these 18 tumors were analyzed, together with 97 sporadic samples in a study by Sorlie et al., all the BRCA1 derived tumors appeared within the basal subgroup upon clustering analysis [17]. Whereas the tumors with BRCA1 mutations in the initial study showed low levels of expression of ER, CK 8, and ERBB2 [2], an extended study using tissue microarrays containing a larger number of BRCA1 and other hereditary breast cancers could not support the idea unequivocally that mutations in the BRCA1 gene predispose for a basal tumor type, as approximately half of the tumors in the follow-up study stained positively for the luminal marker MUC1 [23].
Nevertheless, BRCA1 associated breast cancers share several features with basal tumors, and it will be interesting to investigate further whether BRCA1 associated tumors arise from a basal cell type, or differentiate into a basal-like phenotype, in the breast. Moreover, it has recently been shown that BRCA1 interacts with, and regulates, the activity of the ER, thereby providing a link between BRCA1 and estrogen-responsiveness in breast tumorigenesis [24, 25]. While mutations in BRCA1 and BRCA2 were proposed initially to be responsible for the majority of hereditary breast cancers, more recent population-based studies suggest that they account for a far smaller proportion than expected, and that considerable variation exists between different populations [26]. Presumably, these non-BRCA1/2 hereditary breast cancers may arise due to mutations in other high-penetrance genes [27], or perhaps due to low-penetrance alleles, e.g., CHEK2 [28]. As mentioned previously, these BRCAx cancers appear to comprise a histologically and clinically heterogeneous group, indicating the presence of multiple underlying mechanisms or genetic alterations. The heterogeneity of cancer-predisposing defects in BRCAx families has limited the power of traditional genetic linkage analysis. Consequently, with no current means to substratify the BRCAx families with mutations in a common gene, the search for additional breast cancer predisposing genes has been confounded. To this end, microarray-based gene expression profiling of this heterogeneous group of familial breast cancers has shown promising results in discovering novel subgroups, separate from BRCA1 and BRCA2 associated hereditary breast cancers, and with distinctive gene expression profiles (Fig. 6.3) [4].
In addition to the distinct transcriptional differences between the two novel subgroups identified, copy number analysis of genomic DNA from the same group of tumors using microarray-based comparative genomic hybridization (CGH) revealed that these subgroups were each associated with specific somatic genetic alterations, lending further support to the notion that there are multiple distinct subclasses of BRCAx tumors [4]. These findings illustrate that gene expression-based profiling can be used to identify distinct and homogeneous subclasses within the familial non-BRCA1/2 breast cancer families, and that microarray-based CGH can further identify distinct chromosomal aberrations within these subgroups. This study lends support to the idea that gene expression profiling, combined with global genomic
Fig. 6.3. Gene expression-based class discovery of BRCAx breast cancers. (a) Based on 16 BRCAx tumors, the most significant separation into two classes resulted in classes with seven (group A, yellow) and nine (group B, blue) samples, respectively. Sixty statistically significant genes (P < 0.001), which were found to separate the groups, are listed. Expression levels for each gene are normalized across the samples such that the mean is 0 and the variance is 1. Expression levels greater than the mean are pseudo-colored red, and those below are pseudo-colored green. The scale indicates SDs above or below the mean. (b) The number of genes separating BRCAx cancers into two subgroups (dotted line) is plotted as a function of the signal-to-noise weight. The bars (1 SD) show the number of genes expected by chance. There is a clear overabundance of genes separating the BRCAx subgroups. (c and d) Based on the 60 genes that best separated BRCAx tumors into two groups, multidimensional scaling analysis and hierarchical clustering of the 16 samples together with BRCA1 (grey) and BRCA2 (purple) tumors are shown. The BRCAx subgroups were separated from one another as well as from the BRCA1 and BRCA2 tumors, reflecting the difference between BRCAx and BRCA1/2 tumors. (Hedenfalk I. et al. (2003) Proc. Natl Acad. Sci. USA 100: 2532–7. Copyright © (2003) National Academy of Sciences USA.)
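The per-gene normalization mentioned in this legend (mean 0 and variance 1 across the samples, with values above the mean shown in red and values below in green) can be sketched in a few lines of Python. This is an illustration only: the expression values are invented, the function names are hypothetical, and the use of the population variance is an assumption, since the legend does not say which variance estimator was used.

```python
import statistics


def standardize_gene(expression_values):
    """Scale one gene's values across samples to mean 0 and variance 1.

    Assumption: the population standard deviation is used for scaling.
    """
    mean = statistics.mean(expression_values)
    sd = statistics.pstdev(expression_values)
    if sd == 0:
        raise ValueError("constant expression cannot be scaled to unit variance")
    return [(value - mean) / sd for value in expression_values]


def pseudo_color(z_score):
    """Map a standardized value to the color convention used in the legend."""
    if z_score > 0:
        return "red"
    if z_score < 0:
        return "green"
    return "neutral"


# Invented expression values for one gene across eight samples.
toy_gene = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z_scores = standardize_gene(toy_gene)

for raw, z in zip(toy_gene, z_scores):
    print(f"raw={raw:4.1f}  SDs from mean={z:+.2f}  color={pseudo_color(z)}")
```

Standardizing each gene separately puts all genes on the same scale of SDs from the mean, so that highly and weakly expressed genes can be compared in a single color-coded display.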
profiling and conventional genetic linkage analyses, could dramatically increase the power to identify novel predisposition genes within homogeneous subsets of families.
Correlation between transcriptional profiles and histological types
There are two major histological types of breast cancer: invasive ductal carcinoma (IDC) and invasive lobular carcinoma (ILC), accounting for approximately 80% and 15% of all invasive breast cancers, respectively. IDC, the incidence of which is fairly stable, is by far the most common histological type. ILC, on the other hand, is the most rapidly increasing breast cancer phenotype in the United States and Western Europe, and is also more difficult to diagnose. Importantly, ILCs have been found to respond poorly to neoadjuvant chemotherapy compared with IDCs, and pathological complete response to neoadjuvant chemotherapy, which is often used as a surrogate marker for survival, may have no prognostic significance in ILCs [29]. Other than loss of E-cadherin expression in ILCs, it is not clear whether the histological types of IDC and ILC correlate with distinct molecular features, and, if so, which genes might be involved in the development of these two histologies. In order to address this issue, Zhao et al. recently analyzed 21 ILCs and 38 IDCs using 42K cDNA microarrays, and found that they could subdivide the lobular carcinomas into two distinct subgroups [30]. While approximately half of the ILCs (termed “typical lobular”) displayed a distinct gene expression profile from the ductal carcinomas, the other subgroup of ILCs (termed “ductal-like”) shared similar transcriptional profiles with the IDCs. These findings suggest that molecular differences exist between the two histological subtypes of breast cancer, but further studies are needed to determine whether they require different therapeutic strategies.
Detecting metastatic potential and tumor aggressiveness
The occurrence of distant metastases is the primary cause of death in cancer patients, including breast cancer patients. It is generally believed that the ability to metastasize is acquired late in tumor progression in a small number of cells within the primary tumor.
A much-cited paper by Ramaswamy and colleagues describing gene expression profiling of primary
I. A. Hedenfalk
epithelial tumors and unmatched metastases reported a small proportion of the primary tumors expressing a 17-gene ‘‘metastasis-like’’ gene expression signature associated with metastatic potential and poor survival [31]. The identification of this ‘‘metastasis signature’’ in common between the metastases and the primary tumors led the authors to hypothesize that metastatic potential may be present in a subset of cells within a tumor early in tumor progression, contrary to the classical model where metastatic potential is believed to be acquired within subsets of cells late in tumorigenesis. In support of this hypothesis, a number of other studies linking gene expression profiles to clinical outcome in breast cancer have suggested that the potential for distant metastasis and overall survival probability may be attributable to biological characteristics of the primary tumor at the time of diagnosis [6, 32, 33]. While the level of sensitivity of microarrays would require a fairly large proportion of the cells within a tumor biopsy to express any given gene signature (such as the 17-gene ‘‘metastasis signature’’), as changes occurring in a rare population of cells would not likely be detectable, one must keep in mind that bulk tissue biopsies have been studied. The findings reported in the above studies may alternatively suggest that some of these changes that potentiate metastasis may constitute early changes that promote non-invasive growth, or, the expression signature generated may represent an average of genes expressed in all the cells within the biopsy, while not each individual cell necessarily has to express all of these genes. Finally, a recent report has suggested that metastasis to a particular tissue or organ is mediated by specific sets of genes [34], possibly enabling researchers to develop targeted intervention strategies. 
While these data, based on small patient cohorts, only represent the first steps towards genomic understanding of the complex biology of metastasis, the clinical implications of these findings are both significant and apparent.

Prediction of clinical outcome

Among the biomarkers used to predict prognosis in breast cancer are lymph node status, tumor size, histological type, nuclear grade, and ER and PR status. While these factors are useful for identifying subgroups of patients with different disease outcomes, they poorly predict individual outcomes. In addition to these prognostic factors, predictive factors that predict response
Distinguishing tumor subclasses of breast carcinomas
to certain therapies include ER, PR and HER-2/ERBB2. However, very few tools are currently available to identify patients at risk of disease recurrence or to predict overall survival. Currently, many breast cancer patients undergo adjuvant systemic treatment, despite very low risk of developing a recurrence, while not all high-risk patients are identified and offered potentially curative adjuvant treatment. There is a critical need to identify low-risk patients to spare them unnecessary over-treatment of their disease, while also identifying high-risk patients and developing accurate predictive markers for the development of rational, individualized treatment options. Recent developments in the application of microarray technologies to the investigation of breast tumor specimens suggest that the global approach of these new techniques may generate clinically useful diagnostic and prognostic markers/profiles to more accurately predict long-term outcome of individual patients. The five subgroups of breast cancer originally identified by hierarchical clustering by Sorlie and colleagues [5] have been correlated with clinical outcome of the patient in an extended study of two independent data sets [17].

Prognosis
It has been suggested that classifiers based on gene expression profiles would probably compare favorably with other clinically validated predictive and prognostic markers. Recently, several groups have reported prognostic gene expression profiles in breast cancer, suggesting the emergence of a new, genomic dimension of breast cancer management [5–7, 17, 18, 35–40]. These studies have, however, mainly been small, retrospective studies not representing the whole spectrum of breast cancer heterogeneity. Nevertheless, the results from these and other studies on a wide spectrum of malignancies are promising and suggest that a refined classification system based on molecular signatures may indeed become a valuable tool in the clinical setting within the not too distant future. In a follow-up study to the first papers that identified subtypes among sporadic breast cancers and correlated them to differences in gene expression profiles [1], the same group examined whether these phenotypic profiles correlated with disease outcome and could function as prognostic markers [5]. In addition to further separation of the luminal, ER positive group, into at least two subtypes, a survival analysis among a subset of
patients showed a considerably better prognosis for the luminal-like subgroup compared to a poorer prognosis for the basal-like subgroup. A subsequent study of two independent data sets further refined the prognostic tool such that differences could be seen among the two luminal subgroups [17]; the luminal B group showed a much worse prognosis than the luminal A group (Fig. 6.4). As the patients with ER positive tumors included in the survival analysis received tamoxifen treatment (i.e., patients within both luminal-like groups), these findings suggest that an underlying difference between the two subgroups may be responsiveness to tamoxifen treatment, as patients within luminal group A may respond better to antiestrogen treatment. These early studies were promising in terms of correlating distinct gene expression profiles to significant differences in clinical outcome. Nevertheless, the independent prognostic value of the molecular classification remains to be elucidated, and the number of molecular subtypes may in fact increase as larger sample sets become available for analysis. An additional study addressing the basal vs. luminal phenotype aspect was recently published. CGH analysis of 43 grade III invasive ductal breast carcinomas positive for the basal cytokeratin CK 14, as well as 43 grade and age-matched CK 14 negative controls, all with a median follow-up of 7 years, revealed significant differences in mean number of changes as well as types of alterations between the two groups [41]. Nevertheless, both supervised and unsupervised algorithms separated the tumors into two groups with one group containing 42% of the CK 14 positive basal tumors, and with a significantly shorter overall survival. These data suggest that the basal phenotype on its own does not convey a poor prognosis; rather, basal tumors are also heterogeneous with only a subset displaying a poor clinical outcome [41]. 
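The survival comparisons described above (for example, the curves in Fig. 6.4) rest on the Kaplan–Meier estimator. The following is a minimal sketch of that estimator in plain Python, intended only to make the calculation concrete. The function name `kaplan_meier` and the follow-up data are hypothetical illustrations and are not taken from any of the cited studies.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) steps of a Kaplan-Meier estimate.

    times  -- follow-up time for each patient (e.g. months)
    events -- True if the event (death, distant metastasis) was observed,
              False if the patient was censored at that time
    """
    data = sorted(zip(times, events))  # order patients by follow-up time
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events at time t
        leaving = 0   # events plus censored patients at time t
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data in months (illustrative only).
times = [6, 12, 12, 20, 30, 42]
events = [True, True, False, True, False, True]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t}: estimated survival {s:.3f}")
```

Censored patients reduce the number at risk without lowering the curve, which is why censoring marks appear on such plots without a downward step. Comparing two subgroups, as in the P values shown in Fig. 6.4, would additionally require a test such as the log-rank test, which is not sketched here.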
The identification of subclasses of breast tumors as basal-like and luminal-like, and the finding that they differ in their clinical outcome, suggests a need for determination of the patterns of gene expression in normal basal and luminal cells of the breast. The generation of cell-type specific expression profiles in the normal breast may provide a baseline for studies investigating breast cancer progression and outcome prediction, as well as metastasis. Jones et al. therefore performed gene expression profiling analyses of purified normal luminal and myoepithelial cells in the breast. The panel of differentially expressed genes was tested subsequently against a breast cancer tissue microarray, whereupon the investigators could identify
[Figure 6.4: Kaplan–Meier plots. Panel (a), van’t Veer data set: probability against time to distant metastasis (months), P < 0.01. Panel (b), Norway/Stanford data set: probability against overall survival (months), P < 0.01. Curves: Luminal A, Luminal B, Basal, ERBB2+; censored observations are marked.]
Fig. 6.4. Overall survival analysis of breast cancer patients stratified by gene expression-based subtypes. Kaplan–Meier curves are shown for the 72 patients with locally advanced breast cancer who were treated homogeneously in two prospective trials. (a) Time to development of distant metastasis in the 97 sporadic cases from van’t Veer et al. [6]. Patients were stratified according to the subtypes. (b) Overall survival for 72 patients with locally advanced breast cancer in the Norway/Stanford cohort [5]. (Sorlie T. et al. (2003) Proc. Natl Acad. Sci. USA 100: 8418–23. Copyright © (2003) National Academy of Sciences USA.)
independent prognostic information for breast cancer patients [39]. Extrapolation of these findings to the generation of normal cell type specific gene lists may provide novel diagnostic and prognostic markers, and may also contribute to the understanding of the multistep progression from normal epithelial cells to invasive cancer in the breast. van’t Veer and colleagues identified a 70-gene ‘‘prognostic classifier’’ in young (<53 years) lymph node negative patients using a training set of 78 tumors and a validation set of 19 additional tumors [6]. Good vs. poor prognosis patients were selected based on 5 years of clinical follow-up; the optimal classifier correctly predicted outcome for 83% of the patients, with a sensitivity of 85% for poor prognosis and specificity of 81% for good prognosis. In an extension of this study, van de Vijver et al. studied 234 additional young patients (<53 years) with stage I–II breast cancer, with or without lymph node involvement using the same 70-gene classifier [7]. The authors reported that their molecular classifier predicted patient outcome with a sensitivity of 93% and specificity of 53%, and have suggested that the 70-gene classifier outperforms currently used St. Gallen (in Europe) [42] and NIH (in the US) [43] consensus criteria. Unfortunately, however, these findings have not been corroborated in other, more recent studies; the reasons for this may be numerous. A study re-analyzing the patient data from the above studies showed that the Nottingham prognostic index (which is based on tumor size, lymph node status and histological grade) [44] predicts prognosis with the same accuracy as the 70-gene predictor [45]. Ideally, ‘‘conventional’’ clinical markers should be combined with the new molecular markers to develop optimized prognostic tools. Aiming at predicting lymph node status and relapse for breast cancer patients, Huang et al. 
constructed meta-genes (patterns of gene expression) that were associated with the clinical parameters and that were capable of predicting outcome in individual patients with about 90% accuracy [36]. The meta-genes defined distinct groups of genes, suggesting different biological processes underlying these two features of breast cancer. Subsequently, these meta-genes were combined with clinical parameters into a ‘‘clinico-genomic’’ model to achieve a prognostic index used for personalized prediction of disease outcome [46]. Such a prognostic index associated with lymph node status shows promise for the accurate and individualized prognostication of breast cancer patients. Further, in a small study of 55 breast cancer
patients, Ahr et al. identified patients with a high risk of disease recurrence both within a lymph node positive subgroup (11/22) and a subgroup classified as N0 (3/5) [35], emphasizing the need for a refined prognostic tool for the assessment of disease recurrence. In a population-based study (46 node negative and 53 node positive patients, the majority of whom received adjuvant treatment after surgery), Sotiriou et al. did not, however, find a strong association between gene expression profiles and lymph node metastases [18]. The authors speculated that this may be due to a ‘‘dilution’’ of the metastatic signature derived from a small fraction of cells with metastatic potential by signals from non-metastasizing cells. These partly conflicting results imply differences in sensitivity between techniques used and/or selection bias of patients included in the studies, and emphasize the need for large, prospective studies designed to address specific questions, and the need for transparency to facilitate comparison across technology platforms and patient populations. Recently, Chang et al. took an interesting approach when they applied a previously described ‘‘wound response signature’’ identified as the response of normal fibroblasts to serum, to predict distant metastasis free and overall survival in breast cancer patients [47]. In addition to demonstrating how hypothesis-driven gene expression profiles generated from in vitro experiments can provide biologically and clinically significant insights into gene expression profiles from clinical tumor specimens, the ‘‘wound response signature’’ improved risk stratification independently of known clinical and pathological risk factors. 
In addition, integration of the previously described ‘‘intrinsic gene set’’ [5] and the 70-gene ‘‘prognostic signature’’ [6] with the ‘‘wound-response signature’’ [48], despite little overlap in actual genes represented, gave compatible and generally consistent predictions of outcome and, where the signatures diverged, the combined information improved the risk stratification. Tumors defined as basal-like expressed the 70-gene ‘‘poor prognosis signature’’ as well as the activated ‘‘wound response signature,’’ suggesting that these breast tumors represent a distinct disease entity with an aggressive clinical course [47]. Prospective studies are necessary to determine whether the suggested approach might benefit breast cancer patients in clinical decision making. Given the apparently great disparities between ER positive and negative breast cancer in terms of clinical outcome, biological features, and, as has
become recently evident, molecular signatures, it is becoming apparent that predictors of prognosis must be established separately for these two groups of breast cancers. With this in mind, Nagahata et al. aimed to identify molecular markers for classifying ER negative breast cancers with respect to postoperative prognosis [37]. Upon identification of genes differentially expressed among ten patients who died within 5 years after surgery and ten matched patients who survived disease free for more than 5 years, they devised a scoring system for predicting postoperative prognosis for ER negative breast cancers on the basis of aberrant gene expression patterns. While very small, this study represents a step in the right direction and illustrates the need for stratification based on, e.g., ER status, of patients prior to prognosis prediction. Onda and colleagues similarly constructed a prognostic index based on the differential expression of ten genes between patients who died within 5 years after surgery and those who survived disease free for more than 5 years [38]. The authors identified a small set of genes from global gene expression profiling analyses and developed a prognostic index based on real-time polymerase chain reaction (RT-PCR) measurements of these genes. This approach to predict postoperative prognosis for patients with primary breast cancer is clearly attractive in a clinical setting, if it can be developed into a reliable test upon validation in multi-centre prospective studies. The largest study to date aimed at identifying signature gene lists with the potential for prediction of clinical outcome in breast cancer was recently reported by Wang and colleagues [40]. In this study, 286 lymph node negative patients (median age 52 years) who did not receive adjuvant systemic treatment were included. 
In a training set of 115 tumors, a 76-gene signature was identified that could subsequently predict distant metastasis free survival in an independent test set of 171 lymph node negative patients. Importantly, and in contrast to, e.g., the two Dutch studies [6, 7], Wang et al. considered ER positive and ER negative tumors separately, generating gene lists taking into account the ER status when classifying clinical outcome for lymph node negative breast cancer patients. This is likely to improve the specificity of a gene-based predictor, given the presumed inter-connection between ER status, gene expression profile and outcome status [49]. Moreover, there was no overlap in genes between the ER positive and ER negative groups, despite some genes being involved in
similar cellular and molecular processes; this finding also supports the idea that the underlying mechanism/s for disease progression differ depending on ER status. While Wang and coauthors, in concordance with van’t Veer and colleagues, suggest an improvement in the risk assessment of breast cancer patients with the use of gene expression profiles compared to conventional criteria commonly used in Europe (St. Gallen, [42]) and the USA (NIH, [43]), the overlap in genes involved in the two predictors is minimal. Differences in techniques used and, maybe more importantly, patients included may account for a certain degree of these discrepancies; this demonstrates the need for transparency in terms of already published results as well as the implementation of well-designed, large, prospective studies aimed at answering very specific questions in terms of outcome prediction. While gene expression-based techniques lend promise to the improvement of individual patient care, these studies also illustrate the heterogeneity of the disease and that we have not yet reached a point where these applications can be readily incorporated into clinical practice.

Response to therapy
As mentioned, although ER status is predictive of response to hormonal treatment, the predictive value is not 100%. Moreover, there are, to date, no clinically useful predictive markers for a patient’s response to chemotherapy. In addition, all patients eligible for chemotherapy receive the same treatment although the average expected benefit is low; some patients fail to respond and suffer unnecessary toxicity, while others would benefit from more severe treatment. Hence, selection of patients most likely to benefit from adjuvant systemic therapy would be a great advance in the clinical management of breast cancer. The assessment of one or a few individual markers has not been shown to be powerful enough to reveal the complex biology of clinical breast cancer and response to therapy. However, patterns of expression of larger numbers of genes could be successful in distinguishing between sensitive and resistant tumors.

Tamoxifen
Tamoxifen significantly reduces tumor recurrence in a subset of ER positive, early stage breast cancer patients, but markers predictive of treatment success have to be identified. Recently, an attempt was made at identifying
a gene expression-based predictor of response to tamoxifen in breast cancer patients, when Ma et al. performed a global microarray analysis of ER positive invasive breast tumors from 60 patients treated with tamoxifen [9]. Having first identified 19 differentially expressed genes between recurrence cases (46% of patients who developed distant metastases with a median time to recurrence of 4 years) and matched non-recurrence cases (54% of patients who remained disease free with a median follow-up of 10 years), based on the 60 grossly dissected fresh frozen tissue specimens, the investigation was extended to include the re-analysis of the same cohort of patients following laser-capture microdissection (LCM). The findings from the combined analysis led to the identification of a two-gene expression ratio predictive of clinical outcome among these patients. One of the genes, HOXB13, was overexpressed in tamoxifen recurrence cases, and the other, IL17BR, was overexpressed in non-recurrence cases. This two-gene ratio had a stronger correlation with treatment outcome than either gene alone in both cohorts, and, furthermore, was validated by RT-PCR using formalin-fixed paraffin-embedded tissue sections corresponding to the same patient cohort. A significant benefit of using a simple two-gene test is the possibility of applying the test to routine clinical specimens, which are mostly formalin-fixed and paraffin-embedded, as opposed to the fresh frozen tissue needed for microarrays. The two-gene ratio, however, failed to predict outcome in patients not receiving adjuvant treatment included in a previously published study, suggesting that the genes correlating with clinical outcome in an adjuvant tamoxifen setting most likely differ from those associated with the diverse populations of breast cancers in other studies [9].
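To make the idea of a two-gene expression ratio concrete, the sketch below computes a log2 ratio from two expression measurements and compares it with a cut-off. This is an illustration under stated assumptions, not the method of Ma et al.: the function names, the cut-off of 0.0, and the expression values are all hypothetical, and the text above does not specify how the published ratio was normalized or thresholded.

```python
import math


def two_gene_log_ratio(expr_up: float, expr_down: float) -> float:
    """log2 ratio of two positive expression measurements.

    expr_up   -- gene expected to be higher in recurrence cases (HOXB13 in the text)
    expr_down -- gene expected to be higher in non-recurrence cases (IL17BR in the text)
    """
    if expr_up <= 0 or expr_down <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(expr_up / expr_down)


def classify(log_ratio: float, cutoff: float = 0.0) -> str:
    """Assign a risk label; the default cut-off is a hypothetical placeholder."""
    return "higher recurrence risk" if log_ratio > cutoff else "lower recurrence risk"


# Hypothetical expression values in arbitrary units (illustrative only).
ratio_a = two_gene_log_ratio(8.0, 2.0)  # first gene dominant
ratio_b = two_gene_log_ratio(1.0, 4.0)  # second gene dominant
print(ratio_a, classify(ratio_a))
print(ratio_b, classify(ratio_b))
```

Because the two genes move in opposite directions between the outcome groups, their ratio separates the groups more sharply than either measurement alone, which is consistent with the stronger correlation reported above. In practice any cut-off would have to be derived from, and validated on, independent patient cohorts.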
The finding that the expression ratio of two genes can predict recurrence accurately in patients with early-stage ER positive tumors treated with adjuvant tamoxifen may be helpful in identifying patients in need of more aggressive treatment and, due to its simplicity, may also be accessible in the clinical setting, but is currently limited by the size of the patient cohort and will hence require confirmation in large, population-based patient cohorts. Whether this two-gene predictor reflects the response of the tumors to tamoxifen or its inherent aggressiveness (or both) also remains to be established. One of the first prospective studies designed to validate a gene expression signature assay correlating with response to hormonal treatment in a large
multicentre clinical trial was recently published. Using a prospectively designed assay of 21 genes to calculate recurrence scores, Paik et al. were able to quantify the likelihood of distant recurrence in patients with node-negative, ER positive breast cancer who had been treated with tamoxifen [10]. The levels of expression of 16 cancer-related genes and five reference genes were used in a prospectively defined algorithm to calculate a recurrence score and to determine a risk group (low, intermediate, or high) for each patient. 668 patients were subsequently categorized into low (51%), intermediate (22%) and high risk (27%) groups based on an RT-PCR assay of the 21 genes in paraffin-embedded tumor tissue. A Kaplan–Meier analysis then estimated the rate of distant recurrence at 10 years in the low, intermediate and high-risk groups at 6.8%, 14.3% and 30.5%, respectively. The difference in the risk of recurrence between the low- and high-risk groups was large and statistically significant. The estimated risk of recurrence in the high-risk group was similar to what has been reported among breast cancer patients with node positive disease, illustrating the need for a finer and more quantitative prognostic tool for the separation of breast cancer patients. The recurrence score could also be used to predict overall survival in this study; a noteworthy finding, since approximately 50% of the deaths occurred in the absence of recurrent breast cancer [10]. Additional biomarkers are needed to identify patients who may benefit from adjuvant endocrine therapy. In a study of ER positive breast carcinomas, Jansen et al. identified a gene expression profile predictive of response to tamoxifen [12]. Using a training set of 46 breast tumors, 81 genes were identified as differentially expressed between tamoxifen responsive and resistant tumors. The predictive signature was then reduced to 44 genes, and was validated in a set of 66 tumors.
In the sample set described, the predictive signature compared favorably with, and contributed independently to, traditional prognostic factors. Not surprisingly, genes involved in estrogen action were found among the differentially expressed genes, possibly contributing to the underlying differences in response to tamoxifen. No overlap in genes was, however, found with the 21-gene signature previously described in ER-positive tumors from patients who had received adjuvant tamoxifen [10], again highlighting the difficulty in comparing, and further, either corroborating or disproving microarray-based expression signatures in human cancers. If confirmed, however, these tests
may prove helpful in identifying patients who may benefit from more complete ablation of estrogen signalling using, e.g., aromatase inhibitors, or from the addition of chemotherapeutic agents.

Taxanes
A number of profiling studies have emerged over the last couple of years aimed at predicting the effectiveness of adjuvant chemotherapeutic treatment/s. Though encouraging, larger studies, designed to address these questions specifically, are needed to determine the clinical usefulness of gene expression profiling in predicting response to any given treatment. In one of the first studies, Chang and colleagues investigated how gene expression profiling can be used for the prediction of therapeutic response to the taxane docetaxel in a neoadjuvant setting in patients with breast cancer [8], and further, if the occurrence of resistance could be predicted by specific gene expression patterns [50]. Core biopsies from 24 patients, approximately half of which were defined as ‘‘sensitive’’ (<25% residual disease) and the other half as ‘‘resistant’’ (>25% residual disease), after neoadjuvant treatment with docetaxel were analyzed. 92 genes were determined as the most significantly differentially expressed between the two groups (P = 0.001), and were found to be involved in a variety of functional classes, possibly indicating the underlying cause/s of response to the treatment. In general, resistant tumors overexpressed genes associated with the cell cycle, protein translation, and RNA transcription, whereas sensitive tumors overexpressed genes involved in stress and apoptosis, cell adhesion and cytoskeleton, protein transport, signal transduction, and RNA splicing or transport. Leave-one-out cross-validation and validation in a small (six patients) sample set showed a high degree of accuracy in classifying the tumors with the predictive model [8]. In the follow-up study based on the same patients, the authors attempted to identify gene expression patterns that might predict suboptimal response and resistance to docetaxel.
Gene expression patterns were examined after 3 months of neoadjuvant docetaxel therapy in LCM and non-LCM surgical specimens of residual breast cancers [50]. In contrast to the pretreatment biopsies mentioned above, no statistically significant differentially expressed genes were found (in either LCM or non-LCM biopsies); i.e., the gene expression profiles of tumors that were initially either sensitive or resistant were similar after 3 months of treatment.
This finding could possibly be explained by the residual tumors in the sensitive group displaying incomplete response to therapy and inherent resistance, or by changes caused by the treatment itself. Clonal selection of a resistant phenotype may also help explain these results. Nevertheless, further investigation of the differentially expressed genes in the initial pretreatment biopsies may give clues to additional, important therapeutic targets. It is also worth noting that, although the magnitude of changes between pretreatment and LCM or non-LCM biopsies differed, the significantly differentially expressed genes showed the same expression patterns regardless of whether microdissection was performed or not. Iwao-Koizumi et al. similarly attempted to predict the clinical response in 70 breast cancer patients to neoadjuvant docetaxel by gene expression profiling, using a high-throughput RT-PCR technique [11]. Although the definition of responders and non-responders, as well as the methodological platform, differed from the above-mentioned studies, a set of diagnostic genes capable of predicting the clinical response to docetaxel was correspondingly identified and validated. These authors also found that tubulin was overexpressed in non-responders, which may contribute to resistance to the anti-microtubule agent docetaxel. Further, non-responders in this study were characterized by elevated expression of genes controlling the cellular redox environment; overexpression of these redox genes protected cultured mammary tumor cells from docetaxel-induced cell death, indicating a potential novel target for therapeutic intervention [11]. We are just at the first step of understanding how to utilize the powerful high-throughput global profiling techniques for the benefit of oncology patients. As discussed below, few, if any, gene expression-based assays have been validated thoroughly for use as prognostic or predictive tests in the treatment of cancer patients. 
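Leave-one-out cross-validation, mentioned above in connection with the docetaxel response predictor [8], can be sketched as follows. This is a minimal illustration in plain Python using a nearest-centroid classifier, which is an assumption made for brevity and not necessarily the classifier used in the cited study. The two-gene expression values and the labels are hypothetical.

```python
from typing import List, Sequence


def centroid(rows: Sequence[Sequence[float]]) -> List[float]:
    """Mean expression profile of a group of samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train_x, train_y, sample) -> str:
    """Assign the label whose class centroid lies closest to the sample."""
    best_label, best_dist = None, None
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        d = sq_dist(centroid(members), sample)
        if best_dist is None or d < best_dist:
            best_label, best_dist = label, d
    return best_label


def loocv_accuracy(xs, ys) -> float:
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        # Any gene selection step would also have to be repeated here,
        # inside the loop, using the training samples only.
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical expression values for two genes in six tumors (illustrative only).
xs = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.2], [4.0, 1.0], [4.2, 1.1], [3.9, 0.8]]
ys = ["sensitive"] * 3 + ["resistant"] * 3
accuracy = loocv_accuracy(xs, ys)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

The comment inside the loop marks the step that matters most in practice: if discriminating genes are chosen using all samples before cross-validation begins, the held-out sample has already influenced the model and the estimated accuracy is optimistic. Validation in an independent sample set remains the stronger test.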
Integration of data from other genomic technologies

Other than microarrays for gene expression profiling, several additional genomic technologies have emerged that may significantly contribute to the improvement of cancer diagnosis and prognosis and for the development of new drugs, some of which have been touched upon in the above text. Somatic genetic changes associated with specific features may radically
influence a tumor’s phenotype, and DNA may be more readily extracted from formalin-fixed paraffin-embedded tissue available from most pathology departments. Different microarray-based CGH platforms exist (using cDNAs, oligonucleotides or BACs), and have been proven useful tools in furthering our understanding of breast tumor biology [4, 41, 51, 52]. Correspondingly, global methods for studying epigenetic events such as methylation patterns have been developed, and hold promise for, e.g., biomarker evaluation (for review see [53]). Furthermore, large-scale efforts to identify single-nucleotide polymorphisms (SNPs) associated with differences in tumor growth and metastasis, as well as drug metabolism and response to therapy, are under way and will complement global gene expression and genomic profiling. Finally, various protein expression profiling methods have been developed that have tremendous potential in detecting variations in proteins expressed by cells/tumors. The identification of protein profiles in individual tumors may be used in a similar manner to microarray-derived gene expression profiles for classification purposes, as exemplified recently by Jacquemier et al., who used immunohistochemistry on tissue microarrays to identify subclasses of breast cancer and to predict prognosis [54]. While some proteomics methods in use today cannot readily determine the identity of the expressed proteins, due to their excellent degree of sensitivity these techniques hold great promise for future assays for, e.g., the identification of serum markers for the detection of druggable targets or residual disease.

Microarrays in clinical practice

It has become evident that microarrays can be used effectively to begin to penetrate the heterogeneity and genetic complexity of breast cancer.
Combined with other genomic technologies, we are at a stage where these high-throughput techniques can reveal significant information about tumor biology at a level not previously possible, including clinical behavior and response to therapy. In addition to increasing our knowledge of breast cancer biology and behavior, the development of new diagnostic tools for the improved management of breast cancer patients is undoubtedly an attractive prospect. Given the high degree of variability of certain standard diagnostic tests between institutions, and widespread inter-observer
variation in the pathological classification of breast cancer subtypes, a single, standardized assay capable of replacing the conventional methods and/or providing more accurate information would certainly represent an appealing improvement in the clinical management of individual patients. Given the promising results of the many studies on gene expression-based prediction of breast cancer outcome and response to various treatments, microarray diagnostics are indeed moving into clinical patient management to guide decisions regarding, e.g., adjuvant therapy following surgery [55]. The ultimate value of a prognostic predictor is to be able, with greater accuracy, to identify patients who need further adjuvant treatment without over-treating those who are cured by surgery alone. The first studies in which sets of genes isolated from microarray profiles have been used to generate predictors for clinical outcome show promising results, and these predictors have been entered into clinical trials. The collaborative effort between the Netherlands Cancer Institute and Rosetta Inpharmatics resulting in the 70-gene ‘‘prognostic indicator’’ discussed above [6, 7] has led to the development and marketing of a diagnostic kit named MammaPrint®. More recently, the Oncotype DX™ assay, a PCR-based assay of 21 genes used to determine a breast cancer patient’s risk of developing metastases [10], was launched by researchers at the company Genomic Health. While these two examples represent a great step forward in the individualization and optimization of treatment selection for breast cancer patients, it may be questioned whether these predictors have been investigated in enough detail to be applicable to diverse patient populations, and compared to conventional prognostic markers [45]; Oncotype DX™ was developed for lymph-node negative patients with ER positive tumors, and MammaPrint® was established based on patients with small (T1–T2) tumors, also without nodal involvement.
It is possible therefore, at this point, that the gene expression-based prognosis predictors currently available may be improved further, and that it may be too soon to settle for the presently available options. The first prospective, randomized, clinical trial to test the efficacy of a prognostic microarray-based signature – the 70-gene “prognostic indicator” [6, 7] – is in progress. Reflective of the speed at which the field of microarray-based clinical research changes, the design of the trial, Microarray for Node-Negative Disease may Avoid Chemotherapy (MINDACT), has been altered substantially since its original conception. The first phase of the trial
I. A. Hedenfalk
was an independent validation of the gene signature, and preliminary results from this phase have been reported, e.g., at scientific meetings. Patient samples from six different institutions in four countries were analyzed, and although the results were statistically significant, the effect was far weaker than that reported in the original publication [56]. This could be due to differences in the patient populations tested, but also to possible over-fitting of the data in the original studies, where some of the same samples were used in both the discovery and validation phases. Instead of assigning all patients randomly to the conventional arm or the microarray arm and thereafter determining whether they belong to the high- or low-risk group, the new design entails applying both conventional criteria and the gene signature to all patient samples, and thereafter focusing on the discordant cases, which will then be assigned randomly to either of the two arms. The primary goal of the trial is to prove that fewer patients with breast cancer need to be treated with adjuvant chemotherapy: currently, 95% of women in the USA and 85% of women in Europe with node-negative breast cancer are eligible for adjuvant chemotherapy, despite the fact that many of these patients are likely to be cured by local treatment alone. The combination of our increased knowledge about the human genome and advances in molecular biotechnology holds great promise for the future of clinical medicine. The clinical practice of oncology traditionally has been founded on empirical treatment strategies derived from population-based evaluations, but is currently being transformed into predictive, individualized treatment based on the molecular classification of disease, and targeted therapy. This transformation has the potential to increase treatment efficacy, while decreasing both toxicity and costs. 
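The revised allocation logic described above (apply both the conventional criteria and the gene signature to every patient, then randomize only the discordant cases) can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the trial's actual protocol: the field names `clinical_risk` and `genomic_risk`, the function `allocate`, and the simple 1:1 randomization are all hypothetical.

```python
import random


def allocate(patients, seed=0):
    """Decide, per patient, which risk assessment guides treatment.

    `patients` is a list of dicts with the hypothetical keys
    'id', 'clinical_risk' and 'genomic_risk' (each 'high' or 'low').
    Returns a dict mapping patient id -> (decision basis, risk used).
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    plan = {}
    for p in patients:
        if p["clinical_risk"] == p["genomic_risk"]:
            # Concordant case: both assessments agree, nothing to randomize.
            plan[p["id"]] = ("concordant", p["clinical_risk"])
        else:
            # Discordant case: randomize which assessment is followed
            # (assumed 1:1 allocation for illustration).
            arm = rng.choice(["clinical", "genomic"])
            risk = p["clinical_risk"] if arm == "clinical" else p["genomic_risk"]
            plan[p["id"]] = (arm, risk)
    return plan


if __name__ == "__main__":
    cohort = [
        {"id": "P1", "clinical_risk": "high", "genomic_risk": "high"},
        {"id": "P2", "clinical_risk": "high", "genomic_risk": "low"},
        {"id": "P3", "clinical_risk": "low", "genomic_risk": "low"},
    ]
    for pid, decision in allocate(cohort).items():
        print(pid, decision)
```

Only the discordant patient (P2 in this toy cohort) is randomized, which is where the two approaches would lead to different treatment decisions.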
The future
Intense research over the last couple of years has demonstrated that gene expression profiling can classify tumors effectively into comprehensive groups and has revealed extensive differences in gene expression among cancers hitherto considered homogeneous using standard diagnostic tools. Gene expression profiling has proven useful in predicting the effectiveness of treatment and in classifying tumors according to response or resistance to therapy. Consequently, this new technology may have profound clinical
implications in the diagnosis and prognostication of cancer, thereby improving cancer stratification and clinical decision making. In many cases, the ability to predict outcome better may improve patient survival immediately; high-risk patients may be found to benefit from intensified treatment at the time of diagnosis, while low-risk patients who do not require such treatment may be spared unnecessary toxicity. However, in many cases the treatment options that are available are not sufficient for high-risk patients; hence, these novel tools are of little value to the patient. Therefore, the identification of novel therapeutic targets for the treatment of these patients is of critical and immediate importance. As a consequence, the great impact of microarray technology on clinical practice may be felt more in the long term. As illustrated by a few research groups, probably the most plausible approach for translating microarray-derived gene expression profiles into clinically applicable routine assays is first to identify diagnostic or prognostic gene expression profiles consisting of a small number of genes using whole-genome microarrays (and fresh frozen tissues), and then to validate the clinical efficacy of these genes in prospective studies using a simple and robust conventional assay such as RT-PCR on formalin-fixed paraffin-embedded tissue. Such an assay could be incorporated readily into clinical practice, and would also be the most cost-effective. While this strategy may greatly improve and refine the stratification of patients into the available treatment alternatives, the greatest obstacle to the advancement of clinical oncology remains the identification and development of novel, targeted drugs with optimized efficacy.
REFERENCES
1. Perou, C. M., Sorlie, T., Eisen, M. B. et al. Molecular portraits of human breast tumours. Nature 2000; 406(6797): 747–52.
2. Hedenfalk, I., Duggan, D., Chen, Y. et al. Gene-expression profiles in hereditary breast cancer. N. Engl. J. Med. 2001; 344(8): 539–48.
3. Gruvberger, S., Ringner, M., Chen, Y. et al. Estrogen receptor status in breast cancer is associated with remarkably distinct gene expression patterns. Cancer Res. 2001; 61(16): 5979–84.
4. Hedenfalk, I., Ringner, M., Ben-Dor, A. et al. Molecular classification of familial non-BRCA1/BRCA2 breast cancer. Proc. Natl Acad. Sci. USA 2003; 100(5): 2532–7.
5. Sorlie, T., Perou, C. M., Tibshirani, R. et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl Acad. Sci. USA 2001; 98(19): 10869–74.
6. van’t Veer, L. J., Dai, H., van de Vijver, M. J. et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002; 415(6871): 530–6.
7. van de Vijver, M. J., He, Y. D., van’t Veer, L. J. et al. A gene-expression signature as a predictor of survival in breast cancer. N. Engl. J. Med. 2002; 347(25): 1999–2009.
8. Chang, J. C., Wooten, E. C., Tsimelzon, A. et al. Gene expression profiling for the prediction of therapeutic response to docetaxel in patients with breast cancer. Lancet 2003; 362(9381): 362–9.
9. Ma, X. J., Wang, Z., Ryan, P. D. et al. A two-gene expression ratio predicts clinical outcome in breast cancer patients treated with tamoxifen. Cancer Cell 2004; 5(6): 607–16.
10. Paik, S., Shak, S., Tang, G. et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N. Engl. J. Med. 2004; 351(27): 2817–26.
11. Iwao-Koizumi, K., Matoba, R., Ueno, N. et al. Prediction of docetaxel response in human breast cancer by gene expression profiling. J. Clin. Oncol. 2005; 23(3): 422–31.
12. Jansen, M. P., Foekens, J. A., van Staveren, I. L. et al. Molecular classification of tamoxifen-resistant breast carcinomas by gene expression profiling. J. Clin. Oncol. 2005; 23(4): 732–40.
13. Osborne, C. K. Tamoxifen in the treatment of breast cancer. N. Engl. J. Med. 1998; 339(22): 1609–18.
14. West, M., Blanchette, C., Dressman, H. et al. Predicting the clinical status of human breast cancer by using gene expression profiles. Proc. Natl Acad. Sci. USA 2001; 98(20): 11462–7.
15. Kordon, E. C. and Smith, G. H. An entire functional mammary gland may comprise the progeny from a single cell. Development 1998; 125(10): 1921–30.
16. Gudjonsson, T., Villadsen, R., Nielsen, H. L. et al. Isolation, immortalization, and characterization of a human breast epithelial cell line with stem cell properties. Genes Dev. 2002; 16(6): 693–706.
17. Sorlie, T., Tibshirani, R., Parker, J. et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc. Natl Acad. Sci. USA 2003; 100(14): 8418–23.
18. Sotiriou, C., Neo, S. Y., McShane, L. M. et al. Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc. Natl Acad. Sci. USA 2003; 100(18): 10393–8.
19. Miki, Y., Swensen, J., Shattuck-Eidens, D. et al. A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science 1994; 266(5182): 66–71.
20. Wooster, R., Bignell, G., Lancaster, J. et al. Identification of the breast cancer susceptibility gene BRCA2. Nature 1995; 378(6559): 789–92.
21. Esteller, M., Silva, J. M., Dominguez, G. et al. Promoter hypermethylation and BRCA1 inactivation in sporadic breast and ovarian tumors. J. Natl Cancer Inst. 2000; 92(7): 564–9.
22. Jones, P. A. and Baylin, S. B. The fundamental role of epigenetic events in cancer. Nat. Rev. Genet. 2002; 3(6): 415–28.
23. Hedenfalk, I., Simon, R., and Trent, J. Gene-expression profiles in hereditary breast cancer. N. Engl. J. Med. 2001; 344(26): 2029.
24. Fan, S., Wang, J., Yuan, R. et al. BRCA1 inhibition of estrogen receptor signaling in transfected cells. Science 1999; 284(5418): 1354–6.
25. Ma, Y. X., Tomita, Y., Fan, S. et al. Structural determinants of the BRCA1: estrogen receptor interaction. Oncogene 2005; 24(11): 1831–46.
26. Szabo, C. I. and King, M. C. Population genetics of BRCA1 and BRCA2. Am. J. Hum. Genet. 1997; 60(5): 1013–20.
27. Kainu, T., Juo, S. H., Desper, R. et al. Somatic deletions in hereditary breast cancers implicate 13q21 as a putative novel breast cancer susceptibility locus. Proc. Natl Acad. Sci. USA 2000; 97(17): 9603–8.
28. Meijers-Heijboer, H., Van Den Ouweland, A., Klijn, J. et al. Low-penetrance susceptibility to breast cancer due to CHEK2(*)1100delC in noncarriers of BRCA1 or BRCA2 mutations. Nat. Genet. 2002; 31(1): 55–9.
29. Cristofanilli, M., Gonzalez-Angulo, A., Sneige, N. et al. Invasive lobular carcinoma classic type: response to primary chemotherapy and survival outcomes. J. Clin. Oncol. 2005; 23(1): 41–8.
30. Zhao, H., Langerod, A., Ji, Y. et al. Different gene expression patterns in invasive lobular and ductal carcinomas of the breast. Mol. Biol. Cell 2004; 15(6): 2523–36.
31. Ramaswamy, S., Ross, K. N., Lander, E. S. et al. A molecular signature of metastasis in primary solid tumors. Nat. Genet. 2003; 33(1): 49–54.
32. Schmidt-Kittler, O., Ragg, T., Daskalakis, A. et al. From latent disseminated cells to overt metastasis: genetic analysis of systemic breast cancer progression. Proc. Natl Acad. Sci. USA 2003; 100(13): 7737–42.
33. Weigelt, B., Glas, A. M., Wessels, L. F. et al. Gene expression profiles of primary breast tumors maintained in distant metastases. Proc. Natl Acad. Sci. USA 2003; 100(26): 15901–5.
34. Kang, Y., Siegel, P. M., Shu, W. et al. A multigenic program mediating breast cancer metastasis to bone. Cancer Cell 2003; 3(6): 537–49.
35. Ahr, A., Karn, T., Solbach, C. et al. Identification of high risk breast-cancer patients by gene expression profiling. Lancet 2002; 359(9301): 131–2.
36. Huang, E., Cheng, S. H., Dressman, H. et al. Gene expression predictors of breast cancer outcomes. Lancet 2003; 361(9369): 1590–6.
37. Nagahata, T., Onda, M., Emi, M. et al. Expression profiling to predict postoperative prognosis for estrogen receptor-negative breast cancers by analysis of 25,344 genes on a cDNA microarray. Cancer Sci. 2004; 95(3): 218–25.
38. Onda, M., Emi, M., Nagai, H. et al. Gene expression patterns as marker for 5-year postoperative prognosis of primary breast cancers. J. Cancer Res. Clin. Oncol. 2004; 130(9): 537–45.
39. Jones, C., Mackay, A., Grigoriadis, A. et al. Expression profiling of purified normal human luminal and myoepithelial breast cells: identification of novel prognostic markers for breast cancer. Cancer Res. 2004; 64(9): 3037–45.
40. Wang, Y., Klijn, J. G., Zhang, Y. et al. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 2005; 365(9460): 671–9.
41. Jones, C., Ford, E., Gillett, C. et al. Molecular cytogenetic identification of subgroups of grade III invasive ductal breast carcinomas with different clinical outcomes. Clin. Cancer Res. 2004; 10(18 Pt 1): 5988–97.
42. Goldhirsch, A., Glick, J. H., Gelber, R. D. et al. International Consensus Panel on the treatment of primary breast cancer. Seventh International Conference on adjuvant therapy of primary breast cancer. J. Clin. Oncol. 2001; 19(18): 3817–27.
43. Eifel, P., Axelson, J. A., Costa, J. et al. National Institutes of Health Consensus Development Conference Statement: adjuvant therapy for breast cancer, November 1–3, 2000. J. Natl Cancer Inst. 2001; 93(13): 979–89.
44. Blamey, R. W., Davies, C. J., Elston, C. W. et al. Prognostic factors in breast cancer – the formation of a prognostic index. Clin. Oncol. 1979; 5(3): 227–36.
45. Eden, P., Ritz, C., Rose, C. et al. “Good Old” clinical markers have similar power in breast cancer prognosis as microarray gene expression profilers. Eur. J. Cancer 2004; 40(12): 1837–41.
46. Pittman, J., Huang, E., Dressman, H. et al. Integrated modeling of clinical and gene expression information for personalized prediction of disease outcomes. Proc. Natl Acad. Sci. USA 2004; 101(22): 8431–6.
47. Chang, H. Y., Nuyten, D. S., Sneddon, J. B. et al. Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival. Proc. Natl Acad. Sci. USA 2005; 102(10): 3738–43.
48. Chang, H. Y., Sneddon, J. B., Alizadeh, A. A. et al. Gene expression signature of fibroblast serum response predicts human cancer progression: similarities between tumors and wounds. PLoS Biol. 2004; 2(2): E7.
49. Gruvberger, S. K., Ringner, M., Eden, P. et al. Expression profiling to predict outcome in breast cancer: the influence of sample selection. Breast Cancer Res. 2003; 5(1): 23–6.
50. Chang, J. C., Wooten, E. C., Tsimelzon, A. et al. Patterns of resistance and incomplete response to docetaxel by gene expression profiling in breast cancer patients. J. Clin. Oncol. 2005; 23(6): 1169–77.
51. Hyman, E., Kauraniemi, P., Hautaniemi, S. et al. Impact of DNA amplification on gene expression patterns in breast cancer. Cancer Res. 2002; 62(21): 6240–5.
52. Nessling, M., Richter, K., Schwaenen, C. et al. Candidate genes in breast cancer revealed by microarray-based comparative genomic hybridization of archived tissue. Cancer Res. 2005; 65(2): 439–47.
53. Esteller, M. Relevance of DNA methylation in the management of cancer. Lancet Oncol. 2003; 4(6): 351–8.
54. Jacquemier, J., Ginestier, C., Rougemont, J. et al. Protein expression profiling identifies subclasses of breast cancer and predicts prognosis. Cancer Res. 2005; 65(3): 767–79.
55. Schubert, C. M. Microarray to be used as routine clinical screen. Nat. Med. 2003; 9(1): 9.
56. Tuma, R. S. Trial and error: prognostic gene signature study design altered. J. Natl Cancer Inst. 2005; 97(5): 331–3.
7
Gene expression profiling in lymphoid malignancies Christof Burek, Elena Hartmann, Zhengrong Mao, German Ott, and Andreas Rosenwald Institute of Pathology, University of Wuerzburg, Germany
Gene Expression Profiling by Microarrays: Clinical Implications, ed. Wolf-Karsten Hofmann. Published by Cambridge University Press 2006. © Cambridge University Press 2006.
Introduction
The development of high-throughput technologies and, in particular, of DNA microarrays has led to a great leap forward in the understanding of complex biological processes, as highlighted in the previous chapters. Not surprisingly, this technology has also revealed exciting new insights in the field of lymphoid malignancies. Specifically, the first steps have been taken towards a molecular classification of lymphomas, and gene expression-based survival predictors for lymphoma patients have been created that may prove useful in guiding future treatment decisions. Importantly, oncogenic pathways and relevant biological features of various lymphoma subtypes have been uncovered that may facilitate new targeted treatment approaches. Traditionally, lymphoma classifications have been a topic of hot debate, and various conceptual frameworks have been used in the past to classify lymphomas in a clinically and biologically meaningful way [1]. Early attempts at lymphoma classification relied heavily on either morphological or clinical aspects (e.g., in the Rappaport classification or in the Working Formulation, respectively). In the Kiel classification, cytological and immunologic criteria were applied for the first time to classify lymphomas according to their derivation from the B- or T-cell lineage. The latest approaches to lymphoma classification, the Revised European-American Lymphoma (REAL) and World Health Organization (WHO) classifications, include morphological aspects, immunophenotype and clinical features, but also underlying genetic alterations to define lymphoma subtypes [2, 3]. For example, mantle cell lymphoma (MCL) is now regarded as a distinct
subgroup of B-cell non-Hodgkin’s lymphoma (B-NHL), characterized by the reciprocal chromosomal translocation t(11;14) that is present in virtually all cases [4]. The inclusion of recurrent genetic alterations in lymphoma classification schemes can be viewed as a first attempt at a molecular diagnosis of these neoplasms. However, many currently defined lymphoma entities are still characterized by considerable biological and clinical heterogeneity, e.g., in their response to a given therapy. It is hoped that global gene expression profiles established from these tumors may have the potential to define molecularly distinct diseases in the future. In this chapter, we will illustrate the effectiveness of this approach by reviewing gene expression profiling studies in various subtypes of B-NHL.
B-cell chronic lymphocytic leukemia (B-CLL)
Decades ago, B-cell chronic lymphocytic leukemia (B-CLL) was considered to be a homogeneous neoplasm of immature, immune-incompetent and minimally self-renewing B-cells [5]. However, it has long been evident that the clinical course of B-CLL can be highly variable. One group of patients with B-CLL is characterized by having stable disease for decades. These patients never require therapy and are likely to die from a cause unrelated to their leukemia. In contrast, the second group of B-CLL patients suffers from a rapid accumulation of leukemic cells and requires therapeutic intervention soon after diagnosis. These patients have a median survival of only a few years and die as a result of their leukemia. The dramatic heterogeneity in the clinical course of individual B-CLL patients likely reflects molecular differences in the leukemic cells. It has been demonstrated that B-CLL cells from different individuals vary in the mutational status of their immunoglobulin heavy chain variable region (IgVH) genes [6, 7], and in two landmark studies in 1999 these molecular differences were found to be correlated with the clinical course of B-CLL [8, 9]. 
In particular, patients with the stable form of B-CLL frequently had somatically mutated IgVH genes, whereas somatically unmutated IgVH genes were detected predominantly in patients with a poor clinical outcome. Mutations of the IgVH gene are usually detected by comparing the DNA sequence in the neoplastic B-cells with the corresponding genes in the germline configuration [10], and a difference in the sequence of more
than 2% is frequently used as a cut-off for placing a particular B-CLL in the mutated or unmutated category. Since the process of somatic hypermutation is a characteristic of the germinal center stage of B-cell differentiation, the question arose whether B-CLL actually comprises two distinct diseases, one being derived from an unmutated, pre-germinal center B-cell and the other derived from a germinal center or post-germinal center B-cell. This question was addressed in two major studies using gene expression profiling techniques [11, 12]. Moreover, these studies sought to identify potential surrogate markers for the IgVH gene mutational status, since sequencing of the IgVH genes is cumbersome and may not be feasible for routine application in a clinical diagnostic laboratory. Both studies characterized a panel of B-CLL patients for the mutational status of their IgVH genes and generated corresponding gene expression profiles of the leukemic cells. In the study by Rosenwald and colleagues [12], Lymphochip cDNA microarrays containing more than 17 000 human cDNA clones [13] were used to profile B-CLL cells that had been purified from whole blood samples of 37 patients using a CD19-positive selection method. Similarly, Klein et al. [11] characterized 34 B-CLL patients using the Affymetrix U95A platform, with approximately 12 000 genes represented on the array. The results of both studies led to similar conclusions that argue against the hypothesis that B-CLL encompasses two distinct disease entities. In particular, B-CLL is characterized by a homogeneous gene expression profile that is common to cases with mutated and unmutated IgVH genes and that discriminates this B-cell neoplasm from various other types of B-NHL. These results suggest that all B-CLL cases, irrespective of the IgVH mutational status, are derived from a common cell of origin or share fundamental pathogenetic features. Therefore, B-CLL should be considered a single disease. 
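The 2% cut-off for IgVH mutational status mentioned above amounts to a simple calculation once the tumor sequence has been aligned to its germline counterpart. The following is a minimal sketch under stated assumptions: the sequences are taken to be already aligned and of equal length, and the function names `percent_divergence` and `igvh_status` are hypothetical. Real pipelines would also have to handle alignment gaps and the choice of the closest germline gene, which this sketch ignores.

```python
def percent_divergence(tumor_seq, germline_seq):
    """Percentage of positions that differ between two aligned sequences.

    Assumes both sequences are non-empty, already aligned and equally long.
    """
    if not germline_seq or len(tumor_seq) != len(germline_seq):
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    mismatches = sum(a != b for a, b in zip(tumor_seq, germline_seq))
    return 100.0 * mismatches / len(germline_seq)


def igvh_status(tumor_seq, germline_seq, cutoff=2.0):
    """Classify as 'mutated' when divergence is more than the cut-off (default 2%)."""
    divergence = percent_divergence(tumor_seq, germline_seq)
    return "mutated" if divergence > cutoff else "unmutated"


if __name__ == "__main__":
    germline = "ACGT" * 25                 # toy 100-base germline sequence
    tumor = "TTT" + germline[3:]           # differs at positions 0 and 2
    print(percent_divergence(tumor, germline), igvh_status(tumor, germline))
```

In this toy case the divergence is exactly 2%, which is not more than the cut-off, so the sample is placed in the unmutated category.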
However, given the differences in the clinical behavior between IgVH-mutated and IgVH-unmutated B-CLL, a search was made in both studies to identify genes differentially expressed between these two subgroups. Despite the use of different array platforms, these gene lists showed a remarkable overlap, and one of the genes, ZAP-70, has attracted much attention in follow-up studies. ZAP-70, the best discriminator between IgVH-mutated and IgVH-unmutated B-CLL cases in one of the studies [12], encodes a tyrosine kinase critical for signaling through the T-cell receptor [14].
ZAP-70 had not previously been described as being expressed in B-cells; in B-CLL cells, however, its expression is strongly associated with IgVH-unmutated cases [12, 15]. Possible technological platforms for the clinical assessment of ZAP-70 expression in B-CLL cells include semiquantitative or quantitative reverse-transcription-polymerase chain reaction (RT-PCR) assays, immunohistochemical approaches (IHC), and flow cytometric analysis. Probably the most promising approach for determining ZAP-70 expression in a routine clinical setting is multiparameter flow cytometric analysis. Recent studies illustrate that ZAP-70 expression, measured by flow cytometry, might prove to be a valuable prognostic marker for CLL patients that is equivalent to the IgVH mutational status. Crespo and colleagues [16] studied ZAP-70 expression by flow cytometric analysis in 56 B-CLL patients; all patients with ZAP-70 expression in ≥20% of the leukemic cells carried unmutated IgVH genes. Conversely, 21 of 24 patients with ZAP-70 expression in <20% of B-CLL cells had mutated IgVH genes, suggesting that ZAP-70 might be an excellent surrogate for the IgVH mutational status. The median time to progression and median survival in patients with early stage B-CLL (Binet stage A) were significantly longer in patients with <20% ZAP-70-positive cells, although the correlation between ZAP-70 expression and survival did not reach statistical significance in the overall patient cohort. The excellent correlation between ZAP-70 expression and the IgVH mutational status was confirmed in a much larger study that was conducted at the Royal Bournemouth Hospital in the UK [17]. In this study, median survival differed dramatically between ZAP-70-positive and ZAP-70-negative patients: 24.4 years in the ZAP-70-negative B-CLL group and 9.3 years in the ZAP-70-positive B-CLL group. 
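The agreement between the 20% ZAP-70 threshold and the IgVH mutational status in the Crespo cohort can be made explicit with a small worked calculation. The sketch below is illustrative only: the function names are hypothetical, and the count of 32 patients in the ZAP-70-high group is an assumption derived as 56 − 24 from the figures quoted above, not a number stated in the text.

```python
def zap70_call(percent_positive_cells, threshold=20.0):
    """Call a sample ZAP-70 'positive' at or above the threshold (% of leukemic cells)."""
    return "positive" if percent_positive_cells >= threshold else "negative"


def concordance(counts):
    """Fraction of patients in whom ZAP-70 and IgVH status agree.

    `counts` maps (zap70_call, igvh_status) -> number of patients.
    Agreement means positive/unmutated or negative/mutated.
    """
    agree = counts.get(("positive", "unmutated"), 0) + counts.get(("negative", "mutated"), 0)
    return agree / sum(counts.values())


# Counts reconstructed from the text; 32 = 56 - 24 is an assumption.
crespo_counts = {
    ("positive", "unmutated"): 32,
    ("positive", "mutated"): 0,
    ("negative", "mutated"): 21,
    ("negative", "unmutated"): 3,
}

if __name__ == "__main__":
    print(f"concordance: {concordance(crespo_counts):.1%}")  # 53 of 56 -> 94.6%
```

Under these assumptions the two markers agree in 53 of 56 patients, leaving three discordant cases.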
Another recent study by Rassenti and colleagues [18] reported that ZAP-70 expression measured by flow cytometry might be a stronger predictor of the need for treatment than the IgVH mutational status. While, among ZAP-70-negative patients, those with mutated IgVH genes had a significantly longer median time to initial treatment, the IgVH mutation status was not associated with a significant delay in disease progression among ZAP-70-positive patients. Although these results need to be confirmed in larger patient cohorts and in prospective clinical trials, the existing results suggest that ZAP-70 might prove to be a valuable prognostic marker for B-CLL
Fig. 7.1. Gene expression profiling identifies the clinical marker ZAP-70 in B-CLL. ZAP-70 expression appears to correlate well with the mutational status of the immunoglobulin heavy chain gene in B-CLL cells and predicts survival of B-CLL patients. Potential applications in the routine clinical setting involve ZAP-70 detection by immunohistochemistry or by flow cytometric analysis. [Figure not reproduced. Panel labels: Gene expression profiling in B-CLL (B-CLL mutated, B-CLL unmutated); Identification of molecular markers (ZAP-70: low expression, high expression); Clinical application: flow cytometry (ZAP-70 against CD3 + CD65) and immunohistochemical detection of ZAP-70 in B-CLL (Ig-unmutated B-CLL, ZAP-70 positive; Ig-mutated B-CLL, ZAP-70 negative).]
patients in the future (Fig. 7.1). However, there are also caveats. In 7–23% of B-CLL patients, the ZAP-70 and the IgVH mutational status were discordant [12, 15, 18], and we will have to await future studies to determine the predictive power of ZAP-70 in these patients. A recent study of 100 blood samples from B-CLL patients correlated gene expression profiles with underlying genetic alterations [19]. B-CLL cases with deletions in 17p13 (p53 locus), 11q22–23 (ATM locus), 13q14, 6q21 as well as cases with trisomy of 12q13 all had distinct gene expression profiles, and class predictors for these cytogenetic subgroups could be constructed that resulted in low misclassification rates. Moreover, significant subsets of differentially expressed genes were found to be localized in the respective
genomic regions, indicating that gene dosage effects may play a role in the pathogenesis and distinct clinical behavior of at least some of these subgroups. For example, in cases with trisomy 12, 18 of the top 25 genes discriminatory for this group mapped to chromosome 12, and all of these genes were more highly expressed in these cases than in other cytogenetic subgroups.
Gene expression signatures of B-CLL cells in response to DNA damage
Although the gene expression signatures of all B-CLL cases, regardless of their IgVH mutational status and their clinical course, are highly similar, their response to DNA damage-inducing agents can vary dramatically. Two genes that are key regulators of the cellular response to DNA damage, ATM and p53, are frequently altered in a subset of B-CLL cases, suggesting that this pathway might play an important role in these cases [20–22]. A recent microarray study elegantly revealed that ATM-mutant, p53-mutant, and ATM/p53 wild-type B-CLL cells respond differently to the induction of DNA double-strand breaks by ionizing radiation in vitro, whereas the transcriptional profiles of these three groups were indistinguishable before irradiation [23]. In particular, p53- and ATM-deficient B-CLL cells failed to up-regulate proapoptotic p53 target genes in response to irradiation (e.g., BAX, FAS, TRAIL receptor-2), suggesting that both genes cooperate in the cellular response to DNA damage, which ultimately drives B-CLL cells into programmed cell death. Surprisingly, however, ATM also appears to up-regulate antiapoptotic signals in response to DNA damage, since ATM-mutant B-CLL cells did not show up-regulation of antiapoptotic genes such as members of the NF-κB pathway, in contrast to p53-mutant and ATM/p53 wild-type B-CLL cells. 
This finding implies that future therapeutic strategies in B-CLL could aim at restoring the p53 pathway in p53-deficient B-CLL cells; however, it might also be beneficial to interfere therapeutically with the antiapoptotic signals conferred by ATM in response to DNA damage. In a similar approach, Vallat and colleagues [24] compared the expression profiles of B-CLL cells that were either sensitive or resistant to apoptotic stimuli induced by in vitro irradiation. Overall, 13 genes were identified as being differentially expressed between the two groups, among which c-myc and c-IAP1 were up-regulated in the resistant B-CLL samples. These genes
may prove to be useful markers with clinical relevance for B-CLL cells resistant to apoptotic stimuli. Gene expression profiling has also been applied to study in vivo drug effects on the tumor cells in patients with B-CLL [25]. Purine analogues (such as fludarabine) that are frequently used as first-line therapy in B-CLL patients induce DNA double-strand breaks in B-CLL cells, and the presence of p53 mutations in CLL is associated with clinical resistance to fludarabine treatment and decreased survival [26, 27]. The molecular consequences of fludarabine treatment of CLL patients in vivo were recently investigated using gene expression profiling [25]. After the first infusion of fludarabine in seven previously untreated B-CLL patients, leukemic cells were obtained at 3 hours and 6 hours, and on days 2, 3, 4, and 5 after the start of therapy. Subsequent gene expression profiling of B-CLL cells revealed a homogeneous induction of the p53 pathway in vivo, and all seven patients showed partial clinical remissions. Importantly, none of the seven patients showed other prominent gene expression changes; thus the p53-mediated gene expression response might play a key role in the induction of apoptosis in response to fludarabine treatment in vivo. The finding that fludarabine activates a p53 response in CLL cells in vivo suggests that fludarabine treatment has the potential to select p53-mutant CLL cells, which are more drug resistant and associated with an aggressive clinical course. Future studies will have to show whether the observed p53-dependent gene expression changes are specific for the response to fludarabine treatment or whether they can also occur in other therapies (e.g., with chlorambucil).
Mantle cell lymphoma (MCL)
Mantle cell lymphoma (MCL) is a morphologically distinct subtype of B-NHL with a poor response to therapy and a median survival of approximately 3–4 years. 
On the genetic level, the major characteristic is the chromosomal translocation t(11;14)(q13;q32) that juxtaposes the Cyclin D1 gene next to the immunoglobulin heavy chain gene [4, 28, 29]. As a consequence, Cyclin D1 is expressed constitutively in this B-NHL subtype, while the expression of this D-type cyclin is usually not found in B-cells [30]. Cyclin D1 controls the progression from the G1 to the S phase of the cell cycle by binding to cyclin-dependent kinases (CDKs), leading to
phosphorylation of the retinoblastoma (Rb) protein which, in turn, releases transcription factors of the E2F family. The specific up-regulation of Cyclin D1 in MCL suggests an important role in the pathogenesis of this lymphoma, and its expression has been helpful in distinguishing MCL from other B-cell malignancies, such as small lymphocytic lymphoma. Despite the t(11;14) that is common to virtually all MCL cases, the clinical course of this lymphoma is highly variable. Whereas some patients succumb to their disease within months, other patients follow an indolent clinical course with a survival of more than 10 years [4, 29]. The dramatic clinical heterogeneity of MCL led to a search for prognostic markers, and various morphologic and genetic features have been associated with clinical outcome. In particular, the blastic variant of MCL has been found to be correlated with adverse clinical outcome [31–33]. A high tumor proliferation rate, genomic deletions of the INK4a/ARF locus and p53 mutations or overexpression also showed an association with decreased survival times in MCL patients in previous studies [31, 32, 34–38]. What have we learned so far from gene expression profiling of MCL specimens? In the largest study performed to date [39], tumor biopsies from 92 untreated MCL patients were subjected to microarray analysis using cDNA Lymphochips. Morphologically and immunophenotypically, all cases showed typical features of MCL, and Cyclin D1 expression was confirmed in all cases. Not surprisingly, all of these MCL cases displayed a homogeneous gene expression profile that was distinct from profiles of other types of B-NHL. A set of “MCL signature genes” that are expressed most highly and differentially in MCL may therefore prove useful for the development of a gene expression-based diagnostic test for MCL, especially in cases where the differential diagnosis from other B-NHL subtypes is not straightforward. 
Another result of this study that highlights the potential of a gene expression-based molecular diagnostic approach is the definition of a potentially new subgroup of MCL, termed Cyclin D1-negative MCL. Occasionally, MCL-like lymphoma cases are observed that show morphologic and immunophenotypic features characteristic of MCL, but lack Cyclin D1 expression. Seven out of nine cases investigated had a gene expression profile that was identical to that of Cyclin D1-positive MCL cases. Therefore, these cases may represent bona fide MCL cases without
overexpression of Cyclin D1, especially since their clinical behavior did not differ from that of the Cyclin D1-positive MCL subgroup. Surprisingly, a subset of Cyclin D1-negative MCL cases displayed high expression of Cyclin D2 or Cyclin D3, a finding which is currently unexplained. However, up-regulation of these cyclins may represent an alternative pathogenetic mechanism in MCL cases that lack Cyclin D1 expression. The most important aspect of this study, however, is the construction of a gene expression-based predictor of survival in MCL patients that provides prognostic information at the time of diagnosis. In a search for individual genes whose expression correlates with survival, 48 genes were identified that predicted outcome with a high statistical significance, and all of these genes were more highly expressed in tumors from patients with a short survival. These genes were grouped functionally into gene expression signatures [40], which represent genes that are coordinately expressed in a particular cell type, during a particular differentiation stage or during a distinct biological process. Of note, many of the 48 predictive genes fell into the “proliferation signature” that encompasses genes that are required for DNA replication, cell cycle progression, and metabolic processes required for the proliferation of tumor cells. A subset of 20 proliferation signature-associated genes was used to construct a survival predictor for MCL patients that performed remarkably well in a training set as well as a validation set of MCL cases. The MCL patients were ranked according to the proliferation signature score of their tumor specimens and divided into four quartiles with median survival times of 0.8, 2.3, 3.3, and 6.7 years. In conclusion, the accurate quantitative measurement of proliferation in MCL cells, provided by proliferation signature genes, identified subsets of MCL patients that differ in their survival times by almost 6 years (Fig. 7.2).
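The ranking step described above can be illustrated with a minimal sketch. It assumes that the signature score is simply the average expression of the signature genes in each tumor and that patients are split into four equally sized groups by rank; the exact scoring used in the study [39] is not given in this chapter. The gene names are taken from Fig. 7.2, while the patient identifiers and expression values are hypothetical.

```python
from statistics import mean

# Three proliferation-associated genes named in Fig. 7.2 (illustrative subset).
SIGNATURE_GENES = ["CDC2", "ASPM", "RAN"]

# Hypothetical log-scale expression values, invented for illustration only.
tumors = {
    "P1": {"CDC2": 2.0, "ASPM": 1.0, "RAN": 3.0},
    "P2": {"CDC2": 0.0, "ASPM": 1.0, "RAN": -1.0},
    "P3": {"CDC2": 1.0, "ASPM": 1.0, "RAN": 1.0},
    "P4": {"CDC2": 4.0, "ASPM": 3.0, "RAN": 2.0},
}


def signature_score(expression, genes=SIGNATURE_GENES):
    """Average the expression of the signature genes for one tumor (assumed scoring)."""
    return mean(expression[g] for g in genes)


def assign_quartiles(scores):
    """Rank patients by score and split them into four equally sized groups.

    Returns {patient_id: quartile}, where quartile 1 holds the lowest scores.
    """
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    return {pid: (rank * 4) // n + 1 for rank, pid in enumerate(ranked)}


scores = {pid: signature_score(expr) for pid, expr in tumors.items()}
quartiles = assign_quartiles(scores)
```

With real data, each quartile would then be summarized by its own survival curve, as in Fig. 7.2.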
From a biological standpoint, the proliferation signature in MCL can be viewed as an integrator of various oncogenic events in MCL tumors. First, levels of Cyclin D1 expression can vary among Cyclin D1-positive MCL cases, and an increase in overall Cyclin D1 expression in a given MCL specimen is associated with a higher proliferation signature measured by gene expression. Second, deletions in the INK4a/ARF locus were also found to be associated with an increase of the proliferation signature with statistical significance. However, mathematical models including Cyclin D1 expression levels or the INK4a/ARF deletion status alone or in combination
Fig. 7.2. Expression-based proliferation signature predicts outcome in mantle cell lymphoma. [Figure not reproduced. Panels: expression of proliferation signature genes (CDC2, ASPM, CENP-F, Tubulin-α, CIP2, RAN) and their average in tumors with low, intermediate, and high proliferation; survival probability (%) against overall survival (years); immunohistochemistry for Ki 67 in MCL with high and low proliferation.] Gene expression profiling identifies the proliferation-associated signature as a strong predictor for survival in mantle cell lymphoma (MCL). The prognosis of patients with a high score of the expression-based proliferation signature is poor, while the prognosis is more favorable in patients with a low expression of the proliferation signature. Potential clinical application involves the measurement of proliferation-associated markers by immunohistochemical methods (such as Ki 67).
did not perform as well in predicting survival as the proliferation signature-based outcome model. Thus, the gene expression-based model may capture additional oncogenic events in MCL cells that are presently not known. Additional insights into the pathogenesis of MCL can be gleaned from several other studies of MCL specimens using gene expression profiling. An early investigation by Hofmann and colleagues [41] used Affymetrix oligonucleotide arrays for a comparison of five MCL cases with four cases of non-malignant hyperplastic lymph nodes. Among 92 genes that were found to be differentially expressed between these two groups, 50 genes were more highly expressed in MCL, whereas 42 genes were down-regulated. The group of up-regulated genes included transcription factors (AP2, JUN, MYB, MYC), cell cycle related genes (CDK4, CCND1) and growth factors and their receptors (IL-1R, IL-3, IL-8, IL-13). Among down-regulated genes, key genes from pro-apoptotic pathways such as FADD, DAXX, CASP2, and RAIDD were identified, leading to the hypothesis that disturbances of apoptotic pathways may contribute to the pathogenesis of MCL. In a study performed by Martinez and co-workers [42], 38 MCL samples were analyzed using a cDNA “Oncochip” and compared to normal mantle zone B-cells that had been purified from non-malignant hyperplastic tonsils. In agreement with a previous study [39], gene expression profiles of MCL cases were found to be homogeneous. More than 400 genes were differentially expressed between MCL and normal mantle zone B-cells, among which several genes affected the tumor necrosis factor, nuclear factor κB and apoptotic pathways. In addition, two genes were identified with high expression in MCL specimens that may be of relevance for the pathogenesis of this lymphoma subtype. First, the receptor of IL-10 (IL10R) is highly expressed in a substantial proportion of MCL cases.
Elevated expression of IL-10 has been reported in CD5-positive B-cells (such as MCL cells) [43] and may be, at least in part, responsible for increased survival of these B-cells. Thus, IL-10 may act in an autocrine fashion in MCL cells and stimulate growth or, alternatively, confer antiapoptotic signals through the concomitantly expressed IL-10 receptor. In addition, SPARC, a secreted extracellular matrix glycoprotein, appears to be overexpressed in many MCL cases. This protein may be of relevance for the homing of tumor cells and has also been found overexpressed in multiple myeloma samples [44].
Follicular lymphoma (FL)
Follicular lymphoma (FL) is the second most common lymphoid neoplasm in the Western world after diffuse large B-cell lymphoma (DLBCL), accounting for roughly 30% of newly diagnosed lymphoid tumors. In the vast majority of cases (85%), a characteristic chromosomal translocation, t(14;18)(q32;q21), can be observed that juxtaposes the BCL2 oncogene with the immunoglobulin heavy chain promoter sequence. By virtue of this chromosomal rearrangement, BCL2 is constitutively overexpressed in neoplastic germinal centers. Since BCL2 protein overexpression is crucial in avoiding programmed cell death (apoptosis), the aberration confers longevity to the tumor cells and, hence, also time to acquire secondary chromosomal alterations. The latter point is particularly important, because rearranged BCL2 as a sole genetic event is not believed to be sufficient for neoplastic transformation. In the majority of cases, the neoplastic infiltrate grows in atypical follicular structures and is composed of centroblasts and centrocytes with the latter usually dominating. A much rarer variant of FL is made up of centroblasts exclusively (FL Grade 3B). This subtype harbors BCL2 alterations in only isolated cases, while nearly 60% of the cases are characterized by BCL6 translocations [45]. During progression of the disease, the tumor cells may lose their entity-specific structural and cytological particularities, and diffuse large cell lymphoma occurs in 30%–60% of the cases [46]. The median survival of patients with FL is approximately 8–10 years; however, some patients suffer from early progression or transformation of their lymphoma, while others have relatively stable disease for more than a decade. At present, no reliable markers exist at the time of diagnosis that would predict the fate of an individual patient, and gene expression profiling was therefore applied to search for such markers.
In early studies, expression profiling was performed to identify differences between FL specimens and their transformed diffuse large B-cell counterparts. Lossos and associates [47] carried out investigations in sequential biopsy specimens of 12 FL patients before and after transformation, using a customized cDNA microarray. The authors did not identify one common gene expression signature that was altered in all of the transformed specimens; instead, they were able to define two transformation-associated
gene expression profiles with c-myc expression being the crucial difference between the two. One group of five patients demonstrated an increase in the expression of c-myc and associated target genes, while in a second group of four patients, the expression of the same cluster of genes including c-myc was decreased. In the remaining cases, no prominent changes in gene expressions were observed. Because of the absence of mutations in c-myc with the exception of one case, and because increased proliferation usually is accompanied by c-myc activation and overexpression, the authors speculated that, in the transformed specimens, the increased expression of c-myc could be a consequence of, rather than the cause of high-grade transformation. For those cases displaying decreased c-myc expression upon transformation, Lossos et al. [47] speculated on an interference with programmed cell death pathways, because c-myc is known to promote cellular growth, but can also drive cells into apoptosis. The decreased expression of c-myc therefore, would favor transformation by down-regulating pro-apoptotic pathways in the tumor cells. In comparing expression profiles of the 12 post-transformation specimens with 12 de novo DLBCL specimens, Lossos and associates [47] observed that the expression of genes associated with proliferation, cell cycle and c-myc activation were overexpressed in de novo DLBCL, while two gene clusters made up of a heterogeneous set of genes were overexpressed in the transformed biopsies. These clusters named ‘‘CD20’’ and ‘‘CDw52’’ consisted of B-lymphocyte signaling, anti-proliferative and surface marker genes [47]. In another study, Elenitoba-Johnson and colleagues [48] examined 12 matched pairs of FL and their DLBCL counterparts and an additional set of unrelated FL and DLBCL tumors by cDNA microarray analysis and quantitative RT-PCR (qRT-PCR). 
They found 113 genes to be differentially expressed between DLBCLs and their antecedent FL with both increases as well as decreases in gene expression levels upon transformation, among them growth factor and cytokine receptors such as C-MET and FGFR3, and N-RAS and RAS-related genes. Of particular interest, they also identified overexpression of p38MAPK in DLBCLs with a prior history of FL. This represents an interesting finding because p38MAPK is known to be a mediator of cellular responses induced by stress or cytokines; in addition, p38MAPK target genes were also overexpressed in the specimens. They also constructed a qRT-PCR gene set that proved to be highly discriminative in
assigning FL, DLBCL and cell lines carrying the t(14;18) to their appropriate groups. The importance of phosphorylated p38MAPK expression was further corroborated in t(14;18)-positive cell lines by functional studies demonstrating a decrease in cell viability upon administration of selective inhibitors of this kinase. Moreover, a t(14;18)-positive cell line xenografted into nude mice showed growth cessation after administration of the p38MAPK inhibitor SB203580 suggesting pharmacological targeting of this gene to represent a possible new approach in the treatment of transformed FL. In a large series of 191 FL of various grades, Dave and colleagues [49] tested the hypothesis whether survival could be predicted by the analysis of gene expression signatures in primary diagnostic biopsies of follicular lymphoma using Affymetrix U133A and U133B arrays. The following strategy was pursued in the study: After dividing the patients into a test set and a validation set, gene expression signatures associated with the length of survival were defined in the test set. Various multivariate survival predictors were tested, including combinations of a total of ten gene expression signatures, five of them being associated with a good prognosis and five being associated with adverse clinical outcome. Two signatures, termed immune-response 1 (IR-1) and immune-response 2 (IR-2) signatures, displayed striking statistical synergy in defining the patient’s individual risk. In this model, IR-1 overexpression was associated with a more favorable outcome, while IR-2 overexpression was pointing at an adverse prognosis (Fig. 7.3). Using these two signatures, an individual survival predictor score was constructed for each patient and subsequently, patients were divided into four survival quartiles according to the relative value of their individual survival predictor score. 
The resulting Kaplan–Meier plots identified patient subgroups with distinct clinical courses and median survival times ranging from 13.6 years (quartile 1) to 3.9 years (quartile 4), thus illustrating the powerful impact of gene expression profiling in determining prognostic features of patients with FL. The survival predictor also stratified patients in the clinical low-risk (international prognostic index (IPI) 0–1) and intermediate-risk (IPI 2–3) groups with median survival differing by up to five years. Importantly, the gene expression-based survival predictor performed equally well in an independent validation set of FL cases.
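For readers unfamiliar with how median survival is read from a Kaplan–Meier plot, the following minimal sketch implements the standard product-limit estimator for one patient group. The follow-up times and event flags are hypothetical and are not data from the study [49]; the function names are illustrative.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate for one group of patients.

    times:  follow-up time per patient (e.g., years)
    events: True if death was observed at that time, False if censored
    Returns [(time, survival_probability)] for each time with at least one death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or lower; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data (years) for five patients in one quartile.
times = [1, 2, 3, 4, 5]
events = [True, False, True, True, False]
curve = kaplan_meier(times, events)
median = median_survival(curve)
```

Censored patients (those alive at last follow-up) lower the number at risk without lowering the survival estimate, which is why the median can stay undefined in groups with few observed deaths.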
Fig. 7.3. Immune response signatures predict outcome in follicular lymphoma (FL). [Figure not reproduced. Panels: gene expression profiling in FL, showing the Immune response 1 and Immune response 2 signatures in cases with good and poor prognosis.] A mathematical predictor using Immune Response 1 and Immune Response 2 signature genes strongly predicts outcome of FL patients at the time of diagnosis (for details see text).
Analysis of the individual components of the IR-1 and IR-2 expression signatures yielded important information on the nature of cells expressing the respective genes. In particular, IR-1-associated genes appear to be mainly expressed in T-cells (e.g., CD7, ITK, and STAT4) and in macrophages (e.g., ACTN1 and TNFSF13B). Of importance, this profile was not a reflection of the number of infiltrating T-cells or macrophages. The IR-2 signature contained genes normally expressed in macrophages or dendritic cells, like TLR5, FCGR1A, or C3AR1. After flow sorting of CD19+ and CD19− cell populations from four primary specimens, expression profiling actually proved the expression of IR-1 and IR-2 signatures in the non-malignant cell population, thus hinting at features of non-malignant cells. Analysis of gene expression profiling in various reactive B-cell subpopulations derived from peripheral blood or tonsils corroborated this finding further, providing evidence that expression signatures
associated with survival in FL mainly reflect biological properties of nonmalignant tumor-infiltrating cells. An entirely different approach in defining clinical aggressiveness of FL cases by using gene expression profiling was recently provided by Glas and associates [50]. This group used supervised clustering analyses to detect differences between expression profiles of clinically indolent and aggressive FL. The authors first pursued a paired sample strategy using 24 samples from 12 patients with available material from indolent and aggressive disease phases. Cross-validation resulted in the emergence of 81 genes with discriminative power. Using the expression of these 81 genes in a learning set of 24 biopsies, an expression level for indolent and aggressive FL cases was defined. The ensuing stratification of FL cases was compared to clinical variables derived from initial presentation, response to chemotherapy, and histopathological variables (such as grade), and the emerging groups were found to be tightly correlated. Within this approach, genes up-regulated in aggressive disease phases played a role in cell cycle control (e.g., CCNE2, CDK2 and CHECK1), DNA synthesis (TOP2A and CTPS) and increased metabolic activity. Vice versa, genes derived from reactive T-cells were upregulated in samples from the indolent phase of the disease, and, in contrast to the study of Dave et al. [49], this observation reflected an elevated non-tumor cell content in these biopsies rather than distinct biological features of the infiltrating bystander cells. The classifier was validated in an additional set of 40 indolent and 18 aggressive FL specimens predefined on the basis of clinical and morphologic criteria and was found to be highly predictive with only 4 of 58 specimens being misclassified as compared to the initial estimation of the grade of aggressiveness of the tumors. 
The classification result of a group of six additional patients showed that, at the time of the biopsy, they presented with ‘‘low-grade’’ clinical and morphological features, but that progressed to aggressive disease within less than 1 year. The classifier assigned all but one of the cases to the indolent group, indicating that its application results in the detection of biological and clinical features at the time the biopsy was taken rather than in the identification of cases at risk for aggressive disease. In this respect, this study aimed at a different goal than Dave and colleagues [49], namely at refining the molecular criteria for a subclassification of FL on the basis of their clinical aggressiveness.
Diffuse large B-cell lymphoma (DLBCL)
From the early days of gene expression profiling using DNA microarrays, the study of diffuse large B-cell lymphomas (DLBCL) has attracted much interest, since this lymphoma subtype is the most common among all non-Hodgkin lymphomas. Moreover, DLBCL has been known for a long time to be a biologically and clinically heterogeneous B-NHL subgroup [51]. Clinically, DLBCL patients differ in their response to current therapeutic regimens. While approximately half of the patients experience a long-term remission and can be considered cured of their lymphoma, the remainder suffer from early disease progression or early relapse and die from their lymphoma within a short period of time. Standard pathologic techniques have so far not explained the biological heterogeneity among DLBCL patients, and it was therefore an obvious step to use gene expression profiling techniques to shed light on molecular differences among DLBCL cases. In a hallmark study, Alizadeh and co-workers compared gene expression profiles of a broad spectrum of lymphoid malignancies, primary B-cell populations and B-cell lymphoma cell lines using the customized “lymphochip” microarray [52]. Hierarchical clustering [53] revealed characteristic gene expression signatures for DLBCL, FL, B-CLL and various B-cell populations, and B-NHL subgroups could readily be distinguished based on their expression profiles. DLBCL were markedly different from other lymphomas; however, a considerable heterogeneity in gene expression was evident among DLBCL and, most importantly, a germinal center B-cell signature [40] could be identified as a variable feature of DLBCL specimens. Accordingly, the terms germinal center-like type of DLBCL (GCB DLBCL) and activated B-cell-like type of DLBCL (ABC DLBCL) were coined.
GCB DLBCL are characterized by high expression of genes that are typically expressed in B-cells at the germinal center stage of B-cell differentiation (such as CD10 and BCL6). Conversely, ABC DLBCL express at a high level genes that are up-regulated in B-cells in response to mitogenic activation (such as BCL2, IRF-4 and CD44). From a clinical point of view, the varying molecular features of DLBCL had a strong impact on the clinical course of the disease with strikingly different 5-year survival rates between GCB and ABC DLBCL patients.
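The idea behind hierarchical clustering of expression profiles can be sketched in a few lines. This chapter does not describe the settings used in the original analyses [52, 53], so the sketch assumes a distance of 1 minus the Pearson correlation and average linkage, a common choice for expression data. Sample names and values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equally long profiles (must not be constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def hierarchical_clusters(profiles, n_clusters):
    """Agglomerative average-linkage clustering on distance 1 - Pearson r.

    profiles: {sample_id: [expression values]}
    Merges the two closest clusters until n_clusters remain; returns a list of sets.
    """
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }
    clusters = [frozenset([s]) for s in profiles]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return [set(c) for c in clusters]


# Hypothetical profiles: two samples rise across four genes, two fall.
profiles = {
    "A1": [1, 2, 3, 4],
    "A2": [2, 4, 6, 9],
    "B1": [4, 3, 2, 1],
    "B2": [9, 6, 4, 2],
}
groups = hierarchical_clusters(profiles, 2)
```

Samples whose genes rise and fall together end up in the same cluster regardless of absolute expression level, which is the property that lets signature-defined subgroups such as GCB and ABC DLBCL emerge from unlabeled data.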
A subsequent, larger study of more than 200 DLBCL patients confirmed and extended the initial observations in an independent data set [54]. In this study, GCB and ABC DLBCL patients had 5-year survival rates following anthracycline-based chemotherapy of 60% and 39%, respectively. In addition, a third group of DLBCL patients was identified in which the expression profile was not consistent with a GCB- or ABC-type profile (Type 3 DLBCL). Importantly, two oncogenic genomic events could be assigned to the GCB-like subtype of DLBCL, namely the chromosomal translocation t(14;18) involving the BCL2 gene and amplifications of the c-rel locus on chromosome 2p. Conversely, ABC DLBCL have been shown to be characterized by constitutive activation of the NFκB pathway, which may, at least in part, be responsible for the more aggressive clinical behavior [55]. In addition, ABC DLBCL show frequent genomic gains or amplifications of chromosomal regions in 3q and 18q, while GCB DLBCL have more frequent gains in 12q [56] (Fig. 7.4). While GCB DLBCL patients had a more favorable clinical outcome as compared to ABC DLBCL patients, this distinction did not capture fully all molecular features in the tumors that were associated with the clinical course of the disease. To create a gene expression-based outcome predictor, the data set was therefore searched for individual genes that were associated with the clinical outcome with a certain statistical probability. This search yielded more than 670 cDNA clones that were subsequently assigned to previously defined gene expression signatures [40]. In the next step, various gene expression signatures were combined in multivariate statistical models, and four different signatures were found to be significantly associated with survival in DLBCL patients. Among these signatures, the proliferation-associated signature correlated with poor outcome, while three signatures were correlated with a favorable outcome.
These included the MHC class II and the “lymph node” signatures, suggesting that the presentation of antigens to the immune system and the composition of the bystander cell infiltrate in the lymphoma specimens may play an important role in the response to therapy. In particular, down-regulation of MHC class II may explain why some DLBCL tumors evade the immune response, and the expression of macrophage and NK-cell associated genes in the lymph node signature points to an important role for a direct antitumor response of these cells as a prognostic factor. The fourth prognostic gene expression signature
Fig. 7.4. Molecular subgroups of diffuse large B-cell lymphoma (DLBCL). GCB- and ABC-DLBCL differ in their presumed cell of origin, in underlying genetic alterations and in their clinical behavior (for details see text). [Figure not reproduced; its content is summarized in the two entries that follow.]
GCB-DLBCL (Germinal Center B-Cell-like DLBCL). Postulated cell of origin: germinal center. Genetic alterations: t(14;18)(q32;q21); gains/amplifications of 2p; gains/amplifications of 12q. Survival: favorable.
ABC-DLBCL (Activated B-Cell-like DLBCL). Postulated cell of origin: postgerminal center. Genetic alterations: constitutive activation of the NFκB pathway; gains/amplifications of 3q; gains/amplifications of 18q. Survival: poor.
reflected the previously defined phenotype of germinal center and activated B-cells. In a slightly different approach conducted by Shipp and colleagues [57], DLBCL tumors were grouped into prognostically favorable and poor subsets that were compared in their gene expression profiles. This study
identified a 13-gene predictor, and some of these genes, such as PKC-β, may prove to be promising targets for future therapy. Although this study was conducted using a different platform (Affymetrix platform as compared with the customized “lymphochip” in the studies previously mentioned), GCB and ABC subtypes could be identified on both platforms and, importantly, showed a different survival also in the data generated on the Affymetrix platform [58]. The role of the microenvironment in DLBCL tumor samples was highlighted in a very recent gene expression profiling study by Monti and colleagues [59]. In particular, a subgroup of DLBCL was defined that is characterized by high expression of T- and NK cell-associated genes as well as dendritic cell markers, interferon-induced genes, cytokines, and adhesion molecules. Two additional robust gene expression subgroups of DLBCL showed high expression of genes associated with the biochemical process of oxidative phosphorylation and B-cell receptor signaling and proliferation, respectively. Although these DLBCL subgroups did not differ statistically in their clinical behavior, some of the identified molecular features may help to define more rational treatment approaches of DLBCL tumors in the future.
Conclusions
Over recent years, we have seen significant progress in using gene expression profiling techniques to establish a molecular diagnosis of cancer subtypes. In lymphoid malignancies, the most frequent subgroups of B-cell lymphomas were quite extensively studied and robust gene expression signatures could be defined that are characteristic of these neoplasms. In that sense, first steps have been taken towards a molecular classification of lymphomas that may help to resolve the biological and clinical heterogeneity that is present in many currently defined lymphoma entities.
Importantly, new lymphoma subgroups were identified that differ in their underlying molecular features, such as the GCB- and ABC-like subgroups of DLBCL. Given the vast differences in gene expression and the usage of different oncogenic pathways (e.g., the activation of the NFκB pathway in ABC DLBCL), it is evident that new therapeutic approaches in DLBCL will have to be tested in the context of these newly defined subgroups. Gene expression profiling has also identified clinical markers that may become
widely used in routine clinical application. In B-CLL, for example, ZAP70 appears to be a good surrogate marker for the mutational status of the immunoglobulin heavy chain gene and diagnostic tests are currently being developed that will allow for an easy quantification of ZAP70 on the leukemic cells in a routine setting. Finally, gene expression profiles from lymphoma specimens at the time of diagnosis may also prove useful for predicting the clinical course of the disease and thus may guide future treatment decisions. It has to be noted, however, that all studies that were summarized in this chapter were performed in a retrospective manner, and patients had not been treated according to currently used regimens. Thus, the benefit of molecular diagnostic studies and, in particular, of gene expression profiling, will have to be tested in future multicenter clinical trials. To this end, some obstacles will have to be overcome (e.g., the fact that fresh frozen tumor tissue required for expression profiling is not routinely obtained in all patients), but, ultimately, these trials are likely to validate the concept that expression profiles will be useful for providing a molecular definition of lymphomas and for predicting a therapeutic response to a given therapy.
Acknowledgments
Andreas Rosenwald is supported by the Interdisciplinary Center for Clinical Research (IZKF) of the University of Würzburg, Germany.
REFERENCES
1. Harris, N. L., Jaffe, E. S., Diebold, J., Flandrin, G., Muller-Hermelink, H. K., and Vardiman, J. Lymphoma classification – from controversy to consensus: the R.E.A.L. and WHO Classification of lymphoid neoplasms. Ann. Oncol. 2000; 11, Suppl 1: 3–10.
2. Harris, N. L., Jaffe, E. S., Stein, H. et al. A revised European-American classification of lymphoid neoplasms: a proposal from the International Lymphoma Study Group. Blood 1994; 84: 1361–92.
3. Jaffe, E. S., Harris, N. L., Stein, H., and Vardiman, J. W. (eds.) World Health Organization Classification of Tumours. Pathology and Genetics of Tumours of Haematopoietic and Lymphoid Tissues. Lyon: IARC Press, 2001.
4. Campo, E., Raffeld, M., and Jaffe, E. S. Mantle-cell lymphoma. Semin. Hematol. 1999; 36: 115–27.
5. Dameshek, W. Chronic lymphocytic leukemia – an accumulative disease of immunologically incompetent lymphocytes. Blood 1967; 29: Suppl: 566–84.
6. Fais, F., Ghiotto, F., Hashimoto, S. et al. Chronic lymphocytic leukemia B cells express restricted sets of mutated and unmutated antigen receptors. J. Clin. Invest. 1998; 102: 1515–25.
7. Oscier, D. G., Thompsett, A., Zhu, D., and Stevenson, F. K. Differential rates of somatic hypermutation in V(H) genes among subsets of chronic lymphocytic leukemia defined by chromosomal abnormalities. Blood 1997; 89: 4153–60.
8. Damle, R. N., Wasil, T., Fais, F. et al. Ig V gene mutation status and CD38 expression as novel prognostic indicators in chronic lymphocytic leukemia. Blood 1999; 94: 1840–7.
9. Hamblin, T. J., Davis, Z., Gardiner, A., Oscier, D. G., and Stevenson, F. K. Unmutated Ig V(H) genes are associated with a more aggressive form of chronic lymphocytic leukemia. Blood 1999; 94: 1848–54.
10. Chiorazzi, N. and Ferrarini, M. B cell chronic lymphocytic leukemia: lessons learned from studies of the B cell antigen receptor. Annu. Rev. Immunol. 2003; 21: 841–94.
11. Klein, U., Tu, Y., Stolovitzky, G. A. et al. Gene expression profiling of B cell chronic lymphocytic leukemia reveals a homogeneous phenotype related to memory B cells. J. Exp. Med. 2001; 194: 1625–38.
12. Rosenwald, A., Alizadeh, A. A., Widhopf, G. et al. Relation of gene expression phenotype to immunoglobulin mutation genotype in B cell chronic lymphocytic leukemia. J. Exp. Med. 2001; 194: 1639–47.
13. Alizadeh, A., Eisen, M., Davis, R. E. et al. The lymphochip: a specialized cDNA microarray for the genomic-scale analysis of gene expression in normal and malignant lymphocytes. Cold Spring Harb. Symp. Quant. Biol. 1999; 64: 71–8.
14. Chan, A. C., Iwashima, M., Turck, C. W., and Weiss, A. ZAP-70: a 70 kd protein-tyrosine kinase that associates with the TCR zeta chain. Cell 1992; 71: 649–62.
15. Wiestner, A., Rosenwald, A., Barry, T. S. et al. ZAP-70 expression identifies a chronic lymphocytic leukemia subtype with unmutated immunoglobulin genes, inferior clinical outcome, and distinct gene expression profile. Blood 2003; 101: 4944–51.
16. Crespo, M., Bosch, F., Villamor, N. et al. ZAP-70 expression as a surrogate for immunoglobulin-variable-region mutations in chronic lymphocytic leukemia. N. Engl. J. Med. 2003; 348: 1764–75.
17. Orchard, J. A., Ibbotson, R. E., Davis, Z. et al. ZAP-70 expression and prognosis in chronic lymphocytic leukaemia. Lancet 2004; 363: 105–11.
18. Rassenti, L. Z., Huynh, L., Toy, T. L. et al. ZAP-70 compared with immunoglobulin heavy-chain gene mutation status as a predictor of disease progression in chronic lymphocytic leukemia. N. Engl. J. Med. 2004; 351: 893–901.
19. Haslinger, C., Schweifer, N., Stilgenbauer, S. et al. Microarray gene expression profiling of B-cell chronic lymphocytic leukemia subgroups defined by genomic aberrations and VH mutation status. J. Clin. Oncol. 2004; 22: 3937–49.
20. Pettitt, A. R., Sherrington, P. D., Stewart, G., Cawley, J. C., Taylor, A. M., and Stankovic, T. p53 dysfunction in B-cell chronic lymphocytic leukemia: inactivation of ATM as an alternative to TP53 mutation. Blood 2001; 98: 814–22.
21. Stankovic, T., Weber, P., Stewart, G. et al. Inactivation of ataxia telangiectasia mutated gene in B-cell chronic lymphocytic leukaemia. Lancet 1999; 353: 26–9.
22. Stankovic, T., Stewart, G. S., Fegan, C. et al. Ataxia telangiectasia mutated-deficient B-cell chronic lymphocytic leukemia occurs in pregerminal center cells and results in defective damage response and unrepaired chromosome damage. Blood 2002; 99: 300–9.
23. Stankovic, T., Hubank, M., Cronin, D. et al. Microarray analysis reveals that TP53- and ATM-mutant B-CLLs share a defect in activating proapoptotic responses after DNA damage but are distinguished by major differences in activating prosurvival responses. Blood 2004; 103: 291–300.
24. Vallat, L., Magdelenat, H., Merle-Beral, H. et al. The resistance of B-CLL cells to DNA damage-induced apoptosis defined by DNA microarrays. Blood 2003; 101: 4598–606.
25. Rosenwald, A., Chuang, E. Y., Davis, R. E. et al. Fludarabine treatment of patients with chronic lymphocytic leukemia induces a p53-dependent gene expression response. Blood 2004; 104: 1428–34.
26. Dohner, H., Fischer, K., Bentz, M. et al. p53 gene deletion predicts for poor survival and non-response to therapy with purine analogs in chronic B-cell leukemias. Blood 1995; 85: 1580–9.
27. Wattel, E., Preudhomme, C., Hecquet, B. et al. p53 mutations are associated with resistance to chemotherapy and short survival in hematologic malignancies. Blood 1994; 84: 3148–57.
28. Raffeld, M. and Jaffe, E. S. bcl-1, t(11;14), and mantle cell-derived lymphomas. Blood 1991; 78: 259–63.
29. Swerdlow, S. H. and Williams, M. E. From centrocytic to mantle cell lymphoma: a clinicopathologic and molecular review of 3 decades. Hum. Pathol. 2002; 33: 7–20.
30. Rosenberg, C. L., Wong, E., Petty, E. M. et al. PRAD1, a candidate BCL1 oncogene: mapping and expression in centrocytic lymphoma. Proc. Natl Acad. Sci. USA 1991; 88: 9638–42.
31. Argatoff, L. H., Connors, J. M., Klasa, R. J., Horsman, D. E., and Gascoyne, R. D. Mantle cell lymphoma: a clinicopathologic study of 80 cases. Blood 1997; 89: 2067–78.
Gene expression profiling in lymphoid malignancies
32. Bosch, F., Lopez-Guillermo, A., Campo, E. et al. Mantle cell lymphoma: presenting features, response to therapy, and prognostic factors. Cancer 1998; 82: 567–75.
33. Lardelli, P., Bookman, M. A., Sundeen, J., Longo, D. L., and Jaffe, E. S. Lymphocytic lymphoma of intermediate differentiation. Morphologic and immunophenotypic spectrum and clinical correlations. Am. J. Surg. Pathol. 1990; 14: 752–63.
34. Dreyling, M. H., Bullinger, L., Ott, G. et al. Alterations of the cyclin D1/p16-pRB pathway in mantle cell lymphoma. Cancer Res. 1997; 57: 4608–14.
35. Greiner, T. C., Moynihan, M. J., Chan, W. C. et al. p53 mutations in mantle cell lymphoma are associated with variant cytology and predict a poor prognosis. Blood 1996; 87: 4302–10.
36. Hernandez, L., Fest, T., Cazorla, M. et al. p53 gene mutations and protein overexpression are associated with aggressive variants of mantle cell lymphomas. Blood 1996; 87: 3351–9.
37. Pinyol, M., Cobo, F., Bea, S. et al. p16(INK4a) gene inactivation by deletions, mutations, and hypermethylation is associated with transformed and aggressive variants of non-Hodgkin’s lymphomas. Blood 1998; 91: 2977–84.
38. Raty, R., Franssila, K., Joensuu, H., Teerenhovi, L., and Elonen, E. Ki-67 expression level, histological subtype, and the International Prognostic Index as outcome predictors in mantle cell lymphoma. Eur. J. Haematol. 2002; 69: 11–20.
39. Rosenwald, A., Wright, G., Wiestner, A. et al. The proliferation gene expression signature is a quantitative integrator of oncogenic events that predicts survival in mantle cell lymphoma. Cancer Cell 2003; 3: 185–97.
40. Shaffer, A. L., Rosenwald, A., Hurt, E. M. et al. Signatures of the immune response. Immunity 2001; 15: 375–85.
41. Hofmann, W. K., de Vos, S., Tsukasaki, K. et al. Altered apoptosis pathways in mantle cell lymphoma detected by oligonucleotide microarray. Blood 2001; 98: 787–94.
42. Martinez, N., Camacho, F. I., Algara, P. et al. The molecular signature of mantle cell lymphoma reveals multiple signals favoring cell survival. Cancer Res. 2003; 63: 8226–32.
43. Gary-Gouy, H., Harriague, J., Bismuth, G., Platzer, C., Schmitt, C., and Dalloul, A. H. Human CD5 promotes B-cell survival through stimulation of autocrine IL-10 production. Blood 2002; 100: 4537–43.
44. De Vos, J., Thykjaer, T., Tarte, K. et al. Comparison of gene expression profiling between malignant and normal plasma cells with oligonucleotide arrays. Oncogene 2002; 21: 6848–57.
45. Katzenberger, T., Ott, G., Klein, T., Kalla, J., Muller-Hermelink, H. K., and Ott, M. M. Cytogenetic alterations affecting BCL6 are predominantly found in follicular lymphomas grade 3B with a diffuse large B-cell component. Am. J. Pathol. 2004; 165: 481–90.
46. Muller-Hermelink, H. K., Zettl, A., Pfeifer, W., and Ott, G. Pathology of lymphoma progression. Histopathology 2001; 38: 285–306.
47. Lossos, I. S., Alizadeh, A. A., Diehn, M. et al. Transformation of follicular lymphoma to diffuse large-cell lymphoma: alternative patterns with increased or decreased expression of c-myc and its regulated genes. Proc. Natl Acad. Sci. USA 2002; 99: 8886–91.
48. Elenitoba-Johnson, K. S., Jenson, S. D., Abbott, R. T. et al. Involvement of multiple signaling pathways in follicular lymphoma transformation: p38-mitogen-activated protein kinase as a target for therapy. Proc. Natl Acad. Sci. USA 2003; 100: 7259–64.
49. Dave, S. S., Wright, G., Tan, B. et al. Prediction of survival in follicular lymphoma based on molecular features of tumor-infiltrating immune cells. N. Engl. J. Med. 2004; 351: 2159–69.
50. Glas, A. M., Kersten, M. J., Delahaye, L. J. et al. Gene expression profiling in follicular lymphoma to assess clinical aggressiveness and to guide the choice of treatment. Blood 2005; 105: 301–7.
51. Coiffier, B. Diffuse large cell lymphoma. Curr. Opin. Oncol. 2001; 13: 325–34.
52. Alizadeh, A. A., Eisen, M. B., Davis, R. E. et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000; 403: 503–11.
53. Eisen, M. B., Spellman, P. T., Brown, P. O., and Botstein, D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA 1998; 95: 14863–8.
54. Rosenwald, A., Wright, G., Chan, W. C. et al. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N. Engl. J. Med. 2002; 346: 1937–47.
55. Davis, R. E., Brown, K. D., Siebenlist, U., and Staudt, L. M. Constitutive nuclear factor kappaB activity is required for survival of activated B cell-like diffuse large B cell lymphoma cells. J. Exp. Med. 2001; 194: 1861–74.
56. Bea, S., Zettl, A., Wright, G. et al. Diffuse large B-cell lymphoma subgroups have distinct genetic profiles that influence tumor biology and improve gene-expression-based survival prediction. Blood 2005; 106(9): 3183–90.
57. Shipp, M. A., Ross, K. N., Tamayo, P. et al. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat. Med. 2002; 8: 68–74.
58. Wright, G., Tan, B., Rosenwald, A., Hurt, E. H., Wiestner, A., and Staudt, L. M. A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma. Proc. Natl Acad. Sci. USA 2003; 100: 9991–6.
59. Monti, S., Savage, K. J., Kutok, J. L. et al. Molecular profiling of diffuse large B-cell lymphoma identifies robust subtypes including one characterized by host inflammatory response. Blood 2005; 105: 1851–61.
8
mRNA profiling of pancreatic beta-cells: investigating mechanisms of diabetes
Leentje Van Lommel,1 Yves Moreau,2 Daniel Pipeleers,3 Jean-Christophe Jonas,4 and Frans Schuit1
1 Gene Expression Unit, Department of Molecular Cell Biology, K. U. Leuven, Belgium
2 Department of Electrical Engineering, ESAT–SCD, K. U. Leuven, Belgium
3 Diabetes Research Center, Vrije Universiteit Brussel, Belgium
4 Unit of Endocrinology and Metabolism, Faculty of Medicine, Université Catholique de Louvain, Brussels, Belgium
Gene Expression Profiling by Microarrays: Clinical Implications, ed. Wolf-Karsten Hofmann. Published by Cambridge University Press 2006. © Cambridge University Press 2006.

The pancreatic beta-cell in health and disease

Diabetes is one of the most common health problems of our age, affecting more than 150 million patients worldwide today and perhaps twice as many within a few decades [1]. In the vast majority of patients, the exact cause of the disease state is unknown. Diabetes can be caused by autoimmune destruction of insulin-producing beta-cells in the pancreas (type 1 diabetes) or by a failure of the beta-cells to produce/release insulin in sufficient amounts to meet the metabolic demand (type 2 diabetes). The latter group represents about 90% of all diabetic patients and is today one of the most prevalent health problems. The high prevalence, chronic nature, and unknown cause of diabetes are strong arguments for studying the beta-cell in order to identify mechanisms by which beta-cells are destroyed or made dysfunctional. Identifying key aspects of beta-cell dysfunction in molecular detail requires that the molecules in the normal beta-cell, as well as their mode of action, are known. The rationale for building up such knowledge in a biomedical context is twofold. First, it is expected that such insight will assist in finding new drug treatments for type 2 patients, so that beta-cell dysfunction can be corrected more efficiently than is possible today. Second, this type of knowledge will define the endpoint of investigations that aim to generate new beta-cells from pancreatic or extrapancreatic precursor cells. Following this rationale, we have started to identify the mRNA expression profile of the normal beta-cell and have
embarked on studies exploring differences in diabetic conditions. This chapter will review some of these experiments, as well as microarray analyses performed by other groups in the field; in addition, it will discuss some methodological issues we have encountered in our work, as we presume these can be generalized to other areas of investigation.

Pancreatic beta-cells are unique in that they are the sole cell type that produces insulin with great efficacy. Moreover, these cells are highly effective in storing large quantities of the hormone (up to 30% of the total protein content of the cells) in secretory granules. From a physiological standpoint, it is of utmost importance that the content of such granules is released only when the organism needs insulin for nutrient homeostasis, i.e., during and just after a meal. The acute control of this process has two major players. On the one hand, the meal-induced rise in plasma glucose causes immediate beta-cell stimulation [2–4]. On the other hand, this effect is amplified further by intestinal gluco-incretins (in particular, hormones like GLP-1 (glucagon-like peptide 1) and GIP (glucose-dependent insulinotropic peptide)) [5, 6]. The secreted insulin then corrects the rise in glucose via insulin receptor-mediated uptake and metabolic disposal of glucose in peripheral tissues. Between meals and during fasting, a severe excess of insulin in the circulation would be lethal within minutes, as blood glucose would drop below the levels required for brain function. Such dangerous periods of hypoglycemia are prevented effectively by the suppression of insulin secretion to very low levels under basal conditions; furthermore, pancreatic alpha cells are stimulated under such conditions to secrete glucagon, which will cause glycogen breakdown in the liver. 
Beta-cells therefore have the difficult task of secreting precisely the level of insulin required both to prevent episodes of hypoglycemia between meals and to avoid periods of hyperglycemia during/after meals. In order to perform this task with efficacy and precision, the cells are equipped with glucose and incretin sensors that measure the level of the extracellular stimuli on a minute-by-minute and on a chronic basis (for a review of this topic, see [7]).

Microarray analysis of pancreatic islets and primary clonal beta-cells

During the last 5 years, more than a dozen microarray RNA expression studies have been performed with isolated islets, purified beta-cells or
Table 8.1. Microarray studies on pancreatic islets, primary beta-cells or clonal beta-cells

PMID a     Model               Array type                  Principal outcome of the analysis                    Ref.
10811900   MIN6 cell line      Affymetrix Mu6500           Glucose-regulated genes                              [8]
11334433   Rat beta-cells      Affymetrix U34A             Cytokine-regulated genes                             [19]
11687580   Rat beta-cells      Affymetrix U34A             NFκB-regulated genes                                 [20]
12086928   Rat beta-cells      Affymetrix U34A             Glucose-regulated cataplerosis                       [11]
12193586   Human islet cells   Affymetrix U95A             Glucose-regulated TGF-beta signaling                 [18]
12639920   MIN6 cell lines     Affymetrix U74A             Differential expression of release-related genes     [10]
12914774   Human islets        Atlas cytokine cDNA array   List of expressed cytokine-related genes             [23]
14534319   Rat islets          Metabolex Rat Islet array   Glucose and FFA-induced genes                        [15]
14557546   Murine cell lines   Affymetrix Mu11K            Differential expression of transcription factors     [24]
14578289   INS1E cell line     Affymetrix U34A             Time course of cytokine-regulated genes              [22]
14600816   Rat beta-cells      Affymetrix U34A             Interferon-, double-strand RNA-regulated genes       [21]
15094191   Murine cell lines   Incyte GEM I                Differential expression of Hox genes                 [25]
15126236   Rat islets          Affymetrix 230A             Effect of PPAR1 activation                           [26]
15151993   Rat islets          Affymetrix U34A             Effect of PDX1-inactivation                          [17]
15196698   Rat islets          Clontech cDNA array         Effect of prolactin (24 h culture)                   [28]
15321003   Human islets        Affymetrix U133A            List of expressed genes                              [29]

a PMID = PubMed identification number; the papers have been listed in chronological order of publication.
beta-cell lines, in order to study (i) the mRNAs that are called “present” in the system; (ii) the response of such transcripts to nutrients like glucose or fatty acids; (iii) the influence of cytokines on beta-cells (Table 8.1). The outcome of such work was interpreted in a one-, two- or three-state model in which (differential) expression was defined in terms of “absence/presence” of mRNAs and “increased/decreased mRNAs.” In some of the studies cited below, a correlation was made between the differentially expressed
genes and the functional changes of the cells, and in one study particular attention was paid to the potential presence of false positives amongst the selected transcripts that were said to be significantly up- or down-regulated.

The first microarray analysis in the research field of this chapter was performed on the murine clonal cell line MIN6, which is widely studied for its robust glucose responsiveness. The analysis compared the effect of 2.5 vs. 25 mM glucose during 24 h of tissue culture, followed by hybridization of fragmented cRNA onto Affymetrix U74A arrays. Of the 78 mRNAs that were reported to be glucose responsive, 21 were related to the secretory pathway, in particular subunits of the ER-translocon for secretory proteins like preproinsulin. This observation extended previous work from the same laboratory, further suggesting that glucose acutely stimulated insulin mRNA translation by lifting the signal peptide-mediated arrest of ER-translocation [9]. This study was continued by a comparative gene expression analysis in the B1 and C3 subclones of MIN6 which are, respectively, robust and poor insulin secretory responders to a variety of stimuli [10]. Hybridization to Affymetrix murine U74A arrays generated a list of about 400 differentially expressed transcripts (including RIKENs and ESTs) whose expression correlated with the secretory state of the cells.

Glucose-regulated mRNA expression was studied first in primary beta-cells by Flamez et al. [11], who chose a two-state model of rat beta-cells cultured for 24 h at 3 mM and 10 mM glucose. The cells cultured at 10 mM glucose were better insulin-secretory responders and had increased mRNA expression of genes related to citrate cataplerosis, a putative new signaling pathway [12, 13]. More than 200 glucose-regulated transcripts identified in this model, as measured by Affymetrix U34A arrays, were classified functionally [14]. Zhou et al. 
have used Affymetrix-based custom arrays to investigate glucose- and free fatty acid-induced gene expression in rat islets, which were maintained in tissue culture [15]. The custom array contains more than 22 000 probe sets of which about 12 000 are complementary to sequences from an islet cDNA library. The outcome of this study underlines the importance of members of the cAMP response element modulator (CREM) transcription factors (CREM-17X and ICER-1) as mediators of a so-called glucolipotoxicity response [16]. The array data were supported by measurements of glucose-induced insulin release, which was suppressed under conditions of up-regulated CREM-mRNA. A pronounced metabolic
effect was identified via microarray analysis in rat islets in which the beta-cell transcription factor PDX1 was inactivated via overexpression of a dominant-negative mutant [17]. Of the 2640 detected transcripts (present calls on the Affymetrix U34 arrays), 125 were differentially expressed, of which 24 were related to metabolism. Finally, human islets have also been exposed for 24 h to low (2 mM) and high (17 mM) glucose, followed by mRNA expression analysis using Affymetrix U95A arrays [18]. Of the 6000 mRNAs that were detected as being present, 20 were regulated consistently by glucose in all three human islet donor preparations. Purity of the islet cells was 85%. A cluster of the glucose-regulated mRNAs was noted to interact in the TGF-beta signaling pathway.

Purified rat beta-cells have also been used [19–21] to study the molecular mechanisms of cytokine-induced beta-cell toxicity, in order to understand autoimmune-mediated beta-cell destruction. Cardozo et al. [19] reported that, of the 3000 probe sets detected as present in these cells, about 8% (250) were up- or down-regulated after 6–24 h exposure to cytokines (IL1 plus IFN). A potential molecular mechanism in this process was subsequently investigated by the same group using experimental manipulation of the NFκB signaling pathway [20]. To investigate the mechanism of the interferon effect, the same group [21] later studied the effect of double-stranded RNA, which activates the EIF2-kinase PKR. In a time-course study on the rat cell line INS1E, Kutlu et al. [22] listed 700 mRNAs as being cytokine responsive, half of which were influenced by interfering with nitric oxide production. The time-dependent expression of these mRNAs was investigated via cluster analysis. A cytokine-related microarray analysis was also performed on human islet preparations [23] that were 66% ± 13% pure and that had a stimulation index of glucose-induced insulin release of 17% ± 8%. 
Custom-made cDNA membranes were used to detect cytokine-related genes. Potentially novel mediators of inflammation during islet transplantation were thus identified.

The differences in the molecular phenotype of glucagon-producing alpha-cells and insulin-producing beta-cells were explored [24] in models of differentiated glucagon-secreting and insulin-secreting murine cell lines (TC1.6 vs. MIN6). A long list of differentially expressed mRNAs (around 9% of all 11 000 expressed genes) was generated using Affymetrix Murine11K arrays. Functional classification of the encoded proteins resulted in the
observation that transcription factors and their regulators formed the most significantly different category. Following a similar approach, Mizusawa et al. [25] also investigated the molecular basis of alpha- and beta-cell differentiation in TC1.6 vs. MIN6 cells. RNA was hybridized onto Incyte GEM I cDNA arrays; the analysis focused on a number of differentially expressed Hox genes.

Parton et al. [26] have studied the effect of PPAR1 activation (pharmacological activation, overexpression of the gene or a combination of both) in cultured male rat islets. Expression of mRNA was analyzed using Affymetrix 230A arrays. As the protocol started from 100 ng total RNA, an amplification protocol was used. Statistical analysis included an estimate of the false discovery rate [27], i.e., the chance that mRNAs called significantly differentially expressed are, in fact, false positives. About 300 transcripts were observed to be (truly) differentially expressed, including a large cluster involved in fatty acid uptake and fatty acid oxidation. The observed changes in the RNA profile were further supported by biochemical alterations in fatty acid oxidation, and a parallel functional effect on insulin release was studied. In a paper by Bordin et al., the effect of prolactin (24 h culture) was studied on mRNA expression in isolated islets from female Wistar rats [28]. About 10% (54 out of 588) of the studied transcripts were altered. Prolactin lowered apoptosis-related mRNAs and enhanced the abundance of transcripts that encode proteins of the cell cycle, signal transduction, and translation. Finally, the study by Hui et al. on human islet cells [29] has generated a list of about 7000 probe sets corresponding to expressed transcripts from a total of 23 000 on the Affymetrix U133A array (about 30%, which is a rather low percentage). Surprisingly, GLP-1 receptor and PDX1-mRNA signals were called absent in this experiment. The purity of beta-cells in the cell preparations used was not specified.
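The idea of controlling the false discovery rate can be made concrete with a small sketch. The code below implements the Benjamini-Hochberg step-up procedure, which is one common way of doing this; it is not necessarily the method of [27] or the one used by Parton et al. The function name and the p-values are hypothetical and serve only as illustration.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], fdr: float = 0.05) -> List[bool]:
    """Flag each p-value as significant or not at the requested FDR.

    Benjamini-Hochberg step-up rule: sort the m p-values, find the largest
    rank k with p_(k) <= (k / m) * fdr, and call the k smallest p-values
    significant.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank  # keep the largest rank that passes
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values for ten transcripts (not data from any cited study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
calls = benjamini_hochberg(example_p, fdr=0.05)
print(sum(calls), "of", len(calls), "transcripts called at FDR 0.05")
```

In this toy set, five p-values fall below the conventional 0.05 threshold, but only the two smallest survive the correction, which illustrates why an uncorrected list of "significant" transcripts from thousands of probe sets should be read with caution.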
A handful of experimental challenges

Systematic false positives: the balanced choice of an experimental system
As in any experimental setup, an important aspect in the design of a microarray mRNA expression analysis is the choice of the preparation of cells used
to prepare total RNA. This issue has already been reviewed for studies on brain [30] and seems also to be of high relevance in the case of pancreatic beta-cells, which are located in microscopic islets of Langerhans that are dispersed throughout the pancreas and represent only 1% of the total pancreatic volume. Therefore, beta-cell transcripts from a total pancreatic RNA preparation are “diluted” with 99% of molecules originating from other cell types, mostly from acinar cells. Such molecules will compete with beta-cell RNA in the binding to commonly recognized probe sets, interfering with the interpretation of results. Moreover, the exocrine pancreas produces and secretes its own ribonuclease, making the RNA extracted from any preparation containing exocrine cells even more unstable than is common for other tissues. Consequently, microarray analysis of total pancreatic mRNA is expected to allow detection of just a few very abundant beta-cell transcripts such as insulin, GLUT2, and IAPP. Microarray experiments are therefore preferentially performed on mRNA extracted from highly enriched beta-cell preparations, such as isolated islets of Langerhans (in which beta-cells represent about 75% of the volume), or beta-cells that are further enriched from islet cells [31] by flow sorting (purity 90% or more). We have used such cells for the study of glucose regulation of mRNA abundance in previous studies [11, 14] and are currently extending this analysis in combination with the study of tissue-specific mRNA profiles (L. Van Lommel et al., manuscript in preparation). Figure 8.1(a) shows that such preparations yield highly reproducible RNA profiles that can be readily distinguished from the transcript profile observed in other tissues such as liver (Fig. 8.1(b)). As the preparation of islets and the further enrichment of beta-cells is time consuming and costly, there is widespread interest in using beta-cell lines as surrogates for primary cells. 
As such cell lines have become increasingly stable, exhibiting, for instance, robust glucose-responsiveness over many passages [32], we compared the expression profile of primary rat beta-cells to that of INS1 cells. As illustrated in the scatter plots, INS1 RNA again yields reproducible profiles (Fig. 8.1(c)), but these are profoundly different from those obtained from primary beta-cells (Fig. 8.1(d)). We suspect that part of this difference is an artifact resulting from the fact that primary beta-cells are not 100% pure (see below), while beta-cell lines are devoid of contamination. Furthermore, the primary beta-cell population is heterogeneous [33, 34], while cell lines are likely to be more homogeneous.
[Figure 8.1: four scatter plots on logarithmic axes. (a) beta-cells vs. beta-cells (replicate 1 vs. replicate 2); (b) beta-cells vs. liver; (c) INS1 cells vs. INS1 cells (replicate 1 vs. replicate 2); (d) beta-cells vs. INS1 cells.]
Fig. 8.1.
Scatter plots of mRNA signals in rat beta-cells, liver and the beta-cell line INS1. Each graph shows the correlation of 15 866 probe set signals generated by two independent hybridizations to Affymetrix 230A arrays, plotted on a logarithmic scale. Red dots and yellow dots represent probe sets that are called “present” and “absent,” respectively, on both arrays, while blue dots have a discordant call (present/absent or absent/present). The parallel blue lines represent functions with equations (from left to right): x = 0.03y; x = 0.1y; x = 0.3y; x = 0.5y; x = 2y; x = 3y; x = 10y and x = 30y. The reproducibility of the cell purification/culture, RNA extraction, preparation of labeled cRNA, hybridization, washing and scanning procedure is illustrated by correlating the results from two separate rat beta-cell preparations (a), and two independent cultures of the primary well-differentiated and glucose-responsive rat insulinoma cell line INS1 (c). Probe sets called twice present represent approximately 50% of all probe sets on the array; less than 1% of these produce signals with at least a two-fold difference among biological replicates (for an explanation in primary beta-cells, see Fig. 8.2), so that more than 99% of the probe sets called present lie between the two middle lines of the plot, i.e., close to the line x = y. Please note that there are virtually no blue dots, a further illustration of the excellent reproducibility among biological replicates.
Finally, adult beta-cells contain only a small fraction of replicating cells, while a large fraction of the population in any replicating cell line is participating in the cell cycle. For these reasons, it seems safe to interpret mRNA expression analysis of beta-cell lines with some caution and to confirm the most relevant results in primary cells. When working with islets and purified beta-cell preparations, one must face, however, the problem of contaminating cells. As the degree of contamination varies from one preparation to another, groups of transcripts of contaminating cells can be identified either in scatter plots (Fig. 8.2(a)), in calculated expression ratios (Table 8.2), or in particular clusters after cluster analysis (data not shown). As shown in Fig. 8.2(a), such a group is visible in scatter plots as a series of probe sets with similar fold changes in expression between samples. Table 8.2 lists 25 probe sets for acinar markers that cosegregate in a discrete area of the scatter plot and that display (i) high variation (mean CV = 73%, i.e., about three- to four-fold higher than a comparative set of true positives) in primary beta-cells; (ii) a clustered pattern of differential expression, i.e., all markers are highest in the same sample; and (iii) very low expression in INS1 cells. As Table 8.2 shows, the control of INS1 cells helps to sort out acinar contaminants from transcripts that are truly expressed in beta-cells. Tissue culture for at least a few days also lowers the expression of acinar markers [35], mostly because contaminating acinar cells die during this period. Another example of a false positive in purified beta-cells is illustrated in Fig. 8.2(b). All tissues contain blood vessels and blood cells. Therefore, the probe sets for alpha- and beta-globin mRNA (highly expressed in reticulocytes) give strong to very strong signals.

Fig. 8.1. caption (cont.)
The right plots are representative of nine pairwise comparisons that were performed between three primary rat beta-cell samples vs. three rat livers (b) and three primary rat beta-cell samples vs. three cultures of INS1 cells (d). As expected, the profound phenotypic difference between beta-cells and liver (b) is reflected in the scatter plot, as (1) about 20% of the red dots lie outside the two middle lines and (2) about 30% of the probe sets are present in liver and absent in beta-cells or vice versa, forming the blue wings of a butterfly-like pattern. Surprisingly, the molecular phenotype of the primary rat beta-cell is also profoundly different from that of the tumoral INS1 cells (d), as reflected by 3000 differentially expressed probe sets on the Affymetrix rat 230A array when comparing three preparations of rat beta-cells vs. INS1 cells.
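The summary figures quoted in the caption of Fig. 8.1 (fraction of probe sets called present on both arrays, fraction with discordant calls, fraction of twice-present probe sets differing at least two-fold) can be derived from a pair of arrays as in the minimal sketch below. It assumes that each probe set carries a strictly positive signal and a detection call of "P" or "A" on each array; the data structure, function name, and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

# (signal on array 1, call on array 1, signal on array 2, call on array 2)
ProbePair = Tuple[float, str, float, str]


def summarize_pairs(pairs: List[ProbePair], fold: float = 2.0) -> Dict[str, float]:
    """Summarize call concordance and fold differences between two arrays."""
    both_present = [p for p in pairs if p[1] == "P" and p[3] == "P"]
    both_absent = [p for p in pairs if p[1] == "A" and p[3] == "A"]
    discordant = len(pairs) - len(both_present) - len(both_absent)
    # Twice-present probe sets lying outside the lines x = fold*y and x = y/fold.
    outside = sum(
        1 for s1, _, s2, _ in both_present if max(s1, s2) / min(s1, s2) >= fold
    )
    n = len(pairs)
    return {
        "fraction_both_present": len(both_present) / n if n else 0.0,
        "fraction_discordant": discordant / n if n else 0.0,
        "fraction_present_outside_fold": (
            outside / len(both_present) if both_present else 0.0
        ),
    }


# Toy input: four hypothetical probe sets.
toy = [
    (100.0, "P", 110.0, "P"),  # concordant present, within two-fold
    (50.0, "P", 120.0, "P"),   # concordant present, more than two-fold apart
    (5.0, "A", 4.0, "A"),      # concordant absent
    (30.0, "P", 6.0, "A"),     # discordant call
]
summary = summarize_pairs(toy)
print(summary)
```

Applied to biological replicates, the caption reports roughly 50% of probe sets twice present and fewer than 1% of those outside the two-fold lines; applied to beta-cells vs. liver, both the discordant fraction and the outside-fold fraction rise sharply.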
[Figure 8.2: (a) scatter plot of signal intensity in beta-cell preparation 3 vs. beta-cell preparation 2, with the Insulin 1 signal and the cluster originating from acinar cell contamination marked; (b) bar chart of alpha globin mRNA signal intensity (0 to 6000) with present (P) or absent (A) calls for INS1 cells, beta-cells, non-beta-cells, liver, muscle, fat, brain, pitui, lung, kidn.]
Fig. 8.2.
False positive signals in rat beta-cell mRNA can be caused by contaminating cells. Panel (a) shows the most abundant mRNA signals in a representative comparison of two freshly prepared rat beta-cell preparations. The most intense signal on the rat 230A array invariably is that corresponding to insulin mRNA (probe set 1387815). Within the dotted area are 20 probe sets that invariably yield higher signals in beta-cell sample 3 than in sample 2; all of these encode contaminating acinar cell proteins. Please note that the fold-change between samples 3 and 2 tends to rise from below two-fold to between two- and three-fold when the signal intensity declines. Electron microscopic analysis of the flow-sorted cells indicates that the acinar cell contamination varies between less than 1 and 5% of the freshly prepared cells [35]. Panel (b) illustrates that even a very low degree of contamination with blood cells in FACS-purified beta-cells results in “present” calls for alpha globin mRNA, which originates from circulating reticulocytes. The cell line INS1 is a negative control.
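Why a few percent of contaminating cells can produce an apparent several-fold "regulation" is easy to see in a simple linear mixing model. The sketch below is only an illustration under stated assumptions: the measured signal is taken to be a weighted average of the signals of the pure cell types, hybridization is assumed to be linear (no saturation), and all signal values are hypothetical.

```python
def mixed_signal(
    contaminant_fraction: float,
    contaminant_signal: float,
    target_signal: float,
) -> float:
    """Signal of a preparation modeled as a linear mixture of two cell types."""
    return (
        contaminant_fraction * contaminant_signal
        + (1.0 - contaminant_fraction) * target_signal
    )


# Hypothetical acinar marker: very abundant in acinar cells, absent in beta-cells.
marker_low = mixed_signal(0.01, contaminant_signal=100_000.0, target_signal=0.0)
marker_high = mixed_signal(0.05, contaminant_signal=100_000.0, target_signal=0.0)

# Hypothetical beta-cell transcript: absent in acinar cells.
beta_low = mixed_signal(0.01, contaminant_signal=0.0, target_signal=2_000.0)
beta_high = mixed_signal(0.05, contaminant_signal=0.0, target_signal=2_000.0)

print(f"acinar marker fold change: {marker_high / marker_low:.2f}")
print(f"beta-cell transcript fold change: {beta_high / beta_low:.2f}")
```

Under these assumptions, a shift in contamination from 1% to 5% changes the acinar marker signal five-fold while a genuine beta-cell transcript changes by only about 4%. The fold changes actually observed in Fig. 8.2(a) are smaller than this idealized value, so the model should be read as an upper-bound illustration rather than a quantitative prediction.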
Table 8.2. Variation of exocrine pancreatic cell mRNA markers in purified beta-cell preparations (Affymetrix rat 230A microarrays)

Columns Exp1 to e3/e2: signals in primary rat beta-cells. INS1 Mean and INS1 SD: signals in INS1 cells. SLR and t-test: beta-cells vs. INS1.

Encoded protein | Probe set | Exp1 | Exp2 | Exp3 | Mean | SD | CV | e3/e2 | INS1 Mean | INS1 SD | SLR | t-test
Amylase 1 | 1369502 | 5511 | 3406 | 4265 | 4394 | 1059 | 24% | 1.3 | 1 | 1 | 12.0 | 0.002
Amylase 1 | 1369503 | 4553 | 2319 | 3594 | 3489 | 1121 | 32% | 1.5 | 1 | 1 | 11.5 | 0.01
Pancreatic trypsin 1 | 1369030 | 4805 | 2120 | 3501 | 3475 | 1342 | 39% | 1.7 | 17 | 3 | 7.6 | 0.01
Elastase 2 | 1387471 | 5538 | 1593 | 3039 | 3390 | 1996 | 59% | 1.9 | 17 | 8 | 7.3 | 0.04
Cationic trypsinogen | 1388186 | 4309 | 1816 | 2552 | 2892 | 1281 | 44% | 1.4 | 17 | 10 | 7.5 | 0.02
Carboxypeptidase B1 | 1370084 | 4866 | 1111 | 2670 | 2882 | 1886 | 65% | 2.4 | 24 | 7 | 6.8 | 0.06
Chymotrypsinogen B | 1369951 | 4559 | 1231 | 2718 | 2836 | 1667 | 59% | 2.2 | 38 | 9 | 5.7 | 0.04
Carboxyl ester lipase | 1368396 | 4886 | 916 | 2292 | 2698 | 2016 | 75% | 2.5 | 4 | 4 | 8.8 | 0.08
Elastase 2 | 1384778 | 4447 | 1158 | 2463 | 2689 | 1656 | 62% | 2.1 | 23 | 6 | 6.5 | 0.05
Pancreatic lipase | 1368554 | 4370 | 957 | 2079 | 2469 | 1739 | 70% | 2.2 | 25 | 4 | 6.2 | 0.07
Carboxypeptidase A1 | 1369657 | 4163 | 887 | 2337 | 2462 | 1642 | 67% | 2.6 | 6 | 2 | 8.0 | 0.06
Ribonuclease, RNase A family, 1 | 1393026 | 4208 | 1003 | 1350 | 2187 | 1759 | 80% | 1.3 | 3 | 1 | 9.1 | 0.10
Syncollin | 1370837 | 3853 | 763 | 1724 | 2113 | 1582 | 75% | 2.3 | 42 | 10 | 5.3 | 0.09
Elastase 1 | 1387819 | 3362 | 612 | 1555 | 1843 | 1397 | 76% | 2.5 | 3 | 1 | 8.7 | 0.08
Chymotrypsin-like | 1370107 | 3602 | 782 | 1069 | 1818 | 1552 | 85% | 1.4 | 41 | 22 | 5.4 | 0.12
Colipase | 1368196 | 3047 | 509 | 1265 | 1607 | 1303 | 81% | 2.5 | 10 | 4 | 6.9 | 0.10
Pancreatic lipase related protein 1 | 1368532 | 2751 | 409 | 941 | 1367 | 1228 | 90% | 2.3 | 4 | 3 | 7.9 | 0.13
Zymogen granule protein ZG-16p | 1368586 | 2270 | 388 | 841 | 1166 | 983 | 84% | 2.2 | 31 | 6 | 4.9 | 0.12
Zymogen granule protein GP2 | 1386933 | 2228 | 240 | 635 | 1034 | 1052 | 102% | 2.6 | 24 | 4 | 4.7 | 0.17
Pancreatic lipase-related protein 2 | 1387516 | 1763 | 266 | 756 | 928 | 763 | 82% | 2.8 | 15 | 10 | 5.7 | 0.11
Pancreatic trypsin 2 | 1370126 | 980 | 50 | 202 | 411 | 499 | 122% | 4.0 | 0 | 0 | 8.2 | 0.23
Serine protease inhibitor type 1 | 1387193 | 588 | 51 | 236 | 291 | 273 | 94% | 4.7 | 1 | 0 | 6.6 | 0.14
Pancreatitis-associated protein | 1368238 | 387 | 39 | 236 | 221 | 174 | 79% | 6.0 | 8 | 2 | 4.1 | 0.10
Serine protease inhibitor type 1 | 1368447 | 323 | 52 | 121 | 165 | 141 | 85% | 2.3 | 1 | 0 | 6.1 | 0.11
Pancreatic trypsin inhibitor type II | 1387967 | 272 | 29 | 83 | 128 | 127 | 99% | 2.8 | 1 | 0 | 6.4 | 0.16

Data were analyzed using Affymetrix GCOS software. Signal intensities of 25 acinar cell markers were calculated in arrays from primary beta-cells and INS1 cells (N = 3 each) using the global scaling method with a target intensity value of 150. The signal ratios for samples 1/2 and 3/2 are greater than 1 for each of these markers (clustered behavior of the probe sets). Note that these signal ratios rise as the mean signal declines: this may be caused by saturation of the probe cells at high signal intensities (see also Fig. 8.5). Also note that the high coefficient of variation (CV) among beta-cell samples means that few of the signals are significantly different (unpaired Student’s t-test) from those in INS1 cells; such a gene cluster does not follow the Bayesian prior on variance [55]. SLR = signal log ratio; e3/e2 = signal ratio of sample 3 to sample 2.
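The derived columns of Table 8.2 can be recomputed from the three replicate signals. The following is a minimal sketch, assuming the sample standard deviation (n − 1) was used; the function name `summarize` and the dictionary keys are hypothetical. The last quantity is a crude log2 ratio of mean signals, with the INS1 mean floored at 1 by assumption, and it only approximates the SLR column, which GCOS derives from probe-level data.

```python
import math
import statistics


def summarize(beta_signals, ins1_mean):
    """Recompute Table 8.2-style summary values for one probe set.

    beta_signals: signals of the three beta-cell experiments (Exp1, Exp2, Exp3).
    ins1_mean:    mean signal of the same probe set in INS1 cells.
    """
    mean = statistics.mean(beta_signals)
    sd = statistics.stdev(beta_signals)  # sample SD, n - 1 in the denominator
    return {
        "mean": mean,
        "sd": sd,
        "cv": sd / mean,                              # coefficient of variation
        "e3_e2": beta_signals[2] / beta_signals[1],   # sample 3 over sample 2
        # Crude log2 ratio of means; the floor of 1 avoids log(0) and is an assumption.
        "log2_ratio": math.log2(mean / max(ins1_mean, 1.0)),
    }


# Two rows taken from Table 8.2: (Exp1, Exp2, Exp3) and the INS1 mean.
rows = {
    "Amylase 1 (1369502)": ([5511, 3406, 4265], 1),
    "Pancreatic trypsin 2 (1370126)": ([980, 50, 202], 0),
}

for name, (signals, ins1) in rows.items():
    s = summarize(signals, ins1)
    print(f"{name}: mean={s['mean']:.0f} SD={s['sd']:.0f} "
          f"CV={s['cv']:.0%} e3/e2={s['e3_e2']:.1f} log2 ratio={s['log2_ratio']:.1f}")
```

For the first amylase probe set this gives a mean of 4394, a CV of 24% and an e3/e2 ratio of 1.3, in line with the table. The trend described in the footnote, a higher CV and e3/e2 ratio at lower mean signal, can be checked the same way for the remaining rows.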
L. Van Lommel, Y. Moreau, D. Pipeleers, J.-C. Jonas, and F. Schuit
An appropriate negative control is the clonal beta-cell line INS1, which contains no such contamination (Fig. 8.2(b)). In FACS-purified islet beta- and non-beta-cells the alpha-globin mRNA marker signal is low, but it is called present and is six- to eight-fold higher than in INS1 cells (P < 0.02). The signal is 30 to 40 times lower than in whole rat tissues and is presumably caused by low-grade (<1%) contamination with red blood cells. It is conceivable that other systematic false positives in preparations of beta-cells are related to contaminating cells. These may be endothelial, stellate, or dendritic cells, or other hormone-producing cells from the islet. Comparison of the microarray profiles of FACS-purified beta- and non-beta-cells is a potential, if partial, solution to this problem. In summary, a major concern in the interpretation of mRNA expression profiling of normal or pathological tissues is the possibility that variation in the mRNA signal is caused not only by changes in the intensity of gene expression in the cell type under investigation, but also by variation in the cell composition of the tissue.

Systematic false negatives: a metaphysical challenge
More difficult than the identification of false positives is the discrimination between true and false negatives, as the claim that a phenomenon is absent is, by definition, a tricky one. The weakness of any statement that a study object is not there is that the failure to observe something could indeed reflect genuine absence (true negative), but could equally reflect the limitations of the measuring technique used (false negative). Microarray analysis does not escape this problem of measurement and interpretation [36]. The most obvious reason for a microarray experiment generating negative signals for a particular mRNA species is that there are simply too few transcript copies per cell to generate hybridization signals above the noise level. This chapter will not address this issue, as it relates to protocols that increase sensitivity by pre-amplifying the signal in small samples, and we have no experience with such protocols. False negatives can, however, also be generated when the average signals are of intermediate (Fig. 8.3) or even very high (Fig. 8.4) intensity, and we chose to show an example of each, as the encoded proteins are important for beta-cell function. Figure 8.3 summarizes hybridization signals of probe sets directed against the glucagon receptor, which plays
mRNA profiling of pancreatic beta-cells: diabetes
Fig. 8.3. False vs. true negative calls for G-protein-coupled receptor mRNA signals in beta-cells. Glucagon receptor mRNA from INS1 cells, rat purified beta-cells and rat liver (a) hybridizes with average intensity (around 140 units with the global scaling 150 method) to the Affymetrix rat 230A array, and yet is called “absent.” Without further biological information, like Northern blots or functional studies [39–41], it is impossible to discriminate this false negative call from a true negative like the SSTR2 (b). Control true positive calls of G-protein-coupled receptor mRNA in the beta-cells are the GIP receptor (c) and the alpha-2A adrenergic receptor (d).
[Figure not reproduced. Panels: (a) glucagon receptor mRNA; (b) somatostatin type 2 receptor mRNA; (c) GIP receptor mRNA; (d) alpha-2a adrenergic receptor mRNA. Vertical axis: mRNA signal intensity. Horizontal axis: INS1 cells, beta-cells, non-beta-cells, liver, muscle, fat, brain, pituitary, lung, kidney. Bars are labeled with the detection calls A or P.]
a crucial role in glycogen degradation in the liver [37] and which is well known to be expressed (PCR, Northern blots, Western blots) and functional (binding studies, cAMP production) in rat beta-cells [38–41]. However, the sole probe set on the Affymetrix rat 230A array gives “absent” calls for all tested tissue samples, including liver. With arrays only from beta-cells and no further knowledge about the behavior of the gene, one would not analyze the data any further. However, inspecting the reproducible signals in
Fig. 8.4. False negatives in the detection of differential islet hormone mRNA expression. (a) Signals for glucagon, somatostatin and pancreatic polypeptide (PP) mRNA, the most abundant transcripts of islet endocrine non-beta-cells (top); the high signal in beta-cells is much above the expected 10% (non-beta-cell contamination), so that some of the comparisons are called “no change.” Even less difference between purified beta-cells and non-beta-cells is observed for insulin 1 and insulin 2 (bottom), and yet these mRNAs are known to be specifically expressed in beta-cells. (b) The false negative “no change” for differentially expressed and abundant transcripts like insulin 1 and insulin 2 is explained by saturation binding of the RNA to the 22 probe cells of the probe sets, which are displayed as pairs of perfect match (green) and mismatch (gray). As is illustrated for rat insulin 1, the 11 probe cells of the mismatch probes are almost saturated, generating signals that are 80%–90% of the perfect match cells. Hybridization of the same RNA to beta actin probe pairs (c) gives acceptable ratios of MM/PM probe pairs (between 20% and 30%). Data in (b) and (c) are means ± SD for three experiments.
[Figure not reproduced. Panel (a): mRNA signal intensity for glucagon, somatostatin and PP mRNA (top) and for insulin 1 and insulin 2 mRNA (bottom) in INS1 cells, beta-cells, non-beta-cells, liver, muscle, fat, brain, pituitary, lung and kidney. Panels (b) and (c): signal per probe cell (×1000) for the probe pairs of the insulin 1 probe set (#1387815) and of the beta actin probe set (#1398835).]
INS1 cells, beta-cells and liver (Fig. 8.3(a)) gives a different picture of this transcript. In fact, the signal in these cell types is one to two orders of magnitude higher than in all other tested tissues. For all these reasons, one should consider classifying the “absent” call as a false negative. Further analysis of the 11 probe pairs for this transcript has shown that at least 4 are non-informative (data not shown), so that the algorithm used in the GCOS software fails to make an accurate call, i.e., the interpretation of the
hybridization intensity is not correct. In comparison, a true negative for beta-cells is the somatostatin type 2 receptor (Fig. 8.3(b)), which is highly expressed in islet alpha-cells and responsible for the great sensitivity of these cells to somatostatin 14 [42]. True positive G-protein-coupled receptors for beta-cells that are consistent with previous functional experiments [41, 43] were also found in the same experiment, such as GLP-1 receptors (not shown), GIP receptors and alpha-2a adrenergic receptors (Fig. 8.3(c) and (d)).

At the level of differential expression between two samples, an intriguing false negative case arose in the analysis of transcripts encoding islet hormones, especially preproinsulin mRNA, in purified islet cells. Using different techniques such as immunocytochemistry, radioimmunoassay of hormone content, and electron microscopy, it was shown previously that the contamination of the beta-cell preparation with non-beta-cells is 10% or less, while beta-cells represent 10% or less of the non-beta-cell preparations [31]. This would imply that, for the non-beta-cell hormones glucagon, somatostatin, and pancreatic polypeptide, a beta-cell/non-beta-cell signal ratio of about 0.1 would be expected, while the observed ratios were around 0.5 (Fig. 8.4(a)). Furthermore, the expected beta/non-beta ratio for insulin 1 and insulin 2 mRNA should be around 10, while in fact the ratios approximated unity (Fig. 8.4(a)). We confirmed via real-time RT-PCR that the expected high expression ratio of insulin mRNA is present in these preparations. So how can such false negatives in differential expression exist for a case in which there is plenty of signal to detect? We reasoned that the “plenty” might be the problem in this case. As insulin mRNA is the most abundant transcript in the beta-cell (see Fig. 8.2(a)), one could imagine that the hybridization conditions fall outside the linear range, so that the measured signal is not quantitative. Real-time PCR measurements confirm this idea, as does closer inspection of the perfect match (PM) and mismatch (MM) probe pairs for the insulin 1 and insulin 2 probe sets on rat (Fig. 8.4(b)) and mouse arrays (not shown). We thus found that the binding of the cRNA target to the MM cells approaches that of the PM cells, indicating that the latter must be saturated and explaining why the calculated signal is not quantitative. This would explain why an approximately ten-fold dilution of insulin mRNA from beta-cells (as occurs in the sorted non-beta-cell preparations) still yields the same signal. As a negative control for this problem the percentage MM/PM ratios are
shown for beta actin mRNA, which is abundant but apparently does not saturate the array (Fig. 8.4(c)). How frequent this type of problem is in the use of microarrays in general is not known, but it seems conceivable that other examples like that of insulin exist. In fact, a recent analysis of the performance of oligonucleotide and cDNA arrays [44] suggests that the former have a more restricted linear range of signal detection. In summary, the cases of insulin and the glucagon receptor, two important gene products for the beta-cell, emphasize that the occurrence of systematic false negatives in microarray experiments cannot be ignored.

Finding the molecular Nemo in an ocean of data

Performing a series of latest-generation microarray experiments is like immersing the team of investigators in an ocean of data. When no care is taken to extract the most relevant information, the generated results can be like a tsunami, overwhelming the investigator and defying any focused hypothesis for further research. Perhaps, somewhere drifting in this ocean waits a true treasure, hidden amongst large numbers of false positives and countless differentially expressed genes of mediocre or even questionable relevance. A number of approaches have been proposed to address this crucial point, such as noise reduction [45, 46], normalization of data [47], appropriate statistical testing [48–51], screening for false positives [27, 52], and extracting the most relevant information [53]. This chapter will discuss two additional strategies that we have been developing in the context of this problem and that produce data sets with an increased ratio of true over false positives: (i) the practice of linking two or more different microarray data sets, joining different viewpoints on the same biological object, and (ii) exploiting the information hidden in the occurrence of redundant probe sets for the same transcript on Affymetrix mRNA expression arrays.
Avoiding random false positives by linking independent data sets
Microarray mRNA expression analysis is a powerful screening technique that tests a sample of 1 µg total RNA with many thousands of parallel hybridizations. Even if each of these hybridizations is performed with precision and high reproducibility [48], the measures required to protect against false positives (random fluctuations exceeding the permitted
threshold of variation) are draconian. For instance, on an array configuration with 45 000 probe sets and 20 000 expressed transcripts, the preset P-value used in differential expression analysis has to be P < 10⁻⁴ in order to have two or fewer false positives. Since microarray experiments are expensive, investigators avoid high numbers of repetitions, often settling for three replicates of well-defined cell preparations [48]. This means that, by accepting P < 10⁻⁴ in order to prevent false positives, one will lose most true positives as well, which is undesirable.

We have learned from a number of related microarray experiments on the same biological object that linking the different data sets is a powerful way of guiding the investigator towards appropriate gene prioritization. Suppose beta-cell mRNA expression is measured in conditions A vs. B and we want to know which of the 20 000 transcripts called present are truly changed. When the frequency of true positives is 1% and the acceptance threshold P-value of each compared transcript is 0.01, the differential expression analysis A vs. B would generate a data set of 200 true and 200 false positives (Fig. 8.5(a) left). In terms of follow-up, this is a large project, as the obvious way to find out which are the true and which are the false positives is to validate the transcripts one by one with other techniques like real-time RT-PCR. Such an analysis is an enormous task that would fill the time allotted to several Ph.D. students. But suppose that we test beta-cell mRNA expression a second time, now in conditions C vs. D. With a frequency of true positives of 1.5% and a threshold P-value of 0.01, we would now generate a second data set with 300 true and 200 false positives (Fig. 8.5(a) right). When we accept as null hypothesis that the two comparisons bear no other than a random relationship in terms of behavior of RNA molecules, then the expected number of true and false positives exhibiting differential expression in both A vs. B and C vs. D would be between 4 and 5.

We have recently finalized experiments in which a first analysis (that of glucose-regulated mRNAs) was linked to a second study (beta-cell mRNA profile vs. profile in other tissues). The interesting result (L. Van Lommel et al., unpublished data) is that the linked data set gives numbers of positives that are between one and two orders of magnitude higher than the expected number, so that the null hypothesis can be rejected firmly. In practice, we have reasons to assume that such a linked data set is likely to be much depleted of false positives; furthermore the reduced size of such
Fig. 8.5. Linking independent databases increases the likelihood of extracting relevant data. (a) The predicted outcome of differential analysis A vs. B (20 000 expressed genes) with P < 0.01 as cutoff and 1% true positives is a data set with 200 true and 200 false positives. A second data set is generated from the same cells, now comparing condition C vs. D. The overlap between data sets 1 and 2 may be a way to enrich true positives, allowing further hypothesis-driven research. (b) Ongoing research effort in a network of three Belgian universities (K. U. Leuven, Vrije Universiteit Brussel, Université Catholique de Louvain) has started to connect the study of glucose regulation in beta-cells (two experimental in vitro models, left) to the analysis of beta-cell-specific gene expression (ex vivo, right). In addition, the data set can be linked to results generated with INS1 cells (broken arrow) to check the issue of contamination in primary cells (see also Fig. 8.2 and Table 8.2).
[Figure not reproduced. Panel (a): data set 1, 400 positives (200 T + 200 F); data set 2, 500 positives (300 T + 200 F); linked data set, 50 positives (48 T + 2 F). Panel (b) labels: non-diabetic rats; pancreas; islets; beta-cells; non-beta-cells; INS1 cells; preculture; culture at low glucose or high glucose; ex vivo tissues (liver, muscle, adipose tissue, brain, pituitary, lung, kidney).]
sets will allow more systematic in-depth exploration of the function of the selected mRNAs. We believe that the approach outlined in this section can be generalized to other situations, e.g., the disease state vs. normal linked to the response of the disease state to a certain therapy [54].
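The arithmetic behind the linking argument in this section can be sketched as follows. This is a minimal illustration under the simplifications used in the text (every true change is detected, and the P-value threshold is applied to all expressed transcripts); the function names are hypothetical. Note that the chance overlap depends on the size of the universe assumed: with 20 000 expressed transcripts the formula gives 10, while with the 45 000 probe sets of the array it gives about 4.4, which lies in the range of 4 to 5 quoted in the text.

```python
def expected_positives(n_tests, true_fraction, alpha):
    """Expected (true, false) positives for one differential expression analysis.

    Assumes every real change is detected and, as in the text, applies the
    threshold alpha to all n_tests transcripts when counting false positives.
    """
    return n_tests * true_fraction, n_tests * alpha


def expected_chance_overlap(size_1, size_2, universe):
    """Expected number of shared hits if two lists are independent random draws."""
    return size_1 * size_2 / universe


def max_alpha_for_false_positives(max_false_positives, n_tests):
    """Largest per-test threshold that keeps the expected false positives at the cap."""
    return max_false_positives / n_tests


# Data set 1 (A vs. B) and data set 2 (C vs. D), as in Fig. 8.5(a).
tp1, fp1 = expected_positives(20_000, 0.01, 0.01)
tp2, fp2 = expected_positives(20_000, 0.015, 0.01)
print("data set 1:", round(tp1), "true +", round(fp1), "false")
print("data set 2:", round(tp2), "true +", round(fp2), "false")

# Chance overlap of the two complete lists, for two choices of universe.
print("overlap, 20 000 transcripts:", expected_chance_overlap(400, 500, 20_000))
print("overlap, 45 000 probe sets:", round(expected_chance_overlap(400, 500, 45_000), 1))

# Chance overlap of the false positives only (compare the 2 F in Fig. 8.5(a)).
print("false-false overlap:", expected_chance_overlap(200, 200, 20_000))

# Threshold needed to expect at most two false positives among 20 000 tests.
print("alpha for <= 2 false positives:", max_alpha_for_false_positives(2, 20_000))
```

Whichever universe is chosen, an observed overlap one to two orders of magnitude above these chance values is what argues against independence of the two data sets.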
Exploiting the probe redundancy on Affymetrix expression arrays
We, like any user of the Affymetrix expression arrays, have noticed that quite a large number of transcripts with the same name and gene annotation are represented by two or more probe sets on the same array. For the mouse 430 2.0 array (45 037 probe sets), we have counted the distribution of singlets, doublets, triplets, etc. (Fig. 8.6(a)), and calculated that about 50% of the probe sets are singlets, 20% doublets, 15% triplets, and another 15% anything repeated four times or more. It seems straightforward to postulate the null hypothesis that being a member of a singlet, doublet or triplet probe set has no a priori influence on the chance that the transcript will be selected in a subset of differentially expressed genes. So, if on the whole array 50% of the probe sets are singlets, then the expected number of singlets in a differentially expressed data set of 100 positives (true and false) is somewhere close to 50. Of the remaining 50 differentially expressed probe sets, 20 are expected to be members of doublets. The presence of several non-singlet probe sets in a small list of selected “differentially expressed” mRNAs can then be used to do a first search for relevant biological information. Suppose the null hypothesis is that all probe sets in the selected set are the results of random variation (i.e., there is no real differential expression). Since there are about 4600 doublets on the array, the random chance of having one complete pair (both members of the doublet in the set of 100) is 19/4600 = 0.004; the chance to have two complete pairs is 19/4600 × 17/4600 = 0.00001; the chance to have three, four, and five complete pairs is 5 × 10⁻⁸, 1 × 10⁻¹⁰ and 3 × 10⁻¹³, respectively. For triplets, an analogous calculation can be made. In this way, for the complete selected data set, an overall P-value can be calculated in order to accept or reject the null hypothesis that complete pairs are present because of chance.
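The chance probabilities quoted above follow from a simple product, which the sketch below reproduces. It is an illustration only: the function name and parameters are hypothetical, and reading the numerators 19, 17, 15, ... as the doublet members still unpaired among the roughly 20 expected in a list of 100 is our interpretation of the calculation rather than something the text states explicitly.

```python
def chance_of_complete_pairs(n_pairs, doublet_members_in_list=20, n_doublets=4600):
    """Chance of finding n_pairs complete doublets in a list by random selection.

    Follows the product used in the text: 19/4600 for the first pair,
    times 17/4600 for the second, times 15/4600 for the third, and so on.
    """
    if not 1 <= n_pairs <= doublet_members_in_list // 2:
        raise ValueError("n_pairs must be between 1 and half the doublet members")
    probability = 1.0
    remaining = doublet_members_in_list - 1  # 19 candidates left after the first pick
    for _ in range(n_pairs):
        probability *= remaining / n_doublets
        remaining -= 2  # each completed pair uses up two doublet members
    return probability


for k in range(1, 6):
    print(f"{k} complete pair(s): {chance_of_complete_pairs(k):.1e}")
```

With the default values this yields approximately 0.004, 0.00002, 5 × 10⁻⁸, 1 × 10⁻¹⁰ and 3 × 10⁻¹³ for one to five complete pairs. An analogous function for triplets would need the corresponding number of triplet groups on the array, which is not given here.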
The observation that probe pairs are dissociated (one within the list of differentially expressed and the other(s) excluded from the list) can have multiple reasons: false positives (random chance of the selected probe set to detect a difference between the two tested conditions), false negatives (failure of the excluded probe set to measure what the selected probe set does), true negatives (e.g., different parts of the mRNA being interrogated in case of splice variants, or errors in the annotation of the probe set). In several projects that we have recently finished (L. Van Lommel et al., unpublished data; [54]), a high density of doublets and triplets is recognized
Fig. 8.6. Redundancy of probe sets on the Affymetrix mouse 430 2.0 array. The 45 037 probe sets on the Affymetrix mouse 430 2.0 array were sorted on the basis of their gene annotation (status February 2005; https://www.affymetrix.com/site/login/login.affx) and genes with single, double, triple, etc. probe set representation were counted (panel (a)). On the basis of this information, it was calculated (b) that 50% of the probe sets are single representations per gene, while the other 50% are members of redundant pairs, triplets, quadruplets, etc. As there are about 4600 doublets (approximately 20% of all probes), the random probability of having both members of a pair in a sample of 10 differentially expressed probes is approximately 1/4600 = 0.0002. The chance that the recognized transcript is a false positive is, therefore, very small.
[Figure not reproduced. Panel (a): number of genes in category. Panel (b): % probe sets in category. Horizontal axis in both panels: number of different probe sets for the same gene annotation (1 to 9).]
in the selected list of differentially expressed mRNAs, strongly indicating that the number of random false positives is low in such cases.

Conclusions and perspectives

An increasing number of genome-wide mRNA expression studies have been undertaken on primary beta-cells (whole islets or purified cells) as well as on clonal beta-cells. The purpose of this work has been to characterize the part of the genome that is expressed in these cells as well as to understand the molecular basis of nutrient and cytokine regulation or dysregulation. Because of the massive amounts of data that are generated by these experiments, attention should be paid to data mining of relevant (biological or
disease-related) information. Purification of the primary beta-cells is an important prior step in filtering the relevant mRNA information, and new approaches such as linking independent array data sets are required to help discriminate true from false positives. There is a need to improve methods that indicate whether biologically relevant false negatives exist. This chapter has suggested some solutions inherent to the structure of the array experiment that may help enhance the ratio of true to false positives, thus facilitating further hypothesis-driven analysis on a gene-by-gene basis.

Acknowledgments

Current research of FS is supported by the FWO-Vlaanderen (G.0529.05), the Juvenile Diabetes Research Foundation (1-2002-801) and the K. U. Leuven (GOA/2004/11).
REFERENCES

1. Zimmet, P., Alberti, K. G., and Shaw, J. Global and societal implications of the diabetes epidemic. Nature 2001; 414: 782–7.
2. Newgard, C. B. and McGarry, J. D. Metabolic coupling factors in pancreatic beta-cell signal transduction. Annu. Rev. Biochem. 1995; 64: 689–719.
3. Henquin, J. C. Triggering and amplifying pathways of regulation of insulin secretion by glucose. Diabetes 2000; 49: 1751–60.
4. Schuit, F. C., Huypens, P., Heimberg, H., and Pipeleers, D. G. Glucose sensing in pancreatic beta-cells: a model for the study of other glucose-regulated cells in gut, pancreas, and hypothalamus. Diabetes 2001; 50: 1–11.
5. Creutzfeldt, W. and Ebert, R. New developments in the incretin concept. Diabetologia 1985; 28: 565–73.
6. Holst, J. J. and Gromada, J. Role of incretin hormones in the regulation of insulin secretion in diabetic and nondiabetic humans. Am. J. Physiol. Endocrinol. Metab. 2004; 287: E199–206.
7. Hinke, S. A., Hellemans, K., and Schuit, F. C. Plasticity of the beta cell insulin secretory competence: preparing the pancreatic beta cell for the next meal. J. Physiol. 2004; 558: 369–80.
8. Webb, G. C., Akbar, M. S., Zhao, C., and Steiner, D. F. Expression profiling of pancreatic beta cells: glucose regulation of secretory and metabolic pathway genes. Proc. Natl Acad. Sci. USA 2000; 97: 5773–8.
9. Welsh, M., Scherberg, N., Gilmore, R., and Steiner, D. F. Translational control of insulin biosynthesis. Evidence for regulation of elongation, initiation and signal-recognition-particle-mediated translational arrest by glucose. Biochem. J. 1986; 235: 459–67.
10. Lilla, V., Webb, G., Rickenbach, K. et al. Differential gene expression in well-regulated and dysregulated pancreatic beta-cell (MIN6) sublines. Endocrinology 2003; 144: 1368–79.
11. Flamez, D., Berger, V., Kruhoffer, M., Orntoft, T., Pipeleers, D., and Schuit, F. C. Critical role for cataplerosis via citrate in glucose-regulated insulin release. Diabetes 2002; 51: 2018–24.
12. Farfari, S., Schulz, V., Corkey, B., and Prentki, M. Glucose-regulated anaplerosis and cataplerosis in pancreatic beta-cells: possible implication of a pyruvate/citrate shuttle in insulin secretion. Diabetes 2000; 49: 718–26.
13. Lu, D., Mulder, H., Zhao, P. et al. 13C NMR isotopomer analysis reveals a connection between pyruvate cycling and glucose-stimulated insulin secretion (GSIS). Proc. Natl Acad. Sci. USA 2002; 99: 2708–13.
14. Schuit, F., Flamez, D., De Vos, A., and Pipeleers, D. Glucose-regulated gene expression maintaining the glucose-responsive state of beta-cells. Diabetes 2002; 51 (Suppl. 3): S326–32.
15. Zhou, Y. P., Marlen, K., Palma, J. F. et al. Overexpression of repressive cAMP response element modulators in high glucose and fatty acid-treated rat islets. A common mechanism for glucose toxicity and lipotoxicity? J. Biol. Chem. 2003; 278: 51316–23.
16. El Assaad, W., Buteau, J., Peyot, M. L. et al. Saturated fatty acids synergize with elevated glucose to cause pancreatic beta-cell death. Endocrinology 2003; 144: 4154–63.
17. Gauthier, B. R., Brun, T., Sarret, E. J. et al. Oligonucleotide microarray analysis reveals PDX1 as an essential regulator of mitochondrial metabolism in rat islets. J. Biol. Chem. 2004; 279: 31121–30.
18. Shalev, A., Pise-Masison, C. A., Radonovich, M. et al. Oligonucleotide microarray analysis of intact human pancreatic islets: identification of glucose-responsive genes and a highly regulated TGFbeta signaling pathway. Endocrinology 2002; 143: 3695–8.
19. Cardozo, A. K., Kruhoffer, M., Leeman, R., Orntoft, T., and Eizirik, D. L. Identification of novel cytokine-induced genes in pancreatic beta-cells by high-density oligonucleotide arrays. Diabetes 2001; 50: 909–20.
20. Cardozo, A. K., Heimberg, H., Heremans, Y. et al. A comprehensive analysis of cytokine-induced and nuclear factor-kappa B-dependent genes in primary rat pancreatic beta-cells. J. Biol. Chem. 2001; 276: 48879–86.
21. Rasschaert, J., Liu, D., Kutlu, B. et al. Global profiling of double stranded RNA- and IFN-gamma-induced genes in rat pancreatic beta cells. Diabetologia 2003; 46: 1641–57.
22. Kutlu, B., Cardozo, A. K., Darville, M. I. et al. Discovery of gene networks regulating cytokine-induced dysfunction and apoptosis in insulin-producing INS-1 cells. Diabetes 2003; 52: 2701–19.
23. Johansson, U., Olsson, A., Gabrielsson, S., Nilsson, B., and Korsgren, O. Inflammatory mediators expressed in human islets of Langerhans: implications for islet transplantation. Biochem. Biophys. Res. Commun. 2003; 308: 474–9.
24. Wang, J., Webb, G., Cao, Y., and Steiner, D. F. Contrasting patterns of expression of transcription factors in pancreatic alpha and beta cells. Proc. Natl Acad. Sci. USA 2003; 100: 12660–5.
25. Mizusawa, N., Hasegawa, T., Ohigashi, I. et al. Differentiation phenotypes of pancreatic islet beta- and alpha-cells are closely related with homeotic genes and a group of differentially expressed genes. Gene 2004; 331: 53–63.
26. Parton, L. E., Diraison, F., Neill, S. E. et al. Impact of PPARgamma overexpression and activation on pancreatic islet gene expression profile analyzed with oligonucleotide microarrays. Am. J. Physiol. Endocrinol. Metab. 2004; 287: E390–404.
27. Storey, J. D. and Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 2003; 100: 9440–5.
28. Bordin, S., Amaral, M. E., Anhe, G. F. et al. Prolactin-modulated gene expression profiles in pancreatic islets from adult female rats. Mol. Cell Endocrinol. 2004; 220: 41–50.
29. Hui, H., Wang, C., Li, H. et al. Gene expression profiling of cultured human islet preparations. Diabetes Technol. Ther. 2004; 6: 481–92.
30. Henry, G. L., Zito, K., and Dubnau, J. Chipping away at brain function: mining for insights with microarrays. Curr. Opin. Neurobiol. 2003; 13: 570–6.
31. Pipeleers, D. G., in’t Veld, P. A., Van de, W. M., Maes, E., Schuit, F. C., and Gepts, W. A new in vitro model for the study of pancreatic A and B cells. Endocrinology 1985; 117: 806–16.
32. Hohmeier, H. E. and Newgard, C. B. Cell lines derived from pancreatic islets. Mol. Cell Endocrinol. 2004; 228: 121–8.
33. Schuit, F. C., in’t Veld, P. A., and Pipeleers, D. G. Glucose stimulates proinsulin biosynthesis by a dose-dependent recruitment of pancreatic beta cells. Proc. Natl Acad. Sci. USA 1988; 85: 3865–9.
34. Kiekens, R., in’t Veld, P., Mahler, T., Schuit, F., Van de, W. M., and Pipeleers, D. Differences in glucose recognition by individual rat pancreatic B cells are associated with intercellular differences in glucose-induced biosynthetic activity. J. Clin. Invest. 1992; 89: 117–25.
35. Schuit, F., Moens, K., Heimberg, H., and Pipeleers, D. Cellular origin of hexokinase in pancreatic islets. J. Biol. Chem. 1999; 274: 32803–9.
36. Cole, S. W., Galic, Z., and Zack, J. A. Controlling false-negative errors in microarray differential expression analysis: a PRIM approach. Bioinformatics 2003; 19: 1808–16.
37. Robison, G. A., Butcher, R. W., and Sutherland, E. W. Cyclic AMP. Annu. Rev. Biochem. 1968; 37: 149–74.
38. Pipeleers, D. G., Schuit, F. C., in’t Veld, P. A. et al. Interplay of nutrients and hormones in the regulation of insulin release. Endocrinology 1985; 117: 824–33.
39. Van Schravendijk, C. F., Foriers, A., Hooghe-Peters, E. L. et al. Pancreatic hormone receptors on islet cells. Endocrinology 1985; 117: 841–8.
40. Schuit, F. C. and Pipeleers, D. G. Regulation of adenosine 3′,5′-monophosphate levels in the pancreatic B cell. Endocrinology 1985; 117: 834–40.
41. Moens, K., Heimberg, H., Flamez, D. et al. Expression and functional activity of glucagon, glucagon-like peptide I, and glucose-dependent insulinotropic peptide receptors in rat pancreatic islet cells. Diabetes 1996; 45: 257–61.
42. Schuit, F. C., Derde, M. P., and Pipeleers, D. G. Sensitivity of rat pancreatic A and B cells to somatostatin. Diabetologia 1989; 32: 207–12.
43. Schuit, F. C. and Pipeleers, D. G. Differences in adrenergic recognition by pancreatic A and B cells. Science 1986; 232: 875–7.
44. Allemeersch, J., Durinck, S., Vanderhaeghen, R. et al. Benchmarking the CATMA microarray. A novel tool for Arabidopsis transcriptome analysis. Plant Physiol. 2005; 137: 588–601.
45. Mills, J. C. and Gordon, J. I. A new approach for filtering noise from high-density oligonucleotide microarray datasets. Nucl. Acids Res. 2001; 29: E72.
46. Aris, V. M., Cody, M. J., Cheng, J. et al. Noise filtering and nonparametric analysis of microarray data underscores discriminating markers of oral, prostate, lung, ovarian and breast cancer. BMC Bioinformatics 2004; 5: 185.
47. Kerr, M. K., Martin, M., and Churchill, G. A. Analysis of variance for gene expression microarray data. J. Comput. Biol. 2000; 7: 819–37.
48. Lee, M. L., Kuo, F. C., Whitmore, G. A., and Sklar, J. Importance of replication in microarray gene expression studies: statistical methods and evidence from repetitive cDNA hybridizations. Proc. Natl Acad. Sci. USA 2000; 97: 9834–9.
49. Ludbrook, J. Statistics in physiology and pharmacology: a slow and erratic learning curve. Clin. Exp. Pharmacol. Physiol. 2001; 28: 488–92.
50. Long, A. D., Mangalam, H. J., Chan, B. Y., Tolleri, L., Hatfield, G. W., and Baldi, P. Improved statistical inference from DNA microarray data using analysis of variance and a Bayesian statistical framework. Analysis of global gene expression in Escherichia coli K12. J. Biol. Chem. 2001; 276: 19937–44.
51. Delongchamp, R. R., Bowyer, J. F., Chen, J. J., and Kodell, R. L. Multiple-testing strategy for analyzing cDNA array data on gene expression. Biometrics 2004; 60: 774–82.
52. De Smet, F., Moreau, Y., Engelen, K., Timmerman, D., Vergote, I., and De Moor, B. Balancing false positives and false negatives for the detection of differential expression in malignancies. Br. J. Cancer 2004; 91: 1160–5.
53. Gao, F., Foat, B. C., and Bussemaker, H. J. Defining transcriptional networks through integrative modeling of mRNA expression and transcription factor binding data. BMC Bioinformatics 2004; 5: 31.
54. Venken, K., Schuit, F., Van Lommel, L. et al. Growth without growth hormone receptor: estradiol is a major growth hormone-independent regulator of hepatic IGF-1 synthesis. J. Bone Miner. Res. 2005; 20: 2138–49.
55. Baldi, P. and Long, A. D. A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics 2001; 17: 509–19.
9
Prediction of response and resistance to treatment by gene expression profiling
Philipp Kiewe and Wolf-Karsten Hofmann
Department of Hematology and Oncology, University Hospital Benjamin Franklin, Berlin, Germany
Introduction
For many decades, drug therapy in medicine has been an empirical science, largely based on trial and error. Even today we cannot predict how effective a particular drug will be in an individual patient. Finding the most suitable antihypertensive agent, for example, may take more than one attempt. Often, only careful evaluation of a large number of clinical trials has enhanced treatment success and patient benefit. Testing bacterial sensitivity towards antibiotic drugs (resistogram) was among the first attempts at a “proof of principle” before the onset of treatment. It is still an unsurpassed method for targeted antimicrobial therapy and a prime example of an educated approach to treatment. With the possibility of evaluating the expression of thousands of genes at a time using commercially available or customized gene arrays and applying sophisticated statistical algorithms, a new era has dawned for prognostic assessment of diseases as well as for therapeutic implications. In treating infectious diseases, microarrays and mapping of single nucleotide polymorphisms (SNPs) have already enhanced drug discovery and understanding of resistance mechanisms [1–3]. However, all efforts to define gene expression profiles for disease are limited by the availability of representative tissue samples. Although it has recently been shown that differentially expressed genes in heart failure patients can be found within white blood cells [4], progress is most pronounced in hematology and oncology. Despite the variety of clinical, morphological and molecular parameters used to classify human malignancies today, patients receiving the same diagnosis can have markedly different clinical courses and treatment responses.
Gene Expression Profiling by Microarrays: Clinical Implications, ed. Wolf-Karsten Hofmann. Published by Cambridge University Press 2006. © Cambridge University Press 2006.
One of the first studies utilizing microarray technology for analyzing patients’ tumor samples demonstrated how this tool could correctly classify human acute leukemias [5]. It was followed by many attempts to tackle the complexity of cancer. Using class membership prediction (a statistical method designed to classify samples according to their gene expression profile by associating particular expression signatures with disease subtypes or prognostic factors), gene lists predictive of treatment success can be established. Depending on patient samples and statistical algorithms, response prediction can be narrowed down to a single cytotoxic agent and to a few differentially expressed genes that can then easily be measured by real-time polymerase chain reaction (RT-PCR) or immunohistochemistry.
It is beyond the scope of this chapter to provide a complete survey of the use of gene expression analysis in hematological or oncological disease. An exhaustive review on this topic with regard to hematologic malignancies has recently been published by Margalit et al. [6]. We have aimed to focus primarily on the most important research that uses gene expression analysis to predict the course of malignant diseases and response to treatment. After a look at the important research on non-Hodgkin’s lymphoma, acute lymphoblastic leukemia serves to communicate the principles of gene expression analysis in more detail. The last part of this chapter reviews current literature on gene expression profiles in breast cancer and other solid tumors, reporting outstanding results regarding the prognostic value of gene signatures with respect to individual cytotoxic agents.
The biodiversity of diffuse large B-cell lymphoma
Classification of human lymphomas has steadily evolved from their initial recognition by Thomas Hodgkin in 1832 and the subsequent distinction of Hodgkin’s disease from other types of lymphoma. Discovery of immunophenotypic markers and genetically defined differences has resulted in the development of various classification schemes, most recently the WHO classification. However, within this classification system, various morphologic subtypes are still grouped together, despite the suspicion that they include more than one disease entity.
Diffuse large B-cell lymphoma (DLBCL) accounts for 40% of all non-Hodgkin’s lymphomas. While it is believed to arise from germinal center B-cells or from B-cells at a later stage of differentiation, it is clinically heterogeneous with respect to treatment success. Although various markers have been proposed for defining subsets in DLBCL, few have been clinically helpful. Two much acclaimed publications on gene expression formed the basis of a comprehensive study on gene expression in DLBCL carried out by Rosenwald and colleagues [7]. Alizadeh [8] had first presented evidence of two molecularly defined subgroups of DLBCL, irrespective of histological subtypes, using unsupervised hierarchical clustering of gene expression measured on a cDNA array called the “lymphochip”: one subgroup exhibited features of germinal center (GC) B-cells, the other resembled in vitro activated B-lymphocytes from peripheral blood. The two groups corresponded with marked differences in clinical outcome as demonstrated by Kaplan–Meier plots and provided additional information to the risk groups determined by the international prognostic index (IPI), which takes into account the patient’s age and performance status, as well as the extent and location of lymphomas (Fig. 9.1). Shipp and colleagues [9] were not able to reaffirm this cell-of-origin classification, mainly due to the use of oligonucleotide-based Affymetrix HU6800 arrays with a different gene composition as well as supervised statistical approaches. However, using class membership prediction methods, they established a 13-gene model as an outcome predictor, which was able to separate patient samples into one group predicted to be cured and one with fatal or refractory disease. Again, the Kaplan–Meier survival analysis showed a highly significant difference and independence from risk groups defined by IPI. Rosenwald et al. focused on the biopsy specimens of 240 patients with DLBCL. In addition to GC B-like DLBCL and activated B-like DLBCL, hierarchical clustering revealed a third group of lymphoma cells (termed type 3 DLBCL). Again, none of the groups was related to a particular histologic subtype. Furthermore, they found that two common oncogenic events in DLBCL, bcl-2 translocation and c-rel amplification, were only detected in the GC B-like subgroup, and that patients within this subgroup had the highest 5-year overall survival (OS) rate.
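The idea behind unsupervised hierarchical clustering, as used in the studies above, can be illustrated with a minimal sketch. It is not the procedure of any cited study: the choice of average linkage with a Pearson-correlation distance is an assumption made for illustration, and the sample names (S1 to S6) and expression values are invented.

```python
import math

def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[name] for name in profiles]

    def cluster_distance(c1, c2):
        # Average linkage: mean pairwise distance between members.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(pearson_distance(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical log-ratio values for five made-up genes in six made-up samples.
profiles = {
    "S1": [2.1, 1.8, 2.4, -1.9, -2.2],
    "S2": [1.7, 2.2, 1.9, -2.4, -1.6],
    "S3": [2.5, 1.5, 2.0, -1.5, -2.0],
    "S4": [-2.0, -1.7, -2.3, 2.2, 1.8],
    "S5": [-1.6, -2.1, -1.8, 1.9, 2.3],
    "S6": [-2.4, -1.9, -2.1, 1.6, 2.0],
}
groups = average_linkage(profiles, n_clusters=2)
print(groups)
```

No class labels enter the computation; the two groups emerge from the expression values alone, which is what distinguishes this approach from the supervised class membership prediction described earlier.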
Fig. 9.1. Clinically distinct DLBCL subgroups defined by gene expression profiling. (a) Kaplan–Meier plot of overall survival of DLBCL patients grouped on the basis of gene expression profiling. (b) Kaplan–Meier plot of overall survival of DLBCL patients grouped according to the international prognostic index (IPI). Low clinical risk patients (IPI score 0–2) and high clinical risk patients (IPI score 3–5) are plotted separately. (c) Kaplan–Meier plot of overall survival of low clinical risk DLBCL (IPI score 0–2) grouped on the basis of their gene expression profiles. Reproduced with permission from [8].
[Figure not reproduced. Axes in all panels: probability (y) against overall survival in years (x, 0 to 12). Panel (a), all patients: GC B-like, 19 patients, 6 deaths; activated B-like, 21 patients, 16 deaths; P = 0.01. Panel (b), all patients: low clinical risk, 24 patients, 9 deaths; high clinical risk, 14 patients, 11 deaths; P = 0.002. Panel (c), low clinical risk patients: GC B-like, 14 patients, 3 deaths; activated B-like, 10 patients, 6 deaths; P = 0.05.]
In a second approach, 17 individual genes were determined that predicted outcome, and a multivariate model was established incorporating differences in the levels of gene expression among the subgroups of DLBCL that influenced outcome, as well as other differences in gene expression
associated with the likelihood of survival. Using this model, samples were allocated to four gene expression signatures: normal GC B-cell signature; signature of reactive non-malignant cells in biopsy specimens of DLBCL; MHC class II signature; and proliferation signature. Kaplan–Meier survival analysis showed highly significant differences among the four different gene signatures, again independent of the IPI score. The predictor model had a greater prognostic power than that derived from the subgroups of DLBCL, and it could be used to further subdivide the patients within each subgroup into distinct risk groups. Importantly, the classification proposed by Rosenwald et al. was later confirmed by immunohistochemistry [10], that is, by a conventional classification method. It is worth mentioning that Rosenwald’s group, as well as another team (Savage and colleagues), for the first time managed to identify a unique gene expression signature of primary mediastinal B-cell lymphoma (PMBL), which until then had not been distinguished reliably from other types of DLBCL [11, 12]. Both studies revealed a pathogenetic relationship between PMBL and Hodgkin’s lymphoma, showing the ability of gene expression analysis to find previously unknown relationships between diseases.
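Because Kaplan–Meier analysis recurs throughout this chapter, a minimal sketch of the product-limit estimate may be helpful. The follow-up times and outcomes below are invented for illustration and do not come from any of the studies discussed; the function name is likewise only illustrative.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one death.

    times  -- follow-up in years for each patient
    events -- 1 if the patient died at that time, 0 if censored
              (still alive at last contact)
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up data for eight patients.
times = [1, 2, 2, 3, 5, 6, 8, 10]
events = [1, 1, 0, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"year {t}: estimated survival {s:.3f}")
```

Censored patients contribute to the number at risk until their last contact but never trigger a downward step, which is why the curve only drops at years 1, 2, 3, and 6 in this toy data set. Comparing two such curves (for example by a log-rank test) is a separate step not shown here.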
Pediatric and adult acute lymphoblastic leukemia (ALL): models for prediction of relapse and response to treatment
Owing to the origin of leukemia from cells within the human bone marrow, it is relatively easy to obtain material of high purity for gene expression analysis. Preparations of mononuclear cells containing 80% to 90% leukemic blasts can be isolated from peripheral blood or bone marrow and are usually preserved at initial diagnosis and later during the course of disease. Treatment of childhood ALL is a success story, with long-term event-free survival in 80% of patients. Consistent definition of risk factors and allocation of patients into risk-defined prognostic groups, as well as the implementation of minimal residual disease (MRD) measurements within treatment protocols, have resulted in an individual tailoring of treatment, crucial for long-term survival. Expression of certain molecular markers such as BCR-ABL or the E2A-PBX fusion protein as well as rearrangements
within the MLL-gene locus have been associated with a poor response to conventional anti-metabolite-based treatments, necessitating more aggressive protocols or early allogeneic stem-cell transplantation. However, accurate risk assignment is difficult and expensive, usually requiring collective expertise within a specialized treatment center. To determine whether correct risk assignment by gene expression analysis is feasible, Yeoh et al. studied 327 samples of pediatric ALL patients with respect to 12 600 genes represented by oligonucleotides on an Affymetrix HG_U95Av2 array. With unsupervised hierarchical clustering of samples, six major leukemia subtypes were identified. These had previously been assigned on the basis of their immunophenotype, cytogenetic analysis or specific gene rearrangements, comprising T-lineage ALL (T-ALL), hyperdiploid karyotype with >50 chromosomes as well as BCR-ABL, E2A-PBX1, TEL-AML1 and MLL gene rearrangements. Additionally, a previously unknown subgroup of 14 samples without consistent cytogenetic abnormality was discovered. Twenty percent of all samples had a heterogeneous gene expression pattern and did not fit into any of these subgroups. With a given test set of samples, gene expression analysis yielded an accuracy of 96% in correct allocation to subgroups. In some respects it was superior to conventional methods, as some misclassified samples revealed atypical translocations that had previously not been recognized by fluorescence in situ hybridization (FISH) probes or PCR primers. Despite the success in recognizing distinct leukemia subgroups with high or low risk of treatment failure, individual risk assignment remains an imprecise process. The next step then was to identify patients with a high risk of treatment failure, defined either by relapse of disease or by therapy-induced development of secondary acute myeloid leukemia (AML). It was impossible to detect a single expression signature that would predict relapse irrespective of the genetic subtype, indicating the absence of a unifying mechanism for relapse. Within individual leukemic subgroups, however, distinct expression profiles predicting relapse could be defined, particularly in T-ALL and in hyperdiploid karyotype >50 chromosomes, with an accuracy of 97% and 100%, respectively. Though no single gene, but rather an expression pattern of a combination of genes, defined the difference, these findings are remarkable because few risk-stratifying biologic
features had previously been identified for either T-ALL or hyperdiploid >50 chromosomes ALL. A very provocative result of Yeoh’s experiments was the identification of specific expression patterns in ALL blasts at the time of diagnosis that predicted the subsequent development of secondary AML. This is difficult to understand because different, independent hematopoietic stem cells are believed to give rise to secondary leukemia. Nonetheless, while again no consistent pattern was seen across all subgroups, development of secondary AML within the TEL-AML1 subgroup could be predicted with 100% accuracy. Genes within this signature included RSU1, a suppressor of the RAS-pathway, and MSH3, a mismatch repair enzyme. Based upon these important findings, Cario et al. designed a very straightforward study [13]. The German ALL-BFM 2000 trial for childhood ALL assigns high risk (HR), intermediate risk (IR) and standard risk (SR) to patients according to their in vivo response to treatment, translating these into treatment intensity. Each ALL clone possesses an individual and reproducible signature defined by specific rearrangements within the T-cell receptor (T-lineage ALL) or immunoglobulin (B-lineage ALL) gene. Molecular detection of these transcripts by PCR can expose minimal residual disease (MRD) even when cytologically no ALL cells are seen. MRD evaluation after 33 days and 12 weeks is used as a marker for response and risk assignment. Acknowledging the heterogeneity among the different ALL subgroups, only patient samples with the B-cell precursor immunophenotype were selected from the trial, and BCR-ABL, MLL-AF4 or TEL-AML1 rearrangements were excluded. Fifty-one patient samples, 21 HR and 30 SR, were evaluated at the time of diagnosis for the expression of up to 30 000 genes on spotted cDNA microarrays.
Unsupervised hierarchical clustering depicted two branches with inconsistent distribution of HR and SR samples and no statistically significant differences in clinical features of patients. The samples were then sorted using a supervised clustering algorithm to distinguish poor treatment response from good. Analysis of genes in the resistant group revealed a predominance of genes involved in cell cycle progression, cell division control, and apoptosis, indicating an impaired cell proliferation and impaired apoptosis in B-cell precursor ALL samples that are resistant to chemotherapy.
It was now possible to correctly assign random test samples to either the high- or low-risk group with an accuracy of 84%, hampered slightly by an inconsistent preparation method of samples within the test set. The described means of response prediction may be particularly valuable for the group of intermediate-risk patients where MRD is unable to identify patients with risk of relapse. The study impressively demonstrated that, with strict selection of patients, only a small number of samples are needed to identify, with high accuracy, gene signatures predictive of relapse. Pediatric ALL and adult ALL differ markedly in their biology and in their response to treatment, with a considerably higher proportion of treatment failure in adult patients. It is all the more remarkable that the genes involved in resistance in childhood ALL are in excellent agreement with the data obtained by Chiaretti et al. [14], who studied gene expression with respect to different response to treatment and survival in adult T-ALL. Since no relevant subgroups have been identified in adult T-ALL to date, the aim of their study was to identify a gene expression signature with prognostic value in terms of response to induction therapy and long-term outcome. Thirty-three initial patient samples from the Italian GIMEMA trial were used for gene expression analysis on Affymetrix HG_U95Av2 arrays, comprising patients refractory to induction chemotherapy, as well as patients relapsing within 2 years, and patients with continuous complete remission (CCR). Hierarchical clustering of 313 differentially expressed genes identified two major groups corresponding with phenotypic T-cell differentiation. In the following step, samples of patients achieving CR after induction therapy were compared with refractory patients, and 25 discriminating genes with the lowest prediction error rate were selected. While the profile of patients achieving CR was heterogeneous, the refractory group showed a homogeneous pattern, characterized by the high expression of IL-8 and reduced expression of the remaining genes, including members of the histone family, cell adhesion molecules, and genes involved in cell cycle proliferation, suggesting an impaired cell proliferation in refractory T-ALL. A further comparison, between patients who relapsed within 2 years and patients in CCR or with later relapse, was intended to identify genes predicting long-term outcome. Nineteen reliable predictor genes were identified. Genes involved in mitotic assembly and mitotic checkpoint (e.g., TTK) as well as
Fig. 9.2. Kaplan–Meier plots estimating probability of maintaining CR for adult T-ALL. (a) Twenty-four evaluable patients were assigned to either good-risk or poor-risk T-ALL based on expression of AHNAK, CD2, and TTK as measured by oligonucleotide microarrays. (b) Kaplan–Meier plots based on the white blood count (WBC) at diagnosis. (c) Kaplan–Meier plots based on the degree of T-lineage differentiation of the leukemic cell (immature = T1–T2; mature = T3–T4 of EGIL classification). Reproduced with permission from [14].
[Figure not reproduced. Axes in all panels: probability of CCR (y) against duration of complete remission in months (x, 0 to 60). Panel (a): good risk versus poor risk, P < 0.001 (log rank test). Panel (b): WBC < 100 µ/l versus WBC > 100 µ/l, P = 0.02 (log rank test). Panel (c): immature versus mature, P = 0.91 (log rank test).]
CD2 were expressed selectively in samples from patients with long CCR. In the relapse group, among the few identified genes with increased expression was AHNAK, encoding an unusually large 700 kDa protein with poorly defined function. Model selection for smallest prediction error (leave-one-out cross-validation [5]) identified three genes, AHNAK, CD2, and TTK. Twenty-four patients were then categorized into good-risk and poor-risk cohorts, based exclusively on the expression values for these three genes. Eleven patients were classified as good risk (3 relapsed in less than 2 years), and 13 patients as poor risk (2 remained in CCR), yielding an overall error rate of 19%. Kaplan–Meier curves of survival for both risk groups demonstrate the superior prediction of outcome of the three-gene model compared with predictions based on the initial leukocyte count and T-cell differentiation (Fig. 9.2). The results obtained from gene array data were reproduced with real-time PCR (RT-PCR) using the three-gene model, yielding a slightly higher overall error rate and similar Kaplan–Meier curves of survival. It is a revolutionary concept to attribute resistance to induction therapy to one gene and relapse to three genes, respectively, because an easier evaluation with methods like PCR or immunocytometry would then be feasible. However, despite a cautious statistical approach for selection of candidate genes, these findings need to be verified in larger cohorts of adult patients.
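Leave-one-out cross-validation, mentioned above as the basis for model selection, can be sketched as follows. This is only an illustration under stated assumptions: the nearest-centroid classifier is a stand-in and not necessarily the classifier used in [14], the three expression values per sample are invented and do not represent AHNAK, CD2, or TTK measurements, and the gene set is treated as fixed in advance.

```python
def nearest_centroid(train, query):
    """Predict the label whose mean expression vector is closest to `query`."""
    sums, counts = {}, {}
    for values, label in train:
        acc = sums.setdefault(label, [0.0] * len(values))
        for k, v in enumerate(values):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        centroid = [s / counts[label] for s in sums[label]]
        return sum((c - q) ** 2 for c, q in zip(centroid, query))

    return min(sums, key=squared_distance)

def leave_one_out_error(samples):
    """Hold out each sample once, train on the rest, and count misclassifications."""
    errors = 0
    for i, (values, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if nearest_centroid(train, values) != label:
            errors += 1
    return errors / len(samples)

# Hypothetical expression values for three made-up genes per sample.
samples = [
    ([1.0, 5.0, 4.8], "good"),
    ([1.4, 4.6, 5.2], "good"),
    ([0.8, 5.3, 4.5], "good"),
    ([3.5, 2.5, 2.6], "good"),  # atypical sample that resembles the poor-risk group
    ([4.9, 1.2, 1.0], "poor"),
    ([5.2, 0.9, 1.4], "poor"),
    ([4.6, 1.5, 0.8], "poor"),
    ([4.4, 1.8, 1.6], "poor"),
]
error_rate = leave_one_out_error(samples)
print(f"leave-one-out error rate: {error_rate:.3f}")
```

Each sample is predicted by a model that never saw it, so the error rate is less optimistic than one computed on the training data. If the predictive genes are themselves chosen from the data, that selection step would also have to be repeated inside every round to avoid an overly favourable estimate; the sketch omits this.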
It was important to learn that individual response to treatment and risk of relapse can be predicted by gene array data. However, to optimize treatment strategies, we want to know exactly which cytotoxic agent the individual patient will respond to. This question was addressed by the Dutch group of Holleman and Pieters [15], who had previously gathered experience with in vitro drug resistance of leukemic blasts in childhood ALL. They used samples obtained from 173 children with newly diagnosed ALL, for which they had assessed the in vitro sensitivity, defined by LC50 values, towards four relevant cytotoxic agents: prednisolone, vincristine, asparaginase, and daunorubicin. Patients with in vitro resistance to antileukemic agents have a substantially worse prognosis than children whose ALL cells are drug sensitive [16], but little is known about the genetic basis of resistance to individual antileukemic agents. Gene expression of all samples was determined using an Affymetrix U133A microarray with more than 22 000 oligonucleotide probe sets. Unsupervised hierarchical clustering distinguished samples according to their immunophenotype or genetic subtype, as anticipated, but not with respect to drug sensitivity. Due to the strong gene expression signature of T-ALL, only samples of B-lineage ALL were selected for further supervised clustering methods. The differential expression of 172 gene-probe sets with lowest prediction error (representing 124 unique known genes and 28 complementary DNA clones) was associated with resistance to prednisolone (42 probe sets), vincristine (59 probe sets), asparaginase (54 probe sets), and daunorubicin (22 probe sets) with a predictive accuracy for B-lineage ALL ranging from 71% to 76%. These results are documented impressively in Fig. 9.3.
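As a rough illustration of how probe sets that discriminate resistant from sensitive samples might be selected, the sketch below applies a Wilcoxon rank-sum test to each probe set. It is not the pipeline of [15]: the normal approximation without tie correction, the cutoff of 0.001, the probe set names, and all expression values are assumptions made for the example, and no correction for multiple testing is attempted.

```python
import math

def rank_sum_p_value(resistant, sensitive):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    pooled = sorted([(v, 0) for v in resistant] + [(v, 1) for v in sensitive])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average rank for tied values
        i = j + 1
    n1, n2 = len(resistant), len(sensitive)
    rank_sum = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical expression values: probe set -> (resistant samples, sensitive samples).
data = {
    "probe_A": ([8.2, 7.9, 8.5, 8.8, 7.6, 8.1, 9.0, 8.4],
                [5.1, 4.8, 5.6, 6.0, 5.3, 4.5, 5.9, 6.2]),
    "probe_B": ([5.1, 4.8, 5.5, 4.9, 5.3, 5.0, 4.7, 5.2],
                [4.95, 5.05, 5.15, 4.85, 5.25, 4.75, 5.35, 5.45]),
}
threshold = 0.001  # assumed cutoff for this example
selected = [name for name, (res, sen) in data.items()
            if rank_sum_p_value(res, sen) < threshold]
print(selected)
```

With more than 22 000 probe sets tested, many would pass any fixed cutoff by chance alone, which is one reason why selected genes are subsequently judged by their prediction error on samples not used for selection.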
When survival of all 173 patients treated according to Dutch and German pediatric ALL protocols was compared to a combined gene expression score indicative of resistance to the four drugs, there was an obvious correlation. Survival analysis for three different groups (high, intermediate, and low level of resistance) that were defined by their combined gene expression score revealed three significantly different Kaplan–Meier curves. These results could be validated using an independent group of samples obtained from a different but comparable clinical trial. The combined drug resistance gene expression score also predicted outcome of treatment in a multivariate
Fig. 9.3. Results of supervised hierarchical-clustering and principal-component analyses with the use of genes that discriminate between drug-resistant and drug-sensitive B-lineage ALL with respect to prednisolone, vincristine, asparaginase, and daunorubicin. The Wilcoxon rank-sum test and t-test were used to identify genes that were differentially expressed in sensitive and resistant ALL (p = 0.001). Each column represents an ALL sample, labelled according to whether it was sensitive (green) or resistant (red) to a given drug, and each row represents a probe set. The “heat” maps on the left side of the figure indicate high (red) or low (green) level of expression relative to the number of standard deviations from the mean. The three-dimensional plots on the right show three principal components based on the significant discriminating genes for each drug. Each circle represents a patient with leukemia; red circles indicate those with drug-resistant ALL, and green circles those with drug-sensitive ALL. Reproduced with permission from [15].
analysis that included the patient’s age, ALL genetic subtype, ALL lineage, and leukocyte count at diagnosis. Asking what functional categories of genes are involved in drug resistance, the authors found particular functional groups over-represented for different cytotoxic agents. Not surprisingly, prednisolone sensitivity is defined by a high percentage of genes involved in carbohydrate metabolism, whereas vincristine sensitivity seems to depend on genes involved in nucleic acid metabolism, and asparaginase sensitivity is related to protein metabolism genes. This study may have boosted our comprehension of drug resistance markedly, as most of the differentially expressed genes have not been linked previously to resistance to the four agents investigated. No universal cross-resistance gene was identified, since no single gene was associated with resistance to all four drugs, supporting the concept of combination therapy in human cancer. Furthermore, the findings may point to previously unrecognized potential targets for new agents augmenting the efficacy of current chemotherapy. Target-specific drugs are becoming increasingly available, most of them lacking surrogate markers that can adequately monitor response to treatment. To use resources economically, it would be desirable to predict a potential benefit from these compounds. STI571 (imatinib; Glivec®) is an ABL-selective tyrosine kinase inhibitor with response rates of 95%–100% in patients with chronic myeloid leukemia in chronic phase. However, response rates in ALL are lower (59%) and nearly all patients who respond initially to STI571 become refractory to treatment within a few weeks or months [17, 18]. Therefore, Hofmann et al. evaluated the attractive concept of predicting resistance to STI571 by gene expression analysis in patients with Philadelphia-chromosome-positive (Ph+) ALL [19]. Bone marrow samples of 19 patients with Ph+ ALL were collected before enrolment into a phase II study with STI571 treatment and evaluated for gene expression on the Affymetrix HuGeneFL array, containing 5600 human genes. For four patients, additional material was available at the time of refractoriness to the compound. With differentially expressed genes in these pairs, class-membership prediction was used to establish whether leukemic samples could be classified into three classes: sensitive to STI571,
primarily resistant, or secondarily resistant to STI571. Ninety-five differentially expressed genes were found to be able to distinguish samples sensitive to STI571 from those resistant to it. Fifty-six highly differentially expressed genes were identified in leukemic cells that had secondary resistance to STI571, but only one of these genes was within the list of 95 genes selected for prediction of resistance to STI571. Most genes predicting primary resistance are unknown or have no apparent association with mechanisms of resistance to STI571. Conversely, some of the 56 genes selected by comparison of sensitive and secondarily resistant cells have already been implicated in possible mechanisms of resistance or in pathways altered in leukemogenesis. Beyond the advanced understanding of the mechanisms of resistance, the use of gene expression profiling in the pretreatment prediction of response could be highly valuable in the treatment with STI571 and potentially other signal transduction modulators.
Breast cancer – enhancing chances for cure
Breast cancer is the most frequent malignancy in women. With metastatic breast cancer remaining a fatal disease, priority is given to early detection of breast lesions as well as an aggressive pre- and postsurgical management including radiotherapy and combination chemotherapy, particularly when cancer has spread to regional lymph nodes. In this way, major advances have been achieved in preventing relapse and augmenting overall survival. Several prognostic markers and therapeutic targets (e.g., histological grade, hormone receptors, HER2-neu receptor) have been identified. It is still not possible to predict which patients will respond to neoadjuvant chemotherapy, which therefore is considered to be an in vivo assessment of sensitivity. Equally important, we do not know which women will benefit from adjuvant treatment because survival data are derived from cohorts within large clinical trials.
Van’t Veer and her group were the first to look for gene signatures suitable for patient-tailored treatment strategies [20]. They selected 78 primary breast cancers from lymph-node negative patients, including samples from patients with a long disease-free survival and those with later development of distant metastases, and carried out gene expression analysis on microarrays containing approximately 25 000 human genes (further
investigations with samples carrying BRCA1 germline mutations are beyond the scope of this chapter). Unsupervised hierarchical clustering depicted two branches of samples which to some extent already distinguished patients with “good prognosis” from patients with “poor prognosis”, concordant with a division of estrogen receptor (ER) positive from ER negative samples. When supervised statistical methods were applied to search for a prognostic signature, 231 genes were found to be associated significantly with disease outcome and were further limited to an optimal number of 70 marker genes. This classifier predicted 83% of patients correctly. These results could be validated with an independent test set of samples, where only 2 out of 19 tumors were misclassified. Genes predictive of the development of metastases were involved in cell cycle, invasion, angiogenesis, and signal transduction. Surprisingly, none of the many single genes which have been correlated previously with disease outcome, such as cyclin D1, UPA, PAI-1, HER2-neu, and c-myc, were present within the 70 marker genes, highlighting the need for an approach based on the expression of many genes. Eligibility for adjuvant chemotherapy in lymph-node negative patients is defined poorly. Currently, the decision is based mainly on clinical and histopathological prognostic factors such as grading, tumor size, angioinvasion, and hormone receptor status. The gene signature proposed by the authors outperformed any single prognostic factor, and even a multivariate model including all the classical prognostic factors with respect to clinical outcome, and could therefore become a valuable determinant for adjuvant chemotherapy. Once patients have been selected for (neo-)adjuvant chemotherapy, we want to know exactly which agents will be effective and how much treatment intensity is needed to avoid unnecessary side effects. Neoadjuvant chemotherapy provides a unique opportunity to identify molecular predictors of response to treatment in breast cancer. Thus, Chang et al. looked at response to docetaxel in chemotherapy prior to surgery [21]. In samples from 23 patients with primary breast cancer, using an HG_U95Av2 array, they identified 92 genes correlating with docetaxel response. Genes from sensitive samples were involved in cell cycle, cytoskeleton, adhesion, protein transport, transcription, and stress or apoptosis; whereas resistant
tumors showed increased expression of some transcriptional and signal transduction genes. The predictor yielded an accuracy of 88% and was validated in an independent set of six patients as well as by RT-PCR for selected genes. A Japanese group led by Iwao-Koizumi approached the same question with a different technique [22]. They used adaptor-tagged competitive PCR (ATAC-PCR), a high-throughput PCR method, to study gene expression in 44 primary breast cancer samples, and to develop a predictor for docetaxel response consisting of 85 genes, of which 61 were overexpressed in non-responders. The accuracy of prediction was 72.4%, and the predictor was also validated with an independent test set. In addition to increased tubulin expression, they found most prominently an elevated expression of genes controlling the cellular redox environment in non-responders. The importance of these genes for docetaxel sensitivity could be demonstrated in a transfection experiment: after transfection into the normally docetaxel-sensitive cell line MCF-7, all of these genes were able to protect MCF-7 from cell death. Despite these intriguing discoveries, only three genes overlapped between the predictor models of Chang and Iwao-Koizumi. Whether this is due to differences in methodology, treatment regimens, or response evaluation, the susceptibility of predictor models based on gene expression data to confounding factors becomes obvious here.
Since neoadjuvant chemotherapy is composed of a combination of cytotoxic drugs, usually comprising a taxane, doxorubicin, cyclophosphamide, and fluorouracil (T/FAC), Ayers et al. [23] examined the feasibility of developing a multigene predictor of pathologic complete response (pCR) to a common neoadjuvant treatment regimen. Pathologic complete response after neoadjuvant therapy is associated with improved long-term disease-free and overall survival, and it is currently considered to be the best, though imperfect, early surrogate for cure.
Primary breast cancer samples of 24 patients were used for predictor modeling with gene expression data from 30 721 human sequence clones on a high-density nylon cDNA array. While no single marker gene could be identified that was sufficiently associated with pCR, a multigene model with 74 markers was built and validated with a test set of 18 independent patient samples. Predictive accuracy was 78%, with low sensitivity but high specificity.
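Overall accuracy can hide an imbalance between sensitivity and specificity, as in the model just described. The sketch below computes the three measures from paired labels; the ten outcomes are invented for illustration and are not data from any study cited here.

```python
def classification_metrics(actual, predicted):
    """Accuracy, sensitivity, and specificity for binary labels (1 = responder).

    Assumes both classes occur at least once in `actual`.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # responders called responders
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # non-responders called non-responders
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical test set: 4 responders, 6 non-responders.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(actual, predicted)
```

In this invented example the model is right in eight of ten cases, yet it finds only half of the responders; every patient it does call a responder is one.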
Inflammatory breast cancer (IBC) is a rare but aggressive form of breast cancer with limited survival. Patients usually receive neoadjuvant chemotherapy, but accurate predictors of pathological response are missing. Bertucci et al. [24] measured gene expression of samples from patients with IBC and non-inflammatory breast cancer on home-made cDNA-spotted nylon microarrays containing 8000 genes. With supervised analysis, 109 genes were found to discriminate IBC from non-IBC samples with 85% accuracy. A similar approach identified 85 genes that were able to subdivide IBC patients into groups with significantly different CR rates (62% accuracy). Among them were genes previously associated with drug sensitivity; for instance, high expression of CDKN1B correlated with pCR. The authors also suggested an important role of the host immune system in tumor eradication after chemotherapy, as reflected in elevated expression of genes encoding cytokines, chemokines, and cytokine receptors in sensitive samples. Importantly, comparison with the discriminator genes reported by Chang and Ayers revealed several similarities.
From gene expression analysis to treatment plan: common predictors of response in heterogeneous cancers
Finding the best treatment strategy in solid cancers demands an interdisciplinary approach. A conclusive histopathological diagnosis should always be sought, as well as a thorough radiological investigation of dissemination. Surgical resection, radiation treatment, and chemotherapy, in a neoadjuvant, adjuvant, or palliative setting, will then be considered to achieve an optimal outcome. Many authors have hence tried to extract biological information from gene expression analysis to integrate into treatment concepts.
Hippo et al. [25] have investigated the biology of gastric cancer. It is one of the leading causes of cancer death in the world, and prognosis for advanced gastric cancer is extremely poor. Many genetic alterations have recently been identified, but not enough is known to understand the common pathways of carcinogenesis and their involvement in diverse clinical properties such as histological type, metastatic status, invasiveness, and response to chemotherapy. Gene expression of 22 samples of gastric cancer as well as eight
samples of normal gastric tissue was studied using an Affymetrix HuGeneFL array. Whole cancer tissues were investigated due to the importance of epithelial–stromal interaction. Hierarchical clustering revealed a large number of genes overexpressed in cancer, mainly related to cell cycle, cell growth, cell motility, cell adhesion, and matrix remodeling. Further analysis showed that certain genes were expressed differentially between intestinal-type and diffuse-type gastric cancer. Most important, however, was the association of nine genes with tumors that had already disseminated to local lymph nodes. These genes comprised the matrix-remodeling genes FN1, PCOLCE, and PFN2; most intriguing was the overexpression of Oct-2, a gene generally regarded as a lymphoid or neuronal cell-specific transcription factor. The protein expression of Oct-2 was then investigated by immunohistochemistry, which demonstrated strong immunoreactivity in gastric cancer cells with lymph node metastases and in some infiltrating lymphocytes but not in cancer cells without metastases. The authors suggest that this could become an important tool to clarify the lymph node status, which is critical for adequate surgical intervention.
Bertucci and his group not only studied inflammatory breast cancer but also applied the same technique of cDNA-spotted nylon microarrays with 8000 genes to colorectal cancer (CRC) [26], a heterogeneous tumor that is known for its multistep progression from benign adenomatous tissue. Unsupervised hierarchical clustering of cancer and non-cancerous colon samples depicted two branches corresponding to a benign or malignant phenotype. The same clustering algorithm, applied only to the CRC samples, sorted them into two groups differing with respect to stage, clinical outcome, and the presence of metastases. To define subgroups better, a supervised approach was applied to the cancer tissue samples.
Presence or occurrence of metastases was represented by 194 unique genes with a predictive accuracy of 82%, and was associated with significantly impaired overall survival. The presence of lymph node involvement was reflected by a predictor containing 41 differentially expressed genes. Analysis of these revealed up- and down-regulation of many genes known to play a role in metastatic disease or lymph node involvement, associated with various cellular processes. As a representative of the differentially expressed genes, the involvement of NM23, a gene up-regulated in CRC but down-regulated in poor-prognosis tumors, was validated by immunohistochemistry.
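Unsupervised hierarchical clustering, used in several of the studies above, groups samples purely by the similarity of their expression profiles. A minimal sketch with average linkage and a correlation-based distance follows; the five four-gene profiles are invented, and the implementation is a simple illustration rather than the software used by any of the cited groups.

```python
from statistics import mean

def correlation_distance(x, y):
    """1 - Pearson correlation; assumes neither profile is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / (sxx * syy) ** 0.5

def average_linkage(profiles, n_clusters=2):
    """Merge the two closest clusters until n_clusters remain.

    Returns a list of clusters, each a sorted list of sample indices.
    """
    clusters = [[i] for i in range(len(profiles))]

    def dist(c1, c2):
        # Average pairwise distance between members of the two clusters.
        return mean(correlation_distance(profiles[i], profiles[j])
                    for i in c1 for j in c2)

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        a, b = min(pairs, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]

# Hypothetical profiles: samples 0-1 resemble each other, as do samples 2-4.
profiles = [[1.0, 2.0, 3.0, 4.0], [1.1, 2.1, 2.9, 4.2],
            [4.0, 3.0, 2.0, 1.0], [4.2, 2.8, 2.1, 0.9], [3.9, 3.1, 1.8, 1.2]]
branches = average_linkage(profiles, n_clusters=2)
```

Cutting the tree at two clusters mirrors the "two branches" reported in the studies above; whether such branches correspond to clinical groups has to be checked separately.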
To elucidate responsiveness of rectal adenocarcinomas to preoperative chemotherapy, Ghadimi et al. prospectively studied gene expression of biopsies from 30 locally advanced rectal carcinomas on National Cancer Institute cDNA (9984 features) and oligonucleotide (22 231 features) arrays [27]. All patients were participants in a phase III clinical trial and were randomized to receive preoperative combined-modality therapy, including fluorouracil and radiation. Due to a short follow-up period, tumor downsizing by comparison of the T category (TNM classification) was chosen as an intermediate end point rather than overall survival. In a training set of 23 tumor biopsies, 54 differentially expressed genes (cDNA array) were found to discriminate responsive and non-responsive tumors using class comparison analysis. Accuracy of correct class prediction was 83%. To validate these results with a test set on an independent microarray platform, a newly obtained set of seven tumors was hybridized to oligonucleotide arrays. Here, 39 of the 54 genes identified as statistically significant using the cDNA arrays had corresponding features. A new classifier was trained using the expression of these 39 genes and correctly predicted the response of six out of seven patients (86%), attesting to the robustness of microarray profiling and the biologic relevance of the genes identified. The predictor genes represented members of several cellular pathways and mapped to different chromosomes. Of particular interest were genes encoding proteins involved in DNA damage repair pathways, such as SMC1, or microtubule organization, indicating a major role of radiation in treatment response.
A group led by Kurokawa sought a molecular predictor of response to intra-arterial 5-fluorouracil (5-FU) and interferon-α (IFN-α) combination therapy in advanced hepatocellular carcinoma (HCC) [28].
While 40%–50% of patients responded to therapy with clear signs of remission, survival was extremely short for patients who did not respond. The group had previously investigated response to docetaxel in breast cancer patients using ATAC-PCR. The same technique was now applied to the study of 20 HCC samples with 3080 genes, many of which had been selected from databases on the basis of involvement in HCC. With few exceptions, unsupervised hierarchical clustering already sorted patients into two different branches according to treatment response. With supervised analysis, a molecular prediction system was then constructed. Sixty-three differentially expressed genes were needed to exhibit distinct profiles
between the two groups with an accuracy of 85% (Fig. 9.4). An independent dataset of a further 11 patients was used to validate the model, and patients were classified into a poor signature group and a good signature group with significant differences in response and overall survival (Fig. 9.5). The 63 predictive genes included some genes that had previously been reported to be associated with sensitivity to 5-FU or IFN. For example, up-regulation of FAS ligand (CD95) correlates with poor sensitivity to 5-FU chemotherapy. A gene involved in sensitivity to IFN-α is OASL, which plays an important role in the antiviral effects of IFN, mediating apoptosis and controlling cellular growth. The impact of this study on treatment strategy might be considerable, because timely identification of patients who will not respond to therapy protects them from debilitating side effects and preserves their chance to receive alternative treatment.
Nakamura et al. [29] demonstrated in their analysis of 18 pancreatic tumors that a high purity of tumor cells obtained by means of laser microdissection yields large numbers of genes differentially expressed in cancer cells as compared to normal ductal epithelial cells. For their gene expression study, they fabricated a cDNA array representing 23 040 different genes. Supervised analysis comparing samples from patients with a disease-free interval of >12 months after surgery with those having an early recurrence identified a 30-gene predictor that separated both groups. Seventy-six genes were found to be predictive of lymph node metastases. This study revealed a number of interesting genes either up- or down-regulated in particular settings, contributing to our knowledge about the biology of aggressive tumors. The detection of specific markers in pancreatic carcinoma, which ranks among the most fatal malignancies and is usually discovered at a late stage, would facilitate tumor detection and enhance the development of targeted therapy.
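Survival comparisons such as the one reported for the HCC validation set rely on the Kaplan–Meier product-limit estimator, which lowers the survival estimate at each observed death by the fraction of patients still at risk and lets censored patients leave the risk set without a drop. A minimal sketch follows; the follow-up times and event flags are invented and do not reproduce the data of [28].

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up in months for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve

# Hypothetical follow-up data for six patients (months, event flag).
months = [6, 12, 12, 20, 30, 48]
died = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(months, died)
```

Two such curves, one per signature group, would then be compared with a log-rank test, which is not implemented in this sketch.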
Fig. 9.4. The expression pattern of the 63 predictive genes in 20 hepatocellular carcinoma cases. Each column represents a patient sample, and each row represents a single gene. Reproduced with permission from [28].
Fig. 9.5. Overall survival curves (a) and disease-free survival curves (b) from HCC patients calculated using the Kaplan–Meier method for the validation dataset. Axes: survival rate against months after surgery (0–60); P = 0.001 in (a) and P = 0.002 in (b). Differences in survival curves were estimated by the log-rank test. Reproduced with permission from [28].
Gemcitabine, a pyrimidine nucleoside analogue, is the most effective chemotherapeutic agent in pancreatic carcinoma; however, no more than 25% of patients will respond to treatment. Akada and his group [30] wondered if gene expression could predict response to gemcitabine in an in vitro model. They identified 71 genes expressed differentially in pancreatic cancer cell lines resistant or sensitive to gemcitabine, many of which were involved in the phosphatidylinositol 3-kinase/Akt pathway. Resistance could also be attributed to down-regulation of BNIP3, a Bcl-2 family proapoptotic protein. Experiments suppressing BNIP3 with small interfering RNA demonstrated that gemcitabine-induced cytotoxicity in vitro was much reduced.
The last publication reviewed for this chapter concentrates on bladder cancers and response to neoadjuvant chemotherapy with methotrexate, vinblastine, doxorubicin, and cisplatin. Neoadjuvant treatment with these substances (M-VAC regimen), followed by radical cystectomy, is more likely to eliminate residual cancer, improving survival in patients with locally advanced bladder cancer. Takata and his colleagues analyzed gene expression profiles of biopsy materials from 27 invasive bladder cancers
using a home-made cDNA spotted microarray consisting of 27 648 genes [31]. From a list of dozens of genes expressed differentially between “responder” tumors and “non-responder” tumors, 14 genes were selected that clearly separated both groups and accurately predicted the drug responses of eight of nine test cases reserved from the original 27 samples. Though notably inferior to microarray analysis, subsequent measurement of predictor genes by RT-PCR still provided sufficient predictive power to discriminate between the two groups, suggesting an easy system for clinical use. Among the differentiating genes, the authors found that topoisomerase IIα (TOP2A) was down-regulated in the non-responder group. It is known to be a target for several anticancer drugs including doxorubicin, and its down-regulation has been associated with resistance to a large range of chemotherapeutic drugs. Other mechanisms included up-regulation in non-responders of an anti-apoptotic factor (RELA), an inducer gene of further genes involved in drug resistance (SLC16A3), and a gene inhibiting p53 (PARK).
Conclusions
The simultaneous transcription analysis of thousands of genes certainly provides a powerful tool for a better understanding of mechanisms involved in the development of disease. The possibility of attributing expression patterns to drug sensitivity is very attractive, and numerous potential applications are emerging across medical domains. However, progress has been greatest in tumor medicine. To date, various research groups around the world have used gene expression profiles to categorize malignant diseases molecularly, and to identify gene expression signatures predictive of survival or response to chemotherapy. From the identification of previously known and unknown subtypes of disease and correlation with survival, researchers have moved on to the construction of distinct gene models for the prediction of clinical outcome or response to specific therapeutic agents.
Most predictor models combine 20 to 100 differentially expressed genes, and not all results have been exactly reproducible by independent research teams. This seems to be due partly to differences in the technique of gene expression analysis and the statistical algorithms applied, but it is also
due to differences in sample selection and clinical definitions of response. If predictor models could be reduced reliably to a minimal number of genes, clinical assessment would be much easier using PCR methods. High prediction accuracy and specificity are vital in many clinical circumstances, particularly when a decision on treatment is based on a particular gene expression signature. Supervised statistical methods are modified continually to filter out irrelevant information and improve the predictive value. Xu et al., for instance, have shown that the integration of artificial neural networks into supervised statistical methods can enhance predictive power [32]. Given the heterogeneity of tumor tissue, it is questionable whether specific models for individual response to treatment, particularly with respect to certain drugs, will be clinically applicable in the near future. However, gene expression analysis has proved valuable in its ability to allocate patients to prognostic risk groups, promoting a differential treatment concept. Furthermore, despite differences observed within particular malignancies, basic concepts of tumor biology and drug resistance seem to have emerged. Repeatedly, genes belonging to functional groups involved in reduced cell proliferation or impaired apoptosis have been found in resistant tumors. Aggressiveness or metastatic potential was associated with genes involved in angiogenesis, cell motility, cell adhesion, or matrix remodeling. Finally, sensitivity to cytotoxic treatment could be attributed to drug-specific pathways. To validate the results reviewed in this chapter and to determine their relevance with respect to future clinical applications, large prospective trials with systematic and standardized analysis of gene expression profiles will be required.
Finally, it is desirable to transfer the merits and insights attained in tumor medicine to non-malignant diseases and capitalize on the exploration of the transcriptome.
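The case for larger trials can be made concrete with a confidence interval. Several of the test sets reviewed in this chapter were very small, for example six of seven rectal tumors or eight of nine bladder tumors predicted correctly. The sketch below computes a 95% Wilson score interval for such a proportion; the choice of the Wilson method is an assumption of this example and was not part of the cited studies.

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width

# Six of seven test cases predicted correctly, as in the rectal cancer study [27].
low, high = wilson_interval(6, 7)
```

For six correct predictions out of seven the interval runs from roughly 49% to 97%, so the nominal 86% accuracy is compatible with anything from near-chance to near-perfect performance.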
REFERENCES
1. Utaida, S., Dunman, P. M., Macapagal, D. et al. Genome-wide transcriptional profiling of the response of Staphylococcus aureus to cell-wall-active antibiotics reveals a cell-wall–stress stimulon. Microbiology 2003; 149: 2719–32.
2. Freiberg, C., Brötz-Oesterhelt, H., and Labischinski, H. The impact of transcriptome and proteome analyses on antibiotic drug discovery. Curr. Opin. Microbiol. 2004; 7: 451–59.
3. March, R. Pharmacogenomics: the genomics of drug response. Yeast 2000; 17: 16–21.
4. Seiler, P. U., Stypmann, J., Breithardt, G., and Schulz-Bahr, E. Real-time RT-PCR for gene expression profiling in blood of heart failure patients: a pilot study. Basic Res. Cardiol. 2004; 99: 230–8.
5. Golub, T. R., Slonim, D. K., Tamayo, P. et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999; 286: 531–7.
6. Margalit, O., Somech, R., Amariglio, N., and Rechavi, G. Microarray-based gene expression profiling of hematologic malignancies: basic concepts and clinical applications. Blood Rev. 2005; 19: 223–34.
7. Rosenwald, A., Wright, G., Chan, W. C. et al. Lymphoma/Leukemia Molecular Profiling Project. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N. Engl. J. Med. 2002; 346: 1937–47.
8. Alizadeh, A. A., Eisen, M. B., Davis, R. E. et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000; 403: 503–11.
9. Shipp, M. A., Ross, K. N., Tamayo, P. et al. Diffuse large B-cell lymphoma outcome prediction by gene expression profiling and supervised machine learning. Nat. Med. 2002; 8: 68–74.
10. Hans, C. P., Weisenburger, D. D., Greiner, T. C. et al. Confirmation of the molecular classification of diffuse large B-cell lymphoma by immunohistochemistry using a tissue microarray. Blood 2004; 103: 275–82.
11. Rosenwald, A., Wright, G., Leroy, K. et al. Molecular diagnosis of primary mediastinal B cell lymphoma identifies a clinically favorable subgroup of diffuse large B cell lymphoma related to Hodgkin lymphoma. J. Exp. Med. 2003; 198: 851–62.
12. Savage, K. J., Monti, S., Kutok, J. L. et al. The molecular signature of mediastinal large B-cell lymphoma differs from that of other diffuse large B-cell lymphomas and shares features with classical Hodgkin lymphoma. Blood 2003; 102: 3871–9.
13. Cario, G., Stanulla, M., Fine, B. M. et al. Distinct gene expression profiles determine molecular treatment response in childhood acute lymphoblastic leukemia. Blood 2005; 105: 821–6.
14. Chiaretti, S., Li, X., Gentleman, R. et al. Gene expression profile of adult T-cell acute lymphocytic leukemia identifies distinct subsets of patients with different response to therapy and survival. Blood 2004; 103: 2771–8.
15. Holleman, A., Cheok, M., den Boer, M. L. et al. Gene expression patterns in drug-resistant acute lymphoblastic leukemia cells and response to treatment. N. Engl. J. Med. 2004; 351: 533–42.
16. Pieters, R., Huismanns, D. R., Loonen, A. H. et al. Relation of cellular drug resistance to long-term clinical outcome in childhood acute lymphoblastic leukaemia. Lancet 1991; 338: 399–403.
17. Druker, B. J., Sawyers, C. L., Kantarjian, H. et al. Activity of a specific inhibitor of the BCR-ABL tyrosine kinase in the blast crisis of chronic myeloid leukaemia and acute lymphoblastic leukaemia with the Philadelphia chromosome. N. Engl. J. Med. 2001; 344: 1038–42.
18. Ottmann, O. G., Sawyers, C., Druker, B. et al., for the International STI571 Study Group. A phase II study to determine the safety and antileukaemic effect of STI571 in adult patients with Philadelphia chromosome positive acute leukemias. Blood 2000; 96: 828a.
19. Hofmann, W. K., de Vos, S., Elashoff, D. et al. Relation between resistance of Philadelphia-chromosome-positive acute lymphoblastic leukaemia to the tyrosine kinase inhibitor STI571 and gene expression profiles: a gene expression study. Lancet 2002; 359: 481–6.
20. van’t Veer, L. J., Dai, H., van de Vijver, M. J. et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002; 415: 530–6.
21. Chang, J. C., Wooten, E. C., Tsimelzon, A. et al. Gene expression profiling for the prediction of therapeutic response to docetaxel in patients with breast cancer. Lancet 2003; 362: 362–9.
22. Iwao-Koizumi, K., Matoba, R., Ueno, N. et al. Prediction of docetaxel response in human breast cancer by gene expression profiling. J. Clin. Oncol. 2005; 23: 422–31.
23. Ayers, M., Symmans, W. F., Stec, J. et al. Gene expression profiles predict complete pathologic response to neoadjuvant paclitaxel and fluorouracil, doxorubicin and cyclophosphamide chemotherapy in breast cancer. J. Clin. Oncol. 2004; 22: 2284–93.
24. Bertucci, F., Finetti, P., Rougemont, J. et al. Gene expression profiling for molecular characterization of inflammatory breast cancer and prediction of response to chemotherapy. Cancer Res. 2004; 64: 8558–65.
25. Hippo, Y., Taniguchi, H., Tsutsumi, S. et al. Global gene expression analysis of gastric cancer by oligonucleotide microarrays. Cancer Res. 2002; 62: 233–40.
26. Bertucci, F., Salas, S., Eysteries, S. et al. Gene expression profiling of colon cancer by DNA microarrays and correlation with histoclinical parameters. Oncogene 2004; 23: 1377–91.
27. Ghadimi, M., Grade, M., Difilippantonio, M. J. et al. Effectiveness of gene expression profiling for response prediction of rectal adenocarcinomas to preoperative chemotherapy. J. Clin. Oncol. 2005; 23: 1826–38.
28. Kurokawa, Y., Matoba, R., Nagano, H. et al. Molecular prediction of response to 5-fluorouracil and interferon-α combination chemotherapy in advanced hepatocellular carcinoma. Clin. Cancer Res. 2004; 10: 6029–38.
29. Nakamura, T., Furukawa, Y., Nakagawa, H. et al. Genome-wide cDNA microarray analysis of gene expression profiles in pancreatic cancers using populations of tumor cells and normal ductal epithelial cells selected for purity by laser microdissection. Oncogene 2004; 23: 2385–400.
30. Akada, M., Crnogorac-Jurcevic, T., Lattimore, S. et al. Intrinsic chemoresistance to gemcitabine is associated with decreased expression of BNIP3 in pancreatic cancer. Clin. Cancer Res. 2005; 11: 3094–101.
31. Takata, R., Katagiri, T., Kanehira, M. et al. Predicting response to methotrexate, vinblastine, doxorubicin, and cisplatin neoadjuvant chemotherapy for bladder cancers through genome-wide gene expression profiling. Clin. Cancer Res. 2005; 11: 2625–36.
32. Xu, Y., Selaru, F. M., Yin, J. et al. Artificial neural networks and gene filtering distinguish between global gene expression profiles of Barrett’s esophagus and esophageal cancer. Cancer Res. 2002; 62: 3493–7.
Index
Note: page numbers in italics refer to figures and tables 50 30 ratio 58 1b-adrenergic receptor 83 2a-adrenergic receptor 201 absolute difference 60 acute lymphoblastic leukemia (ALL) 106–7 adults 216–24 blast expression patterns 218 B-precursor 121–5 characterization 121–5 class discriminator genes 125 continuous complete remission 219–20, 220 cytogenetic aberrations 124 cytotoxic agent response 221–4 diagnosis 118–19 distinction from acute myeloid leukemia 108, 118 drug resistance genes 222, 223 drug sensitivity 222 gene expression patterns 121–4 signatures 219 gene signature of relapse 219 immunologic classification combination with cytogenetic findings 123–4 individual risk assignment 217 minimal residual disease measurements 216, 218, 219 MLL gene aberrations 121–2, 123 molecular marker expression 216–17 pediatric 216–24 predictor genes 219–20 principal component analysis 222 protein expression data 118–19 relapse gene signature 219 prediction models 216–24 response prediction 124–5, 219 risk assignment 218 RNA abundance correlation with protein expression 118–19
238
subclassification 118–19 subgroup heterogeneity 218 subtypes 121–4, 217 supervised hierarchical clustering 222 T-precursor 121–5 treatment response models 216–24 acute megakaryocytic leukemia 113–14 acute myeloid leukemia (AML) 106–7 biologic heterogeneity 121 class discovery 107, 117–18 class prediction 108–17 with complex aberrant karyotype 114–15 cytochemistry 109 cytogenetic aberrations 111–15, 120 diagnosis 118–19 distinction from acute lymphoblastic leukemia 108, 118 FLT3 gene 115, 116–17, 120, 121 gene expression profiling 109, 110–11 disease with genetic aberrations 115–18 genetic abnormalities 111–15 with genetic aberrations 115–18 genetic abnormality identification 111–15 genetic clustering 121 genetic profiling 119–21 HOX genes 111, 114, 115–16 M3 subtype 110 M3v subtype 110 M5a and b 110–11 microarray analyses 107–21 MLL gene 113–14, 115, 116, 121 mRNA correlation with protein levels 119 principal components analysis 113, 116 prognostication based on genetic profiling 119–21 protein correlation with mRNA levels 119 RAD genes 114–15 RNA abundance correlation with protein expression 118–19 subclassification 118–19
239
Index
subtypes 110–11, 114–15 cytogenetically defined 111–12, 120 therapy-induced 217, 218 trisomy 8 114 U133A microarrays 111–12 acute promyelocytic leukemia 113–14 adaptive shape method 51 Affymetrix oligonucleotide microarrays 10–12, 57 accuracy 18 data analysis 52, 53 data quality 42 probe redundancy 205–6 target-labeling 13, 15–16 Agilent oligonucleotide microarrays 11–12 comparison with other platforms 19 data quality 42 target-labeling 13–15 Akt serine-threonine kinase 81–2 amino-allyl deoxyuridine triphosphate (AA-dUTP) 13–14 cAMP response element modulator (CREM) 190 angiotensin II 86–7 anti-apoptotic genes 94–5 antibiotics, bacterial sensitivity testing 212 apelin-angiotensin receptor-like 1 (APJ) 94 APLP2 gene 123 apoptosis 81–2 impaired 234 Applied Biosystems Expression Array System, target-labeling 16–17 aromatase inhibitors 151–2 array-based Klenow enzyme (RAKE) assay 21–2 asparaginase 221–3 ATM gene, DNA damage in B-CLL 167–8 atrial fibrillation, animal models 87 atrial natriuretic peptide (ANP) 89–90 ablation 83 cardiomyopathy 90–1 precursor 89–90 bacteria sensitivity testing to antibiotics 212 -adrenergic receptor ( -AR) signaling system 81 abnormalities 83–4 -AR kinase ( ARK) 83–4 B-cell chronic lymphocytic leukemia (B-CLL) 163–7 ATM gene 167–8 fludarabine therapy 168 gene expression profiling 164–7, 166 gene expression signatures in response to DNA damage 167–8 heterogeneity 163 IgVH gene mutations 163–6 p53 gene 167–8 ZAP-70 gene 164–7 B-cell receptor signaling 181 BCL2 overexpression in follicular lymphoma 173
between slide normalization 51–2 binary trees 64–6 Bioconductor program 49 hierarchical clustering 64 bladder cancer 232–3 BNIP3 gene 230–2 bone natriuretic protein 89–90 Bonferroni correction 62 BRCA1 gene 136–41 estrogen receptor interactions 139 gene expression profiling 137 gene expression signature 138–9 hypermethylation 137–8 methylation analysis 138 mutations carriers 138–9 frequency in hereditary breast cancers 139–41 screening 136–7 BRCA1-like tumors 137–8 BRCA2 gene 136–41 gene expression profiling 137 mutations frequency in hereditary breast cancers 139–41 screening 136–7 BRCAx families 139–41 class discovery 140 breast, cell-type expression profiles 144–6 breast cancer 132, 224–7 adjuvant chemotherapy eligibility 225 basal tumors 138–9 basal-like subgroup 143–6 biological characteristics at diagnosis 142 biomarkers 142–3 classification 143 clinical outcome prediction 142–53 clinico-genomic model 146–7 comparative genomic hybridization arrays 154 data integration from other genomic technologies 153–4 discriminator genes 227 estrogen receptor positivity/negativity 147–8 76-gene signature 148–9 response prediction to hormonal treatment 149 tamoxifen biomarker need 151–2 gene expression profiling 132, 133–41, 135, 143–4 lymph nodes metastases 147 gene expression signature assay in tamoxifen treatment 150–1 70-gene prognostic classifier 146, 147, 155 clinical trial 155–6 gene signatures 224–5 2-gene test 150 hereditary 136–41 comparative genomic hybridization 139–41 subgroups 139–41
240
Index
breast cancer (cont.) histological type correlation with transcriptional profiles 141 host immune system in tumor eradication 227 inflammatory 227 intrinsic gene set 147 invasive ductal 133–6, 141 invasive lobular 141 luminal-like subgroup 143–6 lymph node status prediction 146–7 metastatic potential 141–2 methylation patterns 154 microarrays in clinical practice 154–6 molecular signatures 143 multigene predictor of pathologic complete response 226 mutation screening 136–7 neoadjuvant chemotherapy response 224, 225–6 phenotypes gene expression profile associations 133–41 somatic genetic changes 153–4 prognosis 143–9, 224–5 predictors 155 protein spectra 154 proteomics 154 recurrence risk 143 likelihood quantification in tamoxifen therapy 150–1 relapse prediction 146–7 prevention 224 response to therapy 149–53 SNP identification 154 sporadic 133–6 sporadic BRCA1-like tumors 137–8 subgroups 136, 143–6, 154 survival 143, 145, 224 treatment gene expression-based response predictor 149–50 response predictor 226 tumor aggressiveness 141–2 wound response signature 147 brightness/dimness metric 58 calsequestrin (CSQ) 83–4 cAMP response element modulator (CREM) 190 cardiac conductance abnormality, animal models 87 cardiac development, heart failure 88–9 cardiac dilation 81 cardiac hypertrophy 81 heart failure 81 pharmacologic induction 86–7 cardiac phenotype prediction 84 cardiac signal transduction cascade aberrations 81 CardioChip 90
cardiomyocytes, apoptosis 82 cardiomyopathy animal models 81 apoptotic 81 dilated 83–4, 90–1 anti-apoptotic genes 94–5 differential gene expression 91 hypertrophic 90–1 ischemic 90–1, 95–7 non-ischemic 95–7 types 90–1 cardiovascular disease see heart failure CART program 70–1 CD14 gene 121 CD20 cluster 174 CD22 gene, acute lymphoblastic leukemia 122 CDKN1B gene 227 cDNA microarrays, spotted 9–10, 51–2 cardiovascular-based 90 comparison with other platforms 19–20 data quality 42 gene expression accuracy measurements 43 developing mouse heart 88–9 left ventricular assist device 94 peripheral blood mononuclear cell profiling 97–8 target-labeling 13–14 CDw52 cluster 174 charge-coupled device (CCD) cameras 17 chemiluminescence, target-labeling 17 cisplatin, bladder cancer response 232–3 classification and regression trees (CART) 70–1 classification techniques 69–75 K-nearest neighbor 71 logistic regression 70 model assessment/validation 73–5 classification trees 69 CLN2 gene 123 CLUSTER program for hierarchical clustering 64 clustering methods 63–9 hierarchical 63–6, 110 acute lymphoblastic leukemia 222 data sets 65 c-myc expression in follicular lymphoma 173–4 CodeLink bioarray platform 12 target-labeling 13, 15 colorectal cancer 228–9 class prediction 229 lymph node involvement 228 preoperative chemotherapy 229 subgroups 228 comparative genomic hybridization (CGH) arrays 21 breast cancer 154 hereditary 139–41
computation expression indices 50–1 CREM-17X 190 cross-validation 73–4 CST3 gene 123 Cy3 and Cy5 dyes 13–14 cyanine dyes 13–14, 30 cyanine images comparison with third dye images 30 quality measures 32–3 cyclin D1 gene in mantle cell lymphoma 168–72 cyclophosphamide, neoadjuvant chemotherapy 226 cytogenetics 106 cytokeratins (CKs) 135, 144 cytokines, pancreatic beta-cell toxicity 191 cytotoxic treatment sensitivity 234 data acquisition 37 microarrays 28–31, 50–1 data filtering logR-Q plot 35–6, 44 quantitative 31–6 data normalization 37 logR-Q plot 35–6 process 51–2 quantitative 31–6 data processing 50–8 data sets 8 hierarchical clustering 65 data visualization 63–9 daunorubicin 221–3 D-binding protein (DBP) 84 dChip algorithm 49, 52, 53–4, 55 hierarchical clustering 64 outliers 56 decision trees 70–1 dendrograms 64–6 dendrimer detection (3DNA) reagents 14 dendrimers 14 diabetes 187–8 type 1 187 type 2 187–8 differential expression of genes 33 diffuse large B-cell lymphoma 174–5, 178–81 activated B-cell-like type 178–81 biodiversity 213–16 gene expression profiling 178–81 germinal center B-cell signature 178 germinal center-like type 178–81 heterogeneity 178 microenvironment role 181 outcome prediction 214, 215–16 predictor model 216 prognostic groups 180–1 subgroups 180, 214, 215 type 3 cells 214 Digital Micromirror Device (DMD) 12–13
Digoxigenin (DIG) RT Labeling Kit 17 disease gene expression profiles 212 prognostic assessment 212 DNA damage B-cell chronic lymphocytic leukemia gene expression signatures 167–8 repair pathways 229 DNA microarray technology 9 DNA-Chip Analyzer (dChip) program 49 see also dChip algorithm docetaxel response predictor 226 therapeutic response prediction with gene expression profiling 152–3 doxorubicin bladder cancer response 232–3 neoadjuvant chemotherapy 226 drug resistance genes 223, 234 drug therapy 212–13 prognostic assessment 212 dye-swap design 76 endothelial progenitor cells in heart failure 98 ERBB2 overexpression 135 ERK signaling 94 estrogen receptors (ER) 132, 133 BRCA1 interactions 139 breast cancer positivity/negativity 147–8 76-gene signature 148–9 response prediction to hormonal treatment 149 expression 133–5 Euclidean distance 71, 72 experimental design, statistical analysis 75–7 expressed sequence tags (ESTs) 8, 9 CardioChip 90 Mouse Transcriptome Microarray 87–8 false discovery correction 62–3 family wise error rate (FWER) 62 FAS ligand upregulation 230 fetal gene program in heart failure 88–9 fixed circle method 51 flow cytometry, multiparameter 118–19 FLT3 gene 115, 116–17 length mutation 116–17, 120, 121 point mutation 121 fludarabine 168 fluorescein non-invasive third dye 28, 30 fluorescent in situ hybridization (FISH) 106 5-fluorouracil hepatocellular carcinoma response 229–30 neoadjuvant chemotherapy 226 FN1 gene in gastric cancer 228 fold change statistic 59–60 confidence interval 60–1
follicular lymphoma 173–7 BCL2 overexpression 173 cell cycle control genes 177 clinical aggressiveness 177 c-myc expression 173–4 DNA synthesis genes 177 gene expression profiling 173–7 gene expression signatures in survival prediction 175 immune-response signatures 175, 176, 176–7 metabolic activity genes 177 p38 MAPK overexpression 174–5 survival 173 prediction 175 Framingham risk score 96 G coupled receptor 81 gastric cancer lymph node dissemination 228 response predictors 227–8 GCRMA algorithm 52 gemcitabine 230–2 gene expression accuracy measurements 40–2, 41, 42, 43 gene filtering 58–63 hierarchical clustering 66 linear discriminant analysis 72 observational metrics 59–60 permutation tests 75 principal components analysis 68, 69 gene shaving 69 GeneChips 10–11 leukemia diagnosis 126 target-labeling 13, 15–16 GENECLUSTER program 67 GenMAPP program 81 germinal center B-cell signature, diffuse large B-cell lymphoma 178 glucagon receptor, hybridization signals 198–201 glucagon-like peptide 1 (GLP-1) 188, 201 glucolipotoxicity response 190 glucose-dependent insulinotropic peptide (GIP) 188 glucose-dependent insulinotropic peptide (GIP) receptors 201 G-protein coupled receptors 201 HCK gene 121 heart failure 80 adrenergic receptors 83 animal models 81, 83–4 apoptosis 81–2 atrial natriuretic peptide ablation 83 cardiac gene expression 84 cardiac hypertrophy 81 classification 85
endothelial progenitor cells 98 end-stage 91–2 final common pathway 96 expression profiling by microarrays 99 fetal gene program 88–9 gene expression 89–90, 96–7 human 89–91 left ventricular assist devices 91–8 microarray analysis 99 molecular profile definition 89 peripheral blood mononuclear cell profiling 97–8 pharmacologic induction 86–7 prediction 84, 85 pressure overload 86 risk score 96 heart gene expression database, mouse 87–8 heat maps 64–6 hepatocellular carcinoma 229–30 predictive genes 231 HER2/neu overexpression 132 Herceptin® 132 hierarchical cluster analysis 110 see also clustering methods, hierarchical Holm step-down method 62 homotypic experiments 31 host immune system in breast cancer tumor eradication 227 HOX genes in acute myeloid leukemia 111, 114, 115–16 HOXB13 gene 150 HuGeneFL arrays acute lymphoblastic leukemia 124 acute myeloid leukemia with trisomy 8 114 Human Genome Survey Microarray 12 ICER-1 190 IGFB-P5 82 IGHM gene in acute lymphoblastic leukemia 122 IL-10 receptor gene 172 IL17BR gene 150 imatinib 106–7, 124 acute lymphoblastic leukemia response 223–4 immune-response signatures 175, 176, 176–7 immunoglobulin heavy chain (IgVH) genes 163–4 mutations 163–6 immunophenotyping, multiparameter 106 INK4a/ARF locus deletions in mantle cell lymphoma 170–2 INS1 cells 195–8 insulin mRNA 201–2 insulin production 188 intensity ratio measurements, bias 31–2 interferon (IFN-), hepatocellular carcinoma response 229–30 International Prognostic Indicator (IPI) 214
intrinsic gene set 147 islets of Langerhans 193 contaminating cells 195–8 isoproterenol 86–7 Klenow enzyme, array-based (RAKE) assay 21–2 K-nearest neighbor classification 71 learning supervised 69–75 unsupervised 63–9 left ventricular assist devices 93 bridge-to-transplant/recovery 91–2 gene expression profiling 92–4 human heart failure 91–8 leukemia 106–7 acute megakaryocytic 113–14 class prediction with Support Vector Machines 126 diagnosis 106–7, 125–6 gene expression profiling 125–6 see also acute lymphoblastic leukemia; acute myeloid leukemia; B-cell chronic lymphocytic leukemia (B-CLL) leukemogenesis, RRAS2 gene 122 LHFPL2 genes 122 linear discriminant analysis (LDA) 69, 72 Li-Wong algorithm 52, 53–4 logistic regression technique 70 logR-Q plot 34, 35–6, 44 LOWESS technique 32 R-Q normalization 35–6 lymph nodes metastases 147 status prediction in breast cancer 146–7 Lymphochip cDNA microarrays 164, 214 lymphoid malignancies 162–3 lymphoma gene expression-based survival predictors 162–3 molecular classification 162–3 primary mediastinal B-cell 216 see also diffuse large B-cell lymphoma; follicular lymphoma; mantle cell lymphoma malignant disease aggressiveness 234 class membership prediction 213 clinical course 212–13 diagnosis 106–7 response prediction 213 treatment response 212–13 see also metastases MA-LOWESS normalization 32, 36 MammaPrint® 155 mantle cell lymphoma 162–3, 168–72 clinical heterogeneity 169
cyclin D1 gene 168–72 gene expression profiling 169–72 IL-10 receptor gene 172 INK4a/ARF locus deletions 170–2 prognostic information at diagnosis 170 proliferation signature 170–2 SPARC gene overexpression 172 subgroups 169–70 survival predictor 170, 171 MAS4 algorithm 52, 52 MAS5 algorithm 52, 52, 53 brightness/dimness metric 58 Maskless Array Synthesizer (MAS) technology 12–13 Matarray 30–1 cyanine image analysis 32 third dye images 35 matrix metalloproteinase 2 (MMP-2) 82 matrix metalloproteinase 9 (MMP-9) 82 mercaptopurine 124–5 meta-genes in breast cancer 146–7 metastases breast cancer 141–2 lymph nodes 147 metastatic potential 234 methotrexate acute lymphoblastic leukemia 124–5 bladder cancer response 232–3 methylation analysis, BRCA1 137–8, 138 microarray(s) 8–9 accurate information acquisition 28–31 breast cancer clinical practice 154–6 data acquisition 28–31, 50–1 data variability 27–8 gene expression accuracy measurements 40–2, 42, 43 heart failure gene expression 96–7 known RNA input ratios 38 measured output ratios 38 noise 27–8 protocol 8–9 quantitative quality control 27–8 secondary validation 99 uses 43 weighted t-test 38–40 whole genome 21 microarray data analysis 47 acute myeloid leukemia 107–21 components 48 data processing 50–8 heart failure 99 software packages 48–9 statistical models 61 Microarray for Node-Negative Disease may Avoid Chemotherapy (MINDACT) trial 155–6 microarray image analysis 17–18 probe information 19
microarray mRNA expression analysis 188–90 glucose-regulated 190–1 linking of independent data sets 202–4 murine clonal cell line MIN6 190 systematic false positives 192–8 microarray platforms 9–10 accuracy 18–20 choosing 20–1 comparability 16, 19–20 genetic information source/annotation 20–1 manufacturing trends 21 novel applications 21–2 probes 21 reliability 18–20 microarray readers 17 microRNA profiling, high-throughput 21–2 microtubule organization 229 minimal information about a microarray experiment (MIAME) 8 probe information 19 mismatch (MM) probe 11, 52, 54, 55 insulins 1 and 2 201–2 mitogen activated protein kinases (MAPK) 91 MLL gene 113–14, 115 acute lymphoblastic leukemia 121–2, 123 acute myeloid leukemia 116 partial tandem duplication 115, 121 monocrotaline (MCT) 87 mouse heart gene expression database 87–8 Mouse Transcriptome Microarray 87–8 multiple comparison problem 49 multiple testing problem 49 correction 62–3 murine clonal cell line MIN6, microarray mRNA expression studies 190 muscle LIM protein (MLP) 83–4 MYCN gene 121 myeloperoxidase 106 myocardial infarction, animal models 84–6 myosin regulatory light chain (MLC-2V) 87 nearest neighbor prediction 69 NetAffx program 122 neural networks (NN) 69, 72–3 NFκB signaling pathway 191 NIX mitochondrial death protein 81 NM23 gene 228 non-Hodgkin’s lymphoma, B-cell 162–3 non-specific esterase 106 nQuery Advisor program 77 Nucleic Acid Sample Amplification/Labeling procedures 14–15 OASL gene 230 Oct-2 gene in gastric cancer 228
oligonucleotide microarrays 10–12 quality control 44 spotted 51–2 Oncotype DX™ 155 oxidative phosphorylation 181 p38MAPK overexpression 174–5 p53 gene, DNA damage in B-CLL 167–8 pancreatic alpha cells 188 molecular phenotype 191–2 pancreatic beta-cells 187–8 cell line surrogate use 193–5 contaminating cells 195–8 cytokine-induced toxicity 191 data sets 202–6 linking of independent 202–4, 204 enriched preparations 193 experimental systems 192–8 FAC purified 198 glucagon receptor hybridization signals 198–201 G-protein coupled receptors 201 molecular phenotype 191–2 mRNA expression 194–195, 203–4 differential islet hormone detection 200 false positive signals 196 G-protein-coupled receptor 199 mRNA marker variation 197 PPAR 1 192 primary cells 193–5 clonal 188–92 probe redundancy on Affymetrix expression arrays 205–6, 206 prolactin effects 192 random false positive avoidance 202–4 systematic false negatives 198–202 systematic false positives 192–8 total pancreatic RNA preparation 193 pancreatic carcinoma 230–2 gene predictive of lymph node metastases 230 pancreatic islets microarray analysis 188–92 microarray mRNA expression studies 188–90 PARK gene 233 pathologic complete response, multigene predictor 226 PAX5 gene in acute lymphoblastic leukemia 122 PCOLCE gene in gastric cancer 228 PDNN algorithm 52 PDX1 beta-cell transcription factor 190–1 per comparison error rate (PCER) 62 perfect match (PM) probe 52, 54, 55 insulins 1 and 2 201–2 peripheral blood mononuclear cell profiling in heart failure 97–8 permutation tests 73, 74–5
peroxisome proliferator activator receptor- 1 (PPAR 1) activation 192 PFN2 gene in gastric cancer 228 phosphatidylinositol 3-kinase/Akt pathway 230–2 pin-and-ring array technology 17 plaid models 69 PMonly expression metric 52 polymerase chain reaction (PCR) 106 prediction models 70–3, 233–4 acute lymphoblastic leukemia relapse 216–24 see also named cancers prediction techniques 69–75 model assessment/validation 73–5 properties 74 prednisolone 221–3 preproinsulin mRNA 201–2 present/absent percentage 56–8, 59 pressure overload 86 principal components analysis (PCA) 64, 67–9, 110 acute lymphoblastic leukemia 222 acute myeloid leukemia 113, 116 plot 68 probes doublets 205–6 microarray platforms 21 mismatch 11, 52, 55 perfect match 52, 55 singles 205–6 specific effect consistency 54 triplet 205–6 see also mismatch (MM) probe; perfect match (PM) probe prolactin, pancreatic beta-cell effects 192 protein expression data 118–19 proteomics in breast cancer 154 PTPRM genes 122 p-value 61 permutation 75 thresholds 62 quality control 43–4 chip metrics 56–8 data 34 gene expression accuracy measurements 43 oligonucleotide microarrays 44 quality score Qf -weighted mean 37–40, 39, 43 quantitative of microarrays 27–8 statistical analysis 55–8 quality score approach 31–6 Qf -weighted mean 37–40, 39, 43 RAD genes in acute myeloid leukemia 114–15 Rand index 66 random forests 71 receiver operating characteristic (ROC), area under the curve (AUC) 73
redox environment, overexpression of genes controlling 153 regression techniques 69 RELA anti-apoptotic factor 233 remodeling, reverse 91, 92 all-trans retinoic acid 106–7 reverse remodeling 91, 92 ribonuclease, pancreatic 193 RMA algorithm 52, 52, 55 RNA abundance, protein expression correlation in AML and ALL 118–19 RNA microarray design 9 RRAS2 gene in acute lymphoblastic leukemia 122 SAM program 63, 87–8, 122 sarcoplasmic/endoplasmic reticulum calcium-ATPase (SERCA) 90–1 ScanAlyze program 51 seeded region growing (SRG) 51 self-organizing maps (SOM) 64, 66–7 signal-background segmentation 29 significance analysis of microarrays (SAM) 63, 87–8, 122 significance level 49–50 single-nucleotide polymorphisms (SNPs) in breast cancer 154 SLC16A3 gene 233 SLIMI1 gene 89 SMC1 gene 229 SPARC gene overexpression 172 splice isoforms 21 Spot program 51 spotted array two-color 76 see also cDNA microarrays, spotted Sprouty 1 94 SPTBN1 gene 121 statistical analysis 47–9 challenges 49–50 classical testing 61 experimental design 75–7 method development 47–9 models for microarray data 61 quality control 55–8 sample allocation 76 sample size 76–7 statistical filtering metrics 60–1 statistical power 50 STI571 see imatinib Support Vector Machines (SVM) 69, 72, 112, 123, 126 tamoxifen 133, 149–52 biomarker need 151–2 breast cancer recurrence risk likelihood quantification 150–1
tamoxifen (cont.) gene expression signature assay 150–1 gene expression-based response predictor 149–50 target-labeling 13–17 taxanes 152–3 neoadjuvant chemotherapy 226 therapeutic response prediction 152–3 third dye array visualization (TDAV) technology 28, 29–31 gene expression accuracy measurements 43 third dye images 29–31 comparison with cyanine 30 quality measures 33–5 thresholding 56 topoisomerase IIα (TOP2α) gene 233 training/test technique 73, 75 trans-aortic coarctation (TAC) 83 pressure overload 86 transplantation, allogenic 106–7 TREEVIEW program 64
trisomy 8, acute myeloid leukemia 114 t-test, weighted 38–40 tubulin overexpression 153 tumour necrosis factor α (TNF-α) 82 tumour necrosis factor (TNF)–NFκB signaling pathway 94–5 U95A arrays 119–20 U133A microarrays 111–12, 126 vinblastine 232–3 vincristine 221–3 VPREB1 gene in acute lymphoblastic leukemia 122 Westfall-Young method 62 within-slide normalization 51–2 wound response signature 147 WT1 gene 121 ZAP-70 gene 164–7, 166 flow cytometry 165