III. Medical Department (M.E., K.K., R.P.), University of Leipzig, D-04103 Leipzig, Germany; Interdisciplinary Center for Clinical Research Leipzig (K.K.), D-04103 Leipzig, Germany; and Department of Nuclear Medicine and Endocrine Oncology (A.K., B.J.), Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Gliwice Branch, 44-100 Gliwice, Poland
Correspondence: Address all correspondence and requests for reprints to: R. Paschke, M.D., III. Medical Department, University of Leipzig, Ph.-Rosenthal-Str. 27, D-04103 Leipzig, Germany. E-mail: pasr{at}medizin.uni-leipzig.de
| I. Introduction |
|---|
B. Methodological evolution
As the sequence information for the whole human genome became available, life scientists were challenged with the task of measuring expression levels on a global scale. For this purpose, cDNA library clones were spotted on large membrane sheets to hybridize radiolabeled cDNA pools (2) generated from total or purified mRNA. This made it possible to measure the signal intensity of hundreds of transcripts. These ancestors of microarrays, called "macroarrays," were characterized by several difficulties; e.g., they were made of porous material, they were difficult to handle, and they often required radioactive labeling techniques. The elimination of these technical drawbacks by the introduction of new carrier materials (i.e., glass slides) and fluorescence-based signal detection in 1995, when the first real cDNA microarray was established (3), formed the basis for a broad application. Subsequently, new developments in automation, robotic spotting, photolithography, fluorescence-based detection, and bioinformatics led to a rapid growth in microarray usage. According to the length of the spotted DNA fragments, all RNA expression arrays can be subdivided into two groups: 1) cDNA microarrays that use approximately 200- to 500-bp double-stranded DNA fragments, which are normally produced by PCR; and 2) oligonucleotide microarrays that use 25 to 70 bp of single-stranded DNA. Both cDNA and oligonucleotide fragments are chemically attached to the glass slides. These arrays are traditionally used with a two-color detection system; i.e., RNA from sample and control is labeled with fluorescent Cy3 and Cy5 dyes and hybridized to the same array. Therefore, these two-color arrays produce a ratio indicating the differential expression between the sample and the control.
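The ratio readout of a two-color array can be illustrated with a minimal sketch. It assumes, purely for illustration, that the sample is labeled with Cy5 and the control with Cy3, that intensities are already background-corrected, and that the ratio is reported on a log2 scale; the function name `log2_ratios`, the `floor` parameter, and the intensity values are hypothetical and not taken from any of the cited studies.

```python
import math

def log2_ratios(sample, control, floor=1.0):
    """Return log2(sample / control) for each spot.

    Intensities below `floor` are raised to `floor` so that the
    logarithm is always defined (an illustrative choice, not a standard).
    """
    return [
        math.log2(max(s, floor) / max(c, floor))
        for s, c in zip(sample, control)
    ]

# Hypothetical spot intensities: sample channel (assumed Cy5), control channel (assumed Cy3).
cy5_sample = [800.0, 200.0, 400.0]
cy3_control = [200.0, 200.0, 1600.0]

ratios = log2_ratios(cy5_sample, cy3_control)
# ratios == [2.0, 0.0, -2.0]: fourfold higher, unchanged, fourfold lower in the sample.
```

On the log2 scale, up- and down-regulation of the same magnitude are symmetric around zero, which is why ratios are usually reported in this form rather than as raw quotients.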
In contrast to these microarrays that are characterized by a deposition of cDNA fragments or oligonucleotides on the slides, Fodor and co-workers (Affymetrix, Santa Clara, CA) (4, 5, 6, 7) pioneered the development of a new in situ synthesis technology that combines photolithography known from the semiconductor industry and chemical DNA synthesis. This technology enabled a further miniaturization of the assay and the manufacturing of high-density oligonucleotide microarrays (GeneChips): the current version of Affymetrix GeneChips (Human Genome U133 Plus 2.0) provides comprehensive coverage of the transcribed human genome on a single array covering over 47,000 transcripts represented by more than 1 million distinct oligonucleotide entities, each of which covers an 11-µm square on the array (called feature size). Moreover, in contrast to the two-color arrays, Affymetrix GeneChips use a standardized biotin-labeling protocol and produce an intensity signal allowing absolute quantification.
The power of microarray studies depends not only on the quality of array design and production but also on the statistical and bioinformatic approaches used to analyze the data. Indeed, the application of different mathematical algorithms can influence the outcome of microarray data analysis enormously, e.g., through differences in the statistical power to detect significantly differentially expressed genes. Early studies and studies employing radioactively labeled macroarrays often used housekeeping genes for data normalization (8) or simple algorithms such as z-score transformation (9, 10). However, normalization to classical housekeeping genes such as glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and β-actin (ACTB) is of limited value and requires special attention because it cannot be excluded that these genes are regulated [e.g., whereas no regulation of these genes has been observed in AFTNs, they do show differential expression in TSH-stimulated primary thyroid epithelial cells (11)]. As an alternative, various preprocessing techniques have been developed: e.g., MAS5.0 (12), dChip (13), Robust Multichip Average (RMA) (14, 15), and GC-RMA (16). However, so far a clear recommendation for the best performing algorithm cannot be given (17). Similar to the high number of preprocessing algorithms, numerous methods of data analysis have been published. They can be classified as methods to detect differentially expressed genes and gene sets in a supervised manner (e.g., empirical filtering, statistical tests such as the t test and F test), unsupervised cluster analysis (e.g., hierarchical clustering, self-organizing maps, k-means clustering), and supervised sample classification [e.g., Recursive Feature Replacement (RFR) (18, 19), Prediction Analysis of Microarrays (PAM) (20)].
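As an illustration of the simplest normalization mentioned above, the following sketch applies a z-score transformation to the intensities of one array. It assumes that intensities are log10-transformed first and that the sample standard deviation is used; both choices, as well as the function name `z_scores` and the example values, are illustrative assumptions rather than the exact procedure of the cited studies (9, 10).

```python
import math
import statistics

def z_scores(intensities):
    """Z-transform the log10 intensities of a single array.

    Each value becomes (log10(x) - mean) / standard deviation, so that
    arrays with different overall brightness share a common scale.
    """
    logs = [math.log10(x) for x in intensities]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)  # sample standard deviation (assumption)
    return [(value - mean) / sd for value in logs]

# Hypothetical raw intensities of three genes on one array.
z = z_scores([100.0, 1000.0, 10000.0])
# z is approximately [-1.0, 0.0, 1.0]
```

Because the transformation uses only the mean and spread of each array, it does not depend on any individual housekeeping gene, which is the practical advantage over normalization to GAPDH or ACTB.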
The introduction of microarrays made it possible to investigate a multitude of parameters (i.e., genes) in a small number of samples (i.e., microarrays). At the same time, it introduced the most challenging problem of microarray analysis, the so-called "problem of multiple comparisons." Therefore, in up-to-date microarray studies, a correction for multiple comparisons [e.g., false discovery rate (FDR), Westfall-Young step-down permutation correction (21), and significance analysis of microarrays (22)] should be mandatory.
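To make the multiple-comparisons correction concrete, the sketch below implements one widely used FDR procedure, the Benjamini-Hochberg step-up adjustment. The text above names FDR only in general terms, so the choice of this particular procedure, the function name `benjamini_hochberg`, the 5% threshold, and the p-values are assumptions made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values from a differential expression test.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
q = benjamini_hochberg(p)

# Genes called significant at an FDR of 5% (threshold chosen for illustration).
significant = [i for i, value in enumerate(q) if value <= 0.05]
```

In this example, five of the eight genes have a raw p-value below 0.05, but only the first two remain significant after adjustment, which shows how strongly an uncorrected analysis can overstate the number of regulated genes.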
Hierarchical cluster analysis has become the most popular and most frequently used multivariate technique to analyze microarray data that results in a complete tree with leaves as individual patterns (genes or experiments) and the root as the convergence point of all branches. However, given enough genes, the genes will always cluster. Therefore, there is only minor scientific value in the fact that there are genes that behave in a similar way. Instead, the interpretation of a common clustering may be crucial. However, sometimes it seems that authors feel the urge to present their microarray results in the form of a cluster diagram, although such clusters are not always meaningful.
Because the conventional diagnosis of cancer is based on the morphological appearance of stained tissues whose analysis requires highly trained pathologists, the introduction of microarrays offered the hope that classification and prediction of cancer by means of their gene expression profiles could be more objective and accurate. Nevertheless, the problem of classification by microarrays is not simple; the genes that make it possible to predict the classes have to be identified from a large number of genes (in a relatively small number of samples). Moreover, it is important to identify which genes contribute most to the classification. For this purpose, several new computational methods have been developed. In addition to algorithms suggested by Golub et al. (23) and Hedenfalk et al. (24), a simple and well-performing approach based on nearest shrunken centroids has been proposed (20). At the moment, support vector machine (SVM) algorithms are the reference method for classification problems in microarray studies, and new promising supervised gene selection algorithms based on the SVM technique (25) [such as recursive feature elimination (26) and RFR (18, 19)] have been introduced.
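The idea behind the nearest shrunken centroid approach can be sketched as follows. This is a deliberately simplified version: each class centroid is moved toward the overall centroid by soft thresholding, and a new sample is assigned to the closest shrunken centroid. The published method (20) additionally standardizes each gene by its within-class variability, which is omitted here; the function names, the threshold `delta`, the class labels, and the expression values are all hypothetical.

```python
def soft_threshold(x, delta):
    """Shrink x toward zero by delta; values within delta of zero become zero."""
    if x > delta:
        return x - delta
    if x < -delta:
        return x + delta
    return 0.0

def shrunken_centroids(samples, labels, delta):
    """Return one shrunken centroid (list of per-gene values) per class."""
    n_genes = len(samples[0])
    overall = [sum(s[g] for s in samples) / len(samples) for g in range(n_genes)]
    centroids = {}
    for cls in sorted(set(labels)):
        members = [s for s, lab in zip(samples, labels) if lab == cls]
        centroids[cls] = [
            overall[g]
            + soft_threshold(sum(m[g] for m in members) / len(members) - overall[g], delta)
            for g in range(n_genes)
        ]
    return centroids

def classify(sample, centroids):
    """Assign the sample to the class with the nearest centroid (squared distance)."""
    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))
    return min(centroids, key=lambda cls: distance(centroids[cls]))

# Hypothetical log expression values: rows are samples, columns are three genes.
samples = [[5.0, 1.0, 3.0], [6.0, 1.2, 3.1], [1.0, 1.1, 2.9], [2.0, 0.9, 3.0]]
labels = ["tumor", "tumor", "normal", "normal"]

centroids = shrunken_centroids(samples, labels, delta=0.5)
predicted = classify([4.5, 1.0, 3.0], centroids)

# Genes whose shrunken centroids still differ between classes drive the classification.
informative = [
    g for g in range(3)
    if centroids["tumor"][g] != centroids["normal"][g]
]
```

With these values, only the first gene keeps a class-specific centroid after shrinkage, so the classifier effectively selects a single discriminating gene. This built-in gene selection is what makes the approach attractive when many genes are measured in few samples.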
An increasing number of studies have used microarrays in the field of thyroidology. These studies investigate (patho)-physiological questions or aim at elucidating the tumors' molecular etiology (Table 1), whereas others look primarily for genetic markers that could improve the differential diagnosis of thyroid tumors. Both platforms, spotted cDNA and oligonucleotide microarrays with a two-color or radioactive detection, and in situ photolithographically synthesized oligonucleotide arrays with a one-color detection, have been used to analyze the gene expression profiles of malignant thyroid tumors [i.e., follicular thyroid carcinoma (FTC) (27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37) and papillary thyroid carcinoma (PTC) (31, 32, 33, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50)], and benign thyroid tumors [such as autonomously functioning thyroid nodules (AFTNs) (8, 11, 48, 51, 52, 53) and cold thyroid nodules (CTNs) (8, 48, 51, 54)].
| II. New Perspectives Generated by Gene Expression Profiling of Benign and Malignant Thyroid Tumors |
|---|
For the adhesion-controlling PTC genes, it is well known that they participate in both invasion and metastasis and are involved in inflammation. Borrello et al. (56), who investigated the specific gene expression patterns of primary human thyrocytes transfected with the RET/PTC1 oncogene, could show that this oncogene activates the expression of a large set of genes involved in inflammation and tumor invasion, such as chemokines, matrix-degrading enzymes, and adhesion molecules. Interestingly, they also found this expression pattern in specimens of PTC harboring the RET/PTC rearrangement and pT4N1 presentation, which validates the in vivo relevance of their in vitro results. Therefore, and because PTCs are often associated with chronic inflammatory thyroiditis, the authors concluded that the inflammation-associated genes activated by RET/PTC1 most likely contribute to tumor progression and locoregional metastasis (56). This conclusion remains in some conflict with clinical data showing better outcome in PTC cases with concomitant lymphocytic infiltration (60). In the PTC gene expression profile, a very distinct expression pattern of immune response genes was observed, which follows in terms of intensity the pattern that is characteristic for the difference between tumor and normal tissue (44).
The molecular background of PTC was comprehensively investigated in a recent study of Melillo et al. (58). Using a broad methodology, they showed that the oncogenic proteins (i.e., RAS, BRAF, RET/PTC rearrangements) involved in the initiation of PTC signal within one cascade, namely the RAS-BRAF-MAPK-pathway. Nearly two thirds of the targets that are under the control of RET/PTC were shown to be regulated by the RAS-BRAF-MAPK pathway. This finding is supported by the gene expression study of Frattini et al. (43), which reveals similar global gene expression profiles of PTC harboring different genetic lesions. Although the molecular etiology of about 70% of all PTC is characterized by activating mutations or chromosomal rearrangements of BRAF, RET, or RAS (43, 61, 62, 63), a changed expression pattern of genes associated with the RAS-BRAF-MAPK pathway has not been described in human PTC microarray studies (31, 32, 33, 38, 40, 42, 43, 44, 45, 46). However, in a recent reanalysis of the data sets of Huang et al. (38) and Jarzab et al. (44) we did identify a significantly increased expression pattern for various genes of this pathway by GenMAPP analysis (48). Moreover, gene expression studies of papillary carcinoma cell lines harboring the RET/PTC3 rearrangement, a mutant HRAS (v-Ha-ras) or a mutant BRAF oncogene attributed the differential expression of numerous PTC markers, like LGALS3 and DUSP6, to the activation of the RAS-BRAF-MAPK pathway (58).
Although gene expression profiles of PTC with different genotypes (i.e., RAS or BRAF mutation, RET/PTC rearrangements) share a high similarity, they are not identical (Table 2). Melillo et al. (58) could show that a large number of genes are specifically modulated by only one or two of the three oncogenes (i.e., RET/PTC, RAS, and BRAF). Therefore, the three oncoproteins, which are part of a single signaling pathway, are each able to trigger specific signals in addition to the common ones. This conclusion is supported and widened by the gene expression profiling of Giordano et al. (46), who could show strong relationships not only between gene expression and PTC morphology, but also between the gene expression pattern and the occurrence of RET/PTC rearrangements and BRAF or RAS mutations. Therefore, it is very likely that the activation of, or interaction with, additional or alternative signaling pathways forms the molecular basis of tumors and causes discrete mutation-specific phenotypic features (e.g., morphology).
Differences in the gene expression profile of PTC variants were indicated by Giordano et al. (46), who showed that PTCs of classic, follicular, and tall cell variants did cluster separately by principal component analysis. The difference between classic and tall cell variant was also described by Wreesmann et al. (47). On the other hand, the differences between the classic and follicular PTC variant are less visible (40), although some authors published lists of genes differentiating both variants (32).
The various groups specify different markers for the molecular diagnosis of PTC, such as SFTPB and CITED1 (38); ADM (adrenomedullin), TROP2 (tumor-associated calcium signal transducer 2), and NRP2 (neuropilin 2) (40); the gene set of KIT (v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog), CDH1 (cadherin 1), LSM7 (LSM7 homolog, U6 small nuclear RNA associated), SYNGR2 (synaptogyrin 2), FAM13A1 (family with sequence similarity 13, member A1), IMPACT (Impact homolog), and four nameless genes (42); and a 20-gene classifier (44) as well as gene lists comprising hundreds of genes (33, 40). This issue will be addressed below.
B. Follicular thyroid carcinoma (FTC)
Gene expression profiles of FTC were analyzed in several microarray studies, with the aim to improve the molecular differentiation between FTC and follicular adenomas and to elucidate the molecular etiology of FTC further (27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37).
In an early investigation, Barden et al. (29) compared the gene expression profiles of FTC and follicular adenoma using Affymetrix GeneChips and found strikingly distinct gene expression patterns for both entities. They identified 105 differentially expressed genes between FTC and follicular adenoma including ADM, GPC1 (glypican 1), TGFA (TGF-α), MET and IGFBP3 (IGF binding protein 3) (up-regulated in FTC), and DIO1, CREM (cAMP-responsive element modulator), FBLN5 (fibulin 5), and MT1 (metallothionein 1) (down-regulated in FTC). An alternative approach of microarray analysis that aims at identifying the minimal number of discriminating genes appears more promising for diagnostic purposes and could also lead to further elucidation of the molecular etiology. Weber et al. (30) applied linear discriminant analysis to their data set of FTC and benign thyroid samples to find the best combination of genes discriminating the different entities. Using this approach, they identified the combination of CCND2 (cyclin D2), PCSK2 (proprotein convertase subtilisin/kexin type 2), and PLAB (growth differentiation factor 15) that allowed accurate differentiation between benign and malignant follicular thyroid neoplasia with a high sensitivity and specificity. Cerutti et al. (71) have specified another set of genes, based on serial analysis of gene expression (SAGE), which comprised DDIT3 (DNA-damage-inducible transcript 3), ARG2 (arginase, type II), ITM1 (integral membrane protein 1), and C1orf24. Although these genes were primarily selected for diagnostic purposes (with the aim to differentiate between follicular adenomas and carcinomas), they should also be considered for their significance regarding the biology of FTC.
By a combination of microarray data and previously published loss of heterozygosity data of FTC and normal thyroid samples, Aldred et al. (27) identified five new putative tumor suppressor genes. Three of them belong to a set of most consistently down-regulated genes [CAV1 (caveolin 1), CAV2 (caveolin 2), and GDF10 (growth differentiation factor 10)], whereas GPC3 (glypican 3) and CHRDL (chordin-like 1) might be functionally related to GDF10. Although both caveolins are down-regulated in FTC, their molecular mechanisms of down-regulation remain unclear. Loss of heterozygosity was found in a subgroup of tumors, but the authors could not identify mutations in either gene, and furthermore, the methylation status of the caveolin-1 promoter did not correlate with the expression pattern (27). The investigation of GDF10, GPC3, and CHRDL revealed similar results. Moreover, GPC3 and CHRDL have also been shown to be down-regulated in AFTNs (52) and CTNs (54). Therefore, these genes most likely do not make it possible to distinguish between benign and malignant thyroid tumors.
Subsequent to the independent analysis of their PTC and FTC microarray data, Huang et al. (38) and Aldred et al. (27) combined their data sets and compared the gene expression profiles of FTC and PTC to better understand the similarities and differences between these two tumor types (31). They identified genes that are characterized by a differential expression between FTC and PTC [CST6 (cystatin E/M), DPYSL3 (dihydropyrimidinase-like 3), G0S2 (G0/G1 switch 2), IGFBP6 (IGF binding protein 6), CITED1, CLDN10 (claudin 10), CAV1, and CAV2], and an additional 28 genes including TG (thyroglobulin), TPO, GPC3, DUSP1, APOD (apolipoprotein D), MT1G (metallothionein 1G), and CRABP1 (cellular retinoic acid binding protein 1) that are characterized by a decreased expression in both entities, which might suggest a common origin of FTC and PTC. However, this conclusion might be overstated because TG, TPO, MT1G, and CRABP1 are characterized by an increased expression in AFTNs (52), which rather suggests a functional relevance for thyroid hormone production. In contrast, the decreased expression of GPC3 and APOD, which has also been shown in AFTNs (52) and CTNs (54), most likely reflects increased proliferation that is inherent to all tumor entities.
In addition to activating RAS mutations, which have been found in follicular neoplasms (72), a fusion oncogene between PAX8 and PPARγ has been identified in up to 63% of FTC (73), but also in a number of follicular adenomas (74, 75, 76). However, the knowledge about the detailed molecular events by which PAX8-PPARγ contributes to tumor development is so far limited (77). Therefore, to identify molecular signatures of FTC with and without the PAX8-PPARγ fusion oncogene, Lui et al. (36) performed a gene expression profiling of both subtypes. In their study, Lui et al. could show a highly uniform expression profile in the FTC characterized by a PAX8-PPARγ fusion oncogene that differed significantly from those of the FTC without the fusion oncogene. On the basis of these results, they reasoned that FTCs with a PAX8-PPARγ fusion oncogene form a distinct biological entity within the FTC. Moreover, this conclusion is in line with other findings. French et al. (78) demonstrated distinct clinicopathological features in FTC with a PAX8-PPARγ rearrangement. Nikiforova et al. (79) proposed two distinct pathways involved in the development of FTC (initiated by either RAS point mutations or PAX8-PPARγ rearrangements). A gene expression profiling study of Lacroix et al. (34) also identified specific gene expression patterns associated with the PAX8-PPARγ rearrangement. In addition, the assumption of a distinct PAX8-PPARγ pathway is supported by a differential expression of signal transduction genes [e.g., increased: EPOR (erythropoietin receptor), ADCY9 (adenylate cyclase 9), EPAC (Rap guanine nucleotide exchange factor 3), and DUSP6; decreased: HINT1 (histidine triad nucleotide binding protein 1), RAP1GA1 (RAP1 GTPase activating protein), and CSNK2B (casein kinase 2)] and proliferation-associated genes [e.g., CTBP2 (C-terminal binding protein 2), FSCN1 (fascin homolog 1), and CCND1 (cyclin D1)] in FTC with a PAX8-PPARγ rearrangement (36). Recently, Giordano et al. (28) studied the implications of the PAX8-PPARγ fusion protein for the neoplastic mechanism of FTC by a comprehensive investigation of gene expression profiles of PAX8-PPARγ-positive FTC in comparison to PAX8-PPARγ-negative FTC, along with other common thyroid tumors such as follicular adenomas, oncocytic carcinomas, oncocytic adenomas, and papillary carcinomas. PAX8-PPARγ-positive FTC could be identified by examining the microarray data for increased expression of PPARγ (28). Moreover, principal component analysis of all follicular neoplasms revealed a separation of the PAX8-PPARγ-positive FTC from the PAX8-PPARγ-negative FTC and the follicular adenomas, suggesting that the PAX8-PPARγ rearrangement is the predominant source of variation in gene expression within this set of tumors (28). Interestingly, in their study Giordano et al. (28) could show that the PAX8-PPARγ fusion protein can function in a PPARγ-like manner, although it also has transcriptional properties distinct from either PAX8 or PPARγ. These findings are contrary to the original description of the PAX8-PPARγ fusion protein (73). Kroll et al. (73) showed that the fusion protein does not activate PPAR-responsive promoters, but functions as a dominant-negative inhibitor of PPARγ-induced reporter gene activation. Therefore, the inhibition of endogenous PPARγ is said to be an important mechanism by which the PAX8-PPARγ fusion protein causes FTC. However, due to Giordano's findings (28), the concept that the PAX8-PPARγ fusion protein contributes to follicular carcinoma by antagonizing endogenous PPARγ needs reevaluation. Advanced bioinformatic analysis of Giordano's data revealed potential roles of several metabolic pathways in the oncogenic action of the fusion protein. An enrichment of the PAX8-PPARγ expression signature was observed in pathways related to fatty acid oxidation and metabolism, and to amino acid and carbohydrate metabolism. These results are striking because PPARγ regulates adipogenesis and glucose metabolism.
C. Autonomously functioning thyroid nodules (AFTNs)
In an early study using membrane arrays and radioactively labeled cDNA probes, we could show a down-regulation of several signal-transducing components such as IGF-II, the platelet-derived growth factor receptor (PDGFR) and the type III TGF-β receptor (TGFBR3) in AFTNs and CTNs, which suggested a disturbed signaling system (8). In a recent study, we confirmed these findings. The TGF-β signaling cascade is particularly characterized by strong changes in its pattern of gene expression in AFTNs when compared with their normal surrounding tissues (ST): the TGFBR3, SMAD 1, 3, and 4, as well as p300, a transcriptional coactivator, showed a decreased expression in AFTNs, whereas the inhibitory SMAD 6 and 7 showed an increased expression in AFTNs. Moreover, a decreased expression of TGFBR3 and TGFB1 could also be shown at the protein level (52, 80). These findings suggest an inactivation of TGF-β signaling in AFTNs due to a constitutively activated TSH receptor (TSHR) (e.g., resulting from TSHR mutations) (52).
Because the molecular etiology of AFTNs is mainly characterized by a constitutively activated TSHR-cAMP pathway, we analyzed the gene expression profiles of TSH-stimulated primary thyroid epithelial cells for comparison (51). Because these data also showed significant differences in the gene expression profiles of the TGF-β signaling cascade, the findings in the AFTNs are most likely due to the constitutively activated cAMP cascade. This is further supported by investigations of TSH-stimulated porcine thyroid follicles ex vivo (81) showing a decreased TGFB1 mRNA expression. Furthermore, in the context of an increased β-arrestin 2 expression in AFTNs (82), findings of a β-arrestin 2-mediated endocytosis of the TGFBR3 and a subsequent down-regulation of its signaling by Chen et al. (83) support and further elucidate this interpretation. It has been suggested that a down-regulation of lymphocyte- and macrophage-specific genes (52, 53) might reflect a different cellular composition of the AFTNs with a lack of lymphocytes, as discussed by Wattel et al. (53). However, the specific expression patterns of TGF-β signaling-associated genes in thyrocytes are not influenced by the different cellular composition and could be reproduced in TSH-stimulated primary thyroid epithelial cells (51).
Interestingly, two independent studies using different microarray platforms (i.e., spotted glass slides and Affymetrix GeneChips) and different methods of data analysis consistently show an increased expression of thyroid-specific genes such as TPO and DIO1, as well as collagen (type IX, α3), different metallothioneins, and sialyltransferase (SIAT)1 in AFTNs (52, 53). The stringent and prominent SIAT1 expression pattern that we found in our microarray study of AFTNs (52) prompted us to investigate its possible functional relevance further. Subsequent studies characterized a new aspect of posttranslational modification of the TSHR. In cell culture experiments, we demonstrated for the first time that the transfer of sialic acid directly affects TSHR signaling because it improves and prolongs the cell-surface expression of the TSHR (84). Furthermore, microarray investigations of AFTNs and TSH-stimulated primary thyroid cells illustrate a remarkable induction or activation of negative feedback mechanisms such as an up-regulation of phosphodiesterases (11, 51, 53), which is in line with previous findings of Persani et al. (85). Moreover, the presence of negative feedback mechanisms in AFTNs is further supported by an increased expression of G protein-coupled receptor kinases (GRK) in AFTNs and their ability to desensitize the TSHR as shown in in vitro experiments (86). Interestingly, whereas microarray studies of TSH-stimulated primary thyroid cell cultures reveal an increased expression of the regulator of G protein signaling (RGS) 2 (11, 51), which has been shown to reduce the TSHR signaling via inositol-3-phosphate (87), RGS2 is characterized by a decreased expression in AFTNs (11, 51, 88). Such a difference can most likely be explained by defects in the RGS regulation pathway or by additional counter-regulatory mechanisms that occur only in the chronically proliferating AFTNs. Furthermore, in TSH-stimulated primary thyroid cell cultures, an increased expression of CREM, which acts as a repressor of cAMP-induced genes (89), could be observed, although this was not observed by microarray studies of AFTNs (11, 51).
These findings demonstrate that the additional investigation of a compatible cell model (e.g., TSH-stimulated primary thyroid epithelial cells) may confirm and further define and/or explain specific gene expression patterns (e.g., TGF-β signaling cascade) detected in tissue samples (e.g., AFTNs). However, it is obvious that the gene expression profile of the cell model itself can only depict a part of the complex situation present in the tissue samples.
The comparison of the gene expression patterns of AFTNs harboring a somatic TSHR mutation and TSHR mutation-negative AFTNs (52) indicated a number of signal transduction genes [e.g., p21/Cdc42/Rac1-activated kinase (PAK)1 and 2, RGS4 and -6, Janus kinase (JAK)1, and G protein-coupled receptor kinase (GRK)2] which are characterized by a different expression pattern between these two subgroups of AFTNs. These findings indicate strong differences in the signal transduction of AFTNs caused by constitutively activating mutations in the TSHR compared with AFTNs without TSHR mutations and could thus give a lead for the elucidation of their molecular etiology.
Taken together, the global analysis of gene expression profiles of AFTNs showed both an inactivation of TGF-β signaling and a remarkable induction or activation of negative feedback mechanisms in the AFTNs, which could also be confirmed in a compatible cell model, TSH-stimulated primary thyroid epithelial cells. Moreover, the prominent SIAT1 expression pattern in AFTNs provided the basis for the characterization of a new aspect of posttranslational modification of the TSHR.
D. Cold thyroid nodules (CTNs)
Several microarray studies, which were performed with the aim to identify novel diagnostic and clinical markers for differentiated thyroid tumors, used benign thyroid nodules, histologically classified as follicular adenoma or hyperplastic nodules, for comparison with PTC and FTC (29, 30, 32, 33, 35, 37, 40, 42, 49, 90). However, with respect to their function, histologically benign thyroid nodules can be further distinguished as AFTNs or CTNs or, less often, as so-called warm nodules, which do not show detectable differences to the surrounding thyroid tissue by scintigraphy. CTNs constitute the most abundant thyroid nodular lesion with a prevalence of about 80% of all palpable thyroid nodules (91). In contrast to AFTNs (described in Section II.C), the current knowledge concerning their molecular etiology is very limited. Moreover, to date only one study has investigated the array-based gene expression profiles of CTNs classified by scintigraphy (54).
In an early study we could show a down-regulation of several signal transducing components both in AFTNs and CTNs, which seemed to reflect a disturbed signaling system (8). These similarities in the gene expression patterns of AFTNs and CTNs might be attributable to a common property of both benign tumor entities, e.g., increased proliferation (92, 93). To gain a higher resolution that might help to identify specific signaling cascades involved in nodular development, we subsequently compared gene expression profiles of 22 CTNs to their normal surrounding tissue using the U95A Affymetrix GeneChip (54). On the basis of the high number of investigated genes (approximately 10,000 full-length genes) and an improved statistical analysis that made it possible to analyze the significance of differential gene expression within gene sets (e.g., signaling cascades), the molecular pattern of the increased proliferation in CTNs (93) could be further defined. Regulation of gene expression was most consistent for a number of histone mRNAs and gene sets containing cell cycle-associated genes, like cyclin D1, cyclin H/cyclin-dependent kinase (CDK) 7, and cyclin B. Furthermore, these expression data also revealed that contrary to PTCs, altered expression of components belonging to the RAS-MAPK cascade is of minor importance for the development of CTNs because gene sets representing this pathway did not show differential expression in comparison to the surrounding normal tissue. This is in line with findings of Esapa et al. (94), who showed that a general relevance of RAS mutations for the development of follicular adenoma is unlikely. Moreover, these results are supported by findings of Krohn et al. (95) and a recent in vitro study indicating that the dedifferentiated phenotype of CTNs is unlikely to be the result of an activated RAS signaling (96).
Interestingly, the gene set analysis also revealed a significantly altered expression pattern in the group of G protein signaling molecules, which is mainly based on a differential expression of several protein kinase C (PKC) isoforms and an increased expression of the Gqα protein. This expression pattern is especially interesting because it has been shown that thyroid cells undergoing a long-term PKC stimulation are characterized by a general loss of thyroid-specific functions (e.g., loss of iodide transport and thyroglobulin iodination), which is reminiscent of CTNs (97, 98, 99, 100, 101).
The up-regulation of cell cycle genes in the gene expression profile of CTNs (54) is so distinct that it may be used for the differential molecular diagnosis of these tumors, as shown by the RFR algorithm (44). Mainly genes related to proliferation and growth processes were included in a 20-gene molecular classifier (e.g., cyclin-dependent kinase inhibitor 1C, histone 1 H2be, and histone 2 H2aa). Furthermore, the CTN classifier contains genes considered as cancer specific, such as the fibroblast growth factor receptor 1 (FGFR1), found down-regulated by Chevillard et al. (32) in the follicular variant of PTC in comparison with the classic variant; TLE4 (Transducin-like enhancer protein 4), described by Aldred et al. (27) as down-regulated in FTC; and TUSC3 (tumor suppressor candidate 3), overexpressed in PTC (38). In contrast to AFTNs, the CTN multigene signature was also found in some PTC, which could be explained by the partly dedifferentiated and proliferating phenotype that is common to both entities.
Several studies provide evidence that differentiated functions of thyrocytes and of iodide metabolism can be reinduced by retinoic acid (102, 103, 104). Therefore, the significantly changed expression pattern of CRABP1 in CTNs is of special interest (54). CRABP1 encodes a high-affinity cellular retinoic acid binding protein that regulates the availability of retinoic acid for its nuclear receptors and is also involved in retinoic acid catabolism (105). Therefore, the decreased mRNA expression and the decreased protein expression of CRABP1 in CTNs might also impact on the partly dedifferentiated and hypofunctioning phenotype of this tumor entity. Our recent findings showing an increased expression of CRABP1 in AFTNs in comparison to their normal surrounding tissue support this assumption (unpublished observation). Moreover, because a decreased CRABP1 expression has also been shown in PTC (38) and FTC (27) in addition to CTNs, a reduced expression of CRABP1 most likely reflects an early dedifferentiating event in the pathogenesis of thyroid tumors.
Overall, in addition to the interesting finding of decreased CRABP1 expression, gene expression profiling of CTNs allowed identification of the molecular pattern of their increased proliferation that is characterized by a differential expression of several cell cycle-associated genes. Moreover, whereas the RAS-MAPK cascade is most likely of minor importance for the development of CTNs, the gene expression patterns of the Gq-PKC pathway suggest further investigations to define its relevance in relation to the dedifferentiated phenotype of CTNs.
E. Clues for the comparison and differential diagnosis of thyroid nodules, based on the microarray data
Despite the fact that microarray studies revealed very distinct changes in the expression of certain genes, none of the several genes identified by array studies as differentially regulated was proven to be an ideal single marker of PTC (106, 107, 108, 109, 110, 111, 112, 113, 114, 115). For example, DPP4 (dipeptidyl-peptidase 4), which was indicated by Huang et al. (38) and later shown by Jarzab et al. (44) to be the most up-regulated gene in PTC, did not clearly differentiate between PTC and benign tissue (116). Also, oncofibronectin, galectin 3, and other proposed markers did not work properly in a single gene context (117, 118, 119, 120, 121). Moreover, the different multigene classifiers proposed by various authors showed only a minor overlap of the respective markers included. In fact, such a comparison requires a systematic bioinformatic approach. Such analysis has been performed in some types of cancer (122) but is not available for thyroid cancer.
In a study comprising 62 samples of PTC, follicular variant of PTC, FTC, medullary thyroid carcinoma, and follicular adenomas/hyperplastic nodules, a set of 662 differentially expressed genes was identified that discriminates between benign and malignant thyroid tumors (33). The discriminating gene set contained both known cancer-associated genes (e.g., LGALS3, TIMP, TGFA, ADM, MET, and FN1) and previously unidentified genes. However, such large gene lists are not applicable for diagnostic purposes. Thus, approaches to limit the number of genes in an identifier have been undertaken. The aim of the study of Jarzab et al. (44) was not to list genes with the largest fold-changes, but rather to specify the most powerful set of genes that discriminate PTC and benign tissue. This gene set was selected not by univariate approaches but by complementation of the genes, using an RFR algorithm. Interestingly, the classifier does not contain many known genes found in other approaches, like FN1 or TIMP, whereas some other genes previously known for their up-regulation in PTC (e.g., DPP4, SERPINA1, LGALS3, and MET) were included (Fig. 1). Within the classifier there were some new genes, previously not described in PTC [e.g., EVA1 (epithelial V-like antigen 1), and LPR4 (transmembrane 6 superfamily member 2)] or not evaluated in a diagnostic context (retinoid X receptor γ, RXRG), as well as genes exhibiting less distinct changes in expression (e.g., gap junction protein β3, GJB3). The goal of this approach was not to obtain 20 genes showing a correlated change in expression between tumor and benign tissue but to generate complementary information from a selected set of genes. The idea of using gene interactions to improve classification accuracy can be explained with the simplest example of a two-gene linear interaction (Fig. 2). Within the shown data set of papillary thyroid cancers and normal/benign thyroid tissues, there are samples that cannot be classified by single markers, but their class could be predicted by a combination of them. However, two genes cannot classify all samples, and it is necessary to base the classification on a larger number of genes, usually five to 20.
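The two-gene case can be illustrated with a minimal Python sketch. The expression values, the gene labels `gene_a` and `gene_b`, and the weights and cutoff of the linear rule are invented for illustration only; they are not taken from the data in Fig. 2 or from the classifier of Jarzab et al. (44).

```python
# Invented log-expression values for two hypothetical genes.
# Each tuple is (gene_a, gene_b, label); label 1 = PTC, 0 = benign/normal.
samples = [
    (2.0, 1.0, 0),
    (3.0, 0.5, 0),
    (1.0, 2.0, 0),
    (3.5, 2.5, 1),
    (2.5, 3.5, 1),
    (4.0, 1.5, 1),
]


def best_single_gene_accuracy(data, index):
    """Best accuracy of any rule 'value > threshold => PTC' on one gene."""
    values = sorted({s[index] for s in data})
    # Candidate thresholds: one below all values, then each observed value.
    thresholds = [values[0] - 1.0] + values
    best = 0.0
    for t in thresholds:
        correct = sum((s[index] > t) == bool(s[2]) for s in data)
        best = max(best, correct / len(data))
    return best


def linear_rule_accuracy(data, w_a, w_b, cutoff):
    """Accuracy of the rule 'w_a * gene_a + w_b * gene_b > cutoff => PTC'."""
    correct = sum((w_a * a + w_b * b > cutoff) == bool(y) for a, b, y in data)
    return correct / len(data)


print("gene_a alone:", best_single_gene_accuracy(samples, 0))
print("gene_b alone:", best_single_gene_accuracy(samples, 1))
print("gene_a + gene_b:", linear_rule_accuracy(samples, 1.0, 1.0, 4.5))
```

In this toy data set no threshold on either gene alone classifies every sample, whereas the sum of both genes does. Real data would require fitting the weights (e.g., with an SVM) and validating on independent samples.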
A more formal analysis of the accuracy of molecular diagnosis of PTC was also conducted with SVMs (K. Fujarewicz, M. Jarzab, M. Eszlinger, K. Krohn, R. Paschke, M. Oczko-Wojciechowska, M. Wiench, A. Kukulska, B. Jarzab, and A. Swierniak, manuscript in preparation). It showed that an accuracy of 98% may be achieved, with a 95% confidence interval ranging from 95 to 100%. The lower limit of the 95% confidence interval reached 95% already with classifiers composed of only five genes. In this context, it is advisable to proceed with molecular classifiers composed of at least six genes. It is more than a simple coincidence that Mazzanti et al. (42) succeeded in differentiating their PTC samples from benign samples with a set of six genes and that another recently published molecular classifier proposed for PTC on the basis of microarray data also consists of six genes (123).
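How a confidence interval for classification accuracy behaves can be sketched as follows. The text does not state how the interval was derived in the SVM analysis, so the Wilson score interval used here is an assumption chosen for simplicity, and the sample counts are hypothetical.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to the valid range of a proportion.
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical example: 98 of 100 samples classified correctly.
low, high = wilson_interval(98, 100)
print(f"accuracy 0.98, approximate 95% CI {low:.3f} to {high:.3f}")
```

The width of the interval depends strongly on the number of samples, which is one reason why reported accuracies from small microarray series should be read together with their confidence limits.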
Tumors and tissues often consist of more than one type of cell, each of which might contribute differently to the measured expression of a given gene (124). The resulting problem of data interpretation in such a complex situation can be solved in different ways. Before the microarray experiment is performed, specific cell types can be isolated by microdissection. Alternatively, after microarray hybridization the expression of specific genes can be localized by in situ hybridization. Venet et al. (124) suggest a mathematical algorithm to separate samples consisting of different cell types. We prefer to include at least 10–20 genes in the classifier, both tumor markers and stromal cell markers. Even with this number of genes, conversion to a multi-quantitative PCR approach is still feasible (125). Others prefer a more stringent reduction of the number of genes, mainly because they intend to perform immunohistochemical identification. Weber et al. (30) have proposed a three-gene classifier for differentiation of follicular tumors, and Cerutti et al. (71) included four genes for the same purpose. These proposed marker sets should be reinvestigated in fine-needle aspiration (FNA) biopsies (FNABs) for their applicability to FNAB diagnosis.
| III. First Results, Possibilities, and Solutions for Data Set Integration and Meta-Analysis |
|---|
C. Selection of reference tissue and intra- vs. interindividual comparisons
Some microarray studies of thyroid tumors tried to answer (patho-) physiological questions or aimed to elucidate the molecular etiology of the tumor, whereas others looked primarily for genetic markers that could improve the differential diagnosis of thyroid tumors. Because of these different questions to be answered by the microarray experiments, the selection of the reference tissues and the way of comparing the expression data varied significantly. In some studies, intraindividual comparisons were performed, whereas other studies performed interindividual comparisons. Furthermore, in some studies the reference tissue was defined as strictly nonnodular healthy tissue (8, 32, 34, 37, 38, 44, 45, 46, 48, 51, 52, 54), whereas in other studies it also comprised benign lesions such as goiter, follicular adenoma, and hyperplastic nodules (29, 30, 32, 33, 40, 42, 90). Moreover, there are differences in the nomenclature of benign tumors: whereas we prefer a nomenclature that is based on the function of the benign tumors (AFTNs, CTNs), other reports base their nomenclature on the histological morphology of the tumors (follicular adenoma, hyperplastic nodules). Therefore, when comparing different microarray studies or performing a meta-analysis, it is mandatory to keep these characteristics of the different benign tumors in mind. Otherwise, discrepant and misleading results are inevitable.
Advanced analyses such as comparison and meta-analysis of microarray studies are not the only ones that require knowledge concerning the selection of the reference tissues and the type of comparison (i.e., intra- or interindividual comparisons). In a recent study, we addressed the relevance of paired (intraindividual) vs. unpaired (interindividual) data sets (48). The comparison of paired and unpaired PTC data sets clearly indicated a higher quality of paired data sets. Although the paired data sets were hybridized to different GeneChip generations, they showed more similarities in their gene expression profiles than the paired and the unpaired data sets that were hybridized to the same GeneChip generation and were generated in the same laboratory under identical conditions (however, there were some differences in the number of samples, which could also be relevant). The GenMAPP analysis of the paired PTC data sets showed significant differences in the expression pattern of the MAPK cascade, whereas the unpaired data set provided no indications for this alteration (48). Taken together, because intraindividual comparisons share more similarities even across GeneChip generations than intra- and interindividual comparisons using the same microarray generation, the elimination of the "individual background" by intraindividual comparisons seems mandatory, especially in studies whose objective is the identification of subtle differences, e.g., to elucidate the etiology of a specific pathology. In contrast, in studies looking for diagnostic markers of malignant tumors such as FTC and PTC, the variability of reference tissues might be of less importance; on the contrary, it may help to identify more robust cancer markers.
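The effect of removing the "individual background" can be illustrated with a small Python sketch that compares a paired and an unpaired (Welch) t statistic on the same numbers. The expression values are invented, and the t statistics are used here only as a simple stand-in; the study cited above (48) did not necessarily use these tests.

```python
import math
from statistics import mean, stdev

# Invented log2 expression of one gene in matched normal and tumor tissue
# from five hypothetical patients with a large between-patient spread.
normal = [5.0, 7.0, 9.0, 11.0, 13.0]
tumor = [5.9, 8.1, 10.0, 12.1, 13.9]


def paired_t(x, y):
    """t statistic on within-patient differences (intraindividual)."""
    d = [b - a for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


def welch_t(x, y):
    """t statistic treating both groups as independent (interindividual)."""
    se = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(y) - mean(x)) / se


print("paired t:", round(paired_t(normal, tumor), 2))
print("unpaired (Welch) t:", round(welch_t(normal, tumor), 2))
```

In this toy example a consistent increase of about one log2 unit per patient yields a large paired statistic, whereas the same difference is hidden by the between-patient variability in the unpaired comparison.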
Furthermore, gene expression of thyroid tissue that is considered normal with regard to tumor characteristics can vary considerably. Bruno et al. (133) demonstrated differences in the expression of the sodium iodide symporter (SLC5A5) and the iodide transporter (SLC5A8) between thyroid tissue stimulated with low and normal serum TSH levels. Because the identification of differentially regulated genes critically depends on the type or metabolic status of the reference tissue as described above, we recently focused on differences in the gene expression patterns of the so-called "normal" surrounding tissues of AFTNs and CTNs (134). Principal component analysis and hierarchical clustering showed a distinct separation of the two subgroups, and statistical analysis revealed a significantly changed expression pattern of more than 300 genes (FDR < 1%). Among them, a decreased expression of thyroid-specific genes such as TSHR, TPO, and PAX8 in the ST of AFTNs could be shown. Moreover, in the ST of AFTNs, a correlation between the presurgical TSH levels and the gene expression of DIO1, TPO, SIAT1, and the Giα-protein 1 (GNAI1) could be observed (r > 0.7; P < 0.01). These data show strong differences in the gene expression patterns of so-called normal thyroid tissues, which can be attributed to their different TSH exposure. Therefore, a careful selection of the normal reference tissue by presurgical assessment of the TSH levels and scintigraphy in iodine-deficient areas should be performed in studies investigating the gene expression patterns of thyroid tumors.
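To make the FDR threshold mentioned above concrete, the following Python sketch applies the Benjamini-Hochberg step-up procedure to a list of P values. The text does not specify which FDR method was used in (134), so the choice of this procedure is an assumption, and the P values are invented.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return the indices of the tests rejected at the given FDR level."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose P value passes.
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Invented per-gene P values for six hypothetical genes.
p = [0.0001, 0.5, 0.003, 0.2, 0.004, 0.9]
print(benjamini_hochberg(p, fdr=0.01))
```

With thousands of genes tested per array, such a correction controls the expected proportion of false positives among the genes reported as differentially expressed.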
| IV. Future Developments |
|---|
rearrangements and BRAF mutations (136, 137, 138, 139). Although none of them are applicable to routine diagnostics so far, there is progress in molecular testing of fine-needle aspirates that may improve preoperative diagnosis. The most promising techniques are based on the detection of tumor-specific mutations (e.g., BRAF mutations, RET/PTC rearrangements) in the FNA material (140, 141, 142, 143). These methods are characterized by a high specificity and easy detection of the mutations in genomic DNA. However, because not all FTCs and PTCs are characterized by the currently known mutations in BRAF or RET/PTC rearrangements or PAX8/PPARγ rearrangements, these methods are limited. Thus, the nearly genomewide microarray investigations of both benign and malignant thyroid tumors and the comparison of these different studies may help to identify additional potentially useful markers. Although data from microarray investigations suggest that this technology has the sensitivity and specificity needed for a screening tool and as an adjunct to clinical pathological diagnosis in the future (33, 39), currently the amount and the quality of the RNA available from FNAs limit such an analysis in routine diagnosis. Therefore, an assay based on a limited number of differentiating genes, identified by sophisticated algorithms in comparative studies, appears to be more promising for FNAB application (30, 44). However, in contrast to the analysis of tumor-specific mutations, this approach of quantitatively measuring RNA markers is more susceptible to potential limitations of FNA such as limited and variable numbers of follicular cells obtained in each biopsy and the potential contamination by other cell types such as activated lymphocytes in patients with lymphocytic thyroiditis. Therefore, a correction for mRNA yield (e.g., by measuring a housekeeping gene like β-actin) and thyroid specificity of mRNA extracted from an FNA sample (e.g., by measuring a thyroid-specific gene like TG) seems mandatory. Nevertheless, the identification of genes or gene groups that correlate with nodular growth, thyroid dedifferentiation, or malignancy would make it possible to improve both diagnostics and therapy of thyroid nodular disease.
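A correction of this kind could be sketched as follows for a quantitative PCR readout. The input format, the function names, the `MARKER` placeholder gene, and both cutoffs (`min_tg_ratio`, `max_actb_ct`) are hypothetical and chosen only for illustration; the 2^-ΔCt formula assumes roughly 100% amplification efficiency for all assays.

```python
def relative_expression(ct_target, ct_reference):
    """2^-(dCt): target abundance relative to the reference gene."""
    return 2.0 ** -(ct_target - ct_reference)


def assess_fna_sample(ct, min_tg_ratio=0.1, max_actb_ct=30.0):
    """Check an FNA sample before interpreting a marker gene.

    ct: dict of Ct values keyed by gene symbol (hypothetical format),
        containing 'ACTB' (beta-actin), 'TG' and 'MARKER'.
    """
    tg_ratio = relative_expression(ct["TG"], ct["ACTB"])
    return {
        # Enough mRNA overall: the housekeeping gene is detected early.
        "enough_rna": ct["ACTB"] <= max_actb_ct,
        # Enough thyroid follicular cells: TG is abundant relative to ACTB.
        "thyroid_specific": tg_ratio >= min_tg_ratio,
        # Marker expression corrected for mRNA yield.
        "marker_vs_actb": relative_expression(ct["MARKER"], ct["ACTB"]),
    }


# Hypothetical Ct values from one aspirate.
result = assess_fna_sample({"ACTB": 20.0, "TG": 21.0, "MARKER": 23.0})
print(result)
```

Any real cutoffs would have to be established empirically on FNA material, e.g., in samples with known lymphocytic contamination.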
B. Genome analysis (SNP arrays, arrayCGH)
Up to now, the use of microarrays has mainly focused on the investigation of the transcriptome. However, advances in the microarray technology opened a wide range of new applications. The introduction of the Affymetrix GeneChip Mapping Arrays enables genotyping of up to 500,000 single nucleotide polymorphisms (SNPs) with an average distance between SNPs of 5.8 kb, which makes it possible to conduct large-scale linkage analyses, association studies, and copy number studies. This methodological development will also influence the genetic analysis of thyroid pathologies. In addition to a lack of iodine as the most prevalent factor for goiter development, familial clustering of goiters and the female predominance of goiters suggest a genetic background (144, 145). Actually, linkage analysis revealed two candidate regions, the multinodular goiter 1 locus and the Xp22 locus (146, 147, 148). A genomewide scan to detect susceptibility loci that predispose for euthyroid goiter using 450 microsatellite markers indicated genetic heterogeneity (149, 150). Therefore, linkage studies using SNP arrays will be an important adjunct to identify new susceptibility loci in an efficient way.
Comparative genomic hybridization (CGH) made a significant impact on cancer cytogenetics. It provides the possibility to detect chromosomal copy number changes in cells, tissue samples, and formalin-fixed paraffin-embedded material. However, CGH to metaphase chromosomes provides only a limited resolution of 5–10 Mb for the detection of copy-number losses and gains and 2 Mb for amplifications, and the analysis requires a high level of cytogenetic expertise. The introduction of microarray-based CGH (arrayCGH) with a higher resolution could circumvent these limitations, thus making a broader application possible. Because classic CGH revealed several chromosomal imbalances in follicular thyroid adenoma, FTC, and the follicular variant of PTC (151, 152, 153, 154), the investigation of thyroid tumors by microarray-based CGH might further resolve the pattern of these chromosomal aberrations or of the array expression patterns and might help to identify and specify the molecular etiology of thyroid tumors.
Additional new applications of microarray technology have been reported showing the high potential of this platform, e.g., the combination of chromatin immunoprecipitation (ChIP) with hybridization of microarrays (ChIP-on-chip), which is a powerful way to explore sites of DNA-protein interaction across the whole genome, and the analysis of the methylation status of CpG islands in promoter regions. On the basis of these high resolution techniques, new genes involved in diseases as well as mechanistic aspects of diseases are likely to be discovered.
| Footnotes |
|---|
First Published Online March 12, 2007
Abbreviations: AFTN, Autonomously functioning thyroid nodule; CGH, comparative genomic hybridization; CTN, cold thyroid nodule; FDR, false discovery rate; FNA, fine-needle aspiration; FNAB, FNA biopsy; FNAC, FNA cytology; FTC, follicular thyroid carcinoma; PKC, protein kinase C; PTC, papillary thyroid carcinoma; RFR, recursive feature replacement; RGS, regulator of G protein signaling; SNP, single nucleotide polymorphism; ST, surrounding tissues; SVM, support vector machine; TSHR, TSH receptor.
| References |
|---|