Endocrine Reviews 22 (2): 205-225
Copyright © 2001 by The Endocrine Society

Alternative Ribonucleic Acid Processing in Endocrine Systems

Hua Lou1 and Robert F. Gagel2

Department of Genetics and the Ireland Cancer Center (H.L.), Case Western Reserve University, School of Medicine and University Hospitals of Cleveland, Cleveland, Ohio 44106-4955; and Department of Medical Specialties (R.F.G.), Section of Endocrine Neoplasia and Hormone Disorders, University of Texas M.D. Anderson Cancer Center, Houston, Texas 77030


    Abstract
Alternative RNA processing is a mechanism for creation of protein diversity through selective inclusion or exclusion of RNA sequence during posttranscriptional processing. More than one-third of human pre-mRNAs undergo alternative RNA processing modification, making this a ubiquitous biological process. The protein isoforms produced have distinct and sometimes opposite functions, underscoring the importance of this process. This review focuses on important endocrine genes regulated by alternative RNA processing. We discuss how diverse events such as spermatogenesis or GH action are regulated by this process. We focus on several endocrine (calcitonin/calcitonin gene-related peptide) and nonendocrine (Drosophila doublesex and P-element and mouse c-src) examples to highlight recent progress in the elucidation of molecular mechanisms regulating this process. Finally, we outline methods (model systems and techniques) used by investigators in this field to study processing of individual pre-mRNAs.

I. Introduction

II. Splicing Mechanism and Types of Alternative Splicing

A. Splicing mechanism

B. Types of alternative splicing

III. Alternative RNA Processing in pre-mRNAs of Endocrine-Related Genes

A. Alternatively spliced exons

B. Alternative usage of splice sites

IV. Strategies to Study Alternative RNA Processing

A. Development of model systems

B. Identification of cis-acting sequence elements

C. Identification of trans-acting protein components regulating alternative splicing

V. Mechanisms Controlling Alternative RNA Processing

A. Drosophila doublesex (dsx)

B. Drosophila P-element

C. Mouse c-src

D. Human calcitonin/CGRP

VI. Future Perspectives


    I. Introduction
ALTERNATIVE splicing is an important molecular mechanism that increases the protein diversity derived from a single gene. Through alternative splicing, multiple forms of mRNA can be produced from one pre-mRNA molecule. There are examples in which thousands of different mRNA molecules are produced from a single pre-mRNA molecule. One example is neurexin, a neuronal cell surface protein that is expressed in at least 1,000 isoforms. These isoforms display differential specificity for various ligands and may function as cell recognition molecules (1). In a study by Hanke et al. (2), the authors aligned 475 human proteins against human expressed sequence tags and estimated that alternative splicing occurs in pre-mRNAs of 34% of human genes. The nature of this study further suggested that 34% might be a significant underestimate. Therefore, Hanke and colleagues implied that alternative splicing is more the rule than the exception for RNA processing of human genes. This study, combined with a few other bioinformatic studies (3, 4), is changing our current understanding of alternative splicing.

The most common type of alternative splicing is inclusion or exclusion of one or more exons from a pre-mRNA molecule in the final mRNA product (1, 5, 6). Such differential usage of exons may have several consequences regarding the function of the gene product. First, polypeptides having distinct structures and functions can be produced by using mutually exclusive exons [calcitonin/calcitonin gene-related peptide (CGRP)] (7, 8). Second, expression of a functional protein can be switched off by incorporating an in-frame stop codon as a result of inclusion of an alternative exon [cAMP-responsive element-binding protein (CREB)] (9). Third, a transcription repressor can be converted to an activator by inclusion of an exon that changes the nature of a transcription factor [cAMP-responsive element modulator (CREM)] (10). Fourth, deletion of an exon or exons encoding one or more modular domains of a polypeptide may lead to production of a dominant negative form of the protein [peroxisome proliferator-activated receptor α (PPARα)] (11). Finally, insertion or deletion of a few amino acids by including or excluding an internal exon can lead to production of multiple highly related polypeptides that are involved in fine tuning the function of a specific protein (neurexin) (1). More examples will be discussed in detail in Section III.

Alternative splicing can be regulated in a tissue- or developmental stage-specific manner (12, 13) and/or by extracellular signaling cues. These cues include growth factors (14, 15, 16, 17), dexamethasone (16, 18), insulin (19), cytokines (20, 21, 22), extracellular pH (23), or ions (24) and may regulate splicing through Ras (25) or other intracellular signaling pathways.

Alternative splicing occurs in genes involved in almost every aspect of cellular processes. The present review provides an overview of alternative splicing with an emphasis on endocrine-related genes. Because of the increasing number of genes whose pre-mRNAs have been identified to undergo alternative splicing, it is impossible to provide a complete list of such genes. Instead, we will discuss a subset of endocrine genes, alternative splicing of which is well characterized and/or of significant biological relevance. This will be followed by discussions of general strategies of studying the molecular mechanisms that regulate alternative splicing. Finally, we will discuss the mechanisms that control alternative splicing in general, with emphasis on several thoroughly characterized model systems involving both Drosophila and mammalian genes.


    II. Splicing Mechanism and Types of Alternative Splicing
A. Splicing mechanism
When splicing was discovered in 1977, investigators named the coding sequences "exons" because they are expressed sequences and named the noncoding sequences "introns" because they are intervening sequences that disrupt coding sequences and are removed during splicing. They also defined the 5'- and 3'-splice sites relative to an intron (Fig. 1). In vertebrates, small exons having an average size of 137 nucleotides are buried in introns as large as 100,000 nucleotides (26). Many vertebrate genes are very large, spanning hundreds of thousands of nucleotides on a chromosome and containing numerous exons. In order to produce a functional protein, splicing must be an accurate mechanism by which all of the exons in the primary RNA transcript are joined precisely.



Figure 1. A, Diagram showing factors involved in the recognition of an internal exon and 5'- and 3'-terminal exons. The U1 and U2 snRNPs and SR proteins define the internal exon. U1 snRNP and the cap-binding proteins (CBP80 and CBP20) define the 5'-terminal exon, while U2 snRNP, SR proteins, and polyadenylation factors [cleavage/polyadenylation specificity factor (CPSF), cleavage stimulation factor (CstF), cleavage factor (CF), and poly(A) polymerase (PAP)] (141) define the 3'-terminal exon. Interaction of the internal and terminal exon defining complexes across the intron leads to efficient splicing and polyadenylation. Abbreviations: U1, U1 snRNP; U2, U2 snRNP; U4/U5/U6, U4/U5/U6 tri-snRNP; U2AF, U2 auxiliary factor; ESE, exon splicing enhancer; ISE, intron splicing enhancer. B, Consensus sequences of 5'- and 3'-splice sites and polyadenylation site.

 
In vertebrates, splicing occurs in cell nuclei cotranscriptionally while a pre-mRNA is being transcribed from DNA. Splicing requires both sequence elements and nuclear factors. The basic splicing signals consist of short, loosely conserved sequences located at the 5'- and 3'-splice sites (Fig. 1). The 5'-splice site signal is generally a variation of the consensus sequence CAG/GUAAGUA, while the 3'-splice site signal contains three components: a branchpoint sequence [YNCURAY (Y, pyrimidine; N, any nucleotide; R, purine)], a polypyrimidine-tract, and a CAG sequence at the 3'-end of the intron. These individual splice site sequences can be found frequently throughout a gene and are therefore unlikely to contain sufficient information for accurate splicing by themselves. Auxiliary splicing enhancer sequences have been found in both exons (exon splicing enhancer) and introns (intron splicing enhancer) (Fig. 1).
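
As a rough illustration of why such short, degenerate signals recur by chance, the consensus sequences quoted above can be turned into simple pattern searches. The following Python sketch is only an illustration under stated assumptions: the helper names (consensus_to_regex, find_sites) and the toy RNA sequence are hypothetical and are not taken from any real gene or from the studies cited here. It treats each consensus as an exact pattern, whereas real splice sites are variations of the consensus.

```python
import re

# Degenerate codes as defined in the text: Y = pyrimidine, R = purine, N = any nucleotide
DEGENERATE = {"Y": "[CU]", "R": "[AG]", "N": "[ACGU]"}

def consensus_to_regex(consensus):
    """Translate a consensus such as YNCURAY into a compiled regular expression."""
    body = "".join(DEGENERATE.get(base, base) for base in consensus)
    # A lookahead is used so that overlapping matches are reported as well
    return re.compile("(?=" + body + ")")

def find_sites(rna, consensus):
    """Return the 0-based start position of every match of the consensus in rna."""
    return [m.start() for m in consensus_to_regex(consensus).finditer(rna)]

# Hypothetical toy sequence, invented for this example only
toy_rna = "GGCAGGUAAGUAUCCUACUGACUUUUUCCCUUUCAGGAUACUAACGG"

branchpoint_hits = find_sites(toy_rna, "YNCURAY")     # branchpoint consensus
five_prime_hits = find_sites(toy_rna, "CAGGUAAGUA")   # CAG/GUAAGUA with the slash removed

print("branchpoint-like positions:", branchpoint_hits)
print("5'-splice-site-like positions:", five_prime_hits)
```

Even in this short toy sequence the branchpoint pattern matches more than once, which is consistent with the point that the basic signals alone are unlikely to specify a splice site unambiguously.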

Splicing occurs in a macromolecular complex termed the spliceosome. Nuclear components included in the spliceosome can be divided into three classes: snRNA-containing small nuclear ribonucleoprotein (snRNP) complexes, arginine/serine-rich (SR) RNA-binding proteins, and other non-SR protein splicing factors, including heterogeneous nuclear ribonucleoproteins (hnRNP), RNA helicases, kinases, and other enzymes. During initial formation of a smaller prespliceosome complex, U1 snRNP, along with SR proteins, recognizes and forms base pairs with the 5'-splice site sequence, and U2 snRNP forms base pairs with the branchpoint sequence stabilized by U2 auxiliary factor (U2AF) bound to the polypyrimidine-tract sequence. Subsequently, U4/U5/U6 tri-snRNPs and other splicing factors are recruited to the complex to form the larger spliceosome complex in which the chemistry of splicing occurs. A detailed review of the splicing mechanism is provided by Burge et al. (27).

How does the vertebrate RNA processing machinery distinguish the short exons from the large introns? An "exon definition model" pioneered and experimentally tested by Berget (28) makes a compelling argument that it is the smaller exon structure which is first recognized and defined as the newly synthesized transcript leaves the RNA Pol II complex. This model suggests that the ends of an exon are delineated by the interaction of splicing factors with their cognate 3'- and 5'-splice sites (Fig. 1). Communication between the factors across the exon results in recognition and permits precise joining of widely separated exons (29). The first and last exons are defined by interactions between the cap-binding complex and factors binding at the first 5'-splice site, and between factors binding at the last 3'-splice site and the polyadenylation complex, respectively.

Splicing does not occur as an isolated event; it interacts dynamically with both the transcription process and other RNA processing events such as capping and polyadenylation (30, 31). McCracken et al. vividly described an mRNA "factory" that carries out coupled transcription, splicing, and polyadenylation of mRNA precursors (32). The interactions between factors involved in these processes have been demonstrated. The carboxyterminal domain (CTD) of the large subunit of RNA polymerase II plays a critical role in such interactions. CTD interacts with splicing and polyadenylation factors to increase the efficiency of both processes (32, 33, 34, 35). Intriguingly, one study showed that alternative splicing of the fibronectin EIIIA exon was differentially modulated by the promoters used to drive the transcription unit, implicating an intimate cross-talk between transcription and splicing machinery (36).

B. Types of alternative splicing
There are several types of alternative RNA processing (Fig. 2). The most common is the inclusion or exclusion of one or more entire exons, a reflection of the modular genomic structure of many genes (see Section III). When the alternatively spliced exon is the 5'-terminal exon, alternative splicing is often coupled with alternative usage of promoters. Likewise, when a 3'-terminal exon is alternatively spliced, alternative polyadenylation is always involved. Like exons, introns can be alternatively spliced, generating transcripts with intron sequences either removed or retained. A relatively less frequent alternative splicing event is the alternative usage of splice sites. More complicated types of alternative splicing can occur using different combinations of these basic types.



Figure 2. Diagram showing patterns of alternative RNA processing. Arrows indicate transcription start sites.

 

    III. Alternative RNA Processing in pre-mRNAs of Endocrine-Related Genes
Pre-mRNA of many endocrine-related genes undergoes alternative splicing. These genes include various hormones, cell surface receptors, nuclear receptors, and other transcription factors. Many studies have contributed to the current understanding of these alternative splicing events. Earlier studies focused on identification of isoforms; more recent studies have focused on the distribution and functions of the different isoforms. Collectively, these studies provide a rich tapestry of information that illustrates the importance of alternative splicing in endocrine systems.

A number of endocrine-related genes whose pre-mRNAs undergo alternative splicing are listed in Table 1. Examples were chosen because the alternative splicing patterns of these genes have been well established, and these examples represent a wide range of types of alternative splicing. In the following section, we will focus on a few of these genes and discuss the role of alternative splicing in regulating their protein functions. Note that the individual examples are grouped by "alternative splicing type," not by gene. Therefore a single gene, such as the LH or FSH receptor, that has more than one "splicing event" will be presented in different sections. Note also that little is known about either the regulatory sequences or the trans-acting factors that regulate alternative splicing of endocrine genes. In fact, the only genes for which there is information on both cis-elements and trans-acting factors are the calcitonin/CGRP, GH, and fibronectin genes. The importance of the alternative products to be described in the following sections suggests that there are important regulatory pathways remaining to be discovered.


Table 1. Alternative splicing pattern of pre-mRNA of endocrine-related genes

 
A. Alternatively spliced exons
1. Cassette exon.
a. CREB and CREM.
CREB and CREM are members of a multigene family of transcription factors involved in cAMP-mediated transcription regulation. The factors in this family of proteins contain the basic transactivation and DNA-binding domains. When phosphorylated by protein kinases, these proteins become activated, bind to promoters containing the cAMP-responsive element (CRE), and subsequently act as either transcription activators or repressors.

CREB and CREM share a remarkable genomic structure, in which exons are organized as modules of functional domains of the proteins (Fig. 3). The two genes may have arisen from gene duplication from an ancestral gene and have diverged to encode transcriptional activators and repressors of the cAMP signal transduction pathway. The common exons shared by the two genes include exons A–C and E–I, which contain activation domains and basic-domain-leucine-zipper (bZip) DNA-binding domains. The activation domain is divided into two regions. The first region, the phosphorylation box (P-box) encoded by exons E and F, contains a cluster of phosphorylation sites that can be regulated by various kinases. The second region contains two glutamine-rich domains encoded by exons C and G that are hypothesized to function as surfaces for interactions with basal transcription factors.



Figure 3. Functions of CREM and CREB are determined largely by alternative splicing. The top line in each panel represents the genomic DNA structure with the exon names shown on top of the diagram. The promoters are indicated by arrows. The modular domains of the polypeptides are indicated above or below the exons. The mRNA products resulting from various alternative splicing pathways are diagrammed below the DNA structure. Functions of protein isoforms encoded by these mRNAs are shown on the left, and the nomenclature for the isoforms is shown on the right. A, CREM. CREM has two promoters: the upstream promoter directs the transcription of the full-length pre-mRNA, while the downstream promoter directs the transcription of a truncated pre-mRNA encoding the ICER isoforms. Abbreviation: ICER, inducible cAMP early repressor. B, CREB. CREB isoforms are generated by four different posttranscriptional modifications, which are indicated in the column on the right of panel B (a, alternative cassette exon inclusion; b, alternative 3'-splice site usage; c, alternative 3'-terminal exon usage; d, alternative translation start site usage). The "X" in exons indicates presence of an in-frame stop codon, and "AUG" marks the translation reinitiation site. The black exons indicate that these exons contain noncoding sequences because of the incorporation of the stop codon upstream of these exons. Diagrams boxed within the rectangles represent polypeptide products generated by the primary RNA sequence shown immediately above each box.

 
Numerous isoforms, which function as either activators or repressors, exist for both proteins. The functional versatility of these two genes is modulated by several mechanisms, including alternative splicing, the alternative transcription start site, and the alternative translation initiation site. Alternative splicing is a major determinant of the gene activity for these two genes. Alternative inclusion of an exon or exons is capable of drastically changing the transcription activity of either protein, and it controls gene expression in a specific developmental process (spermatogenesis) in an exquisite fashion. In the following section, we will focus on the role of alternative splicing in the function of CREM and CREB isoforms.

The CREM gene contains the common exons A–C and E–I (Fig. 3A), some of which are selectively included to produce both transcription repressors and activators. The three basic forms of CREM proteins (CREMα, CREMβ, and CREMγ) lack the two glutamine-rich exons C and G and include one of the two mutually exclusive exons Ia or Ib, which encodes DNA-binding domains (Fig. 3A). These isoforms can form homodimers or heterodimers with CREB, which bind to CRE with the same efficiency and specificity as that of CREB (37). Amazingly, these CREM isoforms block transcription activation by antagonizing the CREB activator (37). This negative effect on transcription can be explained by one of two models: CREM forms either a homodimer occupying the CRE site or a nonfunctional heterodimer with CREB titrating out the CREB activators.

Remarkably, expression of the CREM gene is regulated in a cell-specific manner to generate isoforms that activate cAMP-dependent transcription. During pubertal development and the initiation of spermatogenesis, CREM expression changes abruptly and is characterized by three features. First, CREM becomes highly abundant in adult testis, while it is expressed at very low levels in prepubertal animals. This change results from alternative polyadenylation that eliminates a destabilizing element present in the 3'-untranslated region (3'-UTR) (38). Second, the form of CREM expressed is almost exclusively an activator, an abrupt change from the predominant repressor form found in prepubertal testis. This pattern of expression results from the generation of CREM isoforms that include at least one of the two glutamine-rich exons C and G (Fig. 3A) (10). Finally, the pattern of CREM expression is regulated by the increase of FSH at puberty, an effect that occurs at the level of both transcript abundance and protein activity (38). The developmental switch of the alternative splicing pathway changes the CREM function from a transcription antagonist to an activator that is required for transcription activation of several postmeiotic genes whose promoters contain CRE motifs. Most of these genes encode structural proteins required for differentiation of spermatozoa (39).

Another group of isoforms, the inducible cAMP early repressors (ICERs), are produced by transcription using an alternative promoter in the intron downstream of exon G (Fig. 3A). ICER isoforms contain only the DNA-binding domain of CREM and function as powerful repressors of cAMP-induced transcription. ICER is expressed in a circadian manner in the pineal gland and controls the oscillation in the hormonal synthesis of melatonin (40).

The three basic forms of CREB (CREBΔ, CREBα, and CREBβ) contain both activation and DNA-binding domains and function predominantly as positive modulators of CRE-containing genes (Fig. 3B). These isoforms are uniformly and ubiquitously expressed in a wide range of tissues and cell lines (9, 41). However, the expression of CREB is also differentially regulated during spermatogenesis. In testis, alternative splicing results in the expression of repressor CREB isoforms (42, 43). Exons ψ, Y, W, Z (human-specific), and Ω are testis-specific and are mostly included in CREB mRNA in germ cells (9, 42, 44). Alternative splicing of some of these exons is regulated cyclically during the 12-day cycle of spermatogenesis (45). Exons ψ, Y, and W contain in-frame stop codons that terminate translation prematurely, generating shorter CREB isoforms. Exon Ω is an alternative 3'-terminal exon, the inclusion of which excludes exon I from the pre-mRNA molecule, therefore producing a similarly short isoform.
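
The effect of a cassette exon that carries an in-frame stop codon can be sketched with a toy calculation. The Python example below is a minimal illustration, not a model of the actual CREB exons: the three "exon" sequences and the function name coding_length are hypothetical and invented for this purpose. It simply counts the codons read from the first AUG up to the first in-frame stop codon, with and without the cassette exon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def coding_length(mrna):
    """Count codons translated from the first AUG up to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return 0
    count = 0
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count

# Hypothetical toy exons (not real CREB sequence)
upstream_exon = "AUGGCUGCA"        # AUG GCU GCA
cassette_exon = "GGAUAAGCU"        # GGA UAA GCU: carries an in-frame stop codon
downstream_exon = "GAACGUUUCUGA"   # GAA CGU UUC UGA: ends with the normal stop codon

exon_skipped = coding_length(upstream_exon + downstream_exon)
exon_included = coding_length(upstream_exon + cassette_exon + downstream_exon)

print("codons translated, cassette exon skipped: ", exon_skipped)
print("codons translated, cassette exon included:", exon_included)
```

In this sketch, inclusion of the cassette exon shortens the open reading frame even though the mRNA itself is longer, which is the general logic behind the shorter isoforms described above.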

These testis-specific CREB isoforms lack the DNA-binding domain and nuclear translocation signal and are distributed in the cytoplasm. Although the exact function of these isoforms per se is unknown, they provide a mechanism of down-regulating the expression of CREB activators by generating the repressor isoform, I-CREB. When exon W is included in the CREB mRNA, translation reinitiates downstream from the stop codon, generating several I-CREB isoforms that contain only the DNA-binding domain of CREB (Fig. 3B). These I-CREB isoforms function as dominant negative repressors to inhibit production of CREB activators by competing with full-length CREB for the CRE elements present in the CREB promoter. Therefore, the majority of cAMP regulation in spermatocytes is most likely controlled by CREB:I-CREB ratios as the CREM activators are not detected until later stages of germ cell differentiation (43).

b. PPARα.
PPARs, including PPARα, PPARβ, and PPARγ, belong to the superfamily of nuclear receptors that are ligand-activated transcription factors. A heterodimeric complex forms between an activated PPAR and the retinoic X receptor (RXR) and subsequently binds to the peroxisome proliferator response element of target promoters to activate transcription. PPARs play an important role in lipid metabolism.

The PPARα gene was first cloned in mice, and it contains eight exons (46). A screen for PPARα variants using RT-PCR led to the identification of one human variant that lacks exon 6 in its mRNA (11). This variant transcript is widely expressed in a number of cell lines and tissues, and the ratio of the two transcript levels varies between individuals and tissues. A shorter polypeptide is produced because the joining of exon 5 to exon 7 introduces a premature stop codon. This truncated protein, PPARαtr, has the DNA-binding domain but lacks the entire ligand-binding domain. Western blot analysis indicated the presence of this protein in human hepatocytes.

Functional studies (11) of PPARαtr have demonstrated that it interferes with the PPARα transactivation function. When a PPARαtr cDNA was transfected in CV-1 cells, PPARαtr was localized predominantly in the cytoplasm. When cells were cultured in medium containing a different batch of FBS, PPARαtr could be induced to enter the nucleus, where it repressed the transcription activity of the wild-type PPARα. It appears that this negative effect of PPARαtr resulted from competition between PPARαtr and PPARα for essential transcription coactivators because cotransfection of the coactivator CREB-binding protein relieved the repression (11). Presumably, factors that alter the ratio of PPARα to PPARαtr can change the signaling pathway triggered by PPARα.

c. Fibronectin.
Fibronectins are glycoproteins that play critical roles in cellular processes such as adhesion, migration, differentiation, and proliferation. Fibronectins form dimers, and each chain contains a polypeptide having a molecular mass of approximately 250 kDa. A fibronectin polypeptide consists of three types of repeating units and exists in multiple forms (~20 isoforms in humans) as a result of alternative splicing. Alternative splicing occurs in three regions within the type III repeats of fibronectin: EIIIA, EIIIB, and a variable region (V) (Fig. 4) (47). EIIIA and EIIIB are single exons and can be either included in (EIIIA+ or EIIIB+) or excluded from (EIIIA- or EIIIB-) the fibronectin mRNA; V contains multiple exons and can be spliced in a number of ways to produce five potential variants.
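
The figure of roughly 20 human isoforms is what one would expect if the three regions were spliced independently: two states for EIIIA, two for EIIIB, and five for V. The short Python sketch below enumerates these combinations. Independence of the three regions is an assumption made here for illustration, and the labels V0 to V4 are hypothetical placeholders for the five potential V variants, which are not named individually in the text.

```python
from itertools import product

# Assumption: the three regions are spliced independently of one another
eiiia_states = ["EIIIA+", "EIIIA-"]
eiiib_states = ["EIIIB+", "EIIIB-"]
# Hypothetical labels for the five potential variants of the V region
v_states = ["V" + str(i) for i in range(5)]

isoforms = list(product(eiiia_states, eiiib_states, v_states))

print(len(isoforms), "combinations")   # 2 x 2 x 5
for combination in isoforms[:3]:
    print(" / ".join(combination))
```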



Figure 4. Alternative splicing of the fibronectin pre-mRNA. The top line in this figure represents the genomic DNA structure of fibronectin. The three types of repeat sequences are represented by different shapes of exons as indicated. Exons of type III repeats are numbered from 1–15. Alternatively spliced exons are present between type III exons, and shown as hatched boxes. Potential products of alternative splicing are diagrammed below the fibronectin genomic structure in shaded areas, showing inclusion/exclusion of the alternatively spliced exons with their two immediate flanking exons. Abbreviations: cFN, cellular fibronectin; pFN, plasma fibronectin.

 
Fibronectin is produced in a wide range of cells. The circulating plasma fibronectin is produced largely in hepatocytes and is soluble. This form lacks EIIIA and EIIIB regions and has variable amounts of V. Other forms, referred to as cellular fibronectin, are produced by numerous cells and contain variable amounts of the EIIIA, EIIIB, and V regions. Expression of all variable forms of fibronectin is regulated in a cell-specific manner during development, and dysregulation may occur in pathological conditions such as cancer. In general, in actively proliferating cells (e.g., during embryogenesis and wound repair and in malignant cells), most fibronectin mRNA is EIIIA+, EIIIB+, and V+, suggesting that alternative splicing of fibronectin is regulated to generate isoforms appropriate for growth (47).

Perhaps consistent with this notion, there is evidence that alternative splicing of EIIIA is regulated during ovarian follicular development. Colman-Lerner et al. (48) demonstrated that EIIIA+ fibronectin is expressed at much higher levels in the follicular fluid of follicles smaller than 8 mm where granulosa cells proliferate actively than in follicles larger than 8 mm. Furthermore, EIIIA splicing is up-regulated by cAMP and transforming growth factor-β in primary cultures of bovine granulosa cells (BGCs).

There is experimental evidence that EIIIA+ fibronectin is a growth-regulatory factor. A synthetic EIIIA+ peptide as well as the conditioned medium of BGCs (containing predominantly EIIIA+ fibronectin) showed mitogenic activity, but the EIIIA- plasma fibronectin did not. After the immunodepletion of fibronectin, the BGC conditioned medium lost its mitogenic activity (48).

d. Insulin receptor (IR).
The IR is a different type of cell-surface glycoprotein receptor. The dimerized IR consists of two extracellular α-subunits and two transmembrane β-subunits. The α-subunit contains the insulin-binding domain, and the β-subunit contains the tyrosine kinase and phosphorylation sites. Binding of insulin to the IR activates tyrosine kinase to phosphorylate the IR and other intracellular substrates.

The IR is encoded by a single gene comprising 22 exons. The first 11 exons code for the α-subunit, while the remaining 11 exons code for the β-subunit (49). The preproreceptor undergoes a posttranslational proteolytic process to generate the α- and β-subunits. The α-subunit exists in two isoforms resulting from alternative splicing of the 36-nucleotide exon 11, which is differentially regulated in a tissue-specific fashion (50). The A isoform (IR-A) lacks exon 11, is expressed ubiquitously, and is the only isoform in lymphocytes, brain, and spleen. The B isoform (IR-B) contains exon 11 and is expressed predominantly in liver, muscle, adipocytes, and kidney.
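
Because the length of exon 11 is a multiple of three, its inclusion or skipping leaves the downstream reading frame unchanged and alters the protein only by a whole number of codons. The following Python lines are a small worked check of that arithmetic; the function name codons_added is a hypothetical helper, and the only value taken from the text is the 36-nucleotide exon length.

```python
EXON_11_LENGTH_NT = 36  # exon length stated in the text

def codons_added(exon_length_nt):
    """Return (whole codons, leftover nucleotides) contributed by an exon of this length."""
    return divmod(exon_length_nt, 3)

codons, leftover = codons_added(EXON_11_LENGTH_NT)
preserves_reading_frame = (leftover == 0)

print("codons contributed by exon 11:", codons)
print("reading frame preserved:", preserves_reading_frame)
```

The result agrees with the 12-amino acid difference between IR-A and IR-B described for the two isoforms.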

Several lines of evidence suggest that the production of IR-A is associated with the differentiation stage of the cells. First, IR-A is preferentially expressed in fetal cells such as fetal fibroblasts and muscle and liver cells under normal physiological conditions (51). Second, IR-A expression is up-regulated in a number of tumors, including breast and colon cancer cells (51). Third, splicing of exon 11 has been shifted to produce more IR-B mRNA when HepG2 hepatoma cells were cultured in a differentiation medium containing dexamethasone (18). Interestingly, the ratio of IR-A to IR-B in insulin-sensitive cells could be changed by insulin, suggesting that alternative splicing can be regulated by changes in signal transduction pathways (52).

IR-A and IR-B differ by 12 amino acids at the carboxy terminus of their α-subunits and exhibit clearly different functions. Although IR-A has a higher affinity for insulin (53), IR-B autoregulates itself to a greater extent than IR-A and has increased kinase activity toward the IR substrate I in vitro (54). Additional phosphorylation sites on IR-B have also been detected. On the other hand, pp120-regulated insulin endocytosis and degradation occurred when NIH 3T3 cells were cotransfected with pp120, which is one of the IR substrates in liver, and IR-A but not IR-B (55). In addition to insulin, insulin-like growth factor II (IGF-II) can bind IR-A with an affinity close to that of insulin (51). There is experimental evidence that this may have relevance for breast cancer where IR-A is preferentially overexpressed, leading to IGF-II-mediated growth (56).

e. Gonadotropin receptors.
FSH and LH belong to a family of glycoprotein hormones generated in the pituitary and regulate reproduction. FSH and LH stimulate target cells by binding specifically to membrane receptors (FSHR and LHR, respectively) and activating a cascade of biochemical reactions triggered by the G protein-signaling pathway. FSHR and LHR proteins have a common structure, consisting of an extracellular domain, membrane-spanning domain, and intracellular domain.

FSHR and LHR have been cloned from many species, including human, rat, mouse, sheep, pig, chicken, and turkey. RNA transcripts for both genes undergo extensive alternative splicing, generating numerous variants. Some of the variants are species specific; most are shared by a wide range of species. The function of most of these isoforms remains largely unknown. However, given the large number of these variants and, in some cases, the tissue-specific distribution of the variants, it is almost impossible to ignore their existence.

The structure and organization of the FSHR and LHR genes are very similar. For example, they are both large genes (FSHR, 54 kb; LHR, 70 kb). Also, exons 1–9 in FSHR and 1–10 in LHR encode the large extracellular domain, while exon 10 in FSHR and exon 11 in LHR encode the transmembrane and short cytoplasmic domains (Fig. 5) (57). In addition, the pre-mRNA of both FSHR and LHR undergoes many different types of alternative splicing; the most common type is the skipping or inclusion of one or more internal exons and is discussed here, while other types will be discussed in other sections.



Figure 5. Alternative splicing of gonadotropin receptor pre-mRNAs. The top line in each panel represents the genomic DNA structure. The domain structures of the polypeptides are indicated above the exons. The exon size is indicated in nucleotides above each exon. The mRNA products resulting from various alternative splicing pathways are diagrammed below the DNA structure. The types of alternative splicing are shown on the left. The "X" in exons indicates the presence of an in-frame stop codon; the black exons indicate that these exons contain noncoding sequences because of the incorporation of the stop codon upstream of these exons. Panel A, FSHR. Panel B, LHR.

 
For both FSHR and LHR, the skipping of one or more exons in the extracellular domain leads to in-frame deletion of amino acids in the resulting isoform without changing the open reading frame. These skip forms most commonly appear to produce nonfunctional receptors (58, 59, 60). For example, Tena-Sempere et al. (58) demonstrated that the exon-lacking FSHR variants in mice were unable to bind FSH or elicit cAMP or progesterone responses when they were introduced into cells by transient transfection. Furthermore, the same variants fail to modulate the FSHR function when cotransfected with full-length mouse FSHR. These experiments suggest that the mouse exon-lacking variants are not functional.

Some gonadotropin receptor variants originate from the inclusion of an extra exon. For example, in rat testis, FSHR mRNAs have been identified that contain either exon 4A between exons 4 and 5 or exon 9A between exons 9 and 10 (Fig. 5) (61). Both mRNAs encode truncated FSHR proteins consisting of the entire extracellular domain or the amino-terminal half of the extracellular domain. The two variants appeared to be nonfunctional.

f. GH receptor (GHR).
Similar to FSHR and LHR, GHR is a polypeptide hormone receptor located on the membrane of target cells. GHR consists of a large hormone-binding extracellular domain (~245 amino acids) and, unlike FSHR and LHR, a short transmembrane domain (24 amino acids) and large intracellular domain (~350 amino acids) (62).

A soluble GH-binding protein (GHBP) that shares sequences with GHR has been detected in a number of species. GHBP is generated by at least three different, species-specific mechanisms. In rabbits and humans, GHBP is generated by proteolytic cleavage of the GHR hormone-binding domain (63). Interestingly, this proteolytic process is enhanced by alternative splicing in humans. Two truncated GHR variants, GHR1–279 and GHR1–277, have been identified that lack part of the intracellular domain of GHR. These two variants are generated through two different alternative splicing events. GHR1–277 is encoded by an mRNA that lacks the entire exon 9 (Fig. 6A) (64, 65). The mechanism that generates GHR1–279 will be discussed later.



Figure 6. GH-binding protein (GHBP) is generated by different alternative splicing events in mouse, human, and chicken. The domain structure of GHR, which is similar in various species, is indicated on top of the human GHR DNA. The alternatively spliced exons and their splicing pathways are depicted in shaded areas. A, In humans, alternative 3'-splice site usage or selective exclusion of exon 9 leads to the production of GHR1–279 and GHR1–277, respectively, which encode shorter polypeptides because of the presence of in-frame stop codons. B, In mice, GHBP is produced by selective usage of a 3'-terminal exon, 8A. C, In chickens, GHBP is produced by an mRNA molecule that undergoes alternative 3'-splice site usage that leads to the incorporation of a premature stop codon.

 
2. Alternatively spliced 3'-terminal exons.
a. Calcitonin/CGRP.
Alternative splicing of calcitonin/CGRP was discovered in 1982 (7, 8). This alternative splicing event is among the earliest described and is a classic example of production of two peptides having entirely different sequences regulated by alternative splicing.

The genomic structure of the calcitonin/CGRP gene is relatively simple. The gene spans 8 kb of DNA containing six exons. One unique feature is that it has two 3'-terminal exons, designated 4 and 6 (Fig. 7). In addition, the pre-mRNA of the calcitonin/CGRP gene is processed to generate two distinct peptide hormones, calcitonin and CGRP, through mutually exclusive usage of exon 4 and exons 5 and 6, respectively. Also, the coding sequences for calcitonin and CGRP are on exon 4 and exon 5, respectively, while exons 2 and 3 encode the signal peptide (Fig. 7).



Figure 7. Schematic representation of alternative RNA processing of calcitonin/CGRP. Coding regions for calcitonin or CGRP peptide are located in exons 4 or 5. Two polyadenylation sites are present in exons 4 and 6. Processing of the calcitonin/CGRP pre-mRNA produces calcitonin mRNA in thyroid C cells by joining exons 1–4 and CGRP mRNA in neuronal cells by joining exons 1–3 and 5–6.

 
The calcitonin/CGRP gene is transcribed widely in neurons and neuroendocrine cells, most commonly in the thyroid C cells. In thyroid C cells, 95–98% of the calcitonin/CGRP pre-mRNA is processed to splice exons 1–3 to exon 4 and use exon 4 as the 3'-terminal exon with concomitant polyadenylation at the end of this exon (Fig. 7). The resulting mRNA encodes one of the most potent peptide inhibitors of osteoclast-mediated bone resorption. In neuronal cells, primarily in trigeminal ganglia and hippocampus cells, 99% of the calcitonin/CGRP pre-mRNA is processed to splice exons 1–3 to exons 5 and 6 and use exon 6 as the 3'-terminal exon with concomitant polyadenylation at the end of this exon. The resulting mRNA from this splicing pathway produces CGRP, a neurotransmitter that is involved in sensory neuronal function and is also the most potent endogenous vasodilator. The splicing ratio is altered in transformed C cells [medullary thyroid carcinoma (MTC)] in which roughly equal amounts of calcitonin and CGRP mRNA are produced (7).

Alternative splicing of the calcitonin/CGRP gene has been studied extensively by several groups. These studies have led to the identification of multiple sequence elements and protein factors involved in this regulated alternative splicing event. These results will be summarized in Section V.

b. Gonadotropin receptors.
The LH and FSH receptors (LHR, FSHR) are examples of genes with a second type of alternative splicing pattern. A second mechanism (the first is described in an earlier section) that generates truncated FSHR and LHR variants is the selective usage of an alternative 3'-terminal exon that contains a different polyadenylation site. As shown in Fig. 5, at least three FSHR and two LHR isoforms are generated by this mechanism (57, 66, 67). Additionally, one LHR isoform is generated by using an alternative polyadenylation site located in intron 10 (Fig. 5) (68). The mRNAs that contain altered 3'-terminal sequences lead to the production of truncated proteins lacking various amounts of the full-length protein.

The functions of the truncated FSHR proteins have been determined in previous studies. The isoform generated from exons 1–8 was efficiently expressed on the cell surface, presumably using other compensating motifs for membrane insertion, and displayed high affinity and specificity for FSH binding (66). More dramatically, the FSHR isoform that changes the carboxy-terminal intracellular domain from 65 to 40 amino acids (the last one listed for FSHR in Fig. 5) functions as a dominant negative receptor (69, 70). By itself, this isoform was incapable of activating adenylate cyclase, although it was present on the plasma membrane and exhibited specific FSH-binding activity. Coexpression of this isoform with the active FSHR full-length receptor resulted in a dramatic loss of cAMP accumulation after FSH induction (69, 70).

c. GHR.
As discussed earlier, GHBP is generated by different species-specific mechanisms. In mice and rats, GHBP is generated by alternative inclusion of exon 8A between exon 7 encoding the 3'-end of the extracellular domain and exon 8 encoding the transmembrane domain (Fig. 6B) (71). When exon 8A is included, polyadenylation occurs at the end of this exon, and the downstream exon 8 will not be included.

B. Alternative usage of splice sites
1. CREB. Two isoforms of CREB, CREBΔ-35 and CREBΔ-14, have been identified at low levels in brain, thymus, and testis (72). The two alternative 3'-splice sites in exons F and G are used to produce shorter CREB mRNAs (Fig. 3B), which in turn leads to incorporation of in-frame stop codons. The function of these two short isoforms is not clear.

2. Gonadotropin receptors. Recently, an LHR mRNA was identified in turkeys and chickens in which an intron was partially included in the final mRNA through alternative 3'-splice site usage (Fig. 5B) (68). This intron-containing mRNA encodes a truncated protein variant containing only the extracellular domain. The function of this LHR isoform was not investigated.

3. GHR. The third mechanism that generates soluble GHBP is usage of an alternative 3'-splice site. In chickens, a 17-nucleotide sequence of intron 6 is inserted in the final mRNA through usage of an alternative 3'-splice site at the end of intron 6 to produce the truncated GHR (Fig. 6C) (73). In humans, a 26-nucleotide sequence of exon 9 is deleted by selective usage of a 3'-splice site located in exon 9 (GHR1–279, Fig. 6A). This mRNA generates a smaller protein because of the incorporation of an in-frame stop codon. Through an unknown mechanism, the media of 293 cells transfected with GHR1–279 contained 20-fold more GHBP than that found in the media of cells transfected with the GHR full-length protein (65). More interestingly, although inactive by itself, GHR1–279 can form heterodimers with the full-length GHR and acts as a negative regulator of the full-length receptor (65).
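The consequences of these length changes for the reading frame can be checked with simple arithmetic. The following Python sketch is an illustration only: the function name and the event labels are hypothetical, and the lengths are those quoted in the text (the 36-nucleotide IR exon 11, the 17-nucleotide chicken GHR insertion, and the 26-nucleotide human GHR deletion). It shows only whether a change is a multiple of three; it does not locate any stop codon.

```python
def frame_effect(length_change_nt: int) -> str:
    """Classify an insertion (+) or deletion (-) by its effect on the reading frame."""
    size = abs(length_change_nt)
    remainder = size % 3
    if remainder == 0:
        # A multiple of three adds or removes whole codons only.
        return f"in-frame ({size // 3} codons)"
    # Any other length shifts the downstream reading frame.
    return f"frameshift (remainder {remainder})"


# Lengths quoted in the text; the labels are illustrative.
events = {
    "IR exon 11 skipped": -36,
    "chicken GHR, intron 6 sequence inserted": +17,
    "human GHR1-279, exon 9 sequence deleted": -26,
}

for name, delta in events.items():
    print(f"{name}: {frame_effect(delta)}")
```

Skipping the 36-nucleotide IR exon removes exactly 12 codons, in agreement with the 12-amino acid difference between IR-A and IR-B, whereas neither 17 nor 26 is a multiple of three. A shifted frame downstream of the alternative splice site is therefore one plausible route to the premature stop codons described for the GHR variants.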

In addition to the splicing variants discussed above, GHR transcripts undergo more alternative RNA processing at the 5'-UTR and the region encoding the extracellular hormone-binding domain. Generation of these variants is summarized in an elegant review by Eden and Talamantes (62).

4. Vascular endothelial growth factor (VEGF) receptor R1 (FLT-1). Angiogenesis plays a significant role in mammalian reproduction and is controlled by a balance of pro- and anti-angiogenic factors. Members of the VEGF family are among the proangiogenic factors expressed in the placenta, which stimulate vasculogenesis and angiogenesis in early pregnancy. The balance between these angiogenic inducers and inhibitors regulates the net angiogenic effect.

One of the VEGF receptors, VEGFR1 (FLT-1), is expressed as a cell surface receptor in the spongiotrophoblast layer of the placenta and is a potent stimulator of angiogenesis. Recently, a soluble form of this receptor, sFLT-1, was identified, and its cDNA was cloned from mice (74). sFLT-1 is shorter than FLT-1 and has a different C-terminal amino acid sequence containing the ligand-binding domain but not the transmembrane domain. sFLT-1 is produced from the FLT-1 gene by alternative splicing; this mechanism is evolutionarily conserved because a corresponding sequence of human sFLT-1 was also identified. Although the nature of this alternative splicing event is unknown at present, the sequence at the site of the divergence of FLT-1 and sFLT-1 suggests that differential 5'-splice site usage is involved.

He et al. (74) also demonstrated that sFLT-1 is expressed in vivo and functions as a potent antagonist to VEGF. The expression of FLT-1 and sFLT-1 in placental spongiotrophoblast cells is regulated in pregnant mice. The sFLT-1 transcripts were undetectable at day 11 and increased between days 13 and 17, while the FLT-1 transcripts were detected at days 11, 13, and 15, but disappeared at day 17. In addition, the ligand-binding domain on sFLT-1 enabled it to bind VEGF in serum, which resulted in the inhibition of the binding of VEGF to the cell surface receptor, FLT-1. Therefore, the ratio of FLT-1 to sFLT-1, regulated by alternative splicing, may be an important determinant for placental angiogenesis in pregnant mice.


    IV. Strategies to Study Alternative RNA Processing
 
With the discovery of an increasing number of genes that undergo alternative splicing, it is obvious that this mechanism plays a very significant role in gene expression and warrants extensive studies. However, splicing is such a specialized field that investigators whose primary focus is not RNA often find it difficult to begin their studies on the mechanism of alternative splicing. This section is intended to provide a general outline for how to study the molecular mechanisms regulating alternative splicing. It will describe the routine methods and techniques used in these studies. It will also discuss the concerns and potential problems with each method.

A study of the mechanism controlling any alternative splicing event may be divided into three phases that can overlap one another. In phase I, one develops a model system for the in vivo and in vitro studies. In the splicing field, an in vivo study denotes a study performed in whole cells, not necessarily in whole animals, while an in vitro study means a study performed using extracts prepared from cells. In phase II, one determines the cis-acting sequence requirement for each splicing pathway. In phase III, one determines the trans-acting protein components involved in each pathway.

A. Development of model systems
As a first step to study the mechanism controlling an alternative splicing event, one must develop a model system that contains two components: cell lines or tissues that duplicate the two splicing patterns in whole animals and a minigene construct that contains all of the necessary sequence information for each splicing pathway.

1. Cell models. When cell models are being developed, it is important to understand the concept of the "default" pathway of an alternative splicing event. The default pathway is generally defined as the pathway that is not regulated, meaning that only the constitutive splicing factors are required for it to happen. In most cases, the default pathway is also the one that occurs in most of the tissues or cells that express the gene of interest, while the regulated pathway is the one that occurs in only one or a few tissues. In contrast to differentially spliced exons, exons that are included in all cells at all times are termed constitutively spliced exons.

Cell models can be developed from either primary cultures of specific tissue types or established cell lines. The advantage of using primary cultures is obvious: the cultures represent the actual cellular environment where the alternative splicing occurs, so the factors identified in these cells are likely true players. Additionally, primary cultures of specific tissue types may be the only available model if there are no cell lines available for the appropriate splicing phenotype. There are several practical disadvantages of primary cultures as well. First, cells in some tissues cannot be adapted to grow in primary cultures. In some cases, it takes a great deal of time to develop a method to culture certain types of cells. Second, cells that do grow in primary cultures may exhibit extremely low efficiency for transfection, a critical methodology for studying the sequence requirement. Third, it is often difficult to grow cells in primary cultures in the large quantities required for preparing nuclear extracts for in vitro studies.

The alternative to using primary cultures is to test a battery of established cell lines and choose the two cell lines that show opposite splicing phenotypes. Unless the gene of interest is endogenously expressed in a number of cell lines, this process usually involves transfection of cells with a model construct containing the DNA sequence of interest (see below) followed by RNA analysis, such as RT-PCR, RNase protection assay (RPA), or primer extension, to determine the splicing phenotype. For example, HeLa cells have been commonly used for the default splicing pathways for many genes. In addition, cell lines have been widely used to study a large number of alternative splicing events because they are relatively easy to maintain, transfect, and grow in large quantities. However, the caveat of using cell lines for studying alternative splicing of a gene that is not normally expressed in those cells is the potential of studying an artifact; namely, the cultured cells give a specific splicing choice for a completely different reason.

The ideal cell model system may involve the following components. To study the sequence requirements, HeLa cells are used for the default pathway, and a primary culture or different cell line is used for the regulated splicing pathway. To study the trans-acting protein components, there is a growing trend of using the real tissues where the gene of interest undergoes regulated alternative splicing for preparing nuclear extracts (75).

2. Minigene construct. The second component of a model system is a minigene construct used for transfection of the cell lines. The reason for creating a minigene construct is a practical one: simplification of the cloning process for DNA sequence manipulations. Vertebrate genes are usually very large, containing small exons having an average size of 137 bp and introns as large as tens of thousands of base pairs. Eliminating the unnecessary sequences from the minigene construct at this stage significantly speeds up the mutational analysis that defines the cis-acting sequence elements required for a splicing pathway.

The minigene construct usually contains the exon that is differentially included, two exons flanking the alternative exon, and minimal intron sequences between these exons. This usually indicates a large deletion of intron sequences from the wild-type gene. While the deletions simplify the minigene construct, there is danger associated with doing so for two reasons. First, so many splicing elements have been identified in intron sequences that there is probably a better chance of finding an element in an intron than in an exon. Discovery of these intron elements has actually changed our understanding of the function of intron sequences: they are not merely junk sequences to be spliced out during splicing. Second, multiple splicing elements, including sometimes redundant positive and negative elements, have been found to be associated with a single alternative splicing event. The examples include c-src (76), the GABAA γ-subunit (77), FGFR1 (78, 79), α-tropomyosin (80), cardiac troponin T (81), CD44 (25), and many other genes. Removal of intron sequence may result in removal of one or more of these multiple elements, thereby confusing results. In some cases, the intron size is also important (82). Therefore, it may be difficult to decide which sequences to delete.

Although there is an example of splicing elements located throughout the length of introns (83), in most cases, the intron elements are located close to the exon that is alternatively spliced. It is therefore reasonable to include intron sequences of up to 2 kb that flank the alternatively included exon in the minigene construct. To test whether a minigene construct contains all of the necessary sequences for the alternative splicing event, transfection experiments should be performed using the two cell types that process the pre-mRNA through two opposite splicing pathways. If correct splicing phenotypes are observed, it is reasonable to use the minigene construct to carry out further experiments.
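The design rule above can be sketched as a small coordinate calculation. The Python example below is a hypothetical illustration, not a published protocol: the function names and the example coordinates are invented, and the 200-nucleotide margin retained next to each flanking exon (to preserve its splice site) is an assumption. Only the 2-kb figure for intron sequence adjacent to the alternative exon comes from the text.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) genomic coordinates


def trim_intron(start: int, end: int, keep_near_alt: int,
                keep_near_flank: int, alt_side: str) -> List[Interval]:
    """Return the parts of one intron to retain in the minigene.

    alt_side is "end" when the alternative exon follows the intron,
    and "start" when the alternative exon precedes it.
    """
    if end - start <= keep_near_alt + keep_near_flank:
        return [(start, end)]  # short intron: keep it whole
    if alt_side == "end":
        return [(start, start + keep_near_flank), (end - keep_near_alt, end)]
    return [(start, start + keep_near_alt), (end - keep_near_flank, end)]


def minigene_segments(up_exon: Interval, alt_exon: Interval, down_exon: Interval,
                      keep_near_alt: int = 2000,
                      keep_near_flank: int = 200) -> List[Interval]:
    """List the genomic intervals to clone, merging pieces that touch."""
    pieces = [up_exon]
    pieces += trim_intron(up_exon[1], alt_exon[0], keep_near_alt, keep_near_flank, "end")
    pieces.append(alt_exon)
    pieces += trim_intron(alt_exon[1], down_exon[0], keep_near_alt, keep_near_flank, "start")
    pieces.append(down_exon)

    merged = [pieces[0]]
    for start, end in pieces[1:]:
        if start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Invented coordinates: a 15-kb upstream intron and an 814-nt downstream intron.
segments = minigene_segments((1000, 1150), (16150, 16186), (17000, 17140))
print(segments)
print("total length:", sum(end - start for start, end in segments))
```

In this invented case, the long upstream intron is reduced to two retained blocks while the short downstream intron is kept whole. As the text cautions, any such trimmed construct still has to be validated by transfection into the two cell types that show opposite splicing phenotypes.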

After initial characterization of the sequence elements, it is sometimes necessary to create a smaller minigene construct, further eliminating sequences in the original construct. One of the reasons for doing so is that the elements are sometimes too complicated, containing multiple sequence motifs; it also makes it easier to separate the motifs and analyze them individually (84).

B. Identification of cis-acting sequence elements
During the creation of the minigene construct, one may gain insight as to which sequences are important for either splicing pathway. For example, deletion of intron sequence in the minigene construct may change the alternative splicing phenotype, indicating the deleted sequence is important. The next step is the detailed sequence analysis to determine the critical sequence motifs. Traditionally, this process involves two types of sequence manipulation. The first is generation of deletion mutants based on the minigene construct followed by transfection/RNA analysis using the two cell types that show opposite splicing phenotypes. A heterologous sequence similar in size to the deleted sequence is often used to replace it, to ensure that the effect observed with the deletion constructs is not a distance effect. However, the substituted sequence may introduce unknown elements, complicating the result. Testing multiple heterologous sequences generally reduces this possibility. The second type of sequence manipulation is the generation of point mutations. Technically, both types of sequence manipulation can be achieved by combining old-fashioned, restriction enzyme-directed mutagenesis and modern, PCR-directed mutagenesis.

Recently, investigators have taken advantage of the power of evolution to identify splicing elements. The rationale is the belief that sequences important for any molecular steps of gene expression should be conserved among different species during evolution. The method involves DNA cloning and sequencing regions that flank and/or are located in the alternatively spliced exon from several different species in which there is conservation of the alternative splicing event. Alignment of the sequences obtained from a number of species will indicate the conserved sequences, which are potentially important for a specific alternative splicing event.

The method described above is extremely powerful in identifying intronic elements because nonfunctional intron sequences are usually not conserved during evolution. Using this phylogenetic analysis, a number of intronic splicing elements have been identified from genes such as c-src (85, 86), calcitonin/CGRP (87), cardiac troponin T (81), Drosophila transformer-2 (88), FGFR1 (W. Jin, personal communication), fibronectin (89), and hnRNP A1 (90), just to name a few.
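The alignment step of this phylogenetic approach can be illustrated with a short Python sketch. It assumes the sequences have already been aligned to equal length by a separate alignment program, and it reports only runs of columns that are identical in every species. The function name, the minimum run length, and the three toy sequences are hypothetical and do not come from any real gene.

```python
from typing import List, Tuple


def conserved_blocks(aligned: List[str], min_len: int = 6) -> List[Tuple[int, int, str]]:
    """Return (start, end, sequence) for runs of fully conserved alignment columns.

    Coordinates are half-open alignment positions. Gap columns ("-") are
    never counted as conserved.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")

    blocks = []
    run_start = None
    # Iterate one position past the end so a run reaching the last column is closed.
    for i in range(length + 1):
        conserved = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if conserved and run_start is None:
            run_start = i
        elif not conserved and run_start is not None:
            if i - run_start >= min_len:
                blocks.append((run_start, i, aligned[0][run_start:i]))
            run_start = None
    return blocks


# Toy intron fragments from three hypothetical species, already aligned.
toy_alignment = [
    "ACGTTGCATGAAGGCTTA",
    "ATGTTGCATGCAGGTTTA",
    "ACCTTGCATGAAGG-TTA",
]
print(conserved_blocks(toy_alignment))
```

Requiring perfect identity is a deliberately strict choice for illustration; in practice a tolerance for occasional mismatches may be more appropriate. A conserved block identified in this way is only a candidate element and still requires the mutational tests described in this section.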

Splicing elements can also be identified by searching for recognizable sequence motifs in the sequence of interest. For example, exon splicing enhancers that are either rich in purine or in cytosine/adenosine (91) and intronic splicing repressors that are rich in cytosine/uridine have been found in many genes (92). In addition, splicing signals (5'- and 3'-splice site sequences) have also been found to act as regulators when they are located in exons or introns (87, 93). Table 2 provides a list of exon and intron elements that have been shown to be important in alternative splicing.
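A composition-based search of this kind can be sketched as a sliding-window scan. In the Python example below, the window size, the threshold, the function name, and the test sequence are all assumptions chosen for illustration; they are not values taken from the cited studies. The same function can be applied with purines (A/G) to look for candidate exon enhancers or with pyrimidines (C/U) to look for candidate intronic repressors.

```python
from typing import List, Tuple


def rich_windows(rna: str, bases: str, window: int = 8,
                 min_fraction: float = 0.85) -> List[Tuple[int, int]]:
    """Return merged half-open intervals covered by windows enriched in `bases`.

    A window qualifies when the fraction of its nucleotides found in `bases`
    is at least `min_fraction`. Overlapping or touching windows are merged.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA or RNA input
    hits: List[Tuple[int, int]] = []
    for i in range(len(rna) - window + 1):
        count = sum(1 for base in rna[i:i + window] if base in bases)
        if count / window >= min_fraction:
            if hits and i <= hits[-1][1]:
                hits[-1] = (hits[-1][0], i + window)
            else:
                hits.append((i, i + window))
    return hits


# Invented sequence: a purine run embedded between uridine stretches.
sequence = "UUUUGAAGAAGAAGUUUU"
print("purine-rich:", rich_windows(sequence, "AG", min_fraction=1.0))
print("C/U-rich:", rich_windows(sequence, "CU", min_fraction=1.0))
```

A scan of this type only flags regions whose composition resembles known elements. Whether a flagged region actually functions as an enhancer or repressor has to be established by deletion and point mutation analysis.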


Table 2. Exon and intron RNA processing elements

 
There are several additional steps necessary to prove the relevance of a particular element or sequence. Deletion of the element (with maintenance of size by insertion of a heterologous sequence) with a change in the splicing phenotype is an important first step. However, proof that the element is important also requires point mutation. Since splicing elements can be large, this may require extensive mutagenesis before key components of the elements are identified. Additional evidence for the importance of the element can be provided by demonstrating its activity in a completely unrelated splicing context. It is also essential to demonstrate that point mutations that alter the splice choice in a minigene have similar effects when inserted into an intact wild-type gene expressed in relevant cell types. Finally, it is important to be vigorous in the pursuit of key components of a splicing regulatory sequence because the mutant sequences will provide tools for the next step, the identification of trans-acting factors.

A complicated picture often emerges following this phase of a study because of the identification of multiple, sometimes even redundant, elements. Both positive (enhancer) and negative (repressor or silencer) elements can be associated with a single alternative splicing event (Table 2). There is emerging evidence that the combinatorial effects of multiple elements control the fate of an exon (94). Some are required for exclusion of an exon, while others are required for inclusion. This complex picture reflects the increasingly accepted notion that alternatively spliced exons have evolved to contain suboptimal splicing signals and/or repressor elements, perhaps to ensure that the exons are not constitutively included in all cells. Alternatively spliced exons therefore need enhancer sequences and cell-specific trans-acting factors to make them visible to the splicing machinery. Splicing decisions are frequently found to be regulated by multiple splicing elements. It is not surprising that point mutations of several different elements may modify the splicing phenotypes. This type of result is to be expected, and the relative importance of each element may not be fully understood until all of the relevant trans-acting factors are identified. A corollary is that identification of a single element that can alter splicing phenotype does not mean that it is the regulatory element controlling the cell-specific alternative splicing.

After identifying the splicing elements, one must decide which elements to analyze further. These elements can be categorized as elements required for the default splicing pathway or the regulated splicing pathway. In general, investigators pay more attention to the elements involved in the regulated splicing pathway because the ultimate question is how the alternative splicing event is regulated.

C. Identification of trans-acting protein components regulating alternative splicing
Perhaps the most difficult and time-consuming phase is the identification of protein components that interact with the splicing elements. Isolation of protein factors involved in a regulated alternative splicing event requires an in vitro RNA processing system.

Nuclear extracts can be prepared from the two cell lines that represent each splicing pathway observed for the gene. In an ideal situation, proteins that interact with a splicing element can be isolated using one of several biochemical approaches and tested for their function by manipulating the nuclear extract. One of the common manipulations is the depletion of a specific protein from the nuclear extract followed by supplementation of a recombinant protein. If a protein is required for a splicing pathway, depleting the protein should abolish splicing or switch splicing pathways; supplementation of the recombinant protein should rescue splicing or switch back to the original splicing pathway.

Splicing-competent nuclear extracts can be easily prepared from HeLa cells (95), but it can be difficult and in some cases impossible to obtain such extracts from other types of cells. The latter extracts are defined as splicing-incompetent extracts and can still be used for characterizing protein components that interact with a specific sequence element. The difficult task, i.e., finding a functional test for the proteins identified from these extracts, will come later.

Proteins that regulate a splicing element can be divided into two classes: ones that bind to RNA and ones that interact with other proteins in a complex. During purification of protein factors, proteins that bind to RNA can be monitored by UV cross-linking assays (96), while proteins that do not bind to RNA can be monitored by RNA gel-shift analyses (97). Characterization of a number of splicing elements suggests that complexes formed on sequence elements are composed of proteins that interact directly with RNA and others that interact with those proteins. A general strategy is to initially focus on purifying the RNA-binding proteins and subsequently characterize the protein complex using the identified RNA-binding protein as a tool.

The conventional method of protein purification involves a series of chromatography columns, the last of which is usually an RNA affinity column. This approach is labor intensive and time consuming, yet it remains a powerful tool. Several new techniques that have potential for purifying factors that interact with RNA sequences have been developed recently. Two of these methods will be discussed in the following section.

A yeast three-hybrid genetic screening method has been successfully used to clone two RNA-binding proteins from cDNA libraries (98, 99, 100). This screen is similar in principle to that of the two-hybrid screen except that an additional hybrid RNA molecule is included for isolation of RNA-binding proteins (Fig. 8). The two proteins cloned using this technique are stem-loop binding protein, which binds the 3'-end of histone mRNA, and fem-3 binding factor, which binds to the 3'-UTR of fem-3 mRNA. To date, no splicing factors have been isolated using this method. One potential caveat originates from the fact that RNA-binding proteins tend to form a complex that has a greater RNA affinity than do any of the individual proteins (97).



Figure 8. Diagram showing the principle of a yeast three-hybrid screen, a selection approach used to identify RNA-binding proteins. The three-hybrid screen is modified from the yeast two-hybrid screen and involves three hybrid molecules. Hybrid protein 1 contains the LexA DNA binding domain fused to the MS2 coat protein. Hybrid protein 2 consists of the Gal4 activation domain linked to the RNA-binding domain Y of any RNA-binding protein. The hybrid RNA consists of an MS2 binding site and the RNA sequence, X, to be used as bait in the screen. As in a two-hybrid system, Hybrid 2 is made from a cDNA library, and the reporter genes are His and LacZ. The reporter genes can only be activated by binding of the fusion protein containing the Gal4 activation domain to the RNA sequence X through interactions between the RNA sequence and its binding protein.

 
Another novel method, the so-called StreptoTag technique, is an affinity-purification method developed by Bachler et al. (101). In this technique, a hybrid RNA containing an aptamer sequence having a high affinity for the antibiotic streptomycin and the RNA sequence of interest is incubated with a crude nuclear extract. After complex formation, the samples are applied to an affinity column that contains streptomycin immobilized to sepharose. The RNA-protein complex is recovered by elution with the antibiotic. This method provides much higher specificity than other existing RNA affinity purification methods. In one study, a one-step affinity purification led to the isolation of highly purified U1A protein capable of binding to U1 snRNA sequence from a crude yeast extract (101). It remains to be seen if this StreptoTag method can be used to isolate proteins that have a lower RNA-binding affinity.

To purify proteins that are involved in a complex but do not bind RNA, several strategies can be adopted. The well known yeast two-hybrid approach is highly effective if one of the RNA-binding proteins has been identified from the experiments discussed above. Another strategy is to initially isolate and purify the complex that forms on the RNA target using methods such as StreptoTag. In a second step, the protein components of the complex can be identified using mass spectrometry, a sensitive technique for analyzing minute quantities of proteins that has made quantum leaps in recent years and has been used to identify proteins present in spliceosomes (102).


    V. Mechanisms Controlling Alternative RNA Processing
 
A. Drosophila doublesex (dsx)
Somatic sex determination in Drosophila melanogaster involves a hierarchy of regulated alternative RNA processing. The pre-mRNA of dsx, the final gene in this hierarchy, undergoes sex-specific alternative splicing to produce male- and female-specific dsx proteins that have sex-specific regulatory functions. Male and female dsx mRNAs share the first three exons but have different 3'-exons and polyadenylation sites (Fig. 9A). In female dsx mRNA, the three common exons are spliced to exon 4, while in male dsx mRNA, they are spliced to exons 5 and 6.



Figure 9. Mechanism controlling alternative splicing of the Drosophila doublesex (dsx) pre-mRNA. A, Diagram showing the sex-specific alternative splicing pathways of the dsx pre-mRNA. Inclusion of exon 4 is female specific and exclusion of exon 4 is male specific. B, cis-Acting element. The dsx repeat element (dsxRE) sequences are represented by black squares. Sequences of dsxRE, PRE, and 3'-SS are shown. Abbreviations: DsxRE, dsx repeat element; PRE, purine-rich element; 3'-SS, 3'-splice site. C, trans-Acting factors. The dsx repeat and PRE sequences are capable of binding to tra-2, tra, and one of the SR family proteins. Small rectangular box, tra-2; small oval, tra; small circle, one of the SR proteins.

 
Genetics has provided a powerful tool for identifying both cis-acting elements and trans-acting factors involved in regulated alternative splicing of the dsx pre-mRNA. Through a series of elegant experiments using sex determination as an end point, the key genes regulating this splicing choice were identified. Mutations of dsx that cause a constitutive male dsx splicing pattern map to exon 4 (103). This region contains 6 copies of a 13-nucleotide repeat sequence and is the target for the genetically determined trans-acting factors tra and tra-2. A series of detailed studies have demonstrated that the male-specific pathway, exclusion of exon 4, is the default pathway and occurs in the absence of tra or tra-2 (104). Tra-2 is expressed in both males and females; its expression is necessary but not sufficient for female-specific inclusion of exon 4. Combining the expression of tra-2 and tra, which is expressed only in females, leads to the female-specific inclusion of exon 4 (103, 104).

Dsx exon 4 splicing is an excellent example of a concept alluded to earlier in this communication: the selective strengthening of an alternatively spliced exon with weak splice sites. In this case the 3'-splice site for exon 4 is particularly weak. A variety of intact cell and in vitro cell-free experiments have further characterized the exon 4 elements that are used to selectively strengthen this exon. Three hundred nucleotides downstream of the 3'-splice site of exon 4 is a 270-nucleotide (nt) element named the dsx repeat element (dsxRE, Fig. 9B). This element contains six copies of a 13-nucleotide repeat sequence and a purine-rich element (PRE) between repeats 5 and 6. The 13-nucleotide repeat sequence in dsxRE is nearly identical in distantly related D. melanogaster (6 copies) and D. virilis (4 copies); dsxREs from the 2 species are interchangeable in an in vitro splicing assay (105). DsxRE functions to activate the 3'-splice site of exon 4, which contains a poor polypyrimidine tract (106, 107). The sequence of this suboptimal polypyrimidine tract is also highly conserved evolutionarily (108).

Each 13-nucleotide repeat sequence forms a binding site for tra, tra-2, and another RNA-binding protein (RBP1) (108, 109). The PRE sequence binds to tra and tra-2 and one of the SR proteins, dSRp30 or B52/dSRp55 (109). All of these proteins belong to the SR family of splicing factors, characterized by the presence of at least one RNA-binding domain and an SR domain. In the absence of tra and tra-2, RBP1, dSRp30, and B52/dSRp55 bind to dsxRE with low affinity. Tra and tra-2 stabilize the complex formed between RBP1, dSRp30, or B52/dSRp55 and the splicing enhancer. This complex promotes binding of the constitutive splicing factor U2AF65 to the poor polypyrimidine tract at the 3'-splice site. U2AF65 binding is facilitated by the bridging factor U2AF35 (Fig. 9C) (110). In addition, RBP1 has been shown to activate female-specific splicing by binding to its target sequence at the 3'-splice site of exon 4 (108).

One copy of the 13-nucleotide repeat sequence was capable of forming a complex and activating tra- and tra-2-dependent female-specific splicing, albeit at low efficiency. The splicing efficiency increased linearly rather than synergistically when the number of repeats was increased (111). This experiment indicates that the function of multisite enhancer elements is to increase the probability of an interaction between the enhancer complex and splicing machinery rather than to promote functional synergy (111).
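The probabilistic interpretation above can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that each repeat independently contacts the splicing machinery with the same small probability; the function names and the value 0.05 are hypothetical and are not taken from the cited experiments. Under that assumption the chance of at least one productive interaction grows roughly in proportion to the number of repeats and never faster, which is the signature of additive rather than synergistic behavior.

```python
def interaction_probability(n_repeats: int, p_single: float) -> float:
    """Probability that at least one of n independent repeats contacts the
    splicing machinery, given per-repeat probability p_single."""
    if n_repeats < 0:
        raise ValueError("n_repeats must be non-negative")
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie between 0 and 1")
    return 1.0 - (1.0 - p_single) ** n_repeats


def fold_over_single(n_repeats: int, p_single: float) -> float:
    """Activity with n repeats relative to a single repeat."""
    if p_single <= 0.0:
        raise ValueError("p_single must be positive to compute a ratio")
    return (interaction_probability(n_repeats, p_single)
            / interaction_probability(1, p_single))


if __name__ == "__main__":
    p = 0.05  # arbitrary illustrative per-repeat probability
    for n in range(1, 7):
        print(f"{n} repeat(s): {fold_over_single(n, p):.2f}-fold over one repeat")
```

In this model the fold increase for n repeats is bounded above by n, so six repeats can give at most a sixfold gain; a synergistic mechanism would be expected to exceed that bound.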

B. Drosophila P-element
In D. melanogaster, P-element transposition is restricted to the germline. The molecular mechanism controlling this phenomenon involves alternative retention of the third intron (IVS3) of the P-element pre-mRNA. In germ cells, IVS3 is removed to generate an active transposase. However, in somatic cells, splicing of IVS3 is inhibited, leading to production of a shorter protein that can function as a negative regulator of transposition (Fig. 10A) (112).



Figure 10. Mechanism controlling alternative splicing of the Drosophila P-element pre-mRNA. A, Diagram showing that intron 3 (IVS3) of the P-element pre-mRNA is selectively included in somatic tissues. B, Sequences involved in P-element alternative splicing are shown. Exon 3 sequence is shown in the bracket, and the actual 5'-splice site sequence is underlined. The two pseudo 5'-splice sites are indicated. C, Factors involved in the P-element alternative splicing.

 
Rio and colleagues (112, 113) have carried out a series of elegant biochemical and genetic experiments to decipher the molecular mechanism that regulates the differential splicing of P-element IVS3. The P-element story remains one of the best characterized and most clear-cut regulated alternative splicing events. Rio and colleagues first developed an in vitro splicing complementation system using HeLa and Drosophila Kc cell nuclear extracts to demonstrate that the differential splicing of IVS3 is regulated by somatic repression. The ability of the HeLa extract to splice IVS3 was gradually reduced as more Kc extract was added, establishing that inhibited IVS3 splicing in soma is the regulated pathway (112).

The sequences responsible for somatic repression of IVS3 splicing are located in the 5'-exon of this intron. As shown in Fig. 10B, the inhibitory element contains two pseudo 5'-splice sites (F1 and F2) that are 20 nucleotides upstream of the accurate 5'-splice site (112, 113, 114). F1 and F2 are called pseudo