Endocrine Reviews 21 (6): 619-670
Copyright © 2000 by The Endocrine Society

The Origin and Function of the Pituitary Adenylate Cyclase-Activating Polypeptide (PACAP)/Glucagon Superfamily1

Nancy M. Sherwood, Sandra L. Krueckl and John E. McRory2

Department of Biology, University of Victoria, Victoria, British Columbia V8W 2Y2, Canada


    Abstract
The pituitary adenylate cyclase-activating polypeptide (PACAP)/glucagon superfamily includes nine hormones in humans that are related by structure, distribution (especially the brain and gut), function (often by activation of cAMP), and receptors (a subset of seven-transmembrane receptors). The nine hormones include glucagon, glucagon-like peptide-1 (GLP-1), GLP-2, glucose-dependent insulinotropic polypeptide (GIP), GH-releasing hormone (GRF), peptide histidine-methionine (PHM), PACAP, secretin, and vasoactive intestinal polypeptide (VIP). The origin of the ancestral superfamily members is at least as old as the invertebrates; the most ancient and tightly conserved members are PACAP and glucagon. Evidence to date suggests the superfamily began with a gene or exon duplication and then continued to diverge with some gene duplications in vertebrates. The function of PACAP is considered in detail because it is newly (1989) discovered; it is tightly conserved (96% over 700 million years); and it is probably the ancestral molecule. The diverse functions of PACAP include regulation of proliferation, differentiation, and apoptosis in some cell populations. In addition, PACAP regulates metabolism and the cardiovascular, endocrine, and immune systems, although the physiological event(s) that coordinates PACAP responses remains to be identified.

I. Introduction

II. Evolution of the Superfamily—Two Special Members

A. PACAP as the most tightly conserved family member

B. GRF with rapid structural changes in evolution

C. GRF/PACAP gene - a duplication late in evolution

III. Superfamily Evolution by Exon and Gene Duplication

A. Six single genes in human superfamily

B. Members of superfamily in many vertebrates

C. Key to origin of superfamily in protochordates

D. Extension of PACAP ancestry uncertain in insects

E. Exon duplication as original step in ancestry

F. Gene duplication crucial in creating gene families

G. Peptide elongation important in some families

H. Alternative splicing for expansion of individual families

I. Alternative promoters for expansion of family functions

J. Posttranslational processing for increase of peptide families

IV. Superfamily Members—Overlapping Functions, Expression, and Receptors

A. PACAP and VIP

B. PHM

C. Glucagon

D. GLP-1

E. GLP-2

F. GRF

G. GIP

H. Secretin

V. Conservation of PACAP—A Clue to Function

A. A regulator of the cell cycle and development

B. A regulator of smooth and cardiac muscles

C. An immune system regulator

D. A regulator of bone metabolism

E. An endocrine/paracrine regulator

F. An exocrine regulator

G. A regulator in the nervous system

VI. Conclusions and Future Directions


    I. Introduction
The superfamily of hormones that includes PACAP (pituitary adenylate cyclase-activating polypeptide) and glucagon has the distinction of containing secretin, the first factor to be identified as a hormone (1). Secretin is now one of nine bioactive peptides in the superfamily; each peptide is related in structure by the N-terminal amino acids (2). In humans the present members are shown in Fig. 1 and include PACAP, glucagon, glucagon-like peptide-1 (GLP-1), glucagon-like peptide-2 (GLP-2), GH-releasing hormone (GRF), vasoactive intestinal polypeptide (VIP), peptide histidine methionine (PHM), secretin, and glucose-dependent insulinotropic polypeptide (GIP). We have just begun to understand the origin of this rapidly evolving family of peptides and its many important functions.



 
Figure 1. Glucagon superfamily. The PACAP/glucagon superfamily members that are present in humans are arranged by length. Nine peptides are bioactive, but PACAP-related peptide (PRP) has not been shown to be bioactive to date. The N-terminal region (first 27 amino acids) is the main part of the bioactive core for these peptides.

 
In 1902 Bayliss and Starling (1) reported that extracts from the wall of the small intestine (duodenum and jejunum) contained secretin, which stimulated the pancreas to secrete a fluid or "juice." They demonstrated that the acid chyme from the stomach, or indeed hydrochloric acid alone, acted on the denervated intestine to release a chemical substance into the blood; this substance stimulated the pancreas to release a fluid that altered the food in the gut. Although they were able to identify some characteristics of secretin, it was not until 1966 that Mutt et al. (3) reported the peptide structure of secretin (isolated from pig). Another 25 yr elapsed before the gene and cDNA encoding secretin (isolated from rat) were reported (4, 5). It is surprising that the secretin structure is not reported for any animal except a bird and several mammals because Bayliss and Starling (1) showed in 1902 that the frog gut had secretin-like activity when tested in a dog.

Meanwhile, there has been an explosion of information about the other glucagon superfamily members in regard to sequences of the peptides, cDNAs, and genes. The location of many of the family members in the gut or pancreas was only a beginning to finding the peptides widely distributed in several tissues and species. For example, secretin was considered to be a gastrointestinal peptide but is now found in the gonads, brain, and developing pancreas (Section IV). Wide distribution is a recurring theme for the superfamily in which all but one of the members have been isolated from the brain in vertebrates; in addition, many are present in the gastrointestinal, pancreatic, and gonadal organs. Even the newly discovered (1989) family member named PACAP (6) has been identified in both brain and gonads; it is given special consideration in this review (Section V) as it is the most likely ancestral molecule for the superfamily.

The multiple functions of the peptides in the PACAP/glucagon superfamily overlap and continue to grow in number. For neuropeptides in general, Leslie Iversen (7) argues that the coordinated functions of single neuropeptides remain unclear despite our progress in identifying family members, receptor subtypes, and specific antagonists. Iversen states "To understand the wider biological significance of neuropeptides we need, perhaps, to look at some of their functions in simpler organisms." He suggests that one of the best-known examples of a simple peptide system is the egg-laying behavior in Aplysia in response to the release of several peptides from the bag cells; these peptides act on both the nervous system and gonads. In vertebrates, many of the PACAP/glucagon superfamily members also have actions in several organs. Thus, it seems appropriate to investigate the origin of the PACAP/glucagon superfamily with a view to identifying both the structures and functions, especially coordinated functions that have evolved in invertebrates and vertebrates.


    II. Evolution of the Superfamily—Two Special Members
A. PACAP as the most tightly conserved family member
PACAP is the most conserved member of the PACAP/glucagon superfamily in terms of length and sequence identity of the nucleotides and amino acids. In the last 10 yr, PACAP has been identified from 16 vertebrate species: human (8, 9), sheep (6, 10, 11), rat (12), mouse (13, 14), chicken (15), lizard (partial sequence in Ref. 16), frog (17), 5 species of salmon (18, 19), catfish (20), stargazer (21), and stingray (22). Additionally, two forms were characterized from an ancient protochordate, tunicate (23) (Fig. 2). The sequence conservation is shown by the complete identity of the amino acid sequence for mammalian PACAP peptides. The chicken and frog forms of PACAP each have only one amino acid change in comparison with the mammalian form, whereas fish PACAP peptides have three or four changes (89–92% amino acid identity) with mammalian PACAP. With one exception, amino acid substitutions in fish PACAP peptides are in the C-terminal region beyond amino acid 27.
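The percent identities quoted here follow directly from the number of substituted residues divided by the peptide length. A minimal Python sketch of that arithmetic is given below; it assumes the comparison is made over the full 38 residues of PACAP-38, and the function name is illustrative rather than taken from the source.

```python
def percent_identity(length: int, substitutions: int) -> float:
    """Percent of aligned positions that are identical between two peptides."""
    if length <= 0 or not 0 <= substitutions <= length:
        raise ValueError("substitutions must lie between 0 and length")
    return 100.0 * (length - substitutions) / length


# Assumption: identities are computed over the 38 residues of PACAP-38.
for changes in (1, 3, 4):
    print(changes, "change(s):", round(percent_identity(38, changes), 1), "%")
```

Under that assumption, three or four changes in 38 residues give roughly 92% and 89% identity, consistent with the 89–92% range quoted for fish PACAP.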



 
Figure 2. Comparison of amino acids in the PACAP (top) and GRF (bottom) families. Each peptide sequence and the percent identity are compared with the human sequence. Identical amino acids are represented by dots, whereas differing residues are shown in the single letter code for amino acids. PACAP is highly conserved in contrast to GRF peptides.
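The dot notation described in this caption can be reproduced with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the two 10-residue strings are toy sequences rather than real peptides, and the sequences are assumed to be already aligned to the same length with no gaps.

```python
def dot_alignment(reference: str, other: str) -> str:
    """Show residues identical to the reference as dots, differing ones as letters."""
    if len(reference) != len(other):
        raise ValueError("sequences must be pre-aligned to the same length")
    return "".join("." if r == o else o for r, o in zip(reference, other))


def identity(reference: str, other: str) -> float:
    """Percent identity of an aligned sequence relative to the reference."""
    row = dot_alignment(reference, other)
    return 100.0 * row.count(".") / len(reference)


# Toy sequences (hypothetical, not from the figure).
reference = "ACDEFGHIKL"
other = "ACDQFGHIRL"
print(reference)
print(dot_alignment(reference, other))
print(identity(reference, other))
```

A row of mostly dots, as for the PACAP family, indicates high conservation, whereas a row of mostly letters, as for the GRF family, indicates divergence from the human sequence.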

 
Of the different forms of PACAP characterized to date, the most ancient forms of PACAP are those in the tunicate (Fig. 2). The tunicate PACAP-1 cDNA encodes a PACAP of only 27 amino acids, whereas the vertebrate PACAPs are present as both a 38-amino acid form and a truncated 27-amino acid form that is identical to the N-terminal portion of PACAP-38. In the tunicate the 27-amino acid form is the result of a stop codon in the cDNA encoding PACAP-1 (23). There is a striking conservation of the tunicate PACAP-1 amino acids: 26 of 27 amino acids are conserved in comparison with the mammalian, frog, salmon, catfish, and stingray PACAPs. The sequence conservation is not limited to the peptides as shown by a nucleotide identity of 96% between the human and tunicate PACAP-1 cDNAs.

In addition to the tunicate PACAP-1 cDNA, PACAP-2 cDNA was isolated and found to encode a 27-amino acid peptide. The difference between PACAP-1 and -2 is that the latter has 4 amino acid changes compared with the mammalian form. Nevertheless, the high identity that is found between the two tunicate cDNAs and peptides suggests that the two PACAPs in tunicate probably originated from a gene duplication. This duplication could have occurred in the tunicate lineage at any time over the last 700 million years or in an organism that originated before the phylogenetically ancient tunicate. A peptide (amnesiac) has been identified in Drosophila with 18% sequence identity to human or tunicate PACAP-27; another peptide (maxadilan) in the sand fly (Lutzomyia longipalpis) has 15% identity with PACAP-27 and 16% with PACAP-38. Evidence to date does not clearly support these peptides as homologous to PACAP (see Section III.D. below). It is clear from the tunicate data that the evolutionary pressure to maintain the PACAP amino acid sequence is high; other known members of the PACAP/glucagon superfamily do not have such high conservation of amino acid sequences. As discussed in Section V, evidence as to why the primary structure of the PACAP family is so tightly conserved in terms of function is beginning to emerge.

B. GRF with rapid structural changes in evolution
Not quite 20 yr have passed since a GH-releasing factor (GRF) was isolated and sequenced from human pancreatic tumors. Rivier and co-workers (24) in 1982 found a 40-amino acid GRF peptide with a free carboxy terminus within their tumor extract. In addition to a 40-amino acid form, Guillemin et al. (25) in 1982 found a 44-amino acid, amidated GRF peptide as well as a 37-amino acid peptide (Fig. 2) from a different single pancreatic tumor.

In 1984 the hypothalamic form of the GRF peptide was sequenced and reported to be 44 amino acids and identical to the pancreatic tumor sequence (26). In subsequent years, the primary sequences of GRF from 16 vertebrate and 1 protochordate species have been identified (Fig. 2). A total of 15 distinct sequences were identified from human, pig, cattle, goat, sheep, hamster, rat, mouse, chicken, salmon, carp, catfish, and tunicate (Figs. 2 and 3). Of the 15 GRF sequences known, the GRF peptide sequence was determined by protein chemistry from 7 of the species. In addition, the cDNA and/or gene was isolated and the GRF peptide was deduced for the human (28, 29, 30), hamster (34), mouse (37, 38), chicken (15), salmon (18, 19), catfish (20), and tunicate (23).





 
Figure 3. Structure of peptides in PACAP/glucagon superfamily. *, Extended forms for some of the peptides within these groups have been reported. Amino acid substitutions are compared in each group with the human form of the peptide. Conservative substitutions (blue) and radical substitutions (red) were determined by positive values in the 250 PAM pairwise exchange table (PET91) matrix (139).

 
Unlike the conservation that is seen in the PACAP nucleotides and amino acids, limited conservation of nucleotides and amino acids is present in the GRF family. This divergence among species is not surprising in view of the relatively low sequence identity of 68% between the human and rat GRF peptides. For example, the chicken GRF peptide compared with GRF in other species has only 42% amino acid identity to human, 47% to rat, 76% to carp, and 59% to tunicate GRF (a 27-amino acid peptide). Sequence identity found among the different GRFs is due primarily to the bioactive core (140, 141) within the first 29 amino acids (Fig. 2). Little or no conservation is seen in the remaining amino acids (Fig. 2). The tunicate GRF-1 and GRF-2 are both 59% identical in amino acids to human GRF and 89% identical to each other.

Not only the sequence but also the length of GRF peptides varies among species. The human GRF peptide is 44 amino acids in healthy tissue. Other mammalian GRFs show a full-length peptide varying from 44 amino acids in cow, goat, and sheep to 43 amino acids in rat and 42 amino acids in mouse. All seven fish sequences encode a protein of 45 amino acids (Fig. 2). Some of the changes in the GRF sequence are due to use of different splice sites in intron 4, which interrupts GRF; this mechanism partially explains differences in the C-terminal portion of human and rat GRF (2). Likewise, the chicken gene produces two peptides of different lengths, 43 and 46 amino acids, due to alternative processing at the intron 4-exon 5 boundary. Only the human, pig, cow, goat, and sheep GRF peptides are amidated at the carboxy terminus.

The most ancient GRFs isolated to date are those of the protochordate in which a GRF-like peptide is shorter and encoded on a single exon. The tunicate cDNAs that encode PACAP-1 and -2 also encode the tunicate GRF-like peptides (termed tunicate GRF hereafter) (23). Similar to tunicate PACAP-1 and -2, the tunicate GRFs are only 27 amino acids in length; this length is much shorter than all other known GRFs, but is similar to the 29-amino acid bioactive core of vertebrate GRFs (140, 141). The tunicate 27-amino acid GRF-1 and GRF-2, as deduced from the cDNAs, are the result of posttranslational processing of the precursor proteins at the dibasic site between the tunicate GRF and PACAP. Longer forms of GRF cannot be translated because the nucleotides that encode PACAP follow immediately after the GRF sequence. The function of tunicate GRF is not known and indeed, GH has not been identified in tunicates to date.

C. GRF/PACAP gene—a duplication late in evolution
It was suggested previously that the DNA encoding GRF evolved into a distinct gene about 750 million years ago (142). This idea was based on protein biochemistry, peptide sequence comparisons, and the demonstration that the mammalian genome encodes GRF and PACAP on separate genes and chromosomes. However, recent evolutionary studies have shown that both peptides are encoded in the same gene in birds (15), fish (18, 19, 20), and tunicates (23). Therefore, the evidence that GRF and PACAP are encoded in one gene supports the hypothesis that a separate GRF gene evolved only about 250 million years ago. The current theory is that the GRF-PACAP gene duplication was followed by nucleotide substitutions giving rise to different genes for GRF and PACAP. This is thought to have occurred after separation of the avian and mammalian lineages. Neither the exact evolutionary time nor taxa are known for the GRF-PACAP gene duplication, but further studies on extant species will help to delineate when the gene duplication occurred (see Section III.B, paragraph 4).

The function of the two peptides encoded in one gene is interesting. In fish, GH is released by both PACAP and GRF, provided the correct fish form of the peptide is used. In salmon and goldfish, synthetic PACAP (salmon, ovine, frog, or zebrafish forms) and GRF (salmon and carp forms) were compared for their ability to release GH from pituitary cells in vitro. In salmon, GRF-45 (the salmon form) released GH at concentrations between 10⁻¹² and 10⁻⁸ M, and PACAP-38 (salmon form) was effective at 10⁻¹⁰ to 10⁻⁷ M, but the dose-response curve was more consistent for PACAP (19). In goldfish, several peptides released GH at 10⁻¹⁰ to 10⁻⁶ M; the potency (derived from ED50) was carp GRF-45 > zebrafish PACAP-38 > ovine PACAP-38 > ovine PACAP-27 > zebrafish PACAP-27 = frog PACAP-38 >> mammalian VIP (143). In cultured pituitary cells of eels, human PACAP-27 and PACAP-38 were effective in releasing GH, but human GRF was not effective (144), as might be expected from the rapid change in GRF structure in evolution. This evidence suggests that PACAP is a hypophysial factor in nonmammalian vertebrates. Even more intriguing is the observation that alternative splicing, which occurs in brain tissue, results in a short transcript encoding PACAP without GRF, in addition to a full-length transcript with GRF and PACAP encoded. The short transcript implies that the brain can selectively produce more PACAP than GRF, but the reason remains unknown. In mammals, GRF or PACAP can release GH, but there is debate as to whether PACAP acts directly on somatotropes (Ref. 145 and see Section V.E.1).


    III. Superfamily Evolution by Exon and Gene Duplication
A. Six single genes in the human superfamily
The organization of the known genes for the glucagon superfamily in humans is compared in Fig. 4. The six genes encode at least nine bioactive peptides (PRP has not been shown to be bioactive). The VIP and glucagon genes encode two or three bioactive peptides, whereas the PACAP, GRF, secretin, and GIP genes encode only a single bioactive peptide. Nonetheless, the structural organization of all the genes is similar, and the precursor molecules are even more striking in their similarity. Each gene encodes a signal peptide, an N-terminal peptide (also named a cryptic peptide), one to three bioactive peptides, and a C-terminal peptide. Each of the six family members in humans is a single copy gene. Specific receptors of the seven-transmembrane type have been identified for each of the bioactive peptides except PHM in the superfamily (Section IV).



 
Figure 4. The human PACAP/glucagon superfamily genes. Six genes that encode nine bioactive hormones (PRP is not bioactive) are shown; the top five genes were isolated and sequenced from humans, whereas the secretin gene is from rat. The exons are shown as boxes and the introns as lines. The lengths of the exons and introns are not proportional so that the exons can be aligned between genes. Exon 1 is an open box at the 5'-end of the gene and contains the 5'-UTR except for the secretin gene. The open box at the 3'-end of the gene is the 3'-UTR. Abbreviations are: SP, signal peptide; PRP, PACAP-related peptide; PHI, peptide histidine-isoleucine; GLUC, glucagon; SECR, secretin.

 
The PACAP and VIP genes are best discussed together because the structures for the peptide, precursor, and gene are very similar. The human VIP peptide is 70% identical to PACAP-27 (Fig. 3), and the organization of the precursor and gene compared with that of PACAP is likewise similar (Fig. 4). The structure of the human PACAP gene was reported by Hosoya and colleagues (8) in 1992. The complete human VIP gene sequence has been reported by four different groups (41, 42, 43, 44). Although the VIP gene has seven exons compared with PACAP’s five exons, the two genes have a marked similarity. Only the VIP and PACAP genes have an additional exon between the signal peptide and one of the biopeptide genes (Fig. 4). The function of this cryptic or spacer protein is not clear, but the extended protein may be useful for folding of the precursor before cleavage. Exons 4 and 5 in both genes are thought to result from exon duplications (Section III.E).

It is the peptides (PACAP or VIP) encoded on exon 5 that have well documented and potent effects. Also, exon 4 in the VIP gene encodes PHM, which is reported to have fewer, but similar, functions to VIP in regard to the gut and to pituitary and pancreatic hormone release (see Ref. 67). In contrast, the nucleotides of exon 4 in the PACAP gene of mammals encode PACAP-related peptide (PRP), which is not a functional peptide in studies done to date (146), most likely due to the substitution of Asp for His or Tyr in the N-terminal position (147). However, in animals that evolved before mammals, the PACAP gene does encode a functional peptide (GRF) on exon 4. The appearance of a nonfunctional peptide on exon 4 may be the result of a gene duplication in the early mammalian lineage (15) resulting in separate genes for PACAP and GRF. Finally, the VIP gene encodes on exons 5 and 6 a C-terminal 15-amino acid peptide with no known function and encodes on exons 6 and 7 the 3' untranslated region. These double exons beyond the exons encoding bioactive peptides are also present in the GIP and secretin genes.

The glucagon, GRF, GIP, and secretin genes form a separate group of genes compared with PACAP and VIP genes because they lack an exon between the exons encoding the signal peptide and bioactive peptide (Fig. 4). However, it seems clear that all six genes are in the same line of evolution because each of the four genes lacks exactly one exon. The N-terminal peptides encoded in the four genes result from a small region at the 3'-end of the exon with the signal peptide and the 5'-end of the exon with the bioactive peptide (Fig. 4). Hence, each of the six precursors has a cryptic or N-terminal protein 5' to the bioactive peptide. It is an open question whether the PACAP and VIP genes gained, or the other genes lost, an exon. It is also noteworthy in the GRF, GIP, and secretin genes that only one bioactive peptide is encoded. A C-terminal peptide is encoded on the exon adjacent to the bioactive peptide exon(s), but a function is not known.

The glucagon gene is unique in that it encodes three bioactive peptides: glucagon, GLP-1, and GLP-2 (Figs. 3 and 4). Each peptide is encoded on a separate exon, which suggests that there were exon duplications (Section III.E). The posttranslational processing of the precursor varies in specific tissues and species and, in addition, alternative splicing of the transcript is tissue specific in vertebrates other than mammals.

The GRF gene is best compared with the PACAP gene as they are thought to be duplicates from early mammals. Although the GRF and PACAP genes both have five exons, the PACAP gene has gained exon 3, which codes for a longer N-terminal peptide. The GRF gene instead has an additional exon at the 3'-end encoding the 3'-untranslated region (UTR). The advantage of either exon is not clear but may be related to folding of the precursor or stability of the mRNA. The comparison between the GRF and PACAP genes is important as both GRF and PACAP appear to be encoded on a single gene in birds, fish, and tunicates (Fig. 4). The GRF gene can also be viewed as having six exons because exon 1 in brain transcripts is distinct from exon 1 in the placental or testicular transcript (148).

The GIP gene has six exons with the GIP hormone encoded on exons 3 and 4 (Fig. 4) (129, 130, 136). Thus, the biologically active core (30 amino acids) is encoded on exon 3 (35 aa) with the remaining 7 aa encoded on exon 4. The N-terminal peptide spans exons 2 and 3; the C-terminal peptide spans exons 4–6.

In terms of evolution, it is interesting that the two gut hormones, GIP and GLP-1, in humans have 41% identity of amino acids between the two peptides and between the receptors. This percent identity suggests the peptides have been separated for a long time. However, the origin of GIP has been traced only to birds to date (Fig. 3), whereas GLP-1 is present even in jawless fish. The separation of the two genes is difficult to deduce until fish have been carefully analyzed for the presence of GIP. Within mammals, GIP is highly conserved in that rat and human GIPs are 95% identical in amino acids (137).

Finally, the secretin gene is the shortest gene of the superfamily in the sense that only four exons are present (Fig. 4). The first exon in the other glucagon family genes is always one that contains the 5'-untranslated region exclusively. The secretin gene lacks this exon, which means the gene lacks an intron between the transcriptional and translational start sites. Compared with the vertebrate PACAP and VIP genes, the secretin gene also lacks the exon that occurs between the exon encoding the signal peptide and the exon encoding secretin (4, 5). The secretin hormone is encoded on the second exon, whereas an N-terminal peptide spans exons 1 and 2, and a C-terminal peptide is encoded on exons 2, 3, and 4 (Fig. 4). The C-terminal peptide can be 44 or 72 amino acids depending on whether alternative splicing removes exon 3 (Section III.H). The splicing of this exon leaves only three exons and leads to speculation that the ancestral secretin gene was three exons. Meanwhile, neither the N-terminal peptide (10 amino acids) nor the C-terminal peptide is known to have a function unless it is to maintain a folding pattern that allows cleavage of secretin from the precursor.

One of the earlier rules for the superfamily was that position 6 is Phe. However, evolutionary studies show that rabbit secretin has Leu6, chinook salmon and catfish GRF have Leu6, and several GLPs have Tyr6 (Fig. 3). These are not radical substitutions (139). In PRPs, Ile6 is a radical substitution, but the peptide does not appear to be functional.

A comparison of the promoter regions for the six superfamily genes is of considerable interest to determine how regulation of gene transcription has evolved. These studies are in their infancy, although the promoter region responsible for tissue-specific expression has been analyzed in the VIP and glucagon genes.

For the VIP gene, the transcription initiation site has been determined (42). In the promoter region of the VIP gene, there are three TATAAA boxes (44, 149), AP2 consensus sites (149), a cAMP-responsive element (CRE) within 100 bases from the start site (44, 149, 150), a cytokine-responsive area at about 1 kb upstream (see Ref. 151), and an area containing repeated DNA sequences, which are not Alu sequences (43). A tissue-specific element is 4–5 kb upstream from the transcription start site (150). The importance of the tissue-specific element was shown indirectly by inappropriate expression in tissues when only 2 kb of the VIP promoter was present (151).

For the PACAP gene, analysis of promoter locations supports the idea that the mammalian GRF and PACAP genes are products of a duplication. Thus, in rat the PACAP and GRF genes each contain a testis-specific promoter and first exon that are 13.5 kb (PACAP) or 10.7 kb (GRF) upstream from the transcription start sites used in the hypothalamus (148, 152, 153). In addition, there are several transcription start sites for PACAP in the hypothalamus in rat (153) and mouse (14), but all are part of the proximal promoter region. In the human PACAP gene, the transcription start site was not definitively identified (8), but at least one start site was identified for the chicken and salmon genes (15, 19). For promoter analysis, specific binding sites have not been tested, but consensus sequences that resemble response elements have been noted. For example, several potential Sox5/SRY sites were noted in the upstream testis-specific PACAP promoter (153).

The glucagon gene also contains a TATAAA sequence (82, 154) in the 5'-flanking region. The regulation of the glucagon gene results in tissue-specific expression in the pancreas, intestine, and brain (see Ref. 155 for a recent review). In experiments in which the glucagon promoter was fused to the coding sequences for SV40 large T antigen, only 850 bp of proximal promoter were sufficient for expression in the pancreas and brain (156), whereas 2.0 kb were needed for additional expression in the intestine (157). However, a fusion gene with 1.3 kb of the glucagon promoter and the coding region of luciferase resulted in expression in the intestine (158, 159). In more detailed experiments using a variety of glucagon promoter segments fused to the coding sequence of luciferase, it was shown that the promoter region between -1.3 and -2.2 kb affected expression in both the islet and intestinal cells, but the regions that enhanced or suppressed expression in the two tissues were not always the same. Other tissue-specific elements have been identified (see Ref. 160), including four DNA control elements (G1 to G4) within the first 300 bp of the rat glucagon promoter (161). A number of responsive elements that bind transcription factors in the glucagon promoter have been studied: the G1 element includes nucleotides between -60 and -118 that bind proteins restricting expression of glucagon to the α-cells in the islets (162); the G1 and G3 elements bind the Pax-6/Cdx-2/3 heterodimer for gene activation (163); a G2 element from -165 to -200 is a calcium response element that binds hepatic nuclear factor (HNF)-3β (winged-helix family) and NFATp (164). Also, G2 binds an Ets-like protein (164) and HNF-3α (165); the G3 element (-234 to -274) includes an insulin-responsive element that inhibits glucagon gene transcription (166); the G3 element binds a PISCES protein and a winged helix protein to activate transcription (167); an E box in the G4 region binds a heterodimer E47/BETA2 that transactivates glucagon gene transcription unless E47 is overexpressed (161); a cAMP-responsive element (CRE) of eight nucleotides binds cAMP response element binding protein (CREB) (161); and TCATT motifs that are adjacent to the CRE sites bind protein factors that modulate CREB (168). Further proof that HNF-3α is an important activator of transcription for the glucagon gene is shown in that loss of HNF-3α by a null mutation resulted in hypoglycemic mice and a marked reduction in plasma glucagon (165).
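The element coordinates listed above lend themselves to a small lookup table. The Python sketch below is illustrative only: the names `GLUCAGON_PROMOTER_ELEMENTS` and `element_at` are hypothetical, the coordinates are those quoted in the text for the rat glucagon promoter, and G4 is left out because no coordinates are given for it.

```python
from typing import Optional

# Positions are in bp relative to the transcription start site, as quoted in the text.
# G4 is omitted because the text does not give its coordinates.
GLUCAGON_PROMOTER_ELEMENTS = {
    "G1": (-118, -60),
    "G2": (-200, -165),
    "G3": (-274, -234),
}


def element_at(position: int) -> Optional[str]:
    """Return the name of the control element covering a promoter position, if any."""
    for name, (start, end) in GLUCAGON_PROMOTER_ELEMENTS.items():
        if start <= position <= end:
            return name
    return None


for pos in (-100, -180, -250, -140):
    print(pos, element_at(pos))
```

A position that falls between the listed intervals, such as -140, returns `None`, which reflects only the gaps in the quoted coordinates and not a claim that the region is free of regulatory sites.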

For the GRF gene, regulation of transcription has been difficult to study because suitable cultured neurons are lacking. Lack of a consensus TATA box in the promoter is thought to be the basis for multiple transcription initiation sites. One important factor that controls the expression of GRF is GH: transcripts of GRF mRNA increase if GH is eliminated by removal of the pituitary, and GRF mRNA transcripts decrease if GH levels are high as in transgenic mice that overexpress GH (169). Specific analyses of response elements in the promoter of the GRF gene have not been reported, although it is known that an upstream (10 kb) promoter is used as the transcription initiation site in placenta compared with the one used in the brain, resulting in different exon 1 sequences for the 5'-UTR (170, 171). Also, like PACAP, the GRF gene has a testis-specific promoter that is 10.7 kb upstream from the promoter used in the hypothalamus (148). Again, in the testis transcripts, exon 1 is not the same as the one used in the hypothalamus.

For the GIP gene, two or possibly three transcription start sites were found, but all were within three bases of one another (130). In the flanking region, TATA and CAAT motifs were found, as were several consensus sequences of Sp1, AP-1, AP-2, and CRE within 400 bp of the start site. An interesting observation by Higashimoto and Liddle (136) about the GIP gene was the presence in intron 1 of putative TATA and CCAAT boxes and several translational cis elements. Indeed, they found minor amounts of a short transcript that starts from exon 2 during development in the rat intestine. The use of exon 2 for the initiation of both transcription and translation is of interest because the tunicate PACAP genes and vertebrate secretin genes lack the usual exon 1 found in other superfamily members and also use a single exon as the initiation site for both transcription and translation.

The secretin gene promoter contains a TATA box, an E box, and consensus sequences for transcription factors AP-2 and Sp1 (4, 172). A promoter region of only 174 bp of flanking sequence was sufficient for maximal expression of secretin mRNA in HIT cells (173). Within this region is an E box (CAGCTG motif), which is similar to the core of enhancer elements found in the rat insulin I gene; this motif may explain expression of secretin mRNA in the developing pancreas. Mutations in the CAGCTG region decreased transcription to less than 20% (173). Evidence suggests that a transcription factor BETA2 (a basic helix-loop-helix factor) forms a heterodimer with either E12 or E47 in activating transcription of the secretin, glucagon, or insulin gene by binding to their E boxes (174). Secretin is not expressed in the S cells in the gut if the BETA2 gene is disrupted (174). In addition, BETA2 has a coactivator p300 that enhances secretin gene transcription; BETA2 and p300 together block cell division in the S cells (175). In normal mice, it is not known why the secretin gene ceases to be expressed in the pancreatic β-cells in adults, but continues to be expressed in gut S cells. In summary, only a few transcription factors or response elements that are shared in the superfamily have been identified to date for the six genes.

B. Members of superfamily in many vertebrates
If the hypothesis is correct that the glucagon superfamily evolved from a single ancestral gene, then it is theoretically possible to identify each member of the family in extant vertebrates or invertebrates and to deduce information about the distinct origin of each hormone. Knowledge of the phylogenetic distribution of each glucagon family member has advanced rapidly due to chemical and molecular sequencing. Immunohistochemical identification has been helpful, but not definitive, due to cross-reactivity of antibodies.

In mammals, all nine bioactive hormones (PACAP, VIP, PHM, glucagon, GLP-1, GLP-2, GRF, GIP, and secretin) have been identified, and the structures are shown in Fig. 3, where most of the references for this section are listed. Current evidence suggests that the PACAP-related peptide (PRP) is not bioactive. Therefore, PRP will not be discussed further.

Although the PACAP family is highly conserved with identical sequences for both PACAP-27 and PACAP-38 in the four mammalian species studied to date, there is also high conservation of VIP in mammals where VIP-28 is identical in 10 mammalian species, but differs in guinea pig and opossum. For glucagon-29, eight mammalian species share the same sequence, but guinea pig, degu, and opossum each have a distinct sequence. In contrast to PACAP, VIP, and glucagon, there are four distinct sequences for GLP-1 in the four mammalian species studied. Also, four sequences are identified for GIP-42, one for each of the four species. PHM/PHI-27 has five distinct sequences for eight species. Considerable sequence variation is seen in secretin-27 in most mammals: six sequence variations are present in the eight species for which sequences are identified. At the other end of the conservation spectrum from PACAP is GRF, with rapid changes in sequence, as noted above in Section II.B. Seven distinct sequences have been identified for GRF (42–44 amino acids) in eight mammalian species. Thus, it is likely that the strongest evidence for phylogenetic ancestry for the molecules will come from PACAP, VIP, and glucagon. GRF and secretin vary so rapidly in sequence during the evolution of the mammals that ancestral tracking will require study of more species at closer intervals. Variation in structure can occur rapidly in the C-terminal portion of the peptide if the coding is interrupted by an intron as for GRF and GIP, but tracking of the ancestral N-terminal portion of the hormone should be possible.

In birds, reptiles, or amphibians, it is difficult to generalize about the PACAP/glucagon superfamily ancestry as very few sequences are known for these three classes of vertebrates (Fig. 3). Chicken and turkey have been used exclusively among birds for sequence identification with two exceptions in which duck and ostrich were studied for glucagon. GIP has not been identified in any bird, and only one species has been used to identify several of the hormones: PACAP, GLP-1, secretin, and GRF. Two distinct sequences in birds were identified for VIP (two species), PHI (two species), and glucagon (four species). For each avian hormone, the sequence is different compared with the mammalian form, but the differences are small enough to clearly identify the family lineage. The organization of the avian genes for family members has been very useful in understanding the evolution of the superfamily. The avian gene that encodes both PACAP and GRF (Fig. 5) supports the idea that a gene duplication occurred in early mammals leading to separate genes for PACAP and GRF (Fig. 6A). In contrast, the avian gene for glucagon/GLP-1/GLP-2 has the same organization as the mammalian gene, which is additional proof that the organization of this gene has remained stable in the vertebrates, including fish.



 
Figure 5. Superfamily genes from tunicate to humans. The genes that have been isolated and sequenced from invertebrates (tunicate, a protochordate) and vertebrates are shown for the PACAP/glucagon superfamily members. The exons are shown as boxes and the introns as lines. The lengths of the exons and introns are not proportional so that the exons can be aligned between genes. Exon 1 is an open box at the 5'-end of the gene and contains the 5'-UTR except for the secretin gene. The open box at the 3'-end of the gene is the 3'-UTR. Abbreviations are: SP, signal peptide; PRP, PACAP-related peptide; GLUC, glucagon; SECR, secretin.

 


 
Figure 6. A, Hypothetical scheme for duplication of the GRF/PACAP gene. The protochordate (tunicate) GRF/PACAP gene is shown on the top line. This gene is the most ancient example of the GRF/PACAP gene reported to date and it encodes both PACAP-27 and GRF-27. Elongated forms of PACAP and GRF are also encoded in a single gene in fish, birds, and presumably in early mammals. However, GRF and PACAP are each encoded in a separate gene in the present-day mammals that have been studied to date. This evidence suggests that a gene duplication occurred in the GRF/PACAP gene in early-evolving mammals. Exons are shown as boxes and introns as lines. Abbreviations are: 5'U, 5'-UTR; SP, signal peptide; cryptic, a region encoding an N-terminal peptide without known function; PRP, PACAP-related peptide; 3', 3'-UTR. B, Proposed evolution of the PACAP/glucagon superfamily. The larger open boxes represent a gene, whereas the smaller boxes inside represent exons that encode hormones. The colored boxes are putative new bioactive peptides. For example, the duplication of the GRF-PACAP gene in early mammals is thought to have resulted in two genes, one of which encodes PACAP and the other GRF; the second exon is no longer active at some point after duplication as shown by an X. The first event in the evolution of the superfamily is hypothesized to be an exon duplication followed by one or more complete gene duplications. The transition between invertebrates and vertebrates has resulted in nucleotide base substitutions and extension of the C-terminal domains in many peptides. The peptides found in mammals are shown along the bottom for the six known genes.

 
In reptiles, only glucagon and VIP primary structures have been reported for the full-length peptides, although partial sequences for PACAP, PHI, and GLP-1 are long enough for identification (16). Glucagon in turtles and alligators is identical in structure to that in duck and ostrich, but not to chicken, turkey, or mammals. VIP in a lizard (gila monster) is 75% identical to the VIP amino acids in chicken/turkey.

An interesting variation in the PACAP/glucagon superfamily occurs in reptiles in that the gila monster has a novel peptide family named exendin (Fig. 3). The peptides are clearly related in structure to the superfamily, but the peptides are found exclusively in the venom made in the salivary gland of the gila monster. Four variant peptides of 35–39 amino acids have been identified and named exendin-1 (helospectin), exendin-2 (helodermin), exendin-3, and exendin-4. In peptide structure exendin-2 is related to human PACAP with 53% sequence identity, whereas exendin-4 is related to human GLP-1, also with 53% identity. Exendin-2 interacts with the VIP and secretin receptors resulting in an increase in cAMP in rats (73). In contrast, exendin-4 binds to the GLP-1 receptor, increases cAMP, stimulates glucose-dependent insulin secretion if injected, and increases insulin gene expression in cultured islet cells (76). However, extensive studies suggest that exendins do not have mammalian homologs (16). Rather, evidence suggests that one or two gene duplications, possibly within the taxon of lizards including the gila monster, have resulted in separate genes for PACAP, glucagon, exendin-2, and exendin-4. The cDNAs and distribution pattern of the latter two are much closer to each other than to the cDNAs for PACAP or glucagon, which have been partially sequenced for the gila monster (16). The origin of the venom peptides will be easier to discern when the gene for the gila monster PACAP can be compared with that for exendin-2 and the gene for glucagon can be compared with that for exendin-4. Alternative splicing, which is common within the superfamily, may explain the differences in cDNA organization.

In amphibians only five peptides have been identified. They are PACAP, VIP, glucagon, GLP-1, and GLP-2.

In fish, six out of nine families (PACAP, VIP, GRF, glucagon, GLP-1, GLP-2) have been found. Only secretin, GIP, and PHI have not been isolated and sequenced from any fish species. Secretin bioactivity has been reported from intestinal extracts of bony and cartilaginous fish, but the structure of secretin is not known (176). PHI is assumed to be present as it occurs in tandem with VIP on one gene in reptiles (16), birds, and mammals. A fish VIP cDNA or gene has not been isolated to date but is likely to be identified soon, as VIP peptides have been sequenced from dogfish shark, bowfin, trout, and cod (Fig. 3).

Of the superfamily peptides identified in fish, the most conserved compared with human is PACAP followed by VIP, and then glucagon. The least conserved are the GLP and GRF families. PACAP-27 is identical in salmon and catfish compared with human PACAP-27. There is more variation with PACAP-38 between fish and human. Of the hormones identified (Fig. 3), the amino acids of fish peptides compared with human peptides (100%) have an identity of 82% for VIP, 69–86% for glucagon, 35–68% for GLP-1, 40% for GLP-2, and 30–45% for GRF. Hence, fish PACAP-27 has the highest sequence identity with the human form; GRF has the least identity, but the percent identity for GRF would be higher if only the biologically active cores were considered. Secretin has not been identified by structure except in mammals and birds, but intestinal extracts of reptiles, amphibians, bony fish, and cartilaginous fish have been shown to have secretin bioactivity (176).

In jawless fish, which to date includes only studies of lamprey, the peptide sequences for glucagon and GLP-1 have been reported (Fig. 3) and are 72% and 48% identical, respectively, to their counterparts in humans. Recently, two cDNAs encoding glucagon and GLPs were isolated from sea lamprey (Petromyzon marinus) intestine (Ref. 116 and Fig. 3). One precursor encoded glucagon and GLP-1, but the GLP-2-like molecule was lacking five critical amino acids near the N terminus, making functionality unlikely. The other precursor encoded glucagon II that was 72% identical to glucagon I and did not encode GLP-1, but did encode a GLP-2-like molecule (116).

Proof that the PACAP/glucagon superfamily members were present in the earliest vertebrates, the jawless fish, has been established only for the glucagon/GLP-1/GLP-2 trio. However, the superfamily peptides, with the exception of secretin and GIP, have been shown to be present in representatives of most other vertebrate classes. The origin of these two families will remain unknown until an exhaustive search has been done for the peptides in nonmammalian vertebrates. The rapid evolution of this superfamily includes gene duplications and alternative splicing, leaving open the possibility that the origins of secretin and GIP were within the vertebrates or preceded the vertebrates. The latter possibility is favored by the fact that the sequence of these two peptides in humans is quite distinct from the highly conserved PACAP, suggesting a long separation time.

C. Key to origin of superfamily in protochordates
The protochordates are the major group from which the vertebrates are thought to have arisen. Thus, they are a logical group in which to investigate the origin of the PACAP/glucagon superfamily members in invertebrates. The Garstang theory states that the key group of protochordates from which the vertebrates arose was the tunicates (Urochordata). One possibility is that the tunicate’s mobile larvae, which have a fish-like appearance, went through sexual maturation without metamorphosis. This process could result in a sexually mature adult without transformation into the sessile adult that normally occurs in tunicates (although one group of urochordates, the larvaceans, remain as mobile adults). This ancestral tunicate may have given rise to the amphioxus, another protochordate, and to the vertebrates. Other scenarios for the origin of the vertebrates vary as to the taxon for the ancestral group, but tunicates still remain closely related to vertebrate ancestors (177).

It is now clear that several of the superfamily members are present in tunicates (Figs. 2, 3, and 5). PACAP-27 was identified in a cDNA isolated from a tunicate; the percent identity of amino acids was 96% with human PACAP-27 (23). An extended peptide (PACAP-38) was not encoded, suggesting that PACAP-27 is the original form of PACAP in evolution. A second peptide was encoded on the same gene with PACAP-27. We called it GRF-27 because it was on the exon that was 5' to the PACAP exon as in the vertebrates. Again, this may be the original form of this peptide as only the biologically active core is encoded and not the extended peptide found in vertebrates. However, the identity of the GRF-27 peptide was 59% with human GRF and 61% with human glucagon. This closeness in identity makes the point that the PACAP/glucagon peptides may be closer in sequence as the ancestry is traced in animals that evolved earlier than vertebrates.

A further interesting twist to the tunicate PACAP story is that a second cDNA was isolated from the same tunicate library and was shown to encode two peptides also. One peptide was 85% identical to the deduced amino acids of PACAP-27 in humans and the other peptide was 59% identical to human GRF-27. The question is whether this cDNA represents a duplication of the PACAP gene in the tunicates in the 700 million years since they separated from the stem line leading to vertebrates or whether the duplication occurred in the stem line that may have continued into vertebrates. The second form of PACAP-27 is 67% identical to human VIP and could be a precursor. Further studies in animals whose ancestors evolved before the protochordates may help to distinguish the two possibilities.

The presence of PACAP-27 in the tunicate establishes the origin of the PACAP/glucagon superfamily in the invertebrates. It also adds weight to the idea that PACAP was the original peptide in this family and the one most tightly conserved. The isolation of glucagon from a tunicate will be an important step in understanding the origin of the superfamily.

D. Extension of PACAP ancestry uncertain in insects
In Drosophila a neuropeptide gene was identified that has some identity to PACAP (178). This gene, named amnesiac, encodes a signal peptide followed by several possible peptides depending on the cleavage sites. One of the peptides deduced from the gene had 10% identity with human PACAP-38 or 18% with PACAP-27 (see sequences below).

This identity is too low to claim that amnesiac is homologous to PACAP in tunicates or vertebrates. However, the authors show that an inserted space in PACAP after both amino acids 23 and 27 would increase the identity to 21% for PACAP-38 and 30% for PACAP-27. If amino acid similarity is used for the calculation, then the relationship is higher. However, the amnesiac peptide has four cysteines, which are usually highly conserved, and PACAP-27 and PACAP-38 do not have any cysteines. Also, it is not clear why only PACAP-38 and not PACAP-27 has electrophysiological effects and why cross-reactivity in Drosophila occurs only with antisera raised against PACAP-38 and not PACAP-27 (179) because the Drosophila peptide is only 32 amino acids and identity to PACAP is only within the first 27 amino acids. Therefore, with respect to PACAP homology, amnesiac will have to remain in a gray area until other invertebrate PACAP-like peptides are identified.

Another intriguing story is that the sand fly (Lutzomyia longipalpis), which obtains blood from vertebrates, has a peptide (maxadilan) that activates the PAC1 receptor resulting in vasodilation (180). Like the fruit fly’s amnesiac peptide, maxadilan has a low identity with hPACAP-38 (16%) or hPACAP-27 (15%) (see sequences at top of page 631).

Again, the low sequence identity and the presence of four cysteines in maxadilan, compared with no cysteines in PACAP, suggest that a decision about homology is not possible at this time. The authors (180) suggest maxadilan is an example of functional convergent evolution.

E. Exon duplication as the original step in ancestry
The suggestion that each peptide in the PACAP/glucagon superfamily is encoded on a single exon is based on three facts. First, the four peptides isolated from tunicates, the most ancestral forms to date, are each encoded on one exon (Fig. 5). Second, most of the superfamily peptides isolated from vertebrates, including humans, are encoded on one exon. Third, the two peptides (GRF and GIP) that are not contained on one exon do have their bioactive cores encoded on one exon. Tunicate GRF-like peptide is an example of a peptide that was originally only 27 amino acids as in tunicates and was extended later in the 3' direction.

Exon duplication of the ancestral peptides is the most logical explanation for some of the peptides present in mammals to date (Fig. 6B). Three of the six superfamily genes are examples of possible exon duplication: GRF/PACAP (in nonmammals), PHI/VIP, and glucagon/GLP-1/GLP-2. The gene for which we have the longest evolutionary story is PACAP. The two PACAP genes can be clearly identified in tunicates because of the high sequence identity (96% or 85%) of the deduced PACAP-27 peptides with human PACAP-27. The argument that an exon duplication is responsible for PACAP and the GRF-like exons is that the sequence of the two exons has remained close in the tunicates. Only a 15% decrease is apparent in recently evolved mammals if the first 27 amino acids are compared as explained below and in Fig. 8. In tunicates the percent identity between PACAP-27 on one exon and GRF-like peptide-27 on the adjacent exon is 44% or 48% (Fig. 7). After the transition between invertebrates (tunicates) and vertebrates, the divergence between the exons encoding PACAP and the associated peptides (a GRF-like peptide) continued to increase. In fish such as sturgeon that have survived from some of the earliest bony fish, the percent identity between the peptides on exons 4 and 5 is almost as high (41%) as in the tunicates (D. W. Lescheid and N. M. Sherwood, unpublished). In fish, such as salmon and catfish, in which the ancestor evolved more recently than sturgeon, the identity is reduced (26–33%); the identity between exons for chicken is similar (33%). But it is in mammals such as humans where a drastic reduction in peptide identity (7%) occurs between the peptides on exons 4 and 5 (Fig. 7A). One possibility is that there was a complete gene duplication in early-evolving mammals resulting in one gene in which PACAP is conserved and another gene in which GRF is conserved (Fig. 7B).
One piece of evidence that supports this hypothesis is that human PACAP-27 has only 7% identity with the peptide (PRP) on the adjacent exon and human GRF-27 has only 19% identity with the adjacent peptide (a cryptic peptide without known function), but if PACAP-27 on one gene is compared with GRF-27 on the other gene, the identity leaps to 33%, which is the same as that in chicken and salmon.
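The comparison in this paragraph can be laid out as a small check. The sketch below is only an illustration: it stores the percent identities quoted in the text (ranges as low and high values) and asks whether each human value falls inside the range seen for the single GRF-PACAP gene of fish and birds. The dictionary names and helper functions are assumptions made for the example, not part of the original analysis.

```python
# Percent identity between PACAP 1-27 and the peptide on the adjacent exon,
# as quoted in the text. Ranges are stored as (low, high).
WITHIN_GENE = {
    "tunicate": (44, 48),
    "sturgeon": (41, 41),
    "salmon/catfish": (26, 33),
    "chicken": (33, 33),
}

# Human values: peptides on adjacent exons of the same gene.
HUMAN_WITHIN_GENE = {
    "PACAP-27 vs PRP": 7,
    "GRF-27 vs cryptic peptide": 19,
}

# Human value: PACAP-27 on one gene compared with GRF-27 on the other gene.
HUMAN_CROSS_GENE = 33


def nonmammal_vertebrate_range(table):
    """Lowest and highest within-gene identity among the vertebrate entries."""
    vertebrates = {k: v for k, v in table.items() if k != "tunicate"}
    lows = [low for low, _ in vertebrates.values()]
    highs = [high for _, high in vertebrates.values()]
    return min(lows), max(highs)


def fits_range(value, bounds):
    """True when value lies inside the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high


if __name__ == "__main__":
    bounds = nonmammal_vertebrate_range(WITHIN_GENE)
    print("fish/bird within-gene range:", bounds)
    for pair, value in HUMAN_WITHIN_GENE.items():
        status = "inside" if fits_range(value, bounds) else "outside"
        print(f"human {pair}: {value}% is {status} the range")
    status = "inside" if fits_range(HUMAN_CROSS_GENE, bounds) else "outside"
    print(f"human PACAP-27 vs GRF-27 (separate genes): {HUMAN_CROSS_GENE}% is {status} the range")
```

Only the cross-gene comparison in humans lands inside the fish and bird range, which is the pattern the gene duplication hypothesis predicts.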



 
Figure 8. Exon skipping in the PACAP/glucagon superfamily. Genes from four different families have been shown to use alternative splicing to remove one exon. In the PACAP-GRF gene in fish and birds, the exon encoding the bioactive core of GRF is removed from some transcripts in the brain. In the VIP-PHI gene in birds, the exon encoding PHI is removed from some transcripts; VIP cDNA and genes have not been isolated from fish. In the glucagon gene, the exon encoding GLP-2 is removed from the transcript only in the pancreas, but not in the intestine in birds and fish. In the secretin gene, the C-terminal peptide of unknown function is spliced from some transcripts. In all four genes, it appears that the coding regions for the more potent peptides (PACAP, VIP, glucagon, GLP-1, and secretin) are not spliced out; only the exons containing the less effective peptides are spliced out (skipped).

 


 
Figure 7. Percent identity between the amino acids of PACAP-27 and the peptide that is encoded on the same gene with PACAP. A high percent identity (44–48%) between PACAP-27 and the GRF-like peptide-27 is observed for the tunicate peptides, suggesting that the exon duplication is ancient even in tunicates. However, the identity is closer than in the vertebrate exons for this gene. In the GRF-PACAP gene in fish and birds, the percent identity between the amino acids of PACAP 1–27 and GRF 1–27 ranges from 26–41%. In humans the identity is only 7% or 19% on the two genes (bottom panel). The explanation may be that after the gene duplicated, GRF was conserved on one gene and PACAP was conserved on the other gene, but the companion exons (PRP on the PACAP gene and the C-terminal peptide on the GRF gene) were not conserved. However, the bottom panel shows that if PACAP-27 on one gene and GRF-27 on the other gene are compared, there is 33% conservation of amino acids, similar to the identity in fish and birds.

 
This consideration of early exon duplication also leads to two other conclusions. First, the PACAP exon duplication is likely to have occurred much earlier in evolution than in the tunicates. Second, the exon duplication occurred before the complete gene duplication in tunicates because the bioactive peptides are closer between genes (89%) than between exons (44–48%).
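The second conclusion rests on a simple ordering rule: of two duplication events, the one whose copies are now less similar is taken to be the older. The sketch below applies that rule to the tunicate values quoted here. It assumes that the peptide pairs diverge at roughly comparable rates, which is an assumption of this illustration and is not tested in the text; the labels and the function name are hypothetical.

```python
from typing import Dict

# Percent identities for tunicate peptides as quoted in the text.
# 44 is the lower bound of the 44-48% range between adjacent exons.
TUNICATE_IDENTITIES: Dict[str, float] = {
    "exon duplication (PACAP-27 vs GRF-like-27, same gene)": 44.0,
    "gene duplication (bioactive peptides, between the two genes)": 89.0,
}


def older_event(identities: Dict[str, float]) -> str:
    """Return the label of the event whose products share the lowest identity.

    Assumes comparable rates of change for all pairs, so that lower
    identity implies a longer time since the copies separated.
    """
    return min(identities, key=lambda label: identities[label])


if __name__ == "__main__":
    print("older event:", older_event(TUNICATE_IDENTITIES))
```

Using the upper bound of 48% for the exon comparison gives the same ordering, because both bounds are well below 89%.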

The evolution of the VIP gene has not been traced to invertebrates. In the human and chicken VIP genes, the identity between PHM/PHI (exon 4) and VIP (exon 5) is 44% at the amino acid level. Linder et al. (43) argue that the VIP/PHM exon duplication was an ancient event because there is low conservation between VIP and PHM and a lack of conservation in the flanking sequences around VIP/PHM. Yamagami et al. (44) contend that there is enough sequence similarity in the introns flanking the PHM and VIP exons to suggest that a broad region, and not just an exon, was duplicated. If true, the phrase intragenic duplication rather than exon duplication might be more accurate. Although the protein sequence of VIP is known for four fish, the cDNA or peptide structure of the peptide on the adjacent exon is not known. Hence, the origin of the VIP gene or the time of the exon duplication is not clear but appears to be more recent than the PACAP gene. Another possibility is that the VIP gene resulted from a duplication of the PACAP gene as there is 70% identity between the VIP and PACAP peptides in humans. In tunicates we did not find a VIP gene, but the two PACAP peptides in tunicates were both 67% identical to human VIP, which is similar to the identity between human PACAP and human VIP (70%).
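The percent identities used throughout this section are counts of matching residues at the same position divided by the number of positions compared. A minimal sketch of that calculation follows, without gaps, over the first 27 residues. The one-letter sequences for human PACAP-27 and human VIP are supplied here as assumptions for illustration and should be checked against Fig. 3 before reuse; the function name is hypothetical.

```python
from typing import Optional

# One-letter sequences, supplied as assumptions for this example.
HUMAN_PACAP27 = "HSDGIFTDSYSRYRKQMAVKKYLAAVL"   # 27 residues
HUMAN_VIP = "HSDAVFTDNYTRLRKQMAVKKYLNSILN"      # 28 residues


def percent_identity(a: str, b: str, length: Optional[int] = None) -> float:
    """Percent of identical positions over an ungapped window.

    The window is the first `length` residues of both sequences, or the
    full length of the shorter sequence when `length` is not given.
    """
    shortest = min(len(a), len(b))
    n = shortest if length is None else length
    if n <= 0 or n > shortest:
        raise ValueError("window must fit inside both sequences")
    matches = sum(x == y for x, y in zip(a[:n], b[:n]))
    return 100.0 * matches / n


if __name__ == "__main__":
    # 19 of the first 27 positions match with the sequences given above.
    print(round(percent_identity(HUMAN_PACAP27, HUMAN_VIP, 27)))  # prints 70
```

With these sequences the result agrees with the 70% identity between human VIP and human PACAP quoted in the text. Comparisons that allow inserted spaces, such as the one described for amnesiac, need an alignment step first and are not covered by this sketch.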

The glucagon gene is thought to have had an exon duplication twice, resulting in three exons encoding one peptide each (Fig. 6B). If all three peptides are compared with each other in the human, chicken, bullfrog, and trout genes, then glucagon tends to be closer to GLP-1 than GLP-2; the latter is closer to GLP-1. The percent identity among the three peptides in each of four species is 28–59% (Fig. 3). Hence, there is not enough information to identify the origin of glucagon. A recent paper shows that glucagon, GLP-1, and GLP-2 are encoded in lamprey, but the three intact peptides are not encoded in a single cDNA (116). The authors speculate that glucagon originated about 1 billion years ago, whereas GLP-1 and GLP-2 diverged from each other about 700 million years ago (116).

Isolation of invertebrate genes encoding glucagon and its associated peptides is needed to narrow the evolutionary period for the origin of the gene and exon duplication within the gene. In tunicates we have not found a glucagon gene, although we found two cDNAs encoding a peptide with 63–67% identity to human glucagon. However, we judged these peptides to be a GRF-like peptide because each was in the same gene with PACAP and was in the exon that is 5' to PACAP coding. The peptide identity of the tunicate peptides was 59% with human GRF. Functional studies may clarify the true identity of the tunicate peptides.

F. Gene duplication crucial in creating gene families
In evolution several gene duplications in the superfamily have occurred, resulting in six distinct genes in mammals as shown in Fig. 8. The two GRF-PACAP genes in tunicates could have arisen in the tunicate stem line or before tunicates separated from other invertebrates. In vertebrates it is clear that fish (at least bony fish) have two genes in which one encodes GRF/PACAP and the other glucagon, GLP-1, and GLP-2. It is likely that a PHI/VIP gene exists in fish, although the cDNA has not been isolated. GIP and secretin genes have not been isolated from fish. The most evolutionarily recent gene duplication that has been suggested for the superfamily is the duplication of the PACAP-GRF gene in early mammals (15, 19).

These several gene duplications have not resulted in gene clusters on a chromosome. Rather, each of the superfamily genes is located on a different chromosome, although the location of the secretin gene is not reported. In humans the glucagon gene is on chromosome 2 (2q36-q37); VIP gene is on chromosome 6 (6q16-q22d); GIP is on 17 (17q21.3-q22), PACAP is on 18 (18p11), and GRF is on chromosome 20 (20q) (from Refs. 8, 130). The separate location of the genes suggests that 1) the duplication events did not occur at the same time; 2) the genes have been evolving for a long time; and 3) the function of the genes is not interdependent. This is in contrast to Hox genes where the genes remained clustered in a continuous linear sequence after tandem duplication in invertebrates. Only complete genomic duplications in vertebrates resulted in copies of the clusters on more than one chromosome (181).

At least one gene duplication in the superfamily has resulted in the novel exendin family identified to date only in a few lizards (gila monster). There are at least two scenarios. The PACAP gene may have duplicated to produce exendin-2 (helodermin), and the glucagon gene may have duplicated to produce exendin-4, a scenario based on the sequence identity between PACAP and exendin-2 and between GLP-1 and exendin-4. The second possibility is that either PACAP or the glucagon gene duplicated and the new gene later duplicated again. This concept is based on the organization of the cDNAs of exendin-2 and -4, which are closer to each other than to the genes in the original family. Also, the expression pattern shows that the exendins are present only in the venom produced in the salivary glands (see Refs. 16, 76).

G. Peptide elongation important in some families
In the six superfamily genes of mammals, there are two peptides that are not encoded on a single exon. GRF and GIP extend onto the following exon. In addition, GRFs in vertebrates may be extended because the GRF-like peptides in tunicate are known to be 27 amino acids and to be encoded on one exon (Fig. 5). In vertebrates GRF is 42–46 amino acids, but one exon encodes the first 32 amino acids and the next exon encodes the remaining amino acids. Thus, the first exon encodes a complete biologically active core of 29 amino acids with full biological activity and about 50% potency compared with the full-length hGRF of 44 amino acids (182). The exon-intron organization in the middle of the human GRF gene shows splice sites of gt/ag at the start/end of each of the four introns. There is no question that the extension of GRF in mammals makes the peptide more effective for the release of GH; the fragment of GRF (1–29 amide) has full biological activity but reduced potency (25, 140, 182).

Because an ancestral GIP has not been detected, we do not know whether a shorter peptide on one exon evolved earlier. We do know that human GIP has 35 amino acids encoded on one exon and the remaining seven amino acids on the adjacent exon (130, 136) and that the bioactive portion of GIP is encoded on one exon. The evidence is that a synthetic peptide for GIP-30 is as potent as GIP-42 in stimulating insulin secretion (183, 184) and proinsulin gene transcription (184), although other functions, such as somatostatin release and inhibition of acid secretion from the stomach, are reduced (185). Further evidence that the shorter GIP-30 might be relevant is that a cleavage site exists at amino acid positions 31–33 (Gly-Lys-Lys), although GIP-30 has not yet been isolated from the blood. Also, transfected GIP receptors bound both GIP-42 and GIP-30 with high affinity in one study (186), but showed lower affinity for GIP-30 in another study in which GIP-30, nonetheless, was as effective as GIP-42 in stimulating cAMP accumulation (187). The extension of the GIP peptide onto another exon does not involve a novel mechanism, as the human GIP gene is very similar to those of other glucagon superfamily members in that gt/ag is present at the start/end of each of the five introns (130), including the intron that interrupts GIP coding.

H. Alternative splicing for expansion of individual families
There are several types of alternative splicing that have been documented in the PACAP/glucagon superfamily: 1) exon deletion (exon skipping); 2) intron sliding; and 3) splicing within exons encoding the 5'-UTR.

Exon skipping is a dominant characteristic of this superfamily (Fig. 8). PACAP/GRF mRNA is expressed as a long and short transcript in the brain in fish and chicken; the short mRNA is lacking exon 4 (15, 19). The functional significance is that exon 4 encodes the bioactive core of GRF. When PACAP and GRF are encoded together on one gene, the ratio of GRF to PACAP can be changed. The exon encoding PACAP is not deleted. Mammals achieve the same effect with more control due to the separation of PACAP and GRF on two genes. A similar effect is observed with the VIP gene in which the PHI exon is deleted, although this has only been reported for chicken (60) and turkey (61); the most abundant (98%) form of mRNA in several tissues is the shorter version with PHI deleted. There is no evidence of alternative splicing for the mammalian VIP transcript (43). In the glucagon gene, the GLP-2 exon is spliced in a tissue-specific manner. GLP-2 is not present in the pancreatic mRNA in salmon and chicken, but is present in the gut mRNA (91) where the GLP-2 peptide is known to affect intestinal growth (188, 189, 190). The secretin gene is another example of exon skipping in specific tissues and species (4, 5, 122), but the peptide encoded on exon 3, the deleted exon, does not have a known function.
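
The effect of exon skipping on a gene that encodes two peptides can be illustrated with a minimal sketch. The exon map below is a simplified assumption for illustration only: exon 4 is labelled as encoding the GRF core, as stated above, whereas the placement of PACAP on exon 5 and the labels for exons 1 to 3 are placeholders and are not taken from the cited studies. The function names are hypothetical.

```python
# Toy model of exon skipping in a PACAP/GRF transcript.
# Only "exon 4 encodes the GRF core" comes from the text; the PACAP exon
# number (5) and the labels for exons 1-3 are illustrative assumptions.
EXON_PRODUCTS = {
    1: None,      # assumed: 5'-UTR, no bioactive peptide
    2: None,      # assumed: signal peptide region
    3: None,      # assumed: no known bioactive peptide
    4: "GRF",     # bioactive core of GRF (from the text)
    5: "PACAP",   # assumed exon number for PACAP
}


def splice(exons, skipped=()):
    """Return the exons retained in the mature mRNA, in genomic order."""
    return [e for e in exons if e not in skipped]


def encoded_peptides(mrna_exons):
    """List the bioactive peptides encoded by the retained exons."""
    return [EXON_PRODUCTS[e] for e in mrna_exons if EXON_PRODUCTS[e]]


all_exons = sorted(EXON_PRODUCTS)
long_mrna = splice(all_exons)                # no exon skipped
short_mrna = splice(all_exons, skipped={4})  # exon 4 skipped

print(encoded_peptides(long_mrna))   # ['GRF', 'PACAP']
print(encoded_peptides(short_mrna))  # ['PACAP']
```

In this sketch, shifting the proportion of long to short transcripts changes the GRF-to-PACAP ratio while PACAP coding is left intact.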

Exon deletion could be an alternate way to regulate a gene that encodes more than one bioactive peptide. It seems less sophisticated than encoding each peptide on a separate gene with a distinct promoter. In the course of evolution, the mammals now have four superfamily genes (GRF, PACAP, secretin, GIP) with only one biopeptide. The proposed duplication of the GRF and PACAP gene in early mammals is a striking example of a process that could lead to finer regulation of two important peptides. It will be of great interest to determine whether secretin and GIP originated on genes in which solitary or multiple peptides were encoded.

Intron sliding occurs particularly in the GRF gene. In the human GRF gene, a downstream splice site results in a C-terminal peptide that is one amino acid shorter than if the upstream site is used (30). This is unlikely to have any functional consequences. More important in explaining the rapid changes in GRF structure during evolution of mammals is the observation that there is intron sliding in the rat GRF gene compared with the human gene. This change in splice donor site between intron 3 and exon 4 results in a change in the C terminus in the rat GRF by which it no longer matches the human GRF. Intron 3 interrupts the coding of GRF, which spans exons 3 and 4. Also, there is intron sliding when the human and rat GIP genes are compared. In the rat gene, the splice site for intron 2 and exon 3 is 24 nucleotides downstream compared with the human gene. This splice site in GIP is within the N-terminal peptide and in rat results in an N-terminal peptide that is eight amino acids shorter than the human form (136, 137).
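
The arithmetic behind the rat GIP example can be checked with a short sketch. It assumes that the shifted splice site lies entirely within coding sequence and that the shift preserves the reading frame; the function name is hypothetical.

```python
CODON_LENGTH = 3  # nucleotides per amino acid


def residues_lost(shift_nt):
    """Amino acids removed when a splice site moves shift_nt nucleotides
    into the coding sequence without changing the reading frame."""
    if shift_nt % CODON_LENGTH != 0:
        raise ValueError("shift is not a whole number of codons; "
                         "the reading frame would change")
    return shift_nt // CODON_LENGTH


# Rat GIP: splice site 24 nucleotides downstream of the human site.
print(residues_lost(24))  # 8
```

A shift of 24 nucleotides is a whole number of codons, which is consistent with an N-terminal peptide that is shorter by eight amino acids and otherwise in frame.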

One of the most interesting uses of alternative splicing in the superfamily is within the receptors. The PACAP receptor has at least five different inserts (191, 192), and the GRF receptor has at least one (169, 193); both sets of receptor inserts are in the third intracellular loop and are thought to control coupling to different intracellular signaling pathways. This alternative splicing is clearly functional. In addition, alternative splicing is reported for the glucagon receptor (194), GIP receptor (195), and VIP receptor (196), although the biological function of these variants is not known. Any member of this subfamily of receptors that has multiple signaling pathways (e.g., GLP-1) may be shown eventually to have variant forms of its receptor.

I. Alternative promoters for expansion of family functions
For several family members, alternative promoters are used in a tissue-specific way. Upstream promoters can lead to changes in the 5'-UTR and may alter the stability of the mRNA before translation. An example is the GRF gene transcript. In the placenta and testis, different upstream promoters are used compared with the brain transcription start site. This results in a different exon 1, which encodes the 5'-UTR (148, 171).

For PACAP at least three distinct mRNAs have been detected. The dominant one in rat neural tissue is 2.2 kb and is the same as the one shown in Fig. 4 (PACAP, human). The other two transcripts are shortened (0.9 kb) but not identical to each other. One is a testis-specific transcript with a shortened 5'- and 3'-UTR plus an extra 126-bp sequence in the 5'-UTR compared with the 2.2-kb transcript (152, 153). The other short transcript was detected in the superior cervical ganglion after stimulation by depolarization; the sequence was not determined, but primers were used to show that it was not identical to the testis-specific transcript (197). Hence, PACAP alternative promoters have been identified and result in tissue-specific (testis, placenta, and sympathetic nervous system) transcripts.

The GIP gene may use two promoters: the most abundant mRNA begins at exon 1 as shown in Fig. 4; a minor mRNA begins at exon 2, as suggested by the ribonuclease protection assays. In this case it appears that intron 1, which contains a TATA box and cis-acting elements, is the alternative promoter (198). The purpose of mRNA lacking exon 1 and most of the 5'-UTR is not clear.

J. Posttranslational processing for increase of peptide families
Several of the superfamily peptides can be cleaved from their protein precursors in more than one way, creating new peptides. The enzymes that cleave PACAP and possibly other hormones from the precursor are important and have been reviewed (145). In most cases, biological activity resides with the shorter peptide.

The glucagon precursor can be cleaved variously depending on the tissue. In the mammalian pancreas, glucagon is cleaved from the precursor, whereas GLP-1 and GLP-2 are not cleaved from the larger fragment except to a minor extent (199). In contrast, GLP-1 and GLP-2 are cleaved from the precursor in the small intestine; GLP-1 is further cleaved in mammals to GLP-1(7–37) or GLP-1(7–36) amide before it becomes biologically active. From the intestinal precursor, glucagon is not released; instead the precursor is processed so that glicentin (glucagon that is extended at the N and C termini) and oxyntomodulin (glucagon that is extended at the C terminus only) are released. The fish glucagon precursor lacks the GLP-2 coding region in the pancreas, but not in the intestine; this is a result of alternative splicing and not posttranslational modifications. In fish, GLP-1 is 31 amino acids as in mammals, but the six-amino acid N terminus that is cleaved to make a mature peptide in mammals is not present in fish, which use GLP-1(1–31). The initial cleavage from the fish precursor results in an active GLP-1 of only 31 amino acids. Finally, in primitive fish including lamprey, the glucagon precursor is cleaved further in the intestine to produce glucagon from glicentin (200). Glicentin (69 amino acids) does not release insulin, whereas the shorter oxyntomodulin (37 amino acids) is almost as effective as glucagon (29 amino acids) in releasing glucose-induced insulin from the pancreas (201). Likewise, GLP-1 and GLP-2 are active only as cleaved peptides and not as the extended product of GLP-1 and GLP-2 connected by a small peptide as released from the pancreas.
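
The residue counts implied by the GLP-1 fragment names can be verified with a brief sketch. It assumes only that a fragment named (first–last) includes both end positions; the function and dictionary names are hypothetical.

```python
def fragment_length(first, last):
    """Number of residues in a peptide fragment spanning positions
    first..last, counting both end positions."""
    return last - first + 1


# (first, last) positions taken from the fragment names in the text.
fragments = {
    "mammalian GLP-1(7-37)": (7, 37),
    "mammalian GLP-1(7-36) amide": (7, 36),
    "fish GLP-1(1-31)": (1, 31),
}

lengths = {name: fragment_length(*span) for name, span in fragments.items()}
for name, n in lengths.items():
    print(name, n)

# Residues removed from the N terminus of mammalian GLP-1(1-37)
# to give GLP-1(7-37).
n_terminal_removed = fragment_length(1, 37) - fragment_length(7, 37)
print(n_terminal_removed)  # 6
```

The counts agree with the statements above: mammalian GLP-1(7–37) and fish GLP-1(1–31) are both 31 amino acids, and the mammalian N-terminal extension that is removed is six amino acids.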

PACAP is processed as a 38- or 27-amino acid peptide (10). Both peptides are equally effective in some functions, but the extended PACAP-38 is more effective in others (see Section V). VIP is a 28-amino acid peptide, but extended forms of VIP have been identified as VIP(1–28) Gly-Lys-Arg (69). PHI and PHM are found as both PHI/PHM-27 and in the extended form known as PHV-42 (70, 202). Also, PHI(1–27) Gly, secretin(1–27) Gly, and secretin(1–27) Gly-Lys-Arg exist (69).


    IV. Superfamily Members—Overlapping Functions, Expression, and Receptors