Monash Institute of Medical Research (G.M.G., M.K.O.), Monash University, Melbourne, 3168, Australia; Biology Department (K.R.), Vrije Universiteit Brussel, B-1050 Brussels, Belgium; and The Australian Research Council Centre of Excellence in Biotechnology and Development (M.K.O.), Monash University, Melbourne, 3168, Australia
Correspondence: Address all correspondence and requests for reprints to: Dr. Gerard M. Gibbs, Monash Institute of Medical Research, Monash University, 27-31 Wright Street, Clayton 3168, Australia. E-mail: gerard.gibbs{at}med.monash.edu.au.
| Abstract |
|---|
| I. Introduction |
|---|
The evolutionary diversity of the CAP superfamily was recognized very early, and speculative functional relationships were proposed, suggesting (among other things) isozymes with distinct substrate specificity (2) and functional links between the plant and human immune systems (10). Some 16 yr after recognition of this phylogenetic relationship, there have been some gains in our understanding of mammalian CAP biology; however, in many cases insights have been limited to teasing out the finer detail of earlier characterizations. CAP proteins have been identified in most tissues within the human and mouse and are proposed to have significant roles during reproduction, development, immune function and in pathologies including specific cancers, nerve damage, pancreatitis, and heart failure. Many of the uncharacterized mammalian CAP mRNAs show expression biases to the same tissues, suggesting that the superfamily will be increasingly recognized as a key driver of fertility, health, and disease.
Annotation of the human and mouse genomes has identified the extent to which the CAP superfamily exists in mammals; 31 CAPs in humans and 33 in mice have been identified. Our phylogenetic analysis shows these can be separated into nine distinct subfamilies: the CRISPs, the glioma Pr-1 (GLIPR1) proteins, the GLIPR2 (or GAPR1) proteins, PI15, PI16, CRISP LCCL domain containing 1 (CRISPLD1), CRISPLD2, the mannose receptor-like (MRL) proteins, and the R3H domain-like (R3HDML) proteins (Fig. 1).
| II. The CAP Superfamily Conservation and Evolutionary Distribution |
|---|
The crystal structures of several CAP superfamily proteins have been determined (Section V). They show that the CAP motifs are present in a small, disulfide bond-stabilized, and structurally conserved 17- to 21-kDa domain referred to as the CAP (or SCP or Pr-1) domain. The presence of these sequence motifs and the resultant CAP domain is characteristic and definitive of the entire superfamily. Using a 175-amino acid region covering most of the CAP domain and incorporating these signature sequences, 1281 CAP proteins in 427 species can be identified using Pfam (13). CAP superfamily proteins typically, but not always, contain a signal sequence directing proteins to specific intracellular compartments or more typically to the extracellular environment, where their disulfide-stabilized structure is presumably important for overall stability. Within this extracellular environment, functions are proposed to be either autocrine or paracrine. In the broadest sense, these involve regulating signal transduction events, e.g., via ion channel gating or receptor activation, or modifying other proteins, for example in association with extracellular matrix remodeling or cellular adhesion.
Although a large number of proteins within the superfamily contain a CAP domain in isolation (e.g., the Ag5 and Pr-1 proteins found in insects and plants), many others contain additional C-terminal extensions. With such a large superfamily, identification and classification of subfamilies has been difficult and often confusing; however, it is becoming apparent that many groupings can be defined based upon similarity in the CAP domain and also the nature of any C-terminal extensions. A schematic overview of the mammalian CAP superfamily architecture is provided in Fig. 1. Within published literature, nomenclature has frequently been loosely applied when naming new proteins in this superfamily, resulting in confusion. Herein we have endeavored to clarify, and simplify, CAP nomenclature. In some cases, we have recommended name changes.
Those proteins that contain the CAP domain in isolation typically terminate shortly after the CAP2 motif. In plant Pr-1, the terminal amino acids are a PY dipeptide; in Ag5, a similar motif, [ILVP]Y, lies very near the terminus. The PY dipeptide is conserved among all mammalian CAPs and is found shortly after the CAP2 motif. Given the largely conserved nature of this dipeptide throughout the superfamily, and given that it terminates the CAP domain in a number of cases, it may be considered the termination of the domain throughout the entire superfamily. This general rule is used throughout this review.
As summarized in Fig. 1, each of the CAPs identified in mammals (with the exception of GLIPR2) includes a C-terminal extension. Most have at least a short C-terminal extension containing four cysteines, which has been referred to as a Hinge region (14). Additional domains are sometimes entirely unique and are observed only in association with the CAP domain, e.g., the ion channel regulator (ICR) at the C terminus of the CRISPs (15); some display homology to domains of other protein families, such as the C-type lectin (CTL) domain (Section IV.C) (16) and the LCCL domain (Section IV.D) (17); and others have C-terminal extensions with no homology to other sequences. This multidomain architecture is suggestive of multifunctional proteins. This interpretation is supported by several lines of evidence as detailed throughout this review.
Database analyses that we have undertaken show that at least 43 different protein domain architectures are present for the CAP superfamily across the three major phylogenetic divisions of Archaea, Bacteria and Eukaryota. Within each of these divisions, there are 8, 213, and 216 species, respectively, that are known to contain CAP proteins. Within the Bacteria, the majority of sequences reside within the Proteobacteria (138 sequences) and Firmicutes (132). Within the Eukaryota, the majority of sequences reside within the Viridiplantae (183) and Metazoa (529), primarily distributed between Mollusca, Nematoda (109), Arthropoda (flying winged insects of the Neoptera) (216) and Chordata (170). Within the Chordata, proteins have been primarily identified from squamates (snakes and lizards) (49), primates (36), and rodents (41). The focus of the current review is the CAPs of chordate origin, especially those found within mammals. Some CAPs of nonmammalian origin will, however, be described because they provide a useful context for future investigation.
| III. Mammalian CAP Superfamily Proteins |
|---|
Within all mammalian species investigated, several CRISP1 paralogs have been identified. CRISP2 was originally identified in the guinea pig as auto-antigen 1 (28, 29), in the mouse as TPX1 (2, 30), and in the rat (31, 32), human (33), and horse (34). CRISP3 was originally identified in the mouse (25), followed by the human (also called specific granule protein 28) (33, 35), and the horse (also called horse seminal plasma protein-3) (34, 36, 37). CRISP4 has been identified only in the mouse and the rat (38, 39).
Expressed sequence tags (ESTs) and genome sequences show that CRISPs are found in many other species including the chimpanzee, cow, pig, dog, chicken, opossum, and platypus (G. M. Gibbs, unpublished data). Owing to their abundance in the male reproductive tract of all of the species studied to date, CRISPs have gained much attention as potential regulators of male fertility.
CRISPs are two-domain proteins with an evolutionarily diverse and structurally conserved N-terminal CAP domain of approximately 21 kDa, which contains each of the CAP motifs, and a C-terminal cysteine-rich domain of approximately 6 kDa, which comprises a Hinge and an ICR region (Fig. 1). The presence of two structural domains in CRISPs was first indicated by partial disulfide bond mapping of CRISP1 (40), but was formally confirmed after publication of the crystal structure of several snake CRISPs (14, 41, 42) (Section V). The C-terminal extension contains 10 strictly conserved cysteines, which form five disulfide bonds, and is the defining element of the CRISP subfamily. The C-terminal domain of CRISPs is often referred to as the cysteine-rich domain; however, the wealth of other nonrelated but similarly named domains leads to confusion, hence our proposed CRISP nomenclature (15). Throughout this review we refer to the C-terminal extension as the CRISP domain when we consider both the Hinge and the ICR together. The spacing of the cysteines in the Hinge region differs among the mammalian CAP subfamilies. The cysteine spacing of the Hinge region in the CRISPs is Cx2Cx3Cx4C.
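The Hinge spacing notation can be read as a simple sequence pattern, where "x" stands for any residue and the number gives the gap length between consecutive cysteines. The following is a minimal sketch, not a validated annotation tool: the function names and the toy sequences are hypothetical, and the GLIPR1 spacing (Cx2Cx5Cx4C) is the one quoted for that subfamily later in this review.

```python
import re

# Hinge cysteine spacings quoted in this review ("x" = any residue).
HINGE_SPACINGS = {
    "CRISP": (2, 3, 4),   # Cx2Cx3Cx4C
    "GLIPR1": (2, 5, 4),  # Cx2Cx5Cx4C
}

def hinge_regex(spacing):
    """Build a regex for four cysteines separated by the given gap lengths."""
    return re.compile("C" + "".join(f".{{{gap}}}C" for gap in spacing))

def find_hinge(sequence, subfamily):
    """Return (start_index, matched_segment) for each non-overlapping match.

    Hypothetical helper: a match only shows that the spacing is present,
    not that the segment is a genuine Hinge region.
    """
    pattern = hinge_regex(HINGE_SPACINGS[subfamily])
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]

# Toy sequences invented for illustration; they are not real CAP proteins.
toy_crisp = "AAACKKCQQQCLLLLCAAA"
toy_glipr1 = "GGCAACDDDDDCEEEECGG"
crisp_hits = find_hinge(toy_crisp, "CRISP")
glipr1_hits = find_hinge(toy_glipr1, "GLIPR1")
```

Because the pattern encodes only cysteine spacing, short matches can occur by chance in unrelated proteins; in practice such a scan would be restricted to the region immediately C-terminal to the CAP domain.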
CRISPs are only found in vertebrates and have been reported in many mammals, reptiles, Xenopus frogs, and the parasitic lamprey. All CRISPs contain a predicted signal peptide consistent with their extracellular localization or their localization to specific intracellular compartments. Although they do not contain transmembrane domains, they are sometimes found associated with membranes potentially either through glycosylation or through interactions with integral membrane proteins. All mammalian CRISPs have a molecular mass of between 25 and 32 kDa, depending upon their glycosylation status or other posttranslational modifications.
Aspects of the biological function of mammalian CRISPs are starting to become apparent. Notably, they are likely to be dual function proteins with an activity associated with both the N-terminal CAP domain and the C-terminal CRISP domain. For example, the N-terminal CAP domain has been implicated in fusion to the oocyte (Section V.C), whereas the C-terminal domain has an ion channel regulatory activity (Section IV.B). In particular, for the former of these functions, analysis of the CAP superfamily and CAP domain containing proteins is useful. CRISPs are expressed in numerous locations throughout the body, and depending upon expression context, their biological functions may differ. The expression context and biochemistry of CRISPs are described throughout this section, and the biological functions are the focus of Sections V–VIII.
1. Mammalian CRISP orthologs and paralogs.
Genomic data show that most mammals contain three CRISPs, with the mouse being the only one to date shown to contain four homologous genes. Genes are typically clustered and syntenic between species (reviewed in Ref. 43). Human CRISP1, CRISP2, and CRISP3 are localized to chromosome 6p21.3; rat Crisp1, Crisp2, and Crisp4 are localized to chromosome 9q12; and mouse Crisp1, Crisp2, and Crisp3 are localized to chromosome 17B2, whereas Crisp4 is located on chromosome 1A3 in a locus rich in other CAP superfamily genes (Glipr1, Glipr1l1, Glipr1l2).
A phylogenetic analysis of 99 CAP protein sequences yields a well-resolved consensus tree defining the major mammalian CAP protein clades (Fig. 1B). In this tree, all CRISP proteins cluster together in a single clade composed of two major subclades that receive high node credibility by our analyses. One subclade consists of CRISP1 orthologs plus rodent CRISP4, whereas the other combines the CRISP2 and CRISP3 orthologs. A closer examination of the phylogenetic structure of the CRISP clade (Fig. 2) highlights the current nomenclature inconsistencies between species and between subfamilies (or subclades). The nomenclatural inconsistency for CRISP1, CRISP3, and CRISP4 had its origin in the earliest period of CRISP research and developed through a combination of incomplete characterizations of the Crisp genes in the mouse and rat and overlapping expression profiles of CRISPs within the epididymis of the rat and the human. Although phylogenetic differences were recognized between "apparent" CRISP1 orthologs in the rat and human (44), it was not until the identification of mouse and rat CRISP4 (38, 39) that the genetic picture was completed and the true ortholog of human CRISP1 was identified as rodent CRISP4 (39). The long-stem branch preceding the CRISP2/CRISP3 subclade (Fig. 2) indicates that CRISP2 and CRISP3 are closely related paralogs. The clustering of equine and canine CRISP3 with the CRISP2 group (Fig. 2) may be a phylogenetic artifact caused by the overall low divergence between CRISP2 and CRISP3 proteins and was partially resolved when alternative likelihood methods were applied (K. Roelants, unpublished data). The CRISP3 orthologs appear to have the most sequence variation, and this is represented in the phylogenetic tree. CRISP2 forms a well-resolved clade and is the most clearly defined in terms of interspecies nomenclature. Mouse and rat CRISP1 cluster with human CRISP3, providing supporting evidence that rodent CRISP1 is the ortholog of human CRISP3. For simplicity, within the following sections these subdivisions are referred to as the CRISP1 group, the CRISP2 group, and the CRISP3 group, despite individual instances of nomenclature inconsistency. We considered attempting to rename the CRISPs in this review so that the nomenclature accurately reflected the phylogenetic classification, but felt that this would lead to confusion in the already extensive CRISP literature.
2. CRISP expression, localization, and posttranslational modifications.
Rather than reviewing mammalian CRISP expression profile according to nomenclature classification, this summary of CRISP expression is based on tissue distribution. Given the high likelihood of functional redundancy between family members, we believe this organization will assist in functional predictions and overcome some of the difficulties associated with the nomenclature inconsistencies.
3. Testis.
Germ cells within the testis undergo successive mitotic and meiotic divisions followed by a series of morphogenic processes, collectively called spermiogenesis, to form the highly specialized spermatozoa (46). The differentiated spermatozoon comprises several subcellular compartments and can be most simply divided into the head and the tail. The head contains a highly condensed nucleus and an anteriorly positioned lytic vesicle called the acrosome. Lysis of the acrosome (a process called the acrosome reaction) is required to achieve fertilization (47). The sperm tail, which is required for motility, is divided into three regions: the midpiece, principal piece, and end piece. In addition to the axoneme, which all ciliated cells possess, sperm tails have unique ultrastructural components, notably, the outer dense fibers and a fibrous sheath (48, 49).
Based on published data, Crisp2 is the only CRISP normally expressed in the mammalian testis. Its expression is not regulated by androgens, and CRISP2 protein is not modified by glycosylation (28, 50). CRISP2 has been partially characterized in the guinea pig, mouse, rat, horse, and human. The bulk of the data suggest that CRISP2 is first produced in spermatids; however, its mRNA may be expressed several days earlier, then subjected to translational delay (28, 32, 51, 52, 53) via binding to the RNA binding protein DAZL (54). CRISP2 is incorporated into the acrosome contents and is visible from early steps in round spermatid development (28, 29, 51, 53, 55, 56). Based on protein localization studies largely in the rat, CRISP2 is also incorporated into the developing sperm connecting piece (or neck) and accessory structures of the sperm tail (51). Incorporation of CRISP2 into the sperm tail is likely to occur via the granulated bodies (51). Granulated bodies are believed to be protein transport vesicles required for the transfer of proteins from the cytoplasmic lobe of elongating spermatids into the developing sperm tail (57). Within the acrosome of guinea pig sperm, CRISP2 is estimated to be 6.4% of the total protein content (28).
Although murine Crisp2 expression is clearly abundant within the testis, low levels of cDNA have also been amplified from purified secretory acinar cells from the mouse lacrimal gland (tear duct) (58), and cDNA and protein have been observed in the mouse ovary, lung, and skeletal muscle (45).
4. Epididymal production and sperm localization.
Fully differentiated testicular sperm are not capable of fertilizing an oocyte and are required to undergo a period of epididymal maturation to achieve fertilization competence. The epididymis comprises several regions, most simply divided into the initial segment, the caput, corpus, and cauda. During passage through the initial segment, the caput, and the corpus, sperm acquire the ability to fertilize. Sperm are subsequently stored within the cauda epididymis in an inactivated, yet fully functional state. Epididymal maturation is characterized by a dramatic remodeling of the sperm cell, including changes in both the membrane lipid and protein components. Many proteins produced by the epididymis are secreted into the epididymal lumen where they both surround the sperm and are transferred to the sperm membrane.
Mouse, rat, human, and horse epididymis contains several CRISPs, typically from the CRISP1 and CRISP3 clades (Fig. 2). They show partially overlapping expression profiles (25, 33, 37, 39, 59) and, as indicated by the recent knockout of Crisp1 in the mouse which had normal fertility (60), most likely share some degree of functional redundancy. The epididymal epithelium of humans and horses secretes abundant CRISP1ᵃ and CRISP3ᵇ into the epididymal lumen (33, 37, 59), whereas rodents secrete abundant CRISP1ᵇ and CRISP4ᵃ (39). The superscript following each of these proteins is representative of their phylogenetic group (Fig. 2). Although the nomenclature would indicate different expression profiles between humans and rodents, there is in fact conservation based upon phylogenetic grouping. It is likely that these are functional counterparts whose biological role is determined by posttranslational modifications and the nature of the interaction with sperm. Based on mouse EST analyses, Crisp3 also appears to be expressed in the epididymis, although, unlike Crisp1 and Crisp4, this is not the primary tissue of production.
Human CRISP1 protein has been detected in all regions of the epididymis, vas deferens, and seminal plasma, and its expression is predominant in the male reproductive tract (33, 39, 44). CRISP1 from the epididymal epithelium becomes enriched within epididymosomes (61, 62), from whence it is transferred, at least in part, to the postacrosomal region of the sperm head (44). The interaction between CRISP1 and sperm is facilitated through N-linked glycosylation, which contributes approximately 13% to the total molecular mass (or 4 kDa) (44). CRISP1 associated with ejaculated sperm can be washed off under mild conditions, suggesting that it interacts weakly with sperm and that a significant proportion remains in the fluid component of the epididymal lumen (33). CRISP1 has been estimated to be present at a concentration of between 8 and 14 µg/ml (or 0.15–0.25 µg/mg of total seminal plasma protein) (33).
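The paired figures quoted above (a glycan contribution given both as a percentage and in kDa, and a concentration given both per millilitre and per milligram of total protein) can be cross-checked with simple ratios. The sketch below is only a back-calculation from the numbers in the text; the implied total mass and implied total protein concentration are derived values, not measurements reported by the cited studies, and the function names are hypothetical.

```python
def implied_total_mass_kda(glycan_kda, glycan_fraction):
    """Total molecular mass implied by a glycan mass and its fractional share."""
    return glycan_kda / glycan_fraction

def implied_total_protein_mg_per_ml(ug_per_ml, ug_per_mg_total_protein):
    """Total protein concentration implied by the two ways of quoting abundance."""
    return ug_per_ml / ug_per_mg_total_protein

# Human CRISP1: N-linked glycans are quoted as ~13% of the mass, or ~4 kDa.
total_mass = implied_total_mass_kda(4.0, 0.13)

# Human CRISP1 in seminal plasma: 8-14 ug/ml, or 0.15-0.25 ug/mg total protein.
protein_low = implied_total_protein_mg_per_ml(8.0, 0.15)
protein_high = implied_total_protein_mg_per_ml(14.0, 0.25)

print(f"implied glycosylated mass: {total_mass:.1f} kDa")
print(f"implied total protein: {protein_low:.0f}-{protein_high:.0f} mg/ml")
```

Under these assumptions, the implied mass of roughly 31 kDa sits within the 25 to 32 kDa range given for mammalian CRISPs, so the two ways of quoting the glycan contribution are mutually consistent.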
CRISP3 is also produced by the secretory epithelium of the human epididymis. It is present at highest concentrations in the cauda epididymal lumen and vas deferens (4.6 µg/mg total protein) (59). CRISP3 exists in two forms (+/– glycosylation), both of which remain associated with human sperm after analysis of the washed highly motile "swim up" fraction (59). CRISP3 is present in seminal plasma at 14.8 µg/ml with the majority originating from the accessory sex organs, i.e., the ampulla of the vas deferens and the prostate (59).
Within the mouse, Crisp4 expression and protein localization were observed from 30 d after birth in the principal cells in the epididymal epithelium. Expression levels increased up to d 60 in an androgen-regulated manner (38). Within the rat, Crisp4 is most highly expressed in proximal regions of the epididymis, i.e., highest levels were observed in the initial segment, segments III-VI of the caput epididymis, and in the corpus epididymis. Although much lower levels of expression were seen in the cauda epididymis (38), the secreted protein accumulates within the corpus and cauda where it attaches to the maturing sperm (39). A similar distribution was observed in the mouse epididymis (39). The majority of the CRISP4 protein, however, is released from sperm under nondetergent extraction conditions, suggesting primarily a nonintegral membrane association (39). Low expression of murine Crisp4 has also been detected in the oocyte (39), skeletal muscle, spleen, and thymus (45).
Similar to rodent Crisp4, rodent Crisp1 is found in the highest levels in the epididymis (39). In comparison to Crisp4, however, Crisp1 is expressed more distally, and the highest levels of expression were observed in the cauda epididymis (39). Crisp1 expression is androgen-regulated (50), consistent with the identification of androgen response elements within the promoter (63). The bulk of rodent CRISP1 is produced and secreted from the principal cells of the proximal corpus epididymis through to the cauda, where it can be identified as a 29-kDa protein and a 31-kDa protein with three major and one minor charge variants that are suggestive of differential glycosylation (27, 39, 40, 64).
Within the mouse, CRISP1 comprises 15% of the total cauda epididymal fluid protein content (12.4 µg/mg of cauda protein) (40). Within the rat, CRISP1 reaches maximal concentrations in the cauda epididymal lumen (19, 65) and is estimated to be present at approximately 15.5 µg/mg of total protein (20, 66). Such a high concentration is consistent with the proposed role of CRISP1 as a decapacitation factor involved in maintaining sperm in a quiescent state before ejaculation (see Section V.B) (67).
Rat epididymal CRISP1 exists as two isoforms, the smaller D and larger E isoform with an approximate difference of 2 kDa in molecular mass. The smaller D form is synthesized from the principal cells of the caput epididymis and in all subsequent regions of the epididymis. The larger E form is synthesized in the corpus and proximal cauda (64, 68, 69) and is quantitatively the lesser fraction (65, 70). Both forms of rat CRISP1 have been purified to homogeneity from the epididymal fluid, and sequencing of an internal 34-amino acid peptide supports their uniform identity (71). Both isoforms had a blocked N terminus (70), which was likely the result of cleavage of the predicted signal peptide after secretion from the epididymal epithelium. Tryptic peptide maps from each isoform showed a few differences; however, direct sequencing of those peptides failed, suggesting that a distinction between the isoforms resides with the blocked N-terminal tryptic fragment.
Considerable research has gone into determining the type of CRISP1 associated with sperm and the dynamics of its redistribution during capacitation and fertilization. The posttranslational differences between the D and E forms have significant implications for this redistribution. Although these studies have been complicated by variations in techniques and analysis tools, a clear picture is beginning to emerge. It is possible to discriminate the D and E isoforms of CRISP1 using the 4E9 monoclonal antibody that binds to a partially characterized epitope on the N terminus of the E form (71). Although both the D and E isoforms are glycosylated (in rat CRISP1 this corresponds to between 6 and 7.5% of its total mass or approximately 2 kDa) (65, 72), this is not the basis for the molecular mass difference (70). Splice variants of rat Crisp1 have been identified; however, these do not contribute to additional coding sequence in the N terminus (73, 74). Protease and cyanogen bromide cleavage of CRISP1 indicates that the 4E9 epitope involves both additional protein sequence and the presence of an O-linked N-acetyl galactosamine epitope of 204 Da (43, 70, 75).
In the rodent, most sperm bound CRISP1 is loosely associated with the plasma membrane and is easily removed (64, 76). A smaller population, the E form, is more strongly attached to the membrane and remains associated with the sperm head after maturation within the female reproductive tract (27, 64). This more strongly attached population has been proposed to have a role in the fusion to the oocyte and is described in more detail below in Section V.C.
Immunofluorescence staining of sperm from the caput epididymis showed generalized CRISP1 staining over the entire sperm surface, which becomes localized to the midpiece and the head after epididymal maturation and entry into the cauda epididymis (65). As indicated, within epididymal sperm the D isoform appears to be glycosyl-phosphatidylinositol (GPI) anchored (65) and is seen associated loosely with the dorsal region of the sperm head in fixed sperm (64, 77). The E isoform undergoes C-terminal proteolytic truncation during epididymal transit to create progressively smaller versions, with the smallest of these (26 kDa) associating very strongly with the membrane over the sperm tail (64). The 11D4 monoclonal antibody, which binds to the 11 most C-terminal amino acids of full-length CRISP1 (64), failed to detect the truncated forms on the sperm tail, supporting the notion of proteolysis. This proteolysis interpretation is yet to be confirmed.
Many reports have described the redistribution of CRISP1 D isoform from the dorsal head region to the equatorial segment during capacitation (77, 78). Indirect immunofluorescence on sperm using the 4E9 antibody suggests that a small proportion of the CRISP1 E isoform redistributes to the equatorial segment on the sperm head after in vitro capacitation (Section V.C) (compare Fig. 1 in Ref. 64 with Fig. 7 in Ref. 67) and that the total CRISP1 pool (as detected by antibody CAP-A) on the sperm membrane is reduced during capacitation (67). Given the question of specificity of polyclonal antibodies raised against almost identical D and E isoforms; the relative abundance of the D isoform within the epididymal lumen and on fresh cauda sperm; that the D isoform is not tightly bound to the membrane; and that the E isoform appears to undergo capacitation-associated redistribution to the equatorial segment, the possibility exists that on capacitated sperm the E isoform becomes preferentially detected as the D isoform is depleted (Section V.C), hence the appearance of translocation of the D isoform from the dorsal to the equatorial segment. Although several outstanding questions remain, not least of which is the nature of the truncated E isoforms, this hypothesis is consistent with numerous aspects of the functional models (Section V.C).
5. CRISP expression in the accessory glands of the male reproductive tract.
As indicated above, CRISP3 is produced by the normal human vas deferens and prostate and contributes to the high levels observed in seminal plasma (14.8 µg/ml) (59). Crisp3 is also produced in the horse seminal vesicles and is thus likely to contribute to seminal plasma levels in other species (37). Immunoprecipitation and gel filtration experiments have shown that within human seminal plasma, CRISP3 binds to dimeric β-microseminoprotein (β-MSP) with 1:1 stoichiometry and with a KD of approximately 6.5 × 10⁻¹¹ M, suggesting a highly specific interaction (79). As the function of β-MSP [also called prostate secretory protein 94 (PSP94)] is not clear, the biological significance of this interaction is not yet apparent. There is, however, a significant excess of β-MSP in seminal plasma, meaning that it is likely that all the CRISP3 is complexed. This interaction has been proposed to occur via the CAP domain because protease-inhibitor 16 (a non-CRISP CAP) that is found in human serum plasma also interacted with β-MSP (80) (Section III.E).
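The statement that essentially all seminal plasma CRISP3 is likely to be complexed follows from comparing the dissociation constant with the concentrations involved. The sketch below illustrates this with the standard one-site binding expression, fraction bound = [L] / ([L] + KD), which assumes the ligand is in excess so that its free concentration is roughly constant. The 28-kDa molecular mass (midway between the 27- and 29-kDa forms described in this review) and the 1 µM free β-MSP concentration are illustrative assumptions, not values taken from the cited studies.

```python
KD_M = 6.5e-11  # dissociation constant quoted in the text (mol/L)

def ug_per_ml_to_molar(ug_per_ml, mass_da):
    """Convert a mass concentration (ug/ml) to mol/L for a given molecular mass (Da)."""
    grams_per_litre = ug_per_ml * 1e-3  # 1 ug/ml equals 1e-3 g/L
    return grams_per_litre / mass_da

def fraction_bound(free_ligand_molar, kd_molar=KD_M):
    """One-site binding: fraction of CRISP3 occupied at a given free ligand level."""
    return free_ligand_molar / (free_ligand_molar + kd_molar)

# CRISP3 in seminal plasma (14.8 ug/ml), with an ASSUMED mass of 28 kDa.
crisp3_molar = ug_per_ml_to_molar(14.8, 28_000)

# ASSUMED free beta-MSP concentration of 1 uM, chosen only for illustration.
occupied = fraction_bound(1e-6)

print(f"CRISP3 is roughly {crisp3_molar:.2e} M, far above KD = {KD_M:.1e} M")
print(f"fraction bound at 1 uM free ligand: {occupied:.5f}")
```

With a KD in the tens of picomolar and both partners present at far higher concentrations, the expression approaches 1 for any plausible excess of β-MSP, which is consistent with the interpretation given in the text.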
Within stallion seminal plasma, CRISP3 is not glycosylated and is present as a monomer in solution and at concentrations up to 1.3 mg/ml. It has been estimated that after removal of seminal plasma, 0.9 to 9 million molecules of CRISP3 remain associated with each stallion sperm on the postacrosomal and equatorial segment and that they may have a function related to oocyte binding (36, 37) in a manner conceptually similar to that proposed for CRISP1 (reviewed in Ref. 76) (Section V.C). To date, this role has not been explored.
6. CRISPs in the immune system.
Crisp3 has a broader expression distribution than other mammalian CRISPs. Within the nondiseased animal, the expression of Crisp3 cDNA is suggestive of a role in the innate immune system (35, 81). Specifically, CRISP3 has been detected in the secretions from several exocrine glands onto mucosal surfaces, including the lacrimal gland, submandibular (salivary) gland, prostate, and pancreas (25, 33, 58); and within several cells of the immune system including pre-B cells, neutrophils, and eosinophils (81, 82); in the spleen where its expression is regulated by Oct2 (83); and in the thymus (33). The potential role of CRISP3 in the immune system is described in more detail in Section VII. CRISP3 is estimated to be present at 6.3 µg/ml in human plasma, 21.8 µg/ml in saliva, 11.2 µg/ml in seminal plasma, and 0.15 µg/ml in sweat (82). We showed that CRISP3 mRNA and protein are also found in skeletal muscle, uterus, mammary gland, and heart (45), but such expression is unlikely to be related to immune function.
Within the mouse submandibular gland, the developmental expression profile of Crisp3 shows a distinct increase (approximately 6-fold) relative to Crisp1 from birth up to 20 d of age (84). Consistent with the presence of an androgen-response element in both mouse Crisp1 and Crisp3 promoters (85), the absolute level of both mRNAs increased within the granular convoluted tubules of the male submandibular gland from d 30, concordant with increased androgen levels (25, 84). Crisp3 expression has also been observed in equine submandibular glands (37).
Within neutrophils, CRISP3 is present as two populations including an N-linked glycosylated form (predicted to be at Asn239) with an estimated molecular mass of 29 kDa, and a nonmodified form of 27 kDa (82). Within human plasma, presumably as a product of neutrophil and eosinophil secretory granules, CRISP3 binds to α-1B-glycoprotein (α1BG) with a 1:1 molar ratio and a KD in the nanomolar range, suggestive of a highly specific interaction (86). This interaction was not metal ion dependent and has been proposed to inactivate the potentially toxic effect of CRISP3. Such an activity would be analogous to the inactivation of snake venom metalloproteases and myotoxins by α1BG-like proteins in opossum plasma (87, 88, 89, 90). Based on this thesis, CRISP3 may normally be kept in a bound-inactive state until it reaches, or is delivered to, its endogenous target. Unfortunately, because the activity of CRISP3 is, as yet, unknown, this hypothesis is difficult to test.
It is of note that CRISP3 expression is dramatically up-regulated in a number of pathologies including prostate cancer (91, 92, 93), squamous carcinoma of the tongue (94), chronic pancreatitis (95), and Sjögren's syndrome (96, 97, 98). The potential clinical significance of these changes is described in Sections VII–VIII.
B. Glioma pathogenesis related-1 (GLIPR1) subfamily
The second most well-characterized CAP subfamily in mammals consists of the GLIPR1 proteins. GLIPR1 proteins are also called related to testis specific, vespid and pathogenesis related-1 (RTVP1) proteins. Similar to the CRISPs, the GLIPR1 proteins are a multigene subfamily consisting of three genes in most species, with four identified within the mouse. Each GLIPR1 component group has distinct differences at its C terminus (Fig. 1A). Our phylogenetic analysis shows that mammalian GLIPR1 proteins form a single well-supported clade composed of three distinct subclades (Fig. 1B): GLIPR1, GLIPR1-like 1 (GLIPR1L1), and GLIPR1-like 2 (GLIPR1L2). GLIPR1-like 3 (GLIPR1L3) is unique to the mouse and appears to be a duplication of Glipr1l1 (99). GLIPR1L3 has a high degree of sequence identity with GLIPR1L1 (82%) and groups within the GLIPR1L1 clade after phylogenetic analysis (see supplementary Fig. 1, published as supplemental data on The Endocrine Society's Journals Online web site at http://edrv.endojournals.org, and Ref. 325 cited therein). Available expression data suggest distinct differences in expression profiles of each of these groups. Each of the GLIPR1 subfamily members contains the four CAP signature sequence motifs and the Hinge-like sequence with a cysteine spacing of Cx2Cx5Cx4C. Uniquely among the mammalian CAP superfamily, GLIPR1 and GLIPR1L2 proteins are predicted to have C-terminal transmembrane domains (99), which may anchor the extracellular CAP domain to membranes. There are some localization data to support this prediction (99). GLIPR1L2 has a long glutamate-rich domain (99), the function of which remains unknown.
The first GLIPR1 gene was identified after an examination of the major up-regulated genes within the most aggressive form of human brain cancer, glioblastoma multiforme/astrocytoma, and within glioma cell lines (100, 101). Transcripts were not identified in other neuronal cancer cell lines or within the normal brain, suggesting that GLIPR1 expression was derived from the cell from which the cancer originated and, as such, may be important for pathogenesis. The same gene (RTVP-1) was later identified in prostate cancer cell lines and shown to be p53-regulated and proapoptotic (102). Epigenetic down-regulation of GLIPR1 has been observed within the cancerous prostate (103). The role of GLIPR1 in cancer is discussed in Section VI.
Human GLIPR1, GLIPR1L1, and GLIPR1L2 genes are clustered on chromosome 12q21, and the mouse genes are localized on chromosome 10D1 (99). Each of the human and mouse genes has functional p53 response elements (99, 102). Genomic analyses identified likely orthologs of all GLIPR1, GLIPR1L1, and GLIPR1L2 within many taxa ranging from teleost fish to the chicken and other mammals (G. M. Gibbs, unpublished data).
Human GLIPR1 expression is observed in the fetal kidney and in multiple adult tissues (99, 104). Quantitative RT-PCR showed that GLIPR1 expression was highest in the human lung, followed by the testis, bone marrow, prostate, bladder, and kidney (99). Similarly, GLIPR1L2 has a wide expression profile with highest levels in the testis and lower levels in kidney, prostate, lung, bladder, and bone marrow (99). GLIPR1L1 is almost exclusively expressed in the testis, with trace amounts in the bladder (99). The cellular localization of GLIPR1L1 and GLIPR1L2 in humans is currently unknown. Similarly, a formal analysis of murine expression profiles has not been completed; however, EST expression data are consistent with that reported for the human. Splice variants of GLIPR1 are observed, and variations in expression profile are reported; however, it is not apparent how these may alter biological function (99, 105).
C. Glioma pathogenesis related-2 (GLIPR2)/Golgi-associated pathogenesis related-1 (GAPR-1) subfamily
The GLIPR2 subfamily possesses several unique characteristics among the mammalian CAP proteins. Although only two GLIPR2 proteins have been characterized (human and mouse), these characteristics suggest that this clade may contain the most primitive mammalian CAPs. This interpretation is supported by our phylogenetic analysis, which was rooted with the protein PRY1 from the yeast Saccharomyces cerevisiae (106) and shows the GLIPR2 clade to be the earliest diverged CAP subfamily within mammals (Fig. 1B). Despite the similarity in nomenclature between GLIPR1 and GLIPR2 proteins, they do not display a particularly close phylogenetic relationship within the CAP superfamily (Fig. 1B).
Genomic analysis showed GLIPR2 orthologs to be present in the widest array of species of all those in the mammalian CAP superfamily (G. M. Gibbs, unpublished data). GLIPR2 and paralogs can be seen within S. cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, insects, ascidians, bony fish, amphibians, birds, and mammals. Thirty-six genes have been identified, with 24 in mammals, including the human, chimpanzee, monkey, mouse, rat, dog, and cow. GLIPR2 is localized on human chromosome 9p12-p13, with orthologous genes localized in syntenic regions in other species.
An analysis of mammalian GLIPR2 protein sequences shows that each of the CAP sequence motifs is present, but they contain a greater degree of sequence variability than in other subfamilies (G. M. Gibbs, unpublished observation). Uniquely, GLIPR2 proteins do not contain a predicted signal sequence. This is consistent with its intracellular localization to the Golgi membrane (107). In addition, the conserved PY dipeptide sequence present in most other CAP superfamily proteins is absent, and the Hinge-like sequence present in all other mammalian CAPs is absent from GLIPR2 (Fig. 1A). In this latter respect, GLIPR2 is more similar to many nonmammalian CAPs. GLIPR2 proteins have the greatest degree of identity with the plant Pr-1 proteins (Section VII.C). This, along with their abundance in immune environments, supports the notion that GLIPR2 proteins may represent an evolutionary link between the plant and mammalian immune systems, a relationship that has been considered on several occasions but not yet demonstrated (100, 107).
The GLIPR2 gene (originally called C9orf19) was identified in the human and has a wide expression profile. Highest expression levels were seen in the heart, lung, and peripheral blood leukocytes. Lower levels were detected in the skeletal muscle, prostate, and uterus (108). At a similar time, the same protein (originally called GAPR-1) was purified from the lipid-enriched microdomains of CHO cell Golgi complexes, where its myristoylated N terminus was shown to be important for interaction with the cytoplasmic side of the Golgi membrane (107). Weak interaction with caveolin-1 supported enrichment to lipid raft domains.
Experimental analysis showed that GLIPR2 is predominantly localized in tissues with immunological functions including circulating and developing leukocytes in the spleen. High levels of GLIPR2 have also been reported in the kidney, pancreas, lung, uterus, embryo, and placenta. Lower levels were apparent in the brain, testis, and skeletal muscle (107).
More recently, GLIPR2 was identified in the mouse as an up-regulated protein in kidney fibrosis (109). Fibrosis is an accumulation of extracellular matrix that ultimately may replace normal tissue and often results in organ failure (110). Fibrosis has a complex etiology, which is often associated with an accumulation of myofibroblasts producing increased extracellular matrix (111, 112). Myofibroblasts arise from several sources, one of which is the transdifferentiation of local epithelial cells in a process called the epithelial to mesenchymal transition (113). In the normal kidney, GLIPR2 is secreted (potentially through a non-ER secretory pathway) to the collecting ducts of the medulla. However, within the fibrotic kidney GLIPR2 was observed within epithelial cells of proximal tubules and within the damaged glomeruli, adjacent to activated myofibroblasts, and directly apposed to zones of peritubular fibrosis. The addition of exogenous GLIPR2 to cultured renal epithelial cells caused their transformation to a fibroblast-like morphology, supporting the notion that GLIPR2 is involved in epithelial to mesenchymal transition and in generating a pool of myofibroblasts contributing to fibrosis (109).
D. Peptidase inhibitor 15 (PI15) subfamily
The PI15 subfamily of CAP proteins was first identified from cultured human glioblastoma cells as a novel 25-kDa trypsin binding protein (p25TI) with weak inhibitory activity (114). Genomic analysis shows PI15 orthologs in 29 species conserved throughout the Euteleostomi (bony vertebrates), with 23 identified in mammals including the human, chimpanzee, monkey, cow, horse, dog, mouse, and rat (G. M. Gibbs, unpublished data). Human PI15 is situated on chromosome 8q21.11 and is adjacent to another mammalian CAP superfamily gene, CRISPLD1 (Section III.F). Similarly, orthologous PI15 genes in six mammalian species are localized adjacent to CRISPLD1 (Section III.F). Our phylogenetic analyses cluster the PI15 proteins as the sister clade of the CRISPLD1, CRISPLD2, and R3HDML subfamilies (Fig. 1B and supplementary Fig. 1).
An analysis of seven mammalian PI15 sequences shows a minimum of 90% identity over the entire length of the 258 amino acid proteins (G. M. Gibbs, unpublished data). All PI15 subfamily members contain each of the expected CAP motifs. Structurally, they contain a CAP domain terminating with the conserved PY dipeptide, a Hinge-like sequence with cysteine spacing of Cx2Cx7Cx4C, followed by an additional short stretch of 12 amino acids (Fig. 1A and supplementary Fig. 2). N-terminal sequencing of cell culture-derived PI15 showed that the predicted signal peptide was active. Human PI15 appears to be produced as a propeptide and is cleaved at a furin-like protease cleavage site to produce a mature protein of 198 amino acids with one predicted glycosylation site (115).
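The "minimum identity" values quoted for each subfamily can be read as the lowest pairwise percent identity among the aligned sequences. The review does not state how identity was calculated, so the following is only a minimal sketch under one common convention: identical residues divided by the number of alignment columns in which neither sequence has a gap. The function names and the toy alignment are hypothetical and are not real PI15 sequences.

```python
from itertools import combinations


def percent_identity(aligned_a, aligned_b):
    """Percent identity between two rows of a multiple alignment.

    Assumption: columns where either sequence has a gap ('-') are
    ignored; other conventions use a different denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def minimum_identity(alignment):
    """Lowest pairwise percent identity across all rows of an alignment."""
    return min(
        percent_identity(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    )


# Toy alignment for illustration only (not real protein sequences).
toy_alignment = {
    "species_1": "ACDE-FGH",
    "species_2": "ACDQWFGH",
    "species_3": "ACDEWFGH",
}
print(round(minimum_identity(toy_alignment), 1))
```

Because the choice of denominator changes the result, identity values computed this way are comparable only when the same convention and the same alignment are used throughout.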
Within the mouse, PI15 was identified as a marker of podocyte maturation during kidney glomerular development (116). Within the human, PI15 has been detected within the brain, placenta, and lymphocytes (115). Screening of adult human and mouse tissue blots showed expression of PI15 in a wide range of tissues including the prostate, mammary gland, salivary gland, thyroid gland, skeletal muscle, smooth muscle, heart, and ovary (117). Within the mouse and human, PI15 expression was observed within glandular structures including the prostate, mammary gland, salivary gland, and thyroid gland (117). This expression profile mirrors that recently reported for the murine CRISPs (45).
Similar to GLIPR1, human PI15 is up-regulated within glioblastoma and neuroblastoma cell lines (115). The increased production of PI15 from cancer cell lines (115) and demonstration of a weak trypsin inhibitory activity (114) provide indirect support for a role in cancer pathogenesis, but this activity is yet to be confirmed, and an association between PI15 and cancer pathogenesis has not been established.
PI15 proteins have been identified in the pancreatic mesoderm of the chicken (originally called sugarcrisp) (117). Chicken PI15 is expressed at precise periods during organogenesis suggestive of a role during development (117). Chicken PI15 was observed within the mesoderm of the emerging dorsal pancreatic bud, within the thyroid anlagen between the emerging lung buds, within the lung bud itself, and within the anterior and posterior regions of the limb buds (117). Organogenesis of branched organs requires signaling between mesenchymal and epithelial cells and precise modification of the extracellular matrix through inhibition and activation of numerous intracellular and extracellular factors, including proteases, protease inhibitors, and adhesion molecules (118, 119) (reviewed in Refs. 120 and 121). The localization of chicken PI15 (with 90% identity to human PI15) to regions of bud development is suggestive of a role in the controlled modification of extracellular matrix through the action of proteases and protease inhibitors and in branching morphogenesis. Similarly, rat CRISPLD2 identified during lung development is proposed to have a role in the regulation of extracellular matrix formation (Section III.G).
Mouse PI15 has been identified within the developing podocytes lining Bowman's capsule of the kidney glomerulus (116). Pi15 showed a distinct peak of expression during the S-shaped and capillary-looped stage in the embryonic day (E) 18.5 kidney. Expression of Pi15 mirrored the expression of the known transcription factor and podocyte marker, Forkhead box c2 (Foxc2), but expression of Pi15 was not regulated by FOXC2 (116).
Although relatively little data are available on PI15 proteins, within multiple species and tissue environments PI15 proteins are secreted and localized to the extracellular matrix. Within this environment, they may have a role in the regulation of the modification of extracellular matrix.
E. Peptidase inhibitor 16 (PI16) subfamily
The peptidase inhibitor 16 (PI16) CAP subfamily is not well characterized, and there are only a few reports following their identification in human serum as a PSP94 binding protein (also called β-MSP) (80) and from a murine cardiac cDNA library (122). Although also referred to as cysteine-rich protease inhibitor (CRIPI) and CRISP9 through public databases, the family of orthologous genes was renamed PI16. It remains to be established whether the PI16 proteins have a protease inhibitory activity as reported for the human PI15 protein.
Our in silico genomic analysis identified numerous PI16 orthologs conserved within the Eukaryota, where 26 species containing the PI16 gene were identified (G. M. Gibbs, unpublished data). Human PI16 is localized to chromosome 6p21.2 and to syntenic regions within other species including chromosome 17B1 in the mouse, chromosome 12:9.02 Mb in the dog, and chromosome 23:8.86 Mb in the cow.
Analysis of six mammalian protein sequences shows a high degree of identity, with a minimum of 60% (between the dog and the mouse) and is consistent with them forming a distinct subclade after phylogenetic analysis (Fig. 1B and supplementary Figs. 1 and 3). PI16 proteins are normally about 460 amino acids in length. The mouse and rat contain an insertion of about 20 amino acids. Each of the CAP motifs is present, but a distinct feature of this family is the spacing between CAP3 and CAP4, suggesting a major structural difference between this subfamily and all other CAP superfamily proteins (Fig. 1A). Following the conserved PY dipeptide, a Hinge-like sequence with cysteine spacing the same as the GLIPR1 subfamily (Cx2Cx5Cx4C) is present, followed by an additional approximately 264 amino acids with no homology to any other sequences and no recognized protein domains.
Expression of human PI16 is observed within the testis, prostate, small intestine, colon, peripheral blood leukocytes, and the ovary with immunohistochemical analyses showing localization to specific cells within the fibromuscular stroma of the normal human prostate, the pituitary gland, parathyroid gland, tonsil, kidney, stomach, liver, and the Leydig cells within the testis (80). This wide expression profile is consistent with the National Center for Biotechnology Information Unigene expression profile. Analysis of serum levels of PI16 in prostate cancer patients showed a significant decrease (80), and it has been used as a prognostic marker of prostate cancer recurrence (123) (Section VII). Within human serum, PI16 is present as two molecular mass forms of 94 and 71 kDa, which reduce to 68 and 52 kDa, respectively, after deglycosylation (80). The nature of the 16-kDa difference in molecular mass is currently unclear.
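The mass figures quoted above for the two serum forms of human PI16 can be laid out explicitly. The short calculation below uses only the values reported in the text (80); the variable names are illustrative, and the apparent masses should be treated as estimates rather than exact values.

```python
# Apparent masses (kDa) of the two serum PI16 forms, as quoted in the text (80).
glycosylated = {"larger form": 94, "smaller form": 71}
deglycosylated = {"larger form": 68, "smaller form": 52}

# Mass removed from each form by deglycosylation.
glycan_mass = {
    name: glycosylated[name] - deglycosylated[name] for name in glycosylated
}

# Difference that persists between the two forms after deglycosylation.
residual_gap = deglycosylated["larger form"] - deglycosylated["smaller form"]

print(glycan_mass)   # {'larger form': 26, 'smaller form': 19}
print(residual_gap)  # 16
```

On these figures, deglycosylation removes 26 and 19 kDa from the two forms, and the remaining 16-kDa gap is the difference described as currently unexplained.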
Mouse PI16 (predicted to be 53 kDa) was identified as a secretory protein from a cardiac cDNA library, where it was shown to have a role in regulating cardiomyocyte size (122). PI16 was produced from a single transcript as three glycosylated forms of 74, 100, and 108 kDa (122). Deglycosylation did not eliminate the mass differences between the bands, suggesting additional posttranslational modification (122), paralleling observations for human PI16 and other CAP superfamily proteins. Within the mouse, PI16 is abundant in skin and aorta, with lower levels in adipose tissue (122). Immunohistochemical analysis of hearts showed barely detectable levels of PI16 within the cytoplasm, but protein had accumulated in the intercellular spaces, consistent with the secretory nature of the CAP superfamily proteins. PI16 is the second most highly up-regulated protein in cardiac myocytes in a mouse model of heart failure, second only to atrial natriuretic peptide (122) (Section VIII.D).
F. CRISPLD1/CAPLD1 subfamily
There are presently no primary publications describing the CRISPLD1 proteins. During our analysis of the mammalian CAP superfamily, we identified a large group of proteins displaying clear homology to the CAP superfamily with each of the normal conserved sequence motifs. Available EST data support their expression, and our phylogenetic sequence analysis supports their recognition as a distinct CAP subfamily (Fig. 1B).
Among the mammalian species used in the phylogenetic analysis, there is a minimum 92% sequence identity over the length of the CRISPLD1 proteins (supplementary Fig. 4). Our analysis of selected CRISPLD1 proteins showed that they are approximately 500 amino acids in length and contain each of the four CAP consensus motifs, the conserved PY dipeptide, and a Hinge-like sequence (Cx2Cx7Cx4C), followed by an additional 260 amino acids containing tandem LCCL domains (Section IV.D and Fig. 1A). Analysis of expression profiles through Unigene shows a wide expression profile within the human and mouse with highest levels within the ear. In this regard, the presence of the LCCL domain may be significant because a mutation in the LCCL domain has been associated with the deafness autosomal dominant nonsyndromic sensorineural disorder 9 (DFNA9) (17) (Section IV.D).
CRISPLD1 is also annotated as Cocacrisp, CRISP10, and LCRISP1 on public databases. Despite the prevalence of CRISP nomenclature within this CAP subfamily, members of the subfamily are clearly distinct from the CRISPs because they do not contain the 10 absolutely conserved cysteines or the ICR domain of the CRISPs. As such, we suggest a new nomenclature for this group: CAP and LCCL domain containing protein 1 (CAPLD1).
The CAPLD1 subfamily is conserved within the Euteleostomi (bony vertebrates) where 30 orthologs have been predicted, with 23 of these being in mammals including the human, chimpanzee, mouse, rat, dog, cow, and horse. Human CAPLD1 is localized to chromosome 8q21.11. Orthologous genes in other mammalian species are normally localized to syntenic regions. In each of those species listed above, CAPLD1 is always within the same locus as PI15. Mouse Crisp4 is also found within the same gene cluster.
G. CRISPLD2/CAPLD2 subfamily
The first member of this subfamily was identified in the rat as a glucocorticoid-inducible gene with a potential role in branching morphogenesis (124). Within this context, it was named late gestation lung protein 1 (Lgl1). Subsequently, it was named CRISP and LCCL domain containing protein 2 (CRISPLD2) to reflect its homology with the CRISPLD1/CAPLD1 subfamily (Fig. 1A). However, as outlined above, this subfamily of proteins is distinct from the CRISPs, and as such the CRISPLD2 nomenclature is inappropriate (Fig. 1B). As for the CRISPLD1 subfamily, we propose that the CRISPLD2 nomenclature be changed to CAP and LCCL domain containing protein 2 (CAPLD2) to more accurately reflect the phylogenetic structure of the mammalian CAPs.
Our genomic analysis of the orthologous CAPLD2 genes illustrated their conservation throughout the bony vertebrates, where 28 orthologs were identified. At least 22 mammalian species contain CAPLD2 genes, including the human, chimpanzee, mouse, rat, cow, dog, and horse. The human gene is localized to chromosome 16q24.1, and the mouse gene localized to the syntenic region at chromosome 8E1. EST expression profiles indicate that within human and mouse Capld2 is widely expressed.
In terms of protein structure, CAPLD2 proteins are typically approximately 500 amino acids in length, with the exception of the horse and dog, which, assuming that the theoretical translation is correct, may have extended N termini. Mammalian CAPLD2s show a minimum of 74% identity, and each contains a CAP domain including each of the conserved CAP motifs, followed by the PY dipeptide and the Hinge-like sequence (Cx2Cx7Cx4C) (Fig. 1A and supplementary Fig. 5). Similar to CAPLD1, CAPLD2s have a large C-terminal extension following the Hinge-like sequence. The extensions are about 260 amino acids in length and include tandem LCCL domains (Section IV.D). Our phylogenetic analysis shows that the CAPLD2 proteins form a distinct subfamily within the CAP superfamily, which is most closely related to the CAPLD1 and PI15 subfamilies (Fig. 1B and supplementary Fig. 1).
Rat Capld2 was identified during a screen for glucocorticoid-induced transcripts in fetal d 20 lung fibroblasts (124). It was also shown to be expressed within the fetal heart, kidney, intestine, and within the adult lung, heart, and spleen. Within the fetal lung, Capld2 was expressed in the mesenchyme from d 12 at the initiation of branching morphogenesis. Increased expression was observed through to d 21, where it became associated with smooth muscle myofibroblasts adjacent to the growing epithelium, which regulates alveolar development (125). Treatment of developing E12–13 rat lungs in explant culture with antisense oligonucleotides to Capld2 reduced protein levels by 49% and resulted in an apparent stoichiometric decrease of terminal bud development to 47% of control levels, suggesting a crucial role for CAPLD2 in branching morphogenesis (125). Additionally, maximal Capld2 expression in rat lung mesenchyme at late gestation suggested a role in alveolar development. Alveolar septation reaches a maximum 7 d after birth (126) in the rat when CAPLD2 is concentrated at the tips of budding secondary alveolar septa (127).
Bronchopulmonary dysplasia is a chronic lung disorder characterized by inflammation and scarring in the lungs that can develop in children born prematurely. A mouse model of bronchopulmonary dysplasia, induced by increased oxygen levels, showed reduced levels of Capld2 expression following increased oxygen, resulting in arrested alveolar growth. Upon return to normal oxygen levels, alveolar regeneration was observed, and Capld2 levels returned to normal, suggesting that Capld2 expression was related to septal growth and alveolar partitioning (127). Similarly, within the E12 mouse kidney corresponding to the initiation of branching, retinoic acid-induced expression of Capld2 was observed with a localization to the mesenchyme surrounding the developing kidney (128). At E13.5, CAPLD2 was observed within the mesenchymal cells in the nephrogenic zone and in stromal cells surrounding the ureteric bud trunk. At E18.5 and in the adult, CAPLD2 was evident in maturing proximal tubules but not in glomeruli. Heterozygous Capld2 knockout mice show reduced branching morphogenesis in the kidney, suggesting, similar to data on the lung, that secreted CAPLD2 enhances morphogenesis (128).
Retinoic acid has an indirect, but essential, role in branching morphogenesis. It is believed to stimulate mesenchymal cells to produce a paracrine branching morphogen (129, 130). The stimulation of expression of Capld2 by retinoic acid in the kidney and lung at early stages of branching may represent a conserved mechanism required for the initiation of branching in branched organs (128). Consistent with this is the localization of PI15 to areas of budding during development in the chicken (117). Branching morphogenesis is, at least in part, regulated via the controlled modification of extracellular matrix through localized concentrations of activating and inhibitory enzymes (118, 131). Supporting this role in extracellular matrix modification is the demonstration that CAP proteins can function as proteases (e.g., Tex31, Section IV.A) (132) or as protease inhibitors (e.g., PI15, Section III.D) (115).
Mouse Capld2 is also expressed within the mandible, palate, and nasopharynx regions during craniofacial development at E13.5–17.5 (133). Within the Caucasian and Hispanic populations, the region of the genome containing CAPLD2 has been associated with nonsyndromic cleft lip with or without cleft palate (133). This association was not observed in the Colombian population, and as such the role of CAPLD2 in facial development is inconclusive. However, it is of note that this apparent inconsistency between populations may reflect the complex genetic etiology of the condition.
H. Mannose receptor-like/CAP domain and CTL domain containing (CAPCL) subfamily
The MRL gene subfamily is virtually uncharacterized. There have been no primary publications on its potential function or indeed its distribution within mammals. It is, however, a clearly annotated group of orthologs within public databases. During our analysis of the CAP superfamily, the MRL proteins were recognized as a distinct CAP subfamily (Fig. 1 and supplementary Fig. 1). This group of orthologous genes was, presumably, named as such because of the presence of the CTL domain. CTL domains are abundant in the mannose receptor (134). The mannose receptor has been implicated in immune regulation through recognition and clearance of microorganisms and serum glycoproteins through interaction with the CTL domain (135, 136). Apart from the presence of a CTL domain, however, the MRL group of proteins within the CAP superfamily has no similarity to the mannose receptor; for example, the MRL proteins do not contain a transmembrane domain. As such, we suggest the adoption of the following more informative nomenclature: CAP and CTL domain containing (CAPCL).
The theoretical translations of CAPCL sequences available through public databases display some variability. This is particularly true for the N-terminal region, where start codons may be incorrectly applied. Across the major homologous region of the CAPCL proteins, a minimum of 78% identity was observed (G. M. Gibbs, unpublished data). The CAPCL proteins have each of the four CAP consensus sequence motifs, the PY dipeptide, the Hinge-like sequence with a cysteine spacing of Cx2Cx6Cx10C, and an approximately 215-amino acid C-terminal extension that contains the CTL domain (Fig. 1A and supplementary Fig. 6). Following phylogenetic analysis of nine sequences from six mammalian species, the CAPCL formed a discrete highly conserved clade within the mammalian CAP superfamily (Fig. 1B and supplementary Fig. 1). The CAPCL proteins, like most CAPs, contain a predicted signal sequence and are likely to be secreted into the extracellular matrix.
There are three predicted human CAPCL paralogous genes that are localized to 16q22.1-16q22.3 and a single predicted mouse gene to chromosome 8E1. Our genomic analysis indicated 27 orthologous genes in the bony vertebrates, with 24 in mammals including human, monkey, mouse, cow, horse, and dog. ESTs show a restricted expression profile to the kidney and testis in the human and the mouse.
I. Peptidase inhibitor R3H domain containing-like subfamily
As with the CAPCL family of proteins, there are no primary publications for the peptidase inhibitor R3H domain containing-like (R3HDML) proteins, yet they are evident as a large group of annotated proteins in public databases. The R3H motif has a consensus sequence RxxxH and has a proposed single stranded (ss) DNA binding function (137). The R3H motif is present in many mammalian CAP proteins, as well as Na-Asp-2 from the nematode parasite (138), which has a known tertiary structure. It is not known whether CAPs containing this motif are ssDNA binding proteins; however, given their typically extracellular localization, this may be unlikely.
The predicted human R3HDML gene is located on chromosome 20q13.12, and its mouse ortholog is located on chromosome 2H3. Our genomic analysis predicts 26 orthologous genes within the bony vertebrates, with 23 orthologs in mammals, including the human, chimpanzee, mouse, dog, and cow. Limited EST data show a restricted expression profile to eye, intestine, and lung in the mouse.
Protein sequence alignments (supplementary Fig. 7) show a minimum of 78% identity between the mouse and human. This subfamily of the CAPs shares each of the CAP motifs, the PY dipeptide, the Hinge-like sequence (Cx2Cx7Cx4C), and a short C-terminal extension of 12 amino acids. Our phylogenetic analyses provide strong support for a close relationship of R3HDML proteins with PI15 proteins and CAPLD proteins (Fig. 1B).
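The Hinge-like cysteine spacings quoted for each subfamily in Section III use a compact notation in which, for example, Cx2Cx5Cx4C denotes four cysteines separated by two, five, and four intervening residues. As a minimal sketch, and assuming that each "x" stands for any residue, the notation can be translated into a regular expression and used to scan a protein sequence. The helper names and the toy sequence below are hypothetical; the toy sequence is synthetic and is not a real CAP protein.

```python
import re

# Hinge-like cysteine spacings as quoted in the text for each subfamily.
HINGE_SPACINGS = {
    "GLIPR1": "Cx2Cx5Cx4C",
    "PI16": "Cx2Cx5Cx4C",
    "PI15": "Cx2Cx7Cx4C",
    "CRISPLD1/CAPLD1": "Cx2Cx7Cx4C",
    "CRISPLD2/CAPLD2": "Cx2Cx7Cx4C",
    "R3HDML": "Cx2Cx7Cx4C",
    "MRL/CAPCL": "Cx2Cx6Cx10C",
}


def spacing_to_regex(spacing):
    """Translate a spacing such as 'Cx2Cx5Cx4C' into 'C.{2}C.{5}C.{4}C'.

    Assumption: 'xN' means exactly N residues of any type.
    """
    return re.sub(r"x(\d+)", lambda m: ".{%s}" % m.group(1), spacing)


def find_hinge(sequence, spacing):
    """Return the 0-based start of the first match, or -1 if none is found."""
    match = re.search(spacing_to_regex(spacing), sequence)
    return match.start() if match else -1


# Synthetic toy sequence built to carry a Cx2Cx5Cx4C spacing (not a real protein).
toy = "MAPY" + "C" + "AA" + "C" + "GGGGG" + "C" + "TTTT" + "C" + "KK"

print(spacing_to_regex(HINGE_SPACINGS["GLIPR1"]))  # C.{2}C.{5}C.{4}C
print(find_hinge(toy, HINGE_SPACINGS["GLIPR1"]))   # 4
print(find_hinge(toy, HINGE_SPACINGS["PI15"]))     # -1
```

A pattern match of this kind only indicates that the cysteine spacing is present; it does not by itself establish that a sequence contains a functional Hinge region or belongs to a given subfamily.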
| IV. Structural Characteristics of CAP Superfamily Proteins |
|---|
A. The CAP domain
The tertiary structure of CAP superfamily proteins shows a remarkable conservation, despite often low overall identity and significant evolutionary distance between organisms. Core structural elements are retained (the CAP motifs), resulting in a conserved structure with a conserved putative active site. Variation outside of these sequences may account for the different activities (binding partner and/or enzymatic function) that have been reported in different environments.
1. An enzymatic function?
The only enzymatic function described to date for a CAP superfamily protein is as a substrate-specific protease Tex31 from the cone snail Conus textile (132). Tex31 showed a specific proteolytic activity for the cleavage of the propeptide of the conotoxin TxVI1, which was enhanced 5-fold by calcium and inhibited by serine protease inhibitors, suggesting a metalloprotease- or serine protease-like activity. As such, Tex31 may be a novel class of substrate-specific protease (132).
A CAP protein with 67% identity to Tex31 was recently identified from Conus marmoreus. This protein, originally identified as GlaCrisp (139) and later as Mr30 (140), has been investigated for proteolytic function. Hansson et al. (139) showed no proteolytic activity when tested against the substrate for Tex31. Qian et al. (140) showed a very low serine protease-like activity when tested against a substrate not related to the Tex31 substrate. These latter authors could not discount the possibility of trace levels of contaminant proteolytic function and concluded that Mr30 did not have protease activity. Stecrisp from Trimeresurus stejnegeri snake venom was tested in protease assay against the Tex31 substrate with no detectable proteolytic activity (14). It remains unclear whether Tex31 and related proteins have a very high degree of protease substrate specificity that precludes observation of proteolytic function in related proteins, or whether variation in the putative active site changes the function of the protein.
2. Structural conservation.
The nuclear magnetic resonance (NMR) structure of P14a from the tomato (141) and the crystal structures of several CAPs have been determined (Fig. 3A). These include vespid Ves v5 (11), human GAPR-1 (142), nematode Na-ASP-2 (138), and numerous CRISPs from snake venom, e.g., stecrisp, triflin, and natrin (14, 41, 42, 143). In addition, structural coordinates for the venom CRISPs pseudechetoxin (PsTx) (144) and pseudecin (145) are in the Brookhaven Protein Data Bank (PDB). These structural determinations showed that CAP domains possess a similar overall structure consisting of a unique α-β-α fold that is stabilized by a buried hydrogen bonding network and, with the exception of GAPR-1, several disulfide bonds (Fig. 3A). In all except GAPR-1, the conserved cysteines within β-strand 3 and β-strand 4, forming the central β-sheet (Fig. 3), form the only conserved disulfide bond present within the superfamily. Other disulfides are typically conserved within a given subfamily. These features provide the thermal, pH, and proteolytic stability reported for CAP proteins, which is consistent with the structural requirements of an extracellular functioning protein (141). The absence of disulfide bonding in GAPR-1 proteins is consistent with their intracellular localization (Section III.C). The GAPR-1 proteins do contain two conserved cysteines, but structural assessment showed they are not close enough to form a disulfide bond (142).
The histidines, with an ability to complex divalent cations, form a structure with some similarity to protease active sites (11, 41), and this is consistent with the calcium-activated serine protease-like activity reported for Tex31 (132). In contrast, metalloproteases require a catalytic triad of histidines to complex metal ions with a catalytic glutamate (reviewed in Ref. 146), and serine proteases require a conserved serine within a catalytic triad (reviewed in Refs. 147 and 148). From a protease perspective, the putative active site therefore appears incomplete (14, 41, 138). To circumvent this apparent limitation, Serrano et al. (142) proposed that the dimerization of GAPR-1 completed the formation of the active site.
From a structural perspective, CAP domains have some of the attributes of a protease, yet they lack the complete functionally active site seen in more "classical" proteases. It is unknown whether this "incomplete" putative active site is the structural basis of the protease inhibitory activity reported for human PI15 (114) or whether this site in the CAP domain represents a new class of protease active sites as proposed for Tex31 (132). Protease activity has not been definitively demonstrated for a mammalian CAP, nor has it been exhaustively tested.
Each of the four CAP motifs has been mapped to the ribbon structure of triflin to show their tertiary structure context (Fig. 3). The CAP1 sequence forms the core of the domain and includes two solvent exposed amino acids, notably a conserved histidine oriented to the center of the soluble cleft and a lysine in the turn at the apex of the cleft. The CAP2 sequence forms one of the three major β strands through the core of the protein, followed by a turn that is initiated by a conserved proline. The CAP2 motif contains two conserved solvent exposed amino acids, including an asparagine directed toward the conserved histidines and the proline at the apex of the cleft. Indeed, a synthetic peptide corresponding to the CAP2 motif has been shown to interfere with fertilization in a competitive inhibition assay, which is suggestive of a functionally important interaction at these sites (149) (Section V.C). The CAP3 motif is part of an α-helix on the outer face of the CAP domain; however, the conserved amino acids in this motif often point to the interior of the structure and are stabilized by a network of hydrogen bonds. These interactions in turn stabilize the surface exposed helix. The CAP4 sequence contains invariant amino acids that provide direct stabilizing interactions to a solvent exposed histidine forming part of the proposed active site.
As a point of reference, P14a, from the tomato, and human GAPR-1 represent what may be considered the most basic functional CAP unit. They do not have additional loops, insertions, or C-terminal extensions (Fig. 3A).
B. The cysteine-rich domain or CRISP domain
In addition to the CAP domain, the CRISP proteins are characterized by their approximately 60-amino acid cysteine-rich C-terminal extension. This extension contains 10 conserved cysteines that form five disulfide bonds. Numerous reptile venom CRISP crystal structures (14, 41, 42) and the NMR structure of the mouse CRISP2 CRISP domain (15) show that CRISP domains exist as two distinct structural regions: (1) the Hinge region, which contains four cysteines forming two disulfide bonds, and (2) the ICR region, which contains six cysteines forming three disulfide bonds. The NMR solution structure of the mouse CRISP2 CRISP domain showed a very high degree of flexibility between the Hinge and the ICR (15). Combined with crystal structure data obtained from snake CRISPs showing that the Hinge is tightly associated with the CAP domain, with no interactions between the ICR and the CAP domain (14, 42), this suggests that the ICR has a relatively high degree of rotational freedom in comparison to the CAP domain and Hinge region. Although unpublished, the NMR structure of the human CRISP2 ICR has been determined, with the structure deposited into the Brookhaven PDB (PDB code: 2cq7). This shows the same structure in the ICR as other CRISPs.
In describing the structural domains of the CRISPs, the Hinge and ICR have typically been grouped as one: as the cysteine-rich domain or the CRISP domain. However, in view of the broader mammalian CAP family (Fig. 1), it is apparent that the ICR is the region that is uniquely present in the CRISPs whereas the Hinge is present broadly across the mammalian CAPs, with the strong likelihood that the Hinge is disulfide-bonded in the same manner across the superfamily. As such, the CRISP subfamily of the CAP superfamily may be defined, in addition to their CAP signature sequences, by the unique ICR containing six cysteines and three disulfide bonds. This interpretation also suggests that the Hinge region may have functional significance in its own right. This interpretation is supported by the identification of CRISP2 binding partners that rely on Hinge sequences for their interaction, specifically MAPK kinase kinase II (MAP3KII) and gametogenetin 1 (GGN1) (56, 150).
Uniquely among the CAP superfamily, the CRISPs have ion channel regulatory activity. This was first characterized in helothermine, a CRISP from the venom of the poisonous lizard Heloderma horridum horridum (Mexican beaded lizard) (151). Helothermine inhibited IA type and delayed rectifier voltage gated K+ channels, the ryanodine receptor (RyR) and voltage gated Ca2+ channels (152, 153, 154). CRISPs from numerous poisonous reptiles have subsequently been investigated for their ion channel regulatory activity (reviewed in Refs. 6 and 7).
Numerous detailed characterizations of the ion channel regulatory activity from venom CRISPs have shown that they have an activity that appears either broad acting, as in the case of helothermine, or very specific, as evidenced by a comparison of PsTx and pseudecin (reviewed in Refs. 6 and 7). PsTx purified from the venom of the Australian King Brown snake was the first characterized peptide regulator of cyclic nucleotide-gated type A channels (144). Pseudecin, which differs from PsTx by three amino acids in the CRISP domain, has a 30-fold lower activity against cyclic nucleotide-gated type A 2, indicating that certain amino acids determine specific activity and regulate function (145). This has yet to be systematically investigated through site-directed mutagenesis. The inhibitory activities of CRISPs are typically voltage-independent and may directly block ion current by binding to the ion channel pore (153, 155).
The characterization of venom CRISPs has used native proteins containing both the CAP and CRISP domains. As such, it was unclear which part of the protein was involved in ion channel regulatory function. This was recently resolved through the determination of CRISP crystal structures and through the demonstration that recombinant CRISP domain of murine CRISP2 was sufficient for an ion channel regulatory function (14, 15).
Although there is no primary sequence homology, the ICR of CRISPs has a conserved disulfide architecture and an overall molecular shape with a high degree of similarity to the sea anemone toxins ShK and BgK, which regulate potassium channels (156, 157, 158). This structural similarity suggests that the ICR alone will be sufficient for ion channel regulatory function.
To date, the ability of mammalian CRISPs to regulate ion channels has only been examined for CRISP2 (15). Specifically, when applied to the cytoplasmic side of RyR prepared from skeletal muscle (RyR1) and heart (RyR2) sarcoplasmic reticulum, the CRISP domain of CRISP2 resulted in the activation of RyR1 and the inhibition of RyR2. When applied to the luminal domain of RyRs, CRISP2 inhibited RyR1 and RyR2 in a voltage-dependent manner. As outlined in Section V.B, several biological properties of other mammalian CRISPs are entirely consistent with roles in regulation of ion channel activity, e.g., CRISP1 as a sperm decapacitation factor and CRISP3 in the pathogenesis of Sjögren's syndrome (Section VIII.A).
Despite conclusive data on the function of CRISP domains, the function of the CAP domain in CRISP proteins largely remains unknown. Alluding to a function, however, was the recent identification of a CRISP protein from the parasitic Japanese river lamprey, Lethenteron japonicum (159), and data suggesting its immune suppressive function. The lamprey is one of the most primitive vertebrates and one of the few vertebrate parasites (160). The parasitic lamprey attaches to the host and feeds on blood for extended periods of time. To facilitate this, it secretes factors to assist in feeding, and in this regard, it is similar to the invertebrate parasites such as the human hookworm (Section VII.A). The lamprey secretes two major proteins from its buccal gland. One of these, a CRISP protein, was shown to have an effect on smooth muscle contraction possibly through high voltage-activated (L)-type Ca2+ channels (159), similar to several snake CRISPs. This activity was proposed to assist in feeding through vasodilatation and is likely to be attributed to the CRISP domain. Given the conservation of the CAP domain in parasitic blood meal feeders (Section VII.A), the CAP domain in this context may be involved in immune suppression or in inhibition of platelet aggregation by analogy to the nematode CAPs (Section VII.A).
C. C-type lectin (CTL) domains
CTL domains are C-terminal extensions found in association with the CAP domain in the CAPCL subfamily of the mammalian CAPs (Fig. 1; Section III.H). CTL domains are 100 to 130 amino acids in length and contain four conserved cysteines, which form two disulfide bonds, and a conserved tryptophan (PROSITE PDOC00537). They function in calcium-dependent carbohydrate recognition (reviewed in Refs. 16 and 161).
In addition to their association with the CAP domain, CTL domains are seen as a C-terminal anchor in the macrophage mannose receptor (134); as a C-terminal domain in the collectins, which are implicated as modulators of the innate immune system (reviewed in Ref. 162); as an extracellular domain in the selectins, which are membrane-anchored cellular adhesion molecules with roles in inflammation and cancer (e.g., reviews in Refs. 163 and 164); and in association with the LCCL domain. The function of the CTL domain in the CAP proteins is unknown.
D. The limulus factor C, Coch-5b2, and Lgl1 (LCCL) domains
The LCCL domain (also called the factor C homology domain) was named after recognition of homology between the Limulus (horseshoe crab) factor C, COCH, and Lgl1 (Section III.G) (17). LCCL domains are normally found in association with other protein domains, including, but not limited to, CTL domains in factor C (165), von Willebrand type-A domains in cochlin and akhirin (166, 167), or CAP domains in CAPLD2 (17, 124). LCCL domains are about 100 amino acids in length, and their C-terminal part contains a highly conserved histidine in a conserved motif YxxxSxxCxAAVHxGVI (PROSITE PS50820).
Little is known about the function of the LCCL domain, although it is present in multidomain proteins whose roles span diverse functions. LCCL domains are found in potential complement factors in the case of limulus factor C and the Plasmodium Scavenger Receptor Cysteine-Rich, LCCL, Adhesive-like Protein (PSLAP) (165, 168), in the human deafness associated gene (169, 170, 171, 172, 173), in akhirin (and cochlin), which have been implicated in eye development (167, 174, 175), and also within CAPLD2. It has been proposed that proteins containing LCCL domains may be involved in cellular adhesion (167, 168), although it is not known whether this is associated with the LCCL domain specifically or with other sequences in these multidomain proteins.
| V. CAP Proteins in Reproduction |
|---|
A. Spermatogenesis
Spermatogenesis occurs in the testis and is the process whereby a germ cell develops into a sperm. Based on published data, CRISP2 is the only mammalian CAP protein characterized as being produced during spermatogenesis, but an analysis of EST databases and unpublished data suggests that several other superfamily members are also produced in the testis.
In all mammalian species studied to date, CRISP2 is produced initially in either late pachytene spermatocytes or round spermatids, and CRISP2 becomes localized to the acrosome covering the sperm head, the accessory structures of the sperm tail, and on the developing germ cell membrane (28, 29, 31, 32, 51). CRISP2 appears to be a component of the soluble and diffusible fraction of the electron dense acrosomal matrix (176).
A role for CRISP2 in cell-cell adhesion has been proposed. Experiments whereby rat spermatogenic cell-derived cDNAs were transfected into Jurkat Tag cells identified CRISP2 (at that time called TPX1) as a protein putatively involved in mediating adhesion between developing germ cells and Sertoli cells (31). This result was further supported through the ability of CRISP2 antiserum to interfere with the adhesion of purified germ cells to cultured Sertoli cells. Cell binding ability was localized to the N-terminal-most 101 amino acids, and was completely independent of the CRISP domain (53). This aspect of CRISP2 function has not been explored further.
Within the sperm tail, the function of CRISP2 is currently unknown; however, it may be related to motility and its dependence on ion channel regulation. Immunoelectron microscopy and Western blotting data generated from the rat indicate that CRISP2 is a component of the connecting piece, the outer dense fibers, and the longitudinal columns of the fibrous sheath (51).
Within the male reproductive tract, although sperm may have the capacity for motility, they are kept in a quiescent state by decapacitation factors. Upon entering the female reproductive tract, or culture media, sperm cells begin to move with a modal type of motility. At this time, sperm cells also begin to undergo an additional set of maturation events collectively called "capacitation," during which extensive protein rearrangements and posttranslational modifications occur (reviewed in Ref. 177), and they gain the ability to undergo the acrosome reaction, bind an oocyte, and exhibit hyperactivated motility. All of these processes are critically regulated by Ca2+ flux (178, 179, 180, 181, 182). In relation to Ca2+ regulation, we have recently shown that CRISP2 can regulate Ca2+ flux through RyRs (15). Although the RyR subtypes have not been defined, RyRs have been localized to the connecting piece of human sperm and have been implicated in the generation of a stable oscillating pattern of intracellular Ca2+ correlating with sperm tail movement (183). Although the RyR data are suggestive of a role for CRISP2 in motility regulation, it is not conclusive. Given the range of ion channels regulated by venom CRISPs (Section IV.B) and the high number of ion channel types found on sperm (reviewed in Ref. 184), it is plausible that CRISP2 may also regulate other channel types.
Recent experiments from our laboratory have shown that CRISP2 exists in a complex within sperm. Specifically, CRISP2 binds to MAP3KII (also called MLK3), with which it colocalizes in the acrosome (150), and to GGN1 in the sperm tail (56). Within somatic cells, MAP3KII is an upstream activator of Jun kinase (185, 186); however, in the absence of sperm transcription and the localization to the acrosome, it is more likely that MAP3KII is involved in posttranslational phosphorylation of target proteins (187). MAP3KII may therefore regulate CRISP2 function during capacitation. GGN1, on the other hand, is a protein of unknown function that appears to bind to CRISP2 in the sperm tail (56).
In addition to CRISP2, an analysis of Unigene (188) EST expression profiles suggests that all members of the GLIPR subfamily are expressed in the testis (i.e., GLIPR1, GLIPR1L1, GLIPR1L2 and GLIPR1L3), as are PI16 (human), CAPLD1 (human), CAPLD2 (human and mouse), and the CAPCL proteins (human and mouse). The cellular localization and function of these proteins in testis function and fertility remain unknown.
Recently, the involvement of CAP superfamily proteins within nonvertebrate sperm was highlighted (189). The ascidian (sea squirt) Halocynthia roretzi produces two CAP superfamily proteins called HrUrabin and HrUrabin-long (L) within their testes and on sperm. These proteins have highest homology to GLIPR1L1 and GLIPR1L2, respectively, and are GPI-anchored to the sperm membrane and localized to lipid rafts (189). It is of note that MAK248 (another GLIPR1L1 ortholog) is also GPI anchored to sperm within the epididymal lumen of the macaque (190). Ascidians are hermaphroditic and have developed mechanisms to select against self-fertilization, thus ensuring the maintenance of genetic diversity. Although HrUrabin appears to be an adhesion protein in a manner conceptually similar to that proposed for CRISP1 (Section V.C), it has a role in the negative selection against "self" oocytes (189). HrUrabin is the first CAP superfamily protein reported within nonvertebrate sperm.
B. Epididymal maturation and sperm capacitation
Because epididymal maturation is essential for fertilization (reviewed in Refs. 191 and 192), proteins of epididymal origin may also be essential for sperm function (61, 193, 194, 195). This has been a central tenet of approaches to sperm-specific contraceptive development over numerous years (196, 197). Proposed functions for CRISPs in this context can be broadly categorized into decapacitation factors and sperm-oocyte receptor complex formation.
During epididymal maturation, sperm cells come into contact with at least CRISP1, CRISP3, and CRISP4. CRISPs within the epididymal fluid surround sperm in high concentrations and interact with the sperm membrane with varying affinities. The mechanisms of attachment and their relative strength dictate whether CRISPs dissociate from sperm early in the female reproductive tract or stay associated with sperm until the final stages of fertilization, and thus fundamentally affect their ultimate biological function.
Capacitation is an incompletely defined continuum of molecular events that occurs in sperm in the female reproductive tract and is required to achieve fertilization (198). The molecular mechanisms involved during capacitation are complex, and many excellent reviews exist; however, a few pertinent points will be raised here. Capacitation is initiated and regulated by pathways including cholesterol removal and plasma membrane remodeling with the redistribution/dissolution of lipid rafts (199, 200, 201, 202), intracellular alkalinization from HCO3–, and an increase in intracellular cAMP (203, 204) (reviewed in Ref. 205), resulting in the activation of numerous signal transduction pathways (reviewed in Ref. 206). Correlates of capacitation include a massive up-regulation of tyrosine phosphorylation associated with the sperm tail and head (207), the manifestation of hyperactivated motility (208, 209), the ability to bind the zona pellucida of the oocyte, and the subsequent induction of the acrosome reaction (210, 211).
Decapacitation factors are proteins that inhibit sperm capacitation, and their removal from sperm is required to achieve capacitation (212, 213, 214). Decapacitation factors, of which there appear to be many, are often of epididymal or accessory organ origin (67, 215, 216, 217, 218, 219) and are removed from sperm after entry into the reproductive tract or by incubation in vitro in capacitating conditions.
The larger glycosylated form of CRISP1 has been shown to function as a sperm decapacitation factor in rats and to reversibly inhibit capacitation-associated protein tyrosine phosphorylation and the acrosome reaction in a dose-dependent manner (67). CRISP1 was also identified in a fraction of low molecular mass diffusible proteins from mouse sperm, which upon readdition to sperm caused the inhibition of the progesterone-induced acrosome reaction and the inhibition of tyrosine phosphorylation in membrane proteins, without grossly affecting tyrosine phosphorylation within the head or tail (215). The differences observed in the effect of CRISP1 on rat and mouse sperm protein phosphorylation may be due to species variation or the heterogeneous protein pool used in the latter investigation. Roberts et al. (67) proposed that CRISP1 interacts with a specific lipid or protein on the sperm surface to inhibit capacitation. This is entirely consistent with the inhibition of ionic signaling by CRISP proteins as demonstrated for CRISP2 (15), the high concentrations of CRISP1 and CRISP4 in the epididymal lumen, and the reversible association of the majority of the CRISPs with the plasma membrane. A similar role for CRISP4 as a decapacitation factor is possible.
Given first, the demonstrated ion channel regulatory activity (primarily inhibitory) of CRISP domains; second, the high concentrations of CRISPs in the cauda epididymal lumen; and third, the decapacitation factor activity of CRISP1, it is most likely that at least one function of the epididymal CRISPs is to store sperm in a quiescent state through inhibition of ion signal transduction. After entry into the female reproductive tract and the dissociation of much of the CRISP content from the surface of sperm, or through their potential inactivation by binding to other proteins, as in the case of CRISP3 binding to β-MSP (79), specific classes of ion channels may become activated, thus allowing ion signal transduction and sperm capacitation.
Surprisingly, given the above data, an analysis of male Crisp1 knockout mice showed normal fertility and litters of normal size. Even more surprisingly, capacitation, as indicated by increased global tyrosine phosphorylation, was reduced (60). Membrane protein phosphorylation was not specifically assessed. From the data presented, it was unclear whether sperm capacitated more slowly or whether the full suite of tyrosine phosphorylation is not required for fertility. These data do, however, suggest that the removal of CRISP1 results in deregulated signal transduction. To dissect the true role of CRISPs in capacitation on an equivalent footing to that occurring in human sperm, it may be necessary to produce Crisp1-Crisp4 double knockout mice.
Calcium plays a central role in sperm capacitation and is essential both for achieving hyperactivated motility (220, 221) and for the initiation of the acrosome reaction (e.g., reviews in Refs. 222 and 223). An increase in intracellular Ca2+ within the sperm tail is required for the initiation of hyperactivated motility, and subsequent cyclic changes in intracellular Ca2+ correlate with the beating of the sperm tail. The Catsper ion channels are specific to the principal piece of the sperm tail, and the Catsper1–4 null mice are all sterile through an inability to undergo hyperactivated motility (224, 225, 226, 227). Infertility is attributed to a lack of Ca2+ ion entry into the sperm tail from extracellular sources (180), but it has recently been shown that stimulation of release of Ca2+ from intracellular stores at the base of the sperm head in the Catsper1 and Catsper2 null mouse is sufficient for sperm to hyperactivate (181). The most likely candidate for the intracellular Ca2+ store at the base of the sperm head is the redundant nuclear envelope, and the central role for this compartment in regulating hyperactivated motility is becoming apparent (181, 228). The signal transduction pathways regulating the release of intracellular Ca2+ from the redundant nuclear envelope are unknown, and the identity of the ion channels in this compartment remains unclear; however, both the IP3R and the RyR have been identified in different species (183, 229). Thus, as outlined above, given the localization of CRISP2 to the connecting piece and its ability to regulate RyR activity, it is plausible that CRISP2 is involved in this process. Given that CRISP1 (and likely CRISP4 and CRISP3) diffuse off sperm during the early stages of capacitation, it is also possible that their removal alleviates repression of specific ion channels, resulting in the initiation of signal transduction cascades that contribute to hyperactivation.
The binding of CRISPs to β-MSP, or similar proteins, may be functionally important and relevant to all CRISPs in the reproductive tract. This is supported by the interaction of PI16 with PSP94 (β-MSP) in human serum (80) and the binding of triflin to SSP-2 (β-MSP) in snake serum (230, 231). Conservation of binding partners identified from diverse sources supports this interaction as being functionally significant. In each case, the high affinity interaction was proposed to have an inactivating/self-protective function. β-MSP is one of the most abundant proteins secreted from the prostate (232). β-MSP binds to the sperm midpiece and head and regulates motility through inhibition of Na+, K+-ATPases; it comprises a large proportion of the protein released from sperm after ejaculation; and it inhibits the spontaneous acrosome reaction (233, 234, 235, 236). Increased expression of β-MSP may be associated with reduced fertility in men (236). As yet, the significance of the high-affinity interaction and the importance of these proteins in regulation of sperm capacitation remain unclear.
To date, only one non-CRISP CAP superfamily protein has been identified in the mammalian epididymis, MAK248 from the macaque (190). MAK248 is most closely related to GLIPR1L1 in the mouse (supplementary Fig. 1). However, differences in the expression of these two proteins (exclusively epididymal vs. exclusively testicular) suggest that they may not be strict orthologs. MAK248 is a GPI-anchored protein that appears to undergo epididymal-associated proteolytic cleavage into two disulfide-linked peptides and is localized to the equatorial segment and the posterior head of capacitated macaque sperm (190). The function of MAK248 remains unknown, yet the presence of a CAP domain containing protein distinct to the CRISPs suggests that the CAP domain specifically has a conserved function within the reproductive tract.
C. CAPs role in sperm and egg fusion
As indicated in part A of this section, transfection data have suggested that a sequence in the N-terminal domain of CRISP2 may be involved in germ cell binding to Sertoli cells (31). Although this specific function has not been explored further, these data do fit with recent data suggesting that peptides corresponding to the CAP2 signature motif of either rat CRISP2 or CRISP1 can interfere with in vitro fertilization (IVF) in the rat, specifically sperm-oocyte fusion (149). Although the essential nature of CRISP1 in oocyte fusion is refuted by a recent knockout of the mouse Crisp1 gene (60), these data suggest that there are sequences within the CAP proteins that are fusogenic. Conflicting data exist concerning the ability of CRISP2 antibodies to interfere with IVF; specifically, data from the Cuasnicu laboratory showed that the addition of anti-CRISP2 serum to human sperm in the zona-free hamster oocyte assay inhibited sperm binding (55). However, antibodies to CRISP2 in a guinea pig assay did not inhibit IVF (28), nor did CRISP2 antibodies inhibit the binding of human sperm to excess human oocytes (M. K. O'Bryan, unpublished data). The concrete interpretation of these data is currently further limited by the high degree of sequence homology between CAP members in the CAP signature motifs, uncertainties as to the presentation of CRISP2 on the surface of either developing germ cells or sperm, and the finding that, based on available crystal structures, seven of the 12 amino acids in the fusogenic inhibitory peptide sequence are buried within the internal structure of the CAP domain (Section IV.A). An analysis of the tertiary structure context of the CAP2 motif from the venom CRISP triflin (Fig. 3) suggests that two of the five amino acids not buried within the core of the structure have side chains that are likely to be solvent exposed. These are the conserved proline and asparagine, which are localized to the top of the solvent exposed cleft that contains the putative active site (Fig. 3). Whether this pentamer containing two solvent exposed amino acids is sufficient for binding has not been determined. Regardless, because we currently know very little about the posttranslational processing of sperm surface proteins, we believe that it remains plausible that a protein bearing a CAP2 signature motif, and thus a CAP, is involved in binding male germ cells to either Sertoli cells or oocytes.
Although Crisp1 knockout males have normal fertility, several lines of evidence suggest a subtle role in the fertilization process. Certainly after capacitation in a number of species, a proportion of CRISP1 remains associated with the fusogenic portion of the sperm, namely the equatorial segment (Section III.A). CRISP1 antibodies have been shown to inhibit rat sperm fertilization (237, 238, 239), and CRISP1 protein has been shown to bind to the oocyte, suggesting complementary binding sites (240, 241, 242). The immunization of male rats with CRISP1 significantly and reversibly reduced male fertility through inhibition of egg penetration by the sperm, with accumulation of sperm in the perivitelline space (243, 244). Anti-CRISP1 antibodies in the epididymis of immunized rats bound to CRISP1 without affecting expression levels or the localization of CRISP1 to the sperm plasma membrane. Sperm capacitation and the binding of sperm to the plasma membrane of zona pellucida free eggs were also not affected. The same results have been obtained for mouse CRISP1 when using purified CRISP1 in competitive binding assays (241), and for mouse CRISP2 and human CRISP1 when using immunological inhibition (55, 242). A similar activity was demonstrated for Izumo in the mouse (245). Izumo is generally regarded as a bona fide sperm oocyte receptor protein.
The reason for the disconnection of in vitro and in vivo CRISP1 activity in particular may well be the functional redundancy between CAP family members or cross-reactivity of analytical reagents with other CAPs. As such, it may only be after the generation of combination null mice that the true story will become clear.
D. Female reproductive tract
The extent of expression of CAP superfamily proteins within the mammalian female reproductive tract is largely unexplored. We have recently shown expression of CRISP1 and CRISP3 within the murine uterus produced in the secretory epithelial cells (45), and numerous members of the CAP superfamily may be expressed within the ovary and uterus based upon EST analysis. CAP superfamily proteins have, however, been characterized in the female reproductive tract of nonmammalian species, in particular from Xenopus, where they have a demonstrated biological function.
Organisms that undergo external fertilization require chemoattraction to ensure that sperm can find the oocyte and achieve fertilization (reviewed in Ref. 246). The three layers of jelly surrounding the Xenopus oocyte are comprised of a matrix of high molecular mass proteins and low molecular mass diffusible proteins (247). An analysis of the diffusible components from Xenopus laevis egg jelly identified a protein, called allurin, as having a specific sperm chemoattractant activity (248). Allurin is secreted from the jelly-secreting ducts of the Xenopus upper oviductal epithelium, where it becomes localized to the surface of the ciliated cells (249). During passage of the oocyte through the oviduct, the ciliated cells deposit allurin into the developing egg jelly layers (249). After transport through the oviduct, allurin makes up 3% of the total egg jelly protein (249). After spawning, most of the allurin within the outer jelly layer is released within 5 min, whereby it functions in sperm chemoattraction (250).
Although the mechanism of chemoattraction remains unknown, the biological activity appears conserved among Xenopus species. Allurin from X. tropicalis has a chemotactic function for both X. tropicalis and X. laevis sperm (251). Similarly, X. laevis allurin has a chemotactic activity for X. tropicalis sperm (251). Allurin was the first vertebrate protein isolated to have a sperm chemotactic activity.
Allurin has highest sequence identity (about 40%) to the CRISPs (252), and similar to the mammalian CRISPs, it has a Hinge-like sequence with the same spacing of cysteine residues (Cx2Cx3Cx4C). Beyond this however, the sequence is terminated, and there is no ICR domain. The expression profile similarity and structural similarity raise the possibility of a chemotactic activity of the mammalian CAPs, although this remains undetermined.
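The cysteine spacing quoted above (Cx2Cx3Cx4C) can be read as a simple sequence pattern: four cysteines separated by two, three, and four residues of any type. The following is a minimal sketch of how such a pattern might be searched for in a protein sequence. The function name and the toy sequence are hypothetical and are used for illustration only; the toy sequence is not a real allurin or CRISP sequence, and a match to this pattern alone would not establish the presence of a Hinge region.

```python
import re

# Cx2Cx3Cx4C: four cysteines separated by 2, 3, and 4 residues of any type.
HINGE_LIKE_MOTIF = re.compile(r"C.{2}C.{3}C.{4}C")


def find_hinge_like(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [
        (match.start(), match.group())
        for match in HINGE_LIKE_MOTIF.finditer(sequence.upper())
    ]


# Hypothetical toy sequence (an assumption, not taken from any database).
toy_sequence = "MKTAYCAACGGGCLLLLCPEQ"
print(find_hinge_like(toy_sequence))
```

Because the search is purely positional, it says nothing about disulfide connectivity or fold; it only flags candidate regions that share the spacing described in the text.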
E. Genetic studies associate SNPs in CRISPs with reduced fertility
To date, the only CAPs to be assessed for their potential as a genetic cause of infertility are the CRISPs. An analysis of nonsynonymous single nucleotide polymorphisms (SNPs) in equine Crisp3 was suggestive of a role in male fertility (253). The authors identified three polymorphisms within Crisp3, and one heterozygous SNP encoding E208K was significantly associated with reduced stallion fertility (253). E208K is localized to the CRISP domain of CRISP3 in a region of increased amino acid sequence variability among the CRISP protein family. A biochemical basis for reduced CRISP3 function has not, as yet, been investigated. The study also identified several polymorphisms in the equine Crisp1 and Crisp2 genes; however, no association with impaired fertility was detected (253).
Several studies on human male fertility have proposed a role for CRISP2; however, concrete proof of this is lacking. The analysis of an Albanian family identified a t(6;21)(p21.1;p13) translocation as the causal factor in male-specific infertility characterized by severe oligoasthenoteratospermia (254). CRISP2 was contained within the deleted region. Similarly, a larger analysis of break points in infertile male patients carrying balanced translocations identified CRISP2 as a candidate causal gene in three cases of male infertility (255).
More recently, our group has undertaken a detailed analysis of polymorphisms in the CRISP2 gene in a large number of infertile patients with a range of phenotypes (52). This study identified 21 CRISP2 polymorphisms, indicating an extremely high level of polymorphism between individuals. Three polymorphisms resulted in amino acid substitutions, including one that resulted in the loss of a strictly conserved cysteine involved in intramolecular disulfide bonding in the Hinge region (C196R). This mutation was only observed in a heterozygous state and occurred in similar frequencies in the infertile and control populations; however, a yeast two-hybrid analysis showed that in a homozygous state this polymorphism resulted in the loss of the ability of CRISP2 to bind to its binding partner GGN1 (52, 56).
| VI. CAP Proteins in Cancer |
|---|
|
|
|---|
Several lines of evidence suggest a role for CRISP3 in the progression of prostate cancer. CRISP3 is expressed at low levels within the normal human prostate (33), yet it is highly up-regulated in the cancerous prostate. By comparing microdissected prostate cancer tissue to adjacent "normal" tissue in a microarray analysis, Ernst et al. (256) showed that several genes within 6p21 were significantly up- or down-regulated. Of these, CRISP3 was the most highly up-regulated with a 21-fold overexpression in cancerous tissue compared with matched control tissue. This conclusion was supported by other studies where a 20- to 2000-fold up-regulation was observed (91, 92). The majority of CRISP3 production came from the prostatic epithelium, rather than the surrounding stromal tissue (256). Bjartell et al. (258) showed, using immunohistochemistry on prostate biopsies, that CRISP3 was significantly up-regulated in regions of high-grade cancer (Gleason scores 4/5). Furthermore, the mRNA and protein for the CRISP3 binding protein β-MSP were dramatically down-regulated in prostate cancer (257), suggesting that prostate cancer progression may be critically affected by the ratio of bound to free CRISP3.
Although the mechanism of CRISP3 activity in prostate cancer is yet to be revealed, CRISP3 has been explored as a marker of long-term patient outcomes and as a marker of regression. Patients with increased expression of CRISP3 in cancerous regions of the prostate had a lower recurrence-free probability (258). Serum CRISP3 concentration was found to correlate with progression in both a univariate and multivariate model with β-MSP. However, the inclusion of CRISP3 in existing predictive models (i.e., prostate serum antigen and pathological staging) did not improve their power, potentially as a consequence of numerous confounding endogenous sources of CRISP3 (93, 258). As expected for a gene containing an androgen-response element in its promoter, the levels of serum CRISP3 fell after orchiectomy (93).
CRISP3 was also shown to be significantly down-regulated in tongue tissue from patients with squamous carcinomas, compared with healthy tissue (94). The significance of this change in expression is yet to be explored.
The cancerous human prostate also showed aberrant expression of a member of the GLIPR subfamily. In contrast to CRISP3, however, GLIPR1 was suppressed in prostate cancers. Within the mouse, GLIPR1 (also known as RTVP1) was originally identified as a p53-stimulated prostate transcript (102). The transfection of GLIPR1 into a number of mouse and human prostate cell lines resulted in increased apoptosis by both p53-dependent and p53-independent pathways (102, 103). Similarly, viral delivery of Glipr1 in a preclinical model of prostate cancer resulted in decreased tumor growth and metastasis (103, 259). In addition to increased rates of apoptosis, decreased tumor progression after virally mediated delivery in vivo appeared to depend, at least in part, on antiangiogenic and immunostimulatory activities (259). The preimmunization of mice with a Glipr1-modified tumor cell vaccine was also partially protective against the development of prostate cancer in an orthotopic cancer model (260), and, perhaps most compellingly, mice lacking Glipr1 had a significantly increased predisposition to spontaneous tumor growth when compared with wild-type littermates (261). It is also of note that Glipr heterozygous animals were significantly more disposed to developing tumors than wild-type animals, suggesting that the proapoptotic effect of GLIPR1 is stochastic in nature (261). Tumors were observed in several tissue types. Concurrent p53 inactivation greatly increased the rates of tumor development; however, rates were still significantly above that of p53 inactivation alone, suggesting that for GLIPR1, the p53-independent apoptosis is functionally significant in carcinogenesis.
In terms of the mechanism of apoptosis, transfection experiments suggest that GLIPR1 production results in elevated reactive oxygen species production, leading to the sustained activation of c-Jun N-terminal kinase (JNK); a shift in the balance between pro- and antiapoptotic BCL2 family members toward the apoptotic pathway; and ultimately apoptosis by both caspase-dependent and caspase-independent pathways (261).
With one notable exception, there is also a compelling case for GLIPR1 being a human tumor suppressor gene. A case for using gene therapy for delivery of GLIPR1 to prostate cancers is most well developed; however, expression analyses show that GLIPR1 is also suppressed in human cancer cell lines derived from multiple sources, including lung, colon, and bladder cancers and lymphomas, and that the forced up-regulation of GLIPR1 resulted in apoptosis (261). GLIPR1 gene therapy may therefore be of value for a range of cancers. Indeed, at the time of writing this review, a phase I/II clinical trial on the potential of adenoviral-mediated GLIPR1 cancer therapy was under way (IND13033). mRNA and protein analyses using human prostate tissue have shown that GLIPR1 production is suppressed, largely via methylation of the promoter, in cancerous regions compared with normal prostate tissue (103).
Perhaps the loudest note of caution against widespread GLIPR1 delivery at the moment is the identification of GLIPR1 as one of the most highly up-regulated transcripts in human gliomas (astrocyte-derived brain tumors) (100, 101). In contrast to the situation for GLIPR1 in prostate cancer, GLIPR1 is up-regulated in astrocyte-derived tumors and cell lines, and the degree of expression correlates with the degree of invasiveness, i.e., glioblastoma > anaplastic astrocytomas > low-grade astrocytomas > normal brain (100, 104). GLIPR1 is expressed in very low levels in normal brain tissue, and the delivery of interfering RNAs into glioma cells results in elevated apoptosis. Conversely, the transfection of GLIPR1 into glioma cell lines increases both their proliferation rate and their ability to form colonies in agar (104).
In an almost mirror image of the effect of GLIPR1 on prostate cells, the transfection of GLIPR1 into a glioma cell line resulted in decreased phosphorylation (suppression) of JNK and a shift in the BCL2 family members toward a prosurvival phenotype (104). Clearly these data indicate that the GLIPR1 function is context dependent, and at this time the pivotal point between pro- and antitumorigenic effects is not known, but it may involve the relative levels of each of the four GLIPR1 transcripts that have been observed (101, 104). Regardless, the potential to use GLIPR1 gene delivery or suppression as a targeted therapy is an exciting prospect, and the outcome of clinical trials is eagerly awaited.
Elevated GLIPR1 expression has also been observed in a number of other tumor types, although its mechanism has not been determined. GLIPR1 is, for example, highly up-regulated in Wilms tumor (WT). WT is a renal cancer that arises during embryonic development and typically occurs in children. WT has a complex etiology and is associated with both genetic and epigenetic changes (262, 263, 264, 265, 266). Using a comparative analysis of differentially methylated regions in the genomes of normal fetal and WT kidney tissue, Chilukamarri et al. (267) identified the GLIPR1 promoter as being a frequently hypomethylated locus in WT. Consistent with this, GLIPR1 mRNA and proteins were significantly up-regulated in the majority of WT tissues examined (267).
| VII. CAP Proteins in Immune Regulation |
|---|
|
|
|---|
A. Ancylostoma-secreted protein, or activation-associated secreted protein (ASP)
CAP superfamily proteins have been identified in several species of both free living (C. elegans) and parasitic nematodes. Parasitic worms gain residence in their host through suppression and evasion of the host immune system (268). Ancylostoma caninum (dog hookworm) produces numerous CAP proteins that are secreted in high abundance immediately after the transition of the free living larvae to the parasitic form. These include the neutrophil inhibitory factor (NIF), the hookworm platelet inhibitor (HPI), ASP1, and ASP2 (269, 270, 271, 272). These proteins are produced at the crucial early stage of parasitic invasion, whereby the parasite attempts to evade the host immune system through secretion of immune evasion molecules, ultimately to facilitate and enhance feeding.
NIF was identified after a screen for inhibitors of leukocyte function (269). The 41-kDa glycoprotein blocked adhesion of activated neutrophils to vascular endothelial cells via direct high-affinity binding to the extracellular domain of CD11b/CD18 integrin (269, 273). Integrins are involved in the transfer of signals from the extracellular matrix to the cytoplasm and are key to the control of these signals (e.g., review in Ref. 274). The CD11b/CD18 integrin (otherwise known as macrophage-1 antigen, complement-receptor 3, or αMβ2 integrin) is involved in the mammalian innate immune system and mediates inflammation through the regulation of leukocyte adhesion (275). HPI has 36% identity to NIF, interacts with the integrin glycoprotein IIb/IIIa and GPIa/GPIIa, and inhibits platelet aggregation (270).
ASP1 and ASP2 each have about 30% identity to NIF. ASP-related proteins are expressed throughout many, if not all, nematode species, are produced in great abundance shortly after transition to the parasitic form (268), are secreted, and appear to have a conserved function related to evasion of host immunity or to enhanced feeding (270, 272, 276, 277, 278, 279, 280, 281, 282). One of the major human parasitic nematodes, Necator americanus, contains a family of nine ASP genes that are the most highly represented within EST libraries (282), alluding to their expression abundance. Based upon similarity in sequence and expression profile to the ASP from the dog hookworm, these may have a similar function in immune evasion or inhibition of platelet aggregation.
As an aside, EST analysis of N. americanus identified a group of small molecules with similarity to potassium channel blockers from the sea anemone (282). The hookworm secretes, among others, two classes of proteins in high abundance, one with a CAP domain and one with an ICR function. The ICR in the hookworm has similarity to the BgK toxin from the sea anemone (158). The ICR domain of CRISPs (Sections III.A and IV.B) has structural similarity to the BgK toxin (14, 15). Thus, it appears that mammals convergently evolved CRISPs that incorporate both domains into a single protein. In the very same vein, CTL domain-containing proteins are secreted from N. americanus (282). The CTL domain from the human hookworm has greater similarity to the mammalian P-selectins than it does to the lectins from C. elegans (282). Mammals produce CAP and CTL domain-containing proteins as a single gene product, notably the CAPCL genes (Section III.H).
B. Antigen 5 (Ag5)
The Ag5 proteins form a major and distinct clade of the CAP superfamily. Ag5 proteins were first identified within the fire ants and numerous species of wasps (reviewed in Ref. 8). They were later also identified within the midgut of Drosophila (1, 283) and are secreted from the saliva of blood-feeding ticks (284), sand flies (285), stable flies (286), and mosquitoes (287, 288).
The Ag5 proteins are the most abundant proteins within wasp venom and are frequently associated with eliciting a strong allergenic response in people (289, 290, 291, 292). Although detailed characterizations of the allergenic response have shown both an IgE and an IgG component (293, 294, 295), the function of Ag5 proteins within wasp and fire ant venom remains largely unexplored.
Within the blood-feeding ticks, flies, and mosquitoes (284, 285, 286, 287, 288), the Ag5 proteins are part of a cocktail of salivary proteins that are believed to function either in suppression of the host immune system or in prevention of clotting to prolong feeding (reviewed in Ref. 296). In this regard, the biological function is similar to that proposed for the ASP and for a CRISP from the parasitic river lamprey (159). Because the Ag5, ASP, and lamprey CRISP have the CAP domain in common, it is most likely that function is encoded within the CAP domain.
C. Pathogenesis related-1 (Pr-1)
Plants display numerous responses after pathogen infection that are aimed at the isolation and elimination of the pathogen (reviewed in Refs. 9 and 297). Among these is the hypersensitive response, which results in the isolation and death of infected cells. The hypersensitive response sees a dramatic up-regulation of the Pr proteins (reviewed in Refs. 9, 297, 298, and 299). Of the 17 distinct groups of Pr proteins (9), the Pr-1 proteins are the most highly up-regulated and contribute to 10% of the total protein within the infected leaf (4). The Pr-1 proteins are a major group of the CAP superfamily (298) and were first identified within the tobacco plant after infection by the tobacco mosaic virus (300, 301). Although it has been demonstrated that Pr-1 proteins are not antiviral (302), they may possess an antioomycete (water mold) activity (303, 304) through negative regulation of β-(1–3)-glucanase (305).
The increased expression of Pr-1 proteins throughout noninfected parts of the plants suggests a role in the plant systemic acquired resistance response (9, 298, 306), which is analogous to acquired immunity in mammals. Pr-1 expression also occurs during distinct developmental stages of the normal healthy plants (307). Therefore, similar to the situation in mammals, various roles of Pr-1 proteins may be dependent upon expression context. For a more expansive review of the Pr-1 protein within plants, readers are referred to numerous excellent reviews (9, 297, 298, 306).
D. An immune function for mammalian CAPs
The CAPs from nonmammalian sources with immune regulatory function operate through two general mechanisms: 1) introduction by foreign or invasive organisms for evasion of host defense/enhanced feeding; and 2) up-regulation of CAPs by the host for defensive purposes. In all of the characterized instances relating to immune evasion, the targets of CAP protein binding are different. The changed binding partner interactions may be a consequence of molecular evolution, and these changes may determine their biological function. Although this premise makes it impossible to predict immune function of mammalian CAPs based upon similarity to distant homologs, it does provide a basis for the examination of their function within immune environments.
As noted already, our recent analysis of the CRISPs in normal mouse tissues shows that CRISP1, CRISP3, and CRISP4 have a marked expression bias to tissues with roles in the immune system (45). These include the thymus and the spleen, which contain Crisp1, Crisp3, and Crisp4. CRISP3 is produced within B cells where its expression is regulated by OCT2 (83) and in neutrophils where it has been localized to secretory granules (35). Neutrophil granules are rich in matrix degradation proteins whose function is to contribute indirectly to microbiocidal potential through the degradation of the extracellular matrix (308, 309, 310). Localization of CRISP3 to this granule is supportive of a role in matrix degradation and is consistent with the proposed role of rat CAPLD2 (Lgl1) in branching morphogenesis, which requires the controlled remodeling of the extracellular matrix (311). Functional characterization of PI15 as a protease inhibitor (114) or Tex31 as a protease (132) is also consistent with these observations. The distribution of CRISP1 and CRISP4 within mouse immune tissues suggests that these may also have a role in immune regulation (45), although these have not been tested.
GLIPR1 has also been identified within cells of the macrophage lineage and up-regulated in response to 1α,25-dihydroxyvitamin D3 and TGF-β-induced maturation (100, 312). The role of GLIPR1 within this setting and its role in macrophage function have not been extensively explored. No immune defects have been reported in the Glipr1 knockout mouse line to date (261).
| VIII. CAP Protein in Nonreproductive Tract Pathologies |
|---|
|
|
|---|
B. Chronic pancreatitis
Chronic pancreatitis results in a significant up-regulation of CRISP3 (21-fold) (313). Chronic pancreatitis is an inflammatory disease characterized by progressive and irreversible destruction of the pancreas, and it has both environmental, e.g., alcohol consumption, and genetic components (314). In situ hybridization and immunohistochemical analyses showed a distinct up-regulation of CRISP3 in the cytoplasm of degenerating acinar cells in chronic pancreatitis tissues, compared with adjacent normal acinar cells and to acinar cells of the normal pancreas (95). The significance of these observations is currently unknown.
C. Neuropathic pain
Crisp1 up-regulation has been observed in relation to neuronal pathology, specifically in a rat model of partial sciatic nerve injury leading to neuropathic pain. Crisp1 was identified as a differentially regulated gene within the dorsal root ganglion cells associated with the perception of pain (315). Partial injury of the nervous system through, for example, spinal cord injury, multiple sclerosis, stroke, infection, toxins, or genetic or immune-mediated disorders can result in the development of neuropathic pain (316, 317). Using the rat model, an analysis of expression profiles within the dorsal root ganglion 2 wk after partial sciatic nerve injury identified the expression of Crisp1 (318). In situ hybridization of dorsal root ganglion cross-sections showed an abundance of transcripts skewed toward the smallest cells with a high proportion being nociceptive (those that respond to high intensity, painful stimulus). Although it remains to be confirmed that CRISP1 protein is produced in nociceptive cells, the aberrant expression of a potential ICR in conditions of nerve damage may result in the inappropriate transmission of sensory information. Mechanisms controlling Crisp1 expression under these conditions are unknown; however, androgen receptor expression in dorsal root ganglia has been demonstrated (319). The pain response of Crisp1 null animals has not as yet been studied.
D. Heart disease
As outlined earlier, PI16 has been identified as a secreted protein involved in the regulation of cardiomyocyte size (122). In a mouse model of heart failure, PI16 was shown to be highly up-regulated (470 ± 55%). A similar increase was observed in damaged human heart tissue (323 ± 130%) (122). PI16 production by rat cardiomyocytes in vitro significantly inhibited their growth. RNA interference-mediated inhibition results in increased cardiomyocyte size, and the in vivo production of PI16 in a transgenic mouse model resulted in apparently healthy mice with small hearts made up of hypotrophic cardiomyocytes (122). Interestingly, cardiomyocytes transfected with small interfering RNA against PI16 are larger and showed reorganized sarcomeres (the basic muscle cell unit between Z-bands) (122). As a consequence of its secretion, the presence of a CAP domain with homology to a protease (132), and also homology to the human PI15 protease inhibitor, Frost and Engelhardt (122) hypothesized that PI16 may have a role in the matrix remodeling component of heart failure (320).
E. Bronchopulmonary dysplasia
Consistent with the proposed role of CAPLD2 in lung branching morphogenesis, it is significantly down-regulated in rat models of bronchopulmonary dysplasia (127). Bronchopulmonary dysplasia is a chronic lung disease most often observed in severely underweight premature children. Its pathology appears to be due, at least in part, to abnormal alveolar and airway development or inflammation (321). Within both the normal and treated rat, CAPLD2 was associated with the growing alveolar tips during periods of rapid growth (127). Such a localization is consistent with a role for CAPLD2 as a regulator of the extracellular matrix. The biochemistry underlying CAPLD2 function is currently unknown.
| IX. Conclusions and Perspectives |
|---|
|
|
|---|
CAP superfamily proteins represent striking cases of mosaic evolution, with both evolutionary stasis (in CAP motifs) and change (throughout the rest of the protein). These two apparently opposing evolutionary patterns have resulted in a superfamily with conserved tertiary structure, yet a diversity of function. This variability may be at the core of the difficulty in defining the activity/function of the CAP superfamily. In numerous cases, specific high-affinity interactions have been described, and these may provide the basis for a competitive negative regulation by the CAPs. If an enzymatic function for CAP domains is confirmed (e.g., protease activity), the specific targets of that function may be different based on the different protein interactions controlled through variation in surface amino acids and potentially associated domains. An example of this is provided by NIF and HPI, activation-associated proteins from the nematode. Both of these bind integrins, yet their targets differ, resulting in differing biological functions. NIF inhibits neutrophil adhesion, and HPI inhibits platelet aggregation (269, 270).
In the most general terms, CAP superfamily proteins are identified in three environments: 1) as a normal component in a cell or tissue; 2) as being up-regulated during disease; and 3) as being introduced by a foreign agent for a toxic/immune evasive effect. Within this review we have been concerned primarily with the two former situations, although examples of the latter have been provided and are useful in consideration of the former. In numerous cases where CAP expression is chronically up-regulated, an association between extracellular matrix remodeling, immune system regulation/dysfunction, or otherwise general disruption of cellular homeostasis is observed. Determination of the function of CAPs in these instances of disease will undoubtedly be the focus of future investigation.
Some aspects of CAP protein expression appear contradictory because they are up-regulated in cases of chronic disease and cancer where positive correlations have been drawn, yet are also present in the normal healthy adult. Although this in itself is not extraordinary, in many cases there appears to be a dichotomy in function. An example of this is borne out through the CRISP subfamily. CRISPs are highly expressed and conserved within the mammalian male reproductive tract, but also within the venom of poisonous reptiles (322, 323). Their retention implies an important function in each of these environments, which are, at first glance, opposing; one is important for the generation of life, and the other has a toxic effect potentially leading to death. In reality, both processes are critically dependent on ion channel regulation.
An accumulating body of evidence suggests an important role for CAPs in mammalian biology, specifically relating to reproductive function, the immune system, cancer invasiveness, and numerous chronic diseases, organogenesis, and development. Some very significant questions remain. Most notably, what is the underlying mode of action of the CAP domain: is it an enzyme, or a high-affinity interacting protein functioning via competitive inhibition? Characterization of this activity will be of enormous benefit to understanding their biological significance in the diverse environments in which they are expressed and may offer new treatment possibilities. For the readers of this journal, the function of CAP proteins is particularly notable for the processes of sperm development, maturation, and function, but also for prostate development and disease. The possibilities and potential importance of the superfamily are further highlighted when examining the many uncharacterized members of the GLIPR1, GLIPR2, CAPLD1, CAPLD2, PI15, and PI16 subfamilies. The majority are highly represented in endocrine-regulated tissues and disease.
| Acknowledgments |
|---|
| Footnotes |
|---|
Disclosure Summary: The authors have nothing to disclose.
First Published Online September 29, 2008
Abbreviations: Ag5, Antigen 5; ASP, activation-associated secreted protein; CAP, cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 protein; CAPCL, CAP and CTL domain containing; CAPLD1, CAP and LCCL domain containing protein 1; CRISP, cysteine-rich secretory protein; CRISPLD1, CRISP LCCL domain containing 1; CTL, C-type lectin; DFNA9, deafness autosomal dominant nonsyndromic sensorineural disorder 9; E, embryonic day; EST, expressed sequence tag; GAPR-1, Golgi-associated pathogenesis related-1; GGN1, gametogenetin 1; GLIPR1, glioma pathogenesis related-1; GLIPR1L1, GLIPR1-like 1; GPI, glycosyl-phosphatidylinositol; HPI, hookworm platelet inhibitor; ICR, ion channel regulator; IVF, in vitro fertilization; LCCL, limulus factor C, Coch-5b2, and Lgl1; Lgl1, late gestation lung protein 1; MAP3KII, MAPK kinase kinase II; MRL, mannose receptor-like; β-MSP, β-microseminoprotein; NIF, neutrophil inhibitory factor; NMR, nuclear magnetic resonance; PDB, Protein Data Bank; PI15, peptidase inhibitor 15; Pr-1, pathogenesis-related-1; PsTx, pseudechetoxin; R3HDML, R3H domain-like; RTVP1, related to testis specific, vespid and Pr-1; RyR, ryanodine receptor; SCP, sperm-coating glycoprotein; SNP, single nucleotide polymorphism; WT, Wilms tumor.
Received for publication July 14, 2008. Accepted for publication September 4, 2008.
| References |
|---|
|
|
|---|
-lactalbumin-like activity with the plasma membrane of rat spermatozoa. The Biochemical journal 206:161–164[Medline]
1B-glycoprotein in human plasma. Biochemistry 43:12877–12886[CrossRef][Medline]
1B-glycoprotein. Biochemistry 31:410–418
2-SCB-
2 binding site. Yeast 11:681–689[Medline]
-carboxyglutamic acid residue in a novel cysteine-rich secretory protein without propeptide. Biochemistry 45:12828–12839
and β subunits in the functions of integrin
Mβ2. J Biol Chem 280:1336–1345