Conserved Structure and Adjacent Location of the Thrombin Receptor and Protease-Activated Receptor 2 Genes Define a Protease-Activated Receptor Gene Cluster
Molecular Medicine, volume 2, pages 349–357 (1996)
Thrombin is a serine protease that elicits a variety of cellular responses. Molecular cloning of a thrombin receptor revealed a G protein-coupled receptor that is activated by a novel proteolytic mechanism. Recently, a second protease-activated receptor was discovered and dubbed PAR2. PAR2 is highly related to the thrombin receptor by sequence and, like the thrombin receptor, is activated by cleavage of its amino terminal exodomain. Also like the thrombin receptor, PAR2 can be activated by the hexapeptide corresponding to its tethered ligand sequence independent of receptor cleavage. Thus, functionally, the thrombin receptor and PAR2 constitute a fledgling receptor family that shares a novel proteolytic activation mechanism. To further explore the relatedness of the two known protease-activated receptors and to examine the possibility that a protease-activated gene cluster might exist, we have compared the structure and chromosomal locations of the thrombin receptor and PAR2 genes.
Materials and Methods
The genomic structures of the two protease-activated receptor genes were determined by analysis of λ phage, P1 bacteriophage, and bacterial artificial chromosome (BAC) genomic clones. Chromosomal location was determined with fluorescent in situ hybridization (FISH) on metaphase chromosomes, and the relative distance separating the two genes was evaluated both by means of two-color FISH and analysis of YACs and BACs containing both genes.
Analysis of genomic clones revealed that the two protease-activated receptor genes share a two-exon genomic structure in which the first exon encodes 5′-untranslated sequence and signal peptide, and the second exon encodes the mature receptor protein and 3′-untranslated sequence. The two receptor genes also share a common locus with the two human genes located at 5q13 and the two mouse genes at 13D2, a syntenic region of the mouse genome. These techniques also suggest that the physical distance separating these two genes is less than 100 kb.
The fact that the thrombin receptor and PAR2 genes share an identical structure and are located within approximately 100 kb of each other in the genome demonstrates that these genes arose from a gene duplication event. These results define a new protease-activated receptor gene cluster in which new family members may be found.
Thrombin is a serine protease that elicits a variety of cellular responses. Molecular cloning of a thrombin receptor revealed a G protein-coupled receptor that is activated by a novel proteolytic mechanism (1–4). Thrombin cleaves its receptor’s amino-terminal exodomain to unmask a new amino terminus, which then binds to the body of the receptor to effect receptor activation. Given the broad biological roles of G protein-coupled receptors and proteases, it is perhaps not surprising that a second protease-activated receptor, dubbed PAR2, was recently discovered (5–8). PAR2 is highly related to the thrombin receptor by sequence, with 34% overall amino acid identity, and is activated by tryptic cleavage of its amino terminal exodomain (5,8). Whether trypsin itself and/or another protease with similar substrate specificity are physiological activators of PAR2 is unknown. The location of the activating cleavage site in PAR2 is analogous to that in the thrombin receptor, and the tethered ligand sequence immediately carboxyl to this site in PAR2 is similar to the tethered ligand sequence in the thrombin receptor. Like the thrombin receptor, PAR2 can be activated by the hexapeptide corresponding to its tethered ligand sequence independent of receptor cleavage. Thus, functionally, the thrombin receptor and PAR2 constitute a fledgling receptor family that shares a novel proteolytic activation mechanism (5,7).
To explore further the relatedness of the two known protease-activated receptors and to examine the possibility that a protease-activated gene cluster might exist, we have compared the structure and chromosomal locations of the thrombin receptor and PAR2 genes. Our results show that these genes share an identical structure and are located within approximately 100 kb of each other in the genome. Thus, the thrombin receptor and PAR2 genes probably arose from a gene duplication event and define a new protease-activated receptor gene cluster. The ultimate size of this cluster and the ancestral relationships of its members remain to be determined.
Materials and Methods
Genomic Cloning and Mapping of the Human and Mouse Thrombin Receptors—Human Thrombin Receptor
A human placental genomic library in EMBL3 SP6/7 (Clontech, Palo Alto, CA, U.S.A.) was screened with a 2123-bp cDNA probe spanning the entire coding region of the human thrombin receptor. A human placental genomic library in pWE15 cosmid (Stratagene, La Jolla, CA, U.S.A.) was subsequently screened for putative exon 1 with a polymerase chain reaction (PCR)-generated probe corresponding to bases −172 to +77 of the cDNA (bases are numbered with 1 representing the dA of the start codon) (1). DNA from positive clones was digested with restriction endonucleases and analyzed by Southern blotting using a probe for the coding region of the receptor. Positive restriction fragments were subcloned into pBluescript KS+ (Stratagene) and sequenced. Intron size was determined using long-range PCR (Boehringer Mannheim, Indianapolis, IN, U.S.A.), as recommended by the manufacturer, on two P1 bacteriophage genomic clones (see below) that contained both exons of the human thrombin receptor. The forward primer used was just 3′ of the exon 1 splice donor site (5′-ACCCCTGCCTCAGTTTCCTCCAAAG-3′) and the reverse primer just 5′ of the exon 2 splice acceptor site (5′-CTCCTCATCCTCCCAAATGGTTC-3′). The identity of the 15-kb PCR product was confirmed by Southern analysis with hybridization to oligomer probes derived from intronic sequence just 3′ of the forward primer (5′-TGGCATTTGGGCTGAGATCTGGAGT-3′) and 5′ of the reverse primer (5′-ACCGGGGATCTAAGGTGGCATTTGT-3′).
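The base-numbering convention above can be made concrete with a short calculation. The following Python sketch is illustrative only and is not part of the original methods. It assumes the common convention in which the base immediately 5′ of +1 is numbered −1, so that no position 0 exists; the text states only that +1 is the dA of the start codon. The function name `span_length` is hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Count bases from start to end inclusive.

    Assumption (not stated explicitly in the text): the numbering has no
    position 0, so -1 is immediately followed by +1.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist under this convention")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    if start < 0 < end:
        # The span crosses the -1/+1 boundary; remove the nonexistent 0.
        length -= 1
    return length


# Exon 1 probe described above: bases -172 to +77 of the cDNA.
probe_length = span_length(-172, 77)
print(probe_length)
```

Under this assumption the exon 1 probe spans 249 bases (172 upstream of the start codon plus 77 from +1 onward). If a position 0 were counted, the span would be one base longer.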
Mouse Thrombin Receptor
An SV129 mouse genomic library in the bacteriophage FixII (Stratagene) was screened with both a full-length cDNA probe and a PCR-generated probe covering bases −15 to +83 (same base numbering convention as above). Positive clones were analyzed as for the human gene. An SV129 mouse genomic library in bacteriophage P1 (Genome Systems) was screened by PCR using both primers for exon 1 (bp −1003 to −815) and exon 2 (bp +197 to +407). A single positive P1 clone was analyzed as described above.
Cloning of Human PAR2 Exon 2
Degenerate PCR primers to the sequences in the mouse PAR2 gene with greatest homology to the murine thrombin receptor were used to amplify a fragment of human PAR2 exon 2 (forward primer GTIGTITA[C/T]ATIATIGTITT[C/T] and reverse primer [A/G]TA[A/G]TAIAC[A/G]AAIGG[A/G]TCIAT[A/G]CA; I = inosine). Additional 5′ and 3′ sequence was obtained using hemi-degenerate PCR. Finally, sequence 5′ of the putative first transmembrane domain and 3′ of the putative seventh transmembrane domain was obtained by directly sequencing a bacteriophage P1 clone containing the human PAR2 gene (see below) using the AmpliCycle sequencing kit (Perkin-Elmer, Norwalk, CT, U.S.A.). All reported PCR-derived sequences were confirmed using at least two independent PCR reactions to avoid errors due to the inherent error rate of Taq polymerase, and they agree with the subsequently published sequence of the human PAR2 gene exon 2 (8).
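The bracket notation used for the degenerate primers can be unpacked programmatically. The sketch below is an illustration, not the authors' procedure. It treats each bracketed group such as [C/T] as one position with alternative bases and leaves inosine (I) as a single fixed residue. The helper names `parse_primer`, `degeneracy`, and `expand` are hypothetical.

```python
import re
from itertools import product


def parse_primer(primer: str) -> list[list[str]]:
    """Split a primer into per-position lists of allowed bases."""
    # A token is either a bracketed choice such as [C/T] or a single letter.
    tokens = re.findall(r"\[[^\]]+\]|[A-Za-z]", primer)
    return [t[1:-1].split("/") if t.startswith("[") else [t] for t in tokens]


def degeneracy(primer: str) -> int:
    """Number of distinct explicit sequences in the primer mixture."""
    count = 1
    for choices in parse_primer(primer):
        count *= len(choices)
    return count


def expand(primer: str) -> list[str]:
    """List every explicit sequence; inosine (I) is kept as written."""
    return ["".join(combo) for combo in product(*parse_primer(primer))]


forward = "GTIGTITA[C/T]ATIATIGTITT[C/T]"
reverse = "[A/G]TA[A/G]TAIAC[A/G]AAIGG[A/G]TCIAT[A/G]CA"

for name, primer in (("forward", forward), ("reverse", reverse)):
    print(name, len(parse_primer(primer)), "positions,",
          degeneracy(primer), "sequences")
```

Read this way, the forward primer is a 21-mer with two two-fold positions (4 sequences) and the reverse primer is a 24-mer with five two-fold positions (32 sequences). Using inosine at the remaining ambiguous positions keeps the mixture small.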
Chromosomal Localization of the Thrombin Receptor and PAR2 Genes
The forward primer hP2F2 (CAACTGGATTTATGGGGAAGCTC) and the reverse primer hP2B2 (GATGTTCAGGGCAGGAATGA) were used to screen a human genomic P1 bacteriophage library (Genome Systems, St. Louis, MO, U.S.A.) for a clone containing exon 2 of human PAR2. The forward primer hTRF3 (TCTGAATTGTGTCGCTTCGTCACTG) and reverse primer hTRB2 (TGGCAACTGCGGAAGAGCTAAGAC) were used to screen the human genomic P1 bacteriophage library for a clone containing exon 2 of thrombin receptor. These clones were used to generate probes for FISH. Probe for single color FISH was generated by nick translation using digoxigenin-11-dUTP (Boehringer Mannheim) and detected by anti-digoxigenin antibody conjugated with fluorescein isothiocyanate (FITC). Probes for two-color FISH were labeled either with FITC-12-dUTP (NEN, DuPont, Wilmington, DE, U.S.A.) or with digoxigenin-dUTP; the latter was detected with rhodamine-conjugated anti-digoxigenin (Boehringer Mannheim). The chromosomal locations of the probes were determined by digital microscopy and localized by the fractional length from the p terminus (FLpter) as described previously (9). The probes for thrombin receptor and PAR2 were also simultaneously hybridized to chromatin fibers that were released by physical disruption of the nucleus and stretched out prior to hybridization (Ref. 10 and W.-L. Kuo, manuscript in preparation).
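FLpter expresses a probe's position as the distance from the p terminus to the hybridization signal divided by the total length of the chromosome. The sketch below illustrates that ratio only. The measurements are hypothetical placeholders in arbitrary image units, not data from this study, and the function name `flpter` is an assumption made for illustration.

```python
from statistics import mean


def flpter(pter_to_signal: float, chromosome_length: float) -> float:
    """Fractional length from the p terminus (0 = pter, 1 = qter)."""
    if chromosome_length <= 0:
        raise ValueError("chromosome length must be positive")
    if not 0 <= pter_to_signal <= chromosome_length:
        raise ValueError("signal must lie on the chromosome")
    return pter_to_signal / chromosome_length


# HYPOTHETICAL measurements (signal distance, chromosome length) in pixels.
# Real values would come from digital images of several metaphase spreads.
measurements = [(52.0, 130.0), (48.0, 120.0), (56.0, 140.0)]

values = [flpter(distance, length) for distance, length in measurements]
mean_flpter = mean(values)
print(round(mean_flpter, 3))
```

Because the value is a ratio, it is insensitive to differences in chromosome condensation between spreads, which is why measurements from several chromosomes can be averaged.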
Identification of YAC Clones Containing Both the Thrombin Receptor and PAR2 Genes
Exact primers hP2F2, hP2B2, hTRF3, and hTRB2 (see above) were used to screen a yeast artificial chromosome (YAC) human genomic library by PCR (Genome Systems). Positive clones were analyzed by Southern analysis performed at high stringency (hybridization at 65°C in 2× SSPE, 0.1% SDS, overnight and wash at 65°C in 0.1× SSPE, 0.1% SDS for 30 min) using probe derived from exon 2 of both thrombin receptor and PAR2.
Analysis of BAC Clones Containing Both Murine PAR Genes
Probe for exon 2 (+122 to +930 of the cDNA) of the murine PAR2 gene was used to screen a mouse bacterial artificial chromosome (BAC) library at high stringency (Genome Systems). Positive clones were analyzed by PCR for the presence of sequence 5′ of the thrombin receptor gene start codon (−4602 to −4477 using the forward primer TCGACTCGATCAGGCTCTCCTTAG and the reverse primer CAAATCAGGCATGGTGGCAC) and the presence of sequence within exon 2 of the thrombin receptor gene (+650 to +1137 of the cDNA using the forward primer CAGTCCCTGTCCTGGCGCACT and the reverse primer ATCCAGAGCTTTCTTTGCAGCA). PCR reactions were performed for 35 cycles of 95°C for 1 min, 58°C for 1.5 min, and 72°C for 1.5 min, with a terminal extension at 72°C for 8 min.
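The cycling conditions and primer coordinates above lend themselves to a quick arithmetic check. The Python sketch below is illustrative and makes two assumptions: the quoted coordinates are inclusive outer bounds of each amplicon, and ramp times between temperatures are ignored. The names `Step`, `total_hold_minutes`, and `amplicon_length` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    minutes: float  # hold time in minutes


# Program as described in the text.
CYCLE = (Step(95, 1.0), Step(58, 1.5), Step(72, 1.5))
N_CYCLES = 35
FINAL_EXTENSION = Step(72, 8.0)


def total_hold_minutes(cycle, n_cycles, final_extension) -> float:
    """Sum of programmed hold times; ramping between steps is ignored."""
    return n_cycles * sum(step.minutes for step in cycle) + final_extension.minutes


def amplicon_length(start: int, end: int) -> int:
    """Inclusive span for coordinates of the same sign (assumption: the
    quoted coordinates are the outer bounds of the product)."""
    if (start < 0) != (end < 0):
        raise ValueError("this helper handles same-sign coordinates only")
    return end - start + 1


print(total_hold_minutes(CYCLE, N_CYCLES, FINAL_EXTENSION))  # minutes
print(amplicon_length(-4602, -4477))  # 5' flanking product, bp
print(amplicon_length(650, 1137))     # exon 2 product, bp
```

Under these assumptions the program holds for 148 min in total, and the expected products are 126 bp for the 5′ flanking sequence and 488 bp for exon 2, which gives a simple size check for bands on a gel.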
The Thrombin Receptor Genomic Structure is Identical to That of PAR2
Analysis of λ phage and cosmid genomic clones revealed a two-exon structure for the mouse and human thrombin receptor genes. The first exon encoded 5′-untranslated sequence and signal peptide; the second encoded the mature receptor protein and 3′-untranslated sequence (Fig. 1). Because no single clone obtained by screening λ phage or cosmid libraries contained both exons, a mouse genomic library in P1 bacteriophage was screened by PCR for both exons to identify a doubly positive clone. Southern blot analysis of this clone revealed an approximately 14-kb intron separating the short first exon from the remainder of the coding region in exon 2 (Fig. 1). Similarly, Southern and long-range PCR analysis of human genomic DNA and of two P1 bacteriophage genomic clones revealed a 15-kb intron separating exons 1 and 2. PCR analysis also suggested the presence of a large intron at this site in the Xenopus thrombin receptor gene (data not shown). This two-exon genomic structure, including the separation of the putative signal peptide from the mature receptor, is identical to that of both the human and murine PAR2 genes (Fig. 1) (5,6,8).
The exact transcriptional start sites of the thrombin receptor genes have not been identified. The longest human thrombin receptor cDNA cloned to date extends 150 nucleotides 5′ of the original sequence (11), predicting a 3.6-kb mRNA. Indeed, by Northern analysis human thrombin receptor mRNA was only slightly larger than a 3.45-kb human thrombin receptor cRNA run for comparison and 5′ rapid amplification of cDNA ends (RACE) (12) performed on DAMI cell cDNA did not reveal additional 5′ sequence (data not shown). Primer extension of polyA RNA from DAMI, HEL, or human umbilical venous endothelial cells did not yield a predominant product (data not shown). This negative result may be due to the high GC content of the 5′ end of the mRNA or to the actual use of multiple start sites. Sequencing of 5457 bases 5′ of the human gene’s exon 1 and 4605 bases 5′ of exon 1 of the murine gene (Genbank Accession Numbers U36755 and U36756) revealed several TATAA sequences but none closer than 800–900 bp 5′ of the translational start codon. Thus, it is likely that the thrombin receptor gene is “TATAA-less”, but we cannot exclude the existence of a small noncoding exon 5′ to those that we have characterized. Of note, the transcriptional start for PAR2 has also been difficult to define (6).
The Thrombin Receptor and PAR2 Genes Share a Common Chromosomal Locus
The chromosomal loci of the human and murine thrombin receptors were determined by FISH to metaphase chromosomes. The human gene mapped to 5q13 (Fig. 2A). The mouse thrombin receptor gene mapped to 13D2, which is syntenic with the human receptor locus (data not shown) (13). The human PAR2 gene mapped to 5q13, the same locus as that for the human thrombin receptor gene (Fig. 2B). These data are consistent with earlier observations on the thrombin receptor and PAR2 genes (8,11).
To determine more accurately the proximity of the two human genes, two-color FISH was performed on dispersed chromatin in interphase nuclei. The PAR2 and thrombin receptor genes colocalized (Fig. 2C). The consistent appearance of two DNA segments as a single spot by two-color FISH has correlated with a separation of <0.1 Mb (14). To improve mapping resolution further, FISH was also performed on free chromatin fibers that were released from intact nuclei and stretched out mechanically prior to hybridization (W.-L. Kuo, manuscript in preparation). Two-color FISH of the PAR2 and thrombin receptor P1 genomic clones on this “stretched DNA” revealed these genes to be adjacent (Fig. 2D).
Analysis of Human YACs and Mouse BACs Confirms That the Thrombin Receptor and PAR2 Genes Share a Common Locus
Four independent YAC clones that contain both the human thrombin receptor and PAR2 genes were obtained by PCR screening. Each of these clones was also shown to be positive for both genes by Southern analysis (Fig. 3). These data confirm that the human thrombin receptor and PAR2 genes are in close proximity, probably within a few hundred kilobases of each other (14). Similarly, five clones obtained from a mouse BAC genomic library were positive for both the thrombin receptor and PAR2 genes by PCR analysis (data not shown). The size of the BAC clones ranged from 110 to 140 kb (data not shown) suggesting a maximum distance between the two murine genes of approximately 100 kb.
These studies report the thrombin receptor genomic structure and reveal that, in addition to sharing a novel proteolytic mechanism of activation, thrombin receptor and PAR2 share an unusual genomic structure and a common location in the genome. Other G protein-coupled receptor genes do contain introns (e.g., the oxytocin receptor (15), the endothelin receptors (16,17), the tachykinin receptors (18), and the dopamine receptors (19,20)), but thus far only thrombin receptor and PAR2 possess a genomic structure in which the exon encoding the signal peptide is separated from that encoding the main body of the receptor by a single long intron. The fact that the two known protease-activated G protein-coupled receptors share this unusual gene structure and are adjacent in the genome is strong evidence that these genes evolved by means of a gene duplication event.
Like the protease-activated receptors, the genes for other subfamilies of G protein-coupled receptors are known to cluster at single genetic loci. Two clusters of genes at two distinct loci are known for the adrenergic receptors: the β2 and α1 receptor genes on human chromosome 5q32–34 and the β1 and α2 receptor genes on human chromosome 10q24–26 (21). The receptors for the neutrophil chemoattractants C5a and f-Met-Leu-Phe and formyl peptide-like receptors 1 and 2 are all located at 19q13.3 (22). Two interleukin 8 receptor genes and a related pseudogene map to 2q35 (23). Thus duplication has generated gene clusters encoding receptors that respond to the same ligand but couple to distinct effector pathways and show different tissue distributions (e.g., the adrenergic receptors), as well as clusters encoding receptors that are related by sequence but respond to very distinct ligands (e.g., the C5a and f-Met-Leu-Phe receptors, which share 33% amino acid identity).
Proteases serve many functions and their number greatly exceeds the number of known protease-activated receptors. It is thus tempting to speculate that a large number of proteases with diverse expression patterns, specificity, and regulation had evolved by the time the first protease-activated receptor arose, perhaps from a preexisting peptide receptor (1,24). Duplication of this ancestral gene in such an environment may have fostered evolution of new protease-activated receptors capable of responding to distinct proteases as occurred with thrombin receptor and PAR2. We have shown that the protease specificity of the thrombin receptor is determined primarily by the P1–P4 amino acids at the thrombin cleavage site and that replacing these with an enteropeptidase cleavage site yields a receptor that signals to enteropeptidase and not to thrombin (2). Thus, receptors for a variety of proteases could easily arise by receptor gene duplication followed by the mutation of the few codons that encode the cleavage site. Such gene duplication may also have allowed acquisition of new regulatory elements to allow tissue specific regulation of receptors that respond to the same or different proteases.
Several observations suggest that additional family members may exist. The thrombin receptor knock-out mouse, made possible by the genomic clones reported here, reveals the existence of a second platelet thrombin receptor and tissue-specific roles for distinct thrombin receptors (25). Cellular responses to other proteases such as neutrophil cathepsin G (26) and mast cell tryptase (27) have been reported. Whether the receptors that mediate these responses are related to the known protease-activated receptors by gene duplication and whether such genes might remain clustered with the thrombin receptor and PAR2 genes remain to be explored.
Vu T-KH, Hung DT, Wheaton VI, Coughlin SR. (1991) Molecular cloning of a functional thrombin receptor reveals a novel proteolytic mechanism of receptor activation. Cell 64: 1057–1068.
Vu T-KH, Wheaton VI, Hung DT, Coughlin SR. (1991) Domains specifying thrombin-receptor interaction. Nature 353: 674–677.
Rasmussen UB, Vouret-Craviari V, Jallat S, et al. (1991) cDNA cloning and expression of a hamster alpha-thrombin receptor coupled to Ca2+ mobilization. FEBS Lett. 288: 123–128.
Chen J, Ishii M, Wang L, Ishii K, Coughlin SR. (1994) Thrombin receptor activation: Confirmation of the intramolecular tethered liganding hypothesis and discovery of an alternative intermolecular liganding mode. J. Biol. Chem. 269: 16041–16045.
Nystedt S, Emilsson K, Wahlestedt C, Sundelin J. (1994) Molecular cloning of a potential novel proteinase activated receptor. Proc. Natl. Acad. Sci. U.S.A. 91: 9208–9212.
Nystedt S, Larsson AK, Aberg H, Sundelin J. (1995) The mouse proteinase-activated receptor-2 cDNA and gene. Molecular cloning and functional expression. J. Biol. Chem. 270: 5950–5955.
Coughlin SR. (1994) Protease-activated receptors start a family. Proc. Natl. Acad. Sci. U.S.A. 91: 9200–9202.
Nystedt S, Emilsson K, Larsson A, Strombeck B, Sundelin J. (1995) Molecular cloning and functional expression of the gene encoding the human proteinase-activated receptor 2. Eur. J. Biochem. 232: 84–89.
Stokke T, Collins C, Kuo WL, et al. (1995) A physical map of chromosome 20 established using fluorescence in situ hybridization and digital image analysis. Genomics 26: 134–137.
Senger G, Jones TA, Fidlerova H, et al. (1994) Released chromatin: Linearized DNA for high resolution fluorescence in situ hybridization. Hum. Mol. Genet. 3: 1275–1280.
Bahou WF, Coller BS, Potter CL, Norton KJ, Kutok JL, Goligorsky MS. (1993) The thrombin receptor extracellular domain contains sites crucial for peptide-ligand induced activation. J. Clin. Invest. 91: 1405–1413.
Frohman MA, Dush MK, Martin GR. (1988) Rapid production of full-length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer. Proc. Natl. Acad. Sci. U.S.A. 85: 8998–9002.
Copeland RA. (1994) Reverse fluorescence staining of proteins in Polyacrylamide gels using terbium chloride. Anal. Biochem. 220: 218–219.
Trask B, Pinkel D, van den Engh G. (1989) The proximity of DNA sequences in interphase cell nuclei is correlated to genomic distance and permits ordering of cosmids spanning 250 kilobase pairs. Genomics 5: 710–717.
Rozen F, Russo C, Banville D, Zingg HH. (1995) Structure, characterization, and expression of the rat oxytocin receptor gene. Proc. Natl. Acad. Sci. U.S.A. 92: 200–204.
Arai H, Nakao K, Takaya K, et al. (1993) The human endothelin-B receptor gene. Structural organization and chromosomal assignment. J. Biol. Chem. 268: 3463–3470.
Hosoda K, Nakao K, Tamura N, et al. (1992) Organization, structure, chromosomal assignment, and expression of the gene encoding the human endothelin-A receptor. J. Biol. Chem. 267: 18797–18804.
Gerard NP, Eddy RJ, Shows TB, Gerard C. (1990) The human neurokinin A (substance K) receptor. Molecular cloning of the gene, chromosome localization, and isolation of cDNA from tracheal and gastric tissues [published erratum appears in (1991) J. Biol. Chem. 266(2): 1354]. J. Biol. Chem. 265: 20455–20462.
Zhou QY, Li C, Civelli O. (1992) Characterization of gene organization and promoter region of the rat dopamine D1 receptor gene. J. Neurochem. 59: 1875–1883.
Fu D, Skryabin BV, Brosius J, Robakis NK. (1995) Molecular cloning and characterization of the mouse dopamine D3 receptor gene: an additional intron and an mRNA variant. DNA Cell Biol. 14: 485–492.
Yang FT, Xue FY, Zhong WW, et al. (1990) Chromosomal organization of adrenergic receptor genes. Proc. Natl. Acad. Sci. U.S.A. 87: 1516–1520.
Bao L, Gerard NP, Eddy RJ, Shows TB, Gerard C. (1992) Mapping of genes for the human C5a receptor (C5AR), human FMLP receptor (FPR), and two FMLP receptor homologue orphan receptors (FPRH1, FPRH2) to chromosome 19. Genomics 13: 437–440.
Ahuja SK, Ozcelik T, Milatovitch A, Francke U, Murphy PM. (1992) Molecular evolution of the human interleukin-8 receptor gene cluster. Nat. Genet. 2: 31–36.
Chen J, Bernstein HS, Chen M, et al. (1995) Tethered ligand library for discovery of peptide agonists. J. Biol. Chem. 270: 23398–23401.
Connolly A, Ishihara H, Kahn M, Farese RV Jr, Coughlin SR. (1996) Role of thrombin receptor in development and evidence for a second receptor. Nature (In press).
Selak M. (1994) Cathepsin G and thrombin: evidence for two different platelet receptors. Biochem. J. 297: 269–275.
Hartmann T, Ruoss SJ, Raymond WW, Seuwen K, Caughey GH. (1992) Human tryptase as a potent, cell-specific mitogen: role of signaling pathways in synergistic responses. Am. J. Physiol. 262: L528–L534.
von Heijne G. (1983) Patterns of amino acids near signal-sequence cleavage sites. Eur. J. Biochem. 133: 17–21.
von Heijne G. (1986) A new method for predicting signal sequence cleavage sites. Nucleic Acids Res. 14: 4683–4690.
MK was supported by an HHMI postdoctoral fellowship. AC was supported by KO-8 HL03234-01. RW was supported by a Sarnoff Fellowship. SRC is an Established Investigator of the American Heart Association. This work was supported by the Daiichi Research Center, UCSF, NIH-HL44907, and NIH-DK50267.