Worm Breeder's Gazette 15(3): 40 (June 1, 1998)
These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.
Department of Zoology, University of British Columbia, Vancouver, B.C. Canada
Mammalian perlecan, the major heparan sulfate proteoglycan of the extracellular matrix, has five domains, four of which show similarity to other proteins: the LDL receptor (domain II), laminin (domains III & V) and the neural cell adhesion molecule (domain IV). We have previously shown that the unc-52 gene in C. elegans encodes several large proteins that are structurally homologous to perlecan but lack the C-terminal domain V (Rogalski et al., 1993, Genes and Dev. 7:1471-1484). That analysis identified 26 exons covering almost 15 kb of genomic DNA and revealed the presence of several alternatively spliced transcripts. Two different poly(A) addition sites located 8.5 kb apart are used to generate either a number of large polypeptides containing domains I, II, III & IV or one or more shortened polypeptides containing only domains I, II & III.
Recently, we discovered that the unc-52 gene actually spans over 20 kb of genomic DNA and consists of 37 exons. Eleven additional exons were identified downstream of exon 26 when the C. elegans genome consortium provided the sequence and annotation of cosmid C38C6. We have confirmed the intron-exon boundaries in this region by sequencing a cDNA clone obtained from Y. Kohara that extends from exon 27 to ~200 bp downstream of the putative stop codon in exon 37. In addition, several RT-PCR fragments were generated and sequenced to confirm that the newly identified exons are part of the unc-52 gene and to identify the splice sites used to join exon 26 to exon 27. The longest potential open reading frame of the unc-52 gene now encodes a 3375-amino-acid protein with a molecular weight of approximately 370 kD, consisting of a putative signal sequence and five distinct domains. The newly identified exons encode sequences that are very similar to domain V of the mouse and human perlecan polypeptides, confirming our earlier conclusion that the UNC-52 proteins are the nematode orthologs of these mammalian proteoglycans.
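As a rough consistency check on the figures above, the exon count and the approximate molecular weight can be reproduced with simple arithmetic. The sketch below assumes the common rule-of-thumb average of about 110 Da per amino acid residue; that constant and the function name are illustrative assumptions, not values from this abstract, and the true mass depends on the actual sequence and on any glycosylation.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# ASSUMPTION: this approximation is not taken from the abstract.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kd(n_residues: int) -> float:
    """Estimate the mass of an unmodified polypeptide in kilodaltons."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Figures quoted in the abstract.
ORIGINAL_EXONS = 26        # exons identified in the earlier analysis
ADDITIONAL_EXONS = 11      # exons found downstream of exon 26
LONGEST_ORF_RESIDUES = 3375

total_exons = ORIGINAL_EXONS + ADDITIONAL_EXONS
estimated_mass_kd = approx_mass_kd(LONGEST_ORF_RESIDUES)

print(f"total exons: {total_exons}")
print(f"estimated mass: {estimated_mass_kd:.0f} kD")
```

Under this assumption the estimate comes to about 371 kD, in line with the approximately 370 kD stated for the longest open reading frame.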
Curiously, domain V of the nematode protein contains a region of approximately 180 amino acids that is not found in the mammalian proteins. This region is extremely rich in threonine (45/180) and serine (19/180) residues and also contains 12 repeats of the tripeptide EEP.
In summary, the unc-52 gene uses three separate transcriptional stop regions to produce three general classes of protein isoforms: short (domains I, II & III), medium (domains I, II, III & IV) and long (domains I, II, III, IV & V). At least some of these UNC-52 polypeptides function in the basement membranes of muscle cells and are essential for the assembly and attachment of the myofilament lattice.
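The composition figures and isoform classes described above can be tabulated in a few lines. This is only an illustrative sketch: the variable names are hypothetical, and because the region length is given as approximately 180 residues, the resulting fractions are approximate as well.

```python
# Domain content of the three isoform classes named in the abstract.
ISOFORM_CLASSES = {
    "short": ("I", "II", "III"),
    "medium": ("I", "II", "III", "IV"),
    "long": ("I", "II", "III", "IV", "V"),
}

# Nematode-specific region of domain V (length is approximate).
REGION_LENGTH = 180
THR_COUNT = 45
SER_COUNT = 19
EEP_REPEATS = 12  # each EEP repeat contributes three residues

thr_fraction = THR_COUNT / REGION_LENGTH
ser_fraction = SER_COUNT / REGION_LENGTH
eep_fraction = EEP_REPEATS * 3 / REGION_LENGTH

for name, domains in ISOFORM_CLASSES.items():
    print(f"{name}: domains {', '.join(domains)}")

print(f"Thr: {thr_fraction:.1%}, Ser: {ser_fraction:.1%}, "
      f"EEP repeats: {eep_fraction:.1%}")
```

On these numbers, threonine alone accounts for about a quarter of the region, and each isoform class extends the previous one by a single C-terminal domain.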