Worm Breeder's Gazette 11(4): 43
These abstracts should not be cited in bibliographies. Material contained herein should be treated as personal communication and should be cited as such only with the consent of the author.
We have been using the 55 kb unc-22 sequence (G. Benian et al., Nature 342 (1989) 45-50, and this WBG) to test the gm automated DNA sequence analysis program (WBG Jan. 1990). One of the initial test runs on the complete sequence predicted ten short exons between positions 16700 and 22000, which together encoded an amino-acid sequence containing twitchin-like repeats. Five of these exons were shown to be true unc-22 exons by PCR sequencing of unc-22 cDNA. At analysis stringencies similar to those that give good results for the myosin sequences, gm predicts numerous minor splicing alternatives for unc-22. A typical 'best' prediction, run in 17 minutes on a workstation, correctly predicts 15 of the 20 unc-22 exons located 3' of position 15000, gets one of the two boundaries right for another 2 exons, and predicts 6 spurious exons.
A spinoff of these tests has been the prediction of several new genes in the unc-22 sequence, two of which have now been confirmed by cDNA analysis. The extents and orientations of three genes predicted in the unc-22 sequence are shown in Fig. 1. [See Figure 1] Only the 3' end of the predicted female-specific ('fem-sp') gene is contained within the unc-22 sequence. The amino acid sequence of this fragment is highly similar to that of the mouse interleukin-1 precursor (PIR ICMS1); however, more of the sequence is needed to see whether this similarity is significant. A partial cDNA overlapping this prediction has been isolated.
The predicted 'serine-rich' gene is embedded entirely within the 7.4 kb intron of unc-22. By using selected fragments from the first 24 kb of unc-22 as probes against Northerns, at least two male-specific messages have been detected; one of these messages is probably encoded by spe-17 (more in the next WBG). Neither of these messages, however, seems to be from the predicted serine-rich gene.
A number of ORFs over 100 bases in length contained within the sequence have no known function; one or more of these may encode male-specific transcripts.
The predicted 'transporter' gene is the best characterized of the predictions shown in Fig. 1. A partial cDNA overlapping this prediction has been isolated. An amino-acid sequence derived from the cDNA data and the gm run is shown in Fig. 2, together with an alignment with a mammalian glucose transporter protein generated by fasta (W. Pearson and D. Lipman, PNAS 85 (1988) 2444-2448). The predicted protein is also significantly similar to other mammalian glucose and ion transporters. [See Figure 2] The predicted protein has four highly hydrophobic regions, and is expected from the Garnier rules to be composed primarily of alpha helix, suggesting that it is an integral membrane protein. We suspect that this gene encodes a glucose transporter, or a closely related protein. D. Baillie has located at least three other potential integral membrane protein genes between unc-22 and dpy-20 (S. Prasad and D. Baillie, Genomics 5 (1989) 185-198), one of which appears to be the Na+/H+ antiporter gene (this WBG).
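The hydrophobicity argument for the predicted transporter can be illustrated with a sliding-window hydropathy scan. The abstract does not say which hydrophobicity method was used, so the following is only a minimal sketch under stated assumptions: it uses the Kyte-Doolittle scale, and the 19-residue window and 1.6 cutoff are conventional choices for spotting candidate membrane-spanning segments, not values taken from this work. The function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy index per residue (assumed scale; the abstract
# does not state which hydrophobicity method was actually used).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    # Unknown symbols (e.g. X) are scored as 0.0, an illustrative choice.
    scores = [KD.get(aa, 0.0) for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_regions(seq, window=19, threshold=1.6):
    """Return (first, last) 0-based centre positions of each run of
    consecutive windows whose mean hydropathy is >= threshold."""
    half = window // 2
    profile = hydropathy_profile(seq, window)
    regions = []
    run_start = None
    for i, mean in enumerate(profile):
        if mean >= threshold:
            if run_start is None:
                run_start = i          # a new hydrophobic run begins
        elif run_start is not None:
            regions.append((run_start + half, i - 1 + half))
            run_start = None
    if run_start is not None:          # run extends to the last window
        regions.append((run_start + half, len(profile) - 1 + half))
    return regions


if __name__ == "__main__":
    # Toy sequence (not the predicted protein): two hydrophobic stretches
    # separated and flanked by charged residues.
    toy = "D" * 10 + "L" * 20 + "K" * 10 + "I" * 20 + "E" * 10
    for first, last in hydrophobic_regions(toy):
        print(f"hydrophobic window centres {first}-{last}")
```

Run on a real sequence, each reported region is only a candidate membrane-spanning segment; the count depends on the window and cutoff, so it should be compared against secondary-structure predictions and homology rather than read as a result in itself.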