The ~100 MB genome of C. elegans contains ~20,000 protein-coding genes, many of which are required for the function of the nervous system, which is composed of 302 neurons in the adult hermaphrodite and 383 neurons in the adult male. In addition to housekeeping genes, a differentiated neuron is thought to express many hundreds if not thousands of genes that define its functional properties. These genes code for ion channels, G-protein-coupled receptors, neurotransmitter-synthesizing enzymes, transporters and receptors, neuropeptides and their receptors, cell adhesion molecules, motor proteins, signaling molecules and many others. Collectively such genes have been referred to as “terminal differentiation genes” or “effector genes”. The differential expression of distinct combinations of terminal differentiation genes defines different neuron types. This paper provides a compendium of more than 2,800 putative terminal differentiation genes. One pervasive theme revealed by the analysis of many gene families is the nematode-specific expansion of many neuron function-related gene families, including, for example, many types of ion channel families, sensory receptors and neurotransmitter receptors. The gene lists provided here can serve multiple purposes. They can serve as quick reference guides for individual gene families or they can be used to mine large datasets (e.g., expression datasets) for genes with likely functions in the nervous system. They also serve as a starting point for future projects. For example, a comprehensive understanding of the regulation of the often complex expression patterns of these genes in the nervous system will eventually explain how nervous systems are built.
Neurons are information processing devices that receive, integrate and transmit signals to induce specific patterns of behavior. Among the key defining features of a mature neuron are its specific position, morphology and physical connections (in the form of electrical and chemical synapses), its electrophysiological properties (i.e., resting potential of the cellular membrane), and the molecular means by which it receives, propagates and transmits chemical signals, either locally across synapses or over longer distances in a paracrine manner. These basic features are defined by the expression of “nuts and bolts” genes that have demonstrated or predicted functions in terminally differentiating or mature neurons (Table 1). Such genes have been referred to as “terminal differentiation genes” or “effector genes” (see Neurogenesis in the nematode Caenorhabditis elegans) and are the focus of this review. These gene families are listed in the overview Table 1 and include ~2,800 genes. Differences in the identity and function of individual neuron types can presumably be ascribed to the differential expression of specific members of these gene families.
Table 1: Summary of genes discussed in each chapter. As mentioned in the text, molecules listed in specific categories in this Table are often no more than mere candidates for being involved in the indicated function.
| Section | Gene family | Number of genes | Table |
|---|---|---|---|
| 2.1.1 | Channel types | 72 | Table 2 |
| 2.1.2 | Auxiliary subunits | 53 | Table 3 |
| 2.2 | Calcium channels, transporters and binding proteins | | |
| 2.2.1 | Voltage-gated calcium channels and auxiliary subunits | 11 | Table 4, Table 3 |
| 2.2.2 | Other calcium channels | 3 | |
| 2.2.3 | Calcium transporter | 14 | Table 5 |
| 2.2.4 | Calcium binding proteins | 65 | Table 6 |
| 2.3 | TRP channels | 23 | Table 7 |
| 2.4 | Cyclic nucleotide-gated ion channels | 6 | Table 8 |
| 2.5 | Ligand-gated ion channels (LGICs) | | |
| 2.5.1 | nAChR-type ligand-gated ion channels of the Cys-loop LGIC superfamily | 61 | Table 9 |
| 2.5.2 | Other ligand-gated ion channels of the Cys-loop LGIC superfamily (GABA-, Glutamate-gated and others) | 41 | Table 10 |
| 2.5.3 | Auxiliary subunits of the Cys-loop LGIC superfamily | 20 | Table 3 |
| 2.6 | Ionotropic glutamate receptors | | |
| 2.6.1 | Channel types | 15 | Table 11 |
| 2.6.2 | Auxiliary subunits | 8 | Table 3 |
| 2.7.1 | Channel types | 32 | Table 12 |
| 2.7.2 | Auxiliary subunits | 10 | Table 3 |
| 2.8 | Chloride channels and chloride transporters | | |
| 2.8.1 | Chloride channels | 35 | Table 13 |
| 2.8.2 | Chloride transporters | 11 | Table 5 |
| 2.9 | New ion channels | 1 | |
| 2.10 | Summary of absent ion channels | | |
| 3.1 | Neurotransmitter synthesis | 24 | Table 14 |
| 3.2 | Vesicular transport of neurotransmitters | 17 (+12) | Table 5 |
| 3.3 | Neurotransmitter reuptake | 32 | Table 5 |
| 3.4 | Neurotransmitter degradation | 12 | Table 14 |
| 3.5 | The case for and against other neurotransmitter systems | | |
| 4.1 | Neuropeptide-encoding genes | 122 | Table 15 |
| 4.2 | Biosynthesis and processing of neuropeptides | 47 | Table 16 |
| 4.3 | Neuropeptide receptors: beyond the GPCRs | 70 (+ GPCR) | Table 17 |
| 5. | G-protein coupled receptors (GPCRs) | | Table 18 |
| 5.1 | Metabotropic neurotransmitter receptors | 29 | Table 19 |
| 5.2 | GPCR-type neuropeptide receptors (+ additional candidates) | 153 (+100) | Table 20, (+Table 21) |
| 5.3 | Sensory and orphan GPCRs | ~1,280 | |
| 5.4 | Adhesion GPCRs | 5 | Table 22 |
| 5.5 | Frizzled/Taste2 GPCRs | 4 | Table 18 |
| 5.6 | Downstream of GPCRs | 83 | Table 23 |
| 6.1 | Guanylyl cyclases | 34 | Table 24 |
| 7. | Receptors of CO2 and O2 | 39 | Table 25 |
| 8. | Presynaptic machinery | 57 | Table 26 |
| 9. | Neurotransmitter receptor localization: PDZ proteins | 70 | Table 27 |
| 10. | Gap junctions | 25 | Table 28 |
| 11. | Motor proteins & their associated complexes | | |
| 11.1 | Kinesin, dynein and myosin motors | 56 | Table 29–31 |
| 11.2 | Motor complexes that build cilia of sensory neurons | 35 | Table 32 |
| 12. | Neuronal recognition and adhesion molecules | | |
| 12.1 | Immunoglobulin superfamily | 64 | Table 33 |
| 12.2 | eLRR proteins | 29 | Table 33 |
| 12.4 | Neurexins superfamily and neurexin ligands | 8 | Table 35 |
*Not the exact sum of individual numbers because some genes occur multiple times in different categories (auxiliary ion channel subunits—4 duplicates; ciliary components—7 duplicates; Ig/LRR—6 duplicates)
Structural and regulatory genes involved in cytoskeletal organization (e.g., small GTPases) or in basic cellular processes are not considered here since most of them have broad functions in many different cell types and are also sometimes only transiently expressed in the nervous system. Gene regulatory factors are also not considered because a neuronal function is difficult to predict a priori (the only exception being proneural bHLH factors; however, with a few possible exceptions, these factors usually have no function in mature neurons). The reader is referred to Neurogenesis in the nematode Caenorhabditis elegans, which describes gene regulatory factors operating during nervous system development.
The gene lists provided in this review are an update and extension of the first analysis of neurobiology-related gene families in the C. elegans genome, compiled by Cori Bargmann in the 1998 C. elegans genome issue of Science (Bargmann, 1998). The gene lists also summarize and extend many ensuing sequence analyses of individual gene families, as referenced in the respective sections below. The completeness of the analysis of individual gene families was assessed by a combination of domain searches using SMART, InterPro and Panther databases (Schultz et al., 2000; Zdobnov and Apweiler, 2001; Thomas et al., 2003; McDowall and Hunter, 2011), by analysis of gene families as shown in TreeFam (Li et al., 2006) and, if necessary, by re-iterative BLASTP searches. It cannot be excluded that more sophisticated sequence analyses may reveal additional family members. A substantial number of new gene names were assigned, many of them completely new names, and many in accordance with previously assigned names. For some gene families the numbers provided here differ from those of previous reports and database collections, e.g., InterPro domain databases (used in the description of protein families in Genomic classification of protein-coding gene families). This is because databases are populated by a large number of duplicate entries that either reflect differentially spliced isoforms arising from the same locus or trivial problems in duplicate gene naming. In contrast to these databases, the counts presented in this review rely almost entirely on manual curation of gene families, with the exception of the chemosensory subfamily of 7TMR genes with ~1,280 members, for which I relied, in large part, on the analysis by Robertson and Thomas described in The putative chemoreceptor families of C. elegans. In many cases, counts presented here also differ from previous analyses because the genome sequence was almost but not entirely complete at the time of previous analyses. In addition, gene models have sometimes changed significantly over the years as a result of improved gene prediction and experimental validation through in-depth transcriptome analysis (Gerstein et al., 2010).
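The effect of duplicate isoform entries on family counts can be illustrated with a minimal Python sketch. It is not the curation procedure used for this review, which was largely manual. The identifiers below are invented, and the sketch assumes that isoforms of one locus share a sequence name and differ only by a trailing lowercase letter; the helper names `locus_of` and `count_loci` are hypothetical.

```python
import re

# Assumption: an isoform entry is a sequence name such as "X.1"
# followed by a single lowercase letter ("X.1a", "X.1b").
ISOFORM_SUFFIX = re.compile(r"^(?P<locus>.+\.\d+)[a-z]$")

def locus_of(sequence_name: str) -> str:
    """Strip a trailing isoform letter, if present, to recover the locus name."""
    match = ISOFORM_SUFFIX.match(sequence_name)
    return match.group("locus") if match else sequence_name

def count_loci(entries) -> int:
    """Count distinct loci among isoform-level database entries."""
    return len({locus_of(entry) for entry in entries})

# Invented identifiers, for illustration only.
entries = ["ABC1.1a", "ABC1.1b", "ABC1.2", "XYZ9.4a", "XYZ9.4"]
raw_count = len(entries)          # entries as a database might list them
locus_count = count_loci(entries) # entries collapsed to one per locus
```

With these placeholder entries, five database records collapse to three loci, which is the kind of gap that can separate a raw domain-database count from a curated gene count.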
The gene lists also include a rough and superficial description of known expression patterns. As mentioned in the individual chapters below, the expression of many genes has been analyzed and neuronal expression has been confirmed (references to expression patterns and individual gene functions are most often not provided directly in the text, but the respective gene names are hyperlinked to Wormbase entries in which function and expression patterns are described in more detail and where references are provided). However, for a substantial number of genes the expression is either unknown or could not be detected in the nervous system using (perhaps incomplete) reporter gene fusion constructs; their inclusion in this compendium is solely based on the potential of the gene to determine specific neuronal properties and should not be considered a documented fact. Genes with important functions in a neuron can also have similar (or distinct) functions in a non-neuronal cell type. More information on individual genes can be found in the hyperlinked Wormbase entries for individual genes, which also provide appropriate references to the literature.
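One way such gene lists might be used to mine an expression dataset is sketched below in Python. This is a sketch under stated assumptions: the expression values and the detection threshold are invented placeholders, `placeholder-1` is not a real gene, and the function name `neuronal_candidates` is hypothetical. Only the gene names tax-4, egl-19, unc-2 and cca-1 are taken from this review.

```python
def neuronal_candidates(expression, gene_list, min_level=1.0):
    """Return genes from gene_list detected at or above min_level,
    ordered from highest to lowest expression level."""
    hits = {gene: level for gene, level in expression.items()
            if gene in gene_list and level >= min_level}
    return sorted(hits, key=hits.get, reverse=True)

# A small slice of a gene list (names from this review).
gene_list = {"tax-4", "egl-19", "unc-2", "cca-1"}

# Invented expression values in arbitrary units, for illustration only.
expression = {"tax-4": 12.5, "egl-19": 3.0, "unc-2": 0.2, "placeholder-1": 40.0}

candidates = neuronal_candidates(expression, gene_list)
```

As stressed above, a hit of this kind only flags a candidate; inclusion in a list reflects the potential of a gene to determine neuronal properties, not a documented function.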
Among the key defining features of a neuron are the enormously varied ways to regulate the electrical properties across the cellular membrane, a feat achieved through a variety of different ion channels. Most plasma membrane ion channels in the nervous system come in four distinct topologies which likely evolved independently (Hille, 2001; Jegla et al., 2009) (Figure 1):
The voltage-gated family of potassium, sodium and calcium channels. The pore-forming α subunit of both voltage-gated calcium and sodium channels contains 24 transmembrane (TM) domains, which are four repeats of a 6TM motif thought to be derived from ancestral potassium channels (Yu et al., 2005) (Figure 1). 6TM voltage-gated potassium channels, in turn, exist as tetramers, so that the complete channel also has a 24TM topology. Non-voltage-gated TRP channels and cyclic-nucleotide gated (CNG) channels, each of which also displays the 6TM topology, are related to these channels as well, as illustrated in Figure 2 (Yu et al., 2005). These channels are described below in Sections 2.1 (potassium channels), 2.2 (calcium channels), 2.3 (TRP channels) and 2.4 (CNG channels).
The cysteine-loop family of ligand-gated ion channels. These are pentameric channels with each subunit displaying a 4TM topology (Figure 1). These channels, as well as auxiliary subunits for the channels, are described in Section 2.5.
Ionotropic glutamate receptors. Unlike the LGIC-type glutamate-gated anion channels, these are tetrameric cation channels, with each subunit containing four hydrophobic segments, three transmembrane domains, and the P loop that is involved in forming the pore (Figure 1). These channels are described in Section 2.6.
P2X and ASIC channels. These channels are not obviously related by primary sequence, but show structural similarities. They each contain two transmembrane domains, assemble as trimers and form similar pores (Young, 2010). These channels are described in Section 2.7.
The C. elegans genome codes for representatives of all the main families described above, as detailed in the ensuing sections. Within specific families, individual members have been lost in the C. elegans genome, with the most notable absentees being voltage-gated sodium channels, P2X channels and HCN channels, as also discussed below.
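The four topologies listed above can be summarized as subunit transmembrane counts multiplied by subunit stoichiometry. The following Python sketch encodes only the numbers stated in the text; the class and field names are illustrative assumptions, and the P loop of ionotropic glutamate receptors is not counted as a transmembrane domain.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Topology:
    name: str
    tm_per_subunit: int  # transmembrane domains per pore-forming subunit
    subunits: int        # number of subunits in the assembled channel

    @property
    def total_tm(self) -> int:
        """Transmembrane domains in the assembled channel."""
        return self.tm_per_subunit * self.subunits

TOPOLOGIES = [
    Topology("voltage-gated K+ channel", 6, 4),          # tetramer of 6TM subunits
    Topology("voltage-gated Ca2+/Na+ alpha subunit", 24, 1),  # 4 x 6TM repeats in one chain
    Topology("Cys-loop LGIC", 4, 5),                     # pentamer of 4TM subunits
    Topology("ionotropic glutamate receptor", 3, 4),     # tetramer, 3 TM domains plus P loop
    Topology("P2X / ASIC", 2, 3),                        # trimer of 2TM subunits
]

totals = {t.name: t.total_tm for t in TOPOLOGIES}
```

The first two entries both come to 24 transmembrane domains, which reflects the point made above that a tetrameric 6TM potassium channel and a single-chain calcium or sodium channel α subunit share the same overall architecture.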
Potassium channels modulate the resting potential of a neuron and are therefore critical determinants of neuronal excitability and synaptic function. A total of 72 potassium channels are encoded in the C. elegans genome. These channels fall into three large structural classes, the 6-transmembrane (6TM), 4TM and 2TM classes (Table 2, see Figure 1 in Potassium channels in C. elegans). All three families are thought to derive from an ancestor with a core 2TM topology (Kir/Kcs class) (Yu et al., 2005). The 4TM channels (TWIK channels, see below) are thought to represent simple duplications of the 2TM topology. The 6TM channels again contain the 2TM core unit but acquired 4 additional, unrelated TM domains (note that the 6TM channel topology constitutes the basic building block of the 24TM calcium/sodium channel class, Figure 1). Even though derived from a common ancestor, potassium channels do not form a homogenous group. Voltage-gated potassium channels of the Eag family (Kv10-12) are more closely related to cyclic-nucleotide-gated channels than they are to other potassium channels (Figure 2).
The most notable feature of C. elegans potassium channels is the large expansion of the two-pore TWIK and TWIK-related channel family (TWIK stands for Tandem of Pore Domains in a Weak Inward Rectifying K+ channel), of which there are 47 members in C. elegans, most of them functionally uncharacterized (Table 2). The human genome contains only around 15 TWIK channels. The expression patterns of 20 of the TWIK channels have been examined by reporter gene fusions. Most of them are expressed in the nervous system (Potassium channels in C. elegans) (Table 2).
Voltage-gated potassium channels often associate with auxiliary subunits (Table 3). One class of such subunits is the single-pass KCNE/MinK family (four genes in mammals). There are four characterized C. elegans KCNE orthologs (mps-1, mps-2, mps-3, mps-4) that are each expressed in individual neuron types (Park et al., 2005). MPS-1, MPS-2 and MPS-3 interact with the voltage-gated potassium channel KVS-1 (Park et al., 2005). MPS-4 associates with the potassium channel EXP-2, and accelerates activation and deactivation in response to changes in voltage (Park and Sesti, 2007). In addition, the genome contains four uncharacterized genes with homology to mps-3 and four genes with homology to mps-2; all likely arose by local duplications (Table 3). Whether any of these proteins are also auxiliary subunits to potassium channels is unclear given the low degree of sequence homology.
There are four uncharacterized C. elegans genes related to the KChIP/KCNIP family of auxiliary subunits of voltage-gated channels (Pongs and Schwarz, 2010) (Table 3). The KChIP proteins, small EF hand proteins of the NCS superfamily, are unusual in that they serve not only as auxiliary subunits, but also as transcriptional regulatory proteins (Burgoyne and Haynes, 2012). Curiously, proteins highly similar to type IV dipeptidyl peptidases, which are normally involved in neuropeptide processing, have also been shown to be auxiliary subunits of voltage-gated potassium channels (Pongs and Schwarz, 2010). Seven genes in the worm genome (dpf-1 through dpf-7) encode members of the type IV dipeptidyl peptidase family, two of which (dpf-1 and dpf-2) are by far the most similar to the canonical type IV enzymes.
There is an uncharacterized C. elegans homolog (sssh-1) of the fly gene sleepless, which codes for a small GPI-anchored Ly-6/neurotoxin superfamily member that regulates the levels, localization and activity of Drosophila Shaker (Pongs and Schwarz, 2010). Even though there is no obvious worm ortholog of the Kvβ/KCNAB auxiliary subunit family (Pongs and Schwarz, 2010), this family belongs to an extended superfamily of aldo/keto-reductases. mec-14, which is thought to encode an auxiliary subunit of the MEC-4/MEC-10 degenerin channel, is a member of this superfamily too (M. Chalfie, pers. comm.).
Clear orthologs of the auxiliary subunit family Bkβ/KCNMB of calcium-activated potassium channels cannot readily be found in the C. elegans genome. The C. elegans BK channel slo-1 appears instead to use a small protein with a single transmembrane domain (bkip-1 for “BK channel interacting protein”) as an auxiliary subunit (Chen et al., 2010). bkip-1 has no paralogs in C. elegans and no obvious orthologs outside nematodes.
Sulfonylurea receptors (SURs) are auxiliary subunits of the inwardly rectifying Kir family of potassium channels in vertebrates and are members of subfamily C of the ABC transporter family (official names: ABCC8 and ABCC9). There are nine members of the ABCC subfamily in worms (Zhao et al., 2007) (Table 3), yet unlike Drosophila, the worm genome does not contain an obvious ortholog of the ABCC8/9 subfamily. Other ABCC subfamily members may have adopted the auxiliary potassium channel subunit function, with perhaps ctf-1 being the best candidate (Table 3).
TWIK channels might also rely on auxiliary subunits. A multipass transmembrane protein, UNC-93, co-localizes with the TWIK channel SUP-9 and is required for its function (de la Cruz et al., 2003). UNC-93 is phylogenetically conserved and is part of a larger family of 17 related C. elegans proteins that are presently uncharacterized (de la Cruz et al., 2003) (Table 3). This family has expanded in C. elegans, mirroring the expansion of TWIK channels. Another transmembrane protein required for SUP-9/TWIK function, called SUP-10, may also be an auxiliary subunit (de la Cruz et al., 2003), but it is not phylogenetically conserved and has no worm paralogs.
Calcium is a broadly used signaling molecule, but it also has several specialized functions in the nervous system, e.g., in synaptic vesicle release, in modulation of ion channel activity and, of course, as an ion that is itself involved in generating currents across excitable membranes in neurons. This is an absolutely critical feature of calcium since C. elegans does not generate sodium-based action potentials (Goodman et al., 1998). In this section, I will not only summarize calcium channels but also cover other genes related to “neuronal calcium”.
The molecular biology of neuronal calcium is briefly summarized in Figure 3 (Grienberger and Konnerth, 2012). Some calcium-permeant channels, namely nAChR-type receptors and glutamate receptors (NMDA, Kainate and AMPA-type), are discussed in ensuing sections (Sections 2.5 and 2.6), and so are metabotropic receptors that signal to mobilize intracellular calcium stores (Section 5.1).
Voltage-gated calcium channels (VGCCs) are composed of a pore-forming unit, the 24TM domain-containing α1 subunit, and are usually associated with auxiliary β subunits and α2δ subunits. There are five α1 subunits in the worm genome, two α2δ subunits and two β subunits (Table 4). The role of the γ subunit, a family of tetraspanin molecules, remains unclear. These molecules (two of them exist in the C. elegans genome, stg-1 and stg-2) are now thought to have a major role in AMPA glutamate receptor biology, as mentioned in Section 2.6.
α1 subunits come in three families, Cav1, Cav2 and Cav3. These correspond to the physiologically defined L-type (‘long-lasting’), N-type (‘Non-L’ or ‘neuronal’, includes the P, Q and R types) and T-type (‘transient’) channels (Catterall et al., 2005). Mammals possess several subtypes of each channel type, differing in tissue and subcellular distribution. Only a single gene for each type is found in invertebrates such as C. elegans (Table 4). Specifically, egl-19 codes for the L-type, unc-2 for a Non-L-type and cca-1 for a T-type channel.
In addition, C. elegans contains two members of the α1U branch of invertebrate and vertebrate cation channels (nca-1 and nca-2), which are more distantly related to the α1 type. The channels require two phylogenetically conserved auxiliary proteins for their correct localization, encoded by unc-79 and unc-80 (Humphrey et al., 2007; Jospin et al., 2007). There are no obvious paralogs of unc-79 or unc-80 in the worm genome.
Two types of channel proteins are involved in mobilizing calcium from intracellular stores (Figure 3). Ryanodine receptors (RyRs) represent a class of intracellular calcium channels with prominent roles in excitable cells like muscles and neurons. In vertebrates, there are three RyRs: two for different types of muscle, and one expressed more broadly but most predominantly in the brain. There is a single RyR in C. elegans, encoded by the unc-68 locus. Expression analysis originally localized the protein to muscle, but more recent studies show that the gene also functions in neurons (Liu et al., 2005). There is also a single IP3 receptor, another intracellular calcium channel, encoded by the itr-1 gene. It is expressed in the intestine but also in some neurons and muscle (Table 5).
ORAI/CRAC ion channels are unusual 4TM, tetrameric plasma membrane channels that are activated by depletion of intracellular calcium stores. This activation works through an ER-resident calcium sensor STIM1 (an EF hand protein) that is directly linked to the plasma membrane channel (Figure 3). The C. elegans genome codes for one ORAI ortholog, orai-1 and one STIM1 ortholog, stim-1. They operate in reproductive tissue (Strange et al., 2007) and their function in the nervous system has not yet been explored.
Cytosolic calcium concentrations are controlled by sodium-coupled transporters of the SLC8 and SLC24 families (Figure 3). In vertebrates, many of these proteins are expressed strongly in the brain and have various brain-specific functions (Lytton, 2007). There are three members of the SLC8 family in worms (ncx-1 through ncx-3 for “Na+/Ca++ exchangers”) and seven members of the SLC24 family (ncx-4 through ncx-10) (Table 5). None of these transporters have yet been investigated for expression or function. There are also ATPases that transport calcium across the plasma membrane. There are three such ATPases in worms: mca-1 (expressed in excretory cell), mca-2 (hypodermis) and mca-3 (many tissues including neurons). A single homolog of the SERCA-type sarco-endoplasmic reticulum Ca++ ATPase, sca-1, exists in worms (Figure 3).
Intracellular calcium binds to proteins via a number of different motifs, the most prominent being the small EF hand motif (other calcium binding motifs, such as the C2 domain also have other binding partners). A number of vertebrate EF hand proteins, calbindin, calretinin and parvalbumin, have served as “classic” markers for specific neuron types in the vertebrate nervous system. One family of EF hand proteins, the NCS (“neuronal calcium sensor”) family (14 genes in mammals) has many specialized functions in the nervous system, often relating to ion channel regulation (Burgoyne and Haynes, 2012). Generally, EF hand proteins are thought to act as either “sensor” proteins that respond to calcium with a conformational change that triggers downstream events or as “buffer” proteins that control local calcium concentration; that distinction is, however, beginning to blur (Schwaller, 2009).
There are more than 100 genes in the worm that code for easily recognizable EF hand containing proteins (by contrast, humans are thought to have several hundred), many of them with very broad cellular functions. Given plenty of precedents, the most likely candidates of these genes for neuron-specific functions are those that exclusively code for EF hands and no other domains. C. elegans contains 64 of such genes (Table 6). There is one ortholog of classic calmodulin (cmd-1), eight calmodulin-related genes (there are many calmodulin-related genes in humans, too) and seven members of the NCS family of calcium sensor proteins (14 in humans), including homologs of human NCS-1 and the KChIP/DREAM proteins (mentioned above in the context of their role as K+ channel auxiliary proteins). There are 48 additional genes that code for proteins that exclusively contain EF hands and no other domains (Table 6). Many of them are C. elegans orthologs of well-characterized mammalian proteins with well-documented roles in the nervous system, but among them are also 16 genes with no obvious vertebrate homologs. Based on sequence, there are no obvious nematode orthologs of calbindin, parvalbumin or calretinin.
The TRP (Transient Receptor Potential) superfamily of cation channels is evolutionarily related to voltage-gated ion channels: its members contain six transmembrane domains and a pore loop between the fifth and sixth transmembrane domains. However, TRP channels are generally not activated by voltage, but rather by a remarkable diversity of ligands or sensory inputs (Kahn-Kirby and Bargmann, 2006; Venkatachalam and Montell, 2007).
TRP channels fall into distinct classes based on overall sequence features (Yu et al., 2005) (Figure 2). The C. elegans genome contains 23 genes that display similarities to TRP channels. 17 of them are canonical TRP channels which fall into the TRPA, TRPC, TRPM, TRPML, TRPN, TRPV and TRPP subfamilies (Table 7) (Kahn-Kirby and Bargmann, 2006; Xiao and Xu, 2009) and one is a TRP-related TRPP1-type protein, LOV-1 (an 11-transmembrane domain protein). Five genes code for uncharacterized, multipass-transmembrane paralogs of a nematode-specific expansion (named trpl for TRP-like) (Table 7). trpl genes show relatively little sequence similarity to TRP channels, but do contain sequence signature motifs found in TRPM channels (Panther domain PTHR13800 “TRP, SUBFAMILY M”). The human genome encodes 28 TRP channel genes. Many of the C. elegans genes have been functionally analyzed and most are expressed in the nervous system. The so-far-characterized neuronally expressed TRP channels function as thermosensors, mechanoreceptors, proprioceptors or transduce signals in olfaction (Xiao and Xu, 2009).
Cyclic nucleotide gated (CNG) ion channels are signal-transducing cation-selective ion channels that form tetramers using specific combinations of α and β-type subunits. Even though they are not voltage-gated, they are members of the superfamily of voltage-gated ion channels (Yu et al., 2005) (Figure 2). C. elegans contains a total of six CNGs (Table 8). One, tax-4, encodes a canonical α subunit, whereas tax-2 encodes a canonical β subunit, and both have been implicated in various sensory paradigms (see Chemosensation in C. elegans). Vertebrates also contain six CNG channels, and all are α- or β-type. Four additional C. elegans CNGs, cng-1, cng-2, cng-3 and che-6, encode neither clear α nor clear β subunits but display a somewhat higher sequence affinity to α subunits. The expression of most of the six CNGs has been investigated, revealing expression in partially overlapping subsets of sensory neurons.
As mentioned above, hyperpolarization-activated channels (HCNs) are related in sequence to the CNGs but, in contrast to flies and vertebrates, the C. elegans genome contains no HCN orthologs (Figure 2).
Neurotransmitters signal via two types of receptors: ion channels, also called ionotropic receptors (this section), and G-protein-coupled receptors (GPCRs), also called metabotropic receptors (Section 5.1 below).
Most ligand-gated ion channels (LGICs) in the C. elegans genome fall into the cysteine-loop superfamily of ion channels, which are characterized by the presence of a disulfide bond between two invariant cysteine residues in an extracellular loop region (Figure 1). Cys-loop LGICs consist of five homologous subunits arranged in a homomeric or heteromeric manner around a central pore (Sine and Engel, 2006). In mammals, the LGIC superfamily consists of about 45 genes, insects have just over 20 such genes, but the C. elegans genome contains 102 LGIC subunit-encoding genes (Jones and Sattelle, 2008) (Figure 4). Members of this C. elegans gene family include cation-permeable acetylcholine receptors related to vertebrate nicotinic acetylcholine receptors (nAChRs, see Section 2.5.1), anion-permeable GABA receptors related to vertebrate GABAA receptors (Section 2.5.2), and glutamate-gated anion channels (Section 2.5.2) related to channels found widely in invertebrate species (Jones and Sattelle, 2008). In addition, C. elegans contains LGICs not yet identified in vertebrates and insects, including anion channels gated by acetylcholine or biogenic amines (serotonin, tyramine, dopamine) (Ringstad et al., 2009), and possibly other ligands. Of the many additional orphan LGICs (all termed lgc genes), several fall into broad families, but it is unknown how they are gated (Jones and Sattelle, 2008).
A phylogenetic analysis of the LGIC superfamily from various nematode and non-nematode species (Rufener et al., 2010) reveals that the above-mentioned groups fall into two large blocks, as illustrated in Figure 4: a very large group of the nAChR-related genes (including vertebrate and C. elegans bona-fide nAChRs, as well as many “orphan” genes, Section 2.5.1) and a block of non-nAChR-type genes (Section 2.5.2). Characterized members of the former block are cation channels and characterized members of the latter block (with the exception of exp-1) are anion channels.
In C. elegans, the group of nAChR-type ligand-gated ion channels of the Cys-loop LGIC superfamily consists of 61 diverse genes (almost 3 times as many as in mammals), some of them well-characterized (Table 9). These genes can be divided into subgroups (Jones et al., 2007). As Figure 4 illustrates, the most striking subgroups are the UNC-29 subgroup (four genes: unc-29, lev-1, acr-2, acr-3), the UNC-38 subgroup (six genes: unc-38, unc-63, acr-6, acr-8, acr-12, acr-13) and the DEG-3 subgroup (eight genes: deg-3, des-2, acr-5, acr-17, acr-18, acr-20, acr-23, acr-24). Each of these subgroups contains functionally characterized nAChR channel subunits. Notably, though, a heteromeric channel composed of DEG-3 and DES-2 proteins (encoded by an operon) appears to be a sensory receptor that responds to ambient choline (Yassin et al., 2001).
Within the large number of uncharacterized lgc genes in this group, additional subgroups can be observed (Jones et al., 2007), including some obvious recent duplications that created very close paralogous gene pairs (e.g., lgc-7 and lgc-8, lgc-16 and lgc-17). Two members of this diverse group (pbo-5 and pbo-6) function as proton-gated ion channels (Beg et al., 2008), illustrating the wide range of gating mechanisms for orphan members of this group. The expression pattern is not known for most of the orphan lgc genes (Table 9).
The group of LGICs that is phylogenetically distinct from the nAChR group contains 41 genes (Sine and Engel, 2006; Rufener et al., 2010) (Figure 4, Table 10). This group has extensively radiated and diversified in worms compared to humans, where it consists of 19 relatively closely related GABAA receptor-encoding genes and 5 glycine receptor-encoding genes (Tsang et al., 2007). Glycine receptor genes are thought to be a vertebrate-specific invention. The 41 C. elegans genes can be broadly subdivided into several subgroups based on sequence similarity. With one exception (exp-1), all characterized members of this group are anion channels:
GABA-gated ion channel subgroup. This subgroup, consisting of seven genes, codes for canonical GABA-gated chloride channels. Some of these genes are closely related to vertebrate GABAA receptors. The members are gab-1, unc-49, lgc-36, lgc-37, lgc-38 and the more distant paralogs exp-1 and lgc-35. unc-49, gab-1 and exp-1 encode bona fide GABA-gated channels (see the WormBook chapter GABA). exp-1 is the odd man out in this overall LGIC group since it is the only cation channel.
Inhibitory, ACh-gated chloride channel subgroup. This subgroup contains eight genes, including the four electrophysiologically characterized acc-1 through acc-4 genes (Putrenko et al., 2005), as well as the currently uncharacterized lgc-46, lgc-47, lgc-48, and lgc-49 genes.
Biogenic amine-gated subgroup. This subgroup contains eight genes including mod-1, which encodes an electrophysiologically characterized serotonin-gated chloride channel, lgc-55 which encodes a tyramine-gated chloride channel and lgc-53, encoding a dopamine-gated chloride channel (Pirri et al., 2009; Ringstad et al., 2009). The ligands for the remaining channel-encoding genes—lgc-50, lgc-51, lgc-52, lgc-54 and ggr-3 (whose name, GABA/Gly receptor, is a bit of a misnomer as it displays no specific affinity to ggr-1 and ggr-2, which are in a different and possibly GABA-gated subgroup)—are not yet identified. Vertebrates also have serotonin-gated channels, but those are cation-selective, and not anion-selective like MOD-1.
Glutamate-gated anion channels subgroup. This invertebrate-specific subgroup contains six genes—glc-1 through glc-4, avr-14, avr-15 (see Ionotropic glutamate receptors: genetics, behavior and electrophysiology). Receptors encoded by these genes are ivermectin-sensitive. They have been speculated to be the invertebrate homologs of glycine receptors (Vassilatis et al., 1997). From the ligand perspective, note that this family is one of two types of glutamate-gated ion channels in the C. elegans genome. The other type is unrelated to the pentameric LGICs and contains glutamate-gated cation channels related to vertebrate AMPA/kainate/NMDA receptors. These are discussed in Section 2.6.
The remaining 12 members of this group comprise nine genes that are related (ggr-1, ggr-2, lgc-39 through lgc-45) and three genes that show no affinity to any subgroup (lgc-32, lgc-33, lgc-34). The LGC-40 channel has been shown to be a low-affinity serotonin receptor that is also gated by choline and acetylcholine (Ringstad et al., 2009).
All genes are listed in Table 10. The expression of most family members is not known.
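The subgroup counts given above can be checked against the stated total of 41 genes with a short tally. The following is a minimal sketch: the gene names are transcribed from the text of this section (not from Table 10), and the helper `gene_range` and the subgroup labels are hypothetical names introduced only for illustration.

```python
def gene_range(prefix, first, last):
    """Expand a 'prefix-first through prefix-last' phrase into gene names."""
    return [f"{prefix}-{i}" for i in range(first, last + 1)]


# Subgroup membership as listed in the text above (assumption: the text is
# complete; Table 10 remains the authoritative list).
SUBGROUPS = {
    "GABA-gated": ["gab-1", "unc-49", "exp-1", "lgc-35"] + gene_range("lgc", 36, 38),
    "ACh-gated chloride": gene_range("acc", 1, 4) + gene_range("lgc", 46, 49),
    "biogenic amine-gated": ["mod-1", "ggr-3"] + gene_range("lgc", 50, 55),
    "glutamate-gated anion": gene_range("glc", 1, 4) + ["avr-14", "avr-15"],
    "remaining, related": ["ggr-1", "ggr-2"] + gene_range("lgc", 39, 45),
    "remaining, unassigned": gene_range("lgc", 32, 34),
}


def summarize(subgroups):
    """Return per-subgroup counts and the flat list of all gene names."""
    counts = {name: len(genes) for name, genes in subgroups.items()}
    all_genes = [gene for genes in subgroups.values() for gene in genes]
    return counts, all_genes


counts, all_genes = summarize(SUBGROUPS)
for name, n in counts.items():
    print(f"{name:24s}{n:3d}")
print("total:", len(all_genes), "unique:", len(set(all_genes)))
```

A flat list of this kind can also be intersected with the gene identifiers of an expression dataset to pull out candidate LGIC genes, provided the dataset uses the same gene names.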
LGICs require auxiliary subunits for their trafficking, assembly and function. The best characterized auxiliary subunits are those for the nAChRs and many of them were first identified through functional analysis in C. elegans (Table 3). These include the unrelated genes ric-3, unc-50, and unc-74 (Boulin et al., 2008), as well as nra-2 and nra-4, which encode ER-resident type I transmembrane proteins (Almedom et al., 2009). With the exception of nra-2, which is related to the nicastrin-encoding aph-2 gene, none of these genes have additional paralogs in the C. elegans genome. LEV-9, a protein with multiple Sushi/CCP domains, is an additional auxiliary subunit identified by functional analysis (Gendrel et al., 2009). The gene adjacent to lev-9, T07H6.4, encodes a protein with the same domain composition as LEV-9 but its function is unknown. Another nAChR auxiliary subunit protein, LEV-10, contains several CUB domains, an LDL domain, and a transmembrane domain. There are three more genes in the genome coding for proteins with a similar domain architecture: mig-13, neto-1 (the ortholog of vertebrate Neto1/2), and K05C4.11. To date, mig-13 has only been implicated in cell migration, not AChR function, while neto-1 and K05C4.11 are uncharacterized. An alternatively spliced form of the lev-10 locus, called eat-18, is required for cholinergic transmission in the pharynx (McKay et al., 2004).
A one-pass transmembrane protein, MOLO-1, that contains a single extracellular globular domain, the TPM domain, was recently found to be a new auxiliary subunit of nAChR (Boulin et al., 2012). The worm genome contains six molo-1 paralogs (Table 3). Vertebrate GPI-anchored or transmembrane Lynx/SLURP proteins have also been implicated in nAChR function (Jones et al., 2010). These proteins contain a characteristic LU (“Ly-6 antigen / uPA receptor”) domain. There are four C. elegans genes (lurp-1 through lurp-4) encoding proteins with a similar domain architecture, all of them uncharacterized to date (Table 3). In addition, the C. elegans genome contains 10 proteins with homology to the Ly6 domain (InterPro domain IPR010558). They all appear to originate from a nematode-specific expansion. The founding member of this family, ODR-2, was identified by its involvement in odortaxis (Chou et al., 2001). The nine paralogs of ODR-2 are called hot genes (for “homologs of odr two”) (Table 3). Although their mechanism of action is not known, the homology of all these proteins with the Lynx/SLURP-type regulators of LGICs as well as the documented neuronal function of ODR-2, suggest that these worm proteins could function as regulators of LGICs.
Even though not considered an auxiliary subunit per se, the rapsyn protein is required for clustering of nAChRs on vertebrate skeletal muscle. The C. elegans genome contains one functionally conserved rapsyn ortholog, rpy-1, which is expressed in both muscle and neurons (Nam et al., 2009). Another nAChR clustering protein is the secreted OIG-4 protein, which is composed of a single immunoglobulin (Ig) domain (Rapti et al., 2011). C. elegans has five secreted 1-Ig domain proteins (oig-1 through oig-5).
As mentioned briefly above, two types of glutamate-gated ion channels are encoded in the C. elegans genome. One group consists of inhibitory glutamate-gated anion channels, which are members of the Cys loop LGIC family and which have been described above (Table 10). The second group is composed of the highly conserved glutamate-gated cation channels (“ionotropic glutamate receptors” or iGluRs). These glutamate receptors are tetrameric and related to the AMPA, Kainate and NMDA receptors in vertebrates. There are ten subunits encoded in the C. elegans genome. Two of them form NMDA receptor-type channels (encoded by nmr-1 and nmr-2) and eight form AMPA receptor-type channels (encoded by glr-1 through glr-8) (Table 11). All of these genes are expressed in distinct and partly overlapping sets of neurons (see Ionotropic glutamate receptors: genetics, behavior and electrophysiology).
In addition, there are five related and as yet unnamed genes in the genome whose protein products share homology with the AMPA-type glr genes (e-value in BLAST search 1e-04 to 5e-09) (Table 11). They all contain predicted ligand-binding domains related to solute-binding domains in bacterial amino acid-binding proteins. They have several transmembrane segments, but tend to code for smaller proteins than the NMR/GLR proteins. These genes, as well as the more canonical glr genes glr-7 and glr-8, may belong to a newly defined subtype of iGluRs, termed ionotropic receptors (IRs), that serve as chemosensory molecules in flies (Croset et al., 2010). Three C. elegans genes (glr-7, glr-8, W02A2.5) fulfill sequence criteria to be IR genes, and two of these are expressed in pharyngeal neurons, suggesting roles in food sensing (Croset et al., 2010). It is interesting to remember here the above-mentioned LGIC proteins DEG-3 and DES-2 that serve as sensory channels for ambient choline. Perhaps it is a general feature of different types of ion channel families to be employed as sensory receptors for ambient metabolites.
Ionotropic glutamate receptors require a number of distinct auxiliary transmembrane proteins collectively called TARPs (for transmembrane AMPA receptor regulatory proteins) (Jackson and Nicoll, 2011). The C. elegans genome contains the TARP sol-1, which codes for a CUB domain protein, and stg-1 and stg-2, which code for proteins related to the vertebrate TARP stargazin. The vertebrate CUB/LDL/TM proteins Neto1 and Neto2 also function as TARPs and, as mentioned above in the context of nAChR auxiliary subunits, there are a total of four Neto1/2-like proteins encoded in the C. elegans genome (besides a Neto1/2 ortholog, neto-1, there are lev-10, mig-13 and K05C4.11). C. elegans also contains an uncharacterized homolog of the vertebrate TARP Cornichon (Jackson and Nicoll, 2011), cni-1, but lacks obvious homologs of the SynDIG1 or CKAMP44 TARPs (Table 3).
DEGenerin/Epithelial Na+ Channels/Acid sensing ion channels (DEG/ENaC/ASIC) constitute, together with the related P2X channels, the fourth type of ion channel superfamily (Figure 1). P2X-type ion channels, which are directly activated by adenosine triphosphate (ATP) (Fountain and Burnstock, 2009), can be found in all vertebrate species, in marine invertebrate species like mollusks and sea urchins, and even in fungi, but they appear to have been lost in C. elegans and Drosophila (Bavan et al., 2009).
The related DEG/ENaC/ASIC channels have been implicated in a broad spectrum of cellular functions and can be gated by a variety of distinct mechanisms, ranging from mechanosensory stimuli to pH to small ligands, such as FMRFamide peptides (see Mechanosensation; Bazopoulou and Tavernarakis, 2007). Individual proteins cross the membrane twice, have intracellular N- and C-termini and a large extracellular loop that includes a conserved cysteine-rich region. Their multimeric state was initially controversial, but recent work suggests that they are trimers (Jasti et al., 2007; Gonzales et al., 2009). The naming of this class in the literature is not always consistent: DEG, ENaC and ASIC channels are specific subtypes of these receptors and sometimes the entire family is either referred to only as ASIC or as DEG/ENaC. I refer to them here with all three names.
With a total of 30 members (Table 12), C. elegans has expanded its repertoire of DEG/ENaC/ASIC channels significantly compared to ~10 genes in mammals. There are no specific ortholog pairs of vertebrate and worm channels, suggesting independent radiation of this gene family (Bazopoulou and Tavernarakis, 2007). Even though some of their names (Table 12) may suggest otherwise, none of the C. elegans members are more closely related to vertebrate ASIC or ENaC proteins. Nevertheless, domain analysis as well as the clustering in phylogenetic trees (TreeFam TF317359) suggest that some but not all of the 30 genes fall into related subgroups (Bazopoulou and Tavernarakis, 2007) (Table 12). One of these subgroups, the “egas” subgroup, contains a peculiar domain combination of the signature ASC domain (present in all superfamily members) and multiple EGF repeats. The only other clade in which such a combination can also be found is the hemichordates. No expression patterns have yet been reported for this subfamily.
Almost half of the DEG/ENaC/ASIC channels have been characterized for expression or function. With two exceptions (flr-1 and unc-105) they are all expressed in the nervous system or have specific neuronal functions (Table 12).
Stomatins are membrane proteins thought to be auxiliary subunits that modulate the activity of DEG/ENaC/ASIC channels both in worms and vertebrates (Lapatsina et al., 2012). They are defined by the presence of a characteristic and structurally conserved core domain called the stomatin or SPFH (Stomatin, Prohibitin, Flotillin, HflK/HflC) domain. There are five mammalian stomatin genes and ten C. elegans stomatin-like genes (mec-2, unc-1, unc-24, stl-1, sto-1 through sto-6), most of them originating from an apparently nematode-specific expansion (Table 3). Even though a stomatin has only been explicitly demonstrated to be an auxiliary subunit for MEC-4/MEC-10 degenerin channels, the co-localization of UNC-1 protein with an innexin protein (Chen et al., 2007) suggests that stomatins may also be auxiliary subunits for different types of transmembrane channels (see Section 10 for innexins). This is consistent with the physical association of vertebrate stomatin-like proteins with a TRP channel (Lapatsina et al., 2012). The expression patterns of five of the ten stomatins have been analyzed and neuronal expression was detected for each of them. The MEC-4/MEC-10 DEG channel complex not only employs a stomatin as auxiliary protein, but also an oxidoreductase-related protein, MEC-14.
Plasma membrane-localized chloride channels are molecularly diverse and have many distinct functions in the nervous system. Besides the neurotransmitter-gated chloride channels mentioned above (Section 2.5.2), there are a number of additional chloride channels, some only recently identified as such (Duran et al., 2010). In a good number of cases, the distinction between chloride channels and transporters is blurry.
One major type of chloride channel is the CLC superfamily, members of which control the membrane potential of cells. Some CLC channels are voltage gated, while others function as chloride/proton exchangers. The C. elegans genome contains six members of the phylogenetically very ancient family of CLC chloride channel proteins (Schriever et al., 1999) (Table 13).
C. elegans contains a calcium-regulated chloride channel of the Tweety family, ttyh-1. The channel conducts large chloride (“maxi-Cl-”) currents. C. elegans ttyh-1 is expressed widely throughout the nervous system, but has not yet been functionally characterized. Vertebrate Tweety was recently found to be associated with synaptic vesicles (Morciano et al., 2009). C. elegans also contains two members of the recently defined anoctamin family of calcium-activated chloride channels, both also presently uncharacterized (anoh-1 and anoh-2) (Table 13).
Bestrophins are another family of plasma membrane-located, calcium-activated chloride channels (four genes in mammals) (Duran et al., 2010). Bestrophins are expressed in multiple vertebrate tissue types including the nervous system. C. elegans has significantly expanded its repertoire of these bestrophin-like genes: there are 26 family members, an expansion by a factor of more than six compared to mammals (Table 13). All proteins share a homology region (“Bestrophin” or “RFP-TM” domain) of 350-400 amino acids. Two of the three C. elegans genes whose expression has been analyzed so far with reporter genes show expression in the nervous system (Table 13).
There are no obvious worm (or fly) homologs of calcium-activated chloride channels of the CLCA family.
The very large superfamily of SLC “solute carrier” transporters (>100 genes, mentioned again in Section 3 in the context of neurotransmitter transporters) includes one family, the SLC12 family, which transports chloride across membranes and which has important functions in the nervous system. Their importance, and the reason why they are mentioned here in the context of ion channels, stems from the fact that intracellular chloride concentration determines the strength and polarity of the response to inhibitory neurotransmitters, such as GABA, that act on chloride channels (Hebert et al., 2004) (Figure 5). Specifically, in vertebrates the relative expression levels of the K+/Cl- cotransporter KCC2 (SLC12A4-7 subfamily) and the Na+/K+/2Cl- cotransporter NKCC1/2 (SLC12A1,2 subfamily) determine whether neurons respond to GABA (or other transmitters) with a depolarizing, excitatory response or with a hyperpolarizing, inhibitory response (Hebert et al., 2004). There are three homologs of the vertebrate KCC family in worms (kcc-1, kcc-2 and kcc-3) and one member of the sodium potassium chloride cotransporter Nkcc family (nkcc-1) (Tanis et al., 2009) (Table 5), compared to 4 Kcc genes and 2 Nkcc genes in humans. C. elegans kcc-2 is indeed required to determine the inhibitory action of various neurotransmitters and is expressed in the nervous system (Tanis et al., 2009). The expression and function of nkcc-1 have not yet been reported. C. elegans has two more distant homologs of the NKCC/SLC12A1-3-type Na+/K+/2Cl- cotransporter, F10E7.9 and B0303.11, which are expressed in neurons and the excretory system, respectively (Table 5).
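The dependence of response polarity on intracellular chloride can be illustrated with the Nernst equation. The following is a minimal sketch, not a model of any particular C. elegans neuron: the chloride concentrations, the resting potential and the temperature are hypothetical values chosen only for illustration, and the function names are assumptions introduced here.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def chloride_reversal_mv(cl_in_mm, cl_out_mm, temp_c=20.0):
    """Nernst potential for Cl- (valence -1), in millivolts."""
    temp_k = temp_c + 273.15
    # For an anion with z = -1 the concentration ratio is inverted:
    # E_Cl = (RT/F) * ln([Cl-]in / [Cl-]out)
    return 1000.0 * (R * temp_k / F) * math.log(cl_in_mm / cl_out_mm)


def response_polarity(cl_in_mm, cl_out_mm, v_rest_mv):
    """Classify the effect of opening a chloride channel at rest."""
    e_cl = chloride_reversal_mv(cl_in_mm, cl_out_mm)
    return "hyperpolarizing" if e_cl < v_rest_mv else "depolarizing"


# Hypothetical values: 120 mM external chloride, resting potential of -50 mV.
for cl_in in (10.0, 40.0):
    e_cl = chloride_reversal_mv(cl_in, 120.0)
    print(f"[Cl-]in = {cl_in:4.0f} mM  E_Cl = {e_cl:6.1f} mV  "
          f"{response_polarity(cl_in, 120.0, -50.0)}")
```

Under these assumed values, low intracellular chloride (as expected when KCC-type chloride extrusion dominates) places the chloride reversal potential below the resting potential, so that opening a GABA-gated chloride channel hyperpolarizes the cell; higher intracellular chloride (as expected when NKCC-type uptake dominates) reverses the sign of the response.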
There are two additional members of the SLC12 chloride transporter family, SLC12A8 and SLC12A9, and C. elegans contains an ortholog of SLC12A9 (T04B8.5), which is expressed in neurons and muscle (Table 5). SLC12A9 is thought to modulate the activity of the related Nkcc transporter (Caron et al., 2000).
Apart from the SLC12 subfamily, the sodium-dependent chloride/bicarbonate transporters of the SLC4 family are known to regulate chloride balance and pH in the nervous system (Bellemer et al., 2011) (Figure 5). There are four SLC4 members in the worm genome (abts-1 through abts-4), and all four are expressed in the nervous system, some of them only in a subset of neurons (Sherman et al., 2005). One of them, abts-1, has been directly implicated in inhibitory neurotransmission (Bellemer et al., 2011).
New types of ion channels are still being discovered. Two entirely new cation non-selective, plasma membrane channels with a >30 transmembrane topology were identified in 2010 in vertebrates, called Piezo1 and Piezo2 (Coste et al., 2010). More recently Piezo proteins were shown to be the pore forming unit of a new type of mechanoreceptor (Coste et al., 2012). These channels bear no obvious homology to any other type of ion channels. C. elegans contains a single ortholog of the Piezo family (T20D3.11 fused to C10C5.1). Given the Piezo family precedent it would not be surprising if more ion channels remain to be identified.
In summary, genome sequence analysis shows that the following types of ion channels are notably absent in C. elegans: voltage-gated sodium channels, glycine-gated ion channels, P2X channels, and HCN channels. Note that the absence of voltage-gated sodium channels is not generally indicative of an absence of classic action potentials in C. elegans since all-or-none action potentials can, at least in C. elegans muscle, be generated by voltage-gated calcium channels as well (Gao and Zhen, 2011; Liu et al., 2011). In most cases the absence of the channel is considered to be a loss since it is paralleled by the absence of the channel in some but not all invertebrates (P2X channels, HCN channels, voltage-gated sodium channels). In one case (glycine-gated LGIC) the channel may have only originated in the vertebrate lineage (Tsang et al., 2007).
The steps of synthesis, vesicular loading and reuptake of individual neurotransmitters are referred to as a “neurotransmitter pathway”. C. elegans is known to use as neurotransmitters acetylcholine, GABA, glutamate, serotonin, dopamine, octopamine and tyramine (and most likely more, such as melatonin), and the respective neurotransmitter pathways are shown in Figure 6A. The reader is referred to other chapters in WormBook that discuss these neurotransmitter systems in more detail (GABA; Biogenic amine neurotransmitters in C. elegans; Acetylcholine). I provide here an overview and summary of the genomic complement of confirmed and speculative neurotransmitter pathway genes (Table 14).
Acetylcholine (ACh) is synthesized from choline by the enzyme choline acetyltransferase (cha-1, Figure 6A). In vertebrates, ATP citrate lyase, which generates the CoA cofactor for the acetyl transfer reaction (Figure 6B), is broadly expressed but becomes largely restricted to cholinergic neurons in the mature nervous system. Two ATP citrate lyase orthologs (acly-1, acly-2) are encoded in the worm genome, both uncharacterized; perhaps one of them has specialized for its role in cholinergic neurotransmission, while the other may perform a more general metabolic function. In vertebrates, choline is not generated de novo in neurons but synthesized in the liver by a specific biosynthetic pathway and then taken up by neurons through the choline transporter ChT. Curiously, C. elegans, like plants and fungi, has a distinct pathway for choline synthesis, the PEAMT pathway, which allows neurons to generate choline cell-autonomously from phospho-ethanolamine (thereby lessening the importance of the ChT homolog cho-1 in C. elegans) (Brendza et al., 2007; Mullen et al., 2007). The key enzymes in this alternative pathway are pmt-1 and pmt-2 (Brendza et al., 2007).
GABA is synthesized from the amino acid glutamic acid by the enzyme glutamic acid decarboxylase (GAD), encoded by unc-25 (Figure 6A). Vertebrates contain several GAD isozymes, but C. elegans only contains one.
Glutamate metabolism in the vertebrate nervous system is remarkably complex and not well explored in C. elegans. Vertebrate neurons do not express a pyruvate carboxylase for de novo synthesis of glutamate but are rather provided with glutamate from support cells (astrocytes), which convert glutamate to glutamine via glutamine synthetase and then provide glutamine to neurons. Neurons then synthesize glutamate from glutamine using glutaminase. In C. elegans, there is one pyruvate carboxylase (pyr-1), 4 glutamine synthetases (gln-1, gln-2, gln-3, gln-5—a nematode-specific expansion) and 3 glutaminases (glna-1, glna-2, glna-3—again a nematode-specific expansion). Whether these biosynthetic enzymes are also differentially expressed in C. elegans neurons and putative support cells is not known.
Most of the synthesis pathways of monoamine neurotransmitters are multistep processes (summarized in Figure 7). For dopamine biosynthesis, tyrosine is first hydroxylated by tyrosine hydroxylase (TH, one gene in C. elegans, cat-2) to produce L-Dopa and for 5-HT synthesis, tryptophan is hydroxylated by tryptophan hydroxylase (TPH, one gene in C. elegans, tph-1) to produce 5-hydroxytryptophan (Figure 6A, Figure 7). Both TH and TPH require a cofactor, tetrahydrobiopterin (BH4), which is generated through a multistep biosynthetic process (Figure 6C). Of the enzymes involved in this process (Figure 6C, Table 14), only the GTP cyclohydrolase encoded by the cat-4 locus has been analyzed to date in worms and is expressed exclusively in serotonergic and dopaminergic neurons, as expected. After their generation by TH and TPH, both L-Dopa and 5-hydroxytryptophan are then decarboxylated by the same aromatic amino acid decarboxylase (AAAD), encoded by bas-1, to produce dopamine and serotonin, respectively. Besides bas-1, there are four more genes in the genome that code for AAADs (Hare and Loer, 2004) (Table 14): one of them likely an inactive enzyme, two of them with unknown substrates and the last one being tyrosine decarboxylase (tdc-1), which is utilized to generate tyramine from tyrosine. In one of the two neuron classes that synthesize tyramine (i.e., express tdc-1), tyramine is converted by tyramine β-hydroxylase (tbh-1) to octopamine (Figure 6A, Figure 7).
Melatonin is a biogenic amine that can act as a neuromodulator in various species (Hardeland and Poeggeler, 2003). Melatonin has been detected in worms and is involved in regulating locomotory behavior (Tanaka et al., 2007). Melatonin is synthesized through the N-acetylation of serotonin by serotonin N-acetyltransferase (called AANAT for arylalkylamine N-acetyltransferase) (Figure 7). There are many N-acetyltransferases encoded in the worm genome and one of them, anat-1, is most closely related to AANAT (Migliori et al., 2011). anat-1 is expressed in several uncharacterized neuron types and is functionally also uncharacterized. N-acetylated serotonin is then converted into melatonin by hydroxyindole-O-methyltransferase (HIOMT, homt-1 in C. elegans), which is also neuronally expressed, but functionally uncharacterized (Tanaka et al., 2007).
Vesicular transporters for small-molecule neurotransmitters fall into the phylogenetically conserved SLC superfamily of solute carriers. The SLC18 family is called the “vesicular amine transporter family” (He et al., 2009) and contains the vesicular transporter for biogenic amines (dopamine, serotonin, tyramine, octopamine), encoded by cat-1, and the acetylcholine vesicular transporter, encoded by unc-17 (Table 5).
The SLC32 family (“Vesicular inhibitory amino acid transporter family”) contains as its sole C. elegans member the vesicular GABA transporter, encoded by unc-47. To be localized appropriately within GABA neurons, the UNC-47 protein requires the phylogenetically conserved LAMP (lysosome associated membrane proteins)-like protein UNC-46, which is exclusively expressed in GABA neurons (Schuske et al., 2007). Vertebrate UNC-46 homologs are also expressed in a neuron-type specific manner (David et al., 2007). UNC-46 is distantly related to the more canonical C. elegans LAMP proteins lmp-1 and lmp-2. There are no other obvious paralogs of UNC-46.
The SLC17 family of transporters contains several bona fide neurotransmitter transporters (Reimer and Edwards, 2004) and has significantly expanded in worms. The SLC17 family is subdivided into several subfamilies (Table 5). The vertebrate SLC17A6-8 subfamily is composed of vesicular glutamate transporters (VGluTs). C. elegans has three members of this subfamily: the well characterized eat-4 gene and two closely related, likely VGluTs, called vglu-2 and vglu-3. Mammalian genomes also encode three VGluTs, but the C. elegans genes represent an independent expansion. The vertebrate SLC17A1-5 subfamily contains vesicular aspartate/glutamate transporters and C. elegans contains one uncharacterized homolog of this subfamily, C38C10.2 (Table 5). The SLC17A9 subfamily contains vesicular nucleotide transporters and C. elegans contains one homolog of this subfamily, vnut-1, which is also uncharacterized. Notably, however, there are no obvious homologs of ionotropic (P2X) or metabotropic (P2Y) purinergic neurotransmitter receptors and as such the substrate for vnut-1 is not clear. In addition, C. elegans contains nine genes (eight of which constitute a C. elegans-specific expansion) that are clear SLC17 superfamily members, but show no homology to any specific SLC17 subfamily (Table 5).
Casting the web even wider and considering more divergent SLC17 family members, the expansion of the C. elegans family is even more obvious—there are at least 51 SLC17 members in worms and only nine in humans (Hoglund et al., 2011). Most of this expansion appears to be nematode-specific (see the 43 genes in TreeFam tree TF315412). The role of these additional members in the nervous system, if any, is unknown.
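The extent of the family expansions quoted in this and the preceding sections can be compared side by side. The following is a minimal sketch using only counts stated in the text; the dictionary layout and function name are assumptions introduced for illustration, and the approximate mammalian DEG/ENaC/ASIC count (~10) is treated as exactly 10.

```python
# (worm gene count, mammalian or human gene count) as quoted in the text.
FAMILY_COUNTS = {
    "DEG/ENaC/ASIC": (30, 10),            # "~10" in mammals, taken as 10 here
    "bestrophin": (26, 4),
    "SLC17 (broad definition)": (51, 9),
}


def fold_expansion(counts):
    """Return the worm-to-mammal gene count ratio for each family."""
    return {family: worm / other for family, (worm, other) in counts.items()}


folds = fold_expansion(FAMILY_COUNTS)
# Print families from the largest to the smallest relative expansion.
for family, fold in sorted(folds.items(), key=lambda item: -item[1]):
    print(f"{family:26s}{fold:5.1f}x")
```

Such ratios are only a rough guide, since they depend on how broadly each family is defined and do not distinguish nematode-specific duplications from losses in other lineages.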
Recent work demonstrated the localization of the concentrative nucleoside transporter CNT2 (an SLC28 family member) on synaptic vesicle membranes in rat (Melani et al., 2012). It is therefore possible that the two worm CNT2 homologs slc-28.1 and slc-28.2 may be involved in vesicular uptake of adenosine (discussed more in Section 3.5).
Another substance present in neurotransmitter-containing vesicles is zinc. Zinc is used in a variety of distinct processes in many cell types but particularly notable is its presence in many glutamatergic vesicles in the mammalian nervous system (Bitanihirwe and Cunningham, 2009). Neurons with zinc-containing glutamatergic synaptic vesicles have been termed “gluzinergic” (Bitanihirwe and Cunningham, 2009). Synaptic zinc modulates the overall excitability of the brain through effects on voltage-gated calcium channels, glutamatergic, GABAergic, dopaminergic and nicotinic receptors (Bitanihirwe and Cunningham, 2009). Zinc is transported into synaptic vesicles through members of the SLC30 family of SLC carriers (10 in humans) (Lichten and Cousins, 2009). There are 12 SLC30 members encoded in the worm genome; six show sequence affinity with specific human SLC30 subtypes and six are diverse (Table 5). At least one of them is expressed in a subset of neurons (toc-1). There are two worm orthologs (cdf-2, ttm-1) of the human SLC30A2/3/4/8 subfamily, which contains SLC30A3/ZnT3, the best characterized zinc synaptic vesicular transporter.
Members of the SLC transporter superfamily mediate the reuptake of a neurotransmitter once it has been released at the synapse (He et al., 2009). Depending on the neurotransmitter system, reuptake of the neurotransmitter (or a break-down product such as choline) occurs exclusively by the presynaptic cell, by adjacent cells, or by a combination of both.
The C. elegans genome contains homologs for all canonical reuptake transporters, based on both sequence and functional analysis (Table 11): one transporter for serotonin (mod-5, SLC6 family); one for dopamine (dat-1, SLC6 family); one for GABA (snf-11, SLC6 family); one for choline, the breakdown product of acetylcholine (cho-1, SLC5 family); and six glutamate plasma membrane transporters (glt genes, SLC1 family). The glutamate transporters are expressed in multiple distinct cell types (similar to the situation in vertebrates where glutamate is taken up not by the neuron but by surrounding tissue) (Mano et al., 2007), while the other transporters are mostly restricted in their expression to the neurons that have produced the transmitter (with the exception of the snf-11 GABA transporter which may be expressed only in a subset of GABAergic neurons) (Mullen et al., 2006).
However, this is not the full story. The SLC6 family, which is called “Na+/Cl- dependent neurotransmitter transporter family”, contains the serotonin, dopamine and GABA reuptake transporters as well as 14 additional members, three of them with low sequence similarity and perhaps not acting as transporters (Table 5). One of them, snf-6, codes for a muscle-expressed acetylcholine/choline transporter (Kim et al., 2004) and another, snf-12, is expressed in the hypodermis and involved in an immunity response (its cargo is unknown) (Dierking et al., 2011). The remaining genes are completely uncharacterized, but two of them are expressed in neurons and six of them represent a nematode-specific gene expansion (Table 5). Any of these genes may encode reuptake transporters for known neurotransmitter systems for which no plasma membrane reuptake transporter has yet been identified (e.g., biogenic amines like tyramine, octopamine or trace amines), or for uncharacterized neurotransmitters.
Aside from the Na+/Cl- dependent neurotransmitter reuptake of monoamines by the SLC6 family, alternative monoamine uptake mechanisms are known to exist. The recently identified human plasma membrane monoamine transporter PMAT is a low-affinity, high capacity and Na+/Cl- independent monoamine transporter (Engel et al., 2004) and is a member of the small family of SLC29 transporters (four mammalian members). Another SLC29 family member is also selectively expressed in the mammalian brain (Dahlin et al., 2009) and a knockout of a fly SLC29 family member shows various neurophysiological defects (Knight et al., 2010). There are seven members of the SLC29 family in the C. elegans genome (ent-1 through ent-7) (Table 5). ent-1 and ent-2 are expressed in non-neuronal cells and the remaining genes have not yet been characterized (Table 5).
Either before or after reuptake several neurotransmitters are degraded (schematically shown in Figure 6A). Acetylcholine is already broken down in the synaptic cleft by acetylcholinesterases (AChE, ace genes) before the breakdown product, choline, is taken up by the presynaptic, cholinergic neuron. While mammals only contain a single AChE gene, C. elegans contains four (Table 14). The four ace genes are expressed in cholinergic neurons as well as other cell types.
Monoamine neurotransmitters such as dopamine and serotonin are removed from the synapse by specific plasma membrane reuptake transporters, as mentioned above. In mammals, two monoamine oxidases (MAO-A and MAO-B) are important for subsequent degradation of monoamine neurotransmitters. The C. elegans genome encodes several proteins with homologies to MAO, with AMX-2 being the most similar to mammalian MAO-A and MAO-B (Table 14). In an alternative degradation pathway the catechol-O-methyltransferase COMT degrades monoamines. In contrast to the single enzyme in humans, there are 5 COMT-like proteins encoded in the C. elegans genome, all uncharacterized (Table 14). None of these genes have been functionally analyzed to date.
In insects MAO activity is weak and instead the major enzyme for monoamine breakdown is serotonin N-acetyltransferase AANAT (anat-1 in worms) (Tsugehara et al., 2007). Whether anat-1, which is involved in melatonin synthesis in worms (Migliori et al., 2011), is also employed for serotonin degradation is not known. anat-1 is expressed in multiple unidentified neurons.
The GABA degradation pathway consists of the enzymes GABA transaminase (GABA-T in vertebrates, gta-1 in worms) and succinic semialdehyde dehydrogenase (SSADH, alh-7 in worms) (Table 14). The genes encoding these enzymes have not been functionally analyzed to date in worms.
There are several neurotransmitters in other organisms whose existence in C. elegans is unclear, such as glycine, purines (mainly ATP and ADP), histamine and trace amines. Since electrophysiological methods for measuring neurotransmitter-induced or -modulated currents are not readily applicable to C. elegans neurons, the only readily available route to identify neurotransmitter systems is to seek homologs of neurotransmitter pathway genes in the genome.
Based on sequence homology, the C. elegans genome does contain a vesicular transporter for glycine (unc-47), which also transports GABA. unc-47 is, however, only expressed in those cells that by immunostaining also contain GABA. At most, glycine can therefore only be a co-transmitter with GABA. C. elegans contains no clear ortholog of the glycine reuptake transporter GlyT (SLC6A5), while it does contain a GABA transporter ortholog (snf-11). Like other invertebrates, the C. elegans genome also contains no obvious orthologs of glycine-gated ion channels.
The C. elegans genome contains an as-yet-uncharacterized gene with similarity to the vesicular transporter for nucleotides (vnut-1) (Sreedharan et al., 2010). After synaptic release, ATP is thought to be hydrolyzed to ADP by ecto-ATPases, of which there are three in the worm genome (mig-23, ntp-1, uda-1), and reuptake may occur via concentrative nucleoside transporters of the SLC28 family (CNT1,2,3 in mammals), of which there are two in the worm genome (slc-28.1 and slc-28.2), both uncharacterized (Table 5). Uptake could also occur via the alternative, equilibrative nucleoside transporter family SLC29 (ENT1, 2, 3, and 4 in mammals; seven worm homologs, ent-1 through ent-7) (Table 5). However, both the ATPases and the nucleoside transporters are generally thought to have broad physiological roles, and their existence in the worm genome can therefore not be taken as strong evidence for the use of ATP as a neurotransmitter. Lastly, and perhaps most indicative, no obvious orthologs of ionotropic or metabotropic purine receptors (P2X and P2Y) exist in the C. elegans genome.
Adenosine is not traditionally considered a neurotransmitter, but it has been shown to be involved in modulating neuronal activity through P1-type G-protein-coupled receptors (Webster, 2001). Recent work demonstrated excitation-dependent release of adenosine from vertebrate neurons, supporting its role as a neurotransmitter (Melani et al., 2012). There is a clear worm ortholog of the adenosine receptors, ador-1, which is equally related to A1, A2 and A3-subtype P1 adenosine receptors. No expression or functional analysis has been reported yet. As noted above, there are two worm homologs of the SLC28 transporters and seven SLC29 transporters, both of which transport nucleosides like adenosine across membranes. Intriguingly, vertebrate CNT2 (one of the SLC28 family members) has recently been localized on synaptic vesicles in the rat brain (Melani et al., 2012).
Epinephrine and Norepinephrine. Epinephrine (adrenaline) and norepinephrine (noradrenaline) are generally thought to be restricted to deuterostomes and indeed they cannot be detected biochemically in C. elegans (Sulston et al., 1975). The biogenic amines octopamine and tyramine, which exist as trace amines in vertebrates, are generally thought to be the invertebrate “analogs” of norepinephrine and epinephrine (Roeder, 1999; Roeder, 2005). This is because of their related structures (Figure 7), but also because of striking similarities in their principal physiological roles (as discussed in detail for octopamine and norepinephrine in Roeder (1999)). However, on an enzymatic pathway level, the absence of epinephrine and norepinephrine cannot be predicted. The enzyme tyramine-β-hydroxylase (TBH-1), which generates octopamine in C. elegans, is closely related to dopamine-β-hydroxylase, which generates norepinephrine (Figure 7). There are three paralogous genes in the C. elegans genome (anmt-1 through anmt-3) (Table 14) that display similarity to phenylethanolamine N-methyltransferase (PNMT), which generates epinephrine. These genes are, however, equally related to indolethylamine N-methyltransferase (INMT) and nicotinamide N-methyltransferase (NNMT), which modify xenobiotic compounds.
Histamine. Histamine is biochemically detectable in worm extracts (Pertel and Wilson, 1974) but no specific role has yet been ascribed to histamine. Biosynthetically, histamine generation requires a histidine decarboxylase (HDC), a member of the family of aromatic amino acid decarboxylases (AAADs) (Figure 7). There is no obvious worm ortholog of vertebrate or fly HDC, but the worm genome contains five AAADs (Hare and Loer, 2004) (Table 14). One is the tyramine-producing enzyme TDC-1, another is the dopamine- and serotonin-producing enzyme BAS-1, both mentioned above. There is a close bas-1 paralog that lacks residues required for AAAD function (basl-1). Another uncharacterized AAAD (hdl-1) displays no specific affinity to any subtype and the last one (hdl-2) is somewhat more distantly related to the other AAADs. Vesicular transport of histamine can occur via the biogenic amine vesicular transporter cat-1 (Duerr et al., 1999). However, there are no obvious homologs of metabotropic histamine receptors in the genome of worms (or flies) (Roeder, 2003) and there are also no obvious orthologs of histamine-gated ion channels. On the other hand, GABAA channels have recently been shown to be modulated by histamine in vertebrates (Saras et al., 2008) and there are numerous GABAA-like channels in the worm genome. Arguing against the presence of histamine as a neurotransmitter in C. elegans is the lack of the two major enzymes involved in histamine breakdown, histaminase (diamine oxidase) and histamine methyltransferase. There are also no worm homologs of the Drosophila tan and ebony genes which generate histamine via an alternative pathway.
Melatonin. Even though melatonin is produced in C. elegans (Migliori et al., 2011) (Figure 7), its role as a neuromodulator in C. elegans is not proven. Arguing for a neuronal role is that the knockout of the HOMT enzyme produces locomotory defects (Tanaka et al., 2007); arguing against it is the absence of obvious homologs of the GPCR-type melatonin receptors MT1 or MT2 in the C. elegans genome (Tanaka et al., 2007). The best BLAST hits against vertebrate MT1/2 receptors are several npr-type putative neuropeptide receptors. Similarities between MT1/2 receptors and NPY receptors have been noted (Metpally and Sowdhamini, 2005). Melatonin can also signal via nuclear hormone receptors in mammals (Becker-Andre et al., 1994).
Other trace amines. As mentioned above, tyramine is generated by decarboxylation of the aromatic amino acid tyrosine by the AAAD enzyme TDC-1. In vertebrates, the decarboxylation of other aromatic amino acids (phenylalanine and tryptophan) by AAAD generates phenylethylamine and tryptamine, respectively (Figure 7). Both are called “trace amines” because they are found in only very small amounts in the vertebrate CNS, but they do have effects when administered to the brain (Webster, 2001). Since bas-1 and tdc-1 expression has already been assigned to specific aminergic neurons (i.e., serotonergic, dopaminergic, tyraminergic and octopaminergic), perhaps the above-mentioned AAAD hdl-1 and/or hdl-2 are involved in creating these trace amines. cat-1 could be involved in synaptic vesicle uptake and the above-mentioned orphan SLC6 family members in synaptic reuptake. Since trace amines are thought to be signaling through GPCRs, some of the many orphan GPCRs in the worm genome (some of which have homology to trace amine receptors, e.g., srsx-22 or srsx-25, BLAST score ~1e-09) could serve as receptors.
Synephrine, a metabolite of octopamine, is another trace amine that may exist in C. elegans. For this to be the case, one of the three above-mentioned possible homologs of PNMT (anmt-1 through anmt-3) would need to be able to use octopamine as substrate.
Taken together, the points raised above do not make definitive arguments for or against the existence of the additional neurotransmitter/neuromodulator systems in C. elegans. The best candidates for as-yet-unexplored neuromodulator systems are melatonin and adenosine. The substantial number of uncharacterized orphan vesicular transporters and reuptake transporters, as well as the impressive number of orphan ligand-gated channels (lgc genes mentioned above) and GPCRs, strongly suggests that as-yet-uncharacterized neurotransmitter/neuromodulator systems exist in the worm.
At present 122 neuropeptide genes encoding over 250 distinct neuropeptides have been identified in the C. elegans genome (Li and Kim, 2010). Of these, 40 genes encode insulin-like peptides (ins genes), 31 genes encode FMRFamide-related peptides (flp genes), and 51 genes encode non-insulin, non-FMRFamide-related neuropeptides (most encoded by the so-called nlp genes) (Li and Kim, 2010) (Table 15). Some of the nlp genes show similarities to neuropeptides of other species (Nathoo et al., 2001). There are likely more neuropeptides awaiting discovery, as evidenced by the slow trickle of newly discovered neuropeptides that initial analyses did not uncover. For example, recent proteomic analysis has identified a PDF peptide homolog (Janssen et al., 2009), genetic screens for behavioral mutants identified a previously uncharacterized neuropeptide, snet-1 (Yamada et al., 2010), and recent searches for C. elegans homologs of neuropeptides for which receptors are predicted in the worm genome (see Section 5.2 below) revealed a putative oxytocin homolog, ntc-1 (Beets et al., 2012; Garrison et al., 2012).
Expression patterns have been analyzed for more than 60 of the neuropeptide-encoding genes using reporter gene fusions. As summarized by Li and Kim, these studies show that all ins and flp genes examined and most of the nlp genes are expressed in the nervous system, with each gene having a unique expression pattern (Li and Kim, 2010) (Table 15). So far, more than half the cells in the nervous system express at least one neuropeptide gene but this number may increase because genes involved in neuropeptide release (e.g., unc-31) appear to be very broadly if not ubiquitously expressed in the nervous system. Furthermore, many individual neurons clearly express multiple neuropeptide genes.
Neuropeptides are produced as proproteins that contain several individual neuropeptides. These precursors are first cleaved by proprotein convertases, then the C-terminal basic residue is cleaved by a carboxypeptidase (Li and Kim, 2010). Afterwards most neuropeptides become further modified through amidation. There are four Kex2/subtilisin-like proprotein convertases (PCs) in the C. elegans genome (Table 16), one uncharacterized ortholog of a chaperone for the protease (sbt-1) and one carboxypeptidase E (egl-21) which has been explicitly linked to neuropeptidergic signaling (Li and Kim, 2010). In addition, the genome contains two as-yet-uncharacterized homologs of the carboxypeptidase D family (cpd-1 and cpd-2), which are functionally related to the E family (Dong et al., 1999). These genes are the only representatives of the neuropeptide-processing carboxypeptidases in C. elegans, formerly called “regulatory carboxypeptidases” and now called the M14B subfamily of metallocarboxypeptidases (TF315592) (see Merops database). C. elegans also contains a number of additional M14A-type carboxypeptidases (Merops database) but, in light of the function of their orthologues, they all likely act in digestion.
While egl-3 and egl-21 are well characterized in terms of function and expression (Kass et al., 2001; Jacob and Kaplan, 2003), much less is known about the amidation process. Studies in other organisms have shown that amidation involves the modification of a peptide-glycine processing intermediate by two enzymatic activities: a mono-oxygenase (called PHM) that associates with copper and molecular oxygen to hydroxylate glycine, and an enzyme (PAL) which cleaves the hydroxyglycine moiety to produce the peptide-amide (Prigge et al., 2000). In humans, both enzymatic activities reside in a single protein called PAM. In flies these two activities have split into two proteins (PHM and PAL). Curiously, the C. elegans genome contains genes for both versions: there is an as-yet-uncharacterized homolog of the multi-enzyme PAM protein encoded by pamn-1, as well as PHM and PAL orthologs, encoded by pghm-1 and pgal-1, respectively (Table 16). pamn-1 is expressed in the nervous system; the other genes have not been studied yet.
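The three processing steps described above (convertase cleavage, carboxypeptidase trimming, and amidation of a C-terminal glycine) can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not taken from the text: convertases are assumed to cut only after runs of two or more basic residues (K/R), all trailing basic residues are trimmed, and the precursor sequence and function names are hypothetical. Real processing sites are more varied, so this is a toy model rather than a prediction tool.

```python
import re

# Assumption: a run of two or more K/R residues marks a convertase site.
BASIC_RUN = re.compile(r"[KR]{2,}")


def convertase_cleave(proprotein: str) -> list[str]:
    """Step 1 (proprotein convertase): cut after each basic run."""
    fragments, start = [], 0
    for match in BASIC_RUN.finditer(proprotein):
        fragments.append(proprotein[start:match.end()])
        start = match.end()
    if start < len(proprotein):
        fragments.append(proprotein[start:])
    return fragments


def carboxypeptidase_trim(fragment: str) -> str:
    """Step 2 (carboxypeptidase): remove trailing basic residues."""
    return fragment.rstrip("KR")


def amidate(peptide: str) -> str:
    """Step 3 (PHM then PAL, or PAM): a C-terminal Gly becomes an amide."""
    return peptide[:-1] + "-NH2" if peptide.endswith("G") else peptide


def process_precursor(proprotein: str) -> list[str]:
    """Apply the three steps in order and drop empty products."""
    trimmed = (carboxypeptidase_trim(f) for f in convertase_cleave(proprotein))
    return [amidate(p) for p in trimmed if p]


# Hypothetical precursor with two FMRFamide-like peptides.
print(process_precursor("AAFMRFGKRSDPNFLRFGRKEE"))
```

In this toy precursor the two glycine-extended intermediates end up amidated, while the final fragment, which lacks a C-terminal glycine, is left unmodified.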
Analogous to classic neurotransmitter removal by reuptake or degradation, the activity of neuropeptides is also terminated through specific mechanisms. No reuptake mechanisms have yet been discovered for neuropeptides; rather, signal-termination mechanisms work through the degradation of the neuropeptide by specific proteases. A number of different protease families have been implicated in this signal termination process, with the neprilysins being the best studied (Turner et al., 2001). Neprilysins are zinc metallopeptidases present on the outer surface of cells. The neprilysin family has significantly expanded in C. elegans, containing 27 members (Table 16), while there are only five neprilysin-like proteases in mammals. Two of the nep genes have been implicated in the execution of several distinct C. elegans behaviors (Spanier et al., 2005; Yamada et al., 2010).
Other proteases implicated in neuropeptide degradation (Isaac et al., 2009) are class III and class IV dipeptidyl peptidases (of which there are eight encoded in the C. elegans genome), the angiotensin-converting enzyme family (of which there is only one C. elegans homolog encoded by acn-1; the protein product is, however, predicted to be catalytically inactive), and tripeptidyl peptidase II (one C. elegans gene, tpp-2) (Table 16).
The most prevalent class of neuropeptide receptors are 7TM G-protein-coupled receptors (GPCRs). These types of receptors and their relationship to neuropeptide signaling will be discussed in the ensuing GPCR section (Section 5). However, GPCRs are not the only neuropeptide receptors, as exemplified by the neuromodulatory insulin peptides. The insulin-like peptide encoded by ins-1 acts in a neuromodulatory manner to affect the local activity in a chemosensory circuit (Tomioka et al., 2006). As mentioned above, there are 40 insulin-encoding genes, and all genes examined so far are expressed in restricted domains within the nervous system, suggesting neuronal functions for many other insulins as well. DAF-2 is the only insulin/IGF receptor-like tyrosine kinase in the C. elegans genome and several insulin-like peptides, such as ins-1, are known to signal through DAF-2 (Tomioka et al., 2006). Intriguingly, the C. elegans genome also contains an unusual type of insulin receptor-related protein (Dlakic, 2002), encoded by the irld genes (for insulin/EGF receptor L domain). Unlike DAF-2, these proteins do not contain tyrosine kinase domains but only contain extracellular Cys-rich L domains (related to LRR domains), found in insulin and EGF receptors (IPR000494). Like the insulin-encoding ins genes, genes for these types of putative insulin-binding proteins have vastly expanded in the worm genome (69 genes) (Table 17). Many of these genes are recent duplicates. One third of these proteins contain transmembrane domains or membrane anchors and may be part of atypical insulin receptor complexes. L domain-only proteins cannot be found in Drosophila or vertebrate genomes (Dlakic, 2002).
Other types of neuropeptide receptors exist. The currents of ENaC/DEG/ASIC channels from snails and vertebrates are modulated by FMRFamide and related neuropeptides (Askwith et al., 2000; Jeziorski et al., 2000). Whether some C. elegans ENaC/DEG/ASIC-channels are gated by FMRFamides is not known. In any case, it should be kept in mind that non-GPCRs may act as receptors for other neuropeptides as well.
C. elegans contains more than 1,300 predicted GPCRs (see The putative chemoreceptor families of C. elegans). The human genome is thought to code for around 800 GPCRs (Lagerstrom and Schioth, 2008). Detailed bioinformatic analysis has organized GPCRs into five classes (Lagerstrom and Schioth, 2008) (Table 18). The Rhodopsin class (formerly class A) is by far the largest and contains a diverse set of members, including aminergic, peptidergic and olfactory receptors. The Secretin class (formerly class B) is much smaller and contains various peptidergic receptors related to the founding member of the family, the secretin receptor. The Adhesion class (formerly also part of class B) contains GPCRs with largely expanded N-terminal regions that are involved in cell adhesion; they usually also contain a “GPS domain” (for “GPCR proteolytic site”) that is required for cleavage of the N-terminal extracellular domain. The Glutamate receptor class (formerly class C) contains metabotropic glutamate and GABA receptors. The Frizzled/Taste2 class contains Wnt receptors and a newly discovered class of taste receptors. As shown in Table 18, C. elegans contains members of each class. Rather than discussing them according to this classification system, I will discuss them by function and type of ligand.
The C. elegans genome contains three GPCRs for ACh (gar-1, gar-2, gar-3), two for GABA (gbb-1, gbb-2), and three for glutamate (mgl-1, mgl-2, mgl-3). These receptors can be clearly identified based on sequence features (Table 19). The metabotropic glutamate receptors mgl-1, mgl-2 and mgl-3 fall into the three ancestral groups of mGluRs (Kuang et al., 2006). In addition, there are two more, as-yet-uncharacterized GPCRs in the genome, C30A5.10 and F35H10.10, that show significant similarities to mGluRs and display a “Ligand-binding domain of metabotropic glutamate receptor” annotated by the CCD domain database. F35H10.10 contains signature motifs of the GPCR family 3 (also called family C, IPR000337).
The ACh GPCRs fall into the Rhodopsin class of GPCRs (like most other GPCRs), while the GABA and Glu GPCRs fall into the Glutamate receptor class (class C) of GPCRs (IPR000337). Aside from the above-mentioned genes there are no other Class C family members in the worm genome.
C. elegans contains 16 GPCRs for monoaminergic neurotransmitters (Table 19). These receptors fall into the Rhodopsin or “Class A” family of GPCRs. Four biogenic amines have been identified as neurotransmitters in C. elegans so far: serotonin, dopamine, tyramine and octopamine (as mentioned above, there may be more). Several but not all of these receptors can be assigned to specific ligands by sequence alone. Biochemical and functional analysis has assigned most of the receptors to specific ligands as shown in Table 19. Reporter-based expression patterns exist for all 16 family members, revealing their expression in a restricted set of neurons. Several of the receptors are also expressed in muscle cells. In addition to the 16 genes encoding aminergic GPCRs, there is also another gene, C24A8.6, that likely arose through a local inversion of the neighboring dop-6 gene. Parts of this gene are identical to dop-6, but as the identity is limited to one restricted part of the protein, C24A8.6 is unlikely to form a functional GPCR. More information on biogenic amines and their receptors can be found in Biogenic amine neurotransmitters in C. elegans.
As mentioned above, even though it is not clear whether adenosine can generally be classified as a neurotransmitter in any system, a clear ortholog of a GPCR-type adenosine receptor, ador-1, is encoded in the worm genome but has not yet been characterized in terms of expression or function.
Apart from some known exceptions described in Section 4.3, neuropeptides generally signal through GPCRs. Neuropeptide GPCRs are mostly of the Rhodopsin class, but some belong to the Secretin class (Table 18). Predicting neuropeptide receptors by sequence (i.e., distinguishing them from other types of GPCRs, such as olfactory GPCRs) is not as straightforward as it is for metabotropic receptors of classic neurotransmitters. Nevertheless, some basic sequence features are conserved enough in neuropeptide receptors to assess the number of neuropeptide receptors in the worm genome, as was done in several previous studies (Keating et al., 2003; Wenick and Hobert, 2004; Janssen et al., 2010). By revisiting these previous datasets and supplementing them with additional analysis, a list of 153 genes can be assembled whose products share significant homology to known neuropeptide receptors (Table 20, see legend for methodology). This list includes the 14 C. elegans neuropeptide receptors that have been biochemically deorphanized as FLP or NLP receptors (Li and Kim, 2010). The cutoff for significance was set to a BLAST E-value of 1e-04 to exclude known non-neuropeptide receptor GPCRs (three deorphanized true sensory receptors, odr-10, srbc-64, and srg-37, and one likely sensory receptor, str-2, were used to set this threshold as they all have E-values larger than 1e-04).
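The significance cutoff described above amounts to a simple filter on BLAST results, which can be sketched as follows. The gene names and E-values below are invented placeholders for illustration only (they are not values from Table 20), and the function name is hypothetical; only the 1e-04 threshold comes from the text.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    gene: str      # query GPCR gene (placeholder names below)
    evalue: float  # best BLAST E-value against known neuropeptide receptors


# Threshold stated in the text: hits must be more significant than 1e-04.
E_VALUE_CUTOFF = 1e-4


def candidate_receptors(hits: list[Hit], cutoff: float = E_VALUE_CUTOFF) -> list[str]:
    """Return genes whose best hit is below the E-value cutoff, sorted by name."""
    return sorted(hit.gene for hit in hits if hit.evalue < cutoff)


# Invented example data: two candidates and two sensory-receptor-like controls.
hits = [
    Hit("gpcr_a", 3e-25),
    Hit("gpcr_b", 5e-6),
    Hit("sensory_control_1", 2e-1),
    Hit("sensory_control_2", 1e-2),
]
print(candidate_receptors(hits))
```

Smaller E-values indicate more significant matches, so the controls, whose E-values exceed the cutoff, are excluded, mirroring how the known sensory receptors were used to set the threshold.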
About one quarter of the 153 putative neuropeptide receptors stand out in their obvious relation to vertebrate neuropeptide receptor families, such as the neuropeptide Y receptor family (at least 12 members in C. elegans), the neuromedin receptor family (at least 6 C. elegans genes), the neurokinin/neuropeptide FF/orexin receptor family (6 genes), the somatostatin family (6 genes), the galanin receptor family (4 genes), and the gonadotropin-releasing hormone receptor family (8 genes) (Cardoso et al., 2012) (Table 20). Most of the remaining C. elegans neuropeptide receptors cluster into related and likely paralogous families, some of them quite large. For example, one family contains 20 genes, termed frpr genes, and clusters with the Drosophila FMRFamide receptor FR (TF316702). Two of the family members have been shown biochemically to be FLP receptors (Li and Kim, 2010). However, in spite of its similarity with other members of this family, one family member (daf-37) was recently reported to be a sensory receptor for ascarosides, glycolipids that worms use to communicate with one another (Park et al., 2012). Another family with 17 genes, the dmsr family, clusters with Drosophila dromyosuppressin receptor Dms-R2 (TF315509), which serves as a receptor for a FMRFamide peptide (Klose et al., 2010). Indeed one of the worm family members, EGL-6, has been shown to be a receptor for FLP-10 and FLP-17 (Ringstad and Horvitz, 2008). Members of another family (TF315321) show low similarity to the fly FMRFamide receptor, but the degree of similarity is consistent across most family members. Many of the remaining putative neuropeptide receptors also fall into smaller families and two of them are known to be activated by NLP peptides (Table 20). Some of them are obvious sequence orthologs of fly or vertebrate receptors (Table 20). Three of them are secretin-type (formerly class B-type) neurotransmitter receptors, pdfr-1, seb-2 and seb-3 (for secretin/class B-type receptor).
A recent analysis of GPCRs in the C. elegans genome comes to a similar conclusion (Frooninckx et al., 2012).
Almost all of the >20 candidate neuropeptide receptor-encoding genes for which an expression pattern has been examined show expression in a restricted number of neurons (Table 20). Given that many neuronally expressed peptides modulate transmission at neuromuscular junctions or contraction of other organ types (e.g., the gut (Nichols et al., 2002)) it is likely that many of the receptors will also be expressed in muscle or other non-neuronal cell types.
However, there are possibly many more than the 153 neuropeptide receptors described above. As Robertson and Thomas note in The putative chemoreceptor families of C. elegans, the divergent srw gene family, which consists of 100 members, is related to neuropeptide receptors. A BLAST analysis of representative genes from individual srw subfamily branches illustrates this notion (Table 21). Since almost 90% of srw genes reside in large clusters on the arms of chromosome V, they likely represent a nematode-specific expansion. Whether they serve to monitor internal signals or serve as receptors for environmental peptides, as suggested by Robertson and Thomas, remains to be seen. In addition, several members of the srsx family of GPCRs (37 genes) show significant BLAST hits to neuropeptide receptors. Any of those GPCRs could of course also be receptors for other chemical substances such as lipids or other small organic molecules. That this is more than just a mere possibility is illustrated by daf-37 and daf-38, two GPCRs recently reported to be receptors for ascarosides (as mentioned above, these are glycolipids that worms use to communicate with one another) (Park et al., 2012). Both daf-37 and daf-38 display significant homology to neuropeptide receptors (Table 21), illustrating the diversity and unpredictability of ligands for GPCRs.
Robertson and Thomas identified ~1,280 likely chemoreceptor genes in the genome, which they classify into several groups in Table 1 of The putative chemoreceptor families of C. elegans. The evidence for the function of these GPCRs as chemoreceptors is very strong: First, for the many dozen genes for which expression profiles have been examined, expression is clearly observed in sensory neurons (Troemel et al., 1995; Colosimo et al., 2004; Chen et al., 2005). Second, several of the receptors are localized to the sensory dendritic endings of chemosensory neurons where the chemosensory apparatus is thought to be concentrated (Dwyer et al., 1998; Colosimo et al., 2004). Third, since the discovery of odr-10 as a receptor for the odorant diacetyl (Sengupta et al., 1996; Zhang et al., 1997), four additional genes (srbc-64, srbc-66, srg-36, srg-37) have been shown to code for receptors for ascarosides (Kim et al., 2009; McGrath et al., 2011). Fourth, as pointed out by Robertson and Thomas (The putative chemoreceptor families of C. elegans), it is hard to imagine what else other than sensory receptors could be encoded by such a large gene family.
The sensory modalities of the sensory GPCRs may be diverse. Apart from small chemicals (e.g., diacetyl for ODR-10), some of the GPCRs may also be light detectors (Edwards et al., 2008; Liu et al., 2010). Sensory modalities are difficult to predict. For example, the UV receptor lite-1 is a member of a small subfamily of worm GPCRs (five gur genes) that were initially recognized based on their similarity to Drosophila gustatory receptors (Robertson et al., 2003).
It is noteworthy that a substantial number of the chemosensory GPCR family members are not only expressed in sensory neurons, but also in distinct sets of interneurons and motorneurons (a few samples are shown in The putative chemoreceptor families of C. elegans). In fact, some are only expressed in interneurons, e.g., sra-11 (Troemel et al., 1995). Since more than one third of so-far-analyzed chemosensory GPCRs are expressed in non-sensory neurons (Troemel et al., 1995; Chen et al., 2005), it is to be expected that hundreds of the total of ~1,280 receptors may be expressed in many different parts of the nervous system. These genes may code for receptors that monitor neuronal or non-neuronally-derived internal signals. These signals could range in their composition from peptides to small organic substances to lipids, all of which are recognized by GPCRs in other systems. These signals may be highly species-specific. For example, there are no worm homologs for a good number of vertebrate GPCRs known to sense different types of lipids (Kostenis, 2004). Additionally, there are no homologs of cannabinoid GPCRs (key enzymes in endocannabinoid synthesis are also missing), no homologs of the LPA receptor, and no homologs of the FFA (free fatty acid receptor) GPCRs. Worms may use different types of lipids for signaling.
Together with a small number of secretin-type hormone receptors (three in C. elegans), the adhesion GPCR family used to be part of the class B family of GPCRs, but is now recognized as its own family (Lagerstrom and Schioth, 2008) (Table 18). The signature features of adhesion GPCRs are (1) an extended N-terminus that contains distinct sets of domains involved in protein-protein interactions (hence their presumed role in cell adhesion), (2) a signature GPS domain involved in autoproteolytic cleavage of the N-terminal extracellular domain, and (3) the 7TM part which is related to the secretin/hormone-type GPCRs.
While humans contain a diverse set of proteins in the adhesion group, each with distinct extracellular domains (33 in total) (Lagerstrom and Schioth, 2008) and diverse functions (Yona et al., 2008), C. elegans only contains five members of this group (Table 22). Three of them are obvious orthologs of well-characterized fly and human proteins: fmi-1 (Flamingo ortholog), lat-1, and lat-2 (latrophilin orthologs). fmi-1 functions in neuronal development and synapse formation (Steimel et al., 2010; Najarro et al., 2012) and lat-1 has a role in synaptic transmission (Willson et al., 2004). C. elegans contains single homologs of the proteins that latrophilins are thought to interact with, neurexin (nrx-1 in worms) and teneurin-1 (ten-1 in worms), both with presumed functions in synapse formation.
Methuselah is the founding member of a GPCR subfamily that expanded specifically in Drosophila and is considered a third subgroup of the former “class B” class of GPCRs (Harmar, 2001). Methuselah has documented functions in synaptic transmission (Song et al., 2002) and can bind various neuropeptides (Ja et al., 2009). Based on similarity identified in the Panther database (PTHR12011), there are two genes in C. elegans related to Methuselah, mth-1 and mth-2. In contrast to fly Methuselah proteins, the two worm proteins contain the signature GPS domain which clearly places them in the adhesion group of class B GPCRs. The function and expression of mth-1 and mth-2 have not yet been examined.
There are no Taste2 GPCRs in the C. elegans genome, but there are four Frizzled-type GPCRs (Table 18), which are receptors for Wingless-type signaling molecules (of which there are five in the C. elegans genome). In both C. elegans and other organisms, there is abundant evidence for the role of Frizzled signaling in neuronal development (for a review of the C. elegans work, see Wnt signaling). In flies and vertebrates, there is evidence for a role of Frizzled in synapse function, particularly in the form of anterograde and retrograde feedback signals at the synapse (Speese and Budnik, 2007). In worms, the evidence for adult neuronal functions of Frizzleds is the dependence of AMPA receptor localization on a Frizzled effector gene, β-catenin (Dreier et al., 2005) and the observation that reporters for the two Frizzled receptors mom-5 and cfz-2 are expressed in a subset of adult neurons. The expression or function of other Frizzleds in the adult nervous system has not yet been reported.
GPCRs usually signal through heterotrimeric G-protein complexes by acting as nucleotide exchange factors (signaling downstream of Frizzled GPCRs tends to be more complex (Speese and Budnik, 2007)). There are 21 Gα-, two Gβ-, and two Gγ-encoding genes in the C. elegans genome (see Heterotrimeric G proteins in C. elegans), compared to 21, 5 and 19 genes in humans (Table 23). Three Gα-encoding genes (gsa-1, egl-30, goa-1), both Gβ-encoding genes and one Gγ-encoding gene (gpc-2) are very broadly expressed; most of the remaining 18 Gα-encoding genes and one Gγ-encoding gene (gpc-1) are expressed in a highly neuron-type-specific manner, most of them in sensory neurons (see Heterotrimeric G proteins in C. elegans).
The activity of GPCRs is controlled by a number of regulatory factors, including G protein coupled receptor kinases (GRKs), of which there are two in the worm genome (grk-1 and grk-2) (Table 23), and arrestin. Arrestins regulate the inactivation, internalization and trafficking of GPCRs (Gurevich and Gurevich, 2006). Mammalian genomes encode four conventional arrestins, which are either specifically expressed in visual systems (vertebrate cone arrestin and rod arrestin) or are broadly expressed (arrestins 2 and 3). There is one conventional arrestin in the C. elegans genome, called arr-1 (Table 23).
Intriguingly, a functionally uncharacterized family of arrestin-related proteins (now called α-arrestins) has been recognized in genomic sequences across phylogeny (Alvarez, 2008). Their overall sequence homology to classic arrestin is relatively low, but their predicted overall structure appears highly related. Classic arrestin is composed of two modules with antiparallel β-sheets (Arrestin-N and Arrestin-C domain) which are similar to an Fn3 module (Aubry et al., 2009). The arrestin-related proteins (called ADCs or ARRDCs for “Arrestin domain containing”) share this structure (Aubry et al., 2009). There are five of these ARRDC proteins in humans, but C. elegans has vastly expanded its repertoire of this type of protein—it contains 31 arrd genes coding for ARRDC proteins. As listed in Table 23, expression of the three arrd genes whose expression has been analyzed to date was detected in subsets of neurons. It is intriguing to think about this family expansion in the context of expansion of the worm repertoire of GPCRs, as well as G proteins (above). The limited number of neurons in the worm means that individual neurons likely express dozens of GPCRs. To distinguish between activation of different GPCRs (and to therefore permit discrimination between distinct inputs), GPCR subfamilies may be able to hook up with distinct downstream signaling molecules to produce distinct signaling outputs.
GPCR function is also regulated by recently discovered transmembrane proteins of the RAMP family and by the GASP proteins (Magalhaes et al., 2012); there are no obvious homologs of either type of protein encoded in the worm genome.
Heterotrimeric G-protein signaling is controlled by various regulatory factors, including guanine nucleotide exchange factors (GEFs) and guanine nucleotide dissociation inhibitors (GDIs). Apart from the GPCRs themselves, there is at least one other dedicated GEF encoded in the worm genome, ric-8, and there are three GDI proteins, each of which contains “GoLoco” domains (Table 23). A large family of G-protein regulators is encoded by the rgs genes (“regulator of G protein signaling”) (Porter and Koelle, 2009). There are 21 members of this family encoded in the worm genome (compared to 38 in humans), many with either broad or cell-type specific expression in the nervous system and specific roles in nervous system function (Porter and Koelle, 2009) (Table 23).
Guanylyl cyclases generate a second messenger molecule, cGMP, which is predominantly used in the nervous system of C. elegans, as assessed by the neuron-type specific expression of most guanylyl cyclases (Table 24). Generally, guanylyl cyclases exist either as soluble, cytoplasmic versions (sGCs) or in single pass transmembrane versions with large extracellular domains (rGCs) (Potter, 2011). There are 27 gcy genes in the C. elegans genome coding for rGCs and seven gcy genes coding for sGCs (Table 24), compared to five and four genes in human, respectively (Fitzpatrick et al., 2006; Ortiz et al., 2006).
Members of both families are thought to be receptors for small ligands. In C. elegans, molecular oxygen is the ligand for sGCs, with different sGCs being tuned to detect different oxygen concentrations (Gray et al., 2004; Zimmer et al., 2009). sGCs are exclusively expressed in sensory neurons (all likely oxygen-sensory neurons).
rGCs are expressed in many sensory neurons (where several of them localize to sensory dendrites), but are also expressed in non-sensory neurons. Specific functions have been identified in non-sensory neurons (Shinkai et al., 2011), suggesting that rGC function is not restricted to sensing environmental cues. No specific ligands have yet been identified for any C. elegans rGC proteins, but there are many diverse candidates: (1) small peptides, in analogy to the ligands of mammalian GCY proteins (Potter, 2011); (2) salt ions, based on the role of rGCs in sensory taste perception (Ortiz et al., 2009); and (3) other small organic molecules (based on the presence of the “extracellular ligand-binding receptor domain”, IPR001828, which can also be found in metazoan glutamate receptors and bacterial amino acid transporters).
rGCs may also be activated by CO2 (Hallem et al., 2011; Brandt et al., 2012). In contrast to the other putative ligands of rGCs, which likely act through the extracellular domain, CO2 sensing may work through the intracellular catalytic domain, in analogy to vertebrate rGCs (Potter, 2011). This process may entail the conversion of CO2 into bicarbonate, catalyzed by carbonic anhydrases (see below, Section 7).
GCAPs (guanylate cyclase-activating proteins) and GCIPs (guanylate cyclase-inhibitory proteins) are calcium-binding, EF hand proteins that control GCY activity in the phototransduction pathway in the vertebrate retina (Palczewski et al., 2004). The closest relatives to the GCAP and GCIP proteins in the C. elegans genome are the NCS proteins NCS-1, NCS-2 and NCS-3 (Table 6). The expression of ncs-1 in many sensory neurons matches the expression of receptor-type gcy genes.
Aside from cGMP-gated ion channels (CNG) discussed above, other critical neuronal effectors of cGMP are cGMP-dependent protein kinases, of which there are two encoded in the worm genome: egl-4, which is expressed in the nervous system and has various nervous-system-associated functions (see Chemosensation in C. elegans); and pkg-2, which is presently uncharacterized.
cGMP levels are controlled by phosphodiesterases. The C. elegans genome encodes six of these enzymes (pde-1 through pde-6), each of which has a closely-related human ortholog (Conti and Beavo, 2007) (Table 24). Based on this orthology, pde-4 and pde-6 are cAMP-specific while pde-1, pde-2, pde-3 and pde-5 control cGMP levels. pde-1, pde-2 and pde-3 are expressed in the nervous system; the expression of other genes has not been investigated. Six pde genes do not reflect a nematode expansion as there are more than 20 human pde genes; the clear one-to-one orthology observed for human and worm PDEs (Conti and Beavo, 2007) is rarely seen in other gene families analyzed in this paper.
Like vertebrate rGCs, C. elegans rGCs are activated by CO2 (Hallem et al., 2011; Brandt et al., 2012). This process may entail the conversion of CO2 into bicarbonate, catalyzed by carbonic anhydrases. Expression of carbonic anhydrase is generally considered to be a hallmark of CO2 responsive neurons (Bretscher et al., 2011). The C. elegans genome encodes eight predicted carbonic anhydrases, six of the α family (cah-1 through cah-6) and two of the mitochondrial β family (bca-1 and bca-2) (Table 25). All six cah genes are expressed in restricted patterns in the nervous system (Bretscher et al., 2011).
The above-mentioned sGC proteins are not the only oxygen sensors. Globin domain proteins are heme proteins important for oxygen transport, storage and sensing (Weber and Vinogradov, 2001). The globin glb-5 has been implicated specifically in neuronal oxygen sensing (McGrath et al., 2009; Persson et al., 2009). With 33 members, the globin family has dramatically expanded in nematodes (Hoogewijs et al., 2008; Tilleman et al., 2011). All but three of the genes are expressed in restricted patterns in the nervous system (Hoogewijs et al., 2008), suggesting neuron-type specific functions (Table 25). These proteins are not likely to simply act as buffers or sinks but are rather thought to be involved in signaling events, triggered by oxygen binding (Persson et al., 2009). Why so many genes are needed for the seemingly simple task of oxygen binding is mysterious but likely relates to the fact that C. elegans navigates through environments with highly variable ambient oxygen concentrations. For example, oxygen concentrations in the soil vary from 1%-21%, depending on depth from the surface as well as soil properties such as compaction, aeration, and drainage (Anderson and Ultsch, 1987).
The machinery required for synaptic transmission is composed of many highly conserved proteins. Some components of the core SV fusion machinery were discovered by genetic studies of C. elegans neuronal function (see Synaptic function).
A SNARE complex mediates synaptic vesicle fusion. The core SNARE complex at the synapse is a ternary complex that is composed of two plasma membrane proteins (tSNAREs), syntaxin and SNAP25, plus a vesicle associated protein of the synaptobrevin/VAMP family (vSNARE) (Wang and Tang, 2006; Parpura and Mohideen, 2008). Based on the presence of conserved Q and R amino acids, SNARE proteins have also been classified as Q-SNAREs and R-SNAREs. Q-SNAREs usually function as plasma membrane tSNAREs, and R-SNAREs function as vesicle-associated vSNAREs. In addition, SNAREs require an “SM” (Sec1/Munc18) protein to function, which is UNC-18 at C. elegans synapses.
A number of SNARE proteins are not directly involved in neurotransmitter exocytosis, but do show selective expression in specific domains of the nervous system and have distinct neuron-specific functions, including neurotransmitter receptor trafficking, neuronal morphogenesis and neurite outgrowth (Wang and Tang, 2006).
In C. elegans, the vesicle-associated vSNARE synaptobrevin is encoded by snb-1, and the plasma membrane-associated tSNAREs syntaxin and SNAP-25 are encoded by unc-64 and ric-4/snap-25, respectively. The C. elegans genome contains eight additional, mostly uncharacterized VAMP/synaptobrevin family members, including orthologs of Sec22 and Ykt6, and nine additional syntaxin family members (Table 26). Besides the canonical SNAP-25 ortholog, ric-4, there are two more SNAP-25 related genes, aex-4 and snap-29, and seven additional Q-SNAREs (Table 26).
A key feature of SNARE-mediated fusion at synapses is that it is calcium-dependent, and the calcium sensor synaptotagmin is required for synaptic vesicle fusion (Parpura and Mohideen, 2008). The calcium-sensing synaptotagmins have also expanded, with six additional snt genes besides the first characterized worm synaptotagmin snt-1 (Table 26). With >10 synaptobrevins, syntaxins and synaptotagmins each, mammals contain roughly the same number of these types of genes.
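The family sizes implied by the preceding paragraphs can be tallied with a few lines of Python. This is only a bookkeeping sketch: the dictionary layout and the function name `family_total` are illustrative choices, and the counts assume that the "additional" numbers quoted in the text (eight VAMP/synaptobrevin, nine syntaxin, two SNAP-25-related and six synaptotagmin genes) are each added to a single canonical gene.

```python
# Canonical worm gene plus the number of additional family members quoted
# in the text. The data structure itself is an illustrative assumption.
snare_families = {
    "synaptobrevin/VAMP": {"canonical": "snb-1", "additional": 8},
    "syntaxin": {"canonical": "unc-64", "additional": 9},
    "SNAP-25-related": {"canonical": "ric-4", "additional": 2},
    "synaptotagmin": {"canonical": "snt-1", "additional": 6},
}


def family_total(entry):
    """Total family size: one canonical gene plus the additional members."""
    return 1 + entry["additional"]


totals = {name: family_total(entry) for name, entry in snare_families.items()}

for name, total in totals.items():
    canonical = snare_families[name]["canonical"]
    print(f"{name}: {total} genes (canonical member: {canonical})")
```

Under these assumptions the worm totals fall in the range of 7 to 10 genes for synaptobrevins, syntaxins and synaptotagmins, which is what the comparison with the ">10" mammalian genes per family refers to.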
C. elegans contains single copies of many of the integral components of synaptic vesicles and of regulatory factors involved in the synaptic vesicle cycle, as shown in Table 26. The reader is referred to Synaptic function for more detail.
Dense core vesicles are thought to secrete neuropeptides and represent a class of vesicle distinct from small vesicles that carry fast acting, “classic” neurotransmitters. The C. elegans genome encodes single orthologs of several proteins involved in the dense core vesicle cycle, including unc-31/CAPS, and the IA2-related tyrosine phosphatase ida-1 (Li and Kim, 2010).
Genes that are involved in the synaptic vesicle cycle are expressed in most if not all neuron types. They are often not restricted to the nervous system and are expressed in other tissue types as well.
Apart from calcium, synaptic vesicle release is regulated by neuronal GPCR signaling (Perez-Mansilla and Nurrish, 2009). One critical node in the integration of GPCR signals and synaptic vesicle release is the control of diacylglycerol (DAG) production, which in turn controls unc-13 activity. DAG levels are controlled by PLCβ (one clear ortholog in C. elegans, egl-8) and diacylglycerol kinases, of which there are five in the worm genome (dgk-1 through dgk-5). All of those for which expression has been analyzed (dgk-1, dgk-3, dgk-4) show neuronal expression and dgk-1 has been implicated in synaptic function (Perez-Mansilla and Nurrish, 2009).
Various types of scaffolding proteins organize neurotransmitter receptor localization in postsynaptic densities. The most prominent type of such scaffolding proteins contains PDZ domains (Feng and Zhang, 2009). The C. elegans genome contains 70 proteins with easily recognizable PDZ domains (Table 27); the human genome contains several hundred. Several of the C. elegans PDZ domain proteins are orthologs of well-characterized PDZ proteins known to localize neurotransmitter receptors, while most of them have unknown functions. In C. elegans, the LIN-10 protein is known to be required for neurotransmitter receptor localization (Glodowski et al., 2005) and the multiple-PDZ protein encoded by mpz-1 colocalizes with the GPCR-type serotonin receptor SER-1 (Xiao et al., 2006). Other multidomain PDZ proteins may similarly be involved in neurotransmitter receptor clustering.
Gap junctions electrically couple cells. Within the C. elegans nervous system gap junctions form a giant neuronal network that can be broken down into subnetworks based on the number of gap junction contacts (Majewska and Yuste, 2001). Apart from their role in controlling circuit activity in the mature C. elegans nervous system (Starich et al., 2009; Kawano et al., 2011), gap junctions also have developmental roles during neuronal circuit formation (Chuang et al., 2007; Yeh et al., 2009).
Gap junctions in both vertebrates and invertebrates are formed by oligomerization of six homomeric or heteromeric subunits on each cell membrane. Hemichannels on either cell membrane connect in either a homotypic or heterotypic manner to form a gap junction. Recent work suggests that hemichannels alone may also have functions, possibly as some sort of “leak” channels that allow small molecules to leave the cell (Scemes et al., 2009).
In vertebrates, the gap junction subunits are called connexins (20 in humans) and pannexins (three in humans), while invertebrate gap junction subunits are called innexins (Starich et al., 2001; Scemes et al., 2009). All proteins have a similar topology of four transmembrane domains with cytoplasmic N- and C-terminal tails. Vertebrate connexins and invertebrate innexins share no notable primary sequence similarity, but vertebrate pannexins (3 genes) were actually identified based on their similarity to innexins (hence the term “pannexin”).
The C. elegans genome contains 25 innexin-encoding (inx) genes (Altun et al., 2009), about the same number as there are vertebrate connexin and pannexin genes. Based on a genome-wide reporter gene analysis, 20 out of the 25 inx genes are expressed in the nervous system (Table 28), many of them with striking cellular specificity (Altun et al., 2009). Expression of innexin genes—and hence gap junction circuitry—appears to be remarkably dynamic (Chuang et al., 2007; Altun et al., 2009).
Innexin function appears to be regulated by stomatins (Chen et al., 2007), which, as described above, also regulate other channels. There are 10 stomatin-encoding genes in the worm genome (Table 3).
Besides several broad cellular roles, members of the kinesin, dynein, and myosin superfamily molecular motors have specific roles in neuronal function, plasticity, and morphogenesis (Hirokawa et al., 2010). Kinesins have been divided into specific subfamilies, several of which carry out essential functions in all dividing cells, while the most relevant for nervous system development are the subfamilies involved in synaptic vesicle transport and intraflagellar transport (see below and also The sensory cilia of Caenorhabditis elegans). 21 kinesin-like proteins are encoded in the C. elegans genome (Table 29), some characterized (e.g., unc-104, osm-3, klp-6), some completely uncharacterized. In addition, C. elegans contains three atypical kinesins, one with a well-documented function in axon pathfinding (vab-8).
Cytoplasmic dyneins consist of heavy chains, light intermediate chains, intermediate chains, and light chains (which fall into three further subfamilies). C. elegans contains representatives for each family, with some of the families being notably expanded (Table 30).
Lastly, the C. elegans genome contains 18 genes coding for proteins with myosin motor domains, including classic muscle myosin, but also several genes that are expressed in neurons (Table 31).
Cilia are microtubule-containing organelles that emanate from the surfaces of most animal cells. There are two types of cilia: motile cilia used for locomotion or for the generation of fluid flow, and non-motile (primary) cilia, which are implicated in sensing the environment (see The sensory cilia of Caenorhabditis elegans). Unlike in many other organisms, including humans, sensory neurons are the only ciliated cell type in C. elegans. Sixty of the 302 neurons of the hermaphrodite are sensory neurons that possess non-motile, primary cilia, many with a striking morphological diversity (http://www.wormatlas.org).
In order to build ciliated structures and transport specialized functional components (such as receptors and ion channels) into this structure, a sophisticated anterograde and retrograde transport machinery exists that traffics substrates from the so-called transition zone to the distal ends of the ciliated dendrites (The sensory cilia of Caenorhabditis elegans). Specific types of kinesins provide the anterograde motor activity, and specific dyneins the retrograde motor activity (Table 32). The intraflagellar transport complexes that contain the motor proteins contain a substantial number of additional proteins that fall into three separate modules, the IFT-A, IFT-B, and BBSome modules (amounting together to a total of 27 proteins). A “parts list” of these complexes was recently put together by Inglis et al. (2009) and is summarized in Table 32. Differential regulation of individual components of this core set of ciliary genes is thought to be at least in part responsible for building distinct types of cilia (Mukhopadhyay et al., 2007; Silverman and Leroux, 2009). More genes involved in building specific types of ciliated endings likely remain to be identified.
A hallmark of nervous system architecture is the specificity of cellular contacts, either synaptic or adhesive. Proteins involved in synapse formation and maintenance are poorly characterized to date in any system. By contrast, a plethora of molecules are known to be involved in adhesive interactions in a mature nervous system and in cell recognition during development. The C. elegans genome contains a complex assembly of many distinct types of cell adhesion and extracellular matrix proteins. Among all these types of transmembrane or extracellular proteins, the classes of proteins with the most extensively documented function in the development and function of nervous systems of various species are the immunoglobulin superfamily members, the Leucine-Rich Repeat (LRR) proteins, cadherin family members, and neurexins and their various ligands. C. elegans has a number of representatives of each family.
Members of the immunoglobulin superfamily are involved in axon pathfinding (e.g., unc-40/DCC, sax-3/Robo), synapse formation (syg-1, syg-2), neuronal axon and soma adhesion (sax-7), axonal maintenance (zig genes) and neurotransmitter receptor clustering (oig-4). Essentially all immunoglobulin superfamily members examined so far are expressed in a subset of neurons (Aurelio et al., 2002; Schwarz et al., 2009). The immunoglobulin superfamily of C. elegans has been described previously (Vogel et al., 2003; Hobert et al., 2004). Excluding intracellular Ig domain proteins (such as those encoded by unc-22, unc-89, dim-1, or unc-73), there are a total of 64 proteins that contain one to many copies of easily recognizable subtypes of Ig domains (Table 33). 18 of these proteins contain no other obvious protein domains, are not necessarily closely related to one another and are small secreted or transmembrane proteins of the zig and oig type. Many of the remaining proteins contain additional fibronectin-III domains, which are related to Ig domains, and some contain distinct complements of domains, some of which are implicated in cell adhesion, others in signaling (Table 33). In contrast to many other gene families discussed here, the Ig domain family has not expanded in worms, but has expanded in humans which have many hundreds of immunoglobulin superfamily members.
Notably absent from the C. elegans genome is a homolog of the immunoglobulin superfamily member DSCAM, a neuronal recognition protein with remarkable isoform diversity in flies (Hattori et al., 2008). There are also no obvious orthologs of mouse SynCAM proteins, although proteins with similar domain architecture exist (igcm-3, igcm-4). There is a worm homolog of vertebrate Sidekick (rig-4), implicated in synaptic targeting in vertebrates (Schwarz et al., 2009). Receptor tyrosine phosphatases (all containing extracellular Ig domains) have been implicated in various aspects of neuronal development (Paul and Lombroso, 2003). There are three of these genes in the C. elegans genome (Table 33). One of them, ptp-3, is the ortholog of LAR, which has been implicated in synapse maturation in several different species, including worms (Stryker and Johnson, 2007).
Members of the LRR family include the axon guidance cue Slit/slt-1 and several vertebrate proteins involved in synapse formation and function (de Wit et al., 2011). Extracellular LRR (eLRR) proteins (i.e., either secreted or transmembrane) in worms have been analyzed in silico (Dolan et al., 2007). 29 eLRR protein-encoding genes can be found in the worm genome (Table 33). Most are exclusively composed of LRR repeats; some also contain Ig domains. Some of the proteins are secreted, but most C. elegans eLRR proteins have transmembrane or GPI anchors. A number of them have highly conserved vertebrate orthologs, such as the guidance cue slt-1, the Toll-like receptor protein tol-1 or the neuropeptide receptor fshr-1. Even though there are no obvious orthologs of the vertebrate LRR synaptic adhesion molecules (LRRTMs, SALMs and NGLs), there are C. elegans proteins with similar domain architectures (multiple LRR domains and a transmembrane domain, or multiple LRR and Ig domains) (Table 33).
Many of the eLRR family members have been analyzed for expression in the nervous system, revealing neuronal expression for many of them (Liu and Shen, 2011) (Table 34). Notably, in contrast to many other gene families discussed here, the LRR family has, like the Ig domain family, not expanded in worms. There are almost five times as many eLRR proteins in mouse and humans (Dolan et al., 2007).
The C. elegans genome contains 13 genes that code for proteins with cadherin domains, one more than previously noted (see The cadherin superfamily) (Table 34). This is significantly fewer than the more than 100 cadherin and cadherin-related genes in mammals. Representatives of several ancient cadherin subgroups can be found in C. elegans, including Flamingo, FAT and Dachsous-type cadherins as well as more classic cadherins, yet there are also a number of nematode-specific cadherins, which are mostly uncharacterized (The cadherin superfamily) (Table 34).
There are no protocadherins encoded in the fly or worm genome (The cadherin superfamily). DSCAM in Drosophila and the protocadherins in mammals are the two classes of diversely spliced cell-cell recognition molecules in metazoan nervous systems (Zipursky and Sanes, 2010). The absence of both types of isoform-rich molecules in the worm genome may be a testament to the reduced morphological complexity of its nervous system.
Vertebrate neurexins are synaptic proteins that interact with a set of distinct partners to help synapses mature and function appropriately. C. elegans contains a single neurexin gene, nrx-1, that is broadly expressed in the nervous system but not yet functionally characterized (Table 35). There is presently no evidence in transcriptome datasets that the nrx-1 locus produces anything close to the tremendous number of alternatively spliced isoforms that are characteristic of vertebrate neurexins (Missler and Sudhof, 1998). Neurexin-like genes of the CASPR family can also be found in the worm genome (Haklai-Topper et al., 2011) (Table 35).
The C. elegans genome contains orthologs of several neurexin binding proteins, including neuroligin (nlg-1), a neuroligin-related protein (glit-1) and two latrophilin genes (lat-1 and lat-2). There are no obvious orthologs of LRRTM1/2, synaptic adhesion molecules that interact with neurexin, but proteins with similar domain architectures can be found, as mentioned above in the LRR section (Table 34). Clear orthologs of other neurexin binding partners (Cbln1, neurexophilins) also cannot be found in the worm genome (Table 35).
This compendium covers ~2,800 genes with predicted neuronal functions. Yet it likely only scratches the surface of neuron-type specific gene batteries, since many more genes show, or will be found to show, neuron-type specific expression patterns in the mature nervous system. In fact, any given neuron type is expected to express several thousand genes, and the comparison of even closely related individual neuron types revealed more than 1,200 differentially expressed genes (Etchberger et al., 2007).
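One way to use such gene lists is to intersect them with a set of differentially expressed genes and group the hits by family. The following is a minimal sketch, assuming the compendium is available as a gene-to-family mapping. The excerpt shown uses a handful of gene names mentioned in this review, the "differentially expressed" set is invented purely for illustration, and `annotate_hits` is a hypothetical helper name.

```python
# Small illustrative excerpt of a gene -> family mapping. Gene names are
# taken from the text; the mapping format itself is an assumption.
compendium = {
    "egl-4": "cGMP-dependent protein kinase",
    "pde-1": "phosphodiesterase",
    "glb-5": "globin",
    "snb-1": "SNARE",
    "unc-64": "SNARE",
    "nrx-1": "neurexin",
}

# Invented example of genes differing between two neuron types.
# "hypothetical-1" and "hypothetical-2" stand for genes outside the compendium.
differentially_expressed = {
    "glb-5", "pde-1", "nrx-1", "hypothetical-1", "hypothetical-2",
}


def annotate_hits(hits, gene_to_family):
    """Group the hits that appear in the compendium by gene family."""
    by_family = {}
    for gene in sorted(hits & set(gene_to_family)):
        by_family.setdefault(gene_to_family[gene], []).append(gene)
    return by_family


annotated = annotate_hits(differentially_expressed, compendium)
for family, genes in sorted(annotated.items()):
    print(f"{family}: {', '.join(genes)}")
```

Genes that are differentially expressed but absent from the mapping are simply dropped here; in a real analysis they would be the candidates for extending the compendium.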
Even though the gene family analyses provided here offer just a glimpse of neuronal molecular diversity, one pervasive theme emerges: the expansion of many gene families with specific neuronal functions. This expansion is not strictly C. elegans-specific but can also be observed in C. briggsae and C. remanei, suggesting that the expansions occurred more than 100 million years ago. The scale of expansion becomes even more impressive if one considers that the progression from invertebrates to vertebrates was accompanied by two genome duplications. Thus, in theory, any worm gene should have four vertebrate orthologs, as is indeed often observed. Yet the picture is dramatically different in many of the gene families discussed here. Even cases with similar overall gene numbers (e.g., 21 Gα genes in worms compared to around 21 Gα genes in vertebrates) argue already for gene family expansions in worms (i.e., if there are ~20 Gα genes in vertebrates one would have expected only 5 genes in worms). Many more dramatic expansions are apparent, including the expansion of the two pore TWK channel family, the Cys-loop ligand gated ion channel family, chloride channels, DEG/ENaC channels, specific subfamilies of the SLC-type transporters (both vesicular transport and synaptic reuptake), neuropeptide-processing enzymes and others. Notably though, cell adhesion families (IgSF, LRRs, cadherins) have not expanded in worms. Many more genes of these types can be found in mammals, a likely testament to the increased morphological diversity of vertebrate neuron types.
Gene family expansion in worms is also apparent if one stays closer to home and only compares worms and flies, both members of the Ecdysozoa clade. Even though the Drosophila nervous system contains more than 300 times as many neurons as C. elegans (~100,000 vs. 302), most gene family expansions discussed above are observed in worms, but not flies. This observation supports the notion that the larger worm gene numbers in specific gene families are not the result of gene loss in other genomes but more likely reflect gene family expansion in a specific invertebrate phylum. Within the nematodes, neuronal gene family expansions are not restricted to C. elegans, but can be observed in C. briggsae and C. remanei, often (but not always) with clear one-to-one ortholog matches. Whether this holds for distant nematode species can only be assessed once other nematode genome sequences are better annotated.
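The "one worm gene, four vertebrate orthologs" expectation can be turned into a rough expansion estimate. The sketch below is only an illustration of that arithmetic: it uses worm and human gene counts quoted earlier in this review, assumes a uniform fourfold increase from two genome duplications with no subsequent gene loss, and the names `DUPLICATION_FOLD`, `expected_worm_count` and `expansion_ratio` are chosen here for clarity rather than taken from any published method.

```python
# (worm count, human count) as quoted in the text of this review.
gene_counts = {
    "G-alpha": (21, 21),
    "G-beta": (2, 5),
    "G-gamma": (2, 19),
    "RGS": (21, 38),
    "receptor-type guanylyl cyclase": (27, 5),
    "soluble guanylyl cyclase": (7, 4),
    "arrestin-domain (ARRDC)": (31, 5),
}

# Two rounds of whole-genome duplication: 2 * 2 = 4 vertebrate genes
# per ancestral gene, assuming no gene loss (a simplification).
DUPLICATION_FOLD = 4


def expected_worm_count(human_count):
    """Worm gene number expected if the family had not expanded in worms."""
    return human_count / DUPLICATION_FOLD


def expansion_ratio(worm_count, human_count):
    """Observed worm count relative to the duplication-based expectation."""
    return worm_count / expected_worm_count(human_count)


for family, (worm, human) in gene_counts.items():
    expected = expected_worm_count(human)
    ratio = expansion_ratio(worm, human)
    print(f"{family}: expected {expected:.2f}, observed {worm}, ratio {ratio:.1f}")
```

A ratio near 1 is what the duplication model predicts. Under these assumptions the Gα family comes out at a ratio of 4 (about 5 genes expected, 21 observed), whereas the Gγ family falls below 1, consistent with the point that expansion is family-specific rather than genome-wide.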
There are three common functional themes in the gene family expansions. First, the expansion in the vesicular and reuptake transporters together with the expansion of ligand-gated ion channels suggests the existence of as yet to be discovered neurotransmitter/neuromodulator systems. Large GPCR-subfamilies, like the srw gene family (with similarities to neuropeptide receptors), as well as the pervasive non-sensory neuron expression of many olfactory-type GPCRs appear to make the same point. The ability to tune membrane potentials with the hugely expanded two pore TWK potassium gene family is also consistent with a tremendous adaptability of C. elegans neurons to various types of signaling inputs. Second, the worm has clearly expanded its repertoire of neuronal mechanisms with which it monitors its environment. This expanded sensory repertoire includes sensory GPCR proteins, GCY proteins, globins and ion channels (most notably the DEG/ENaC/ASIC family). This is consistent with the recognition of many sensory modalities being tuned over very narrow ranges. These expanded sensory functions are also likely to reflect the complex and highly variable sensory environment in which nematodes find themselves. Intriguingly, even the expansion of the ligand-gated ion channels may possibly be related to sensory functions, as exemplified by the choline-sensing DEG-3 and DES-2 LGICs. Last, the expanded neuronal “toolbox” suggests that in its molecular composition each C. elegans neuron may be a much more complex information processing device than a fly or vertebrate neuron. Understanding the functions of each individual component of this toolbox and understanding how the expression of all these genes is regulated is a daunting challenge.
| Topology | Family | Gene (alt. name) | Expression¹ |
|---|---|---|---|
| 6-transmembrane | Voltage-gated: Shaker/Kv1 subfamily (1 gene) | shk-1 | number of interneurons and sensory neurons |
| | Voltage-gated: Shab/Kv2 subfamily (6 genes) | exp-2 | muscle, sensory neurons |
| | | kvs-1 | motorneurons, sensory neurons |
| | Voltage-gated: Shaw/Kv3 subfamily (3 genes) | shw-1 | ? |
| | | egl-36 (shw-2) | subset of neurons, muscle |
| | | shw-3 (kht-1) | subset of neurons |
| | Voltage-gated: Shal/Kv4 subfamily (1 gene) | shl-1 | neurons |
| | KQT family (3 genes) | kqt-1 | subset of neurons |
| | Eag-like/Kv10-12 family (2 genes) | egl-2 | sensory neurons, muscle |
| | | unc-103 | many neurons, muscle |
| | Calcium-activated Slo family (2 genes) | slo-1 (nsy-3) | subset of neurons, muscle |
| | | slo-2 | subset of neurons |
| | Calcium-activated SK family (4 genes) | kcnl-1 | ? |
| | | kcnl-2 | subset of neurons |
| 4-transmembrane | TWK Channel family (47 genes) | egl-23 (twk-41) | ? |
| | | sup-9 (twk-38) | subset of neurons, muscle |
| | | unc-110 (twk-18) | muscle only |
| | | twk-2 | subset of neurons |
| | | twk-3 | subset of neurons |
| | | twk-4 | subset of neurons |
| | | twk-6 | neurons + others |
| | | twk-16 | subset of neurons |
| | | twk-17 | subset of neurons |
| | | twk-20 | neurons and muscle |
| | | twk-23 | neurons + others |
| | | twk-29 | subset of neurons |
| | | twk-30 | subset of neurons |
| | | twk-32 | subset of neurons |
| 2-transmembrane | Kir family (3 genes) | irk-1 | small number of neurons |
|Auxiliary subunit for||Gene||Domains/homology||Experimentally confirmed||Expression1|
|Voltage-gated K+ channels||mps-1||KCNE/MinK ortholog||yes||subset of neurons|
|mps-2||yes||subset of neurons|
|mps-3||yes||subset of neurons|
|mps-4||yes||subset of neurons|
|sssh-1||Drosophila sleepless ortholog||no||?|
|dpf-1||Dipeptidyl-peptidase IV-like peptidases||no||?|
|dpf-2||no||muscle, seam cells|
|dpf-5||no||Intestine, rectal gland cells|
|ctf-1||ABCC ortholog, unclear which subfamily||no||excretory cell|
|mrp-1||ABCC1/2/6-ortholog (SUR = ABCC8/9)||no||some neurons, pharynx, intestine, hypodermis|
|mrp-2||no||some neurons, pharynx, intestine, excretory cell|
|mrp-7||no||neurons, muscle, intestine|
|mrp-5||ABCC5/1/12 ortholog||no||neurons, pharynx, intestine, muscle, hypodermis|
|mrp-6||ABCC4 ortholog||no||neurons, intestine|
|bkip-1||no ortholog||yes||neurons, muscle|
|TWK-type potassium channels||sup-10||–||–||muscle|
|unc-93||human UNC-93A||no||neurons, muscle|
|Y39B6A.27||human UNC93-like MFSD11 (TF315284)||no||?|
|Voltage-gated calcium channels||unc-79||none||yes||neurons|
|nAChR LGIC||lev-9||multiple Sushi/CCP domains||yes (worms)||neurons|
|lev-10||CUB domains, LDL domain, TM domain2||yes||NMJ|
|mig-13||no||subset of neurons|
|lurp-1||Lynx/SLURP orthologs (LU domain)||yes (vertebrates)||neurons|
|odr-2||Ly6-related domain||no||subset of neurons|
|hot-4||no||subset of neurons|
|AMPA-type Glu receptors (TARPs)||sol-1||CUB domains||yes||neurons|
|stg-1||stargazin-orthologs||yes||subset of neurons|
|stg-2||yes||subset of neurons|
|lev-10||CUB domains, LDL domain, TM domain2||no||subset of neurons|
|mig-13||no||subset of neurons|
|DEG/ENaC channels (and perhaps TRP)||mec-2||stomatin||yes||neurons|
2One family member, lev-10, encodes a confirmed auxiliary subunit for C. elegans nAChRs. Another, neto-1, encodes the worm ortholog of vertebrate Neto proteins, which are auxiliary subunits for glutamate receptors. Whether other members of this family are also auxiliary subunits, and for which type of channel, is not yet known.
|SLC class||Gene||Subfamily||Likely substrate||Expression1|
|SLC17: Vesicular glutamate transporter family (14 genes)||eat-4||SLC17A6-8||glutamate||neurons|
|C02C2.4||no specific subfamily, worm-specific expansion||neurons, intestine|
|SLC18: Vesicular amine transporter family (2 genes)||cat-1||dopamine, histamine, serotonin, tyramine, octopamine||all DA, 5HT, Tyr, OA neurons|
|unc-17||acetylcholine||all cholinergic neurons|
|SLC32: Vesicular inhibitory amino acid transporter family (GABA & Glycine) (1 gene)||unc-47||GABA||all GABA neurons|
|SLC1: High-affinity glutamate and neutral amino acid transporter family (6 genes) (REUPTAKE)||glt-1||glutamate||muscle|
|glt-4||some neurons, pharynx|
|glt-6||pharynx, excretory canal|
|SLC6: Na+/Cl- dependent neurotransmitter transporter family (17 genes) (REUPTAKE)||dat-1||SLC6A2,3,4||dopamine||DA neurons|
|mod-5||serotonin||subset of 5HT neurons|
|snf-3||neurons, excretory system|
|snf-5||cluster 1||neurons, intestine|
|snf-11 (gat-1)||SLC6A1||GABA||neurons, muscle|
|snf-12||vesicles in hypodermis|
|SLC28: Na+ coupled nucleoside transporter family (2 genes) (REUPTAKE ?)||F27E11.1/slc-28.1||nucleosides||?|
|SLC29: Facilitative nucleoside transporters (includes low affinity, high capacity monoamine transporters) (7 genes) (REUPTAKE)||ent-1||monoamines, others||pharynx, intestine|
|SLC8 & SLC24: Na+/Ca2+ exchanger & Na+/Ca2+-K+ exchanger (10 genes)||ncx-1||SLC8||Na+/Ca2+ exchanger||?|
|SLC30: cation diffusion facilitator (CDF) family (12 genes)||cdf-2||SLC30A2,3,4,8||intestine|
|cdf-1||SLC30A1,10||muscle and intestine|
|toc-1||SLC30A6||subset of neurons|
|SLC12: cation-chloride cotransporter family (7 genes)||kcc-1||SLC12A4-6||K+/Cl- transporter||muscle, neurons, intestine|
|B0303.11||Na+/K+/2Cl− transporter?||excretory system|
|SLC4: Cl−–HCO3− exchangers (4 genes)||abts-1||SLC4A7-10||neurons, hypodermis, muscle|
|abts-2||SLC4A11||subset of neurons|
|abts-4||SLC4A1-3||subset of neurons|
|Family||Gene (alt. name)||Homolog (and/or domains)||Expression1|
|Calmodulin family (9 genes)||cmd-1||calmodulin (best hit)||?|
|NCS family (7 genes)||ncs-1||human NCS-1||subset of neurons|
|Others (49 genes)||cnb-1||calcineurin (regulatory subunit of PP2B)||neurons, muscle|
|rsa-1||PP2A regulatory subunit||?|
|cex-2||neurons, hypodermis, pharynx|
|efdh-1||EFHD1/2 ortholog||neurons, muscle, pharynx|
|R08D7.5||Centrin (caltractin) 1/2/3 ortholog||only neurons|
|T04F3.4||human MCFD2 (distant)||?|
|C29E4.14||human MCFD2 (distant)||?|
|ZK856.8||human CHP1/2 ortholog||?|
|calm-1||CIB ortholog (Calcium and integrin binding)||?|
|calu-1||human CALU (Reticulocalbin)||muscle, intestine, pharynx|
|mlc-1||myosin light chain||muscle|
|mlc-5||neurons, hypodermis, intestine|
|F43C9.2||distantly related to troponin||?|
|E02A10.3||none specifically||neurons, intestine|
|R09H10.6||none specifically||subset of neurons|
|Y73C8B.5||none specifically||reproductive system|
EF hands show similarity to EH domains. Genes were excluded when InterPro predicted the same domain to be similar to both EF hands and EH domains.
|Family||Gene (alt. name)||Expression1|
|gon-2||pharynx, excretory cell, intestine|
2A subfamily of genes related to one another (PTHR13800, “Transient Receptor Potential Cation Channel, Subfamily M”), as identified with InterProScan.
|Gene||Gated by (as confirmed in vitro)||Expression1|
|acr-2||ACh||subset of neurons|
|acr-3||ACh||subset of neurons (possible operon with acr-2)|
|acr-5||subset of neurons|
|acr-7||subset of neurons|
|acr-8||subset of neurons|
|acr-9||ventral cord neurons|
|acr-11||subset of neurons|
|acr-12||ACh||exclusively in neurons, including ventral cord neurons|
|acr-13/lev-8||ACh||subset of neurons|
|acr-14||subset of neurons|
|acr-15||subset of neurons|
|acr-16||muscle, some neurons|
|deg-3||subset of neurons|
|des-2/acr-4||subset of neurons|
|lev-1||ACh||muscle, some neurons in ventral cord|
|unc-29||ACh||muscle, some neurons|
|unc-38||ACh||muscle, many neurons|
|unc-63||ACh||muscle, many neurons|
|lgc-2/pbo-5||protons||subset of neurons, muscle|
|Subgroup||Gene||Experimentally confirmed ligand||Expression1|
|GABA subgroup (7 genes)||gab-1||GABA||?|
|exp-1||GABA||subset of neurons, muscle|
|Aminergic subgroup (8 genes)||mod-1||serotonin||subset of neurons|
|lgc-55||tyramine||subset of neurons|
|ggr-3||subset of neurons|
|GluCl subgroup (6 genes)||avr-14||glutamate||subset of neurons|
|avr-15||glutamate||subset of neurons|
|ACC subgroup (8 genes)||acc-1||acetylcholine||?|
|lgc-46||motor neurons, muscle|
|diverse (12 genes)||lgc-32||?|
|ggr-1||subset of neurons|
|ggr-2||subset of neurons|
|NMDA-type||nmr-1||subset of neurons|
|nmr-2||subset of neurons|
|AMPA-type||glr-1||subset of neurons|
|glr-2||subset of neurons|
|glr-3||subset of neurons|
|glr-4||subset of neurons|
|glr-5||subset of neurons|
|glr-6||subset of neurons|
|glr-7||subset of neurons|
|glr-8||subset of neurons|
|W02A2.5||subset of neurons|
2Clear homology to GLR receptors, containing predicted extracellular solute-binding protein domain (like GLRs), but lacking some other core sequence features of GLRs (Brockie et al., 2001).
|Gene||Cosmid name||Notes on homologies or domains||Expression1|
|mec-4||T01C8.7||subset of neurons|
|del-4||T28B8.5||subset of neurons|
|deg-1||C47C12.6||subset of neurons|
|del-1||E02H4.1||subset of neurons|
|mec-10||F16F9.5||subset of neurons|
|egas-1||Y69H2.11||Subgroup 3 (EGF + ASC domain)||?|
|del-2||F58G6.6||subset of neurons|
|Y57G11C.44||short fragment, pseudogene?||?|
|F58G6.8||short fragment, pseudogene?||?|
|Type||Gene (alt. name)||Expression1|
|CLC-type (6 genes)||clh-1 (clc-1)||hypodermis, neurons|
|clh-2 (clc-2)||head, tail neurons, vulval muscles|
|clh-3 (clc-3)||excretory cell, vulva, neurons, enteric muscles, epithelial cells|
|clh-4 (clc-4)||excretory cell|
|clh-5 (clc-5)||pharynx, intestine, hypodermis, unidentified cells in head and tail|
|clh-6 (clc-6)||pharynx, intestine, excretory cell, neurons|
|anoctamin-related (2 genes)||anoh-1||?|
|tweety-related (1 gene)||ttyh-1||broadly in neurons|
|bestrophin-related (26 genes)||best-1||?|
|best-3||hypodermis, excretory cell, muscle|
|best-24||neurons, intestine, hypodermis|
|Process||Gene (alt. name)||Enzymatic activity||Expression1|
|ACh synthesis||cha-1||cholineacetyltransferase||ACh neurons|
|acly-1||ATP citrate lyase||?|
|GABA synthesis||unc-25||glutamic acid decarboxylase (GAD)||GABA neurons|
|Biogenic amine synthesis2||tph-1||tryptophan hydroxylase (requires BH4 cofactor)||5HT neurons|
|tbh-1||tyramine beta hydroxylase||OA neurons (RIC)|
|cat-2||tyrosine hydroxylase (requires BH4 cofactor)||dopamine neurons|
|cat-4||GTP cyclohydrolase (for BH4 synthesis2)||5HT + dopamine neurons|
|gfrp-1||GTP cyclohydrolase feedback regulator (for BH4 synthesis2)||?|
|ptps-1||6-pyruvoyl-tetrahydropterin synthase (for BH4 synthesis2)||?|
|pcbd-1||pterin-4-alpha-carbinolamine dehydratase (BH4 recycling2)||?|
|qdpr-1||quinoid dihydropteridine reductase (BH4 recycling2)||?|
|tdc-1||aromatic amino acid decarboxylase (AAAD)||Tyr + OA neurons (RIM, RIC)|
|bas-1||5HT + dopamine neurons|
|basl-1||inactive aromatic amino acid decarboxylase||?|
|anat-1||arylalkylamine N-acetyltransferase||subset of neurons|
|homt-1||hydroxyindole-O-methyltransferase||PVT + uterine cells|
|anmt-1||amine N-methyltransferase (PNMT, INMT, NNMT)||muscle, intestine, pharynx|
|ACh degradation||ace-1||ACh esterase||some neurons, muscle|
|ace-2||subset of ACh neurons|
|ace-3||some neurons, muscle|
|ace-4||some neurons, muscle|
|comt-1||Catechol-O-Methyltransferase (COMT) homologs||?|
|GABA degradation||gta-1||GABA transaminase||?|
|adh-7||succinic semialdehyde dehydrogenase||?|
ACh = acetylcholine, 5HT = serotonin, DA = dopamine, Tyr = tyramine, OA = octopamine
2See Figure 6 and Figure 7. Note that the involvement in biogenic amine synthesis of some of these genes is not at all proven (e.g., anmt genes).
|Insulin-related peptides (40 ins genes)||daf-28, ins-1 through ins-39||
|FMRFamides (31 flp genes)||flp-1 through flp-28, flp-32, flp-33, flp-44||
|Neuropeptide-like proteins (51 genes)||nlp-1 through nlp-48||
|pdf-1||subset of neurons|
|snet-1||subset of neurons|
|ntc-1||subset of neurons|
|Process||Gene||Type of protein||Expression1|
|Maturation (7 genes)||egl-3||PC-type proprotein convertase||many, but not all neurons|
|Modification (3 genes)||pghm-1||Peptidylglycine alpha-amidating monooxygenase||?|
|Degradation (37 genes)||acn-1||ACE-like protein (catalytically inactive)||hypodermis|
|nep-2||muscle, glia, neurons|
|dpf-2||muscle, seam cells|
|dpf-5||intestine, rectal gland cells|
|tpp-2||Tripeptidyl peptidase II||neurons, intestine|
|daf-2||transmembrane or membrane associated||very broad|
|hpa-1||subset of neurons, glia|
|hpa-2||secreted||subset of neurons, glia|
1As determined by TMHMM search (http://www.cbs.dtu.dk/services/TMHMM/)
|Name||Subtypes||Gene number in||Gene Identity|
|Rhodopsin (Class A)||biogenic amine||16||∼200||Table 19|
|muscarinic (ACh)||3||Table 19: gar-1,2,3|
|putative peptidergic||>153||Table 20|
|chemosensory and others||∼1,280||∼400||The putative chemoreceptor families of C. elegans|
|Secretin (Class B)||3||15||Table 20: pdfr-1, seb-2, seb-3|
|Adhesion (Class B)||5||33||Table 22: lat-1, lat-2, fmi-1, mth-1, mth-2|
|Glutamate receptor (Class C)||6 (7)2||22||Table 19: mgl-1,2,3, gbb-1,2, F35H10.10 (C30A5.10)2|
|Frizzled/Taste2||4||26||mig-1, lin-17, mom-5, cfz-2|
1According to (Lagerstrom and Schioth, 2008).
2The assignment of this gene into this class is ambiguous (see text).
|Type based on sequence homology||Gene||Ligand||Expression1|
|metabotropic Glutamate receptor (5 genes)||mgl-1||glutamate||subset of neurons|
|mgl-3||subset of neurons|
|mAChRs (muscarinic acetylcholine receptors) (3 genes)||gar-1||acetylcholine||subset of neurons|
|gar-2||subset of neurons|
|metabotropic GABA receptors (2 genes)||gbb-1||GABA||broadly in nervous system|
|biogenic amine receptor (16 genes)||dop-1||dopamine||subset of neurons|
|dop-2||subset of neurons|
|dop-3||subset of neurons|
|dop-4||?||subset of neurons|
|dop-5||?||subset of neurons|
|dop-6||?||subset of neurons|
|octr-1||octopamine||subset of neurons|
|ser-3||subset of neurons, muscle|
|ser-6||subset of neurons|
|ser-1||serotonin||subset of neurons, muscle|
|ser-4||subset of neurons|
|ser-5||subset of neurons, muscle|
|ser-7||subset of neurons|
|ser-2||tyramine||subset of neurons, muscle|
|tyra-2||subset of neurons|
|tyra-3||subset of neurons|
|adenosine receptor||ador-1||adenosine (?)||?|
|Cosmid name||Gene name||Best human/fly neuropeptide receptor hit in BLAST search2||e-value2||Known ligand3||Expression4|
|Class B (secretin-type)1|
|C13B9.4||pdfr-1||fly PDF receptor||−91||NLP-37||neurons, muscle|
|ZK643.3||seb-2||human calcitonin receptor||−34||muscle|
|C18B12.2||seb-3||corticotropin-releasing factor receptor||−44||neurons|
|Class A (rhodopsin-type)1|
|Neuropeptide F/Y receptor family. From analysis in (Cardoso et al., 2012) and from TF350004, expanded with genes in TF315303.|
|C39E6.6||npr-1||fly neuropeptide F receptor||−50||FLP-18,21||neurons|
|T05A1.1||npr-2||fly neuropeptide F receptor||−47||?|
|C10C6.2||npr-3||fly neuropeptide F receptor||−40||FLP-10||neurons|
|C16D6.2||npr-4||fly neuropeptide F receptor||−48||FLP-4,10||neurons|
|Y58G8A.4||npr-5||fly neuropeptide F receptor||−53||FLP-1,2||neurons|
|F41E7.3||npr-6||fly neuropeptide F receptor||−65||neurons|
|F35G8.1||npr-7||fly neuropeptide F receptor||−35||?|
|C56G3.1||npr-8||fly neuropeptide F receptor5||−29||?|
|C53C7.1||npr-10||fly neuropeptide F receptor||−49||FLP-3||?|
|C25G6.5||npr-11||fly neuropeptide F receptor||−50||NLP-1||?|
|T22D1.12||npr-12||fly neuropeptide F receptor||−41||?|
|ZC412.1||npr-13||fly neuropeptide F receptor||−39||neurons, intestine|
|Ghrelin-obestatin/neuromedin U receptor family. From (Cardoso et al., 2012).|
|C48C5.1||nmur-1||human neuromedin receptor5||−48||subset of neurons|
|K10B4.4||nmur-2||human neuromedin receptor||−48||NLP-44||?|
|F02E8.2||nmur-3||fly capa receptor||−39||?|
|C30F12.6||nmur-4||thyrotropin-releasing hormone receptor||−70||pharynx, intestine|
|T07D4.1||npr-20||fly Tachykinin-like receptor||−28||?|
|T23C6.5||npr-21||human neuropeptide FF receptor||−28||neurons|
|Neurokinin/neuropeptide FF/orexin receptor family. From (Cardoso et al., 2012).|
|W05B5.2||npr-14||human orexin receptor||−32||?|
|Y59H11AL.1||npr-22||fly neuropeptide Y receptor||−47||FLP-7,11||?|
|C50F7.1||npr-35||fly SIFR homolog||−56||?|
|Somatostatin receptor family. From (Cardoso et al., 2012) and expanded with genes from TF334200|
|F56B6.5||npr-16||human somatostatin receptor||−37||neurons|
|C06G4.5||npr-17||human somatostatin receptor||−21||?|
|C43C3.2||npr-18||human somatostatin receptor||−18||?|
|R106.2||npr-24||human somatostatin receptor||−43||?|
|T02E9.1||npr-25||human somatostatin receptor||−24||?|
|T02D1.6||npr-26||human somatostatin receptor||−22||?|
|F42C5.2||npr-27||human somatostatin receptor||−19||?|
|F55E10.7||npr-28||human somatostatin receptor||−18||?|
|ZC84.4||npr-29||human nociceptin receptor||−12||?|
|T07F8.2||npr-31||human Rfamide peptide receptor||−10||?|
|Y116A8B.5||npr-32||allatostatin C receptor||−35||?|
|Galanin receptor family. From (Cardoso et al., 2012) and expanded with npr-33 from TF350000|
|T27D1.3||npr-15||fly allatostatin receptor||−31||?|
|F31B9.1||npr-33||human galanin receptor||−31||?|
|Y54E2A.1||npr-34||human pyroglutamylated Rfamide receptor||−32||?|
|Gonadotropin-releasing hormone receptor family (TF106499)|
|C15H11.2||gnrr-2||fly FMRFamide receptor||−08||?|
|F13D2.2||gnrr-6||human oxytocin receptor||−16||?|
|F13D2.3||gnrr-7||human serotonin receptor||−14||?|
|Y105C5A.23||daf-38||human vasopressin receptor||−21||sensory neurons|
|Gastrin-cholecystokinin receptor family|
|T23B3.4||ckr-1||human CCK receptor||−41||neurons|
|Related Vasopressin receptor family|
|T07D10.2||ntr-1||human vasopressin receptor||−34||neurons|
|F14F4.1||ntr-2||human vasopressin receptor||−31||neurons|
|Related to Sex peptide receptor family|
|R03A10.6||sprr-1||fly sex peptide receptor||−60||?|
|F42D1.3||sprr-2||fly sex peptide receptor||−55||?|
|Y69A2AR.15||sprr-3||fly sex peptide receptor||−30||?|
|Drosophila FMRFamide receptor family (TF316702)|
|C02B8.5||frpr-1||fly FMRFamide receptor||−18||?|
|C05E7.4||frpr-2||fly FMRFamide receptor||−17||?|
|C26F1.6||frpr-3||fly FMRFamide receptor||−32||FLP-7,11||?|
|C54A12.2||frpr-4||fly FMRFamide receptor||−33||?|
|C56A3.3||frpr-5||fly FMRFamide receptor||−33||?|
|F21C10.12||frpr-6||fly FMRFamide receptor||−37||?|
|F39B3.2||frpr-7||fly FMRFamide receptor||−24||?|
|F53A9.5||frpr-8||fly FMRFamide receptor||−35||?|
|F53B7.2||frpr-9||fly FMRFamide receptor||−22||?|
|F57H12.4||frpr-10||fly FMRFamide receptor||−28||?|
|K06C4.8||frpr-11||fly FMRFamide receptor||−25||?|
|K06C4.9||frpr-12||fly FMRFamide receptor||−25||?|
|K06C4.17||frpr-13||fly FMRFamide receptor||−12||?|
|K07E8.5||frpr-14||fly FMRFamide receptor||−24||?|
|K10C8.2||frpr-15||fly FMRFamide receptor||−27||?|
|R12C12.3||frpr-16||fly FMRFamide receptor||−24||?|
|T14C1.1||frpr-17||fly FMRFamide receptor||−38||?|
|T19F4.1||frpr-18||fly FMRFamide receptor||−43||FLP-2||?|
|Y41D4A.8||frpr-19||fly FMRFamide receptor||−34||?|
|C30B5.5||daf-37||fly FMRFamide receptor||−16||sensory neurons|
|Drosophila Dromyosuppressin receptor family (TF315509)|
|Another Drosophila FMRFamide receptor family (TF315321)|
|E04D5.2||fly FMRFamide receptor||−15||?|
|T11F9.1||fly FMRFamide receptor||−10||?|
|R11F4.2||fly FMRFamide receptor||−08||?|
|Y37E11AL.1||fly FMRFamide receptor||−07||?|
|C54D10.5||fly FMRFamide receptor||−07||?|
|ZK1307.7||fly SIFamide receptor||−07||?|
|F32D8.10||fly FMRFamide receptor||−06||?|
|F56A11.4||fly FMRFamide receptor||−05||?|
|Y41D4B.24||fly leucokinin receptor||−05||?|
|F57A8.4||fly methuselah receptor||−05||?|
|Y40C5A.4||fly FMRFamide receptor||−04||?|
|C47E8.3||fly sex peptide receptor||−04||?|
|B0034.5||human melanin-concentrating hormone receptor||−04||?|
|Related family (TF315326) with fly ortholog (CG33639)|
|B0563.6||fly sex peptide receptor||−13||?|
|Y70D2A.1||fly FMRFamide receptor||−11||?|
|Related family (TF317595) with fly ortholog (CG33696)|
|H09F14.1||fly peptide receptor||−17||?|
|F16C3.1||fly peptide receptor||−14||?|
|C24B5.1||fly peptide receptor||−09||?|
|Related family with no specific orthologs (TF315359)|
|R13H7.2||fly sex peptide receptor||−14||neurons, intestine|
|K03H6.5||growth hormone secretagogue receptor||−10||?|
|F40A3.7||fly sex peptide receptor||−09||?|
|W10C4.1||fly sex peptide receptor||−08||?|
|Related family with no specific orthologs (TF315508)|
|C04C3.6||human cholecystokinin receptor||−05||?|
|ZK863.1||human cholecystokinin receptor||−05||?|
|C50H11.13||human cholecystokinin receptor||−05||?|
|C54E10.3||human cholecystokinin receptor||−05||?|
|T01B11.1||human neuropeptide S receptor||−06||?|
|Related family with no specific orthologs (TF316587)|
|T14B1.2||aex-2||human galanin receptor||−06||neurons, muscle|
|C25B8.5||aexr-1||fly SIFamide receptor||−10||?|
|C25B8.7||aexr-2||human prokineticin receptor||−11||?|
|Related family with no specific orthologs (TF316160)|
|F52D10.4||fly FMRFamide receptor||−05||?|
|F56A12.2||fly FMRFamide receptor||−07||?|
|M04G7.3||fly orphan GPCR||−05||?|
|Related family with no specific orthologs (TF317550)|
|F54E4.2||fly cardioacceleratory peptide receptor||−05||?|
|Related family with no specific orthologs (TF318526)|
|C01F1.4||human neurokinin receptor||−07||?|
|F10D7.1||human galanin receptor||−04||?|
|H02I12.3||human adrenergic receptor||−07||?|
|FSHR ortholog (LRR repeats)|
|C50H2.1||fshr-1||FSH receptor||−98||neurons, intestine|
|No obvious paralogs or orthologs|
|ZK813.5||fly Tachykinin-like receptor||−10||?|
|H23L24.4||fly FMRFamide receptor||−09||?|
|T02D1.4||fly FMRFamide receptor||−08||?|
|F36D4.4||human somatostatin receptor||−07||?|
|ZK721.4||fly CCK-like GPCR||−07||?|
|B0334.6||fly FMRFamide receptor||−07||?|
|Y34D9A.2||npr-23||human anaphylatoxin receptor||−06||?|
|C09F12.3||fly FMRFamide receptor||−05||?|
|F13H6.5||fly proctolin receptor||−05||?|
This list was assembled using previously published accounts of putative neuropeptide receptors as a starting point (Keating et al., 2003; Wenick and Hobert, 2004; Janssen et al., 2010) and verifying these lists with BLAST searches. Grouping with vertebrate families was done as in (Cardoso et al., 2012). Additional genes were identified through clustering of gene families in TreeFam and through paralogs assigned as presented on the Gene Summary pages in WormBase. In addition, genes identified in the srw gene family in Figure 2 of The putative chemoreceptor families of C. elegans were analyzed by BLAST. The InterPro domain IPR000276 (“GPCR, rhodopsin-like, 7TM”) contains 233 genes, most of which are clearly related to neuropeptide receptors; all 233 genes were therefore BLAST-analyzed, and genes with E values below an arbitrary cutoff of 1e-04 were included in the list above.
1GPCR class indicates a commonly used classification scheme (Lagerstrom and Schioth, 2008) with class A being rhodopsin-like receptors and class B being secretin-like receptors (see text).
2BLASTP analysis of only Homo sapiens and Drosophila melanogaster database.
3From (Li and Kim, 2010).
5The gene is not shown in the respective TreeFam tree, but was assigned to the family together with the other TreeFam family members by the paralogy assignment presented on the Gene Summary pages in WormBase.
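The E-value cutoff described in the note above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure the authors ran: it assumes rows are written as they appear here (cells separated by `||`, the E value given as a bare exponent with a Unicode minus sign, so that `−91` stands for 1e-91), and the helper names `parse_row`, `evalue_exponent` and `passes_cutoff` are hypothetical. Because the table lists entries with an exponent of −04, the sketch treats the cutoff as inclusive; that reading is also an assumption.

```python
import re

# Exponent of the arbitrary cutoff stated in the text (E value = 1e-04).
CUTOFF_EXPONENT = -4

# An exponent cell is a minus sign (Unicode U+2212 or ASCII) followed by digits.
EXPONENT_CELL = re.compile(r"^[\u2212-]\d+$")


def parse_row(row):
    """Split a '||'-delimited table row into cells, dropping the outer pipes."""
    return row.strip().strip("|").split("||")


def evalue_exponent(row):
    """Return the E-value exponent found in a row as an int, or None if absent."""
    for cell in parse_row(row):
        cell = cell.strip()
        if EXPONENT_CELL.match(cell):
            return int(cell.replace("\u2212", "-"))
    return None


def passes_cutoff(exponent, cutoff=CUTOFF_EXPONENT):
    """True if the hit is at or below the cutoff (assumed inclusive)."""
    return exponent is not None and exponent <= cutoff


# Two rows copied from the table above.
rows = [
    "|C13B9.4||pdfr-1||fly PDF receptor||\u221291||NLP-37||neurons, muscle|",
    "|Y40C5A.4||fly FMRFamide receptor||\u221204||?|",
]

for row in rows:
    name = parse_row(row)[0]
    print(name, evalue_exponent(row), passes_cutoff(evalue_exponent(row)))
```

Locating the exponent by pattern rather than by column position is a deliberate choice, since rows in these tables do not all carry the same number of cells.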
|srw-51||fly dromyosuppressin receptor||−13||?|
|srw-94||fly sex peptide receptor||−12||?|
|srw-33||fly sex peptide receptor||−11||?|
|srw-67||fly sex peptide receptor||−10||?|
|srw-29||fly peptide receptor GPCR||−10||?|
|srw-53||fly proctolin receptor||−09||?|
|srw-57||fly dromyosuppressin receptor||−09||?|
|srw-113||fly FMRFamide receptor||−08||?|
|srw-103||fly sex peptide receptor||−08||neuronal|
|srw-44||fly dromyosuppressin receptor||−08||?|
|srw-42||fly sex peptide receptor||−07||?|
|srw-87||fly sex peptide receptor||−07||?|
|srw-122||fly FMRFamide receptor||−07||?|
|srw-102||fly sex peptide receptor||−06||?|
|srw-115||fly FMRFamide receptor||−06||?|
|srw-118||fly FMRFamide receptor||−06||hypodermis|
|srw-123||fly FMRFamide receptor||−06||?|
|srw-73||fly proctolin receptor||−05||?|
|srw-127||human orexin receptor||−05||?|
|srw-139||fly dromyosuppressin receptor||−05||subset of neurons|
|srw-8||human angiotensin II receptor||−04||?|
|srw-1||-||above e-04 cutoff||?|
|srw-13||-||above e-04 cutoff||?|
|srw-36||-||above e-04 cutoff||?|
1Representative members of several subbranches of the srw gene family (as shown in Figure 2 of The putative chemoreceptor families of C. elegans) were analyzed.
2BLASTP analysis of only Homo sapiens and Drosophila melanogaster database.
|fmi-1||Flamingo/Starry Night/CELSR||Cadherin + EGF + LamG + HormR + GPS + 7TMR (secretin-type)||neurons|
|lat-1||Latrophilin||SUEL Lectin + HormR + DUF3497 + GPS + 7TMR (secretin-type)||neurons, pharynx, reproductive tissues|
|lat-2||pharynx, excretory cell|
|mth-1||Drosophila methuselah-like1||None in N-terminus + GPS + 7TMR (secretin-type)||?|
1This homology is not apparent in BLAST searches and is detected only in the Panther database (PTHR12011).
|Gene||Notes on domain structure||Expression1|
|Gα (21 genes)||gsa-1, egl-30, goa-1, gpa-1, gpa-2, gpa-3, gpa-4, gpa-5, gpa-6, gpa-7, gpa-8, gpa-9, gpa-10, gpa-11, gpa-12, gpa-13, gpa-14, gpa-15, gpa-16, gpa-17, odr-3||
|Gβ (2 genes)||gpb-1||ubiquitous|
|Gγ (2 genes)||gpc-1||subset of neurons|
|RGS family (21 genes)||axl-1||?|
|eat-16||broad neuronal, muscle|
|egl-10||broad neuronal, muscle|
|rgs-3||subset of neurons|
|GRK (2 genes)||grk-1||intestine|
|GPR/GoLoco (3 genes)||gpr-1 (ags-3.2)||all mitotically dividing cells|
|gpr-2 (ags-3.3)||all mitotically dividing cells|
|ags-3||neurons, intestine, muscle|
|arrd-4||N+C domain||subset of sensory neurons|
|arrd-6||2x C domain||subset of neurons|
|arrd-10||N domain only||?|
|arrd-12||N domain only||?|
|arrd-15||N+C domain||subset of neurons|
|arrd-20||N domain only||?|
|rnh-1.2||N + RnaseH||?|
|Receptor-type GCY (27 genes)||ANF receptor domain + TM + PK + Cyc||daf-11, odr-1, gcy-1, gcy-2, gcy-3, gcy-4, gcy-5, gcy-6, gcy-7, gcy-8, gcy-9, gcy-12, gcy-13, gcy-14, gcy-15, gcy-17, gcy-18, gcy-19, gcy-20, gcy-22, gcy-23, gcy-25, gcy-28, gcy-29||
|Ex. domain + TM + PK + Cyc||gcy-11, gcy-21|
|TM + PK + Cyc||gcy-27|
|Soluble GCY (7 genes)||HNOB + Cyc||gcy-31 through gcy-37||7/7 in sensory neurons only|
|cGMP-specific PDEs (4 genes)||PDE||pde-1 (hPDE1 ortholog)||neurons|
|PDE + GAF||pde-2 (hPDE2 ortholog)||neurons, muscle, pharynx, intestine|
|PDE (no TM)||pde-3 (hPDE3 ortholog)||neurons, hypodermis|
|PDE + GAF||pde-5 (hPDE10 ortholog)||?|
|cAMP-specific PDEs (2 genes)||PDE||pde-4 (hPDE4 ortholog)||?|
|PDE + PAS||pde-6 (hPDE8 ortholog)||?|
1ANF receptor domain = “IPR001828 Extracellular ligand-binding receptor”. This domain can also be found in metabotropic GABA and Glu receptors and in ionotropic Glu receptors. TM = transmembrane. PK = protein kinase-like. Cyc = guanylyl cyclase. HNOB = heme nitric oxide binding domain. Data taken from (Ortiz et al., 2006), but domains have been reanalyzed. Orthology relationships of pde genes were established by TreeFam and in (Conti and Beavo, 2007).
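The domain shorthand defined in this footnote can be expanded mechanically. The following Python sketch assumes that architectures are written as in the table, with domains joined by `+`; the dictionary holds only the abbreviations defined in the footnote, and the function name `expand_architecture` is hypothetical.

```python
# Abbreviations exactly as defined in the footnote above.
DOMAIN_NAMES = {
    "TM": "transmembrane",
    "PK": "protein kinase-like",
    "Cyc": "guanylyl cyclase",
    "HNOB": "heme nitric oxide binding domain",
}


def expand_architecture(architecture):
    """Split 'A + B + C' into parts and spell out the known abbreviations.

    Parts without an entry in DOMAIN_NAMES are returned unchanged.
    """
    parts = [part.strip() for part in architecture.split("+")]
    return [DOMAIN_NAMES.get(part, part) for part in parts]


print(expand_architecture("HNOB + Cyc"))
print(expand_architecture("ANF receptor domain + TM + PK + Cyc"))
```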
|cah-2||subset of neurons|
|cah-3||subset of neurons, intestine|
|cah-4||neurons, excretory cell|
|cah-5||subset of neurons, intestine|
|cah-6||subset of neurons|
|Globin||glb-1||subset of neurons, muscle or hypodermis|
|glb-2||subset of neurons|
|glb-3||subset of neurons|
|glb-4||subset of neurons|
|glb-5||subset of neurons (oxygen sensory neurons URX, AQR/PQR, BAG)2|
|glb-6||subset of neurons|
|glb-7||subset of neurons|
|glb-9||subset of neurons|
|glb-10||subset of neurons, enriched at synapse3|
|glb-11||subset of neurons|
|glb-12||subset of neurons|
|glb-13||subset of neurons|
|glb-14||subset of neurons, vulval muscle|
|glb-15||no observable expression|
|glb-16||subset of neurons|
|glb-17||subset of neurons|
|glb-18||subset of neurons|
|glb-19||subset of neurons|
|glb-20||subset of neurons, muscle|
|glb-21||subset of neurons, pharynx|
|glb-22||subset of neurons|
|glb-23||subset of neurons|
|glb-24||subset of neurons|
|glb-25||subset of neurons|
|glb-26||head mesodermal cell, stomato-intestinal muscle|
|glb-27||subset of neurons|
|glb-28||subset of neurons|
|glb-29||subset of neurons|
|glb-30||subset of neurons|
|glb-31||subset of neurons|
|glb-32||subset of neurons|
|glb-33||subset of neurons|
1Mostly from reporter analysis done by (Hoogewijs et al., 2008).
2Convert CO2 into bicarbonate. Expression of carbonic anhydrase is generally considered to be a hallmark of CO2 responsive neurons (Bretscher et al., 2011).
3From (Sieburth et al., 2005).
|Overall type||Homolog||Gene (alt. name)||Expression1|
|Calcium sensor for vesicle release||synaptotagmin (7 genes)||snt-1||broad neuronal|
|R-SNARE||VAMP/synaptobrevin (9 genes)||snb-1||ubiquitous|
|sec-22||muscle, reproductive system|
|Q SNARE2||Qa subtype (10 genes)||unc-64 (syx-1)||neurons|
|Qb subtype (5 genes)||gos-28||seam cells, intestine|
|Qc subtype (3 genes)||syx-6||neurons, intestine|
|Qb/c subtype (3 genes)||ric-4 (snap-25)||broad neuronal|
|tetraspan vesicle proteins (TVPs)||synaptogyrin||sng-1||neurons|
|SCAMP||scm-1||subset of neurons|
|Other vesicle associated or regulatory proteins||synapsin||snn-1||broad neuronal|
|Rim-binding protein||elks-1||broad neuronal|
|erp-1||broad neuronal, muscle|
|cpx-2||subset of neurons|
|IA2||ida-1||subset of neurons|
2Subtype classification from http://bioinformatics.mpibpc.mpg.de/snare/snareQueryPage.jsp
3Possibly generated by unc-18 duplication, more similar to unc-18 than to any other member of the Sec1 superfamily, of which unc-18 is a member.
|Gene (alt. name)||Domain structure||Homolog||Expression1|
|PDZ only (31 genes)||mpz-1||PDZ only (10)||MPDZ||neurons|
|par-3||PDZ only (3)||Bazooka||?|
|C01F6.6/mpz-2||PDZ only (2)||PDZK1||pharynx, intestine, excretory cell|
|mpz-3||PDZ only (2)||none||?|
|mpz-4||PDZ only (2)||none||?|
|mpz-5||PDZ only (2)||none||?|
|mpz-6||PDZ only (2)||none||?|
|Y42H9AR.1||PDZ only (2)||GRASP65||?|
|T21G5.4||PDZ only (2)||paralog of C25G4.6||?|
|C25G4.6||PDZ only (2)||paralog of T21G5.4||?|
|gipc-1||PDZ only (1)||GIPC||intestine|
|gipc-2||PDZ only (1)||GIPC||intestine, neurons|
|gopc-1||PDZ only (1)||GOPC/CAL/PIST||?|
|gras-1||PDZ only (1)||GRASP||?|
|mics-1||PDZ only (1)||many||?|
|C09G1.4||PDZ only (1)||none||?|
|C46H11.6||PDZ only (1)||none||muscle|
|C50D2.3||PDZ only (1)||none||?|
|C52A11.3||PDZ only (1)||none||?|
|F23C8.13||PDZ only (1)||none||?|
|F40F9.3||PDZ only (1)||none||?|
|T15H9.4||PDZ only (1)||none||?|
|ZK849.1||PDZ only (1)||none||?|
|Y52E8A.1||PDZ only (1)||none||?|
|C01B7.5||PDZ only (1)||none||?|
|T19B10.5||PDZ only (1)||Periaxin(?)||?|
|psmd-9||PDZ only (1)||PSMD9||?|
|C45G9.7||PDZ only (1)||tax1BP3||?|
|F16G10.5||PDZ only (1)||none||?|
|F20D6.1||PDZ only (1)||none||?|
|mics-1||PDZ only (1)||Magix||?|
|MAGUK type (8 genes)||dlg-1||PDZ, SH3, GuKc||DLG||epithelial, neurons|
|lin-2||PDZ, SH3, GuKc||CASK||neurons|
|magu-1||PDZ, SH3, GuKc||MAGUK family (MPP3)||?|
|magu-2||PDZ, SH3, GuKc||MAGUK family (MPP5)||?|
|magu-3||PDZ, SH3, GuKc||MAGUK family (MPP6)||pharynx, intestine|
|magu-4||PDZ, GuKc||TJP1/2||hypodermis, neurons|
|zoo-1||PDZ, SH3, GuKc||ZO1/TJP3||hypodermis, muscle|
|magi-1||PDZ, WW, GuKc||MAGI||neurons|
|frm-5.2||B41, PDZ||frm-5 paralogue (duplication)||?|
|frm-8||WW, PDZ, B41||FERMPD||neurons|
|ptp-1||B41, FERM, PDZ, PTPc||?|
|Others (27 genes)||cnk-1||SAM, PDZ, PH||CNKSR||?|
|kin-4||S/T kinase, PDZ||none||?|
|afd-1||RA, FH, PDZ||afadin||?|
|lin-10||PTB, PDZ||LIN10||very broad|
|alp-1||PDZ, ZM, LIM||ALP/Enigma||muscle, head neurons|
|nab-1||PDZ, SAM||neurabin||neurons, hypodermis|
|stn-2||PDZ, PH||syntrophin||neurons, muscle|
|stn-1||PDZ, PH||syntrophin||neurons, muscle|
|syd-1||PDZ, C2, RhoGAP||SYD||neurons|
|rhgf-1||PDZ, C1, RhoGEF, PH||ARHGEF||neurons|
|par-6||PDZ, PB1||adult ?|
|let-413||LRR, PDZ||Erbin, Scribble||intestine, hypodermis, pharynx|
|mig-5||DAX, PDZ, DEP||dishevelled||embryo broad, neurons|
|dsh-2||DAX, PDZ||dishevelled||neurons, intestine|
|dsh-1||DAX, DEP, PDZ||dishevelled||neurons, pharynx|
|pxf-1||cNMP, RasGEF, PDZ||RAPGEF||neurons|
|inx-1, inx-2, inx-3, inx-4, inx-5, inx-6, inx-7, inx-8, inx-9, inx-10, inx-11, inx-12, inx-13, inx-14, inx-17, inx-18, inx-19/nsy-5, unc-7, unc-9||neuronally expressed innexins|
|inx-15, inx-16, inx-20, inx-21, inx-22, eat-5||non-neuronally expressed innexins|
1Assembled from (Altun et al., 2009).
|Family1||Family Function2||Gene (alt. name)||Homolog||Expression3|
|Kinesin-1||Vesicle, organelle and mRNA transport||unc-116||conventional||broad|
|Kinesin-2||Vesicle and intraflagellar transport||osm-3 (klp-2)||KIF17||sensory neurons|
|Kinesin-3||Vesicle transport||unc-104 (klp-1)||KIF1A/KIF1B||neurons|
|klp-4||KIF13A/B, KIF14||neurons, pharynx, intestine|
|Kinesin-4||Chromosome positioning||klp-12||KIF21A,B||dividing cells|
|Kinesin-5||Spindle pole separation, bipolarity||bmk-1 (klp-14)||BimC||dividing cells|
|Kinesin-6||Central spindle assembly, cytokinesis||zen-4 (klp-9)||KIF20A,B, KIF23||dividing cells|
|Kinesin-12||Spindle pole organization||klp-10||KIF15||?|
|Kinesin-14||Spindle pole organization and cargo transport||klp-3||KIFC2, KIFC3||pharynx|
|atypical||vab-8 (klp-5)||neurons, muscle|
|klp-8||neurons, hypodermis, excretory|
1According to (Hirokawa et al., 2010).
2According to (Verhey and Hammond, 2009).
|Class||Gene (alt. name)||Adult expression1|
|che-3 (dhc-2)||ciliated neurons|
|Intermediate chain||dyci-1||muscle, gonad|
|light intermediate chain||xbx-1||ciliated neurons|
|dli-1||neurons, hypodermis, pharynx|
|Light chain (LC8-type)||dlc-1||broad|
|dlc-2||neurons, intestine, muscle|
|Light chain (Tctex1-type)||dylt-1||broad|
|dylt-2 (xbx-2)||ciliated neurons|
|Light chain (Roadblock-type)||dyrb-1||broad|
*While dhc-1 and che-3 contain the typical DHC N1 and N2 domains, dhc-3 and dhc-4 only contain N2 domains.
|Gene (alt. name)||Adult Expression1|
|hum-1||neurons, intestine, muscle|
|IFT modules/components||Gene name||Description|
|xbx-1||Light intermediate chain|
|IFT-A (5 genes)||dyf-2||IFT144|
|IFT-B (14 genes)||osm-1||IFT172|
|BBSome (8 genes)||bbs-1||BBS1|
The list has been adapted from (Inglis et al., 2009). Expression of all examined genes is observed in sensory neurons (www.wormbase.org).
|Protein family||Gene name2||Domains||Expression3|
|Ig domain proteins1 (56 genes)||cam-1 (ROR)||Ig + Frz + Kr + TyrKinase||neurons|
|egl-15 (FGFR)||3 Ig + TM + TyrKinase||hypodermis|
|ver-1||5 Ig + TM + TyrKinase||neurons, muscle|
|ver-3||4 Ig + TM + TyrKinase||neurons, muscle|
|ver-4||4 Ig + TM + TyrKinase||?|
|clr-1||1 Ig + 2 Fn3 + TyrPhosphatase||?|
|ptp-3 (LAR)||3 Ig + 9 Fn3 + TyrPhosphatase||muscle, neurons|
|ptp-4||1 Ig + 3 Fn3 + TyrPhosphatase||?|
|dig-1||many domains (extracell. matrix)||hypodermis, mesoderm|
|him-4||many domains (extracell. matrix)||muscle|
|unc-52 (perlecan)||many domains (extracell. matrix)||muscle|
|igcm-1||7 Ig + 2 Fn3 + TM||neurons, muscle, seam|
|igcm-2||3 Ig + 2 Fn3 + TM||some neurons, intestine|
|igcm-3||3 Ig + TM||hypodermis, muscle|
|igcm-4||3 Ig + TM||?|
|mig-6||Ig + TSP1 + KU||muscle|
|oig-1||1 Ig (secreted)||some neurons|
|oig-2||1 Ig (secreted)||some neurons|
|oig-3||1 Ig (secreted)||some neurons, pharynx|
|oig-4||1 Ig (secreted)||muscle|
|oig-5||1 Ig (secreted)||?|
|oig-6||1 Ig + TM||?|
|oig-7||1 Ig + TM||?|
|oig-8||1 Ig + TM||?|
|rig-1||6 Ig + 2 Fn3 + TM||neurons, muscle, hypodermis|
|rig-3||4 Ig + GPI||neurons|
|rig-4||6 Ig + 13 Fn3 + TM||neurons, muscle, hypodermis|
|rig-5||3 Ig + GPI||neurons, gut|
|rig-6||6 Ig + 4 Fn3 + GPI||neurons, muscle|
|ncam-1||5 Ig + 1 Fn3 + TM||neurons, gut|
|sax-3 (Robo)||6 Ig + 3 Fn3 + TM||broad neuronal|
|sax-7 (L1)||6 Ig + 5 Fn3 + TM||very broad|
|lad-2 (L1)||6 Ig + 5 Fn3 + TM||some neurons|
|syg-1||5 Ig + TM||neurons, muscle|
|syg-2||8 Ig + 1 Fn3 + TM||neurons, muscle|
|unc-40 (DCC)||4 Ig + 6 Fn3 + TM||broad neuronal|
|unc-5||Ig + TSP1 + TM + ZO1 + DEATH||some neurons|
|wrk-1||3 Ig + 1 Fn3 + GPI||some neurons, gut|
|zig-1||2 Ig + TM||broad neuronal|
|zig-2||2 Ig (secreted)||subset of neurons|
|zig-3||2 Ig (secreted)||subset of neurons|
|zig-4||2 Ig (secreted)||subset of neurons|
|zig-5||2 Ig (secreted)||subset of neurons|
|zig-6||2 Ig (secreted)||subset of neurons|
|zig-7||2 Ig (secreted)||subset of neurons|
|zig-8||2 Ig (secreted)||subset of neurons|
|zig-9||2 Ig (secreted)||?|
|zig-10||2 Ig + TM||?|
|igeg-1||Ig + EGF + TM||?|
|igeg-2||Ig + EGF + TM||?|
|igdb-1||1 Ig + 4 Fn3 + 2 DB + TM||?|
|igdb-2||2 Ig + 5 Fn3 + 4 DB + TM||hypodermis|
|igdb-3||1 Ig + 1 Fn3 + 2 DB||?|
|madd-4||Ig + many TSP1 (secreted)||neurons|
|3 Ig + 6 Fn3 (secreted)||?|
|Sushi domains + 1 Ig (paralog of and adjacent to lev-9)||?|
|Ig + LRR proteins (6 genes)||pxn-1 (Peroxidasin)||LRR + Ig + peroxidase (secreted)||?|
|pxn-2 (Peroxidasin)||LRR + Ig + peroxidase (secreted)||neurons, hypodermis|
|sma-10||LRRs + Ig + TM||hypodermis, intestine, pharynx|
|iglr-1||LRRs + Ig + TM||neurons in head and VNC|
|iglr-2||LRRs + Ig + TM||?|
|iglr-3||LRRs + Ig (secreted)||?|
|eLRR proteins (23 genes)||fshr-1 (FSHR)||LRRs + 7TMR||neurons, intestine|
|slt-1 (SLT)||LRRs + EGF + LamG||epidermis, muscle|
|tol-1||LRRs + TM + TIR||neurons, epithelial|
|pan-1||LRRs + TM||hypodermis, pharynx, head muscles|
|let-4 (sym-5)||LRRs + TM||?|
|egg-6||LRRs + TM||hypodermis, pharynx|
|dma-1||LRRs + TM||subset of neurons|
|lron-1||LRRs + TM||pharynx|
|lron-2||LRRs + TM||pharynx|
|lron-3||LRRs + TM||?|
|lron-4||LRRs + TM||ventral nerve cord|
|lron-5||LRRs + TM||many neurons in head, VNC, and tail|
|lron-6||LRRs + TM||many neurons in nerve ring and VNC|
|lron-7||LRRs + TM||intestine|
|lron-8||LRRs + TM||hypodermis, pharynx, muscle|
|lron-9||LRRs + TM||head, VNC neurons, seam cells, muscles|
|lron-10||LRRs + TM||?|
|lron-11||LRRs + TM||pharynx, hypodermis|
|lron-12||LRRs + GPI||?|
|lron-13||LRRs + GPI||?|
|lron-14||LRRs + GPI||numerous neurons in head and VNC|
Proteins were identified from searches of the SMART and InterPro domain databases.
Intracellular proteins with Ig or LRR domains are excluded from the list. They were identified either by the presence of other domains known to have cytoplasmic function or by the absence of a detectable signal sequence, as assessed by SignalP.
1The cutoff for inclusion in this list is somewhat arbitrary because Ig domains can degenerate significantly, making them difficult to predict (for example, T17A3.10; this gene may in fact be fused to ver-2).
2Some well-known vertebrate orthologs are listed in parentheses.
3From www.wormbase.org, except eLRR expression patterns, which are from Liu and Shen (2011).
|Gene (alt. name)||Homology||Expression1|
|casy-1 (cdh-11)||calsyntenin-like||neurons, intestine|
|Y37E11AL.6||nematode-specific (Cdh domain & EGF domain)||?|
|Neurexin superfamily||nrx-1||classic neurexin||broad neuronal|
|nlr-1||CASPR-like||subset of neurons|
|Neurexin ligands||lat-1||latrophilin||intestine, neurons, muscle|
*Even though no obvious orthologs can be found, proteins with similar domain architecture are encoded in the genome.
Work in my laboratory is funded by the National Institutes of Health (R01NS039996-05; R01NS050266-03) and the Howard Hughes Medical Institute. I thank Jonathan Hodgkin for discussion and his involvement in gene naming and Michael Koelle, Jim Rand, Thomas Boulin, Ines Carrera, Jeremy Dittman, Martin Chalfie, Niels Ringstad, Piali Sengupta, Chris Li, Iva Greenwald and especially Erik Jorgensen for comments on the manuscript.
Almedom, R.B., Liewald, J.F., Hernando, G., Schultheis, C., Rayes, D., Pan, J., Schedletzky, T., Hutter, H., Bouzat, C., and Gottschalk, A. (2009). An ER-resident membrane protein complex regulates nicotinic acetylcholine receptor subunit composition at the synapse. EMBO J. 28, 2636-2649.
Altun, Z.F., Chen, B., Wang, Z.W., and Hall, D.H. (2009). High resolution map of Caenorhabditis elegans gap junction proteins. Dev. Dyn. 238, 1936-1950.
Alvarez, C.E. (2008). On the origins of arrestin and rhodopsin. BMC Evol. Biol. 8, 222.
Anderson, J.F., and Ultsch, G.R. (1987). Respiratory gas concentrations in the microhabitats of some Florida arthropods. Comp. Biochem. Physiol. 88A, 585-588.
Askwith, C.C., Cheng, C., Ikuma, M., Benson, C., Price, M.P., and Welsh, M.J. (2000). Neuropeptide FF and FMRFamide potentiate acid-evoked currents from sensory neurons and proton-gated DEG/ENaC channels. Neuron 26, 133-141.
Aubry, L., Guetta, D., and Klein, G. (2009). The arrestin fold: variations on a theme. Curr. Genomics 10, 133-142.
Aurelio, O., Hall, D.H., and Hobert, O. (2002). Immunoglobulin-domain proteins required for maintenance of ventral nerve cord organization. Science 295, 686-690.
Bargmann, C.I. (1998). Neurobiology of the Caenorhabditis elegans genome. Science 282, 2028-2033.
Bargmann, C.I. Chemosensation in C. elegans (October 25, 2006), WormBook, ed. The C. elegans Research Community, WormBook, doi/10.1895/wormbook.1.123.1, http://www.wormbook.org.
Bastiani, C. and Mendel, J. Heterotrimeric G proteins in C. elegans (October 13, 2006), WormBook, ed. The C. elegans Research Community, WormBook, doi/10.1895/wormbook.1.75.1, http://www.wormbook.org.
Bavan, S., Straub, V.A., Blaxter, M.L., and Ennion, S.J. (2009). A P2X receptor from the tardigrade species Hypsibius dujardini with fast kinetics and sensitivity to zinc and copper. BMC Evol. Biol. 9, 17.
Bazopoulou, D., and Tavernarakis, N. (2007). Mechanosensitive ion channels in Caenorhabditis elegans. Curr. Top. Membr. 59, 49-79.
Becker-Andre, M., Wiesenberg, I., Schaeren-Wiemers, N., Andre, E., Missbach, M., Saurat, J.H., and Carlberg, C. (1994). Pineal gland hormone melatonin binds and activates an orphan of the nuclear receptor superfamily. J. Biol. Chem. 269, 28531-28534.
Beets, I., Janssen, T., Meelkop, E., Temmerman, L., Suetens, N., Rademakers, S., Jansen, G., and Schoofs, L. (2012). Vasopressin/oxytocin-related signaling regulates gustatory associative learning in C. elegans. Science 338, 543-545.