Mutation mapping by whole-genome sequencing (WGS) is becoming increasingly popular with C. elegans researchers. Pioneering studies from several worm labs have laid the foundations for this trend by providing proofs of concept (Sarin et al., 2008; Flibotte et al., 2010), working out experimental strategies (Doitsidou et al., 2010; Zuryn et al., 2010), and contributing tools and workflows for the analysis of mapping-by-sequencing data (Minevich et al., 2012).
Yet even though many good bioinformatic tools for WGS data analysis exist, almost all of them seem daunting to biologists. The CloudMap collection of workflows and tools for Galaxy is a notable exception, but it requires uploading huge experimental datasets to the main Galaxy server (http://usegalaxy.org), which is time-consuming and not well suited to maintaining an archive of data from several experiments. Hence, most people in the field still do not consider analyzing WGS data themselves, and the resulting need to establish collaborations slows the spread of mapping-by-sequencing approaches in the community.
We have recently developed MiModD, a comprehensive software package for the identification and annotation of mutations in the genomes of model organisms from whole-genome sequencing data, with the specific goal of enabling biologists and geneticists with limited bioinformatics knowledge to analyze genome-wide sequencing data locally on (relatively) standard desktop computers. The current beta version of MiModD supports short-read alignment from several input formats, variant calling, post-processing (i.e., filtering and annotating lists of variants), and the generation of summary reports with hyperlinks to biological databases including WormBase. It also supports various mapping-by-sequencing approaches by producing CloudMap-ready output, i.e., variant lists produced by MiModD can be used directly with the CloudMap EMS Variant Density, Variant Discovery Mapping, and Hawaiian Variant Mapping tools. By default, the package is controlled through a command line interface, but it can be fully integrated, with a single command, into a local installation of Galaxy to obtain a beginner-friendly graphical user interface and a lab-internal WGS analysis server.
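To make the post-processing step more concrete, the following is a minimal Python sketch of what filtering a variant list can look like. It is not MiModD's own interface: the function names, the quality threshold of 30, and the example records are all hypothetical and serve only as an illustration. It assumes tab-separated VCF-style input and keeps the G-to-A and C-to-T changes typical of EMS mutagenesis.

```python
# Illustrative sketch only; this is NOT MiModD's API.
# All names, the threshold and the example records are hypothetical.

# Canonical EMS-induced changes: G/C -> A/T transitions.
EMS_TRANSITIONS = {("G", "A"), ("C", "T")}


def parse_vcf_line(line):
    """Return (chrom, pos, ref, alt, qual) from one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, qual = fields[:6]
    # VCF uses "." for a missing QUAL value; treat it as 0 here.
    qual_value = float(qual) if qual != "." else 0.0
    return chrom, int(pos), ref, alt, qual_value


def filter_ems_variants(lines, min_qual=30.0):
    """Keep EMS-type variants whose QUAL is at least min_qual."""
    kept = []
    for line in lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, ref, alt, qual = parse_vcf_line(line)
        if qual >= min_qual and (ref, alt) in EMS_TRANSITIONS:
            kept.append((chrom, pos, ref, alt, qual))
    return kept


# Made-up example records, for demonstration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "I\t1200\t.\tG\tA\t50\t.\t.",
    "II\t3400\t.\tA\tT\t60\t.\t.",
    "X\t5600\t.\tC\tT\t10\t.\t.",
    "IV\t7800\t.\tC\tT\t45\t.\t.",
]

for variant in filter_ems_variants(example):
    print(variant)
```

The threshold here is arbitrary; in practice, suitable filter criteria depend on sequencing depth and on the mapping strategy used.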
In principle, MiModD can be used to analyze WGS data from any organism, but its memory requirements depend on genome size. To analyze C. elegans data we recommend a system with 16 GB of memory (though 8 GB will do for testing). The package runs on Linux and Mac OS systems without any further strict hardware requirements. Performance will depend on the exact configuration of the system MiModD is running on, but on our development machine (a desktop PC equipped with a 3rd generation Intel Core i7 processor, 16 GB RAM and an Ubuntu 12.04 operating system) the complete analysis of a worm genome sequenced at 30x coverage finishes in less than 1 hour, which should be fast enough for the occasional user.
MiModD is free, open-source and actively developed, and labs interested in WGS analysis are encouraged to download and test the software. We offer assistance with installing and using the package to users of the beta version and are looking forward to receiving your feedback. Further information is available at www.celegans.de/mimodd.
References
Doitsidou M, Poole RJ, Sarin S, Bigelow H, and Hobert O. (2010). C. elegans mutant identification with a one-step whole-genome-sequencing and SNP mapping strategy. PLoS One 5, e15435.
Flibotte S, Edgley ML, Chaudhry I, Taylor J, Neil SE, Rogula A, Zapf R, Hirst M, Butterfield Y, Jones SJ, et al. (2010). Whole-genome profiling of mutagenesis in Caenorhabditis elegans. Genetics 185, 431-441.
Minevich G, Park DS, Blankenberg D, Poole RJ, and Hobert O. (2012). CloudMap: a cloud-based pipeline for analysis of mutant genome sequences. Genetics 192, 1249-1269.
Sarin S, Prabhu S, O'Meara MM, Pe'er I, and Hobert O. (2008). Caenorhabditis elegans mutant allele identification by whole-genome sequencing. Nat. Methods 5, 865-867.
Zuryn S, Le Gras S, Jamet K, and Jarriault S. (2010). A strategy for direct mapping and identification of mutations by whole-genome sequencing. Genetics 186, 427-430.