Here, I present a simple Perl script that generates the primers needed for GFP fusion experiments (Hobert, 2002; Boulin et al., 2006) in C. elegans from multi-sequence files. The program returns forward and reverse GFP fusion primers based on user-defined parameters: the length of the search areas at the 5' and 3' ends of the input sequence; the forward primer length; the reverse primer length (the reverse primer is appended to the GFP-specific oligo); and a 3' end GC clamp. The main feature of this script is that it works with files containing any number of FASTA-formatted DNA sequences. The script uses the BioPerl SeqIO module (Stajich et al., 2002) to parse FASTA-formatted sequences, and can easily be adapted to read sequences in many other formats, including GenBank, EMBL, ABI, FASTQ, and KEGG (Stajich et al., 2002). The program returns primers with a GC content between 40% and 60% and a Tm between 52°C and 68°C. It selects against highly repetitive sequences, namely runs of identical bases and tandem di-nucleotide repeats. The program also examines self-complementarity within each primer and computes a 'selfie_score' by generating reverse-complement substrings of the primer and checking for pattern matches against the primer itself. Two 'selfie_score' values are returned for the fusion primer: the first examines self-complementarity within the designed reverse primer portion only, and the second examines the designed portion together with the GFP-specific portion.
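The screening criteria described above can be illustrated with a minimal Python sketch. This is not the author's Perl code. The GC% and Tm windows come from the text; everything else is an assumption made for illustration: the Wallace rule for Tm, a threshold of four repeat units, a k-mer length of 4 for the 'selfie_score', a GC clamp defined as a G or C at the last base, and all function names.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_percent(primer):
    """Percentage of bases that are G or C."""
    gc = sum(1 for base in primer if base in "GC")
    return 100.0 * gc / len(primer)

def wallace_tm(primer):
    """Tm estimate by the Wallace rule, 2*(A+T) + 4*(G+C).
    ASSUMPTION: the source does not say which Tm formula the script uses."""
    gc = sum(1 for base in primer if base in "GC")
    at = sum(1 for base in primer if base in "AT")
    return 2 * at + 4 * gc

def has_repeats(primer, min_units=4):
    """True if the primer has a run of min_units identical bases or
    min_units tandem copies of a di-nucleotide.
    ASSUMPTION: the threshold of 4 is illustrative only."""
    extra = min_units - 1
    mono = re.search(r"([ACGT])\1{%d,}" % extra, primer)
    di = re.search(r"([ACGT]{2})\1{%d,}" % extra, primer)
    return bool(mono or di)

def selfie_score(primer, k=4):
    """Count the k-mers whose reverse complement also occurs in the primer.
    ASSUMPTION: k=4 and this counting scheme are hypothetical."""
    score = 0
    for i in range(len(primer) - k + 1):
        kmer = primer[i:i + k]
        revcomp = kmer.translate(COMPLEMENT)[::-1]
        if revcomp in primer:
            score += 1
    return score

def passes_filters(primer, gc_clamp=True):
    """Apply the GC%, Tm, repeat, and optional 3' GC clamp checks."""
    primer = primer.upper()
    if not 40 <= gc_percent(primer) <= 60:
        return False
    if not 52 <= wallace_tm(primer) <= 68:
        return False
    if has_repeats(primer):
        return False
    if gc_clamp and primer[-1] not in "GC":
        return False
    return True
```

For the fusion primer, the two scores would correspond to calling `selfie_score` once on the designed portion alone and once on the GFP-specific oligo joined to the designed portion.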
The script is organized so that the parameters for each primer (forward or reverse) can be modified (and improved) independently by the user, and it contains comments describing each step. The program runs from the command line, and its output is printed to the screen and written to an output file as tab-separated fields: tab1: start position of primer; tab2: primer Tm; tab3: primer sequence; tab4: selfie_score (two scores for the fusion primer); tab5: percent GC content.
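The input and output handling can be sketched in Python as follows. This is an illustration under stated assumptions, not the script itself: the plain-Python FASTA reader stands in for BioPerl SeqIO, and the function names, the number formatting, and the comma used to join the two fusion-primer scores are all hypothetical. Only the column order is taken from the text.

```python
import io

def read_fasta(handle):
    """Yield (identifier, sequence) pairs from a multi-sequence FASTA stream.
    A plain-Python stand-in for BioPerl SeqIO, for illustration only."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks).upper()
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks).upper()

def format_record(start, tm, primer, selfie_scores, gc):
    """Build one output line in the column order given in the text:
    start position, Tm, sequence, selfie_score(s), percent GC.
    ASSUMPTION: one decimal place and comma-joined scores are hypothetical."""
    scores = ",".join(str(s) for s in selfie_scores)
    return "\t".join([str(start), "%.1f" % tm, primer, scores, "%.1f" % gc])

def emit(line, out_handle):
    """Print a record to the screen and append it to the output file."""
    print(line)
    out_handle.write(line + "\n")
```

A forward primer would pass a single score, for example `format_record(12, 60, "ATGC", [0], 50.0)`, while a fusion primer would pass two.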
This simple script may be useful to researchers working on multi-gene families, alternatively spliced transcripts, or any large gene data set where GFP expression is desired. If only forward primers are desired, the user can enter 'zero' for the 3' space to be sampled; with such modifications, the code may also be useful in traditional PCR experiments on large gene data sets. The script is hosted at SourceForge and is freely available for download at http://batchfusionprimerfetch.sourceforge.net/
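The role of the 5' and 3' search areas, including the forward-only case with a 3' value of zero, can be illustrated with a short Python sketch. The function names, the 1-based start positions, and the convention of reporting a reverse primer's start as the leftmost position of its site on the input strand are assumptions, not details taken from the script.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def forward_candidates(seq, search_5, fwd_len):
    """(start, primer) pairs lying entirely within the first search_5 bases.
    ASSUMPTION: start positions are 1-based."""
    window = seq[:search_5]
    return [(i + 1, window[i:i + fwd_len])
            for i in range(len(window) - fwd_len + 1)]

def reverse_candidates(seq, search_3, rev_len):
    """(start, primer) pairs from the last search_3 bases, returned as
    reverse complements. A search_3 of zero yields no reverse primers."""
    if search_3 <= 0:
        return []
    offset = max(len(seq) - search_3, 0)
    window = seq[offset:]
    return [(offset + i + 1, reverse_complement(window[i:i + rev_len]))
            for i in range(len(window) - rev_len + 1)]
```

Each candidate would then be screened against the primer criteria before being reported; with `search_3` set to zero, only the forward list is populated.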
References
Boulin T, et al. (2006). Reporter gene fusions (April 5, 2006), WormBook, ed. The C. elegans Research Community, WormBook, doi/10.1895/wormbook.1.106.1, http://54.165.141.74/.
Hobert O. (2002). PCR fusion-based approach to create reporter gene constructs for expression analysis in transgenic C. elegans. Biotechniques 32, 728-30.
Stajich JE, et al. (2002). The Bioperl toolkit: Perl modules for the life sciences. Genome Res. 12, 1611-1618.