comparison tools/mira4/mira4_bait.xml @ 9:302d13490b23 draft

Uploaded v0.0.2 preview 1, BAM output
author peterjc
date Thu, 28 Nov 2013 05:07:59 -0500
parents
children 79759fdec6cb
8:2ab1d6f6a8ec 9:302d13490b23
1 <tool id="mira_4_0_bait" name="MIRA v4.0 mirabait" version="0.0.1">
2 <description>Filter reads using kmer matches</description>
3 <requirements>
4 <requirement type="binary">mirabait</requirement>
5 <requirement type="package" version="4.0">MIRA</requirement>
6 </requirements>
7 <version_command interpreter="python">mira4_bait.py --version</version_command>
8 <command interpreter="python">
9 mira4_bait.py $input_reads.ext $output_choice $strand_choice $kmer_length $min_occurence "$bait_file" "$input_reads" "$output_reads"
10 </command>
11 <stdio>
12 <!-- Assume anything other than zero is an error -->
13 <exit_code range="1:" />
14 <exit_code range=":-1" />
15 </stdio>
16 <inputs>
17 <param name="bait_file" type="data" format="fasta,fastq,mira" required="true" label="Bait file (what to look for)" />
18 <param name="input_reads" type="data" format="fasta,fastq,mira" required="true" label="Reads to search" />
19 <param name="output_choice" type="select" label="Output positive matches, or negative matches?">
20 <option value="pos">Just positive matches</option>
21 <option value="neg">Just negative matches</option>
22 </param>
23 <param name="strand_choice" type="select" label="Check for matches on both strands?">
24 <option value="both">Check both strands</option>
25 <option value="fwd">Just forward strand</option>
26 </param>
27 <param name="kmer_length" type="integer" value="31" min="1" max="32"
28 label="k-mer length" help="Maximum 32" />
29 <param name="min_occurence" type="integer" value="1" min="1"
30 label="Minimum k-mer occurrence"
31 help="How many k-mer matches do you want per read? Minimum one" />
32 </inputs>
33 <outputs>
34 <data name="output_reads" format="fasta" label="$input_reads.name #if str($output_choice)=='pos' then 'matching' else 'excluding matches to' # $bait_file.name">
35 <!-- TODO - Replace this with format="input:input_reads" if/when that works -->
36 <change_format>
37 <when input_dataset="input_reads" attribute="extension" value="fastq" format="fastq" />
38 <when input_dataset="input_reads" attribute="extension" value="fastqsanger" format="fastqsanger" />
39 <when input_dataset="input_reads" attribute="extension" value="fastqsolexa" format="fastqsolexa" />
40 <when input_dataset="input_reads" attribute="extension" value="fastqillumina" format="fastqillumina" />
41 <when input_dataset="input_reads" attribute="extension" value="fastqcssanger" format="fastqcssanger" />
42 </change_format>
43 </data>
44 </outputs>
45 <tests>
46 <test>
47 <param name="bait_file" value="tvc_bait.fasta" ftype="fasta" />
48 <param name="input_reads" value="tvc_mini.fastq" ftype="fastqsanger" />
49 <output name="output_reads" file="tvc_mini_bait_pos.fastq" ftype="fastqsanger" />
50 </test>
51 <test>
52 <param name="bait_file" value="tvc_bait.fasta" ftype="fasta" />
53 <param name="input_reads" value="tvc_mini.fastq" ftype="fastqsanger" />
54 <param name="kmer_length" value="32" />
55 <param name="min_occurence" value="50" />
56 <output name="output_reads" file="tvc_mini_bait_strict.fastq" ftype="fastqsanger" />
57 </test>
58 <test>
59 <param name="bait_file" value="tvc_bait.fasta" ftype="fasta" />
60 <param name="input_reads" value="tvc_mini.fastq" ftype="fastqsanger" />
61 <param name="output_choice" value="neg" />
62 <output name="output_reads" file="tvc_mini_bait_neg.fastq" ftype="fastqsanger" />
63 </test>
64 </tests>
65 <help>
66 **What it does**
67
68 Runs the ``mirabait`` utility from MIRA v4.0 to filter your input reads
69 according to whether or not they contain perfect kmer matches to your
70 bait file. By default this looks for 31-mers (kmers or *k*-mers where
71 the fragment length *k* is 31), and only requires a single matching kmer.
72
73 The ``mirabait`` utility is useful in many applications and pipelines
74 outside of using the main MIRA tool for assembly or mapping.
75
76 .. class:: warningmark
77
78 Note ``mirabait`` cannot be used on protein (amino acid) sequences.
79
80 **Example Usage**
81
82 To remove overabundant entries like rRNA sequences, run ``mirabait`` with
83 known rRNA sequences as the bait and select the *negative* matches.
84
85 To do a targeted assembly by fishing out the reads belonging to a gene and
86 assembling just these, run ``mirabait`` with the gene of interest as the bait
87 and select the *positive* matches.
88
89 To iteratively reconstruct mitochondria you could start by fishing out reads
90 matching any known mitochondrial sequence, assemble those, and repeat.
91
92
93 **Notes on paired reads**
94
95 .. class:: warningmark
96
97 While MIRA4 is aware of many read naming conventions to identify paired read
98 partners, the ``mirabait`` tool considers each read in isolation. Applying
99 it to paired read files may leave you with orphaned reads.
100
101
102 **Citation**
103
104 If you use this Galaxy tool in work leading to a scientific publication please
105 cite the following papers:
106
107 Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013).
108 Galaxy tools and workflows for sequence analysis with applications
109 in molecular plant pathology. PeerJ 1:e167
110 http://dx.doi.org/10.7717/peerj.167
111
112 Bastien Chevreux, Thomas Wetter and Sándor Suhai (1999).
113 Genome Sequence Assembly Using Trace Signals and Additional Sequence Information.
114 Computer Science and Biology: Proceedings of the German Conference on Bioinformatics (GCB) 99, pp. 45-56.
115 http://www.bioinfo.de/isb/gcb99/talks/chevreux/main.html
116
117 This wrapper is available to install into other Galaxy Instances via the Galaxy
118 Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/mira4_assembler
119 </help>
120 </tool>
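
The help text above describes the filter only in words: a read is kept (or rejected) according to whether it contains enough perfect k-mer matches to the bait, optionally checking both strands. The following is a minimal Python sketch of that idea, not the ``mirabait`` implementation. All function names here are hypothetical, and counting one hit per matching read position is an assumption, since the help text does not say how ``mirabait`` counts occurrences. The keyword arguments mirror the wrapper's ``kmer_length``, ``min_occurence``, ``strand_choice`` and ``output_choice`` parameters.

```python
# Illustrative sketch only; this is NOT how mirabait is implemented.
# All names below are hypothetical and exist only for this example.


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def kmers(seq, k):
    """Return the set of all substrings of length k found in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_bait_index(baits, k, both_strands=True):
    """Collect every k-mer in the bait sequences, optionally on both strands."""
    index = set()
    for bait in baits:
        index |= kmers(bait, k)
        if both_strands:
            index |= kmers(reverse_complement(bait), k)
    return index


def count_hits(read, bait_index, k):
    """Count read positions whose k-mer is an exact match to a bait k-mer.

    Assumption: one hit per matching position in the read.
    """
    read = read.upper()
    return sum(
        1 for i in range(len(read) - k + 1) if read[i:i + k] in bait_index
    )


def filter_reads(reads, baits, k=31, min_occurrence=1,
                 both_strands=True, positive=True):
    """Return names of reads that match (positive) or do not match the bait.

    reads is a list of (name, sequence) tuples; baits is a list of sequences.
    """
    index = build_bait_index(baits, k, both_strands)
    kept = []
    for name, seq in reads:
        matched = count_hits(seq, index, k) >= min_occurrence
        if matched == positive:
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Toy data with a short k so the behaviour is easy to follow by hand.
    reads = [("r1", "AAAACCCC"), ("r2", "GGGGTTTT"), ("r3", "ACGTACGT")]
    baits = ["AAAACCCC"]
    print(filter_reads(reads, baits, k=4))                      # positive matches
    print(filter_reads(reads, baits, k=4, both_strands=False))  # forward strand only
    print(filter_reads(reads, baits, k=4, positive=False))      # negative matches
```

In this toy data ``r2`` is the reverse complement of the bait, so it is only selected when both strands are checked. Because each read is judged on its own, the sketch also shows why filtering paired read files this way can leave orphaned reads.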