comparison summarize_unique_barcodes.xml @ 24:431aebd93843 draft default tip

Fixed a bug in k2n.R where the function k2n_calc() would result in an error for single-end read files.
author nikos
date Wed, 05 Aug 2015 09:21:02 -0400
parents 31f25b37187b
children c139b9abe064
23:fb76303acf4f 24:431aebd93843
1 <tool id="rna_probing_summarize" version="1.0.0" name="Summarize Unique Barcodes" force_history_refresh="True">
2 <description></description>
3
4 <requirements>
5 <requirement type="package" version="4.1.0">gnu_awk</requirement>
6 <requirement type="package" version="0.1.19">samtools</requirement>
7 <requirement type="package" version="3.1.1">R_3_1_1</requirement>
8 <requirement type="R-module">RNAprobR</requirement>
9 <requirement type="package" version="1.0.0">RNAprobR</requirement>
10 <requirement type="set_environment">RNA_PROBING_SCRIPT_PATH</requirement>
11 </requirements>
12
13 <command interpreter="bash">
14 summarize_unique_barcodes.sh
15
16 ## Inputs
17 -f $input1 -b $input2
18
19 ##
20
21 #if str( $k2n ) == 'True':
22 -k
23 #end if
24
25 #if str( $priming.flag ) == 'True':
26 -p $priming.position
27 #end if
28
29 #if str( $trimming ) == 'True':
30 -t
31 #end if
32
33 -r \$RNA_PROBING_SCRIPT_PATH
34 </command>
35
36 <!-- basic error handling -->
37 <stdio>
38 <regex match="Error" level="fatal" description="" />
39 </stdio>
40
41 <inputs>
42 <param format="bam" name="input1" type="data" label="Aligned Reads" help="BAM format." />
43 <param format="tabular" name="input2" type="data" optional="True" label="Barcodes" help="Produced by Debarcoding tool." />
44 <param name="k2n" type="boolean" checked="False" truevalue="True" falsevalue="False" label="Produce k2n file" help="Check the box if you ran the tool and received a warning message asking you to produce the k2n file. Necessary if you want to use the 'HRF-Seq' method in the 'Normalize' tool. Warning: can be very slow!" />
45 <param name="trimming" type="boolean" checked="True" truevalue="True" falsevalue="False" label="Trim untemplated nucleotides" help="" />
46 <conditional name="priming">
47 <param name="flag" type="select" label="Set priming position" help="Set the priming position manually.">
48 <option value="False">No</option>
49 <option value="True">Yes</option>
50 </param>
51 <when value="True">
52 <param name="position" type="integer" value="0" min="0" label="Priming position" />
53 </when>
54 <when value="False" />
55 </conditional>
56 </inputs>
57
58 <outputs>
59 <data format="tabular" name="trimming_stats" label="${tool.name} on ${on_string}: Trimming stats" from_work_dir="output_dir/trimming_stats.txt">
60 <filter>trimming is True</filter>
61 </data>
62 <data format="tabular" name="unique_barcodes" label="${tool.name} on ${on_string}: Unique Barcodes" from_work_dir="output_dir/unique_barcodes.txt">
63 <filter> input2 != None </filter>
64 </data>
65 <data format="tabular" name="read_counts" label="${tool.name} on ${on_string}: Read Counts" from_work_dir="output_dir/read_counts.txt" />
66 <data format="txt" name="k2n_file" label="${tool.name} on ${on_string}: k2n file" from_work_dir="output_dir/k2n.txt">
67 <filter> k2n is True </filter>
68 </data>
69 </outputs>
70
71 <tests>
72 <test>
73 <param name="input1" value="aligned.bam" />
74 <param name="input2" value="barcodes.txt" />
75 <param name="k2n" value="True" />
76 <param name="trimming" value="True" />
77 <output name="trimming_stats" file="trimming_stats.txt" />
78 <output name="unique_barcodes" file="unique_barcodes.txt" />
79 <output name="read_counts" file="read_counts.txt" />
80 </test>
81 </tests>
82
83 <help>
84 **What it does**
85
86 *Summarize Unique Barcodes* counts the number of unique random barcodes and reads associated with each sequenced fragment. A fragment is defined as 1) a pair of Reverse Transcriptase (RT) termination site and RT priming site in paired-end sequencing, or 2) an RT termination site in single-end sequencing. For non-barcoded sequencing, it only counts the reads matching each fragment.
87
88 ------
89
90 **Inputs**
91
92 *Summarize Unique Barcodes* requires a file containing the Aligned Reads in BAM_ format and accepts an optional tabular file with the Barcodes produced by the *Preprocessing* tool of the *RNA probing* suite.
93
94 .. _BAM: http://samtools.github.io/hts-specs/SAMv1.pdf
95
96 -------
97
98 **Parameters**
99
100 **Produce k2n file** - A file that contains a sequence of numbers in which the n-th element indicates how many unique cDNA molecules give rise to observing n unique barcodes in a given sample. Required for calculating Estimated Unique Counts (EUCs) in the *Normalize* tool.
101
102 **Trim untemplated nucleotides** - Untemplated nucleotides can be added to cDNA 3’ ends via the terminal transferase activity of reverse transcriptase. This offsets the location of the read-end mapping and leads to erroneous assignment of reactivity information to nucleotides upstream of those which have reacted (Schmidt and Mueller, 1999; Talkish et al., 2014). Enabling this parameter removes those nucleotides.
103
104 Recommended for methods based on detecting reverse transcription termination sites (e.g. DMS-Seq, HRF-Seq or SHAPE-Seq), and not for methods based on ligating the linker directly to RNA (e.g. PARS or FragSeq).
105
106 **Set priming position** - Applicable when the priming site is fixed.
107
108 ------
109
110 **Outputs**
111
112 **Unique Barcodes** (if a Barcode file is given) is a tabular file with 4 columns.
113
114 ====== ==========================================================
115 Column Description
116 ====== ==========================================================
117 1 Transcript identifier
118 2 RT termination site (start)
119 3 RT priming site (end)
120 4 Count of unique barcodes associated with fragments matching the first three columns
121 ====== ==========================================================
122
123 .
124
125 **Read Counts** is similar to Unique Barcodes, but the fourth column is a count of reads matching the first three columns.
126
127 **k2n file** as described above.
128
129 **Trimming Stats** reports statistics of trimming untemplated nucleotides from read ends.
130
131 </help>
132
133 <citations>
134 <citation type="doi">10.1093/nar/gku167</citation>
135 <citation type="doi">10.1093/nar/27.21.e31-i</citation>
136 <citation type="doi">10.1261/rna.042218.113</citation>
137 </citations>
138
139 </tool>
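
The counting described in the help text above (unique barcodes and reads per fragment) can be illustrated with a minimal Python sketch. This is not the tool's implementation, which runs summarize_unique_barcodes.sh with gawk, samtools and R. The function names `summarize` and `to_rows` and the input layout (one `(transcript, start, end, barcode)` tuple per read) are assumptions made only for this illustration.

```python
from collections import defaultdict


def summarize(records):
    """Count unique barcodes and reads per fragment.

    records: iterable of (transcript, start, end, barcode) tuples, one per
    read (hypothetical layout, chosen for this sketch only).
    Returns two dicts keyed by (transcript, start, end).
    """
    barcodes = defaultdict(set)  # fragment -> set of distinct barcodes seen
    reads = defaultdict(int)     # fragment -> number of reads seen
    for transcript, start, end, barcode in records:
        fragment = (transcript, start, end)
        barcodes[fragment].add(barcode)
        reads[fragment] += 1
    unique_barcodes = {frag: len(codes) for frag, codes in barcodes.items()}
    return unique_barcodes, dict(reads)


def to_rows(counts):
    """Flatten a fragment -> count dict into sorted 4-column rows."""
    return [frag + (n,) for frag, n in sorted(counts.items())]


# Three reads on one fragment (two of them sharing a barcode) and one read
# on a second fragment.
example = [
    ("tx1", 10, 50, "ACGT"),
    ("tx1", 10, 50, "ACGT"),
    ("tx1", 10, 50, "TTGA"),
    ("tx2", 5, 40, "CCAT"),
]
unique_barcodes, read_counts = summarize(example)

for row in to_rows(unique_barcodes):
    print("\t".join(str(field) for field in row))
```

The rows produced by `to_rows` follow the four-column layout of the Unique Barcodes and Read Counts outputs: transcript identifier, RT termination site, RT priming site, count.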