comparison fasta_clipping_histogram.xml @ 3:d0969fa24eb1 draft

planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
author devteam
date Tue, 13 Oct 2015 12:38:50 -0400
parents de44f4045b05
children 40ec4170291e
2:20e471a2fdc6 3:d0969fa24eb1
1 <tool id="cshl_fasta_clipping_histogram" name="Length Distribution" version="1.0.0"> 1 <tool id="cshl_fasta_clipping_histogram" name="Length Distribution" version="1.0.0">
2 <description>chart</description> 2 <description>chart</description>
3 <requirements> 3 <requirements>
4 <requirement type="package" version="0.0.13">fastx_toolkit</requirement> 4 <requirement type="package" version="0.0.13">fastx_toolkit</requirement>
5 </requirements> 5 </requirements>
6 <command>fasta_clipping_histogram.pl $input $outfile</command> 6 <command>fasta_clipping_histogram.pl $input $outfile</command>
7
8 <inputs>
9 <param format="fasta" name="input" type="data" label="Library to analyze" />
10 </inputs>
11 7
12 <outputs> 8 <inputs>
13 <data format="png" name="outfile" metadata_source="input" /> 9 <param format="fasta" name="input" type="data" label="Library to analyze" />
14 </outputs> 10 </inputs>
15 <help>
16 11
12 <outputs>
13 <data format="png" name="outfile" metadata_source="input" />
14 </outputs>
15 <tests>
16 </tests>
17 <help>
17 **What it does** 18 **What it does**
18 19
19 This tool creates a histogram image of the sequence length distribution in a given FASTA dataset file. 20 This tool creates a histogram image of the sequence length distribution in a given FASTA dataset file.
20 21
21 **TIP:** Use this tool after clipping your library (with **FASTX Clipper tool**), to visualize the clipping results. 22 **TIP:** Use this tool after clipping your library (with **FASTX Clipper tool**), to visualize the clipping results.
22 23
23 ----- 24 -----
24 25
25 **Output Examples** 26 **Output Examples**
26 27
27 In the following library, most sequences are 24-mers to 27-mers. 28 In the following library, most sequences are 24-mers to 27-mers.
28 This could indicate an abundance of endo-siRNAs (depending, of course, on what you've tried to sequence in the first place). 29 This could indicate an abundance of endo-siRNAs (depending, of course, on what you've tried to sequence in the first place).
29 30
30 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_1.png 31 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_1.png
31 32
32 33 In the following library, most sequences are 19-, 22-, or 23-mers.
33 In the following library, most sequences are 19-, 22-, or 23-mers.
34 This could indicate an abundance of miRNAs (depending, of course, on what you've tried to sequence in the first place). 34 This could indicate an abundance of miRNAs (depending, of course, on what you've tried to sequence in the first place).
35 35
36 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_2.png 36 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_2.png
37 37
38
39 ----- 38 -----
40
41 39
42 **Input Formats** 40 **Input Formats**
43 41
44 This tool accepts short-read FASTA files. The reads don't have to be short, but they do have to be on a single line, like so:: 42 This tool accepts short-read FASTA files. The reads don't have to be short, but they do have to be on a single line, like so::
45 43
47 AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG 45 AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG
48 >sequence2 46 >sequence2
49 GTGTGTGTGGGAAGTTGACACAGTA 47 GTGTGTGTGGGAAGTTGACACAGTA
50 >sequence3 48 >sequence3
51 CCTTGAGATTAACGCTAATCAAGTAAAC 49 CCTTGAGATTAACGCTAATCAAGTAAAC
52
53 50
54 If the sequences span over multiple lines:: 51 If the sequences span over multiple lines::
55 52
56 >sequence1 53 >sequence1
57 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG 54 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG
61 Use the **FASTA Width Formatter** tool to re-format the FASTA into single-line sequences:: 58 Use the **FASTA Width Formatter** tool to re-format the FASTA into single-line sequences::
62 59
63 >sequence1 60 >sequence1
64 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG 61 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG
65 62
66
67 ----- 63 -----
68
69
70 64
71 **Multiplicity counts (a.k.a reads-count)** 65 **Multiplicity counts (a.k.a reads-count)**
72 66
73 If the sequence identifier (the text after the '>') contains a dash and a number, it is treated as a multiplicity count value (i.e. how many times that individual sequence was repeated in the original FASTA file, before collapsing). 67 If the sequence identifier (the text after the '>') contains a dash and a number, it is treated as a multiplicity count value (i.e. how many times that individual sequence was repeated in the original FASTA file, before collapsing).
74 68
82 GGGATATATCCCCACACACACACAC 76 GGGATATATCCCCACACACACACAC
83 77
84 Each sequence counts as one, to produce the following chart: 78 Each sequence counts as one, to produce the following chart:
85 79
86 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_3.png 80 .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_3.png
87
88 81
89 Example 2 - The following FASTA file has multiplicity counts:: 82 Example 2 - The following FASTA file has multiplicity counts::
90 83
91 >seq1-2 84 >seq1-2
92 GGATCC 85 GGATCC
104 ------ 97 ------
105 98
106 This tool is based on `FASTX-toolkit`__ by Assaf Gordon. 99 This tool is based on `FASTX-toolkit`__ by Assaf Gordon.
107 100
108 .. __: http://hannonlab.cshl.edu/fastx_toolkit/ 101 .. __: http://hannonlab.cshl.edu/fastx_toolkit/
109 102 </help>
110 </help>
111 <!-- FASTA-Clipping-Histogram is part of the FASTX-toolkit, by A.Gordon (gordon@cshl.edu) -->
112 </tool> 103 </tool>
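
Note on the multiplicity-count rule in the help text: the sketch below illustrates, in Python, how a length histogram could weight each sequence by the number after the dash in its identifier. It is not the Perl script shipped with FASTX-toolkit. The function name `length_histogram` and the pattern used to find the count (a dash followed by digits at the end of the identifier) are assumptions made for illustration, and the sketch expects single-line sequences as the help text requires.

```python
import re
from collections import Counter

# Assumption: the multiplicity count is a dash plus digits at the END of the
# identifier (e.g. ">seq1-2"). The real script may match differently.
_COUNT_SUFFIX = re.compile(r"-(\d+)$")


def length_histogram(lines):
    """Return {sequence_length: total_count} for single-line FASTA records."""
    hist = Counter()
    count = None  # count for the record whose sequence line we expect next
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            match = _COUNT_SUFFIX.search(line[1:])
            # Identifiers without a count suffix are counted once.
            count = int(match.group(1)) if match else 1
        elif count is not None:
            hist[len(line)] += count
            # Only the first sequence line of a record is used; any extra
            # lines of a multi-line record are ignored in this sketch.
            count = None
    return dict(hist)


if __name__ == "__main__":
    demo = [">seq1-2", "GGATCC", ">seq2", "GGATCCA"]
    for length, total in sorted(length_histogram(demo).items()):
        print(length, total)
```

In the demo input, `seq1-2` contributes two reads of length 6 and `seq2`, which has no count suffix, contributes one read of length 7.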