annotate fasta_clipping_histogram.xml @ 3:d0969fa24eb1 draft
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
author | devteam |
---|---|
date | Tue, 13 Oct 2015 12:38:50 -0400 |
parents | de44f4045b05 |
children | 40ec4170291e |
<tool id="cshl_fasta_clipping_histogram" name="Length Distribution" version="1.0.0">
    <description>chart</description>
    <requirements>
        <requirement type="package" version="0.0.13">fastx_toolkit</requirement>
    </requirements>
    <command>fasta_clipping_histogram.pl $input $outfile</command>

    <inputs>
        <param format="fasta" name="input" type="data" label="Library to analyze" />
    </inputs>

    <outputs>
        <data format="png" name="outfile" metadata_source="input" />
    </outputs>
    <tests>
    </tests>
    <help>
**What it does**

This tool creates a histogram image of the sequence length distribution in a given FASTA dataset.

**TIP:** Use this tool after clipping your library (with the **FASTX Clipper** tool) to visualize the clipping results.

-----

**Output Examples**

In the following library, most sequences are 24-mers to 27-mers.
This could indicate an abundance of endo-siRNAs (depending, of course, on what you tried to sequence in the first place).

.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_1.png

In the following library, most sequences are 19-, 22- or 23-mers.
This could indicate an abundance of miRNAs (depending, of course, on what you tried to sequence in the first place).

.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_2.png

-----

**Input Formats**

This tool accepts short-read FASTA files. The reads don't have to be short, but each sequence must be on a single line, like so::

    >sequence1
    AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG
    >sequence2
    GTGTGTGTGGGAAGTTGACACAGTA
    >sequence3
    CCTTGAGATTAACGCTAATCAAGTAAAC

If the sequences span multiple lines::

    >sequence1
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG
    TCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAG
    aactggtctttacctTTAAGTTG

Use the **FASTA Width Formatter** tool to re-format the FASTA into single-line sequences::

    >sequence1
    CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG

-----

**Multiplicity counts (a.k.a. read counts)**

If the sequence identifier (the text after the '>') contains a dash and a number, the number is treated as a multiplicity count (i.e. how many times that individual sequence was repeated in the original FASTA file, before collapsing).

Example 1 - The following FASTA file *does not* have multiplicity counts::

    >seq1
    GGATCC
    >seq2
    GGTCATGGGTTTAAA
    >seq3
    GGGATATATCCCCACACACACACAC

Each sequence counts as one, to produce the following chart:

.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_3.png

Example 2 - The following FASTA file has multiplicity counts::

    >seq1-2
    GGATCC
    >seq2-10
    GGTCATGGGTTTAAA
    >seq3-3
    GGGATATATCCCCACACACACACAC

The first sequence counts as 2, the second as 10, the third as 3, to produce the following chart:

.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_4.png

Use the **FASTA Collapser** tool to create FASTA files with multiplicity counts.

------

This tool is based on `FASTX-toolkit`__ by Assaf Gordon.

.. __: http://hannonlab.cshl.edu/fastx_toolkit/
    </help>
</tool>
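
The help text describes two behaviours that a short example may make more concrete: each sequence is expected on a single line, and a dash followed by a number in the identifier is read as a multiplicity count. The Python sketch below illustrates that counting logic only; it is not the code of `fasta_clipping_histogram.pl`. The function names (`read_fasta`, `length_histogram`) are hypothetical, and the rule that the count is a trailing `-N` suffix is an assumption inferred from the examples in the help. Unlike the tool as described, the sketch also joins sequences that wrap over several lines, which stands in for running the FASTA Width Formatter first.

```python
import re
from collections import Counter

# Assumption: the multiplicity count is a trailing "-N" on the identifier,
# as in ">seq2-10". The real script may match identifiers differently.
COUNT_SUFFIX = re.compile(r"-(\d+)$")


def read_fasta(lines):
    """Yield (identifier, sequence) pairs, joining wrapped sequence lines."""
    identifier, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if identifier is not None:
                yield identifier, "".join(chunks)
            identifier, chunks = line[1:], []
        else:
            chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)


def length_histogram(lines):
    """Map each sequence length to its multiplicity-weighted read count."""
    histogram = Counter()
    for identifier, sequence in read_fasta(lines):
        match = COUNT_SUFFIX.search(identifier)
        # Identifiers without a count suffix contribute a single read.
        count = int(match.group(1)) if match else 1
        histogram[len(sequence)] += count
    return dict(histogram)


if __name__ == "__main__":
    # The collapsed FASTA from Example 2 of the help text.
    collapsed = [
        ">seq1-2", "GGATCC",
        ">seq2-10", "GGTCATGGGTTTAAA",
        ">seq3-3", "GGGATATATCCCCACACACACACAC",
    ]
    for length, count in sorted(length_histogram(collapsed).items()):
        print(length, count)
```

Running the module prints one `length count` pair per line for the Example 2 input; those pairs correspond to the bar heights a length-distribution chart would show.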