Mercurial > repos > devteam > fasta_clipping_histogram
changeset 0:82e8c467e2ec draft
Uploaded
author | devteam |
---|---|
date | Wed, 25 Sep 2013 14:38:48 -0400 |
parents | |
children | de44f4045b05 |
files | fasta_clipping_histogram.xml tool_dependencies.xml |
diffstat | 2 files changed, 118 insertions(+), 0 deletions(-) [+] |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/fasta_clipping_histogram.xml Wed Sep 25 14:38:48 2013 -0400 @@ -0,0 +1,112 @@ +<tool id="cshl_fasta_clipping_histogram" name="Length Distribution"> + <description>chart</description> + <requirements> + <requirement type="package" version="0.0.13">fastx_toolkit</requirement> + </requirements> + <command>fasta_clipping_histogram.pl $input $outfile</command> + + <inputs> + <param format="fasta" name="input" type="data" label="Library to analyze" /> + </inputs> + + <outputs> + <data format="png" name="outfile" metadata_source="input" /> + </outputs> +<help> + +**What it does** + +This tool creates a histogram image of sequence lengths distribution in a given fasta dataset file. + +**TIP:** Use this tool after clipping your library (with **FASTX Clipper tool**), to visualize the clipping results. + +----- + +**Output Examples** + +In the following library, most sequences are 24-mers to 27-mers. +This could indicate an abundance of endo-siRNAs (depending of course of what you've tried to sequence in the first place). + +.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_1.png + + +In the following library, most sequences are 19,22 or 23-mers. +This could indicate an abundance of miRNAs (depending of course of what you've tried to sequence in the first place). + +.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_2.png + + +----- + + +**Input Formats** + +This tool accepts short-reads FASTA files. The reads don't have to be short, but they do have to be on a single line, like so:: + + >sequence1 + AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG + >sequence2 + GTGTGTGTGGGAAGTTGACACAGTA + >sequence3 + CCTTGAGATTAACGCTAATCAAGTAAAC + + +If the sequences span over multiple lines:: + + >sequence1 + CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG + TCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAG + aactggtctttacctTTAAGTTG + +Use the **FASTA Width Formatter** tool to re-format the FASTA into a single-lined sequences:: + + >sequence1 + CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG + + +----- + + + +**Multiplicity counts (a.k.a reads-count)** + +If the sequence identifier (the text after the '>') contains a dash and a number, it is treated as a multiplicity count value (i.e. how many times that individual sequence repeated in the original FASTA file, before collapsing). + +Example 1 - The following FASTA file *does not* have multiplicity counts:: + + >seq1 + GGATCC + >seq2 + GGTCATGGGTTTAAA + >seq3 + GGGATATATCCCCACACACACACAC + +Each sequence is counts as one, to produce the following chart: + +.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_3.png + + +Example 2 - The following FASTA file have multiplicity counts:: + + >seq1-2 + GGATCC + >seq2-10 + GGTCATGGGTTTAAA + >seq3-3 + GGGATATATCCCCACACACACACAC + +The first sequence counts as 2, the second as 10, the third as 3, to produce the following chart: + +.. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_4.png + +Use the **FASTA Collapser** tool to create FASTA files with multiplicity counts. + +------ + +This tool is based on `FASTX-toolkit`__ by Assaf Gordon. + + .. __: http://hannonlab.cshl.edu/fastx_toolkit/ + +</help> +</tool> +<!-- FASTA-Clipping-Histogram is part of the FASTX-toolkit, by A.Gordon (gordon@cshl.edu) -->
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tool_dependencies.xml Wed Sep 25 14:38:48 2013 -0400 @@ -0,0 +1,6 @@ +<?xml version="1.0"?> +<tool_dependency> + <package name="fastx_toolkit" version="0.0.13"> + <repository changeset_revision="e76e81b3eccf" name="package_fastx_toolkit_0_0_13" owner="devteam" toolshed="http://testtoolshed.g2.bx.psu.edu" /> + </package> +</tool_dependency>