Mercurial > repos > devteam > fasta_clipping_histogram
view fasta_clipping_histogram.xml @ 5:cf0a5c96c7f8 draft default tip
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tool_collections/fastx_toolkit/fasta_clipping_histogram commit fd099d17eceaa319fbfe429f4725328d88b18c9f
author | iuc |
---|---|
date | Thu, 10 Aug 2023 06:50:14 +0000 |
parents | 40ec4170291e |
children |
line wrap: on
line source
<tool id="cshl_fasta_clipping_histogram" name="Length Distribution" version="1.0.1+galaxy@VERSION_SUFFIX@" profile="22.05"> <description>chart</description> <macros> <import>macros.xml</import> </macros> <expand macro="requirements"> <requirement type="package" version="1.49">perl-gdgraph</requirement> </expand> <command detect_errors="exit_code"><![CDATA[ fasta_clipping_histogram.pl '$input' '$outfile' ]]></command> <inputs> <expand macro="fasta_input" /> </inputs> <outputs> <data name="outfile" format="png" metadata_source="input" /> </outputs> <tests> <test> <param name="input" value="fasta_clipping_histogram-in1.fa" /> <output name="outfile" file="fasta_clipping_histogram-out1.png" compare="sim_size" delta="512" /> </test> <test> <param name="input" value="fasta_clipping_histogram-in2.fa" /> <output name="outfile" file="fasta_clipping_histogram-out2.png" compare="sim_size" delta="512" /> </test> </tests> <help><![CDATA[ **What it does** This tool creates a histogram image of sequence lengths distribution in a given fasta dataset file. **TIP:** Use this tool after clipping your library (with **FASTX Clipper tool**), to visualize the clipping results. ----- **Output Examples** In the following library, most sequences are 24-mers to 27-mers. This could indicate an abundance of endo-siRNAs (depending of course of what you've tried to sequence in the first place). .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_1.png In the following library, most sequences are 19,22 or 23-mers. This could indicate an abundance of miRNAs (depending of course of what you've tried to sequence in the first place). .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_2.png ----- **Input Formats** This tool accepts short-reads FASTA files. The reads don't have to be short, but they do have to be on a single line, like so:: >sequence1 AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG >sequence2 GTGTGTGTGGGAAGTTGACACAGTA >sequence3 CCTTGAGATTAACGCTAATCAAGTAAAC If the sequences span over multiple lines:: >sequence1 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG TCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAG aactggtctttacctTTAAGTTG Use the **FASTA Width Formatter** tool to re-format the FASTA into a single-lined sequences:: >sequence1 CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG ----- **Multiplicity counts (a.k.a reads-count)** If the sequence identifier (the text after the '>') contains a dash and a number, it is treated as a multiplicity count value (i.e. how many times that individual sequence repeated in the original FASTA file, before collapsing). Example 1 - The following FASTA file *does not* have multiplicity counts:: >seq1 GGATCC >seq2 GGTCATGGGTTTAAA >seq3 GGGATATATCCCCACACACACACAC Each sequence is counts as one, to produce the following chart: .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_3.png Example 2 - The following FASTA file have multiplicity counts:: >seq1-2 GGATCC >seq2-10 GGTCATGGGTTTAAA >seq3-3 GGGATATATCCCCACACACACACAC The first sequence counts as 2, the second as 10, the third as 3, to produce the following chart: .. image:: ${static_path}/fastx_icons/fasta_clipping_histogram_4.png Use the **FASTA Collapser** tool to create FASTA files with multiplicity counts. ------ This tool is based on `FASTX-toolkit`__ by Assaf Gordon. .. __: http://hannonlab.cshl.edu/fastx_toolkit/ ]]></help> <expand macro="citations" /> </tool>