view BlastAlign.xml @ 2:92615a423389 draft
planemo upload for repository https://github.com/abims-sbr/adaptsearch commit 44a89d5eeb82789bfc643b33c11f391281b6374b
author | abims-sbr |
---|---|
date | Wed, 27 Sep 2017 10:02:43 -0400 |
parents | aba551b2b79e |
children | 49017ea906b5 |
<tool name="BlastAlign" id="blastalign" version="2.0">

    <description>
        Align the nucleic acid sequences using BLASTN
    </description>

    <macros>
        <import>macros.xml</import>
    </macros>

    <requirements>
        <expand macro="python_required" />
        <requirement type="package" version="1.4">blastalign</requirement>
    </requirements>

    <stdio>
        <exit_code range="1:" level="fatal" />
    </stdio>

    <command><![CDATA[
        ln -s '$input' '$input.element_identifier'".fasta"
        &&
        BlastAlign
        -i '$input.element_identifier'".fasta"
        -m $advanced_option.m
        #if $advanced_option.r != ""
            -r $advanced_option.r
        #end if
        #if $advanced_option.x != ""
            -x $advanced_option.x
        #end if
        -n $advanced_option.n
        #if $advanced_option.s != 0
            -s $advanced_option.s
        #end if
        &&
        ln -s '$input.element_identifier'".fasta.phy" out.phy
        &&
        ln -s '$input.element_identifier'".fasta.nxs" out.nxs
        #if $fasta_out.value == True
            &&
            python $__tool_directory__/scripts/S01_phylip2fasta.py out.phy out.fasta
        #end if
    ]]></command>

    <inputs>
        <param name="input" type="data" format="fasta" label="Choose your file" help="A fasta file with nucleotide sequences" />
        <section name="advanced_option" title="Blast advanced options" expanded="True">
            <param argument="-m" type="integer" value="95" min="0" max="100" label="Proportion of gaps allowed in any one sequence in the final alignment" help="default = 95, i.e.
only removing sequences with extremely short matches" />
            <param argument="-r" type="text" area="True" size="1x20" label="Choose a reference sequence" help="default is to search for best candidate (if entered, the sequence will be extracted, written to a separate file, and blasted against the original input file)"/>
            <param argument="-x" type="text" area="True" size="5x25" label="Choose the sequences to be excluded from this analysis" help="comma-separated names of the sequences to exclude" />
            <param argument="-n" type="boolean" checked="true" truevalue="T" falsevalue="F" label="Retain original names in output files" help="option F is to output the 15-character name abbreviations (stripped of potentially problematic characters) that are used in the program" />
            <param argument="-s" type="integer" value="0" min="0" label="Number of sequences to be used in initial search for reference sequence" help="default (= 0) is to find the reference sequence by blasting all sequences against all sequences, only randomly subsampling when it thinks the blast output file might be too large" />
        </section>
        <param name="fasta_out" type="boolean" checked="true" label="Convert the Phylip output to FASTA format?" />
    </inputs>

    <outputs>
        <data format="phy" name="phy" from_work_dir="out.phy" label="Alignment of ${input.name} in phylip" />
        <data format="nxs" name="nxs" from_work_dir="out.nxs" label="Alignment of ${input.name} in nexus" />
        <data format="fasta" name="fasta" from_work_dir="out.fasta" label="Alignment of ${input.name} in fasta">
            <filter>fasta_out == True</filter>
        </data>
    </outputs>

    <tests>
        <test>
            <param name="input" ftype="fasta" value="inputs/locus1_sp2.fasta" />
            <section name="advanced_option">
                <param name="m" value="95" />
                <param name="r" value="" />
                <param name="x" value="" />
                <param name="n" value="False" />
                <param name="s" value="0" />
            </section>
            <param name="fasta_out" value="True" />
            <output name="phy" value="outputs/locus1_sp2.phy" />
            <output name="nxs" value="outputs/locus1_sp2.nxs" />
            <output name="fasta" value="outputs/locus1_sp2.fasta" />
        </test>
        <test>
            <param name="input" ftype="fasta" value="inputs/locus1_sp3.fasta" />
            <section name="advanced_option">
                <param name="m" value="95" />
                <param name="r" value="" />
                <param name="x" value="" />
                <param name="n" value="False" />
                <param name="s" value="0" />
            </section>
            <param name="fasta_out" value="True" />
            <output name="phy" value="outputs/locus1_sp3.phy" />
            <output name="nxs" value="outputs/locus1_sp3.nxs" />
            <output name="fasta" value="outputs/locus1_sp3.fasta" />
        </test>
        <!--locus10_sp2.fasta locus1_sp3.fasta locus2_sp2.fasta locus3_sp2.fasta locus4_sp2.fasta locus5_sp2.fasta locus6_sp2.fasta locus7_sp2.fasta locus8_sp2.fasta locus9_sp2.fasta-->
    </tests>

    <help>

.. class:: infomark

**Authors**  The Perl script was written by **Robert Belshaw** and **Aris Katzourakis**.
@HELP_AUTHORS@

============
What it does
============

| This tool takes **nucleic sequences in fasta format** or a **'dataset collection list' containing fasta files** and returns a multiple alignment (in Nexus and Phylip formats) using BLAST+.
|

--------

==========
Parameters
==========

Several BLAST parameters can be set.

**-m [maximum proportion of gaps allowed in any one sequence in the final alignment]**
    | integer (between 0 and 100)
    | Default: 95%, i.e. only removes sequences with extremely short matches.
    | We find 50 the most useful.
    |

**-r [name of reference sequence]**
    | text
    | Default is to search for the best candidate.
    | If entered, the sequence will be extracted, written to a separate file, and blasted against the original input file.
    |

**-x [comma-separated names of sequences to be excluded from this analysis]**
    | text
    |

**-n**
    | If checked: retain the original names in the output files.
    | If unchecked: output the 15-character name abbreviations (stripped of potentially problematic characters) that are used in the tool.
    |

**-s [number of sequences to be used in initial search for reference sequence]**
    | integer (between 0 and the total number of sequences)
    | Default is to find the reference sequence by blasting all sequences against all sequences, only randomly subsampling when it thinks the blast output file might be too large.

--------

=======
Outputs
=======

This tool produces the following files:

**Alignment**
    | is the output with important information.
    | When the alignment fails with BlastAlign, the name of the file is written to this output.
    |

**Alignement_file_failed**
    | is the output listing the files that failed during the BlastAlign run.
    |

**Alignment_{inputfile}_phylip**
    | is the output with the aligned sequences in Phylip format.
    |

**Alignment_{inputfile}_nexus**
    | is the output with the aligned sequences in Nexus format.
| **Alignment_{input_file}_fasta** | is the output with the aligned sequences in Fasta format. -------- =============== Working Example =============== ------------------------------ The input file and its options ------------------------------ **Input file** | >Pf210_1/1_1.000_920 | CCGGTGGCCATTTTCTGCACCTCGTGGGTTATTGAGCTGAAAGTGGTTCAGCTCACTGTCTGTTAACAGCCGTGTCGGTCTGAGGGTATCACAGTTAATATAATGAATCAAGAGAAGTTGAAGCAGCTCCAGGCCCAAGTCCGCATCGGAGGAAAGGG | CACAGCAAGAAGAAAGAAGAAGGTGATTCACAGAACAGCAACAACAGATGACAAGAAACTGCAAAGTACACTGAAGAAATTGGCAGTAAATAATATTCCGGGTATAGAAGAGGTTAACATGATAAAGGATGACGGGCAAGTAATACATTTTACCAATCCGA | AGGTGCAGGCTTCTCTTCAGTCAAACACATTTGCCATTAATGGCCAAGCCGAAACGAAACAAATCACTGACTTGCTACCCGGTATATTAAATCAGCTGGGGGCTGAAAGTTTAACAAACTTGAAGAAGCTGGCTAAATCTGTGACTGCTGGAGTTGATTC | TGATAACAAGCAGGATGCAGCAGATATTGATGAAGATGATGATGATGTCCCAGAACTGGTTGAAAACTTTGACGAAGCATCGAAGAATGAGGGGACGTAATTCTTCTCCCACTTTATGCCATGGTAGCATCAATCGTTTTGCTGATGATGGCGTGTTTATAC | CTACCACCCAGTGTAGATTTGTCCAGACCTGGCTTGTTTGACATTGCTTGTTGGATTTTGCAACAATATCATGATTAGACTGCCTGGCTTTGTGGCCTAAATACTGTATTAAAGTGTCTGTAAAAGGGAAGCAATTTTTCTATTAAGAAGTTATCCACTAGCAT | ATTGACAGTTTTGCATGTTTGATTTTGTTCCTCGTGCAGGTCAGAACACTGATTGTACAGTGGCTGATTACAGAAAAATTGTATTCAGAGTTAAATAAACACATTATTATCCAAA | >Pp_17_1/1_1.000_930 | CCGGTGGCCATTTTCTGCACCTCGTGGGTATCTTGGGTTCGATTTGTATCAGCTCCCTATGTAAAATTAAACAAACTTATAACATAGATTGCAGCTGACAATACAATGAACCAAGAAAAATTAAAACAACTCCAAGCCCAGGTGCGCATTGGAGGCAAGGG | TACAGCAAGAAGAAAGAAGAAGGTCATTCATAGAACAGCAACAACAGATGATAAAAAACTGCAGAGTACATTAAAAAAACTAGCAGTAAATAATATTCCAGGTATAGAAGAGGTTAATATGATAAAAGATGATGGACAGGTAATACATTTTACCAATCCAAAA | GTACAGGCTTCTCTACAGTCAAACACATTTGCTATTAATGGGCAAGCTGAGACAAAACAAATCACCGAATTGTTGCCTGGTATATTAAATCAGCTGGGAGCAGAAAGTTTAACAAATCTGAAGAAACTGGCTACATCCGTGACTGGTGGAGTTGATTCTGAT | AACAAGCCAGAAACAGCAGAAATTGATGAAGACGATGATGATGTTCCAGATTTGGTTGAAAACTTTGACGAGGCATCCAAGAATGAAGGAACGTAATTTGTCATTGGTAGATCCTCCCATAGCCTGATTCTTGTGGCTGGCGACAGCTTGTTTATATTTTAC 
CCAGTGTAGATTTGTTCAAGAAGGTGTGCTGGCGTTGTTTGAATTTTGTAATAGTACCATGATTTAAATACCCGGTTAACGGCCTACCTGTTATGTAGAAATTGTAGAGAAAAAATTAAATCAATTTTGTATGAACTATAAGCAGCAGCTAATATATTTGCAGTTT TACATGTTTATCTGTTCATCAGCATGGGTCAGAGAATGACCGTACTTTGCTGGTGATAGAATGCTTGTATTCAAAGTTTAATAAATGGTTGTAAGCCATTTAAAAAAAAAAAAAAA ---------------- The output files ---------------- **BlastAlign** ************************ BlastAlign ************************ | | This program takes nucleotide sequences in fasta format and returns a multiple alignment (in Nexus and Phylip formats) using BLASTN | | Input file locus_2_sp_8.fasta has 2 sequences and is 1894 bytes | (maximum number of sequences that will be used to search for the reference sequence is 770) | | | BlastAlign finished: it has produced a multiple alignment of 2 sequences and length 720 by aligning to sequence Pf2101/11000920 (proportion of gaps in each sequence is less than 0.95) | **Alignment_{inputfile}_phylip** | 2 720 S | Pf2101/1100 ccggtggccattttctgcacctcgtgggttattgagctgaaagtggttcagctcactgtctgttaacagccgtgtcggtctgagggtatcacagttaatataatgaatcaagagaagttgaagcagctccaggcccaagtccgcatcggaggaaagggcacagcaagaagaaagaagaaggtgattcacagaacagcaacaacagat gacaagaaactgcaaagtacactgaagaaattggcagtaaataatattccgggtatagaagaggttaacatgataaaggatgacgggcaagtaatacattttaccaatccgaaggtgcaggcttctcttcagtcaaacacatttgccattaatggccaagccgaaacgaaacaaatcactgacttgctacccggtatattaaatcagctgggggctgaaag tttaacaaacttgaagaagctggctaaatctgtgactgctggagttgattctgataacaagcaggatgcagcagatattgatgaagatgatgatgatgtcccagaactggttgaaaactttgacgaagcatcgaagaatgaggggacgtaattcttctcccactttatgccatggtagcatcaatcgttttgctgatgatggcgtgtttatacctaccacccagtgtaga tttgtccagacctggcttgtttgacattgcttgttggattttgcaacaatatcatgattaga | Pp171/11000 ccggtggccattttctgcacctcgtgggt-------------------------------------------------------------------aatacaatgaaccaagaaaaattaaaacaactccaagcccaggtgcgcattggaggcaagggtacagcaagaagaaagaagaaggtcattcatagaacagcaacaacagatgataaaaaactgcagag | 
tacattaaaaaaactagcagtaaataatattccaggtatagaagaggttaatatgataaaagatgatggacaggtaatacattttaccaatccaaaagtacaggcttctctacagtcaaacacatttgctattaatgggcaagctgagacaaaacaaatcaccgaattgttgcctggtatattaaatcagctgggagcagaaagtttaacaaatctgaagaaact | ggctacatccgtgactggtggagttgattctgataacaagccagaaacagcagaaattgatgaagacgatgatgatgttccagatttggttgaaaactttgacgaggcatccaagaatgaaggaacgtaatt-----------------------------------------------------------------acccagtgtagatttgt---------------------------------------------- | ------------- | **Alignment_{inputfile}_nexus** | #NEXUS [Aligned to seq Pf2101/1100 by BlastAlign. We have excluded sequences with more than 0.95 gaps] BEGIN DATA; | dimensions ntax=2 nchar=720; | format gap=- datatype=DNA; | matrix Pf2101/1100 ccggtggccattttctgcacctcgtgggttattgagctgaaagtggttcagctcactgtctgttaacagccgtgtcggtctgagggtatcacagttaatataatgaatcaagagaagttgaagcagctccaggcccaagtccgcatcggaggaaagggcacagcaagaagaaagaagaaggtgattcacagaacagcaacaacagat gacaagaaactgcaaagtacactgaagaaattggcagtaaataatattccgggtatagaagaggttaacatgataaaggatgacgggcaagtaatacattttaccaatccgaaggtgcaggcttctcttcagtcaaacacatttgccattaatggccaagccgaaacgaaacaaatcactgacttgctacccggtatattaaatcagctgggggctgaaag tttaacaaacttgaagaagctggctaaatctgtgactgctggagttgattctgataacaagcaggatgcagcagatattgatgaagatgatgatgatgtcccagaactggttgaaaactttgacgaagcatcgaagaatgaggggacgtaattcttctcccactttatgccatggtagcatcaatcgttttgctgatgatggcgtgtttatacctaccacccagtgtaga tttgtccagacctggcttgtttgacattgcttgttggattttgcaacaatatcatgattaga | Pp171/11000 ccggtggccattttctgcacctcgtgggt-------------------------------------------------------------------aatacaatgaaccaagaaaaattaaaacaactccaagcccaggtgcgcattggaggcaagggtacagcaagaagaaagaagaaggtcattcatagaacagcaacaacagatgataaaaaactgcagag | tacattaaaaaaactagcagtaaataatattccaggtatagaagaggttaatatgataaaagatgatggacaggtaatacattttaccaatccaaaagtacaggcttctctacagtcaaacacatttgctattaatgggcaagctgagacaaaacaaatcaccgaattgttgcctggtatattaaatcagctgggagcagaaagtttaacaaatctgaagaaac | 
tggctacatccgtgactggtggagttgattctgataacaagccagaaacagcagaaattgatgaagacgatgatgatgttccagatttggttgaaaactttgacgaggcatccaagaatgaaggaacgtaatt-----------------------------------------------------------------acccagtgtagatttgt-------------------------------------------- | ------------- | ; | end; | **Alignment_{inputfile}_fasta** | >Pf2101/11000920 ccggtggccattttctgcacctcgtgggttattgagctgaaagtggttcagctcactgtctgttaacagccgtgtcggtctgagggtatcacagttaatataatgaatcaagagaagttgaagcagctccaggcccaagtccgcatcggaggaaagggcacagcaagaagaaagaagaaggtgattcacagaacagcaacaacagatgacaagaaactg caaagtacactgaagaaattggcagtaaataatattccgggtatagaagaggttaacatgataaaggatgacgggcaagtaatacattttaccaatccgaaggtgcaggcttctcttcagtcaaacacatttgccattaatggccaagccgaaacgaaacaaatcactgacttgctacccggtatattaaatcagctgggggctgaaagtttaacaaacttgaa gaagctggctaaatctgtgactgctggagttgattctgataacaagcaggatgcagcagatattgatgaagatgatgatgatgtcccagaactggttgaaaactttgacgaagcatcgaagaatgaggggacgtaattcttctcccactttatgccatggtagcatcaatcgttttgctgatgatggcgtgtttatacctaccacccagtgtagatttgtccagacctggc ttgtttgacattgcttgttggattttgcaacaatatcatgattaga | >Pp171/11000930 ccggtggccattttctgcacctcgtgggt-------------------------------------------------------------------aatacaatgaaccaagaaaaattaaaacaactccaagcccaggtgcgcattggaggcaagggtacagcaagaagaaagaagaaggtcattcatagaacagcaacaacagatgataaaaaactgcagagtacattaaaaaa actagcagtaaataatattccaggtatagaagaggttaatatgataaaagatgatggacaggtaatacattttaccaatccaaaagtacaggcttctctacagtcaaacacatttgctattaatgggcaagctgagacaaaacaaatcaccgaattgttgcctggtatattaaatcagctgggagcagaaagtttaacaaatctgaagaaactggctacatccg tgactggtggagttgattctgataacaagccagaaacagcagaaattgatgaagacgatgatgatgttccagatttggttgaaaactttgacgaggcatccaagaatgaaggaacgtaatt-----------------------------------------------------------------acccagtgtagatttgt--------------------------------------------------------- --------------------------------------------------- Changelog --------- **Version 2.0 - 21/04/2017** - NEW: BlastAlign will now be launched on one file at once. 
   It can nevertheless be run on a Dataset Collection to process numerous files.

**Version 1.0 - 13/04/2017**

 - TEST: Add functional test with planemo
 - IMPROVEMENT: Use conda dependencies for blastalign, blast-legacy, perl, python

    </help>

    <expand macro="citations" />

</tool>
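
The command template in the wrapper adds `-r`, `-x` and `-s` only when they are set, and always passes `-i`, `-m` and `-n`. The following is a minimal Python sketch of that assembly logic, intended only as an illustration. The function name `build_blastalign_args` and its signature are hypothetical and are not part of the tool or of BlastAlign itself; the flags and defaults are the ones declared in the wrapper above.

```python
def build_blastalign_args(fasta_name, m=95, r="", x="", n=True, s=0):
    """Return the BlastAlign argument list for the given wrapper options.

    Hypothetical helper: mirrors the conditional logic of the wrapper's
    command template, it is not code shipped with the tool.
    """
    # The wrapper declares -m as an integer between 0 and 100.
    if not 0 <= m <= 100:
        raise ValueError("m must be between 0 and 100")
    # The wrapper declares -s as an integer with a minimum of 0.
    if s < 0:
        raise ValueError("s must be non-negative")

    args = ["BlastAlign", "-i", fasta_name, "-m", str(m)]
    if r != "":
        # Reference sequence name, only passed when one is given.
        args += ["-r", r]
    if x != "":
        # Comma-separated names of sequences to exclude.
        args += ["-x", x]
    # The boolean is mapped to T/F (truevalue/falsevalue in the wrapper).
    args += ["-n", "T" if n else "F"]
    if s != 0:
        # 0 means "let BlastAlign decide", so the flag is omitted.
        args += ["-s", str(s)]
    return args


if __name__ == "__main__":
    # Same options as the wrapper's functional tests (n set to False).
    print(" ".join(build_blastalign_args("locus1_sp2.fasta", n=False)))
```

Returning a list rather than a single string keeps each argument separate, which avoids quoting problems if the list is later passed to something like `subprocess.run`.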