# HG changeset patch
# User peterjc
# Date 1381853254 14400
# Node ID ffefb87bd414f8d2cbd9b40fe1e78356468c39e5
# Parent df86ed992a1bea4faf94f17a8d429cbfa4fce532
Uploaded v0.0.1 preview 5, using MIRA 4.0 RC4, supports segment_placement (pairing type)

diff -r df86ed992a1b -r ffefb87bd414 tools/mira4/README.rst
--- a/tools/mira4/README.rst Fri Oct 11 04:28:45 2013 -0400
+++ b/tools/mira4/README.rst Tue Oct 15 12:07:34 2013 -0400
@@ -1,5 +1,5 @@
-Galaxy tool to wrap the MIRA sequence assembly program (v4.0)
-=============================================================
+Galaxy wrapper for the MIRA assembly program (v4.0)
+===================================================
 
 This tool is copyright 2011-2013 by Peter Cock, The James Hutton Institute
 (formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
@@ -11,6 +11,11 @@
 It is available from the Galaxy Tool Shed at:
 http://toolshed.g2.bx.psu.edu/view/peterjc/mira4_assembler
 
+It uses a Galaxy datatype definition 'mira' for the MIRA Assembly Format,
+http://toolshed.g2.bx.psu.edu/view/peterjc/mira_datatypes
+
+A separate wrapper for MIRA v3.4 is available from the Galaxy Tool Shed at:
+http://toolshed.g2.bx.psu.edu/view/peterjc/mira_assembler
 
 Automated Installation
 ======================
@@ -23,9 +28,7 @@
 cluster settings for de novo usage (high RAM) and mapping (lower RAM).
 Consult the Galaxy adminstration documentation for your cluster setup.
 
-WARNING: This tool was developed to construct viral genome assembly and
-mapping pipelines, for which the run time and memory requirements are
-negligible. For larger tasks, be aware that MIRA can require vast amounts
+WARNING: For larger tasks, be aware that MIRA can require vast amounts
 of RAM and run-times of over a week are possible. This tool wrapper makes
 no attempt to spot and reject such large jobs.
 
@@ -50,7 +53,7 @@
-You will also need to install MIRA, we used version 4.0 RC3. See:
+You will also need to install MIRA, we used version 4.0 RC4. See:
 
 * http://chevreux.org/projects_mira.html
 * http://sourceforge.net/projects/mira-assembler/
@@ -65,7 +68,7 @@
 ======= ======================================================================
 Version Changes
 ------- ----------------------------------------------------------------------
-v0.0.1  - Initial version (prototype for MIRA 4.0 RC3, based on wrapper for v3.4)
+v0.0.1  - Initial version (prototype for MIRA 4.0 RC4, based on wrapper for v3.4)
 ======= ======================================================================
 
 
diff -r df86ed992a1b -r ffefb87bd414 tools/mira4/mira4.py
--- a/tools/mira4/mira4.py Fri Oct 11 04:28:45 2013 -0400
+++ b/tools/mira4/mira4.py Tue Oct 15 12:07:34 2013 -0400
@@ -31,14 +31,13 @@
     return ver.split("\n", 1)[0]
 
 
-os.environ["PATH"] = "/mnt/galaxy/downloads/mira_4.0rc3_linux-gnu_x86_64_static/bin/:%s" % os.environ["PATH"]
+os.environ["PATH"] = "/mnt/galaxy/downloads/mira_4.0rc4_linux-gnu_x86_64_static/bin/:%s" % os.environ["PATH"]
 mira_binary = "mira"
 mira_ver = get_version(mira_binary)
 if not mira_ver.strip().startswith("4.0"):
     stop_err("This wrapper is for MIRA V4.0, not:\n%s" % mira_ver)
-if "-v" in sys.argv:
-    print "MIRA wrapper version %s," % WRAPPER_VER
-    print mira_ver
+if "-v" in sys.argv or "--version" in sys.argv:
+    print "%s, MIRA wrapper version %s" % (mira_ver, WRAPPER_VER)
     sys.exit(0)
 
 
diff -r df86ed992a1b -r ffefb87bd414 tools/mira4/mira4_de_novo.xml
--- a/tools/mira4/mira4_de_novo.xml Fri Oct 11 04:28:45 2013 -0400
+++ b/tools/mira4/mira4_de_novo.xml Tue Oct 15 12:07:34 2013 -0400
@@ -5,7 +5,7 @@
 mira
 MIRA
- mira4.py -v
+ mira4.py --version
 mira4.py $manifest $out_maf $out_fasta $out_log
@@ -29,6 +29,13 @@
+
+
+
+
+
+
+
@@ -62,6 +69,10 @@
 technology = ${rg.technology}
 ##MIRA will accept multiple filenames on one data line, or multiple data lines
 #for $f in $rg.filenames
+#if str($rg.segment_placement) != ""
+##Record the segment placement (if any)
+segmentplacement = ${rg.segment_placement}
+#end if
 ##Must now map Galaxy datatypes to MIRA file types...
 #if $f.ext.startswith("fastq")
 ##MIRA doesn't like fastqsanger etc, just plain old fastq:
@@ -109,6 +120,19 @@
 It is particularly suited to small genomes such as bacteria.
 
+**Notes**
+
+.. class:: warningmark
+
+Note that the raw data for Roche 454 and Ion Torrent paired-end libraries
+sequences a circularised fragment such that the raw data starts with the
+end of the fragment, a linker, then the start of the fragment. This means
+both the start and end are sequenced from the same strand, and thus should
+be given to MIRA as orientation "2---> 1--->". However, in order to
+use this data with traditional tools expecting Sanger capillary style
+libraries which expect "---> <---" your FASTQ files may have been
+pre-processed to mimic this by reverse complementing one of the pair.
+
 **Citation**
 
 If you use this Galaxy tool in work leading to a scientific publication please
diff -r df86ed992a1b -r ffefb87bd414 tools/mira4/mira4_mapping.xml
--- a/tools/mira4/mira4_mapping.xml Fri Oct 11 04:28:45 2013 -0400
+++ b/tools/mira4/mira4_mapping.xml Tue Oct 15 12:07:34 2013 -0400
@@ -5,7 +5,7 @@
 mira
 MIRA
- mira4.py -v
+ mira4.py --version
 mira4.py $manifest $out_maf $out_fasta $out_log
@@ -38,6 +38,13 @@
+
+
+
+
+
+
+
@@ -97,6 +104,10 @@
 ##This is perhaps redundant as MIRA defaults to StrainX for the reads:
 strain = StrainX
 #end if
+#if str($rg.segment_placement) != ""
+##Record the segment placement (if any)
+segmentplacement = ${rg.segment_placement}
+#end if
 ##MIRA will accept multiple filenames on one data line, or multiple data lines
 #for $f in $rg.filenames
 ##Must now map Galaxy datatypes to MIRA file types...
@@ -149,6 +160,19 @@
 It is particularly suited to small genomes such as bacteria.
 
+**Notes**
+
+.. class:: warningmark
+
+Note that the raw data for Roche 454 and Ion Torrent paired-end libraries
+sequences a circularised fragment such that the raw data starts with the
+end of the fragment, a linker, then the start of the fragment. This means
+both the start and end are sequenced from the same strand, and thus should
+be given to MIRA as orientation "2---> 1--->". However, in order to
+use this data with traditional tools expecting Sanger capillary style
+libraries which expect "---> <---" your FASTQ files may have been
+pre-processed to mimic this by reverse complementing one of the pair.
+
 **Citation**
 
 If you use this Galaxy tool in work leading to a scientific publication please
diff -r df86ed992a1b -r ffefb87bd414 tools/mira4/tool_dependencies.xml
--- a/tools/mira4/tool_dependencies.xml Fri Oct 11 04:28:45 2013 -0400
+++ b/tools/mira4/tool_dependencies.xml Tue Oct 15 12:07:34 2013 -0400
@@ -3,9 +3,9 @@
- https://downloads.sourceforge.net/project/mira-assembler/MIRA/stable/mira_4.0rc3_linux-gnu_x86_64_static.tar.bz2
+ https://downloads.sourceforge.net/project/mira-assembler/MIRA/stable/mira_4.0rc4_linux-gnu_x86_64_static.tar.bz2
- mira_4.0rc3_linux-gnu_x86_64_static/bin
+ mira_4.0rc4_linux-gnu_x86_64_static/bin
 $INSTALL_DIR
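
The warning note added to both tool help texts says that FASTQ files may have been pre-processed by reverse complementing one read of each pair. As a rough illustration of what that pre-processing does to a single read, here is a minimal Python sketch. The helper name `reverse_complement_record` is hypothetical and is not part of the mira4 wrapper or of MIRA; it assumes plain A/C/G/T/N bases and a quality string with one character per base.

```python
# Hypothetical helper for illustration only; not part of mira4.py or MIRA.
# Assumes the sequence uses only A, C, G, T, N (either case).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_record(seq, qual):
    """Return (sequence, quality) for the opposite strand of one FASTQ read.

    The bases are complemented and reversed; the quality string is only
    reversed, so each score stays attached to its base.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    return seq.translate(_COMPLEMENT)[::-1], qual[::-1]


if __name__ == "__main__":
    flipped_seq, flipped_qual = reverse_complement_record("AACGT", "IIHG#")
    print(flipped_seq, flipped_qual)
```

Applying the helper twice returns the original read, which is one way to check whether a file has already been flipped relative to the raw data before choosing a segment placement.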