--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/README.rst	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,166 @@
+Galaxy wrappers for NCBI BLAST+ suite
+=====================================
+
+These wrappers are copyright 2010-2013 by Peter Cock, The James Hutton Institute
+(formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
+See the licence text below.
+
+Currently tested with NCBI BLAST 2.2.26+ (i.e. version 2.2.26 of BLAST+),
+and does not work with the NCBI 'legacy' BLAST suite (e.g. blastall).
+
+Note that these wrappers (and the associated datatypes) were originally
+distributed as part of the main Galaxy repository, but as of August 2012
+moved to the Galaxy Tool Shed as 'ncbi_blast_plus' (and 'blast_datatypes').
+My thanks to Dannon Baker from the Galaxy development team for his assistance
+with this.
+
+These wrappers are available from the Galaxy Tool Shed at:
+http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+
+
+Automated Installation
+======================
+
+Galaxy should be able to automatically install the dependencies, i.e. the
+'blast_datatypes' repository which defines the BLAST XML file format
+('blastxml') and protein and nucleotide BLAST databases ('blastdbp' and
+'blastdbn').
+
+You must tell Galaxy about any system level BLAST databases using the
+configuration files blastdb.loc (nucleotide databases like NT), blastdb_p.loc
+(protein databases like NR), and blastdb_d.loc (protein domain databases like
+CDD or SMART), which are located in the tool-data/ folder. Sample files are
+included which explain the tab-based format to use.
+
+You can download the NCBI provided databases as tar-balls from here:
+
+* ftp://ftp.ncbi.nlm.nih.gov/blast/db/ (nucleotide and protein databases like NR)
+* ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/little_endian/ (domain databases like CDD)
+
+
+Manual Installation
+===================
+
+For those not using Galaxy's automated installation from the Tool Shed, put
+the XML and Python files in the tools/ncbi_blast_plus/ folder and add the XML
+files to your tool_conf.xml as normal (and do the same in tool_conf.xml.sample
+in order to run the unit tests). For example, use::
+
+  <section name="NCBI BLAST+" id="ncbi_blast_plus_tools">
+    <tool file="ncbi_blast_plus/ncbi_blastn_wrapper.xml" />
+    <tool file="ncbi_blast_plus/ncbi_blastp_wrapper.xml" />
+    <tool file="ncbi_blast_plus/ncbi_blastx_wrapper.xml" />
+    <tool file="ncbi_blast_plus/ncbi_tblastn_wrapper.xml" />
+    <tool file="ncbi_blast_plus/ncbi_tblastx_wrapper.xml" />
+    <tool file="ncbi_blast_plus/ncbi_makeblastdb.xml" />
+    <tool file="ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml" />
+    <tool file="ncbi_blast_plus/ncbi_blastdbcmd_info.xml" />
+    <tool file="ncbi_blast_plus/ncbi_rpsblast_wrapper.xml" />
+    <tool file="ncbi_blast_plus/ncbi_rpstblastn_wrapper.xml" />
+    <tool file="ncbi_blast_plus/blastxml_to_tabular.xml" />
+  </section>
+
+You will also need to install 'blast_datatypes' from the Tool Shed. This
+defines the BLAST XML file format ('blastxml') and the composite file formats
+for protein and nucleotide BLAST databases ('blastdbp' and 'blastdbn').
+
+As described above for an automated installation, you must also tell Galaxy
+about any system level BLAST databases using the tool-data/blastdb*.loc files.
+
+You must install the NCBI BLAST+ standalone tools somewhere on the system
+path. Currently the unit tests are written using "BLAST 2.2.26+".
+
+Run the functional tests (adjusting the section identifier to match your
+tool_conf.xml.sample file)::
+
+    ./run_functional_tests.sh -sid NCBI_BLAST+-ncbi_blast_plus_tools
+
+
+History
+=======
+
+======= ======================================================================
+Version Changes
+------- ----------------------------------------------------------------------
+v0.0.11 - Final revision as part of the Galaxy main repository, and the
+          first release via the Tool Shed
+v0.0.12 - Implements genetic code option for translation searches.
+        - Changes <parallelism> to 1000 sequences at a time (to cope with
+          very large sets of queries where BLAST+ can become memory hungry)
+        - Include warning that BLAST+ with subject FASTA gives pairwise
+          e-values
+v0.0.13 - Use the new error handling options in Galaxy (the previously
+          bundled hide_stderr.py script is no longer needed).
+v0.0.14 - Support for makeblastdb and blastdbinfo with local BLAST databases
+          in the history (using work from Edward Kirton), requires v0.0.14
+          of the 'blast_datatypes' repository from the Tool Shed.
+v0.0.15 - Stronger warning in help text against searching against subject
+          FASTA files (better looking e-values than you might be expecting).
+v0.0.16 - Added repository_dependencies.xml for automated installation of the
+          'blast_datatypes' repository from the Tool Shed.
+v0.0.17 - The BLAST+ search tools now default to extended tabular output
+          (all too often our users were having to re-run searches just to
+          get one of the missing columns like query or subject length)
+v0.0.18 - Defensive quoting of filenames in case of spaces (where possible,
+          BLAST+ handling of some multi-file arguments is problematic).
+v0.0.19 - Added wrappers for rpsblast and rpstblastn, and new blastdb_d.loc
+          for the domain databases they use (e.g. CDD, PFAM or SMART).
+        - Correct case of exception regular expression (for error handling
+          fall-back in case the return code is not set properly).
+        - Clearer naming of output files.
+v0.0.20 - Added unit tests for BLASTN and TBLASTX.
+        - Added percentage identity option to BLASTN.
+        - Fallback on ElementTree if cElementTree missing in XML to tabular.
+        - Link to Tool Shed added to help text and this documentation.
+        - Tweak dependency on blast_datatypes to also work on Test Tool Shed
+        - Adopted standard MIT License.
+        - Development moved to GitHub, https://github.com/peterjc/galaxy_blast
+======= ======================================================================
+
+
+Bug Reports
+===========
+
+You can file an issue here https://github.com/peterjc/galaxy_blast/issues or ask
+us on the Galaxy development list http://lists.bx.psu.edu/listinfo/galaxy-dev
+
+
+Developers
+==========
+
+This script and related tools were originally developed on the 'tools' branch
+of the following Mercurial repository:
+https://bitbucket.org/peterjc/galaxy-central/
+
+As of July 2013, development is continuing on a dedicated GitHub repository:
+https://github.com/peterjc/galaxy_blast
+
+For making the "Galaxy Tool Shed" http://toolshed.g2.bx.psu.edu/ tarball I use
+the following command from the GitHub repository root folder::
+
+    $ ./ncbi_blast_plus/make_ncbi_blast_plus.sh
+
+This simplifies ensuring a consistent set of files is bundled each time,
+including all the relevant test files.
+
+
+Licence (MIT)
+=============
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
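Note on the blastdb*.loc files described in the README above: the tool XML in this changeset reads three tab-separated columns from each file (value, name, path). The following is a minimal Python 3 sketch of reading such a file. The helper name `read_loc`, the example entry, and the rule that lines starting with `#` are comments are all assumptions for illustration; the bundled sample files remain the authority on the format.

```python
def read_loc(lines):
    """Yield (value, name, path) tuples from the lines of a .loc file."""
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: blank lines and '#' comment lines are skipped.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError("Expected 3 tab-separated columns: %r" % line)
        yield tuple(fields[:3])


# Hypothetical example content, not taken from a real installation.
example = [
    "# value\tname\tpath\n",
    "nt_example\tExample nucleotide database\t/data/blastdb/nt\n",
]
entries = list(read_loc(example))
print(entries)
```

The third column is the value the wrappers pass to BLAST+ as the database path, so it is worth checking it by hand before adding an entry.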
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/blastxml_to_tabular.py	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,261 @@
+#!/usr/bin/env python
+"""Convert a BLAST XML file to tabular output.
+
+Takes three command line options, input BLAST XML filename, output tabular
+BLAST filename, output format (std for standard 12 columns, or ext for the
+extended 24 columns offered in the BLAST+ wrappers).
+
+The 12 columns output are 'qseqid sseqid pident length mismatch gapopen qstart
+qend sstart send evalue bitscore' or 'std' at the BLAST+ command line, which
+mean:
+
+====== ========= ============================================
+Column NCBI name Description
+------ --------- --------------------------------------------
+     1 qseqid    Query Seq-id (ID of your sequence)
+     2 sseqid    Subject Seq-id (ID of the database hit)
+     3 pident    Percentage of identical matches
+     4 length    Alignment length
+     5 mismatch  Number of mismatches
+     6 gapopen   Number of gap openings
+     7 qstart    Start of alignment in query
+     8 qend      End of alignment in query
+     9 sstart    Start of alignment in subject (database hit)
+    10 send      End of alignment in subject (database hit)
+    11 evalue    Expectation value (E-value)
+    12 bitscore  Bit score
+====== ========= ============================================
+
+The additional columns offered in the Galaxy BLAST+ wrappers are:
+
+====== ============= ===========================================
+Column NCBI name     Description
+------ ------------- -------------------------------------------
+    13 sallseqid     All subject Seq-id(s), separated by a ';'
+    14 score         Raw score
+    15 nident        Number of identical matches
+    16 positive      Number of positive-scoring matches
+    17 gaps          Total number of gaps
+    18 ppos          Percentage of positive-scoring matches
+    19 qframe        Query frame
+    20 sframe        Subject frame
+    21 qseq          Aligned part of query sequence
+    22 sseq          Aligned part of subject sequence
+    23 qlen          Query sequence length
+    24 slen          Subject sequence length
+====== ============= ===========================================
+
+Most of these fields are given explicitly in the XML file; others, like
+the percentage identity and the number of gap openings, must be calculated.
+
+Be aware that the sequence in the extended tabular output or XML direct from
+BLAST+ may or may not use XXXX masking on regions of low complexity. This
+can throw off the calculation of percentage identity and gap openings.
+[In fact, both BLAST 2.2.24+ and 2.2.25+ have a subtle bug in this regard,
+with these numbers changing depending on whether or not the low complexity
+filter is used.]
+
+This script attempts to produce identical output to what BLAST+ would have done.
+However, check this with "diff -b ..." since BLAST+ sometimes includes an extra
+space character (probably a bug).
+"""
+import sys
+import re
+
+if "-v" in sys.argv or "--version" in sys.argv:
+    print "v0.0.12"
+    sys.exit(0)
+
+if sys.version_info[:2] >= ( 2, 5 ):
+    try:
+        from xml.etree import cElementTree as ElementTree
+    except ImportError:
+        from xml.etree import ElementTree as ElementTree
+else:
+    from galaxy import eggs
+    import pkg_resources; pkg_resources.require( "elementtree" )
+    from elementtree import ElementTree
+
+def stop_err( msg ):
+    sys.stderr.write("%s\n" % msg)
+    sys.exit(1)
+
+#Parse Command Line
+try:
+    in_file, out_file, out_fmt = sys.argv[1:]
+except:
+    stop_err("Expect 3 arguments: input BLAST XML file, output tabular file, out format (std or ext)")
+
+if out_fmt == "std":
+    extended = False
+elif out_fmt == "x22":
+    stop_err("Format argument x22 has been replaced with ext (extended 24 columns)")
+elif out_fmt == "ext":
+    extended = True
+else:
+    stop_err("Format argument should be std (12 column) or ext (extended 24 columns)")
+
+
+# get an iterable
+try:
+    context = ElementTree.iterparse(in_file, events=("start", "end"))
+except:
+    stop_err("Invalid data format.")
+# turn it into an iterator
+context = iter(context)
+# get the root element
+try:
+    event, root = context.next()
+except:
+    stop_err( "Invalid data format." )
+
+
+re_default_query_id = re.compile("^Query_\d+$")
+assert re_default_query_id.match("Query_101")
+assert not re_default_query_id.match("Query_101a")
+assert not re_default_query_id.match("MyQuery_101")
+re_default_subject_id = re.compile("^Subject_\d+$")
+assert re_default_subject_id.match("Subject_1")
+assert not re_default_subject_id.match("Subject_")
+assert not re_default_subject_id.match("Subject_12a")
+assert not re_default_subject_id.match("TheSubject_1")
+
+
+outfile = open(out_file, 'w')
+blast_program = None
+for event, elem in context:
+    if event == "end" and elem.tag == "BlastOutput_program":
+        blast_program = elem.text
+    # for every <Iteration> tag
+    if event == "end" and elem.tag == "Iteration":
+        #Expecting either this, from BLAST 2.2.25+ using FASTA vs FASTA
+        # <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
+        # <Iteration_query-def>Endoplasmic reticulum resident protein 44 OS=Homo sapiens GN=ERP44 PE=1 SV=1</Iteration_query-def>
+        # <Iteration_query-len>406</Iteration_query-len>
+        # <Iteration_hits></Iteration_hits>
+        #
+        #Or, from BLAST 2.2.24+ run online
+        # <Iteration_query-ID>Query_1</Iteration_query-ID>
+        # <Iteration_query-def>Sample</Iteration_query-def>
+        # <Iteration_query-len>516</Iteration_query-len>
+        # <Iteration_hits>...
+        qseqid = elem.findtext("Iteration_query-ID")
+        if re_default_query_id.match(qseqid):
+            #Place holder ID, take the first word of the query definition
+            qseqid = elem.findtext("Iteration_query-def").split(None,1)[0]
+        qlen = int(elem.findtext("Iteration_query-len"))
+
+        # for every <Hit> within <Iteration>
+        for hit in elem.findall("Iteration_hits/Hit"):
+            #Expecting either this,
+            # <Hit_id>gi|3024260|sp|P56514.1|OPSD_BUFBU</Hit_id>
+            # <Hit_def>RecName: Full=Rhodopsin</Hit_def>
+            # <Hit_accession>P56514</Hit_accession>
+            #or,
+            # <Hit_id>Subject_1</Hit_id>
+            # <Hit_def>gi|57163783|ref|NP_001009242.1| rhodopsin [Felis catus]</Hit_def>
+            # <Hit_accession>Subject_1</Hit_accession>
+            #
+            #apparently depending on the parse_deflines switch
+            sseqid = hit.findtext("Hit_id").split(None,1)[0]
+            hit_def = sseqid + " " + hit.findtext("Hit_def")
+            if re_default_subject_id.match(sseqid) \
+            and sseqid == hit.findtext("Hit_accession"):
+                #Place holder ID, take the first word of the subject definition
+                hit_def = hit.findtext("Hit_def")
+                sseqid = hit_def.split(None,1)[0]
+            # for every <Hsp> within <Hit>
+            for hsp in hit.findall("Hit_hsps/Hsp"):
+                nident = hsp.findtext("Hsp_identity")
+                length = hsp.findtext("Hsp_align-len")
+                pident = "%0.2f" % (100*float(nident)/float(length))
+
+                q_seq = hsp.findtext("Hsp_qseq")
+                h_seq = hsp.findtext("Hsp_hseq")
+                m_seq = hsp.findtext("Hsp_midline")
+                assert len(q_seq) == len(h_seq) == len(m_seq) == int(length)
+                gapopen = str(len(q_seq.replace('-', ' ').split())-1  + \
+                              len(h_seq.replace('-', ' ').split())-1)
+
+                mismatch = m_seq.count(' ') + m_seq.count('+') \
+                         - q_seq.count('-') - h_seq.count('-')
+                #TODO - Remove this alternative mismatch calculation and test
+                #once satisfied there are no problems
+                expected_mismatch = len(q_seq) \
+                                  - sum(1 for q,h in zip(q_seq, h_seq) \
+                                        if q == h or q == "-" or h == "-")
+                xx = sum(1 for q,h in zip(q_seq, h_seq) if q=="X" and h=="X")
+                if not (expected_mismatch - q_seq.count("X") <= int(mismatch) <= expected_mismatch + xx):
+                    stop_err("%s vs %s mismatches, expected %i <= %i <= %i" \
+                             % (qseqid, sseqid, expected_mismatch - q_seq.count("X"),
+                                int(mismatch), expected_mismatch + xx))
+
+                #TODO - Remove this alternative identity calculation and test
+                #once satisfied there are no problems
+                expected_identity = sum(1 for q,h in zip(q_seq, h_seq) if q == h)
+                if not (expected_identity - xx <= int(nident) <= expected_identity + q_seq.count("X")):
+                    stop_err("%s vs %s identities, expected %i <= %i <= %i" \
+                             % (qseqid, sseqid, expected_identity - xx, int(nident),
+                                expected_identity + q_seq.count("X")))
+
+
+                evalue = hsp.findtext("Hsp_evalue")
+                if evalue == "0":
+                    evalue = "0.0"
+                else:
+                    evalue = "%0.0e" % float(evalue)
+
+                bitscore = float(hsp.findtext("Hsp_bit-score"))
+                if bitscore < 100:
+                    #Seems to show one decimal place for lower scores
+                    bitscore = "%0.1f" % bitscore
+                else:
+                    #Note BLAST does not round to nearest int, it truncates
+                    bitscore = "%i" % bitscore
+
+                values = [qseqid,
+                          sseqid,
+                          pident,
+                          length, #hsp.findtext("Hsp_align-len")
+                          str(mismatch),
+                          gapopen,
+                          hsp.findtext("Hsp_query-from"), #qstart,
+                          hsp.findtext("Hsp_query-to"), #qend,
+                          hsp.findtext("Hsp_hit-from"), #sstart,
+                          hsp.findtext("Hsp_hit-to"), #send,
+                          evalue, #hsp.findtext("Hsp_evalue") in scientific notation
+                          bitscore, #hsp.findtext("Hsp_bit-score") rounded
+                          ]
+
+                if extended:
+                    sallseqid = ";".join(name.split(None,1)[0] for name in hit_def.split(">"))
+                    #print hit_def, "-->", sallseqid
+                    positive = hsp.findtext("Hsp_positive")
+                    ppos = "%0.2f" % (100*float(positive)/float(length))
+                    qframe = hsp.findtext("Hsp_query-frame")
+                    sframe = hsp.findtext("Hsp_hit-frame")
+                    if blast_program == "blastp":
+                        #Probably a bug in BLASTP that they use 0 or 1 depending on format
+                        if qframe == "0": qframe = "1"
+                        if sframe == "0": sframe = "1"
+                    slen = int(hit.findtext("Hit_len"))
+                    values.extend([sallseqid,
+                                   hsp.findtext("Hsp_score"), #score,
+                                   nident,
+                                   positive,
+                                   hsp.findtext("Hsp_gaps"), #gaps,
+                                   ppos,
+                                   qframe,
+                                   sframe,
+                                   #NOTE - for blastp, XML shows original seq, tabular uses XXX masking
+                                   q_seq,
+                                   h_seq,
+                                   str(qlen),
+                                   str(slen),
+                                   ])
+                #print "\t".join(values)
+                outfile.write("\t".join(values) + "\n")
+        # prevents ElementTree from growing large datastructure
+        root.clear()
+        elem.clear()
+outfile.close()
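Note on the calculated columns in blastxml_to_tabular.py above: the gap opening count and the e-value and bit score formatting are derived rather than read from the XML. The sketch below restates that logic in Python 3 as standalone functions so it can be tried in isolation. The function names and the sample sequences are hypothetical; the expressions themselves mirror the script.

```python
def gap_openings(q_seq, h_seq):
    """Count gap openings as runs of '-' in the two aligned sequences.

    Splitting on gaps gives the ungapped segments; the number of
    segments minus one is the number of internal gap runs.
    """
    return (len(q_seq.replace("-", " ").split()) - 1
            + len(h_seq.replace("-", " ").split()) - 1)


def format_evalue(text):
    """Format an Hsp_evalue string the way the script does."""
    if text == "0":
        return "0.0"
    return "%0.0e" % float(text)


def format_bitscore(value):
    """One decimal place below 100, otherwise truncate to an integer."""
    if value < 100:
        return "%0.1f" % value
    return "%i" % value  # truncates rather than rounds


# Hypothetical aligned pair: two gap runs in the query, none in the hit.
print(gap_openings("AC--GT-A", "ACTTGTCA"))
print(format_evalue("0"), format_evalue("3.2e-50"))
print(format_bitscore(57.4), format_bitscore(123.9))
```

Keeping these as small pure functions would also make it easier to compare the converted output against tabular output direct from BLAST+.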
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/blastxml_to_tabular.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,137 @@
+<tool id="blastxml_to_tabular" name="BLAST XML to tabular" version="0.0.11">
+    <description>Convert BLAST XML output to tabular</description>
+    <version_command interpreter="python">blastxml_to_tabular.py --version</version_command>
+    <command interpreter="python">
+      blastxml_to_tabular.py $blastxml_file $tabular_file $out_format
+    </command>
+    <stdio>
+        <!-- Anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+    </stdio>
+    <inputs>
+        <param name="blastxml_file" type="data" format="blastxml" label="BLAST results as XML"/>
+        <param name="out_format" type="select" label="Output format">
+            <option value="std">Tabular (standard 12 columns)</option>
+            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
+        </param>
+    </inputs>
+    <outputs>
+        <data name="tabular_file" format="tabular" label="BLAST results as tabular" />
+    </outputs>
+    <requirements>
+    </requirements>
+    <tests>
+        <test>
+            <param name="blastxml_file" value="blastp_four_human_vs_rhodopsin.xml" ftype="blastxml" />
+            <param name="out_format" value="std" />
+            <!-- Note this has some white space differences from the actual blastp output blast_four_human_vs_rhodopsin.tabular -->
+            <output name="tabular_file" file="blastp_four_human_vs_rhodopsin_converted.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="blastxml_file" value="blastp_four_human_vs_rhodopsin.xml" ftype="blastxml" />
+            <param name="out_format" value="ext" />
+            <!-- Note this has some white space differences from the actual blastp output blast_four_human_vs_rhodopsin_22c.tabular -->
+            <output name="tabular_file" file="blastp_four_human_vs_rhodopsin_converted_ext.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="blastxml_file" value="blastp_sample.xml" ftype="blastxml" />
+            <param name="out_format" value="std" />
+            <!-- Note this has some white space differences from the actual blastp output -->
+            <output name="tabular_file" file="blastp_sample_converted.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="blastxml_file" value="blastx_rhodopsin_vs_four_human.xml" ftype="blastxml" />
+            <param name="out_format" value="std" />
+            <!-- Note this has some white space differences from the actual blastx output -->
+            <output name="tabular_file" file="blastx_rhodopsin_vs_four_human_converted.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="blastxml_file" value="blastx_rhodopsin_vs_four_human.xml" ftype="blastxml" />
+            <param name="out_format" value="ext" />
+            <!-- Note this has some white space and XXXX masking differences from the actual blastx output -->
+            <output name="tabular_file" file="blastx_rhodopsin_vs_four_human_converted_ext.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="blastxml_file" value="blastx_sample.xml" ftype="blastxml" />
+            <param name="out_format" value="std" />
+            <!-- Note this has some white space differences from the actual blastx output -->
+            <output name="tabular_file" file="blastx_sample_converted.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="blastxml_file" value="blastp_human_vs_pdb_seg_no.xml" ftype="blastxml" />
+            <param name="out_format" value="std" />
+            <!-- Note this has some white space differences from the actual blastp output -->
+            <output name="tabular_file" file="blastp_human_vs_pdb_seg_no_converted_std.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="blastxml_file" value="blastp_human_vs_pdb_seg_no.xml" ftype="blastxml" />
+            <param name="out_format" value="ext" />
+            <!-- Note this has some white space differences from the actual blastp output -->
+            <output name="tabular_file" file="blastp_human_vs_pdb_seg_no_converted_ext.tabular" ftype="tabular" />
+        </test>
+    </tests>
+    <help>
+
+**What it does**
+
+NCBI BLAST+ (and the older NCBI 'legacy' BLAST) can output in a range of
+formats including tabular and a more detailed XML format. A complex workflow
+may need both the XML and the tabular output - but running BLAST twice is
+slow and wasteful.
+
+This tool takes the BLAST XML output and can convert it into the
+standard 12 column tabular equivalent:
+
+====== ========= ============================================
+Column NCBI name Description
+------ --------- --------------------------------------------
+     1 qseqid    Query Seq-id (ID of your sequence)
+     2 sseqid    Subject Seq-id (ID of the database hit)
+     3 pident    Percentage of identical matches
+     4 length    Alignment length
+     5 mismatch  Number of mismatches
+     6 gapopen   Number of gap openings
+     7 qstart    Start of alignment in query
+     8 qend      End of alignment in query
+     9 sstart    Start of alignment in subject (database hit)
+    10 send      End of alignment in subject (database hit)
+    11 evalue    Expectation value (E-value)
+    12 bitscore  Bit score
+====== ========= ============================================
+
+The BLAST+ tools can optionally output additional columns of information,
+but this takes longer to calculate. Most (but not all) of these columns are
+included by selecting the extended tabular output. The extra columns are
+included *after* the standard 12 columns. This is so that you can write
+workflow filtering steps that accept either the 12 or 24 column tabular
+BLAST output. This tool now uses this extended 24 column output by default.
+
+====== ============= ===========================================
+Column NCBI name     Description
+------ ------------- -------------------------------------------
+    13 sallseqid     All subject Seq-id(s), separated by a ';'
+    14 score         Raw score
+    15 nident        Number of identical matches
+    16 positive      Number of positive-scoring matches
+    17 gaps          Total number of gaps
+    18 ppos          Percentage of positive-scoring matches
+    19 qframe        Query frame
+    20 sframe        Subject frame
+    21 qseq          Aligned part of query sequence
+    22 sseq          Aligned part of subject sequence
+    23 qlen          Query sequence length
+    24 slen          Subject sequence length
+====== ============= ===========================================
+
+Beware that the XML file (and thus the conversion) and the tabular output
+direct from BLAST+ may differ in the presence of XXXX masking on regions
+of low complexity (columns 21 and 22), and thus also calculated figures like
+the percentage identity (column 3).
+
+**References**
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+    </help>
+</tool>
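Note on the tool help above: because the extra columns come after the standard 12, a downstream filter can accept either layout. Below is a minimal Python 3 sketch of such a filter, assuming tab-separated rows with exactly 12 or 24 fields. The function names and the sample rows are hypothetical and the values are made up for illustration.

```python
STD_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
               "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
EXT_COLUMNS = STD_COLUMNS + ["sallseqid", "score", "nident", "positive",
                             "gaps", "ppos", "qframe", "sframe", "qseq",
                             "sseq", "qlen", "slen"]


def parse_row(line):
    """Map one tabular line to a dict keyed by NCBI column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) == len(STD_COLUMNS):
        names = STD_COLUMNS
    elif len(fields) == len(EXT_COLUMNS):
        names = EXT_COLUMNS
    else:
        raise ValueError("Expected 12 or 24 columns, got %i" % len(fields))
    return dict(zip(names, fields))


def filter_by_evalue(lines, max_evalue):
    """Yield rows whose e-value (column 11) is at most max_evalue."""
    for line in lines:
        row = parse_row(line)
        if float(row["evalue"]) <= max_evalue:
            yield row


# Made-up 12 column rows, for illustration only.
rows = [
    "query1\tsubject1\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-50\t380\n",
    "query1\tsubject2\t35.00\t60\t39\t0\t10\t69\t1\t60\t0.5\t25.4\n",
]
kept = list(filter_by_evalue(rows, 1e-10))
print([row["sseqid"] for row in kept])
```

Only the first 12 names are used by the filter, so the same code works unchanged on extended output.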
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/ncbi_blastdbcmd_info.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,67 @@
+<tool id="ncbi_blastdbcmd_info" name="NCBI BLAST+ database info" version="0.0.6">
+    <description>Show BLAST database information from blastdbcmd</description>
+    <requirements>
+        <requirement type="binary">blastdbcmd</requirement>
+        <requirement type="package" version="2.2.26+">blast+</requirement>
+    </requirements>
+    <version_command>blastdbcmd -version</version_command>
+    <command>
+blastdbcmd -dbtype $db_opts.db_type -db "${db_opts.database.fields.path}" -info -out "$info"
+    </command>
+    <stdio>
+        <!-- Anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+	<!-- Suspect blastdbcmd sometimes fails to set error level -->
+	<regex match="Error:" />
+	<regex match="Exception:" />
+    </stdio>
+    <inputs>
+        <conditional name="db_opts">
+            <param name="db_type" type="select" label="Type of BLAST database">
+              <option value="nucl" selected="True">Nucleotide</option>
+              <option value="prot">Protein</option>
+            </param>
+            <when value="nucl">
+                <param name="database" type="select" label="Nucleotide BLAST database">
+                    <options from_file="blastdb.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+            </when>
+            <when value="prot">
+                <param name="database" type="select" label="Protein BLAST database">
+                    <options from_file="blastdb_p.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+            </when>
+        </conditional>
+    </inputs>
+    <outputs>
+        <data name="info" format="txt" label="${db_opts.database.fields.name} info" />
+    </outputs>
+    <help>
+
+**What it does**
+
+Calls the NCBI BLAST+ blastdbcmd command line tool with the -info
+switch to give summary information about a BLAST database, such as
+the size (number of sequences and total length) and date.
+
+-------
+
+**References**
+
+Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
+
+Schaffer et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. 2001. Nucleic Acids Res. 29:2994-3005.
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+    </help>
+</tool>
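Note on ncbi_blastdbcmd_info.xml above: the command template expands to a single blastdbcmd call using the -dbtype, -db, -info and -out switches. The Python 3 sketch below builds the same argument list outside Galaxy. The function name and the example paths are hypothetical, and actually running the command assumes BLAST+ is installed on the system path.

```python
def blastdbcmd_info_args(db_type, db_path, out_path):
    """Build the argument list for 'blastdbcmd -info' as used by the wrapper."""
    if db_type not in ("nucl", "prot"):
        raise ValueError("db_type should be 'nucl' or 'prot'")
    return ["blastdbcmd", "-dbtype", db_type, "-db", db_path,
            "-info", "-out", out_path]


# Hypothetical paths, for illustration only.
args = blastdbcmd_info_args("prot", "/data/blastdb/nr", "nr_info.txt")
print(" ".join(args))
# To run it (requires BLAST+): subprocess.check_call(args)
```

Passing a list rather than a shell string avoids the quoting concerns around spaces in filenames that the wrappers handle with explicit quotes.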
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,139 @@
+<tool id="ncbi_blastdbcmd_wrapper" name="NCBI BLAST+ blastdbcmd entry(s)" version="0.0.6">
+    <description>Extract sequence(s) from BLAST database</description>
+    <requirements>
+        <requirement type="binary">blastdbcmd</requirement>
+        <requirement type="package" version="2.2.26+">blast+</requirement>
+    </requirements>
+    <version_command>blastdbcmd -version</version_command>
+    <command>
+## The command is a Cheetah template which allows some Python based syntax.
+## Lines starting with two hashes are comments. Galaxy will turn newlines into spaces
+blastdbcmd -dbtype $db_opts.db_type -db "${db_opts.database.fields.path}"
+
+##TODO: What about -ctrl_a and -target_only as advanced options?
+
+#if $id_opts.id_type=="file":
+-entry_batch "$id_opts.entries"
+#else:
+##Perform some simple search/replaces to remove whitespace
+##and make it comma separated, and escape any pipe characters
+-entry "$id_opts.entries.replace('\r',',').replace('\n',',').replace(' ','').replace(',,',',').replace(',,',',').strip(',').replace('|','\|')"
+#end if
+
+##When building a BLAST database, to ensure unique IDs makeblastdb will
+##do things like turning a FASTA entry with ID of ERP44 into lcl|ERP44
+##(if using -parse_seqids) or simply assign it an ID using the record
+##number like gnl|BL_ORD_ID|123 (to cope with duplicate IDs in the FASTA
+##file). In -parse_seqids mode, a duplicate FASTA ID gives an error.
+##
+##The BLAST plain text and XML output will contain these BLAST IDs, but
+##the tabular output does not (at least, not in BLAST 2.2.25+).
+##Therefore in general, Galaxy users won't care about the (internal)
+##BLAST identifiers.
+##
+##The blastdbcmd FASTA output will also contain these IDs, but in the
+##context of the BLAST tabular output they are not helpful. Therefore
+##to recover the original ID as used in the FASTA file for makeblastdb
+##we need a little post-processing.
+##
+##We remove the NCBI's lcl|... or gnl|BL_ORD_ID|123 prefixes
+##using sed, however the exact syntax differs for Mac OS X's sed
+
+#if str($outfmt)=="blastid":
+-out "$seq"
+#else if sys.platform == "darwin":
+| sed -E 's/^>(lcl\||gnl\|BL_ORD_ID\|[0-9]* )/>/1' > "$seq"
+#else:
+| sed 's/>\(lcl|\|gnl|BL_ORD_ID|[0-9]* \)/>/1' > "$seq"
+#end if
+    </command>
+    <stdio>
+        <!-- Anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+        <!-- Suspect blastdbcmd sometimes fails to set error level -->
+        <regex match="Error:" />
+        <regex match="Exception:" />
+    </stdio>
+    <inputs>
+        <conditional name="db_opts">
+            <param name="db_type" type="select" label="Type of BLAST database">
+              <option value="nucl" selected="True">Nucleotide</option>
+              <option value="prot">Protein</option>
+            </param>
+            <when value="nucl">
+                <param name="database" type="select" label="Nucleotide BLAST database">
+                    <options from_file="blastdb.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+            </when>
+            <when value="prot">
+                <param name="database" type="select" label="Protein BLAST database">
+                    <options from_file="blastdb_p.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+            </when>
+        </conditional>
+        <conditional name="id_opts">
+            <param name="id_type" type="select" label="Type of identifier list">
+              <option value="file">From file</option>
+              <option value="prompt">User entered</option>
+            </param>
+            <when value="file">
+                <param name="entries" type="data" format="txt,tabular" label="Sequence identifier(s)" help="Plain text file with one ID per line (i.e. single column tabular file)"/>
+            </when>
+            <when value="prompt">
+                <param name="entries" type="text" label="Sequence identifier(s)" help="Comma or new line separated list." optional="False" area="True" size="10x30"/>
+            </when>
+        </conditional>
+        <param name="outfmt" type="select" label="Output format">
+          <option value="original">FASTA with original identifiers</option>
+          <option value="blastid">FASTA with BLAST assigned identifiers</option>
+        </param>
+    </inputs>
+    <outputs>
+        <data name="seq" format="fasta" label="Sequences from ${db_opts.database.fields.name}" />
+    </outputs>
+    <help>
+
+**What it does**
+
+Extracts FASTA formatted sequences from a BLAST database
+using the NCBI BLAST+ blastdbcmd command line tool.
+
+.. class:: warningmark
+
+**BLAST assigned identifiers**
+
+When a BLAST database is constructed from a FASTA file, the
+original identifiers can be replaced with BLAST assigned
+identifiers, partly to ensure uniqueness. For example, sometimes
+a prefix of 'lcl|' is added (lcl is short for local),
+or an arbitrary name starting 'gnl|BL_ORD_ID|' is created.
+
+If you are using the tabular output from BLAST, it will contain
+the original identifiers - not the BLAST assigned identifiers
+suitable for use with the blastdbcmd tool.
+
+The XML and plain text outputs do contain the BLAST assigned
+identifiers, but extracting a list of identifiers from those
+formats isn't straightforward.
+
+-------
+
+**References**
+
+Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
+
+Schaffer et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. 2001. Nucleic Acids Res. 29:2994-3005.
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+    </help>
+</tool>
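
The identifier handling in ncbi_blastdbcmd_wrapper.xml above can be sketched in Python. This is only an illustration of the two text transformations, not code from the wrapper: the function names `normalise_entries` and `strip_blast_prefix` are hypothetical, and the regular expression is a Python rendering of the `sed -E` pattern used in the Mac OS X branch.

```python
import re


def normalise_entries(text):
    """Mirror the chain of replace() calls applied to the -entry argument.

    Newlines and carriage returns become commas, spaces are dropped,
    doubled commas are collapsed (two passes, as in the template),
    leading/trailing commas are stripped and pipes are backslash-escaped.
    """
    return (
        text.replace("\r", ",")
        .replace("\n", ",")
        .replace(" ", "")
        .replace(",,", ",")
        .replace(",,", ",")
        .strip(",")
        .replace("|", "\\|")
    )


# Python equivalent of: sed -E 's/^>(lcl\||gnl\|BL_ORD_ID\|[0-9]* )/>/1'
_BLAST_PREFIX = re.compile(r"^>(lcl\||gnl\|BL_ORD_ID\|[0-9]* )")


def strip_blast_prefix(line):
    """Remove a leading lcl| or gnl|BL_ORD_ID|<number> prefix from a FASTA header."""
    return _BLAST_PREFIX.sub(">", line, count=1)


if __name__ == "__main__":
    print(normalise_entries("gi|123\r\ngi|456 \n"))
    print(strip_blast_prefix(">gnl|BL_ORD_ID|123 ERP44 example protein"))
```

Note that two passes of the `,,` replacement collapse runs of up to four commas; a longer run of blank lines in the user's input would leave a doubled comma behind.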
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/ncbi_blastn_wrapper.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,257 @@
+<tool id="ncbi_blastn_wrapper" name="NCBI BLAST+ blastn" version="0.0.20">
+    <description>Search nucleotide database with nucleotide query sequence(s)</description>
+    <!-- If job splitting is enabled, break up the query file into parts -->
+    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
+    <requirements>
+        <requirement type="binary">blastn</requirement>
+        <requirement type="package" version="2.2.26+">blast+</requirement>
+    </requirements>
+    <version_command>blastn -version</version_command>
+    <command>
+## The command is a Cheetah template which allows some Python based syntax.
+## Lines starting with ## are comments. Galaxy will turn newlines into spaces.
+blastn
+-query "$query"
+#if $db_opts.db_opts_selector == "db":
+  -db "${db_opts.database.fields.path}"
+#elif $db_opts.db_opts_selector == "histdb":
+  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
+#else:
+  -subject "$db_opts.subject"
+#end if
+-task $blast_type
+-evalue $evalue_cutoff
+-out "$output1"
+##Set the extended list here so if/when we add things, saved workflows are not affected
+#if str($out_format)=="ext":
+    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
+#else:
+    -outfmt $out_format
+#end if
+-num_threads 8
+#if $adv_opts.adv_opts_selector=="advanced":
+$adv_opts.filter_query
+$adv_opts.strand
+## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
+## Note -max_target_seqs overrides -num_descriptions and -num_alignments
+#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
+-max_target_seqs $adv_opts.max_hits
+#end if
+#if (str($adv_opts.identity_cutoff) and float(str($adv_opts.identity_cutoff)) > 0 ):
+-perc_identity $adv_opts.identity_cutoff
+#end if
+#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
+-word_size $adv_opts.word_size
+#end if
+$adv_opts.ungapped
+$adv_opts.parse_deflines
+## End of advanced options:
+#end if
+    </command>
+    <stdio>
+        <!-- Anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+        <!-- In case the return code has not been set properly check stderr too -->
+        <regex match="Error:" />
+        <regex match="Exception:" />
+    </stdio>
+    <inputs>
+        <param name="query" type="data" format="fasta" label="Nucleotide query sequence(s)"/>
+        <conditional name="db_opts">
+            <param name="db_opts_selector" type="select" label="Subject database/sequences">
+              <option value="db" selected="True">Locally installed BLAST database</option>
+              <option value="histdb">BLAST database from your history</option>
+              <option value="file">FASTA file from your history (see warning note below)</option>
+            </param>
+            <when value="db">
+                <param name="database" type="select" label="Nucleotide BLAST database">
+                    <options from_file="blastdb.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <when value="histdb">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="data" format="blastdbn" label="Nucleotide BLAST database" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <when value="file">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="data" format="fasta" label="Nucleotide FASTA file to use as database"/>
+            </when>
+        </conditional>
+        <param name="blast_type" type="select" display="radio" label="Type of BLAST">
+            <option value="megablast">megablast</option>
+            <option value="blastn">blastn</option>
+            <option value="blastn-short">blastn-short</option>
+            <option value="dc-megablast">dc-megablast</option>
+            <!-- Using BLAST 2.2.24+ this gives an error:
+            BLAST engine error: Program type 'vecscreen' not supported
+            <option value="vecscreen">vecscreen</option>
+            -->
+        </param>
+        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
+        <param name="out_format" type="select" label="Output format">
+            <option value="6">Tabular (standard 12 columns)</option>
+            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
+            <option value="5">BLAST XML</option>
+            <option value="0">Pairwise text</option>
+            <option value="0 -html">Pairwise HTML</option>
+            <option value="2">Query-anchored text</option>
+            <option value="2 -html">Query-anchored HTML</option>
+            <option value="4">Flat query-anchored text</option>
+            <option value="4 -html">Flat query-anchored HTML</option>
+            <!--
+            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
+            -->
+        </param>
+        <conditional name="adv_opts">
+            <param name="adv_opts_selector" type="select" label="Advanced Options">
+              <option value="basic" selected="True">Hide Advanced Options</option>
+              <option value="advanced">Show Advanced Options</option>
+            </param>
+            <when value="basic" />
+            <when value="advanced">
+                <!-- Could use a select (yes, no, other) where other allows setting 'level window linker' -->
+                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with DUST)" truevalue="-dust yes" falsevalue="-dust no" checked="true" />
+                <param name="strand" type="select" label="Query strand(s) to search against database/subject">
+                    <option value="-strand both">Both</option>
+                    <option value="-strand plus">Plus (forward)</option>
+                    <option value="-strand minus">Minus (reverse complement)</option>
+                </param>
+                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
+                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
+                    <validator type="in_range" min="0" />
+                </param>
+                <param name="identity_cutoff" type="float" min="0" max="100" value="0" label="Percent identity cutoff (-perc_identity)" help="Use zero for no cutoff" />
+                <!-- I'd like word_size to be optional, with minimum 4 for blastn -->
+                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 4.">
+                    <validator type="in_range" min="0" />
+                </param>
+                <param name="ungapped" type="boolean" label="Perform ungapped alignment only?" truevalue="-ungapped" falsevalue="" checked="false" />
+                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
+            </when>
+        </conditional>
+    </inputs>
+    <outputs>
+        <data name="output1" format="tabular" label="${blast_type.value_label} on ${on_string}">
+            <change_format>
+                <when input="out_format" value="0" format="txt"/>
+                <when input="out_format" value="0 -html" format="html"/>
+                <when input="out_format" value="2" format="txt"/>
+                <when input="out_format" value="2 -html" format="html"/>
+                <when input="out_format" value="4" format="txt"/>
+                <when input="out_format" value="4 -html" format="html"/>
+                <when input="out_format" value="5" format="blastxml"/>
+            </change_format>
+        </data>
+    </outputs>
+    <tests>
+        <test>
+            <param name="query" value="rhodopsin_nucs.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="three_human_mRNA.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-40" />
+            <param name="out_format" value="6" />
+            <param name="adv_opts_selector" value="basic" />
+            <output name="output1" file="blastn_rhodopsin_vs_three_human.tabular" ftype="tabular" />
+        </test>
+    </tests>
+    <help>
+
+.. class:: warningmark
+
+**Note**. Database searches may take a substantial amount of time.
+For large input datasets it is advisable to allow overnight processing.
+
+-----
+
+**What it does**
+
+Search a *nucleotide database* using a *nucleotide query*,
+using the NCBI BLAST+ blastn command line tool.
+Algorithms include blastn, megablast, and discontiguous megablast.
+
+.. class:: warningmark
+
+You can also search against a FASTA file of subject nucleotide
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly significant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
+-----
+
+**Output format**
+
+Because Galaxy focuses on processing tabular data, the default output of this
+tool is tabular. The standard BLAST+ tabular output contains 12 columns:
+
+====== ========= ============================================
+Column NCBI name Description
+------ --------- --------------------------------------------
+     1 qseqid    Query Seq-id (ID of your sequence)
+     2 sseqid    Subject Seq-id (ID of the database hit)
+     3 pident    Percentage of identical matches
+     4 length    Alignment length
+     5 mismatch  Number of mismatches
+     6 gapopen   Number of gap openings
+     7 qstart    Start of alignment in query
+     8 qend      End of alignment in query
+     9 sstart    Start of alignment in subject (database hit)
+    10 send      End of alignment in subject (database hit)
+    11 evalue    Expectation value (E-value)
+    12 bitscore  Bit score
+====== ========= ============================================
+
+The BLAST+ tools can optionally output additional columns of information,
+but this takes longer to calculate. Most (but not all) of these columns are
+included by selecting the extended tabular output. The extra columns are
+included *after* the standard 12 columns. This is so that you can write
+workflow filtering steps that accept either the 12 or 24 column tabular
+BLAST output. Galaxy now uses this extended 24 column output by default.
+
+====== ============= ===========================================
+Column NCBI name     Description
+------ ------------- -------------------------------------------
+    13 sallseqid     All subject Seq-id(s), separated by a ';'
+    14 score         Raw score
+    15 nident        Number of identical matches
+    16 positive      Number of positive-scoring matches
+    17 gaps          Total number of gaps
+    18 ppos          Percentage of positive-scoring matches
+    19 qframe        Query frame
+    20 sframe        Subject frame
+    21 qseq          Aligned part of query sequence
+    22 sseq          Aligned part of subject sequence
+    23 qlen          Query sequence length
+    24 slen          Subject sequence length
+====== ============= ===========================================
+
+The third option is BLAST XML output, which is designed to be parsed by
+another program, and is understood by some Galaxy tools.
+
+You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
+The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
+The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
+The two query-anchored outputs show a multiple sequence alignment between the query and all the matches,
+and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
+
+-------
+
+**References**
+
+Zhang et al. A Greedy Algorithm for Aligning DNA Sequences. 2000. J. Comput. Biol. 7(1-2):203-214.
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+    </help>
+</tool>
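
The help text above says the extra columns come after the standard 12 so that downstream steps can accept either layout. A minimal Python sketch of such a step follows, assuming tab-separated input with exactly 12 or 24 fields per line. The column names are taken from the two tables in the help text; the function name `parse_blast_tabular` and the sample rows are made up for illustration.

```python
STD_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
EXT_COLUMNS = STD_COLUMNS + [
    "sallseqid", "score", "nident", "positive", "gaps", "ppos",
    "qframe", "sframe", "qseq", "sseq", "qlen", "slen",
]


def parse_blast_tabular(lines):
    """Yield one dict per hit from 12 or 24 column BLAST+ tabular output."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) == len(STD_COLUMNS):
            names = STD_COLUMNS
        elif len(fields) == len(EXT_COLUMNS):
            names = EXT_COLUMNS
        else:
            raise ValueError("Expected 12 or 24 columns, got %i" % len(fields))
        yield dict(zip(names, fields))


if __name__ == "__main__":
    # Made-up rows, for illustration only (12 column layout)
    sample = [
        "query1\tsubjA\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-100\t360",
        "query1\tsubjB\t75.00\t120\t30\t0\t10\t129\t1\t120\t1e-20\t95.1",
    ]
    strong = [hit["sseqid"] for hit in parse_blast_tabular(sample)
              if float(hit["pident"]) >= 90.0]
    print(strong)
```

Because the first 12 keys are the same in both layouts, a filter on `pident` or `evalue` works unchanged whichever output format was selected.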
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/ncbi_blastp_wrapper.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,308 @@
+<tool id="ncbi_blastp_wrapper" name="NCBI BLAST+ blastp" version="0.0.20">
+    <description>Search protein database with protein query sequence(s)</description>
+    <!-- If job splitting is enabled, break up the query file into parts -->
+    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
+    <requirements>
+        <requirement type="binary">blastp</requirement>
+        <requirement type="package" version="2.2.26+">blast+</requirement>
+    </requirements>
+    <version_command>blastp -version</version_command>
+    <command>
+## The command is a Cheetah template which allows some Python based syntax.
+## Lines starting with ## are comments. Galaxy will turn newlines into spaces.
+blastp
+-query "$query"
+#if $db_opts.db_opts_selector == "db":
+  -db "${db_opts.database.fields.path}"
+#elif $db_opts.db_opts_selector == "histdb":
+  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
+#else:
+  -subject "$db_opts.subject"
+#end if
+-task $blast_type
+-evalue $evalue_cutoff
+-out "$output1"
+##Set the extended list here so if/when we add things, saved workflows are not affected
+#if str($out_format)=="ext":
+    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
+#else:
+    -outfmt $out_format
+#end if
+-num_threads 8
+#if $adv_opts.adv_opts_selector=="advanced":
+$adv_opts.filter_query
+-matrix $adv_opts.matrix
+## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
+## Note -max_target_seqs overrides -num_descriptions and -num_alignments
+#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
+-max_target_seqs $adv_opts.max_hits
+#end if
+#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
+-word_size $adv_opts.word_size
+#end if
+##Ungapped disabled for now - see comments below
+##$adv_opts.ungapped
+$adv_opts.parse_deflines
+## End of advanced options:
+#end if
+    </command>
+    <stdio>
+        <!-- Anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+        <!-- In case the return code has not been set properly check stderr too -->
+        <regex match="Error:" />
+        <regex match="Exception:" />
+    </stdio>
+    <inputs>
+        <param name="query" type="data" format="fasta" label="Protein query sequence(s)"/>
+        <conditional name="db_opts">
+            <param name="db_opts_selector" type="select" label="Subject database/sequences">
+              <option value="db" selected="True">Locally installed BLAST database</option>
+              <option value="histdb">BLAST database from your history</option>
+              <option value="file">FASTA file from your history (see warning note below)</option>
+            </param>
+            <when value="db">
+                <param name="database" type="select" label="Protein BLAST database">
+                    <options from_file="blastdb_p.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <when value="histdb">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="data" format="blastdbp" label="Protein BLAST database" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <when value="file">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="data" format="fasta" label="Protein FASTA file to use as database"/>
+            </when>
+        </conditional>
+        <param name="blast_type" type="select" display="radio" label="Type of BLAST">
+            <option value="blastp">blastp</option>
+            <option value="blastp-short">blastp-short</option>
+        </param>
+        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
+        <param name="out_format" type="select" label="Output format">
+            <option value="6">Tabular (standard 12 columns)</option>
+            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
+            <option value="5">BLAST XML</option>
+            <option value="0">Pairwise text</option>
+            <option value="0 -html">Pairwise HTML</option>
+            <option value="2">Query-anchored text</option>
+            <option value="2 -html">Query-anchored HTML</option>
+            <option value="4">Flat query-anchored text</option>
+            <option value="4 -html">Flat query-anchored HTML</option>
+            <!--
+            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
+            -->
+        </param>
+        <conditional name="adv_opts">
+            <param name="adv_opts_selector" type="select" label="Advanced Options">
+              <option value="basic" selected="True">Hide Advanced Options</option>
+              <option value="advanced">Show Advanced Options</option>
+            </param>
+            <when value="basic" />
+            <when value="advanced">
+                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
+                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="false" />
+                <param name="matrix" type="select" label="Scoring matrix">
+                    <option value="BLOSUM90">BLOSUM90</option>
+                    <option value="BLOSUM80">BLOSUM80</option>
+                    <option value="BLOSUM62" selected="true">BLOSUM62 (default)</option>
+                    <option value="BLOSUM50">BLOSUM50</option>
+                    <option value="BLOSUM45">BLOSUM45</option>
+                    <option value="PAM250">PAM250</option>
+                    <option value="PAM70">PAM70</option>
+                    <option value="PAM30">PAM30</option>
+                </param>
+                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
+                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
+                    <validator type="in_range" min="0" />
+                </param>
+                <!-- I'd like word_size to be optional, with minimum 2 for blastp -->
+                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
+                    <validator type="in_range" min="0" />
+                </param>
+                <!--
+                Can't use '-ungapped' on its own, error back is:
+                Composition-adjusted searched are not supported with an ungapped search, please add -comp_based_stats F or do a gapped search
+                Tried using '-ungapped -comp_based_stats F' and blastp crashed with 'Attempt to access NULL pointer.'
+                <param name="ungapped" type="boolean" label="Perform ungapped alignment only?" truevalue="-ungapped -comp_based_stats F" falsevalue="" checked="false" />
+                -->
+                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
+            </when>
+        </conditional>
+    </inputs>
+    <outputs>
+        <data name="output1" format="tabular" label="${blast_type.value_label} on ${on_string}">
+            <change_format>
+                <when input="out_format" value="0" format="txt"/>
+                <when input="out_format" value="0 -html" format="html"/>
+                <when input="out_format" value="2" format="txt"/>
+                <when input="out_format" value="2 -html" format="html"/>
+                <when input="out_format" value="4" format="txt"/>
+                <when input="out_format" value="4 -html" format="html"/>
+                <when input="out_format" value="5" format="blastxml"/>
+            </change_format>
+        </data>
+    </outputs>
+    <tests>
+        <test>
+            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="rhodopsin_proteins.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-8" />
+            <param name="blast_type" value="blastp" />
+            <param name="out_format" value="5" />
+            <param name="adv_opts_selector" value="advanced" />
+            <param name="filter_query" value="False" />
+            <param name="matrix" value="BLOSUM62" />
+            <param name="max_hits" value="0" />
+            <param name="word_size" value="0" />
+            <param name="parse_deflines" value="True" />
+            <output name="output1" file="blastp_four_human_vs_rhodopsin.xml" ftype="blastxml" />
+        </test>
+        <test>
+            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="rhodopsin_proteins.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-8" />
+            <param name="blast_type" value="blastp" />
+            <param name="out_format" value="6" />
+            <param name="adv_opts_selector" value="advanced" />
+            <param name="filter_query" value="False" />
+            <param name="matrix" value="BLOSUM62" />
+            <param name="max_hits" value="0" />
+            <param name="word_size" value="0" />
+            <param name="parse_deflines" value="True" />
+            <output name="output1" file="blastp_four_human_vs_rhodopsin.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="rhodopsin_proteins.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-8" />
+            <param name="blast_type" value="blastp" />
+            <param name="out_format" value="ext" />
+            <param name="adv_opts_selector" value="advanced" />
+            <param name="filter_query" value="False" />
+            <param name="matrix" value="BLOSUM62" />
+            <param name="max_hits" value="0" />
+            <param name="word_size" value="0" />
+            <param name="parse_deflines" value="True" />
+            <output name="output1" file="blastp_four_human_vs_rhodopsin_ext.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="query" value="rhodopsin_proteins.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-8" />
+            <param name="blast_type" value="blastp" />
+            <param name="out_format" value="6" />
+            <param name="adv_opts_selector" value="basic" />
+            <output name="output1" file="blastp_rhodopsin_vs_four_human.tabular" ftype="tabular" />
+        </test>
+    </tests>
+    <help>
+
+.. class:: warningmark
+
+**Note**. Database searches may take a substantial amount of time.
+For large input datasets it is advisable to allow overnight processing.
+
+-----
+
+**What it does**
+
+Search a *protein database* using a *protein query*,
+using the NCBI BLAST+ blastp command line tool.
+
+.. class:: warningmark
+
+You can also search against a FASTA file of subject protein
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly significant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
+-----
+
+**Output format**
+
+Because Galaxy focuses on processing tabular data, the default output of this
+tool is tabular. The standard BLAST+ tabular output contains 12 columns:
+
+====== ========= ============================================
+Column NCBI name Description
+------ --------- --------------------------------------------
+     1 qseqid    Query Seq-id (ID of your sequence)
+     2 sseqid    Subject Seq-id (ID of the database hit)
+     3 pident    Percentage of identical matches
+     4 length    Alignment length
+     5 mismatch  Number of mismatches
+     6 gapopen   Number of gap openings
+     7 qstart    Start of alignment in query
+     8 qend      End of alignment in query
+     9 sstart    Start of alignment in subject (database hit)
+    10 send      End of alignment in subject (database hit)
+    11 evalue    Expectation value (E-value)
+    12 bitscore  Bit score
+====== ========= ============================================
+
+The BLAST+ tools can optionally output additional columns of information,
+but this takes longer to calculate. Most (but not all) of these columns are
+included by selecting the extended tabular output. The extra columns are
+included *after* the standard 12 columns. This is so that you can write
+workflow filtering steps that accept either the 12 or 24 column tabular
+BLAST output. Galaxy now uses this extended 24 column output by default.
+
+====== ============= ===========================================
+Column NCBI name     Description
+------ ------------- -------------------------------------------
+    13 sallseqid     All subject Seq-id(s), separated by a ';'
+    14 score         Raw score
+    15 nident        Number of identical matches
+    16 positive      Number of positive-scoring matches
+    17 gaps          Total number of gaps
+    18 ppos          Percentage of positive-scoring matches
+    19 qframe        Query frame
+    20 sframe        Subject frame
+    21 qseq          Aligned part of query sequence
+    22 sseq          Aligned part of subject sequence
+    23 qlen          Query sequence length
+    24 slen          Subject sequence length
+====== ============= ===========================================
+
+The third option is BLAST XML output, which is designed to be parsed by
+another program, and is understood by some Galaxy tools.
+
+You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
+The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
+The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
+The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
+and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
+
+-------
+
+**References**
+
+Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
+
+Schaffer et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. 2001. Nucleic Acids Res. 29:2994-3005.
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+    </help>
+</tool>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/ncbi_blastx_wrapper.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,294 @@
+<tool id="ncbi_blastx_wrapper" name="NCBI BLAST+ blastx" version="0.0.19">
+    <description>Search protein database with translated nucleotide query sequence(s)</description>
+    <!-- If job splitting is enabled, break up the query file into parts -->
+    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
+    <requirements>
+        <requirement type="binary">blastx</requirement>
+        <requirement type="package" version="2.2.26+">blast+</requirement>
+    </requirements>
+    <version_command>blastx -version</version_command>
+    <command>
+## The command is a Cheetah template which allows some Python based syntax.
+## Lines starting hash hash are comments. Galaxy will turn newlines into spaces
+blastx
+-query "$query"
+#if $db_opts.db_opts_selector == "db":
+  -db "${db_opts.database.fields.path}"
+#elif $db_opts.db_opts_selector == "histdb":
+  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
+#else:
+  -subject "$db_opts.subject"
+#end if
+-query_gencode $query_gencode
+-evalue $evalue_cutoff
+-out "$output1"
+##Set the extended list here so if/when we add things, saved workflows are not affected
+#if str($out_format)=="ext":
+    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
+#else:
+    -outfmt $out_format
+#end if
+-num_threads 8
+#if $adv_opts.adv_opts_selector=="advanced":
+$adv_opts.filter_query
+$adv_opts.strand
+-matrix $adv_opts.matrix
+## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
+## Note -max_target_seqs overrides -num_descriptions and -num_alignments
+#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
+-max_target_seqs $adv_opts.max_hits
+#end if
+#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
+-word_size $adv_opts.word_size
+#end if
+$adv_opts.ungapped
+$adv_opts.parse_deflines
+## End of advanced options:
+#end if
+    </command>
+    <stdio>
+        <!-- Anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+        <!-- In case the return code has not been set properly, check stderr too -->
+        <regex match="Error:" />
+        <regex match="Exception:" />
+    </stdio>
+    <inputs>
+        <param name="query" type="data" format="fasta" label="Nucleotide query sequence(s)"/>
+        <conditional name="db_opts">
+            <param name="db_opts_selector" type="select" label="Subject database/sequences">
+              <option value="db" selected="True">Locally installed BLAST database</option>
+              <option value="histdb">BLAST database from your history</option>
+              <option value="file">FASTA file from your history (see warning note below)</option>
+            </param>
+            <when value="db">
+                <param name="database" type="select" label="Protein BLAST database">
+                    <options from_file="blastdb_p.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <when value="histdb">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="data" format="blastdbp" label="Protein BLAST database" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <when value="file">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="data" format="fasta" label="Protein FASTA file to use as database"/>
+            </when>
+        </conditional>
+        <param name="query_gencode" type="select" label="Query genetic code">
+            <!-- See http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for details -->
+            <option value="1" selected="True">1. Standard</option>
+            <option value="2">2. Vertebrate Mitochondrial</option>
+            <option value="3">3. Yeast Mitochondrial</option>
+            <option value="4">4. Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code</option>
+            <option value="5">5. Invertebrate Mitochondrial</option>
+            <option value="6">6. Ciliate, Dasycladacean and Hexamita Nuclear Code</option>
+            <option value="9">9. Echinoderm Mitochondrial</option>
+            <option value="10">10. Euplotid Nuclear</option>
+            <option value="11">11. Bacteria and Archaea</option>
+            <option value="12">12. Alternative Yeast Nuclear</option>
+            <option value="13">13. Ascidian Mitochondrial</option>
+            <option value="14">14. Flatworm Mitochondrial</option>
+            <option value="15">15. Blepharisma Macronuclear</option>
+            <option value="16">16. Chlorophycean Mitochondrial Code</option>
+            <option value="21">21. Trematode Mitochondrial Code</option>
+            <option value="22">22. Scenedesmus obliquus mitochondrial Code</option>
+            <option value="23">23. Thraustochytrium Mitochondrial Code</option>
+            <option value="24">24. Pterobranchia mitochondrial code</option>
+        </param>
+        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
+        <param name="out_format" type="select" label="Output format">
+            <option value="6">Tabular (standard 12 columns)</option>
+            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
+            <option value="5">BLAST XML</option>
+            <option value="0">Pairwise text</option>
+            <option value="0 -html">Pairwise HTML</option>
+            <option value="2">Query-anchored text</option>
+            <option value="2 -html">Query-anchored HTML</option>
+            <option value="4">Flat query-anchored text</option>
+            <option value="4 -html">Flat query-anchored HTML</option>
+            <!--
+            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
+            -->
+        </param>
+        <conditional name="adv_opts">
+            <param name="adv_opts_selector" type="select" label="Advanced Options">
+              <option value="basic" selected="True">Hide Advanced Options</option>
+              <option value="advanced">Show Advanced Options</option>
+            </param>
+            <when value="basic" />
+            <when value="advanced">
+                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
+                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="true" />
+                <param name="strand" type="select" label="Query strand(s) to search against database/subject">
+                    <option value="-strand both">Both</option>
+                    <option value="-strand plus">Plus (forward)</option>
+                    <option value="-strand minus">Minus (reverse complement)</option>
+                </param>
+                <param name="matrix" type="select" label="Scoring matrix">
+                    <option value="BLOSUM90">BLOSUM90</option>
+                    <option value="BLOSUM80">BLOSUM80</option>
+                    <option value="BLOSUM62" selected="true">BLOSUM62 (default)</option>
+                    <option value="BLOSUM50">BLOSUM50</option>
+                    <option value="BLOSUM45">BLOSUM45</option>
+                    <option value="PAM250">PAM250</option>
+                    <option value="PAM70">PAM70</option>
+                    <option value="PAM30">PAM30</option>
+                </param>
+                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
+                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
+                    <validator type="in_range" min="0" />
+                </param>
+                <!-- I'd like word_size to be optional, with minimum 2 for blastx -->
+                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
+                    <validator type="in_range" min="0" />
+                </param>
+                <param name="ungapped" type="boolean" label="Perform ungapped alignment only?" truevalue="-ungapped" falsevalue="" checked="false" />
+                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
+            </when>
+        </conditional>
+    </inputs>
+    <outputs>
+        <data name="output1" format="tabular" label="blastx on ${on_string}">
+            <change_format>
+                <when input="out_format" value="0" format="txt"/>
+                <when input="out_format" value="0 -html" format="html"/>
+                <when input="out_format" value="2" format="txt"/>
+                <when input="out_format" value="2 -html" format="html"/>
+                <when input="out_format" value="4" format="txt"/>
+                <when input="out_format" value="4 -html" format="html"/>
+                <when input="out_format" value="5" format="blastxml"/>
+            </change_format>
+        </data>
+    </outputs>
+    <tests>
+        <test>
+            <param name="query" value="rhodopsin_nucs.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-10" />
+            <param name="out_format" value="5" />
+            <param name="adv_opts_selector" value="basic" />
+            <output name="output1" file="blastx_rhodopsin_vs_four_human.xml" ftype="blastxml" />
+        </test>
+        <test>
+            <param name="query" value="rhodopsin_nucs.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-10" />
+            <param name="out_format" value="6" />
+            <param name="adv_opts_selector" value="basic" />
+            <output name="output1" file="blastx_rhodopsin_vs_four_human.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="query" value="rhodopsin_nucs.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-10" />
+            <param name="out_format" value="ext" />
+            <param name="adv_opts_selector" value="basic" />
+            <output name="output1" file="blastx_rhodopsin_vs_four_human_ext.tabular" ftype="tabular" />
+        </test>
+    </tests>
+    <help>
+
+.. class:: warningmark
+
+**Note**. Database searches may take a substantial amount of time.
+For large input datasets it is advisable to allow overnight processing.
+
+-----
+
+**What it does**
+
+Search a *protein database* using a *translated nucleotide query*,
+using the NCBI BLAST+ blastx command line tool.
+
+.. class:: warningmark
+
+You can also search against a FASTA file of subject protein
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly significant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
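+
+As a rough command line sketch of that approach (the file names are
+hypothetical, and the options shown are only a minimal subset)::
+
+    makeblastdb -in subject_proteins.fasta -dbtype prot -out subject_db
+    blastx -query query_nucleotides.fasta -db subject_db -evalue 0.001 -outfmt 6 -out hits.tabular
+
+Within Galaxy, the matching route is to build the database with the
+*makeblastdb* tool and then choose "BLAST database from your history".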
+
+-----
+
+**Output format**
+
+Because Galaxy focuses on processing tabular data, the default output of this
+tool is tabular. The standard BLAST+ tabular output contains 12 columns:
+
+====== ========= ============================================
+Column NCBI name Description
+------ --------- --------------------------------------------
+     1 qseqid    Query Seq-id (ID of your sequence)
+     2 sseqid    Subject Seq-id (ID of the database hit)
+     3 pident    Percentage of identical matches
+     4 length    Alignment length
+     5 mismatch  Number of mismatches
+     6 gapopen   Number of gap openings
+     7 qstart    Start of alignment in query
+     8 qend      End of alignment in query
+     9 sstart    Start of alignment in subject (database hit)
+    10 send      End of alignment in subject (database hit)
+    11 evalue    Expectation value (E-value)
+    12 bitscore  Bit score
+====== ========= ============================================
+
+The BLAST+ tools can optionally output additional columns of information,
+but this takes longer to calculate. Most (but not all) of these columns are
+included by selecting the extended tabular output. The extra columns are
+included *after* the standard 12 columns. This is so that you can write
+workflow filtering steps that accept either the 12 or 24 column tabular
+BLAST output. Galaxy now uses this extended 24 column output by default.
+
+====== ============= ===========================================
+Column NCBI name     Description
+------ ------------- -------------------------------------------
+    13 sallseqid     All subject Seq-id(s), separated by a ';'
+    14 score         Raw score
+    15 nident        Number of identical matches
+    16 positive      Number of positive-scoring matches
+    17 gaps          Total number of gaps
+    18 ppos          Percentage of positive-scoring matches
+    19 qframe        Query frame
+    20 sframe        Subject frame
+    21 qseq          Aligned part of query sequence
+    22 sseq          Aligned part of subject sequence
+    23 qlen          Query sequence length
+    24 slen          Subject sequence length
+====== ============= ===========================================
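+
+For reference, this wrapper requests the extended tabular output with the
+following BLAST+ option (``std`` stands for the standard 12 columns)::
+
+    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"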
+
+The third option is BLAST XML output, which is designed to be parsed by
+another program, and is understood by some Galaxy tools.
+
+You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
+The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
+The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
+The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
+and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
+
+-------
+
+**References**
+
+Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+    </help>
+</tool>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/ncbi_makeblastdb.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,129 @@
+<tool id="ncbi_makeblastdb" name="NCBI BLAST+ makeblastdb" version="0.0.5">
+  <description>Make BLAST database</description>
+    <requirements>
+        <requirement type="binary">makeblastdb</requirement>
+        <requirement type="package" version="2.2.26+">blast+</requirement>
+    </requirements>
+    <version_command>makeblastdb -version</version_command>
+    <command>
+makeblastdb -out "${os.path.join($outfile.extra_files_path,'blastdb')}"
+$parse_seqids
+$hash_index
+## Single call to -in with multiple filenames space separated with outer quotes
+## (presumably any filenames with spaces would be a problem). Note this gives
+## some extra spaces, e.g. -in " file1 file2 file3  " but BLAST seems happy:
+-in "
+#for $i in $in
+${i.file} #end for
+"
+#if $title:
+-title "$title"
+#else:
+##Would default to being based on the cryptic Galaxy filenames, which is unhelpful
+-title "BLAST Database"
+#end if
+-dbtype $dbtype
+## #set $sep = '-mask_data '
+## #for $i in $mask_data
+## $sep${i.file}
+## #set $sep = ', '
+## #end for
+## #set $sep = '-gi_mask -gi_mask_name '
+## #for $i in $gi_mask
+## $sep${i.file}
+## #set $sep = ', '
+## #end for
+## #if $tax.select == 'id':
+## -taxid $tax.id
+## #else if $tax.select == 'map':
+## -taxid_map $tax.map
+## #end if
+</command>
+<stdio>
+    <!-- Anything other than zero is an error -->
+    <exit_code range="1:" />
+    <exit_code range=":-1" />
+    <!-- In case the return code has not been set properly, check stderr too -->
+    <regex match="Error:" />
+    <regex match="Exception:" />
+</stdio>
+<inputs>
+    <param name="dbtype" type="select" display="radio" label="Molecule type of input">
+        <option value="prot">protein</option>
+        <option value="nucl">nucleotide</option>
+    </param>
+    <!-- TODO Allow merging of existing BLAST databases (conditional on the database type)
+    <repeat name="in" title="Blast or Fasta Database" min="1">
+        <param name="file" type="data" format="fasta,blastdbn,blastdbp" label="Blast or Fasta database" />
+    </repeat>
+    -->
+    <repeat name="in" title="FASTA file" min="1">
+        <param name="file" type="data" format="fasta" />
+    </repeat>
+    <param name="title" type="text" value="" label="Title for BLAST database" help="This is the database name shown in BLAST search output" />
+    <param name="parse_seqids" type="boolean" truevalue="-parse_seqids" falsevalue="" checked="False" label="Parse the sequence identifiers" help="This is only advised if your FASTA file follows the NCBI naming conventions using pipe '|' symbols" />
+    <param name="hash_index" type="boolean" truevalue="-hash_index" falsevalue="" checked="true" label="Enable the creation of sequence hash values." help="These hash values can then be used to quickly determine whether a given sequence exists in this BLAST database." />
+
+    <!-- SEQUENCE MASKING OPTIONS -->
+    <!-- TODO
+    <repeat name="mask_data" title="Provide one or more files containing masking data">
+        <param name="file" type="data" format="asnb" label="File containing masking data" help="As produced by NCBI masking applications (e.g. dustmasker, segmasker, windowmasker)" />
+    </repeat>
+    <repeat name="gi_mask" title="Create GI indexed masking data">
+        <param name="file" type="data" format="asnb" label="Masking data output file" />
+    </repeat>
+    -->
+
+    <!-- TAXONOMY OPTIONS -->
+    <!-- TODO
+    <conditional name="tax">
+        <param name="select" type="select" label="Taxonomy options">
+            <option value="">Do not assign sequences to Taxonomy IDs</option>
+            <option value="id">Assign all sequences to one Taxonomy ID</option>
+            <option value="map">Supply text file mapping sequence IDs to taxonomy IDs</option>
+        </param>
+        <when value="">
+        </when>
+        <when value="id">
+            <param name="id" type="integer" value="" label="NCBI taxonomy ID" help="Integer &gt;=0" />
+        </when>
+        <when value="map">
+            <param name="file" type="data" format="txt" label="Seq ID : Tax ID mapping file" help="Format: SequenceId TaxonomyId" />
+        </when>
+    </conditional>
+    -->
+</inputs>
+<outputs>
+    <!-- If we only accepted one FASTA file, we could use its human name here... -->
+    <data name="outfile" format="data" label="${dbtype.value_label} BLAST database from ${on_string}">
+        <change_format>
+                <when input="dbtype" value="nucl" format="blastdbn"/>
+                <when input="dbtype" value="prot" format="blastdbp"/>
+        </change_format>
+    </data>
+</outputs>
+<help>
+**What it does**
+
+Make a BLAST database from one or more FASTA files.
+
+This is a wrapper for the NCBI BLAST+ tool 'makeblastdb', which is the
+replacement for the 'formatdb' tool in the NCBI 'legacy' BLAST suite.
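+
+As a rough sketch, for two protein FASTA files with the default settings
+the wrapper runs something like this (the file names are hypothetical,
+Galaxy supplies the real paths)::
+
+    makeblastdb -out blastdb -hash_index -in "proteins1.fasta proteins2.fasta" -title "BLAST Database" -dbtype prot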
+
+<!--
+Applying masks to an existing BLAST database will not change the original database; a new database will be created.
+For this reason, it's best to apply all masks at once to minimize the number of unnecessary intermediate databases.
+-->
+
+**Documentation**
+
+http://www.ncbi.nlm.nih.gov/books/NBK1763/
+
+**References**
+
+Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+</help>
+</tool>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/ncbi_rpsblast_wrapper.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,238 @@
+<tool id="ncbi_rpsblast_wrapper" name="NCBI BLAST+ rpsblast" version="0.0.4">
+    <description>Search protein domain database (PSSMs) with protein query sequence(s)</description>
+    <!-- If job splitting is enabled, break up the query file into parts -->
+    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject" merge_outputs="output1"></parallelism>
+    <requirements>
+        <requirement type="binary">rpsblast</requirement>
+        <requirement type="package" version="2.2.26+">blast+</requirement>
+    </requirements>
+    <version_command>rpsblast -version</version_command>
+    <command>
+## The command is a Cheetah template which allows some Python based syntax.
+## Lines starting hash hash are comments. Galaxy will turn newlines into spaces
+rpsblast
+-query "$query"
+#if $db_opts.db_opts_selector == "db":
+  -db "${db_opts.database.fields.path}"
+#elif $db_opts.db_opts_selector == "histdb":
+  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
+#end if
+-evalue $evalue_cutoff
+-out "$output1"
+##Set the extended list here so if/when we add things, saved workflows are not affected
+#if str($out_format)=="ext":
+    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
+#else:
+    -outfmt $out_format
+#end if
+-num_threads 8
+#if $adv_opts.adv_opts_selector=="advanced":
+$adv_opts.filter_query
+## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
+## Note -max_target_seqs overrides -num_descriptions and -num_alignments
+#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
+-max_target_seqs $adv_opts.max_hits
+#end if
+#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
+-word_size $adv_opts.word_size
+#end if
+$adv_opts.parse_deflines
+## End of advanced options:
+#end if
+    </command>
+    <stdio>
+        <!-- Anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+        <!-- In case the return code has not been set properly, check stderr too -->
+        <regex match="Error:" />
+        <regex match="Exception:" />
+    </stdio>
+    <inputs>
+        <param name="query" type="data" format="fasta" label="Protein query sequence(s)"/>
+        <conditional name="db_opts">
+            <param name="db_opts_selector" type="select" label="Protein domain database (PSSM)">
+              <option value="db" selected="True">Locally installed BLAST database</option>
+	      <!-- TODO - define new datatype
+              <option value="histdb">BLAST protein domain database from your history</option>
+	      -->
+            </param>
+            <when value="db">
+                <param name="database" type="select" label="Protein domain database">
+                    <options from_file="blastdb_d.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+	    <!-- TODO - define new datatype
+            <when value="histdb">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="data" format="blastdbd" label="Protein domain database" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+	    -->
+        </conditional>
+        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
+        <param name="out_format" type="select" label="Output format">
+            <option value="6">Tabular (standard 12 columns)</option>
+            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
+            <option value="5">BLAST XML</option>
+            <option value="0">Pairwise text</option>
+            <option value="0 -html">Pairwise HTML</option>
+            <option value="2">Query-anchored text</option>
+            <option value="2 -html">Query-anchored HTML</option>
+            <option value="4">Flat query-anchored text</option>
+            <option value="4 -html">Flat query-anchored HTML</option>
+            <!--
+            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
+            -->
+        </param>
+        <conditional name="adv_opts">
+            <param name="adv_opts_selector" type="select" label="Advanced Options">
+              <option value="basic" selected="True">Hide Advanced Options</option>
+              <option value="advanced">Show Advanced Options</option>
+            </param>
+            <when value="basic" />
+            <when value="advanced">
+                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
+                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="false" />
+                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
+                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
+                    <validator type="in_range" min="0" />
+                </param>
+                <!-- I'd like word_size to be optional, with minimum 2 for rpsblast -->
+                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
+                    <validator type="in_range" min="0" />
+                </param>
+                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
+            </when>
+        </conditional>
+    </inputs>
+    <outputs>
+        <data name="output1" format="tabular" label="rpsblast on ${on_string}">
+            <change_format>
+                <when input="out_format" value="0" format="txt"/>
+                <when input="out_format" value="0 -html" format="html"/>
+                <when input="out_format" value="2" format="txt"/>
+                <when input="out_format" value="2 -html" format="html"/>
+                <when input="out_format" value="4" format="txt"/>
+                <when input="out_format" value="4 -html" format="html"/>
+                <when input="out_format" value="5" format="blastxml"/>
+            </change_format>
+        </data>
+    </outputs>
+    <help>
+
+.. class:: warningmark
+
+**Note**. Database searches may take a substantial amount of time.
+For large input datasets it is advisable to allow overnight processing.
+
+-----
+
+**What it does**
+
+Search a *protein domain database* using a *protein query*,
+using the NCBI BLAST+ rpsblast command line tool.
+
+The protein domain databases use position-specific scoring matrices
+(PSSMs) and are available for a number of domain collections including:
+
+*CDD* - NCBI curated meta-collection of domains, see
+http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd_help.shtml#NCBI_curated_domains
+
+*Kog* - PSSMs from automatically aligned sequences and sequence
+fragments classified in the KOGs resource, the eukaryotic
+counterpart to COGs, see http://www.ncbi.nlm.nih.gov/COG/new/
+
+*Cog* - PSSMs from automatically aligned sequences and sequence
+fragments classified in the COGs resource, which focuses primarily
+on prokaryotes, see http://www.ncbi.nlm.nih.gov/COG/new/
+
+*Pfam* - PSSMs from Pfam-A seed alignment database, see
+http://pfam.sanger.ac.uk/
+
+*Smart* - PSSMs from SMART domain alignment database, see
+http://smart.embl-heidelberg.de/
+
+*Tigr* - PSSMs from TIGRFAM database of protein families, see
+http://www.jcvi.org/cms/research/projects/tigrfams/overview/
+
+*Prk* - PSSMs from automatically aligned stable clusters in the
+Protein Clusters database, see
+http://www.ncbi.nlm.nih.gov/proteinclusters?cmd=search&amp;db=proteinclusters
+
+The exact list of domain databases offered will depend on how your
+local Galaxy has been configured.
+
+-----
+
+**Output format**
+
+Because Galaxy focuses on processing tabular data, the default output of this
+tool is tabular. The standard BLAST+ tabular output contains 12 columns:
+
+====== ========= ============================================
+Column NCBI name Description
+------ --------- --------------------------------------------
+     1 qseqid    Query Seq-id (ID of your sequence)
+     2 sseqid    Subject Seq-id (ID of the database hit)
+     3 pident    Percentage of identical matches
+     4 length    Alignment length
+     5 mismatch  Number of mismatches
+     6 gapopen   Number of gap openings
+     7 qstart    Start of alignment in query
+     8 qend      End of alignment in query
+     9 sstart    Start of alignment in subject (database hit)
+    10 send      End of alignment in subject (database hit)
+    11 evalue    Expectation value (E-value)
+    12 bitscore  Bit score
+====== ========= ============================================
+
+The BLAST+ tools can optionally output additional columns of information,
+but this takes longer to calculate. Most (but not all) of these columns are
+included by selecting the extended tabular output. The extra columns are
+included *after* the standard 12 columns. This is so that you can write
+workflow filtering steps that accept either the 12 or 24 column tabular
+BLAST output. Galaxy now uses this extended 24 column output by default.
+
+====== ============= ===========================================
+Column NCBI name     Description
+------ ------------- -------------------------------------------
+    13 sallseqid     All subject Seq-id(s), separated by a ';'
+    14 score         Raw score
+    15 nident        Number of identical matches
+    16 positive      Number of positive-scoring matches
+    17 gaps          Total number of gaps
+    18 ppos          Percentage of positive-scoring matches
+    19 qframe        Query frame
+    20 sframe        Subject frame
+    21 qseq          Aligned part of query sequence
+    22 sseq          Aligned part of subject sequence
+    23 qlen          Query sequence length
+    24 slen          Subject sequence length
+====== ============= ===========================================
+
+The third option is BLAST XML output, which is designed to be parsed by
+another program, and is understood by some Galaxy tools.
+
+You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
+The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
+The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
+The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
+and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
+
+-------
+
+**References**
+
+Marchler-Bauer A, Bryant SH. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W327-31.
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+    </help>
+</tool>
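
Reviewer sketch (not part of the changeset): the rpsblast help text above describes the 12 column and 24 column tabular layouts and notes that the extra columns come *after* the standard ones, so that one filtering step can accept either. The Python below illustrates that idea under stated assumptions. The function names `parse_blast_tabular` and `filter_by_evalue` and the sample rows are hypothetical; only the column names are taken from the two tables in the help text.

```python
import io

# Column names copied from the help text tables (12 standard + 12 extended).
STD_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
EXT_COLUMNS = STD_COLUMNS + [
    "sallseqid", "score", "nident", "positive", "gaps", "ppos",
    "qframe", "sframe", "qseq", "sseq", "qlen", "slen",
]


def parse_blast_tabular(handle):
    """Yield one dict per hit from 12 or 24 column BLAST+ tabular output."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and any comment lines
        fields = line.split("\t")
        if len(fields) not in (len(STD_COLUMNS), len(EXT_COLUMNS)):
            raise ValueError("Expected 12 or 24 columns, got %d" % len(fields))
        # zip() stops at the shorter input, so a 12 column row only
        # receives the 12 standard names.
        yield dict(zip(EXT_COLUMNS, fields))


def filter_by_evalue(hits, max_evalue):
    """Keep hits whose E-value (column 11) is at or below max_evalue."""
    return [hit for hit in hits if float(hit["evalue"]) <= max_evalue]


if __name__ == "__main__":
    # Two made-up 12 column rows, purely for illustration.
    sample = (
        "q1\tgnl|CDD|1\t50.0\t100\t50\t0\t1\t100\t1\t100\t1e-20\t80.0\n"
        "q1\tgnl|CDD|2\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20.0\n"
    )
    hits = list(parse_blast_tabular(io.StringIO(sample)))
    for hit in filter_by_evalue(hits, 0.001):
        print(hit["qseqid"], hit["sseqid"], hit["evalue"])
```

Because the first 12 names are shared, the same filter works on either layout; code that needs columns 13 to 24 would have to check that the row actually has them.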
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/ncbi_rpstblastn_wrapper.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,239 @@
+<tool id="ncbi_rpstblastn_wrapper" name="NCBI BLAST+ rpstblastn" version="0.0.4">
+    <description>Search protein domain database (PSSMs) with translated nucleotide query sequence(s)</description>
+    <!-- If job splitting is enabled, break up the query file into parts -->
+    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject" merge_outputs="output1"></parallelism>
+    <requirements>
+        <requirement type="binary">rpstblastn</requirement>
+        <requirement type="package" version="2.2.26+">blast+</requirement>
+    </requirements>
+    <version_command>rpstblastn -version</version_command>
+    <command>
+## The command is a Cheetah template which allows some Python based syntax.
+## Lines starting hash hash are comments. Galaxy will turn newlines into spaces
+rpstblastn
+-query "$query"
+#if $db_opts.db_opts_selector == "db":
+  -db "${db_opts.database.fields.path}"
+#elif $db_opts.db_opts_selector == "histdb":
+  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
+#end if
+-evalue $evalue_cutoff
+-out "$output1"
+##Set the extended list here so if/when we add things, saved workflows are not affected
+#if str($out_format)=="ext":
+    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
+#else:
+    -outfmt $out_format
+#end if
+##Seems rpstblastn does not currently support multiple threads :(
+##-num_threads 8
+#if $adv_opts.adv_opts_selector=="advanced":
+$adv_opts.filter_query
+## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
+## Note -max_target_seqs overrides -num_descriptions and -num_alignments
+#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
+-max_target_seqs $adv_opts.max_hits
+#end if
+#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
+-word_size $adv_opts.word_size
+#end if
+$adv_opts.parse_deflines
+## End of advanced options:
+#end if
+    </command>
+    <stdio>
+        <!-- Anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+        <!-- In case the return code has not been set properly check stderr too -->
+        <regex match="Error:" />
+        <regex match="Exception:" />
+    </stdio>
+    <inputs>
+        <param name="query" type="data" format="fasta" label="Nucleotide query sequence(s)"/>
+        <conditional name="db_opts">
+            <param name="db_opts_selector" type="select" label="Protein domain database (PSSM)">
+              <option value="db" selected="True">Locally installed BLAST database</option>
+              <!-- TODO - define new datatype
+              <option value="histdb">BLAST protein domain database from your history</option>
+              -->
+            </param>
+            <when value="db">
+                <param name="database" type="select" label="Protein domain database">
+                    <options from_file="blastdb_d.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <!-- TODO - define new datatype
+            <when value="histdb">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="data" format="blastdbd" label="Protein domain database" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            -->
+        </conditional>
+        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
+        <param name="out_format" type="select" label="Output format">
+            <option value="6">Tabular (standard 12 columns)</option>
+            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
+            <option value="5">BLAST XML</option>
+            <option value="0">Pairwise text</option>
+            <option value="0 -html">Pairwise HTML</option>
+            <option value="2">Query-anchored text</option>
+            <option value="2 -html">Query-anchored HTML</option>
+            <option value="4">Flat query-anchored text</option>
+            <option value="4 -html">Flat query-anchored HTML</option>
+            <!--
+            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
+            -->
+        </param>
+        <conditional name="adv_opts">
+            <param name="adv_opts_selector" type="select" label="Advanced Options">
+              <option value="basic" selected="True">Hide Advanced Options</option>
+              <option value="advanced">Show Advanced Options</option>
+            </param>
+            <when value="basic" />
+            <when value="advanced">
+                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
+                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="false" />
+                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
+                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
+                    <validator type="in_range" min="0" />
+                </param>
+                <!-- I'd like word_size to be optional, with minimum 2 for rpstblastn -->
+                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
+                    <validator type="in_range" min="0" />
+                </param>
+                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
+            </when>
+        </conditional>
+    </inputs>
+    <outputs>
+        <data name="output1" format="tabular" label="rpstblastn on ${on_string}">
+            <change_format>
+                <when input="out_format" value="0" format="txt"/>
+                <when input="out_format" value="0 -html" format="html"/>
+                <when input="out_format" value="2" format="txt"/>
+                <when input="out_format" value="2 -html" format="html"/>
+                <when input="out_format" value="4" format="txt"/>
+                <when input="out_format" value="4 -html" format="html"/>
+                <when input="out_format" value="5" format="blastxml"/>
+            </change_format>
+        </data>
+    </outputs>
+    <help>
+
+.. class:: warningmark
+
+**Note**. Database searches may take a substantial amount of time.
+For large input datasets it is advisable to allow overnight processing.
+
+-----
+
+**What it does**
+
+Search a *protein domain database* using a *nucleotide query*,
+using the NCBI BLAST+ rpstblastn command line tool.
+
+The protein domain databases use position-specific scoring matrices
+(PSSMs) and are available for a number of domain collections including:
+
+*CDD* - NCBI curated meta-collection of domains, see
+http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd_help.shtml#NCBI_curated_domains
+
+*Kog* - PSSMs from automatically aligned sequences and sequence
+fragments classified in the KOGs resource, the eukaryotic
+counterpart to COGs, see http://www.ncbi.nlm.nih.gov/COG/new/
+
+*Cog* - PSSMs from automatically aligned sequences and sequence
+fragments classified in the COGs resource, which focuses primarily
+on prokaryotes, see http://www.ncbi.nlm.nih.gov/COG/new/
+
+*Pfam* - PSSMs from Pfam-A seed alignment database, see
+http://pfam.sanger.ac.uk/
+
+*Smart* - PSSMs from SMART domain alignment database, see
+http://smart.embl-heidelberg.de/
+
+*Tigr* - PSSMs from TIGRFAM database of protein families, see
+http://www.jcvi.org/cms/research/projects/tigrfams/overview/
+
+*Prk* - PSSMs from automatically aligned stable clusters in the
+Protein Clusters database, see
+http://www.ncbi.nlm.nih.gov/proteinclusters?cmd=search&amp;db=proteinclusters
+
+The exact list of domain databases offered will depend on how your
+local Galaxy has been configured.
+
+-----
+
+**Output format**
+
+Because Galaxy focuses on processing tabular data, the default output of this
+tool is tabular. The standard BLAST+ tabular output contains 12 columns:
+
+====== ========= ============================================
+Column NCBI name Description
+------ --------- --------------------------------------------
+     1 qseqid    Query Seq-id (ID of your sequence)
+     2 sseqid    Subject Seq-id (ID of the database hit)
+     3 pident    Percentage of identical matches
+     4 length    Alignment length
+     5 mismatch  Number of mismatches
+     6 gapopen   Number of gap openings
+     7 qstart    Start of alignment in query
+     8 qend      End of alignment in query
+     9 sstart    Start of alignment in subject (database hit)
+    10 send      End of alignment in subject (database hit)
+    11 evalue    Expectation value (E-value)
+    12 bitscore  Bit score
+====== ========= ============================================
+
+The BLAST+ tools can optionally output additional columns of information,
+but this takes longer to calculate. Most (but not all) of these columns are
+included by selecting the extended tabular output. The extra columns are
+included *after* the standard 12 columns. This is so that you can write
+workflow filtering steps that accept either the 12 or 24 column tabular
+BLAST output. Galaxy now uses this extended 24 column output by default.
+
+====== ============= ===========================================
+Column NCBI name     Description
+------ ------------- -------------------------------------------
+    13 sallseqid     All subject Seq-id(s), separated by a ';'
+    14 score         Raw score
+    15 nident        Number of identical matches
+    16 positive      Number of positive-scoring matches
+    17 gaps          Total number of gaps
+    18 ppos          Percentage of positive-scoring matches
+    19 qframe        Query frame
+    20 sframe        Subject frame
+    21 qseq          Aligned part of query sequence
+    22 sseq          Aligned part of subject sequence
+    23 qlen          Query sequence length
+    24 slen          Subject sequence length
+====== ============= ===========================================
+
+The third option is BLAST XML output, which is designed to be parsed by
+another program, and is understood by some Galaxy tools.
+
+You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
+The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
+The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
+The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
+and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
+
+-------
+
+**References**
+
+Marchler-Bauer A, Bryant SH. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W327-31.
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+    </help>
+</tool>
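
Reviewer sketch (not part of the changeset): the Cheetah command template in ncbi_rpstblastn_wrapper.xml above only emits `-max_target_seqs` and `-word_size` when the value is a positive integer, and passes `$out_format` unquoted so that a select value such as `0 -html` expands into two separate arguments. The Python below restates that logic for readers unfamiliar with Cheetah. It is an illustration under assumptions, not how Galaxy itself builds the command line; the function names `outfmt_args` and `optional_int_args` are hypothetical.

```python
import shlex

# The extended column list, copied from the template's -outfmt string.
EXTENDED_OUTFMT = (
    "6 std sallseqid score nident positive gaps ppos "
    "qframe sframe qseq sseq qlen slen"
)


def outfmt_args(out_format):
    """Return the -outfmt arguments for a given out_format select value."""
    if str(out_format) == "ext":
        # Quoted in the template, so it stays a single argument.
        return ["-outfmt", EXTENDED_OUTFMT]
    # Unquoted in the template, so "0 -html" becomes "0" and "-html".
    return ["-outfmt"] + shlex.split(str(out_format))


def optional_int_args(max_hits, word_size):
    """Return extra arguments, skipping empty or zero values (use default)."""
    args = []
    # Same test as the template: str(value) and int(str(value)) > 0
    if str(max_hits) and int(str(max_hits)) > 0:
        args += ["-max_target_seqs", str(max_hits)]
    if str(word_size) and int(str(word_size)) > 0:
        args += ["-word_size", str(word_size)]
    return args


if __name__ == "__main__":
    print(outfmt_args("0 -html") + optional_int_args(0, 3))
```

Treating zero as "use the BLAST+ default" is what lets the wrapper keep a plain integer parameter with an `in_range` validator instead of an optional one.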
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/ncbi_tblastn_wrapper.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,340 @@
+<tool id="ncbi_tblastn_wrapper" name="NCBI BLAST+ tblastn" version="0.0.20">
+    <description>Search translated nucleotide database with protein query sequence(s)</description>
+    <!-- If job splitting is enabled, break up the query file into parts -->
+    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
+    <requirements>
+        <requirement type="binary">tblastn</requirement>
+        <requirement type="package" version="2.2.26+">blast+</requirement>
+    </requirements>
+    <version_command>tblastn -version</version_command>
+    <command>
+## The command is a Cheetah template which allows some Python based syntax.
+## Lines starting hash hash are comments. Galaxy will turn newlines into spaces
+tblastn
+-query "$query"
+#if $db_opts.db_opts_selector == "db":
+  -db "${db_opts.database.fields.path}"
+#elif $db_opts.db_opts_selector == "histdb":
+  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
+#else:
+  -subject "$db_opts.subject"
+#end if
+-evalue $evalue_cutoff
+-out "$output1"
+##Set the extended list here so if/when we add things, saved workflows are not affected
+#if str($out_format)=="ext":
+    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
+#else:
+    -outfmt $out_format
+#end if
+-num_threads 8
+#if $adv_opts.adv_opts_selector=="advanced":
+-db_gencode $adv_opts.db_gencode
+$adv_opts.filter_query
+-matrix $adv_opts.matrix
+## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
+## Note -max_target_seqs overrides -num_descriptions and -num_alignments
+#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
+-max_target_seqs $adv_opts.max_hits
+#end if
+#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
+-word_size $adv_opts.word_size
+#end if
+##Ungapped disabled for now - see comments below
+##$adv_opts.ungapped
+$adv_opts.parse_deflines
+## End of advanced options:
+#end if
+    </command>
+    <stdio>
+        <!-- Anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+        <!-- In case the return code has not been set properly check stderr too -->
+        <regex match="Error:" />
+        <regex match="Exception:" />
+    </stdio>
+    <inputs>
+        <param name="query" type="data" format="fasta" label="Protein query sequence(s)"/>
+        <conditional name="db_opts">
+            <param name="db_opts_selector" type="select" label="Subject database/sequences">
+              <option value="db" selected="True">Locally installed BLAST database</option>
+              <option value="histdb">BLAST database from your history</option>
+              <option value="file">FASTA file from your history (see warning note below)</option>
+            </param>
+            <when value="db">
+                <param name="database" type="select" label="Nucleotide BLAST database">
+                    <options from_file="blastdb.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <when value="histdb">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="data" format="blastdbn" label="Nucleotide BLAST database" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <when value="file">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="data" format="fasta" label="Nucleotide FASTA file to use as database"/>
+            </when>
+        </conditional>
+        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
+        <param name="out_format" type="select" label="Output format">
+            <option value="6">Tabular (standard 12 columns)</option>
+            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
+            <option value="5">BLAST XML</option>
+            <option value="0">Pairwise text</option>
+            <option value="0 -html">Pairwise HTML</option>
+            <option value="2">Query-anchored text</option>
+            <option value="2 -html">Query-anchored HTML</option>
+            <option value="4">Flat query-anchored text</option>
+            <option value="4 -html">Flat query-anchored HTML</option>
+            <!--
+            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
+            -->
+        </param>
+        <conditional name="adv_opts">
+            <param name="adv_opts_selector" type="select" label="Advanced Options">
+              <option value="basic" selected="True">Hide Advanced Options</option>
+              <option value="advanced">Show Advanced Options</option>
+            </param>
+            <when value="basic" />
+            <when value="advanced">
+                <param name="db_gencode" type="select" label="Database/subject genetic code">
+                    <!-- See http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for details -->
+                    <option value="1" selected="True">1. Standard</option>
+                    <option value="2">2. Vertebrate Mitochondrial</option>
+                    <option value="3">3. Yeast Mitochondrial</option>
+                    <option value="4">4. Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code</option>
+                    <option value="5">5. Invertebrate Mitochondrial</option>
+                    <option value="6">6. Ciliate, Dasycladacean and Hexamita Nuclear Code</option>
+                    <option value="9">9. Echinoderm Mitochondrial</option>
+                    <option value="10">10. Euplotid Nuclear</option>
+                    <option value="11">11. Bacteria and Archaea</option>
+                    <option value="12">12. Alternative Yeast Nuclear</option>
+                    <option value="13">13. Ascidian Mitochondrial</option>
+                    <option value="14">14. Flatworm Mitochondrial</option>
+                    <option value="15">15. Blepharisma Macronuclear</option>
+                    <option value="16">16. Chlorophycean Mitochondrial Code</option>
+                    <option value="21">21. Trematode Mitochondrial Code</option>
+                    <option value="22">22. Scenedesmus obliquus mitochondrial Code</option>
+                    <option value="23">23. Thraustochytrium Mitochondrial Code</option>
+                    <option value="24">24. Pterobranchia mitochondrial code</option>
+                </param>
+                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
+                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="true" />
+                <param name="matrix" type="select" label="Scoring matrix">
+                    <option value="BLOSUM90">BLOSUM90</option>
+                    <option value="BLOSUM80">BLOSUM80</option>
+                    <option value="BLOSUM62" selected="true">BLOSUM62 (default)</option>
+                    <option value="BLOSUM50">BLOSUM50</option>
+                    <option value="BLOSUM45">BLOSUM45</option>
+                    <option value="PAM250">PAM250</option>
+                    <option value="PAM70">PAM70</option>
+                    <option value="PAM30">PAM30</option>
+                </param>
+                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
+                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
+                    <validator type="in_range" min="0" />
+                </param>
+                <!-- I'd like word_size to be optional, with minimum 2 for tblastn -->
+                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
+                    <validator type="in_range" min="0" />
+                </param>
+                <!--
+                Can't use '-ungapped' on its own, error back is:
+                Composition-adjusted searched are not supported with an ungapped search, please add -comp_based_stats F or do a gapped search
+                Tried using '-ungapped -comp_based_stats F' and tblastn crashed with 'Attempt to access NULL pointer.'
+                <param name="ungapped" type="boolean" label="Perform ungapped alignment only?" truevalue="-ungapped -comp_based_stats F" falsevalue="" checked="false" />
+                -->
+                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
+            </when>
+        </conditional>
+    </inputs>
+    <outputs>
+        <data name="output1" format="tabular" label="tblastn on ${on_string}">
+            <change_format>
+                <when input="out_format" value="0" format="txt"/>
+                <when input="out_format" value="0 -html" format="html"/>
+                <when input="out_format" value="2" format="txt"/>
+                <when input="out_format" value="2 -html" format="html"/>
+                <when input="out_format" value="4" format="txt"/>
+                <when input="out_format" value="4 -html" format="html"/>
+                <when input="out_format" value="5" format="blastxml"/>
+            </change_format>
+        </data>
+    </outputs>
+    <tests>
+        <test>
+            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="rhodopsin_nucs.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-10" />
+            <param name="out_format" value="5" />
+            <param name="adv_opts_selector" value="advanced" />
+            <param name="filter_query" value="false" />
+            <param name="matrix" value="BLOSUM80" />
+            <param name="max_hits" value="0" />
+            <param name="word_size" value="0" />
+            <param name="parse_deflines" value="false" />
+            <output name="output1" file="tblastn_four_human_vs_rhodopsin.xml" ftype="blastxml" />
+        </test>
+        <test>
+            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="rhodopsin_nucs.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-10" />
+            <param name="out_format" value="ext" />
+            <param name="adv_opts_selector" value="advanced" />
+            <param name="filter_query" value="false" />
+            <param name="matrix" value="BLOSUM80" />
+            <param name="max_hits" value="0" />
+            <param name="word_size" value="0" />
+            <param name="parse_deflines" value="false" />
+            <output name="output1" file="tblastn_four_human_vs_rhodopsin_ext.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="rhodopsin_nucs.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-10" />
+            <param name="out_format" value="6" />
+            <param name="adv_opts_selector" value="advanced" />
+            <param name="filter_query" value="false" />
+            <param name="matrix" value="BLOSUM80" />
+            <param name="max_hits" value="0" />
+            <param name="word_size" value="0" />
+            <param name="parse_deflines" value="false" />
+            <output name="output1" file="tblastn_four_human_vs_rhodopsin.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <!-- Same as above, but parsing deflines; with BLAST 2.2.25+ to 2.2.27+ this makes no difference -->
+            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="rhodopsin_nucs.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-10" />
+            <param name="out_format" value="6" />
+            <param name="adv_opts_selector" value="advanced" />
+            <param name="filter_query" value="false" />
+            <param name="matrix" value="BLOSUM80" />
+            <param name="max_hits" value="0" />
+            <param name="word_size" value="0" />
+            <param name="parse_deflines" value="true" />
+            <output name="output1" file="tblastn_four_human_vs_rhodopsin.tabular" ftype="tabular" />
+        </test>
+        <test>
+            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="rhodopsin_nucs.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-10" />
+            <param name="out_format" value="0 -html" />
+            <param name="adv_opts_selector" value="advanced" />
+            <param name="filter_query" value="false" />
+            <param name="matrix" value="BLOSUM80" />
+            <param name="max_hits" value="0" />
+            <param name="word_size" value="0" />
+            <param name="parse_deflines" value="false" />
+            <output name="output1" file="tblastn_four_human_vs_rhodopsin.html" ftype="html" />
+        </test>
+    </tests>
+    <help>
+
+.. class:: warningmark
+
+**Note**. Database searches may take a substantial amount of time.
+For large input datasets it is advisable to allow overnight processing.
+
+-----
+
+**What it does**
+
+Search a *translated nucleotide database* using a *protein query*,
+using the NCBI BLAST+ tblastn command line tool.
+
+.. class:: warningmark
+
+You can also search against a FASTA file of subject nucleotide
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly significant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
+-----
+
+**Output format**
+
+Because Galaxy focuses on processing tabular data, the default output of this
+tool is tabular. The standard BLAST+ tabular output contains 12 columns:
+
+====== ========= ============================================
+Column NCBI name Description
+------ --------- --------------------------------------------
+     1 qseqid    Query Seq-id (ID of your sequence)
+     2 sseqid    Subject Seq-id (ID of the database hit)
+     3 pident    Percentage of identical matches
+     4 length    Alignment length
+     5 mismatch  Number of mismatches
+     6 gapopen   Number of gap openings
+     7 qstart    Start of alignment in query
+     8 qend      End of alignment in query
+     9 sstart    Start of alignment in subject (database hit)
+    10 send      End of alignment in subject (database hit)
+    11 evalue    Expectation value (E-value)
+    12 bitscore  Bit score
+====== ========= ============================================
+
+The BLAST+ tools can optionally output additional columns of information,
+but this takes longer to calculate. Most (but not all) of these columns are
+included by selecting the extended tabular output. The extra columns are
+included *after* the standard 12 columns. This is so that you can write
+workflow filtering steps that accept either the 12 or 24 column tabular
+BLAST output. Galaxy now uses this extended 24 column output by default.
+
+====== ============= ===========================================
+Column NCBI name     Description
+------ ------------- -------------------------------------------
+    13 sallseqid     All subject Seq-id(s), separated by a ';'
+    14 score         Raw score
+    15 nident        Number of identical matches
+    16 positive      Number of positive-scoring matches
+    17 gaps          Total number of gaps
+    18 ppos          Percentage of positive-scoring matches
+    19 qframe        Query frame
+    20 sframe        Subject frame
+    21 qseq          Aligned part of query sequence
+    22 sseq          Aligned part of subject sequence
+    23 qlen          Query sequence length
+    24 slen          Subject sequence length
+====== ============= ===========================================
+
+The third option is BLAST XML output, which is designed to be parsed by
+another program, and is understood by some Galaxy tools.
+
+You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
+The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
+The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
+The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
+and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
+
+-------
+
+**References**
+
+Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+    </help>
+</tool>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/ncbi_tblastx_wrapper.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,294 @@
+<tool id="ncbi_tblastx_wrapper" name="NCBI BLAST+ tblastx" version="0.0.20">
+    <description>Search translated nucleotide database with translated nucleotide query sequence(s)</description>
+    <!-- If job splitting is enabled, break up the query file into parts -->
+    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
+    <requirements>
+        <requirement type="binary">tblastx</requirement>
+        <requirement type="package" version="2.2.26+">blast+</requirement>
+    </requirements>
+    <version_command>tblastx -version</version_command>
+    <command>
+## The command is a Cheetah template, which allows some Python-based syntax.
+## Lines starting with hash hash are comments. Galaxy will turn newlines into spaces.
+tblastx
+-query "$query"
+#if $db_opts.db_opts_selector == "db":
+  -db "${db_opts.database.fields.path}"
+#elif $db_opts.db_opts_selector == "histdb":
+  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
+#else:
+  -subject "$db_opts.subject"
+#end if
+-query_gencode $query_gencode
+-evalue $evalue_cutoff
+-out "$output1"
+##Set the extended list here so if/when we add things, saved workflows are not affected
+#if str($out_format)=="ext":
+    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
+#else:
+    -outfmt $out_format
+#end if
+-num_threads 8
+#if $adv_opts.adv_opts_selector=="advanced":
+-db_gencode $adv_opts.db_gencode
+$adv_opts.filter_query
+$adv_opts.strand
+-matrix $adv_opts.matrix
+## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
+## Note -max_target_seqs overrides -num_descriptions and -num_alignments
+#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
+-max_target_seqs $adv_opts.max_hits
+#end if
+#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
+-word_size $adv_opts.word_size
+#end if
+$adv_opts.parse_deflines
+## End of advanced options:
+#end if
+    </command>
+    <stdio>
+        <!-- Anything other than zero is an error -->
+        <exit_code range="1:" />
+        <exit_code range=":-1" />
+        <!-- In case the return code has not been set properly, check stderr too -->
+        <regex match="Error:" />
+        <regex match="Exception:" />
+    </stdio>
+    <inputs>
+        <param name="query" type="data" format="fasta" label="Nucleotide query sequence(s)"/>
+        <conditional name="db_opts">
+            <param name="db_opts_selector" type="select" label="Subject database/sequences">
+              <option value="db" selected="True">Locally installed BLAST database</option>
+              <option value="histdb">BLAST database from your history</option>
+              <option value="file">FASTA file from your history (see warning note below)</option>
+            </param>
+            <when value="db">
+                <param name="database" type="select" label="Nucleotide BLAST database">
+                    <options from_file="blastdb.loc">
+                      <column name="value" index="0"/>
+                      <column name="name" index="1"/>
+                      <column name="path" index="2"/>
+                    </options>
+                </param>
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <when value="histdb">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="data" format="blastdbn" label="Nucleotide BLAST database" />
+                <param name="subject" type="hidden" value="" />
+            </when>
+            <when value="file">
+                <param name="database" type="hidden" value="" />
+                <param name="histdb" type="hidden" value="" />
+                <param name="subject" type="data" format="fasta" label="Nucleotide FASTA file to use as database"/>
+            </when>
+        </conditional>
+        <param name="query_gencode" type="select" label="Query genetic code">
+            <!-- See http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for details -->
+            <option value="1" selected="True">1. Standard</option>
+            <option value="2">2. Vertebrate Mitochondrial</option>
+            <option value="3">3. Yeast Mitochondrial</option>
+            <option value="4">4. Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code</option>
+            <option value="5">5. Invertebrate Mitochondrial</option>
+            <option value="6">6. Ciliate, Dasycladacean and Hexamita Nuclear Code</option>
+            <option value="9">9. Echinoderm Mitochondrial</option>
+            <option value="10">10. Euplotid Nuclear</option>
+            <option value="11">11. Bacteria and Archaea</option>
+            <option value="12">12. Alternative Yeast Nuclear</option>
+            <option value="13">13. Ascidian Mitochondrial</option>
+            <option value="14">14. Flatworm Mitochondrial</option>
+            <option value="15">15. Blepharisma Macronuclear</option>
+            <option value="16">16. Chlorophycean Mitochondrial Code</option>
+            <option value="21">21. Trematode Mitochondrial Code</option>
+            <option value="22">22. Scenedesmus obliquus mitochondrial Code</option>
+            <option value="23">23. Thraustochytrium Mitochondrial Code</option>
+            <option value="24">24. Pterobranchia mitochondrial code</option>
+        </param>
+        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
+        <param name="out_format" type="select" label="Output format">
+            <option value="6">Tabular (standard 12 columns)</option>
+            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
+            <option value="5">BLAST XML</option>
+            <option value="0">Pairwise text</option>
+            <option value="0 -html">Pairwise HTML</option>
+            <option value="2">Query-anchored text</option>
+            <option value="2 -html">Query-anchored HTML</option>
+            <option value="4">Flat query-anchored text</option>
+            <option value="4 -html">Flat query-anchored HTML</option>
+            <!--
+            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
+            -->
+        </param>
+        <conditional name="adv_opts">
+            <param name="adv_opts_selector" type="select" label="Advanced Options">
+              <option value="basic" selected="True">Hide Advanced Options</option>
+              <option value="advanced">Show Advanced Options</option>
+            </param>
+            <when value="basic" />
+            <when value="advanced">
+                <param name="db_gencode" type="select" label="Database/subject genetic code">
+                    <!-- See http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for details -->
+                    <option value="1" selected="True">1. Standard</option>
+                    <option value="2">2. Vertebrate Mitochondrial</option>
+                    <option value="3">3. Yeast Mitochondrial</option>
+                    <option value="4">4. Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code</option>
+                    <option value="5">5. Invertebrate Mitochondrial</option>
+                    <option value="6">6. Ciliate, Dasycladacean and Hexamita Nuclear Code</option>
+                    <option value="9">9. Echinoderm Mitochondrial</option>
+                    <option value="10">10. Euplotid Nuclear</option>
+                    <option value="11">11. Bacteria and Archaea</option>
+                    <option value="12">12. Alternative Yeast Nuclear</option>
+                    <option value="13">13. Ascidian Mitochondrial</option>
+                    <option value="14">14. Flatworm Mitochondrial</option>
+                    <option value="15">15. Blepharisma Macronuclear</option>
+                    <option value="16">16. Chlorophycean Mitochondrial Code</option>
+                    <option value="21">21. Trematode Mitochondrial Code</option>
+                    <option value="22">22. Scenedesmus obliquus mitochondrial Code</option>
+                    <option value="23">23. Thraustochytrium Mitochondrial Code</option>
+                    <option value="24">24. Pterobranchia mitochondrial code</option>
+                </param>
+                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
+                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="true" />
+                <param name="strand" type="select" label="Query strand(s) to search against database/subject">
+                    <option value="-strand both">Both</option>
+                    <option value="-strand plus">Plus (forward)</option>
+                    <option value="-strand minus">Minus (reverse complement)</option>
+                </param>
+                <param name="matrix" type="select" label="Scoring matrix">
+                    <option value="BLOSUM90">BLOSUM90</option>
+                    <option value="BLOSUM80">BLOSUM80</option>
+                    <option value="BLOSUM62" selected="true">BLOSUM62 (default)</option>
+                    <option value="BLOSUM50">BLOSUM50</option>
+                    <option value="BLOSUM45">BLOSUM45</option>
+                    <option value="PAM250">PAM250</option>
+                    <option value="PAM70">PAM70</option>
+                    <option value="PAM30">PAM30</option>
+                </param>
+                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
+                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
+                    <validator type="in_range" min="0" />
+                </param>
+                <!-- I'd like word_size to be optional, with minimum 2 for tblastx -->
+                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
+                    <validator type="in_range" min="0" />
+                </param>
+                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
+            </when>
+        </conditional>
+    </inputs>
+    <outputs>
+        <data name="output1" format="tabular" label="tblastx on ${on_string}">
+            <change_format>
+                <when input="out_format" value="0" format="txt"/>
+                <when input="out_format" value="0 -html" format="html"/>
+                <when input="out_format" value="2" format="txt"/>
+                <when input="out_format" value="2 -html" format="html"/>
+                <when input="out_format" value="4" format="txt"/>
+                <when input="out_format" value="4 -html" format="html"/>
+                <when input="out_format" value="5" format="blastxml"/>
+            </change_format>
+        </data>
+    </outputs>
+    <tests>
+        <test>
+            <param name="query" value="rhodopsin_nucs.fasta" ftype="fasta" />
+            <param name="db_opts_selector" value="file" />
+            <param name="subject" value="three_human_mRNA.fasta" ftype="fasta" />
+            <param name="database" value="" />
+            <param name="evalue_cutoff" value="1e-40" />
+            <param name="out_format" value="6" />
+            <param name="adv_opts_selector" value="basic" />
+            <output name="output1" file="tblastx_rhodopsin_vs_three_human.tabular" ftype="tabular" />
+        </test>
+    </tests>
+    <help>
+
+.. class:: warningmark
+
+**Note**. Database searches may take a substantial amount of time.
+For large input datasets it is advisable to allow overnight processing.
+
+-----
+
+**What it does**
+
+Search a *translated nucleotide database* using a *translated nucleotide query*,
+using the NCBI BLAST+ tblastx command line tool.
+
+.. class:: warningmark
+
+You can also search against a FASTA file of subject nucleotide
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly significant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
+-----
+
+**Output format**
+
+Because Galaxy focuses on processing tabular data, the default output of this
+tool is tabular. The standard BLAST+ tabular output contains 12 columns:
+
+====== ========= ============================================
+Column NCBI name Description
+------ --------- --------------------------------------------
+     1 qseqid    Query Seq-id (ID of your sequence)
+     2 sseqid    Subject Seq-id (ID of the database hit)
+     3 pident    Percentage of identical matches
+     4 length    Alignment length
+     5 mismatch  Number of mismatches
+     6 gapopen   Number of gap openings
+     7 qstart    Start of alignment in query
+     8 qend      End of alignment in query
+     9 sstart    Start of alignment in subject (database hit)
+    10 send      End of alignment in subject (database hit)
+    11 evalue    Expectation value (E-value)
+    12 bitscore  Bit score
+====== ========= ============================================
+
+The BLAST+ tools can optionally output additional columns of information,
+but this takes longer to calculate. Most (but not all) of these columns are
+included by selecting the extended tabular output. The extra columns are
+included *after* the standard 12 columns. This is so that you can write
+workflow filtering steps that accept either the 12 or 24 column tabular
+BLAST output. Galaxy now uses this extended 24 column output by default.
+
+====== ============= ===========================================
+Column NCBI name     Description
+------ ------------- -------------------------------------------
+    13 sallseqid     All subject Seq-id(s), separated by a ';'
+    14 score         Raw score
+    15 nident        Number of identical matches
+    16 positive      Number of positive-scoring matches
+    17 gaps          Total number of gaps
+    18 ppos          Percentage of positive-scoring matches
+    19 qframe        Query frame
+    20 sframe        Subject frame
+    21 qseq          Aligned part of query sequence
+    22 sseq          Aligned part of subject sequence
+    23 qlen          Query sequence length
+    24 slen          Subject sequence length
+====== ============= ===========================================
+
+The third option is BLAST XML output, which is designed to be parsed by
+another program, and is understood by some Galaxy tools.
+
+You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
+The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
+The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
+The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
+and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
+
+-------
+
+**References**
+
+Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
+
+This wrapper is available to install into other Galaxy Instances via the Galaxy
+Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+    </help>
+</tool>
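The help text above notes that the extra columns follow the standard 12 so that a filtering step can accept either tabular layout. A minimal Python sketch of such a filter is given here, assuming tab-separated input with no header line; the function name `filter_hits` and the threshold defaults are hypothetical and are not part of the wrapper.

```python
import io

# Column names as listed in the help text: the 12 standard columns first,
# then the 12 extra columns of the extended output.
STD_COLUMNS = ("qseqid sseqid pident length mismatch gapopen "
               "qstart qend sstart send evalue bitscore").split()
EXT_COLUMNS = ("sallseqid score nident positive gaps ppos "
               "qframe sframe qseq sseq qlen slen").split()


def filter_hits(handle, max_evalue=1e-10, min_pident=0.0):
    """Yield each passing hit as a dict keyed by NCBI column name.

    Hypothetical helper for illustration; the thresholds are arbitrary.
    """
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) not in (12, 24):
            raise ValueError("Expected 12 or 24 columns, got %i" % len(fields))
        # zip() stops at the shorter input, so a 12 column row only
        # receives the standard column names.
        hit = dict(zip(STD_COLUMNS + EXT_COLUMNS, fields))
        if float(hit["evalue"]) <= max_evalue and float(hit["pident"]) >= min_pident:
            yield hit


# Made-up rows in the standard 12 column layout.
example = (
    "q1\ts1\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-50\t250\n"
    "q1\ts2\t40.00\t50\t30\t2\t1\t50\t9\t58\t0.5\t20.1\n"
)
for hit in filter_hits(io.StringIO(example)):
    print(hit["qseqid"], hit["sseqid"], hit["evalue"])
```

Because only the first 12 fields are used for filtering, the same function works unchanged on the extended output; the extra fields are simply available under their own names.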
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/repository_dependencies.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,5 @@
+<?xml version="1.0"?>
+<repositories description="This requires the BLAST datatype definitions (e.g. the BLAST XML format).">
+<!-- Revision 4:f9a7783ed7b6 on the main (and test) tool shed is v0.0.14 which added BLAST databases -->
+<repository changeset_revision="f9a7783ed7b6" name="blast_datatypes" owner="devteam" toolshed="http://testtoolshed.g2.bx.psu.edu" />
+</repositories>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ncbi_blast_plus/tool_dependencies.xml	Tue Jul 30 07:33:46 2013 -0400
@@ -0,0 +1,20 @@
+<?xml version="1.0"?>
+<tool_dependency>
+    <package name="blast+" version="2.2.26+">
+        <install version="1.0">
+            <actions>
+                <action type="download_by_url">ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.26/ncbi-blast-2.2.26+-src.tar.gz</action>
+                <action type="shell_command">cd c++ &amp;&amp; ./configure --prefix=$INSTALL_DIR &amp;&amp; make &amp;&amp; make install</action>
+                <action type="set_environment">
+                    <environment_variable name="PATH" action="prepend_to">$INSTALL_DIR/bin</environment_variable>
+                </action>
+            </actions>
+        </install>
+        <readme>
+Downloads and compiles BLAST+ from the NCBI, which assumes you have
+all the required build dependencies installed. See:
+http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&amp;PAGE_TYPE=BlastDocs&amp;DOC_TYPE=Download
+        </readme>
+    </package>
+</tool_dependency>
+
--- a/tools/ncbi_blast_plus/blastxml_to_tabular.py	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,261 +0,0 @@
-#!/usr/bin/env python
-"""Convert a BLAST XML file to tabular output.
-
-Takes three command line options, input BLAST XML filename, output tabular
-BLAST filename, output format (std for standard 12 columns, or ext for the
-extended 24 columns offered in the BLAST+ wrappers).
-
-The 12 columns output are 'qseqid sseqid pident length mismatch gapopen qstart
-qend sstart send evalue bitscore' or 'std' at the BLAST+ command line, which
-mean:
-
-====== ========= ============================================
-Column NCBI name Description
------- --------- --------------------------------------------
-     1 qseqid    Query Seq-id (ID of your sequence)
-     2 sseqid    Subject Seq-id (ID of the database hit)
-     3 pident    Percentage of identical matches
-     4 length    Alignment length
-     5 mismatch  Number of mismatches
-     6 gapopen   Number of gap openings
-     7 qstart    Start of alignment in query
-     8 qend      End of alignment in query
-     9 sstart    Start of alignment in subject (database hit)
-    10 send      End of alignment in subject (database hit)
-    11 evalue    Expectation value (E-value)
-    12 bitscore  Bit score
-====== ========= ============================================
-
-The additional columns offered in the Galaxy BLAST+ wrappers are:
-
-====== ============= ===========================================
-Column NCBI name     Description
------- ------------- -------------------------------------------
-    13 sallseqid     All subject Seq-id(s), separated by a ';'
-    14 score         Raw score
-    15 nident        Number of identical matches
-    16 positive      Number of positive-scoring matches
-    17 gaps          Total number of gaps
-    18 ppos          Percentage of positive-scoring matches
-    19 qframe        Query frame
-    20 sframe        Subject frame
-    21 qseq          Aligned part of query sequence
-    22 sseq          Aligned part of subject sequence
-    23 qlen          Query sequence length
-    24 slen          Subject sequence length
-====== ============= ===========================================
-
-Most of these fields are given explicitly in the XML file, but some, like
-the percentage identity and the number of gap openings, must be calculated.
-
-Be aware that the sequence in the extended tabular output or XML direct from
-BLAST+ may or may not use XXXX masking on regions of low complexity. This
-can throw off the calculation of percentage identity and gap openings.
-[In fact, both BLAST 2.2.24+ and 2.2.25+ have a subtle bug in this regard,
-with these numbers changing depending on whether or not the low complexity
-filter is used.]
-
-This script attempts to produce identical output to what BLAST+ would have done.
-However, check this with "diff -b ..." since BLAST+ sometimes includes an extra
-space character (probably a bug).
-"""
-import sys
-import re
-
-if "-v" in sys.argv or "--version" in sys.argv:
-    print "v0.0.12"
-    sys.exit(0)
-
-if sys.version_info[:2] >= ( 2, 5 ):
-    try:
-        from xml.etree import cElementTree as ElementTree
-    except ImportError:
-        from xml.etree import ElementTree as ElementTree
-else:
-    from galaxy import eggs
-    import pkg_resources; pkg_resources.require( "elementtree" )
-    from elementtree import ElementTree
-
-def stop_err( msg ):
-    sys.stderr.write("%s\n" % msg)
-    sys.exit(1)
-
-#Parse Command Line
-try:
-    in_file, out_file, out_fmt = sys.argv[1:]
-except:
-    stop_err("Expect 3 arguments: input BLAST XML file, output tabular file, out format (std or ext)")
-
-if out_fmt == "std":
-    extended = False
-elif out_fmt == "x22":
-    stop_err("Format argument x22 has been replaced with ext (extended 24 columns)")
-elif out_fmt == "ext":
-    extended = True
-else:
-    stop_err("Format argument should be std (12 column) or ext (extended 24 columns)")
-
-
-# get an iterable
-try:
-    context = ElementTree.iterparse(in_file, events=("start", "end"))
-except:
-    stop_err("Invalid data format.")
-# turn it into an iterator
-context = iter(context)
-# get the root element
-try:
-    event, root = context.next()
-except:
-    stop_err( "Invalid data format." )
-
-
-re_default_query_id = re.compile("^Query_\d+$")
-assert re_default_query_id.match("Query_101")
-assert not re_default_query_id.match("Query_101a")
-assert not re_default_query_id.match("MyQuery_101")
-re_default_subject_id = re.compile("^Subject_\d+$")
-assert re_default_subject_id.match("Subject_1")
-assert not re_default_subject_id.match("Subject_")
-assert not re_default_subject_id.match("Subject_12a")
-assert not re_default_subject_id.match("TheSubject_1")
-
-
-outfile = open(out_file, 'w')
-blast_program = None
-for event, elem in context:
-    if event == "end" and elem.tag == "BlastOutput_program":
-        blast_program = elem.text
-    # for every <Iteration> tag
-    if event == "end" and elem.tag == "Iteration":
-        #Expecting either this, from BLAST 2.2.25+ using FASTA vs FASTA
-        # <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
-        # <Iteration_query-def>Endoplasmic reticulum resident protein 44 OS=Homo sapiens GN=ERP44 PE=1 SV=1</Iteration_query-def>
-        # <Iteration_query-len>406</Iteration_query-len>
-        # <Iteration_hits></Iteration_hits>
-        #
-        #Or, from BLAST 2.2.24+ run online
-        # <Iteration_query-ID>Query_1</Iteration_query-ID>
-        # <Iteration_query-def>Sample</Iteration_query-def>
-        # <Iteration_query-len>516</Iteration_query-len>
-        # <Iteration_hits>...
-        qseqid = elem.findtext("Iteration_query-ID")
-        if re_default_query_id.match(qseqid):
-            #Place holder ID, take the first word of the query definition
-            qseqid = elem.findtext("Iteration_query-def").split(None,1)[0]
-        qlen = int(elem.findtext("Iteration_query-len"))
-
-        # for every <Hit> within <Iteration>
-        for hit in elem.findall("Iteration_hits/Hit"):
-            #Expecting either this,
-            # <Hit_id>gi|3024260|sp|P56514.1|OPSD_BUFBU</Hit_id>
-            # <Hit_def>RecName: Full=Rhodopsin</Hit_def>
-            # <Hit_accession>P56514</Hit_accession>
-            #or,
-            # <Hit_id>Subject_1</Hit_id>
-            # <Hit_def>gi|57163783|ref|NP_001009242.1| rhodopsin [Felis catus]</Hit_def>
-            # <Hit_accession>Subject_1</Hit_accession>
-            #
-            #apparently depending on the parse_deflines switch
-            sseqid = hit.findtext("Hit_id").split(None,1)[0]
-            hit_def = sseqid + " " + hit.findtext("Hit_def")
-            if re_default_subject_id.match(sseqid) \
-            and sseqid == hit.findtext("Hit_accession"):
-                #Place holder ID, take the first word of the subject definition
-                hit_def = hit.findtext("Hit_def")
-                sseqid = hit_def.split(None,1)[0]
-            # for every <Hsp> within <Hit>
-            for hsp in hit.findall("Hit_hsps/Hsp"):
-                nident = hsp.findtext("Hsp_identity")
-                length = hsp.findtext("Hsp_align-len")
-                pident = "%0.2f" % (100*float(nident)/float(length))
-
-                q_seq = hsp.findtext("Hsp_qseq")
-                h_seq = hsp.findtext("Hsp_hseq")
-                m_seq = hsp.findtext("Hsp_midline")
-                assert len(q_seq) == len(h_seq) == len(m_seq) == int(length)
-                gapopen = str(len(q_seq.replace('-', ' ').split())-1  + \
-                              len(h_seq.replace('-', ' ').split())-1)
-
-                mismatch = m_seq.count(' ') + m_seq.count('+') \
-                         - q_seq.count('-') - h_seq.count('-')
-                #TODO - Remove this alternative mismatch calculation and test
-                #once satisfied there are no problems
-                expected_mismatch = len(q_seq) \
-                                  - sum(1 for q,h in zip(q_seq, h_seq) \
-                                        if q == h or q == "-" or h == "-")
-                xx = sum(1 for q,h in zip(q_seq, h_seq) if q=="X" and h=="X")
-                if not (expected_mismatch - q_seq.count("X") <= int(mismatch) <= expected_mismatch + xx):
-                    stop_err("%s vs %s mismatches, expected %i <= %i <= %i" \
-                             % (qseqid, sseqid, expected_mismatch - q_seq.count("X"),
-                                int(mismatch), expected_mismatch))
-
-                #TODO - Remove this alternative identity calculation and test
-                #once satisfied there are no problems
-                expected_identity = sum(1 for q,h in zip(q_seq, h_seq) if q == h)
-                if not (expected_identity - xx <= int(nident) <= expected_identity + q_seq.count("X")):
-                    stop_err("%s vs %s identities, expected %i <= %i <= %i" \
-                             % (qseqid, sseqid, expected_identity, int(nident),
-                                expected_identity + q_seq.count("X")))
-
-
-                evalue = hsp.findtext("Hsp_evalue")
-                if evalue == "0":
-                    evalue = "0.0"
-                else:
-                    evalue = "%0.0e" % float(evalue)
-
-                bitscore = float(hsp.findtext("Hsp_bit-score"))
-                if bitscore < 100:
-                    #Seems to show one decimal place for lower scores
-                    bitscore = "%0.1f" % bitscore
-                else:
-                    #Note BLAST does not round to nearest int, it truncates
-                    bitscore = "%i" % bitscore
-
-                values = [qseqid,
-                          sseqid,
-                          pident,
-                          length, #hsp.findtext("Hsp_align-len")
-                          str(mismatch),
-                          gapopen,
-                          hsp.findtext("Hsp_query-from"), #qstart,
-                          hsp.findtext("Hsp_query-to"), #qend,
-                          hsp.findtext("Hsp_hit-from"), #sstart,
-                          hsp.findtext("Hsp_hit-to"), #send,
-                          evalue, #hsp.findtext("Hsp_evalue") in scientific notation
-                          bitscore, #hsp.findtext("Hsp_bit-score") rounded
-                          ]
-
-                if extended:
-                    sallseqid = ";".join(name.split(None,1)[0] for name in hit_def.split(">"))
-                    #print hit_def, "-->", sallseqid
-                    positive = hsp.findtext("Hsp_positive")
-                    ppos = "%0.2f" % (100*float(positive)/float(length))
-                    qframe = hsp.findtext("Hsp_query-frame")
-                    sframe = hsp.findtext("Hsp_hit-frame")
-                    if blast_program == "blastp":
-                        #Probably a bug in BLASTP that they use 0 or 1 depending on format
-                        if qframe == "0": qframe = "1"
-                        if sframe == "0": sframe = "1"
-                    slen = int(hit.findtext("Hit_len"))
-                    values.extend([sallseqid,
-                                   hsp.findtext("Hsp_score"), #score,
-                                   nident,
-                                   positive,
-                                   hsp.findtext("Hsp_gaps"), #gaps,
-                                   ppos,
-                                   qframe,
-                                   sframe,
-                                   #NOTE - for blastp, XML shows original seq, tabular uses XXX masking
-                                   q_seq,
-                                   h_seq,
-                                   str(qlen),
-                                   str(slen),
-                                   ])
-                #print "\t".join(values)
-                outfile.write("\t".join(values) + "\n")
-        # prevents ElementTree from growing large datastructure
-        root.clear()
-        elem.clear()
-outfile.close()
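
The E-value and bit score formatting in the removed script above can be tried in isolation. The following is a minimal sketch under stated assumptions: the helper names `format_evalue` and `format_bitscore` are hypothetical (the original script did this inline), and the input values are made up for illustration.

```python
def format_evalue(evalue):
    """Format an E-value string following the removed script's rules.

    A literal "0" becomes "0.0"; anything else is shown in scientific
    notation with no decimal places.
    """
    if evalue == "0":
        return "0.0"
    return "%0.0e" % float(evalue)


def format_bitscore(bitscore):
    """Format a bit score following the removed script's rules.

    Scores below 100 keep one decimal place; larger scores are
    truncated (not rounded) to an integer.
    """
    bitscore = float(bitscore)
    if bitscore < 100:
        return "%0.1f" % bitscore
    return "%i" % bitscore


if __name__ == "__main__":
    # Illustrative values only, not taken from any real BLAST run.
    print(format_evalue("3.2e-30"))
    print(format_bitscore("99.44"))
    print(format_bitscore("256.9"))
```

Truncating with `%i` rather than rounding is what the script's own comment describes as matching BLAST's behaviour for larger scores.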
--- a/tools/ncbi_blast_plus/blastxml_to_tabular.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,137 +0,0 @@
-<tool id="blastxml_to_tabular" name="BLAST XML to tabular" version="0.0.11">
-    <description>Convert BLAST XML output to tabular</description>
-    <version_command interpreter="python">blastxml_to_tabular.py --version</version_command>
-    <command interpreter="python">
-      blastxml_to_tabular.py $blastxml_file $tabular_file $out_format
-    </command>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-    </stdio>
-    <inputs>
-        <param name="blastxml_file" type="data" format="blastxml" label="BLAST results as XML"/>
-        <param name="out_format" type="select" label="Output format">
-            <option value="std">Tabular (standard 12 columns)</option>
-            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
-        </param>
-    </inputs>
-    <outputs>
-        <data name="tabular_file" format="tabular" label="BLAST results as tabular" />
-    </outputs>
-    <requirements>
-    </requirements>
-    <tests>
-        <test>
-            <param name="blastxml_file" value="blastp_four_human_vs_rhodopsin.xml" ftype="blastxml" />
-            <param name="out_format" value="std" />
-            <!-- Note this has some white space differences from the actual blastp output blast_four_human_vs_rhodopsin.tabular -->
-            <output name="tabular_file" file="blastp_four_human_vs_rhodopsin_converted.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="blastxml_file" value="blastp_four_human_vs_rhodopsin.xml" ftype="blastxml" />
-            <param name="out_format" value="ext" />
-            <!-- Note this has some white space differences from the actual blastp output blast_four_human_vs_rhodopsin_22c.tabular -->
-            <output name="tabular_file" file="blastp_four_human_vs_rhodopsin_converted_ext.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="blastxml_file" value="blastp_sample.xml" ftype="blastxml" />
-            <param name="out_format" value="std" />
-            <!-- Note this has some white space differences from the actual blastp output -->
-            <output name="tabular_file" file="blastp_sample_converted.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="blastxml_file" value="blastx_rhodopsin_vs_four_human.xml" ftype="blastxml" />
-            <param name="out_format" value="std" />
-            <!-- Note this has some white space differences from the actual blastx output -->
-            <output name="tabular_file" file="blastx_rhodopsin_vs_four_human_converted.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="blastxml_file" value="blastx_rhodopsin_vs_four_human.xml" ftype="blastxml" />
-            <param name="out_format" value="ext" />
-            <!-- Note this has some white space and XXXX masking differences from the actual blastx output -->
-            <output name="tabular_file" file="blastx_rhodopsin_vs_four_human_converted_ext.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="blastxml_file" value="blastx_sample.xml" ftype="blastxml" />
-            <param name="out_format" value="std" />
-            <!-- Note this has some white space differences from the actual blastx output -->
-            <output name="tabular_file" file="blastx_sample_converted.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="blastxml_file" value="blastp_human_vs_pdb_seg_no.xml" ftype="blastxml" />
-            <param name="out_format" value="std" />
-            <!-- Note this has some white space differences from the actual blastp output -->
-            <output name="tabular_file" file="blastp_human_vs_pdb_seg_no_converted_std.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="blastxml_file" value="blastp_human_vs_pdb_seg_no.xml" ftype="blastxml" />
-            <param name="out_format" value="ext" />
-            <!-- Note this has some white space differences from the actual blastp output -->
-            <output name="tabular_file" file="blastp_human_vs_pdb_seg_no_converted_ext.tabular" ftype="tabular" />
-        </test>
-    </tests>
-    <help>
-
-**What it does**
-
-NCBI BLAST+ (and the older NCBI 'legacy' BLAST) can output in a range of
-formats including tabular and a more detailed XML format. A complex workflow
-may need both the XML and the tabular output - but running BLAST twice is
-slow and wasteful.
-
-This tool takes the BLAST XML output and can convert it into the
-standard 12 column tabular equivalent:
-
-====== ========= ============================================
-Column NCBI name Description
------- --------- --------------------------------------------
-     1 qseqid    Query Seq-id (ID of your sequence)
-     2 sseqid    Subject Seq-id (ID of the database hit)
-     3 pident    Percentage of identical matches
-     4 length    Alignment length
-     5 mismatch  Number of mismatches
-     6 gapopen   Number of gap openings
-     7 qstart    Start of alignment in query
-     8 qend      End of alignment in query
-     9 sstart    Start of alignment in subject (database hit)
-    10 send      End of alignment in subject (database hit)
-    11 evalue    Expectation value (E-value)
-    12 bitscore  Bit score
-====== ========= ============================================
-
-The BLAST+ tools can optionally output additional columns of information,
-but this takes longer to calculate. Most (but not all) of these columns are
-included by selecting the extended tabular output. The extra columns are
-included *after* the standard 12 columns. This is so that you can write
-workflow filtering steps that accept either the 12 or 24 column tabular
-BLAST output. This tool now uses this extended 24 column output by default.
-
-====== ============= ===========================================
-Column NCBI name     Description
------- ------------- -------------------------------------------
-    13 sallseqid     All subject Seq-id(s), separated by a ';'
-    14 score         Raw score
-    15 nident        Number of identical matches
-    16 positive      Number of positive-scoring matches
-    17 gaps          Total number of gaps
-    18 ppos          Percentage of positive-scoring matches
-    19 qframe        Query frame
-    20 sframe        Subject frame
-    21 qseq          Aligned part of query sequence
-    22 sseq          Aligned part of subject sequence
-    23 qlen          Query sequence length
-    24 slen          Subject sequence length
-====== ============= ===========================================
-
-Beware that the XML file (and thus the conversion) and the tabular output
-direct from BLAST+ may differ in the presence of XXXX masking on regions
-of low complexity (columns 21 and 22), and thus also calculated figures like
-the percentage identity (column 3).
-
-**References**
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-    </help>
-</tool>
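
Because the extra columns follow the standard 12, a downstream filter can handle both layouts with one parser. This is a rough sketch, not part of the wrappers: the function names `parse_hit` and `passes_filter` and the default identity threshold are assumptions made for illustration, while the column names come from the help text tables above.

```python
# Column names as listed in the tool's help text.
STD_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
               "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
EXT_COLUMNS = ["sallseqid", "score", "nident", "positive", "gaps", "ppos",
               "qframe", "sframe", "qseq", "sseq", "qlen", "slen"]


def parse_hit(line):
    """Map one tab-separated line to a dict keyed by NCBI column name.

    Accepts either the 12 column or the 24 column layout; zip() stops
    at the shorter sequence, so 12 column input only gets the standard
    names.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (12, 24):
        raise ValueError("Expected 12 or 24 columns, got %i" % len(fields))
    return dict(zip(STD_COLUMNS + EXT_COLUMNS, fields))


def passes_filter(line, min_pident=90.0):
    """Hypothetical filter: keep hits at or above a percentage identity."""
    return float(parse_hit(line)["pident"]) >= min_pident


if __name__ == "__main__":
    # Made-up example row in the standard 12 column layout.
    row = "\t".join(["q1", "s1", "95.00", "100", "5", "0",
                     "1", "100", "1", "100", "1e-50", "180"])
    print(passes_filter(row))
```

The filter only touches columns 1 to 12, so it behaves the same on either layout.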
--- a/tools/ncbi_blast_plus/ncbi_blast_plus.txt	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,151 +0,0 @@
-Galaxy wrappers for NCBI BLAST+ suite
-=====================================
-
-These wrappers are copyright 2010-2013 by Peter Cock, The James Hutton Institute
-(formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
-See the licence text below.
-
-Currently tested with NCBI BLAST 2.2.26+ (i.e. version 2.2.26 of BLAST+),
-and does not work with the NCBI 'legacy' BLAST suite (e.g. blastall).
-
-Note that these wrappers (and the associated datatypes) were originally
-distributed as part of the main Galaxy repository, but as of August 2012
-moved to the Galaxy Tool Shed as 'ncbi_blast_plus' (and 'blast_datatypes').
-My thanks to Dannon Baker from the Galaxy development team for his assistance
-with this.
-
-These wrappers are available from the Galaxy Tool Shed at:
-http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-
-
-Automated Installation
-======================
-
-Galaxy should be able to automatically install the dependencies, i.e. the
-'blast_datatypes' repository which defines the BLAST XML file format
-('blastxml') and protein and nucleotide BLAST databases ('blastdbp' and
-'blastdbn').
-
-You must tell Galaxy about any system level BLAST databases using configuration
-files blastdb.loc (nucleotide databases like NT) and blastdb_p.loc (protein
-databases like NR), and blastdb_d.loc (protein domain databases like CDD or
-SMART) which are located in the tool-data/ folder. Sample files are included
-which explain the tab-based format to use.
-
-You can download the NCBI provided databases as tar-balls from here:
-ftp://ftp.ncbi.nlm.nih.gov/blast/db/ (nucleotide and protein databases like NR)
-ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/little_endian/ (domain databases like CDD)
-
-
-Manual Installation
-===================
-
-For those not using Galaxy's automated installation from the Tool Shed, put
-the XML and Python files in the tools/ncbi_blast_plus/ folder and add the XML
-files to your tool_conf.xml as normal (and do the same in tool_conf.xml.sample
-in order to run the unit tests). For example, use:
-
-  <section name="NCBI BLAST+" id="ncbi_blast_plus_tools">
-    <tool file="ncbi_blast_plus/ncbi_blastn_wrapper.xml" />
-    <tool file="ncbi_blast_plus/ncbi_blastp_wrapper.xml" />
-    <tool file="ncbi_blast_plus/ncbi_blastx_wrapper.xml" />
-    <tool file="ncbi_blast_plus/ncbi_tblastn_wrapper.xml" />
-    <tool file="ncbi_blast_plus/ncbi_tblastx_wrapper.xml" />
-    <tool file="ncbi_blast_plus/ncbi_makeblastdb.xml" />
-    <tool file="ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml" />
-    <tool file="ncbi_blast_plus/ncbi_blastdbcmd_info.xml" />
-    <tool file="ncbi_blast_plus/ncbi_rpsblast_wrapper.xml" />
-    <tool file="ncbi_blast_plus/ncbi_rpstblastn_wrapper.xml" />
-    <tool file="ncbi_blast_plus/blastxml_to_tabular.xml" />
-  </section>
-
-You will also need to install 'blast_datatypes' from the Tool Shed. This
-defines the BLAST XML file format ('blastxml') and protein and nucleotide
-BLAST database composite file formats ('blastdbp' and 'blastdbn').
-
-As described above for an automated installation, you must also tell Galaxy
-about any system level BLAST databases using the tool-data/blastdb*.loc files.
-
-You must install the NCBI BLAST+ standalone tools somewhere on the system
-path. Currently the unit tests are written using "BLAST 2.2.26+".
-
-Run the functional tests (adjusting the section identifier to match your
-tool_conf.xml.sample file):
-
-./run_functional_tests.sh -sid NCBI_BLAST+-ncbi_blast_plus_tools
-
-
-History
-=======
-
-v0.0.11 - Final revision as part of the Galaxy main repository, and the
-          first release via the Tool Shed
-v0.0.12 - Implements genetic code option for translation searches.
-        - Changes <parallelism> to 1000 sequences at a time (to cope with
-          very large sets of queries where BLAST+ can become memory hungry)
-        - Include warning that BLAST+ with subject FASTA gives pairwise
-          e-values
-v0.0.13 - Use the new error handling options in Galaxy (the previously
-          bundled hide_stderr.py script is no longer needed).
-v0.0.14 - Support for makeblastdb and blastdbinfo with local BLAST databases
-          in the history (using work from Edward Kirton), requires v0.0.14
-          of the 'blast_datatypes' repository from the Tool Shed.
-v0.0.15 - Stronger warning in help text against searching against subject
-          FASTA files (better looking e-values than you might be expecting).
-v0.0.16 - Added repository_dependencies.xml for automated installation of the
-          'blast_datatypes' repository from the Tool Shed.
-v0.0.17 - The BLAST+ search tools now default to extended tabular output
-          (all too often our users were having to re-run searches just to
-          get one of the missing columns like query or subject length)
-v0.0.18 - Defensive quoting of filenames in case of spaces (where possible,
-          BLAST+ handling of some multi-file arguments is problematic).
-v0.0.19 - Added wrappers for rpsblast and rpstblastn, and new blastdb_d.loc
-          for the domain databases they use (e.g. CDD, PFAM or SMART).
-        - Correct case of exception regular expression (for error handling
-          fall-back in case the return code is not set properly).
-        - Clearer naming of output files.
-v0.0.20 - Added unit tests for BLASTN and TBLASTX.
-        - Fallback on ElementTree if cElementTree missing in XML to tabular.
-        - Link to Tool Shed added to help text and this documentation.
-        - Tweak dependency on blast_datatypes to also work on Test Tool Shed
-
-
-Developers
-==========
-
-This script and related tools are being developed on the 'tools' branch of the
-following Mercurial repository:
-https://bitbucket.org/peterjc/galaxy-central/
-
-For making the "Galaxy Tool Shed" http://toolshed.g2.bx.psu.edu/ tarball I use
-the following command from the Galaxy root folder:
-
-$ ./tools/ncbi_blast_plus/make_ncbi_blast_plus.sh
-
-This simplifies ensuring a consistent set of files is bundled each time,
-including all the relevant test files.
-
-
-Licence (MIT/BSD style)
-=======================
-
-Permission to use, copy, modify, and distribute this software and its
-documentation with or without modifications and for any purpose and
-without fee is hereby granted, provided that any copyright notices
-appear in all copies and that both those copyright notices and this
-permission notice appear in supporting documentation, and that the
-names of the contributors or copyright holders not be used in
-advertising or publicity pertaining to distribution of the software
-without specific prior permission.
-
-THE CONTRIBUTORS AND COPYRIGHT HOLDERS OF THIS SOFTWARE DISCLAIM ALL
-WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED
-WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE
-CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY SPECIAL, INDIRECT
-OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
-OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
-OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE
-OR PERFORMANCE OF THIS SOFTWARE.
-
-NOTE: This is the licence for the Galaxy Wrapper only. NCBI BLAST+ and
-associated data files are available and licenced separately.
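
The blastdb*.loc files mentioned in the documentation above are tab-separated, and the tool XML files in this changeset read them as three columns: value (index 0), name (index 1) and path (index 2). The sketch below shows how such a file could be read. It is illustrative only: `parse_loc` is a hypothetical helper, the sample entries and paths are invented, and skipping lines that start with `#` is an assumption about the comment convention in the sample files.

```python
def parse_loc(text):
    """Parse .loc style text into a list of dicts.

    Column order (value, name, path) follows the <column> indexes used
    in the tool XML. Blank lines and lines starting with '#' are
    skipped (assumed comment convention).
    """
    entries = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            # Not enough columns to be a usable entry.
            continue
        value, name, path = fields[:3]
        entries.append({"value": value, "name": name, "path": path})
    return entries


if __name__ == "__main__":
    # Invented example content; real paths depend on the local system.
    sample = (
        "# value\tname\tpath\n"
        "nt_example\tExample NT\t/data/blastdb/nt\n"
        "\n"
        "nr_example\tExample NR\t/data/blastdb/nr\n"
    )
    for entry in parse_loc(sample):
        print(entry["value"], entry["path"])
```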
--- a/tools/ncbi_blast_plus/ncbi_blastdbcmd_info.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,67 +0,0 @@
-<tool id="ncbi_blastdbcmd_info" name="NCBI BLAST+ database info" version="0.0.6">
-    <description>Show BLAST database information from blastdbcmd</description>
-    <requirements>
-        <requirement type="binary">blastdbcmd</requirement>
-        <requirement type="package" version="2.2.26+">blast+</requirement>
-    </requirements>
-    <version_command>blastdbcmd -version</version_command>
-    <command>
-blastdbcmd -dbtype $db_opts.db_type -db "${db_opts.database.fields.path}" -info -out "$info"
-    </command>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-	<!-- Suspect blastdbcmd sometimes fails to set error level -->
-	<regex match="Error:" />
-	<regex match="Exception:" />
-    </stdio>
-    <inputs>
-        <conditional name="db_opts">
-            <param name="db_type" type="select" label="Type of BLAST database">
-              <option value="nucl" selected="True">Nucleotide</option>
-              <option value="prot">Protein</option>
-            </param>
-            <when value="nucl">
-                <param name="database" type="select" label="Nucleotide BLAST database">
-                    <options from_file="blastdb.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-            </when>
-            <when value="prot">
-                <param name="database" type="select" label="Protein BLAST database">
-                    <options from_file="blastdb_p.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-            </when>
-        </conditional>
-    </inputs>
-    <outputs>
-        <data name="info" format="txt" label="${db_opts.database.fields.name} info" />
-    </outputs>
-    <help>
-
-**What it does**
-
-Calls the NCBI BLAST+ blastdbcmd command line tool with the -info
-switch to give summary information about a BLAST database, such as
-the size (number of sequences and total length) and date.
-
--------
-
-**References**
-
-Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
-
-Schaffer et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. 2001. Nucleic Acids Res. 29:2994-3005.
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-    </help>
-</tool>
--- a/tools/ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,139 +0,0 @@
-<tool id="ncbi_blastdbcmd_wrapper" name="NCBI BLAST+ blastdbcmd entry(s)" version="0.0.6">
-    <description>Extract sequence(s) from BLAST database</description>
-    <requirements>
-        <requirement type="binary">blastdbcmd</requirement>
-        <requirement type="package" version="2.2.26+">blast+</requirement>
-    </requirements>
-    <version_command>blastdbcmd -version</version_command>
-    <command>
-## The command is a Cheetah template which allows some Python based syntax.
-## Lines starting hash hash are comments. Galaxy will turn newlines into spaces
-blastdbcmd -dbtype $db_opts.db_type -db "${db_opts.database.fields.path}"
-
-##TODO: What about -ctrl_a and -target_only as advanced options?
-
-#if $id_opts.id_type=="file":
--entry_batch "$id_opts.entries"
-#else:
-##Perform some simple search/replaces to remove whitespace
-##and make it comma separated, and escape any pipe characters
--entry "$id_opts.entries.replace('\r',',').replace('\n',',').replace(' ','').replace(',,',',').replace(',,',',').strip(',').replace('|','\|')"
-#end if
-
-##When building a BLAST database, to ensure unique IDs makeblastdb will
-##do things like turning a FASTA entry with ID of ERP44 into lcl|ERP44
-##(if using -parse_seqids) or simply assign it an ID using the record
-##number like gnl|BL_ORD_ID|123 (to cope with duplicate IDs in the FASTA
-##file). In -parse_seqids mode, a duplicate FASTA ID gives an error.
-##
-##The BLAST plain text and XML output will contain these BLAST IDs, but
-##the tabular output does not (at least, not in BLAST 2.2.25+).
-##Therefore in general, Galaxy users won't care about the (internal)
-##BLAST identifiers.
-##
-##The blastdbcmd FASTA output will also contain these IDs, but in the
-##context of the BLAST tabular output they are not helpful. Therefore
-##to recover the original ID as used in the FASTA file for makeblastdb
-##we need a little post processing.
-##
-##We remove the NCBI's lcl|... or gnl|BL_ORD_ID|123 prefixes
-##using sed, however the exact syntax differs for Mac OS X's sed
-
-#if str($outfmt)=="blastid":
--out "$seq"
-#else if sys.platform == "darwin":
-| sed -E 's/^>(lcl\||gnl\|BL_ORD_ID\|[0-9]* )/>/1' > "$seq"
-#else:
-| sed 's/>\(lcl|\|gnl|BL_ORD_ID|[0-9]* \)/>/1' > "$seq"
-#end if
-    </command>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-	<!-- Suspect blastdbcmd sometimes fails to set error level -->
-	<regex match="Error:" />
-	<regex match="Exception:" />
-    </stdio>
-    <inputs>
-        <conditional name="db_opts">
-            <param name="db_type" type="select" label="Type of BLAST database">
-              <option value="nucl" selected="True">Nucleotide</option>
-              <option value="prot">Protein</option>
-            </param>
-            <when value="nucl">
-                <param name="database" type="select" label="Nucleotide BLAST database">
-                    <options from_file="blastdb.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-            </when>
-            <when value="prot">
-                <param name="database" type="select" label="Protein BLAST database">
-                    <options from_file="blastdb_p.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-            </when>
-        </conditional>
-        <conditional name="id_opts">
-            <param name="id_type" type="select" label="Type of identifier list">
-              <option value="file">From file</option>
-              <option value="prompt">User entered</option>
-            </param>
-            <when value="file">
-                <param name="entries" type="data" format="txt,tabular" label="Sequence identifier(s)" help="Plain text file with one ID per line (i.e. single column tabular file)"/>
-            </when>
-            <when value="prompt">
-                <param name="entries" type="text" label="Sequence identifier(s)" help="Comma or new line separated list." optional="False" area="True" size="10x30"/>
-            </when>
-        </conditional>
-        <param name="outfmt" type="select" label="Output format">
-          <option value="original">FASTA with original identifiers</option>
-          <option value="blastid">FASTA with BLAST assigned identifiers</option>
-        </param>
-    </inputs>
-    <outputs>
-        <data name="seq" format="fasta" label="Sequences from ${db_opts.database.fields.name}" />
-    </outputs>
-    <help>
-
-**What it does**
-
-Extracts FASTA formatted sequences from a BLAST database
-using the NCBI BLAST+ blastdbcmd command line tool.
-
-.. class:: warningmark
-
-**BLAST assigned identifiers**
-
-When a BLAST database is constructed from a FASTA file, the
-original identifiers can be replaced with BLAST assigned
-identifiers, partly to ensure uniqueness. e.g. Sometimes
-a prefix of 'lcl|' is added (lcl is short for local),
-or an arbitrary name starting 'gnl|BL_ORD_ID|' is created.
-
-If you are using the tabular output from BLAST, it will contain
-the original identifiers - not the BLAST assigned identifiers
-suitable for use with the blastdbcmd tool.
-
-The XML and plain text output, by contrast, do contain the BLAST
-assigned identifiers. This means that getting a list of BLAST
-assigned identifiers from tabular output isn't straightforward.
-
--------
-
-**References**
-
-Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
-
-Schaffer et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. 2001. Nucleic Acids Res. 29:2994-3005.
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-    </help>
-</tool>
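
The sed post-processing in the wrapper above strips the `lcl|` prefix or a `gnl|BL_ORD_ID|123 ` style prefix from FASTA header lines. For readers more comfortable with Python, the same substitution might look like the sketch below. This is not what the wrapper runs (it calls sed); the function name `strip_blast_prefix` is hypothetical and the example headers are made up.

```python
import re

# Mirrors the pattern in the Mac OS X branch of the wrapper's sed
# command: an optional 'lcl|' prefix, or 'gnl|BL_ORD_ID|<number> ',
# anchored at the start of a FASTA header line.
_PREFIX = re.compile(r"^>(lcl\||gnl\|BL_ORD_ID\|[0-9]* )")


def strip_blast_prefix(line):
    """Remove a BLAST assigned ID prefix from a FASTA header line.

    Lines without a matching prefix (including sequence lines) are
    returned unchanged.
    """
    return _PREFIX.sub(">", line, count=1)


if __name__ == "__main__":
    # Made-up headers for illustration.
    print(strip_blast_prefix(">lcl|ERP44 example protein"))
    print(strip_blast_prefix(">gnl|BL_ORD_ID|123 ERP44 example protein"))
    print(strip_blast_prefix("MKTAYIAKQR"))
```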
--- a/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,253 +0,0 @@
-<tool id="ncbi_blastn_wrapper" name="NCBI BLAST+ blastn" version="0.0.20">
-    <description>Search nucleotide database with nucleotide query sequence(s)</description>
-    <!-- If job splitting is enabled, break up the query file into parts -->
-    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
-    <requirements>
-        <requirement type="binary">blastn</requirement>
-        <requirement type="package" version="2.2.26+">blast+</requirement>
-    </requirements>
-    <version_command>blastn -version</version_command>
-    <command>
-## The command is a Cheetah template which allows some Python based syntax.
-## Lines starting hash hash are comments. Galaxy will turn newlines into spaces
-blastn
--query "$query"
-#if $db_opts.db_opts_selector == "db":
-  -db "${db_opts.database.fields.path}"
-#elif $db_opts.db_opts_selector == "histdb":
-  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
-#else:
-  -subject "$db_opts.subject"
-#end if
--task $blast_type
--evalue $evalue_cutoff
--out "$output1"
-##Set the extended list here so if/when we add things, saved workflows are not affected
-#if str($out_format)=="ext":
-    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
-#else:
-    -outfmt $out_format
-#end if
--num_threads 8
-#if $adv_opts.adv_opts_selector=="advanced":
-$adv_opts.filter_query
-$adv_opts.strand
-## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
-## Note -max_target_seqs overrides -num_descriptions and -num_alignments
-#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
--max_target_seqs $adv_opts.max_hits
-#end if
-#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
--word_size $adv_opts.word_size
-#end if
-$adv_opts.ungapped
-$adv_opts.parse_deflines
-## End of advanced options:
-#end if
-    </command>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-        <!-- In case the return code has not been set properly check stderr too -->
-        <regex match="Error:" />
-        <regex match="Exception:" />
-    </stdio>
-    <inputs>
-        <param name="query" type="data" format="fasta" label="Nucleotide query sequence(s)"/>
-        <conditional name="db_opts">
-            <param name="db_opts_selector" type="select" label="Subject database/sequences">
-              <option value="db" selected="True">Locally installed BLAST database</option>
-              <option value="histdb">BLAST database from your history</option>
-              <option value="file">FASTA file from your history (see warning note below)</option>
-            </param>
-            <when value="db">
-                <param name="database" type="select" label="Nucleotide BLAST database">
-                    <options from_file="blastdb.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <when value="histdb">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="data" format="blastdbn" label="Nucleotide BLAST database" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <when value="file">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="data" format="fasta" label="Nucleotide FASTA file to use as database"/>
-            </when>
-        </conditional>
-        <param name="blast_type" type="select" display="radio" label="Type of BLAST">
-            <option value="megablast">megablast</option>
-            <option value="blastn">blastn</option>
-            <option value="blastn-short">blastn-short</option>
-            <option value="dc-megablast">dc-megablast</option>
-            <!-- Using BLAST 2.2.24+ this gives an error:
-            BLAST engine error: Program type 'vecscreen' not supported
-            <option value="vecscreen">vecscreen</option>
-            -->
-        </param>
-        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
-        <param name="out_format" type="select" label="Output format">
-            <option value="6">Tabular (standard 12 columns)</option>
-            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
-            <option value="5">BLAST XML</option>
-            <option value="0">Pairwise text</option>
-            <option value="0 -html">Pairwise HTML</option>
-            <option value="2">Query-anchored text</option>
-            <option value="2 -html">Query-anchored HTML</option>
-            <option value="4">Flat query-anchored text</option>
-            <option value="4 -html">Flat query-anchored HTML</option>
-            <!--
-            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
-            -->
-        </param>
-        <conditional name="adv_opts">
-            <param name="adv_opts_selector" type="select" label="Advanced Options">
-              <option value="basic" selected="True">Hide Advanced Options</option>
-              <option value="advanced">Show Advanced Options</option>
-            </param>
-            <when value="basic" />
-            <when value="advanced">
-                <!-- Could use a select (yes, no, other) where other allows setting 'level window linker' -->
-                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with DUST)" truevalue="-dust yes" falsevalue="-dust no" checked="true" />
-                <param name="strand" type="select" label="Query strand(s) to search against database/subject">
-                    <option value="-strand both">Both</option>
-                    <option value="-strand plus">Plus (forward)</option>
-                    <option value="-strand minus">Minus (reverse complement)</option>
-                </param>
-                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
-                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
-                    <validator type="in_range" min="0" />
-                </param>
-                <!-- I'd like word_size to be optional, with minimum 4 for blastn -->
-                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 4.">
-                    <validator type="in_range" min="0" />
-                </param>
-                <param name="ungapped" type="boolean" label="Perform ungapped alignment only?" truevalue="-ungapped" falsevalue="" checked="false" />
-                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
-            </when>
-        </conditional>
-    </inputs>
-    <outputs>
-        <data name="output1" format="tabular" label="${blast_type.value_label} on ${on_string}">
-            <change_format>
-                <when input="out_format" value="0" format="txt"/>
-                <when input="out_format" value="0 -html" format="html"/>
-                <when input="out_format" value="2" format="txt"/>
-                <when input="out_format" value="2 -html" format="html"/>
-                <when input="out_format" value="4" format="txt"/>
-                <when input="out_format" value="4 -html" format="html"/>
-                <when input="out_format" value="5" format="blastxml"/>
-            </change_format>
-        </data>
-    </outputs>
-    <tests>
-        <test>
-            <param name="query" value="rhodopsin_nucs.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="three_human_mRNA.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-40" />
-            <param name="out_format" value="6" />
-            <param name="adv_opts_selector" value="basic" />
-            <output name="output1" file="blastn_rhodopsin_vs_three_human.tabular" ftype="tabular" />
-        </test>
-    </tests>
-    <help>
-
-.. class:: warningmark
-
-**Note**. Database searches may take a substantial amount of time.
-For large input datasets it is advisable to allow overnight processing.
-
------
-
-**What it does**
-
-Search a *nucleotide database* using a *nucleotide query*,
-using the NCBI BLAST+ blastn command line tool.
-Algorithms include blastn, megablast, and discontiguous megablast.
-
-.. class:: warningmark
-
-You can also search against a FASTA file of subject nucleotide
-sequences. This is *not* advised because it is slower (only one
-CPU is used), but more importantly gives e-values for pairwise
-searches (very small e-values which will look overly significant).
-In most cases you should instead turn the other FASTA file into a
-database first using *makeblastdb* and search against that.
-
------
-
-**Output format**
-
-Because Galaxy focuses on processing tabular data, the default output of this
-tool is tabular. The standard BLAST+ tabular output contains 12 columns:
-
-====== ========= ============================================
-Column NCBI name Description
------- --------- --------------------------------------------
-     1 qseqid    Query Seq-id (ID of your sequence)
-     2 sseqid    Subject Seq-id (ID of the database hit)
-     3 pident    Percentage of identical matches
-     4 length    Alignment length
-     5 mismatch  Number of mismatches
-     6 gapopen   Number of gap openings
-     7 qstart    Start of alignment in query
-     8 qend      End of alignment in query
-     9 sstart    Start of alignment in subject (database hit)
-    10 send      End of alignment in subject (database hit)
-    11 evalue    Expectation value (E-value)
-    12 bitscore  Bit score
-====== ========= ============================================
-
-The BLAST+ tools can optionally output additional columns of information,
-but this takes longer to calculate. Most (but not all) of these columns are
-included by selecting the extended tabular output. The extra columns are
-included *after* the standard 12 columns. This is so that you can write
-workflow filtering steps that accept either the 12 or 24 column tabular
-BLAST output. Galaxy now uses this extended 24 column output by default.
-
-====== ============= ===========================================
-Column NCBI name     Description
------- ------------- -------------------------------------------
-    13 sallseqid     All subject Seq-id(s), separated by a ';'
-    14 score         Raw score
-    15 nident        Number of identical matches
-    16 positive      Number of positive-scoring matches
-    17 gaps          Total number of gaps
-    18 ppos          Percentage of positive-scoring matches
-    19 qframe        Query frame
-    20 sframe        Subject frame
-    21 qseq          Aligned part of query sequence
-    22 sseq          Aligned part of subject sequence
-    23 qlen          Query sequence length
-    24 slen          Subject sequence length
-====== ============= ===========================================
-
-The third option is BLAST XML output, which is designed to be parsed by
-another program, and is understood by some Galaxy tools.
-
-You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
-The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
-The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
-The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
-and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
-
--------
-
-**References**
-
-Zhang et al. A Greedy Algorithm for Aligning DNA Sequences. 2000. JCB: 203-214.
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-    </help>
-</tool>
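The two column tables in the help text above map directly onto a small parser. The following is a minimal sketch, assuming tab-separated output as produced by the "Tabular (standard 12 columns)" or "Tabular (extended 24 columns)" options; the function name `parse_blast_tabular` is hypothetical and not part of the wrapper, and all values are left as strings.

```python
# Column names taken from the help text of the wrapper (NCBI names).
STD_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
               "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
EXT_COLUMNS = ["sallseqid", "score", "nident", "positive", "gaps", "ppos",
               "qframe", "sframe", "qseq", "sseq", "qlen", "slen"]


def parse_blast_tabular(lines):
    """Yield one dict per hit from 12 or 24 column BLAST+ tabular lines.

    Hypothetical helper: blank lines and lines starting with '#' are skipped.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) == 12:
            names = STD_COLUMNS
        elif len(fields) == 24:
            # The extra columns always follow the standard 12.
            names = STD_COLUMNS + EXT_COLUMNS
        else:
            raise ValueError("Expected 12 or 24 columns, got %d" % len(fields))
        yield dict(zip(names, fields))
```

Because the extended columns are appended after the standard ones, code that only reads the first 12 keys works unchanged on either output.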
--- a/tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,308 +0,0 @@
-<tool id="ncbi_blastp_wrapper" name="NCBI BLAST+ blastp" version="0.0.20">
-    <description>Search protein database with protein query sequence(s)</description>
-    <!-- If job splitting is enabled, break up the query file into parts -->
-    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
-    <requirements>
-        <requirement type="binary">blastp</requirement>
-        <requirement type="package" version="2.2.26+">blast+</requirement>
-    </requirements>
-    <version_command>blastp -version</version_command>
-    <command>
-## The command is a Cheetah template which allows some Python based syntax.
-## Lines starting with hash hash are comments. Galaxy will turn newlines into spaces
-blastp
--query "$query"
-#if $db_opts.db_opts_selector == "db":
-  -db "${db_opts.database.fields.path}"
-#elif $db_opts.db_opts_selector == "histdb":
-  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
-#else:
-  -subject "$db_opts.subject"
-#end if
--task $blast_type
--evalue $evalue_cutoff
--out "$output1"
-##Set the extended list here so if/when we add things, saved workflows are not affected
-#if str($out_format)=="ext":
-    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
-#else:
-    -outfmt $out_format
-#end if
--num_threads 8
-#if $adv_opts.adv_opts_selector=="advanced":
-$adv_opts.filter_query
--matrix $adv_opts.matrix
-## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
-## Note -max_target_seqs overrides -num_descriptions and -num_alignments
-#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
--max_target_seqs $adv_opts.max_hits
-#end if
-#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
--word_size $adv_opts.word_size
-#end if
-##Ungapped disabled for now - see comments below
-##$adv_opts.ungapped
-$adv_opts.parse_deflines
-## End of advanced options:
-#end if
-    </command>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-        <!-- In case the return code has not been set properly check stderr too -->
-        <regex match="Error:" />
-        <regex match="Exception:" />
-    </stdio>
-    <inputs>
-        <param name="query" type="data" format="fasta" label="Protein query sequence(s)"/>
-        <conditional name="db_opts">
-            <param name="db_opts_selector" type="select" label="Subject database/sequences">
-              <option value="db" selected="True">Locally installed BLAST database</option>
-              <option value="histdb">BLAST database from your history</option>
-              <option value="file">FASTA file from your history (see warning note below)</option>
-            </param>
-            <when value="db">
-                <param name="database" type="select" label="Protein BLAST database">
-                    <options from_file="blastdb_p.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <when value="histdb">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="data" format="blastdbp" label="Protein BLAST database" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <when value="file">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="data" format="fasta" label="Protein FASTA file to use as database"/>
-            </when>
-        </conditional>
-        <param name="blast_type" type="select" display="radio" label="Type of BLAST">
-            <option value="blastp">blastp</option>
-            <option value="blastp-short">blastp-short</option>
-        </param>
-        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
-        <param name="out_format" type="select" label="Output format">
-            <option value="6">Tabular (standard 12 columns)</option>
-            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
-            <option value="5">BLAST XML</option>
-            <option value="0">Pairwise text</option>
-            <option value="0 -html">Pairwise HTML</option>
-            <option value="2">Query-anchored text</option>
-            <option value="2 -html">Query-anchored HTML</option>
-            <option value="4">Flat query-anchored text</option>
-            <option value="4 -html">Flat query-anchored HTML</option>
-            <!--
-            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
-            -->
-        </param>
-        <conditional name="adv_opts">
-            <param name="adv_opts_selector" type="select" label="Advanced Options">
-              <option value="basic" selected="True">Hide Advanced Options</option>
-              <option value="advanced">Show Advanced Options</option>
-            </param>
-            <when value="basic" />
-            <when value="advanced">
-                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
-                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="false" />
-                <param name="matrix" type="select" label="Scoring matrix">
-                    <option value="BLOSUM90">BLOSUM90</option>
-                    <option value="BLOSUM80">BLOSUM80</option>
-                    <option value="BLOSUM62" selected="true">BLOSUM62 (default)</option>
-                    <option value="BLOSUM50">BLOSUM50</option>
-                    <option value="BLOSUM45">BLOSUM45</option>
-                    <option value="PAM250">PAM250</option>
-                    <option value="PAM70">PAM70</option>
-                    <option value="PAM30">PAM30</option>
-                </param>
-                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
-                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
-                    <validator type="in_range" min="0" />
-                </param>
-                <!-- I'd like word_size to be optional, with minimum 2 for blastp -->
-                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
-                    <validator type="in_range" min="0" />
-                </param>
-                <!--
-                Can't use '-ungapped' on its own, error back is:
-                Composition-adjusted searched are not supported with an ungapped search, please add -comp_based_stats F or do a gapped search
-                Tried using '-ungapped -comp_based_stats F' and blastp crashed with 'Attempt to access NULL pointer.'
-                <param name="ungapped" type="boolean" label="Perform ungapped alignment only?" truevalue="-ungapped -comp_based_stats F" falsevalue="" checked="false" />
-                -->
-                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
-            </when>
-        </conditional>
-    </inputs>
-    <outputs>
-        <data name="output1" format="tabular" label="${blast_type.value_label} on ${on_string}">
-            <change_format>
-                <when input="out_format" value="0" format="txt"/>
-                <when input="out_format" value="0 -html" format="html"/>
-                <when input="out_format" value="2" format="txt"/>
-                <when input="out_format" value="2 -html" format="html"/>
-                <when input="out_format" value="4" format="txt"/>
-                <when input="out_format" value="4 -html" format="html"/>
-                <when input="out_format" value="5" format="blastxml"/>
-            </change_format>
-        </data>
-    </outputs>
-    <tests>
-        <test>
-            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="rhodopsin_proteins.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-8" />
-            <param name="blast_type" value="blastp" />
-            <param name="out_format" value="5" />
-            <param name="adv_opts_selector" value="advanced" />
-            <param name="filter_query" value="False" />
-            <param name="matrix" value="BLOSUM62" />
-            <param name="max_hits" value="0" />
-            <param name="word_size" value="0" />
-            <param name="parse_deflines" value="True" />
-            <output name="output1" file="blastp_four_human_vs_rhodopsin.xml" ftype="blastxml" />
-        </test>
-        <test>
-            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="rhodopsin_proteins.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-8" />
-            <param name="blast_type" value="blastp" />
-            <param name="out_format" value="6" />
-            <param name="adv_opts_selector" value="advanced" />
-            <param name="filter_query" value="False" />
-            <param name="matrix" value="BLOSUM62" />
-            <param name="max_hits" value="0" />
-            <param name="word_size" value="0" />
-            <param name="parse_deflines" value="True" />
-            <output name="output1" file="blastp_four_human_vs_rhodopsin.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="rhodopsin_proteins.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-8" />
-            <param name="blast_type" value="blastp" />
-            <param name="out_format" value="ext" />
-            <param name="adv_opts_selector" value="advanced" />
-            <param name="filter_query" value="False" />
-            <param name="matrix" value="BLOSUM62" />
-            <param name="max_hits" value="0" />
-            <param name="word_size" value="0" />
-            <param name="parse_deflines" value="True" />
-            <output name="output1" file="blastp_four_human_vs_rhodopsin_ext.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="query" value="rhodopsin_proteins.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-8" />
-            <param name="blast_type" value="blastp" />
-            <param name="out_format" value="6" />
-            <param name="adv_opts_selector" value="basic" />
-            <output name="output1" file="blastp_rhodopsin_vs_four_human.tabular" ftype="tabular" />
-        </test>
-    </tests>
-    <help>
-
-.. class:: warningmark
-
-**Note**. Database searches may take a substantial amount of time.
-For large input datasets it is advisable to allow overnight processing.
-
------
-
-**What it does**
-
-Search a *protein database* using a *protein query*,
-using the NCBI BLAST+ blastp command line tool.
-
-.. class:: warningmark
-
-You can also search against a FASTA file of subject protein
-sequences. This is *not* advised because it is slower (only one
-CPU is used), but more importantly gives e-values for pairwise
-searches (very small e-values which will look overly significant).
-In most cases you should instead turn the other FASTA file into a
-database first using *makeblastdb* and search against that.
-
------
-
-**Output format**
-
-Because Galaxy focuses on processing tabular data, the default output of this
-tool is tabular. The standard BLAST+ tabular output contains 12 columns:
-
-====== ========= ============================================
-Column NCBI name Description
------- --------- --------------------------------------------
-     1 qseqid    Query Seq-id (ID of your sequence)
-     2 sseqid    Subject Seq-id (ID of the database hit)
-     3 pident    Percentage of identical matches
-     4 length    Alignment length
-     5 mismatch  Number of mismatches
-     6 gapopen   Number of gap openings
-     7 qstart    Start of alignment in query
-     8 qend      End of alignment in query
-     9 sstart    Start of alignment in subject (database hit)
-    10 send      End of alignment in subject (database hit)
-    11 evalue    Expectation value (E-value)
-    12 bitscore  Bit score
-====== ========= ============================================
-
-The BLAST+ tools can optionally output additional columns of information,
-but this takes longer to calculate. Most (but not all) of these columns are
-included by selecting the extended tabular output. The extra columns are
-included *after* the standard 12 columns. This is so that you can write
-workflow filtering steps that accept either the 12 or 24 column tabular
-BLAST output. Galaxy now uses this extended 24 column output by default.
-
-====== ============= ===========================================
-Column NCBI name     Description
------- ------------- -------------------------------------------
-    13 sallseqid     All subject Seq-id(s), separated by a ';'
-    14 score         Raw score
-    15 nident        Number of identical matches
-    16 positive      Number of positive-scoring matches
-    17 gaps          Total number of gaps
-    18 ppos          Percentage of positive-scoring matches
-    19 qframe        Query frame
-    20 sframe        Subject frame
-    21 qseq          Aligned part of query sequence
-    22 sseq          Aligned part of subject sequence
-    23 qlen          Query sequence length
-    24 slen          Subject sequence length
-====== ============= ===========================================
-
-The third option is BLAST XML output, which is designed to be parsed by
-another program, and is understood by some Galaxy tools.
-
-You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
-The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
-The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
-The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
-and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
-
--------
-
-**References**
-
-Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
-
-Schaffer et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. 2001. Nucleic Acids Res. 29:2994-3005.
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-    </help>
-</tool>
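The conditional parts of the Cheetah command template in the wrapper above can be read as two small rules: the "ext" output format expands to a fixed column list, and the integer options are passed only when greater than zero. The following is a rough Python sketch of that logic, not the code Galaxy itself runs; the helper names `outfmt_args` and `optional_int_args` are hypothetical.

```python
# The fixed column list the wrapper substitutes for the "ext" output format.
EXTENDED_OUTFMT = ("6 std sallseqid score nident positive gaps ppos "
                   "qframe sframe qseq sseq qlen slen")


def outfmt_args(out_format):
    """Return the -outfmt arguments for a given out_format select value."""
    if out_format == "ext":
        # Quoted as a single argument in the template.
        return ["-outfmt", EXTENDED_OUTFMT]
    # Values such as "0 -html" are unquoted, so they become separate arguments.
    return ["-outfmt"] + out_format.split()


def optional_int_args(flag, value):
    """Return [flag, value] only for a non-empty value greater than zero."""
    text = str(value)
    if text and int(text) > 0:
        return [flag, text]
    return []
```

For example, `optional_int_args("-max_target_seqs", 0)` returns an empty list, which mirrors the "Use zero for default limits" help text on the `max_hits` parameter.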
--- a/tools/ncbi_blast_plus/ncbi_blastx_wrapper.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,294 +0,0 @@
-<tool id="ncbi_blastx_wrapper" name="NCBI BLAST+ blastx" version="0.0.19">
-    <description>Search protein database with translated nucleotide query sequence(s)</description>
-    <!-- If job splitting is enabled, break up the query file into parts -->
-    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
-    <requirements>
-        <requirement type="binary">blastx</requirement>
-        <requirement type="package" version="2.2.26+">blast+</requirement>
-    </requirements>
-    <version_command>blastx -version</version_command>
-    <command>
-## The command is a Cheetah template which allows some Python based syntax.
-## Lines starting with hash hash are comments. Galaxy will turn newlines into spaces
-blastx
--query "$query"
-#if $db_opts.db_opts_selector == "db":
-  -db "${db_opts.database.fields.path}"
-#elif $db_opts.db_opts_selector == "histdb":
-  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
-#else:
-  -subject "$db_opts.subject"
-#end if
--query_gencode $query_gencode
--evalue $evalue_cutoff
--out "$output1"
-##Set the extended list here so if/when we add things, saved workflows are not affected
-#if str($out_format)=="ext":
-    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
-#else:
-    -outfmt $out_format
-#end if
--num_threads 8
-#if $adv_opts.adv_opts_selector=="advanced":
-$adv_opts.filter_query
-$adv_opts.strand
--matrix $adv_opts.matrix
-## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
-## Note -max_target_seqs overrides -num_descriptions and -num_alignments
-#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
--max_target_seqs $adv_opts.max_hits
-#end if
-#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
--word_size $adv_opts.word_size
-#end if
-$adv_opts.ungapped
-$adv_opts.parse_deflines
-## End of advanced options:
-#end if
-    </command>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-        <!-- In case the return code has not been set properly check stderr too -->
-        <regex match="Error:" />
-        <regex match="Exception:" />
-    </stdio>
-    <inputs>
-        <param name="query" type="data" format="fasta" label="Nucleotide query sequence(s)"/>
-        <conditional name="db_opts">
-            <param name="db_opts_selector" type="select" label="Subject database/sequences">
-              <option value="db" selected="True">Locally installed BLAST database</option>
-              <option value="histdb">BLAST database from your history</option>
-              <option value="file">FASTA file from your history (see warning note below)</option>
-            </param>
-            <when value="db">
-                <param name="database" type="select" label="Protein BLAST database">
-                    <options from_file="blastdb_p.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <when value="histdb">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="data" format="blastdbp" label="Protein BLAST database" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <when value="file">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="data" format="fasta" label="Protein FASTA file to use as database"/>
-            </when>
-        </conditional>
-        <param name="query_gencode" type="select" label="Query genetic code">
-            <!-- See http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for details -->
-            <option value="1" selected="True">1. Standard</option>
-            <option value="2">2. Vertebrate Mitochondrial</option>
-            <option value="3">3. Yeast Mitochondrial</option>
-            <option value="4">4. Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code</option>
-            <option value="5">5. Invertebrate Mitochondrial</option>
-            <option value="6">6. Ciliate, Dasycladacean and Hexamita Nuclear Code</option>
-            <option value="9">9. Echinoderm Mitochondrial</option>
-            <option value="10">10. Euplotid Nuclear</option>
-            <option value="11">11. Bacteria and Archaea</option>
-            <option value="12">12. Alternative Yeast Nuclear</option>
-            <option value="13">13. Ascidian Mitochondrial</option>
-            <option value="14">14. Flatworm Mitochondrial</option>
-            <option value="15">15. Blepharisma Macronuclear</option>
-            <option value="16">16. Chlorophycean Mitochondrial Code</option>
-            <option value="21">21. Trematode Mitochondrial Code</option>
-            <option value="22">22. Scenedesmus obliquus mitochondrial Code</option>
-            <option value="23">23. Thraustochytrium Mitochondrial Code</option>
-            <option value="24">24. Pterobranchia mitochondrial code</option>
-        </param>
-        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
-        <param name="out_format" type="select" label="Output format">
-            <option value="6">Tabular (standard 12 columns)</option>
-            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
-            <option value="5">BLAST XML</option>
-            <option value="0">Pairwise text</option>
-            <option value="0 -html">Pairwise HTML</option>
-            <option value="2">Query-anchored text</option>
-            <option value="2 -html">Query-anchored HTML</option>
-            <option value="4">Flat query-anchored text</option>
-            <option value="4 -html">Flat query-anchored HTML</option>
-            <!--
-            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
-            -->
-        </param>
-        <conditional name="adv_opts">
-            <param name="adv_opts_selector" type="select" label="Advanced Options">
-              <option value="basic" selected="True">Hide Advanced Options</option>
-              <option value="advanced">Show Advanced Options</option>
-            </param>
-            <when value="basic" />
-            <when value="advanced">
-                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
-                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="true" />
-                <param name="strand" type="select" label="Query strand(s) to search against database/subject">
-                    <option value="-strand both">Both</option>
-                    <option value="-strand plus">Plus (forward)</option>
-                    <option value="-strand minus">Minus (reverse complement)</option>
-                </param>
-                <param name="matrix" type="select" label="Scoring matrix">
-                    <option value="BLOSUM90">BLOSUM90</option>
-                    <option value="BLOSUM80">BLOSUM80</option>
-                    <option value="BLOSUM62" selected="true">BLOSUM62 (default)</option>
-                    <option value="BLOSUM50">BLOSUM50</option>
-                    <option value="BLOSUM45">BLOSUM45</option>
-                    <option value="PAM250">PAM250</option>
-                    <option value="PAM70">PAM70</option>
-                    <option value="PAM30">PAM30</option>
-                </param>
-                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
-                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
-                    <validator type="in_range" min="0" />
-                </param>
-                <!-- I'd like word_size to be optional, with minimum 2 for blastx -->
-                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
-                    <validator type="in_range" min="0" />
-                </param>
-                <param name="ungapped" type="boolean" label="Perform ungapped alignment only?" truevalue="-ungapped" falsevalue="" checked="false" />
-                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
-            </when>
-        </conditional>
-    </inputs>
-    <outputs>
-        <data name="output1" format="tabular" label="blastx on ${on_string}">
-            <change_format>
-                <when input="out_format" value="0" format="txt"/>
-                <when input="out_format" value="0 -html" format="html"/>
-                <when input="out_format" value="2" format="txt"/>
-                <when input="out_format" value="2 -html" format="html"/>
-                <when input="out_format" value="4" format="txt"/>
-                <when input="out_format" value="4 -html" format="html"/>
-                <when input="out_format" value="5" format="blastxml"/>
-            </change_format>
-        </data>
-    </outputs>
-    <tests>
-        <test>
-            <param name="query" value="rhodopsin_nucs.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-10" />
-            <param name="out_format" value="5" />
-            <param name="adv_opts_selector" value="basic" />
-            <output name="output1" file="blastx_rhodopsin_vs_four_human.xml" ftype="blastxml" />
-        </test>
-        <test>
-            <param name="query" value="rhodopsin_nucs.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-10" />
-            <param name="out_format" value="6" />
-            <param name="adv_opts_selector" value="basic" />
-            <output name="output1" file="blastx_rhodopsin_vs_four_human.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="query" value="rhodopsin_nucs.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-10" />
-            <param name="out_format" value="ext" />
-            <param name="adv_opts_selector" value="basic" />
-            <output name="output1" file="blastx_rhodopsin_vs_four_human_ext.tabular" ftype="tabular" />
-        </test>
-    </tests>
-    <help>
-
-.. class:: warningmark
-
-**Note**. Database searches may take a substantial amount of time.
-For large input datasets it is advisable to allow overnight processing.
-
------
-
-**What it does**
-
-Search a *protein database* using a *translated nucleotide query*,
-using the NCBI BLAST+ blastx command line tool.
-
-.. class:: warningmark
-
-You can also search against a FASTA file of subject protein
-sequences. This is *not* advised because it is slower (only one
-CPU is used), but more importantly gives e-values for pairwise
-searches (very small e-values which will look overly significant).
-In most cases you should instead turn the other FASTA file into a
-database first using *makeblastdb* and search against that.
-
------
-
-**Output format**
-
-Because Galaxy focuses on processing tabular data, the default output of this
-tool is tabular. The standard BLAST+ tabular output contains 12 columns:
-
-====== ========= ============================================
-Column NCBI name Description
------- --------- --------------------------------------------
-     1 qseqid    Query Seq-id (ID of your sequence)
-     2 sseqid    Subject Seq-id (ID of the database hit)
-     3 pident    Percentage of identical matches
-     4 length    Alignment length
-     5 mismatch  Number of mismatches
-     6 gapopen   Number of gap openings
-     7 qstart    Start of alignment in query
-     8 qend      End of alignment in query
-     9 sstart    Start of alignment in subject (database hit)
-    10 send      End of alignment in subject (database hit)
-    11 evalue    Expectation value (E-value)
-    12 bitscore  Bit score
-====== ========= ============================================
-
-The BLAST+ tools can optionally output additional columns of information,
-but this takes longer to calculate. Most (but not all) of these columns are
-included by selecting the extended tabular output. The extra columns are
-included *after* the standard 12 columns. This is so that you can write
-workflow filtering steps that accept either the 12 or 24 column tabular
-BLAST output. Galaxy now uses this extended 24 column output by default.
-
-====== ============= ===========================================
-Column NCBI name     Description
------- ------------- -------------------------------------------
-    13 sallseqid     All subject Seq-id(s), separated by a ';'
-    14 score         Raw score
-    15 nident        Number of identical matches
-    16 positive      Number of positive-scoring matches
-    17 gaps          Total number of gaps
-    18 ppos          Percentage of positive-scoring matches
-    19 qframe        Query frame
-    20 sframe        Subject frame
-    21 qseq          Aligned part of query sequence
-    22 sseq          Aligned part of subject sequence
-    23 qlen          Query sequence length
-    24 slen          Subject sequence length
-====== ============= ===========================================
-
-The third option is BLAST XML output, which is designed to be parsed by
-another program, and is understood by some Galaxy tools.
-
-You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
-The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
-The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
-The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
-and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
-
--------
-
-**References**
-
-Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-    </help>
-</tool>
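The 12 and 24 column layouts documented in the help text above can be read with a few lines of Python. This is only a minimal sketch and not part of the wrapper: the function name `parse_blast_tabular` and the sample row in the usage note are made up for illustration, and only the column names are taken from the two tables.

```python
import io

# Column names as listed in the help text: the standard 12, then the 12 extras.
STD_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
               "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
EXT_COLUMNS = ["sallseqid", "score", "nident", "positive", "gaps", "ppos",
               "qframe", "sframe", "qseq", "sseq", "qlen", "slen"]


def parse_blast_tabular(handle):
    """Yield one dict per hit from 12 or 24 column tabular BLAST output.

    Hypothetical helper: values are left as strings, so callers convert
    the fields they need (e.g. float(hit["evalue"])).
    """
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) == 12:
            names = STD_COLUMNS
        elif len(fields) == 24:
            names = STD_COLUMNS + EXT_COLUMNS
        else:
            raise ValueError("Expected 12 or 24 columns, got %d" % len(fields))
        yield dict(zip(names, fields))
```

Because the extra columns come after the standard 12, code that only looks at the first 12 names works on either layout. For example, `list(parse_blast_tabular(io.StringIO(text)))` returns one dictionary per hit for an in-memory string `text`.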
--- a/tools/ncbi_blast_plus/ncbi_makeblastdb.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,129 +0,0 @@
-<tool id="ncbi_makeblastdb" name="NCBI BLAST+ makeblastdb" version="0.0.5">
-  <description>Make BLAST database</description>
-    <requirements>
-        <requirement type="binary">makeblastdb</requirement>
-        <requirement type="package" version="2.2.26+">blast+</requirement>
-    </requirements>
-    <version_command>makeblastdb -version</version_command>
-    <command>
-makeblastdb -out "${os.path.join($outfile.extra_files_path,'blastdb')}"
-$parse_seqids
-$hash_index
-## Single call to -in with multiple filenames space separated with outer quotes
-## (presumably any filenames with spaces would be a problem). Note this gives
-## some extra spaces, e.g. -in " file1 file2 file3  " but BLAST seems happy:
--in "
-#for $i in $in
-${i.file} #end for
-"
-#if $title:
--title "$title"
-#else:
-##Would default to being based on the cryptic Galaxy filenames, which is unhelpful
--title "BLAST Database"
-#end if
--dbtype $dbtype
-## #set $sep = '-mask_data '
-## #for $i in $mask_data
-## $sep${i.file}
-## #set $sep = ', '
-## #end for
-## #set $sep = '-gi_mask -gi_mask_name '
-## #for $i in $gi_mask
-## $sep${i.file}
-## #set $sep = ', '
-## #end for
-## #if $tax.select == 'id':
-## -taxid $tax.id
-## #else if $tax.select == 'map':
-## -taxid_map $tax.map
-## #end if
-</command>
-<stdio>
-    <!-- Anything other than zero is an error -->
-    <exit_code range="1:" />
-    <exit_code range=":-1" />
-    <!-- In case the return code has not been set properly, check stderr too -->
-    <regex match="Error:" />
-    <regex match="Exception:" />
-</stdio>
-<inputs>
-    <param name="dbtype" type="select" display="radio" label="Molecule type of input">
-        <option value="prot">protein</option>
-        <option value="nucl">nucleotide</option>
-    </param>
-    <!-- TODO Allow merging of existing BLAST databases (conditional on the database type)
-    <repeat name="in" title="Blast or Fasta Database" min="1">
-        <param name="file" type="data" format="fasta,blastdbn,blastdbp" label="Blast or Fasta database" />
-    </repeat>
-    -->
-    <repeat name="in" title="FASTA file" min="1">
-        <param name="file" type="data" format="fasta" />
-    </repeat>
-    <param name="title" type="text" value="" label="Title for BLAST database" help="This is the database name shown in BLAST search output" />
-    <param name="parse_seqids" type="boolean" truevalue="-parse_seqids" falsevalue="" checked="False" label="Parse the sequence identifiers" help="This is only advised if your FASTA file follows the NCBI naming conventions using pipe '|' symbols" />
-    <param name="hash_index" type="boolean" truevalue="-hash_index" falsevalue="" checked="true" label="Enable the creation of sequence hash values." help="These hash values can then be used to quickly determine if a given sequence data exists in this BLAST database." />
-
-    <!-- SEQUENCE MASKING OPTIONS -->
-    <!-- TODO
-    <repeat name="mask_data" title="Provide one or more files containing masking data">
-        <param name="file" type="data" format="asnb" label="File containing masking data" help="As produced by NCBI masking applications (e.g. dustmasker, segmasker, windowmasker)" />
-    </repeat>
-    <repeat name="gi_mask" title="Create GI indexed masking data">
-        <param name="file" type="data" format="asnb" label="Masking data output file" />
-    </repeat>
-    -->
-
-    <!-- TAXONOMY OPTIONS -->
-    <!-- TODO
-    <conditional name="tax">
-        <param name="select" type="select" label="Taxonomy options">
-            <option value="">Do not assign sequences to Taxonomy IDs</option>
-            <option value="id">Assign all sequences to one Taxonomy ID</option>
-            <option value="map">Supply text file mapping sequence IDs to taxonomy IDs</option>
-        </param>
-        <when value="">
-        </when>
-        <when value="id">
-            <param name="id" type="integer" value="" label="NCBI taxonomy ID" help="Integer &gt;=0" />
-        </when>
-        <when value="map">
-            <param name="file" type="data" format="txt" label="Seq ID : Tax ID mapping file" help="Format: SequenceId TaxonomyId" />
-        </when>
-    </conditional>
-    -->
-</inputs>
-<outputs>
-    <!-- If we only accepted one FASTA file, we could use its human name here... -->
-    <data name="outfile" format="data" label="${dbtype.value_label} BLAST database from ${on_string}">
-        <change_format>
-                <when input="dbtype" value="nucl" format="blastdbn"/>
-                <when input="dbtype" value="prot" format="blastdbp"/>
-        </change_format>
-    </data>
-</outputs>
-<help>
-**What it does**
-
-Make BLAST database from one or more FASTA files and/or BLAST databases.
-
-This is a wrapper for the NCBI BLAST+ tool 'makeblastdb', which is the
-replacement for the 'formatdb' tool in the NCBI 'legacy' BLAST suite.
-
-<!--
-Applying masks to an existing BLAST database will not change the original database; a new database will be created.
-For this reason, it's best to apply all masks at once to minimize the number of unnecessary intermediate databases.
--->
-
-**Documentation**
-
-http://www.ncbi.nlm.nih.gov/books/NBK1763/
-
-**References**
-
-Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-</help>
-</tool>
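The Cheetah template in the makeblastdb wrapper above assembles a single command line from the form fields. As a rough sketch of the same logic in plain Python (the function name `makeblastdb_args` and its parameters are hypothetical; only the flags and the fallback title come from the template), the argument list could be built like this:

```python
def makeblastdb_args(fasta_files, out_prefix, dbtype, title="",
                     parse_seqids=False, hash_index=True):
    """Return a makeblastdb argument list mirroring the wrapper's template.

    Hypothetical helper for illustration; it only builds the list and
    does not run anything.
    """
    if dbtype not in ("prot", "nucl"):
        raise ValueError("dbtype must be 'prot' or 'nucl'")
    if not fasta_files:
        raise ValueError("At least one FASTA file is required")
    args = ["makeblastdb", "-out", out_prefix]
    if parse_seqids:
        args.append("-parse_seqids")
    if hash_index:
        args.append("-hash_index")
    # The template passes all filenames in one space separated -in value,
    # so filenames containing spaces would be a problem here too.
    args += ["-in", " ".join(fasta_files)]
    # An empty title falls back to a fixed label, as in the template.
    args += ["-title", title or "BLAST Database"]
    args += ["-dbtype", dbtype]
    return args
```

Returning a list rather than a shell string means the combined `-in` value stays a single argument when the list is handed to something like `subprocess.call`, which is what the outer quotes achieve in the template.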
--- a/tools/ncbi_blast_plus/ncbi_rpsblast_wrapper.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,238 +0,0 @@
-<tool id="ncbi_rpsblast_wrapper" name="NCBI BLAST+ rpsblast" version="0.0.4">
-    <description>Search protein domain database (PSSMs) with protein query sequence(s)</description>
-    <!-- If job splitting is enabled, break up the query file into parts -->
-    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject" merge_outputs="output1"></parallelism>
-    <requirements>
-        <requirement type="binary">rpsblast</requirement>
-        <requirement type="package" version="2.2.26+">blast+</requirement>
-    </requirements>
-    <version_command>rpsblast -version</version_command>
-    <command>
-## The command is a Cheetah template which allows some Python based syntax.
-## Lines starting hash hash are comments. Galaxy will turn newlines into spaces
-rpsblast
--query "$query"
-#if $db_opts.db_opts_selector == "db":
-  -db "${db_opts.database.fields.path}"
-#elif $db_opts.db_opts_selector == "histdb":
-  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
-#end if
--evalue $evalue_cutoff
--out "$output1"
-##Set the extended list here so if/when we add things, saved workflows are not affected
-#if str($out_format)=="ext":
-    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
-#else:
-    -outfmt $out_format
-#end if
--num_threads 8
-#if $adv_opts.adv_opts_selector=="advanced":
-$adv_opts.filter_query
-## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
-## Note -max_target_seqs overrides -num_descriptions and -num_alignments
-#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
--max_target_seqs $adv_opts.max_hits
-#end if
-#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
--word_size $adv_opts.word_size
-#end if
-$adv_opts.parse_deflines
-## End of advanced options:
-#end if
-    </command>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-        <!-- In case the return code has not been set properly, check stderr too -->
-        <regex match="Error:" />
-        <regex match="Exception:" />
-    </stdio>
-    <inputs>
-        <param name="query" type="data" format="fasta" label="Protein query sequence(s)"/>
-        <conditional name="db_opts">
-            <param name="db_opts_selector" type="select" label="Protein domain database (PSSM)">
-              <option value="db" selected="True">Locally installed BLAST database</option>
-	      <!-- TODO - define new datatype
-              <option value="histdb">BLAST protein domain database from your history</option>
-	      -->
-            </param>
-            <when value="db">
-                <param name="database" type="select" label="Protein domain database">
-                    <options from_file="blastdb_d.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-	    <!-- TODO - define new datatype
-            <when value="histdb">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="data" format="blastdbd" label="Protein domain database" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-	    -->
-        </conditional>
-        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
-        <param name="out_format" type="select" label="Output format">
-            <option value="6">Tabular (standard 12 columns)</option>
-            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
-            <option value="5">BLAST XML</option>
-            <option value="0">Pairwise text</option>
-            <option value="0 -html">Pairwise HTML</option>
-            <option value="2">Query-anchored text</option>
-            <option value="2 -html">Query-anchored HTML</option>
-            <option value="4">Flat query-anchored text</option>
-            <option value="4 -html">Flat query-anchored HTML</option>
-            <!--
-            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
-            -->
-        </param>
-        <conditional name="adv_opts">
-            <param name="adv_opts_selector" type="select" label="Advanced Options">
-              <option value="basic" selected="True">Hide Advanced Options</option>
-              <option value="advanced">Show Advanced Options</option>
-            </param>
-            <when value="basic" />
-            <when value="advanced">
-                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
-                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="false" />
-                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
-                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
-                    <validator type="in_range" min="0" />
-                </param>
-                <!-- I'd like word_size to be optional, with minimum 2 for rpsblast -->
-                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
-                    <validator type="in_range" min="0" />
-                </param>
-                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
-            </when>
-        </conditional>
-    </inputs>
-    <outputs>
-        <data name="output1" format="tabular" label="rpsblast on ${on_string}">
-            <change_format>
-                <when input="out_format" value="0" format="txt"/>
-                <when input="out_format" value="0 -html" format="html"/>
-                <when input="out_format" value="2" format="txt"/>
-                <when input="out_format" value="2 -html" format="html"/>
-                <when input="out_format" value="4" format="txt"/>
-                <when input="out_format" value="4 -html" format="html"/>
-                <when input="out_format" value="5" format="blastxml"/>
-            </change_format>
-        </data>
-    </outputs>
-    <help>
-
-.. class:: warningmark
-
-**Note**. Database searches may take a substantial amount of time.
-For large input datasets it is advisable to allow overnight processing.
-
------
-
-**What it does**
-
-Search a *protein domain database* using a *protein query*,
-using the NCBI BLAST+ rpsblast command line tool.
-
-The protein domain databases use position-specific scoring matrices
-(PSSMs) and are available for a number of domain collections including:
-
-*CDD* - NCBI curated meta-collection of domains, see
-http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd_help.shtml#NCBI_curated_domains
-
-*Kog* - PSSMs from automatically aligned sequences and sequence
-fragments classified in the KOGs resource, the eukaryotic
-counterpart to COGs, see http://www.ncbi.nlm.nih.gov/COG/new/
-
-*Cog* - PSSMs from automatically aligned sequences and sequence
-fragments classified in the COGs resource, which focuses primarily
-on prokaryotes, see http://www.ncbi.nlm.nih.gov/COG/new/
-
-*Pfam* - PSSMs from Pfam-A seed alignment database, see
-http://pfam.sanger.ac.uk/
-
-*Smart* - PSSMs from SMART domain alignment database, see
-http://smart.embl-heidelberg.de/
-
-*Tigr* - PSSMs from TIGRFAM database of protein families, see
-http://www.jcvi.org/cms/research/projects/tigrfams/overview/
-
-*Prk* - PSSMs from automatically aligned stable clusters in the
-Protein Clusters database, see
-http://www.ncbi.nlm.nih.gov/proteinclusters?cmd=search&amp;db=proteinclusters
-
-The exact list of domain databases offered will depend on how your
-local Galaxy has been configured.
-
------
-
-**Output format**
-
-Because Galaxy focuses on processing tabular data, the default output of this
-tool is tabular. The standard BLAST+ tabular output contains 12 columns:
-
-====== ========= ============================================
-Column NCBI name Description
------- --------- --------------------------------------------
-     1 qseqid    Query Seq-id (ID of your sequence)
-     2 sseqid    Subject Seq-id (ID of the database hit)
-     3 pident    Percentage of identical matches
-     4 length    Alignment length
-     5 mismatch  Number of mismatches
-     6 gapopen   Number of gap openings
-     7 qstart    Start of alignment in query
-     8 qend      End of alignment in query
-     9 sstart    Start of alignment in subject (database hit)
-    10 send      End of alignment in subject (database hit)
-    11 evalue    Expectation value (E-value)
-    12 bitscore  Bit score
-====== ========= ============================================
-
-The BLAST+ tools can optionally output additional columns of information,
-but this takes longer to calculate. Most (but not all) of these columns are
-included by selecting the extended tabular output. The extra columns are
-included *after* the standard 12 columns. This is so that you can write
-workflow filtering steps that accept either the 12 or 24 column tabular
-BLAST output. Galaxy now uses this extended 24 column output by default.
-
-====== ============= ===========================================
-Column NCBI name     Description
------- ------------- -------------------------------------------
-    13 sallseqid     All subject Seq-id(s), separated by a ';'
-    14 score         Raw score
-    15 nident        Number of identical matches
-    16 positive      Number of positive-scoring matches
-    17 gaps          Total number of gaps
-    18 ppos          Percentage of positive-scoring matches
-    19 qframe        Query frame
-    20 sframe        Subject frame
-    21 qseq          Aligned part of query sequence
-    22 sseq          Aligned part of subject sequence
-    23 qlen          Query sequence length
-    24 slen          Subject sequence length
-====== ============= ===========================================
-
-The third option is BLAST XML output, which is designed to be parsed by
-another program, and is understood by some Galaxy tools.
-
-You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
-The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
-The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
-The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
-and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
-
--------
-
-**References**
-
-Marchler-Bauer A, Bryant SH. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W327-31.
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-    </help>
-</tool>
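Two conventions in the rpsblast wrapper above are easy to miss: the `ext` output format is expanded to a fixed `-outfmt` column list, and a value of zero for `max_hits` or `word_size` means "leave the BLAST+ default alone" so the flag is omitted. A minimal Python sketch of that decision logic follows; the function name `search_options` is hypothetical, and only the flags and the column list are taken from the template.

```python
# Column list used by the wrapper for the extended 24 column output.
EXTENDED_OUTFMT = ("6 std sallseqid score nident positive gaps ppos "
                   "qframe sframe qseq sseq qlen slen")


def search_options(out_format, max_hits=0, word_size=0):
    """Return output and advanced option arguments as a list.

    Hypothetical helper for illustration, mirroring the template's rules.
    """
    if out_format == "ext":
        # Quoted as one value in the template, so one list element here.
        args = ["-outfmt", EXTENDED_OUTFMT]
    else:
        # Values such as "0 -html" are unquoted in the template and so
        # expand into separate arguments.
        args = ["-outfmt"] + out_format.split()
    if max_hits > 0:
        args += ["-max_target_seqs", str(max_hits)]
    if word_size > 0:
        args += ["-word_size", str(word_size)]
    return args
```

Keeping the extended column list in one place matches the comment in the template: saved workflows keep working if columns are added later, because they refer to `ext` rather than to the list itself.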
--- a/tools/ncbi_blast_plus/ncbi_rpstblastn_wrapper.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,239 +0,0 @@
-<tool id="ncbi_rpstblastn_wrapper" name="NCBI BLAST+ rpstblastn" version="0.0.4">
-    <description>Search protein domain database (PSSMs) with translated nucleotide query sequence(s)</description>
-    <!-- If job splitting is enabled, break up the query file into parts -->
-    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject" merge_outputs="output1"></parallelism>
-    <requirements>
-        <requirement type="binary">rpstblastn</requirement>
-        <requirement type="package" version="2.2.26+">blast+</requirement>
-    </requirements>
-    <version_command>rpstblastn -version</version_command>
-    <command>
-## The command is a Cheetah template which allows some Python based syntax.
-## Lines starting hash hash are comments. Galaxy will turn newlines into spaces
-rpstblastn
--query "$query"
-#if $db_opts.db_opts_selector == "db":
-  -db "${db_opts.database.fields.path}"
-#elif $db_opts.db_opts_selector == "histdb":
-  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
-#end if
--evalue $evalue_cutoff
--out "$output1"
-##Set the extended list here so if/when we add things, saved workflows are not affected
-#if str($out_format)=="ext":
-    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
-#else:
-    -outfmt $out_format
-#end if
-##Seems rpstblastn does not currently support multiple threads :(
-##-num_threads 8
-#if $adv_opts.adv_opts_selector=="advanced":
-$adv_opts.filter_query
-## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
-## Note -max_target_seqs overrides -num_descriptions and -num_alignments
-#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
--max_target_seqs $adv_opts.max_hits
-#end if
-#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
--word_size $adv_opts.word_size
-#end if
-$adv_opts.parse_deflines
-## End of advanced options:
-#end if
-    </command>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-        <!-- In case the return code has not been set properly, check stderr too -->
-        <regex match="Error:" />
-        <regex match="Exception:" />
-    </stdio>
-    <inputs>
-        <param name="query" type="data" format="fasta" label="Nucleotide query sequence(s)"/>
-        <conditional name="db_opts">
-            <param name="db_opts_selector" type="select" label="Protein domain database (PSSM)">
-              <option value="db" selected="True">Locally installed BLAST database</option>
-              <!-- TODO - define new datatype
-              <option value="histdb">BLAST protein domain database from your history</option>
-              -->
-            </param>
-            <when value="db">
-                <param name="database" type="select" label="Protein domain database">
-                    <options from_file="blastdb_d.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <!-- TODO - define new datatype
-            <when value="histdb">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="data" format="blastdbd" label="Protein domain database" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            -->
-        </conditional>
-        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
-        <param name="out_format" type="select" label="Output format">
-            <option value="6">Tabular (standard 12 columns)</option>
-            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
-            <option value="5">BLAST XML</option>
-            <option value="0">Pairwise text</option>
-            <option value="0 -html">Pairwise HTML</option>
-            <option value="2">Query-anchored text</option>
-            <option value="2 -html">Query-anchored HTML</option>
-            <option value="4">Flat query-anchored text</option>
-            <option value="4 -html">Flat query-anchored HTML</option>
-            <!--
-            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
-            -->
-        </param>
-        <conditional name="adv_opts">
-            <param name="adv_opts_selector" type="select" label="Advanced Options">
-              <option value="basic" selected="True">Hide Advanced Options</option>
-              <option value="advanced">Show Advanced Options</option>
-            </param>
-            <when value="basic" />
-            <when value="advanced">
-                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
-                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="false" />
-                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
-                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
-                    <validator type="in_range" min="0" />
-                </param>
-                <!-- I'd like word_size to be optional, with minimum 2 for rpstblastn -->
-                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
-                    <validator type="in_range" min="0" />
-                </param>
-                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
-            </when>
-        </conditional>
-    </inputs>
-    <outputs>
-        <data name="output1" format="tabular" label="rpstblastn on ${on_string}">
-            <change_format>
-                <when input="out_format" value="0" format="txt"/>
-                <when input="out_format" value="0 -html" format="html"/>
-                <when input="out_format" value="2" format="txt"/>
-                <when input="out_format" value="2 -html" format="html"/>
-                <when input="out_format" value="4" format="txt"/>
-                <when input="out_format" value="4 -html" format="html"/>
-                <when input="out_format" value="5" format="blastxml"/>
-            </change_format>
-        </data>
-    </outputs>
-    <help>
-
-.. class:: warningmark
-
-**Note**. Database searches may take a substantial amount of time.
-For large input datasets it is advisable to allow overnight processing.
-
------
-
-**What it does**
-
-Search a *protein domain database* using a *nucleotide query*,
-using the NCBI BLAST+ rpstblastn command line tool.
-
-The protein domain databases use position-specific scoring matrices
-(PSSMs) and are available for a number of domain collections including:
-
-*CDD* - NCBI curated meta-collection of domains, see
-http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd_help.shtml#NCBI_curated_domains
-
-*Kog* - PSSMs from automatically aligned sequences and sequence
-fragments classified in the KOGs resource, the eukaryotic
-counterpart to COGs, see http://www.ncbi.nlm.nih.gov/COG/new/
-
-*Cog* - PSSMs from automatically aligned sequences and sequence
-fragments classified in the COGs resource, which focuses primarily
-on prokaryotes, see http://www.ncbi.nlm.nih.gov/COG/new/
-
-*Pfam* - PSSMs from Pfam-A seed alignment database, see
-http://pfam.sanger.ac.uk/
-
-*Smart* - PSSMs from SMART domain alignment database, see
-http://smart.embl-heidelberg.de/
-
-*Tigr* - PSSMs from TIGRFAM database of protein families, see
-http://www.jcvi.org/cms/research/projects/tigrfams/overview/
-
-*Prk* - PSSMs from automatically aligned stable clusters in the
-Protein Clusters database, see
-http://www.ncbi.nlm.nih.gov/proteinclusters?cmd=search&amp;db=proteinclusters
-
-The exact list of domain databases offered will depend on how your
-local Galaxy has been configured.
-
------
-
-**Output format**
-
-Because Galaxy focuses on processing tabular data, the default output of this
-tool is tabular. The standard BLAST+ tabular output contains 12 columns:
-
-====== ========= ============================================
-Column NCBI name Description
------- --------- --------------------------------------------
-     1 qseqid    Query Seq-id (ID of your sequence)
-     2 sseqid    Subject Seq-id (ID of the database hit)
-     3 pident    Percentage of identical matches
-     4 length    Alignment length
-     5 mismatch  Number of mismatches
-     6 gapopen   Number of gap openings
-     7 qstart    Start of alignment in query
-     8 qend      End of alignment in query
-     9 sstart    Start of alignment in subject (database hit)
-    10 send      End of alignment in subject (database hit)
-    11 evalue    Expectation value (E-value)
-    12 bitscore  Bit score
-====== ========= ============================================
-
-The BLAST+ tools can optionally output additional columns of information,
-but this takes longer to calculate. Most (but not all) of these columns are
-included by selecting the extended tabular output. The extra columns are
-included *after* the standard 12 columns. This is so that you can write
-workflow filtering steps that accept either the 12 or 24 column tabular
-BLAST output. Galaxy now uses this extended 24 column output by default.
-
-====== ============= ===========================================
-Column NCBI name     Description
------- ------------- -------------------------------------------
-    13 sallseqid     All subject Seq-id(s), separated by a ';'
-    14 score         Raw score
-    15 nident        Number of identical matches
-    16 positive      Number of positive-scoring matches
-    17 gaps          Total number of gaps
-    18 ppos          Percentage of positive-scoring matches
-    19 qframe        Query frame
-    20 sframe        Subject frame
-    21 qseq          Aligned part of query sequence
-    22 sseq          Aligned part of subject sequence
-    23 qlen          Query sequence length
-    24 slen          Subject sequence length
-====== ============= ===========================================
-
-The third option is BLAST XML output, which is designed to be parsed by
-another program, and is understood by some Galaxy tools.
-
-You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
-The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
-The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
-The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
-and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
-
--------
-
-**References**
-
-Marchler-Bauer A, Bryant SH. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W327-31.
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-    </help>
-</tool>
--- a/tools/ncbi_blast_plus/ncbi_tblastn_wrapper.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,340 +0,0 @@
-<tool id="ncbi_tblastn_wrapper" name="NCBI BLAST+ tblastn" version="0.0.20">
-    <description>Search translated nucleotide database with protein query sequence(s)</description>
-    <!-- If job splitting is enabled, break up the query file into parts -->
-    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
-    <requirements>
-        <requirement type="binary">tblastn</requirement>
-        <requirement type="package" version="2.2.26+">blast+</requirement>
-    </requirements>
-    <version_command>tblastn -version</version_command>
-    <command>
-## The command is a Cheetah template, which allows some Python-based syntax.
-## Lines starting with a double hash are comments. Galaxy will turn newlines into spaces.
-tblastn
--query "$query"
-#if $db_opts.db_opts_selector == "db":
-  -db "${db_opts.database.fields.path}"
-#elif $db_opts.db_opts_selector == "histdb":
-  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
-#else:
-  -subject "$db_opts.subject"
-#end if
--evalue $evalue_cutoff
--out "$output1"
-##Set the extended list here so if/when we add things, saved workflows are not affected
-#if str($out_format)=="ext":
-    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
-#else:
-    -outfmt $out_format
-#end if
--num_threads 8
-#if $adv_opts.adv_opts_selector=="advanced":
--db_gencode $adv_opts.db_gencode
-$adv_opts.filter_query
--matrix $adv_opts.matrix
-## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
-## Note -max_target_seqs overrides -num_descriptions and -num_alignments
-#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
--max_target_seqs $adv_opts.max_hits
-#end if
-#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
--word_size $adv_opts.word_size
-#end if
-##Ungapped disabled for now - see comments below
-##$adv_opts.ungapped
-$adv_opts.parse_deflines
-## End of advanced options:
-#end if
-    </command>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-        <!-- In case the return code has not been set properly, check stderr too -->
-        <regex match="Error:" />
-        <regex match="Exception:" />
-    </stdio>
-    <inputs>
-        <param name="query" type="data" format="fasta" label="Protein query sequence(s)"/>
-        <conditional name="db_opts">
-            <param name="db_opts_selector" type="select" label="Subject database/sequences">
-              <option value="db" selected="True">Locally installed BLAST database</option>
-              <option value="histdb">BLAST database from your history</option>
-              <option value="file">FASTA file from your history (see warning note below)</option>
-            </param>
-            <when value="db">
-                <param name="database" type="select" label="Nucleotide BLAST database">
-                    <options from_file="blastdb.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <when value="histdb">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="data" format="blastdbn" label="Nucleotide BLAST database" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <when value="file">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="data" format="fasta" label="Nucleotide FASTA file to use as database"/>
-            </when>
-        </conditional>
-        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
-        <param name="out_format" type="select" label="Output format">
-            <option value="6">Tabular (standard 12 columns)</option>
-            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
-            <option value="5">BLAST XML</option>
-            <option value="0">Pairwise text</option>
-            <option value="0 -html">Pairwise HTML</option>
-            <option value="2">Query-anchored text</option>
-            <option value="2 -html">Query-anchored HTML</option>
-            <option value="4">Flat query-anchored text</option>
-            <option value="4 -html">Flat query-anchored HTML</option>
-            <!--
-            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
-            -->
-        </param>
-        <conditional name="adv_opts">
-            <param name="adv_opts_selector" type="select" label="Advanced Options">
-              <option value="basic" selected="True">Hide Advanced Options</option>
-              <option value="advanced">Show Advanced Options</option>
-            </param>
-            <when value="basic" />
-            <when value="advanced">
-                <param name="db_gencode" type="select" label="Database/subject genetic code">
-                    <!-- See http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for details -->
-                    <option value="1" selected="True">1. Standard</option>
-                    <option value="2">2. Vertebrate Mitochondrial</option>
-                    <option value="3">3. Yeast Mitochondrial</option>
-                    <option value="4">4. Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code</option>
-                    <option value="5">5. Invertebrate Mitochondrial</option>
-                    <option value="6">6. Ciliate, Dasycladacean and Hexamita Nuclear Code</option>
-                    <option value="9">9. Echinoderm Mitochondrial</option>
-                    <option value="10">10. Euplotid Nuclear</option>
-                    <option value="11">11. Bacteria and Archaea</option>
-                    <option value="12">12. Alternative Yeast Nuclear</option>
-                    <option value="13">13. Ascidian Mitochondrial</option>
-                    <option value="14">14. Flatworm Mitochondrial</option>
-                    <option value="15">15. Blepharisma Macronuclear</option>
-                    <option value="16">16. Chlorophycean Mitochondrial Code</option>
-                    <option value="21">21. Trematode Mitochondrial Code</option>
-                    <option value="22">22. Scenedesmus obliquus mitochondrial Code</option>
-                    <option value="23">23. Thraustochytrium Mitochondrial Code</option>
-                    <option value="24">24. Pterobranchia mitochondrial code</option>
-                </param>
-                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
-                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="true" />
-                <param name="matrix" type="select" label="Scoring matrix">
-                    <option value="BLOSUM90">BLOSUM90</option>
-                    <option value="BLOSUM80">BLOSUM80</option>
-                    <option value="BLOSUM62" selected="true">BLOSUM62 (default)</option>
-                    <option value="BLOSUM50">BLOSUM50</option>
-                    <option value="BLOSUM45">BLOSUM45</option>
-                    <option value="PAM250">PAM250</option>
-                    <option value="PAM70">PAM70</option>
-                    <option value="PAM30">PAM30</option>
-                </param>
-                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
-                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
-                    <validator type="in_range" min="0" />
-                </param>
-                <!-- I'd like word_size to be optional, with minimum 2 for tblastn -->
-                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
-                    <validator type="in_range" min="0" />
-                </param>
-                <!--
-                Can't use '-ungapped' on its own, error back is:
-                Composition-adjusted searched are not supported with an ungapped search, please add -comp_based_stats F or do a gapped search
-                Tried using '-ungapped -comp_based_stats F' and tblastn crashed with 'Attempt to access NULL pointer.'
-                <param name="ungapped" type="boolean" label="Perform ungapped alignment only?" truevalue="-ungapped -comp_based_stats F" falsevalue="" checked="false" />
-                -->
-                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
-            </when>
-        </conditional>
-    </inputs>
-    <outputs>
-        <data name="output1" format="tabular" label="tblastn on ${on_string}">
-            <change_format>
-                <when input="out_format" value="0" format="txt"/>
-                <when input="out_format" value="0 -html" format="html"/>
-                <when input="out_format" value="2" format="txt"/>
-                <when input="out_format" value="2 -html" format="html"/>
-                <when input="out_format" value="4" format="txt"/>
-                <when input="out_format" value="4 -html" format="html"/>
-                <when input="out_format" value="5" format="blastxml"/>
-            </change_format>
-        </data>
-    </outputs>
-    <tests>
-        <test>
-            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="rhodopsin_nucs.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-10" />
-            <param name="out_format" value="5" />
-            <param name="adv_opts_selector" value="advanced" />
-            <param name="filter_query" value="false" />
-            <param name="matrix" value="BLOSUM80" />
-            <param name="max_hits" value="0" />
-            <param name="word_size" value="0" />
-            <param name="parse_deflines" value="false" />
-            <output name="output1" file="tblastn_four_human_vs_rhodopsin.xml" ftype="blastxml" />
-        </test>
-        <test>
-            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="rhodopsin_nucs.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-10" />
-            <param name="out_format" value="ext" />
-            <param name="adv_opts_selector" value="advanced" />
-            <param name="filter_query" value="false" />
-            <param name="matrix" value="BLOSUM80" />
-            <param name="max_hits" value="0" />
-            <param name="word_size" value="0" />
-            <param name="parse_deflines" value="false" />
-            <output name="output1" file="tblastn_four_human_vs_rhodopsin_ext.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="rhodopsin_nucs.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-10" />
-            <param name="out_format" value="6" />
-            <param name="adv_opts_selector" value="advanced" />
-            <param name="filter_query" value="false" />
-            <param name="matrix" value="BLOSUM80" />
-            <param name="max_hits" value="0" />
-            <param name="word_size" value="0" />
-            <param name="parse_deflines" value="false" />
-            <output name="output1" file="tblastn_four_human_vs_rhodopsin.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <!-- Same as above, but parse deflines; on BLAST 2.2.25+ through 2.2.27+ this makes no difference -->
-            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="rhodopsin_nucs.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-10" />
-            <param name="out_format" value="6" />
-            <param name="adv_opts_selector" value="advanced" />
-            <param name="filter_query" value="false" />
-            <param name="matrix" value="BLOSUM80" />
-            <param name="max_hits" value="0" />
-            <param name="word_size" value="0" />
-            <param name="parse_deflines" value="true" />
-            <output name="output1" file="tblastn_four_human_vs_rhodopsin.tabular" ftype="tabular" />
-        </test>
-        <test>
-            <param name="query" value="four_human_proteins.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="rhodopsin_nucs.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-10" />
-            <param name="out_format" value="0 -html" />
-            <param name="adv_opts_selector" value="advanced" />
-            <param name="filter_query" value="false" />
-            <param name="matrix" value="BLOSUM80" />
-            <param name="max_hits" value="0" />
-            <param name="word_size" value="0" />
-            <param name="parse_deflines" value="false" />
-            <output name="output1" file="tblastn_four_human_vs_rhodopsin.html" ftype="html" />
-        </test>
-    </tests>
-    <help>
-
-.. class:: warningmark
-
-**Note**. Database searches may take a substantial amount of time.
-For large input datasets it is advisable to allow overnight processing.
-
------
-
-**What it does**
-
-Search a *translated nucleotide database* using a *protein query*,
-using the NCBI BLAST+ tblastn command line tool.
-
-.. class:: warningmark
-
-You can also search against a FASTA file of subject nucleotide
-sequences. This is *not* advised because it is slower (only one
-CPU is used), but more importantly gives e-values for pairwise
-searches (very small e-values which will look overly significant).
-In most cases you should instead turn the other FASTA file into a
-database first using *makeblastdb* and search against that.
-
------
-
-**Output format**
-
-Because Galaxy focuses on processing tabular data, the default output of this
-tool is tabular. The standard BLAST+ tabular output contains 12 columns:
-
-====== ========= ============================================
-Column NCBI name Description
------- --------- --------------------------------------------
-     1 qseqid    Query Seq-id (ID of your sequence)
-     2 sseqid    Subject Seq-id (ID of the database hit)
-     3 pident    Percentage of identical matches
-     4 length    Alignment length
-     5 mismatch  Number of mismatches
-     6 gapopen   Number of gap openings
-     7 qstart    Start of alignment in query
-     8 qend      End of alignment in query
-     9 sstart    Start of alignment in subject (database hit)
-    10 send      End of alignment in subject (database hit)
-    11 evalue    Expectation value (E-value)
-    12 bitscore  Bit score
-====== ========= ============================================
-
-The BLAST+ tools can optionally output additional columns of information,
-but this takes longer to calculate. Most (but not all) of these columns are
-included by selecting the extended tabular output. The extra columns are
-included *after* the standard 12 columns. This is so that you can write
-workflow filtering steps that accept either the 12 or 24 column tabular
-BLAST output. Galaxy now uses this extended 24 column output by default.
-
-====== ============= ===========================================
-Column NCBI name     Description
------- ------------- -------------------------------------------
-    13 sallseqid     All subject Seq-id(s), separated by a ';'
-    14 score         Raw score
-    15 nident        Number of identical matches
-    16 positive      Number of positive-scoring matches
-    17 gaps          Total number of gaps
-    18 ppos          Percentage of positive-scoring matches
-    19 qframe        Query frame
-    20 sframe        Subject frame
-    21 qseq          Aligned part of query sequence
-    22 sseq          Aligned part of subject sequence
-    23 qlen          Query sequence length
-    24 slen          Subject sequence length
-====== ============= ===========================================
-
-The third option is BLAST XML output, which is designed to be parsed by
-another program, and is understood by some Galaxy tools.
-
-You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
-The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
-The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
-The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
-and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
-
--------
-
-**References**
-
-Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-    </help>
-</tool>
--- a/tools/ncbi_blast_plus/ncbi_tblastx_wrapper.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,294 +0,0 @@
-<tool id="ncbi_tblastx_wrapper" name="NCBI BLAST+ tblastx" version="0.0.20">
-    <description>Search translated nucleotide database with translated nucleotide query sequence(s)</description>
-    <!-- If job splitting is enabled, break up the query file into parts -->
-    <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
-    <requirements>
-        <requirement type="binary">tblastx</requirement>
-        <requirement type="package" version="2.2.26+">blast+</requirement>
-    </requirements>
-    <version_command>tblastx -version</version_command>
-    <command>
-## The command is a Cheetah template, which allows some Python-based syntax.
-## Lines starting with a double hash are comments. Galaxy will turn newlines into spaces.
-tblastx
--query "$query"
-#if $db_opts.db_opts_selector == "db":
-  -db "${db_opts.database.fields.path}"
-#elif $db_opts.db_opts_selector == "histdb":
-  -db "${os.path.join($db_opts.histdb.extra_files_path,'blastdb')}"
-#else:
-  -subject "$db_opts.subject"
-#end if
--query_gencode $query_gencode
--evalue $evalue_cutoff
--out "$output1"
-##Set the extended list here so if/when we add things, saved workflows are not affected
-#if str($out_format)=="ext":
-    -outfmt "6 std sallseqid score nident positive gaps ppos qframe sframe qseq sseq qlen slen"
-#else:
-    -outfmt $out_format
-#end if
--num_threads 8
-#if $adv_opts.adv_opts_selector=="advanced":
--db_gencode $adv_opts.db_gencode
-$adv_opts.filter_query
-$adv_opts.strand
--matrix $adv_opts.matrix
-## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
-## Note -max_target_seqs overrides -num_descriptions and -num_alignments
-#if (str($adv_opts.max_hits) and int(str($adv_opts.max_hits)) > 0):
--max_target_seqs $adv_opts.max_hits
-#end if
-#if (str($adv_opts.word_size) and int(str($adv_opts.word_size)) > 0):
--word_size $adv_opts.word_size
-#end if
-$adv_opts.parse_deflines
-## End of advanced options:
-#end if
-    </command>
-    <stdio>
-        <!-- Anything other than zero is an error -->
-        <exit_code range="1:" />
-        <exit_code range=":-1" />
-        <!-- In case the return code has not been set properly, check stderr too -->
-        <regex match="Error:" />
-        <regex match="Exception:" />
-    </stdio>
-    <inputs>
-        <param name="query" type="data" format="fasta" label="Nucleotide query sequence(s)"/>
-        <conditional name="db_opts">
-            <param name="db_opts_selector" type="select" label="Subject database/sequences">
-              <option value="db" selected="True">Locally installed BLAST database</option>
-              <option value="histdb">BLAST database from your history</option>
-              <option value="file">FASTA file from your history (see warning note below)</option>
-            </param>
-            <when value="db">
-                <param name="database" type="select" label="Nucleotide BLAST database">
-                    <options from_file="blastdb.loc">
-                      <column name="value" index="0"/>
-                      <column name="name" index="1"/>
-                      <column name="path" index="2"/>
-                    </options>
-                </param>
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <when value="histdb">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="data" format="blastdbn" label="Nucleotide BLAST database" />
-                <param name="subject" type="hidden" value="" />
-            </when>
-            <when value="file">
-                <param name="database" type="hidden" value="" />
-                <param name="histdb" type="hidden" value="" />
-                <param name="subject" type="data" format="fasta" label="Nucleotide FASTA file to use as database"/>
-            </when>
-        </conditional>
-        <param name="query_gencode" type="select" label="Query genetic code">
-            <!-- See http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for details -->
-            <option value="1" selected="True">1. Standard</option>
-            <option value="2">2. Vertebrate Mitochondrial</option>
-            <option value="3">3. Yeast Mitochondrial</option>
-            <option value="4">4. Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code</option>
-            <option value="5">5. Invertebrate Mitochondrial</option>
-            <option value="6">6. Ciliate, Dasycladacean and Hexamita Nuclear Code</option>
-            <option value="9">9. Echinoderm Mitochondrial</option>
-            <option value="10">10. Euplotid Nuclear</option>
-            <option value="11">11. Bacteria and Archaea</option>
-            <option value="12">12. Alternative Yeast Nuclear</option>
-            <option value="13">13. Ascidian Mitochondrial</option>
-            <option value="14">14. Flatworm Mitochondrial</option>
-            <option value="15">15. Blepharisma Macronuclear</option>
-            <option value="16">16. Chlorophycean Mitochondrial Code</option>
-            <option value="21">21. Trematode Mitochondrial Code</option>
-            <option value="22">22. Scenedesmus obliquus mitochondrial Code</option>
-            <option value="23">23. Thraustochytrium Mitochondrial Code</option>
-            <option value="24">24. Pterobranchia mitochondrial code</option>
-        </param>
-        <param name="evalue_cutoff" type="float" size="15" value="0.001" label="Set expectation value cutoff" />
-        <param name="out_format" type="select" label="Output format">
-            <option value="6">Tabular (standard 12 columns)</option>
-            <option value="ext" selected="True">Tabular (extended 24 columns)</option>
-            <option value="5">BLAST XML</option>
-            <option value="0">Pairwise text</option>
-            <option value="0 -html">Pairwise HTML</option>
-            <option value="2">Query-anchored text</option>
-            <option value="2 -html">Query-anchored HTML</option>
-            <option value="4">Flat query-anchored text</option>
-            <option value="4 -html">Flat query-anchored HTML</option>
-            <!--
-            <option value="-outfmt 11">BLAST archive format (ASN.1)</option>
-            -->
-        </param>
-        <conditional name="adv_opts">
-            <param name="adv_opts_selector" type="select" label="Advanced Options">
-              <option value="basic" selected="True">Hide Advanced Options</option>
-              <option value="advanced">Show Advanced Options</option>
-            </param>
-            <when value="basic" />
-            <when value="advanced">
-                <param name="db_gencode" type="select" label="Database/subject genetic code">
-                    <!-- See http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for details -->
-                    <option value="1" selected="True">1. Standard</option>
-                    <option value="2">2. Vertebrate Mitochondrial</option>
-                    <option value="3">3. Yeast Mitochondrial</option>
-                    <option value="4">4. Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code</option>
-                    <option value="5">5. Invertebrate Mitochondrial</option>
-                    <option value="6">6. Ciliate, Dasycladacean and Hexamita Nuclear Code</option>
-                    <option value="9">9. Echinoderm Mitochondrial</option>
-                    <option value="10">10. Euplotid Nuclear</option>
-                    <option value="11">11. Bacteria and Archaea</option>
-                    <option value="12">12. Alternative Yeast Nuclear</option>
-                    <option value="13">13. Ascidian Mitochondrial</option>
-                    <option value="14">14. Flatworm Mitochondrial</option>
-                    <option value="15">15. Blepharisma Macronuclear</option>
-                    <option value="16">16. Chlorophycean Mitochondrial Code</option>
-                    <option value="21">21. Trematode Mitochondrial Code</option>
-                    <option value="22">22. Scenedesmus obliquus mitochondrial Code</option>
-                    <option value="23">23. Thraustochytrium Mitochondrial Code</option>
-                    <option value="24">24. Pterobranchia mitochondrial code</option>
-                </param>
-                <!-- Could use a select (yes, no, other) where other allows setting 'window locut hicut' -->
-                <param name="filter_query" type="boolean" label="Filter out low complexity regions (with SEG)" truevalue="-seg yes" falsevalue="-seg no" checked="true" />
-                <param name="strand" type="select" label="Query strand(s) to search against database/subject">
-                    <option value="-strand both">Both</option>
-                    <option value="-strand plus">Plus (forward)</option>
-                    <option value="-strand minus">Minus (reverse complement)</option>
-                </param>
-                <param name="matrix" type="select" label="Scoring matrix">
-                    <option value="BLOSUM90">BLOSUM90</option>
-                    <option value="BLOSUM80">BLOSUM80</option>
-                    <option value="BLOSUM62" selected="true">BLOSUM62 (default)</option>
-                    <option value="BLOSUM50">BLOSUM50</option>
-                    <option value="BLOSUM45">BLOSUM45</option>
-                    <option value="PAM250">PAM250</option>
-                    <option value="PAM70">PAM70</option>
-                    <option value="PAM30">PAM30</option>
-                </param>
-                <!-- Why doesn't optional override a validator? I want to accept an empty string OR a non-negative integer -->
-                <param name="max_hits" type="integer" value="0" label="Maximum hits to show" help="Use zero for default limits">
-                    <validator type="in_range" min="0" />
-                </param>
-                <!-- I'd like word_size to be optional, with minimum 2 for tblastx -->
-                <param name="word_size" type="integer" value="0" label="Word size for wordfinder algorithm" help="Use zero for default, otherwise minimum 2.">
-                    <validator type="in_range" min="0" />
-                </param>
-                <param name="parse_deflines" type="boolean" label="Should the query and subject defline(s) be parsed?" truevalue="-parse_deflines" falsevalue="" checked="false" help="This affects the formatting of the query/subject ID strings"/>
-            </when>
-        </conditional>
-    </inputs>
-    <outputs>
-        <data name="output1" format="tabular" label="tblastx on ${on_string}">
-            <change_format>
-                <when input="out_format" value="0" format="txt"/>
-                <when input="out_format" value="0 -html" format="html"/>
-                <when input="out_format" value="2" format="txt"/>
-                <when input="out_format" value="2 -html" format="html"/>
-                <when input="out_format" value="4" format="txt"/>
-                <when input="out_format" value="4 -html" format="html"/>
-                <when input="out_format" value="5" format="blastxml"/>
-            </change_format>
-        </data>
-    </outputs>
-    <tests>
-        <test>
-            <param name="query" value="rhodopsin_nucs.fasta" ftype="fasta" />
-            <param name="db_opts_selector" value="file" />
-            <param name="subject" value="three_human_mRNA.fasta" ftype="fasta" />
-            <param name="database" value="" />
-            <param name="evalue_cutoff" value="1e-40" />
-            <param name="out_format" value="6" />
-            <param name="adv_opts_selector" value="basic" />
-            <output name="output1" file="tblastx_rhodopsin_vs_three_human.tabular" ftype="tabular" />
-        </test>
-    </tests>
-    <help>
-
-.. class:: warningmark
-
-**Note**. Database searches may take a substantial amount of time.
-For large input datasets it is advisable to allow overnight processing.
-
------
-
-**What it does**
-
-Search a *translated nucleotide database* using a *translated nucleotide query*,
-using the NCBI BLAST+ tblastx command line tool.
-
-.. class:: warningmark
-
-You can also search against a FASTA file of subject nucleotide
-sequences. This is *not* advised because it is slower (only one
-CPU is used), but more importantly gives e-values for pairwise
-searches (very small e-values which will look overly significant).
-In most cases you should instead turn the other FASTA file into a
-database first using *makeblastdb* and search against that.
-
------
-
-**Output format**
-
-Because Galaxy focuses on processing tabular data, the default output of this
-tool is tabular. The standard BLAST+ tabular output contains 12 columns:
-
-====== ========= ============================================
-Column NCBI name Description
------- --------- --------------------------------------------
-     1 qseqid    Query Seq-id (ID of your sequence)
-     2 sseqid    Subject Seq-id (ID of the database hit)
-     3 pident    Percentage of identical matches
-     4 length    Alignment length
-     5 mismatch  Number of mismatches
-     6 gapopen   Number of gap openings
-     7 qstart    Start of alignment in query
-     8 qend      End of alignment in query
-     9 sstart    Start of alignment in subject (database hit)
-    10 send      End of alignment in subject (database hit)
-    11 evalue    Expectation value (E-value)
-    12 bitscore  Bit score
-====== ========= ============================================
-
-The BLAST+ tools can optionally output additional columns of information,
-but this takes longer to calculate. Most (but not all) of these columns are
-included by selecting the extended tabular output. The extra columns are
-included *after* the standard 12 columns. This is so that you can write
-workflow filtering steps that accept either the 12 or 24 column tabular
-BLAST output. Galaxy now uses this extended 24 column output by default.
-
-====== ============= ===========================================
-Column NCBI name     Description
------- ------------- -------------------------------------------
-    13 sallseqid     All subject Seq-id(s), separated by a ';'
-    14 score         Raw score
-    15 nident        Number of identical matches
-    16 positive      Number of positive-scoring matches
-    17 gaps          Total number of gaps
-    18 ppos          Percentage of positive-scoring matches
-    19 qframe        Query frame
-    20 sframe        Subject frame
-    21 qseq          Aligned part of query sequence
-    22 sseq          Aligned part of subject sequence
-    23 qlen          Query sequence length
-    24 slen          Subject sequence length
-====== ============= ===========================================
-
-The third option is BLAST XML output, which is designed to be parsed by
-another program, and is understood by some Galaxy tools.
-
-You can also choose several plain text or HTML output formats which are designed to be read by a person (not by another program).
-The HTML versions use basic webpage formatting and can include links to the hits on the NCBI website.
-The pairwise output (the default on the NCBI BLAST website) shows each match as a pairwise alignment with the query.
-The two query anchored outputs show a multiple sequence alignment between the query and all the matches,
-and differ in how insertions are shown (marked as insertions or with gap characters added to the other sequences).
-
--------
-
-**References**
-
-Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997. Nucleic Acids Res. 25:3389-3402.
-
-This wrapper is available to install into other Galaxy Instances via the Galaxy
-Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
-    </help>
-</tool>
--- a/tools/ncbi_blast_plus/repository_dependencies.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,5 +0,0 @@
-<?xml version="1.0"?>
-<repositories description="This requires the BLAST datatype definitions (e.g. the BLAST XML format).">
-<!-- Revision 4:f9a7783ed7b6 on the main (and test) tool shed is v0.0.14 which added BLAST databases -->
-<repository changeset_revision="f9a7783ed7b6" name="blast_datatypes" owner="devteam" toolshed="http://testtoolshed.g2.bx.psu.edu" />
-</repositories>
--- a/tools/ncbi_blast_plus/tool_dependencies.xml	Wed May 29 10:03:48 2013 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,20 +0,0 @@
-<?xml version="1.0"?>
-<tool_dependency>
-    <package name="blast+" version="2.2.26+">
-        <install version="1.0">
-            <actions>
-                <action type="download_by_url">ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.26/ncbi-blast-2.2.26+-src.tar.gz</action>
-                <action type="shell_command">cd c++ &amp;&amp; ./configure --prefix=$INSTALL_DIR &amp;&amp; make &amp;&amp; make install</action>
-                <action type="set_environment">
-                    <environment_variable name="PATH" action="prepend_to">$INSTALL_DIR/bin</environment_variable>
-                </action>
-            </actions>
-        </install>
-        <readme>
-Downloads and compiles BLAST+ from the NCBI, which assumes you have
-all the required build dependencies installed. See:
-http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&amp;PAGE_TYPE=BlastDocs&amp;DOC_TYPE=Download
-        </readme>
-    </package>
-</tool_dependency>
-