--- a/tools/ncbi_blast_plus/ncbi_blast_plus.txt	Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blast_plus.txt	Wed Apr 17 09:44:44 2013 -0400
@@ -1,7 +1,7 @@
 Galaxy wrappers for NCBI BLAST+ suite
 =====================================

-These wrappers are copyright 2010-2012 by Peter Cock, The James Hutton Institute
+These wrappers are copyright 2010-2013 by Peter Cock, The James Hutton Institute
 (formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
 See the licence text below.

@@ -15,6 +15,20 @@
 with this.


+Automated Installation
+======================
+
+Galaxy should be able to automatically install the dependencies, i.e. the
+'blast_datatypes' repository which defines the BLAST XML file format
+('blastxml') and protein and nucleotide BLAST databases ('blastdbp' and
+'blastdbn').
+
+You must tell Galaxy about any system level BLAST databases using configuration
+files blastdb.loc (nucleotide databases like NT) and blastdb_p.loc (protein
+databases like NR), located in the tool-data folder. Sample files are included
+which explain the tab-based format to use.
+
+
 Manual Installation
 ===================

@@ -22,12 +36,14 @@
 the XML and Python files under tools/ncbi_blast_plus and add the XML files
 to your tool_conf.xml as normal.

+You will also need to install 'blast_datatypes' from the Tool Shed. This
+defines the BLAST XML file format ('blastxml') and protein and nucleotide
+BLAST database composite file formats ('blastdbp' and 'blastdbn').
+
 You must tell Galaxy about any system level BLAST databases using configuration
 files blastdb.loc (nucleotide databases like NT) and blastdb_p.loc (protein
-databases like NR).
-
-You will also need to install 'blast_datatypes' from the Tool Shed. This
-defines the BLAST XML file format ('blastxml').
+databases like NR), located in the tool-data folder. Sample files are included
+which explain the tab-based format to use.


 History
@@ -42,8 +58,13 @@
           e-values
 v0.0.13 - Use the new error handling options in Galaxy (the previously
           bundled hide_stderr.py script is no longer needed).
-v0.0.14 - Support for makeblastdb and local BLAST databases in the history
-          (using work from Edward Kirton).
+v0.0.14 - Support for makeblastdb and blastdbinfo with local BLAST databases
+          in the history (using work from Edward Kirton), requires v0.0.14
+          of the 'blast_datatypes' repository from the Tool Shed.
+v0.0.15 - Stronger warning in help text against searching against subject
+          FASTA files (better looking e-values than you might be expecting).
+v0.0.16 - Added repository_dependencies.xml for automated installation of the
+          'blast_datatypes' repository from the Tool Shed.


 Developers
@@ -82,5 +103,5 @@
 OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE
 OR PERFORMANCE OF THIS SOFTWARE.

-NOTE: This is the licence for the Galaxy Wrapper only. BLAST+ and
+NOTE: This is the licence for the Galaxy Wrapper only. NCBI BLAST+ and
 associated data files are available and licenced separately.
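
Note on the .loc files mentioned in the README hunks above: the text only says the format is tab based and points to the bundled sample files. The following is a minimal sketch of a sanity check for such a file, assuming a three-column layout (unique key, display caption, database base path). That layout, the `BlastDbEntry` and `parse_loc` names, and the example entry are all assumptions for illustration; check the sample files for the authoritative format.

```python
from typing import List, NamedTuple


class BlastDbEntry(NamedTuple):
    """One row of a .loc file. The three-column layout is an assumption."""
    key: str
    caption: str
    path: str


def parse_loc(text: str) -> List[BlastDbEntry]:
    """Parse tab-separated .loc text, skipping blank lines and '#' comments."""
    entries = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            # A common mistake is separating columns with spaces instead of tabs.
            raise ValueError(
                f"line {line_number}: expected 3 tab-separated fields, "
                f"got {len(fields)}"
            )
        entries.append(BlastDbEntry(*fields))
    return entries


# Hypothetical entry, not taken from the repository.
sample = "# example only\nnt_example\tNCBI NT (example)\t/data/blastdb/nt\n"
entries = parse_loc(sample)
print(entries[0].path)
```

Rejecting rows with the wrong field count catches space-separated columns early, before Galaxy presents a broken database list.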
--- a/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml	Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml	Wed Apr 17 09:44:44 2013 -0400
@@ -1,4 +1,4 @@
-<tool id="ncbi_blastn_wrapper" name="NCBI BLAST+ blastn" version="0.0.14">
+<tool id="ncbi_blastn_wrapper" name="NCBI BLAST+ blastn" version="0.0.15">
     <description>Search nucleotide database with nucleotide query sequence(s)</description>
     <!-- If job splitting is enabled, break up the query file into parts -->
     <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
@@ -55,7 +55,7 @@
             <param name="db_opts_selector" type="select" label="Subject database/sequences">
               <option value="db" selected="True">BLAST Database</option>
               <option value="histdb">BLAST database from your history</option>
-              <option value="file">FASTA file from your history (pairwise e-values)</option>
+              <option value="file">FASTA file from your history (see warning note below)</option>
             </param>
             <when value="db">
                 <param name="database" type="select" label="Nucleotide BLAST database">
@@ -162,6 +162,15 @@
 using the NCBI BLAST+ blastn command line tool.
 Algorithms include blastn, megablast, and discontiguous megablast.

+.. class:: warningmark
+
+You can also search against a FASTA file of subject nucleotide
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly significant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
 -----

 **Output format**
--- a/tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml	Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml	Wed Apr 17 09:44:44 2013 -0400
@@ -1,4 +1,4 @@
-<tool id="ncbi_blastp_wrapper" name="NCBI BLAST+ blastp" version="0.0.14">
+<tool id="ncbi_blastp_wrapper" name="NCBI BLAST+ blastp" version="0.0.15">
     <description>Search protein database with protein query sequence(s)</description>
     <!-- If job splitting is enabled, break up the query file into parts -->
     <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
@@ -56,7 +56,7 @@
             <param name="db_opts_selector" type="select" label="Subject database/sequences">
               <option value="db" selected="True">BLAST Database</option>
               <option value="histdb">BLAST database from your history</option>
-              <option value="file">FASTA file from your history (pairwise e-values)</option>
+              <option value="file">FASTA file from your history (see warning note below)</option>
             </param>
             <when value="db">
                 <param name="database" type="select" label="Protein BLAST database">
@@ -227,6 +227,15 @@
 Search a *protein database* using a *protein query*,
 using the NCBI BLAST+ blastp command line tool.

+.. class:: warningmark
+
+You can also search against a FASTA file of subject protein
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly significant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
 -----

 **Output format**
--- a/tools/ncbi_blast_plus/ncbi_blastx_wrapper.xml	Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastx_wrapper.xml	Wed Apr 17 09:44:44 2013 -0400
@@ -1,4 +1,4 @@
-<tool id="ncbi_blastx_wrapper" name="NCBI BLAST+ blastx" version="0.0.14">
+<tool id="ncbi_blastx_wrapper" name="NCBI BLAST+ blastx" version="0.0.15">
     <description>Search protein database with translated nucleotide query sequence(s)</description>
     <!-- If job splitting is enabled, break up the query file into parts -->
     <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
@@ -56,7 +56,7 @@
             <param name="db_opts_selector" type="select" label="Subject database/sequences">
               <option value="db" selected="True">BLAST Database</option>
               <option value="histdb">BLAST database from your history</option>
-              <option value="file">FASTA file from your history (pairwise e-values)</option>
+              <option value="file">FASTA file from your history (see warning note below)</option>
             </param>
             <when value="db">
                 <param name="database" type="select" label="Protein BLAST database">
@@ -215,6 +215,15 @@
 Search a *protein database* using a *translated nucleotide query*,
 using the NCBI BLAST+ blastx command line tool.

+.. class:: warningmark
+
+You can also search against a FASTA file of subject protein
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly significant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
 -----

 **Output format**
--- a/tools/ncbi_blast_plus/ncbi_tblastn_wrapper.xml	Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_tblastn_wrapper.xml	Wed Apr 17 09:44:44 2013 -0400
@@ -1,4 +1,4 @@
-<tool id="ncbi_tblastn_wrapper" name="NCBI BLAST+ tblastn" version="0.0.14">
+<tool id="ncbi_tblastn_wrapper" name="NCBI BLAST+ tblastn" version="0.0.15">
     <description>Search translated nucleotide database with protein query sequence(s)</description>
     <!-- If job splitting is enabled, break up the query file into parts -->
     <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
@@ -56,7 +56,7 @@
             <param name="db_opts_selector" type="select" label="Subject database/sequences">
               <option value="db" selected="True">BLAST Database</option>
               <option value="histdb">BLAST database from your history</option>
-              <option value="file">FASTA file from your history (pairwise e-values)</option>
+              <option value="file">FASTA file from your history (see warning note below)</option>
             </param>
             <when value="db">
                 <param name="database" type="select" label="Nucleotide BLAST database">
@@ -261,6 +261,15 @@
 Search a *translated nucleotide database* using a *protein query*,
 using the NCBI BLAST+ tblastn command line tool.

+.. class:: warningmark
+
+You can also search against a FASTA file of subject nucleotide
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly significant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
 -----

 **Output format**
--- a/tools/ncbi_blast_plus/ncbi_tblastx_wrapper.xml	Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_tblastx_wrapper.xml	Wed Apr 17 09:44:44 2013 -0400
@@ -1,4 +1,4 @@
-<tool id="ncbi_tblastx_wrapper" name="NCBI BLAST+ tblastx" version="0.0.14">
+<tool id="ncbi_tblastx_wrapper" name="NCBI BLAST+ tblastx" version="0.0.15">
     <description>Search translated nucleotide database with translated nucleotide query sequence(s)</description>
     <!-- If job splitting is enabled, break up the query file into parts -->
     <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
@@ -56,7 +56,7 @@
             <param name="db_opts_selector" type="select" label="Subject database/sequences">
               <option value="db" selected="True">BLAST Database</option>
               <option value="histdb">BLAST database from your history</option>
-              <option value="file">FASTA file from your history (pairwise e-values)</option>
+              <option value="file">FASTA file from your history (see warning note below)</option>
             </param>
             <when value="db">
                 <param name="database" type="select" label="Nucleotide BLAST database">
@@ -203,6 +203,15 @@
 Search a *translated nucleotide database* using a *translated nucleotide query*,
 using the NCBI BLAST+ tblastx command line tool.

+.. class:: warningmark
+
+You can also search against a FASTA file of subject nucleotide
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly significant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
 -----

 **Output format**
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/ncbi_blast_plus/repository_dependencies.xml	Wed Apr 17 09:44:44 2013 -0400
@@ -0,0 +1,5 @@
+<?xml version="1.0"?>
+<repositories description="This requires the BLAST datatype definitions (e.g. the BLAST XML format).">
+<!-- Revision 4:f9a7783ed7b6 on the main tool shed is v0.0.14 which added BLAST databases -->
+<repository toolshed="http://toolshed.g2.bx.psu.edu" name="blast_datatypes" owner="devteam" changeset_revision="f9a7783ed7b6" />
+</repositories>
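
Note on repository_dependencies.xml: the new file pins 'blast_datatypes' to one Tool Shed changeset. As a rough sketch of how that pin could be checked before publishing, the snippet below parses the same XML with Python's standard library and lists each dependency. The `list_dependencies` helper and the `REQUIRED` attribute set are illustrative assumptions, not part of Galaxy or the Tool Shed.

```python
import xml.etree.ElementTree as ET

# XML content copied from the new repository_dependencies.xml (comment omitted).
REPOSITORY_DEPENDENCIES = """<?xml version="1.0"?>
<repositories description="This requires the BLAST datatype definitions (e.g. the BLAST XML format).">
<repository toolshed="http://toolshed.g2.bx.psu.edu" name="blast_datatypes" owner="devteam" changeset_revision="f9a7783ed7b6" />
</repositories>
"""

# Attributes this sketch expects on every <repository> element (an assumption
# based on the file above, not on a published schema).
REQUIRED = ("toolshed", "name", "owner", "changeset_revision")


def list_dependencies(xml_text):
    """Return one dict per <repository> element, checking required attributes."""
    root = ET.fromstring(xml_text)
    dependencies = []
    for repo in root.findall("repository"):
        missing = [attr for attr in REQUIRED if attr not in repo.attrib]
        if missing:
            raise ValueError(f"repository element is missing: {', '.join(missing)}")
        dependencies.append({attr: repo.attrib[attr] for attr in REQUIRED})
    return dependencies


deps = list_dependencies(REPOSITORY_DEPENDENCIES)
for dep in deps:
    print(f"{dep['owner']}/{dep['name']} @ {dep['changeset_revision']}")
```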