Mercurial > repos > peterjc > ncbi_blast_plus
changeset 2:b70b142bbc39 draft
Uploaded v0.0.16
| author | peterjc | 
|---|---|
| date | Wed, 17 Apr 2013 09:44:44 -0400 | 
| parents | c84837116457 | 
| children | cf4903f5c81f | 
| files | tools/ncbi_blast_plus/ncbi_blast_plus.txt tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml tools/ncbi_blast_plus/ncbi_blastx_wrapper.xml tools/ncbi_blast_plus/ncbi_tblastn_wrapper.xml tools/ncbi_blast_plus/ncbi_tblastx_wrapper.xml tools/ncbi_blast_plus/repository_dependencies.xml | 
| diffstat | 7 files changed, 89 insertions(+), 18 deletions(-) | 
--- a/tools/ncbi_blast_plus/ncbi_blast_plus.txt Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blast_plus.txt Wed Apr 17 09:44:44 2013 -0400
@@ -1,7 +1,7 @@
 Galaxy wrappers for NCBI BLAST+ suite
 =====================================
 
-These wrappers are copyright 2010-2012 by Peter Cock, The James Hutton Institute
+These wrappers are copyright 2010-2013 by Peter Cock, The James Hutton Institute
 (formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
 See the licence text below.
 
@@ -15,6 +15,20 @@
 with this.
 
 
+Automated Installation
+======================
+
+Galaxy should be able to automatically install the dependencies, i.e. the
+'blast_datatypes' repository which defines the BLAST XML file format
+('blastxml') and protein and nucleotide BLAST databases ('blastdbp' and
+'blastdbn').
+
+You must tell Galaxy about any system level BLAST databases using configuration
+files blastdb.loc (nucleotide databases like NT) and blastdb_p.loc (protein
+databases like NR), located in the tool-data folder. Sample fils are included
+which explain the tab based format to use.
+
+
 Manual Installation
 ===================
 
@@ -22,12 +36,14 @@
 the XML and Python files under tools/ncbi_blast_plus and add the XML files to
 your tool_conf.xml as normal.
 
+You will also need to install 'blast_datatypes' from the Tool Shed. This
+defines the BLAST XML file format ('blastxml') and protein and nucleotide
+BLAST databases composite file formats ('blastdbp' and 'blastdbn').
+
 You must tell Galaxy about any system level BLAST databases using configuration
 files blastdb.loc (nucleotide databases like NT) and blastdb_p.loc (protein
-databases like NR).
-
-You will also need to install 'blast_datatypes' from the Tool Shed. This
-defines the BLAST XML file format ('blastxml').
+databases like NR), located in the tool-data folder. Sample fils are included
+which explain the tab based format to use.
 
 
 History
@@ -42,8 +58,13 @@
           e-values
 v0.0.13 - Use the new error handling options in Galaxy (the previously
           bundled hide_stderr.py script is no longer needed).
-v0.0.14 - Support for makeblastdb and local BLAST databases in the history
-          (using work from Edward Kirton).
+v0.0.14 - Support for makeblastdb and blastdbinfo with local BLAST databases
+          in the history (using work from Edward Kirton), requires v0.0.14
+          of the 'blast_datatypes' repository from the Tool Shed.
+v0.0.15 - Stronger warning in help text against searching against subject
+          FASTA files (better looking e-values than you might be expecting).
+v0.0.16 - Added repository_dependencies.xml for automates installation of the
+          'blast_datatypes' repository from the Tool Shed.
 
 
 Developers
@@ -82,5 +103,5 @@
 OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE
 OR PERFORMANCE OF THIS SOFTWARE.
 
-NOTE: This is the licence for the Galaxy Wrapper only. BLAST+ and
+NOTE: This is the licence for the Galaxy Wrapper only. NCBI BLAST+ and
 associated data files are available and licenced separately.
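The README text in this diff says that blastdb.loc and blastdb_p.loc use a tab based format explained in the bundled sample files. As a rough sketch only, the Python below reads such a file assuming three tab-separated columns (a unique ID, a display caption, and the database path prefix). That column layout, the `parse_loc` helper and the sample entry are assumptions made for illustration and are not part of this changeset; check the sample files for the authoritative format.

```python
from typing import List, NamedTuple


class BlastDbEntry(NamedTuple):
    """One row of a .loc file (the three-column layout is an assumption)."""
    unique_id: str
    caption: str
    path: str


def parse_loc(text: str) -> List[BlastDbEntry]:
    """Parse tab-separated .loc content, skipping blank lines and # comments."""
    entries = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError("expected 3 tab-separated fields: %r" % line)
        entries.append(BlastDbEntry(*(f.strip() for f in fields[:3])))
    return entries


# Hypothetical entry: the ID, caption and path are made up for illustration.
SAMPLE = "# comment line\nnt_example\tNCBI NT (example)\t/data/blastdb/nt\n"
print(parse_loc(SAMPLE))
```

A check like this can catch entries that were typed with spaces instead of tabs before Galaxy is restarted.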
--- a/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml Wed Apr 17 09:44:44 2013 -0400
@@ -1,4 +1,4 @@
-<tool id="ncbi_blastn_wrapper" name="NCBI BLAST+ blastn" version="0.0.14">
+<tool id="ncbi_blastn_wrapper" name="NCBI BLAST+ blastn" version="0.0.15">
 <description>Search nucleotide database with nucleotide query sequence(s)</description>
 <!-- If job splitting is enabled, break up the query file into parts -->
 <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
@@ -55,7 +55,7 @@
 <param name="db_opts_selector" type="select" label="Subject database/sequences">
 <option value="db" selected="True">BLAST Database</option>
 <option value="histdb">BLAST database from your history</option>
-<option value="file">FASTA file from your history (pairwise e-values)</option>
+<option value="file">FASTA file from your history (see warning note below)</option>
 </param>
 <when value="db">
 <param name="database" type="select" label="Nucleotide BLAST database">
@@ -162,6 +162,15 @@
 using the NCBI BLAST+ blastn command line tool. Algorithms include blastn,
 megablast, and discontiguous megablast.
 
+.. class:: warningmark
+
+You can also search against a FASTA file of subject nucleotide
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly signficiant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
 -----
 
 **Output format**
--- a/tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml Wed Apr 17 09:44:44 2013 -0400
@@ -1,4 +1,4 @@
-<tool id="ncbi_blastp_wrapper" name="NCBI BLAST+ blastp" version="0.0.14">
+<tool id="ncbi_blastp_wrapper" name="NCBI BLAST+ blastp" version="0.0.15">
 <description>Search protein database with protein query sequence(s)</description>
 <!-- If job splitting is enabled, break up the query file into parts -->
 <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
@@ -56,7 +56,7 @@
 <param name="db_opts_selector" type="select" label="Subject database/sequences">
 <option value="db" selected="True">BLAST Database</option>
 <option value="histdb">BLAST database from your history</option>
-<option value="file">FASTA file from your history (pairwise e-values)</option>
+<option value="file">FASTA file from your history (see warning note below)</option>
 </param>
 <when value="db">
 <param name="database" type="select" label="Protein BLAST database">
@@ -227,6 +227,15 @@
 Search a *protein database* using a *protein query*,
 using the NCBI BLAST+ blastp command line tool.
 
+.. class:: warningmark
+
+You can also search against a FASTA file of subject protein
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly signficiant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
 -----
 
 **Output format**
--- a/tools/ncbi_blast_plus/ncbi_blastx_wrapper.xml Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastx_wrapper.xml Wed Apr 17 09:44:44 2013 -0400
@@ -1,4 +1,4 @@
-<tool id="ncbi_blastx_wrapper" name="NCBI BLAST+ blastx" version="0.0.14">
+<tool id="ncbi_blastx_wrapper" name="NCBI BLAST+ blastx" version="0.0.15">
 <description>Search protein database with translated nucleotide query sequence(s)</description>
 <!-- If job splitting is enabled, break up the query file into parts -->
 <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
@@ -56,7 +56,7 @@
 <param name="db_opts_selector" type="select" label="Subject database/sequences">
 <option value="db" selected="True">BLAST Database</option>
 <option value="histdb">BLAST database from your history</option>
-<option value="file">FASTA file from your history (pairwise e-values)</option>
+<option value="file">FASTA file from your history (see warning note below)</option>
 </param>
 <when value="db">
 <param name="database" type="select" label="Protein BLAST database">
@@ -215,6 +215,15 @@
 Search a *protein database* using a *translated nucleotide query*,
 using the NCBI BLAST+ blastx command line tool.
 
+.. class:: warningmark
+
+You can also search against a FASTA file of subject protein
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly signficiant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
 -----
 
 **Output format**
--- a/tools/ncbi_blast_plus/ncbi_tblastn_wrapper.xml Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_tblastn_wrapper.xml Wed Apr 17 09:44:44 2013 -0400
@@ -1,4 +1,4 @@
-<tool id="ncbi_tblastn_wrapper" name="NCBI BLAST+ tblastn" version="0.0.14">
+<tool id="ncbi_tblastn_wrapper" name="NCBI BLAST+ tblastn" version="0.0.15">
 <description>Search translated nucleotide database with protein query sequence(s)</description>
 <!-- If job splitting is enabled, break up the query file into parts -->
 <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
@@ -56,7 +56,7 @@
 <param name="db_opts_selector" type="select" label="Subject database/sequences">
 <option value="db" selected="True">BLAST Database</option>
 <option value="histdb">BLAST database from your history</option>
-<option value="file">FASTA file from your history (pairwise e-values)</option>
+<option value="file">FASTA file from your history (see warning note below)</option>
 </param>
 <when value="db">
 <param name="database" type="select" label="Nucleotide BLAST database">
@@ -261,6 +261,15 @@
 Search a *translated nucleotide database* using a *protein query*,
 using the NCBI BLAST+ tblastn command line tool.
 
+.. class:: warningmark
+
+You can also search against a FASTA file of subject nucleotide
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly signficiant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
 -----
 
 **Output format**
--- a/tools/ncbi_blast_plus/ncbi_tblastx_wrapper.xml Wed Apr 17 09:44:25 2013 -0400
+++ b/tools/ncbi_blast_plus/ncbi_tblastx_wrapper.xml Wed Apr 17 09:44:44 2013 -0400
@@ -1,4 +1,4 @@
-<tool id="ncbi_tblastx_wrapper" name="NCBI BLAST+ tblastx" version="0.0.14">
+<tool id="ncbi_tblastx_wrapper" name="NCBI BLAST+ tblastx" version="0.0.15">
 <description>Search translated nucleotide database with translated nucleotide query sequence(s)</description>
 <!-- If job splitting is enabled, break up the query file into parts -->
 <parallelism method="multi" split_inputs="query" split_mode="to_size" split_size="1000" shared_inputs="subject,histdb" merge_outputs="output1"></parallelism>
@@ -56,7 +56,7 @@
 <param name="db_opts_selector" type="select" label="Subject database/sequences">
 <option value="db" selected="True">BLAST Database</option>
 <option value="histdb">BLAST database from your history</option>
-<option value="file">FASTA file from your history (pairwise e-values)</option>
+<option value="file">FASTA file from your history (see warning note below)</option>
 </param>
 <when value="db">
 <param name="database" type="select" label="Nucleotide BLAST database">
@@ -203,6 +203,15 @@
 Search a *translated nucleotide database* using a *protein query*,
 using the NCBI BLAST+ tblastx command line tool.
 
+.. class:: warningmark
+
+You can also search against a FASTA file of subject nucleotide
+sequences. This is *not* advised because it is slower (only one
+CPU is used), but more importantly gives e-values for pairwise
+searches (very small e-values which will look overly signficiant).
+In most cases you should instead turn the other FASTA file into a
+database first using *makeblastdb* and search against that.
+
 -----
 
 **Output format**
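The warning added to each wrapper's help text recommends turning a subject FASTA file into a database with makeblastdb rather than searching the FASTA file directly. Outside Galaxy, that step might look roughly like the sketch below. It only assembles the command line, using the standard makeblastdb options `-in`, `-dbtype` and `-out`; the `makeblastdb_command` helper and the file names are hypothetical and are not taken from this changeset.

```python
import shlex
from typing import List


def makeblastdb_command(fasta_path: str, db_type: str, out_prefix: str) -> List[str]:
    """Build (but do not run) a makeblastdb command for a FASTA file."""
    if db_type not in ("nucl", "prot"):
        raise ValueError("db_type must be 'nucl' or 'prot', got %r" % db_type)
    return ["makeblastdb", "-in", fasta_path, "-dbtype", db_type, "-out", out_prefix]


# Hypothetical file names, for illustration only.
cmd = makeblastdb_command("subject.fasta", "nucl", "subject_db")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

If BLAST+ is installed, the list can be passed to `subprocess.run(cmd, check=True)`. A later search would then use `-db subject_db` instead of `-subject subject.fasta`, which avoids the pairwise e-values the warning describes.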
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/ncbi_blast_plus/repository_dependencies.xml Wed Apr 17 09:44:44 2013 -0400
@@ -0,0 +1,5 @@
+<?xml version="1.0"?>
+<repositories description="This requires the BLAST datatype definitions (e.g. the BLAST XML format).">
+<!-- Revision 4:f9a7783ed7b6 on the main tool shed is v0.0.14 which added BLAST databases -->
+<repository toolshed="http://toolshed.g2.bx.psu.edu" name="blast_datatypes" owner="devteam" changeset_revision="f9a7783ed7b6" />
+</repositories>
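To see which repository and revision the new repository_dependencies.xml pins, the file can be read with Python's standard library. This is a minimal sketch: the XML is copied from the diff above, while the `list_dependencies` helper is a hypothetical name and says nothing about how Galaxy itself processes the file.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Content of repository_dependencies.xml as added in this changeset.
DEPENDENCIES_XML = """<?xml version="1.0"?>
<repositories description="This requires the BLAST datatype definitions (e.g. the BLAST XML format).">
<!-- Revision 4:f9a7783ed7b6 on the main tool shed is v0.0.14 which added BLAST databases -->
<repository toolshed="http://toolshed.g2.bx.psu.edu" name="blast_datatypes" owner="devteam" changeset_revision="f9a7783ed7b6" />
</repositories>
"""


def list_dependencies(xml_text: str) -> List[Dict[str, str]]:
    """Return the attributes of every <repository> element as plain dicts."""
    root = ET.fromstring(xml_text)
    return [dict(repo.attrib) for repo in root.iter("repository")]


for dep in list_dependencies(DEPENDENCIES_XML):
    print(dep["owner"], dep["name"], dep["changeset_revision"])
```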
