# HG changeset patch
# User peterjc
# Date 1393439174 18000
# Node ID af4da561893beaf4b58e25fc979222b89f1270c8
# Parent f83e5d79b6abaeaf774f9299a0b646be3f8a689c
Uploaded v0.1.0 preview 4, adding taxonomy columns etc

diff -r f83e5d79b6ab -r af4da561893b tools/ncbi_blast_plus/README.rst
--- a/tools/ncbi_blast_plus/README.rst Wed Feb 26 10:35:01 2014 -0500
+++ b/tools/ncbi_blast_plus/README.rst Wed Feb 26 13:26:14 2014 -0500
@@ -82,6 +82,14 @@
 * ftp://ftp.ncbi.nlm.nih.gov/blast/db/ (nucleotide and protein databases like NR)
 * ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/little_endian/ (domain databases like CDD)
 
+If using the optional taxonomy columns, you will also need to download the
+NCBI taxonomy files (``taxdb.btd`` and ``taxdb.bti`` from ``taxdb.tar.gz`` on
+the BLAST database FTP site). Currently explicit version tracking of the
+taxonomy is not supported, and in order to use this you must set the
+``$BLASTDB`` environment variable to include the path where you unzipped the
+taxonomy files. If this is not done, the taxonomy columns like species name
+will appear as ``N/A`` in the tabular output.
+
 The BLAST+ binaries support multi-threaded operation, which is handled via the
 $GALAXY_SLOTS environment variable. This should be set automatically by Galaxy
 via your job runner settings, which allows you to (for example) allocate four
@@ -151,7 +159,8 @@
         - Now depends on package_blast_plus_2_2_28 in ToolShed.
         - Extended tabular output includes 'salltitles' as column 25.
 v0.1.00 - Now depends on package_blast_plus_2_2_29 in ToolShed.
-        - Tablar output now includes option to pick specific columns.
+        - Tablar output now includes option to pick specific columns,
+          including previously unavailable taxonomy columns.
         - BLAST XML to tabular tool supports multiple input files.
         - More detailed descriptions for BLASTN and BLASTP task option.
         - Wrappers for segmasker, dustmasker and convert2blastmask.
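The README hunk above says the taxonomy columns fall back to ``N/A`` unless ``$BLASTDB`` includes the folder holding ``taxdb.btd`` and ``taxdb.bti``. A minimal sketch of a pre-flight check follows. The helper name ``find_taxdb`` is hypothetical (it is not part of the wrappers), and it assumes that multiple ``$BLASTDB`` entries are joined with the platform path separator.

```python
import os

# File names taken from the README text in the hunk above.
TAXDB_FILES = ("taxdb.btd", "taxdb.bti")


def find_taxdb(blastdb=None):
    """Return the first $BLASTDB folder holding both taxonomy files, else None.

    Hypothetical helper for illustration only.
    """
    if blastdb is None:
        blastdb = os.environ.get("BLASTDB", "")
    # Assumption: several folders are separated by os.pathsep (":" on Unix).
    for folder in blastdb.split(os.pathsep):
        if folder and all(
            os.path.isfile(os.path.join(folder, name)) for name in TAXDB_FILES
        ):
            return folder
    return None


if __name__ == "__main__":
    location = find_taxdb()
    if location is None:
        print("taxdb files not found via $BLASTDB; taxonomy columns may show N/A")
    else:
        print("taxdb files found in", location)
```

Running something like this on the Galaxy server before enabling the taxonomy columns might save a round of debugging empty species names.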
diff -r f83e5d79b6ab -r af4da561893b tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml Wed Feb 26 10:35:01 2014 -0500
+++ b/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml Wed Feb 26 13:26:14 2014 -0500
@@ -65,7 +65,7 @@
-
+
diff -r f83e5d79b6ab -r af4da561893b tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml Wed Feb 26 10:35:01 2014 -0500
+++ b/tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml Wed Feb 26 13:26:14 2014 -0500
@@ -55,7 +55,7 @@
-
+
diff -r f83e5d79b6ab -r af4da561893b tools/ncbi_blast_plus/ncbi_macros.xml
--- a/tools/ncbi_blast_plus/ncbi_macros.xml Wed Feb 26 10:35:01 2014 -0500
+++ b/tools/ncbi_blast_plus/ncbi_macros.xml Wed Feb 26 13:26:14 2014 -0500
@@ -59,7 +59,34 @@
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
@@ -317,7 +344,7 @@
 #elif str($output.out_format)=="cols"
 ##Pick your own columns. Galaxy gives us it comma separated, BLAST+ wants space separated:
 ##TODO - Can we catch the user picking no columns and raise an error here?
-#set cols = (str($output.std_cols)+","+str($output.ext_cols)).replace("None", "").replace(",,", ",").replace(",", " ").strip()
+#set cols = (str($output.std_cols)+","+str($output.ext_cols)+","+str($output.ids_cols)+","+str($output.misc_cols)+","+str($output.tax_cols)).replace("None", "").replace(",,", ",").replace(",", " ").strip()
 -outfmt "6 $cols"
 #else:
 -outfmt $output.out_format
@@ -379,7 +406,7 @@
 ====== ========= ============================================
 
 The BLAST+ tools can optionally output additional columns of information,
-but this takes longer to calculate. Most (but not all) of these columns are
+but this takes longer to calculate. Many commonly used extra columns are
 included by selecting the extended tabular output. The extra columns are
 included *after* the standard 12 columns. This is so that you can write
 workflow filtering steps that accept either the 12 or 25 column tabular
@@ -403,7 +430,11 @@
 25     salltitles    All subject title(s), separated by a '<>'
 ====== ============= ===========================================
 
-The third option is BLAST XML output, which is designed to be parsed by
+The third option is to customise the tabular output by selecting which
+columns you want, from the standard set of 12, the default set of 25,
+or any of the additional columns BLAST+ offers (including species name).
+
+The fourth option is BLAST XML output, which is designed to be parsed by
 another program, and is understood by some Galaxy tools. You can also
 choose several plain text or HTML output formats which are designed to be
 read by a person (not by another program).
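The ``#set cols`` line in the ``ncbi_macros.xml`` hunk joins five comma separated selections into the space separated list used in ``-outfmt "6 $cols"``. The sketch below restates that expression in plain Python so its behaviour is easier to follow, then shows one possible split-based variant that also raises an error when no column is picked (the open TODO in the hunk). Both function names are hypothetical and neither is code from the wrappers. The sketch assumes an empty selection renders as the string ``None``, as the ``.replace("None", "")`` call in the patch suggests.

```python
def join_columns_as_patched(*selections):
    """Plain-Python restatement of the patched Cheetah expression."""
    joined = ",".join(str(s) for s in selections)
    return joined.replace("None", "").replace(",,", ",").replace(",", " ").strip()


def join_columns_strict(*selections):
    """Hypothetical variant: split on commas, drop empty groups, reject no columns."""
    names = []
    for selection in selections:
        for name in str(selection).split(","):
            # Assumption: an empty multi-select shows up as None or "None".
            if name and name != "None":
                names.append(name)
    if not names:
        raise ValueError("no tabular columns selected")
    return " ".join(names)


if __name__ == "__main__":
    picked = ("qseqid,sseqid", None, None, None, "staxids,sscinames")
    print(repr(join_columns_as_patched(*picked)))
    print(repr(join_columns_strict(*picked)))
```

With three empty groups in the middle, the chained ``replace`` calls leave two consecutive spaces between ``sseqid`` and ``staxids``, because a single pass of ``",," -> ","`` only halves a run of four commas. The split-based variant emits single spaces throughout.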