# HG changeset patch
# User peterjc
# Date 1599583093 0
# Node ID 25f86a96c4c9674ca65f654cdee8ba2197a767fd
# Parent 00330a63ffcf2a5bf65281abfa261b0c1b7760fb
"planemo upload for repository https://github.com/peterjc/galaxy_blast/tree/master/tools/ncbi_blast_plus commit 8b5b3a72e5714b12142d0f863729f56964691244-dirty"

diff -r 00330a63ffcf -r 25f86a96c4c9 tools/ncbi_blast_plus/README.rst
--- a/tools/ncbi_blast_plus/README.rst Fri Aug 21 12:44:41 2020 +0000
+++ b/tools/ncbi_blast_plus/README.rst Tue Sep 08 16:38:13 2020 +0000
@@ -101,8 +101,10 @@
 
 You can download the NCBI provided databases as tar-balls from here:
 
-* ftp://ftp.ncbi.nlm.nih.gov/blast/db/ (nucleotide and protein databases like NR)
-* ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/little_endian/ (domain databases like CDD)
+* ftp://ftp.ncbi.nlm.nih.gov/blast/db/ (nucleotide and protein databases like
+  NT and NR)
+* ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/little_endian/ (domain databases like
+  CDD)
 
 If using the optional taxonomy columns, you will also need to download the
 NCBI taxonomy files (``taxdb.btd`` and ``taxdb.bti`` from ``taxdb.tar.gz`` on
@@ -131,56 +133,80 @@
 ======= ======================================================================
 Version Changes
 ------- ----------------------------------------------------------------------
-v0.0.11 - Final revision as part of the Galaxy main repository, and the
-          first release via the Tool Shed
-v0.0.12 - Implements genetic code option for translation searches.
-        - Changes ```` to 1000 sequences at a time (to cope with
-          very large sets of queries where BLAST+ can become memory hungry)
-        - Include warning that BLAST+ with subject FASTA gives pairwise
-          e-values
-v0.0.13 - Use the new error handling options in Galaxy (the previously
-          bundled ``hide_stderr.py`` script is no longer needed).
-v0.0.14 - Support for makeblastdb and blastdbinfo with local BLAST databases
-          in the history (using work from Edward Kirton), requires v0.0.14
-          of the ``blast_datatypes`` repository from the Tool Shed.
-v0.0.15 - Stronger warning in help text against searching against subject
-          FASTA files (better looking e-values than you might be expecting).
-v0.0.16 - Added repository_dependencies.xml for automates installation of the
-          ``blast_datatypes`` repository from the Tool Shed.
-v0.0.17 - The BLAST+ search tools now default to extended tabular output
-          (all too often our users where having to re-run searches just to
-          get one of the missing columns like query or subject length)
-v0.0.18 - Defensive quoting of filenames in case of spaces (where possible,
-          BLAST+ handling of some multi-file arguments is problematic).
-v0.0.19 - Added wrappers for rpsblast and rpstblastn, and new ``blastdb_d.loc``
-          for the domain databases they use (e.g. CDD, PFAM or SMART).
-        - Correct case of exception regular expression (for error handling
-          fall-back in case the return code is not set properly).
-        - Clearer naming of output files.
-v0.0.20 - Added unit tests for BLASTN and TBLASTX.
-        - Added percentage identity option to BLASTN.
-        - Fallback on ElementTree if cElementTree missing in XML to tabular.
-        - Link to Tool Shed added to help text and this documentation.
-        - Tweak dependency on ``blast_datatypes`` to also work on Test Tool Shed.
-        - Dependency on new ``package_blast_plus_2_2_26`` in Tool Shed.
-        - Adopted standard MIT License.
-        - Development moved to GitHub, https://github.com/peterjc/galaxy_blast
-        - Updated citation information (Cock et al. 2013).
-v0.0.21 - Use macros to simplify the XML wrappers (by John Chilton).
-        - Added wrapper for dustmasker.
-        - Enabled masking for makeblastdb (Nicola Soranzo).
-        - Requires ``maskinfo-asn1`` and ``maskinfo-asn1-binary`` datatypes,
-          defined in ``blast_datatypes`` v0.0.17 on Galaxy ToolShed.
-        - Tests updated for BLAST+ 2.2.27 instead of BLAST+ 2.2.26.
-        - Now depends on ``package_blast_plus_2_2_27`` in ToolShed.
-v0.0.22 - More use of macros to simplify the wrappers.
-        - Set number of threads via ``$GALAXY_SLOTS`` environment variable.
-        - More descriptive default output names.
-        - Tests require updated BLAST DB definitions (``blast_datatypes`` v0.0.18).
-        - Pre-check for duplicate identifiers in ``makeblastdb`` wrapper.
-        - Tests updated for BLAST+ 2.2.28 instead of BLAST+ 2.2.27.
-        - Now depends on ``package_blast_plus_2_2_28`` in ToolShed.
-        - Extended tabular output includes 'salltitles' as column 25.
+v0.3.3  - Fixed ``tool_dependencies.xml`` to use BLAST+ 2.7.1 (useful only for
+          older Galaxy instances - we recommend conda for dependencies now).
+v0.3.2  - Fixed incomplete ``@CLI_OPTIONS@`` macro in the help text for the
+          ``tblastn`` and ``blastdbcmd`` wrappers.
+v0.3.1  - Clarify help text for max hits option, confusing as depending on the
+          output format it must be mapped to different command line arguments.
+        - Extend gzipped query support to all the command line tools.
+        - Workaround for gzipped support under Galaxy release 16.01 or older.
+v0.3.0  - Updated for NCBI BLAST+ 2.7.1,
+        - Depends on BioConda or legacy ToolShed ``package_blast_plus_2_7_1``.
+        - Document the BLAST+ 2.6.0 change in the standard 12 column output
+          from ``qacc,sacc,...`` to ``qaccver,saccver,...`` instead.
+        - Accept gzipped FASTA inputs for subject files, queries to ``blastn``
+          and input to ``makeblastdb`` (contribution from Anton Nekrutenko).
+v0.2.02 - Document the BLAST+ 2.5.0 change in the standard 12 column output
+          from ``qseqid,sseqid,...`` to ``qacc,sacc,...`` instead.
+        - Support for per-matrix recommended gaps settings (``-gapopen`` and
+          ``-gapextend``, contribution from Caleb Easterly and Jim Johnson).
+        - Support for ``-window_size``, ``-threshold``, ``-comp_based_stats``
+          and revising ``-word_size`` to avoid using zero to mean default
+          (contribution from Caleb Easterly).
+v0.2.01 - Use ```` (internal change only).
+        - Single quote command line arguments (internal change only).
+        - Show BLAST command line argument corresponding to each tool
+          parameter (contribution from Nicola Soranzo).
+        - Add ``-max_hsps`` option (contribution from Nicola Soranzo).
+        - Add ``-use_sw_tback`` option for BLASTP (Nicola Soranzo).
+v0.2.00 - Updated for NCBI BLAST+ 2.5.0, where GI numbers are less visible,
+          tabular output changes with `-parse_deflines`, and percentage
+          identifies are now given to 3dp rather than 2dp.
+        - Depends on ``package_blast_plus_2_5_0`` in ToolShed, or BioConda.
+        - ``blastxml_to_tabular`` now also gives percentage idenity to 3dp.
+        - Removed never-used binary and Python module dependency declarations
+          (internal change only).
+v0.1.08 - Allow searching against multiple locally installed databases
+          (contribution from Gildas Le Corguillé and Emma Prudent).
+        - Minor XML and Python style changes (internal change only).
+        - Set ``allow_duplicate_entries="False"`` in sample configuration file
+          ``tool_data_table_conf.xml``.
+        - Fix identifers with pipes in ``blastdbcmd`` wrapper (Devon Ryan).
+v0.1.07 - Re-enabled some ``*.loc`` file tests (these had not been supported
+          on the Tool Shed test framework, but that is not currently in use).
+        - Fixed macro problem with version field in blastxml_to_tabular.xml
+          (contribution from Bjoern Gruening and Daniel Blankenberg).
+v0.1.06 - Now depends on ``package_blast_plus_2_2_31`` in ToolShed.
+        - Tests updated for BLAST+ 2.2.31 instead of BLAST+ 2.2.30.
+v0.1.05 - Define ``parallelism`` tag via a macro (internal change only).
+        - Define wrapper versions via a macro (internal change only).
+        - Update citation information now GigaScience paper is out.
+v0.1.04 - Fixed regression using BLAST databases from the history. Currently
+          Galaxy inputs must still use ``.extra_files_path`` rather than the
+          more consise ``.extra_files`` available for output files (Issue #69)
+v0.1.03 - Reorder XML elements (internal change only).
+        - Planemo for Tool Shed upload (``.shed.yml``, internal change only).
+v0.1.02 - Now depends on ``package_blast_plus_2_2_30`` in ToolShed.
+        - Tests updated for BLAST+ 2.2.30 instead of BLAST+ 2.2.29.
+        - New tasks ``blastp-fast``, ``blastx-fast`` and ``tblastn-fast``.
+        - New minimum query HSP coverage option, ``-qcov_hsp_perc``.
+        - Removed ``-word_size`` from RPS-BLAST and RPS-TBLASTN wrappers, this
+          is set during database construction and should not have been offered
+          as a command line option in releases prior to BLAST+ 2.2.30.
+        - BLAST database ``blastdb*.loc`` files now accessed via the XML
+          table definitions in Galaxy's ``tool_data_table_conf.xml`` file,
+          setup via ``tool-data/tool_data_table_conf.xml.sample``
+        - Replace ``.extra_files_path`` with ``.files_path`` (internal change,
+          thanks to Bjoern Gruening and John Chilton).
+        - Added *"NCBI BLAST+ integrated into Galaxy"* preprint citation.
+v0.1.01 - Requires ``blastdbd`` datatype (``blast_datatypes`` v0.0.19).
+        - Wrapper for makeprofiledb added to create protein domain databases
+          (based on contribution from Bjoern Gruening).
+        - The RPS-BLAST and RPS-TBLASTN wrappers support using a protein
+          domain database from the user's history.
+        - Tool definitions now embed citation information (by John Chilton).
+        - BLAST tools support GI and SeqID filters (added by Bjoern Gruening).
 v0.1.00 - Now depends on ``package_blast_plus_2_2_29`` in ToolShed.
         - Tabular output now includes option to pick specific columns
           (based on contribution from Jim Johnson), including previously
@@ -194,80 +220,55 @@
         - Supports setting a taxonomy ID in ``makeblastdb`` wrapper.
         - Subtle changes like new conditional settings will require some old
           workflows be updated to cope.
-v0.1.01 - Requires ``blastdbd`` datatype (``blast_datatypes`` v0.0.19).
-        - Wrapper for makeprofiledb added to create protein domain databases
-          (based on contribution from Bjoern Gruening).
-        - The RPS-BLAST and RPS-TBLASTN wrappers support using a protein
-          domain database from the user's history.
-        - Tool definitions now embed citation information (by John Chilton).
-        - BLAST tools support GI and SeqID filters (added by Bjoern Gruening).
-v0.1.02 - Now depends on ``package_blast_plus_2_2_30`` in ToolShed.
-        - Tests updated for BLAST+ 2.2.30 instead of BLAST+ 2.2.29.
-        - New tasks ``blastp-fast``, ``blastx-fast`` and ``tblastn-fast``.
-        - New minimum query HSP coverage option, ``-qcov_hsp_perc``.
-        - Removed ``-word_size`` from RPS-BLAST and RPS-TBLASTN wrappers, this
-          is set during database construction and should not have been offered
-          as a command line option in releases prior to BLAST+ 2.2.30.
-        - BLAST database ``blastdb*.loc`` files now accessed via the XML
-          table definitions in Galaxy's ``tool_data_table_conf.xml`` file,
-          setup via ``tool-data/tool_data_table_conf.xml.sample``
-        - Replace ``.extra_files_path`` with ``.files_path`` (internal change,
-          thanks to Bjoern Gruening and John Chilton).
-        - Added *"NCBI BLAST+ integrated into Galaxy"* preprint citation.
-v0.1.03 - Reorder XML elements (internal change only).
-        - Planemo for Tool Shed upload (``.shed.yml``, internal change only).
-v0.1.04 - Fixed regression using BLAST databases from the history. Currently
-          Galaxy inputs must still use ``.extra_files_path`` rather than the
-          more consise ``.extra_files`` available for output files (Issue #69)
-v0.1.05 - Define ``parallelism`` tag via a macro (internal change only).
-        - Define wrapper versions via a macro (internal change only).
-        - Update citation information now GigaScience paper is out.
-v0.1.06 - Now depends on ``package_blast_plus_2_2_31`` in ToolShed.
-        - Tests updated for BLAST+ 2.2.31 instead of BLAST+ 2.2.30.
-v0.1.07 - Re-enabled some ``*.loc`` file tests (these had not been supported
-          on the Tool Shed test framework, but that is not currently in use).
-        - Fixed macro problem with version field in blastxml_to_tabular.xml
-          (contribution from Bjoern Gruening and Daniel Blankenberg).
-v0.1.08 - Allow searching against multiple locally installed databases
-          (contribution from Gildas Le Corguillé and Emma Prudent).
-        - Minor XML and Python style changes (internal change only).
-        - Set ``allow_duplicate_entries="False"`` in sample configuration file
-          ``tool_data_table_conf.xml``.
-        - Fix identifers with pipes in ``blastdbcmd`` wrapper (Devon Ryan).
-v0.2.00 - Updated for NCBI BLAST+ 2.5.0, where GI numbers are less visible,
-          tabular output changes with `-parse_deflines`, and percentage
-          identifies are now given to 3dp rather than 2dp.
-        - Depends on ``package_blast_plus_2_5_0`` in ToolShed, or BioConda.
-        - ``blastxml_to_tabular`` now also gives percentage idenity to 3dp.
-        - Removed never-used binary and Python module dependency declarations
-          (internal change only).
-v0.2.01 - Use ```` (internal change only).
-        - Single quote command line arguments (internal change only).
-        - Show BLAST command line argument corresponding to each tool
-          parameter (contribution from Nicola Soranzo).
-        - Add ``-max_hsps`` option (contribution from Nicola Soranzo).
-        - Add ``-use_sw_tback`` option for BLASTP (Nicola Soranzo).
-v0.2.02 - Document the BLAST+ 2.5.0 change in the standard 12 column output
-          from ``qseqid,sseqid,...`` to ``qacc,sacc,...`` instead.
-        - Support for per-matrix recommended gaps settings (``-gapopen`` and
-          ``-gapextend``, contribution from Caleb Easterly and Jim Johnson).
-        - Support for ``-window_size``, ``-threshold``, ``-comp_based_stats``
-          and revising ``-word_size`` to avoid using zero to mean default
-          (contribution from Caleb Easterly).
-v0.3.0  - Updated for NCBI BLAST+ 2.7.1,
-        - Depends on BioConda or legacy ToolShed ``package_blast_plus_2_7_1``.
-        - Document the BLAST+ 2.6.0 change in the standard 12 column output
-          from ``qacc,sacc,...`` to ``qaccver,saccver,...`` instead.
-        - Accept gzipped FASTA inputs for subject files, queries to ``blastn``
-          and input to ``makeblastdb`` (contribution from Anton Nekrutenko).
-v0.3.1  - Clarify help text for max hits option, confusing as depending on the
-          output format it must be mapped to different command line arguments.
-        - Extend gzipped query support to all the command line tools.
-        - Workaround for gzipped support under Galaxy release 16.01 or older.
-v0.3.2  - Fixed incomplete ``@CLI_OPTIONS@`` macro in the help text for the
-          ``tblastn`` and ``blastdbcmd`` wrappers.
-v0.3.3  - Fixed ``tool_dependencies.xml`` to use BLAST+ 2.7.1 (useful only for
-          older Galaxy instances - we recommend conda for dependencies now).
+v0.0.21 - Use macros to simplify the XML wrappers (by John Chilton).
+        - Added wrapper for dustmasker.
+        - Enabled masking for makeblastdb (Nicola Soranzo).
+        - Requires ``maskinfo-asn1`` and ``maskinfo-asn1-binary`` datatypes,
+          defined in ``blast_datatypes`` v0.0.17 on Galaxy ToolShed.
+        - Tests updated for BLAST+ 2.2.27 instead of BLAST+ 2.2.26.
+        - Now depends on ``package_blast_plus_2_2_27`` in ToolShed.
+v0.0.20 - Added unit tests for BLASTN and TBLASTX.
+        - Added percentage identity option to BLASTN.
+        - Fallback on ElementTree if cElementTree missing in XML to tabular.
+        - Link to Tool Shed added to help text and this documentation.
+        - Tweak ``blast_datatypes`` to also work on Test Tool Shed.
+        - Dependency on new ``package_blast_plus_2_2_26`` in Tool Shed.
+        - Adopted standard MIT License.
+        - Development moved to GitHub, https://github.com/peterjc/galaxy_blast
+        - Updated citation information (Cock et al. 2013).
+v0.0.19 - Added wrappers for rpsblast and rpstblastn, and new ``blastdb_d.loc``
+          for the domain databases they use (e.g. CDD, PFAM or SMART).
+        - Correct case of exception regular expression (for error handling
+          fall-back in case the return code is not set properly).
+        - Clearer naming of output files.
+v0.0.17 - The BLAST+ search tools now default to extended tabular output
+          (all too often our users where having to re-run searches just to
+          get one of the missing columns like query or subject length)
+v0.0.16 - Added repository_dependencies.xml for automates installation of the
+          ``blast_datatypes`` repository from the Tool Shed.
+v0.0.15 - Stronger warning in help text against searching against subject
+          FASTA files (better looking e-values than you might be expecting).
+v0.0.14 - Support for makeblastdb and blastdbinfo with local BLAST databases
+          in the history (using work from Edward Kirton), requires v0.0.14
+          of the ``blast_datatypes`` repository from the Tool Shed.
+v0.0.13 - Use the new error handling options in Galaxy (the previously
+          bundled ``hide_stderr.py`` script is no longer needed).
+v0.0.12 - Implements genetic code option for translation searches.
+        - Changes ```` to 1000 sequences at a time (to cope with
+          very large sets of queries where BLAST+ can become memory hungry)
+        - Include warning that BLAST+ with subject FASTA gives pairwise
+          e-values
+v0.0.11 - Final revision as part of the Galaxy main repository, and the
+          first release via the Tool Shed
+v0.0.22 - More use of macros to simplify the wrappers.
+        - Set number of threads via ``$GALAXY_SLOTS`` environment variable.
+        - More descriptive default output names.
+        - Tests require updated BLAST DB definitions (``blast_datatypes``
+          v0.0.18).
+        - Pre-check for duplicate identifiers in ``makeblastdb`` wrapper.
+        - Tests updated for BLAST+ 2.2.28 instead of BLAST+ 2.2.27.
+        - Now depends on ``package_blast_plus_2_2_28`` in ToolShed.
+        - Extended tabular output includes 'salltitles' as column 25.
 ======= ======================================================================