view tool-data/blastdb.loc.sample @ 23:9e483194ebf6 draft
planemo upload for repository https://github.com/peterjc/galaxy_blast/tree/master/tools/ncbi_blast_plus commit befaf16290a437209109c06c20d0deb170925d16-dirty
author | peterjc |
---|---|
date | Fri, 15 Sep 2017 07:52:22 -0400 |
parents | 5e9d5e536b79 |
# This is a sample file distributed with Galaxy that is used to define a
# list of nucleotide BLAST databases, using three tab-separated columns:
#
# <unique_id>{tab}<database_caption>{tab}<base_name_path>
#
# The captions typically contain spaces and might end with the build date.
# It is important that the actual database name does not have a space in
# it, and that there are only two tabs on each line.
#
# You can download the NCBI provided nucleotide databases like NT from here:
# ftp://ftp.ncbi.nlm.nih.gov/blast/db/
#
# For simplicity, many Galaxy servers are configured to offer just a live
# version of each NCBI BLAST database (updated with the NCBI provided
# Perl scripts or similar). In this case, we recommend using the case
# sensitive base-name of the NCBI BLAST databases as the unique id.
# Consistent naming is important for sharing workflows between Galaxy
# servers.
#
# For example, consider the NCBI partially non-redundant nucleotide
# nt BLAST database, where you have downloaded and decompressed the
# files under /data/blastdb/ meaning at the command line BLAST+
# would look at the files /data/blastdb/nt.n* when run with:
#
# $ blastn -db /data/blastdb/nt -query ...
#
# In this case use nt (lower case to match the NCBI file naming) as the
# unique id in the first column of blastdb.loc, giving an entry like
# this:
#
# nt{tab}NCBI partially non-redundant (nt){tab}/data/blastdb/nt
#
# Alternatively, rather than a "live" mirror of the NCBI databases which
# are updated automatically, for full reproducibility the Galaxy Team
# recommends saving date-stamped copies of the databases. In this case
# your blastdb.loc file should include an entry per line for each
# version you have stored. For example:
#
# nt_05Jun2010{tab}NCBI nt (partially non-redundant) 05 Jun 2010{tab}/data/blastdb/05Jun2010/nt
# nt_15Aug2010{tab}NCBI nt (partially non-redundant) 15 Aug 2010{tab}/data/blastdb/15Aug2010/nt
# ...etc...
#
# See also blastdb_p.loc which is for any protein BLAST database, and
# blastdb_d.loc which is for any protein domain databases (like CDD).
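
The format rules in the comments above (three tab-separated columns, exactly two tabs per line, no space in the database path) can be checked before a file is deployed. The following is a minimal sketch of such a check, not Galaxy's own loader. The function names `parse_loc_line` and `load_loc` are hypothetical, and rejecting duplicate unique ids is an added assumption that the sample file does not state explicitly.

```python
def parse_loc_line(line):
    """Parse one blastdb.loc line.

    Returns (unique_id, caption, base_path), or None for blank lines
    and '#' comment lines. Raises ValueError on a malformed entry.
    """
    text = line.rstrip("\r\n")
    if not text.strip() or text.lstrip().startswith("#"):
        return None

    # Exactly two tabs per line means exactly three columns.
    fields = text.split("\t")
    if len(fields) != 3:
        raise ValueError(
            f"expected 3 tab-separated columns, found {len(fields)}: {text!r}"
        )

    unique_id, caption, base_path = fields
    if not unique_id or not caption or not base_path:
        raise ValueError(f"empty column in entry: {text!r}")

    # Captions may contain spaces; the database path must not.
    if " " in base_path:
        raise ValueError(f"database path contains a space: {base_path!r}")

    return unique_id, caption, base_path


def load_loc(lines):
    """Collect entries as {unique_id: (caption, base_path)}.

    Assumption (not stated in the sample file): duplicate ids are errors.
    """
    entries = {}
    for number, line in enumerate(lines, start=1):
        parsed = parse_loc_line(line)
        if parsed is None:
            continue
        unique_id, caption, base_path = parsed
        if unique_id in entries:
            raise ValueError(f"line {number}: duplicate unique id {unique_id!r}")
        entries[unique_id] = (caption, base_path)
    return entries


# Example input built from the entries shown in the comments above.
sample = [
    "# comment lines are skipped\n",
    "nt\tNCBI partially non-redundant (nt)\t/data/blastdb/nt\n",
    "nt_05Jun2010\tNCBI nt (partially non-redundant) 05 Jun 2010\t/data/blastdb/05Jun2010/nt\n",
]
entries = load_loc(sample)
for unique_id, (caption, base_path) in entries.items():
    print(unique_id, caption, base_path, sep=" | ")
```

Splitting on the tab character alone, rather than on any whitespace, is what lets captions keep their spaces while still catching entries typed with spaces instead of tabs.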