comparison tool-data/blastdb.loc.sample @ 1:5e9d5e536b79 draft
Uploaded v0.1.02 preview 2, clarify sample blastdb loc files, etc
author | peterjc |
---|---|
date | Tue, 03 Mar 2015 05:32:18 -0500 |
parents | 432ea9614cc9 |
children |
0:432ea9614cc9 | 1:5e9d5e536b79 |
---|---|
1 #This is a sample file distributed with Galaxy that is used to define a | 1 # This is a sample file distributed with Galaxy that is used to define a |
2 #list of nucleotide BLAST databases, using three columns tab separated | 2 # list of nucleotide BLAST databases, using three columns tab separated: |
3 #(longer whitespace are TAB characters): | |
4 # | 3 # |
5 #<unique_id> <database_caption> <base_name_path> | 4 # <unique_id>{tab}<database_caption>{tab}<base_name_path> |
6 # | 5 # |
7 #The captions typically contain spaces and might end with the build date. | 6 # The captions typically contain spaces and might end with the build date. |
8 #It is important that the actual database name does not have a space in | 7 # It is important that the actual database name does not have a space in |
9 #it, and that there are only two tabs on each line. | 8 # it, and that there are only two tabs on each line. |
10 # | 9 # |
11 #So, for example, if your database is nt and the path to your base name | 10 # You can download the NCBI provided protein databases like NR from here: |
12 #is /depot/data2/galaxy/blastdb/nt/nt.chunk, then the blastdb.loc entry | 11 # ftp://ftp.ncbi.nlm.nih.gov/blast/db/ |
13 #would look like this: | |
14 # | 12 # |
15 #nt_02_Dec_2009 nt 02 Dec 2009 /depot/data2/galaxy/blastdb/nt/nt.chunk | 13 # For simplicity, many Galaxy servers are configured to offer just a live |
14 # version of each NCBI BLAST database (updated with the NCBI provided | |
15 # Perl scripts or similar). In this case, we recommend using the case | |
16 # sensitive base-name of the NCBI BLAST databases as the unique id. | |
17 # Consistent naming is important for sharing workflows between Galaxy | |
18 # servers. | |
16 # | 19 # |
17 #and your /depot/data2/galaxy/blastdb/nt directory would contain all of | 20 # For example, consider the NCBI partially non-redundant nucleotide |
18 #your "base names" (e.g.): | 21 # nt BLAST database, where you have downloaded and decompressed the |
22 # files under /data/blastdb/ meaning at the command line BLAST+ would | |
23 # look at the files /data/blastdb/nt.n* when run with: | |
19 # | 24 # |
20 #-rw-r--r-- 1 wychung galaxy 23437408 2008-04-09 11:26 nt.chunk.00.nhr | 25 # $ blastn -db /data/blastdb/nt -query ... |
21 #-rw-r--r-- 1 wychung galaxy 3689920 2008-04-09 11:26 nt.chunk.00.nin | |
22 #-rw-r--r-- 1 wychung galaxy 251215198 2008-04-09 11:26 nt.chunk.00.nsq | |
23 #...etc... | |
24 # | 26 # |
25 #Your blastdb.loc file should include an entry per line for each "base name" | 27 # In this case use nt (lower case to match the NCBI file naming) as the |
26 #you have stored. For example: | 28 # unique id in the first column of blastdb.loc, giving an entry like |
29 # this: | |
27 # | 30 # |
28 #nt_02_Dec_2009 nt 02 Dec 2009 /depot/data2/galaxy/blastdb/nt/nt.chunk | 31 # nt{tab}NCBI partially non-redundant (nt){tab}/data/blastdb/nt |
29 #wgs_30_Nov_2009 wgs 30 Nov 2009 /depot/data2/galaxy/blastdb/wgs/wgs.chunk | |
30 #test_20_Sep_2008 test 20 Sep 2008 /depot/data2/galaxy/blastdb/test/test | |
31 #...etc... | |
32 # | 32 # |
33 #You can download the NCBI provided protein databases like NT from here: | 33 # Alternatively, rather than a "live" mirror of the NCBI databases which |
34 #ftp://ftp.ncbi.nlm.nih.gov/blast/db/ | 34 # are updated automatically, for full reproducibility the Galaxy Team |
35 # recommend saving date-stamped copies of the databases. In this case | |
36 # your blastdb.loc file should include an entry per line for each | |
37 # version you have stored. For example: | |
35 # | 38 # |
36 #See also blastdb_p.loc which is for any protein BLAST database, and | 39 # nt_05Jun2010{tab}NCBI nt (partially non-redundant) 05 Jun 2010{tab}/data/blastdb/05Jun2010/nt |
37 #blastdb_d.loc which is for any protein domains databases (like CDD). | 40 # nt_15Aug2010{tab}NCBI nt (partially non-redundant) 15 Aug 2010{tab}/data/blastdb/15Aug2010/nt |
38 | 41 # ...etc... |
39 | 42 # |
43 # See also blastdb_p.loc which is for any protein BLAST database, and | |
44 # blastdb_d.loc which is for any protein domains databases (like CDD). |
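
The format rules stated in the sample file (three tab-separated columns, exactly two tabs per line, no space in the database path) can be checked mechanically. The following is a minimal sketch in Python, not part of Galaxy or this repository; the function names `parse_loc_line` and `parse_loc` are hypothetical, and the duplicate-id check is an assumption based on the column being called a unique id.

```python
def parse_loc_line(line):
    """Parse one blastdb.loc line.

    Returns (unique_id, caption, base_path), or None for blank and
    comment lines. Raises ValueError if the line breaks the rules
    described in the sample file.
    """
    stripped = line.rstrip("\r\n")
    if not stripped.strip() or stripped.lstrip().startswith("#"):
        return None
    fields = stripped.split("\t")
    # "only two tabs on each line" means exactly three columns
    if len(fields) != 3:
        raise ValueError(
            f"expected 3 tab-separated columns, got {len(fields)}: {stripped!r}"
        )
    unique_id, caption, base_path = (field.strip() for field in fields)
    # Captions may contain spaces; the database path must not
    if " " in base_path:
        raise ValueError(f"database path contains a space: {base_path!r}")
    return unique_id, caption, base_path


def parse_loc(text):
    """Parse a whole .loc file, rejecting repeated ids (an assumed rule)."""
    entries = []
    seen = set()
    for line in text.splitlines():
        entry = parse_loc_line(line)
        if entry is None:
            continue
        if entry[0] in seen:
            raise ValueError(f"duplicate unique id: {entry[0]!r}")
        seen.add(entry[0])
        entries.append(entry)
    return entries


if __name__ == "__main__":
    sample = (
        "# comment lines are skipped\n"
        "nt\tNCBI partially non-redundant (nt)\t/data/blastdb/nt\n"
    )
    for unique_id, caption, base_path in parse_loc(sample):
        print(unique_id, caption, base_path, sep=" | ")
```

Running a check like this before restarting a Galaxy server would catch the most common mistake, which is spaces typed where a tab was intended.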