comparison libs/sratoolkit.2.8.0-centos_linux64/README-blastn @ 3:38ad1130d077 draft
planemo upload commit a4fb57231f274270afbfebd47f67df05babffa4a-dirty
author | charles_s_test |
---|---|
date | Mon, 27 Nov 2017 11:21:07 -0500 |
2:0d65b71ff8df | 3:38ad1130d077 |
---|---|
1 # =========================================================================== | |
2 # | |
3 # PUBLIC DOMAIN NOTICE | |
4 # National Center for Biotechnology Information | |
5 # | |
6 # This software/database is a "United States Government Work" under the | |
7 # terms of the United States Copyright Act. It was written as part of | |
8 # the author's official duties as a United States Government employee and | |
9 # thus cannot be copyrighted. This software/database is freely available | |
10 # to the public for use. The National Library of Medicine and the U.S. | |
11 # Government have not placed any restriction on its use or reproduction. | |
12 # | |
13 # Although all reasonable efforts have been taken to ensure the accuracy | |
14 # and reliability of the software and data, the NLM and the U.S. | |
15 # Government do not and cannot warrant the performance or results that | |
16 # may be obtained by using this software or data. The NLM and the U.S. | |
17 # Government disclaim all warranties, express or implied, including | |
18 # warranties of performance, merchantability or fitness for any particular | |
19 # purpose. | |
20 # | |
21 # Please cite the author in any work or product based on this material. | |
22 # | |
23 # =========================================================================== | |
24 | |
25 | |
26 The NCBI SRA ( Sequence Read Archive ) | |
27 | |
28 | |
29 Contact: sra-tools@ncbi.nlm.nih.gov | |
30 http://trace.ncbi.nlm.nih.gov/Traces/sra/std | |
31 | |
32 | |
33 Stand-alone BLAST searches against SRA runs in their native format. | |
34 ------------------------------------------------------------------- | |
35 | |
36 A stand-alone blastn application to perform BLAST searches directly against | |
37 native SRA files is included in this distribution. This application has been | |
38 tested in-house at the NCBI, but has not been heavily used, so this should be | |
39 considered a preliminary (alpha) release for a few experienced users. A 64-bit | |
40 Linux application has been built for this testing. | |
41 | |
42 The application is called "blastn_vdb". | |
43 | |
44 The application can be invoked in much the same manner as the standard | |
45 blastn application: | |
46 | |
47 1) blastn_vdb -help or blastn_vdb -h will produce usage messages. | |
48 | |
49 2) The BLAST+ command-line manual at http://www.ncbi.nlm.nih.gov/books/NBK1763/ | |
50 provides more details on the options, though not all blastn options are | |
51 available with blastn_vdb. Some options simply do not apply to sequences in SRA | |
52 (e.g., -gilist is missing as these sequences have not been assigned GIs). Some | |
53 options have not yet been implemented (e.g., -num_threads is currently disabled). | |
54 | |
55 | |
56 To search cached or on-demand SRA objects. | |
57 ------------------------------------------ | |
58 An example blastn_vdb command line would be: | |
59 | |
60 ./blastn_vdb -db "ERR039542 ERR047215 ERR039539 ERR039540" -query nt.test -out test.out | |
61 | |
62 The file nt.test contains the query in FASTA format, and it will be searched against | |
63 the reads in runs with accessions ERR039542 ERR047215 ERR039539 ERR039540. | |
64 | |
65 If you have not already downloaded these objects using the vdb "prefetch" tool, | |
66 they will be retrieved on-demand from NCBI under standard configuration. For | |
67 alternative configuration information, please see the "README-vdb-config" file | |
68 in this distribution. | |
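
As a rough sketch of the example above, the run accessions can be collected in a shell variable and passed as one quoted -db argument. The helper name join_runs and the variable db are illustrative only, and the command is printed rather than executed so the quoting can be inspected first.

```shell
# Hypothetical helper (not part of the toolkit): join run accessions into
# the single space-separated string that is passed to -db.
join_runs() {
    # "$*" joins all arguments with a space under the default IFS
    printf '%s\n' "$*"
}

db=$(join_runs ERR039542 ERR047215 ERR039539 ERR039540)

# Print the command instead of running it.
printf './blastn_vdb -db "%s" -query nt.test -out test.out\n' "$db"
```

The double quotes around the accession list matter: without them the shell would hand each accession to blastn_vdb as a separate argument.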
69 | |
70 Searching with manually downloaded files. | |
71 ----------------------------------------- | |
72 If you have manually downloaded files (e.g., via Aspera or wget), they may | |
73 be referred to as "local" files. You can pass one or more file paths to be used | |
74 collectively as the database. In this case the blastn_vdb command line would be: | |
75 | |
76 ./blastn_vdb -db <SRR_file> -query <input_file> -out <output_file> | |
77 | |
78 Where | |
79 <SRR_file> is the path (relative or absolute) and name of the SRRxxxxx file | |
80 <input_file> is a FASTA file containing the query sequence(s) | |
81 <output_file> is the name of the output report for the BLAST search | |
82 | |
83 Example: | |
84 | |
85 ./blastn_vdb -db ./subdir/ERR039542.sra -query nt.test -out test.out | |
86 | |
87 Querying multiple SRR files simultaneously: | |
88 | |
89 ./blastn_vdb -db "<SRR_file1> <SRR_file2> <SRR_file3>" -query <input_file> -out <output_file> | |
90 | |
91 Enclose the group of files to be included in the search set in double quotes, e.g. | |
92 "./SRR_file1.sra ./SRR_file2.sra ./SRR_file3.sra" | |
93 | |
94 Example: | |
95 | |
96 ./blastn_vdb -db "./ERR039542.sra ./ERR047215.sra ./ERR039539.sra ./ERR039540.sra" -query nt.test -out test.out | |
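
When many local files are involved, it may help to build the -db string programmatically and drop paths that do not exist. The following is a minimal sketch; the function name existing_runs is hypothetical, and it assumes paths without embedded spaces, since the entries in the -db string are separated by spaces.

```shell
# Hypothetical helper (not part of the toolkit): build a -db string from
# local run files, skipping any path that does not exist (with a warning).
existing_runs() {
    db=""
    for f in "$@"; do
        if [ -e "$f" ]; then
            # append with a separating space, except for the first entry
            db="${db:+$db }$f"
        else
            echo "warning: $f not found, skipping" >&2
        fi
    done
    printf '%s\n' "$db"
}

# Possible use (not executed here):
#   ./blastn_vdb -db "$(existing_runs ./ERR039542.sra ./ERR047215.sra)" \
#       -query nt.test -out test.out
```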
97 | |
98 Caveats | |
99 ------- | |
100 There are some limitations on the currently available application: | |
101 | |
102 1) Individual SRA data files containing more than 2 billion reads are not yet supported. For a | |
103 paired-end experiment this amounts to a limit of about 1 billion "spots". | |
104 | |
105 2) Compressed SRA ("cSRA") is not yet fully supported. Currently, only the | |
106 unaligned fraction of reads is searched. Compressed SRA runs are runs that contain | |
107 alignments (e.g., ERR230455). Use "vdb-dump" to check whether a run contains | |
108 alignment information: | |
109 | |
110 $ vdb-dump -E ERR230455 | |
111 enumerating the tables of database 'ERR230455' | |
112 tbl #1: PRIMARY_ALIGNMENT | |
113 tbl #2: REFERENCE | |
114 tbl #3: SEQUENCE | |
115 | |
116 3) You may need to prefix "./" to the run name for files in your current | |
117 directory. | |
118 | |
119 4) The blast_formatter is not currently able to read native SRA files, so | |
120 reformatting of results saved as a blast archive is not yet supported. | |
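
The cSRA check in caveat 2 could be scripted roughly as follows. This sketch assumes that a PRIMARY_ALIGNMENT table in the "vdb-dump -E" listing is the signal that a run carries alignments, as in the ERR230455 listing above. The function name has_alignments is hypothetical, and the sample text is copied from that listing so the block runs without the toolkit installed.

```shell
# Hypothetical helper (not part of the toolkit): read "vdb-dump -E" output
# on stdin and succeed if a PRIMARY_ALIGNMENT table is listed.
has_alignments() {
    grep -q 'PRIMARY_ALIGNMENT'
}

# Sample input taken from the ERR230455 listing; with the toolkit installed,
# the output of "vdb-dump -E <run>" could be piped into the function instead.
sample="enumerating the tables of database 'ERR230455'
tbl #1: PRIMARY_ALIGNMENT
tbl #2: REFERENCE
tbl #3: SEQUENCE"

if printf '%s\n' "$sample" | has_alignments; then
    echo "cSRA run: only the unaligned reads would be searched"
else
    echo "no alignment table listed"
fi
```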
121 | |
122 Common errors and fixes. | |
123 ------------------------ | |
124 | |
125 1) Failure to provide a relative path to a manually downloaded SRR file: | |
126 | |
127 ./blastn_vdb -db SRR770754.sra -query srr770754_test.fa -out test.out | |
128 Error: NCBI C++ Exception: | |
129 "vdb2blast_util.cpp", line 253: Error: ncbi::CVDBBlastUtil::x_MakeSRASeqSrc() | |
130 - VDB BlastSeqSrc construction failed: Failed to add any run to VDB runset: unsupported while allocating | |
131 | |
132 Fix: | |
133 Include a relative (e.g., "../" or "./") or absolute (e.g., "/home/user/SRA_BLAST_data/") path with the SRR file name.
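
One way to avoid this error in scripts is to normalize bare file names before passing them to -db. The sketch below is an assumption about how one might do that; the function name with_path is hypothetical. It should only be applied to local files, since a bare accession such as ERR039542 is meant to be passed without a path.

```shell
# Hypothetical helper (not part of the toolkit): make sure a local run file
# is given with an explicit path.
with_path() {
    case "$1" in
        /*|./*|../*) printf '%s\n' "$1" ;;    # already absolute or explicitly relative
        *)           printf './%s\n' "$1" ;;  # bare name: prefix the current directory
    esac
}

with_path SRR770754.sra
```

The failing command above could then be written with -db "$(with_path SRR770754.sra)".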