comparison libs/sratoolkit.2.8.0-centos_linux64/README-blastn @ 3:38ad1130d077 draft
planemo upload commit a4fb57231f274270afbfebd47f67df05babffa4a-dirty
| author | charles_s_test |
|---|---|
| date | Mon, 27 Nov 2017 11:21:07 -0500 |
| parents | |
| children |
comparison
| 2:0d65b71ff8df | 3:38ad1130d077 |
|---|---|
| 1 # =========================================================================== | |
| 2 # | |
| 3 # PUBLIC DOMAIN NOTICE | |
| 4 # National Center for Biotechnology Information | |
| 5 # | |
| 6 # This software/database is a "United States Government Work" under the | |
| 7 # terms of the United States Copyright Act. It was written as part of | |
| 8 # the author's official duties as a United States Government employee and | |
| 9 # thus cannot be copyrighted. This software/database is freely available | |
| 10 # to the public for use. The National Library of Medicine and the U.S. | |
| 11 # Government have not placed any restriction on its use or reproduction. | |
| 12 # | |
| 13 # Although all reasonable efforts have been taken to ensure the accuracy | |
| 14 # and reliability of the software and data, the NLM and the U.S. | |
| 15 # Government do not and cannot warrant the performance or results that | |
| 16 # may be obtained by using this software or data. The NLM and the U.S. | |
| 17 # Government disclaim all warranties, express or implied, including | |
| 18 # warranties of performance, merchantability or fitness for any particular | |
| 19 # purpose. | |
| 20 # | |
| 21 # Please cite the author in any work or product based on this material. | |
| 22 # | |
| 23 # =========================================================================== | |
| 24 | |
| 25 | |
| 26 The NCBI SRA ( Sequence Read Archive ) | |
| 27 | |
| 28 | |
| 29 Contact: sra-tools@ncbi.nlm.nih.gov | |
| 30 http://trace.ncbi.nlm.nih.gov/Traces/sra/std | |
| 31 | |
| 32 | |
| 33 Stand-alone BLAST searches against SRA runs in their native format. | |
| 34 ------------------------------------------------------------------- | |
| 35 | |
| 36 A stand-alone blastn application to perform BLAST searches directly against | |
| 37 native SRA files is included in this distribution. This application has been | |
| 38 tested in-house at the NCBI, but has not been heavily used, so this should be | |
| 39 considered a preliminary (alpha) release for a few experienced users. A 64-bit | |
| 40 Linux application has been built for this testing. | |
| 41 | |
| 42 The application is called "blastn_vdb". | |
| 43 | |
| 44 The application can be invoked in much the same manner as the standard | |
| 45 blastn application: | |
| 46 | |
| 47 1) blastn_vdb -help or blastn_vdb -h will produce usage messages. | |
| 48 | |
| 49 2) The BLAST+ command-line manual at http://www.ncbi.nlm.nih.gov/books/NBK1763/ | |
| 50 provides more details on the options, though not all blastn options are | |
| 51 available with blastn_vdb. Some options simply do not apply to sequences in SRA | |
| 52 (e.g., -gilist is missing as these sequences have not been assigned GIs). Some | |
| 53 options have not yet been implemented (e.g., -num_threads is currently disabled). | |
| 54 | |
| 55 | |
| 56 To search cached or on-demand SRA objects. | |
| 57 ------------------------------------------ | |
| 58 An example blastn_vdb command line would be: | |
| 59 | |
| 60 ./blastn_vdb -db "ERR039542 ERR047215 ERR039539 ERR039540" -query nt.test -out test.out | |
| 61 | |
| 62 The file nt.test contains the query in FASTA format, and it will be searched against | |
| 63 the reads in runs with accessions ERR039542 ERR047215 ERR039539 ERR039540. | |
| 64 | |
| 65 If you have not already downloaded these objects using the vdb "prefetch" tool, | |
| 66 they will be retrieved on-demand from NCBI under standard configuration. For | |
| 67 alternative configuration information, please see the "README-vdb-config" file | |
| 68 in this distribution. | |
| 69 | |
| 70 Searching with manually downloaded files. | |
| 71 ----------------------------------------- | |
| 72 If you have manually downloaded files, e.g. via Aspera or wget, they may | |
| 73 be referred to as "local" files. You can pass one or more file paths to be used | |
| 74 collectively as the database. In this case the blastn_vdb command line would be: | |
| 75 | |
| 76 ./blastn_vdb -db <SRR_file> -query <input_file> -out <output_file> | |
| 77 | |
| 78 Where | |
| 79 <SRR_file> is the path (relative or absolute) and name of the SRRxxxxx file | |
| 80 <input_file> is a fasta file containing the sequence(s) to be BLASTed | |
| 81 <output_file> is the name specified for the output report of the blast search. | |
| 82 | |
| 83 Example: | |
| 84 | |
| 85 ./blastn_vdb -db ./subdir/ERR039542.sra -query nt.test -out test.out | |
| 86 | |
| 87 Querying multiple SRR files simultaneously: | |
| 88 | |
| 89 ./blastn_vdb -db "<SRR_file1> <SRR_file2> <SRR_file3>" -query <input.fa> -out <output_file> | |
| 90 | |
| 91 Enclose the group of files to be included in the search set in "quotes", e.g. | |
| 92 "./SRR_file1.sra ./SRR_file2.sra ./SRR_file3.sra" | |
| 93 | |
| 94 Example: | |
| 95 | |
| 96 ./blastn_vdb -db "./ERR039542.sra ./ERR047215.sra ./ERR039539.sra ./ERR039540.sra" -query nt.test -out test.out | |
| 97 | |
| 98 Caveats | |
| 99 ------- | |
| 100 There are some limitations on the currently available application: | |
| 101 | |
| 102 1) Individual SRA data files containing more than 2 billion reads are not yet supported. For a | |
| 103 paired-end experiment this is actually a limitation of about 1 billion "spots". | |
| 104 | |
| 105 2) Compressed SRA ("cSRA") is not yet fully supported. Currently, only the | |
| 106 unaligned fraction of reads is searched. Compressed SRA runs are those containing | |
| 107 alignments (e.g., ERR230455). Runs can be checked with "vdb-dump" to report if | |
| 108 they contain alignment information: | |
| 109 | |
| 110 $ vdb-dump -E ERR230455 | |
| 111 enumerating the tables of database 'ERR230455' | |
| 112 tbl #1: PRIMARY_ALIGNMENT | |
| 113 tbl #2: REFERENCE | |
| 114 tbl #3: SEQUENCE | |
| 115 | |
| 116 3) You may need to prefix "./" to the run name for files in your current | |
| 117 directory. | |
| 118 | |
| 119 4) The blast_formatter is not currently able to read native SRA files, so | |
| 120 reformatting of results saved as a blast archive is not yet supported. | |
| 121 | |
| 122 Common errors and fixes. | |
| 123 ------------------------ | |
| 124 | |
| 125 1) Failure to provide a path (relative or absolute) to a manually downloaded SRR file: | |
| 126 | |
| 127 ./blastn_vdb -db SRR770754.sra -query srr770754_test.fa -out test.out | |
| 128 Error: NCBI C++ Exception: | |
| 129 "vdb2blast_util.cpp", line 253: Error: ncbi::CVDBBlastUtil::x_MakeSRASeqSrc() | |
| 130 - VDB BlastSeqSrc construction failed: Failed to add any run to VDB runset: unsupported while allocating | |
| 131 | |
| 132 Fix: | |
| 133 Include a relative (e.g., "../" or "./") or absolute (e.g., "/home/user/SRA_BLAST_data/") file path with the SRR file |
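
The path fix above, together with the quoting rule for multiple local files, can be sketched as a small Python helper. This is only an illustration, not part of the SRA toolkit: the function names `normalize_run_path` and `build_blastn_vdb_command` are hypothetical, and the sketch assumes every entry is a local file (accessions searched on demand must not be given a "./" prefix) and that no path contains spaces. It uses only the -db, -query and -out options shown in the README.

```python
import shlex


def normalize_run_path(path):
    """Return path unchanged if it already has an explicit location prefix,
    otherwise prefix "./" so the file is looked up in the current directory."""
    # Hypothetical helper: covers absolute paths plus "./" and "../" prefixes.
    if path.startswith(("/", "./", "../")):
        return path
    return "./" + path


def build_blastn_vdb_command(run_files, query, out, exe="./blastn_vdb"):
    """Assemble an argument list for a search over one or more local files."""
    # All files are joined into ONE space-separated -db value; this is what
    # the double quotes in the README examples achieve on a shell command line.
    db_value = " ".join(normalize_run_path(p) for p in run_files)
    return [exe, "-db", db_value, "-query", query, "-out", out]


if __name__ == "__main__":
    cmd = build_blastn_vdb_command(
        ["ERR039542.sra", "subdir/ERR047215.sra"], "nt.test", "test.out"
    )
    # Print a shell-quoted form of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list could be passed to `subprocess.run` as-is; because the -db value is a single list element, no extra shell quoting is needed in that case.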
