diff libs/sratoolkit.2.8.0-centos_linux64/README-blastn @ 3:38ad1130d077 draft
planemo upload commit a4fb57231f274270afbfebd47f67df05babffa4a-dirty
author | charles_s_test |
---|---|
date | Mon, 27 Nov 2017 11:21:07 -0500 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/libs/sratoolkit.2.8.0-centos_linux64/README-blastn Mon Nov 27 11:21:07 2017 -0500
@@ -0,0 +1,133 @@
+# ===========================================================================
+#
+# PUBLIC DOMAIN NOTICE
+# National Center for Biotechnology Information
+#
+# This software/database is a "United States Government Work" under the
+# terms of the United States Copyright Act. It was written as part of
+# the author's official duties as a United States Government employee and
+# thus cannot be copyrighted. This software/database is freely available
+# to the public for use. The National Library of Medicine and the U.S.
+# Government have not placed any restriction on its use or reproduction.
+#
+# Although all reasonable efforts have been taken to ensure the accuracy
+# and reliability of the software and data, the NLM and the U.S.
+# Government do not and cannot warrant the performance or results that
+# may be obtained by using this software or data. The NLM and the U.S.
+# Government disclaim all warranties, express or implied, including
+# warranties of performance, merchantability or fitness for any particular
+# purpose.
+#
+# Please cite the author in any work or product based on this material.
+#
+# ===========================================================================
+
+
+The NCBI SRA ( Sequence Read Archive )
+
+
+Contact: sra-tools@ncbi.nlm.nih.gov
+http://trace.ncbi.nlm.nih.gov/Traces/sra/std
+
+
+Stand-alone BLAST searches against SRA runs in their native format.
+-------------------------------------------------------------------
+
+A stand-alone blastn application to perform BLAST searches directly against
+native SRA files is included in this distribution. This application has been
+tested in-house at the NCBI, but has not been heavily used, so this should be
+considered a preliminary (alpha) release to a few experienced users. A 64-bit
+LINUX application has been built for this testing.
+
+The application is called "blastn_vdb".
+
+The application can be invoked in much the same manner as the standard
+blastn application:
+
+1) blastn_vdb -help or blastn_vdb -h will produce usage messages.
+
+2) The BLAST+ command-line manual at http://www.ncbi.nlm.nih.gov/books/NBK1763/
+provides more details on the options, though not all blastn options are
+available with blastn_vdb. Some options simply do not apply to sequences in SRA
+(e.g., -gilist is missing as these sequences have not been assigned GI's). Some
+options have not yet been implemented (e.g., -num_threads is currently disabled).
+
+
+To search cached or on-demand SRA objects.
+------------------------------------------
+An example blastn_vdb command-line would be:
+
+./blastn_vdb -db "ERR039542 ERR047215 ERR039539 ERR039540" -query nt.test -out test.out
+
+The file nt.test contains the query in FASTA format, and it will be searched against
+the reads in runs with accessions ERR039542 ERR047215 ERR039539 ERR039540.
+
+If you have not already downloaded these objects using the vdb "prefetch" tool,
+they will be retrieved on-demand from NCBI under standard configuration. For
+alternative configuration information, please see the "README-vdb-config" file
+in this distribution.
+
+Searching with manually downloaded files.
+-----------------------------------------
+If you have manually downloaded files, e.g. via aspera or wget, etc., they may
+be referred to as "local" files. You can pass one or more file paths to be used
+collectively as the database. In this case the blastn_vdb command-line would be:
+
+./blastn_vdb -db <SRR_file> -query <input_file> -out <output_file>
+
+Where
+<SRR_file> is the path (relative or absolute) and name of the SRRxxxxx file
+<input_file> is a fasta file containing the sequence(s) to be BLASTed
+<output_file> is the name specified for the output report of the blast search.
+
+Example:
+
+./blastn_vdb -db ./subdir/ERR039542.sra -query nt.test -out test.out
+
+Querying multiple SRR files simultaneously:
+
+./blastn_vdb -db "<SRR_file1> <SRR_file2> <SRR_file3>" -query <input.fa> -out <output_file>
+
+Enclose the group of files to be included in the search set in "quotes", e.g.
+"./SRR_file1.sra ./SRR_file2.sra ./SRR_file3.sra"
+
+Example:
+
+./blastn_vdb -db "./ERR039542.sra ./ERR047215.sra ./ERR039539.sra ./ERR039540.sra" -query nt.test -out test.out
+
+Caveats
+-------
+There are some limitations on the currently available application:
+
+1) Individual SRA data files containing more than 2 billion reads are not yet supported. For a
+paired-end experiment this is actually a limitation of about 1 billion "spots".
+
+2) Compressed SRA ("cSRA") is not yet fully supported. Currently, only the
+unaligned fraction of reads are searched. Compressed SRA are runs containing
+alignments (e.g., ERR230455). Runs can be checked with "vdb-dump" to report if
+they contain alignment information:
+
+    $ vdb-dump -E ERR230455
+    enumerating the tables of database 'ERR230455'
+    tbl #1: PRIMARY_ALIGNMENT
+    tbl #2: REFERENCE
+    tbl #3: SEQUENCE
+
+3) You may need to prefix "./" to the run name for files in your current
+directory.
+
+4) The blast_formatter is not currently able to read native SRA files, so
+reformatting of results saved as a blast archive is not yet supported.
+
+Common errors and fixes.
+------------------------
+
+1) Failure to provide relative path to manually downloaded SRR file:
+
+./blastn_vdb -db SRR770754.sra -query srr770754_test.fa -out test.out
+Error: NCBI C++ Exception:
+    "vdb2blast_util.cpp", line 253: Error: ncbi::CVDBBlastUtil::x_MakeSRASeqSrc()
+        - VDB BlastSeqSrc construction failed: Failed to add any run to VDB runset: unsupported while allocating
+
+Fix:
+Include relative (e.g., "../" or "./") or absolute (e.g., "/home/user/SRA_BLAST_data/") file path with SRR file
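
Note (not part of the committed README): the path fix described under "Common errors and fixes" could be sketched as a small POSIX shell helper that prefixes "./" to bare file names before they are joined into a quoted -db value. The function names qualify_path and build_db_arg are hypothetical and are not provided by the SRA Toolkit. The sketch is meant for local files only, since accessions such as ERR039542 must be passed unchanged, and it uses only the -db, -query and -out options shown in the README.

```shell
#!/bin/sh
# Hypothetical helpers (assumption: not part of the SRA Toolkit).
# Intended for manually downloaded local files, not for run accessions.

# Print the path unchanged if it is absolute or already starts with
# "./" or "../"; otherwise prefix "./" so it is explicitly relative.
qualify_path() {
    case "$1" in
        /*|./*|../*) printf '%s' "$1" ;;
        *)           printf './%s' "$1" ;;
    esac
}

# Join all arguments into one space-separated string suitable for
# passing, inside quotes, as the value of -db.
build_db_arg() {
    db=""
    for f in "$@"; do
        db="${db:+$db }$(qualify_path "$f")"
    done
    printf '%s\n' "$db"
}

# Example use (assumes blastn_vdb sits in the current directory, as in the
# README examples):
#   ./blastn_vdb -db "$(build_db_arg ERR039542.sra ERR047215.sra)" \
#       -query nt.test -out test.out
```

Because -db takes a space-separated list, file paths that themselves contain spaces are outside what this sketch handles.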