hub-archive-creator/test-data/tblastN/readme/README.md @ 6:9193fe3ee73f (Uploaded; author yating-l; date Thu, 22 Dec 2016 15:59:24 -0500)

Conversion of NCBI BLAST+ tblastn results to PSL format
=======================================================
Wilson Leung <wleung@wustl.edu>

Last Update: 04/24/2016


Version information
-------------------
* Kent source tree: v324
* NCBI BLAST+: BLAST 2.2.30+

Data sources
-------------------
For testing purposes, the database consists only of contig1 from the Dbia3 assembly, while the query protein sequences are the three isoforms of the *D. melanogaster* *ci* gene in contig1. The protein sequences are available through [FlyBase](http://flybase.org/cgi-bin/getseq.html?source=dmel&id=FBgn0004859&chr=4&dump=PrecompiledFasta&targetset=translation).

* Dbia3.fa = contig1 sequence in the Dbia3 assembly
* ci.pep = Protein sequences for the three isoforms of the *ci* gene in *D. melanogaster*

Conversion protocol
-----------------------
1. Create BLAST database for the assembly
```
makeblastdb -in Dbia3.fa -dbtype nucl
```

2. Perform tblastn search and output results in XML format
```
tblastn -outfmt 5 -db Dbia3.fa -query ci.pep -out tblastn_Dbia3_ci.xml -evalue 1e-2
```

3. Convert results into PSL format
```
blastXmlToPsl -convertToNucCoords tblastn_Dbia3_ci.xml tblastn_Dbia3_ci.xml.psl
```

4. Convert PSL output into BED format
```
pslToBed tblastn_Dbia3_ci.xml.psl tblastn_Dbia3_ci.xml.bed
```
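
The four steps above can be chained into a single shell function. This is only a minimal sketch, not part of the original protocol: the names `run`, `tblastn_to_bed`, and `DRY_RUN` are hypothetical helpers introduced for illustration, and the output prefix is derived from the input file names so that it mirrors the file names used above. The commands and flags themselves are the ones from steps 1 to 4.

```shell
#!/usr/bin/env bash
# Sketch: chain steps 1-4 of the conversion protocol.
# run, tblastn_to_bed, and DRY_RUN are illustrative names; they are not
# part of BLAST+ or the Kent source tree.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: tblastn_to_bed <assembly.fa> <proteins.pep> [evalue]
tblastn_to_bed() {
    local db="$1" query="$2" evalue="${3:-1e-2}"
    local db_name query_name prefix

    # Derive a prefix such as tblastn_Dbia3_ci from Dbia3.fa and ci.pep.
    db_name="$(basename "${db%.*}")"
    query_name="$(basename "${query%.*}")"
    prefix="tblastn_${db_name}_${query_name}"

    # Stop at the first step that fails.
    run makeblastdb -in "$db" -dbtype nucl || return 1
    run tblastn -outfmt 5 -db "$db" -query "$query" -out "${prefix}.xml" -evalue "$evalue" || return 1
    run blastXmlToPsl -convertToNucCoords "${prefix}.xml" "${prefix}.xml.psl" || return 1
    run pslToBed "${prefix}.xml.psl" "${prefix}.xml.bed" || return 1
}

# Example: preview the commands without running them.
# DRY_RUN=1 tblastn_to_bed Dbia3.fa ci.pep
```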

Output files
-----------------------
* tblastn_Dbia3_ci.xml = tblastn results in XML format
* tblastn_Dbia3_ci.xml.psl = tblastn results in PSL format
* tblastn_Dbia3_ci.xml.bed = tblastn results in BED format

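
A quick consistency check on the PSL and BED files listed above might look like the following. It is a sketch under one assumption: that `pslToBed` writes one BED line per PSL alignment, so the two record counts should agree. The helper names `count_psl_records`, `count_bed_records`, and `check_psl_bed` are hypothetical and do not come from the Kent tools.

```shell
#!/usr/bin/env bash
# Sketch: compare record counts between the PSL and BED outputs.
# All function names here are illustrative helpers.

# Count PSL alignment lines. Alignment lines start with an integer
# (the match count), so an optional psLayout header is skipped.
count_psl_records() {
    awk -F '\t' '$1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }' "$1"
}

# Count BED lines, ignoring blank lines, comments, and track/browser lines.
count_bed_records() {
    awk 'NF > 0 && $1 != "track" && $1 != "browser" && $1 !~ /^#/ { n++ } END { print n + 0 }' "$1"
}

# Usage: check_psl_bed <file.psl> <file.bed>
check_psl_bed() {
    local psl_n bed_n
    psl_n="$(count_psl_records "$1")"
    bed_n="$(count_bed_records "$2")"
    if [ "$psl_n" -eq "$bed_n" ]; then
        echo "OK: $psl_n records"
    else
        echo "MISMATCH: $psl_n PSL records, $bed_n BED lines" >&2
        return 1
    fi
}

# Example:
# check_psl_bed tblastn_Dbia3_ci.xml.psl tblastn_Dbia3_ci.xml.bed
```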