annotate variant_effect_predictor/Bio/Search/Hit/BlastHit.pm @ 0:1f6dce3d34e0

Uploaded
author mahtabm
date Thu, 11 Apr 2013 02:01:53 -0400
#-----------------------------------------------------------------
# $Id: BlastHit.pm,v 1.13 2002/10/22 09:36:19 sac Exp $
#
# BioPerl module Bio::Search::Hit::BlastHit
#
# (This module was originally called Bio::Tools::Blast::Sbjct)
#
# Cared for by Steve Chervitz <sac@bioperl.org>
#
# You may distribute this module under the same terms as perl itself
#-----------------------------------------------------------------

## POD Documentation:

=head1 NAME

Bio::Search::Hit::BlastHit - Bioperl BLAST Hit object

=head1 SYNOPSIS

The construction of BlastHit objects is performed by
Bio::SearchIO::blast::BlastHitFactory in a process that is
orchestrated by the Blast parser (B<Bio::SearchIO::blast::blast>).
The resulting BlastHits are then accessed via
B<Bio::Search::Result::BlastResult>. Therefore, you do not need to
use B<Bio::Search::Hit::BlastHit> directly. If you need to construct
BlastHits directly, see the new() function for details.

For B<Bio::SearchIO> BLAST parsing usage examples, see the
B<examples/search-blast> directory of the Bioperl distribution.


=head1 DESCRIPTION

The Bio::Search::Hit::BlastHit.pm module encapsulates data and methods
for manipulating "hits" from a BLAST report. A BLAST hit is a
collection of HSPs (high-scoring segment pairs) along with other
metadata such as sequence name and score information. Hit objects are
accessed via B<Bio::Search::Result::BlastResult> objects after parsing
a BLAST report using the B<Bio::SearchIO> system.

In Blast lingo, the "sbjct" sequences are all the sequences
in a target database which were compared against a "query" sequence.
The terms "sbjct" and "hit" will be used interchangeably in this module.
All methods that take 'sbjct' as an argument also support 'hit' as a
synonym.

This module supports BLAST versions 1.x and 2.x, gapped and ungapped,
and PSI-BLAST.


=head2 HSP Tiling and Ambiguous Alignments

If a Blast hit has more than one HSP, the Bio::Search::Hit::BlastHit.pm
object has the ability to merge overlapping HSPs into contiguous
blocks. This permits the BlastHit object to sum data across all HSPs
without counting data in the overlapping regions multiple times, which
would happen if data from each overlapping HSP were simply summed. HSP
tiling is performed automatically when methods of the BlastHit object
that rely on tiled data are invoked. These include
L<frac_identical()|frac_identical>, L<frac_conserved()|frac_conserved>, L<gaps()|gaps>,
L<frac_aligned_query()|frac_aligned_query>, L<frac_aligned_hit()|frac_aligned_hit>,
L<num_unaligned_query()|num_unaligned_query>, L<num_unaligned_hit()|num_unaligned_hit>.

Tiling also permits the assessment of an "ambiguous alignment" if the
query (or sbjct) sequences from different HSPs overlap
(see L<ambiguous_aln()|ambiguous_aln>). The existence
of an overlap could indicate a biologically interesting region in the
sequence, such as a repeated domain. The BlastHit object uses the
C<-OVERLAP> parameter to determine when two sequences overlap; if this is
set to 2 -- the default -- then any two sbjct or query HSP sequences
must overlap by more than two residues to get merged into the same
contig and counted as an overlap. See the L<BUGS | BUGS> section below for
"issues" with HSP tiling.

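The merging rule described above can be sketched in plain Perl. This is
only an illustration under stated assumptions, not the code used by
Bio::Search::BlastUtils::tile_hsps(): the helper name tile_ranges() and
the [start, end] pair representation are made up for this example. Two
ranges end up in the same contig only when they share more residues
than the overlap threshold.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper (not part of BioPerl): merge [start, end] ranges
# into contigs when they share more than $max_overlap residues.
sub tile_ranges {
    my ($max_overlap, @ranges) = @_;
    my @contigs;
    for my $r (sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @ranges) {
        my ($start, $end) = @$r;
        my $last = $contigs[-1];
        # Residues shared with the current contig (<= 0 means no overlap).
        my $shared = $last ? min($end, $last->[1]) - $start + 1 : 0;
        if ($last && $shared > $max_overlap) {
            $last->[1] = $end if $end > $last->[1];   # extend the contig
        }
        else {
            push @contigs, [$start, $end];            # start a new contig
        }
    }
    return @contigs;
}

# Two ranges sharing five residues (18..22).
my @hsps    = ([8, 22], [18, 30]);
my @tiled   = tile_ranges(2, @hsps);   # threshold 2: merged
my @untiled = tile_ranges(5, @hsps);   # threshold 5: kept apart
printf "threshold 2: %d contig(s)\n", scalar @tiled;
printf "threshold 5: %d contig(s)\n", scalar @untiled;
```

With a threshold of 2 the two ranges collapse into a single contig
covering 8..30; with a threshold of 5 they stay separate, because five
shared residues is not more than five.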

The results of the HSP tiling are reported with the following ambiguity codes:

   'q' = Query sequence contains multiple sub-sequences matching
         a single region in the sbjct sequence.

   's' = Subject (BlastHit) sequence contains multiple sub-sequences matching
         a single region in the query sequence.

   'qs' = Both query and sbjct sequences contain more than one
          sub-sequence with similarity to the other sequence.

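A caller that wants to report these codes in readable form could keep
them in a small lookup table. The sketch below is an assumption-laden
illustration: the helper describe_ambiguity() and the fallback message
for unknown or missing codes are hypothetical and are not taken from
the module.

```perl
use strict;
use warnings;

# Meanings of the ambiguity codes, paraphrased from the list above.
my %AMBIGUITY = (
    'q'  => 'multiple query sub-sequences match one sbjct region',
    's'  => 'multiple sbjct sub-sequences match one query region',
    'qs' => 'both sequences contain more than one similar sub-sequence',
);

# Hypothetical helper: turn a code into a readable message.
sub describe_ambiguity {
    my ($code) = @_;
    return 'no ambiguity reported'
        unless defined $code && exists $AMBIGUITY{$code};
    return $AMBIGUITY{$code};
}

print describe_ambiguity('qs'), "\n";
```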

For additional information about ambiguous BLAST alignments, see
L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils> and

 http://www-genome.stanford.edu/Sacch3D/help/ambig_aln.html

=head1 DEPENDENCIES

Bio::Search::Hit::BlastHit.pm is a concrete class that inherits from
B<Bio::Root::Root> and B<Bio::Search::Hit::HitI> and relies on
B<Bio::Search::HSP::BlastHSP>.


=head1 BUGS

One consequence of the HSP tiling is that methods that rely on HSP
tiling such as L<frac_identical()|frac_identical>, L<frac_conserved()|frac_conserved>, L<gaps()|gaps>
etc. may report misleading numbers when C<-OVERLAP> is set to a large
number. For example, say we have two HSPs and the query sequences tile
as follows:

            1      8             22      30        40                  60
 Full seq:  ------------------------------------------------------------
                    *  ** *    **
 HSP1:             ---------------                 (6 identical matches)
                               **   **  **
 HSP2:                       -------------         (6 identical matches)


If C<-OVERLAP> is set to some number over 4, HSP1 and HSP2 will not be
tiled into a single contig and their numbers of identical matches will
be added, giving a total of 12 rather than the 10 that would result had
they been combined into one contig. This can lead to numbers greater
than 1.0 for methods L<frac_identical()|frac_identical> and
L<frac_conserved()|frac_conserved>. This is less of an issue
with gapped Blast since it tends to combine HSPs that would be listed
separately without gapping. (Fractions E<gt>1.0 can be viewed as a
signal for an interesting alignment that warrants further inspection,
thus turning this bug into a feature :-).)

Using large values for C<-OVERLAP> can lead to incorrect numbers
reported by methods that rely on HSP tiling but can be useful if you
care more about detecting ambiguous alignments. Setting C<-OVERLAP>
to zero will lead to the most accurate numbers for the
tiling-dependent methods but will be useless for detecting overlapping
HSPs since all HSPs will appear to overlap.

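The difference between 12 and 10 in the example can be checked with a
few lines of Perl. The match positions below are hypothetical: they are
chosen only so that each HSP has six identical matches and two of them
fall in the region both HSPs cover. The sketch counts matches once per
HSP (untiled) and once per position (tiled).

```perl
use strict;
use warnings;

# Hypothetical positions of identical matches in each HSP.
my @hsp1_identical = (9, 12, 13, 15, 20, 21);
my @hsp2_identical = (20, 21, 25, 26, 29, 30);

# Untiled: the per-HSP counts are simply added.
my $summed = @hsp1_identical + @hsp2_identical;

# Tiled: each position is counted only once.
my %seen;
$seen{$_} = 1 for @hsp1_identical, @hsp2_identical;
my $tiled = keys %seen;

print "summed: $summed, tiled: $tiled\n";   # summed: 12, tiled: 10
```

Dividing the summed count by a length that counts the shared region
only once is what can push a fraction above 1.0.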

=head1 SEE ALSO

 Bio::Search::HSP::BlastHSP.pm          - Blast HSP object.
 Bio::Search::Result::BlastResult.pm    - Blast Result object.
 Bio::Search::Hit::HitI.pm              - Interface implemented by BlastHit.pm
 Bio::Root::Root.pm                     - Base class for BlastHit.pm

Links:

 http://bio.perl.org/Core/POD/Search/Hit/Blast/BlastHSP.pm.html

 http://bio.perl.org/Projects/modules.html  - Online module documentation
 http://bio.perl.org/Projects/Blast/        - Bioperl Blast Project
 http://bio.perl.org/                       - Bioperl Project Homepage


=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.

 bioperl-l@bioperl.org              - General discussion
 http://bio.perl.org/MailList.html  - About the mailing lists

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via email
or the web:

 bioperl-bugs@bio.perl.org
 http://bio.perl.org/bioperl-bugs/

=head1 AUTHOR

Steve Chervitz E<lt>sac@bioperl.orgE<gt>

See L<the FEEDBACK section | FEEDBACK> for where to send bug reports and comments.

=head1 ACKNOWLEDGEMENTS

This software was originally developed in the Department of Genetics
at Stanford University. I would also like to acknowledge my
colleagues at Affymetrix for useful feedback.

=head1 COPYRIGHT

Copyright (c) 1996-2001 Steve Chervitz. All Rights Reserved.

=head1 DISCLAIMER

This software is provided "as is" without warranty of any kind.

=head1 APPENDIX

The rest of the documentation details each of the object methods.
Internal methods are usually preceded with a _

=cut


# Let the code begin...

package Bio::Search::Hit::BlastHit;

use strict;
use Bio::Search::Hit::HitI;
use Bio::Root::Root;
require Bio::Search::BlastUtils;
use vars qw( @ISA %SUMMARY_OFFSET $Revision);

use overload
    '""' => \&to_string;

@ISA = qw( Bio::Root::Root Bio::Search::Hit::HitI );

$Revision = '$Id: BlastHit.pm,v 1.13 2002/10/22 09:36:19 sac Exp $';  #'


=head2 new

 Usage     : $hit = Bio::Search::Hit::BlastHit->new( %named_params );
           : Bio::Search::Hit::BlastHit.pm objects are constructed
           : automatically by Bio::SearchIO::BlastHitFactory.pm,
           : so there is no need for direct instantiation.
 Purpose   : Constructs a new BlastHit object and initializes key variables
           : for the hit.
 Returns   : A Bio::Search::Hit::BlastHit object
 Argument  : Named Parameters:
           : Parameter keys are case-insensitive.
           :      -RAW_DATA       => array reference holding raw BLAST report data
           :                         for a single hit. This includes all lines
           :                         within the HSP alignment listing section of a
           :                         traditional BLAST or PSI-BLAST (non-XML) report,
           :                         starting at (or just after) the leading '>'.
           :      -HOLD_RAW_DATA  => boolean, should -RAW_DATA be saved within the object.
           :      -QUERY_LEN      => Length of the query sequence
           :      -ITERATION      => integer (PSI-BLAST iteration number in which hit was found)
           :      -OVERLAP        => integer (maximum overlap between adjacent
           :                         HSPs when tiling)
           :      -PROGRAM        => string (type of Blast: BLASTP, BLASTN, etc)
           :      -SIGNIF         => significance (stored as a P-value if -IS_PVAL
           :                         is true, otherwise as an Expect value)
           :      -IS_PVAL        => boolean, true if -SIGNIF contains a P-value
           :      -SCORE          => raw BLAST score
           :      -FOUND_AGAIN    => boolean, true if this was a hit from the
           :                         section of a PSI-BLAST with iteration > 1
           :                         containing sequences that were also found
           :                         in iteration 1.
 Comments  : This object accepts raw Blast report data not because it
           : is required for parsing, but in order to retrieve it
           : (only available if -HOLD_RAW_DATA is set to true).

See Also   : L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>, L<Bio::Root::Root::new()|Bio::Root::Root>

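The case-insensitive named-parameter handling can be illustrated
without BioPerl installed. The sketch below is a simplified stand-in
and makes several assumptions: rearrange_named() is a hypothetical
helper, it handles only the named (dash-prefixed) calling style, and it
is not the inherited _rearrange() method the constructor actually
calls. The last line mirrors the constructor's rule for storing the
significance value.

```perl
use strict;
use warnings;

# Hypothetical stand-in for named-parameter handling: returns the values
# for @$order, matching keys case-insensitively and ignoring the dash.
sub rearrange_named {
    my ($order, %params) = @_;
    my %by_key;
    for my $key (keys %params) {
        (my $norm = uc $key) =~ s/^-//;
        $by_key{$norm} = $params{$key};
    }
    return map { $by_key{ uc $_ } } @$order;
}

my ($program, $signif, $is_pval) = rearrange_named(
    [qw(PROGRAM SIGNIF IS_PVAL)],
    -program => 'BLASTP',
    -Signif  => 1e-50,
    -IS_PVAL => 0,
);

# -SIGNIF is kept as a P-value only when -IS_PVAL is true.
my %hit = $is_pval ? (_p => $signif) : (_expect => $signif);
print "$program stored under: ", join(',', keys %hit), "\n";
```

Because keys are normalized before lookup, -program, -PROGRAM and
-Program all select the same slot in this sketch.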

=cut

#-------------------
sub new {
#-------------------
    my ($class, @args ) = @_;
    my $self = $class->SUPER::new( @args );

    my ($raw_data, $signif, $is_pval, $hold_raw);

    ($self->{'_blast_program'}, $self->{'_query_length'}, $raw_data, $hold_raw,
     $self->{'_overlap'}, $self->{'_iteration'}, $signif, $is_pval,
     $self->{'_score'}, $self->{'_found_again'} ) =
       $self->_rearrange( [qw(PROGRAM
                              QUERY_LEN
                              RAW_DATA
                              HOLD_RAW_DATA
                              OVERLAP
                              ITERATION
                              SIGNIF
                              IS_PVAL
                              SCORE
                              FOUND_AGAIN )], @args );

    # TODO: Handle this in parser. Just pass in name parameter.
    $self->_set_id( $raw_data->[0] );

    if($is_pval) {
        $self->{'_p'} = $signif;
    } else {
        $self->{'_expect'} = $signif;
    }

    if( $hold_raw ) {
        $self->{'_hit_data'} = $raw_data;
    }

    return $self;
}

sub DESTROY {
    my $self=shift;
    #print STDERR "-->DESTROYING $self\n";
}


#=================================================
# Begin Bio::Search::Hit::HitI implementation
#=================================================

=head2 algorithm

 Title   : algorithm
 Usage   : $alg = $hit->algorithm();
 Function: Gets the algorithm specification that was used to obtain the hit.
           For BLAST, the algorithm denotes what type of sequence was aligned
           against what (BLASTN: dna-dna, BLASTP prt-prt, BLASTX translated
           dna-prt, TBLASTN prt-translated dna, TBLASTX translated
           dna-translated dna).
 Returns : a scalar string
 Args    : none

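The program-to-sequence-type pairs listed above can be held in a small
table keyed by the string that algorithm() returns. This is a sketch
only: the %ALIGNMENT_TYPE hash and the alignment_type() helper are
hypothetical names introduced here, and the query-then-subject ordering
of each pair is an assumption about how the list above is meant to be
read.

```perl
use strict;
use warnings;

# Query/subject sequence types per program, as listed above.
my %ALIGNMENT_TYPE = (
    BLASTN  => ['dna',            'dna'],
    BLASTP  => ['prt',            'prt'],
    BLASTX  => ['translated dna', 'prt'],
    TBLASTN => ['prt',            'translated dna'],
    TBLASTX => ['translated dna', 'translated dna'],
);

# Hypothetical helper: describe what a program name implies.
# Returns nothing for an unknown or undefined program name.
sub alignment_type {
    my ($algorithm) = @_;
    my $types = $ALIGNMENT_TYPE{ uc($algorithm // '') }
        or return;
    return sprintf '%s query vs. %s subject', @$types;
}

print alignment_type('TBLASTN'), "\n";
```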

=cut

#----------------
sub algorithm {
#----------------
    my ($self,@args) = @_;
    return $self->{'_blast_program'};
}

=head2 name

 Usage     : $hit->name([string]);
 Purpose   : Set/Get a string to identify the hit.
 Example   : $name = $hit->name;
           : $hit->name('M81707');
 Returns   : String consisting of the hit's name or undef if not set.
 Comments  : The name is parsed out of the hit's ">" line (the first line
             of -RAW_DATA) as the first chunk of non-whitespace text.
             If you want the rest of the line, use $hit->description().
             When setting, leading and trailing whitespace (or a trailing
             comma) is stripped.

See Also: L<accession()|accession>

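The clean-up applied when a name is set can be tried in isolation. The
sketch below wraps the substitution used by the name() method in a
hypothetical helper, clean_name(), so that it runs without the rest of
the class.

```perl
use strict;
use warnings;

# Same substitution as in name(), wrapped in a hypothetical helper.
sub clean_name {
    my ($name) = @_;
    $name =~ s/^\s+|(\s+|,)$//g;
    return $name;
}

print clean_name("  M81707  "), "\n";   # M81707
print clean_name("M81707,"), "\n";      # M81707
```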

=cut

#'

#----------------
sub name {
#----------------
    my $self = shift;
    if (@_) {
        my $name = shift;
        $name =~ s/^\s+|(\s+|,)$//g;
        $self->{'_name'} = $name;
    }
    return $self->{'_name'};
}

=head2 description

 Usage     : $hit_object->description( [integer] );
 Purpose   : Get a description string for the hit.
             This is parsed out of the hit's summary (description) line as
             everything after the first chunk of non-whitespace text.
             Use $hit->name() to get the first chunk (the ID of the sequence).
 Example   : $description = $hit->description;
           : $desc_60char = $hit->description(60);
 Argument  : Integer (optional) indicating the desired length of the
           : description string to be returned.
 Returns   : String consisting of the hit's description or undef if not set.
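
The effect of the optional length argument can be sketched with a stand-alone function. truncate_description() is a hypothetical helper written for this illustration; it assumes the description is a plain string and that an unset description should come back as undef.

```perl
use strict;
use warnings;

# Hypothetical stand-alone helper mirroring the optional length argument.
sub truncate_description {
    my ($desc, $len) = @_;
    return undef unless defined $desc;      # nothing stored yet
    return defined $len ? substr($desc, 0, $len) : $desc;
}

my $full_desc = 'S.cerevisiae chromosome XIII cosmid';
my $short     = truncate_description($full_desc, 12);  # first 12 characters
my $whole     = truncate_description($full_desc);      # no length: whole string
print "$short\n$whole\n";
```

A length larger than the string simply returns the whole string, since substr() stops at the end of its input.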

=cut

#'

#----------------
sub description {
#----------------
    my( $self, $len ) = @_;
    my $desc = $self->{'_description'};
    # Return undef if no description is set; return it whole if no length was given.
    return $desc unless defined $desc and defined $len;
    return substr( $desc, 0, $len );
}

=head2 accession

 Title   : accession
 Usage   : $acc = $hit->accession();
 Function: Retrieve the accession (if available) for the hit
 Returns : a scalar string (empty string if not set)
 Args    : none
 Comments: Accession numbers are extracted based on the assumption that they
           are delimited by | characters (NCBI-style). If this is not the case,
           use the name() method and parse it as necessary.

See Also: L<name()|name>
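
A minimal sketch of the |-delimited assumption, taking the last field of the ID. last_pipe_field() is a hypothetical helper and the sample IDs are made up for illustration; IDs whose final field is not the accession need their own parsing of name().

```perl
use strict;
use warnings;

# Hypothetical helper: take the last |-delimited field of a sequence ID.
sub last_pipe_field {
    my ($id) = @_;
    my @pieces = split /\|/, $id;   # split drops trailing empty fields
    return pop @pieces;
}

print last_pipe_field('emb|Z46660|SC9725'), "\n";
print last_pipe_field('gi|12345|gb|AAA00001.1|'), "\n";
print last_pipe_field('MYSEQ_1'), "\n";   # no delimiter: the whole ID comes back
```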

=cut

#--------------------
sub accession {
#--------------------
    my $self = shift;
    if(@_) { $self->{'_accession'} = shift; }
    $self->{'_accession'} || '';
}

=head2 raw_score

 Usage     : $hit_object->raw_score();
 Purpose   : Gets the BLAST score of the best HSP for the current Blast hit.
 Example   : $score = $hit_object->raw_score();
 Returns   : Integer
 Argument  : n/a
 Throws    : n/a

See Also   : L<bits()|bits>

=cut

#----------
sub raw_score {
#----------
    my $self = shift;

    # The check for $self->{'_score'} is a remnant from the 'query' mode days
    # in which the sbjct object would collect data from the description line only.

    my ($score);
    if(not defined($self->{'_score'})) {
        $score = $self->hsp->score;
    } else {
        $score = $self->{'_score'};
    }
    return $score;
}


=head2 length

 Usage     : $hit_object->length();
 Purpose   : Get the total length of the hit sequence.
 Example   : $len = $hit_object->length();
 Returns   : Integer
 Argument  : n/a
 Throws    : n/a
 Comments  : Developer note: when using the built-in length function within
           : this module, call it as CORE::length().

See Also   : L<logical_length()|logical_length>, L<length_aln()|length_aln>

=cut

#-----------
sub length {
#-----------
    my $self = shift;
    return $self->{'_length'};
}

=head2 significance

Equivalent to L<signif()|signif>

=cut

#----------------
sub significance { shift->signif( @_ ); }
#----------------


=head2 next_hsp

 Title    : next_hsp
 Usage    : $hsp = $obj->next_hsp();
 Function : returns the next available High Scoring Pair object
 Example  : while( my $hsp = $obj->next_hsp() ) { print $hsp->score, "\n"; }
 Returns  : Bio::Search::HSP::BlastHSP or undef if finished
 Args     : none
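
The iterator pattern behind this kind of method (fill a queue on the first call, hand out one item per call, return undef when the queue is empty) can be sketched with a mock class. MockHit is hypothetical and stores plain strings instead of HSP objects; it hands items out first-in, first-out, which is an assumption made for the illustration and may differ from the order this module uses.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a hit object; only the iteration logic is shown.
package MockHit;

sub new {
    my ($class, @hsps) = @_;
    return bless { _hsps => [@hsps] }, $class;
}

sub hsps { @{ $_[0]{_hsps} } }

sub next_hsp {
    my $self = shift;
    unless ($self->{_hsp_queue_started}) {
        # Copy the list once so that iterating does not modify the stored HSPs.
        $self->{_hsp_queue}         = [ $self->hsps ];
        $self->{_hsp_queue_started} = 1;
    }
    return shift @{ $self->{_hsp_queue} };   # undef once the queue is empty
}

package main;

my $hit = MockHit->new('hsp1', 'hsp2', 'hsp3');
my @seen;
while (defined(my $hsp = $hit->next_hsp)) {
    push @seen, $hsp;
}
print "@seen\n";
```

Because the queue is filled only once, the iterator stays exhausted after the loop; code that needs a second pass should call hsps() instead.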

=cut

#----------------
sub next_hsp {
#----------------
    my $self = shift;

    unless($self->{'_hsp_queue_started'}) {
        $self->{'_hsp_queue'} = [$self->hsps()];
        $self->{'_hsp_queue_started'} = 1;
    }
    # Note: pop takes HSPs from the end of the list, so they are returned
    # in the reverse of the order given by hsps().
    pop @{$self->{'_hsp_queue'}};
}

#=================================================
# End Bio::Search::Hit::HitI implementation
#=================================================


# Providing a more explicit method for getting name of hit
# (corresponds with column name in HitTableWriter)
#----------------
sub hit_name {
#----------------
    my $self = shift;
    $self->name( @_ );
}

# Older method; delegates to description()
#----------------
sub desc {
#----------------
    my $self = shift;
    return $self->description( @_ );
}

# Providing a more explicit method for getting description of hit
# (corresponds with column name in HitTableWriter)
#----------------
sub hit_description {
#----------------
    my $self = shift;
    return $self->description( @_ );
}

=head2 score

Equivalent to L<raw_score()|raw_score>

=cut

#----------------
sub score { shift->raw_score( @_ ); }
#----------------


=head2 hit_length

Equivalent to L<length()|length>

=cut

# Providing a more explicit method for getting length of hit
#----------------
sub hit_length { shift->length( @_ ); }
#----------------


=head2 signif

 Usage     : $hit_object->signif( [format] );
 Purpose   : Get the P or Expect value for the best HSP of the given BLAST hit.
           : The value returned is the one which is reported in the description
           : section of the Blast report. For Blast1 and WU-Blast2, this
           : is a P-value; for Blast2, it is an Expect value.
 Example   : $obj->signif()        # returns 1.3e-34
           : $obj->signif('exp')   # returns -34
           : $obj->signif('parts') # returns (1.3, -34)
 Returns   : Float or scientific notation number (the raw P/Expect value, DEFAULT).
           : Integer if format == 'exp' (the base 10 exponent, e.g. -34).
           : 2-element list (float, int) if format == 'parts' and P/Expect value
           :                is in scientific notation (see Comments).
 Argument  : format: string of 'raw' | 'exp' | 'parts'
           :    'raw' returns value given in report. Default. (1.3e-34)
           :    'exp' returns exponent value only (-34)
           :    'parts' returns the decimal and exponent as a
           :            2-element list (1.3, -34) (see Comments).
 Throws    : Exception if neither a P-value nor an Expect value has been set.
 Comments  : The signif() method provides a way to deal with the fact that
           : Blast1 and Blast2 formats (and WU- vs. NCBI-BLAST) differ in
           : what is reported in the description lines of each hit in the
           : Blast report. The signif() method frees any client code from
           : having to know if this is a P-value or an Expect value,
           : making it easier to write code that can process both
           : Blast1 and Blast2 reports. This is not necessarily a good thing,
           : since one should always know when one is working with P-values or
           : Expect values (hence the deprecated status).
           : Use of expect() is recommended since all hits will have an Expect value.
           :
           : Using the 'parts' argument is not recommended since it will not
           : work as expected if the expect value is not in scientific notation.
           : That is, floats are not converted into sci notation before
           : splitting into parts.

See Also   : L<p()|p>, L<expect()|expect>, L<Bio::Search::BlastUtils::get_exponent()|Bio::Search::BlastUtils>
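
Client code that needs both parts for any numeric value can normalize to scientific notation first. The following is a minimal sketch, not part of this module: signif_parts() is a hypothetical helper that relies only on Perl's built-in sprintf and split.

```perl
use strict;
use warnings;

# Hypothetical helper: split a P/Expect value into (mantissa, exponent),
# converting plain floats to scientific notation before splitting.
sub signif_parts {
    my ($val) = @_;
    # sprintf "%e" always produces the form d.dddddde[+-]dd
    my ($mant, $exp) = split /[eE]/, sprintf('%e', $val);
    return ($mant + 0, $exp + 0);   # numify: "1.300000" -> 1.3, "-34" -> -34
}

my @sci   = signif_parts('1.3e-34');
my @plain = signif_parts(0.0025);
print "@sci\n@plain\n";
```

Note that "%e" keeps six digits after the decimal point, so mantissas with more significant digits are rounded.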

=cut

#-------------
sub signif {
#-------------
# Some duplication of logic for p(), expect() and signif() for the sake of performance.
    my ($self, $fmt) = @_;

    my $val = defined($self->{'_p'}) ? $self->{'_p'} : $self->{'_expect'};

    # $val can be zero.
    defined($val) or $self->throw("Can't get P- or Expect value: HSPs may not have been set.");

    return $val if not $fmt or $fmt =~ /^raw/i;
    ## Special formats: exponent-only or as list.
    return &Bio::Search::BlastUtils::get_exponent($val) if $fmt =~ /^exp/i;
    # Split on either 'e' or 'E'; the pattern /eE/ would only match the literal string "eE".
    return (split (/[eE]/, $val)) if $fmt =~ /^parts/i;

    ## Default: return the raw P/Expect-value.
    return $val;
}

#----------------
sub raw_hit_data {
#----------------
    my $self = shift;
    my $data = '>';
    # Need to add blank lines where we've removed them.
    foreach( @{$self->{'_hit_data'}} ) {
        if( $_ eq 'end') {
            $data .= "\n";
        }
        else {
            $data .= /^\s*(Score|Query)/ ? "\n$_" : $_;
        }
    }
    return $data;
}


#=head2 _set_length
#
# Usage     : $hit_object->_set_length( "233" );
# Purpose   : Set the total length of the hit sequence.
# Example   : $hit_object->_set_length( $len );
# Returns   : n/a
# Argument  : Integer (only when setting). Any commas will be stripped out.
# Throws    : n/a
#
#=cut

#-----------
sub _set_length {
#-----------
    my ($self, $len) = @_;
    $len =~ s/,//g; # get rid of commas
    $self->{'_length'} = $len;
}

#=head2 _set_description
#
# Usage     : Private method; called automatically during construction
# Purpose   : Sets the description of the hit sequence.
#           : For sequences without descriptions, does not set any description.
# Argument  : Array containing description (multiple lines).
# Comments  : Processes the supplied description:
#                1. Join all lines into one string.
#                2. Remove sequence id at the beginning of description.
#                3. Remove junk characters at the beginning and end of description.
#
#=cut

#--------------
sub _set_description {
#--------------
    my( $self, @desc ) = @_;
    my( $desc);

#    print STDERR "BlastHit: RAW DESC:\n@desc\n";

    $desc = join(" ", @desc);

    my $name = $self->name;

    if($desc) {
        $desc =~ s/^\s*\S+\s+//; # remove the sequence ID(s)
                                 # This won't work if there's no description.
        # \Q...\E quotes regex metacharacters such as '|' in the ID.
        $desc =~ s/^\s*\Q$name\E//; # ...but this should.
        $desc =~ s/^[\s!]+//;
        $desc =~ s/ \d+$//;
        $desc =~ s/\.+$//;
        $self->{'_description'} = $desc;
    }

#    print STDERR "BlastHit: _set_description = $desc\n";
}

=head2 to_string

 Title   : to_string
 Usage   : print $hit->to_string;
 Function: Returns a string representation for the Blast Hit.
           Primarily intended for debugging purposes.
 Example : see usage
 Returns : A string of the form:
           [BlastHit] <name> <description>
           e.g.:
           [BlastHit] emb|Z46660|SC9725 S.cerevisiae chromosome XIII cosmid
 Args    : None

=cut

#----------------
sub to_string {
#----------------
    my $self = shift;
    return "[BlastHit] " . $self->name . " " . $self->description;
}


1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
699 #=head2 _set_id
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
700 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
701 # Usage : Private method; automatically called by new()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
702 # Purpose : Sets the name of the BlastHit sequence from the BLAST summary line.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
703 # : The identifier is assumed to be the first
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
704 # : chunk of non-whitespace characters in the description line
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
705 # : Does not assume any semantics in the structure of the identifier
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
706 # : (Formerly, this method attempted to extract database name from
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
707 # : the seq identifiers, but this was prone to break).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
708 # Returns : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
709 # Argument : String containing description line of the hit from Blast report
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
710 # : or first line of an alignment section (with or without the leading '>').
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
711 # Throws : Warning if cannot locate sequence ID.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
712 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
713 #See Also : L<new()|new>, L<accession()|accession>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
714 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
715 #=cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
716
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
717 #---------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
718 sub _set_id {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
719 #---------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
720 my( $self, $desc ) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
721
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
722 # New strategy: Assume only that the ID is the first white space
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
723 # delimited chunk. Not attempting to extract accession & database name.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
724 # Clients will have to interpret it as necessary.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
725 if($desc =~ /^>?(\S+)\s*(.*)/) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
726 my ($name, $desc) = ($1, $2);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
727 $self->name($name);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
728 $self->{'_description'} = $desc;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
729 # Note that this description comes from the summary section of the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
730 # BLAST report and so may be truncated. The full description will be
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
731 # set from the alignment section. We're setting description here in case
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
732 # the alignment section isn't being parsed.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
733
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
734 # Assuming accession is delimited with | symbols (NCBI-style)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
735 my @pieces = split(/\|/,$name);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
736 my $acc = pop @pieces;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
737 $self->accession( $acc );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
738 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
739 else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
740 $self->warn("Can't locate sequence identifier in summary line.", "Line = $desc");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
741 $desc = 'Unknown sequence ID' if not $desc;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
742 $self->name($desc);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
743 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
744 }
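The parsing steps in _set_id() can be tried outside the module. The following is a minimal standalone sketch, not the module's API: parse_summary_line() and the sample summary line are hypothetical, and the helper returns a hash reference instead of setting object fields. It assumes an NCBI-style identifier whose fields are separated by '|', with the accession as the last field.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the steps of _set_id(): take the first
# whitespace-delimited chunk as the name, the rest as the description,
# and the last '|'-delimited field of the name as the accession.
sub parse_summary_line {
    my ($line) = @_;
    if ( defined $line && $line =~ /^>?(\S+)\s*(.*)/ ) {
        my ( $name, $description ) = ( $1, $2 );
        my @pieces    = split /\|/, $name;
        my $accession = pop @pieces;
        return {
            name        => $name,
            description => $description,
            accession   => $accession,
        };
    }
    # No identifier found: fall back to a placeholder name.
    return { name => 'Unknown sequence ID' };
}

# Made-up example line (not taken from a real report).
my $hit = parse_summary_line('>gi|12345|gb|AAB00001.1 hypothetical protein');
print "name:        $hit->{name}\n";
print "accession:   $hit->{accession}\n";
print "description: $hit->{description}\n";
```

For identifiers whose last field is not an accession, the value obtained this way has to be interpreted by the caller.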
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
745
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
746
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
747 =head2 ambiguous_aln
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
748
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
749 Usage : $ambig_code = $hit_object->ambiguous_aln();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
750 Purpose : Sets/Gets ambiguity code data member.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
751 Example : (see usage)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
752 Returns : String = 'q', 's', 'qs', '-'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
753 : 'q' = query sequence contains overlapping sub-sequences
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
754 : while sbjct does not.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
755 : 's' = sbjct sequence contains overlapping sub-sequences
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
756 : while query does not.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
757 : 'qs' = query and sbjct sequences contain overlapping sub-sequences
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
758 : relative to each other.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
759 : '-' = query and sbjct sequences do not contain multiple domains
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
760 : relative to each other OR both contain the same distribution
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
761 : of similar domains.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
762 Argument : Optional string ('q', 's', 'qs' or '-') to set the ambiguity code.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
763 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
764 Status : Experimental
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
765
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
766 See Also : L<Bio::Search::BlastUtils::tile_hsps>, L<HSP Tiling and Ambiguous Alignments>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
767
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
768 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
769
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
770 #--------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
771 sub ambiguous_aln {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
772 #--------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
773 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
774 if(@_) { $self->{'_ambiguous_aln'} = shift; }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
775 $self->{'_ambiguous_aln'} || '-';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
776 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
777
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
778
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
779
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
780 =head2 overlap
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
781
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
782 Usage : $blast_object->overlap( [integer] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
783 Purpose : Gets/Sets the allowable amount of overlap between different HSP sequences.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
784 Example : $blast_object->overlap(5);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
785 : $overlap = $blast_object->overlap;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
786 Returns : Integer.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
787 Argument : integer.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
788 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
789 Status : Experimental
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
790 Comments : Any two HSPs whose sequences overlap by less than or equal
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
791 : to the overlap() number of residues will be considered separate HSPs
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
792 : and will not get tiled by Bio::Search::BlastUtils::_adjust_contigs().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
793
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
794 See Also : L<Bio::Search::BlastUtils::_adjust_contigs()|Bio::Search::BlastUtils>, L<BUGS|BUGS>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
795
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
796 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
797
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
798 #-------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
799 sub overlap {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
800 #-------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
801 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
802 if(@_) { $self->{'_overlap'} = shift; }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
803 defined $self->{'_overlap'} ? $self->{'_overlap'} : 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
804 }
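The rule stated in the Comments for overlap() can be illustrated with plain coordinate arithmetic. This is a sketch of the documented rule only, not the code in Bio::Search::BlastUtils: overlap_length() and treated_as_separate() are hypothetical helpers, and coordinates are assumed to be 1-based and inclusive.

```perl
use strict;
use warnings;

# Number of residues shared by two inclusive ranges (0 if disjoint).
sub overlap_length {
    my ( $start1, $end1, $start2, $end2 ) = @_;
    my $start = $start1 > $start2 ? $start1 : $start2;
    my $end   = $end1 < $end2     ? $end1   : $end2;
    my $len   = $end - $start + 1;
    return $len > 0 ? $len : 0;
}

# Documented rule: an overlap less than or equal to the allowed amount
# means the two HSPs are considered separate.
sub treated_as_separate {
    my ( $max_overlap, @coords ) = @_;
    return overlap_length(@coords) <= $max_overlap ? 1 : 0;
}

print treated_as_separate( 5, 1, 100, 96, 200 ), "\n";    # 5 shared residues
print treated_as_separate( 5, 1, 100, 90, 200 ), "\n";    # 11 shared residues
```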
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
805
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
806
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
807
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
808
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
809
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
810
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
811 =head2 bits
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
812
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
813 Usage : $hit_object->bits();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
814 Purpose : Gets the BLAST bit score of the best HSP for the current Blast hit.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
815 Example : $bits = $hit_object->bits();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
816 Returns : Integer
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
817 Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
818 Throws : Exception if bit score is not set.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
819 Comments : For BLAST1, the non-bit score is listed in the summary line.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
820
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
821 See Also : L<score()|score>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
822
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
823 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
824
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
825 #---------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
826 sub bits {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
827 #---------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
828 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
829
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
830 # The check for $self->{'_bits'} is a remnant from the 'query' mode days
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
831 # in which the sbjct object would collect data from the description line only.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
832
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
833 my ($bits);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
834 if(not defined($self->{'_bits'})) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
835 $bits = $self->hsp->bits;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
836 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
837 $bits = $self->{'_bits'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
838 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
839 return $bits;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
840 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
841
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
842
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
843
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
844 =head2 n
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
845
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
846 Usage : $hit_object->n();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
847 Purpose : Gets the N number for the current Blast hit.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
848 : This is the number of HSPs in the set which was ascribed
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
849 : the lowest P-value (listed on the description line).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
850 : This number is not the same as the total number of HSPs.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
851 : To get the total number of HSPs, use num_hsps().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
852 Example : $n = $hit_object->n();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
853 Returns : Integer
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
854 Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
855 Throws : Exception if HSPs have not been set (BLAST2 reports).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
856 Comments : Note that the N parameter is not reported in gapped BLAST2.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
857 : Calling n() on such reports will result in a call to num_hsps().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
858 : The num_hsps() method will count the actual number of
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
859 : HSPs in the alignment listing, which may exceed N in
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
860 : some cases.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
861
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
862 See Also : L<num_hsps()|num_hsps>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
863
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
864 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
865
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
866 #-----
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
867 sub n {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
868 #-----
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
869 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
870
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
871 # The check for $self->{'_n'} is a remnant from the 'query' mode days
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
872 # in which the sbjct object would collect data from the description line only.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
873
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
874 my ($n);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
875 if(not defined($self->{'_n'})) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
876 $n = $self->hsp->n;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
877 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
878 $n = $self->{'_n'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
879 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
880 $n ||= $self->num_hsps;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
881
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
882 return $n;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
883 }
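The fallback order used by n() (stored value, then the best HSP's value, then the HSP count) can be sketched without any BioPerl objects. effective_n() and its named arguments are hypothetical stand-ins for the object fields and method calls above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for n(): 'cached_n' plays the role of
# $self->{'_n'}, 'hsp_n' of $self->hsp->n, 'num_hsps' of $self->num_hsps.
sub effective_n {
    my (%args) = @_;
    my $n = defined $args{cached_n} ? $args{cached_n} : $args{hsp_n};
    # An undefined, empty or zero N falls back to the HSP count.
    $n ||= $args{num_hsps};
    return $n;
}

print effective_n( cached_n => 2, num_hsps => 5 ), "\n";     # stored N wins
print effective_n( hsp_n    => '', num_hsps => 3 ), "\n";    # falls back to the count
```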
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
884
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
885
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
886
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
887 =head2 frame
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
888
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
889 Usage : $hit_object->frame();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
890 Purpose : Gets the reading frame for the best HSP after HSP tiling.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
891 : This is only valid for BLASTX and TBLASTN/X reports.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
892 Example : $frame = $hit_object->frame();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
893 Returns : Integer (-2 .. +2)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
894 Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
895 Throws : Exception if HSPs have not been set (BLAST2 reports).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
896 Comments : This method requires that all HSPs be tiled. If they have not
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
897 : already been tiled, they will be tiled first automatically.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
898 : If you don't want the tiled data, iterate through each HSP
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
899 : calling frame() on each (use hsps() to get all HSPs).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
900
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
901 See Also : L<hsps()|hsps>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
902
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
903 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
904
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
905 #----------'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
906 sub frame {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
907 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
908 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
909
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
910 Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
911
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
912 # The check for $self->{'_frame'} is a remnant from the 'query' mode days
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
913 # in which the sbjct object would collect data from the description line only.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
914
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
915 my ($frame);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
916 if(not defined($self->{'_frame'})) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
917 $frame = $self->hsp->frame;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
918 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
919 $frame = $self->{'_frame'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
920 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
921 return $frame;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
922 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
923
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
924
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
925
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
926
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
927
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
928 =head2 p
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
929
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
930 Usage : $hit_object->p( [format] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
931 Purpose : Get the P-value for the best HSP of the given BLAST hit.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
932 : (Note that P-values are not provided with NCBI Blast2 reports).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
933 Example : $p = $sbjct->p;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
934 : $p = $sbjct->p('exp'); # get exponent only.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
935 : ($num, $exp) = $sbjct->p('parts'); # split sci notation into parts
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
936 Returns : Float or scientific notation number (the raw P-value, DEFAULT).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
937 : Integer if format == 'exp' (the magnitude of the base 10 exponent).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
938 : 2-element list (float, int) if format == 'parts' and P-value
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
939 : is in scientific notation (See Comments).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
940 Argument : format: string of 'raw' | 'exp' | 'parts'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
941 : 'raw' returns value given in report. Default. (1.2e-34)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
942 : 'exp' returns exponent value only (34)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
943 : 'parts' returns the decimal and exponent as a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
944 : 2-element list (1.2, -34) (See Comments).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
945 Throws : Warns if no P-value is defined. Uses expect instead.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
946 Comments : Using the 'parts' argument is not recommended since it will not
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
947 : work as expected if the P-value is not in scientific notation.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
948 : That is, floats are not converted into sci notation before
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
949 : splitting into parts.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
950
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
951 See Also : L<expect()|expect>, L<signif()|signif>, L<Bio::Search::BlastUtils::get_exponent()|Bio::Search::BlastUtils>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
952
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
953 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
954
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
955 #--------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
956 sub p {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
957 #--------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
958 # Some duplication of logic for p(), expect() and signif() for the sake of performance.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
959 my ($self, $fmt) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
960
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
961 my $val = $self->{'_p'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
962
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
963 # $val can be zero.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
964 if(not defined $val) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
965 # P-value not defined, must be a NCBI Blast2 report.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
966 # Use expect instead.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
967 $self->warn( "P-value not defined. Using expect() instead.");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
968 $val = $self->{'_expect'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
969 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
970
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
971 return $val if not $fmt or $fmt =~ /^raw/i;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
972 ## Special formats: exponent-only or as list.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
973 return &Bio::Search::BlastUtils::get_exponent($val) if $fmt =~ /^exp/i;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
974 return (split (/[eE]/, $val)) if $fmt =~ /^parts/i;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
975
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
976 ## Default: return the raw P-value.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
977 return $val;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
978 }
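The 'parts' format and the caveat in its Comments can be checked with a short standalone sketch. It assumes the value is split at the exponent marker with a character class that matches either 'e' or 'E'; split_sci() is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: split a number written in scientific notation
# into its decimal part and its exponent.
sub split_sci {
    my ($val) = @_;
    return split /[eE]/, $val;
}

my @sci   = split_sci('1.2e-34');    # two elements: '1.2' and '-34'
my @plain = split_sci('0.05');       # one element: no exponent to split off

print scalar(@sci),   " part(s): @sci\n";
print scalar(@plain), " part(s): @plain\n";
```

A plain float comes back as a single-element list, which is why callers that unpack two values from 'parts' get an undefined exponent for such input.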
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
979
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
980
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
981
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
982 =head2 expect
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
983
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
984 Usage : $hit_object->expect( [format] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
985 Purpose : Get the Expect value for the best HSP of the given BLAST hit.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
986 Example : $e = $sbjct->expect;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
987 : $e = $sbjct->expect('exp'); # get exponent only.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
988 : ($num, $exp) = $sbjct->expect('parts'); # split sci notation into parts
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
989 Returns : Float or scientific notation number (the raw expect value, DEFAULT).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
990 : Integer if format == 'exp' (the magnitude of the base 10 exponent).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
991 : 2-element list (float, int) if format == 'parts' and Expect
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
992 : is in scientific notation (see Comments).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
993 Argument : format: string of 'raw' | 'exp' | 'parts'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
994 : 'raw' returns value given in report. Default. (1.2e-34)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
995 : 'exp' returns exponent value only (34)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
996 : 'parts' returns the decimal and exponent as a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
997 : 2-element list (1.2, -34) (see Comments).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
998 Throws : Exception if the Expect value is not defined.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
999 Comments : Using the 'parts' argument is not recommended since it will not
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1000 : work as expected if the expect value is not in scientific notation.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1001 : That is, floats are not converted into sci notation before
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1002 : splitting into parts.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1003
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1004 See Also : L<p()|p>, L<signif()|signif>, L<Bio::Search::BlastUtils::get_exponent()|Bio::Search::BlastUtils>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1005
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1006 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1007
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1008 #-----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1009 sub expect {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1010 #-----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1011 # Some duplication of logic for p(), expect() and signif() for the sake of performance.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1012 my ($self, $fmt) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1013
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1014 my $val;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1015
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1016 # For Blast reports that list the P value on the description line,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1017 # getting the expect value requires fully parsing the HSP data.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1018 # For NCBI blast, there's no problem.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1019 if(not defined($self->{'_expect'})) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1020 if( defined $self->{'_hsps'}) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1021 $self->{'_expect'} = $val = $self->hsp->expect;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1022 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1023 # If _expect is not set and _hsps are not set,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1024 # then this must be a P-value-based report that was
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1025 # run without setting the HSPs (shallow parsing).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1026 $self->throw("Can't get expect value. HSPs have not been set.");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1027 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1028 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1029 $val = $self->{'_expect'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1030 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1031
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1032 # $val can be zero.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1033 defined($val) or $self->throw("Can't get Expect value.");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1034
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1035 return $val if not $fmt or $fmt =~ /^raw/i;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1036 ## Special formats: exponent-only or as list.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1037 return &Bio::Search::BlastUtils::get_exponent($val) if $fmt =~ /^exp/i;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1038 return (split (/[eE]/, $val)) if $fmt =~ /^parts/i;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1039
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1040 ## Default: return the raw Expect-value.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1041 return $val;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1042 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1043
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1044
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1045 =head2 hsps
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1046
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1047 Usage : $hit_object->hsps();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1048 Purpose : Get a list containing all HSP objects.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1049 : Get the number of HSPs for the current hit.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1050 Example : @hsps = $hit_object->hsps();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1051 : $num = $hit_object->hsps(); # alternatively, use num_hsps()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1052 Returns : Array context : list of Bio::Search::HSP::BlastHSP.pm objects.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1053 : Scalar context: integer (number of HSPs).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1054 : (Equivalent to num_hsps()).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1055 Argument : n/a. Relies on wantarray
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1056 Throws : Exception if the HSPs have not been collected.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1057
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1058 See Also : L<hsp()|hsp>, L<num_hsps()|num_hsps>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1059
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1060 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1061
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1062 #---------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1063 sub hsps {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1064 #---------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1065 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1066
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1067 if (not ref $self->{'_hsps'}) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1068 $self->throw("Can't get HSPs: data not collected.");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1069 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1070
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1071 return wantarray
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1072 # returning list containing all HSPs.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1073 ? @{$self->{'_hsps'}}
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1074 # returning number of HSPs.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1075 : scalar(@{$self->{'_hsps'}});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1076 }
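The context-dependent return value of hsps() relies on Perl's wantarray. A minimal standalone sketch of the same idiom follows; the items() function and its string data are placeholders for the stored HSP objects.

```perl
use strict;
use warnings;

# Placeholder for the stored HSP list.
my @stored = ( 'hsp_a', 'hsp_b', 'hsp_c' );

# Returns the list in list context and the element count in scalar context.
sub items {
    return wantarray ? @stored : scalar(@stored);
}

my @all   = items();    # list context
my $count = items();    # scalar context

print "all:   @all\n";
print "count: $count\n";
```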
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1077
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1078
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1079
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1080 =head2 hsp
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1081
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1082 Usage : $hit_object->hsp( [string] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1083 Purpose : Get a single BlastHSP.pm object for the present BlastHit.pm object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1084 Example : $hspObj = $hit_object->hsp; # same as 'best'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1085 : $hspObj = $hit_object->hsp('best');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1086 : $hspObj = $hit_object->hsp('worst');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1087 Returns : Object reference for a Bio::Search::HSP::BlastHSP.pm object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1088 Argument : String (or no argument).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1089 : No argument (default) = highest scoring HSP (same as 'best').
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1090 : 'best' or 'first' = highest scoring HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1091 : 'worst' or 'last' = lowest scoring HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1092 Throws : Exception if the HSPs have not been collected.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1093 : Exception if an unrecognized argument is used.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1094
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1095 See Also : L<hsps()|hsps>, L<num_hsps()|num_hsps>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1096
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
=cut

#----------
sub hsp {
#----------
    my( $self, $option ) = @_;
    $option ||= 'best';

    if (not ref $self->{'_hsps'}) {
        $self->throw("Can't get HSPs: data not collected.");
    }

    my @hsps = @{$self->{'_hsps'}};

    return $hsps[0]      if $option =~ /best|first|1/i;
    return $hsps[$#hsps] if $option =~ /worst|last/i;

    $self->throw("Can't get HSP for: $option\n" .
                 "Valid arguments: 'best', 'worst'");
}



=head2 num_hsps

 Usage     : $hit_object->num_hsps();
 Purpose   : Get the number of HSPs for the present Blast hit.
 Example   : $nhsps = $hit_object->num_hsps();
 Returns   : Integer
 Argument  : n/a
 Throws    : Exception if the HSPs have not been collected.

See Also   : L<hsps()|hsps>

=cut

#-------------
sub num_hsps {
#-------------
    my $self = shift;

    if (not defined $self->{'_hsps'}) {
        $self->throw("Can't get HSPs: data not collected.");
    }

    return scalar(@{$self->{'_hsps'}});
}



=head2 logical_length

 Usage     : $hit_object->logical_length( [seq_type] );
           : (mostly intended for internal use).
 Purpose   : Get the logical length of the query or hit sequence.
           : For the query sequence of BLASTX and TBLASTX reports and the hit
           : sequence of TBLASTN and TBLASTX reports, the returned length
           : is the length of the would-be amino acid sequence (length/3).
           : For all other BLAST flavors, this function is the same as length().
 Example   : $len = $hit_object->logical_length();
 Returns   : Integer
 Argument  : seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
             ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : This is important for functions like frac_aligned_query()
           : which need to operate in amino acid coordinate space when dealing
           : with T?BLASTX type reports.

See Also   : L<length()|length>, L<frac_aligned_query()|frac_aligned_query>, L<frac_aligned_hit()|frac_aligned_hit>

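The query-side adjustment can be illustrated with a small stand-alone sketch. logical_query_length() is a hypothetical helper, not part of this module; it only shows how a nucleotide length is mapped into amino acid coordinate space for translated-query programs.

```perl
use strict;
use warnings;

# Hypothetical helper: divide the query length by 3 when the program
# translates the query (BLASTX or TBLASTX); otherwise leave it unchanged.
sub logical_query_length {
    my ($length, $program) = @_;
    $length /= 3 if $program =~ /T?BLASTX/;
    return $length;
}

print logical_query_length(900, 'BLASTX'),  "\n";   # 300
print logical_query_length(900, 'TBLASTN'), "\n";   # 900
print logical_query_length(900, 'BLASTP'),  "\n";   # 900
```

With TBLASTN the query is a protein, so only the hit length would need the division by 3.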
=cut

#--------------------
sub logical_length {
#--------------------
    my $self = shift;
    my $seqType = shift || 'query';
    $seqType = 'sbjct' if $seqType eq 'hit';

    my $length;

    # For the sbjct, return logical sbjct length
    if( $seqType eq 'sbjct' ) {
        $length = $self->{'_logical_length'} || $self->{'_length'};
    }
    else {
        # Otherwise, return logical query length
        $length = $self->{'_query_length'};

        # Adjust length based on BLAST flavor.
        if($self->{'_blast_program'} =~ /T?BLASTX/ ) {
            $length /= 3;
        }
    }
    return $length;
}


=head2 length_aln

 Usage     : $hit_object->length_aln( [seq_type] );
 Purpose   : Get the total length of the aligned region for query or sbjct seq.
           : This number will include all HSPs.
 Example   : $len    = $hit_object->length_aln(); # default = query
           : $lenAln = $hit_object->length_aln('query');
 Returns   : Integer
 Argument  : seq_type = 'query' or 'hit' or 'sbjct' (Default = 'query')
             ('sbjct' is synonymous with 'hit')
 Throws    : Exception if the argument is not recognized.
 Comments  : This method will report the logical length of the alignment,
           : meaning that for TBLAST[NX] reports, the length is reported
           : using amino acid coordinate space (i.e., nucleotides / 3).
           :
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.
           : If you don't want the tiled data, iterate through each HSP
           : calling length() on each (use hsps() to get all HSPs).

See Also   : L<length()|length>, L<frac_aligned_query()|frac_aligned_query>, L<frac_aligned_hit()|frac_aligned_hit>, L<gaps()|gaps>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>, L<Bio::Search::HSP::BlastHSP::length()|Bio::Search::HSP::BlastHSP>

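To give a feel for why tiled data can differ from summing length() over the individual HSPs, here is a generic interval-merging sketch. It is an assumption-laden illustration only: covered_length() is hypothetical and is not the algorithm used by Bio::Search::BlastUtils::tile_hsps(), which may treat overlaps differently.

```perl
use strict;
use warnings;

# Sum the length covered by a set of [start, end] intervals, counting
# overlapping positions only once. Coordinates are inclusive.
sub covered_length {
    my @sorted = sort { $a->[0] <=> $b->[0] } @_;
    my ($total, $cur_start, $cur_end) = (0);
    for my $iv (@sorted) {
        my ($s, $e) = @$iv;
        if (defined $cur_end && $s <= $cur_end) {
            $cur_end = $e if $e > $cur_end;      # extend the current block
        } else {
            $total += $cur_end - $cur_start + 1 if defined $cur_end;
            ($cur_start, $cur_end) = ($s, $e);   # start a new block
        }
    }
    $total += $cur_end - $cur_start + 1 if defined $cur_end;
    return $total;
}

# Two overlapping HSPs and one separate HSP (hypothetical coordinates).
print covered_length([1, 100], [51, 150], [201, 250]), "\n";   # 200
```

Summing the three interval lengths separately would give 250, because positions 51 to 100 would be counted twice.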
=cut

#---------------'
sub length_aln {
#---------------
    my( $self, $seqType ) = @_;

    $seqType ||= 'query';
    $seqType = 'sbjct' if $seqType eq 'hit';

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    my $data = $self->{'_length_aln_'.$seqType};

    ## If we don't have data, figure out what went wrong.
    if(!$data) {
        $self->throw("Can't get length aln for sequence type \"$seqType\". " .
                     "Valid types are 'query', 'hit', 'sbjct' ('sbjct' = 'hit')");
    }
    $data;
}


=head2 gaps

 Usage     : $hit_object->gaps( [seq_type] );
 Purpose   : Get the number of gaps in the aligned query, sbjct, or both sequences.
           : Data is summed across all HSPs.
 Example   : $qgaps = $hit_object->gaps('query');
           : $hgaps = $hit_object->gaps('hit');
           : $tgaps = $hit_object->gaps();    # default = total (query + hit)
 Returns   : scalar context: integer
           : array context without args: two-element list of integers
           :    (queryGaps, sbjctGaps)
           : Array context can be forced by providing an argument of 'list' or 'array'.
           :
           : CAUTION: Arguments to printf or sprintf are evaluated in array context.
           : So this function may not give you what you expect. For example:
           :          printf "Total gaps: %d", $hit->gaps();
           : Actually returns a two-element array, so what gets printed
           : is the number of gaps in the query, not the total.
           :
 Argument  : seq_type: 'query' | 'hit' or 'sbjct' | 'total' | 'list'  (default = 'total')
             ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : If you need data for each HSP, use hsps() and then iterate
           : through each HSP object.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.
           : When no argument is given, the return type is chosen with
           : wantarray, so a call such as printf "%d", $hit->gaps() is
           : evaluated in array context and does not print the total gaps.
           : Pass 'total' explicitly to get the total in any context.

See Also   : L<length_aln()|length_aln>

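The context pitfall mentioned in the CAUTION can be reproduced without a BLAST report. The sketch below uses a hypothetical demo_gaps() with fixed counts (3 gaps in the query, 5 in the hit) that picks its return type the same way as described above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for gaps(): fixed counts, same argument handling.
sub demo_gaps {
    my ($type) = @_;
    my ($q, $s) = (3, 5);
    $type ||= (wantarray ? 'list' : 'total');
    return ($q, $s) if $type =~ /list|array/;
    return $q + $s  if $type eq 'total';
    return $type eq 'query' ? $q : $s;
}

my $total = demo_gaps();    # scalar context: total
my @both  = demo_gaps();    # list context: (query, hit)

# Inside sprintf the call is in list context, so %d formats the query
# count; the extra list element is ignored (newer perls warn about it).
my $implicit = do { no warnings; sprintf "%d", demo_gaps() };
my $explicit = sprintf "%d", demo_gaps('total');

print "$total @both $implicit $explicit\n";   # 8 3 5 3 8
```

Passing 'total' explicitly is the simplest way to get the same number in every context.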
=cut

#----------
sub gaps {
#----------
    my( $self, $seqType ) = @_;

    $seqType ||= (wantarray ? 'list' : 'total');
    $seqType = 'sbjct' if $seqType eq 'hit';

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    $seqType = lc($seqType);

    if($seqType =~ /list|array/i) {
        return ($self->{'_gaps_query'}, $self->{'_gaps_sbjct'});
    }

    if($seqType eq 'total') {
        return ($self->{'_gaps_query'} + $self->{'_gaps_sbjct'}) || 0;
    } else {
        return $self->{'_gaps_'.$seqType} || 0;
    }
}



=head2 matches

 Usage     : $hit_object->matches( [class] );
 Purpose   : Get the total number of identical or conserved matches
           : (or both) across all HSPs.
           : (Note: 'conservative' matches are indicated as 'positives'
           :         in the Blast report.)
 Example   : ($id,$cons) = $hit_object->matches(); # no argument
           : $id = $hit_object->matches('id');
           : $cons = $hit_object->matches('cons');
 Returns   : Integer or a 2-element array of integers
 Argument  : class = 'id' | 'cons' OR none.
           : If no argument is provided, both identical and conservative
           : numbers are returned in a two element list.
           : (Other terms can be used to refer to the conservative
           :  matches, e.g., 'positive'. All that is checked is whether or
           :  not the supplied string starts with 'id'. If not, the
           :  conservative matches are returned.)
 Throws    : Exception if the requested data cannot be obtained.
 Comments  : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : Does not rely on wantarray to return a list. Only checks for
           : the presence of an argument (no arg = return list).

See Also   : L<Bio::Search::HSP::BlastHSP::matches()|Bio::Search::HSP::BlastHSP>, L<hsps()|hsps>

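The argument rule (only a leading 'id' is checked) can be shown with a short stand-alone sketch. demo_matches() and the totals in %totals are hypothetical and stand in for the values summed over all HSPs.

```perl
use strict;
use warnings;

# Hypothetical totals summed over all HSPs of one hit.
my %totals = ( identical => 42, conserved => 57 );

# Same argument rule as documented: no argument returns both numbers;
# a string starting with 'id' selects identical matches, anything else
# selects conserved matches.
sub demo_matches {
    my ($arg) = @_;
    return ($totals{identical}, $totals{conserved}) if !$arg;
    return $arg =~ /^id/i ? $totals{identical} : $totals{conserved};
}

my ($id, $cons) = demo_matches();
print "$id $cons\n";                       # 42 57
print demo_matches('identical'), "\n";     # 42
print demo_matches('positive'), "\n";      # 57
```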
=cut

#---------------
sub matches {
#---------------
    my( $self, $arg) = @_;
    my(@data,$data);

    if(!$arg) {
        @data = ($self->{'_totalIdentical'}, $self->{'_totalConserved'});

        return @data if @data;

    } else {

        if($arg =~ /^id/i) {
            $data = $self->{'_totalIdentical'};
        } else {
            $data = $self->{'_totalConserved'};
        }
        return $data if $data;
    }

    ## Something went wrong if we make it to here.
    $self->throw("Can't get identical or conserved data: no data.");
}


=head2 start

 Usage     : $sbjct->start( [seq_type] );
 Purpose   : Gets the start coordinate for the query, sbjct, or both sequences
           : in the BlastHit object. If there is more than one HSP, the lowest start
           : value of all HSPs is returned.
 Example   : $qbeg = $sbjct->start('query');
           : $sbeg = $sbjct->start('hit');
           : ($qbeg, $sbeg) = $sbjct->start();
 Returns   : scalar context: integer
           : array context without args: list of two integers (queryStart, sbjctStart)
           : Array context can be "induced" by providing an argument of 'list' or 'array'.
 Argument  : In scalar context: seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
             ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : This method requires that all HSPs be tiled. If there is more than one
           : HSP and they have not already been tiled, they will be tiled first automatically.
           : Remember that the start and end coordinates of all HSPs are
           : normalized so that start < end. Strand information can be
           : obtained by calling $hit->strand().

See Also   : L<end()|end>, L<range()|range>, L<strand()|strand>, L<HSP Tiling and Ambiguous Alignments>, L<Bio::Search::HSP::BlastHSP::start|Bio::Search::HSP::BlastHSP>

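The documented result for a hit with several HSPs (lowest start, and for end() the largest end) can be sketched with core List::Util functions. overall_range() and the coordinates are hypothetical; the module itself obtains these values from the tiling step rather than from this code.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical HSP coordinates on the query, already normalized so that
# start < end for every HSP.
my @hsps = ( [ 50, 120 ], [ 10, 45 ], [ 200, 260 ] );

# Overall extent of the hit: lowest start and largest end over all HSPs.
sub overall_range {
    my @hsps = @_;
    return ( min( map { $_->[0] } @hsps ), max( map { $_->[1] } @hsps ) );
}

my ($start, $end) = overall_range(@hsps);
print "$start-$end\n";   # 10-260
```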
=cut

#----------
sub start {
#----------
    my ($self, $seqType) = @_;

    $seqType ||= (wantarray ? 'list' : 'query');
    $seqType = 'sbjct' if $seqType eq 'hit';

    # If there is only one HSP, defer this call to the solitary HSP.
    if($self->num_hsps == 1) {
        return $self->hsp->start($seqType);
    } else {
        Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};
        if($seqType =~ /list|array/i) {
            return ($self->{'_queryStart'}, $self->{'_sbjctStart'});
        } else {
            ## Sensitive to member name changes.
            $seqType = "_\L$seqType\E";
            return $self->{$seqType.'Start'};
        }
    }
}


=head2 end

 Usage     : $sbjct->end( [seq_type] );
 Purpose   : Gets the end coordinate for the query, sbjct, or both sequences
           : in the BlastHit object. If there is more than one HSP, the largest end
           : value of all HSPs is returned.
 Example   : $qend = $sbjct->end('query');
           : $send = $sbjct->end('hit');
           : ($qend, $send) = $sbjct->end();
 Returns   : scalar context: integer
           : array context without args: list of two integers (queryEnd, sbjctEnd)
           : Array context can be "induced" by providing an argument of 'list' or 'array'.
 Argument  : In scalar context: seq_type = 'query' or 'hit' or 'sbjct'
           : (case insensitive). If not supplied, 'query' is used.
             ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : This method requires that all HSPs be tiled. If there is more than one
           : HSP and they have not already been tiled, they will be tiled first automatically.
           : Remember that the start and end coordinates of all HSPs are
           : normalized so that start < end. Strand information can be
           : obtained by calling $hit->strand().

See Also   : L<start()|start>, L<range()|range>, L<strand()|strand>, L<HSP Tiling and Ambiguous Alignments>, L<Bio::Search::HSP::BlastHSP::end|Bio::Search::HSP::BlastHSP>

=cut

#----------
sub end {
#----------
    my ($self, $seqType) = @_;

    $seqType ||= (wantarray ? 'list' : 'query');
    $seqType = 'sbjct' if $seqType eq 'hit';

    # If there is only one HSP, defer this call to the solitary HSP.
    if($self->num_hsps == 1) {
        return $self->hsp->end($seqType);
    } else {
        Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};
        if($seqType =~ /list|array/i) {
            return ($self->{'_queryStop'}, $self->{'_sbjctStop'});
        } else {
            ## Sensitive to member name changes.
            $seqType = "_\L$seqType\E";
            return $self->{$seqType.'Stop'};
        }
    }
}

=head2 range

 Usage     : $sbjct->range( [seq_type] );
 Purpose   : Gets the (start, end) coordinates for the query or sbjct sequence
           : in the HSP alignment.
 Example   : ($qbeg, $qend) = $sbjct->range('query');
           : ($sbeg, $send) = $sbjct->range('hit');
 Returns   : Two-element array of integers
 Argument  : seq_type = string, 'query' or 'hit' or 'sbjct' (default = 'query')
             ('sbjct' is synonymous with 'hit')
 Throws    : n/a

See Also   : L<start()|start>, L<end()|end>

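As a rough illustration of the contract described for range(), the standalone sketch below mimics how a (start, end) pair is selected by sequence type. The %coords table and range_of() are hypothetical stand-ins rather than part of BlastHit, and the coordinates are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a hit's overall coordinates (not BlastHit internals).
my %coords = (
    query => { start => 12, end => 240 },
    sbjct => { start => 3,  end => 231 },
);

# Mirrors the documented behaviour: default to 'query', treat 'hit' as 'sbjct'.
sub range_of {
    my ($seq_type) = @_;
    $seq_type ||= 'query';
    $seq_type = 'sbjct' if $seq_type eq 'hit';
    return ($coords{$seq_type}{start}, $coords{$seq_type}{end});
}

my ($qbeg, $qend) = range_of();
my ($sbeg, $send) = range_of('hit');
print "query: $qbeg-$qend, hit: $sbeg-$send\n";
```

Calling the helper in list context, as here, matches the two-element return documented for range().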
=cut

#----------
sub range {
#----------
    my ($self, $seqType) = @_;
    $seqType ||= 'query';
    $seqType = 'sbjct' if $seqType eq 'hit';
    return ($self->start($seqType), $self->end($seqType));
}


=head2 frac_identical

 Usage     : $hit_object->frac_identical( [seq_type] );
 Purpose   : Get the overall fraction of identical positions across all HSPs.
           : The number refers to only the aligned regions and does not
           : account for unaligned regions in between the HSPs, if any.
 Example   : $frac_iden = $hit_object->frac_identical('query');
 Returns   : Float (2-decimal precision, e.g., 0.75).
 Argument  : seq_type: 'query' | 'hit' or 'sbjct' | 'total'
           : default = 'query' (but see comments below).
           : ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : Different versions of Blast report different values for the total
           : length of the alignment. This is the number reported in the
           : denominators in the stats section:
           : "Identical = 34/120 Positives = 67/120".
           : NCBI BLAST uses the total length of the alignment (with gaps);
           : WU-BLAST uses the length of the query sequence (without gaps).
           :
           : Therefore, when called with an argument of 'total',
           : this method will report different values depending on the
           : version of BLAST used. Total does NOT take into account HSP
           : tiling, so it should not be used.
           :
           : To get the fraction identical among only the aligned residues,
           : ignoring the gaps, call this method without an argument or
           : with an argument of 'query' or 'hit'.
           :
           : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<frac_conserved()|frac_conserved>, L<frac_aligned_query()|frac_aligned_query>, L<matches()|matches>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>

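To make the choice of denominator concrete, here is a minimal standalone sketch of the arithmetic. The counts 34, 67 and 120 come from the stats line quoted above; the ungapped lengths 110 and 115, the %hit table and the frac() helper are assumptions made up for illustration and are not BlastHit members.

```perl
use strict;
use warnings;

# Hypothetical totals for one hit (illustrative numbers only).
my %hit = (
    total_identical  => 34,
    total_conserved  => 67,
    length_aln_query => 110,   # aligned query residues, gaps excluded
    length_aln_sbjct => 115,   # aligned hit residues, gaps excluded
    length_aln_total => 120,   # alignment length as reported, gaps included
);

# Fraction formatted to two decimals, like the methods documented here.
sub frac {
    my ($count, $length) = @_;
    return sprintf("%.2f", $count / $length);
}

for my $type (qw(query sbjct total)) {
    printf "%-5s identical=%s conserved=%s\n", $type,
        frac($hit{total_identical}, $hit{"length_aln_$type"}),
        frac($hit{total_conserved}, $hit{"length_aln_$type"});
}
```

With these made-up lengths the same identity count yields 0.31, 0.30 and 0.28 for 'query', 'sbjct' and 'total', which is why the denominator has to be chosen deliberately.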
=cut

#------------------
sub frac_identical {
#------------------
    my ($self, $seqType) = @_;
    $seqType ||= 'query';
    $seqType = 'sbjct' if $seqType eq 'hit';

    ## Sensitive to member name format.
    $seqType = lc($seqType);

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    sprintf( "%.2f", $self->{'_totalIdentical'}/$self->{'_length_aln_'.$seqType});
}



=head2 frac_conserved

 Usage     : $hit_object->frac_conserved( [seq_type] );
 Purpose   : Get the overall fraction of conserved positions across all HSPs.
           : The number refers to only the aligned regions and does not
           : account for unaligned regions in between the HSPs, if any.
 Example   : $frac_cons = $hit_object->frac_conserved('hit');
 Returns   : Float (2-decimal precision, e.g., 0.75).
 Argument  : seq_type: 'query' | 'hit' or 'sbjct' | 'total'
           : default = 'query' (but see comments below).
           : ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : Different versions of Blast report different values for the total
           : length of the alignment. This is the number reported in the
           : denominators in the stats section:
           : "Identical = 34/120 Positives = 67/120".
           : NCBI BLAST uses the total length of the alignment (with gaps);
           : WU-BLAST uses the length of the query sequence (without gaps).
           :
           : Therefore, when called with an argument of 'total',
           : this method will report different values depending on the
           : version of BLAST used. Total does NOT take into account HSP
           : tiling, so it should not be used.
           :
           : To get the fraction conserved among only the aligned residues,
           : ignoring the gaps, call this method without an argument or
           : with an argument of 'query' or 'hit'.
           :
           : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<frac_identical()|frac_identical>, L<matches()|matches>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>

=cut

#--------------------
sub frac_conserved {
#--------------------
    my ($self, $seqType) = @_;
    $seqType ||= 'query';
    $seqType = 'sbjct' if $seqType eq 'hit';

    ## Sensitive to member name format.
    $seqType = lc($seqType);

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    sprintf( "%.2f", $self->{'_totalConserved'}/$self->{'_length_aln_'.$seqType});
}




=head2 frac_aligned_query

 Usage     : $hit_object->frac_aligned_query();
 Purpose   : Get the fraction of the query sequence which has been aligned
           : across all HSPs (not including intervals between non-overlapping
           : HSPs).
 Example   : $frac_alnq = $hit_object->frac_aligned_query();
 Returns   : Float (2-decimal precision, e.g., 0.75).
 Argument  : n/a
 Throws    : n/a
 Comments  : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : To compute the fraction aligned, the logical length of the query
           : sequence is used, meaning that for [T]BLASTX reports, the
           : full length of the query sequence is converted into amino acids
           : by dividing by 3. This is necessary because of the way
           : the lengths of aligned sequences are computed.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<frac_aligned_hit()|frac_aligned_hit>, L<logical_length()|logical_length>, L<length_aln()|length_aln>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>

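The logical-length adjustment can be sketched as follows. This is only an illustration under stated assumptions: logical_query_length() and frac_aligned() are hypothetical helpers, the program-name check is a simplification of whatever logical_length() really does, and the lengths are invented.

```perl
use strict;
use warnings;

# Hypothetical helper: query length in the units used by the alignment.
# Assumes the query is translated for BLASTX and TBLASTX, as the comments say.
sub logical_query_length {
    my ($raw_length, $program) = @_;
    return $program =~ /^T?BLASTX$/i ? $raw_length / 3 : $raw_length;
}

# Fraction of the logical length covered by aligned residues, two decimals.
sub frac_aligned {
    my ($aligned, $logical_length) = @_;
    return sprintf("%.2f", $aligned / $logical_length);
}

# A 900-nt query in a BLASTX report counts as 300 amino acids.
my $logical = logical_query_length(900, 'BLASTX');
print frac_aligned(225, $logical), "\n";
```

Without the division by 3, the same 225 aligned residues would be compared against 900 nucleotides and the fraction would be understated threefold.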
=cut

#----------------------
sub frac_aligned_query {
#----------------------
    my $self = shift;

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    sprintf( "%.2f", $self->{'_length_aln_query'}/$self->logical_length('query'));
}



=head2 frac_aligned_hit

 Usage     : $hit_object->frac_aligned_hit();
 Purpose   : Get the fraction of the hit (sbjct) sequence which has been aligned
           : across all HSPs (not including intervals between non-overlapping
           : HSPs).
 Example   : $frac_alnh = $hit_object->frac_aligned_hit();
 Returns   : Float (2-decimal precision, e.g., 0.75).
 Argument  : n/a
 Throws    : n/a
 Comments  : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : To compute the fraction aligned, the logical length of the sbjct
           : sequence is used, meaning that for TBLAST[NX] reports, the
           : full length of the sbjct sequence is converted into amino acids
           : by dividing by 3. This is necessary because of the way
           : the lengths of aligned sequences are computed.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<frac_aligned_query()|frac_aligned_query>, L<matches()|matches>, L<logical_length()|logical_length>, L<length_aln()|length_aln>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>

=cut

#--------------------
sub frac_aligned_hit {
#--------------------
    my $self = shift;

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    sprintf( "%.2f", $self->{'_length_aln_sbjct'}/$self->logical_length('sbjct'));
}


## These methods are being maintained for backward compatibility.

=head2 frac_aligned_sbjct

Same as L<frac_aligned_hit()|frac_aligned_hit>

=cut

#----------------
sub frac_aligned_sbjct { my $self=shift; $self->frac_aligned_hit(@_); }
#----------------

=head2 num_unaligned_sbjct

Same as L<num_unaligned_hit()|num_unaligned_hit>

=cut

#----------------
sub num_unaligned_sbjct { my $self=shift; $self->num_unaligned_hit(@_); }
#----------------



=head2 num_unaligned_hit

 Usage     : $hit_object->num_unaligned_hit();
 Purpose   : Get the number of the unaligned residues in the hit sequence.
           : Sums across all HSPs.
 Example   : $num_unaln = $hit_object->num_unaligned_hit();
 Returns   : Integer
 Argument  : n/a
 Throws    : n/a
 Comments  : See notes regarding logical lengths in the comments for frac_aligned_hit().
           : They apply here as well.
           : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<num_unaligned_query()|num_unaligned_query>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>, L<frac_aligned_hit()|frac_aligned_hit>

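The count is the logical length minus the aligned length, and the implementation of num_unaligned_hit() never returns a negative number. A minimal standalone sketch of that arithmetic, with a hypothetical num_unaligned() helper and invented lengths:

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the subtraction and the floor at zero.
sub num_unaligned {
    my ($logical_length, $aligned_length) = @_;
    my $num = $logical_length - $aligned_length;
    return $num < 0 ? 0 : $num;
}

print num_unaligned(300, 225), "\n";   # 75 residues left unaligned
print num_unaligned(300, 310), "\n";   # aligned length exceeds logical length: 0
```

The second call shows the case the floor guards against, where the summed aligned length comes out larger than the logical sequence length.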
=cut

#---------------------
sub num_unaligned_hit {
#---------------------
    my $self = shift;

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    my $num = $self->logical_length('sbjct') - $self->{'_length_aln_sbjct'};
    ($num < 0 ? 0 : $num );
}


=head2 num_unaligned_query

 Usage     : $hit_object->num_unaligned_query();
 Purpose   : Get the number of the unaligned residues in the query sequence.
           : Sums across all HSPs.
 Example   : $num_unaln = $hit_object->num_unaligned_query();
 Returns   : Integer
 Argument  : n/a
 Throws    : n/a
 Comments  : See notes regarding logical lengths in the comments for frac_aligned_query().
           : They apply here as well.
           : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<num_unaligned_hit()|num_unaligned_hit>, L<frac_aligned_query()|frac_aligned_query>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>

=cut

#-----------------------
sub num_unaligned_query {
#-----------------------
    my $self = shift;

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    my $num = $self->logical_length('query') - $self->{'_length_aln_query'};
    ($num < 0 ? 0 : $num );
}



=head2 seq_inds

 Usage     : $hit->seq_inds( seq_type, class, collapse );
 Purpose   : Get a list of residue positions (indices) across all HSPs
           : for identical or conserved residues in the query or sbjct sequence.
 Example   : @s_ind = $hit->seq_inds('query', 'identical');
           : @h_ind = $hit->seq_inds('hit', 'conserved');
           : @h_ind = $hit->seq_inds('hit', 'conserved', 1);
 Returns   : Array of integers
           : May include ranges if collapse is non-zero.
 Argument  : [0] seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
           :     ('sbjct' is synonymous with 'hit')
           : [1] class = 'identical' or 'conserved' (default = 'identical')
           :     (can be shortened to 'id' or 'cons')
           :     (actually, anything not 'id' will evaluate to 'conserved').
           : [2] collapse = boolean, if non-zero, consecutive positions are merged
           :     using a range notation, e.g., "1 2 3 4 5 7 9 10 11"
           :     collapses to "1-5 7 9-11". This is useful for
           :     consolidating long lists. Default = no collapse.
 Throws    : n/a.
 Comments  : Note that HSPs are not tiled for this. This could be a problem
           : for hits containing mutually exclusive HSPs.
           : TODO: Consider tiling and then reporting seq_inds for the
           : best HSP contig.

See Also   : L<Bio::Search::HSP::BlastHSP::seq_inds()|Bio::Search::HSP::BlastHSP>

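The merge and collapse behaviour can be sketched without any BioPerl objects. In the example below, merge_positions() and collapse_runs() are hypothetical stand-ins written for illustration; collapse_runs() is not the code of Bio::Search::BlastUtils::collapse_nums(), it only reproduces the range notation described above.

```perl
use strict;
use warnings;

# Merge per-HSP position lists: drop duplicates, then sort numerically.
sub merge_positions {
    my %seen = map { $_ => 1 } @_;
    return sort { $a <=> $b } keys %seen;
}

# Hypothetical stand-in for collapse_nums(): fold consecutive runs into ranges.
sub collapse_runs {
    my @nums = @_;
    my @out;
    while (@nums) {
        my $first = shift @nums;
        my $last  = $first;
        # Extend the run while the next number is exactly one higher.
        $last = shift @nums while @nums && $nums[0] == $last + 1;
        push @out, $first == $last ? $first : "$first-$last";
    }
    return @out;
}

# Positions from two overlapping HSPs, given out of order with repeats.
my @merged = merge_positions(1, 2, 3, 4, 5, 9, 10, 11, 7, 3, 4);
print join(' ', collapse_runs(@merged)), "\n";   # prints "1-5 7 9-11"
```

Because the merged list mixes positions from every HSP, a collapsed range may span residues that come from different HSPs, which is the caveat raised in the comments about untiled HSPs.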
=cut

#-------------
sub seq_inds {
#-------------
    my ($self, $seqType, $class, $collapse) = @_;

    $seqType  ||= 'query';
    $class    ||= 'identical';
    $collapse ||= 0;

    $seqType = 'sbjct' if $seqType eq 'hit';

    my (@inds, $hsp);
    foreach $hsp ($self->hsps) {
        # This will merge data for all HSPs together.
        push @inds, $hsp->seq_inds($seqType, $class);
    }

    # Need to remove duplicates and sort the merged positions.
    if(@inds) {
        my %tmp = map { $_, 1 } @inds;
        @inds = sort {$a <=> $b} keys %tmp;
    }

    $collapse ? &Bio::Search::BlastUtils::collapse_nums(@inds) : @inds;
}


=head2 iteration

 Usage     : $sbjct->iteration( );
 Purpose   : Gets the iteration number in which the Hit was found.
 Example   : $iteration_num = $sbjct->iteration();
 Returns   : Integer greater than or equal to 1
             Non-PSI-BLAST reports will report iteration as 1, but this number
             is only meaningful for PSI-BLAST reports.
 Argument  : none
 Throws    : none

See Also   : L<found_again()|found_again>

=cut

#----------------
sub iteration { shift->{'_iteration'} }
#----------------


=head2 found_again

 Usage     : $sbjct->found_again;
 Purpose   : Gets a boolean indicator whether or not the hit has
             been found in a previous iteration.
             This is only applicable to PSI-BLAST reports.

             This method indicates if the hit was reported in the
             "Sequences used in model and found again" section of the
             PSI-BLAST report or if it was reported in the
             "Sequences not found previously or not previously below threshold"
             section of the PSI-BLAST report. Only for hits in iteration > 1.

 Example   : if( $sbjct->found_again()) { ... };
 Returns   : Boolean (1 or 0) for PSI-BLAST report iterations greater than 1.
             Returns undef for PSI-BLAST report iteration 1 and non-PSI-BLAST
             reports.
 Argument  : none
 Throws    : none

See Also   : L<iteration()|iteration>
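
Because found_again() can return 1, 0, or undef, callers may want to test
definedness before testing truth. The following is a minimal sketch of that
pattern, not code from this module: C<My::MockHit> and C<found_again_label()>
are hypothetical names, and the mock only mimics the two accessors documented
above so that the example runs without a BLAST report.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a hit object; it only mimics the two
# accessors documented above and is not part of BioPerl.
package My::MockHit;
sub new         { my ($class, %args) = @_; return bless { %args }, $class }
sub iteration   { shift->{'_iteration'} }
sub found_again { shift->{'_found_again'} }

package main;

# Map the three possible found_again() states to a label.
sub found_again_label {
    my ($hit) = @_;
    my $again = $hit->found_again;
    return 'n/a' unless defined $again;    # iteration 1 or non-PSI-BLAST
    return $again ? 'found again' : 'new in this iteration';
}

my @hits = (
    My::MockHit->new( _iteration => 1 ),
    My::MockHit->new( _iteration => 2, _found_again => 1 ),
    My::MockHit->new( _iteration => 2, _found_again => 0 ),
);

printf "iteration %d: %s\n", $_->iteration, found_again_label($_) for @hits;
```

A plain C<if( $sbjct-E<gt>found_again )> treats undef and 0 alike, which is
fine when only the "found again" case matters.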

=cut

#----------------
sub found_again { shift->{'_found_again'} }
#----------------


=head2 strand

 Usage     : $sbjct->strand( [seq_type] );
 Purpose   : Gets the strand(s) for the query, sbjct, or both sequences
           : in the best HSP of the BlastHit object after HSP tiling.
           : Only valid for BLASTN, TBLASTX, BLASTX-query, TBLASTN-hit.
 Example   : $qstrand = $sbjct->strand('query');
           : $sstrand = $sbjct->strand('hit');
           : ($qstrand, $sstrand) = $sbjct->strand();
 Returns   : scalar context: integer '1', '-1', or '0'
           : array context without args: list of two strings (queryStrand, sbjctStrand)
           : Array context can be "induced" by providing an argument of 'list' or 'array'.
 Argument  : In scalar context: seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
             ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.
           : If you don't want the tiled data, iterate through each HSP
           : calling strand() on each (use hsps() to get all HSPs).
           :
           : Formerly (prior to 10/21/02), this method would return the
           : string "-1/1" for hits with HSPs on both strands.
           : However, now that strand and frame are properly accounted
           : for during HSP tiling, it makes more sense for strand()
           : to return the strand data for the best HSP after tiling.
           :
           : If you really want to know about hits on opposite strands,
           : you should be iterating through the HSPs using methods on the
           : HSP objects.
           :
           : A possible use case for knowing whether a hit has HSPs
           : on both strands is filtering via SearchIO for hits with
           : this property. However, in this case it would be better to have a
           : dedicated method such as $hit->hsps_on_both_strands(). Similarly
           : for frame. This could be provided if there is interest.

See Also   : B<Bio::Search::HSP::BlastHSP::strand>()
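
The comments above suggest iterating over the HSPs to find hits with HSPs on
both strands. The following is a minimal sketch of such a check, under stated
assumptions: C<hsps_on_both_strands()> is written here as a hypothetical
free-standing function (no such method is provided by this module), and
C<My::MockHSP> is a hypothetical stand-in whose strand() only imitates the
calling convention documented above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an HSP object; only strand() is mimicked.
package My::MockHSP;
sub new {
    my ($class, $qstrand, $sstrand) = @_;
    return bless { qstrand => $qstrand, sstrand => $sstrand }, $class;
}
sub strand {
    my ($self, $seqType) = @_;
    $seqType ||= (wantarray ? 'list' : 'query');
    $seqType = 'sbjct' if $seqType eq 'hit';
    return ($self->{qstrand}, $self->{sstrand}) if $seqType =~ /list|array/i;
    return $seqType eq 'query' ? $self->{qstrand} : $self->{sstrand};
}

package main;

# Hypothetical helper: returns 1 if the given HSPs include both the
# plus (1) and minus (-1) strand for $seqType ('query' or 'hit'), else 0.
sub hsps_on_both_strands {
    my ($seqType, @hsps) = @_;
    my %seen;
    $seen{ $_->strand($seqType) }++ for @hsps;
    return ($seen{1} && $seen{-1}) ? 1 : 0;
}

my @same  = ( My::MockHSP->new(1, 1), My::MockHSP->new(1, 1) );
my @mixed = ( My::MockHSP->new(1, 1), My::MockHSP->new(1, -1) );

print "same:  ", hsps_on_both_strands('hit', @same),  "\n";
print "mixed: ", hsps_on_both_strands('hit', @mixed), "\n";
```

With a real hit object, the list passed in would come from C<$hit-E<gt>hsps>.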

=cut

#----------
sub strand {
#----------
    my ($self, $seqType) = @_;

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    $seqType ||= (wantarray ? 'list' : 'query');
    $seqType = 'sbjct' if $seqType eq 'hit';

    my ($qstr, $hstr);
    # If there is only one HSP, defer this call to the solitary HSP.
    if($self->num_hsps == 1) {
        return $self->hsp->strand($seqType);
    }
    elsif( defined $self->{'_qstrand'}) {
        # Get the data computed during hsp tiling.
        $qstr = $self->{'_qstrand'};
        $hstr = $self->{'_sstrand'};
    }
    else {
        # otherwise, iterate through all HSPs collecting strand info.
        # This will return the string "-1/1" if there are HSPs on different strands.
        # NOTE: This was the pre-10/21/02 procedure which will no longer be used,
        # (unless the above elsif{} is commented out).
        my (%qstr, %hstr);
        foreach my $hsp( $self->hsps ) {
            my ( $q, $h ) = $hsp->strand();
            $qstr{ $q }++;
            $hstr{ $h }++;
        }
        $qstr = join( '/', sort keys %qstr);
        $hstr = join( '/', sort keys %hstr);
    }

    if($seqType =~ /list|array/i) {
        return ($qstr, $hstr);
    } elsif( $seqType eq 'query' ) {
        return $qstr;
    } else {
        return $hstr;
    }
}


1;
__END__

#####################################################################################
#                                 END OF CLASS                                      #
#####################################################################################


=head1 FOR DEVELOPERS ONLY

=head2 Data Members

Information about the various data members of this module is provided for those
wishing to modify or understand the code. Two things to bear in mind:

=over 4

=item 1 Do NOT rely on these in any code outside of this module.

All data members are prefixed with an underscore to signify that they are private.
Always use accessor methods. If the accessor doesn't exist or is inadequate,
create or modify an accessor (and let me know, too!). (An exception to this might
be for BlastHSP.pm which is more tightly coupled to BlastHit.pm and
may access BlastHit data members directly for efficiency purposes, but probably
should not).

=item 2 This documentation may be incomplete and out of date.

It is easy for these data member descriptions to become obsolete as
this module is still evolving. Always double check this info and search
for members not described here.

=back
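
The accessor convention described in item 1 can be illustrated with a small
self-contained sketch. C<My::MiniHit> below is a hypothetical miniature class,
not this module: it only borrows the pattern of a blessed hash with
underscore-prefixed fields that callers read through methods.

```perl
use strict;
use warnings;

# Hypothetical miniature class using the same convention as BlastHit:
# a blessed hash whose fields are underscore-prefixed and private.
package My::MiniHit;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->{'_score'} = $args{score};
    $self->{'_hsps'}  = $args{hsps} || [];
    return $self;
}

# Read-only accessors; callers never touch the hash keys themselves.
sub score { shift->{'_score'} }

sub hsps {
    my ($self) = @_;
    return @{ $self->{'_hsps'} };
}

sub num_hsps {
    my ($self) = @_;
    return scalar @{ $self->{'_hsps'} };
}

package main;

my $hit = My::MiniHit->new( score => 250, hsps => [ 'hsp1', 'hsp2' ] );

# Preferred: go through the accessors.
print "score: ", $hit->score,    "\n";
print "HSPs:  ", $hit->num_hsps, "\n";

# Discouraged outside the class: $hit->{'_score'} would also work, but it
# breaks as soon as the internal field is renamed or computed lazily.
```

Keeping callers on the accessors is what lets the field list below change
without breaking outside code.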

An instance of Bio::Search::Hit::BlastHit.pm is a blessed reference to a hash containing
all or some of the following fields:

 FIELD              VALUE
 --------------------------------------------------------------
 _hsps             : Array ref for a list of Bio::Search::HSP::BlastHSP.pm objects.
                   :
 _db               : Database identifier from the summary line.
                   :
 _desc             : Description data for the hit from the summary line.
                   :
 _length           : Total length of the hit sequence.
                   :
 _score            : BLAST score.
                   :
 _bits             : BLAST score (in bits). Matrix-independent.
                   :
 _p                : BLAST P value. Obtained from summary section. (Blast1/WU-Blast only)
                   :
 _expect           : BLAST Expect value. Obtained from summary section.
                   :
 _n                : BLAST N value (number of HSPs) (Blast1/WU-Blast2 only)
                   :
 _frame            : Reading frame for TBLASTN and TBLASTX analyses.
                   :
 _totalIdentical   : Total number of identical aligned monomers.
                   :
 _totalConserved   : Total number of conserved aligned monomers (a.k.a. "positives").
                   :
 _overlap          : Maximum number of overlapping residues between adjacent HSPs
                   : before considering the alignment to be ambiguous.
                   :
 _ambiguous_aln    : Boolean. True if the alignment of all HSPs is ambiguous.
                   :
 _length_aln_query : Length of the aligned region of the query sequence.
                   :
 _length_aln_sbjct : Length of the aligned region of the sbjct sequence.


=cut

1;