annotate variant_effect_predictor/Bio/Search/Hit/BlastHit.pm @ 1:d6778b5d8382 draft default tip

Deleted selected files
author willmclaren
date Fri, 03 Aug 2012 10:05:43 -0400
parents 21066c0abaf5
#-----------------------------------------------------------------
# $Id: BlastHit.pm,v 1.13 2002/10/22 09:36:19 sac Exp $
#
# BioPerl module Bio::Search::Hit::BlastHit
#
# (This module was originally called Bio::Tools::Blast::Sbjct)
#
# Cared for by Steve Chervitz <sac@bioperl.org>
#
# You may distribute this module under the same terms as perl itself
#-----------------------------------------------------------------

## POD Documentation:

=head1 NAME

Bio::Search::Hit::BlastHit - Bioperl BLAST Hit object

=head1 SYNOPSIS

The construction of BlastHit objects is performed by
Bio::SearchIO::blast::BlastHitFactory in a process that is
orchestrated by the Blast parser (B<Bio::SearchIO::blast::blast>).
The resulting BlastHits are then accessed via
B<Bio::Search::Result::BlastResult>. Therefore, you do not need to
use B<Bio::Search::Hit::BlastHit> directly. If you need to construct
BlastHits directly, see the new() function for details.

For B<Bio::SearchIO> BLAST parsing usage examples, see the
B<examples/search-blast> directory of the Bioperl distribution.


=head1 DESCRIPTION

The Bio::Search::Hit::BlastHit.pm module encapsulates data and methods
for manipulating "hits" from a BLAST report. A BLAST hit is a
collection of HSPs along with other metadata such as sequence name
and score information. Hit objects are accessed via
B<Bio::Search::Result::BlastResult> objects after parsing a BLAST report using
the B<Bio::SearchIO> system.

In Blast lingo, the "sbjct" sequences are all the sequences
in a target database which were compared against a "query" sequence.
The terms "sbjct" and "hit" will be used interchangeably in this module.
All methods that take 'sbjct' as an argument also support 'hit' as a
synonym.

This module supports BLAST versions 1.x and 2.x, gapped and ungapped,
and PSI-BLAST.


=head2 HSP Tiling and Ambiguous Alignments

If a Blast hit has more than one HSP, the Bio::Search::Hit::BlastHit.pm
object has the ability to merge overlapping HSPs into contiguous
blocks. This permits the BlastHit object to sum data across all HSPs
without counting data in the overlapping regions multiple times, which
would happen if data from each overlapping HSP are simply summed. HSP
tiling is performed automatically when methods of the BlastHit object
that rely on tiled data are invoked. These include
L<frac_identical()|frac_identical>, L<frac_conserved()|frac_conserved>, L<gaps()|gaps>,
L<frac_aligned_query()|frac_aligned_query>, L<frac_aligned_hit()|frac_aligned_hit>,
L<num_unaligned_query()|num_unaligned_query>, L<num_unaligned_hit()|num_unaligned_hit>.

It also permits the assessment of an "ambiguous alignment" if the
query (or sbjct) sequences from different HSPs overlap
(see L<ambiguous_aln()|ambiguous_aln>). The existence
of an overlap could indicate a biologically interesting region in the
sequence, such as a repeated domain. The BlastHit object uses the
C<-OVERLAP> parameter to determine when two sequences overlap; if this is
set to 2 -- the default -- then any two sbjct or query HSP sequences
must overlap by more than two residues to get merged into the same
contig and counted as an overlap. See the L<BUGS | BUGS> section below for
"issues" with HSP tiling.

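The merging rule can be sketched in plain Perl. This is a minimal
illustration under simplifying assumptions, not the implementation in
Bio::Search::BlastUtils::tile_hsps(): the helper C<tile_ranges()> and the
[start, end] range representation are hypothetical, and two ranges are
merged only when they share more than the given number of residues.

  use strict;
  use warnings;
  use List::Util qw(min);

  # Hypothetical helper: merge 1-based, inclusive [start, end] ranges into
  # contigs. Two ranges are merged only if they share more than
  # $max_overlap residues.
  sub tile_ranges {
      my ($max_overlap, @ranges) = @_;
      my @contigs;
      for my $range (sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @ranges) {
          my ($start, $end) = @$range;
          if (@contigs) {
              my $last   = $contigs[-1];
              # Number of residues shared with the most recent contig
              # (zero or negative when the ranges do not touch).
              my $shared = min($last->[1], $end) - $start + 1;
              if ($shared > $max_overlap) {
                  $last->[1] = $end if $end > $last->[1];
                  next;
              }
          }
          push @contigs, [$start, $end];
      }
      return @contigs;
  }

  # With a threshold of 2, ranges sharing only two residues (49-50) stay
  # separate, while ranges sharing eleven residues (90-100) are merged.
  my @contigs = tile_ranges(2, [1, 50], [49, 100], [90, 120]);
  print join(', ', map { "$_->[0]-$_->[1]" } @contigs), "\n";   # 1-50, 49-120

With a threshold of 0 the same three ranges collapse into the single
contig 1-120, because any shared residue is then enough to merge.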

The results of the HSP tiling are reported with the following ambiguity codes:

   'q'  = Query sequence contains multiple sub-sequences matching
          a single region in the sbjct sequence.

   's'  = Subject (BlastHit) sequence contains multiple sub-sequences matching
          a single region in the query sequence.

   'qs' = Both query and sbjct sequences contain more than one
          sub-sequence with similarity to the other sequence.


For additional information about ambiguous BLAST alignments, see
L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils> and

 http://www-genome.stanford.edu/Sacch3D/help/ambig_aln.html

=head1 DEPENDENCIES

Bio::Search::Hit::BlastHit.pm is a concrete class that inherits from
B<Bio::Root::Root> and B<Bio::Search::Hit::HitI>, and relies on
B<Bio::Search::HSP::BlastHSP>.


=head1 BUGS

One consequence of the HSP tiling is that methods that rely on HSP
tiling such as L<frac_identical()|frac_identical>, L<frac_conserved()|frac_conserved>, L<gaps()|gaps>
etc. may report misleading numbers when C<-OVERLAP> is set to a large
number. For example, say we have two HSPs and the query sequences tile
as follows:

           1      8             22      30        40                 60
 Full seq: ------------------------------------------------------------
                   *  **  *  **
 HSP1:            ---------------                (6 identical matches)
                             **   **  **
 HSP2:                      -------------        (6 identical matches)


If C<-OVERLAP> is set to some number over 4, HSP1 and HSP2 will not be
tiled into a single contig and their numbers of identical matches will
be added, giving a total of 12, not the 10 they would give had they been
combined into one contig. This can lead to numbers greater than 1.0 for
methods L<frac_identical()|frac_identical> and L<frac_conserved()|frac_conserved>. This is less of an issue
with gapped Blast since it tends to combine HSPs that would be listed
separately without gapping. (Fractions E<gt>1.0 can be viewed as a
signal for an interesting alignment that warrants further inspection,
thus turning this bug into a feature :-).

Using large values for C<-OVERLAP> can lead to incorrect numbers
reported by methods that rely on HSP tiling but can be useful if you
care more about detecting ambiguous alignments. Setting C<-OVERLAP>
to zero will lead to the most accurate numbers for the
tiling-dependent methods but will be useless for detecting overlapping
HSPs since all HSPs will appear to overlap.

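The double counting described in this section can be reproduced with a
few lines of plain Perl. The match positions below are hypothetical,
chosen only so that two of the six identical matches in each HSP fall in
the region covered by both HSPs; the sketch does not call any BlastHit
methods.

  use strict;
  use warnings;

  # Hypothetical positions of identical matches along the query sequence.
  # Positions 19 and 20 lie in the region covered by both HSPs.
  my @hsp1_identical = (9, 12, 13, 16, 19, 20);
  my @hsp2_identical = (19, 20, 24, 25, 28, 29);

  # Untiled: each HSP is counted separately, so shared positions count twice.
  my $summed = @hsp1_identical + @hsp2_identical;

  # Tiled: each position in the merged contig is counted once.
  my %seen;
  $seen{$_} = 1 for @hsp1_identical, @hsp2_identical;
  my $tiled = keys %seen;

  print "summed=$summed tiled=$tiled\n";   # summed=12 tiled=10

Dividing the summed count by the length of a single merged region is
what pushes a fraction above 1.0.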

=head1 SEE ALSO

 Bio::Search::HSP::BlastHSP.pm          - Blast HSP object.
 Bio::Search::Result::BlastResult.pm    - Blast Result object.
 Bio::Search::Hit::HitI.pm              - Interface implemented by BlastHit.pm
 Bio::Root::Root.pm                     - Base class for BlastHit.pm

Links:

 http://bio.perl.org/Core/POD/Search/Hit/Blast/BlastHSP.pm.html

 http://bio.perl.org/Projects/modules.html  - Online module documentation
 http://bio.perl.org/Projects/Blast/        - Bioperl Blast Project
 http://bio.perl.org/                       - Bioperl Project Homepage


=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bioperl.org              - General discussion
  http://bio.perl.org/MailList.html  - About the mailing lists

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via email
or the web:

  bioperl-bugs@bio.perl.org
  http://bio.perl.org/bioperl-bugs/

=head1 AUTHOR

Steve Chervitz E<lt>sac@bioperl.orgE<gt>

See L<the FEEDBACK section | FEEDBACK> for where to send bug reports and comments.

=head1 ACKNOWLEDGEMENTS

This software was originally developed in the Department of Genetics
at Stanford University. I would also like to acknowledge my
colleagues at Affymetrix for useful feedback.

=head1 COPYRIGHT

Copyright (c) 1996-2001 Steve Chervitz. All Rights Reserved.

=head1 DISCLAIMER

This software is provided "as is" without warranty of any kind.

=head1 APPENDIX

The rest of the documentation details each of the object methods.
Internal methods are usually preceded with a _

=cut


# Let the code begin...

package Bio::Search::Hit::BlastHit;

use strict;
use Bio::Search::Hit::HitI;
use Bio::Root::Root;
require Bio::Search::BlastUtils;
use vars qw( @ISA %SUMMARY_OFFSET $Revision);

use overload
    '""' => \&to_string;

@ISA = qw( Bio::Root::Root Bio::Search::Hit::HitI );

$Revision = '$Id: BlastHit.pm,v 1.13 2002/10/22 09:36:19 sac Exp $';  #'


=head2 new

 Usage     : $hit = Bio::Search::Hit::BlastHit->new( %named_params );
           : Bio::Search::Hit::BlastHit.pm objects are constructed
           : automatically by Bio::SearchIO::BlastHitFactory.pm,
           : so there is no need for direct instantiation.
 Purpose   : Constructs a new BlastHit object and initializes key variables
           : for the hit.
 Returns   : A Bio::Search::Hit::BlastHit object
 Argument  : Named Parameters:
           : Parameter keys are case-insensitive.
           :      -RAW_DATA       => array reference holding raw BLAST report data
           :                         for a single hit. This includes all lines
           :                         within the HSP alignment listing section of a
           :                         traditional BLAST or PSI-BLAST (non-XML) report,
           :                         starting at (or just after) the leading '>'.
           :      -HOLD_RAW_DATA  => boolean, true if -RAW_DATA should be saved
           :                         within the object.
           :      -QUERY_LEN      => Length of the query sequence
           :      -ITERATION      => integer (PSI-BLAST iteration number in which hit was found)
           :      -OVERLAP        => integer (maximum overlap between adjacent
           :                         HSPs when tiling)
           :      -PROGRAM        => string (type of Blast: BLASTP, BLASTN, etc)
           :      -SIGNIF         => significance
           :      -IS_PVAL        => boolean, true if -SIGNIF contains a P-value
           :      -SCORE          => raw BLAST score
           :      -FOUND_AGAIN    => boolean, true if this was a hit from the
           :                         section of a PSI-BLAST with iteration > 1
           :                         containing sequences that were also found
           :                         in iteration 1.
 Comments  : This object accepts raw Blast report data not because it
           : is required for parsing, but in order to retrieve it
           : (only available if -HOLD_RAW_DATA is set to true).

See Also   : L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>, L<Bio::Root::Root::new()|Bio::Root::Root>

=cut

#-------------------
sub new {
#-------------------
    my ($class, @args ) = @_;
    my $self = $class->SUPER::new( @args );

    my ($raw_data, $signif, $is_pval, $hold_raw);

    # Unpack the named parameters in the order given to _rearrange().
    ($self->{'_blast_program'}, $self->{'_query_length'}, $raw_data, $hold_raw,
     $self->{'_overlap'}, $self->{'_iteration'}, $signif, $is_pval,
     $self->{'_score'}, $self->{'_found_again'} ) =
       $self->_rearrange( [qw(PROGRAM
                              QUERY_LEN
                              RAW_DATA
                              HOLD_RAW_DATA
                              OVERLAP
                              ITERATION
                              SIGNIF
                              IS_PVAL
                              SCORE
                              FOUND_AGAIN )], @args );

    # TODO: Handle this in parser. Just pass in name parameter.
    $self->_set_id( $raw_data->[0] );

    # -SIGNIF holds either a P-value or an Expect value, depending on -IS_PVAL.
    if($is_pval) {
        $self->{'_p'} = $signif;
    } else {
        $self->{'_expect'} = $signif;
    }

    # Keep the raw report lines only when -HOLD_RAW_DATA is true.
    if( $hold_raw ) {
        $self->{'_hit_data'} = $raw_data;
    }

    return $self;
}
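=pod

The case-insensitive named parameters accepted by new() can be
illustrated with a small stand-alone sketch. The helper
C<rearrange_params()> is hypothetical and only mimics, in simplified
form, the kind of key handling that new() delegates to C<_rearrange()>;
it is not the Bio::Root::Root implementation.

  use strict;
  use warnings;

  # Hypothetical stand-in: return the values for the keys in @$order,
  # matching keys case-insensitively and ignoring the leading dash.
  sub rearrange_params {
      my ($order, %params) = @_;
      my %by_key;
      while (my ($key, $value) = each %params) {
          (my $name = uc $key) =~ s/^-//;
          $by_key{$name} = $value;
      }
      return map { $by_key{$_} } @$order;
  }

  my ($program, $overlap, $signif) = rearrange_params(
      [qw(PROGRAM OVERLAP SIGNIF)],
      -program => 'BLASTP',
      -Overlap => 2,
      -SIGNIF  => 1e-30,
  );
  print "$program $overlap $signif\n";   # BLASTP 2 1e-30

Keys that are not supplied come back as undef, which is why new() can
leave optional settings such as the iteration number unset.

=cut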

sub DESTROY {
    my $self=shift;
    #print STDERR "-->DESTROYING $self\n";
}


#=================================================
# Begin Bio::Search::Hit::HitI implementation
#=================================================

=head2 algorithm

 Title   : algorithm
 Usage   : $alg = $hit->algorithm();
 Function: Gets the algorithm specification that was used to obtain the hit.
           For BLAST, the algorithm denotes what type of sequence was aligned
           against what (BLASTN: dna-dna, BLASTP prt-prt, BLASTX translated
           dna-prt, TBLASTN prt-translated dna, TBLASTX translated
           dna-translated dna).
 Returns : a scalar string
 Args    : none

=cut

#----------------
sub algorithm {
#----------------
    my ($self,@args) = @_;
    return $self->{'_blast_program'};
}
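=pod

As a quick reference, the program names listed for algorithm() can be
kept in a lookup table. This is a sketch only: the C<%SEQ_TYPES> hash and
the C<seq_types_for()> helper are hypothetical and are not part of this
module.

  use strict;
  use warnings;

  # Query and database sequence types for each BLAST program, taken from
  # the description of algorithm() above.
  my %SEQ_TYPES = (
      BLASTN  => ['dna',            'dna'],
      BLASTP  => ['protein',        'protein'],
      BLASTX  => ['translated dna', 'protein'],
      TBLASTN => ['protein',        'translated dna'],
      TBLASTX => ['translated dna', 'translated dna'],
  );

  # Hypothetical helper: look up the types for an algorithm string,
  # ignoring case. Returns an empty list for unknown programs.
  sub seq_types_for {
      my ($algorithm) = @_;
      my $types = $SEQ_TYPES{ uc $algorithm } or return;
      return @$types;
  }

  my ($query_type, $hit_type) = seq_types_for('tblastn');
  print "$query_type vs $hit_type\n";   # protein vs translated dna

=cut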
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
322
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
323 =head2 name
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
324
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
325 Usage : $hit->name([string]);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
326 Purpose : Set/Get a string to identify the hit.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
327 Example : $name = $hit->name;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
328 : $hit->name('M81707');
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
329 Returns : String consisting of the hit's name or undef if not set.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
330 Comments : The name is parsed out of the "Query=" line as the first chunk of
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
331 non-whitespace text. If you want the rest of the line, use
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
332 $hit->description().
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
333
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
334 See Also: L<accession()|accession>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
335
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
336 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
337
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
338 #'
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
339
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
340 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
341 sub name {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
342 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
343 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
344 if (@_) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
345 my $name = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
346 $name =~ s/^\s+|(\s+|,)$//g;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
347 $self->{'_name'} = $name;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
348 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
349 return $self->{'_name'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
350 }
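The cleanup that name() applies when setting a value can be tried in isolation. The following is a minimal sketch, not part of the module: `clean_name` is a hypothetical helper that reuses the same substitution, and the sample strings are invented.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BlastHit): applies the same substitution
# that name() uses when a new value is supplied.
sub clean_name {
    my $name = shift;
    # Remove leading whitespace, and either trailing whitespace or a
    # trailing comma (whichever sits at the very end of the string).
    $name =~ s/^\s+|(\s+|,)$//g;
    return $name;
}

print clean_name("  M81707 "), "\n";   # leading and trailing spaces removed
print clean_name("M81707,"), "\n";     # trailing comma removed
print clean_name("M81707, "), "\n";    # only the final space is removed
```

Note that the pattern removes only what sits at the very end of the string, so a comma followed by whitespace keeps its comma.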
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
351
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
352 =head2 description
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
353
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
354 Usage : $hit_object->description( [integer] );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
355 Purpose : Set/Get a description string for the hit.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
356 This is parsed out of the "Query=" line as everything after
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
357 the first chunk of non-whitespace text. Use $hit->name()
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
358 to get the first chunk (the ID of the sequence).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
359 Example : $description = $hit->description;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
360 : $desc_60char = $hit->description(60);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
361 Argument : Integer (optional) indicating the desired length of the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
362 : description string to be returned.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
363 Returns : String consisting of the hit's description or undef if not set.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
364
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
365 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
366
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
367 #'
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
368
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
369 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
370 sub description {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
371 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
372 my( $self, $len ) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
373 $len = (defined $len) ? $len : (CORE::length $self->{'_description'});
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
374 return substr( $self->{'_description'}, 0 ,$len );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
375 }
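The truncation performed by description() can be sketched outside the class. `truncate_desc` below is a hypothetical stand-in, and the explicit check for an undefined description is an assumption added here to match the documented "undef if not set" behaviour; the method above passes the stored value to substr() without such a check.

```perl
use strict;
use warnings;

# Hypothetical stand-in for description(): returns the first $len characters
# of $desc, or the whole string when no length is given.
sub truncate_desc {
    my ($desc, $len) = @_;
    return undef unless defined $desc;   # assumption: guard added for this sketch
    $len = length $desc unless defined $len;
    return substr($desc, 0, $len);
}

my $full  = truncate_desc("S.cerevisiae chromosome XIII cosmid");
my $short = truncate_desc("S.cerevisiae chromosome XIII cosmid", 12);
print "$full\n$short\n";
```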
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
376
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
377 =head2 accession
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
378
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
379 Title : accession
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
380 Usage : $acc = $hit->accession();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
381 Function: Retrieve the accession (if available) for the hit
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
382 Returns : a scalar string (empty string if not set)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
383 Args : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
384 Comments: Accession numbers are extracted based on the assumption that they
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
385 are delimited by | characters (NCBI-style). If this is not the case,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
386 use the name() method and parse it as necessary.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
387
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
388 See Also: L<name()|name>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
389
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
390 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
391
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
392 #--------------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
393 sub accession {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
394 #--------------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
395 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
396 if(@_) { $self->{'_accession'} = shift; }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
397 $self->{'_accession'} || '';
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
398 }
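When accession() returns an empty string, the Comments above suggest parsing name() directly. A minimal sketch follows, assuming an identifier laid out as `db|accession|locus`, optionally preceded by a `gi|number` pair. `accession_from_name` is a hypothetical helper, and real identifiers may not follow this layout.

```perl
use strict;
use warnings;

# Hypothetical helper: pull an accession out of a '|'-delimited identifier.
# Assumes the layout db|accession|locus, optionally prefixed by gi|<number>.
sub accession_from_name {
    my $name = shift;
    my @fields = split /\|/, $name;
    # Drop a leading "gi|<number>" pair if one is present.
    splice(@fields, 0, 2) if @fields >= 4 && $fields[0] eq 'gi';
    return @fields >= 2 ? $fields[1] : '';
}

print accession_from_name('emb|Z46660|SC9725'), "\n";
print accession_from_name('gi|173039|gb|M81707|YSCACT'), "\n";
```

Names without any `|` delimiter yield an empty string, mirroring the "empty string if not set" convention of accession().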
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
399
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
400 =head2 raw_score
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
401
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
402 Usage : $hit_object->raw_score();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
403 Purpose : Gets the BLAST score of the best HSP for the current Blast hit.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
404 Example : $score = $hit_object->raw_score();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
405 Returns : Integer
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
406 Argument : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
407 Throws : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
408
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
409 See Also : L<bits()|bits>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
410
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
411 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
412
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
413 #----------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
414 sub raw_score {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
415 #----------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
416 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
417
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
418 # The check for $self->{'_score'} is a remnant from the 'query' mode days
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
419 # in which the sbjct object would collect data from the description line only.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
420
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
421 my ($score);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
422 if(not defined($self->{'_score'})) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
423 $score = $self->hsp->score;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
424 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
425 $score = $self->{'_score'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
426 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
427 return $score;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
428 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
429
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
430
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
431 =head2 length
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
432
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
433 Usage : $hit_object->length();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
434 Purpose : Get the total length of the hit sequence.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
435 Example : $len = $hit_object->length();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
436 Returns : Integer
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
437 Argument : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
438 Throws : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
439 Comments : Developer note: when using the built-in length function within
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
440 : this module, call it as CORE::length().
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
441
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
442 See Also : L<logical_length()|logical_length>, L<length_aln()|length_aln>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
443
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
444 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
445
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
446 #-----------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
447 sub length {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
448 #-----------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
449 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
450 return $self->{'_length'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
451 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
452
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
453 =head2 significance
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
454
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
455 Equivalent to L<signif()|signif>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
456
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
457 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
458
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
459 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
460 sub significance { shift->signif( @_ ); }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
461 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
462
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
463
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
464 =head2 next_hsp
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
465
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
466 Title : next_hsp
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
467 Usage : $hsp = $obj->next_hsp();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
468 Function : returns the next available High Scoring Pair object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
469 Example :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
470 Returns : Bio::Search::HSP::BlastHSP or undef if finished
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
471 Args : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
472
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
473 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
474
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
475 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
476 sub next_hsp {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
477 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
478 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
479
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
480 unless($self->{'_hsp_queue_started'}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
481 $self->{'_hsp_queue'} = [$self->hsps()];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
482 $self->{'_hsp_queue_started'} = 1;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
483 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
484 pop @{$self->{'_hsp_queue'}};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
485 }
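The lazily initialised queue behind next_hsp() can be illustrated with a plain hash. This is a sketch under stated assumptions: `next_item`, the `items` key and the sample values are hypothetical, and the sketch takes items from the front of the queue with shift so they come back in stored order (taking them with pop would return them in reverse order).

```perl
use strict;
use warnings;

# Hypothetical container; 'items' stands in for the list returned by hsps().
my %hit = ( items => [ 'hsp1', 'hsp2', 'hsp3' ] );

sub next_item {
    my $self = shift;
    unless ($self->{'_queue_started'}) {
        # Copy the list so that iterating does not consume the original.
        $self->{'_queue'} = [ @{ $self->{'items'} } ];
        $self->{'_queue_started'} = 1;
    }
    return shift @{ $self->{'_queue'} };   # undef once the queue is empty
}

my @seen;
while (defined(my $item = next_item(\%hit))) {
    push @seen, $item;
}
print "@seen\n";
```

Because the started flag is never reset, a second pass over the same object yields nothing; client code that needs to iterate again would have to clear the flag or fetch the full list.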
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
486
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
487 #=================================================
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
488 # End Bio::Search::Hit::HitI implementation
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
489 #=================================================
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
490
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
491
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
492 # Providing a more explicit method for getting name of hit
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
493 # (corresponds with column name in HitTableWriter)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
494 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
495 sub hit_name {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
496 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
497 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
498 $self->name( @_ );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
499 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
500
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
501 # Older method; delegates to description()
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
502 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
503 sub desc {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
504 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
505 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
506 return $self->description( @_ );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
507 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
508
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
509 # Providing a more explicit method for getting description of hit
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
510 # (corresponds with column name in HitTableWriter)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
511 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
512 sub hit_description {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
513 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
514 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
515 return $self->description( @_ );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
516 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
517
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
518 =head2 score
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
519
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
520 Equivalent to L<raw_score()|raw_score>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
521
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
522 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
523
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
524 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
525 sub score { shift->raw_score( @_ ); }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
526 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
527
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
528
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
529 =head2 hit_length
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
530
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
531 Equivalent to L<length()|length>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
532
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
533 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
534
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
535 # Providing a more explicit method for getting length of hit
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
536 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
537 sub hit_length { shift->length( @_ ); }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
538 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
539
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
540
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
541 =head2 signif
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
542
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
543 Usage : $hit_object->signif( [format] );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
544 Purpose : Get the P or Expect value for the best HSP of the given BLAST hit.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
545 : The value returned is the one which is reported in the description
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
546 : section of the Blast report. For Blast1 and WU-Blast2, this
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
547 : is a P-value, for Blast2, it is an Expect value.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
548 Example : $obj->signif() # returns 1.3e-34
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
549 : $obj->signif('exp') # returns -34
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
550 : $obj->signif('parts') # returns (1.3, -34)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
551 Returns : Float or scientific notation number (the raw P/Expect value, DEFAULT).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
552 : Integer if format == 'exp' (the magnitude of the base 10 exponent).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
553 : 2-element list (float, int) if format == 'parts' and P/Expect value
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
554 : is in scientific notation (see Comments).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
555 Argument : format: string of 'raw' | 'exp' | 'parts'
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
556 : 'raw' returns value given in report. Default. (1.2e-34)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
557 : 'exp' returns exponent value only (-34)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
558 : 'parts' returns the decimal and exponent as a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
559 : 2-element list (1.2, -34) (see Comments).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
560 Throws : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
561 Comments : The signif() method provides a way to deal with the fact that
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
562 : Blast1 and Blast2 formats (and WU- vs. NCBI-BLAST) differ in
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
563 : what is reported in the description lines of each hit in the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
564 : Blast report. The signif() method frees any client code from
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
565 : having to know if this is a P-value or an Expect value,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
566 : making it easier to write code that can process both
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
567 : Blast1 and Blast2 reports. This is not necessarily a good thing,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
568 : since one should always know when one is working with P-values or
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
569 : Expect values (hence the deprecated status).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
570 : Use of expect() is recommended since all hits will have an Expect value.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
571 :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
572 : Using the 'parts' argument is not recommended since it will not
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
573 : work as expected if the expect value is not in scientific notation.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
574 : That is, floats are not converted into sci notation before
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
575 : splitting into parts.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
576
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
577 See Also : L<p()|p>, L<expect()|expect>, L<Bio::Search::BlastUtils::get_exponent()|Bio::Search::BlastUtils>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
578
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
579 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
580
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
581 #-------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
582 sub signif {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
583 #-------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
584 # Some duplication of logic for p(), expect() and signif() for the sake of performance.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
585 my ($self, $fmt) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
586
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
587 my $val = defined($self->{'_p'}) ? $self->{'_p'} : $self->{'_expect'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
588
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
589 # $val can be zero.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
590 defined($val) or $self->throw("Can't get P- or Expect value: HSPs may not have been set.");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
591
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
592 return $val if not $fmt or $fmt =~ /^raw/i;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
593 ## Special formats: exponent-only or as list.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
594 return &Bio::Search::BlastUtils::get_exponent($val) if $fmt =~ /^exp/i;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
595 return (split (/[eE]/, $val)) if $fmt =~ /^parts/i;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
596
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
597 ## Default: return the raw P/Expect-value.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
598 return $val;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
599 }
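The 'parts' format depends on splitting the value at the exponent marker. The sketch below uses a hypothetical helper, `signif_parts`, to show the split with a character class, which matches either `e` or `E`; a pattern written as `/eE/` would match only the literal two-character sequence "eE". It also shows why plain floats come back as a single element, as the Comments warn.

```perl
use strict;
use warnings;

# Hypothetical helper: split a value in scientific notation into its
# decimal and exponent parts.
sub signif_parts {
    my $val = shift;
    return split /[eE]/, $val;
}

my @parts = signif_parts('1.3e-34');   # two elements: decimal and exponent
my @plain = signif_parts('0.0012');    # one element: no exponent to split on

print scalar(@parts), " parts: @parts\n";
print scalar(@plain), " part: @plain\n";
```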
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
600
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
601 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
602 sub raw_hit_data {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
603 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
604 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
605 my $data = '>';
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
606 # Need to add blank lines where we've removed them.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
607 foreach( @{$self->{'_hit_data'}} ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
608 if( $_ eq 'end') {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
609 $data .= "\n";
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
610 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
611 else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
612 $data .= /^\s*(Score|Query)/ ? "\n$_" : $_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
613 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
614 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
615 return $data;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
616 }
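The way raw_hit_data() restores blank lines can be sketched as a standalone function. `rebuild_hit_data` is a hypothetical name, the sample lines are invented, and the sketch assumes each stored line keeps its trailing newline.

```perl
use strict;
use warnings;

# Hypothetical standalone version of the reconstruction loop.
sub rebuild_hit_data {
    my @lines = @_;
    my $data = '>';
    foreach my $line (@lines) {
        if ($line eq 'end') {
            $data .= "\n";   # the 'end' marker becomes a blank line
        }
        else {
            # Put a blank line before each Score or Query line.
            $data .= $line =~ /^\s*(Score|Query)/ ? "\n$line" : $line;
        }
    }
    return $data;
}

my $text = rebuild_hit_data(
    "emb|Z46660|SC9725 cosmid\n",
    " Score = 100 bits\n",
    "Query: 1 ACGT\n",
    "end",
);
print $text;
```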
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
617
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
618
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
619 #=head2 _set_length
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
620 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
621 # Usage : $hit_object->_set_length( "233" );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
622 # Purpose : Set the total length of the hit sequence.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
623 # Example : $hit_object->_set_length( $len );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
624 # Returns : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
625 # Argument : Integer (only when setting). Any commas will be stripped out.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
626 # Throws : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
627 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
628 #=cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
629
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
630 #-----------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
631 sub _set_length {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
632 #-----------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
633 my ($self, $len) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
634 $len =~ s/,//g; # get rid of commas
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
635 $self->{'_length'} = $len;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
636 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
637
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
638 #=head2 _set_description
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
639 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
640 # Usage : Private method; called automatically during construction
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
641 # Purpose : Sets the description of the hit sequence.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
642 # : For sequence without descriptions, does not set any description.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
643 # Argument : Array containing description (multiple lines).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
644 # Comments : Processes the supplied description:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
645 # 1. Join all lines into one string.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
646 # 2. Remove sequence id at the beginning of description.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
647 # 3. Remove junk characters at the beginning and end of the description.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
648 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
649 #=cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
650
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
651 #--------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
652 sub _set_description {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
653 #--------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
654 my( $self, @desc ) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
655 my( $desc);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
656
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
657 # print STDERR "BlastHit: RAW DESC:\n@desc\n";
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
658
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
659 $desc = join(" ", @desc);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
660
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
661 my $name = $self->name;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
662
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
663 if($desc) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
664 $desc =~ s/^\s*\S+\s+//; # remove the sequence ID(s)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
665 # This won't work if there's no description.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
666 $desc =~ s/^\s*\Q$name\E//; # ...but this should. \Q...\E keeps '|' and '.' in the name literal.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
667 $desc =~ s/^[\s!]+//;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
668 $desc =~ s/ \d+$//;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
669 $desc =~ s/\.+$//;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
670 $self->{'_description'} = $desc;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
671 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
672
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
673 # print STDERR "BlastHit: _set_description = $desc\n";
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
674 }
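The three processing steps listed for _set_description can be followed on sample input. The sketch below is not the module's code: `clean_description` is a hypothetical function that takes the name as an explicit argument, and it quotes the name with `\Q...\E` so that characters such as `|` are matched literally.

```perl
use strict;
use warnings;

# Hypothetical standalone version of the description cleanup.
sub clean_description {
    my ($name, @desc) = @_;
    my $desc = join(" ", @desc);       # 1. join all lines into one string
    return undef unless $desc;
    $desc =~ s/^\s*\S+\s+//;           # 2. remove the leading sequence ID when text follows it
    $desc =~ s/^\s*\Q$name\E//;        #    ...or when the ID is all there is
    $desc =~ s/^[\s!]+//;              # 3. strip junk at the beginning
    $desc =~ s/ \d+$//;                #    a trailing number
    $desc =~ s/\.+$//;                 #    and trailing dots
    return $desc;
}

my $cleaned = clean_description(
    'emb|Z46660|SC9725',
    'emb|Z46660|SC9725  S.cerevisiae chromosome XIII',
    'cosmid... 1234',
);
print "$cleaned\n";
```

When the input holds only the identifier, the first substitution does not match (there is no whitespace after the ID), and the second one removes the name, leaving an empty string.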
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
675
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
676 =head2 to_string
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
677
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
678 Title : to_string
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
679 Usage : print $hit->to_string;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
680 Function: Returns a string representation for the Blast Hit.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
681 Primarily intended for debugging purposes.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
682 Example : see usage
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
683 Returns : A string of the form:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
684 [BlastHit] <name> <description>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
685 e.g.:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
686 [BlastHit] emb|Z46660|SC9725 S.cerevisiae chromosome XIII cosmid
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
687 Args : None
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
688
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
689 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
690
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
691 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
692 sub to_string {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
693 #----------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
694 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
695 return "[BlastHit] " . $self->name . " " . $self->description;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
696 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
697
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
698
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
699 #=head2 _set_id
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
700 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
701 # Usage : Private method; automatically called by new()
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
702 # Purpose : Sets the name of the BlastHit sequence from the BLAST summary line.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
703 # : The identifier is assumed to be the first
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
704 # : chunk of non-whitespace characters in the description line
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
705 # : Does not assume any semantics in the structure of the identifier
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
706 # : (Formerly, this method attempted to extract database name from
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
707 # : the seq identifiers, but this was prone to break).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
708 # Returns : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
709 # Argument : String containing description line of the hit from Blast report
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
710 # : or first line of an alignment section (with or without the leading '>').
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
711 # Throws : Warning if cannot locate sequence ID.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
712 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
713 #See Also : L<new()|new>, L<accession()|accession>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
714 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
715 #=cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
716
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
717 #---------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
718 sub _set_id {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
719 #---------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
720 my( $self, $desc ) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
721
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
722 # New strategy: Assume only that the ID is the first white space
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
723 # delimited chunk. Not attempting to extract accession & database name.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
724 # Clients will have to interpret it as necessary.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
725 if($desc =~ /^>?(\S+)\s*(.*)/) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
726 my ($name, $desc) = ($1, $2);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
727 $self->name($name);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
728 $self->{'_description'} = $desc;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
729 # Note that this description comes from the summary section of the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
730 # BLAST report and so may be truncated. The full description will be
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
731 # set from the alignment section. We're setting description here in case
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
732 # the alignment section isn't being parsed.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
733
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
734 # Assuming accession is delimited with | symbols (NCBI-style)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
735 my @pieces = split(/\|/,$name);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
736 my $acc = pop @pieces;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
737 $self->accession( $acc );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
738 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
739 else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
740 $self->warn("Can't locate sequence identifier in summary line.", "Line = $desc");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
741 $desc = 'Unknown sequence ID' if not $desc;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
742 $self->name($desc);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
743 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
744 }
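
The parsing strategy in _set_id() can be tried in isolation. The following is a minimal sketch, not BioPerl code: the function name parse_hit_line is hypothetical, and it returns the three values instead of storing them on an object. It assumes the same regex and the same NCBI-style '|' split shown above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split a BLAST description
# line into name, description and accession the way _set_id() does.
sub parse_hit_line {
    my ($line) = @_;
    # Optional leading '>', then the first run of non-whitespace is the ID.
    return unless defined $line && $line =~ /^>?(\S+)\s*(.*)/;
    my ($name, $desc) = ($1, $2);
    # NCBI-style IDs are '|'-delimited; take the last piece
    # (split drops trailing empty fields).
    my @pieces = split /\|/, $name;
    my $acc = pop @pieces;
    return ($name, $desc, $acc);
}

my ($name, $desc, $acc) =
    parse_hit_line('>emb|Z46660|SC9725 S.cerevisiae chromosome XIII cosmid');
print "$name\n$desc\n$acc\n";
```

Because the accession is simply the last '|' field, identifiers that are not NCBI-style pass through unchanged as both name and accession.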
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
745
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
746
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
747 =head2 ambiguous_aln
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
748
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
749 Usage : $ambig_code = $hit_object->ambiguous_aln();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
750 Purpose : Sets/Gets ambiguity code data member.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
751 Example : (see usage)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
752 Returns : String = 'q', 's', 'qs', '-'
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
753 : 'q' = query sequence contains overlapping sub-sequences
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
754 : while sbjct does not.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
755 : 's' = sbjct sequence contains overlapping sub-sequences
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
756 : while query does not.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
757 : 'qs' = query and sbjct sequences contain overlapping sub-sequences
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
758 : relative to each other.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
759 : '-' = query and sbjct sequences do not contain multiple domains
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
760 : relative to each other OR both contain the same distribution
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
761 : of similar domains.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
762 Argument : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
763 Throws : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
764 Status : Experimental
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
765
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
766 See Also : L<Bio::Search::BlastUtils::tile_hsps>, L<HSP Tiling and Ambiguous Alignments>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
767
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
768 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
769
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
770 #--------------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
771 sub ambiguous_aln {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
772 #--------------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
773 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
774 if(@_) { $self->{'_ambiguous_aln'} = shift; }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
775 $self->{'_ambiguous_aln'} || '-';
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
776 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
777
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
778
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
779
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
780 =head2 overlap
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
781
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
782 Usage : $blast_object->overlap( [integer] );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
783 Purpose : Gets/Sets the allowable amount of overlap between different HSP sequences.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
784 Example : $blast_object->overlap(5);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
785 : $overlap = $blast_object->overlap;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
786 Returns : Integer.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
787 Argument : integer.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
788 Throws : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
789 Status : Experimental
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
790 Comments : Any two HSPs whose sequences overlap by less than or equal
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
791 : to the overlap() number of residues will be considered separate HSPs
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
792 : and will not get tiled by Bio::Search::BlastUtils::_adjust_contigs().
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
793
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
794 See Also : L<Bio::Search::BlastUtils::_adjust_contigs()|Bio::Search::BlastUtils>, L<BUGS | BUGS>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
795
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
796 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
797
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
798 #-------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
799 sub overlap {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
800 #-------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
801 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
802 if(@_) { $self->{'_overlap'} = shift; }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
803 defined $self->{'_overlap'} ? $self->{'_overlap'} : 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
804 }
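
The rule stated in the Comments for overlap() can be illustrated with a small standalone sketch. This is an assumption-laden illustration, not the logic of Bio::Search::BlastUtils::_adjust_contigs(), which is not shown in this file. The helper names shared_residues and treated_as_separate are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```perl
use strict;
use warnings;

# Hypothetical helper: number of positions shared by two closed
# intervals [$s1,$e1] and [$s2,$e2]; 0 if they are disjoint.
sub shared_residues {
    my ($s1, $e1, $s2, $e2) = @_;
    my $start = $s1 > $s2 ? $s1 : $s2;
    my $end   = $e1 < $e2 ? $e1 : $e2;
    my $n = $end - $start + 1;
    return $n > 0 ? $n : 0;
}

# Rule from the Comments: an overlap less than or equal to the allowed
# amount means the two HSPs are treated as separate.
sub treated_as_separate {
    my ($allowed, @coords) = @_;
    return shared_residues(@coords) <= $allowed ? 1 : 0;
}

print treated_as_separate(5, 1, 100, 98, 200), "\n";  # 3 shared residues
print treated_as_separate(5, 1, 100, 90, 200), "\n";  # 11 shared residues
```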
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
805
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
806
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
807
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
808
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
809
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
810
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
811 =head2 bits
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
812
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
813 Usage : $hit_object->bits();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
814 Purpose : Gets the BLAST bit score of the best HSP for the current Blast hit.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
815 Example : $bits = $hit_object->bits();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
816 Returns : Integer
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
817 Argument : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
818 Throws : Exception if bit score is not set.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
819 Comments : For BLAST1, the non-bit score is listed in the summary line.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
820
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
821 See Also : L<score()|score>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
822
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
823 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
824
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
825 #---------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
826 sub bits {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
827 #---------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
828 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
829
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
830 # The check for $self->{'_bits'} is a remnant from the 'query' mode days
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
831 # in which the sbjct object would collect data from the description line only.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
832
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
833 my ($bits);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
834 if(not defined($self->{'_bits'})) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
835 $bits = $self->hsp->bits;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
836 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
837 $bits = $self->{'_bits'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
838 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
839 return $bits;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
840 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
841
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
842
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
843
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
844 =head2 n
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
845
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
846 Usage : $hit_object->n();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
847 Purpose : Gets the N number for the current Blast hit.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
848 : This is the number of HSPs in the set which was ascribed
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
849 : the lowest P-value (listed on the description line).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
850 : This number is not the same as the total number of HSPs.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
851 : To get the total number of HSPs, use num_hsps().
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
852 Example : $n = $hit_object->n();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
853 Returns : Integer
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
854 Argument : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
855 Throws : Exception if HSPs have not been set (BLAST2 reports).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
856 Comments : Note that the N parameter is not reported in gapped BLAST2.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
857 : Calling n() on such reports will result in a call to num_hsps().
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
858 : The num_hsps() method will count the actual number of
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
859 : HSPs in the alignment listing, which may exceed N in
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
860 : some cases.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
861
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
862 See Also : L<num_hsps()|num_hsps>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
863
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
864 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
865
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
866 #-----
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
867 sub n {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
868 #-----
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
869 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
870
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
871 # The check for $self->{'_n'} is a remnant from the 'query' mode days
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
872 # in which the sbjct object would collect data from the description line only.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
873
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
874 my ($n);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
875 if(not defined($self->{'_n'})) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
876 $n = $self->hsp->n;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
877 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
878 $n = $self->{'_n'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
879 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
880 $n ||= $self->num_hsps;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
881
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
882 return $n;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
883 }
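
The fallback order used by n() is: the cached summary value, then the best HSP's own N, then the HSP count. A minimal sketch with plain hashes follows. The function hit_n and the hash layout are hypothetical, and the first element of the list is assumed to stand in for the best HSP.

```perl
use strict;
use warnings;

# Hypothetical plain-hash stand-in for a hit: '_n' is the cached summary
# value, 'hsps' is a list of hashes that may each carry their own 'n'.
sub hit_n {
    my ($hit) = @_;
    my $n = defined $hit->{'_n'} ? $hit->{'_n'} : $hit->{'hsps'}[0]{'n'};
    # Gapped BLAST2 reports carry no N, so fall back to counting HSPs.
    $n ||= scalar @{ $hit->{'hsps'} };
    return $n;
}

print hit_n({ '_n' => 3, 'hsps' => [ {}, {} ] }), "\n";       # cached value
print hit_n({ 'hsps' => [ { 'n' => 2 }, {}, {} ] }), "\n";    # from first HSP
print hit_n({ 'hsps' => [ {}, {}, {}, {} ] }), "\n";          # HSP count
```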
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
884
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
885
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
886
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
887 =head2 frame
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
888
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
889 Usage : $hit_object->frame();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
890 Purpose : Gets the reading frame for the best HSP after HSP tiling.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
891 : This is only valid for BLASTX and TBLASTN/X reports.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
892 Example : $frame = $hit_object->frame();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
893 Returns : Integer (-2 .. +2)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
894 Argument : n/a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
895 Throws : Exception if HSPs have not been set (BLAST2 reports).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
896 Comments : This method requires that all HSPs be tiled. If they have not
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
897 : already been tiled, they will be tiled first automatically.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
898 : If you don't want the tiled data, iterate through each HSP
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
899 : calling frame() on each (use hsps() to get all HSPs).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
900
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
901 See Also : L<hsps()|hsps>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
902
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
903 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
904
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
905 #----------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
906 sub frame {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
907 #----------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
908 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
909
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
910 Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
911
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
912 # The check for $self->{'_frame'} is a remnant from the 'query' mode days
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
913 # in which the sbjct object would collect data from the description line only.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
914
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
915 my ($frame);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
916 if(not defined($self->{'_frame'})) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
917 $frame = $self->hsp->frame;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
918 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
919 $frame = $self->{'_frame'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
920 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
921 return $frame;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
922 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
923
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
924
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
925
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
926
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
927
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
928 =head2 p
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
929
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
930 Usage : $hit_object->p( [format] );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
931 Purpose : Get the P-value for the best HSP of the given BLAST hit.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
932 : (Note that P-values are not provided with NCBI Blast2 reports).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
933 Example : $p = $sbjct->p;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
934 : $p = $sbjct->p('exp'); # get exponent only.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
935 : ($num, $exp) = $sbjct->p('parts'); # split sci notation into parts
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
936 Returns : Float or scientific notation number (the raw P-value, DEFAULT).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
937 : Integer if format == 'exp' (the magnitude of the base 10 exponent).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
938 : 2-element list (float, int) if format == 'parts' and P-value
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
939 : is in scientific notation (See Comments).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
940 Argument : format: string of 'raw' | 'exp' | 'parts'
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
941 : 'raw' returns value given in report. Default. (1.2e-34)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
942 : 'exp' returns exponent value only (34)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
943 : 'parts' returns the decimal and exponent as a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
944 : 2-element list (1.2, -34) (See Comments).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
945 Throws : Warns if no P-value is defined. Uses expect instead.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
946 Comments : Using the 'parts' argument is not recommended since it will not
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
947 : work as expected if the P-value is not in scientific notation.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
948 : That is, floats are not converted into sci notation before
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
949 : splitting into parts.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
950
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
951 See Also : L<expect()|expect>, L<signif()|signif>, L<Bio::Search::BlastUtils::get_exponent()|Bio::Search::BlastUtils>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
952
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
953 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
954
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
955 #--------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
956 sub p {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
957 #--------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
958 # Some duplication of logic for p(), expect() and signif() for the sake of performance.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
959 my ($self, $fmt) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
960
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
961 my $val = $self->{'_p'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
962
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
963 # $val can be zero.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
964 if(not defined $val) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
965 # P-value not defined, must be an NCBI Blast2 report.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
966 # Use expect instead.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
967 $self->warn( "P-value not defined. Using expect() instead.");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
968 $val = $self->{'_expect'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
969 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
970
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
971 return $val if not $fmt or $fmt =~ /^raw/i;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
972 ## Special formats: exponent-only or as list.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
973 return &Bio::Search::BlastUtils::get_exponent($val) if $fmt =~ /^exp/i;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
974 return (split (/[eE]/, $val)) if $fmt =~ /^parts/i;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
975
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
976 ## Default: return the raw P-value.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
977 return $val;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
978 }
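
The 'parts' format depends on splitting the value at its exponent marker, and the Comments warn that plain floats are not converted first. A small standalone sketch of that behavior follows. The helper name split_sci is hypothetical and not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: split a value such as '1.2e-34' into
# (mantissa, exponent). The character class [eE] matches either letter;
# a bare /eE/ would only match the literal two-character string "eE".
sub split_sci {
    my ($val) = @_;
    return split /[eE]/, $val;
}

my @parts = split_sci('1.2e-34');
print scalar(@parts), ": @parts\n";

# A plain float has no exponent marker, so only one element comes back.
my @plain = split_sci('0.0012');
print scalar(@plain), ": @plain\n";
```

Callers that need both parts for every value would have to normalize the number to scientific notation themselves before splitting.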
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
979
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
980
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
981
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
982 =head2 expect
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
983
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
984 Usage : $hit_object->expect( [format] );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
985 Purpose : Get the Expect value for the best HSP of the given BLAST hit.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
986 Example : $e = $sbjct->expect;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
987 : $e = $sbjct->expect('exp'); # get exponent only.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
988 : ($num, $exp) = $sbjct->expect('parts'); # split sci notation into parts
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
989 Returns : Float or scientific notation number (the raw expect value, DEFAULT).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
990 : Integer if format == 'exp' (the magnitude of the base 10 exponent).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
991 : 2-element list (float, int) if format == 'parts' and Expect
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
992 : is in scientific notation (see Comments).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
993 Argument : format: string of 'raw' | 'exp' | 'parts'
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
994 : 'raw' returns value given in report. Default. (1.2e-34)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
995 : 'exp' returns exponent value only (34)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
996 : 'parts' returns the decimal and exponent as a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
997 : 2-element list (1.2, -34) (see Comments).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
998 Throws : Exception if the Expect value is not defined.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
999 Comments : Using the 'parts' argument is not recommended since it will not
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1000 : work as expected if the expect value is not in scientific notation.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1001 : That is, floats are not converted into sci notation before
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1002 : splitting into parts.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1003
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1004 See Also : L<p()|p>, L<signif()|signif>, L<Bio::Search::BlastUtils::get_exponent()|Bio::Search::BlastUtils>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1005
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1006 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1007
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1008 #-----------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1009 sub expect {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1010 #-----------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1011 # Some duplication of logic for p(), expect() and signif() for the sake of performance.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1012 my ($self, $fmt) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1013
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1014 my $val;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1015
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1016 # For Blast reports that list the P value on the description line,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1017 # getting the expect value requires fully parsing the HSP data.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1018 # For NCBI blast, there's no problem.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1019 if(not defined($self->{'_expect'})) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1020 if( defined $self->{'_hsps'}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1021 $self->{'_expect'} = $val = $self->hsp->expect;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1022 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1023 # If _expect is not set and _hsps are not set,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1024 # then this must be a P-value-based report that was
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1025 # run without setting the HSPs (shallow parsing).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1026 $self->throw("Can't get expect value. HSPs have not been set.");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1027 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1028 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1029 $val = $self->{'_expect'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1030 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1031
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1032 # $val can be zero.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1033 defined($val) or $self->throw("Can't get Expect value.");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1034
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1035 return $val if not $fmt or $fmt =~ /^raw/i;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1036 ## Special formats: exponent-only or as list.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1037 return &Bio::Search::BlastUtils::get_exponent($val) if $fmt =~ /^exp/i;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1038 return (split (/[eE]/, $val)) if $fmt =~ /^parts/i;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1039
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1040 ## Default: return the raw Expect-value.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1041 return $val;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1042 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1043
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1044
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1045 =head2 hsps
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1046
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1047 Usage : $hit_object->hsps();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1048 Purpose : Get a list containing all HSP objects.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1049 : In scalar context, gets the number of HSPs for the current hit.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1050 Example : @hsps = $hit_object->hsps();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1051 : $num = $hit_object->hsps(); # alternatively, use num_hsps()
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1052 Returns : Array context : list of Bio::Search::HSP::BlastHSP.pm objects.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1053 : Scalar context: integer (number of HSPs).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1054 : (Equivalent to num_hsps()).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1055 Argument : n/a. Relies on wantarray
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1056 Throws : Exception if the HSPs have not been collected.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1057
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1058 See Also : L<hsp()|hsp>, L<num_hsps()|num_hsps>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1059
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1060 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1061
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
#---------
sub hsps {
#---------
    my $self = shift;

    if (not ref $self->{'_hsps'}) {
        $self->throw("Can't get HSPs: data not collected.");
    }

    return wantarray
        # returning list containing all HSPs.
        ? @{$self->{'_hsps'}}
        # returning number of HSPs.
        : scalar(@{$self->{'_hsps'}});
}
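The context-sensitive return of hsps() can be tried in isolation. The sketch below uses a plain hash reference and a hypothetical `hsps_demo` function in place of a real BlastHit object, so it only illustrates how wantarray selects between the list and the count.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a hit object holding three HSP labels.
my $hit = { _hsps => [ 'hsp1', 'hsp2', 'hsp3' ] };

# Same return logic as hsps(): a list in list context, a count otherwise.
sub hsps_demo {
    my ($self) = @_;
    return wantarray ? @{ $self->{_hsps} } : scalar @{ $self->{_hsps} };
}

my @all     = hsps_demo($hit);    # list context: all elements
my $count   = hsps_demo($hit);    # scalar context: number of elements
my ($first) = hsps_demo($hit);    # list context: first element only

print "count=$count first=$first all=@all\n";
```

Note that the parentheses around `$first` put the call in list context; without them the variable would receive the count.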



=head2 hsp

 Usage     : $hit_object->hsp( [string] );
 Purpose   : Get a single BlastHSP.pm object for the present BlastHit.pm object.
 Example   : $hspObj  = $hit_object->hsp;  # same as 'best'
           : $hspObj  = $hit_object->hsp('best');
           : $hspObj  = $hit_object->hsp('worst');
 Returns   : Object reference for a Bio::Search::HSP::BlastHSP.pm object.
 Argument  : String (or no argument).
           :   No argument (default) = highest scoring HSP (same as 'best').
           :   'best' or 'first' = highest scoring HSP.
           :   'worst' or 'last' = lowest scoring HSP.
 Throws    : Exception if the HSPs have not been collected.
           : Exception if an unrecognized argument is used.

See Also   : L<hsps()|hsps>, L<num_hsps()|num_hsps>

=cut

#----------
sub hsp {
#----------
    my( $self, $option ) = @_;
    $option ||= 'best';

    if (not ref $self->{'_hsps'}) {
        $self->throw("Can't get HSPs: data not collected.");
    }

    my @hsps = @{$self->{'_hsps'}};

    ## HSPs are assumed to be stored in order of decreasing score.
    return $hsps[0]      if $option =~ /best|first|1/i;
    return $hsps[$#hsps] if $option =~ /worst|last/i;

    $self->throw("Can't get HSP for: $option\n" .
                 "Valid arguments: 'best' (or 'first'), 'worst' (or 'last')");
}



=head2 num_hsps

 Usage     : $hit_object->num_hsps();
 Purpose   : Get the number of HSPs for the present Blast hit.
 Example   : $nhsps = $hit_object->num_hsps();
 Returns   : Integer
 Argument  : n/a
 Throws    : Exception if the HSPs have not been collected.

See Also   : L<hsps()|hsps>

=cut

#-------------
sub num_hsps {
#-------------
    my $self = shift;

    ## Test for a reference, as hsps() and hsp() do, so that a defined
    ## non-reference value raises the documented exception.
    if (not ref $self->{'_hsps'}) {
        $self->throw("Can't get HSPs: data not collected.");
    }

    return scalar(@{$self->{'_hsps'}});
}



=head2 logical_length

 Usage     : $hit_object->logical_length( [seq_type] );
           : (mostly intended for internal use).
 Purpose   : Get the logical length of the query or hit sequence.
           : For query sequence of BLASTX and TBLASTX reports and the hit
           : sequence of TBLASTN and TBLASTX reports, the returned length
           : is the length of the would-be amino acid sequence (length/3).
           : For all other BLAST flavors, this function is the same as length().
 Example   : $len = $hit_object->logical_length();
 Returns   : Number. The translated query length is computed as length/3
           : and is not rounded.
 Argument  : seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
             ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : This is important for functions like frac_aligned_query()
           : which need to operate in amino acid coordinate space when dealing
           : with BLASTX, TBLASTN, and TBLASTX reports.

See Also   : L<length()|length>, L<frac_aligned_query()|frac_aligned_query>, L<frac_aligned_hit()|frac_aligned_hit>

=cut

#--------------------
sub logical_length {
#--------------------
    my $self = shift;
    my $seqType = shift || 'query';
    $seqType = 'sbjct' if $seqType eq 'hit';

    my $length;

    # For the sbjct, return logical sbjct length
    if( $seqType eq 'sbjct' ) {
        $length = $self->{'_logical_length'} || $self->{'_length'};
    }
    else {
        # Otherwise, return logical query length
        $length = $self->{'_query_length'};

        # Adjust length based on BLAST flavor.
        if($self->{'_blast_program'} =~ /T?BLASTX/ ) {
            $length /= 3;
        }
    }
    return $length;
}
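The query-side adjustment in logical_length() amounts to dividing the nucleotide length by three for programs that translate the query. A minimal sketch of that rule follows; `logical_query_length` is a hypothetical stand-alone function, and the program names are assumed to be upper-case as in the pattern used by the module.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the query-length rule:
# BLASTX and TBLASTX translate the nucleotide query, so the length is
# expressed in amino acid coordinates (nucleotides / 3).
sub logical_query_length {
    my ($program, $query_length) = @_;
    return $program =~ /T?BLASTX/ ? $query_length / 3 : $query_length;
}

print logical_query_length('TBLASTX', 900), "\n";   # translated query
print logical_query_length('BLASTN',  900), "\n";   # untranslated query
```

Because the division is not rounded, a query whose length is not a multiple of three yields a fractional value.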


=head2 length_aln

 Usage     : $hit_object->length_aln( [seq_type] );
 Purpose   : Get the total length of the aligned region for query or sbjct seq.
           : This number will include all HSPs.
 Example   : $len    = $hit_object->length_aln();  # default = query
           : $lenAln = $hit_object->length_aln('query');
 Returns   : Integer
 Argument  : seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
             ('sbjct' is synonymous with 'hit')
 Throws    : Exception if the argument is not recognized.
 Comments  : This method will report the logical length of the alignment,
           : meaning that for TBLAST[NX] reports, the length is reported
           : using amino acid coordinate space (i.e., nucleotides / 3).
           :
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.
           : If you don't want the tiled data, iterate through each HSP
           : calling length() on each (use hsps() to get all HSPs).

See Also   : L<length()|length>, L<frac_aligned_query()|frac_aligned_query>, L<frac_aligned_hit()|frac_aligned_hit>, L<gaps()|gaps>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>, L<Bio::Search::HSP::BlastHSP::length()|Bio::Search::HSP::BlastHSP>

=cut

#---------------
sub length_aln {
#---------------
    my( $self, $seqType ) = @_;

    $seqType ||= 'query';
    $seqType = 'sbjct' if $seqType eq 'hit';

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    my $data = $self->{'_length_aln_'.$seqType};

    ## If we don't have data, figure out what went wrong.
    if(!$data) {
        $self->throw("Can't get length aln for sequence type \"$seqType\". " .
                     "Valid types are 'query', 'hit', 'sbjct' ('sbjct' = 'hit')");
    }
    return $data;
}


=head2 gaps

 Usage     : $hit_object->gaps( [seq_type] );
 Purpose   : Get the number of gaps in the aligned query, sbjct, or both sequences.
           : Data is summed across all HSPs.
 Example   : $qgaps = $hit_object->gaps('query');
           : $hgaps = $hit_object->gaps('hit');
           : $tgaps = $hit_object->gaps();    # default = total (query + hit)
 Returns   : scalar context: integer
           : array context without args: two-element list of integers
           :    (queryGaps, sbjctGaps)
           : Array context can be forced by providing an argument of 'list' or 'array'.
           :
           : CAUTION: Calling this method within printf or sprintf is array context.
           : So this function may not give you what you expect. For example:
           :          printf "Total gaps: %d", $hit->gaps();
           : This call actually returns a two-element list, so what gets
           : printed is the number of gaps in the query, not the total.
           :
 Argument  : seq_type: 'query' | 'hit' or 'sbjct' | 'total' | 'list' (default = 'total')
             ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : If you need data for each HSP, use hsps() and then iterate
           : through each HSP object.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.
           : wantarray is consulted only when no argument is given. In
           : situations such as printf "%d", $hit->gaps() the call evaluates
           : in array context even though you might expect to be printing the
           : total gaps, so pass 'total' explicitly in such cases.

See Also   : L<length_aln()|length_aln>

=cut

#----------
sub gaps {
#----------
    my( $self, $seqType ) = @_;

    $seqType ||= (wantarray ? 'list' : 'total');
    ## Lower-case first so that 'Hit' or 'HIT' is also mapped to 'sbjct'.
    $seqType = lc($seqType);
    $seqType = 'sbjct' if $seqType eq 'hit';

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    if($seqType =~ /list|array/i) {
        return ($self->{'_gaps_query'}, $self->{'_gaps_sbjct'});
    }

    if($seqType eq 'total') {
        return ($self->{'_gaps_query'} + $self->{'_gaps_sbjct'}) || 0;
    } else {
        return $self->{'_gaps_'.$seqType} || 0;
    }
}
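The printf caution in the gaps() documentation can be reproduced without a BLAST report. The sketch below is a simplified, hypothetical `gaps_demo` function operating on a plain hash; it only mirrors the default-argument logic and is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tiled hit: 2 gaps in the query, 5 in the sbjct.
my %hit = ( _gaps_query => 2, _gaps_sbjct => 5 );

# Simplified version of the default-argument logic used by gaps().
sub gaps_demo {
    my ($self, $type) = @_;
    $type ||= (wantarray ? 'list' : 'total');
    return ($self->{_gaps_query}, $self->{_gaps_sbjct}) if $type eq 'list';
    return $self->{_gaps_query} + $self->{_gaps_sbjct};
}

# sprintf evaluates its arguments in list context, so '%d' receives the
# first list element (the query gaps), not the total.
my $in_list;
{
    no warnings;    # newer Perls warn about the unused second value
    $in_list = sprintf "%d", gaps_demo(\%hit);
}

# Two ways to get the total: force scalar context or pass 'total'.
my $in_scalar = sprintf "%d", scalar gaps_demo(\%hit);
my $explicit  = sprintf "%d", gaps_demo(\%hit, 'total');

print "list context: $in_list, scalar: $in_scalar, explicit: $explicit\n";
```

Passing 'total' explicitly is the more readable of the two remedies, since it does not depend on the reader noticing the `scalar` keyword.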



=head2 matches

 Usage     : $hit_object->matches( [class] );
 Purpose   : Get the total number of identical or conserved matches
           : (or both) across all HSPs.
           : (Note: 'conservative' matches are indicated as 'positives'
           :         in the Blast report.)
 Example   : ($id,$cons) = $hit_object->matches(); # no argument
           : $id = $hit_object->matches('id');
           : $cons = $hit_object->matches('cons');
 Returns   : Integer or a 2-element array of integers
 Argument  : class = 'id' | 'cons' OR none.
           : If no argument is provided, both identical and conservative
           : numbers are returned in a two element list.
           : (Other terms can be used to refer to the conservative
           :  matches, e.g., 'positive'. All that is checked is whether or
           :  not the supplied string starts with 'id'. If not, the
           :  conservative matches are returned.)
 Throws    : Exception if the requested data cannot be obtained.
 Comments  : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : Does not rely on wantarray to return a list. Only checks for
           : the presence of an argument (no arg = return list).

See Also   : L<Bio::Search::HSP::BlastHSP::matches()|Bio::Search::HSP::BlastHSP>, L<hsps()|hsps>

=cut

#---------------
sub matches {
#---------------
    my( $self, $arg) = @_;
    my(@data,$data);

    if(!$arg) {
        @data = ($self->{'_totalIdentical'}, $self->{'_totalConserved'});

        ## A two-element list is always true, so test the elements themselves.
        return @data if defined $data[0] and defined $data[1];

    } else {

        if($arg =~ /^id/i) {
            $data = $self->{'_totalIdentical'};
        } else {
            $data = $self->{'_totalConserved'};
        }
        ## Test with defined() so that a legitimate count of 0 is returned.
        return $data if defined $data;
    }

    ## Something went wrong if we make it to here.
    $self->throw("Can't get identical or conserved data: no data.");
}


=head2 start

 Usage     : $sbjct->start( [seq_type] );
 Purpose   : Gets the start coordinate for the query, sbjct, or both sequences
           : in the BlastHit object. If there is more than one HSP, the lowest start
           : value of all HSPs is returned.
 Example   : $qbeg = $sbjct->start('query');
           : $sbeg = $sbjct->start('hit');
           : ($qbeg, $sbeg) = $sbjct->start();
 Returns   : scalar context: integer
           : array context without args: list of two integers (queryStart, sbjctStart)
           : Array context can be "induced" by providing an argument of 'list' or 'array'.
 Argument  : In scalar context: seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
             ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : This method requires that all HSPs be tiled. If there is more than one
           : HSP and they have not already been tiled, they will be tiled first automatically.
           : Remember that the start and end coordinates of all HSPs are
           : normalized so that start < end. Strand information can be
           : obtained by calling $hit->strand().

See Also   : L<end()|end>, L<range()|range>, L<strand()|strand>, L<HSP Tiling and Ambiguous Alignments>, L<Bio::Search::HSP::BlastHSP::start|Bio::Search::HSP::BlastHSP>

=cut

#----------
sub start {
#----------
    my ($self, $seqType) = @_;

    $seqType ||= (wantarray ? 'list' : 'query');
    $seqType = 'sbjct' if $seqType eq 'hit';

    # If there is only one HSP, defer this call to the solitary HSP.
    if($self->num_hsps == 1) {
        return $self->hsp->start($seqType);
    } else {
        Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};
        if($seqType =~ /list|array/i) {
            return ($self->{'_queryStart'}, $self->{'_sbjctStart'});
        } else {
            ## Sensitive to member name changes.
            $seqType = "_\L$seqType\E";
            return $self->{$seqType.'Start'};
        }
    }
}
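The documentation for start() states that with several HSPs the lowest start value is returned. In the module that value comes from the tiling step, which is not shown here; the sketch below only illustrates the documented result using invented HSP records and a hypothetical `lowest_start` function.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical HSP records; the field names are invented for this sketch.
my @hsps = (
    { query_start => 120, hit_start => 40  },
    { query_start => 15,  hit_start => 300 },
    { query_start => 60,  hit_start => 10  },
);

# Return the lowest start coordinate of the given type ('query' or 'hit').
sub lowest_start {
    my ($type, @records) = @_;
    return min( map { $_->{"${type}_start"} } @records );
}

printf "query start: %d, hit start: %d\n",
    lowest_start('query', @hsps), lowest_start('hit', @hsps);
```

The lowest query start and the lowest hit start may come from different HSPs, as they do in this example.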


=head2 end

 Usage     : $sbjct->end( [seq_type] );
 Purpose   : Gets the end coordinate for the query, sbjct, or both sequences
           : in the BlastHit object. If there is more than one HSP, the largest end
           : value of all HSPs is returned.
 Example   : $qend = $sbjct->end('query');
           : $send = $sbjct->end('hit');
           : ($qend, $send) = $sbjct->end();
 Returns   : scalar context: integer
           : array context without args: list of two integers (queryEnd, sbjctEnd)
           : Array context can be "induced" by providing an argument of 'list' or 'array'.
 Argument  : In scalar context: seq_type = 'query' or 'sbjct'
           :  (case insensitive). If not supplied, 'query' is used.
 Throws    : n/a
 Comments  : This method requires that all HSPs be tiled. If there is more than one
           : HSP and they have not already been tiled, they will be tiled first automatically.
           : Remember that the start and end coordinates of all HSPs are
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1420 : normalized so that start < end. Strand information can be
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1421 : obtained by calling $hit->strand().
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1422
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1423 See Also : L<start()|start>, L<range()|range>, L<strand()|strand>, L<HSP Tiling and Ambiguous Alignments>, L<Bio::Search::HSP::BlastHSP::end|Bio::Search::HSP::BlastHSP>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1424
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1425 =cut

#----------
sub end {
#----------
    my ($self, $seqType) = @_;

    $seqType ||= (wantarray ? 'list' : 'query');
    # Compare case-insensitively: 'hit' is a synonym for 'sbjct'.
    $seqType = 'sbjct' if lc($seqType) eq 'hit';

    # If there is only one HSP, defer this call to the solitary HSP.
    if($self->num_hsps == 1) {
        return $self->hsp->end($seqType);
    } else {
        Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};
        if($seqType =~ /list|array/i) {
            return ($self->{'_queryStop'}, $self->{'_sbjctStop'});
        } else {
            ## Sensitive to member name changes.
            $seqType = "_\L$seqType\E";
            return $self->{$seqType.'Stop'};
        }
    }
}
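
The context-sensitive behaviour of start() and end() can be illustrated without BioPerl installed. The following is a minimal sketch, not the module's implementation: the `@hsps` data and the `hit_end` function are hypothetical stand-ins, and the maximum over HSP end values is assumed to approximate what tiling stores in `_queryStop` and `_sbjctStop`.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical stand-in for a hit with two HSPs; each pair is a
# normalized (start, end) range on the query or sbjct sequence.
my @hsps = (
    { query => [ 10,  80], sbjct => [105, 175] },
    { query => [120, 200], sbjct => [210, 290] },
);

# Return the largest end coordinate for one sequence type, or for
# both sequences when called in list context without an argument.
sub hit_end {
    my ($seq_type) = @_;
    $seq_type ||= (wantarray ? 'list' : 'query');
    $seq_type = lc $seq_type;
    $seq_type = 'sbjct' if $seq_type eq 'hit';

    my %end;
    for my $type (qw(query sbjct)) {
        $end{$type} = max(map { $_->{$type}[1] } @hsps);
    }

    return ($end{query}, $end{sbjct}) if $seq_type =~ /list|array/;
    return $end{$seq_type};
}

my $qend         = hit_end('query');  # scalar context, explicit type
my $send         = hit_end('hit');    # 'hit' is mapped to 'sbjct'
my ($q_end, $s_end) = hit_end();      # list context, no argument
```

Passing 'list' or 'array' explicitly gives the two-element form even when the caller's context is scalar-like, which is the "induced" array context the POD describes.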

=head2 range

 Usage     : $sbjct->range( [seq_type] );
 Purpose   : Gets the (start, end) coordinates for the query or sbjct sequence
           : in the hit, taken across all of its HSPs.
 Example   : ($qbeg, $qend) = $sbjct->range('query');
           : ($sbeg, $send) = $sbjct->range('hit');
 Returns   : Two-element array of integers
 Argument  : seq_type = string, 'query' or 'hit' or 'sbjct'  (default = 'query')
           : ('sbjct' is synonymous with 'hit')
 Throws    : n/a

See Also   : L<start()|start>, L<end()|end>

=cut

#----------
sub range {
#----------
    my ($self, $seqType) = @_;
    $seqType ||= 'query';
    $seqType = 'sbjct' if $seqType eq 'hit';
    return ($self->start($seqType), $self->end($seqType));
}

=head2 frac_identical

 Usage     : $hit_object->frac_identical( [seq_type] );
 Purpose   : Get the overall fraction of identical positions across all HSPs.
           : The number refers to only the aligned regions and does not
           : account for unaligned regions in between the HSPs, if any.
 Example   : $frac_iden = $hit_object->frac_identical('query');
 Returns   : Float (2-decimal precision, e.g., 0.75).
 Argument  : seq_type: 'query' | 'hit' or 'sbjct' | 'total'
           : default = 'query' (but see comments below).
           : ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : Different versions of Blast report different values for the total
           : length of the alignment. This is the number reported in the
           : denominators in the stats section:
           : "Identical = 34/120 Positives = 67/120".
           : NCBI BLAST uses the total length of the alignment (with gaps).
           : WU-BLAST uses the length of the query sequence (without gaps).
           :
           : Therefore, when called with an argument of 'total',
           : this method will report different values depending on the
           : version of BLAST used. Total does NOT take into account HSP
           : tiling, so it should not be used.
           :
           : To get the fraction identical among only the aligned residues,
           : ignoring the gaps, call this method without an argument or
           : with an argument of 'query' or 'hit'.
           :
           : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<frac_conserved()|frac_conserved>, L<frac_aligned_query()|frac_aligned_query>, L<matches()|matches>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>

=cut

#------------------
sub frac_identical {
#------------------
    my ($self, $seqType) = @_;
    $seqType ||= 'query';

    ## Sensitive to member name format.
    # Lowercase first so that 'HIT' is also recognized as a synonym for 'sbjct'.
    $seqType = lc($seqType);
    $seqType = 'sbjct' if $seqType eq 'hit';

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    return sprintf( "%.2f", $self->{'_totalIdentical'}/$self->{'_length_aln_'.$seqType});
}
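
As a rough illustration of how the reported fractions depend on the denominator, the sketch below formats the counts from the POD example ("Identical = 34/120 Positives = 67/120") the same way the method does, with sprintf and two decimals. The `frac` helper and the gap-free length of 110 are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper: format a count/length ratio to two decimals,
# mirroring the sprintf("%.2f", ...) call in frac_identical().
sub frac {
    my ($count, $length) = @_;
    return sprintf("%.2f", $count / $length);
}

# Denominator as printed in the report (alignment length with gaps).
my $frac_identical = frac(34, 120);
my $frac_conserved = frac(67, 120);

# Assumed for illustration: 110 aligned query residues once gaps are
# excluded, which gives a slightly higher fraction.
my $frac_identical_query = frac(34, 110);
```

Note that sprintf returns a string, so callers that need exact arithmetic may prefer to work from the underlying counts.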

=head2 frac_conserved

 Usage     : $hit_object->frac_conserved( [seq_type] );
 Purpose   : Get the overall fraction of conserved positions across all HSPs.
           : The number refers to only the aligned regions and does not
           : account for unaligned regions in between the HSPs, if any.
 Example   : $frac_cons = $hit_object->frac_conserved('hit');
 Returns   : Float (2-decimal precision, e.g., 0.75).
 Argument  : seq_type: 'query' | 'hit' or 'sbjct' | 'total'
           : default = 'query' (but see comments below).
           : ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : Different versions of Blast report different values for the total
           : length of the alignment. This is the number reported in the
           : denominators in the stats section:
           : "Identical = 34/120 Positives = 67/120".
           : NCBI BLAST uses the total length of the alignment (with gaps).
           : WU-BLAST uses the length of the query sequence (without gaps).
           :
           : Therefore, when called with an argument of 'total',
           : this method will report different values depending on the
           : version of BLAST used. Total does NOT take into account HSP
           : tiling, so it should not be used.
           :
           : To get the fraction conserved among only the aligned residues,
           : ignoring the gaps, call this method without an argument or
           : with an argument of 'query' or 'hit'.
           :
           : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<frac_identical()|frac_identical>, L<matches()|matches>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>

=cut

#--------------------
sub frac_conserved {
#--------------------
    my ($self, $seqType) = @_;
    $seqType ||= 'query';

    ## Sensitive to member name format.
    # Lowercase first so that 'HIT' is also recognized as a synonym for 'sbjct'.
    $seqType = lc($seqType);
    $seqType = 'sbjct' if $seqType eq 'hit';

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    return sprintf( "%.2f", $self->{'_totalConserved'}/$self->{'_length_aln_'.$seqType});
}

=head2 frac_aligned_query

 Usage     : $hit_object->frac_aligned_query();
 Purpose   : Get the fraction of the query sequence which has been aligned
           : across all HSPs (not including intervals between non-overlapping
           : HSPs).
 Example   : $frac_alnq = $hit_object->frac_aligned_query();
 Returns   : Float (2-decimal precision, e.g., 0.75).
 Argument  : n/a
 Throws    : n/a
 Comments  : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : To compute the fraction aligned, the logical length of the query
           : sequence is used, meaning that for [T]BLASTX reports, the
           : full length of the query sequence is converted into amino acids
           : by dividing by 3. This is necessary because of the way
           : the lengths of aligned sequences are computed.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<frac_aligned_hit()|frac_aligned_hit>, L<logical_length()|logical_length>, L<length_aln()|length_aln>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>

=cut

#----------------------
sub frac_aligned_query {
#----------------------
    my $self = shift;

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    sprintf( "%.2f", $self->{'_length_aln_query'}/$self->logical_length('query'));
}
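
The role of the logical length can be sketched as follows. This is an assumption-laden illustration, not the module's logical_length() method: the `logical_length_of` function, its argument order, and the program-name patterns are hypothetical and simply follow the POD's description that translated sequences are divided by 3.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a raw sequence length into the units
# used by the alignment. Per the POD, the query is translated in
# [T]BLASTX reports and the sbjct in TBLAST[NX] reports.
sub logical_length_of {
    my ($length, $program, $seq_type) = @_;
    my $translated = $seq_type eq 'query'
        ? ($program =~ /^T?BLASTX$/i)
        : ($program =~ /^TBLAST[NX]$/i);
    return $translated ? $length / 3 : $length;
}

# A 900-nt query with 150 aligned residues:
# BLASTX translates the query, so the logical length is 300.
my $frac_blastx = sprintf("%.2f", 150 / logical_length_of(900, 'BLASTX', 'query'));
# BLASTN does not translate, so the full 900 is the denominator.
my $frac_blastn = sprintf("%.2f", 150 / logical_length_of(900, 'BLASTN', 'query'));
```

Without the conversion, a translated search would understate coverage by a factor of three, since the aligned length is counted in amino acids.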

=head2 frac_aligned_hit

 Usage     : $hit_object->frac_aligned_hit();
 Purpose   : Get the fraction of the hit (sbjct) sequence which has been aligned
           : across all HSPs (not including intervals between non-overlapping
           : HSPs).
 Example   : $frac_alnh = $hit_object->frac_aligned_hit();
 Returns   : Float (2-decimal precision, e.g., 0.75).
 Argument  : n/a
 Throws    : n/a
 Comments  : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : To compute the fraction aligned, the logical length of the sbjct
           : sequence is used, meaning that for TBLAST[NX] reports, the
           : full length of the sbjct sequence is converted into amino acids
           : by dividing by 3. This is necessary because of the way
           : the lengths of aligned sequences are computed.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<frac_aligned_query()|frac_aligned_query>, L<matches()|matches>, L<logical_length()|logical_length>, L<length_aln()|length_aln>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>

=cut

#--------------------
sub frac_aligned_hit {
#--------------------
    my $self = shift;

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    sprintf( "%.2f", $self->{'_length_aln_sbjct'}/$self->logical_length('sbjct'));
}

## These methods are being maintained for backward compatibility.

=head2 frac_aligned_sbjct

Same as L<frac_aligned_hit()|frac_aligned_hit>

=cut

#----------------
sub frac_aligned_sbjct { my $self=shift; $self->frac_aligned_hit(@_); }
#----------------

=head2 num_unaligned_sbjct

Same as L<num_unaligned_hit()|num_unaligned_hit>

=cut

#----------------
sub num_unaligned_sbjct { my $self=shift; $self->num_unaligned_hit(@_); }
#----------------

=head2 num_unaligned_hit

 Usage     : $hit_object->num_unaligned_hit();
 Purpose   : Get the number of the unaligned residues in the hit sequence.
           : Sums across all HSPs.
 Example   : $num_unaln = $hit_object->num_unaligned_hit();
 Returns   : Integer
 Argument  : n/a
 Throws    : n/a
 Comments  : See notes regarding logical lengths in the comments for frac_aligned_hit().
           : They apply here as well.
           : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<num_unaligned_query()|num_unaligned_query>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>, L<frac_aligned_hit()|frac_aligned_hit>

=cut

#---------------------
sub num_unaligned_hit {
#---------------------
    my $self = shift;

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    my $num = $self->logical_length('sbjct') - $self->{'_length_aln_sbjct'};
    ($num < 0 ? 0 : $num );
}

=head2 num_unaligned_query

 Usage     : $hit_object->num_unaligned_query();
 Purpose   : Get the number of the unaligned residues in the query sequence.
           : Sums across all HSPs.
 Example   : $num_unaln = $hit_object->num_unaligned_query();
 Returns   : Integer
 Argument  : n/a
 Throws    : n/a
 Comments  : See notes regarding logical lengths in the comments for frac_aligned_query().
           : They apply here as well.
           : If you need data for each HSP, use hsps() and then iterate
           : through the HSP objects.
           : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.

See Also   : L<num_unaligned_hit()|num_unaligned_hit>, L<frac_aligned_query()|frac_aligned_query>, L<Bio::Search::BlastUtils::tile_hsps()|Bio::Search::BlastUtils>

=cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1731
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1732 #-----------------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1733 sub num_unaligned_query {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1734 #-----------------------
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1735 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1736
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1737 Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1738
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1739 my $num = $self->logical_length('query') - $self->{'_length_aln_query'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1740 ($num < 0 ? 0 : $num );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1741 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1742
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1743
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1744
=head2 seq_inds

 Usage     : $hit->seq_inds( seq_type, class, collapse );
 Purpose   : Get a list of residue positions (indices) across all HSPs
           : for identical or conserved residues in the query or sbjct sequence.
 Example   : @s_ind = $hit->seq_inds('query', 'identical');
           : @h_ind = $hit->seq_inds('hit', 'conserved');
           : @h_ind = $hit->seq_inds('hit', 'conserved', 1);
 Returns   : Array of integers
           : May include ranges if collapse is non-zero.
 Argument  : [0] seq_type  = 'query' or 'hit' or 'sbjct'  (default = 'query')
           :                 ('sbjct' is synonymous with 'hit')
           : [1] class     = 'identical' or 'conserved' (default = 'identical')
           :              (can be shortened to 'id' or 'cons')
           :              (actually, anything not 'id' will evaluate to 'conserved').
           : [2] collapse  = boolean, if non-zero, consecutive positions are merged
           :             using a range notation, e.g., "1 2 3 4 5 7 9 10 11"
           :             collapses to "1-5 7 9-11". This is useful for
           :             consolidating long lists. Default = no collapse.
 Throws    : n/a.
 Comments  : Note that HSPs are not tiled for this. This could be a problem
           : for hits containing mutually exclusive HSPs.
           : TODO: Consider tiling and then reporting seq_inds for the
           : best HSP contig.

See Also   : L<Bio::Search::HSP::BlastHSP::seq_inds()|Bio::Search::HSP::BlastHSP>

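The merge, de-duplicate, sort, and collapse steps described above can be sketched on plain numbers. The helper collapse_sorted below is a hypothetical stand-in written for this illustration; it is not the BioPerl collapse_nums implementation, and the position data are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a range-collapsing helper; NOT the BioPerl
# implementation, only an illustration of the "1-5 7 9-11" notation.
sub collapse_sorted {
    my @nums = @_;          # assumed sorted ascending, no duplicates
    my @out;
    while (@nums) {
        my $start = shift @nums;
        my $end   = $start;
        # Extend the range while the next number is consecutive.
        $end = shift @nums while @nums && $nums[0] == $end + 1;
        push @out, $start == $end ? "$start" : "$start-$end";
    }
    return @out;
}

# Positions gathered from two overlapping HSPs (made-up data).
my @inds = (1, 2, 3, 4, 5, 9, 10, 11, 3, 4, 5, 7);

# Remove duplicates via hash keys, then sort numerically.
my %seen = map { $_ => 1 } @inds;
my @uniq = sort { $a <=> $b } keys %seen;

print join(' ', collapse_sorted(@uniq)), "\n";   # 1-5 7 9-11
```

Sorting numerically (with <=>) matters here: a default string sort would place 10 and 11 before 2.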
=cut

#-------------
sub seq_inds {
#-------------
    my ($self, $seqType, $class, $collapse) = @_;

    $seqType  ||= 'query';
    $class    ||= 'identical';
    $collapse ||= 0;

    $seqType = 'sbjct' if $seqType eq 'hit';

    my (@inds, $hsp);
    foreach $hsp ($self->hsps) {
        # This will merge data for all HSPs together.
        push @inds, $hsp->seq_inds($seqType, $class);
    }

    # Need to remove duplicates and sort the merged positions.
    if(@inds) {
        my %tmp = map { $_, 1 } @inds;
        @inds = sort {$a <=> $b} keys %tmp;
    }

    $collapse ? &Bio::Search::BlastUtils::collapse_nums(@inds) : @inds;
}


=head2 iteration

 Usage     : $sbjct->iteration( );
 Purpose   : Gets the iteration number in which the Hit was found.
 Example   : $iteration_num = $sbjct->iteration();
 Returns   : Integer greater than or equal to 1
             Non-PSI-BLAST reports will report iteration as 1, but this number
             is only meaningful for PSI-BLAST reports.
 Argument  : none
 Throws    : none

See Also   : L<found_again()|found_again>

=cut

#----------------
sub iteration { shift->{'_iteration'} }
#----------------


=head2 found_again

 Usage     : $sbjct->found_again;
 Purpose   : Gets a boolean indicating whether the hit was also
             found in a previous iteration.
             This is only applicable to PSI-BLAST reports.

             This method indicates if the hit was reported in the
             "Sequences used in model and found again" section of the
             PSI-BLAST report or if it was reported in the
             "Sequences not found previously or not previously below threshold"
             section of the PSI-BLAST report. Only for hits in iteration > 1.

 Example   : if( $sbjct->found_again()) { ... };
 Returns   : Boolean (1 or 0) for PSI-BLAST report iterations greater than 1.
             Returns undef for PSI-BLAST report iteration 1 and non-PSI-BLAST
             reports.
 Argument  : none
 Throws    : none

See Also   : L<iteration()|iteration>

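Because the return value can be undef, 0, or 1, a caller that needs to tell "not applicable" apart from "newly found" has to test definedness first. The sketch below uses a hypothetical MockHit class that only mimics this return contract; it is not Bio::Search::Hit::BlastHit, and the describe() helper and its labels are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical mock with the same tri-state contract as found_again();
# it is not the real BlastHit class.
package MockHit;
sub new         { my ($class, %args) = @_; return bless {%args}, $class }
sub found_again { $_[0]->{'_found_again'} }

package main;

# Map the tri-state value (undef, 0, 1) to a label.
sub describe {
    my $hit   = shift;
    my $again = $hit->found_again;
    return 'not applicable' unless defined $again;
    return $again ? 'found again' : 'newly found';
}

print describe(MockHit->new()), "\n";                    # not applicable
print describe(MockHit->new(_found_again => 0)), "\n";   # newly found
print describe(MockHit->new(_found_again => 1)), "\n";   # found again
```

A plain truth test, as in the Example line above, treats undef and 0 alike, which is fine when only the "found again" case is of interest.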
=cut

#----------------
sub found_again { shift->{'_found_again'} }
#----------------


=head2 strand

 Usage     : $sbjct->strand( [seq_type] );
 Purpose   : Gets the strand(s) for the query, sbjct, or both sequences
           : in the best HSP of the BlastHit object after HSP tiling.
           : Only valid for BLASTN, TBLASTX, BLASTX-query, TBLASTN-hit.
 Example   : $qstrand = $sbjct->strand('query');
           : $sstrand = $sbjct->strand('hit');
           : ($qstrand, $sstrand) = $sbjct->strand();
 Returns   : scalar context: integer '1', '-1', or '0'
           : array context without args: list of two strings (queryStrand, sbjctStrand)
           : Array context can be "induced" by providing an argument of 'list' or 'array'.
 Argument  : In scalar context: seq_type = 'query' or 'hit' or 'sbjct' (default = 'query')
             ('sbjct' is synonymous with 'hit')
 Throws    : n/a
 Comments  : This method requires that all HSPs be tiled. If they have not
           : already been tiled, they will be tiled first automatically.
           : If you don't want the tiled data, iterate through each HSP
           : calling strand() on each (use hsps() to get all HSPs).
           :
           : Formerly (prior to 10/21/02), this method would return the
           : string "-1/1" for hits with HSPs on both strands.
           : However, now that strand and frame are properly accounted
           : for during HSP tiling, it makes more sense for strand()
           : to return the strand data for the best HSP after tiling.
           :
           : If you really want to know about hits on opposite strands,
           : you should be iterating through the HSPs using methods on the
           : HSP objects.
           :
           : One case where it would be useful to know whether a hit has HSPs
           : on both strands is when filtering via SearchIO for hits with
           : this property. However, in this case it would be better to have a
           : dedicated method such as $hit->hsps_on_both_strands(). Similarly
           : for frame. This could be provided if there is interest.

See Also   : B<Bio::Search::HSP::BlastHSP::strand>()

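The advice above, to iterate over the HSPs when you need to know about opposite strands, can be sketched as follows. MockHSP and on_both_strands() are hypothetical names introduced for this illustration; real code would loop over the objects returned by $hit->hsps() and call strand() on each, and the assumption here is only that strand($seq_type) returns 1, -1, or 0.

```perl
use strict;
use warnings;

# Hypothetical mock HSP exposing strand($seq_type); real code would use
# the HSP objects returned by $hit->hsps().
package MockHSP;
sub new    { my ($class, %s) = @_; return bless {%s}, $class }
sub strand { my ($self, $type) = @_; return $self->{$type} }

package main;

# Returns 1 if the HSPs include both a plus (1) and a minus (-1) strand
# for the given sequence type ('query' or 'sbjct'), otherwise 0.
sub on_both_strands {
    my ($type, @hsps) = @_;
    my %seen;
    $seen{ $_->strand($type) } = 1 for @hsps;
    return ($seen{1} && $seen{-1}) ? 1 : 0;
}

my @hsps = (
    MockHSP->new(query => 1, sbjct =>  1),
    MockHSP->new(query => 1, sbjct => -1),
);

print on_both_strands('query', @hsps), "\n";   # 0: query is plus in both HSPs
print on_both_strands('sbjct', @hsps), "\n";   # 1: sbjct has plus and minus
```

This keeps the per-HSP information that strand() on the hit no longer reports now that it returns only the best tiled HSP's strand.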
=cut

#----------
sub strand {
#----------
    my ($self, $seqType) = @_;

    Bio::Search::BlastUtils::tile_hsps($self) if not $self->{'_tile_hsps'};

    $seqType ||= (wantarray ? 'list' : 'query');
    $seqType = 'sbjct' if $seqType eq 'hit';

    my ($qstr, $hstr);
    # If there is only one HSP, defer this call to the solitary HSP.
    if($self->num_hsps == 1) {
        return $self->hsp->strand($seqType);
    }
    elsif( defined $self->{'_qstrand'}) {
        # Get the data computed during hsp tiling.
        $qstr = $self->{'_qstrand'};
        $hstr = $self->{'_sstrand'};
    }
    else {
        # otherwise, iterate through all HSPs collecting strand info.
        # This will return the string "-1/1" if there are HSPs on different strands.
        # NOTE: This was the pre-10/21/02 procedure which will no longer be used,
        # (unless the above elsif{} is commented out).
        my (%qstr, %hstr);
        foreach my $hsp( $self->hsps ) {
            my ( $q, $h ) = $hsp->strand();
            $qstr{ $q }++;
            $hstr{ $h }++;
        }
        $qstr = join( '/', sort keys %qstr);
        $hstr = join( '/', sort keys %hstr);
    }

    if($seqType =~ /list|array/i) {
        return ($qstr, $hstr);
    } elsif( $seqType eq 'query' ) {
        return $qstr;
    } else {
        return $hstr;
    }
}


1;
__END__

#####################################################################################
#                                   END OF CLASS                                    #
#####################################################################################


=head1 FOR DEVELOPERS ONLY

=head2 Data Members

Information about the various data members of this module is provided for those
wishing to modify or understand the code. Two things to bear in mind:

=over 4

=item 1 Do NOT rely on these in any code outside of this module.

All data members are prefixed with an underscore to signify that they are private.
Always use accessor methods. If the accessor doesn't exist or is inadequate,
create or modify an accessor (and let me know, too!). (An exception to this might
be for BlastHSP.pm which is more tightly coupled to BlastHit.pm and
may access BlastHit data members directly for efficiency purposes, but probably
should not).

=item 2 This documentation may be incomplete and out of date.

It is easy for these data member descriptions to become obsolete as
this module is still evolving. Always double check this info and search
for members not described here.

=back

An instance of Bio::Search::Hit::BlastHit.pm is a blessed reference to a hash containing
all or some of the following fields:

 FIELD              VALUE
 --------------------------------------------------------------
 _hsps             : Array ref for a list of Bio::Search::HSP::BlastHSP.pm objects.
                   :
 _db               : Database identifier from the summary line.
                   :
 _desc             : Description data for the hit from the summary line.
                   :
 _length           : Total length of the hit sequence.
                   :
 _score            : BLAST score.
                   :
 _bits             : BLAST score (in bits). Matrix-independent.
                   :
 _p                : BLAST P value. Obtained from summary section. (Blast1/WU-Blast only)
                   :
 _expect           : BLAST Expect value. Obtained from summary section.
                   :
 _n                : BLAST N value (number of HSPs) (Blast1/WU-Blast2 only)
                   :
 _frame            : Reading frame for TBLASTN and TBLASTX analyses.
                   :
 _totalIdentical   : Total number of identical aligned monomers.
                   :
 _totalConserved   : Total number of conserved aligned monomers (a.k.a. "positives").
                   :
 _overlap          : Maximum number of overlapping residues between adjacent HSPs
                   : before considering the alignment to be ambiguous.
                   :
 _ambiguous_aln    : Boolean. True if the alignment of all HSPs is ambiguous.
                   :
 _length_aln_query : Length of the aligned region of the query sequence.
                   :
 _length_aln_sbjct : Length of the aligned region of the sbjct sequence.


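The convention described above, underscore-prefixed private fields in a blessed hash that callers reach only through accessors, can be shown with a miniature class. MiniHit, its constructor arguments, and the sample values are hypothetical and exist only for this illustration; the field names _desc and _bits are borrowed from the table above.

```perl
use strict;
use warnings;

# Hypothetical miniature class, only to illustrate underscore-prefixed
# private fields behind accessor methods. Not the real BlastHit class.
package MiniHit;
sub new {
    my ($class, %args) = @_;
    my $self = { _desc => $args{desc}, _bits => $args{bits} };
    return bless $self, $class;
}
sub desc { $_[0]->{'_desc'} }
sub bits { $_[0]->{'_bits'} }

package main;

my $hit = MiniHit->new(desc => 'example protein', bits => 42.5);

# Preferred: go through the accessors rather than $hit->{'_bits'},
# so the internal field layout can change without breaking callers.
printf "%s (%.1f bits)\n", $hit->desc, $hit->bits;
```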
=cut

1;