annotate variant_effect_predictor/Bio/Search/HSP/BlastHSP.pm @ 0:1f6dce3d34e0

Uploaded
author mahtabm
date Thu, 11 Apr 2013 02:01:53 -0400
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1 #-----------------------------------------------------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
2 # $Id: BlastHSP.pm,v 1.20 2002/12/24 15:45:33 jason Exp $
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
3 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
4 # BioPerl module Bio::Search::HSP::BlastHSP
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
5 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
6 # (This module was originally called Bio::Tools::Blast::HSP)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
7 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
8 # Cared for by Steve Chervitz <sac@bioperl.org>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
9 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
10 # You may distribute this module under the same terms as perl itself
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
11 #-----------------------------------------------------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
12
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
13 ## POD Documentation:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
14
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
15 =head1 NAME
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
16
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
17 Bio::Search::HSP::BlastHSP - Bioperl BLAST High-Scoring Pair object
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
18
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
19 =head1 SYNOPSIS
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
20
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
21 The construction of BlastHSP objects is performed by
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
22 Bio::Factory::BlastHitFactory in a process that is
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
23 orchestrated by the Blast parser (B<Bio::SearchIO::psiblast>).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
24 The resulting BlastHSPs are then accessed via
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
25 B<Bio::Search::Hit::BlastHit>). Therefore, you do not need to
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
26 use B<Bio::Search::HSP::BlastHSP>) directly. If you need to construct
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
27 BlastHSPs directly, see the new() function for details.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
28
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
29 For B<Bio::SearchIO> BLAST parsing usage examples, see the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
30 B<examples/search-blast> directory of the Bioperl distribution.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
31
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
32 =head1 DESCRIPTION
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
33
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
34 A Bio::Search::HSP::BlastHSP object provides an interface to data
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
35 obtained in a single alignment section of a Blast report (known as a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
36 "High-scoring Segment Pair"). This is essentially a pairwise
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
37 alignment with score information.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
38
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
39 BlastHSP objects are accessed via B<Bio::Search::Hit::BlastHit>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
40 objects after parsing a BLAST report using the B<Bio::SearchIO>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
41 system.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
42
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
43 =head2 Start and End coordinates
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
44
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
45 Sequence endpoints are swapped so that start is always less than
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
46 end. This affects For TBLASTN/X hits on the minus strand. Strand
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
47 information can be recovered using the strand() method. This
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
48 normalization step is standard Bioperl practice. It also facilitates
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
49 use of range information by methods such as match().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
50
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
51 =over 1
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
52
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
53 =item * Supports BLAST versions 1.x and 2.x, gapped and ungapped.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
54
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
55 =back
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
56
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
57 Bio::Search::HSP::BlastHSP.pm has the ability to extract a list of all
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
58 residue indices for identical and conservative matches along both
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
59 query and sbjct sequences. Since this degree of detail is not always
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
60 needed, this behavior does not occur during construction of the BlastHSP
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
61 object. These data will automatically be collected as necessary as
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
62 the BlastHSP.pm object is used.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
63
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
64 =head1 DEPENDENCIES
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
65
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
66 Bio::Search::HSP::BlastHSP.pm is a concrete class that inherits from
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
67 B<Bio::SeqFeature::SimilarityPair> and B<Bio::Search::HSP::HSPI>.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
68 B<Bio::Seq> and B<Bio::SimpleAlign> are employed for creating
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
69 sequence and alignment objects, respectively.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
70
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
71 =head2 Relationship to SimpleAlign.pm & Seq.pm
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
72
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
73 BlastHSP.pm can provide the query or sbjct sequence as a B<Bio::Seq>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
74 object via the L<seq()|seq> method. The BlastHSP.pm object can also create a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
75 two-sequence B<Bio::SimpleAlign> alignment object using the the query
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
76 and sbjct sequences via the L<get_aln()|get_aln> method. Creation of alignment
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
77 objects is not automatic when constructing the BlastHSP.pm object since
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
78 this level of functionality is not always required and would generate
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
79 a lot of extra overhead when crunching many reports.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
80
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
81
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
82 =head1 FEEDBACK
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
83
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
84 =head2 Mailing Lists
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
85
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
86 User feedback is an integral part of the evolution of this and other
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
87 Bioperl modules. Send your comments and suggestions preferably to one
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
88 of the Bioperl mailing lists. Your participation is much appreciated.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
89
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
90 bioperl-l@bioperl.org - General discussion
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
91 http://bio.perl.org/MailList.html - About the mailing lists
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
92
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
93 =head2 Reporting Bugs
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
94
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
95 Report bugs to the Bioperl bug tracking system to help us keep track
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
96 the bugs and their resolution. Bug reports can be submitted via email
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
97 or the web:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
98
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
99 bioperl-bugs@bio.perl.org
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
100 http://bugzilla.bioperl.org/
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
101
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
102 =head1 AUTHOR
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
103
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
104 Steve Chervitz E<lt>sac@bioperl.orgE<gt>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
105
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
106 See L<the FEEDBACK section | FEEDBACK> for where to send bug reports and comments.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
107
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
108 =head1 ACKNOWLEDGEMENTS
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
109
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
110 This software was originally developed in the Department of Genetics
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
111 at Stanford University. I would also like to acknowledge my
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
112 colleagues at Affymetrix for useful feedback.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
113
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
114 =head1 SEE ALSO
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
115
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
116 Bio::Search::Hit::BlastHit.pm - Blast hit object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
117 Bio::Search::Result::BlastResult.pm - Blast Result object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
118 Bio::Seq.pm - Biosequence object
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
119
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
120 =head2 Links:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
121
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
122 http://bio.perl.org/Core/POD/Tools/Blast/BlastHit.pm.html
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
123
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
124 http://bio.perl.org/Projects/modules.html - Online module documentation
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
125 http://bio.perl.org/Projects/Blast/ - Bioperl Blast Project
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
126 http://bio.perl.org/ - Bioperl Project Homepage
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
127
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
128 =head1 COPYRIGHT
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
129
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
130 Copyright (c) 1996-2001 Steve Chervitz. All Rights Reserved.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
131
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
132 =head1 DISCLAIMER
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
133
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
134 This software is provided "as is" without warranty of any kind.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
135
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
136 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
137
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
138
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
139 # END of main POD documentation.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
140
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
141 =head1 APPENDIX
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
142
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
143 The rest of the documentation details each of the object methods.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
144 Internal methods are usually preceded with a _
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
145
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
146 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
147
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
148 # Let the code begin...
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
149
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
150 package Bio::Search::HSP::BlastHSP;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
151
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
152 use strict;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
153 use Bio::SeqFeature::SimilarityPair;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
154 use Bio::SeqFeature::Similarity;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
155 use Bio::Search::HSP::HSPI;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
156
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
157 use vars qw( @ISA $GAP_SYMBOL $Revision %STRAND_SYMBOL );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
158
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
159 use overload
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
160 '""' => \&to_string;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
161
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
162 $Revision = '$Id: BlastHSP.pm,v 1.20 2002/12/24 15:45:33 jason Exp $'; #'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
163
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
164 @ISA = qw(Bio::SeqFeature::SimilarityPair Bio::Search::HSP::HSPI);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
165
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
166 $GAP_SYMBOL = '-'; # Need a more general way to handle gap symbols.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
167 %STRAND_SYMBOL = ('Plus' => 1, 'Minus' => -1 );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
168
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
169
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
170 =head2 new
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
171
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
172 Usage : $hsp = Bio::Search::HSP::BlastHSP->new( %named_params );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
173 : Bio::Search::HSP::BlastHSP.pm objects are constructed
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
174 : automatically by Bio::SearchIO::BlastHitFactory.pm,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
175 : so there is no need for direct instantiation.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
176 Purpose : Constructs a new BlastHSP object and Initializes key variables
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
177 : for the HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
178 Returns : A Bio::Search::HSP::BlastHSP object
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
179 Argument : Named parameters:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
180 : Parameter keys are case-insensitive.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
181 : -RAW_DATA => array ref containing raw BLAST report data for
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
182 : for a single HSP. This includes all lines
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
183 : of the HSP alignment from a traditional BLAST
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
184 or PSI-BLAST (non-XML) report,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
185 : -RANK => integer (1..n).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
186 : -PROGRAM => string ('TBLASTN', 'BLASTP', etc.).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
187 : -QUERY_NAME => string, id of query sequence
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
188 : -HIT_NAME => string, id of hit sequence
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
189 :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
190 Comments : Having the raw data allows this object to do lazy parsing of
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
191 : the raw HSP data (i.e., not parsed until needed).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
192 :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
193 : Note that there is a fair amount of basic parsing that is
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
194 : currently performed in this module that would be more appropriate
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
195 : to do within a separate factory object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
196 : This parsing code will likely be relocated and more initialization
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
197 : parameters will be added to new().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
198 :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
199 See Also : B<Bio::SeqFeature::SimilarityPair::new()>, B<Bio::SeqFeature::Similarity::new()>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
200
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
201 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
202
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
203 #----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
204 sub new {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
205 #----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
206 my ($class, @args ) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
207
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
208 my $self = $class->SUPER::new( @args );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
209 # Initialize placeholders
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
210 $self->{'_queryGaps'} = $self->{'_sbjctGaps'} = 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
211 my ($raw_data, $qname, $hname, $qlen, $hlen);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
212
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
213 ($self->{'_prog'}, $self->{'_rank'}, $raw_data,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
214 $qname, $hname) =
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
215 $self->_rearrange([qw( PROGRAM
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
216 RANK
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
217 RAW_DATA
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
218 QUERY_NAME
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
219 HIT_NAME
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
220 )], @args );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
221
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
222 # _set_data() does a fair amount of parsing.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
223 # This will likely change (see comment above.)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
224 $self->_set_data( @{$raw_data} );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
225 # Store the aligned query as sequence feature
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
226 my ($qb, $hb) = ($self->start());
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
227 my ($qe, $he) = ($self->end());
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
228 my ($qs, $hs) = ($self->strand());
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
229 my ($qf,$hf) = ($self->query->frame(),
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
230 $self->hit->frame);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
231
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
232 $self->query( Bio::SeqFeature::Similarity->new (-start =>$qb,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
233 -end =>$qe,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
234 -strand =>$qs,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
235 -bits =>$self->bits,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
236 -score =>$self->score,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
237 -frame =>$qf,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
238 -seq_id => $qname,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
239 -source =>$self->{'_prog'} ));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
240
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
241 $self->hit( Bio::SeqFeature::Similarity->new (-start =>$hb,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
242 -end =>$he,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
243 -strand =>$hs,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
244 -bits =>$self->bits,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
245 -score =>$self->score,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
246 -frame =>$hf,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
247 -seq_id => $hname,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
248 -source =>$self->{'_prog'} ));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
249
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
250 # set lengths
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
251 $self->query->seqlength($qlen); # query
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
252 $self->hit->seqlength($hlen); # subject
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
253
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
254 $self->query->frac_identical($self->frac_identical('query'));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
255 $self->hit->frac_identical($self->frac_identical('hit'));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
256 return $self;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
257 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
258
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
259 #sub DESTROY {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
260 # my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
261 # #print STDERR "--->DESTROYING $self\n";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
262 #}
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
263
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
264
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
265 # Title : _id_str;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
266 # Purpose : Intended for internal use only to provide a string for use
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
267 # within exception messages to help users figure out which
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
268 # query/hit caused the problem.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
269 # Returns : Short string with name of query and hit seq
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
270 sub _id_str {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
271 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
272 if( not defined $self->{'_id_str'}) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
273 my $qname = $self->query->seqname;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
274 my $hname = $self->hit->seqname;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
275 $self->{'_id_str'} = "QUERY=\"$qname\" HIT=\"$hname\"";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
276 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
277 return $self->{'_id_str'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
278 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
279
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
280 #=================================================
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
281 # Begin Bio::Search::HSP::HSPI implementation
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
282 #=================================================
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
283
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
284 =head2 algorithm
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
285
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
286 Title : algorithm
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
287 Usage : $alg = $hsp->algorithm();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
288 Function: Gets the algorithm specification that was used to obtain the hsp
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
289 For BLAST, the algorithm denotes what type of sequence was aligned
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
290 against what (BLASTN: dna-dna, BLASTP prt-prt, BLASTX translated
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
291 dna-prt, TBLASTN prt-translated dna, TBLASTX translated
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
292 dna-translated dna).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
293 Returns : a scalar string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
294 Args : none
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
295
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
296 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
297
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
298 #----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
299 sub algorithm {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
300 #----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
301 my ($self,@args) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
302 return $self->{'_prog'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
303 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
304
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
305
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
306
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
307
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
308 =head2 signif()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
309
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
310 Usage : $hsp_obj->signif()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
311 Purpose : Get the P-value or Expect value for the HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
312 Returns : Float (0.001 or 1.3e-43)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
313 : Returns P-value if it is defined, otherwise, Expect value.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
314 Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
315 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
316 Comments : Provided for consistency with BlastHit::signif()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
317 : Support for returning the significance data in different
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
318 : formats (e.g., exponent only), is not provided for HSP objects.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
319 : This is only available for the BlastHit or Blast object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
320
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
321 See Also : L<p()|p>, L<expect()|expect>, L<Bio::Search::Hit::BlastHit::signif()|Bio::Search::Hit::BlastHit>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
322
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
323 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
324
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
325 #-----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
326 sub signif {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
327 #-----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
328 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
329 my $val ||= defined($self->{'_p'}) ? $self->{'_p'} : $self->{'_expect'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
330 $val;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
331 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
332
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
333
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
334
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
335 =head2 evalue
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
336
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
337 Usage : $hsp_obj->evalue()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
338 Purpose : Get the Expect value for the HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
339 Returns : Float (0.001 or 1.3e-43)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
340 Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
341 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
342 Comments : Support for returning the expectation data in different
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
343 : formats (e.g., exponent only), is not provided for HSP objects.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
344 : This is only available for the BlastHit or Blast object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
345
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
346 See Also : L<p()|p>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
347
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
348 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
349
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
350 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
351 sub evalue { shift->{'_expect'} }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
352 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
353
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
354
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
355 =head2 p
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
356
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
357 Usage : $hsp_obj->p()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
358 Purpose : Get the P-value for the HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
359 Returns : Float (0.001 or 1.3e-43) or undef if not defined.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
360 Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
361 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
362 Comments : P-value is not defined with NCBI Blast2 reports.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
363 : Support for returning the expectation data in different
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
364 : formats (e.g., exponent only) is not provided for HSP objects.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
365 : This is only available for the BlastHit or Blast object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
366
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
367 See Also : L<expect()|expect>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
368
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
369 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
370
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
371 #-----
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
372 sub p { my $self = shift; $self->{'_p'}; }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
373 #-----
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
374
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
375 # alias
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
376 sub pvalue { shift->p(@_); }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
377
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
378 =head2 length
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
379
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
380 Usage : $hsp->length( [seq_type] )
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
381 Purpose : Get the length of the aligned portion of the query or sbjct.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
382 Example : $hsp->length('query')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
383 Returns : integer
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
384 Argument : seq_type: 'query' | 'hit' or 'sbjct' | 'total' (default = 'total')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
385 ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
386 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
387 Comments : 'total' length is the full length of the alignment
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
388 : as reported in the denominators in the alignment section:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
389 : "Identical = 34/120 Positives = 67/120".
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
390
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
391 See Also : L<gaps()|gaps>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
392
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
393 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
394
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
395 #-----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
396 sub length {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
397 #-----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
398 ## Developer note: when using the built-in length function within
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
399 ## this module, call it as CORE::length().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
400 my( $self, $seqType ) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
401 $seqType ||= 'total';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
402 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
403
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
404 $seqType ne 'total' and $self->_set_seq_data() unless $self->{'_set_seq_data'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
405
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
406 ## Sensitive to member name format.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
407 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
408 $self->{$seqType.'Length'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
409 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
410
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
411
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
412
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
413 =head2 gaps
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
414
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
415 Usage : $hsp->gaps( [seq_type] )
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
416 Purpose : Get the number of gaps in the query, sbjct, or total alignment.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
417 : Also can return query gaps and sbjct gaps as a two-element list
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
418 : when in array context.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
419 Example : $total_gaps = $hsp->gaps();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
420 : ($qgaps, $sgaps) = $hsp->gaps();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
421 : $qgaps = $hsp->gaps('query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
422 Returns : scalar context: integer
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
423 : array context without args: (int, int) = ('queryGaps', 'sbjctGaps')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
424 Argument : seq_type: 'query' or 'hit' or 'sbjct' or 'total'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
425 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
426 : (default = 'total', scalar context)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
427 : Array context can be "induced" by providing an argument of 'list' or 'array'.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
428 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
429
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
430 See Also : L<length()|length>, L<matches()|matches>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
431
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
432 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
433
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
434 #---------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
435 sub gaps {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
436 #---------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
437 my( $self, $seqType ) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
438
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
439 $self->_set_seq_data() unless $self->{'_set_seq_data'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
440
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
441 $seqType ||= (wantarray ? 'list' : 'total');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
442 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
443
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
444 if($seqType =~ /list|array/i) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
445 return (($self->{'_queryGaps'} || 0), ($self->{'_sbjctGaps'} || 0));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
446 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
447
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
448 if($seqType eq 'total') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
449 return ($self->{'_queryGaps'} + $self->{'_sbjctGaps'}) || 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
450 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
451 ## Sensitive to member name format.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
452 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
453 return $self->{$seqType.'Gaps'} || 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
454 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
455 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
456
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
457
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
458 =head2 frac_identical
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
459
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
460 Usage : $hsp_object->frac_identical( [seq_type] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
461 Purpose : Get the fraction of identical positions within the given HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
462 Example : $frac_iden = $hsp_object->frac_identical('query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
463 Returns : Float (2-decimal precision, e.g., 0.75).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
464 Argument : seq_type: 'query' or 'hit' or 'sbjct' or 'total'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
465 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
466 : default = 'total' (but see comments below).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
467 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
468 Comments : Different versions of Blast report different values for the total
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
469 : length of the alignment. This is the number reported in the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
470 : denominators in the stats section:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
471 : "Identical = 34/120 Positives = 67/120".
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
472 : NCBI-BLAST uses the total length of the alignment (with gaps)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
473 : WU-BLAST uses the length of the query sequence (without gaps).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
474 : Therefore, when called without an argument or an argument of 'total',
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
475 : this method will report different values depending on the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
476 : version of BLAST used.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
477 :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
478 : To get the fraction identical among only the aligned residues,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
479 : ignoring the gaps, call this method with an argument of 'query'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
480 : or 'sbjct' ('sbjct' is synonymous with 'hit').
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
481
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
482 See Also : L<frac_conserved()|frac_conserved>, L<num_identical()|num_identical>, L<matches()|matches>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
483
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
484 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
485
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
486 #-------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
487 sub frac_identical {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
488 #-------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
489 # The value is calculated as opposed to storing it from the parsed results.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
490 # This saves storage and also permits flexibility in determining for which
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
491 # sequence (query or sbjct) the figure is to be calculated.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
492
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
493 my( $self, $seqType ) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
494 $seqType ||= 'total';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
495 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
496
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
497 if($seqType ne 'total') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
498 $self->_set_seq_data() unless $self->{'_set_seq_data'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
499 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
500 ## Sensitive to member name format.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
501 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
502
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
503 sprintf( "%.2f", $self->{'_numIdentical'}/$self->{$seqType.'Length'});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
504 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
505
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
506
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
507 =head2 frac_conserved
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
508
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
509 Usage : $hsp_object->frac_conserved( [seq_type] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
510 Purpose : Get the fraction of conserved positions within the given HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
511 : (Note: 'conservative' positions are called 'positives' in the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
512 : Blast report.)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
513 Example : $frac_cons = $hsp_object->frac_conserved('query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
514 Returns : Float (2-decimal precision, e.g., 0.75).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
515 Argument : seq_type: 'query' or 'hit' or 'sbjct' or 'total'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
516 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
517 : default = 'total' (but see comments below).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
518 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
519 Comments : Different versions of Blast report different values for the total
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
520 : length of the alignment. This is the number reported in the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
521 : denominators in the stats section:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
522 : "Identical = 34/120 Positives = 67/120".
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
523 : NCBI-BLAST uses the total length of the alignment (with gaps)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
524 : WU-BLAST uses the length of the query sequence (without gaps).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
525 : Therefore, when called without an argument or an argument of 'total',
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
526 : this method will report different values depending on the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
527 : version of BLAST used.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
528 :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
529 : To get the fraction conserved among only the aligned residues,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
530 : ignoring the gaps, call this method with an argument of 'query'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
531 : or 'sbjct'.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
532
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
533 See Also : L<frac_conserved()|frac_conserved>, L<num_conserved()|num_conserved>, L<matches()|matches>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
534
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
535 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
536
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
537 #--------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
538 sub frac_conserved {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
539 #--------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
540 # The value is calculated as opposed to storing it from the parsed results.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
541 # This saves storage and also permits flexibility in determining for which
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
542 # sequence (query or sbjct) the figure is to be calculated.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
543
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
544 my( $self, $seqType ) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
545 $seqType ||= 'total';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
546 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
547
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
548 if($seqType ne 'total') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
549 $self->_set_seq_data() unless $self->{'_set_seq_data'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
550 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
551
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
552 ## Sensitive to member name format.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
553 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
554
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
555 sprintf( "%.2f", $self->{'_numConserved'}/$self->{$seqType.'Length'});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
556 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
557
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
558 =head2 query_string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
559
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
560 Title : query_string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
561 Usage : my $qseq = $hsp->query_string;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
562 Function: Retrieves the query sequence of this HSP as a string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
563 Returns : string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
564 Args : none
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
565
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
566
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
567 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
568
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
569 #----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
570 sub query_string{ shift->seq_str('query'); }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
571 #----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
572
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
573 =head2 hit_string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
574
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
575 Title : hit_string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
576 Usage : my $hseq = $hsp->hit_string;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
577 Function: Retrieves the hit sequence of this HSP as a string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
578 Returns : string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
579 Args : none
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
580
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
581
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
582 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
583
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
584 #----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
585 sub hit_string{ shift->seq_str('hit'); }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
586 #----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
587
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
588
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
589 =head2 homology_string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
590
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
591 Title : homology_string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
592 Usage : my $homo_string = $hsp->homology_string;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
593 Function: Retrieves the homology sequence for this HSP as a string.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
594 : The homology sequence is the string of symbols in between the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
595 : query and hit sequences in the alignment indicating the degree
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
596 : of conservation (e.g., identical, similar, not similar).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
597 Returns : string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
598 Args : none
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
599
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
600 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
601
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
602 #----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
603 sub homology_string{ shift->seq_str('match'); }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
604 #----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
605
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
606 #=================================================
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
607 # End Bio::Search::HSP::HSPI implementation
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
608 #=================================================
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
609
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
610 # Older method delegating to method defined in HSPI.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
611
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
612 =head2 expect
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
613
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
614 See L<Bio::Search::HSP::HSPI::expect()|Bio::Search::HSP::HSPI>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
615
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
616 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
617
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
618 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
619 sub expect { shift->evalue( @_ ); }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
620 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
621
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
622
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
623 =head2 rank
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
624
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
625 Usage : $hsp->rank( [string] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
626 Purpose : Get the rank of the HSP within a given Blast hit.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
627 Example : $rank = $hsp->rank;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
628 Returns : Integer (1..n) corresponding to the order in which the HSP
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
629 appears in the BLAST report.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
630
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
631 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
632
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
633 #'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
634
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
635 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
636 sub rank { shift->{'_rank'} }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
637 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
638
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
639 # For backward compatibility
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
640 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
641 sub name { shift->rank }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
642 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
643
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
644 =head2 to_string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
645
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
646 Title : to_string
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
647 Usage : print $hsp->to_string;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
648 Function: Returns a string representation for the Blast HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
649 Primarily intended for debugging purposes.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
650 Example : see usage
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
651 Returns : A string of the form:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
652 [BlastHSP] <rank>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
653 e.g.:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
654 [BlastHit] 1
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
655 Args : None
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
656
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
657 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
658
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
659 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
660 sub to_string {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
661 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
662 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
663 return "[BlastHSP] " . $self->rank();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
664 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
665
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
666
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
667 #=head2 _set_data (Private method)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
668 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
669 # Usage : called automatically during object construction.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
670 # Purpose : Parses the raw HSP section from a flat BLAST report and
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
671 # sets the query sequence, sbjct sequence, and the "match" data
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
672 # : which consists of the symbols between the query and sbjct lines
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
673 # : in the alignment.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
674 # Argument : Array (all lines for a single, complete HSP, from a raw,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
675 # flat (i.e., non-XML) BLAST report)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
676 # Throws : Propagates any exceptions from the methods called ("See Also")
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
677 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
678 #See Also : L<_set_seq()|_set_seq>, L<_set_score_stats()|_set_score_stats>, L<_set_match_stats()|_set_match_stats>, L<_initialize()|_initialize>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
679 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
680 #=cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
681
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
682 #--------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
683 sub _set_data {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
684 #--------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
685 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
686 my @data = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
687 my @queryList = (); # 'Query' = SEQUENCE USED TO QUERY THE DATABASE.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
688 my @sbjctList = (); # 'Sbjct' = HOMOLOGOUS SEQUENCE FOUND IN THE DATABASE.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
689 my @matchList = ();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
690 my $matchLine = 0; # Alternating boolean: when true, load 'match' data.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
691 my @linedat = ();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
692
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
693 #print STDERR "BlastHSP: set_data()\n";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
694
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
695 my($line, $aln_row_len, $length_diff);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
696 $length_diff = 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
697
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
698 # Collecting data for all lines in the alignment
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
699 # and then storing the collections for possible processing later.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
700 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
701 # Note that "match" lines may not be properly padded with spaces.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
702 # This loop now properly handles such cases:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
703 # Query: 1141 PSLVELTIRDCPRLEVGPMIRSLPKFPMLKKLDLAVANIIEEDLDVIGSLEELVIXXXXX 1200
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
704 # PSLVELTIRDCPRLEVGPMIRSLPKFPMLKKLDLAVANIIEEDLDVIGSLEELVI
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
705 # Sbjct: 1141 PSLVELTIRDCPRLEVGPMIRSLPKFPMLKKLDLAVANIIEEDLDVIGSLEELVILSLKL 1200
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
706
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
707 foreach $line( @data ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
708 next if $line =~ /^\s*$/;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
709
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
710 if( $line =~ /^ ?Score/ ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
711 $self->_set_score_stats( $line );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
712 } elsif( $line =~ /^ ?(Identities|Positives|Strand)/ ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
713 $self->_set_match_stats( $line );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
714 } elsif( $line =~ /^ ?Frame = ([\d+-]+)/ ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
715 # Version 2.0.8 has Frame information on a separate line.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
716 # Storing frame according to SeqFeature::Generic::frame()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
717 # which does not contain strand info (use strand()).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
718 my $frame = abs($1) - 1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
719 $self->frame( $frame );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
720 } elsif( $line =~ /^(Query:?[\s\d]+)([^\s\d]+)/ ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
721 push @queryList, $line;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
722 $self->{'_match_indent'} = CORE::length $1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
723 $aln_row_len = (CORE::length $1) + (CORE::length $2);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
724 $matchLine = 1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
725 } elsif( $matchLine ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
726 # Pad the match line with spaces if necessary.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
727 $length_diff = $aln_row_len - CORE::length $line;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
728 $length_diff and $line .= ' 'x $length_diff;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
729 push @matchList, $line;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
730 $matchLine = 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
731 } elsif( $line =~ /^Sbjct/ ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
732 push @sbjctList, $line;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
733 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
734 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
735 # Storing the query and sbjct lists in case they are needed later.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
736 # We could make this conditional to save memory.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
737 $self->{'_queryList'} = \@queryList;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
738 $self->{'_sbjctList'} = \@sbjctList;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
739
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
740 # Storing the match list in case it is needed later.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
741 $self->{'_matchList'} = \@matchList;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
742
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
743 if(not defined ($self->{'_numIdentical'})) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
744 my $id_str = $self->_id_str;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
745 $self->throw( -text => "Can't parse match statistics. Possibly a new or unrecognized Blast format. ($id_str)");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
746 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
747
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
748 if(!scalar @queryList or !scalar @sbjctList) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
749 my $id_str = $self->_id_str;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
750 $self->throw( "Can't find query or sbjct alignment lines. Possibly unrecognized Blast format. ($id_str)");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
751 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
752 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
753
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
754
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
755 #=head2 _set_score_stats (Private method)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
756 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
757 # Usage : called automatically by _set_data()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
758 # Purpose : Sets various score statistics obtained from the HSP listing.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
759 # Argument : String with any of the following formats:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
760 # : blast2: Score = 30.1 bits (66), Expect = 9.2
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
761 # : blast2: Score = 158.2 bits (544), Expect(2) = e-110
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
762 # : blast1: Score = 410 (144.3 bits), Expect = 1.7e-40, P = 1.7e-40
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
763 # : blast1: Score = 55 (19.4 bits), Expect = 5.3, Sum P(3) = 0.99
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
764 # Throws : Exception if the stats cannot be parsed, probably due to a change
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
765 # : in the Blast report format.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
766 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
767 #See Also : L<_set_data()|_set_data>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
768 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
769 #=cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
770
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
771 #--------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
772 sub _set_score_stats {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
773 #--------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
774 my ($self, $data) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
775
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
776 my ($expect, $p);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
777
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
778 if($data =~ /Score = +([\d.e+-]+) bits \(([\d.e+-]+)\), +Expect = +([\d.e+-]+)/) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
779 # blast2 format n = 1
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
780 $self->bits($1);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
781 $self->score($2);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
782 $expect = $3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
783 } elsif($data =~ /Score = +([\d.e+-]+) bits \(([\d.e+-]+)\), +Expect\((\d+)\) = +([\d.e+-]+)/) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
784 # blast2 format n > 1
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
785 $self->bits($1);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
786 $self->score($2);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
787 $self->{'_n'} = $3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
788 $expect = $4;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
789
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
790 } elsif($data =~ /Score = +([\d.e+-]+) \(([\d.e+-]+) bits\), +Expect = +([\d.e+-]+), P = +([\d.e-]+)/) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
791 # blast1 format, n = 1
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
792 $self->score($1);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
793 $self->bits($2);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
794 $expect = $3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
795 $p = $4;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
796
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
797 } elsif($data =~ /Score = +([\d.e+-]+) \(([\d.e+-]+) bits\), +Expect = +([\d.e+-]+), +Sum P\((\d+)\) = +([\d.e-]+)/) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
798 # blast1 format, n > 1
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
799 $self->score($1);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
800 $self->bits($2);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
801 $expect = $3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
802 $self->{'_n'} = $4;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
803 $p = $5;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
804
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
805 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
806 my $id_str = $self->_id_str;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
807 $self->throw(-class => 'Bio::Root::Exception',
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
808 -text => "Can't parse score statistics: unrecognized format. ($id_str)",
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
809 -value => $data);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
810 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
811 $expect = "1$expect" if $expect =~ /^e/i;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
812 $p = "1$p" if defined $p and $p=~ /^e/i;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
813
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
814 $self->{'_expect'} = $expect;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
815 $self->{'_p'} = $p || undef;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
816 $self->significance( $p || $expect );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
817 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
818
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
819
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
820 #=head2 _set_match_stats (Private method)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
821 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
822 # Usage : Private method; called automatically by _set_data()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
823 # Purpose : Sets various matching statistics obtained from the HSP listing.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
824 # Argument : blast2: Identities = 23/74 (31%), Positives = 29/74 (39%), Gaps = 17/74 (22%)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
825 # : blast2: Identities = 57/98 (58%), Positives = 74/98 (75%)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
826 # : blast1: Identities = 87/204 (42%), Positives = 126/204 (61%)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
827 # : blast1: Identities = 87/204 (42%), Positives = 126/204 (61%), Frame = -3
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
828 # : WU-blast: Identities = 310/553 (56%), Positives = 310/553 (56%), Strand = Minus / Plus
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
829 # Throws : Exception if the stats cannot be parsed, probably due to a change
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
830 # : in the Blast report format.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
831 # Comments : The "Gaps = " data in the HSP header has a different meaning depending
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
832 # : on the type of Blast: for BLASTP, this number is the total number of
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
833 # : gaps in query+sbjct; for TBLASTN, it is the number of gaps in the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
834 # : query sequence only. Thus, it is safer to collect the data
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
835 # : separately by examining the actual sequence strings as is done
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
836 # : in _set_seq().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
837 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
838 #See Also : L<_set_data()|_set_data>, L<_set_seq()|_set_seq>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
839 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
840 #=cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
841
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
842 #--------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
843 sub _set_match_stats {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
844 #--------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
845 my ($self, $data) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
846
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
847 if($data =~ m!Identities = (\d+)/(\d+)!) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
848 # blast1 or 2 format
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
849 $self->{'_numIdentical'} = $1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
850 $self->{'_totalLength'} = $2;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
851 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
852
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
853 if($data =~ m!Positives = (\d+)/(\d+)!) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
854 # blast1 or 2 format
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
855 $self->{'_numConserved'} = $1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
856 $self->{'_totalLength'} = $2;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
857 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
858
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
859 if($data =~ m!Frame = ([\d+-]+)!) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
860 $self->frame($1);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
861 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
862
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
863 # Strand data is not always present in this line.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
864 # _set_seq() will also set strand information.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
865 if($data =~ m!Strand = (\w+) / (\w+)!) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
866 $self->{'_queryStrand'} = $1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
867 $self->{'_sbjctStrand'} = $2;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
868 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
869
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
870 # if($data =~ m!Gaps = (\d+)/(\d+)!) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
871 # $self->{'_totalGaps'} = $1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
872 # } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
873 # $self->{'_totalGaps'} = 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
874 # }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
875 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
876
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
877
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
878
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
879 #=head2 _set_seq_data (Private method)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
880 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
881 # Usage : called automatically when sequence data is requested.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
882 # Purpose : Sets the HSP sequence data for both query and sbjct sequences.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
883 # : Includes: start, stop, length, gaps, and raw sequence.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
884 # Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
885 # Throws : Propagates any exception thrown by _set_match_seq()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
886 # Comments : Uses raw data stored by _set_data() during object construction.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
887 # : These data are not always needed, so it is conditionally
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
888 # : executed only upon demand by methods such as gaps(), _set_residues(),
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
889 # : etc. _set_seq() does the dirty work.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
890 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
891 #See Also : L<_set_seq()|_set_seq>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
892 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
893 #=cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
894
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
895 #-----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
896 sub _set_seq_data {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
897 #-----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
898 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
899
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
900 $self->_set_seq('query', @{$self->{'_queryList'}});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
901 $self->_set_seq('sbjct', @{$self->{'_sbjctList'}});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
902
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
903 # Liberate some memory.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
904 @{$self->{'_queryList'}} = @{$self->{'_sbjctList'}} = ();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
905 undef $self->{'_queryList'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
906 undef $self->{'_sbjctList'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
907
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
908 $self->{'_set_seq_data'} = 1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
909 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
910
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
911
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
912
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
913 #=head2 _set_seq (Private method)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
914 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
915 # Usage : called automatically by _set_seq_data()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
916 # : $hsp_obj->($seq_type, @data);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
917 # Purpose : Sets sequence information for both the query and sbjct sequences.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
918 # : Directly counts the number of gaps in each sequence (if gapped Blast).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
919 # Argument : $seq_type = 'query' or 'sbjct'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
920 # : @data = all seq lines with the form:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
921 # : Query: 61 SPHNVKDRKEQNGSINNAISPTATANTSGSQQINIDSALRDRSSNVAAQPSLSDASSGSN 120
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
922 # Throws : Exception if data strings cannot be parsed, probably due to a change
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
923 # : in the Blast report format.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
924 # Comments : Uses first argument to determine which data members to set
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
925 # : making this method sensitive data member name changes.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
926 # : Behavior is dependent on the type of BLAST analysis (TBLASTN, BLASTP, etc).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
927 # Warning : Sequence endpoints are normalized so that start < end. This affects HSPs
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
928 # : for TBLASTN/X hits on the minus strand. Normalization facilitates use
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
929 # : of range information by methods such as match().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
930 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
931 #See Also : L<_set_seq_data()|_set_seq_data>, L<matches()|matches>, L<range()|range>, L<start()|start>, L<end()|end>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
932 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
933 #=cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
934
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
935 #-------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
936 sub _set_seq {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
937 #-------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
938 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
939 my $seqType = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
940 my @data = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
941 my @ranges = ();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
942 my @sequence = ();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
943 my $numGaps = 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
944
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
945 foreach( @data ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
946 if( m/(\d+) *([^\d\s]+) *(\d+)/) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
947 push @ranges, ( $1, $3 ) ;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
948 push @sequence, $2;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
949 #print STDERR "_set_seq found sequence \"$2\"\n";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
950 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
951 $self->warn("Bad sequence data: $_");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
952 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
953 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
954
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
955 if( !(scalar(@sequence) and scalar(@ranges))) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
956 my $id_str = $self->_id_str;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
957 $self->throw("Can't set sequence: missing data. Possibly unrecognized Blast format. ($id_str)");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
958 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
959
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
960 # Sensitive to member name changes.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
961 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
962 $self->{$seqType.'Start'} = $ranges[0];
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
963 $self->{$seqType.'Stop'} = $ranges[ $#ranges ];
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
964 $self->{$seqType.'Seq'} = \@sequence;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
965
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
966 $self->{$seqType.'Length'} = abs($ranges[ $#ranges ] - $ranges[0]) + 1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
967
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
968 # Adjust lengths for BLASTX, TBLASTN, TBLASTX sequences
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
969 # Converting nucl coords to amino acid coords.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
970
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
971 my $prog = $self->algorithm;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
972 if($prog eq 'TBLASTN' and $seqType eq '_sbjct') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
973 $self->{$seqType.'Length'} /= 3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
974 } elsif($prog eq 'BLASTX' and $seqType eq '_query') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
975 $self->{$seqType.'Length'} /= 3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
976 } elsif($prog eq 'TBLASTX') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
977 $self->{$seqType.'Length'} /= 3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
978 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
979
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
980 if( $prog ne 'BLASTP' ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
981 $self->{$seqType.'Strand'} = 'Plus' if $prog =~ /BLASTN/;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
982 $self->{$seqType.'Strand'} = 'Plus' if ($prog =~ /BLASTX/ and $seqType eq '_query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
983 # Normalize sequence endpoints so that start < end.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
984 # Reverse complement or 'minus strand' HSPs get flipped here.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
985 if($self->{$seqType.'Start'} > $self->{$seqType.'Stop'}) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
986 ($self->{$seqType.'Start'}, $self->{$seqType.'Stop'}) =
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
987 ($self->{$seqType.'Stop'}, $self->{$seqType.'Start'});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
988 $self->{$seqType.'Strand'} = 'Minus';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
989 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
990 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
991
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
992 ## Count number of gaps in each seq. Only need to do this for gapped Blasts.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
993 # if($self->{'_gapped'}) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
994 my $seqstr = join('', @sequence);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
995 $seqstr =~ s/\s//g;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
996 my $num_gaps = CORE::length($seqstr) - $self->{$seqType.'Length'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
997 $self->{$seqType.'Gaps'} = $num_gaps if $num_gaps > 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
998 # }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
999 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1000
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1001
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1002 #=head2 _set_residues (Private method)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1003 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1004 # Usage : called automatically when residue data is requested.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1005 # Purpose : Sets the residue numbers representing the identical and
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1006 # : conserved positions. These data are obtained by analyzing the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1007 # : symbols between query and sbjct lines of the alignments.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1008 # Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1009 # Throws : Propagates any exception thrown by _set_seq_data() and _set_match_seq().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1010 # Comments : These data are not always needed, so it is conditionally
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1011 # : executed only upon demand by methods such as seq_inds().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1012 # : Behavior is dependent on the type of BLAST analysis (TBLASTN, BLASTP, etc).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1013 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1014 #See Also : L<_set_seq_data()|_set_seq_data>, L<_set_match_seq()|_set_match_seq>, seq_inds()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1015 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1016 #=cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1017
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1018 #------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1019 sub _set_residues {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1020 #------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1021 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1022 my @sequence = ();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1023
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1024 $self->_set_seq_data() unless $self->{'_set_seq_data'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1025
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1026 # Using hashes to avoid saving duplicate residue numbers.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1027 my %identicalList_query = ();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1028 my %identicalList_sbjct = ();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1029 my %conservedList_query = ();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1030 my %conservedList_sbjct = ();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1031
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1032 my $aref = $self->_set_match_seq() if not ref $self->{'_matchSeq'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1033 $aref ||= $self->{'_matchSeq'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1034 my $seqString = join('', @$aref );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1035
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1036 my $qseq = join('',@{$self->{'_querySeq'}});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1037 my $sseq = join('',@{$self->{'_sbjctSeq'}});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1038 my $resCount_query = $self->{'_queryStop'} || 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1039 my $resCount_sbjct = $self->{'_sbjctStop'} || 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1040
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1041 my $prog = $self->algorithm;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1042 if($prog !~ /^BLASTP|^BLASTN/) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1043 if($prog eq 'TBLASTN') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1044 $resCount_sbjct /= 3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1045 } elsif($prog eq 'BLASTX') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1046 $resCount_query /= 3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1047 } elsif($prog eq 'TBLASTX') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1048 $resCount_query /= 3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1049 $resCount_sbjct /= 3;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1050 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1051 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1052
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1053 my ($mchar, $schar, $qchar);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1054 while( $mchar = chop($seqString) ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1055 ($qchar, $schar) = (chop($qseq), chop($sseq));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1056 if( $mchar eq '+' ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1057 $conservedList_query{ $resCount_query } = 1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1058 $conservedList_sbjct{ $resCount_sbjct } = 1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1059 } elsif( $mchar ne ' ' ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1060 $identicalList_query{ $resCount_query } = 1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1061 $identicalList_sbjct{ $resCount_sbjct } = 1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1062 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1063 $resCount_query-- if $qchar ne $GAP_SYMBOL;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1064 $resCount_sbjct-- if $schar ne $GAP_SYMBOL;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1065 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1066 $self->{'_identicalRes_query'} = \%identicalList_query;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1067 $self->{'_conservedRes_query'} = \%conservedList_query;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1068 $self->{'_identicalRes_sbjct'} = \%identicalList_sbjct;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1069 $self->{'_conservedRes_sbjct'} = \%conservedList_sbjct;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1070
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1071 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1072
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1073
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1074
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1075
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1076 #=head2 _set_match_seq (Private method)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1077 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1078 # Usage : $hsp_obj->_set_match_seq()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1079 # Purpose : Set the 'match' sequence for the current HSP (symbols in between
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1080 # : the query and sbjct lines.)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1081 # Returns : Array reference holding the match sequences lines.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1082 # Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1083 # Throws : Exception if the _matchList field is not set.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1084 # Comments : The match information is not always necessary. This method
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1085 # : allows it to be conditionally prepared.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1086 # : Called by _set_residues>() and seq_str().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1087 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1088 #See Also : L<_set_residues()|_set_residues>, L<seq_str()|seq_str>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1089 #
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1090 #=cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1091
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1092 #-------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1093 sub _set_match_seq {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1094 #-------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1095 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1096
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1097 if( ! ref($self->{'_matchList'}) ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1098 my $id_str = $self->_id_str;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1099 $self->throw("Can't set HSP match sequence: No data ($id_str)");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1100 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1101
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1102 my @data = @{$self->{'_matchList'}};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1103
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1104 my(@sequence);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1105 foreach( @data ) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1106 chomp($_);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1107 ## Remove leading spaces; (note: aln may begin with a space
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1108 ## which is why we can't use s/^ +//).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1109 s/^ {$self->{'_match_indent'}}//;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1110 push @sequence, $_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1111 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1112 # Liberate some memory.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1113 @{$self->{'_matchList'}} = undef;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1114 $self->{'_matchList'} = undef;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1115
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1116 $self->{'_matchSeq'} = \@sequence;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1117
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1118 return $self->{'_matchSeq'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1119 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1120
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1121
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1122 =head2 n
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1123
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1124 Usage : $hsp_obj->n()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1125 Purpose : Get the N value (num HSPs on which P/Expect is based).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1126 : This value is not defined with NCBI Blast2 with gapping.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1127 Returns : Integer or null string if not defined.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1128 Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1129 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1130 Comments : The 'N' value is listed in parenthesis with P/Expect value:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1131 : e.g., P(3) = 1.2e-30 ---> (N = 3).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1132 : Not defined in NCBI Blast2 with gaps.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1133 : This typically is equal to the number of HSPs but not always.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1134 : To obtain the number of HSPs, use Bio::Search::Hit::BlastHit::num_hsps().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1135
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1136 See Also : L<Bio::SeqFeature::SimilarityPair::score()|Bio::SeqFeature::SimilarityPair>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1137
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1138 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1139
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1140 #-----
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1141 sub n { my $self = shift; $self->{'_n'} || ''; }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1142 #-----
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1143
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1144
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1145 =head2 matches
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1146
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1147 Usage : $hsp->matches([seq_type], [start], [stop]);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1148 Purpose : Get the total number of identical and conservative matches
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1149 : in the query or sbjct sequence for the given HSP. Optionally can
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1150 : report data within a defined interval along the seq.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1151 : (Note: 'conservative' matches are called 'positives' in the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1152 : Blast report.)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1153 Example : ($id,$cons) = $hsp_object->matches('hit');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1154 : ($id,$cons) = $hsp_object->matches('query',300,400);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1155 Returns : 2-element array of integers
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1156 Argument : (1) seq_type = 'query' or 'hit' or 'sbjct' (default = query)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1157 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1158 : (2) start = Starting coordinate (optional)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1159 : (3) stop = Ending coordinate (optional)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1160 Throws : Exception if the supplied coordinates are out of range.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1161 Comments : Relies on seq_str('match') to get the string of alignment symbols
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1162 : between the query and sbjct lines which are used for determining
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1163 : the number of identical and conservative matches.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1164
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1165 See Also : L<length()|length>, L<gaps()|gaps>, L<seq_str()|seq_str>, L<Bio::Search::Hit::BlastHit::_adjust_contigs()|Bio::Search::Hit::BlastHit>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1166
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1167 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1168
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1169 #-----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1170 sub matches {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1171 #-----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1172 my( $self, %param ) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1173 my(@data);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1174 my($seqType, $beg, $end) = ($param{-SEQ}, $param{-START}, $param{-STOP});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1175 $seqType ||= 'query';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1176 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1177
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1178 my($start,$stop);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1179
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1180 if(!defined $beg && !defined $end) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1181 ## Get data for the whole alignment.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1182 push @data, ($self->{'_numIdentical'}, $self->{'_numConserved'});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1183 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1184 ## Get the substring representing the desired sub-section of aln.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1185 $beg ||= 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1186 $end ||= 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1187 ($start,$stop) = $self->range($seqType);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1188 if($beg == 0) { $beg = $start; $end = $beg+$end; }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1189 elsif($end == 0) { $end = $stop; $beg = $end-$beg; }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1190
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1191 if($end >= $stop) { $end = $stop; } ##ML changed from if (end >stop)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1192 else { $end += 1;} ##ML moved from commented position below, makes
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1193 ##more sense here
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1194 # if($end > $stop) { $end = $stop; }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1195 if($beg < $start) { $beg = $start; }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1196 # else { $end += 1;}
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1197
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1198 # my $seq = substr($self->seq_str('match'), $beg-$start, ($end-$beg));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1199
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1200 ## ML: START fix for substr out of range error ------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1201 my $seq = "";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1202 my $prog = $self->algorithm;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1203 if (($prog eq 'TBLASTN') and ($seqType eq 'sbjct'))
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1204 {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1205 $seq = substr($self->seq_str('match'),
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1206 int(($beg-$start)/3), int(($end-$beg+1)/3));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1207
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1208 } elsif (($prog eq 'BLASTX') and ($seqType eq 'query'))
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1209 {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1210 $seq = substr($self->seq_str('match'),
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1211 int(($beg-$start)/3), int(($end-$beg+1)/3));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1212 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1213 $seq = substr($self->seq_str('match'),
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1214 $beg-$start, ($end-$beg));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1215 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1216 ## ML: End of fix for substr out of range error -----------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1217
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1218
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1219 ## ML: debugging code
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1220 ## This is where we get our exception. Try printing out the values going
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1221 ## into this:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1222 ##
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1223 # print STDERR
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1224 # qq(*------------MY EXCEPTION --------------------\nSeq: ") ,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1225 # $self->seq_str("$seqType"), qq("\n),$self->rank,",( index:";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1226 # print STDERR $beg-$start, ", len: ", $end-$beg," ), (HSPRealLen:",
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1227 # CORE::length $self->seq_str("$seqType");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1228 # print STDERR ", HSPCalcLen: ", $stop - $start +1 ," ),
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1229 # ( beg: $beg, end: $end ), ( start: $start, stop: stop )\n";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1230 ## ML: END DEBUGGING CODE----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1231
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1232 if(!CORE::length $seq) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1233 my $id_str = $self->_id_str;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1234 $self->throw("Undefined $seqType sub-sequence ($beg,$end). Valid range = $start - $stop ($id_str)");
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1235 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1236 ## Get data for a substring.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1237 # printf "Collecting HSP subsection data: beg,end = %d,%d; start,stop = %d,%d\n%s<---\n", $beg, $end, $start, $stop, $seq;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1238 # printf "Original match seq:\n%s\n",$self->seq_str('match');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1239 $seq =~ s/ //g; # remove space (no info).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1240 my $len_cons = CORE::length $seq;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1241 $seq =~ s/\+//g; # remove '+' characters (conservative substitutions)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1242 my $len_id = CORE::length $seq;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1243 push @data, ($len_id, $len_cons);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1244 # printf " HSP = %s\n id = %d; cons = %d\n", $self->rank, $len_id, $len_cons; <STDIN>;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1245 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1246 @data;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1247 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1248
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1249
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1250 =head2 num_identical
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1251
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1252 Usage : $hsp_object->num_identical();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1253 Purpose : Get the number of identical positions within the given HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1254 Example : $num_iden = $hsp_object->num_identical();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1255 Returns : integer
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1256 Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1257 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1258
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1259 See Also : L<num_conserved()|num_conserved>, L<frac_identical()|frac_identical>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1260
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1261 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1262
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1263 #-------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1264 sub num_identical {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1265 #-------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1266 my( $self) = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1267
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1268 $self->{'_numIdentical'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1269 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1270
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1271
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1272 =head2 num_conserved
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1273
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1274 Usage : $hsp_object->num_conserved();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1275 Purpose : Get the number of conserved positions within the given HSP.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1276 Example : $num_iden = $hsp_object->num_conserved();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1277 Returns : integer
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1278 Argument : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1279 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1280
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1281 See Also : L<num_identical()|num_identical>, L<frac_conserved()|frac_conserved>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1282
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1283 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1284
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1285 #-------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1286 sub num_conserved {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1287 #-------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1288 my( $self) = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1289
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1290 $self->{'_numConserved'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1291 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1292
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1293
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1294
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1295 =head2 range
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1296
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1297 Usage : $hsp->range( [seq_type] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1298 Purpose : Gets the (start, end) coordinates for the query or sbjct sequence
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1299 : in the HSP alignment.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1300 Example : ($query_beg, $query_end) = $hsp->range('query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1301 : ($hit_beg, $hit_end) = $hsp->range('hit');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1302 Returns : Two-element array of integers
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1303 Argument : seq_type = string, 'query' or 'hit' or 'sbjct' (default = 'query')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1304 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1305 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1306
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1307 See Also : L<start()|start>, L<end()|end>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1308
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1309 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1310
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1311 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1312 sub range {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1313 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1314 my ($self, $seqType) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1315
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1316 $self->_set_seq_data() unless $self->{'_set_seq_data'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1317
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1318 $seqType ||= 'query';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1319 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1320
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1321 ## Sensitive to member name changes.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1322 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1323
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1324 return ($self->{$seqType.'Start'},$self->{$seqType.'Stop'});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1325 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1326
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1327 =head2 start
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1328
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1329 Usage : $hsp->start( [seq_type] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1330 Purpose : Gets the start coordinate for the query, sbjct, or both sequences
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1331 : in the HSP alignment.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1332 : NOTE: Start will always be less than end.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1333 : To determine strand, use $hsp->strand()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1334 Example : $query_beg = $hsp->start('query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1335 : $hit_beg = $hsp->start('hit');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1336 : ($query_beg, $hit_beg) = $hsp->start();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1337 Returns : scalar context: integer
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1338 : array context without args: list of two integers
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1339 Argument : In scalar context: seq_type = 'query' or 'hit' or 'sbjct' (default= 'query')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1340 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1341 : Array context can be "induced" by providing an argument of 'list' or 'array'.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1342 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1343
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1344 See Also : L<end()|end>, L<range()|range>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1345
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1346 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1347
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1348 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1349 sub start {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1350 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1351 my ($self, $seqType) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1352
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1353 $seqType ||= (wantarray ? 'list' : 'query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1354 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1355
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1356 $self->_set_seq_data() unless $self->{'_set_seq_data'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1357
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1358 if($seqType =~ /list|array/i) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1359 return ($self->{'_queryStart'}, $self->{'_sbjctStart'});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1360 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1361 ## Sensitive to member name changes.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1362 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1363 return $self->{$seqType.'Start'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1364 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1365 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1366
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1367 =head2 end
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1368
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1369 Usage : $hsp->end( [seq_type] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1370 Purpose : Gets the end coordinate for the query, sbjct, or both sequences
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1371 : in the HSP alignment.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1372 : NOTE: Start will always be less than end.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1373 : To determine strand, use $hsp->strand()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1374 Example : $query_end = $hsp->end('query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1375 : $hit_end = $hsp->end('hit');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1376 : ($query_end, $hit_end) = $hsp->end();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1377 Returns : scalar context: integer
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1378 : array context without args: list of two integers
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1379 Argument : In scalar context: seq_type = 'query' or 'hit' or 'sbjct' (default= 'query')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1380 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1381 : Array context can be "induced" by providing an argument of 'list' or 'array'.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1382 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1383
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1384 See Also : L<start()|start>, L<range()|range>, L<strand()|strand>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1385
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1386 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1387
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1388 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1389 sub end {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1390 #----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1391 my ($self, $seqType) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1392
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1393 $seqType ||= (wantarray ? 'list' : 'query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1394 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1395
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1396 $self->_set_seq_data() unless $self->{'_set_seq_data'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1397
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1398 if($seqType =~ /list|array/i) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1399 return ($self->{'_queryStop'}, $self->{'_sbjctStop'});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1400 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1401 ## Sensitive to member name changes.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1402 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1403 return $self->{$seqType.'Stop'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1404 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1405 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1406
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1407
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1408
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1409 =head2 strand
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1410
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1411 Usage : $hsp_object->strand( [seq_type] )
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1412 Purpose : Get the strand of the query or sbjct sequence.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1413 Example : print $hsp->strand('query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1414 : ($query_strand, $hit_strand) = $hsp->strand();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1415 Returns : -1, 0, or 1
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1416 : -1 = Minus strand, +1 = Plus strand
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1417 : Returns 0 if strand is not defined, which occurs
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1418 : for BLASTP reports, and the query of TBLASTN
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1419 : as well as the hit if BLASTX reports.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1420 : In scalar context without arguments, returns queryStrand value.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1421 : In array context without arguments, returns a two-element list
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1422 : of strings (queryStrand, sbjctStrand).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1423 : Array context can be "induced" by providing an argument of 'list' or 'array'.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1424 Argument : seq_type: 'query' or 'hit' or 'sbjct' or undef
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1425 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1426 Throws : n/a
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1427
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1428 See Also : B<_set_seq()>, B<_set_match_stats()>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1429
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1430 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1431
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1432 #-----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1433 sub strand {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1434 #-----------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1435 my( $self, $seqType ) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1436
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1437 # Hack to deal with the fact that SimilarityPair calls strand()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1438 # which will lead to an error because parsing hasn't yet occurred.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1439 # See SimilarityPair::new().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1440 return if $self->{'_initializing'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1441
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1442 $seqType ||= (wantarray ? 'list' : 'query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1443 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1444
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1445 ## Sensitive to member name format.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1446 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1447
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1448 # $seqType could be '_list'.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1449 $self->{'_queryStrand'} or $self->_set_seq_data() unless $self->{'_set_seq_data'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1450
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1451 my $prog = $self->algorithm;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1452
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1453 if($seqType =~ /list|array/i) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1454 my ($qstr, $hstr);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1455 if( $prog eq 'BLASTP') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1456 $qstr = 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1457 $hstr = 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1458 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1459 elsif( $prog eq 'TBLASTN') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1460 $qstr = 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1461 $hstr = $STRAND_SYMBOL{$self->{'_sbjctStrand'}};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1462 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1463 elsif( $prog eq 'BLASTX') {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1464 $qstr = $STRAND_SYMBOL{$self->{'_queryStrand'}};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1465 $hstr = 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1466 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1467 else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1468 $qstr = $STRAND_SYMBOL{$self->{'_queryStrand'}} if defined $self->{'_queryStrand'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1469 $hstr = $STRAND_SYMBOL{$self->{'_sbjctStrand'}} if defined $self->{'_sbjctStrand'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1470 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1471 $qstr ||= 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1472 $hstr ||= 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1473 return ($qstr, $hstr);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1474 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1475 local $^W = 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1476 $STRAND_SYMBOL{$self->{$seqType.'Strand'}} || 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1477 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1478
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1479
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1480 =head2 seq
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1481
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1482 Usage : $hsp->seq( [seq_type] );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1483 Purpose : Get the query or sbjct sequence as a Bio::Seq.pm object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1484 Example : $seqObj = $hsp->seq('query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1485 Returns : Object reference for a Bio::Seq.pm object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1486 Argument : seq_type = 'query' or 'hit' or 'sbjct' (default = 'query').
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1487 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1488 Throws : Propagates any exception that occurs during construction
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1489 : of the Bio::Seq.pm object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1490 Comments : The sequence is returned in an array of strings corresponding
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1491 : to the strings in the original format of the Blast alignment.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1492 : (i.e., same spacing).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1493
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1494 See Also : L<seq_str()|seq_str>, L<seq_inds()|seq_inds>, B<Bio::Seq>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1495
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1496 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1497
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1498 #-------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1499 sub seq {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1500 #-------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1501 my($self,$seqType) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1502 $seqType ||= 'query';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1503 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1504 my $str = $self->seq_str($seqType);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1505
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1506 require Bio::Seq;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1507
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1508 new Bio::Seq (-ID => $self->to_string,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1509 -SEQ => $str,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1510 -DESC => "$seqType sequence",
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1511 );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1512 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1513
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1514 =head2 seq_str
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1515
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1516 Usage : $hsp->seq_str( seq_type );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1517 Purpose : Get the full query, sbjct, or 'match' sequence as a string.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1518 : The 'match' sequence is the string of symbols in between the
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1519 : query and sbjct sequences.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1520 Example : $str = $hsp->seq_str('query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1521 Returns : String
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1522 Argument : seq_Type = 'query' or 'hit' or 'sbjct' or 'match'
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1523 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1524 Throws : Exception if the argument does not match an accepted seq_type.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1525 Comments : Calls _set_seq_data() to set the 'match' sequence if it has
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1526 : not been set already.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1527
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1528 See Also : L<seq()|seq>, L<seq_inds()|seq_inds>, B<_set_match_seq()>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1529
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1530 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1531
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1532 #------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1533 sub seq_str {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1534 #------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1535 my($self,$seqType) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1536
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1537 $seqType ||= 'query';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1538 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1539 ## Sensitive to member name changes.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1540 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1541
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1542 $self->_set_seq_data() unless $self->{'_set_seq_data'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1543
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1544 if($seqType =~ /sbjct|query/) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1545 my $seq = join('',@{$self->{$seqType.'Seq'}});
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1546 $seq =~ s/\s+//g;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1547 return $seq;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1548
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1549 } elsif( $seqType =~ /match/i) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1550 # Only need to call _set_match_seq() if the match seq is requested.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1551 my $aref = $self->_set_match_seq() unless ref $self->{'_matchSeq'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1552 $aref = $self->{'_matchSeq'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1553
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1554 return join('',@$aref);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1555
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1556 } else {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1557 my $id_str = $self->_id_str;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1558 $self->throw(-class => 'Bio::Root::BadParameter',
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1559 -text => "Invalid or undefined sequence type: $seqType ($id_str)\n" .
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1560 "Valid types: query, sbjct, match",
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1561 -value => $seqType);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1562 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1563 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1564
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1565 =head2 seq_inds
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1566
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1567 Usage : $hsp->seq_inds( seq_type, class, collapse );
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1568 Purpose : Get a list of residue positions (indices) for all identical
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1569 : or conserved residues in the query or sbjct sequence.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1570 Example : @s_ind = $hsp->seq_inds('query', 'identical');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1571 : @h_ind = $hsp->seq_inds('hit', 'conserved');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1572 : @h_ind = $hsp->seq_inds('hit', 'conserved', 1);
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1573 Returns : List of integers
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1574 : May include ranges if collapse is true.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1575 Argument : seq_type = 'query' or 'hit' or 'sbjct' (default = query)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1576 : ('sbjct' is synonymous with 'hit')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1577 : class = 'identical' or 'conserved' (default = identical)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1578 : (can be shortened to 'id' or 'cons')
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1579 : (actually, anything not 'id' will evaluate to 'conserved').
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1580 : collapse = boolean, if true, consecutive positions are merged
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1581 : using a range notation, e.g., "1 2 3 4 5 7 9 10 11"
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1582 : collapses to "1-5 7 9-11". This is useful for
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1583 : consolidating long lists. Default = no collapse.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1584 Throws : n/a.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1585 Comments : Calls _set_residues() to set the 'match' sequence if it has
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1586 : not been set already.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1587
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1588 See Also : L<seq()|seq>, B<_set_residues()>, L<Bio::Search::BlastUtils::collapse_nums()|Bio::Search::BlastUtils>, L<Bio::Search::Hit::BlastHit::seq_inds()|Bio::Search::Hit::BlastHit>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1589
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1590 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1591
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1592 #---------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1593 sub seq_inds {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1594 #---------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1595 my ($self, $seqType, $class, $collapse) = @_;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1596
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1597 $seqType ||= 'query';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1598 $class ||= 'identical';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1599 $collapse ||= 0;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1600 $seqType = 'sbjct' if $seqType eq 'hit';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1601
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1602 $self->_set_residues() unless defined $self->{'_identicalRes_query'};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1603
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1604 $seqType = ($seqType !~ /^q/i ? 'sbjct' : 'query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1605 $class = ($class !~ /^id/i ? 'conserved' : 'identical');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1606
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1607 ## Sensitive to member name changes.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1608 $seqType = "_\L$seqType\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1609 $class = "_\L$class\E";
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1610
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1611 my @ary = sort { $a <=> $b } keys %{ $self->{"${class}Res$seqType"}};
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1612
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1613 require Bio::Search::BlastUtils if $collapse;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1614
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1615 return $collapse ? &Bio::Search::BlastUtils::collapse_nums(@ary) : @ary;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1616 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1617
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1618
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1619 =head2 get_aln
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1620
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1621 Usage : $hsp->get_aln()
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1622 Purpose : Get a Bio::SimpleAlign object constructed from the query + sbjct
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1623 : sequences of the present HSP object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1624 Example : $aln_obj = $hsp->get_aln();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1625 Returns : Object reference for a Bio::SimpleAlign.pm object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1626 Argument : n/a.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1627 Throws : Propagates any exception ocurring during the construction of
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1628 : the Bio::SimpleAlign object.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1629 Comments : Requires Bio::SimpleAlign.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1630 : The Bio::SimpleAlign object is constructed from the query + sbjct
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1631 : sequence objects obtained by calling seq().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1632 : Gap residues are included (see $GAP_SYMBOL).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1633
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1634 See Also : L<seq()|seq>, L<Bio::SimpleAlign>
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1635
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1636 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1637
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1638 #------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1639 sub get_aln {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1640 #------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1641 my $self = shift;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1642
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1643 require Bio::SimpleAlign;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1644 require Bio::LocatableSeq;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1645 my $qseq = $self->seq('query');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1646 my $sseq = $self->seq('sbjct');
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1647
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1648 my $type = $self->algorithm =~ /P$|^T/ ? 'amino' : 'dna';
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1649 my $aln = new Bio::SimpleAlign();
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1650 $aln->add_seq(new Bio::LocatableSeq(-seq => $qseq->seq(),
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1651 -id => 'query_'.$qseq->display_id(),
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1652 -start => 1,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1653 -end => CORE::length($qseq)));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1654
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1655 $aln->add_seq(new Bio::LocatableSeq(-seq => $sseq->seq(),
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1656 -id => 'hit_'.$sseq->display_id(),
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1657 -start => 1,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1658 -end => CORE::length($sseq)));
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1659
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1660 return $aln;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1661 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1662
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1663
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1664 1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1665 __END__
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1666
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1667
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1668 =head1 FOR DEVELOPERS ONLY
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1669
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1670 =head2 Data Members
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1671
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1672 Information about the various data members of this module is provided for those
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1673 wishing to modify or understand the code. Two things to bear in mind:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1674
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1675 =over 4
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1676
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1677 =item 1 Do NOT rely on these in any code outside of this module.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1678
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1679 All data members are prefixed with an underscore to signify that they are private.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1680 Always use accessor methods. If the accessor doesn't exist or is inadequate,
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1681 create or modify an accessor (and let me know, too!).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1682
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1683 =item 2 This documentation may be incomplete and out of date.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1684
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1685 It is easy for these data member descriptions to become obsolete as
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1686 this module is still evolving. Always double check this info and search
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1687 for members not described here.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1688
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1689 =back
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1690
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1691 An instance of Bio::Search::HSP::BlastHSP.pm is a blessed reference to a hash containing
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1692 all or some of the following fields:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1693
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1694 FIELD VALUE
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1695 --------------------------------------------------------------
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1696 (member names are mostly self-explanatory)
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1697
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1698 _score :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1699 _bits :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1700 _p :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1701 _n : Integer. The 'N' value listed in parenthesis with P/Expect value:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1702 : e.g., P(3) = 1.2e-30 ---> (N = 3).
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1703 : Not defined in NCBI Blast2 with gaps.
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1704 : To obtain the number of HSPs, use Bio::Search::Hit::BlastHit::num_hsps().
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1705 _expect :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1706 _queryLength :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1707 _queryGaps :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1708 _queryStart :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1709 _queryStop :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1710 _querySeq :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1711 _sbjctLength :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1712 _sbjctGaps :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1713 _sbjctStart :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1714 _sbjctStop :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1715 _sbjctSeq :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1716 _matchSeq : String. Contains the symbols between the query and sbjct lines
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1717 which indicate identical (letter) and conserved ('+') matches
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1718 or a mismatch (' ').
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1719 _numIdentical :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1720 _numConserved :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1721 _identicalRes_query :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1722 _identicalRes_sbjct :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1723 _conservedRes_query :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1724 _conservedRes_sbjct :
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1725 _match_indent : The number of leading space characters on each line containing
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1726 the match symbols. _match_indent is 13 in this example:
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1727 Query: 285 QNSAPWGLARISHRERLNLGSFNKYLYDDDAG
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1728 Q +APWGLARIS G+ + Y YD+ AG
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1729 ^^^^^^^^^^^^^
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1730
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1731
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1732 =cut
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1733
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1734 1;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
1735