annotate variant_effect_predictor/Bio/LocatableSeq.pm @ 3:d30fa12e4cc5 default tip

Merge heads 2:a5976b2dce6f and 1:09613ce8151e which were created as a result of a recently fixed bug.
author devteam <devteam@galaxyproject.org>
date Mon, 13 Jan 2014 10:38:30 -0500
parents 1f6dce3d34e0
children
rev   line source
0
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
# $Id: LocatableSeq.pm,v 1.22.2.1 2003/03/31 11:49:51 heikki Exp $
#
# BioPerl module for Bio::LocatableSeq
#
# Cared for by Ewan Birney <birney@sanger.ac.uk>
#
# Copyright Ewan Birney
#
# You may distribute this module under the same terms as perl itself

# POD documentation - main docs before the code

=head1 NAME

Bio::LocatableSeq - A Sequence object with start/end points on it
that can be projected into a MSA or have coordinates relative to another seq.

=head1 SYNOPSIS


    use Bio::LocatableSeq;
    my $seq = Bio::LocatableSeq->new(-seq   => "CAGT-GGT",
                                     -id    => "seq1",
                                     -start => 1,
                                     -end   => 7);


=head1 DESCRIPTION


    # a normal sequence object
    $locseq->seq();
    $locseq->id();

    # has start,end points
    $locseq->start();
    $locseq->end();

    # inherits from RangeI, so range operations are possible

=head1 FEEDBACK


=head2 Mailing Lists


User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bioperl.org             - General discussion
  http://bio.perl.org/MailList.html - About the mailing lists


The locatable sequence object was developed mainly because the
SimpleAlign object requires this functionality, and in the rewrite
of the Sequence object we had to decide what to do with this.

It is, to be honest, not well integrated with the rest of bioperl. For
example, the trunc() function does not return a LocatableSeq object,
as some might have thought. There are all sorts of nasty gotchas
about interactions between coordinate systems when these sorts of
objects are used.


=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via email
or the web:

  bioperl-bugs@bio.perl.org
  http://bugzilla.bioperl.org/


=head1 APPENDIX

The rest of the documentation details each of the object
methods. Internal methods are usually preceded with a _

=cut

#'
# Let the code begin...

package Bio::LocatableSeq;
use vars qw(@ISA);
use strict;

use Bio::PrimarySeq;
use Bio::RangeI;
use Bio::Location::Simple;
use Bio::Location::Fuzzy;


@ISA = qw(Bio::PrimarySeq Bio::RangeI);

sub new {
    my ($class, @args) = @_;
    my $self = $class->SUPER::new(@args);

    my ($start,$end,$strand) =
        $self->_rearrange( [qw(START END STRAND)],
                           @args);

    defined $start && $self->start($start);
    defined $end   && $self->end($end);
    defined $strand && $self->strand($strand);

    return $self; # success - we hope!
}

=head2 start

 Title   : start
 Usage   : $obj->start($newval)
 Function:
 Returns : value of start
 Args    : newvalue (optional)

=cut

sub start{
    my $self = shift;
    if( @_ ) {
        my $value = shift;
        $self->{'start'} = $value;
    }
    return $self->{'start'};

}

=head2 end

 Title   : end
 Usage   : $obj->end($newval)
 Function:
 Returns : value of end
 Args    : newvalue (optional)
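
When a sequence and a start are already set, the end() setter compares
the supplied value with the end implied by the residue count, that is
start + (number of non-gap characters) - 1. The arithmetic can be
sketched on its own, without BioPerl, roughly as follows. The helper
name implied_end is hypothetical and is not part of Bio::LocatableSeq.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::LocatableSeq: compute the end
# coordinate implied by a start position and a gapped sequence string.
sub implied_end {
    my ($start, $gapped) = @_;
    (my $residues = $gapped) =~ s/[.-]+//g;   # drop '.' and '-' gap characters
    return $start + length($residues) - 1;
}

print implied_end(1,  "CAGT-GGT"),   "\n";   # 7 residues starting at 1
print implied_end(91, "AC..DEF.GH"), "\n";   # 7 residues starting at 91
```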

=cut

sub end {
    my $self = shift;
    if( @_ ) {
        my $value = shift;
        my $string = $self->seq;
        if ($string and $self->start) {
            my $s2 = $string;
            $string =~ s/[.-]+//g;
            my $len = CORE::length $string;
            my $new_end = $self->start + $len - 1 ;
            my $id = $self->id;
            $self->warn("In sequence $id residue count gives value $len.
Overriding value [$value] with value $new_end for Bio::LocatableSeq::end().")
                and $value = $new_end if $new_end != $value and $self->verbose > 0;
        }

        $self->{'end'} = $value;
    }
    return $self->{'end'};

}

=head2 strand

 Title   : strand
 Usage   : $obj->strand($newval)
 Function:
 Returns : value of strand
 Args    : newvalue (optional)

=cut

sub strand{
    my $self = shift;
    if( @_ ) {
        my $value = shift;
        $self->{'strand'} = $value;
    }
    return $self->{'strand'};
}

=head2 get_nse

 Title   : get_nse
 Usage   :
 Function: read-only name of form id/start-end
 Example :
 Returns :
 Args    :
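
The id/start-end form can be illustrated without BioPerl. The sketch
below assumes the same default separators as get_nse ("/" and "-") and
the same requirement that id, start and end are all set. The function
name format_nse is hypothetical and is not part of Bio::LocatableSeq.

```perl
use strict;
use warnings;

# Hypothetical standalone helper mirroring the id/start-end format.
# The two optional separators default to "/" and "-".
sub format_nse {
    my ($id, $start, $end, $char1, $char2) = @_;
    $char1 ||= "/";
    $char2 ||= "-";
    die "Attribute id not set\n"    unless $id;
    die "Attribute start not set\n" unless $start;
    die "Attribute end not set\n"   unless $end;
    return $id . $char1 . $start . $char2 . $end;
}

print format_nse("seq1", 1, 7), "\n";              # default separators
print format_nse("seq1", 1, 7, ":", ".."), "\n";   # custom separators
```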

=cut

sub get_nse{
    my ($self,$char1,$char2) = @_;

    $char1 ||= "/";
    $char2 ||= "-";

    $self->throw("Attribute id not set") unless $self->id();
    $self->throw("Attribute start not set") unless $self->start();
    $self->throw("Attribute end not set") unless $self->end();

    return $self->id() . $char1 . $self->start . $char2 . $self->end ;

}


=head2 no_gaps

 Title   : no_gaps
 Usage   : $self->no_gaps('.')
 Function:

           Gets the number of gaps in the sequence. A run of
           consecutive gap characters counts as one gap. The count
           excludes leading or trailing gap characters.

           Valid bioperl sequence characters are [A-Za-z\-\.\*]. Of
           these, '.' and '-' are counted as gap characters unless an
           optional argument specifies one of them.

 Returns : number of internal gaps in the sequence.
 Args    : a gap character (optional)
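
A minimal standalone sketch of this counting rule, written without
BioPerl: leading and trailing gap characters are stripped first, then
each remaining run of gap characters is counted once. The function name
count_internal_gaps is hypothetical and is not part of
Bio::LocatableSeq. Unlike the method, the sketch quotes the gap
characters before using them in a character class.

```perl
use strict;
use warnings;

# Hypothetical standalone helper: count internal gap runs in a string.
sub count_internal_gaps {
    my ($seq, $char) = @_;
    $char = '-.' unless defined $char && length $char;   # default gap characters
    return 0 unless defined $seq && length $seq;         # empty sequence has no gaps

    $seq =~ s/^[\Q$char\E]+//;    # strip leading gap characters
    $seq =~ s/[\Q$char\E]+$//;    # strip trailing gap characters

    my $count = 0;
    $count++ while $seq =~ /[\Q$char\E]+/g;   # one count per run of gaps
    return $count;
}

print count_internal_gaps("--AC..DEF.GH-"), "\n";   # runs ".." and "."
print count_internal_gaps("A-C-G", "-"), "\n";      # two single-character runs
```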

=cut

sub no_gaps {
    my ($self,$char) = @_;
    my ($seq, $count) = (undef, 0);

    # default gap characters
    $char ||= '-.';

    $self->warn("I hope you know what you are doing setting gap to [$char]")
        unless $char =~ /[-.]/;

    $seq = $self->seq;
    return 0 unless $seq; # empty sequence does not have gaps

    $seq =~ s/^([$char]+)//;
    $seq =~ s/([$char]+)$//;
    $count++ while $seq =~ /[$char]+/g;

    return $count;

}


=head2 column_from_residue_number

 Title   : column_from_residue_number
 Usage   : $col = $seq->column_from_residue_number($resnumber)
 Function:

           This function gives the position in the alignment
           (i.e. column number) of the given residue number in the
           sequence. For example, for the sequence

             Seq1/91-97 AC..DEF.GH

           column_from_residue_number(94) returns 6.

           An exception is thrown if the residue number would lie
           outside the length of the alignment
           (e.g. column_from_residue_number(22) for the sequence above).

 Returns : A column number for the position of the
           given residue in the given sequence (1 = first column)
 Args    : A residue number in the whole sequence (not just that
           segment of it in the alignment)
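
The mapping can be sketched without BioPerl by walking the gapped
string and counting only non-gap characters, starting from the start
coordinate. The function name column_for_residue and its argument
order are hypothetical. This is a sketch of the idea, not the method
itself.

```perl
use strict;
use warnings;

# Hypothetical standalone helper: return the 1-based alignment column
# of residue number $resnumber in a gapped string starting at $start.
sub column_for_residue {
    my ($gapped, $start, $resnumber) = @_;
    my $count = $start;                  # residue number of the next non-gap character
    my @chars = split //, $gapped;
    for my $i (0 .. $#chars) {
        next if $chars[$i] eq '.' or $chars[$i] eq '-';   # gaps carry no residue number
        return $i + 1 if $count == $resnumber;            # columns are 1-based
        $count++;
    }
    die "Could not find residue number $resnumber\n";
}

# Seq1/91-97 AC..DEF.GH: residues A=91, C=92, D=93, E=94, ...
print column_for_residue("AC..DEF.GH", 91, 94), "\n";   # E sits in column 6
```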

=cut

sub column_from_residue_number {
    my ($self, $resnumber) = @_;

    $self->throw("Residue number has to be a positive integer, not [$resnumber]")
        unless $resnumber =~ /^\d+$/ and $resnumber > 0;

    if ($resnumber >= $self->start() and $resnumber <= $self->end()) {
        my @residues = split //, $self->seq;
        my $count = $self->start();
        my $i;
        for ($i=0; $i < @residues; $i++) {
            if ($residues[$i] ne '.' and $residues[$i] ne '-') {
                $count == $resnumber and last;
                $count++;
            }
        }
        # $i now holds the index of the column.
        # The actual column number is this index + 1

        return $i+1;
    }

    $self->throw("Could not find residue number $resnumber");

}


=head2 location_from_column

 Title   : location_from_column
 Usage   : $loc = $seq->location_from_column($column_number)
 Function:

           This function gives the residue number in this sequence
           for a given position in the alignment (i.e. column
           number). Gaps complicate this process and force the output
           to be a L<Bio::Range> where values can be undefined. For
           example, for the alignment

             Seq1/91-96 AC..DEF.G.
             Seq2/1-9   .CYHDEFKGK

           location_from_column( Seq1/91-96, 2 ) position 92
           location_from_column( Seq1/91-96, 3 ) position 92^93
           location_from_column( Seq1/91-96, 10) position 96^97
           location_from_column( Seq2/1-9,   1 ) position undef

           An exact position returns a Bio::Location::Simple object
           where location_type() returns 'EXACT'. If a position
           is between bases, location_type() returns 'IN-BETWEEN'.
           A column before the first residue returns undef when the
           sequence starts at 1. Note that if the position is after
           the last residue in the alignment, there is no guarantee
           that the original sequence has residues after that position.

           An exception is thrown if the column number is not within
           the sequence.

 Returns : Bio::Location::Simple or undef
 Args    : A column number
 Throws  : If column is not within the sequence

See L<Bio::Location::Simple> for more.
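
The column-to-residue rule can be sketched without BioPerl by counting
the non-gap characters up to and including the column. The sketch
returns plain strings ("92" for an exact position, "92^93" for a
position between residues) instead of Bio::Location::Simple objects.
The function name location_for_column and the string notation are
assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical standalone helper: map a 1-based alignment column to a
# residue position, given the gapped string and its start coordinate.
sub location_for_column {
    my ($gapped, $start, $column) = @_;
    die "Column number has to be a positive integer, not [$column]\n"
        unless $column =~ /^\d+$/ and $column > 0;
    die "Column number [$column] is larger than sequence length\n"
        unless $column <= length $gapped;

    # Count residues (non-gap characters) up to and including the column.
    (my $prefix = substr($gapped, 0, $column)) =~ s/\W//g;
    my $pos  = length $prefix;
    my $here = $start + $pos - 1;

    return "$here" if substr($gapped, $column - 1, 1) =~ /[a-zA-Z]/;   # exact
    return if $pos == 0 and $start == 1;            # before the first residue
    return $here . '^' . ($here + 1);               # between two residues
}

for my $case (["AC..DEF.G.", 91, 2], ["AC..DEF.G.", 91, 3],
              ["AC..DEF.G.", 91, 10], [".CYHDEFKGK", 1, 1]) {
    my $loc = location_for_column(@$case);
    print "column $case->[2]: ", (defined $loc ? $loc : 'undef'), "\n";
}
```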

=cut

sub location_from_column {
    my ($self, $column) = @_;

    $self->throw("Column number has to be a positive integer, not [$column]")
        unless $column =~ /^\d+$/ and $column > 0;
    $self->throw("Column number [$column] is larger than".
                 " sequence length [". $self->length. "]")
        unless $column <= $self->length;

    my ($loc);
    my $s = $self->subseq(1,$column);
    $s =~ s/\W//g;
    my $pos = CORE::length $s;

    my $start = $self->start || 0 ;
    if ($self->subseq($column, $column) =~ /[a-zA-Z]/ ) {
        $loc = new Bio::Location::Simple
            (-start => $pos + $start - 1,
             -end => $pos + $start - 1,
             -strand => 1
             );
    }
    elsif ($pos == 0 and $self->start == 1) {
        # column lies before the first residue: leave $loc undefined
    } else {
        $loc = new Bio::Location::Simple
            (-start => $pos + $start - 1,
             -end => $pos +1 + $start - 1,
             -strand => 1,
             -location_type => 'IN-BETWEEN'
             );
    }
    return $loc;
}

=head2 revcom

 Title   : revcom
 Usage   : $rev = $seq->revcom()
 Function: Produces a new Bio::LocatableSeq object which
           has the reversed complement of the sequence. For protein
           sequences this throws an exception of "Sequence is a
           protein. Cannot revcom"

 Returns : A new Bio::LocatableSeq object
 Args    : none

=cut

sub revcom {
    my ($self) = @_;

    my $new = $self->SUPER::revcom;
    $new->strand($self->strand * -1);
    $new->start($self->start) if $self->start;
    $new->end($self->end) if $self->end;
    return $new;
}


=head2 trunc

 Title   : trunc
 Usage   : $subseq = $myseq->trunc(10,100);
 Function: Provides a truncation of a sequence, with start and end
           mapped from the given columns to residue coordinates.

 Example :
 Returns : a fresh Bio::PrimarySeqI implementing object
 Args    : Two integers denoting first and last columns of the
           sequence to be included into sub-sequence.


=cut

sub trunc {

    my ($self, $start, $end) = @_;
    my $new = $self->SUPER::trunc($start, $end);

    $start = $self->location_from_column($start);
    $start ? ($start = $start->start) : ($start = 1);

    $end = $self->location_from_column($end);
    $end = $end->start if $end;

    $new->strand($self->strand);
    $new->start($start) if $start;
    $new->end($end) if $end;

    return $new;
}

1;