annotate variant_effect_predictor/Bio/EnsEMBL/Funcgen/BindingMatrix.pm @ 0:21066c0abaf5 draft

Uploaded
author willmclaren
date Fri, 03 Aug 2012 10:04:48 -0400
#
# Ensembl module for Bio::EnsEMBL::Funcgen::BindingMatrix
#
# You may distribute this module under the same terms as Perl itself

=head1 LICENSE

  Copyright (c) 1999-2011 The European Bioinformatics Institute and
  Genome Research Limited.  All rights reserved.

  This software is distributed under a modified Apache license.
  For license details, please see

    http://www.ensembl.org/info/about/code_licence.html

=head1 CONTACT

  Please email comments or questions to the public Ensembl
  developers list at <ensembl-dev@ebi.ac.uk>.

  Questions may also be sent to the Ensembl help desk at
  <helpdesk@ensembl.org>.


=head1 NAME

Bio::EnsEMBL::Funcgen::BindingMatrix - A module to represent a BindingMatrix.
In EFG this represents the binding affinities of a Transcription Factor to DNA.

=head1 SYNOPSIS

  use Bio::EnsEMBL::Funcgen::BindingMatrix;

  # $analysis is a Bio::EnsEMBL::Analysis and $feature_type is a
  # Bio::EnsEMBL::Funcgen::FeatureType; the constructor requires both.
  my $matrix = Bio::EnsEMBL::Funcgen::BindingMatrix->new(
    -name         => "MA0122.1",
    -analysis     => $analysis,
    -feature_type => $feature_type,
    -description  => "Nkx3-2 Jaspar Matrix",
  );
  $matrix->frequencies("A [ 4 1 13 24 0 0 6 4 9 ]
                        C [ 7 4 1 0 0 0 0 6 7 ]
                        G [ 4 5 7 0 24 0 18 12 5 ]
                        T [ 9 14 3 0 0 24 0 2 3 ]");

  print $matrix->relative_affinity("TGGCCACCA")."\n";

  print $matrix->threshold."\n";

=head1 DESCRIPTION

This class represents information about a BindingMatrix, containing the name
(e.g. the Jaspar ID, or an internal name) and a description. A BindingMatrix
is always associated with an Analysis (indicating the origin of the matrix,
e.g. Jaspar) and a FeatureType (the binding factor).

=head1 SEE ALSO

Bio::EnsEMBL::Funcgen::DBSQL::BindingMatrixAdaptor

=cut


use strict;
use warnings;

package Bio::EnsEMBL::Funcgen::BindingMatrix;

use Bio::EnsEMBL::Utils::Argument qw( rearrange ) ;
use Bio::EnsEMBL::Utils::Exception qw( throw warning );
use Bio::EnsEMBL::Funcgen::Storable;

use vars qw(@ISA);
@ISA = qw(Bio::EnsEMBL::Funcgen::Storable);

=head2 new

  Arg [-name]        : string - name of Matrix
  Arg [-analysis]    : Bio::EnsEMBL::Analysis - analysis describing how the matrix was obtained
  Arg [-feature_type]: Bio::EnsEMBL::Funcgen::FeatureType - the binding factor
  Arg [-frequencies] : (optional) string - frequencies representing the binding affinities of a Matrix
  Arg [-threshold]   : (optional) float - minimum relative affinity for binding sites of this matrix
  Arg [-description] : (optional) string - description of Matrix
  Example    : my $matrix = Bio::EnsEMBL::Funcgen::BindingMatrix->new(
                 -name         => "MA0122.1",
                 -analysis     => $analysis,
                 -feature_type => $feature_type,
                 -description  => "Jaspar Matrix",
               );
  Description: Constructor method for BindingMatrix class
  Returntype : Bio::EnsEMBL::Funcgen::BindingMatrix
  Exceptions : Throws if name is not defined, or if analysis or
               feature_type is not a valid object
  Caller     : General
  Status     : Medium risk

=cut

sub new {
  my $caller = shift;

  my $obj_class = ref($caller) || $caller;
  my $self = $obj_class->SUPER::new(@_);

  my ( $name, $analysis, $freq, $desc, $ftype, $thresh ) = rearrange(
    [ 'NAME', 'ANALYSIS', 'FREQUENCIES', 'DESCRIPTION', 'FEATURE_TYPE', 'THRESHOLD' ],
    @_
  );

  if(! defined $name){
    throw("Must supply a name\n");
  }

  # The is_stored test for both objects is left to the adaptor
  if(! ((ref $analysis) && $analysis->isa('Bio::EnsEMBL::Analysis') )){
    throw("You must define a valid Bio::EnsEMBL::Analysis");
  }

  if(! (ref($ftype) && $ftype->isa('Bio::EnsEMBL::Funcgen::FeatureType'))){
    throw("You must define a valid Bio::EnsEMBL::Funcgen::FeatureType");
  }

  $self->name($name);
  $self->{analysis}     = $analysis;
  $self->{feature_type} = $ftype;
  $self->frequencies($freq) if $freq;
  $self->description($desc) if $desc;
  # Use defined() so that a threshold of 0 is not silently dropped
  $self->threshold($thresh) if defined $thresh;

  return $self;
}
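=pod

The constructor above depends on rearrange() to turn C<< -name => value >>
pairs into a fixed positional order. The following is a simplified stand-in
for illustration only, not the Bio::EnsEMBL::Utils::Argument implementation.
The name rearrange_sketch is hypothetical, and the sketch assumes that keys
are matched case-insensitively with the leading dash ignored.

```perl
use strict;
use warnings;

# Hypothetical stand-in for rearrange(): maps '-name => value' pairs onto
# the order given in $order, ignoring case and the leading dash.
sub rearrange_sketch {
    my ($order, @args) = @_;
    my %given;
    while (@args) {
        my ($key, $value) = splice(@args, 0, 2);
        $key =~ s/^-//;
        $given{ uc $key } = $value;
    }
    # Arguments that were not supplied come back as undef.
    return map { $given{$_} } @$order;
}

my ($name, $freq, $desc) = rearrange_sketch(
    [ 'NAME', 'FREQUENCIES', 'DESCRIPTION' ],
    -description => 'Nkx3-2 Jaspar Matrix',
    -name        => 'MA0122.1',
);

print "name: $name\n";
print "frequencies: ", (defined $freq ? $freq : 'undef'), "\n";
```

Because omitted arguments arrive as undef, the constructor has to check the
mandatory ones explicitly, which is what the defined() and isa() tests do.

=cut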


=head2 feature_type

  Example    : my $ft_name = $matrix->feature_type()->name();
  Description: Getter for the feature_type attribute for this matrix.
  Returntype : Bio::EnsEMBL::Funcgen::FeatureType
  Exceptions : None
  Caller     : General
  Status     : At risk

=cut

sub feature_type {
  my $self = shift;

  return $self->{'feature_type'};
}


=head2 name

  Arg [1]    : (optional) string - name
  Example    : my $name = $matrix->name();
  Description: Getter and setter of name attribute
  Returntype : string
  Exceptions : None
  Caller     : General
  Status     : Low Risk

=cut

sub name {
  my $self = shift;
  $self->{'name'} = shift if @_;
  return $self->{'name'};
}

=head2 description

  Arg [1]    : (optional) string - description
  Example    : my $desc = $matrix->description();
  Description: Getter and setter of description attribute
  Returntype : string
  Exceptions : None
  Caller     : General
  Status     : Low Risk

=cut

sub description {
  my $self = shift;
  $self->{'description'} = shift if @_;
  return $self->{'description'};
}

=head2 threshold

  Arg [1]    : (optional) float - threshold
  Example    : my $thresh = $matrix->threshold();
  Description: Getter and setter of threshold attribute
  Returntype : float
  Exceptions : None
  Caller     : General
  Status     : At Risk

=cut

sub threshold {
  my $self = shift;
  $self->{'threshold'} = shift if @_;
  return $self->{'threshold'};
}


=head2 analysis

  Example    : $matrix->analysis()->logic_name();
  Description: Getter for the analysis attribute for this matrix.
  Returntype : Bio::EnsEMBL::Analysis
  Exceptions : None
  Caller     : General
  Status     : At risk

=cut

sub analysis {
  my $self = shift;

  return $self->{'analysis'};
}


=head2 frequencies

  Arg [1]    : (optional) string - frequencies
  Example    : $matrix->frequencies($frequencies_string);
  Description: Getter and setter of frequencies attribute

               The attribute is a string representing the matrix binding
               affinities in the Jaspar format. E.g.
               ">
               [ ]
               "

  Returntype : string
  Exceptions : Throws if the string attribute is not a properly
               formed matrix in the Jaspar format
  Caller     : General
  Status     : At Risk

=cut

sub frequencies {
  my $self = shift;

  # Plain 'my $x = shift;' avoids the unreliable 'my ... if' construct
  my $frequencies = shift;
  if($frequencies){
    $self->_weights($frequencies);
    $self->{'frequencies'} = $frequencies;
  }
  return $self->{'frequencies'};
}
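=pod

The setter above hands the string to _weights() for validation before
storing it. As an illustration of the kind of check involved, the sketch
below parses rows in the C<A [ 4 1 13 ... ]> layout used in the SYNOPSIS.
It is not the module's own parser: parse_frequencies_sketch is a
hypothetical name, and the exact rules (optional header line starting with
">", four rows, equal length, integer counts) are assumptions drawn from the
documentation of _weights().

```perl
use strict;
use warnings;

# Hypothetical parser: turns rows such as "A [ 4 1 13 ]" into a hash of
# array references keyed by base, dying on anything malformed.
sub parse_frequencies_sketch {
    my ($string) = @_;
    $string =~ s/^>.*?\n//;    # drop an optional leading header line

    my %rows;
    for my $line (split /\n/, $string) {
        next unless $line =~ /\S/;    # skip blank lines
        my ($base, $counts) = $line =~ /^\s*([ACGT])\s*\[\s*(.*?)\s*\]\s*$/
            or die "Malformed matrix line: $line\n";
        my @counts = split /\s+/, $counts;
        die "Non-integer count in row $base\n" if grep { !/^\d+$/ } @counts;
        $rows{$base} = \@counts;
    }

    # All four bases must be present, with the same non-zero number of columns.
    my @lengths = map { scalar @{ $rows{$_} || [] } } qw(A C G T);
    die "Need four rows of equal, non-zero length\n"
        if grep { $_ == 0 || $_ != $lengths[0] } @lengths;

    return \%rows;
}

my $rows = parse_frequencies_sketch(
    "A [ 4 1 13 24 0 0 6 4 9 ]\n" .
    "C [ 7 4 1 0 0 0 0 6 7 ]\n" .
    "G [ 4 5 7 0 24 0 18 12 5 ]\n" .
    "T [ 9 14 3 0 0 24 0 2 3 ]"
);
print "matrix length: ", scalar @{ $rows->{A} }, "\n";
```

Validating before assignment, as the setter does, means a malformed string
never replaces a previously stored, valid matrix.

=cut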

=head2 frequencies_revcomp

  Example    : $matrix->frequencies_revcomp();
  Description: Getter for the reverse complement frequencies attribute

               The attribute represents the reverse complement of frequencies

  Returntype : string
  Caller     : General
  Status     : At Risk

=cut

sub frequencies_revcomp {
  my $self = shift;

  return $self->{'frequencies_revcomp'};
}
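=pod

The getter above only returns a stored value; how the module fills it is not
shown in this part of the file. For orientation, the sketch below shows the
usual definition of a reverse-complemented count matrix: each row is read
right to left and assigned to the complementary base. It works on parsed
rows rather than on the string attribute, and revcomp_rows_sketch is a
hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse-complement a matrix held as a hash of
# array references keyed by base.
sub revcomp_rows_sketch {
    my ($rows) = @_;
    my %complement = (A => 'T', C => 'G', G => 'C', T => 'A');
    my %revcomp;
    for my $base (keys %complement) {
        # The complementary base takes this row, read right to left.
        $revcomp{ $complement{$base} } = [ reverse @{ $rows->{$base} } ];
    }
    return \%revcomp;
}

my $rc = revcomp_rows_sketch({
    A => [ 1, 2, 3 ],
    C => [ 4, 5, 6 ],
    G => [ 7, 8, 9 ],
    T => [ 10, 11, 12 ],
});
print join(' ', 'A', '[', @{ $rc->{A} }, ']'), "\n";    # A [ 12 11 10 ]
```

=cut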


=head2 relative_affinity

  Arg [1]    : string - Binding Site Sequence
  Arg [2]    : (optional) boolean - 1 if results are to be in linear scale (default is log scale)
  Example    : $matrix->relative_affinity($sequence);
  Description: Calculates the binding affinity of a given sequence
               relative to the optimal site for the matrix.
               The site is taken as if it were in the proper orientation.
               Considers a purely random background p(A)=p(C)=p(G)=p(T)
  Returntype : double
  Exceptions : Throws if the sequence length differs from the matrix length
               or if the sequence has ambiguous bases (N is not accepted)
  Caller     : General
  Status     : At Risk

=cut

sub relative_affinity {
  my ($self, $sequence, $linear) = (shift, shift, shift);
  $sequence =~ s/^\s+//;
  $sequence =~ s/\s+$//;

  throw "No sequence given" if !$sequence;
  $sequence = uc($sequence);
  if($sequence =~ /[^ACGT]/){
    throw "Sequence $sequence contains invalid characters: Only Aa Cc Gg Tt accepted";
  }

  my $weight_matrix = $self->_weights;
  my $matrix_length = scalar(@{$weight_matrix->{'A'}});
  if(length($sequence) != $matrix_length){
    throw "Sequence $sequence does not have length $matrix_length";
  }

  # Sum the per-position weight of the observed base
  my $log_odds = 0;
  my @bases = split(//,$sequence);
  for(my $i=0;$i<$matrix_length;$i++){
    $log_odds += $weight_matrix->{$bases[$i]}->[$i];
  }

  # The log scale may be quite unrealistic, but it is useful for comparisons.
  # Both scales map the score onto the range between _min_bind and _max_bind.
  if(!$linear){
    return ($log_odds - $self->_min_bind) / ($self->_max_bind - $self->_min_bind);
  } else {
    return (exp($log_odds) - exp($self->_min_bind)) / (exp($self->_max_bind) - exp($self->_min_bind));
  }

}
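=pod

The two return statements above rescale a summed score into the interval
between the worst and the best possible site. The sketch below reproduces
that arithmetic on a toy two-position weight table. It is a standalone
illustration under assumptions: relative_affinity_sketch is a hypothetical
name, the weights are invented, and the minimum and maximum are taken to be
the sums of the per-column minima and maxima, which is one plausible reading
of _min_bind and _max_bind (their definitions are not shown here).

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical re-implementation of the rescaling step on plain data.
sub relative_affinity_sketch {
    my ($weights, $sequence, $linear) = @_;
    $sequence = uc $sequence;
    die "Only A, C, G and T are accepted\n" if $sequence =~ /[^ACGT]/;

    my @bases  = split //, $sequence;
    my $length = scalar @{ $weights->{A} };
    die "Sequence must have length $length\n" unless @bases == $length;

    my ($score, $min_bind, $max_bind) = (0, 0, 0);
    for my $i (0 .. $length - 1) {
        my @column = map { $weights->{$_}[$i] } qw(A C G T);
        $score    += $weights->{ $bases[$i] }[$i];    # observed base
        $min_bind += min(@column);                    # worst base here
        $max_bind += max(@column);                    # best base here
    }

    return $linear
        ? (exp($score) - exp($min_bind)) / (exp($max_bind) - exp($min_bind))
        : ($score - $min_bind) / ($max_bind - $min_bind);
}

# Invented weights for a matrix of length 2.
my $weights = {
    A => [  1.0, -1.0 ],
    C => [ -1.0,  0.5 ],
    G => [  0.0,  0.0 ],
    T => [ -2.0,  1.5 ],
};
printf "%.3f\n", relative_affinity_sketch($weights, 'AT');       # best site
printf "%.3f\n", relative_affinity_sketch($weights, 'TA');       # worst site
printf "%.3f\n", relative_affinity_sketch($weights, 'GG', 1);    # linear scale
```

On either scale the best site scores 1 and the worst site scores 0; the
scales differ only in how intermediate sites are spread between the two.

=cut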

=head2 is_position_informative

  Arg [1]    : int - 1-based position within the matrix
  Arg [2]    : (optional) double - threshold [0-2] for information content [default is 1.5]
  Example    : $matrix->is_position_informative($pos);
  Description: Returns true if position information content is over threshold
  Returntype : boolean
  Exceptions : Throws if position or threshold out of bounds
  Caller     : General
  Status     : At High Risk

=cut

sub is_position_informative {
  my ($self,$position,$threshold) = (shift,shift,shift);
  throw "Need a position" if(!defined($position));
  throw "Position out of bounds" if(($position<1) || ($position > $self->length));
  if(!defined($threshold)){ $threshold = 1.5; }
  throw "Threshold out of bounds" if(($threshold<0) || ($threshold>2));
  return ($self->{'ic'}->[$position-1] >= $threshold);
}
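=pod

The method above compares a stored per-position value against a threshold
between 0 and 2. The computation of that value is not shown in this part of
the file. The 0 to 2 range matches the standard Shannon information content
of a four-letter alphabet, in bits, so the sketch below uses that formula as
an assumption. It works on raw counts without a pseudo-count, and
information_content_sketch is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: information content, in bits, of one matrix column
# given the counts for A, C, G and T.
sub information_content_sketch {
    my @counts = @_;
    my $total = 0;
    $total += $_ for @counts;
    die "Column has no observations\n" unless $total > 0;

    my $ic = 2;    # log2(4): the maximum for a four-letter alphabet
    for my $count (@counts) {
        next if $count == 0;    # treat 0 * log(0) as 0
        my $p = $count / $total;
        $ic += $p * log($p) / log(2);
    }
    return $ic;
}

# Columns 4 and 1 of the SYNOPSIS matrix, in A, C, G, T order.
my $conserved = information_content_sketch(24, 0, 0, 0);
my $mixed     = information_content_sketch(4, 7, 4, 9);
print "conserved column is informative\n" if $conserved >= 1.5;
print "mixed column is informative\n"     if $mixed     >= 1.5;
```

A column where one base accounts for every observation reaches the maximum
of 2 bits, while a column with evenly spread counts approaches 0.

=cut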



=head2 length

  Example    : $bm->length();
  Description: Returns the length of the matrix (e.g. 19bp long)
  Returntype : int with the length of this binding matrix
  Exceptions : none
  Caller     : General
  Status     : At Risk

=cut

sub length {
  my $self = shift;

  my $weight_matrix = $self->_weights;

  return scalar(@{$weight_matrix->{'A'}});
}

=head2 _weights

  Arg [1]    : (optional) string - frequencies
  Example    : $self->_weights($frequencies);
  Description: Private getter/setter for the weight matrix, which is derived from the frequencies
  Returntype : HASHREF with the weights of this binding matrix
  Exceptions : Throws if the frequencies attribute string does not correspond
               to 4 rows of an equal number of integer numbers.
  Caller     : Self
  Status     : At Risk

=cut

sub _weights {
  my $self = shift;

  # For the moment use equiprobability and a constant pseudo-count
  my $pseudo = 0.1;

  # TODO allow for it to be passed as parameters?
  my $frequencies = shift;
  if($frequencies){
    # Strip the optional header line (starting with '>'), keeping it for the reverse complement
    my $header = ($frequencies =~ s/^(>.*?\n)//) ? $1 : '';

    # One row per nucleotide, in the order A, C, G, T
    my ($a,$c,$g,$t) = split(/\n/,$frequencies);
    my @As = split(/\s+/,_parse_matrix_line('[A\[\]]',$a));
    my @Cs = split(/\s+/,_parse_matrix_line('[C\[\]]',$c));
    my @Gs = split(/\s+/,_parse_matrix_line('[G\[\]]',$g));
    my @Ts = split(/\s+/,_parse_matrix_line('[T\[\]]',$t));
    if((scalar(@As)!=scalar(@Cs)) || (scalar(@As)!=scalar(@Gs)) || (scalar(@As)!=scalar(@Ts)) ){
      throw("Frequencies provided are not a valid frequency matrix");
    }
    $self->_calc_ic(\@As,\@Cs,\@Gs,\@Ts,$pseudo);

    # Create the reverse complement: reverse each row and swap A with T and C with G
    my @revT = reverse(@As);
    my @revA = reverse(@Ts);
    my @revC = reverse(@Gs);
    my @revG = reverse(@Cs);
    my $revcomp = $header;
    $revcomp.= "A [ ".join("\t",@revA)." ]\n";
    $revcomp.= "C [ ".join("\t",@revC)." ]\n";
    $revcomp.= "G [ ".join("\t",@revG)." ]\n";
    $revcomp.= "T [ ".join("\t",@revT)." ]\n";
    $self->{'frequencies_revcomp'} = $revcomp;

    # Total count per position (matrix column)
    my @totals;
    for(my $i=0;$i<scalar(@As);$i++){
      $totals[$i]=$As[$i]+$Cs[$i]+$Gs[$i]+$Ts[$i];
    }

    my %weights;
    # A distinct background per nucleotide could be allowed (passed as a parameter) instead of 0.25 for all.
    # However, if the matrix was obtained from in-vivo data, the organism's nucleotide bias should not matter.
    # With a pseudo-count of 0.1, the matrix should not be built from very few elements (e.g. <30 is not good).
    my @was = map { log((($As[$_] + $pseudo) / ($totals[$_]+(4*$pseudo))) / 0.25) } 0..$#As;
    $weights{'A'} = \@was;
    my @wcs = map { log((($Cs[$_] + $pseudo) / ($totals[$_]+(4*$pseudo))) / 0.25) } 0..$#Cs;
    $weights{'C'} = \@wcs;
    my @wgs = map { log((($Gs[$_] + $pseudo) / ($totals[$_]+(4*$pseudo))) / 0.25) } 0..$#Gs;
    $weights{'G'} = \@wgs;
    my @wts = map { log((($Ts[$_] + $pseudo) / ($totals[$_]+(4*$pseudo))) / 0.25) } 0..$#Ts;
    $weights{'T'} = \@wts;

    $self->{'weights'} = \%weights;

    # Sum the per-position minimum and maximum weights to get the
    # lowest and highest scores any sequence can obtain
    my $max = 0; my $min = 0;
    for(my $i=0;$i<scalar(@As);$i++){
      my $col = [ $was[$i], $wcs[$i], $wgs[$i], $wts[$i] ];
      $min += _min($col);
      $max += _max($col);
    }

    # Log scale
    $self->_max_bind($max);
    $self->_min_bind($min);
  }

  return $self->{'weights'};

}
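
=pod

A minimal standalone sketch of the log-odds weight computed in _weights for a
single matrix position. The counts are hypothetical and chosen only for
illustration; the pseudo-count of 0.1 and the uniform background of 0.25 are
the constants used in _weights.

  use strict;
  use warnings;

  # Hypothetical counts for one position (not taken from a real matrix)
  my %count  = (A => 10, C => 2, G => 3, T => 5);
  my $pseudo = 0.1;

  my $total = 0;
  $total += $count{$_} for qw(A C G T);

  for my $base (qw(A C G T)) {
    # Pseudo-count-adjusted frequency, then natural-log odds against 0.25
    my $freq   = ($count{$base} + $pseudo) / ($total + 4 * $pseudo);
    my $weight = log($freq / 0.25);
    printf "%s %.3f\n", $base, $weight;
  }

Bases seen more often than the background (A here) get a positive weight,
rarer ones (C and G) get a negative weight, and a base at roughly the
background frequency (T) gets a weight close to zero.

=cut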

=head2 _calc_ic

  Example    : $self->_calc_ic($as,$cs,$gs,$ts,$pseudo);
  Description: Private function to calculate the matrix information content per position
  Caller     : self
  Status     : At Risk

=cut

sub _calc_ic {
  my ($self, $as, $cs, $gs, $ts, $pseudo) = @_;
  my @ic = ();
  for (my $i=0;$i<scalar(@$as);$i++){
    # Pseudo-count-adjusted frequency of each nucleotide at position $i
    my $total_i = $as->[$i] + $cs->[$i] + $gs->[$i] + $ts->[$i] + (4*$pseudo);
    my $fas = ($as->[$i] + $pseudo) / $total_i;
    my $fcs = ($cs->[$i] + $pseudo) / $total_i;
    my $fgs = ($gs->[$i] + $pseudo) / $total_i;
    my $fts = ($ts->[$i] + $pseudo) / $total_i;
    # Information content in bits: 2 minus the Shannon entropy of the column
    my $ic_i = 2 + ($fas * log($fas)/log(2)) + ($fcs * log($fcs)/log(2)) + ($fgs * log($fgs)/log(2)) + ($fts * log($fts)/log(2));
    push @ic, $ic_i;
  }
  $self->{'ic'} = \@ic;
}
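
=pod

A minimal standalone sketch of the per-position information content computed
in _calc_ic, i.e. 2 bits minus the Shannon entropy of the
pseudo-count-adjusted column frequencies. The helper name column_ic and the
counts are hypothetical and not part of this module.

  use strict;
  use warnings;

  # Hypothetical helper: information content (in bits) of one matrix column
  sub column_ic {
    my ($counts, $pseudo) = @_;

    my $total = 4 * $pseudo;
    $total += $_ for @$counts;

    my $ic = 2;
    for my $n (@$counts) {
      my $f = ($n + $pseudo) / $total;
      $ic += $f * log($f) / log(2);   # f * log2(f), never positive
    }
    return $ic;
  }

  printf "%.3f\n", column_ic([5, 5, 5, 5], 0.1);    # uniform column: about 0 bits
  printf "%.3f\n", column_ic([20, 0, 0, 0], 0.1);   # near-invariant column: below but close to 2 bits

A non-zero pseudo-count keeps every frequency above zero, so log() is never
called on 0 even when a nucleotide is absent from a column.

=cut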

# Removes everything matching $pat (the row label and brackets) from a
# matrix line and trims leading and trailing whitespace
sub _parse_matrix_line {
  my ($pat,$line) = (shift,shift);
  $line=~s/$pat//g; $line=~s/^\s+//; $line=~s/\s+$//;
  return $line;
}

sub _max { return _min_max(shift, 0); }

sub _min { return _min_max(shift, 1); }

# Returns the minimum (if $min is true) or the maximum value of an array ref
sub _min_max {
  my ($list,$min) = (shift, shift);
  my $min_max = $list->[0];
  for my $value (@$list) {
    if($min ? $value < $min_max : $value > $min_max){ $min_max = $value; }
  }
  return $min_max;
}


=head2 _max_bind

  Arg [1]    : (optional) double - maximum binding affinity
  Example    : $matrix->_max_bind(10.2);
  Description: Private getter and setter of max_bind attribute (not to be called directly)
  Returntype : float with the maximum binding affinity of the matrix
  Exceptions : None
  Caller     : Self
  Status     : At Risk

=cut

sub _max_bind {
  my $self = shift;

  $self->{'max_bind'} = shift if @_;

  return $self->{'max_bind'};
}

=head2 _min_bind

  Arg [1]    : (optional) double - minimum binding affinity
  Example    : $matrix->_min_bind(-10.2);
  Description: Private getter and setter of min_bind attribute (not to be called directly)
  Returntype : float with the minimum binding affinity of the matrix
  Exceptions : None
  Caller     : Self
  Status     : At Risk

=cut

sub _min_bind {
  my $self = shift;

  $self->{'min_bind'} = shift if @_;

  return $self->{'min_bind'};
}

1;