annotate variant_effect_predictor/Bio/Tree/RandomFactory.pm @ 0:21066c0abaf5 draft

Uploaded
author willmclaren
date Fri, 03 Aug 2012 10:04:48 -0400
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1 # $Id: RandomFactory.pm,v 1.8 2002/12/24 17:52:03 jason Exp $
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
2 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
3 # BioPerl module for Bio::Tree::RandomFactory
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
4 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
5 # Cared for by Jason Stajich <jason@bioperl.org>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
6 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
7 # Copyright Jason Stajich
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
8 #
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
9 # You may distribute this module under the same terms as perl itself
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
10
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
11 # POD documentation - main docs before the code
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
12
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
13 =head1 NAME
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
14
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
15 Bio::Tree::RandomFactory - TreeFactory for generating Random Trees
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
16
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
17 =head1 SYNOPSIS
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
18
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
19 use Bio::Tree::RandomFactory
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
20 my $factory = new Bio::Tree::RandomFactory( -samples => \@taxonnames,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
21 -maxcount => 10);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
22
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
23 # or for anonymous samples
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
24
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
25 my $factory = new Bio::Tree::RandomFactory( -sample_size => 6,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
26 -maxcount = 50);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
27
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
28 =head1 DESCRIPTION
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
29
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
30 Builds a random tree every time next_tree is called or up to -maxcount times.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
31
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
32 This algorithm is based on the make_tree algorithm from Richard Hudson 1990.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
33
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
34 Hudson, R. R. 1990. Gene genealogies and the coalescent
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
35 process. Pp. 1-44 in D. Futuyma and J. Antonovics, eds. Oxford
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
36 surveys in evolutionary biology. Vol. 7. Oxford University
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
37 Press, New York
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
38
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
39
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
40 =head1 FEEDBACK
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
41
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
42 =head2 Mailing Lists
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
43
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
44 User feedback is an integral part of the evolution of this and other
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
45 Bioperl modules. Send your comments and suggestions preferably to
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
46 the Bioperl mailing list. Your participation is much appreciated.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
47
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
48 bioperl-l@bioperl.org - General discussion
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
49 http://bioperl.org/MailList.shtml - About the mailing lists
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
50
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
51 =head2 Reporting Bugs
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
52
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
53 Report bugs to the Bioperl bug tracking system to help us keep track
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
54 of the bugs and their resolution. Bug reports can be submitted via
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
55 the web:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
56
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
57 http://bugzilla.bioperl.org/
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
58
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
59
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
60 =head1 AUTHOR - Jason Stajich
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
61
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
62 Email jason@bioperl.org
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
63
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
64 =head1 CONTRIBUTORS
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
65
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
66 Matthew Hahn, E<lt>matthew.hahn@duke.eduE<gt>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
67
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
68 =head1 APPENDIX
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
69
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
70 The rest of the documentation details each of the object methods.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
71 Internal methods are usually preceded with a _
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
72
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
73 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
74
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
75
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
76 # Let the code begin...
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
77
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
78
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
79 package Bio::Tree::RandomFactory;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
80 use vars qw(@ISA $PRECISION_DIGITS);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
81 use strict;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
82
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
83 BEGIN {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
84 $PRECISION_DIGITS = 3; # Precision for the branchlength
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
85 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
86
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
87 use Bio::Factory::TreeFactoryI;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
88 use Bio::Root::Root;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
89 use Bio::TreeIO::TreeEventBuilder;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
90 use Bio::Tree::AlleleNode;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
91
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
92 @ISA = qw(Bio::Root::Root Bio::Factory::TreeFactoryI );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
93
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
94 =head2 new
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
95
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
96 Title : new
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
97 Usage : my $factory = new Bio::Tree::RandomFactory(-samples => \@samples,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
98 -maxcount=> $N);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
99 Function: Initializes a Bio::Tree::RandomFactory object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
100 Returns : Bio::Tree::RandomFactory
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
101 Args :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
102
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
103
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
104 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
105
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
106 sub new{
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
107 my ($class,@args) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
108 my $self = $class->SUPER::new(@args);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
109
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
110 $self->{'_eventbuilder'} = new Bio::TreeIO::TreeEventBuilder();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
111 $self->{'_treecounter'} = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
112 $self->{'_maxcount'} = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
113 my ($maxcount, $samps,$samplesize ) = $self->_rearrange([qw(MAXCOUNT
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
114 SAMPLES
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
115 SAMPLE_SIZE)],
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
116 @args);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
117 my @samples;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
118
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
119 if( ! defined $samps ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
120 if( ! defined $samplesize || $samplesize <= 0 ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
121 $self->throw("Must specify a valid samplesize if parameter -SAMPLE is not specified");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
122 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
123 foreach ( 1..$samplesize ) { push @samples, "Samp$_"; }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
124 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
125 if( ref($samps) =~ /ARRAY/i ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
126 $self->throw("Must specify a valid ARRAY reference to the parameter -SAMPLES, did you forget a leading '\\'?");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
127 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
128 @samples = @$samps;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
129 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
130
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
131 $self->samples(\@samples);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
132 $self->sample_size(scalar @samples);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
133 if( defined $maxcount ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
134 $self->maxcount($maxcount);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
135 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
136 return $self;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
137 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
138
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
139 =head2 next_tree
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
140
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
141 Title : next_tree
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
142 Usage : my $tree = $factory->next_tree
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
143 Function: Returns a random tree based on the initialized number of nodes
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
144 NOTE: if maxcount is not specified on initialization or
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
145 set to a valid integer, subsequent calls to next_tree will
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
146 continue to return random trees and never return undef
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
147
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
148 Returns : Bio::Tree::TreeI object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
149 Args : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
150
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
151 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
152
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
153 sub next_tree{
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
154 my ($self) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
155 # If maxcount is set to something non-zero then next tree will
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
156 # continue to return valid trees until maxcount is reached
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
157 # otherwise will always return trees
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
158 return undef if( $self->maxcount &&
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
159 $self->{'_treecounter'}++ >= $self->maxcount );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
160 my $size = $self->sample_size;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
161
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
162 my $in;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
163 my @tree = ();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
164 my @list = ();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
165
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
166 for($in=0;$in < 2*$size -1; $in++ ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
167 push @tree, { 'nodenum' => "Node$in" };
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
168 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
169 # in C we would have 2 arrays
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
170 # an array of nodes (tree)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
171 # and array of pointers to these nodes (list)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
172 # and we just shuffle the list items to do the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
173 # tree topology generation
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
174 # instead in perl, we will have a list of hashes (nodes) called @tree
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
175 # and a list of integers representing the indexes in tree called @list
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
176
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
177 for($in=0;$in < $size;$in++) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
178 $tree[$in]->{'time'} = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
179 $tree[$in]->{'desc1'} = undef;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
180 $tree[$in]->{'desc2'} = undef;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
181 push @list, $in;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
182 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
183
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
184 my $t=0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
185 # generate times for the nodes
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
186 for($in = $size; $in > 1; $in-- ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
187 $t+= -2.0 * log(1 - $self->random(1)) / ( $in * ($in-1) );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
188 $tree[2 * $size - $in]->{'time'} =$t;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
189 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
190 # topology generation
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
191 for ($in = $size; $in > 1; $in-- ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
192 my $pick = int $self->random($in);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
193 my $nodeindex = $list[$pick];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
194 my $swap = 2 * $size - $in;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
195 $tree[$swap]->{'desc1'} = $nodeindex;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
196 $list[$pick] = $list[$in-1];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
197 $pick = int rand($in - 1);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
198 $nodeindex = $list[$pick];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
199 $tree[$swap]->{'desc2'} = $nodeindex;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
200 $list[$pick] = $swap;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
201 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
202 # Let's convert the hashes into nodes
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
203
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
204 my @nodes = ();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
205 foreach my $n ( @tree ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
206 push @nodes,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
207 new Bio::Tree::AlleleNode(-id => $n->{'nodenum'},
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
208 -branch_length => $n->{'time'});
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
209 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
210 my $ct = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
211 foreach my $node ( @nodes ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
212 my $n = $tree[$ct++];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
213 if( defined $n->{'desc1'} ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
214 $node->add_Descendent($nodes[$n->{'desc1'}]);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
215 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
216 if( defined $n->{'desc2'} ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
217 $node->add_Descendent($nodes[$n->{'desc2'}]);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
218 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
219 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
220 my $T = new Bio::Tree::Tree(-root => pop @nodes );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
221 return $T;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
222 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
223
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
224 =head2 add_Mutations
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
225
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
226 Title : add_Mutations
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
227 Usage : $factory->add_Mutations($tree, $mutcount);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
228 Function: Adds mutations to a tree via a random process weighted by
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
229 branch length (it is a poisson distribution
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
230 as part of a coalescent process)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
231 Returns : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
232 Args : $tree - Bio::Tree::TreeI
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
233 $nummut - number of mutations
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
234
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
235
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
236 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
237
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
238 sub add_Mutations{
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
239 my ($self,$tree, $nummut) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
240 my @branches;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
241 my @lens;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
242 my $branchlen = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
243 my $last = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
244 my @nodes = $tree->get_nodes();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
245 my $precision = 10**$PRECISION_DIGITS;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
246 my $i = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
247
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
248 # Jason's somewhat simplistics way of doing a poission
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
249 # distribution for a fixed number of mutations
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
250 # build an array and put the node number in a slot
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
251 # representing the branch to put a mutation on
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
252 # but weight the number of slots per branch by the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
253 # length of the branch ( ancestor's time - node time)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
254
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
255 foreach my $node ( @nodes ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
256 if( $node->ancestor ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
257 my $len = int ( ($node->ancestor->branch_length -
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
258 $node->branch_length) * $precision);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
259 if ( $len > 0 ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
260 for( my $j =0;$j < $len;$j++) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
261 push @branches, $i;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
262 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
263 $last += $len;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
264 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
265 $branchlen += $len;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
266 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
267 if( ! $node->isa('Bio::Tree::AlleleNode') ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
268 bless $node, 'Bio::Tree::AlleleNode'; # rebless it to the right node
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
269 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
270 $node->purge_markers;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
271 $i++;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
272 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
273 # sanity check
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
274 die("branch len is $branchlen arraylen is $last")
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
275 unless ( $branchlen == $last );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
276
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
277 for( my $j = 0; $j < $nummut; $j++) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
278 my $index = int(rand($branchlen));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
279 my $branch = $branches[$index];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
280 $nodes[$branch]->add_alleles("Mutation$j", [1]);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
281 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
282 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
283
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
284 =head2 maxcount
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
285
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
286 Title : maxcount
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
287 Usage : $obj->maxcount($newval)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
288 Function:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
289 Example :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
290 Returns : value of maxcount
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
291 Args : newvalue (optional)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
292
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
293
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
294 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
295
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
296 sub maxcount{
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
297 my ($self,$value) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
298 if( defined $value) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
299 if( $value =~ /^(\d+)/ ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
300 $self->{'maxcount'} = $1;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
301 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
302 $self->warn("Must specify a valid Positive integer to maxcount");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
303 $self->{'maxcount'} = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
304 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
305 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
306 return $self->{'_maxcount'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
307 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
308
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
309 =head2 samples
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
310
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
311 Title : samples
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
312 Usage : $obj->samples($newval)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
313 Function:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
314 Example :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
315 Returns : value of samples
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
316 Args : newvalue (optional)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
317
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
318
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
319 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
320
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
321 sub samples{
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
322 my ($self,$value) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
323 if( defined $value) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
324 if( ref($value) !~ /ARRAY/i ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
325 $self->warn("Must specify a valid array ref to the method 'samples'");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
326 $value = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
327 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
328 $self->{'samples'} = $value;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
329 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
330 return $self->{'samples'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
331
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
332 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
333
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
334 =head2 sample_size
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
335
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
336 Title : sample_size
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
337 Usage : $obj->sample_size($newval)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
338 Function:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
339 Example :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
340 Returns : value of sample_size
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
341 Args : newvalue (optional)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
342
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
343
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
344 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
345
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
346 sub sample_size{
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
347 my ($self,$value) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
348 if( defined $value) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
349 $self->{'sample_size'} = $value;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
350 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
351 return $self->{'sample_size'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
352
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
353 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
354
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
355 =head2 attach_EventHandler
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
356
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
357 Title : attach_EventHandler
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
358 Usage : $parser->attatch_EventHandler($handler)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
359 Function: Adds an event handler to listen for events
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
360 Returns : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
361 Args : Bio::Event::EventHandlerI
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
362
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
363 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
364
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
365 sub attach_EventHandler{
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
366 my ($self,$handler) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
367 return if( ! $handler );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
368 if( ! $handler->isa('Bio::Event::EventHandlerI') ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
369 $self->warn("Ignoring request to attatch handler ".ref($handler). ' because it is not a Bio::Event::EventHandlerI');
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
370 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
371 $self->{'_handler'} = $handler;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
372 return;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
373 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
374
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
375 =head2 _eventHandler
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
376
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
377 Title : _eventHandler
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
378 Usage : private
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
379 Function: Get the EventHandler
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
380 Returns : Bio::Event::EventHandlerI
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
381 Args : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
382
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
383
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
384 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
385
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
386 sub _eventHandler{
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
387 my ($self) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
388 return $self->{'_handler'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
389 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
390
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
391 =head2 random
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
392
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
393 Title : random
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
394 Usage : my $rfloat = $node->random($size)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
395 Function: Generates a random number between 0 and $size
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
396 This is abstracted so that someone can override and provide their
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
397 own special RNG. This is expected to be a uniform RNG.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
398 Returns : Floating point random
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
399 Args : $maximum size for random number (defaults to 1)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
400
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
401
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
402 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
403
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
404 sub random{
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
405 my ($self,$max) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
406 return rand($max);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
407 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
408
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
409 1;