annotate variant_effect_predictor/Bio/EnsEMBL/Compara/AlignSlice.pm @ 1:d6778b5d8382 draft default tip

Deleted selected files
author willmclaren
date Fri, 03 Aug 2012 10:05:43 -0400
parents 21066c0abaf5
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1 =head1 LICENSE
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
2
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
3 Copyright (c) 1999-2012 The European Bioinformatics Institute and
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
4 Genome Research Limited. All rights reserved.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
5
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
6 This software is distributed under a modified Apache license.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
7 For license details, please see
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
8
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
9 http://www.ensembl.org/info/about/code_licence.html
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
10
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
11 =head1 CONTACT
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
12
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
13 Please email comments or questions to the public Ensembl
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
14 developers list at <dev@ensembl.org>.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
15
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
16 Questions may also be sent to the Ensembl help desk at
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
17 <helpdesk@ensembl.org>.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
18
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
19 =head1 NAME
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
20
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
21 Bio::EnsEMBL::Compara::AlignSlice - An AlignSlice can be used to map genes and features from one species onto another one
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
22
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
23 =head1 DESCRIPTION
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
24
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
25 INTRODUCTION
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
26
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
27 An AlignSlice is an object built with a reference Slice and the corresponding set of genomic alignments.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
28 The genomic alignments are used to map features from one species onto another and viceversa.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
29
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
30 STRUCTURE
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
31
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
32 Every Bio::EnsEMBL::Compara::AlignSlice contains a set of Bio::EnsEMBL::Compara::AlignSlice::Slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
33 objects, at least one by species involved in the alignments. For instance, if the reference Slice is a
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
34 human slice and the set of alignments corresponds to human-mouse BLASTZ_NET alignments, there will be at
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
35 least one Bio::EnsEMBL::Compara::AlignSlice::Slice for human and at least another one for mouse. The main
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
36 Bio::EnsEMBL::Compara::AlignSlice::Slice for the reference species will contain a single genomic sequence
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
37 whilst the other Bio::EnsEMBL::Compara::AlignSlice::Slice objects might be made of several pieces of
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
38 genomic sequences, depending on the set of alignments. Here is a graphical representation:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
39
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
40 ref.Slice **************************************************************
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
41
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
42 alignments 11111111111
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
43 2222222222222222
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
44 33333333333333333
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
45
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
46 resulting Bio::EnsEMBL::Compara::AlignSlice:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
47
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
48 AS::Slice 1 **************************************************************
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
49 AS::Slice 2 .11111111111....2222222222222222......33333333333333333.......
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
50
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
51 MODES
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
52
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
53 Two modes are currently available: condensed and expanded mode. The default mode is "condensed". In
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
54 condensed mode, no gaps are allowed in the reference Slice which means that information about deletions
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
55 in the reference species (i.e. insertions in the other species) are lost. On the other hand, the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
56 first Bio::EnsEMBL::Compara::AlignSlice::Slice object corresponds to the original Bio::EnsEMBL::Slice.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
57
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
58 In the expanded mode, the first Bio::EnsEMBL::Compara::AlignSlice::Slice is expanded in order to
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
59 accomodate the gaps corresponding to the deletions (insertions). Bear in mind that in expanded mode, the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
60 length of the resulting AlignSlice will be most probably larger than the length of the reference
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
61 Bio::EnsEMBL::Slice.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
62
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
63
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
64 OVERLAPPING ALIGNMENTS
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
65
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
66 No overlapping alignments are allowed by default. This means that if an alignment overlaps another one, the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
67 second alignment is ignored. This is due to lack of information needed to reconciliate both alignment.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
68 Here is a graphical example showing this problem:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
69
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
70 ALN 1: Human (ref) CTGTGAAAA----CCCCATTAGG
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
71 Mouse (1) CTGAAAATTTTCCCC
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
72
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
73 ALN 2: Human (ref) CTGTGAAAA---CCCCATTAGG
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
74 Mouse (1) AAAGGGCCCCATTA
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
75
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
76 Possible solution 1:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
77 Human (ref) CTGTGAAAA----CCCCATTAGG
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
78 Mouse (1) CTGAAAATTTTCCCC----
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
79 Mouse (1) ----AAA-GGGCCCCATTA
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
80
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
81 Possible solution 2:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
82 Human (ref) CTGTGAAAA----CCCCATTAGG
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
83 Mouse (1) CTGAAAATTTTCCCC----
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
84 Mouse (1) ----AAAGGG-CCCCATTA
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
85
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
86 Possible solution 3:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
87 Human (ref) CTGTGAAAA-------CCCCATTAGG
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
88 Mouse (1) CTGAAAATTTT---CCCC
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
89 Mouse (1) AAA----GGGCCCCATTA
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
90
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
91 There is no easy way to find which of these possible solution is the best without trying to realign the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
92 three sequences together and this is far beyond the aim of this module. The best solution is to start with
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
93 multiple alignments instead of overlapping pairwise ones.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
94
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
95 The third possibility is probably not the best alignment you can get although its implementation is
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
96 systematic (insert as many gaps as needed in order to accommodate the insertions and never ever overlap
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
97 them) and computationally cheap as no realignment is needed. You may ask this module to solve overlapping
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
98 alignments in this way using the "solve_overlapping" option.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
99
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
100 RESULT
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
101
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
102 The AlignSlice results in a set of Bio::EnsEMBL::Compara::AlignSlice::Slice which are objects really
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
103 similar to the Bio::EnsEMBL::Slice. There are intended to be used just like the genuine
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
104 Bio::EnsEMBL::Slice but some methods do not work though. Some examples of non-ported methods are: expand()
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
105 and invert(). Some other methods work as expected (seq, subseq, get_all_Attributes,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
106 get_all_VariationFeatures, get_all_RepeatFeatures...). All these Bio::EnsEMBL::Compara::AlignSlice::Slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
107 share the same fake coordinate system defined by the Bio::EnsEMBL::Compara::AlignSlice. This allows to
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
108 map features from one species onto the others.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
109
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
110 =head1 SYNOPSIS
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
111
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
112 use Bio::EnsEMBL::Compara::AlignSlice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
113
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
114 ## You may create your own AlignSlice objects but if you are interested in
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
115 ## getting AlignSlice built with data from an EnsEMBL Compara database you
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
116 ## should consider using the Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
117 ## instead
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
118 my $align_slice = new Bio::EnsEMBL::Compara::AlignSlice(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
119 -adaptor => $align_slice_adaptor,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
120 -reference_Slice => $reference_Slice,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
121 -method_link_species_set => $all_genomic_align_blocks,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
122 -genomicAlignBlocks => $all_genomic_align_blocks,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
123 -expanded => 1
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
124 -solve_overlapping => 1
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
125 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
126
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
127 my $all_slices = $aling_slice->get_all_Slices();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
128 foreach my $this_slice (@$all_slices) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
129 ## See also Bio::EnsEMBL::Compara::AlignSlice::Slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
130 my $species_name = $this_slice->genome_db->name()
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
131 my $all_mapped_genes = $this_slice->get_all_Genes();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
132 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
133
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
134 my $simple_align = $align_slice->get_projected_SimpleAlign();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
135
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
136 SET VALUES
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
137 $align_slice->adaptor($align_slice_adaptor);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
138 $align_slice->reference_Slice($reference_slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
139 $align_slice->add_GenomicAlignBlock($genomic_align_block_1);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
140 $align_slice->add_GenomicAlignBlock($genomic_align_block_2);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
141 $align_slice->add_GenomicAlignBlock($genomic_align_block_3);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
142
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
143 GET VALUES
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
144 my $align_slice_adaptor = $align_slice->adaptor();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
145 my $reference_slice = $align_slice->reference_Slice();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
146 my $all_genomic_align_blocks = $align_slice->get_all_GenomicAlignBlock();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
147 my $mapped_genes = $align_slice->get_all_Genes_by_genome_db_id(3);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
148 my $simple_align = $align_slice->get_projected_SimpleAlign();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
149
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
150 =head1 OBJECT ATTRIBUTES
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
151
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
152 =over
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
153
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
154 =item adaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
155
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
156 Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor object to access DB
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
157
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
158 =item reference_slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
159
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
160 Bio::EnsEMBL::Slice object used to create this Bio::EnsEBML::Compara::AlignSlice object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
161
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
162 =item all_genomic_align_blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
163
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
164 a listref of Bio::EnsEMBL::Compara::GenomicAlignBlock objects found using the reference_Slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
165
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
166 =back
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
167
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
168 =head1 APPENDIX
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
169
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
170 The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
171
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
172 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
173
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
174
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
175 # Let the code begin...
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
176
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
177
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
178 package Bio::EnsEMBL::Compara::AlignSlice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
179
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
180 use strict;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
181 use Bio::EnsEMBL::Utils::Argument qw(rearrange);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
182 use Bio::EnsEMBL::Utils::Exception qw(throw warning info verbose);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
183 use Bio::EnsEMBL::Compara::AlignSlice::Exon;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
184 use Bio::EnsEMBL::Compara::AlignSlice::Slice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
185 use Bio::EnsEMBL::Compara::GenomicAlignBlock;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
186 use Bio::EnsEMBL::Compara::GenomicAlign;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
187 use Bio::SimpleAlign;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
188
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
189 ## Creates a new coordinate system for creating empty Slices.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
190
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
191 my $aligngap_coord_system = new Bio::EnsEMBL::CoordSystem(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
192 -NAME => 'alignment',
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
193 -VERSION => "none",
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
194 -TOP_LEVEL => 0,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
195 -SEQUENCE_LEVEL => 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
196 -RANK => 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
197 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
198
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
199 =head2 new (CONSTRUCTOR)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
200
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
201 Arg[1] : a reference to a hash where keys can be:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
202 -adaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
203 -reference_slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
204 -genomic_align_blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
205 -method_link_species_set
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
206 -expanded
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
207 -solve_overlapping
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
208 -preserve_blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
209 -adaptor: the Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
210 -reference_slice:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
211 Bio::EnsEMBL::Slice, the guide slice for this align_slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
212 -genomic_align_blocks:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
213 listref of Bio::EnsEMBL::Compara::GenomicAlignBlock
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
214 objects containing to the alignments to be used for this
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
215 align_slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
216 -method_link_species_set;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
217 Bio::EnsEMBL::Compara::MethodLinkSpeciesSet object for
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
218 all the previous genomic_align_blocks. At the moment all
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
219 the blocks should correspond to the same MethodLinkSpeciesSet
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
220 -expanded: boolean flag. If set to true, the AlignSlice will insert all
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
221 the gaps requiered by the alignments in the reference_slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
222 (see MODES elsewhere in this document)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
223 -solve_overlapping:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
224 boolean flag. If set to true, the AlignSlice will allow
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
225 overlapping alginments and solve indeterminations according
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
226 to the method described in OVERLAPPING ALIGNMENTS elsewhere
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
227 in this document
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
228 -preserve_blocks:
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
229 boolean flag. By default the AlignSlice trim the alignments
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
230 in order to fit the reference_slice. This flags tell the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
231 AlignSlice to use the alignment block as they are (usually
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
232 this is only used by the AlignSliceAdaptor, use with care)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
233 Example : my $align_slice =
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
234 new Bio::EnsEMBL::Compara::AlignSlice(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
235 -adaptor => $align_slice_adaptor,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
236 -reference_slice => $reference_slice,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
237 -method_link_species_set => $method_link_species_set,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
238 -genomic_align_blocks => [$gab1, $gab2],
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
239 -expanded => 1
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
240 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
241 Description: Creates a new Bio::EnsEMBL::Compara::AlignSlice object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
242 Returntype : Bio::EnsEMBL::Compara::AlignSlice object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
243 Exceptions : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
244 Caller : general
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
245
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
246 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
247
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
248 sub new {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
249 my ($class, @args) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
250
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
251 my $self = {};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
252 bless $self,$class;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
253
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
254 my ($adaptor, $reference_slice, $genomic_align_blocks, $genomic_align_trees,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
255 $method_link_species_set, $expanded, $solve_overlapping, $preserve_blocks,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
256 $species_order) =
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
257 rearrange([qw(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
258 ADAPTOR REFERENCE_SLICE GENOMIC_ALIGN_BLOCKS GENOMIC_ALIGN_TREES
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
259 METHOD_LINK_SPECIES_SET EXPANDED SOLVE_OVERLAPPING PRESERVE_BLOCKS
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
260 SPECIES_ORDER
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
261 )], @args);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
262
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
263 $self->adaptor($adaptor) if (defined ($adaptor));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
264 $self->reference_Slice($reference_slice) if (defined ($reference_slice));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
265 if (defined($genomic_align_blocks)) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
266 foreach my $this_genomic_align_block (@$genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
267 $self->add_GenomicAlignBlock($this_genomic_align_block);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
268 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
269 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
270 $self->{_method_link_species_set} = $method_link_species_set if (defined($method_link_species_set));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
271
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
272 $self->{expanded} = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
273 if ($expanded) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
274 $self->{expanded} = 1;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
275 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
276
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
277 $self->{solve_overlapping} = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
278 if ($solve_overlapping) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
279 $self->{solve_overlapping} = $solve_overlapping;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
280 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
281
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
282 if ($genomic_align_trees) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
283 $self->_create_underlying_Slices($genomic_align_trees, $self->{expanded},
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
284 $self->{solve_overlapping}, $preserve_blocks, $species_order);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
285
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
286 #Awful hack to store the _alignslice_from and _alignslice_to on the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
287 #GenomicAlignBlock for use in get_all_ConservationScores which uses
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
288 #GenomicAlignBlock and not GenomicAlignTree
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
289 foreach my $tree (@$genomic_align_trees) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
290 foreach my $block (@$genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
291
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
292 #my $gab_id = $tree->get_all_leaves->[0]->get_all_genomic_aligns_for_node->[0]->genomic_align_block_id;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
293 #Hope always have a reference_genomic_align. The above doesn't work because only the reference species
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
294 #has the genomic_align_block_id set
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
295 my $gab_id = $tree->reference_genomic_align->genomic_align_block_id;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
296 my $block_id = $block->dbID;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
297
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
298 #if the block has been restricted, need to look at original_dbID
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
299 if (!defined $block_id) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
300 $block_id = $block->{original_dbID};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
301 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
302 my $tree_ref_ga = $tree->{reference_genomic_align};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
303 my $block_ref_ga = $block->{reference_genomic_align};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
304
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
305 #Need to check the ref_ga details not just the block id because
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
306 #the original_dbID is not unique for 2x genome blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
307 if ($gab_id == $block_id &&
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
308 $tree_ref_ga->dnafrag_start == $block_ref_ga->dnafrag_start &&
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
309 $tree_ref_ga->dnafrag_end == $block_ref_ga->dnafrag_end &&
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
310 $tree_ref_ga->dnafrag_strand == $block_ref_ga->dnafrag_strand) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
311
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
312 $block->{_alignslice_from} = $tree->{_alignslice_from};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
313 $block->{_alignslice_to} = $tree->{_alignslice_to};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
314 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
315 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
316 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
317 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
318 $self->_create_underlying_Slices($genomic_align_blocks, $self->{expanded},
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
319 $self->{solve_overlapping}, $preserve_blocks, $species_order);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
320 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
321 return $self;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
322 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
323
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
324
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
325 =head2 sub_AlignSlice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
326
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
327 Arg 1 : int $start
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
328 Arg 2 : int $end
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
329 Example : my $sub_align_slice = $align_slice->sub_AlignSlice(10, 50);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
330 Description: Creates a new Bio::EnsEMBL::Compara::AlignSlice object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
331 corresponding to a sub region of this one
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
332 Returntype : Bio::EnsEMBL::Compara::AlignSlice object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
333 Exceptions : return undef if no internal slices can be created (see
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
334 Bio::EnsEMBL::Compara::AlignSlice::Slice->sub_Slice)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
335 Caller : $align_slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
336 Status : Testing
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
337
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
338 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
339
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
340 sub sub_AlignSlice {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
341 my ($self, $start, $end) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
342 my $sub_align_slice = {};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
343
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
344 throw("Must provide START argument") if (!defined($start));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
345 throw("Must provide END argument") if (!defined($end));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
346
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
347 bless $sub_align_slice, ref($self);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
348
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
349 $sub_align_slice->adaptor($self->adaptor);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
350 $sub_align_slice->reference_Slice($self->reference_Slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
351 foreach my $this_genomic_align_block (@{$self->get_all_GenomicAlignBlocks()}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
352 $sub_align_slice->add_GenomicAlignBlock($this_genomic_align_block);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
353 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
354 $sub_align_slice->{_method_link_species_set} = $self->{_method_link_species_set};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
355
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
356 $sub_align_slice->{expanded} = $self->{expanded};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
357
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
358 $sub_align_slice->{solve_overlapping} = $self->{solve_overlapping};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
359
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
360 foreach my $this_slice (@{$self->get_all_Slices}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
361 my $new_slice = $this_slice->sub_Slice($start, $end);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
362 push(@{$sub_align_slice->{_slices}}, $new_slice) if ($new_slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
363 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
364 return undef if (!$sub_align_slice->{_slices});
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
365
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
366 return $sub_align_slice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
367 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
368
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
369
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
370 =head2 adaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
371
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
372 Arg[1] : (optional) Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor $align_slice_adaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
373 Example : my $align_slice_adaptor = $align_slice->adaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
374 Example : $align_slice->adaptor($align_slice_adaptor)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
375 Description: getter/setter for the adaptor attribute
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
376 Returntype : Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
377 Exceptions : throw if arg is not a Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
378 Caller : $object->methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
379
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
380 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
381
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
382 sub adaptor {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
383 my ($self, $adaptor) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
384
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
385 if (defined($adaptor)) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
386 throw "[$adaptor] must be a Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor object"
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
387 unless ($adaptor and $adaptor->isa("Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor"));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
388 $self->{'adaptor'} = $adaptor;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
389 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
390
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
391 return $self->{'adaptor'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
392 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
393
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
394
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
395 =head2 get_all_Slices (experimental)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
396
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
397 Arg[1] : [optional] string $species_name1
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
398 Arg[2] : [optional] string $species_name2
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
399 Arg[...] : [optional] string $species_nameN
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
400 Example : my $slices = $align_slice->get_all_Slices
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
401 Description: getter for all the Slices in this AlignSlice. If a list
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
402 of species is specified, returns only the slices for
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
403 these species. The slices are returned in a "smart"
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
404 order, i.e. the slice corresponding to the reference
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
405 species is returned first and then the remaining slices
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
406 depending on their phylogenetic distance to the first
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
407 one.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
408 NB: You can use underscores instead of whitespaces for
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
409 the name of the species, i.e. Homo_sapiens will be
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
410 understood as "Homo sapiens". However if the GenomeDB is found
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
411 to already have _ defined in the name then this behaviour is
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
412 disabled.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
413 Returntype : listref of Bio::EnsEMBL::Compara::AlignSlice::Slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
414 objects.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
415 Exceptions :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
416 Caller : $object->methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
417
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
418 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
419
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
420 sub get_all_Slices {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
421 my ( $self, @species_names ) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
422 my $slices = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
423
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
424 if (@species_names) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
425 foreach my $slice ( @{ $self->{_slices} } ) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
426 #Substitute _ for spaces & check if the current GenomeDB matches with
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
427 #or without them
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
428 foreach my $this_species_name (@species_names) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
429 ( my $space_species_name = $this_species_name ) =~ s/_/ /g;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
430 push( @$slices, $slice )
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
431 if ( ( $this_species_name eq $slice->genome_db->name )
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
432 || ( $space_species_name eq $slice->genome_db->name ) );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
433 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
434 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
435 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
436 else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
437 $slices = $self->{_slices};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
438 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
439
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
440 return $slices;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
441 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
442
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
443
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
444 =head2 reference_Slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
445
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
446 Arg[1] : (optional) Bio::EnsEMBL::Slice $slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
447 Example : my $reference_slice = $align_slice->reference_slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
448 Example : $align_slice->reference_Slice($reference_slice)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
449 Description: getter/setter for the attribute reference_slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
450 Returntype : Bio::EnsEMBL::Slice object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
451 Exceptions : throw if arg is not a Bio::EnsEMBL::Slice object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
452 Caller : $object->methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
453
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
454 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
455
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
456 sub reference_Slice {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
457 my ($self, $reference_slice) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
458
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
459 if (defined($reference_slice)) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
460 throw "[$reference_slice] must be a Bio::EnsEMBL::Slice object"
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
461 unless ($reference_slice and $reference_slice->isa("Bio::EnsEMBL::Slice"));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
462 $self->{'reference_slice'} = $reference_slice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
463 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
464
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
465 return $self->{'reference_slice'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
466 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
467
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
468
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
469 =head2 add_GenomicAlignBlock
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
470
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
471 Arg[1] : Bio::EnsEMBL::Compara::GenomicAlignBlock $genomicAlignBlock
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
472 Example : $align_slice->add_GenomicAlignBlock($genomicAlignBlock)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
473 Description: add a Bio::EnsEMBL::Compara::GenomicAlignBlock object to the array
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
474 stored in the attribute all_genomic_align_blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
475 Returntype : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
476 Exceptions : throw if arg is not a Bio::EnsEMBL::Compara::GenomicAlignBlock
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
477 Caller : $object->methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
478
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
479 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
480
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
481 sub add_GenomicAlignBlock {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
482 my ($self, $genomic_align_block) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
483
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
484 if (!defined($genomic_align_block)) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
485 throw "Too few arguments for Bio::EnsEMBL::Compara::AlignSlice->add_GenomicAlignBlock()";
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
486 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
487 if (!$genomic_align_block or !$genomic_align_block->isa("Bio::EnsEMBL::Compara::GenomicAlignBlock")) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
488 throw "[$genomic_align_block] must be a Bio::EnsEMBL::Compara::GenomicAlignBlock object";
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
489 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
490
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
491 push(@{$self->{'all_genomic_align_blocks'}}, $genomic_align_block);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
492 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
493
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
494
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
495 =head2 get_all_GenomicAlignBlocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
496
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
497 Arg[1] : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
498 Example : my $all_genomic_align_blocks = $align_slice->get_all_GenomicAlignBlocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
499 Description: getter for the attribute all_genomic_align_blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
500 Returntype : listref of Bio::EnsEMBL::Compara::GenomicAlignBlock objects
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
501 Exceptions : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
502 Caller : $object->methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
503
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
504 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
505
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
506 sub get_all_GenomicAlignBlocks {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
507 my ($self) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
508
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
509 return ($self->{'all_genomic_align_blocks'} || []);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
510 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
511
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
512
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
513 =head2 get_MethodLinkSpeciesSet
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
514
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
515 Arg[1] : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
516 Example : my $method_link_species_set = $align_slice->get_MethodLinkSpeciesSet
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
517 Description: getter for the Bio::EnsEMBL::Compara::MethodLinkSpeciesSet
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
518 used to create this object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
519 Returntype : Bio::EnsEMBL::Compara::MethodLinkSpeciesSet
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
520 Exceptions : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
521 Caller : $object->methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
522
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
523 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
524
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
525 sub get_MethodLinkSpeciesSet {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
526 my ($self) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
527
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
528 return $self->{'_method_link_species_set'};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
529 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
530
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
531
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
532 =head2 get_SimpleAlign
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
533
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
534 Arg[1] : none
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
535 Example : use Bio::AlignIO;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
536 my $out = Bio::AlignIO->newFh(-fh=>\*STDOUT, -format=> "clustalw");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
537 print $out $align_slice->get_SimpleAlign();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
538 Description : This method creates a Bio::SimpleAlign object using the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
539 Bio::EnsEMBL::Compara::AlignSlice::Slices underlying this
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
540 Bio::EnsEMBL::Compara::AlignSlice object. The SimpleAlign
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
541 describes the alignment where the first sequence
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
542 corresponds to the reference Slice and the remaining
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
543 correspond to the other species.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
544 Returntype : Bio::SimpleAlign object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
545 Exceptions :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
546 Caller : $object->methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
547
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
548 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
549
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
550 sub get_SimpleAlign {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
551 my ($self, @species) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
552 my $simple_align;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
553
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
554 ## Create a single Bio::SimpleAlign for the projection
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
555 $simple_align = Bio::SimpleAlign->new();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
556 $simple_align->id("ProjectedMultiAlign");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
557
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
558 my $genome_db_name_counter;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
559 foreach my $slice (@{$self->get_all_Slices(@species)}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
560 my $seq = Bio::LocatableSeq->new(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
561 -SEQ => $slice->seq,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
562 -START => $slice->start,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
563 -END => $slice->end,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
564 -ID => $slice->genome_db->name.($genome_db_name_counter->{$slice->genome_db->name} or ""),
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
565 -STRAND => $slice->strand
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
566 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
567 ## This allows to have several sequences for the same species. Bio::SimpleAlign complains
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
568 ## about having the same ID, START and END for two sequences...
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
569 if (!defined($genome_db_name_counter->{$slice->genome_db->name})) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
570 $genome_db_name_counter->{$slice->genome_db->name} = 2;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
571 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
572 $genome_db_name_counter->{$slice->genome_db->name}++;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
573 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
574
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
575 $simple_align->add_seq($seq);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
576 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
577
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
578 return $simple_align;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
579 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
580
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
581
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
582 =head2 get_all_ConservationScores
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
583
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
584 Arg 1 : (opt) integer $display_size (default 700)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
585 Arg 2 : (opt) string $display_type (one of "AVERAGE" or "MAX") (default "MAX")
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
586 Arg 3 : (opt) integer $window_size
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
587 Example : my $conservation_scores =
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
588 $align_slice->get_all_ConservationScores(1000, "AVERAGE", 10);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
589 Description: Retrieve the corresponding
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
590 Bio::EnsEMBL::Compara::ConservationScore objects for the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
591 Bio::EnsEMBL::Compara::AlignSlice object. It calls either
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
592 _get_expanded_conservation_scores if the AlignSlice has
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
593 "expanded" set or _get_condensed_conservation_scores for
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
594 condensed mode.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
595 It sets up the align_start, align_end and slice_length and map
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
596 the resulting objects onto the AlignSlice.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
597 Please refer to the documentation in
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
598 Bio::EnsEMBL::Compara::DBSQL::ConservationScoreAdaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
599 for more details.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
600 Returntype : ref. to an array of Bio::EnsEMBL::Compara::ConservationScore
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
601 objects.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
602 Caller : object::methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
603 Status : At risk
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
604
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
605 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
606
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
607 sub get_all_ConservationScores {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
608 my ($self, $display_size, $display_type, $window_size) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
609 my $all_conservation_scores = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
610 my $y_axis_min;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
611 my $y_axis_max;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
612
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
613 my $conservation_score_adaptor = $self->adaptor->db->get_ConservationScoreAdaptor();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
614
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
615 #Get scores in either expanded or condensed mode
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
616 if ($self->{expanded}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
617 $all_conservation_scores = $self->_get_expanded_conservation_scores($conservation_score_adaptor, $display_size, $display_type, $window_size);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
618 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
619 $all_conservation_scores = $self->_get_condensed_conservation_scores($conservation_score_adaptor, $display_size, $display_type, $window_size);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
620 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
621
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
622 return $all_conservation_scores;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
623 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
624
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
625 =head2 _get_expanded_conservation_scores
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
626
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
627 Arg 1 : Bio::EnsEMBL::Compara::DBSQL::ConservationScoreAdaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
628 Arg 2 : (opt) integer $display_size (default 700)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
629 Arg 3 : (opt) string $display_type (one of "AVERAGE" or "MAX") (default "MAX")
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
630 Arg 4 : (opt) integer $window_size
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
631 Example : my $conservation_scores =
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
632 $self->_get_expanded_conservation_scores($cs_adaptor, 1000, "AVERAGE", 10);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
633 Description: Retrieve the corresponding
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
634 Bio::EnsEMBL::Compara::ConservationScore objects for the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
635 Bio::EnsEMBL::Compara::GenomicAlignBlock objects underlying
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
636 this Bio::EnsEMBL::Compara::AlignSlice object. This method
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
637 calls the Bio::EnsEMBL::Compara::DBSQL::ConservationScoreAdaptor->
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
638 fetch_all_by_GenomicAlignBlock() method. It sets up the align_start,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
639 align_end and slice_length and map the resulting objects onto
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
640 the AlignSlice. $diaplay_slize, $display_type and $window_size
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
641 are passed to the fetch_all_by_GenomicAlignBlock() method.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
642 Please refer to the documentation in
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
643 Bio::EnsEMBL::Compara::DBSQL::ConservationScoreAdaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
644 for more details.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
645 Returntype : ref. to an array of Bio::EnsEMBL::Compara::ConservationScore
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
646 objects.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
647 Caller : object::methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
648 Status : At risk
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
649
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
650 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
651 sub _get_expanded_conservation_scores {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
652 my ($self, $conservation_score_adaptor, $display_size, $display_type, $window_size) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
653 my $y_axis_min;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
654 my $y_axis_max;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
655
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
656 my $all_conservation_scores = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
657 my $offset = 0; #the start of each gab in alignment coords
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
658 my $prev_gab_end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
659 foreach my $this_genomic_align_block (@{$self->get_all_GenomicAlignBlocks()}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
660 $this_genomic_align_block->{restricted_aln_start} = $this_genomic_align_block->{_alignslice_from};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
661 $this_genomic_align_block->{restricted_aln_end} = $this_genomic_align_block->{_alignslice_to};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
662
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
663 #Need to map the coords I get back from the conservation_score_adaptor onto the slice using the start position of the genomic_align_block
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
664 my $all_these_conservation_scores = $conservation_score_adaptor->fetch_all_by_GenomicAlignBlock(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
665 $this_genomic_align_block, 1,$this_genomic_align_block->length, $self->get_all_Slices()->[0]->length,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
666 $display_size, $display_type, $window_size);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
667
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
668 #Need to account for gaps between gabs which will be slice end - start on the reference slice.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
669 if ($prev_gab_end) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
670 my $gap = $this_genomic_align_block->reference_slice_start - $prev_gab_end - 1;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
671 if ($gap) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
672 $offset += $gap;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
673 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
674 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
675 $prev_gab_end = $this_genomic_align_block->reference_slice_end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
676
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
677 foreach my $score (@$all_these_conservation_scores) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
678 $score->position($score->position + $offset);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
679 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
680
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
681 #offset is the sum of preceding gab lengths
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
682 $offset += ($this_genomic_align_block->{restricted_aln_end} - $this_genomic_align_block->{restricted_aln_start} + 1);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
683
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
684 #initialise y axis min and max
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
685 if (!defined $y_axis_max) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
686 $y_axis_max = $all_these_conservation_scores->[0]->y_axis_max;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
687 $y_axis_min = $all_these_conservation_scores->[0]->y_axis_min;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
688 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
689 #find overall min and max
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
690 if ($y_axis_min > $all_these_conservation_scores->[0]->y_axis_min) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
691 $y_axis_min = $all_these_conservation_scores->[0]->y_axis_min;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
692 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
693 if ($y_axis_max < $all_these_conservation_scores->[0]->y_axis_max) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
694 $y_axis_max = $all_these_conservation_scores->[0]->y_axis_max;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
695 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
696 push (@$all_conservation_scores, @$all_these_conservation_scores);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
697 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
698 #set overall min and max
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
699 $all_conservation_scores->[0]->y_axis_min($y_axis_min);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
700 $all_conservation_scores->[0]->y_axis_max($y_axis_max);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
701
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
702 return $all_conservation_scores;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
703 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
704
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
705 =head2 _get_condensed_conservation_scores
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
706
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
707 Arg 1 : Bio::EnsEMBL::Compara::DBSQL::ConservationScoreAdaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
708 Arg 2 : (opt) integer $display_size (default 700)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
709 Arg 3 : (opt) string $display_type (one of "AVERAGE" or "MAX") (default "MAX")
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
710 Arg 4 : (opt) integer $window_size
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
711 Example : my $conservation_scores =
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
712 $self->_get_expanded_conservation_scores($cs_adaptor, 1000, "AVERAGE", 10);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
713 Description: Retrieve the corresponding
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
714 Bio::EnsEMBL::Compara::ConservationScore objects for the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
715 reference Bio::EnsEMBL::Slice object of
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
716 this Bio::EnsEMBL::Compara::AlignSlice object. This method
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
717 calls the Bio::EnsEMBL::Compara::DBSQL::ConservationScoreAdaptor->
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
718 fetch_all_by_MethodLinkSpeciesSet_Slice() method. It sets up
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
719 the align_start, align_end and slice_length and map the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
720 resulting objects onto the AlignSlice. $display_slize,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
721 $display_type and $window_size are passed to the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
722 fetch_all_by_MethodLinkSpeciesSet_Slice() method.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
723 Please refer to the documentation in
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
724 Bio::EnsEMBL::Compara::DBSQL::ConservationScoreAdaptor
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
725 for more details.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
726 Returntype : ref. to an array of Bio::EnsEMBL::Compara::ConservationScore
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
727 objects.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
728 Caller : object::methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
729 Status : At risk
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
730
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
731 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
732 sub _get_condensed_conservation_scores {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
733 my ($self, $conservation_score_adaptor, $display_size, $display_type, $window_size) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
734
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
735 my $all_conservation_scores = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
736
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
737 throw ("Must have method_link_species_set defined to retrieve conservation scores for a condensed AlignSlice") if (!defined $self->{_method_link_species_set});
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
738 throw ("Must have reference slice defined to retrieve conservation scores for a condensed AlignSlice") if (!defined $self->{'reference_slice'});
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
739
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
740 # select the conservation score mlss_id linked to the msa
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
741 my $sql = 'SELECT method_link_species_set_id FROM method_link_species_set_tag JOIN method_link_species_set USING (method_link_species_set_id) JOIN method_link USING (method_link_id) WHERE type LIKE "%CONSERVATION\_SCORE" AND tag = "msa_mlss_id" AND value = ?';
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
742 my $sth = $conservation_score_adaptor->prepare($sql);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
743 $sth->execute($self->{_method_link_species_set}->dbID);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
744
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
745 my ($cs_mlss_id) = @{$sth->fetchrow_arrayref};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
746 $sth->finish;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
747
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
748 throw ("Unable to find conservation score method_link_species_set for this multiple alignment " . $self->{_method_link_species_set}->dbID) if (!defined $cs_mlss_id);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
749 my $mlss_adaptor = $self->adaptor->db->get_MethodLinkSpeciesSetAdaptor();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
750
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
751 my $cs_mlss = $mlss_adaptor->fetch_by_dbID($cs_mlss_id);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
752
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
753 $all_conservation_scores = $conservation_score_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($cs_mlss, $self->{'reference_slice'}, $display_size, $display_type, $window_size);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
754
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
755 return $all_conservation_scores;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
756 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
757
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
758
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
759 =head2 get_all_ConstrainedElements
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
760
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
761 Arg 1 : (opt) string $method_link_type (default = GERP_CONSTRAINED_ELEMENT)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
762 Arg 2 : (opt) listref Bio::EnsEMBL::Compara::GenomeDB $species_set
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
763 (default, the set of species from the MethodLinkSpeciesSet used
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
764 to build this AlignSlice)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
765 Example : my $constrained_elements =
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
766 $align_slice->get_all_ConstrainedElements();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
767 Description: Retrieve the corresponding constrained elements for these alignments.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
768 Objects will be located on this AlignSlice, i.e. the
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
769 reference_slice, reference_slice_start, reference_slice_end
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
770 and reference_slice_strand will refer to this AlignSlice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
771 object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
772 Returntype : ref. to an array of Bio::EnsEMBL::Compara::ConstrainedElement
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
773 objects.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
774 Caller : object::methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
775 Status : At risk
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
776
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
777 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
778
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
779 sub get_all_ConstrainedElements {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
780 my ($self, $method_link_type, $species_set) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
781 my $all_constrained_elements = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
782
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
783 $method_link_type ||= "GERP_CONSTRAINED_ELEMENT";
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
784 my $key_cache = "_constrained_elements_".$method_link_type;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
785 if ($species_set) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
786 $key_cache .= "::" . join("-", sort map {s/\W/_/g} map {$_->name} @$species_set);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
787 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
788 $species_set = $self->{_method_link_species_set}->species_set;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
789 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
790
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
791 if (!defined($self->{$key_cache})) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
792 my $method_link_species_set_adaptor = $self->adaptor->db->get_MethodLinkSpeciesSetAdaptor();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
793 my $method_link_species_set = $method_link_species_set_adaptor->fetch_by_method_link_type_GenomeDBs(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
794 $method_link_type, $self->{_method_link_species_set}->species_set);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
795
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
796 if ($method_link_species_set) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
797 my $constrained_element_adaptor = $self->adaptor->db->get_ConstrainedElementAdaptor();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
798 $all_constrained_elements = $constrained_element_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
799 $method_link_species_set, $self->reference_Slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
800 my $big_mapper = $self->{_reference_Mapper};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
801 foreach my $this_constrained_element (@{$all_constrained_elements}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
802 my $reference_slice_start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
803 my $reference_slice_end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
804 my $reference_slice_strand;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
805
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
806 my @alignment_coords = $big_mapper->map_coordinates(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
807 "sequence", # $self->genomic_align->dbID,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
808 $this_constrained_element->start + $this_constrained_element->slice->start - 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
809 $this_constrained_element->end + $this_constrained_element->slice->start - 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
810 $this_constrained_element->strand,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
811 "sequence" # $from_mapper->from
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
812 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
813 foreach my $alignment_coord (@alignment_coords) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
814 next if (!$alignment_coord->isa("Bio::EnsEMBL::Mapper::Coordinate"));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
815 if (!defined($reference_slice_strand)) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
816 $reference_slice_start = $alignment_coord->start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
817 $reference_slice_end = $alignment_coord->end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
818 $reference_slice_strand = $alignment_coord->strand;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
819 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
820 if ($alignment_coord->start < $reference_slice_start) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
821 $reference_slice_start = $alignment_coord->start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
822 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
823 if ($alignment_coord->end > $reference_slice_end) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
824 $reference_slice_end = $alignment_coord->end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
825 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
826 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
827 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
828 $this_constrained_element->slice($self);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
829 $this_constrained_element->start($reference_slice_start);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
830 $this_constrained_element->end($reference_slice_end);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
831 $this_constrained_element->strand($reference_slice_strand);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
832 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
833 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
834 $self->{$key_cache} = $all_constrained_elements;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
835 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
836
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
837 return $self->{$key_cache};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
838 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
839
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
840
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
841 =head2 _create_underlying_Slices (experimental)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
842
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
843 Arg[1] : listref of Bio::EnsEMBL::Compara::GenomicAlignBlocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
844 $genomic_align_blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
845 Arg[2] : [optional] boolean $expanded (default = FALSE)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
846 Arg[3] : [optional] boolean $solve_overlapping (default = FALSE)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
847 Arg[4] : [optional] boolean $preserve_blocks (default = FALSE)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
848 Example :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
849 Description: Creates a set of Bio::EnsEMBL::Compara::AlignSlice::Slices
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
850 and attach it to this object.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
851 Returntype :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
852 Exceptions : warns about overlapping GenomicAlignBlocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
853 Caller :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
854
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
855 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
856
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
857 sub _create_underlying_Slices {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
858 my ($self, $genomic_align_blocks, $expanded, $solve_overlapping, $preserve_blocks, $species_order) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
859 my $strand = $self->reference_Slice->strand;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
860
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
861 my $align_slice_length = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
862 my $last_ref_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
863 if ($strand == 1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
864 $last_ref_pos = $self->reference_Slice->start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
865 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
866 $last_ref_pos = $self->reference_Slice->end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
867 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
868
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
869 my $ref_genome_db = $self->adaptor->db->get_GenomeDBAdaptor->fetch_by_Slice($self->reference_Slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
870 my $big_mapper = Bio::EnsEMBL::Mapper->new("sequence", "alignment");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
871
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
872 my $sorted_genomic_align_blocks;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
873
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
874 if ($solve_overlapping eq "restrict") {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
875 $sorted_genomic_align_blocks = _sort_and_restrict_GenomicAlignBlocks($genomic_align_blocks);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
876 } elsif ($solve_overlapping) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
877 $sorted_genomic_align_blocks = _sort_and_compile_GenomicAlignBlocks($genomic_align_blocks);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
878 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
879 $sorted_genomic_align_blocks = _sort_GenomicAlignBlocks($genomic_align_blocks);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
880 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
881 @$sorted_genomic_align_blocks = reverse(@$sorted_genomic_align_blocks) if ($strand == -1);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
882 foreach my $this_genomic_align_block (@$sorted_genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
883 my $original_genomic_align_block = $this_genomic_align_block;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
884 my ($from, $to);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
885
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
886 if ($preserve_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
887 ## Don't restrict the block. Set from and to to 1 and length respectively
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
888 $from = 1;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
889 $to = $this_genomic_align_block->length;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
890 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
891 #need to check that the block is still overlapping the slice - it may
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
892 #have already been restricted by the options above.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
893 if ($this_genomic_align_block->reference_genomic_align->dnafrag_start > $self->reference_Slice->end || $this_genomic_align_block->reference_genomic_align->dnafrag_end < $self->reference_Slice->start) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
894 next;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
895 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
896 ($this_genomic_align_block, $from, $to) = $this_genomic_align_block->restrict_between_reference_positions(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
897 $self->reference_Slice->start, $self->reference_Slice->end);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
898 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
899
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
900 $original_genomic_align_block->{_alignslice_from} = $from;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
901 $original_genomic_align_block->{_alignslice_to} = $to;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
902
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
903 my $reference_genomic_align = $this_genomic_align_block->reference_genomic_align;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
904
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
905 #If I haven't needed to restrict, I don't gain this link so add it here
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
906 if (!defined $reference_genomic_align->genomic_align_block->reference_genomic_align) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
907 $reference_genomic_align->genomic_align_block($this_genomic_align_block);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
908 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
909
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
910 my ($this_pos, $this_gap_between_genomic_align_blocks);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
911 if ($strand == 1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
912 $this_pos = $reference_genomic_align->dnafrag_start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
913 $this_gap_between_genomic_align_blocks = $this_pos - $last_ref_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
914 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
915 $this_pos = $reference_genomic_align->dnafrag_end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
916 $this_gap_between_genomic_align_blocks = $last_ref_pos - $this_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
917 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
918
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
919 if ($this_gap_between_genomic_align_blocks > 0) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
920 ## Add mapper info for inter-genomic_align_block space
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
921 if ($strand == 1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
922 $big_mapper->add_map_coordinates(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
923 'sequence',
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
924 $last_ref_pos,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
925 $this_pos - 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
926 $strand,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
927 'alignment',
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
928 $align_slice_length + 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
929 $align_slice_length + $this_gap_between_genomic_align_blocks,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
930 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
931 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
932 $big_mapper->add_map_coordinates(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
933 'sequence',
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
934 $this_pos + 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
935 $last_ref_pos,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
936 $strand,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
937 'alignment',
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
938 $align_slice_length + 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
939 $align_slice_length + $this_gap_between_genomic_align_blocks,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
940 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
941 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
942 $align_slice_length += $this_gap_between_genomic_align_blocks;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
943 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
944 $reference_genomic_align->genomic_align_block->start($align_slice_length + 1);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
945 $original_genomic_align_block->{_alignslice_start} = $align_slice_length;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
946 if ($expanded) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
947 $align_slice_length += CORE::length($reference_genomic_align->aligned_sequence("+FAKE_SEQ"));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
948 $reference_genomic_align->genomic_align_block->end($align_slice_length);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
949 $big_mapper->add_Mapper($reference_genomic_align->get_Mapper(0));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
950 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
951 $align_slice_length += $reference_genomic_align->dnafrag_end - $reference_genomic_align->dnafrag_start + 1;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
952 $reference_genomic_align->genomic_align_block->end($align_slice_length);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
953 $big_mapper->add_Mapper($reference_genomic_align->get_Mapper(0,1));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
954 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
955 $reference_genomic_align->genomic_align_block->slice($self);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
956
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
957 if ($strand == 1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
958 $last_ref_pos = $reference_genomic_align->dnafrag_end + 1;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
959 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
960 $last_ref_pos = $reference_genomic_align->dnafrag_start - 1;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
961 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
962 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
963 my ($this_pos, $this_gap_between_genomic_align_blocks);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
964 if ($strand == 1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
965 $this_pos = $self->reference_Slice->end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
966 $this_gap_between_genomic_align_blocks = $this_pos - ($last_ref_pos - 1);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
967 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
968 $this_pos = $self->reference_Slice->start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
969 $this_gap_between_genomic_align_blocks = $last_ref_pos + 1 - $this_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
970 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
971
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
972 ## $last_ref_pos is the next nucleotide position after the last mapped one.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
973 if ($this_gap_between_genomic_align_blocks > 0) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
974 if ($strand == 1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
975 $big_mapper->add_map_coordinates(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
976 'sequence',
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
977 $last_ref_pos,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
978 $this_pos,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
979 $strand,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
980 'alignment',
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
981 $align_slice_length + 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
982 $align_slice_length + $this_gap_between_genomic_align_blocks,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
983 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
984 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
985 $big_mapper->add_map_coordinates(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
986 'sequence',
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
987 $this_pos,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
988 $last_ref_pos,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
989 $strand,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
990 'alignment',
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
991 $align_slice_length + 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
992 $align_slice_length + $this_gap_between_genomic_align_blocks,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
993 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
994 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
995 $align_slice_length += $this_gap_between_genomic_align_blocks;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
996 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
997
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
998 if ($species_order) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
999 foreach my $species_def (@$species_order) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1000 my $genome_db_name = $species_def->{genome_db}->name;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1001 # print STDERR "SPECIES:: ", $genome_db_name, "\n";
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1002 my $new_slice = new Bio::EnsEMBL::Compara::AlignSlice::Slice(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1003 -length => $align_slice_length,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1004 -requesting_slice => $self->reference_Slice,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1005 -align_slice => $self,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1006 -method_link_species_set => $self->{_method_link_species_set},
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1007 -genome_db => $species_def->{genome_db},
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1008 -expanded => $expanded,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1009 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1010 $new_slice->{genomic_align_ids} = $species_def->{genomic_align_ids};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1011 push(@{$self->{slices}->{lc($genome_db_name)}}, $new_slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1012 push(@{$self->{_slices}}, $new_slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1013 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1014 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1015 # print STDERR "SPECIES:: ", $ref_genome_db->name, "\n";
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1016 $self->{slices}->{lc($ref_genome_db->name)} = [new Bio::EnsEMBL::Compara::AlignSlice::Slice(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1017 -length => $align_slice_length,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1018 -requesting_slice => $self->reference_Slice,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1019 -align_slice => $self,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1020 -method_link_species_set => $self->{_method_link_species_set},
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1021 -genome_db => $ref_genome_db,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1022 -expanded => $expanded,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1023 )];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1024 $self->{_slices} = [$self->{slices}->{lc($ref_genome_db->name)}->[0]];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1025 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1026
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1027 $self->{slices}->{lc($ref_genome_db->name)}->[0]->add_Slice_Mapper_pair(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1028 $self->reference_Slice,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1029 $big_mapper,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1030 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1031 $align_slice_length,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1032 $self->reference_Slice->strand
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1033 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1034 $self->{_reference_Mapper} = $big_mapper;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1035
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1036 foreach my $this_genomic_align_block (@$sorted_genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1037 if (UNIVERSAL::isa($this_genomic_align_block, "Bio::EnsEMBL::Compara::GenomicAlignTree")) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1038 ## For trees, loop through all nodes (internal and leaves) to add the GenomicAligns
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1039 foreach my $this_genomic_align_node (@{$this_genomic_align_block->get_all_nodes}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1040 # but we have to skip the reference node as this has already been added to the guide Slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1041 next if ($this_genomic_align_node eq $this_genomic_align_block->reference_genomic_align_node);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1042
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1043 # For composite segments (2X genomes), the node will link to several GenomicAligns.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1044 # Add each of them to one of the AS:Slice objects
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1045 foreach my $this_genomic_align (@{$this_genomic_align_node->get_all_genomic_aligns_for_node}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1046 # Link to genomic_align_block may have been lost during tree minimization
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1047 $this_genomic_align->genomic_align_block_id(0);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1048 $this_genomic_align->genomic_align_block($this_genomic_align_block);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1049 $self->_add_GenomicAlign_to_a_Slice($this_genomic_align, $this_genomic_align_block,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1050 $species_order, $align_slice_length);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1051 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1052 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1053 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1054 ## For plain alignments, just use all non-reference GenomicAlign objects
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1055 foreach my $this_genomic_align
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1056 (@{$this_genomic_align_block->get_all_non_reference_genomic_aligns}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1057 $self->_add_GenomicAlign_to_a_Slice($this_genomic_align, $this_genomic_align_block,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1058 $species_order, $align_slice_length);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1059 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1060 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1061 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1062 #It is possible for the same region in the ref species (eg Gibbon) to align to 2 different blocks (pairwise to human in the case of the EPO low coverage alignments). In this case, although the incoming $genomic_align_blocks will have 2 blocks, the $sorted_genomic_align_blocks will only contain 1 of the blocks. It may happen that one species occurs in one block (eg gorilla) but not in the other. However, the $species_order will contain gorilla but the $sorted_genomic_align_blocks may not. This results in a slice being created for gorilla but it has no slice_mapper_pairs. Must check the slices and remove any that have {'slice_mapper_pairs'} as undef (no alignment comes out as a GAP).
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1063
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1064 if ($species_order) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1065 my $slices = $self->{_slices};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1066 for (my $i = (@$slices-1); $i >= 0; --$i) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1067 if (@{$slices->[$i]->{'slice_mapper_pairs'}} == 0) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1068 #remove from {slices}
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1069 delete $self->{slices}->{$slices->[$i]->genome_db->name};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1070 #remove from {_slices}
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1071 splice @$slices, $i, 1;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1072 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1073 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1074 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1075
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1076
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1077
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1078 return $self;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1079 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1080
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1081 =head2 _add_GenomicAlign_to_a_Slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1082
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1083 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1084
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1085 sub _add_GenomicAlign_to_a_Slice {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1086 my ($self, $this_genomic_align, $this_genomic_align_block, $species_order, $align_slice_length) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1087
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1088 my $expanded = $self->{expanded};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1089 my $species = $this_genomic_align->dnafrag->genome_db->name;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1090
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1091 if (!defined($self->{slices}->{lc($species)})) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1092 $self->{slices}->{lc($species)} = [new Bio::EnsEMBL::Compara::AlignSlice::Slice(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1093 -length => $align_slice_length,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1094 -requesting_slice => $self->reference_Slice,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1095 -align_slice => $self,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1096 -method_link_species_set => $self->{_method_link_species_set},
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1097 -genome_db => $this_genomic_align->dnafrag->genome_db,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1098 -expanded => $expanded,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1099 )];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1100 push(@{$self->{_slices}}, $self->{slices}->{lc($species)}->[0]);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1101 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1102
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1103 my $this_block_start = $this_genomic_align_block->start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1104 my $this_block_end = $this_genomic_align_block->end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1105 my $this_core_slice = $this_genomic_align->get_Slice();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1106 if (!$this_core_slice) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1107 $this_core_slice = new Bio::EnsEMBL::Slice(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1108 -coord_system => $aligngap_coord_system,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1109 -seq_region_name => "GAP",
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1110 -start => $this_block_start,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1111 -end => $this_block_end,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1112 -strand => 0
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1113 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1114 $this_core_slice->{seq} = "." x ($this_block_end - $this_block_start + 1);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1115 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1116 return if (!$this_core_slice); ## The restriction of the GenomicAlignBlock may return a void GenomicAlign
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1117
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1118 ## This creates a link between the slice and the tree node. This is required to display
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1119 ## the tree on the web interface.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1120 if ($this_genomic_align->genome_db->name eq "ancestral_sequences") {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1121 foreach my $genomic_align_node (@{$this_genomic_align_block->get_all_sorted_genomic_align_nodes}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1122 my $genomic_align_group = $genomic_align_node->genomic_align_group;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1123 next if (!$genomic_align_group);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1124
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1125 foreach my $genomic_align (@{$genomic_align_group->get_all_GenomicAligns}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1126 if ($this_genomic_align == $genomic_align) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1127 my $simple_tree = $genomic_align_node->newick_simple_format();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1128 $simple_tree =~ s/\_[^\_]+\_\d+\_\d+\[[\+\-]\]//g;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1129 $simple_tree =~ s/\:[\d\.]+//g;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1130 $this_core_slice->{_tree} = $simple_tree;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1131 last;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1132 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1133 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1134 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1135 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1136
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1137 my $this_mapper = $this_genomic_align->get_Mapper(0, !$expanded);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1138 # Fix block start and block end for composite segments (2X genomes)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1139 if ($this_genomic_align->cigar_line =~ /^(\d*)X/ or $this_genomic_align->cigar_line =~ /(\d*)X$/) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1140 $this_block_start = undef;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1141 $this_block_end = undef;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1142 my @blocks = $this_mapper->map_coordinates("sequence", $this_genomic_align->dnafrag_start,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1143 $this_genomic_align->dnafrag_end, $this_genomic_align->dnafrag_strand, "sequence");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1144 foreach my $this_block (@blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1145 next if ($this_block->isa("Bio::EnsEMBL::Mapper::Gap"));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1146 $this_block_start = $this_block->start if (!defined($this_block_start) or $this_block->start < $this_block_start);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1147 $this_block_end = $this_block->end if (!defined($this_block_end) or $this_block->end > $this_block_end);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1148 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1149 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1150
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1151 # Choose the appropriate AS::Slice for adding this bit of the alignment
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1152 my $this_underlying_slice = $self->_choose_underlying_Slice($this_genomic_align, $this_block_start,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1153 $this_block_end, $align_slice_length, $species_order);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1154
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1155 # Add a Slice, Mapper, and start-end-strand coordinates to an underlying AS::Slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1156 $this_underlying_slice->add_Slice_Mapper_pair(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1157 $this_core_slice,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1158 $this_mapper,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1159 $this_block_start,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1160 $this_block_end,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1161 $this_genomic_align->dnafrag_strand
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1162 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1163 return;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1164 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1165
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1166
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1167 sub _choose_underlying_Slice {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1168 my ($self, $this_genomic_align, $this_block_start, $this_block_end, $align_slice_length, $species_order) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1169 my $underlying_slice = undef;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1170
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1171 my $expanded = $self->{expanded};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1172 my $species = $this_genomic_align->dnafrag->genome_db->name;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1173
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1174 if (defined($this_genomic_align->{_temporary_AS_underlying_Slice})) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1175 my $preset_underlying_slice = $this_genomic_align->{_temporary_AS_underlying_Slice};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1176 delete($this_genomic_align->{_temporary_AS_underlying_Slice});
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1177 return $preset_underlying_slice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1178 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1179
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1180 if (!defined($self->{slices}->{lc($species)})) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1181 ## No slice for this species yet. Create, store and return it
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1182 $underlying_slice = new Bio::EnsEMBL::Compara::AlignSlice::Slice(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1183 -length => $align_slice_length,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1184 -requesting_slice => $self->reference_Slice,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1185 -align_slice => $self,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1186 -method_link_species_set => $self->{_method_link_species_set},
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1187 -genome_db => $this_genomic_align->dnafrag->genome_db,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1188 -expanded => $expanded,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1189 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1190 push(@{$self->{_slices}}, $underlying_slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1191 push(@{$self->{slices}->{lc($species)}}, $underlying_slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1192 return $underlying_slice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1193 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1194
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1195 if ($species_order) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1196 my $preset_underlying_slice = undef;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1197 foreach my $this_underlying_slice (@{$self->{_slices}}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1198 if (!$this_genomic_align->{original_dbID} and $this_genomic_align->dbID) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1199 $this_genomic_align->{original_dbID} = $this_genomic_align->dbID;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1200 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1201 if (grep {$_ == $this_genomic_align->{original_dbID}}
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1202 @{$this_underlying_slice->{genomic_align_ids}}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1203 $preset_underlying_slice = $this_underlying_slice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1204 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1205 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1206 if ($preset_underlying_slice) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1207 my $overlap = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1208 my $slice_mapper_pairs = $preset_underlying_slice->get_all_Slice_Mapper_pairs();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1209 foreach my $slice_mapper_pair (@$slice_mapper_pairs) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1210 my $block_start = $slice_mapper_pair->{start};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1211 my $block_end = $slice_mapper_pair->{end};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1212 #a block may not have a start and end if there is no sequence
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1213 #eg the cigar_line looks like 139D17186X
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1214 next if (!defined $this_block_start || !defined $this_block_end);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1215 if ($this_block_start <= $block_end and $this_block_end >= $block_start) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1216 $overlap = 1;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1217 last;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1218 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1219 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1220 if (!$overlap) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1221 ## This block does not overlap any previous block: add it!
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1222 $underlying_slice = $preset_underlying_slice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1223 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1224 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1225 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1226
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1227 if (!$underlying_slice) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1228 ## Try to add this alignment to an existing underlying Bio::EnsEMBL::Compara::AlignSlice::Slice
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1229 SLICE: foreach my $this_underlying_slice (@{$self->{slices}->{lc($species)}}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1230 my $slice_mapper_pairs = $this_underlying_slice->get_all_Slice_Mapper_pairs();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1231 PAIRS: foreach my $slice_mapper_pair (@$slice_mapper_pairs) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1232 my $block_start = $slice_mapper_pair->{start};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1233 my $block_end = $slice_mapper_pair->{end};
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1234 if ($this_block_start <= $block_end and $this_block_end >= $block_start) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1235 next SLICE; ## This block overlaps a previous block
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1236 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1237 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1238 ## This block does not overlap any previous block: add it!
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1239 $underlying_slice = $this_underlying_slice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1240 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1241 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1242
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1243 if (!$underlying_slice) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1244 ## This block overlaps at least one block in every available underlying
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1245 ## Bio::EnsEMBL::Compara::AlignSlice::Slice. Create a new one!
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1246 $underlying_slice = new Bio::EnsEMBL::Compara::AlignSlice::Slice(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1247 -length => $align_slice_length,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1248 -requesting_slice => $self->reference_Slice,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1249 -align_slice => $self,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1250 -method_link_species_set => $self->{_method_link_species_set},
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1251 -genome_db => $this_genomic_align->dnafrag->genome_db,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1252 -expanded => $expanded,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1253 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1254 push(@{$self->{_slices}}, $underlying_slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1255 push(@{$self->{slices}->{lc($species)}}, $underlying_slice);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1256 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1257
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1258 # if ($this_genomic_align->cigar_line =~ /X/) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1259 # ## This GenomicAlign is part of a composite alignment
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1260 # my $genomic_align_group = $this_genomic_align->genomic_align_group_by_type("composite");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1261 # foreach my $this_genomic_align (@{$genomic_align_group->genomic_align_array}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1262 # # next if ($this_genomic_align
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1263 # }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1264 # }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1265
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1266 return $underlying_slice;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1267 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1268
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1269
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1270 =head2 _sort_and_restrict_GenomicAlignBlocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1271
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1272 Arg[1] : listref of Bio::EnsEMBL::Compara::GenomicAlignBlocks $gabs
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1273 Example : $sorted_gabs = _sort_GenomicAlignBlocks($gabs);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1274 Description : This method returns the original list of
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1275 Bio::EnsEMBL::Compara::GenomicAlignBlock objects in order
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1276 Returntype : listref of Bio::EnsEMBL::Compara::GenomicAlignBlock objects
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1277 Exceptions :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1278 Caller : methodname()
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1279
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1280 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1281
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1282 sub _sort_and_restrict_GenomicAlignBlocks {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1283 my ($genomic_align_blocks) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1284 my $sorted_genomic_align_blocks = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1285 return $sorted_genomic_align_blocks if (!$genomic_align_blocks);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1286
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1287 my $last_end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1288 foreach my $this_genomic_align_block (sort _sort_gabs @{$genomic_align_blocks}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1289 if (defined($last_end) and
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1290 $this_genomic_align_block->reference_genomic_align->dnafrag_start <= $last_end) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1291 if ($this_genomic_align_block->reference_genomic_align->dnafrag_end > $last_end) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1292 $this_genomic_align_block = $this_genomic_align_block->restrict_between_reference_positions($last_end + 1, undef);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1293 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1294 warning("Ignoring GenomicAlignBlock because it overlaps".
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1295 " previous GenomicAlignBlock");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1296 next;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1297 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1298 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1299 $last_end = $this_genomic_align_block->reference_genomic_align->dnafrag_end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1300 push(@$sorted_genomic_align_blocks, $this_genomic_align_block);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1301 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1302
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1303 return $sorted_genomic_align_blocks;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1304 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1305
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1306 =head2 _sort_GenomicAlignBlocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1307
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1308 Arg[1] : listref of Bio::EnsEMBL::Compara::GenomicAlignBlocks $gabs
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1309 Example : $sorted_gabs = _sort_GenomicAlignBlocks($gabs);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1310 Description : This method returns the original list of
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1311 Bio::EnsEMBL::Compara::GenomicAlignBlock objects in order
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1312 Returntype : listref of Bio::EnsEMBL::Compara::GenomicAlignBlock objects
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1313 Exceptions :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1314 Caller : methodname()
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1315
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1316 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1317
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1318 sub _sort_GenomicAlignBlocks {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1319 my ($genomic_align_blocks) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1320 my $sorted_genomic_align_blocks = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1321 return $sorted_genomic_align_blocks if (!$genomic_align_blocks);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1322
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1323 my $last_end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1324 foreach my $this_genomic_align_block (sort _sort_gabs @{$genomic_align_blocks}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1325 if (!defined($last_end) or
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1326 $this_genomic_align_block->reference_genomic_align->dnafrag_start > $last_end) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1327 push(@$sorted_genomic_align_blocks, $this_genomic_align_block);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1328 $last_end = $this_genomic_align_block->reference_genomic_align->dnafrag_end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1329 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1330 my $block_id;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1331 if (UNIVERSAL::isa($a, "Bio::EnsEMBL::Compara::GenomicAlignBlock")) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1332 warning("Ignoring Bio::EnsEMBL::Compara::GenomicAlignBlock #".
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1333 ($this_genomic_align_block->dbID or "-unknown")." because it overlaps".
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1334 " previous Bio::EnsEMBL::Compara::GenomicAlignBlock");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1335 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1336 warning("Ignoring Bio::EnsEMBL::Compara::GenomicAlignTree #".
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1337 ($this_genomic_align_block->node_id or "-unknown")." because it overlaps".
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1338 " previous Bio::EnsEMBL::Compara::GenomicAlignTree");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1339 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1340
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1341 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1342 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1343
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1344 return $sorted_genomic_align_blocks;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1345 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1346
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1347 sub _sort_gabs {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1348
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1349 if (UNIVERSAL::isa($a, "Bio::EnsEMBL::Compara::GenomicAlignBlock")) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1350 _sort_genomic_align_block();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1351 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1352 _sort_genomic_align_tree();
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1353 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1354 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1355
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1356 sub _sort_genomic_align_block {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1357
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1358 if ($a->reference_genomic_align->dnafrag_start == $b->reference_genomic_align->dnafrag_start) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1359 ## This may happen when a block has been split into small pieces and some of them contain
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1360 ## gaps only for the reference species. In this case, use another species for sorting these
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1361 ## genomic_align_blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1362 for (my $i = 0; $i<@{$a->get_all_non_reference_genomic_aligns()}; $i++) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1363 for (my $j = 0; $j<@{$b->get_all_non_reference_genomic_aligns()}; $j++) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1364 next if ($a->get_all_non_reference_genomic_aligns->[$i]->dnafrag_id !=
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1365 $b->get_all_non_reference_genomic_aligns->[$j]->dnafrag_id);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1366 if (($a->get_all_non_reference_genomic_aligns->[$i]->dnafrag_start !=
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1367 $b->get_all_non_reference_genomic_aligns->[$j]->dnafrag_start) and
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1368 ($a->get_all_non_reference_genomic_aligns->[$i]->dnafrag_strand ==
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1369 $b->get_all_non_reference_genomic_aligns->[$j]->dnafrag_strand)) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1370 ## This other genomic_align is not a full gap and ca be used to sort these blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1371 if ($a->get_all_non_reference_genomic_aligns->[$i]->dnafrag_strand == 1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1372 return $a->get_all_non_reference_genomic_aligns->[$i]->dnafrag_start <=>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1373 $b->get_all_non_reference_genomic_aligns->[$j]->dnafrag_start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1374 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1375 return $b->get_all_non_reference_genomic_aligns->[$j]->dnafrag_start <=>
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1376 $a->get_all_non_reference_genomic_aligns->[$i]->dnafrag_start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1377 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1378 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1379 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1380 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1381 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1382 return $a->reference_genomic_align->dnafrag_start <=> $b->reference_genomic_align->dnafrag_start
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1383 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1384 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1385
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1386 sub _sort_genomic_align_tree {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1387
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1388 if ($a->reference_genomic_align->dnafrag_start == $b->reference_genomic_align->dnafrag_start) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1389 ## This may happen when a block has been split into small pieces and some of them contain
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1390 ## gaps only for the reference species. In this case, use another species for sorting these
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1391 ## genomic_align_blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1392 my $a_leaves = $a->get_all_leaves;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1393 my $b_leaves = $b->get_all_leaves;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1394
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1395 for (my $i = 0; $i < @$a_leaves; $i++) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1396 for (my $j = 0; $j < @$b_leaves; $j++) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1397 #look at high coverage sequences only
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1398 my $a_gas = $a_leaves->[$i]->get_all_genomic_aligns_for_node;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1399 next if (@$a_gas > 1);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1400 my $a_ga = $a_gas->[0];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1401
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1402 my $b_gas = $b_leaves->[$j]->get_all_genomic_aligns_for_node;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1403 next if (@$b_gas > 1);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1404 my $b_ga = $b_gas->[0];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1405
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1406 next if ($a_ga->dnafrag_id != $b_ga->dnafrag_id);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1407 if (($a_ga->dnafrag_start != $b_ga->dnafrag_start) and ($a_ga->dnafrag_strand == $b_ga->dnafrag_strand)) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1408 ## This other genomic_align is not a full gap and ca be used to sort these blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1409 if ($a_ga->dnafrag_strand == 1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1410 return $a_ga->dnafrag_start <=> $b_ga->dnafrag_start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1411 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1412 return $b_ga->dnafrag_start <=> $a_ga->dnafrag_start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1413 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1414 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1415 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1416 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1417 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1418 return $a->reference_genomic_align->dnafrag_start <=> $b->reference_genomic_align->dnafrag_start
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1419 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1420 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1421
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1422 =head2 _sort_and_compile_GenomicAlignBlocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1423
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1424 Arg[1] : listref of Bio::EnsEMBL::Compara::GenomicAlignBlocks $gabs
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1425 Example : $sorted_fake_gabs = _sort_and_compile_GenomicAlignBlocks($gabs);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1426 Description : This method returns a list of
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1427 Bio::EnsEMBL::Compara::GenomicAlignBlock objects sorted by
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1428 position on the reference Bio::EnsEMBL::Compara::DnaFrag. If two
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1429 or more Bio::EnsEMBL::Compara::GenomicAlignBlock objects
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1430 overlap, it compile them, using the _compile_GenomicAlignBlocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1431 method.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1432 Returntype : listref of Bio::EnsEMBL::Compara::GenomicAlignBlock objects
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1433 Exceptions :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1434 Caller : methodname()
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1435
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1436 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1437
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1438 sub _sort_and_compile_GenomicAlignBlocks {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1439 my ($genomic_align_blocks) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1440 my $sorted_genomic_align_blocks = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1441 return $sorted_genomic_align_blocks if (!$genomic_align_blocks);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1442
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1443 ##############################################################################################
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1444 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1445 ## Compile GenomicAlignBlocks in group of GenomicAlignBlocks based on reference coordinates
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1446 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1447 my $sets_of_genomic_align_blocks = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1448 my $start_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1449 my $end_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1450 my $this_set_of_genomic_align_blocks = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1451 foreach my $this_genomic_align_block (sort _sort_gabs @$genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1452 my $this_start_pos = $this_genomic_align_block->reference_genomic_align->dnafrag_start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1453 my $this_end_pos = $this_genomic_align_block->reference_genomic_align->dnafrag_end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1454 if (defined($end_pos) and ($this_start_pos <= $end_pos)) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1455 # this genomic_align_block overlaps previous one. Extend this set_of_coordinates
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1456 $end_pos = $this_end_pos if ($this_end_pos > $end_pos);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1457 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1458 # there is a gap between this genomic_align_block and the previous one. Close and save
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1459 # this set_of_genomic_align_blocks (if it exists) and start a new one.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1460 push(@{$sets_of_genomic_align_blocks}, [$start_pos, $end_pos, $this_set_of_genomic_align_blocks])
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1461 if (defined(@$this_set_of_genomic_align_blocks));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1462 $start_pos = $this_start_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1463 $end_pos = $this_end_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1464 $this_set_of_genomic_align_blocks = [];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1465 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1466 push(@$this_set_of_genomic_align_blocks, $this_genomic_align_block);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1467 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1468 push(@{$sets_of_genomic_align_blocks}, [$start_pos, $end_pos, $this_set_of_genomic_align_blocks])
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1469 if (defined(@$this_set_of_genomic_align_blocks));
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1470 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1471 ##############################################################################################
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1472
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1473 foreach my $this_set_of_genomic_align_blocks (@$sets_of_genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1474 my $this_compiled_genomic_align_block;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1475 if (@$this_set_of_genomic_align_blocks == 1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1476 $this_compiled_genomic_align_block = $this_set_of_genomic_align_blocks->[0];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1477 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1478 $this_compiled_genomic_align_block =
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1479 _compile_GenomicAlignBlocks(@$this_set_of_genomic_align_blocks);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1480 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1481 push(@{$sorted_genomic_align_blocks}, $this_compiled_genomic_align_block);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1482 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1483
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1484 return $sorted_genomic_align_blocks;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1485 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1486
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1487 =head2 _compile_GenomicAlignBlocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1488
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1489 Arg [1] : integer $start_pos (the start of the fake genomic_align)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1490 Arg [2] : integer $end_pos (the end of the fake genomic_align)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1491 Arg [3] : listref of Bio::EnsEMBL::Compara::GenomicAlignBlocks $set_of_genomic_align_blocks
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1492 $all_genomic_align_blocks (the pairwise genomic_align_blocks used for
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1493 this fake multiple genomic_aling_block)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1494 Example :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1495 Description :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1496 Returntype : Bio::EnsEMBL::Compara::GenomicAlignBlock object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1497 Exceptions :
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1498 Caller : methodname
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1499
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1500 =cut
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1501
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1502 sub _compile_GenomicAlignBlocks {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1503 my ($start_pos, $end_pos, $all_genomic_align_blocks) = @_;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1504
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1505 ############################################################################################
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1506 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1507 ## Change strands in order to have all reference genomic aligns on the forward strand
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1508 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1509 my $strand;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1510 foreach my $this_genomic_align_block (@$all_genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1511 my $this_genomic_align = $this_genomic_align_block->reference_genomic_align;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1512 if (!defined($strand)) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1513 $strand = $this_genomic_align->dnafrag_strand;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1514 } elsif ($strand != $this_genomic_align->dnafrag_strand) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1515 $strand = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1516 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1517 if ($this_genomic_align->dnafrag_strand == -1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1518
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1519 if (UNIVERSAL::isa($this_genomic_align_block, "Bio::EnsEMBL::Compara::GenomicAlignTree")) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1520 foreach my $this_node (@{$this_genomic_align_block->get_all_nodes}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1521 my $genomic_align_group = $this_node->genomic_align_group;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1522 next if (!$genomic_align_group);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1523 foreach my $genomic_align (@{$genomic_align_group->get_all_GenomicAligns}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1524 $genomic_align->reverse_complement;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1525 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1526 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1527 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1528 foreach my $genomic_align (@{$this_genomic_align_block->genomic_align_array}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1529 $genomic_align->reverse_complement;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1530 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1531 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1532 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1533 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1534 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1535 ############################################################################################
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1536
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1537 ## Nothing has to be compiled if there is one single GenomicAlignBlock!
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1538 $all_genomic_align_blocks->[0]->reverse_complement;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1539 return $all_genomic_align_blocks->[0] if (scalar(@$all_genomic_align_blocks) == 1);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1540
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1541 ############################################################################################
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1542 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1543 ## Fix all sequences
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1544 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1545 foreach my $this_genomic_align_block (@$all_genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1546 my $this_genomic_align = $this_genomic_align_block->reference_genomic_align;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1547 my $this_start_pos = $this_genomic_align->dnafrag_start;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1548 my $this_end_pos = $this_genomic_align->dnafrag_end;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1549 my $starting_gap = $this_start_pos - $start_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1550 my $ending_gap = $end_pos - $this_end_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1551
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1552 my $this_cigar_line = $this_genomic_align->cigar_line;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1553 my $this_original_sequence = $this_genomic_align->original_sequence;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1554 $this_genomic_align->aligned_sequence("");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1555 if ($starting_gap) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1556 $this_cigar_line = $starting_gap."M".$this_cigar_line;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1557 $this_original_sequence = ("N" x $starting_gap).$this_original_sequence;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1558 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1559 if ($ending_gap) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1560 $this_cigar_line .= $ending_gap."M";
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1561 $this_original_sequence .= ("N" x $ending_gap);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1562 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1563 $this_genomic_align->cigar_line($this_cigar_line);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1564 $this_genomic_align->original_sequence($this_original_sequence);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1565
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1566 foreach my $this_genomic_align (@{$this_genomic_align_block->get_all_non_reference_genomic_aligns}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1567 $this_genomic_align->aligned_sequence("");
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1568 my $this_cigar_line = $this_genomic_align->cigar_line;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1569 $this_cigar_line = $starting_gap."D".$this_cigar_line if ($starting_gap);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1570 $this_cigar_line .= $ending_gap."D" if ($ending_gap);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1571 $this_genomic_align->cigar_line($this_cigar_line);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1572 $this_genomic_align->aligned_sequence(); # compute aligned_sequence using cigar_line
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1573 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1574 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1575 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1576 ############################################################################################
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1577
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1578 ############################################################################################
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1579 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1580 ## Distribute gaps
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1581 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1582 my $aln_pos = 0;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1583 my $gap;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1584 do {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1585 my $gap_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1586 my $genomic_align_block;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1587 $gap = undef;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1588
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1589 ## Get the (next) first gap from all the alignments (sets: $gap_pos, $gap and $genomic_align_block_id)
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1590 foreach my $this_genomic_align_block (@$all_genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1591 my $this_gap_pos = index($this_genomic_align_block->reference_genomic_align->aligned_sequence,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1592 "-", $aln_pos);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1593 if ($this_gap_pos > 0 and (!defined($gap_pos) or $this_gap_pos < $gap_pos)) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1594 $gap_pos = $this_gap_pos;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1595 my $gap_string = substr($this_genomic_align_block->reference_genomic_align->aligned_sequence,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1596 $gap_pos);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1597 ($gap) = $gap_string =~ /^(\-+)/;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1598 $genomic_align_block = $this_genomic_align_block;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1599 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1600 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1601
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1602 ## If a gap has been found, apply it to the other GAB
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1603 if ($gap) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1604 $aln_pos = $gap_pos + length($gap);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1605 foreach my $this_genomic_align_block (@$all_genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1606 next if ($genomic_align_block eq $this_genomic_align_block); # Do not add gap to itself!!
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1607 if (UNIVERSAL::isa($this_genomic_align_block, "Bio::EnsEMBL::Compara::GenomicAlignTree")) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1608 foreach my $this_node (@{$this_genomic_align_block->get_all_nodes}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1609 my $genomic_align_group = $this_node->genomic_align_group;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1610 next if (!$genomic_align_group);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1611 foreach my $this_genomic_align (@{$genomic_align_group->get_all_GenomicAligns}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1612 # insert gap in the aligned_sequence
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1613 my $aligned_sequence = $this_genomic_align->aligned_sequence;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1614 substr($aligned_sequence, $gap_pos, 0, $gap);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1615 $this_genomic_align->aligned_sequence($aligned_sequence);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1616 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1617 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1618 } else {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1619 foreach my $this_genomic_align (@{$this_genomic_align_block->genomic_align_array}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1620 # insert gap in the aligned_sequence
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1621 my $aligned_sequence = $this_genomic_align->aligned_sequence;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1622 substr($aligned_sequence, $gap_pos, 0, $gap);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1623 $this_genomic_align->aligned_sequence($aligned_sequence);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1624 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1625 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1626 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1627 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1628
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1629 } while ($gap); # exit loop if no gap has been found
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1630
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1631 ## Fix all cigar_lines in order to match new aligned_sequences
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1632 foreach my $this_genomic_align_block (@$all_genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1633 foreach my $this_genomic_align (@{$this_genomic_align_block->genomic_align_array}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1634 $this_genomic_align->cigar_line(""); # undef old cigar_line
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1635 $this_genomic_align->cigar_line(); # compute cigar_line from aligned_sequence
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1636 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1637 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1638 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1639 ############################################################################################
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1640
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1641 ############################################################################################
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1642 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1643 ## Create the reference_genomic_align for this fake genomic_align_block
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1644 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1645 ## All the blocks have been edited and all the reference genomic_aling
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1646 ## should be equivalent. Here, we create a new one with no fixed sequence.
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1647 ## This permits to retrieve the real sequence when needed
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1648 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1649 my $reference_genomic_align;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1650 if (@$all_genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1651 my $this_genomic_align = $all_genomic_align_blocks->[0]->reference_genomic_align;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1652 $reference_genomic_align = new Bio::EnsEMBL::Compara::GenomicAlign(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1653 -dbID => -1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1654 -dnafrag => $this_genomic_align->dnafrag,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1655 -dnafrag_start => $start_pos,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1656 -dnafrag_end => $end_pos,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1657 -dnafrag_strand => 1,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1658 -cigar_line => $this_genomic_align->cigar_line,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1659 -method_link_species_set => $this_genomic_align->method_link_species_set,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1660 -visible => 1
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1661 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1662 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1663 ##
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1664 ############################################################################################
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1665
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1666 ## Create the genomic_align_array (the list of genomic_aling for this fake gab
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1667 my $genomic_align_array = [$reference_genomic_align];
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1668 foreach my $this_genomic_align_block (@$all_genomic_align_blocks) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1669 foreach my $this_genomic_align (@{$this_genomic_align_block->get_all_non_reference_genomic_aligns}) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1670 $this_genomic_align->genomic_align_block_id(0); # undef old genomic_align_block_id
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1671 push(@$genomic_align_array, $this_genomic_align);
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1672 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1673 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1674
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1675 ## Create the fake multiple Bio::EnsEMBL::Compara::GenomicAlignBlock
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1676 my $fake_genomic_align_block = new Bio::EnsEMBL::Compara::GenomicAlignBlock(
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1677 -length => ($end_pos - $start_pos + 1),
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1678 -genomic_align_array => $genomic_align_array,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1679 -reference_genomic_align => $reference_genomic_align,
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1680 -level_id => 0
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1681 );
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1682
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1683 if ($strand == -1) {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1684 $fake_genomic_align_block->reverse_complement;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1685 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1686
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1687 return $fake_genomic_align_block;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1688 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1689
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1690
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1691 sub DESTROY {
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1692 my $self = shift;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1693 ## Remove circular reference in order to allow Perl to clear the object
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1694 $self->{all_genomic_align_blocks} = undef;
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1695 }
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1696
21066c0abaf5 Uploaded
willmclaren
parents:
diff changeset
1697 1;