annotate variant_effect_predictor/Bio/Tools/Lucy.pm @ 0:1f6dce3d34e0

Uploaded
author mahtabm
date Thu, 11 Apr 2013 02:01:53 -0400
# $Id: Lucy.pm,v 1.6 2002/10/22 07:38:46 lapp Exp $
#
# BioPerl module for Bio::Tools::Lucy
#
# Copyright Her Majesty the Queen of England
# written by Andrew Walsh (paeruginosa@hotmail.com) during employment with
# Agriculture and Agri-food Canada, Cereal Research Centre, Winnipeg, MB
#
# You may distribute this module under the same terms as perl itself
# POD documentation - main docs before the code

=head1 NAME

Bio::Tools::Lucy - Object for analyzing the output from Lucy,
a vector and quality trimming program from TIGR

=head1 SYNOPSIS

  # Create the Lucy object from an existing Lucy output file
  @params = ('seqfile' => 'lucy.seq', 'lucy_verbose' => 1);
  $lucyObj = Bio::Tools::Lucy->new(@params);

  # Get names of all sequences
  $names = $lucyObj->get_sequence_names();

  # Print seq and qual values for sequences >400 bp in order to run CAP3
  foreach $name (@$names) {
      next unless $lucyObj->length_clear($name) > 400;
      print SEQ ">$name\n", $lucyObj->sequence($name), "\n";
      print QUAL ">$name\n", $lucyObj->quality($name), "\n";
  }

  # Get an array of Bio::PrimarySeq objects
  @seqObjs = $lucyObj->get_Seq_Objs();

=head1 DESCRIPTION

Bio::Tools::Lucy.pm provides methods for analyzing the sequence and
quality values generated by the Lucy program from TIGR.

Lucy will identify vector, poly-A/T tails, and poor quality regions in
a sequence. (www.genomics.purdue.edu/gcg/other/lucy.pdf)

The input to Lucy can be the Phred sequence and quality files
generated from running Phred on a set of chromatograms.

Lucy can be obtained (free of charge to academic users) from
www.tigr.org/softlab

There are a few methods that will only be available if you make some
minor changes to the source for Lucy and then recompile. The changes
are in the 'lucy.c' file and there is a diff between the original and
the modified file in the Appendix.

Please contact the author of this module if you have any problems
making these modifications.

You do not have to make these modifications to use this module.

=head2 Creating a Lucy object

  @params = ('seqfile' => 'lucy.seq', 'adv_stderr' => 1,
             'fwd_desig' => '_F', 'rev_desig' => '_R');
  $lucyObj = Bio::Tools::Lucy->new(@params);

=head2 Using a Lucy object

First get a reference to an array of the sequence names; the accessor
methods take a sequence name as their argument. Note: The Lucy binary
program will fail unless the sequence names provided as input are unique.

  $names = $lucyObj->get_sequence_names();

This code snippet will produce a Fasta format file with sequence
lengths and %GC in the description line.

  foreach $name (@$names) {
      print FILE ">$name\t",
                 $lucyObj->length_clear($name), "\t",
                 $lucyObj->per_GC($name), "\n",
                 $lucyObj->sequence($name), "\n";
  }

Print seq and qual values for sequences >400 bp in order to assemble
them with CAP3 (or other assembler).

  foreach $name (@$names) {
      next unless $lucyObj->length_clear($name) > 400;
      print SEQ ">$name\n", $lucyObj->sequence($name), "\n";
      print QUAL ">$name\n", $lucyObj->quality($name), "\n";
  }

Get all the sequences as Bio::PrimarySeq objects (e.g., for use with
Bio::Tools::Blast to perform BLAST).

  @seqObjs = $lucyObj->get_Seq_Objs();

Or use only those sequences that are full length and have a Poly-A
tail.

  foreach $name (@$names) {
      next unless ($lucyObj->full_length($name) and $lucyObj->polyA($name));
      push @seqObjs, $lucyObj->get_Seq_Obj($name);
  }

Get the names of those sequences that were rejected by Lucy.

  $rejects_ref = $lucyObj->get_rejects();

Print the names of the rejects and the 1 letter code for the reason
they were rejected.

  foreach $key (sort keys %$rejects_ref) {
      print "$key: ", $rejects_ref->{$key}, "\n";
  }
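
The one-letter codes can be expanded into a readable report. The sketch
below is only an illustration: the helper name describe_rejects and the
sample data are hypothetical, and the code meanings are read off the
patterns that _parse matches in the standard error file (with adv_stderr
set), so treat them as assumptions and not as an official table.

```perl
use strict;
use warnings;

# Assumed meanings, inferred from the patterns matched in _parse; not an official table
my %reason_for = (
    Q => 'dropped for low quality',
    V => 'vector',
    E => 'empty (no insert)',
    S => 'too short',
    P => 'dropped poly-A',
);

# Hypothetical helper: turn a rejects hash into printable report lines,
# sorted by sequence name
sub describe_rejects {
    my ($rejects_ref) = @_;
    return map {
        my $code = $rejects_ref->{$_};
        "$_: $code (" . ($reason_for{$code} || 'unknown reason') . ")";
    } sort keys %$rejects_ref;
}

# Made-up sample data standing in for $lucyObj->get_rejects()
my %sample = (read_b => 'V', read_a => 'Q');
print "$_\n" for describe_rejects(\%sample);
```

With a real object, the hash reference returned by get_rejects() would be passed in place of the sample hash.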

There is a lot of other information available about the sequences
analyzed by Lucy (see APPENDIX). This module can be used with the
DBI module to store this sequence information in a database.

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bioperl.org - General discussion
  http://bio.perl.org/MailList.html - About the mailing lists

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via email
or the web:

  bioperl-bugs@bio.perl.org
  http://bugzilla.bioperl.org/

=head1 AUTHOR

Andrew G. Walsh paeruginosa@hotmail.com

=head1 APPENDIX

Methods available to Lucy objects are described below. Please note
that any method beginning with an underscore is considered internal
and should not be called directly.

=cut


package Bio::Tools::Lucy;

use vars qw($VERSION $AUTOLOAD @ISA @ATTR %OK_FIELD);
use strict;
use Bio::PrimarySeq;
use Bio::Root::Root;
use Bio::Root::IO;

@ISA = qw(Bio::Root::Root Bio::Root::IO);
# Attributes that may be read or set through AUTOLOAD
@ATTR = qw(seqfile qualfile stderrfile infofile lucy_verbose fwd_desig rev_desig adv_stderr);
foreach my $attr (@ATTR) {
    $OK_FIELD{$attr}++;
}
$VERSION = "0.01";

# Generic get/set accessor for any attribute listed in @ATTR
sub AUTOLOAD {
    my $self = shift;
    my $attr = $AUTOLOAD;
    $attr =~ s/.*:://;    # strip the package name
    $attr = lc $attr;
    $self->throw("Unallowed parameter: $attr !") unless $OK_FIELD{$attr};
    $self->{$attr} = shift if @_;
    return $self->{$attr};
}

=head2 new

 Title   :  new
 Usage   :  $lucyObj = Bio::Tools::Lucy->new(seqfile => 'lucy.seq', rev_desig => '_R',
                                             fwd_desig => '_F')
 Function:  creates a Lucy object from Lucy analysis files
 Returns :  reference to Bio::Tools::Lucy object
 Args    :  seqfile      Fasta sequence file generated by Lucy
            qualfile     Quality values file generated by Lucy
            infofile     Info file created when Lucy is run with -debug 'infofile' option
            stderrfile   Standard error captured from Lucy when Lucy is run
                         with -info option and STDERR is directed to stderrfile
                         (i.e. lucy ... 2> stderrfile).
                         Info in this file will include sequences dropped for low
                         quality. If you've modified Lucy source (see adv_stderr below),
                         it will also include info on which sequences were dropped because
                         they were vector, too short, had no insert, and whether a poly-A
                         tail was found (if Lucy was run with -cdna option).
            lucy_verbose Verbosity level (0-1).
            fwd_desig    The string used to determine whether sequence is a forward read.
                         The parser will assume that this match will occur at the
                         end of the sequence name string.
            rev_desig    As above, for reverse reads.
            adv_stderr   Can be set to a true value (1). Will only work if you have modified
                         the Lucy source code as outlined in DESCRIPTION and capture
                         the standard error from Lucy.
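
As a rough illustration of how the fwd_desig and rev_desig strings are
applied, the sketch below classifies read names by matching each
designator at the end of the name. The helper name read_direction and the
sample read names are hypothetical and not part of the module's
interface; the anchored pattern mirrors the one used in _parse.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Tools::Lucy): returns "F", "R",
# or undef depending on which designator ends the read name.
sub read_direction {
    my ($name, $fwd_desig, $rev_desig) = @_;
    my $direction;
    # The designator is interpolated as a pattern and anchored to the end of the name
    $direction = "F" if $fwd_desig && $name =~ /^(\S+)($fwd_desig)$/;
    $direction = "R" if $rev_desig && $name =~ /^(\S+)($rev_desig)$/;
    return $direction;
}

# Made-up read names; the last one contains _F but does not end with it
foreach my $name (qw(clone12_F clone12_R clone_F_12)) {
    my $dir = read_direction($name, '_F', '_R');
    print "$name: ", (defined $dir ? $dir : "unknown"), "\n";
}
```

Because the designator is used as a pattern, characters that are special in regular expressions would need escaping in a designator string.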

If you don't provide filenames for qualfile, infofile or stderrfile,
the module will assume that .qual, .info, and .stderr are the file
extensions and search in the same directory as the .seq file for these
files.

For example, if you create a Lucy object with $lucyObj =
Bio::Tools::Lucy-E<gt>new(seqfile =E<gt> 'lucy.seq'), the module will
look for lucy.qual, lucy.info and lucy.stderr.
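
The default names can be worked out by hand. The following is a minimal
sketch, assuming the base name is taken by stripping the final extension
with the same pattern that _parse applies to seqfile; the helper name
default_companion_files is hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Tools::Lucy): given the sequence file
# name, list the default companion files that would be looked for.
sub default_companion_files {
    my ($seqfile) = @_;
    # Strip the final extension, as _parse does with /^(\S+)\.\S+$/
    my ($base) = $seqfile =~ /^(\S+)\.\S+$/;
    return unless defined $base;    # no extension, so no defaults can be derived
    return map { "$base.$_" } qw(qual info stderr);
}

print join(", ", default_companion_files('lucy.seq')), "\n";
```

A file is only used if it actually exists, so a missing companion file simply leaves the corresponding methods unavailable.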

You can omit any or all of the quality, info or stderr files, but you
will not be able to use all of the object methods (see method
documentation below).

=cut

sub new {
    my ($class,@args) = @_;
    my $self = $class->SUPER::new(@args);
    my ($attr, $value);
    # Store every key/value pair on the object, with keys lower-cased
    while (@args) {
        $attr = shift @args;
        $attr = lc $attr;
        $value = shift @args;
        $self->{$attr} = $value;
    }
    &_parse($self);
    return $self;
}

=head2 _parse

 Title   :  _parse
 Usage   :  n/a (internal function)
 Function:  called by new() to parse Lucy output files
 Returns :  nothing
 Args    :  none
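
To make the parsing step concrete, here is a small standalone sketch of
how one record of the sequence file is handled: the header carries the
name, three clone length fields and the 1-based clear range, and the raw
sequence is cut down to that range before %GC is computed. The helper
name trim_record and the sample record are hypothetical; unlike _parse,
this sketch takes the clear length from length() and guards against an
empty clear range.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one Lucy FASTA header plus its raw sequence
# and return the trimmed (clear range) sequence with basic statistics.
sub trim_record {
    my ($header, $raw) = @_;
    my ($name, $min, $max, $med, $beg_clear, $end_clear) =
        $header =~ /^>(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/
        or return;                      # not a Lucy header line
    my $beg   = $beg_clear - 1;         # substr is 0-based, clear range is 1-based
    my $clear = substr($raw, $beg, $end_clear - $beg);
    my $gc    = ($clear =~ tr/GC//);    # tr with an empty replacement only counts
    my $len   = length $clear;
    return {
        name         => $name,
        sequence     => $clear,
        length_clear => $len,
        per_GC       => $len ? $gc / $len * 100 : 0,
    };
}

# Made-up record: clear range is positions 3 to 10 of a 12-base read
my $rec = trim_record('>read1_F 0 0 0 3 10', 'NNACGTGCATNN');
printf "%s %s %d %.1f\n", @{$rec}{qw(name sequence length_clear per_GC)};
```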

=cut

sub _parse {
    my $self = shift;
    # Base name of the sequence file, used to locate the .qual, .info and .stderr files
    my ($file) = $self->{seqfile} =~ /^(\S+)\.\S+$/;

    print "Opening $self->{seqfile} for parsing...\n" if $self->{lucy_verbose};
    open SEQ, "$self->{seqfile}" or $self->throw("Could not open sequence file: $self->{seqfile}");
    my ($name, $line);
    my $seq = "";
    my @lines = <SEQ>;
    # Read the file backwards so that the sequence lines of a record are collected before its header
    while ($line = pop @lines) {
        chomp $line;
        if ($line =~ /^>(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/) {
            $name = $1;
            if ($self->{fwd_desig}) {
                $self->{sequences}{$name}{direction} = "F" if $name =~ /^(\S+)($self->{fwd_desig})$/;
            }
            if ($self->{rev_desig}) {
                $self->{sequences}{$name}{direction} = "R" if $name =~ /^(\S+)($self->{rev_desig})$/;
            }
            $self->{sequences}{$name}{min_clone_len} = $2; # this is used for TIGR Assembler, as are $3 and $4
            $self->{sequences}{$name}{max_clone_len} = $3;
            $self->{sequences}{$name}{med_clone_len} = $4;
            $self->{sequences}{$name}{beg_clear} = $5;
            $self->{sequences}{$name}{end_clear} = $6;
            $self->{sequences}{$name}{length_raw} = $seq =~ tr/AGCTN//; # from what I've seen, these are the bases Phred calls. Please let me know if I'm wrong.
            my $beg = $5-1; # substr function begins with index 0
            $seq = $self->{sequences}{$name}{sequence} = substr ($seq, $beg, $6-$beg);
            my $count = $self->{sequences}{$name}{length_clear} = $seq =~ tr/AGCTN//;
            my $countGC = $seq =~ tr/GC//;
            $self->{sequences}{$name}{per_GC} = $countGC/$count * 100;
            $seq = "";
        }
        else {
            $seq = $line.$seq;
        }
    }


    # now parse quality values (check for presence of quality file first)
    if ($self->{qualfile}) {
        open QUAL, "$self->{qualfile}" or $self->throw("Could not open quality file: $self->{qualfile}");
        @lines = <QUAL>;
    }
    elsif (-e "$file.qual") {
        print "You did not set qualfile, but I'm opening $file.qual\n" if $self->{lucy_verbose};
        $self->qualfile("$file.qual");
        open QUAL, "$file.qual" or $self->throw("Could not open quality file: $file.qual");
        @lines = <QUAL>;
    }
    else {
        print "I did not find a quality file. You will not be able to use all of the accessor methods.\n" if $self->{lucy_verbose};
        @lines = ();
    }

    my (@vals, @slice, $num, $tot, $vals);
    my $qual = "";
    while ($line = pop @lines) {
        chomp $line;
        if ($line =~ /^>(\S+)/) {
            $name = $1;
            @vals = split /\s/ , $qual;
            @slice = @vals[$self->{sequences}{$name}{beg_clear} .. $self->{sequences}{$name}{end_clear}];
            $vals = join "\t", @slice;
            $self->{sequences}{$name}{quality} = $vals;
            $qual = "";
            foreach $num (@slice) {
                $tot += $num;
            }
            $num = @slice;
            $self->{sequences}{$name}{avg_quality} = $tot/$num;
            $tot = 0;
        }
        else {
            $qual = $line.$qual;
        }
    }

    # determine whether reads are full length

    if ($self->{infofile}) {
        open INFO, "$self->{infofile}" or $self->throw("Could not open info file: $self->{infofile}");
        @lines = <INFO>;
    }
    elsif (-e "$file.info") {
        print "You did not set infofile, but I'm opening $file.info\n" if $self->{lucy_verbose};
        $self->infofile("$file.info");
        open INFO, "$file.info" or $self->throw("Could not open info file: $file.info");
        @lines = <INFO>;
    }
    else {
        print "I did not find an info file. You will not be able to use all of the accessor methods.\n" if $self->{lucy_verbose};
        @lines = ();
    }

    foreach (@lines) {
        next unless /^(\S+).+CLV\s+(\d+)\s+(\d+)$/; # skip lines without cleavage info
        if ($2>0 && $3>0) {
            $self->{sequences}{$1}{full_length} = 1 if $self->{sequences}{$1}; # will show cleavage info for rejected sequences too
        }
    }


    # parse rejects (and presence of poly-A if Lucy has been modified)

    if ($self->{stderrfile}) {
        open STDERR_LUCY, "$self->{stderrfile}" or $self->throw("Could not open standard error file: $self->{stderrfile}");
        @lines = <STDERR_LUCY>;

    }
    elsif (-e "$file.stderr") {
        print "You did not set stderrfile, but I'm opening $file.stderr\n" if $self->{lucy_verbose};
        $self->stderrfile("$file.stderr");
        open STDERR_LUCY, "$file.stderr" or $self->throw("Could not open standard error file: $file.stderr");
        @lines = <STDERR_LUCY>;
    }
    else {
        print "I did not find a standard error file. You will not be able to use all of the accessor methods.\n" if $self->{lucy_verbose};
        @lines = ();
    }

    if ($self->{adv_stderr}) {
        foreach (@lines) {
            $self->{reject}{$1} = "Q" if /dropping\s+(\S+)/;
            $self->{reject}{$1} = "V" if /Vector: (\S+)/;
            $self->{reject}{$1} = "E" if /Empty: (\S+)/;
            $self->{reject}{$1} = "S" if /Short: (\S+)/;
            $self->{sequences}{$1}{polyA} = 1 if /(\S+) has PolyA/;
            if (/Dropped PolyA: (\S+)/) {
                $self->{reject}{$1} = "P";
                delete $self->{sequences}{$1};
            }
        }
    }
    else {
        foreach (@lines) {
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
384 $self->{reject}{$1} = "R" if /dropping\s+(\S+)/;
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
385 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
386 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
387
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
388 }
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
389
1f6dce3d34e0 Uploaded
mahtabm
parents:
diff changeset
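=pod

The reject parsing above can be tried out in isolation. What follows is a minimal sketch, not part of Bio::Tools::Lucy: C<classify_rejects> is a hypothetical helper, the sample lines are made up, and the C<Short> pattern is assumed to accept both C<Short: name> and the C<Short/ no insert: name> form printed by the source diff at the end of this file.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Tools::Lucy): map read names to a
# one-letter reject code, given the lines of a saved Lucy standard error file.
# $adv_stderr mirrors the module's 'adv_stderr' setting.
sub classify_rejects {
    my ($adv_stderr, @lines) = @_;
    my %reject;
    foreach my $line (@lines) {
        if ($adv_stderr) {
            $reject{$1} = 'Q' if $line =~ /dropping\s+(\S+)/;
            $reject{$1} = 'V' if $line =~ /Vector: (\S+)/;
            $reject{$1} = 'E' if $line =~ /Empty: (\S+)/;
            # ASSUMPTION: accept both spellings of the "short" message
            $reject{$1} = 'S' if $line =~ m{Short(?:/ no insert)?: (\S+)};
            $reject{$1} = 'P' if $line =~ /Dropped PolyA: (\S+)/;
        }
        else {
            # unmodified Lucy only reports that a read was dropped
            $reject{$1} = 'R' if $line =~ /dropping\s+(\S+)/;
        }
    }
    return \%reject;
}

# Made-up sample lines, for illustration only
my $rejects = classify_rejects(1,
    "dropping read_01\n",
    "Vector: read_02\n",
    "Dropped PolyA: read_03\n",
);
print "$_ => $rejects->{$_}\n" for sort keys %$rejects;
```

With the sample lines shown, the three reads are assigned the codes Q, V and P respectively.

=cut
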
=head2 get_Seq_Objs

 Title   : get_Seq_Objs
 Usage   : $lucyObj->get_Seq_Objs()
 Function: returns a reference to an array of Bio::PrimarySeq objects
           where -id = 'sequence name' and -seq = 'sequence'

 Returns : reference to an array of Bio::PrimarySeq objects
 Args    : none

=cut

sub get_Seq_Objs {
    my $self = shift;
    my($seqobj, @seqobjs);
    foreach my $key (sort keys %{$self->{sequences}}) {
        $seqobj = Bio::PrimarySeq->new( -seq => "$self->{sequences}{$key}{sequence}",
                                        -id  => "$key");
        push @seqobjs, $seqobj;
    }
    return \@seqobjs;
}

=head2 get_Seq_Obj

 Title   : get_Seq_Obj
 Usage   : $lucyObj->get_Seq_Obj($seqname)
 Function: returns reference to a Bio::PrimarySeq object where -id = 'sequence name'
           and -seq = 'sequence'
 Returns : reference to Bio::PrimarySeq object
 Args    : name of a sequence

=cut

sub get_Seq_Obj {
    my ($self, $key) = @_;
    my $seqobj = Bio::PrimarySeq->new( -seq => "$self->{sequences}{$key}{sequence}",
                                       -id  => "$key");
    return $seqobj;
}

=head2 get_sequence_names

 Title   : get_sequence_names
 Usage   : $lucyObj->get_sequence_names
 Function: returns reference to an array of names of the sequences analyzed by Lucy.
           These names are required for most of the accessor methods.
           Note: The Lucy binary will fail unless sequence names are unique.
 Returns : array reference
 Args    : none

=cut

sub get_sequence_names {
    my $self = shift;
    my @keys = sort keys %{$self->{sequences}};
    return \@keys;
}

=head2 sequence

 Title   : sequence
 Usage   : $lucyObj->sequence($seqname)
 Function: returns the DNA sequence of one of the sequences analyzed by Lucy.
 Returns : string
 Args    : name of a sequence

=cut

sub sequence {
    my ($self, $key) = @_;
    return $self->{sequences}{$key}{sequence};
}

=head2 quality

 Title   : quality
 Usage   : $lucyObj->quality($seqname)
 Function: returns the quality values of one of the sequences analyzed by Lucy.
           This method depends on the user having provided a quality file.
 Returns : string
 Args    : name of a sequence

=cut

sub quality {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{quality};
}

=head2 avg_quality

 Title   : avg_quality
 Usage   : $lucyObj->avg_quality($seqname)
 Function: returns the average quality value for one of the sequences analyzed by Lucy.
 Returns : float
 Args    : name of a sequence

=cut

sub avg_quality {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{avg_quality};
}

=head2 direction

 Title   : direction
 Usage   : $lucyObj->direction($seqname)
 Function: returns the direction for one of the sequences analyzed by Lucy,
           provided that 'fwd_desig' or 'rev_desig' was set when the
           Lucy object was created.
           Strings returned are: 'F' for forward, 'R' for reverse.
           An empty string is returned if no direction is known.
 Returns : string
 Args    : name of a sequence

=cut

sub direction {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{direction} if $self->{sequences}{$key}{direction};
    return "";
}

=head2 length_raw

 Title   : length_raw
 Usage   : $lucyObj->length_raw($seqname)
 Function: returns the length of a DNA sequence prior to quality/vector
           trimming by Lucy.
 Returns : integer
 Args    : name of a sequence

=cut

sub length_raw {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{length_raw};
}

=head2 length_clear

 Title   : length_clear
 Usage   : $lucyObj->length_clear($seqname)
 Function: returns the length of a DNA sequence following quality/vector
           trimming by Lucy.
 Returns : integer
 Args    : name of a sequence

=cut

sub length_clear {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{length_clear};
}

=head2 start_clear

 Title   : start_clear
 Usage   : $lucyObj->start_clear($seqname)
 Function: returns the beginning position of the good-quality, vector-free
           DNA sequence determined by Lucy.
 Returns : integer
 Args    : name of a sequence

=cut

sub start_clear {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{beg_clear};
}


=head2 end_clear

 Title   : end_clear
 Usage   : $lucyObj->end_clear($seqname)
 Function: returns the ending position of the good-quality, vector-free
           DNA sequence determined by Lucy.
 Returns : integer
 Args    : name of a sequence

=cut

sub end_clear {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{end_clear};
}

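=pod

As an illustration of how a start and end position such as those returned by C<start_clear> and C<end_clear> might be applied to an untrimmed read held elsewhere, here is a minimal sketch. C<clear_range> is a hypothetical helper, not part of this module, and it assumes 1-based, inclusive coordinates, which the documentation above does not state; check this against your own Lucy output before relying on it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Tools::Lucy): cut the clear range
# out of a raw read string.
# ASSUMPTION: $start and $end are 1-based, inclusive positions.
sub clear_range {
    my ($raw, $start, $end) = @_;
    # refuse coordinates that fall outside the read or are reversed
    return '' if $start < 1 || $end < $start || $end > length $raw;
    return substr($raw, $start - 1, $end - $start + 1);
}

# Made-up read: four low-quality bases, eight good bases, two trailing bases
print clear_range('NNNNACGTACGTNN', 5, 12), "\n";   # prints ACGTACGT
```

Under the same assumption, the length of the returned string is C<$end - $start + 1>.

=cut
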
=head2 per_GC

 Title   : per_GC
 Usage   : $lucyObj->per_GC($seqname)
 Function: returns the percent GC content of the good-quality, vector-free
           DNA sequence determined by Lucy.
 Returns : float
 Args    : name of a sequence

=cut

sub per_GC {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{per_GC};
}

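=pod

For readers who want to cross-check a GC value by hand, the following is a minimal sketch of one common way to compute percent GC from a DNA string. C<percent_gc> is a hypothetical helper, not part of this module, and it assumes the percentage is taken over unambiguous bases (A, C, G, T) only; the module may compute its stored value differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::Tools::Lucy): percent G+C of a DNA
# string. ASSUMPTION: only A, C, G and T (either case) count toward the total.
sub percent_gc {
    my ($seq) = @_;
    my $gc    = ($seq =~ tr/GCgc//);       # tr with an empty replacement only counts
    my $total = ($seq =~ tr/ACGTacgt//);
    return 0 unless $total;                # avoid dividing by zero on empty input
    return 100 * $gc / $total;
}

printf "%.1f\n", percent_gc('ACGTGGCC');   # prints 75.0
```

=cut
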
=head2 full_length

 Title   : full_length
 Usage   : $lucyObj->full_length($seqname)
 Function: returns the truth value for whether or not the sequence read was
           full length (i.e. vector present on both ends of read). This method
           depends on the user having provided the 'info' file (Lucy must be
           run with the -debug 'info_filename' option to get this file).
 Returns : boolean
 Args    : name of a sequence

=cut

sub full_length {
    my($self, $key) = @_;
    return 1 if $self->{sequences}{$key}{full_length};
    return 0;
}

=head2 polyA

 Title   : polyA
 Usage   : $lucyObj->polyA($seqname)
 Function: returns the truth value for whether or not a poly-A tail was detected
           and clipped by Lucy. This method depends on the user having modified
           the source for Lucy as outlined in DESCRIPTION and invoking Lucy with
           the -cdna option and saving the standard error.
           Note, the final sequence will not show the poly-A/T region.
 Returns : boolean
 Args    : name of a sequence

=cut

sub polyA {
    my($self, $key) = @_;
    return 1 if $self->{sequences}{$key}{polyA};
    return 0;
}

=head2 get_rejects

 Title   : get_rejects
 Usage   : $lucyObj->get_rejects()
 Function: returns a hash containing names of rejects and a 1 letter code for the
           reason Lucy rejected the sequence.
           Q- rejected because of low quality values
           S- sequence was short
           V- sequence was vector
           E- sequence was empty
           P- poly-A/T trimming caused sequence to be too short
           R- rejected, reason not reported (used when 'adv_stderr' is not set)
           In order to get the rejects, you must provide a file with the standard
           error from Lucy. You will only get the quality category rejects unless
           you have modified the source and recompiled Lucy as outlined in DESCRIPTION.
 Returns : hash reference
 Args    : none

=cut

sub get_rejects {
    my $self = shift;
    return $self->{reject};
}

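=pod

The hash reference returned by C<get_rejects> can be summarized by reason code. A minimal sketch follows: C<count_rejects_by_reason> and C<%reason_for> are hypothetical names, not part of this module, and the sample hash is made up. The helper tolerates an undefined argument, because C<get_rejects> returns the stored value as-is and nothing is stored when no rejects were parsed.

```perl
use strict;
use warnings;

# Hypothetical lookup table built from the one-letter codes used by the module
my %reason_for = (
    Q => 'low quality',
    S => 'short',
    V => 'vector',
    E => 'empty',
    P => 'too short after poly-A/T trimming',
    R => 'reason not reported',
);

# Hypothetical helper (not part of Bio::Tools::Lucy): count rejects per code.
sub count_rejects_by_reason {
    my ($rejects) = @_;
    my %count;
    $count{$_}++ for values %{ $rejects || {} };   # undef is treated as "no rejects"
    return \%count;
}

# Made-up data in the shape returned by get_rejects()
my $rejects = { read_01 => 'Q', read_02 => 'V', read_03 => 'Q' };
my $count   = count_rejects_by_reason($rejects);
for my $code (sort keys %$count) {
    my $reason = exists $reason_for{$code} ? $reason_for{$code} : 'unknown';
    printf "%s (%s): %d\n", $code, $reason, $count->{$code};
}
```

For the sample data this reports two Q rejects and one V reject.

=cut
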
=head2 Diff for Lucy source code

  352a353,354
  > /* AGW added next line */
  > fprintf(stderr, "Empty: %s\n", seqs[i].name);
  639a642,643
  > /* AGW added next line */
  > fprintf(stderr, "Short/ no insert: %s\n", seqs[i].name);
  678c682,686
  < if (left) seqs[i].left+=left;
  ---
  > if (left) {
  > seqs[i].left+=left;
  > /* AGW added next line */
  > fprintf(stderr, "%s has PolyA (left).\n", seqs[i].name);
  > }
  681c689,693
  < if (right) seqs[i].right-=right;
  ---
  > if (right) {
  > seqs[i].right-=right;
  > /* AGW added next line */
  > fprintf(stderr, "%s has PolyA (right).\n", seqs[i].name);
  > }
  682a695,696
  > /* AGW added next line */
  > fprintf(stderr, "Dropped PolyA: %s\n", seqs[i].name);
  734a749,750
  > /* AGW added next line */
  > fprintf(stderr, "Vector: %s\n", seqs[i].name);

=cut

1;