annotate variant_effect_predictor/Bio/Tools/Lucy.pm @ 0:2bc9b66ada89 draft default tip

Uploaded
author mahtabm
date Thu, 11 Apr 2013 06:29:17 -0400
# $Id: Lucy.pm,v 1.6 2002/10/22 07:38:46 lapp Exp $
#
# BioPerl module for Bio::Tools::Lucy
#
# Copyright Her Majesty the Queen of England
# written by Andrew Walsh (paeruginosa@hotmail.com) during employment with
# Agriculture and Agri-food Canada, Cereal Research Centre, Winnipeg, MB
#
# You may distribute this module under the same terms as perl itself
# POD documentation - main docs before the code

=head1 NAME

Bio::Tools::Lucy - Object for analyzing the output from Lucy,
  a vector and quality trimming program from TIGR

=head1 SYNOPSIS

  # Create the Lucy object from an existing Lucy output file
  @params = ('seqfile' => 'lucy.seq', 'lucy_verbose' => 1);
  $lucyObj = Bio::Tools::Lucy->new(@params);

  # Get names of all sequences
  $names = $lucyObj->get_sequence_names();

  # Print seq and qual values for sequences >400 bp in order to run CAP3
  foreach $name (@$names) {
      next unless $lucyObj->length_clear($name) > 400;
      print SEQ ">$name\n", $lucyObj->sequence($name), "\n";
      print QUAL ">$name\n", $lucyObj->quality($name), "\n";
  }

  # Get an array of Bio::PrimarySeq objects
  @seqObjs = $lucyObj->get_Seq_Objs();


=head1 DESCRIPTION

Bio::Tools::Lucy.pm provides methods for analyzing the sequence and
quality values generated by the Lucy program from TIGR.

Lucy will identify vector, poly-A/T tails, and poor quality regions in
a sequence. (www.genomics.purdue.edu/gcg/other/lucy.pdf)

The input to Lucy can be the Phred sequence and quality files
generated from running Phred on a set of chromatograms.

Lucy can be obtained (free of charge to academic users) from
www.tigr.org/softlab

There are a few methods that will only be available if you make some
minor changes to the source for Lucy and then recompile. The changes
are in the 'lucy.c' file and there is a diff between the original and
the modified file in the Appendix.

Please contact the author of this module if you have any problems
making these modifications.

You do not have to make these modifications to use this module.

=head2 Creating a Lucy object

  @params = ('seqfile' => 'lucy.seq', 'adv_stderr' => 1,
             'fwd_desig' => '_F', 'rev_desig' => '_R');
  $lucyObj = Bio::Tools::Lucy->new(@params);

=head2 Using a Lucy object

To use the accessor methods, first get a reference to an array of the
sequence names. Note: The Lucy binary program will fail unless
the sequence names provided as input are unique.

  $names = $lucyObj->get_sequence_names();

This code snippet will produce a Fasta format file with sequence
lengths and %GC in the description line.

  foreach $name (@$names) {
      print FILE ">$name\t",
                 $lucyObj->length_clear($name), "\t",
                 $lucyObj->per_GC($name), "\n",
                 $lucyObj->sequence($name), "\n";
  }


Print seq and qual values for sequences >400 bp in order to assemble
them with CAP3 (or other assembler).

  foreach $name (@$names) {
      next unless $lucyObj->length_clear($name) > 400;
      print SEQ ">$name\n", $lucyObj->sequence($name), "\n";
      print QUAL ">$name\n", $lucyObj->quality($name), "\n";
  }

Get all the sequences as Bio::PrimarySeq objects (e.g., for use with
Bio::Tools::Blast to perform BLAST).

  @seqObjs = $lucyObj->get_Seq_Objs();

Or use only those sequences that are full length and have a Poly-A
tail.

  foreach $name (@$names) {
      next unless ($lucyObj->full_length($name) and $lucyObj->polyA($name));
      push @seqObjs, $lucyObj->get_Seq_Obj($name);
  }


Get the names of those sequences that were rejected by Lucy.

  $rejects_ref = $lucyObj->get_rejects();

Print the names of the rejects and the one-letter code for the reason
each was rejected.

  foreach $key (sort keys %$rejects_ref) {
      print "$key: ", $rejects_ref->{$key}, "\n";
  }
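
The one-letter codes can be illustrated with a small standalone sketch. It does not call Bio::Tools::Lucy; it only reuses the patterns this module applies to the captured standard error when adv_stderr is set. The sample lines, the @rules table and the %reason descriptions are assumptions made for illustration (the descriptions are one reading of the Args notes for new(), not text taken from Lucy).

```perl
use strict;
use warnings;

# Hypothetical stderr lines; the real wording comes from the (modified) Lucy binary.
my @stderr_lines = (
    "dropping read01_F\n",
    "Vector: read02_F\n",
    "Empty: read03_R\n",
    "Short: read04_F\n",
    "Dropped PolyA: read05_R\n",
);

# Patterns as used by the module's stderr parser, each paired with its code.
my @rules = (
    [ qr/dropping\s+(\S+)/,     'Q' ],
    [ qr/Vector: (\S+)/,        'V' ],
    [ qr/Empty: (\S+)/,         'E' ],
    [ qr/Short: (\S+)/,         'S' ],
    [ qr/Dropped PolyA: (\S+)/, 'P' ],
);

# Assumed meaning of each code, for display only.
my %reason = (
    Q => 'low quality',
    V => 'vector',
    E => 'no insert',
    S => 'too short',
    P => 'dropped poly-A read',
);

# Build the same kind of name => code hash that get_rejects() hands back.
my %rejects;
for my $line (@stderr_lines) {
    for my $rule (@rules) {
        my ($re, $code) = @$rule;
        $rejects{$1} = $code if $line =~ $re;
    }
}

for my $name (sort keys %rejects) {
    print "$name: $rejects{$name} ($reason{ $rejects{$name} })\n";
}
```

A hash keyed by sequence name keeps the lookup cheap when filtering a larger list of reads against the rejects.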

Much more information is available about the sequences
analyzed by Lucy (see APPENDIX). This module can be used with the
DBI module to store this sequence information in a database.

=head1 FEEDBACK

=head2 Mailing Lists

User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.

  bioperl-l@bioperl.org             - General discussion
  http://bio.perl.org/MailList.html - About the mailing lists

=head2 Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via email
or the web:

  bioperl-bugs@bio.perl.org
  http://bugzilla.bioperl.org/

=head1 AUTHOR

Andrew G. Walsh		paeruginosa@hotmail.com

=head1 APPENDIX

Methods available to Lucy objects are described below. Please note
that any method beginning with an underscore is considered internal
and should not be called directly.

=cut


package Bio::Tools::Lucy;

use vars qw($VERSION $AUTOLOAD @ISA @ATTR %OK_FIELD);
use strict;
use Bio::PrimarySeq;
use Bio::Root::Root;
use Bio::Root::IO;

@ISA = qw(Bio::Root::Root Bio::Root::IO);
@ATTR = qw(seqfile qualfile stderrfile infofile lucy_verbose fwd_desig rev_desig adv_stderr);
foreach my $attr (@ATTR) {
    $OK_FIELD{$attr}++
}
$VERSION = "0.01";

sub AUTOLOAD {
    my $self = shift;
    my $attr = $AUTOLOAD;
    $attr =~ s/.*:://;
    $attr = lc $attr;
    $self->throw("Unallowed parameter: $attr !") unless $OK_FIELD{$attr};
    $self->{$attr} = shift if @_;
    return $self->{$attr};
}

=head2 new

 Title   :  new
 Usage   :  $lucyObj = Bio::Tools::Lucy->new(seqfile => 'lucy.seq', rev_desig => '_R',
            fwd_desig => '_F')
 Function:  creates a Lucy object from Lucy analysis files
 Returns :  reference to Bio::Tools::Lucy object
 Args    :  seqfile      Fasta sequence file generated by Lucy
            qualfile     Quality values file generated by Lucy
            infofile     Info file created when Lucy is run with -debug 'infofile' option
            stderrfile   Standard error captured from Lucy when Lucy is run
                         with -info option and STDERR is directed to stderrfile
                         (i.e., lucy ... 2> stderrfile).
                         Info in this file will include sequences dropped for low
                         quality. If you've modified Lucy source (see adv_stderr below),
                         it will also include info on which sequences were dropped because
                         they were vector, too short, had no insert, and whether a poly-A
                         tail was found (if Lucy was run with -cdna option).
            lucy_verbose verbosity level (0-1).
            fwd_desig    The string used to determine whether sequence is a forward read.
                         The parser will assume that this match will occur at the
                         end of the sequence name string.
            rev_desig    As above, for reverse reads.
            adv_stderr   Can be set to a true value (1). Will only work if you have modified
                         the Lucy source code as outlined in DESCRIPTION and capture
                         the standard error from Lucy.

If you don't provide filenames for qualfile, infofile or stderrfile,
the module will assume that .qual, .info, and .stderr are the file
extensions and search in the same directory as the .seq file for these
files.

For example, if you create a Lucy object with $lucyObj =
Bio::Tools::Lucy-E<gt>new(seqfile =E<gt> 'lucy.seq'), the module will
find lucy.qual, lucy.info and lucy.stderr.

You can omit any or all of the quality, info or stderr files, but you
will not be able to use all of the object methods (see method
documentation below).
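
The default file lookup described above can be sketched outside the module. This is a minimal illustration, not the module's own code: companion_files() is a hypothetical helper, and 'run1/lucy.seq' is a made-up path. It uses the same pattern the parser applies to seqfile, which takes everything before the last dot as the base name.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the default .qual, .info and .stderr names
# that would be looked for next to a given sequence file.
sub companion_files {
    my ($seqfile) = @_;
    my ($base) = $seqfile =~ /^(\S+)\.\S+$/
        or return;    # no extension, or whitespace in the name: nothing to derive
    return map { ($_ => "$base.$_") } qw(qual info stderr);
}

my %files = companion_files('run1/lucy.seq');
for my $ext (sort keys %files) {
    my $status = -e $files{$ext} ? 'found' : 'missing';
    print "$ext: $files{$ext} ($status)\n";
}
```

Because the pattern uses \S+, a path containing whitespace yields no base name, so in that case it is safer to pass qualfile, infofile and stderrfile explicitly.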

=cut

sub new {
    my ($class,@args) = @_;
    my $self = $class->SUPER::new(@args);
    my ($attr, $value);
    while (@args) {
        $attr = shift @args;
        $attr = lc $attr;
        $value = shift @args;
        $self->{$attr} = $value;
    }
    &_parse($self);
    return $self;
}

=head2 _parse

 Title   :  _parse
 Usage   :  n/a (internal function)
 Function:  called by new() to parse Lucy output files
 Returns :  nothing
 Args    :  none
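
As a rough illustration of what the sequence-file pass extracts, the sketch below parses one Lucy-style Fasta header, trims the raw read to its clear range, and computes the clear length and %GC. The record is invented for the example, and the variable names are assumptions; only the header pattern and the 1-based clear-range convention follow this module's parser.

```perl
use strict;
use warnings;

# A made-up Lucy-style record: name, three clone-length fields,
# then the 1-based start and end of the clear range.
my $header = '>read01_F 0 0 0 5 16';
my $raw    = 'NNNNACGTGGCCATTANNNN';

my ($name, $min_len, $max_len, $med_len, $beg_clear, $end_clear) =
    $header =~ /^>(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/
    or die "Not a Lucy header: $header\n";

# substr() is 0-based, so shift the 1-based start by one.
my $clear = substr $raw, $beg_clear - 1, $end_clear - $beg_clear + 1;

# tr/// with an empty replacement list counts characters without changing them.
my $length_clear = ($clear =~ tr/ACGTN//);
my $gc           = ($clear =~ tr/GC//);
my $per_gc       = $length_clear ? 100 * $gc / $length_clear : 0;

printf "%s clear=%s length=%d %%GC=%.1f\n", $name, $clear, $length_clear, $per_gc;
```

The guard on $length_clear is an addition for the sketch; it avoids a division by zero if a clear range were ever empty.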

=cut

sub _parse {
    my $self = shift;
    $self->{seqfile} =~ /^(\S+)\.\S+$/;
    my $file = $1;

    print "Opening $self->{seqfile} for parsing...\n" if $self->{lucy_verbose};
    open SEQ, "$self->{seqfile}" or $self->throw("Could not open sequence file: $self->{seqfile}");
    my ($name, $line);
    my $seq = "";
    my @lines = <SEQ>;
    # read from the end of the file so each header is seen after its sequence lines
    while ($line = pop @lines) {
        chomp $line;
        if ($line =~ /^>(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/) {
            $name = $1;
            if ($self->{fwd_desig}) {
                $self->{sequences}{$name}{direction} = "F" if $name =~ /^(\S+)($self->{fwd_desig})$/;
            }
            if ($self->{rev_desig}) {
                $self->{sequences}{$name}{direction} = "R" if $name =~ /^(\S+)($self->{rev_desig})$/;
            }
            $self->{sequences}{$name}{min_clone_len} = $2; # this is used for TIGR Assembler, as are $3 and $4
            $self->{sequences}{$name}{max_clone_len} = $3;
            $self->{sequences}{$name}{med_clone_len} = $4;
            $self->{sequences}{$name}{beg_clear} = $5;
            $self->{sequences}{$name}{end_clear} = $6;
            $self->{sequences}{$name}{length_raw} = $seq =~ tr/AGCTN//; # from what I've seen, these are the bases Phred calls. Please let me know if I'm wrong.
            my $beg = $5-1; # substr function begins with index 0
            $seq = $self->{sequences}{$name}{sequence} = substr ($seq, $beg, $6-$beg);
            my $count = $self->{sequences}{$name}{length_clear} = $seq =~ tr/AGCTN//;
            my $countGC = $seq =~ tr/GC//;
            $self->{sequences}{$name}{per_GC} = $countGC/$count * 100;
            $seq = "";
        }
        else {
            $seq = $line.$seq;
        }
    }


    # now parse quality values (check for presence of quality file first)
    if ($self->{qualfile}) {
        open QUAL, "$self->{qualfile}" or $self->throw("Could not open quality file: $self->{qualfile}");
        @lines = <QUAL>;
    }
    elsif (-e "$file.qual") {
        print "You did not set qualfile, but I'm opening $file.qual\n" if $self->{lucy_verbose};
        $self->qualfile("$file.qual");
        open QUAL, "$file.qual" or $self->throw("Could not open quality file: $file.qual");
        @lines = <QUAL>;
    }
    else {
        print "I did not find a quality file. You will not be able to use all of the accessor methods.\n" if $self->{lucy_verbose};
        @lines = ();
    }

    my (@vals, @slice, $num, $tot, $vals);
    my $qual = "";
    while ($line = pop @lines) {
        chomp $line;
        if ($line =~ /^>(\S+)/) {
            $name = $1;
            @vals = split /\s/ , $qual;
            @slice = @vals[$self->{sequences}{$name}{beg_clear} .. $self->{sequences}{$name}{end_clear}];
            $vals = join "\t", @slice;
            $self->{sequences}{$name}{quality} = $vals;
            $qual = "";
            foreach $num (@slice) {
                $tot += $num;
            }
            $num = @slice;
            $self->{sequences}{$name}{avg_quality} = $tot/$num;
            $tot = 0;
        }
        else {
            $qual = $line.$qual;
        }
    }

    # determine whether reads are full length

    if ($self->{infofile}) {
        open INFO, "$self->{infofile}" or $self->throw("Could not open info file: $self->{infofile}");
        @lines = <INFO>;
    }
    elsif (-e "$file.info") {
        print "You did not set infofile, but I'm opening $file.info\n" if $self->{lucy_verbose};
        $self->infofile("$file.info");
        open INFO, "$file.info" or $self->throw("Could not open info file: $file.info");
        @lines = <INFO>;
    }
    else {
        print "I did not find an info file. You will not be able to use all of the accessor methods.\n" if $self->{lucy_verbose};
        @lines = ();
    }

    foreach (@lines) {
        # skip lines without cleavage info so $1..$3 are only read after a successful match
        next unless /^(\S+).+CLV\s+(\d+)\s+(\d+)$/;
        if ($2>0 && $3>0) {
            $self->{sequences}{$1}{full_length} = 1 if $self->{sequences}{$1}; # will show cleavage info for rejected sequences too
        }
    }

2bc9b66ada89 Uploaded
mahtabm
parents:
diff changeset
350
    # parse rejects (and presence of poly-A if Lucy has been modified)

    if ($self->{stderrfile}) {
        open STDERR_LUCY, "$self->{stderrfile}" or $self->throw("Could not open standard error file: $self->{stderrfile}");
        @lines = <STDERR_LUCY>;
        close STDERR_LUCY;
    }
    elsif (-e "$file.stderr") {
        print "You did not set stderrfile, but I'm opening $file.stderr\n" if $self->{lucy_verbose};
        $self->stderrfile("$file.stderr");
        open STDERR_LUCY, "$file.stderr" or $self->throw("Could not open standard error file: $file.stderr");
        @lines = <STDERR_LUCY>;
        close STDERR_LUCY;
    }
    else {
        print "I did not find a standard error file. You will not be able to use all of the accessor methods.\n" if $self->{lucy_verbose};
        @lines = ();
    }

    if ($self->{adv_stderr}) {
        foreach (@lines) {
            $self->{reject}{$1} = "Q" if /dropping\s+(\S+)/;
            $self->{reject}{$1} = "V" if /Vector: (\S+)/;
            $self->{reject}{$1} = "E" if /Empty: (\S+)/;
            # The modified Lucy prints "Short/ no insert: name" (see the diff in the POD below).
            $self->{reject}{$1} = "S" if m{Short(?:/ no insert)?: (\S+)};
            $self->{sequences}{$1}{polyA} = 1 if /(\S+) has PolyA/;
            if (/Dropped PolyA: (\S+)/) {
                $self->{reject}{$1} = "P";
                delete $self->{sequences}{$1};
            }
        }
    }
    else {
        foreach (@lines) {
            $self->{reject}{$1} = "R" if /dropping\s+(\S+)/;
        }
    }

}

=head2 get_Seq_Objs

 Title   : get_Seq_Objs
 Usage   : $lucyObj->get_Seq_Objs()
 Function: returns a reference to an array of Bio::PrimarySeq objects
           where -id = 'sequence name' and -seq = 'sequence'
 Returns : reference to an array of Bio::PrimarySeq objects
 Args    : none

=cut

sub get_Seq_Objs {
    my $self = shift;
    my($seqobj, @seqobjs);
    foreach my $key (sort keys %{$self->{sequences}}) {
        $seqobj = Bio::PrimarySeq->new( -seq => "$self->{sequences}{$key}{sequence}",
                                        -id  => "$key");
        push @seqobjs, $seqobj;
    }
    return \@seqobjs;
}

=head2 get_Seq_Obj

 Title   : get_Seq_Obj
 Usage   : $lucyObj->get_Seq_Obj($seqname)
 Function: returns reference to a Bio::PrimarySeq object where -id = 'sequence name'
           and -seq = 'sequence'
 Returns : reference to Bio::PrimarySeq object
 Args    : name of a sequence

=cut

sub get_Seq_Obj {
    my ($self, $key) = @_;
    my $seqobj = Bio::PrimarySeq->new( -seq => "$self->{sequences}{$key}{sequence}",
                                       -id  => "$key");
    return $seqobj;
}

=head2 get_sequence_names

 Title   : get_sequence_names
 Usage   : $lucyObj->get_sequence_names
 Function: returns reference to an array of names of the sequences analyzed by Lucy.
           These names are required for most of the accessor methods.
           Note: The Lucy binary will fail unless sequence names are unique.
 Returns : array reference
 Args    : none
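
A minimal sketch of how the returned array reference is typically used.
My::LucyStub is a hypothetical stand-in written for this example (it is not
Bio::Tools::Lucy), so the snippet runs without BioPerl or any Lucy output
files. The sequence names and lengths are made up.

    use strict;
    use warnings;

    # Hypothetical stand-in that mimics two of the accessors documented here.
    package My::LucyStub;
    sub new {
        my ($class, %seqs) = @_;
        return bless { sequences => {%seqs} }, $class;
    }
    sub get_sequence_names {
        my $self = shift;
        return [ sort keys %{ $self->{sequences} } ];
    }
    sub length_clear {
        my ($self, $key) = @_;
        return $self->{sequences}{$key}{length_clear};
    }

    package main;

    my $lucyObj = My::LucyStub->new(
        read_B => { length_clear => 410 },
        read_A => { length_clear => 388 },
    );

    # The method returns a reference, so dereference it before looping.
    my $names = $lucyObj->get_sequence_names;
    foreach my $name (@{$names}) {
        printf "%s\t%d\n", $name, $lucyObj->length_clear($name);
    }

The names come back sorted, and each one can be passed to any accessor that
takes a sequence name.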

=cut

sub get_sequence_names {
    my $self = shift;
    my @keys = sort keys %{$self->{sequences}};
    return \@keys;
}

=head2 sequence

 Title   : sequence
 Usage   : $lucyObj->sequence($seqname)
 Function: returns the DNA sequence of one of the sequences analyzed by Lucy.
 Returns : string
 Args    : name of a sequence

=cut

sub sequence {
    my ($self, $key) = @_;
    return $self->{sequences}{$key}{sequence};
}

=head2 quality

 Title   : quality
 Usage   : $lucyObj->quality($seqname)
 Function: returns the quality values of one of the sequences analyzed by Lucy.
           This method depends on the user having provided a quality file.
 Returns : string
 Args    : name of a sequence

=cut

sub quality {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{quality};
}

=head2 avg_quality

 Title   : avg_quality
 Usage   : $lucyObj->avg_quality($seqname)
 Function: returns the average quality value for one of the sequences analyzed by Lucy.
 Returns : float
 Args    : name of a sequence

=cut

sub avg_quality {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{avg_quality};
}

=head2 direction

 Title   : direction
 Usage   : $lucyObj->direction($seqname)
 Function: returns the direction for one of the sequences analyzed by Lucy,
           provided that 'fwd_desig' or 'rev_desig' was set when the
           Lucy object was created.
           Strings returned are: 'F' for forward, 'R' for reverse,
           or an empty string if no direction was recorded.
 Returns : string
 Args    : name of a sequence

=cut

sub direction {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{direction} if $self->{sequences}{$key}{direction};
    return "";
}

=head2 length_raw

 Title   : length_raw
 Usage   : $lucyObj->length_raw($seqname)
 Function: returns the length of a DNA sequence prior to quality/vector
           trimming by Lucy.
 Returns : integer
 Args    : name of a sequence

=cut

sub length_raw {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{length_raw};
}

=head2 length_clear

 Title   : length_clear
 Usage   : $lucyObj->length_clear($seqname)
 Function: returns the length of a DNA sequence following quality/vector
           trimming by Lucy.
 Returns : integer
 Args    : name of a sequence

=cut

sub length_clear {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{length_clear};
}

=head2 start_clear

 Title   : start_clear
 Usage   : $lucyObj->start_clear($seqname)
 Function: returns the beginning position of good quality, vector free DNA sequence
           determined by Lucy.
 Returns : integer
 Args    : name of a sequence

=cut

sub start_clear {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{beg_clear};
}


=head2 end_clear

 Title   : end_clear
 Usage   : $lucyObj->end_clear($seqname)
 Function: returns the ending position of good quality, vector free DNA sequence
           determined by Lucy.
 Returns : integer
 Args    : name of a sequence

=cut

sub end_clear {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{end_clear};
}

=head2 per_GC

 Title   : per_GC
 Usage   : $lucyObj->per_GC($seqname)
 Function: returns the percentage of GC in the good quality, vector free DNA
           sequence determined by Lucy.
 Returns : float
 Args    : name of a sequence

=cut

sub per_GC {
    my($self, $key) = @_;
    return $self->{sequences}{$key}{per_GC};
}

=head2 full_length

 Title   : full_length
 Usage   : $lucyObj->full_length($seqname)
 Function: returns the truth value for whether or not the sequence read was
           full length (i.e. vector present on both ends of read). This method
           depends on the user having provided the 'info' file (Lucy must be
           run with the -debug 'info_filename' option to get this file).
 Returns : boolean
 Args    : name of a sequence
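
The test applied to each line of the info file can be sketched as a
stand-alone script. The sample lines are invented to fit the module's regular
expression and are not real Lucy output. That the two numbers after CLV are
the vector cleavage positions at the two ends of the read is an assumption
made for this illustration.

    use strict;
    use warnings;

    # Hypothetical info-file lines; the fields before CLV are placeholders.
    my @lines = (
        "read_A 0 0 0 CLV 12 480\n",
        "read_B 0 0 0 CLV 0 455\n",
    );

    my %full_length;
    foreach my $line (@lines) {
        # Only trust $1..$3 when the match itself succeeded.
        next unless $line =~ /^(\S+).+CLV\s+(\d+)\s+(\d+)$/;
        $full_length{$1} = ($2 > 0 && $3 > 0) ? 1 : 0;
    }

    print "$_ full length: $full_length{$_}\n" foreach sort keys %full_length;

A read counts as full length only when both CLV values are greater than zero,
so read_A is flagged and read_B is not.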

=cut

sub full_length {
    my($self, $key) = @_;
    return 1 if $self->{sequences}{$key}{full_length};
    return 0;
}

=head2 polyA

 Title   : polyA
 Usage   : $lucyObj->polyA($seqname)
 Function: returns the truth value for whether or not a poly-A tail was detected
           and clipped by Lucy. This method depends on the user having modified
           the source for Lucy as outlined in DESCRIPTION, invoking Lucy with
           the -cdna option, and saving the standard error.
           Note: the final sequence will not show the poly-A/T region.
 Returns : boolean
 Args    : name of a sequence

=cut

sub polyA {
    my($self, $key) = @_;
    return 1 if $self->{sequences}{$key}{polyA};
    return 0;
}

=head2 get_rejects

 Title   : get_rejects
 Usage   : $lucyObj->get_rejects()
 Function: returns a hash containing names of rejects and a 1 letter code for the
           reason Lucy rejected the sequence.
           Q- rejected because of low quality values
           S- sequence was short
           V- sequence was vector
           E- sequence was empty
           P- poly-A/T trimming caused sequence to be too short
           R- rejected, reason not distinguished (used when 'adv_stderr' is not set)
           In order to get the rejects, you must provide a file with the standard
           error from Lucy. You will only get the quality category rejects unless
           you have modified the source and recompiled Lucy as outlined in DESCRIPTION.
 Returns : hash reference
 Args    : none
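
How standard error lines map to these codes can be sketched as a stand-alone
script. This is only an illustration, not part of the module. The sequence
names are made up. The "Vector:", "Empty:", "Short/ no insert:" and
"Dropped PolyA:" lines follow the fprintf calls in the diff at the end of this
file. The exact wording of the unmodified Lucy message is an assumption: only
the word "dropping" followed by a name is taken from the module's regular
expression.

    use strict;
    use warnings;

    # Hypothetical standard error lines from a modified Lucy.
    my @lines = (
        "dropping read_A\n",
        "Vector: read_B\n",
        "Empty: read_C\n",
        "Short/ no insert: read_D\n",
        "Dropped PolyA: read_E\n",
    );

    my %reject;
    foreach my $line (@lines) {
        $reject{$1} = 'Q' if $line =~ /dropping\s+(\S+)/;
        $reject{$1} = 'V' if $line =~ /Vector: (\S+)/;
        $reject{$1} = 'E' if $line =~ /Empty: (\S+)/;
        $reject{$1} = 'S' if $line =~ m{Short/ no insert: (\S+)};
        $reject{$1} = 'P' if $line =~ /Dropped PolyA: (\S+)/;
    }

    # Print one "name<TAB>code" line per rejected sequence.
    print "$_\t$reject{$_}\n" foreach sort keys %reject;

The matching is case sensitive, so a "Dropped PolyA:" line is not mistaken for
a "dropping" line.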

=cut

sub get_rejects {
    my $self = shift;
    return $self->{reject};
}

=head2 Diff for Lucy source code

  352a353,354
  >       /* AGW added next line */
  >       fprintf(stderr, "Empty: %s\n", seqs[i].name);
  639a642,643
  >       /* AGW added next line */
  >       fprintf(stderr, "Short/ no insert: %s\n", seqs[i].name);
  678c682,686
  <     if (left) seqs[i].left+=left;
  ---
  >     if (left) {
  >       seqs[i].left+=left;
  >       /* AGW added next line */
  >       fprintf(stderr, "%s has PolyA (left).\n", seqs[i].name);
  >     }
  681c689,693
  <     if (right) seqs[i].right-=right;
  ---
  >     if (right) {
  >       seqs[i].right-=right;
  >       /* AGW added next line */
  >       fprintf(stderr, "%s has PolyA (right).\n", seqs[i].name);
  >     }
  682a695,696
  >       /* AGW added next line */
  >       fprintf(stderr, "Dropped PolyA: %s\n", seqs[i].name);
  734a749,750
  >       /* AGW added next line */
  >       fprintf(stderr, "Vector: %s\n", seqs[i].name);

=cut

1;