Mercurial > repos > konradpaszkiewicz > velvetoptimiser
annotate VelvetOptimiser-2.1.7_modified/VelvetOpt/Utils.pm @ 0:7363fee7f20c default tip
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
author | konradpaszkiewicz |
---|---|
date | Tue, 07 Jun 2011 17:42:26 -0400 |
parents | |
children |
rev | line source |
---|---|
0
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
1 # |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
2 # VelvetOpt::Utils.pm |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
3 # |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
4 # Copyright 2008,2009,2010 Simon Gladman <simon.gladman@csiro.au> |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
5 # |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
6 # This program is free software; you can redistribute it and/or modify |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
7 # it under the terms of the GNU General Public License as published by |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
8 # the Free Software Foundation; either version 2 of the License, or |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
9 # (at your option) any later version. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
10 # |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
11 # This program is distributed in the hope that it will be useful, |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
12 # but WITHOUT ANY WARRANTY; without even the implied warranty of |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
13 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
14 # GNU General Public License for more details. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
15 # |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
16 # You should have received a copy of the GNU General Public License |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
17 # along with this program; if not, write to the Free Software |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
18 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
19 # MA 02110-1301, USA. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
20 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
21 # Version 2.1.3 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
22 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
23 # Changes for Version 2.0.1 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
24 # Added Mikael Brandstrom Durling's numCpus and freeMem for the Mac. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
25 # |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
26 # Changes for Version 2.1.0 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
27 # Fixed bug in estExpCov so it now correctly uses all short read categories not just the first two. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
28 # |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
29 # Changes for Version 2.1.2 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
30 # Fixed bug in estExpCov so it now won't take columns with "N/A" or "Inf" into account |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
31 # |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
32 # Changes for Version 2.1.3 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
33 # Changed the minimum contig size to use for estimating expected coverage to 3*kmer size -1 and set the minimum coverage to 2 instead of 0. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
34 # This should get rid of exp_covs of 1 when it should be very high for assembling reads that weren't ampped to a reference using one of the standard read mapping programs |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
35 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
36 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
37 package VelvetOpt::Utils; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
38 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
39 use strict; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
40 use lib "/usr/local/lib/perl5/site_perl/5.8.8"; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
41 use warnings; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
42 use POSIX qw(ceil floor); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
43 use Carp; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
44 use List::Util qw(max); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
45 use Bio::SeqIO; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
46 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
47 # num_cpu |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
48 # It returns the number of cpus present in the system if linux. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
49 # If it is MAC then it returns the number of cores present. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
50 # If the OS is not linux or Mac then it returns 1. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
51 # Written by Torsten Seemann 2009 (linux) and Mikael Brandstrom Durling 2009 (Mac). |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
52 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
53 sub num_cpu { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
54 if ( $^O =~ m/linux/i ) { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
55 my ($num) = qx(grep -c ^processor /proc/cpuinfo); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
56 chomp $num; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
57 return $num if $num =~ m/^\d+/; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
58 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
59 elsif( $^O =~ m/darwin/i){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
60 my ($num) = qx(system_profiler SPHardwareDataType | grep Cores); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
61 $num =~ /.*Cores: (\d+)/; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
62 $num =$1; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
63 return $num; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
64 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
65 return 1; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
66 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
67 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
68 # free_mem |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
69 # Returns the current amount of free memory |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
70 # Mac Section written by Mikael Brandstrom Durling 2009 (Mac). |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
71 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
72 sub free_mem { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
73 if( $^O =~ m/linux/i ) { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
74 my $x = `free | grep '^Mem:' | sed 's/ */~/g' | cut -d '~' -f 4,7`; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
75 my @tmp = split "~", $x; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
76 my $total = $tmp[0] + $tmp[1]; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
77 my $totalGB = $total / 1024 / 1024; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
78 return $totalGB; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
79 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
80 elsif( $^O =~ m/darwin/i){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
81 my ($tmp) = qx(vm_stat | grep size); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
82 $tmp =~ /.*size of (\d+) bytes.*/; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
83 my $page_size = $1; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
84 ($tmp) = qx(vm_stat | grep free); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
85 $tmp =~ /[^0-9]+(\d+).*/; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
86 my $free_pages = $1; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
87 my $totalGB = ($free_pages * $page_size) / 1024 / 1024 / 1024; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
88 return $totalGB; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
89 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
90 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
91 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
92 # estExpCov |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
93 # it returns the expected coverage of short reads from an assembly by |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
94 # performing a math mode on the stats.txt file supplied.. It looks at |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
95 # all the short_cov? columns.. Uses minimum contig length and minimum coverage. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
96 # needs the stats.txt file path and name, and the k-value used in the assembly. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
97 # Original algorithm by Torsten Seemann 2009 under the GPL. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
98 # Adapted by Simon Gladman 2009. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
99 # It does a weighted mode... |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
100 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
101 sub estExpCov { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
102 use List::Util qw(max); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
103 my $file = shift; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
104 my $kmer = shift; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
105 my $minlen = 3 * $kmer - 1; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
106 my $mincov = 2; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
107 my $fh; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
108 unless ( open IN, $file ) { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
109 croak "Unable to open $file for exp_cov determination.\n"; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
110 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
111 my @cov; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
112 while (<IN>) { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
113 chomp; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
114 my @x = split m/\t/; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
115 my $len = scalar @x; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
116 next unless @x >= 7; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
117 next unless $x[1] =~ m/^\d+$/; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
118 next unless $x[1] >= $minlen; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
119 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
120 #add all the short_cov columns.. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
121 my $cov = 0; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
122 for(my $i = 5; $i < $len; $i += 2){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
123 if($x[$i] =~ /\d/){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
124 $cov += $x[$i]; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
125 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
126 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
127 next unless $cov > $mincov; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
128 push @cov, ( ( int($cov) ) x $x[1] ); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
129 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
130 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
131 my %freq_of; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
132 map { $freq_of{$_}++ } @cov; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
133 my $mode = 0; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
134 $freq_of{$mode} = 0; # sentinel |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
135 for my $x ( keys %freq_of ) { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
136 $mode = $x if $freq_of{$x} > $freq_of{$mode}; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
137 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
138 return $mode; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
139 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
140 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
141 # estVelvetMemUse |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
142 # returns the estimated memory usage for velvet in GB |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
143 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
144 sub estVelvetMemUse { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
145 my ($readsize, $genomesize, $numreads, $k) = @_; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
146 my $velvetgmem = -109635 + 18977*$readsize + 86326*$genomesize + 233353*$numreads - 51092*$k; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
147 my $out = ($velvetgmem/1024) / 1024; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
148 return $out; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
149 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
150 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
151 # getReadSizeNum |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
152 # returns the number of reads and average size in the short and shortPaired categories... |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
153 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
154 sub getReadSizeNum { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
155 my $f = shift; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
156 my %reads; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
157 my $num = 0; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
158 my $currentfiletype = "fasta"; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
159 #first pull apart the velveth string and get the short and shortpaired filenames... |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
160 my @l = split /\s+/, $f; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
161 my $i = 0; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
162 foreach (@l){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
163 if(/^-/){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
164 if(/^-fasta/){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
165 $currentfiletype = "fasta"; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
166 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
167 elsif(/^-fastq/){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
168 $currentfiletype = "fastq"; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
169 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
170 elsif(/(-eland)|(-gerald)|(-fasta.gz)|(-fastq.gz)/) { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
171 croak "Cannot estimate memory usage from file types other than fasta or fastq..\n"; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
172 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
173 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
174 elsif(-r $_){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
175 my $file = $_; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
176 if($currentfiletype eq "fasta"){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
177 my $x = `grep -c "^>" $file`; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
178 chomp($x); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
179 $num += $x; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
180 my $l = &getReadLength($file, 'Fasta'); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
181 $reads{$l} += $x; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
182 print STDERR "File: $file has $x reads of length $l\n"; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
183 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
184 else { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
185 my $x = `grep -c "^@" $file`; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
186 chomp($x); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
187 $num += $x; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
188 my $l = &getReadLength($file, 'Fastq'); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
189 $reads{$l} += $x; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
190 print STDERR "File: $file has $x reads of length $l\n"; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
191 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
192 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
193 $i ++; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
194 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
195 my $totlength = 0; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
196 foreach my $k (keys %reads){ |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
197 $totlength += ($reads{$k} * $k); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
198 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
199 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
200 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
201 my @results; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
202 push @results, floor($totlength/$num); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
203 push @results, ($num/1000000); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
204 printf STDERR "Total reads: %.1f million. Avg length: %.1f\n",($num/1000000), ($totlength/$num); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
205 return @results; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
206 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
207 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
208 # getReadLength - returns the length of the first read in a file of type fasta or fastq.. |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
209 # |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
210 sub getReadLength { |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
211 my ($f, $t) = @_; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
212 my $sio = Bio::SeqIO->new(-file => $f, -format => $t); |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
213 my $s = $sio->next_seq() or croak "Something went bad while reading file $f!\n"; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
214 return $s->length; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
215 } |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
216 |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
217 return 1; |
7363fee7f20c
Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff
changeset
|
218 |