Mercurial > repos > peterjc > tmhmm_and_signalp
annotate tools/protein_analysis/signalp3.py @ 3:fe10f448d641
Migrated tool version 0.0.6 from old tool shed archive to new tool shed repository
| author | peterjc | 
|---|---|
| date | Tue, 07 Jun 2011 17:39:26 -0400 | 
| parents | a2eeeaa6f75e | 
| children | ef7ceca37e3f | 
All source lines below are annotated to rev 0, changeset a2eeeaa6f75e (peterjc): Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository

#!/usr/bin/env python
"""Wrapper for SignalP v3.0 for use in Galaxy.

This script takes exactly five command line arguments:
 * the organism type (euk, gram+ or gram-)
 * length to truncate sequences to (integer)
 * number of threads to use (integer)
 * an input protein FASTA filename
 * output tabular filename.

It then calls the standalone SignalP v3.0 program (not the webservice)
requesting the short output (one line per protein) using both NN and HMM
for predictions.

The first major feature is cleaning up the output. The raw output from SignalP
v3.0 looks like this (21 columns, space separated):

# SignalP-NN euk predictions # SignalP-HMM euk predictions
# name Cmax pos ? Ymax pos ? Smax pos ? Smean ? D ? # name ! Cmax pos ? Sprob ?
gi|2781234|pdb|1JLY| 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N gi|2781234|pdb|1JLY|B Q 0.000 17 N 0.000 N
gi|4959044|gb|AAD342 0.099 191 N 0.012 38 N 0.023 12 N 0.014 N 0.013 N gi|4959044|gb|AAD34209.1|AF069992_1 Q 0.000 0 N 0.000 N
gi|671626|emb|CAA856 0.139 381 N 0.020 8 N 0.121 4 N 0.067 N 0.044 N gi|671626|emb|CAA85685.1| Q 0.000 0 N 0.000 N
gi|3298468|dbj|BAA31 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y 0.398 N gi|3298468|dbj|BAA31520.1| Q 0.066 24 N 0.139 N

In order to make it easier to use in Galaxy, this wrapper script reformats
this to use tab separators. It also removes the redundant truncated name
column, and assigns unique column names in the header:

#ID NN_Cmax_score NN_Cmax_pos NN_Cmax_pred NN_Ymax_score NN_Ymax_pos NN_Ymax_pred NN_Smax_score NN_Smax_pos NN_Smax_pred NN_Smean_score NN_Smean_pred NN_D_score NN_D_pred HMM_type HMM_Cmax_score HMM_Cmax_pos HMM_Cmax_pred HMM_Sprob_score HMM_Sprob_pred
gi|2781234|pdb|1JLY|B 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N Q 0.000 17 N 0.000 N
gi|4959044|gb|AAD34209.1|AF069992_1 0.099 191 N 0.012 38 N 0.023 12 N 0.014 N 0.013 N Q 0.000 0 N 0.000 N
gi|671626|emb|CAA85685.1| 0.139 381 N 0.020 8 N 0.121 4 N 0.067 N 0.044 N Q 0.000 0 N 0.000 N
gi|3298468|dbj|BAA31520.1| 0.208 24 N 0.184 38 N 0.980 32 Y 0.613 Y 0.398 N Q 0.066 24 N 0.139 N

The second major feature is overcoming SignalP's built-in limit of 4000
sequences by breaking up the input FASTA file into chunks. This also allows
us to pre-trim the sequences since SignalP only needs their starts.

The third major feature is taking advantage of multiple cores (since SignalP
v3.0 itself is single-threaded) by using the individual FASTA input files to
run multiple copies of SignalP in parallel. I would normally use Python's
multiprocessing library in this situation, but it requires at least Python 2.6
and at the time of writing Galaxy still supports Python 2.4.
"""
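The chunking and pre-trimming described in the docstring above is delegated to `split_fasta` from `seq_analysis_utils`, whose source is not shown on this page. The following is a minimal Python 3 sketch of the idea only: `iter_fasta` and `chunk_records` are hypothetical names, not the helper's real API, and a `truncate` of zero is assumed to mean "no trimming".

```python
import io


def iter_fasta(handle):
    """Yield (title, sequence) pairs from a FASTA file handle."""
    title, seq_lines = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(seq_lines)
            title, seq_lines = line[1:], []
        elif line and title is not None:
            seq_lines.append(line)
    if title is not None:
        yield title, "".join(seq_lines)


def chunk_records(records, n, truncate=0):
    """Group records into lists of at most n, trimming each sequence.

    Assumption: truncate == 0 means "keep the full sequence".
    """
    chunk = []
    for title, seq in records:
        if truncate:
            seq = seq[:truncate]
        chunk.append((title, seq))
        if len(chunk) == n:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


# Three made-up records, split into chunks of two and trimmed to 5 residues
example = io.StringIO(">a\nMKTAYIAK\nQRQ\n>b\nMSS\n>c\nMALWMRLLPL\n")
chunks = list(chunk_records(iter_fasta(example), n=2, truncate=5))
```

In the wrapper each chunk is written to its own temporary FASTA file, so that one SignalP process can be started per file while staying under the per-run sequence limit.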
import sys
import os
from seq_analysis_utils import stop_err, split_fasta, run_jobs

FASTA_CHUNK = 500
MAX_LEN = 6000  # Found by trial and error

if len(sys.argv) != 6:
    stop_err("Require five arguments: organism, truncate, threads, input protein FASTA file & output tabular file")

organism = sys.argv[1]
if organism not in ["euk", "gram+", "gram-"]:
    stop_err("Organism argument %s is not one of euk, gram+ or gram-" % organism)

try:
    truncate = int(sys.argv[2])
except ValueError:
    # A non-integer value silently falls back to zero
    truncate = 0
if truncate < 0:
    stop_err("Truncate argument %s is not a positive integer (or zero)" % sys.argv[2])

try:
    num_threads = int(sys.argv[3])
except ValueError:
    # Zero fails the check below, so a non-integer value is reported as an error
    num_threads = 0
if num_threads < 1:
    stop_err("Threads argument %s is not a positive integer" % sys.argv[3])

fasta_file = sys.argv[4]

tabular_file = sys.argv[5]
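The checks above validate each positional argument by hand. Purely as a point of comparison, the same rules could be expressed with `argparse` on a modern Python. This is not what the wrapper does (it targets Python 2.4, which predates `argparse`), and `build_parser`, `non_negative_int` and `positive_int` are hypothetical names. Unlike the wrapper, this sketch rejects a non-integer truncate value instead of treating it as zero.

```python
import argparse


def non_negative_int(text):
    value = int(text)  # a ValueError here makes argparse report an invalid value
    if value < 0:
        raise argparse.ArgumentTypeError("%r is not a positive integer (or zero)" % text)
    return value


def positive_int(text):
    value = int(text)
    if value < 1:
        raise argparse.ArgumentTypeError("%r is not a positive integer" % text)
    return value


def build_parser():
    parser = argparse.ArgumentParser(description="Wrapper for SignalP v3.0")
    parser.add_argument("organism", choices=["euk", "gram+", "gram-"])
    parser.add_argument("truncate", type=non_negative_int)
    parser.add_argument("threads", type=positive_int)
    parser.add_argument("fasta_file")
    parser.add_argument("tabular_file")
    return parser


# Example invocation with made-up file names
args = build_parser().parse_args(["gram-", "70", "4", "in.fasta", "out.tabular"])
```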

def clean_tabular(raw_handle, out_handle):
    """Clean up SignalP output to make it tabular."""
    for line in raw_handle:
        # Skip blank lines and the "#" header lines
        if not line.strip() or line.startswith("#"):
            continue
        parts = line.rstrip("\r\n").split()
        assert len(parts) == 21, repr(line)
        assert parts[14].startswith(parts[0])
        # Remove redundant truncated name column (col 0)
        # and put full name at start (col 14)
        parts = parts[14:15] + parts[1:14] + parts[15:]
        out_handle.write("\t".join(parts) + "\n")
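As a worked example of the reordering in `clean_tabular`, the sketch below applies the same slicing to the first raw line quoted in the docstring. It is written for Python 3 and uses a hypothetical standalone helper, `reorder_columns`, rather than the file-handle interface above.

```python
def reorder_columns(line):
    """Return the 20 output columns for one raw 21-column SignalP line."""
    parts = line.rstrip("\r\n").split()
    if len(parts) != 21:
        raise ValueError("Expected 21 columns, got %i" % len(parts))
    # Column 14 holds the full name; column 0 is its truncated copy.
    return parts[14:15] + parts[1:14] + parts[15:]


raw = ("gi|2781234|pdb|1JLY| 0.061 17 N 0.043 17 N 0.199 1 N 0.067 N 0.055 N "
       "gi|2781234|pdb|1JLY|B Q 0.000 17 N 0.000 N")
columns = reorder_columns(raw)
print("\t".join(columns))
```

The full identifier moves to the front, the 13 NN values follow unchanged, and the six HMM columns keep their order, giving 20 columns in total.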

fasta_files = split_fasta(fasta_file, tabular_file, n=FASTA_CHUNK, truncate=truncate, max_len=MAX_LEN)
temp_files = [f + ".out" for f in fasta_files]
assert len(fasta_files) == len(temp_files)
jobs = ["signalp -short -t %s %s > %s" % (organism, fasta, temp)
        for (fasta, temp) in zip(fasta_files, temp_files)]
assert len(fasta_files) == len(temp_files) == len(jobs)

def clean_up(file_list):
    for f in file_list:
        if os.path.isfile(f):
            os.remove(f)

if len(jobs) > 1 and num_threads > 1:
    # A small "info" message for Galaxy to show the user.
    print("Using %i threads for %i tasks" % (min(num_threads, len(jobs)), len(jobs)))
results = run_jobs(jobs, num_threads)
assert len(fasta_files) == len(temp_files) == len(jobs)
for fasta, temp, cmd in zip(fasta_files, temp_files, jobs):
    error_level = results[cmd]
    # Read the first line of the output file to check for a SignalP error message
    try:
        handle = open(temp)
        output = handle.readline()
        handle.close()
    except IOError:
        output = ""
    if error_level or output.lower().startswith("error running"):
        clean_up(fasta_files)
        clean_up(temp_files)
        stop_err("One or more tasks failed, e.g. %i from %r gave:\n%s" % (error_level, cmd, output),
                 error_level)
del results
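`run_jobs` is imported from `seq_analysis_utils` and its implementation is not shown on this page; the loop above only relies on it returning a mapping from each command string to its exit status. A minimal Python 3 sketch with that contract follows. It uses `concurrent.futures`, which is not available on the Python 2.4 that the wrapper targets, so treat `run_shell_jobs` as a hypothetical stand-in rather than the helper's actual code.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_shell_jobs(jobs, threads):
    """Run shell command strings, at most `threads` at a time.

    Returns a dict mapping each command string to its exit status.
    """
    def run(cmd):
        # shell=True so redirections such as "> file" in cmd are honoured
        return subprocess.call(cmd, shell=True)

    with ThreadPoolExecutor(max_workers=max(1, threads)) as pool:
        # pool.map preserves input order, so zip pairs each cmd with its status
        return dict(zip(jobs, pool.map(run, jobs)))


# Two trivial shell commands standing in for the signalp command lines
results = run_shell_jobs(["exit 0", "exit 3"], threads=2)
```

Threads are sufficient here because each worker only waits on an external process; the CPU-bound work happens in the child processes themselves.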

out_handle = open(tabular_file, "w")
fields = ["ID"]
# NN results:
for name in ["Cmax", "Ymax", "Smax"]:
    fields.extend(["NN_%s_score" % name, "NN_%s_pos" % name, "NN_%s_pred" % name])
fields.extend(["NN_Smean_score", "NN_Smean_pred", "NN_D_score", "NN_D_pred"])
# HMM results:
fields.extend(["HMM_type", "HMM_Cmax_score", "HMM_Cmax_pos", "HMM_Cmax_pred",
               "HMM_Sprob_score", "HMM_Sprob_pred"])
out_handle.write("#" + "\t".join(fields) + "\n")
for temp in temp_files:
    data_handle = open(temp)
    clean_tabular(data_handle, out_handle)
    data_handle.close()
out_handle.close()

clean_up(fasta_files)
clean_up(temp_files)
