Mercurial > repos > devteam > lastz
annotate lastz_wrapper.py @ 1:3c13c9c09ad9 draft
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
author | devteam |
---|---|
date | Tue, 13 Oct 2015 12:23:37 -0400 |
parents | |
children |
rev | line source |
---|---|
1
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
1 #!/usr/bin/env python |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
2 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
3 """ |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
4 Runs Lastz |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
5 Written for Lastz v. 1.01.88. |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
6 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
7 usage: lastz_wrapper.py [options] |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
8 --ref_name: The reference name to change all output matches to |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
9 --ref_source: Whether the reference is cached or from the history |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
10 --source_select: Whether to used pre-set or cached reference file |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
11 --input1: The name of the reference file if using history or reference base name if using cached |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
12 --input2: The reads file to align |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
13 --ref_sequences: The number of sequences in the reference file if using one from history |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
14 --pre_set_options: Which of the pre set options to use, if using pre-sets |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
15 --strand: Which strand of the read to search, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
16 --seed: Seeding settings, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
17 --gfextend: Whether to perform gap-free extension of seed hits to HSPs (high scoring segment pairs), if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
18 --chain: Whether to perform chaining of HSPs, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
19 --transition: Number of transitions to allow in each seed hit, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
20 --O: Gap opening penalty, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
21 --E: Gap extension penalty, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
22 --X: X-drop threshold, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
23 --Y: Y-drop threshold, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
24 --K: Threshold for HSPs, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
25 --L: Threshold for gapped alignments, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
26 --entropy: Whether to involve entropy when filtering HSPs, if specifying all parameters |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
27 --identity_min: Minimum identity (don't report matches under this identity) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
28 --identity_max: Maximum identity (don't report matches above this identity) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
29 --coverage: The minimum coverage value (don't report matches covering less than this) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
30 --unmask: Whether to convert lowercase bases to uppercase |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
31 --out_format: The format of the output file (sam, diffs, or tabular (general)) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
32 --output: The name of the output file |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
33 --lastzSeqsFileDir: Directory of local lastz_seqs.loc file |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
34 """ |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
35 import optparse, os, subprocess, shutil, sys, tempfile, threading, time |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
36 from Queue import Queue |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
37 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
38 from galaxy import eggs |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
39 import pkg_resources |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
40 pkg_resources.require( 'bx-python' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
41 from bx.seq.twobit import * |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
42 from bx.seq.fasta import FastaReader |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
43 from galaxy.util.bunch import Bunch |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
44 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
45 STOP_SIGNAL = object() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
46 WORKERS = 4 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
47 SLOTS = 128 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
48 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
49 def stop_err( msg ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
50 sys.stderr.write( "%s" % msg ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
51 sys.exit() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
52 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
53 def stop_queues( lastz, combine_data ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
54 # This method should only be called if an error has been encountered. |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
55 # Send STOP_SIGNAL to all worker threads |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
56 for t in lastz.threads: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
57 lastz.put( STOP_SIGNAL, True ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
58 combine_data.put( STOP_SIGNAL, True ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
59 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
60 class BaseQueue( object ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
61 def __init__( self, num_threads, slots=-1 ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
62 # Initialize the queue and worker threads |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
63 self.queue = Queue( slots ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
64 self.threads = [] |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
65 for i in range( num_threads ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
66 worker = threading.Thread( target=self.run_next ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
67 worker.start() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
68 self.threads.append( worker ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
69 def run_next( self ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
70 # Run the next job, waiting until one is available if necessary |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
71 while True: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
72 job = self.queue.get() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
73 if job is STOP_SIGNAL: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
74 return self.shutdown() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
75 self.run_job( job ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
76 time.sleep( 1 ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
77 def run_job( self, job ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
78 stop_err( 'Not Implemented' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
79 def put( self, job, block=False ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
80 # Add a job to the queue |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
81 self.queue.put( job, block ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
82 def shutdown( self ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
83 return |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
84 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
85 class LastzJobQueue( BaseQueue ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
86 """ |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
87 A queue that runs commands in parallel. Blocking is done so the queue will |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
88 not consume much memory. |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
89 """ |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
90 def run_job( self, job ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
91 # Execute the job's command |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
92 proc = subprocess.Popen( args=job.command, shell=True, stderr=subprocess.PIPE, ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
93 proc.wait() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
94 stderr = proc.stderr.read() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
95 proc.wait() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
96 if stderr: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
97 stop_queues( self, job.combine_data_queue ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
98 stop_err( stderr ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
99 job.combine_data_queue.put( job ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
100 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
101 class CombineDataQueue( BaseQueue ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
102 """ |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
103 A queue that concatenates files in serial. Blocking is not done since this |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
104 queue is not expected to grow larger than the command queue. |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
105 """ |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
106 def __init__( self, output_filename, num_threads=1 ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
107 BaseQueue.__init__( self, num_threads ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
108 self.CHUNK_SIZE = 2**20 # 1Mb |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
109 self.output_file = open( output_filename, 'wb' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
110 def run_job( self, job ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
111 in_file = open( job.output, 'rb' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
112 while True: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
113 chunk = in_file.read( self.CHUNK_SIZE ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
114 if not chunk: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
115 in_file.close() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
116 break |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
117 self.output_file.write( chunk ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
118 for file_name in job.cleanup: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
119 os.remove( file_name ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
120 def shutdown( self ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
121 self.output_file.close() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
122 return |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
123 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
124 def __main__(): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
125 #Parse Command Line |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
126 parser = optparse.OptionParser() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
127 parser.add_option( '', '--ref_name', dest='ref_name', help='The reference name to change all output matches to' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
128 parser.add_option( '', '--ref_source', dest='ref_source', help='Whether the reference is cached or from the history' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
129 parser.add_option( '', '--ref_sequences', dest='ref_sequences', help='Number of sequences in the reference dataset' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
130 parser.add_option( '', '--source_select', dest='source_select', help='Whether to used pre-set or cached reference file' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
131 parser.add_option( '', '--input1', dest='input1', help='The name of the reference file if using history or reference base name if using cached' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
132 parser.add_option( '', '--input2', dest='input2', help='The reads file to align' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
133 parser.add_option( '', '--pre_set_options', dest='pre_set_options', help='Which of the pre set options to use, if using pre-sets' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
134 parser.add_option( '', '--strand', dest='strand', help='Which strand of the read to search, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
135 parser.add_option( '', '--seed', dest='seed', help='Seeding settings, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
136 parser.add_option( '', '--transition', dest='transition', help='Number of transitions to allow in each seed hit, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
137 parser.add_option( '', '--gfextend', dest='gfextend', help='Whether to perform gap-free extension of seed hits to HSPs (high scoring segment pairs), if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
138 parser.add_option( '', '--chain', dest='chain', help='Whether to perform chaining of HSPs, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
139 parser.add_option( '', '--O', dest='O', help='Gap opening penalty, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
140 parser.add_option( '', '--E', dest='E', help='Gap extension penalty, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
141 parser.add_option( '', '--X', dest='X', help='X-drop threshold, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
142 parser.add_option( '', '--Y', dest='Y', help='Y-drop threshold, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
143 parser.add_option( '', '--K', dest='K', help='Threshold for HSPs, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
144 parser.add_option( '', '--L', dest='L', help='Threshold for gapped alignments, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
145 parser.add_option( '', '--entropy', dest='entropy', help='Whether to involve entropy when filtering HSPs, if specifying all parameters' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
146 parser.add_option( '', '--identity_min', dest='identity_min', help="Minimum identity (don't report matches under this identity)" ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
147 parser.add_option( '', '--identity_max', dest='identity_max', help="Maximum identity (don't report matches above this identity)" ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
148 parser.add_option( '', '--coverage', dest='coverage', help="The minimum coverage value (don't report matches covering less than this)" ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
149 parser.add_option( '', '--unmask', dest='unmask', help='Whether to convert lowercase bases to uppercase' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
150 parser.add_option( '', '--out_format', dest='format', help='The format of the output file (sam, diffs, or tabular (general))' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
151 parser.add_option( '', '--output', dest='output', help='The output file' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
152 parser.add_option( '', '--lastzSeqsFileDir', dest='lastzSeqsFileDir', help='Directory of local lastz_seqs.loc file' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
153 ( options, args ) = parser.parse_args() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
154 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
155 # output version # of tool |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
156 try: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
157 tmp = tempfile.NamedTemporaryFile().name |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
158 tmp_stdout = open( tmp, 'wb' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
159 proc = subprocess.Popen( args='lastz -v', shell=True, stdout=tmp_stdout ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
160 tmp_stdout.close() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
161 returncode = proc.wait() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
162 stdout = None |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
163 for line in open( tmp_stdout.name, 'rb' ): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
164 if line.lower().find( 'version' ) >= 0: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
165 stdout = line.strip() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
166 break |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
167 if stdout: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
168 sys.stdout.write( '%s\n' % stdout ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
169 else: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
170 raise Exception |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
171 except: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
172 sys.stdout.write( 'Could not determine Lastz version\n' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
173 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
174 if options.unmask == 'yes': |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
175 unmask = '[unmask]' |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
176 else: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
177 unmask = '' |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
178 if options.ref_name: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
179 ref_name = '[nickname=%s]' % options.ref_name |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
180 else: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
181 ref_name = '' |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
182 # Prepare for commonly-used preset options |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
183 if options.source_select == 'pre_set': |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
184 set_options = '--%s' % options.pre_set_options |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
185 # Prepare for user-specified options |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
186 else: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
187 set_options = '--%s --%s --gapped --strand=%s --seed=%s --%s O=%s E=%s X=%s Y=%s K=%s L=%s --%s' % \ |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
188 ( options.gfextend, options.chain, options.strand, options.seed, options.transition, |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
189 options.O, options.E, options.X, options.Y, options.K, options.L, options.entropy ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
190 # Specify input2 and add [fullnames] modifier if output format is diffs |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
191 if options.format == 'diffs': |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
192 input2 = '%s[fullnames]' % options.input2 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
193 else: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
194 input2 = options.input2 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
195 if options.format == 'tabular': |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
196 # Change output format to general if it's tabular and add field names for tabular output |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
197 format = 'general-' |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
198 tabular_fields = ':score,name1,strand1,size1,start1,zstart1,end1,length1,text1,name2,strand2,size2,start2,zstart2,end2,start2+,zstart2+,end2+,length2,text2,diff,cigar,identity,coverage,gaprate,diagonal,shingle' |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
199 elif options.format == 'sam': |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
200 # We currently ALWAYS suppress SAM headers. |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
201 format = 'sam-' |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
202 tabular_fields = '' |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
203 else: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
204 format = options.format |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
205 tabular_fields = '' |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
206 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
207 # Set up our queues |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
208 lastz_job_queue = LastzJobQueue( WORKERS, slots=SLOTS ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
209 combine_data_queue = CombineDataQueue( options.output ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
210 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
211 if options.ref_source == 'history': |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
212 # Reference is a fasta dataset from the history, so split job across |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
213 # the number of sequences in the dataset ( this could be a HUGE number ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
214 try: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
215 # Ensure there is at least 1 sequence in the dataset ( this may not be necessary ). |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
216 error_msg = "The reference dataset is missing metadata, click the pencil icon in the history item and 'auto-detect' the metadata attributes." |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
217 ref_sequences = int( options.ref_sequences ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
218 if ref_sequences < 1: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
219 stop_queues( lastz_job_queue, combine_data_queue ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
220 stop_err( error_msg ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
221 except: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
222 stop_queues( lastz_job_queue, combine_data_queue ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
223 stop_err( error_msg ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
224 seqs = 0 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
225 fasta_reader = FastaReader( open( options.input1 ) ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
226 while True: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
227 # Read the next sequence from the reference dataset |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
228 seq = fasta_reader.next() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
229 if not seq: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
230 break |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
231 seqs += 1 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
232 # Create a temporary file to contain the current sequence as input to lastz |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
233 tmp_in_fd, tmp_in_name = tempfile.mkstemp( suffix='.in' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
234 tmp_in = os.fdopen( tmp_in_fd, 'wb' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
235 # Write the current sequence to the temporary input file |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
236 tmp_in.write( '>%s\n%s\n' % ( seq.name, seq.text ) ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
237 tmp_in.close() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
238 # Create a 2nd temporary file to contain the output from lastz execution on the current sequence |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
239 tmp_out_fd, tmp_out_name = tempfile.mkstemp( suffix='.out' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
240 os.close( tmp_out_fd ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
241 # Generate the command line for calling lastz on the current sequence |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
242 command = 'lastz %s%s%s %s %s --ambiguousn --nolaj --identity=%s..%s --coverage=%s --format=%s%s > %s' % \ |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
243 ( tmp_in_name, unmask, ref_name, input2, set_options, options.identity_min, |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
244 options.identity_max, options.coverage, format, tabular_fields, tmp_out_name ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
245 # Create a job object |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
246 job = Bunch() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
247 job.command = command |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
248 job.output = tmp_out_name |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
249 job.cleanup = [ tmp_in_name, tmp_out_name ] |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
250 job.combine_data_queue = combine_data_queue |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
251 # Add another job to the lastz_job_queue. Execution |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
252 # will wait at this point if the queue is full. |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
253 lastz_job_queue.put( job, block=True ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
254 # Make sure the value of sequences in the metadata is the same as the |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
255 # number of sequences read from the dataset ( this may not be necessary ). |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
256 if ref_sequences != seqs: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
257 stop_queues( lastz_job_queue, combine_data_queue ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
258 stop_err( "The value of metadata.sequences (%d) differs from the number of sequences read from the reference (%d)." % ( ref_sequences, seqs ) ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
259 else: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
260 # Reference is a locally cached 2bit file, split job across number of chroms in 2bit file |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
261 tbf = TwoBitFile( open( options.input1, 'r' ) ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
262 for chrom in tbf.keys(): |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
263 # Create a temporary file to contain the output from lastz execution on the current chrom |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
264 tmp_out_fd, tmp_out_name = tempfile.mkstemp( suffix='.out' ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
265 os.close( tmp_out_fd ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
266 command = 'lastz %s/%s%s%s %s %s --ambiguousn --nolaj --identity=%s..%s --coverage=%s --format=%s%s >> %s' % \ |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
267 ( options.input1, chrom, unmask, ref_name, input2, set_options, options.identity_min, |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
268 options.identity_max, options.coverage, format, tabular_fields, tmp_out_name ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
269 # Create a job object |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
270 job = Bunch() |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
271 job.command = command |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
272 job.output = tmp_out_name |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
273 job.cleanup = [ tmp_out_name ] |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
274 job.combine_data_queue = combine_data_queue |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
275 # Add another job to the lastz_job_queue. Execution |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
276 # will wait at this point if the queue is full. |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
277 lastz_job_queue.put( job, block=True ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
278 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
279 # Stop the lastz_job_queue |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
280 for t in lastz_job_queue.threads: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
281 lastz_job_queue.put( STOP_SIGNAL, True ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
282 # Although all jobs are submitted to the queue, we can't shut down the combine_data_queue |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
283 # until we know that all jobs have been submitted to its queue. We do this by checking |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
284 # whether all of the threads in the lastz_job_queue have terminated. |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
285 while threading.activeCount() > 2: |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
286 time.sleep( 1 ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
287 # Now it's safe to stop the combine_data_queue |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
288 combine_data_queue.put( STOP_SIGNAL ) |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
289 |
3c13c9c09ad9
planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents:
diff
changeset
|
290 if __name__=="__main__": __main__() |