annotate blast2html.py @ 111:1faac255ae3c draft default tip
update version to 0.0.12
| author | Jan Kanis <jan.code@jankanis.nl> | 
|---|---|
| date | Tue, 08 Jul 2014 17:05:54 +0200 | 
| parents | ee2b105d772a | 
| children | 
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# Actually this program works with both python 2 and 3, tested against python 2.6

# Copyright The Hyve B.V. 2014
# License: GPL version 3 or (at your option) any higher version

from __future__ import unicode_literals, division

import sys
import math
import warnings
import six, codecs, io
from six.moves import builtins
from os import path
from itertools import repeat
from collections import defaultdict
import glob
import argparse
from lxml import objectify
import jinja2

builtin_str = str
str = six.text_type



_filters = dict(float='float')
def filter(func_or_name):
    "Decorator to register a function as filter in the current jinja environment"
    if isinstance(func_or_name, six.string_types):
        def inner(func):
            _filters[func_or_name] = func.__name__
            return func
        return inner
    else:
        _filters[func_or_name.__name__] = func_or_name.__name__
        return func_or_name


def color_idx(length):
    if length < 40:
        return 0
    elif length < 50:
        return 1
    elif length < 80:
        return 2
    elif length < 200:
        return 3
    return 4

@filter
def fmt(val, fmt):
    return format(float(val), fmt)

@filter
def numfmt(val):
    """Format numbers in decimal notation, but without excessive trailing 0's.
    Default python float formatting will use scientific notation for some values,
    or append trailing zeros with the 'f' format type, and the number of digits differs
    between python 2 and 3."""
    fpart, ipart = math.modf(val)
    if fpart == 0:
        return str(int(val))
    # round to 10 to get identical representations in python 2 and 3
    s = format(round(val, 10), '.10f').rstrip('0')
    if s[-1] == '.':
        s += '0'
    return s

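As a rough illustration of the docstring's point, the following standalone sketch compares Python's default float formatting with the approach taken by `numfmt`. It repeats the function body with the built-in `str` in place of `six.text_type`, which is an assumption that only holds on Python 3. The sample values are made up for illustration.

```python
import math

def numfmt(val):
    """Standalone copy of the filter above, using the Python 3 built-in str."""
    fpart, ipart = math.modf(val)
    if fpart == 0:
        return str(int(val))
    s = format(round(val, 10), '.10f').rstrip('0')
    if s[-1] == '.':
        s += '0'
    return s

# Default formatting switches to scientific notation or pads with zeros.
print(str(1e-05))          # 1e-05
print(format(0.5, 'f'))    # 0.500000

# numfmt keeps plain decimal notation and trims the trailing zeros.
print(numfmt(1e-05))       # 0.00001
print(numfmt(0.5))         # 0.5
print(numfmt(42.0))        # 42
```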
@filter
def firsttitle(hit):
    return str(hit.Hit_def).split('>')[0]

@filter
def othertitles(hit):
    """Split a hit.Hit_def that contains multiple titles up, splitting out the hit ids from the titles."""
    id_titles = str(hit.Hit_def).split('>')

    titles = []
    for t in id_titles[1:]:
        id, title = t.split(' ', 1)
        titles.append(argparse.Namespace(Hit_id = id,
                                         Hit_def = title,
                                         Hit_accession = '',
                                         getroottree = hit.getroottree))
    return titles

@filter
def hitid(hit):
    return str(hit.Hit_id)


@filter
def alignment_pre(hsp):
    """Create the preformatted alignment blocks"""

    # line break length
    linewidth = 60

    qfrom = int(hsp['Hsp_query-from'])
    qto = int(hsp['Hsp_query-to'])
    qframe = int(hsp['Hsp_query-frame'])
    hfrom = int(hsp['Hsp_hit-from'])
    hto = int(hsp['Hsp_hit-to'])
    hframe = int(hsp['Hsp_hit-frame'])

    qseq = hsp.Hsp_qseq.text
    midline = hsp.Hsp_midline.text
    hseq = hsp.Hsp_hseq.text

    if qframe not in (1, -1):
        warnings.warn("Error in BlastXML input: Hsp node {0} has a Hsp_query-frame of {1}. (should be 1 or -1)".format(nodeid(hsp), qframe))
        qframe = -1 if qframe < 0 else 1
    if hframe not in (1, -1):
        warnings.warn("Error in BlastXML input: Hsp node {0} has a Hsp_hit-frame of {1}. (should be 1 or -1)".format(nodeid(hsp), hframe))
        hframe = -1 if hframe < 0 else 1

    def split(txt):
        return [txt[i:i+linewidth] for i in range(0, len(txt), linewidth)]

    for qs, mid, hs, offset in zip(split(qseq), split(midline), split(hseq), range(0, len(qseq), linewidth)):
        yield (
            "Query {0:>7} {1} {2}\n".format(qfrom+offset*qframe, qs, qfrom+(offset+len(qs)-1)*qframe) +
            " {0:7} {1}\n".format('', mid) +
            "Subject{0:>7} {1} {2}".format(hfrom+offset*hframe, hs, hfrom+(offset+len(hs)-1)*hframe)
        )

    if qfrom+(len(qseq)-1)*qframe != qto:
        warnings.warn("Error in BlastXML input: Hsp node {0} qseq length mismatch: from {1} to {2} length {3}".format(
            nodeid(hsp), qfrom, qto, len(qseq)))
    if hfrom+(len(hseq)-1)*hframe != hto:
        warnings.warn("Error in BlastXML input: Hsp node {0} hseq length mismatch: from {1} to {2} length {3}".format(
            nodeid(hsp), hfrom, hto, len(hseq)))



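The coordinate arithmetic in `alignment_pre` may be easier to follow in isolation. The sketch below is a simplified, hypothetical helper (`wrap_coords` is not part of blast2html.py) that wraps a single sequence and reports the first and last position of each line, counting downwards when the frame is -1. The sample sequence and start position are made up.

```python
def wrap_coords(seq, start, frame, linewidth=60):
    """Yield (first_pos, chunk, last_pos) for each wrapped line.

    frame is +1 when positions increase along the sequence and -1 when
    they decrease, mirroring the qframe/hframe handling above.
    """
    for offset in range(0, len(seq), linewidth):
        chunk = seq[offset:offset + linewidth]
        first = start + offset * frame
        last = start + (offset + len(chunk) - 1) * frame
        yield first, chunk, last

seq = 'A' * 130
for first, chunk, last in wrap_coords(seq, start=200, frame=-1):
    print(first, len(chunk), last)
# 200 60 141
# 140 60 81
# 80 10 71
```

The last position of the final line corresponds to the value that `alignment_pre` compares against `Hsp_query-to` or `Hsp_hit-to` in its closing consistency checks.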
@filter('len')
def blastxml_len(node):
    if node.tag == 'Hsp':
        return int(node['Hsp_align-len'])
    elif node.tag == 'Iteration':
        return int(node['Iteration_query-len'])
    raise Exception("Unknown XML node type: "+node.tag)

@filter
def nodeid(node):
    id = []
    if node.tag == 'Hsp':
        id.insert(0, node.Hsp_num.text)
        node = node.getparent().getparent()
        assert node.tag == 'Hit'
    if node.tag == 'Hit':
        id.insert(0, node.Hit_num.text)
        node = node.getparent().getparent()
        assert node.tag == 'Iteration'
    if node.tag == 'Iteration':
        id.insert(0, node['Iteration_iter-num'].text)
        return '-'.join(id)
    raise ValueError("The nodeid filter can only be applied to Hsp, Hit or Iteration nodes in a BlastXML document")


@filter
def asframe(frame):
    if frame == 1:
        return 'Plus'
    elif frame == -1:
        return 'Minus'
    raise Exception("frame should be either +1 or -1")

# def genelink(hit, type='genbank', hsp=None):
#     if not isinstance(hit, six.string_types):
#         hit = hitid(hit)
#     link = "http://www.ncbi.nlm.nih.gov/nucleotide/{0}?report={1}&log$=nuclalign".format(hit, type)
#     if hsp != None:
#         link += "&from={0}&to={1}".format(hsp['Hsp_hit-from'], hsp['Hsp_hit-to'])
#     return link


# javascript escape filter based on Django's, from https://github.com/dsissitka/khan-website/blob/master/templatefilters.py#L112-139
# I've removed the html escapes, since html escaping is already being performed by the template engine.

# The r'\u0027' syntax doesn't work the way we need to in python 2.6 with unicode_literals
_base_js_escapes = (
    ('\\', '\\u005C'),
    ('\'', '\\u0027'),
    ('"', '\\u0022'),
    # ('>', '\\u003E'),
    # ('<', '\\u003C'),
    # ('&', '\\u0026'),
    # ('=', '\\u003D'),
    # ('-', '\\u002D'),
    # (';', '\\u003B'),
    (u'\u2028', '\\u2028'),
    (u'\u2029', '\\u2029')
)

# Escape every ASCII character with a value less than 32. This is
# needed a.o. to prevent html parsers from jumping out of javascript
# parsing mode.
_js_escapes = (_base_js_escapes +
               tuple(('%c' % z, '\\u%04X' % z) for z in range(32)))

@filter
def js_string_escape(value):
    """
    Javascript string literal escape. Note that this only escapes data
    for embedding within javascript string literals, not in general
    javascript snippets.
    """

    value = str(value)

    for bad, good in _js_escapes:
        value = value.replace(bad, good)

    return value

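A short standalone sketch of how the escape table behaves. It rebuilds the active entries of the table on Python 3 without `six` or the `@filter` registration, and the sample inputs are made up for illustration.

```python
# Standalone Python 3 copy of the active entries in the escape table above.
_base_js_escapes = (
    ('\\', '\\u005C'),
    ('\'', '\\u0027'),
    ('"', '\\u0022'),
    ('\u2028', '\\u2028'),
    ('\u2029', '\\u2029'),
)
# Add an escape for every control character below 32.
_js_escapes = (_base_js_escapes +
               tuple(('%c' % z, '\\u%04X' % z) for z in range(32)))

def js_string_escape(value):
    value = str(value)
    for bad, good in _js_escapes:
        value = value.replace(bad, good)
    return value

print(js_string_escape("it's"))          # it\u0027s
print(js_string_escape('line1\nline2'))  # line1\u000Aline2
print(js_string_escape('a\\b'))          # a\u005Cb
```

The backslash entry has to come first in the table. Every later replacement introduces a backslash of its own, and those would be escaped a second time if the backslash rule ran after them.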
@filter
def hits(result):
    # sort hits by longest hotspot first
    return sorted(result.Iteration_hits.findall('Hit'),
                  key=lambda h: max(blastxml_len(hsp) for hsp in h.Hit_hsps.Hsp),
                  reverse=True)



class BlastVisualize:

    colors = ('black', 'blue', 'green', 'magenta', 'red')

    max_scale_labels = 10

    def __init__(self, input, templatedir, templatename, genelinks={}):
        self.input = input
        self.templatename = templatename
        self.genelinks = genelinks

        self.blast = objectify.parse(self.input).getroot()
        self.loader = jinja2.FileSystemLoader(searchpath=templatedir)
        self.environment = jinja2.Environment(loader=self.loader,
                                              lstrip_blocks=True, trim_blocks=True, autoescape=True)

        self._addfilters(self.environment)


    def _addfilters(self, environment):
        for filtername, funcname in _filters.items():
            try:
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 251 environment.filters[filtername] = getattr(self, funcname) | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 252 except AttributeError: | 
| 77 | 253 try: | 
| 254 environment.filters[filtername] = globals()[funcname] | |
| 255 except KeyError: | |
| 256 environment.filters[filtername] = getattr(builtins, funcname) | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 257 | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 258 def render(self, output): | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 259 template = self.environment.get_template(self.templatename) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 260 | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 261 params = (('Query ID', self.blast["BlastOutput_query-ID"]), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 262 ('Query definition', self.blast["BlastOutput_query-def"]), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 263 ('Query length', self.blast["BlastOutput_query-len"]), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 264 ('Program', self.blast.BlastOutput_version), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 265 ('Database', self.blast.BlastOutput_db), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 266 ) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 267 | 
| 107 | 268 result = template.stream(blast=self.blast, | 
| 75 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 269 iterations=self.blast.BlastOutput_iterations.Iteration, | 
| 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 270 colors=self.colors, | 
| 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 271 params=params) | 
| 107 | 272 | 
| 273 result.dump(output) | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 274 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 275 @filter | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 276 def match_colors(self, result): | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 277 """ | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 278 An iterator that yields, for each hit, a dict whose 'colors' entry is a list of (width percentage, color) pairs. | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 279 """ | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 280 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 281 query_length = blastxml_len(result) | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 282 | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 283 percent_multiplier = 100 / query_length | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 284 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 285 for hit in hits(result): | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 286 # sort hotspots from short to long, so we can overwrite index colors of | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 287 # short matches with those of long ones. | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 288 hotspots = sorted(hit.Hit_hsps.Hsp, key=lambda hsp: blastxml_len(hsp)) | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 289 table = bytearray([255]) * query_length | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 290 for hsp in hotspots: | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 291 frm = hsp['Hsp_query-from'] - 1 | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 292 to = int(hsp['Hsp_query-to']) | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 293 table[frm:to] = repeat(color_idx(blastxml_len(hsp)), to - frm) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 294 | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 295 matches = [] | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 296 last = table[0] | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 297 count = 0 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 298 for i in range(query_length): | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 299 if table[i] == last: | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 300 count += 1 | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 301 continue | 
| 18 | 302 matches.append((count * percent_multiplier, self.colors[last] if last != 255 else 'transparent')) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 303 last = table[i] | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 304 count = 1 | 
| 18 | 305 matches.append((count * percent_multiplier, self.colors[last] if last != 255 else 'transparent')) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 306 | 
| 55 
4217bb9cf1d3
depend on python 3; fix internal links with multiple iterations
 Jan Kanis <jan.code@jankanis.nl> parents: 
53diff
changeset | 307 yield dict(colors=matches, hit=hit, defline=firsttitle(hit)) | 
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 308 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 309 @filter | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 310 def queryscale(self, result): | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 311 query_length = blastxml_len(result) | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 312 skip = math.ceil(query_length / self.max_scale_labels) | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 313 percent_multiplier = 100 / query_length | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 314 for i in range(1, query_length+1): | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 315 if i % skip == 0: | 
| 25 | 316 yield dict(label = i, width = skip * percent_multiplier) | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 317 if query_length % skip != 0: | 
| 22 
53cd304c5f26
Add index for multiple results; fix layout of query ruler for edge case
 Jan Kanis <jan.code@jankanis.nl> parents: 
21diff
changeset | 318 yield dict(label = query_length, | 
| 25 | 319 width = (query_length % skip) * percent_multiplier) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 320 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 321 @filter | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 322 def hit_info(self, result): | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 323 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 324 query_length = blastxml_len(result) | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 325 | 
| 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 326 for hit in hits(result): | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 327 hsps = hit.Hit_hsps.Hsp | 
| 9 
9e7927673089
intermediate commit before converting some tables to divs
 Jan Kanis <jan.code@jankanis.nl> parents: 
7diff
changeset | 328 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 329 cover = [False] * query_length | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 330 for hsp in hsps: | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 331 cover[hsp['Hsp_query-from']-1 : int(hsp['Hsp_query-to'])] = repeat(True, blastxml_len(hsp)) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 332 cover_count = cover.count(True) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 333 | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 334 def hsp_val(path): | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 335 return (float(hsp[path]) for hsp in hsps) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 336 | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 337 yield dict(hit = hit, | 
| 77 | 338 title = firsttitle(hit), | 
| 75 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 339 maxscore = "{0:.1f}".format(max(hsp_val('Hsp_bit-score'))), | 
| 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 340 totalscore = "{0:.1f}".format(sum(hsp_val('Hsp_bit-score'))), | 
| 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 341 cover = "{0:.0%}".format(cover_count / query_length), | 
| 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 342 e_value = "{0:.4g}".format(min(hsp_val('Hsp_evalue'))), | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 343 # FIXME: is this the correct formula vv? | 
| 80 | 344 # float(...) because non-flooring division doesn't work with lxml elements in python 2.6 | 
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 345 ident = "{0:.0%}".format(float(min(float(hsp.Hsp_identity) / blastxml_len(hsp) for hsp in hsps)))) | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 346 | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 347 @filter | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 348 def genelink(self, hit, text=None, clas=None, display_nolink=True): | 
| 99 | 349 """Create an HTML link from a hit node to a configured gene bank webpage. | 
| 350 text: The text of the link, defaults to the hit_id | |
| 351 clas: extra css classes that will be added to the <a> element | |
| 352 display_nolink: boolean, if false don't display anything if no link can be created. Default True. | |
| 353 """ | |
| 354 | |
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 355 if text is None: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 356 text = hitid(hit) | 
| 99 | 357 | 
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 358 db = hit.getroottree().getroot().BlastOutput_db | 
| 99 | 359 | 
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 360 if isinstance(self.genelinks, six.string_types): | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 361 template = self.genelinks | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 362 else: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 363 template = self.genelinks.get(db) | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 364 if template is None: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 365 return text if display_nolink else '' | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 366 args = dict(id=hitid(hit).split('|'), | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 367 fullid=hitid(hit), | 
| 107 | 368 defline=str(hit.Hit_def).split(' ', 1)[0].split('|'), | 
| 369 fulldefline=str(hit.Hit_def).split(' ', 1)[0], | |
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 370 accession=str(hit.Hit_accession)) | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 371 try: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 372 link = template.format(**args) | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 373 except Exception as e: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 374 warnings.warn('Error in formatting gene bank link {0} with {1}: {2}'.format(template, args, e)) | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 375 return text if display_nolink else '' | 
| 99 | 376 | 
| 102 
8f02008a5f20
look at all blast*.loc files; python2.6 compat fix
 Jan Kanis <jan.code@jankanis.nl> parents: 
101diff
changeset | 377 classattr = 'class="{0}" '.format(jinja2.escape(clas)) if clas is not None else '' | 
| 
8f02008a5f20
look at all blast*.loc files; python2.6 compat fix
 Jan Kanis <jan.code@jankanis.nl> parents: 
101diff
changeset | 378 return jinja2.Markup("<a {0}href=\"{1}\">{2}</a>".format(classattr, jinja2.escape(link), jinja2.escape(text))) | 
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 379 | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 380 | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 381 def read_genelinks(dir): | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 382 links = {} | 
| 102 
8f02008a5f20
look at all blast*.loc files; python2.6 compat fix
 Jan Kanis <jan.code@jankanis.nl> parents: 
101diff
changeset | 383 # blastdb.loc, blastdb_p.loc, blastdb_d.loc, etc. | 
| 104 | 384 files = sorted(glob.glob(path.join(dir, 'blastdb*.loc'))) | 
| 385 # reversed, so blastdb.loc will take precedence | |
| 386 for f in reversed(files): | |
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 387 try: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 388 f = open(f) | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 389 for l in f.readlines(): | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 390 if l.strip().startswith('#'): | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 391 continue | 
| 101 
e780606b7c25
test new command line parameters, fix small bug
 Jan Kanis <jan.code@jankanis.nl> parents: 
99diff
changeset | 392 line = l.rstrip('\n').split('\t') | 
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 393 try: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 394 links[line[2]] = line[3] | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 395 except IndexError: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 396 continue | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 397 f.close() | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 398 except EnvironmentError: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 399 continue | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 400 if not links: | 
| 106 | 401 if not files: | 
| 402 warnings.warn("No gene bank link templates found (no blastdb*.loc files found in {0})".format(dir)) | |
| 403 else: | |
| 404 warnings.warn("No gene bank link templates found in {0}".format(', '.join(files))) | |
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 405 return links | 
| 12 
2fbdf2eb27b4
All data is displayed now, still some formatting to do
 Jan Kanis <jan.code@jankanis.nl> parents: 
9diff
changeset | 406 | 
| 29 
4e6ac737ba17
improve the galaxy html stripping warning; make sure the tool can find the template from within galaxy
 Jan Kanis <jan.code@jankanis.nl> parents: 
26diff
changeset | 407 | 
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 408 def main(): | 
| 29 
4e6ac737ba17
improve the galaxy html stripping warning; make sure the tool can find the template from within galaxy
 Jan Kanis <jan.code@jankanis.nl> parents: 
26diff
changeset | 409 default_template = path.join(path.dirname(__file__), 'blast2html.html.jinja') | 
| 75 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 410 | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 411 parser = argparse.ArgumentParser(description="Convert a BLAST XML result into a nicely readable html page", | 
| 108 | 412 usage="{0} [-i] INPUT [-o OUTPUT] [--genelink-template URL_TEMPLATE]".format(sys.argv[0])) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 413 input_group = parser.add_mutually_exclusive_group(required=True) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 414 input_group.add_argument('positional_arg', metavar='INPUT', nargs='?', type=argparse.FileType(mode='r'), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 415 help='The input Blast XML file, same as -i/--input') | 
| 55 
4217bb9cf1d3
depend on python 3; fix internal links with multiple iterations
 Jan Kanis <jan.code@jankanis.nl> parents: 
53diff
changeset | 416 input_group.add_argument('-i', '--input', type=argparse.FileType(mode='r'), | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 417 help='The input Blast XML file') | 
| 55 
4217bb9cf1d3
depend on python 3; fix internal links with multiple iterations
 Jan Kanis <jan.code@jankanis.nl> parents: 
53diff
changeset | 418 parser.add_argument('-o', '--output', type=argparse.FileType(mode='w'), default=sys.stdout, | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 419 help='The output html file') | 
| 20 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 420 # We just want the file name here, so jinja can open the file | 
| 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 421 # itself. But it is easier to just use a FileType so argparse can | 
| 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 422 # handle the errors. This introduces a small race condition when | 
| 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 423 # jinja later tries to re-open the template file, but we don't | 
| 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 424 # care too much. | 
| 55 
4217bb9cf1d3
depend on python 3; fix internal links with multiple iterations
 Jan Kanis <jan.code@jankanis.nl> parents: 
53diff
changeset | 425 parser.add_argument('--template', type=argparse.FileType(mode='r'), default=default_template, | 
| 20 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 426 help='The template file to use. Defaults to blast2html.html.jinja') | 
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 427 | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 428 dblink_group = parser.add_mutually_exclusive_group() | 
| 108 | 429 dblink_group.add_argument('--genelink-template', metavar='URL_TEMPLATE', | 
| 430 default='http://www.ncbi.nlm.nih.gov/nucleotide/{accession}?report=genbank&log$=nuclalign', | |
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 431 help="""A link template to link hits to a gene bank webpage. The template string is a | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 432 Python format string. It can contain the following replacement elements: {id[N]}, {fullid}, | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 433 {defline[N]}, {fulldefline}, {accession}, where N is a number. id[N] and defline[N] will be | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 434 replaced by the Nth element of the id or defline, where '|' is the field separator. | 
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 435 | 
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 436 The default is 'http://www.ncbi.nlm.nih.gov/nucleotide/{accession}?report=genbank&log$=nuclalign', | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 437 which is a link to the NCBI nucleotide database.""") | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 438 | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 439 dblink_group.add_argument('--db-config-dir', | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 440 help="""The directory where databases are configured in blastdb*.loc files. These files | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 441 are consulted for creating a gene bank link. The files should be tab-separated tables (with lines | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 442 starting with '#' ignored), where the third field of a line should be a database path and the fourth | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 443 a gene bank link template conforming to the --genelink-template option syntax. | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 444 | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 445 This option is incompatible with --genelink-template.""") | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 446 | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 447 args = parser.parse_args() | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 448 if args.input is None: | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 449 args.input = args.positional_arg | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 450 if args.input is None: | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 451 parser.error('no input specified') | 
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 452 | 
| 107 | 453 if six.PY2: | 
| 454 # The argparse.FileType wrapper doesn't support an encoding | |
| 108 | 455 # argument, so for python 2 we need to wrap or reopen the | 
| 456 # output. The input files are already read as utf-8 by the | |
| 107 | 457 # respective libraries. | 
| 110 | 458 # | 
| 107 | 459 # One option is using codecs, but the codecs writelines() | 
| 460 # method doesn't stream: it collects all output and writes | |
| 108 | 461 # it at once (see Python issues #5445 and #21910). The io | 
| 462 # module, on the other hand, is slower (though not | |
| 463 # significantly). | |
| 107 | 464 | 
| 465 # args.output = codecs.getwriter('utf-8')(args.output) | |
| 110 | 466 # def fixed_writelines(iter, self=args.output): | 
| 467 # for i in iter: | |
| 468 # self.write(i) | |
| 469 # args.output.writelines = fixed_writelines | |
| 470 | |
| 471 args.output.close() | |
| 107 | 472 args.output = io.open(args.output.name, 'w', encoding='utf-8') | 
| 473 | |
| 20 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 474 templatedir, templatename = path.split(args.template.name) | 
| 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 475 args.template.close() | 
| 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 476 if not templatedir: | 
| 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 477 templatedir = '.' | 
| 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 478 | 
| 98 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 479 if args.db_config_dir is None: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 480 genelinks = args.genelink_template | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 481 elif not path.isdir(args.db_config_dir): | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 482 parser.error('db-config-dir does not exist or is not a directory') | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 483 else: | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 484 genelinks = read_genelinks(args.db_config_dir) | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 485 | 
| 
4378d11f0ed7
implement configurable gene bank links
 Jan Kanis <jan.code@jankanis.nl> parents: 
81diff
changeset | 486 b = BlastVisualize(args.input, templatedir, templatename, genelinks) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 487 b.render(args.output) | 
| 107 | 488 args.output.close() | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 489 | 
| 13 
7660519f2dc9
proper layout for alignments, added some links
 Jan Kanis <jan.code@jankanis.nl> parents: 
12diff
changeset | 490 | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 491 if __name__ == '__main__': | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 492 main() | 
| 13 
7660519f2dc9
proper layout for alignments, added some links
 Jan Kanis <jan.code@jankanis.nl> parents: 
12diff
changeset | 493 | 
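
The Python 2 branch in `main()` reopens the output file through `io` so that unicode text can be written to it. A minimal, self-contained sketch of that idea follows; it is not the author's code. The names `render_chunks` and `write_utf8` are hypothetical and stand in for the template renderer and the output handling. The sketch assumes the output should be UTF-8 and writes each chunk as it arrives, which sidesteps the `codecs` `writelines()` buffering described in the source comments.

```python
import io
import os
import shutil
import tempfile


def render_chunks():
    """Hypothetical stand-in for a renderer that yields output piece by piece."""
    yield u"<html>"
    yield u"<p>caf\u00e9</p>"
    yield u"</html>"


def write_utf8(filename, chunks):
    """Write an iterable of unicode strings to filename as UTF-8, chunk by chunk."""
    # io.open accepts an explicit encoding on both Python 2 and Python 3
    # (on Python 3 it is the same function as the builtin open).
    with io.open(filename, 'w', encoding='utf-8') as out:
        for chunk in chunks:
            # Writing per chunk avoids joining the whole output in memory first.
            out.write(chunk)


# Small demonstration using a throwaway directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'demo.html')
write_utf8(demo_path, render_chunks())
with io.open(demo_path, encoding='utf-8') as f:
    print(f.read())
shutil.rmtree(demo_dir)
```

Passing the encoding explicitly means the result does not depend on the locale's default encoding. Whether that matters for a given deployment is an assumption to check against the environment the script runs in.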
