annotate blast2html.py @ 88:463744384507 draft py2.6
add missing dependencies
| author | Jan Kanis <jan.code@jankanis.nl> | 
|---|---|
| date | Tue, 24 Jun 2014 11:20:24 +0200 | 
| parents | 9fb1a7d67317 | 
| children | 4378d11f0ed7 | 
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# Actually this program works with both python 2 and 3, tested against python 2.6

# Copyright The Hyve B.V. 2014
# License: GPL version 3 or (at your option) any higher version

from __future__ import unicode_literals, division

import sys
import math
import warnings
import six, codecs
from six.moves import builtins
from os import path
from itertools import repeat
import argparse
from lxml import objectify
import jinja2



# Maps jinja filter names to the names of the functions that implement them
_filters = dict(float='float')
def filter(func_or_name):
    "Decorator to register a function as filter in the current jinja environment"
    if isinstance(func_or_name, six.string_types):
        # Used as @filter('name'): register the function under an explicit filter name
        def inner(func):
            _filters[func_or_name] = func.__name__
            return func
        return inner
    else:
        # Used as bare @filter: register the function under its own name
        _filters[func_or_name.__name__] = func_or_name.__name__
        return func_or_name


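The decorator above supports two call styles: bare (`@filter`) and with an explicit filter name (`@filter('len')`). The following is a minimal standalone sketch of that pattern, assuming Python 3. Unlike the original, it stores the function objects themselves rather than their names, and the names `register`, `registry`, `shout` and `text_len` are hypothetical.

```python
# Hypothetical registry. The original keeps function *names* in `_filters`;
# this sketch stores the callables directly so it is self-contained.
registry = {}

def register(func_or_name):
    """Register a function, optionally under an explicit name."""
    if isinstance(func_or_name, str):
        # Called as @register('name'): return the actual decorator.
        def inner(func):
            registry[func_or_name] = func
            return func
        return inner
    # Called as bare @register: use the function's own name.
    registry[func_or_name.__name__] = func_or_name
    return func_or_name

@register
def shout(text):
    return text.upper()

@register('len')
def text_len(text):
    return len(text)

print(sorted(registry))         # ['len', 'shout']
print(registry['len']('abcd'))  # 4
```

Returning the undecorated function in both branches keeps the decorated name usable as a normal function elsewhere in the module.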
def color_idx(length):
    if length < 40:
        return 0
    elif length < 50:
        return 1
    elif length < 80:
        return 2
    elif length < 200:
        return 3
    return 4

@filter
def fmt(val, fmt):
    return format(float(val), fmt)

@filter
def numfmt(val):
    """Format numbers in decimal notation, but without excessive trailing 0's.
    Default python float formatting will use scientific notation for some values,
    or append trailing zeros with the 'f' format type, and the number of digits differs
    between python 2 and 3."""
    fpart, ipart = math.modf(val)
    if fpart == 0:
        return str(int(val))
    # round to 10 decimal places to get identical representations in python 2 and 3
    s = format(round(val, 10), '.10f').rstrip('0')
    if s[-1] == '.':
        s += '0'
    return s

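To see what the number formatting above produces, here is a small standalone sketch that copies the `numfmt` logic without the `@filter` decorator and runs it on a few sample values. The sample values are arbitrary and chosen only for illustration.

```python
import math

def numfmt(val):
    """Standalone copy of the numfmt logic above, for experimentation."""
    fpart, _ = math.modf(val)
    if fpart == 0:
        # Whole numbers are printed without a decimal point.
        return str(int(val))
    # Fixed notation with 10 decimals, then drop trailing zeros.
    s = format(round(val, 10), '.10f').rstrip('0')
    if s[-1] == '.':
        # Everything after the point was zero; keep one digit.
        s += '0'
    return s

for value in (3.0, 0.5, 2.50, 1e-5, 1e-12):
    print(repr(value), '->', numfmt(value))
```

This prints `3`, `0.5`, `2.5`, `0.00001` and `0.0` respectively. The last case shows the cost of the approach: values smaller than the 10-decimal rounding threshold collapse to `0.0`.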
@filter
def firsttitle(hit):
    return hit.Hit_def.text.split('>')[0]

@filter
def othertitles(hit):
    """Split a hit.Hit_def that contains multiple titles up, splitting out the hit ids from the titles."""
    id_titles = hit.Hit_def.text.split('>')

    titles = []
    for t in id_titles[1:]:
        fullid, title = t.split(' ', 1)
        hitid, id = fullid.split('|', 2)[1:3]
        titles.append(dict(id = id,
                           hitid = hitid,
                           fullid = fullid,
                           title = title))
    return titles

@filter
def hitid(hit):
    hitid = hit.Hit_id.text
    s = hitid.split('|', 2)
    if len(s) >= 2:
        return s[1]
    return hitid

@filter
def seqid(hit):
    hitid = hit.Hit_id.text
    s = hitid.split('|', 2)
    if len(s) >= 3:
        return s[2]
    return hitid


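Both `hitid` and `seqid` rely on `split('|', 2)`, which splits on at most two separators and leaves the remainder intact. The sketch below applies the same rule to plain strings instead of lxml nodes. The helper name `split_hit_id` and both sample identifiers are hypothetical.

```python
def split_hit_id(hit_id):
    """Return (hitid, seqid) using the same split('|', 2) rule as above."""
    parts = hit_id.split('|', 2)
    # Fall back to the full identifier when a field is missing.
    hitid = parts[1] if len(parts) >= 2 else hit_id
    seqid = parts[2] if len(parts) >= 3 else hit_id
    return hitid, seqid

print(split_hit_id('gi|12345|gb|AB000001.1|'))  # ('12345', 'gb|AB000001.1|')
print(split_hit_id('Subject_1'))                # ('Subject_1', 'Subject_1')
```

The fallback is what lets the filters cope with unexpected `Hit_id` values that contain no `|` separators at all.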
@filter
def alignment_pre(hsp):
    """Create the preformatted alignment blocks"""

    # line break length
    linewidth = 60

    qfrom = int(hsp['Hsp_query-from'])
    qto = int(hsp['Hsp_query-to'])
    qframe = int(hsp['Hsp_query-frame'])
    hfrom = int(hsp['Hsp_hit-from'])
    hto = int(hsp['Hsp_hit-to'])
    hframe = int(hsp['Hsp_hit-frame'])

    qseq = hsp.Hsp_qseq.text
    midline = hsp.Hsp_midline.text
    hseq = hsp.Hsp_hseq.text

    if qframe not in (1, -1):
        warnings.warn("Error in BlastXML input: Hsp node {0} has a Hsp_query-frame of {1}. (should be 1 or -1)".format(nodeid(hsp), qframe))
        qframe = -1 if qframe < 0 else 1
    if hframe not in (1, -1):
        warnings.warn("Error in BlastXML input: Hsp node {0} has a Hsp_hit-frame of {1}. (should be 1 or -1)".format(nodeid(hsp), hframe))
        hframe = -1 if hframe < 0 else 1

    def split(txt):
        return [txt[i:i+linewidth] for i in range(0, len(txt), linewidth)]

    for qs, mid, hs, offset in zip(split(qseq), split(midline), split(hseq), range(0, len(qseq), linewidth)):
        yield (
            "Query  {0:>7}  {1}  {2}\n".format(qfrom+offset*qframe, qs, qfrom+(offset+len(qs)-1)*qframe) +
            "       {0:7}  {1}\n".format('', mid) +
            "Subject{0:>7}  {1}  {2}".format(hfrom+offset*hframe, hs, hfrom+(offset+len(hs)-1)*hframe)
        )

    if qfrom+(len(qseq)-1)*qframe != qto:
        warnings.warn("Error in BlastXML input: Hsp node {0} qseq length mismatch: from {1} to {2} length {3}".format(
            nodeid(hsp), qfrom, qto, len(qseq)))
    if hfrom+(len(hseq)-1)*hframe != hto:
        warnings.warn("Error in BlastXML input: Hsp node {0} hseq length mismatch: from {1} to {2} length {3}".format(
            nodeid(hsp), hfrom, hto, len(hseq)))



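The start and end coordinates printed on each wrapped line follow from the offset of the chunk and the frame direction. The sketch below isolates that arithmetic. The function name `chunk_coords` and the sample numbers are hypothetical, and like the code above it counts every character of the sequence string as one coordinate step.

```python
def chunk_coords(start, length, frame, linewidth=60):
    """Return (from, to) coordinates for each wrapped line of an alignment.

    `frame` is +1 or -1; coordinates step in that direction, one per
    sequence character (mirroring the arithmetic in alignment_pre).
    """
    coords = []
    for offset in range(0, length, linewidth):
        # The last chunk may be shorter than linewidth.
        chunk_len = min(linewidth, length - offset)
        line_from = start + offset * frame
        line_to = start + (offset + chunk_len - 1) * frame
        coords.append((line_from, line_to))
    return coords

print(chunk_coords(100, 130, 1))   # [(100, 159), (160, 219), (220, 229)]
print(chunk_coords(229, 130, -1))  # [(229, 170), (169, 110), (109, 100)]
```

With a frame of -1 the coordinates count down, so the same 130-character alignment runs from 229 to 100 instead of from 100 to 229.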
@filter('len')
def blastxml_len(node):
    if node.tag == 'Hsp':
        return int(node['Hsp_align-len'])
    elif node.tag == 'Iteration':
        return int(node['Iteration_query-len'])
    raise Exception("Unknown XML node type: "+node.tag)

@filter
def nodeid(node):
    id = []
    if node.tag == 'Hsp':
        id.insert(0, node.Hsp_num.text)
        node = node.getparent().getparent()
        assert node.tag == 'Hit'
    if node.tag == 'Hit':
        id.insert(0, node.Hit_num.text)
        node = node.getparent().getparent()
        assert node.tag == 'Iteration'
    if node.tag == 'Iteration':
        id.insert(0, node['Iteration_iter-num'].text)
        return '-'.join(id)
    raise ValueError("The nodeid filter can only be applied to Hsp, Hit or Iteration nodes in a BlastXML document")


@filter
def asframe(frame):
    if frame == 1:
        return 'Plus'
    elif frame == -1:
        return 'Minus'
    raise Exception("frame should be either +1 or -1")

def genelink(hit, type='genbank', hsp=None):
    if not isinstance(hit, six.string_types):
        hit = hitid(hit)
    link = "http://www.ncbi.nlm.nih.gov/nucleotide/{0}?report={1}&log$=nuclalign".format(hit, type)
    if hsp is not None:
        link += "&from={0}&to={1}".format(hsp['Hsp_hit-from'], hsp['Hsp_hit-to'])
    return link


# javascript escape filter based on Django's, from https://github.com/dsissitka/khan-website/blob/master/templatefilters.py#L112-139
# I've removed the html escapes, since html escaping is already being performed by the template engine.

# The r'\u0027' syntax doesn't work the way we need to in python 2.6 with unicode_literals
_base_js_escapes = (
    ('\\', '\\u005C'),
    ('\'', '\\u0027'),
    ('"', '\\u0022'),
    # ('>', '\\u003E'),
    # ('<', '\\u003C'),
    # ('&', '\\u0026'),
    # ('=', '\\u003D'),
    # ('-', '\\u002D'),
    # (';', '\\u003B'),
    (u'\u2028', '\\u2028'),
    (u'\u2029', '\\u2029')
)

# Escape every ASCII character with a value less than 32. This is
# needed, among other things, to prevent html parsers from jumping out
# of javascript parsing mode.
_js_escapes = (_base_js_escapes +
               tuple(('%c' % z, '\\u%04X' % z) for z in range(32)))

@filter
def js_string_escape(value):
    """
    Javascript string literal escape. Note that this only escapes data
    for embedding within javascript string literals, not in general
    javascript snippets.
    """

    value = six.text_type(value)

    for bad, good in _js_escapes:
        value = value.replace(bad, good)

    return value

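The escape table is applied in order, so the backslash entry has to come first; otherwise the backslashes introduced by later replacements would themselves be escaped again. The following is a Python 3 sketch of the same replace-in-order approach without `six`. The shortened names `_base` and `_escapes` are hypothetical, and the sample inputs are for illustration only.

```python
# Python 3 sketch of the same replace-in-order approach (no six needed).
_base = (
    ('\\', '\\u005C'),   # backslash must be handled first
    ("'", '\\u0027'),
    ('"', '\\u0022'),
    ('\u2028', '\\u2028'),
    ('\u2029', '\\u2029'),
)
# Add an escape for every control character below code point 32.
_escapes = _base + tuple((chr(z), '\\u%04X' % z) for z in range(32))

def js_string_escape(value):
    value = str(value)
    for bad, good in _escapes:
        value = value.replace(bad, good)
    return value

print(js_string_escape("it's"))          # it\u0027s
print(js_string_escape('a\\b'))          # a\u005Cb
print(js_string_escape('line1\nline2'))  # line1\u000Aline2
```

As the original docstring notes, output like this is only safe inside a JavaScript string literal, not in arbitrary script context.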
@filter
def hits(result):
    # sort hits by longest HSP first
    return sorted(result.Iteration_hits.findall('Hit'),
                  key=lambda h: max(blastxml_len(hsp) for hsp in h.Hit_hsps.Hsp),
                  reverse=True)



| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 237 class BlastVisualize: | 
| 238 | |
| 239 colors = ('black', 'blue', 'green', 'magenta', 'red') | |
| 240 | |
| 241 max_scale_labels = 10 | |
| 242 | |
| 20 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 243 def __init__(self, input, templatedir, templatename): | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 244 self.input = input | 
| 20 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 245 self.templatename = templatename | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 246 | 
| 247 self.blast = objectify.parse(self.input).getroot() | |
| 81 
9fb1a7d67317
remove unneeded encoding parameter
 Jan Kanis <jan.code@jankanis.nl> parents: 
80diff
changeset | 248 self.loader = jinja2.FileSystemLoader(searchpath=templatedir) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 249 self.environment = jinja2.Environment(loader=self.loader, | 
| 250 lstrip_blocks=True, trim_blocks=True, autoescape=True) | |
| 251 | |
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 252 self._addfilters(self.environment) | 
| 253 | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 254 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 255 def _addfilters(self, environment): | 
| 256 for filtername, funcname in _filters.items(): | |
| 257 try: | |
| 258 environment.filters[filtername] = getattr(self, funcname) | |
| 259 except AttributeError: | |
| 77 | 260 try: | 
| 261 environment.filters[filtername] = globals()[funcname] | |
| 262 except KeyError: | |
| 263 environment.filters[filtername] = getattr(builtins, funcname) | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 264 | 
| 265 def render(self, output): | |
| 266 template = self.environment.get_template(self.templatename) | |
| 267 | |
| 268 params = (('Query ID', self.blast["BlastOutput_query-ID"]), | |
| 269 ('Query definition', self.blast["BlastOutput_query-def"]), | |
| 270 ('Query length', self.blast["BlastOutput_query-len"]), | |
| 271 ('Program', self.blast.BlastOutput_version), | |
| 272 ('Database', self.blast.BlastOutput_db), | |
| 273 ) | |
| 274 | |
| 75 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 275 result = template.render(blast=self.blast, | 
| 276 iterations=self.blast.BlastOutput_iterations.Iteration, | |
| 277 colors=self.colors, | |
| 278 genelink=genelink, | |
| 279 params=params) | |
| 280 if six.PY2: | |
| 281 result = result.encode('utf-8') | |
| 282 output.write(result) | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 283 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 284 @filter | 
| 285 def match_colors(self, result): | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 286 """ | 
| 287 A generator that yields one dict per hit; its 'colors' entry is a list of (width-percentage, color) pairs. | |
| 288 """ | |
| 289 | |
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 290 query_length = blastxml_len(result) | 
| 291 | |
| 292 percent_multiplier = 100 / query_length | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 293 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 294 for hit in hits(result): | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 295 # sort hotspots from short to long, so we can overwrite index colors of | 
| 296 # short matches with those of long ones. | |
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 297 hotspots = sorted(hit.Hit_hsps.Hsp, key=lambda hsp: blastxml_len(hsp)) | 
| 298 table = bytearray([255]) * query_length | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 299 for hsp in hotspots: | 
| 300 frm = hsp['Hsp_query-from'] - 1 | |
| 301 to = int(hsp['Hsp_query-to']) | |
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 302 table[frm:to] = repeat(color_idx(blastxml_len(hsp)), to - frm) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 303 | 
| 304 matches = [] | |
| 305 last = table[0] | |
| 306 count = 0 | |
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 307 for i in range(query_length): | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 308 if table[i] == last: | 
| 309 count += 1 | |
| 310 continue | |
| 18 | 311 matches.append((count * percent_multiplier, self.colors[last] if last != 255 else 'transparent')) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 312 last = table[i] | 
| 313 count = 1 | |
| 18 | 314 matches.append((count * percent_multiplier, self.colors[last] if last != 255 else 'transparent')) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 315 | 
| 55 
4217bb9cf1d3
depend on python 3; fix internal links with multiple iterations
 Jan Kanis <jan.code@jankanis.nl> parents: 
53diff
changeset | 316 yield dict(colors=matches, hit=hit, defline=firsttitle(hit)) | 
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 317 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 318 @filter | 
| 319 def queryscale(self, result): | |
| 320 query_length = blastxml_len(result) | |
| 321 skip = math.ceil(query_length / self.max_scale_labels) | |
| 322 percent_multiplier = 100 / query_length | |
| 323 for i in range(1, query_length+1): | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 324 if i % skip == 0: | 
| 25 | 325 yield dict(label = i, width = skip * percent_multiplier) | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 326 if query_length % skip != 0: | 
| 22 
53cd304c5f26
Add index for multiple results; fix layout of query ruler for edge case
 Jan Kanis <jan.code@jankanis.nl> parents: 
21diff
changeset | 327 yield dict(label = query_length, | 
| 25 | 328 width = (query_length % skip) * percent_multiplier) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 329 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 330 @filter | 
| 331 def hit_info(self, result): | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 332 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 333 query_length = blastxml_len(result) | 
| 334 | |
| 335 for hit in hits(result): | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 336 hsps = hit.Hit_hsps.Hsp | 
| 9 
9e7927673089
intermediate commit before converting some tables to divs
 Jan Kanis <jan.code@jankanis.nl> parents: 
7diff
changeset | 337 | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 338 cover = [False] * query_length | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 339 for hsp in hsps: | 
| 21 
67ddcb807b7d
make it work with multiple queries
 Jan Kanis <jan.code@jankanis.nl> parents: 
20diff
changeset | 340 cover[hsp['Hsp_query-from']-1 : int(hsp['Hsp_query-to'])] = repeat(True, blastxml_len(hsp)) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 341 cover_count = cover.count(True) | 
| 342 | |
| 343 def hsp_val(path): | |
| 344 return (float(hsp[path]) for hsp in hsps) | |
| 345 | |
| 346 yield dict(hit = hit, | |
| 77 | 347 title = firsttitle(hit), | 
| 75 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 348 maxscore = "{0:.1f}".format(max(hsp_val('Hsp_bit-score'))), | 
| 349 totalscore = "{0:.1f}".format(sum(hsp_val('Hsp_bit-score'))), | |
| 350 cover = "{0:.0%}".format(cover_count / query_length), | |
| 351 e_value = "{0:.4g}".format(min(hsp_val('Hsp_evalue'))), | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 352 # FIXME: is this the correct formula vv? | 
| 80 | 353 # float(...) because non-flooring division doesn't work with lxml elements in python 2.6 | 
| 77 | 354 ident = "{0:.0%}".format(float(min(float(hsp.Hsp_identity) / blastxml_len(hsp) for hsp in hsps))), | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 355 accession = hit.Hit_accession) | 
| 12 
2fbdf2eb27b4
All data is displayed now, still some formatting to do
 Jan Kanis <jan.code@jankanis.nl> parents: 
9diff
changeset | 356 | 
| 29 
4e6ac737ba17
improve the galaxy html stripping warning; make sure the tool can find the template from within galaxy
 Jan Kanis <jan.code@jankanis.nl> parents: 
26diff
changeset | 357 | 
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 358 def main(): | 
| 29 
4e6ac737ba17
improve the galaxy html stripping warning; make sure the tool can find the template from within galaxy
 Jan Kanis <jan.code@jankanis.nl> parents: 
26diff
changeset | 359 default_template = path.join(path.dirname(__file__), 'blast2html.html.jinja') | 
| 75 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 360 | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 361 parser = argparse.ArgumentParser(description="Convert a BLAST XML result into a nicely readable html page", | 
| 75 
67b1a319c6dc
First go at 2.6 compatibility
 Jan Kanis <jan.code@jankanis.nl> parents: 
74diff
changeset | 362 usage="{0} [-i] INPUT [-o OUTPUT]".format(sys.argv[0])) | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 363 input_group = parser.add_mutually_exclusive_group(required=True) | 
| 364 input_group.add_argument('positional_arg', metavar='INPUT', nargs='?', type=argparse.FileType(mode='r'), | |
| 365 help='The input Blast XML file, same as -i/--input') | |
| 55 
4217bb9cf1d3
depend on python 3; fix internal links with multiple iterations
 Jan Kanis <jan.code@jankanis.nl> parents: 
53diff
changeset | 366 input_group.add_argument('-i', '--input', type=argparse.FileType(mode='r'), | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 367 help='The input Blast XML file') | 
| 55 
4217bb9cf1d3
depend on python 3; fix internal links with multiple iterations
 Jan Kanis <jan.code@jankanis.nl> parents: 
53diff
changeset | 368 parser.add_argument('-o', '--output', type=argparse.FileType(mode='w'), default=sys.stdout, | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 369 help='The output html file') | 
| 20 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 370 # We just want the file name here, so jinja can open the file | 
| 371 # itself. But it is easier to just use a FileType so argparse can | |
| 372 # handle the errors. This introduces a small race condition when | |
| 373 # jinja later tries to re-open the template file, but we don't | |
| 374 # care too much. | |
| 55 
4217bb9cf1d3
depend on python 3; fix internal links with multiple iterations
 Jan Kanis <jan.code@jankanis.nl> parents: 
53diff
changeset | 375 parser.add_argument('--template', type=argparse.FileType(mode='r'), default=default_template, | 
| 20 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 376 help='The template file to use. Defaults to blast2html.html.jinja') | 
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 377 | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 378 args = parser.parse_args() | 
| 379 if args.input is None: | |
| 380 args.input = args.positional_arg | |
| 381 if args.input is None: | |
| 382 parser.error('no input specified') | |
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 383 | 
| 20 
4434ffab721a
add a parameter for the template
 Jan Kanis <jan.code@jankanis.nl> parents: 
18diff
changeset | 384 templatedir, templatename = path.split(args.template.name) | 
| 385 args.template.close() | |
| 386 if not templatedir: | |
| 387 templatedir = '.' | |
| 388 | |
| 389 b = BlastVisualize(args.input, templatedir, templatename) | |
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 390 b.render(args.output) | 
| 391 | |
| 13 
7660519f2dc9
proper layout for alignments, added some links
 Jan Kanis <jan.code@jankanis.nl> parents: 
12diff
changeset | 392 | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 393 if __name__ == '__main__': | 
| 394 main() | |
| 13 
7660519f2dc9
proper layout for alignments, added some links
 Jan Kanis <jan.code@jankanis.nl> parents: 
12diff
changeset | 395 | 
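
The color-bar logic in `match_colors` can be hard to follow inside the annotated listing, so here is a minimal standalone sketch of the same idea: paint a per-position table (short HSPs first, so longer ones overwrite them), then run-length encode it into (width-percentage, color) pairs. The function name `color_segments`, the `(query_from, query_to, color_index)` tuple format, and the `UNCOVERED` constant are illustrative assumptions; the original reads these values from the parsed BLAST XML and derives the color index with `color_idx`. The sketch also assumes `query_length` is greater than zero.

```python
COLORS = ('black', 'blue', 'green', 'magenta', 'red')
UNCOVERED = 255  # sentinel for query positions that no HSP covers


def color_segments(query_length, hsps, colors=COLORS):
    """Return a list of (width_percent, color) runs for one hit.

    `hsps` is a hypothetical simplified input: a list of
    (query_from, query_to, color_index) tuples with 1-based,
    inclusive coordinates.
    """
    table = bytearray([UNCOVERED]) * query_length

    # Paint short HSPs first so that longer ones overwrite them.
    for frm, to, idx in sorted(hsps, key=lambda h: h[1] - h[0]):
        table[frm - 1:to] = bytearray([idx]) * (to - frm + 1)

    def name(value):
        return colors[value] if value != UNCOVERED else 'transparent'

    percent = 100.0 / query_length
    segments = []
    last, count = table[0], 0
    for value in table:
        if value == last:
            count += 1
            continue
        # The run ended: emit it and start counting the next one.
        segments.append((count * percent, name(last)))
        last, count = value, 1
    segments.append((count * percent, name(last)))
    return segments


if __name__ == '__main__':
    for width, color in color_segments(10, [(1, 4, 0), (3, 8, 2)]):
        print(width, color)
```

The widths of the returned runs always add up to 100 percent of the query, which is what lets a template lay them out side by side as one bar.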

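
The ruler produced by `queryscale` follows a similar pattern: pick a label spacing so that at most roughly `max_scale_labels` labels appear, emit one label per step, and add a narrower final label when the query length is not a multiple of the step. The sketch below restates that calculation as a plain function; `scale_labels` and its `max_labels` parameter are hypothetical names, and the explicit `int(...)` and `float(...)` conversions are an assumption made here so the result is the same on Python 2 and 3.

```python
import math


def scale_labels(query_length, max_labels=10):
    """Return ruler entries as dicts with 'label' and 'width' (in percent)."""
    # Distance between labels, rounded up so we never exceed max_labels
    # full-width steps.
    skip = int(math.ceil(query_length / float(max_labels)))
    percent = 100.0 / query_length

    labels = [dict(label=i, width=skip * percent)
              for i in range(skip, query_length + 1, skip)]

    # A trailing partial step gets its own, narrower label.
    remainder = query_length % skip
    if remainder:
        labels.append(dict(label=query_length, width=remainder * percent))
    return labels


if __name__ == '__main__':
    for entry in scale_labels(25):
        print(entry['label'], entry['width'])
```

As with the color bar, the widths sum to 100 percent, so the ruler lines up with the match rows beneath it.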