annotate blast_html.py @ 15:c2d63adb83db draft
renamed files
| author | Jan Kanis <jan.code@jankanis.nl> | 
|---|---|
| date | Mon, 12 May 2014 17:13:49 +0200 | 
| parents | visualise.py@a459c754cdb5 | 
| children | 0b33898bba45 | 
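
The `match_colors` method in the listing below paints every query position with the color index of the longest HSP covering it, then collapses equal neighbours into (width, color) runs. The following is only a minimal sketch of that idea, not the author's code: `color_runs`, the `UNCOVERED` constant, and the `(query_from, query_to, align_len)` tuple layout are assumptions that stand in for the lxml `Hsp` elements used in the file.

```python
from itertools import groupby

COLORS = ('black', 'blue', 'green', 'magenta', 'red')
UNCOVERED = 255  # sentinel for query positions that no HSP covers


def color_idx(length):
    """Map an alignment length to an index into COLORS (thresholds as in the listing)."""
    for idx, limit in enumerate((40, 50, 80, 200)):
        if length < limit:
            return idx
    return 4


def color_runs(query_length, hsps):
    """Return a list of (width_percent, color) runs for one hit.

    hsps: iterable of (query_from, query_to, align_len) tuples with 1-based,
    inclusive coordinates. This tuple layout is an assumption of the sketch.
    """
    table = bytearray([UNCOVERED]) * query_length
    # Paint short HSPs first so that longer ones overwrite them.
    for frm, to, length in sorted(hsps, key=lambda h: h[2]):
        table[frm - 1:to] = bytes([color_idx(length)]) * (to - frm + 1)
    percent = 100 / query_length
    # Collapse consecutive equal color indexes into one run each.
    return [(len(list(run)) * percent,
             COLORS[idx] if idx != UNCOVERED else 'none')
            for idx, run in groupby(table)]


if __name__ == '__main__':
    print(color_runs(100, [(1, 30, 30), (21, 80, 60)]))
```

Because the widths are percentages of the query length, they sum to 100 for every hit, so a template can presumably use them directly as relative widths.
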

| rev | changeset | description | author | parent rev |
|---|---|---|---|---|
| 7 | 1df2bfce5c24 | first features are working, partial match table | Jan Kanis <jan.code@jankanis.nl> | |
| 9 | 9e7927673089 | intermediate commit before converting some tables to divs | Jan Kanis <jan.code@jankanis.nl> | 7 |
| 12 | 2fbdf2eb27b4 | All data is displayed now, still some formatting to do | Jan Kanis <jan.code@jankanis.nl> | 9 |
| 13 | 7660519f2dc9 | proper layout for alignments, added some links | Jan Kanis <jan.code@jankanis.nl> | 12 |
| 14 | a459c754cdb5 | add links, refactor, proper commandline arguments | Jan Kanis <jan.code@jankanis.nl> | 13 |

| rev | line source |
|---|---|
| 7 | 1 #!/usr/bin/env python3 |
| 7 | 2 |
| 7 | 3 # Copyright The Hyve B.V. 2014 |
| 7 | 4 # License: GPL version 3 or higher |
| 7 | 5 |
| 7 | 6 import sys |
| 9 | 7 import math |
| 7 | 8 import warnings |
| 7 | 9 from itertools import repeat |
| 14 | 10 import argparse |
| 7 | 11 from lxml import objectify |
| 7 | 12 import jinja2 |
| 7 | 13 |
| 7 | 14 |
| 12 | 15 |
| 14 | 16 _filters = {} |
| 12 | 17 def filter(func_or_name): |
| 13 | 18     "Decorator to register a function as filter in the current jinja environment" |
| 12 | 19     if isinstance(func_or_name, str): |
| 12 | 20         def inner(func): |
| 14 | 21             _filters[func_or_name] = func |
| 12 | 22             return func |
| 12 | 23         return inner |
| 12 | 24     else: |
| 14 | 25         _filters[func_or_name.__name__] = func_or_name |
| 12 | 26         return func_or_name |
| 12 | 27 |
| 12 | 28 |
| 7 | 29 def color_idx(length): |
| 7 | 30     if length < 40: |
| 7 | 31         return 0 |
| 7 | 32     elif length < 50: |
| 7 | 33         return 1 |
| 7 | 34     elif length < 80: |
| 7 | 35         return 2 |
| 7 | 36     elif length < 200: |
| 7 | 37         return 3 |
| 7 | 38     return 4 |
| 7 | 39 |
| 12 | 40 @filter |
| 12 | 41 def fmt(val, fmt): |
| 12 | 42     return format(float(val), fmt) |
| 12 | 43 |
| 12 | 44 @filter |
| 12 | 45 def firsttitle(hit): |
| 12 | 46     return hit.Hit_def.text.split('>')[0] |
| 12 | 47 |
| 12 | 48 @filter |
| 12 | 49 def othertitles(hit): |
| 12 | 50     """Split a hit.Hit_def that contains multiple titles up, splitting out the hit ids from the titles.""" |
| 12 | 51     id_titles = hit.Hit_def.text.split('>') |
| 12 | 52 |
| 12 | 53     titles = [] |
| 12 | 54     for t in id_titles[1:]: |
| 12 | 55         fullid, title = t.split(' ', 1) |
| 14 | 56         hitid, id = fullid.split('|', 2)[1:3] |
| 12 | 57         titles.append(dict(id = id, |
| 14 | 58                            hitid = hitid, |
| 12 | 59                            fullid = fullid, |
| 12 | 60                            title = title)) |
| 12 | 61     return titles |
| 12 | 62 |
| 12 | 63 @filter |
| 12 | 64 def hitid(hit): |
| 12 | 65     return hit.Hit_id.text.split('|', 2)[1] |
| 12 | 66 |
| 12 | 67 @filter |
| 12 | 68 def seqid(hit): |
| 12 | 69     return hit.Hit_id.text.split('|', 2)[2] |
| 12 | 70 |
| 12 | 71 @filter |
| 12 | 72 def alignment_pre(hsp): |
| 12 | 73     return ( |
| 13 | 74         "Query {:>7s} {} {}\n".format(hsp['Hsp_query-from'], hsp.Hsp_qseq, hsp['Hsp_query-to']) + |
| 13 | 75         " {:7s} {}\n".format('', hsp.Hsp_midline) + |
| 13 | 76         "Subject{:>7s} {} {}".format(hsp['Hsp_hit-from'], hsp.Hsp_hseq, hsp['Hsp_hit-to']) |
| 13 | 77     ) |
| 12 | 78 |
| 12 | 79 @filter('len') |
| 12 | 80 def hsplen(node): |
| 12 | 81     return int(node['Hsp_align-len']) |
| 12 | 82 |
| 12 | 83 @filter |
| 12 | 84 def asframe(frame): |
| 12 | 85     if frame == 1: |
| 12 | 86         return 'Plus' |
| 12 | 87     elif frame == -1: |
| 12 | 88         return 'Minus' |
| 12 | 89     raise Exception("frame should be either +1 or -1") |
| 12 | 90 |
| 14 | 91 def genelink(hit, type='genbank', hsp=None): |
| 14 | 92     if not isinstance(hit, str): |
| 14 | 93         hit = hitid(hit) |
| 14 | 94     link = "http://www.ncbi.nlm.nih.gov/nucleotide/{}?report={}&log$=nuclalign".format(hit, type) |
| 14 | 95     if hsp != None: |
| 14 | 96         link += "&from={}&to={}".format(hsp['Hsp_hit-from'], hsp['Hsp_hit-to']) |
| 14 | 97     return jinja2.Markup(link) |
| 9 | 98 |
| 7 | 99 |
| 9 | 100 |
| 7 | 101 |
| 14 | 102 class BlastVisualize: |
| 14 | 103 |
| 14 | 104     colors = ('black', 'blue', 'green', 'magenta', 'red') |
| 14 | 105 |
| 14 | 106     max_scale_labels = 10 |
| 14 | 107 |
| 14 | 108     templatename = 'visualise.html.jinja' |
| 14 | 109 |
| 14 | 110     def __init__(self, input): |
| 14 | 111         self.input = input |
| 14 | 112 |
| 14 | 113         self.blast = objectify.parse(self.input).getroot() |
| 14 | 114         self.loader = jinja2.FileSystemLoader(searchpath='.') |
| 14 | 115         self.environment = jinja2.Environment(loader=self.loader, |
| 14 | 116                                               lstrip_blocks=True, trim_blocks=True, autoescape=True) |
| 14 | 117 |
| 14 | 118         self.environment.filters['color'] = lambda length: self.colors[color_idx(length)] |
| 14 | 119 |
| 14 | 120         for name, filter in _filters.items(): |
| 14 | 121             self.environment.filters[name] = filter |
| 14 | 122 |
| 14 | 123         self.query_length = int(self.blast["BlastOutput_query-len"]) |
| 14 | 124         self.hits = self.blast.BlastOutput_iterations.Iteration.Iteration_hits.Hit |
| 14 | 125         # sort hits by longest hotspot first |
| 14 | 126         self.ordered_hits = sorted(self.hits, |
| 14 | 127                                    key=lambda h: max(hsplen(hsp) for hsp in h.Hit_hsps.Hsp), |
| 14 | 128                                    reverse=True) |
| 14 | 129 |
| 14 | 130     def render(self, output): |
| 14 | 131         template = self.environment.get_template(self.templatename) |
| 14 | 132 |
| 14 | 133         params = (('Query ID', self.blast["BlastOutput_query-ID"]), |
| 14 | 134                   ('Query definition', self.blast["BlastOutput_query-def"]), |
| 14 | 135                   ('Query length', self.blast["BlastOutput_query-len"]), |
| 14 | 136                   ('Program', self.blast.BlastOutput_version), |
| 14 | 137                   ('Database', self.blast.BlastOutput_db), |
| 14 | 138                   ) |
| 14 | 139 |
| 14 | 140         if len(self.blast.BlastOutput_iterations.Iteration) > 1: |
| 14 | 141             warnings.warn("Multiple 'Iteration' elements found, showing only the first") |
| 14 | 142 |
| 14 | 143         output.write(template.render(blast=self.blast, |
| 14 | 144                                      length=self.query_length, |
| 14 | 145                                      hits=self.blast.BlastOutput_iterations.Iteration.Iteration_hits.Hit, |
| 14 | 146                                      colors=self.colors, |
| 14 | 147                                      match_colors=self.match_colors(), |
| 14 | 148                                      queryscale=self.queryscale(), |
| 14 | 149                                      hit_info=self.hit_info(), |
| 14 | 150                                      genelink=genelink, |
| 14 | 151                                      params=params)) |
| 14 | 152 |
| 14 | 153 |
| 14 | 154     def match_colors(self): |
| 14 | 155         """ |
| 14 | 156         An iterator that yields lists of length-color pairs. |
| 14 | 157         """ |
| 14 | 158 |
| 14 | 159         percent_multiplier = 100 / self.query_length |
| 14 | 160 |
| 14 | 161         for hit in self.hits: |
| 14 | 162             # sort hotspots from short to long, so we can overwrite index colors of |
| 14 | 163             # short matches with those of long ones. |
| 14 | 164             hotspots = sorted(hit.Hit_hsps.Hsp, key=lambda hsp: hsplen(hsp)) |
| 14 | 165             table = bytearray([255]) * self.query_length |
| 14 | 166             for hsp in hotspots: |
| 14 | 167                 frm = hsp['Hsp_query-from'] - 1 |
| 14 | 168                 to = int(hsp['Hsp_query-to']) |
| 14 | 169                 table[frm:to] = repeat(color_idx(hsplen(hsp)), to - frm) |
| 14 | 170 |
| 14 | 171             matches = [] |
| 14 | 172             last = table[0] |
| 14 | 173             count = 0 |
| 14 | 174             for i in range(self.query_length): |
| 14 | 175                 if table[i] == last: |
| 14 | 176                     count += 1 |
| 14 | 177                     continue |
| 14 | 178                 matches.append((count * percent_multiplier, self.colors[last] if last != 255 else 'none')) |
| 14 | 179                 last = table[i] |
| 14 | 180                 count = 1 |
| 14 | 181             matches.append((count * percent_multiplier, self.colors[last] if last != 255 else 'none')) |
| 14 | 182 |
| 14 | 183             yield dict(colors=matches, link="#hit"+hit.Hit_num.text, defline=firsttitle(hit)) |
| 7 | 184 |
| 7 | 185 |
| 14 | 186     def queryscale(self): |
| 14 | 187         skip = math.ceil(self.query_length / self.max_scale_labels) |
| 14 | 188         percent_multiplier = 100 / self.query_length |
| 14 | 189         for i in range(1, self.query_length+1): |
| 14 | 190             if i % skip == 0: |
| 14 | 191                 yield dict(label = i, width = skip * percent_multiplier) |
| 14 | 192         if self.query_length % skip != 0: |
| 14 | 193             yield dict(label = self.query_length, width = (self.query_length % skip) * percent_multiplier) |
| 9 | 194 |
| 14 | 195 |
| 14 | 196     def hit_info(self): |
| 14 | 197 |
| 14 | 198         for hit in self.ordered_hits: |
| 14 | 199             hsps = hit.Hit_hsps.Hsp |
| 9 | 200 |
| 14 | 201             cover = [False] * self.query_length |
| 14 | 202             for hsp in hsps: |
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 203 cover[hsp['Hsp_query-from']-1 : int(hsp['Hsp_query-to'])] = repeat(True, int(hsp['Hsp_query-to']) - int(hsp['Hsp_query-from']) + 1) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 204 cover_count = cover.count(True) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 205 | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 206 def hsp_val(path): | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 207 return (float(hsp[path]) for hsp in hsps) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 208 | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 209 yield dict(hit = hit, | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 210 title = firsttitle(hit), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 211 link_id = hit.Hit_num, | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 212 maxscore = "{:.1f}".format(max(hsp_val('Hsp_bit-score'))), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 213 totalscore = "{:.1f}".format(sum(hsp_val('Hsp_bit-score'))), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 214 cover = "{:.0%}".format(cover_count / self.query_length), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 215 e_value = "{:.4g}".format(min(hsp_val('Hsp_evalue'))), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 216 # FIXME: verify that the identity formula on the next line is correct | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 217 ident = "{:.0%}".format(float(min(hsp.Hsp_identity / hsplen(hsp) for hsp in hsps))), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 218 accession = hit.Hit_accession) | 
| 12 
2fbdf2eb27b4
All data is displayed now, still some formatting to do
 Jan Kanis <jan.code@jankanis.nl> parents: 
9diff
changeset | 219 | 
| 
2fbdf2eb27b4
All data is displayed now, still some formatting to do
 Jan Kanis <jan.code@jankanis.nl> parents: 
9diff
changeset | 220 | 
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 221 def main(): | 
| 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 222 | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 223 parser = argparse.ArgumentParser(description="Convert a BLAST XML result into a nicely readable html page", | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 224 usage="{} [-i] INPUT [-o OUTPUT]".format(sys.argv[0])) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 225 input_group = parser.add_mutually_exclusive_group(required=True) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 226 input_group.add_argument('positional_arg', metavar='INPUT', nargs='?', type=argparse.FileType(mode='r'), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 227 help='The input Blast XML file, same as -i/--input') | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 228 input_group.add_argument('-i', '--input', type=argparse.FileType(mode='r'), | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 229 help='The input Blast XML file') | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 230 parser.add_argument('-o', '--output', type=argparse.FileType(mode='w'), default=sys.stdout, | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 231 help='The output html file') | 
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 232 | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 233 args = parser.parse_args() | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 234 if args.input is None: | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 235 args.input = args.positional_arg | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 236 if args.input is None: | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 237 parser.error('no input specified') | 
| 7 
1df2bfce5c24
first features are working, partial match table
 Jan Kanis <jan.code@jankanis.nl> parents: diff
changeset | 238 | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 239 b = BlastVisualize(args.input) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 240 b.render(args.output) | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 241 | 
| 13 
7660519f2dc9
proper layout for alignments, added some links
 Jan Kanis <jan.code@jankanis.nl> parents: 
12diff
changeset | 242 | 
| 14 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 243 if __name__ == '__main__': | 
| 
a459c754cdb5
add links, refactor, proper commandline arguments
 Jan Kanis <jan.code@jankanis.nl> parents: 
13diff
changeset | 244 main() | 
| 13 
7660519f2dc9
proper layout for alignments, added some links
 Jan Kanis <jan.code@jankanis.nl> parents: 
12diff
changeset | 245 | 
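
The query-cover figure computed in `hit_info` can be illustrated in isolation. The following is a minimal sketch, not the code from this changeset: `query_coverage` and its `hsp_ranges` argument are hypothetical names, and the ranges are assumed to be the 1-based, inclusive `Hsp_query-from` / `Hsp_query-to` pairs of one hit. Swapping reversed coordinates and clamping to the query length are extra assumptions that the annotated code does not make.

```python
def query_coverage(query_length, hsp_ranges):
    """Return the fraction of query positions covered by at least one HSP.

    hsp_ranges holds 1-based, inclusive (query_from, query_to) pairs.
    """
    covered = [False] * query_length
    for query_from, query_to in hsp_ranges:
        # Assumption: a reversed pair describes the same span, so order it.
        start, end = sorted((query_from, query_to))
        # Convert to 0-based indices and stay inside the query.
        for pos in range(max(start, 1) - 1, min(end, query_length)):
            covered[pos] = True
    return covered.count(True) / query_length


# Overlapping HSPs are only counted once: positions 1-60 and 91-100.
print("{:.0%}".format(query_coverage(100, [(1, 40), (30, 60), (91, 100)])))
```

Marking positions one at a time keeps the `covered` list at exactly `query_length` entries. A slice assignment does not guarantee that if the assigned iterable is longer than the slice, for example when it is sized by an alignment length that includes gaps.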
