#Created 07/01/2011 - Konrad Paszkiewicz, Exeter Sequencing Service, University of Exeter, UK
Revisions 2013 by Peter Cock, The James Hutton Institute, UK

The attached is a crude wrapper script for Interproscan. Typically this is useful when one wants to produce an annotation which is not based on sequence
similarity. For example, after a de novo transcriptome assembly, each transcript could be translated and run through this tool.

Prerequisites:

1. A working installation of Interproscan on your Galaxy server/cluster.

Limitations:

Currently it is set up to work with PFAM only, due to the heavy computational demands Interproscan makes.

Input formats:

The standard Interproscan input is either genomic or protein sequences. In the case of genomic sequences, Interproscan will run an ORF
prediction tool. However, this tends to lose the ORF information (e.g. start/end co-ordinates) from the header. As such, the requirement here is to input ORF
sequences (e.g. from EMBOSS getorf) and to then replace any spaces in the FASTA header with underscores. This workaround generally preserves the relevant
positional information.
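
The header clean-up described above could be done with a few lines of Python. This is only a minimal sketch and not part of the wrapper itself: the function name underscore_fasta_headers and the example header are hypothetical, chosen for illustration. It assumes a plain FASTA file in which header lines start with ">".

```python
def underscore_fasta_headers(lines):
    """Yield FASTA lines, replacing spaces in header lines with underscores.

    Hypothetical helper for illustration; not part of the wrapper script.
    """
    for line in lines:
        if line.startswith(">"):
            # Header line: drop trailing whitespace (including the newline)
            # first so that it is not turned into trailing underscores.
            yield line.rstrip().replace(" ", "_") + "\n"
        else:
            # Sequence lines are passed through unchanged.
            yield line


# Illustrative input only; the header layout is an assumption.
example = [">seq1_1 [2 - 109]\n", "MKVLAAGIVG\n"]
cleaned = list(underscore_fasta_headers(example))
```

To process a file, pass an open file handle to the function and write the result to a new file with writelines().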