annotate antismash.xml @ 11:d2c2eb518142 draft

Uploaded
author bgruening
date Wed, 09 Oct 2013 11:14:23 -0400
parents b11e1dfbc7c9
children 9cfa2fb488b0
<tool id="antismash" name="Secondary Metabolites" version="2.0.2.0">
    <description>and Antibiotics Analysis (antiSMASH)</description>
    <requirements>
        <requirement type="package" version="3.0">hmmer</requirement>
        <requirement type="package" version="2.3.2">hmmer</requirement>
        <requirement type="package" version="2.2.28">blast+</requirement>
        <requirement type="package" version="3.8.31">muscle</requirement>
        <requirement type="package" version="2.0.2">antismash_python_deps</requirement>
        <requirement type="package" version="2.0.2">antismash</requirement>
    </requirements>
    <command>
        run_antismash.py

        --input $infile
        --cpus 4
        #set $type_list = ','.join([$type for $type in $types])
        --enable $type_list
        --input-type nucl
        $smcogs
        $clusterblast
        $subclusterblast
        $inclusive
        $full_hmmer
        $full_blast

        --pfamdir $pfam_database.fields.path

        ## The start and end options are left out; they can easily be replaced with Galaxy tools
        ##--from START Start analysis at nucleotide specified
        ##--to END

    </command>
    <inputs>
        <param name="infile" type="data" format="gb,embl" label="Nucleotide sequence file in GenBank or EMBL format"/>

        <param name="clusterblast" type="boolean" label="BLAST identified clusters against known clusters" truevalue="--clusterblast" falsevalue="" checked="True" />
        <param name="smcogs" type="boolean" label="Analysis of secondary metabolism gene families (smCOGs)"
            falsevalue="" truevalue="--smcogs" checked="True" />

        <param name="full_blast" type="boolean" label="Run a whole-genome BLAST analysis" truevalue="--full-blast" falsevalue="" checked="False" />
        <param name="subclusterblast" type="boolean" label="Subcluster BLAST analysis" truevalue="--subclusterblast" falsevalue="" checked="false" />
        <param name="full_hmmer" type="boolean" label="Run a whole-genome Pfam analysis" truevalue="--full-hmmer" falsevalue="" checked="false" />

        <param name="inclusive" type="boolean" label="Use inclusive algorithm for cluster detection" truevalue="--inclusive" falsevalue="" checked="false" />

        <param name="pfam_database" type="select" label="Pfam database" help="Pfam profile HMMs">
            <options from_file="antismash.loc">
                <column name="value" index="0"/>
                <column name="name" index="1"/>
                <column name="path" index="2"/>
            </options>
        </param>

        <param name="types" type="select" display="checkboxes" multiple="true" label="Gene cluster types to search">
            <option value="t1pks" selected="True">type I polyketide synthases</option>
            <option value="t2pks" selected="True">type II polyketide synthases</option>
            <option value="t3pks" selected="True">type III polyketide synthases</option>
            <option value="t4pks" selected="True">type IV polyketide synthases</option>
            <option value="transatpks" selected="True">trans-AT PKS</option>
            <option value="nrps" selected="True">nonribosomal peptide synthetases</option>
            <option value="terpene" selected="True">terpene synthases</option>
            <option value="lantipeptide" selected="True">lantipeptides</option>
            <option value="bacteriocin" selected="True">bacteriocins</option>
            <option value="blactam" selected="True">beta-lactams</option>
            <option value="amglyccycl" selected="True">aminoglycosides / aminocyclitols</option>
            <option value="aminocoumarin" selected="True">aminocoumarins</option>
            <option value="siderophore" selected="True">siderophores</option>
            <option value="ectoine" selected="True">ectoines</option>
            <option value="butyrolactone" selected="True">butyrolactones</option>
            <option value="indole" selected="True">indoles</option>
            <option value="nucleoside" selected="True">nucleosides</option>
            <option value="phosphoglycolipid" selected="True">phosphoglycolipids</option>
            <option value="oligosaccharide" selected="True">oligosaccharides</option>
            <option value="furan" selected="True">furans</option>
            <option value="hserlactone" selected="True">homoserine lactones</option>
            <option value="thiopeptide" selected="True">thiopeptides</option>
            <option value="phenazine" selected="True">phenazines</option>
            <option value="phosphonate" selected="True">phosphonates</option>
            <option value="others" selected="True">others</option>
        </param>

    </inputs>
    <outputs>
        <data format="fasta" name="geneclusterprots" label="${tool.name} on ${on_string} (Gene Cluster Proteins)" />
        <data format="tabular" name="zip" label="${tool.name} on ${on_string} (all files compressed)" />
        <data format="html" name="html" label="${tool.name} on ${on_string} (html report)" />
        <data name="embl" format="text" label="${tool.name} on ${on_string} EMBL Output Format">
            <filter>(full_blast == True or full_hmmer == True)</filter>
        </data>
    </outputs>
    <help>

.. class:: infomark

**What it does**

antiSMASH allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes.
It integrates and cross-links with a large number of in silico secondary metabolite analysis tools that have been published earlier.


**Input**

The ideal input for antiSMASH is an annotated nucleotide file in GenBank or EMBL format. If no annotation is available,
we recommend running your sequence through an annotation pipeline such as RAST or one included in Galaxy.


There are several optional analyses that may or may not be run on your sequence.
Highly recommended is the Gene Cluster Blast Comparative Analysis, which runs BlastP using each amino acid sequence from a detected gene cluster as a
query against a large database of predicted protein sequences from secondary metabolite biosynthetic gene clusters, and pools the results to identify
the gene clusters that are most homologous to the gene cluster that was detected in your query nucleotide sequence.


Also available is the analysis of secondary metabolism gene families (smCOGs).
This analysis attempts to allocate each gene in the detected gene clusters to a secondary metabolism-specific gene
family using profile hidden Markov models specific for the conserved sequence region characteristic of this family.
Additionally, a phylogenetic tree is constructed of each gene together with the (max. 100) sequences of the smCOG seed alignment.


For the most thorough genome analysis, we provide genome-wide PFAM HMM analysis of all genes in the genome through modules of the CLUSEAN pipeline.
Of course, some regions important to secondary metabolism may have been missed in the gene cluster identification stage
(e.g. because they represent the biosynthetic pathway of a yet unknown secondary metabolite).
Therefore, when genome-wide PFAM HMM analysis is selected, the PFAM frequencies are also used to find all genome regions in which PFAM domains typical for secondary metabolism are overrepresented.


**References**

Marnix H. Medema, Kai Blin, Peter Cimermancic, Victor de Jager, Piotr Zakrzewski, Michael A. Fischbach, Tilmann Weber,
Rainer Breitling and Eriko Takano (2011). antiSMASH: Rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters. Nucleic Acids Research, doi: 10.1093/nar/gkr466.

http://antismash.secondarymetabolites.org/help.html

    </help>
</tool>
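
Note on the command template: the sketch below is not part of the wrapper. It mirrors, in plain Python, roughly how the `<command>` section above would expand for a given set of inputs: the selected cluster types are joined with commas for `--enable`, each checked boolean parameter contributes its `truevalue` flag, and unchecked ones contribute nothing. The function name `build_command`, the file name `genome.gbk` and the path `/path/to/pfam` are assumptions made for illustration; only the flags and parameter names come from the tool definition.

```python
# Sketch only: mirrors the <command> template in plain Python.
# build_command and the example file names/paths are hypothetical.

# Boolean params and the flag each contributes when checked (truevalue).
# Unchecked params contribute an empty string (falsevalue="").
# The order matches the order of the variables in the template.
BOOLEAN_FLAGS = {
    "smcogs": "--smcogs",
    "clusterblast": "--clusterblast",
    "subclusterblast": "--subclusterblast",
    "inclusive": "--inclusive",
    "full_hmmer": "--full-hmmer",
    "full_blast": "--full-blast",
}


def build_command(infile, types, pfam_path, enabled=("smcogs", "clusterblast")):
    """Return the argument list the template would roughly produce."""
    args = [
        "run_antismash.py",
        "--input", infile,
        "--cpus", "4",
        # Equivalent of the "#set $type_list = ','.join(...)" line.
        "--enable", ",".join(types),
        "--input-type", "nucl",
    ]
    # Add the flag of every checked boolean parameter, in template order.
    args += [flag for name, flag in BOOLEAN_FLAGS.items() if name in enabled]
    # The path column of the selected antismash.loc entry ends up here.
    args += ["--pfamdir", pfam_path]
    return args


command = build_command("genome.gbk", ["t1pks", "nrps"], "/path/to/pfam")
print(" ".join(command))
```

The default `enabled` tuple reflects the two parameters that are checked by default in the form (`smcogs` and `clusterblast`); passing a different tuple shows how the other flags would appear on the command line.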