# HG changeset patch
# User bgruening
# Date 1381331663 14400
# Node ID d2c2eb51814259785f3fce5011ebe85e9692788d
# Parent d2c785cdf23ea973834ce5f97fec8069c3ba0941
Uploaded
diff -r d2c785cdf23e -r d2c2eb518142 antismash.xml
--- a/antismash.xml Wed Oct 09 10:06:13 2013 -0400
+++ b/antismash.xml Wed Oct 09 11:14:23 2013 -0400
@@ -5,7 +5,6 @@
hmmer
blast+
muscle
- biopython
antismash_python_deps
antismash
@@ -34,13 +33,15 @@
-
-
-
+
+
+
+
+
+
-
-
@@ -91,11 +92,6 @@
.. class:: infomark
-That version of antiSMASH can only handle one sequence. So multi-sequence FASTA files are not supported.
-For multiple sequences please use multi-antiSMASH. The advantage of that tool is that it will provide you with a
-archive of all results created from antiSMASH (It can be large!) and a HTML output, for better inspection.
-
-
**What it does**
antiSMASH allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes.
@@ -104,7 +100,26 @@
**Input**
-If you don't have an annotated GenBank or embl file you also can provide a glimmer prediction output. You can created it with glimmer or glimmerHMM.
+The ideal input for antiSMASH is an annotated nucleotide file in GenBank or EMBL format. If no annotation is available,
+we recommend running your sequence through an annotation pipeline such as RAST or one included in Galaxy.
+
+
+Several optional analyses can be run on your sequence.
+Highly recommended is the Gene Cluster Blast Comparative Analysis, which runs BlastP using each amino acid sequence from a detected gene cluster as a
+query on a large database of predicted protein sequences from secondary metabolite biosynthetic gene clusters, and pools the results to identify
+the gene clusters most similar to the gene cluster detected in your query nucleotide sequence.
+
+
+Also available is the analysis of secondary metabolism gene families (smCOGs).
+This analysis attempts to allocate each gene in the detected gene clusters to a secondary metabolism-specific gene
+family using profile hidden Markov models specific for the conserved sequence region characteristic of this family.
+Additionally, a phylogenetic tree is constructed for each gene together with the sequences (max. 100) of the smCOG seed alignment.
+
+
+For the most thorough genome analysis, we provide genome-wide PFAM HMM analysis of all genes in the genome through modules of the CLUSEAN pipeline.
+Some regions important to secondary metabolism may have been missed in the gene cluster identification stage
+(e.g. because they represent the biosynthetic pathway of an as yet unknown secondary metabolite).
+Therefore, when genome-wide PFAM HMM analysis is selected, the PFAM frequencies are also used to find all genome regions in which PFAM domains typical of secondary metabolism are overrepresented.
**References**
diff -r d2c785cdf23e -r d2c2eb518142 tool_dependencies.xml
--- a/tool_dependencies.xml Wed Oct 09 10:06:13 2013 -0400
+++ b/tool_dependencies.xml Wed Oct 09 11:14:23 2013 -0400
@@ -12,9 +12,6 @@
-
-
-
@@ -54,10 +51,6 @@
https://bitbucket.org/antismash/antismash2/downloads/antismash-2.0.2.x86_64.tar.bz2
-
- antismash-2.0.2/*
- $INSTALL_DIR
-
$INSTALL_DIR/run_antismash.py