# HG changeset patch
# User bgruening
# Date 1381331663 14400
# Node ID d2c2eb51814259785f3fce5011ebe85e9692788d
# Parent d2c785cdf23ea973834ce5f97fec8069c3ba0941
Uploaded

diff -r d2c785cdf23e -r d2c2eb518142 antismash.xml
--- a/antismash.xml Wed Oct 09 10:06:13 2013 -0400
+++ b/antismash.xml Wed Oct 09 11:14:23 2013 -0400
@@ -5,7 +5,6 @@
 hmmer
 blast+
 muscle
-biopython
 antismash_python_deps
 antismash
@@ -34,13 +33,15 @@
-
-
-
+
+
+
+
+
+
-
-
@@ -91,11 +92,6 @@
 .. class:: infomark

-That version of antiSMASH can only handle one sequence. So multi-sequence FASTA files are not supported.
-For multiple sequences please use multi-antiSMASH. The advantage of that tool is that it will provide you with a
-archive of all results created from antiSMASH (It can be large!) and a HTML output, for better inspection.
-
-
 **What it does**

 antiSMASH allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes.
@@ -104,7 +100,26 @@
 **Input**

-If you don't have an annotated GenBank or embl file you also can provide a glimmer prediction output. You can created it with glimmer or glimmerHMM.
+The ideal input for antiSMASH is an annotated nucleotide file in GenBank format or EMBL format. If no annotation is available,
+we recommend running your sequence through an annotation pipeline like RAST or one included in Galaxy.
+
+
+There are several optional analyses that may or may not be run on your sequence.
+Highly recommended is the Gene Cluster Blast Comparative Analysis, which runs BlastP using each amino acid sequence from a detected gene cluster as a
+query on a large database of predicted protein sequences from secondary metabolite biosynthetic gene clusters, and pools the results to identify
+the gene clusters that are most homologous to the gene cluster that was detected in your query nucleotide sequence.
+
+
+Also available is the analysis of secondary metabolism gene families (smCOGs).
+This analysis attempts to allocate each gene in the detected gene clusters to a secondary metabolism-specific gene
+family using profile hidden Markov models specific for the conserved sequence region characteristic of this family.
+Additionally, a phylogenetic tree is constructed of each gene together with the (max. 100) sequences of the smCOG seed alignment.
+
+
+For the most thorough genome analysis, we provide genome-wide PFAM HMM analysis of all genes in the genome through modules of the CLUSEAN pipeline.
+Of course, some regions important to secondary metabolism may have been missed in the gene cluster identification stage
+(e.g. because they represent the biosynthetic pathway of a yet unknown secondary metabolite).
+Therefore, when genome-wide PFAM HMM analysis is selected, the PFAM frequencies are also used to find all genome regions in which PFAM domains typical for secondary metabolism are overrepresented.

 **References**
diff -r d2c785cdf23e -r d2c2eb518142 tool_dependencies.xml
--- a/tool_dependencies.xml Wed Oct 09 10:06:13 2013 -0400
+++ b/tool_dependencies.xml Wed Oct 09 11:14:23 2013 -0400
@@ -12,9 +12,6 @@
-
-
-
@@ -54,10 +51,6 @@
 https://bitbucket.org/antismash/antismash2/downloads/antismash-2.0.2.x86_64.tar.bz2
-
-antismash-2.0.2/*
-$INSTALL_DIR
-$INSTALL_DIR/run_antismash.py