# HG changeset patch
# User jjohnson
# Date 1307480708 14400
# Node ID e076d95dbdb5dd4dfc3febb6bdb73dc11a9adfdf
# Parent c7923b34dea499425ee15908718f464941c8078b
Migrated tool version 1.19.0 from old tool shed archive to new tool shed repository
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/README
--- a/mothur/README Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/README Tue Jun 07 17:05:08 2011 -0400
@@ -1,9 +1,10 @@
Provides galaxy tools for the Mothur metagenomics package - http://www.mothur.org/wiki/Main_Page
-Install mothur v.1.16.0 on your galaxy system so galaxy can execute the mothur command
+Install mothur v.1.19.0 on your galaxy system so galaxy can execute the mothur command
+ ( This version of the wrappers is designed for Mothur version 1.19 - it may work on later versions )
http://www.mothur.org/wiki/Download_mothur
http://www.mothur.org/wiki/Installation
- ( This Galaxy iMothur wrapper will invoke Mothur in command line mode: http://www.mothur.org/wiki/Command_line_mode )
+ ( This Galaxy Mothur wrapper will invoke Mothur in command line mode: http://www.mothur.org/wiki/Command_line_mode )
TreeVector is also packaged with this Mothur package to view phylogenetic trees:
TreeVector is a utility to create and integrate phylogenetic trees as Scalable Vector Graphics (SVG) files.
@@ -102,11 +103,13 @@
+
+
@@ -114,7 +117,8 @@
-
+
+
@@ -122,6 +126,7 @@
+
@@ -137,6 +142,7 @@
+
@@ -162,7 +168,7 @@
-
+
@@ -176,15 +182,22 @@
-
+
+
+
+
+
+
+
+
@@ -205,6 +218,10 @@
+
+
+
+
@@ -212,7 +229,6 @@
-
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/lib/galaxy/datatypes/metagenomics.py
--- a/mothur/lib/galaxy/datatypes/metagenomics.py Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/lib/galaxy/datatypes/metagenomics.py Tue Jun 07 17:05:08 2011 -0400
@@ -712,12 +712,91 @@
        return False
class SequenceTaxonomy(Tabular):
-    file_ext = 'taxonomy'
+    file_ext = 'seq.taxonomy'
+    """
+    A table with 2 columns:
+      - SequenceName
+      - Taxonomy (semicolon-separated taxonomy in descending order)
+    Example:
+      X56533.1    Eukaryota;Alveolata;Ciliophora;Intramacronucleata;Oligohymenophorea;Hymenostomatida;Tetrahymenina;Glaucomidae;Glaucoma;
+      X97975.1    Eukaryota;Parabasalidea;Trichomonada;Trichomonadida;unclassified_Trichomonadida;
+      AF052717.1  Eukaryota;Parabasalidea;
+    """
    def __init__(self, **kwd):
-        """A list of names"""
        Tabular.__init__( self, **kwd )
        self.column_names = ['name','taxonomy']
+    def sniff( self, filename ):
+        """
+        Determines whether the file is a SequenceTaxonomy
+        """
+        pat = '^([^ \t\n\r\f\v;]+([(]\d+[)])?[;])+$'
+        # open outside the try block so the finally clause never sees an undefined handle
+        fh = open( filename )
+        try:
+            count = 0
+            while True:
+                line = fh.readline()
+                if not line:
+                    break #EOF
+                line = line.strip()
+                if line:
+                    fields = line.split('\t')
+                    if len(fields) != 2:
+                        return False
+                    if not re.match(pat,fields[1]):
+                        return False
+                    count += 1
+                    if count > 10:
+                        break
+            if count > 0:
+                return True
+        except:
+            pass
+        finally:
+            fh.close()
+        return False
+
+class RDPSequenceTaxonomy(SequenceTaxonomy):
+    file_ext = 'rdp.taxonomy'
+    """
+    A table with 2 columns:
+      - SequenceName
+      - Taxonomy (semicolon-separated taxonomy in descending order, RDP requires exactly 6 levels deep)
+    Example:
+      AB001518.1  Bacteria;Bacteroidetes;Sphingobacteria;Sphingobacteriales;unclassified_Sphingobacteriales;
+      AB001724.1  Bacteria;Cyanobacteria;Cyanobacteria;Family_II;GpIIa;
+      AB001774.1  Bacteria;Chlamydiae;Chlamydiae;Chlamydiales;Chlamydiaceae;Chlamydophila;
+    """
+    def sniff( self, filename ):
+        """
+        Determines whether the file is an RDPSequenceTaxonomy (exactly 6 taxonomy levels per entry)
+        """
+        pat = '^([^ \t\n\r\f\v;]+([(]\d+[)])?[;]){6}$'
+        # open outside the try block so the finally clause never sees an undefined handle
+        fh = open( filename )
+        try:
+            count = 0
+            while True:
+                line = fh.readline()
+                if not line:
+                    break #EOF
+                line = line.strip()
+                if line:
+                    fields = line.split('\t')
+                    if len(fields) != 2:
+                        return False
+                    if not re.match(pat,fields[1]):
+                        return False
+                    count += 1
+                    if count > 10:
+                        break
+            if count > 0:
+                return True
+        except:
+            pass
+        finally:
+            fh.close()
+        return False
+
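The taxonomy pattern used by both sniffers can be checked on its own; a minimal standalone sketch (plain Python, outside Galaxy; the example entries are made up):

    import re

    # Same pattern the SequenceTaxonomy sniffer uses: one or more semicolon-terminated
    # taxon names, each optionally followed by a bootstrap value in parentheses.
    PAT = r'^([^ \t\n\r\f\v;]+([(]\d+[)])?[;])+$'

    lines = [
        "X56533.1\tEukaryota;Alveolata;Ciliophora;",    # matches: valid taxonomy string
        "X97975.1\tEukaryota(100);Parabasalidea(85);",  # matches: bootstrap values allowed
        "AF052717.1\tEukaryota Alveolata",              # fails: contains a space, no trailing semicolons
    ]

    for line in lines:
        name, taxonomy = line.rstrip("\n").split("\t")
        print(name, bool(re.match(PAT, taxonomy)))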
class ConsensusTaxonomy(Tabular):
file_ext = 'cons.taxonomy'
def __init__(self, **kwd):
@@ -845,9 +924,9 @@
## Qiime Classes
-class MetadataMapping(Tabular):
+class QiimeMetadataMapping(Tabular):
    MetadataElement( name="column_names", default=[], desc="Column Names", readonly=False, visible=True, no_value=[] )
-    file_ext = 'mapping'
+    file_ext = 'qiimemapping'
    def __init__(self, **kwd):
        """
@@ -887,6 +966,144 @@
        Tabular.set_meta(self, dataset, overwrite, skip, max_data_lines)
        self.set_column_names(dataset)
+class QiimeOTU(Tabular):
+    """
+    Associates OTUs with sequence IDs
+    Example:
+      0  FLP3FBN01C2MYD  FLP3FBN01B2ALM
+      1  FLP3FBN01DF6NE  FLP3FBN01CKW1J  FLP3FBN01CHVM4
+      2  FLP3FBN01AXQ2Z
+    """
+    file_ext = 'qiimeotu'
+
+class QiimeOTUTable(Tabular):
+    """
+    #Full OTU Counts
+    #OTU ID  PC.354  PC.355  PC.356  Consensus Lineage
+    0        0       1       0       Root;Bacteria;Firmicutes;"Clostridia";Clostridiales
+    1        1       3       1       Root;Bacteria
+    2        0       2       2       Root;Bacteria;Bacteroidetes
+    """
+    MetadataElement( name="column_names", default=[], desc="Column Names", readonly=False, visible=True, no_value=[] )
+    file_ext = 'qiimeotutable'
+    def init_meta( self, dataset, copy_from=None ):
+        tabular.Tabular.init_meta( self, dataset, copy_from=copy_from )
+    def set_meta( self, dataset, overwrite = True, skip = None, **kwd ):
+        self.set_column_names(dataset)
+    def set_column_names(self, dataset):
+        if dataset.has_data():
+            dataset_fh = open( dataset.file_name )
+            line = dataset_fh.readline()
+            line = dataset_fh.readline()
+            if line.startswith('#OTU ID'):
+                dataset.metadata.column_names = line.strip().split('\t')
+            dataset_fh.close()
+            dataset.metadata.comment_lines = 2
+
+class QiimeDistanceMatrix(Tabular):
+    """
+              PC.354  PC.355  PC.356
+      PC.354  0.0     3.177   1.955
+      PC.355  3.177   0.0     3.444
+      PC.356  1.955   3.444   0.0
+    """
+    file_ext = 'qiimedistmat'
+    def init_meta( self, dataset, copy_from=None ):
+        tabular.Tabular.init_meta( self, dataset, copy_from=copy_from )
+    def set_meta( self, dataset, overwrite = True, skip = None, **kwd ):
+        self.set_column_names(dataset)
+    def set_column_names(self, dataset):
+        if dataset.has_data():
+            dataset_fh = open( dataset.file_name )
+            line = dataset_fh.readline()
+            # first line contains the names
+            dataset.metadata.column_names = line.strip().split('\t')
+            dataset_fh.close()
+            dataset.metadata.comment_lines = 1
+
+class QiimePCA(Tabular):
+    """
+    Principal Coordinate Analysis Data
+    The principal coordinate (PC) axes (columns) for each sample (rows).
+    Pairs of PCs can then be graphed to view the relationships between samples.
+    The bottom of the output file contains the eigenvalues and % variation explained for each PC.
+    Example:
+      pc vector number  1                 2                 3
+      PC.354            -0.309063936588   0.0398252112257   0.0744672231759
+      PC.355            -0.106593922619   0.141125998277    0.0780204374172
+      PC.356            -0.219869362955   0.00917241121781  0.0357281314115
+
+
+      eigvals                0.480220500471  0.163567082874  0.125594470811
+      % variation explained  51.6955484555   17.6079322939
+    """
+    file_ext = 'qiimepca'
+
+class QiimeParams(Tabular):
+    """
+    ###pick_otus_through_otu_table.py parameters###
+
+    # OTU picker parameters
+    pick_otus:otu_picking_method     uclust
+    pick_otus:clustering_algorithm   furthest
+
+    # Representative set picker parameters
+    pick_rep_set:rep_set_picking_method  first
+    pick_rep_set:sort_by                 otu
+    """
+    file_ext = 'qiimeparams'
+
+class QiimePrefs(data.Text):
+    """
+    A text file, containing coloring preferences to be used by make_distance_histograms.py, make_2d_plots.py and make_3d_plots.py.
+    Example:
+    {
+    'background_color':'black',
+
+    'sample_coloring':
+        {
+        'Treatment':
+            {
+            'column':'Treatment',
+            'colors':(('red',(0,100,100)),('blue',(240,100,100)))
+            },
+        'DOB':
+            {
+            'column':'DOB',
+            'colors':(('red',(0,100,100)),('blue',(240,100,100)))
+            }
+        },
+    'MONTE_CARLO_GROUP_DISTANCES':
+        {
+        'Treatment': 10,
+        'DOB': 10
+        }
+    }
+    """
+    file_ext = 'qiimeprefs'
+
+class QiimeTaxaSummary(Tabular):
+    """
+      Taxon                         PC.354  PC.355  PC.356
+      Root;Bacteria;Actinobacteria  0.0     0.177   0.955
+      Root;Bacteria;Firmicutes      0.177   0.0     0.444
+      Root;Bacteria;Proteobacteria  0.955   0.444   0.0
+    """
+    MetadataElement( name="column_names", default=[], desc="Column Names", readonly=False, visible=True, no_value=[] )
+    file_ext = 'qiimetaxsummary'
+
+    def set_column_names(self, dataset):
+        if dataset.has_data():
+            dataset_fh = open( dataset.file_name )
+            line = dataset_fh.readline()
+            if line.startswith('Taxon'):
+                dataset.metadata.column_names = line.strip().split('\t')
+            dataset_fh.close()
+
+    def set_meta( self, dataset, overwrite = True, skip = None, max_data_lines = None, **kwd ):
+        Tabular.set_meta(self, dataset, overwrite, skip, max_data_lines)
+        self.set_column_names(dataset)
+
if __name__ == '__main__':
import doctest, sys
doctest.testmod(sys.modules[__name__])
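The header-based column detection these Qiime datatypes share (e.g. QiimeTaxaSummary looking for a first line starting with 'Taxon') can be exercised without Galaxy; a minimal sketch, with a hypothetical file name:

    # Standalone illustration of the header detection performed in set_column_names
    # (no Galaxy objects involved; 'taxa_summary.txt' is a hypothetical file).
    def qiime_taxa_summary_columns(path):
        with open(path) as fh:
            first = fh.readline()
        # A header line starting with 'Taxon' provides the tabular column names.
        if first.startswith('Taxon'):
            return first.strip().split('\t')
        return []

    print(qiime_taxa_summary_columns('taxa_summary.txt'))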
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/suite_config.xml
--- a/mothur/suite_config.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/suite_config.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,240 +1,270 @@
-
+
Mothur metagenomics commands as Galaxy tools
-
+
Calculate the number of potentially misaligned bases
-
+
Align sequences to a template alignment
-
+
+ Analysis of molecular variance
+
+
+ Non-parametric multivariate analysis of changes in community structure
+
+
Order Sequences by OTU
-
- Generate a newick trees for dissimilarity among groups
-
-
+
Find putative chimeras using bellerophon
-
+
Find putative chimeras using ccode
-
+
Find putative chimeras using chimeraCheck
-
+
Find putative chimeras using pintail
-
+
Find putative chimeras using slayer
-
+
Trim sequences to a specified length
-
+
Assign sequences to taxonomy
-
+
Assign sequences to taxonomy
-
+
Generate a tree using relaxed neighbor joining
-
+
Assign sequences to OTUs (Dotur implementation)
-
+
Group sequences that are part of a larger sequence
-
+
Assign sequences to OTUs (Operational Taxonomic Unit) splits large matrices
-
+
Assign sequences to OTUs (Operational Taxonomic Unit)
-
+
Generate collector's curves for calculators on OTUs
-
- Summary of calculator values for OTUs
+
+ Generate collector's curves for OTUs
-
+
Find a consensus sequence for each OTU or phylotype
-
+
correlation of data to axes
-
+
Remove gap characters from sequences
-
+
Return all sequences
-
+
calculate uncorrected pairwise distances between aligned sequences
-
+
Generate a phylip-formatted dissimilarity distance matrix among multiple groups
-
+
Convert fastq to fasta and quality
-
+
removes columns from alignments
-
+
Select groups
-
+
group names from shared or from list and group
-
+
Picks by taxon
-
+
+ Get otus for each distance in an otu list
+
+
Generate a fasta with a representative sequence for each OTU
-
+
Get otus containing sequences from specified groups
-
+
+ Get rabund from an otu list or sabund
+
+
Calculate the relative abundance of each otu
-
+
+ Get sabund from an otu list or rabund
+
+
Picks sequences by name
-
+
+ Get shared sequences at each distance from list and group
+
+
Assign sequences to OTUs (Operational Taxonomic Unit)
-
+
Generate a heatmap for OTUs
-
+
Generate a heatmap for pairwise similarity
-
+
+ Homogeneity of molecular variance
+
+
Identify indicator "species" for nodes on a tree
-
+
Cramer-von Mises tests communities for the same structure
-
- Lists the names of the sequences
+
+ Lists the names (accnos) of the sequences
-
+
Assign groups to Sets
-
+
+ Convert fasta and quality to fastq
+
+
Make a group file
-
+
+ Make a shared file from a list and a group
+
+
+ Mantel correlation coefficient between two matrices.
+
+
Merge data
-
+
Merge groups in a shared file
-
+
generate principal components plot data
-
+
generate non-metric multidimensional scaling data
-
+
Normalize the number of sequences per group to a specified level
-
+
+ Relate OTUs at different distances
+
+
calculate uncorrected pairwise distances between sequences
-
- Order Sequences by OTU
+
+ Generate a List file for each group
-
+
Describes whether two or more communities have the same structure
-
- generate principle components plot data
+
+ Principal Coordinate Analysis for a shared file
-
- Principal Coordinate Analysis
+
+ Principal Coordinate Analysis for a distance matrix
-
- Alpha Diversity calculate unique branch length
+
+ Alpha Diversity calculates unique branch length
-
+
Assign sequences to OTUs based on taxonomy
-
+
Remove sequences due to pyrosequencing errors
-
+
Generate inter-sample rarefaction curves for OTUs
-
+
Generate intra-sample rarefaction curves for OTUs
-
- Read OTU list and group to create a shared file
+
+ Remove groups from groups,fasta,names,list,taxonomy
-
- Remove groups
-
-
+
Picks by taxon
-
+
Remove otus containing sequences from specified groups
-
+
Remove rare OTUs
-
+
Remove sequences by name
-
+
Reverse complement the sequences
-
+
Screen sequences
-
+
+ Determine the quality of OTU assignment
+
+
Summarize the quality of sequences
-
+
Separate sequences into rare and abundant groups
-
+
Generates a fasta file for each group
-
+
Create a sub sample
-
+
Summarize the quality of sequences
-
+
Summary of calculator values for OTUs
-
+
Summary of calculator values for OTUs
-
+
Generate a newick tree for dissimilarity among groups
Draw a Phylogenic Tree
-
+
Trim sequences - primers, barcodes, quality
-
+
Describes whether two or more communities have the same structure
-
+
Describes whether two or more communities have the same structure
-
+
Return unique sequences
-
- Generate Venn diagrams gor groups
+
+ Generate Venn diagrams for groups
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tool-data/mothur_calculators.loc
--- a/mothur/tool-data/mothur_calculators.loc Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tool-data/mothur_calculators.loc Tue Jun 07 17:05:08 2011 -0400
@@ -26,16 +26,20 @@
chao single sing Community richness the Chao1 estimator
jack single sing Community richness the jackknife estimator
sobs single sing Community richness the observed richness
+##Community evenness
+simpsoneven single sing Community evenness a Simpson index-based measure of evenness
+shannoneven single sing Community evenness a Shannon index-based measure of evenness
+heip single sing Community evenness Heip's metric of community evenness
+smithwilson single sing Community evenness Smith and Wilson's metric of community evenness
##Community diversity
bergerparker single xxxx Community diversity the Berger-Parker index
-coverage single sing Community diversity the sampling coverage coverage
+coverage single sing Community diversity the sampling coverage
+goodscoverage single sing Community diversity the Good's estimate of sampling coverage
invsimpson single sing Community diversity the Simpson index
npshannon single sing Community diversity the non-parametric Shannon index
qstat single xxxx Community diversity the Q statistic
shannon single sing Community diversity the Shannon index
simpson single sing Community diversity the Simpson index
-simpsoneven single sing Community diversity the Simpson index
-smithwilson single sing Smith and Wilson's metric of community evenness
##Estimates of number of additional OTUs observed with extra sampling
boneh single xxxx Estimator Boneh's estimator
efron single xxxx Estimator Efron's estimator
@@ -55,6 +59,7 @@
jest shared shar Community Membership Similarity the Jaccard similarity coefficient based on the Chao1 estimated richnesses
kulczynski shared xxxx Community Membership Similarity the Kulczynski similarity coefficient
kulczynskicody shared xxxx Community Membership Similarity the Kulczynski-Cody similarity coefficient
+kstest shared xxxx Kolmogorov-Smirnov test
lennon shared xxxx Community Membership Similarity the Lennon similarity coefficient
ochiai shared xxxx Community Membership Similarity the Ochiai similarity coefficient
sorclass shared shar Community Membership Similarity the Sorenson similarity coefficient based on the observed richness
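Each non-comment entry above appears to carry five tab-separated fields (calculator name, single/shared applicability, a flag, category, description); a small parser sketch under that assumption, skipping the ## category headers:

    import csv

    def read_calculators(path="mothur_calculators.loc"):
        """Yield (name, applies_to, flag, category, description) tuples,
        skipping blank lines and '##' category comments."""
        with open(path) as fh:
            for row in csv.reader(fh, delimiter="\t"):
                if not row or row[0].startswith("#"):
                    continue
                # pad short rows so the unpacking below never fails
                row = (row + [""] * 5)[:5]
                name, applies_to, flag, category, description = row
                yield name, applies_to, flag, category, description

    # e.g. collect only the single-sample evenness calculators
    # evenness = [r[0] for r in read_calculators() if r[3] == "Community evenness"]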
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/align.check.xml
--- a/mothur/tools/mothur/align.check.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/align.check.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Calculate the number of potentially misaligned bases
mothur_wrapper.py
@@ -6,16 +6,27 @@
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.align\.check$:'$out_file
--outputdir='$logfile.extra_files_path'
--fasta=$fasta
- --map=$map
+ --map=$ss.map
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
@@ -37,8 +48,9 @@
**Command Documentation**
-The align.check_ command allows you to calculate the number of potentially misaligned bases in a 16S rRNA gene sequence alignment.
+The align.check_ command allows you to calculate the number of potentially misaligned bases in a 16S rRNA gene sequence alignment using a secondary_structure_map_. If you are familiar with the editor window in ARB, this is the same as counting the number of ~, #, -, and = signs.
+.. _secondary_structure_map: http://www.mothur.org/wiki/Secondary_structure_map
.. _align.check: http://www.mothur.org/wiki/Align.check
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/align.seqs.xml
--- a/mothur/tools/mothur/align.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/align.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,12 +1,12 @@
-
+
Align sequences to a template alignment
mothur_wrapper.py
--cmd='align.seqs'
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.align$:'$out_file,'^\S+\.align\.report$:'$report
--outputdir='$logfile.extra_files_path'
- --candidate=$candidate
- --template=$alignment.template
+ --fasta=$candidate
+ --reference=$alignment.template
#if $search.method == 'kmer':
--ksize=$search.ksize
#else:
@@ -26,22 +26,22 @@
--processors=2
-
+
-
-
+
+
-
+
-
-
+
+
@@ -111,8 +111,16 @@
**Command Documentation**
-The align.seqs_ command aligns a user-supplied fasta-formatted candidate sequence file to a user-supplied fasta-formatted template alignment.
+The align.seqs_ command aligns a user-supplied fasta-formatted candidate sequence file to a user-supplied fasta-formatted template_alignment_.
+The general approach is to
+ i) find the closest template for each candidate using kmer searching, blastn, or suffix tree searching;
+ ii) make a pairwise alignment between the candidate and de-gapped template sequences using the Needleman-Wunsch, Gotoh, or blastn algorithms; and
+ iii) re-insert gaps into the candidate and template pairwise alignments using the NAST algorithm so that the candidate sequence alignment is compatible with the original template alignment.
+
+In general the alignment is very fast - we are able to align over 186,000 full-length sequences to the SILVA alignment in less than 3 hrs with a quality as good as the SINA aligner. Furthermore, this rate can be accelerated using multiple processors. While the aligner doesn't explicitly take into account the secondary structure of the 16S rRNA gene, if the template database is based on the secondary structure, then the resulting alignment will at least be implicitly based on the secondary structure.
+
+.. _template_alignment: http://www.mothur.org/wiki/Alignment_database
.. _align.seqs: http://www.mothur.org/wiki/Align.seqs
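Step i) of the align.seqs approach above (kmer searching for the closest template) can be pictured with a short sketch; this only illustrates the idea, is not Mothur's implementation, and the template sequences are made up:

    def kmers(seq, k=8):
        """Set of all k-length substrings of an (unaligned) sequence."""
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    def closest_template(candidate, templates, k=8):
        """Return the template name sharing the most kmers with the candidate."""
        cand = kmers(candidate, k)
        return max(templates, key=lambda name: len(cand & kmers(templates[name], k)))

    templates = {"tmpl_A": "ACGTACGTACGTTTGACGT", "tmpl_B": "GGGCCCGGGCCCGGGAAAT"}
    print(closest_template("ACGTACGTTTGACGTACGT", templates, k=6))   # -> 'tmpl_A'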
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/amova.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/amova.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,63 @@
+
+ Analysis of molecular variance
+
+ mothur_wrapper.py
+ --cmd='amova'
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.amova$:'$amova
+ --outputdir='$logfile.extra_files_path'
+ --phylip=$dist
+ --design=$design
+ #if int($iters.__str__) > 0:
+ --iters=$iters
+ #end if
+ #if float($alpha.__str__) > 0.0:
+ --alpha=$alpha
+ #end if
+
+
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The amova_ command calculates the analysis of molecular variance, a nonparametric analog of traditional analysis of variance, from a phylip_distance_matrix_. This method is widely used in population genetics to test the hypothesis that genetic diversity within two populations is not significantly different from that which would result from pooling the two populations.
+
+A design file partitions a list of names into groups. It is a tab-delimited file with 2 columns: name and group, e.g. :
+ ======= =======
+ duck bird
+ cow mammal
+ pig mammal
+ goose bird
+ cobra reptile
+ ======= =======
+
+The Make_Design tool can construct a design file from a Mothur dataset that contains group names.
+
+
+.. _phylip_distance_matrix: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
+.. _amova: http://www.mothur.org/wiki/Amova
+
+
+
+
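The design file described above is just tab-delimited name/group pairs, so it is easy to produce or inspect outside the Make_Design tool; a small sketch with made-up names:

    assignments = {"duck": "bird", "goose": "bird", "cow": "mammal", "pig": "mammal", "cobra": "reptile"}

    # Write a Mothur design file: one "name<TAB>group" line per name.
    with open("toy.design", "w") as out:
        for name, group in sorted(assignments.items()):
            out.write("%s\t%s\n" % (name, group))

    # Read it back into a group -> [names] mapping.
    groups = {}
    with open("toy.design") as fh:
        for line in fh:
            name, group = line.rstrip("\n").split("\t")
            groups.setdefault(group, []).append(name)
    print(groups)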
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/anosim.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/anosim.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,63 @@
+
+ Non-parametric multivariate analysis of changes in community structure
+
+ mothur_wrapper.py
+ --cmd='anosim'
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.anosim$:'$anosim
+ --outputdir='$logfile.extra_files_path'
+ --phylip=$dist
+ --design=$design
+ #if int($iters.__str__) > 0:
+ --iters=$iters
+ #end if
+ #if float($alpha.__str__) > 0.0:
+ --alpha=$alpha
+ #end if
+
+
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The anosim_ command uses a phylip_distance_matrix_ and a design file to calculate the non-parametric multivariate analysis of changes in community structure.
+
+A design file partitions a list of names into groups. It is a tab-delimited file with 2 columns: name and group, e.g. :
+ ======= =======
+ duck bird
+ cow mammal
+ pig mammal
+ goose bird
+ cobra reptile
+ ======= =======
+
+The Make_Design tool can construct a design file from a Mothur dataset that contains group names.
+
+
+.. _phylip_distance_matrix: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
+.. _anosim: http://www.mothur.org/wiki/Anosim
+
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/bin.seqs.xml
--- a/mothur/tools/mothur/bin.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/bin.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Order Sequences by OTU
mothur_wrapper.py
@@ -8,18 +8,20 @@
--datasetid='$logfile.id' --new_file_path='$__new_file_path__'
--new_datasets='^\S+?\.(unique|[0-9.]*)\.fasta$:fasta'
--fasta=$fasta
- --READ_cmd='read.otu'
- --READ_list=$otu
+ --list=$otu
#if $name.__str__ != "None" and len($name.__str__) > 0:
--name=$name
#end if
#if $label.__str__ != "None" and len($label.__str__) > 0:
--label='$label'
#end if
+ #if $group.__str__ != "None" and len($group.__str__) > 0:
+ --group='$group'
+ #end if
-
+
@@ -27,6 +29,7 @@
+
@@ -47,8 +50,9 @@
**Command Documentation**
-The bin.seqs_ command prints out a fasta-formatted file where sequences are ordered according to the OTU that they belong to. Such an output may be helpful for generating primers specific to an OTU or for classification of sequences.
+The bin.seqs_ command generates fasta-formatted files where sequences are ordered according to the OTU from the list_file_ that they belong to. Such an output may be helpful for generating primers specific to an OTU or for classification of sequences.
+.. _list_file: http://www.mothur.org/wiki/List_file
.. _bin.seqs: http://www.mothur.org/wiki/Bin.seqs
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/chimera.bellerophon.xml
--- a/mothur/tools/mothur/chimera.bellerophon.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/chimera.bellerophon.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Find putative chimeras using bellerophon
mothur_wrapper.py
@@ -46,8 +46,25 @@
**Command Documentation**
-The chimera.bellerophon_ command identifies putative chimeras using the bellerophon approach.
+The chimera.bellerophon_ command identifies putative chimeras using the bellerophon_ approach.
+
+Advantages of Bellerophon:
+
+ 1) You can process all sequences from a PCR-clone library in a single analysis and don't have to inspect outputs for every sequence in the dataset.
+ 2) The approximate putative breakpoint is calculated using a sliding window (see above) and will help verification of the chimera manually.
+ 3) A chimeric sequence is not only tested against two (putative) parent sequences but rather is assessed by how well it fits into the complete phylogenetic environment of a multiple sequence alignment. Hence sequences do not become invisible to the program as is the case with CHIMERA_CHECK (see Ref 1 below).
+ 4) The calculations Bellerophon uses to detect chimeric sequences are computationally relatively cheap and results are quickly calculated for datasets with up to 50 sequences (~1 min). Larger datasets take longer - 100 sequences ~30 min, 300 sequences ~8 hours.
+Tips for using Bellerophon:
+
+ 1) Bellerophon works most efficiently if the parent sequences or non-chimeric sequences closely related to the parent sequences are present in the dataset analyzed. Therefore, as many sequences as possible from the one PCR-clone library should be included in the analysis since the parent sequences of any chimera are most likely to be in that dataset. Addition of non-chimeric outgroup sequences (e.g. from isolates) may help refine an analysis by providing reference points (and a broader phylogenetic context) in the analysis, but be aware of increasing analysis time with bigger datasets.
+ 2) Bellerophon is compromised by using sequences of different lengths as this can produce artificial skews in distance matrices of fragments of the alignment. Datasets containing sequences of the same length and covering the same portion of the gene should be used (usually not an issue with sequences from a PCR-clone library). The filter will automatically remove sequences too short for the window size, i.e. less than 600 bp for a window size of 300.
+ 3) If possible multiple window sizes should be used as the number of identified chimeras can vary with the choice of the window size.
+ 4) Re-running the dataset without the first reported chimeras may identify additional putative chimeras by reducing noise in the analysis. Ideally, the dataset should continue to be re-run removing previously reported chimeras until no chimeras are identified.
+ 5) Bellerophon should be used in concert with other detection methods such as CHIMERA_CHECK and putatively identified chimeras should always be confirmed by manual inspection of the sequences for signature shifts.
+
+
+.. _bellerophon: http://comp-bio.anu.edu.au/Bellerophon/doc/doc.html
.. _chimera.bellerophon: http://www.mothur.org/wiki/Chimera.bellerophon
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/chimera.ccode.xml
--- a/mothur/tools/mothur/chimera.ccode.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/chimera.ccode.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Find putative chimeras using ccode
mothur_wrapper.py
@@ -6,7 +6,7 @@
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.ccode\.chimeras$:'$out_file,'^\S+\.ccode\.accnos$:'$out_accnos
--outputdir='$logfile.extra_files_path'
--fasta=$fasta
- --template=$alignment.template
+ --reference=$alignment.template
$filter
#if $mask.source == 'default':
--mask=default
@@ -22,14 +22,14 @@
--processors=2
-
+
-
+
-
+
@@ -37,7 +37,7 @@
-
+
@@ -46,12 +46,12 @@
-
+
-
+
+
Find putative chimeras using chimeraCheck
mothur_wrapper.py
@@ -6,7 +6,7 @@
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.chimeracheck\.chimeras$:'$out_file
--outputdir='$logfile.extra_files_path'
--fasta=$fasta
- --template=$alignment.template
+ --reference=$alignment.template
#if int($ksize.__str__) > 0:
--ksize=$ksize
#end if
@@ -15,23 +15,25 @@
#end if
#if $svg.gen == 'yes':
--svg=true
- --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^(\S+)\.chimeracheck\.svg$:svg'
- #if $name.__str__ != "None" and len($name.__str__) > 0:
- --name='$name'
+ #if $svg.name.__str__ != "None" and len($svg.name.__str__) > 0:
+ --name='$svg.name'
+ #end if
+ #if $svg.as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^(\S+)\.chimeracheck\.svg$:svg'
#end if
#end if
--processors=2
-
+
-
+
-
+
@@ -39,7 +41,7 @@
-
+
@@ -53,6 +55,7 @@
+
@@ -76,7 +79,9 @@
**Command Documentation**
-The chimera.check_ command identifies putative chimeras using the chimeraCheck approach. Note: following the RDP model this method does not determine whether or not a sequence is chimeric, but allows you to determine that based on the IS values produced.
+The chimera.check_ command identifies putative chimeras using the chimeraCheck approach. Over several windows, it looks at the distance from the left side of the query to its closest match, plus the distance from the right side of the query to its closest match, minus the distance of the whole query to its closest match.
+
+Note: following the RDP model this method does not determine whether or not a sequence is chimeric, but allows you to determine that based on the IS values produced.
.. _chimera.check: http://www.mothur.org/wiki/Chimera.check
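The left/right window comparison described above amounts to a simple sum and difference of distances; a toy sketch of that arithmetic, assuming equal-length aligned sequences and a plain mismatch count as the distance (this is not the chimeraCheck implementation):

    def mismatches(a, b):
        # toy distance: count of mismatched positions over the aligned region
        return sum(1 for x, y in zip(a, b) if x != y)

    def is_values(query, references, window):
        """For each window split point, compute
        dist(left, closest ref left) + dist(right, closest ref right)
        - dist(whole query, closest ref overall)."""
        whole = min(mismatches(query, r) for r in references)
        scores = []
        for split in range(window, len(query), window):
            left = min(mismatches(query[:split], r[:split]) for r in references)
            right = min(mismatches(query[split:], r[split:]) for r in references)
            scores.append(left + right - whole)
        return scores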
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/chimera.pintail.xml
--- a/mothur/tools/mothur/chimera.pintail.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/chimera.pintail.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Find putative chimeras using pintail
mothur_wrapper.py
@@ -9,7 +9,7 @@
#set results = $results + ["'^\S+\.pintail\.accnos$:'" + $out_accnos.__str__]
--outputdir='$logfile.extra_files_path'
--fasta=$fasta
- --template=$alignment.template
+ --reference=$alignment.template
$filter
#if $mask.source == 'default':
--mask=default
@@ -38,12 +38,12 @@
-
+
-
+
@@ -51,7 +51,7 @@
-
+
@@ -137,8 +137,26 @@
**Command Documentation**
-The chimera.pintail_ command identifies putative chimeras using the pintail approach.
+The chimera.pintail_ command identifies putative chimeras using the pintail approach. It looks at the variation between the expected differences and the observed differences in the query sequence over several windows.
+
+This method was written using the algorithms described in the paper_ "At Least 1 in 20 16S rRNA Sequence Records Currently Held in the Public Repositories is Estimated To Contain Substantial Anomalies" by Kevin E. Ashelford, Nadia A. Chuzhanova, John C. Fry, Antonia J. Jones and Andrew J. Weightman.
+
+
+From www.bioinformatics-toolkit.org_
+
+The Pintail algorithm is a technique for determining whether a 16S rDNA sequence is anomalous. It is based on the idea that the extent of local base differences between two aligned 16S rDNA sequences should be roughly the same along the length of the alignment (having allowed for the underlying pattern of hypervariable and conserved regions known to exist within the 16S rRNA gene). In other words, evolutionary distance between two reliable sequences should be constant along the length of the gene.
+In contrast, if an error-free sequence is compared with an anomalous sequence, evolutionary distance along the alignment is unlikely to be constant, especially if the anomaly in question is a chimera and formed from phylogenetically different parental sequences.
+
+The Pintail algorithm is designed to detect and quantify such local variations and in doing so generates the Deviation from Expectation (DE) statistic. The higher the DE value, the greater the likelihood that the query is anomalous.
+
+The algorithm works as follows
+
+The sequence to be checked (the query) is first globally aligned with a phylogenetically similar sequence known to be error-free (the subject). At regular intervals along the resulting alignment, the local evolutionary distance between query and subject is estimated by recording percentage base mismatches within a sampling window of fixed length. The resulting array of percentages (observed percentage differences) reflects variations in evolutionary distance between the query and subject along the length of the 16S rRNA gene. Subtracting observed percentage differences from an equivalent array of expected percentage differences (predicted values for error-free sequences), we obtain a set of deviations, the standard deviation of which (Deviation from Expectation, DE) summarises the variation between observed and expected datasets. The greater the DE value, the greater the disparity there is between observed and expected percentage differences, and the more likely it is that the query sequence is anomalous.
+
+
+.. _paper: http://www.ncbi.nlm.nih.gov/pubmed/16332745
+.. _www.bioinformatics-toolkit.org: http://www.bioinformatics-toolkit.org/Help/Topics/pintailAlgorithm.html
.. _chimera.pintail: http://www.mothur.org/wiki/Chimera.pintail
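The DE statistic described above can be sketched in a few lines, assuming the query and subject are already aligned to the same length and that an array of expected percentage differences is supplied; this is only an illustration of the calculation, not Pintail itself:

    def percent_diff(a, b):
        # percentage of mismatched positions, ignoring positions where both have a gap (toy version)
        pairs = [(x, y) for x, y in zip(a, b) if not (x == '-' and y == '-')]
        return 100.0 * sum(1 for x, y in pairs if x != y) / len(pairs)

    def deviation_from_expectation(query, subject, expected, window=300, step=50):
        """Observed windowed %-differences minus expected ones; DE is their standard deviation."""
        observed = [percent_diff(query[i:i + window], subject[i:i + window])
                    for i in range(0, len(query) - window + 1, step)]
        deviations = [o - e for o, e in zip(observed, expected)]
        mean = sum(deviations) / len(deviations)
        return (sum((d - mean) ** 2 for d in deviations) / len(deviations)) ** 0.5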
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/chimera.slayer.xml
--- a/mothur/tools/mothur/chimera.slayer.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/chimera.slayer.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Find putative chimeras using slayer
mothur_wrapper.py
@@ -8,12 +8,12 @@
--tmpdir='${logfile.extra_files_path}/input'
--fasta=$fasta
#if $alignment.source == 'self':
- --template='self'
+ --reference='self'
#if $alignment.name.__str__ != "None" and len($alignment.name.__str__) > 0:
--name=$alignment.name
#end if
#else:
- --template=$alignment.template
+ --reference=$alignment.template
#end if
#if $options.setby == 'user':
--search=$options.search
@@ -30,19 +30,20 @@
--minsnp=$options.minsnp
--divergence=$options.divergence
$options.trim
+ $options.split
#end if
--processors=2
-
+
-
+
@@ -50,7 +51,7 @@
-
+
@@ -75,12 +76,13 @@
-
+
+
@@ -107,6 +109,20 @@
The chimera.slayer_ command identifies putative chimeras using the slayer approach.
+ChimeraSlayer_ is a chimeric sequence detection utility, compatible with near-full length Sanger sequences and shorter 454-FLX sequences (~500 bp).
+
+Chimera Slayer involves the following series of steps that operate to flag chimeric 16S rRNA sequences:
+
+ (A) the ends of a query sequence are searched against an included database of reference chimera-free 16S sequences to identify potential parents of a chimera;
+ (B) candidate parents of a chimera are selected as those that form a branched best scoring alignment to the NAST-formatted query sequence;
+ (C) the NAST alignment of the query sequence is improved in a `chimera-aware' profile-based NAST realignment to the selected reference parent sequences; and
+ (D) an evolutionary framework is used to flag query sequences found to exhibit greater sequence homology to an in silico chimera formed between any two of the selected reference parent sequences.
+
+Note:
+It is not recommended to blindly discard all sequences flagged as chimeras. Some may represent naturally formed chimeras that do not represent PCR artifacts. Sequences flagged may warrant further investigation.
+
+
+.. _ChimeraSlayer: http://microbiomeutil.sourceforge.net/
.. _chimera.slayer: http://www.mothur.org/wiki/Chimera.slayer
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/chop.seqs.xml
--- a/mothur/tools/mothur/chop.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/chop.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Trim sequences to a specified length
mothur_wrapper.py
@@ -12,7 +12,7 @@
$short
-
+
@@ -23,7 +23,7 @@
-
+
mothur
@@ -41,7 +41,7 @@
**Command Documentation**
-The chop.seqs_ command reads a fasta file and outputs a .chop.fasta containing the trimmed sequences. It works on both aligned and unaligned sequences.
+The chop.seqs_ command reads a fasta file of sequences and outputs a .chop.fasta file containing the trimmed sequences. It works on both aligned and unaligned sequences.
.. _chop.seqs: http://www.mothur.org/wiki/Chop.seqs
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/classify.otu.xml
--- a/mothur/tools/mothur/classify.otu.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/classify.otu.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Assign sequences to taxonomy
mothur_wrapper.py
@@ -6,7 +6,7 @@
--result='^mothur.\S+\.logfile$:'$logfile
--outputdir='$logfile.extra_files_path'
--datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+?\.(unique|[0-9.]*\.cons\.taxonomy)$:cons.taxonomy'
+ --new_datasets='^\S+?\.(unique|[0-9.]*\.cons\.taxonomy)$:cons.taxonomy','^\S+?\.(unique|[0-9.]*\.cons\.tax\.summary)$:tax.summary'
--list=$otu
--taxonomy=$tax.taxonomy
#if $reftax.source != 'none' and len($reftax.taxonomy.__str__) > 0:
@@ -18,6 +18,10 @@
#if $label.__str__ != "None" and len($label.__str__) > 0:
--label='$label'
#end if
+ #if $group.__str__ != "None" and len($group.__str__) > 0:
+ --group='$group'
+ #end if
+ --basis=$basis
$probs
@@ -28,7 +32,7 @@
-
+
@@ -36,7 +40,7 @@
-
+
@@ -47,7 +51,7 @@
-
+
@@ -55,7 +59,7 @@
-
+
@@ -68,6 +72,11 @@
+
+
+
+
+
@@ -90,6 +99,8 @@
The classify.otu_ command assigns sequences to chosen taxonomy outline.
+The basis parameter allows you to indicate what you want the summary file to represent; options are otu and sequence, with otu as the default. For example, basis=sequence could give Clostridiales 3 105 16 43 46, where 105 is the total number of sequences whose otus classified to Clostridiales, 16 is the number of sequences in those otus from groupA, 43 is the number from groupB, and 46 is the number from groupC. With basis=otu the same line could read Clostridiales 3 7 6 1 2, where 7 is the number of otus that classified to Clostridiales, 6 is the number of otus containing sequences from groupA, 1 is the number containing sequences from groupB, and 2 is the number containing sequences from groupC.
+
.. _classify.otu: http://www.mothur.org/wiki/Classify.otu
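The difference between the two basis settings is only in what gets counted per taxon; a toy sketch of the book-keeping (made-up OTU assignments, not the classify.otu code):

    from collections import defaultdict

    # otu -> (taxon it classified to, {group: number of sequences from that group})
    otus = {
        "Otu01": ("Clostridiales", {"groupA": 10, "groupB": 0, "groupC": 4}),
        "Otu02": ("Clostridiales", {"groupA": 3,  "groupB": 8, "groupC": 0}),
        "Otu03": ("Bacteroidales", {"groupA": 0,  "groupB": 5, "groupC": 1}),
    }

    by_otu, by_seq = defaultdict(lambda: defaultdict(int)), defaultdict(lambda: defaultdict(int))
    for taxon, counts in otus.values():
        for group, n in counts.items():
            by_seq[taxon][group] += n         # basis=sequence: count sequences
            if n > 0:
                by_otu[taxon][group] += 1     # basis=otu: count otus containing the group

    print(dict(by_seq["Clostridiales"]))   # {'groupA': 13, 'groupB': 8, 'groupC': 4}
    print(dict(by_otu["Clostridiales"]))   # {'groupA': 2, 'groupB': 1, 'groupC': 1}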
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/classify.seqs.xml
--- a/mothur/tools/mothur/classify.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/classify.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Assign sequences to taxonomy
mothur_wrapper.py
@@ -6,7 +6,7 @@
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.taxonomy$:'$taxonomy_out,'^\S+\.tax\.summary$:'$tax_summary
--outputdir='$logfile.extra_files_path'
--fasta=$fasta
- --template=$alignment.template
+ --reference=$alignment.template
--taxonomy=$tax.taxonomy
#if $classify.method == 'bayesian':
--method=$classify.method
@@ -43,12 +43,12 @@
-
+
-
+
@@ -56,7 +56,7 @@
-
+
@@ -65,7 +65,7 @@
-
+
@@ -73,7 +73,7 @@
-
+
@@ -115,7 +115,7 @@
-
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/clearcut.xml
--- a/mothur/tools/mothur/clearcut.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/clearcut.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Generate a tree using relaxed neighbor joining
mothur_wrapper.py
@@ -81,6 +81,13 @@
The clearcut_ command runs clearcut
+The clearcut command allows mothur users to run the clearcut_program_ from within mothur. The clearcut program was written by the Initiative for Bioinformatics and Evolutionary Studies (IBEST) at the University of Idaho. For more information about clearcut, please refer to http://bioinformatics.hungry.com/clearcut/
+
+Clearcut is a stand-alone reference implementation of relaxed neighbor joining (RNJ).
+
+Clearcut is capable of taking either a distance matrix or a multiple sequence alignment (MSA) as input. If necessary, Clearcut will compute corrected distances based on a configurable distance correction model (Jukes-Cantor or Kimura). Clearcut outputs a phylogenetic tree in Newick format and an optional corrected distance matrix.
+
+.. _clearcut_program: http://bioinformatics.hungry.com/clearcut/
.. _clearcut: http://www.mothur.org/wiki/Clearcut
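For the Jukes-Cantor correction mentioned above, the corrected distance follows directly from the uncorrected proportion of differing sites; a one-function sketch of the standard JC69 formula (not taken from the Clearcut source):

    import math

    def jukes_cantor(p):
        """Jukes-Cantor corrected distance from an uncorrected proportion of differing sites p."""
        if p >= 0.75:
            raise ValueError("p must be < 0.75 for the JC correction to be defined")
        return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

    print(round(jukes_cantor(0.10), 4))   # ~0.1073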
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/cluster.classic.xml
--- a/mothur/tools/mothur/cluster.classic.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/cluster.classic.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,57 +1,39 @@
-
+
Assign sequences to OTUs (Dotur implementation)
mothur_wrapper.py
--cmd='cluster.classic'
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.[fna]n\.sabund$:'$sabund,'^\S+\.[fna]n\.rabund$:'$rabund,'^\S+\.[fna]n\.list$:'$otulist
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.dist'
- #if $matrix.format == "column":
- --READ_column=$matrix.dist
- --READ_name=$matrix.name
- #elif $matrix.format == "phylip":
- --READ_phylip=$matrix.dist
- #if $matrix.name.__str__ != "None" and len($matrix.name.__str__) > 0:
- --READ_name=$matrix.name
- #end if
- #end if
- $sim
- #if float($cutoff.__str__) > 0.0:
- --READ_cutoff=$cutoff
- #end if
- $hard
- #if len($precision.__str__) > 0:
- --READ_precision=$precision
+ --phylip=$dist
+ #if $name.__str__ != "None" and len($name.__str__) > 0:
+ --name=$name
#end if
#if len($method.__str__) > 0:
--method=$method
#end if
+ #if float($cutoff.__str__) > 0.0:
+ --cutoff=$cutoff
+ #end if
+ $hard
+ #if len($precision.__str__) > 0:
+ --precision=$precision
+ #end if
+ $sim
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
-
+
-
+
-
@@ -62,9 +44,8 @@
-
-
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/cluster.fragments.xml
--- a/mothur/tools/mothur/cluster.fragments.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/cluster.fragments.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Group sequences that are part of a larger sequence
mothur_wrapper.py
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/cluster.split.xml
--- a/mothur/tools/mothur/cluster.split.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/cluster.split.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,9 +1,9 @@
-
+
Assign sequences to OTUs (Operational Taxonomic Unit) splits large matrices
mothur_wrapper.py
--cmd='cluster.split'
- --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.sabund$:'$sabund,'^\S+\.rabund$:'$rabund,'^\S+\.list$:'$otulist
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.sabund$:'$sabund,'^\S+\.rabund$:'$rabund,'^\S+\.list$:'$otulist,'^\S+\.dist$:'$dist_out
--outputdir='$logfile.extra_files_path'
#if $splitby.splitmethod == "distance":
#if $splitby.matrix.format == "column":
@@ -72,26 +72,26 @@
-
+
-
+
-
+
-
+
-
@@ -110,6 +110,9 @@
+
+ splitby.splitmethod == 'fasta'
+
mothur
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/cluster.xml
--- a/mothur/tools/mothur/cluster.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/cluster.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,27 +1,26 @@
-
+
Assign sequences to OTUs (Operational Taxonomic Unit)
mothur_wrapper.py
--cmd='cluster'
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.[fna]n\.sabund$:'$sabund,'^\S+\.[fna]n\.rabund$:'$rabund,'^\S+\.[fna]n\.list$:'$otulist
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.dist'
#if $matrix.format == "column":
- --READ_column=$matrix.dist
- --READ_name=$matrix.name
+ --column=$matrix.dist
+ --name=$matrix.name
#elif $matrix.format == "phylip":
- --READ_phylip=$matrix.dist
+ --phylip=$matrix.dist
#if $matrix.name.__str__ != "None" and len($matrix.name.__str__) > 0:
- --READ_name=$matrix.name
+ --name=$matrix.name
#end if
#end if
$sim
#if float($cutoff.__str__) > 0.0:
- --READ_cutoff=$cutoff
+ --cutoff=$cutoff
#end if
$hard
#if len($precision.__str__) > 0:
- --READ_precision=$precision
+ --precision=$precision
#end if
#if len($method.__str__) > 0:
--method=$method
@@ -34,24 +33,24 @@
-
-
+
+
-
-
+
+
-
+
-
+
-
@@ -62,7 +61,7 @@
-
@@ -88,8 +87,14 @@
**Command Documentation**
-The cluster_ command assign sequences to OTUs (Operational Taxonomy Unit).
+The cluster_ command assigns sequences to OTUs (Operational Taxonomic Units). The assignment is based on a phylip-formatted_distance_matrix_ or a column-formatted_distance_matrix_ and name_ file. It generates a list_, a sabund_ (Species Abundance), and a rabund_ (Ranked Abundance) file.
+.. _phylip-formatted_distance_matrix: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
+.. _column-formatted_distance_matrix: http://www.mothur.org/wiki/Column-formatted_distance_matrix
+.. _name: http://www.mothur.org/wiki/Name_file
+.. _list: http://www.mothur.org/wiki/List_file
+.. _rabund: http://www.mothur.org/wiki/Rabund_file
+.. _sabund: http://www.mothur.org/wiki/Sabund_file
.. _cluster: http://www.mothur.org/wiki/Cluster
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/collect.shared.xml
--- a/mothur/tools/mothur/collect.shared.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/collect.shared.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Generate collector's curves for calculators on OTUs
mothur_wrapper.py
@@ -7,74 +7,40 @@
--outputdir='$logfile.extra_files_path'
--datasetid='$logfile.id' --new_file_path='$__new_file_path__'
--new_datasets='^\S+?\.(anderberg|braycurtis|jabund|jclass|jest|kstest|kulczynski|kulczynskicody|lennon|morisitahorn|ochiai|shared\.ace|shared\.chao|shared\.nseqs|shared\.sobs|sorabund|sorclass|sorest|thetan|thetayc|whittaker)$:tabular'
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
- #elif $input.source == 'shared':
- --READ_shared=$input.otu
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
+ --shared=$otu
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label='$label'
#end if
#if $calc.__str__ != "None" and len($calc.__str__) > 0:
- --calc='$calc'
+ --calc='$calc'
#end if
$all
#if float($freq.__str__) > 0:
--freq=$freq
#end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --groups=$input.groups
+ #if $groups.__str__ != "None" and len($groups.__str__) > 0:
+ --groups=$groups
#end if
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+ To filter: select labels to include
+
+
+
+
+
+
+
+ To filter: select at least 2 groups
+
+
+
+
+
+
+
@@ -105,8 +71,9 @@
**Command Documentation**
-The collect.shared_ command generates collector's curves for calculators, which describe the similarity between communities or their shared richness. Collector's curves describe how richness or diversity change as you sample additional individuals. If a collector's curve becomes parallel to the x-axis, you can be reasonably confident that you have done a good job of sampling and can trust the last value in the curve.
+The collect.shared_ command generates collector's curves for calculators_, which describe the similarity between communities or their shared richness. Collector's curves describe how richness or diversity change as you sample additional individuals. If a collector's curve becomes parallel to the x-axis, you can be reasonably confident that you have done a good job of sampling and can trust the last value in the curve. For calc parameter choices see: http://www.mothur.org/wiki/Calculators
+.. _calculators: http://www.mothur.org/wiki/Calculators
.. _collect.shared: http://www.mothur.org/wiki/Collect.shared
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/collect.single.xml
--- a/mothur/tools/mothur/collect.single.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/collect.single.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,16 +1,20 @@
-
- Summary of calculator values for OTUs
+
+ Generate collector's curves for OTUs
mothur_wrapper.py
--cmd='collect.single'
--result='^mothur.\S+\.logfile$:'$logfile
--outputdir='$logfile.extra_files_path'
--datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+\.(\S+)$:tabular'
- --READ_cmd='read.otu'
- --READ_list=$otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
+ --new_datasets='^\S+?\.(\S+)$:tabular'
+ #if isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('shared').__class__):
+ --shared=$otu
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('rabund').__class__):
+ --rabund=$otu
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('sabund').__class__):
+ --sabund=$otu
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('list').__class__):
+ --list=$otu
#end if
#if $label.__str__ != "None" and len($label.__str__) > 0:
--label='$label'
@@ -24,13 +28,12 @@
#if int($size.__str__) > 0:
--size=$size
#end if
-
#if float($freq.__str__) > 0:
--freq=$freq
#end if
+
-
-
+
@@ -46,9 +49,11 @@
-
-
-
+
+
+
@@ -69,8 +74,9 @@
**Command Documentation**
-The collect.single_ command generates collector's curves using calculators, that describe the richness, diversity, and other features of individual samples. Collector's curves describe how richness or diversity change as you sample additional individuals. If a collector's curve becomes parallel to the x-axis, you can be reasonably confident that you have done a good job of sampling and can trust the last value in the curve. Otherwise, you need to keep sampling.
+The collect.single_ command generates collector's curves using calculators_ that describe the richness, diversity, and other features of individual samples. Collector's curves describe how richness or diversity change as you sample additional individuals. If a collector's curve becomes parallel to the x-axis, you can be reasonably confident that you have done a good job of sampling and can trust the last value in the curve. Otherwise, you need to keep sampling. For calc parameter choices see: http://www.mothur.org/wiki/Calculators
+.. _calculators: http://www.mothur.org/wiki/Calculators
.. _collect.single: http://www.mothur.org/wiki/Collect.single
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/consensus.seqs.xml
--- a/mothur/tools/mothur/consensus.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/consensus.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Find a consensus sequence for each OTU or phylotype
mothur_wrapper.py
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/corr.axes.xml
--- a/mothur/tools/mothur/corr.axes.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/corr.axes.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
correlation of data to axes
mothur_wrapper.py
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/degap.seqs.xml
--- a/mothur/tools/mothur/degap.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/degap.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Remove gap characters from sequences
mothur_wrapper.py
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/deunique.seqs.xml
--- a/mothur/tools/mothur/deunique.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/deunique.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Return all sequences
mothur_wrapper.py
@@ -10,7 +10,7 @@
-
+
@@ -32,8 +32,9 @@
**Command Documenation**
-The deunique.seqs_ command is the reverse of the unique.seqs command, and creates a fasta file from a fasta and name file.
+The deunique.seqs_ command is the reverse of the unique.seqs command, and creates a fasta file from a fasta and name_ file.
+.. _name: http://www.mothur.org/wiki/Name_file
.. _deunique.seqs: http://www.mothur.org/wiki/Deunique.seqs
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/dist.seqs.xml
--- a/mothur/tools/mothur/dist.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/dist.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
calculate uncorrected pairwise distances between aligned sequences
mothur_wrapper.py
@@ -64,8 +64,10 @@
**Command Documentation**
-The dist.seqs_ command will calculate uncorrected pairwise distances between aligned sequences.
+The dist.seqs_ command will calculate uncorrected pairwise distances between aligned sequences. The command will generate a column-formatted_distance_matrix_ that is compatible with the column option in the read.dist command. The command is also able to generate a phylip-formatted_distance_matrix_. There are several options for how to handle gap comparisons and terminal gaps.
+.. _column-formatted_distance_matrix: http://www.mothur.org/wiki/Column-formatted_distance_matrix
+.. _phylip-formatted_distance_matrix: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
.. _dist.seqs: http://www.mothur.org/wiki/Dist.seqs
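Both matrix formats named above are plain text; a sketch that converts a column-formatted list of pairwise distances into a phylip-style lower-triangle matrix (assumed layouts: one 'nameA nameB distance' triple per line for column format, and a leading sequence count followed by one row per name for phylip):

    def column_to_phylip_lt(column_lines):
        names, dist = [], {}
        for line in column_lines:
            a, b, d = line.split()
            for n in (a, b):
                if n not in dist:
                    names.append(n)
                    dist[n] = {}
            dist[a][b] = dist[b][a] = float(d)
        out = ["%d" % len(names)]
        for i, a in enumerate(names):
            # each row: name followed by distances to all previously listed names
            row = [a] + ["%.4f" % dist[a][b] for b in names[:i]]
            out.append("\t".join(row))
        return "\n".join(out)

    print(column_to_phylip_lt(["s1\ts2\t0.10", "s1\ts3\t0.25", "s2\ts3\t0.30"]))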
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/dist.shared.xml
--- a/mothur/tools/mothur/dist.shared.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/dist.shared.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,37 +1,28 @@
-
+
Generate a phylip-formatted dissimilarity distance matrix among multiple groups
mothur_wrapper.py
--cmd='dist.shared'
--result='^mothur.\S+\.logfile$:'$logfile
--outputdir='$logfile.extra_files_path'
- --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- #if len($output.__str__) > 0:
- #if $output.__str__ == 'square':
- --new_datasets='^\S+?\.([a-z]+\.(unique|[0-9.]*)\.(square|lt))\.dist$:square.dist'
- #elif $output.__str__ == 'lt':
+ #if $as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ #if len($output.__str__) > 0:
+ #if $output.__str__ == 'square':
+ --new_datasets='^\S+?\.([a-z]+\.(unique|[0-9.]*)\.(square|lt))\.dist$:square.dist'
+ #elif $output.__str__ == 'lt':
+ --new_datasets='^\S+?\.([a-z]+\.(unique|[0-9.]*)\.(square|lt))\.dist$:lower.dist'
+ #end if
+ #else:
--new_datasets='^\S+?\.([a-z]+\.(unique|[0-9.]*)\.(square|lt))\.dist$:lower.dist'
#end if
- #else:
- --new_datasets='^\S+?\.([a-z]+\.(unique|[0-9.]*)\.(square|lt))\.dist$:lower.dist'
#end if
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
- #elif $input.source == 'shared':
- --READ_shared=$input.otu
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
+ --shared=$otu
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label='$label'
#end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --groups=$input.groups
+ #if $groups.__str__ != "None" and len($groups.__str__) > 0:
+ --groups=$groups
#end if
#if $calc.__str__ != "None" and len($calc.__str__) > 0:
--calc=$calc
@@ -42,47 +33,21 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
@@ -91,10 +56,11 @@
-
+
+
@@ -115,8 +81,10 @@
**Command Documentation**
-The dist.shared_ command will generate a phylip-formatted distance matrix that describes the dissimilarity (1-similarity) among multiple groups.
+The dist.shared_ command will generate a phylip-formatted_distance_matrix_ that describes the dissimilarity (1-similarity) among multiple groups from a shared_ file. For calc parameter choices see: http://www.mothur.org/wiki/Calculators
+.. _phylip-formatted_distance_matrix: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
+.. _shared: http://www.mothur.org/wiki/Shared_file
.. _dist.shared: http://www.mothur.org/wiki/Dist.shared
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/fastq.info.xml
--- a/mothur/tools/mothur/fastq.info.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/fastq.info.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Convert fastq to fasta and quality
mothur_wrapper.py
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/filter.seqs.xml
--- a/mothur/tools/mothur/filter.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/filter.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
removes columns from alignments
mothur_wrapper.py
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.group.xml
--- a/mothur/tools/mothur/get.group.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/get.group.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,54 +1,14 @@
-
+
group names from shared or from list and group
mothur_wrapper.py
--cmd='get.group'
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.bootGroups$:'$bootgroups
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --READ_groups='$input.groups'
- #end if
- #elif $input.source == 'shared':
- --READ_shared=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
- #end if
+ --shared=$otu
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.groups.xml
--- a/mothur/tools/mothur/get.groups.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/get.groups.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Select groups
mothur_wrapper.py
@@ -46,7 +46,7 @@
-
+
@@ -61,7 +61,7 @@
list_in != None
-
+
taxonomy_in != None
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.lineage.xml
--- a/mothur/tools/mothur/get.lineage.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/get.lineage.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Picks by taxon
mothur_wrapper.py
@@ -20,7 +20,7 @@
#end if
#if $alignreport_in.__str__ != "None" and len($alignreport_in.__str__) > 0:
--alignreport=$alignreport_in
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($alignreport_in.__str__)) + ":'" + $alignreport_out.__str__]
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'\1.pick.align.report',$os.path.basename($alignreport_in.__str__)) + ":'" + $alignreport_out.__str__]
#end if
#if $list_in.__str__ != "None" and len($list_in.__str__) > 0:
--list=$list_in
@@ -34,7 +34,7 @@
--result=#echo ','.join($results)
-
+
@@ -51,9 +51,8 @@
-
-
+
fasta_in != None
@@ -86,8 +85,13 @@
**Command Documentation**
-The get.lineage_ command reads a taxonomy file and a taxon and generates a new file that contains only the sequences in the that are from that taxon. You may also include either a fasta, name, group, list, or align.report file to this command and mothur will generate new files for each of those containing only the selected sequences.
+The get.lineage_ command reads a taxonomy_ file and a taxon and generates a new file that contains only the sequences from that taxon. You may also include a fasta, name_, group_, list_, or align.report_ file with this command and mothur will generate new files for each of those containing only the selected sequences.
+.. _taxonomy: http://www.mothur.org/wiki/Taxonomy_outline
+.. _name: http://www.mothur.org/wiki/Name_file
+.. _group: http://www.mothur.org/wiki/Group_file
+.. _list: http://www.mothur.org/wiki/List_file
+.. _align.report: http://www.mothur.org/wiki/Align.seqs
.. _get.lineage: http://www.mothur.org/wiki/Get.lineage
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.otulist.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/get.otulist.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,63 @@
+
+ Get otus for each distance in an otu list
+
+ mothur_wrapper.py
+ ## output {group_file_name}.pick.{label}.groups {list_file_name}.pick.{label}.list
+ #import re, os.path
+ --cmd='get.otulist'
+ --result='^mothur.\S+\.logfile$:'$logfile
+ --outputdir='$logfile.extra_files_path'
+ #if $as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+?\.((unique|[0-9.]*)\.otu)$:tabular'
+ #end if
+ --list=$list_in
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label=$label
+ #end if
+ #if $sort.__str__ != "None" and len($sort.__str__) > 0:
+ --sort=$sort
+ #end if
+
+
+
+
+
+
+
+
+
+
+ If otu is selected, the output will be the otu number followed by the list of names in that otu.
+ If name is selected, the output will be a sequence name followed by its otu number.
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The get.otulist_ command parses a list file and creates a .otu file for each distance containing 2 columns: the first column is the OTU number and the second column is the list of sequences in that OTU.
+
+.. _get.otulist: http://www.mothur.org/wiki/Get.otulist
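+
+As an illustration only (not part of the tool), a minimal Python sketch of reading such a .otu file back into a dictionary, assuming two tab-separated columns with comma-separated sequence names in the second column::
+
+  def read_otu_file(path):
+      # {otu_number: [sequence names]}, one OTU per line
+      otus = {}
+      with open(path) as fh:
+          for line in fh:
+              fields = line.rstrip('\n').split('\t')
+              if len(fields) >= 2:
+                  otus[fields[0]] = fields[1].split(',')
+      return otus
+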
+
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.oturep.xml
--- a/mothur/tools/mothur/get.oturep.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/get.oturep.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,12 +1,14 @@
-
+
Generate a fasta with a representative sequence for each OTU
mothur_wrapper.py
--cmd='get.oturep'
--result='^mothur.\S+\.logfile$:'$logfile
--outputdir='$logfile.extra_files_path'
- --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+?\.((unique|[0-9.]*)(\S+)\.rep\.fasta)$:fasta','^\S+?\.((unique|[0-9.]*)(\S+)\.rep\.names)$:names'
+ #if $as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+?\.((unique|[0-9.]*)(\S+)\.rep\.fasta)$:fasta','^\S+?\.((unique|[0-9.]*)(\S+)\.rep\.names)$:names'
+ #end if
--fasta=$fasta
--list=$otu_list
#if $input.source == 'column':
@@ -21,8 +23,13 @@
#if $label.__str__ != "None" and len($label.__str__) > 0:
--label='$label'
#end if
- #if $groups.__str__ != "None" and len($groups.__str__) > 0:
- --groups=$groups
+ #if $pick.type == 'yes':
+ #if $pick.group.__str__ != "None" and len($pick.group.__str__) > 0:
+ --group=$pick.group
+ #end if
+ #if $pick.groups.__str__ != "None" and len($pick.groups.__str__) > 0:
+ --groups=$pick.groups
+ #end if
#end if
#if $sorted.__str__ != "None" and len($sorted.__str__) > 0:
--sorted=$sorted
@@ -34,26 +41,36 @@
-
+
-
-
+
+
-
-
+
+
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
@@ -67,8 +84,8 @@
-
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.otus.xml
--- a/mothur/tools/mothur/get.otus.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/get.otus.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Get otus containing sequences from specified groups
mothur_wrapper.py
@@ -9,15 +9,18 @@
--group=$group_in
--list=$list_in
--label=$label
- #if $groups.__str__ != "None" and len($groups.__str__) > 0:
- --groups=$groups
- #end if
- #if $accnos.__str__ != "None" and len($accnos.__str__) > 0:
- --accnos=$accnos
+ #if $groupnames.source == 'groups':
+ #if $groupnames.groups.__str__ != "None" and len($groupnames.groups.__str__) > 0:
+ --groups=$groupnames.groups
+ #end if
+ #else
+ #if $groupnames.accnos.__str__ != "None" and len($groupnames.accnos.__str__) > 0:
+ --accnos=$groupnames.accnos
+ #end if
#end if
#set results = ["'^mothur.\S+\.logfile$:'" + $logfile.__str__]
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.' + $label.__str__ + '.\2',$os.path.basename($group_in.__str__)) + ":'" + $group_out.__str__]
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.'+ $label.__str__ + '.\2',$os.path.basename($list_in.__str__)) + ":'" + $list_out.__str__]
+ #set results = $results + ["'" + $re.sub(r'^(.*)\.(.*?)',r'\1.pick.' + $label.__str__ + r'.\2',$os.path.basename($group_in.__str__)) + ":'" + $group_out.__str__]
+ #set results = $results + ["'" + $re.sub(r'^(.*)\.(.*?)',r'\1.pick.'+ $label.__str__ + r'.\2',$os.path.basename($list_in.__str__)) + ":'" + $list_out.__str__]
--result=#echo ','.join($results)
@@ -29,14 +32,25 @@
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+ At least one group must be selected
+
+
+
+
+
+
+
+
+
+
+
@@ -59,8 +73,9 @@
**Command Documentation**
-The get.otus_ command selects otus containing sequences from a specific group or set of groups.
+The get.otus_ command selects otus from a list_ containing sequences from a specific group or set of groups.
+.. _list: http://www.mothur.org/wiki/List_file
.. _get.otus: http://www.mothur.org/wiki/Get.otus
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.rabund.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/get.rabund.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,58 @@
+
+ Get rabund from an otu list or sabund
+
+ mothur_wrapper.py
+ --cmd='get.rabund'
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.rabund$:'$rabund
+ --outputdir='$logfile.extra_files_path'
+ #if isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('list').__class__):
+ --list=$otu
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('sabund').__class__):
+ --sabund=$otu
+ #end if
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label=$label
+ #end if
+ $sorted
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The get.rabund_ command generates an rabund_ file from a list_ or sabund_ file.
+
+.. _rabund: http://www.mothur.org/wiki/Rabund_file
+.. _list: http://www.mothur.org/wiki/List_file
+.. _sabund: http://www.mothur.org/wiki/Sabund_file
+.. _get.rabund: http://www.mothur.org/wiki/Get.rabund
+
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.relabund.xml
--- a/mothur/tools/mothur/get.relabund.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/get.relabund.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,76 +1,41 @@
-
+
Calculate the relative abundance of each otu
mothur_wrapper.py
--cmd='get.relabund'
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.relabund$:'$relabund
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
- #elif $input.source == 'shared':
- --READ_shared=$input.otu
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
+ --shared=$otu
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label='$label'
#end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --groups=$input.groups
+ #if $groups.__str__ != "None" and len($groups.__str__) > 0:
+ --groups=$groups
#end if
#if $scale.__str__ != "None" and len($scale.__str__) > 0:
--scale=$scale
#end if
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
-
+
@@ -96,8 +61,9 @@
**Command Documentation**
-The get.relabund_ command calculates the relative abundance of each otu in a sample. It outputs a .relabund file.
+The get.relabund_ command calculates the relative abundance of each otu in a sample from a shared_ file. It outputs a .relabund file.
+.. _shared: http://www.mothur.org/wiki/Shared_file
.. _get.relabund: http://www.mothur.org/wiki/Get.relabund
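
As a worked illustration (not the mothur code), relative abundance is simply each OTU's count divided by the group's total::

  def relative_abundance(counts):
      # counts: {otu: reads observed in one group}
      total = float(sum(counts.values()))
      return dict((otu, n / total) for otu, n in counts.items()) if total else {}

  relative_abundance({'Otu001': 30, 'Otu002': 10, 'Otu003': 60})
  # {'Otu001': 0.3, 'Otu002': 0.1, 'Otu003': 0.6}
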
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.sabund.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/get.sabund.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,55 @@
+
+ Get sabund from an otu list or rabund
+
+ mothur_wrapper.py
+ --cmd='get.sabund'
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.sabund$:'$sabund
+ --outputdir='$logfile.extra_files_path'
+ #if isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('list').__class__):
+ --list=$otu
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('rabund').__class__):
+ --rabund=$otu
+ #end if
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label=$label
+ #end if
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The get.sabund_ command generates an sabund_ file from a list_ or rabund_ file.
+
+.. _sabund: http://www.mothur.org/wiki/Sabund_file
+.. _list: http://www.mothur.org/wiki/List_file
+.. _rabund: http://www.mothur.org/wiki/Rabund_file
+.. _get.sabund: http://www.mothur.org/wiki/Get.sabund
+
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.seqs.xml
--- a/mothur/tools/mothur/get.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/get.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Picks sequences by name
mothur_wrapper.py
@@ -47,7 +47,7 @@
-
+
@@ -72,7 +72,7 @@
list_in != None
-
+
taxonomy_in != None
@@ -92,8 +92,14 @@
**Command Documentation**
-The get.seqs_ command takes a list of sequence names and either a fasta, name, group, list, or align.report file to generate a new file that contains only the sequences in the list. This command may be used in conjunction with the list.seqs command to help screen a sequence collection.
+The get.seqs_ command takes a list of sequence names and either a fasta, name_, group_, list_, align.report_ or taxonomy_ file to generate a new file that contains only the sequences in the list. This command may be used in conjunction with the list.seqs_ command to help screen a sequence collection.
+.. _name: http://www.mothur.org/wiki/Name_file
+.. _group: http://www.mothur.org/wiki/Group_file
+.. _list: http://www.mothur.org/wiki/List_file
+.. _align.report: http://www.mothur.org/wiki/Align.seqs
+.. _taxonomy: http://www.mothur.org/wiki/Taxonomy_outline
+.. _list.seqs: http://www.mothur.org/wiki/list.seqs
.. _get.seqs: http://www.mothur.org/wiki/Get.seqs
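
A minimal sketch of the idea (illustrative only, not the mothur implementation): keep only the fasta records whose identifier appears in an accnos file of one name per line::

  def filter_fasta_by_accnos(fasta_path, accnos_path, out_path):
      with open(accnos_path) as fh:
          wanted = set(line.strip() for line in fh if line.strip())
      keep = False
      with open(fasta_path) as src, open(out_path, 'w') as dst:
          for line in src:
              if line.startswith('>'):
                  keep = line[1:].split()[0] in wanted
              if keep:
                  dst.write(line)
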
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/get.sharedseqs.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/get.sharedseqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,106 @@
+
+ Get shared sequences at each distance from list and group
+
+ mothur_wrapper.py
+ #import re, os.path
+ --cmd='get.sharedseqs'
+ --result='^mothur.\S+\.logfile$:'$logfile
+ --outputdir='$logfile.extra_files_path'
+ --list=$list
+ --group=$group
+ #set datasets = ["'^\S+?\.((unique|[0-9.]*)(\S+)\.shared.seqs)$:tabular'"]
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label='$label'
+ #end if
+ #if $seqs_from.selection == 'unique':
+ #if $seqs_from.groups.__str__ != "None" and len($seqs_from.groups.__str__) > 0:
+ --unique=$seqs_from.groups
+ #end if
+ #elif $seqs_from.selection == 'shared':
+ #if $seqs_from.groups.__str__ != "None" and len($seqs_from.groups.__str__) > 0:
+ --shared=$seqs_from.groups
+ #end if
+ #end if
+ #if $fasta.__str__ != "None" and len($fasta.__str__) > 0:
+ --fasta=$fasta
+ #set datasets = $datasets + ["'^\S+?\.((unique|[0-9.]*)(\S+)\.shared.fasta)$:fasta'"]
+ #end if
+ #if $output.__str__ != "None" and len($output.__str__) > 0:
+ --output=$output
+ #set datasets = $datasets + ["'^\S+?\.((unique|[0-9.]*)(\S+)\.accnos)$:accnos'"]
+ #end if
+ #if $as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets=#echo ','.join($datasets)
+ #end if
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The get.sharedseqs_ command takes a list and group file and outputs a *.shared.seqs file for each distance. This is useful for those cases where you might be interested in identifying sequences that are either unique or shared by specific groups, which you could then classify.
+
+.. _get.sharedseqs: http://www.mothur.org/wiki/Get.sharedseqs
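+
+The command section above passes --new_datasets as a comma-joined list of 'regex:datatype' pairs. A minimal sketch (illustrative only, not the actual mothur_wrapper.py logic) of how such patterns could be applied to the files mothur leaves in the output directory, with the first capture group supplying the new dataset name::
+
+  import os, re
+
+  def match_new_datasets(outputdir, new_datasets):
+      found = []
+      for spec in new_datasets.split(','):
+          pattern, datatype = spec.strip("'").rsplit(':', 1)
+          for fname in os.listdir(outputdir):
+              m = re.match(pattern, fname)
+              if m:
+                  found.append((m.group(1), datatype, os.path.join(outputdir, fname)))
+      return found
+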
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/hcluster.xml
--- a/mothur/tools/mothur/hcluster.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/hcluster.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Assign sequences to OTUs (Operational Taxonomic Units)
mothur_wrapper.py
@@ -18,10 +18,10 @@
--method=$method
#end if
#if float($cutoff.__str__) > 0.0:
- --READ_cutoff=$cutoff
+ --cutoff=$cutoff
#end if
#if len($precision.__str__) > 0:
- --READ_precision=$precision
+ --precision=$precision
#end if
$hard
$sorted
@@ -58,9 +58,9 @@
-
+
-
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/heatmap.bin.xml
--- a/mothur/tools/mothur/heatmap.bin.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/heatmap.bin.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,30 +1,31 @@
-
+
Generate a heatmap for OTUs
mothur_wrapper.py
--cmd='heatmap.bin'
--result='^mothur.\S+\.logfile$:'$logfile
--outputdir='$logfile.extra_files_path'
- --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+?\.((\S+)\.(unique|[0-9.]*)\.heatmap\.bin\.svg)$:svg'
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
- #elif $input.source == 'shared':
- --READ_relabund=$input.otu
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
+ #if $as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+?\.((\S+)\.(unique|[0-9.]*)\.heatmap\.bin\.svg)$:svg'
#end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
+ #if isinstance($input.otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('shared').__class__):
+ --shared=$input.otu
+ #elif isinstance($input.otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('rabund').__class__):
+ --rabund=$input.otu
+ #elif isinstance($input.otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('sabund').__class__):
+ --sabund=$input.otu
+ #elif isinstance($input.otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('list').__class__):
+ --list=$input.otu
+ #elif isinstance($input.otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('relabund').__class__):
+ --relabund=$input.otu
+ #end if
+ #if $input.has_groups != 'no' and $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
--groups=$input.groups
#end if
+ #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
+ --label='$input.label'
+ #end if
#if $scale.__str__ != "None" and len($scale.__str__) > 0:
--scale='$scale'
#end if
@@ -39,42 +40,54 @@
#end if
-
-
-
-
-
+
+
+
+
+
-
-
-
+
+
+
+
+
+
+
+
+
+
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
-
-
-
-
-
-
-
-
-
-
+
@@ -88,11 +101,12 @@
-
-
+
+
+
@@ -113,8 +127,10 @@
**Command Documentation**
-The heatmap.bin_ command generates a heat map from data provided in either a .list or a .shared file.
+The heatmap.bin_ command generates a heat map from data provided in either a list_ or a shared_ file.
+.. _list: http://www.mothur.org/wiki/List_file
+.. _shared: http://www.mothur.org/wiki/Shared_file
.. _heatmap.bin: http://www.mothur.org/wiki/Heatmap.bin
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/heatmap.sim.xml
--- a/mothur/tools/mothur/heatmap.sim.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/heatmap.sim.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,30 +1,22 @@
-
+
Generate a heatmap for pairwise similarity
mothur_wrapper.py
--cmd='heatmap.sim'
- --result='^mothur.\S+\.logfile$:'$logfile
--outputdir='$logfile.extra_files_path'
- --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+?\.((unique|[0-9.]*)(\S+)\.heatmap\.sim\.svg)$:svg'
- #if $input.source == 'similarity':
- --READ_cmd='read.otu'
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
+ #if $as_datasets.__str__ == "yes":
+ #if $input.source == 'shared':
+ --result='^mothur.\S+\.logfile$:'$logfile
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+?\.((unique|[0-9.]*)(\S+)\.heatmap\.sim\.svg)$:svg'
+ #else:
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.heatmap\.sim\.svg$:'$heatmap
#end if
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --label='$input.label'
- #end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --groups=$input.groups
- #end if
- #if $input.calc.__str__ != "None" and len($input.calc.__str__) > 0:
- --calc='$input.calc'
- #end if
- #elif $input.source == 'shared':
- --READ_cmd='read.otu'
- --READ_shared=$input.otu
+ #else:
+ --result='^mothur.\S+\.logfile$:'$logfile
+ #end if
+ #if $input.source == 'shared':
+ --shared=$input.otu
#if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
--label='$input.label'
#end if
@@ -39,45 +31,15 @@
--name=$input.name
#elif $input.source == 'phylip':
--phylip=$input.dist
- #if $input.name.__str__ != "None" and len($input.name.__str__) > 0:
- --name=$input.name
- #end if
#end if
-
-
-
-
-
+
+
+
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
@@ -94,6 +56,14 @@
+
+
+
+
+
+
+
+
@@ -101,12 +71,15 @@
-
+
+
+ as_datasets != 'true' and input['source'] != 'shared'
+
mothur
@@ -124,8 +97,12 @@
**Command Documentation**
-The heatmap.sim_ command generates a heat map from data provided in either a .list or a .shared file.
+The heatmap.sim_ command generates a heat map from data provided in either a shared_ file, a phylip_ distance matrix, or a column_ distance matrix and a name_ file. For calc parameter choices see: http://www.mothur.org/wiki/Calculators
+.. _shared: http://www.mothur.org/wiki/Shared_file
+.. _phylip: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
+.. _column: http://www.mothur.org/wiki/Column-formatted_distance_matrix
+.. _name: http://www.mothur.org/wiki/Name_file
.. _heatmap.sim: http://www.mothur.org/wiki/Heatmap.sim
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/homova.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/homova.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,62 @@
+
+ Homogeneity of molecular variance
+
+ mothur_wrapper.py
+ --cmd='homova'
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.homova$:'$homova
+ --outputdir='$logfile.extra_files_path'
+ --phylip=$dist
+ --design=$design
+ #if int($iters.__str__) > 0:
+ --iters=$iters
+ #end if
+ #if float($alpha.__str__) > 0.0:
+ --alpha=$alpha
+ #end if
+
+
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The homova_ command calculates the homogeneity of molecular variance (HOMOVA) from a phylip_distance_matrix_. HOMOVA is a nonparametric analog of Bartlett's test for homogeneity of variance and has been used in population genetics to test the hypothesis that the genetic diversity within two or more populations is homogeneous.
+
+A design file partitions a list of names into groups. It is a tab-delimited file with 2 columns: name and group, e.g. :
+ ======= =======
+ duck bird
+ cow mammal
+ pig mammal
+ goose bird
+ cobra reptile
+ ======= =======
+
+The Make_Design tool can construct a design file from a Mothur dataset that contains group names.
+
+.. _phylip_distance_matrix: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
+.. _homova: http://www.mothur.org/wiki/Homova
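+
+A minimal sketch (illustrative only; within Galaxy the Make_Design tool does this) of writing such a tab-delimited design file from a mapping of names to groupings::
+
+  def write_design(assignments, path):
+      # one "name<TAB>grouping" line per sample
+      with open(path, 'w') as fh:
+          for name, grouping in sorted(assignments.items()):
+              fh.write('%s\t%s\n' % (name, grouping))
+
+  write_design({'duck': 'bird', 'cow': 'mammal', 'cobra': 'reptile'}, 'toy.design')
+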
+
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/indicator.xml
--- a/mothur/tools/mothur/indicator.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/indicator.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Identify indicator "species" for nodes on a tree
mothur_wrapper.py
@@ -64,8 +64,9 @@
**Command Documentation**
-The indicator_ command reads a shared or relabund file and a tree file, and outputs a .indicator.tre and .indicator.summary file. The new tree contains labels at each internal node. The label is the node number so you can relate the tree to the summary file. The summary file lists the indicator value for each OTU for each node.
+The indicator_ command reads a shared_ or relabund file and a tree file, and outputs a .indicator.tre and .indicator.summary file. The new tree contains labels at each internal node. The label is the node number so you can relate the tree to the summary file. The summary file lists the indicator value for each OTU for each node.
+.. _shared: http://www.mothur.org/wiki/Shared_file
.. _indicator: http://www.mothur.org/wiki/Indicator
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/libshuff.xml
--- a/mothur/tools/mothur/libshuff.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/libshuff.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,21 +1,19 @@
-
+
Cramer-von Mises test of whether communities have the same structure
mothur_wrapper.py
--cmd='libshuff'
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.libshuff\.summary$:'$summary,'^\S+\.libshuff\.coverage$:'$coverage
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.dist'
- #if $matrix.format == "column":
- --READ_column=$matrix.dist
- --READ_name=$matrix.name
- #elif $matrix.format == "phylip":
- --READ_phylip=$matrix.dist
+ --phylip=$dist
+ --group=$group
+ #if $groups.__str__ != "None" and len($groups.__str__) > 0:
+ --groups=$groups
#end if
- --READ_group=$group
#if len($iters.__str__) > 0:
--iters=$iters
#end if
+ $sim
#if $form == "discrete":
#if 1.0 >= float($form.step.__str__) > 0.0:
--step=$form.step
@@ -26,22 +24,17 @@
#end if
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/list.seqs.xml
--- a/mothur/tools/mothur/list.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/list.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,5 +1,5 @@
-
- Lists the names of the sequences
+
+ Lists the names (accnos) of the sequences
mothur_wrapper.py
--cmd='list.seqs'
@@ -9,7 +9,7 @@
--fasta=$search.input
#elif $search.type == "name":
--name=$search.input
- #else if search.type == "group":
+ #else if $search.type == "group":
--group=$search.input
#elif $search.type == "alignreport":
--alignreport=$search.input
@@ -45,7 +45,7 @@
-
+
@@ -70,8 +70,13 @@
**Command Documentation**
-The list.seqs_ command writes out the names of the sequences found within a fasta, name, group, list, or align.report file.
+The list.seqs_ command writes out the names of the sequences found within a fasta, name_, group_, list_, align.report_ or taxonomy_ file.
+.. _name: http://www.mothur.org/wiki/Name_file
+.. _group: http://www.mothur.org/wiki/Group_file
+.. _list: http://www.mothur.org/wiki/List_file
+.. _align.report: http://www.mothur.org/wiki/Align.seqs
+.. _taxonomy: http://www.mothur.org/wiki/Taxonomy_outline
.. _list.seqs: http://www.mothur.org/wiki/list.seqs
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/make.design.xml
--- a/mothur/tools/mothur/make.design.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/make.design.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Assign groups to Sets
cat $generated_design > $design
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/make.fastq.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/make.fastq.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,41 @@
+
+ Convert fasta and quality to fastq
+
+ mothur_wrapper.py
+ --cmd='fastq.info'
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.fastq$:'$fastq
+ --outputdir='$logfile.extra_files_path'
+ --fasta=$fasta
+ --qfile=$qfile
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The fastq.info_ command reads a fasta file and a quality file and creates a fastq file.
+
+
+.. _fastq.info: http://www.mothur.org/wiki/Make.fastq
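+
+A minimal sketch of the conversion (illustrative only, not mothur's code), assuming one line per fasta/quality record and a Phred+33 offset for the encoded qualities::
+
+  def fasta_qual_to_fastq(fasta_path, qual_path, fastq_path):
+      with open(fasta_path) as fa, open(qual_path) as qu, open(fastq_path, 'w') as out:
+          while True:
+              header, seq = fa.readline(), fa.readline()
+              qheader, scores = qu.readline(), qu.readline()
+              if not header or not qheader:
+                  break
+              # encode each integer quality score as a single character
+              quals = ''.join(chr(int(s) + 33) for s in scores.split())
+              out.write('@%s\n%s\n+\n%s\n' % (header[1:].strip(), seq.strip(), quals))
+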
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/make.group.xml
--- a/mothur/tools/mothur/make.group.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/make.group.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Make a group file
mothur_wrapper.py
@@ -36,8 +36,9 @@
**Command Documentation**
-The make.group_ command reads a fasta file or series of fasta files and creates a group file.
+The make.group_ command reads a fasta file or series of fasta files and creates a group_ file.
+.. _group: http://www.mothur.org/wiki/Group_file
.. _make.group: http://www.mothur.org/wiki/Make.group
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/make.shared.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/make.shared.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,75 @@
+
+ Make a shared file from a list and a group
+
+ mothur_wrapper.py
+ #import re, os.path
+ --cmd='make.shared'
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.shared$:'$shared
+ --outputdir='$logfile.extra_files_path'
+ #if $as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+?\.((\S+)\.rabund)$:rabund'
+ #end if
+ --list=$list
+ --group=$group
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label='$label'
+ #end if
+ #if $groups.__str__ != "None" and len($groups.__str__) > 0:
+ --unique=$groups
+ #end if
+ #if $ordergroup.__str__ != "None" and len($ordergroup.__str__) > 0:
+ --ordergroup=$ordergroup
+ #end if
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The make.shared_ command takes a list_ and a group_ file and outputs a shared_ file, as well as a rabund_ file for each group.
+
+
+.. _list: http://www.mothur.org/wiki/List_file
+.. _group: http://www.mothur.org/wiki/Group_file
+.. _shared: http://www.mothur.org/wiki/Shared_file
+.. _rabund: http://www.mothur.org/wiki/Rabund_file
+.. _make.shared: http://www.mothur.org/wiki/Make.shared
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/mantel.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/mantel.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,52 @@
+
+ Mantel correlation coefficient between two matrices.
+
+ mothur_wrapper.py
+ --cmd='mantel'
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.mantel$:'$mantel
+ --outputdir='$logfile.extra_files_path'
+ --phylip=$dist
+ --phylip2=$dist2
+ --method=$method
+ #if int($iters.__str__) > 0:
+ --iters=$iters
+ #end if
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documenation**
+
+The mantel_ command calculates the Mantel correlation coefficient between two matrices_.
+
+.. _matrices: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
+.. _mantel: http://www.mothur.org/wiki/Mantel
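+
+A minimal sketch of the idea behind the test (Pearson method only; illustrative, not mothur's implementation): correlate the corresponding upper-triangle entries of the two matrices and estimate significance by permuting the row/column order of one of them::
+
+  import random
+
+  def pearson(x, y):
+      n = float(len(x))
+      mx, my = sum(x) / n, sum(y) / n
+      sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
+      sxx = sum((a - mx) ** 2 for a in x)
+      syy = sum((b - my) ** 2 for b in y)
+      return sxy / (sxx * syy) ** 0.5
+
+  def mantel(d1, d2, iters=1000):
+      # d1, d2: square matrices over the same samples in the same order
+      n = len(d1)
+      pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
+      x = [d1[i][j] for i, j in pairs]
+      r = pearson(x, [d2[i][j] for i, j in pairs])
+      hits = 0
+      for _ in range(iters):
+          perm = list(range(n))
+          random.shuffle(perm)
+          if pearson(x, [d2[perm[i]][perm[j]] for i, j in pairs]) >= r:
+              hits += 1
+      return r, hits / float(iters)
+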
+
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/merge.files.xml
--- a/mothur/tools/mothur/merge.files.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/merge.files.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Merge data
mothur_wrapper.py
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/merge.groups.xml
--- a/mothur/tools/mothur/merge.groups.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/merge.groups.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Merge groups in a shared file
mothur_wrapper.py
@@ -92,8 +92,20 @@
**Command Documentation**
-The merge.groups_ command reads a shared file and a design file and merges the groups in the shared file that are in the same grouping in the design file.
+The merge.groups_ command reads a shared_ file and a design file and merges the groups in the shared file that are in the same grouping in the design file.
+A design file partitions a list of names into groups. It is a tab-delimited file with 2 columns: name and group, e.g. :
+ ======= =======
+ duck bird
+ cow mammal
+ pig mammal
+ goose bird
+ cobra reptile
+ ======= =======
+
+The Make_Design tool can construct a design file from a Mothur dataset that contains group names.
+
+.. _shared: http://www.mothur.org/wiki/Shared_file
.. _merge.groups: http://www.mothur.org/wiki/Merge.groups
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/metastats.xml
--- a/mothur/tools/mothur/metastats.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/metastats.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,30 +1,21 @@
-
+
detect differentially abundant OTUs between groups of samples
mothur_wrapper.py
--cmd='metastats'
- --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.metastats$:'$metastats
+ --result='^mothur.\S+\.logfile$:'$logfile
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --READ_groups='$input.groups'
- #end if
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
- #elif $input.source == 'shared':
- --READ_shared=$input.otu
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --READ_groups='$input.groups'
- #end if
+ #if $as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+?\.((unique|[0-9.]*)(\..*?)+\.metastats)$:txt'
+ #end if
+ --shared=$otu
+ --design=$design
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label='$label'
+ #end if
+ #if $groups.__str__ != "None" and len($groups.__str__) > 0:
+ --groups='$groups'
#end if
#if int($iters.__str__) > 0:
--iters=$iters
@@ -32,59 +23,29 @@
#if 1 >= $threshold >= 0:
--threshold=$threshold
#end if
- #if $design.__str__ != "None" and len($design.__str__) > 0:
- --design=$design
- #if $sets.__str__ != "None" and len($sets.__str__) > 0:
- --sets=$sets
- #end if
+ #if $sets.__str__ != "None" and len($sets.__str__) > 0:
+ --sets=$sets
#end if
--processors=2
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ help="design has 2 columns: group (col 1) and grouping (col 2), separated by a TAB character; use make.design"/>
@@ -92,10 +53,12 @@
+
+
+
-
mothur
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/mothur_wrapper.py
--- a/mothur/tools/mothur/mothur_wrapper.py Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/mothur_wrapper.py Tue Jun 07 17:05:08 2011 -0400
@@ -4,7 +4,7 @@
http://www.mothur.org/
Supports mothur version
-mothur v.1.15.0
+mothur v.1.17.0
Class encapsulating Mothur galaxy tool.
Expect each invocation to include:
@@ -46,7 +46,7 @@
debug = False
#debug = True
-max_processors = 1
+max_processors = 2
def stop_err( msg ):
sys.stderr.write( "%s\n" % msg )
@@ -54,6 +54,8 @@
def __main__():
# transform the logfile into html
+ # add extra file output
+ # add object tags for svg files
def logfile_to_html(logfile_path,htmlfile_path,tmp_input_dir_name,tmp_output_dir_name,title="Mothur Logfile"):
if debug: print >> sys.stdout, 'logfile_to_html %s -> %s' % (logfile_path, htmlfile_path)
if debug: print >> sys.stdout, 'logfile_to_html input_dir: %s' % tmp_input_dir_name
@@ -69,10 +71,18 @@
continue
elif line.find('put directory to ') >= 0:
continue
+ elif line.startswith('Mothur\'s directories:') :
+ continue
+ elif line.startswith('outputDir=') :
+ continue
elif line.startswith('Type ') :
continue
elif line.find(tmp_output_dir_name) >= 0:
- line = re.sub(out_pat,'\\1',line)
+ # if debug: print >> sys.stdout, 'logfile_to_html #%s#' % line
+ if line.strip().endswith('.svg'):
+ line = re.sub(out_pat,'
\\1
',line)
+ else:
+ line = re.sub(out_pat,'\\1',line)
elif line.find(tmp_input_dir_name) >= 0:
line = re.sub(in_pat,'\\1',line)
html.write(line)
@@ -165,6 +175,146 @@
The complexity of inputs should be handled by the galaxy tool xml file.
"""
cmd_dict = dict()
+ cmd_dict['align.check'] = dict({'required' : ['fasta','map']})
+ #cmd_dict['align.seqs'] = dict({'required' : ['candidate','template'], 'optional' : ['search','ksize','align','match','mismatch','gapopen','gapextend','flip','threshold','processors']})
+ cmd_dict['align.seqs'] = dict({'required' : ['fasta','reference',], 'optional' : ['search','ksize','align','match','mismatch','gapopen','gapextend','flip','threshold','processors']})
+ cmd_dict['amova'] = dict({'required' : ['phylip','design'] , 'optional' : ['alpha','iters']})
+ cmd_dict['anosim'] = dict({'required' : ['phylip','design'] , 'optional' : ['alpha','iters']})
+ #cmd_dict['bin.seqs'] = dict({'required' : ['fasta'], 'optional' : ['name','label','group']})
+ cmd_dict['bin.seqs'] = dict({'required' : ['list','fasta'], 'optional' : ['name','label','group']})
+ #cmd_dict['bootstrap.shared'] = dict({'required' : [], 'optional' : ['calc','groups','iters','label']})
+ cmd_dict['bootstrap.shared'] = dict({'required' : ['shared'], 'optional' : ['calc','groups','iters','label']})
+ #catchall
+ cmd_dict['chimera.bellerophon'] = dict({'required' : ['fasta'], 'optional' : ['filter','correction','window','increment','processors']})
+ #cmd_dict['chimera.ccode'] = dict({'required' : ['fasta','template'], 'optional' : ['filter','mask','window','numwanted','processors']})
+ cmd_dict['chimera.ccode'] = dict({'required' : ['fasta','reference'], 'optional' : ['filter','mask','window','numwanted','processors']})
+ #cmd_dict['chimera.check'] = dict({'required' : ['fasta','template'], 'optional' : ['ksize','svg','name','increment','processors']})
+ cmd_dict['chimera.check'] = dict({'required' : ['fasta','reference'], 'optional' : ['ksize','svg','name','increment','processors']})
+ #cmd_dict['chimera.pintail'] = dict({'required' : ['fasta','template'], 'optional' : ['conservation','quantile','filter','mask','window','increment','processors']})
+ cmd_dict['chimera.pintail'] = dict({'required' : ['fasta','reference'], 'optional' : ['conservation','quantile','filter','mask','window','increment','processors']})
+ #cmd_dict['chimera.slayer'] = dict({'required' : ['fasta','template'], 'optional' : ['name','search','window','increment','match','mismatch','numwanted','parents','minsim','mincov','iters','minbs','minsnp','divergence','realign','split','processors']})
+ cmd_dict['chimera.slayer'] = dict({'required' : ['fasta','reference'], 'optional' : ['name','search','window','increment','match','mismatch','numwanted','parents','minsim','mincov','iters','minbs','minsnp','divergence','realign','split','processors']})
+ #cmd_dict['chop.seqs'] = dict({'required' : ['fasta','numbases'], 'optional' : ['keep','short']})
+ cmd_dict['chop.seqs'] = dict({'required' : ['fasta','numbases'], 'optional' : ['countgaps','keep','short']})
+ cmd_dict['classify.otu'] = dict({'required' : ['list','taxonomy'],'optional' : ['name','cutoff','label','group','probs','basis','reftaxonomy']})
+ #cmd_dict['classify.seqs'] = dict({'required' : ['fasta','template','taxonomy'],'optional' : ['name','search','ksize','method','match','mismatch','gapopen','gapextend','numwanted','probs','processors']})
+ cmd_dict['classify.seqs'] = dict({'required' : ['fasta','reference','taxonomy'],'optional' : ['name','search','ksize','method','match','mismatch','gapopen','gapextend','numwanted','probs','processors']})
+ cmd_dict['clearcut'] = dict({'required' : [['phylip','fasta']],'optional' : ['seed','norandom','shuffle','neighbor','expblen','expdist','ntrees','matrixout','kimura','jukes','protein','DNA']})
+ #cmd_dict['cluster'] = dict({'required' : [] , 'optional' : ['method','cutoff','hard','precision']})
+ cmd_dict['cluster'] = dict({'required' : [['phylip','column']] , 'optional' : ['name','method','cutoff','hard','precision','sim','showabund','timing']})
+ #cmd_dict['cluster.classic'] = dict({'required' : ['phylip'] , 'optional' : ['method','cutoff','hard','precision']})
+ cmd_dict['cluster.classic'] = dict({'required' : ['phylip'] , 'optional' : ['name','method','cutoff','hard','sim','precision']})
+ cmd_dict['cluster.fragments'] = dict({'required' : ['fasta'] , 'optional' : ['name','diffs','percent']})
+ cmd_dict['cluster.split'] = dict({'required' : [['fasta','phylip','column']] , 'optional' : ['name','method','splitmethod','taxonomy','taxlevel','showabund','cutoff','hard','large','precision','timing','processors']})
+ #cmd_dict['collect.shared'] = dict({'required' : [], 'optional' : ['calc','label','freq','groups','all']})
+ cmd_dict['collect.shared'] = dict({'required' : ['shared'], 'optional' : ['calc','label','freq','groups','all']})
+ #cmd_dict['collect.single'] = dict({'required' : [], 'optional' : ['calc','abund','size','label','freq']})
+ cmd_dict['collect.single'] = dict({'required' : [['list', 'sabund', 'rabund', 'shared']], 'optional' : ['calc','abund','size','label','freq']})
+ cmd_dict['consensus.seqs'] = dict({'required' : ['fasta'], 'optional' : ['list','name','label']})
+ cmd_dict['corr.axes'] = dict({'required' : [['shared','relabund','metadata'],'axes'], 'optional' : ['label','groups','method','numaxes']})
+ cmd_dict['degap.seqs'] = dict({'required' : ['fasta']})
+ cmd_dict['deunique.seqs'] = dict({'required' : ['fasta','name'], 'optional' : []})
+ #cmd_dict['dist.seqs'] = dict({'required' : ['fasta'], 'optional' : ['calc','countends','output','cutoff','processors']})
+ cmd_dict['dist.seqs'] = dict({'required' : ['fasta'], 'optional' : ['calc','countends','output','cutoff','oldfasta','column','processors']})
+ #cmd_dict['dist.shared'] = dict({'required' : [], 'optional' : ['calc','label','groups','output']})
+ cmd_dict['dist.shared'] = dict({'required' : ['shared'], 'optional' : ['calc','label','groups','output']})
+ cmd_dict['fastq.info'] = dict({'required' : ['fastq'], 'optional' : []})
+ cmd_dict['filter.seqs'] = dict({'required' : ['fasta'], 'optional' : ['vertical','trump','soft','hard','processors']})
+ #cmd_dict['get.group'] = dict({'required' : [], 'optional' : []})
+ cmd_dict['get.group'] = dict({'required' : ['shared'], 'optional' : []})
+ cmd_dict['get.groups'] = dict({'required' : ['group'], 'optional' : ['groups','accnos','fasta','name','list','taxonomy']})
+ cmd_dict['get.lineage'] = dict({'required' : ['taxonomy','taxon'],'optional' : ['fasta','name','group','list','alignreport','dups']})
+ ##cmd_dict['get.otulist'] = dict({'required' : [], 'optional' : []})
+ cmd_dict['get.otulist'] = dict({'required' : ['list'], 'optional' : ['label','sort']})
+ #cmd_dict['get.oturep'] = dict({'required' : ['fasta','list'], 'optional' : ['phylip','column','name','label','group','groups','sorted','precision','cutoff','large','weighted']})
+ cmd_dict['get.oturep'] = dict({'required' : ['fasta','list',['phylip','column']], 'optional' : ['name','label','group','groups','sorted','precision','cutoff','large','weighted']})
+ cmd_dict['get.otus'] = dict({'required' : ['group','list','label'], 'optional' : ['groups','accnos']})
+ ##cmd_dict['get.rabund'] = dict({'required' : [],'optional' : []})
+ cmd_dict['get.rabund'] = dict({'required' : [['list','sabund']],'optional' : ['sorted','label']})
+ #cmd_dict['get.relabund'] = dict({'required' : [],'optional' : ['scale','label','groups']})
+ cmd_dict['get.relabund'] = dict({'required' : ['shared'],'optional' : ['scale','label','groups']})
+ ##cmd_dict['get.sabund'] = dict({'required' : [],'optional' : []})
+ cmd_dict['get.sabund'] = dict({'required' : [['list','rabund']],'optional' : ['label']})
+ cmd_dict['get.seqs'] = dict({'required' : ['accnos',['fasta','qfile','name','group','list','alignreport','taxonomy']], 'optional' : ['dups']})
+ ##cmd_dict['get.sharedseqs'] = dict({'required' : [], 'optional' : []})
+ cmd_dict['get.sharedseqs'] = dict({'required' : ['list','group'], 'optional' : ['label', 'unique', 'shared', 'output', 'fasta']})
+ cmd_dict['hcluster'] = dict({'required' : [['column','phylip']] , 'optional' : ['name','method','cutoff','hard','precision','sorted','showabund','timing']})
+ #cmd_dict['heatmap.bin'] = dict({'required' : [], 'optional' : ['label','groups','scale','sorted','numotu','fontsize']})
+ cmd_dict['heatmap.bin'] = dict({'required' : [['list', 'sabund', 'rabund', 'shared']], 'optional' : ['label','groups','scale','sorted','numotu','fontsize']})
+ #cmd_dict['heatmap.sim'] = dict({'required' : [], 'optional' : ['calc','phylip','column','name','label','groups']})
+ cmd_dict['heatmap.sim'] = dict({'required' : [['shared','phylip','column']], 'optional' : ['calc','name','label','groups']})
+ cmd_dict['homova'] = dict({'required' : ['phylip','design'] , 'optional' : ['alpha','iters']})
+ cmd_dict['indicator'] = dict({'required' : ['tree',['shared','relabund']], 'optional' : ['groups','label','design']})
+ #cmd_dict['libshuff'] = dict({'required' : [],'optional' : ['iters','form','step','cutoff']})
+ cmd_dict['libshuff'] = dict({'required' : ['phylip','group'],'optional' : ['groups','iters','form','sim','step','cutoff']})
+ cmd_dict['list.seqs'] = dict({'required' : [['fasta','name','group','list','alignreport','taxonomy']]})
+ cmd_dict['make.fastq'] = dict({'required' : ['fasta','qfile'] , 'optional' : []})
+ #cmd_dict['make.group'] = dict({'required' : ['fasta','groups'], 'optional' : ['output']})
+ cmd_dict['make.group'] = dict({'required' : ['fasta','groups'], 'optional' : []})
+ cmd_dict['make.shared'] = dict({'required' : ['list','group'], 'optional' : ['label','groups','ordergroup']})
+ cmd_dict['mantel'] = dict({'required' : ['phylip','phylip2'] , 'optional' : ['method','iters']})
+ cmd_dict['merge.files'] = dict({'required' : ['input','output']})
+ cmd_dict['merge.groups'] = dict({'required' : ['shared','design'], 'optional' : ['groups', 'label']})
+ #cmd_dict['metastats'] = dict({'required' : ['design'], 'optional' : ['groups', 'label','iters','threshold','sets','processors']})
+ cmd_dict['metastats'] = dict({'required' : ['shared','design'], 'optional' : ['groups', 'label','iters','threshold','sets','processors']})
+ cmd_dict['nmds'] = dict({'required' : ['phylip'], 'optional' : ['axes','mindim','maxdim','iters','maxiters','epsilon']})
+ #cmd_dict['normalize.shared'] = dict({'required' : [], 'optional' : ['label','method','norm','groups']})
+ cmd_dict['normalize.shared'] = dict({'required' : [['shared','relabund']], 'optional' : ['label','method','norm','groups','makerelabund']})
+ ##cmd_dict['otu.hierarchy'] = dict({'required' : [], 'optional' : []})
+ cmd_dict['otu.hierarchy'] = dict({'required' : ['list','label'], 'optional' : ['output']})
+ cmd_dict['pairwise.seqs'] = dict({'required' : ['fasta'], 'optional' : ['align','calc','countends','output','cutoff','match','mismatch','gapopen','gapextend','processors']})
+ cmd_dict['parse.list'] = dict({'required' : ['list','group'], 'optional' : ['label']})
+ #cmd_dict['parsimony'] = dict({'required' : [], 'optional' : ['groups','iters','random','processors']})
+ cmd_dict['parsimony'] = dict({'required' : ['tree'], 'optional' : ['group','groups','name','iters','random','processors']})
+ #cmd_dict['pca'] = dict({'required' : [], 'optional' : ['label','groups','metric']})
+ cmd_dict['pca'] = dict({'required' : [['shared','relabund']], 'optional' : ['label','groups','metric']})
+ #cmd_dict['pcoa'] = dict({'required' : ['phylip'], 'optional' : []})
+ cmd_dict['pcoa'] = dict({'required' : ['phylip'], 'optional' : ['metric']})
+ #cmd_dict['phylo.diversity'] = dict({'required' : [],'optional' : ['groups','iters','freq','scale','rarefy','collect','summary','processors']})
+ cmd_dict['phylo.diversity'] = dict({'required' : ['tree','group'],'optional' : ['name','groups','iters','freq','scale','rarefy','collect','summary','processors']})
+ cmd_dict['phylotype'] = dict({'required' : ['taxonomy'],'optional' : ['name','cutoff','label']})
+ #cmd_dict['pre.cluster'] = dict({'required' : ['fasta'], 'optional' : ['names','diffs']})
+ cmd_dict['pre.cluster'] = dict({'required' : ['fasta'], 'optional' : ['name','diffs']})
+ #cmd_dict['rarefaction.shared'] = dict({'required' : [], 'optional' : ['label','iters','groups','jumble']})
+ cmd_dict['rarefaction.shared'] = dict({'required' : ['shared'], 'optional' : ['calc','label','iters','groups','jumble']})
+ #cmd_dict['rarefaction.single'] = dict({'required' : [], 'optional' : ['calc','abund','iters','label','freq','processors']})
+ cmd_dict['rarefaction.single'] = dict({'required' : [['list', 'sabund', 'rabund', 'shared']], 'optional' : ['calc','abund','iters','label','freq','processors']})
+ #cmd_dict['read.dist'] = dict({'required' : [['phylip','column']], 'optional' : ['name','cutoff','hard','precision','sim','group']})
+ #cmd_dict['read.otu'] = dict({'required' : [['rabund','sabund','list','shared','relabund']], 'optional' : ['label','group','groups','ordergroup']})
+ #cmd_dict['read.tree'] = dict({'required' : ['tree'], 'optional' : ['name','group']})
+ cmd_dict['remove.groups'] = dict({'required' : ['group'], 'optional' : ['groups','accnos','fasta','name','list','taxonomy']})
+ cmd_dict['remove.lineage'] = dict({'required' : ['taxonomy','taxon'],'optional' : ['fasta','name','group','list','alignreport','dups']})
+ cmd_dict['remove.otus'] = dict({'required' : ['group','list','label'], 'optional' : ['groups','accnos']})
+ #cmd_dict['remove.rare'] = dict({'required' : [['list','sabund','rabund','shared'],'nseqs'], 'optional' : ['group','groups','label','bygroup']})
+ cmd_dict['remove.rare'] = dict({'required' : [['list','sabund','rabund','shared'],'nseqs'], 'optional' : ['group','groups','label','bygroup']})
+ cmd_dict['remove.seqs'] = dict({'required' : ['accnos',['fasta','qfile','name','group','list','alignreport','taxonomy']], 'optional' : ['dups']})
+ cmd_dict['reverse.seqs'] = dict({'required' : ['fasta']})
+ cmd_dict['screen.seqs'] = dict({'required' : ['fasta'], 'optional' : ['start','end','maxambig','maxhomop','minlength','maxlength','criteria','optimize','name','group','alignreport','processors']})
+ cmd_dict['sens.spec'] = dict({'required' : ['list',['column','phylip']] , 'optional' : ['label','cutoff','hard','precision']})
+ cmd_dict['sffinfo'] = dict({'required' : [['sff','sfftxt']], 'optional' : ['fasta','qfile','trim','sfftxt','flow','accnos']})
+ cmd_dict['split.abund'] = dict({'required' : ['fasta',['name','list']], 'optional' : ['cutoff','group','groups','label','accnos']})
+ #cmd_dict['split.groups'] = dict({'required' : ['fasta','group'], 'optional' : []})
+ cmd_dict['split.groups'] = dict({'required' : ['fasta','group'], 'optional' : ['name','groups']})
+ cmd_dict['sub.sample'] = dict({'required' : [['fasta','list','sabund','rabund','shared']], 'optional' : ['name','group','groups','label','size','persample']})
+ #cmd_dict['summary.seqs'] = dict({'required' : ['fasta'],'outputs' : ['names']})
+ cmd_dict['summary.seqs'] = dict({'required' : ['fasta'], 'optional' : ['name','processors']})
+ #cmd_dict['summary.shared'] = dict({'required' : [], 'optional' : ['calc','label','groups','all','distance']})
+ cmd_dict['summary.shared'] = dict({'required' : ['shared'], 'optional' : ['calc','label','groups','all','distance','processors']})
+ #cmd_dict['summary.single'] = dict({'required' : [], 'optional' : ['calc','abund','size','label','groupmode']})
+ cmd_dict['summary.single'] = dict({'required' : [['list','sabund','rabund','shared']], 'optional' : ['calc','abund','size','label','groupmode']})
+ #cmd_dict['tree.shared'] = dict({'required' : [], 'optional' : ['groups','calc','cutoff','precision','label']})
+ cmd_dict['tree.shared'] = dict({'required' : [['shared','phylip','column']], 'optional' : ['name','groups','calc','cutoff','precision','label']})
+ cmd_dict['trim.seqs'] = dict({'required' : ['fasta'], 'optional' : ['group','oligos','qfile','qaverage','qthreshold','qtrim','flip','maxambig','maxhomop','minlength','maxlength','bdiffs','pdiffs','tdiffs','allfiles','keepfirst','removelast']})
+ #cmd_dict['unifrac.unweighted'] = dict({'required' : [], 'optional' : ['groups','iters','distance','random','root','processors']})
+ cmd_dict['unifrac.unweighted'] = dict({'required' : ['tree'], 'optional' : ['name','group','groups','iters','distance','random','root','processors']})
+ #cmd_dict['unifrac.weighted'] = dict({'required' : [], 'optional' : ['groups','iters','distance','random','root','processors']})
+ cmd_dict['unifrac.weighted'] = dict({'required' : ['tree'], 'optional' : ['name','group','groups','iters','distance','random','root','processors']})
+ #cmd_dict['unique.seqs'] = dict({'required' : ['fasta'], 'optional' : ['names']})
+ cmd_dict['unique.seqs'] = dict({'required' : ['fasta'], 'optional' : ['name']})
+ #cmd_dict['venn'] = dict({'required' : [], 'optional' : ['calc','label','groups','abund','nseqs','permute']})
+ cmd_dict['venn'] = dict({'required' : [['list','shared']], 'optional' : ['calc','label','groups','abund','nseqs','permute']})
+ ##
+ """
cmd_dict['merge.files'] = dict({'required' : ['input','output']})
cmd_dict['make.group'] = dict({'required' : ['fasta','groups'], 'optional' : ['output']})
cmd_dict['merge.groups'] = dict({'required' : ['shared','design'], 'optional' : ['groups', 'label']})
@@ -192,7 +342,7 @@
cmd_dict['chimera.ccode'] = dict({'required' : ['fasta','template'], 'optional' : ['filter','mask','window','numwanted','processors']})
cmd_dict['chimera.check'] = dict({'required' : ['fasta','template'], 'optional' : ['ksize','svg','name','increment','processors']})
cmd_dict['chimera.pintail'] = dict({'required' : ['fasta','template'], 'optional' : ['conservation','quantile','filter','mask','window','increment','processors']})
- cmd_dict['chimera.slayer'] = dict({'required' : ['fasta','template'], 'optional' : ['name','search','window','increment','match','mismatch','numwanted','parents','minsim','mincov','iters','minbs','minsnp','divergence','realign','processors']})
+ cmd_dict['chimera.slayer'] = dict({'required' : ['fasta','template'], 'optional' : ['name','search','window','increment','match','mismatch','numwanted','parents','minsim','mincov','iters','minbs','minsnp','divergence','realign','split','processors']})
cmd_dict['dist.seqs'] = dict({'required' : ['fasta'], 'optional' : ['calc','countends','output','cutoff','processors']})
cmd_dict['pairwise.seqs'] = dict({'required' : ['fasta'], 'optional' : ['align','calc','countends','output','cutoff','match','mismatch','gapopen','gapextend','processors']})
cmd_dict['read.dist'] = dict({'required' : [['phylip','column']], 'optional' : ['name','cutoff','hard','precision','sim','group']})
@@ -203,7 +353,7 @@
cmd_dict['cluster.fragments'] = dict({'required' : ['fasta'] , 'optional' : ['name','diffs','percent']})
cmd_dict['cluster.split'] = dict({'required' : [['fasta','phylip','column']] , 'optional' : ['name','method','splitmethod','taxonomy','taxlevel','showabund','cutoff','hard','large','precision','timing','processors']})
cmd_dict['metastats'] = dict({'required' : ['design'], 'optional' : ['groups', 'label','iters','threshold','sets','processors']})
- cmd_dict['summary.single'] = dict({'required' : [], 'optional' : ['calc','abund','size','label','groupmode','processors']})
+ cmd_dict['summary.single'] = dict({'required' : [], 'optional' : ['calc','abund','size','label','groupmode']})
cmd_dict['summary.shared'] = dict({'required' : [], 'optional' : ['calc','label','groups','all','distance']})
cmd_dict['collect.single'] = dict({'required' : [], 'optional' : ['calc','abund','size','label','freq']})
cmd_dict['collect.shared'] = dict({'required' : [], 'optional' : ['calc','label','freq','groups','all']})
@@ -214,21 +364,21 @@
cmd_dict['split.abund'] = dict({'required' : ['fasta',['name','list']], 'optional' : ['cutoff','group','groups','label','accnos']})
cmd_dict['split.groups'] = dict({'required' : ['fasta','group'], 'optional' : []})
cmd_dict['tree.shared'] = dict({'required' : [], 'optional' : ['groups','calc','cutoff','precision','label']})
- cmd_dict['unifrac.unweighted'] = dict({'required' : [], 'optional' : ['groups','iters','distance','random','processors']})
- cmd_dict['unifrac.weighted'] = dict({'required' : [], 'optional' : ['groups','iters','distance','random','processors']})
+ cmd_dict['unifrac.unweighted'] = dict({'required' : [], 'optional' : ['groups','iters','distance','random','root','processors']})
+ cmd_dict['unifrac.weighted'] = dict({'required' : [], 'optional' : ['groups','iters','distance','random','root','processors']})
cmd_dict['parsimony'] = dict({'required' : [], 'optional' : ['groups','iters','random','processors']})
cmd_dict['sffinfo'] = dict({'required' : ['sff'], 'optional' : ['fasta','qfile','trim','sfftxt','flow','accnos']})
cmd_dict['fastq.info'] = dict({'required' : ['fastq'], 'optional' : []})
cmd_dict['heatmap.bin'] = dict({'required' : [], 'optional' : ['label','groups','scale','sorted','numotu','fontsize']})
cmd_dict['heatmap.sim'] = dict({'required' : [], 'optional' : ['calc','phylip','column','name','label','groups']})
- cmd_dict['venn'] = dict({'required' : [], 'optional' : ['calc','label','groups','nseqs','permute']})
+ cmd_dict['venn'] = dict({'required' : [], 'optional' : ['calc','label','groups','abund','nseqs','permute']})
cmd_dict['pcoa'] = dict({'required' : ['phylip'], 'optional' : []})
cmd_dict['pca'] = dict({'required' : [], 'optional' : ['label','groups','metric']})
cmd_dict['nmds'] = dict({'required' : ['phylip'], 'optional' : ['axes','mindim','maxdim','iters','maxiters','epsilon']})
cmd_dict['corr.axes'] = dict({'required' : [['shared','relabund','metadata'],'axes'], 'optional' : ['label','groups','method','numaxes']})
cmd_dict['get.group'] = dict({'required' : [], 'optional' : []})
cmd_dict['phylotype'] = dict({'required' : ['taxonomy'],'optional' : ['name','cutoff','label']})
- cmd_dict['phylo.diversity'] = dict({'required' : [],'optional' : ['groups','iters','freq','processors','scale','rarefy','collect','summary','processors']})
+ cmd_dict['phylo.diversity'] = dict({'required' : [],'optional' : ['groups','iters','freq','scale','rarefy','collect','summary','processors']})
cmd_dict['get.oturep'] = dict({'required' : ['fasta','list'], 'optional' : ['phylip','column','name','label','group','groups','sorted','precision','cutoff','large','weighted']})
cmd_dict['get.relabund'] = dict({'required' : [],'optional' : ['scale','label','groups']})
cmd_dict['libshuff'] = dict({'required' : [],'optional' : ['iters','form','step','cutoff']})
@@ -237,10 +387,7 @@
cmd_dict['get.lineage'] = dict({'required' : ['taxonomy','taxon'],'optional' : ['fasta','name','group','list','alignreport','dups']})
cmd_dict['remove.lineage'] = dict({'required' : ['taxonomy','taxon'],'optional' : ['fasta','name','group','list','alignreport','dups']})
cmd_dict['bootstrap.shared'] = dict({'required' : [], 'optional' : ['calc','groups','iters','label']})
- """
- Mothur 1.15
- """
- cmd_dict['cluster.classic'] = dict({'required' : [] , 'optional' : ['method','cutoff','hard','precision']})
+ cmd_dict['cluster.classic'] = dict({'required' : ['phylip'] , 'optional' : ['method','cutoff','hard','precision']})
cmd_dict['get.groups'] = dict({'required' : ['group'], 'optional' : ['groups','accnos','fasta','name','list','taxonomy']})
cmd_dict['remove.groups'] = dict({'required' : ['group'], 'optional' : ['groups','accnos','fasta','name','list','taxonomy']})
cmd_dict['get.otus'] = dict({'required' : ['group','list','label'], 'optional' : ['groups','accnos']})
@@ -251,6 +398,14 @@
cmd_dict['sub.sample'] = dict({'required' : [['fasta','list','sabund','rabund','shared']], 'optional' : ['name','group','groups','label','size','persample']})
cmd_dict['consensus.seqs'] = dict({'required' : ['fasta'], 'optional' : ['list','name','label']})
cmd_dict['indicator'] = dict({'required' : ['tree',['shared','relabund']], 'optional' : ['groups','label','design']})
+
+ cmd_dict['amova'] = dict({'required' : ['phylip','design'] , 'optional' : ['alpha','iters']})
+ cmd_dict['homova'] = dict({'required' : ['phylip','design'] , 'optional' : ['alpha','iters']})
+ cmd_dict['anosim'] = dict({'required' : ['phylip','design'] , 'optional' : ['alpha','iters']})
+ cmd_dict['mantel'] = dict({'required' : ['phylip','phylip2'] , 'optional' : ['method','iters']})
+ cmd_dict['make.fastq'] = dict({'required' : ['fasta','qfile'] , 'optional' : []})
+ """
+
parser = optparse.OptionParser()
# Options for managing galaxy interaction
parser.add_option( '--debug', dest='debug', action='store_true', default=False, help='Turn on wrapper debugging to stdout' )
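The cmd_dict entries above act as a small per-command schema for the wrapper: 'required' names parameters that must be present, a nested list such as [['list','shared']] means exactly one of the alternatives is needed, and 'optional' lists everything else the command accepts. A minimal sketch of how such a schema could be checked against a set of supplied option names (validate_params is illustrative, not part of the wrapper):

    def validate_params(cmd, supplied, cmd_dict):
        """Check a set of supplied option names against a cmd_dict entry."""
        spec = cmd_dict[cmd]
        allowed = set(spec['optional'])
        for req in spec['required']:
            if isinstance(req, list):
                # nested list: at least one of the alternatives must be given
                if not any(name in supplied for name in req):
                    return False
                allowed.update(req)
            else:
                if req not in supplied:
                    return False
                allowed.add(req)
        # anything outside required + optional is not valid for this command
        return all(name in allowed for name in supplied)

    # e.g. validate_params('venn', set(['list', 'calc']), cmd_dict) -> True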
@@ -318,6 +473,7 @@
# parser.add_option( '--taxon', dest='taxon', action="callback", callback=remove_confidence_callback, help='A Taxon' )
parser.add_option( '--candidate', dest='candidate', help=' file ' )
parser.add_option( '--template', dest='template', help=' file ' )
+ parser.add_option( '--reference', dest='reference', help=' file ' )
parser.add_option( '--dups', dest='dups', help='if True also apply to the aliases from the names files' )
parser.add_option( '--keep', dest='keep', help='Either front or back to specify the which end of the sequence to keep' )
parser.add_option( '--search', dest='search', help='Method for finding the template sequence: kmer, blast, suffix' )
@@ -353,6 +509,7 @@
parser.add_option( '--output', dest='output', help='Format for output' )
parser.add_option( '--method', dest='method', help='Method to use for analysis - cluster' )
parser.add_option( '--splitmethod', dest='splitmethod', help='Method to split a distance file - cluster.split' )
+ parser.add_option( '--split', dest='split', help='Chimera split parameter, whether to detect trimeras and quadmeras' )
parser.add_option( '--abund', dest='abund', type='int', help='Threshold for rare to Abundant OTU classification' )
parser.add_option( '--size', dest='size', type='int', help='Size - sample size' )
parser.add_option( '--groupmode', dest='groupmode', help='Collate groups into one result table' )
@@ -373,12 +530,15 @@
parser.add_option( '--percent', dest='percent', type='int', help='(0-100 percent)' )
parser.add_option( '--divergence', dest='divergence', type='float', help='Divergence cutoff for chimera determination' )
parser.add_option( '--sff', dest='sff', help='Sff file' )
+ parser.add_option( '--svg', dest='svg', help='SVG' )
parser.add_option( '--sfftxt', dest='sfftxt', help='Generate a sff.txt file' )
parser.add_option( '--flow', dest='flow', help='Generate a flowgram file' )
parser.add_option( '--trim', dest='trim', help='Whether sequences and quality scores are trimmed to the clipQualLeft and clipQualRight values' )
parser.add_option( '--input', dest='input', help='' )
parser.add_option( '--phylip', dest='phylip', help='' )
+ parser.add_option( '--phylip2', dest='phylip2', help='' )
parser.add_option( '--column', dest='column', help='' )
+ parser.add_option( '--sort', dest='sort', help='specify sort order' )
parser.add_option( '--sorted', dest='sorted', help='Input is presorted' )
parser.add_option( '--showabund', dest='showabund', help='' )
parser.add_option( '--short', dest='short', help='Keep sequences that are too short to chop' )
@@ -387,6 +547,7 @@
parser.add_option( '--numotu', dest='numotu', help='' )
parser.add_option( '--fontsize', dest='fontsize', help='' )
parser.add_option( '--neqs', dest='neqs', help='' )
+ parser.add_option( '--random', dest='random', help='' )
parser.add_option( '--permute', dest='permute', help='' )
parser.add_option( '--rarefy', dest='rarefy', help='' )
parser.add_option( '--collect', dest='collect', help='' )
@@ -408,6 +569,8 @@
parser.add_option( '--sets', dest='sets', help='' )
parser.add_option( '--metric', dest='metric', help='' )
parser.add_option( '--epsilon', dest='epsilon', help='' )
+ parser.add_option( '--alpha', dest='alpha', help='' )
+ parser.add_option( '--root', dest='root', help='' )
parser.add_option( '--axes', dest='axes', help='table of name column followed by columns of axis values' )
parser.add_option( '--numaxes', dest='numaxes', help='the number of axes' )
parser.add_option( '--metadata', dest='metadata', help='data table with columns of floating-point values' )
@@ -446,7 +609,14 @@
os.makedirs(options.tmpdir)
tmp_dir = options.tmpdir
else:
- tmp_dir = tempfile.mkdtemp()
+ if options.outputdir != None:
+ if not os.path.isdir(options.outputdir):
+ os.makedirs(options.outputdir)
+ tmp_dir = os.path.join(options.outputdir,'tmp')
+ if not os.path.isdir(tmp_dir):
+ os.makedirs(tmp_dir)
+ else:
+ tmp_dir = tempfile.mkdtemp()
if options.inputdir != None:
if not os.path.isdir(options.inputdir):
os.makedirs(options.inputdir)
@@ -478,11 +648,12 @@
# print >> sys.stderr, cmd_opts
# print >> sys.stderr, params # so will appear as blurb for file
params.append('%s(%s)' % (options.cmd,cmd_opts))
+ if debug: params.append('get.current()')
try:
# Generate the mothur commandline
# http://www.mothur.org/wiki/Command_line_mode
cmdline = 'mothur "#' + '; '.join(params) + '"'
- # print >> sys.stdout, '%s' % cmdline
+ if debug: print >> sys.stdout, '%s' % cmdline
if tmp_dir == None or not os.path.isdir(tmp_dir):
tmp_dir = tempfile.mkdtemp()
tmp_stderr_name = tempfile.NamedTemporaryFile( dir=tmp_dir,suffix='.err' ).name
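A few lines up, the collected params are joined into mothur's command line mode string (the wiki page in the comment documents the '#cmd1(); cmd2()' syntax). As a rough illustration with a made-up dataset name, a unique.seqs run with debug enabled would build something like:

    params = ['unique.seqs(fasta=dataset_001.dat)', 'get.current()']
    cmdline = 'mothur "#' + '; '.join(params) + '"'
    # cmdline == 'mothur "#unique.seqs(fasta=dataset_001.dat); get.current()"'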
@@ -492,6 +663,7 @@
proc = subprocess.Popen( args=cmdline, shell=True, cwd=tmp_dir, stderr=tmp_stderr.fileno(), stdout=tmp_stdout.fileno() )
# proc = subprocess.Popen( args=cmdline, shell=True, stderr=subprocess.PIPE, stdout=subprocess.PIPE )
returncode = proc.wait()
+ if debug: print >> sys.stdout, 'returncode %d' % returncode
tmp_stderr.close()
# get stderr, allowing for case where it's very large
tmp_stderr = open( tmp_stderr_name, 'rb' )
@@ -505,12 +677,26 @@
except OverflowError:
pass
tmp_stderr.close()
+ tmp_stdout.close()
+ if debug: print >> sys.stdout, 'parse %s' % tmp_stdout_name
if returncode != 0:
+ try:
+ # try to copy stdout to the logfile
+ for output in options.result.split(','):
+ # Each item has a regex pattern and a file path to a galaxy dataset
+ (pattern,path) = output.split(':')
+ if debug: print >> sys.stdout, '%s -> %s' % (pattern,path)
+ if pattern.find('\.logfile') > 0:
+ if path != None and os.path.exists(path):
+ logfile_to_html(tmp_stdout_name,path,inputdir,outputdir,title="Mothur %s Error Logfile" % options.cmd)
+ break
+ except:
+ pass
raise Exception, stderr
stdout = ''
# Parse stdout to provide info
- tmp_stdout.close()
tmp_stdout = open( tmp_stdout_name, 'rb' )
+ # try to find a short, interesting snippet of stdout to print as info for the galaxy interface
info = ''
if options.cmd.startswith('chimera') and not options.cmd.endswith('check'):
pattern = '^.*$'
@@ -533,19 +719,32 @@
info += "Chimeras: %d" % chimera_count
else:
found_begin = False
+ info_chars = 0
for line in tmp_stdout:
if line.find(outputdir) >= 0:
continue
+ if line.startswith('**************'):
+ continue
if re.match('^Processing.*',line):
continue
+ if re.match('^Reading .*',line):
+ continue
+ if re.match('^Merging .*',line):
+ continue
+ if re.match('^DONE.*',line):
+ continue
if re.match('.*\.\.\.\s*$',line):
continue
if re.match('^\d*\s*$',line):
continue
+ # if re.match('^(unique|[0-9.]*)(\t\d+)+',line): # abundance from cluster commands
+ if not options.cmd.startswith('unifrac') and re.match('^([0-9.]+)(\t\d+)*',line): # abundance from cluster commands, allow unique line into info
+ continue
if re.match('Output .*',line):
break
- if found_begin:
+ if found_begin and info_chars < 200:
info += "%s" % line
+ info_chars += len(line)
if re.match('mothur > ' + options.cmd + '\(.*\)', line):
found_begin = True
tmp_stdout.close()
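The loop above distills mothur's stdout into a short info blurb for the Galaxy history item: collection starts after the echoed 'mothur > cmd(...)' line, progress noise (Reading/Merging/abundance tables, separator rows) is skipped, and it stops at the 'Output ...' section or after roughly 200 characters. A condensed sketch of that filtering, with the skip patterns folded into one list (extract_info is illustrative, not a wrapper function):

    import re

    def extract_info(lines, cmd, outputdir, limit=200):
        """Condensed version of the stdout filtering done above."""
        skip = [r'^Processing', r'^Reading ', r'^Merging ', r'^DONE',
                r'.*\.\.\.\s*$', r'^\d*\s*$']
        info = ''
        started = False
        for line in lines:
            if outputdir in line or line.startswith('**************'):
                continue
            if any(re.match(p, line) for p in skip):
                continue
            if re.match(r'Output .*', line):
                break
            if started and len(info) < limit:
                info += line
            if re.match('mothur > ' + re.escape(cmd) + r'\(.*\)', line):
                started = True
        return info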
@@ -553,6 +752,15 @@
# Collect output files
flist = os.listdir(outputdir)
if debug: print >> sys.stdout, '%s' % flist
+ # chimera.check can generate svg files, but they are not listed in the mothur.*.logfile, so we add them here
+ if options.cmd == 'chimera.check':
+ svgs = []
+ mothurlog = None
+ for fname in flist:
+ if fname.endswith('.svg'):
+ svgs.append(fname)
+ elif fname.endswith('.logfile'):
+ mothurlog = fname
# process option result first
# These are the known galaxy datasets listed in the --result= param
if len(flist) > 0 and options.result:
@@ -573,23 +781,31 @@
if fname.endswith('.logfile'):
# Make the logfile into html
logfile_to_html(fpath,path,inputdir,outputdir,title="Mothur %s Logfile" % options.cmd)
- elif False and outputdir == options.outputdir:
- # Use a hard link if outputdir is the extra_files_path
+ elif outputdir == options.outputdir:
+ # Use a hard link if outputdir is the extra_files_path; this lets the mothur logfile link to the data without copying it.
try:
+ if debug: print >> sys.stdout, 'link %s %s' % (fpath, path)
os.link(fpath, path)
except:
+ if debug: print >> sys.stdout, 'copy %s %s' % (fpath, path)
shutil.copy2(fpath, path)
else:
+ if debug: print >> sys.stdout, 'copy2 %s %s' % (fpath, path)
shutil.copy2(fpath, path)
break
+ # mothur.*.logfile may be in tmp_dir
# chimera.pintail e.g. generates files in the working dir that we might want to save
if not found:
for fname in os.listdir(tmp_dir):
if debug: print >> sys.stdout, 'tmpdir %s match: %s' % (fname,re.match(pattern,fname))
if re.match(pattern,fname):
fpath = os.path.join(tmp_dir,fname)
- shutil.copy2(fpath, path)
- break
+ if fname.endswith('.logfile'):
+ # Make the logfile into html
+ logfile_to_html(fpath,path,inputdir,outputdir,title="Mothur %s Logfile" % options.cmd)
+ else:
+ shutil.copy2(fpath, path)
+ break
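The block above is driven by the --result option that every tool XML in this patch builds: a comma-separated list of regex:galaxy_dataset_path pairs. Each file produced in the working directory is matched against the patterns; a match is hard-linked or copied onto the dataset path, and the mothur.*.logfile entry is rendered to HTML instead (it also serves as the fallback for stdout when mothur fails). A simplified, copy-only sketch of that mapping:

    import os, re, shutil

    def map_results(result_opt, outputdir):
        """Pair each '<regex>:<dataset path>' entry with one produced file."""
        for entry in result_opt.split(','):
            pattern, path = entry.split(':')
            for fname in os.listdir(outputdir):
                if re.match(pattern, fname):
                    shutil.copy2(os.path.join(outputdir, fname), path)
                    break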
# Handle the dynamically generated galaxy datasets
# http://bitbucket.org/galaxy/galaxy-central/wiki/ToolsMultipleOutput
# --new_datasets= specifies files to copy to the new_file_path
@@ -605,7 +821,8 @@
if m:
fpath = os.path.join(outputdir,fname)
if len(m.groups()) > 0:
- root = m.groups()[0]
+ # remove underscores since galaxy uses that as a field separator for dynamic datasets
+ root = m.groups()[0].replace('_','')
else:
# remove the ext from the name if it exists, galaxy will add back later
# remove underscores since galaxy uses that as a field separator for dynamic datasets
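The underscore stripping above is needed because of how Galaxy discovers the extra datasets declared with --new_datasets: files dropped into new_file_path are parsed by splitting their names on '_', roughly following the primary_<dataset id>_<name>_visible_<ext> convention described on the ToolsMultipleOutput page linked above (treat the exact format here as an assumption). A small sketch of composing such a name:

    def new_dataset_name(dataset_id, name, ext):
        """Compose a discoverable file name; underscores in the name would be
        read as extra field separators, so they are removed."""
        return 'primary_%s_%s_visible_%s' % (dataset_id, name.replace('_', ''), ext)

    # e.g. new_dataset_name(42, 'sample_01.rarefaction', 'tabular')
    #   -> 'primary_42_sample01.rarefaction_visible_tabular'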
@@ -632,13 +849,18 @@
try:
if outputdir != options.outputdir and os.path.exists(outputdir):
if os.path.islink(outputdir):
+ if debug: print >> sys.stdout, 'rm outputdir %s' % outputdir
os.remove(outputdir)
+ if debug: print >> sys.stdout, 'rmtree outputdir %s' % outputdir
shutil.rmtree(os.path.dirname(outputdir))
else:
+ if debug: print >> sys.stdout, 'rmtree %s' % outputdir
shutil.rmtree(outputdir)
if inputdir != options.inputdir and os.path.exists(inputdir):
+ if debug: print >> sys.stdout, 'rmtree %s' % inputdir
shutil.rmtree(inputdir)
except:
+ if debug: print >> sys.stdout, 'rmtree failed'
pass
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/nmds.xml
--- a/mothur/tools/mothur/nmds.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/nmds.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,9 +1,9 @@
-
+
generate non-metric multidimensional scaling data
mothur_wrapper.py
--cmd='nmds'
- --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.nmds\.axes$:'$nmds_axes,'^\S+\.nmds\.iters$:'$nmds_iters,'^\S+\.stress\.nmds$:'$stress_nmds
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.nmds\.axes$:'$nmds_axes,'^\S+\.nmds\.iters$:'$nmds_iters,'^\S+\.nmds\.stress$:'$stress_nmds
--outputdir='$logfile.extra_files_path'
--phylip=$dist
#if $axes.__str__ != "None" and len($axes.__str__) > 0:
@@ -38,7 +38,7 @@
-
+
mothur
@@ -56,8 +56,9 @@
**Command Documenation**
-The nmds_ command generates non-metric multidimensional scaling data
+The nmds_ command generates non-metric multidimensional scaling data from a phylip_distance_matrix_.
+.. _phylip_distance_matrix: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
.. _nmds: http://www.mothur.org/wiki/Nmds
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/normalize.shared.xml
--- a/mothur/tools/mothur/normalize.shared.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/normalize.shared.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,27 +1,21 @@
-
+
Normalize the number of sequences per group to a specified level
mothur_wrapper.py
--cmd='normalize.shared'
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.norm\.shared$:'$shared
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
- #elif $input.source == 'shared':
- --READ_shared=$input.otu
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
+ #if isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('shared').__class__):
+ --shared=$otu
+ $makerelabund
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('relabund').__class__):
+ --relabund=$otu
#end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --groups=$input.groups
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label='$label'
+ #end if
+ #if $groups.__str__ != "None" and len($groups.__str__) > 0:
+ --groups=$groups
#end if
#if $method.__str__ != "None" and len($method.__str__) > 0:
--method=$method
@@ -31,53 +25,29 @@
#end if
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
-
+
+
@@ -99,8 +69,10 @@
**Command Documenation**
-The normalize.shared_ command normalizes the number of sequences per group to a specified level.
+The normalize.shared_ command normalizes the number of sequences per group to a specified level. The input is a shared_ or relabund_ file.
+.. _shared: http://www.mothur.org/wiki/Shared_file
+.. _relabund: http://www.mothur.org/wiki/Get.relabund
.. _normalize.shared: http://www.mothur.org/wiki/Normalize.shared
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/otu.hierarchy.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/otu.hierarchy.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,61 @@
+
+ Relate OTUs at different distances
+
+ mothur_wrapper.py
+ ## output: {list_file_name}.otu.hierarchy (collected via the --result regex below)
+ #import re, os.path
+ --cmd='otu.hierarchy'
+ --outputdir='$logfile.extra_files_path'
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.otu\.hierarchy$:'$hierarchy
+ --list=$list
+ --label=$label1,$label2
+ --output=$output
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The otu.hierarchy_ command relates OTUs from a list_ at different distances.
+
+.. _list: http://www.mothur.org/wiki/List_file
+.. _otu.hierarchy: http://www.mothur.org/wiki/Otu.hierarchy
+
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/pairwise.seqs.xml
--- a/mothur/tools/mothur/pairwise.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/pairwise.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
calculate uncorrected pairwise distances between sequences
mothur_wrapper.py
@@ -14,10 +14,12 @@
#if float($cutoff.__str__) > 0.0:
--cutoff=$cutoff
#end if
- --match=$scoring.match
- --mismatch=$scoring.mismatch
- --gapopen=$scoring.gapopen
- --gapextend=$scoring.gapextend
+ #if $scoring.setby == 'user':
+ --match=$scoring.match
+ --mismatch=$scoring.mismatch
+ --gapopen=$scoring.gapopen
+ --gapextend=$scoring.gapextend
+ #end if
#if len($output.__str__) > 0:
--output=$output
#end if
@@ -40,10 +42,21 @@
help="Penalize terminal gaps"/>
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
@@ -76,8 +89,10 @@
**Command Documenation**
-The pairwise.seqs_ command will calculate uncorrected pairwise distances between sequences.
+The pairwise.seqs_ command will calculate uncorrected pairwise distances between sequences and report them as a column-formatted_distance_matrix_ or phylip-formatted_distance_matrix_.
+.. _column-formatted_distance_matrix: http://www.mothur.org/wiki/Column-formatted_distance_matrix
+.. _phylip-formatted_distance_matrix: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
.. _pairwise.seqs: http://www.mothur.org/wiki/Pairwise.seqs
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/parse.list.xml
--- a/mothur/tools/mothur/parse.list.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/parse.list.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,5 +1,5 @@
-
- Order Sequences by OTU
+
+ Generate a List file for each group
mothur_wrapper.py
--cmd='parse.list'
@@ -16,7 +16,8 @@
-
+
+ All labels are included if none are selected
@@ -42,8 +43,10 @@
**Command Documenation**
-The parse.list_ command prints out a fasta-formatted file where sequences are ordered according to the OTU that they belong to. Such an output may be helpful for generating primers specific to an OTU or for classification of sequences.
+The parse.list_ command reads a list_ file and a group_ file and generates a list_ file for each group in the group file.
+.. _list: http://www.mothur.org/wiki/List_file
+.. _group: http://www.mothur.org/wiki/Group_file
.. _parse.list: http://www.mothur.org/wiki/Parse.list
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/parsimony.xml
--- a/mothur/tools/mothur/parsimony.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/parsimony.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,36 +1,37 @@
-
+
Describes whether two or more communities have the same structure
mothur_wrapper.py
--cmd='parsimony'
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.psummary$:'$psummary,'^\S+\.parsimony$:'$parsimony
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.tree'
- --READ_tree=$tree
+ --tree=$tree
#if $group.__str__ != "None" and len($group.__str__) > 0:
- --READ_group='$group'
+ --group='$group'
#end if
#if $groups.__str__ != "None" and len($groups.__str__) > 0:
--groups='$groups'
#end if
+ #if $name.__str__ != "None" and len($name.__str__) > 0:
+ --name='$name'
+ #end if
#if int($iters.__str__) > 0:
--iters=$iters
#end if
--processors=2
-
-
-
-
+
+
+
+
-
-
+
+
@@ -36,6 +40,7 @@
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/phylotype.xml
--- a/mothur/tools/mothur/phylotype.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/phylotype.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,9 +1,9 @@
-
+
Assign sequences to OTUs based on taxonomy
mothur_wrapper.py
--cmd='phylotype'
- --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.[fna]n\.sabund$:'$sabund,'^\S+\.[fna]n\.rabund$:'$rabund,'^\S+\.[fna]n\.list$:'$otulist
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.sabund$:'$sabund,'^\S+\.rabund$:'$rabund,'^\S+\.list$:'$otulist
--outputdir='$logfile.extra_files_path'
--taxonomy=$taxonomy
#if 50 >= int($cutoff.__str__) > 0:
@@ -17,7 +17,7 @@
#end if
-
+
@@ -36,8 +36,8 @@
-
-
+
+
@@ -56,8 +56,11 @@
**Command Documenation**
-The phylotype_ command assign sequences to OTUs based on their taxonomy and outputs a .list, .rabund and .sabund files.
+The phylotype_ command assigns sequences to OTUs based on their taxonomy and outputs a list_, a sabund_ (Species Abundance), and a rabund_ (Relative Abundance) file.
+.. _list: http://www.mothur.org/wiki/List_file
+.. _rabund: http://www.mothur.org/wiki/Rabund_file
+.. _sabund: http://www.mothur.org/wiki/Sabund_file
.. _phylotype: http://www.mothur.org/wiki/Phylotype
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/pre.cluster.xml
--- a/mothur/tools/mothur/pre.cluster.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/pre.cluster.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,16 +1,22 @@
-
+
Remove sequences due to pyrosequencing errors
mothur_wrapper.py
+ #import re, os.path
+ #set results = ["'^mothur.\S+\.logfile$:'" + $logfile.__str__]
+ ## adds .precluster before the last extension to the input file
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.precluster.\2',$os.path.basename($fasta.__str__)) + ":'" + $fasta_out.__str__]
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'\1.precluster.names',$os.path.basename($fasta.__str__)) + ":'" + $names_out.__str__]
--cmd='pre.cluster'
- --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.precluster\.fasta$:'$fasta_out,'^\S+\.precluster\.names$:'$names_out
--outputdir='$logfile.extra_files_path'
--fasta=$fasta
- #if $matrix.name.__str__ != "None" and len($matrix.name.__str__) > 0:
+ #if $name.__str__ != "None" and len($name.__str__) > 0:
--name=$name
+ #end if
#if 20 >= int($diffs.__str__) >= 0:
--diffs=$diffs
#end if
+ --result=#echo ','.join($results)
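The two #set lines above show why this patch adds a trailing $ to many of the r'(^.*)\.(.*?)' substitutions: without the anchor the lazy second group matches an empty string, the real extension is left outside the match, and re.sub appends it after the replacement text. That is harmless when the replacement ends in \2, but it corrupts fixed-suffix replacements like the .precluster.names name built here. With a made-up file name:

    import re

    name = 'seqs.trim.fasta'
    re.sub(r'(^.*)\.(.*?)',  r'\1.precluster.names', name)
    # -> 'seqs.trim.precluster.namesfasta'  (unmatched 'fasta' tacked on the end)
    re.sub(r'(^.*)\.(.*?)$', r'\1.precluster.names', name)
    # -> 'seqs.trim.precluster.names'       (extension captured and dropped)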
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/rarefaction.shared.xml
--- a/mothur/tools/mothur/rarefaction.shared.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/rarefaction.shared.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,27 +1,16 @@
-
+
Generate inter-sample rarefaction curves for OTUs
mothur_wrapper.py
--cmd='rarefaction.shared'
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.rarefaction$:'$rarefaction
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
- #elif $input.source == 'shared':
- --READ_shared=$input.otu
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
+ --shared=$otu
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label='$label'
#end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --groups=$input.groups
+ #if $groups.__str__ != "None" and len($groups.__str__) > 0:
+ --groups=$groups
#end if
#if int($iters.__str__) > 0:
--iters=$iters
@@ -32,46 +21,23 @@
#end if
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+ All groups will be analyzed by default if none are selected
+
+
+
+
+
+
+
@@ -103,7 +69,7 @@
**Command Documenation**
-The rarefaction.shared_ command generates inter-sample rarefaction curves using a re-sampling without replacement approach. The traditional way that ecologists use rarefaction is not to randomize the sampling order within a sample, rather between samples. For instance, if we wanted to know the number of OTUs in the human colon, we might sample from various sites within the colon, and sequence a bunch of 16S rRNA genes. By determining the number of OTUs in each sample and comparing the composition of those samples it is possible to determine how well you have sampled the biodiversity within the individual.
+The rarefaction.shared_ command generates inter-sample rarefaction curves using a re-sampling without replacement approach. The traditional way that ecologists use rarefaction is not to randomize the sampling order within a sample, rather between samples. For instance, if we wanted to know the number of OTUs in the human colon, we might sample from various sites within the colon, and sequence a bunch of 16S rRNA genes. By determining the number of OTUs in each sample and comparing the composition of those samples it is possible to determine how well you have sampled the biodiversity within the individual. For calc parameter choices see: http://www.mothur.org/wiki/Calculators
.. _rarefaction.shared: http://www.mothur.org/wiki/Rarefaction.shared
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/rarefaction.single.xml
--- a/mothur/tools/mothur/rarefaction.single.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/rarefaction.single.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Generate intra-sample rarefaction curves for OTUs
mothur_wrapper.py
@@ -6,11 +6,15 @@
--result='^mothur.\S+\.logfile$:'$logfile
--outputdir='$logfile.extra_files_path'
--datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+?\.(((\S+)\.)?rarefaction)$:tabular'
- --READ_cmd='read.otu'
- --READ_list=$otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
+ --new_datasets='^\S+?\.(((\S+)\.)?(rarefaction|r_\w*))$:tabular'
+ #if isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('shared').__class__):
+ --shared=$otu
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('rabund').__class__):
+ --rabund=$otu
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('sabund').__class__):
+ --sabund=$otu
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('list').__class__):
+ --list=$otu
#end if
#if $label.__str__ != "None" and len($label.__str__) > 0:
--label='$label'
@@ -30,8 +34,7 @@
--processors=2
-
-
+
@@ -71,7 +74,7 @@
**Command Documenation**
-The rarefaction.single_ command generates intra-sample rarefaction curves using a re-sampling without replacement approach. Rarefaction curves provide a way of comparing the richness observed in different samples.
+The rarefaction.single_ command generates intra-sample rarefaction curves using a re-sampling without replacement approach. Rarefaction curves provide a way of comparing the richness observed in different samples. For calc parameter choices see: http://www.mothur.org/wiki/Calculators
.. _rarefaction.single: http://www.mothur.org/wiki/Rarefaction.single
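rarefaction.single above, and several other reworked wrappers in this patch (remove.rare, summary.single, sens.spec), use the same Cheetah idiom: inspect the chosen dataset's datatype through $__app__.datatypes_registry.get_datatype_by_extension(...) and route it to the matching mothur parameter. Outside of Galaxy the dispatch reduces to a plain mapping, sketched here with an illustrative helper name:

    def otu_option(file_ext, path):
        """Map a Galaxy datatype extension to the wrapper option used above."""
        dispatch = {'shared': '--shared', 'rabund': '--rabund',
                    'sabund': '--sabund', 'list': '--list'}
        try:
            return '%s=%s' % (dispatch[file_ext], path)
        except KeyError:
            raise ValueError('unsupported OTU datatype: %s' % file_ext)

    # e.g. otu_option('sabund', 'dataset_007.dat') -> '--sabund=dataset_007.dat'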
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/remove.groups.xml
--- a/mothur/tools/mothur/remove.groups.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/remove.groups.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,5 +1,5 @@
-
- Remove groups
+
+ Remove groups from groups, fasta, names, list, and taxonomy files
mothur_wrapper.py
#import re, os.path
@@ -9,11 +9,14 @@
--cmd='remove.groups'
--outputdir='$logfile.extra_files_path'
--group=$group_in
- #if $groups.__str__ != "None" and len($groups.__str__) > 0:
- --groups=$groups
- #end if
- #if $accnos.__str__ != "None" and len($accnos.__str__) > 0:
- --accnos=$accnos
+ #if $groupnames.source == 'groups':
+ #if $groupnames.groups.__str__ != "None" and len($groupnames.groups.__str__) > 0:
+ --groups=$groupnames.groups
+ #end if
+ #else
+ #if $groupnames.accnos.__str__ != "None" and len($groupnames.accnos.__str__) > 0:
+ --accnos=$groupnames.accnos
+ #end if
#end if
#if $fasta_in.__str__ != "None" and len($fasta_in.__str__) > 0:
--fasta=$fasta_in
@@ -36,24 +39,34 @@
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
-
+
-
+
fasta_in != None
@@ -62,7 +75,7 @@
list_in != None
-
+
taxonomy_in != None
@@ -82,8 +95,12 @@
**Command Documenation**
-The remove.groups_ command removes sequences from a specific group or set of groups from the following file types: fasta, name, group, list, taxonomy.
+The remove.groups_ command removes sequences from a specific group or set of groups from the following file types: fasta, name_, group_, list_, taxonomy_.
+.. _name: http://www.mothur.org/wiki/Name_file
+.. _group: http://www.mothur.org/wiki/Group_file
+.. _list: http://www.mothur.org/wiki/List_file
+.. _taxonomy: http://www.mothur.org/wiki/Taxonomy_outline
.. _remove.groups: http://www.mothur.org/wiki/Remove.groups
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/remove.lineage.xml
--- a/mothur/tools/mothur/remove.lineage.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/remove.lineage.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Picks by taxon
mothur_wrapper.py
@@ -20,7 +20,7 @@
#end if
#if $alignreport_in.__str__ != "None" and len($alignreport_in.__str__) > 0:
--alignreport=$alignreport_in
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($alignreport_in.__str__)) + ":'" + $alignreport_out.__str__]
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'\1.pick.align.report',$os.path.basename($alignreport_in.__str__)) + ":'" + $alignreport_out.__str__]
#end if
#if $list_in.__str__ != "None" and len($list_in.__str__) > 0:
--list=$list_in
@@ -34,7 +34,7 @@
--result=#echo ','.join($results)
-
+
@@ -53,7 +53,7 @@
-
+
fasta_in != None
@@ -86,8 +86,13 @@
**Command Documenation**
-The remove.lineage_ command reads a taxonomy file and a taxon and generates a new file that contains only the sequences in the that are not from that taxon. You may also include either a fasta, name, group, list, or align.report file to this command and mothur will generate new files for each of those containing only the selected sequences.
+The remove.lineage_ command reads a taxonomy_ file and a taxon and generates a new file that contains only the sequences that are not from that taxon. You may also include a fasta, name_, group_, list_, or align.report_ file with this command and mothur will generate new files for each of those containing only the selected sequences.
+.. _taxonomy: http://www.mothur.org/wiki/Taxonomy_outline
+.. _name: http://www.mothur.org/wiki/Name_file
+.. _group: http://www.mothur.org/wiki/Group_file
+.. _list: http://www.mothur.org/wiki/List_file
+.. _align.report: http://www.mothur.org/wiki/Align.seqs
.. _remove.lineage: http://www.mothur.org/wiki/Remove.lineage
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/remove.otus.xml
--- a/mothur/tools/mothur/remove.otus.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/remove.otus.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Remove otus containing sequences from specified groups
mothur_wrapper.py
@@ -9,11 +9,14 @@
--group=$group_in
--list=$list_in
--label=$label
- #if $groups.__str__ != "None" and len($groups.__str__) > 0:
- --groups=$groups
- #end if
- #if $accnos.__str__ != "None" and len($accnos.__str__) > 0:
- --accnos=$accnos
+ #if $groupnames.source == 'groups':
+ #if $groupnames.groups.__str__ != "None" and len($groupnames.groups.__str__) > 0:
+ --groups=$groupnames.groups
+ #end if
+ #else
+ #if $groupnames.accnos.__str__ != "None" and len($groupnames.accnos.__str__) > 0:
+ --accnos=$groupnames.accnos
+ #end if
#end if
#set results = ["'^mothur.\S+\.logfile$:'" + $logfile.__str__]
#set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.' + $label.__str__ + '.\2',$os.path.basename($group_in.__str__)) + ":'" + $group_out.__str__]
@@ -29,14 +32,25 @@
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+ At least one group must be selected
+
+
+
+
+
+
+
+
+
+
+
@@ -59,8 +73,9 @@
**Command Documenation**
-The remove.otus_ command removes otus containing sequences from a specific group or set of groups.
+The remove.otus_ command removes from a list_ the OTUs that contain sequences from a specific group or set of groups.
+.. _list: http://www.mothur.org/wiki/List_file
.. _remove.otus: http://www.mothur.org/wiki/Remove.otus
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/remove.rare.xml
--- a/mothur/tools/mothur/remove.rare.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/remove.rare.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,71 +1,103 @@
-
+
Remove rare OTUs
mothur_wrapper.py
- ## output {group_file_name}.pick.{label}.groups {list_file_name}.pick.{label}.list
+ ## output {group_file_name}.pick.groups {list_file_name}.pick.{format}
#import re, os.path
#set results = ["'^mothur.\S+\.logfile$:'" + $logfile.__str__]
--cmd='remove.rare'
--outputdir='$logfile.extra_files_path'
- #if isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('shared').__class__):
- --shared=$otu
- #if $group_in.__str__ != "None" and len($group_in.__str__) > 0:
- --group=$group_in
+ #if isinstance($input.otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('shared').__class__):
+ --shared=$input.otu
+ $input.bygroup
+ #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
+ --groups=$input.groups
#end if
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($input.otu_in.__str__)) + ":'" + $pick_shared.__str__]
- #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('rabund').__class__):
- --rabund=$otu
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($input.otu_in.__str__)) + ":'" + $pick_rabund.__str__]
- #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('sabund').__class__):
- --sabund=$otu
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($input.otu_in.__str__)) + ":'" + $pick_sabund.__str__]
- #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('list').__class__):
- --list=$otu
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($input.otu_in.__str__)) + ":'" + $pick_list.__str__]
- $bygroup
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($input.otu.__str__)) + ":'" + $pick_otu.__str__]
+ #elif isinstance($input.otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('rabund').__class__):
+ --rabund=$input.otu
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($input.otu.__str__)) + ":'" + $pick_otu.__str__]
+ #elif isinstance($input.otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('sabund').__class__):
+ --sabund=$input.otu
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($input.otu.__str__)) + ":'" + $pick_otu.__str__]
+ #elif isinstance($input.otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('list').__class__):
+ --list=$input.otu
+ #if $input.group.__str__ != "None" and len($input.group.__str__) > 0:
+ --group=$input.group
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($input.group.__str__)) + ":'" + $pick_group.__str__]
+ #end if
+ #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
+ --groups=$input.groups
+ #end if
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.pick.\2',$os.path.basename($input.otu.__str__)) + ":'" + $pick_otu.__str__]
#end if
- #if $label.__str__ != "None" and len($label.__str__) > 0:
- --label=$label
- #end if
- #if $groups.__str__ != "None" and len($groups.__str__) > 0:
- --groups=$groups
+ #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
+ --label=$input.label
#end if
--nseqs=$nseqs
--result=#echo ','.join($results)
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
-
- otu.datatype == 'list'
-
-
- otu.datatype == 'rabund'
-
-
- otu.datatype == 'sabund'
-
-
- otu.datatype == 'shared'
+
+
+ input['source'] == 'list'
@@ -84,8 +116,12 @@
**Command Documenation**
-The remove.rare_ command reads one of the following file types: list, rabund, sabund or shared file. It outputs a new file after removing the rare otus.
+The remove.rare_ command reads one of the following file types: list_, rabund_, sabund_ or shared_. It outputs a new file after removing the rare OTUs.
+.. _list: http://www.mothur.org/wiki/List_file
+.. _sabund: http://www.mothur.org/wiki/Sabund_file
+.. _rabund: http://www.mothur.org/wiki/Rabund_file
+.. _shared: http://www.mothur.org/wiki/Shared_file
.. _remove.rare: http://www.mothur.org/wiki/Remove.rare
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/remove.seqs.xml
--- a/mothur/tools/mothur/remove.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/remove.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Remove sequences by name
mothur_wrapper.py
@@ -47,7 +47,7 @@
-
+
@@ -83,7 +83,7 @@
list_in != None
-
+
taxonomy_in != None
@@ -103,8 +103,14 @@
**Command Documenation**
-The remove.seqs_ command takes a list of sequence names and either a fasta, name, group, list, or align.report file to generate a new file that does not contain the sequences in the list. This command may be used in conjunction with the list.seqs command to help screen a sequence collection.
+The remove.seqs_ command takes a list of sequence names and either a fasta, name_, group_, list_, align.report_ or taxonomy_ file to generate a new file that does not contain the sequences in the list. This command may be used in conjunction with the list.seqs_ command to help screen a sequence collection.
+.. _name: http://www.mothur.org/wiki/Name_file
+.. _group: http://www.mothur.org/wiki/Group_file
+.. _list: http://www.mothur.org/wiki/List_file
+.. _align.report: http://www.mothur.org/wiki/Align.seqs
+.. _taxonomy: http://www.mothur.org/wiki/Taxonomy_outline
+.. _list.seqs: http://www.mothur.org/wiki/list.seqs
.. _remove.seqs: http://www.mothur.org/wiki/Remove.seqs
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/reverse.seqs.xml
--- a/mothur/tools/mothur/reverse.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/reverse.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Reverse complement the sequences
mothur_wrapper.py
@@ -8,11 +8,11 @@
--fasta=$fasta
-
+
-
+
mothur
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/screen.seqs.xml
--- a/mothur/tools/mothur/screen.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/screen.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,12 +1,12 @@
-
+
Screen sequences
mothur_wrapper.py
#import re, os.path
--cmd='screen.seqs'
#set results = ["'^mothur.\S+\.logfile$:'" + $logfile.__str__]
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.good.\2',$os.path.basename($input.__str__)) + ":'" + $out_file.__str__]
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.bad.accnos',$os.path.basename($input.__str__)) + ":'" + $bad_accnos.__str__]
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'\1.good.\2',$os.path.basename($input.__str__)) + ":'" + $out_file.__str__]
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'\1.bad.accnos',$os.path.basename($input.__str__)) + ":'" + $bad_accnos.__str__]
--outputdir='$logfile.extra_files_path'
--tmpdir='${logfile.extra_files_path}/input'
--fasta=$input
@@ -34,17 +34,21 @@
#if $optimize != None and $optimize.__str__ != "None":
--optimize=$optimize
#end if
+ #if $input_qfile != None and $input_qfile.__str__ != "None":
+ --qfile=$input_qfile
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'\1.good.\2',$os.path.basename($input_qfile.__str__)) + ":'" + $output_qfile.__str__]
+ #end if
#if $input_names != None and $input_names.__str__ != "None":
--name=$input_names
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.good.\2',$os.path.basename($input_names.__str__)) + ":'" + $output_names.__str__]
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'\1.good.\2',$os.path.basename($input_names.__str__)) + ":'" + $output_names.__str__]
#end if
#if $input_groups != None and $input_groups.__str__ != "None":
--group=$input_groups
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.good.\2',$os.path.basename($input_groups.__str__)) + ":'" + $output_groups.__str__]
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'\1.good.\2',$os.path.basename($input_groups.__str__)) + ":'" + $output_groups.__str__]
#end if
#if $input_alignreport != None and $input_alignreport.__str__ != "None":
--alignreport=$input_alignreport
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.good.\2',$os.path.basename($input_alignreport.__str__)) + ":'" + $output_alignreport.__str__]
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'\1.good.\2',$os.path.basename($input_alignreport.__str__)) + ":'" + $output_alignreport.__str__]
#end if
--result=#echo ','.join($results)
--processors=2
@@ -66,18 +70,18 @@
-
+
+
-
-
-
-
+
+
+
+ input_qfile != None
-
input_names != None
@@ -104,8 +108,11 @@
**Command Documenation**
-The screen.seqs_ command enables you to keep sequences that fulfill certain user defined criteria. Furthermore, it enables you to cull those sequences not meeting the criteria from a names, group, or align.report file.
+The screen.seqs_ command enables you to keep sequences that fulfill certain user-defined criteria. Furthermore, it enables you to cull those sequences not meeting the criteria from a name_, group_, or align.report_ file.
+.. _name: http://www.mothur.org/wiki/Name_file
+.. _group: http://www.mothur.org/wiki/Group_file
+.. _align.report: http://www.mothur.org/wiki/Align.seqs
.. _screen.seqs: http://www.mothur.org/wiki/Screen.seqs
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/sens.spec.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mothur/tools/mothur/sens.spec.xml Tue Jun 07 17:05:08 2011 -0400
@@ -0,0 +1,70 @@
+
+ Determine the quality of OTU assignment
+
+ mothur_wrapper.py
+ #import re, os.path
+ --cmd='sens.spec'
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.sensspec$:'$sensspec
+ --outputdir='$logfile.extra_files_path'
+ --list=$list
+ #if isinstance($dist.datatype, $__app__.datatypes_registry.get_datatype_by_extension('pair.dist').__class__):
+ --column=$dist
+ #else
+ --phylip=$dist
+ #end if
+ #if $label.__str__ != "None" and len($label.__str__) > 0:
+ --label='$label'
+ #end if
+ #if len($precision.__str__) > 0:
+ --precision=$precision
+ #end if
+ #if float($cutoff.__str__) > 0.0:
+ --cutoff=$cutoff
+ #end if
+ $hard
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ mothur
+
+
+
+
+**Mothur Overview**
+
+Mothur_, initiated by Dr. Patrick Schloss and his software development team
+in the Department of Microbiology and Immunology at The University of Michigan,
+provides bioinformatics for the microbial ecology community.
+
+.. _Mothur: http://www.mothur.org/wiki/Main_Page
+
+**Command Documentation**
+
+The sens.spec_ command takes a list_ and either a column_ or phylip_ distance matrix to determine the quality of OTU assignment.
+
+
+.. _list: http://www.mothur.org/wiki/List_file
+.. _column: http://www.mothur.org/wiki/Column-formatted_distance_matrix
+.. _phylip: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix
+.. _sens.spec: http://www.mothur.org/wiki/Sens.spec
+
+
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/sffinfo.xml
--- a/mothur/tools/mothur/sffinfo.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/sffinfo.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Summarize the quality of sequences
mothur_wrapper.py
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/split.abund.xml
--- a/mothur/tools/mothur/split.abund.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/split.abund.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,29 +1,50 @@
-
+
Separate sequences into rare and abundant groups
mothur_wrapper.py
--cmd='split.abund'
- --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.abund\.list$:'$abund_list,'^\S+\.rare\.list$:'$rare_list,'^\S+\.rare\.accnos$:'$rare_accnos,'^\S+\.abund\.accnos$:'$abund_accnos
+ #import re, os.path
+ --result='^mothur.\S+\.logfile$:'$logfile
+ ## --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.abund\.list$:'$abund_list,'^\S+\.rare\.list$:'$rare_list,'^\S+\.rare\.accnos$:'$rare_accnos,'^\S+\.abund\.accnos$:'$abund_accnos
--outputdir='$logfile.extra_files_path'
- --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+?\.((unique|[0-9.]+)\.(rare|abund)\.fasta)$:fasta','^\S+?\.((unique|[0-9.]+)\.(rare|abund)\.groups)$:groups','^\S+?\.((unique|[0-9.]+)\.(rare|abund)\.accnos)$:accnos'
+ #set datasets = []
+ #if $as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+?\.((unique|[0-9.]+)\.(rare|abund)\.fasta)$:fasta','^\S+?\.((unique|[0-9.]+)\.(rare|abund)\.groups)$:groups','^\S+?\.((unique|[0-9.]+)\.(rare|abund)\.accnos)$:accnos'
+ #end if
--fasta=$fasta
+ #set datasets = $datasets + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'^\1.(.*\.fasta)$',$os.path.basename($fasta.__str__)) + ":fasta'"]
#if $search.type == "list":
--list=$search.input
+ #set datasets = $datasets + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'^\1.(.*\.list)$',$os.path.basename($search.input.__str__)) + ":list'"]
+ #if $accnos:
+ #set datasets = $datasets + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'^\1.(.*\.accnos)$',$os.path.basename($search.input.__str__)) + ":accnos'"]
+ #end if
#if $search.label.__str__ != "None" and len($search.label.__str__) > 0:
--label=$search.label
#end if
#elif $search.type == "name":
--name=$search.input
+ #set datasets = $datasets + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'^\1.(.*\.names)$',$os.path.basename($search.input.__str__)) + ":names'"]
+ #if $accnos:
+ #set datasets = $datasets + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'^\1.(.*\.accnos)$',$os.path.basename($search.input.__str__)) + ":accnos'"]
+ #end if
#end if
--cutoff=$cutoff
- #if $group.__str__ != "None" and len($group.__str__) > 0:
- --group=$group
- #end if
- #if $groups.__str__ != "None" and len($groups.__str__) > 0:
- --groups=$groups
+ #if $split.type == 'yes':
+ #if $split.group.__str__ != "None" and len($split.group.__str__) > 0:
+ --group=$split.group
+ #set datasets = $datasets + ["'" + $re.sub(r'(^.*)\.(.*?)$',r'^\1.(.*\.groups)$',$os.path.basename($split.group.__str__)) + ":groups'"]
+ #end if
+ #if $split.groups.__str__ != "None" and len($split.groups.__str__) > 0:
+ --groups=$split.groups
+ #end if
#end if
$accnos
+ #if $as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets=#echo ','.join($datasets)
+ #end if
@@ -37,27 +58,38 @@
-
+
-
+
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
mothur
@@ -83,8 +116,10 @@
**Command Documenation**
-The split.abund_ command reads a fasta file and a list or a names file and splits the sequences into rare and abundant groups.
+The split.abund_ command reads a fasta file and a list_ or a name_ file and splits the sequences into rare and abundant groups.
+.. _list: http://www.mothur.org/wiki/List_file
+.. _name: http://www.mothur.org/wiki/Name_file
.. _split.abund: http://www.mothur.org/wiki/Split.abund
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/split.groups.xml
--- a/mothur/tools/mothur/split.groups.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/split.groups.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Generates a fasta file for each group
mothur_wrapper.py
@@ -6,13 +6,27 @@
--result='^mothur.\S+\.logfile$:'$logfile
--outputdir='$logfile.extra_files_path'
--datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+?\.(\S+\.fasta)$:fasta'
+ --new_datasets='^\S+?\.(\S+\.fasta)$:fasta','^\S+?\.(\S+\.names)$:names'
--fasta=$fasta
--group=$group
+ #if $name.__str__ != "None" and len($name.__str__) > 0:
+ --name=$name
+ #end if
+ #if $groups.__str__ != "None" and len($groups.__str__) > 0:
+ --groups=$groups
+ #end if
-
+
+
+
+
+
+
+
+
+
@@ -33,8 +47,10 @@
**Command Documenation**
-The split.groups_ command reads a fasta file group file and generates a fasta file for each group in the groupfile.
+The split.groups_ command reads a fasta file and a group_ file and generates a fasta file for each group in the group file. A name_ file can also be split into groups.
+.. _group: http://www.mothur.org/wiki/Group_file
+.. _name: http://www.mothur.org/wiki/Name_file
.. _split.groups: http://www.mothur.org/wiki/Split.groups
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/sub.sample.xml
--- a/mothur/tools/mothur/sub.sample.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/sub.sample.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Create a sub sample
mothur_wrapper.py
@@ -12,26 +12,30 @@
#set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.subsample.\2',$os.path.basename($input.fasta_in.__str__)) + ":'" + $fasta_out.__str__]
#if $input.name_in.__str__ != "None" and len($input.name_in.__str__) > 0:
--name=$input.name_in
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.subsample.\2',$os.path.basename($input.name_in.__str__)) + ":'" + $names_out.__str__]
+ ## #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.subsample.\2',$os.path.basename($input.name_in.__str__)) + ":'" + $names_out.__str__]
#end if
- #if $input.use_group.group_in.__str__ != "None" and len($input.use_group.group_in.__str__) > 0:
- --group=$input.use_group.group_in
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.subsample.\2',$os.path.basename($input.use_group.group_in.__str__)) + ":'" + $group_out.__str__]
- #if $input.use_group.groups.__str__ != "None" and len($input.use_group.groups.__str__) > 0:
- --groups=$input.use_group.groups
+ #if $input.use_group.to_filter == "yes":
+ #if $input.use_group.group_in.__str__ != "None" and len($input.use_group.group_in.__str__) > 0:
+ --group=$input.use_group.group_in
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.subsample.\2',$os.path.basename($input.use_group.group_in.__str__)) + ":'" + $group_out.__str__]
+ #if $input.use_group.groups.__str__ != "None" and len($input.use_group.groups.__str__) > 0:
+ --groups=$input.use_group.groups
+ #end if
+ $input.use_group.persample
#end if
- $input.use_group.persample
#end if
#elif $input.format == "list":
--list=$input.otu_in
#set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.subsample.\2',$os.path.basename($input.otu_in.__str__)) + ":'" + $list_out.__str__]
- #if $input.use_group.group_in.__str__ != "None" and len($input.use_group.group_in.__str__) > 0:
- --group=$input.use_group.group_in
- #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.subsample.\2',$os.path.basename($input.use_group.group_in.__str__)) + ":'" + $group_out.__str__]
- #if $input.use_group.groups.__str__ != "None" and len($input.use_group.groups.__str__) > 0:
- --groups=$input.use_group.groups
+ #if $input.use_group.to_filter == "yes":
+ #if $input.use_group.group_in.__str__ != "None" and len($input.use_group.group_in.__str__) > 0:
+ --group=$input.use_group.group_in
+ #set results = $results + ["'" + $re.sub(r'(^.*)\.(.*?)',r'\1.subsample.\2',$os.path.basename($input.use_group.group_in.__str__)) + ":'" + $group_out.__str__]
+ #if $input.use_group.groups.__str__ != "None" and len($input.use_group.groups.__str__) > 0:
+ --groups=$input.use_group.groups
+ #end if
+ $input.use_group.persample
#end if
- $input.use_group.persample
#end if
#if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
--label=$input.label
@@ -176,9 +180,11 @@
input['format'] == 'rabund'
+
input['use_group']['group_in'] != None
@@ -199,8 +205,12 @@
**Command Documenation**
-The sub.sample_ command selects otus containing sequences from a specific group or set of groups.
+The sub.sample_ command can be used as a way to normalize your data, or to create a smaller set from your original set. It takes as input one of the following file types: fasta, list_, shared_, rabund_ or sabund_, and generates a new file that contains a sampling of your original file.
+.. _list: http://www.mothur.org/wiki/List_file
+.. _shared: http://www.mothur.org/wiki/Shared_file
+.. _rabund: http://www.mothur.org/wiki/Rabund_file
+.. _sabund: http://www.mothur.org/wiki/Sabund_file
.. _sub.sample: http://www.mothur.org/wiki/Sub.sample
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/summary.seqs.xml
--- a/mothur/tools/mothur/summary.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/summary.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Summarize the quality of sequences
mothur_wrapper.py
@@ -6,9 +6,13 @@
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.summary$:'$out_summary
--outputdir='$logfile.extra_files_path'
--fasta=$fasta
+ #if $name.__str__ != "None" and len($name.__str__) > 0:
+ --name=$name
+ #end if
+
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/summary.shared.xml
--- a/mothur/tools/mothur/summary.shared.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/summary.shared.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Summary of calculator values for OTUs
mothur_wrapper.py
@@ -7,81 +7,35 @@
--outputdir='$logfile.extra_files_path'
--datasetid='$logfile.id' --new_file_path='$__new_file_path__'
--new_datasets='^\S+?\.((\S+)\.(unique|[0-9.]*)\.dist)$:lower.dist'
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
- #elif $input.source == 'shared':
- --READ_shared=$input.otu
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
- #end if
- #end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --groups=$input.groups
- #end if
- #*
- --READ_list=$otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
+ --shared=$otu
+ #if $groups.__str__ != "None" and len($groups.__str__) > 0:
+ --groups=$groups
#end if
#if $label.__str__ != "None" and len($label.__str__) > 0:
--label='$label'
#end if
- *#
#if $calc.__str__ != "None" and len($calc.__str__) > 0:
--calc='$calc'
#end if
$all
+ $distance
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
@@ -113,8 +67,10 @@
**Command Documentation**
-The summary.shared_ command produce a summary file that has the calculator value for each line in the OTU data and for all possible comparisons between the different groups in the group file.
+The summary.shared_ command produces a summary file that has the calculator value for each line in the OTU data of the shared_ file and for all possible comparisons between the different groups in the group_ file. This can be useful if you aren't interested in generating collector's or rarefaction curves for your multi-sample data analysis. It would be worth your while, however, to look at the collector's curves for the calculators you are interested in to determine how sensitive the values are to sampling. If the values are not sensitive to sampling, then you can trust the values. Otherwise, you need to keep sampling. For calc parameter choices see: http://www.mothur.org/wiki/Calculators
+.. _shared: http://www.mothur.org/wiki/Shared_file
+.. _group: http://www.mothur.org/wiki/Group_file
.. _summary.shared: http://www.mothur.org/wiki/Summary.shared
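The same pattern applies here; a short Python sketch of a direct summary.shared call (example.shared is a placeholder and the calculators are examples taken from the Calculators page linked above; mothur is assumed to be on the PATH):

    import subprocess

    # Summarize calculator values for every pairwise group comparison,
    # mirroring the --shared/--calc options assembled above.
    cmd = "#summary.shared(shared=example.shared, calc=sharedsobs-sharedchao-sharedace)"
    subprocess.run(["mothur", cmd], check=True)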
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/summary.single.xml
--- a/mothur/tools/mothur/summary.single.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/summary.single.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,16 +1,25 @@
-
+
Summary of calculator values for OTUs
mothur_wrapper.py
--cmd='summary.single'
- --result='^mothur.\S+\.logfile$:'$logfile
+ #if isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('shared').__class__) and not $groupmode.__str__ == '--groupmode=true':
+ --result='^mothur.\S+\.logfile$:'$logfile
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+\.((\S+?)\.summary)$:tabular'
+ #else:
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.summary$:'$summary
+ #end if
--outputdir='$logfile.extra_files_path'
- --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+\.((\S+?)\.summary)$:tabular'
- --READ_cmd='read.otu'
- --READ_list=$otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
+ #if isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('shared').__class__):
+ --shared=$otu
+ $groupmode
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('rabund').__class__):
+ --rabund=$otu
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('sabund').__class__):
+ --sabund=$otu
+ #elif isinstance($otu.datatype, $__app__.datatypes_registry.get_datatype_by_extension('list').__class__):
+ --list=$otu
#end if
#if $label.__str__ != "None" and len($label.__str__) > 0:
--label='$label'
@@ -24,16 +33,14 @@
#if int($size.__str__) > 0:
--size=$size
#end if
- $groupmode
- --processors=2
-
-
+
+
@@ -46,10 +53,15 @@
-
+
+
+ (otu.file_ext == 'shared' and groupmode == True)
+
mothur
@@ -67,7 +79,7 @@
**Command Documentation**
-The summary.single_ command produce a summary file that has the calculator value for each line in the OTU data and for all possible comparisons between the different groups in the group file.
+The summary.single_ command produces a summary file that has the calculator value for each line in the OTU data and for all possible comparisons between the different groups in the group_ file. This can be useful if you aren't interested in generating collector's or rarefaction curves for your multi-sample data analysis. It would be worth your while, however, to look at the collector's curves for the calculators you are interested in to determine how sensitive the values are to sampling. If the values are not sensitive to sampling, then you can trust the values. Otherwise, you need to keep sampling. For calc parameter choices see: http://www.mothur.org/wiki/Calculators
.. _summary.single: http://www.mothur.org/wiki/Summary.single
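The template above picks the Mothur input option from the Galaxy datatype of the OTU dataset; a small Python sketch of the same dispatch keyed on file extension (the extension-to-option mapping restates the isinstance() branches above; mysample.sabund is a placeholder and mothur is assumed to be on the PATH):

    import subprocess

    # summary.single accepts exactly one of these OTU input types.
    OPTION_BY_EXT = {"shared": "shared", "rabund": "rabund",
                     "sabund": "sabund", "list": "list"}

    def summary_single(path):
        option = OPTION_BY_EXT[path.rsplit(".", 1)[-1]]  # KeyError if unsupported
        subprocess.run(["mothur", "#summary.single(%s=%s)" % (option, path)],
                       check=True)

    summary_single("mysample.sabund")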
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/tree.shared.xml
--- a/mothur/tools/mothur/tree.shared.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/tree.shared.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,69 +1,66 @@
-
+
Generate a newick tree for dissimilarity among groups
mothur_wrapper.py
--cmd='tree.shared'
- --result='^mothur.\S+\.logfile$:'$logfile
- --outputdir='$logfile.extra_files_path'
- --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+?([a-z]+\.(unique|[0-9.]*)\.tre)$:tre'
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
+ #if $input.source == 'shared':
+ --result='^mothur.\S+\.logfile$:'$logfile
+ #if $input.as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+?([a-z]+\.(unique|[0-9.]*)\.tre)$:tre'
+ #end if
+ --shared=$input.dist
+ #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
+ --groups=$input.groups
#end if
#if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
+ --label='$input.label'
#end if
- #elif $input.source == 'shared':
- --READ_shared=$input.otu
- #if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
- --READ_label='$input.label'
+ #else:
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.tre$:'$tre
+ --outputdir='$logfile.extra_files_path'
+ #if $input.source == 'column':
+ --column=$input.dist
+ --name=$input.name
+ #elif $input.source == 'phylip':
+ --phylip=$input.dist
+ #if $input.name.__str__ != "None" and len($input.name.__str__) > 0:
+ --name=$input.name
+ #end if
#end if
#end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --groups=$input.groups
- #end if
#if $calc.__str__ != "None" and len($calc.__str__) > 0:
--calc=$calc
#end if
-
+
-
-
+
+
+
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
-
+
+
-
+
-
+
@@ -82,6 +79,9 @@
+
+ input['source'] != 'shared'
+
mothur
@@ -99,7 +99,7 @@
**Command Documentation**
-The tree.shared_ command will generate a newick-formatted tree file that describes the dissimilarity (1-similarity) among multiple groups.
+The tree.shared_ command will generate a newick-formatted tree file that describes the dissimilarity (1-similarity) among multiple groups. For calc parameter choices see: http://www.mothur.org/wiki/Calculators
.. _tree.shared: http://www.mothur.org/wiki/Tree.shared
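The template above routes three input sources (shared, phylip, or column plus name) to different Mothur options; a condensed Python sketch of that branching (file names are placeholders; mothur is assumed to be on the PATH):

    import subprocess

    def tree_shared(source, dist, name=None):
        # Mirror the #if branches above: shared files use the shared option,
        # distance matrices use phylip or column (column requires a name file).
        if source == "shared":
            opts = "shared=%s" % dist
        elif source == "phylip":
            opts = "phylip=%s" % dist + ("" if name is None else ", name=%s" % name)
        else:
            opts = "column=%s, name=%s" % (dist, name)
        subprocess.run(["mothur", "#tree.shared(%s)" % opts], check=True)

    tree_shared("phylip", "example.phylip.dist")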
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/trim.seqs.xml
--- a/mothur/tools/mothur/trim.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/trim.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Trim sequences - primers, barcodes, quality
mothur_wrapper.py
@@ -34,6 +34,11 @@
#if $oligo.tdiffs > 0:
--tdiffs=$oligo.tdiffs
#end if
+ $oligo.allfiles
+ #if $oligo.allfiles.value:
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+?\.(\S+\.fasta)$:fasta','^\S+?\.(\S+\.groups)$:groups'
+ #end if
#end if
#if $qual.add == "yes":
--qfile=$qual.qfile
@@ -59,20 +64,21 @@
-
+
-
+
+
-
+
@@ -88,15 +94,17 @@
-
+
- qfile != none and len(qfile) > 0
+ (qual['add'] == 'yes' and len(qual['qfile'].__str__) > 0)
-
- qfile != none and len(qfile) > 0
+
+ (qual['add'] == 'yes' and len(qual['qfile'].__str__) > 0)
-
+
+ (oligo['add'] == 'yes' and len(oligo['oligos']) > 0)
+
mothur
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/unifrac.unweighted.xml
--- a/mothur/tools/mothur/unifrac.unweighted.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/unifrac.unweighted.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,20 +1,19 @@
-
+
Describes whether two or more communities have the same structure
mothur_wrapper.py
--cmd='unifrac.unweighted'
- --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.uwsummary$:'$summary,'^\S+\.unweighted\.(column\.|philip\.)?dist$:'$dist
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.uwsummary$:'$summary,'^\S+\.unweighted\.(column\.|phylip\.)?dist$:'$dist,'^\S+\.unweighted$:'$unweighted
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.tree'
- --READ_tree=$tree
+ --tree=$tree
#if $group.__str__ != "None" and len($group.__str__) > 0:
- --READ_group='$group'
+ --group='$group'
#end if
#if $groups.__str__ != "None" and len($groups.__str__) > 0:
--groups='$groups'
#end if
#if $name.__str__ != "None" and len($name.__str__) > 0:
- --READ_name='$name'
+ --name='$name'
#end if
#if int($iters.__str__) > 0:
--iters=$iters
@@ -23,13 +22,15 @@
#if $distance.__str__ != "false":
--distance=$distance
#end if
+ $root
--processors=2
-
-
-
+
+
+
+
@@ -39,20 +40,21 @@
-
+
-
-
+
+
-
+
+
+ (random == True)
+
distance != 'false'
@@ -60,10 +62,6 @@
-
mothur
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/unifrac.weighted.xml
--- a/mothur/tools/mothur/unifrac.weighted.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/unifrac.weighted.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,20 +1,19 @@
-
+
Describes whether two or more communities have the same structure
mothur_wrapper.py
--cmd='unifrac.weighted'
- --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.wsummary$:'$summary,'^\S+\.weighted\.(column\.|philip\.)?dist$:'$dist
+ --result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.wsummary$:'$summary,'^\S+\.weighted\.(column\.|phylip\.)?dist$:'$dist,'^\S+\.weighted$:'$weighted
--outputdir='$logfile.extra_files_path'
- --READ_cmd='read.tree'
- --READ_tree=$tree
+ --tree=$tree
#if $group.__str__ != "None" and len($group.__str__) > 0:
- --READ_group='$group'
+ --group='$group'
#end if
#if $groups.__str__ != "None" and len($groups.__str__) > 0:
--groups='$groups'
#end if
#if $name.__str__ != "None" and len($name.__str__) > 0:
- --READ_name='$name'
+ --name='$name'
#end if
#if int($iters.__str__) > 0:
--iters=$iters
@@ -23,13 +22,14 @@
#if $distance.__str__ != "false":
--distance=$distance
#end if
+ $root
--processors=2
-
-
-
-
+
+
+
+
@@ -40,20 +40,21 @@
-
+
-
-
+
+
-
+
+
+ (random == True)
+
distance != 'false'
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/unique.seqs.xml
--- a/mothur/tools/mothur/unique.seqs.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/unique.seqs.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,4 +1,4 @@
-
+
Return unique sequences
mothur_wrapper.py
@@ -6,7 +6,9 @@
--result='^mothur.\S+\.logfile$:'$logfile,'^\S+\.unique\.\w+$:'$out_fasta,'^\S+\.names$:'$out_names
--outputdir='$logfile.extra_files_path'
--fasta=$fasta
- --name=$names
+ #if $names.__str__ != "None" and len($names.__str__) > 0:
+ --name=$names
+ #end if
@@ -33,8 +35,9 @@
**Command Documentation**
-The unique.seqs_ command returns only the unique sequences found in a fasta-formatted sequence file and a file that indicates those sequences that are identical to the reference sequence.
+The unique.seqs_ command returns only the unique sequences found in a fasta-formatted sequence file and a name_ file that indicates those sequences that are identical to the reference sequence.
+.. _name: http://www.mothur.org/wiki/Name_file
.. _unique.seqs: http://www.mothur.org/wiki/Unique.seqs
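A short Python sketch of the corresponding command line mode call (example.fasta and example.names are placeholders; mothur is assumed to be on the PATH); the *.unique.* and *.names outputs are what the --result pattern above collects:

    import subprocess

    # Collapse duplicate sequences; add name=example.names to merge an
    # existing name file into the new one.
    cmd = "#unique.seqs(fasta=example.fasta)"
    subprocess.run(["mothur", cmd], check=True)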
diff -r c7923b34dea4 -r e076d95dbdb5 mothur/tools/mothur/venn.xml
--- a/mothur/tools/mothur/venn.xml Tue Jun 07 17:01:07 2011 -0400
+++ b/mothur/tools/mothur/venn.xml Tue Jun 07 17:05:08 2011 -0400
@@ -1,72 +1,51 @@
-
- Generate Venn diagrams gor groups
+
+ Generate Venn diagrams for groups
mothur_wrapper.py
--cmd='venn'
--result='^mothur.\S+\.logfile$:'$logfile
--outputdir='$logfile.extra_files_path'
- --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
- --new_datasets='^\S+?\.(\S+\.svg)$:svg'
- --READ_cmd='read.otu'
- #if $input.source == 'similarity':
- --READ_list=$input.otu
- #if $otu_group.__str__ != "None" and len($otu_group.__str__) > 0:
- --READ_group='$otu_group'
- #end if
+ #if $as_datasets.__str__ == "yes":
+ --datasetid='$logfile.id' --new_file_path='$__new_file_path__'
+ --new_datasets='^\S+?\.(\S+\.svg)$:svg'
+ #end if
+ #if $input.source == 'shared':
+ --shared=$input.otu
#if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
--label='$input.label'
#end if
#if $input.calc.__str__ != "None" and len($input.calc.__str__) > 0:
--calc='$input.calc'
#end if
- #elif $input.source == 'shared':
- --READ_shared=$input.otu
+ $nseqs
+ $permute
+ #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
+ --groups=$input.groups
+ #end if
+ #elif $input.source == 'similarity':
+ --list=$input.otu
#if $input.label.__str__ != "None" and len($input.label.__str__) > 0:
--label='$input.label'
#end if
- #if $calc.__str__ != "None" and len($calc.__str__) > 0:
- --calc='$calc'
+ #if $input.calc.__str__ != "None" and len($input.calc.__str__) > 0:
+ --calc='$input.calc'
#end if
- $nseqs
- $permute
- #end if
- #if $input.groups.__str__ != "None" and len($input.groups.__str__) > 0:
- --groups=$input.groups
+ #if $input.abund >= 5:
+ --abund='$input.abund'
+ #end if
+
#end if
-
+
-
-
-
-
+
+
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
@@ -85,9 +64,27 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
@@ -108,7 +105,7 @@
**Command Documentation**
-The venn_ command generates Venn diagrams to compare the richness shared among 2, 3, or 4 groups.
+The venn_ command generates Venn diagrams to compare the richness shared among 2, 3, or 4 groups. For calc parameter choices see: http://www.mothur.org/wiki/Calculators
.. _venn: http://www.mothur.org/wiki/Venn
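A short Python sketch of an equivalent command line mode call (example.shared and the group names are placeholders; mothur is assumed to be on the PATH); as noted above, a diagram can compare at most four groups:

    import subprocess

    # Draw a Venn diagram of richness shared among three groups,
    # mirroring the --shared/--groups/--nseqs/--permute options above.
    cmd = "#venn(shared=example.shared, groups=A-B-C, nseqs=T, permute=T)"
    subprocess.run(["mothur", cmd], check=True)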