comparison mothur/README @ 0:591e72edabed
Migrated tool version 1.15.1 from old tool shed archive to new tool shed repository
| author | jjohnson |
|---|---|
| date | Tue, 07 Jun 2011 16:54:12 -0400 |
| parents | |
| children | c7923b34dea4 |
| -1:000000000000 | 0:591e72edabed |
|---|---|
| 1 Provides galaxy tools for the Mothur metagenomics package - http://www.mothur.org/wiki/Main_Page | |
| 2 | |
| 3 Install mothur v.1.15.0 on your galaxy system so galaxy can execute the mothur command | |
| 4 http://www.mothur.org/wiki/Download_mothur | |
| 5 http://www.mothur.org/wiki/Installation | |
| 6 ( This Galaxy Mothur wrapper will invoke Mothur in command line mode: http://www.mothur.org/wiki/Command_line_mode ) | |
| 7 | |
| 8 TreeVector is also packaged with this Mothur package to view phylogenetic trees: | |
| 9 TreeVector is a utility to create and integrate phylogenetic trees as Scalable Vector Graphics (SVG) files. | |
| 10 TreeVector was written by Ralph Pethica, Department of Computer Science, University of Bristol | |
| 11 TreeVector: http://supfam.cs.bris.ac.uk/TreeVector/about.html | |
| 12 Install in galaxy: tool-data/shared/jars/TreeVector.jar | |
| 13 | |
| 14 Install reference data from silva and greengenes | |
| 15 Silva reference: | |
| 16 http://www.mothur.org/wiki/Silva_reference_files | |
| 17 - Bacterial references (14,956 sequences) | |
| 18 http://www.mothur.org/w/images/9/98/Silva.bacteria.zip | |
| 19 - Archaeal references (2,297 sequences) | |
| 20 http://www.mothur.org/w/images/3/3c/Silva.archaea.zip | |
| 21 - Eukaryotic references (1,238 sequences) | |
| 22 http://www.mothur.org/w/images/1/1a/Silva.eukarya.zip | |
| 23 - Silva-based alignment of template file for chimera.slayer (5,181 sequences) | |
| 24 http://www.mothur.org/w/images/f/f1/Silva.gold.bacteria.zip | |
| 25 Alignment database rRNA gene sequences: | |
| 26 http://www.mothur.org/wiki/Alignment_database | |
| 27 - greengenes reference alignment | |
| 28 http://www.mothur.org/w/images/7/72/Greengenes.alignment.zip | |
| 29 - SILVA (Silva reference) | |
| 30 http://www.mothur.org/w/images/f/f1/Silva.gold.bacteria.zip | |
| 31 Secondary structure mapping files: | |
| 32 http://www.mothur.org/wiki/Secondary_structure_map | |
| 33 http://www.mothur.org/w/images/6/6d/Silva_ss_map.zip | |
| 34 http://www.mothur.org/w/images/4/4b/Gg_ss_map.zip | |
| 35 Lane masks: | |
| 36 http://www.mothur.org/wiki/Lane_mask | |
| 37 greengenes-compatible mask: | |
| 38 - lane1241.gg.filter - A Lane mask that comes with the greengenes arb database | |
| 39 http://www.mothur.org/w/images/2/2a/Lane1241.gg.filter | |
| 40 - lane1287.gg.filter - A Lane mask that comes with the greengenes arb database | |
| 41 http://www.mothur.org/w/images/a/a0/Lane1287.gg.filter | |
| 42 - lane1349.gg.filter - Pat Schloss's transcription of the mask from the Lane paper | |
| 43 http://www.mothur.org/w/images/3/3d/Lane1349.gg.filter | |
| 44 SILVA-compatible mask: | |
| 45 - lane1349.silva.filter - Pat Schloss's transcription of the mask from the Lane paper | |
| 46 http://www.mothur.org/w/images/6/6d/Lane1349.silva.filter | |
| 47 | |
| 48 Example from UMN installation: (We also made these available in a Galaxy public data library) | |
| 49 /project/db/galaxy/mothur/Silva.bacteria.zip | |
| 50 /project/db/galaxy/mothur/silva.eukarya.fasta | |
| 51 /project/db/galaxy/mothur/Greengenes.alignment.zip | |
| 52 /project/db/galaxy/mothur/Silva.archaea.zip | |
| 53 /project/db/galaxy/mothur/Silva_ss_map.zip | |
| 54 /project/db/galaxy/mothur/silva.eukarya.ncbi.tax | |
| 55 /project/db/galaxy/mothur/Silva.gold.bacteria.zip | |
| 56 /project/db/galaxy/mothur/Silva.archaea/silva.archaea.silva.tax | |
| 57 /project/db/galaxy/mothur/Silva.archaea/silva.archaea.gg.tax | |
| 58 /project/db/galaxy/mothur/Silva.archaea/silva.archaea.rdp.tax | |
| 59 /project/db/galaxy/mothur/Silva.archaea/nogap.archaea.fasta | |
| 60 /project/db/galaxy/mothur/Silva.archaea/silva.archaea.ncbi.tax | |
| 61 /project/db/galaxy/mothur/Silva.archaea/silva.archaea.fasta | |
| 62 /project/db/galaxy/mothur/nogap.eukarya.fasta | |
| 63 /project/db/galaxy/mothur/silva.eukarya.silva.tax | |
| 64 /project/db/galaxy/mothur/silva.gold.align | |
| 65 /project/db/galaxy/mothur/silva.ss.map | |
| 66 /project/db/galaxy/mothur/gg.ss.map | |
| 67 /project/db/galaxy/mothur/silva.bacteria/silva.bacteria.silva.tax | |
| 68 /project/db/galaxy/mothur/silva.bacteria/silva.bacteria.rdp6.tax | |
| 69 /project/db/galaxy/mothur/silva.bacteria/nogap.bacteria.fasta | |
| 70 /project/db/galaxy/mothur/silva.bacteria/silva.bacteria.gg.tax | |
| 71 /project/db/galaxy/mothur/silva.bacteria/silva.bacteria.ncbi.tax | |
| 72 /project/db/galaxy/mothur/silva.bacteria/silva.bacteria.fasta | |
| 73 /project/db/galaxy/mothur/silva.bacteria/silva.bacteria.rdp.tax | |
| 74 /project/db/galaxy/mothur/Silva.eukarya.zip | |
| 75 /project/db/galaxy/mothur/Gg_ss_map.zip | |
| 76 /project/db/galaxy/mothur/core_set_aligned.imputed.fasta | |
| 77 | |
| 78 | |
| 79 Add tool-data: (contains pointers to silva and greengenes reference data) | |
| 80 tool-data/mothur_aligndb.loc | |
| 81 tool-data/mothur_calulators.loc | |
| 82 tool-data/mothur_map.loc | |
| 83 tool-data/mothur_taxonomy.loc | |
| 84 tool-data/shared/jars/TreeVector.jar | |
| 85 | |
| 86 | |
| 87 add config files (*.xml) and wrapper code (*.py) from tools/mothur/* to your galaxy installation | |
| 88 | |
| 89 | |
| 90 add datatype definition file: lib/galaxy/datatypes/metagenomics.py | |
| 91 | |
| 92 add the following import line to: lib/galaxy/datatypes/registry.py | |
| 93 import metagenomics # added for metagenomics mothur | |
| 94 | |
| 95 | |
| 96 add datatypes to: datatypes_conf.xml | |
| 97 <!-- Start Mothur Datatypes --> | |
| 98 <datatype extension="otu" type="galaxy.datatypes.metagenomics:Otu" display_in_upload="true"/> | |
| 99 <datatype extension="list" type="galaxy.datatypes.metagenomics:OtuList" display_in_upload="true"/> | |
| 100 <datatype extension="sabund" type="galaxy.datatypes.metagenomics:Sabund" display_in_upload="true"/> | |
| 101 <datatype extension="rabund" type="galaxy.datatypes.metagenomics:Rabund" display_in_upload="true"/> | |
| 102 <datatype extension="shared" type="galaxy.datatypes.metagenomics:SharedRabund" display_in_upload="true"/> | |
| 103 <datatype extension="relabund" type="galaxy.datatypes.metagenomics:RelAbund" display_in_upload="true"/> | |
| 104 <datatype extension="names" type="galaxy.datatypes.metagenomics:Names" display_in_upload="true"/> | |
| 105 <datatype extension="summary" type="galaxy.datatypes.metagenomics:Summary" display_in_upload="true"/> | |
| 106 <datatype extension="groups" type="galaxy.datatypes.metagenomics:Group" display_in_upload="true"/> | |
| 107 <datatype extension="oligos" type="galaxy.datatypes.metagenomics:Oligos" display_in_upload="true"/> | |
| 108 <datatype extension="align" type="galaxy.datatypes.metagenomics:SequenceAlignment" display_in_upload="true"/> | |
| 109 <datatype extension="accnos" type="galaxy.datatypes.metagenomics:AccNos" display_in_upload="true"/> | |
| 110 <datatype extension="align.check" type="galaxy.datatypes.metagenomics:AlignCheck" display_in_upload="true"/> | |
| 111 <datatype extension="align.report" type="galaxy.datatypes.metagenomics:AlignReport" display_in_upload="true"/> | |
| 112 <datatype extension="filter" type="galaxy.datatypes.metagenomics:LaneMask" display_in_upload="true"/> | |
| 113 <datatype extension="dist" type="galaxy.datatypes.metagenomics:DistanceMatrix" display_in_upload="true"/> | |
| 114 <datatype extension="pair.dist" type="galaxy.datatypes.metagenomics:PairwiseDistanceMatrix" display_in_upload="true"/> | |
| 115 <datatype extension="square.dist" type="galaxy.datatypes.metagenomics:SquareDistanceMatrix" display_in_upload="true"/> | |
| 116 <datatype extension="lower.dist" type="galaxy.datatypes.metagenomics:LowerTriangleDistanceMatrix" display_in_upload="true"/> | |
| 117 <datatype extension="taxonomy" type="galaxy.datatypes.metagenomics:SequenceTaxonomy" display_in_upload="true"/> | |
| 118 <datatype extension="cons.taxonomy" type="galaxy.datatypes.metagenomics:ConsensusTaxonomy" display_in_upload="true"/> | |
| 119 <datatype extension="tax.summary" type="galaxy.datatypes.metagenomics:TaxonomySummary" display_in_upload="true"/> | |
| 120 <datatype extension="freq" type="galaxy.datatypes.metagenomics:Frequency" display_in_upload="true"/> | |
| 121 <datatype extension="quan" type="galaxy.datatypes.metagenomics:Quantile" display_in_upload="true"/> | |
| 122 <datatype extension="filtered.quan" type="galaxy.datatypes.metagenomics:FilteredQuantile" display_in_upload="true"/> | |
| 123 <datatype extension="masked.quan" type="galaxy.datatypes.metagenomics:MaskedQuantile" display_in_upload="true"/> | |
| 124 <datatype extension="filtered.masked.quan" type="galaxy.datatypes.metagenomics:FilteredMaskedQuantile" display_in_upload="true"/> | |
| 125 <datatype extension="tre" type="galaxy.datatypes.data:Newick" display_in_upload="true"/> | |
| 126 <!-- End Mothur Datatypes --> | |
| 127 | |
| 128 add mothur tools to: tool_conf.xml | |
| 129 <section name="Metagenomics Mothur" id="metagenomics_mothur"> | |
| 130 <label text="Mothur Utilities" id="mothur_utilities"/> | |
| 131 <tool file="mothur/merge.files.xml"/> | |
| 132 <tool file="mothur/make.group.xml"/> | |
| 133 <tool file="mothur/get.groups.xml"/> | |
| 134 <tool file="mothur/remove.groups.xml"/> | |
| 135 <label text="Mothur Sequence Analysis" id="mothur_sequence_analysis"/> | |
| 136 <tool file="mothur/summary.seqs.xml"/> | |
| 137 <tool file="mothur/reverse.seqs.xml"/> | |
| 138 <tool file="mothur/list.seqs.xml"/> | |
| 139 <tool file="mothur/get.seqs.xml"/> | |
| 140 <tool file="mothur/remove.seqs.xml"/> | |
| 141 <tool file="mothur/trim.seqs.xml"/> | |
| 142 <tool file="mothur/unique.seqs.xml"/> | |
| 143 <tool file="mothur/deunique.seqs.xml"/> | |
| 144 <tool file="mothur/chop.seqs.xml"/> | |
| 145 <tool file="mothur/screen.seqs.xml"/> | |
| 146 <tool file="mothur/filter.seqs.xml"/> | |
| 147 <tool file="mothur/degap.seqs.xml"/> | |
| 148 <tool file="mothur/consensus.seqs.xml"/> | |
| 149 <tool file="mothur/sub.sample.xml"/> | |
| 150 <tool file="mothur/chimera.bellerophon.xml"/> | |
| 151 <tool file="mothur/chimera.ccode.xml"/> | |
| 152 <tool file="mothur/chimera.check.xml"/> | |
| 153 <tool file="mothur/chimera.pintail.xml"/> | |
| 154 <tool file="mothur/chimera.slayer.xml"/> | |
| 155 <tool file="mothur/align.seqs.xml"/> | |
| 156 <tool file="mothur/align.check.xml"/> | |
| 157 <tool file="mothur/split.abund.xml"/> | |
| 158 <tool file="mothur/split.groups.xml"/> | |
| 159 <tool file="mothur/parse.list.xml"/> | |
| 160 <tool file="mothur/pre.cluster.xml"/> | |
| 161 <tool file="mothur/cluster.fragments.xml"/> | |
| 162 <tool file="mothur/dist.seqs.xml"/> | |
| 163 <tool file="mothur/pairwise.seqs.xml"/> | |
| 164 <tool file="mothur/bin.seqs.xml"/> | |
| 165 <tool file="mothur/classify.seqs.xml"/> | |
| 166 <tool file="mothur/sffinfo.xml"/> | |
| 167 <tool file="mothur/fastq.info.xml"/> | |
| 168 <tool file="mothur/pcoa.xml"/> | |
| 169 <tool file="mothur/get.lineage.xml"/> | |
| 170 <tool file="mothur/remove.lineage.xml"/> | |
| 171 <label text="Mothur Operational Taxonomy Unit" id="mothur_taxonomy_unit"/> | |
| 172 <tool file="mothur/cluster.xml"/> | |
| 173 <tool file="mothur/hcluster.xml"/> | |
| 174 <tool file="mothur/cluster.classic.xml"/> | |
| 175 <tool file="mothur/read.otu.xml"/> | |
| 176 <tool file="mothur/classify.otu.xml"/> | |
| 177 <tool file="mothur/get.otus.xml"/> | |
| 178 <tool file="mothur/remove.otus.xml"/> | |
| 179 <tool file="mothur/get.oturep.xml"/> | |
| 180 <tool file="mothur/get.relabund.xml"/> | |
| 181 <label text="Mothur Single Sample Analysis" id="mothur_single_sample_analysis"/> | |
| 182 <tool file="mothur/collect.single.xml"/> | |
| 183 <tool file="mothur/rarefaction.single.xml"/> | |
| 184 <tool file="mothur/summary.single.xml"/> | |
| 185 <tool file="mothur/heatmap.bin.xml"/> | |
| 186 <label text="Mothur Multiple Sample Analysis" id="mothur_multiple_sample_analysis"/> | |
| 187 <tool file="mothur/collect.shared.xml"/> | |
| 188 <tool file="mothur/rarefaction.shared.xml"/> | |
| 189 <tool file="mothur/normalize.shared.xml"/> | |
| 190 <tool file="mothur/summary.shared.xml"/> | |
| 191 <tool file="mothur/dist.shared.xml"/> | |
| 192 <tool file="mothur/heatmap.bin.xml"/> | |
| 193 <tool file="mothur/heatmap.sim.xml"/> | |
| 194 <tool file="mothur/venn.xml"/> | |
| 195 <tool file="mothur/tree.shared.xml"/> | |
| 196 <label text="Mothur Hypothesis Testing" id="mothur_hypothesis_testing"/> | |
| 197 <tool file="mothur/parsimony.xml"/> | |
| 198 <tool file="mothur/unifrac.weighted.xml"/> | |
| 199 <tool file="mothur/unifrac.unweighted.xml"/> | |
| 200 <tool file="mothur/libshuff.xml"/> | |
| 201 <label text="Mothur Phylotype Analysis" id="mothur_phylotype_analysis"/> | |
| 202 <tool file="mothur/phylotype.xml"/> | |
| 203 <tool file="mothur/phylo.diversity.xml"/> | |
| 204 <tool file="mothur/clearcut.xml"/> | |
| 205 <tool file="mothur/indicator.xml"/> | |
| 206 <tool file="mothur/bootstrap.shared.xml"/> | |
| 207 <tool file="mothur/TreeVector.xml"/> | |
| 208 </section> <!-- metagenomics_mothur --> | |
| 209 | |
| 210 | |
| 211 ############ DESIGN NOTES ######################################################################################################### | |
| 212 Each mothur command has its own tool_config (.xml) file, but all call the same python wrapper code: mothur_wrapper.py | |
| 213 | |
| 214 * Every mothur tool will call mothur_wrapper.py script with a --cmd= parameter that gives the mothur command name. | |
| 215 * Many mothur commands require data to be read into memory (using read.dist, read.otu, read.tree) before executing the command; | |
| 216 this is accomplished in the tool_config and mothur_wrapper.py with --READ_cmd= and --READ_<option> parameters. | |
| 217 * Every tool will produce the logfile of the mothur run as an output. | |
| 218 * When the outputs of a mothur command can be determined in advance, they are included in the --result= parameter to mothur_wrapper.py | |
| 219 * When the number of outputs cannot be determined in advance, the name patterns and datatypes of the outputs | |
| 220 are included in the --new_datasets parameter to mothur_wrapper.py | |
| 221 | |
| 222 Here is an example call to the mothur_wrapper.py script, with an explanation before each parameter: | |
| 223 mothur_wrapper.py | |
| 224 # name of a mothur command, this is required | |
| 225 --cmd='summary.shared' | |
| 226 # Galaxy output dataset list, these are output files that can be determined before the command is run | |
| 227 # The items in the list are separated by commas | |
| 228 # Each item contains a regex to match the output filename and a galaxy dataset filepath in which to copy the data (separated by :) | |
| 229 --result='^mothur.\S+\.logfile$:'/home/galaxy/data/database/files/002/dataset_2613.dat,'^\S+\.summary$:'/home/galaxy/data/database/files/002/dataset_2614.dat | |
| 230 # Galaxy output dataset extra_files_path directory in which to put all output files (usually the logfile extra_file path) | |
| 231 --outputdir='/home/galaxy/data/database/files/002/dataset_2613_files' | |
| 232 # The id of one of the galaxy outputs (e.g. the mothur logfile) used for dynamic dataset generation (when number of outputs not known in advance) | |
| 233 # see: http://bitbucket.org/galaxy/galaxy-central/wiki/ToolsMultipleOutput | |
| 234 --datasetid='2578' | |
| 235 # The galaxy directory in which to copy all output files for dynamic dataset generation (special galaxy tool param: $__new_file_path__) | |
| 236 --new_file_path='$__new_file_path__' | |
| 237 # specifies files to copy to the new_file_path | |
| 238 # The list is separated by commas | |
| 239 # Each item contains: a regex pattern for matching filenames and a galaxy datatype (separated by :) | |
| 240 # The regex match.groups()[0] is used as the id name of the dataset, and must result in a unique name for each output | |
| 241 --new_datasets='^\S+?\.((\S+)\.(unique|[0-9.]*)\.dist)$:lower.dist' | |
| 242 # Many mothur commands first require data to be read into memory using: read.otu, read.dist, or read.tree | |
| 243 # This prerequisite command and its params are prefixed with 'READ_' | |
| 244 --READ_cmd='read.otu' | |
| 245 --READ_list=/home/galaxy/data/database/files/001/dataset_1557.dat | |
| 246 --READ_group='/home/galaxy/data/database/files/001/dataset_1545.dat' | |
| 247 --READ_label='unique,0.07' | |
| 248 |
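
The design notes above describe --result and --new_datasets as comma-separated lists of "regex:value" items. The sketch below illustrates how such lists could be interpreted; it is not taken from mothur_wrapper.py. The function names (parse_pairs, route_results, name_new_datasets) and the example filenames are hypothetical, and the parsing assumes that no regex or value contains a comma and that no value contains a colon. The two specs are copied from the example call, with the shell quoting removed.

```python
import re

# Specs copied from the README example call (shell quotes removed).
RESULT_SPEC = (
    r"^mothur.\S+\.logfile$:/home/galaxy/data/database/files/002/dataset_2613.dat,"
    r"^\S+\.summary$:/home/galaxy/data/database/files/002/dataset_2614.dat"
)
NEW_DATASETS_SPEC = r"^\S+?\.((\S+)\.(unique|[0-9.]*)\.dist)$:lower.dist"


def parse_pairs(spec):
    """Split 'regex:value,regex:value' into (compiled regex, value) pairs.

    Assumption: items contain no comma and values contain no colon,
    so the last ':' in each item separates the regex from the value.
    """
    pairs = []
    for item in spec.split(","):
        pattern, _, value = item.rpartition(":")
        pairs.append((re.compile(pattern), value))
    return pairs


def route_results(filenames, result_spec):
    """Map each filename to the dataset path of the first regex it matches."""
    pairs = parse_pairs(result_spec)
    routed = {}
    for name in filenames:
        for regex, dataset_path in pairs:
            if regex.match(name):
                routed[name] = dataset_path
                break
    return routed


def name_new_datasets(filenames, new_datasets_spec):
    """Return (filename, id name, datatype) for each matching filename.

    The id name is match.groups()[0], as stated in the README.
    """
    pairs = parse_pairs(new_datasets_spec)
    named = []
    for name in filenames:
        for regex, datatype in pairs:
            match = regex.match(name)
            if match:
                named.append((name, match.groups()[0], datatype))
                break
    return named


if __name__ == "__main__":
    # Hypothetical output filenames, for illustration only.
    outputs = [
        "mothur.1307480052.logfile",
        "example.summary",
        "example.groupA.unique.dist",
    ]
    print(route_results(outputs, RESULT_SPEC))
    print(name_new_datasets(outputs, NEW_DATASETS_SPEC))
```

Files that match no --result regex are simply left out of the returned mapping here; in the wrapper, such files are the candidates for the --new_datasets patterns.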
