view export2graphlan.xml @ 3:ebe3cb467f8c draft

Uploaded
author george-weingart
date Thu, 04 Sep 2014 13:51:33 -0400
parents dba11280df2c
children c0c7f369e331

<tool id="export2graphlan" name="export2graphlan" version="1.0.0">
  <description>Export to Graphlan</description>
  <command  interpreter="python">
    export2graphlan.py 
	-i $inp_data 
	-o $out_data
	-t $output_tree_file 
	-a $output_annot_file 
	--title "$export_title"
	--annotations $export_annotations 
	--external_annotations $export_external_annotations
	--skip_rows 1,2 
   </command>
   
	<inputs>
	    <param name="export_title" type="text" format="text" label="Title" value="Title"/>
	    <param name="export_annotations" type="text" format="text" label="Annotations" value="2,3"/>
	    <param name="export_external_annotations" type="text" format="text" label="External Annotations" value="4,5,6"/>   
		<param format="tabular" name="inp_data" type="data" label="Input used to run LEfSe - See samples below - Please use Galaxy Get-Data/Upload-File. Use File-Type = Tabular" help="This is the file that was used as input for LEfSe"/>
		<param format="lefse_res" name="out_data" type="data" label="Output of LEfSe" help="This is the LEfSe output file"/>
    </inputs>
	<outputs>
            <data  name="output_annot_file"  format="circl"  />
            <data  name="output_tree_file"  format="circl"  />
	</outputs>
                                  
  <help>
Overview
========
**export2graphlan** is an *OPTIONAL* tool that automatically converts **LEfSe**, **MetaPhlAn2**, and **HUMAnN** input and/or output files into input files for **GraPhlAn**. Input files can also be given in BIOM format (both versions 1 and 2).

The aim of this tool is to support biologists by automatically providing the tree and annotation files for GraPhlAn.

Input files
-----------

As shown in the image below, export2graphlan can work with just one of the following files or with both of them.

 * **Result of MetaPhlAn or HUMAnN analysis**: As depicted in the image below, this file can be the result of a MetaPhlAn analysis or a HUMAnN analysis. Generally, it is a tab-separated file in which each row contains a taxonomy and an abundance value.

 * **Output of LEfSe**: This file is the result of running LEfSe on the *Result of MetaPhlAn or HUMAnN analysis* file. It allows GraPhlAn to highlight the biomarkers that were found.
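
As a rough illustration of the first file type, the sketch below parses a small tab-separated table of taxonomy and abundance values. The sample rows, the ``|`` level separator, and the function name ``read_abundances`` are assumptions made for this example; real MetaPhlAn or HUMAnN files may carry additional header or metadata rows.

```python
import csv
import io

# Hypothetical sample: one taxonomy string and one abundance value per row.
SAMPLE = (
    "k__Bacteria\t100.0\n"
    "k__Bacteria|p__Firmicutes\t62.5\n"
    "k__Bacteria|p__Actinobacteria\t37.5\n"
)


def read_abundances(handle, level_sep="|"):
    """Return a list of (taxonomy levels, abundance) tuples."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # ignore blank lines
        taxonomy, abundance = fields[0], float(fields[1])
        rows.append((taxonomy.split(level_sep), abundance))
    return rows


rows = read_abundances(io.StringIO(SAMPLE))
for levels, abundance in rows:
    # The depth of a clade is the number of taxonomy levels in its name.
    print(len(levels), levels[-1], abundance)
```

The separator is a parameter because the clade names in the examples further down this help text use ``.`` between levels.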

Input parameters
----------------
 
    --annotations ANNOTATIONS
                        List which levels should be annotated in the tree. Use
                        a comma-separated list, e.g., --annotations 1,2,3.
                        Default is None
    --external_annotations EXTERNAL_ANNOTATIONS
                        List which levels should use the external legend for
                        the annotation. Use a comma-separated list, e.g.,
                        --external_annotations 1,2,3. Default is None
    --background_levels BACKGROUND_LEVELS
                        List which levels should be highlighted with a shaded
                        background. Use a comma-separated list, e.g.,
                        --background_levels 1,2,3
    --background_clades BACKGROUND_CLADES
                        Specify the clades that should be highlighted with a
                        shaded background. Use a comma-separated list and
                        surround the string with " if it contains spaces.
                        Example: --background_clades "Bacteria.Actinobacteria,
                        Bacteria.Bacteroidetes.Bacteroidia,
                        Bacteria.Firmicutes.Clostridia.Clostridiales"
    --background_colors BACKGROUND_COLORS
                        Set the colors to use for the shaded background. Colors
                        can be given in either RGB or HSV format (using
                        semicolons to separate the values, surrounded by
                        parentheses). Use a comma-separated list and surround
                        the string with " if it contains spaces. Example:
                        --background_colors "#29cc36, (150; 100; 100), (280;
                        80; 88)"
    --title TITLE         If specified, set the title of the GraPhlAn plot.
                        Surround the string with " if it contains spaces,
                        e.g., --title "Title example"
    --title_font_size TITLE_FONT_SIZE
                        Set the title font size. Default is 15
    --def_clade_size DEF_CLADE_SIZE
                        Set a default size for clades that are not found as
                        biomarkers by LEfSe. Default is 10
    --min_clade_size MIN_CLADE_SIZE
                        Set the minimum value of clades that are biomarkers.
                        Default is 20
    --max_clade_size MAX_CLADE_SIZE
                        Set the maximum value of clades that are biomarkers.
                        Default is 200
    --def_font_size DEF_FONT_SIZE
                        Set a default font size. Default is 10
    --min_font_size MIN_FONT_SIZE
                        Set the minimum font size to use. Default is 8
    --max_font_size MAX_FONT_SIZE
                        Set the maximum font size. Default is 12
    --annotation_legend_font_size ANNOTATION_LEGEND_FONT_SIZE
                        Set the font size for the annotation legend. Default
                        is 10
    --abundance_threshold ABUNDANCE_THRESHOLD
                        Set the minimum abundance value for a clade to be
                        annotated. Default is 20.0
    --most_abundant MOST_ABUNDANT
                        When only lefse_input is provided, you can specify how
                        many clades to highlight. Since the biomarkers are
                        missing, they will be chosen from the most abundant
                        clades
    --least_biomarkers LEAST_BIOMARKERS
                        When only lefse_input is provided, you can specify the
                        minimum number of biomarkers to extract. The taxonomy
                        is parsed, and the level is chosen in order to have
                        at least the specified number of biomarkers
    --discard_otus        If specified, the OTU ids will be discarded from the
                        taxonomy. The default behavior keeps OTU ids in the
                        taxonomy
    --internal_levels     If specified, sum up the abundance values from leaf
                        to root. The default behavior does not sum up
                        abundances on the internal nodes
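
The comma-separated options above can be checked before a run with a few lines of Python. This is only a sketch of the value formats described in this help text, not the parsing code used by export2graphlan itself; the helper names ``parse_levels`` and ``parse_colors`` are hypothetical.

```python
import re


def parse_levels(text):
    """Turn a string such as "1,2,3" into a list of integers."""
    return [int(item) for item in text.split(",") if item.strip()]


def parse_colors(text):
    """Split a color list into plain strings and numeric triplets.

    Triplets are written as (a; b; c), so commas only separate entries.
    """
    colors = []
    for item in text.split(","):
        item = item.strip()
        match = re.fullmatch(r"\(\s*([^;]+);\s*([^;]+);\s*([^;]+)\)", item)
        if match:
            colors.append(tuple(float(value) for value in match.groups()))
        elif item:
            colors.append(item)
    return colors


print(parse_levels("1,2,3"))
print(parse_colors("#29cc36, (150; 100; 100), (280; 80; 88)"))
```

Because the values inside a parenthesised triplet are separated by semicolons, splitting the whole string on commas is enough to recover the individual colors.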

    Input parameters:
    You need to provide at least one of the two arguments
    -i LEFSE_INPUT, --lefse_input LEFSE_INPUT
                        LEfSe input data

    -o LEFSE_OUTPUT, --lefse_output LEFSE_OUTPUT
                        LEfSe output result data

    Output parameters:
    -t TREE, --tree TREE  Output filename in which to save the input tree for
                        GraPhlAn
    -a ANNOTATION, --annotation ANNOTATION
                        Output filename in which to save the GraPhlAn
                        annotation

    Input data matrix parameters:
    --sep SEP
    --out_table OUT_TABLE
                        Write processed data matrix to file
    --fname_row FNAME_ROW
                        Row number containing the names of the features
                        [default 0, specify -1 if no names are present in the
                        matrix]
    --sname_row SNAME_ROW
                        Column number containing the names of the samples
                        [default 0, specify -1 if no names are present in the
                        matrix]
    --metadata_rows METADATA_ROWS
                        Row numbers to use as metadata [default None, meaning
                        no metadata]
    --skip_rows SKIP_ROWS
                        Row numbers to skip (0-indexed, comma separated) from
                        the input file [default None, meaning no rows skipped]
    --sperc SPERC         Percentile of sample value distribution for sample
                        selection
    --fperc FPERC         Percentile of feature value distribution for feature
                        selection
    --stop STOP           Number of top samples to select (ordering based on
                        percentile specified by --sperc)
    --ftop FTOP           Number of top features to select (ordering based on
                        percentile specified by --fperc)
    --def_na DEF_NA       Set the default value for missing values [default None
                        which means no replacement]
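
For reference, the call that this Galaxy wrapper assembles can be approximated outside Galaxy as follows. The sketch only builds and prints a command line from the options shown in this file; the file names and the title are placeholders, and ``build_command`` is a hypothetical helper rather than part of export2graphlan.

```python
import shlex


def build_command(lefse_input, lefse_output, tree, annotation,
                  title, annotations, external_annotations, skip_rows):
    """Build an argument list mirroring the wrapper's command section."""
    return [
        "python", "export2graphlan.py",
        "-i", lefse_input,
        "-o", lefse_output,
        "-t", tree,
        "-a", annotation,
        "--title", title,
        "--annotations", annotations,
        "--external_annotations", external_annotations,
        "--skip_rows", skip_rows,
    ]


# Placeholder file names; replace them with real paths.
command = build_command(
    "lefse_input.txt", "lefse_output.res", "tree.txt", "annot.txt",
    "Title example", "2,3", "4,5,6", "1,2",
)
# shlex.join quotes arguments that contain spaces, such as the title.
print(shlex.join(command))
```

Keeping the arguments in a list and quoting them only for display avoids the problem of a title with spaces being split into several arguments.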

Integration
===========

A graphical representation of how **export2graphlan** can be integrated in the analysis pipeline:

.. image:: https://bitbucket.org/repo/oL6bEG/images/3364692296-graphlan_integration.png
    :height: 672
    :width: 800

Want to know more?
==================

If you want to know more about **export2graphlan**, please have a look at the tutorial.
  </help>
</tool>