Mercurial > repos > peter-waltman > ucsc_cluster_tools2
view cluster.tools/partition.xml @ 9:a3c03541fe6f draft default tip
Uploaded
author | peter-waltman |
---|---|
date | Mon, 11 Mar 2013 17:30:48 -0400 |
parents | a58527c632b7 |
<tool id="partiton_clust" name="Partition Clustering" force_history_refresh="True">
    <command interpreter="python">partition.py
        -d $dataset
        ${dist_obj}
        -n ${direction}
        -a $alg_cond.algorithm
        #if $alg_cond.algorithm == 'pam'
            -m ${alg_cond.distance_metric}
        #end if
        #if str($numk) != "-1"
            -k ${numk}
        #end if
        #if str($direction) == "rows"
            -o ${rdata_output_rows}
        #end if
        #if str($direction) == "cols"
            -o ${rdata_output_cols}
        #end if
    </command>
    <inputs>
        <param name="dataset" type="data" format='tabular' label="Data Set" help="Specify the data matrix (tab-delimited) to be clustered"/>
        <param name="dist_obj" type="boolean" label="Distance Object (R dist object)?" truevalue="-D" falsevalue="" checked="False" help="Check if the matrix contains the pairwise distances between a set of objects"/>
        <param name="direction" type="select" label="Cluster Columns or Rows?" help="Specify the matrix dimension to cluster (see help below)">
            <option value="cols">Columns (Samples)</option>
            <option value="rows" selected='true'>Rows (Genes)</option>
        </param>
        <conditional name='alg_cond'>
            <param name="algorithm" type="select" label="PAM or K-means?" help="Specify the partition clustering method to use (see help below)">
                <option value="km">K-means</option>
                <option value="pam" selected='true'>PAM</option>
            </param>
            <when value='pam'>
                <param name="distance_metric" type="select" label="Distance Metric" help="Specify the distance metric to use (see help below)">
                    <option value="cosine" selected='true'>Cosine</option>
                    <option value="abscosine">Absolute Cosine</option>
                    <option value="pearson">Pearson</option>
                    <option value="abspearson">Absolute Pearson</option>
                    <option value="spearman">Spearman</option>
                    <option value="kendall">Kendall</option>
                    <option value="euclidean">Euclidean</option>
                    <option value="maximum">Maximum</option>
                    <option value="manhattan">Manhattan (AKA city block)</option>
                    <option value="canberra">Canberra</option>
                    <option value="binary">Binary</option>
                </param>
            </when>
        </conditional>
        <param name="numk" type="integer" label="Number of Clusters" value="-1" help="Specify the number of clusters to use (-1 to use the default; see help below)."/>
    </inputs>
    <outputs>
        <data format="rdata" name="rdata_output_rows" label="Partition Clustering Results; Gene Clusters (RData)">
            <filter>(direction)=="rows"</filter>
        </data>
        <data format="rdata" name="rdata_output_cols" label="Partition Clustering Results; Sample Clusters (RData)">
            <filter>(direction)=="cols"</filter>
        </data>
    </outputs>
    <help>
.. class:: infomark

**Perform partition clustering on the rows (genes) or columns (samples) of a specified data set**

----

**Parameters**

- **Data Set** - Specify the data matrix to be clustered. Data must be formatted as follows:

  * Tab-delimited
  * Includes row and column headers

- **Distance Object** - Specify whether the data set is a pairwise distance matrix (an R dist object)

- **Cluster Columns or Rows?** - Specify the dimension of the matrix to cluster:

  * Rows (Genes)
  * Columns (Samples)

- **PAM or K-means?** - Specify which partition clustering method to use. Choices:

  * PAM (Partitioning Around Medoids)
  * K-means

- **Distance Metric** - Specify the distance metric to use. This option is available only when PAM is the selected algorithm. Choices:

  * Cosine (AKA uncentered Pearson)
  * Absolute Cosine (AKA uncentered Pearson, absolute value)
  * Pearson (Pearson correlation)
  * Absolute Pearson (Pearson correlation, absolute value)
  * Spearman (Spearman correlation)
  * Kendall (Kendall's tau)
  * Euclidean (Euclidean distance)
  * Maximum
  * Manhattan (AKA city block)
  * Canberra
  * Binary

- **Number of Clusters** - Specify the number of clusters to use. If set to -1, a default value is used:

  * if samples/columns are being clustered, the **default** is 5.
  * if genes/rows are being clustered, the **default** is num_rows/30, e.g. if there are 600 rows/genes in the matrix, the default will be 20 clusters.
    </help>
</tool>
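The `<command>` template and the **Number of Clusters** help text can be hard to follow in template form, so here is a minimal Python sketch of the same logic. It is not taken from `partition.py`, whose source is not shown here. The helper names `default_num_clusters` and `build_partition_args` are hypothetical, and the use of floor division for `num_rows/30` is an assumption, since the help text only gives the 600-row example.

```python
def default_num_clusters(direction, num_rows):
    """Default cluster count as described in the tool help (hypothetical helper)."""
    if direction == "cols":
        # Clustering samples/columns: the help text states a fixed default of 5.
        return 5
    if direction == "rows":
        # Clustering genes/rows: num_rows/30. Assumption: floor division; the
        # help text does not say how row counts that are not multiples of 30
        # are rounded.
        return num_rows // 30
    raise ValueError("direction must be 'rows' or 'cols', got %r" % (direction,))


def build_partition_args(dataset, output, direction="rows", algorithm="pam",
                         distance_metric="cosine", is_dist_obj=False, numk=-1):
    """Mirror the wrapper's Cheetah <command> template as an argument list."""
    args = ["partition.py", "-d", dataset]
    if is_dist_obj:
        # The boolean "dist_obj" parameter expands to -D when checked.
        args.append("-D")
    args += ["-n", direction, "-a", algorithm]
    if algorithm == "pam":
        # The distance metric is only passed when PAM is selected.
        args += ["-m", distance_metric]
    if numk != -1:
        # -k is omitted for -1 so that the script falls back to its default.
        args += ["-k", str(numk)]
    # One output is used per direction (rdata_output_rows or rdata_output_cols).
    args += ["-o", output]
    return args


if __name__ == "__main__":
    print(default_num_clusters("rows", 600))
    print(" ".join(build_partition_args("matrix.tab", "clusters.rdata")))
```

With the wrapper's defaults (rows, PAM, cosine, `numk` of -1), the sketch emits neither `-D` nor `-k`, which matches how the template leaves those flags out.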