comparison cluster.tools/hclust.xml @ 2:b442996b66ae draft

Uploaded
author peter-waltman
date Wed, 27 Feb 2013 20:17:04 -0500
1:e25d2bece0a2 2:b442996b66ae
<tool id="hcluster" name="Hierarchical Clustering (HAC)" force_history_refresh="True">
  <command interpreter="python">hclust.py
    -d $dataset
    ${dist_obj}
    -n ${direction}
    -m ${distance_metric}
    -l ${linkage}
    -k ${numk}
    -o ${rdata_output}

  </command>
  <inputs>
    <param name="dataset" type="data" format='tabular' label="Data Set" help="Specify the data matrix (tab-delimited) to be clustered"/>
    <param name="dist_obj" type="boolean" label="Distance Object (R dist object)?" truevalue="-D" falsevalue="" checked="False" help="Check if the matrix contains the pairwise distances between a set of objects"/>
    <param name="direction" type="select" label="Cluster Samples or Genes?" help="Specify the matrix dimension to cluster (see help below)">
      <option value="cols">Columns (Samples)</option>
      <option value="rows" selected='true'>Rows (Genes)</option>
    </param>

    <param name="distance_metric" type="select" label="Distance Metric" help="Specify the distance metric to use (see help below)">
      <option value="cosine" selected='true'>Cosine</option>
      <option value="abscosine">Absolute Cosine</option>
      <option value="pearson">Pearson</option>
      <option value="abspearson">Absolute Pearson</option>
      <option value="spearman">Spearman</option>
      <option value="kendall">Kendall</option>
      <option value="euclidean">Euclidean</option>
      <option value="maximum">Maximum</option>
      <option value="manhattan">Manhattan (AKA city block)</option>
      <option value="canberra">Canberra</option>
      <option value="binary">Binary</option>
    </param>

    <param name="linkage" type="select" label="Linkage" help="Specify the linkage to use when clustering (see help below)">
      <option value="average">Average</option>
      <option value="centroid">Centroid</option>
      <option value="complete" selected='true'>Complete</option>
      <option value="mcquitty">McQuitty</option>
      <option value="median">Median</option>
      <option value="single">Single</option>
      <option value="ward">Ward</option>
    </param>

    <param name="numk" type="integer" label="Number of Clusters" value="50" help="Specify the number of clusters to use"/>

  </inputs>
  <outputs>
    <data format="rdata" name="rdata_output" label="Hierarchical Clustering Result (RData)"/>
  </outputs>
  <help>
.. class:: infomark

**Perform hierarchical clustering on the rows (genes) or columns (samples) of a specified data set**

----

**Parameters**

- **Data Set** - Specify the data matrix to be clustered. Data must be formatted as follows:

  * Tab-delimited
  * Include row and column headers

- **Cluster Samples or Genes** - Specify the dimension of the matrix to cluster:

  * Rows (Genes)
  * Columns (Samples)

- **Distance Object** - Specify whether or not the data set is already a pairwise distance matrix

- **Distance Metric** - Specify the distance metric to use. Choice of:

  * Cosine (AKA uncentered Pearson)
  * Absolute Cosine (AKA uncentered Pearson, absolute value)
  * Pearson (Pearson correlation)
  * Absolute Pearson (Pearson correlation, absolute value)
  * Spearman (Spearman correlation)
  * Kendall (Kendall's tau)
  * Euclidean (Euclidean distance)
  * Maximum
  * Manhattan (AKA city block)
  * Canberra
  * Binary

- **Linkage** - Specify the linkage to use when clustering (see the documentation for R's hclust function for an explanation of the choices). Choice of:

  * Average
  * Single
  * Complete
  * Median
  * Centroid
  * McQuitty
  * Ward

- **Number of Clusters** - Specify the number of clusters to use

  </help>
</tool>
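
The help text describes Cosine as "uncentered pearson" and offers "absolute value" variants of both. The sketch below illustrates what that distinction means in practice. It is not taken from hclust.py, which is not shown in this changeset: the function names are hypothetical, and the conversion from similarity to distance (1 - s, or 1 - |s| for the absolute variants) is an assumption about a common convention, not a statement of what the tool does.

```python
import math


def cosine_similarity(x, y):
    """Uncentered correlation: dot(x, y) / (|x| * |y|)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])


def to_distance(similarity, absolute=False):
    """Assumed conversion: 1 - s, or 1 - |s| for the 'absolute' variants."""
    return 1.0 - (abs(similarity) if absolute else similarity)


# Hypothetical expression profiles for three genes across four samples.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [11.0, 12.0, 13.0, 14.0]  # gene_a shifted by a constant
gene_c = [-1.0, -2.0, -3.0, -4.0]  # gene_a with the sign flipped

pearson_ab = to_distance(pearson_correlation(gene_a, gene_b))
cosine_ab = to_distance(cosine_similarity(gene_a, gene_b))
cosine_ac = to_distance(cosine_similarity(gene_a, gene_c))
abscosine_ac = to_distance(cosine_similarity(gene_a, gene_c), absolute=True)
```

Under these assumptions, the Pearson distance between gene_a and gene_b is effectively zero because centering removes the constant shift, while their cosine distance is small but nonzero. The sign-flipped pair is maximally distant under plain cosine and coincident under absolute cosine, which is why the absolute variants group anti-correlated profiles together.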