annotate normalize.xml @ 0:31cfcab40d8f draft

Uploaded
author ynewton
date Wed, 26 Sep 2012 17:32:30 -0400
parents
children a968fcdbd710
<tool id="matrix_normalize" name="Matrix Normalize" version="1.0.0">
  <description>Matrix Normalize</description>
  <command interpreter="Rscript">normalize.r $genomicMatrix $normType $normBy
#if str($normalList) != "None":
$normalList
#end if
> $outfile
  </command>
  <inputs>
    <param name="genomicMatrix" type="data" label="Genomic Matrix"/>
    <param name="normBy" type="select" label="normalize by (row or column)">
      <option value="row">ROW</option>
      <option value="column">COLUMN</option>
    </param>
    <param name="normType" type="select" label="type of normalization">
      <option value="median_shift">Median Shift</option>
      <option value="mean_shift">Mean Shift</option>
      <option value="t_statistic">T-Statistic (z-scores)</option>
      <option value="exponential_fit">Exponential Distribution Normalization</option>
      <option value="normal_fit">Normal Distribution Normalization</option>
      <option value="weibull_0.5_fit">Weibull Distribution Normalization (scale=1,shape=0.5)</option>
      <option value="weibull_1_fit">Weibull Distribution Normalization (scale=1,shape=1)</option>
      <option value="weibull_1.5_fit">Weibull Distribution Normalization (scale=1,shape=1.5)</option>
      <option value="weibull_5_fit">Weibull Distribution Normalization (scale=1,shape=5)</option>
    </param>
    <param name="normalList" optional="true" type="data" label="Normal List"/>
  </inputs>
  <outputs>
    <data name="outfile" format="tabular"/>
  </outputs>
  <help>
**What it does**

This tool normalizes a data matrix using the selected normalization options. The matrix is assumed to be annotated by row and by column: the first line of the file holds the column headers, and the first field of every other line holds the row header.

Data can be normalized by row or by column. Note that the fitting normalizations (exponential, normal, and Weibull) always operate by column, regardless of this selection.

The following normalizations are provided:

1. Median shift: if no normals list is provided, computes the median of the whole row and subtracts it from each entry of the row. If normals are provided, computes the median of the normal samples and subtracts it from each non-normal value; only the non-normal samples are returned. If "COLUMN" is selected under "normalize by", the normals are ignored.

2. Mean shift: if no normals list is provided, computes the mean of the whole row and subtracts it from each entry of the row. If normals are provided, computes the mean of the normal samples and subtracts it from each non-normal value; only the non-normal samples are returned. If "COLUMN" is selected under "normalize by", the normals are ignored.

3. T-statistic (z-score): sometimes called standardization. A z-score is computed for each value in the row or column. If normals are specified, z-scores are computed separately within each class (normals and non-normals).

4. Exponential normalization: performed by column (sample). All genes/probes in the column are ranked, and the inverse CDF (quantile function) of the exponential distribution is then applied to the ranks, transforming each rank into a real number that follows an exponential distribution.

5. Normal normalization: same as exponential normalization, but the quantile function (inverse CDF) of the Normal distribution is applied.

6. Weibull normalizations: same as exponential normalization, but the quantile function (inverse CDF) of the Weibull distribution is applied with the selected scale and shape parameters.

The Normal List parameter is optional. It contains a list of column headers from the input matrix that should be treated as normals.

  </help>
</tool>
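The shift and fitting normalizations described in the help text can be sketched in a few lines of Python. This is only an illustration of the ideas, not the code in normalize.r, which is not shown on this page. The function names, the `rank / (n + 1)` convention for turning ranks into probabilities, the handling of ties by input order, and the exponential rate of 1 are all assumptions made for the example.

```python
import math
from statistics import NormalDist, median


def median_shift(values, headers, normals=None):
    """Median-shift one matrix row (hypothetical helper, not from normalize.r).

    Without normals, subtract the row median from every entry.
    With normals, subtract the median of the normal samples from the
    non-normal samples and return only the non-normal samples.
    """
    if not normals:
        center = median(values)
        return {h: v - center for h, v in zip(headers, values)}
    normal_set = set(normals)
    center = median(v for h, v in zip(headers, values) if h in normal_set)
    return {h: v - center for h, v in zip(headers, values) if h not in normal_set}


def rank_to_quantile(column, quantile):
    """Rank the values of one column, then map each rank through a quantile function."""
    n = len(column)
    # Indices sorted by value; ties keep their input order (an assumption).
    order = sorted(range(n), key=lambda i: column[i])
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p = rank / (n + 1)  # assumed convention, keeps p strictly between 0 and 1
        out[i] = quantile(p)
    return out


def exponential_quantile(p, rate=1.0):
    """Inverse CDF of the exponential distribution (rate=1 is an assumption)."""
    return -math.log(1.0 - p) / rate


def weibull_quantile(p, shape, scale=1.0):
    """Inverse CDF of the Weibull distribution with the given shape and scale."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# Inverse CDF of the standard Normal distribution, from the standard library.
normal_quantile = NormalDist().inv_cdf
```

For example, `rank_to_quantile(column, lambda p: weibull_quantile(p, shape=0.5))` would correspond to the "scale=1,shape=0.5" option under these assumptions, and a Weibull quantile with shape 1 coincides with the exponential quantile at rate 1.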