# HG changeset patch
# User ynewton
# Date 1349194080 14400
# Node ID b3a41a2bc7e257e69e6272ba7ccaaec3fdb66a62
# Parent a968fcdbd7104160ca4da8bf702c6bd08ea50a95
Deleted selected files

diff -r a968fcdbd710 -r b3a41a2bc7e2 ._normalize.xml
Binary file ._normalize.xml has changed
diff -r a968fcdbd710 -r b3a41a2bc7e2 normalize.xml
--- a/normalize.xml Tue Oct 02 12:06:14 2012 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,56 +0,0 @@
-
- Matrix Normalize
- normalize.r $genomicMatrix $normType $normBy
-#if str($normalList) != "None":
- $normalList
-#end if
- > $outfile
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-**What it does**
-
-This tool takes data in a matrix format and normalizes it using the chosen normalization options. The matrix data is assumed to be column and row annotated, meaning that the first line of the matrix file is assumed to be the column headers and the first column of each row is assumed to be the row header.
-
-Data can be normalized either by row or column. Note that exponential, normal, and weibull normalizations automatically do so by column regardless of the user selection.
-
-The following normalizations are provided:
-
-1. Median shift: if no normals list is provided then computes the median for the whole row and subtracts it from each entry of the row. If normals are provided then computes median for normals and subtracts it from each value of non-normal. Returns only non-normal samples if normals are provided. If "Column" is selected in normalize by, then normals are ignored.
-
-2. Mean shift: if no normals list is provided then computes the mean for the whole row and subtracts it from each entry of the row. If normals are provided then computes mean for normals and subtracts it from each value of non-normal. Returns only non-normal samples if normals are provided. If "Column" is selected in normalize by, then normals are ignored.
-
-3. T-statistic (z-score): sometimes called standardization. Z-score is computed for each value of the row/column. If normals are specified then the z-score within each class (normals and non-normals) is computed.
-
-4. Exponential normalization: performed by columns/samples. All genes/probes in the column/sample are ranked. Then inverse CDF (quantile function) is applied to the ranks (transforms a rank to a real number in exponential distribution).
-
-5. Normal normalization: same as exponential normalization, but inverse quantile function of Normal distribution is applied.
-
-6. Weibull normalizations: same as exponential normalization, but inverse quantile function of Weibull distribution is applied with appropriate scale and shape parameters.
-
-
-Normals parameter is an optional parameter which contains a list of column headers from the input matrix which should be considered as normals
-
-
-
\ No newline at end of file
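
The normalizations described in the removed help text can be illustrated with a minimal sketch. This is not the code of normalize.r, which this changeset does not show; it is written in Python rather than R, and every function name here is hypothetical. Several details are assumptions that the help text leaves open: ranks are mapped to probabilities as rank / (n + 1), ties are broken by position, the z-score uses the sample standard deviation, and the exponential and Weibull parameters default to 1.

```python
import math
import statistics


def median_shift(row, normal_idx=None):
    """Row-wise median shift.

    Without normals, subtract the median of the whole row from every entry.
    With normals, subtract the median of the normal entries from each
    non-normal entry and return only the non-normal entries.
    """
    if not normal_idx:
        center = statistics.median(row)
        return [v - center for v in row]
    normals = set(normal_idx)
    center = statistics.median([row[i] for i in sorted(normals)])
    return [v - center for i, v in enumerate(row) if i not in normals]


def z_scores(values):
    """Standardize values to zero mean and unit standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample (n - 1) deviation
    return [(v - mean) / sd for v in values]


def rank_normalize(column, quantile):
    """Rank the entries of a column, then apply an inverse CDF to the ranks."""
    n = len(column)
    # Indices sorted by value; ties keep their original order (assumption).
    order = sorted(range(n), key=lambda i: column[i])
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Assumption: rank / (n + 1) keeps the probability strictly in (0, 1).
        out[i] = quantile(rank / (n + 1))
    return out


def exponential_quantile(p, rate=1.0):
    """Inverse CDF of the exponential distribution."""
    return -math.log(1.0 - p) / rate


def weibull_quantile(p, scale=1.0, shape=1.0):
    """Inverse CDF of the Weibull distribution."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# Inverse CDF of the standard normal distribution (Python 3.8+).
normal_quantile = statistics.NormalDist().inv_cdf
```

Under these assumptions, rank_normalize(column, exponential_quantile) corresponds to item 4, and passing normal_quantile or weibull_quantile instead gives the variants in items 5 and 6. A mean shift follows from median_shift by replacing statistics.median with statistics.mean.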