ucsc_cancer_browser_stats: ttest/stats.xml comparison

comparison ttest/stats.xml @ 0:12bb38e187b9

Uploaded, initial check-in

author	melissacline
date	Mon, 28 Jul 2014 20:12:17 -0400
parents
children	a04e3c59e117


--1:000000000000
+0:12bb38e187b9
+<tool id="ucscCancerBrowserStats" description="Statistical Tests of Difference" name="UCSC Cancer Browser Stats" version="0.0.1">
+<description>Apply statistical tests of difference to the rows in a genomic matrix, where the columns are categorized by a second (clinical) matrix</description>
+<command interpreter="python">
+stats.py  $genomicMatrix $clinicalFeatures $outFile -a="${category1}" -b="${category2}"
+</command>
+<inputs>
+<param format="tabular" name="genomicMatrix" type="data" label="Genomic Matrix"/>
+<param format="tabular" name="clinicalFeatures" type="data" label="Clinical Matrix"/>
+<param type="text" name="category1" label="Category 1" optional="false"/>
+<param type="text" name="category2" label="Category 2" optional="false"/>
+</inputs>
+<outputs>
+<data format="tabular" name="outFile" />
+</outputs>
+<requirements>
+<requirement type="python-module">numpy</requirement>
+</requirements>
+<tests>
+<test>
+<param name="genomicMatrix" value="sample.genomic.matrix.txt"/>
+<param name="clinicalFeatures" value="sample.clinical.matrix.txt"/>
+<param name="category1" value="A"/>
+<param name="category2" value="B"/>
+<output name="outFile" file="sample.stats.output.txt"/>
+</test>
+</tests>
+<help>
+This tool performs statistical tests found in the UCSC Cancer Genomics
+Browser.  The input data is a genomic matrix (containing genomic data,
+with rows representing genes or probes and columns representing
+samples or patients), a clinical matrix of two (or more) columns
+assigning categorical values to the samples, and two categorical
+values of interest.  The tool identifies the samples corresponding to
+each categorical value, then finds the columns in the genomic
+matrix corresponding to those samples, yielding two
+groups of columns.  For each row in the genomic matrix, it extracts
+the values in those two groups of columns, performs a t-test on the two
+sets of values, and returns the result for the row.  Values in
+columns that do NOT correspond to either categorical value of
+interest are ignored.
+The user runs this tool with the following steps:
+1. Specify a genomic matrix.  Rows represent genes or probes, columns
+represent samples, and the first line contains the sample
+names.
+2. Specify a clinical matrix.  Here, rows indicate samples, columns
+indicate clinical features, and the header row contains feature names.
+The first column MUST indicate the sample names, and MUST correspond
+to the column names of the genomic matrix.  The clinical feature of
+interest MUST be in the second column.  Any other columns will be
+ignored.
+3. Indicate the two clinical values that define the
+two groups.  For example, the two groups could be "Red group" and
+"Green group", 0 and 1, or any other pair of values found in the second column.
+The output indicates, for each row, the t-statistic reporting on the
+difference between the two groups of columns (as specified by the two
+clinical values), the p-value corresponding to that t-statistic, the
+median value for each group, and the difference between the medians.  If it
+cannot calculate these values, it returns a vector of NAs.
+For example, given the following genomic matrix for (1)::
+Gene  1    2    3    4    5    6    7    8    9    10
+G1    2.0  2.2  3.2  1.1  5.1  8.1  3.2  1.1  8.1  0.2
+G2    0.1  8.2  9.1  4.2  6.1  4.9  3.9  2.3  1.1  0.2
+and given the following clinical matrix for (2)::
+sample_id Value
+1         A
+2         A
+3         B
+4         C
+5         B
+6         B
+7         A
+8         A
+9         B
+10        A
+and given A for Category 1 and B for Category 2,
+the tool will assemble the following two groups of values::
+G1 A:(2.0, 2.2, 3.2, 1.1, 0.2) B:(3.2, 5.1, 8.1, 8.1)
+G2 A:(0.1, 8.2, 3.9, 2.3, 0.2) B:(9.1, 6.1, 4.9, 1.1)
+Note that the values for sample_id 4 do not appear, because it has a Value
+of C in the second column, which is neither A nor B.
+The tool will then return the output::
+Gene Statistic  pValue    Median1   Median2   Delta
+G1   -4.168999  0.004194  2.000000  6.600000  -4.600000
+G2   -1.198486  0.269724  2.300000  5.500000  -3.200000
+</help>
+</tool>
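
The grouping step described in the help text can be sketched in Python as follows. This is a minimal illustration and not the code in stats.py, which is not shown in this changeset. The names `split_groups` and `median_summary` and the in-memory data structures are hypothetical. The sketch reproduces only the group assembly and the Median1, Median2 and Delta columns of the worked example; it does not compute the t-statistic or p-value, because those depend on the t-test variant that stats.py uses.

```python
from statistics import median

# Example data copied from the help text (hypothetical in-memory form).
sample_ids = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]
genomic = {
    "G1": [2.0, 2.2, 3.2, 1.1, 5.1, 8.1, 3.2, 1.1, 8.1, 0.2],
    "G2": [0.1, 8.2, 9.1, 4.2, 6.1, 4.9, 3.9, 2.3, 1.1, 0.2],
}
# Sample name -> value in the second column of the clinical matrix.
clinical = {
    "1": "A", "2": "A", "3": "B", "4": "C", "5": "B",
    "6": "B", "7": "A", "8": "A", "9": "B", "10": "A",
}


def split_groups(row, sample_ids, clinical, category1, category2):
    """Split one genomic row into the values for each category.

    Samples whose clinical value matches neither category are ignored.
    """
    group1 = [v for s, v in zip(sample_ids, row) if clinical.get(s) == category1]
    group2 = [v for s, v in zip(sample_ids, row) if clinical.get(s) == category2]
    return group1, group2


def median_summary(row, sample_ids, clinical, category1, category2):
    """Return (median of group 1, median of group 2, their difference)."""
    group1, group2 = split_groups(row, sample_ids, clinical, category1, category2)
    m1, m2 = median(group1), median(group2)
    return m1, m2, m1 - m2


for gene, row in genomic.items():
    a_values, b_values = split_groups(row, sample_ids, clinical, "A", "B")
    m1, m2, delta = median_summary(row, sample_ids, clinical, "A", "B")
    print(gene, "A:", a_values, "B:", b_values)
    print(gene, "Median1=%f Median2=%f Delta=%f" % (m1, m2, delta))
```

Sample 4 drops out of both groups because its clinical value is C, matching the note in the help text.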
