annotate gsummary.xml @ 1:2e7bc1bb2dbe draft default tip

Uploaded
author iuc
date Fri, 09 Jan 2015 12:56:07 -0500
parents ffcdde989859
children
rev   line source
0
ffcdde989859 Uploaded
iuc
parents:
diff changeset
1 <tool id="Summary_Statistics1" name="Summary Statistics" version="1.3.0">
2 <description>for any numerical column</description>
3 <expand macro="requirements" />
4 <macros>
5 <import>statistic_tools_macros.xml</import>
6 </macros>
1
2e7bc1bb2dbe Uploaded
iuc
parents: 0
diff changeset
7 <command interpreter="python">
8 <![CDATA[
9 gsummary.py
10 $input
11 $out_file1
12 "$cond"
13 ]]>
14 </command>
0
15 <inputs>
16 <param format="tabular" name="input" type="data" label="Summary statistics on" help="Dataset missing? See TIP below"/>
17 <param name="cond" size="30" type="text" value="c5" label="Column or expression" help="See syntax below">
18 <validator type="empty_field" message="Enter a valid column or expression, see syntax below for examples"/>
19 </param>
20 </inputs>
21 <outputs>
22 <data format="tabular" name="out_file1" />
23 </outputs>
24 <tests>
25 <test>
26 <param name="input" value="1.bed"/>
27 <output name="out_file1" file="gsummary_out1.tabular"/>
28 <param name="cond" value="c2"/>
29 </test>
30 </tests>
31 <help>
1
32 <![CDATA[
0
33
34 .. class:: warningmark
35
36 This tool expects input datasets consisting of tab-delimited columns (blank or comment lines beginning with a # character are automatically skipped).
37
38 .. class:: infomark
39
1
40 **TIP:** If your data is not TAB delimited, use *Text Manipulation->Convert delimiters to TAB*
0
41
42 .. class:: infomark
43
44 **TIP:** Computing summary statistics may throw exceptions if any line contains a non-numerical value in the columns being summarized. A line that is missing a value or contains a non-numerical value in those columns is skipped, and its value is excluded from the statistical computation. The number of skipped lines is reported in the resulting history item.
45
46 .. class:: infomark
47
48 **USING R FUNCTIONS:** Most functions (like *abs*) take only a single expression. *log* can take one or two parameters, like *log(expression,base)*
49
50 Currently, these R functions are supported: *abs, sign, sqrt, floor, ceiling, trunc, round, signif, exp, log, cos, sin, tan, acos, asin, atan, cosh, sinh, tanh, acosh, asinh, atanh, lgamma, gamma, gammaCody, digamma, trigamma, cumsum, cumprod, cummax, cummin*
51
52 -----
53
54 **Syntax**
55
56 This tool computes basic summary statistics on a given column, or on a valid expression containing one or more columns.
57
58 - Columns are referenced with **c** and a **number**. For example, **c1** refers to the first column of a tab-delimited file.
59
60 - For example:
61
62   - **log(c5)** calculates the summary statistics for the natural log of column 5
63   - **(c5 + c6 + c7) / 3** calculates the summary statistics for the mean of columns 5 through 7
64   - **log(c5,10)** calculates the summary statistics for the base 10 log of column 5
65   - **sqrt(c5+c9)** calculates the summary statistics for the square root of the sum of columns 5 and 9
66
67 -----
68
69 **Examples**
70
71 - Input Dataset::
72
73     c1     c2      c3         c4         c5            c6
74     586    chrX    161416     170887     41108_at      16990
75     73     chrX    505078     532318     35073_at      1700
76     595    chrX    1361578    1388460    33665_s_at    1960
77     74     chrX    1420620    1461919    1185_at       8600
78
79 - Summary Statistics on column c6 of the above input dataset::
80
81     #sum         mean        stdev       0%          25%         50%         75%          100%
82     29250.000    7312.500    7198.636    1700.000    1895.000    5280.000    10697.500    16990.000
83
1
84 ]]>
0
85 </help>
86 </tool>
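
The skipping rule and the Examples table in the help text can be checked with a short Python sketch. It is not the tool's gsummary.py, whose source is not shown here, and it handles only a plain column reference, not R expressions. The function name summarize, the sample list SAMPLE_LINES, and the extra row with a non-numeric value are assumptions made for illustration. Quartiles use statistics.quantiles with method="inclusive", chosen only because it reproduces the quartiles shown in the Examples table.

```python
import statistics

# Hypothetical sample input: the four rows from the Examples table, plus a
# comment line, a blank line, and one made-up row whose sixth column is
# not numeric.
SAMPLE_LINES = [
    "#c1\tc2\tc3\tc4\tc5\tc6",
    "586\tchrX\t161416\t170887\t41108_at\t16990",
    "73\tchrX\t505078\t532318\t35073_at\t1700",
    "",
    "595\tchrX\t1361578\t1388460\t33665_s_at\t1960",
    "74\tchrX\t1420620\t1461919\t1185_at\t8600",
    "75\tchrX\t1500000\t1500100\tmade_up_at\tNA",
]


def summarize(lines, column):
    """Summarize one 1-based column of tab-delimited lines.

    Returns (stats, skipped), where skipped counts the lines whose value
    in the column is missing or not numeric.
    """
    values = []
    skipped = 0
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip() or line.startswith("#"):
            continue  # blank and comment lines are ignored, not counted
        fields = line.split("\t")
        try:
            values.append(float(fields[column - 1]))
        except (IndexError, ValueError):
            skipped += 1  # missing or non-numeric value: skip the line
    if len(values) < 2:
        raise ValueError("need at least two numeric values")
    q25, q50, q75 = statistics.quantiles(values, n=4, method="inclusive")
    stats = {
        "sum": sum(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "0%": min(values),
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "100%": max(values),
    }
    return stats, skipped


stats, skipped = summarize(SAMPLE_LINES, column=6)
print("#" + "\t".join(stats))
print("\t".join(f"{value:.3f}" for value in stats.values()))
print(f"skipped lines: {skipped}")
```

For column 6 of the sample rows, the printed values match the Examples table (29250.000, 7312.500, 7198.636, 1700.000, 1895.000, 5280.000, 10697.500, 16990.000), and one line is reported as skipped.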