annotate linear_regression.xml @ 0:ffcdde989859 draft

Uploaded
author iuc
date Tue, 29 Jul 2014 06:30:45 -0400
parents
children 2e7bc1bb2dbe
<tool id="LinearRegression1" name="Perform Linear Regression" version="1.1.0">
    <description> </description>
    <expand macro="requirements" />
    <macros>
        <import>statistic_tools_macros.xml</import>
    </macros>
    <command interpreter="python">
        linear_regression.py
            $input1
            $response_col
            $predictor_cols
            $out_file1
            $out_file2
            1>/dev/null
    </command>
    <inputs>
        <param format="tabular" name="input1" type="data" label="Select data" help="Dataset missing? See TIP below."/>
        <param name="response_col" label="Response column (Y)" type="data_column" data_ref="input1" numerical="True"/>
        <param name="predictor_cols" label="Predictor columns (X)" type="data_column" data_ref="input1" numerical="True" multiple="true">
            <validator type="no_options" message="Please select at least one column."/>
        </param>
    </inputs>
    <outputs>
        <data format="input" name="out_file1" metadata_source="input1" />
        <data format="pdf" name="out_file2" />
    </outputs>
    <tests>
        <test>
            <param name="input1" value="regr_inp.tabular"/>
            <param name="response_col" value="3"/>
            <param name="predictor_cols" value="1,2"/>
            <output name="out_file1" file="regr_out.tabular"/>
            <output name="out_file2" file="regr_out.pdf"/>
        </test>
    </tests>
    <help>

.. class:: infomark

**TIP:** If your data is not TAB delimited, use *Edit Datasets-&gt;Convert characters*

-----

.. class:: infomark

**What it does**

This tool uses the 'lm' function from the R statistical package to perform linear regression on the input data. It outputs two files: one containing the summary statistics of the regression, and the other containing diagnostic plots for checking whether the model assumptions are satisfied.

*R Development Core Team (2009). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.*

-----

.. class:: warningmark

**Note**

- This tool currently treats all predictor and response variables as continuous numeric variables. Running the tool on categorical variables may produce incorrect results.

- Rows containing non-numeric (or missing) data in any of the chosen columns are excluded from the analysis.

- The summary statistics in the output are described below:

  - sigma: the square root of the estimated variance of the random error (the standard error of the residuals)
  - R-squared: the fraction of variance explained by the model
  - Adjusted R-squared: the R-squared statistic above, adjusted to penalize for the number of predictors (p)
  - p-value: the p-value of the t-test of the null hypothesis that the corresponding slope is equal to zero, against the two-sided alternative

    </help>
</tool>
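The Note in the help text describes two behaviours that are easier to see in code: rows with non-numeric values are left out, and the summary reports sigma, R-squared and adjusted R-squared. The sketch below illustrates both for a single predictor in plain Python. It is not the contents of `linear_regression.py`, which is not shown here and which, according to the help text, delegates the fit to R's `lm`. The function names `numeric_rows` and `simple_regression_summary` are hypothetical, and the p-value is omitted because it needs the t distribution's CDF, which the standard library does not provide.

```python
import math


def numeric_rows(lines, response_col, predictor_col):
    """Return (x, y) pairs from tab-delimited lines.

    Column numbers are 1-based. Rows whose chosen columns are missing,
    non-numeric, or not finite are skipped, mirroring the Note in the help.
    """
    pairs = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        try:
            x = float(fields[predictor_col - 1])
            y = float(fields[response_col - 1])
        except (IndexError, ValueError):
            continue  # missing column or non-numeric text
        if not (math.isfinite(x) and math.isfinite(y)):
            continue  # "nan" and "inf" parse as floats but are unusable
        pairs.append((x, y))
    return pairs


def simple_regression_summary(pairs):
    """Least-squares fit of y = intercept + slope * x with summary statistics."""
    n = len(pairs)
    p = 1  # number of predictors in this single-predictor sketch
    df = n - p - 1  # residual degrees of freedom
    if df < 1:
        raise ValueError("need at least three usable rows")

    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0:
        raise ValueError("predictor column is constant")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in pairs)
    tss = sum((y - mean_y) ** 2 for _, y in pairs)
    if tss == 0:
        raise ValueError("response column is constant")

    sigma = math.sqrt(rss / df)  # standard error of the residuals
    r_squared = 1.0 - rss / tss  # fraction of variance explained
    adj_r_squared = 1.0 - (1.0 - r_squared) * (n - 1) / df
    return {
        "n": n,
        "slope": slope,
        "intercept": intercept,
        "sigma": sigma,
        "r_squared": r_squared,
        "adj_r_squared": adj_r_squared,
    }


if __name__ == "__main__":
    # Made-up rows for illustration; the "NA" row and the short row are skipped.
    demo = ["1\t2.1", "2\t3.9", "NA\t5.0", "3\t6.2", "4", "4\t7.8", "5\t10.1"]
    summary = simple_regression_summary(numeric_rows(demo, 2, 1))
    for name, value in summary.items():
        print(name, value)
```

With several predictor columns the same quantities come from solving the normal equations, and the adjusted R-squared formula uses the actual number of predictors for `p`.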