comparison linear_regression.xml @ 80:c4a3a8999945 draft

Uploaded
author bernhardlutz
date Mon, 20 Jan 2014 14:39:43 -0500
79:dc82017052ac 80:c4a3a8999945
<tool id="LinearRegression1" name="Perform Linear Regression" version="1.1.0">
    <description> </description>
    <expand macro="requirements" />
    <macros>
        <import>statistic_tools_macros.xml</import>
    </macros>
    <command interpreter="python">
        linear_regression.py
            $input1
            $response_col
            $predictor_cols
            $out_file1
            $out_file2
            1>/dev/null
    </command>
    <inputs>
        <param format="tabular" name="input1" type="data" label="Select data" help="Dataset missing? See TIP below."/>
        <param name="response_col" label="Response column (Y)" type="data_column" data_ref="input1" numerical="True"/>
        <param name="predictor_cols" label="Predictor columns (X)" type="data_column" data_ref="input1" numerical="True" multiple="true" >
            <validator type="no_options" message="Please select at least one column."/>
        </param>
    </inputs>
    <outputs>
        <data format="input" name="out_file1" metadata_source="input1" />
        <data format="pdf" name="out_file2" />
    </outputs>
    <tests>
        <test>
            <param name="input1" value="regr_inp.tabular"/>
            <param name="response_col" value="3"/>
            <param name="predictor_cols" value="1,2"/>
            <output name="out_file1" file="regr_out.tabular"/>
            <output name="out_file2" file="regr_out.pdf"/>
        </test>
    </tests>
    <help>

.. class:: infomark

**TIP:** If your data is not TAB delimited, use *Edit Datasets-&gt;Convert characters*

-----

.. class:: infomark

**What it does**

This tool uses the 'lm' function from the R statistical package to perform linear regression on the input data. It outputs two files: one containing the summary statistics of the regression, and the other containing diagnostic plots for checking whether the model assumptions are satisfied.

*R Development Core Team (2009). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.*

-----

.. class:: warningmark

**Note**

- This tool currently treats all predictor and response variables as continuous numeric variables. Running the tool on categorical variables may produce incorrect results.

- Rows containing non-numeric (or missing) data in any of the chosen columns will be excluded from the analysis.

- The summary statistics in the output are described below:

  - sigma: the square root of the estimated variance of the random error (the standard error of the residuals)
  - R-squared: the fraction of variance explained by the model
  - Adjusted R-squared: the R-squared statistic above, adjusted to penalize for the number of predictors (p)
  - p-value: the p-value for the t-test of the null hypothesis that the corresponding slope is equal to zero, against the two-sided alternative

    </help>
</tool>
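The summary statistics listed in the help text can be illustrated with a small, self-contained sketch. This is not the tool's implementation (the tool delegates to R's 'lm'); it is a plain-Python approximation for the single-predictor case (p = 1). The function name `simple_linear_fit` and the sample data are hypothetical. The p-value is omitted because it requires the t-distribution, which the Python standard library does not provide; only the t-statistic for the slope is computed.

```python
import math


def simple_linear_fit(x, y):
    """Least-squares fit of y = intercept + slope * x (one predictor, p = 1).

    Assumes at least three rows and a non-constant x; no input checking is done.
    """
    n = len(x)
    p = 1  # number of predictors
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)

    df = n - p - 1  # residual degrees of freedom
    sigma = math.sqrt(rss / df)  # standard error of the residuals
    r_squared = 1 - rss / tss  # fraction of variance explained
    adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df  # penalized for p
    t_slope = slope / (sigma / math.sqrt(sxx))  # t-statistic for H0: slope = 0

    return {
        "slope": slope,
        "intercept": intercept,
        "sigma": sigma,
        "r_squared": r_squared,
        "adj_r_squared": adj_r_squared,
        "t_slope": t_slope,
    }


# Hypothetical data, used only for illustration.
fit = simple_linear_fit([1, 2, 3, 4], [1, 3, 2, 4])
for name, value in fit.items():
    print(name, round(value, 4))
```

For this sample data, R-squared is about 0.64 and the adjusted value drops to about 0.46, which shows how strongly the penalty for the number of predictors acts when there are few rows.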