annotate best_regression_subsets.xml @ 2:b82e4cb8764d default tip
Corrected version string.
| author | devteam <devteam@galaxyproject.org> |
|---|---|
| date | Thu, 10 Apr 2014 13:47:13 -0400 |
| parents | 4b84b5118705 |
| children | |

| rev | line source |
|---|---|
| 2 | 1 <tool id="BestSubsetsRegression1" name="Perform Best-subsets Regression" version="1.0.0"> |
| 0 | 2 <description> </description> |
| 1 | 3 <requirements> |
| | 4 <requirement type="package" version="1.7.1">numpy</requirement> |
| | 5 <requirement type="package" version="1.0.3">rpy</requirement> |
| | 6 </requirements> |
| 0 | 7 <command interpreter="python"> |
| | 8 best_regression_subsets.py |
| | 9 $input1 |
| | 10 $response_col |
| | 11 $predictor_cols |
| | 12 $out_file1 |
| | 13 $out_file2 |
| | 14 1>/dev/null |
| | 15 2>/dev/null |
| | 16 </command> |
| | 17 <inputs> |
| | 18 <param format="tabular" name="input1" type="data" label="Select data" help="Dataset missing? See TIP below."/> |
| | 19 <param name="response_col" label="Response column (Y)" type="data_column" data_ref="input1" /> |
| | 20 <param name="predictor_cols" label="Predictor columns (X)" type="data_column" data_ref="input1" multiple="true" > |
| | 21 <validator type="no_options" message="Please select at least one column."/> |
| | 22 </param> |
| | 23 </inputs> |
| | 24 <outputs> |
| | 25 <data format="input" name="out_file1" metadata_source="input1" /> |
| | 26 <data format="pdf" name="out_file2" /> |
| | 27 </outputs> |
| | 28 <tests> |
| | 29 <!-- Testing this tool will not be possible because this tool produces a pdf output file. |
| | 30 --> |
| | 31 </tests> |
| | 32 <help> |
| | 33 |
| | 34 .. class:: infomark |
| | 35 |
| | 36 **TIP:** If your data is not TAB delimited, use *Edit Datasets->Convert characters* |
| | 37 |
| | 38 ----- |
| | 39 |
| | 40 .. class:: infomark |
| | 41 |
| | 42 **What it does** |
| | 43 |
| | 44 This tool uses the 'regsubsets' function from the R statistical package for regression subset selection. It outputs two files: one contains a table of the best subsets with their summary statistics, and the other contains a graphical representation of the results. |
| | 45 |
| | 46 ----- |
| | 47 |
| | 48 .. class:: warningmark |
| | 49 |
| | 50 **Note** |
| | 51 |
| | 52 - This tool currently treats all predictor and response variables as continuous variables. |
| | 53 |
| | 54 - Rows containing non-numeric (or missing) data in any of the chosen columns will be excluded from the analysis. |
| | 55 |
| | 56 - The 6 columns in the output are described below: |
| | 57 |
| | 58 - Column 1 (Vars): denotes the number of variables in the model |
| | 59 - Column 2 ([c2 c3 c4...]): represents a list of the user-selected predictor variables (full model). An asterisk denotes the presence of the corresponding predictor variable in the selected model. |
| | 60 - Column 3 (R-sq): the fraction of variance explained by the model |
| | 61 - Column 4 (Adj. R-sq): the R-squared statistic above, adjusted to penalize a larger number of predictors (p) |
| | 62 - Column 5 (Cp): Mallows' Cp statistic |
| | 63 - Column 6 (bic): Bayesian Information Criterion. |
| | 64 |
| | 65 |
| | 66 </help> |
| | 67 </tool> |
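
The help text above describes the row filtering and the statistics reported for each subset size. The following is a minimal pure-Python sketch of that idea, not the code in best_regression_subsets.py (which relies on rpy and R's 'regsubsets'). The function names (`is_number`, `solve`, `rss`, `best_subsets`) are hypothetical, column indices are assumed to be 0-based, the search is a plain exhaustive one, and the BIC is written in one common textbook form, n·ln(RSS/n) + (p+1)·ln(n), which may differ from the value 'regsubsets' reports by a constant.

```python
import itertools
import math


def is_number(text):
    """True if text parses as a finite float ('NA', '' and 'nan' all fail)."""
    try:
        return math.isfinite(float(text))
    except ValueError:
        return False


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is square and non-singular (collinear predictors will fail).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rss(rows, y, cols):
    """Residual sum of squares of a least-squares fit with intercept."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, yi in zip(design, y))


def best_subsets(lines, response_col, predictor_cols):
    """Return (rows_used, table) with one best model per subset size."""
    chosen = [response_col] + list(predictor_cols)
    data = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip rows with non-numeric or missing values in any chosen column.
        if all(c < len(fields) and is_number(fields[c]) for c in chosen):
            data.append([float(fields[c]) for c in chosen])
    n, k = len(data), len(predictor_cols)
    y = [row[0] for row in data]
    x = [row[1:] for row in data]
    mean_y = sum(y) / n
    tss = sum((v - mean_y) ** 2 for v in y)
    # Error variance estimated from the full model, used by Mallows' Cp.
    sigma2 = rss(x, y, range(k)) / (n - k - 1)
    table = []
    for p in range(1, k + 1):
        best = min(itertools.combinations(range(k), p),
                   key=lambda subset: rss(x, y, subset))
        r = rss(x, y, best)
        r_sq = 1 - r / tss
        table.append({
            "vars": p,
            "predictors": tuple(predictor_cols[i] for i in best),
            "r_sq": r_sq,
            "adj_r_sq": 1 - (1 - r_sq) * (n - 1) / (n - p - 1),
            "cp": r / sigma2 - n + 2 * (p + 1),
            "bic": n * math.log(r / n) + (p + 1) * math.log(n),
        })
    return n, table
```

Calling `best_subsets(open(path), 0, [1, 2])` on a tab-delimited file would return the number of rows that survived filtering and one dictionary per subset size. The exhaustive search is only practical for a handful of predictors; 'regsubsets' uses a more efficient branch-and-bound search.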
