<tool id="markersSelection" name="Markers selection" force_history_refresh="True" version="0.1.0">
  <requirement type="package" version="1.1.2">mpagenomics</requirement>
  <command interpreter="python">
    markersSelection.py '$input' '$response' '$__new_file_path__' '$folds' '$loss' '$outputlog' '$output' '$log'
  </command>
  <inputs>
    <param name="input" type="data" format="sef" label="Input Signal" help="See below for more information on the file format."/>
    <param name="response" type="data" format="csv" label="Data response" help="Data response CSV file. See below for more information on the file format."/>
    <param name="folds" type="integer" min="1" value="10" label="Number of folds for cross-validation" help="Integer between 1 and the number of files in the .cel file dataset."/>
    <param name="loss" type="select" multiple="false" label="Response type">
      <option value="linear">Linear</option>
      <option value="logistic">Logistic</option>
    </param>

    <param name="outputgraph" type="select" label="Output figures">
      <option value="TRUE">Yes</option>
      <option value="FALSE">No</option>
    </param>
    <param name="outputlog" type="select" label="Output log">
      <option value="TRUE">Yes</option>
      <option value="FALSE">No</option>
    </param>
  </inputs>
  <outputs>
    <data format="txt" name="output" label="selection of ${input.name}" />
    <data format="log" name="log" label="log of selection of ${input.name}">
      <filter>outputlog == "TRUE"</filter>
    </data>
  </outputs>
  <stdio>
    <exit_code range="1:" level="fatal" description="See logs for more details" />
  </stdio>
  <help>
**What it does**

This tool selects the markers most relevant to a response, using penalized regression.

Input:

*A tabular text file containing 3 fixed columns and 1 column per sample:*

- chr: Chromosome.
- position: Genomic position (in bp).
- probeNames: Names of the probes.
- One column per sample, containing the copy number signal for that sample.

Output:

*A tabular text file containing 5 columns which describe all the selected SNPs (1 line per SNP):*

- chr: Chromosome containing the selected SNP.
- position: Position of the selected SNP.
- index: Index of the selected SNP.
- names: Name of the selected SNP.
- coefficient: Regression coefficient of the selected SNP.
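
The output described above can be post-processed with a few lines of Python. The following is only a sketch under stated assumptions: it assumes the output file is tab-separated and starts with a header row naming the five columns, which this help text does not confirm. The function name ``read_selected_markers`` and the sample values are hypothetical and serve only as an illustration.

```python
import csv
import io

# Hypothetical output content; the values are made up for illustration.
# Assumption: tab-separated columns with a header row.
sample_output = (
    "chr\tposition\tindex\tnames\tcoefficient\n"
    "1\t15253\t12\tSNP_A-0001\t0.42\n"
    "3\t98211\t857\tSNP_A-0002\t-0.17\n"
)


def read_selected_markers(handle):
    """Parse the selection output into a list of dictionaries."""
    reader = csv.DictReader(handle, delimiter="\t")
    markers = []
    for row in reader:
        markers.append({
            "chr": row["chr"],
            "position": int(row["position"]),
            "index": int(row["index"]),
            "name": row["names"],
            "coefficient": float(row["coefficient"]),
        })
    return markers


markers = read_selected_markers(io.StringIO(sample_output))
# Pick the marker with the largest absolute regression coefficient.
strongest = max(markers, key=lambda m: abs(m["coefficient"]))
print(strongest["name"], strongest["coefficient"])
```

If the real file uses a different delimiter or has no header row, the ``delimiter`` argument and the column lookup would need to be adapted.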

-----

**Data Response csv file**

Data response csv file format:

- The first column contains the names of the different files of the dataset.

- The second column is the response associated with each file.

- The two columns are named files and response, respectively.

- Columns are separated by a comma.

- *File extensions (.CEL for example) should be removed.*



**Example**

Suppose the studied dataset contains 3 .cel files ::

    patient1.cel
    patient2.cel
    patient3.cel

The csv file should look like this ::

    files,response
    patient1,1.92145
    patient2,2.12481
    patient3,1.23545
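
A response file in this format can be generated with a short script. The following is a minimal sketch and not part of the tool: the function name ``build_response_csv`` is hypothetical, and it assumes that exactly one response value is available per file.

```python
import csv
import io
import os


def build_response_csv(cel_files, responses):
    """Return CSV text with a 'files,response' header and one row per file.

    File extensions (for example .cel) are stripped, as the format requires.
    """
    if len(cel_files) != len(responses):
        raise ValueError("need exactly one response per file")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["files", "response"])
    for path, value in zip(cel_files, responses):
        # Keep only the base name without its extension.
        name = os.path.splitext(os.path.basename(path))[0]
        writer.writerow([name, value])
    return buffer.getvalue()


csv_text = build_response_csv(
    ["patient1.cel", "patient2.cel", "patient3.cel"],
    [1.92145, 2.12481, 1.23545],
)
print(csv_text)
```

Writing ``csv_text`` to a file then gives a response file that matches the example.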


-----

**Citation**

If you use this tool, please cite:

`Q. Grimonprez, A. Celisse, M. Cheok, M. Figeac, and G. Marot. MPAgenomics: An R package for multi-patients analysis of genomic markers, 2014. Preprint &lt;http://fr.arxiv.org/abs/1401.5035&gt;`_

  </help>
</tool>