view markersSelection.xml @ 6:7dc6ce39fb89 default tip
add selection tool
author | blanck |
---|---|
date | Wed, 29 Apr 2015 10:08:52 +0200 |
parents | b7f3854e08f8 |
children |
<tool id="markersSelection" name="Markers selection" force_history_refresh="True" version="0.1.0">
  <requirements>
    <requirement type="package" version="1.1.2">mpagenomics</requirement>
  </requirements>
  <command interpreter="python">
    markersSelection.py '$input' '$response' '$__new_file_path__' '$folds' '$loss' '$outputlog' '$output' '$log'
  </command>
  <inputs>
    <param name="input" type="data" format="sef" label="Input Signal" help="See below for more information on file format"/>
    <param name="response" type="data" format="csv" label="Data response" help="Data response csv file. See below for more information on file format"/>
    <param name="folds" type="integer" min="1" value="10" label="Number of folds for cross validation" help="Integer between 1 and the number of files in the .cel dataset"/>
    <param name="loss" type="select" multiple="false" label="Response type">
      <option value="linear">Linear</option>
      <option value="logistic">Logistic</option>
    </param>
    <param name="outputgraph" type="select" label="Output figures">
      <option value="TRUE">Yes</option>
      <option value="FALSE">No</option>
    </param>
    <param name="outputlog" type="select" label="Output log">
      <option value="TRUE">Yes</option>
      <option value="FALSE">No</option>
    </param>
  </inputs>
  <outputs>
    <data format="txt" name="output" label="selection of ${input.name}"/>
    <data format="log" name="log" label="log of selection of ${input.name}">
      <filter>outputlog == "TRUE"</filter>
    </data>
  </outputs>
  <stdio>
    <exit_code range="1:" level="fatal" description="See logs for more details"/>
  </stdio>
  <help>
**What it does**

This tool selects markers relevant to a response variable using penalized regression.

Input:

*A tabular text file containing 3 fixed columns and 1 column per sample:*

- chr: Chromosome.
- position: Genomic position (in bp).
- probeNames: Names of the probes.
- One column per sample, containing the copy number signal for that sample.

Output:

*A tabular text file containing 5 columns that describe all selected SNPs (1 line per SNP):*

- chr: Chromosome containing the selected SNP.
- position: Position of the selected SNP.
- index: Index of the selected SNP.
- names: Name of the selected SNP.
- coefficient: Regression coefficient of the selected SNP.

-----

**Data Response csv file**

Data response csv file format:

- The first column contains the names of the files in the dataset.
- The second column contains the response associated with each file.
- The column names of these two columns are, respectively, files and response.
- Columns are separated by a comma.
- *File extensions (.CEL for example) should be removed.*

**Example**

Suppose the studied dataset contains 3 .cel files::

    patient1.cel
    patient2.cel
    patient3.cel

The csv file should look like this::

    files,response
    patient1,1.92145
    patient2,2.12481
    patient3,1.23545

-----

**Citation**

If you use this tool, please cite:

`Q. Grimonprez, A. Celisse, M. Cheok, M. Figeac, and G. Marot. MPAgenomics : An R package for multi-patients analysis of genomic markers, 2014. Preprint &lt;http://fr.arxiv.org/abs/1401.5035&gt;`_
  </help>
</tool>
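<!--
The data response csv format described in the help text can be produced with
a few lines of Python. The block below is only a minimal sketch: the function
name build_response_csv and its two arguments are hypothetical and belong
neither to MPAgenomics nor to this tool. It assumes one numeric response per
file, given in the same order as the file names.

```python
import csv
import io
import os


def build_response_csv(cel_files, responses):
    """Return the text of a response csv for the given .cel files.

    Hypothetical helper: cel_files is a list of file names and responses
    is a list of numeric values in the same order.
    """
    if len(cel_files) != len(responses):
        raise ValueError("need exactly one response per file")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Column names stated in the help text: files and response.
    writer.writerow(["files", "response"])
    for name, value in zip(cel_files, responses):
        # Drop any directory part and the extension (.cel, .CEL, ...).
        stem, _ext = os.path.splitext(os.path.basename(name))
        writer.writerow([stem, value])
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_response_csv(
        ["patient1.cel", "patient2.cel", "patient3.cel"],
        [1.92145, 2.12481, 1.23545],
    ), end="")
```

Run as a script, this prints the same four lines as the example in the help
text. Writing the returned text to a file gives a candidate "Data response"
input, provided the names match the samples of the input signal.
-->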