
<tool id="markersSelection" name="Markers selection" force_history_refresh="True" version="0.1.0">
  <requirement type="package" version="1.1.2">mpagenomics</requirement>
  <command interpreter="python">
    markersSelection.py '$input' '$response' '$__new_file_path__' '$folds' '$loss' '$outputlog' '$output' '$log'
  </command>
  <inputs>
    <param name="input" type="data" format="sef" label="Input Signal" help="See below for more information on the file format."/>
    <param name="response" type="data" format="csv" label="Data response" help="Data response csv file. See below for more information on the file format."/>
    <param name="folds" type="integer" min="1" value="10" label="Number of folds for cross validation" help="Integer between 1 and the number of files in the .cel file dataset."/>
    <param name="loss" type="select" multiple="false" label="Response type">
      	<option value="linear">Linear</option>
    	<option value="logistic">Logistic</option>
    </param>

	<param name="outputgraph" type="select" label="Output figures">
        <option value="TRUE">Yes</option>
        <option value="FALSE">No</option>
     </param>
    <param name="outputlog" type="select" label="Output log">
        <option value="TRUE">Yes</option>
        <option value="FALSE">No</option>
    </param>
  </inputs>
  <outputs>
    <data format="txt" name="output" label="selection of ${input.name}" />
    <data format="log" name="log" label="log of selection of ${input.name}" >
    	<filter>outputlog == "TRUE"</filter>
    </data>
  </outputs>
  <stdio>
    <exit_code range="1:" level="fatal" description="See logs for more details" />
  </stdio>
  <help>
  **What it does**

This tool selects the markers that are relevant to a response, using penalized regression (linear or logistic, depending on the chosen response type).

Input:

*A tabular text file containing 3 fixed columns and 1 column per sample:*

	- chr: Chromosome.
	- position: Genomic position (in bp).
  	- probeNames: Names of the probes.
	- One column per sample, containing the copy number signal for that sample.
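
As a rough illustration of the layout described above, the following Python sketch checks the header of an input signal file. It assumes the file is tab-separated and that the three fixed columns come first, in the order listed; the function name ``check_signal_header`` and the example values are hypothetical and not part of this tool.

```python
import csv
import io

FIXED_COLUMNS = ["chr", "position", "probeNames"]  # fixed columns listed above


def check_signal_header(handle):
    """Return the sample column names, or raise ValueError on an unexpected layout.

    Assumption: tab-separated file whose first three columns are the fixed ones.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None:
        raise ValueError("empty file")
    if header[:3] != FIXED_COLUMNS:
        raise ValueError("expected first columns: " + ", ".join(FIXED_COLUMNS))
    samples = header[3:]
    if not samples:
        raise ValueError("no sample columns found")
    return samples


# Usage with an in-memory example (made-up values)
example = io.StringIO(
    "chr\tposition\tprobeNames\tpatient1\tpatient2\n"
    "1\t12345\tSNP_A\t2.1\t1.9\n"
)
print(check_signal_header(example))
```

The number of sample columns returned here is also the upper bound for the number of cross-validation folds.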

Output:

*A tabular text file containing 5 columns which describe all the selected SNPs (1 line per SNP):*

	- chr: Chromosome containing the selected SNP.
  	- position: Position of the selected SNP.
	- index: Index of the selected SNP.
	- names: Name of the selected SNP.
	- coefficient: Regression coefficient of the selected SNP.
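
A minimal sketch of how the output file might be post-processed, for example to rank the selected SNPs by the magnitude of their regression coefficient. It assumes the file is tab-separated and starts with a header line holding the five column names listed above; the function name ``top_markers`` and the example rows are hypothetical.

```python
import csv
import io


def top_markers(handle, n=10):
    """Return the n selected SNPs with the largest absolute coefficient.

    Assumption: tab-separated file with a header line naming the five columns.
    """
    rows = list(csv.DictReader(handle, delimiter="\t"))
    rows.sort(key=lambda row: abs(float(row["coefficient"])), reverse=True)
    return rows[:n]


# Usage with an in-memory example (made-up values)
example = io.StringIO(
    "chr\tposition\tindex\tnames\tcoefficient\n"
    "1\t1000\t12\tSNP_A\t0.05\n"
    "2\t2000\t48\tSNP_B\t-0.40\n"
)
for row in top_markers(example, n=1):
    print(row["names"], row["coefficient"])
```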

-----

**Data Response csv file**

Data response csv file format:

	- The first column contains the names of the different files of the dataset.

	- The second column is the response associated with each file.

	- The two columns must be named ``files`` and ``response``, respectively.

	- Columns are separated by a comma.

	- *File extensions (.CEL, for example) should be removed.*


**Example**

Suppose the studied dataset contains 3 .cel files ::

     	patient1.cel
     	patient2.cel
     	patient3.cel

The csv file should look like this ::

     	files,response
     	patient1,1.92145
     	patient2,2.12481
     	patient3,1.23545
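
One possible way to produce such a file is sketched below in Python. It strips the extension from each file name and writes the ``files,response`` header followed by one line per file. The function name ``write_response_csv`` is hypothetical, and the response values are simply those of the example above.

```python
import csv
import io
import os


def write_response_csv(cel_files, responses, handle):
    """Write a files,response csv with the file extensions removed."""
    if len(cel_files) != len(responses):
        raise ValueError("one response per file is required")
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["files", "response"])
    for path, value in zip(cel_files, responses):
        # Keep only the base name without its extension (e.g. ".cel")
        name = os.path.splitext(os.path.basename(path))[0]
        writer.writerow([name, value])


# Usage: write to an in-memory buffer; a real file opened with newline="" works the same way
out = io.StringIO()
write_response_csv(
    ["patient1.cel", "patient2.cel", "patient3.cel"],
    [1.92145, 2.12481, 1.23545],
    out,
)
print(out.getvalue())
```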


-----

**Citation**

If you use this tool, please cite:

`Q. Grimonprez, A. Celisse, M. Cheok, M. Figeac, and G. Marot. MPAgenomics : An R package for multi-patients analysis of genomic markers, 2014. Preprint &lt;http://fr.arxiv.org/abs/1401.5035&gt;`_

 </help>
</tool>
author	blanck
date	Wed, 29 Apr 2015 10:08:52 +0200