<tool id="mergeMutationDatasets" description="Merge two Xena positional mutation datasets into a new dataset" name="Merge Xena Mutation by Position Data" version="0.0.1">
  <command interpreter="python">
      mergeXenaMutation.py $outputC $outputSourceMatrix $errorLog  $inputA $inputB
      #if $labelForDatasetA
          --aLabel "${labelForDatasetA}"
      #else
          --aLabel "${inputA.name}"
      #end if
      #if $labelForDatasetB
          --bLabel "${labelForDatasetB}"
      #else
          --bLabel "${inputB.name}"
      #end if
  </command>
  <inputs>
    <param name="inputA" format="tabular" type="data" label="Xena Mutation by Position Dataset A"/>
    <param type="text" name="labelForDatasetA" label="Dataset A Label (e.g. LGG)" value="A"/>
    <param name="inputB" format="tabular" type="data" label="Xena Mutation by Position Dataset B"/>
    <param type="text" name="labelForDatasetB" label="Dataset B Label (e.g. GBM)" value="B"/>
  </inputs>
  <outputs>
    <data name="errorLog" format="data" label="Execution Log" hidden="True" />
    <data name="outputSourceMatrix" format="tabular" label="Data Source ${labelForDatasetA}+${labelForDatasetB}"/>
    <data name="outputC" format="tabular" label="Mutation by Position ${labelForDatasetA}+${labelForDatasetB}"/>
  </outputs>
  <help>

**Merge Xena Positional Mutation Datasets**

1. Input: Xena positional mutation data file, tab-delimited

   =======    =====  ======= ===== ========= ====== ========
   sample     chr    start   end   reference alt    anything
   =======    =====  ======= ===== ========= ====== ========
   sample1    chr1   1       1     A         T      0.2
   sample1    chr1   10      10    T         A      0.1
   sample2    chr1   20      20    G         GG     0.0
   sample2    chr1   20      21    GT        G
   ...        ...    ...     ...   ...       ...    ...
   =======    =====  ======= ===== ========= ====== ========


2.  Output file 1: the merged dataset. The tool merges the two input mutation datasets into a third dataset that is their union, so it contains every mutation found in either input.

3.  Output file 2: the data source matrix. To maintain provenance, the tool also writes a second file with one row for each sample ID that appears in the merged dataset, and two columns per row indicating which input dataset(s) contained mutation data for that sample. Users can supply descriptive labels to identify each data source.
  </help>
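  <!--
  Illustrative sketch only; it is not taken from mergeXenaMutation.py, whose
  source is not shown here. It restates the two outputs described in the help
  text in Python. The helper names (read_rows, merge_mutations, source_matrix),
  the "sourceA"/"sourceB" column headers, the choice to keep every row without
  de-duplication, and the use of the dataset label as the cell value are all
  assumptions made for illustration.

```python
import csv
import io


def read_rows(text):
    """Parse tab-delimited text into (header, data_rows)."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    return rows[0], rows[1:]


def merge_mutations(text_a, text_b):
    """Union of both inputs: header of A, then every row of A and of B.

    Assumption: duplicate rows are kept as they are.
    """
    header, rows_a = read_rows(text_a)
    _, rows_b = read_rows(text_b)
    return [header] + rows_a + rows_b


def source_matrix(text_a, text_b, label_a="A", label_b="B"):
    """One row per sample ID, two columns naming the inputs that held it."""
    samples_a = {r[0] for r in read_rows(text_a)[1]}
    samples_b = {r[0] for r in read_rows(text_b)[1]}
    # Hypothetical header names; the real file layout may differ.
    matrix = [["sample", "sourceA", "sourceB"]]
    for sample in sorted(samples_a | samples_b):
        matrix.append([
            sample,
            label_a if sample in samples_a else "",
            label_b if sample in samples_b else "",
        ])
    return matrix


# Made-up example data in the input layout described in the help text.
DATASET_A = (
    "sample\tchr\tstart\tend\treference\talt\n"
    "sample1\tchr1\t1\t1\tA\tT\n"
    "sample2\tchr1\t20\t20\tG\tGG\n"
)
DATASET_B = (
    "sample\tchr\tstart\tend\treference\talt\n"
    "sample2\tchr1\t20\t21\tGT\tG\n"
    "sample3\tchr1\t30\t30\tC\tA\n"
)

merged = merge_mutations(DATASET_A, DATASET_B)
matrix = source_matrix(DATASET_A, DATASET_B, "LGG", "GBM")
```

  With this made-up data, sample2 appears in both inputs, so its row in the
  source matrix carries both labels, while sample1 and sample3 carry one each.
  -->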
</tool>
author	jingchunzhu@gmail.com
date	Tue, 27 Oct 2015 16:07:09 -0700
parents	2a240b005731