ucsc_cancer_utilities: mergeGenomicFiles.xml annotate

annotate mergeGenomicFiles.xml @ 35:8ef79bd0be9a

modify

author	jingchunzhu <jingchunzhu@gmail.com>
date	Fri, 24 Jul 2015 16:14:36 -0700
parents	3a259686f0fc
children	9806198df91f

rev	line source
17 0b0a6f326dad Cleaned up the output dataset names for Merge Genomic Datasets melissacline parents: 7 diff changeset	1 <tool id="mergeGenomicFiles" description="Merge two genomic datasets into a new dataset" name="Merge Genomic Datasets" version="0.0.1">
5 6c23a3b58eb8 Uploaded melissacline parents: diff changeset	2 <description>
7 1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	3 Given two genomic datasets, merge them to create a larger dataset with the row and column identifiers from both datasets. Output this larger dataset, along with a two-column matrix indicating the source file of each sample.
5 6c23a3b58eb8 Uploaded melissacline parents: diff changeset	4 </description>
6c23a3b58eb8 Uploaded melissacline parents: diff changeset	5 <command interpreter="python">
7 1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	6 mergeGenomicMatrixFiles.py $inputA $inputB $outputC $outputSourceMatrix
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	7 #if $labelForDatasetA
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	8 --aLabel "${labelForDatasetA}"
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	9 #end if
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	10 #if $labelForDatasetB
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	11 --bLabel "${labelForDatasetB}"
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	12 #end if
5 6c23a3b58eb8 Uploaded melissacline parents: diff changeset	13 </command>
6c23a3b58eb8 Uploaded melissacline parents: diff changeset	14 <inputs>
6c23a3b58eb8 Uploaded melissacline parents: diff changeset	15 <param name="inputA" format="tabular" type="data" label="Genomic Dataset A"/>
7 1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	16 <param type="text" name="labelForDatasetA" label="Dataset A Label (optional)" optional="true"/>
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	17 <param name="inputB" format="tabular" type="data" label="Genomic Dataset B"/>
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	18 <param type="text" name="labelForDatasetB" label="Dataset B Label (optional)" optional="true"/>
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	19 </inputs>
5 6c23a3b58eb8 Uploaded melissacline parents: diff changeset	20 <outputs>
17 0b0a6f326dad Cleaned up the output dataset names for Merge Genomic Datasets melissacline parents: 7 diff changeset	21 <data name="outputSourceMatrix" format="tabular" label="Genomic Data Sources"/>
0b0a6f326dad Cleaned up the output dataset names for Merge Genomic Datasets melissacline parents: 7 diff changeset	22 <data name="outputC" format="tabular" label="Merged Genomic Data"/>
5 6c23a3b58eb8 Uploaded melissacline parents: diff changeset	23 </outputs>
6c23a3b58eb8 Uploaded melissacline parents: diff changeset	24 <help>
6c23a3b58eb8 Uploaded melissacline parents: diff changeset	25 *Merge Genomic Datasets*
6c23a3b58eb8 Uploaded melissacline parents: diff changeset	26
7 1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	27 Given two genomic datasets, merge them to produce a third dataset that is the union of the first two. The new dataset will contain all column labels from either dataset, and all row labels from either dataset. If a row label appears in both datasets, the output dataset will contain, for that row, all values for the first set of columns, plus all values for the second set of columns. If a row label appears in the first dataset only, the output dataset will contain the values for the columns of the first dataset, and blanks (indicating missing values) for the columns of the second dataset. Likewise, if a row label appears in the second dataset only, the columns of the first dataset will be blank for that row.
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	28
1d150e860c4d Expanded the functionality of the merge genomic datasets tool, to generate an output dataset with the file (or label) indicating where each column came from melissacline parents: 5 diff changeset	29 To maintain provenance, this script also outputs a second matrix, with one row for each column in the output dataset, and two columns per row that record which input dataset that column came from. By default, the input dataset name identifies the source of each column. Optionally, the user can specify descriptive labels to be used in place of the filenames. This assumes that each column exists in only one input dataset.
5 6c23a3b58eb8 Uploaded melissacline parents: diff changeset	30 </help>
6c23a3b58eb8 Uploaded melissacline parents: diff changeset	31 </tool>
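
The union merge and the provenance matrix described in the help text can be sketched as follows. This is not the code of mergeGenomicMatrixFiles.py, which is not shown on this page; it is a minimal illustration, assuming tab-delimited input with column labels in the first line and row labels in the first field. The names read_matrix and merge_matrices are hypothetical, and the layout of the source matrix (column label, then dataset label) is an assumption.

```python
import csv
import io


def read_matrix(text):
    """Parse tab-delimited text into (corner_cell, column_labels, rows).

    rows maps each row label to its list of values. (Hypothetical helper.)
    """
    lines = [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]
    header, body = lines[0], lines[1:]
    return header[0], header[1:], {row[0]: row[1:] for row in body}


def merge_matrices(text_a, text_b, label_a="A", label_b="B"):
    """Return (merged_rows, source_rows) for two matrices.

    Assumes each column label occurs in only one input, as the help
    text states. Missing values are written as empty strings (blanks).
    """
    corner, cols_a, rows_a = read_matrix(text_a)
    _, cols_b, rows_b = read_matrix(text_b)

    # Union of row labels: rows of A first, then rows found only in B.
    row_labels = list(rows_a) + [r for r in rows_b if r not in rows_a]

    merged = [[corner] + cols_a + cols_b]
    for label in row_labels:
        values_a = rows_a.get(label, [""] * len(cols_a))
        values_b = rows_b.get(label, [""] * len(cols_b))
        merged.append([label] + values_a + values_b)

    # One provenance row per output column: (column label, source label).
    sources = [[c, label_a] for c in cols_a] + [[c, label_b] for c in cols_b]
    return merged, sources


if __name__ == "__main__":
    a = "gene\ts1\ts2\nTP53\t1\t2\nEGFR\t3\t4\n"
    b = "gene\ts3\nTP53\t5\nKRAS\t6\n"
    merged, sources = merge_matrices(a, b, "Dataset A", "Dataset B")
    for row in merged + sources:
        print("\t".join(row))
```

In this sketch, a row such as KRAS that appears only in the second input gets blanks under s1 and s2, and the source matrix lists s1 and s2 against the first label and s3 against the second.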
