
<tool name="Two wiggle file correlation in union regions" id="correlation_intervals" version="0.1.0">
  <description>Calculate the correlation coefficient of two wiggle / bigwig files in the union regions from two bed files</description>
  <macros>
    <import>corr_macros.xml</import>
  </macros>
  <expand macro="requirements_union" />
  <command>
#if $wfile1.extension == "wig"
    qc_chIP_peak.py -s $step -m $method -f bed
#elif $wfile1.extension == "bigwig"
    qc_chIP_peakBW.py
#end if
  -x $wfile1 -y $wfile2 -p $bfile1 -q $bfile2 -r qc_chIP-output.txt &amp;> $log &amp;&amp;
Rscript qc_chIP-output.txt
  </command>
  <inputs>
    <!-- need pybedtools for qc_chIP_peakBW.py
    <param format="wig,bigwig" name="wfile1" type="data" label="WIGGLE / bigwig file 1"/>
    -->
    <param format="wig" name="wfile1" type="data" label="WIGGLE / bigwig file 1"/>
    <param format="bed" name="bfile1" type="data" label="BED file 1 (100,000 lines max)"/>
    <!-- need pybedtools for qc_chIP_peakBW.py
    <param format="wig,bigwig" name="wfile2" type="data" label="WIGGLE / bigwig file 2"/>
    -->
    <param format="wig" name="wfile2" type="data" label="WIGGLE / bigwig file 2"/>
    <param format="bed" name="bfile2" type="data" label="BED file 2 (100,000 lines max)"/>
    <param name="step" type="integer" label="Step" value="5" help="Step size in data points. This option is only used for wig files.">
      <validator type="in_range" max="100" min="1" message="Step is out of range; it must be between 1 and 100" />
    </param>
    <param name="method" type="select" label="Method" help="Method used to reduce each window of scores to a single value in the sampling step." >
      <option value="mean" selected="true">mean</option>
      <option value="median">median</option>
      <option value="sum">sum</option>
      <option value="sample">sample</option>
    </param>
    <!-- disabled: a second param named "method" shadows the select above
    <param name="method" type="hidden" label="method:" help="method to process the paired two sets of data in the sampling step." >
      <option value="mean">mean</option>
    </param>
    -->
  </inputs>
  <outputs>
    <data format="pdf" name="output" from_work_dir="qc_chIP-output.txt.pdf"/>
    <data format="txt" name="log" label="job log" />
    <data format="txt" name="rscript" label="job rscript" from_work_dir="qc_chIP-output.txt"/>
  </outputs>
  <expand macro="stdio"/>
  <tests>
    <test>
      <param name="wfile1" value="control.wig" />
      <param name="bfile1" value="peaks.bed" />
      <param name="wfile2" value="treatment.wig" />
      <param name="bfile2" value="peaks.bed" />
      <param name="step" value="5" />
      <output name="log">
        <assert_contents>
            <has_text_matching expression="centering pscore2" />
        </assert_contents>
      </output>
    </test>
  </tests>


  <help>
This tool calculates the correlation coefficient of two wiggle files
over the union of the regions in two BED files. The tool was written by
Tao Liu. It calls R for plotting.

.. class:: infomark

**TIP:** This can be used to evaluate the correlation between
two biological replicates.

.. class:: warningmark

**NEED IMPROVEMENT**

-----

**Parameters**

- **WIGGLE file 1 and 2** are the two wiggle files to compare.
  Both are required.
- **BED file 1 and 2** are the two BED files whose regions are used to
  extract scores from the wiggle files.
- **Step** Step size in data points. Scores are extracted from the
  wiggle files in windows of this many data points, and each window is
  reduced to a single value using the chosen **Method**. This option
  is only used for wig files.
- **Method** When scores are extracted for a step-long window, a
  method is applied to calculate one value to represent that
  window. Options are *mean* to use the average value (the default),
  *median* to use the median value, *sum* to use the sum of the values
  in the window, or *sample* to sample one point to represent the
  window.
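
As a rough illustration of how **Step** and **Method** interact, the
sketch below reduces every *step* data points to one value and then
correlates the two reduced series. It is not the code in
qc_chIP_peak.py: the function names ``summarize`` and ``pearson`` are
hypothetical, the input scores are made up, and the use of the Pearson
coefficient is an assumption made only for this example.

```python
import random
import statistics


def summarize(scores, step, method="mean", seed=0):
    """Collapse every `step` consecutive scores into one representative value."""
    rng = random.Random(seed)  # fixed seed so "sample" is reproducible here
    reducers = {
        "mean": statistics.mean,
        "median": statistics.median,
        "sum": sum,
        "sample": rng.choice,  # pick one point from the window
    }
    reduce_window = reducers[method]
    # Consecutive, non-overlapping windows; the last one may be shorter.
    windows = [scores[i:i + step] for i in range(0, len(scores), step)]
    return [reduce_window(w) for w in windows]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Made-up scores standing in for values extracted from two wiggle files.
wig1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
wig2 = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

x = summarize(wig1, step=5, method="mean")
y = summarize(wig2, step=5, method="mean")
print(x, y, pearson(x, y))
```

A larger **Step** gives fewer, smoother points per region, while
*sample* keeps a single raw value per window instead of smoothing.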

-----

**Outputs**

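- **PDF file** is the correlation plot.
- **job log** is the text output captured while the tool runs.
- **job rscript** is the R script that was generated and run to draw
  the plot.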

  </help>

</tool>
author	jjohnson
date	Mon, 22 Sep 2014 11:54:41 -0400