changeset 5:1f312af2f8db draft

Uploaded
author bgruening
date Tue, 06 Aug 2013 08:20:47 -0400
parents c8a0dc481493
children c5847db0cb41
files bamCompare.xml bamCorrelate.xml bamCoverage.xml bamFingerprint.xml bigwigCompare.xml computeGCBias.xml computeMatrix.xml correctGCBias.xml heatmapper.xml profiler.xml
diffstat 10 files changed, 88 insertions(+), 55 deletions(-)
--- a/bamCompare.xml	Mon Aug 05 11:36:11 2013 -0400
+++ b/bamCompare.xml	Tue Aug 06 08:20:47 2013 -0400
@@ -1,5 +1,5 @@
 <tool id="bamCompare" name="bamCompare" version="1.0">
-  <description>Normalize and compare two BAM files to output ratio, log2ratio or difference.</description>
+  <description>normalizes and compares two BAM files to obtain the ratio, log2ratio or difference.</description>
   <requirements>
     <requirement type="package" version="1.5.1_59e067cce039cb93add04823c9f51cab202f8c2b">deepTools</requirement>
     <requirement type="package" version="1.7.1">numpy</requirement>
@@ -157,7 +157,7 @@
             help="If set, reads that have the same orientation and start position will be considered only once. If reads are paired, the mate position also has to coincide to ignore a read." /> 
 
         <param name="minMappingQuality" type="integer" optional="true" value="1" min="1"
-            label="Minimum mapping quality"
+            label="Minimum mapping quality (e.g. BOWTIE2 measures)"
             help= "If set, only reads that have a mapping quality score higher than the given value are considered"/>
 
         <param name="missingDataAsZero" type="boolean" truevalue="yes" falsevalue="no" checked="True"
@@ -180,30 +180,28 @@
 **What it does**
 
 This tool compares two BAM files based on the number of mapped reads. To
-compare the BAM files the genome is partitioned into bins of equal size, then
-the number of reads found in each BAM file are counted for such bins and
-finally a summarizing value is reported. This vaule can be the ratio of the
-number of reads per bin, the log2 of the ratio or the difference. This tool
-can normalize the number of reads on each BAM file using the SES method
-proposed by Diaz et al. (2012). "Normalization, bias correction, and peak
-calling for ChIP-seq". Statistical applications in genetics and molecular
-biology, 11(3). Normalization based on read counts is also available. The
-output is either a bedgraph or a bigwig file containing the bin location and
-the resulting comparison values. By default if reads are mated the fragment
-length reported in the BAM file is used.
+compare the BAM files, the genome is partitioned into bins of equal size,
+the reads are counted for each bin in each BAM file and, finally, a summarizing value is reported.
+This value can be the ratio of the number of reads per bin, the log2 of the ratio or the difference.
+This tool can normalize the number of reads in each BAM file using the SES method
+proposed by Diaz et al. (2012), Stat Appl Genet Mol Biol 11(3). Normalization based on read counts is also available. The
+output is either a bedGraph or a bigWig file containing the bin location and
+the resulting comparison values.
+If paired-end reads are present, the fragment
+length reported in the BAM file is used by default.
 
 -----
 
 .. class:: infomark
 
-Please acknowledge that this tool **is still in development** and we will be very happy to receive feedback from the users. If you run into any trouble please sent an email to `Fidel Ramirez`_.
+If you would like to give us feedback or you run into any trouble, please send an email to deeptools@googlegroups.com
 
 This tool is developed by the `Bioinformatics and Deep-Sequencing Unit`_ at the `Max Planck Institute for Immunobiology and Epigenetics`_.
 
 
 .. _Bioinformatics and Deep-Sequencing Unit: http://www3.ie-freiburg.mpg.de/facilities/research-facilities/bioinformatics-and-deep-sequencing-unit/
 .. _Max Planck Institute for Immunobiology and Epigenetics: http://www3.ie-freiburg.mpg.de
-.. _Fidel Ramirez: ramirez@ie-freiburg.mpg.de
+
 
   </help>
   
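Note on the bamCompare.xml help text above: the bin-and-compare procedure it describes can be sketched roughly as follows. This is an illustrative Python sketch, not the deepTools implementation. The function names, assigning a read to a bin by its start position, and the `pseudocount` used to avoid division by zero are all assumptions made for the example; SES and read-count normalization are left out.

```python
import math


def bin_counts(read_starts, genome_length, bin_size):
    """Count reads per fixed-size bin (assumption: a read belongs to the bin of its start)."""
    n_bins = math.ceil(genome_length / bin_size)
    counts = [0] * n_bins
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts


def compare_bins(counts_a, counts_b, mode="log2ratio", pseudocount=1.0):
    """Summarize two per-bin count lists as ratio, log2ratio or difference.

    The pseudocount is a hypothetical choice for this sketch; it keeps the
    ratio defined when a bin is empty in the second sample.
    """
    if mode not in ("ratio", "log2ratio", "difference"):
        raise ValueError("unknown mode: " + mode)
    values = []
    for a, b in zip(counts_a, counts_b):
        if mode == "difference":
            values.append(a - b)
            continue
        ratio = (a + pseudocount) / (b + pseudocount)
        values.append(math.log2(ratio) if mode == "log2ratio" else ratio)
    return values


# Toy data: read start positions on a 30 bp "genome", 10 bp bins.
treatment = bin_counts([0, 5, 12, 19, 25], genome_length=30, bin_size=10)
control = bin_counts([3, 14, 27], genome_length=30, bin_size=10)
log2_values = compare_bins(treatment, control, mode="log2ratio")
```

Each value in `log2_values` corresponds to one bin, which is the kind of per-bin record a bedGraph or bigWig output would carry.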
--- a/bamCorrelate.xml	Mon Aug 05 11:36:11 2013 -0400
+++ b/bamCorrelate.xml	Tue Aug 06 08:20:47 2013 -0400
@@ -1,5 +1,5 @@
 <tool id="bamCorrelate" name="bamCorrelate" version="1.0.1">
-  <description>corrlates pairs of bam files</description>
+  <description>correlates pairs of BAM files</description>
   <requirements>
     <requirement type="package" version="1.5.1_59e067cce039cb93add04823c9f51cab202f8c2b">deepTools</requirement>
   </requirements>
@@ -125,7 +125,7 @@
         
     <param name="includeZeros" type="boolean" truevalue="--includeZeros" falsevalue=""
        label ="Include zeros"
-       help  ="If set, then zero counts that happen for *all* bam files given are included. The default behavior is to ignore those cases" />
+       help  ="If set, then regions with zero counts for *all* BAM files given are included. The default behavior is to ignore those cases" />
 
     </when>
   </conditional>
@@ -156,15 +156,18 @@
 
 **What it does**
 
-Genomes are split into bins of given length. For each bin the number of reads
-found for each of the bam files is counted. A correlation is computed for all
-pairs of bam files.
+This tool is useful to assess the overall similarity of different BAM files. A typical application
+is to check the correlation between replicates or published data sets.
+
+The tool splits the genome into bins of a given length. For each bin, the number of reads
+found in each BAM file is counted and a correlation is computed for all
+pairs of BAM files.
 
 -----
 
 .. class:: infomark
 
-Please acknowledge that this tool **is still in development** and we will be very happy to receive feedback from the users. If you run into any trouble please sent an email to `Fidel Ramirez`_.
+If you would like to give us feedback or you run into any trouble, please send an email to deeptools@googlegroups.com
 
 This tool is developed by the `Bioinformatics and Deep-Sequencing Unit`_ at the `Max Planck Institute for Immunobiology and Epigenetics`_.
 
--- a/bamCoverage.xml	Mon Aug 05 11:36:11 2013 -0400
+++ b/bamCoverage.xml	Tue Aug 06 08:20:47 2013 -0400
@@ -129,7 +129,7 @@
 
 .. class:: infomark
 
-Please acknowledge that this tool **is still in development** and we will be very happy to receive feedback from the users. If you run into any trouble please sent an email to `Fidel Ramirez`_.
+If you would like to give us feedback or you run into any trouble, please send an email to deeptools@googlegroups.com
 
 This tool is developed by the `Bioinformatics and Deep-Sequencing Unit`_ at the `Max Planck Institute for Immunobiology and Epigenetics`_.
 
--- a/bamFingerprint.xml	Mon Aug 05 11:36:11 2013 -0400
+++ b/bamFingerprint.xml	Tue Aug 06 08:20:47 2013 -0400
@@ -141,15 +141,20 @@
 
 **What it does**
 
-Samples indexed bam files and plots a profile for each bam file. At each
-sample position all reads overlaping a window (bin) of specified length are
-counted. This counts are then sorted and the cumulative sum plotted
+This tool is based on a method developed by Diaz et al. (2012), Stat Appl Genet Mol Biol 11(3).
+The resulting plot can be used to assess the strength of a ChIP (for factors that bind to narrow regions).
+The tool first samples indexed BAM files and counts all reads overlapping a window (bin) of specified length.
+These counts are then sorted according to their rank and the cumulative sum of read counts is plotted. An ideal input
+with a perfectly uniform distribution of reads along the genome (i.e. without enrichments in open chromatin etc.) should
+generate a straight diagonal line. A very specific and strong ChIP enrichment will be indicated by a prominent and steep
+rise of the cumulative sum towards the highest rank. This means that a big chunk of reads from the ChIP sample is located in
+a few bins, which corresponds to the high, narrow enrichments seen for transcription factors.
 
 -----
 
 .. class:: infomark
 
-Please acknowledge that this tool **is still in development** and we will be very happy to receive feedback from the users. If you run into any trouble please sent an email to `Fidel Ramirez`_.
+If you would like to give us feedback or you run into any trouble, please send an email to deeptools@googlegroups.com
 
 This tool is developed by the `Bioinformatics and Deep-Sequencing Unit`_ at the `Max Planck Institute for Immunobiology and Epigenetics`_.
 
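Note on the bamFingerprint.xml help text above: the sort-and-accumulate step can be illustrated with a small Python sketch. It is not the deepTools code. The function name is hypothetical, and scaling both axes to the range 0..1 is an assumption made here so that a uniform sample falls on the diagonal, as the help text describes.

```python
from itertools import accumulate


def fingerprint(bin_counts):
    """Return (rank fraction, cumulative read fraction) points for one sample.

    Bins are sorted by read count in ascending order and the running sum is
    divided by the total, so the curve always ends at (1.0, 1.0).
    """
    ordered = sorted(bin_counts)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no reads in any sampled bin")
    n_bins = len(ordered)
    running = accumulate(ordered)
    return [((i + 1) / n_bins, c / total) for i, c in enumerate(running)]


# Uniform coverage: every bin holds the same number of reads.
uniform_curve = fingerprint([5, 5, 5, 5])
# Strong, narrow enrichment: all reads sit in a single bin.
enriched_curve = fingerprint([0, 0, 0, 20])
```

With the uniform input every point lies on the diagonal, while the enriched input stays at zero until the last rank and then jumps to 1.0, which is the steep rise the help text refers to.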
--- a/bigwigCompare.xml	Mon Aug 05 11:36:11 2013 -0400
+++ b/bigwigCompare.xml	Tue Aug 06 08:20:47 2013 -0400
@@ -1,5 +1,5 @@
 <tool id="bigwigCompare" name="bigwigCompare" version="1.0">
-  <description>compares two bigwig files based on the number of mapped reads</description>
+  <description>normalizes and compares two bigWig files to obtain the ratio, log2ratio or difference</description>
   <requirements>
     <requirement type="package" version="1.5.1_59e067cce039cb93add04823c9f51cab202f8c2b">deepTools</requirement>
     <requirement type="package" version="0.1">ucsc_tools</requirement>
@@ -92,14 +92,14 @@
 This tool compares two bigwig files based on the number of mapped reads. To
 compare the bigwig files the genome is partitioned into bins of equal size,
 then the number of reads found in each BAM file are counted for such bins and
-finally a summarizing value is reported. This vaule can be the ratio of the
-number of readsper bin, the log2 of the ratio, the sum or the difference.
+finally a summarizing value is reported. This value can be the ratio of the
+number of reads per bin, the log2 of the ratio, the sum or the difference.
 
 -----
 
 .. class:: infomark
 
-Please acknowledge that this tool **is still in development** and we will be very happy to receive feedback from the users. If you run into any trouble please sent an email to `Fidel Ramirez`_.
+If you would like to give us feedback or you run into any trouble, please send an email to deeptools@googlegroups.com
 
 This tool is developed by the `Bioinformatics and Deep-Sequencing Unit`_ at the `Max Planck Institute for Immunobiology and Epigenetics`_.
 
--- a/computeGCBias.xml	Mon Aug 05 11:36:11 2013 -0400
+++ b/computeGCBias.xml	Tue Aug 06 08:20:47 2013 -0400
@@ -1,5 +1,6 @@
 <tool id="computeGCBias" name="computeGCBias" version="1.0.1">
-  <description></description>
+  <description>to see whether your samples should be normalized for GC bias</description>
+  
   <requirements>
     <requirement type="package" version="1.5.1_59e067cce039cb93add04823c9f51cab202f8c2b">deepTools</requirement>
   </requirements>
@@ -51,13 +52,17 @@
 
   #end if
 
-  #if $output.showOutputSettings == "yes"
-      #if $output.saveBiasPlot:
-        --biasPlot biasPlot.png ;
-        mv biasPlot.png $biasPlot
-      #end if
+  #if $saveBiasPlot:
+    --biasPlot $biasPlot
   #end if
 
+##  #if $output.showOutputSettings == "yes"
+##      #if $output.saveBiasPlot:
+##        --biasPlot biasPlot.png ;
+##        mv biasPlot.png $biasPlot
+##      #end if
+##  #end if
+
   ; rm $temp_dir -rf
 
   </command>
@@ -123,6 +128,8 @@
         </when>
     </conditional>
 
+    <param name="saveBiasPlot" type="boolean" truevalue="--biasPlot" falsevalue="" checked="True" label="Save a diagnostic image summarizing the GC bias found on the sample"/>
+    <!--
     <conditional name="output" >
         <param name="showOutputSettings" type="select" label="Show additional output options" >
         <option value="no" selected="true">no</option>
@@ -133,6 +140,7 @@
         <param name="saveBiasPlot" type="boolean" label="Save a diagnostic image summarizing the GC bias found on the sample"/>
       </when>
     </conditional>
+    -->
   </inputs>
   <outputs>
     <data format="tabular" name="outFileName" />
@@ -144,8 +152,8 @@
 
 **What it does**
 
-Computes the GC bias ussing Benjamini's method [citation]. The resulting GC
-bias can later be used to plot the bias or to correct the bias.
+This tool computes the GC bias using the method proposed by Benjamini and Speed (2012), Nucleic Acids Res.
+The output is used to plot the bias and can also be used later on to correct the bias with the tool correctGCBias.
 
 -----
 
--- a/computeMatrix.xml	Mon Aug 05 11:36:11 2013 -0400
+++ b/computeMatrix.xml	Tue Aug 06 08:20:47 2013 -0400
@@ -4,14 +4,29 @@
     <requirement type="package" version="1.5.1_59e067cce039cb93add04823c9f51cab202f8c2b">deepTools</requirement>
   </requirements>
   <command>
+    #import tempfile
+
+    #set $temp_input_handle = tempfile.NamedTemporaryFile()
+    #set $temp_input_path = $temp_input_handle.name
+    #silent $temp_input_handle.close()
+
+    #for $rf in $regionsFiles:
+        cat "$rf.regionsFile" >> $temp_input_path;
+        #if str($rf.label.value).strip():
+            echo "\#$rf.label.value" >> $temp_input_path;
+        #else:
+            echo "\#$rf.regionsFile.name" >> $temp_input_path;
+        #end if
+    #end for
+
+
   computeMatrix
 
   $mode.mode_select
-  --regionsFileName '$regionsFile'
+  --regionsFileName '$temp_input_path'
   --scoreFileName '$scoreFile'
   --outFileName '$outFileName'
 
-  ##ToDo
   --numberOfProcessors 4
 
   #if $output.showOutputSettings == "yes"
@@ -61,11 +76,17 @@
     #end if
 
   #end if
+  ; rm $temp_input_path
 
   </command>
   <inputs>
-    <param name="regionsFile" format="bed,gff" type="data" label="Regions to plot" help="File, in BED or GFF format, containing the regions to plot."/>
-    <param name="scoreFile" format="bigwig,bam" type="data" label="Score file" help="Either a bigWig file (containing a score, usually covering the whole genome) or a BAM file. For this last case, coverage counts will be used for the heatmap."/>
+
+    <repeat name="regionsFiles" title="regions to plot" min="1">
+        <param name="regionsFile" format="bed" type="data" label="Regions to plot" help="File, in BED or GFF format, containing the regions to plot."/>
+        <param name="label" type="text" size="30" optional="true" value="" label="Label" help="Label to use in the output."/>
+    </repeat>
+
+    <param name="scoreFile" format="bigwig" type="data" label="Score file" help="Either a bigWig file (containing a score, usually covering the whole genome) or a BAM file. For this last case, coverage counts will be used for the heatmap."/>
 
     <conditional name="mode" >
       <param name="mode_select" type="select" label="computeMatrix has two main output options" help="In the scale-regions mode, all regions in the BED/GFF file are stretched or shrunk to the same length (bp) that is indicated by the user. Reference-point refers to a position within the BED/GFF regions (e.g start of region). In the reference-point mode only those genomic positions before (downstream) and/or after (upstream) the reference point will be plotted.">
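Note on the computeMatrix.xml command block above: the new Cheetah loop concatenates every regions file into one temporary file and writes a `#label` line after each group. The same idea, written as a standalone Python sketch for readers who want to try it outside Galaxy, might look like this. The function name and the `(path, label)` input shape are hypothetical. Falling back to the file's base name stands in for the Galaxy dataset name used in the template, and the newline guard is an addition not present in the original `cat`/`echo` pipeline.

```python
import os
import tempfile


def merge_region_files(region_files):
    """Concatenate region files, closing each group with a '#label' line.

    region_files is a list of (path, label) pairs; an empty or blank label
    falls back to the file's base name. Returns the path of the merged file,
    which the caller is responsible for deleting.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".bed", delete=False) as merged:
        for path, label in region_files:
            with open(path) as handle:
                content = handle.read()
            merged.write(content)
            # Guard against a missing final newline so the label gets its own line.
            if content and not content.endswith("\n"):
                merged.write("\n")
            name = label.strip() or os.path.basename(path)
            merged.write("#" + name + "\n")
        return merged.name
```

The returned path plays the role of `$temp_input_path` in the template, and removing it afterwards corresponds to the trailing `rm $temp_input_path`.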
--- a/correctGCBias.xml	Mon Aug 05 11:36:11 2013 -0400
+++ b/correctGCBias.xml	Tue Aug 06 08:20:47 2013 -0400
@@ -1,6 +1,5 @@
 <tool id="correctGCBias" name="correctGCBias" version="1.0.1">
-  <description>
-  </description>
+  <description>use the output from computeGCBias to obtain corrected sample files</description>
   <requirements>
     <requirement type="package" version="1.5.1_59e067cce039cb93add04823c9f51cab202f8c2b">deepTools</requirement>
     <requirement type="package" version="0.1">ucsc_tools</requirement>
@@ -106,14 +105,13 @@
 
 **What it does**
 
-Computes the GC bias ussing Benjamini's method [citation]. The resulting GC
-bias can later be used to plot the bias or to correct the bias.
-
+This tool requires the output from computeGCBias to correct the given BAM files according to the method proposed by Benjamini and Speed (2012), Nucleic Acids Res.
+The resulting BAM files can be used in any downstream analyses, but be aware that you should not filter out duplicates from here on.
 -----
 
 .. class:: infomark
 
-Please acknowledge that this tool **is still in development** and we will be very happy to receive feedback from the users. If you run into any trouble please sent an email to `Fidel Ramirez`_.
+If you would like to give us feedback or you run into any trouble, please send an email to deeptools@googlegroups.com
 
 This tool is developed by the `Bioinformatics and Deep-Sequencing Unit`_ at the `Max Planck Institute for Immunobiology and Epigenetics`_.
 
--- a/heatmapper.xml	Mon Aug 05 11:36:11 2013 -0400
+++ b/heatmapper.xml	Tue Aug 06 08:20:47 2013 -0400
@@ -144,7 +144,7 @@
 
         <param name="missingDataColor" type="text" label="Missing data color" value="black" optional="true" help="If 'Represent missing data as zero' is not set, such cases will be colored in black by default. By using this parameter a different color can be set. A value between 0 and 1 will be used for a gray scale (black is 0). Also color names can be used, see a list here: http://packages.python.org/ete2/reference/reference_svgcolors.html. Alternatively colors can be specified using the #rrggbb notation." />
 
-        <param name="colorMap" type="select" label="Color map to use for the heatmap" help=" Available values can be seen here: http://www.astro.lsa.umich.edu/~msshin/science/code/matplotlib_cm/">
+        <param name="colorMap" type="select" label="Color map to use for the heatmap" help=" Available color map names can be found here: http://www.astro.lsa.umich.edu/~msshin/science/code/matplotlib_cm/">
             <option value="RdYlBu" selected="true">RdYlBu</option>
             <option value="Accent">Accent</option>
             <option value="Spectral">Spectral</option>
@@ -353,7 +353,7 @@
 
 .. class:: infomark
 
-Please acknowledge that this tool **is still in development** and we will be very happy to receive feedback from the users. If you run into any trouble please sent an email to `Fidel Ramirez`_.
+If you would like to give us feedback or you run into any trouble, please send an email to deeptools@googlegroups.com
 
 This tool is developed by the `Bioinformatics and Deep-Sequencing Unit`_ at the `Max Planck Institute for Immunobiology and Epigenetics`_.
 
--- a/profiler.xml	Mon Aug 05 11:36:11 2013 -0400
+++ b/profiler.xml	Tue Aug 06 08:20:47 2013 -0400
@@ -11,8 +11,8 @@
   --matrixFile $matrixFile
 
   #if $output.showOutputSettings == "yes"
-      #set newoutFileName=str($outFileName)+"."+str($output.outFileFormat)
-      --outFileName $newoutFilename
+      #set newoutFileName = str($outFileName)+"."+str($output.outFileFormat)
+      --outFileName $newoutFileName
       #if $output.outFileNameData:
         --outFileNameData '$output.outFileNameData' 
       #end if
@@ -25,7 +25,7 @@
         --outFileSortedRegions '$output.outFileSortedRegions'
       #end if
   #else
-    #set newoutFileName=str($outFileName)+".png"
+    #set newoutFileName = str($outFileName)+".png"
     --outFileName $newoutFileName
   #end if
   
@@ -96,7 +96,7 @@
         </param>
         <param name="saveData" type="boolean" label="Save the data underlying data for the average profile"/>
         <param name="saveMatrix" type="boolean" label="Save the the matrix of values underlying the heatmap"/>
-        <param name="saveSortedRegions" type="boolean" label="Save the regions after skipping zeros or min/max threshold values" help="The order of the regions in the file follows the sorting order selected. This is useful, for example, to generate other heatmaps keeping the sorting of the first heatmap."/>
+        <param name="saveSortedRegions" type="boolean" label="Save the regions after skipping zeros or min/max threshold values" help="This outputs the file of genomic intervals in the order that will be shown in the heatmap or summary profile. This is useful, for example, to generate other heatmaps keeping the sorting of the first heatmap."/>
       </when>
     </conditional>
 
@@ -171,7 +171,7 @@
 
 .. class:: infomark
 
-Please acknowledge that this tool **is still in development** and we will be very happy to receive feedback from the users. If you run into any trouble please sent an email to `Fidel Ramirez`_.
+If you would like to give us feedback or you run into any trouble, please send an email to deeptools@googlegroups.com
 
 This tool is developed by the `Bioinformatics and Deep-Sequencing Unit`_ at the `Max Planck Institute for Immunobiology and Epigenetics`_.