comparison computeGCBias.xml @ 6:a66bf1fdd4b4 draft

planemo upload for repository https://github.com/fidelram/deepTools/tree/master/galaxy/wrapper/ commit 54a10cf268ca9a5399f13458a1b218be7891bd41
author bgruening
date Wed, 23 Dec 2015 03:55:49 -0500
parents 511d00417d91
children f1b7a3555d34
5:511d00417d91 6:a66bf1fdd4b4
139 139
140 **Summary of the method used** 140 **Summary of the method used**
141 141
142 In order to estimate how many reads with what kind of GC content one should have sequenced, we first need to determine how many regions the specific 142 In order to estimate how many reads with what kind of GC content one should have sequenced, we first need to determine how many regions the specific
143 reference genome contains for each amount of GC content, i.e. how many regions in the genome have 50% GC (or 10% GC or 90% GC or...). 143 reference genome contains for each amount of GC content, i.e. how many regions in the genome have 50% GC (or 10% GC or 90% GC or...).
144 We then sample a large number of equally sized genome bins and count how many times we see a bin with 50% GC (or 10% GC or 90% or...). These EXPECTED values are independent of any 144 We then sample a large number of equally sized genome bins and count how many times we see a bin with 50% GC (or 10% GC or 90% or...). These EXPECTED values are independent of any
145 sequencing as they depend only on the respective reference genome (i.e. they will most likely vary between mouse and fruit fly due to their genomes' different GC contents). 145 sequencing as they depend only on the respective reference genome (i.e. they will most likely vary between mouse and fruit fly due to their genomes' different GC contents).
146 The OBSERVED values are based on the reads from the sequenced sample. Instead of noting how many genomic regions there are per GC content, we now count the reads per GC content. 146 The OBSERVED values are based on the reads from the sequenced sample. Instead of noting how many genomic regions there are per GC content, we now count the reads per GC content.
147 In an ideal sample without GC bias, the ratio of OBSERVED/EXPECTED values should be close to 1 regardless of the GC content. Due to PCR (over)amplifications, the majority of ChIP samples 147 In an ideal sample without GC bias, the ratio of OBSERVED/EXPECTED values should be close to 1 regardless of the GC content. Due to PCR (over)amplifications, the majority of ChIP samples
148 usually show a significant bias towards reads with high GC content (>50%). 148 usually show a significant bias towards reads with high GC content (>50%).
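
The OBSERVED/EXPECTED idea summarized above can be illustrated with a small sketch. This is not the computeGCBias implementation. All function names here are hypothetical. The sketch assumes that GC content is rounded to whole percentages and that both histograms are normalized to fractions before taking the ratio.

```python
import random
from collections import Counter


def gc_percent(seq):
    """Return the GC content of a sequence, rounded to a whole percentage."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    return round(100 * gc / len(seq))


def expected_counts(genome, bin_size, n_samples, rng):
    """Sample equally sized bins from the reference and count bins per GC value.

    Depends only on the reference sequence, not on any sequencing data.
    """
    counts = Counter()
    for _ in range(n_samples):
        start = rng.randrange(len(genome) - bin_size + 1)
        counts[gc_percent(genome[start:start + bin_size])] += 1
    return counts


def observed_counts(reads):
    """Count sequenced reads per GC value."""
    return Counter(gc_percent(read) for read in reads)


def observed_over_expected(observed, expected):
    """Ratio of observed to expected fractions for each GC value seen in the reference.

    Normalizing to fractions is an assumption of this sketch; it makes a
    sample without GC bias give ratios close to 1.
    """
    n_obs = sum(observed.values())
    n_exp = sum(expected.values())
    return {
        gc: (observed[gc] / n_obs) / (expected[gc] / n_exp)
        for gc in expected
        if expected[gc] > 0
    }
```

With these helpers, a ratio above 1 at high GC percentages would correspond to the over-representation of GC-rich reads described above.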
149 149
150 .. image:: $PATH_TO_IMAGES/QC_GCplots_input.png 150 .. image:: $PATH_TO_IMAGES/QC_GCplots_input.png
151 151
152 152
153 You can find more details on the computeGCBias doc page: https://deeptools.readthedocs.org/en/release-1.6/content/tools/computeGCBias.html 153 You can find more details on the computeGCBias wiki page: computeGCBias wiki: https://github.com/fidelram/deepTools/wiki/QC#wiki-computeGCbias
154 154
155 155
156 **Output files**: 156 **Output files**:
157 157
158 - Diagnostic plot 158 - Diagnostic plot