spp_phantompeak: spp/man/find.binding.positions.Rd annotate

annotate spp/man/find.binding.positions.Rd @ 15:e689b83b0257 draft

Uploaded

author	zzhou
date	Tue, 27 Nov 2012 16:15:21 -0500
parents	ce08b0efa3fd
children

\name{find.binding.positions}
\alias{find.binding.positions}
%- Also NEED an '\alias' for EACH other topic documented here.
\title{ Determine significant point protein binding positions (peaks) }
\description{
Given the signal and optional control (input) data, determine the locations
of statistically significant point binding positions. If the control data
is not provided, the statistical significance can be assessed based on
tag randomization. The method also provides options for masking
regions exhibiting strong signals within the control data.
}
\usage{
find.binding.positions(signal.data, e.value = NULL, fdr = NULL,
  masked.data = NULL, control.data = NULL, min.dist = 200,
  window.size = 4e+07, cluster = NULL, debug = T, n.randomizations = 3,
  shuffle.window = 1, min.thr = 0, topN = NULL, tag.count.whs = 100,
  enrichment.z = 2, method = tag.wtd, tec.filter = T,
  tec.window.size = 10000, tec.masking.window.size = tec.window.size,
  tec.z = 5, tec.poisson.z = 5, tec.poisson.ratio = 5,
  n.control.samples = 1, enrichment.background.scales = c(1, 5, 10),
  background.density.scaling = F, use.randomized.controls = F, ...)
}
%- maybe also 'usage' for other objects documented here.
\arguments{
~~ tag data ~~
\item{signal.data}{ signal tag vector list }
\item{control.data}{ optional control (input) tag vector list }

~~ position stringency criteria ~~
\item{e.value}{ E-value defining the desired statistical significance
of binding positions. }
\item{fdr}{ FDR defining statistical significance of binding positions }
\item{topN}{ instead of determining statistical significance
thresholds, return the specified number of highest-scoring
positions }

~~ other params ~~
\item{whs}{ window half-size that should be used for binding
detection (e.g. determined from cross-correlation profiles) }
\item{masked.data}{ optional set of coordinates that should be masked
(e.g. known non-unique regions) }
\item{min.dist}{ minimal distance that must separate detected binding
positions. If multiple binding positions are detected within
this distance, the position with the highest score is returned. }
\item{window.size}{ size of the window used to segment the chromosome
during calculations to reduce memory usage. }
\item{cluster}{ optional \code{snow} cluster on which to parallelize the
processing }
\item{min.thr}{ minimal score requirement for a peak }
\item{background.density.scaling}{ If TRUE, regions of significant tag
enrichment will be masked out when calculating the size ratio of the
signal to control datasets (to estimate the ratio of the background tag
density). If FALSE, the dataset ratio will be equal to the ratio of
the number of tags in each dataset. }

~~ randomized controls ~~
\item{n.randomizations}{ number of tag randomizations that should be
performed (when the control data is not provided) }
\item{use.randomized.controls}{ Use randomized tag control, even if
\code{control.data} is supplied. }
\item{shuffle.window}{ during tag randomizations, tags will be split
into groups of \code{shuffle.window} and will be maintained
together throughout the randomization. }

~~ fold-enrichment confidence intervals ~~
\item{tag.count.whs}{ half-size of a window used to assess fold
enrichment of a binding position }
\item{enrichment.z}{ Z-score used to define the significance level of
the fold-enrichment confidence intervals }
\item{enrichment.background.scales}{ In estimating the peak
fold-enrichment confidence intervals, the background tag density is
estimated based on windows with half-sizes of
\code{2*tag.count.whs*enrichment.background.scales}. }
\item{method}{ either \code{tag.wtd} for WTD method, or
\code{tag.lwcc} for MTC method }
\item{mle.filter}{ If turned on, will exclude predicted positions
whose MLE enrichment ratio (for any of the background scales) is
below a specified min.mle.threshold }
\item{min.mle.threshold}{ MLE enrichment ratio threshold that each
predicted position must exceed if mle.filter is turned on. }

~~ masking regions of significant control enrichment ~~
\item{tec.filter}{ Whether to mask out the regions exhibiting
significant enrichment in the control data when doing other
calculations. The regions are identified using Poisson statistics
within sliding windows, either relative to the scaled signal (tec.z), or
relative to randomly-distributed expectation (tec.poisson.z). }
\item{tec.window.size}{ size of the window used to determine
significantly enriched control regions }
\item{tec.masking.window.size}{ size of the window used to mask
the area around significantly enriched control regions }
\item{tec.z}{ Z-score defining statistical stringency by which a given
window is determined to be significantly higher in the input than in
the signal, and masked if that is the case. }
\item{tec.poisson.z}{ Z-score defining statistical stringency by which a given
window is determined to be significantly higher than the
tec.poisson.ratio above the expected uniform input background. }
\item{tec.poisson.ratio}{ Fold ratio by which input must exceed the
level expected from the uniform distribution. }
}
\value{
\item{npl}{A per-chromosome list containing data frames describing
determined binding positions. Column description:
\describe{
\item{x}{ position }
\item{y}{ score }
\item{evalue}{ E-value }
\item{fdr}{ FDR. For peaks higher than the maximum control peak,
the highest dataset FDR is reported }
\item{enr}{ lower bound of the fold-enrichment ratio confidence
interval. This is the estimate determined using scale of
1. Estimates corresponding to higher scales are returned in other enr columns
with scale appearing in the name. }
\item{enr.mle}{ enrichment ratio maximum likelihood estimate }
}
}
\item{thr}{ information on the chosen statistical threshold of the peak scores }
}

\examples{
# find binding positions using WTD method, 200bp half-window size,
# control data, 1\% FDR
bp <- find.binding.positions(signal.data = chip.data, control.data = input.data,
                             fdr = 0.01, method = tag.wtd, whs = 200)

# find binding positions using MTC method, using 5 tag randomizations,
# keeping pairs of tag positions together (shuffle.window=2)
bp <- find.binding.positions(signal.data = chip.data, control.data = input.data,
                             fdr = 0.01, method = tag.lwcc, whs = 200,
                             use.randomized.controls = TRUE,
                             n.randomizations = 5, shuffle.window = 2)

# print out the number of determined positions
print(paste("detected", sum(unlist(lapply(bp$npl, function(d) length(d$x)))), "peaks"))
}
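
The min.dist argument is documented as keeping only the highest-scoring position when several fall within that distance of one another. The following standalone R sketch illustrates that rule on a toy table with the documented x (position) and y (score) columns. It is not taken from the spp sources: the helper name thin.peaks, the greedy highest-score-first strategy, and the strict "less than" comparison at the boundary are all assumptions made for illustration.

```r
# Hypothetical helper (not part of spp): among positions closer together
# than min.dist, keep only the one with the highest score.
thin.peaks <- function(d, min.dist = 200) {
  # visit positions from the highest to the lowest score
  d <- d[order(d$y, decreasing = TRUE), , drop = FALSE]
  keep <- logical(nrow(d))
  for (i in seq_len(nrow(d))) {
    # accept a position only if no already accepted position is within min.dist
    keep[i] <- !any(abs(d$x[keep] - d$x[i]) < min.dist)
  }
  kept <- d[keep, , drop = FALSE]
  # return the surviving positions in coordinate order
  kept[order(kept$x), , drop = FALSE]
}

# toy single-chromosome table (made-up values)
toy <- data.frame(x = c(1000, 1100, 1500, 1650, 3000),
                  y = c(5, 9, 4, 3, 7))
thinned <- thin.peaks(toy, min.dist = 200)
print(thinned)
```

With these toy values, the positions at 1000 and 1650 are dropped because each lies within 200 bp of a higher-scoring neighbour, leaving 1100, 1500 and 3000.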

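The enrichment.background.scales entry describes background windows whose half-sizes depend on tag.count.whs. Assuming the documented expression is the product 2 * tag.count.whs * enrichment.background.scales, the short R sketch below works out the half-sizes implied by the defaults in the usage line. The variable bg.half.sizes and the "scale." labels are illustrative names, not part of spp.

```r
# defaults copied from the usage line of find.binding.positions
tag.count.whs <- 100
enrichment.background.scales <- c(1, 5, 10)

# background window half-sizes, assuming the documented expression is a product
bg.half.sizes <- 2 * tag.count.whs * enrichment.background.scales
names(bg.half.sizes) <- paste0("scale.", enrichment.background.scales)
print(bg.half.sizes)
```

Under that reading, the three default scales correspond to half-sizes of 200, 1000 and 2000 bp.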