mimodd: deletion_predictor.xml annotate

annotate deletion_predictor.xml @ 0:7da2c9654a83 draft default tip

Uploaded

author	wolma
date	Tue, 12 Aug 2014 11:26:15 -0400

<tool id="deletion_predictor" name="Deletion Prediction for paired-end data">
  <description>Predicts deletions in one or more aligned read samples based on coverage of the reference genome and on insert sizes</description>
  <requirements>
    <requirement type="package" version="3.4.1">python3</requirement>
    <requirement type="package" version="0.1.3_9af04e0e9125">MiModD</requirement>
  </requirements>
  <command>
    mimodd delcall
    #for $l in $list_input
      ${l.bamfile}
    #end for
    $covfile -o $outputfile
    --max_cov $max_cov --min_size $min_size $include_uncovered $group_by_id --verbose
  </command>

  <inputs>
    <repeat name="list_input" title="Aligned reads input source" default="1" min="1">
      <param name="bamfile" type="data" format="bam" label="input BAM file" />
    </repeat>
    <param name="covfile" type="data" format="tabular" label="input coverage file" help="A MiModD coverage file as generated by the Variant Calling and Coverage Analysis tool."/>
    <param name="group_by_id" type="boolean" label="group reads based on read group id only" truevalue="-i" falsevalue="" checked="true" help="If selected, reads from different read groups will be treated strictly separately. If turned off, read groups with identical sample names are used together for identifying uncovered regions, but are still treated separately for the prediction of deletions." />
    <param name="include_uncovered" type="boolean" label="include low-coverage regions" truevalue="-u" falsevalue="" checked="true" help="If selected, regions that fulfill the coverage criteria below, but are not statistically significant deletions, will be included in the output." />
    <param name="max_cov" type="integer" value="0" label="maximal coverage allowed inside a low-coverage region (default: 0)" help="The maximal coverage allowed at a site for it to be considered part of a low-coverage region." />
    <param name="min_size" type="integer" value="100" label="minimal deletion size (default: 100)" help="A low-coverage region must consist of at least this number of consecutive bases with coverage not exceeding the maximal coverage to be considered in further analyses."/>
  </inputs>

  <outputs>
    <data name="outputfile" format="gff" />
  </outputs>

  <help>
.. class:: infomark

What it does

The tool predicts deletions from paired-end data in a two-step process.

First, it finds regions of low coverage, i.e., candidate regions for deletions, by scanning a coverage file as produced by the Variant Calling and Coverage Analysis tool.
The maximal coverage allowed inside a low-coverage region and the minimal deletion size parameters are used at this step to define what is considered a low-coverage region.
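
As an illustration of this first step only, the scan for low-coverage regions can be sketched in Python as follows. This is a minimal sketch and not MiModD's actual implementation: the function name low_coverage_regions and the plain list of per-base coverage values are assumptions made for the example, and a real coverage file would have to be parsed into such a list first.

```python
def low_coverage_regions(coverage, max_cov=0, min_size=100):
    """Return (start, end) pairs of 1-based, inclusive positions.

    coverage is a hypothetical list of per-base coverage values for one
    sequence; it is not the MiModD coverage file format.
    """
    regions = []
    run_start = None  # 1-based start of the current low-coverage run
    for pos, cov in enumerate(coverage, start=1):
        if cov > max_cov:
            # A covered base ends the current run, if there is one.
            if run_start is not None and pos - run_start >= min_size:
                regions.append((run_start, pos - 1))
            run_start = None
        elif run_start is None:
            # First base of a new low-coverage run.
            run_start = pos
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(coverage) - run_start + 1 >= min_size:
        regions.append((run_start, len(coverage)))
    return regions


example = [5, 0, 0, 0, 4, 0, 0, 6]
print(low_coverage_regions(example, max_cov=0, min_size=3))
```

With max_cov=0 and min_size=3, the example reports only positions 2 to 4; the two uncovered bases at positions 6 and 7 form a run that is too short to qualify.
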

Second, the tool assesses every low-coverage region statistically for evidence of it being a real deletion.
This step requires paired-end data since it relies on shifts in the distribution of read pair insert sizes around real deletions.

By default, the tool only reports Deletions, i.e., the fraction of low-coverage regions that pass the statistical test.
If include low-coverage regions is selected, regions that failed the test will also be reported.

With group reads based on read group id only selected, as it is by default, grouping of reads into samples is done strictly based on their read group IDs.
With the option deselected, grouping is done based on sample names in the first step of the analysis, i.e., the reads of all read groups with a shared sample name are used together to identify low-coverage regions.
In the second step, however, reads will be regrouped by their read group IDs again, i.e., the statistical assessment for real deletions is always done on a per-read-group basis.

TIP:
Deselecting group reads based on read group id only can be useful, for example, if you have both paired-end and single-end sequencing data for the same sample.

In this case, the two sets of reads will usually share a common sample name, but differ in their read groups.
With grouping based on sample names, the single-end data can be used together with the paired-end data to identify low-coverage regions, thus increasing overall coverage and reliability of this step.
Still, the assessment of deletions will use only the paired-end data (auto-detecting that the single-end reads do not provide insert size information).

  </help>

</tool>
