Mercurial > repos > blankenberg > naive_variant_detector
changeset 10:d6d7aa386bad
Update help text.
author | Daniel Blankenberg <dan@bx.psu.edu> |
---|---|
date | Mon, 26 Aug 2013 17:47:25 -0400 |
parents | a17fbdd7b47a |
children | 15493ebbc53b |
files | tools/naive_variant_detector.xml |
diffstat | 1 files changed, 60 insertions(+), 5 deletions(-) |
--- a/tools/naive_variant_detector.xml Mon Aug 26 12:45:02 2013 -0400
+++ b/tools/naive_variant_detector.xml Mon Aug 26 17:47:25 2013 -0400
@@ -1,5 +1,5 @@
 <tool id="naive_variant_detector" name="Naive Variant Caller" version="0.0.1">
-  <description>on BAM files</description>
+  <description>tabulate variable sites from BAM datasets</description>
   <requirements>
     <requirement type="package" version="1.7.1">numpy</requirement>
     <requirement type="package" version="0.0.1">pyBamParser</requirement>
@@ -94,8 +94,8 @@
     <param name="use_strand" type="boolean" truevalue="--use_strand" falsevalue="" checked="False" label="Report counts by strand"/>
     <param name="coverage_dtype" type="select" label="Choose the dtype to use for storing coverage information" help="This affects the maximum recorded value for a position, e.g. uint8 would be 255 coverage, but will require the least amount of RAM">
-      <option value="uint8" selected="True">uint8</option>
-      <option value="uint16">uint16</option>
+      <option value="uint8">uint8</option>
+      <option value="uint16" selected="True">uint16</option>
       <option value="uint32">uint32</option>
       <option value="uint64">uint64</option>
     </param>
@@ -107,19 +107,74 @@
 <help>
 **What it does**
-This tool is a naive variant detector.
+This tool is a naive variant caller that processes aligned sequencing reads from the BAM format and produces a VCF file containing per position variant calls. This tool allows multiple BAM files to be provided as input and utilizes read group information to make calls for individual samples.
+
+User configurable options allow filtering reads that do not pass mapping or base quality thresholds and minimum per base read depth; user's can also specify the ploidy and whether to consider each strand separately.
+
+In addition to calling alternate alleles based upon simple ratios of nucleotides at a position, per base nucleotide counts are also provided. A custom tag, NC, is used within the Genotype fields. The NC field is a comma-separated listing of nucleotide counts in the form of <nucleotide>=<count>, where a plus or minus character is prepended to indicate strand, if the strandedness option was specified.
+
 ------
 **Inputs**
-Accepts one or more BAM input files.
+Accepts one or more BAM input files and a reference genome from the built-in list or from a FASTA file in your history.
 **Outputs**
 The output is in VCF format.
+**Options**
+
+Reference Genome:
+
+  Ensure that you have selected the correct reference genome, either from the list of built-in genomes or by selecting the corresponding FASTA file from your history.
+
+Restrict to regions:
+
+  You can specify any number of regions on which you would like to receive results. You can specify just a chromosome name, or a chromosome name and start postion, or a chromosome name and start and end position for the set of desired regions.
+
+Minimum number of reads needed to consider a REF/ALT:
+
+  This value declares the minimum number of reads containing a particular base at each position in order to list and use said allele in genotyping calls. Default is 0.
+
+Minimum base quality:
+
+  The minimum base quality score needed for the position in a read to be used for nucleotide counts and genotyping. Default is no filter.
+
+Minimum mapping quality:
+
+  The minimum mapping quality score needed to consider a read for nucleotide counts and genotyping. Default is no filter.
+
+Ploidy:
+
+  The number of genotype calls to make at each reported position.
+
+Only write out positions with with possible alternate alleles:
+
+  When set, only positions which have at least one non-reference nucleotide which passes declare filters will be present in the output.
+
+Report counts by strand:
+
+  When set, nucleotide counts (NC) will be reported in reference to the aligned read's source strand. Reported as: <strand><BASE>=<COUNT>.
+
+Choose the dtype to use for storing coverage information:
+
+  This controls the maximum depth value for each nucleotide/position/strand (when specified). Smaller values require the least amount of memory, but have smaller maximal limits.
+
+  +--------+----------------------+
+  | name   | max value            |
+  +========+======================+
+  | uint8  | 255                  |
+  +--------+----------------------+
+  | uint16 | 65535                |
+  +--------+----------------------+
+  | uint32 | 4294967295           |
+  +--------+----------------------+
+  | uint64 | 18446744073709551615 |
+  +--------+----------------------+
+
 ------
 **Citation**
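
The NC field and the dtype limits described in the new help text can be illustrated with a short Python sketch. This is not part of the changeset or of the tool: `parse_nc`, `at_limit`, and `DTYPE_MAX` are hypothetical names, and the parser assumes only what the help text states, namely comma-separated `<nucleotide>=<count>` entries with an optional `+` or `-` strand prefix. How the tool itself behaves when a count reaches the dtype maximum is not stated in the source, so `at_limit` merely flags counts that equal the ceiling and may therefore be unreliable.

```python
# Illustrative sketch only; not taken from the tool's source code.

# Largest value each coverage dtype can hold (2**bits - 1),
# matching the table in the help text.
DTYPE_MAX = {
    name: 2 ** bits - 1
    for name, bits in (("uint8", 8), ("uint16", 16), ("uint32", 32), ("uint64", 64))
}


def parse_nc(nc_value):
    """Parse an NC value such as 'A=10,C=2' or '+A=6,-A=4'.

    Returns a dict mapping (strand, base) to count, where strand is
    '+', '-', or None when the entry has no strand prefix.
    """
    counts = {}
    for item in nc_value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries, e.g. a trailing comma
        key, _, count = item.partition("=")
        strand = key[0] if key[:1] in ("+", "-") else None
        base = key[1:] if strand else key
        counts[(strand, base)] = counts.get((strand, base), 0) + int(count)
    return counts


def at_limit(counts, dtype):
    """Return the (strand, base) keys whose count has reached the dtype maximum."""
    limit = DTYPE_MAX[dtype]
    return [key for key, n in counts.items() if n >= limit]


if __name__ == "__main__":
    counts = parse_nc("+A=6,-A=4,+C=255")
    print(counts)
    print(at_limit(counts, "uint8"))
```

With the default changed from `uint8` to `uint16` in this changeset, a count of 255 would no longer sit at the ceiling; the same check against `"uint16"` returns an empty list.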