comparison cgatools_suite/tools/cgatools/snpdiff.xml @ 7:96829b1b73ea draft

Uploaded
author bcrain-completegenomics
date Wed, 06 Jun 2012 16:58:26 -0400
6:e4eff539a999 7:96829b1b73ea
1 <tool id="cga_snpdiff" name="snpdiff" version="0.0.1">
2
3 <description>compares snp calls to a Complete Genomics variant file.</description> <!--adds description in toolbar-->
4
5 <requirements>
6 <requirement type="binary">cgatools</requirement>
7 </requirements>
8
9 <command> <!--run executable-->
10 cgatools snpdiff --beta -h
11 </command>
12
13 <outputs>
14 <data format="tabular" name="output" />
15 </outputs>
16
17 <inputs>
18 </inputs>
19
20 <help>
21
22 **What it does**
23
24 This tool compares snp calls to a Complete Genomics variant file.
25
26 cgatools: http://sourceforge.net/projects/cgatools/files/
27
28 -----
29
30 **cgatools Manual**::
31
32 COMMAND NAME
33 snpdiff - Compares snp calls to a Complete Genomics variant file.
34
35 DESCRIPTION
36 Compares the snp calls in the "genotypes" file to the calls in a Complete
37 Genomics variant file. The genotypes file is a tab-delimited file with at
38 least the following columns (additional columns may be given):
39
40 Chromosome (Required) The name of the chromosome.
41 Offset0Based (Required) The 0-based offset in the chromosome.
42 GenotypesStrand (Optional) The strand of the calls in the Genotypes
43 column (+ or -, defaults to +).
44 Genotypes (Optional) The calls, one per allele. The following
45 calls are recognized:
46 A,C,G,T A called base.
47 N A no-call.
48 - A deleted base.
49 . A non-snp variation.
50
51 The output is a tab-delimited file consisting of the columns of the
52 original genotypes file, plus the following additional columns:
53
54 Reference The reference base at the given position.
55 VariantFile The calls made by the variant file, one per allele.
56 The character codes are the same as described for
57 the Genotypes column.
58 DiscordantAlleles (Only if Genotypes is present) The number of
59 Genotypes alleles that are discordant with calls in
60 the VariantFile. If the VariantFile is described as
61 haploid at the given position but the Genotypes is
62 diploid, then each genotype allele is compared
63 against the haploid call of the VariantFile.
64 NoCallAlleles (Only if Genotypes is present) The number of
65 Genotypes alleles that were no-called by the
66 VariantFile. If the VariantFile is described as
67 haploid at the given position but the Genotypes is
68 diploid, then a VariantFile no-call is counted twice.
69
70 The verbose output is a tab-delimited file consisting of the columns of the
71 original genotypes file, plus the following additional columns:
72
73 Reference The reference base at the given position.
74 VariantFile The call made by the variant file for one allele (there is
75 a line in this file for each allele). The character codes
76 are the same as described for the Genotypes column.
77 [CALLS] The rest of the columns are pasted in from the VariantFile,
78 describing the variant file line used to make the call.
79
80 The stats output is a comma-separated file with several tables describing
81 the results of the snp comparison, for each diploid genotype. The tables
82 all describe the comparison result (column headers) versus the genotype
83 classification (row labels) in different ways. The "Locus classification"
84 tables have the most detailed match classifications, while the "Locus
85 concordance" tables roll these match classifications up into "discordance"
86 and "no-call". A locus is considered discordant if it is discordant for
87 either allele. A locus is considered no-call if it is concordant for both
88 alleles but has a no-call on either allele. The "Allele concordance"
89 describes the comparison result on a per-allele basis.
90
91 OPTIONS
92 -h [ --help ]
93 Print this help message.
94
95 --reference arg
96 The input crr file.
97
98 --variants arg
99 The input variant file.
100
101 --genotypes arg
102 The input genotypes file.
103
104 --output-prefix arg
105 The path prefix for all output reports.
106
107 --reports arg (=Output,Verbose,Stats)
108 Comma-separated list of reports to generate. A report is one of:
109 Output The output genotypes file.
110 Verbose The verbose output file.
111 Stats The stats output file.
112
113 SUPPORTED FORMAT_VERSION
114 0.3 or later
115 </help>
116 </tool>
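
The manual text in the help block describes how the per-allele columns of the Output report (DiscordantAlleles, NoCallAlleles) roll up into the locus-level labels used by the "Locus concordance" tables: a locus is discordant if either allele is discordant, and no-call if nothing is discordant but either allele was no-called. The rule can be sketched in Python as below. This is only an illustration and not part of cgatools or of this wrapper: the helper names `classify_locus` and `summarize_output_report`, the "concordant" label for the remaining case, the sample rows, and the assumption that the report begins with a header row carrying the column names from the manual (possibly prefixed with ">") are all hypothetical.

```python
import csv
import io
from collections import Counter


def classify_locus(discordant_alleles: int, no_call_alleles: int) -> str:
    """Roll per-allele counts up to a locus-level label (hypothetical helper)."""
    if discordant_alleles > 0:
        return "discordant"  # discordant for either allele
    if no_call_alleles > 0:
        return "no-call"     # nothing discordant, but a no-call on either allele
    return "concordant"      # label chosen here for the remaining case


def summarize_output_report(handle) -> Counter:
    """Count locus labels in an Output report read from a text handle."""
    # Assumption: blank lines and "#" metadata lines, if any, can be skipped.
    rows = (line for line in handle if line.strip() and not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    # Assumption: the header row may carry a leading ">" on its first name.
    reader.fieldnames = [name.lstrip(">") for name in (reader.fieldnames or [])]
    counts = Counter()
    for row in reader:
        label = classify_locus(int(row["DiscordantAlleles"]),
                               int(row["NoCallAlleles"]))
        counts[label] += 1
    return counts


# Made-up rows for illustration only; not real snpdiff output.
SAMPLE = (
    "Chromosome\tOffset0Based\tGenotypes\tReference\tVariantFile"
    "\tDiscordantAlleles\tNoCallAlleles\n"
    "chr1\t1000\tAG\tA\tAG\t0\t0\n"
    "chr1\t2000\tCC\tC\tCT\t1\t0\n"
    "chr2\t3000\tGT\tG\tGN\t0\t1\n"
)

if __name__ == "__main__":
    for label, n in sorted(summarize_output_report(io.StringIO(SAMPLE)).items()):
        print(label, n)
```

Checking discordance before no-calls mirrors the order in which the manual states the two conditions, so a locus with one discordant and one no-called allele is counted as discordant.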