annotate snpSift_annotate.xml @ 5:8952990fcab9

Update to snpEff version 3.4 and add data managers to download snpEff genome reference databases
author Jim Johnson <jj@umn.edu>
date Wed, 27 Nov 2013 09:11:32 -0600
parents 6ad9205c1307
children 0ad9733e22a4
rev   line source
5
8952990fcab9 Update to snpEff version 3.4 and add data managers to download snpEff genome reference databases
Jim Johnson <jj@umn.edu>
parents: 2
diff changeset
1 <tool id="snpSift_annotate" name="SnpSift Annotate" version="3.4">
0
e1d9f6a0ad53 Uploaded
jjohnson
parents:
diff changeset
2 <description>Annotate SNPs from dbSnp</description>
3 <!--
4 You will need to change the path to wherever your installation is.
5 To change the amount of memory used, adjust the -Xmx parameter (e.g. use -Xmx2G for 2 GB of memory).
6 -->
7 <requirements>
5
8 <requirement type="package" version="3.4">snpEff</requirement>
0
9 </requirements>
10 <command>
2
6ad9205c1307 Update to SnpEff version 3.3
Jim Johnson <jj@umn.edu>
parents: 0
diff changeset
11 java -Xmx6G -jar \$SNPEFF_JAR_PATH/SnpSift.jar $annotate_cmd
0
12 #if $annotate.id :
13 -id
14 #elif $annotate.info_ids.__str__.strip() != '' :
15 -info "$annotate.info_ids"
16 #end if
17 -q $dbSnp $input > $output
18 </command>
19 <inputs>
20 <param format="vcf" name="input" type="data" label="VCF input"/>
21 <param format="vcf" name="dbSnp" type="data" label="VCF File with ID field annotated (e.g. dbSNP.vcf)"
22 help="The ID field for a variant in input will be assigned from a matching variant in this file."/>
23 <conditional name="annotate">
24 <param name="id" type="boolean" truevalue="id" falsevalue="info" checked="True" label="Only annotate ID field (do not add INFO field)" help=""/>
25 <when value="id"/>
26 <when value="info">
5
27 <param name="info_ids" type="text" value="" size="60" optional="true" label="Limit INFO annotation to these INFO IDs"
0
28 help="A comma-separated list of INFO field IDs. When blank, all INFO fields are included.">
29 <validator type="regex" message="IDs separated by commas">^(([a-zA-Z][a-zA-Z0-9_-]*)(,[a-zA-Z][a-zA-Z0-9_-]*)*)?$</validator>
30 </param>
31 </when>
32 </conditional>
5
33 <param name="annotate_cmd" type="boolean" truevalue="annMem" falsevalue="annotate" checked="false" label="Annotate in Memory">
34 <help>
35 Allows unsorted VCF files, but loads the entire 'database' VCF file into memory, which may not be practical for large 'database' VCF files.
36 Otherwise, both the database and the input VCF files should be sorted by position (chromosome sort order can differ between files).
37 </help>
38 </param>
0
39 </inputs>
40 <stdio>
41 <exit_code range=":-1" level="fatal" description="Error: Cannot open file" />
42 <exit_code range="1:" level="fatal" description="Error" />
43 </stdio>
44
45 <outputs>
46 <data format="vcf" name="output" />
47 </outputs>
48 <tests>
49 <test>
50 <param name="input" ftype="vcf" value="annotate_1.vcf"/>
51 <param name="dbSnp" ftype="vcf" value="db_test_1.vcf"/>
52 <param name="annotate_cmd" value="False"/>
53 <param name="id" value="True"/>
54 <output name="output">
55 <assert_contents>
56 <has_text text="rs76166080" />
57 </assert_contents>
58 </output>
59 </test>
60 </tests>
61 <help>
62
63 This is typically used to annotate IDs from dbSnp.
64
65 For details about this tool, please go to http://snpeff.sourceforge.net/SnpSift.html#annotate
66
5
67 Annotating only the ID field from dbSnp137.vcf ::
68
69 Input VCF:
70 #CHROM POS ID REF ALT QUAL FILTER INFO
71 22 16157571 . T G 0.0 FAIL NS=53
72 22 16346045 . T C 0.0 FAIL NS=244
73 22 16350245 . C A 0.0 FAIL NS=192
74
75 Annotated Output VCF:
76 #CHROM POS ID REF ALT QUAL FILTER INFO
77 22 16157571 . T G 0.0 FAIL NS=53
78 22 16346045 rs56234788 T C 0.0 FAIL NS=244
79 22 16350245 rs2905295 C A 0.0 FAIL NS=192
80
81
82
83 Annotating both the ID and INFO fields from dbSnp137.vcf ::
84
85 Input VCF:
86 #CHROM POS ID REF ALT QUAL FILTER INFO
87 22 16157571 . T G 0.0 FAIL NS=53
88 22 16346045 . T C 0.0 FAIL NS=244
89 22 16350245 . C A 0.0 FAIL NS=192
90
91 Annotated Output VCF:
92 #CHROM POS ID REF ALT QUAL FILTER INFO
93 22 16157571 . T G 0.0 FAIL NS=53
94 22 16346045 rs56234788 T C 0.0 FAIL NS=244;RSPOS=16346045;GMAF=0.162248628884826;dbSNPBuildID=129;SSR=0;SAO=0;VP=050100000000000100000100;WGT=0;VC=SNV;SLO;GNO
95 22 16350245 rs2905295 C A 0.0 FAIL NS=192;RSPOS=16350245;GMAF=0.230804387568556;dbSNPBuildID=101;SSR=1;SAO=0;VP=050000000000000100000140;WGT=0;VC=SNV;GNO
96
97
98 SnpEff citation:
99 "A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.", Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM. Fly (Austin). 2012 Apr-Jun;6(2):80-92. PMID: 22728672 [PubMed - in process]
100
101 SnpSift citation:
102 "Using Drosophila melanogaster as a model for genotoxic chemical mutational studies with a new program, SnpSift", Cingolani, P., et al., Frontiers in Genetics, 3, 2012.
103
104
0
105 </help>
106 </tool>
107
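
The `<command>` template and the `info_ids` validator in this tool can be sketched in plain Python to show which arguments each option combination yields. This is only an illustration under stated assumptions: Galaxy itself renders the template with Cheetah and redirects standard output to the output dataset, and the function names `is_valid_info_ids` and `build_command` are hypothetical. The regular expression and the flags (`-id`, `-info`, `-q`, `annotate`, `annMem`) are copied from the tool definition; the jar directory and file names in the usage lines are placeholders.

```python
import re

# Regex copied verbatim from the <validator> on the info_ids parameter.
INFO_IDS_RE = re.compile(r"^(([a-zA-Z][a-zA-Z0-9_-]*)(,[a-zA-Z][a-zA-Z0-9_-]*)*)?$")


def is_valid_info_ids(value):
    """Return True if value is blank or a comma-separated list of IDs."""
    return INFO_IDS_RE.match(value) is not None


def build_command(jar_dir, db_vcf, input_vcf, only_id=True, info_ids="", in_memory=False):
    """Assemble the argument list the <command> template would produce.

    Hypothetical helper, not part of Galaxy or SnpSift.
    """
    args = ["java", "-Xmx6G", "-jar", jar_dir + "/SnpSift.jar"]
    # The annotate_cmd boolean selects the in-memory or the sorted-file mode.
    args.append("annMem" if in_memory else "annotate")
    if only_id:
        # Mirrors "#if $annotate.id": annotate the ID field only.
        args.append("-id")
    elif info_ids.strip() != "":
        # Mirrors the "#elif" branch: restrict INFO annotation to these IDs.
        if not is_valid_info_ids(info_ids):
            raise ValueError("IDs separated by commas expected: %r" % info_ids)
        args.extend(["-info", info_ids])
    # With neither branch taken, all INFO fields are annotated.
    args.extend(["-q", db_vcf, input_vcf])
    return args


if __name__ == "__main__":
    print(is_valid_info_ids("GMAF,VC"))
    print(is_valid_info_ids("GMAF, VC"))
    print(" ".join(build_command("/path/to/snpEff", "dbSnp137.vcf", "input.vcf",
                                 only_id=False, info_ids="GMAF,VC")))
```

Returning a list of arguments rather than a single string keeps the sketch usable with `subprocess.run` without shell quoting; the real template instead quotes `$annotate.info_ids` inside a shell command line.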