annotate variant_effect_predictor/README.txt @ 0:1f6dce3d34e0

Uploaded
author mahtabm
date Thu, 11 Apr 2013 02:01:53 -0400
############################
#                          #
# Variant Effect Predictor #
#                          #
############################

Copyright (c) 1999-2011 The European Bioinformatics Institute and
Genome Research Limited. All rights reserved.

This software is distributed under a modified Apache license.
For license details, please see

http://www.ensembl.org/info/about/code_licence.html

Please email comments or questions to the public Ensembl
developers list at <dev@ensembl.org>.

Questions may also be sent to the Ensembl help desk at
<helpdesk@ensembl.org>

Quickstart
==========

Install API and cache files, run in offline mode:

perl INSTALL.pl
perl variant_effect_predictor.pl --offline

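As a minimal sketch of what an offline run might look like once INSTALL.pl
has fetched the API and cache files: the block below writes a small input
file in the default whitespace-separated format (chromosome, start, end,
allele string, strand) and assembles the command. The file names
example_input.txt and example_output.txt are hypothetical, -i and -o are
assumed to be the input and output flags (check --help), and the command is
only run if the script is found in the current directory.

```shell
# Two example variants in the default input format:
# chromosome, start, end, allele string, strand (whitespace-separated).
# The first line is an insertion, written with start = end + 1.
cat > example_input.txt <<'EOF'
1 881907 881906 -/C +
5 140532 140532 T/C +
EOF

# Hypothetical file names; -i/-o are assumed to be the input/output flags.
VEP=variant_effect_predictor.pl
VEP_ARGS="--offline -i example_input.txt -o example_output.txt"
echo "perl $VEP $VEP_ARGS"

# Only run if the script is present. $VEP_ARGS is left unquoted on purpose
# so the shell splits it into separate arguments.
if [ -f "$VEP" ]; then perl "$VEP" $VEP_ARGS; fi
```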

Documentation
=============

For a summary of command line flags, run:

perl variant_effect_predictor.pl --help

For full documentation see

http://www.ensembl.org/info/docs/variation/vep/vep_script.html



Changelog
=========

New in version 2.6 (July 2012)
------------------------------

- support for structural variant consequences

- Sequence Ontology (SO) consequence terms now default

- script runtime 3-4x faster when using forking

- 1000 Genomes global MAF available in cache files

- improved memory usage

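To illustrate the forking speed-up, a run can be split across several
processes. The sketch below assumes the --fork flag takes the number of
processes to use (check --help for this release) and derives that number
from the machine's CPU count. The input and output file names are
hypothetical.

```shell
# Use one forked process per available CPU, falling back to 2 if the
# count cannot be determined. --fork is assumed to take a process count.
FORKS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)

VEP=variant_effect_predictor.pl
# Hypothetical input/output file names.
VEP_ARGS="--offline --fork $FORKS -i example_input.txt -o example_output.txt"
echo "perl $VEP $VEP_ARGS"

# Only run if the script is present. $VEP_ARGS is left unquoted on purpose
# so the shell splits it into separate arguments.
if [ -f "$VEP" ]; then perl "$VEP" $VEP_ARGS; fi
```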

New in version 2.5 (May 2012)
-----------------------------

- SIFT and PolyPhen predictions now available for RefSeq transcripts

- retrieve cell type-specific regulatory consequences

- consequences can be retrieved based on a single individual's genotype in
  a VCF input file

- find overlapping structural variants

- Condel support removed from main script and moved to a plugin


New in version 2.4 (February 2012)
----------------------------------

- offline mode and new installer script make it easy to use the VEP without
  the usual dependencies

- output columns configurable using the --fields flag

- VCF output support expanded, can now carry all fields

- output affected exon and intron numbers with --numbers

- output overlapping protein domains using --domains

- enhanced support for LRGs

- plugins now work on variants called as intergenic

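The output flags introduced in 2.4 can be combined in one run. The
following is a sketch only: it assumes --fields takes a comma-separated
list of column names, and that EXON, INTRON and DOMAINS are the field names
produced by --numbers and --domains. Check the header of an actual output
file for the exact names. The input and output file names are hypothetical.

```shell
# Columns to keep in the output, as a comma-separated list.
# EXON, INTRON and DOMAINS are assumed names for the --numbers and
# --domains fields; the rest are default output columns.
FIELDS="Uploaded_variation,Location,Allele,Gene,Feature,Consequence,EXON,INTRON,DOMAINS"

VEP=variant_effect_predictor.pl
# Hypothetical input/output file names.
VEP_ARGS="--offline -i example_input.txt -o example_output.txt --numbers --domains --fields $FIELDS"
echo "perl $VEP $VEP_ARGS"

# Only run if the script is present. $VEP_ARGS is left unquoted on purpose
# so the shell splits it into separate arguments.
if [ -f "$VEP" ]; then perl "$VEP" $VEP_ARGS; fi
```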

New in version 2.3 (December 2011)
----------------------------------

- Add custom annotations from tabix-indexed files (BED, GFF, GTF, VCF, bigWig)

- Add new functionality to the VEP with user-written plugins

- Filter input on consequence type

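A custom annotation file has to be sorted, compressed and indexed before it
can be used. The sketch below prepares a small hypothetical BED file with
the external bgzip and tabix tools (skipped if they are not installed) and
then passes it to the script. The --custom argument layout shown here
(file, short name, file type, annotation type) is an assumption; see the
online documentation for the exact syntax.

```shell
# Hypothetical BED file with two annotated regions
# (tab-separated: chromosome, start, end, name).
printf '1\t881000\t882000\tregionA\n5\t140000\t141000\tregionB\n' > my_features.bed

# Sort by chromosome and position, compress with bgzip, index with tabix.
# Both are external tools; this step is skipped if either is missing.
if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    sort -k1,1 -k2,2n -k3,3n my_features.bed | bgzip > my_features.bed.gz
    tabix -f -p bed my_features.bed.gz
fi

# Assumed --custom layout: file,short_name,file_type,annotation_type
CUSTOM="my_features.bed.gz,myFeatures,bed,overlap"
VEP=variant_effect_predictor.pl
echo "perl $VEP -i example_input.txt -o example_output.txt --custom $CUSTOM"

# Only run if both the script and the tabix index are present.
if [ -f "$VEP" ] && [ -f my_features.bed.gz.tbi ]; then
    perl "$VEP" -i example_input.txt -o example_output.txt --custom "$CUSTOM"
fi
```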

Version 2.2 (September 2011)
----------------------------

- SIFT, PolyPhen and Condel predictions and regulatory features now accessible
  from the cache

- Support for calling consequences against RefSeq transcripts

- Variant identifiers (e.g. dbSNP rsIDs) and HGVS notations supported as input
  format

- Variants can now be filtered by frequency in HapMap and 1000 Genomes
  populations

- Script can be used to convert files between formats (Ensembl/VCF/Pileup/HGVS
  to Ensembl/VCF/Pileup)

- Large amount of code moved to API modules to ensure consistency between web
  and script VEP

- Memory usage optimisations

- VEP script moved to ensembl-tools CVS module

- Added --canonical, --per_gene and --no_intergenic options

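As an illustration of identifier input combined with the new options, the
sketch below writes a file of dbSNP rsIDs (one per line) and builds a
command that uses --canonical and --no_intergenic. It assumes that the
input format is detected automatically and that identifier lookups need a
database connection, so --offline is not used. The file names are
hypothetical and the rsIDs are only examples.

```shell
# Hypothetical input file of variant identifiers, one per line.
cat > example_ids.txt <<'EOF'
rs699
rs171
EOF

VEP=variant_effect_predictor.pl
# --canonical and --no_intergenic are the options named in the changelog;
# the input format is assumed to be auto-detected from the file contents.
VEP_ARGS="-i example_ids.txt -o example_ids_output.txt --canonical --no_intergenic"
echo "perl $VEP $VEP_ARGS"

# Only run if the script is present. $VEP_ARGS is left unquoted on purpose
# so the shell splits it into separate arguments.
if [ -f "$VEP" ]; then perl "$VEP" $VEP_ARGS; fi
```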

Version 2.1 (June 2011)
-----------------------

- ability to use local file cache in place of or alongside connecting to an
  Ensembl database

- significant improvements to speed of script

- whole-genome mode now default (no disadvantage for smaller datasets)

- improved status output with progress bars

- regulatory region consequences now reinstated and improved

- modification to output file - Transcript column is now Feature, and is
  followed by a Feature_type column

- full documentation now online


Version 2.0 (April 2011)
------------------------

Version 2.0 of the Variant Effect Predictor script (VEP) constitutes a complete
overhaul of both the script and the API behind it. It requires at least version
62 of the Ensembl API to function. Here follows a summary of the changes:

- support for SIFT, PolyPhen and Condel non-synonymous predictions in human

- per-allele and compound consequence types

- support for Sequence Ontology (SO) and NCBI consequence terms

- modified output format
  - support for new output fields in Extra column
  - header section containing information on database and software versions
  - codon change shown in output
  - CDS position shown in output
  - option to output Ensembl protein identifiers
  - option to output HGVS nomenclature for variants

- support for gzipped input files

- enhanced configuration options, including the ability to read configuration
  from a file

- verbose output now much more useful

- whole-genome mode now more stable

- finding existing co-located variations now ~5x faster
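
Gzipped input and file-based configuration can be used together. The sketch
below is an illustration under stated assumptions: the --config flag and the
config file layout (whitespace-separated option/value pairs, one per line)
are taken from the online documentation rather than from this README, and
all file names are hypothetical.

```shell
# Hypothetical gzipped input file holding one variant in the default format.
printf '1 881907 881906 -/C +\n' | gzip -c > example_input.txt.gz

# Config file: one option name and its value per line, separated by
# whitespace (layout assumed from the online documentation).
cat > vep_example.ini <<'EOF'
input_file   example_input.txt.gz
output_file  example_output.txt
species      homo_sapiens
EOF

VEP=variant_effect_predictor.pl
echo "perl $VEP --config vep_example.ini"

# Only run if the script is present in the current directory.
if [ -f "$VEP" ]; then perl "$VEP" --config vep_example.ini; fi
```

Keeping the options in a file makes a run easier to repeat, since the same
settings can be reused without retyping a long command line.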