annotate variant_effect_predictor/README.txt @ 0:2bc9b66ada89 draft default tip

Uploaded
author mahtabm
date Thu, 11 Apr 2013 06:29:17 -0400

############################
#                          #
# Variant Effect Predictor #
#                          #
############################

Copyright (c) 1999-2011 The European Bioinformatics Institute and
Genome Research Limited.  All rights reserved.

This software is distributed under a modified Apache license.
For license details, please see

  http://www.ensembl.org/info/about/code_licence.html

Please email comments or questions to the public Ensembl
developers list at <dev@ensembl.org>.

Questions may also be sent to the Ensembl help desk at
<helpdesk@ensembl.org>

Quickstart
==========

Install API and cache files, run in offline mode:

  perl INSTALL.pl
  perl variant_effect_predictor.pl --offline
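
As a rough illustration of the offline run above, the following Python
sketch writes a one-variant VCF file and assembles a matching command
line. The file names and the variant are made up for the example, and
--input_file / --output_file are used here as the script's long-form
input and output options. The command is only executed if
variant_effect_predictor.pl is found in the current directory.

```python
import os
import subprocess
import tempfile

# Assumption: the script sits in the current directory, as in the Quickstart.
VEP_SCRIPT = "variant_effect_predictor.pl"


def write_example_vcf(path):
    """Write a minimal VCF file holding one made-up SNV (illustration only)."""
    lines = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "5\t140532\t.\tT\tC\t.\t.\t.",
    ]
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")


def build_offline_command(input_path, output_path, script=VEP_SCRIPT):
    """Return the argument list for an offline run on the given files."""
    return ["perl", script, "--offline",
            "--input_file", input_path,
            "--output_file", output_path]


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    vcf_path = os.path.join(workdir, "example.vcf")    # hypothetical name
    out_path = os.path.join(workdir, "example_out.txt")  # hypothetical name
    write_example_vcf(vcf_path)
    command = build_offline_command(vcf_path, out_path)
    print(" ".join(command))
    # Only launch Perl when the script is actually present.
    if os.path.exists(VEP_SCRIPT):
        subprocess.call(command)
```

Building the command as a list rather than a single string avoids shell
quoting problems when paths contain spaces.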


Documentation
=============

For a summary of command line flags, run:

  perl variant_effect_predictor.pl --help

For full documentation see

  http://www.ensembl.org/info/docs/variation/vep/vep_script.html



Changelog
=========

New in version 2.6 (July 2012)
------------------------------

- support for structural variant consequences

- Sequence Ontology (SO) consequence terms now default

- script runtime 3-4x faster when using forking

- 1000 Genomes global MAF available in cache files

- improved memory usage


New in version 2.5 (May 2012)
-----------------------------

- SIFT and PolyPhen predictions now available for RefSeq transcripts

- retrieve cell type-specific regulatory consequences

- consequences can be retrieved based on a single individual's genotype in
  a VCF input file

- find overlapping structural variants

- Condel support removed from main script and moved to a plugin


New in version 2.4 (February 2012)
----------------------------------

- offline mode and new installer script make it easy to use the VEP without
  the usual dependencies

- output columns configurable using the --fields flag

- VCF output support expanded, can now carry all fields

- output affected exon and intron numbers with --numbers

- output overlapping protein domains using --domains

- enhanced support for LRGs

- plugins now work on variants called as intergenic
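
The output flags listed above can be combined on one command line. The
Python sketch below assembles such a command; it assumes that --fields
takes a comma-separated list of column names and that --numbers and
--domains add EXON, INTRON and DOMAINS entries that can be named in that
list. The helper name and file names are hypothetical, so check
--help for the exact field names in your version.

```python
def build_fields_command(fields, input_path, output_path,
                         numbers=False, domains=False):
    """Return an argument list selecting output columns with --fields."""
    command = ["perl", "variant_effect_predictor.pl", "--offline",
               "--input_file", input_path,
               "--output_file", output_path]
    if numbers:
        command.append("--numbers")   # affected exon and intron numbers
    if domains:
        command.append("--domains")   # overlapping protein domains
    if fields:
        # Assumption: column names are joined with commas, no spaces.
        command.extend(["--fields", ",".join(fields)])
    return command


if __name__ == "__main__":
    wanted = ["Uploaded_variation", "Location", "Allele", "Gene",
              "Consequence", "EXON", "INTRON", "DOMAINS"]
    print(" ".join(build_fields_command(wanted, "input.vcf", "output.txt",
                                        numbers=True, domains=True)))
```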


New in version 2.3 (December 2011)
----------------------------------

- Add custom annotations from tabix-indexed files (BED, GFF, GTF, VCF, bigWig)

- Add new functionality to the VEP with user-written plugins

- Filter input on consequence type


Version 2.2 (September 2011)
----------------------------

- SIFT, PolyPhen and Condel predictions and regulatory features now accessible
  from the cache

- Support for calling consequences against RefSeq transcripts

- Variant identifiers (e.g. dbSNP rsIDs) and HGVS notations supported as input
  format

- Variants can now be filtered by frequency in HapMap and 1000 Genomes
  populations

- Script can be used to convert files between formats (Ensembl/VCF/Pileup/HGVS
  to Ensembl/VCF/Pileup)

- Large amount of code moved to API modules to ensure consistency between web
  and script VEP

- Memory usage optimisations

- VEP script moved to ensembl-tools CVS module

- Added --canonical, --per_gene and --no_intergenic options
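
To give an idea of what the format conversion mentioned above involves,
here is a minimal Python sketch that rewrites a VCF line as a line in the
Ensembl default input format (chromosome, start, end, allele string,
strand, optional identifier). It is not the script's own converter: it
only handles single-base substitutions and always reports the forward
strand, and the function name is made up for the example.

```python
def vcf_snv_to_ensembl(vcf_line):
    """Convert one VCF data line describing an SNV to Ensembl default format."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt = fields[:5]
    # Indels and multi-allelic sites need coordinate adjustments that this
    # sketch does not attempt.
    if len(ref) != 1 or len(alt) != 1 or alt == ".":
        raise ValueError("only single-base substitutions are handled here")
    parts = [chrom, pos, pos, "%s/%s" % (ref, alt), "+"]
    if var_id != ".":
        parts.append(var_id)  # keep the identifier when the VCF has one
    return "\t".join(parts)


if __name__ == "__main__":
    # Made-up variant, for illustration only.
    print(vcf_snv_to_ensembl("5\t140532\t.\tT\tC\t.\t.\t."))
```

For an SNV the start and end coordinates are identical, which is why the
VCF position is simply repeated.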


Version 2.1 (June 2011)
-----------------------

- ability to use local file cache in place of or alongside connecting to an
  Ensembl database

- significant improvements to speed of script

- whole-genome mode now default (no disadvantage for smaller datasets)

- improved status output with progress bars

- regulatory region consequences now reinstated and improved

- modification to output file: the Transcript column is now Feature, and is
  followed by a Feature_type column

- full documentation now online


Version 2.0 (April 2011)
------------------------

Version 2.0 of the Variant Effect Predictor script (VEP) constitutes a complete
overhaul of both the script and the API behind it. It requires at least version
62 of the Ensembl API to function. Here follows a summary of the changes:

- support for SIFT, PolyPhen and Condel non-synonymous predictions in human

- per-allele and compound consequence types

- support for Sequence Ontology (SO) and NCBI consequence terms

- modified output format
  - support for new output fields in Extra column
  - header section containing information on database and software versions
  - codon change shown in output
  - CDS position shown in output
  - option to output Ensembl protein identifiers
  - option to output HGVS nomenclature for variants

- support for gzipped input files

- enhanced configuration options, including the ability to read configuration
  from a file

- verbose output now much more useful

- whole-genome mode now more stable

- finding existing co-located variations now ~5x faster
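
The output format described in the changelog (a header section, named
columns and an Extra column of additional fields) can be read with a few
lines of Python. The sketch below assumes a tab-separated layout in which
header lines start with "##", the column names sit on a line starting
with a single "#", and Extra holds key=value pairs separated by
semicolons. The sample row and its identifiers are made up; verify the
layout against a real output file before relying on it.

```python
def parse_extra(extra):
    """Split an Extra column value such as 'A=1;B=2' into a dict."""
    if extra in ("", "-"):
        return {}
    result = {}
    for item in extra.split(";"):
        if not item:
            continue
        key, _, value = item.partition("=")
        result[key] = value
    return result


def parse_vep_output(lines):
    """Yield one dict per data line, with the Extra column expanded."""
    columns = None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue                      # header section
        if line.startswith("#"):
            columns = line[1:].split("\t")  # column names
            continue
        if columns is None:
            raise ValueError("data line seen before the column header")
        row = dict(zip(columns, line.split("\t")))
        row["Extra"] = parse_extra(row.get("Extra", "-"))
        yield row


# Column names as assumed for this sketch; the parser itself takes them
# from the file's own header line.
HEADER = ["Uploaded_variation", "Location", "Allele", "Gene", "Feature",
          "Feature_type", "Consequence", "cDNA_position", "CDS_position",
          "Protein_position", "Amino_acids", "Codons", "Existing_variation",
          "Extra"]

# Entirely made-up sample data.
SAMPLE = [
    "## placeholder for the version information header",
    "#" + "\t".join(HEADER),
    "\t".join(["var_1", "5:140532", "T", "GENE_EXAMPLE", "TRANSCRIPT_EXAMPLE",
               "Transcript", "missense_variant", "1043", "1001", "334",
               "T/M", "aCg/aTg", "-",
               "SIFT=tolerated(0.35);PolyPhen=benign(0.01)"]),
]

if __name__ == "__main__":
    for row in parse_vep_output(SAMPLE):
        print(row["Location"], row["Consequence"], row["Extra"].get("SIFT"))
```

Reading the column names from the file rather than hard-coding them keeps
the parser usable when columns are renamed or reduced with --fields.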