SnpSift filter
You can filter a VCF file using arbitrary expressions, for instance "(QUAL > 30) | (exists INDEL) | ( countHet() > 2 )". Expressions can be combined and nested, so the filter is very flexible.
Some examples:
- I want to filter out variants with quality less than 30:
- ( QUAL >= 30 )
- ...but we also want InDels that have quality 20 or more:
- (( exists INDEL ) & (QUAL >= 20)) | (QUAL >= 30 )
- ...or any homozygous variant present in more than 3 samples:
- (countHom() > 3) | (( exists INDEL ) & (QUAL >= 20)) | (QUAL >= 30 )
- ...or any variant with at least one heterozygous sample and coverage 25 or more:
- ((countHet() > 0) & (DP >= 25)) | (countHom() > 3) | (( exists INDEL ) & (QUAL >= 20)) | (QUAL >= 30 )
- I want to keep variants where the genotype for the first sample is homozygous variant and the genotype for the second sample is reference:
- isHom( GEN[0] ) & isVariant( GEN[0] ) & isRef( GEN[1] )
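To make the logic of the combined expressions above easier to follow, here is a minimal Python sketch that mimics them on hand-made records. It is only an illustration and not how SnpSift is implemented: the record layout (a dict with "qual", "info" and "genotypes") and all helper names are hypothetical, genotypes are simplified to pairs of allele indices with no missing calls, and count_hom is assumed to count only homozygous non-reference genotypes, following the wording "homozygous variant" above.

```python
# Toy model of a VCF record (hypothetical layout, not SnpSift's data model):
#   {"qual": float, "info": dict, "genotypes": [(allele, allele), ...]}
# Allele 0 is the reference; any other index is a variant allele.

def is_hom(gt):
    """Both alleles are the same (may be reference or variant)."""
    return gt[0] == gt[1]

def is_variant(gt):
    """At least one allele is non-reference."""
    return any(allele != 0 for allele in gt)

def is_ref(gt):
    """All alleles are the reference allele."""
    return all(allele == 0 for allele in gt)

def count_het(genotypes):
    """Number of samples with two different alleles."""
    return sum(1 for gt in genotypes if not is_hom(gt))

def count_hom(genotypes):
    """Number of homozygous non-reference samples (an assumption, see above)."""
    return sum(1 for gt in genotypes if is_hom(gt) and is_variant(gt))

def keep(record):
    """Mirror of:
    ((countHet() > 0) & (DP >= 25)) | (countHom() > 3)
      | ((exists INDEL) & (QUAL >= 20)) | (QUAL >= 30)
    """
    info = record["info"]
    gts = record["genotypes"]
    return (
        (count_het(gts) > 0 and info.get("DP", 0) >= 25)
        or count_hom(gts) > 3
        or ("INDEL" in info and record["qual"] >= 20)
        or record["qual"] >= 30
    )

def first_hom_variant_second_ref(record):
    """Mirror of: isHom( GEN[0] ) & isVariant( GEN[0] ) & isRef( GEN[1] )"""
    gts = record["genotypes"]
    return is_hom(gts[0]) and is_variant(gts[0]) and is_ref(gts[1])

# Hand-made example records (invented values, for illustration only).
high_qual = {"qual": 35, "info": {}, "genotypes": [(0, 0), (0, 0)]}
indel_q22 = {"qual": 22, "info": {"INDEL": True}, "genotypes": [(0, 1), (0, 0)]}
het_deep = {"qual": 10, "info": {"DP": 40}, "genotypes": [(0, 1), (0, 0)]}
low_everything = {"qual": 10, "info": {"DP": 10}, "genotypes": [(1, 1), (0, 0)]}

for name, rec in [("high_qual", high_qual), ("indel_q22", indel_q22),
                  ("het_deep", het_deep), ("low_everything", low_everything)]:
    print(name, keep(rec), first_hom_variant_second_ref(rec))
```

Each clause joined by "|" is an independent reason to keep a variant, so adding a clause can only let more variants through, while "&" narrows a single clause.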
For complete details about this tool and the expressions it supports, please read the fine manual on the snpEff web site: http://snpeff.sourceforge.net/SnpSift.html#filter