diff snpSift_filter.xml @ 0:e1d9f6a0ad53
Uploaded
| author | jjohnson |
|---|---|
| date | Thu, 04 Jul 2013 10:43:55 -0400 |
| parents | |
| children | 6ad9205c1307 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/snpSift_filter.xml Thu Jul 04 10:43:55 2013 -0400
@@ -0,0 +1,127 @@
+<tool id="snpSift_filter" name="SnpSift Filter" version="3.2">
+    <options sanitize="False" />
+    <description>Filter variants using arbitrary expressions</description>
+    <requirements>
+        <requirement type="package" version="3.2">snpEff</requirement>
+    </requirements>
+    <command>
+        java -Xmx6G -jar \$JAVA_JAR_PATH/SnpSift.jar filter -f $input -e $exprFile $inverse $pass
+        #if $filterId and len($filterId.__str__.strip()) > 0:
+            --filterId "$filterId"
+        #end if
+        #if $addFilter and len($addFilter.__str__.strip()) > 0:
+            --addFilter "$addFilter"
+        #end if
+        #if $rmFilter and len($rmFilter.__str__.strip()) > 0:
+            --rmFilter "$rmFilter"
+        #end if
+        > $output
+    </command>
+    <inputs>
+        <param format="vcf" name="input" type="data" label="VCF input"/>
+        <param name="expr" type="text" label="Expression" size="120"/>
+        <param name="inverse" type="boolean" truevalue="--inverse" falsevalue="" checked="false" label="Inverse. Show lines that do not match filter expression"/>
+        <param name="pass" type="boolean" truevalue="--pass" falsevalue="" checked="false" label="Use 'PASS' field instead of filtering out VCF entries"/>
+        <param name="filterId" type="text" value="" optional="true" label="ID for this filter (##FILTER tag in header and FILTER VCF field)." size="10"/>
+        <param name="addFilter" type="text" value="" optional="true" label="Add a string to FILTER VCF field if 'expression' is true." size="10"/>
+        <param name="rmFilter" type="text" value="" optional="true" label="Remove a string from FILTER VCF field if 'expression' is true (and 'str' is in the field)." size="10"/>
+    </inputs>
+    <configfiles>
+        <configfile name="exprFile">
+        $expr
+        </configfile>
+    </configfiles>
+
+    <outputs>
+        <data format="vcf" name="output" />
+    </outputs>
+    <stdio>
+        <exit_code range=":-1" level="fatal" description="Error: Cannot open file" />
+        <exit_code range="1:" level="fatal" description="Error" />
+    </stdio>
+
+    <tests>
+
+        <test>
+            <param name="input" ftype="vcf" value="test01.vcf"/>
+            <param name="expr" value="QUAL >= 50"/>
+            <output name="output">
+                <assert_contents>
+                    <has_text text="28837706" />
+                    <not_has_text text="NT_166464" />
+                </assert_contents>
+            </output>
+        </test>
+
+        <test>
+            <param name="input" ftype="vcf" value="test01.vcf"/>
+            <param name="expr" value="(CHROM = '19')"/>
+            <output name="output">
+                <assert_contents>
+                    <has_text text="3205820" />
+                    <not_has_text text="NT_16" />
+                </assert_contents>
+            </output>
+        </test>
+
+        <test>
+            <param name="input" ftype="vcf" value="test01.vcf"/>
+            <param name="expr" value="(POS >= 20175) &amp; (POS &lt;= 35549)"/>
+            <output name="output">
+                <assert_contents>
+                    <has_text text="20175" />
+                    <has_text text="35549" />
+                    <has_text text="22256" />
+                    <not_has_text text="18933" />
+                    <not_has_text text="37567" />
+                </assert_contents>
+            </output>
+        </test>
+
+        <test>
+            <param name="input" ftype="vcf" value="test01.vcf"/>
+            <param name="expr" value="( DP >= 5 )"/>
+            <output name="output">
+                <assert_contents>
+                    <has_text text="DP=5;" />
+                    <has_text text="DP=6;" />
+                    <not_has_text text="DP=1;" />
+                </assert_contents>
+            </output>
+        </test>
+
+    </tests>
+
+    <help>
+
+**SnpSift filter**
+
+You can filter a VCF file using arbitrary expressions, for instance "(QUAL > 30) | (exists INDEL) | ( countHet() > 2 )". Expressions can be quite complex, which allows a lot of flexibility.
+
+Some examples:
+
+  - *I want to filter out samples with quality less than 30*:
+
+    * **( QUAL > 30 )**
+
+  - *...but we also want InDels that have quality 20 or more*:
+
+    * **(( exists INDEL ) &amp; (QUAL >= 20)) | (QUAL >= 30 )**
+
+  - *...or any homozygous variant present in more than 3 samples*:
+
+    * **(countHom() > 3) | (( exists INDEL ) &amp; (QUAL >= 20)) | (QUAL >= 30 )**
+
+  - *...or any heterozygous sample with coverage 25 or more*:
+
+    * **((countHet() > 0) &amp; (DP >= 25)) | (countHom() > 3) | (( exists INDEL ) &amp; (QUAL >= 20)) | (QUAL >= 30 )**
+
+  - *I want to keep samples where the genotype for the first sample is homozygous variant and the genotype for the second sample is reference*:
+
+    * **isHom( GEN[0] ) &amp; isVariant( GEN[0] ) &amp; isRef( GEN[1] )**
+
+
+For complete details about this tool and the expressions that can be used, please go to http://snpeff.sourceforge.net/SnpSift.html#filter
+
+    </help>
+</tool>
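The `<command>` template in this changeset only emits `--filterId`, `--addFilter`, and `--rmFilter` when the corresponding text parameter is non-blank. The conditional assembly can be sketched in Python as follows. This is an illustrative sketch, not code from the repository: `build_filter_args` and its parameter names are hypothetical, `jar_dir` stands in for `$JAVA_JAR_PATH`, and the sketch assumes SnpSift takes each option's value as a separate argument following the flag.

```python
def build_filter_args(jar_dir, input_vcf, expr_file, inverse=False,
                      use_pass=False, filter_id="", add_filter="", rm_filter=""):
    """Hypothetical helper mirroring the <command> template's logic."""
    # Fixed part of the command line (first line of the template).
    args = ["java", "-Xmx6G", "-jar", jar_dir + "/SnpSift.jar",
            "filter", "-f", input_vcf, "-e", expr_file]

    # Boolean params expand to their truevalue or to nothing.
    if inverse:
        args.append("--inverse")
    if use_pass:
        args.append("--pass")

    # Optional text params are emitted only when non-blank,
    # matching the template's "#if $x and len(strip) > 0" guards.
    optional = (("--filterId", filter_id),
                ("--addFilter", add_filter),
                ("--rmFilter", rm_filter))
    for flag, value in optional:
        if value and len(value.strip()) > 0:
            # Assumption: the value follows the flag as its own argument.
            args.extend([flag, value])
    return args


args = build_filter_args("/opt/snpEff", "in.vcf", "expr.txt",
                         inverse=True, filter_id="LowQual", add_filter="   ")
print(" ".join(args))
```

In the call above, the whitespace-only `add_filter` is skipped, just as the template's `strip()` guard would skip it.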

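The third test in the diff filters on a position range and asserts which positions survive. The same predicate can be sketched by hand in Python to make the expected behaviour concrete. This is a manual translation of the expression `(POS >= 20175) & (POS <= 35549)`, not SnpSift's own expression parser; the function name is hypothetical, and the position list is taken only from the values named in the test's assertions.

```python
def in_pos_range(pos, low=20175, high=35549):
    """Hand-written equivalent of (POS >= 20175) & (POS <= 35549)."""
    return (pos >= low) and (pos <= high)


# Positions named in the test's has_text / not_has_text assertions.
positions = [18933, 20175, 22256, 35549, 37567]
kept = [p for p in positions if in_pos_range(p)]
dropped = [p for p in positions if not in_pos_range(p)]
print("kept:", kept)
print("dropped:", dropped)
```

Both bounds are inclusive, which is why the test expects 20175 and 35549 themselves to appear in the output.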