Mercurial > repos > mahtabm > ensembl
changeset 2:a5976b2dce6f
changing default values for ensembl database
| author | mahtabm | 
|---|---|
| date | Thu, 11 Apr 2013 17:15:42 +1000 | 
| parents | 1f6dce3d34e0 | 
| children | d30fa12e4cc5 | 
| files | variant_effect_predictor/variant_effect_predictor.xml | 
| diffstat | 1 files changed, 14 insertions(+), 148 deletions(-) | 
--- a/variant_effect_predictor/variant_effect_predictor.xml Thu Apr 11 02:01:53 2013 -0400
+++ b/variant_effect_predictor/variant_effect_predictor.xml Thu Apr 11 17:15:42 2013 +1000
@@ -1,7 +1,7 @@
 <tool id="ensembl" name="ENSEMBL variant effect predictor">
 <description>to annotate variants using an ENSEMBL database</description>
 <command interpreter="perl">
- variant_effect_predictor.pl -i=$input -o=$output -species=$species
+ variant_effect_predictor.pl -i=$input -o=$output -species=$species --sift b --polyphen b
 #if $database_options.database_options_selector == "advanced"
 --host=$database_options.host --user=$database_options.username --port=$database_options.portnum
 #if $database_options.password
@@ -9,103 +9,26 @@
 #end if
 #else
 ## hardcoded default values - bad?
- --host=www.ebi.edu.au --user=anonymous --port=3306
- #end if
- #if $parameters.everything
- --everything
- #else
- #if $parameters.sift_options.sift
- --sift $parameters.sift_options.sift_value.value
- #end if
- #if $parameters.polyphen_options.polyphen
- --polyphen $parameters.polyphen_options.polyphen_value.value
- #end if
- #if $parameters.ccds
- --ccds
- #end if
- #if $parameters.hgvs
- --hgvs
- #end if
- #if $parameters.hgnc
- --hgnc
- #end if
- #if $parameters.numbers
- --numbers
- #end if
- #if $parameters.domains
- --domains
- #end if
- #if $parameters.regulatory
- --regulatory
- #end if
- #if $parameters.canonical
- --canonical
- #end if
- #if $parameters.protein
- --protein
- #end if
- #if $parameters.gmaf
- --gmaf
- #end if
+ --host=ensembldb.ensembl.org --user=anonymous --port=5306
 #end if
 </command>
 <inputs>
 <param format="vcf" name="input" type="data" label="Input variants file" help="This should be a variant file in vcf format."/> <!-- TODO: allow other variant format types? -->
- <param name="species" label="Name of the species being annotated" type="text" Default ="human" help="Species for your data. This can be the latin name e.g. 'homo_sapiens' or any Ensembl alias e.g. 'mouse'. Specifying the latin name can speed up initial database connection as the registry does not have to load all available database aliases on the server."/> <!-- TODO: files in galaxy have a reference genome specified. We should probaby try to use that instead. -->
+ <param name="species" label="Name of the species being annotated" type="text" Default ="human" help="Species for your data. This can be the latin name e.g. 'human' or any Ensembl alias e.g. 'mouse'. Specifying the latin name can speed up initial database connection as the registry does not have to load all available database aliases on the server."/> <!-- TODO: files in galaxy have a reference genome specified. We should probaby try to use that instead. -->
 <conditional name="database_options">
- <param name="database_options_selector" type="select" label="Database Options">
+ <param name="database_options_selector" type="select" label="Database Options">
 <option value="basic" selected="True">Use Default Database</option>
 <option value="advanced">Choose Database Manually</option>
- </param>
- <when value="basic">
- <!-- no options -->
- </when>
- <when value="advanced">
- <param name="host" label="Database host address" type="text" default="www.ebi.edu.au" help="By default connects to the EMBL Australia database at www.ebi.edu.au"/> <!-- TODO: may want a drop-down list with the main EMBL database listed too and with an other field, with Australian as default? In this case should state that there is a cap. -->
- <param name="username" label="Username" Default="anonymous" type="text" help="Default='anonymous'"/>
- <param name="password" label="Password, if required" Default="" type="text" help="Most public ENSEMBL databases do not require a password for access"/>
- <param name="portnum" label="Database port" type="text" Default="3306" help="The default for EMBL Australia's ENSEMBL is 3306."/>
- </when>
- </conditional>
- <conditional name="parameters">
- <param name="everything" label="everything" type="boolean" checked="true"
- help="shortcut to switch on all the following parameters"/>
- <when value="true"></when>
- <when value="false">
- <conditional name="sift_options">
- <param name="sift" label="Sift" type="boolean"
- help="Human only SIFT predicts whether an amino acid substitution affects protein function based on sequence homology and the physical properties of amino acids. The VEP can output the prediction term, score or both. Not used by default"/>
- <when value="true">
- <param name="sift_value" label="options" type="select">
- <option value="b" selected="true">Both (prediction term and score)</option>
- <option value="s">score</option>
- <option value="p">prediction term</option>
- </param>
- </when>
- <when value="false"></when>
- </conditional>
- <conditional name="polyphen_options">
- <param name="polyphen" label="PolyPhen" type="boolean"
- help="Human only PolyPhen is a tool which predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. The VEP can output the prediction term, score or both. Not used by default"/>
- <when value="true">
- <param name="polyphen_value" label="options" type="select">
- <option value="b" selected="true">Both (prediction term and score)</option>
- <option value="s">score</option>
- <option value="p">prediction term</option>
- </param>
- </when>
- <when value="false"></when>
- </conditional>
- <param name="ccds" label="Add CCDS transcript identifier" type="boolean"/>
- <param name="hgvs" label="Add HGVS nomenclature" type="boolean"/>
- <param name="hgnc" label="Add HGNC gene Identifier" type="boolean"/>
- <param name="numbers" label="Add affected exon and intron Numbers" type="boolean"/>
- <param name="domains" label="Add (overlapping) protein Domains" type="boolean"/>
- <param name="regulatory" label="Overlaps Regulatory regions" type="boolean" />
- <param name="canonical" label="Add flag for Canonical transcript" type="boolean"/>
- <param name="protein" label="Ensembl Protein identifier" type="boolean"/>
- <param name="gmaf" label="Add GMAF (Global Minor Allele Frequency)" type="boolean" />
- </when>
+ </param>
+ <when value="basic">
+ <!-- no options -->
+ </when>
+ <when value="advanced">
+ <param name="host" label="Database host address" type="text" default="ensembldb.ensembl.org" help="By default connects to ensembldb.ensembl.org"/> <!-- TODO: may want a drop-down list with the main EMBL database listed too and with an other field -->
+ <param name="username" label="Username" Default="anonymous" type="text" help="Default='anonymous'"/>
+ <param name="password" label="Password, if required" Default="" type="text" help="Most public ENSEMBL databases do not require a password for access"/>
+ <param name="portnum" label="Database port" type="text" Default="5306" help="The default is 5306."/>
+ </when>
 </conditional>
 </inputs>
 <outputs>
@@ -114,64 +37,7 @@
 </outputs>
 <help>
-============
-Description
-============
 This tool connects to the ENSEMBL database using ENSEMBL's Variant Effect Predictor script and retrieves annotations for an input variants file.
-
-============
-Parameters
-============
-everything
- Shortcut flag to switch on all of the following:
- ``sift b
- polyphen b
- ccds
- hgvs
- hgnc
- numbers
- domains
- regulatory
- cell_type
- canonical
- protein
- gmaf``
-
-sift [both|score|prediction term]
- **Human only** SIFT predicts whether an amino acid substitution affects protein function based on sequence homology and the physical properties of amino acids. The VEP can output the prediction term, score or both. *Not used by default*
-
-polyphen [both|score|prediction term]
- **Human only** PolyPhen is a tool which predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. The VEP can output the prediction term, score or both. *Not used by default*
-
-ccds
- Adds the CCDS transcript identifier (where available) to the output. *Not used by default*
-
-hgvs
- Add HGVS nomenclature based on Ensembl stable identifiers to the output. Both coding and protein sequence names are added where appropriate. Currently it is not possible to generate HGVS identifiers from the cache; a database connection must be made. *Not used by default*
-
-hgnc
- Adds the HGNC gene identifer (where available) to the output. *Not used by default*
-
-numbers
- Adds affected exon and intron numbering to to output. Format is Number/Total. *Not used by default*
-
-domains
- Adds names of overlapping protein domains to output. *Not used by default*
-
-regulatory
- Look for overlaps with regulatory regions. The script can also call if a variant falls in a high information position within a transcription factor binding site. Output lines have a Feature type of RegulatoryFeature or MotifFeature. *Not used by default*
-
-cell_type
- Report only regulatory regions that are found in the given cell type(s). Can be a single cell type or a comma-separated list. The functional type in each cell type is reported under CELL_TYPE in the output. To retrieve a list of cell types, use ``--cell_type list``. *Not used by default*
-
-canonical
- Adds a flag indicating if the transcript is the canonical transcript for the gene. *Not used by default*
-
-protein
- Add the Ensembl protein identifier to the output where appropriate. *Not used by default*
-
-gmaf
- Add the global minor allele frequency (MAF) from 1000 Genomes Phase 1 data for any existing variant to the output. *Not used by default*
 </help>
 </tool>
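
The revised `<command>` template in this changeset can be read as the argument-building logic below. This is a minimal Python sketch for illustration, not part of the tool. The function name `build_vep_args` and the `db` dictionary are hypothetical, and the `--password` flag is an assumption, because the template line that passes the password falls outside the diff context shown above.

```python
# Defaults hardcoded in the template's "basic" branch after this changeset.
DEFAULT_DB = {"host": "ensembldb.ensembl.org", "user": "anonymous", "port": "5306"}


def build_vep_args(input_path, output_path, species, db=None):
    """Return the argument list that the revised template would roughly expand to.

    db is None for the "basic" database option; for "advanced" it is a dict
    with the keys host, username, portnum and (optionally) password,
    mirroring the param names in the tool XML.
    """
    args = [
        "variant_effect_predictor.pl",
        f"-i={input_path}",
        f"-o={output_path}",
        f"-species={species}",
        # SIFT and PolyPhen are now always requested with value "b".
        "--sift", "b",
        "--polyphen", "b",
    ]
    if db is not None:
        args += [
            f"--host={db['host']}",
            f"--user={db['username']}",
            f"--port={db['portnum']}",
        ]
        if db.get("password"):
            # Assumption: the elided template line passes the password like this.
            args.append(f"--password={db['password']}")
    else:
        args += [
            f"--host={DEFAULT_DB['host']}",
            f"--user={DEFAULT_DB['user']}",
            f"--port={DEFAULT_DB['port']}",
        ]
    return args


if __name__ == "__main__":
    print(" ".join(build_vep_args("in.vcf", "out.txt", "human")))
```

With the "basic" option the sketch always ends in the new defaults (`ensembldb.ensembl.org`, `anonymous`, `5306`). The per-annotation switches removed by the changeset (`--ccds`, `--hgvs`, and so on) have no counterpart here.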
