# HG changeset patch
# User iuc
# Date 1430793998 14400
# Node ID 65f742e605ecf4c8afe01a4908f6a1c26949ef20
# Parent ae03de7a9fee3efff7f2830b95a2bf55b8ece1f6
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/gemini commit 344140b8df53b8b7024618bb04594607a045c03a

diff -r ae03de7a9fee -r 65f742e605ec gemini_annotate.xml
--- a/gemini_annotate.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_annotate.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 adding your own custom annotations
-
-
 gemini_macros.xml annotate
+
+
+
-
@@ -36,7 +36,7 @@
 label="The name of the column to be added to the variant table" help="(-c)">
-
+
@@ -48,7 +48,7 @@
-
diff -r ae03de7a9fee -r 65f742e605ec gemini_autosomal_recessive.xml
--- a/gemini_autosomal_recessive.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_autosomal_recessive.xml Mon May 04 22:46:38 2015 -0400
@@ -1,13 +1,14 @@
 Find variants meeting an autosomal recessive/dominant model
-
-
 gemini_macros.xml
+
+
+
 "${ outfile }" ]]>
-
@@ -58,8 +58,8 @@
 **What it does**
-Assuming you have defined the familial relationships between samples when loading your VCF into GEMINI, one can leverage a
-built-in tool for identifying variants that meet an autosomal recessive or dominant inheritance pattern.
+Assuming you have defined the familial relationships between samples when loading your VCF into GEMINI, one can leverage a
+built-in tool for identifying variants that meet an autosomal recessive or dominant inheritance pattern.
 The reported variants will be restricted to those variants having the potential to impact the function of affecting protein coding transcripts.
 @CITATION@
diff -r ae03de7a9fee -r 65f742e605ec gemini_burden.xml
--- a/gemini_burden.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_burden.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 perform sample-wise gene-level burden calculations
-
-
 gemini_macros.xml burden
+
+
+
-
-
-
-
-
-
-
@@ -65,7 +65,7 @@
 **What it does**
-The burden tool provides a set of utilities to perform burden summaries on a per-gene, per sample basis.
+The burden tool provides a set of utilities to perform burden summaries on a per-gene, per sample basis.
 By default, it outputs a table of gene-wise counts of all high impact variants in coding regions for each sample.
 $ gemini burden test.burden.db
diff -r ae03de7a9fee -r 65f742e605ec gemini_comp_hets.xml
--- a/gemini_comp_hets.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_comp_hets.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 Identifying potential compound heterozygotes
-
-
 gemini_macros.xml comp_hets
+
+
+
 "${ outfile }" ]]>
-
-
-
@@ -44,13 +44,13 @@
 **What it does**
-Many recessive disorders are caused by compound heterozygotes. Unlike canonical recessive sites where the same recessive allele is
-inherited from both parents at the _same_ site in the gene, compound heterozygotes occur when the individual’s phenotype is caused
+Many recessive disorders are caused by compound heterozygotes. Unlike canonical recessive sites where the same recessive allele is
+inherited from both parents at the _same_ site in the gene, compound heterozygotes occur when the individual’s phenotype is caused
 by two heterozygous recessive alleles at _different_ sites in a particular gene.
-So basically, we are looking for two (typically loss-of-function (LoF)) heterozygous variants impacting the same gene at different loci.
-The complicating factor is that this is _recessive_ and as such, we must also require that the consequential alleles at each heterozygous
-site were inherited on different chromosomes (one from each parent). As such, in order to use this tool, we require that all variants are phased.
+So basically, we are looking for two (typically loss-of-function (LoF)) heterozygous variants impacting the same gene at different loci.
+The complicating factor is that this is _recessive_ and as such, we must also require that the consequential alleles at each heterozygous
+site were inherited on different chromosomes (one from each parent). As such, in order to use this tool, we require that all variants are phased.
 Once this has been done, the comp_hets tool will provide a report of candidate compound heterozygotes for each sample/gene.
diff -r ae03de7a9fee -r 65f742e605ec gemini_db_info.xml
--- a/gemini_db_info.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_db_info.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 List the gemini database tables and columns
-
-
 gemini_macros.xml db_info
+
+
+
 "${ outfile }" ]]>
-
@@ -27,7 +27,7 @@
 **What it does**
-Because of the sheer number of annotations that are stored in gemini, there are admittedly too many columns to remember by rote.
+Because of the sheer number of annotations that are stored in gemini, there are admittedly too many columns to remember by rote.
 If you can’t recall the name of particular column, just use the db_info tool. It will report all of the tables and all of the columns / types in each table.
 @CITATION@
diff -r ae03de7a9fee -r 65f742e605ec gemini_de_novo.xml
--- a/gemini_de_novo.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_de_novo.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 Identifying potential de novo mutations
-
-
 gemini_macros.xml de_novo
+
+
+
 "${ outfile }" ]]>
-
@@ -39,7 +39,7 @@
 **What it does**
-Assuming you have defined the familial relationships between samples when loading your VCF into GEMINI,
+Assuming you have defined the familial relationships between samples when loading your VCF into GEMINI,
 you can use this tool for identifying de novo (a.k.a spontaneous) mutations that arise in offspring.
 @CITATION@
diff -r ae03de7a9fee -r 65f742e605ec gemini_interactions.xml
--- a/gemini_interactions.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_interactions.xml Mon May 04 22:46:38 2015 -0400
@@ -1,14 +1,15 @@
 Find genes among variants that are interacting partners
-
-
 gemini_macros.xml interactions
+
+
+
 "${ outfile }" ]]>
-
diff -r ae03de7a9fee -r 65f742e605ec gemini_load.xml
--- a/gemini_load.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_load.xml Mon May 04 22:46:38 2015 -0400
@@ -1,16 +1,17 @@
 Loading a VCF file into GEMINI
-
-
 gemini_macros.xml load
+
+
+
-
@@ -51,22 +51,22 @@
-
-
-
-
-
-
diff -r ae03de7a9fee -r 65f742e605ec gemini_lof_sieve.xml
--- a/gemini_lof_sieve.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_lof_sieve.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 Filter LoF variants by transcript position and type
-
-
 gemini_macros.xml lof_sieve
+
+
+
 "${ outfile }" ]]>
-
@@ -27,10 +27,10 @@
 **What it does**
-Not all candidate LoF variants are created equal. For e.g, a nonsense (stop gain) variant impacting the first 5% of a polypeptide is far
-more likely to be deleterious than one affecting the last 5%. Assuming you’ve annotated your VCF with snpEff v3.0+, the lof_sieve tool
-reports the fractional position (e.g. 0.05 for the first 5%) of the mutation in the amino acid sequence.
-In addition, it also reports the predicted function of the transcript so that one can segregate candidate
+Not all candidate LoF variants are created equal. For e.g, a nonsense (stop gain) variant impacting the first 5% of a polypeptide is far
+more likely to be deleterious than one affecting the last 5%. Assuming you’ve annotated your VCF with snpEff v3.0+, the lof_sieve tool
+reports the fractional position (e.g. 0.05 for the first 5%) of the mutation in the amino acid sequence.
+In addition, it also reports the predicted function of the transcript so that one can segregate candidate
 LoF variants that affect protein_coding transcripts from processed RNA, etc.
 @CITATION@
diff -r ae03de7a9fee -r 65f742e605ec gemini_pathways.xml
--- a/gemini_pathways.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_pathways.xml Mon May 04 22:46:38 2015 -0400
@@ -1,14 +1,15 @@
 Map genes and variants to KEGG pathways
-
-
 gemini_macros.xml pathways
+
+
+
 "${ outfile }" ]]>
-
-
-
@@ -40,9 +40,9 @@
 **What it does**
-Mapping genes to biological pathways is useful in understanding the function/role played by a gene.
-Likewise, genes involved in common pathways is helpful in understanding heterogeneous diseases.
-We have integrated the KEGG pathway mapping for gene variants, to explain/annotate variation.
+Mapping genes to biological pathways is useful in understanding the function/role played by a gene.
+Likewise, genes involved in common pathways is helpful in understanding heterogeneous diseases.
+We have integrated the KEGG pathway mapping for gene variants, to explain/annotate variation.
 This requires your VCF be annotated with either snpEff/VEP.
diff -r ae03de7a9fee -r 65f742e605ec gemini_query.xml
--- a/gemini_query.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_query.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 Querying the GEMINI database
-
-
 gemini_macros.xml query
+
+
+
-
@@ -61,13 +61,13 @@
-
-
-
@@ -75,7 +75,7 @@
-
@@ -99,7 +99,7 @@
 **What it does**
-The real power in the GEMINI framework lies in the fact that all of your genetic variants have been stored in a convenient database in the context of a wealth of genome annotations that facilitate variant interpretation.
+The real power in the GEMINI framework lies in the fact that all of your genetic variants have been stored in a convenient database in the context of a wealth of genome annotations that facilitate variant interpretation.
 The expressive power of SQL allows one to pose intricate questions of one’s variation data. This tool offers you an easy way to query your variants!
 http://gemini.readthedocs.org/en/latest/content/querying.html
diff -r ae03de7a9fee -r 65f742e605ec gemini_region.xml
--- a/gemini_region.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_region.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 Extracting variants from specific regions or genes
-
-
 gemini_macros.xml region
+
+
+
 "${ outfile }" ]]>
-
diff -r ae03de7a9fee -r 65f742e605ec gemini_roh.xml
--- a/gemini_roh.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_roh.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 Identifying runs of homozygosity
-
-
 gemini_macros.xml roh
+
+
+
 "${ outfile }" ]]>
-
@@ -32,7 +32,7 @@
-
@@ -66,16 +66,16 @@
 ===========================================================================
 Runs of homozygosity are long stretches of homozygous genotypes that reflect
 segments shared identically by descent and are a result of consanguinity or
-natural selection. Consanguinity elevates the occurrence of rare recessive
-diseases (e.g. cystic fibrosis) that represent homozygotes for strongly deleterious
-mutations. Hence, the identification of these runs holds medical value.
+natural selection. Consanguinity elevates the occurrence of rare recessive
+diseases (e.g. cystic fibrosis) that represent homozygotes for strongly deleterious
+mutations. Hence, the identification of these runs holds medical value.
-The 'roh' tool in GEMINI returns runs of homozygosity identified in whole genome data.
+The 'roh' tool in GEMINI returns runs of homozygosity identified in whole genome data.
 The tool basically looks at every homozygous position on the chromosome as a possible
-start site for the run and looks for those that could give rise to a potentially long
-stretch of homozygous genotypes.
+start site for the run and looks for those that could give rise to a potentially long
+stretch of homozygous genotypes.
-For e.g. for the given example allowing ``1 HET`` genotype (h) and ``2 UKW`` genotypes (u)
+For e.g. for the given example allowing ``1 HET`` genotype (h) and ``2 UKW`` genotypes (u)
 the possible roh runs (H) would be:
@@ -90,13 +90,13 @@
 roh returned for --min-snps = 20 would be:
 ::
-
+
 roh_run1 = H H H H h H H H H u H H H H H u H H H H H H H
 roh_run2 = H H H H u H H H H H u H H H H H H H h H H H H H
-As you can see, the immediate homozygous position right of a break (h or u) would be the possible
-start of a new roh run and genotypes to the left of a break are pruned since they cannot
+As you can see, the immediate homozygous position right of a break (h or u) would be the possible
+start of a new roh run and genotypes to the left of a break are pruned since they cannot
 be part of a longer run than we have seen before.
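Note on the roh help text in the hunk above: the run-finding idea it describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GEMINI's actual implementation. The function name `find_roh`, its parameters, and the one-character encoding (`H` homozygous, `h` heterozygous, `u` unknown) are hypothetical; `min_snps` is assumed to count homozygous genotypes only; and the input sequence is reconstructed from the two runs printed in the help text, because the full example is not visible in this patch.

```python
def find_roh(genotypes, max_hets=1, max_unknowns=2, min_snps=20):
    """Return (start, end) pairs (end exclusive) of candidate runs.

    Hypothetical sketch: a run may start only at a homozygous genotype
    that is first in the sequence or sits immediately right of a break
    (h or u). Starts inside a stretch of H are pruned because they can
    only yield a shorter run than the one already started to their left.
    """
    runs = []
    n = len(genotypes)
    for start in range(n):
        if genotypes[start] != "H":
            continue
        if start > 0 and genotypes[start - 1] == "H":
            continue  # not right of a break: pruned
        hets = unknowns = homs = 0
        last_hom = start
        for i in range(start, n):
            g = genotypes[i]
            if g == "h":
                hets += 1
            elif g == "u":
                unknowns += 1
            elif g == "H":
                homs += 1
                last_hom = i
            if hets > max_hets or unknowns > max_unknowns:
                break  # too many breaks: the run ends at last_hom
        # Assumption: min_snps counts homozygous genotypes only.
        if homs >= min_snps:
            runs.append((start, last_hom + 1))
    return runs


# Sequence reconstructed from roh_run1 and roh_run2 in the help text.
seq = list("HHHH" "h" "HHHH" "u" "HHHHH" "u" "HHHHHHH" "h" "HHHHH")
for start, end in find_roh(seq, max_hets=1, max_unknowns=2, min_snps=20):
    print(" ".join(seq[start:end]))
# H H H H h H H H H u H H H H H u H H H H H H H
# H H H H u H H H H H u H H H H H H H h H H H H H
```

With one HET and two UKW genotypes allowed, the sketch reports the same two runs as the help text: the first stops before the second heterozygous call, and the second starts at the homozygous position right of the first break.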
diff -r ae03de7a9fee -r 65f742e605ec gemini_stats.xml
--- a/gemini_stats.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_stats.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 Compute useful variant statistics
-
-
 gemini_macros.xml stats
+
+
+
 "${ outfile }" ]]>
-
diff -r ae03de7a9fee -r 65f742e605ec gemini_windower.xml
--- a/gemini_windower.xml Tue Apr 28 22:55:56 2015 -0400
+++ b/gemini_windower.xml Mon May 04 22:46:38 2015 -0400
@@ -1,11 +1,12 @@
 Conducting analyses on genome "windows"
-
-
 gemini_macros.xml windower
+
+
+
 "${ outfile }" ]]>
-
@@ -34,12 +34,12 @@
-
-