--- a/segmentation_sequenza.R	Sun Feb 21 12:49:38 2021 +0000
+++ b/segmentation_sequenza.R	Sun Mar 07 12:01:21 2021 +0000
@@ -1,15 +1,17 @@
 # load packages that are provided in the conda env
-options(show.error.messages = F,
-       error = function() {
-           cat(geterrmessage(), file = stderr()); q("no", 1, F)})
-Sys.setenv(TZ = "Pacific/Auckland") # turnaround the tidyverse bug "In OlsonNames() : no Olson database found"
-
+# options(show.error.messages = F,
+#       error = function() {
+#           cat(geterrmessage(), file = stderr()); q("no", 1, F)})

 library(optparse)
 library(sequenza)
 library(BiocParallel)
 library(tidyverse)
-
+Sys.setenv(TZ = "Etc/UTC") # work around the tidyverse bug "In OlsonNames() : no Olson database found"
+tzdirs <- c(Sys.getenv("TZDIR"), file.path(R.home("share"),
+        "zoneinfo"), "/usr/share/zoneinfo", "/usr/share/lib/zoneinfo",
+        "/usr/lib/zoneinfo", "/usr/local/etc/zoneinfo", "/etc/zoneinfo",
+        "/usr/etc/zoneinfo")
 option_list <- list(
   make_option(
     c("-i", "--input"),
--- a/sequenza_index.xml	Sun Feb 21 12:49:38 2021 +0000
+++ b/sequenza_index.xml	Sun Mar 07 12:01:21 2021 +0000
@@ -1,4 +1,4 @@
-<tool id="sequenzaindex" name="create GC_wiggle of reference genome" version="0.4.0">
+<tool id="sequenzaindex" name="create GC_wiggle of reference genome" version="0.5.0">
     <description>
     </description>
     <macros>
@@ -40,33 +40,24 @@
 snvtocnv
 ============================

-This tool is wrapping several cleaning steps to produce bam files suitable for subsequent
-analyses with lumpy-smoove (or other large structural variation callers) or with
-somatic-varscan (or small structural variation callers)
-
-
-Workflow
-=============
-
-The tool is using the following command line for filtering:
+Analyzes genomic sequencing data from paired normal-tumor samples, including
+cellularity and ploidy estimation; mutation and copy number (allele-specific and total
+copy number) detection, quantification and visualization.

-::
+This tool builds the GC wiggle index of the reference genome, which is required to analyze
+somatic single nucleotide variations with the tool "Infer CNVs from SNVs".

-    sambamba view -h -t 8 --filter='mapping_quality >= 1 and not(unmapped) and not(mate_is_unmapped)' -f 'bam' $input_base".bam"
-    &#124; samtools rmdup - -
-    &#124;tee $input_base".filt1.dedup.bam" &#124; bamleftalign --fasta-reference reference.fa -c --max-iterations "5" -
-    &#124; samtools calmd  -C 50 -b -@ 4 - reference.fa &gt; $input_base".filt1.dedup.bamleft.calmd.bam" ;
-    sambamba view -h -t 8 --filter='mapping_quality &lt;&#61; 254' -f 'bam' -o $input_base".filt1.dedup.bamleft.calmd.filt2.bam" $input_base".filt1.dedup.bamleft.calmd.bam"

-Purpose
+Inputs
 --------

-This "workflow" tool was generated in order to limit the number of ``python metadata/set.py`` jobs
-which occur at each step of standard galaxy workflows. Indeed, these jobs are poorly optimized and may last considerable
-amounts of time when datasets are large, at each step, lowering the overall performance of the workflow.
+The reference genome in FASTA format.
+
+*Warning*: the genome FASTA must be sorted by chromosome
+(e.g. chr1, chr2, ..., chr21, chr22).

     </help>
     <citations>
-        <citation type="doi">10.1371/journal.pone.0168397</citation>
+        <citation type="doi">10.1093/annonc/mdu479</citation>
     </citations>
 </tool>
--- a/snvtocnv.xml	Sun Feb 21 12:49:38 2021 +0000
+++ b/snvtocnv.xml	Sun Mar 07 12:01:21 2021 +0000
@@ -1,4 +1,4 @@
-<tool id="snvtocnv" name="Infer CNVs from SNVs" version="0.4.0">
+<tool id="snvtocnv" name="Infer CNVs from SNVs" version="0.5.0">
     <description>
     </description>
     <macros>
@@ -56,33 +56,22 @@
 snvtocnv
 ============================

-This tool is wrapping several cleaning steps to produce bam files suitable for subsequent
-analyses with lumpy-smoove (or other large structural variation callers) or with
-somatic-varscan (or small structural variation callers)
-
-
-Workflow
-=============
-
-The tool is using the following command line for filtering:
+Analyzes genomic sequencing data from paired normal-tumor samples, including
+cellularity and ploidy estimation; mutation and copy number (allele-specific and total
+copy number) detection, quantification and visualization.

-::
-
-    sambamba view -h -t 8 --filter='mapping_quality >= 1 and not(unmapped) and not(mate_is_unmapped)' -f 'bam' $input_base".bam"
-    &#124; samtools rmdup - -
-    &#124;tee $input_base".filt1.dedup.bam" &#124; bamleftalign --fasta-reference reference.fa -c --max-iterations "5" -
-    &#124; samtools calmd  -C 50 -b -@ 4 - reference.fa &gt; $input_base".filt1.dedup.bamleft.calmd.bam" ;
-    sambamba view -h -t 8 --filter='mapping_quality &lt;&#61; 254' -f 'bam' -o $input_base".filt1.dedup.bamleft.calmd.filt2.bam" $input_base".filt1.dedup.bamleft.calmd.bam"

-Purpose
+Inputs
 --------

-This "workflow" tool was generated in order to limit the number of ``python metadata/set.py`` jobs
-which occur at each step of standard galaxy workflows. Indeed, these jobs are poorly optimized and may last considerable
-amounts of time when datasets are large, at each step, lowering the overall performance of the workflow.
+A GC wiggle index of the genome, generated with the tool "create GC_wiggle of reference genome"
+available in this Galaxy wrapper.
+
+A VCF file of somatic *single* nucleotide variations observed in a tumor sample.
+

     </help>
     <citations>
-        <citation type="doi">10.1371/journal.pone.0168397</citation>
+        <citation type="doi">10.1093/annonc/mdu479</citation>
     </citations>
 </tool>