# HG changeset patch
# User artbio
# Date 1615118481 0
# Node ID 88e03bac1e36b7ad950315f8bfd6e7f06dc5c099
# Parent 604281aa5ad488375c6183ea509a4e784db924a9
"planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/snvtocnv commit ccbc1fc0e1af1e9cf5000fe2a3f60655cd5793eb"

diff -r 604281aa5ad4 -r 88e03bac1e36 segmentation_sequenza.R
--- a/segmentation_sequenza.R Sun Feb 21 12:49:38 2021 +0000
+++ b/segmentation_sequenza.R Sun Mar 07 12:01:21 2021 +0000
@@ -1,15 +1,17 @@
 # load packages that are provided in the conda env
-options(show.error.messages = F,
- error = function() {
- cat(geterrmessage(), file = stderr()); q("no", 1, F)})
-Sys.setenv(TZ = "Pacific/Auckland") # turnaround the tidyverse bug "In OlsonNames() : no Olson database found"
-
+# options(show.error.messages = F,
+# error = function() {
+# cat(geterrmessage(), file = stderr()); q("no", 1, F)})
 library(optparse)
 library(sequenza)
 library(BiocParallel)
 library(tidyverse)
-
+Sys.setenv(TZ = "Etc/UTC") # turnaround the tidyverse bug "In OlsonNames() : no Olson database found"
+tzdirs <- c(Sys.getenv("TZDIR"), file.path(R.home("share"),
+ "zoneinfo"), "/usr/share/zoneinfo", "/usr/share/lib/zoneinfo",
+ "/usr/lib/zoneinfo", "/usr/local/etc/zoneinfo", "/etc/zoneinfo",
+ "/usr/etc/zoneinfo")
 option_list <- list(
  make_option(
  c("-i", "--input"),
diff -r 604281aa5ad4 -r 88e03bac1e36 sequenza_index.xml
--- a/sequenza_index.xml Sun Feb 21 12:49:38 2021 +0000
+++ b/sequenza_index.xml Sun Mar 07 12:01:21 2021 +0000
@@ -1,4 +1,4 @@
-
+
@@ -40,33 +40,24 @@
 snvtocnv
 ============================
-This tool is wrapping several cleaning steps to produce bam files suitable for subsequent
-analyses with lumpy-smoove (or other large structural variation callers) or with
-somatic-varscan (or small structural variation callers)
-
-
-Workflow
-=============
-
-The tool is using the following command line for filtering:
+Analyzes genomic sequencing data from paired normal-tumor samples, including
+cellularity and ploidy estimation; mutation and copy number (allele-specific and total
+copy number) detection, quantification and visualization.
-::
+This tools builds the GC wigle index of the reference genome required to perform analysis
+of the somatic single nucleotide variations using the tool "Infer CNVs from SNVs"
- sambamba view -h -t 8 --filter='mapping_quality >= 1 and not(unmapped) and not(mate_is_unmapped)' -f 'bam' $input_base".bam"
- | samtools rmdup - -
- |tee $input_base".filt1.dedup.bam" | bamleftalign --fasta-reference reference.fa -c --max-iterations "5" -
- | samtools calmd -C 50 -b -@ 4 - reference.fa > $input_base".filt1.dedup.bamleft.calmd.bam" ;
- sambamba view -h -t 8 --filter='mapping_quality <= 254' -f 'bam' -o $input_base".filt1.dedup.bamleft.calmd.filt2.bam" $input_base".filt1.dedup.bamleft.calmd.bam"
-Purpose
+Inputs
 --------
-This "workflow" tool was generated in order to limit the number of ``python metadata/set.py`` jobs
-which occur at each step of standard galaxy workflows. Indeed, these jobs are poorly optimized and may last considerable
-amounts of time when datasets are large, at each step, lowering the overall performance of the workflow.
+The reference genome in a fasta format
+
+*Warning* the genome fasta must be sorted according to the chromosomes
+(e.g. chr1, chr2, .. chr21, chr22)
- 10.1371/journal.pone.0168397
+ 10.1093/annonc/mdu479
diff -r 604281aa5ad4 -r 88e03bac1e36 snvtocnv.xml
--- a/snvtocnv.xml Sun Feb 21 12:49:38 2021 +0000
+++ b/snvtocnv.xml Sun Mar 07 12:01:21 2021 +0000
@@ -1,4 +1,4 @@
-
+
@@ -56,33 +56,22 @@
 snvtocnv
 ============================
-This tool is wrapping several cleaning steps to produce bam files suitable for subsequent
-analyses with lumpy-smoove (or other large structural variation callers) or with
-somatic-varscan (or small structural variation callers)
-
-
-Workflow
-=============
-
-The tool is using the following command line for filtering:
+Analyze genomic sequencing data from paired normal-tumor samples, including
+cellularity and ploidy estimation; mutation and copy number (allele-specific and total
+copy number) detection, quantification and visualization.
-::
-
- sambamba view -h -t 8 --filter='mapping_quality >= 1 and not(unmapped) and not(mate_is_unmapped)' -f 'bam' $input_base".bam"
- | samtools rmdup - -
- |tee $input_base".filt1.dedup.bam" | bamleftalign --fasta-reference reference.fa -c --max-iterations "5" -
- | samtools calmd -C 50 -b -@ 4 - reference.fa > $input_base".filt1.dedup.bamleft.calmd.bam" ;
- sambamba view -h -t 8 --filter='mapping_quality <= 254' -f 'bam' -o $input_base".filt1.dedup.bamleft.calmd.filt2.bam" $input_base".filt1.dedup.bamleft.calmd.bam"
-Purpose
+Inputs
 --------
-This "workflow" tool was generated in order to limit the number of ``python metadata/set.py`` jobs
-which occur at each step of standard galaxy workflows. Indeed, these jobs are poorly optimized and may last considerable
-amounts of time when datasets are large, at each step, lowering the overall performance of the workflow.
+A GC wigle of genome index generated with the tool "create GC_wiggle of reference genome"
+available from this galaxy wrapper
+
+A vcf file of somatic *single* nucleotide variations observed in a tumor sample
+
- 10.1371/journal.pone.0168397
+ 10.1093/annonc/mdu479