Mercurial > repos > nilesh > rseqc
diff infer_experiment.xml @ 3:71ed55a3515a draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/rseqc commit 37fb1988971807c6a072e1afd98eeea02329ee83
author | iuc |
---|---|
date | Tue, 14 Mar 2017 10:22:57 -0400 |
parents | f92b87abef3d |
children | 017eaaf58e5e |
--- a/infer_experiment.xml Thu Jul 18 11:01:08 2013 -0500
+++ b/infer_experiment.xml Tue Mar 14 10:22:57 2017 -0400
@@ -1,124 +1,141 @@
-<tool id="infer_experiment" name="Infer Experiment">
-    <description>speculates how RNA-seq were configured</description>
-    <requirements>
-        <requirement type="package" version="2.3.7">rseqc</requirement>
-    </requirements>
-    <command interpreter="python"> infer_experiment.py -i $input -r $refgene
-
-        #if $sample_size.boolean
-            -s $sample_size.size
-        #end if
-
-        > $output
-    </command>
-    <inputs>
-        <param name="input" type="data" format="bam,sam" label="Input BAM/SAM file" />
-        <param name="refgene" type="data" format="bed" label="Reference gene model in bed format" />
-        <conditional name="sample_size">
-            <param name="boolean" type="boolean" label="Modify usable sampled reads" value="false" />
-            <when value="true">
-                <param name="size" type="integer" label="Number of usable sampled reads (default = 200000)" value="200000" />
-            </when>
-        </conditional>
-    </inputs>
-    <outputs>
-        <data format="txt" name="output" />
-    </outputs>
-    <tests>
-        <test>
-            <param name="input" value="Pairend_nonStrandSpecific_36mer_Human_hg19.bam" />
-            <param name="refgene" value="hg19_RefSeq.bed" />
-            <output name="output" file="inferexpout.txt" />
-        </test>
-    </tests>
-    <help>
-.. image:: https://code.google.com/p/rseqc/logo?cct=1336721062
+<tool id="rseqc_infer_experiment" name="Infer Experiment" version="@WRAPPER_VERSION@">
+    <description>speculates how RNA-seq were configured</description>
+
+    <macros>
+        <import>rseqc_macros.xml</import>
+    </macros>
+
+    <expand macro="requirements" />
+
+    <expand macro="stdio" />
+
+    <version_command><![CDATA[infer_experiment.py --version]]></version_command>
+
+    <command><![CDATA[
+        infer_experiment.py -i '${input}' -r '${refgene}'
+            --sample-size ${sample_size}
+            --mapq ${mapq}
+        > '${output}'
+        ]]>
+    </command>
------
+    <inputs>
+        <expand macro="bam_param" />
+        <expand macro="refgene_param" />
+        <expand macro="sample_size_param" />
+        <expand macro="mapq_param" />
+    </inputs>
+
+    <outputs>
+        <data format="txt" name="output" />
+    </outputs>
-About RSeQC
-+++++++++++
+    <tests>
+        <test>
+            <param name="input" value="pairend_strandspecific_51mer_hg19_chr1_1-100000.bam"/>
+            <param name="refgene" value="hg19_RefSeq_chr1_1-100000.bed"/>
+            <output name="output" file="output.infer_experiment.txt"/>
+        </test>
+    </tests>
-The RSeQC package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data. “Basic modules” quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while “RNA-seq specific modules” investigate sequencing saturation status of both splicing junction detection and expression estimation, mapped reads clipping profile, mapped reads distribution, coverage uniformity over gene body, reproducibility, strand specificity and splice junction annotation.
+    <help><![CDATA[
+infer_experiment.py
++++++++++++++++++++
-The RSeQC package is licensed under the GNU GPL v3 license.
+This program is used to speculate how RNA-seq sequencing were configured, especially how
+reads were stranded for strand-specific RNA-seq data, through comparing reads' mapping
+information to the underneath gene model.
+
 Inputs
 ++++++++++++++
 Input BAM/SAM file
-    Alignment file in BAM/SAM format.
+    Alignment file in BAM/SAM format.
 Reference gene model
-    Gene model in BED format.
+    Gene model in BED format.
 Number of usable sampled reads (default=200000)
-    Number of usable reads sampled from SAM/BAM file. More reads will give more accurate estimation, but make program little slower.
+    Number of usable reads sampled from SAM/BAM file. More reads will give more accurate estimation, but make program little slower.
+Outputs
++++++++
-Output
-++++++++++++++
-This program is used to speculate how RNA-seq sequencing were configured, especially how reads were stranded for strand-specific RNA-seq data, through comparing reads' mapping information to the underneath gene model. Generally, strand specific RNA-seq data should be handled differently in both visualization and RPKM calculation.
+For pair-end RNA-seq, there are two different
+ways to strand reads (such as Illumina ScriptSeq protocol):
-For pair-end RNA-seq, there are two different ways to strand reads:
+1. 1++,1--,2+-,2-+
-1) 1++,1--,2+-,2-+
-    - read1 mapped to '+' strand indicates parental gene on '+' strand
-    - read1 mapped to '-' strand indicates parental gene on '-' strand
-    - read2 mapped to '+' strand indicates parental gene on '-' strand
-    - read2 mapped to '-' strand indicates parental gene on '+' strand
-2) 1+-,1-+,2++,2--
-    - read1 mapped to '+' strand indicates parental gene on '-' strand
-    - read1 mapped to '-' strand indicates parental gene on '+' strand
-    - read2 mapped to '+' strand indicates parental gene on '+' strand
-    - read2 mapped to '-' strand indicates parental gene on '-' strand
+* read1 mapped to '+' strand indicates parental gene on '+' strand
+* read1 mapped to '-' strand indicates parental gene on '-' strand
+* read2 mapped to '+' strand indicates parental gene on '-' strand
+* read2 mapped to '-' strand indicates parental gene on '+' strand
+
+2. 1+-,1-+,2++,2--
+
+* read1 mapped to '+' strand indicates parental gene on '-' strand
+* read1 mapped to '-' strand indicates parental gene on '+' strand
+* read2 mapped to '+' strand indicates parental gene on '+' strand
+* read2 mapped to '-' strand indicates parental gene on '-' strand
 For single-end RNA-seq, there are also two different ways to strand reads:
-1) ++,--
-    -read mapped to '+' strand indicates parental gene on '+' strand
-    - read mapped to '-' strand indicates parental gene on '-' strand
-2) +-,-+
-    - read mapped to '+' strand indicates parental gene on '-' strand
-    - read mapped to '-' strand indicates parental gene on '+' strand
+1. ++,--
+
+* read mapped to '+' strand indicates parental gene on '+' strand
+* read mapped to '-' strand indicates parental gene on '-' strand
+
+2. +-,-+
+
+* read mapped to '+' strand indicates parental gene on '-' strand
+* read mapped to '-' strand indicates parental gene on '+' strand
+
 Example Output
 ++++++++++++++
 **Example1** ::
-    =========================================================
-    This is PairEnd Data ::
+    =========================================================
+    This is PairEnd Data ::
-    Fraction of reads explained by "1++,1--,2+-,2-+": 0.4992
-    Fraction of reads explained by "1+-,1-+,2++,2--": 0.5008
-    Fraction of reads explained by other combinations: 0.0000
-    =========================================================
+    Fraction of reads explained by "1++,1--,2+-,2-+": 0.4992
+    Fraction of reads explained by "1+-,1-+,2++,2--": 0.5008
+    Fraction of reads explained by other combinations: 0.0000
+    =========================================================
 *Conclusion*: We can infer that this is NOT a strand specific because 50% of reads can be explained by "1++,1--,2+-,2-+", while the other 50% can be explained by "1+-,1-+,2++,2--".
 **Example2** ::
-    ============================================================
-    This is PairEnd Data
+    ============================================================
+    This is PairEnd Data
-    Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644 ::
-    Fraction of reads explained by "1+-,1-+,2++,2--": 0.0356
-    Fraction of reads explained by other combinations: 0.0000
-    ============================================================
-
+    Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644 ::
+    Fraction of reads explained by "1+-,1-+,2++,2--": 0.0356
+    Fraction of reads explained by other combinations: 0.0000
+    ============================================================
+
 *Conclusion*: We can infer that this is a strand-specific RNA-seq data. strandness of read1 is consistent with that of gene model, while strandness of read2 is opposite to the strand of reference gene model.
 **Example3** ::
-    =========================================================
-    This is SingleEnd Data ::
+    =========================================================
+    This is SingleEnd Data ::
-    Fraction of reads explained by "++,--": 0.9840 ::
-    Fraction of reads explained by "+-,-+": 0.0160
-    Fraction of reads explained by other combinations: 0.0000
-    =========================================================
+    Fraction of reads explained by "++,--": 0.9840 ::
+    Fraction of reads explained by "+-,-+": 0.0160
+    Fraction of reads explained by other combinations: 0.0000
+    =========================================================
 *Conclusion*: This is single-end, strand specific RNA-seq data. Strandness of reads are concordant with strandness of reference gene.
-    </help>
-</tool>
\ No newline at end of file
+
+@ABOUT@
+
+]]>
+    </help>
+
+    <expand macro="citations" />
+
+</tool>
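
The help text in this diff explains how to read the "Fraction of reads explained by" lines that the tool writes to its text output. As a rough illustration of that reasoning, the sketch below parses such a report and labels it. It is not part of the wrapper or of RSeQC: the function names `parse_fractions` and `classify_strandedness`, and the `dominant` and `balanced_gap` cutoffs, are hypothetical choices made for this example only.

```python
import re

# Matches report lines such as:
#   Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644 ::
#   Fraction of reads explained by other combinations: 0.0000
_FRACTION_RE = re.compile(
    r'Fraction of reads explained by '
    r'(?:"([^"]+)"|(other combinations)):\s*([0-9]*\.?[0-9]+)'
)


def parse_fractions(report_text):
    """Return a dict mapping each strand pattern to its reported fraction."""
    fractions = {}
    for match in _FRACTION_RE.finditer(report_text):
        key = match.group(1) or match.group(2)
        fractions[key] = float(match.group(3))
    return fractions


def classify_strandedness(fractions, dominant=0.8, balanced_gap=0.2):
    """Label a report as stranded, unstranded, or undetermined.

    The two cutoffs are illustrative assumptions, not RSeQC defaults.
    """
    named = {k: v for k, v in fractions.items() if k != "other combinations"}
    if len(named) != 2:
        return "undetermined"
    ranked = sorted(named.items(), key=lambda kv: kv[1], reverse=True)
    (top_key, top), (_, low) = ranked
    if top >= dominant:
        # One pattern explains most reads: strand-specific library.
        return "stranded: " + top_key
    if top - low <= balanced_gap:
        # Both patterns explain about half the reads: not strand-specific.
        return "unstranded"
    return "undetermined"


if __name__ == "__main__":
    example2 = '''This is PairEnd Data
Fraction of reads explained by "1++,1--,2+-,2-+": 0.9644 ::
Fraction of reads explained by "1+-,1-+,2++,2--": 0.0356
Fraction of reads explained by other combinations: 0.0000
'''
    print(classify_strandedness(parse_fractions(example2)))
```

Under these assumed cutoffs, the Example1 report (0.4992 versus 0.5008) would be labelled unstranded and the Example3 report would be labelled stranded with the "++,--" pattern, which matches the conclusions stated in the help text. Reports that fall between the two cutoffs are returned as undetermined rather than guessed.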