# HG changeset patch
# User rlegendre
# Date 1413817669 14400
# Node ID 015db5db052c9c0512303e1796ec2cdd88358337
# Parent 29c9c86e17e1431c26f4548ab17f59eb6e210643
Uploaded

diff -r 29c9c86e17e1 -r 015db5db052c metagene_readthrough.xml
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/metagene_readthrough.xml	Mon Oct 20 11:07:49 2014 -0400
@@ -0,0 +1,60 @@
+
+ Analyse Ribo-seq alignment to detect readthrough events
+
+ samtools
+ HTseq
+ pysam
+ csv
+ Bio
+
+
+ metagene_readthrough.py --gff $gff --fasta $fasta --bam $mapping --dirout=$output,$output.files_path
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Summary
+-------
+This tool uses Ribo-seq data (BAM file) to extract genes with potential readthrough events from a reference annotation file (GFF3).
+
+C-terminal protein extensions are identified as previously described (Dunn J.G. et al., 2013). Only uniquely mapped footprints of 25 to 34 nucleotides are considered.
+A gene is considered a readthrough candidate if:
+
+ i) It is covered by more than 128 footprints.
+
+ ii) There are footprints after the stop codon.
+
+ iii) There are footprints overlapping the next in-frame stop codon.
+
+ iv) There is no methionine in the five codons downstream of the annotated stop codon of the CDS.
+
+ v) The coverage is homogeneous within the extension.
+
+Stop codon readthrough is estimated as a ratio between footprint density in the C-terminal extension and in the CDS. Footprint densities are expressed in RPKM (reads per kilobase per million mapped reads).
+To control for variability due to stop codon peaks, footprints mapping to stop codons are excluded from the RPKM computation.
+
+Output
+-------
+This tool produces an HTML file with plots for each readthrough gene.
+
+
+Dependencies
+------------
+
+.. class:: warningmark
+
+This tool depends on Python (>=2.7) and the following packages: numpy 1.8.0, Biopython 1.58, matplotlib 1.3.1, HTSeq and pysam. Samtools is used for BAM manipulation.
+
+
+
+
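
Note on the help text in this patch: the footprint filter and the readthrough estimate it describes can be illustrated with a minimal sketch. This is not the code of `metagene_readthrough.py`, which is not part of this changeset. The function names, argument names, and the direction of the ratio (extension density divided by CDS density) are assumptions made for illustration. Read counts are assumed to already exclude footprints mapping to stop codons.

```python
# Hypothetical sketch of the computations described in the help text.
# None of these names come from metagene_readthrough.py.

MIN_FOOTPRINT_NT = 25    # size range stated in the help text
MAX_FOOTPRINT_NT = 34
MIN_GENE_COVERAGE = 128  # a gene must have MORE than this many footprints


def is_usable_footprint(length_nt, is_uniquely_mapped):
    """Keep only uniquely mapped footprints of 25 to 34 nucleotides."""
    return is_uniquely_mapped and MIN_FOOTPRINT_NT <= length_nt <= MAX_FOOTPRINT_NT


def rpkm(read_count, region_length_nt, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    if region_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and library size must be positive")
    return read_count * 1e9 / (float(region_length_nt) * total_mapped_reads)


def readthrough_ratio(cds_reads, cds_length_nt,
                      ext_reads, ext_length_nt, total_mapped_reads):
    """Extension density over CDS density (assumed direction of the ratio).

    Returns None when the gene fails the coverage threshold or the CDS
    density is zero, since the ratio is then undefined or not meaningful.
    """
    if cds_reads <= MIN_GENE_COVERAGE:
        return None
    cds_density = rpkm(cds_reads, cds_length_nt, total_mapped_reads)
    if cds_density == 0:
        return None
    ext_density = rpkm(ext_reads, ext_length_nt, total_mapped_reads)
    return ext_density / cds_density
```

For example, under these assumptions, a gene with 1000 footprints over a 1000 nt CDS and 10 footprints over a 100 nt extension gives a ratio of 0.1, whatever the library size, because the per-million scaling cancels in the ratio.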