changeset 31:503e4165b1fc
Uploaded
author | yhoogstrate |
---|---|
date | Fri, 07 Feb 2014 05:00:14 -0500 |
parents | 689c7888b39b |
children | a6665369ff4f |
files | featurecounts_valid_gff.xml |
diffstat | 1 files changed, 47 insertions(+), 17 deletions(-) |
--- a/featurecounts_valid_gff.xml Fri Feb 07 04:34:48 2014 -0500
+++ b/featurecounts_valid_gff.xml Fri Feb 07 05:00:14 2014 -0500
@@ -118,21 +118,51 @@
 </outputs>

 <help>
-featureCounts-valid-gff:
- This application count reads aligned to annotated genes in a reference genome from SAM or BAM files.
- This is similar to tools like DEXSeq-count, HTSeq-count, etc.
- The tool is written in pure C, without the requirement of third party sorting software.
- Therefore this tool is incredibly fast and takes about 7 minutes on a single average CPU for a 10GB alignment to a Homo Sapiens genome!
-
- ---
-
- Remark that this is a fork of the original "featureCounts" package, which can be found at:
- http://subread.sourceforge.net/
-
- ---
-
- This fork is able to read GTF/GFF files according to the provided standard by Ensembl, which can be found at:
- http://www.ensembl.org/info/website/upload/gff.html
- The fork is maintained by: Youri Hoogstrate
- </help>
+featureCounts-valid-gff::
+
+**featureCounts Overview**
+FeatureCounts is a light-weight read counting program written entirely in the C programming language. It can be used to count both gDNA-seq and RNA-seq reads for genomic features in SAM/BAM files.
+It has a variety of advanced parameters but its major strength is its outstanding performance: analysis of a 10GB BAM file takes about 7 minutes on a single average CPU (Homo Sapiens genome)!
+Liao Y, Smyth GK and Shi W. featureCounts: an efficient general-purpose program for assigning sequence reads to genomic features. Bioinformatics, Advance Access, accepted on Nov 7, 2013
+
+featureCounts is part of a bigger analysis suite called subread:
+http://subread.sourceforge.net/
+Liao Y, Smyth GK and Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Research, 41(10):e108, 2013
+
+The tool requires a GFF/GTF file for getting the genomic coordinates of the genes that should be measured.
+This file format is known to have at least the following common variants:
+* http://genome.ucsc.edu/FAQ/FAQformat.html#format3
+* http://www.ensembl.org/info/website/upload/gff.html
+
+FeatureCounts is only able to handle the UCSC variant. Therefore the project was forked to "featureCounts-valid-gff", able to handle both.
+This wrapper wraps the fork, such that it is independent of the GFF/GTF variant.
+
+**Input formats**
+
+Alignments should be provided in either:
+* SAM format - http://samtools.sourceforge.net/samtools.shtml#5
+* BAM format
+
+Gene regions should be provided in the GFF/GTF format:
+* http://genome.ucsc.edu/FAQ/FAQformat.html#format3
+* http://www.ensembl.org/info/website/upload/gff.html
+
+**Installation**
+
+Make sure you have proper GFF/GTF files (corresponding to your reference genome used for the alignment) uploaded to your history.
+The source of this file should not be important since this fork can handle both ENSEMBL and UCSC variants of the GTF/GFF format.
+
+**License**
+
+* featureCounts / subread: GNU General Public License version 3.0 (GPLv3)
+* featureCounts-valid-gff: GNU General Public License version 3.0 (GPLv3)
+
+**Contact**
+
+The tool wrapper has been written by Youri Hoogstrate from the Erasmus Medical Center (Rotterdam, Netherlands) on behalf of the Translational Research IT (TraIT) project:
+http://www.ctmm.nl/en/programmas/infrastructuren/traitprojecttranslationeleresearch
+
+More tools by the Translational Research IT (TraIT) project can be found in the following repository:
+http://toolshed.nbic.nl/
+</help>
 </tool>
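Note on the GFF/GTF input named in the help text: both linked format descriptions use nine tab-separated columns per record, with the feature's start and end coordinates in columns 4 and 5. The following is a minimal Python sketch of reading those coordinates from one record. It is not code from featureCounts or from the fork; the names `Feature` and `parse_feature_line` and the sample record are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """Coordinate fields of one GFF/GTF record (hypothetical helper type)."""
    seqname: str
    feature: str
    start: int
    end: int
    strand: str
    attributes: str


def parse_feature_line(line: str) -> Feature:
    """Split one tab-separated GFF/GTF record into its coordinate fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    # Column order: seqname, source, feature, start, end, score, strand, frame, attribute
    seqname, _source, feature, start, end, _score, strand, _frame, attributes = fields
    return Feature(seqname, feature, int(start), int(end), strand, attributes)


# Made-up sample record, used only to exercise the parser.
example = 'chr1\texample\texon\t11869\t12227\t.\t+\t.\tgene_id "geneA";'
record = parse_feature_line(example)
print(record.seqname, record.start, record.end, record.strand)
```

The attribute column is kept as a raw string here because its layout is where the format variants tend to differ; how the fork interprets it is not shown in this changeset.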