Mercurial > repos > devteam > data_manager_hisat_index_builder
view data_manager/hisat_index_builder.xml @ 0:8082f04de7ae draft default tip
planemo upload for repository https://github.com/galaxyproject/tools-devteam/tree/master/data_managers/data_manager_hisat_index_builder commit 86cf90107482cab1cb47fc0d42d6705f8077daa7
author | devteam |
---|---|
date | Fri, 06 Nov 2015 14:17:24 -0500 |
parents | |
children |
line wrap: on
line source
<tool id="hisat_index_builder_data_manager" name="HISAT index" tool_type="manage_data" version="1.0.0"> <description>builder</description> <requirements> <requirement type="package" version="0.1.6">hisat</requirement> </requirements> <stdio> <exit_code range=":-1" /> <exit_code range="1:" /> </stdio> <command interpreter="python">hisat_index_builder.py "${out_file}" --fasta_filename "${all_fasta_source.fields.path}" --fasta_dbkey "${all_fasta_source.fields.dbkey}" --fasta_description "${all_fasta_source.fields.name}" --data_table_name "hisat_indexes"</command> <inputs> <param label="Source FASTA Sequence" name="all_fasta_source" type="select"> <options from_data_table="all_fasta" /> </param> <param label="Name of sequence" name="sequence_name" type="text" value="" /> <param label="ID for sequence" name="sequence_id" type="text" value="" /> </inputs> <outputs> <data format="data_manager_json" name="out_file" /> </outputs> <help> <![CDATA[ .. class:: infomark **Notice:** If you leave name, description, or id blank, it will be generated automatically. What is HISAT? -------------- `HISAT <http://ccb.jhu.edu/software/hisat>`__ is a fast and sensitive spliced alignment program. As part of HISAT, we have developed a new indexing scheme based on the Burrows-Wheeler transform (`BWT <http://en.wikipedia.org/wiki/Burrows-Wheeler_transform>`__) and the `FM index <http://en.wikipedia.org/wiki/FM-index>`__, called hierarchical indexing, that employs two types of indexes: (1) one global FM index representing the whole genome, and (2) many separate local FM indexes for small regions collectively covering the genome. Our hierarchical index for the human genome (about 3 billion bp) includes ~48,000 local FM indexes, each representing a genomic region of ~64,000bp. As the basis for non-gapped alignment, the FM index is extremely fast with a low memory footprint, as demonstrated by `Bowtie <http://bowtie-bio.sf.net>`__. In addition, HISAT provides several alignment strategies specifically designed for mapping different types of RNA-seq reads. All these together, HISAT enables extremely fast and sensitive alignment of reads, in particular those spanning two exons or more. As a result, HISAT is much faster >50 times than `TopHat2 <http://ccb.jhu.edu/software/tophat>`__ with better alignment quality. Although it uses a large number of indexes, the memory requirement of HISAT is still modest, approximately 4.3 GB for human. HISAT uses the `Bowtie2 <http://bowtie-bio.sf.net/bowtie2>`__ implementation to handle most of the operations on the FM index. In addition to spliced alignment, HISAT handles reads involving indels and supports a paired-end alignment mode. Multiple processors can be used simultaneously to achieve greater alignment speed. HISAT outputs alignments in `SAM <http://samtools.sourceforge.net/SAM1.pdf>`__ format, enabling interoperation with a large number of other tools (e.g. `SAMtools <http://samtools.sourceforge.net>`__, `GATK <http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit>`__) that use SAM. HISAT is distributed under the `GPLv3 license <http://www.gnu.org/licenses/gpl-3.0.html>`__, and it runs on the command line under Linux, Mac OS X and Windows. ]]> </help> <citations> <citation type="doi">10.1038/nmeth.3317</citation> </citations> </tool>