view flaimapper-gtf-from-fasta.xml @ 24:bd695d4cb7c4 draft

Uploaded
author yhoogstrate
date Sun, 29 Mar 2015 02:43:11 -0400
parents e0c0fc569303
children 19d1402611ef

<?xml version="1.0" encoding="UTF-8"?>
<tool id="flaimapper-gtf-from-fasta" name="FlaiMapper: extract GTF from FASTA" version="1.1.5.b">
	<description>Extract GTF file from FASTA file (as FlaiMapper reference).</description>
	<requirements>
		<requirement type="package" version="0.8.2.1">pysam</requirement>
		<requirement type="package" version="1.1.5">flaimapper</requirement>
	</requirements>
	
	<version_command>flaimapper --version</version_command>
	
	<command>
		gtf-from-fasta
			 -o $output
			    $fasta
	</command>
	
	<stdio>
		<regex
			match="\[fai_load\] build FASTA index\."
			source="stderr" 
			level="log" 
			description="The FASTA file is being indexed." />
	</stdio>
	
	<inputs>
		<param name="fasta" type="data" format="fasta" label="FASTA file with the reference sequences (ncRNA database)" help="" />
	</inputs>
	
	<outputs>
		<data format="gtf" name="output" label="${tool.name} on ${fasta.name}" />
	</outputs>
	
	<tests>
		<test>
			<param name="fasta" value="ncrnadb09.fa" />
			
			<output name="output" file="ncrnadb09.gtf" />
		</test>
	</tests>
	
	<help>
FlaiMapper wrapper for Galaxy
=============================

https://github.com/yhoogstrate/flaimapper
http://www.ncbi.nlm.nih.gov/pubmed/25338717
http://dx.doi.org/10.1093/bioinformatics/btu696

Fragment Location Annotation Identification Mapper

FlaiMapper: computational annotation of small ncRNA-derived fragments using RNA-seq high-throughput data.

Input formats
-------------
To make FlaiMapper compatible with both an entire reference genome and a
separate ncRNA database, it requires an additional GTF file *(mask file)*.
The major difference between the two is that an entire reference genome
usually contains multiple ncRNAs per sequence entry (chromosome), whereas
each entry in an ncRNA database should represent a single mature ncRNA.

Therefore, the mask file that corresponds to the FASTA file of an ncRNA
database only contains the start and end positions of each entry.
You can use this tool to generate it automatically, *as long as the FASTA
file contains mature ncRNAs rather than entire chromosomes*.
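
The idea of a mask file that spans each entry from its first to its last
base can be illustrated with a minimal Python sketch. It is not the
implementation of gtf-from-fasta: the function name ``fasta_to_mask_gtf``,
the ``source`` and ``feature`` column values, the attribute column and the
sequence names are all hypothetical, and the output of the actual tool may
differ.

```python
import io


def fasta_to_mask_gtf(fasta_handle, source="example", feature="exon"):
    """Yield one GTF line per FASTA entry, covering the entire sequence.

    The 'source' and 'feature' defaults are placeholders chosen for this
    sketch, not values taken from FlaiMapper.
    """
    def gtf_line(name, length):
        # GTF coordinates are 1-based and inclusive, so a feature that
        # covers the whole entry runs from 1 to the sequence length.
        attributes = 'gene_id "{0}"; transcript_id "{0}";'.format(name)
        fields = [name, source, feature, "1", str(length),
                  ".", "+", ".", attributes]
        return "\t".join(fields)

    name = None
    length = 0
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if name is not None:
                yield gtf_line(name, length)
            tokens = line[1:].split()
            # Use the first whitespace-delimited token as the entry name.
            name = tokens[0] if tokens else ""
            length = 0
        elif name is not None:
            # Sequence lines may be wrapped, so lengths are accumulated.
            length += len(line)
    if name is not None:
        yield gtf_line(name, length)


if __name__ == "__main__":
    # Two made-up entries; the second sequence is wrapped over two lines.
    example = io.StringIO(">ncRNA_A\nACGTACGT\n>ncRNA_B\nGGGCCC\nTTT\n")
    for row in fasta_to_mask_gtf(example):
        print(row)
```

Each printed row describes one FASTA entry with start position 1 and an end
position equal to the length of its sequence.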

An example input file is **ncRNAdb09**, available at the following URL:
https://raw.githubusercontent.com/yhoogstrate/flaimapper/master/share/annotations/ncRNA_annotation/ncrnadb09.fa *(reference file)*

It should generate a GTF/GFF file (mask file) similar to the one at the following URL:
https://raw.githubusercontent.com/yhoogstrate/flaimapper/master/share/annotations/ncRNA_annotation/ncrnadb09.gtf *(mask file)*

Installation
------------

The wrapper uses easy_install to install a Python egg. Please ensure
that easy_install is installed.

License
-------

**flaimapper** and **wrapper**:

GPL (>=3)

**pysam**:

The MIT License

Contact
-------

The tool wrapper has been written by Youri Hoogstrate from the Erasmus
Medical Center (Rotterdam, Netherlands).


Development
-----------

* Repository-Maintainer: Youri Hoogstrate
* Repository-Developers: Youri Hoogstrate

* Repository-Development: https://bitbucket.org/EMCbioinf/galaxy-tool-shed-tools


	</help>
	
	<citations>
		<citation type="doi">10.1093/bioinformatics/btu696</citation>
	</citations>
</tool>