GFF3_to_GTF (version 1.0.0)
GFF3 format file for converting to GTF.

What it does

This tool converts data from GFF3 format to GTF format.


Example


About formats

GFF3 format: General Feature Format (GFF) is a format for describing genes and other features associated with DNA, RNA, and protein sequences. GFF3 lines have nine tab-separated fields:

  1. seqid - The ID of the sequence on which the feature is located, typically a chromosome or scaffold.
  2. source - The program that generated this feature.
  3. type - The name of this type of feature. Some examples of standard feature types are "gene", "CDS", "protein", "mRNA", and "exon".
  4. start - The starting position of the feature in the sequence. The first base is numbered 1.
  5. end - The ending position of the feature (inclusive).
  6. score - A floating-point score for the feature. If there is no score value, enter ".".
  7. strand - Valid entries include '+', '-', or '.' (for don't know/care).
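  8. phase - If the feature is a coding exon (CDS), phase should be 0, 1, or 2, giving the number of bases to skip at the start of the feature to reach the first base of the next codon. If the feature is not a coding exon, the value should be '.'.
  9. attributes - A semicolon-separated list of tag=value pairs. The ID and Parent tags link related lines (for example, exons to their mRNA, and the mRNA to its gene) into a single item.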
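
As a minimal sketch of how the nine fields above can be read, the following Python function splits one GFF3 line on tabs and breaks the attributes column into tag=value pairs. The function name parse_gff3_line and the example line are hypothetical and are not part of this tool; the sketch assumes a well-formed feature line rather than a comment or directive.

```python
from urllib.parse import unquote

# Column names in the order used by GFF3.
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based and inclusive.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Column 9 holds semicolon-separated tag=value pairs; "." means none.
    attrs = {}
    if record["attributes"] != ".":
        for pair in record["attributes"].split(";"):
            pair = pair.strip()
            if not pair:
                continue
            tag, _, value = pair.partition("=")
            attrs[tag] = unquote(value)  # values may be percent-encoded
    record["attributes"] = attrs
    return record


# Hypothetical example line, for illustration only.
example = "chr1\texample\tmRNA\t1050\t9000\t.\t+\t.\tID=mRNA1;Parent=gene1"
rec = parse_gff3_line(example)
print(rec["type"], rec["start"], rec["end"], rec["attributes"])
```

Keeping score and phase as strings avoids guessing how a "." placeholder should be represented; a caller can convert them when needed.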

GTF format: Gene Transfer Format borrows from GFF but has additional structure that warrants a separate definition and format name. GTF lines have nine tab-separated fields:

  1. seqname - The name of the sequence.
  2. source - Indicates where the annotation came from.
  3. feature - The name of the feature type. The following feature types are required: 'CDS', 'start_codon', and 'stop_codon'.
  4. start - The starting position of the feature in the sequence. The first base is numbered 1.
  5. end - The ending position of the feature (inclusive).
  6. score - The score field indicates a degree of confidence in the feature's existence and coordinates.
  7. strand - Valid entries include '+', '-', or '.'
  8. frame - If the feature is a coding exon, frame should be a number between 0-2 that represents the reading frame of the first base.
  9. attributes - These attributes are designed for handling multiple transcripts from the same genomic region. Every line carries a gene_id and a transcript_id attribute.
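
To illustrate how the two formats relate, here is a rough Python sketch of a conversion. It is not this tool's implementation. It assumes a small, well-formed GFF3 input in which each exon or CDS has a single Parent (the transcript) and each transcript has a single Parent (the gene). The function name gff3_to_gtf and the sample records are hypothetical. Columns 1 to 8 are copied unchanged, and the sketch does not generate start_codon or stop_codon features.

```python
def gff3_to_gtf(gff3_lines):
    """Yield GTF lines for the exon and CDS features of a GFF3 input."""
    records = []
    parent_of = {}  # feature ID -> Parent ID, e.g. mRNA1 -> gene1
    for line in gff3_lines:
        # Skip blank lines, comments, and directives such as ##gff-version.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Assumption: attributes are plain tag=value pairs, one Parent each.
        attrs = dict(p.split("=", 1) for p in cols[8].split(";") if "=" in p)
        if "ID" in attrs and "Parent" in attrs:
            parent_of[attrs["ID"]] = attrs["Parent"]
        records.append((cols, attrs))

    for cols, attrs in records:
        if cols[2] not in ("exon", "CDS"):
            continue
        transcript_id = attrs.get("Parent")
        if transcript_id is None:
            continue  # cannot place this feature in a transcript
        # Fall back to the transcript ID when no gene-level parent exists.
        gene_id = parent_of.get(transcript_id, transcript_id)
        gtf_attrs = f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'
        yield "\t".join(cols[:8] + [gtf_attrs])


# Hypothetical input, for illustration only.
gff3 = [
    "##gff-version 3",
    "chr1\texample\tgene\t1000\t9000\t.\t+\t.\tID=gene1",
    "chr1\texample\tmRNA\t1050\t9000\t.\t+\t.\tID=mRNA1;Parent=gene1",
    "chr1\texample\texon\t1050\t1500\t.\t+\t.\tID=exon1;Parent=mRNA1",
    "chr1\texample\tCDS\t1201\t1500\t.\t+\t0\tID=cds1;Parent=mRNA1",
]
for gtf_line in gff3_to_gtf(gff3):
    print(gtf_line)
```

The input is read in two passes so that a child feature can be resolved even when it appears before its parent in the file.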

This tool is a part of the MLB Group at Friedrich Miescher Laboratory of the Max Planck Society. Copyright (C) 2010 Vipin T. Sreedharan (vipin.ts@tuebingen.mpg.de)