Abstract
Summary Gene-centric bioinformatics studies frequently require calculating or extracting gene features, such as gene ID mappings, GC content and different types of gene length, by manipulating gene models that are typically annotated in GTF format and available from the ENSEMBL or GENCODE databases. Such computation is essential for downstream analyses, including intron retention detection, where independent introns may need to be identified; conversion of RNA-seq read counts to FPKM, where gene length is required; and extraction of flanking regions around transcription start sites. However, to our knowledge, no publicly available software package is dedicated to analyzing various modes of gene models directly from a GTF file. Here we present GTFtools, a stand-alone command-line tool implemented in Python with no dependence on non-Python third-party software, which provides a set of functions for analyzing various modes of gene models and facilitates routine bioinformatics studies in which information about gene models needs to be calculated.
Availability GTFtools is freely available at www.genemine.org/gtftools.php
Contact hongdong{at}csu.edu.cn.
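Illustration The gene-length and FPKM computations mentioned in the Summary can be sketched in a few lines of Python. This is a minimal sketch and not GTFtools' own implementation: the function names `merged_exon_lengths` and `fpkm` are hypothetical, the example records are invented for illustration, and "gene length" is assumed here to mean the length of the union of all annotated exons of a gene, which is only one of several possible definitions.

```python
from collections import defaultdict


def merged_exon_lengths(gtf_lines):
    """Return {gene_id: length in bp of the union of its exons}.

    Hypothetical helper for illustration; assumes standard 9-column,
    tab-separated GTF records with a gene_id attribute.
    """
    exons = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon records contribute to the length
        start, end = int(fields[3]), int(fields[4])  # 1-based, inclusive
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if attr.startswith("gene_id "):
                gene_id = attr[len("gene_id "):].strip('"')
                exons[gene_id].append((start, end))
                break

    lengths = {}
    for gene_id, intervals in exons.items():
        intervals.sort()
        total = 0
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:  # overlapping or adjacent: extend
                cur_end = max(cur_end, end)
            else:  # gap: close the current merged interval
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
        lengths[gene_id] = total
    return lengths


def fpkm(count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of gene per million mapped fragments."""
    return count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Invented example: one gene with two overlapping exons and one separate exon.
example = [
    'chr1\ttest\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttest\texon\t150\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
    'chr1\ttest\texon\t500\t600\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
]
lengths = merged_exon_lengths(example)
print(lengths)                                # {'G1': 302}
print(fpkm(302, lengths["G1"], 1_000_000))    # 1000.0
```

In the example, exons 100-200 and 150-300 merge into a single 201 bp interval, and the separate 101 bp exon brings the total to 302 bp. Other length definitions, such as the mean or longest transcript length, would give different FPKM values for the same read counts.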