The tedious task of finding homologous noncoding RNA genes

  1. Peter Menzel (1, 2),
  2. Jan Gorodkin (1), and
  3. Peter F. Stadler (2, 3, 4, 5, 6)
  1. Section for Genetics and Bioinformatics, IBHV, and Center for Applied Bioinformatics, University of Copenhagen, DK-1870 Frederiksberg, Denmark
  2. Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany
  3. Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany
  4. RNomics Group, Fraunhofer Institut für Zelltherapie und Immunologie, D-04103 Leipzig, Germany
  5. Santa Fe Institute, Santa Fe, New Mexico 87501, USA
  6. Department of Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria

    Abstract

    User-driven in silico RNA homology search is still a nontrivial task. In part, this is a consequence of the limited precision of the computational tools, despite recent exciting progress in this area; to a certain extent, computational costs also remain problematic in practice. An important and, as we argue here, dominating issue is the dependence on well-curated (secondary) structural alignments of the RNAs. These are often hard to obtain, not so much because of an inherent limitation in the available data, but because they require substantial manual curation, an effort that is rarely acknowledged. Here, we qualitatively describe a realistic scenario for what a “regular user” (i.e., a nonexpert in a particular RNA family) can do in practice, and what kind of results are likely to be achieved. Despite the indisputable advances in computational RNA biology, the conclusion is discouraging: BLAST still works as well as or better than other methods unless extensive expert knowledge of the RNA family is included. However, when well-curated data are available, recent developments yield further improvements in finding remote homologs. Homology search beyond the reach of BLAST is hence not at all a routine task.


    Footnotes

    • Reprint requests to: Peter F. Stadler, Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16-18, D-04107 Leipzig, Germany; e-mail: peter.stadler{at}bioinf.uni-leipzig.de; fax: +49-341-97-16709.

    • Article published online ahead of print. Article and publication date are at http://www.rnajournal.org/cgi/doi/10.1261/rna.1556009.
