Whole Proteome pI Values Correlate with Subcellular Localizations of Proteins for Organisms within the Three Domains of Life

Russell Schwartz (1,2,3,4), Claire S. Ting (1,2), and Jonathan King (1)

1 Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

Abstract

Isoelectric point (pI) values have long been a standard measure for distinguishing between proteins. This article analyzes distributions of pI values estimated computationally for all predicted ORFs in a selection of fully sequenced genomes. Histograms of pI values confirm the bimodality that has been observed previously for bacterial and archaeal genomes (Van Bogelen et al. 1999) and reveal a trimodality in eukaryotic genomes. A similar analysis on subsets of a nonredundant protein sequence database generated from the full database by selecting on subcellular localization shows that sequences annotated as corresponding to cytosolic and integral membrane proteins have pI distributions that appear to correspond with the two observed modes of bacteria and archaea. Furthermore, nuclear proteins have a broader distribution that may account for the third mode observed in eukaryotes. On the basis of this association between pI and subcellular localization, we conclude that the bimodal character of whole proteome pI values in bacteria and archaea and the trimodal character in eukaryotes are likely to be general properties of proteomes and are associated with the need for different pI values depending on subcellular localization. Our analyses also suggest that the proportions of proteomes consisting of membrane-associated proteins may be currently underestimated.
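
As an illustration of the kind of computation the abstract describes, the following is a minimal sketch of estimating pI from sequence and binning the results into a histogram. It is not the authors' method: the pKa table, the bisection approach, the 0.5-unit bin width, and the function names `net_charge`, `estimate_pi`, and `pi_histogram` are all assumptions chosen for illustration. The sketch treats each ionizable group independently with the Henderson-Hasselbalch relation and finds the pH at which the net charge crosses zero.

```python
# Assumed approximate pKa values for ionizable groups (illustrative only;
# the article does not state which pKa set was used).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "Nterm": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "Cterm": 2.0}


def net_charge(seq, ph):
    """Net charge of a protein sequence at a given pH (Henderson-Hasselbalch)."""
    charge = 0.0
    for group, pka in PKA_POSITIVE.items():
        n = 1 if group == "Nterm" else seq.count(group)
        # Fraction of a basic group that is protonated (charge +1).
        charge += n / (1.0 + 10 ** (ph - pka))
    for group, pka in PKA_NEGATIVE.items():
        n = 1 if group == "Cterm" else seq.count(group)
        # Fraction of an acidic group that is deprotonated (charge -1).
        charge -= n / (1.0 + 10 ** (pka - ph))
    return charge


def estimate_pi(seq, lo=0.0, hi=14.0, tol=1e-4):
    """Estimate pI by bisection; net charge decreases monotonically with pH."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid  # still positively charged: pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


def pi_histogram(seqs, bin_width=0.5):
    """Count sequences per pI bin over the pH range 0 to 14."""
    bins = [0] * int(14 / bin_width)
    for seq in seqs:
        idx = min(int(estimate_pi(seq) / bin_width), len(bins) - 1)
        bins[idx] += 1
    return bins
```

Applied to every predicted ORF in a genome, a histogram of this kind is where the bimodal or trimodal shape described above would be expected to appear; the exact peak positions would depend on the pKa set assumed.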

Footnotes

  • 2 These authors contributed equally to this work.

  • 3 Present address: Massachusetts Institute of Technology, 77 Massachusetts Avenue, Room 68-322, Cambridge, MA 02139, USA.

  • 4 Corresponding author.

  • E-MAIL rss{at}alum.mit.edu; FAX (617) 252-1843.

  • Article published on-line before print: Genome Res., 10.1101/gr.158701.

  • Article and publication are at www.genome.org/cgi/doi/10.1101/gr.158701.

    • Received August 4, 2000.
    • Accepted February 16, 2001.