High-Throughput Computational and Experimental Techniques in Structural Genomics

  1. Mark R. Chance¹,²,³,⁴,⁶,
  2. Andras Fiser¹,³,
  3. Andrej Sali¹,⁵,
  4. Ursula Pieper¹,⁵,
  5. Narayanan Eswar¹,⁵,
  6. Guiping Xu¹,³,
  7. J. Eduardo Fajardo¹,³,
  8. Thirumuruhan Radhakannan²,⁴, and
  9. Nebojsa Marinkovic²,⁴
  1. New York Structural Genomics Research Consortium, Albert Einstein College of Medicine, Bronx, New York 10461, USA
  2. Department of Physiology and Biophysics, Albert Einstein College of Medicine, Bronx, New York 10461, USA
  3. Department of Biochemistry, Albert Einstein College of Medicine, Bronx, New York 10461, USA
  4. Center for Synchrotron Biosciences, Albert Einstein College of Medicine, Bronx, New York 10461, USA
  5. Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and California Institute for Quantitative Biomedical Research, University of California San Francisco, San Francisco, California 94143, USA

Abstract

The goal of structural genomics is to provide structural information for all possible open reading frame (ORF) sequences through a combination of experimental and computational approaches. Access to genome sequences and cloning resources from an ever-widening array of organisms is driving high-throughput structural studies by the New York Structural Genomics Research Consortium. In this report, we outline the progress of the Consortium in establishing its pipeline for structural genomics, and some of the experimental and bioinformatics efforts leading to structural annotation of proteins. The Consortium has established a pipeline for structural biology studies, automated modeling of ORF sequences using solved (template) structures, and a novel high-throughput approach (metallomics) for examining metal binding to purified protein targets. The Consortium has so far produced 493 purified proteins from >1077 expression vectors. A total of 95 have yielded crystal structures, and 81 are deposited in the Protein Data Bank (PDB). Comparative modeling of these structures has generated >40,000 structural models. We also initiated a high-throughput metal analysis of the purified proteins; this has determined that 10%-15% of the targets contain a stoichiometric structural or catalytic transition metal atom. The progress of the structural genomics centers in the U.S. and around the world suggests that the goal of providing useful structural information on almost all ORF domains will be realized. This projected resource will provide structural biology information important to understanding the function of most proteins of the cell.

Footnotes

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.2537904.

  • 6 Corresponding author. E-MAIL mrc{at}aecom.yu.edu; FAX (718) 430-8587.

    • Accepted May 12, 2004.
    • Received March 3, 2004.