Segmentation algorithm for DNA sequences

Chun-Ting Zhang, Feng Gao, and Ren Zhang
Phys. Rev. E 72, 041917 – Published 17 October 2005

Abstract

A new measure, called the quadratic divergence, is proposed to quantify the difference between two probability distributions. Based on the quadratic divergence, a new segmentation algorithm is put forward to partition a given genome or DNA sequence into compositionally distinct domains. The new algorithm was applied to the 24 human chromosome sequences, and the isochore boundaries for each chromosome were obtained. The entropic segmentation algorithm based on the Jensen-Shannon divergence yielded identical coordinates for all segmentation points, and an explanation of the equivalence of the two algorithms is presented. The new algorithm has a number of advantages; in particular, it is much simpler and faster than the entropy-based method. It is therefore more suitable for analyzing long genome sequences, such as human and other newly sequenced eukaryotic genomes.
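For orientation, the baseline that the abstract compares against, entropic segmentation based on the Jensen-Shannon divergence, can be sketched as a search for the single cut point that maximizes the divergence between the base compositions of the left and right subsequences. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the function names `shannon_entropy` and `best_split` are hypothetical, the sequence is assumed to contain only A, C, G, and T, and neither a stopping criterion nor the recursive application to each resulting segment is included. The quadratic divergence itself is not defined in the abstract, so it is not reproduced here.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: the sequence contains only these four bases


def shannon_entropy(counts, total):
    """Shannon entropy (in bits) of a base composition given as counts."""
    if total == 0:
        return 0.0
    h = 0.0
    for base in ALPHABET:
        p = counts.get(base, 0) / total
        if p > 0:
            h -= p * math.log2(p)
    return h


def best_split(seq):
    """Return (position, divergence) for the cut that maximizes the
    length-weighted Jensen-Shannon divergence between seq[:position]
    and seq[position:]. Hypothetical helper, single cut only."""
    n = len(seq)
    total = Counter(seq)
    h_total = shannon_entropy(total, n)
    left = Counter()
    best_pos, best_div = None, -1.0
    for i in range(1, n):
        left[seq[i - 1]] += 1      # extend the left segment by one base
        right = total - left       # composition of the remaining right segment
        div = (h_total
               - (i / n) * shannon_entropy(left, i)
               - ((n - i) / n) * shannon_entropy(right, n - i))
        if div > best_div:
            best_pos, best_div = i, div
    return best_pos, best_div


if __name__ == "__main__":
    # Toy sequence with an obvious compositional boundary at position 50.
    pos, div = best_split("A" * 50 + "G" * 50)
    print(pos, div)
```

A full segmentation procedure would apply such a search recursively to each resulting segment until some halting condition is met; that condition is left out here because the abstract does not specify it.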

  • Received 7 March 2005

DOI: https://doi.org/10.1103/PhysRevE.72.041917

©2005 American Physical Society

Authors & Affiliations

Chun-Ting Zhang1,*, Feng Gao1, and Ren Zhang2

  • 1Department of Physics, Tianjin University, Tianjin 300072, China
  • 2Department of Epidemiology and Biostatistics, Tianjin Cancer Institute and Hospital, Tianjin 300060, China

  • *Email: ctzhang@tju.edu.cn

Issue

Vol. 72, Iss. 4 — October 2005
