Abstract
We used a pseudorandom-walk representation in a four-dimensional embedding to estimate the global fractal dimension D of 164 sequences from GenBank and generated length-matched control sequences of three types: random, matched in base content, and matched in dimer content. The mean D of the sequences was 1.631±0.137. This D was significantly lower than the D’s for all three control types, indicating the presence of significant information content in DNA sequences not explained by base or dimer frequencies. This variation was due largely to nonuniform distribution of bases and dimers within DNA sequences. The D of genomic DNA sequences was different from the D of messenger RNA sequences.
Received 19 February 1992
DOI: https://doi.org/10.1103/PhysRevA.45.8902
©1992 American Physical Society
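
The pseudorandom-walk idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the step mapping or the estimator, so everything here is an assumption for illustration: each base is taken as a unit step along its own axis of a four-dimensional space, the steps are centered by subtracting 1/4 from every coordinate so that a uniformly random sequence has no net drift, and D is read off the scaling of the mean squared displacement, MSD(n) ~ n^(2/D). The function names (`walk_positions`, `mean_sq_displacement`, `fractal_dimension`, `base_matched_control`) are hypothetical, and this is not necessarily the estimator used by the authors.

```python
import math
import random

# Assumed mapping (not stated in the abstract): one axis per base.
AXIS = {"A": 0, "C": 1, "G": 2, "T": 3}


def walk_positions(seq):
    """Positions of a centered 4-D walk; seq must contain only A, C, G, T."""
    pos = [0.0] * 4
    path = [tuple(pos)]
    for base in seq:
        axis = AXIS[base]
        for k in range(4):
            # Unit step along the base's axis, minus 1/4 on every axis,
            # so a uniform random sequence has zero mean step.
            pos[k] += (1.0 if k == axis else 0.0) - 0.25
        path.append(tuple(pos))
    return path


def mean_sq_displacement(path, n):
    """Squared displacement over n steps, averaged over all start points."""
    count = len(path) - n
    total = 0.0
    for i in range(count):
        a, b = path[i], path[i + n]
        total += sum((b[k] - a[k]) ** 2 for k in range(4))
    return total / count


def fractal_dimension(seq):
    """Estimate D from the log-log slope of MSD(n), assuming MSD ~ n**(2/D)."""
    path = walk_positions(seq)
    max_n = max(2, len(seq) // 16)  # keep windows short relative to the walk
    xs, ys = [], []
    n = 1
    while n <= max_n:
        xs.append(math.log(n))
        ys.append(math.log(mean_sq_displacement(path, n)))
        n *= 2
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return 2.0 / slope


def base_matched_control(seq, rng):
    """Shuffle a sequence: same length and base content, random order."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)
```

Under these assumptions an uncorrelated uniform sequence should give D close to 2, and a perfectly persistent walk (a single repeated base) gives D = 1, so values below 2 indicate persistence in the walk. A dimer-matched control would need a shuffle that preserves dinucleotide counts, which this sketch does not attempt.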