Identification and measurement of neighbor-dependent nucleotide substitution processes

Bioinformatics. 2005 May 15;21(10):2322-8. doi: 10.1093/bioinformatics/bti376. Epub 2005 Mar 15.

Abstract

Motivation: Neighbor-dependent substitution processes generate specific patterns of dinucleotide frequencies in the genomes of most organisms. For example, the CpG-methylation-deamination process (the CpG effect) is prominent in vertebrates. Such processes, often of unknown mechanistic origin, need to be incorporated into realistic models of nucleotide substitution.

Results: Based on a general framework of nucleotide substitutions, we developed a method that identifies the most relevant neighbor-dependent substitution processes, estimates their relative frequencies and judges whether they are important enough to be included in the model. Starting from a model of neighbor-independent nucleotide substitution, we successively added neighbor-dependent substitution processes in the order of their ability to increase the likelihood of the model given the data. We present an analysis of neighbor-dependent nucleotide substitutions based on repetitive elements found in the genomes of human, zebrafish and fruit fly.
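
The successive-addition procedure described in the Results can be illustrated with a minimal sketch of greedy forward selection. This is not the authors' implementation. The function name `forward_select`, the stopping threshold `min_gain`, the candidate process labels and the toy log-likelihood values are all assumptions made for illustration. The abstract does not specify how the likelihood is computed or how importance is judged, so the sketch takes the log-likelihood as a caller-supplied function and uses a simple gain threshold as a stand-in criterion.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def forward_select(
    candidates: Iterable[str],
    log_likelihood: Callable[[FrozenSet[str]], float],
    min_gain: float = 0.0,
) -> List[Tuple[str, float]]:
    """Greedily add the process that most increases the log-likelihood.

    Returns the selected processes in order of inclusion, each paired with
    the log-likelihood gain it contributed. `min_gain` is a hypothetical
    stopping threshold, not a criterion taken from the paper.
    """
    selected: FrozenSet[str] = frozenset()
    remaining = set(candidates)
    # Baseline: the neighbor-independent model (no extra processes).
    current = log_likelihood(selected)
    order: List[Tuple[str, float]] = []
    while remaining:
        # Score the model obtained by adding each remaining process.
        scored = [(log_likelihood(selected | {p}), p) for p in sorted(remaining)]
        best_ll, best = max(scored)
        gain = best_ll - current
        if gain <= min_gain:
            break  # no remaining process improves the model enough
        selected = selected | {best}
        remaining.remove(best)
        current = best_ll
        order.append((best, gain))
    return order


# Toy example with invented numbers (NOT results from the paper):
# each process adds a fixed amount to a baseline log-likelihood.
toy_gains = {"CpG_deamination": 120.0, "hypothetical_B": 8.0, "hypothetical_C": 0.5}


def toy_log_likelihood(processes: FrozenSet[str]) -> float:
    return -1000.0 + sum(toy_gains[p] for p in processes)


result = forward_select(toy_gains, toy_log_likelihood, min_gain=1.0)
for name, gain in result:
    print(name, gain)
```

In a real analysis the log-likelihood function would refit the substitution model to the alignment data for each candidate set, which is far more expensive than this additive toy. The greedy loop itself would stay the same.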

Availability: A web server to perform the presented analysis is freely available at: http://evogen.molgen.mpg.de/server/substitution-analysis

Publication types

  • Comparative Study

MeSH terms

  • Algorithms*
  • Animals
  • Base Composition
  • Base Pair Mismatch / genetics*
  • Base Pairing / genetics
  • Base Sequence
  • Chromosome Mapping / methods*
  • CpG Islands / genetics
  • Drosophila melanogaster
  • Evolution, Molecular*
  • Genome, Human
  • Humans
  • Models, Genetic
  • Molecular Sequence Data
  • Nucleotides / genetics*
  • Repetitive Sequences, Nucleic Acid / genetics
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Sequence Homology, Nucleic Acid
  • Software
  • Zebrafish

Substances

  • Nucleotides