RT Journal Article
SR Electronic
T1 Codon usage is a stochastic process across genetic codes of the kingdoms of life
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 066381
DO 10.1101/066381
A1 Bohdan B. Khomtchouk
A1 Claes Wahlestedt
A1 Wolfgang Nonner
YR 2016
UL http://biorxiv.org/content/early/2016/07/27/066381.abstract
AB DNA encodes protein primary structure using 64 different codons to specify 20 different amino acids and a stop signal. To uncover rules of codon use, ranked codon frequencies have previously been analyzed in terms of empirical or statistical relations for a small number of genomes. These descriptions fail on most genomes reported in the Codon Usage Tabulated from GenBank (CUTG) database. Here we model codon usage as a random variable. This stochastic model provides accurate, one-parameter characterizations of 2210 nuclear and mitochondrial genomes represented with > 10^4 codons/genome in CUTG. We show that ranked codon frequencies are well characterized by a truncated normal (Gaussian) distribution. Most genomes use codons in a near-uniform manner. Lopsided usages are also widely distributed across genomes but less frequent. Our model provides a universal framework for investigating determinants of codon use.