%0 Journal Article %A Ian Holmes %T Modular non-repeating codes for DNA storage %D 2016 %R 10.1101/057448 %J bioRxiv %P 057448 %X We describe a strategy for constructing codes for DNA-based information storage by serial composition of weighted finite-state transducers. The resulting state machines can integrate correction of substitution errors; synchronization by interleaving watermark and periodic marker signals; conversion from binary to ternary, quaternary or mixed-radix sequences via an efficient block code; encoding into a DNA sequence that avoids homopolymer, dinucleotide, or trinucleotide runs and other short local repeats; and detection/correction of errors (including local duplications, burst deletions, and substitutions) that are characteristic of DNA sequencing technologies. We present software implementing these codes, available at github.com/ihh/dnastore, with simulation results demonstrating that the generated DNA is free of short repeats and can be accurately decoded even in the presence of substitutions, short duplications and deletions. %U https://www.biorxiv.org/content/biorxiv/early/2016/06/06/057448.full.pdf