TY  - JOUR
T1  - Predicting DNA Hybridization Kinetics from Sequence
JF  - bioRxiv
DO  - 10.1101/149427
SP  - 149427
AU  - Jinny X. Zhang
AU  - John Z. Fang
AU  - Wei Duan
AU  - Lucia R. Wu
AU  - Angela W. Zhang
AU  - Neil Dalchau
AU  - Boyan Yordanov
AU  - Rasmus Petersen
AU  - Andrew Phillips
AU  - David Yu Zhang
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/06/13/149427.abstract
N2  - Hybridization is a key molecular process in biology and biotechnology, but to date there is no predictive model for accurately determining hybridization rate constants based on sequence information. To approach this problem systematically, we first performed 210 fluorescence kinetics experiments to observe the hybridization kinetics of 100 different DNA target and probe pairs (subsequences of the CYCS and VEGF genes) at temperatures ranging from 28 °C to 55 °C. Next, we rationally designed 38 features computable based on sequence, each feature individually correlated with hybridization kinetics. These features are used in our implementation of a weighted neighbor voting (WNV) algorithm, in which the hybridization rate constant of an unknown sequence is predicted based on similarity to reactions with known rate constants (a.k.a. labeled instances). Automated feature selection and weighting optimization resulted in a final 6-feature WNV model, which can predict hybridization rate constants of new sequences to within a factor of 2 with ≈74% accuracy and within a factor of 3 with ≈92% accuracy, based on leave-one-out cross-validation. Predictive understanding of hybridization kinetics allows more efficient design of nucleic acid probes, for example in allowing sparse hybrid-capture panels to more quickly and economically enrich desired regions from genomic DNA.
ER  - 