TY  - JOUR
T1  - Proper modelling of ligand binding requires an ensemble of bound and unbound states
JF  - bioRxiv
DO  - 10.1101/078147
SP  - 078147
AU  - Nicholas M Pearce
AU  - Frank von Delft
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/09/28/078147.abstract
N2  - Synopsis: We emphasise and demonstrate the importance of modelling the superpositions of ligand-bound and unbound states that commonly occur in crystallographic datasets. Generation of an ensemble that describes not only the dominant state in the crystal is important for the high-quality refinement of low-occupancy ligands, as well as to present a model that explains all of the observed density. Abstract: Small molecules bind to only a fraction of the proteins in the crystal lattice, but occupancy refinement of ligands is often avoided by convention; occupancies are set to unity, assuming that the error will be adequately modelled by the B-factors, and weak ligand density is generally ignored or attributed to disorder. Where occupancy refinement is performed, the superposed atomic state is rarely modelled. We show here that these modelling approaches lead to a degradation of the quality of the ligand model, and potentially affect the interpretation of the interactions between the bound ligand and the protein. Instead, superior accuracy is achieved by modelling the ligand as partially occupied and superposed on a ligand-free “ground-state” solvent model. Explicit modelling of the superposed unbound fraction of the crystal using a reference dataset allows constrained refinement of the occupancy of the ligand with minimal fear of over-fitting. Better representation of the crystal also leads to more meaningful refined atomic parameters such as the B-factor, allowing more insight into dynamics in the crystal. We present a simple approach and simple guidelines for generating the ensemble of bound and unbound states, assuming that datasets representing the unbound states (the ground state) are available. Judged by various electron density metrics, ensemble models are consistently better than corresponding single-state models. Furthermore, local modelling of the superposed ground state is found to be generally more important for the quality of the ligand model than convergence of the overall phases.
ER  - 