Abstract
Small molecule distribution coefficients between immiscible nonaqueous and aqueous phases, such as cyclohexane and water, measure the degree to which small molecules prefer one phase over another at a given pH. As distribution coefficients capture both thermodynamic effects (the free energy of transfer between phases) and chemical effects (protonation state and tautomer effects in aqueous solution), they provide an exacting test of the thermodynamic and chemical accuracy of physical models without the long correlation times inherent to the prediction of more complex properties of relevance to drug discovery, such as protein-ligand binding affinities. For the SAMPL5 challenge, we carried out a blind prediction exercise in which participants were tasked with the prediction of distribution coefficients to assess its potential as a new route for the evaluation and systematic improvement of predictive physical models. These measurements are typically performed for octanol-water, but we opted to use cyclohexane for the nonpolar phase. Cyclohexane was suggested to avoid issues with the high water content and persistent heterogeneous structure of water-saturated octanol phases, since it has greatly reduced water content and a homogeneous liquid structure. Using a modified shake-flask LC-MS/MS protocol, we collected cyclohexane/water distribution coefficients for a set of 53 druglike compounds at pH 7.4. These measurements were used as the basis for the SAMPL5 Distribution Coefficient Challenge, where 18 research groups predicted these measurements before the experimental values reported here were released.
In this work, we describe the experimental protocol we utilized for measurement of cyclohexane-water distribution coefficients, report the measured data, propose a new bootstrap-based data analysis procedure to incorporate multiple sources of experimental error, and provide insights to help guide future iterations of this valuable exercise in predictive modeling.
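As a rough illustration of the quantities discussed above, the following Python sketch computes a log distribution coefficient from paired cyclohexane and water signals and estimates its uncertainty with a simple two-level bootstrap. It is not the analysis procedure used in this work. The function names, the nested data layout (independent repeats, each holding replicate signal pairs), the assumption that signal is proportional to concentration, and the example numbers are all hypothetical.

```python
import math
import random
import statistics


def log_d(signal_chx, signal_water):
    """log10 of the cyclohexane/water ratio.

    Assumes (hypothetically) that each signal is proportional to the
    solute concentration in its phase, with the same proportionality constant.
    """
    return math.log10(signal_chx / signal_water)


def bootstrap_log_d(repeats, n_boot=2000, seed=0):
    """Two-level bootstrap over repeats and over replicates within a repeat.

    repeats: list of independent repeats; each repeat is a list of
             (cyclohexane_signal, water_signal) replicate pairs.
    Returns (mean, standard deviation) of the bootstrap estimates of log D.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample independent repeats with replacement.
        sampled_repeats = rng.choices(repeats, k=len(repeats))
        per_repeat = []
        for rep in sampled_repeats:
            # Resample replicate measurements within the chosen repeat.
            pairs = rng.choices(rep, k=len(rep))
            per_repeat.append(statistics.fmean([log_d(c, w) for c, w in pairs]))
        estimates.append(statistics.fmean(per_repeat))
    return statistics.fmean(estimates), statistics.stdev(estimates)


# Hypothetical signals: 2 independent repeats with 3 replicates each.
example_repeats = [
    [(120.0, 10.0), (115.0, 11.0), (130.0, 10.5)],
    [(100.0, 12.0), (98.0, 11.0), (105.0, 12.5)],
]
mean_log_d, err_log_d = bootstrap_log_d(example_repeats)
print(f"log D = {mean_log_d:.2f} +/- {err_log_d:.2f}")
```

Resampling at both levels keeps replicates grouped with their parent repeat, so variation between repeats is not understated by treating every replicate as independent.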
Footnotes
* arr2011{at}med.cornell.edu
‡ lin.baiwei{at}gene.com
§ feng{at}dnli.com
¶ ortwine.daniel{at}gene.com
** dmobley{at}uci.edu
1 Shimadzu cat. no. 228-45450-91
2 DMSO stocks from Genentech compound library
3 ACS grade ≥99%, Sigma-Aldrich cat. no. 179191-2L, batch #00555ME
4 136 mM NaCl, 2.6 mM KCl, 7.96 mM Na2HPO4, 1.46 mM KH2PO4, with pH adjusted to 7.4, prepared by the Genentech Media lab
5 Thermo Fisher Scientific, Titer Plate Shaker, model: 4625, Waltham, MA, USA
6 Agilent Technologies, Vial plate for holding 54 × 2 mL vials part no. G2255-68700
7 Eppendorf, Centrifuge 5804, Hamburg, Germany
8 384-well glass-coated plate: Thermo Scientific, Microplate, 384-Well; Webseal Plate; Glass-coated Polypropylene; Square well shape; U-Shape well bottom; 384 wells; 90 µL sample volume; catalog number: 3252187
9 ACROS Organics, 1-octanol 99% pure, catalog number: AC150630010, Geel, Belgium
10 Waters XBridge C18, 2.1 × 30 mm, with 2.5 µm particles
11 Agilent cat. no. 24214-001
12 All LC solvents were HPLC-grade and purchased from OmniSolv (Charlotte, NC, USA)
13 This was done using a Shimadzu Nexera X2 consisting of an LC-30AD (pump), SIL-30AC (auto-injector), and SPD-20AC (UV/VIS detector) with Sciex API4000 QTRAP (MS)
14 This was done using a Shimadzu Nexera X2 consisting of an LC-30AD (pump), SIL-30AC (auto-injector), and SPD-20AC (UV/VIS detector) with Sciex API4000 (MS)
15 For the purpose of the D3R/SAMPL5 workshop, we originally erroneously reported the standard deviation σ·√3 instead of the standard error (σ/√N)·√3. The factor of √3 corrects the sample standard deviation across all MRM measurements for the correlation between the 3 replicate measurements belonging to a single independent experimental repeat.
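One way to read the correction described in footnote 15 is sketched below in Python. It assumes the 3 replicate measurements within a repeat are fully correlated, so that N measurements carry only N/3 independent samples. That assumption, the function name, and the example values are hypothetical illustrations, not taken from the measured data.

```python
import math
import statistics


def standard_error_correlated(values, replicates_per_repeat=3):
    """Standard error of the mean with a reduced effective sample size.

    Assumes (hypothetically) that replicates within one repeat are fully
    correlated, so the effective number of independent samples is
    len(values) / replicates_per_repeat.
    """
    sd = statistics.stdev(values)  # sample standard deviation over all measurements
    n_eff = len(values) / replicates_per_repeat
    return sd / math.sqrt(n_eff)


# Hypothetical log D measurements: 2 repeats x 3 replicates.
measurements = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8]
se = standard_error_correlated(measurements)
print(f"standard error = {se:.3f}")
```

Under this assumption the result equals the naive standard error multiplied by the square root of the number of replicates per repeat.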