Abstract
In synergy studies, one focuses on compound combinations that promise a synergistic or antagonistic effect. With the help of high-throughput techniques, a large number of compound combinations can be screened and filtered for suitable candidates for a more detailed analysis. Those promising candidates are chosen based on the deviance between a measured response and an expected non-interactive response. A non-interactive response is based on a principle of no interaction, such as Loewe Additivity [Loewe, 1928] or Bliss Independence [Bliss, 1939]. In Lederer et al. [2018], an explicit formulation of the hitherto implicitly defined Loewe Additivity has been introduced, the so-called Explicit Mean Equation. In the current study we show that this Explicit Mean Equation outperforms the original implicit formulation of Loewe Additivity and Bliss Independence when measuring synergy in terms of the deviance between measured and expected response. Further, we show that a deviance-based computation of synergy outperforms a parametric approach. We show this on two datasets of compound combinations that are categorized into synergistic, non-interactive and antagonistic [Yadav et al., 2015, Cokol et al., 2011].
1 Introduction
When combining a substance with other substances, one is generally interested in interaction effects. Those interaction effects are usually described as synergistic or antagonistic, dependent on whether the interaction is positive, resulting in greater effects than expected, or negative, resulting in smaller effects than expected. From data generated with high-throughput techniques, one is confronted with massive compound interaction screens. From those screens, one needs to filter for interesting candidates that exhibit an interaction effect. To quickly scan all interactions, a simple measure is needed. Based on that pre-processing scan, those filtered combination candidates can then be examined in greater detail.
To determine whether a combination of substances exhibits an interaction effect, it is crucial to determine a non-interactive effect. Only when deviance from that so-called null reference is observed can one speak of an interactive effect [Lederer et al., 2018]. Over the last century, many principles of non-interaction have been introduced. For an extensive overview, refer to [Greco et al., 1995, Geary, 2012]. Two main principles for non-interactivity have withstood criticism: Loewe Additivity [Loewe, 1928] and Bliss Independence [Bliss, 1939]. The popularity of Loewe Additivity is based on its principle of sham combination, which assumes no interaction when a compound is combined with itself. Other null reference models do not satisfy that assumption. An alternative is Bliss Independence, which assumes (statistical) independence between the combined compounds.
Irrespective of the ongoing debate about the null reference, there are multiple proposals for how synergy can be measured given a null reference model. Some suggest measuring synergy as the difference between an observed isobole and a reference isobole calculated from a null-reference model. An isobole is the set of all dose combinations of the compounds that reach the same fixed effect, such as 50% of the maximal effect [Minto et al., 2000, Chou and Talalay, 1984]. Another way to quantify synergy on the basis of the isobole is to look at the curvature and arc-length of the longest isobole spanned over the measured response [Cokol et al., 2011]. As the deviation from an isobole is measured for a fixed effect or dose ratio, synergy is only measured locally along that fixed effect or dose ratio. In order not to miss any effects, this method has to be applied for as many dose ratios as possible.
In this paper we measure synergy as the deviation over the entire response surface. One way to do so is the Combenefit method, which measures synergy in terms of the volume between the expected and measured effect [Di Veroli et al., 2016]. We will refer to it as a lack-of-fit method as it quantifies the lack of fit from the measured data to the null reference model. Another way of capturing the global variation is by introducing a synergy parameter α into the mathematical formulation of the response surface. This parameter α is fitted by minimizing the error between the measured effect and the α-dependent response surface. Such a statistical definition of synergy allows for statistical testing of the significance of the synergy parameter. The cost of this more statistical approach is the complexity of the model fits.
As the research area of synergy evolved from different disciplines, different terminologies are in common use. The response can be measured among others in growth rate, survival, or death. It is usually referred to as the measured or phenotypic effect or as cell survival. In this study we interchange the terms response and effect.
When measuring a compound combination, one also measures each agent individually. The dose or concentration is typically some biological compound per unit of weight when using animal or plant models or per unit of volume when using a cell-based assay. However, it can also be an agent of a different type for example a dose of radiation as used in modern combination therapies for cancer [Nat, 2018]. This individual response is called mono-therapeutic response [Di Veroli et al., 2016] or single compound effect. We prefer a more statistical terminology and refer to it as conditional response or conditional effect. With record we refer to all measurements taken of one cell line or organism which is exposed to all combinations of two compounds. In other literature, this is referred to as response matrix [Lehar et al., 2007, Yadav et al., 2015].
In Section 2.1, we give a short introduction to the two null response principles, Loewe Additivity and Bliss Independence. We explain in detail several null reference models that build on those principles. We introduce synergy as any effect different from an interaction free model in Section 2.2. There, we also introduce the parametrized and deviance based synergy approaches. In Section 2.3, we introduce two datasets that come with a categorization into synergistic, non-interactive and antagonistic. We evaluate the models and methods in Section 3 together with a detailed comparison of the synergy scores.
2 Materials and Methods
2.1 Theory
Before one can decide whether a compound combination exhibits a synergistic effect, one needs to decide on the expected effect assuming no interaction between the compounds. Such so-called null reference models are constructed from the conditional (mono-therapeutic) dose-response curves of each of the compounds, which we denote by fj (xj) for j ∈ {1, 2}. Null reference models extend the conditional dose-response curves to a (null-reference) surface spanned between the two conditional responses. We denote the surface as f (x1, x2) such that
$$f(x_1, 0) = f_1(x_1) \tag{1}$$
and
$$f(0, x_2) = f_2(x_2). \tag{2}$$
Thus, the conditional response curves are the boundary conditions of the null reference surface. For this study, we focus on Hill curves to model the conditional dose-responses. More detailed information can be found in Appendix A.
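As a minimal sketch of a conditional response, the following Python snippet implements a decreasing Hill curve and its inverse. The functional form and the parameter names (`y0`, `y_inf`, `e`, `s`) are assumptions made for illustration, since the exact parametrization is given in Appendix A rather than in this section.

```python
def hill(x, y0=1.0, y_inf=0.0, e=1.0, s=1.0):
    """Decreasing Hill curve (assumed form): y0 at zero dose, y_inf at saturating dose.

    e is the dose of half-maximal effect and s the steepness (slope) parameter.
    """
    if x <= 0.0:
        return y0
    return y_inf + (y0 - y_inf) / (1.0 + (x / e) ** s)


def hill_inverse(y, y0=1.0, y_inf=0.0, e=1.0, s=1.0):
    """Dose that produces response y; only defined for y_inf < y <= y0."""
    if not (y_inf < y <= y0):
        raise ValueError("response outside the range of the Hill curve")
    return e * ((y0 - y) / (y - y_inf)) ** (1.0 / s)


# Round trip: the inverse recovers the dose that generated a response.
dose = 2.5
response = hill(dose, e=3.0, s=2.0)
recovered = hill_inverse(response, e=3.0, s=2.0)
```

The inverse is what the Loewe-type models need: it translates an effect level back into the dose of a single compound that would produce it.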
2.1.1 Loewe Additivity
Loewe Additivity builds on the concepts of sham combination and dose equivalence. The first concept is the idea that a compound does not interact with itself. The latter concept assumes that doses of the two compounds that reach the same effect can be interchanged. Therefore, any combination of fractions of those individually effective doses yields that exact same effect, provided the fractions sum to one. Mathematically speaking, if dose $x_1^*$ from the first compound reaches the same effect as dose $x_2^*$ from the second compound, then any dose combination (x1, x2) for which
$$\frac{x_1}{x_1^*} + \frac{x_2}{x_2^*} = 1 \tag{3}$$
holds should yield the same effect as $x_1^*$ and $x_2^*$. As this idea can be generalized to any effect y, one gets
$$\frac{x_1}{f_1^{-1}(y)} + \frac{x_2}{f_2^{-1}(y)} = 1, \tag{4}$$
where $x_1^*$ and $x_2^*$ are replaced with $f_1^{-1}(y)$ and $f_2^{-1}(y)$, the inverse functions of the Hill curves, respectively. For a fixed effect y, Eq. 4 defines an isobole, which is in mathematical terms a contour line. Hence the name of this model: the General Isobole Equation. It is an implicit formulation as the effect y of a dose combination (x1, x2) is implicitly given in Eq. 4. In the following we use the mathematical notation fGI (x1, x2) = y for the General Isobole Equation, with y being the solution to Eq. 4.
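Because the General Isobole Equation defines the effect only implicitly, it has to be solved numerically for each dose combination. The sketch below does this by bisection, assuming the isobole condition x1 / f1^{-1}(y) + x2 / f2^{-1}(y) = 1 and normalized Hill curves running from 1 to 0; the function names and the (e, s) parameter tuples are illustrative choices, not taken from the paper.

```python
def hill(x, e, s):
    """Normalized decreasing Hill curve with f(0) = 1 and f(inf) = 0 (assumed form)."""
    return 1.0 if x <= 0.0 else 1.0 / (1.0 + (x / e) ** s)


def hill_inverse(y, e, s):
    """Dose at which the normalized Hill curve equals y, for 0 < y <= 1."""
    return e * ((1.0 - y) / y) ** (1.0 / s)


def general_isobole(x1, x2, p1, p2, iterations=200):
    """Solve x1 / f1^{-1}(y) + x2 / f2^{-1}(y) = 1 for y by bisection.

    p1 and p2 are (e, s) tuples for the two conditional Hill curves.
    """
    if x1 <= 0.0 and x2 <= 0.0:
        return 1.0
    lo, hi = 1e-12, 1.0 - 1e-12

    def residual(y):
        return x1 / hill_inverse(y, *p1) + x2 / hill_inverse(y, *p2) - 1.0

    # The residual increases with y: negative near 0 and positive near 1.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Sham combination: a compound combined with itself behaves like the summed dose.
params = (2.0, 1.5)
y_combo = general_isobole(1.0, 3.0, params, params)
y_single = hill(4.0, *params)

# Boundary condition: with x2 = 0 the surface reduces to the first conditional curve.
y_boundary = general_isobole(2.0, 0.0, (1.0, 2.0), (3.0, 1.0))
```

The sham-combination check illustrates the defining property of Loewe Additivity: combining doses 1 and 3 of the same compound gives the response of dose 4.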
It was shown by Lederer et al. [2018] that the principle of Loewe Additivity is based on a so-called Loewe Additivity Consistency Condition (LACC). This condition is that it should not matter whether equivalent doses of two compounds are expressed in terms of the first or the second. Under the assumption that the LACC holds, Lederer et al. [2018] have shown that a null reference model can be formulated explicitly by expressing the doses of one compound in terms of the other compound:
$$y = f_{2 \to 1}(x_1, x_2) = f_1\left(x_1 + f_1^{-1}(f_2(x_2))\right), \tag{5}$$
$$y = f_{1 \to 2}(x_1, x_2) = f_2\left(f_2^{-1}(f_1(x_1)) + x_2\right), \tag{6}$$
where $f_1^{-1}(f_2(x_2))$ is the dose of compound one that reaches the same effect as compound two with dose x2 (see Fig. 7 in Appendix A). Adding this dose equivalent, expressed in terms of the first compound, to the dose x1 of the first compound allows for the computation of the expected effect of the compound combination. With the two formulations above, the effect y of the dose combination (x1, x2) is expressed as the effect of either one compound to reach that same effect. Under the LACC, all three models, Eq. 4, Eq. 5 and Eq. 6, are equivalent. It was further shown that, in order for the LACC to hold, conditional dose-response curves must be proportional to each other. It has been commented by Geary [2012] and shown in [Lederer et al., 2018] that this consistency condition is often violated. In an effort to take advantage of the explicit formulation and to counteract the different behaviour of Eq. 5 and Eq. 6 in case of a violated LACC, Lederer et al. [2018] introduced the so-called Explicit Mean Equation as the mean of the two explicit formulations of Eq. 5 and Eq. 6:
$$f_{\text{mean}}(x_1, x_2) = \frac{1}{2}\left(f_{2 \to 1}(x_1, x_2) + f_{1 \to 2}(x_1, x_2)\right). \tag{7}$$
A more extensive overview of Loewe Additivity and definition of null reference models can be found in Lederer et al. [2018].
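The explicit formulations and their mean can be sketched in a few lines. The snippet assumes the reading f_{2→1}(x1, x2) = f1(x1 + f1^{-1}(f2(x2))) and f_{1→2}(x1, x2) = f2(f2^{-1}(f1(x1)) + x2), with normalized Hill curves; all function and parameter names are hypothetical.

```python
def hill(x, e, s):
    """Normalized decreasing Hill curve with f(0) = 1 and f(inf) = 0 (assumed form)."""
    return 1.0 if x <= 0.0 else 1.0 / (1.0 + (x / e) ** s)


def hill_inverse(y, e, s):
    """Dose at which the normalized Hill curve equals y, for 0 < y <= 1."""
    return e * ((1.0 - y) / y) ** (1.0 / s)


def f_2_to_1(x1, x2, p1, p2):
    """Convert dose x2 into an equivalent dose of compound one, then apply f1."""
    return hill(x1 + hill_inverse(hill(x2, *p2), *p1), *p1)


def f_1_to_2(x1, x2, p1, p2):
    """Convert dose x1 into an equivalent dose of compound two, then apply f2."""
    return hill(hill_inverse(hill(x1, *p1), *p2) + x2, *p2)


def explicit_mean(x1, x2, p1, p2):
    """Mean of the two explicit formulations."""
    return 0.5 * (f_2_to_1(x1, x2, p1, p2) + f_1_to_2(x1, x2, p1, p2))


# Proportional curves (same slope): the consistency condition holds, both agree.
p1, p2 = (1.0, 2.0), (4.0, 2.0)
a = f_2_to_1(0.5, 2.0, p1, p2)
b = f_1_to_2(0.5, 2.0, p1, p2)

# Different slopes: the two formulations disagree and the mean lies between them.
q1, q2 = (1.0, 1.0), (4.0, 3.0)
c = f_2_to_1(0.5, 2.0, q1, q2)
d = f_1_to_2(0.5, 2.0, q1, q2)
m = explicit_mean(0.5, 2.0, q1, q2)
```

With equal slopes the two surfaces coincide, whereas unequal slopes produce two different predictions for the same dose pair, which is the situation the mean is meant to address.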
2.1.2 Bliss Independence
Bliss Independence assumes independent sites of action of the two compounds and was introduced a decade later than Loewe Additivity in [Bliss, 1939]. Note that the formulation of Bliss Independence depends on the measurement of the effect. The best known formulation of Bliss Independence is based on monotonically increasing responses for increasing doses:
$$g_{\text{bliss}}(x_1, x_2) = g_1(x_1) + g_2(x_2) - g_1(x_1)\, g_2(x_2), \tag{8}$$
where gi (xi) = 1 - fi (xi) is a conditional response curve with increasing effect for increasing doses. In case the effect is measured in percent, i.e. y ∈ [0, 100], the interaction term needs to be divided by 100 to ensure the right dimensionality of the term.
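Here, we measure the effect in terms of cell death or growth inhibition. Therefore the conditional response curves are monotonically decreasing for increasing concentrations or doses, and Bliss Independence takes the form
$$f_{\text{bliss}}(x_1, x_2) = f_1(x_1)\, f_2(x_2). \tag{9}$$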
The records are normalized to the response at x1 = 0, x2 = 0, thus f1 (0) = f2 (0) = 1. To arrive from Eq. 8 to Eq. 9, one replaces any g by 1 - f. Chou and Talalay [1984] derive the Bliss Independence from a first order Michaelis-Menten kinetic system with mutually non-exclusive inhibitors.
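The equivalence of the two Bliss formulations can be checked numerically. The following sketch assumes the increasing-effect form g1 + g2 - g1·g2 and the survival form f1·f2 for normalized responses; the `scale` argument is a hypothetical convenience for percent-valued effects.

```python
def bliss_effect(g1, g2, scale=1.0):
    """Expected combined effect for increasing responses g1, g2 in [0, scale]."""
    return g1 + g2 - g1 * g2 / scale


def bliss_survival(f1, f2):
    """Expected combined survival for normalized decreasing responses."""
    return f1 * f2


# Substituting g = 1 - f turns the effect form into the survival form.
f1, f2 = 0.8, 0.5
via_effect = 1.0 - bliss_effect(1.0 - f1, 1.0 - f2)
via_survival = bliss_survival(f1, f2)

# Effects given in percent: the interaction term is divided by 100.
percent = bliss_effect(20.0, 50.0, scale=100.0)
```

Both routes give a survival of 0.4 for this example, and the percent version yields a combined effect of 60.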
2.2 Methods
The models introduced in the previous section are null reference models in that they predict a response surface in the absence of compound interaction. We capture synergy in a single parameter to facilitate the screening process. This is different from other approaches, such as Chou and Talalay [1977], who measure synergy as deviation from a null-reference isobole without summarizing the deviation in a single parameter. The single parameter value is typically referred to as synergy score or α-score [Berenbaum, 1977]. As we investigate two methods to quantify synergy, we introduce two synergy parameters α and γ, which measure the extent of synergy. Both synergy scores α and γ are parametrized such that α = 0 or γ = 0 denotes absence of an interaction effect. In case α or γ takes a value different from zero, we speak of a non-additive, or interactive, effect. A compound combination is, dependent on the sign of the synergy parameter, one of the three following:
$$\text{synergistic if } \alpha, \gamma > 0, \qquad \text{non-interactive if } \alpha, \gamma = 0, \qquad \text{antagonistic if } \alpha, \gamma < 0. \tag{10}$$
Here, we measure synergy in two different ways, namely in fitting parametrized models or computing the lack-of-fit. The first method fits null reference models that are extended with a synergy parameter α. For these parametrized models α is computed by minimizing the square deviation between the measured response and the response spanned by the α-dependent model. For the second method the difference between a null reference model and the data is computed. For this method, the synergy score γ is defined as the volume that is spanned between the null reference model and the measured response.
Just as the conditional responses form the boundary condition for the null-reference surface (Eq. 1, Eq. 2), we want the conditional responses to be the boundary condition for all values of α. Explicitly, assuming a synergy model dependent on α is denoted by f (x1, x2|α), then
$$f(x_1, 0 \mid \alpha) = f_1(x_1) \quad \text{and} \quad f(0, x_2 \mid \alpha) = f_2(x_2), \tag{11}$$
with fi denoting the conditional response of compound i. We refer to Eq. 11 as the Synergy Desideratum. As we will see below, not all synergy models fulfil this property.
2.2.1 Parametrized Synergy
We extend the null reference models introduced in Section 2.1 in Eq. 4 - Eq. 9 to parametrized synergy models. The extension of the General Isobole Equation is the popular Combination Index introduced by Berenbaum [1977] and Chou and Talalay [1984]:
$$\frac{x_1}{f_1^{-1}(y)} + \frac{x_2}{f_2^{-1}(y)} = 1 - \alpha. \tag{12}$$
Berenbaum originally equated the left-hand side of Eq. 4 to the so-called Combination Index I. Depending on I smaller, larger, or equal to 1, synergy, antagonism or non-interaction is indicated. For consistency with the other synergy models, we set I = 1-α such that α matches the outcomes as listed in Eq. 10. In Section 3 we will refer to this model as fCI (x1, x2 | α), where α is the parameter that minimizes the squared error between measured data and Eq. 12.
Note that this model violates the Synergy Desideratum in Eq. 11, as a non-zero α leads to deviations from the conditional responses. Explicitly, setting x2 = 0 with I = 1 - α gives x1 / f1^{-1}(y) = 1 - α, hence fCI (x1, 0|α) = f1 (x1 / (1 - α)) ≠ f1 (x1). Although the Combination Index model violates the Synergy Desideratum, in practice it performs quite well and is in widespread use.
The explicit formulations in Eq. 5 and Eq. 6 are equivalent to the General Isobole Equation, fGI (x1, x2), given in Eq. 4, under the LACC [Lederer et al., 2018], but different if the conditional responses are not proportional. The two explicit equations are in fact an extension of the ‘cooperative effect synergy’ proposed by Geary [2012] for compounds with qualitatively similar effects. For these explicit formulations in Eq. 5 and Eq. 6 we propose a model that captures the interaction based on the explicit formulations:
With this, we can extend the Explicit Mean Equation model fmean (x1, x2) in Eq. 7 to a parametrized synergy model:
$$f_{\text{mean}}(x_1, x_2 \mid \alpha) = \frac{1}{2}\left(f_{2 \to 1}(x_1, x_2 \mid \alpha) + f_{1 \to 2}(x_1, x_2 \mid \alpha)\right), \tag{15}$$
which we refer to as fmean (x1, x2|α). By multiplying the synergy parameter α with the dose we ensure that fmean (x1, 0|α) = f1 (x1) and fmean (0, x2 |α) = f2 (x2). Thus, the fmean (x1, x2 |α) model fulfills the Synergy Desideratum (Eq. 11).
To investigate the difference between the two models f2→1 (x1, x2) (Eq. 5) and f1→2 (x1, x2) (Eq. 6), we order compounds one and two by the slopes of their conditional responses. Instead of speaking of the first and second compound, we speak of the smaller and larger one, referring to the order of steepness. Therefore, we use models Eq. 13 and Eq. 14, but categorize the compounds based on the slope parameter of their conditional response curves. This results in flarge→small (x1, x2|α) and fsmall→large (x1, x2 |α). Another synergy model we introduce here and refer to as fgeary (x1, x2 | α) is based on a comment of Geary [2012], hence the naming. The two explicit models f2→1 (x1, x2) and f1→2 (x1, x2) yield the same surface under the LACC but rarely do so in practice. It cannot be determined whether a response that lies between the two surfaces is synergistic or antagonistic, and such a response should hence be treated as non-interactive. Thus, if α from f1→2 (x1, x2|α) and α from f2→1 (x1, x2 | α) are of equal sign, the synergy score of that model is computed as the mean of those two parameters. In case the two synergy parameters are of opposite sign, the synergy score is set to 0:
$$\alpha_{\text{geary}} = \begin{cases} \frac{1}{2}\left(\alpha_{1 \to 2} + \alpha_{2 \to 1}\right) & \text{if } \operatorname{sign}(\alpha_{1 \to 2}) = \operatorname{sign}(\alpha_{2 \to 1}), \\ 0 & \text{otherwise.} \end{cases} \tag{16}$$
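The combination rule for the Geary-type score is simple enough to state as code. This is a sketch of the rule as described in the text; the function name is hypothetical, and treating a score of exactly zero as "no agreement" is an assumption.

```python
def geary_score(alpha_12, alpha_21):
    """Combine the synergy parameters of the two explicit models.

    Returns their mean when both have the same sign and 0.0 otherwise,
    so that a response lying between the two null surfaces counts as
    non-interactive. A parameter of exactly zero is treated as
    disagreement (an assumption of this sketch).
    """
    if alpha_12 * alpha_21 > 0.0:
        return 0.5 * (alpha_12 + alpha_21)
    return 0.0


both_synergistic = geary_score(0.2, 0.4)      # mean of the two scores
opposite_signs = geary_score(-0.1, 0.3)       # set to zero
both_antagonistic = geary_score(-0.2, -0.4)   # mean of the two scores
```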
Next, to extend the null reference model following the principle of Bliss Independence, we extend Eq. 8 to
The motivation for this model is that any interaction between the two compounds is caught in the interaction term of the two conditional responses. In case of no interaction, the synergy parameter α = 0, which leads to (1 + α) = 1, and results in no deviance from the null reference model. As we use the formulation of Eq. 9 due to measuring the effect as survival, we reformulate Eq. 17 analogously as we did to get from Eq. 8 to Eq. 9: by replacing gi (xi) with 1 - fi (xi). Hence, Eq. 17 takes the form:
This model does satisfy the requirement of no influence of the synergy parameter on conditional doses: fbliss (x1, 0|α) = f1 (x1) and fbliss (0, x2 |α) = f2 (x2) as fi (0) = 1. In case of synergy, the interactive effect is expected to be larger, therefore, α being positive. If the compound combination has an antagonistic effect, the interaction term is expected to be smaller.
2.2.2 Lack-of-Fit Synergy
The second method to measure synergy investigated here is to compute the lack-of-fit of the measured response of a combination of compounds to the response of a null reference model derived from the conditional responses. We refer to this synergy value as γ:
$$\gamma = \iint \left(\hat{y}(x_1, x_2 \mid \Theta) - y(x_1, x_2)\right)\, d\log x_1\, d\log x_2, \tag{19}$$
with $\hat{y}$ the estimated effect with parameters Θ of the fitted conditional responses following any non-interactive model and y the measured effect. Note that $\hat{y}$ and y are dependent on the concentration combination (x1, x2). This method was used in the AstraZeneca DREAM challenge [Menden et al., 2018] with the General Isobole Equation as null reference model and can be found in [Di Veroli et al., 2016]. Computing the volume has the advantage of taking the experimental design into account, in contrast to simply taking the mean deviance over all measurement points, which is independent of the relative positions of the measurements. The degree of synergy varies for different dose concentrations and transformations. The computed volume will be different for the same experiment depending on whether a log-transformation is applied to the doses or not.
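One way to compute such a volume numerically is the two-dimensional trapezoidal rule over the dose grid. The sketch below integrates the difference between expected and measured response over log10-transformed doses; the sign convention (expected minus measured), the log base, and the exclusion of zero doses are assumptions of this example.

```python
import math


def trapezoid(values, points):
    """Composite trapezoidal rule for samples `values` at increasing `points`."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (points[i + 1] - points[i])
        for i in range(len(points) - 1)
    )


def lack_of_fit_volume(doses1, doses2, expected, measured):
    """Integrate (expected - measured) over log10 dose space.

    expected[i][j] and measured[i][j] belong to (doses1[i], doses2[j]).
    Zero doses must be excluded because their logarithm is undefined.
    """
    log1 = [math.log10(d) for d in doses1]
    log2 = [math.log10(d) for d in doses2]
    inner = [
        trapezoid([e - m for e, m in zip(exp_row, meas_row)], log2)
        for exp_row, meas_row in zip(expected, measured)
    ]
    return trapezoid(inner, log1)


# Toy record: survival is uniformly 0.1 lower than the null model predicts.
doses = [0.1, 1.0, 10.0]
expected = [[0.6] * 3 for _ in range(3)]
measured = [[0.5] * 3 for _ in range(3)]
gamma = lack_of_fit_volume(doses, doses, expected, measured)
```

In this toy record the doses span two log10 units in each direction, so the volume is 0.1 × 2 × 2 = 0.4. Unlike a plain mean over measurement points, the result changes if the same differences are placed on a differently spaced dose grid.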
In all, we have introduced six null reference models, five of them building up on the concept of Loewe Additivity and one on Bliss Independence. We further have introduced two methods to compute synergy, the parametric one and the lack-of-fit method. This results in twelve synergy model-method combinations: the parametric ones, fCI (x1, x2 |α) (Eq. 12), flarge→small (x1, x2 |α) and fsmall→large (x1, x2 |α) (Eq. 13, Eq. 14, dependent on the slope parameters) together with their mean, fmean (x1, x2 |α) (Eq. 15), the method of Geary and fbliss (x1, x2 |α) (Eq. 17). For the lack-of-fit method, we take as the null reference: fGI (x1, x2) (Eq. 4), flarge→small (x1, x2) and fsmall→large (x1, x2) (Eq. 5, Eq. 6), with the Explicit Mean Equation, fmean (x1, x2) (Eq. 7), the method of Geary (analogously to Eq. 16) and fbliss (x1, x2) (Eq. 9).
2.2.3 Fitting the Synergy Parameter
Before applying the two methods presented in Section 2.2.1 and Section 2.2.2, we normalize and clean the data from outliers. In a first step we normalize all records to the same value, y0, the measured response at zero dose concentration from both compounds. Second, we discard outliers using the deviation from a spline approximation. Third, we fit both conditional responses of each record, namely the responses of each compound individually, to a pair of Hill curves (Eq. 21, Appendix A). We fit the response at zero dose concentration for both Hill curves. This gives the parameter set Θ = {y0, y∞,1, y∞,2, e1, e2, s1, s2} for each record. More details are given in Appendix B.
We apply the two different methods to calculate the synergy parameters α and γ to each record. First, for the parametrized synergy models, we apply a grid search for α, for α ∈ [-1, 1] with a step size of 0.01, minimizing the sum of squared errors. This gives the value of α for which the squared error between the ith measured effect $y^{(i)}$ and the ith expected effect $\hat{y}^{(i)}(\alpha)$ is minimal:
$$\hat{\alpha} = \underset{\alpha \in [-1, 1]}{\arg\min} \sum_i \left(y^{(i)} - \hat{y}^{(i)}(\alpha)\right)^2. \tag{20}$$
Note that we exclude the conditional responses that we used to fit Θ from the minimization. Second, we apply the lack-of-fit method from Di Veroli et al. [2016], where synergy is measured in terms of the integral difference in log space of measured response and surface spanned by the non-interactive models in Section 2.1, as given in Eq. 19. For the calculation of the integrals, we apply the trapezoidal rule [Press et al., 2007, Chapter 4].
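The grid search for α described above can be sketched as follows. The response surface used here is a hypothetical stand-in (the closed form of the Combination Index model for two identical Hill curves with e = 1 and s = 1), chosen only so that the example stays short; any α-dependent model with the same call signature could be plugged in.

```python
def toy_model(x1, x2, alpha):
    """Hypothetical stand-in for an alpha-dependent response surface.

    Closed form of the Combination Index model for two identical Hill
    curves with e = 1 and s = 1; chosen only to keep the sketch short.
    """
    return (1.0 - alpha) / (1.0 - alpha + x1 + x2)


def grid_search_alpha(model, records, step=0.01):
    """Return the alpha in [-1, 1] with the smallest sum of squared errors.

    records is a list of (x1, x2, measured_response) tuples.
    """
    n_steps = int(round(2.0 / step))
    grid = [round(-1.0 + step * k, 10) for k in range(n_steps + 1)]

    def sse(alpha):
        return sum((y - model(x1, x2, alpha)) ** 2 for x1, x2, y in records)

    return min(grid, key=sse)


# Synthetic record generated with alpha = 0.3. Only true combinations are
# used (both doses > 0), mirroring the exclusion of the conditional responses.
doses = [0.25, 0.5, 1.0, 2.0]
records = [(a, b, toy_model(a, b, 0.3)) for a in doses for b in doses]
best_alpha = grid_search_alpha(toy_model, records)
```

On this noise-free synthetic record the search recovers the generating value of 0.3. A grid search is slower than a gradient-based optimizer but needs no derivatives and cannot get stuck in a local minimum within the grid.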
2.3 Material
To evaluate the two methods introduced in Section 2.2.1 and Section 2.2.2, we apply them to two datasets of compound combination screening for which a categorization into the three synergy cases is provided.
The Mathews Griner dataset is a cancer compound synergy study by Mathews Griner et al. [2014]. In a one-to-all experimental design, the compound ibrutinib was combined with 463 other compounds and administered to the cancer cell line TMD8 of which cell viability was measured. The dataset is published at https://tripod.nih.gov/matrix-client/. Each compound combination was measured for 5 different doses, decreasing from 125μM to 2.5μM in a four-fold dilution for each compound alongside their conditional effects, resulting in 36 different dose combinations. The categorization of this dataset comes from a study by Yadav et al. [2015], in which every record was categorized based on a visual inspection.
The Cokol dataset comes from a study about fungal cell growth of the yeast S. cerevisiae (strain BY4741), where Cokol et al. [2011] categorized the dataset. In this study the influence on cell growth was measured when exposed to 33 different compounds that were combined with one another based on promising combinations chosen by the authors, resulting in 200 different drug-drug-cell combinations. With an individually measured maximal effect dose for every compound, the doses administered decrease linearly in 7 steps with the eighth dose set to zero, resulting in an 8 × 8 factorial design.
Based on the longest arc length of an isobole that is compared to the expected longest linear isobole in a non-interactive scenario, each record was given a score. In more detail, from the estimated surface of a record assuming no interaction, the longest contour line is measured in terms of its length and direction (convex or concave). A convex contour line leads to the categorization of a record as synergistic and the arc length of the longest contour line determines the strength of synergy. A concave contour line results in an antagonistic categorization with its extent being measured again as the length of the longest isobole. Thus the Cokol dataset not only comes with a classification but also with a synergy score similar to α or γ.
To our knowledge, these two datasets are the only high-throughput ones with a classification into the three synergy classes: antagonistic, non-interactive and synergistic. Both datasets are somewhat imbalanced because interactions are rare [Borisy et al., 2003, Zhang et al., 2007, Farha and Brown, 2010]. The distribution of the classification is listed in Table 1. We obtained both categorizations after personal communication with the authors Yadav et al. [2015] and Cokol et al. [2011]. For the purpose of comparing the synergy models, we consider these two classifications as ground truth.
3 Results
Using the two methods of computing the synergy score, the parametric one (Section 2.2.1) and the lack-of-fit one (Section 2.2.2), we compute synergy scores for all records of the two datasets introduced in Section 2.3.
3.1 Kendall rank correlation coefficient
Having obtained the synergy scores from the two different methods as described in Section 2.2.3, we compute the Kendall rank correlation coefficient, which is also known as Kendall’s tau coefficient and was originally proposed by Kendall [1938]. This coefficient computes the rank correlation between the data as originally categorized by Yadav et al. [2015] and Cokol et al. [2011] and the computed synergy scores resulting from the two methods introduced in Section 2.2.1 and Section 2.2.2. For the analysis, we rank synergistic records highest at rank 3, followed by non-interactive at rank 2 and antagonistic lowest at rank 1. Due to the many ties in rank, the Kendall rank correlation coefficient cannot take a value higher than 0.75 for Mathews Griner and 0.8 for Cokol, even if a perfect ranking was given. An overview of the Kendall rank correlation coefficients is given in Table 2 and Table 3 in Appendix C.
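The effect of ties on the attainable correlation can be illustrated with a small pure-Python implementation. The sketch below computes the tau-b variant of the Kendall coefficient; which variant was used in the study is not stated in this section, so this choice is an assumption, and the example data are invented.

```python
import math


def kendall_tau_b(x, y):
    """Kendall rank correlation (tau-b variant), computed over all pairs."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n_pairs = n * (n - 1) // 2
    denominator = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    return (concordant - discordant) / denominator


# Class ranks: 1 = antagonistic, 2 = non-interactive, 3 = synergistic.
class_ranks = [1, 1, 2, 2, 3]
# Invented scores that order the classes perfectly.
scores = [-0.3, -0.2, 0.0, 0.05, 0.4]
tau = kendall_tau_b(class_ranks, scores)
```

Even though the scores order the classes perfectly, the two within-class ties keep the coefficient at 8 / sqrt(80) ≈ 0.89, below one. The more records share a class, the lower this ceiling becomes.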
To compare parametric and lack-of-fit methods, we plot the correlation values as a scatter plot per method (see Fig. 1) with the values from the parametric method plotted on the x-axis and those from the lack-of-fit method on the y-axis. Most of the points scatter in the upper left triangle, above the diagonal line. This shows that the lack-of-fit method outperforms the parametric method. This holds for all models applied to the Mathews Griner dataset and also for all models applied to the Cokol dataset. For both datasets, the highest correlation scores result from those null reference models that are based on the Loewe Additivity principle. The Bliss null reference model performs worst for the Mathews Griner set. For the Cokol data it is the second worst model. To a certain extent this can be explained as the classification of the Cokol dataset is based on isobole length relative to non-interactive, which is a Loewe Additivity type analysis. On both datasets the Explicit Mean Equation performs best with a correlation value of 0.55 and 0.62 for the Mathews Griner and Cokol dataset, respectively.
3.2 ROC-analysis
In high-throughput synergy studies, one generally screens for promising candidates that exhibit a synergistic or antagonistic effect. Those promising candidates are then investigated in more detail with genetic assays and other techniques. To determine how well the underlying null reference models result in distinguishable synergy scores, we conduct an ROC analysis (receiver operating characteristic), comparing the estimated synergy scores with the class categorization that is given for both datasets. A standard ROC analysis applies to binary classification, where cases are compared to controls. In this study, we have three classes: synergistic, antagonistic and non-interactive. We therefore compare each class to the combination of the other two, e.g. synergistic as cases versus the antagonistic and non-interactive combined as control. Typically, in ROC analyses, the cases rank higher than the controls. When treating the class antagonistic as case compared to the control synergistic and non-interactive we change all signs of the synergy scores. Therefore, the ranking of synergy scores is reversed and antagonistic synergy scores rank higher. Problems arise when comparing non-interactive cases to the control synergistic and antagonistic as their values should lie between the two control classes. Therefore, the absolute value of the estimated synergy scores is taken, which allows a ranking where the synergy scores of the non-interactive records should rank lower than the other synergy scores. Additionally, we can again multiply all synergy scores with minus one to revert the order of scores such that the cases rank higher.
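The three one-versus-rest comparisons and their score transformations can be sketched as below. The AUC is computed as the fraction of case-control pairs in which the case ranks higher (ties counted as one half), which is one standard way to obtain it; the label strings and example scores are invented for illustration.

```python
def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def one_vs_rest_auc(scores, labels, case):
    """AUC of one class against the other two, after the transformation
    that makes the case class rank highest."""
    if case == "synergistic":
        transformed = list(scores)
    elif case == "antagonistic":
        transformed = [-s for s in scores]       # reverse the ranking
    elif case == "non-interactive":
        transformed = [-abs(s) for s in scores]  # scores near zero rank highest
    else:
        raise ValueError("unknown class: " + case)
    cases = [t for t, lab in zip(transformed, labels) if lab == case]
    controls = [t for t, lab in zip(transformed, labels) if lab != case]
    return auc(cases, controls)


scores = [0.5, 0.4, 0.01, 0.05, -0.02, -0.3, -0.6]
labels = ["synergistic", "synergistic", "synergistic",
          "non-interactive", "non-interactive",
          "antagonistic", "antagonistic"]
auc_syn = one_vs_rest_auc(scores, labels, "synergistic")
auc_ant = one_vs_rest_auc(scores, labels, "antagonistic")
auc_non = one_vs_rest_auc(scores, labels, "non-interactive")
```

In this invented example the weakly scored synergistic record (0.01) lowers both the synergistic AUC (11/12) and the non-interactive AUC (0.8), while the antagonistic class is separated perfectly.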
The AUC values (area under the curve) are reported in Table 4 - Table 7 in Appendix C. For completeness, and based on the critique of Saito and Rehmsmeier [2015] to use PRC-AUC values for imbalanced datasets, the PRC-AUC values are also computed and can be found in Table 8 - Table 11 in Appendix C.
Analogously to the previous section, we depict the ROC values for both datasets in scatter plots (Fig. 2) with ROC values based on the parametric approach depicted on the x-axis and those based on the lack-of-fit approach on the y-axis. The underlying null reference models are shown by color. The different comparisons, such as synergistic versus non-interactive and antagonistic, are depicted by the shape of the plot symbol. From Fig. 2, the dominance of the lack-of-fit approach over the parametric one is as apparent as from Fig. 1. With regard to the comparison of the different cases, visualized by shape, the comparison of the synergistic cases to the non-interactive and antagonistic controls yields the highest ROC values, around 0.9. The comparison of the non-interactive cases to the interactive ones scores the lowest.
As the overall highest AUC scores result from the lack-of-fit method, we have a closer look at those for both datasets (Table 5 and Table 7 in Appendix C). For the antagonistic case, the values range around 0.80 for the Mathews Griner dataset and around 0.85 for the Cokol dataset. AUC values of the non-interactive case range around 0.75 for both datasets. The AUC values for the synergistic case for both datasets range around a value of 0.90 with one outlier of 0.77 for the Bliss Independence model on the Mathews Griner dataset.
Overall, the lack-of-fit outperforms the parametric method on both datasets. For the lack-of-fit method, the flarge→small (x1, x2) performs best on the Mathews Griner dataset and is followed closely by the Explicit Mean Equation. On the second dataset, the Explicit Mean Equation performs overall best.
3.3 Scattering of Synergy Scores
To further investigate the performance of the methods and null reference models, we plot the synergy scores of the best performing models based on the Kendall rank correlation coefficient analysis (Section 3.1) and the ROC analysis (Section 3.2) for both datasets in Fig. 3, Fig. 4 and Fig. 5. In all figures, the overall correlation of the compared data is depicted together with the correlation per categorization. The coloring of the scores is based on the original categorization as antagonistic, non-interactive or synergistic as provided by Yadav et al. [2015] and Cokol et al. [2011].
In Fig. 3 the synergy scores computed with the lack-of-fit method are plotted against the original synergy scores from Cokol et al. [2011]. Applying the lack-of-fit method to the Bliss Independence model (Eq. 9) results in scores which are mainly above zero (Fig. 3, upper left). Further, it can be seen in the density plots along the y-axis in Fig. 3, upper left panel, and on the x-axis of Fig. 4, both panels in the first row and left panel in the middle row, that the synergy scores that are computed based on the principle of Bliss Independence cannot be easily separated by categorization, making it difficult to come up with a threshold to categorize a record into one of the three synergy categories (synergy, antagonism, non-interaction) given a synergy score.
For the other three models depicted in Fig. 3, which are based on the principle of Loewe Additivity, the synergy scores are more clearly separated. The computed scores of the synergistic records lie above zero in the upper right corner (categorized as synergistic and computed synergy scores above zero), and those of the antagonistic records scatter in the lower left corner. In all three panels in Fig. 3 the computed scores of the non-interactive records are both positive and negative, ranging roughly symmetrically between −0.1 and 0.1. Barely any of the computed synergy scores for antagonistic cases are positive. Therefore, the chance of a record being antagonistic if its synergy score is above zero is quite low, as is the risk of categorizing a record as antagonistic if it is synergistic.
We further looked in detail into dose combinations for which both the fGI (x1, x2) and fmean (x1, x2) yield positive synergy values for antagonistic cases and into dose combinations for which the fmean (x1, x2) model results in negative synergy values for records which are labeled as synergistic. These amount to eight records in total. One of them is a compound combined with itself. Hence, by definition of Loewe Additivity, no interaction is expected. We looked at the conditional responses of all eight dose combinations. At least one of the conditional responses exhibits small effects, with the maximal response y∞ being above 0.65 (comp. left panel of Fig. 6). As a result, the computed null-reference surface is quite high, which causes synergistic scores whenever the measured effects are smaller than the null-reference prediction. We suspect that the dose concentrations are not well-sampled and larger maximal doses should have been administered.
We further looked up the seven dose combinations (excluding the one where the compound is combined with itself) in the Connectivity Map [Subramanian et al., 2017, Lamb et al., 2006]. Of those, we could find five in the Connectivity Map. All of these dose combinations showed non-interactive effects on all cell lines they were tested on. The assays found in the Connectivity Map are run on cancer cell lines. The dose combinations investigated here are run on yeast. Hence, a full comparison cannot be made.
In Fig. 4 and Fig. 5, the computed scores from different null reference models are plotted against each other. We compare the implicit formulation (General Isobole Equation) to the Bliss Independence model and the two best performing models that are based on the explicit formulation of Loewe Additivity, fmean (x1, x2) and flarge→small (x1, x2). The coloring of the scores is based on the original categorization as antagonistic, non-interactive or synergistic as provided by Yadav et al. [2015] and Cokol et al. [2011].
In Fig. 4 the scores from the Mathews Griner dataset are plotted. The two panels in the upper row and the left panel in the middle row compare Bliss Independence, fbliss (x1, x2), with the other three null models, which build on the principle of Loewe Additivity. The scores based on Bliss Independence are clearly larger than those of Loewe Additivity and mainly above zero. The scores from the models based on Loewe Additivity are very similar to each other, as they scatter along the diagonal (panels in middle right and lower row). It is difficult, though, to tell whether a record is synergistic or antagonistic, as non-interactive records scatter largely between −0.5 and 0.5. Only records with a computed score outside that range can be categorized as interactive. For the Cokol dataset, which serves as the basis for Fig. 5, the scores can be better separated. Although the scores are generally smaller than those from the Mathews Griner data, the records can be separated more easily when using a Loewe Additivity based model. Additionally, the strong correlation between these additive models shows their similarity (right panel in middle row and both panels in lower row). Further, the scores based on flarge→small (x1, x2) achieve higher values than those from the other two Loewe Additivity based models. This becomes obvious when comparing the null-reference surfaces of those three models, as depicted in [Lederer et al., 2018, Fig. 4]. The surface spanned by flarge→small (x1, x2) lies above those spanned by the Explicit Mean Equation or the General Isobole Equation. Therefore, in synergistic cases where the measured effect is greater, and hence the response in cell death smaller, the difference between the measured response and the null-reference surface is greater for flarge→small (x1, x2) than for the other two models.
4 Discussion
The rise of high-throughput methods in recent years allows for massive screening of compound combinations. With the increase of data, there is a need for methods that allow for reliable filtering of promising combinations. Further, the recent success of an in vivo synergy study in mice by Grüner et al. [2016] underlines how quickly the possibilities to generate biological data are developing. Therefore, it is all the more important to develop methods that are sound and easily applicable to high-throughput data.
In this study we use two datasets of compound combinations that come with a categorization into synergistic, non-interactive or antagonistic for each record. Based on the fitted conditional responses, we compute the synergy scores of all records. We compare six models that build on the principles of Loewe Additivity and Bliss Independence. Those six models are used with two different methods to compute a synergy score for each record. The first method is a parametric approach and is motivated by the Combination Index introduced by Berenbaum [1977]. The second method quantifies the difference in volume between the expected response assuming no interaction and the measured response and is motivated by Di Veroli et al. [2016].
The computed synergy scores are compared in two different ways. First, Kendall rank correlation coefficients are computed to investigate how well the ranking of the records is reconstructed (see Section 3.1). Second, an ROC analysis (Section 3.2) verifies the capacity to distinguish records from different categories given a computed synergy score. Both the Kendall rank correlation coefficient and the ROC analysis show a superiority of the models based on Loewe Additivity relative to those based on Bliss Independence. Note that we conduct the research only on combinations of two compounds. Russ and Kishony [2018] show that Bliss Independence maintains accuracy when the number of combined compounds increases, while Loewe Additivity loses its predictive power for an increasing number of compounds. Among the additive models, the Explicit Mean Equation is the overall best performing model for both datasets. The comparison of the parametric method with the lack-of-fit method shows a superiority of the lack-of-fit method. To recall, the motivation behind the parametric approach was its statistical advantages: it allows one to define an interval around α = 0 in which a compound combination can be considered additive. For the lack-of-fit method, such a statistical evaluation cannot be done directly, but could be performed on the basis of bootstrapping.
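The bootstrapping idea mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the lack-of-fit score is taken here as the mean of (expected − measured) over all dose combinations of one record, which is a simplified stand-in for the volume difference and not necessarily the exact score used in this study. The function names, the number of resamples and the percentile interval are likewise hypothetical choices.

```python
import random

def lack_of_fit_score(expected, measured):
    """Assumed score: mean of (expected - measured) over all dose combinations."""
    return sum(e - m for e, m in zip(expected, measured)) / len(expected)

def bootstrap_interval(expected, measured, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the assumed lack-of-fit score."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    pairs = list(zip(expected, measured))
    scores = []
    for _ in range(n_boot):
        # Resample dose combinations with replacement
        sample = [rng.choice(pairs) for _ in pairs]
        scores.append(sum(e - m for e, m in sample) / len(sample))
    scores.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return scores[lo_idx], scores[hi_idx]

# Hypothetical null-reference and measured responses for one record
expected = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
measured = [0.8, 0.65, 0.6, 0.45, 0.4, 0.3]
score = lack_of_fit_score(expected, measured)
lower, upper = bootstrap_interval(expected, measured)
print(f"score={score:.3f}, interval=({lower:.3f}, {upper:.3f})")
```

Under these assumptions, a record whose interval excludes zero could be flagged as interactive, which would play the role of the interval around α = 0 in the parametric approach.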
Chou and Talalay [1977] measure the interaction effect locally for a fixed ratio of doses of both compounds that are supposed to reach the same effect. If, say, one unit of the first compound causes the same effect as two units of the second compound, the dose ratio is 1:2. Along this fixed ratio of doses, they compute the left-hand side of Eq. 3 given the two doses x1 and x2 that are assumed to reach a fixed effect y* together, and the dose of each compound i that reaches the fixed effect alone. For the fixed dose ratio, they run over all expected effects, usually from zero to one. A geometric interpretation of that method is depicted in [Greco et al., 1995, Fig. 7, p. 341]. The resulting values of the left-hand side of Eq. 3 are analyzed graphically: all computed values are plotted versus the expected fixed effect y* ∈ [0, 1]. Values higher than one exhibit synergistic behaviour, values below one antagonism. This method allows for results that show antagonistic behaviour for, say, smaller effects, as well as synergistic behaviour for higher effects, or vice versa. That such a switch from antagonistic behaviour in one region to synergistic behaviour in another can occur was also shown in Norberg and Wahlström [1988]. Our main motivation in this study is to provide a single synergy score that allows for fast filtering of interesting candidates for more in-depth research. To extend that idea, standard deviation could further be taken into account. Additionally, the superior lack-of-fit method is much faster and simpler to implement than the parametric one.
Finally, to assess how distinguishable the synergy scores are, we visualize the synergy scores based on the underlying category (Section 3.3). The synergy scores from the lack-of-fit method can, based on their sign, reliably be categorized as synergistic or antagonistic. For records categorized as non-interactive, the computed synergy scores are positive as well as negative. For the two datasets, we saw different extents of separation between those scores, which makes it difficult to generalize the results. All in all, the differentiation from no interaction poses a more difficult task, as the choice of threshold is arbitrary.
For Bliss Independence such a differentiation is barely possible due to a strong overlap of synergy scores from all three categories and mainly positive synergy scores from the lack-of-fit method. Different ranges of synergy scores for the two datasets make it additionally difficult to assess synergy or antagonism for a record based on the synergy score alone.
Finally, we want to emphasize the performance benefit of the recently introduced Explicit Mean Equation [Lederer et al., 2018] over the implicit formulation in the form of the General Isobole Equation. The explicit formulation of this additive model not only shows higher accuracy in prediction performance but was also shown to speed up computation by a factor of 250. Although the performance of the models and methods is consistent across the two (quite different) datasets considered in this study, a reliable comparison of different methods would benefit from the availability of more ground truth drug screening datasets.
Conflict of Interest
The authors declare that they have no conflict of interest.
Funding
This work was supported by the Radboud University and CogIMon H2020 ICT-644727.
A Conditional Dose Response Curves
A common approach for modeling monotonic dose-response curves fj with j ∈ {1, 2} is the Hill curve [Hill, 1910], also referred to as the sigmoid function. Owing to its good fit to many sources of data, the Hill model is the most widely applied model for fitting compound responses [Goutelle et al., 2008]. It has a sigmoidal shape with little change for small doses but with a rapid decline in response once a certain threshold is met. For even larger doses the effect asymptotes to a constant maximal effect. Two exemplary Hill curves are depicted in Fig. 7. There are several parameterizations of the Hill curve. We use the following throughout this study to fit conditional responses:
$$f_j(x) = y_\infty + \frac{y_0 - y_\infty}{1 + (x/e)^{s}} \qquad (21)$$
where y0 is the response at zero dose, y∞ the maximal response of the cells to the compound, e the dose concentration reaching half of the maximal response and s the steepness of the curve. Eq. 21 is equivalent to the parametrization used in the drc package [Ritz and Strebig, 2016], the so-called four parameter log-logistic model. By our definition of the Hill curve, a positive s leads to a descending Hill curve.
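As a minimal sketch of the Hill curve, the function below assumes the four-parameter log-logistic form y∞ + (y0 − y∞) / (1 + (x/e)^s), which is how the drc parametrization referred to above is usually written. The function name and the parameter values are hypothetical and serve illustration only.

```python
def hill(x, y0, y_inf, e, s):
    """Four-parameter log-logistic (Hill) response at dose x >= 0.

    y0: response at zero dose, y_inf: response for very large doses,
    e: dose at the midpoint between y0 and y_inf, s: steepness.
    """
    if x < 0:
        raise ValueError("dose must be non-negative")
    if x == 0:
        return y0  # handle the zero dose explicitly
    return y_inf + (y0 - y_inf) / (1.0 + (x / e) ** s)

# Hypothetical parameters; a positive s gives a descending curve
params = dict(y0=1.0, y_inf=0.2, e=2.0, s=1.5)
doses = (0.0, 0.5, 2.0, 8.0, 1e6)
responses = [hill(d, **params) for d in doses]
print(responses)
```

With these values the response starts at y0, passes the midpoint between y0 and y∞ at x = e, and approaches y∞ for large doses.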
B Data Cleaning, Fitting of Hill Curve and Parameter Estimation for Implicit Models
First, we normalize all records by the measured response at zero dose concentration of both compounds, y0. Second, we conduct an outlier analysis of the normalized responses by fitting a spline surface and discarding outliers. Third, we fit the conditional responses of the cleaned data to Hill curves.
We fit a generalized additive model (GAM) to the normalized raw data using thin plate splines [Wood, 2017], not transforming the doses in any way. The surfaces of those fitted thin plate splines span the checkerboards of every record, and data points with too large absolute residual values are rejected. For fitting the splines we use the gam() function of the mgcv package [Wood, 2011]. The threshold to reject data points is three times the inter-quantile range of all residuals of a given record. Every data point with an absolute residual above that threshold is discarded. For the Mathews Griner data, outliers were excluded from 125 of the 466 records (less than 30%), with a mean of 1.59 outliers per affected record and 199 data points excluded overall, which is less than one percent of the overall data. The maximum of 6 outliers was detected once. Similarly, for the Cokol data we excluded data points from 150 of the total 200 records (75%), on average 4.21 per affected record with a maximum of 13, for an overall of 623 data points excluded, which is about 8.7% of all data points.
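The rejection rule can be sketched as follows, assuming the residuals of the fitted spline surface are already available for one record. Two assumptions are made for illustration: the inter-quantile range is read as the interquartile range (third minus first quartile), and the quartiles are computed with the default method of Python's statistics.quantiles, which need not match the quantile definition used in the original analysis. The function name is hypothetical.

```python
import statistics

def reject_outliers(residuals, factor=3.0):
    """Flag data points whose absolute residual exceeds factor * IQR.

    Returns a list of booleans (True = keep) and the threshold used.
    """
    # Assumption: "inter-quantile range" means Q3 - Q1
    q1, _, q3 = statistics.quantiles(residuals, n=4)
    threshold = factor * (q3 - q1)
    keep = [abs(r) <= threshold for r in residuals]
    return keep, threshold

# Hypothetical residuals of one record; the last point is far off the surface
residuals = [0.01, -0.02, 0.03, -0.01, 0.02, 0.0, -0.03, 0.9]
keep, threshold = reject_outliers(residuals)
cleaned = [r for r, k in zip(residuals, keep) if k]
print(threshold, cleaned)
```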
To fit the two conditional responses of a record to two Hill functions of the form of Eq. 21 we use the drc package [Ritz et al., 2015]. Unlike other synergy analyses such as [Yadav et al., 2015], the response at zero concentration y0 is not fixed to 1 but merely constrained to be the same for both response curves. The other Hill parameters, y∞, s and e, are fitted for both compounds individually. In case the asymptote parameter y∞ is below zero for either of the two Hill curves, the conditional response of that compound is refitted to a two-parameter model with y∞ set to zero and y0 kept from the joint fit of both compounds. This is the case for 43 records of the Mathews Griner dataset and 125 records of the Cokol dataset.
The fGI (x1, x2) model is an implicit model for the response y. Therefore, a root finder is used to find a response given concentrations and parameters describing the Hill curves of the conditional responses, Θ = {y0, y∞,j, ej, sj}. We used the standard implementation of a root finder in the R stats package, uniroot() [R Core Team, 2016], which is based on the Brent-Dekker-van Wijngaarden algorithm [Press et al., 2007, Chapter 9]. As convergence tolerance we used 1.22 × 10⁻⁴.
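To illustrate the root-finding step, the sketch below solves an implicit additivity equation for the response y. Several assumptions are made: the implicit equation is taken in the standard Loewe form x1 / f1⁻¹(y) + x2 / f2⁻¹(y) = 1, the conditional responses are descending Hill curves of the four-parameter log-logistic form with a shared y0, and plain bisection replaces the Brent-type method of uniroot() for brevity. All function names and parameter values are hypothetical.

```python
def hill_inverse(y, y0, y_inf, e, s):
    """Dose at which a descending Hill curve reaches y (y_inf < y < y0)."""
    return e * ((y0 - y) / (y - y_inf)) ** (1.0 / s)

def isobole_response(x1, x2, y0, curve1, curve2, tol=1.22e-4):
    """Solve x1/f1^{-1}(y) + x2/f2^{-1}(y) = 1 for y by bisection.

    curve1 and curve2 are (y_inf, e, s) tuples of the two compounds.
    """
    def g(y):
        return (x1 / hill_inverse(y, y0, *curve1)
                + x2 / hill_inverse(y, y0, *curve2) - 1.0)

    lo = max(curve1[0], curve2[0]) + 1e-12  # just above the larger asymptote
    hi = y0 - 1e-12                         # just below the zero-dose response
    if g(lo) > 0:
        raise ValueError("no root between the larger asymptote and y0")
    # g increases with y for descending curves, so bisection applies
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Sham combination: a compound combined with itself (hypothetical parameters)
curve = (0.2, 2.0, 1.5)
y = isobole_response(1.0, 3.0, 1.0, curve, curve)
print(y)
```

For a sham combination the result should agree, up to the tolerance, with the single-compound Hill response at the summed dose x1 + x2, which makes it a convenient sanity check.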
Acknowledgments
We thank Bhagwan Yadav for sharing the code used for the analysis in Yadav et al. [2015] and Murat Cokol for sharing the data and analytical insights from Cokol et al. [2011].