Abstract
In many sensory systems the neural signal is coded by multiple parallel pathways, suggesting an evolutionary fitness benefit of a general nature. A common pathway splitting is that into ON and OFF cells, responding to stimulus increments and decrements, respectively. According to efficient coding theory, sensory neurons have evolved to an optimal configuration for maximizing information transfer given the structure of natural stimuli and circuit constraints. Using the efficient coding framework, we describe two aspects of neural coding: how to optimally split a population into ON and OFF pathways, and how to allocate the firing thresholds of individual neurons given realistic noise levels, stimulus distributions and optimality measures. We find that populations of ON and OFF neurons convey equal information about the stimulus regardless of the ON/OFF mixture, once the thresholds are chosen optimally, independent of stimulus statistics and noise. However, an equal ON/OFF mixture is the most efficient as it uses the fewest spikes to convey this information. The optimal thresholds and coding efficiency do depend on noise and stimulus statistics if information is decoded by an optimal linear readout. With non-negligible noise, mixed ON/OFF populations reap significant advantages compared to a homogeneous population. The best coding performance is achieved by a unique mixture of ON/OFF neurons tuned to stimulus asymmetries and noise. We provide a theory for how different cell types work together to encode the full stimulus range using a diversity of response thresholds. The optimal ON/OFF mixtures derived from the theory accord with certain biases observed experimentally.
Introduction
The efficient coding hypothesis states that sensory systems have evolved to optimally transmit information about the natural world given limitations on their biophysical components and constraints on energy use [4]. This theory has been successfully applied to explain the structure of neuronal receptive fields in the mammalian retina [2, 3] and fly lamina [23, 43] based on the statistics of natural scenes. Similar arguments have been made to explain why early sensory pathways often split into parallel channels that represent different stimulus variables, for example different auditory waveforms [37], or local visual patterns [35]. However, even neurons that encode the same sensory variable often split further into distinct types. A commonly encountered diversification is into ON and OFF types: ON cells fire when the stimulus increases and OFF cells when it decreases. This basic ON-OFF dichotomy is found in many modalities, including vertebrate vision [22], invertebrate vision [18], thermosensation [14], and chemosensation [9]. Furthermore, even among neurons that encode the same sensory variable with the same sign, one often encounters distinct types that have different response thresholds, for example among touch receptors [41] and electroreceptors [6]. The same principle seems to apply several synapses downstream from the receptors [20], and even in the organization of the motor periphery, where motor neurons that activate the same muscle have a broad range of response thresholds [17]. In the present article we consider this pathway splitting among neurons that represent the same variable and explore whether it can be understood based on efficient coding theory.
One reason why the ON and OFF pathways have evolved may be to optimize information about both increments and decrements in stimulus intensity by providing excitatory signals for both [32]. For instance, if there were only ON neurons, each cell would need a high baseline firing rate to encode stimulus decrements, which can be very costly. A population of many neurons, however, could resolve this issue by tuning their thresholds so that they jointly code for the stimulus. We have previously addressed the benefits of having ON and OFF cells in a small population of just two cells [16]. Since ON and OFF neurons often exhibit a broad distribution of firing thresholds [6, 41, 17, 20], an important question is what distribution of thresholds yields the most efficient coding. Here we study optimal information transmission in sensory populations composed of different mixtures of ON and OFF neurons that code for a common stimulus variable. We develop the problem parametrically in the neuronal noise and the distribution of stimuli that the cells encode.
The efficient coding hypothesis does not specify what quantity the neural population should optimize. Therefore, we consider two alternative measures of optimal coding that are in common use: first we maximize the mutual information between stimulus and response, and second we optimize the estimate of the stimulus obtained by a linear decoder of the response. When constraining the maximal firing rate of each cell, we find that counter to our expectations the mutual information is identical for any mixture of ON and OFF cells once the thresholds of all cells are optimized. This result is independent of the shape of the stimulus distribution or the level of neuronal noise. However, the total mean spike count is lowest for the population with equal numbers of ON and OFF cells, making this arrangement optimal in terms of bits per spike. Optimizing the linear decoder requires determining not only the cells’ thresholds, but also the decoding weights in order to minimize the mean square error between the stimulus and its estimate. Under this criterion, the optimal ON/OFF mixture and cells’ thresholds depend on the asymmetries in the stimulus distribution and the noise level. Our theory yields surprising results regarding the optimal organization of sensory populations comprised of ON and OFF cells. We make distinct predictions for the optimal distribution of thresholds under the two optimality measures, providing insight into the diverse coding strategies of these populations across different sensory modalities and species.
Results
Population coding model
We study a population of ON and OFF neurons that respond to a common stimulus. Model neurons are assumed to transmit information about a common scalar stimulus through the spike count observed during a short coding window. The duration of this coding window, T, is chosen based on the observed dynamics of neuronal responses; for instance, for retinal ganglion cells, T is typically in the range of 10-50 ms [42, 29]. Neuronal spike counts are stochastic and their mean is modulated by the stimulus through a binary response function: ON (OFF) neurons fire Poisson spikes at a mean rate νmax whenever the stimulus intensity s is above (below) their threshold θi, and are silent otherwise, i.e. νi(s) = νmaxΘ(s − θi) for ON neurons and νi(s) = νmaxΘ(θi − s) for OFF neurons (Figure 1A), where Θ is the Heaviside function. The binary response function is provably optimal under the constraint of Poisson spiking and the short coding windows encountered in many sensory areas [38, 34, 7, 27] and offers a reasonable approximation of neural behavior in several systems [24, 29].
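The encoding model above can be sketched as a short simulation. This is a minimal illustration assuming NumPy; the function name `population_response` and all parameter values are hypothetical choices made for the example, not taken from the original work.

```python
import numpy as np

def population_response(s, thresholds, is_on, R, rng):
    """Sample spike counts of binary Poisson ON/OFF neurons.

    s          : stimulus values, shape (n_stim,)
    thresholds : thresholds theta_i, shape (n_cells,)
    is_on      : boolean array, True for ON cells and False for OFF cells
    R          : maximal expected spike count per window, R = nu_max * T
    """
    s = np.asarray(s, dtype=float)[:, None]
    above = s > thresholds[None, :]
    # ON cells are active above threshold, OFF cells below threshold
    active = np.where(is_on[None, :], above, ~above)
    return rng.poisson(R * active)

rng = np.random.default_rng(0)
thresholds = np.array([-1.0, 1.0])   # illustrative values
is_on = np.array([False, True])      # one OFF cell, one ON cell
R = 2.0
s = np.full(100_000, 1.5)            # stimulus above the ON threshold

counts = population_response(s, thresholds, is_on, R, rng)
mean_on = counts[:, 1].mean()             # close to R when the ON cell is active
silent_on = (counts[:, 1] == 0).mean()    # close to exp(-R)
print(mean_on, silent_on, np.exp(-R))
```

Even when the ON cell is active, it emits no spike in a fraction exp(−R) of the windows, which is the source of coding noise in this model.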
Since we use a discrete rate function, we can replace θi by the corresponding cumulative threshold which essentially maps any stimulus distribution into a uniform distribution from 0 to 1 (Figure 1B). Since the stimulus dependence enters only through these values, the maximal mutual information is independent of the stimulus distribution, provided that the stimulus cumulative distribution is continuous.
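As a small numerical illustration of this mapping, the sketch below places an ON threshold at the same cumulative value u for two different stimulus distributions and checks that the cell is active with the same probability 1 − u in both cases. The value u = 0.3 and the two distributions are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
u = 0.3  # cumulative threshold: fraction of stimulus probability below theta

samples = {
    "laplace": rng.laplace(0.0, 1.0, 200_000),
    "gaussian": rng.normal(0.0, 1.0, 200_000),
}

p_active = {}
for name, s in samples.items():
    theta = np.quantile(s, u)                    # threshold with cumulative value u
    p_active[name] = float(np.mean(s > theta))   # probability that an ON cell is active

print(p_active)
```

The thresholds θ differ between the two distributions, but the activation probabilities, and therefore the response statistics, do not.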
Maximal mutual information for mixtures of ON and OFF neurons
Population responses are determined by the ratio of ON vs. OFF cells, as well as the distribution of firing thresholds of these cells. We determine the values of these parameters that maximize the Shannon mutual information between stimulus and population response, while constraining the maximum expected spike count R = νmaxT for each cell. Biophysically, such a constraint on the maximal firing rate arises naturally from refractoriness of the spike-generating membrane. We have analytically proven the following theorem (SI Appendix):
Equal Coding Theorem
For a population of N ON and OFF Poisson neurons coding a one-dimensional stimulus in a fixed time window T by binary rate functions with maximal firing rate νmax, the mutual information is identical for all ON/OFF mixtures when the thresholds are optimized, for all N, νmax and stimulus distributions.
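The theorem can be checked by brute force for the smallest non-trivial case, N = 2. The sketch below is an illustration, not the analytical proof: it lumps each cell's response into "no spike" versus "at least one spike", writes the response probabilities for every stimulus interval, and grid-searches the interval probabilities for a homogeneous (two ON cells) and a mixed (one OFF, one ON) population. The helper names, the grid resolution and R = 1 are assumptions made for the example.

```python
import numpy as np

def mutual_info(p_in, channel):
    """Mutual information (nats); rows of `channel` are p(response | interval)."""
    p_out = p_in @ channel
    joint = p_in[:, None] * channel
    mask = joint > 0
    safe_out = np.where(p_out > 0, p_out, 1.0)
    return float(np.sum(joint[mask] * np.log((channel / safe_out)[mask])))

def max_info(channel, n_grid=101):
    """Grid search over the probabilities of the three stimulus intervals."""
    grid = np.linspace(0.0, 1.0, n_grid)
    best = 0.0
    for a in grid:
        for b in grid:
            c = 1.0 - a - b
            if c < 0.0:
                continue
            best = max(best, mutual_info(np.array([c, a, b]), channel))
    return best

R = 1.0
q = np.exp(-R)   # probability that an active cell emits no spike

# Two ON cells. Intervals: [no cell active, one cell active, both active].
# Responses: [00, 10, 01, 11].
hom = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [q, 1 - q, 0.0, 0.0],
    [q * q, (1 - q) * q, q * (1 - q), (1 - q) ** 2],
])

# One OFF and one ON cell. Intervals: [silent, OFF active, ON active].
# Responses: [no spike, OFF cell fired, ON cell fired].
mix = np.array([
    [1.0, 0.0, 0.0],
    [q, 1 - q, 0.0],
    [q, 0.0, 1 - q],
])

i_hom = max_info(hom)
i_mix = max_info(mix)
print(i_hom, i_mix)
```

Up to the resolution of the grid, the two optimized populations carry the same information, even though their optimal interval probabilities differ.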
Optimized mutual information equals the negative log probability of the quiescent state. The value of this maximal information depends on the maximal expected spike count, R = νmaxT, which controls the Poisson noise level, since in the active state (where the cell’s firing rate is nonzero) the variance of spike count is R. It is therefore useful to introduce the noise parameter q = e^{−R}, ranging from q = 0 in the noiseless limit to q = 1 in the high noise limit. We find that for all ON/OFF mixtures, the information with optimized thresholds is given by (Figure 2A)
$$I = \log\left(1 + N(1-q)\,q^{q/(1-q)}\right) \qquad (1)$$
Interestingly, we can write I = −log P (0) where P (0) denotes the probability that no cells fire, i.e. the quiescent state. There are two contributions to P (0): (1) the probability that the stimulus is in an interval in stimulus space which is not coded by any ON or OFF cells (white region in Figure 1B), and (2) the probability that the stimulus is in an interval in stimulus space that is coded by a subset of ON or OFF cells (grey regions in Figure 1B), but no cell fires due to the Poisson variability. In the noiseless limit, R → ∞ (i.e. q = 0), I reaches its upper bound I = log(N + 1). The effect of noise is most prominent when R is of order 1/N, so that the total spike count RN is of order 1, implying that the signal-to-noise of the entire population is of order 1. We call this the high noise regime, and here we obtain I → log(RN/e + 1).
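The two limits and the relation I = −log P(0) can be checked numerically. The sketch below assumes the closed form I = log(1 + N(1 − q) q^{q/(1−q)}), which reproduces both limits quoted above and, for N = 1, the capacity of a Z-channel; the function names are hypothetical. For a single ON cell the information is maximized directly over the probability u that the cell is active.

```python
import numpy as np

def infomax_information(N, R):
    """Assumed closed form for the optimized information (nats)."""
    q = np.exp(-R)
    return np.log1p(N * (1.0 - q) * q ** (q / (1.0 - q)))

def plogp(p):
    """Entropy term -p log p, with 0 log 0 treated as 0."""
    p = np.clip(p, 1e-300, 1.0)
    return -p * np.log(p)

def single_cell_information(u, R):
    """Information (nats) of one binary ON cell that is active with probability u."""
    q = np.exp(-R)
    y = u * (1.0 - q)                          # probability of at least one spike
    response_entropy = plogp(y) + plogp(1.0 - y)
    noise_entropy = u * (plogp(q) + plogp(1.0 - q))
    return response_entropy - noise_entropy

# N = 1: maximize over the cumulative threshold and compare with -log P(0)
R = 1.0
u = np.linspace(0.0, 1.0, 100_001)
info = single_cell_information(u, R)
i_numeric = float(info.max())
u_best = float(u[info.argmax()])
p_quiescent = 1.0 - u_best * (1.0 - np.exp(-R))
print(i_numeric, infomax_information(1, R), -np.log(p_quiescent))

# Limits for a population of N cells
N = 50
low_noise = infomax_information(N, 50.0)       # compare with log(N + 1)
high_noise = infomax_information(N, 1.0 / N)   # compare with log(R N / e + 1)
print(low_noise, np.log(N + 1))
print(high_noise, np.log(1.0 / np.e + 1.0))
```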
Optimal distribution of thresholds
In the case of a discrete rate function, the information does not depend directly on the stimulus distribution p(s), but only on the areas of p(s) between consecutive thresholds. It is therefore useful to define the optimal threshold intervals
$$p_i = \int_{\theta_i}^{\theta_{i+1}} p(s)\,ds, \qquad i = 0, \dots, N,$$
where the neurons’ thresholds are ordered θ1 ≤ … ≤ θN (and we define the special θ0 = −∞ and θN+1 = ∞). We find a surprisingly simple structure for the optimal pi. The optimal thresholds divide stimulus space into intervals of equal area, p (Eq. 11), except for the ‘edge’ intervals, pedge (Eq. 12), and the ‘silent’ interval, p0, which separates the ON and OFF thresholds (Figure 2B). This p0 is the only non-noisy response state where the firing rate of each cell is zero. We call this optimal threshold structure the infomax solution. We consider several limiting cases: first, a large population N ≫ 1 and maximal firing rate per neuron R, which is much larger than 1/N, i.e. 1 − q = 𝒪 (1). We call this the large population regime. In this regime, pedge = p = p0 = 1/(N + 1), so the N thresholds divide stimulus space into N + 1 equal intervals.
In this large population regime, we can rewrite the optimal thresholds as a continuous function of cumulative stimulus space; we replace θi with θ(x), where x = i/N is between 0 and 1. Then the optimal thresholds equalize the area under the stimulus density,
$$\frac{1}{Z}\int_{-\infty}^{\theta(x)} p(s)\,ds = x \qquad (2)$$
where Z is a normalization factor. Therefore, the population of cells achieves ‘histogram equalization’, a strategy that has been proven optimal for a single cell, which codes a stimulus using a continuous function (the cell’s graded response) [23, 26].
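For a concrete stimulus distribution, the equal-area rule of the large population regime amounts to placing the thresholds at the quantiles i/(N + 1). A minimal sketch for a unit Laplace density follows; the choice of distribution, N = 9 and the helper names are assumptions made for the example.

```python
import numpy as np

def laplace_cdf(s):
    """Cumulative distribution of p(s) = 0.5 * exp(-|s|)."""
    s = np.asarray(s, dtype=float)
    return np.where(s < 0, 0.5 * np.exp(s), 1.0 - 0.5 * np.exp(-s))

def laplace_inverse_cdf(u):
    """Inverse of laplace_cdf for 0 < u < 1."""
    u = np.asarray(u, dtype=float)
    return np.where(u < 0.5, np.log(2.0 * u), -np.log(2.0 * (1.0 - u)))

N = 9
cumulative = np.arange(1, N + 1) / (N + 1)     # equal steps in cumulative stimulus space
thresholds = laplace_inverse_cdf(cumulative)   # thresholds in stimulus units

# Probability mass of each of the N + 1 intervals between consecutive thresholds
areas = np.diff(np.concatenate(([0.0], laplace_cdf(thresholds), [1.0])))
print(thresholds)
print(areas)
```

The thresholds are unevenly spaced in stimulus units, denser where the stimulus is more likely, while every interval carries the same probability mass.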
In contrast, in the high noise regime, the system performs redundant coding so that p ≪ 1 and the only two substantial threshold intervals are the edge intervals pedge = 1/e, and the silent interval, p0 = 1 − 2/e, where e denotes exp(1) (Figure 2C,D). This implies that the optimal solution is to place all ON thresholds, and similarly all OFF thresholds, at roughly the same value, maximizing redundancy and noise reduction.
Mean population firing rate depends on the ON/OFF mixture
Despite equality in information for all ON/OFF mixtures (Figure 2A), each optimized ON/OFF population uses a different mean spike count to achieve this information. In the large population regime, the mean spike count per neuron is r(α) = R(α² + (1 − α)²)/2 where α is the fraction of OFF cells. This mean spike count per neuron is minimized at α = 1/2, where it is half of the mean spike count for the homogeneous population, r(0) = R/2 (Figure 2E, yellow). As the noise increases, the relative benefits of the equally mixed relative to the homogeneous population decrease (Figure 2E). In the high noise regime, all mixtures produce roughly the same mean spike count per neuron of R/e (Figure 2E, brown).
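The large-population expression for r(α) follows from counting how often each cell is active when the thresholds sit at the equal-area positions i/(N + 1). The sketch below does this counting for a finite population and compares it with R(α² + (1 − α)²)/2; the function name and the values of N and R are illustrative assumptions.

```python
import numpy as np

def mean_count_per_neuron(n_off, n_on, R):
    """Mean spike count per cell for equal-area thresholds (large population regime)."""
    n = n_off + n_on
    cumulative = np.arange(1, n + 1) / (n + 1)   # cumulative thresholds, ordered
    p_off = cumulative[:n_off]                   # OFF cell is active when s is below threshold
    p_on = 1.0 - cumulative[n_off:]              # ON cell is active when s is above threshold
    return R * (p_off.sum() + p_on.sum()) / n

R, n = 10.0, 1000
results = {}
for n_off in (0, 250, 500):
    alpha = n_off / n
    exact = mean_count_per_neuron(n_off, n - n_off, R)
    approx = R * (alpha**2 + (1 - alpha) ** 2) / 2
    results[alpha] = (exact, approx)
    print(alpha, exact, approx)
```

The OFF cells with the lowest thresholds and the ON cells with the highest thresholds are rarely active, which is why splitting the population lowers the total spike count.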
Minimizing mean square error of the optimal linear readout
The mutual information tells us how well the population represents the stimulus without regard for how it can be decoded. An alternative criterion for coding efficiency is the ability of downstream neurons to decode this information. A simple biologically plausible decoding mechanism is linear decoding [11, 33, 46]. Here we study the accuracy of a downstream neuron that estimates the stimulus value s using a weighted sum of spike counts ni of the upstream population of binary Poisson neurons with thresholds θi
$$y = \sum_{i=1}^{N} w_i n_i + w_0 \qquad (3)$$
The weights wi, constant w0 and thresholds θi are optimized to minimize the mean square estimation error (MSE).
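For fixed thresholds, the optimal weights and constant term are the solution of an ordinary least-squares problem. The sketch below estimates them from simulated responses rather than from the analytical expressions, and it does not optimize the thresholds: they are simply placed at sample quantiles. The function name, population size and R are assumptions made for the example.

```python
import numpy as np

def fit_linear_readout(s, thresholds, is_on, R, rng):
    """Least-squares weights for a readout w0 + sum_i w_i n_i at fixed thresholds."""
    above = s[:, None] > thresholds[None, :]
    active = np.where(is_on[None, :], above, ~above)
    counts = rng.poisson(R * active)                       # noisy spike counts
    design = np.column_stack([np.ones(len(s)), counts])    # constant term + counts
    coef, *_ = np.linalg.lstsq(design, s, rcond=None)
    mse = float(np.mean((s - design @ coef) ** 2))
    return coef[0], coef[1:], mse

rng = np.random.default_rng(2)
s = rng.laplace(0.0, 1.0, 50_000)
N, R = 10, 1.0
thresholds = np.quantile(s, np.arange(1, N + 1) / (N + 1))   # not optimized

mixed = np.arange(N) >= N // 2          # lower half OFF, upper half ON
all_on = np.ones(N, dtype=bool)         # homogeneous ON population

w0_mix, w_mix, mse_mix = fit_linear_readout(s, thresholds, mixed, R, rng)
w0_hom, w_hom, mse_hom = fit_linear_readout(s, thresholds, all_on, R, rng)
print(mse_mix, mse_hom, s.var())
```

Because the thresholds here are not optimized, the two errors printed should be read only as a starting point for the joint optimization over weights and thresholds described in the text.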
Accuracy of the optimal linear readout without noise
We first consider the scenario of low noise (R → ∞), in which case the limitation on the accuracy of the stimulus reconstruction comes solely from the discreteness of the rate functions of each cell in the population. Unlike maximizing the information, when minimizing the MSE both weights and thresholds depend on the stimulus distribution p(s) (Figure 3). Interestingly, we find that in this low noise limit, the optimal MSE is proportional to 1/N² and is the same for all ON/OFF mixtures, including the homogeneous population. The optimal weights are given by where 〈s〉i are the centers of mass of intervals of p(s) intersected by neighboring thresholds (Methods Eq. 15, Figure 3A). The optimal thresholds are the average of two neighboring centers of mass
The constant term and the stimulus interval not coded by any cell depend on the ON/OFF mixture (Figure 3B,C). In the large population regime, we can rewrite the thresholds θi as a continuous function θ(x) of the cumulative stimulus space x = i/N between 0 and 1. Interestingly, the optimal thresholds equalize not the area under the stimulus density, as in the case of the mutual information (Eq. 2), but the area under its one-third power
$$\frac{1}{Z}\int_{-\infty}^{\theta(x)} p(s)^{1/3}\,ds = x \qquad (6)$$
where Z is a normalization factor. We invert this relationship to derive the optimal thresholds θ(x). Since the optimal MSE depends on the stimulus distribution, from now on we consider the Laplace distribution p(s) = (1/2) e^{−|s|}, which arises from fits produced by filtering natural images with difference-of-Gaussian linear filters corresponding to center-surround receptive fields [13, 5]. In this case, the optimal thresholds can be derived from Eq. 6 (Figure 3D):
The infomax thresholds are the same except that the pre-factor is 1 instead of 3, making them less spread out in the tails (Figure 3D). In particular, the largest thresholds (in magnitude) are ±3 log(2N) when optimizing the MSE, three times as large as in the infomax case, ± log(2N). To further compare the optimal thresholds from minimizing the MSE and the infomax solution, we also plot the cumulative optimal thresholds (Figure 3E). While the optimal strategy when maximizing the information is to emphasize stimuli with higher likelihood of occurring, minimizing the MSE of the optimal linear readout pushes thresholds towards relatively rare stimuli near the tails of the stimulus distribution (Figure 3F).
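The factor of three between the two threshold profiles can be checked numerically. The sketch below computes, on a grid, the thresholds that equalize the area under p(s) and under p(s)^{1/3} for the unit Laplace density and compares them at the same cumulative positions x. The grid limits, N = 20 and the function name are assumptions made for the example.

```python
import numpy as np

s = np.linspace(-40.0, 40.0, 800_001)
p = 0.5 * np.exp(-np.abs(s))            # Laplace stimulus density

def equalizing_thresholds(density, s, x):
    """Thresholds theta(x) at which the normalized area under `density` equals x."""
    cumulative = np.cumsum(density)
    cumulative = cumulative / cumulative[-1]   # division by the normalization factor Z
    return np.interp(x, cumulative, s)

N = 20
x = np.arange(1, N) / N                           # cumulative positions i / N
theta_info = equalizing_thresholds(p, s, x)       # equal area under p(s)
theta_mse = equalizing_thresholds(p ** (1.0 / 3.0), s, x)   # equal area under p(s)^(1/3)

away_from_zero = np.abs(theta_info) > 0.1
ratio = theta_mse[away_from_zero] / theta_info[away_from_zero]
print(ratio)
```

For the Laplace density, taking the one-third power only rescales the exponent, so the thresholds keep the same logarithmic shape and are stretched by a constant factor.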
Mixed ON/OFF populations in the presence of noise
When we introduce noise into the system so that the maximum expected spike count R is order 1, mixed ON/OFF populations show a dramatic improvement of the MSE over predominantly homogeneous populations (Figure 4A). For the Laplace distribution we have considered so far, and different noise values, the optimal fraction of OFF cells in the population is α = 1/2. Although there is a unique best ON/OFF mixture, the performance of populations with similar proportions of ON and OFF neurons is similar (i.e. the MSE around α = 1/2 is flat), while the homogeneous population has the highest MSE. As the noise decreases (R increases), this difference in performance between the mixed and homogeneous populations becomes even more dramatic, see for example R = 1 (Figure 4A). For a large population, where we assume a small difference between two neighboring thresholds, we can derive the asymptotic limit (SI Appendix) by expanding p(s) around each threshold to determine how the MSE scales with the population size N and the noise level R. For mixed ON/OFF populations, the variance term dominates the MSE, scaling as 1/N, while the constant term (determined solely by the discreteness of the rate functions) remains proportional to 1/N² [33]. For homogeneous populations, the variance term still dominates the MSE, scaling as log²(RN)/(RN), while the constant term scales as log(RN)/(RN).
In addition to the big difference in coding performance between mixed and homogeneous populations, the presence of noise also qualitatively changes the distribution of optimal thresholds (Figure 4B). In the asymptotic limit, the thresholds for a mixed population with an equal number of ON and OFF cells follow logarithmic profiles as a function of cumulative stimulus space (x = i/N) similar to the noiseless case (Eq. 7), except that the pre-factor is 2 instead of 3. This is also the case for any population where neither cell type predominates over the other, see for instance the population with 2/3 OFF cells and 1/3 ON cells (Figure 4C, red, blue). Thus, the noise has the effect of concentrating the thresholds near more likely stimuli, increasing the redundancy of the code.
When there is a pronounced over-representation of one cell type in the mixture (as in the extreme case of the homogeneous population), the optimal thresholds θ(x) exhibit a distinct asymmetry. The thresholds corresponding to the more abundant population are distributed linearly with x, while the thresholds corresponding to the less abundant population are distributed logarithmically with x, as before (SI Appendix). We demonstrate this asymmetry in the extreme case of the homogeneous population, e.g. all ON neurons (Figure 4B,C, black):
Moreover, the smallest threshold for the homogeneous population is much larger than the smallest threshold for any mixed population, suggesting that there is a large region of stimuli that is not coded by any cell in the homogeneous case.
In summary, we conclude that introducing noise has a dramatic effect on the coding efficiency of different ON/OFF mixtures when the MSE of the optimal linear readout is minimized: coding by mixed ON/OFF populations is much better compared to coding by populations dominated by one cell type, such as the homogeneous population. When considering the optimal thresholds, we find that introducing noise does not qualitatively affect the shape of the optimal threshold profile for mixed ON-OFF populations compared to the noiseless case (except for a constant pre-factor). However, for the homogeneous population, the addition of noise qualitatively changes the shape of the optimal thresholds, leaving a large stimulus region that is not coded by any cell.
The optimal ON-OFF mixture of the linear readout depends on the asymmetry in the stimulus distribution
Depending on the sensory modality, the distribution of natural stimuli may be asymmetric around the most likely stimulus. In the case of vision, it has been shown that the distribution of contrasts in natural images is indeed skewed towards more negative values [30, 39]. Therefore, we instead consider an asymmetric Laplace distribution p(s) ∝ e^{s/τ−} for s < 0 and p(s) ∝ e^{−s/τ+} for s ≥ 0. In this case, by minimizing the MSE we predict that the optimal ON/OFF mixture will be tuned to these stimulus asymmetries. At a fixed noise (or maximum spike count R), increasing the negative stimulus bias favors more OFF cells of the optimally configured population (Figure 5A,B). Similarly, increasing the positive bias in the stimulus distribution would favor an increase of the ON cells in the population. At a fixed level of stimulus bias, increasing the value of the noise accentuates the asymmetry in the optimal ON/OFF mixture (Figure 5C).
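Stimuli from such an asymmetric Laplace distribution can be generated by first choosing the sign and then drawing an exponential magnitude. This is a minimal sketch assuming the density is continuous at s = 0, so that the negative side carries probability τ−/(τ− + τ+); the function name and the values of τ− and τ+ are illustrative.

```python
import numpy as np

def sample_asymmetric_laplace(tau_minus, tau_plus, size, rng):
    """Draw from p(s) ~ exp(s / tau_minus) for s < 0 and exp(-s / tau_plus) for s >= 0."""
    p_negative = tau_minus / (tau_minus + tau_plus)   # probability mass of s < 0
    negative = rng.random(size) < p_negative
    magnitude = rng.exponential(np.where(negative, tau_minus, tau_plus))
    return np.where(negative, -magnitude, magnitude)

rng = np.random.default_rng(3)
tau_minus, tau_plus = 2.0, 1.0       # negatively skewed stimulus
s = sample_asymmetric_laplace(tau_minus, tau_plus, 200_000, rng)

fraction_negative = float(np.mean(s < 0))
print(fraction_negative, s.mean())   # mean of this density is tau_plus - tau_minus
```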
In summary, our theory predicts that the ON/OFF mixture achieving the lowest MSE depends on the asymmetry in the stimulus distribution and on the noise level. Even in nature, the relative predominance of ON and OFF cells in populations can be different, for instance in the retinas of different species [10, 30]. Therefore, if we know the natural stimulus distribution being encoded by a population and the bounds on cells’ firing rates, we can predict the optimal ON/OFF ratio, as well as the tuning properties of the cells and compare them to experimental observations. We anticipate that our theory would be a good match in early sensory populations of primary sensory neurons, which have most likely been adapted to the statistics of the stimulus distribution.
Discussion
Information in neural circuits is processed by many different cell types, but it remains a challenge to understand how these distinct cell types work together. Here we treat a puzzling aspect of neural coding: how do discrete cell types conspire to collectively encode a single relevant variable? We use information theoretic measures to offer a plausible explanation: that neurons diversify their responses to maximize the coding efficiency of the range of occurring stimuli given biophysical constraints. The efficient coding framework we developed covers two aspects of the population code: (1) how to optimally split a population into ON and OFF pathways, and (2) how to allocate the thresholds of the individual neurons in the population as a function of the noise level, the stimulus distribution and the optimality measure.
We first examined optimal coding by maximizing mutual information: when constraining the maximum firing rate, all ON/OFF mixtures yield the same maximal information about the stimulus when thresholds are optimized (Figure 2). The optimal thresholds divide stimulus space into equal intervals with size dependent on the noise, except for the edge intervals coded by the largest ON and OFF thresholds. However, the system with an equal number of ON and OFF cells is most efficient, because it conveys the highest information per spike (Figure 2). The invariance in coding performance by different ON/OFF mixtures is also present when we demand the stimulus to be read out by an optimal linear readout, but only in the absence of noise (Figure 3). In the biologically relevant regimes of non-negligible noise, noise has a dramatic influence on the optimal performance realized by different ON/OFF mixtures (Figure 4). Populations with a similar number of ON and OFF cells have a much smaller decoding error than populations dominated by one cell type. The extreme case of the homogeneous population performs dramatically worse than any mixed population. Our theory predicts the optimal ON/OFF mixture depending on the asymmetry in the stimulus distribution and the amount of noise – thus making it relevant for sensory systems beyond the retina where ON and OFF pathways are encountered (Figure 5). However, the distribution of optimal thresholds is not much affected by the presence of noise, except for a scaling factor. The huge difference in performance reflected in the MSE between the homogeneous and any mixed population seems to be due to the threshold distribution – the optimal thresholds acquire qualitatively a very different structure in the homogeneous case, leaving a large stimulus region not coded by any cell.
How comparable is coding by different ON/OFF mixtures for the two optimality measures? Generally, the infomax criterion implements an optimal strategy which emphasizes stimuli that occur with higher probability. In contrast, minimizing the mean square error of the linear readout implements a more conservative strategy that utilizes more cells in the encoding of rarer stimuli due to a larger error penalty (Figure 3F, 4B). What ON-OFF thresholds one finds in a biological system could be indicative of the optimality measure utilized by that system.
Our work predicts the firing thresholds of neuronal populations in the high noise regime, which corresponds to short coding windows commonly encountered in biology; for instance, in the mammalian retina computations can be performed with an average of a few spikes per coding window [28, 42, 29]. In the low noise regime when the coding window is sufficiently long, or there is a large number of neurons, our results agree with previous studies on infomax and the optimal linear readout [23, 25, 45]. Efficient coding in the high noise regime has previously been examined, but only in terms of the transfer function of a single neuron [7, 27]. Using numerical simulations it was shown that the optimal transfer function is binary, as commonly encountered in biological systems [7, 27]. We go beyond this work and provide analytical solutions for how a population of binary neurons should coordinate their response ranges to optimally represent a given stimulus in the realistic regimes of short encoding times.
In our theory we assumed that variability in the spiking output of each cell is the dominant source of noise; this is in agreement with the small shared variability observed among retinal ganglion cells [29]. However, nearby ganglion cells can also exhibit strong noise correlations under certain conditions, presumably arising from noise in shared presynaptic neurons as early as the photoreceptors [1], or produced by circuit interactions [40, 47]. We expect that our results will continue to hold at low levels of input noise. At high levels of input noise, a different coding strategy may be optimal as previous work has shown that neurons perform redundant coding whereby the thresholds converge to the same value [40, 16, 21, 8].
Our predictions for the optimal ON/OFF mixture and the population thresholds could be directly compared to experiment. For instance, when deriving the optimal ON/OFF mixture based on the optimal linear readout, the imbalance in ON vs. OFF cells arises as a result of asymmetries in the stimulus distribution and the amount of noise. Indeed, if one analyzes raw stimulus values, such as the light intensity in a natural scene or the intensity of natural sounds, the resulting distributions can be very skewed towards negative stimuli [31, 12, 44, 39, 36, 15, 30]. Our theory then predicts that more resources should be spent on OFF, which is consistent with the predominance of OFF retinal ganglion cells in the retinas of a variety of species [10, 30]. Other related theories of ON/OFF splitting, which have examined efficient coding including the spatial dimension, have also shown that OFF cells predominate [19, 30]. Besides predicting the optimal ON/OFF ratio, in our encoding model of one stimulus variable, we derive how the optimal ON and OFF subpopulations should coordinate their thresholds to achieve optimal encoding.
Given the ubiquity of ON/OFF pathway splitting in different sensory modalities and species, it may be possible to test our predictions in other sensory systems where the neuronal response properties may be different, depending on differences in the natural stimulus distribution and the bounds on cells’ firing rates. The challenge in each case might be to determine the natural stimulus distributions that the populations might be optimized to encode. Comparing the predicted optimal and the measured threshold distributions would provide a test of whether the efficient coding criteria proposed here are a likely constraint that shapes the evolution of sensory systems.
Materials and Methods
Mutual information
The Shannon mutual information between the stimulus s and the spiking response n of the population is the difference between response and noise entropy:
$$I(s;\mathbf{n}) = -\sum_{\mathbf{n}} p(\mathbf{n}) \log p(\mathbf{n}) + \left\langle \sum_{\mathbf{n}} p(\mathbf{n}|s) \log p(\mathbf{n}|s) \right\rangle_s$$
where 〈·〉x denotes an average over the distribution p(x) and p(n) = 〈p(n|s)〉s. We assume that stimulus encoding by all neurons is statistically independent conditional on s. Given the noise model, knowing the stimulus s unambiguously determines the response firing rate ν. Therefore, we can replace p(ni|s) with p(ni|ν), which is Poisson distributed:
$$p(n_i|\nu) = \frac{(\nu T)^{n_i}}{n_i!}\, e^{-\nu T}.$$
For a binary response function with two firing rate levels, 0 and νmax, we can lump together all states with nonzero spike counts into a single state which we denote as 1. Correspondingly, the state with zero spikes is 0:
$$p(0|\nu = 0) = 1, \qquad p(0|\nu_{\max}) = q, \qquad p(1|\nu_{\max}) = 1 - q,$$
where q = e^{−R} and R = νmaxT denote the level of noise in the system.
The optimal cumulative thresholds θ1 ≤ … ≤ θN divide the stimulus space into N + 1 intervals which have equal area except for the two ‘edge’ intervals, and the ‘silent’ interval that separates the ON and OFF thresholds, p0 = 1 − (N − 2)p − 2pedge (SI Appendix). Note that for the homogeneous population, which has a single edge interval, p0 = 1 − (N − 1)p − pedge.
Optimal linear readout. No noise (R → ∞)
Here we consider the ON and OFF thresholds separately. In a population of N neurons, we assume that there are N− OFF neurons and N+ ON neurons. The mean square error can be written as: where y is the stimulus estimate from Eq. 3, C denotes the matrix of pairwise correlations between the cells’ responses, Cij = Ci for i ≥ j, and U denotes the correlation between response and stimulus: with i = 1,…, N+ for the ON cells and similarly for the OFF cells with i = 1,…, N−. In the optimal solution the ON and OFF responses do not overlap; thus, there is no correlation between them. Optimizing the error with respect to the weights gives Eq. 4 for the optimal weights, while optimizing with respect to the threshold gives Eq. 5 for the optimal thresholds, with the centers of mass 〈s〉i defined as: where we have defined θN+1 = ∞. Optimizing with respect to the constant term yields for the homogeneous population: and for a mixed population: where θ1OFF denotes the largest OFF threshold and θ1ON denotes the smallest ON threshold in the population.
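In the noiseless case these conditions coincide with those of an optimal scalar quantizer: the readout in each interval equals that interval's center of mass, and each threshold lies midway between two neighboring centers of mass. The sketch below alternates these two updates on samples from a unit Laplace density (a Lloyd-Max style iteration); it is a sample-based illustration rather than the analytical procedure of the paper, and the function names, sample size and iteration count are assumptions.

```python
import numpy as np

def interval_means(samples, thresholds):
    """Center of mass of the samples in each interval between consecutive thresholds."""
    bins = np.searchsorted(thresholds, samples)
    n_bins = len(thresholds) + 1
    counts = np.bincount(bins, minlength=n_bins)
    sums = np.bincount(bins, weights=samples, minlength=n_bins)
    return sums / np.maximum(counts, 1)   # assumes no interval is empty

def lloyd_max(samples, n_cells, n_iter=100):
    """Alternate center-of-mass and midpoint updates for a noiseless step-function code."""
    start = np.arange(1, n_cells + 1) / (n_cells + 1)
    thresholds = np.quantile(samples, start)           # equal-area starting point
    for _ in range(n_iter):
        centers = interval_means(samples, thresholds)
        thresholds = 0.5 * (centers[:-1] + centers[1:])
    centers = interval_means(samples, thresholds)
    bins = np.searchsorted(thresholds, samples)
    mse = float(np.mean((samples - centers[bins]) ** 2))
    return thresholds, centers, mse

rng = np.random.default_rng(4)
samples = rng.laplace(0.0, 1.0, 100_000)

thresholds_7, centers_7, mse_7 = lloyd_max(samples, 7)
thresholds_15, centers_15, mse_15 = lloyd_max(samples, 15)
print(mse_7, mse_15)
```

In this sketch the readout weights correspond to the differences between neighboring centers of mass, and the error shrinks quickly as cells are added, in line with the 1/N² scaling stated in the Results.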
Non-negligible noise (finite R)
We normalize the readout for convenience:
The error can be written as before (Eq. 13) with correlations where δij = 1 if i = j and otherwise 0. If we define Ci = 〈Θ(s − θi)〉 as before, then for i ≥ j: with i = 1,…, N+ for the ON cells and similarly for the OFF cells with i = 1,…, N−. Optimizing with respect to the ON (OFF) weights: and optimizing with respect to the ON (OFF) thresholds: with i = 1,…, N+ for the ON cells and similarly for the OFF cells with i = 1,…, N−. To solve these equations numerically, we implement an iterative procedure that rapidly converges to the optimal solution: starting from an ansatz for the thresholds, we compute C and U and obtain w from Eq. 21, which is used to derive the new set of thresholds.
Acknowledgments
JG was supported by the Max Planck Society and a Burroughs-Wellcome Career Award at the Scientific Interface. All authors were supported by the NIH, the Gatsby Charitable Foundation and the Swartz Foundation.