Abstract
Human visual cortex is organized into multiple retinotopic maps. Characterizing the arrangement of these maps on the cortical surface is essential to many visual neuroscience studies. Typically, maps are obtained by voxel-wise analysis of fMRI data. This method, while useful, maps only a portion of the visual field and is limited by measurement noise and subjective assessment of boundaries. We developed a novel Bayesian mapping approach which combines observation–a subject’s retinotopic measurements from small amounts of fMRI time–with a prior–a learned retinotopic atlas. This process automatically draws areal boundaries, corrects discontinuities in the measured maps, and predicts validation data more accurately than an atlas alone or independent datasets alone. This new method can be used to improve the accuracy of retinotopic mapping, to analyze large fMRI datasets automatically, and to quantify differences in map properties as a function of health, development and natural variation between individuals.
Introduction
Visual responses in a substantial part of the human brain are organized in retinotopic maps, whereby nearby positions in the brain represent adjacent locations in the image. Accurate measurement of these maps using functional magnetic resonance imaging (fMRI) is essential to a wide range of neuroscience and clinical applications [1]. Maps provide a basis to compare or to aggregate measurements across individuals, groups, tasks, stimuli, and laboratories. Maps are used to study homology between species[2], cortical plasticity [3], and development [4]. Many studies of cortical visual function in human - whether in motion [5], color [6], object recognition [7], or attention [8] - include retinotopic mapping as a first step. Finally, basic properties of the maps themselves, such as the cortical magnification function (mm of cortex per degree of visual field), can be used to understand visual performance [9]. Hence, the ability to accurate and efficiently measure retinotopic maps is of broad importance in visual neuroscience.
Currently, most retinotopic analyses of fMRI data use a voxel-wise approach. The general method is to (1) measure responses to mapping stimuli, (2) derive retinotopic coordinates for each voxel or surface vertex by analyzing traveling waves [10, 11] or by solving a population receptive field (pRF) model for each voxel [12], and (3) identify areal boundaries by visual inspection. Aside from requiring significant time and effort, the maps that result from this process retain many common sources of error including distortion of the BOLD signal due to partial voluming [13], vessel artifacts [14], other sources of physiological noise, and model fitting biases [15]. Due to the various sources of noise, the measured maps have discontinuities and often systematically miss portions of the visual field, such as the vertical meridian [16-20]. Further, the measured maps are limited by the available stimulus field of view in the scanner, often as little as 6-12° of eccentricity, and have difficulty measuring the foveal representation [21], the portion of the maps most important for many visual tasks including reading and object recognition [22]. These many shortcomings of the traditional retinotopic mapping process derive from the fact that it is organized around optimizing the explanatory power of the retinotopy solutions from individual voxels, rather than that of the entire visual field or cortical area. As a consequence, it yields maps that are neither smooth nor complete—nor grounded in any context of how the visual field is warped onto the cortical surface. Lacking these data, the comparison of maps between subjects is difficult, and precise quantitative examination of individual differences is impossible.
An alternative to voxel-wise modeling of fMRI data is to build a retinotopic atlas - a computational model of the mapping between visual field position and cortical structure. Atlases have been developed to describe the mean anatomical locations [23] and the mean functional organization [24, 25] of the V1-V3 retinotopic areas across subjects. The atlas, after being fit to training data, is applied to an individual anatomical MRI image without functional data. These atlases are useful, but they provide only a description of the mean and do not offer a method of characterizing maps in an individual subject. A tool capable of providing such a characterization would be of considerable utility to researchers interested in homology between species [26], individual differences among humans [27, 28], and basic principles of cortical organization [1] and development [29].
Individual variation in cortical maps between subjects can be quite large. For example, the surface area of V1 can vary by 2- to 3-fold across healthy adults [27, 30, 31]. Despite these very large differences, if the surface anatomies are co-registered between individuals, the map topographies sometimes show reasonable agreement. For example, atlases of V1-3 [24, 25] have been derived using topological models of the cortical field [24, 32] with anatomical registration between observers [33, 34]. The assumption underlying these atlases is that once a correspondence is found between the sulcal pattern in two subjects’ visual cortices, the function will match. Hence if one were interested in individual variation in cortical topography after anatomical registration, this method is uninformative: it assumes the answer is 0. In this paper, we develop a novel method and toolset for modeling retinotopic maps in individual subjects. This method provides a precise quantification of the space between average and individual maps, enabling us to examine differences in cortical organization, even after co-registration of the surface anatomies.
Here, we hypothesize that a Bayesian model of retinotopic maps, combining data (retinotopic measurements from voxels or vertices) with a prior (a full-field atlas derived from the anatomy), will eliminate many of the issues with retinotopic mapping (described above) by optimizing the description of cortical retinotopic maps in the context of the full visual field and cortical map. We propose that such methods can describe cortical retinotopic maps in individual subjects more accurately than an atlas alone or measurements alone. These hypotheses are motivated by two factors. First, previous work employing functional data to supplement global anatomical alignments between subjects has found an increase in the overlap of functional ROIs drawn from independent localizers [35]. Thus, even when subjects are aligned anatomically, appreciable and systematic differences in the structure-function relationship remain. Allowing the measurement from an individual subject to inform the alignment will, in part, capture these individual differences. Secondly, we believe that the basic form of the atlas (the prior) is sufficiently accurate that incorporating it will result in a more accurate estimate of the retinotopic map than the measurements alone.
In this paper, we employ a Bayesian maximum-likelihood optimization to describe the retinotopic maps in striate and extra striate cortex with previously infeasible precision. Unlike previous work on functional alignment [36], we perform alignment between each individual subject’s retinotopic parameters and a model of retinotopy described on the (anatomically-aligned) group-average cortical surface. This optimization builds on previous work using iterative approaches to fit and interpolate smooth retinotopic maps in individual subjects [27] by incorporating an explicit prior in the place of human intervention and adopting an explicitly Bayesian formulation. We additionally publish with this paper a tool capable of implementing the method we describe as well as all source code employed. We use these tools to characterize retinotopic maps from several subjects in terms of the precise warping from visual field to visual cortex. Using these characterizations, we are able to quantify the extent to which variations in retinotopic organization are due to anatomical differences versus differences in the structure-function relationship. We show that, in fact, these two sources of variability—differences in structure and differences in the structure-to-function mapping—are roughly equal and orthogonal across subjects. This means that after warping individual cortical surfaces to bring the anatomies into registration, an additional warping, equal in size, is needed to bring the functional maps into alignment, thereby demonstrating substantial variability in an early, sensory region of the human brain.
Results and Discussion
Retinotopic mapping experiments were performed on eight subjects using fMRI. Twelve individual retinotopy scans were performed on each subject then combined into 22 datasets for cross-validation as well as one full dataset of all scans for detailed analysis (Fig. S1; see also Materials and Methods). Predicted maps were then generated using the datasets and compared to the validation dataset. We compared three methods for predicting retinotopic maps (Fig. 1): (1) using the training data alone as a prediction (the “observed” maps); (2) using the subject’s anatomy alone to apply an anatomically-defined template of retinotopy (the “prior” maps); and (3) combining data with anatomy (observation with prior) using Bayesian inference (the “inferred” maps). We then leverage the differences between these methods to characterize the pattern of individual differences in retinotopic maps across subjects. The prior map, used for methods 2 and 3, was derived from fitting a template to a high-quality dataset derived from 181 subjects in the Human Connectome Project [37-39].
Bayesian inference has the advantages of an anatomical atlas while also respecting individual differences
The inferred retinotopic maps, predicted by Bayesian inference, provide high-quality descriptions of the full retinotopic topology for each subject’s V1-V3 regions. These maps can be produced even in the absence of observed retinotopic measurements (i.e., using the prior alone), or by combination with retinotopic data. In this latter case, the inferred maps have all the advantages of the retinotopic prior (topologically smooth maps, predictions beyond the stimulus aperture, etc.), while also accounting for idiosyncrasies in individual maps. Three examples of maps that demonstrate this advantage are shown in Figure 2. The first two columns show maps in which, relative to the validation maps, the predictions made from data alone have highly curved iso-eccentricity contours. These contours reflect noise rather than true curvature in the iso-eccentricity contours, as shown by the validation data. For these two columns, the predictions from the prior alone have iso-eccentric contours that are too smooth (as compared to the validation data). The correct lines appear to lie between the training data and the prior. Hence when data is combined with the prior (Fig. 2C, 3rd row) the iso-eccentric contours resemble those of the validation dataset. The third column of Figure 2 shows an instance in which, even lacking a coherent polar angle reversal to define the ventral V1/V2 boundary near the fovea in the predictions made from data alone, combination of the data with prior more accurately predicts that boundary in the validation dataset than the prior alone.
In constructing the Bayesian-inferred retinotopic map for a single hemisphere, we perform two deformations, detailed in Figure 3 (see also Materials and Methods): (1) we first deform that hemisphere’s inflated surface to register it to an average anatomical atlas using FreeSurfer (leftmost arrow in Fig. 3C); and (2) we then further deform the surface to register it to the retinotopic prior, based on the hemisphere’s retinotopic measurements (second arrow in Fig. 3C). These steps together account for the individual differences in the organization of the subjects’ retinotopic maps. In step 1, we account for structural differences across subjects—the deformation that occurs for this registration is unique for each subject. This is where prior work ended [24, 25]. In step 2, we account for the differences in the relationship between structure and function across subjects. Although it is possible that deformations in step 2 partly compensate for imperfections in the first two registrations, we propose that, to a first approximation, the deformations applied in the last step indicate meaningful individual differences in the structure-function relationship.
Individual differences in the V1-V3 structure-function relationship across subjects are substantial
The Bayesian model fitting allows us to parcellate two sources of variation between individuals: differences in surface anatomy (sulcal patterns) and differences in structure-to-function mapping, i.e. how retinotopic features map onto the surface anatomy. These two sources of variation map approximately to the two deformations in our atlas fitting: differences in surface anatomy are reflected in the deformation used for the surface alignment, and differences in structure-to-function mapping are reflected in the deformation to align the retinotopic data to the prior (the Bayesian fit). There are some subjects for whom there are large differences between the retinotopic atlas defined by the prior and the atlas defined by the Bayesian fit (Figs. 4A, 4B). In this example subject, the iso-eccentricity lines in the Bayesian atlas are substantially more compressed along the posterior-anterior axis compared to the anatomical atlas, and the iso-angle lines in V2/V3 are more dorsal compared to the anatomical atlas. Where there are discrepancies, the Bayesian inferred map is more accurate. For example, the polar angles and the eccentricities in the validation data are approximately constant where the Bayesian map predicts iso-angle and iso-eccentricity lines, but not where the prior map predicts them (Fig. 4B). This indicates that even after anatomical alignment, the retinotopy in this subject differs systematically from the prior. For some other subjects the two atlases are in closer agreement such that the prior alone is a good fit to the retinotopic data (Fig. 4C).
To quantify the two types of deformations, we calculated the mean 3×3 distance matrix between (1) a vertex’s native position, (2) its position in the anatomical alignment (fsaverage position), and (3) its “retinotopic position” after alignment to the retinotopic prior. All vertices in the V1-V3 region within 12° of eccentricity, as predicted by the Bayesian inference on the full dataset (all retinotopy scans combined), were used. We then performed 2D metric embedding to determine the mean deformation steps and the mean angle between them (Fig. 5A). Overall, the mean deformation distance across vertices is 3.1° ± 0.8° (μ ± σ across subjects) of the cortical sphere for the anatomical alignment and 3.3° ± 0.4° for the retinotopic alignment. The mean angle between these deformations is 82.5° ± 9.5° (note that this last measurement is in terms of degrees of rotational angle rather than degrees of the cortical sphere). Of these two deformations, the latter, in which vertices are driven away from their anatomically-aligned positions according to their associated retinotopic measurements, is about 10% larger than the the anatomical alignment, in which vertices are driven away from their positions in the subject’s native cortical sphere according to the surface curvature. The anatomical alignment corresponds to variation in the surface topology and is accounted for in anatomical atlases [24, 25]; the retinotopic alignment corresponds to variation in the structure-function mapping and is accounted for in the Bayesian model. An additional summary measurement of these deformations, the root-mean-square deviation (RMSD) distances across vertices near the occipital pole in a particular retinotopy dataset, provides a metric of the total warping applied during each step of the alignment process for each subject. A summary of this measurement, as well as various other summary statistics is provided in Table 1.
For both left (t8 = −8.72; p < 0.0001) and right (t8 = −5.19; p = 0.0013) hemispheres, the retinotopic deformation distances were significantly larger than the the anatomical deformation distances, which were still substantial; this is true whether one calculates the mean deformation distance over the entire patch of cortex immediately around the occipital pole (shown in the maps in Fig. 2) or over only the vertices that are predicted to be in V1-V3; Figure 5 is calculated over the latter of these two ROIs. Note that if the retinotopic deformation distances had been much larger, the prior anatomical atlases would have been less accurate. Had they been close to 0, then the anatomical atlas alone would have been as accurate as the Bayesian model.
We interpret the warping performed to align retinotopic data to the anatomical prior (the retinotopic alignment) as evidence of individual differences in the way in which retinotopy maps to sulcal topology. An alternative possibility is that the retinotopic alignment corrects for incomplete or incorrect warping performed by FreeSurfer during alignment of each subject’s sulcal topology to that of the fsaverage hemispheres (the anatomical alignment). Two observations cast doubt on this interpretation, however. First, the angle between the alignments is roughly orthogonal, meaning that there is very little movement along the axis of the first (structural) alignment during the second (retinotopic) alignment. Had the first alignment been in the correct direction but too conservative, then we would have expected the retinotopic alignment to be in the same (or similar) direction, rather than orthogonal. Second, if the retinotopic alignment served to correct an incomplete or incorrect anatomical warping, then anatomical metrics such as curvature would become more uniform across subjects after retinotopic alignment compared to after only anatomical alignment. Figure 5B demonstrates that this explanation is unlikely by showing the distribution across surface vertices of the standard deviation of curvature across subjects. When curvature is compared across subjects without retinotopic or anatomical alignment (Native alignment in Fig. 5B) the standard deviation is quite high. When subjects are compared after anatomical alignment, the standard deviations are much lower. After further alignment to the anatomical prior of retinotopy (Retinotopic alignment), the standard deviation of curvature across subjects is between these two extrema. This suggests that the retinotopic alignment is sacrificing some amount of structural uniformity across subjects in order to accommodate the individual differences in subjects’ structure-to-function mapping, and is consistent with our interpretation that there are substantial individual differences in the mapping between retinotopy and surface topology.
The large individual differences that remain, even after structural co-registration, point to the importance of using at least some individual subject functional data when inferring the maps, rather than assuming the atlas (prior) is correct. The specific nature of these deformations, and whether, for example, they fall into a few basic patterns, is an important question about the natural variation of individual brains. Our new method, combined with large datasets such as the HCP retinotopy data set [39], could be used to address this question.
The inferred maps make highly accurate predictions using very little data
To quantify the accuracy of our Bayesian-inferred retinotopic maps, and to compare the accuracy against other predictions, we used a cross-validation scheme such that predictions from data alone, the prior alone, or via Bayesian inference were compared against independent validation datasets (Fig. S1). The validation datasets were derived from 6 of the 12 scans; the predictions from data alone and from Bayesian inference were derived from training datasets, which were comprised of various combinations of 1-6 independent scans (between 3.2 and 19.2 minutes of data). The predictions from the prior alone did not use training data.
Within the validation dataset, all vertices from the inner 12° of eccentricity of the V1-V3 maps, as identified via Bayesian inference using the validation dataset, were compared to their counterparts in the predicted maps. Note that the inferred maps from the validation dataset were used only to identify the vertices included in the comparison; the retinotopic coordinates from the validation datasets themselves were taken as the “gold-standard” measurements. In computing prediction accuracy, we weighted the vertices by the fraction of variance explained for each vertex’s pRF solution in the validation dataset. For the three types of training datasets (prior alone, data alone, Bayesian inference), we assume that each vertex makes a prediction regardless of the variance explained. To prevent errors at high eccentricity from dominating the error metric, we calculated the scaled error for a vertex to be the angular distance in the visual field between its retinotopic coordinates in the predicted map and the validation dataset, divided by the eccentricity in the validation dataset. Figure 6A shows the scaled mean squared error (MSE) for the various predicted maps in terms of the amount of time spent collecting retinotopic data for the map.
The maps predicted via Bayesian inference were highly accurate irrespective of the amount of data used to inform the fits (Fig. 6). Between those predicted maps informed by 3.2 minutes of scan time (1 scan) and 19.2 minutes (6 scans), the scaled MSE of the prediction remains in the range of ~0.4-0.5. These scaled errors are larger near the fovea because the denominator used for scaling the error metric (i.e., the eccentricity) could be very small; when the range is limited to 3 to 12 deg, the MSE is much lower, ~0.20-0.23. Expressed separately in units of polar angle and eccentricity, the mean absolute polar angle error from a Bayesian map derived from a single 3.2 minute scan is 22° ± 4.6° and the mean absolute eccentricity error is 0.69° ± 0.11° ± (μ + σ across 96 datasets). For the prior alone, these errors are substantially higher: 31° ± 11° for polar angle and 2.0° ± 1.3° for eccentricity. Note that these errors are approximately 3× higher than those reported for previous versions of the anatomical prior [25]; however these discrepancies are due to differences in the metric used, the amount of data collected, the thresholding applied, and the use of smoothing. Some of these factors we cannot reproduce exactly (amount of data) or have deliberately abandoned (smoothing), but by using the same metric (median absolute error across all vertices) and similar thresholding (1.25° < predicted eccentricity < 8.75°), we obtain errors very close to those previously reported: 12° of polar angle and 0.90° of eccentricity. In contrast to the inferred maps for which the accuracy is largely independent of scan time, the accuracy of the predictions from data alone was highly influenced by scan time. The scaled MSE of the maps predicted from the training datasets alone for the same range of scan times ranged from ~2.2 (3.2 minutes of training data) to ~0.3 (19.2 minutes of training data). With more than ~11 minutes of scan time, the predictions made from the training datasets alone have a slightly lower scaled MSE than those made from Bayesian inference, though, notably, the improvements in scaled MSE for more than 11 minutes of scan time are small.
The prediction accuracy from the prior alone (0 min. scan time) was generally intermediate in accuracy between the predictions from the Bayesian model and the data alone. Predictions derived from the anatomically-defined atlas alone are more accurate than predictions from 3.2 minutes of scanning and only slightly less accurate than predictions from training datasets derived from 6.4 minutes of scanning; this is in agreement with previous analyses of prediction error versus measurement error in retinotopic mapping experiments [25]. The predictions using the prior alone were universally less accurate than the Bayesian predictions (Fig. 6, cyan)—for all subjects and all training datasets, the combination of prior with data improved prediction accuracy compared to the anatomical prior alone. These data demonstrate that the application of our new method to a small amount of retinotopic mapping data yields a higher quality retinotopic map than can be derived from other sources alone, with the possible exception of data derived from a very long retinotopic mapping session.
The fact that the increased prediction accuracy from the Bayesian maps is almost independent of the amount of scan time used for the observations suggests that much of the individual variability is captured by a low dimensional warping from the template, which can be inferred from a modest amount of data.
Another significant advantage of the method is that it eliminates the need for human intervention in the process of delineating retinotopic maps and visual areas. In most studies that requite retinotopic mapping data, one or more experimenters hand-label the visual area boundaries. While human raters are better able to understand atypical retinotopic boundaries than our method, they are nonetheless subject to inter-rater disagreement and human error. Furthermore, although expert human raters have a much more nuanced prior about retinotopic map organization than our method, and thus may sometimes draw boundaries better than our method, our method at least makes its prior explicit and quantifiable, and, thus, comparable and replicable across studies.
What is ground truth?
The motivation for a Bayesian approach to retinotopic mapping can be found most clearly in the measured retinotopic maps themselves. In all of our measured retinotopic maps, there are numerous systematic imperfections (Fig. 7A), and the literature contains many reports of similar errors [14, 40-42]. These imperfections can arise from a variety of sources, including partial voluming [43], negative BOLD [44], and large, draining veins. Imperfections due to blood vessel artifacts in particular, such as the transverse sinus in ventral visual cortex, can have effects over large distances [14], and most perniciously, they may lead to large and reliable responses that nonetheless differ from the local neuronal activity in the voxel [42]. Such artifacts can be difficult to track down and are often not eliminated by typical methods of cleaning up maps such as smoothing, thresholding, or simply collecting larger datasets.
Although with large datasets (> 19 min of scan time), the prediction accuracy for the validation dataset is highest using the data alone rather than the data and prior, we believe that even in these cases the combination of data + prior is probably closest to ground truth. We defined accuracy operationally as the difference from the validation set, as this provides a single set of independent measures that can be used to assess the accuracy of all 3 types of models. The validation dataset is defined by at least as many scans as any of the training datasets, and hence is our best measurement. However, the validation dataset is not ground truth, as it is subject to errors from systematic and random measurement noise. The inferred maps, unlike the maps from the validation and training datasets alone, produce a topologically smooth transformation into the visual field: they represent the complete visual field with no holes, redundancies, or discontinuities. Hence an enclosed region in the visual field will project to an enclosed region in the inferred map on the cortical surface. This is not the case for the maps predicted by the data alone without the use of an atlas (Fig. 7B). We consider this difference to be an advantage of the inferred maps, since it is generally accepted that the cortical surface of V1 is a topological map of the visual field.
In short, since we do not correct for all of these potential sources of systematic error, we consider our estimates of error from the Bayesian-inferred maps to be conservative, and the estimate from the data-to-data predictions to be liberal.
The Bayesian model accurately predicts visual field positions not included in the training data
One important advantage of using the method of Bayesian inference outlined in this paper is that it provides predictions beyond the extent of the stimulus aperture in the retinotopy experiment. These peripheral predictions extend to 90° of eccentricity, even though the data used to derive the prior was based on stimuli that only extended to ~8° of eccentricity. Hence it is important to ask whether the model makes accurate predictions in the periphery. We demonstrate this in two ways. First, when our registration algorithm is run using only a subset of the eccentricity range (e.g., only data within the first 3° or the first 6° of eccentricity), the predicted maps remain accurate to 12° of eccentricity (Fig. 8A). Second, we compared wide-field retinotopy data, collected out to 48° of eccentricity from subject S1201 to the Bayesian-inferred map predictions made using our data with a 12°-aperture (Fig. 8B). We find that in both cases, our method is highly accurate despite lacking training data for peripheral measurements. Thus, even if prediction accuracy within the measured regions were similar for the Bayesian model and the data-to-data predictions, the Bayesian model is advantageous because it includes predictions for regions of the visual field beyond training data.
The Bayesian inferred maps accurately reproduce systematic properties of the visual field maps
Another aspect in which our work here extends previous methods is the addition of the pRF size to the retinotopic quantities predicted by the model in the inferred maps—previous work predicted only the pRF centers [24, 25]. Here, we predict the pRF sizes for the vertices based on the eccentricity inferred from the Bayesian map, and the relationship between eccentricity and pRS size in previous literature [45]. The inferred pRF size of a vertex is the pRF size predicted by the Kay et al. [45] model for the vertex’s inferred visual area at the vertex’s inferred eccentricity. Although pRF size can depend on variables such as the voxel size and the stimulus that may differ between scans, we find that this method approximately reproduces the pRF sizes found in the observed retinotopic maps (Figs. 9A and 9B).
Another metric inversely related to pRF size is the cortical magnification, usually measured in terms of mm2 of cortex representing one degree2 of the visual field. We summarize these measurements in Figures 9C and 9D. Our measurements of cortical magnification are broadly in agreement with previous work by Horton and Hoyt [46], shown by the dotted black line in the lower panels of Figure 9. The cortical magnification of the inferred maps is quite similar to those of the observed retinotopic maps. In both cases, V1 has slightly lower cortical magnification than V2 and V3 near the fovea, but higher magnification in the periphery. This difference is slightly exaggerated in the inferred maps relative to the observed maps; though this difference is slight and is in agreement with previous examinations of cortical magnification [21].
The retinotopic prior and Bayesian-inferred maps include 12 visual areas
Previous research on the retinotopic organization of visual cortex used a model of V1, V2, and V3 retinotopy described by Schira et al. [32] to produce a template of retinotopy that included only those visual areas. This “banded double-sech” model accurately describes the anisotropic magnification of the visual field on the cortical surface, particularly near the fovea. However, we have observed, particularly in individual data, that retinotopic data from outside the V1-V3 region described by the Schira model has a large impact on the quality of the inferred map. Accordingly, in creating our retinotopic prior, we constructed a new model that includes 9 additional visually active regions: hV4, VO1, VO2, V3a, V3b, LO1, LO2, TO1, and TO2. This model employed a new method of constructing 2D models of retinotopy that was specifically designed to accommodate the distortions caused by anatomical alignment, inflation, and flattening of the cortical surface. The new method is much simpler to extend to many more visual field maps, as it does not rely on an analytic description of the flattened (2D) retinotopic maps, which are only available for V1-V3. Rather, it requires as input a cortical map image on which estimates of the visual area boundaries have been drawn manually and labeled as either a foveal boundary, a peripheral boundary, an upper vertical meridian, a lower vertical meridian, or a horizontal meridian. A minimization technique is then used to fill in the retinotopic coordinates between the drawn boundaries (see Models of Retinotopy in Materials and Methods). The new retinotopic prior, including all new areas can be seen in Figure S3. Although we consider the addition of 9 retinotopic areas to be an important development, we consider these areas preliminary and do not analyze them in detail here. One reason for this is that the organizations of many of these areas remain under dispute. Additionally, the responses to our stimuli in these areas is of a considerably lower quality than in V1-V3; thus even were we to analyze the accuracy of the predictions in these ares, our validation dataset would be a particularly poor standard. We do, however, include these areas in the predictions made by the Bayesian inference method so that interested researchers may analyze or use them. These areas are included in the data provided with this paper, and predictions of these areas are included when using the tools provided in the supplemental materials.
Making the Bayesian inference explicit
Our method has the advantage of allowing the retinotopic atlas to act as a prior constraint on new observed data. This is a Bayesian model in the general sense of combining a prior belief with a measurement in order to make an inference. The computation can also be reformulated in an explicit Bayesian framework. We define a hypothesis H to be a particular warping of the cortical surface, and we define the evidence E to be a particular set of retinotopic measurements. We then convert the cost functions from Table 2 into probabilities by assuming an exponential relationship. Hence, the prior probability of H is defined in terms of the deviation from the retinotopic prior: P(H) = exp(-(Fe(x) + Fϑ(x) + Fp(x))), and the likelihood of the evidence under a given hypothesis, P(E | H), is defined in terms of the fit between the retinotopic model and the retinotopic measurements: P(E | H) = exp(-Fφ(x)).
During registration, we seek the hypothesis H that maximizes the posterior probability P(H | E) = P(E | H) P(H)/P(E). Because P(E) is a constant, we can ignore it and instead maximize the function given in Equation 1, which is equivalent to minimizing F(x). This operation is performed during registration. Thus, to derive our cost function from Bayes’ rule, we write:
Individual differences in structure-function relationship
An important question in human neuroscience is the degree to which different brains, when brought into anatomical registration, share the same functional mapping. There is no single, agreed-upon method to register the brains of different individuals, but a general finding is that cortical function shows better inter-observer agreement when the brains are aligned based on sulcal topology (surface registration) rather than volume registration [23, 47]. Here, using surface registration, we find that substantial individual differences in functional mapping remains, e.g. as evidenced by the amount of additional warping needed to align individual brains to retinotopic measurements (Fig. 5). These results are consistent with studies showing differences in structural and functional alignment of primate and human area MT [48]. They are also consistent with studies combining structural and functional alignment in the absence of a model or template of the underlying function [35, 36]. Such studies show that the extra warping driven by functional alignment leads to better predictions of functional responses in cross-validated data.
The anatomical atlas, described previously [25], was adapted into the first step of our method (Fig. 3C) and is equivalent to the prediction using the retinotopic prior alone that we present here (Fig. S2B). Consistent with previous results, we find that the prior alone produces reasonably good predictions for the retinotopic maps of most subjects (Fig. 6). Additionally, maps predicted using the prior alone contain many of the advantages described here such as topological smoothness, complete coverage of the visual field, and prediction of peripheral retinotopy. However the idiosyncrasies of individual subjects’ retinotopic maps are often not well predicted by the retinotopic prior alone (Figs. 2, 4A, and 4B). By combining both the retinotopic prior and a small amount of measured data we are able to produce higher-quality predictions that not only share these advantages but also improve the prediction accuracy of the maps beyond that of the measurement error (Fig. 6).
Method Availability and Usage
The method described in this paper has been made publicly available in a number of ways in order to enable easy use by other researchers. The Bayesian inference method itself is implemented as part of a free software library called Neuropythy; we have publicly archived the version of this library used in the preparation of this manuscript at DOI:10.5281/zenodo.1312983. Additionally, we have created a universally executable Docker image and have publicly archived it at DOI:10.5281/zenodo.1313859. Detailed instructions for the use of both the library and the universally executable Docker image are available at the Open Science Foundation repository associated with this paper (https://osf.io/knb5g/). In brief, the method may be run with only a few inputs: a FreeSurfer directory for the subject (which can be generated from an anatomical image), and a set of files containing the measured retinotopic parameters for the subject’s cortical surface. The outputs produced are a similar set of files describing the inferred retinotopic parameters of the subject’s cortical surface. Runtime on a contemporary desktop computer is less than an hour per subject.
Materials and Methods
Scientific Transparency
All source code and notebooks as well as all anonymized data employed in this Methods section and the preparation of this manuscript have been made publicly available at the Open Science Foundation: https://osf.io/knb5g/
Subjects
This study was approved by the New York University Institutional Review Board, and all subjects provided written consent. A total of eight subjects (4 female, mean age 31, range 26-46) participated in the experiment. All scan protocols are described below.
Magnetic Resonance Imaging
All MRI data were collected at the New York University Center for Brain Imaging using a 3T Siemens Prisma scanner. Data were acquired with a 64-channel phased array receive coil. High resolution whole-brain anatomical T1-weighted images (1 mm3 isotropic voxels) were acquired from each subject for registration and segmentation using a 3D rapid gradient echo sequence (MPRAGE). BOLD fMRI data were collected using a T2*-sensitive echo planar imaging pulse sequence (1s TR; 30ms echo time; 75° flip angle; 2.0 × 2.0 × 2.0 mm3 isotropic voxels, multiband acceleration 6). Two additional scans were collected with reversed phase-encoded blips, resulting in spatial distortions in opposite directions. These scans were used to estimate and correct for spatial distortions in the EPI runs using a method similar to [49] as implemented in FSL [50].
Anatomical images were processed using the FreeSurfer image analysis suite, which is freely available online (http://surfer.nmr.mgh.harvard.edu/) [33, 34, 51, 52]. Subject brains were inflated and aligned to FreeSurfer’s anatomical fsaverage atlas.
Stimulus Protocols
Each subject participated in 12 retinotopic mapping scans using the same stimulus emptied in the Human Connectome Proect [39]. Briefly, bar apertures on a uniform gray background swept gradually across the visual field at four evenly-spaced orientations while the subject maintained fixation. Bar apertures contained a grayscale pink noise background with randomly placed objects, faces, words, and scenes. All stimuli were presented within a circular aperture extending to 12.4° of eccentricity. The bars were a constant width (1.5°) at all eccentricities. Subjects performed a task in which they were required to attend to the fixation dot and indicate when its color changed.
The 12 scans were split into several subsets and analyzed as independent datasets. Six of the scans (two of each bar width) were allocated to the subject’s validation dataset, while the remaining three scans were used for 21 training datasets: 6 datasets with 1 scan each, 5 datasets with 2 scans each, 4 datasets with 3 scans each, 3 datasets with 4 scans each, 2 datasets with 5 scans each, and 1 dataset with all 6 training scans. Additionally, all 12 scans were included in a full dataset which was used for all analyses not related to the accuracy of the Bayesian inference method.
Additionally, one previously published retinotopy dataset for with a wide field of view (48° of eccentricity) was re-analyzed [1] (their Fig. 3). The subject for this dataset was also included in the newly acquired data, S1201. The wide-field-of-view dataset was used as a further validation set for models derived from the newly acquired data for S1201, as it enabled us to test the accuracy of model predictions in the far periphery from models derived from data with limited eccentricity.
FMRI processing
Spatial distortions due to inhomogeneities in the magnetic field were corrected using in-house software from NYU’s Center for Brain Imaging (http://cbi.nyu.edu/software). The data were then motion-corrected by co-registering all volumes of all scans to the first volume of the first scan in the session. The fMRI slices were co-registered to the whole brain T1-weighted anatomy, and the time series resampled via trilinear interpolation to the 1 mm3 voxels within the cortical ribbon (gray matter). Finally, the time series were averaged for each voxel across all scans with the same stimulus within a given dataset.
PRF solutions
Retinotopic maps were produced by solving population receptive field (pRF) models for each voxel using Vistasoft, as described previously [12]. pRF models were solved using a 2-stage coarse-to-fine approach on the time series in the 1 mm3 gray matter voxels. The first stage of the model fit was a grid fit, solved on time series that were temporally decimated (2×), spatially blurred on the cortical surface using a discrete heat kernel (approximately equal to a Gaussian kernel of 5 mm width at half height), and subsampled by a factor of 2. The decimation and blurring helps to find an approximate solution that is robust to local minima. The parameters obtained from the grid fit were interpolated to all gray matter voxels and used as seeds for the subsequent nonlinear optimization. Finally, the pRF parameters were projected from the volume to the cortical surface vertices for white, mid-gray, and pial surfaces using nearest-neighbor interpolation; values were then averaged across the three layers using a weighted-mean in which the fraction of BOLD signal variance explained by the pRF model was used as a weight. All vertices with a pRF variance explained fraction less than 0.1 were ignored.
Models of Retinotopy
In previous work [25], group-average retinotopic maps were registered to a model of V1, V2, and V3 retinotopy described by Schira et al. [32]. This “banded double-sech” model accurately describes the anisotropic magnification of the visual field on the cortical surface, particularly near the fovea. However, we have observed, particularly in individual data, that retinotopic data from outside the V1-V3 region described by the Schira model has a large impact on the quality of the inferred map. Accordingly, we constructed a new model that includes 9 additional visually active regions: hV4, VO1, VO2, V3a, V3b, LO1, LO2, TO1, and TO2. This model employed a new method of constructing 2D models of retinotopy that was specifically designed to accommodate the distortions caused by anatomical alignment, inflation, and flattening of the cortical surface. The new method is much simpler to extend to many more visual field maps, as it does not rely on an analytic description of the flattened (2D) retinotopic maps, which are only available for V1-V3. Rather, it requires as input a cortical map image on which estimates of the visual area boundaries have been drawn manually and labeled as either a foveal boundary, a peripheral boundary, or an upper-, horizontal- or lower-visual-field meridian (i.e., the visual-field extrema are labeled). These boundaries need only be drawn once for a single group-average retinotopic map. The boundaries are projected onto the cortical surface, and the retinotopic coordinates for each vertex on the surface are deduced via a minimization procedure. This minimization is motivated by two principles: (1) the retinotopic fields (polar angle and eccentricity) should be as orthogonal to each other as possible and (2) the retinotopic fields should be as smooth as possible. To this end, the minimization routine simultaneously maximizes both the smoothness of the retinotopic fields between vertices connected by edges in the mesh as well as the overall orthogonality between the polar angle field and the eccentricity field. The hand-drawn boundary values are held constant during the minimization. Equation 2 gives the function f that is minimized, where θ and ρ represent vectors of the polar angle and eccentricity values, respectively; E represents the set of edges between vertices in the mesh; and x represents the matrix of vertex coordinates (i.e., xu represents the coordinate vector of the vertex u). During minimization the values of the polar angle and eccentricity vectors are scaled such that both fields ranged from −1 to 1 (e.g., polar angle boundary values of 0° and 180° were assigned values of −1 and 1 respectively) so that the fields could be evaluated easily for orthogonality; after the minimization, the polar angle was linearly rescaled back to the range 0°-180° while eccentricity was rescaled so as to have an exponential distribution that best fit the group-average. We employed this model generation routine using boundaries drawn over the group-average data (see Anatomically-defined Atlas of Retinotopy: Group Average, below) as well as the Wang et al. [23] atlas and the Hinds et al. [53] V1 boundary as rough guidelines.
A full description of the model, including how it can be projected onto an fsaverage spherical surface or an individual subject’s fsaverage-aligned spherical surface, is provided in the open source Neuropythy library (https://github.com/noahbenson/neuropythy).
Anatomically-defined Atlas of Retinotopy
Construction of the anatomically-defined atlas of retinotopy is summarized in Figure S2. Previous work employed a mass-spring-damper system combined with a nonlinear gradient-descent minimization in order to register group-average retinotopic data, averaged on FreeSurfer’s fsaverage_sym hemisphere [54], with a model of V1, V2, and V3 retinotopy [32]. In this paper, we modify this technique slightly to bring it more in line with previous established methods such as those used by FreeSurfer for surface-based anatomical registration [33, 34]. In brief, retinotopy is measured in a group of subjects via fMRI; the subjects’ cortical meshes are aligned to the fsaverage surface via FreeSurfer’s surface registration; the retinotpoic coordinates are then averaged across subjects at each vertex on a single atlas of the cortical surface; a 2D atlas of retinotopy is then placed on this cortical surface; and finally, the cortical surface is warped to match the retinotopic atlas as best as possible given constraints on the warping. Each of these steps is described in more detail below.
Group-average Data
Group-average retinotopic maps (Fig. S2B) were obtained from 181 subjects whose data were published and made freely available as part of the Human Connectome Project [39]. The resulting group-average retinotopic maps are shown in Figure S2B.
Cortical Map Projection
The cortical surfaces of the fsaverage left and right hemispheres, on which the group-average data were constructed, were inflated both to a smooth hemisphere (FreeSurfer’s “inflated” surface) as well as to a sphere (FreeSurfer’s “sphere” surface); the vertices on the spherical surfaces were then flattened to 2D maps using an orthographic map projection. Precise parameters of this projection and the source code used to generate it are included in the supplemental materials. We refer to the 2D vertex coordinates in this resulting map as the “initial vertex coordinates” because they precede the warping of the vertex coordinates that occurs during registration.
Registration
The initial vertex coordinates of the map projections described above were warped in order to bring the polar angle and eccentricity measurements of the vertices into alignment with the 2D model’s predictions of retinotopy while maintaining topological constraints: i.e., preventing triangles in the triangle mesh representing the 2D cortical map from inverting and penalizing excessive stretching or compression of the map. This process was achieved by minimizing a potential function defined in terms of the edges of the triangle mesh, the angles of the triangle mesh, and the positions of the vertices with polar angle and eccentricity measurements above the weight threshold (see Group-Average Data, above). Equation 3 gives this potential function, F(x), which is further broken down into four components detailed in Table 2. Fundamentally, the potential function F is a sum of two kinds of penalties: penalties for deviations from the reference mesh and penalties for mismatches between the vertices with retinotopic coordinates and their positions in the retinotopic model. In the case of the former, the reference mesh is given by the parameters x0, E, Θ, P, and Φ, and the potential of the deviations are defined by Fe, Fϑ, and Fp. The latter is described by Fφ. In these functions, x represents the n × 2 matrix of the 2D-coordinates of each vertex while x0 represents the same coordinates in the reference mesh; E represents the set of undirected edges (represented as (u, v) pairs such that (u, v) and (v, u) are not both in E) in the reference mesh; Θ represents the set of angle triples (a, b, c) such that the angle is between edge (a, b) and edge (a, c); P is the set of vertices that lie on the perimeter of the 2D map projection; rx(u,v) is the Euclidean distance between vectors (x)u and (x)v; and αx(a,b,c) is the counter-clockwise angle between vectors ((x)b - (x)a) and ((x)c - (x)a); Φ represents the set of anchors defined by the retinotopic model in which each anchor is represented by a tuple (u,y,a,w) where w is the weight of the anchor, u is the vertex drawn to the anchor, σ is the standard deviation of the anchor’s Gaussian potential well, and y is the 2D point to which the anchor is attached; the constants q0 and q1 are the minimum and maximum allowable edge lengths, respectively.
The term of the potential function devoted to the retinotopic model is given in Fφ (Eq. 3; Tab. 2). This potential term is a set of inverted-Gaussian potential wells called anchors. Each anchor represents the attraction of a single vertex u, with measured polar angle θ, eccentricity ρ, and weight w, to a 2D point y, at which the retinotopic model predicts a polar angle value of θ and an eccentricity value of ρ. Note that each visual area represents every point (θ, ρ) in the visual field, there are multiple anchors per vertex with retinotopic data. In fact, the retinotopic model used in this paper defines 9 maps in addition to the V1-V3 maps (see Model of Retinotopy, above), bringing the total number of anchors per retinotopic vertex to 12. The additional areas are intended partly to prevent vertices immediately outside of V1-V3 from being drawn incorrectly into the V1-V3 section of the model and are not analyzed in detail in this paper. Each anchor additionally defines a parameter σ; this value is the width (standard deviation) of the anchor’s Gaussian potential well; σ is defined as the minimum distance from the given anchor to any other anchor to which u is also attracted; this value was given a maximum value of 20ε where ε is the mean edge-length in the projected map.
The potential function was minimized using a gradient descent algorithm sensitive to the singularities in the terms Ge, and Gϑ (Tab. 2); whenever the singularity is accidentally crossed, the minimizer backtracks and chooses a smaller step-size. This approach prevents the inversion (from counter-clockwise ordering to clockwise ordering) of any triangle in the mesh, as such an inversion would require the minimization trajectory to pass through a singularity at the point where α = 0 or α = π. The source code used to minimize the potential function as well as specifications of the gradients of each term is provided in the open-source library included with the Neuropythy and Neurotica libraries (https://github.com/noahbenson/nben).
Minimization was run for at least 2500 steps in which the step-size was constrained such that the displacement of each vertex in each step was at most 1/50th of the average edge-length in the map projection. A small amount of exponentially distributed random noise was added to the gradient at each step with the constraint that the gradient direction at each vertex be conserved; this noise did not affect the minimum obtained by the search but did speed up convergence significantly (see associated libraries for further details). Convergence was generally observed within 1000-2000 steps. The set of vertex coordinates that resulted from this minimization brings the retinotopic measurements associated with the vertices in V1, V2, and V3 referred to as the registered vertex coordinates.
Prediction
The registered vertex coordinates, once obtained, give the alignment of the subject’s cortical surface to the model of retinotopy; accordingly, a prediction of any vertex’s associated pRF and visual area label can be derived by comparing the the vertex’s registered coordinates with the model. Every vertex whose registered coordinates fall within the model’s V1 boundary, for example, is labeled as part of V1. Because only the vertex coordinates, and not the vertex identities, are changed during the registration process, there is no need to invert the registration: visual area label, polar angle, and eccentricity values assigned to each vertex apply as readily to the vertices whether they are visualized in the registered vertex coordinates or in the coordinates that define the subject’s white-matter surface, for example. The retinotopic map predictions for the group-average data is shown in Figure S2D (left column).
Because the group-average retinotopic data were used in the registration, the predicted map that results provides a reasonable estimate of any subject’s expected retinotopic map, as shown previously [24, 25], although the predicted map does not account for further individual differences in the structure to function relationship, as we show in this paper. Additionally, because the predicted map from the group-averaged data is defined on the fsaverage subject’s cortical surfaces, a retinotopic map prediction for any new subject, for whom retinotopic mapping measurements may not be available, can be easily obtained: one can use FreeSurfer to align the new subject’s cortical surface with the fsaverage subject’s surface (anatomical structure alignment) then to project the retinotopic maps from the fsaverage subject to the new subject based on the anatomical similarity between them. Because of this, we refer to this group-average retinotopic prediction as the anatomically-defined atlas of retinotopy. This atlas is used as the prior for the Bayesian model fit, described below. The atlas is similar but not identical to one presented previously [25].
Bayesian Retinotopic Maps
The anatomically-defined atlas of retinotopy, while providing a good prediction for most subjects’ individual retinotopic maps, nonetheless does not account for individual differences in the mapping between anatomical location and retinotopic coordinates. Accordingly, predicted retinotopic maps for individual subjects were refined starting from the anatomically-defined atlas of retinotopy using a similar method as was used to generate the atlas originally; this process is detailed in Figure 3C. For each subject, their cortical surface was aligned based on anatomical structure to the fsaverage subject’s cortical surface using FreeSurfer, then their retinotopic data were projected to a map using the identical map projection described above in the section on the anatomically-defined atlas of retinotopy. Note that, in this case, the anatomical alignment to the fsaverage subject serves to make the map projections as similar as possible between individual subjects and the anatomically-defined atlas of retinotopy. If we were not interested in incorporating information obtained from the anatomically-defined atlas of retinotopy (which represents a prior belief of retinotopic organization based on group-average data), this step would not be necessary.
The individual subject’s projected map is then arranged according to the registered vertex coordinates from the anatomically-defined atlas of retinotopy; this step reflects the prior belief that the group-average registration to the retinotopy model is generally accurate for an individual subject when that subject’s anatomical structure has been aligned to the fsaverage subject’s. Critically, none of the steps taken so far in processing the individual subject’s data relies on any measurements of retinotopy that might be associated with that subject. Rather, these steps have relied only on anatomical structure. If, for a subject, no retinotopic measurements are made, then there is no data with which to modify this prior belief; accordingly, the prediction of retinotopy for that subject would be identical to the prediction of retinotopy contained in the anatomically-defined atlas. In other words, without observation, the prior remains the prediction.
The next step registers the individually measured retinotopy data to the anatomically-defined atlas. Before registration, the individual subject’s data is resampled onto a uniform triangular mesh, and each vertex whose retinotopic measurements are above threshold are given a weight, w, based on the variance explained, ω, of its pRF model solution. The mesh is resampled to the same uniform triangle mesh used as the initial vertex coordinates in the registration of the anatomically-defined atlas of retinotopy in order to speed up registration. Triangles that are tightly pinched (i.e., triangles with internal angles near 0 or π) can drastically slow the registration progress by forcing the minimizer to frequently backtrack steps; resampling makes such behavior much less likely during the initial minimization. Aside from the weight, other parameters tracked by the potential field, including anchors parameters used by the function Fφ(x), are obtained identically as with the anatomically-defined atlas of retinotopy. These anchors inherit the weight of the vertex to which they apply, but are reduced when the field sign of the triangles adjacent to the vertex does not match the field sign of the visual area to which it is tied by the anchor or when the pRF size predicted by the model does not match that of the vertex’s measured pRF. Details regarding the weights on anchors are provided in the neuropythy library.
For each training dataset of each subject, minimization was run for 2500 steps using the same protocol that was used with the anatomically-defined atlas of retinotopy. Retinotopic map prediction, based on the positions of the registered vertex coordinates in the retinotopy model, were also computed identically to those in the anatomically-defined atlas. Identical minimization and prediction methods were run for each test dataset as well, but these results were not used to measure the accuracy or effectiveness of the prediction methods.
Cortical Magnification
Cortical magnification was calculated using both the observed retinotopic maps and the inferred maps that were produced by combining each subject’s full retinotopy dataset with the retinotopic prior. This combination of data should, in theory, produce the highest-quality retinotopic map predictions of which we are capable (see Results & Discussion). Cortical magnification was calculated by first projecting all vertices in a single visual area (such as V1) into the visual field based on their pRF centers. The cortical magnification of a particular polar angle and eccentricity is then the total white vertex surface-area (as calculated by FreeSurfer) of all pRF centers within a disk of some radius α, divided by the area of the disk (πα2). For an eccentricity ρ, we used a radius α = ρ / 3. on these maps in two ways.
Acknowledgements
The research was supported by NIH grants R00-EY022116 and R01-MH111417. The authors declare no competing financial interests.