Abstract
Reinforcement occurs when hybridization between closely related lineages produces low-fitness offspring, prompting selection for elevated reproductive isolation specifically in areas of sympatry. Both pre-mating and post-mating prezygotic behaviors have been shown to be the target of reinforcing selection, but it remains unclear whether remating behaviors experience reinforcement, although they can also influence offspring identity and limit formation of hybrids. Here we evaluated evidence for reinforcing selection on remating behaviors in D. pseudoobscura, by comparing remating traits in females from populations historically allopatric and sympatric with D. persimilis. We found that the propensity to remate was not higher in sympatric females, compared to allopatric females, regardless of whether the first mated male was heterospecific or conspecific. Moreover, remating behavior did not contribute to interspecific reproductive isolation among any population; that is, females showed no higher propensity to remate following a heterospecific first mating than following a conspecific first mating. Instead, we found that females are less likely to remate after initial matings with unfamiliar males, regardless of species identity. This is consistent with one scenario of postmating sexual conflict in which females are poorly defended against post-copulatory manipulation by males with whom they have not co-evolved. Our results are generally inconsistent with reinforcement on remating traits, and suggest that this behavior might be more strongly shaped by the consequences of local antagonistic male-female interactions than interactions with heterospecifics.
Significance Statement
It is becoming clear that prior knowledge can change what we see, but how is prior knowledge neurally instantiated and at what point during processing does it impact perception? We show that when observers know in advance the meaning of an ambiguous image, posterior alpha-band oscillations increase prior to target onset and visual-evoked potentials show rapid enhancement ~120 ms following the target. Results suggest that alpha is involved in bringing prior knowledge to bear on the interpretation of sensory stimuli, amplifying subsequent responses. These findings have implications for interpreting alpha activity and for predictive processing models of perception. Finally, our results provide a clear instance of knowledge affecting perception, a rejoinder to lingering doubts that perception is shaped by knowledge.
Introduction
Constructing meaning from noisy sensory input is crucial for visually guided behavior. Canonical hierarchical models of visual perception explain this ability as a strictly feed-forward process, whereby low-level sensory signals feed into higher-level systems underlying categorization according to prior knowledge (1–3). There is now considerable evidence, however, to suggest that prior knowledge about the world can feed back and impact relatively early stages of sensory analysis (4–14), though strong opposition persists (15). A dramatic demonstration of the influence of prior knowledge on perception occurs with two-tone Mooney images (16).
Objects rendered as Mooney images can appear completely unrecognizable until observers are provided with knowledge of the object’s identity, after which they become clearly recognizable, even days later (17). This is hypothesized to result from prior knowledge bearing on the interpretation of otherwise ambiguous sensory information (18).
Although the proposal that prior expectations play an important role in perceptual decision-making is largely accepted, the precise stage of visual processing at which bottom-up inputs interact with top-down expectations remains unclear. Some have argued that benefits of knowledge on recognition reflect later processes divorced entirely from perception (15). In contrast, recent fMRI experiments have observed modulation of activity to predicted stimuli in sensory regions, suggesting an early locus of top-down effects. For example, expecting to see a face modulates BOLD responses to face stimuli in the fusiform face area (19, 20). Multivariate analyses of fMRI responses have also shown that expectations about orientation (21) and the direction of motion (22) improve stimulus representations in early visual cortex. However, identifying the stages of visual processing (as opposed to just the brain regions) that are influenced by prior knowledge can be difficult from fMRI results alone. For example, the sluggish nature of the BOLD signal makes it difficult to distinguish between an effect of expectations on sensory-evoked signals and an effect on later feedback signals to the same sensory regions.
Electrophysiological recordings are better suited to address this issue due to precise temporal resolution and the ability to link certain responses to sensory-evoked activity, such as the visual P1 event-related potential (ERP) component (23).
In addition to uncertainty about when prior knowledge impacts visual processing, it is also unclear how prior knowledge itself is represented and then brought to bear on the interpretation of incoming sensory stimuli. One way that prior knowledge may influence perception is by biasing baseline activity in perceptual circuits, pushing the interpretation of sensory evidence towards that which is expected (24). Biasing of prestimulus activity according to expectations has been observed both in decision- and motor-related prefrontal, parietal, and subcortical regions (25–28) as well as in sensory regions selective for the expected stimulus (19, 29–31). With regard to the latter, alpha-band oscillations over posterior brain regions are proposed to play an important role in modulating prestimulus activity according to expectations. For example, it is well known that prior knowledge of the location of an upcoming stimulus changes preparatory alpha activity in corresponding regions of visual and parietal cortex (32–35). Likewise, expectations about when a stimulus will appear are reflected in prestimulus alpha dynamics (36–38). Recently, Mayer and colleagues (39) demonstrated that when the identity of a target letter could be predicted, prestimulus alpha power increased over left-lateralized posterior sensors, the magnitude of which predicted changes in sensory evoked responses to the target letter. These findings suggest that alpha-band dynamics may be involved in establishing top-down perceptual predictions in anticipation of perception.
To better understand both when and how prior knowledge influences perception, we first developed and normed a set of novel Mooney stimuli (Experiment 1) and then established effects of prior knowledge on recognition accuracy (Experiment 1), and on people’s ability to perform a more basic visual task: discriminating whether two pictures are the same, when presented simultaneously (Experiment 2) or sequentially (Experiments 3-4). Lastly, we recorded electroencephalography (EEG) during the task (Experiment 4). In Experiments 2-4, participants were informed of the meaning of a subset of the images, allowing us to experimentally manipulate prior knowledge, both between subjects (Experiments 2 and 3) and within subjects (Experiment 4). Verbal cues drastically enhanced recognition accuracy. For example, being told that an image contained a piece of furniture produced a 16-fold increase in participants’ accuracy in recognizing a desk.
In the same/different discrimination task, knowledge decreased response times and improved accuracy. Viewing targets that were made meaningful led to increased P1 amplitudes, suggesting an enhancement of early-stage visual processing. This effect was accompanied by an increase in alpha power during the cue-target interval, the magnitude of which predicted the magnitude of the target-evoked P1 change across subjects. Lastly, single-trial P1 amplitudes were predictive of behavior only when image meaning was trained. Combined, these findings suggest that prior knowledge impacts early stages of visual processing by biasing the state of prestimulus neural activity, reflected in the modulation of alpha-band oscillations.
Materials and Method
Experiment 1
Materials. We constructed 71 Mooney images by superimposing familiar images of easily nameable and common artefacts and animals onto patterned backgrounds. These superimposed images were then blurred (Gaussian Blur) and thresholded to a black-and-white bitmap. All images can be found at https://osf.io/stvgy/.
Procedure
Experiment 1A. Free Naming. We recruited 94 participants from Amazon Mechanical Turk. Each participant was randomly assigned to view one of 4 subsets of the 71 Mooney images, and to name at the basic-level what they saw in each image. Each image was seen by approximately 24 people. Naming accuracies for the 71 images (see below for details on how these were computed) ranged from 0% to 95%.
Experiment 1B. Basic Level Cues. From the 71 images used in Exp. 1A we selected the images with accuracy at or below 33% (29 images). We then presented these images to an additional 42 participants recruited from Amazon Mechanical Turk. Each participant was shown one of two subsets of the 29 images and asked to choose among 29 basic-level names (e.g., “trumpet”, “leopard”, “table”), which object they thought was present in the image (i.e., a 29-alternative forced choice). Each image received approximately 21 responses.
Experiment 1C. Superordinate Cues. Out of the 29 images used in Exp. 1B we selected 15 that had a clear superordinate label (see Fig. 1). Twenty additional participants recruited from Amazon Mechanical Turk were presented with each image along with its corresponding superordinate label and were asked to name, at the basic level, the object they saw in their picture by typing their response. For example, given the superordinate cue “musical instrument”, participants were expected to respond with “trumpet” given a Mooney image of a trumpet.
Experiment 2
Materials. From the set of 15 used in Exp. 1C, we chose the 10 that had the highest accuracy in the basic-level cue condition (Exp. 1B) and were most benefited by the cues (boot, cake, cheese, desk, guitar, leopard, socks, train, trumpet, turtle). The images subtended approximately 7°×7° of visual angle. Each category (e.g., guitar) was instantiated by four variants (see Fig. S1): two different image backgrounds and two different positions of the images. These additional images were introduced to test whether potential detection effects could be driven by low-level processing alone.
Participants. We recruited 35 college undergraduates to participate in exchange for course credit. Two were eliminated for low accuracy (less than 77%), resulting in 14 participants in the meaning trained condition (8 female), and 19 in the meaning untrained condition (11 female). All participants provided written informed consent. The University of Wisconsin-Madison Institutional Review Board approved this and all other studies reported here.
Familiarization Procedure. Participants were randomly assigned to a meaning trained or meaning untrained condition. The two conditions differed only in how participants were familiarized with the images. In the meaning trained condition participants first viewed each Mooney image accompanied by an instruction, e.g., “Please look for CAKE”, twice for each Mooney image (Trials 1-20). Participants then saw all the images again and were asked to type in what they saw in each image, guessing in the case that they could not see anything (Trials 21-30). Finally, participants were shown each image again, asked to type in the label once more, and asked to rate on a 1-5 scale how certain they were that the image portrayed the object they typed. In the meaning untrained condition, participants were familiarized with the images while performing a one-back task, being asked to press the spacebar anytime an image was repeated back-to-back. Repetitions occurred on 20-25% of the trials. In total, participants in the meaning-trained and untrained conditions saw each image 4 and 5 times respectively.
Same/Different Task. Following familiarization, participants were tested on their ability to visually discriminate pairs of Mooney images. Their task was to indicate whether the two images were physically identical or different in any way (Fig. 2A). Each trial began with a central fixation cross (500 ms), followed by the presentation of one of the Mooney images (the “cue”) approximately 8° of visual angle above, below, to the left or to the right of fixation. After 1500 ms the second image (the “target”) appeared in one of the remaining cardinal positions. The two images remained visible until the participant responded “same” or “different” using the keyboard (hand-response mapping was counterbalanced between participants). Accuracy feedback (a buzz or bleep) sounded following the response, followed by a randomly determined inter-trial interval (blank screen) between 250 and 450 ms. Image pairs were equally divided into three trial-types (Fig. 2C): (1) two identical images (same trials), (2) same object, but different location, (3) different objects at different locations. The backgrounds of the two images on a given trial were always the same, and both cue and target objects on a given trial were either trained or untrained. Participants completed 6 practice trials followed by 360 testing trials.
Behavioral Data Analysis. Accuracy was modeled using logistic mixed effects regression with random slopes for experiment block and trial type and random intercepts for subject and item category. RTs were modeled in the same way, but using linear mixed effects regression. RT analyses excluded responses longer than 5 s and those exceeding 3 SDs of the subject’s mean.
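The RT exclusion rule can be illustrated with a short sketch. This is a Python illustration only, not the authors' analysis code; the helper name `exclude_rts` is hypothetical, and two details the text leaves open are assumptions here: that the 5 s cutoff is applied before the mean and SD are computed, and that the 3 SD criterion is applied in both directions around the subject's mean.

```python
from statistics import mean, stdev


def exclude_rts(rts_ms, max_rt_ms=5000.0, n_sd=3.0):
    """Return the response times (ms) kept for one subject (hypothetical helper)."""
    # Assumed step 1: drop responses longer than the absolute 5 s cutoff.
    kept = [rt for rt in rts_ms if rt <= max_rt_ms]
    if len(kept) < 2:
        return kept  # too few trials to estimate a standard deviation
    # Assumed step 2: drop responses more than n_sd SDs from the subject's mean,
    # with the mean and SD computed on the trials that survived step 1.
    m, sd = mean(kept), stdev(kept)
    return [rt for rt in kept if abs(rt - m) <= n_sd * sd]
```

Applying the absolute cutoff first keeps a single extreme response from inflating the SD used for the second criterion; a different ordering would give slightly different exclusions.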
Experiment 3
Participants. 32 college undergraduates were recruited to participate in exchange for course credit. 16 were assigned to the meaning trained condition (13 female), and the other 16 to the meaning untrained condition (12 female).
Familiarization Procedure and Task. The familiarization procedure, task, and materials were identical to Experiment 2 except that the first and second images (approximately 6°×6° of visual angle) were presented briefly and sequentially at the point of fixation, in order to increase difficulty and better test for effects of meaning on task accuracy (see Fig. 2B). On each trial, the initial cue image was presented for 300 ms for the initial 6 practice trials and 150 ms for the 360 subsequent trials. The image was then replaced by a pattern mask for 167 ms followed by a 700 ms blank screen, followed by the second target image. Participants’ task, as before, was to indicate whether the cue and target images were identical. The pattern masks were black-and-white bitmaps consisting of randomly intermixed ovals and rectangles (https://osf.io/stvgy/).
Behavioral Data Analysis. Exclusion criteria and analysis were the same as in Experiment 2.
Experiment 4
Participants. Nineteen college undergraduates were recruited to participate in exchange for monetary compensation. Three were excluded from any analysis due to poor EEG recording quality, resulting in 16 participants (9 female) with usable data. All participants reported normal or corrected visual acuity and color vision and no history of neurological disorders.
Familiarization Procedure and Task. The familiarization procedure, task, and materials were nearly identical to those used for Experiment 3, but modified to accommodate a within-subject design. For each participant, 5 of the 10 images were assigned to the meaning trained condition and the remaining to the meaning untrained condition, counterbalanced between subjects. Participants first viewed the 5 Mooney images in the meaning condition together with their names (trials 1-10), with each image seen twice. Participants then viewed the same images again and were asked to type in what they saw in each image (trials 11-15). For trials 16-20 participants were again asked to enter labels for the images and prompted after each trial to indicate on a 1-5 scale how certain they were that the image portrayed the object they named. During trials 21-43 participants completed a 1-back task identical to that used in Experiments 2-3 as a way of becoming familiarized with the images assigned to the meaning untrained condition. Participants then completed 360 trials of the same/different task described in Experiment 3.
EEG Recording and Preprocessing. EEG was recorded from 60 Ag/AgCl electrodes with electrode positions conforming to the extended 10–20 system. Recordings were made using a forehead reference electrode and an Eximia 60-channel amplifier (Nexstim; Helsinki, Finland) with a sampling rate of 1450 Hz. Preprocessing and analysis were conducted in MATLAB (R2014b, Natick, MA) using custom scripts and the EEGLAB toolbox (40). Data were downsampled to 500 Hz offline and were divided into epochs spanning from 1500 ms before cue onset to 1500 ms after target onset. Epochs with activity exceeding ±75 µV at any electrode site were automatically discarded. Independent components responsible for vertical and horizontal eye artifacts were identified from an independent component analysis (using the runica algorithm implemented in EEGLAB) and subsequently removed. Visually identified channels with poor contact were spherically interpolated. After these preprocessing steps, we applied a Laplacian transform to the data using spherical splines (41). The Laplacian is a spatial filter (also known as scalp current density) that aids in topographical localization and converts the data into a reference-independent scheme, allowing researchers to more easily compare results across labs; the resulting units are in µV/cm². For recent discussion on the benefits of the surface Laplacian for scalp EEG see (42, 43).
Event-related Potential Analysis. Cleaned epochs were filtered between 0.5 and 25 Hz using a first-order Butterworth filter (MATLAB function butter.m). Data were time-locked to target onset, baselined using a common 200 ms prestimulus window subtraction, and sorted according to target meaning condition (trained or untrained). To quantify the effect of meaning on early visual responses, we focused on the amplitude of the visual P1 component. Following a prior experiment in our lab that found larger P1 amplitudes to images preceded by linguistic cues (44), we derived separate left and right regions of interest by averaging the signal from occipito-parietal electrodes PO3/4, P3/4, P7/8, P9/10, and O1/2. P1 amplitude was defined as the average of a 30 ms window, centered on the P1 peak as identified from the grand average ERP (see Fig. 4A). Lastly, in order to relate P1 amplitudes to behavior, we used a single-trial analysis. As in prior work from our lab (44), single-trial peaks were determined from each baselined electrode cluster (left and right regions of interest) by extracting the largest local voltage maximum between 70 and 150 ms post-stimulus (using the MATLAB function findpeaks). Any trial without a detectable local maximum (on average ~1%) was excluded from analysis.
Time-Frequency Analysis. Time-frequency decomposition was performed by convolving single trial data with a family of Morlet wavelets, spanning 3–50 Hz, in 1.6-Hz steps, with wavelet cycles increasing linearly between 3 and 10 cycles as a function of frequency. Power was extracted from the resulting complex time series by squaring the absolute value of the time series. To adjust for power-law scaling, time-frequency power was converted into percent signal change relative to a common condition pre-cue baseline of −400 to −100 ms. To identify time-frequency-electrode features of interest for later analysis in a data-driven way while avoiding circular inference, we first averaged together all data from all conditions and all electrodes. This revealed a prominent (~65% signal change from baseline) task-related increase in alpha-band power (8-14 Hz) during the 500 ms preceding target onset, with a clear posterior scalp distribution (see Fig. 5A). Based on this, we focused subsequent analysis on 8-14 Hz power across the prestimulus window −500 to 0 ms using the same left/right posterior electrode clusters as in the ERP analysis.
Statistical Analysis. The effect of meaning training on the time course of prestimulus alpha power (see Fig. 5B) was analyzed with a non-parametric permutation test, the result of which was cluster corrected to deal with multiple comparisons across time points (45). This was accomplished by randomly shuffling the association between condition labels (meaning trained or untrained) and alpha power 10,000 times. On every iteration, a t-statistic was computed for each time sample and the largest number of contiguous significant samples was saved, forming a distribution of t-statistics under the null hypothesis that meaning training had no effect, as well as a distribution of cluster sizes expected under the null. The t-statistic associated with the true data mapping was compared, at each time point, against this null distribution, and only clusters whose size exceeded the 95th percentile of the null cluster-size distribution were considered statistically significant. α was set at 0.05 for all comparisons. Prestimulus alpha power was additionally analyzed by means of a linear mixed-effects model using meaning condition (trained vs. untrained) and electrode cluster (left vs. right hemisphere) and their interaction to predict alpha power (here averaged across the prestimulus window −500 to 0 ms) with random slopes for meaning condition and hemisphere by subject. The same model was used to predict averaged P1 amplitudes. Where correlations are reported, we used Spearman rank coefficients to test for monotonic relationships while mitigating the influence of potential outliers.
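The single-trial peak extraction can be sketched as follows. The analysis itself used the MATLAB function findpeaks; the Python function below is a simplified stand-in with a hypothetical name (`largest_local_peak`), and it assumes a strict local maximum (a sample higher than both neighbors), so it does not reproduce findpeaks' handling of flat plateaus.

```python
def largest_local_peak(times_ms, voltages, t_min=70.0, t_max=150.0):
    """Return (time, amplitude) of the largest local maximum in the window, or None.

    times_ms and voltages are equal-length sequences for one baselined trial.
    None signals a trial with no detectable peak, which would be excluded.
    """
    best = None
    for i in range(1, len(voltages) - 1):
        if not (t_min <= times_ms[i] <= t_max):
            continue
        # Simplifying assumption: a peak is a sample strictly above both neighbors.
        if voltages[i] > voltages[i - 1] and voltages[i] > voltages[i + 1]:
            if best is None or voltages[i] > best[1]:
                best = (times_ms[i], voltages[i])
    return best
```

A trial whose voltage rises or falls monotonically across the 70 to 150 ms window has no local maximum and returns None, which corresponds to the roughly 1% of trials dropped from the single-trial analysis.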
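Two of the numerical steps above, the wavelet frequency/cycle grid and the baseline normalization, can be made concrete with a small sketch. It is written in Python for illustration and is not the authors' MATLAB code; the function names are hypothetical, and it assumes that the 10-cycle maximum is reached at the highest frequency actually in the grid (49.4 Hz with 1.6-Hz steps from 3 Hz) rather than at exactly 50 Hz.

```python
def wavelet_grid(f_min=3.0, f_max=50.0, step=1.6, cyc_min=3.0, cyc_max=10.0):
    """Return (frequencies, cycles) for the wavelet family (hypothetical helper)."""
    n = int((f_max - f_min) / step + 1e-9) + 1  # number of steps that fit in the range
    freqs = [f_min + k * step for k in range(n)]
    span = freqs[-1] - freqs[0]
    # Cycles increase linearly with frequency (assumed to hit cyc_max at freqs[-1]).
    cycles = [cyc_min + (cyc_max - cyc_min) * (f - freqs[0]) / span for f in freqs]
    return freqs, cycles


def percent_change(power, times_ms, base_start=-400.0, base_end=-100.0):
    """Convert a power time series to percent change from the mean baseline power."""
    base = [p for p, t in zip(power, times_ms) if base_start <= t <= base_end]
    b = sum(base) / len(base)
    return [100.0 * (p - b) / b for p in power]
```

Under these assumptions the grid contains 30 frequencies. Expressing power as percent change from baseline puts all frequencies on a common scale, which is the stated motivation for adjusting for power-law scaling.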
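The logic of the cluster-corrected permutation test can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: all function names are hypothetical; label shuffling is implemented as random sign-flipping of each subject's trained-minus-untrained difference (appropriate for a within-subject design); clusters are runs of contiguous samples with |t| above a fixed critical value, ignoring sign; and the critical value is passed in by the caller (2.131 is the two-tailed .05 critical t for 15 degrees of freedom, i.e., 16 subjects). The default of 1,000 permutations is chosen for speed, whereas the analysis above used 10,000.

```python
import math
import random


def t_stat(values):
    """One-sample t-statistic of paired differences against zero."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)
    if var == 0.0:  # degenerate case: all differences identical
        return 0.0 if m == 0.0 else math.copysign(math.inf, m)
    return m / math.sqrt(var / n)


def clusters(tvals, t_crit):
    """Return (start, end) index pairs of contiguous runs with |t| > t_crit."""
    runs, start = [], None
    for i, t in enumerate(tvals):
        if abs(t) > t_crit:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(tvals) - 1))
    return runs


def cluster_permutation_test(diff, t_crit, n_perm=1000, seed=0):
    """diff[s][t]: trained minus untrained power for subject s at time sample t.

    Returns the observed clusters whose size exceeds the 95th percentile of the
    null distribution of maximum cluster sizes.
    """
    rng = random.Random(seed)
    n_sub, n_time = len(diff), len(diff[0])

    def tvals_for(signs):
        return [t_stat([signs[s] * diff[s][t] for s in range(n_sub)])
                for t in range(n_time)]

    observed = clusters(tvals_for([1] * n_sub), t_crit)
    null_max = []
    for _ in range(n_perm):
        # Swapping condition labels within a subject flips the sign of the difference.
        signs = [rng.choice((1, -1)) for _ in range(n_sub)]
        runs = clusters(tvals_for(signs), t_crit)
        null_max.append(max((e - s + 1 for s, e in runs), default=0))
    null_max.sort()
    threshold = null_max[min(int(0.95 * n_perm), n_perm - 1)]
    return [(s, e) for s, e in observed if e - s + 1 > threshold]
```

Because only the largest cluster from each permutation enters the null distribution, the comparison controls the family-wise error rate across time points without assuming that neighboring samples are independent.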
Results
Experiment 1
Mean accuracy for the 15 images used in all versions of Experiment 1 is displayed in Fig. 1A. The benefit conferred by different cue-types relative to a free naming baseline is shown in Fig. 1B. Baseline recognition performance was 11%. Providing participants with a list of 29 possibilities increased recognition to 52%, a 4.7-fold increase (Exp. 1B), b = .41, 95% CI [.31, .51], t = 8.07, p < .0005. Providing participants with superordinate labels (e.g., “animal”, “musical instrument”) boosted performance to 40%, a nearly 4-fold increase compared to the 11% baseline, b = .29, 95% CI [.19, .39], t = 5.66, p < .0005. For example, knowing that there is a piece of furniture in the image produced a 16-fold increase in accuracy in recognizing it as a desk (an impressive result even allowing for guessing). The recognition advantage that verbal cues provide is especially striking given that they do not provide any spatial information relevant to the identity of the image.
Experiment 2
Results are shown in Fig. 3. Overall accuracy was high, 93.1% (93.5% on different trials and 92.2% on same trials), and was not significantly affected by meaning training (z<1). This is not surprising given that participants had unlimited time to inspect the two images. Participants exposed to the meaning of the images, however, had significantly shorter RTs than those who were not exposed to image meanings: RTmeaning=824 ms; RTno-meaning=1018 ms (b=192, 95% CI = [59, 327], t=2.82, p=.008; see Fig. 3). There was a marginal trial-type by meaning interaction (b=73, t=1.98, p=.06). Meaning was most beneficial in detecting that two images were exactly identical (b=260, t=2.77, p=.009). There remained a significant benefit of meaning in detecting differences between images with the same object in a different location (b=203, t=2.63, p=.01) and a smaller but still reliable benefit when two images had different objects and object locations (b=117, t=2.33, p=.03).
Experiment 3
The brief, masked presentation of the first image had an expected detrimental effect on accuracy, which was now 86.9% (89.9% on different trials and 81.1% on same trials). Exposing participants to the image meanings significantly improved accuracy: Mmeaning=90.9%; Mno-meaning=82.9% (b=.67, 95% CI = [.22, 1.12], z=2.93, p=.003; Fig. 3). The meaning advantage interacted significantly with trial type (b=.30, 95% CI = [.08, .52], z=2.65, p=.008). The advantage of being exposed to meaning was again largest for the identical-image trials (b=1.10, z=4.25, p<.0001). It was slightly smaller when the two images showed the same object in different locations (b=.53, z=2.13, p=.03), and when the two images showed different objects in different locations (b=.67, z=1.76, p=.08).
Experiment 4
Behavior. Overall accuracy was 89.0% (92.8% on different trials and 81.3% on same trials). Participants were marginally more accurate when judging images previously rendered meaningful compared to images whose meaning was untrained (b=.22, 95% CI = [−.02, .46], z=1.82, p=.07; Fig. 4A). The meaning-by-trial-type interaction was not significant. Participants became more accurate over time for both meaning trained and meaning untrained images (b=.34, z=4.47, p<.0001). The meaning-by-block interactions were not significant, t<1. Overall RT was 641 ms, and was marginally shorter when discriminating images that were previously rendered meaningful (b=-9.4, 95% CI=[−19.8, 1.0], t=1.77, p=.08). The meaning-by-trial-type and meaning-by-block interactions for RTs were not significant, t<1. We can combine accuracy and RTs into a single by-subject inverse efficiency score (46) by dividing each subject’s meaningful and meaningless trial RTs by their respective accuracies. Efficiency was significantly better on meaningful trials (M=734) than on meaningless trials (M=756) (b=22.1, t=2.73, p=.02).
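The inverse efficiency score is simply mean RT divided by proportion correct, computed per subject and condition. A minimal Python sketch follows; the function name and the example values in the usage note are hypothetical and are not taken from the data.

```python
def inverse_efficiency(mean_rt_ms, proportion_correct):
    """Inverse efficiency score: mean RT (ms) divided by proportion correct.

    Lower scores indicate more efficient performance.
    """
    if not 0.0 < proportion_correct <= 1.0:
        raise ValueError("proportion_correct must be in (0, 1]")
    return mean_rt_ms / proportion_correct
```

For a hypothetical subject with a mean RT of 640 ms and 80% accuracy, the score is 640 / 0.8 = 800 ms; the score equals the raw RT only when accuracy is perfect, so slower or less accurate performance both raise it.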
Electrophysiology. As shown in Fig. 4B, trial-averaged P1 amplitude was significantly larger when viewing targets previously made meaningful (b=-1.7, t=-2.16, p=.037). Although there was no significant interaction with hemisphere, follow-up t-tests revealed P1 amplitude modulation by meaning at the left hemisphere electrode cluster (t(1,15)=2.59, p=.020), but not at right (t(1,15)=.35, p=.725). Analysis of the time course of prestimulus alpha power revealed a temporal cluster of significantly greater power on meaning-trained trials from approximately -480 to -250 ms prior to target onset. Like the P1 effect, this difference was observed over left occipito-parietal sensors, but not right (see Fig. 5B). The linear mixed-effects model of alpha power (averaged over the 500 ms prior to target onset) revealed a significant effect of meaning (b=-9.85, t=-2.3, p=.03), indicating greater prestimulus alpha power on meaning trained trials, and a significant interaction between hemisphere and meaning (b=8.31, t=2.75, p=.014). Paired t-tests revealed that meaning affected prestimulus alpha power in the left (t(1,15)=2.21, p=.043), but not right (t(1,15)=0.35, p=.729) hemisphere.
We next assessed the relationship between the meaning effect on prestimulus alpha power and the meaning effect on P1 amplitudes across participants by correlating alpha modulations (averaged over the prestimulus window) with P1 modulations. This analysis revealed a significant positive correlation (rho = 0.52, p = .037) indicating that individuals who showed a greater increase in prestimulus alpha by meaning training also had a larger magnitude effect of meaning on P1 amplitudes (see Fig. 6). This relationship was not significant over right hemisphere electrodes (rho = −0.21), and the two correlations were significantly different (p=.042), suggesting that these interactions may be specific to the left hemisphere. Together, these results demonstrate that prior knowledge of the meaning of an ambiguous stimulus increases preparatory alpha power and enhances early visual responses, and they suggest that these two processes are related. The general finding that effects of meaning are stronger over the left hemisphere than the right may reflect the linguistic source of the meaning (47): participants, after all, were verbally instructed as to the meaning of the images, or relatedly, the more categorical representations induced by language (48–50).
Finally, we used linear mixed effects models to relate the per-trial P1 peak amplitudes to the latency of the responses, which occurred about 550 ms later (44). This trial-based analysis confirmed a main effect of meaningfulness on P1 amplitudes. Amplitudes were significantly higher on meaningful trials (M=64.42 µV) than meaningless trials (M=63.70 µV)*, b=6.22, 95% CI = [1.22, 11.22], t=2.44, p=.01, independently confirming the trial-averaged P1 effect (see Fig. 4B). There was no overall relationship between the P1 peak amplitude and response latency, but there was a significant interaction with meaningfulness, b=−0.008, 95% CI = [−.014, −.001], t=2.25, p=.02: On meaningful trials, higher P1 peak amplitudes were associated with marginally faster latencies, b=-0.005, 95% CI = [−.01, .0008], t=1.71, p=.09. On meaningless trials, the P1 peak amplitude did not at all predict response latencies, b=.002, t=.63, 95% CI = [−.004, .008], p>.5.
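The across-participant correlation uses Spearman's rank coefficient, which is the Pearson correlation computed on ranks. The Python sketch below illustrates the coefficient only (it does not compute a p-value or the test comparing the two hemispheres' correlations); the function names are hypothetical, and ties are assumed to receive the average of the ranks they span.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranked data."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, a single participant with an extreme alpha or P1 modulation cannot dominate the coefficient, which is the reason given in the Statistical Analysis section for preferring it.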
General Discussion
How does object knowledge impact object perception? Prior knowledge of the meaning of a visual stimulus could impact visual judgments at relatively late stages of processing, once lower level information reaches putatively higher-level conceptual/semantic representations (15, 51, 52). Alternatively, prior knowledge may feed back to modulate low levels of perceptual processing, as suggested by predictive coding accounts (14, 24, 53). To investigate how prior knowledge of the identity of objects impacted perception, we designed a novel set of difficult-to-recognize Mooney-style images (16). As is anecdotally well known but rarely demonstrated, providing verbal cues, either multiple basic-level alternatives or superordinate hints (e.g., furniture, musical instrument), dramatically improved people’s ability to recognize objects in the image (Fig. 1). We then examined whether ascribing meaning to the ambiguous images improved not just people’s ability to recognize the denoted object, but also their ability to perform a basic perceptual task: image discrimination. Indeed, ascribing meaning to the images through verbal cues (54) improved people’s ability to determine whether two simultaneously or sequentially presented images were the same or not (Fig. 3 and 4). The behavioral advantage might still be thought to reflect an effect of meaningfulness on some relatively late process were it not for the electrophysiological results showing that ascribing meaning led to an increase in the amplitude of P1 responses to the target (Fig. 4B) (cf., 55). This was accompanied by an increase in alpha amplitude during the cue-target interval when the cue was meaningful (Fig. 5). The effects of meaning training on pre-target alpha power and target-evoked P1 amplitude were positively correlated across participants, such that individuals who showed larger increases in pre-target alpha power as a result of meaning training also showed larger increases in P1 amplitude (Fig. 6).
Prior knowledge impacts early stages of perceptual processing. The P1 ERP component is associated with relatively early regions in the visual hierarchy (most likely ventral peristriate regions within Brodmann’s Area 18 (56–59)) but it has been shown to be sensitive to top-down manipulations such as spatial cueing (23, 60), object based attention (61), object recognition (62, 63), and recently, trial-by-trial linguistic cuing (49). Our finding that both single trial and trial-averaged P1 amplitudes were increased following meaning training is thus most parsimoniously explained as prior knowledge having an early locus in affecting perception. This result is consistent with prior fMRI findings implicating sectors of early visual cortex in the recognition of Mooney images (64, 65) but extends these results by demonstrating that the timing of Mooney recognition is consistent with the modulation of early, feedforward visual processing. Our findings are also in line with two recent magnetoencephalography (MEG) studies reporting early effects of prior experience on subjective awareness ratings (39, 66). In those studies, however, prior experience is difficult to disentangle from perceptual repetition. For example, Aru et al. (66) compared MEG responses to images that had previously been studied against images that were completely novel, leaving open mere exposure as a potential source of differences. In our task, by contrast, participants were familiarized with both meaning trained and meaning untrained images but only the meaning of the Mooney image was revealed in the meaning training condition, thereby isolating effects of recognition. One possible alternative by which meaning training may have had its effect is through spatial attention. For example, it is conceivable that on learning that a given image has a boot on the left side, participants subsequently were more effective in attending to the more informative side of the image.
If true, such an explanation would not detract from the behavioral benefit we observed, but would mean that the effects of knowledge were limited to spatial attentional gain. Subsequent analyses ruled out this possibility (Figs. S2-S3). It is noteworthy that, like the present results, the two abovementioned MEG studies as well as related work from our lab employing linguistic cues (44) have all found early effects over left-lateralized occipito-parietal sensors, perhaps reflecting some lateralization of semantic or linguistic processing related to the nature of the prior knowledge manipulated in these experiments.
Prestimulus alpha-band oscillations as carriers of top-down perceptual expectations. Mounting neurophysiological evidence has linked low-frequency oscillations in the alpha and beta bands to top-down processing (67–70). Recent work has demonstrated that perceptual expectations modulate alpha-band activity prior to the onset of a target stimulus, plausibly biasing baseline activity towards the interpretation of the expected stimulus (28, 39). We provide further support for this hypothesis by showing that posterior alpha power increases when participants have prior knowledge of the meaning of the cue image, which was to be used as a comparison template for the subsequent target. Further, pre-target alpha modulation was found to predict the effect of prior knowledge on target-evoked P1 responses, suggesting that representations from prior knowledge activated by the cue interacted with target processing. Notably, the positive direction of this effect (increased prestimulus alpha power predicted larger P1 amplitudes; Fig. 6) directly contrasts with previous findings of a negative relationship between these variables (71–73), which has been interpreted as reflecting the inhibitory nature of alpha rhythms (74). Indeed, our observation is at odds with the notion of alpha as a purely inhibitory or “idling” rhythm. We suggest that, in our task, increased prestimulus alpha-band power may reflect the pre-activation of neurons representing prior knowledge about object identity, thereby facilitating subsequent perceptual same/different judgments. This is consistent with the finding that evoked gamma and multiunit responses in macaque inferotemporal cortex are positively correlated with prestimulus alpha power (75), suggesting that the alpha modulation we observed may have its origin in regions where alpha is not playing an inhibitory role.
Implications for predictive processing models. Although our results support a general tenet of predictive processing accounts (5, 9, 24), namely that predictions formed through prior knowledge can influence early sensory representations, they also depart in an important way from certain proposals made by predictive coding theorists (5, 76, 77). With respect to the neural implementation of predictive coding, it is suggested that feedforward responses reflect the difference between the predicted information and the actual input. Predicted inputs should therefore result in a reduced feedforward response. Experimental evidence for this proposal, however, is controversial. Several fMRI experiments have observed reduced visual cortical responses to expected stimuli (78–80), whereas visual neurophysiology studies describe most feedback connections as excitatory input onto excitatory neurons in lower-level regions (81–83), which may underlie the reports of enhanced fMRI and electrophysiological responses to expected stimuli (31, 39, 84). A recent behavioral experiment designed to tease apart these alternatives found that predictive feedback increased perceived contrast, which is known to be monotonically related to activity in primary visual cortex, suggesting that prediction enhances sensory responses (85). Our finding that prior knowledge increased P1 amplitude also supports the notion that feedback processes enhance early evoked responses, although teasing apart the scenarios under which responses are enhanced or reduced by predictions remains an important challenge for future research.
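The two alternatives contrasted above can be made concrete with a toy sketch. It is not a model from this study or from the cited work. It assumes a subtractive prediction-error scheme for the classic predictive coding proposal and a multiplicative gain scheme for excitatory feedback; the function names, the feature vectors, and the gain value of 0.5 are all hypothetical.

```python
def prediction_error_response(sensory_input, prediction):
    """Subtractive scheme: feedforward units signal input minus prediction."""
    return [s - p for s, p in zip(sensory_input, prediction)]


def gain_enhanced_response(sensory_input, prediction, gain=0.5):
    """Excitatory-feedback scheme: predicted features are multiplicatively boosted.

    `gain` is an arbitrary illustrative constant, not an estimated parameter.
    """
    return [s * (1 + gain * p) for s, p in zip(sensory_input, prediction)]


def total_response(response):
    """Summed absolute activity, a crude stand-in for an evoked amplitude."""
    return sum(abs(v) for v in response)


stimulus = [1.0, 0.0, 1.0]        # hypothetical feature activations
accurate_prediction = [1.0, 0.0, 1.0]
no_prediction = [0.0, 0.0, 0.0]

# Subtractive scheme: an accurately predicted stimulus evokes LESS activity.
error_predicted = total_response(
    prediction_error_response(stimulus, accurate_prediction))
error_unpredicted = total_response(
    prediction_error_response(stimulus, no_prediction))

# Gain scheme: an accurately predicted stimulus evokes MORE activity.
gain_predicted = total_response(
    gain_enhanced_response(stimulus, accurate_prediction))
gain_unpredicted = total_response(
    gain_enhanced_response(stimulus, no_prediction))

print("subtractive:", error_predicted, "vs", error_unpredicted)
print("gain:", gain_predicted, "vs", gain_unpredicted)
```

Under these assumptions the two schemes make opposite predictions about how an expected stimulus changes the evoked response. An increase in P1 amplitude with prior knowledge is the pattern the gain scheme produces.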
FOOTNOTES
*These values are quite different from the peak amplitudes in the waveform traces in Fig. 4B because the grand means reflect the average of peaks occurring at different latencies on different trials and so the amplitudes are lower.