Abstract
Neural oscillations adjust their phase towards the predicted onset of rhythmic stimulation to optimize the processing of relevant information. Whether such phase alignments can be observed in non-rhythmic contexts, however, remains unclear. Here, we recorded the magnetoencephalogram while healthy participants were engaged in a temporal prediction task judging the visual or crossmodal (tactile) reappearance of a uniformly moving visual stimulus after it disappeared behind an occluder. The temporal prediction conditions were contrasted with a working memory control condition to dissociate phase adjustments of endogenous neural oscillations from stimulus-driven activity. During temporal predictions, we observed stronger delta band inter-trial phase consistency (ITPC) in a network of sensory, parietal and frontal brain areas. Delta ITPC further correlated with individual prediction performance in parts of the cerebellum and in visual cortex. Our results provide evidence that phase alignments of low-frequency neural oscillations underlie temporal predictions in non-rhythmic unimodal and crossmodal contexts.
Introduction
In recent years, increasing evidence has been gathered that the processing of information in the brain occurs in a rhythmic, oscillatory fashion. Extracellular local field potentials as well as the macroscopically observable magneto- and electroencephalogram (M/EEG) reflect alternating transmembrane current flows from large ensembles of synchronized neurons, oscillating with frequencies ranging from very slow (<0.1 Hz) to high gamma frequencies (>200 Hz). It has been proposed that these neural oscillations reflect alternating states of higher or lower neural excitability which can modulate the efficiency with which coupled neurons engage in mutual interactions (Buzsáki, 2006; Fries, 2005). As a result, neural communication and information processing has been shown to occur in a phase-dependent manner (Engel et al., 2001, 2013; Fries, 2005), reflected for example by fluctuations in perception thresholds that correlate with the phase of ongoing oscillations (VanRullen, 2016).
Since they represent fluctuations in neural excitability, oscillations have also been linked to temporal predictions of upcoming relevant information (Arnal and Giraud, 2012; Engel et al., 2001; Rimmele et al., 2018). Environmental stimuli often entail temporal regularities that make aspects of the stimulation, such as the temporal onsets of upcoming changes, highly predictable. Studies have shown that animals can utilize these predictive aspects to optimize behavior, leading to faster reaction times (Gould et al., 2011; Lakatos et al., 2008; Rohenkohl and Nobre, 2011; Stefanics et al., 2010) or enhanced stimulus processing (Cravo et al., 2013; Wilsch et al., 2015). By means of top-down induced phase resets of neural oscillations, phases of high excitability might be adjusted towards the expected onset of relevant upcoming information in order to optimize behavior. When presented with a 2 Hz isochronous rhythm, for instance, the phase of oscillations within the same frequency range (and their harmonics) might align to the stimulation rhythm in such a way that phases of high excitability optimally coincide with each recurring stimulus onset. In other words, the phase of the aligned oscillation codes for the predicted time point of each upcoming stimulus onset in the rhythm. It has been suggested that such phase alignments could form the basis of neural mechanisms that underlie temporal predictions (Arnal and Giraud, 2012; Rimmele et al., 2018; Schroeder and Lakatos, 2009).
Because many auditory stimuli such as speech or music are rhythmic and therefore temporally highly predictable, many studies, particularly in the auditory domain, have gathered evidence that oscillations reset and thereby adjust their phase towards rhythmic stimuli of various frequencies (Doelling and Poeppel, 2015; Giraud and Poeppel, 2012). In the visual domain, too, studies showed that neural oscillations align to rhythmic visual input (Besle et al., 2011; Cravo et al., 2013; Gomez-Ramirez et al., 2011; Herrmann, 2001; Lakatos et al., 2008; Saleh et al., 2010). However, other studies investigating temporal predictions in the visual domain observed an exclusive involvement of oscillations in the alpha range (8–12 Hz) rather than phase alignments of oscillations matching the low-frequency temporal structure of the stimulation (Rohenkohl and Nobre, 2011; Samaha et al., 2015), leaving the involvement of interval-matching phase alignments in visual predictions unsettled.
Moreover, despite their ecological relevance, using rhythms to investigate the involvement of oscillations in temporal predictions entails methodological and conceptual challenges. Rhythms have the obvious methodological advantage that the temporal structure of the stimulation, and therefore the frequencies of the oscillations that should align to the rhythmic stimulation, are well-defined. However, rhythmic input also leads to a continuous stream of regularly bottom-up evoked potentials, which are, at the very least, difficult to distinguish from top-down phase-adjusted endogenous neural oscillations within the same frequency range (Doelling et al., 2019; Zoefel et al., 2018). In addition, relying only on rhythmic stimulation forgoes the opportunity to link phase adjustments to a more general neural mechanism that predicts the temporal structure of any external input. If phase adjustments form the basis of tracking the temporal regularities of any relevant information, neural oscillations should also align to predictable temporal regularities that are inferred from input that does not itself comprise rhythmic components, such as, for instance, monotonic motion. Nevertheless, the vast majority of studies investigating phase adjustments in the context of temporal predictions presented participants with streams of (quasi-)rhythmic stimulation. Disentangling phase alignments of neural oscillations from a continuous stream of event-related potentials in a non-rhythmic predictive context therefore constitutes an important step in examining the involvement of endogenous neural oscillations in temporal prediction processes.
In the current study, we set out to investigate whether phase adjustments of neural oscillations can be observed for non-rhythmic, but predictable visual motion stimuli. We measured MEG while healthy participants watched a visual stimulus continuously moving across the screen until it disappeared behind an occluder. We manipulated the time for the stimulus to reappear on the other side of the occluder (on average 1.5 s). The task was to judge whether the stimulus reappeared too early or too late based on the speed of the stimulus prior to its disappearance. Hence, from the time point of disappearance behind the occluder, participants were required to temporally predict the correct time point of reappearance to be able to accomplish the task. We contrasted this condition with a control task, in which participants judged the luminance of the reappearing stimulus instead of its timing. Importantly, since the physical stimulation was identical in both conditions, any purely stimulus-related, bottom-up activity should level out between the two conditions. Moreover, since it has been shown that sensory stimulation can lead to crossmodal phase adjustments also in relevant but unstimulated other modalities (Lakatos et al., 2007; Mercier et al., 2013; ten Oever and Sack, 2015), we further introduced a third condition, in which a tactile instead of a visual stimulus was presented at reappearance. By contrasting it with the working memory control condition, we sought to determine whether phase adjustments can be observed in regions associated with tactile stimulus processing when sensory information was in fact only provided to the visual system.
We hypothesized that in the two temporal prediction tasks, as compared to working memory, we would observe stronger inter-trial phase consistency (ITPC) within time windows between disappearance and expected reappearance. Enhanced ITPC specifically in these time windows would reflect phase resets of ongoing oscillations at disappearance of the stimulus, where temporal prediction processes might be initialized. Moreover, if the phase of these oscillations indeed codes for the time point of the expected reappearance in each participant, two further hypotheses can be formulated: (a) participants showing a more consistent judgment of reappearance timing, as represented by a steep psychometric function, should have stronger ITPC during temporal predictions than participants who performed less accurately, since a consistent timing judgment across trials should also involve a consistent phase across trials. And (b), a clustering of a specific phase within the oscillation showing the strongest ITPC in each participant should be observable at the individual subjective time points of predicted reappearance. That is, not at the actual reappearance time points of the stimulus itself but at each individual’s subjective “right on time” impression, i.e., the time point at which the reappearance is expected, a clustering of a specific (“right-on-time”) phase should be observable across participants. This time point can also be inferred from the psychometric function as the point of subjective equality.
Taken together, observing enhanced ITPC during temporal prediction as well as a phase clustering at the time point of expected reappearance would provide evidence that endogenous neural oscillations align to the temporally predictive structure of external stimulation in a non-rhythmic visual context. This would strongly support the hypothesis that such phase alignments form the basis of the neural mechanisms that underlie temporal prediction processes in a unimodal visual as well as crossmodal visuotactile context.
Results
Behavioral results
Participants did not receive feedback about the correctness of their response. This ensured that participants relied on their individual and subjective “right on time” (ROT) impression in the temporal prediction and “point of subjective equivalence” (PSE) in the working memory condition. Across participants, there was no statistically significant bias towards “too early/darker” or “too late/brighter” responses in the visual temporal prediction (Δt (ROTV) = 13.15 ± 155.20 ms; t(22) = .41; p = .69) or in the working memory task (ΔRGB (PSE) = −1.29 ± 4.54 RGB; t(22) = −1.36; p = .19), respectively (Figure 1B). In the tactile temporal prediction task, participants showed a significant bias towards “too early” responses (Δt (ROTT) = 99.80 ± 150.00 ms; t(22) = 3.19; p = .004).
Participants responded significantly faster in each of the temporal prediction tasks as compared to the working memory task (visual prediction: t(22) = −2.55; p = .02; tactile prediction: t(22) = −4.29; p < .001). To assess whether reaction times were dependent on the timing of the reappearing stimulus (Figure 1C), we averaged across all luminance differences and fitted a linear model to reaction time data in each condition. Since reaction times should be slowest for timing differences around 0 ms, where ambiguity for the timing of the reappearing stimulus was strongest, we expected reaction times to follow an inverted U-shape in the temporal prediction tasks. Therefore, we used a second-order polynomial regression with timing difference as predictor and tested each participant’s coefficients against zero. Reaction times were significantly predicted by timing difference in all three tasks: the visual prediction task (first-order coefficient: −7.77 × 10⁻⁴ ± 5.27 × 10⁻⁴, t(22) = −7.08, p < .001; second-order coefficient: −1.42 × 10⁻⁶ ± 1.20 × 10⁻⁶, t(22) = −5.68, p < .001), the tactile prediction task (first-order coefficient: −2.88 × 10⁻⁴ ± 4.43 × 10⁻⁴, t(22) = −3.12, p = .005; second-order coefficient: −1.26 × 10⁻⁶ ± 1.10 × 10⁻⁶, t(22) = −5.50, p < .001), and the working memory task (first-order coefficient: −1.60 × 10⁻⁴ ± 1.44 × 10⁻⁴, t(22) = −5.31, p < .001; second-order coefficient: 2.75 × 10⁻⁷ ± 3.51 × 10⁻⁷, t(22) = 3.76, p = .001). Hence, although the timing of the stimulus was not relevant in the working memory task, reaction times in that condition were (in part) also dependent on the timing of the reappearing stimulus and were faster the later the stimulus reappeared.
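The regression approach described above can be illustrated with a minimal sketch: a second-order polynomial is fitted to each participant's reaction times, and the resulting coefficients are tested against zero across participants with a one-sample t statistic. The function names `fit_quadratic` and `one_sample_t`, the use of seconds as units, and all data values are hypothetical and chosen for illustration only; they are not taken from the study.

```python
import math

def fit_quadratic(x, y):
    """Least-squares fit of y = b0 + b1*x + b2*x**2; returns (b0, b1, b2)."""
    # Build the 3x3 normal equations (X'X) b = X'y as an augmented matrix.
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * p for v, p in zip(a[row], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))

def one_sample_t(values):
    """t statistic (df = n - 1) for testing the mean of values against zero."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n)

# Hypothetical data: timing differences (s) and reaction times (s) for three
# made-up participants whose reaction times peak near a timing difference of 0.
timing = [-0.4, -0.2, 0.0, 0.2, 0.4]
participants = [
    [0.50, 0.58, 0.62, 0.57, 0.48],
    [0.55, 0.61, 0.66, 0.60, 0.52],
    [0.47, 0.56, 0.59, 0.55, 0.46],
]
coefs = [fit_quadratic(timing, rt) for rt in participants]
second_order = [c[2] for c in coefs]
print("t statistic, second-order coefficients:", round(one_sample_t(second_order), 2))
```

A negative second-order coefficient corresponds to the inverted U-shape expected in the temporal prediction tasks, whereas a negative first-order coefficient indicates faster responses for later reappearance.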
Temporal prediction was associated with reduced beta power in sensory regions
Analyzing the neural data, we were first interested in investigating which frequency bands showed modulated spectral power during windows of temporal predictions. To determine frequency bands of interest, we tested an average of spectral power across all sensors and conditions against a pre-stimulus baseline window. Although we were most interested in time windows of temporal prediction, i.e., around disappearance of the stimulus, we first obtained a general overview of power modulations at each event in the experimental paradigm by computing cluster-based permutations statistics in three separate time windows centered on: (a) the onset of the moving stimulus (“Movement”), (b) disappearance of the stimulus behind the occluder (“Disappearance”), and (c) reappearance of the stimulus (“Reappearance”). All windows were normalized with a pre-movement baseline window (Figure 2A).
In time bins around movement onset as well as reappearance of the stimulus, clusters of frequencies in the theta and delta range showed a statistically significant increase of spectral power as compared to the baseline window. All time windows further showed a significant decrease of spectral power in frequencies within the beta to lower gamma range, which extended into the higher gamma range at the end of the occluder window or the beginning of the reappearance window, respectively (all cluster p-values < .008). Importantly, even when using a liberal cluster alpha level of .05 (one-sided), we did not find a statistically significant modulation of delta power during the disappearance window. This was also not the case when restricting the test to sensors from occipital regions only (see Figure S1).
Since we were most interested in examining power modulations associated with temporal predictions, i.e., during the disappearance window, we compared spectral power estimates between the temporal prediction tasks and the working memory task in all sensors within the disappearance window. We restricted our analysis to the classical beta band ranging from 13 to 30 Hz, showing significant decreases as compared to baseline. Cluster-based permutation statistics revealed reduced beta power during visual temporal prediction in comparison to working memory in occipital sensors during all time-bins of the disappearance window (cluster-p = .01). Source level statistics on the average across all time bins showing significant differences on sensor level revealed a statistically significant decrease of beta power in a cluster of bilateral occipital voxels (cluster-p = .01). A comparison between the tactile prediction task and the working memory task showed that beta power was reduced during tactile prediction in a cluster of occipital as well as left lateralized frontocentral sensors (cluster-p = .002). In occipital sensors, beta power was reduced during all time bins, whereas the more anterior power reduction evolved first in left frontal sensors during stimulus disappearance and shifted towards more left-lateralized central sensors with ongoing disappearance time. At source level, a significant power reduction in the beta band was most strongly apparent in parts of bilateral visual as well as left-lateralized somatosensory cortex in an average across the whole time window (cluster-p = .01; see Figure S2 for condition-specific beta modulations as compared to baseline).
Inter-trial phase consistency in the delta band was stronger during temporal prediction than working memory
For the analysis of ITPC, we followed a similar approach as for the analysis of spectral power. First, we tested ITPC differences as compared to baseline in the three time windows for an average across all sensors and conditions by cluster-based permutation statistics. ITPC was significantly increased across a range of different frequencies in time bins around movement onset, disappearance and reappearance of the stimulus as compared to baseline (all cluster-p < .001; Figure 3A). Significant ITPC changes were broadband for the time windows centered on movement onset as well as reappearance of the stimulus with strongest increases in the delta to alpha range. At disappearance of the stimulus, significant ITPC differences to baseline were observed up to the low beta range with strongest increases in the delta and theta band.
The delta band showed strongest increases in ITPC but no increase in power as compared to baseline for an average across all conditions (see Figures 2A, 3A, and S1). Therefore, we restricted our further statistical analyses to frequencies between 0.5 and 3 Hz and time bins around disappearance of the stimulus. Differences between the two temporal prediction tasks and the working memory task were examined by computing cluster-based permutation statistics across all sensors for an average in this frequency band. For a better estimation of when differences in ITPC between the conditions became apparent, we extended the analysis of ITPC to time bins ranging from −1,900 ms to 1,900 ms centered on the disappearance of the stimulus. Note that in this extended analysis window the timing of the movement onset as well as the reappearance of the stimulus strongly jittered across trials. The effect of these events on ITPC estimates was thus strongly reduced (in comparison to the time windows that were centered on these events; see Figure S4).
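As a minimal sketch of the quantity analyzed here, ITPC at a given sensor, frequency and time bin can be computed as the length of the mean resultant vector of the single-trial phases. The function name `itpc` and the toy phase values are illustrative assumptions and do not reproduce the analysis pipeline of this study.

```python
import cmath
import math

def itpc(phases):
    """Inter-trial phase consistency: length of the mean unit phase vector.

    phases: single-trial phase angles (radians) at one sensor, frequency
    and time bin. Returns a value between 0 (phases cancel out across
    trials) and 1 (identical phase on every trial).
    """
    if not phases:
        raise ValueError("need at least one trial")
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)

# Toy example (assumed values): tightly clustered vs. evenly spread phases.
aligned = [0.10, -0.05, 0.02, 0.08]
spread = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
print(round(itpc(aligned), 3), round(itpc(spread), 3))
```

Because ITPC depends only on phase angles and not on amplitude, it can increase without any accompanying change in spectral power, which is the dissociation reported for the delta band above.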
By comparing the visual temporal prediction task to the working memory task, we found two clusters that showed significantly stronger ITPC during temporal predictions (Figure 3B; for clarity of presentation, only every second time bin, i.e., 200 ms, of the cluster was plotted). One cluster included sensors from right temporal, frontal and occipital regions in time bins from −400 to 1,900 ms (cluster p < .001). The second cluster included left frontotemporal sensors in time bins ranging from 0 to 1,900 ms (cluster p = .01). Source level analysis revealed that for an average of the time window from −400 to 1,900 ms, ITPC differences between the two conditions were strongest in right-lateralized central and inferior frontal voxels (cluster p < .001).
For the contrast of tactile temporal prediction to working memory, we found a similar pattern of significant ITPC differences as for the contrast of visual prediction to working memory (Figure 3C). ITPC was also significantly enhanced in bilateral temporal sensors, evolving around −400 ms in right temporal sensors and shifting towards left hemisphere with ongoing disappearance time (cluster p < .001). In this contrast, however, differences in ITPC were more strongly apparent also in frontal and central sensors. Besides strongest differences in ITPC again in right superior parietal and inferior frontal voxels, source level analysis also revealed strong differences in bilateral somatosensory voxels for the contrast of tactile prediction to working memory (cluster p < .001).
Figure 3D depicts absolute ITPC estimates for all three conditions in the enlarged disappearance time window. ITPC was averaged across participants and all the sensors that exhibited the top 20% of t values in the ITPC contrast between visual temporal prediction and working memory between 0 and 1,500 ms (see Figure 3B; similar results were obtained for sensors showing the top 10% or 5% of t values, see Figure S4D). ITPC also increased in the working memory condition around disappearance of the stimulus, but dropped to the level observed during stimulus movement shortly afterwards. ITPC in the visual as well as tactile temporal prediction tasks also decreased after an initial overshoot, but stayed elevated throughout the entire disappearance window. Importantly, enhanced ITPC estimates during temporal predictions became apparent at around disappearance of the stimulus and were not present during baseline or early movement time windows.
Inter-trial phase consistency predicts individual behavioral performance
If the phase of neural oscillations was indeed associated with temporal predictions, a participant who judged the reappearance of the stimulus within her individual subjectively correct ROT framework in a consistent manner should exhibit stronger ITPC during temporal predictions than a participant who performed less consistently, as a consistent timing judgment across trials should involve a similar phase across trials. The consistency of judgments can be inferred from the steepness of the psychometric function. That is, the steeper the psychometric function, the more consistent the answers of the participant. To examine the relationship between individual ITPC estimates and the steepness of the psychometric functions, we computed Pearson correlations across participants between source level delta ITPC and the steepness of the psychometric function in each voxel of the 5,003-voxel grid. Using cluster-based permutation statistics, we found statistically significant positive correlations in the visual (cluster p = .003) as well as in the tactile temporal prediction task (cluster p = .002; Figure 4). Strongest correlations were found in the cerebellum and right-lateralized early visual areas in both tasks. No clusters showing significant positive or negative correlations were observed in the working memory task (all cluster p > .1).
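The voxel-wise correlation step can be sketched as follows: for each voxel, delta ITPC values (one per participant) are correlated with the participants' psychometric slopes. The names `pearson_r` and `correlate_voxels` as well as the toy numbers are hypothetical, and the cluster-based permutation statistics applied to the resulting correlation map are not reproduced here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlate_voxels(itpc_by_voxel, slopes):
    """Correlate per-participant ITPC with psychometric slope in each voxel.

    itpc_by_voxel: list with one entry per voxel, each a list of ITPC
    values ordered by participant. slopes: one slope per participant.
    """
    return [pearson_r(voxel_itpc, slopes) for voxel_itpc in itpc_by_voxel]

# Hypothetical values for four participants and two voxels.
slopes = [0.8, 1.5, 2.1, 3.0]
itpc_by_voxel = [
    [0.21, 0.26, 0.30, 0.38],  # ITPC rises with slope in this voxel
    [0.25, 0.22, 0.27, 0.24],  # no clear relationship in this voxel
]
print([round(r, 2) for r in correlate_voxels(itpc_by_voxel, slopes)])
```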
Delta phase clusters at individually predicted reappearance time points
One of our main interests in this study was to examine whether the phase of slow oscillations codes for the predicted time point of reappearance of the stimulus, i.e., whether a clustering of a specific low-frequency phase can be observed at each individual’s ROT. In order to test that, we extracted the mean phase of that delta frequency that showed the strongest ITPC within each temporal prediction task as compared to the working memory task at ROT in each participant. In case there was no relationship between delta phase and individual ROTs, all phases extracted at ROT should be randomly distributed across the unit circle. That is, even if delta oscillations showed a phase reset at disappearance but this phase reset was not relevant for temporal predictions, the phases extracted at ROT should strongly differ across participants, since individual ROTs strongly differed across participants as well (see Figure 1B). Moreover, if the frequency showing strongest ITPC within the frequency band of 0.5 to 3 Hz was further not related to temporal predictions, the phase extracted at ROT for these various frequencies should also vary strongly across participants.
For this analysis, we again used the sensors that showed the strongest statistical differences in ITPC for the contrasts of each prediction task to the working memory task (see Figure 3B and C). Moreover, only trials in which the stimulus actually reappeared later than each individual’s ROT were considered, so that stimulus onset related brain activity would not distort phase estimates at ROT. Mean phases extracted at ROT from each channel and all participants were then plotted into a histogram for each condition (Figure 5, upper row). We quantified the distance of the observed distribution to a uniform distribution by means of the modulation index (MI; Tort et al., 2010). In the working memory condition, we used individual ROTs from the visual prediction task (which employed identical stimulation) and extracted the phase from the frequency that showed the strongest ITPC as compared to the visual prediction task.
To test whether the observed MI was significantly stronger than a random distribution obtained from surrogate MIs, we repeated the analysis 10,000 times using a randomly chosen frequency from the same delta band for each participant in each repetition. We found that for both the visual prediction task (p = .03) and the tactile prediction task (p < .0001, i.e., none of the 10,000 surrogate MIs reached the observed MI), the observed MI was significantly stronger than the surrogate MIs. Phases at ROT from both tasks clustered roughly around ±90°. In the working memory task, no significant clustering at a specific phase was found (p = .96).
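The modulation index and the surrogate test can be sketched as follows. The MI is computed here as the Kullback-Leibler distance of the binned phase distribution from a uniform distribution, normalized by the logarithm of the number of bins (Tort et al., 2010). The choice of 18 phase bins, the function names, and the use of uniformly random phases as a stand-in for the random-frequency surrogates are assumptions made for illustration only.

```python
import math
import random

def modulation_index(phases, n_bins=18):
    """Normalized KL distance of a phase histogram from uniformity (0 to 1)."""
    counts = [0] * n_bins
    width = 2 * math.pi / n_bins
    for p in phases:
        idx = int((p % (2 * math.pi)) / width)
        counts[min(idx, n_bins - 1)] += 1  # guard against rounding at 2*pi
    total = len(phases)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return (math.log(n_bins) - entropy) / math.log(n_bins)

def surrogate_p_value(observed, surrogates):
    """Fraction of surrogate MIs at least as large as the observed MI."""
    return sum(s >= observed for s in surrogates) / len(surrogates)

# Hypothetical observed phases clustered around 90 degrees.
rng = random.Random(0)
observed_phases = [math.pi / 2 + rng.gauss(0.0, 0.4) for _ in range(60)]
observed_mi = modulation_index(observed_phases)

# Stand-in surrogates: uniformly random phases instead of phases taken at
# randomly chosen delta frequencies as in the procedure described above.
surrogate_mis = [
    modulation_index([rng.uniform(0.0, 2 * math.pi) for _ in range(60)])
    for _ in range(1000)
]
print(round(observed_mi, 3), surrogate_p_value(observed_mi, surrogate_mis))
```

With this estimator, a value of 0 means that no surrogate reached the observed MI, so the p-value is bounded by one over the number of surrogates rather than being exactly zero.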
Our reaction time analysis revealed that also in the working memory task, participants had a certain expectation about the temporal reappearance of the stimulus. Therefore, we hypothesized that the phase of the frequency that showed the strongest ITPC during the visual prediction task might also code for the timing of the reappearing stimulus in the working memory task, since physical stimulation was identical in both tasks. We repeated the above described analysis for the working memory condition, now using the same frequencies as obtained from the visual prediction condition and again tested the observed MI against 10,000 repetitions with randomly chosen frequencies (Figure 5, Panel 4: WM (V)). With frequencies obtained from the visual prediction task, the MI observed for the working memory task was significantly stronger than MIs obtained from the random repetitions (p = .02).
ITPC estimates during temporal predictions do not correlate with eye movements
One potential explanation for the observed effects in ITPC could be that participants tracked the moving stimulus with their eyes to be able to judge the correct time point of reappearance. Thus, consistent horizontal eye movements matching the speed of the stimulus might lead to enhanced ITPC in the delta band. To make sure that differences in eye movements do not explain the observed differences in ITPC between the conditions, we analyzed horizontal eye movements recorded by an eye tracker (ET) during the MEG measurement. Figure 6A depicts condition-wise horizontal eye positions averaged across all participants and centered on the disappearance of the stimulus, showing no systematic differences between the conditions. Moreover, if horizontal eye movements explained the effects in ITPC, we should observe the same effects between the conditions when we compute ITPC for the ET data. Differences in ITPC between the two temporal prediction conditions and the working memory condition are depicted in Figure 6B and C. Using cluster-based permutation statistics, we did not observe any time-frequency cluster that revealed significant differences between the conditions (all cluster p > .1).
Further, we tested whether there were any significant correlations between individual ITPC values obtained from the MEG data and from the ET data. We averaged ITPC values from a time window of 0 to 1,500 ms and again used the top 20% of channels showing the strongest effect for ITPC for the MEG data (for channels see Figure 3D). Correlations between the data from the two measuring devices are depicted in Figure 6D. Again, we did not observe significant correlations between the ITPC values obtained from MEG and ET data. The strongest, albeit not significant, correlation was found in the working memory condition, which confirms that the ITPC differences found in the MEG data cannot be explained by horizontal eye movements during temporal predictions.
Discussion
Our results support the idea that phase adjustments of ongoing neural oscillations could form the neuronal basis of temporal predictions and suggest that this framework can be extended to temporal predictions inferred from stimulation that does not itself comprise rhythmic components. Our task design enabled us to disentangle the phase reset of ongoing neural oscillations from evoked event related potentials and showed that phase adjustments are stronger in the context of temporal predictions than in tasks where temporal structure is less relevant. The strength of the observed phase adjustments correlated with the ability to consistently judge the temporal reappearance of the stimulus across participants. Moreover, the phase of individual delta oscillations clustered at around 90° at each participant’s predicted time point of reappearance, possibly indicating an optimal phase of neural oscillations in the context of temporal prediction.
Cross-modal temporal predictions are reflected by a beta power reduction in both sensory systems
It has been suggested that temporal predictions of upcoming events might be mediated by neuronal oscillations in the delta and beta frequency range (Arnal and Giraud, 2012). The enhanced phase consistency of delta oscillations as well as the power modulations in the beta band observed in the current study are in line with this hypothesis. However, earlier reports on beta power modulations during temporal predictions are inconsistent. In a study by Fujioka et al. (2012), after an initial reduction in power, beta oscillations transiently re-synchronized to reach a maximum at the time point of the expected subsequent stimulus in a rhythm, increasing differentially as a function of the utilized frequency of the rhythmic stimulation. Other studies found that beta power was even increased shortly before the onset of the expected stimulus in auditory (Arnal et al., 2015; Gulberti et al., 2015) and visual rhythmic stimulation (Saleh et al., 2010). On the other hand, van Ede et al. (2011) found that predicting the onset of a tactile stimulus was specifically associated with a reduction of beta power in contralateral tactile areas and accompanied by faster reaction times. The authors suggest that a reduction in beta power might signal preparatory processes in the sensory system that expects the upcoming event.
The observed decrease in beta power in task-relevant sensory regions in the current study largely matches the results reported by van Ede et al. (2011). During visual temporal predictions, beta band power was reduced in visual sensory regions as compared to the visual control condition during the entire disappearance time. During crossmodal predictions, in which temporal information was provided to the visual system but reappearance was expected in the tactile domain, beta band power was decreased in both visual and tactile regions.
Since also in the working memory condition participants expected to perceive a visual stimulus, preparatory processes alone cannot explain this reduction in beta power. This is especially the case in the crossmodal condition, in which no visual stimulus was expected, but stronger decreases in beta were also observed in visual areas. Moreover, since we observed beta decreases also in tactile regions at the time of visual stimulus disappearance, the decrease could not solely be an effect of external stimulation.
One could argue that potential working memory maintenance processes associated with beta power increases in visual regions (Daume et al., 2017a, 2017b) could in fact explain the stronger decrease in beta power during temporal predictions. The lack of a beta modulation relative to the pre-stimulus baseline in the working memory condition would be in line with this explanation, since working memory maintenance (reflected by a beta power increase) combined with processes related to preparation (reflected by a beta power decrease) would level out beta modulations in early visual areas (see Figure S2). This could, however, not explain the observed beta power reduction in visual areas during the tactile temporal prediction condition, since no visual stimulation was expected here.
Beta decreases observed during temporal predictions might therefore reflect more than preparatory processes for an upcoming stimulus. Cross-modal decreases in beta band activity in both the visual system, which provides the temporal information, and the tactile system, which expects the stimulation, might indicate that both sensory modalities are continuously involved in temporal prediction processes, not only in processes preparing for the upcoming stimulation. We found no significant increases in beta power during temporal predictions in either of the two prediction conditions, even when the time window was centered on the time point of predicted reappearance (ROT) in each participant (see Figure S3). Whether decreases in beta power are associated with non-rhythmic temporal predictions while increases might reflect temporal predictions during rhythmic stimulation remains subject to future research.
Neural oscillations at low frequencies adapt to the temporal structure of visual moving stimuli
Seminal work by Lakatos et al. (2008) showed that the phase of delta oscillations in visual area V1 of monkeys followed the attended regular stream of either visual or auditory rhythmic stimulation. Moreover, reaction times towards a target stimulus varied as a function of the phase of delta oscillations. Such phase alignments could form the basis for temporal prediction processes to facilitate processing of predictive upcoming stimulation (Arnal and Giraud, 2012; Schroeder and Lakatos, 2009). Similar results have also been found in studies of the human brain (Besle et al., 2011; Gomez-Ramirez et al., 2011).
Phase entrainment of neural oscillations does not only occur in the delta band but can flexibly adapt to the frequency of the external input also at higher frequencies such as the theta or the alpha band. Doelling and Poeppel (2015) presented participants with rhythmic, musical stimuli composed of peak note rates varying from 0.5 to 8 notes per second and found neural oscillations with frequencies matching the different peak note rates to show increased phase entrainment to the stimuli. Moreover, multiple rhythmic streams presented at different frequencies can be simultaneously tracked by neural oscillations in different frequency bands, and behavior is especially enhanced when the phases from both rhythms coincide (Henry et al., 2014). This is specifically important in speech processing, where neural oscillations can track the complex spectrum of spoken language (Giraud and Poeppel, 2012). Accordingly, it has been shown that the frequency of the entrained rhythm can modulate the comprehension of spoken words (Kösem et al., 2018).
To serve as a potential mechanism to temporally predict the onset of relevant upcoming information, the entrainment of neural oscillations from different frequencies is crucial in order to flexibly adapt to the naturally occurring temporal regularities. However, in the visual system, evidence for the tracking of temporally predictive input by neural oscillations from different frequency bands is not as clear. On the one hand, studies showed that the phase of delta oscillations is involved in temporal predictions of visual input (Cravo et al., 2013; Saleh et al., 2010; Wilsch et al., 2015). On the other hand, studies suggested that temporal predictions in the visual system were specific to the alpha band. Rohenkohl and Nobre (2011), for instance, used visual stimuli that were presented rhythmically at 2.5 and 1.25 Hz and moved across the screen until they disappeared behind an occluder. Neural oscillations exclusively from the alpha band showed modulated activity associated with temporal predictions during the disappearance time. They found no phase locking of oscillations in lower frequencies. Moreover, results reported by Samaha et al. (2015) suggest that specifically the phase of alpha oscillations is modulated to predict the onset of a visual stimulus.
In the current study, we provide further evidence that neural oscillations from the delta band show enhanced phase alignment during a window of explicit visual temporal predictions across trials. In order to adapt to the temporal regularity of the presented visual stimulus, delta frequencies in a wide network of parietal and frontal brain areas exerted more consistent phase resets at around the time point of disappearance of a visual stimulus as compared to a working memory control condition. The strength of this phase adjustment in each participant correlated with the consistency in judging a reappearance of the visual stimulus as too early or too late. This was the case only in the temporal prediction tasks, which underlines the behavioral relevance of the observed phase adjustments for temporal predictions.
Moreover, by providing no feedback about the correctness of their response, we made sure that participants used individual time points at which they subjectively expected the stimulus to reappear. Within each participant’s neural oscillation that showed the strongest ITPC during temporal predictions, we found a clustering of phases roughly around ±90° at each participant’s ROT. This was not the case when using the frequencies showing the strongest ITPC in the working memory condition, where timing was not as important. That is, within each individual’s subjective temporal framework, neural oscillations adjusted their phase to the external stimulation such that a phase of 90° eventually coincided with each individual’s predicted time point of reappearance. This provides strong support for the notion that in the context of temporal predictions the phase of delta oscillations adjusts to the temporal structure of the stimulation to code for the timing of the predicted reappearance. Our results are in line with results reported by Cravo et al. (2013), who showed that contrast sensitivity was a function of the phase of entrained delta oscillations. In their study, the strongest contrast sensitivity for visual stimuli was also observed at a delta phase around 90°. This phase range might therefore indicate an optimal phase for processes related to temporal prediction.
Only when using the frequencies that exhibited the strongest ITPC in the visual prediction task also for the analysis of phase clustering in the working memory task did we observe a significant phase clustering across all participants, again roughly around ±90°. The results of the behavioral data suggest that the temporal structure of the stimulation was not totally irrelevant to the control task (see Figure 1C). Therefore, phase adjustments in neural oscillations related to the temporal structure might have also occurred within this task (see Figure S5 for condition-specific ITPC), but they were less consistent than the phase adjustments observed in the temporal prediction tasks.
Importantly, our study suggests that the mechanism of phase adjustments for temporal predictions can be extended to external stimulation that does not as such involve rhythms. We found that low-frequency oscillations can adjust their phase also to the temporal structure of external stimulation that had to be inferred from motion. Many natural stimuli are composed of highly predictable regularities, but not all of them are intrinsically rhythmic. Our results therefore indicate that the framework of phase adjustments during temporal predictions might be generalized to all forms of predictive stimulation.
Enhanced ITPC during temporal predictions is associated with phase adjustments in endogenous neural oscillations
In earlier investigations of phase adjustments to external stimulation participants were mostly presented with streams of rhythmic input. An undeniable advantage of using rhythms is the well-defined frequency range of interest for analyzing the recorded brain activity. However, rhythmic input also causes evoked brain activity within the same frequency range, which makes it difficult to disentangle streams of evoked activity from entrained endogenous neural oscillations (Doelling et al., 2019; Zoefel et al., 2018).
Our results provide further evidence that phase resets of low-frequency oscillations observed during temporal predictions cannot solely be explained by stimulus-evoked, bottom-up brain activity (see also, Doelling et al., 2019; Kösem et al., 2018; ten Oever et al., 2017). In the current study, we aimed at reducing such brain responses to a minimum by presenting participants with a continuously moving stimulus instead of several discrete stimuli. We were particularly interested in the time point at which the stimulus transiently disappeared behind an occluder (as opposed to sharp onsets and offsets in rhythms). At disappearance, we did not observe an increase in low-frequency power relative to the pre-stimulus baseline in any of the conditions studied; such an increase could have indicated stimulus-evoked brain activity. Moreover, by introducing a control condition in which physical stimulation was exactly the same as during temporal predictions, we further aimed at controlling for brain responses that were not specific to temporal predictions. Importantly, delta power was not stronger during temporal predictions as compared to the working memory task (see Figure S1).
We observed enhanced ITPC during temporal predictions as compared to the working memory condition as well as to the pre-stimulus baseline window at the time point of disappearance. This strongly favors the notion that ongoing, endogenous neural oscillations underwent a phase reset around the time point of disappearance, which was more consistent during temporal predictions than during working memory processes. These phase resets can therefore not be solely related to brain responses evoked by the offset of the visual movement, since we did not observe power differences at low frequencies.
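The dissociation between phase consistency and power drawn on in this argument can be illustrated with a minimal sketch. It assumes the conventional definition of ITPC as the length of the trial-averaged unit phase vector; the function names are illustrative and the exact estimator used in the study is not restated here.

```python
import cmath

def itpc(phases):
    """Inter-trial phase consistency: length of the mean unit phase vector (0 to 1)."""
    return abs(sum(cmath.exp(1j * ph) for ph in phases) / len(phases))

def itpc_from_coefficients(coeffs):
    """Same measure from complex wavelet coefficients (one per trial).

    Each coefficient is divided by its amplitude first, so the result
    depends on phase only, not on spectral power.
    """
    return abs(sum(c / abs(c) for c in coeffs) / len(coeffs))
```

Because every trial is normalized to unit amplitude, scaling the coefficients of all trials leaves this measure unchanged, which is why a difference in ITPC without a difference in power is interpretable as a difference in phase alignment.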
Furthermore, we introduced a crossmodal temporal prediction task, in which a visual stimulus disappeared behind the occluder, while participants had to judge the timing of an upcoming tactile stimulus. In this condition, we observed enhanced ITPC in somatosensory areas at around the time point of disappearance of the visual stimulus. Since stimulation was purely visual at that time, evoked brain activity could not explain the observed effects in ITPC. However, we did not observe enhanced ITPC in early visual areas at any time point during temporal predictions as compared to the working memory control condition (discussed further below).
Phase resets occurred in a network of frontoparietal and sensory brain areas
We observed enhanced ITPC values in a network of mostly frontal and parietal brain areas during visual as well as crossmodal temporal predictions. Strongest ITPC values were observed in superior parietal and inferior frontal cortex contralateral to stimulus disappearance during both, visual as well as crossmodal temporal predictions. During tactile temporal predictions, strong ITPC values were also observed in bilateral somatosensory as well as in inferior parietal cortex contralateral to the predicted tactile stimulation. Similarly, Besle et al. (2011) observed significant phase entrainment to audiovisual stimulation in a wide network of distributed areas including parietal and inferior frontal areas. These observations support the notion that brain areas involved in temporal predictions may constitute a frontoparietal timing network (Coull and Nobre, 2008; Rimmele et al., 2018; Wiener et al., 2010).
The fact that strong ITPC differences were not only observed in areas contralateral to the disappearance, but also in areas contralateral to the predicted reappearance of the stimulus in the opposite hemifield suggests that phase adjustments might not only reflect processes of temporal, but also processes of spatial predictions. Similarly, we observed enhanced ITPC values also in early somatosensory areas contralateral to the disappearance of the purely visual stimulus during crossmodal temporal predictions, despite the fact that prediction-relevant information was provided only by a moving visual stimulus. This supports evidence reported earlier showing that stimulation within one modality can crossmodally reset the phase of ongoing low-frequency oscillations in other modalities, which might be an important mechanism for multisensory integration processes (Lakatos et al., 2007; Mercier et al., 2013; ten Oever and Sack, 2015). Similarly, during crossmodal predictions we observed enhanced ITPC also in somatosensory areas contralateral to the expected tactile stimulation. Phase adjustments in a distributed network of areas might therefore reflect temporal as well as spatial or crossmodal predictions in areas that are relevant for providing as well as receiving the predictable information.
Along the same line, we expected to find enhanced ITPC during temporal predictions in early visual areas, especially as the visual system was the only modality explicitly confronted with the external temporal information. In fact, strong ITPC estimates were also observed in occipital sensors as compared to baseline in each condition (see Figure S5), but they did not differ between the conditions. One reason for this could be that processes not related to temporal prediction, and therefore equal in all conditions, overshadowed effects in the visual system. However, we found that voxels in early visual areas showed strong correlations between individual ITPC estimates and the steepness of the psychometric function in both temporal prediction tasks, but not in the working memory task. This suggests that consistent phase resets of delta oscillations within visual areas might have supported consistent timing judgments within the participants’ subjective timing frameworks. This indicates a critical involvement of the visual system also in processes related to temporal prediction.
Moreover, we observed strong correlations between ITPC and behavior in the cerebellum, supporting earlier reports on an involvement of the cerebellum in temporal prediction processes (Breska and Ivry, 2016; Ivry and Keele, 1989). Roth and coworkers (2013), for instance, found that cerebellar patients were significantly impaired in recalibrating sensory temporal predictions of a reappearing visual stimulus. This finding is of particular interest as we adapted the authors’ experimental paradigm for use in the current study. Their results and ours therefore indicate that the cerebellum might be crucially involved in accurate and consistent judgments of temporal regularities deployed in perceiving object motion.
Conclusions
We provide evidence that the phase of neural oscillations can adjust to the temporal regularities of external stimulation. Such phase alignments could provide a key mechanism that predicts the onset of upcoming events in order to optimize processing of relevant information and thereby adapt behavior. Our results further reveal that this concept of phase adjustments during temporal predictions can be extended to non-rhythmic, but predictable visual motion stimuli, which suggests that phase adjustments could be a general mechanism for temporal prediction processes. In a crossmodal setting, we show that temporal information provided to one modality leads to phase adjustments in another modality when crossmodal temporal predictions are necessary. Such crossmodal phase resets could be the neuronal basis of multisensory integration processes. Moreover, by introducing a physically matched control condition, our results support the notion that phase alignments observed during temporal predictions are based on phase resets in ongoing neural oscillations and do not arise as a byproduct of bottom-up stimulus processing. Importantly, we observed that these phase adjustments occurred on an individual level to match each individual’s subjectively predicted time points. Additionally, the more consistent a participant was in estimating the time of reappearance of the stimulus, the higher the phase alignment was across trials. Taken together, our results provide important insights into the neural mechanisms that might be utilized by the brain to predict the temporal and spatial onsets of upcoming events.
Methods
Participants
Twenty-three healthy volunteers (mean age ± standard deviation (SD): 27.13 ± 4.30 years; 20 females; all right-handed) took part in the study. They gave informed written consent and were monetarily compensated with 13 €/hour for participation. All volunteers had normal or corrected-to-normal vision, normal touch, as well as no background of psychiatric or neurological disorder. The ethics committee of the Medical Association Hamburg approved the study protocol (PV5073), and the experiment was carried out in accordance with the approved guidelines and regulations.
Experimental procedure
The experimental paradigm used in the current study was adapted from an earlier report investigating visual temporal predictions in cerebellar patients (Roth et al., 2013). Our experiment consisted of three conditions: a visual temporal prediction task, a crossmodal (tactile) temporal prediction task, and a working memory (control) task. The trials of all conditions started with the presentation of a randomly generated, white noise occluder (size: 7.5° × 11.3° (h × w)) that was smoothed with a Gaussian filter (imgaussfilt.m in MATLAB) and presented in the middle of the screen against a grey background screen (luminance: 44 cd/m2; corresponds to 115 red-green-blue (RGB) values in our setting; see Figure 1A). At the center of the occluder, a red fixation dot was presented. We instructed participants to fixate this dot throughout the entire trial. After 1,500 ms, an oval stimulus (size: 3.5° × 1.0°) appeared in the periphery of the screen, moving towards the occluder with a speed of 6.9 °/s. The luminance of the stimulus varied across trials between 120 and 161 cd/m2 (6 steps, counterbalanced, corresponds to 170 to 220 RGB). For half of the participants, the stimulus started on the left side of the occluder and moved from the left towards the right side. For the other half, the stimulus started on the right side and moved from right to left. The direction of movement was kept constant for each participant throughout the entire experiment. In each trial, the starting point of the stimulus differed such that the stimulus took 1,000 to 1,500 ms to disappear completely behind the occluder from its starting point, randomly jittered with 100 ms (counterbalanced). The size of the occluder and the speed of the stimulus were chosen so that the stimulus would need exactly 1,500 ms to reappear on the other side of the occluder. However, we manipulated the timing and the luminance of the reappearing stimulus.
In each trial, the reappearance of the stimulus deviated by ±17 to ±467 ms (randomly jittered, but counterbalanced in steps of 50 ms; corresponds to ±1 to ±28 frames with a jitter of 3 frames at 60 Hz) from the correct reappearance time of 1,500 ms. Hence, the stimulus was covered by the occluder for 1,033 to 1,967 ms and reappeared at 20 different time points. In the visual prediction task as well as in the working memory task, we also manipulated the luminance of the reappearing stimulus relative to the luminance the stimulus had before disappearance in each trial (jittered, but counterbalanced between ±1 to ±40 cd/m2, also using 20 different values; corresponds to ±1 to ±28 RGB in steps of 3 RGB to make it similar to the timing manipulation). After reappearance, the stimulus moved to the other side of the screen for 500 ms with the same speed until it left the screen. The occluder was presented throughout the entire trial.
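The arithmetic behind the timing manipulation can be checked with a short sketch. It only restates the numbers given above (60 Hz, offsets of 1 to 28 frames in steps of 3 frames, 1,500 ms correct occlusion time); the variable names are illustrative and not taken from the original stimulus scripts.

```python
# Sketch of the reappearance-time manipulation; names are illustrative only.
FRAME_RATE_HZ = 60
CORRECT_OCCLUSION_MS = 1500

# Absolute offsets of 1, 4, 7, ..., 28 frames (steps of 3 frames = 50 ms at 60 Hz).
abs_frame_offsets = list(range(1, 29, 3))

# Every offset occurs once as "too early" (negative) and once as "too late" (positive).
frame_offsets = sorted([-f for f in abs_frame_offsets] + abs_frame_offsets)

# Convert frames to milliseconds and derive the resulting occlusion durations.
offsets_ms = [round(f * 1000 / FRAME_RATE_HZ) for f in frame_offsets]
occlusion_ms = [CORRECT_OCCLUSION_MS + o for o in offsets_ms]

print(len(offsets_ms), min(occlusion_ms), max(occlusion_ms))
```

The ten absolute offsets with two signs each give the 20 reappearance time points, and the extreme offsets of ±28 frames give the stated occlusion range.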
By manipulating the timing as well as the luminance in both conditions, we made sure that the visual temporal prediction task and the working memory task had exactly the same physical appearance throughout all trials. They only differed in their cognitive set. In the visual temporal prediction task, we asked participants to judge whether the stimulus was reappearing too early or too late based on the speed the stimulus had before reaching the occluder (which was kept constant throughout the entire experiment). In the working memory task, participants were asked to judge whether the luminance of the reappearing visual stimulus became brighter or darker as compared to the stimulus before disappearance. Participants answered by pressing one of two buttons with their index or middle finger of the hand contralateral to the reappearing stimulus.
The tactile temporal prediction task was equal to the visual temporal prediction task, with the only difference that a tactile stimulus instead of a visual was presented at the time of reappearance to the right or left index finger (depending on which side the stimulus was expected to reappear behind the occluder). The tactile stimulus was presented by means of a Braille piezostimulator (QuaeroSys, Stuttgart, Germany; 2 × 4 pins, each 1 mm in diameter with a spacing of 2.5 mm), pushing up all eight pins for 200 ms. At that time, nothing happened on the screen. Participants gave their answer with the same hand as in the other two conditions (i.e., with the hand that was not stimulated by the Braille stimulator). Response mapping of the two buttons was counterbalanced across all participants. As soon as participants gave their answer, the fixation dot turned dark grey for 100 ms to indicate that the response was registered. However, participants did not receive trial-wise feedback about the correctness of their response. After a short delay of 200 ms, the white-noise occluder was randomly re-shuffled to signal the start of a new trial.
All three conditions were presented block-wise. At the beginning of each block, participants were informed about the current task. The order of presentation of the conditions was kept constant for each participant, but was randomized across participants (counterbalanced). At the end of each block, they were informed about the overall accuracy of their answers within the last block and were allowed to rest as long as they wanted. Each participant performed two sessions at two different recording days. The experimental procedure was kept constant across both sessions, i.e., movement direction, response mapping, as well as condition order did not change in the second session for individual participants. Each session comprised twelve blocks, i.e., four blocks per condition. Each block consisted of 60 trials, resulting in a total number of 480 trials per condition or 1,440 trials in total. Due to technical difficulties, for one participant we only acquired data from one session with a total number of 720 trials.
At the beginning of each recording day, participants performed a short training of all conditions to get familiar with the overall experimental procedure and the stimulus material. This training took place in the same environment as the subsequent recording session. At the end of the second recording day, participants filled in a questionnaire asking for any specific strategy they might have used for the temporal prediction task.
We used MATLAB R2014b (MathWorks, Natick, USA; RRID: SCR_001622) and Psychtoolbox (Brainard, 1997; RRID: SCR_002881) on a Dell Precision T5500 with Ubuntu 64-bit operating system (Version: 16.04.5 LTS) for stimulus presentation. The visual stimuli were projected onto a matte backprojection screen at 60 Hz with a resolution of 1,920 × 1,080 pixels positioned 65 cm in front of participants. To mask the sound of the Braille stimulator during tactile stimulation, we presented participants with auditory pink noise at a sampling rate of 48 kHz and a volume of 85 dB using MEG-compatible in-ear headphones (SRM-252S, STAX Limited, Fujimi, Japan) during all experimental blocks.
Data acquisition and pre-processing
MEG was recorded at a sampling rate of 1,200 Hz using a 275-channel whole-head system (CTF MEG International Services LP, Coquitlam, Canada) situated in a dimly lit, sound-attenuated and magnetically shielded chamber. We additionally recorded electrical eye, muscle and cardiac activity with Ag/AgCl-electrodes in order to have a better estimate for endogenous artefacts. Online head localizations (Stolk et al., 2013) were used to navigate participants back to their original head position prior to the onset of a new experimental block if their movements exceeded 5 mm from their initial position. The initial head position from the first recording day was saved so that participants could be navigated back to their initial head position also during the second recording day. This assured comparable head positions of each participant across sessions. Five malfunctioning channels were either not recorded or excluded from further analysis for all participants. To further control for eye movement artefacts, eye movements were tracked with an MEG-compatible EyeLink 1000 Long Range Mount system (SR Research, Osgoode, Canada).
We analyzed reaction time data using R (R Core Team, 2014; RRID: SCR_001905) and RStudio (RStudio Inc., Boston, USA; RRID: SCR_000432). Trials with reaction times longer than three standard deviations were excluded from analysis. Due to the right-skewed nature of reaction times, reaction time data were first log-transformed and then standardized across all trials from each participant.
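The reaction time preprocessing could be sketched as follows. The original analysis was run in R; this Python version is only an illustration, and two details are assumptions because the text does not specify them: the sample standard deviation is used for standardization, and the three-standard-deviation criterion is applied to the standardized log-transformed values.

```python
import math
import statistics

def standardize_log_rts(rts_ms):
    """Log-transform reaction times, then z-score them within one participant."""
    logs = [math.log(rt) for rt in rts_ms]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)  # sample SD (assumption; not stated in the text)
    return [(x - mean) / sd for x in logs]

def keep_mask(z_scores, cutoff=3.0):
    """True for trials whose standardized log-RT is at most `cutoff` SDs above the mean."""
    return [z <= cutoff for z in z_scores]
```

Log-transforming before standardizing reduces the influence of the long right tail, so that fewer merely slow (rather than aberrant) responses exceed the cutoff.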
All other data were analyzed using MATLAB R2016b with FieldTrip (Oostenveld et al., 2011; RRID: SCR_004849), the MEG and EEG Toolbox Hamburg (METH, Guido Nolte; RRID: SCR_016104), or custom-made scripts. The physiological continuous recording of each session was first cut into epochs of variable length. Each trial was cut from 1,250 ms before stimulus movement onset to 1,250 ms after offset of the reappeared stimulus. Trial length therefore varied between 4,717 and 6,183 ms. To ensure that the timing in every trial was exactly as intended (e.g., unaffected by short movement interruptions of the stimulus), we removed trials in which the MEG marker timings differed from the intended timing of the moving stimulus by at least one frame (17 ms). Thus, we excluded on average 1.2 trials in each participant and each session (range: 0 – 24 trials).
Moreover, trials containing strong muscle artifacts or jumps were detected by semi-automatic procedures implemented in FieldTrip and excluded from analysis. The remaining trials were filtered with a high-pass filter at 0.5 Hz, a low-pass filter at 170 Hz, and three band-stop filters at 49.5–50.5 Hz, 99.5–100.5 Hz and 149.5–150.5 Hz and subsequently down-sampled to 400 Hz.
We performed an independent component analysis (infomax algorithm) to remove components containing eye-movements, muscle, and cardiac artefacts. Components were identified by visual inspection of their time course, variance across samples, power spectrum, and topography (Hipp and Siegel, 2013). On average, 25.7 ± 8.6 components were rejected in each participant and each session. All trials were again visually inspected and trials containing artefacts that were not detected by the previous steps were removed.
As a final step, using procedures described by Stolk et al. (2013) and online (http://www.fieldtriptoolbox.org/example/how_to_incorporate_head_movements_in_MEG_analysis/), we identified trials in which the head position of the participant differed by 5 mm from the mean circumcenter of the head position from the whole session (on average: 2.6 trials per participant and session, range: 0 – 86 trials) and excluded them from further analysis. On average, 670.2 ± 26.7 of the 720 trials per participant and session remained after pre-processing.
Quantification and statistical analysis
In the current experiment, we introduced a control condition that was physically identical to our temporal prediction tasks (until reappearance in the tactile condition) in order to account for processes that are not directly related to temporal predictions. Hence, for most of our statistical analyses, we were interested in comparing each of the two temporal prediction tasks with the working memory control task, and not in comparing the two temporal prediction tasks with each other. Therefore, instead of computing an analysis of variance across all three conditions, we directly computed two separate t-tests for the comparison of the visual or the tactile temporal prediction with the working memory task, respectively, and accounted for multiple comparisons by adjusting the alpha level.
Psychometric curve
We did not provide participants with feedback about the correctness of their response. Hence, participants responded within their individual framework of a “subjectively correct” reappearance timing or a “subjectively equal” luminance of the stimulus, respectively. To obtain these subjective points of “right-on-time” (ROT) in the temporal prediction tasks or the “points of subjective equality” (PSE) in the working memory task, we fitted a psychometric curve to the behavioral data of each participant from all trials in each condition. First, for each timing difference or luminance difference, respectively, we computed the proportion of “too late” or “brighter” answers for each participant. Then, we fitted a binomial logistic regression (psychometric curve) using the glmfit.m and glmval.m functions provided in MATLAB. The fitted timing or luminance difference value at 50% proportion “too late” or “brighter” answers was determined as ROT or PSE for each participant, respectively. To test for a significant bias towards one of the answers, we tested the ROT or PSE from all participants against zero using one-sample t-tests (α = .05 / 3 = .017). The steepness of the psychometric function was computed as the reciprocal of the difference between fitted timing or luminance difference values at 75% and 25% proportion “too late” or “brighter” answers, respectively.
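As an illustration of how ROT (or PSE) and steepness follow from a fitted logistic curve, the sketch below fits a two-parameter binomial logistic regression by Newton's method and reads off the 25%, 50% and 75% points. It is not the MATLAB code used in the study; the function names, the use of response counts per stimulus level as input, and the fixed number of iterations are assumptions made for the example.

```python
import math

def fit_logistic(x, k, n, n_iter=25):
    """Fit p(x) = 1 / (1 + exp(-(b0 + b1*x))) to binomial counts by Newton's method.

    x: stimulus differences, k: number of "too late" answers, n: trials per level.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ki, ni in zip(x, k, n):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = ni * p * (1.0 - p)   # binomial variance weight
            r = ki - ni * p          # observed minus expected count
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def threshold(b0, b1, p):
    """Stimulus difference at which the fitted curve reaches proportion p."""
    return (math.log(p / (1.0 - p)) - b0) / b1

def rot_and_steepness(b0, b1):
    """ROT/PSE (50 % point) and steepness (reciprocal of the 25-75 % distance)."""
    rot = threshold(b0, b1, 0.5)
    steepness = 1.0 / (threshold(b0, b1, 0.75) - threshold(b0, b1, 0.25))
    return rot, steepness
```

For a logistic curve, the 50% point equals −b0/b1 and the distance between the 25% and 75% points equals 2·ln(3)/b1, so the steepness defined above is proportional to the slope coefficient b1.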
Linear model
To test whether reaction times were dependent on the timing difference of the reappearing stimulus in each task, we averaged across all luminance differences within each timing difference bin in each condition. We then utilized a second-order (quadratic) polynomial regression model with timing difference as predictor for reaction times and computed the first- and second-order coefficients for each participant in each condition. The coefficients from all participants were then tested against zero using one-sample t-tests in all conditions (α = .05 / 3 = .017).
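A minimal sketch of this two-stage analysis is given below: a quadratic least-squares fit per participant, followed by a one-sample t statistic on the coefficients across participants. The function names are illustrative, and expressing the timing differences in seconds (to keep the normal equations well conditioned) is a choice made for this example only.

```python
import math
import statistics

def quadratic_fit(x, y):
    """Least-squares fit of y = c0 + c1*x + c2*x**2; returns (c0, c1, c2)."""
    # Build the 3x3 normal equations from power sums (augmented with the right-hand side).
    s = [sum(xi ** p for xi in x) for p in range(5)]
    b = [sum(yi * xi ** p for xi, yi in zip(x, y)) for p in range(3)]
    a = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, 3):
            f = a[row][col] / a[col][col]
            a[row] = [v - f * u for v, u in zip(a[row], a[col])]
    # Back substitution.
    c = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        c[i] = (a[i][3] - sum(a[i][j] * c[j] for j in range(i + 1, 3))) / a[i][i]
    return tuple(c)

def one_sample_t(values):
    """t statistic for testing the mean of `values` against zero."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))
```

In use, `quadratic_fit` would be applied to each participant's binned reaction times, and `one_sample_t` to the resulting first-order and second-order coefficients separately, each compared against the adjusted alpha level.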
Spectral power
We decomposed the MEG recordings into time-frequency representations by convolving the data with complex Morlet wavelets (Cohen, 2014). The recording of each trial and channel was convolved with 40 complex wavelets, logarithmically spaced between 0.5 and 100 Hz. With increasing frequency, the number of cycles for each wavelet logarithmically increased from two to ten cycles. For all analyses of the MEG data, we considered subjectively correct trials only, i.e., trials in which participants answered correctly based on their individual ROT. To correct for trial count differences between the tasks, we stratified the number of trials for each participant for the three different conditions by randomly selecting as many trials for each condition as the number available from the condition with lowest trial count.
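The wavelet parameter grid and the trial stratification described above could be set up as in the following sketch. The numbers (40 wavelets, 0.5 to 100 Hz, two to ten cycles) come from the text; the helper names and the seeded random generator are assumptions for illustration.

```python
import random

N_WAVELETS = 40

def log_space(start, stop, num):
    """`num` values evenly spaced on a logarithmic scale from start to stop (inclusive)."""
    ratio = (stop / start) ** (1.0 / (num - 1))
    return [start * ratio ** i for i in range(num)]

frequencies_hz = log_space(0.5, 100.0, N_WAVELETS)  # wavelet centre frequencies
n_cycles = log_space(2.0, 10.0, N_WAVELETS)         # wavelet width in cycles

def stratify(trials_by_condition, seed=0):
    """Randomly subsample every condition to the size of the smallest condition."""
    rng = random.Random(seed)
    n_min = min(len(t) for t in trials_by_condition.values())
    return {c: rng.sample(t, n_min) for c, t in trials_by_condition.items()}
```

Equalizing trial counts matters for phase-based measures in particular, because phase consistency estimates are positively biased when few trials are available.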
Since the temporal dependencies between the movement onset, disappearance behind the occluder and reappearance of the stimulus varied strongly between trials, averaging across trials would heavily smear the power estimates of the different stages within each trial. To obtain an estimate of spectral power modulations related to the different events in our experimental paradigm, we cut each trial further into four separate, partly overlapping windows (see Figure 2A): a “Baseline” window from −550 to −50 ms relative to movement onset; a “Movement” window from −50 to 950 ms relative to the movement onset; a “Disappearance” window from −350 to 950 ms relative to complete disappearance of the stimulus behind the occluder; and a “Reappearance” window from −350 to 450 ms relative to the (first frame) reappearance of the stimulus. Spectral power estimates were then averaged across all trials belonging to the same condition in each window and binned into time windows of 100 ms (centered on each full deci-second). All power estimates were normalized using the pre-stimulus baseline window from −500 to −200 ms relative to movement onset.
For all statistical analyses on sensor level, we first flipped all sensors of participants who saw the stimulus moving from right to left at the sagittal midline, i.e., the anterior-posterior axis. This made sure that lateralized activity due to the lateralized stimulation was comparable across groups. From this point on, we considered all participants as if for everyone the stimulus was moving from the left to the right side. Channels that did not have a counterpart on the opposite side were excluded from further analyses. In order to obtain an overview of the spectral power modulations related to the different events within the trials, we then averaged the power estimates across all channels and conditions (grand average) and tested each time-frequency pair of the Movement, Disappearance and Reappearance windows against the pre-stimulus baseline using paired-sample t-tests. We controlled for multiple comparisons by employing cluster-based permutation statistics as implemented in FieldTrip (Maris and Oostenveld, 2007). In this procedure, neighboring time-frequency bins with an uncorrected p-value below 0.05 are combined into clusters, for which the sum of t-values is computed. A null-distribution is created through permutations of data across participants (n = 1,000 permutations), which defines the maximum cluster-level test statistics and corrected p-values for each cluster. For each window, a separate cluster-permutation test was performed (α = .05; liberally chosen to observe all ongoing power modulations; see Results section).
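The logic of the cluster-based permutation procedure can be sketched in a simplified one-dimensional form. This is not the FieldTrip implementation: it clusters along a single dimension instead of the time-frequency plane, realizes the permutation of a paired comparison as a random sign flip of each participant's difference values, reports only the largest cluster, and expects the caller to supply the critical t value (for example, about 2.074 for 22 degrees of freedom at a two-sided p of 0.05). All function names are illustrative.

```python
import math
import random
import statistics

def t_values(diffs):
    """Paired-sample t value per bin; diffs[p][b] = condition minus baseline."""
    n = len(diffs)
    out = []
    for b in range(len(diffs[0])):
        col = [d[b] for d in diffs]
        out.append(statistics.mean(col) / (statistics.stdev(col) / math.sqrt(n)))
    return out

def max_cluster_sum(tvals, t_crit):
    """Largest absolute summed t over runs of adjacent, same-sign supra-threshold bins."""
    best = run = 0.0
    for t in tvals:
        if abs(t) > t_crit and (run == 0.0 or (t > 0) == (run > 0)):
            run += t        # extend the current cluster (or start one)
        elif abs(t) > t_crit:
            run = t         # sign change: start a new cluster
        else:
            run = 0.0       # below threshold: close the cluster
        best = max(best, abs(run))
    return best

def cluster_permutation_p(diffs, t_crit, n_perm=1000, seed=0):
    """Observed maximum cluster statistic and its permutation-corrected p-value."""
    rng = random.Random(seed)
    observed = max_cluster_sum(t_values(diffs), t_crit)
    count = 0
    for _ in range(n_perm):
        flipped = []
        for d in diffs:
            sign = rng.choice((-1.0, 1.0))  # one random sign per participant
            flipped.append([sign * v for v in d])
        if max_cluster_sum(t_values(flipped), t_crit) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)
```

Comparing the observed cluster against the distribution of the maximum cluster statistic under permutation is what controls the family-wise error rate across all bins with a single test.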
Since we were most interested in differences between the conditions during the disappearance time, we subsequently compared the spectral power estimates averaged within the beta range (13–30 Hz; see Results section) at each time point within the disappearance window and all channels from the visual or tactile temporal prediction task with the working memory task. We again employed cluster-permutation statistics, this time by clustering neighboring channels and time points. We used a one-sided α = .025 / 2 = .0125, since negative and positive clusters were tested separately, and to adjust for the two separate comparisons between the conditions (used throughout the study unless stated differently).
To estimate spectral power in source space, we computed separate leadfields for each recording session and participant based on each participant’s mean head position in each session and individual magnetic resonance images. We used the single-shell volume conductor model (Nolte, 2003) with a 5,003 voxel grid that was aligned to the MNI152 template brain (Montreal Neurological Institute, MNI; http://www.mni.mcgill.ca) as implemented in the METH toolbox. Cross-spectral density (CSD) matrices were computed from the complex wavelet-convolved data in steps of 100 ms in the same time windows as outlined above. To avoid biases in source projection, common adaptive linear spatial filters (DICS beamformer; Gross et al., 2001) pointing into the direction of maximal variance were computed from CSD matrices averaged across all time bins and conditions for each frequency.
All time-frequency resolved CSD matrices were then multiplied with the spatial filters to estimate spectral power in each of the 5,003 voxels and normalized with the pre-stimulus baseline window. Analogous to sensor space, we first flipped all voxels at the y-axis (anterior-posterior axis) for the half of participants that saw the stimulus moving from right to left prior to further statistical analysis. We then averaged across all time bins within the disappearance window and utilized cluster-based permutation statistics to identify clusters of voxels that showed a statistically significant difference in beta power between each of the temporal prediction tasks and the working memory task.
Inter-trial phase consistency
We computed ITPC estimates from the complex time-frequency representations obtained from the wavelet convolution as described in the Spectral power section above. In each time sample and trial, the phase of the complex data was extracted (using the function angle.m in MATLAB). ITPC was then computed across all subjectively correct and stratified trials within each of the four time windows in all frequencies as $\mathrm{ITPC}_{tf} = \left| n^{-1} \sum_{r=1}^{n} e^{i k_{tfr}} \right|$, where n is the number of trials and k the phase angle in trial r at time-frequency point tf (Cohen, 2014). In other words, ITPC is the length of the mean vector from all phase vectors with length 1 across all trials at a given time-frequency point. Values for ITPC can vary between 0 and 1, where 0 means that at a given time-frequency point there is no phase consistency across trials at all and 1 means all trials show the exact same phase. Similar to spectral power, we averaged ITPC estimates again in bins of 100 ms and plotted all time windows averaged across all channels and conditions to obtain a general overview of ITPC estimates at all events during the trial.
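As a minimal sketch of this definition, ITPC at a single time-frequency point can be computed from the per-trial phase angles as the length of the mean unit vector. The function name and the example phase values are hypothetical and serve only to illustrate the two extreme cases.

```python
import cmath
import math

def itpc(phases):
    """Length of the mean unit phase vector across trials.

    phases: phase angles in radians, one per trial, at one time-frequency point.
    Returns a value between 0 (no consistency) and 1 (identical phase).
    """
    n = len(phases)
    mean_vector = sum(cmath.exp(1j * k) for k in phases) / n
    return abs(mean_vector)

# Hypothetical examples: identical phases versus phases in exact opposition.
aligned = [0.5] * 8
opposed = [0.0, math.pi] * 4
itpc_aligned = itpc(aligned)   # close to 1
itpc_opposed = itpc(opposed)   # close to 0
```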
Since we were most interested in ITPC related to stimulus disappearance behind the occluder, we subsequently computed ITPC in a longer time window from −1,900 ms to 1,900 ms centered around time of complete stimulus disappearance behind the occluder. Thus, we took advantage of the fact that the onset of other events within each trial, such as the movement onset and the reappearance of the stimulus, strongly jittered across all trials and strong contributions of these events to ITPC could thereby be reduced (see Figure S4). For statistical analysis, we first averaged ITPC estimates within a frequency band of 0.5 to 3 Hz (see Results) and then computed cluster-based permutation statistics across all 100 ms time bins within the 3,800 ms long window and all sensors between each of the temporal prediction tasks and the working memory task.
ITPC on source level was computed using the same leadfields and common beamformer filters as for spectral power (see above). The complex time-frequency representations obtained from the wavelet convolution within the 3,800 ms long window on sensor level were multiplied with the filters to obtain the time-frequency representations in each of the 5,003 voxels. ITPC was computed for each time sample and frequency and then averaged within the time window showing a statistically significant difference between the temporal prediction tasks and the working memory task on sensor level and within the pre-defined frequency band of 0.5 to 3 Hz. Cluster-based permutation statistics were employed to identify clusters of voxels showing statistically significant differences in ITPC between the conditions on source level.
Correlations between condition-wise source level ITPC estimates and the steepness of each individual’s psychometric function were computed using Pearson correlations in each of the 5,003 voxels within the grid. For this analysis, we averaged ITPC estimates from time bins of 0 to 1,500 ms with respect to the disappearance of the stimulus within the pre-defined delta band of 0.5 to 3 Hz. Multiple comparisons were accounted for by using cluster-based permutation statistics as implemented in FieldTrip (α = .025 / 3 = .008).
Delta phase clustering at ROT
To determine whether each participant’s subjective ROT was associated with a specific phase in the delta band, we extracted the phase at each individual’s ROT from sensors showing the strongest ITPC effect and computed the distance from this distribution to a uniform distribution over all possible phases. The procedure was as follows.
For this analysis, we only considered trials in which the stimulus reappeared later than each individual’s ROT and in which the participant’s answer was subjectively correct. In this way, we prevented possible phase distortions caused by external stimulation prior to or at ROT. Trials were again stratified across conditions. Moreover, to also reduce activity related to external stimulation after each individual’s ROT, we first aligned all trials from the same condition to the time point of stimulus reappearance, computed the average across trials (event-related field, ERF) and subtracted the ERF caused by the reappearance from all trials in that condition. Subsequently, in each trial we centered a 2,500 ms long window on each participant’s ROT, computed a complex wavelet convolution for all frequencies between 0.5 and 3 Hz (14 frequencies; same procedure and frequencies as above) in all channels, and computed the mean phase angle at ROT, i.e., the center time bin, across all considered trials in each condition. This procedure is similar to computing ITPC as described above, except that the angle of the mean phase vector is extracted instead of its length. Since we did not have an estimate of each individual’s ROT for the working memory task, we applied the estimate of ROT from the visual prediction task to the working memory trials as well; because the two tasks were physically identical in appearance, temporal predictions should also be equal.
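The mean phase angle can be sketched as the angle of the mean unit vector, which, unlike an arithmetic mean of the angles, handles the wrap-around at ±180° correctly. The function name and example values below are hypothetical.

```python
import cmath
import math

def mean_phase_angle(phases):
    """Angle (radians, in [-pi, pi]) of the mean unit phase vector across trials."""
    mean_vector = sum(cmath.exp(1j * k) for k in phases) / len(phases)
    return cmath.phase(mean_vector)

# Hypothetical example: two trials with phases of 160 and -170 degrees.
# They lie 30 degrees apart across the wrap-around, so the circular mean
# is 175 degrees, whereas the arithmetic mean would be -5 degrees.
trial_phases = [math.radians(160.0), math.radians(-170.0)]
mean_deg = math.degrees(mean_phase_angle(trial_phases))
```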
As a next step, from the result of the cluster-based permutation statistics on ITPC estimates described above, we determined the sensors that showed the strongest ITPC effect for the two contrasts between the temporal prediction tasks and the working memory task for a time window between 0 and 1,500 ms after disappearance behind the occluder. For the contrast between the visual prediction and the working memory task, we considered the sensors showing the top 20% of t-values (37 channels). To keep the number of sensors comparable, we also considered the top 37 sensors from the contrast of the tactile prediction task against working memory.
Within these channels, for each individual participant we determined the frequency within the 0.5 to 3 Hz delta band that showed the strongest ITPC for the visual or the tactile prediction as compared to the working memory task, respectively, in the same time window of 0 to 1,500 ms. To determine the frequencies for the working memory condition as well, we extracted the frequencies showing the strongest estimates of ITPC in the working memory as compared to the visual temporal prediction task. For these individual frequencies, we plotted the phase angle at ROT (as described above) from all the considered channels and all participants in a histogram (in bins of 10°; see Figure 5). If there was no relationship between delta phase and each individual’s ROT, this histogram would show a uniform distribution across all possible phases (−180° to 180°). Therefore, we computed the distance from the observed phase distribution to a uniform distribution using a discrete and normalized version of the Kullback-Leibler distance, i.e., the modulation index (MI) (Tort et al., 2010).
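The modulation index of Tort et al. (2010) can be written as MI = (log N − H) / log N, where H is the Shannon entropy of the binned phase distribution and N the number of bins. The following is a minimal sketch for phase angles given in degrees and 10° bins; the function name and the example inputs are hypothetical.

```python
import math

def modulation_index(phases_deg, bin_width=10):
    """Normalized Kullback-Leibler distance between a phase histogram and
    a uniform distribution (0 = uniform, 1 = all phases in a single bin)."""
    n_bins = 360 // bin_width
    counts = [0] * n_bins
    for p in phases_deg:
        # Map the angle into [0, 360) and find its bin; the min() guards
        # against a floating-point result of exactly 360.0.
        idx = min(int(((p + 180.0) % 360.0) // bin_width), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    probs = [c / total for c in counts]
    entropy = -sum(q * math.log(q) for q in probs if q > 0)
    return (math.log(n_bins) - entropy) / math.log(n_bins)

# Hypothetical examples: one phase per bin (uniform) versus identical phases.
uniform_phases = [-175.0 + 10.0 * i for i in range(36)]
clustered_phases = [42.0] * 50
mi_uniform = modulation_index(uniform_phases)      # close to 0
mi_clustered = modulation_index(clustered_phases)  # equal to 1
```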
For statistical analysis, we repeated the same procedure as described above 10,000 times, randomly picking any frequency from the 14 frequencies within the 0.5 to 3 Hz band in each repetition. In this way, we obtained a distribution of surrogate MI estimates (still based on real data from all individual participants), from which we computed the percentile of the MI that was observed using the individually strongest ITPC frequency. MI estimates above the 95th percentile were considered significantly stronger than the randomly obtained surrogate MIs (p-value = 1 − percentile).
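The final step, converting the observed MI into a p-value against the surrogate distribution, can be sketched as follows. The percentile is taken here as the fraction of surrogate values strictly below the observed MI, which is an assumption about the exact definition; the function name and the toy surrogate values are hypothetical.

```python
def surrogate_p_value(observed_mi, surrogate_mis):
    """p-value = 1 - percentile of the observed MI within the surrogate MIs."""
    below = sum(1 for s in surrogate_mis if s < observed_mi)
    percentile = below / len(surrogate_mis)
    return 1.0 - percentile

# Toy surrogate distribution (assumption): 100 evenly spaced values from 0.00 to 0.99.
toy_surrogates = [i / 100 for i in range(100)]
p_toy = surrogate_p_value(0.975, toy_surrogates)  # 98 of 100 values lie below
```

With 10,000 surrogates, as in the analysis described above, the smallest p-value that can be resolved this way is 1/10,000.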
Author Contributions
Conceptualization, J.D., A.K.E., P.W., A.M.; Methodology, J.D., A.K.E.; Software, J.D.; Formal Analysis, J.D.; Investigation, J.D.; Writing – Original Draft, J.D.; Writing – Review & Editing, A.K.E., P.W., A.M., D.Z.; Visualization, J.D.; Funding Acquisition, A.K.E.; Supervision, A.K.E.; Project Administration, A.K.E., D.Z.; Resources, A.K.E.
Declaration of Interests
The authors declare no competing interests.
Acknowledgements
We thank Florian Göschl, Tessa Rusch, Marina Fiene and Guido Nolte for valuable discussions. This work was funded by grants from the DFG (SFB TRR 169/B1 and SFB 936/A3 to A.K.E.).