Abstract
The somatic marker hypothesis proposes that cortical representation of visceral signals is a crucial component of emotion processing. No previous study has investigated the causal relationship among brain regions involved in visceral information processing during emotional perception. In this magnetoencephalography (MEG) study of 32 healthy subjects, the heartbeat evoked potential (HEP), which reflects the cortical processing of heartbeats, was modulated by the perception of emotional faces. The modulation effect was localized in the prefrontal cortices, globus pallidus, and an interoceptive network including the right anterior insula (RAI) and right anterior cingulate cortex (RACC). Importantly, our Granger causality analysis provides the first evidence for an increased causal flow of heartbeat information from the RAI to the RACC during perception of sad faces. Moreover, this HEP modulation effect was neither an artifact nor an effect of the visual evoked potential. These findings represent important progress in the understanding of brain-body interaction during emotion processing.
Introduction
According to the James-Lange theory and the somatic marker hypothesis, emotional feelings are the mental experience of bodily states (A. Damasio & Carvalho, 2013; James, 1884). More specifically, emotional stimuli usually induce a change in bodily status (Williams et al., 2005). Various feelings then emerge from the perception of bodily status, including visceral sensation (A. Damasio & Carvalho, 2013). Many previous studies have provided evidence of somatic responses evoked by emotional stimuli. For example, a fearful stimulus enhances sympathetic responses such as heart rate elevation and skin conductance modulation (Critchley et al., 2005; Williams et al., 2005), and a disgusting stimulus induces tachygastria (N. A. Harrison, Gray, Gianaros, & Critchley, 2010). Moreover, neuroimaging studies using fMRI or EEG have shown that generation of bodily responses by an emotional stimulus is related to activation of subcortical regions such as the amygdala and hypothalamus (A. R. Damasio et al., 2000). On the other hand, there is little evidence for a change in cortical interoceptive processing during the experience of emotional feeling. Given that signals of internal organs cannot be identified without explicit measuring devices such as an electrocardiogram (ECG), it is difficult to determine which brain activity is directly evoked by interoceptive signals. Thus, previous studies explored changes in interoceptive processing only during the experience of emotional states and reported their relation to the anterior insula and anterior cingulate cortex (Adolfi et al., 2016; Critchley et al., 2005). The heartbeat evoked potential (HEP), which is obtained by averaging electrophysiological signals time-locked to heartbeats, has been reported to be associated with the perception of heartbeat (Pollatos & Schandry, 2004), pain (Shao, Shen, Wilder-Smith, & Li, 2011), and feeling empathy (Fukushima, Terasawa, & Umeda, 2011).
Moreover, HEP amplitude is attenuated in mood-related psychiatric disorders, including depression (Terhaar, Viola, Bär, & Debener, 2012) and borderline personality disorder (Müller et al., 2015), suggesting a potential link between HEP and aberrant emotional processing.
Based on these theories and indirect evidence, we hypothesized that HEP would be associated with the feeling evoked by emotional stimuli and would be modulated during perception of emotional expression, such as a face or a text-based emoticon. We further hypothesized that the modulation effect would differ between positive and negative valences. To test this idea, we used emotional faces and emotional emoticons, which convey text-based emotion, to evoke emotional feeling while measuring the HEP with magnetoencephalography (MEG). To localize the source of HEP modulation precisely, T1-weighted structural magnetic resonance imaging (MRI) was acquired from all subjects. Importantly, we applied Granger causality analysis (Barnett & Seth, 2014) to the sources to identify information flow between the sources of HEP modulation.
We formulated the following specific hypotheses. First, we expected that HEP would be modulated by emotional expression and that this effect would appear as different spatiotemporal dynamics between emotional and neutral stimulus presentation. Second, the modulation of HEP by emotional expression would be localized in previously known interoceptive regions, such as the anterior insula and the anterior cingulate cortex, in source level analysis. Third, we expected that information flow between these interoceptive regions would be modulated by emotional expression. To be more specific, we expected that bottom-up heartbeat information processing from the anterior insula, which re-represents the viscerosensory information from the posterior insula, to the anterior cingulate cortex would be enhanced by emotional expression (Medford & Critchley, 2010). This pathway is hypothesized to be involved in generating emotional feeling (Medford & Critchley, 2010; Smith & Lane, 2015). Finally, we hypothesized that neural activity evoked by the heartbeat would have spatiotemporal patterns different from visual evoked cortical activity. That is, we expected that time-locking the MEG signals to the cardiac cycle onset would make the visual evoked effect disappear and vice versa.
Results
Sensor analysis
Six stimulus conditions (happy face, sad face, neutral face, happy emoticon, sad emoticon, and neutral emoticon) were presented to thirty-two participants during magnetoencephalography (MEG) recording. Preprocessing included independent component analysis (Hyvärinen, Karhunen, & Oja, 2004) to minimize the cardiac field artifact (CFA). We then compared the HEP between emotional and neutral conditions using a cluster-based permutation paired t test (Oostenveld, Fries, Maris, & Schoffelen, 2011) in the CFA-free time window of 455ms to 595ms post R peak (Gray et al., 2007; Müller et al., 2015). Four tests were performed: sad face vs. neutral face, happy face vs. neutral face, sad emoticon vs. neutral emoticon, and happy emoticon vs. neutral emoticon.
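As an illustration of how an HEP is obtained for the 455ms to 595ms post R peak window, the following minimal sketch averages one sensor's signal over epochs time-locked to R peaks. It is not the analysis pipeline used in this study: the function name, the 1000 Hz sampling rate, and the synthetic signal are assumptions made for the example, and neither ICA-based artifact removal nor the cluster-based permutation test is included.

```python
import numpy as np

def heartbeat_evoked_response(signal, r_peaks, sfreq, tmin=0.455, tmax=0.595):
    """Average a single-sensor signal over epochs time-locked to R peaks.

    signal     : 1-D array of samples from one sensor
    r_peaks    : sample indices of ECG R peaks
    sfreq      : sampling rate in Hz (assumed value in the demo below)
    tmin, tmax : analysis window after the R peak, in seconds
    """
    start = int(round(tmin * sfreq))
    stop = int(round(tmax * sfreq))
    # Keep only heartbeats whose window fits inside the recording.
    epochs = [signal[r + start:r + stop] for r in r_peaks
              if r + stop <= len(signal)]
    return np.mean(epochs, axis=0)

# Synthetic demo: noise plus a deflection 490-510 ms after each R peak.
rng = np.random.default_rng(0)
sfreq = 1000.0                              # hypothetical sampling rate
r_peaks = np.arange(1000, 60000, 900)       # hypothetical R peak samples
signal = rng.normal(0.0, 1.0, 61000)
for r in r_peaks:
    signal[r + 490:r + 510] += 2.0

hep = heartbeat_evoked_response(signal, r_peaks, sfreq)
```

In this demo the averaged response rises inside the window where the deflection was inserted, while the rest of the window averages toward zero.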
Significant difference in the HEP between sad face and neutral face perception
A HEP cluster showing a significant difference between the perception of a sad face and that of a neutral face was found in the right frontocentral sensors within a 488ms-515ms time range (Monte-Carlo p = 0.046). In other conditions, including happy face vs neutral face (Monte-Carlo p = 0.396), sad emoticon vs neutral emoticon (Monte-Carlo p = 0.857), happy emoticon vs neutral emoticon (Monte-Carlo p = 0.710), there were no significantly different clusters showing a different HEP amplitude. Using the ROI-TOI mask of time and sensors of significant clusters found in the sad face vs neutral face condition, we performed an additional paired t test between the happy face or emotional emoticon condition and the neutral condition. In these analyses, only the happy face showed a significant difference (t (31) = -2.12, p = 0.042). Furthermore, we performed a correlation analysis with the HEP modulation effect of a happy face and a sad face (averaged within ROI-TOI mask) with PHQ-9 score (Supplementary materials). The HEP modulation was significantly correlated with participants’ PHQ-9 score (Spearman’s rho = 0.377, p = 0.033) in the happy face. Moreover, the HEP modulation effect of the happy face was more significant in participants with a low PHQ-9 score (t(18) = -3.314, p = 0.004) than in the paired t test within the whole group (t (31) = -2.12, p = 0.042), while participants with a high PHQ-9 score did not show the HEP modulation effect (t(12) = 0.30, p = 0.766). The HEP modulation of the sad face was only marginally correlated with PHQ-9 (Spearman’s rho = 0.297, p = 0.099). Detailed information of the analysis of the relationship between mood and HEP modulation is provided in the Supplementary materials.
Although the emotionality scores for a sad face and a sad emoticon were not different (in paired-t test, p = 0.492, mean emotionality score of sad emoticon = 2.81, mean emotionality score of sad face = 2.71), the sad emoticon did not show HEP modulation even in this analysis (t (31) = -0.2506, p = 0.8038). The happy emoticon also did not show HEP modulation (t (31) = 0.048, p = 0.9724).
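The ROI-TOI mask analysis described above can be sketched as follows: the data of each condition are averaged over the sensors and time points of the significant cluster, and a paired t statistic is computed across subjects. This is only a sketch under stated assumptions; the array shapes, the function name, and the simulated data are hypothetical and do not reproduce the reported values.

```python
import numpy as np

def masked_paired_t(cond_a, cond_b, mask):
    """Paired t statistic on data averaged inside a sensor-by-time mask.

    cond_a, cond_b : arrays of shape (n_subjects, n_sensors, n_times)
    mask           : boolean array of shape (n_sensors, n_times)
    Returns the t statistic and its degrees of freedom.
    """
    a = cond_a[:, mask].mean(axis=1)   # one value per subject
    b = cond_b[:, mask].mean(axis=1)
    diff = a - b
    n = diff.size
    t = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
    return t, n - 1

# Simulated demo with 32 subjects, 4 sensors, and 10 time points.
rng = np.random.default_rng(1)
neutral = rng.normal(size=(32, 4, 10))
emotional = neutral + rng.normal(scale=0.1, size=neutral.shape)
mask = np.zeros((4, 10), dtype=bool)
mask[0:2, 3:7] = True                  # hypothetical ROI-TOI mask
emotional[:, mask] -= 0.5              # inject an effect inside the mask

t_value, dof = masked_paired_t(emotional, neutral, mask)
```

Because the mask is derived from one contrast and then applied to others, this test is less conservative than the cluster-based permutation test on the full sensor-time space.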
Unlike face perception, which is an innate process, emoticon perception is an acquired process. Moreover, it has more abstract information than facial expression which needs an interpretation or appraisal process. Therefore, perception and interpretation of emotional emoticons are more likely to vary with a person’s emotional state and emotional experience than perception of emotional faces. We performed an additional analysis controlling for individual differences in emotional experience like verbal abuse (detailed results and methods are provided in the Supplementary materials). Briefly, the HEP modulation effect of sad emoticons was found to be significantly affected by the degree of experience of peer verbal abuse (PeVA) in left central sensors (Monte-Carlo p = 0.015, n = 30). The HEP modulation effect from sad emoticons within this cluster appeared only in participants who had no history of peer verbal abuse (in paired t test using permutation method, Monte-Carlo p = 0.004, n = 10), while there was no significant HEP modulation in participants with verbal abuse history (Monte-Carlo p = 0.212, n = 20). Happy emoticons did not show any relationship with verbal abuse history (PeVA and PaVA) or mood (PHQ-9) (failed to form cluster in cluster based permutation regression analysis).
Source analysis
Interoceptive network and prefrontal-basal ganglia network as sources of HEP modulation in sad faces
To find the brain regions underlying modulation of the HEP in the sad face condition, source reconstruction of the MEG signal in the sad face and neutral face conditions was performed using the Brainstorm toolbox (Tadel, Baillet, Mosher, Pantazis, & Leahy, 2011). To estimate the time courses of both cortical and subcortical activity, we used the default settings in Brainstorm’s implementation of the Deep Brain Activity model using the minimum norm estimate (MNE) (Attal & Schwartz, 2013; Tadel et al., 2011). Then, a paired t test of the two conditions was conducted using SPM12 (Penny, Friston, Ashburner, Kiebel, & Nichols, 2011) on a spatial map that was averaged within the time range of 488ms-515ms after the R peak, in which the HEP differed significantly between the sad face and the neutral face. With a cluster-forming threshold of p = 0.01 and 10 adjacent voxels, several regions that had different HEP time courses between a sad face and a neutral face were identified. Four significant clusters appeared. Briefly, these clusters included the right prefrontal cortices, the anterior insula, the anterior cingulate cortex, and the basal ganglia. More specifically, the first cluster (red in Fig. 3, p = 0.003, cluster level family-wise error (FWE) corrected) included the right prefrontal regions that consisted of the right superior frontal gyrus (RSFG), which is close to the dorsomedial prefrontal cortex (dmPFC), and the middle frontal gyrus (RMFG), which corresponded to the dorsolateral prefrontal cortex (dlPFC). The second cluster (green in Fig. 3, p = 0.001, cluster level FWE corrected) included the right anterior insula (RAI) and the right putamen (RP). The third cluster (blue in Fig. 3, p = 0.003, cluster level FWE corrected) included the right anterior cingulate cortex (RACC) and the left anterior cingulate cortex (LACC). Finally, the fourth cluster (pink in Fig. 3, p < 0.001, cluster level FWE corrected) included the right basal ganglia, which consisted of the right globus pallidus (RGP) and the right putamen (RP). Previous studies reported the ACC and AI as sources of HEP (Couto et al., 2015; H.-D. Park et al., 2017; H.-D. Park, Correia, Ducorps, & Tallon-Baudry, 2014). Additionally, in a paired t test using the absolute value of time courses, right ventromedial prefrontal cortex (RvmPFC) and left orbitofrontal cortex (LOFC) clusters were found; these regions are also known to process interoceptive information, and a detailed result of this analysis is provided in the Supplementary materials. Furthermore, we performed the same analysis comparing a happy face and a neutral face, which showed the HEP modulation effect only in the paired t test using the ROI-TOI mask, to see whether the HEP modulation pattern induced by a happy face was similar to that induced by a sad face. With a cluster-forming threshold of p = 0.01 and 10 adjacent voxels, only a right insula cluster was found to differ between the happy face and the neutral face; it was located in the mid-dorsal position of the insula (MNI coordinate = (33, 8, 8), p < 0.001, cluster level FWE corrected).
Granger causalities between sources of HEP modulation
The source analysis of HEP induced by a sad face revealed the activation of regions that were previously known to be key regions of feelings, namely, AI and ACC (Medford & Critchley, 2010; Smith & Lane, 2015). More specifically, they were hypothesized to act as the input (AI) and the output component (ACC) of a system in which the ACC re-represents interoceptive information relayed from the AI (Medford & Critchley, 2010). We hypothesized that when a person processes a sad face, causal information flow from the RAI to the RACC would be increased, while that from the RACC to the RAI would remain at the baseline level. To test this hypothesis, we performed Granger causality (GC) analysis using the MVGC toolbox (Barnett & Seth, 2014) on these two regions in both the sad face and neutral face conditions. We used short time window GC estimation with a sliding window (Ding, Bressler, Yang, & Liang, 2000; Seth, Barrett, & Barnett, 2015), which is appropriate for analyses in which temporal precision is important, such as the HEP modulation effect. Pairwise GC was calculated between the RAI and the RACC from the [455ms 515ms] window to the [505ms 565ms] window. Then we compared the GCs of the sad face and neutral face conditions for all time windows using a cluster-based permutation paired t test. Results showed that only the GC of HEP from the RAI to the RACC was significantly increased in the sad face compared to the neutral face from 474ms to 568ms ([480ms 540ms] window to the [505ms 565ms] window, Monte-Carlo p = 0.014, cluster corrected for 2 GCs and 27 time windows). GC from the RACC to the RAI did not survive correction for multiple comparisons (Monte-Carlo p = 0.146, cluster from [493ms 553ms] to [501ms 561ms]). These results indicate that only bottom-up information flow from the RAI to the RACC was increased.
Additionally, we analyzed the GC change for a happy face between the right insula (found in the source analysis of the happy face) and the RACC to determine whether a happy face shows a GC change pattern similar to that of a sad face. The happy face GC analysis was performed in the same way as for the sad face, except that instead of the RAI, we used the peak coordinate of the right insula that corresponded to the mid-dorsal position of the insula, which was found in the source analysis of a happy face. The time course of the RACC was extracted from the same coordinate that was used in the sad face GC analysis because we assumed that even though there was no activation of the RACC in the happy face, there could be an information flow between those areas. Interestingly, the happy face showed a different pattern from the sad face, namely a top-down pattern of GC from the RACC to the right insula. GC from the RACC to the right insula was significantly increased compared to the neutral face from the [456ms 516ms] window to the [484ms 544ms] window (Monte-Carlo p = 0.013), while there was no significant cluster for GC from the right insula to the RACC (Monte-Carlo p = 0.191, cluster from [487ms 547ms] to [493ms 553ms]).
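The logic of pairwise, sliding-window Granger causality can be illustrated with a minimal sketch. It is not the MVGC toolbox implementation used in this study, which is written in MATLAB and estimates GC across trials; instead it fits ordinary least-squares autoregressive models to a single pair of simulated series. The model order, window length, and step are arbitrary assumptions, and all function names are hypothetical.

```python
import numpy as np

def _residual_variance(target, predictors):
    """Least-squares fit of target on predictors plus intercept."""
    design = np.column_stack([np.ones(len(target))] + predictors)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.var(target - design @ coef)

def pairwise_gc(source, target, order=2):
    """Time-domain GC source -> target: ln(var_restricted / var_full)."""
    n = len(target)
    y = target[order:]
    own_lags = [target[order - k:n - k] for k in range(1, order + 1)]
    src_lags = [source[order - k:n - k] for k in range(1, order + 1)]
    restricted = _residual_variance(y, own_lags)          # own past only
    full = _residual_variance(y, own_lags + src_lags)     # plus source past
    return np.log(restricted / full)

def sliding_gc(source, target, win_len, step, order=2):
    """GC estimated in successive windows of win_len samples."""
    starts = range(0, len(target) - win_len + 1, step)
    return np.array([pairwise_gc(source[s:s + win_len],
                                 target[s:s + win_len], order)
                     for s in starts])

# Simulated demo: x drives y with a one-sample delay, not the reverse.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * x[t - 1] + 0.2 * y[t - 1] + 0.5 * rng.normal()

gc_xy = pairwise_gc(x, y)      # large: the past of x improves prediction of y
gc_yx = pairwise_gc(y, x)      # near zero
windows = sliding_gc(x, y, win_len=500, step=250)
```

In the simulated data the GC from x to y is large while the reverse direction is near zero, which is the kind of directional asymmetry the analysis above tests between the RAI and the RACC.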
Analysis of physiological data
Heart rate and ECG R peak amplitude modulation in each condition
To determine whether the effect of the different HEP amplitude originated from different heart-related physiological statuses, the heart rate and ECG R peak amplitude were compared between the conditions using one-way RANOVA. There were no significant differences between conditions in either heart rate (F (1.516, 46.982) = 2.367, p = 0.118, Greenhouse-Geisser corrected) or heartbeat amplitude (F (2.989, 92.658) = 0.958, p = 0.416, Greenhouse-Geisser corrected).
Heartbeat distribution in each condition
To rule out the possibility that our significant HEP modulation effect arose because heartbeats occurred more or less often in a specific time window of visual evoked cortical processing, the distribution of heartbeat occurrences within each time window of visual processing was analyzed. A two-way 6*9 RANOVA (six conditions and nine 100ms time bins, from -200ms to 700ms relative to visual stimulus onset) showed no difference in the occurrence rate of heartbeats either among the conditions (F (3.810, 118.111) = 0.783, p = 0.533, Greenhouse-Geisser corrected) or among the time bins (F (4.876, 151.143) = 1.540, p = 0.182, Greenhouse-Geisser corrected).
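The binning step that precedes such an analysis can be sketched as follows: R peak latencies are expressed relative to each stimulus onset and counted in nine 100ms bins from -200ms to 700ms. The function name and the example times are hypothetical, and the repeated-measures ANOVA itself is not shown.

```python
import numpy as np

def heartbeat_bin_counts(r_peak_times, stim_onsets,
                         tmin=-0.2, tmax=0.7, n_bins=9):
    """Count R peaks in equal-width bins relative to each stimulus onset.

    r_peak_times, stim_onsets : times in seconds
    Returns an integer array of shape (n_bins,), summed over stimuli.
    """
    edges = np.linspace(tmin, tmax, n_bins + 1)
    r_peak_times = np.asarray(r_peak_times, dtype=float)
    counts = np.zeros(n_bins, dtype=int)
    for onset in stim_onsets:
        latencies = r_peak_times - onset
        # R peaks outside [tmin, tmax] are ignored by np.histogram.
        counts += np.histogram(latencies, bins=edges)[0]
    return counts

# Hypothetical example: two stimuli and five R peaks (seconds).
counts = heartbeat_bin_counts([0.85, 1.25, 1.95, 2.65, 5.0], [1.0, 2.0])
```

Computing such counts per condition and per subject yields the 6*9 table of occurrence rates that enters the RANOVA.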
Analysis to exclude the effect of visual processing
Surrogate R peak analysis on HEP modulation effect of a sad face
To test whether the HEP modulation effect is time-locked to the heartbeat, we created 100 sets of surrogate heartbeats that were independent of the original heartbeats (H.-D. Park et al., 2016; H.-D. Park et al., 2014). We then computed the surrogate HEP from the surrogate heartbeats and performed the same cluster-based permutation t test between the conditions that showed a significant difference in the sensor level analysis. For each surrogate, the maximum summed cluster t statistic of the difference between sad and neutral faces was calculated, and these values were used to build a permutation distribution. Our HEP modulation effect size (maximum cluster t statistic) was significant with respect to that distribution (Monte-Carlo p<0.03), indicating that our effect was highly likely to be locked to the heartbeat.
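The comparison of an observed statistic against a surrogate distribution can be sketched as below. Two points are assumptions rather than details taken from this study: surrogate heartbeats are generated here by shifting all R peaks by one common random offset, and the Monte-Carlo p value uses the common "+1" correction. Both function names are hypothetical.

```python
import numpy as np

def surrogate_r_peaks(r_peaks, max_shift, rng):
    """Shift every R peak by one common random offset (assumed scheme)."""
    shift = rng.integers(-max_shift, max_shift + 1)
    return np.asarray(r_peaks) + shift

def monte_carlo_p(observed, surrogate_stats):
    """Share of surrogate statistics at least as large as the observed one,
    with a +1 correction in numerator and denominator (assumed convention)."""
    surrogate_stats = np.asarray(surrogate_stats)
    exceed = np.sum(surrogate_stats >= observed)
    return (exceed + 1) / (surrogate_stats.size + 1)

# Hypothetical example with 100 surrogate cluster statistics.
rng = np.random.default_rng(0)
shifted = surrogate_r_peaks([100, 900, 1700], max_shift=300, rng=rng)
surrogate_stats = np.arange(100)       # placeholder values 0..99
p_value = monte_carlo_p(98.5, surrogate_stats)
```

With a common shift the inter-beat intervals are preserved, so the surrogate HEP retains the rhythm of the cardiac signal but loses its alignment to the actual heartbeats.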
Different spatial pattern between visual evoked potential analysis (VEP) results and HEP analysis results in sad face perception
Finally, to exclude the possibility that HEP modulation was confounded by neural activity reflecting visual processing, we analyzed visual evoked potential data with a cluster-based permutation t test. Two significant clusters at 73ms-198ms (Monte-Carlo p = 0.05) and 815ms-948ms (Monte-Carlo p = 0.004) after stimulus were found. However, their topological distribution was totally different from the HEP effect. Furthermore, in the source analysis of VEP (which was performed with the same method as the HEP source analysis), two clusters, which included the right ventromedial prefrontal cortex cluster and right cuneus cluster, were found at 73ms-198ms (cluster-forming threshold = 0.01, minimum number of adjacent significant voxels = 10, cluster level p-value < 0.001 in both clusters). Two clusters were also found in the 815ms-948ms time window, which included the left cuneus cluster and the right cuneus cluster (p < 0.001, p = 0.042 each with the same cluster definition of the earlier cluster). Finally, the paired t test of absolute time courses revealed that the right fusiform/middle temporal gyrus cluster (cluster level p < 0.001) and the right angular gyrus/superior temporal gyrus cluster (cluster level p = 0.043, FWE corrected) were more activated in the sad face of the 73ms-198ms window, while a supplementary motor area cluster (cluster level p-value = 0.003, FWE corrected) was found in the late time window (815ms-948ms). Detailed information on the absolute time courses, including statistical information, is provided in the Supplementary materials.
Discussion
Our findings provide direct and strong evidence that perception of an emotional, especially sad, face modulates the interoceptive information processing in the cortex.
First, we showed that cortical heartbeat processing after presentation of a sad face has significantly different spatiotemporal dynamics compared with a neutral face and these differences are localized in the interoceptive network (AI, ACC, vmPFC), basal ganglia (GP, Putamen) and prefrontal areas (MFG, SFG). Importantly, results of the GC analysis of these regions showed that bottom up heartbeat information processing from the RAI to the RACC was increased in sad face condition. Moreover, although it might not be statistically rigorous, a happy face showed the HEP modulation effect in ROI-TOI mask analysis. Consistent with this result, the HEP modulation effect was localized in the right insula and the top down GC from the RACC to the right insula was increased in the GC analysis for happy face condition.
In contrast to the HEP results, visual-locked activity differed in bilateral visual information processing areas, including the bilateral cuneus and fusiform face area, and in other areas, including the ventromedial prefrontal cortex and supplementary motor area. Interestingly, the only region that overlapped between the HEP and VEP was the vmPFC, which is a key region in the somatic marker hypothesis (Bechara et al., 2005). Finally, surrogate R peak analysis provided strong evidence that our result was a consequence of cortical heartbeat processing modulation. Additionally, analysis of physiological data and cardiac artifact removal using ICA ruled out the possibility that other physiological effects influenced the cortical signal.
Our results go beyond previous studies of interoception and emotion in several aspects.
The results of our sensor analysis showed that a sad face evoked a different spatiotemporal pattern of HEP in the right frontal and central sensors at a time window centered about 500ms after the R peak, which was different from the pattern for a neutral face. Two recent studies with electroencephalography (EEG) reported HEP modulation by emotional stimuli. One study using visual and auditory stimuli showed the HEP modulation by high-arousal mood induction in left parietal clusters at 305 to 360ms after the R peak and the right temporoparietal cluster at 380 to 460ms after the R peak (Luft & Bhattacharya, 2015). They summed both positive and negative emotional valence conditions to show the arousal effect which is not specific to emotional feeling. Another recent EEG study in 5-month-old infants reported that a video clip of an angry or fearful face increased HEP at 150 to 300ms after R peak in the frontal cluster (Maister, Tang, & Tsakiris, 2017). While both happy and sad stimuli are low-arousal emotions (Luft & Bhattacharya, 2015), the second study showed neither a significant cluster in the contrast of happiness vs. neutral conditions nor applied sad stimuli. HEP modulations in these two studies more likely reflected the representation of body signals induced by arousal than the conceptualizing process of specific emotions. In addition, neither of these studies demonstrated the underlying neural mechanism of this modulation effect based on source analysis. Considering that the bodily signals not only need to be recognized in the brain but also to be conceptualized to make emotional feeling (Smith & Lane, 2015), we contend that our results of HEP modulation by a sad face especially increasing GC from the RAI to the RACC reflects the conceptualization process of interoceptive signals that are related to specific emotional feelings, not just a representation of bodily arousal signals. 
First, the ACC is thought to be related to forming an emotional concept based on the whole body representation of the anterior insula (Medford & Critchley, 2010; Smith & Lane, 2015), and the representation of body signals in the anterior insula is known to be formed by processing of the primary interoceptive representation in the posterior insula (Craig, 2009; Smith & Lane, 2015). In our main result, the HEP modulation effect found in the RAI and the RACC was not found in the right posterior insula (RPI). Furthermore, additional pairwise GC analysis on three regions of interest, the RPI, RAI, and RACC, confirmed increased GC from the RAI to the RACC, but failed to show any modulation of GC from the RPI to the RAI by a sad face (details of the analysis are described in the Supplementary materials). These results suggest that the HEP modulation effect in a sad face might not be related to primary interoceptive processing of physiological arousal; rather, it is likely to be related to later processing from the anterior insula to the anterior cingulate. Moreover, the spatiotemporal dynamics of our results are later than HEP modulation both by a heartbeat perception task, which typically appears 250ms ~ 350ms after the R peak (Pollatos & Schandry, 2004), and by arousal, which appeared before 460ms post R peak (Luft & Bhattacharya, 2015). Finally, one fMRI study that induced a sad feeling state showed increased functional connectivity between the RAI and the RACC, which also supports our results on HEP modulation by a sad face (B. J. Harrison et al., 2008).
Considering these aspects, our results are not likely to be caused only by representation of primary arousal signals and are likely to be related to later processing, which we suggest is related to conceptualization and creation of individual emotional feeling.
In contrast to a sad face, a happy face showed HEP modulation only in the ROI-TOI mask, which appears much weaker than the HEP modulation by a sad face, so it is hard to conclude whether HEP modulation existed. However, considering the results of the source analysis, which localized the HEP modulation effect in the right mid-dorsal portion of the insula, and the results of the GC analysis, which showed an increased GC from the RACC to the right insula, we suggest that there is a HEP modulation effect for a happy face, although the pattern of modulation differs from that of a sad face. This result is also supported by a previous fMRI study of happy facial expression that showed activation at a location similar to that in our study (Pohl, Anders, Schulte-Rüther, Mathiak, & Kircher, 2013). Furthermore, previous theories hypothesized that depression is related to interoceptive prediction (Paulus & Stein, 2010), and the HEP differed between depressed and nondepressed groups during heartbeat perception tasks (Terhaar et al., 2012). In our results, HEP modulation by happy faces seemed to be influenced largely by a person’s mood state, which was reflected in a significant correlation between HEP modulation in the ROI-TOI mask and the PHQ-9 score, while HEP modulation by a sad face had only a marginal correlation. Considering previous studies and our results, it seems that a depressive mood largely affects interoceptive processing related to happy faces, which might have caused the weak HEP modulation effect in our results.
Our results showing both a significant relationship of the HEP difference between a sad emoticon and a neutral condition with the peer verbal abuse score across all subjects and a significant difference of the HEP difference in participants with no peer verbal abuse history suggest that the perception of a sad emoticon also modulates the interoceptive information processing that is influenced by subjective emotional experience such as peer verbal abuse. Although this might not be a complete explanation for the lack of the HEP modulation effect with a sad emoticon in contrast to a sad face, considering that there are not many studies related to the effect of emotional experience on responses to emoticons, our results might provide some clue to the effect of a person’s emotional experience on their emotional response to acquired (not innate) emotional stimuli. Further studies are needed. Finally, happy emoticons have shown neither the HEP modulation effect nor a relationship to psychological scores, like verbal abuse or mood. However, it might be too hasty to conclude that a happy emoticon does not modulate HEP considering that we did not control for positive emotional experience, which might influence perception of a happy emoticon, in a way similar to the influence of negative emotional experience on sad emoticon perception. Future study investigating influence of positive emotional experience on happy emoticon perception would clarify this point.
In source analysis of HEP and VEP modulation by a sad face, we found that there are two clearly distinct systems of processing, cardiac information processing and visual information processing. First, brain regions that reflect HEP modulation in sensor analysis were found in the RAI and the RACC, which is a previously known source of HEP (Couto et al., 2015; H.-D. Park et al., 2017; H.-D. Park et al., 2014; Pollatos, Kirsch, & Schandry, 2005). These regions are also identified as overlapping regions of emotion, interoception and social cognition in a recent meta-analysis (Adolfi et al., 2016). Moreover, a Granger causality analysis revealed that cardiac information flow from the RAI to the RACC increases with sad face perception. The anterior insula interacts with many regions, including the ACC, which is a central autonomic network that regulates autonomic function (Beissner, Meissner, Bär, & Napadow, 2013). Furthermore, the RAI to the RACC is proven to have a causal relationship in resting state fMRI in GC analysis, while the reverse directional connectivity is weak (Sridharan, Levitin, & Menon, 2008). Our results suggest that this causality might be increased in sad face perception.
These previous studies make it clear that bottom-up cardiac information flow is increased in sad face perception. However, according to the source analysis of VEP, only the visual cortical region and ventromedial prefrontal regions appeared, which is consistent with previous EEG/MEG studies of sad faces (Batty & Taylor, 2003; Esslen, Pascual-Marqui, Hell, Kochi, & Lehmann, 2004). By integrating these results, we firmly insist that processing of sad faces involves distinct interoceptive processing and visual processing and this is revealed by the HEP and VEP after the stimulus. To our knowledge, this is the first study that has shown distinctive processing of both systems of processing induced by sad faces in a clear and direct way. These results correspond well to hypotheses that explain the relationship between emotions and interoception, like the somatic marker hypothesis (A. R. Damasio, 2003), which predicts that bottom up interoceptive processing is increased by emotional stimulus. The somatic marker hypothesis predicts that physiological change is induced by emotional stimulus and modulates brain-body interaction, while the results of our physiological data analysis showed that cardiac activity parameters, including heart rate and heartbeat amplitude, were not modulated by the sad face. However, considering that direct input of the HEP is pressure in the carotid baroreceptor and that we did not measure blood pressure or index of carotid stimulation, it is hard to tell whether there was physiological change. Another possibility is that absence of induced physiological change might be due to our short stimulus onset asynchrony of about 1 second, in which it is hard to evaluate physiological changes like heart rate modulation, which occurs after several seconds (Critchley et al., 2005). Note that our results were derived from a sad emotion among negative ones. 
Therefore, future experiments need to be performed with other negative emotional stimuli, such as fear or anger.
An unexpected finding in the present study was that cardiac information processing was also modulated in the basal ganglia (RGP/RP) and prefrontal regions (SFG/MFG). Globus pallidus is known to send input to the prefrontal cortex via the thalamus. This pathway is related to initiating motor action (Singh-Bains, Waldvogel, & Faull, 2016). In particular, the ventral pallidum (VP) is closely related to regulating emotion or starting motor action in response to emotional stimuli (Singh-Bains et al., 2016). Moreover, there was a case of a patient with damage of the GP, including the VP, who reported inability to feel emotion (Vijayaraghavan, Vaidya, Humphreys, Beglinger, & Paradiso, 2008). Based on this evidence, we suggest that cardiac information is relayed to the GP (including the VP), and finally, the prefrontal area has a role in initiating emotion-related behavior like facial expression or generation of feeling triggered by cardiac information processing, which is consistent with the somatic marker hypothesis.
To our knowledge, this is the first study to show different interoceptive and visual processing of the same emotional stimulus. In conclusion, our results demonstrate that sad faces, compared to neutral faces, induce different interoceptive and visual processing, reflected by the HEP and the VEP, respectively. The interoceptive processing involves increased bottom-up processing of the HEP from the RAI to the RACC, while the visual processing differs in other areas, including the visual cortical area. Additionally, we found that cardiac signals are also processed differently in the basal ganglia and prefrontal regions in sad face processing, which might reflect the initiation of emotion-related behavior like facial expression or the generation of feeling.
Methods
Participants
Forty healthy participants (19 females, mean age of 24.03 ± 3.28 years) volunteered for the experiment. The expected effect sizes were not known in advance, so we chose a sample size of approximately 40 participants, roughly twice that of previous MEG and EEG studies of the HEP (Babo-Rebelo, Richter, & Tallon-Baudry, 2016; Fukushima et al., 2011; H.-D. Park et al., 2014).
The MEG recording, consisting of 4 runs, was completed in one visit. High-resolution T1-weighted MRI scans were acquired at another visit. In this MRI session, all subjects also performed functional MRI experiments consisting of emotion discrimination (unpublished) and/or decision (unpublished) tasks and underwent other structural MRI scans such as diffusion tensor imaging (unpublished). Among the forty subjects, we failed to acquire MEG data for five subjects due to magnetic field instability. Another three subjects were excluded from the analysis because their ECG data were too noisy or absent. Therefore, thirty-two subjects were included in further analysis.
A structured interview was conducted using the Korean version of the Diagnostic Interview for Genetic Studies (Joo et al., 2004). None of the subjects had a current neurological or psychological disease. Subjects completed the Patient Health Questionnaire-9 (PHQ-9) (Kroenke, Spitzer, & Williams, 2001), a self-rating scale for depression, and the Korean version of the Verbal Abuse Questionnaire (K-VAQ) (Jeong et al., 2015), because we suspected that a history of emotional abuse might affect the response to emotional stimuli. The mean PHQ-9 score was 3.62 (SD: 2.59, n=39). The mean K-VAQ scores for peers (PeVA) and parents (PaVA) were 22.61 (SD: 9.65, n=38) and 21.95 (SD: 7.82, n=38), respectively. All thirty-two participants who were included in the analysis completed the PHQ-9 (mean and SD: 3.68 and 2.69, n=32), while two of them did not complete the K-VAQ (mean and SD of PeVA: 21.73 and 7.26; mean and SD of PaVA: 23.43 and 10.29; n = 30). All participants gave written informed consent to participate in the experiment. The study was approved by the Institutional Review Board of the Korea Advanced Institute of Science and Technology in accordance with the Declaration of Helsinki.
Standardization of emotional stimuli
Stimuli consisted of forty-five emotional faces and forty-five text-based emotional emoticons. The forty-five faces expressing happy, sad, and neutral emotions were selected from the Korean Facial Expressions of Emotion (KOFEE) database (J. Y. Park et al., 2011). Text-based happy and sad emoticons were collected from the World Wide Web. We then produced scrambled emoticons that did not contain configural information and used them as the emoticons of the neutral condition. The ninety emotional expressions, including faces and text-based emoticons, were standardized in an independent sample of forty-seven healthy volunteers (21 females, mean age of 28.43 ± 4.31 years). These participants were asked to rate the feeling they felt toward each of the 90 stimuli (45 faces and 45 facial emoticons of happy, neutral, and sad emotions) on an 11-point Likert scale (−5 to +5). We compared the mean absolute value of the ratings of the four emotional expressions, which we called ‘feeling intensity’ or ‘emotionality’ (Citron, Gray, Critchley, Weekes, & Ferstl, 2014). A repeated measures analysis of variance (RANOVA) with a 2 (stimulus type: face, emoticon) by 3 (valence: happy, sad, neutral) design was performed on the mean and on the variance of the emotionality score, respectively. In the RANOVA of the mean, there was a significant main effect of valence (F (1.744, 80.228) = 272.618, P < 0.001, Greenhouse-Geisser corrected), while there was no difference between emoticons and faces (F (1, 46) = 0.011, P = 0.919) and no interaction between the two factors (F (1.685, 77.488) = 0.285, P = 0.818, Greenhouse-Geisser corrected). In addition, a post-hoc t test revealed no difference between the sad and happy conditions (P = 0.082) and a significant difference between the emotional and neutral conditions (P < 0.001 for both sad and happy versus neutral).
In the RANOVA of the variance, the variance of emoticons was significantly larger than the variance of faces (F (1, 46) = 16.108, P < 0.001), while there was no difference in the variance between emotions (F (1.268, 58.342) = 2.608, P = 0.079, Greenhouse-Geisser corrected) and no significant interaction (F (1.347, 61.963) = 4.831, P = 0.066, Greenhouse-Geisser corrected). Additionally, participants of the main experiment also underwent the rating procedure described above before the MEG recording.
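As a minimal sketch of the emotionality score described above, the following Python computes the mean absolute rating per condition for a single rater. The function name, the condition labels, the data layout, and the rating values are illustrative assumptions and not data from the study.

```python
from statistics import mean

def emotionality(ratings):
    """Mean absolute valence rating (-5..+5 scale) per condition.

    `ratings` maps a condition name to one rater's ratings for the
    stimuli of that condition (hypothetical layout).
    """
    scores = {}
    for condition, values in ratings.items():
        if any(v < -5 or v > 5 for v in values):
            raise ValueError("ratings must lie on the -5..+5 scale")
        scores[condition] = mean(abs(v) for v in values)
    return scores

# Made-up ratings, for illustration only.
example = {
    "sad_face": [-4, -3, -5],
    "happy_face": [4, 3, 5],
    "neutral_face": [0, 1, -1],
}
scores = emotionality(example)
```

Taking the absolute value places sad and happy stimuli on the same intensity axis, which is what allows the two emotional conditions to be compared with each other and with the neutral condition.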
MEG experimental task
During the MEG recording, ninety stimuli consisting of 45 faces and 45 text-based emoticons were presented in the center of the screen using in-house software, the KRISSMEG Stimulator 4. The size, duration, and stimulus onset asynchrony (SOA) of all the stimuli were 27×18 cm, 500ms, and 1500ms, respectively, and the order of stimulus presentation was pseudo-randomized. Participants completed 4 runs; each run contained 180 stimuli (30 each of sad faces, happy faces, neutral faces, sad emoticons, happy emoticons, and neutral emoticons) and took 270 s. In addition, to maintain the participants’ attention to the task, they were instructed to discriminate between sad and happy by pressing a button when a question mark appeared. The question mark appeared on the screen at random intervals of 9 to 15 trials.
Acquisition
A 152-channel MEG system (KRISS MEG, Daejeon, Korea; 152 axial first-order double-relaxation oscillation superconducting quantum interference device (DROS) gradiometers) covering the whole head was used to make MEG recordings in a magnetically shielded room for 60–90 minutes at a sampling rate of 1,024 Hz. The relative positions of the head and the MEG sensors were determined by attaching four small positioning coils to the head. The positions of the coils were recorded at intervals of 10–15 min by the MEG sensors to allow co-registration with individual anatomical MRI data. Between the head positions measured before and after each run, the maximum deviation was < 2 mm and the goodness of fit (GoF) was > 95%. Recordings of eye and muscle artifacts were made with an EEG system simultaneously with the MEG recordings. During MEG recording, participants were seated with their heads leaning backward in the MEG helmet. The translation between the MEG coordinate system and each participant’s structural MRI was made using the four head position coils placed on the scalp and fiducial landmarks (Hämäläinen, Hari, Ilmoniemi, Knuutila, & Lounasmaa, 1993).
Data preprocessing
Data were processed with the FieldTrip toolbox (Oostenveld et al., 2011). First, raw data were epoched from 700ms before stimulus onset to 1300ms after stimulus onset. Epochs containing large artifacts were rejected by visual inspection. After artifact trial rejection, the eye movement artifact and the cardiac field artifact (CFA) were removed by independent component analysis (ICA) (Hyvärinen et al., 2004) using the function “ft_componentanalysis” with the runICA algorithm. Twenty components were identified for each of the six conditions. Two neurologists visually inspected each component, and components that showed the typical spatial and temporal patterns of the CFA, eye blinking, and movement noise were removed. After removing the noise, the data were filtered with a 1-40 Hz Butterworth filter. Then, the heartbeat evoked potential (HEP) for each stimulus condition was extracted by subsequent epoching, time-locked to the R peaks within every epoch. R peaks were detected using the Pan-Tompkins algorithm (Pan & Tompkins, 1985), and the HEP of each condition was extracted by epoching from 500ms before the R peak to 600ms after the R peak within the epochs of each condition. Because a heartbeat enters the central nervous system (CNS) around 200ms after the R peak by vagal afferent stimulation at the carotid body (Eckberg & Sleight, 1992), whereas a visual stimulus enters the CNS immediately through the retina, a heartbeat that occurs more than 200ms before visual stimulus onset stimulates the brain earlier than the visual stimulus does. Therefore, we excluded R peaks that occurred more than 200ms before stimulus onset, so that only heartbeat evoked processing occurring after the visual stimulus was included; in other words, this procedure excluded cortical input of heartbeats that preceded the visual stimulus. R peaks occurring later than 700ms after stimulus onset were also excluded because those HEP epochs would contain the next visual stimulus onset. Finally, a baseline correction was performed using a pre-R-peak interval of 300ms, and trials of the same condition were averaged for each subject. We expected that the HEP modulation induced by emotional faces and emoticons would be much smaller than the CFA. Therefore, to control the CFA maximally, in addition to removing CFA-related independent components, we only used a time window of 455-595ms after each R peak, in which the influence of the CFA is known to be minimal (less than 1%), as in previous studies (Gray et al., 2007; Müller et al., 2015).
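The R-peak selection and HEP epoching rules above can be illustrated with a short Python sketch. This is not the FieldTrip pipeline used in the study; the function names, the inclusive boundaries, the single-channel list representation, and the choice of the 300ms immediately preceding the R peak as the baseline are assumptions made for illustration.

```python
def select_r_peaks(r_peaks_ms, stim_onset_ms, earliest_ms=-200.0, latest_ms=700.0):
    """Keep R peaks whose latency from stimulus onset lies in [earliest, latest]."""
    return [r for r in r_peaks_ms
            if earliest_ms <= r - stim_onset_ms <= latest_ms]

def hep_epoch(signal, r_peak_ms, fs_hz, pre_ms=500.0, post_ms=600.0, baseline_ms=300.0):
    """Cut one epoch around an R peak and subtract the pre-R-peak baseline mean.

    `signal` is one channel as a list of samples; `r_peak_ms` is measured
    from the first sample. Returns None if the epoch does not fit.
    """
    r = round(r_peak_ms * fs_hz / 1000.0)
    n_pre = round(pre_ms * fs_hz / 1000.0)
    n_post = round(post_ms * fs_hz / 1000.0)
    n_base = round(baseline_ms * fs_hz / 1000.0)
    if r - n_pre < 0 or r + n_post >= len(signal):
        return None
    epoch = signal[r - n_pre : r + n_post + 1]
    baseline = epoch[n_pre - n_base : n_pre]  # samples just before the R peak
    offset = sum(baseline) / len(baseline)
    return [v - offset for v in epoch]

# Stimulus at 500 ms: latencies are -400, -50, 400 and 800 ms.
kept = select_r_peaks([100, 450, 900, 1300], stim_onset_ms=500)
flat = hep_epoch([2.0] * 2000, r_peak_ms=1000, fs_hz=1000)
```

Here `kept` retains only the R peaks at 450ms and 900ms, and `flat` is a baseline-corrected epoch of 1101 samples (500ms before to 600ms after the R peak at 1,000 Hz).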
Sensor analysis: Cluster-based permutation paired t test between each emotional condition and neutral condition
We compared the HEP between each emotional condition and its neutral condition. Four tests were performed: sad face vs. neutral face, happy face vs. neutral face, sad emoticon vs. neutral emoticon, and happy emoticon vs. neutral emoticon. To deal with the multiple comparison problem, we used a cluster-based permutation paired t test, performed as follows. First, data were downsampled to 256 Hz to make the computation efficient, and paired t tests were performed at all time points between 455 and 595ms and at all sensors. Then, spatiotemporal points with uncorrected p-values below 0.05 (two-tailed) were clustered by spatiotemporal distance, and the summed t-value of each cluster was calculated. Next, a permutation distribution was built by randomly switching condition labels within subjects, calculating the t-values of the paired t test between the permuted conditions, forming clusters as described above, selecting the maximum cluster t-value, and repeating this procedure 5000 times. Finally, the corrected p-values of the original clusters were calculated from the permutation distribution of maximum cluster t-values. Additional paired t tests were performed for the contrasts that did not show a significant difference in the cluster-based permutation t test (i.e., happy face vs. neutral face, sad/happy emoticon vs. neutral emoticon), using the mean HEP values within the significant cluster observed in the cluster-based permutation t test (i.e., the sad face vs. neutral face contrast; ROI-TOI mask).
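The logic of the cluster-based permutation paired t test can be sketched in Python with NumPy. This is a simplified illustration and not the FieldTrip implementation: it clusters over time only (one sensor), takes the cluster-forming t threshold as a parameter (about 2.04 for 31 degrees of freedom at p = 0.05, two-tailed), uses the largest absolute cluster sum as the permutation statistic, and computes the p-value with a +1 correction. All of these choices and all names are assumptions.

```python
import numpy as np

def paired_t(diff):
    """Paired t value at every time point; diff has shape (subjects, times)."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))

def cluster_sums(t_vals, threshold):
    """Summed t values of runs of adjacent supra-threshold points, per sign."""
    sums = []
    for sign in (1.0, -1.0):
        running = 0.0
        for t in sign * t_vals:
            if t > threshold:
                running += t
            elif running:
                sums.append(sign * running)
                running = 0.0
        if running:
            sums.append(sign * running)
    return sums

def cluster_permutation_test(cond_a, cond_b, threshold, n_perm=5000, seed=0):
    rng = np.random.default_rng(seed)
    diff = np.asarray(cond_a, float) - np.asarray(cond_b, float)
    observed = cluster_sums(paired_t(diff), threshold)
    null_max = np.zeros(n_perm)
    for i in range(n_perm):
        # Swapping the condition labels within a subject flips the sign
        # of that subject's difference wave.
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        perm = cluster_sums(paired_t(diff * flips), threshold)
        null_max[i] = max((abs(s) for s in perm), default=0.0)
    p_vals = [(np.sum(null_max >= abs(s)) + 1) / (n_perm + 1) for s in observed]
    return observed, p_vals

# Simulated data: 32 subjects, 36 time points, effect planted in points 10-20.
sim = np.random.default_rng(1)
a = sim.normal(size=(32, 36))
b = sim.normal(size=(32, 36))
a[:, 10:21] += 1.5
clusters, p_values = cluster_permutation_test(a, b, threshold=2.04, n_perm=500)
```

Because only the maximum cluster statistic of each permutation enters the null distribution, a single comparison against that distribution controls the family-wise error rate over all time points (and, in the full analysis, all sensors).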
Source analysis
Source reconstruction was conducted using the MATLAB package Brainstorm (Tadel et al., 2011). To estimate the time courses of both cortical and subcortical activity, we used the default settings of Brainstorm’s implementation of the Deep Brain Activity model with the minimum norm estimate (MNE) (Attal & Schwartz, 2013; Tadel et al., 2011). First, cortical surfaces and subcortical structures, including the amygdala and basal ganglia, were generated for each subject from 3T MPRAGE T1 images using FreeSurfer (Fischl, 2012). The individual heads/parcellations were then read into Brainstorm along with tracked head points to refine the MRI registration. Refining the registration with the head points improves the initial MRI/MEG registration by fitting the head points digitized at the MEG acquisition to the scalp surface. In Brainstorm, a mixed surface/volume model was generated, with 15,000 dipoles on the cortical surface and another 15,000 dipoles in the subcortical structure volume. Using the individual T1 images and the transformation matrix generated as above, a forward model was computed for each subject using a realistic overlapping spheres model. The source activity for each subject was computed using the MNE (Brainstorm default parameters were used). The source map was averaged over the time window of 488ms to 515ms, which showed a significant difference between the sad face and the neutral face at the sensor level (other emotional conditions were not significantly different from neutral conditions in the cluster-based permutation paired t test). This averaged spatial map was then exported to SPM12 (Penny et al., 2011), where subsequent statistical tests were performed. Paired t tests were used to identify regions that had a different HEP time course within the selected time window between the sad face and the neutral face. Moreover, to test the absolute activation difference between the two conditions, we additionally exported the absolute spatial map to SPM12 and applied the same paired t test design. In the source analysis, data were downsampled to 512 Hz so that a sufficient number of time points remained for the Granger causality analysis. Finally, we performed a paired t test between the happy face and the neutral face in the sources that showed a significant difference in the additional paired t test of the sensor analysis.
Granger causality analysis of HEP source activity
After identifying brain regions that had different time courses, we performed a GC analysis (Barnett & Seth, 2014) on two regions of interest, the right anterior insula (RAI) and the right anterior cingulate cortex (RACC), to determine whether effective connectivity between these regions is modulated differently by emotional expressions compared to the neutral condition. These regions are known to be core regions of interoceptive processing and feeling according to many previous studies (Adolfi et al., 2016; A. R. Damasio, 2003; Smith & Lane, 2015). Moreover, emotion processing in the AI and ACC is known to be right lateralized, especially in the AI (Craig, 2009; Gu, Hof, Friston, & Fan, 2013). Time courses of these two regions were extracted from the voxels that showed a significant difference in the source analysis. Detailed information, including the coordinates of the ROIs, is provided in the Supplementary materials.
In GC analysis (Granger, 1988), time course Y causes time course X if the past time points of Y and X together explain X better than the past of X alone. GC is formulated as the log-likelihood ratio comparing the residual covariance matrix of the model that explains X by the past of X and Y with the residual covariance matrix of the model that explains X by the past of X alone (Barnett & Seth, 2014).
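In a standard formulation (Barnett & Seth, 2014), the full and reduced autoregressive models and the resulting GC can be written as follows. The model order p and the primes marking the reduced model are notation assumed here for illustration.

```latex
\begin{aligned}
X_t &= \sum_{k=1}^{p} A_{xx,k}\, X_{t-k} + \sum_{k=1}^{p} A_{xy,k}\, Y_{t-k} + \varepsilon_{x,t},
  & \Sigma_{xx} &= \operatorname{cov}(\varepsilon_{x,t}) \\
X_t &= \sum_{k=1}^{p} A'_{xx,k}\, X_{t-k} + \varepsilon'_{x,t},
  & \Sigma'_{xx} &= \operatorname{cov}(\varepsilon'_{x,t}) \\
F_{Y \to X} &= \ln \frac{\lvert \Sigma'_{xx} \rvert}{\lvert \Sigma_{xx} \rvert}
\end{aligned}
```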
A is the matrix of regression coefficients, ε is the residual, Σ is the covariance matrix of the residual, and F is the GC from Y to X. All these calculations were done using the multivariate Granger causality toolbox (MVGC toolbox) (Barnett & Seth, 2014).
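For two univariate time courses, the residual covariance determinants reduce to residual variances, so the GC definition can be sketched in a few lines of Python with NumPy. This is not the MVGC toolbox: it fits single-trial models by ordinary least squares with a fixed model order, and the function names and simulated data are assumptions for illustration.

```python
import numpy as np

def lagged(series, order):
    """Columns series[t-1], ..., series[t-order] for t = order .. T-1."""
    T = len(series)
    return np.column_stack([series[order - k : T - k] for k in range(1, order + 1)])

def residual_variance(target, regressors):
    """Mean squared residual of the least-squares fit of target on regressors."""
    coef, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    resid = target - regressors @ coef
    return float(np.mean(resid ** 2))

def granger_causality(x, y, order):
    """F from Y to X: ln(reduced-model residual variance / full-model residual variance)."""
    x = np.asarray(x, float) - np.mean(x)
    y = np.asarray(y, float) - np.mean(y)
    target = x[order:]
    reduced = lagged(x, order)                    # past of X only
    full = np.hstack([reduced, lagged(y, order)])  # past of X and Y
    return float(np.log(residual_variance(target, reduced)
                        / residual_variance(target, full)))

# Simulated example in which Y drives X but not the reverse.
rng = np.random.default_rng(0)
n = 2000
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + rng.normal()
    x[t] = 0.5 * x[t - 1] + 0.8 * y[t - 1] + rng.normal()
f_y_to_x = granger_causality(x, y, order=2)
f_x_to_y = granger_causality(y, x, order=2)
```

In this simulation the estimate from Y to X is clearly larger than the estimate from X to Y, which stays close to zero, mirroring the directional comparison made between the RAI and the RACC.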
Time courses of the two ROIs were extracted for every trial for each subject. To satisfy the stationarity assumption of GC analysis, we used short time window GC estimation with a sliding window (Ding et al., 2000; Seth et al., 2015). This approach was also appropriate for our analysis of the HEP modulation effect, which was found in quite a short time window and thus required high temporal precision of the GC estimation. The window size was 60ms and the step size was 2ms (Cohen, 2014; Ding et al., 2000), and GC was calculated for the whole epoch, from the [-500ms -440ms] window to the [540ms 600ms] window. Stationarity was further controlled by removing the average event-related potential from each trial (Wang, Chen, & Ding, 2008). Model order was determined using the Akaike Information Criterion (AIC) up to a maximum order of seven time points, which corresponded to 14ms. After model estimation, we tested the stationarity of the model in every time window and every subject by examining whether the spectral radius ρ(A) was below 1; a value of 1 or more indicates a violation (Barnett & Seth, 2014). Although we tried to make every time window satisfy the stationarity assumption, 21 participants violated it in windows after the [507ms 567ms] time window. Additionally, violations occurred at around 0ms post R peak in three participants and at 370ms (around the T peak) in one participant. A similar pattern appeared even when we varied the length of the time window and the model order. We suspected that this stationarity violation might be induced by the CFA. Therefore, we restricted the analysis to the time windows from the [455ms 515ms] window to the [507ms 567ms] window, in which every time window satisfied the stationarity assumption in every participant. The pairwise GC of the two ROIs (two GCs: RAI to RACC and RACC to RAI) was calculated separately for the emotional and neutral conditions. To compare the emotional and neutral conditions, GC baseline normalization was performed in both conditions by calculating the change in GC relative to the average GC between the [-330ms -270ms] window and the [-130ms -70ms] window (Cohen, 2014). Time windows around 0ms post R peak were not used as a baseline because three subjects violated the stationarity assumption there. Finally, the two estimated GCs of the emotional and neutral conditions were compared using a cluster-based permutation paired t test over all time windows from the [455ms 515ms] window to the [507ms 567ms] window (Oostenveld et al., 2011). In this way, multiple comparisons across the number of GCs and the number of time windows were controlled.
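The sliding-window layout and the stationarity check can be sketched as follows in Python with NumPy. The window boundaries are those given in the text; the function names, the coefficient array layout, and the example coefficient values are assumptions, and on sampled data the 2ms step corresponds to roughly one sample at 512 Hz.

```python
import numpy as np

def sliding_windows(first_start_ms=-500, last_start_ms=540, width_ms=60, step_ms=2):
    """(start, end) pairs, by default from [-500, -440] to [540, 600] in 2 ms steps."""
    return [(s, s + width_ms)
            for s in range(first_start_ms, last_start_ms + 1, step_ms)]

def spectral_radius(coefs):
    """Largest eigenvalue modulus of the companion matrix of a VAR(p) model.

    `coefs` has shape (p, n, n): one n-by-n coefficient matrix per lag.
    """
    coefs = np.asarray(coefs, float)
    p, n, _ = coefs.shape
    companion = np.zeros((n * p, n * p))
    companion[:n, :] = np.hstack(list(coefs))   # first block row: A_1 ... A_p
    companion[n:, :-n] = np.eye(n * (p - 1))    # shifted identity below it
    return float(np.max(np.abs(np.linalg.eigvals(companion))))

def is_stationary(coefs):
    """A VAR model is stable (stationary) when its spectral radius is below 1."""
    return spectral_radius(coefs) < 1.0

whole_epoch = sliding_windows()
analysis_windows = sliding_windows(first_start_ms=455, last_start_ms=507)
stable = is_stationary([[[0.5, 0.0], [0.2, 0.3]]])     # hypothetical coefficients
unstable = is_stationary([[[1.1, 0.0], [0.0, 0.2]]])   # hypothetical coefficients
```

With these settings the whole epoch yields 521 windows and the restricted analysis range yields 27 windows, each of which would be checked with `is_stationary` after model estimation.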
Analysis of physiological data
Heartbeat distribution in each condition
To show that the HEP effect is not the result of a biased heartbeat distribution, we divided the portion of the original visual epoch between -200ms and 700ms, which corresponds to the beginning and end of the range used for HEP epoching, into 100ms time windows (a total of nine time bins) and counted the heartbeats in each bin. We did this for every condition. Then, an analysis of variance across the nine time bins was performed to test whether heartbeats occurred equally often in every time bin.
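The binning step can be sketched in Python as follows. The function name, the half-open bin edges, and the example latencies are assumptions for illustration; the resulting counts per subject and condition would then enter the analysis of variance.

```python
def heartbeat_histogram(latencies_ms, start_ms=-200, stop_ms=700, bin_ms=100):
    """Count R peaks per time bin; latencies are relative to stimulus onset.

    Bins are half-open: [start, start + bin), ..., [stop - bin, stop).
    """
    n_bins = (stop_ms - start_ms) // bin_ms
    counts = [0] * n_bins
    for lat in latencies_ms:
        if start_ms <= lat < stop_ms:
            counts[int((lat - start_ms) // bin_ms)] += 1
    return counts

# Made-up latencies; -300 and 700 fall outside the binned range.
counts = heartbeat_histogram([-200, -150, 0, 50, 699, 700, -300])
```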
Heart rate modulation in each condition
The heart rate in every condition was calculated for each subject. Then, an analysis of variance between every condition was performed to test whether the heart rates were different across conditions.
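One simple way to obtain a heart rate from detected R peaks is to convert the mean R-R interval to beats per minute, as in the Python sketch below. The text does not specify how the heart rate was computed, so this formula, the function name, and the example values are assumptions.

```python
def heart_rate_bpm(r_peaks_ms):
    """Mean heart rate in beats per minute from consecutive R-peak times (ms)."""
    if len(r_peaks_ms) < 2:
        raise ValueError("at least two R peaks are needed")
    intervals = [b - a for a, b in zip(r_peaks_ms, r_peaks_ms[1:])]
    return 60000.0 / (sum(intervals) / len(intervals))

rate = heart_rate_bpm([0, 800, 1600, 2400])  # R-R interval of 800 ms
```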
Analysis to exclude the effect of visual processing
Surrogate R peak analysis
To test whether the HEP modulation effect is time-locked to the heartbeat, we created 100 surrogate heartbeats that were independent of the original heartbeats (H.-D. Park et al., 2016; H.-D. Park et al., 2014). Then we computed the surrogate HEP with the surrogate heartbeats and performed the same cluster-based permutation t test between the conditions that showed a significant difference in the sensor-level analysis. Finally, we built the distribution of the maximum cluster statistics of the surrogate R peaks and calculated the position of our original cluster statistic in this distribution to show that the heartbeat-locked effect is significantly large relative to that distribution.
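The two ingredients of this control analysis can be sketched in Python. Reassigning R-peak latencies across trials is one way to obtain surrogate heartbeats that are independent of the recorded ones; whether this matches the exact procedure of the study is an assumption, as are the function names, the example values, and the +1 correction in the p-value.

```python
import random

def surrogate_r_peaks(latencies_by_trial, rng):
    """Reassign each trial's R-peak latencies to a randomly chosen trial."""
    shuffled = list(latencies_by_trial)
    rng.shuffle(shuffled)
    return shuffled

def surrogate_p_value(observed_stat, surrogate_stats):
    """Position of the observed cluster statistic in the surrogate distribution."""
    exceed = sum(1 for s in surrogate_stats if abs(s) >= abs(observed_stat))
    return (exceed + 1) / (len(surrogate_stats) + 1)

rng = random.Random(0)
original = [[100], [250, 900], [400], [30]]   # made-up latencies (ms) per trial
surrogate = surrogate_r_peaks(original, rng)
p_small = surrogate_p_value(10.0, [1.0, 2.0, 3.0, 4.0])
p_large = surrogate_p_value(2.5, [1.0, 2.0, 3.0, 4.0])
```

Each surrogate set would be passed through the same HEP epoching and cluster-based permutation test, and the maximum cluster statistic of each set would be collected into `surrogate_stats`.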
Analysis of visual evoked potential (VEP)
To test whether the HEP modulation effect is confounded by the visual evoked potential effect, we performed the same cluster-based permutation test on the visual stimulus-locked potential and compared the topography of the significant clusters between the HEP and the VEP at the sensor level. Then, we performed a source localization of the VEP activity in the time window of the significant cluster and exported it to SPM12 to perform a statistical test between the emotional and neutral conditions with the same methods we used in the HEP analysis (including the absolute value difference). We then compared the resulting sources with the results of the HEP analysis. This VEP analysis was done only for the conditions that had a significant HEP modulation effect, so that we could compare the significant modulation of the HEP with the VEP effect.
Acknowledgement
This research was supported by the Brain Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science & ICT (NRF-2016M3C7A1914448, NRF-2017M3C7A1031331). The authors wish to acknowledge Kyung-Min An and Yong-Ho Lee for their help with data acquisition.