Abstract
Human emotions are complex and are composed of multiple separable components. Among the many models of emotion, the circumplex model is one of the major theories. The circumplex model allows us to describe variable aspects of emotion; however, such a momentary expression of one’s internal mental state fails to consider a third dimension, time. Here, we provide an exploratory attempt to build a three-axis model of human emotion to model a complex sense of anticipatory excitement, “Waku-Waku” (in Japanese), which arises when people are predictively coding upcoming emotional events. Electroencephalography (EEG) was recorded from 28 young adult participants while they mentalized upcoming emotional pictures. Three auditory tones were used as cues indicating the likely valence of the upcoming picture: positive, negative, or unknown. While seeing an image, participants judged its emotional valence during the task, and they subsequently rated their subjective experiences of valence, arousal, and expectation immediately after the experiment. The collected EEG data were then analyzed to determine the contributing neural signature for each of the three axes. As expected, the three-axis model revealed a considerable contribution of the third dimension over the classical two-dimensional model. Distinct contributing EEG components were determined for each axis. The resulting model is proposed as a novel ‘brain-emotion-interface’. Limitations and applicability of this method are discussed.
1. Introduction
Human emotions are complex and are composed of multiple separable components. Among the many models of emotion, the two-dimensional circumplex model, comprising a valence axis and an arousal axis and originally proposed by Russell [1], is commonly adopted and has been examined as a ubiquitous model across diverse cultures [2,3]. Other theories, such as discrete categorical theory, exist [4,5]; however, the majority of models assume that emotion is to be explained as a momentary affective state. Recent studies of emotion propose updated models. Some studies conceptualize emotion as an ‘affective working memory system’, a form of dynamic and active interaction between cognition and affect [6,7]. An alternative notion has also been proposed in which emotion occurs as a result of internal inference through predictive coding [8,9]. As such, our affective awareness may be an interactive state between cognitive and emotional functions rather than a simple unitary function [see 10 for review]. This recent evidence suggests the importance of a novel or more complex model of emotion to better account for and decode our seemingly complex subjective experiences.
In particular, the classical two-dimensional model lacks a temporal dimension. Recent work in neuroscience and psychological science describes the brain as a predictive machine [8], supported by Bayesian theories of the human brain [11,12], indicating that an emotional state might be explained as a sequence of momentary affective states used to predict upcoming states under uncertainty. For instance, one may feel emotionally excited while cognitively speculating about what one will experience in the future, such as eating delicious food in a few hours. Assuming the predictive nature of the brain, it is plausible that our emotional experiences may be explained by adding one or several dimensions to the classical circumplex model. Here, based on the dimensional theory of emotion, we propose the addition of another dimension associated with predictive mechanisms to quantify our mental experiences.
1.1 “Kansei”–a multiplex state of mood
Our motivation for this study came from the idea of quantifying such complex mental states. In Asian languages, for example, ‘Kansei’ (in Japanese; a direct translation would be ‘sensitivity’ or ‘sensibility’) is a widely accepted term that reflects one’s feeling exogenously triggered by something and that is often accompanied by mental images of a target [13]. Kansei expressions often reflect a mixture of affective and cognitive states. For instance, “Waku-Waku” is an onomatopoeic expression that is typically defined as an emotional state in which one is emotionally excited while cognitively anticipating upcoming pleasant event(s). The closest English equivalents of “Waku-Waku” may be ‘anticipatory excitement’ or ‘a sense of exhilaration’. As our eventual goal is to quantify such a putatively complex state, we hypothesized that the addition of an extra dimension of time might explain the state of “Waku-Waku” well. More generally, we hypothesized that a state of Kansei could be modelled with a combination of multi-dimensional axes that incorporate both affective and cognitive dimensions, including prediction. Note that, in a broad sense, Kansei refers not only to one’s state but also to one’s traits or preferences based on one’s experiences; however, in this article we focus on the former, the state of one’s affect and cognition. The trait aspect will be treated elsewhere.
Here, we first aimed to build a psychological model for one Kansei, “Waku-Waku”. On the basis of the conventional two-dimensional model of affect [1], we hypothesized a three-dimensional model composed of ‘valence’ and ‘arousal’ axes as well as a third dimension of time, ‘expectation’. As the third axis, ‘expectation’ was defined as a simple cognitive aspect of anticipating an upcoming pleasant event, and it was differentiated from “Waku-Waku”, an onomatopoeic expression commonly used in Japanese culture for an emotional state in which one particularly expects pleasant events or episodes.
1.2 Brain-computer interface
Recent developments allow us to build interfaces that monitor brain status in real time and present it on a screen or a device. Today, the demand for techniques such as neurofeedback or brain-computer interfaces (BCIs) is increasing for use in clinical settings and even in industrial set-ups [14–16]. Applications of neurofeedback are developing; some visualize one’s brain activity using functional MRI for training and therapeutic purposes [17,18]. BCIs based on EEG have also been widely employed to detect the locus of one’s attention with the P300 component [15,16] or as a means to assess the level of consciousness with α-waves [19–21], and so forth. These classical models typically acquire data from one or a few electrode channels and focus on only a certain frequency range. With improved computational resources, BCIs focusing on limb movement use independent components (or common spatial patterns) derived from electrodes across the whole scalp, rather than data from a single channel [22–24]. One can readily expect that a more complex BCI could be achieved, such as a neurofeedback system incorporating multiple neural indices like the multi-dimensional model proposed here.
1.3 Brain-emotion interface
In this study, we first modelled “Waku-Waku” with three dimensions, namely valence, arousal, and expectation, providing a psychological model of “Waku-Waku” as an intermixed state of higher-order affective and cognitive functions [25]. As it turned out, our 3-D model showed a good fit. Second, given the model, we derived electrophysiological markers using electroencephalography (EEG) reflecting each axis, namely ‘valence’ and ‘arousal’ evoked by seeing images and ‘expectation’ for upcoming images. Finally, based on these outcomes, we propose a linear equation model of a “brain-emotion-interface (BEI)” that may be able to quantify “Waku-Waku” in real time by combining the three-dimensional psychological model with the corresponding neural markers for each of the three axes. Below, we show the resulting psychological model and the EEG markers for each axis, and we propose a prototype 3-D model for the quantification of “Waku-Waku”. Potential applications of the BEI in clinical or industrial settings as a tool for Kansei engineering, as well as limitations of this study, are discussed.
2. Methods
As the first step, we focused on building a psychological model for “Waku-Waku”. We performed a picture rating experiment in which participants were asked to imagine what kind of novel picture would be displayed depending on a valence-predicting cue. Immediately after the main task, participants completed a subjective rating task for each condition, to confirm that the pictures used in this study did not differ from conventional scores. Details of the experiment and analyses are as follows.
Given the hypothesis, the original experimental plan was to elucidate brain functions with functional MRI (fMRI) as well as EEG, thereby capturing Kansei from a multi-modal perspective. The same participants visited the lab three times, twice for fMRI sessions and once for an EEG session, and performed the same task on each visit. Part of the fMRI outcome has been reported elsewhere [see 26]. We describe the fMRI sessions below because it was necessary to include the subjective rating data obtained from all three visits of the 28 participants to derive a satisfactory linear model.
2.1 Participants
Thirty-six healthy young adults (19 females) aged 19–27 years were recruited locally. Some participants were excluded because of technical errors or because they did not complete all three visits to the lab. As a result, data from 28 participants (16 females; age mean ± SD: 22.17 ± 1.79) are reported here. None reported a history of neurological or psychological disorders. All participants had normal hearing and normal or corrected-to-normal vision. All participants gave informed consent under a protocol approved by the local research ethics committee at Hiroshima University.
2.2 Behavioral procedures
Participants performed a picture rating task in which they were requested to mentalize an upcoming novel picture appearing on a computer monitor in accordance with an auditory cue that preceded the picture onset (see Figure 1). There were three cuing conditions: 1) a high tone predicting a positive picture (‘Predictable Pleasant’), 2) a low tone predicting a negative picture (‘Predictable Unpleasant’), and 3) an intermediate tone indicating that the probability of seeing a pleasant or an unpleasant picture was 50% each (‘Unpredictable’). An auditory tone was played for 250 msec, followed by a blank delay period of 3,750 msec. The assignment of the high and low tones to conditions was counterbalanced across participants. While the arousal of the pictures varied picture by picture (see Supplementary Figure 1 and Appendix for detailed ratings of valence and arousal), the auditory cues were unrelated to the arousal of the upcoming image. Note that we categorized each picture for the purpose of counterbalancing picture sets across sessions, referring to the subjective ratings of valence and arousal reported in the original article [IAPS; 27]. Participants also rated each condition based on their subjective feelings after the experiment (see below for details).
During the 4,000 ms blank period, participants were requested to imagine what kind of image would be displayed. After the delay, an emotion-triggering picture was displayed at the center of the screen for 4,000 ms (for details of the selected pictures, see the Stimuli section below). Following the picture display, a red fixation cross appeared at the center of the screen for 1,000 ms. During this response period, participants were requested to rate their subjective feeling of valence, that is, how they were moved by seeing the picture, on a 4-point Likert scale (‘strongly pleasant’, ‘pleasant’, ‘unpleasant’, and ‘strongly unpleasant’) by pressing the corresponding button with the thumb, index finger, middle finger, or ring finger on a keyboard (for EEG) or a response button (for fMRI). Prior to the main task, participants underwent a brief practice session for all three conditions with a picture set independent of the main task.
2.3 Stimuli
As a predictive cue, one of three auditory tones (500, 1,000, or 1,500 Hz) was played for 200 msec via comfortably worn headphones. Either the 500 Hz (low) tone or the 1,500 Hz (high) tone predicted a pleasant or an unpleasant picture, and the 1,000 Hz (intermediate) tone was used as the unpredictable cue. The assignment of the high and low tones was counterbalanced across participants.
As novel images, pictures were carefully selected from the 1,182 pictures of the International Affective Picture System [IAPS; 27] according to the following criteria. Pictures that might cause excessive negative affect, such as corpses or the like, or that might conflict with our local ethics were discarded. In addition, the following were excluded from our picture sets: pictures consisting of multiple objects, where people may not necessarily focus on one aspect of the picture; pictures with intermediate valence that may not evoke adequately intense positive or negative valence (such as plain scenery or objects; valence ratings between 4 and 6, see Supplementary Figure 1); and pictures containing text, symbols, or items that may be subject to cultural differences (e.g., guns). The resulting 320 pictures were randomly divided into 2 sets (one for the two MRI sessions and the other for the EEG session) while controlling for the average valence and arousal ratings reported in the IAPS dataset. Each set of 160 pictures was counterbalanced across participants. Of the 160 pictures, half were pleasant and the other half were unpleasant.
Each picture appeared once following a predictive cue (predicting pleasant or unpleasant), and half of the 80 pictures of each valence (pleasant and unpleasant) were shown a second time in the unpredictable-cue condition (‘intermediate tone’). Spatial frequency and brightness were also controlled so that the brightness and spatial frequency (high or low, split at 140 Hz) of the pictures did not differ significantly among sets. In addition, the content of each picture was visually categorized (human, animal, scenery, and others), and the categories were distributed equally across sets. See Supplementary Figure 1 for details of the valence and arousal ratings used in this study. All IAPS pictures were novel to the participants.
A Dell 24-inch LCD monitor was used to display pictures at a resolution of 1920 x 1080 pixels. A chin-rest placed 56 cm from the monitor was used to keep the eye-to-monitor distance constant across participants. Participants were asked to rest their chin on it during the task. The pictures varied in orientation, some landscape and others portrait; however, the original pictures were displayed scaled to fit the monitor. All auditory and visual stimuli were delivered with Presentation software version 17.2 (NeuroBehavioral Systems, San Francisco, USA).
As noted above, there were fMRI and EEG sessions. The behavioral task was the same in all sessions, except for the number of trials: a total of 120 trials were performed in an fMRI session, instead of 240 in the EEG session. On each visit, a different set of pictures was used to maintain the novelty of the pictures. In the EEG session, there were 80 trials for each cuing condition, for a total of 240 trials. Across cue types, participants judged 120 pleasant and 120 unpleasant pictures.
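The trial counts for the EEG session can be checked with a minimal sketch. It assumes that each of the 160 pictures is shown once after its valence-predicting cue and that a random half of each valence is shown a second time after the unpredictable cue; the function name, picture identifiers, and random ordering are hypothetical and are not taken from the original stimulus scripts.

```python
import random
from collections import Counter


def build_eeg_trials(pleasant_ids, unpleasant_ids, seed=0):
    """Build one EEG-session trial list (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)
    trials = []
    # Every picture appears once after the cue that predicts its valence.
    for pic in pleasant_ids:
        trials.append({"cue": "predict_pleasant", "picture": pic, "valence": "pleasant"})
    for pic in unpleasant_ids:
        trials.append({"cue": "predict_unpleasant", "picture": pic, "valence": "unpleasant"})
    # Assumption: half of each valence is reused after the unpredictable cue.
    for ids, valence in ((pleasant_ids, "pleasant"), (unpleasant_ids, "unpleasant")):
        for pic in rng.sample(ids, len(ids) // 2):
            trials.append({"cue": "unpredictable", "picture": pic, "valence": valence})
    rng.shuffle(trials)  # assumed random trial order
    return trials


pleasant = [f"P{i:03d}" for i in range(80)]    # placeholder picture IDs
unpleasant = [f"U{i:03d}" for i in range(80)]  # placeholder picture IDs
trials = build_eeg_trials(pleasant, unpleasant)
cue_counts = Counter(t["cue"] for t in trials)
valence_counts = Counter(t["valence"] for t in trials)
print(len(trials), dict(cue_counts), dict(valence_counts))
```

Under these assumptions the list contains 80 trials per cuing condition and 120 pictures of each valence, which matches the counts reported for the EEG session.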
2.4 Subjective rating procedures
At the completion of the task, participants reported their subjective feelings for all conditions on a 0–100 visual analog scale. For each condition (for example, the condition in which a low tone predicted a pleasant picture), participants rated the degree of anticipatory excitement (in Japanese, “Waku-Waku”), as well as ‘valence’ (unpleasant to pleasant), ‘arousal’ (low to high arousal), and ‘expectation’ (low to high expectation). As described above, each participant completed the ratings three times, once per visit (once for the EEG experiment and twice for the fMRI experiments), and all ratings were pooled for the analysis. Note that a model built with the 28 participants’ data from the EEG experiment alone did not reach significance. We therefore included the data from all three sessions to ensure that all three axes reached significance.
2.5 Statistical procedures
A linear mixed model was computed from these subjective ratings to model anticipatory excitement, using SPSS version 22. Anticipatory excitement (“Waku-Waku”) was the dependent variable, and emotional ‘valence’, ‘arousal’, and ‘expectation’ were included in the model as independent variables. The assignment of counterbalanced tones, the picture set used, and the measurement modality (MRI or EEG) were included as covariates of no interest.
As described above, because at least the valence and arousal axes of the original circumplex model were expected to be fundamental to the model, a likelihood ratio test (χ2 statistic) was evaluated for each axis. All independent variables (axes) reached significance (p < .05) in the formula reported in Section 3.1. As it turned out, the arousal axis did not meet this criterion when only one of the three sessions was included.
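The likelihood ratio test for a single axis can be illustrated with a minimal sketch. It assumes that two nested models (with and without the axis of interest) have already been fitted by maximum likelihood and that their log-likelihoods are available; the log-likelihood values below are invented for illustration and are not results from this study, and the function name is hypothetical.

```python
import math


def likelihood_ratio_test(loglik_reduced, loglik_full):
    """Chi-square likelihood ratio test for one added predictor (1 degree of freedom)."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("the full model should not fit worse than the reduced model")
    # Survival function of a chi-square variable with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods: model without vs. with the 'expectation' axis.
stat, p = likelihood_ratio_test(loglik_reduced=-340.0, loglik_full=-337.0)
print(f"chi2(1) = {stat:.2f}, p = {p:.4f}")
```

The closed-form p-value used here holds only for one degree of freedom, that is, for testing one axis at a time.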
2.6 EEG procedures
2.6.1 Recording procedures
During the above-mentioned task, participants’ EEGs were recorded with a 64-channel BioSemi ActiveTwo system at a sampling rate of 2,000 Hz. In addition to the default 64 channels placed according to the International 10-20 system montage, vertical and horizontal electrooculograms (EOGs) were collected as is conventional (approximately 3–4 cm below and above the center of the left eye for the vertical EOG, and approximately 1 cm lateral to the external canthus of each eye for the horizontal EOG). The online reference electrode was placed on the tip of the nose.
2.6.2 Analytical Procedures
Recorded EEG data were analyzed offline with the EEGLAB toolbox [28] running on Matlab 2015a (MathWorks, Inc.), partly combined with custom-made functions. The DC offset was first removed from the continuous data, which were then low-pass filtered at 40 Hz with a two-way least-squares FIR filter, resampled to 512 Hz, and epoched from 500 msec before cue onset to 8,200 msec after cue onset (4,200 msec after image onset). The epoched data were then re-referenced to the average, and each channel was normalized to the baseline period (the 500 msec before cue onset). Trials with excessive artefacts were rejected by the automatic artefact rejection implemented in EEGLAB, with thresholds of more than 100 μV in amplitude and more than 5 standard deviations in probability. In each iteration of the artefact detection, at most 5% of the total trials were rejected. In addition to this basic artefact rejection, we corrected artefacts derived from eye movements using conventional recursive least squares regression (CRLS) implemented in EEGLAB [29], referring to the vertical and horizontal EOG channels, with a 3rd-order adaptive filter, a forgetting factor (lambda, ‘λ’) of .9999, and a sigma (‘σ’) of .01. The corrected data underwent a first run of independent component analysis (ICA) decomposition, followed by another run of automatic artefact rejection, now on the independent components (ICs), to remove artefactual components with the same criteria as for the channel-based rejection described above. After the IC-based rejection, a second and final ICA decomposition was performed, resulting in 64 putatively clean ICs per participant. Finally, dipole analysis was performed for each IC, assuming one dipole in the brain.
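The epoching and baseline step can be sketched for a single channel as follows. This is an illustration in plain Python rather than the EEGLAB code used in the study; it assumes that "normalized to the baseline period" means subtracting the mean of the 500 msec pre-cue window, and the function names and the toy signal are hypothetical.

```python
def ms_to_samples(ms, srate=512):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(ms * srate / 1000.0))


def extract_epoch(signal, cue_sample, srate=512, pre_ms=500, post_ms=8200):
    """Cut one epoch around a cue and subtract the mean of the pre-cue baseline.

    `signal` is a list of samples from one channel after resampling to `srate`.
    """
    n_pre = ms_to_samples(pre_ms, srate)
    n_post = ms_to_samples(post_ms, srate)
    start, stop = cue_sample - n_pre, cue_sample + n_post
    if start < 0 or stop > len(signal):
        raise ValueError("epoch exceeds the bounds of the recording")
    epoch = signal[start:stop]
    # Assumption: baseline normalization = subtraction of the pre-cue mean.
    baseline = sum(epoch[:n_pre]) / n_pre
    return [x - baseline for x in epoch]


# Toy channel: constant 2.0 before the cue, constant 7.0 afterwards.
toy_signal = [2.0] * 1000 + [7.0] * 5000
epoch = extract_epoch(toy_signal, cue_sample=1000)
print(len(epoch), epoch[0], epoch[ms_to_samples(500)])
```

At 512 Hz the 500 msec baseline spans 256 samples, and image onset (4,000 msec after cue onset) falls 2,048 samples after the cue.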
2.6.3 Rejection criteria
ICs were rejected if their residual variance exceeded 50% (equivalent to the proportion of outliers at p < .005, one-tailed; z-score > 2.58), if their estimated dipole position was outside the brain, or if their inverse weight was concentrated on a single EEG channel (an inverse weight on one channel more than 5 standard deviations from those of the remaining channels). This process retained an average of 33.89 ICs (range: 26–43) per participant. A total of 949 ICs proceeded to the subsequent analyses.
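The three rejection criteria can be expressed as a small filter. The sketch below is one possible reading of the criteria, not the original implementation: in particular, the single-channel test compares each channel's weight with the mean and standard deviation of the remaining channels, which is an assumption, and all function and argument names are hypothetical.

```python
import statistics


def single_channel_dominated(weights, n_sd=5.0):
    """True if one channel's weight lies more than n_sd SDs from the other channels."""
    for i, w in enumerate(weights):
        rest = weights[:i] + weights[i + 1:]
        mu = statistics.mean(rest)
        sd = statistics.stdev(rest)
        if sd > 0 and abs(w - mu) > n_sd * sd:
            return True
    return False


def keep_component(residual_variance, dipole_inside_brain, weights, max_rv=0.50):
    """Apply the three criteria: residual variance, dipole location, channel dominance."""
    if residual_variance > max_rv:
        return False
    if not dipole_inside_brain:
        return False
    if single_channel_dominated(weights):
        return False
    return True


# Toy inverse weights for 64 channels, plus a copy dominated by one channel.
smooth_map = [0.1, -0.1, 0.2, -0.2] * 16
spiky_map = [10.0] + smooth_map[1:]
print(keep_component(0.10, True, smooth_map), keep_component(0.10, True, spiky_map))
```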
2.6.4 Clustering independent components
To determine quantitatively the number of IC clusters to extract, we employed a Gaussian mixture model (GMM) to cluster ICs based on their scalp topography. We iterated the GMM over a range of candidate numbers of clusters (1 to 60), and the number of clusters to extract was determined by the Bayesian Information Criterion (BIC) because of its consistency relative to the Akaike Information Criterion [30,31]. Because of the nature of independent component analysis, the polarity of an IC scalp map is arbitrary. Therefore, before computing the Gaussian mixture models, the polarity of all retained ICs was aligned by inverting the weights of individual ICs where necessary, such that all components correlated positively with each other. All aligned data were then z-score normalized across channels per IC prior to the GMM.
We iteratively clustered the 949 ICs, using their inverse weights on the 64 channels, with a Gaussian mixture model fitted by maximum likelihood using the iterative expectation-maximization (EM) algorithm under the following settings. The covariance type was restricted to diagonal, shared covariance was allowed, and a regularization value of 0.05 was added. The maximum number of EM iterations within each fit was set to 1,000. We repeated the procedure for 1–60 clusters (we did not go beyond 60 because the decision could be drawn straightforwardly within this range). The best GMM was determined by its BIC value. Finally, the centroids of the inverse weights for each cluster were computed, and each IC was classified based on the selected model for subsequent statistical analyses.
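The preparation of the scalp maps and the BIC-based choice of the number of clusters can be sketched as follows. The GMM fitting itself is omitted; the sketch assumes that each candidate model has already returned a log-likelihood and a parameter count. Aligning every map to a single reference map is a simplifying assumption (it does not by itself guarantee that all pairs correlate positively), and the toy maps and fit values are invented for illustration.

```python
import math


def zscore(values):
    """Z-score a list of channel weights (sample standard deviation)."""
    n = len(values)
    mu = sum(values) / n
    sd = math.sqrt(sum((v - mu) ** 2 for v in values) / (n - 1))
    return [(v - mu) / sd for v in values]


def correlation(a, b):
    """Pearson correlation between two equally long lists."""
    za, zb = zscore(a), zscore(b)
    return sum(x * y for x, y in zip(za, zb)) / (len(a) - 1)


def align_polarity(maps, reference):
    """Flip the sign of any map that correlates negatively with the reference map."""
    return [m if correlation(m, reference) >= 0 else [-v for v in m] for m in maps]


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a preferred model."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_n_clusters(fits, n_obs):
    """`fits` maps a cluster count k to (log_likelihood, n_params); return the best k."""
    return min(fits, key=lambda k: bic(fits[k][0], fits[k][1], n_obs))


toy_maps = [[1, 2, 3, 4], [-1, -2, -3, -5], [2, 1, 4, 3]]
aligned = align_polarity(toy_maps, reference=toy_maps[0])
normalized = [zscore(m) for m in aligned]

# Invented fit results for three candidate models over 949 ICs.
toy_fits = {1: (-1000.0, 10), 2: (-900.0, 20), 3: (-895.0, 30)}
best_k = select_n_clusters(toy_fits, n_obs=949)
print(aligned[1], best_k)
```

In the toy example the third model gains too little likelihood to offset its additional parameters, so the BIC favors the two-cluster model.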
2.6.5 Spectrogram computation
The pre-processed data were re-epoched from –500 to 4,200 msec around image onset for valence (seeing positive vs. negative pictures) and arousal (seeing high- vs. low-arousal pictures), and the baseline was corrected between –500 and 0 msec. Likewise, the data were re-epoched from –500 to 4,200 msec around cue onset for expectation (expecting a positive picture vs. unpredictable), and the baseline was corrected between –500 and 0 msec. Note that both sets of epochs shared the same ICs because this epoch separation was done after the final ICA.
For all retained ICs, a spectrogram was computed for 0–4,000 msec from image onset for ‘valence’ and ‘arousal’, and for 0–4,000 msec from cue onset for ‘expectation’. The resulting spectral power was then averaged within each frequency range: θ (4–8 Hz), α (8–12 Hz), and β (12–20 Hz). We then examined whether the spectral power of each IC could dissociate the conditions of the valence, arousal, and expectation processes. Because the spectral power of each IC was not normally distributed in most cases, the Wilcoxon signed-rank test was applied, and its alpha level was corrected by the false discovery rate (FDR) method, controlling for multiple comparisons across frequencies.
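Band averaging and the FDR step can be illustrated with a short sketch. Two assumptions are made that the text does not specify: the bands are treated as half-open intervals so that a bin at 8 or 12 Hz is counted only once, and the FDR correction is taken to be the Benjamini-Hochberg procedure. The spectrum and p-values are invented, and the Wilcoxon test itself is not reproduced here.

```python
BANDS = {"theta": (4.0, 8.0), "alpha": (8.0, 12.0), "beta": (12.0, 20.0)}


def band_power(freqs, power, band):
    """Average spectral power over the bins with lo <= f < hi (assumed half-open)."""
    lo, hi = BANDS[band]
    vals = [p for f, p in zip(freqs, power) if lo <= f < hi]
    return sum(vals) / len(vals)


def fdr_bh(p_values, q=0.05):
    """Benjamini-Hochberg procedure; returns one boolean per p-value (True = significant)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            cutoff_rank = rank  # largest rank that satisfies the BH condition
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[i] = True
    return significant


# Invented spectrum of one IC and invented p-values for its three bands.
freqs = [4, 5, 6, 7, 8, 9, 10, 11, 12, 16]
power = [1, 1, 3, 3, 10, 10, 10, 10, 2, 4]
theta, alpha, beta = (band_power(freqs, power, b) for b in ("theta", "alpha", "beta"))
flags = fdr_bh([0.01, 0.04, 0.5])
print(theta, alpha, beta, flags)
```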
3. Results
3.1 Psychological model of “Waku-Waku”
See Figure 2 for a summary of the subjective ratings for each cuing condition (‘expecting pleasant’, ‘unpredictable’, and ‘expecting unpleasant’). Based on these ratings, a linear mixed model was tested with valence and arousal only (2-axis model) and with expectation added (3-axis model). With the 2-axis model, “Waku-Waku” (‘W’) was modelled as follows (adjusted R2 = .90):
The linear model for the 3-axial model was as follows (adjusted R2 = .93).
As expected, the 3-dimensional model improved the fit by 3 percent of variance. Notably, the added third axis of expectation was highly loaded. When only one of the experimental sessions was included, the coefficient for the arousal axis did not meet our criterion, whereas those for the valence and expectation axes did (p < .05). For completeness, the fitted formulas for each session were: [MRI1: W = .53*V + .04*A + .45*E; MRI2: W = .30*V + .12*A + .52*E; EEG: W = .29*V + .16*A + .61*E], where W, V, A, and E correspond to Waku-Waku, valence, arousal, and expectation, respectively. The AIC values of the three-axis model for each session (MRI1, MRI2, and EEG) were 670, 686, and 669, respectively. These AIC results support the comparability of the scores across the three sessions.
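The per-session formulas can be applied directly to a set of ratings, as in the sketch below. The coefficients are those listed above, with the second fMRI session labelled MRI2 to match the order of the AIC values; the example ratings (on the 0–100 scale) are hypothetical and the function name is illustrative.

```python
# Coefficients (valence, arousal, expectation) of the per-session formulas.
SESSION_WEIGHTS = {
    "MRI1": (0.53, 0.04, 0.45),
    "MRI2": (0.30, 0.12, 0.52),
    "EEG": (0.29, 0.16, 0.61),
}


def waku_waku(valence, arousal, expectation, session):
    """Evaluate W = wV*V + wA*A + wE*E for the given session's coefficients."""
    w_val, w_aro, w_exp = SESSION_WEIGHTS[session]
    return w_val * valence + w_aro * arousal + w_exp * expectation


# Hypothetical ratings: V = 70, A = 60, E = 80.
for name in SESSION_WEIGHTS:
    print(name, round(waku_waku(70, 60, 80, name), 1))
```

The small arousal coefficient in every session means that a change in the arousal rating shifts the estimate far less than an equal change in valence or expectation.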
3.2 Spectral EEG markers
The GMM was fitted to the inverse weights of each IC for 1–60 candidate clusters, and 15 clusters were selected based on the BIC values: the 949 IC maps were aggregated into 15 clusters (see Figure 3; Supplementary Figure 2 depicts the centroid coordinates of the dipole location for each cluster).
Spectral power was examined for each IC cluster (see Figure 4 for a summary of significant and marginally significant ICs; Supplementary Table 1 contains the statistical results of all IC clusters for completeness). For the valence, arousal, and expectation axes, 1, 7, and 1 IC clusters, respectively, were significant after FDR correction. Emotional valence was mainly represented by the θ band of IC cluster 6. Arousal, by contrast, could be quantified by many ICs, particularly in the α band. Of those, the most reliable (shared by 100% of participants) and strongest (having the highest Z-value) was the α band of IC cluster 5. As for expectation, the θ band of IC cluster 10 was significant.
Without the FDR correction, 4, 9, and 4 IC clusters were significant, as shown in Figure 4. Some IC clusters were shared by all participants (100%), but some were not (e.g., cluster 11 for valence, at 79%). Across all comparisons, some IC clusters were weakly significant (uncorrected p < .05) on more than one axis, such as cluster 5 for arousal and expectation. No single IC cluster (in the same frequency range) survived the correction on all three axes.
3.3 3-D linear model of BEI
Given the three-dimensional psychological model (2) and the corresponding neural correlates selected for each axis, we propose a conceptual BEI to estimate Waku-Waku (‘W’) as a linear combination of EEG_Val, EEG_Aro, and EEG_Exp, which correspond to the standardized (0–100) spectral power of an IC for the valence, arousal, and expectation axes, respectively. Given the statistical results, we selected the most robust and significant component that survived our criteria for each axis. Because multiple components were significant for the arousal axis, we simply selected the most robust component, that is, the one with the highest Z-score and the highest proportion of participants having the selected IC cluster (100%). The final selected formula for W uses IC6 (θ), IC5 (α), and IC10 (θ) for the valence, arousal, and expectation axes, respectively, where IC6, IC5, and IC10 correspond to the IC clusters reported in Figure 3, Figure 4, and Supplementary Table 1; θ and α in parentheses correspond to the frequency range of interest for each IC. An example workflow using formula (4) is depicted in Figure 5.
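The conceptual BEI can be sketched as a standardization step followed by a weighted sum. Several parts of this sketch are assumptions: the 0–100 standardization is implemented as min-max scaling against a per-axis calibration range, the weights are passed in as parameters, and the example uses the EEG-session coefficients from Section 3.1 purely as placeholders rather than as the coefficients of the final formula. The power values and calibration ranges are invented.

```python
AXES = ("valence", "arousal", "expectation")


def standardize_0_100(value, lo, hi):
    """Min-max scale a spectral power value to 0-100, clipped to that range.

    `lo` and `hi` are a hypothetical calibration range for the component.
    """
    if hi <= lo:
        raise ValueError("calibration range must satisfy lo < hi")
    scaled = 100.0 * (value - lo) / (hi - lo)
    return max(0.0, min(100.0, scaled))


def estimate_waku_waku(power, calibration, weights):
    """Weighted sum of standardized IC band power over the three axes."""
    total = 0.0
    for axis in AXES:
        lo, hi = calibration[axis]
        total += weights[axis] * standardize_0_100(power[axis], lo, hi)
    return total


# Invented band power: IC6 theta (valence), IC5 alpha (arousal), IC10 theta (expectation).
power = {"valence": 2.0, "arousal": 1.0, "expectation": 3.0}
calibration = {axis: (0.0, 4.0) for axis in AXES}  # hypothetical ranges
placeholder_weights = {"valence": 0.29, "arousal": 0.16, "expectation": 0.61}
w_estimate = estimate_waku_waku(power, calibration, placeholder_weights)
print(round(w_estimate, 2))
```

Keeping the weights and calibration ranges as arguments makes it straightforward to substitute the coefficients of the fitted 3-axis model and participant-specific ranges once they are available.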
4. Discussion
We proposed a prototype model of a brain-emotion-interface (BEI) to quantify “Waku-Waku” towards upcoming visual images using EEG neural markers, built on a three-dimensional psychological model.
4.1 3-D Psychological model
First, participants were given a psychological task in which emotion was triggered visually and they were required to imagine upcoming stimuli. Participants reported their subjective feelings when anticipating each of the three conditions (predicting pleasant, predicting unpleasant, and unpredictable) on 4 factors: “Waku-Waku”, valence, arousal, and expectation. As expected, the 3-D psychological model of “Waku-Waku”, which includes an ‘expectation’ axis, achieved an adequate fit. As quantified by the adjusted R2 values, the fit of the 3-D model was better than that of the 2-D model by 3 percent. This improvement is small, but it at least suggests that modelling human emotion with multiple axes is feasible.
In addition, the 3-D model revealed a high loading on the third axis relative to the two axes proposed by the classical 2-D circumplex model of emotion [1]. This indicates that the subjective feeling of interest, “Waku-Waku”, may depend heavily on anticipation rather than on a momentary feeling of pleasantness or arousal. This result is understandable because “Waku-Waku” is defined as a state in which one’s heart is moved by being pleased and expecting something pleasant. Nevertheless, as discussed earlier, Kansei, or our momentary mental state of emotion, may be modelled by a multitude of human affects and cognitions [7,8,25]. A psychological model obviously need not be restricted to the three axes proposed here (valence, arousal, and expectation). In our case the third axis represented the domain of prediction, or time; however, any other dimension associated with human senses may be adopted. Overall, our results suggest that modelling the putatively complex nature of our awareness may benefit from a multi-dimensional model.
4.2 Corresponding electrophysiological markers for the three axes
To determine the neural correlates of each axis of the BEI, the spectral power of independent components of the EEG was analyzed with a conventional spectral power analysis. The valence and arousal axes were quantified from the period in which participants were viewing an emotion-triggering picture, and the expectation axis was estimated from the delay period in which they were anticipating an upcoming picture according to the auditory cue tied to its valence type. As it turned out, we identified dissociable neural markers for each axis. Notably, there was no overlap across all three axes, although a few ICs were weakly related to two of the three axes. In other words, no single frequency range (θ, α, or β) of any IC cluster was responsible for all three axes, supporting the putative functional independence of these psychological axes.
4.2.1 IC markers of valence
For the valence axis, comparing the viewing of pleasant and unpleasant pictures, the θ band of IC cluster 6 was robustly significant, and the component was found in 93% of participants. In addition, three other ICs, including IC cluster 11 with high weights on frontal channels, were significant at the uncorrected level. Previous neuroimaging research suggests that a source in the proximity of orbitofrontal areas may be responsible for emotional valence [32]. A similar EEG study [33], which related subjective feelings during rest rather than during a task to another type of 3-D emotional space (valence, arousal, and dominance) and focused solely on the β-band power of IC clusters, found that IC clusters with sources localized in the posterior cingulate and right posterior temporal lobe were positively correlated with valence. In this study, the estimated dipole of cluster 6 was centered around the mid- to posterior cingulate regions (see Supplementary Figure 2). We thus found an IC with a dipole centered around the same region, but in a frequency band that differed from their study. The frequency difference is an interesting finding; however, our experimental paradigm certainly differs from that of Wyczesany & Ligeza (2014), and our finding may indicate that a relatively slow oscillation in the θ band induced by viewing visual information is associated with triggering remotely interconnected reward networks connected with the precuneus region, such as the midline orbitofrontal and anterior cingulate regions reported in fMRI studies [10].
4.2.2 IC markers for arousal
For the arousal axis, visually evoked neural activity while viewing high-arousal pictures was compared with that while viewing low-arousal pictures. The α band of IC cluster 5 was selected as the best target, although many ICs (7 of 15 clusters) were significant across the frequency ranges. It is notable that all 7 significant ICs were significant in the α band. It is widely known that the α band reflects arousal [19–21], which supports our results. We were rather surprised to find this many significant ICs for arousal, and therefore we cannot be conclusive about what sort of underlying neural mechanisms are responsible for visually evoked arousal. Rather than identifying a distinctive neural marker, this result supports the previous notion that the α band reflects human arousal in general. Although we made an effort to control the visuophysical features of the selected pictures, some physical properties, such as the luminance or brightness of each picture, might be related to some of the components found to be significant here. Future studies could quantify the detailed physical properties of visual stimuli; using a different picture set may also be necessary.
4.2.3 IC markers for expectation
For the expectation axis, neural activities when expecting a pleasant picture were compared against those when the valence of the expected picture was unknown. The θ-band of IC cluster 10, with its dipole centered around the right angular gyrus (in proximity to the inferior parietal lobule and lateral occipital complex regions), was significant. This region is known to be responsible for visuospatial attention [34] or maintenance of visual information in memory [35].
Again, we might have observed the same or a similar neural marker for expectation as for the valence axis, because the two conditions differed only in whether participants were anticipating or actually seeing a picture. One may perceive the distinct markers as good evidence for dissociating the putatively dissociable axes; however, using a similar approach of showing emotional pictures while brain function was monitored with fMRI [36], Bermpohl et al. (2006) reported that lateral occipital regions were activated while seeing emotional pictures rather than during the expectation phase, whereas anterior and posterior cingulate regions were responsible for expectation compared with neutral targets. One may attribute this discrepancy to methodological differences between EEG and fMRI, as EEG is suited to detecting electrical discharges while fMRI monitors cerebral blood flow. Another putative explanation is that our comparison did not include pictures with neutral valence. In our design, even in the unpredictable condition, the anticipated imagery could have been either pleasant or unpleasant, or an intermixed state fluctuating between the two extremes, but not a neutral image. Further detailed investigation would be necessary to discuss the overlap between dipole locations and BOLD responses obtained in fMRI studies.
Nonetheless, previous fMRI research supports dissociable networks for emotional expectancy and emotion perception [36], which in our case correspond to the expectation and valence axes, respectively. Notably, the neural marker selected for the expectation axis was also dissociable from those found to be significant for the valence axis, suggesting that including this third axis may help quantify dissociable neural processes. It is indeed reasonable to assume that different neural mechanisms underlie different cognitive or affective functions. Thus, our experimental design at least captured dissociable functions, although the results differed slightly from those of previous studies using different techniques.
5. Conclusion
We proposed a prototype BEI based on a multi-axis, three-dimensional model of emotion to quantify anticipatory excitement using EEG. The fidelity of the BEI should be examined in future studies; however, given a certain degree of accuracy backed by statistical results, our BEI may be able to quantify anticipatory excitement and be applicable at least to young adult Japanese (or Asian) individuals. We found only one IC and its corresponding frequency band for each of the valence and expectation axes; however, our results may not be conclusive owing to putative cultural or age differences. In addition, as our BEI was built only from EEG data collected for visual stimuli, similar experiments need to be conducted with stimuli in other modalities, such as audition and touch.
Moreover, the ICs were not perfectly shared across our participants; in particular, the IC clusters selected for the valence and expectation axes were not shared by all participants. This suggests that the currently proposed BEI model may not necessarily be applicable to all individuals, even within our collected sample. Close investigation and individual optimization of the selected IC and its frequency range may be necessary to achieve full compatibility of the BEI. In addition, we applied the GMM method to determine the number of clusters into which the ICs were grouped. While this method provides a quantitative means of determining the number of clusters, its outcome tends to fluctuate with slightly different parameters, such as the regularization parameter or the number of replications. Because a different number of clusters would obviously yield different statistical results, one should be cautious when applying the results observed here. Recent neuroscience studies propose that individually decoding and adjusting target neural activities outperforms a group-level approach [37,38], which is especially relevant when the aim of this BEI is to accurately quantify anticipatory excitement for a particular person. Just as our personalities differ widely, our underlying neuroarchitectures also differ; it is therefore plausible that the target frequency also needs to be optimized for each individual.
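The sensitivity of the GMM-based cluster count to the regularization parameter and the number of replications can be probed with a sketch such as the following. It is a hand-written one-dimensional EM fit with BIC-based selection, run on synthetic features; it is not the implementation used in this study, and the function names, the BIC criterion, and the parameter values are assumptions made for illustration only.

```python
import math
import random


def fit_gmm_1d(data, k, reg, n_iter=50, seed=0):
    """Fit a k-component 1-D Gaussian mixture by EM; return the final log-likelihood.

    `reg` (> 0) is added to every variance, mimicking a covariance regularizer.
    """
    rng = random.Random(seed)
    n = len(data)
    means = rng.sample(data, k)           # random initialization from the data
    variances = [1.0] * k
    weights = [1.0 / k] * k

    def responsibilities():
        resp, loglik = [], 0.0
        for x in data:
            dens = [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) + 1e-300    # guard against log(0)
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        return resp, loglik

    for _ in range(n_iter):
        resp, _loglik = responsibilities()          # E-step
        for j in range(k):                          # M-step
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = sum(r[j] * (x - means[j]) ** 2
                               for r, x in zip(resp, data)) / nj + reg
    return responsibilities()[1]


def bic(loglik, k, n):
    """Bayesian information criterion; a 1-D mixture has 3k - 1 free parameters."""
    return (3 * k - 1) * math.log(n) - 2.0 * loglik


def select_n_components(data, k_values, reg, n_replicates):
    """Return the k with the lowest BIC, keeping the best of several random restarts."""
    n = len(data)
    scores = {k: min(bic(fit_gmm_1d(data, k, reg, seed=r), k, n)
                     for r in range(n_replicates))
              for k in k_values}
    return min(scores, key=scores.get), scores


# Synthetic 1-D "IC features" drawn from three groups, NOT the study's data.
rng = random.Random(42)
features = [rng.gauss(mu, 0.5) for mu in (-5.0, 0.0, 5.0) for _ in range(30)]

selected = []
for reg, reps in [(1e-6, 1), (1e-6, 5), (1e-1, 5)]:
    k_best, _scores = select_n_components(features, range(1, 6), reg, reps)
    selected.append(k_best)
    print(f"reg={reg:g}, replicates={reps}: selected k = {k_best}")
```

Repeating such a sweep on the actual IC features, and checking whether the selected number of clusters is stable, would indicate how far downstream statistics depend on these settings.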
Our observations in this article may be limited in various respects; however, they should constitute a reasonable basis for quantifying our sense of Kansei. While a wide variety of BCIs exist in the field, our approach of considering multiple axes combined with EEG markers may become a new tool for neural consultation. Such a tool may be applicable not only to still pictures (e.g., artwork, pictures, and advertisement posters) but also to various other situations, such as evaluating emotional responses to motion pictures (movies, TV commercials). The BEI will certainly require further evidence and theoretical support; however, it may become a useful tool for Kansei engineering.
Acknowledgements
Patent applications (PCT/JP2016/003712; JP2016116449A; US15/768,782; EP16855084.6A) have been submitted based on part of the methods described in this manuscript.
All authors designed the behavioral paradigm and discussed the manuscript. N.K. and K.M. prepared the experimental materials and collected the data. M.G.M., N.K., and R.M. performed the EEG analyses. M.G.M. and G.L. designed and built the brain-computer interface. M.G.M., N.K., and G.L. wrote the manuscript. We are very grateful to Prof. Hirokazu Yanagihara at Hiroshima University for his mathematical advice on this project.
This research was supported by JST COI Grant Number JPMJCE1311. G.L. was supported by the New Energy and Industrial Technology Development Organization (NEDO), by ImPACT of CSTI, and by the Commissioned Research of NICT.