Poster No:
753
Submission Type:
Abstract Submission
Authors:
Jin-Su Kim1, Hyun-Chul Kim1
Institutions:
1Kyungpook National University, Daegu, Korea, Republic of
First Author:
Jin-Su Kim
Kyungpook National University
Daegu, Korea, Republic of
Co-Author:
Hyun-Chul Kim
Kyungpook National University
Daegu, Korea, Republic of
Introduction:
The CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) dataset is a collection of thousands of annotated YouTube opinion video clips, each labeled with subjective emotion and sentiment intensity [1]. Integrating functional magnetic resonance imaging (fMRI) with this dataset offers a new way to explore neural responses to emotion and sentiment during naturalistic stimulation. This study aimed to investigate neural responses during emotional processing of CMU-MOSEI's multimodal stimuli, together with an evaluation of the emotional responses elicited by the stimuli.
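For illustration only, the minimal Python sketch below shows one hypothetical way to represent the per-clip annotations described above; the class name, field names, and example values are assumptions rather than the dataset's actual API (the original release distributes its labels through the CMU Multimodal SDK).

from dataclasses import dataclass, field
from typing import Dict

# Hypothetical container mirroring the CMU-MOSEI annotation scheme:
# each clip carries a sentiment intensity score and per-emotion
# intensity ratings for six basic emotions.
@dataclass
class MoseiClipAnnotation:
    clip_id: str
    sentiment: float                                           # e.g., negative to positive intensity
    emotions: Dict[str, float] = field(default_factory=dict)   # e.g., {"happiness": 2.3, ...}

# Illustrative example with made-up values, not real annotations.
example = MoseiClipAnnotation(
    clip_id="video_0001",
    sentiment=1.7,
    emotions={"anger": 0.0, "disgust": 0.0, "fear": 0.2,
              "happiness": 2.3, "sadness": 0.1, "surprise": 0.4},
)
print(example.emotions["happiness"])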
Methods:
We selected 120 video clips from each of the six emotional categories (anger, disgust, fear, happiness, sadness, and surprise) in the CMU-MOSEI dataset; each clip lasted 45–75 seconds. During fMRI data acquisition, ten right-handed healthy participants (age = 24.6 ± 2.6 years; 4 males, 6 females) watched the clips and rated their emotions on nine-point Self-Assessment Manikin (SAM) scales [2] for valence, arousal, and dominance. To compare emotion levels in the valence-arousal space [3], the individual ratings were z-score normalized across all clips for each SAM dimension and then averaged across subjects. Raw fMRI data were preprocessed using a standard pipeline, and physiological noise from white matter and cerebrospinal fluid was removed using the anatomical component-based noise correction (aCompCor) method implemented in the Analysis of Functional NeuroImages (AFNI) software [5]. Subsequently, the preprocessed fMRI data were analyzed at the individual level with a general linear model (GLM) to estimate whole-brain activations evoked by the naturalistic video stimuli. The estimated beta-value maps from the GLM were entered into a one-way ANOVA (within-subject factor: emotion).
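As a sketch of the rating normalization described above (not the authors' actual code), the following Python snippet z-scores each SAM dimension across clips within each subject and then averages across subjects; the array shape, random example ratings, and variable names are assumptions made only for illustration.

import numpy as np

# Hypothetical ratings array: shape (n_subjects, n_clips, 3), where the
# last axis holds SAM scores (1-9) for valence, arousal, and dominance.
rng = np.random.default_rng(0)
ratings = rng.integers(1, 10, size=(10, 120, 3)).astype(float)

# Z-score each SAM dimension across all clips, separately per subject,
# so that subjects with different rating ranges become comparable.
mean_per_subject = ratings.mean(axis=1, keepdims=True)         # (10, 1, 3)
std_per_subject = ratings.std(axis=1, ddof=1, keepdims=True)   # (10, 1, 3)
z_ratings = (ratings - mean_per_subject) / std_per_subject     # (10, 120, 3)

# Average the normalized scores across subjects to place each clip
# in the group-level valence-arousal(-dominance) space.
group_scores = z_ratings.mean(axis=0)                          # (120, 3)
valence, arousal, dominance = group_scores.T
print(valence[:5], arousal[:5])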

Results:
We observed that the rated scores for each emotional category were distributed close to the regions reported in previous studies within the valence-arousal space [3,4]. In particular, previous studies have consistently located happiness in the high-valence, high-arousal region (Fig. 2a), and subjects in this study likewise gave high valence and arousal ratings when watching happiness-associated video clips. Figure 2b shows the ANOVA results: the superior medial gyrus, the medial frontal gyrus, and the insula showed positive activation across all contrast maps, whereas regions associated with emotional processing, such as the hippocampus, the olfactory cortex, and the middle temporal gyrus, showed negative activation.
Conclusions:
We found that video clips from the CMU-MOSEI dataset can elicit changes in emotional and neural responses. For further investigation, the distribution of rated scores should also be compared in the three-dimensional emotion space that includes dominance. Additionally, features extracted from deep neural networks, such as convolutional neural networks, should be examined for consistency with the statistical analysis [6].
Emotion, Motivation and Social Neuroscience:
Emotional Perception 1
Modeling and Analysis Methods:
Activation (eg. BOLD task-fMRI) 2
Keywords:
Emotions
FUNCTIONAL MRI
1|2 Indicates the priority used for review
References:
Acknowledgement: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2022-00166735 & No. RS-2023-00218987).
1. Zadeh, A. B., Liang, P. P., Poria, S., Cambria, E., & Morency, L. P. (2018). Multimodal language analysis in the wild: CMU-MOSEI dataset and interpretable dynamic fusion graph. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 2236-2246).
2. Bradley, M. M., & Lang, P. J. (1994). Measuring emotion: The Self-Assessment Manikin and the semantic differential. Journal of Behavior Therapy and Experimental Psychiatry, 25(1), 49-59.
3. Russell, J. A., Lewicka, M., & Niit, T. (1989). A cross-cultural study of a circumplex model of affect. Journal of Personality and Social Psychology, 57(5), 848.
4. Bhattacharjee, A., et al. (2018). On the performance analysis of APIs recognizing emotions from video images of facial expressions. In 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE.
5. Cox, R. W. (1996). AFNI: Software for analysis and visualization of functional magnetic resonance neuroimages. Computers and Biomedical Research, 29(3), 162-173.
6. Kim, H. C., Bandettini, P. A., & Lee, J. H. (2019). Deep neural network predicts emotional responses of the human brain from functional magnetic resonance imaging. NeuroImage, 186, 607-627.