Surveying the Landscape: Early Research & Insights in NeuroAI
- Bassem Ben Ghorbel
- Feb 19, 2025
- 4 min read
As part of our journey with NeuroAI, we’ve embarked on a survey of seminal research papers to better understand the challenges and innovations in multi-modal emotion recognition. In this post, we share our initial readings, key takeaways, and the constructive feedback that is shaping our approach.

1. Papers Reviewed by the Team
Hajer’s Review: Olfactory-Enhanced Video on EEG-Based Emotion Recognition
Title: An Investigation of Olfactory-Enhanced Video on EEG-Based Emotion Recognition
Link: IEEE Xplore
Overview: This study explores how integrating olfactory stimuli with video content affects EEG-based emotion recognition. Participants watched 2-minute videos paired with specific odors while wearing EEG caps and SMI eye-tracking glasses.
Key Experiment Design:
Emotion Evocation: Each video was crafted to trigger a single target emotion.
Stimulus Pairing: Each odor was mapped to a specific emotion.
Video Conditions Tested:
TVEP: Traditional Video with Early Stimulation
TVLP: Traditional Video with Later Stimulation
OVEP: Olfactory-Enhanced Video with Early Stimulation
OVLP: Olfactory-Enhanced Video with Later Stimulation
Feedback:
Reviewers noted that while the study effectively combined EEG and EOG data, incorporating additional modalities could further boost accuracy.
Wafa’s Review: Intrinsic Features of EEG Signals via Empirical Mode Decomposition for Depression Recognition
Title: Exploring the Intrinsic Features of EEG Signals via Empirical Mode Decomposition for Depression Recognition
Link: Notion PDF
Overview: This paper presents an EMD-based approach to extract intrinsic features from EEG signals for better depression recognition, addressing limitations of traditional methods (a brief, generic sketch of EMD-based feature extraction follows at the end of this review).
Key Contributions:
Introduces a regularization parameter to optimize intrinsic feature extraction.
Evaluates the framework across four EEG datasets, showing improved performance over methods like FFT, CSP, CNN, DNN, and LSTM.
Datasets Used:
Dataset 1: 64-channel EEG (15 depressed patients, 20 healthy controls)
Dataset 2: 128-channel EEG (MODMA dataset: 24 depressed, 29 healthy controls)
Dataset 3: 3-channel EEG (81 depressed, 89 healthy controls)
Dataset 4: 3-channel EEG (Auditory Stimulus-Evoked: 105 depressed, 109 healthy controls)
Key Findings:
Best classification accuracies ranged from 0.7768 to 0.8850.
The method notably reduced the common issue of mode mixing in EMD.
Feedback:
Some concerns were raised about how clearly the reduction in mode mixing was quantified and whether the datasets are representative in terms of age, ethnicity, and depression severity.
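To make the EMD idea more concrete, here is a minimal, generic sketch of decomposing one EEG channel into intrinsic mode functions (IMFs) and summarizing each one, using the PyEMD package. The paper's regularized decomposition is not reproduced here, and the three summary statistics per IMF are illustrative choices on our part.

```python
# Minimal sketch of EMD-based EEG feature extraction (illustrative only;
# the paper's regularized decomposition is not reproduced here).
import numpy as np
from PyEMD import EMD  # pip install EMD-signal


def emd_features(eeg_channel: np.ndarray, max_imfs: int = 5) -> np.ndarray:
    """Decompose a single EEG channel into IMFs and summarize each one."""
    imfs = EMD()(eeg_channel)[:max_imfs]  # intrinsic mode functions
    feats = []
    for imf in imfs:
        feats.extend([
            imf.mean(),        # central tendency of the mode
            imf.std(),         # spread of the mode
            np.sum(imf ** 2),  # energy of the mode
        ])
    return np.array(feats)


if __name__ == "__main__":
    fs = 128                                   # assumed sampling rate, Hz
    t = np.arange(0, 4, 1 / fs)
    synthetic = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)
    print(emd_features(synthetic).shape)       # e.g. (15,) for 5 IMFs x 3 stats
```

In practice, features like these would be computed per channel and concatenated before being fed to a classifier.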
Sabaa’s Review: Deep Learning for Multimodal Emotion Recognition using EEG and Facial Expressions
Title: Deep Learning for Multimodal Emotion Recognition using EEG and Facial Expressions
Link: Notion PDF
Overview: This study integrates EEG signals with facial expression analysis to enhance emotion recognition accuracy.
Methodology:
Facial Feature Extraction: Utilizes a pre-trained CNN (DeepVANet) with an attention mechanism.
EEG Feature Capture: Employs CNNs with both local and global convolution kernels.
Fusion & Classification: Combines features from both modalities at the feature level to achieve high accuracy (a minimal sketch of this two-branch idea is shown just below).
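Here is a minimal PyTorch sketch of that two-branch pattern: a 1-D CNN over EEG with one local (short) and one global (long) kernel, a small 2-D CNN standing in for the pre-trained facial extractor, and feature-level fusion by concatenation. Layer sizes, kernel widths, and the classifier head are our own illustrative assumptions, not the architecture reported in the paper.

```python
# Illustrative two-branch fusion model (not the paper's exact architecture);
# layer sizes and kernel widths are assumptions for demonstration.
import torch
import torch.nn as nn


class EEGBranch(nn.Module):
    """1-D CNN over EEG with a local (short) and a global (long) kernel."""
    def __init__(self, n_channels: int = 32):
        super().__init__()
        self.local_conv = nn.Conv1d(n_channels, 16, kernel_size=7, padding=3)
        self.global_conv = nn.Conv1d(n_channels, 16, kernel_size=63, padding=31)
        self.pool = nn.AdaptiveAvgPool1d(1)

    def forward(self, x):                # x: (batch, channels, samples)
        local = self.pool(torch.relu(self.local_conv(x)))
        glob = self.pool(torch.relu(self.global_conv(x)))
        return torch.cat([local, glob], dim=1).flatten(1)   # (batch, 32)


class FaceBranch(nn.Module):
    """Small 2-D CNN standing in for a pre-trained facial feature extractor."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, x):                # x: (batch, 3, H, W)
        return self.conv(x).flatten(1)   # (batch, 16)


class FusionClassifier(nn.Module):
    """Concatenate EEG and facial features, then classify (e.g., valence)."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.eeg, self.face = EEGBranch(), FaceBranch()
        self.head = nn.Linear(32 + 16, n_classes)

    def forward(self, eeg, face):
        return self.head(torch.cat([self.eeg(eeg), self.face(face)], dim=1))


model = FusionClassifier()
logits = model(torch.randn(4, 32, 256), torch.randn(4, 3, 64, 64))
print(logits.shape)  # torch.Size([4, 2])
```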
Key Findings:
Multimodal fusion outperformed single-modal approaches, with the following accuracies:
DEAP Dataset: Valence: 96.63%, Arousal: 97.15%
MAHNOB-HCI Dataset: Valence: 96.69%, Arousal: 96.26%
Preprocessing:
EEG signals were down-sampled to 128 Hz and filtered (4–45 Hz).
Eye artifact removal was performed using Blind Source Separation (BSS); a short preprocessing sketch follows this list.
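Below is a sketch of those preprocessing steps using MNE-Python: down-sampling to 128 Hz, band-pass filtering to 4–45 Hz, and ICA (one common blind-source-separation technique) to remove eye artifacts. The file path and EOG channel name are placeholders, and the exact BSS variant used in the paper may differ.

```python
# Sketch of the preprocessing steps above with MNE-Python; the file path and
# EOG channel name are placeholders, and ICA is used here as the BSS step
# (the paper's exact BSS variant may differ).
import mne

raw = mne.io.read_raw_fif("subject01_raw.fif", preload=True)  # placeholder file
raw.resample(128)                      # down-sample to 128 Hz
raw.filter(l_freq=4.0, h_freq=45.0)    # band-pass 4-45 Hz

# Blind source separation via ICA: fit, flag components correlated with EOG,
# and reconstruct the signal without them.
ica = mne.preprocessing.ICA(n_components=20, random_state=42)
ica.fit(raw)
eog_indices, _ = ica.find_bads_eog(raw, ch_name="EOG1")  # placeholder channel
ica.exclude = eog_indices
clean = ica.apply(raw.copy())
```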
Feedback:
While the multimodal approach shows promise, reviewers suggested incorporating additional modalities (e.g., non-physiological signals) to further enrich the model.
2. Collective Feedback & Additional Critiques
Hajer’s Observations:
Both the olfactory-enhanced and pure EEG-based emotion recognition systems could benefit from additional modalities beyond EEG and EOG to further improve accuracy.
Wafa’s Additional Input:
The paper on EMD-based feature extraction is promising but requires more details on quantifying mode mixing reduction and a broader dataset demographic.
Bechir’s Comments:
The study on temporally localized emotional events in EEG could strengthen its dataset by integrating augmented video experiences and reporting additional evaluation metrics.
Bassem’s Critique:
The machine learning framework for detecting mental stress leans heavily on facial expressions for ground truth, potentially biasing EEG performance. Including modalities like audio or galvanic skin response (GSR) could mitigate this.
Key Technical Insights:
Preprocessing Techniques: Common EEG methods include the Wavelet Transform, FFT, and EMD, with strategies such as selecting informative intrinsic mode functions (IMFs) to mitigate mode mixing (see the feature-extraction sketch after this list).
Multi-Modal Approaches: Utilizing attention mechanisms is becoming a standard practice, particularly in fusing EEG with video data.
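As a concrete example of the preprocessing insight above, the sketch below computes two of the mentioned feature types: FFT-based band power (via Welch's method) and discrete wavelet sub-band energies. The band edges and the 'db4' wavelet are common defaults we chose for illustration, not values taken from the reviewed papers.

```python
# Sketch of FFT- and wavelet-based EEG feature extraction; band edges and the
# 'db4' wavelet are common defaults, not taken from the reviewed papers.
import numpy as np
import pywt
from scipy.signal import welch

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}


def band_powers(signal: np.ndarray, fs: int = 128) -> dict:
    """Average spectral power in each canonical EEG band (FFT/Welch route)."""
    freqs, psd = welch(signal, fs=fs, nperseg=fs * 2)
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}


def wavelet_energies(signal: np.ndarray, wavelet: str = "db4", level: int = 4):
    """Energy of each sub-band from a discrete wavelet decomposition."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    return [float(np.sum(c ** 2)) for c in coeffs]


fs = 128
t = np.arange(0, 4, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.3 * np.random.randn(t.size)
print(band_powers(eeg, fs))
print(wavelet_energies(eeg))
```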
3. How Do Existing Technologies Approach Multi-Modal Emotion Recognition?
Existing technologies often handle EEG, ECG, facial expression, speech, and movement analysis by:
Preprocessing Signals: Using advanced mathematical methods to extract clean, relevant features.
Feature Fusion: Applying attention-based mechanisms to integrate data from multiple modalities for a holistic understanding (see the fusion sketch after this list).
Deep Learning Models: Leveraging CNNs for image data and LSTMs/Transformers for sequential data to capture temporal dependencies and nuanced emotional cues.
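To illustrate the attention-based fusion pattern, here is a minimal sketch that learns a relevance score per modality and combines EEG and facial feature vectors as a softmax-weighted sum. The feature dimension and the single-layer scoring network are illustrative assumptions on our part.

```python
# Minimal attention-weighted fusion of per-modality feature vectors
# (the dimension and the scoring layer are illustrative assumptions).
import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    """Score each modality's feature vector and combine with softmax weights."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # one relevance score per modality

    def forward(self, modality_feats):    # list of (batch, dim) tensors
        stacked = torch.stack(modality_feats, dim=1)          # (batch, M, dim)
        weights = torch.softmax(self.score(stacked), dim=1)   # (batch, M, 1)
        return (weights * stacked).sum(dim=1)                 # (batch, dim)


fusion = AttentionFusion(dim=64)
eeg_feat, face_feat = torch.randn(8, 64), torch.randn(8, 64)
fused = fusion([eeg_feat, face_feat])
print(fused.shape)  # torch.Size([8, 64])
```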
These insights are guiding our exploration as we continue to build a more robust, multi-faceted emotion recognition system for NeuroAI.
Related PVs:
Check PVs 9 and 10 in the docs: https://docs.google.com/document/d/1STzaunNOn6xlyQjVtN9BcVp-MwhMXvwM-2-TlhX1vds/edit?tab=t.zgr4osbffgvt
Conclusion
Our initial survey of research papers has provided us with a wealth of knowledge—from innovative experimental designs and advanced preprocessing techniques to insightful critiques. Each paper contributes valuable ideas that will inform the development of our multi-modal emotion recognition system. As we move forward, these insights will play a crucial role in shaping our approach to integrating EEG, facial expressions, and beyond, paving the way for a more nuanced and effective AI-driven psychological assessment tool.
Stay tuned for more updates as we deepen our research and begin to synthesize these insights into actionable components for NeuroAI!


