A Personality-Aware Hybrid Deep Learning Framework for Anxiety and Depression Detection Using a Neuro-Temporal Sparse Modulation Network
Received: 12 January 2026 | Revised: 7 February 2026 and 13 February 2026 | Accepted: 16 February 2026 | Online: 6 June 2026
Corresponding author: Raksha Rajanna
Abstract
Early detection of depressive and anxiety disorders remains an active area of research because symptoms differ between individuals and change over time. This study proposes a multimodal hybrid model, the Neuro-Temporal Sparse Modulation Network (NTSM-Net), aimed at more accurate identification of these disorders. NTSM-Net combines neuro-inspired sparse encoding, a Bayesian Temporal Transformer for uncertainty-aware temporal modeling, a Behavioral Micro Event Detector for extracting precise behavioral cues, and personality-aware modulation for individualized inference. Rather than issuing a clinical diagnosis, NTSM-Net produces emotion trajectories and estimates of potential risk severity. Evaluated on five publicly available multimodal datasets, NTSM-Net achieved 96.8% accuracy and a 96.9% F1-score, outperforming recent models. Ablation and robustness analyses quantify the contribution of neuro-inspired sparse encoding, probabilistic temporal modeling, behavioral micro-event detection, and personality modulation to identifying individuals with depression- and anxiety-related disorders.
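As a rough illustration of two of the components named above, the following minimal sketch shows one common way sparse encoding and personality-aware modulation can be realized. The abstract does not specify either mechanism, so everything here is an assumption: top-k magnitude selection stands in for the sparse encoder, a feature-wise scale-and-shift (FiLM-style) stands in for the modulation, and the names `top_k_sparse`, `personality_modulate`, `scale_w`, and `shift_w` are hypothetical.

```python
from typing import List, Sequence


def top_k_sparse(features: Sequence[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries and zero the rest.

    Assumption: a simple stand-in for sparse encoding, not the paper's encoder.
    """
    if k <= 0:
        return [0.0] * len(features)
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(features)), key=lambda i: abs(features[i]), reverse=True)
    keep = set(order[:k])
    return [float(x) if i in keep else 0.0 for i, x in enumerate(features)]


def personality_modulate(
    features: Sequence[float],
    traits: Sequence[float],
    scale_w: Sequence[Sequence[float]],
    shift_w: Sequence[Sequence[float]],
) -> List[float]:
    """Feature-wise scale-and-shift conditioned on a trait vector.

    y_i = (1 + scale_w[i] . traits) * x_i + shift_w[i] . traits
    Assumption: hypothetical linear conditioning; weights would be learned.
    """
    out = []
    for i, x in enumerate(features):
        gamma = 1.0 + sum(w * t for w, t in zip(scale_w[i], traits))
        beta = sum(w * t for w, t in zip(shift_w[i], traits))
        out.append(gamma * x + beta)
    return out


if __name__ == "__main__":
    x = [0.2, -1.5, 0.05, 0.9]          # toy feature vector
    sparse = top_k_sparse(x, k=2)       # only the two strongest activations survive
    traits = [1.0, 0.0]                 # toy two-dimensional trait vector
    scale_w = [[0.5, 0.0]] * 4          # hypothetical weights
    shift_w = [[0.0, 0.0]] * 4
    print(sparse)
    print(personality_modulate(sparse, traits, scale_w, shift_w))
```

With an all-zero trait vector the modulation reduces to the identity, which makes the personality branch easy to ablate in a sketch like this one.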
Keywords:
neuro-inspired sparse coding, Bayesian temporal modeling, personality-aware modulation, micro-event detection, hybrid deep learning, behavioral signal modeling
License
Copyright (c) 2026 R. Raksha, M. P. Pushpalatha, K. P. Impana

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
