A Systematic Review of Interpretability and Explainability for Speech Emotion Features in Automatic Speech Emotion Recognition
Journal article   Open access   Peer reviewed

Hiruni Maleesa Jayasinghe, Kok Wai Wong and Anupiya Nugaliyadde
Pattern Recognition, Vol. 171 (Part A), 112122
2026
CC BY 4.0, Open Access

Abstract

Keywords: automatic speech emotion recognition; explainability; interpretability; speech emotion features
Speech Emotion Recognition (SER) is the task of identifying emotional states from the human voice. Automatic SER (ASER) is a research domain in which Machine Learning (ML) is used to extract and analyze speech features to predict emotional states. Applying ML in a sensitive area such as SER requires models that are transparent and reliable. For instance, understanding the decision-making that underlies ASER is crucial in real-world applications such as mental health monitoring systems. Researchers have therefore focused on advancing the interpretability and explainability of ASER models. Interpretability maximizes human understanding of complex processes by providing meaningful insights. Explainability presents those interpretable insights in a clear, human-understandable manner. Standard interpretability methods include feature importance, feature selection, and attention models. Explainability methods include SHapley Additive exPlanations (SHAP), visualizations such as embedding plots and saliency maps, and feature importance analysis. This systematic review explores the interpretability and explainability methods applied to speech emotion features. It aims to identify progress in the area, identify potential research gaps, and motivate future research.

InCites Highlights

These are selected metrics from InCites Benchmarking & Analytics tool, related to this output

Citation topics
4 Electrical Engineering, Electronics & Computer Science
4.174 Digital Signal Processing
4.174.2794 Speech Emotion Recognition
Web of Science research areas
Computer Science, Artificial Intelligence
Engineering, Electrical & Electronic
ESI research areas
Engineering