Output list
Journal article
Published 2026
Pattern recognition, 171, Part A, 112122
Speech Emotion Recognition (SER) is a method of identifying emotional states from the human voice. Automatic SER (ASER) is a research domain where Machine Learning (ML) is used to extract and analyze speech features to predict emotional states. Using ML in a sensitive area like SER requires transparency and reliability of the models. For instance, ASER is crucial to understanding the underlying decision-making in real-world applications such as mental health monitoring systems. Researchers, therefore, have focused attention on advancing the interpretability and explainability of ASER models. Interpretability maximizes human understanding of complex processes by providing meaningful insights. Explainability presents the interpretable insights in a clear and human-understandable manner. Some standard interpretability methods include feature importance, feature selection methods, and attention models. Explainability methods include SHapley Additive exPlanations (SHAP), visualizations using embedding plots, saliency maps, etc., and feature importance analysis. The current systematic review explores the different interpretability and explainability methods for speech emotion features. The current review paper aims to identify the progress in the area, identify potential research gaps, and motivate future research.
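As a rough illustration of the idea behind SHAP mentioned in the abstract above, the sketch below computes exact Shapley values for a tiny hand-written scoring function. It does not use the SHAP library's API. The feature names (pitch, energy, speaking rate), the weights, and the all-zero baseline are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input, replacing absent features by a baseline."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value, all others the baseline value.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(features):
    # Hypothetical linear "arousal score" over three speech features.
    pitch, energy, rate = features
    return 0.5 * pitch + 2.0 * energy - 1.0 * rate


x = [1.0, 0.5, 0.2]          # hypothetical feature values for one utterance
baseline = [0.0, 0.0, 0.0]   # hypothetical reference input
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

For a linear model each attribution reduces to the weight times the feature's offset from the baseline, and the attributions sum to the difference between the prediction and the baseline prediction. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximations.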
Journal article
The paper wasps (Hymenoptera: Vespidae: Polistinae) of Sri Lanka recorded from recent investigations
Published 2024
Zootaxa, 5406, 4, 519 - 534
Paper wasps of the subfamily Polistinae Lepeletier have been studied in many countries because of their importance as pest species, predators, and model organisms in research, and because of their medical significance. Seven species have been well documented in Sri Lanka: five represent the genus Ropalidia Guérin-Méneville and two the genus Polistes Latreille. However, the species have not been studied systematically for many years and recent records are not available. In the present study, investigations for wasps (Vespidae) were conducted in 28 locations across all provinces and climatic zones of the country. Five species of paper wasps were found in 15 of the locations investigated, four in the genus Ropalidia and one in the genus Polistes. Ropalidia marginata Lepeletier was the most abundant and widely distributed species, while the other species had more limited distributions. Polistes (Gyrostoma) olivaceus De Geer, previously recorded from Sri Lanka, was not recorded during the present study. All the species of paper wasps encountered in the present study showed changes in distribution from their historical locations, a decline in distributional ranges, and occurrence in new locations.
Journal article
An ultra-specific image dataset for automated insect identification
Published 2022
Multimedia Tools and Applications, 81, 3223 - 3251
Automated identification of insects is a difficult task in which challenges such as limited data, imbalanced class counts, and background noise need to be overcome for better performance. This paper describes an image dataset that consists of a limited, imbalanced number of images of six genera of the subfamily Cicindelinae (tiger beetles) of the order Coleoptera. The image collection is highly diverse, as the images were taken from different sources, at different angles, and on different scales. Thus, the salient regions of the images vary widely. One of the main aims was therefore to characterise the dataset by comparing the distinctive patterns and features in the images. The dataset was evaluated with different classification algorithms, including deep learning models based on different approaches, to provide a benchmark. The dynamic nature of the dataset poses a challenge to image classification algorithms. However, transfer learning models using a softmax classifier performed well on the current dataset. Tiger beetle classification can be challenging even to a trained human eye; this dataset therefore opens a new avenue for classification algorithms to develop and to identify features that human eyes have not identified.
Journal article
An ultra-specific image dataset for automated insect identification
Published 2021
Multimedia Tools and Applications
Automated identification of insects is a difficult task in which challenges such as limited data, imbalanced class counts, and background noise need to be overcome for better performance. This paper describes an image dataset that consists of a limited, imbalanced number of images of six genera of the subfamily Cicindelinae (tiger beetles) of the order Coleoptera. The image collection is highly diverse, as the images were taken from different sources, at different angles, and on different scales. Thus, the salient regions of the images vary widely. One of the main aims was therefore to characterise the dataset by comparing the distinctive patterns and features in the images. The dataset was evaluated with different classification algorithms, including deep learning models based on different approaches, to provide a benchmark. The dynamic nature of the dataset poses a challenge to image classification algorithms. However, transfer learning models using a softmax classifier performed well on the current dataset. Tiger beetle classification can be challenging even to a trained human eye; this dataset therefore opens a new avenue for classification algorithms to develop and to identify features that human eyes have not identified.
Journal article
Deep learning approach to classify Tiger beetles of Sri Lanka
Published 2021
Ecological Informatics, 62, Art. 101286
Deep learning has been shown to achieve dramatic results in image classification tasks. However, deep learning models require large amounts of data to train. Most real-world datasets, and insect classification datasets in particular, do not contain a large number of training images. These images also have a large amount of noise and considerable variation. The paper proposes a novel architectural model which removes the background noise and classifies the tiger beetles. Here, the object location is identified using contours after converting the original coloured image to white on a black background. The remaining background is then eliminated using the GrabCut algorithm. The extracted images are then classified using a modified SqueezeNet transfer learning model to identify the tiger beetle class up to genus level. Transfer learning models with fewer trainable parameters than the total number of parameters in the original model performed well. When evaluating the results, it was found that freezing the uppermost layers of the SqueezeNet model yields better accuracy, while freezing the lowermost layers reduces the validation accuracy. The proposed model achieved more than 90% accuracy on the test set in 40 epochs using 701,481 trainable parameters by freezing the top 19 layers of the original model. Improving the pre-processing to localize the insect improved the accuracy.
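The localisation step in the abstract above (convert the image to a white object on a black background, then locate the object) can be sketched roughly as follows. This is not the authors' pipeline: contour tracing is swapped for a simpler bounding box over foreground pixels, and the threshold, the dark-insect-on-light-background assumption, and the tiny grayscale image are all hypothetical.

```python
def to_binary_mask(gray, threshold):
    """Mark pixels darker than `threshold` as foreground (1), the rest as background (0).

    Assumes a dark insect on a light background, which is an illustrative choice.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def bounding_box(mask):
    """Return (x, y, width, height) of the foreground pixels, or None if there are none."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    x, y = min(cols), min(rows)
    return (x, y, max(cols) - x + 1, max(rows) - y + 1)


# Hypothetical 5x6 grayscale image: a dark blob on a bright background.
gray = [
    [250, 250, 250, 250, 250, 250],
    [250,  40,  30, 250, 250, 250],
    [250,  35,  20,  45, 250, 250],
    [250, 250,  50, 250, 250, 250],
    [250, 250, 250, 250, 250, 250],
]

mask = to_binary_mask(gray, threshold=128)
box = bounding_box(mask)
print(box)  # (1, 1, 3, 3)
```

A rectangle of this kind is a typical way to seed a GrabCut-style segmentation, which then refines the boundary between the insect and the remaining background.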
Journal article
Published 2020
Computers and Electronics in Agriculture, 173, Article 105438
The khapra beetle, Trogoderma granarium Everts, is the most critical biosecurity pest threat to the grains industry worldwide. To prevent incursion of the khapra beetle, highly accurate and reliable diagnostic tools are required to differentiate it from other morphologically closely related Trogoderma sp., in particular at the larval stage. However, at present, it can only be identified by highly skilled taxonomists. Furthermore, suspected Trogoderma sp. found in grain products are often body fragments, such as larval skins or fragmented adults, which are impossible to diagnose morphologically. This work explored the combination of visible near infrared hyperspectroscopy (VNIH) and deep learning tools to identify the khapra beetle. About 2000 hyperspectral images were acquired in this study. Images of T. granarium and Trogoderma variabile adults, larvae, larval skins, and fragments of adults and larvae were subjected to two deep learning models, Convolutional Neural Networks (CNN) and Capsule Network, for analysis. Overall, above 90% accuracy was obtained with both models, with Capsule Network achieving a higher accuracy of 96%. For whole adult bodies and adult fragments, the accuracy achieved was 96.2% and 91.7%, respectively. For whole larvae, larval skins, and larval fragments, accuracies of 93.4%, 91.6%, and 90.3% were achieved. Ventral orientation gave better accuracy than dorsal orientation of the insects for both larval and adult stages. Based on the above results, VNIH imaging technology coupled with appropriate machine learning tools can be used to distinguish one of the most notorious stored grain pests, the khapra beetle, from other morphologically similar Trogoderma sp. such as T. variabile. In particular, the technology offers a new approach to the effective identification of Trogoderma sp. from body fragments and larval skins, which are otherwise impossible to diagnose taxonomically.
Book chapter
RCNN for region of interest detection in whole slide images
Published 2020
Neural Information Processing, 1333, 625 - 632
Digital pathology has attracted significant attention in recent years. Analysis of Whole Slide Images (WSIs) is challenging because they are very large, i.e., of Giga-pixel resolution. Identifying Regions of Interest (ROIs) is the first step for pathologists to analyse further the regions of diagnostic interest for cancer detection and other anomalies. In this paper, we investigate the use of RCNN, which is a deep machine learning technique, for detecting such ROIs only using a small number of labelled WSIs for training. For experimentation, we used real WSIs from a public hospital pathology service in Western Australia. We used 60 WSIs for training the RCNN model and another 12 WSIs for testing. The model was further tested on a new set of unseen WSIs. The results show that RCNN can be effectively used for ROI detection from WSIs.
Journal article
Modelling email traffic workloads with RNN and LSTM models
Published 2020
Human-centric Computing and Information Sciences, 10, 1, Art. 39
Analysis of time series data has been a challenging research subject for decades. Email traffic has recently been modelled as a time series function using a Recurrent Neural Network (RNN) and RNNs were shown to provide higher prediction accuracy than previous probabilistic models from the literature. Given the exponential rise of email workloads which need to be handled by email servers, in this paper we first present and discuss the literature on modelling email traffic. We then explain the advantages and limitations of different approaches as well as their points of agreement and disagreement. Finally, we present a comprehensive comparison between the performance of RNN and Long Short Term Memory (LSTM) models. Our experimental results demonstrate that both approaches can achieve high accuracy over four large datasets acquired from different universities’ servers, outperforming existing work, and show that the use of LSTM and RNN is very promising for modelling email traffic.
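The abstract above does not say how the email traffic series were framed for the RNN and LSTM models. One common framing, sketched below, slides a fixed-length window over the series so that each window of past values is paired with the next value as the prediction target. The function name, the window length, and the hourly counts are hypothetical.

```python
def make_windows(series, window):
    """Split a series into (past values, next value) pairs for one-step-ahead prediction."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


# Hypothetical hourly counts of incoming emails.
email_counts = [120, 135, 150, 160, 155, 170]

inputs, targets = make_windows(email_counts, window=3)
for past, nxt in zip(inputs, targets):
    print(past, "->", nxt)
```

Either recurrent model can then be trained on the same pairs, which keeps a comparison between them like-for-like.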
Doctoral Thesis
Enhancing natural language understanding using meaning representation and deep learning
Published 2019
Natural Language Understanding (NLU) is one of the complex tasks in artificial intelligence. Machine learning was introduced to address the complex and dynamic nature of natural language. Deep learning gained popularity within the NLU community due to its capability of learning features directly from data, as well as learning from the dynamic nature of natural language. Furthermore, deep learning has been shown to learn hidden features automatically and to outperform most other machine learning approaches for NLU. Deep learning models require natural language inputs to be converted to vectors (word embeddings). Word2Vec and GloVe are word embeddings which are designed to capture analogy context-based statistics and provide lexical relations between words. A context-based statistical approach alone does not capture the prior knowledge required to understand language combined with words. Although a deep learning model receives word embeddings, language understanding also requires Reasoning, Attention and Memory (RAM), which are key factors in understanding language. Current deep learning models focus on reasoning, attention or memory individually. To properly understand a language, however, all three factors of RAM should be considered. Also, language normally involves long sequences, which create dependencies that must be captured in order to understand it. However, current deep learning models developed to hold longer sequences either forget or are affected by vanishing or exploding gradients. This thesis focuses on these three main areas. A word embedding technique is introduced that integrates analogy context-based statistical and semantic relationships, as well as extracts from a knowledge base, to hold an enhanced meaning representation. A Long Short-Term Reinforced Memory (LSTRM) network is also introduced; it addresses RAM and is validated by testing on question answering data sets which require RAM.
Finally, a Long Term Memory Network (LTM) is introduced to address language modelling. Good language modelling requires learning from long sequences. Therefore, this thesis demonstrates that integrating semantic knowledge and a knowledge base generates enhanced meaning and deep learning models that are capable of achieving RAM and long-term dependencies so as to improve the capability of NLU.
Conference paper
Language modeling through Long-Term memory network
Published 2019
2019 International Joint Conference on Neural Networks (IJCNN)
International Joint Conference on Neural Networks (IJCNN) 2019, 14/07/2019–19/07/2019, Budapest, Hungary
Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTM), and Memory Networks, which contain memory, are widely used to learn patterns in sequential data. Sequential data has long sequences that hold relationships. RNNs can handle long sequences but suffer from the vanishing and exploding gradient problems. While LSTM and other memory networks address this problem, they are not capable of handling long sequences (sequence patterns of 50 or more data points). Language modelling that requires learning from longer sequences is affected by the need for more information in memory. This paper introduces the Long Term Memory network (LTM), which can tackle the exploding and vanishing gradient problems and handles long sequences without forgetting. LTM is designed to scale data in the memory and gives a higher weight to the input in the sequence. LTM avoids overfitting by scaling the cell state after achieving the optimal results. LTM is tested on the Penn Treebank and Text8 datasets and achieves test perplexities of 83 and 82, respectively. With 650 LTM cells, it achieved a test perplexity of 67 for Penn Treebank, and with 600 cells a test perplexity of 77 for Text8. LTM achieves state-of-the-art results using only ten hidden LTM cells for both datasets.
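To put the reported perplexities in context, the sketch below assumes the standard definition of perplexity as the exponential of the average negative log-likelihood per token; the abstract itself does not state the formula. The per-token probabilities are hypothetical. Under this definition, a model that spreads its probability uniformly over 83 candidate tokens at every step has a perplexity of 83.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability assigned to the observed tokens)."""
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Hypothetical model that assigns probability 1/83 to every observed token.
uniform_83 = [1 / 83] * 10
print(round(perplexity(uniform_83), 6))

# Hypothetical model that is more confident on some tokens than others.
mixed = [0.5, 0.25, 0.125, 0.125]
print(perplexity(mixed))
```

Lower values mean the model is, on average, less uncertain about the next token, which is the sense in which 67 is better than 83 on the same test set.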