Output list
Journal article
A systematic review of multi-modal large language models on domain-specific applications
Published 2025
The Artificial intelligence review, 58, 12, 383
While Large Language Models (LLMs) have shown remarkable proficiency in text-based tasks, they struggle to interact effectively with the more realistic world without the perceptions of other modalities such as visual and audio. Multi-modal LLMs, which integrate these additional modalities, have become increasingly important across various domains. Despite the significant advancements and potential of multi-modal LLMs, there has been no comprehensive PRISMA-based systematic review that examines their applications across different domains. The objective of this work is to fill this gap by systematically reviewing and synthesising the quantitative research literature on domain-specific applications of multi-modal LLMs. This systematic review follows the PRISMA guidelines to analyse research literature published after 2022, the release of OpenAI’s ChatGPT 3.5. The literature search was conducted across several online databases, including Nature, Scopus, and Google Scholar. A total of 22 studies were identified, with 11 focusing on the medical domain, 3 on autonomous driving, and 2 on geometric analysis. The remaining studies covered a range of topics, with one each on climate, music, e-commerce, sentiment analysis, human-robot interaction, and construction. This review provides a comprehensive overview of the current state of multi-modal LLMs, highlights their domain-specific applications, and identifies gaps and future research directions.
Journal article
TriagedMSA: Triaging Sentimental Disagreement in Multimodal Sentiment Analysis
Published 2025
IEEE transactions on affective computing, 16, 3, 1557 - 1569
Existing multimodal sentiment analysis models are effective at capturing sentiment commonalities across different modalities and discerning emotions. However, these models still face significant challenges when analyzing samples with sentiment polarity differences across modalities. Neural networks struggle to process such divergent sentiment samples, particularly when they are scarce within datasets. While larger datasets could help address this limitation, collecting and annotating them is resource-intensive. To overcome this challenge, we propose TriagedMSA, a multimodal sentiment analysis model with triage capability. Our model introduces the Sentiment Disagreement Triage Network, which identifies sentiment disagreement between modalities within a sample. This triage mechanism reduces mutual influence by learning to distinguish between samples of sentiment agreement and disagreement. To process these two sample types, we develop the Sentiment Selection Attention Network and the Sentiment Commonality Attention Network, both of which enhance modality interaction learning. Furthermore, we propose the Adaptive Polarity Detection (APD) algorithm, which ensures the generalizability of our model across different datasets, regardless of whether unimodal labels are available. The APD algorithm adaptively determines sentiment polarity disagreement or agreement between modalities. We conduct experiments on three multimodal sentiment analysis datasets: CMU-MOSI, CMU-MOSEI and CH-SIMS.v2. The results demonstrate that our proposed methodology outperforms existing state-of-the-art approaches.
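The idea of separating agreement samples from disagreement samples can be illustrated with a minimal sketch. This is not the paper's APD algorithm or its Sentiment Disagreement Triage Network; it assumes that a sentiment score per modality is already available for each sample, and the function names and the neutral `margin` parameter are hypothetical.

```python
def polarity(score, margin=0.0):
    """Map a sentiment score to -1, 0, or +1; scores within the margin count as neutral."""
    if score > margin:
        return 1
    if score < -margin:
        return -1
    return 0


def has_polarity_disagreement(unimodal_scores, margin=0.0):
    """Return True when at least one modality is positive and another is negative."""
    signs = {polarity(s, margin) for s in unimodal_scores.values()}
    return 1 in signs and -1 in signs


def triage(samples, margin=0.0):
    """Split samples (dicts of modality -> score) into (agreement, disagreement) lists."""
    agree, disagree = [], []
    for sample in samples:
        target = disagree if has_polarity_disagreement(sample, margin) else agree
        target.append(sample)
    return agree, disagree
```

In this sketch a sample with positive text but negative audio lands in the disagreement group, while a sample whose modalities are all positive (or neutral) stays in the agreement group.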
Journal article
Drug-CoV: a drug-origin knowledge graph discovering drug repurposing targeting COVID-19
Published 2023
Knowledge and information systems
Drug repurposing is a technique for probing new usages of existing medicines, but its traditional methods, such as computational approaches, can be time-consuming and laborious. Recently, knowledge graphs (KGs) have emerged as a powerful approach for graph-based representation in drug repurposing, encoding entities and relations to predict new connections and facilitate drug discovery. As COVID-19 has become a major public health concern, it is critical to establish an appropriate COVID-19 KG for drug repurposing to combat the spread of the virus. However, most publicly available COVID-19 KGs lack support for multi-relations and comprehensive entity types. Moreover, none of them originates from COVID-19-related drugs, making it challenging to identify effective treatments. To tackle these issues, we developed Drug-CoV, a drug-origin and multi-relational COVID-19 KG. We evaluated the quality of Drug-CoV by performing link prediction and comparing the results to those of another publicly available COVID-19 KG. Our results showed that Drug-CoV outperformed the comparison KG in predicting new links between entities. Overall, Drug-CoV represents a valuable resource for COVID-19 drug repurposing efforts and demonstrates the potential of KGs for facilitating drug discovery.
Journal article
Targeted lipidomics coupled with machine learning for authenticating the provenance of chicken eggs
Published 2023
Food chemistry, 410, 135366
• A simple lipid extraction of chicken egg yolks was developed for LC-MS/MS analysis.
• 937 lipid species from 20 major lipid subclasses were characterized in egg yolk.
• Statistical modeling was used to classify the types of conventional chicken eggs.
• Cage, barn, and free-range eggs can be differentiated based on lipid profile.
• Eggs from caged birds can be accurately predicted based on the lipidomic signature.
Free-range eggs are ethically desirable but, as with all high-value commercial products, the establishment of provenance can be problematic. Here, we compared a simple one-step isopropanol method to a two-step methyl-tert-butyl ether method for extracting lipid species in chicken egg yolks before liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis. The isopropanol method extracted 937 lipid species from 20 major lipid subclasses with high reproducibility (CV < 30 %). Machine learning techniques could differentiate conventional cage, barn, and free-range eggs using an external test dataset with an accuracy of 0.94, 0.82, and 0.82, respectively. Lipid species that differentiated cage eggs were predominantly phosphocholines and phosphoethanolamines, whilst the free-range egg lipidomes were dominated by acylglycerides with up to three fatty acids. The lipid profiles were found to be characteristic of the cage, barn, and free-range eggs. The lipidomic analysis together with the statistical modeling approach thus provides an efficient tool for verifying the provenance of conventional chicken eggs.
Journal article
Modelling Multi-relations for Convolutional-based Knowledge Graph Embedding
Published 2022
Procedia computer science, 207, 624 - 633
26th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, 07/09/2022–09/09/2022, Verona, Italy
Representation learning of knowledge graphs aims to embed entities and relations into low-dimensional vectors. Most existing works only consider the direct relations or paths between an entity pair. We argue that such approaches break the semantic connection among the multiple relations between an entity pair, and we propose a convolutional and multi-relational representation learning model, ConvMR. The proposed ConvMR model addresses the multi-relation issue in two ways: (1) It encodes the multi-relations between an entity pair into a unified vector that maintains the semantic connection. (2) Since not all relations are necessary when joining multi-relations, it uses an attention-based relation encoder to automatically assign weights to different relations based on semantic hierarchy. Experimental results on two popular datasets, FB15k-237 and WN18RR, show consistent improvements in mean rank. We also found that ConvMR deals efficiently with less frequent entities.
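The general idea of weighting several relations and merging them into one vector can be sketched with plain dot-product attention. This is an illustrative assumption, not ConvMR's encoder: the paper weights relations by semantic hierarchy, whereas the `query` vector, the scoring function, and the name `attend_relations` below are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend_relations(relation_vecs, query):
    """Pool several relation embeddings into one vector.

    Each relation is scored by its dot product with the query, the scores
    are normalised with softmax, and the embeddings are averaged using
    those weights. Returns (pooled_vector, weights).
    """
    scores = [sum(r_i * q_i for r_i, q_i in zip(r, query)) for r in relation_vecs]
    weights = softmax(scores)
    dim = len(query)
    pooled = [sum(w * r[i] for w, r in zip(weights, relation_vecs)) for i in range(dim)]
    return pooled, weights
```

Relations that align more closely with the query receive larger weights, so they contribute more to the unified vector, while weakly related ones are down-weighted rather than discarded.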
Journal article
Improving question answering over knowledge graphs using graph summarization
Published 2021
Neural Information Processing, 13111, 489 - 500
Question Answering (QA) systems over Knowledge Graphs (KGs) (KGQA) automatically answer natural language questions using triples contained in a KG. The key idea is to represent questions and entities of a KG as low-dimensional embeddings. Previous KGQA systems have attempted to represent entities using Knowledge Graph Embedding (KGE) and Deep Learning (DL) methods. However, KGEs are too shallow to capture expressive features, and DL methods process each triple independently. Recently, Graph Convolutional Networks (GCNs) have been shown to be excellent at providing entity embeddings. However, applying GCNs to KGQA is inefficient because GCNs treat all relations equally when aggregating neighbourhoods. A further problem with previous KGQA systems is that questions often have an uncertain number of answers. To address these issues, we propose a graph summarization technique using a Recurrent Convolutional Neural Network (RCNN) and a GCN. The combination of GCN and RCNN ensures that the embeddings are propagated together with the relations relevant to the question, and thus yields better answers. The proposed graph summarization technique can also handle questions with an uncertain number of answers, which previous KGQA systems cannot answer. In this paper, we demonstrate the proposed technique on the most common type of question, single-relation questions. Experiments show that the proposed graph summarization technique using RCNN and GCN provides better results than the GCN. The technique significantly improves the recall of actual answers when the questions have an uncertain number of answers.