Output list
Conference proceeding
TimelineKGQA: A Comprehensive Question-Answer Pair Generator for Temporal Knowledge Graphs
Published 2025
Companion Proceedings of the ACM on Web Conference 2025, 797 - 800
WWW '25: The ACM Web Conference 2025, 28/04/2025–02/05/2025, Sydney, NSW
Question answering over temporal knowledge graphs (TKGs) is crucial for understanding evolving facts and relationships, yet its development is hindered by limited datasets and difficulties in generating custom QA pairs. We propose a novel categorization framework based on timeline-context relationships, along with TimelineKGQA, a universal temporal QA generator applicable to any TKG. The code is available as an open-source Python package at: https://github.com/PascalSun/TimelineKGQA
Journal article
A systematic review of multi-modal large language models on domain-specific applications
Published 2025
The Artificial intelligence review, 58, 12, 383
While Large Language Models (LLMs) have shown remarkable proficiency in text-based tasks, they struggle to interact effectively with the more realistic world without the perceptions of other modalities such as visual and audio. Multi-modal LLMs, which integrate these additional modalities, have become increasingly important across various domains. Despite the significant advancements and potential of multi-modal LLMs, there has been no comprehensive PRISMA-based systematic review that examines their applications across different domains. The objective of this work is to fill this gap by systematically reviewing and synthesising the quantitative research literature on domain-specific applications of multi-modal LLMs. This systematic review follows the PRISMA guidelines to analyse research literature published after 2022, the release of OpenAI’s ChatGPT 3.5. The literature search was conducted across several online databases, including Nature, Scopus, and Google Scholar. A total of 22 studies were identified, with 11 focusing on the medical domain, 3 on autonomous driving, and 2 on geometric analysis. The remaining studies covered a range of topics, with one each on climate, music, e-commerce, sentiment analysis, human-robot interaction, and construction. This review provides a comprehensive overview of the current state of multi-modal LLMs, highlights their domain-specific applications, and identifies gaps and future research directions.
Conference proceeding
Open-Source Large Language Models Excel in Named Entity Recognition
Published 2025
Neural Information Processing (ICONIP 2024), 2295, 313 - 326
Neural Information Processing 31st International Conference (ICONIP 2024), 02/12/2024–06/12/2024, Auckland, New Zealand
Current state-of-the-art Named Entity Recognition (NER) typically involves fine-tuning transformer-based models like BERT or RoBERTa with annotated datasets, posing challenges in annotation cost, model robustness, and data privacy. An emerging approach uses pre-trained Large Language Models (LLMs) such as ChatGPT to extract entities directly with few or zero examples, achieving performance comparable to fine-tuned models. However, reliance on closed-source commercial LLMs raises cost and privacy concerns. In this work, we investigate open-source LLMs like Llama2 for NER on local consumer-grade GPUs, aiming to significantly reduce costs compared to cloud solutions while ensuring data security. Experimental results demonstrate competitive NER performance, with an F1 score of 85.37% on the CoNLL03 dataset, and the approach can also be generalised to specific domains, such as scientific texts.
Journal article
TriagedMSA: Triaging Sentimental Disagreement in Multimodal Sentiment Analysis
Published 2025
IEEE transactions on affective computing, 16, 3, 1557 - 1569
Existing multimodal sentiment analysis models are effective at capturing sentiment commonalities across different modalities and discerning emotions. However, these models still face significant challenges when analyzing samples with sentiment polarity differences across modalities. Neural networks struggle to process such divergent sentiment samples, particularly when they are scarce within datasets. While larger datasets could help address this limitation, collecting and annotating them is resource-intensive. To overcome this challenge, we propose TriagedMSA, a multimodal sentiment analysis model with triage capability. Our model introduces the Sentiment Disagreement Triage Network, which identifies sentiment disagreement between modalities within a sample. This triage mechanism reduces mutual influence by learning to distinguish between samples of sentiment agreement and disagreement. To process these two sample types, we develop the Sentiment Selection Attention Network and the Sentiment Commonality Attention Network, both of which enhance modality interaction learning. Furthermore, we propose the Adaptive Polarity Detection (APD) algorithm, which ensures the generalizability of our model across different datasets, regardless of whether unimodal labels are available. The APD algorithm adaptively determines sentiment polarity disagreement or agreement between modalities. We conduct experiments on three multimodal sentiment analysis datasets: CMU-MOSI, CMU-MOSEI and CH-SIMS.v2. The results demonstrate that our proposed methodology outperforms existing state-of-the-art approaches.
Conference proceeding
Published 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 46 - 52
Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), 12/11/2024–16/11/2024, Miami, FL
Multimodal conversational agents are highly desirable because they offer natural and human-like interaction. However, there is a lack of comprehensive end-to-end solutions to support collaborative development and benchmarking. While proprietary systems like GPT-4o and Gemini demonstrate impressive integration of audio, video, and text with response times of 200-250ms, challenges remain in balancing latency, accuracy, cost, and data privacy. To better understand and quantify these issues, we developed OpenOmni, an open-source, end-to-end pipeline benchmarking tool that integrates advanced technologies such as Speech-to-Text, Emotion Detection, Retrieval Augmented Generation, and Large Language Models, along with the ability to integrate customized models. OpenOmni supports local and cloud deployment, ensuring data privacy and supporting latency and accuracy benchmarking. This flexible framework allows researchers to customize the pipeline, focusing on real bottlenecks and facilitating rapid proof-of-concept development. OpenOmni can significantly enhance applications like indoor assistance for visually impaired individuals, advancing human-computer interaction. Our demonstration video is available at https://www.youtube.com/watch?v=zaSiT3clWqY, a demo is available via https://openomni.ai4wa.com, and the code is available via https://github.com/AI4WA/OpenOmniFramework.
Conference proceeding
Retinal Image Registration with Haar-Optimized Local Binary Descriptors for Bifurcation Points
Published 2024
2024 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 745 - 751
International Conference on Digital Image Computing: Techniques and Applications (DICTA) 2024, 27/11/2024–29/11/2024, Perth, WA
This paper introduces a novel method for the registration of color fundus photographs, featuring a new descriptor named Haar-Optimized Local Binary Descriptor (HOLBD). HOLBD is a fast-to-compute and fast-to-match descriptor, highly optimized to uniquely describe retinal bifurcation and crossover points, which are crucial landmarks for fundus image registration. It utilizes four patterns reminiscent of Haar basis functions, optimized to define these bifurcation and crossover points. These patterns perform pixel intensity tests to form a 340-bit binary vector. Before computing the HOLBD descriptor, the overall image orientation and scaling factors are estimated, and images are normalized, making HOLBD robust against rotation and scaling. Experiments were conducted on both publicly available and private retinal image registration datasets, comprising a total of 484 retinal images (i.e., 242 pairs). The proposed method was compared with state-of-the-art techniques, including Generalized Dual-Bootstrap Iterative Closest Point (GDB-ICP), Hernandez-Matas et al., Saha et al., and Chen et al.'s methods. Results show that the proposed method outperforms the best-performing method. On the private dataset, the proposed method achieves 1-3% higher accuracy than the best-performing method for error thresholds up to 15 pixels. It significantly outperforms other methods by 4-30% for error thresholds up to 10 pixels. On the public dataset, the proposed method marginally outperforms the best reported method. It significantly outperforms GDB-ICP, Hernandez-Matas et al., and Chen et al. by a margin of 10-40%.
Conference proceeding
Published 2024
Neural Information Processing (ICONIP 2024), 2296, 102 - 117
Neural Information Processing 31st International Conference (ICONIP 2024), 02/12/2024–06/12/2024, Auckland, New Zealand
The transmission of African swine fever (ASF) could be influenced by temperature and rainfall, particularly where wild boars are the route of transmission. Australia's ASF risk assessment capabilities can be further enhanced by analyzing the impact of temperature and precipitation on ASF. As there are currently no cases of ASF in Australia, this study utilized Poland's ASF-wild boar cases between 2018 and 2021 to establish a risk assessment model for Australia. Two methods, linear regression and fuzzy inference systems, were adopted to model the risk by analyzing the correlation between the number of ASF-wild boar cases and the temperature and rainfall. The aim is to develop a risk assessment analysis that can estimate the seasonal risk of ASF in Australia. The results from the two models showed a significant relationship between the number of cases and changes in temperature, but no prominent association with the amount of rainfall. To the best of our knowledge, this is the first model that conducts a seasonal assessment of ASF risk in Australia. The proposed technique for modelling Australia’s risk assessment is leading and can handle incomplete data, making this a novel approach that can be used to build models for other countries or regions and for different infectious diseases.
Journal article
Drug-CoV: a drug-origin knowledge graph discovering drug repurposing targeting COVID-19
Published 2023
Knowledge and information systems
Drug repurposing is a technique for probing new usages of existing medicines, but its traditional methods, such as computational approaches, can be time-consuming and laborious. Recently, knowledge graphs (KGs) have emerged as a powerful approach for graph-based representation in drug repurposing, encoding entities and relations to predict new connections and facilitate drug discovery. As COVID-19 has become a major public health concern, it is critical to establish an appropriate COVID-19 KG for drug repurposing to combat the spread of the virus. However, most publicly available COVID-19 KGs lack support for multi-relations and comprehensive entity types. Moreover, none of them originates from COVID-19-related drugs, making it challenging to identify effective treatments. To tackle these issues, we developed Drug-CoV, a drug-origin and multi-relational COVID-19 KG. We evaluated the quality of Drug-CoV by performing link prediction and comparing the results to another publicly available COVID-19 KG. Our results showed that Drug-CoV outperformed the comparison KG in predicting new links between entities. Overall, Drug-CoV represents a valuable resource for COVID-19 drug repurposing efforts and demonstrates the potential of KGs for facilitating drug discovery.
Journal article
Targeted lipidomics coupled with machine learning for authenticating the provenance of chicken eggs
Published 2023
Food chemistry, 410, 135366
• A simple lipid extraction of chicken egg yolks was developed for LC-MS/MS analysis.
• 937 lipid species from 20 major lipid subclasses were characterized in egg yolk.
• Statistical modeling was used to classify the types of conventional chicken eggs.
• Cage, barn, and free-range eggs can be differentiated based on lipid profile.
• Eggs from caged birds can be accurately predicted based on the lipidomic signature.
Free-range eggs are ethically desirable but, as with all high-value commercial products, the establishment of provenance can be problematic. Here, we compared a simple one-step isopropanol method to a two-step methyl-tert-butyl ether method for extracting lipid species in chicken egg yolks before liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis. The isopropanol method extracted 937 lipid species from 20 major lipid subclasses with high reproducibility (CV < 30 %). Machine learning techniques could differentiate conventional cage, barn, and free-range eggs using an external test dataset with an accuracy of 0.94, 0.82, and 0.82, respectively. Lipid species that differentiated cage eggs were predominantly phosphocholines and phosphoethanolamines whilst the free-range egg lipidomes were dominated by acylglycerides with up to three fatty acids. The lipid profiles were found to be characteristic of the cage, barn, and free-range eggs. The lipidomic analysis together with the statistical modeling approach thus provides an efficient tool for verifying the provenance of conventional chicken eggs.
Journal article
Modelling Multi-relations for Convolutional-based Knowledge Graph Embedding
Published 2022
Procedia computer science, 207, 624 - 633
26th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, 07/09/2022–09/09/2022, Verona, Italy
Representation learning of knowledge graphs aims to embed entities and relations into low-dimensional vectors. Most existing works only consider the direct relations or paths between an entity pair. We consider that such approaches disconnect the semantic connection of multi-relations between an entity pair, and we propose a convolutional and multi-relational representation learning model, ConvMR. The proposed ConvMR model addresses the multi-relation issue in two aspects: (1) It encodes the multi-relations between an entity pair into a unified vector that maintains the semantic connection. (2) Since not all relations are necessary when joining multi-relations, it uses an attention-based relation encoder to automatically assign weights to different relations based on semantic hierarchy. Experimental results on two popular datasets, FB15k-237 and WN18RR, show consistent improvements in mean rank. We also found that ConvMR deals efficiently with less frequent entities.