Topic modeling is an unsupervised technique in natural language processing (NLP) used to identify hidden topic structures within large text datasets. Among traditional approaches to topic modeling, latent Dirichlet allocation, BERTopic, and Top2Vec are widely adopted. However, these methods often perform poorly when data are limited or when the textual features are high-dimensional. In this research, we propose QTopic, a novel hybrid quantum-classical topic modeling architecture that exploits quantum properties through parameterized quantum circuits. By integrating quantum-enhanced sampling into the inference pipeline, the proposed model maps textual data into a higher-dimensional space and thereby captures richer topic distributions. Benchmark experiments demonstrate that QTopic consistently outperforms classical approaches in terms of coherence, diversity, and topic distinctiveness, particularly when modeling a small number of topics. This study demonstrates the promise of quantum techniques for advancing unsupervised NLP, while also highlighting hardware limitations that pose challenges for future research.
Details
Title
QTopic: A novel quantum perspective on learning topics from text
Authors/Creators
Monika Kabir - Murdoch University
Mohammed Kaosar - Murdoch University, School of Information Technology
Ferdous Sohel - Murdoch University, School of Information Technology