Machine Learning-Generated Longitudinal Synthetic International Data in Multiple Sclerosis

Hassam Iqbal; Zhe Qiang; Sifat Sharmin; Gareth Ball; Aida Brankovic; Allan G Kermode; Marzena Pedrini; William Carroll; Katherine Buzzard; Olga Skibina; Jeannette Lechner-Scott; Anneke van der Walt; Helmut Butzkueven; Nevin John; Michael Barnett; Suzanne Hodgkinson; Mark Slee; Pamela McCombe; Bruce Taylor; Yi Chao Foong; Richard Macdonell; Todd A Hardy; Steve Vucic; Stephen Reddel; Sudarshini Ramanathan; Deborah Field; Jennifer Massey; Charles B. Malpas; Izanne Roos; Tomas Kalincik

doi:10.1177/13524585261424108

Conference proceeding

Peer reviewed

Multiple sclerosis, Vol.32(1_suppl), P.18

MS Australia: 10th Progress in MS Research Conference 2025 (Sofitel Brisbane Central, Queensland, 03/12/2025–05/12/2025)

12/2025

Abstract

EBV

T-cells

Single-cell transcriptomics

Background: Data scarcity and privacy concerns impede research that requires large datasets in neurology. Synthetic data hold the promise of facilitating research that requires substantial analytical power. Longitudinal data in MS are particularly challenging to synthesise because of the high complexity of this neurological condition.

Objective: This study presents a dual generative framework for producing synthetic MS data and assesses both the utility of the generated data in predictive modelling and their credibility.

Methods: We used MSBase data to train two models: an autoencoder for cross-sectional data, trained on clinico-demographic information from 77,215 patients, and a Long Short-Term Memory (LSTM) model for longitudinal data, trained on 850,000 patient sequences. The autoencoder generated 13 cross-sectional variables, and the LSTM generated time to visit, EDSS, relapses, treatment, treatment change, and MRI. The combined process was used to generate 2.8 million synthetic patient records.

Results: The simulated cohort had a mean age at onset of 31.6 years (SD: 10.5), mean disease duration of 6.8 years, mean EDSS of 2.95 (SD: 2.06), female prevalence of 70.5%, mean follow-up duration of 8.9 years (95% CI: 8.7–9.1), and 22.2% of patients on high-efficacy therapies for 44.1% of the total follow-up, all comparable to the original dataset. Spearman correlation analysis confirmed that inter-variable relationships were preserved in the simulated data, with coefficients (r = 0.2–0.9) consistent with those in the real data.

Conclusion: A dual autoencoder-LSTM approach is suitable for the generation of cross-sectional and longitudinal synthetic data in multiple sclerosis. This solution has the potential to augment research requiring large and representative clinical datasets.
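The abstract does not describe how the Spearman comparison between real and synthetic data was computed. As a minimal sketch of that kind of fidelity check, the Python below computes pairwise Spearman coefficients in each dataset and reports the largest absolute difference. The function names, the column names, and the toy values are all hypothetical illustrations; none of this is the authors' code or data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def max_correlation_gap(real, synthetic):
    """Largest |rho_real - rho_synthetic| over all pairs of shared columns.

    Both arguments map a column name to a list of values (hypothetical layout).
    """
    names = sorted(real)
    gap = 0.0
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            r_real = spearman(real[names[a]], real[names[b]])
            r_syn = spearman(synthetic[names[a]], synthetic[names[b]])
            gap = max(gap, abs(r_real - r_syn))
    return gap


# Toy values invented for illustration only; not drawn from MSBase.
real = {"age_at_onset": [25, 31, 38, 44, 29], "edss": [1.5, 2.0, 3.5, 4.0, 2.5]}
synthetic = {"age_at_onset": [27, 33, 36, 41, 30], "edss": [2.0, 3.0, 2.5, 4.5, 1.5]}

print(round(max_correlation_gap(real, synthetic), 3))  # prints 0.1
```

A small gap on every variable pair would suggest that the synthetic cohort preserves the rank-order relationships of the source data. What counts as small enough is a judgement the abstract does not specify.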

Details

Title: Machine Learning-Generated Longitudinal Synthetic International Data in Multiple Sclerosis
Authors/Creators: Hassam Iqbal
Zhe Qiang
Sifat Sharmin - The University of Melbourne
Gareth Ball
Aida Brankovic
Allan G Kermode - Murdoch University, Institute for Immunology and Infectious Diseases
Marzena Pedrini - Murdoch University, Personalised Medicine Centre
William Carroll - Perron Institute for Neurological and Translational Science
Katherine Buzzard - Box Hill Hospital
Olga Skibina - Box Hill Hospital
Jeannette Lechner-Scott - John Hunter Hospital
Anneke van der Walt - Monash University
Helmut Butzkueven - Monash University
Nevin John - Monash University
Michael Barnett - Mind Australia
Suzanne Hodgkinson - Ingham Institute
Mark Slee - Flinders University
Pamela McCombe - The University of Queensland
Bruce Taylor - Royal Hobart Hospital
Yi Chao Foong - Monash University
Richard Macdonell - Austin Health
Todd A Hardy - Concord Repatriation General Hospital
Steve Vucic - The University of Melbourne
Stephen Reddel - Concord Repatriation General Hospital
Sudarshini Ramanathan - Concord Repatriation General Hospital
Deborah Field - Lyell McEwin Hospital
Jennifer Massey - St Vincent's Hospital Sydney
Charles B. Malpas - The Royal Melbourne Hospital
Izanne Roos - The University of Melbourne
Tomas Kalincik - The University of Melbourne
Publication Details: Multiple sclerosis, Vol.32(1_suppl), P.18
Conference: MS Australia: 10th Progress in MS Research Conference 2025 (Sofitel Brisbane Central, Queensland, 03/12/2025–05/12/2025)
Publisher: SAGE PUBLICATIONS LTD
Number of pages: 53
Identifiers: 991005876850907891
Murdoch Affiliation: Institute for Immunology and Infectious Diseases; Personalised Medicine Centre
Language: English
Resource Type: Conference proceeding
