Abstract
Deep neural network (DNN) models applied to medical image analysis are highly vulnerable to adversarial attacks at both the example (input) and feature (model) levels. Ensuring DNN robustness against these attacks is crucial for accurate diagnostics. However, existing example-level and feature-level defense strategies, including adversarial training and image-level preprocessing, struggle to achieve effective adversarial robustness in medical image analysis. This challenge arises primarily from the difficulty of capturing complex texture features in medical images and the inherent risk of altering intrinsic structural information in the input data. To overcome this challenge, we propose a novel medical imaging protector framework named MI-Protector. The framework comprises two defense methods for unimodal learning and one for multimodal fusion learning, addressing both example-level and feature-level vulnerabilities to robustly protect DNNs against adversarial attacks. For unimodal learning, we introduce an example-level defense mechanism based on a generative model with a purifier, termed DGMP. The purifier consists of a trainable neural network and a pre-trained generator from the generative model, and automatically removes a wide variety of adversarial perturbations. As a combined example- and feature-level defense, we propose the unimodal attention noise injection mechanism (UMAN), which protects learning models at both the example and feature layers. To protect the multimodal fusion learning network, we propose the multimodal information fusion attention noise (MMIFAN) injection method, which offers protection at the feature layers while the non-learnable UMAN is applied at the example layer. Extensive experiments on 16 datasets spanning various medical imaging modalities demonstrate that our framework provides superior robustness to adversarial attacks compared with existing methods. Code: https://github.com/misti1203/MI-Protector.
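To make the generator-with-purifier idea behind DGMP concrete, the following minimal PyTorch sketch illustrates the general pattern of generator-based purification described above: a trainable purifier maps a (possibly perturbed) input into the latent space of a frozen, pre-trained generator, which then re-synthesises a clean image. The module names, architecture, and shapes here are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of generator-based purification (DGMP-style), under assumptions.
# PurifierNet, latent_dim, and the pretrained_generator interface are hypothetical.
import torch
import torch.nn as nn

class PurifierNet(nn.Module):
    """Trainable network that projects a (possibly adversarial) image
    into the latent space of a pre-trained generator."""
    def __init__(self, in_channels: int = 1, latent_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.encoder(x)

def purify(x_adv: torch.Tensor, purifier: nn.Module,
           pretrained_generator: nn.Module) -> torch.Tensor:
    """Re-synthesise the input through the frozen generator so that
    adversarial perturbations are removed before classification."""
    z = purifier(x_adv)            # project the perturbed image to a latent code
    with torch.no_grad():          # the pre-trained generator stays frozen
        x_clean = pretrained_generator(z)
    return x_clean
```

In this pattern, only the purifier is trained (for example, to minimise the reconstruction gap between purified adversarial images and their clean counterparts), while the generator's prior over plausible medical images constrains the output and discards perturbation noise.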