Metric Unreliability in Multimodal Machine Unlearning: A Systematic Analysis and Principled Unified Score
Preprint   Open access


Abdullah Ahmad Khan, Hamid Laga and Ferdous Sohel
ArXiv.org
Cornell University
2026
Published (Version of Record) Open Access CC BY V4.0

Abstract

Machine unlearning in Vision-Language Models (VLMs) is required for compliance with the General Data Protection Regulation (GDPR), yet current evaluation practices are inconsistent. We present the first systematic study of metric reliability in multimodal unlearning. Five standard metrics, Forget Accuracy (FA), Retain Accuracy (RA), Membership Inference Attack (MIA), Activation Distance (AD), and Jensen-Shannon divergence (JS), yield conflicting method rankings across three VQA benchmarks (MLLMU-Bench, UnLOK-VQA, MMUBench). Kendall $\tau$ analysis over 36 unlearned LLaVA-1.5-7B models reveals two opposing clusters, {FA, RA, MIA} and {AD, JS}, with $\tau_{\mathrm{FA},\mathrm{AD}} = -0.26$; the same pattern is reproduced on BLIP-2 OPT-2.7B. Agreement is lower in multimodal VQA (average $\tau = 0.086$) than in unimodal classification (average $\tau = 0.158$; difference $= 0.072$), indicating that dual image-and-text pathways amplify inconsistency. We introduce the Unified Quality Score (UQS), a composite metric whose weights are derived from each metric's Spearman correlation with the oracle distance $d(\hat{M}, M^{*})$, where $M^{*}$ is the oracle model retrained only on the retain set. RA shows the strongest reliability ($\rho = 0.484$, $p = 0.003$), while FA is negatively correlated ($\rho = -0.418$, $p = 0.011$). UQS yields stable rankings under 100 random weight perturbations ($\tau = 0.647 \pm 0.262$). We release the benchmark, 36 checkpoints, and an interactive leaderboard. Code and pre-computed results are available at https://github.com/neurips26/UnifiedUnl.
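
To make the rank-agreement analysis concrete, the following is a minimal sketch of how pairwise Kendall tau between metrics could be computed over a set of models. It is not the authors' released code: the function names `kendall_tau` and `pairwise_agreement`, the tau-a variant (no tie correction), and the toy scores are all assumptions made for illustration, and each metric is assumed to be already oriented so that a larger value means a better model.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a between two equal-length score lists.

    Tied pairs count as neither concordant nor discordant
    (an illustrative simplification, not a tie-corrected tau-b).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def pairwise_agreement(scores):
    """Map each unordered metric pair to the tau of their model rankings.

    scores: dict of metric name -> list of per-model scores,
    with every list in the same model order.
    """
    return {
        (a, b): kendall_tau(scores[a], scores[b])
        for a, b in combinations(scores, 2)
    }


# Made-up scores for four hypothetical unlearned models (not from the paper).
toy_scores = {
    "FA": [0.9, 0.7, 0.5, 0.3],
    "RA": [0.8, 0.6, 0.7, 0.2],
    "AD": [0.1, 0.4, 0.3, 0.9],
}

taus = pairwise_agreement(toy_scores)
mean_tau = sum(taus.values()) / len(taus)

for pair, tau in taus.items():
    print(pair, round(tau, 3))
print("average tau:", round(mean_tau, 3))
```

In this toy data FA and RA rank the models similarly while AD ranks them in nearly the opposite order, which is the kind of opposing-cluster structure the abstract reports; a low or negative average tau signals that the choice of metric changes which method appears best.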
