3D-CDNeT: Cross-domain learning with enhanced speed and robustness for point cloud recognition
Journal article   Open access   Peer reviewed


Abu Bakor Hayat Arnob, A.A.M. Muzahid, Hua Han, Yujin Zhang and Ferdous Sohel
Neurocomputing (Amsterdam), Vol.662, 131939
2026
CC BY 4.0 Open Access

Abstract

Keywords: 3D computer vision; Domain adaptation; Robust 3D object recognition; Self-attention mechanism
Despite progress in 3D object recognition using deep learning (DL), challenges such as domain shift, occlusion, and viewpoint variations hinder robust performance. Additionally, the high computational cost and lack of labeled data limit real-time deployment in applications such as autonomous driving and robotic manipulation. To address these challenges, we propose 3D-CDNeT, a novel cross-domain deep learning network designed for unsupervised learning, enabling efficient and robust point cloud recognition. At the core of our model is a lightweight graph-infused attention encoder (GIAE) that enables effective feature interaction between the source and target domains. It not only improves recognition accuracy but also reduces inference time, which is essential for real-time applications. To enhance robustness and adaptability, we introduce a feature invariance learning module (FILM) using contrastive loss for learning invariant features. In addition, we adopt a Generative Decoder (GD) based on a Variational Auto-Encoder (VAE) to model diverse latent spaces and reconstruct meaningful 3D structures from the point cloud. This reconstruction process acts as a self-supervised generative objective that complements the discriminative recognition task, guiding the encoder to learn structure-preserving and domain-invariant features that improve recognition under occlusion and cross-domain conditions. Our proposed model unifies generative and discriminative tasks by using self-attention on the object covariance matrix to facilitate efficient information exchange, enabling the extraction of both local and global features. We further develop a self-supervised pretraining strategy that learns both global and local object invariances through GIAE and GD, respectively. A new loss function, combining contrastive loss and Chamfer distance, is proposed to strengthen cross-domain feature alignment. 
Experimental results on three benchmark datasets demonstrate that 3D-CDNeT outperforms existing state-of-the-art (SOTA) methods in recognition accuracy and inference speed, offering a practical solution for real-time 3D perception tasks. It achieves accuracies of 90.6% on ModelNet40, 95.2% on ModelNet10, and 76.4% on the ScanObjectNN dataset in linear evaluation tasks, all while reducing runtime by 45% without compromising performance. Detailed qualitative comparisons and ablation studies are provided to validate the effectiveness of each component and demonstrate the superior performance of our proposed method.
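The abstract describes a loss combining contrastive alignment with Chamfer-distance reconstruction. The following is a minimal sketch of how such a combined objective can be computed; it is not the authors' implementation, and the function names, the NT-Xent form of the contrastive term, and the weighting parameter `lam` are illustrative assumptions.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point sets p (N, 3) and q (M, 3)."""
    # Pairwise squared distances, shape (N, M).
    d = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
    # Nearest-neighbor term in each direction, averaged.
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def contrastive_loss(z_src, z_tgt, temperature=0.1):
    """NT-Xent-style loss aligning source/target embeddings row-wise.

    z_src, z_tgt: (B, D) L2-normalized feature matrices; row i of each
    is treated as a positive pair, all other rows as negatives.
    """
    sim = z_src @ z_tgt.T / temperature           # (B, B) similarity logits
    sim = sim - sim.max(axis=1, keepdims=True)    # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))            # -log p(positive | row)

def combined_loss(z_src, z_tgt, recon, target_pts, lam=0.5):
    """Contrastive alignment plus weighted reconstruction term."""
    return contrastive_loss(z_src, z_tgt) + lam * chamfer_distance(recon, target_pts)
```

In a training loop, `z_src`/`z_tgt` would come from the encoder applied to source- and target-domain point clouds, and `recon` from the generative decoder; the reconstruction term pulls the encoder toward structure-preserving features while the contrastive term enforces cross-domain invariance.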

Metrics

1 file view/download
9 record views

InCites Highlights

Selected metrics from the InCites Benchmarking & Analytics tool related to this output.

Collaboration types
Domestic collaboration
International collaboration
Citation topics
4 Electrical Engineering, Electronics & Computer Science
4.17 Computer Vision & Graphics
4.17.2798 Stereo Depth Estimation
Web of Science research areas
Computer Science, Artificial Intelligence
ESI research areas
Computer Science