3D-CDNeT: Cross-domain learning with enhanced speed and robustness for point cloud recognition
Journal article   Open access   Peer reviewed


Abu Bakor Hayat Arnob, A.A.M. Muzahid, Hua Han, Yujin Zhang and Ferdous Sohel
Neurocomputing (Amsterdam), Vol.662, 131939
2026
CC BY 4.0 Open Access

Abstract

Keywords: 3D computer vision; Domain adaptation; Robust 3D object recognition; Self-attention mechanism
Despite progress in 3D object recognition using deep learning (DL), challenges such as domain shift, occlusion, and viewpoint variations hinder robust performance. Additionally, the high computational cost and lack of labeled data limit real-time deployment in applications such as autonomous driving and robotic manipulation. To address these challenges, we propose 3D-CDNeT, a novel cross-domain deep learning network designed for unsupervised learning, enabling efficient and robust point cloud recognition. At the core of our model is a lightweight graph-infused attention encoder (GIAE) that enables effective feature interaction between the source and target domains. It not only improves recognition accuracy but also reduces inference time, which is essential for real-time applications. To enhance robustness and adaptability, we introduce a feature invariance learning module (FILM) using contrastive loss for learning invariant features. In addition, we adopt a Generative Decoder (GD) based on a Variational Auto-Encoder (VAE) to model diverse latent spaces and reconstruct meaningful 3D structures from the point cloud. This reconstruction process acts as a self-supervised generative objective that complements the discriminative recognition task, guiding the encoder to learn structure-preserving and domain-invariant features that improve recognition under occlusion and cross-domain conditions. Our proposed model unifies generative and discriminative tasks by using self-attention on the object covariance matrix to facilitate efficient information exchange, enabling the extraction of both local and global features. We further develop a self-supervised pretraining strategy that learns both global and local object invariances through GIAE and GD, respectively. A new loss function, combining contrastive loss and Chamfer distance, is proposed to strengthen cross-domain feature alignment. 
Experimental results on three benchmark datasets demonstrate that 3D-CDNeT outperforms existing state-of-the-art (SOTA) methods in recognition accuracy and inference speed, offering a practical solution for real-time 3D perception tasks. It achieves accuracies of 90.6% on ModelNet40, 95.2% on ModelNet10, and 76.4% on the ScanObjectNN dataset in linear evaluation tasks, all while reducing runtime by 45% without compromising performance. Detailed qualitative comparisons and ablation studies are provided to validate the effectiveness of each component and demonstrate the superior performance of our proposed method.
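The abstract describes a loss combining contrastive alignment with Chamfer-distance reconstruction. The following is a minimal sketch of how such a combined objective can be computed; it is not the authors' implementation, and the function names, the NT-Xent form of the contrastive term, and the weighting parameter `lam` are illustrative assumptions.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point sets p (N, 3) and q (M, 3)."""
    # Pairwise squared distances, shape (N, M).
    d = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
    # Nearest-neighbor term in each direction, averaged.
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def contrastive_loss(z_src, z_tgt, temperature=0.1):
    """NT-Xent-style loss aligning source/target embeddings row-wise.

    z_src, z_tgt: (B, D) L2-normalized feature matrices; row i of each
    is treated as a positive pair, all other rows as negatives.
    """
    sim = z_src @ z_tgt.T / temperature           # (B, B) similarity logits
    sim = sim - sim.max(axis=1, keepdims=True)    # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))            # -log p(positive | row)

def combined_loss(z_src, z_tgt, recon, target_pts, lam=0.5):
    """Contrastive alignment plus weighted reconstruction term."""
    return contrastive_loss(z_src, z_tgt) + lam * chamfer_distance(recon, target_pts)
```

In a training loop, `z_src`/`z_tgt` would come from the encoder applied to source- and target-domain point clouds, and `recon` from the generative decoder; the reconstruction term pulls the encoder toward structure-preserving features while the contrastive term enforces cross-domain invariance.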

Metrics

1 file view/download
9 record views

InCites Highlights

Selected metrics from the InCites Benchmarking & Analytics tool related to this output.

Collaboration types
Domestic collaboration
International collaboration
Citation topics
4 Electrical Engineering, Electronics & Computer Science
4.17 Computer Vision & Graphics
4.17.2798 Stereo Depth Estimation
Web of Science research areas
Computer Science, Artificial Intelligence
ESI research areas
Computer Science