Abstract
With the development of deep 3D tracking models and their broad prospects in safety-critical applications, adversarial robustness, i.e., the ability of deep models to resist malicious adversarial attacks, has become an important research topic. Previous works generate adversarial examples by perturbing points of the input point cloud indiscriminately. Consequently, they suffer from high computational cost and from limited attack performance caused by the trade-off between imperceptibility and adversarial strength. In this paper, we propose a novel adversarial attack against 3D object tracking. Guided by an occlusion-based explainability method, the attack targets the points in the search area that are crucial for the predictions, producing a significant deviation between the predictions and the ground truth. Specifically, an attribution map is generated to reveal the importance of each point to the model decision by measuring how tracking performance varies across point subsets produced by a downsampling strategy. To facilitate the generation of attribution maps, the downsampling strategy incorporates prior knowledge of 3D trackers: it assigns higher sampling probabilities to points enclosed by bounding boxes, which potentially contribute more to the predictions. Multi-scale fusion is also leveraged to integrate the sensitivity of the model to local regions of varying sizes. To satisfy the imperceptibility requirement of adversarial attacks, a hard geometric constraint is imposed on the targeted critical points, yielding perturbations with the property of surface invariance. Furthermore, in contrast to existing works, which manipulate spatial information only, multiple loss functions are developed to guide the perturbation generation; they distort the predicted motions of the tracking target, which represent the spatial-temporal information unique to the tracking task, in order to deceive 3D trackers.
Extensive experiments conducted on public benchmarks and 3D trackers demonstrate that our method can generate effective and imperceptible adversarial examples with tiny perturbations.
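
As an illustration of the occlusion-based attribution idea described above, the following is a minimal sketch and not the implementation used in this work. It assumes a hypothetical `score_fn` that maps a point subset to a scalar tracking score, a boolean mask `in_box` marking points inside a bounding box, and an assumed prior weight `box_weight`. A point's attribution is taken here as the mean score of subsets that keep it minus the mean score of subsets that drop it. Multi-scale fusion is approximated by averaging maps over several subset sizes; this fusion rule is likewise an assumption.

```python
import numpy as np


def attribution_map(points, in_box, score_fn, keep_ratio=0.5,
                    n_subsets=200, box_weight=3.0, seed=0):
    """Occlusion-style attribution from random subsets (illustrative only).

    points   : (N, 3) search-area point cloud
    in_box   : (N,) boolean mask, True for points inside a bounding box
    score_fn : hypothetical callable, point subset -> scalar tracking score
    """
    rng = np.random.default_rng(seed)
    n = len(points)
    k = max(1, int(round(keep_ratio * n)))

    # Prior knowledge: in-box points get a higher sampling probability
    # (the weight value is an assumption made for this sketch).
    prob = np.where(in_box, box_weight, 1.0)
    prob = prob / prob.sum()

    kept_sum = np.zeros(n)   # summed score of subsets containing each point
    kept_cnt = np.zeros(n)   # number of subsets containing each point
    total = 0.0
    for _ in range(n_subsets):
        idx = rng.choice(n, size=k, replace=False, p=prob)
        s = float(score_fn(points[idx]))
        kept_sum[idx] += s
        kept_cnt[idx] += 1
        total += s

    drop_sum = total - kept_sum
    drop_cnt = n_subsets - kept_cnt
    mean_kept = np.divide(kept_sum, kept_cnt,
                          out=np.zeros(n), where=kept_cnt > 0)
    mean_drop = np.divide(drop_sum, drop_cnt,
                          out=np.zeros(n), where=drop_cnt > 0)
    # Positive values: the score tends to be higher when the point is present.
    return mean_kept - mean_drop


def fused_attribution(points, in_box, score_fn,
                      keep_ratios=(0.25, 0.5, 0.75)):
    """Average attribution maps over several subset sizes (assumed fusion)."""
    maps = [attribution_map(points, in_box, score_fn, keep_ratio=r)
            for r in keep_ratios]
    return np.mean(maps, axis=0)
```

In an attack pipeline of the kind described above, the highest-ranked points of such a map would be the candidates for perturbation, with `score_fn` wrapping the tracker under attack.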