Robust 6DoF Pose Estimation Against Depth Noise and a Comprehensive Evaluation on a Mobile Dataset

University of California, Berkeley
The Chinese University of Hong Kong, Shenzhen
DMLR Workshop ICML 2024

*Indicates Equal Contribution

Abstract

Robust 6DoF pose estimation with mobile devices is the foundation for applications in robotics, augmented reality, and digital twin localization. In this paper, we extensively investigate the robustness of existing RGBD-based 6DoF pose estimation methods against varying levels of depth sensor noise. We highlight that existing 6DoF pose estimation methods suffer significant performance discrepancies due to depth measurement inaccuracies. To address this robustness issue, we present a simple and effective transformer-based 6DoF pose estimation approach called DTTDNet, featuring a novel geometric feature filtering module and a Chamfer distance loss for training. Moreover, we advance the field of robust 6DoF pose estimation by introducing a new dataset, Digital Twin Tracking Dataset Mobile (DTTD-Mobile), tailored for digital twin object tracking with noisy depth data from the mobile RGBD sensor suite of the Apple iPhone 14 Pro. Extensive experiments demonstrate that DTTDNet significantly outperforms state-of-the-art methods by at least 4.32 and up to 60.74 points in ADD metrics on DTTD-Mobile. More importantly, our approach exhibits superior robustness to varying levels of measurement noise, setting a new benchmark for robustness to noisy measurements.
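For readers unfamiliar with the ADD metric mentioned in the abstract, the following is a minimal sketch of its standard per-object form: the mean distance between model points transformed by the ground-truth pose and by the predicted pose. The function and argument names (`add_metric`, `model_points`, `R_gt`, and so on) are illustrative assumptions, not taken from the DTTDNet code, and the way per-object distances are aggregated into the reported scores is not stated on this page.

```python
import numpy as np

def add_metric(model_points, R_gt, t_gt, R_pred, t_pred):
    """Average distance (ADD) between model points under two rigid poses.

    model_points: (N, 3) array of 3D points sampled from the object model.
    R_gt, R_pred: (3, 3) rotation matrices (ground truth and prediction).
    t_gt, t_pred: (3,) translation vectors, in the same units as the points.
    """
    # Apply each rigid transform: x' = R x + t, written for row-vector points.
    gt = model_points @ R_gt.T + t_gt
    pred = model_points @ R_pred.T + t_pred
    # Mean Euclidean distance between corresponding transformed points.
    return float(np.linalg.norm(gt - pred, axis=1).mean())

# Hypothetical example: identical rotation, prediction offset by a small translation.
points = np.random.default_rng(0).uniform(-0.1, 0.1, size=(500, 3))
identity = np.eye(3)
error = add_metric(points, identity, np.zeros(3), identity, np.array([0.03, 0.04, 0.0]))
```

With identical rotations, every point is displaced by the same translation offset, so the result equals the length of that offset regardless of the object's shape.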


DTTD-Mobile Dataset

We introduce DTTD-Mobile, a novel digital-twin pose estimation dataset captured with mobile devices. We provide a detailed analysis of the LiDAR depth data, along with evaluation metrics, to illustrate the unique properties and complexities of mobile LiDAR data.

DTTDNet

We propose DTTDNet, a new transformer-based 6DoF pose estimator with depth-robust designs for modality fusion and training. We introduce two components, Chamfer Distance Loss (CDL) and Geometric Feature Filtering (GFF), that enable the point-cloud encoder in DTTDNet to handle noisy and low-resolution LiDAR data robustly.
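As a rough illustration of the quantity behind a Chamfer distance loss, the sketch below computes a symmetric Chamfer distance between two point clouds with NumPy. It is a generic formulation under stated assumptions (squared Euclidean distances, mean reduction in both directions); which point sets DTTDNet compares, and how the term is weighted during training, are not specified on this page. The name `chamfer_distance` is illustrative.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point clouds a (N, 3) and b (M, 3).

    Assumption: squared Euclidean distances, averaged in each direction
    and then summed. Other variants (unsquared, max-reduced) also exist.
    """
    # Pairwise squared distances, shape (N, M). Memory use grows as N * M.
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # For each point in a, the nearest point in b, and vice versa.
    a_to_b = d2.min(axis=1).mean()
    b_to_a = d2.min(axis=0).mean()
    return float(a_to_b + b_to_a)

# Hypothetical example: a clean cloud versus a copy with one point shifted along z.
clean = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
noisy = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]])
loss = chamfer_distance(clean, noisy)
```

Because each point is matched to its nearest neighbor rather than to a fixed correspondence, this distance does not require the two clouds to have the same size or ordering, which is one reason it is a common choice when depth measurements are noisy or sparse.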


Experimental Results

Poster

BibTeX

@misc{DTTDv2,
  title={Robust Digital-Twin Localization via An RGBD-based Transformer Network and A Comprehensive Evaluation on a Mobile Dataset},
  author={Zixun Huang and Keling Yao and Seth Z. Zhao and Chuanyu Pan and Tianjian Xu and Weiyu Feng and Allen Y. Yang},
  year={2023},
  eprint={2309.13570},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}