SLAM-Former: Putting SLAM into One Transformer

IIIS, Tsinghua University

Abstract

We present SLAM-Former, a novel neural approach that integrates full SLAM capabilities into a single transformer. Like traditional SLAM systems, SLAM-Former comprises a frontend and a backend that operate in tandem. The frontend processes sequential monocular images in real time for incremental mapping and tracking, while the backend performs global refinement to ensure a geometrically consistent result. Executing the two alternately lets the frontend and backend reinforce each other, improving overall system performance. Experimental results show that SLAM-Former achieves superior or highly competitive performance compared with state-of-the-art dense SLAM methods.
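The alternating frontend/backend execution described in the abstract can be illustrated with a minimal scheduling sketch. It is not the authors' implementation: `SlamState`, `frontend_step`, `backend_refine`, `run_slam`, and the `backend_every` trigger are hypothetical names chosen for illustration, and the stubs only record which stage ran rather than calling a transformer. The fixed-interval trigger is also an assumption, since the abstract does not say when the backend runs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SlamState:
    """Toy shared state (hypothetical): frames seen so far plus a call log."""
    frames: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


def frontend_step(state: SlamState, frame: str) -> None:
    """Stand-in for incremental tracking and mapping on one new frame."""
    state.frames.append(frame)
    state.log.append(f"frontend:{frame}")


def backend_refine(state: SlamState) -> None:
    """Stand-in for global refinement over every frame seen so far."""
    state.log.append(f"backend:{len(state.frames)}")


def run_slam(frames: List[str], backend_every: int = 3) -> SlamState:
    """Alternate the two stages: frontend on every frame, backend periodically.

    The fixed interval `backend_every` is an assumption made for this sketch.
    """
    state = SlamState()
    for i, frame in enumerate(frames, start=1):
        frontend_step(state, frame)
        if i % backend_every == 0:
            backend_refine(state)
    return state


state = run_slam(["f1", "f2", "f3", "f4"], backend_every=3)
print(state.log)
```

In this sketch both stages read and write the same state object, which is one simple way to express the idea that the backend's refined result is available to the frontend on the next frame.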

@article{slam-former,
  title={SLAM-Former: Putting SLAM into One Transformer},
  author={Yijun Yuan and Zhuoguang Chen and Kenan Li and Weibang Wang and Hang Zhao},
  journal={arXiv preprint arXiv:2509.16909},
  year={2025}
}
