We present SLAM-Former, a novel neural approach that integrates full SLAM capabilities into a single transformer.
Similar to traditional SLAM systems, SLAM-Former comprises both a frontend and a backend that operate in tandem.
The frontend processes sequential monocular images in real-time for incremental mapping and tracking, while the backend performs global refinement to ensure a geometrically consistent result.
This alternating execution allows the frontend and backend to mutually reinforce each other, improving overall system performance.
Comprehensive experimental results demonstrate that SLAM-Former achieves superior or highly competitive performance compared to state-of-the-art dense SLAM methods.
SLAM-Former
SLAM-Former consists of a frontend and a backend within the same Transformer architecture, working in cooperation.
The training strategy is designed to enable a single transformer to handle both frontend and backend SLAM functionalities.
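The alternating frontend/backend execution described above can be sketched as a simple control loop: the frontend tracks and maps each incoming frame incrementally, and the backend periodically refines the accumulated result globally. This is an illustrative sketch only; the class, method names, and the `backend_every` interval are hypothetical placeholders, not SLAM-Former's actual API or schedule.

```python
# Hypothetical sketch of an alternating frontend/backend SLAM loop.
# All names here are illustrative assumptions, not SLAM-Former's code.

class AlternatingSLAM:
    def __init__(self, backend_every: int = 8):
        self.backend_every = backend_every  # run global refinement every N frames
        self.keyframes = []                 # accumulated (frame, pose) estimates

    def frontend_step(self, frame):
        # Incremental mapping and tracking on one monocular image.
        pose = ("pose", frame)              # placeholder for a predicted pose
        self.keyframes.append((frame, pose))
        return pose

    def backend_step(self):
        # Global refinement over all keyframes for geometric consistency
        # (a no-op placeholder here).
        return len(self.keyframes)

    def run(self, frames):
        for i, frame in enumerate(frames, start=1):
            self.frontend_step(frame)
            if i % self.backend_every == 0:
                self.backend_step()
        return self.keyframes
```

The key design point this mirrors is that both roles operate on the same shared state (here, `self.keyframes`), so each backend pass benefits the subsequent frontend steps and vice versa.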
Demonstration
BibTeX
@article{slam-former,
  title={SLAM-Former: Putting SLAM into One Transformer},
  author={Yijun Yuan and Zhuoguang Chen and Kenan Li and Weibang Wang and Hang Zhao},
  journal={arXiv preprint arXiv:2509.xxxxx},
  year={2025}
}