CasCenter: A Cascaded Center Network for Visual Tracking
Journal
IEEE Transactions on Consumer Electronics
ISSN
0098-3063
Date Issued
2025-05
Author(s)
Qun Li
Haijun Zhang
Kai Yang
Yong-Guo Shi
Deqiang Zeng
DOI
10.1109/TCE.2025.3547962
Abstract
Object tracking has advanced significantly with Transformer-based architectures in recent years. However, replacing convolutional layers with global cross-attention in the tracking head of these architectures results in a loss of object-centric inductive bias. Consequently, existing Transformer-based methods often struggle with complex real-life scenarios, such as low resolution, background clutter, and scale variation. To address this issue, we propose a new Vision Transformer-based anchor-free tracking framework named CasCenter. Specifically, the framework features a cascade attention module in the decoder that propagates tracking cues from the previous tracking head to refine object features in a coarse-to-fine manner, enabling the tracker to focus more effectively on the target. Additionally, to further improve tracking stability and accuracy, we incorporate SIoU loss, a multi-scale tracking head, and a Gaussian mask-constrained cross-attention mechanism that emphasizes target regions while suppressing background interference. Extensive experiments demonstrate the superiority of our proposed CasCenter.
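The abstract does not give the formulation of the Gaussian mask-constrained cross-attention, so the following is only a minimal sketch of one common way such a constraint can be realized: a Gaussian weight map centered on a coarse target estimate is added, in log space, to the attention logits before the softmax. The function names (`gaussian_mask`, `masked_cross_attention`), the single-query, single-head setup, and the `sigma` and `eps` parameters are illustrative assumptions, not the authors' implementation.

```python
import math

def gaussian_mask(height, width, center, sigma):
    """Flattened H*W list of Gaussian weights in (0, 1], peaked at center=(cy, cx).

    Hypothetical helper: in a tracker, the center would come from a coarse
    prediction of an earlier head; here it is passed in by hand.
    """
    cy, cx = center
    return [
        math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
        for y in range(height)
        for x in range(width)
    ]

def masked_cross_attention(query, keys, values, mask, eps=1e-9):
    """Single-query, single-head attention with logits biased by log(mask).

    Assumption: the mask acts as an additive log-space prior, so positions
    far from the center are down-weighted but never hard-excluded.
    """
    scale = 1.0 / math.sqrt(len(query))
    logits = [
        scale * sum(q * k for q, k in zip(query, key)) + math.log(m + eps)
        for key, m in zip(keys, mask)
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return output, weights

# Usage: a 3x3 search region with identical keys, so the mask alone
# determines where the attention concentrates.
mask = gaussian_mask(3, 3, center=(1, 1), sigma=1.0)
keys = [[0.0, 0.0] for _ in range(9)]
values = [[float(i)] for i in range(9)]
output, weights = masked_cross_attention([1.0, 0.0], keys, values, mask)
```

With uninformative keys, the attention weights follow the Gaussian prior: the center cell receives the largest weight and the corner cells the smallest, which is the "emphasize target, suppress background" behavior the abstract describes in words.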
