A Comprehensive Survey of Deep Learning–Based Object Tracking for Augmented Reality in Complex Real-World Scenes

DOI:

https://doi.org/10.65455/ddjkhy91

Keywords:

Augmented Reality, Object Tracking, Deep Learning, Complex Scenes, Real-Time Systems, Mobile and Wearable AR

Abstract

Augmented Reality (AR) systems critically depend on accurate, temporally stable, and computationally efficient object tracking to maintain geometric alignment and perceptual coherence between virtual content and the physical world. As AR technologies move from controlled laboratory prototypes to large-scale deployment in industrial, medical, and consumer scenarios, tracking must operate robustly in complex real-world environments characterized by dynamic objects, occlusions, illumination changes, fast motion, and strict computational constraints. Traditional geometry-driven tracking pipelines often degrade under such conditions, which has motivated increased adoption of deep learning–based approaches. This survey provides a comprehensive review of deep learning–based object tracking for AR in complex real-world scenes, with particular emphasis on system-level considerations. Object tracking is treated as a central perception primitive that underpins stable AR experiences and interacts tightly with modules such as visual simultaneous localization and mapping (SLAM), depth and geometry estimation, and semantic scene understanding. In contrast to prior surveys that emphasize algorithmic accuracy in isolation, we explicitly analyze AR-specific constraints, including real-time latency, temporal stability, energy efficiency, long-term robustness, and deployment on mobile and wearable platforms. We review major tracking paradigms, representative datasets and benchmarks, AR-centric evaluation criteria, and common failure modes observed in practice, and we outline future research directions toward scalable, reliable, and trustworthy AR tracking systems.

Published

2026-03-24

How to Cite

A Comprehensive Survey of Deep Learning–Based Object Tracking for Augmented Reality in Complex Real-World Scenes. (2026). Applied Artificial Intelligence Research, 2(1). https://doi.org/10.65455/ddjkhy91