DOI: 10.1145/3638550.3641122

Mobile AR Depth Estimation: Challenges & Prospects

Published: 28 February 2024

ABSTRACT

Accurate metric depth can enable more realistic user interactions, such as object placement and occlusion detection, in mobile augmented reality (AR). However, metrically accurate depth estimation is difficult to obtain in practice. We tested four state-of-the-art (SOTA) monocular depth estimation models on a newly introduced dataset (ARKitScenes) and observed clear performance gaps on this real-world mobile dataset. We categorize the challenges into hardware-, data-, and model-related challenges, and we propose promising future directions, including (i) using more hardware-related information from the mobile device's camera and other available sensors, (ii) capturing high-quality data that reflects real-world AR scenarios, and (iii) designing model architectures that exploit this new information.
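The performance gaps described above are typically quantified with standard metric-depth error measures. The sketch below shows the commonly used AbsRel, RMSE, and delta < 1.25 metrics for comparing a predicted metric depth map against ground truth; it is an illustrative NumPy implementation under assumed conventions (depth in meters, a hypothetical valid-depth range), not the paper's exact evaluation protocol.

# Minimal sketch of standard metric-depth evaluation, assuming predicted and
# ground-truth depth maps in meters. The function name, valid-depth range, and
# array shapes are illustrative assumptions, not the paper's exact protocol.
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray,
                  min_depth: float = 1e-3, max_depth: float = 10.0) -> dict:
    """Compute AbsRel, RMSE, and delta<1.25 accuracy over valid ground-truth pixels."""
    valid = (gt > min_depth) & (gt < max_depth)        # ignore missing/implausible depth
    pred, gt = pred[valid], gt[valid]

    abs_rel = float(np.mean(np.abs(pred - gt) / gt))   # mean absolute relative error
    rmse = float(np.sqrt(np.mean((pred - gt) ** 2)))   # root mean squared error (meters)
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = float(np.mean(ratio < 1.25))              # fraction within 25% of ground truth
    return {"AbsRel": abs_rel, "RMSE": rmse, "delta<1.25": delta1}

# Example: a model that overestimates depth by 10% everywhere yields
# AbsRel = 0.1, delta<1.25 = 1.0, and an RMSE proportional to scene depth.
gt = np.random.uniform(0.5, 5.0, size=(192, 256))
print(depth_metrics(1.1 * gt, gt))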

References

  1. Apple. https://developer.apple.com/augmented-reality/, 2017.
  2. G. Baruch, Z. Chen, A. Dehghan, T. Dimry, Y. Feigin, P. Fu, T. Gebauer, B. Joffe, D. Kurz, A. Schwartz, and E. Shulman. ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data. In NeurIPS Datasets and Benchmarks Track, 2021.
  3. S. F. Bhat, I. Alhashim, and P. Wonka. LocalBins: Improving depth estimation by learning local distributions. In ECCV, 2022.
  4. S. F. Bhat, R. Birkl, D. Wofk, P. Wonka, and M. Müller. ZoeDepth: Zero-shot transfer by combining relative and metric depth. arXiv:2302.12288, 2023.
  5. R. Birkl, D. Wofk, and M. Müller. MiDaS v3.1 - A model zoo for robust monocular relative depth estimation. arXiv:2307.14460, 2023.
  6. G. Brazil, A. Kumar, J. Straub, N. Ravi, J. Johnson, and G. Gkioxari. Omni3D: A large benchmark and model for 3D object detection in the wild. In CVPR, 2023.
  7. J. Cho, D. Min, Y. Kim, and K. Sohn. DIML/CVL RGB-D Dataset: 2M RGB-D Images of Natural Indoor and Outdoor Scenes. arXiv:2110.11590, 2021.
  8. A. Dai, A. X. Chang, M. Savva, M. Halber, T. Funkhouser, and M. Nießner. ScanNet: Richly-annotated 3D reconstructions of indoor scenes. In CVPR, 2017.
  9. S. F. Bhat, I. Alhashim, and P. Wonka. AdaBins: Depth estimation using adaptive bins. In CVPR, 2021.
  10. Y. Fujimura, M. Iiyama, T. Funatomi, and Y. Mukaigawa. Deep depth from focal stack with defocus model for camera-setting invariance. arXiv:2202.13055, 2022.
  11. A. Ganj, Y. Zhao, F. Galbiati, and T. Guo. Toward scalable and controllable AR experimentation. In ImmerCom, 2023.
  12. A. Geiger, P. Lenz, C. Stiller, and R. Urtasun. Vision meets robotics: The KITTI dataset. IJRR, 2013.
  13. V. Guizilini, I. Vasiljevic, D. Chen, R. Ambrus, and A. Gaidon. Towards zero-shot scale-aware monocular depth estimation. In ICCV, 2023.
  14. S. Hwang, J. Lee, W. J. Kim, S. Woo, K. Lee, and S. Lee. LiDAR depth completion using color-embedded information via knowledge distillation. IEEE Transactions on Intelligent Transportation Systems, 2022.
  15. Intel. https://www.intelrealsense.com/wp-content/uploads/2023/07/Intel-RealSense-D400-Series-Datasheet-July-2023.pdf, 2023.
  16. M. Maximov, K. Galim, and L. Leal-Taixe. Focus on defocus: Bridging the synthetic to real domain gap for depth estimation. In CVPR, 2020.
  17. N. Silberman, D. Hoiem, P. Kohli, and R. Fergus. Indoor segmentation and support inference from RGBD images. In ECCV, 2012.
  18. M. Norman, V. Kellen, S. Smallen, B. DeMeulle, S. Strande, E. Lazowska, N. Alterman, R. Fatland, S. Stone, A. Tan, K. Yelick, E. Van Dusen, and J. Mitchell. CloudBank: Managed services to simplify cloud access for computer science research and education. In Practice and Experience in Advanced Research Computing (PEARC '21), New York, NY, USA, 2021. Association for Computing Machinery.
  19. R. Ranftl, A. Bochkovskiy, and V. Koltun. Vision transformers for dense prediction. In ICCV, 2021.
  20. R. Ranftl, K. Lasinger, D. Hafner, K. Schindler, and V. Koltun. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. TPAMI, 2020.
  21. M. Sayed, J. Gibson, J. Watson, V. Prisacariu, M. Firman, and C. Godard. SimpleRecon: 3D reconstruction without 3D convolutions. In ECCV, 2022.
  22. J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers. A benchmark for the evaluation of RGB-D SLAM systems. In IROS, 2012.
  23. F. Tapia Benavides, A. Ignatov, and R. Timofte. PhoneDepth: A dataset for monocular depth estimation on mobile devices. In CVPRW, 2022.
  24. N.-H. Wang, R. Wang, Y.-L. Liu, Y.-H. Huang, Y.-L. Chang, C.-P. Chen, and K. Jou. Bridging unsupervised and supervised depth from focus via all-in-focus supervision. In ICCV, 2021.
  25. C.-Y. Wu, J. Wang, M. Hall, U. Neumann, and S. Su. Toward practical monocular indoor depth estimation. In CVPR, 2022.
  26. W. Yin, C. Zhang, H. Chen, Z. Cai, G. Yu, K. Wang, X. Chen, and C. Shen. Metric3D: Towards zero-shot metric 3D prediction from a single image. 2023.
  27. J. Zhang, H. Yang, J. Ren, D. Zhang, B. He, T. Cao, Y. Li, Y. Zhang, and Y. Liu. MobiDepth: Real-time depth estimation using on-device dual cameras. In MobiCom, 2022.
  28. Y. Zhang, T. Scargill, A. Vaishnav, G. Premsankar, M. Di Francesco, and M. Gorlatova. InDepth: Real-time depth inpainting for mobile augmented reality. IMWUT, 2022.

Published in

HOTMOBILE '24: Proceedings of the 25th International Workshop on Mobile Computing Systems and Applications
February 2024, 167 pages
ISBN: 9798400704970
DOI: 10.1145/3638550

Copyright © 2024 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 28 February 2024

Qualifiers

Research article

Acceptance Rates

Overall acceptance rate: 96 of 345 submissions, 28%
