Research progress in the application of image semantic information in visual SLAM

Abstract: Visual simultaneous localization and mapping (VSLAM) uses cameras as the primary sensor to capture image data, estimates the position and orientation of the carrier using multi-view geometry and state estimation algorithms, and simultaneously constructs a map for navigation and localization. VSLAM is a key technology in autonomous driving, augmented reality (AR), virtual reality (VR), mixed reality (MR), intelligent robotics, and drone flight control. In recent years, as demand for intelligent navigation and localization has grown across industries, VSLAM, which originally focused on geometric measurement, has gradually incorporated a semantic understanding of the environment. Semantic information refers to concepts that humans can directly perceive and understand; image semantic information refers to information such as the contours, categories, and saliency of objects in an image. Compared with the geometric structures and features in an image, semantic information is more consistent across time and space and is closer to human perception. Introducing image semantic information into VSLAM can both improve the performance of each module of the system and enhance its intelligent perception ability, yielding a visual semantic SLAM that integrates geometric measurement, position and attitude determination, and environment understanding. In this article, we review the classic solutions and the latest research progress in visual semantic SLAM, organized by how image semantic information is applied. On this basis, we summarize the existing problems and challenges in visual semantic SLAM and point out future research directions, in order to promote its further development toward intelligent navigation and localization.

Keywords:

visual SLAM; visual semantic SLAM; deep learning; intelligent navigation and localization

Authors:

Guo Chi (郭迟), Liu Yang (刘阳), Luo Yarong (罗亚荣), Liu Jingnan (刘经南), Zhang Quan (张全)

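Affiliations:

Hubei Luojia Laboratory, Wuhan University, Wuhan, Hubei 430079

GNSS Research Center, Wuhan University, Wuhan, Hubei 430079

Artificial Intelligence Institute, Wuhan University, Wuhan, Hubei 430079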

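Funding:

National Key Research and Development Program of China; Major Science and Technology Project of Hubei Province; Luojia Laboratory Open Fund; China Postdoctoral Science Foundation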

Project numbers:

2022YFB3903801; 2022AAA009; 2023TQ0248

Publication year:

2024

DOI:

10.11947/j.AGCS.2024.20230259

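Acta Geodaetica et Cartographica Sinica (测绘学报)

Chinese Society for Geodesy, Photogrammetry and Cartography (中国测绘学会)

CSTPCD; Peking University Core Journals (北大核心)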

Impact factor: 1.602

ISSN: 1001-1595

Year, volume (issue): 2024, 53(6)

Number of references: 4