
Optimization of ORB-SLAM3 in indoor dynamic scenes based on YOLOv4

In recent years, visual simultaneous localization and mapping (SLAM) has become a research hotspot in robotics and computer vision. Most mainstream algorithms are designed only for static environments, and their localization accuracy and stability degrade markedly when dynamic objects appear in the scene. This paper proposes an improved ORB-SLAM3 (oriented FAST and rotated BRIEF SLAM3) method. First, a lightweight YOLOv4 (you only look once version 4) object detection network is introduced in the tracking thread to process every level of the image pyramid, identifying and removing dynamic feature points and thereby improving the accuracy of pose estimation. Second, the integrated data of the inertial measurement unit are fused, the camera intrinsic and extrinsic parameters of keyframes are extracted, and the depth maps are converted into 3D colored point clouds that are stitched into a complete point cloud map of the scene. Finally, the method is verified and evaluated. The results show that the proposed method effectively rejects dynamic feature points in indoor dynamic scenes and improves the accuracy and stability of camera localization; in a real test scene, the average distance error is within 1.5 cm, and a laser point cloud map free of dynamic-object interference can be successfully built.
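The abstract describes rejecting ORB feature points that fall inside dynamic-object detections from YOLOv4. The following is a minimal sketch of that idea, not the authors' implementation: the dynamic class set, the detection format (class name plus an (x1, y1, x2, y2) box), and the function name are assumptions for illustration.

```python
import numpy as np

# Hypothetical sketch: reject ORB keypoints inside bounding boxes of
# dynamic classes returned by a YOLOv4-style detector.

DYNAMIC_CLASSES = {"person", "car", "dog", "cat"}  # assumed set of movable objects

def filter_dynamic_keypoints(keypoints, detections):
    """Keep only keypoints that lie outside every dynamic-object box.

    keypoints  : iterable of (u, v) pixel coordinates of ORB features
    detections : iterable of (class_name, (x1, y1, x2, y2)) from the detector
    """
    dynamic_boxes = [box for cls, box in detections if cls in DYNAMIC_CLASSES]

    def in_box(u, v, box):
        x1, y1, x2, y2 = box
        return x1 <= u <= x2 and y1 <= v <= y2

    # A keypoint survives only if it is outside all dynamic boxes.
    return [
        (u, v) for (u, v) in keypoints
        if not any(in_box(u, v, box) for box in dynamic_boxes)
    ]

# Toy usage: the keypoint inside the "person" box is discarded.
kps = [(50, 60), (120, 200), (300, 310)]
dets = [("person", (100, 150, 200, 260))]
print(filter_dynamic_keypoints(kps, dets))  # -> [(50, 60), (300, 310)]
```

In the paper this check would be applied at each level of the image pyramid in the tracking thread, so only the remaining static features enter pose estimation.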
Optimization of ORB-SLAM3 in indoor dynamic scenes based on YOLOv4
The primary objective of this study is to achieve high-precision positioning for visual simultaneous localization and mapping (VSLAM) systems in indoor dynamic environments while constructing laser point cloud maps that exclude dynamic objects. To this end, the paper proposes an enhanced ORB-SLAM3 method that integrates a lightweight YOLOv4 object detection network to identify and remove dynamic feature points, thereby improving the accuracy of pose estimation. The method also incorporates inertial measurement unit (IMU) data to ensure robust positioning in highly dynamic and low-texture scenes.

The proposed approach involves several key steps. (1) Dynamic object detection using YOLOv4: a lightweight YOLOv4 object detection network is employed in the tracking thread to process images at each level of the image pyramid. By identifying and removing dynamic feature points, the method improves pose estimation accuracy; YOLOv4 is known for its real-time performance and high precision, making it suitable for this task. (2) IMU data integration: to enhance robustness in dynamic and low-texture environments, the method fuses IMU data into pose estimation. The IMU provides high-frequency measurements that are combined with visual information to optimize the estimated pose. (3) Keyframe extraction and point cloud construction: camera intrinsic and extrinsic parameters from keyframes are used to convert depth maps into 3D colored point clouds, which are then stitched together to form a complete map of the scene. (4) Dynamic feature point rejection: by using YOLOv4 to detect dynamic objects, the method rejects feature points in dynamic regions and relies only on static features for pose estimation.

The experimental results demonstrate the efficacy of the proposed method in indoor dynamic scenes. The key findings are as follows. (1) Pose estimation accuracy: by effectively filtering out dynamic feature points, the method enhances both the accuracy and stability of camera positioning; in real-world tests conducted in an underground parking garage, the average positioning error was kept below 1.5 cm. (2) Point cloud map construction: the method successfully generated a laser point cloud map free of interference from dynamic objects; generating a point cloud for each frame from the camera data and stitching the clouds together produced a complete and accurate 3D map of the scene. (3) Comparison with other methods: the proposed method was compared with other visual SLAM methods, including the traditional ORB-SLAM3; the results show that it outperforms these algorithms in both positioning accuracy and map completeness.

Future work will focus on further improving robustness under complex indoor lighting conditions and on optimizing lightweight semantic SLAM models to reduce computational load while maintaining high positioning accuracy. Integrating advanced image enhancement algorithms will also be explored to improve tracking robustness.
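Step (3) above converts keyframe depth maps into colored point clouds with the camera intrinsics and stitches them with the keyframe poses. The sketch below illustrates one plausible form of that back-projection under stated assumptions (pinhole model, depth in metres, 4x4 camera-to-world pose T_wc); the function name and toy data are hypothetical, not the paper's code.

```python
import numpy as np

# Hypothetical sketch: back-project a depth map into a colored point cloud
# with pinhole intrinsics (fx, fy, cx, cy), transform it into the world frame
# with the keyframe pose T_wc, and concatenate frames into one map.

def depth_to_world_cloud(depth, rgb, fx, fy, cx, cy, T_wc):
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.reshape(-1)
    valid = z > 0                                   # skip missing depth readings
    u, v, z = us.reshape(-1)[valid], vs.reshape(-1)[valid], z[valid]

    # Pinhole back-projection: X = (u - cx) z / fx, Y = (v - cy) z / fy, Z = z
    pts_cam = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1)

    # Camera frame -> world frame with the 4x4 keyframe pose T_wc
    pts_h = np.hstack([pts_cam, np.ones((pts_cam.shape[0], 1))])
    pts_world = (T_wc @ pts_h.T).T[:, :3]

    colors = rgb.reshape(-1, 3)[valid]
    return np.hstack([pts_world, colors])           # N x 6: XYZ + RGB

# Toy usage: two keyframes stitched by simple concatenation.
depth = np.full((4, 4), 2.0)
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
T0, T1 = np.eye(4), np.eye(4)
T1[0, 3] = 0.5                                      # second keyframe shifted 0.5 m
cloud = np.vstack([
    depth_to_world_cloud(depth, rgb, 200.0, 200.0, 2.0, 2.0, T)
    for T in (T0, T1)
])
print(cloud.shape)                                  # (32, 6)
```

In practice the per-keyframe clouds would also be voxel-filtered before stitching, but the core operation is this intrinsics-based back-projection followed by a pose transform.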

VSLAM; ORB-SLAM3; IMU; feature recognition; deep learning

蒋鹏程、邱俊武、陈衡锋、章旭国、陈佳鑫、田壮


Station Building Construction Headquarters, China Railway Guangzhou Group Co., Ltd., Guangzhou 510180, China

Shenzhen Maijiachengke Information Technology Co., Ltd., Shenzhen 518000, China

China Railway Signal & Communication Shanghai Engineering Bureau Group Co., Ltd., Shanghai 200040, China

visual simultaneous localization and mapping; ORB-SLAM3; inertial measurement unit; feature recognition; deep learning

2024

Geomatics World
China Association for Geospatial Industry and Sciences; Heilongjiang Bureau of Surveying, Mapping and Geoinformation


CSTPCD
Impact factor: 0.826
ISSN:1672-1586
Year, Volume (Issue): 2024, 31(5)