End-to-End Autonomous Driving Based on LSTM Deep Reinforcement Learning

Zhou Xinyang (周昕阳)¹, Song Zhenbo (宋振波)¹, Li Weiqing (李蔚清)¹, Lu Jianfeng (陆建峰)¹


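Author information

1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China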

Abstract

Existing end-to-end autonomous driving algorithms based on deep reinforcement learning usually rely only on a single-frame image and vehicle state information. They do not exploit the information in image sequences, so the continuity of the driving state cannot be guaranteed. In addition, existing methods usually train environment understanding and planning and decision-making together, which results in low learning efficiency. To address these problems, we propose an end-to-end autonomous driving algorithm based on LSTM deep reinforcement learning. First, a variational autoencoder is pre-trained on image segmentation results from the environment to reduce the dimensionality of the image features. Then an autonomous driving decision model based on LSTM-PPO (Proximal Policy Optimization) is built, which captures the temporal features of the context during training. Finally, the proposed model is validated on the Carla simulation platform, and the experimental results show the feasibility of the proposed algorithm.
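The abstract names PPO as the policy optimizer but gives no loss or hyperparameters. As a rough illustration only, the sketch below computes the standard PPO clipped surrogate objective for a batch of time steps. The function name `ppo_clip_objective`, the plain-list inputs, and the clipping range `eps=0.2` are assumptions made for this example, not details taken from the paper. In an LSTM-PPO setup the probability ratios would presumably come from a recurrent policy fed with the VAE latent features, which is not modeled here.

```python
def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean PPO clipped surrogate over a batch of time steps.

    ratios:     new-policy / old-policy action probabilities, one per step
    advantages: advantage estimates, one per step
    eps:        clipping range (0.2 is an assumed, commonly used value)
    """
    if not ratios or len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must be non-empty and equal in length")
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Clip the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(r, 1.0 + eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(r * a, clipped * a)
    return total / len(ratios)


if __name__ == "__main__":
    # One step with a ratio above the range, one with a ratio below it.
    print(ppo_clip_objective([1.5, 0.5], [1.0, -1.0]))
```

The objective is maximized during training. Clipping removes the incentive to move the new policy far from the old one within a single update, which is the usual reason PPO is chosen for long on-policy rollouts such as simulated driving.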

Key words

End-to-end autonomous driving/Deep reinforcement learning/LSTM-PPO (long short-term memory network with proximal policy optimization)


Publication year

2024

Computer Simulation (计算机仿真)

The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD

Impact factor: 0.518

ISSN: 1006-9348

Cited by: 1

References: 18
