Optimization Strategy for Safety Reinforcement Learning Guided by Ontology
In safety reinforcement learning, shielding-based approaches can be constrained by the lack of suitable alternative policies, so the system may be unable to avoid leaving a safe state even when danger is detected. Knowledge-integration approaches can provide safety guidance for specific states by extracting conceptual features and applying structured knowledge, but the guidance embedded in that knowledge is not always optimal and may even be inferior to strategies the agent learns through exploration. To address both problems, we propose an ontology-guided optimization strategy for safety reinforcement learning that identifies and avoids risks while optimizing action generation. Based on this strategy, we designed and implemented a simulation system for an unmanned aerial vehicle (UAV) obstacle-avoidance scenario and verified its effectiveness with five different reinforcement learning algorithms. The experimental results show that the ontology-guided strategy enables agents to select alternative policies on top of shielding risky actions, and that it performs better than traditional reinforcement learning methods.
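
The interplay between shielding, ontology guidance, and learned values described above can be illustrated with a minimal sketch. It is not the system reported here: the action names, the `risky` set (standing in for the ontology's risk identification), the `advice` action (standing in for the ontology's suggested alternative), and the tie-breaking rule are all assumptions made for illustration.

```python
from typing import Dict, Set


def select_action(q_values: Dict[str, float], risky: Set[str], advice: str) -> str:
    """Shield risky actions, then choose an alternative among the safe ones.

    q_values: learned action-value estimates for the current state.
    risky:    actions flagged as unsafe (hypothetical output of an ontology reasoner).
    advice:   action suggested by the ontology (hypothetical).
    """
    # The agent's unconstrained greedy choice.
    greedy = max(q_values, key=q_values.get)
    if greedy not in risky:
        return greedy

    # Shield: discard every action flagged as risky.
    safe = {a: q for a, q in q_values.items() if a not in risky}
    if not safe:
        # No learned alternative survives; fall back to the ontology's suggestion.
        return advice

    # Follow the ontology's advice unless learning has found a strictly
    # better safe action (assumed tie-breaking rule).
    best_value = max(safe.values())
    if advice in safe and safe[advice] == best_value:
        return advice
    return max(safe, key=safe.get)


# Hypothetical state from an unmanned aerial vehicle grid world:
# an obstacle lies straight ahead.
q = {"forward": 1.0, "left": 0.2, "right": 0.5, "hover": 0.0}
chosen = select_action(q, risky={"forward"}, advice="left")
```

In this example the greedy action `forward` is shielded, and the agent takes `right` rather than the suggested `left`, because its learned estimate for `right` is higher. With untrained (all-equal) estimates, the same function would return the ontology's suggestion instead.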