
Collaborative coal gangue sorting method for heterogeneous robots based on a long short-term memory-deep Q network

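[Objective] To improve the adaptability of conventional single-type coal gangue sorting robots to gangue that varies widely in shape and size, the working characteristics of heterogeneous robots are analyzed and collaborative sorting by heterogeneous robots is realized. [Methods] A collaborative sorting model for heterogeneous robots is proposed based on the deep Q network (DQN). The collaborative sorting workflow is analyzed to define a decision framework; an interaction environment is designed to meet the needs of reinforcement learning; a continuous state space and a reward-penalty function are constructed for the agent; and a long short-term memory (LSTM) network is combined with a fully connected network to build the DQN value and target networks, so that the reinforcement learning model performs task allocation during operation. [Results] Compared with the traditional sequential allocation model, the collaborative sorting model improves sorting benefit by 0.49% to 17.74% under workloads with different gangue rates; at a sample gangue rate of 21.61% and a belt speed of 0.4 to 0.6 m/s, it improves sorting efficiency by 2.41% to 8.98%. [Conclusion] The heterogeneous-robot collaborative sorting method achieves stable sorting benefit under different workloads and avoids the shortcoming that a single allocation scheme cannot adapt to a dynamically changing gangue flow.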
Heterogeneous robot coal gangue collaborative sorting method based on long short term memory-deep Q network
Objective Gangue is the waste and impurity produced during coal mining and handling. Separating it properly can reduce environmental pollution, improve energy efficiency, and provide economic benefits. Intelligent coal gangue sorting commonly involves robotic sorting and air-blowing separation. However, robotic sorting is often costly and complex, with a high failure rate, while air-blowing separation does not adapt well to gangue with significant differences in quality. By analysing the working characteristics of the two separation methods and designing a synergistic sorting system, the adaptability and cost-effectiveness of the gangue sorting system can be improved.

Methods This paper proposed a collaborative sorting model using heterogeneous robots. The model combined deep reinforcement learning with heterogeneous sorting robots. The continuous sorting process of coal gangue was divided into a number of task segments, and overall planning was carried out for each task segment to develop a feasible cooperative work scheme for the actuators. The third task set for gangue sorting and actuator collection was presented. To meet the continuity requirements of gangue sorting, the continuous task was split into several subsets, and tasks were allocated using a buffer between identification and sorting. Furthermore, this paper proposed a reinforcement learning decision-making framework based on LSTM-DQN (long short-term memory, LSTM; deep Q network, DQN) to design an interaction environment for reinforcement learning during the coal gangue sorting process. The framework includes a state space, an action space, and a reward function. Additionally, a cross-attention mechanism was used to compute the actuator preferences for tasks, which accelerated model convergence. This paper also constructed the core network of the model and introduced LSTM to handle temporal and long-term dependencies in state sequences. The DQN structure was then optimized. Samples with different gangue rates were set up, and the proposed method was compared with the sequential allocation model across different gangue rates and belt speeds to demonstrate its superiority.

Results and Discussion Based on the proposed LSTM-DQN model, a method for sorting coal gangue using heterogeneous robots was developed. Six groups of samples with varying gangue rates were prepared to simulate different workloads. The experiment showed that the LSTM-DQN model was effective for task assignment in heterogeneous robot cooperation. Fig. 7 showed that training converged within 500 rounds under the various loads. Samples with gangue rates ranging from 4.73% to 30.45% were sorted using the LSTM-DQN-based sorting model, which limited the reduction in sorting efficiency to within 8%. Compared with traditional sequential assignment, the LSTM-DQN-based sorting model improved sorting efficiency by 2.41% to 8.98% at a gangue rate of 21.61% and an adjusted belt speed of 0.4 to 0.6 m/s, as shown in Tab. 2. This improvement was significant and demonstrated the effectiveness of the LSTM-DQN model.

Conclusion A collaborative method for heterogeneous robots and an optimized task allocation strategy using a reinforcement learning algorithm were proposed to achieve efficient and cost-effective sorting. The experiment demonstrated that this collaborative coal gangue sorting method can keep the overall sorting efficiency of the system above 90% under different loads and, across different belt speeds and gangue content conditions, is less affected by belt speed than the traditional allocation method. The cooperative sorting method is expected to evolve into a pneumatic sorting method and a multi-mechanic cooperative operation method. The system will be optimized in terms of multi-mechanic cooperation, air blowing, and robot cooperation. Reasonable and customized expansion will be carried out based on the actual needs of the mining area to satisfy specific sorting needs in a cost-effective manner.
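The Methods describe splitting the continuous gangue stream into task segments, comparing against a sequential allocation baseline, and learning the allocation with a DQN. The following is a minimal sketch of those three ideas under stated assumptions, not the authors' implementation: the `Gangue` fields, the fixed-length buffer window, the rotating-order reading of "sequential allocation", and the discount factor are all hypothetical, and `td_target` is only the standard DQN target, without the LSTM or cross-attention components.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Gangue:
    position_m: float  # position along the belt at identification (hypothetical field)
    mass_kg: float     # estimated mass (hypothetical field)


def split_into_segments(stream: List[Gangue], buffer_length_m: float) -> List[List[Gangue]]:
    """Group gangue into consecutive task segments, one per buffer-length window."""
    windows: Dict[int, List[Gangue]] = {}
    for g in stream:
        index = int(g.position_m // buffer_length_m)
        windows.setdefault(index, []).append(g)
    # Return only the non-empty windows, in belt order.
    return [windows[k] for k in sorted(windows)]


def sequential_allocation(segment: List[Gangue], n_actuators: int) -> List[int]:
    """Baseline (assumed): hand tasks to actuators in a fixed rotating order."""
    return [i % n_actuators for i in range(len(segment))]


def td_target(reward: float, next_q_values: List[float],
              gamma: float = 0.9, done: bool = False) -> float:
    """Standard DQN target: r + gamma * max_a' Q_target(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


stream = [Gangue(0.1, 2.0), Gangue(0.4, 15.0), Gangue(1.2, 3.0), Gangue(2.7, 8.0)]
segments = split_into_segments(stream, buffer_length_m=1.0)
baseline = sequential_allocation(segments[0], n_actuators=2)
target = td_target(reward=1.0, next_q_values=[0.5, 2.0])
```

In a learned allocator, the rotating-order baseline would be replaced by choosing, for each task, the actuator with the highest predicted Q value, with the value network trained toward `td_target`.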

heterogeneous robots; cooperative sorting; reinforcement learning; long short-term memory; deep Q network

张杰、夏蕊、李博、王学文、李娟莉、徐文军


College of Mechanical and Vehicle Engineering, Taiyuan University of Technology, Taiyuan 030002, Shanxi, China

山西量界数字科技有限公司, Taiyuan 030000, Shanxi, China


National Natural Science Foundation of China (52204149); Natural Science Foundation of Shanxi Province (202103021223080, 202203021221051)

2024

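China Powder Science and Technology
Chinese Society of Particuology; University of Jinan; Mineral Processing and Utilization Committee of the China Non-Metallic Minerals Industry Association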


CSTPCD
Impact factor: 0.469
ISSN:1008-5548
Year, volume (issue): 2024, 30(3)