
Quantum reinforcement learning control based on entropy and unequal probability

High-precision control of complex quantum systems is one of the key technologies for realizing quantum computing and quantum information processing. Deep reinforcement learning algorithms have been applied to quantum control problems, where they can design optimal strategies for different quantum systems. To achieve fast, high-precision quantum state preparation, this paper proposes a deep reinforcement learning algorithm based on entropy and unequal probability, in which the notion of entropy from information theory is introduced to improve the action selection strategy. The entropy of the current state is computed from that state's action values, and the entropy value determines whether the agent performs "exploration" or "exploitation"; for exploitation, actions are selected randomly with unequal probabilities. The agent focuses on exploitation in states that have been learned sufficiently and on exploration in states that have not, until the task is accomplished. Numerical simulations on qubit systems show that, compared with conventional reinforcement learning algorithms, the proposed algorithm prepares eigenstates and entangled states with faster convergence and higher fidelity.
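The abstract does not give the formulas behind the selection rule, so the following is only a minimal sketch of one way such a rule could look, not the authors' implementation. It assumes that action values are turned into probabilities with a softmax, that the entropy is normalized to [0, 1], and that a fixed threshold separates exploration from exploitation. The names `softmax`, `normalized_entropy`, `select_action`, `threshold`, and `temperature` are all hypothetical.

```python
import math
import random


def softmax(q_values, temperature=1.0):
    """Map action values to a probability distribution (assumed mapping)."""
    m = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(n), so the result lies in [0, 1].

    Requires at least two actions, otherwise log(n) is zero.
    """
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def select_action(q_values, rng, threshold=0.5, temperature=1.0):
    """Return (action_index, mode) for one state.

    High entropy: the action values are nearly indistinguishable, so the
    state is treated as insufficiently learned and an action is drawn
    uniformly (exploration).
    Low entropy: the state is treated as sufficiently learned and an action
    is drawn with unequal probabilities weighted by the softmax
    (exploitation).
    The threshold value is an assumption, not taken from the paper.
    """
    probs = softmax(q_values, temperature)
    if normalized_entropy(probs) > threshold:
        return rng.randrange(len(probs)), "explore"
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return action, "exploit"


rng = random.Random(0)
flat_action, flat_mode = select_action([0.0, 0.0, 0.0, 0.0], rng)
peaked_action, peaked_mode = select_action([10.0, 0.0, 0.0, 0.0], rng)
print(flat_mode, peaked_mode)
```

In this sketch a state with flat action values is explored, while a state with one clearly dominant action value is exploited. Sampling from the weighted distribution rather than always taking the maximum is what "unequal probability" is taken to mean here.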

Keywords: reinforcement learning; action selection strategy; entropy; unequal probability; quantum state preparation

Zhang Yuyao (张玉瑶), Kuang Sen (匡森)


Department of Automation, University of Science and Technology of China, Hefei, Anhui 230027, China


2024

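Control Theory & Applications (控制理论与应用)
South China University of Technology; Academy of Mathematics and Systems Science, Chinese Academy of Sciences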


CSTPCD; Peking University Core Journals
Impact factor: 1.076
ISSN:1000-8152
Year, volume (issue): 2024, 41(12)