Knowledge reasoning is a critical task in knowledge graph completion and has garnered significant academic attention. To address poor interpretability, the inability to exploit hidden semantic information, and sparse rewards, this paper proposes a hierarchical reinforcement learning method that integrates Bi-LSTM and multi-head attention mechanisms. The knowledge graph is partitioned by spectral clustering, so that agents can reason both between clusters and between entities. A Bi-LSTM and multi-head attention module processes each agent's historical information, uncovering and using hidden semantic information in the knowledge graph. Through a hierarchical policy network, the high-level agent selects the cluster containing the target entity and thereby guides the low-level agent in entity-level reasoning. The reinforcement learning formulation addresses interpretability, and a mutual reward mechanism mitigates sparse rewards by rewarding both the agents' action choices and their search paths. Experimental results on the FB15K-237, WN18RR, and NELL-995 datasets show that the proposed method captures long-term dependencies in sequential data for long-path reasoning and outperforms comparable methods on reasoning tasks.
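
The two-level reasoning loop described above can be illustrated with a minimal sketch. It is not the paper's implementation: the toy triples, the hand-assigned cluster labels (standing in for the output of spectral clustering), the greedy rule-based policies (standing in for the learned hierarchical policy network with Bi-LSTM and multi-head attention), and the specific reward formula in `mutual_rewards` are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples; all names are hypothetical.
TRIPLES = [
    ("alice", "works_at", "acme"),
    ("acme", "located_in", "berlin"),
    ("berlin", "capital_of", "germany"),
    ("alice", "friend_of", "bob"),
    ("bob", "lives_in", "paris"),
]

# Hand-assigned cluster labels, standing in for the result of spectral clustering.
CLUSTER_OF = {"alice": 0, "bob": 0, "acme": 1, "berlin": 1, "germany": 1, "paris": 2}


def build_adjacency(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adj = defaultdict(list)
    for head, relation, tail in triples:
        adj[head].append((relation, tail))
    return adj


def high_level_policy(entity, target_cluster, adj, cluster_of):
    """Assumed greedy stand-in for the high-level agent: choose the target
    cluster if an outgoing edge reaches it, else the lowest reachable cluster."""
    options = sorted({cluster_of[tail] for _, tail in adj[entity]})
    if not options:
        return None
    return target_cluster if target_cluster in options else options[0]


def low_level_policy(entity, chosen_cluster, adj, cluster_of):
    """Assumed stand-in for the low-level agent: take the first edge whose
    tail lies in the cluster chosen by the high-level agent."""
    candidates = [(r, t) for r, t in adj[entity] if cluster_of[t] == chosen_cluster]
    return candidates[0] if candidates else None


def reason(start, target, adj, cluster_of, max_steps=4):
    """Alternate cluster selection and entity selection until the target is hit."""
    entity, path, chosen_clusters = start, [], []
    target_cluster = cluster_of[target]
    for _ in range(max_steps):
        if entity == target:
            break
        cluster = high_level_policy(entity, target_cluster, adj, cluster_of)
        if cluster is None:
            break
        step = low_level_policy(entity, cluster, adj, cluster_of)
        if step is None:
            break
        relation, nxt = step
        path.append((entity, relation, nxt))
        chosen_clusters.append(cluster)
        entity = nxt
    return path, chosen_clusters


def mutual_rewards(path, chosen_clusters, target, target_cluster):
    """Hypothetical reward shaping: both agents share the terminal hit reward;
    the high-level agent also earns credit for choosing the target cluster,
    and the low-level agent for finding a short path."""
    hit = 1.0 if path and path[-1][2] == target else 0.0
    cluster_bonus = (
        sum(c == target_cluster for c in chosen_clusters) / len(chosen_clusters)
        if chosen_clusters else 0.0
    )
    path_bonus = 1.0 / len(path) if path else 0.0
    return {"high": hit + cluster_bonus, "low": hit + path_bonus}


adj = build_adjacency(TRIPLES)
path, chosen = reason("alice", "germany", adj, CLUSTER_OF)
rewards = mutual_rewards(path, chosen, "germany", CLUSTER_OF["germany"])
```

In this toy run the high-level stand-in steers the search toward the cluster that contains `germany`, and the returned `path` is the explicit chain of triples that makes the inference inspectable. The dense per-step bonuses in `mutual_rewards` only illustrate the general idea of supplementing a sparse terminal reward; the paper's actual reward definition may differ.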