To improve attitude trajectory correction during shield excavation, a composite control method that combines active disturbance rejection control (ADRC) with Q-learning optimization is proposed. The work is motivated by three problems: the shield attitude strongly affects tunnel formation quality and excavation efficiency; the strongly coupled, nonlinear factors that act on the attitude in practice are complex and difficult to separate; and conventional parameter-tuning methods give insufficient steady-state performance. The proposed method first builds a mathematical model of the pressure-regulation zones of the shield's hydraulic cylinders and then designs a linear active disturbance rejection controller (LADRC) for it. Within this ADRC framework, the Q-learning algorithm tunes the controller parameters adaptively. The effectiveness of the method is validated through model simulations, which also provide technical guidance for developing equipment control programs. Compared with traditional proportional-integral-derivative (PID) control and conventional ADRC, the proposed method tunes its parameters adaptively and improves the control performance of shield attitude deviation correction.
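To make the idea of a linear disturbance rejection controller tuned by Q-learning concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy second-order plant in place of the hydraulic-cylinder model, a fixed observer bandwidth `WO`, a hypothetical set of candidate controller bandwidths `WC_CHOICES`, and a coarse three-level discretization of the tracking error as the Q-learning state. All names and numeric values are illustrative assumptions.

```python
import random

DT = 0.001                      # integration step in seconds (assumed)
B0 = 1.0                        # assumed nominal input gain of the plant
WO = 40.0                       # observer bandwidth, kept fixed in this sketch
WC_CHOICES = [2.0, 5.0, 10.0]   # hypothetical candidate controller bandwidths


class Ladrc:
    """Second-order LADRC: linear ESO plus PD law with disturbance cancellation."""

    def __init__(self):
        # Estimates of output, output rate, and total disturbance.
        self.z1 = self.z2 = self.z3 = 0.0

    def step(self, r, y, u_prev, wc):
        # Linear extended state observer, forward-Euler update.
        e = y - self.z1
        dz1 = self.z2 + 3 * WO * e
        dz2 = self.z3 + B0 * u_prev + 3 * WO**2 * e
        dz3 = WO**3 * e
        self.z1 += DT * dz1
        self.z2 += DT * dz2
        self.z3 += DT * dz3
        # PD law on the estimated states, then cancel the estimated disturbance.
        u0 = wc**2 * (r - self.z1) - 2 * wc * self.z2
        return (u0 - self.z3) / B0


def plant_step(y, dy, u, t):
    """Toy second-order plant with an unmodeled step disturbance (illustrative)."""
    d = 0.5 if t > 1.0 else 0.0
    ddy = -2.0 * dy - 1.0 * y + B0 * u + d
    return y + DT * dy, dy + DT * ddy


def error_bin(e):
    """Discretize |error| into three coarse Q-learning states."""
    a = abs(e)
    return 0 if a < 0.05 else (1 if a < 0.3 else 2)


def run_episode(q, eps, rng, alpha=0.2, gamma=0.9, hold=100, decisions=40):
    """One tracking run; the agent re-selects wc every `hold` controller steps."""
    ctrl, y, dy, u, t, r = Ladrc(), 0.0, 0.0, 0.0, 0.0, 1.0
    n = len(WC_CHOICES)
    s, total = error_bin(r - y), 0.0
    for _ in range(decisions):
        # Epsilon-greedy choice of the controller bandwidth.
        if rng.random() < eps:
            a = rng.randrange(n)
        else:
            a = max(range(n), key=lambda i: q[s][i])
        cost = 0.0
        for _ in range(hold):
            u = ctrl.step(r, y, u, WC_CHOICES[a])
            y, dy = plant_step(y, dy, u, t)
            t += DT
            cost += abs(r - y) * DT   # integrated absolute error
        s2 = error_bin(r - y)
        # Standard Q-learning update with reward = -cost.
        q[s][a] += alpha * (-cost + gamma * max(q[s2]) - q[s][a])
        s, total = s2, total + cost
    return total, r - y


def train(episodes=20, seed=0):
    """Train a small Q-table; returns the table and (cost, final error) per episode."""
    rng = random.Random(seed)
    q = [[0.0] * len(WC_CHOICES) for _ in range(3)]
    history = [run_episode(q, eps=0.2, rng=rng) for _ in range(episodes)]
    return q, history
```

Calling `q, history = train()` returns the learned Q-table and, for each episode, the integrated absolute error and the final tracking error. Selecting only the controller bandwidth keeps the action space small; a fuller design could also let the agent adjust the observer bandwidth, which this sketch holds constant.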