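Cognitive radio non-orthogonal multiple access (CR-NOMA) technology can alleviate the shortage of spectrum resources and improve the throughput of sensor devices. However, energy efficiency has long constrained the application of sensor devices. To address this, a deep deterministic policy gradient-based energy efficiency optimization (DPEE) algorithm is proposed for sensor devices in CR-NOMA. The DPEE algorithm improves the energy efficiency of sensor devices by jointly optimizing their transmission power and time slot splitting coefficient. The energy efficiency optimization problem is modeled as a Markov decision process and then solved with the deep deterministic policy gradient method. Finally, the influence of circuit power consumption, time slot duration, and the number of primary devices on sensor energy efficiency is analyzed through simulation. The simulation results show that energy efficiency decreases as the circuit power consumption of the sensor device increases. In addition, the proposed DPEE algorithm improves energy efficiency compared with the baseline algorithms.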
Deep deterministic policy gradient-based energy efficiency optimization algorithm for CR-NOMA
Cognitive radio non-orthogonal multiple access (CR-NOMA) technology can alleviate the shortage of spectrum resources and improve the throughput of sensor devices. However, the energy efficiency problem has long restricted the application of sensor devices. Therefore, a deep deterministic policy gradient-based energy efficiency optimization (DPEE) algorithm is proposed for sensor devices in CR-NOMA. By jointly optimizing the transmission power and the time slot splitting coefficient, the DPEE algorithm improves the energy efficiency of sensor devices. The energy efficiency optimization problem is modeled as a Markov decision process and solved by the deep deterministic policy gradient (DDPG) method. Finally, the influence of circuit power consumption, time slot duration, and the number of primary devices on energy efficiency is analyzed through simulation. The simulation results show that the energy efficiency decreases as the circuit power consumption of the sensor device increases. In addition, compared with the baseline algorithms, the proposed DPEE algorithm improves energy efficiency.
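
To make the joint optimization concrete, the following is a minimal sketch of how a two-dimensional action (transmission power and time slot splitting coefficient) could be mapped to an energy efficiency reward. The abstract does not give the system model, so everything here is an illustrative assumption rather than the paper's formulation: the names `scale_action` and `energy_efficiency`, the tanh-range actor output, the single-link rate expression, and the choice to charge circuit power over the whole slot are all hypothetical.

```python
import math


def scale_action(raw_power, raw_alpha, p_max):
    """Map actor outputs in [-1, 1] (assumed tanh output layer) to
    a transmit power in [0, p_max] and a splitting coefficient in [0, 1]."""
    power = 0.5 * (raw_power + 1.0) * p_max
    alpha = 0.5 * (raw_alpha + 1.0)
    return power, alpha


def energy_efficiency(power, alpha, circuit_power, slot_duration,
                      channel_gain, noise_power):
    """Toy reward: spectral efficiency delivered per joule in one slot.

    Assumed model (not taken from the paper): the device transmits for
    a fraction alpha of the slot and draws circuit power for the whole slot.
    """
    snr = power * channel_gain / noise_power
    # Bits per hertz delivered during the transmitting part of the slot.
    bits = alpha * slot_duration * math.log2(1.0 + snr)
    # Transmit energy plus circuit energy over the full slot.
    energy = alpha * slot_duration * power + slot_duration * circuit_power
    return bits / energy if energy > 0 else 0.0


if __name__ == "__main__":
    p, a = scale_action(0.0, 1.0, p_max=2.0)
    for p_c in (0.1, 0.5, 1.0):
        print(p_c, energy_efficiency(p, a, p_c, 1.0, 1.0, 1.0))
```

Under this toy model, a larger circuit power increases the denominator without changing the delivered bits, so the reward falls, which is consistent with the trend reported in the simulation results. In a DDPG setting, a function like this would serve as the per-step reward of the Markov decision process.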