Prediction of spillover risk for avian influenza virus based on deep learning
The influenza virus genome consists of eight gene segments of varying lengths, with a total length of approximately 14 to 16 kb. Owing to its molecular genetic mechanisms, the virus mutates rapidly through point mutation and genome reassortment, which alters its infection characteristics and poses a continuous threat to health. Accurate prediction of natural avian influenza virus spillover is therefore crucial for public health. This paper employs a combination of a convolutional neural network (CNN) and a recurrent neural network (RNN) to represent viral genome sequences, and evaluates the model's transferability on both group-specific datasets and the entire dataset. The experimental results show that the group-specific models achieve excellent prediction performance on their respective datasets, with AUROC exceeding 0.966 and AUPR exceeding 0.876, but their generalization ability is limited. In contrast, the global model performs well, with AUROC and AUPR values reaching 1.000 for all groups except H9N2. Ablation experiments show that the attention mechanism and the sequence embedding method significantly affect model performance, and further testing of generalization ability yields AUROC values above 0.984 and AUPR values above 0.941 for transfer predictions. Visualizing the attention weight matrix provides biological interpretability for the model. This high-performing deep learning model can be used in early warning systems for cross-species infections caused by avian influenza viruses.
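The results above are reported as AUROC and AUPR. As a reading aid, the following is a minimal sketch of how these two metrics can be computed from binary labels and model scores. It is not the authors' evaluation code: the function names `auroc` and `aupr` and the toy data are hypothetical, the label convention (1 for a spillover-associated sequence) is an assumption, and AUPR is approximated here by average precision, which is one common estimator among several.

```python
def auroc(labels, scores):
    """AUROC via the pairwise (Mann-Whitney U) identity.

    Assumption: labels are 0/1, with 1 marking the positive class.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Fraction of positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def aupr(labels, scores):
    """AUPR approximated by average precision.

    Averages the precision observed at the rank of each positive example
    when examples are sorted by descending score. Tied scores are kept in
    sort order, which is a simplification.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical toy data, not taken from the paper.
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2]
print(auroc(toy_labels, toy_scores), aupr(toy_labels, toy_scores))
```

On imbalanced data, AUPR is typically the stricter of the two because it ignores true negatives, which may explain why the reported AUPR thresholds sit below the corresponding AUROC thresholds.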