DNA Transcription Factor Binding Site Prediction Based on Split-Attention Mechanism
Accurately identifying transcription factor binding sites in DNA sequences is of great significance for gene expression analysis and drug design. Various deep learning methods have been applied to transcription factor binding site prediction, but there is still room for improvement in prediction performance. To this end, a new deep learning method, ResNest-TFBS, is proposed for predicting transcription factor binding sites on 690 ChIP-seq datasets. The method first captures the spatial structural characteristics of DNA by augmenting the one-hot sequence encoding with molecular dynamics features and electrostatic potential energy features. Next, a ResNest model, which combines the split-attention mechanism with a residual structure to apply channel attention to different channel branches, is trained on the global dataset to capture feature interactions and multi-channel representations. Finally, this prior knowledge is transferred to the 690 ChIP-seq datasets and tested extensively. The experimental results show that ResNest-TFBS achieves an average AUC of 0.929. In addition, the SHAP tool was used to assess the contribution of different features to this task, confirming that the introduced features provide additional valuable biological clues for predicting transcription factor binding sites.
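
The input encoding described above can be illustrated with a minimal sketch. It is not the authors' implementation: the channel order `ACGT`, the all-zero row for ambiguous bases, and the helper names `one_hot_encode` and `append_features` are assumptions made for illustration. The per-position extra values stand in for the molecular dynamics and electrostatic potential features, whose actual values and dimensions are not given here.

```python
from typing import List, Sequence

# Assumed channel order; the abstract does not specify one.
ALPHABET = "ACGT"


def one_hot_encode(seq: str) -> List[List[float]]:
    """Return a len(seq) x 4 matrix of one-hot rows.

    Bases outside ALPHABET (for example N) become all-zero rows,
    which is an assumption of this sketch.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper():
        row = [0.0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1.0
        encoded.append(row)
    return encoded


def append_features(
    one_hot: List[List[float]],
    extra: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate per-position extra feature vectors onto one-hot rows.

    `extra` is a hypothetical placeholder for structural features; it must
    supply one vector per sequence position.
    """
    if len(one_hot) != len(extra):
        raise ValueError("extra features must have one row per position")
    return [row + list(feat) for row, feat in zip(one_hot, extra)]


if __name__ == "__main__":
    encoded = one_hot_encode("ACGTN")
    # Placeholder feature values, chosen only to show the resulting shape.
    combined = append_features(encoded, [[0.0]] * len(encoded))
    print(len(combined), len(combined[0]))
```

Each position then carries four one-hot channels plus however many structural channels are appended, and the resulting matrix is the kind of multi-channel input a convolutional model such as ResNest would consume.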