To address the high computational complexity of long sequences and the tendency of traditional models to ignore differences between data in protein solubility prediction, a multi-input deep learning model, FESOL, was proposed. The linear-complexity attention mechanism FAVOR+ was used to efficiently extract feature information from long protein sequences. An enhanced loss function was designed by combining cross entropy and cosine similarity, so that the model attends to the differences between different input data. Comparative experiments were carried out against a variety of advanced prediction methods on an independent test set. The results show that FESOL outperforms the other methods on multiple evaluation metrics, which validates the effectiveness of the model for protein solubility prediction.
Key words
protein solubility prediction/attention mechanism/loss function/deep learning/feature fusion/long sequence/neural network
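
The abstract describes a loss that combines cross entropy with cosine similarity but gives no formula. The following is a minimal sketch of one possible form, not the FESOL implementation. It assumes a binary soluble/insoluble label, a weighted sum of the two terms, and a cosine term computed between two feature vectors per sample (for example, from two input branches). The names `enhanced_loss`, `feats_a`, `feats_b` and the weight `lam` are hypothetical.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross entropy for one predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def cosine_similarity(u, v, eps=1e-12):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)


def enhanced_loss(probs, labels, feats_a, feats_b, lam=0.1):
    """Mean cross entropy plus lam times the mean cosine similarity.

    Assumption: penalizing similarity between the paired feature vectors
    pushes the model to keep them distinct. The abstract does not state
    which vectors are compared or how the two terms are weighted.
    """
    n = len(probs)
    ce = sum(binary_cross_entropy(p, y) for p, y in zip(probs, labels)) / n
    sim = sum(cosine_similarity(a, b) for a, b in zip(feats_a, feats_b)) / n
    return ce + lam * sim
```

Under these assumptions, orthogonal feature pairs add nothing to the cross-entropy term, while identical pairs add the full weight `lam`.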