TransDFL: Identification of Disordered Flexible Linkers in Proteins by Transfer Learning

Disordered flexible linkers (DFLs) are functional disordered regions in proteins. They are sub-regions of intrinsically disordered regions (IDRs) and play important roles in connecting domains and maintaining inter-domain interactions. Because they are trained on the limited number of available DFLs, existing machine learning-based DFL predictors tend to predict ordered residues as DFLs, leading to a high false positive rate (FPR) and low prediction accuracy. Previous studies have shown that DFLs are extremely flexible disordered regions, which an IDR predictor usually predicts as disordered residues with high confidence [P(D) > 0.9]. Therefore, transferring an IDR predictor into an accurate DFL predictor is of great significance for understanding the functions of IDRs. In this study, we propose a new predictor called TransDFL, which identifies DFLs by transferring the RFPR-IDP predictor for IDR identification to DFL prediction. RFPR-IDP was pre-trained with IDR sequences to learn the general features common to IDRs and DFLs, which helps to reduce false positives in ordered regions. It was then fine-tuned with DFL sequences to capture the specific features of DFLs, yielding TransDFL. Experimental results in two application scenarios (prediction of DFLs only in IDRs, or prediction of DFLs in entire proteins) showed that TransDFL consistently outperformed other existing DFL predictors with higher accuracy. The corresponding web server of TransDFL can be freely accessed at http://bliulab.net/TransDFL/.
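The false positive rate mentioned in the abstract, and the difference between the two application scenarios, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `false_positive_rate`, the 0.5 decision threshold, and the toy labels, scores, and IDR mask are all assumptions made for illustration only.

```python
def false_positive_rate(labels, scores, threshold=0.5, mask=None):
    """Per-residue FPR = FP / (FP + TN).

    labels: 1 for a DFL residue, 0 for any other residue.
    scores: predicted DFL propensity per residue (hypothetical values).
    mask:   optional booleans; when given, only residues where mask is
            True are evaluated (e.g. residues inside annotated IDRs).
    """
    fp = tn = 0
    for i, (label, score) in enumerate(zip(labels, scores)):
        if mask is not None and not mask[i]:
            continue  # residue lies outside the evaluated region
        if label == 1:
            continue  # true DFL residues do not contribute to FPR
        if score >= threshold:
            fp += 1   # non-DFL residue predicted as DFL
        else:
            tn += 1   # non-DFL residue correctly rejected
    return fp / (fp + tn) if (fp + tn) else 0.0


# Toy protein of 8 residues (illustrative data, not from the paper).
labels = [0, 0, 0, 1, 1, 0, 0, 0]
scores = [0.1, 0.7, 0.2, 0.9, 0.8, 0.6, 0.3, 0.1]
in_idr = [False, False, True, True, True, True, False, False]

fpr_entire_protein = false_positive_rate(labels, scores)
fpr_within_idrs = false_positive_rate(labels, scores, mask=in_idr)
```

In the entire-protein scenario every non-DFL residue, ordered or disordered, counts as a potential false positive; in the IDR-only scenario the mask restricts the evaluation to disordered residues.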

Intrinsically disordered protein; Disordered flexible linker; False positive rate; Computational predictor; Transfer learning

Yihe Pang, Bin Liu


School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China

Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China

National Key R&D Program of China; Beijing Natural Science Foundation, China

2018AAA0100100; JQ19019

2023

Genomics, Proteomics & Bioinformatics
Beijing Institute of Genomics, Chinese Academy of Sciences


CSTPCD; CSCD
Impact factor: 0.495
ISSN: 1672-0229
Year, volume (issue): 2023, 21(2)