
Identification of SO₄²⁻ and PO₄³⁻ Ligand Binding Sites by the U-RF Algorithm Integrating Single-Residue Information

The binding of SO₄²⁻ and PO₄³⁻ ligands to proteins plays an important role in life processes, so accurate prediction of protein-anion ligand binding residues is of considerable significance. Previous studies of anion ligand binding sites were mostly carried out at the fragment level and rarely considered the single-residue level, which may lead to a loss of information. Extracting features at both the fragment level and the single-residue level can avoid this loss. At the fragment level, this study uses the composition information and position conservation information that earlier work extracted from amino acids, secondary structure, relative solvent accessibility, and hydrophilicity-hydrophobicity as the basic features. On this basis, it introduces single-residue-level propensity factors for amino acids, amino acid acid-base polarity, energy, and hydrophilicity-hydrophobicity, and adds the neighboring (left and right) residue pair information of binding residues and 9 orthogonal factors as new features. An algorithm combining undersampling and random forest (U-RF) was evaluated by five-fold cross-validation and independent testing, and the prediction results were better than those reported in previous work.
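The evaluation protocol named in the abstract (undersampling combined with five-fold cross-validation) can be illustrated with a minimal sketch of the data-handling step only. Everything here is an assumption made for illustration: the function names `undersample`, `stratified_folds`, and `cv_splits` are hypothetical, the 1:1 negative-to-positive ratio is an assumed default, and applying undersampling to the training folds only is a common practice that the abstract does not confirm. The classifier itself is not included; a random forest would be fitted on each balanced training split.

```python
import random
from collections import defaultdict


def undersample(labels, ratio=1.0, rng=None):
    """Keep every positive (binding) index and a random subset of negatives.

    ratio is an assumed target of negatives per positive (1.0 = balanced).
    """
    if rng is None:
        rng = random.Random(0)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(neg), int(round(ratio * len(pos))))
    kept = pos + rng.sample(neg, n_keep)
    rng.shuffle(kept)
    return kept


def stratified_folds(labels, k=5, rng=None):
    """Split indices into k folds with roughly equal class proportions."""
    if rng is None:
        rng = random.Random(0)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for j, i in enumerate(indices):
            folds[j % k].append(i)
    return folds


def cv_splits(labels, k=5, ratio=1.0, seed=0):
    """Yield (balanced_train_indices, test_indices) for each of k folds.

    Undersampling touches the training part only, so each test fold keeps
    the original class imbalance (an assumption, not stated in the source).
    """
    rng = random.Random(seed)
    folds = stratified_folds(labels, k, rng)
    for t in range(k):
        test = folds[t]
        train = [i for f in range(k) if f != t for i in folds[f]]
        train_labels = [labels[i] for i in train]
        kept = undersample(train_labels, ratio, rng)
        yield [train[i] for i in kept], test


if __name__ == "__main__":
    # Toy labels: 20 binding residues (1) and 180 non-binding residues (0).
    toy_labels = [1] * 20 + [0] * 180
    for train_idx, test_idx in cv_splits(toy_labels):
        n_pos = sum(toy_labels[i] for i in train_idx)
        print(len(train_idx), n_pos, len(test_idx))
```

Each yielded training split could then be passed to any random forest implementation, with predictions scored on the untouched test fold.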

binding residue; single-residue level; propensity factor; acid-base polarity; neighboring residue pair information

Chen Shaohua (陈少华), Hu Xiuzhen (胡秀珍), Hu Huimin (胡慧敏), Yao Yuqian (姚雨倩)


College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China


National Natural Science Foundation of China (Grant No. 61961032)

2024

Journal of Inner Mongolia University (Natural Science Edition)
Inner Mongolia University


CSTPCD
Impact factor: 0.346
ISSN: 1000-1638
Year, volume (issue): 2024, 55(2)