An optimization method based on support vector machine for the Ramachandran plot in protein structure annotation

The Ramachandran plot is a classic tool for validating protein structures and is widely used in structural biology. However, the favored regions defined by the traditional Ramachandran plot are broad and error-tolerant, and they include some inaccurate structures. To address this problem, a method based on the support vector machine (SVM) and Bayesian optimization, SVM-Rama, is proposed to optimize and subdivide the definition of the favored regions, so that each subdivided region corresponds to a specific type of secondary structure. SVM-Rama can improve the accuracy of protein structure validation and can annotate secondary structures simply and accurately. The results show that the accuracy of the method in secondary structure annotation is close to the best results obtained by traditional methods, while its training and computational costs are far lower.
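To illustrate the general idea of using an SVM to assign a backbone torsion pair (phi, psi) to a secondary structure class, here is a minimal sketch. It is not the SVM-Rama implementation. It assumes a linear SVM trained with the Pegasos stochastic subgradient method on periodic features, a hand-picked regularization constant instead of Bayesian optimization, and synthetic toy points scattered around approximate textbook helix and strand angles. All function names and parameter values are hypothetical.

```python
import math
import random


def features(phi_deg, psi_deg):
    """Map torsion angles to periodic features so that -180 and 180 coincide."""
    phi, psi = math.radians(phi_deg), math.radians(psi_deg)
    # The trailing 1.0 acts as a (regularized) bias term.
    return [math.cos(phi), math.sin(phi), math.cos(psi), math.sin(psi), 1.0]


def make_toy_data(n_per_class=200, spread=15.0, seed=0):
    """Synthetic samples only: +1 near a helix-like center, -1 near a strand-like center."""
    rng = random.Random(seed)
    centers = {+1: (-60.0, -45.0), -1: (-120.0, 130.0)}  # assumed approximate angles
    data = []
    for label, (phi0, psi0) in centers.items():
        for _ in range(n_per_class):
            x = features(rng.gauss(phi0, spread), rng.gauss(psi0, spread))
            data.append((x, label))
    return data


def train_pegasos(data, lam=0.01, n_iter=5000, seed=0):
    """Train a linear SVM (hinge loss + L2 penalty) with Pegasos-style updates."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    for t in range(1, n_iter + 1):
        x, y = rng.choice(data)
        eta = 1.0 / (lam * t)  # decaying step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # Shrink the weights (gradient of the L2 penalty).
        w = [(1.0 - eta * lam) * wi for wi in w]
        # Add the hinge-loss subgradient only when the margin is violated.
        if margin < 1.0:
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, phi_deg, psi_deg):
    """Return the class label for one (phi, psi) pair given in degrees."""
    score = sum(wi * xi for wi, xi in zip(w, features(phi_deg, psi_deg)))
    return "helix" if score >= 0 else "strand"


weights = train_pegasos(make_toy_data())
```

Calling `predict(weights, phi, psi)` then labels a new torsion pair. A practical version would use real annotated structures, more than two classes, a nonlinear kernel, and tuned hyperparameters.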

Keywords: Ramachandran plot; support vector machine; structure annotation of proteins

Wang Bo (王博), Su Tianhao (苏天昊), Xu Yanting (徐妍婷), Gao Heng (高恒), Guo Cong (郭聪), Li Yongle (李永乐), Wu Wei (吴伟)


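International Center for Quantum and Molecular Structures, College of Sciences, Shanghai University, Shanghai 200444, China

Materials Genome Institute, Shanghai University, Shanghai 200444, China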


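Funding: Innovation Program of the Science and Technology Commission of Shanghai Municipality (21JC1402700); Shanghai "Science and Technology Innovation Action Plan" Rising-Star Program, Sailing Special Project (22YF1413300)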

Journal of Shanghai University (Natural Science Edition)
Shanghai University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.579
ISSN: 1007-2861
Year, volume (issue): 2024, 30(3)