Prediction of Anti-CRISPR Proteins Based on Feature Information Fusion

The CRISPR-Cas system is an adaptive immune system found in bacteria and archaea. It has been widely studied as a gene-editing tool, for example in cancer treatment, but CRISPR-Cas gene editing suffers from off-target effects. Anti-CRISPR proteins regulate the function of the CRISPR-Cas system: they can reduce off-target effects and other adverse impacts without compromising targeted gene editing, thereby improving the efficiency and safety of gene-editing techniques. Studying Anti-CRISPR proteins is therefore important for understanding the function of the CRISPR-Cas system and bacterium-virus interactions. In this study, an Anti-CRISPR protein dataset was constructed and six feature parameters were extracted: amino acid composition, dipeptide composition, g-gap dipeptide composition, protein secondary structure, auto-covariance average chemical shift, and protein blocks. A support vector machine (SVM) was used to predict Anti-CRISPR proteins. Under the Jackknife test, the highest accuracy achieved with a single feature parameter was 93.50%. Dimensionality reduction was then applied separately to the high-dimensional dipeptide composition and g-gap dipeptide composition; the g-gap dipeptide composition gave the highest accuracy, 95.10%, at g = 3 with 121 dimensions. Further analysis found that 16 g-gap dipeptides, corresponding to 11 amino acids, are highly relevant to the prediction of Anti-CRISPR proteins. Finally, the feature parameters were fused, and the highest accuracy after fusion under the Jackknife test was 96.07%.
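To make the g-gap dipeptide composition feature concrete, the following is a minimal sketch of how such a vector is commonly computed. It is an illustration under stated assumptions, not the authors' implementation: the function name `g_gap_dipeptide_composition`, the normalization by the number of counted pairs, the handling of non-standard residues, and the example sequence are all hypothetical choices made here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def g_gap_dipeptide_composition(sequence, g=0):
    """Return a dict of 400 residue-pair frequencies for a given gap g.

    A g-gap dipeptide is a pair of residues separated by g other residues;
    g=0 gives the ordinary (adjacent) dipeptide composition.
    Assumption: frequencies are normalized by the number of counted pairs.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:  # sequence too short for this gap
        return {pair: 0.0 for pair in pairs}
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical toy sequence, used only to show the call.
features = g_gap_dipeptide_composition("MKTAYIAKQR", g=3)
```

Each sequence yields a 400-dimensional vector before any dimensionality reduction; a feature-selection step would then keep a subset of these pairs, such as the 121 dimensions reported above for g = 3.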

Anti-CRISPR protein; feature information; dimensionality reduction; prediction

Zhang Dimeng (张迪萌), Li Fengmin (李凤敏)

College of Science, Inner Mongolia Agricultural University, Hohhot 010018, China

Natural Science Foundation of Inner Mongolia Autonomous Region (Grant No. 2019MS03015)

2024

Journal of Inner Mongolia University (Natural Science Edition)
Inner Mongolia University


CSTPCD
Impact factor: 0.346
ISSN: 1000-1638
Year, volume (issue): 2024, 55(3)