Performance Comparison of Computational Methods for the Prediction of the Function and Pathogenicity of Non-coding Variants

Non-coding variants in the human genome significantly influence human traits and complex diseases via their regulation and modification effects. Hence, an increasing number of computational methods have been developed to predict the effects of variants in human non-coding sequences. However, it is difficult for inexperienced users to select appropriate computational methods from the dozens available. To address this issue, we assessed 12 performance metrics of 24 methods on four independent non-coding variant benchmark datasets: (1) rare germline variants from clinically relevant sequence variants (ClinVar), (2) rare somatic variants from the Catalogue Of Somatic Mutations In Cancer (COSMIC), (3) common regulatory variants from curated expression quantitative trait locus (eQTL) data, and (4) disease-associated common variants from curated genome-wide association studies (GWAS). All 24 tested methods performed differently under various conditions, indicating varying strengths and weaknesses under different scenarios. Importantly, the performance of existing methods was acceptable for rare germline variants from ClinVar, with an area under the receiver operating characteristic curve (AUROC) of 0.4481-0.8033, and poor for rare somatic variants from COSMIC (AUROC = 0.4984-0.7131), common regulatory variants from curated eQTL data (AUROC = 0.4837-0.6472), and disease-associated common variants from curated GWAS (AUROC = 0.4766-0.5188). We also compared the prediction performance of the 24 methods for non-coding de novo mutations in autism spectrum disorder, and found that the combined annotation-dependent depletion (CADD) and context-dependent tolerance score (CDTS) methods showed better performance. In summary, we assessed the performance of 24 computational methods under diverse scenarios, providing preliminary advice for proper tool selection and guiding the development of new techniques for interpreting non-coding variants.
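
For readers unfamiliar with the AUROC values quoted in the abstract, the following is a minimal sketch of how an AUROC can be computed for one scoring method on a labeled variant set. It uses the standard rank-based (Mann-Whitney) definition and is not the authors' evaluation pipeline. The function name `auroc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney formulation.

    Returns the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0);
    tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up example (NOT from the study): 1 = pathogenic, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]
toy_auroc = auroc(labels, scores)
print(f"AUROC = {toy_auroc:.4f}")
```

An AUROC near 0.5 means the scores rank pathogenic and benign variants no better than chance, which is why the ranges reported for the eQTL and GWAS benchmarks are described as poor.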

Non-coding variant; Pathogenicity estimation; Functional prediction; Performance assessment; Prediction model

Zheng Wang, Guihu Zhao, Bin Li, Zhenghuan Fang, Qian Chen, Xiaomeng Wang, Tengfei Luo, Yijing Wang, Qiao Zhou, Kuokuo Li, Lu Xia, Yi Zhang, Xun Zhou, Hongxu Pan, Yuwen Zhao, Yige Wang, Lin Wang, Jifeng Guo, Beisha Tang, Kun Xia, Jinchen Li

National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China

Department of Neurology, Xiangya Hospital, Central South University, Changsha 410008, China

Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha 410008, China

Reproductive Medicine Center, Xiangya Hospital, Central South University, Changsha 410008, China

National Natural Science Foundation of China; Young Elite Scientist Sponsorship Program by China Association for Science and Technology; Innovation-Driven Project of Central South University, China; Natural Science Foundation for Young Scientists of Hunan Province, China; Natural Science Foundation of Hunan Province for Outstanding Young Scholars, China

81801133; 2018QNRC001; 20180033040004; 2019JJ50974; 2020JJ3059

2023

Genomics, Proteomics & Bioinformatics
Beijing Institute of Genomics, Chinese Academy of Sciences


Indexed in: CSTPCD; CSCD
Impact factor: 0.495
ISSN: 1672-0229
Year, volume (issue): 2023, 21(3)
  • 66