Effectiveness of machine learning at modeling the relationship between Hi-C data and copy number variation

Copy number variation (CNV) refers to the number of copies of a specific sequence in a genome and is a type of chromatin structural variation. The development of the Hi-C technique has empowered research on the spatial structure of chromatin by capturing interactions between DNA fragments. We utilized machine-learning methods, including the linear transformation model and graph convolutional network (GCN), to detect CNV events from Hi-C data and reveal how CNV is related to three-dimensional interactions between genomic fragments in terms of the one-dimensional read count signal and features of the chromatin structure. The experimental results demonstrated a specific linear relation between the Hi-C read count and CNV for each chromosome that can be well characterized by the linear transformation model. In addition, the GCN-based model could accurately extract features of the spatial structure from Hi-C data and infer the corresponding CNV across different chromosomes in a cancer cell line. We performed a series of experiments, including dimension reduction, transfer learning, and Hi-C data perturbation, to comprehensively evaluate the utility and robustness of the GCN-based model. This work can provide a benchmark for using machine learning to infer CNV from Hi-C data and serves as a necessary foundation for a deeper understanding of the relationship between Hi-C data and CNV.
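
The idea of relating a one-dimensional Hi-C read count signal to CNV through a per-chromosome linear relation can be illustrated with a minimal sketch. This is not the authors' linear transformation model: the data are synthetic, the assumption that contacts scale with the product of bin copy numbers is a toy simplification, and the function names `read_count_signal` and `fit_linear_relation` are hypothetical.

```python
import numpy as np

def read_count_signal(contact_matrix):
    """Collapse a Hi-C contact matrix into a 1D per-bin read count (row sums)."""
    return np.asarray(contact_matrix, dtype=float).sum(axis=1)

def fit_linear_relation(read_counts, copy_numbers):
    """Least-squares fit of copy_number ~ slope * read_count + intercept."""
    slope, intercept = np.polyfit(read_counts, copy_numbers, deg=1)
    return slope, intercept

# Synthetic toy data for one hypothetical chromosome (not taken from the paper).
rng = np.random.default_rng(0)
n_bins = 50
copy_numbers = rng.integers(1, 5, size=n_bins).astype(float)

# Toy assumption: expected contacts between bins i and j scale with c_i * c_j.
expected = 5.0 * np.outer(copy_numbers, copy_numbers)
contacts = rng.poisson(expected).astype(float)
contacts = (contacts + contacts.T) / 2.0  # symmetrize, as Hi-C maps are symmetric

signal = read_count_signal(contacts)
slope, intercept = fit_linear_relation(signal, copy_numbers)
predicted = slope * signal + intercept

# Pearson correlation between fitted and simulated copy numbers.
r = np.corrcoef(predicted, copy_numbers)[0, 1]
print(f"slope={slope:.4g}, intercept={intercept:.4g}, r={r:.3f}")
```

Fitting one slope and intercept per chromosome mirrors the abstract's statement that the linear relation is chromosome-specific; real Hi-C data would additionally need normalization for biases that this toy example ignores.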

copy number variant; deep learning; graph convolution network; Hi-C

Yuyang Wang, Yu Sun, Zeyu Liu, Bijia Chen, Hebing Chen, Chao Ren, Xuanwei Lin, Pengzhen Hu, Peiheng Jia, Xiang Xu, Kang Xu, Ximeng Liu, Hao Li, Xiaochen Bo

Institute of Health Service and Transfusion Medicine, Beijing, China

College of Computer and Data Science, Fuzhou University, Fuzhou, China

Beijing Institute of Radiation Medicine, Beijing, China

School of Life Sciences, Northwestern Polytechnical University, Xi'an, China

School of Mathematics and Computer Science, Shanxi Normal University, Taiyuan, China

School of Software, Shandong University, Qingdao, China

2024

Quantitative Biology (English edition)

ISSN:
Year, Volume (Issue): 2024, 12(3)