A review of methods for copy number variation detection using high-throughput sequencing data
Copy number variation (CNV) refers to an increase or decrease in the number of copies of a large segment of DNA sequence in the genome. Previous studies have shown that CNV causes many human diseases and is closely related to the mechanisms by which they arise and progress. The emergence of high-throughput sequencing technology has provided technical support for CNV detection, and sequencing-based detection has become the mainstream approach in human disease research and clinical diagnosis. Although new algorithms and software tools based on high-throughput sequencing have been developed, their accuracy remains a challenge. This paper presents a comprehensive review of CNV detection methods based on high-throughput sequencing data, including methods based on read depth, paired-end mapping, split reads, and de novo assembly, as well as methods that combine these four techniques. For each type of method, the underlying principle, representative software tools, applicable data, and advantages and disadvantages are discussed in depth. In addition, future directions for the development of high-throughput sequencing technology are explored.
Next-generation sequencing data; Genome structural variant; Copy number variation detection