Identification of Copy Number Variation and Its Association with Body Weight and Size of Lion-Head Geese by Next-Generation Sequencing
[Background] Many previous studies have reported that copy number variation (CNV), a deletion or duplication 50 bp to 5 Mb in length, can affect gene expression. CNVs are closely associated with economically important traits in livestock and are promising molecular markers. The lion-head goose, one of the largest goose breeds in the world, originated in Raoping, Guangdong Province, and is the raw material for Guangdong marinated goose. So far, there has been no genome-wide association study of the relationship between CNV and body weight and size in lion-head geese.
[Objective] This study identified CNVs and CNV regions (CNVRs) in lion-head geese from next-generation genome sequencing data, and then detected CNVs and candidate genes significantly affecting body weight and size through association analysis, to provide a valuable reference for molecular breeding of lion-head geese.
[Method] A total of 111 lion-head geese (20 males and 91 females) were collected from the Baisha Poultry and Livestock Origin Research Institute in Shantou. All geese were raised and managed under uniform standards. Body weight and body size traits were measured for all 111 geese; the body size traits included body oblique length, chest depth, chest width and so on. Next-generation genome sequencing data (5×) were generated from blood samples of these geese. SOAPnuke was used for quality control of the sequencing data. The BWA module of SpeedSeq was used for alignment, and the LUMPY and CNVnator modules of SpeedSeq were used to detect structural variations (SVs), from which CNVs were selected. SVtools was used to genotype the CNVs, and association analysis between CNVs and the body weight and size traits was performed with a single-marker mixed model. CNVs significantly associated with the traits were screened at the chromosome-wide significance level (0.05/number of CNVs on the chromosome), and the significant CNVs, together with their 50 kb upstream and downstream flanking regions, were annotated to identify candidate genes for body weight and size in lion-head geese. The R package CNVrd2 was used to analyze the linkage disequilibrium (LD) between chromosome-significant CNVs and chromosome-significant SNPs less than 1 Mb apart.
[Result] In the 111 lion-head geese, 99 158 CNVs were detected, including 94 560 deletions and 4 598 duplications. The average CNV length was 11 858 bp, and most CNVs (74.06%) were 50 bp to 1 kb long. A total of 5 225 CNVRs were detected, comprising 5 029 loss, 110 gain, and 86 mixed types. The average CNVR length was 7 136 bp, and most CNVRs (81.03%) were 50 bp to 1 kb long. Functional annotation showed that 46.92% of CNVRs were located in intergenic regions, 10.30% upstream of genes, and 9.35% downstream of genes. A total of 6 217 CNVs were accurately genotyped and used for association analysis. In the association analysis with body weight and size traits, 55 CNVs exceeded the chromosome-wide significance level, and 45 candidate genes were annotated from these 55 CNVs. Among the 45 candidate genes, 10 genes, such as SETD2, UBR7 and G2E3, simultaneously influenced two or more traits. The chromosome-significant CNVs affected body weight and size traits independently of the chromosome-significant SNPs (r² < 0.02).
[Conclusion] This study reports for the first time the genomic distribution of CNVs and CNVRs in lion-head geese and the association of CNVs with body weight and size, based on next-generation genome sequencing data. A total of 45 candidate genes influencing body weight and size traits were found, 11 of which have been reported to be related to signaling pathways of animal growth: SETD2, UBR7, ASB1 and HDAC4 are involved in muscle proliferation, differentiation and metabolism; G2E3, P3C2B, NOVA1 and PDE1B in adipogenesis and obesity; ILKAP in regulating growth factors; KIF1B in bone metabolism; and ZFP37 in glycogen metabolism. These results lay a solid foundation for analyzing the molecular genetic mechanisms of growth performance in lion-head geese and for detecting molecular markers for it.
[Keywords] lion-head goose; body weight and size traits; CNV; candidate gene
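The chromosome-level screening rule stated in the Method (0.05 divided by the number of CNVs on the chromosome) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the tuple layout of the input, the strict `p < threshold` comparison, and the toy p-values are all assumptions made here for illustration only.

```python
from collections import Counter


def chromosome_thresholds(cnv_chroms, alpha=0.05):
    """Map each chromosome to alpha / (number of CNVs tested on it)."""
    counts = Counter(cnv_chroms)
    return {chrom: alpha / n for chrom, n in counts.items()}


def significant_cnvs(results, alpha=0.05):
    """Return IDs of CNVs whose p-value is below their chromosome's threshold.

    `results` is a list of (cnv_id, chromosome, p_value) tuples; this
    layout is a hypothetical one chosen for the example.
    """
    thresholds = chromosome_thresholds(
        [chrom for _, chrom, _ in results], alpha
    )
    # Assumption: "exceeding" the significance level means p < threshold.
    return [cnv_id for cnv_id, chrom, p in results if p < thresholds[chrom]]


# Hypothetical toy association results (not data from the study).
toy = [
    ("cnv1", "chr1", 0.004),
    ("cnv2", "chr1", 0.030),
    ("cnv3", "chr1", 0.200),
    ("cnv4", "chr1", 0.600),
    ("cnv5", "chr2", 0.020),
    ("cnv6", "chr2", 0.030),
]

print(significant_cnvs(toy))
```

In this toy input, chr1 carries four CNVs (threshold 0.0125) and chr2 carries two (threshold 0.025), so the same nominal p-value can pass on one chromosome and fail on another.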