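Journal of Southern Agriculture (南方农业学报), 2023, Vol. 54, Issue 8: 2330-2339. DOI: 10.3969/j.issn.2095-1191.2023.08.014

Codon usage bias in the chloroplast genome of high production Coffea arabica L.

Li Yaqi (李亚麒), Yan Wei (严炜), Lou Yuqiang (娄予强), Huang Jiaxiong (黄家雄), Hu Faguang (胡发广), Fu Xingfei (付兴飞), Li Yanan (李亚男), Cheng Jinhuan (程金焕)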

Author information

  • 1. Institute of Tropical and Subtropical Cash Crops, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan 678000

Abstract

[Objective] To analyze the codon usage pattern of the chloroplast genome of high production Coffea arabica L. and the factors influencing its codon usage bias, to identify suitable heterologous expression hosts, and to provide a reference for phylogenetic analysis and gene function verification in coffee, especially heterologous expression.

[Method] The complete chloroplast genome sequence of high production C. arabica (MK353209) was retrieved from the GenBank database. A total of 51 coding sequences (CDS) were selected that were longer than 300 bp, started with ATG, ended with TAG, TAA or TGA, and contained no internal stop codons or duplicated sequences. Codon usage characteristics and their influencing factors were analyzed systematically using the EMBOSS online website and CodonW 1.4.2, and codon usage frequencies were compared with those of model organisms.

[Result] The GC content at the first (GC1), second (GC2) and third (GC3) codon positions, and the average GC content over the three positions (GC), did not exceed 50.00%. Among the 30 codons with RSCU>1.00, 96.67% ended with A/T and 3.33% ended with G/C, indicating that the chloroplast genome of high production C. arabica tends to use codons ending with A/T. The mean values of the effective number of codons (ENC), codon adaptation index (CAI) and frequency of optimal codons (Fop) were 46.97, 0.167 and 0.353 respectively, suggesting that the codon usage bias of the chloroplast genes is weak. The 20 high-frequency codons were GCT, TGT, GAT, GAA, TTT, GGA, CAT, ATT, AAA, TTA, AAT, CCT, CAA, AGA, TCT, ACT, GTA, GTT, TAT and TAA. Neutrality plot, ENC-plot, PR2-plot and correspondence analysis showed that codon usage bias was shaped jointly by natural selection, mutation and other factors, with natural selection playing the decisive role. Compared with Arabidopsis thaliana, Nicotiana tabacum, Escherichia coli and Saccharomyces cerevisiae, 23.44%, 15.63%, 40.63% and 15.63% of the codons of high production C. arabica, respectively, differed considerably in usage frequency. Finally, 21 optimal codons were determined, of which 57.14% ended with T, 38.10% ended with A and 4.76% ended with G.

[Conclusion] The codon usage bias of the chloroplast genome of high production C. arabica is relatively weak, with a preference only for codons ending with A/T. The bias is affected by multiple factors, including natural selection, mutation, gene expression level and gene length. S. cerevisiae and N. tabacum are the more suitable heterologous expression hosts for high production C. arabica genes.
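The CDS selection criteria and the position-wise GC content and RSCU statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline (they used EMBOSS and CodonW 1.4.2). The function names, the exact-duplicate check, the multiple-of-three length check and the plain GC3 (counted over all codons, unlike CodonW's GC3s) are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "TCAG"
# Standard codon assignments, codons enumerated in TCAG order ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons(seq):
    """Split a sequence into complete codons, ignoring any trailing bases."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def passes_filter(seq, min_len=300):
    """Longer than min_len, ATG start, terminal stop codon, no internal stop.
    The multiple-of-three check is an added assumption."""
    cds = codons(seq)
    return (
        len(seq) > min_len
        and len(seq) % 3 == 0
        and cds[0] == "ATG"
        and cds[-1] in STOP_CODONS
        and not any(c in STOP_CODONS for c in cds[1:-1])
    )


def select_cds(seqs, min_len=300):
    """Keep sequences passing the filter; exact duplicates are dropped (assumption)."""
    seen, kept = set(), []
    for s in seqs:
        s = s.upper()
        if s not in seen and passes_filter(s, min_len):
            seen.add(s)
            kept.append(s)
    return kept


def gc_by_position(cds_list):
    """Return (GC1, GC2, GC3) in percent, counted over every codon."""
    counts, total = [0, 0, 0], 0
    for seq in cds_list:
        for codon in codons(seq):
            total += 1
            for pos, base in enumerate(codon):
                counts[pos] += base in "GC"
    return tuple(100.0 * c / total for c in counts)


def rscu(cds_list):
    """RSCU = observed codon count * family size / total count of the synonymous family."""
    counts = Counter(c for seq in cds_list for c in codons(seq) if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total:  # skip amino acids that never occur
            for c in family:
                values[c] = counts[c] * len(family) / total
    return values
```

A codon with RSCU above 1.00 is used more often than expected under equal use of its synonymous codons, which is the threshold the abstract applies when counting the 30 preferred codons.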

Key words

high production Coffea arabica L./chloroplast genome/codon usage bias/optimal codon/heterologous expression


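Funding

National Key Research and Development Program of China (2022YFF130240403)

Yunnan Provincial International Science and Technology Commissioner Project (202203AK140030)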

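Publication year

2023
Journal of Southern Agriculture
Guangxi Academy of Agricultural Sciences

Indexed in: CSTPCD, CSCD, Peking University Core
Impact factor: 0.83
ISSN: 2095-1191
Number of references: 14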