Journal of Genetics and Genomics (遗传学报), 2024, Vol. 51, Issue 6: 652-664. DOI: 10.1016/j.jgg.2024.03.004

RNAirport: a deep neural network-based database characterizing representative gene models in plants

Sitao Zhu 1, Shu Yuan 1, Ruixia Niu 1, Yulu Zhou 1, Zhao Wang 1, Guoyong Xu 2

Author information

  • 1. State Key Laboratory of Hybrid Rice, Institute for Advanced Studies (IAS), Wuhan University, Wuhan, Hubei 430072, China
  • 2. State Key Laboratory of Hybrid Rice, Institute for Advanced Studies (IAS), Wuhan University, Wuhan, Hubei 430072, China; Hubei Hongshan Laboratory, Wuhan, Hubei 430070, China

Abstract

A 5'-leader, known initially as the 5'-untranslated region, contains multiple isoforms due to alternative splicing (aS) and alternative transcription start sites (aTSS). Therefore, a representative 5'-leader is needed to examine the embedded RNA regulatory elements that control translation efficiency. Here, we develop a ranking algorithm and a deep-learning model to annotate representative 5'-leaders for five plant species. We rank the intra-sample and inter-sample frequency of aS-mediated transcript isoforms using a Kruskal-Wallis test-based algorithm and identify the representative aS-5'-leader. To further assign a representative 5'-end, we train the deep-learning model 5'leaderP to learn aTSS-mediated 5'-end distribution patterns from cap-analysis gene expression data. The model accurately predicts the 5'-end, as confirmed experimentally in Arabidopsis and rice. The gene models containing representative 5'-leaders and 5'leaderP can be accessed at RNAirport (http://www.rnairport.com/leader5P/). The Stage 1 annotation of 5'-leaders records 5'-leader diversity and will pave the way to Ribo-Seq open-reading frame annotation, identical to the project recently initiated by human GENCODE.

Key words

5'-leader / Transcript isoforms / RNA regulatory elements / uORF / Deep learning / Synthetic biology / Translational control

Funding

National Key R&D Program of China (2023ZD04073)

Major Project of Hubei Hongshan Laboratory (2022hszd016)

Key Research and Development Program of Hubei Province (2022BFE003)

National Natural Science Foundation of China (32070284)

Publication year

2024
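
Journal: Journal of Genetics and Genomics (遗传学报)
Sponsors: Genetics Society of China; Institute of Genetics and Developmental Biology, Chinese Academy of Sciences
Indexed in: CSTPCD
Impact factor: 0.821
ISSN: 1673-8527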