Prediction of Cell-Specific Subcellular Localization of lncRNA Based on Multi-Feature Fusion
Long non-coding RNA (lncRNA) plays a crucial role in cellular biological processes and disease development. Because the subcellular localization of an lncRNA is closely correlated with its biological function, determining that localization is of significant importance. Several machine learning-based methods currently exist for identifying the subcellular localization of lncRNA. However, research on the cell-specific subcellular localization of lncRNA in humans remains limited. This study investigated the subcellular localization of lncRNA in human cell lines and extracted features such as k-mer, CKSNAP, SRS, and TSS. The different types of features were fused, and an algorithm combining XGBoost and LightGBM was used to predict the subcellular localization of lncRNA in human cell lines. The model was evaluated using 10-fold cross-validation. The results showed that, compared with existing prediction methods, this algorithm improved the prediction success rate for the subcellular localization of lncRNA in human cell lines, with the highest AUROC value on the benchmark dataset reaching 92.26%.
cell line specific; long non-coding RNA; secondary structure; feature fusion; gradient boosting decision tree
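As an illustration of the sequence-derived part of the feature fusion described in the abstract, the following is a minimal sketch of k-mer and CKSNAP (composition of k-spaced nucleic acid pairs) extraction, with the two vectors concatenated into one fused vector. It is not the authors' implementation: the function names, the default values `k=3` and `max_gap=2`, and the choice to normalize by the number of valid windows are all assumptions made here for illustration. The SRS and TSS features and the XGBoost/LightGBM classifier are omitted because their details are not given in this section.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_features(seq, k=3):
    """Normalized frequency of every k-mer over ACGT (4**k values).

    k=3 is an illustrative default, not a value taken from the paper.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def cksnap_features(seq, max_gap=2):
    """Normalized frequency of nucleotide pairs separated by 0..max_gap bases.

    Produces 16 values per gap. max_gap=2 is an illustrative default.
    """
    seq = seq.upper().replace("U", "T")
    pairs = ["".join(p) for p in product(ALPHABET, repeat=2)]
    features = []
    for gap in range(max_gap + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(seq) - gap - 1):
            pair = seq[i] + seq[i + gap + 1]
            if pair in counts:
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


def fused_features(seq, k=3, max_gap=2):
    """Fuse the feature types by simple concatenation (an assumption)."""
    return kmer_features(seq, k) + cksnap_features(seq, max_gap)
```

With these defaults, each sequence yields a vector of 4^3 + 3 × 16 = 112 values, which could then be passed to a gradient boosting classifier and scored under 10-fold cross-validation.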