Abstract
Metabolomics is the field that studies the full set of small-molecule metabolites in an organism. Using advanced platforms such as mass spectrometry and nuclear magnetic resonance, it enables comprehensive, high-throughput detection of metabolites and offers new avenues for early disease diagnosis, the study of pathogenic mechanisms, and personalized treatment. However, metabolomics data are diverse, high-dimensional, dynamic, noisy, and variable, which poses serious challenges for the development of data analysis methods. This paper reviews recent progress in the main machine learning methods, and particularly their improved variants, for the statistical analysis of metabolomics data, with the aim of providing a basis for effective data analysis and for making full use of metabolomics in medical research.
Funding
National Natural Science Foundation of China, General Program (82073536)
Guangxi Natural Science Foundation, General Program (2022GXNSFAA035634)