Machine learning workflow recommendation for data analysis in the industrial Internet
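Wen Yiping (文一凭) 1, Tian Muyang (田沐阳) 1, Tan Zheng (谭铮) 2, Kang Guosheng (康国胜) 1, Liu Jianxun (刘建勋) 1
Author information
- 1. Hunan Provincial Key Laboratory for Services Computing and Novel Software Technology, Hunan University of Science and Technology, Xiangtan 411201, Hunan, China
- 2. China Railway Construction Heavy Industry Corporation Limited, Changsha 410000, China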
Abstract
Industrial big data is multimodal and strongly correlated, which poses new challenges for data analysis and its applications. Carrying out an effective data analysis process that matches the requirements of an industrial application is usually a complex, time-consuming and labor-intensive task. To address this problem, a machine learning workflow recommendation method for data analysis in the industrial Internet is proposed. The method starts from existing solutions and uses their datasets and machine learning workflows as recommendation references. Based on the Doc2vec model and the maximum mean discrepancy (MMD) method, it computes the similarity of text descriptions and the similarity of data distribution features between existing solutions and the current data analysis task, and then recommends suitable machine learning workflows from those existing solutions. Simulation experiments demonstrate the effectiveness of the method.
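The two similarity measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the text descriptions have already been embedded as fixed-length vectors (for example by a trained Doc2vec model, which is not reproduced here), uses a Gaussian (RBF) kernel for the MMD estimate, and combines the two scores with a weight `alpha`. The kernel choice, the `exp(-MMD^2)` mapping, the weight `alpha`, and all function names are assumptions made for illustration only.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine similarity between two text embeddings (e.g. Doc2vec vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rbf_kernel(x: Vector, y: Vector, gamma: float) -> float:
    """Gaussian kernel exp(-gamma * ||x - y||^2); the kernel is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs: Sequence[Vector], ys: Sequence[Vector], gamma: float = 1.0) -> float:
    """Biased estimate of the squared maximum mean discrepancy between two samples."""

    def mean_kernel(a: Sequence[Vector], b: Sequence[Vector]) -> float:
        total = sum(rbf_kernel(p, q, gamma) for p in a for q in b)
        return total / (len(a) * len(b))

    value = mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
    return max(value, 0.0)  # clip tiny negative values caused by rounding


def recommend(
    task_vec: Vector,
    task_samples: Sequence[Vector],
    solutions: Dict[str, Tuple[Vector, Sequence[Vector]]],
    alpha: float = 0.5,
    gamma: float = 1.0,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank existing solutions by a weighted sum of text and data similarity.

    `solutions` maps a (hypothetical) workflow name to its description
    embedding and a sample of its dataset rows.
    """
    scored = []
    for name, (doc_vec, samples) in solutions.items():
        text_sim = cosine_similarity(task_vec, doc_vec)
        # Assumed mapping: MMD^2 of 0 (identical samples) becomes similarity 1.
        data_sim = math.exp(-mmd_squared(task_samples, samples, gamma))
        scored.append((name, alpha * text_sim + (1.0 - alpha) * data_sim))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```

Calling `recommend` with the embedding and a data sample of a new task returns the names of the closest existing solutions, whose workflows would then be the candidates for reuse. In practice the weight `alpha` and the kernel width `gamma` would need to be tuned on the available solutions.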
Key words
industrial Internet/machine learning workflow/recommendation/data analysis
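Funding
National Key Research and Development Program of China (2020YFB1707600)
National Natural Science Foundation of China (62177014)
Project supported by the Education Department of Hunan Province (20B222)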
Year of publication
2024