
Research on a Ship Data Vector Model Based on Deep Learning

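In recent years, the informatization of the shipping industry has accelerated. However, because organizations manage ship information inconsistently, the collected data contain a large number of similar and duplicate records. Using these data directly for analysis would seriously distort the final results. To address the problem of detecting duplicate data, this article builds a multi-semantic fusion model based on deep learning that combines the FastText vector model, the BERT model, and the LDA model to construct vectors for ship data. The resulting vectors carry more comprehensive information, which improves duplicate-detection accuracy and the efficiency of ship data cleaning.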
Research on a ship data vector model based on deep learning
In recent years, shipping informatization has accelerated. However, because ship information is managed inconsistently across organizations, the collected data contain many similar and duplicate records. If these problematic data are passed unprocessed into subsequent data analysis, they will seriously affect the final results. To address the limitation that current methods consider only a single semantic perspective, a multi-semantic fusion model based on deep learning is constructed for ship data vectors by integrating the FastText vector model, the BERT model, and the LDA model. As a result, the final vectors used for classification contain more comprehensive information, which improves the accuracy of duplicate detection and the efficiency of ship data cleaning.
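
The abstract does not say how the three representations are combined or how duplicates are decided, so the following is only a minimal sketch under stated assumptions: fusion is taken to be concatenation of L2-normalized vectors, duplicates are flagged by cosine similarity against a hypothetical threshold, and the toy vectors stand in for real FastText, BERT, and LDA outputs. All function names are illustrative and are not taken from the article.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so no single view dominates the fusion."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(fasttext_vec, bert_vec, lda_vec):
    """Concatenate the three normalized views (assumed fusion strategy)."""
    return (
        l2_normalize(fasttext_vec)
        + l2_normalize(bert_vec)
        + l2_normalize(lda_vec)
    )


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_duplicate(record_a, record_b, threshold=0.9):
    """Flag two records as duplicates when their fused vectors are close.

    The threshold of 0.9 is a hypothetical value, not one from the article.
    """
    return cosine(fuse(*record_a), fuse(*record_b)) >= threshold


# Toy stand-ins for (FastText, BERT, LDA) vectors of three ship records.
record_a = ([0.20, 0.10, 0.70], [0.50, 0.40, 0.10, 0.30], [0.80, 0.10, 0.10])
record_b = ([0.21, 0.09, 0.69], [0.50, 0.41, 0.10, 0.29], [0.79, 0.11, 0.10])
record_c = ([0.90, 0.00, 0.10], [0.00, 0.10, 0.90, 0.20], [0.10, 0.10, 0.80])

print(is_duplicate(record_a, record_b))
print(is_duplicate(record_a, record_c))
```

Normalizing each view before concatenation gives the word-level, contextual, and topic-level signals equal weight in the similarity score; a weighted or learned fusion would be an equally plausible reading of the abstract.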

ship data; multi-semantic fusion; deep learning

Gu Qing (顾晴), Zhou Jun (周军), Pan Chunjie (潘纯杰), Qiang Yangyang (羌杨洋)


School of Intelligent Manufacturing and Information, Jiangsu Shipping College, Nantong, Jiangsu 226001, China

ship data; multi-semantic fusion; deep learning

2022 Nantong Science and Technology Bureau Program Project

JC12022054

2024

Wireless Internet Technology (无线互联科技)
Jiangsu Institute of Scientific and Technical Information


Impact factor: 0.263
ISSN: 1672-6944
Year, volume (issue): 2024, 21(2)