Research on a ship data vector model based on deep learning
In recent years, the construction of shipping information systems has accelerated. However, because ship management information is inconsistent across organizations, the collected data contain many similar and duplicate records. If such problematic data are passed directly to the next stage of data analysis without being processed, they can seriously affect the final results. Existing methods consider only a single aspect of semantics. To address this limitation, a multi-semantic fusion model for ship data vector construction is built on deep learning from a multi-semantic perspective, integrating the FastText vector model, the BERT model, and the LDA model. The final vector used for classification therefore contains more comprehensive information, which improves the accuracy of duplicate detection and the efficiency of ship data cleaning.
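To make the fusion idea concrete, the following is a minimal sketch and not the model described in this paper. It assumes that the three per-record vectors (FastText, BERT, and LDA) have already been computed elsewhere, that fusion is done by simple concatenation of L2-normalized vectors, and that duplicates are flagged by cosine similarity against a threshold. The function names, the toy vectors, and the threshold of 0.9 are all hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so no single model dominates the fusion."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(fasttext_vec, bert_vec, lda_vec):
    """Assumed fusion strategy: concatenate the three normalized vectors."""
    fused = []
    for part in (fasttext_vec, bert_vec, lda_vec):
        fused.extend(l2_normalize(part))
    return fused


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_duplicate(vec_a, vec_b, threshold=0.9):
    """Flag two records as likely duplicates (threshold is a hypothetical value)."""
    return cosine_similarity(vec_a, vec_b) >= threshold


# Toy placeholder vectors standing in for real model outputs.
record_a = fuse([1.0, 0.0], [0.5, 0.5, 0.0], [0.7, 0.3])
record_b = fuse([0.9, 0.1], [0.5, 0.4, 0.1], [0.6, 0.4])
likely_duplicates = is_duplicate(record_a, record_b)
```

Normalizing each component before concatenation is one simple way to keep vectors of different scales and dimensions comparable; weighted or learned fusion would be equally plausible alternatives.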