Aims: This paper addresses the problems of data imbalance and low prediction accuracy in emerging technology recognition. The indicator system was optimized, and a composite model combining a resampling technique with ensemble algorithms was proposed. Methods: Firstly, technology frontier was added as an indicator to improve the traditional emerging technology identification indicator system. Secondly, in our composite model, the SMOTE oversampling technique was used to reduce the imbalance between data classes, and an AdaBoost algorithm combined with a genetic algorithm (GA) was used to improve classification accuracy and convergence speed. For clarity, this composite model is called the SMOTE-GA-AdaBoost model. Finally, patent data on intelligent connected vehicles from the Patsnap patent database were used to test the effectiveness of the model. Results: The average accuracy of the model was 94.69%, the average recall was 89.75%, and the average F1 score was 94.42%. The classification accuracy and stability of the composite model were better than those of the other models compared. Conclusions: The proposed emerging technology identification method can effectively address the issue of imbalanced data and improve recognition accuracy.
emerging technologies; machine learning; unbalanced data; composite model
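
The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation. The function name `smote_oversample`, the parameter `k`, and the toy data are assumptions made for illustration only, and the GA-tuned AdaBoost stage is not shown.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


# Toy data (hypothetical): 8 majority vs 4 minority samples with two features.
majority = [(float(i), float(i % 3)) for i in range(8)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.0), (11.5, 11.5)]

# Generate just enough synthetic minority points to match the majority count.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point is an interpolation between two real minority samples, the new points stay inside the region the minority class already occupies. In practice, a maintained library implementation would normally be used instead of a hand-written one.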