Research on Academic Paper Move Recognition Method with ChatGPT Data Augmentation
[Purpose/Significance] The move structure of an academic paper helps readers understand its content in depth and locate key information quickly. This study investigates methods for full-text move recognition in order to capture the core content of academic papers rapidly and thereby advance intelligent semantic retrieval. [Method/Process] The article reviewed current studies on move recognition methods and, on this basis, proposed a fine-grained move recognition model, SciBERT-HAMI, which integrated ChatGPT data augmentation with a pre-trained language model. The model was trained on the original texts together with a corpus augmented by the ChatGPT large language model, which increased the variety and volume of the training data. A hierarchical neural network was adopted to learn the paper's semantic feature representations at the "word-sentence-section" levels, capturing semantic information at each level. SciBERT word embeddings served as the input, and the hierarchical network was trained with the FocalLoss loss function for fine-grained move recognition. [Result/Conclusion] With the ChatGPT data augmentation strategy, the SciBERT-HAMI-DA model achieves F1 scores of 73.1% and 74.1% on the CoreSC and AZ datasets, respectively. Comparative experiments demonstrate that the proposed model improves performance on fine-grained move recognition in full-text academic papers, and ablation experiments verify its effectiveness. Integrating pre-trained language models with ChatGPT data augmentation improves the prediction performance of the full-text move recognition model, which helps promote the automation and intelligence of academic research.
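To make the role of the focal loss concrete, the following is a minimal sketch of the standard multi-class focal loss for a single sentence, written in plain Python. It is an illustration only and not the authors' implementation: the function names, the example logits, and the focusing parameter `gamma = 2.0` are assumptions, since the abstract does not report the settings used.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one sentence's move-class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits: Sequence[float], target: int, gamma: float = 2.0) -> float:
    """Focal loss for one example: -(1 - p_t)**gamma * log(p_t).

    p_t is the predicted probability of the true class. gamma = 2.0 is an
    assumed value for illustration; gamma = 0 recovers plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical logits for a three-class move label set.
easy = [5.0, 0.0, 0.0]  # true class 0 is predicted confidently
hard = [0.0, 5.0, 0.0]  # true class 0 is predicted poorly

for name, logits in (("easy", easy), ("hard", hard)):
    ce = focal_loss(logits, target=0, gamma=0.0)
    fl = focal_loss(logits, target=0, gamma=2.0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f}")
```

The factor `(1 - p_t)**gamma` shrinks the loss of confidently classified sentences far more than that of misclassified ones, so training concentrates on hard examples, which is the usual motivation for focal loss when move categories are imbalanced.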