Multi-level Functional Structure Recognition of Scientific Literature
The automatic recognition of structure function helps improve the efficiency of tasks such as fine-grained information retrieval, keyword extraction, and citation analysis. Current research on structure function recognition faces two challenges: weak expression of internal textual dependencies, and insufficient model generalization and transferability. To address the first, this paper uses graph convolutional networks (GCNs) to capture the inherent dependency information and topological structure among word nodes, strengthening the modeling and representation of scientific publications. To address the second, adversarial learning is introduced to improve the generalization ability of the structure function recognition model. The ScienceDirect dataset is used to examine how well various models recognize structure function at three granularities: Header, Section, and Paragraph. We further test the cross-domain transferability of multiple models on PubMED-20k, a structure function recognition dataset of medical abstracts. Experimental results show that BERT+GCN achieves the best performance at the Header level, with an F1 score of 88%, a 3% improvement over the baseline models. At the Section level, the combination of BERT and GAN achieves the best performance, also a 3% improvement over the baseline models. At the Paragraph level, the F1 score reaches 68%. BERT+GCN also exhibits better cross-domain transferability than the other models, achieving an F1 score of 90% on cross-domain data.
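To illustrate the idea of graph convolution over word nodes, the following is a minimal sketch rather than the model used in this paper. It assumes a sliding-window co-occurrence graph and the commonly used symmetric normalization D^{-1/2}(A + I)D^{-1/2}; the abstract does not specify either choice. Random vectors stand in for BERT embeddings, and the names `cooccurrence_adjacency`, `normalize_adjacency`, and `gcn_layer` are hypothetical.

```python
import numpy as np

def cooccurrence_adjacency(tokens, window=2):
    """Build an undirected word graph: two distinct words are linked if they
    occur within `window` positions of each other (an assumed construction)."""
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    adj = np.zeros((len(vocab), len(vocab)))
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            u, v = index[word], index[tokens[j]]
            if u != v:
                adj[u, v] = adj[v, u] = 1.0
    return vocab, adj

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}; the self-loops keep every degree >= 1."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, features, weight):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_norm @ features @ weight, 0.0)

tokens = "we propose a graph model and we evaluate the graph model".split()
vocab, adj = cooccurrence_adjacency(tokens, window=2)

rng = np.random.default_rng(0)
features = rng.normal(size=(len(vocab), 8))  # stand-in for BERT word embeddings
weight = rng.normal(size=(8, 4))             # layer weights (learned in practice)

a_norm = normalize_adjacency(adj)
hidden = gcn_layer(a_norm, features, weight)  # one updated vector per word node
doc_vector = hidden.mean(axis=0)              # mean-pooled text representation
```

In a full classifier, a vector such as `doc_vector` would feed a softmax layer over the structure function labels, with the weights trained end to end. Mean pooling is only one possible readout.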