Research on Breakthrough Papers Identification in the Biomedical Field
[Purpose/Significance] Major advances in science depend largely on breakthrough discoveries. Because papers are the main carrier of such discoveries, identifying breakthrough papers plays an important role in science and technology policy formulation and in the cultivation of basic research. [Method/Process] Existing quantitative identification methods rely on a single feature dimension and are only loosely connected to the nature of scientific breakthroughs. This paper draws on the features of breakthrough research in the biomedical field to propose a breakthrough paper identification method that more comprehensively reflects the laws and features of breakthrough innovation. First, indicators are constructed from the perspectives of knowledge innovation, diffusion, dissemination path, integration, uncertainty, and controversy to measure the novelty, influence, scientific destructiveness, interdisciplinary character, and controversial character of breakthrough papers. Then a random forest model is applied, with these feature indicators as input variables, to construct a breakthrough identification model for the biomedical field. Finally, the model is evaluated and its parameters are tuned. [Result/Conclusion] The F1 value of the breakthrough paper identification model in the biomedical field is 0.8341, and the indicators of influence, controversy, and scientific destructiveness have a strong influence on the model's identification results. On a new sample consisting of selected papers by Lasker Prize winners and milestone literature in the field of oncology, the model maintains relatively stable identification performance and generalizes well.
breakthrough papers; breakthrough features; random forest model; biomedical field
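The pipeline summarized in the abstract (feature indicators as inputs, a random forest classifier, parameter tuning, F1 evaluation, and inspection of feature influence) might be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The feature names, the synthetic data, the labeling rule, and the parameter grid are all assumptions introduced for the example, so the printed score has no relation to the reported F1 value.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical column names standing in for the five feature groups
# named in the abstract; the paper's actual indicators are not given here.
FEATURES = ["novelty", "influence", "destructiveness",
            "interdisciplinarity", "controversy"]


def make_synthetic_papers(n=400, seed=0):
    """Generate placeholder data. This is NOT the paper's dataset."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(FEATURES)))
    # Arbitrary labeling rule (an assumption) so the model has signal to learn.
    score = (1.5 * X[:, 1] + 1.0 * X[:, 2] + 1.0 * X[:, 4]
             + rng.normal(scale=0.5, size=n))
    y = (score > np.quantile(score, 0.7)).astype(int)  # about 30% positives
    return X, y


def fit_and_evaluate(X, y, seed=0):
    """Tune a random forest by cross-validated F1, then score a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        # Illustrative grid only; the paper's tuned parameters are not stated.
        param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
        scoring="f1",
        cv=3,
    )
    search.fit(X_train, y_train)
    model = search.best_estimator_
    f1 = f1_score(y_test, model.predict(X_test))
    importances = dict(zip(FEATURES, model.feature_importances_))
    return model, f1, importances


if __name__ == "__main__":
    X, y = make_synthetic_papers()
    model, f1, importances = fit_and_evaluate(X, y)
    print(f"F1 on held-out synthetic data: {f1:.3f}")
    for name, weight in sorted(importances.items(), key=lambda kv: -kv[1]):
        print(f"{name:20s} {weight:.3f}")
```

Impurity-based importances from `feature_importances_` are one common way to judge which indicators drive a random forest's predictions. Whether the paper used this measure or another (for example, permutation importance) is not stated in the abstract.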