
Research on Breakthrough Papers Identification in the Biomedical Field

[Purpose/Significance] Major advances in science depend largely on breakthrough discoveries. Breakthrough papers are an important carrier of such discoveries, so identifying them matters for science and technology policy-making and for fostering basic research. [Method/Process] Existing quantitative identification methods rely on a single feature dimension and are only loosely connected to the concept and nature of scientific breakthroughs. Drawing on the characteristics of breakthrough research in the biomedical field, this paper proposes a breakthrough paper identification method that reflects the patterns and characteristics of breakthrough innovation more comprehensively and accurately. First, indicators are constructed along the dimensions of knowledge innovation, diffusion, dissemination path, integration, and uncertainty and controversy, in order to measure the novelty, influence, scientific disruptiveness, interdisciplinarity, and controversy of breakthrough papers. A random forest model is then trained with these feature indicators as input variables to build a breakthrough identification model for the biomedical field. Finally, the model's performance is evaluated and its parameters are tuned. [Result/Conclusion] The model achieves an F1 value of 0.8341, and the influence, controversy, and scientific disruptiveness indicators have the largest effect on its identification results. On a new sample consisting of selected papers by Lasker Award winners and milestone literature in oncology, the model maintains relatively stable identification performance and good generalization.
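The abstract reports model quality as an F1 value. As a reminder of what that figure summarizes, the following minimal sketch computes precision, recall, and F1 for a binary "breakthrough vs. ordinary" labeling. The function name `precision_recall_f1` and the toy labels are hypothetical and chosen only for illustration; they are not the paper's data or code.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = breakthrough paper, 0 = ordinary paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.4f} recall={recall:.4f} F1={f1:.4f}")
```

Because F1 is the harmonic mean of precision and recall, it stays low unless both are high, which is why it is a common choice when breakthrough papers are a small minority of the sample.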

Keywords: breakthrough papers; breakthrough features; random forest model; biomedical field

Authors: Yang Xuemei (杨雪梅), Wang Xuefeng (汪雪锋), Tang Xiaoli (唐小利), Chen Hongshu (陈虹枢)


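School of Management, Beijing Institute of Technology, Beijing 100081

Institute of Medical Information / Library, Chinese Academy of Medical Sciences, Beijing 100005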


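Funding: National Natural Science Foundation of China, General Program (72074020); CAMS Innovation Fund for Medical Sciences, "Biomedical Literature Information Assurance and Integrated Service Platform" (2021-I2M-1-033); National Natural Science Foundation of China, Young Scientists Fund (72004009)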

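Journal: Library and Information Service (图书情报工作), National Science Library, Chinese Academy of Sciences
Indexed in: CSTPCD, CSSCI, CHSSCD, PKU Core
Impact factor: 2.203
ISSN: 0252-3116
Year, volume (issue): 2024, 68(15)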