Literature classification and its applications in condensed matter physics and materials science by natural language processing
The exponential growth of the literature is constraining researchers' access to comprehensive information in related fields. While natural language processing (NLP) may offer an effective solution to literature classification, it remains hindered by the lack of labelled datasets. In this article, we introduce a novel method for generating literature classification models through semi-supervised learning, which can generate a labelled dataset iteratively with limited human input. We apply this method to train NLP models for classifying literature related to several research directions, i.e., battery, superconductor, topological material, and artificial intelligence (AI) in materials science. The trained NLP 'battery' model, applied to a larger dataset different from the training and testing datasets, achieves an F1 score of 0.738, which indicates the accuracy and reliability of this scheme. Furthermore, our approach demonstrates that even with insufficient data, the not-yet-well-trained model from the first few cycles can identify the relationships among different research fields and facilitate the discovery and understanding of interdisciplinary directions.
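For readers unfamiliar with the metric quoted in the abstract, the F1 score is the harmonic mean of precision and recall for the positive class. The following is a minimal sketch of that standard definition only; the function name `f1_score`, the label convention (1 for a paper in the target field, 0 otherwise), and the toy labels are assumptions made for illustration and are not taken from the article.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score of the positive class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels (invented): 1 = paper belongs to the target field, 0 = it does not.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
```

With these toy labels, precision and recall are both 2/3, so the score is also 2/3. A value such as 0.738 therefore reflects a balance of both error types rather than raw accuracy.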
natural language processing; text mining; materials science
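Siyuan Wu (吴思远), Tiannian Zhu (朱天念), Sijia Tu (涂思佳), Ruijuan Xiao (肖睿娟), Jie Yuan (袁洁), Quansheng Wu (吴泉生), Hong Li (李泓), Hongming Weng (翁红明)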
Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100190, China
Condensed Matter Physics Data Center of Chinese Academy of Sciences, Beijing 100190, China
College of Materials Science and Optoelectronic Technology, University of Chinese Academy of Sciences, Beijing 100049, China
Informatization Plan of Chinese Academy of Sciences; National Key Research and Development Program of China; National Natural Science Foundation of China; Chinese Academy of Sciences project