Astronomical Knowledge Entity Extraction in Astrophysics Journal Articles via Large Language Models

Astronomical knowledge entities, such as celestial object identifiers, are crucial for literature retrieval, knowledge graph construction, and other research and applications in astronomy. Traditional methods of extracting knowledge entities from text face obstacles that are difficult to overcome, so there is a pressing need for more efficient extraction methods. This study explores the potential of pre-trained Large Language Models (LLMs) to perform the astronomical knowledge entity extraction (KEE) task on astrophysical journal articles using prompts. We propose a prompting strategy called Prompt-KEE, which includes five prompt elements, and design eight combination prompts based on them. We select four representative LLMs (Llama-2-70B, GPT-3.5, GPT-4, and Claude 2) and use these eight combination prompts to extract the two most typical kinds of astronomical knowledge entities, celestial object identifiers and telescope names, from astronomical journal articles. To accommodate the models' token limitations, we construct two data sets from 30 articles: their full texts and collections of their paragraphs. With the eight prompts, we test GPT-4 and Claude 2 on the full texts and all four LLMs on the paragraph collections. The experimental results demonstrate that pre-trained LLMs show significant potential in performing KEE tasks, but their performance varies between the two data sets. Furthermore, we analyze some important factors that influence the performance of LLMs in entity extraction and provide insights for future KEE tasks on astrophysical articles using LLMs. Finally, compared with other KEE methods, LLMs are strongly competitive in multiple aspects.
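
As an illustration of how eight combination prompts could be assembled from five prompt elements, the following is a minimal sketch, not the authors' Prompt-KEE implementation. The abstract does not name the five elements or say how they are combined, so the element names, their texts, and the choice to keep two elements fixed while toggling the other three (2^3 = 8 combinations) are all assumptions made for this example.

```python
from itertools import product

# Hypothetical element names and texts: the abstract does not list the
# five Prompt-KEE elements, so these are placeholders for illustration.
FIXED = {
    "task": "Extract all celestial object identifiers and telescope names from the text.",
    "output_format": "Return one entity per line as '<type>: <name>'.",
}
OPTIONAL = {
    "role": "You are an astronomer who curates literature databases.",
    "definitions": "A celestial object identifier is a catalogue designation such as 'NGC 1068'.",
    "examples": "Example: 'We observed M31 with HST.' -> object: M31 / telescope: HST",
}


def build_prompts(fixed, optional):
    """Return {label: prompt_text} for every on/off choice of the optional elements."""
    prompts = {}
    names = list(optional)
    for mask in product([False, True], repeat=len(names)):
        chosen = [name for name, on in zip(names, mask) if on]
        # Optional context first, then the fixed task and output instructions.
        parts = [optional[name] for name in chosen] + list(fixed.values())
        label = "+".join(list(fixed) + chosen)
        prompts[label] = "\n".join(parts)
    return prompts


def fill(prompt, passage):
    """Attach the article text (full text or one paragraph) to a prompt."""
    return f"{prompt}\n\nText:\n{passage}"


prompts = build_prompts(FIXED, OPTIONAL)
print(len(prompts))  # 8
```

Each resulting prompt can be combined with either a full article or a single paragraph through `fill`, which mirrors the two data sets described above; sending the text to a model and parsing its reply are left out because they depend on the provider.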

astronomical databases: miscellaneous; virtual observatory tools; methods: data analysis

Wujun Shao, Rui Zhang, Pengli Ji, Dongwei Fan, Yaohua Hu, Xiaoran Yan, Chenzhou Cui, Yihan Tao, Linying Mi, Lang Chen

National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China

University of Chinese Academy of Sciences, Beijing 100049, China

National Astronomical Data Center, Beijing 100101, China

Research Institute of Artificial Intelligence, Zhejiang Lab, Hangzhou 311100, China

Guilin University, Guangxi 541006, China

Xidian University, Xi'an 710126, China

National Natural Science Foundation of China (NSFC): 12273077, 72101068, 12373110, 12103070

National Key Research and Development Program of China: 2022YFF0712400, 2022YFF0711500

14th Five-year Informatization Plan of Chinese Academy of Sciences: CAS-WX2021SF-0204

China National Astronomical Data Center (NADC)

CAS Astronomical Data Center and Chinese Virtual Observatory (China-VO)

Astronomical Big Data Joint Research Center, co-founded by National Astronomical Observatories, Chinese Academy of Sciences and Ali

NASA's Astrophysics Data System Bibliographic Services and SIMBAD database, operated at CDS, Strasbourg, France

Data Publishing is supported by China National Astronomical Data Center (NADC) through Chinese Virtual Observatory (China-VO) PaperD

2024

Research in Astronomy and Astrophysics
National Astronomical Observatories, Chinese Academy of Sciences

CSTPCD
Impact factor: 0.406
ISSN: 1674-4527
Year, volume (issue): 2024, 24(6)