To support the fine-grained organization of red literature in a data-state environment, and to improve the efficiency with which such literature is managed and used, this article designs a semantic annotation framework suited to the characteristics of red literature. First, based on a comprehensive analysis of the internal and external features of red literature, the literature is deconstructed into semantic units of different granularity, from which semantic elements and semantic relations are extracted. Then, following the principle of domain relevance, general-purpose ontologies such as FaBio, DoCo, and FOAF are reused and a red literature domain ontology, RL, is constructed to describe the entity concepts and relationships in red literature comprehensively. Finally, the red literature annotation ontology is formalized in OWL, yielding a complete, formal ontology description. Taking documents related to the Seventh National Congress of the Communist Party of China as an example, their semantic features are analyzed, semantic annotation is carried out with the Open Annotation Data Model, and the annotation data are checked with the Fuseki system. The work provides a reference for constructing knowledge graphs of red literature and for building fine-grained red literature databases.
Keywords: data form; red literature; ontology construction; semantic annotation
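The annotation step described in the abstract can be illustrated with a minimal sketch. It assumes the basic shape of the Open Annotation Data Model: an annotation whose target is a text span in a source document and whose body is the entity that span refers to. The `oa:` and `foaf:` terms are from the published vocabularies. Everything under `http://example.org/` (the document, person, and annotation identifiers) is hypothetical and does not come from the article. No RL ontology terms are used, because the abstract does not give their IRIs. The article checks its data with Fuseki; this sketch substitutes a simple in-memory triple pattern match as a stand-in for that step.

```python
from collections import namedtuple

# Published vocabularies.
OA = "http://www.w3.org/ns/oa#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
# Hypothetical namespace for illustration only (not from the article).
EX = "http://example.org/data/"

# Wrapper that distinguishes literal values from IRIs in a triple.
Literal = namedtuple("Literal", "value")


def build_annotation(anno_id, source_doc, quoted_text, body_entity, body_class):
    """Return the triples for one annotation linking a quoted span to an entity."""
    anno = EX + anno_id
    target = anno + "/target"
    selector = anno + "/selector"
    return [
        (anno, RDF + "type", OA + "Annotation"),
        (anno, OA + "hasBody", body_entity),
        (anno, OA + "hasTarget", target),
        (target, RDF + "type", OA + "SpecificResource"),
        (target, OA + "hasSource", source_doc),
        (target, OA + "hasSelector", selector),
        (selector, RDF + "type", OA + "TextQuoteSelector"),
        (selector, OA + "exact", Literal(quoted_text)),
        (body_entity, RDF + "type", body_class),
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples lines, escaping literal values."""
    def term(t):
        if isinstance(t, Literal):
            escaped = (
                t.value.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
            )
            return f'"{escaped}"'
        return f"<{t}>"

    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical example: a person name quoted from a congress document.
triples = build_annotation(
    anno_id="anno/1",
    source_doc=EX + "doc/seventh-congress-report",
    quoted_text="Mao Zedong",
    body_entity=EX + "person/mao-zedong",
    body_class=FOAF + "Person",
)
print(to_ntriples(triples))

# Stand-in for the checking step: which entities are used as annotation bodies?
for _, _, entity in match(triples, p=OA + "hasBody"):
    print("annotated entity:", entity)
```

In a real deployment the serialized triples would be loaded into a triple store and checked with SPARQL queries rather than with the `match` helper, which only covers single-pattern lookups.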