Entity-relation extraction based on heterogeneous graphs and semantic fusion
[Objective] Relation extraction, which aims to extract all relational triples from unstructured text, is an important task in natural language processing. However, effectively handling overlapping entity relations remains a challenge. Entity-relation overlap refers to the phenomenon in which one entity has relationships with more than one other entity, or in which multiple relationships exist between the same pair of entities.
[Methods] To better address relation overlap, this study proposes an entity-relation extraction method based on heterogeneous graphs and semantic fusion. The overall strategy is to extract entities first and then classify each entity pair into specific relationships, which effectively addresses the problem of single-entity overlap. To maximize entity extraction, heterogeneous graph layers integrate the predefined relationships into the word representations as relational prior information. This strengthens the representations, making them better suited to the entity annotation task and reducing the extraction of redundant entities. After the entities are obtained, a global association matrix is used to filter out entity pairs that have no relational connection, so that only related entity pairs are passed on to relation classification. To better classify the relationship types of the remaining entity pairs, a semantic fusion module aggregates features at different levels as the input to the relation classification module. This improves relation classification performance and addresses the problem of entity-pair overlap.
[Results] Experimental results demonstrate that the proposed model outperforms the other benchmark models on the NYT and WebNLG datasets. Specifically, on the NYT dataset, the proposed method improves the F1 score by 0.3% over the best existing method, and on the WebNLG dataset, it improves the F1 score by 0.7% over the best model. Compared with RIFRE, the proposed model uses a semantic fusion module to aggregate multigranularity information in the subsequent decoding process, resulting in better quantitative performance. To further explore the effectiveness of the proposed model in handling overlapping entity-relation triples, two extended experiments on different sentence types are designed and performed.
[Conclusions] The extended experiments show that, on the WebNLG dataset, the proposed model outperforms the other models in processing different types of sentences and in handling complex scenarios. On the NYT dataset, the proposed model outperforms the benchmark model in extraction and in handling complex scenarios. Even on nonoverlapping sentences, the proposed model achieves superior results. Together, these experiments indicate that the entity-relation extraction method based on heterogeneous graphs and semantic fusion can effectively handle complex scenarios and various types of overlap when extracting entity-relation triples.
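The extract-then-classify strategy described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`filter_entity_pairs`, `extract_triples`, `toy_classify`), the 0.5 threshold, the relation labels, and the hand-written association scores are all assumptions made for illustration. In the proposed model, the association matrix and the relation classifier would be learned components; here the matrix is a plain list of lists and the classifier is a lookup table.

```python
from itertools import permutations


def filter_entity_pairs(entities, association, threshold=0.5):
    """Keep ordered (subject, object) pairs whose association score
    meets the threshold. `association[i][j]` is an assumed score in
    [0, 1] that entity i (subject) is related to entity j (object)."""
    pairs = []
    for i, j in permutations(range(len(entities)), 2):
        if association[i][j] >= threshold:
            pairs.append((entities[i], entities[j]))
    return pairs


def extract_triples(entities, association, classify, threshold=0.5):
    """Filter pairs first, then classify only the surviving pairs.
    `classify` is a hypothetical callable returning a list of relation
    labels, so one pair may yield several triples."""
    triples = []
    for subj, obj in filter_entity_pairs(entities, association, threshold):
        for relation in classify(subj, obj):
            triples.append((subj, relation, obj))
    return triples


# Toy input: three already-extracted entities and hand-written scores.
entities = ["Paris", "France", "Seine"]
association = [
    [0.0, 0.9, 0.1],  # Paris  -> (Paris, France, Seine)
    [0.8, 0.0, 0.2],  # France -> ...
    [0.7, 0.1, 0.0],  # Seine  -> ...
]


def toy_classify(subj, obj):
    """Stand-in for a learned multi-label relation classifier."""
    table = {
        ("Paris", "France"): ["capital_of", "located_in"],
        ("France", "Paris"): ["contains"],
        ("Seine", "Paris"): ["flows_through"],
    }
    return table.get((subj, obj), [])


triples = extract_triples(entities, association, toy_classify)
for triple in triples:
    print(triple)
```

In this toy input, "Paris" takes part in several pairs (single-entity overlap), and the pair ("Paris", "France") receives two relation labels (entity-pair overlap). Filtering before classification means the classifier is never asked about pairs such as ("Paris", "Seine"), whose score falls below the assumed threshold.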