Chinese address resolution method based on RoBERTa-BiLSTM-SelfAttention-CRF
To address the current challenges of low precision, inefficiency, and the neglect of fine-grained address elements in Chinese address parsing, a Chinese address resolution model, RoBERTa-BiLSTM-SelfAttention-CRF, is proposed. First, RoBERTa is employed to extract deep semantic features and rich contextual information from address texts. Second, a BiLSTM network models the sequential structure of the address text to capture dependencies among address elements. Then, a self-attention mechanism is introduced to establish effective correlations between different address elements, further improving the model's parsing performance on Chinese addresses. Finally, a CRF layer labels the address sequence to achieve precise parsing. Experimental results show that introducing the self-attention mechanism significantly improves Chinese address parsing. On the self-built dataset, the model achieves a precision of 0.9594, a recall of 0.9697, and an F1 score of 0.9645. On the publicly available CCKS2021 dataset, it achieves a precision of 0.9080, a recall of 0.9158, and an F1 score of 0.9119, an improvement of 0.0069 in F1 score over current state-of-the-art models. These results demonstrate the model's robust performance and generalization ability.
Chinese address resolution; address elements; RoBERTa; BiLSTM; CRF; self-attention mechanism
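To make the four-stage pipeline described above concrete, the following is a minimal PyTorch sketch of a RoBERTa-BiLSTM-SelfAttention-CRF tagger. It is not the authors' released code: the pretrained checkpoint name (hfl/chinese-roberta-wwm-ext), the hidden sizes, the number of attention heads, and the use of the pytorch-crf package are illustrative assumptions.

```python
# Minimal sketch of the RoBERTa-BiLSTM-SelfAttention-CRF sequence tagger.
# Checkpoint name and hyperparameters are assumptions, not the paper's values.
import torch
import torch.nn as nn
from transformers import AutoModel
from torchcrf import CRF  # pip install pytorch-crf


class RobertaBiLstmAttnCrf(nn.Module):
    def __init__(self, num_tags: int,
                 pretrained: str = "hfl/chinese-roberta-wwm-ext",
                 lstm_hidden: int = 256, attn_heads: int = 8):
        super().__init__()
        # 1) RoBERTa encoder: deep semantic and contextual token features
        self.roberta = AutoModel.from_pretrained(pretrained)
        d_model = self.roberta.config.hidden_size
        # 2) BiLSTM: models sequential dependencies among address elements
        self.bilstm = nn.LSTM(d_model, lstm_hidden,
                              batch_first=True, bidirectional=True)
        # 3) Self-attention: correlates different (possibly distant) elements
        self.self_attn = nn.MultiheadAttention(2 * lstm_hidden, attn_heads,
                                               batch_first=True)
        # 4) Emission scores + CRF: globally consistent tag sequence
        self.emissions = nn.Linear(2 * lstm_hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        x = self.roberta(input_ids,
                         attention_mask=attention_mask).last_hidden_state
        x, _ = self.bilstm(x)
        x, _ = self.self_attn(x, x, x,
                              key_padding_mask=(attention_mask == 0))
        scores = self.emissions(x)
        mask = attention_mask.bool()
        if tags is not None:
            # Training: return negative log-likelihood of the gold tag path
            return -self.crf(scores, tags, mask=mask, reduction="mean")
        # Inference: Viterbi-decode the best tag path per sequence
        return self.crf.decode(scores, mask=mask)
```

In this sketch the CRF supplies the final labeling step from the abstract: during training it scores whole tag sequences rather than individual tokens, and at inference time Viterbi decoding returns the most likely segmentation of the address into elements.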