To address the shortcomings of the Spatial Dual-Modality Graph Reasoning (SDMG-R) algorithm, namely its shallow analysis of complex document layouts and its low key information extraction accuracy, an improved SDMG-R algorithm is proposed. First, an attention mechanism is integrated into the image feature extraction module to strengthen the model's focus on important image regions and reduce its sensitivity to noise and irrelevant information. Second, the ordinary convolutions in the downsampling part of the U-Net are replaced with dilated convolutions to enlarge the receptive field of the convolution kernels and capture a wider range of contextual information. Third, the pre-trained BERT model is used to extract text features from receipts so that complex dependencies within sentences are captured. Fourth, the ratio of sentence lengths is embedded in the graph reasoning module to better handle the complex relationships between nodes. Experimental results show that, compared with the original SDMG-R algorithm, the improved method raises accuracy by 2.32% on the WildReceipt dataset. The proposed key information extraction method has practical significance for the intelligent management and analysis of receipts.
Key words
key information extraction/GNN/attention mechanism/dilated convolution/BERT
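
The abstract's claim that dilated convolutions enlarge the receptive field can be checked with a little arithmetic. The sketch below is illustrative only: the function names and the three-layer stacks with dilation rates 1, 2, 4 are assumptions chosen for the example, not the configuration used in the paper. It relies on the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) positions along one axis.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers) -> int:
    """Receptive field of a stack of conv layers.

    Each layer is a (kernel_size, stride, dilation) tuple, listed from
    input to output. `jump` tracks the spacing, in input pixels, between
    adjacent positions of the current feature map.
    """
    rf, jump = 1, 1
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks for comparison (not taken from the paper).
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]    # three ordinary 3x3 convolutions
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]  # same kernels, dilation 1, 2, 4

print(receptive_field(plain), receptive_field(dilated))
```

With the same number of weights per layer, the dilated stack in this example sees a noticeably wider input window than the plain one, which is the effect the abstract attributes to the modified U-Net downsampling path.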