Receipt information extraction method based on improved spatial dual-modality graph reasoning
To address the shallow analysis of complex document layouts and the low key information extraction accuracy of the spatial dual-modality graph reasoning (SDMG-R) algorithm, an improved SDMG-R algorithm is proposed. To strengthen the model's focus on important information in the image and reduce its sensitivity to noise and irrelevant information, an attention mechanism is integrated into the image feature extraction module. To enlarge the receptive field of the convolution kernels and capture a wider range of contextual information, the ordinary convolutions in the U-Net downsampling path are replaced with dilated convolutions. To capture complex dependencies within sentences, the BERT pre-trained model is used to extract text features from receipts. To better handle the complex relationships between nodes, the ratio of sentence lengths is embedded in the graph reasoning module. Experimental results show that, compared with the original SDMG-R algorithm, the improved method increases accuracy by 2.32% on the WildReceipt dataset. The proposed key information extraction method has practical significance for the intelligent management and analysis of receipts.
key information extraction; GNN; attention mechanism; dilated convolution; BERT
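
The receptive-field argument for replacing ordinary convolutions with dilated ones can be checked with a little arithmetic. The sketch below is illustrative only: the helper names and the layer configuration (two 3x3 convolutions with stride 1 and a dilation rate of 2) are assumptions chosen for the example, not the settings used in the improved SDMG-R model.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels along one axis, covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers) -> int:
    """Receptive field along one axis of a stack of convolution layers.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    ordered from the input towards the output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Hypothetical encoder stage: two 3x3 convolutions with stride 1.
ordinary = [(3, 1, 1), (3, 1, 1)]  # dilation 1 = ordinary convolution
dilated = [(3, 1, 2), (3, 1, 2)]   # dilation 2 (assumed rate)

print(receptive_field(ordinary))  # 5
print(receptive_field(dilated))   # 9
```

With the same number of weights per kernel, the dilated stack in this example covers 9 input pixels per axis instead of 5, which is the wider context the abstract refers to.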