A Distributed Visual Question Answering Model Based on Deep Learning
Visual question answering (VQA) enables machines to answer natural language questions about images. Because existing VQA models perform well only on specific types of question samples, this paper proposes a distributed framework based on deep neural networks. First, the training samples are divided into biased and unbiased subsets according to the information entropy of their answer distributions; counterfactual training samples are then generated for the biased subset, forcing the model to attend more closely to the key regions of the image and the question and mitigating the influence of language priors. Second, for the unbiased samples, large-scale image-text pre-training followed by fine-tuning is used to improve the model's performance. Finally, a multi-class cross-entropy loss measures the difference between the model's predictions and the ground-truth labels, further improving model performance. Experimental results on the VQA-CP v2 and VQA-v2 datasets show that the proposed distributed VQA method achieves significant improvements in handling both biased and unbiased samples.
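The entropy-based split described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the grouping of samples (here, by question type), the `split_by_entropy` helper, and the entropy threshold are all assumptions introduced for clarity.

```python
import math
from collections import Counter

def answer_entropy(answers):
    """Shannon entropy (in bits) of a list of ground-truth answers."""
    counts = Counter(answers)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def split_by_entropy(samples, threshold=1.0):
    """Split sample groups into biased (low-entropy) and unbiased (high-entropy).

    `samples` maps a question group (e.g. a question type) to its list of
    ground-truth answers. The threshold is a hypothetical hyperparameter;
    the paper does not specify its value.
    """
    biased, unbiased = {}, {}
    for group, answers in samples.items():
        target = biased if answer_entropy(answers) < threshold else unbiased
        target[group] = answers
    return biased, unbiased
```

Intuitively, a group where one answer dominates (e.g. "is there a ..." questions answered "yes" 90% of the time) has low answer entropy and is treated as biased, so counterfactual samples would be generated for it; a group with a diverse answer distribution has high entropy and is routed to the pre-training/fine-tuning branch.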