Research on evaluating the basic logical reasoning ability of large-scale pre-trained language models
For four basic logical reasoning abilities (quantity problems, set relationships, quantifier problems, and common-sense reasoning), we construct few-shot learning sample templates covering 11 logical reasoning subtasks. Two few-shot learning methods, in-context learning and prompt tuning, are used to test the logical reasoning ability of GPT-Neo-1.3B and other models along three dimensions: model, test method, and task. The experimental results show that GPT-3 performs relatively well on quantity problems, quantifier problems, and common-sense reasoning, while GPT-Neo and GPT-J have an advantage on set-relationship problems. Compared with in-context learning, prompt tuning significantly improves the prediction ability of the pre-trained models.
natural language processing; pre-trained language models; in-context learning; prompt tuning; few-shot learning
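As a minimal sketch of the in-context learning setup described in the abstract, the following Python snippet prompts GPT-Neo-1.3B with a few labelled demonstrations of a quantity problem and reads off the model's continuation as its prediction. The demonstration wording, number of shots, and decoding settings are illustrative assumptions, not the paper's actual templates.

```python
# Illustrative sketch (not the authors' code): few-shot in-context learning
# on a quantity-problem style task with GPT-Neo-1.3B.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "EleutherAI/gpt-neo-1.3B"  # public checkpoint on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Few-shot template: labelled demonstrations followed by the query to be answered.
prompt = (
    "Question: Tom has 3 apples and buys 2 more. How many apples does he have? Answer: 5\n"
    "Question: A box holds 4 pens and 6 pencils. How many items are in the box? Answer: 10\n"
    "Question: Anna reads 7 pages on Monday and 5 on Tuesday. How many pages in total? Answer:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=5,                       # only a short answer is expected
    do_sample=False,                        # greedy decoding for a deterministic prediction
    pad_token_id=tokenizer.eos_token_id,    # GPT-Neo has no dedicated pad token
)

# Keep only the newly generated continuation after the prompt.
answer = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer.strip())
```

In contrast to this frozen-model setup, prompt tuning would keep the same backbone but optimize a small set of continuous prompt embeddings on the training demonstrations, which is the comparison the abstract reports.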