Current facial expression recognition methods suffer from large parameter counts, high computational cost, and low recognition accuracy. To address these problems, a lightweight facial expression recognition method is studied. First, the number of layers in the MobileNet V3L network is reduced, and the numbers of intermediate and output channels in the inverted residual structure are increased to 1.5 to 3.2 times their original values. Second, an improved conditional coordinate attention mechanism is introduced: during coordinate information embedding, it selects either average pooling or max pooling for encoding according to the number of intermediate channels, so that detailed facial expression information is extracted along both spatial and channel positions. Finally, Mish replaces the h-swish activation function to provide the nonlinearity after feature extraction. Experiments on the publicly available datasets FERPlus and PAF-DB show that the proposed method improves recognition accuracy by more than 0.60% and 1.07%, respectively, over the original MobileNet series models. Compared with the Ada-CM network, the proposed method improves inference speed by 21.94% and recognition accuracy by more than 0.49%, indicating good recognition performance.
Key words
facial expression recognition/lightweight/conditional coordinate attention mechanism/Mish activation function
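
The two activation functions named in the abstract, and the idea of choosing a pooling operator from the intermediate channel count, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function `coordinate_embed`, its `threshold` parameter, and the direction of the rule (max pooling at or above the threshold, average pooling below it) are assumptions made for illustration only; the abstract does not state the actual selection rule.

```python
import math


def h_swish(x: float) -> float:
    """h-swish activation: x * ReLU6(x + 3) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def softplus(x: float) -> float:
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def coordinate_embed(feature_map, mid_channels, threshold=64):
    """Pool one H x W channel along each spatial axis.

    ASSUMPTION: `threshold` and the direction of the rule are hypothetical;
    the source only says the pooling type depends on the number of
    intermediate channels.
    """
    if mid_channels >= threshold:
        pool = max                              # max pooling
    else:
        pool = lambda v: sum(v) / len(v)        # average pooling
    per_row = [pool(row) for row in feature_map]        # one value per row (length H)
    per_col = [pool(col) for col in zip(*feature_map)]  # one value per column (length W)
    return per_row, per_col
```

Unlike h-swish, which is piecewise and exactly zero for inputs at or below -3, Mish is smooth everywhere and keeps small negative outputs; for example, `mish(-4.0)` is slightly negative while `h_swish(-4.0)` is zero.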