Apex frame spotting and recognition of micro-expression by optical flow
Objective Micro-expressions are involuntary facial movements that people make in response to external information and stimuli. They provide important evidence for judging a person's emotions and thoughts and are widely used in public security, business negotiation, and psychological counseling. Unlike ordinary macro-expressions, micro-expressions are short in duration, low in intensity, and fast-changing, which makes them harder to spot and recognize. Before the emergence of deep learning, researchers mostly relied on traditional hand-crafted methods, which use manually designed feature extractors together with complex parameter tuning to extract features. Some of these algorithms achieve competitive results, such as local binary pattern-three orthogonal planes (LBP-TOP) and main directional mean optical flow (MDMO). However, they mostly extract only shallow features, and improving their accuracy is difficult. With the development of machine learning in computer vision, deep-learning-based approaches have quickly become the mainstream in micro-expression research. These approaches generally use convolutional neural networks to extract and classify image or video features, and their strong feature extraction and learning capability has markedly improved recognition accuracy. Nevertheless, spotting and classifying micro-expressions remain difficult because the expressions are subtle and effective features are hard to extract. This paper therefore proposes a dual-branch optical flow spotting network based on an optical flow window to help address these problems.
Method First, the size of the optical flow window is selected according to the number of video frames, and three frames at each end of the window are taken to stabilize the optical flow intensity. The Dlib library is used to detect faces, and the Farneback method is used to extract facial optical flow features, after which the optical flow images are preprocessed and resized to 224 × 224 pixels. The images are then fed into the dual-branch network, which performs two classifications: whether a micro-expression is present, and whether it is in its rising or falling phase. Because both classifications should be judged from the same features, the two branches share one network backbone and then process the features separately, each focusing on a different aspect. Combining the two loss functions suppresses overfitting, completes both classifications, and improves network performance. Finally, the micro-expression state in each video window is obtained by sliding the window, and an intensity curve is drawn. Because micro-expressions differ in duration, multiple windows are used for spotting, and the highest point among them is taken as the apex frame. The classification network differs from the spotting network in two respects. First, the front end of the window is the second to fourth frames of the video, and the back end is taken from the micro-expression part of the video. Second, Eulerian motion magnification is used to process the video. This technique amplifies facial motion and increases expression intensity but destroys some optical flow features, so it is not used in the spotting network. When classifying a video, the apex frame found by the spotting network is taken as the center, and the five surrounding positions are selected as the input of the classification network. The classification network uses a simple structure and still obtains good results, which demonstrates the importance of apex frame spotting.
Result The micro-expression spotting network is evaluated with leave-one-subject-out cross-validation, the most commonly used validation protocol in current micro-expression recognition research, on the Chinese Academy of Sciences Micro-expression Database Ⅱ (CASME Ⅱ) and the Chinese Academy of Sciences Micro-expression Database (CASME). On CASME Ⅱ, the network obtains the lowest normalized mean absolute error (NMAE), 0.1017, which is 9% lower than that of the current best spotting method. On CASME, the NMAE is 0.1378, currently the second lowest value. With apex frames located by this spotting network, the classification network achieves 89.79% accuracy over three categories (positive, negative, and surprise) on CASME Ⅱ and 66.06% accuracy over four categories (disgust, tense, repression, and surprise) on CASME. With the apex frames provided in the datasets, the classification network achieves 91.83% and 76.96% accuracy on CASME Ⅱ and CASME, respectively.
Conclusion The proposed micro-expression spotting network can effectively locate the apex frame in a video and thereby extract effective micro-expression information. Extensive experimental evaluation shows that the spotting network performs well. The subsequent classification results show that extracting effective micro-expression information, such as the apex frame, significantly helps the network classify micro-expressions. Overall, the proposed spotting network can substantially improve the accuracy of micro-expression recognition.
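The sliding-window apex selection described in the Method and the NMAE reported in the Result can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `spot_apex`, `nmae`, and `toy_score` are hypothetical, `score_window` stands in for the spotting network's intensity output, and attributing each window's score to its centre frame is an assumption. The abstract does not define NMAE, so the sketch assumes it is the absolute apex-frame error divided by the video length, averaged over videos.

```python
from typing import Callable, Sequence


def spot_apex(num_frames: int,
              window_sizes: Sequence[int],
              score_window: Callable[[int, int], float]) -> int:
    """Slide windows of several sizes and return the best-scoring centre frame.

    score_window(start, end) is a hypothetical stand-in for the spotting
    network's intensity output on the window [start, end] (inclusive).
    """
    best_frame, best_score = 0, float("-inf")
    for size in window_sizes:
        # Windows longer than the clip yield an empty range and are skipped.
        for start in range(num_frames - size + 1):
            end = start + size - 1
            score = score_window(start, end)
            if score > best_score:
                best_score = score
                # Assumption: a window's score is attributed to its centre frame.
                best_frame = (start + end) // 2
    return best_frame


def nmae(predicted: Sequence[int],
         annotated: Sequence[int],
         video_lengths: Sequence[int]) -> float:
    """Assumed NMAE: mean over videos of |predicted - annotated| / video length."""
    if not predicted or not (len(predicted) == len(annotated) == len(video_lengths)):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = (abs(p - a) / n for p, a, n in zip(predicted, annotated, video_lengths))
    return sum(errors) / len(predicted)


# Toy data: a triangular motion bump peaking at frame 12 of a 30-frame clip.
toy_intensity = [max(0, 5 - abs(i - 12)) for i in range(30)]


def toy_score(start: int, end: int) -> float:
    # Toy score: centre intensity relative to the average of the window ends.
    centre = (start + end) // 2
    return toy_intensity[centre] - 0.5 * (toy_intensity[start] + toy_intensity[end])


apex = spot_apex(len(toy_intensity), (5, 9, 11), toy_score)
```

On the toy clip, the largest window is the only one whose two ends both fall outside the bump, so it gives the highest score, which mirrors the abstract's reason for using multiple windows when micro-expression durations vary. In a real pipeline, `score_window` would be replaced by the network's output on the optical flow computed between the two ends of the window.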