Eye Gaze Estimation Network Based on Class Attention
In recent years, eye gaze estimation has attracted widespread attention. Gaze estimation methods based on RGB appearance use ordinary cameras and deep learning, avoiding the expensive infrared devices found in commercial eye trackers, which opens the possibility of more accurate and cost-effective eye gaze estimation. However, RGB appearance images contain many features unrelated to gaze, such as lighting intensity and skin color. These irrelevant features interfere with the deep learning regression process and thereby reduce the accuracy of gaze estimation. To address this issue, this paper proposes a new architecture called the class attention network (CA-Net), which includes three different class attention modules: channel, scale, and eye. These modules extract and fuse different types of attention encoding, thereby reducing the weight of gaze-independent features. Extensive experiments on the GazeCapture dataset show that, among RGB-based gaze estimation methods, CA-Net improves gaze estimation accuracy over the state-of-the-art method by approximately 0.6% and 7.4% on mobile phones and tablets, respectively.
Class attention; Light squeeze-and-excitation; Self-attention; Multiscale; Eye gaze estimation
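
The abstract does not specify how the channel attention module is built, so the following is only a minimal sketch of the generic squeeze-and-excitation idea named in the keywords, not the CA-Net module itself. The function name `se_channel_attention`, the weight matrices `w_reduce` and `w_expand`, and the toy tensor sizes are all assumptions introduced for illustration. It shows how a per-channel gate in (0, 1) can down-weight channels that carry gaze-irrelevant information.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def se_channel_attention(feature_map, w_reduce, w_expand):
    """Generic squeeze-and-excitation gate (hypothetical helper, not CA-Net).

    feature_map: C channels, each an H x W nested list of floats.
    w_reduce:    R x C bottleneck weights (assumed, untrained here).
    w_expand:    C x R expansion weights (assumed, untrained here).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, h) for h in matvec(w_reduce, squeezed)]
    gates = [sigmoid(g) for g in matvec(w_expand, hidden)]
    # Scale: multiply every value in a channel by that channel's gate.
    reweighted = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]
    return reweighted, gates


# Toy usage with random, untrained weights (illustrative sizes only).
random.seed(0)
C, H, W, R = 4, 3, 3, 2
features = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w_reduce = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(R)]
w_expand = [[random.uniform(-1, 1) for _ in range(R)] for _ in range(C)]
out, gates = se_channel_attention(features, w_reduce, w_expand)
print("per-channel gates:", gates)
```

In a trained network the two weight matrices would be learned, so channels dominated by factors such as lighting or skin color could receive gates closer to 0. How CA-Net lightens this block and fuses it with the scale and eye attention modules is not described in the abstract and is not modeled here.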