DETECTION OF PORNOGRAPHIC AUDIO EVENTS BASED ON IMPROVED MEAN TEACHER MODEL
To protect the physical and mental health of young people, China is paying increasing attention to the supervision of pornographic content. To address the problem that traditional pornographic audio detection cannot accurately locate the start and end times of events, we propose an improved mean teacher model based on semi-supervised learning. The training set consists of unlabeled, weakly labeled, and strongly labeled data. Frame-level and segment-level audio features are extracted by a multilayer neural network, and training iteratively optimizes the frame-level and segment-level classification losses together with the consistency loss between the teacher-student model and the segment classification model. Experimental results on a real dataset show that, with a time tolerance of 5 seconds, recall for the pornographic category reaches 94.3% and the F1 score reaches 83.4%.
Porn audio detection; Semi-supervised learning; Mean teacher model
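As a rough illustration of the training objective summarized in the abstract, the sketch below shows a generic mean teacher loss for one audio clip: frame-level and segment-level classification terms that apply only when the matching labels exist, plus a consistency term between student and teacher outputs that also applies to unlabeled clips. It follows the standard mean teacher formulation and does not reproduce the improved variant proposed here. All names (`ema_update`, `total_loss`, `alpha`, `consistency_weight`) and the choice of binary cross-entropy and mean squared error are assumptions made for illustration.

```python
import math


def ema_update(teacher, student, alpha=0.99):
    """Move each teacher weight toward the student weight (exponential moving average)."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def mse(a, b):
    """Mean squared error between two equally long lists of probabilities."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def total_loss(student_frame, student_clip, teacher_frame, teacher_clip,
               frame_labels=None, clip_label=None, consistency_weight=1.0):
    """Classification loss where labels exist, plus student-teacher consistency.

    frame_labels: per-frame 0/1 targets (strongly labeled clips only).
    clip_label:   single 0/1 target (weakly or strongly labeled clips).
    Unlabeled clips contribute only the consistency term.
    """
    loss = 0.0
    if frame_labels is not None:
        loss += bce(student_frame, frame_labels)
    if clip_label is not None:
        loss += bce([student_clip], [clip_label])
    consistency = mse(student_frame, teacher_frame) + mse([student_clip], [teacher_clip])
    return loss + consistency_weight * consistency
```

In a training loop under these assumptions, the student would be updated by gradient descent on `total_loss`, and `ema_update` would then be applied to the teacher weights after each step, so the teacher is never trained directly.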