Word-level voice liveness detection method based on improved convolutional neural network
To improve the detection accuracy of replayed voice in smart home voice verification systems, a new word-level voice liveness detection method is proposed: a light convolutional global gate recurrent neural network (LC-GGRNN) serves as a deep feature extractor, and a support vector machine (SVM) classifies real and replayed voice, forming the LC-GGRNN-SVM framework. Specifically, LC-GGRNN builds on the light convolutional neural network by introducing a global attention mechanism and a gated recurrent unit. The former focuses on the channel information, the spatial information, and the channel-spatial interaction information of the extracted features, and the latter learns the long-term correlation of the deep features. Three acoustic features extracted from the audio files in the POCO (pop noise corpus) dataset are used for model training, validation, and testing. The results show that, among the extracted acoustic features, Gammatone frequency cepstral coefficients give the proposed method its best detection performance: the accuracy and equal error rate are 85.72% and 14.28%, respectively, and the sum of the false acceptance rate and the false rejection rate is 28.59%. The results also show that voice liveness detection with the proposed method on POCO is gender-dependent. In addition, the proposed method generalizes well to sentence-level replay voice detection.
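The reported metrics (false acceptance rate, false rejection rate, and equal error rate) can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function names, the convention that higher scores mean "genuine", and the toy scores are all assumptions made for illustration. The sketch scans the observed scores as candidate thresholds and reports the point where the two error rates are closest.

```python
def far_frr(genuine_scores, replay_scores, threshold):
    """Return (FAR, FRR); scores >= threshold are accepted as genuine (assumed convention)."""
    false_accepts = sum(1 for s in replay_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return false_accepts / len(replay_scores), false_rejects / len(genuine_scores)


def equal_error_rate(genuine_scores, replay_scores):
    """Approximate the EER by trying every observed score as a threshold."""
    best = None
    for t in sorted(set(genuine_scores) | set(replay_scores)):
        far, frr = far_frr(genuine_scores, replay_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Keep the threshold where FAR and FRR are closest; report their mean.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical toy scores, not data from the paper.
genuine = [0.9, 0.8, 0.7, 0.4]
replay = [0.6, 0.3, 0.2, 0.1]

eer, threshold = equal_error_rate(genuine, replay)
far, frr = far_frr(genuine, replay, threshold)
print(f"threshold={threshold}, FAR={far:.2f}, FRR={frr:.2f}, EER={eer:.2f}")
```

When FAR and FRR are exactly equal at the chosen threshold, the accuracy equals one minus the EER, which is consistent with the reported pair of 85.72% and 14.28%.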