L-Vector: Neural Label Embedding for Domain Adaptation

We propose a novel neural label embedding (NLE) scheme for the domain adaptation of a deep neural network (DNN) acoustic model with unpaired data samples from source and target domains. With NLE, we distill the knowledge from a powerful source-domain DNN into a dictionary of label embeddings, or l-vectors, one for each senone class. Each l-vector is a representation of the senone-specific output distributions of the source-domain DNN and is learned to minimize the average L2, Kullback-Leibler (KL), or symmetric KL distance to the output vectors with the same label, through simple averaging or standard back-propagation. During adaptation, the l-vectors serve as the soft targets to train the target-domain model with cross-entropy loss. Without the parallel-data constraint of teacher-student learning, NLE is especially suited to situations where paired target-domain data cannot be simulated from the source-domain data. We adapt a 6400-hour multi-conditional US English acoustic model to each of 9 accented English varieties (80 to 830 hours) and to kids' speech (80 hours). NLE achieves up to 14.1% relative word error rate reduction over direct re-training with one-hot labels.
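The averaging variant of the scheme can be illustrated with a short PyTorch sketch. This is only an illustrative sketch under assumptions, not the authors' implementation: source_model, target_model, the data loaders, and NUM_SENONES are hypothetical placeholders, and frame-level senone alignments are assumed to be available in both domains.

# Minimal sketch of NLE (simple-averaging variant): each l-vector is the
# mean of the source DNN's softmax outputs over frames of one senone, which
# minimizes the average L2 distance to those outputs. During adaptation the
# l-vectors replace one-hot labels as soft targets for cross-entropy training.
import torch
import torch.nn.functional as F

NUM_SENONES = 9000  # size of the senone inventory; an assumed placeholder


@torch.no_grad()
def estimate_l_vectors(source_model, source_loader, num_senones=NUM_SENONES):
    """Average source-domain posteriors per senone to build the l-vector dictionary."""
    sums = torch.zeros(num_senones, num_senones)
    counts = torch.zeros(num_senones)
    source_model.eval()
    for feats, senone_ids in source_loader:          # frames + senone alignments
        posteriors = F.softmax(source_model(feats), dim=-1)
        sums.index_add_(0, senone_ids, posteriors)   # accumulate per-senone sums
        counts.index_add_(0, senone_ids,
                          torch.ones_like(senone_ids, dtype=torch.float))
    return sums / counts.clamp(min=1.0).unsqueeze(1)  # (num_senones, num_senones)


def adapt_target_model(target_model, target_loader, l_vectors, epochs=1, lr=1e-4):
    """Train the target-domain model with l-vectors as soft targets (cross-entropy)."""
    optimizer = torch.optim.Adam(target_model.parameters(), lr=lr)
    target_model.train()
    for _ in range(epochs):
        for feats, senone_ids in target_loader:      # unpaired target-domain frames
            log_probs = F.log_softmax(target_model(feats), dim=-1)
            soft_targets = l_vectors[senone_ids]     # l-vector for each frame's senone
            loss = -(soft_targets * log_probs).sum(dim=-1).mean()  # soft cross-entropy
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return target_model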

deep neural network; label embedding; domain adaptation; teacher-student learning; speech recognition

Zhong Meng, Hu Hu, Jinyu Li, Changliang Liu, Yan Huang, Yifan Gong, Chin-Hui Lee


Microsoft Corporation, Redmond, WA, USA

Georgia Institute of Technology, Atlanta, GA, USA

IEEE International Conference on Acoustics, Speech and Signal Processing

Barcelona, Spain

2020 IEEE International Conference on Acoustics, Speech and Signal Processing

pp. 7389-7393

2020