Computing Technology and Automation, 2024, Vol. 43, Issue 4: 59-65. DOI: 10.16339/j.cnki.jsjsyzdh.202404010

Research on Intelligent Speech Recognition Method for Multi Person Conversation in Complex Scenes Based on Support Vector Machine

刘子寒 沈力 奚梦婷 陆佳鑫 朱佳佳 查俊杰

Author information

  • 1. Information and Communication Branch, State Grid Jiangsu Electric Power Co., Ltd., Nanjing, Jiangsu 210000

Abstract

A single feature of multi-person conversational speech carries too little information about the gender combination of the speakers, which leads to inaccurate speech recognition results. To address this problem, a support vector machine (SVM) based intelligent recognition method for multi-person conversational speech in complex scenes is proposed. A distance metric is used to detect change points in multi-person conversations in complex scenes. The log-likelihood values of any two datasets are computed to build score sets, and the T-Test similarity measure is applied to determine whether the two datasets differ significantly. An SVM discriminant function is then constructed, and the mapping logic of the SVM is used to separate similar voices. The binary classification super-linear classifier of the SVM is used to construct the optimal discriminant function, which is combined with male and female pitch frequency and signal non-resonant frequency features to achieve intelligent recognition of multi-person conversational speech. The experimental results show that, for pitch frequency recognition, the amplitude fluctuation ranges obtained by the proposed method are -0.5 to 0.5 for male speakers and -0.7 to 0.7 for female speakers, consistent with the experimental data. For signal non-resonant frequency recognition, the frequency fluctuation ranges are -600 to 600 Hz for male speakers and -360 to 405 Hz for female speakers; the male range differs from the experimental data by only 50 Hz, while the female range is consistent with the experimental data.
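The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single-Gaussian segment model, the use of Welch's form of the t statistic, the critical value of 2.0, the RBF kernel, and every number below (feature tracks, support vectors, multipliers, bias) are assumptions chosen only for illustration. The function names `score_set`, `welch_t`, `differs`, and `svm_decision` are hypothetical.

```python
import math
import statistics


def gaussian_loglik(x, mu, var):
    """Log-density of a 1-D Gaussian N(mu, var) at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def score_set(frames, model_frames):
    """Log-likelihood of each frame under a Gaussian fitted to model_frames."""
    mu = statistics.mean(model_frames)
    var = statistics.variance(model_frames)
    return [gaussian_loglik(x, mu, var) for x in frames]


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def differs(a, b, threshold=2.0):
    """True when |t| exceeds the threshold (an assumed critical value)."""
    return abs(welch_t(a, b)) > threshold


def rbf(u, v, gamma=1.0):
    """RBF kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((ui - vi) ** 2 for ui, vi in zip(u, v)))


def svm_decision(x, support_vectors, alphas, labels, bias, gamma=1.0):
    """Kernel SVM discriminant f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * y * rbf(sv, x, gamma)
               for sv, a, y in zip(support_vectors, alphas, labels)) + bias


# Made-up 1-D feature tracks for two adjacent speech segments.
seg_a = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8]
seg_b = [3.0, 3.1, 2.9, 3.2, 2.8, 3.0]

# Score both segments under the model of segment A, then compare the score sets.
scores_aa = score_set(seg_a, seg_a)
scores_ba = score_set(seg_b, seg_a)
change_point = differs(scores_aa, scores_ba)

# Hypothetical, already-trained SVM on normalised
# (pitch frequency, non-resonant frequency) features; labels are the two classes.
svs = [(-1.0, -1.0), (1.0, 1.0)]
alphas = [1.0, 1.0]
labels = [-1, +1]
label = 1 if svm_decision((0.8, 0.6), svs, alphas, labels, bias=0.0) > 0 else -1
```

In this sketch a large gap between the two score sets flags a candidate change point, and the sign of the discriminant assigns a segment to one of two classes. In practice the support vectors, multipliers, and bias would come from training on labelled data rather than being fixed by hand.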

Key words

support vector machine/complex scenes/multi-person conversation/intelligent speech recognition

Publication year

2024

Computing Technology and Automation
Hunan University

CSTPCD
Impact factor: 0.295
ISSN: 1003-6199