Pitch Estimation Model Based on the Mamba-UNet Architecture

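Abstract: Pitch estimation algorithms for single-source audio mainly include the Robust Algorithm for Pitch Tracking (RAPT), the Sawtooth Waveform Inspired Pitch Estimator (SWIPE), and Harvest. However, when the source is polyphonic music, such as singing with instrumental accompaniment, these algorithms show clear shortcomings in vocal pitch estimation. Building on existing research, this paper improves the conventional Robust Model for Vocal Pitch Estimation (RMVPE) and proposes Mamba-RMVPE, a model based on the Mamba-UNet architecture, to address vocal pitch estimation for multi-source audio such as polyphonic music. Compared with the conventional RMVPE, Mamba-RMVPE improves Raw Pitch Accuracy (RPA), Raw Chroma Accuracy (RCA), and Overall Accuracy (OA), and also substantially reduces inference time.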

English title: Pitch Estimation Model Based on Mamba-UNet Architecture

English abstract: Pitch estimation algorithms for single-source sound mainly include the Robust Algorithm for Pitch Tracking (RAPT), the Sawtooth Waveform Inspired Pitch Estimator (SWIPE), and Harvest. However, when polyphonic sources with musical accompaniment are introduced, these algorithms have significant shortcomings in vocal pitch estimation tasks. Drawing on existing research and improving the conventional Robust Model for Vocal Pitch Estimation (RMVPE), a Mamba-RMVPE based on the Mamba-UNet architecture is proposed to solve the problem of vocal pitch estimation for multi-source sound such as polyphonic music. Compared with the conventional RMVPE, Mamba-RMVPE improves Raw Pitch Accuracy (RPA), Raw Chroma Accuracy (RCA), and Overall Accuracy (OA), and significantly reduces inference time.
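
The abstract names RPA, RCA, and OA without defining them. The sketch below shows how these three metrics are commonly computed in melody-extraction evaluation, assuming frame-aligned pitch tracks in Hz, a value of 0 for unvoiced frames, and a 50-cent tolerance. The function name `pitch_metrics` and its parameters are hypothetical and chosen for illustration; this is not the paper's evaluation code, and the paper may use different settings.

```python
import math


def pitch_metrics(ref_hz, est_hz, tol_cents=50.0):
    """Return (RPA, RCA, OA) for two frame-aligned pitch tracks.

    Assumption: a value of 0.0 marks an unvoiced frame in either track.
    """
    if len(ref_hz) != len(est_hz):
        raise ValueError("reference and estimate must have the same length")
    if not ref_hz:
        raise ValueError("pitch tracks must not be empty")

    voiced = 0        # frames where the reference is voiced
    pitch_hits = 0    # voiced frames with pitch within tolerance
    chroma_hits = 0   # same, but octave errors are forgiven
    overall_hits = 0  # correct pitch on voiced frames + correct unvoiced frames

    for r, e in zip(ref_hz, est_hz):
        if r > 0:
            voiced += 1
            if e > 0:
                # Signed pitch error in cents (1200 cents per octave).
                diff = 1200.0 * math.log2(e / r)
                if abs(diff) <= tol_cents:
                    pitch_hits += 1
                    overall_hits += 1
                # Fold the error into a single octave before comparing.
                folded = diff - 1200.0 * round(diff / 1200.0)
                if abs(folded) <= tol_cents:
                    chroma_hits += 1
        elif e <= 0:
            # Reference and estimate agree that the frame is unvoiced.
            overall_hits += 1

    rpa = pitch_hits / voiced if voiced else 0.0
    rca = chroma_hits / voiced if voiced else 0.0
    oa = overall_hits / len(ref_hz)
    return rpa, rca, oa
```

Under these assumptions, RPA and RCA are measured only over frames where the reference is voiced, and RCA differs from RPA only in that it does not penalize octave errors. OA is measured over all frames and also rewards correctly detected unvoiced frames.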

English keywords:

polyphony; pitch estimation; Robust Model for Vocal Pitch Estimation (RMVPE); Mamba-UNet

Author:

彭祖剑 (Peng Zujian)

Affiliation:

开普云信息科技股份有限公司, Dongguan, Guangdong 523000, China

Keywords:

polyphonic music; pitch estimation; Robust Model for Vocal Pitch Estimation (RMVPE); Mamba-UNet

Publication year:

2024

DOI: