Citation: Wang, D.; Wei, Y.; Wang, Y.; Wang, J. A Robust and Low Computational Cost Pitch Estimation Method. Sensors 2022, 22, 6026. https://doi.org/10.3390/s22166026
Academic Editors: Enrico Vezzetti, Gabriele Baronio, Domenico Speranza, Luca Ulrich and Andrea Luigi Guerra
Received: 5 July 2022
Accepted: 10 August 2022
Published: 12 August 2022
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
sensors
Article
A Robust and Low Computational Cost Pitch Estimation Method
Desheng Wang 1, Yangjie Wei 1,*, Yi Wang 1 and Jing Wang 2
1 Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
2 School of Information Science and Engineering, Shenyang University of Technology, Shenyang 110870, China
* Correspondence: weiyangjie@cse.neu.edu.cn
Abstract: Pitch estimation is widely used in speech and audio signal processing. However, the harmonic structure models used by current pitch estimation methods do not always match the harmonic distribution of actual signals. Because of the structure of the vocal tract, the acoustic nature of musical instruments, and spectral leakage, the harmonic frequencies of speech and audio signals often deviate slightly from integer multiples of the pitch. This paper starts from the summation of residual harmonics (SRH) method and makes two main modifications. First, the constraint that spectral peaks lie at strict integer multiples of the pitch is relaxed to allow slight deviation, which helps capture harmonics. Second, a main pitch segment extension scheme with low computational cost is proposed to exploit the smooth prior of pitch more efficiently. In addition, the pitch segment extension scheme is integrated into the SRH method's voiced/unvoiced decision to reduce short-term errors. Accuracy comparison experiments with ten pitch estimation methods show that the proposed method has better overall accuracy and robustness. Time cost experiments show that the time cost of the proposed method is around 1/8 that of the state-of-the-art fast NLS method on the experimental computer.
Keywords: pitch estimation; harmonic structure; harmonic summation (HS); smooth prior
1. Introduction
Pitch is a subjective psychoacoustic phenomenon synthesized by the ear auditory cortex system for the brain [1]. As a basic feature, pitch is widely used in the areas of speech interaction [2–6], music signal processing [7–11], and medical diagnosis [12,13]. Research on pitch estimation has been going on for decades, and estimating pitch from clean speech has been considered a solved problem because many methods achieve high estimation accuracy under high signal-to-noise ratio (SNR) conditions. However, the robustness of pitch estimation under noise and reverberation conditions still needs to be improved. Drugman and Alwan of the University of Mons, Belgium, authors of the well-known summation of residual harmonics (SRH) pitch estimation method, emphasize that performance under noisy conditions is the focus of research in pitch estimation over the next decade [14,15].
The robustness of pitch estimation is affected by the model accuracy of the method, and the modeling of almost all pitch estimation methods depends directly or indirectly on the harmonic structure, since the harmonic structure is an essential feature of audio signals. Figure 1 shows the harmonic structure of an audio signal. The spectral peak at a frequency of 100 Hz is the pitch, and the higher spectral peaks located near integer multiples of 100 Hz constitute the harmonic structure of the pitch. A fundamental assumption when modeling harmonic structures for pitch estimation is that the harmonic components are distributed strictly at integer multiples of the pitch [14,16–18]. Expressed as a formula, this model of the harmonic structure is generally realized as the product of an integer and the pitch, that is:
$$f_l = l f_0 \quad (l = 2, \ldots, L) \tag{1}$$
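To make the strict integer-multiple assumption of Equation (1) concrete, the following minimal Python sketch compares two ways of scoring a pitch candidate on a toy spectrum whose peaks sit near, but not exactly at, integer multiples of 100 Hz. It is an illustration only and is neither the SRH method nor the method proposed in this paper. The function names `harmonic_frequencies` and `harmonic_sum`, the 1 Hz frequency grid, the peak positions, and the 8 Hz tolerance are all assumptions chosen for the example.

```python
def harmonic_frequencies(f0, num_harmonics):
    """Return f_l = l * f0 for l = 2, ..., L, as in Equation (1)."""
    return [l * f0 for l in range(2, num_harmonics + 1)]


def harmonic_sum(freqs, mags, f0, num_harmonics, tolerance_hz=0.0):
    """Sum the spectral magnitude found at f0 and at each harmonic f_l.

    With tolerance_hz = 0 only a bin lying exactly on l * f0 contributes
    (the strict integer-multiple model). With tolerance_hz > 0 the largest
    magnitude within +/- tolerance_hz of l * f0 contributes instead.
    """
    targets = [f0] + harmonic_frequencies(f0, num_harmonics)
    total = 0.0
    for target in targets:
        window = [m for f, m in zip(freqs, mags) if abs(f - target) <= tolerance_hz]
        total += max(window, default=0.0)
    return total


# Toy magnitude spectrum on a 1 Hz grid from 0 Hz to 1000 Hz (assumed values).
freqs = [float(k) for k in range(0, 1001)]
mags = [0.0] * len(freqs)
# The pitch is at 100 Hz; the harmonics are shifted slightly off 200, 300, 400, 500 Hz.
for peak_hz in (100, 201, 303, 404, 506):
    mags[peak_hz] = 1.0

strict_score = harmonic_sum(freqs, mags, f0=100.0, num_harmonics=5)
tolerant_score = harmonic_sum(freqs, mags, f0=100.0, num_harmonics=5, tolerance_hz=8.0)

print("strict integer multiples:", strict_score)
print("with 8 Hz tolerance:", tolerant_score)
```

In this toy case the strict model only collects the peak at 100 Hz, while the tolerant search also collects the four shifted harmonics. A wider tolerance captures more deviated harmonics but also admits more unrelated peaks, so in practice the window width is a trade-off rather than a fixed value.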