LPC AND PRONY PARAMETRIC MODELING INTEGRATED INTO THE KALMAN FILTER FOR SPEECH NOISE REDUCTION: COMPARATIVE ANALYSIS AND STATISTICAL EVALUATION
DOI:
https://doi.org/10.66104/hegxr115
Keywords:
Kalman Filter, Noise suppression, Speech enhancement, Prony
Abstract
This work presents a comparative experimental study between two parametric modeling strategies integrated into the Kalman filter for speech denoising: autoregressive models (LPC-AR) and pole–zero models estimated via the Prony method (ARMA). The experimental protocol included controlled input segmental signal-to-noise ratio (segSNR_in ≈ 3 dB), multiple stochastic noise realizations, and two noise scenarios (white and colored AR(1)). Performance was evaluated using complementary metrics: output segmental signal-to-noise ratio (segSNR_out), Itakura–Saito divergence (IS), short-time objective intelligibility (STOI), and computational time. Results were reported as mean ± standard deviation and analyzed using the Friedman test with post-hoc comparisons. The findings indicate that the Kalman+LPC approach achieved superior overall performance and greater statistical consistency, particularly under white noise conditions, while also exhibiting lower computational cost. The Kalman+Prony formulation demonstrated adequate numerical stability, with poles predominantly located inside the unit circle; however, it did not provide systematic performance gains over the all-pole model. Pareto analysis revealed a trade-off between spectral distortion and residual energy, with no clear advantage of the ARMA model under the evaluated conditions. It is concluded that, for the tested scenarios, the LPC model integrated into the Kalman filter constitutes the most robust alternative in terms of average performance, numerical stability, and computational efficiency.
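As an illustration of the Kalman+LPC idea summarized in the abstract, the following minimal sketch filters a single frame with an AR(p) model placed in companion (state-space) form. It is not the authors' implementation. The function names `lpc_autocorr` and `kalman_ar_frame`, the synthetic AR(2) test signal, the single-frame processing, the known noise variance, and the use of clean-signal ("oracle") LPC coefficients are all assumptions made only for this example. The 3 dB value below is a global SNR, not the segmental SNR used in the study.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter


def lpc_autocorr(frame, order):
    """Hypothetical helper: AR coefficients a[0..p-1] (a[i] multiplies s[k-i-1])
    and excitation variance via the autocorrelation (Yule-Walker) method."""
    n = len(frame)
    # Biased autocorrelation estimates r[0..order]
    r = np.array([np.dot(frame[: n - k], frame[k:]) / n for k in range(order + 1)])
    r[0] += 1e-9  # tiny regularization so the Toeplitz system stays solvable
    a = solve_toeplitz(r[:order], r[1 : order + 1])
    exc_var = r[0] - np.dot(a, r[1 : order + 1])
    return a, max(exc_var, 1e-12)


def kalman_ar_frame(noisy, a, exc_var, noise_var):
    """Hypothetical helper: Kalman filter for y[k] = s[k] + v[k], with s[k]
    following the AR model. State x = [s[k-p+1], ..., s[k]]."""
    p = len(a)
    F = np.zeros((p, p))
    F[:-1, 1:] = np.eye(p - 1)  # shift older samples up by one position
    F[-1, :] = a[::-1]          # last row predicts the newest sample
    Q = np.zeros((p, p))
    Q[-1, -1] = exc_var         # excitation drives only the newest sample
    x = np.zeros(p)
    P = np.eye(p) * exc_var     # assumed initial uncertainty
    out = np.empty(len(noisy))
    for k, y in enumerate(noisy):
        # Prediction step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step; the observation picks the last state entry
        gain = P[:, -1] / (P[-1, -1] + noise_var)
        x = x + gain * (y - x[-1])
        P = P - np.outer(gain, P[-1, :])
        out[k] = x[-1]
    return out


# Synthetic demonstration (assumed data, not the paper's speech corpus)
rng = np.random.default_rng(0)
clean = lfilter([1.0], [1.0, -1.6, 0.8], rng.standard_normal(2000))  # stable AR(2)
noise_var = np.var(clean) / 10 ** (3.0 / 10.0)  # white noise at about 3 dB global SNR
noisy = clean + np.sqrt(noise_var) * rng.standard_normal(len(clean))

a_hat, exc_var = lpc_autocorr(clean, order=2)  # oracle LPC, for illustration only
enhanced = kalman_ar_frame(noisy, a_hat, exc_var, noise_var)

mse_in = np.mean((noisy - clean) ** 2)
mse_out = np.mean((enhanced - clean) ** 2)
print(f"MSE before: {mse_in:.3f}, after: {mse_out:.3f}")
```

In a practical enhancement system the coefficients would have to be estimated from the noisy signal frame by frame, and a pole-zero (Prony/ARMA) variant would need a different state-space realization; neither is shown here.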
License
Copyright (c) 2026 Dr. Leandro Aureliano da Silva, Dr. Eduardo Silva Vasconcelos, Dr. Luiz Fernando Ribeiro de Paiva, Dr. Adriano Dawison de Lima, Me. Welington Mrad Joaquim, Dr. Edilberto Pereira Teixeira

This work is licensed under a Creative Commons Attribution 4.0 International License.
