Robust speech parametrization based on pitch synchronized cepstral solutions

Bibliographic Details
Main Authors: Stanisław Gmyrek, Robert Hossa
Format: Article
Language: English
Published: Polish Academy of Sciences 2025-07-01
Series: International Journal of Electronics and Telecommunications
Online Access: https://journals.pan.pl/Content/135734/6-5152-Gmyrek_sk.pdf
Description
Summary: In general, the speech signal can be described by the excitation signal, the impulse response of the vocal tract, and a system that describes the impact of speech emission through the human lips. The characteristics of the vocal tract primarily shape the semantic content of speech. However, the irregular periodicity of glottal excitation is a significant source of substantial distortions (ripples) in the amplitude spectrum of voiced speech. In this study, a PS-STFT (Pitch-Synchronized Short-Time Fourier Transform) method is proposed to obtain a reliable amplitude spectrum of the vocal tract. Subsequently, a set of cepstral coefficient vectors, namely PSHFCC (Pitch Synchronized Human Factor Cepstral Coefficients), chosen as a representative of the commonly used classical cepstral parameterization methods, was analyzed to investigate its statistical properties after correction. Additionally, the GMM (Gaussian Mixture Model), widely accepted in speech recognition applications, was chosen as the statistical acoustic model of individual Polish speech phonemes. To evaluate the quality of the proposed method, the distances between the multivariate probability distributions in GMM form were calculated. Modifying classical cepstral methods to analyze variable-length signal frames synchronized to the fundamental period reduced the variance of the estimators of the cepstral coefficients, which increased the distances between the probability distributions and, consequently, improved classification results.
ISSN: 2081-8491, 2300-1933
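
The summary's closing argument, that a lower variance of the cepstral estimators leads to larger distances between the phoneme distributions, can be illustrated with a minimal sketch. The record does not say which distance measure the authors used, so the Bhattacharyya distance here is an assumption, as are the function name `bhattacharyya_diag`, the diagonal covariances, and the toy means and variances. The sketch also compares only two single Gaussian components rather than full mixtures, because distances of this kind between whole GMMs generally have no closed form.

```python
import math


def bhattacharyya_diag(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two Gaussians with diagonal covariance.

    Hypothetical helper for illustration only; the distance actually used
    in the article is not stated in this record.
    """
    distance = 0.0
    for m_a, v_a, m_b, v_b in zip(mean_a, var_a, mean_b, var_b):
        v_avg = 0.5 * (v_a + v_b)  # per-dimension average variance
        # Term driven by the separation of the means.
        distance += 0.125 * (m_a - m_b) ** 2 / v_avg
        # Term driven by the mismatch of the variances.
        distance += 0.5 * math.log(v_avg / math.sqrt(v_a * v_b))
    return distance


# Toy "phoneme" models: same means, two different estimator variances.
mean_a, mean_b = [0.0, 0.0], [1.0, 2.0]

wide = bhattacharyya_diag(mean_a, [1.0, 1.0], mean_b, [1.0, 1.0])
narrow = bhattacharyya_diag(mean_a, [0.5, 0.5], mean_b, [0.5, 0.5])

print(wide)    # 0.625
print(narrow)  # 1.25
```

With the means held fixed and equal variances in both models, halving the variance doubles the distance, which is the direction of the effect the summary reports for pitch-synchronized framing.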