Comparison of Speech Enhancement Algorithms
Published in Philipos C. Loizou, Speech Enhancement, 2013
In this chapter, we report on the quality and intelligibility evaluation of a number of speech enhancement algorithms described in this book. The speech enhancement algorithms were chosen to encompass four different classes of noise reduction methods: spectral subtractive, subspace, statistical-model-based, and Wiener-type algorithms. The subjective quality evaluations were done using a noisy speech corpus (NOIZEUS) developed in our lab [1]. This corpus was designed to facilitate comparisons of speech enhancement algorithms among research labs. The enhanced speech files were sent to Dynastat, Inc. (Austin, TX) for subjective quality evaluation based on the ITU-T P.835 methodology (see Chapter 10). The subjective ratings were subsequently used to evaluate the correlation of several widely used objective measures with speech quality. The intelligibility evaluations were done using the IEEE sentence corpus. Noisy sentences enhanced by the various algorithms were presented to normal-hearing listeners, who were asked to identify the words spoken. The results of the intelligibility studies are summarized in this chapter.
AI for In-Vehicle Infotainment Systems
Published in Josep Aulinas, Hanky Sjafrie, AI for Cars, 2021
Speech recognition and synthesis are two important examples of how pervasive and influential AI-based technology has become in our modern lives. In AI research literature they represent the two most prominent sub-fields within speech-processing technology. Yet although other subfields such as speech coding, speaker recognition and speech enhancement are less well known, their applications may be examples of AI technology we have used without knowing it. Speech enhancement technology’s purpose is to make speech more understandable, e.g. when used in hearing aids, by automatically reducing unwanted background noise. And advancements in speech coding technology now allow us to enjoy good quality Internet phone calls without using much bandwidth.
1
Published in Jerry C. Whitaker, Microelectronics, 2018
Yariv Ephraim, Hanoch Lev-Ari, William J.J. Roberts
Speech enhancement aims at improving the performance of speech communication systems in noisy environments. Speech enhancement may be applied, for example, to a mobile radio communication system, a speech recognition system, a set of low quality recordings, or to improve the performance of aids for the hearing impaired. The interference source may be a wide-band noise in the form of a white or colored noise, a periodic signal such as in hum noise, room reverberations, or it can take the form of fading noise. The first two examples represent additive noise sources, while the other two examples represent convolutional and multiplicative noise sources, respectively. The speech signal may be simultaneously attacked by more than one noise source.
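The distinction between additive, convolutional, and multiplicative interference can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the sine wave stands in for speech, and the sampling rate, hum frequency, decaying impulse response, and fading gain are arbitrary choices rather than models taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 8000                                  # assumed sampling rate in Hz
t = np.arange(fs) / fs                     # one second of sample times
clean = np.sin(2 * np.pi * 220 * t)        # toy stand-in for a speech signal

# Additive wide-band noise: white Gaussian noise added sample by sample.
white = 0.1 * rng.standard_normal(clean.size)
noisy_white = clean + white

# Additive periodic interference: a 60 Hz hum (frequency is an assumption).
hum = 0.1 * np.sin(2 * np.pi * 60 * t)
noisy_hum = clean + hum

# Convolutional distortion: reverberation sketched as convolution with a
# hypothetical, exponentially decaying random impulse response.
rir = np.exp(-np.arange(400) / 80.0) * rng.standard_normal(400)
rir[0] = 1.0                               # direct path
reverberant = np.convolve(clean, rir)[: clean.size]

# Multiplicative distortion: fading sketched as a slowly varying gain.
gain = 1.0 + 0.5 * np.sin(2 * np.pi * 2 * t)
faded = clean * gain
```

In the additive cases the clean signal is recovered by subtracting the interference, whereas the faded signal is recovered by dividing by the gain and the reverberant one would require deconvolution, which is why the three classes call for different enhancement strategies.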
Improving time–frequency sparsity for enhanced audio source separation in degenerate unmixing estimation technique algorithm
Published in Journal of Control and Decision, 2022
Shahin M. Abdulla, J. Jayakumari
Speech enhancement procedures include spectral subtraction (Wang et al., 2013), the sparseness and temporal gradient regularisation method (Saleem et al., 2018), Wiener filtering (Scalart & Filho, 1996), and the subspace method (Ephraim & Van Trees, 1995). Most of these strategies rely on the STFT, which suffers from the TF resolution problem. To circumvent this, wavelet-based transforms such as the Continuous Wavelet Transform (CWT), Discrete Wavelet Transform (DWT) (Khadidja et al., 2015), and Discrete Wavelet Packet Transform (DWPT) (Messaoud et al., 2016) have been employed. The CWT is computationally complex and involves a lot of redundant information, whereas the DWT and DWPT are shift variant. The DTCWT, being nearly shift invariant, has been employed to improve the separation of speech mixtures (Abdulla & Jayakumari, 2017). Xin et al. (2019) integrate the SET into the enhanced empirical wavelet transform for physical structure analysis.
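The shift-variance problem of the critically sampled DWT can be seen in a small sketch. It uses a hand-written one-level Haar transform, which is an assumption chosen for brevity and not the wavelet used in the cited works; the helper name `haar_dwt` and the test pulse are hypothetical.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT of an even-length signal (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)
    detail = (even - odd) / np.sqrt(2)
    return approx, detail

# A two-sample pulse that starts on an even index.
x = np.zeros(16)
x[4] = x[5] = 1.0
shifted = np.roll(x, 1)                    # the same pulse, delayed by one sample

_, detail_original = haar_dwt(x)
_, detail_shifted = haar_dwt(shifted)

# Energy in the detail band before and after the one-sample shift.
energy_original = np.sum(detail_original ** 2)
energy_shifted = np.sum(detail_shifted ** 2)
```

With the pulse aligned to the downsampling grid, both samples fall in the same pair and the detail band is empty; after a one-sample delay the pulse straddles two pairs and half of its energy moves into the detail band. A coefficient threshold applied to the two cases would therefore treat the same pulse differently.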
A Modified Tunable – Q Wavelet Transform Approach for Tamil Speech Enhancement
Published in IETE Journal of Research, 2022
J. Indra, R. Kiruba Shankar, N. Kasthuri, S. Geetha Manjuri
The proposed speech enhancement approach has considerably enhanced noisy speech corrupted by different noises. This approach uses BPD with a modified TQWT, which permits the user to tune the quality factor and redundancy to provide perfect reconstruction and better quality. The evaluations based on the reconstruction error, cost function, SNR and MOS demonstrate the ability of the system to improve the performance of speech enhancement for computationally efficient implementations and speech recognition systems. In addition, unlike most other wavelet-based speech enhancement systems, in which the Q factor remains constant and affects the reconstruction, the approach allows different values of the Q factor to be tuned for different parts of the speech signal, which leads to perfect reconstruction. The cost function minimized by the BPD-based modified TQWT is computed, and the respective plots verify that the iterative method converges. The TQWT implemented in hardware on the TMS320C6713 DSK also agrees with the simulation results.
A Novel Approach to Improve the Speech Intelligibility Using Fractional Delta-amplitude Modulation Spectrogram
Published in Cybernetics and Systems, 2018
Arul Valiyavalappil Haridas, Ramalatha Marimuthu, Basabi Chakraborty
Speech enhancement is the process of separating the speech signal from a noisy speech signal and removing the distortions (Garg and Sahu 2015). It is a significant step that enhances the perceptual quality of a noisy speech signal. Thus, the speech enhancement process removes the disturbing background noise from the speech signal (He, Bao, and Bao 2017). To solve the problem of separating the noise from the speech signal, the commonly employed methods concentrate on estimating the noise spectrum and removing it from the noisy speech spectrum (Sun et al. 2016). The presence of the noise spectrum in the speech spectrum has various impacts depending on the application. Mainly, the noisy speech spectrum degrades the coding process of the bandwidth compression system. An example of speech enhancement is communication over mobile phones or hands-free systems, where the speech signal of the target speaker should reach the receiver free of murmurs, the speech of nearby people, road noise, and so on. These disturbances are processed and removed at the sending end before the signal is sent to the receiver, so that the far-end user enjoys the communication without any discomfort (Gannot et al. 2017). Thus, the speech enhancement process improves the overall quality or intelligibility, or reduces the degree of listener fatigue (Anuprita and Choudhari 2013).
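The idea of estimating the noise spectrum and removing it from the noisy speech spectrum can be sketched as basic magnitude spectral subtraction. This is a minimal illustration under stated assumptions, not the method of any work cited above: the function name `spectral_subtraction`, the frame length, the spectral floor, the availability of a noise-only segment, and the sine-wave test signal are all hypothetical choices.

```python
import numpy as np

def spectral_subtraction(noisy, noise_only, frame=256, floor=0.05):
    """Subtract an average noise magnitude spectrum, frame by frame."""
    hop = frame // 2
    # Periodic Hann window: copies at 50% overlap sum to one.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame) / frame)

    def frames(x):
        n = 1 + (len(x) - frame) // hop
        return np.stack([x[i * hop : i * hop + frame] * window for i in range(n)])

    # Noise estimate: mean magnitude spectrum over the noise-only frames.
    noise_mag = np.abs(np.fft.rfft(frames(noise_only), axis=1)).mean(axis=0)

    spec = np.fft.rfft(frames(noisy), axis=1)
    mag = np.abs(spec)
    # Subtract the noise magnitude, keeping a small floor to avoid negatives.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    # Apply the result as a real gain so the noisy phase is reused.
    enhanced_spec = spec * (clean_mag / np.maximum(mag, 1e-12))
    enhanced_frames = np.fft.irfft(enhanced_spec, n=frame, axis=1)

    # Overlap-add the processed frames back into a signal.
    out = np.zeros(len(noisy))
    for i, f in enumerate(enhanced_frames):
        out[i * hop : i * hop + frame] += f
    return out

def snr_db(reference, estimate):
    """Signal-to-noise ratio of an estimate against a known reference, in dB."""
    return 10 * np.log10(np.sum(reference ** 2) / np.sum((reference - estimate) ** 2))

rng = np.random.default_rng(0)
fs = 8000                                        # assumed sampling rate in Hz
t = np.arange(2 * fs) / fs
clean = np.sin(2 * np.pi * 440 * t)              # toy stand-in for speech
noisy = clean + 0.3 * rng.standard_normal(t.size)
noise_only = 0.3 * rng.standard_normal(fs)       # separate noise-only recording

enhanced = spectral_subtraction(noisy, noise_only)
inner = slice(256, -256)                         # skip partially covered edges
snr_before = snr_db(clean[inner], noisy[inner])
snr_after = snr_db(clean[inner], enhanced[inner])
```

On this toy signal `snr_after` should exceed `snr_before`, since most noise-dominated bins are pushed down to the floor while the bins carrying the tone are barely changed. Real speech is far less sparse in frequency, and the residual "musical noise" left by plain subtraction is one reason more elaborate noise estimators are used in practice.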