Introduction
Published in Brecht De Man, Ryan Stables, Joshua D. Reiss, Intelligent Music Production, 2019
For progress towards intelligent systems in this domain, significant problems must be overcome that have not yet been tackled by the research community. Most state-of-the-art audio signal processing techniques focus on single-channel signals. Yet multichannel or multitrack signals are pervasive, and the interaction between channels plays a critical role in audio production quality. This issue has been addressed in the context of audio source separation research, but the challenge in source separation generally depends on how the sources were mixed, not on the respective content of each source. Multichannel signal processing techniques are well established, but they are usually concerned with extracting information about sources from several received signals, and not necessarily with facilitating or automating tasks in the audio engineering pipeline with the aim of producing high-quality audio content.
Digital principles
Published in John Watkinson, An Introduction to Digital Audio, 2013
The fixed number of bits in a PCM sample determines the extent of the quantizing range. In the sixteen-bit samples commonly used, there are 65 536 different numbers, each representing a different analog signal voltage. Care must be taken during conversion to ensure that the signal does not go outside the convertor range, or it will be clipped. In Figure 3.2 it will be seen that in a sixteen-bit pure binary system, the number range goes from 0000 hex, which represents the smallest voltage, through to FFFF hex, which represents the largest positive voltage. Effectively the zero voltage level of the analog waveform has been shifted so that the positive and negative voltages in a real audio signal may be expressed by binary numbers which are only positive. This approach is called offset binary and unfortunately it is unsuitable for audio signal processing in the digital domain.
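The mapping described above can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the book: the function names are hypothetical, and the signed range of -32768 to 32767 is an assumption about how the analog voltage is scaled. The last lines show one reason offset binary is awkward for processing: adding two codes directly adds the offset twice, so two silent signals do not sum to silence.

```python
BITS = 16
OFFSET = 1 << (BITS - 1)       # 0x8000, the code assigned to zero volts
MAX_CODE = (1 << BITS) - 1     # 0xFFFF, the largest positive voltage


def to_offset_binary(sample: int) -> int:
    """Map a signed sample to a 16-bit offset binary code.

    Inputs outside the convertor range are clipped, as the text warns.
    """
    clipped = max(-OFFSET, min(OFFSET - 1, sample))
    return clipped + OFFSET


def from_offset_binary(code: int) -> int:
    """Recover the signed sample value from an offset binary code."""
    return code - OFFSET


# Zero volts sits in the middle of the code range, not at 0000 hex.
silence = to_offset_binary(0)

# Naively adding two offset binary codes counts the offset twice:
# silence plus silence overflows the 16-bit range instead of staying silent.
naive_sum = silence + silence

# Removing the offset first, then adding, gives the expected result.
correct_sum = to_offset_binary(
    from_offset_binary(silence) + from_offset_binary(silence)
)
```

Here `naive_sum` no longer fits in sixteen bits, whereas `correct_sum` is still the silence code, which is why arithmetic has to be done on a signed representation.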
Spatial audio psychoacoustics
Published in Francis Rumsey, Spatial Audio, 2012
A study of a few human pinnae will quickly show that, rather like fingerprints, they are not identical. They vary quite widely in shape and size. Consequently, so do HRTFs, which makes it difficult to generalise the spectral characteristics across large numbers of individuals. People who have tried experiments where they are given another person's HRTF, by blocking their own pinnae and feeding signals directly to the ear canal, have found that their localising ability is markedly reduced. After a short time, though, they appear to adapt to the new information. This has implications for binaural audio signal processing. Considerable effort has been made, particularly over the last twenty years, to characterise human HRTFs and to find what features are most important for directional perception. If certain details of HRTFs can be simplified or generalised then it makes them much easier to simulate in audio systems, and for the results to work reasonably well for different listeners. There is some evidence that generalisation is possible, but people localise best with their own HRTFs. There are even known to be ‘good localisers’ and ‘poor localisers’, and the HRTFs of good localisers are sometimes found to be more useful for general application.
Reconstructing room scales with a single sound for augmented reality displays
Published in Journal of Information Display, 2023
Benjamin S. Liang, Andrew S. Liang, Iran Roman, Tomer Weiss, Budmonde Duinkharjav, Juan Pablo Bello, Qi Sun
We envision our work to open new possibilities of rapidly establishing physical scene perception for displaying real objects and environments in AR. Our framework applies to highly-occluded or remote scenes, without the currently tedious camera-based scanning process. Furthermore, it exploits audio signal processing hardware and software used with consumer-level AR displays, for multi-modal spatial scene content detection in augmented reality and autonomous driving. We also believe that, in order to be able to carry out the 3D scene reconstruction task, our model must embed the input signal into a generalized spatial representation that could be used in future work for other downstream tasks, such as discrete sound event localization or acoustic imaging, to name a couple of examples.
A novel design of low-cost hearing aid devices using an efficient lifting filter bank with a modified variable filter
Published in Expert Review of Medical Devices, 2022
N Subbulakshmi, R Manimegalai, G Rajakumar, T Ananth Kumar, Umadevi Kosuri
Then, the update operation produces even samples, which are later fed into the prediction process to produce new odd samples for the next step. Several investigations have been carried out on various filter bank algorithms. Frequency-warped signal processing techniques marked the beginning of wideband audio signal processing development. The modified variable filter bank (MVFB) structure is presented. It is designed to compensate for a variety of hearing losses. The device consists of an input microphone, a signal conditioning unit that processes the signals, and an output speaker.
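To make the predict and update steps concrete, here is a minimal sketch of one level of a lifting scheme. It is not the MVFB design from the article: it assumes the simple Haar filter and the common predict-then-update ordering, and the function names are hypothetical. It only illustrates how the input is split into even and odd samples and how each step can be exactly undone.

```python
def lifting_forward(x):
    """One level of Haar lifting: split, predict, update.

    Returns (approx, detail), the low-pass and high-pass subbands.
    """
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    even = x[0::2]
    odd = x[1::2]
    # Predict: estimate each odd sample from its even neighbour,
    # keeping only the prediction error as the detail signal.
    detail = [o - e for e, o in zip(even, odd)]
    # Update: adjust the even samples so that they carry the pair average.
    approx = [e + d / 2 for e, d in zip(even, detail)]
    return approx, detail


def lifting_inverse(approx, detail):
    """Undo lifting_forward by reversing each step with the sign flipped."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    x = [0.0] * (2 * len(even))
    x[0::2] = even
    x[1::2] = odd
    return x


signal = [2.0, 4.0, 6.0, 6.0]
approx, detail = lifting_forward(signal)
restored = lifting_inverse(approx, detail)
```

Because every lifting step is inverted simply by subtracting what was added, `restored` matches `signal` exactly. That in-place, low-arithmetic structure is one reason lifting is attractive for low-cost hardware.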