Modulation Schemes
Published in Stephan S. Jones, Ronald J. Kovac, Frank M. Groom, Introduction to Communications Technologies, 2015
Stephan S. Jones, Ronald J. Kovac, Frank M. Groom
The sampling rate of the analog wave defines the quality of the signal: increased sampling replicates the signal closer to its original form. Nyquist's theorem dictates that sampling should occur at a rate at least twice the highest frequency being sampled. The human voice spans a range from approximately 30 to 24,000 Hz; however, most of the information being transmitted resides in the range from roughly 300 to 3700 Hz. Engineers therefore created a filtering process that passes a bandwidth of 4000 Hz (4 kHz), which encompasses the frequencies defined. Sampling at twice the highest frequency in that band (4 kHz) results in a sampling rate of 8 kHz. To attempt to imagine sampling something at such a high rate in a very short period of time always leaves me in awe of how digital systems were originally designed. Exhibit 3.17 compares a basic sampling rate to twice the basic rate and the resulting replicated waveforms. The waveform sampled at twice the basic rate bears a greater resemblance to the original, as can be seen from the points of each pulse. Defining those pulses and their associated voltage values in a binary format is the next step in moving to digital transmission technologies.
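A minimal Python sketch of this rate arithmetic and of the basic-versus-doubled sampling comparison in Exhibit 3.17 (the 1 kHz test tone, the 5 ms window, and the choice of 8 kHz as the "basic" rate are assumed values for illustration, not taken from the chapter):

```python
import numpy as np

# Telephony values from the text: a 4 kHz filtered voice band,
# sampled at twice the highest frequency per Nyquist.
highest_freq_hz = 4_000
nyquist_rate_hz = 2 * highest_freq_hz   # 8,000 samples per second

tone_hz = 1_000      # hypothetical tone inside the voice band
duration_s = 0.005   # assumed 5 ms observation window

def sample_tone(rate_hz):
    """Return (times, samples) of the test tone sampled at rate_hz."""
    t = np.arange(0, duration_s, 1.0 / rate_hz)
    return t, np.sin(2 * np.pi * tone_hz * t)

# "Basic" rate vs. twice the basic rate, as compared in Exhibit 3.17
# (the exhibit's actual rates are not given; 8 vs. 16 kHz used here):
t_basic, s_basic = sample_tone(nyquist_rate_hz)
t_double, s_double = sample_tone(2 * nyquist_rate_hz)

print(f"Nyquist rate for a {highest_freq_hz} Hz band: {nyquist_rate_hz} samples/s")
print(f"Samples captured in {duration_s} s: {len(s_basic)} vs {len(s_double)}")
```

Doubling the rate captures twice as many points per period of the tone, which is why the reconstructed waveform in the exhibit tracks the original more closely.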
Architectural Acoustics
Published in Malcolm J. Crocker, A. John Price, Noise and Noise Control, 2018
Malcolm J. Crocker, A. John Price
In the general context of speech, vowels and consonants are woven together to produce not only linguistically organized words but also sounds with a distinctive personal character. The vowels usually carry greater energy than consonants and give the voice its character, probably because vowels have definite frequency spectra with superimposed periodic, short-duration peaks. However, it is the consonants that give speech its intelligibility. It is therefore essential in the design of rooms for speech to preserve both the vowel and consonant sounds for all listeners. Consonants are generally transient, short-duration sounds of relatively low energy. For speech, it is therefore necessary to have a room with a short reverberation time to avoid blurring of consecutive consonants; we would expect speech intelligibility to decrease with increasing reverberation time. At the same time, to produce a speech signal level well above the reverberant sound level (i.e., a high signal-to-noise ratio), we require increased sound absorption in the room, which again necessitates a lower reverberation time. Although this may suggest that an anechoic room would be most suitable for speech intelligibility, some sound reflections are required both to boost the level of the direct sound and to give the listener a feeling of volume. Therefore, an optimum reverberation time is established, usually under 1 sec for rooms with volumes under 300,000 ft³. If the speech power emitted by a male speaker is averaged over a relatively long period (e.g., 5 sec), the overall sound power level is found to be 75 dB. This corresponds to an averaged sound pressure level of 65 dB at 1 m from the lips of the speaker and directly in front of him. Converting the power level to acoustic power shows that the long-time-averaged power for men is 30 μW. The average female voice is found to emit approximately 18 μW. However, if we average over a very short time (e.g., 1/8 sec), we find that the power emitted in some vowel sounds can be 50 μW, while in other softly spoken consonants it is only 0.03 μW [59]. Generally, the human voice has a dynamic range of approximately 30 dB throughout its frequency range [60]. At maximum vocal effort (loud shouting), the sound power from the male voice may reach 3,000 μW.
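The conversion from sound power level to acoustic power used above follows directly from the definition of the decibel scale with the standard reference power of 10⁻¹² W; as a worked step:

```latex
L_W = 10 \log_{10}\frac{W}{W_0}, \qquad W_0 = 10^{-12}\,\mathrm{W}
\quad\Longrightarrow\quad
W = W_0 \cdot 10^{L_W/10} = 10^{-12} \times 10^{75/10}
\approx 3.2 \times 10^{-5}\,\mathrm{W} \approx 30\,\mu\mathrm{W}.
```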
Advances in Ultra-Low-Power Miniaturized Applications for Health Care and Sports
Published in Laurent A. Francis, Krzysztof Iniewski, Novel Advances in Microsystems Technologies and Their Applications, 2017
Miguel Hernandez-Silveira, Su-Shin Ang, Alison Burdett
The working principle of IP is as follows: variations in the air volume of the lungs resulting from inhalation and exhalation while breathing induce small, periodic changes in body impedance (between 0.1 and 3 Ω) that can be detected as voltage fluctuations across the thoracic region when using appropriate instrumentation. The latter involves the injection of a low-intensity (≤100 μA), high-frequency (from 20 to 100 kHz) AC current through the chest wall using conventional pre-gelled surface (skin) electrodes. Thus, the typical pattern of a clean waveform resulting from IP respiratory activity is quasiperiodic by nature and resembles a sine waveform. Unfortunately, this is not always the case, as the IP technique is very sensitive to motion artefacts, which affect the quality of the IP respiration signals. In simple terms, perturbation of the equilibrium of ionic charges around the electrode–electrolyte interface occurs as a result of mechanical strains imposed on the electrodes by movement. Such a perturbation results in unwanted alterations of the voltaic potential of the electrodes, adding errors to the dynamic potential difference signal measured/recorded across the thoracic region [20]. As mentioned earlier in this chapter, some of the signals generated by this method (in the absence of motion artefacts) are not clinically relevant. For example, the human voice results from vibration of the vocal cords induced by air outflow, so the rhythmic respiration pattern is disrupted when the person is talking. Likewise, food or drink intake interrupts the natural pattern of breathing. Resultant IP signals are not clinically relevant in these cases and hence must be discarded. All these factors, together with the current platform limitations (i.e. intermittent respiration segments of 60 s each, lack of an additional channel for reference signals from other sensors and electrodes, and limited memory and processing capabilities in the body-sensor node), introduced daunting challenges in the respiration algorithm design: it must be simple without compromising the reliability and robustness of the RR estimations. Hence, the respiratory algorithm designed here does not involve computationally expensive DSP techniques to preprocess the signal. Similar to other approaches [9], it relies instead on a systematic set of rules to determine the validity of the signal. The algorithm then either returns an RR value when the signal corresponds to genuine respiration activity or an 'invalid data' message when the signal is distorted or clinically irrelevant.
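The chapter does not spell out the rule set, so the following Python sketch is only a hypothetical illustration of such rule-based validation: the thresholds, the peak-time input format, and the function name are all assumptions, not the authors' algorithm.

```python
# Hypothetical rule-based validity check of the kind described above.
# All limits here are illustrative assumptions.

PLAUSIBLE_RR_RANGE = (4.0, 60.0)   # breaths/min, assumed physiological limits

def estimate_rr(peak_times_s):
    """Estimate respiration rate (breaths/min) from breath-peak times,
    or return 'invalid data' when the segment fails the rules."""
    if len(peak_times_s) < 3:
        return "invalid data"      # too few breaths in the 60 s segment
    intervals = [b - a for a, b in zip(peak_times_s, peak_times_s[1:])]
    # Rule 1: breath-to-breath intervals must be roughly regular
    # (talking, eating/drinking, or motion artefacts disrupt the
    # quasiperiodic pattern, so such segments are discarded).
    if max(intervals) > 2.0 * min(intervals):
        return "invalid data"
    rr = 60.0 / (sum(intervals) / len(intervals))
    # Rule 2: the resulting rate must be physiologically plausible.
    lo, hi = PLAUSIBLE_RR_RANGE
    return rr if lo <= rr <= hi else "invalid data"

print(estimate_rr([1.2, 4.8, 8.3, 11.9, 15.4]))  # ~16.9 breaths/min
print(estimate_rr([1.2, 2.0, 9.5]))              # irregular -> 'invalid data'
```

Note how the validity rules run before any rate is reported, matching the design constraint that the node's limited memory and processing budget rules out heavier DSP preprocessing.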
Simplified cerebellum-like spiking neural network as short-range timing function for the talking robot
Published in Connection Science, 2018
The talking robot consists of an air pump, an artificial vocal cord, a resonance tube, an artificial nasal cavity, and a microphone connected to a sound analyser, which represent the lung, vocal cord, vocal tract, nasal cavity and auditory feedback of a human, respectively. An air compressor provides the airflow for the talking robot. The airflow is directed to the vocal cord via a pressure control valve and two control valves, which work as controllers for the volume of voiced and unvoiced sound. The resonance tube works as a vocal tract and is attached to the artificial vocal cord for the manipulation of resonance characteristics. The nasal cavity is connected to the resonance tube with a rotary valve between them. The microphone and amplifier act as an auditory system for feedback during speech. The relations between voice characteristics and motor control commands are stored in the system controller and are referred to during the generation of speech articulatory motion. An overview of the system is shown in Figure 1. The characteristics of the glottal wave, which determine the pitch and the volume of the human voice, are governed by the complex behaviour of the vocal cord: the oscillatory mechanism of human organs consisting of a mucous membrane and muscles excited by airflow from the lungs. The vibration of a 5 mm wide rubber band attached to a plastic body creates the sound of the artificial vocal cord.
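As an illustration of the kind of stored relation the system controller might consult, here is a hypothetical Python sketch; the paper does not describe the actual data structure, and every phoneme and value below is invented for illustration only.

```python
# Hypothetical lookup of motor control commands for a target voice
# characteristic; structure and values are illustrative assumptions.
# Valve openings and resonance-tube motor positions normalized to 0..1.
articulation_table = {
    # phoneme: (voiced_valve, unvoiced_valve, tract_motor_positions)
    "a": (0.8, 0.0, [0.9, 0.7, 0.5, 0.3]),
    "i": (0.8, 0.0, [0.2, 0.3, 0.8, 0.9]),
    "s": (0.0, 0.6, [0.4, 0.4, 0.9, 0.9]),
}

def motor_commands_for(phoneme):
    """Return the stored motor control commands for a target phoneme."""
    try:
        return articulation_table[phoneme]
    except KeyError:
        raise ValueError(f"no stored articulation for {phoneme!r}")

voiced, unvoiced, tract = motor_commands_for("a")
print(f"voiced valve={voiced}, unvoiced valve={unvoiced}, tract={tract}")
```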
Gestural systems for the voice: performance approaches and repertoire
Published in Digital Creativity, 2018
Continuing in the artist-led controller tradition, Hewitt has composed and performed consistently with the eMic (extended Mic-stand instrument controller) (Hewitt 2003) for over a decade. Controlled by a stylized gestural vocabulary based on common gestures enacted by vocalists in popular music, the interface modifies the voice with digital audio effects including filter sweeps, grainy distortion and pitch shifting. In recent years Hewitt developed a wearable wireless system, enabling her to embrace unencumbered, free air gestures. She combines both interfaces in selected performances, as seen in Figure 1, exploring the ability to capture and process the human voice, removing it from the body and its biological limitations, such as breath, pitch range, timbral quality and amplitude. Hewitt treats the voice as an abstract sound for manipulation, reconfiguring it to transcend the gender and cultural conditioning usually associated with the female vocal sound.
Design and implementation of a VoIP PBX integrated Vietnamese virtual assistant: a case study
Published in Journal of Information and Telecommunication, 2023
Hai Son Hoang, Anh Khoa Tran, Thanh Phong Doan, Huu Khoa Tran, Ngoc Minh Duc Dang, Hoang Nam Nguyen
Representing analogue signals in digital form is a difficult task. Since sound, such as the human voice, is inherently analogue, many digital values are required to represent its amplitude, frequency, and phase, and converting those values into binary form (0s and 1s) is non-trivial. Performing this conversion requires a coder-decoder (codec), i.e., an encoder and a decoder. The analogue signal is applied to the input of this unit and converted into binary digital sequences at the output; at the receiving terminal, the process is reversed and the binary sequences are converted back into an analogue signal. There are four steps involved in digitizing an analogue signal: sampling, quantization, encoding, and voice compression.
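A minimal Python sketch of the first three of these steps, using plain linear 8-bit PCM as an assumed scheme; real VoIP codecs such as G.711 use companding rather than linear quantization, and the compression step is omitted here.

```python
import math

SAMPLE_RATE = 8_000   # Hz, standard telephony sampling rate
BITS = 8              # assumed quantization depth

def digitize(signal, duration_s=0.002):
    """signal: a function of time (seconds) returning amplitude in [-1, 1].
    Returns a list of 8-bit binary code words."""
    levels = 2 ** BITS
    encoded = []
    n_samples = int(duration_s * SAMPLE_RATE)
    for n in range(n_samples):                       # 1) sampling
        x = signal(n / SAMPLE_RATE)
        q = round((x + 1.0) / 2.0 * (levels - 1))    # 2) quantization
        encoded.append(format(q, "08b"))             # 3) encoding to binary
    return encoded                                   # 4) compression would follow

tone = lambda t: math.sin(2 * math.pi * 1_000 * t)   # 1 kHz test tone
print(digitize(tone)[:4])   # first four 8-bit code words
```

The decoder side of the codec simply reverses steps 3 to 1, mapping each code word back to a voltage level and smoothing the result into a continuous waveform.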