An Introduction to Sound, Hearing and Perception
Published in Nick Zacharov, Sensory Evaluation of Sound, 2018
The primary acoustic communication mode specific to human beings is speech. It is the original form of linguistic communication and remains central to everyday life. A speech signal is the acoustic output of the speech production organs: lungs, vocal folds, vocal and nasal tracts, tongue, and lips. It consists of vowels, consonants, and other sounds produced with these organs. Depending on the positions of the organs and on how the sound is produced, the spectral characteristics of the sounds vary considerably, as seen in Figure 3.21. For example, the resonances of the vocal and nasal tracts are called formants; during vowel sounds they are visible as peaks in the spectrogram at frequencies of about 800 Hz to 4 kHz. Each consonant and vowel has distinct characteristics in the time-frequency domain, and these characteristics encode the linguistic information in speech. In addition, speech contains so-called prosodic features, such as intonation, stress, rhythm, and timing, which can convey emotions and attitudes or signal the difference between statements and questions. Speech production and related technologies have been studied extensively; introductions can be found in (O’Shaughnessy, 1987; Flanagan, 1972; Titze, 1994; Rabiner and Schafer, 1978).
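The link between tract resonances and spectral peaks can be illustrated with a minimal sketch. It assumes a simplified all-pole model in which each formant is one two-pole digital resonator; the formant values (800, 1200, and 2500 Hz), the 80 Hz bandwidth, and the 16 kHz sample rate are hypothetical choices for illustration and are not taken from the chapter.

```python
import cmath
import math


def resonator_cascade_gain(freq_hz, formants_hz, bandwidth_hz, sample_rate_hz):
    """Magnitude response of a cascade of two-pole resonators at freq_hz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    # Pole radius from the resonance bandwidth (standard approximation).
    r = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)
    gain = 1.0
    for f0 in formants_hz:
        theta = 2 * math.pi * f0 / sample_rate_hz
        denom = 1 - 2 * r * math.cos(theta) * z_inv + r * r * z_inv * z_inv
        gain /= abs(denom)
    return gain


def spectral_peaks(formants_hz, bandwidth_hz=80.0, sample_rate_hz=16000.0,
                   step_hz=5.0):
    """Return the frequencies of local maxima of the magnitude response."""
    n = int(sample_rate_hz / 2 / step_hz) + 1
    freqs = [i * step_hz for i in range(n)]
    gains = [resonator_cascade_gain(f, formants_hz, bandwidth_hz,
                                    sample_rate_hz) for f in freqs]
    return [freqs[i] for i in range(1, n - 1)
            if gains[i] > gains[i - 1] and gains[i] > gains[i + 1]]


# Hypothetical formant frequencies for a vowel-like sound (assumption).
assumed_formants = [800.0, 1200.0, 2500.0]
peaks = spectral_peaks(assumed_formants)
print(peaks)
```

In this idealized model each peak of the response lies close to one of the assumed resonance frequencies, which is the pattern a spectrogram shows during a sustained vowel.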
Principal characteristics of speech
Published in Sadaoki Furui, Digital Speech Processing, Synthesis, and Recognition, 2018
The speech production process involves three subprocesses: source generation, articulation, and radiation. The human vocal organ complex consists of the lungs, trachea, larynx, pharynx, and nasal and oral cavities. Together these form a connected tube, as indicated in Fig. 2.2. The upper portion, beginning with the larynx, is called the vocal tract; it can be changed into various shapes by moving the jaw, tongue, lips, and other internal parts. Raising the velum (soft palate) separates the nasal cavity from the pharynx and oral cavity.
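The three subprocesses can be sketched as a simple signal chain. The code below is an illustrative simplification, not the model used in the book: it assumes an impulse train as the source, a single two-pole resonator as the articulation stage, and a first-order difference as the radiation stage. All function names and parameter values (100 Hz pitch, 800 Hz resonance, 80 Hz bandwidth, 16 kHz sample rate) are hypothetical.

```python
import math


def glottal_source(num_samples, pitch_hz, sample_rate_hz):
    """Source generation: an impulse train at the pitch period (assumption)."""
    period = round(sample_rate_hz / pitch_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(num_samples)]


def articulate(signal, formant_hz, bandwidth_hz, sample_rate_hz):
    """Articulation: one two-pole resonator standing in for the vocal tract."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)
    a1 = 2 * r * math.cos(2 * math.pi * formant_hz / sample_rate_hz)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def radiate(signal):
    """Radiation: first-order difference, a common textbook approximation."""
    out = []
    prev = 0.0
    for x in signal:
        out.append(x - prev)
        prev = x
    return out


sample_rate_hz = 16000
source = glottal_source(1600, pitch_hz=100, sample_rate_hz=sample_rate_hz)
speech_like = radiate(articulate(source, 800, 80, sample_rate_hz))
```

Changing the resonator frequency plays the role of reshaping the vocal tract, while the source and radiation stages stay the same.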
Analysis of fiber strain in the human tongue during speech
Published in Computer Methods in Biomechanics and Biomedical Engineering, 2020
Arnold D. Gomez, Maureen L. Stone, Jonghye Woo, Fangxu Xing, Jerry L. Prince
Deformation of the tongue plays a key role in several everyday functions, including breathing, swallowing, and speech generation, a principal contributor to quality of life (Campbell et al. 2000; Pierre et al. 2014). Deformation, quantified by the strain tensor, arises from biomechanical interactions including interfaces with structures such as bones, contraction of several myofiber families, and a delicate temporal orchestration between substructures in the tongue (Takemoto 2001; Gilbert et al. 2007; Sanders and Mu 2013). In speech production, the tongue modulates the reverberating volume within the vocal tract by changing its surface shape, enabling the articulation of different sounds (Stone and Lundberg 1996). Surface changes result from volumetric shape changes achieved via time-variant and spatially inhomogeneous deformation patterns (Sanguineti et al. 1997; Buchaillard et al. 2009; Kajee et al. 2013). These patterns are of scientific and clinical interest, and although they can be interrogated using medical imaging (which yields 4D strain tensor fields), the results are difficult to visualize and interpret (Gilbert et al. 2007; Parthasarathy et al. 2007). Thus, interpretation with respect to tissue anisotropy, conservation of volume, and boundary conditions is necessary to take advantage of experimental strain measurements.
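As a minimal sketch of how a strain tensor can be read with respect to a fiber direction and volume conservation, the code below assumes the Green-Lagrange strain measure, E = (FᵀF − I)/2, computed from a deformation gradient F. The choice of strain measure, the helper names, and the example deformation (10% shortening along one axis with compensating lateral expansion) are assumptions for illustration and are not taken from the article.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix; for F this is the volume ratio J."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def green_lagrange_strain(F):
    """E = (F^T F - I) / 2 for a 3x3 deformation gradient F."""
    C = matmul(transpose(F), F)
    return [[0.5 * (C[i][j] - (1.0 if i == j else 0.0)) for j in range(3)]
            for i in range(3)]


def strain_along(E, direction):
    """Project the strain tensor onto a (normalized) fiber direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    f = [d / norm for d in direction]
    return sum(f[i] * E[i][j] * f[j] for i in range(3) for j in range(3))


# Hypothetical deformation: fibers along x shorten by 10%, and the tissue
# expands equally in y and z so that volume is conserved (assumption).
stretch = 0.9
lateral = 1.0 / math.sqrt(stretch)
F = [[stretch, 0, 0], [0, lateral, 0], [0, 0, lateral]]

E = green_lagrange_strain(F)
fiber_strain = strain_along(E, [1, 0, 0])
cross_fiber_strain = strain_along(E, [0, 1, 0])
volume_ratio = det3(F)
```

Here the strain along the assumed fiber direction is negative (shortening), the cross-fiber strain is positive, and the volume ratio stays at one, which is the kind of combined reading of anisotropy and incompressibility the passage describes.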