Speech and its perception
Published in Stanley A. Gelfand, Hearing, 2017
The six stops are produced at three locations. The bilabials (/p, b/) are produced by an obstruction at the lips, the alveolars (/t, d/) by the tongue tip against the upper gum ridge, and the velars (/k, g/) by the tongue dorsum against the soft palate. Whether the sound is heard as voiced or voiceless is, of course, ultimately due to whether there is vocal cord vibration; however, cues differ according to the location of the stop in an utterance. The essential voicing cue for initial stops is voice onset time (VOT), which is simply the time delay between the onset of the stop burst and commencement of vocal cord vibration (Lisker and Abramson, 1964, 1967). In general, voicing onset precedes or accompanies stop burst onset for voiced stops but lags behind the stop burst for voiceless stops. For final stops and those which occur medially within an utterance, the essential voicing cue appears to be the duration of the preceding vowel (Raphael, 1972). Longer vowel durations are associated with the perception that the following stop is voiced. Voiceless stops are also associated with longer closure durations (Lisker, 1957a), faster formant transitions (Slis, 1970), greater burst intensities (Halle et al., 1957), and somewhat higher fundamental frequencies (Haggard et al., 1970) than voiced stops.
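The definition of VOT above lends itself to a small worked example. The following is a minimal sketch, assuming the burst onset and the voicing onset have already been marked by hand (in milliseconds) on a recording; the function names and the 1 ms tolerance are illustrative assumptions, not values from the cited studies.

```python
def voice_onset_time_ms(burst_onset_ms, voicing_onset_ms):
    """VOT: delay from stop burst onset to the start of vocal cord vibration.

    Negative values mean voicing starts before the burst, positive values
    mean voicing lags behind the burst.
    """
    return voicing_onset_ms - burst_onset_ms


def describe_vot(vot_ms, tolerance_ms=1.0):
    """Label a VOT value as voicing lead, simultaneous voicing, or voicing lag.

    tolerance_ms is a hypothetical measurement tolerance chosen for
    illustration only.
    """
    if vot_ms < -tolerance_ms:
        return "lead"          # voicing precedes the burst
    if vot_ms <= tolerance_ms:
        return "simultaneous"  # voicing accompanies the burst
    return "lag"               # voicing follows the burst


# Hypothetical annotations: burst at 100 ms, voicing at 160 ms.
vot = voice_onset_time_ms(100.0, 160.0)
print(vot, describe_vot(vot))
```

In line with the passage, "lead" and "simultaneous" correspond to the pattern described for voiced initial stops, and "lag" to the pattern described for voiceless ones.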
All about Foreign Accent Syndrome
Published in Jack Ryalls, Nick Miller, Foreign Accent Syndromes, 2014
For the initial speech motor examination, one examines control of respiration for speech. Voice production is measured with perceptual rating scales and may also be backed up by acoustic measures. Prolonged vowels and speech tasks are employed. Detailed assessment of prosody is likely to be of central interest. Articulation is evaluated with diadochokinetic (i.e. reiterative) speech tasks to look at variables such as (stability of) voice onset time, the ability to sustain range and rate of movement, and the differential effects of weakness or incoordination at different sites, for instance the lips, tongue tip, and tongue back.
Vocal Motor Disorders
Published in Rolland S. Parker, Concussive Brain Trauma, 2016
A phoneme is a single distinct sound, the minimal sound unit that contrasts meaning and defines a word in a language (e.g., /p/ and /b/). Phonemes consist of distinctive features of sound production (i.e., voicing, aspiration, roundedness, and the location and degree of maximal constriction of the vocal tract creating pitch). In a given language, some sequences are permitted while others are forbidden (Blumstein, 1991; Caplan et al., 1999). Air has to go through the larynx either whispered or voiced. Voiced sounds are those produced with vocal cord vibrations (/b/), as contrasted with voiceless sounds (/t/, /p/, /s/) (i.e., without vibration of the vocal cords). For voiceless consonants, the vocal cords vibrate 30 msec after the stop consonant is released (/s/). Phonemes are formed by the location and the maximal constriction of the vocal tract, as well as voicing (glottal or laryngeal vibrations) (i.e., speech sounds produced by vibration of the vocal cords with the opening between them, as /b/, /d/, /c/). A glottal stop is a speech sound made by the closure and then explosive release of the glottis. Opening and closing of the velopharyngeal port is required to produce appropriate nasal and oral resonance of speech and the intraoral pressures necessary for the articulation of phonemes, as well as to affect prosody and articulation in dysarthric speakers. Dysfunction may result after lesions to the upper motor neurons (UMNs) that supply the bulbar region of the brainstem, and the lower motor neurons (LMNs) that supply muscles of the soft palate and pharynx, and subcortical structures such as the basal ganglia and cerebellum (Theodoros & Murdoch, 2001a). Voice-onset time is the interval between the release of a stop consonant and the onset of glottal pulsing. Anterior patients have difficulty with phonetic dimensions requiring the timing of two independent articulators (Blumstein, 1991).
The brain processes complex acoustic information and identifies a phoneme based on known categories of speech signals (Fitch et al., 1997). Contrast between related sounds involves both voicing and the place of articulation.
Temporal characteristics of stop consonants in pediatric cochlear implant users
Published in Cochlear Implants International, 2019
Stops are the most common consonants, occurring in all human languages (Ladefoged and Maddieson, 1996), and are produced by the complete occlusion of the cavity by the articulators followed by a release. Acoustic events of stops comprise frequency-related parameters, which include burst frequency and formant transition, and temporal parameters such as Voice Onset Time (VOT), Burst Duration (BD), and Closure Duration (CD). Among the temporal parameters, VOT has been widely studied in typically developing children (TDC) across languages (Lisker and Abramson, 1964; Savithri, 1996; Shukla, 1989; Sridevi, 1990). VOT is the time difference between the onset of articulatory release and the onset of voicing and is considered a major cue for differentiating prevocalic stops along the voicing dimension (Lisker and Abramson, 1964). Studies in English and Dravidian languages like Kannada, Malayalam, and Tamil have revealed that voiceless plosives have longer VOT compared to voiced plosives (Docherty, 1992; Klatt, 1975; Lisker and Abramson, 1964, 1967; Shukla, 1989; Savithri et al., 2001). VOT values differ according to the place of articulation. In English, among the three primary places of articulation, i.e. bilabial, alveolar, and velar, the velar plosives exhibit the longest VOT, whereas bilabials have the shortest (Smith, 1978). In contrast, Dravidian languages, i.e. languages predominantly spoken in the southern region of India, had the longest VOT for velars followed by bilabials and alveolars (Savithri et al., 2001).
Assessing automatic VOT annotation using unimpaired and impaired speech
Published in International Journal of Speech-Language Pathology, 2018
Esteban Buz, Adam Buchwald, Tzeviya Fuchs, Joseph Keshet
In empirical studies of normal and impaired speech production, measuring aspects of the acoustic or articulatory signal can be critical to understanding a pattern but is a bottleneck to completing the research. In this article, we consider measurement of voice onset time (VOT), one of the most highly studied speech measurements in both unimpaired and impaired speakers. VOT represents the duration between the release of a stop consonant and the onset of voicing that follows, and is the primary acoustic cue for encoding the voicing contrast cross-linguistically (Lisker & Abramson, 1964). This measure has been used for decades to study seminal issues such as cross-linguistic differences in speech (Lisker & Abramson, 1964) and how infants encode speech sound contrasts (Eimas, Siqueland, Jusczyk, & Vigorito, 1971). In addition to studies on the contrastive properties of VOT, the continuous nature of this measure has been widely used to study the gradient nature of speech production (Baese-Berk & Goldrick, 2009; Buz, 2016). We highlight one specific annotation tool, AutoVOT (Sonderegger & Keshet, 2012), and examine how it compares to traditional manual annotation of VOT based on a reanalysis of data from two articles: Buz (2016), who examined a large corpus of VOTs produced by unimpaired speakers; and Buchwald and Miozzo (2011), who studied systematic differences in how impaired speakers produced VOT in both accurate and error productions.
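One simple way to compare automatic and manual VOT annotations is to look at the per-token differences. The following is a minimal sketch under stated assumptions: it does not call AutoVOT or reproduce the authors' evaluation, the VOT values are invented for illustration, and the 5 ms tolerance and the function name `annotation_agreement` are hypothetical.

```python
from statistics import mean


def annotation_agreement(manual_ms, auto_ms, tolerance_ms=5.0):
    """Compare paired manual and automatic VOT measurements.

    Returns the mean absolute difference in ms and the proportion of
    tokens whose two measurements differ by at most tolerance_ms.
    """
    if len(manual_ms) != len(auto_ms) or not manual_ms:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [abs(a - m) for m, a in zip(manual_ms, auto_ms)]
    within = sum(d <= tolerance_ms for d in diffs) / len(diffs)
    return mean(diffs), within


# Hypothetical VOTs (ms) for the same four tokens.
manual = [60.0, 15.0, 80.0, 10.0]
automatic = [62.0, 14.0, 90.0, 10.0]
mad, proportion = annotation_agreement(manual, automatic)
print(f"mean abs diff = {mad} ms, within tolerance = {proportion:.0%}")
```

Reporting both numbers is a design choice: the mean alone can hide a few large disagreements, while the proportion within tolerance shows how often the two annotations are practically interchangeable.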
Characteristics of speech production in patients with T1 glottic cancer who underwent laser cordectomy or radiotherapy
Published in Logopedics Phoniatrics Vocology, 2018
Yong Tae Hong, Min Ju Park, Ki Hwan Hong
The GRBAS and discrimination scores were not significantly different between the two groups. The sustained vowel /e/ was analyzed using the MDVP, and the VOT, VD, and CD lengths of bilabial stops were analyzed with PRAAT. The ROC curve cut-off values of the laser cordectomy (LC) and radiotherapy (RT) groups were determined based on the results. The results are as follows. First, the Fo values were higher in the LC group and lower in the RT group. Second, the jitter values were higher in the LC group and lower in the RT group. Third, the VOT values of the bilabial stops and the VD values were longer in the LC group and shorter in the RT group. Laser surgery results in loss of tissue, while RT results in vocal fold fibrosis and muscle fatigue. In this study, even though stop consonant evaluation is not always performed in the clinic and the clinical difference might be subtle, significant differences can be seen in certain measured values at a group level. The changes in VOT and VD might result in decreased speech intelligibility in the RT group, and laser surgery might be superior to radiation therapy in terms of speech production.
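The excerpt does not say how the ROC curve cut-off values were chosen. As a rough illustration only, the sketch below picks the cut-off that maximizes Youden's J (sensitivity + specificity - 1), which is one common convention and an assumption here, not the authors' stated method. The VOT values and the function name `youden_cutoff` are hypothetical.

```python
def youden_cutoff(positive, negative):
    """Pick the cut-off maximizing Youden's J = sensitivity + specificity - 1.

    A value >= cut-off is classified as belonging to the positive group.
    Ties are resolved in favor of the lowest cut-off.
    Returns (cutoff, sensitivity, specificity).
    """
    best = None
    for cutoff in sorted(set(positive) | set(negative)):
        sensitivity = sum(v >= cutoff for v in positive) / len(positive)
        specificity = sum(v < cutoff for v in negative) / len(negative)
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical VOT values (ms); the group with longer values is "positive".
lc_group = [22, 25, 27, 30]
rt_group = [12, 15, 18, 24]
print(youden_cutoff(lc_group, rt_group))
```

Each observed value is tried as a candidate threshold, so the sketch effectively walks along the empirical ROC curve and keeps the point farthest above the chance diagonal.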