Applications and Performance of Current Technology
Published in John Holmes, Wendy Holmes, Speech Synthesis and Recognition, 2002
Automatic language identification has applications for surveillance and monitoring of communications, which are of interest to the military, for example. To the authors’ knowledge there are not yet any automatic language recognition systems in commercial use. Potential applications include automatic routing of multilingual telephone calls. For example, calls to the emergency services could be directed to an operator who can converse in the relevant language. Language identification can also form a component of systems for multilingual speech recognition or spoken language translation, which have so far been demonstrated as research systems but which should achieve commercial realization in the future.
Indian language identification using time-frequency image textural descriptors and GWO-based feature selection
Published in Journal of Experimental & Theoretical Artificial Intelligence, 2020
Amit A. Chowdhury, Vaibhav S. Borkar, Gajanan K. Birajdar
In a language identification (LID) task, various features of speech, such as syntactic, acoustic, lexical, prosodic and phonotactic features, or a combination of these, are used to distinguish one language from another. The first attempt at identifying Indian languages was made in Balleda, Murthy, and Nagarajan (2000). The algorithm uses Mel-frequency cepstral coefficients (MFCCs) for LID of four south Indian languages and one national language (Hindi). In Dutta and Rao (2017), group delay features, namely the normal group delay (NGD), auto-regressive group delay (ARGD) and auto-regressive group delay with scale factor (ARGDSF) features, are used for LID alongside MFCCs. Performance improved when each of the group delay-based systems was combined independently with the MFCC-based system. Auto-associative neural networks (AANNs) for capturing language-specific features are explored in Mary, Rao, Gangashetty, and Yegnanarayana (2004). Prosodic features for obtaining language-specific information are also presented in that study.
CLOE: a cross-lingual ontology enrichment using multi-agent architecture
Published in Enterprise Information Systems, 2019
Mohamed Ali, Said Fathalla, Shimaa Ibrahim, Mohamed Kholief, Yasser F. Hassan
Language identification and separation: because the input text might contain text written in two or more natural languages, it is important to detect and separate the text in each language. Language identification plays a key role in several NLP applications. Depending on the language of the text, appropriate techniques are used for removing stop words and unnecessary words, part-of-speech (POS) tagging, and text annotation. For instance, given an input text T = ‘مشغل الاقراص لا يعمل your DVD drive may not be found’, T will be divided into two sentences, T1 = ‘مشغل الاقراص لا يعمل’ and T2 = ‘your DVD drive may not be found’, each with a label for its identified language. Various approaches have been used to tackle this task (Lui, Lau, and Baldwin 2014; Souter et al. 2017). The Apache Tika package is used for language identification.
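The separation step in the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it splits on changes of Unicode script (Arabic versus Latin) rather than calling Apache Tika, so it cannot tell apart two languages that share a script, and the names `script_of` and `split_by_script` are hypothetical helpers, not part of CLOE or Tika.

```python
import unicodedata


def script_of(ch):
    """Return a coarse script label for a letter, or None for neutral characters."""
    if not ch.isalpha():
        return None  # spaces, digits and punctuation do not decide the script
    name = unicodedata.name(ch, "")
    if name.startswith("ARABIC"):
        return "arabic"
    if name.startswith("LATIN"):
        return "latin"
    return "other"


def split_by_script(text):
    """Split text into (script, segment) pairs wherever the script of the letters changes.

    Assumption: a script change is treated as a language boundary. This is a
    heuristic stand-in for a real language identifier.
    """
    segments = []
    current_script, current_chars = None, []
    for ch in text:
        script = script_of(ch)
        if script is not None and current_script is not None and script != current_script:
            # Close the running segment before starting one in the new script.
            segments.append((current_script, "".join(current_chars).strip()))
            current_chars = []
        if script is not None:
            current_script = script
        current_chars.append(ch)
    if current_chars and current_script is not None:
        segments.append((current_script, "".join(current_chars).strip()))
    return segments


if __name__ == "__main__":
    mixed = "مشغل الاقراص لا يعمل your DVD drive may not be found"
    for script, segment in split_by_script(mixed):
        print(script, "|", segment)
```

Neutral characters such as spaces and digits are attached to whichever segment is currently open, which keeps each sentence intact. In a fuller pipeline, each segment would then be passed to a proper language identifier to obtain its language label.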
Tackling the multilingual and heterogeneous documents with the pre-trained language identifiers
Published in International Journal of Computers and Applications, 2023
Mohamed Raouf Kanfoud, Abdelkrim Bouramoul
Text language identification is the process of determining the language of a document or text (e.g. English vs. Spanish vs. Dutch). The following briefly describes three state-of-the-art language identifiers used for experimentation.
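As a rough illustration of the task defined above (and not of any of the three identifiers the article evaluates), a toy classifier can compare the character trigrams of an input text against small per-language profiles. The sample sentences, the `SAMPLES` table, and the functions `char_trigrams` and `identify` are all hypothetical and chosen only for this sketch; practical identifiers are trained on far larger corpora.

```python
def char_trigrams(text):
    """Return the set of distinct character trigrams in the lower-cased text."""
    lowered = text.lower()
    return {lowered[i:i + 3] for i in range(len(lowered) - 2)}


# Hypothetical, deliberately tiny reference samples (illustration only).
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and the cat is in the house with the children",
    "spanish": "el rápido zorro marrón salta sobre el perro perezoso y el gato está en la casa con los niños",
    "dutch": "de snelle bruine vos springt over de luie hond en de kat is in het huis met de kinderen",
}

# One trigram profile per language, built once from the samples above.
PROFILES = {lang: char_trigrams(sample) for lang, sample in SAMPLES.items()}


def identify(text):
    """Return the language whose profile shares the most trigrams with the text.

    Returns None when the text is too short to contain a trigram.
    """
    grams = char_trigrams(text)
    if not grams:
        return None
    return max(PROFILES, key=lambda lang: len(grams & PROFILES[lang]))


if __name__ == "__main__":
    for sentence in ("the dog is in the house",
                     "el gato está en la casa",
                     "de kat is in het huis"):
        print(sentence, "->", identify(sentence))
```

Counting shared trigrams is a simple design choice for a sketch of this size; it ignores trigram frequency, so it degrades quickly on short or noisy inputs and on languages absent from the profiles.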