Natural Language Processing and Translation Using Machine Learning
Published in Abid Hussain, Garima Tyagi, Sheng-Lung Peng, IoT and AI Technologies for Sustainable Living, 2023
Neural machine translation (NMT) uses an artificial neural network to translate. This deep learning approach considers entire sentences rather than individual words when translating, and a neural model typically needs only a fraction of the memory required by statistical approaches, making it much more efficient.
A Clinical Practice by Machine Translation on Low Resource Languages
Published in Satya Ranjan Dash, Shantipriya Parida, Esaú Villatoro Tello, Biswaranjan Acharya, Ondřej Bojar, Natural Language Processing in Healthcare, 2022
Rupjyoti Baruah, Anil Kumar Singh
A significant development in MT came in the 1990s, when companies such as IBM began to leverage statistical models that markedly improved translation quality. The corpus-based Statistical Machine Translation (SMT) approach learns automatically from examples: it searches for patterns in large collections of parallel texts and assigns a probability that a sentence in the target language is the translation of a given sentence in the source language. Building an SMT system requires a massive parallel corpus between source and target languages, aligned at the sentence level, and the quality of SMT depends heavily on the language pair and the specific domain being translated.

Building such corpora can be especially challenging in the healthcare industry, where there is enormous variation in named entities such as diseases, chemical compounds, active ingredients, gender, symptoms, dosage levels, dosage forms, routes of administration, dates, locations, location-species, and adverse reactions. SMT is also CPU-intensive and requires an extensive hardware configuration to run translation models at satisfactory performance. Companies therefore began to experiment with hybrid MT engines, which commonly combined SMT with rule-based MT (RBMT). These advances popularized MT technology and helped its adoption on a global scale.

The current state of the art in MT technology is Neural Machine Translation (NMT), which harnesses the power of Artificial Intelligence (AI) and uses neural networks to generate translations. Language translation technology is continuously changing, bringing new functionality and greater benefits to the medical industry. Thanks to its end-to-end training paradigm and the powerful modeling capacity of neural networks, NMT can produce results comparable to or even better than those of traditional MT systems.
NMT uses a single large neural network to model the entire translation process, removing the need for extensive feature engineering and employing continuous representations instead of the discrete symbolic representations used in SMT.
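The probability assignment that SMT performs can be illustrated with the classic noisy-channel decomposition, choosing the target sentence e that maximizes P(f|e)·P(e). The sketch below is a deliberately tiny illustration: the phrase table, language-model scores, and word-for-word alignment are invented for demonstration and do not come from any real system.

```python
import math

# Invented translation probabilities P(f_word | e_word) -- illustrative only.
phrase_table = {
    ("maison", "house"): 0.8,
    ("maison", "home"): 0.2,
    ("bleue", "blue"): 0.9,
}

# Invented unigram language-model probabilities P(e_word) -- illustrative only.
language_model = {"house": 0.6, "home": 0.3, "blue": 0.5}

def score(source_words, target_words):
    """Log P(f | e) + log P(e), assuming a word-for-word alignment."""
    logp = 0.0
    for f, e in zip(source_words, target_words):
        logp += math.log(phrase_table.get((f, e), 1e-9))  # translation model
        logp += math.log(language_model.get(e, 1e-9))     # language model
    return logp

def translate(source_words, candidates):
    """Pick the candidate translation with the highest combined score."""
    return max(candidates, key=lambda c: score(source_words, c))

best = translate(["maison", "bleue"],
                 [["house", "blue"], ["home", "blue"]])
print(best)  # ['house', 'blue'] -- the higher-probability candidate
```

Real SMT decoders search over phrase segmentations and reorderings rather than fixed candidate lists, but the scoring principle is the same.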
On the scalability of data augmentation techniques for low-resource machine translation between Chinese and Vietnamese
Published in Journal of Information and Telecommunication, 2023
The neural approach, often called Neural Machine Translation (NMT) (Bahdanau, Cho, & Bengio, 2015; Cho et al., 2014a; Sutskever, Vinyals, & Le, 2014), has achieved great success in recent years. Unlike traditional Statistical Machine Translation (SMT) (Brown, Pietra, Pietra, & Mercer, 1993; Koehn, Och, & Marcu, 2003), NMT uses continuous representations and is trained end-to-end. With recent developments in models such as the attention mechanism (Luong, Pham, & Manning, 2015; Vaswani et al., 2017), NMT has achieved state-of-the-art performance on multiple language pairs, especially major ones such as English–French and English–German (Edunov, Ott, Auli, & Grangier, 2018; Liu, Duh, Liu, & Gao, 2020). In the industrial field, NMT has been adopted in various commercial systems (Yang, Wang, & Chu, 2020).
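The attention mechanism cited above can be sketched in a few lines as scaled dot-product attention (in the style of Vaswani et al., 2017): each target position forms a context vector as a similarity-weighted average of source representations. The dimensions and random inputs below are illustrative, not taken from any particular NMT model.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attend over values V with weights from query/key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))  # 3 target positions, model dimension 4
K = rng.normal(size=(5, 4))  # 5 source positions
V = rng.normal(size=(5, 4))

context, attn = scaled_dot_product_attention(Q, K, V)
print(context.shape)  # (3, 4): one context vector per target position
```

In a full Transformer this operation is applied with learned projections and multiple heads, but the core weighted-average computation is exactly this.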
An Efficient Method for Generating Synthetic Data for Low-Resource Machine Translation
Published in Applied Artificial Intelligence, 2022
Thi-Vinh Ngo, Phuong-Thai Nguyen, Van Vinh Nguyen, Thanh-Le Ha, Le-Minh Nguyen
Neural Machine Translation (NMT) systems (Bahdanau, Cho, and Bengio 2015; Sutskever, Vinyals, and Le 2014; Vaswani et al. 2017) have recently achieved state-of-the-art results in many translation tasks. High-resource language pairs have shown impressive results, whereas low-resource language pairs perform poorly due to the lack of bilingual data; in some cases, datasets are not even available for research purposes. To address this problem, using monolingual data is considered an effective strategy for enhancing translation quality in low-resource settings.
Fully Unsupervised Machine Translation Using Context-Aware Word Translation and Denoising Autoencoder
Published in Applied Artificial Intelligence, 2022
Shweta Chauhan, Philemon Daniel, Shefali Saxena, Ayush Sharma
Machine Translation (MT) helps break the language barrier but requires parallel data sets. However, sizable bilingual corpora are largely restricted to high-resource languages such as English and Chinese, in contrast to low-resource languages. Unsupervised machine translation is an alternative approach in which the model is trained using only monolingual corpora. Neural Machine Translation (NMT) models (Bahdanau et al., 2015) are the current standard, and the most challenging part is training the system without substantial parallel corpora (Koehn and Knowles 2017).
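The denoising-autoencoder component mentioned in the chapter title rests on a simple corruption step: the model is trained to reconstruct a clean monolingual sentence from a noisy version of itself. A minimal sketch of such a noise function, assuming the common word-dropout-plus-local-shuffle recipe (the dropout rate and shuffle window below are illustrative choices, not values from the paper):

```python
import random

def add_noise(words, drop_prob=0.1, shuffle_window=3, seed=0):
    """Randomly drop words, then lightly shuffle the survivors."""
    rng = random.Random(seed)
    # Word dropout: each word is independently removed with prob drop_prob
    kept = [w for w in words if rng.random() >= drop_prob]
    # Local shuffle: jitter each position by at most shuffle_window slots,
    # then re-sort, so words only move a bounded distance
    keys = [i + rng.uniform(0, shuffle_window) for i in range(len(kept))]
    return [w for _, w in sorted(zip(keys, kept), key=lambda p: p[0])]

sentence = "the quick brown fox jumps over the lazy dog".split()
noisy = add_noise(sentence)
print(noisy)  # a corrupted sentence the autoencoder must learn to restore
```

Training the encoder-decoder to map `noisy` back to `sentence` teaches it to build sentence representations from monolingual text alone, with no parallel data involved.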