Natural Language Processing in Data Analytics
Published in Jay Liebowitz, Data Analytics and AI, 2020
NLP has been around for over six decades. The earliest success was in automatic translation: in 1954, a joint Georgetown University–IBM demonstration translated 60 Russian sentences into English. As interest in AI grew, NLP work from the late 1960s to the late 1970s entered a more sophisticated phase, with greater emphasis on representing and manipulating world knowledge and on constructing knowledge bases. Stimulated by the development of computational grammar theory in linguistics and by the use of logic for knowledge representation and reasoning in AI, and after repeated failures to build practical systems, NLP became more practically motivated in the 1980s.

In the 1990s, statistical NLP (Manning & Schütze, 1999) began to flourish, driven by the arrival of large amounts of data on the internet and the computing power to handle it. Corpus data, together with machine learning, greatly advanced statistical parsing, one of the core NLP tasks, and during this period NLP also made significant progress in other subtasks such as machine translation and information extraction.

Presently, NLP research and development have entered another new and exciting era due to the availability of rich computational resources, vast quantities of data, rapidly developing machine learning techniques and tools, and the emergence of many new application opportunities and challenges. From IBM’s “Watson” (2006) to Apple’s “Siri” (2011), from Amazon’s “Alexa” (2014) to Google’s “Google Assistant” (2016), AI powered by natural language processing and other technologies has gradually become part of our daily life and will continue to transform it in every conceivable field. The applications of NLP today are incredibly diverse.
They span fields such as machine translation (e.g., Google Translate), question answering (e.g., Apple Siri), information extraction, natural language generation (document summarization, chatbots), writing assistance, video scripting, text categorization, sentiment analysis, speech technologies (speech recognition and synthesis), hate speech detection, and fake news detection. In the following, we introduce some NLP applications in data analytics.
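To make one of the applications listed above concrete, here is a minimal, purely illustrative sketch of lexicon-based sentiment analysis. The tiny cue-word lists and the sample sentences are assumptions for the example; practical systems use learned models or large curated lexicons.

```python
# Minimal lexicon-based sentiment analysis: count positive and
# negative cue words and compare the totals. The word lists below
# are illustrative assumptions, not a real sentiment lexicon.

POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}

def sentiment(text):
    """Return 'positive', 'negative', or 'neutral' by counting cue words."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

label = sentiment("the product is great and I love it")  # "positive"
```

The same counting scheme extends naturally to weighted lexicons, where each cue word carries a polarity strength rather than a unit vote.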
Extraction and linking of motivation, specification and structure of inventions for early design use
Published in Journal of Engineering Design, 2023
Pingfei Jiang, Mark Atherton, Salvatore Sorce
Natural language processing (NLP) schemes developed in recent years aim to contribute to design research (Siddharth, Blessing, and Luo 2022a). NLP has been broadly applied in patent analysis to extract design-related data from patents. For example, Li et al. (2012) used NLP techniques to estimate the TRIZ level of invention for better classification. Fantoni et al. (2013) applied NLP to extract function, behaviour and state information from patent texts, enabling a graphical visualisation of a patent. Cao et al. (2016) used NLP to extract technical system components and their relationships to construct a design structure matrix. Li and Tate (2019) used part-of-speech tagging and statistical parsing techniques to identify functional requirements and design parameters. Jiang, Atherton, and Sorce (2021) utilised part-of-speech tagging and regular expression parsing to achieve automated patent functional modelling, identifying Subject-Action-Object (SAO) triplets from the independent claim of a patent. In this section, popular NLP techniques, sentiment analysis and word embeddings are reviewed.
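The combination of part-of-speech tagging and regular expression parsing can be sketched as follows. This is an illustrative toy, not the cited pipeline: the simplified tag set, the pattern, and the pre-tagged claim fragment are assumptions, whereas the work of Jiang, Atherton, and Sorce (2021) operates on full patent independent claims with a real tagger.

```python
# Toy Subject-Action-Object (SAO) extraction: flatten a POS-tagged
# token sequence into a tag string, then match a regular expression
# over the tags (here simply NOUN VERB NOUN) and map the match back
# to the corresponding words.
import re

def extract_sao(tagged):
    """Return (subject, action, object) for the first NOUN VERB NOUN run."""
    tags = " ".join(tag for _, tag in tagged)
    words = [w for w, _ in tagged]
    m = re.search(r"\bNOUN VERB NOUN\b", tags)
    if not m:
        return None
    # Count spaces before the match to recover the starting token index.
    start = tags[:m.start()].count(" ")
    return tuple(words[start:start + 3])

# Hypothetical pre-tagged fragment of a claim sentence.
tagged = [("lever", "NOUN"), ("engages", "VERB"),
          ("spring", "NOUN"), ("tightly", "ADV")]
sao = extract_sao(tagged)  # ("lever", "engages", "spring")
```

In practice the tag pattern would allow determiners, adjectives and compound nouns between the three slots, which is exactly what regular expression parsing over tag sequences makes convenient.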
Learning Bilingual Word Embedding Mappings with Similar Words in Related Languages Using GAN
Published in Applied Artificial Intelligence, 2022
Ghafour Alipour, Jamshid Bagherzadeh Mohasefi, Mohammad-Reza Feizi-Derakhshi
A critical obstacle to bilingual transfer is lexical matching between the source and target languages. Such lexical correspondences are unavailable for most language and dialect pairs, so discovering word mappings with no prior knowledge is extremely valuable for cross-lingual applications. Prior work has focused on word embeddings trained independently in each language on monolingual corpora, learning a linear transformation that maps embeddings from the source language to the target language using a small or medium-sized bilingual seed dictionary (Artetxe, Labaka, and Agirre 2016). The ability to place lexical items of two different languages in a shared cross-lingual space has pushed NLP research further. Word-level connections between languages have been used in cross-lingual transfer for statistical parsing (Ammar et al. 2016; Zeman et al. 2018) and for language understanding systems (Mrkšić et al. 2017), and later work reduced the supervision to a tiny seed bilingual dictionary (Artetxe, Labaka, and Agirre 2016; Kondrak, Hauer, and Nicolai 2017). However, these approaches do not achieve satisfactory accuracy and require more labeled data to obtain better results.
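The seed-dictionary mapping idea can be sketched with a toy example: fit a linear transformation W that sends source-language embeddings onto their target-language counterparts over the seed pairs. The 2-D vectors below are invented for illustration; real systems work with hundreds of dimensions, thousands of seed pairs, and usually constrain W to be orthogonal (solved in closed form via SVD, as in Artetxe, Labaka, and Agirre 2016) rather than using gradient descent.

```python
# Learn a linear mapping W minimising ||XW - Y||^2, where rows of X
# are source-language seed embeddings and rows of Y are their
# target-language translations. Pure-Python gradient descent.

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def learn_mapping(X, Y, lr=0.1, steps=500):
    """Minimise the squared mapping error over the seed pairs."""
    d = len(X[0])
    W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for _ in range(steps):
        XW = matmul(X, W)
        # Residual R = XW - Y; gradient of the loss is 2 X^T R.
        R = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(XW, Y)]
        G = matmul(transpose(X), R)
        W = [[w - lr * 2 * g for w, g in zip(rw, rg)] for rw, rg in zip(W, G)]
    return W

# Hypothetical 2-D seed embeddings: the target space is the source
# space rotated by 90 degrees, so the true W is known exactly.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]
W = learn_mapping(X, Y)
mapped = matmul([[2.0, 0.0]], W)[0]  # map an unseen source vector
```

Once W is learned, any source-language word vector, not just the seed entries, can be projected into the target space and matched to its nearest target-language neighbour, which is what enables dictionary induction beyond the seed.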