Natural Language Processing
Published in Rakesh M. Verma, David J. Marchette, Cybersecurity Analytics, 2019
Rakesh M. Verma, David J. Marchette
Ambiguity of Natural Languages. Before we begin a discussion of NLP techniques for security challenges, we must keep in mind that, even when the data is clean or error-free, applying NLP techniques can be quite challenging. This is due to the inherent ambiguity in most natural languages such as English, Tamil and Swahili. Ambiguity can be syntactic, semantic, or anaphoric. For example, the sentence “The man saw the dog with the telescope” has 14 different parse trees according to Collins.2 It is possible to construct a sentence with m clauses such that it has exponentially many, e.g., 3^m, different parses and consequently many different meanings. Polysemy, where a word can have many different meanings, is a semantic ambiguity that humans resolve easily and quickly from the context of the word, but that computers find very challenging. This task is called word sense disambiguation. In anaphoric ambiguity, a word or phrase refers back to something mentioned earlier in a piece of text or speech, and it may be unclear which earlier expression it refers to.
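As a concrete illustration of syntactic ambiguity, the sketch below parses the example sentence with a deliberately tiny grammar (an illustrative toy, not Collins' grammar) and prints each parse tree; it assumes the nltk package is installed.

```python
# Minimal sketch (toy grammar, not Collins'): the example sentence admits more
# than one parse tree. Assumes the 'nltk' package is installed.
import nltk

grammar = nltk.CFG.fromstring("""
S  -> NP VP
NP -> Det N | NP PP
VP -> V NP | VP PP
PP -> P NP
Det -> 'the'
N  -> 'man' | 'dog' | 'telescope'
V  -> 'saw'
P  -> 'with'
""")

parser = nltk.ChartParser(grammar)
tokens = "the man saw the dog with the telescope".split()
for tree in parser.parse(tokens):
    print(tree)  # one parse attaches the PP to the verb (instrument), the other to "the dog"
```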
Artificial Intelligence for Document Image Analysis
Published in Sk Md Obaidullah, KC Santosh, Teresa Gonçalves, Nibaran Das, Kaushik Roy, Document Processing Using Machine Learning, 2019
Himadri Mukherjee, Payel Rakshit, Ankita Dhar, Sk Md Obaidullah, KC Santosh, Santanu Phadikar, Kaushik Roy
Word sense disambiguation [51,52] is the process of determining the actual meaning of a word in a sentence when that word has multiple meanings. For instance, the word “bank” may mean “river bank” or “a place where we deposit money”. It is very important to resolve the meanings of individual words before interpreting a text in applications such as summarization and evaluation. For instance, in the sentence “I will go to the bank for a loan”, “bank” refers to the financial institution and not to a river bank. The meaning of a word is very context-dependent, and analysis of the neighboring text can help derive its meaning.
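As a minimal sketch of this idea, the snippet below applies NLTK's simplified Lesk algorithm (one common baseline that scores WordNet senses by gloss overlap with the surrounding words; it is not the method discussed in this chapter) to the “bank” example; it assumes nltk and its WordNet data are installed.

```python
# Minimal sketch: simplified Lesk from NLTK, one common WSD baseline.
# Assumes 'nltk' is installed and WordNet data has been fetched, e.g. nltk.download('wordnet').
from nltk.corpus import wordnet as wn
from nltk.wsd import lesk

context = "I will go to the bank for a loan".split()
sense = lesk(context, "bank", pos=wn.NOUN)  # picks the synset whose gloss overlaps the context most
print(sense, "-", sense.definition() if sense else "no sense selected")
```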
Machine Learning Applications
Published in Peter Wlodarczak, Machine Learning and its Applications, 2019
There is a lot of interest in natural language processing in research and practice because it simplifies human-machine interaction; however, there are still limits to what natural language processing can do today. Machines are capable of understanding spoken commands or determining the subject of a text, but while a sentence or a text has meaning to a human, to a machine it is an array of characters. A machine does not understand the context, which makes word sense disambiguation a tricky task. The word “meeting” can be a noun, as in “we are in a meeting”, or a verb, as in “we are meeting in my office”.
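The noun/verb ambiguity of “meeting” is typically resolved by a part-of-speech tagger that looks at the surrounding words. The sketch below shows this with NLTK's default tagger; it assumes the nltk package plus its 'punkt' and 'averaged_perceptron_tagger' data are installed.

```python
# Minimal sketch: a part-of-speech tagger separating the noun and verb uses of "meeting".
# Assumes 'nltk' plus its 'punkt' and 'averaged_perceptron_tagger' data are installed.
import nltk

for text in ["we are in a meeting", "we are meeting in my office"]:
    tags = nltk.pos_tag(nltk.word_tokenize(text))
    print([(word, tag) for word, tag in tags if word == "meeting"])  # expected: NN (noun) vs. VBG (verb)
```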
Representing word meaning in context via lexical substitutes
Published in Automatika, 2021
While lexical substitution (LS) intuitively appears to be a sensible approach to representing word meaning in context, it is by no means evident how it relates to sense-based representation. However, determining the correspondence between substitute- and sense-based meaning is important for at least two reasons. Firstly, many practical NLP applications require, for a given word in context, to explicitly identify its sense from a sense inventory such as WordNet, as in the word sense disambiguation [13] task, or to group together contexts pertaining to the same sense, as in the word sense induction (WSI) [14] task. Secondly, even when detecting senses is not an end goal in itself, it is important to have a way of validating substitution-based representations, which can be achieved by comparing them to the more established sense-based representation.
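For readers unfamiliar with the WSI task mentioned above, the following toy sketch frames it as clustering the contexts of an ambiguous word (here over TF-IDF features rather than the article's lexical substitutes); the sentences are invented and scikit-learn is assumed to be available.

```python
# Toy sketch of word sense induction as context clustering (not the article's substitute-based
# method). Invented example sentences; assumes scikit-learn is installed.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

contexts = [
    "I deposited the cheque at the bank this morning",
    "the bank approved my loan application",
    "we had a picnic on the grassy bank of the river",
    "fish were jumping near the muddy bank of the stream",
]
X = TfidfVectorizer(stop_words="english").fit_transform(contexts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # contexts sharing a cluster id are treated as occurrences of the same induced sense
```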
A Machine Reasoning Algorithm for the Digital Analysis of Alchemical Language and its Decknamen
Published in Ambix, 2022
While alchemical symbols could theoretically be arbitrary,30 the order in which they are grouped is intentional, and this is how the symbols derive meaning.31 This same linguistic principle has been identified by scholars with regard to the problem of word-sense disambiguation in general – a word can be recognised by the company it keeps, and thus the meaning of a word can be determined by looking at its direct linguistic context.32 In computational linguistics the most common method of such concordancing is referred to as “keyword in context” (KWIC). This paper proposes that alchemical Decknamen can be disambiguated using a variation of this principle. However, instead of investigating the linguistic context of the words directly surrounding a term, a computer approach to alchemical Decknamen would examine the “context of Decknamen” via a KOS containing all instances of alchemical expert vocabulary surrounding a word. Such a “context of Decknamen” could mean digitally annotating the five alchemical terms that precede and follow the term in question and subsequently encoding them in a digital KOS to allow a computer to perform simple machine reasoning tasks. This would require a certain amount of historical context, such as “Michael Maier was an iatrochemist” or “Hermes Trismegistos is a mythological figure”, expressed in the form of subject-verb-object relations (X is a Y, for example) and encoded in a Resource Description Framework33 (RDF)-based digital ontology. The more difficult an example is to disambiguate, the more relations would need to be written in the form of the aforementioned statements for the machine to have adequate information to proceed.
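To make the two ingredients of this proposal concrete, the sketch below shows a plain ±5-term “keyword in context” window and a pair of subject-verb-object background facts stored as RDF triples with the rdflib library. The token sequence, term names and namespace URI are hypothetical placeholders, not the paper's actual corpus or KOS.

```python
# Illustrative sketch only: a +/-5-term KWIC window plus background facts as RDF triples.
# The tokens, terms and namespace are hypothetical; assumes the 'rdflib' package is installed.
from rdflib import Graph, Namespace

def kwic(tokens, keyword, window=5):
    """Return the `window` terms before and after each occurrence of `keyword`."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            hits.append((tokens[max(0, i - window):i], tok, tokens[i + 1:i + 1 + window]))
    return hits

tokens = "the green lion devours the sun in the work of Maier".split()
print(kwic(tokens, "lion"))

# Subject-verb-object background knowledge, e.g. "Michael Maier was an iatrochemist".
EX = Namespace("http://example.org/alchemy/")  # hypothetical namespace
g = Graph()
g.add((EX.MichaelMaier, EX.isA, EX.Iatrochemist))
g.add((EX.HermesTrismegistos, EX.isA, EX.MythologicalFigure))
print(len(g))  # number of encoded relations
```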