Explore chapters and articles related to this topic
Natural language processing techniques for translation
Published in Rei Miyata, Masaru Yamada, Kyo Kageura, Metalanguages for Dissecting Translation Processes, 2022
Text segments, such as sentences and paragraphs, contain inline elements that need special treatment in downstream tasks. Among them, formatting directives, such as font specifications and direct quotations, are usually given by the authors of documents; non-linguistic strings, such as numerical formulae, program code, and URLs, may also be annotated. In contrast, several types of elements, such as named entities, technical terms, and locale elements, are seldom annotated explicitly. The task of identifying these inline elements is formalised as sequence labelling, an instance of structured prediction. Formally,
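The sequence-labelling formulation mentioned above can be illustrated with a minimal sketch. It assumes the common BIO encoding (B = begins an element, I = continues it, O = outside any element), which the excerpt itself does not name; the function names, the example sentence, and the `URL` and `DATE` element types are hypothetical and chosen only for illustration.

```python
def spans_to_bio(n_tokens, spans):
    """Encode annotated spans as one label per token.

    spans: list of (start, end_exclusive, kind) over token indices.
    """
    labels = ["O"] * n_tokens
    for start, end, kind in spans:
        labels[start] = f"B-{kind}"
        for i in range(start + 1, end):
            labels[i] = f"I-{kind}"
    return labels


def bio_to_spans(labels):
    """Decode BIO labels back into (start, end_exclusive, kind) spans.

    An I- label that does not continue an open span of the same kind is ignored.
    """
    spans = []
    start, kind = None, None
    # The trailing "O" is a sentinel that closes a span ending at the last token.
    for i, label in enumerate(list(labels) + ["O"]):
        continues = label.startswith("I-") and label[2:] == kind
        if kind is not None and not continues:
            spans.append((start, i, kind))
            start, kind = None, None
        if label.startswith("B-"):
            start, kind = i, label[2:]
    return spans


# Hypothetical example: a URL and a date as inline elements.
tokens = ["Visit", "https://example.org", "before", "31", "March", "2022", "."]
annotated = [(1, 2, "URL"), (3, 6, "DATE")]
labels = spans_to_bio(len(tokens), annotated)
```

Under this encoding, identifying inline elements reduces to predicting one label per token, and the elements are recovered from the label sequence with `bio_to_spans`.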
Machine Learning Basics
Published in Peter Wlodarczak, Machine Learning and its Applications, 2019
Structured prediction produces structured objects rather than real or discrete values. The output is built from parts: in translation, for example, the output is built from words in the target language. Structured prediction has a wide variety of applications in natural language processing (NLP), speech recognition and computer vision, to name a few.
Structured prediction models for argumentative claim parsing from text
Published in Automatika, 2020
The tasks of argumentation mining involve the transformation of text into structured representations. These tasks are typically solved using structured prediction, a supervised machine learning paradigm that predicts structured objects such as sequences, trees, and graphs [43]. Conditional random fields (CRFs) are a powerful class of probabilistic modelling methods used for structured prediction [20]. Whereas a classifier predicts a label for an instance independently of other instances, a CRF can account for context. CRFs, particularly linear-chain CRFs, have been widely applied in NLP. Recent approaches to structured prediction rely on deep learning models. The long short-term memory (LSTM) network [21] is a recurrent neural network architecture with feedback connections that models sequences of data. LSTM networks that model data in both the forward and backward directions (BiLSTM) are often used to solve text classification problems [22] or sequence labelling problems [23, 24]. Distributed word representations [25] are often used as input features for such problems [26]. A popular alternative to probabilistic and deep learning models for structured prediction is chain classification [27]. Since the ordering of classifiers may significantly impact performance, ensembling of chain classifiers is often employed [28].
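The contrast drawn above between a classifier that labels each instance independently and a linear-chain model that accounts for context can be sketched with Viterbi decoding. This is only the decoding step under hand-set scores, not a trained CRF: the label set (O, B, I), the emission scores, and the transition penalty are all hypothetical values chosen to make the difference visible.

```python
def independent_argmax(emissions):
    """Pick the best label at each position, ignoring neighbouring labels."""
    return [max(range(len(s)), key=lambda k: s[k]) for s in emissions]


def viterbi(emissions, transitions):
    """Find the highest-scoring label sequence of a linear-chain model.

    emissions[t][y]:   score of label y at position t
    transitions[p][y]: score of moving from label p to label y
    """
    n_labels = len(transitions)
    best = list(emissions[0])  # best[y]: best score of any path ending in y
    backptrs = []
    for scores in emissions[1:]:
        new_best, ptrs = [], []
        for y in range(n_labels):
            prev = max(range(n_labels), key=lambda p: best[p] + transitions[p][y])
            ptrs.append(prev)
            new_best.append(best[prev] + transitions[prev][y] + scores[y])
        best = new_best
        backptrs.append(ptrs)
    # Follow the back-pointers from the best final label.
    y = max(range(n_labels), key=lambda k: best[k])
    path = [y]
    for ptrs in reversed(backptrs):
        y = ptrs[y]
        path.append(y)
    path.reverse()
    return path


# Hypothetical scores. Labels: 0 = O, 1 = B, 2 = I.
emissions = [
    [2.0, 0.0, 0.0],  # position 0 favours O
    [0.0, 1.0, 1.5],  # position 1 slightly favours I over B
    [0.0, 0.0, 2.0],  # position 2 favours I
]
transitions = [
    [0.0, 0.0, -10.0],  # O -> I is heavily penalised (I must follow B or I)
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
independent = independent_argmax(emissions)  # O, I, I: ill-formed sequence
joint = viterbi(emissions, transitions)      # O, B, I: respects the transition scores
```

In a trained linear-chain CRF the emission and transition scores would be learned from labelled data, and in a BiLSTM-CRF style model the emissions would come from the network; here they are fixed by hand purely to show how the transition term changes the decoded sequence.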
Data Science with Semantic Technologies: Application to Information Systems Development
Published in Journal of Computer Information Systems, 2023
Data Science is proving its usefulness in discovering knowledge and helping with decision-making. According to Dorr et al., Data Science is used to tackle several classes of problems [86]:
Detection: when the aim is to find data of interest in a given dataset.
Anomaly detection: when it is a question of identifying system states that force changes into a model.
Cleaning: when the aim is to eliminate errors, omissions, and inconsistencies in data or across datasets.
Alignment: when it is a question of relating different instances of the same object, especially for entity resolution across different data sources.
Data fusion: when the aim is to integrate different representations of the same real-world object in which encoding is well defined.
Identification and classification: when it is a question of determining the type or class to which an item of interest belongs.
Regression: when the aim is to find functional relationships between variables in order to predict a numerical value on a continuous scale.
Prediction: when the aim is to estimate one or more variables of interest at future times.
Structured prediction: when the intended output is a structured object rather than a numeric value.
Knowledge base construction: when it is a question of constructing a database with a predefined schema from any number of diverse inputs.
Density estimation: when the aim is to produce a distribution function, rather than a label or a value.
Joint inference: when it is a question of jointly optimizing predictors for different sub-problems using constraints that impose global consistency.