Labeling – Knowledge and References

Know Where to Start – Select the Right Project

Published in James Luke, David Porter, Padmanabhan Santhanam, Beyond Algorithms, 2022

James Luke, David Porter, Padmanabhan Santhanam

Labelled data is a key enabler for supervised ML technologies and labelling data can be a difficult and laborious task. In cases where you have an existing business process, it is usually possible to obtain labelled data based on historical decisions. In cases where a completely new business process is being defined, then obtaining the data will require resources to be assigned to manually label data. A really important point to remember is the need for the data to be consistently labelled. This may seem obvious, but in practice, it is rarely considered. It is not uncommon to review a failing ML project and to find that the ML is being trained with data that is inconsistent and contradictory. For example, there may be two identical images of a cat where one image is labelled “cat” and the other is labelled “Siamese”. Even worse, you may find one of the images labelled “dog”.
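The consistency problem described above can be checked mechanically before any training takes place. The following is a minimal sketch, not the authors' method: it assumes each sample is available as raw bytes paired with a label, and the helper name `find_label_conflicts` and the sample data are hypothetical. It groups samples by a content hash and reports any identical content that carries more than one label.

```python
import hashlib
from collections import defaultdict


def find_label_conflicts(samples):
    """Return {content_hash: sorted labels} for identical items with differing labels.

    `samples` is an iterable of (content_bytes, label) pairs (an assumed format).
    """
    labels_by_hash = defaultdict(set)
    for content, label in samples:
        digest = hashlib.sha256(content).hexdigest()
        labels_by_hash[digest].add(label)
    # Keep only content that was given more than one distinct label.
    return {h: sorted(ls) for h, ls in labels_by_hash.items() if len(ls) > 1}


# Hypothetical data: two byte-identical "images" with different labels.
samples = [
    (b"cat-image-bytes", "cat"),
    (b"cat-image-bytes", "Siamese"),
    (b"dog-image-bytes", "dog"),
]
conflicts = find_label_conflicts(samples)
for digest, labels in conflicts.items():
    print(digest[:8], labels)
```

Exact hashing only catches byte-identical duplicates. Near-duplicates, or labels at different levels of a hierarchy such as "cat" and "Siamese", would still need a labelling guideline and human review.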

Image Measurements

Published in Ravishankar Chityala, Sridevi Pudipeddi, Image Processing and Acquisition using Python, 2020

Ravishankar Chityala, Sridevi Pudipeddi

Labeling is used to identify different objects in an image. The image has to be segmented before labeling can be performed. In a labeled image, all pixels in a given object have the same value. For example, if an image comprises four objects, then in the labeled image all pixels in the first object have the value 1, all pixels in the second object have the value 2, and so on.
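To make the idea concrete, here is a small sketch of labeling applied to an already segmented (binary) image. It is written in plain Python for illustration and is not taken from the book: the function name `label_objects`, the choice of 4-connectivity, and the example mask are all assumptions. In practice a library routine such as `scipy.ndimage.label` would normally be used instead.

```python
from collections import deque


def label_objects(mask):
    """Label 4-connected foreground regions of a binary mask with 1, 2, 3, ...

    Returns (labels, count), where background pixels keep the value 0.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                # Unvisited foreground pixel: start a new object and flood-fill it.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Hypothetical segmented image containing four separate objects.
mask = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0],
]
labels, count = label_objects(mask)
for row in labels:
    print(row)
```

With 8-connectivity instead of 4-connectivity, diagonally touching pixels would be merged into one object, so the object count depends on that choice.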

Perception of rhythmic agency for conversational labeling

Published in Human–Computer Interaction, 2023

Christine Guo Yu, Alan F. Blackwell, Ian Cross

Labeling lays the foundation for the supervised training of machine-learning-based artificial intelligence (AI) algorithms (Brodley et al., 2012). The primary purpose of labeling is to construct a training dataset that exemplifies human subjective interpretation – considered to be the “ground truth” of human intelligence for AI tasks such as language interpretation, social judgments, creative expression or emotion classification. Based on these labeled datasets, AI classifiers emulate human intelligence and replicate human judgments (Blackwell, 2015; Ware et al., 2001). Well-established research resources have been constructed this way. For instance, the ImageNet database offers “millions of cleanly sorted images” to train computer vision and pattern recognition algorithms (Deng et al., 2009), while in more subjective tasks such as emotion recognition, human experts are recruited to label corpuses of naturalistic expressions in order to train affective computing systems that reflect human responses (Afzal & Robinson, 2014).

Natural language processing (NLP) in management research: A literature review

Published in Journal of Management Analytics, 2020

Yue Kang, Zhao Cai, Chee-Wee Tan, Qian Huang, Hefu Liu

Machine learning methods have been applied to text classification more than to any other task. These methods fall into two categories: supervised learning and unsupervised learning. Data labeling is a major challenge in supervised learning. Researchers can either download labeled data sets or outsource the labeling of data through Amazon MTurk (Ghose et al., 2019; Wang et al., 2018). The most popular supervised learning algorithms are Logistic Regression (LR), Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (DT), and Random Forest. Precision, recall, F1, AUC, and ROC are commonly used to compare the performance of different algorithms so researchers can choose the algorithm that works best for their purposes. K-means is the most frequently used unsupervised clustering algorithm (K-Nearest-Neighbour (KNN), often listed alongside it, is strictly a supervised method). Although there is no need to label data before clustering, interpreting clustering results is often difficult. Because clustering algorithms usually group items by the distance between vectors, their results can differ from those a researcher would reach through logical analysis. As a result, words that seem unrelated under logical analysis can appear in the same cluster. Researchers must be patient in identifying patterns in the results. The scikit-learn (sklearn) library in Python provides various kinds of classification and clustering algorithms.
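The precision, recall, and F1 measures mentioned above are all derived from counts of true positives, false positives, and false negatives. The sketch below computes them in plain Python for a binary task; the function name `precision_recall_f1` and the example labels are illustrative assumptions, not material from the article. In scikit-learn, `sklearn.metrics.precision_recall_fscore_support` reports the same quantities.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical human labels and classifier predictions for six documents.
y_true = ["pos", "pos", "neg", "neg", "pos", "neg"]
y_pred = ["pos", "neg", "neg", "pos", "pos", "neg"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="pos")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here there are two true positives, one false positive, and one false negative, so all three measures equal 2/3. Comparing algorithms on the same held-out labeled data with these measures is what allows one to be chosen over another.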

Human–Vehicle Cooperation in Automated Driving: A Multidisciplinary Review and Appraisal

Published in International Journal of Human–Computer Interaction, 2019

Francesco Biondi, Ignacio Alvarez, Kyeong-Ah Jeong

Machine learning has been applied to autonomous vehicles as a solution to multiple tasks, including low-level environment perception tasks, such as signal detection or pedestrian recognition, and high-level cognitive processes, such as path planning or conversational dialog managers in in-vehicle infotainment systems. The goal of peer-to-peer coordination between automated systems and vehicle occupants requires the successful application of artificial intelligence learning techniques, such as supervised learning, imitation learning, and reinforcement learning. Supervised learning makes use of labeled data to train a system to accomplish a particular task, the labeling being traditionally done via human annotation. This method has been used to estimate the cognitive level of the human operator across time (Mahi, Atkins, & Crick, 2017). This capability allows the vehicle to detect the emergence of cognitive stress in drivers and to respond by increasing its own level of autonomy, reducing demands on the driver’s attention. Supervised learning has also been used to teach a robot social affordances in daily human interactions through labeled videos, and to transfer that knowledge to human–robot interactions in unseen scenarios (Shu, Gao, Ryoo, & Zhu, 2017). This could enable vehicles to learn user interaction preferences and social behavior from outside the vehicle that could be applied to in-cabin scenarios.