Science-Guided Design and Evaluation of Machine Learning Models: A Case-Study on Multi-Phase Flows
Published in Knowledge-Guided Machine Learning (eds. Anuj Karpatne, Ramakrishnan Kannan, Vipin Kumar), 2023
Nikhil Muralidhar, Jie Bu, Ze Cao, Long He, Naren Ramakrishnan, Danesh Tafti, Anuj Karpatne
Previous work such as [15] has also explored incorporating prior domain knowledge to regularize the training loss of neural networks, demonstrating good generalization. In [1], a 'learning-from-examples' paradigm is employed to incorporate prior domain knowledge into the learning pipeline. A label-free approach was adopted in [33, 34], where a customized loss function directly incorporates domain knowledge as a source of weak supervision. Physics-informed neural networks (PINNs) [31, 32] are another recent line of research on direct modeling with domain knowledge, wherein Partial Differential Equation (PDE) constraints are added to the loss functions of neural networks as sources of domain-based supervision. Other efforts [16, 28] have incorporated physics-based loss functions to capture monotonicity constraints, while [14] incorporated energy-conservation principles as physics-based loss terms in the learning pipeline.
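As a brief illustration of the PINN idea (a generic sketch, not the specific models of [31, 32]), the PyTorch snippet below adds a PDE residual term to the usual data-fitting loss. The network `u_net`, the choice of the 1-D heat equation u_t = alpha * u_xx, and the weight `lambda_pde` are all illustrative assumptions.

```python
import torch

# Hypothetical network approximating u(t, x); any differentiable regressor works.
u_net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
alpha, lambda_pde = 0.1, 1.0  # assumed diffusivity and loss weight

def pinn_loss(t_data, x_data, u_data, t_col, x_col):
    # Supervised term on labeled data points.
    u_pred = u_net(torch.stack([t_data, x_data], dim=-1)).squeeze(-1)
    data_loss = torch.mean((u_pred - u_data) ** 2)

    # PDE residual u_t - alpha * u_xx at unlabeled collocation points,
    # computed with automatic differentiation.
    t = t_col.clone().requires_grad_(True)
    x = x_col.clone().requires_grad_(True)
    u = u_net(torch.stack([t, x], dim=-1)).squeeze(-1)
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    pde_loss = torch.mean((u_t - alpha * u_xx) ** 2)

    return data_loss + lambda_pde * pde_loss
```

The PDE term acts as domain-based supervision: it penalizes predictions that violate the governing physics even at points where no labels exist.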
Automated Creation of an Intent Model for Conversational Agents
Published in Applied Artificial Intelligence, 2023
Alberto Benayas, Miguel Angel Sicilia, Marçal Mora-Cantallops
Collecting good-quality data and labeling it properly is an expensive and time-consuming task. Previous works have aimed to reduce labeling cost and effort using various techniques. Mallinar et al. (2018) propose a labeling framework based on weak-supervision functions. Other works use techniques such as clustering (Chatterjee and Sengupta 2020; Shi et al. 2018), semi-supervised learning (Thomas 2009), transfer learning (Goyal, Metallinou, and Matsoukas 2018), and active learning (Settles 2009). Clustering in particular is one of the most popular methods for identifying patterns in data. Many algorithms are available for this purpose, among the most widely used being K-Means (Lloyd 1982), DBSCAN (Ester et al. 1996), and HDBSCAN (McInnes, Healy, and Astels 2017). Each has its own pros and cons. For example, K-Means is very fast and assigns every point to a cluster, but the number of clusters must be known beforehand and the clusters are assumed to be roughly spherical. Density-based algorithms like DBSCAN or HDBSCAN, on the other hand, are appropriate when the number of clusters is not known in advance; their drawback is that they do not assign every point to a cluster, so some points end up labeled as noise and left outside all clusters.
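A minimal scikit-learn sketch of the contrast described above; the toy dataset and parameter values are illustrative assumptions, not values from the paper.

```python
from sklearn.cluster import KMeans, DBSCAN
from sklearn.datasets import make_moons

# Two crescent-shaped clusters with a little noise: hard for K-Means,
# natural for a density-based method.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

# K-Means requires the number of clusters up front and favors roughly
# spherical clusters, so it tends to split the crescents incorrectly.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN infers the number of clusters from density; points in no dense
# region are labeled -1 (noise) and assigned to no cluster.
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)
print("DBSCAN labels found:", set(dbscan_labels))  # e.g. {0, 1}, plus -1 if noise
```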
Bearing fault diagnosis using weakly supervised long short-term memory
Published in Journal of Nuclear Science and Technology, 2020
Daisuke Miki, Kazuyuki Demachi
However, in practical fault-diagnosis tasks, the ground-truth anomaly score of the i-th data point is unknown. Therefore, our model is trained using a novel training method based on weak supervision, which is discussed in the following subsections.
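The excerpt defers the method to later subsections, so as a generic illustration only (not the authors' training method), the sketch below shows one common way to weakly supervise per-time-step anomaly scores from an LSTM: only a coarse sequence-level normal/faulty label is available, and a multiple-instance-style max pooling connects it to the point-wise scores. All names, shapes, and the pooling choice are assumptions.

```python
import torch

class LstmScorer(torch.nn.Module):
    """Hypothetical LSTM mapping a sensor sequence (B, T, F) to
    per-time-step anomaly scores in (0, 1)."""
    def __init__(self, n_features=1, hidden=32):
        super().__init__()
        self.lstm = torch.nn.LSTM(n_features, hidden, batch_first=True)
        self.head = torch.nn.Linear(hidden, 1)

    def forward(self, x):
        h, _ = self.lstm(x)                              # (B, T, hidden)
        return torch.sigmoid(self.head(h)).squeeze(-1)   # (B, T)

def weak_loss(scores, seq_labels):
    # Only a sequence-level label (0 = normal, 1 = faulty) is available,
    # so pool the per-step scores to one score per sequence and apply
    # binary cross-entropy to the pooled value.
    pooled = scores.max(dim=1).values  # MIL-style max pooling
    return torch.nn.functional.binary_cross_entropy(pooled, seq_labels)
```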