Introduction
Published in Sugato Basu, Ian Davidson, Kiri L. Wagstaff, Constrained Clustering, 2008
Sugato Basu, Ian Davidson, Kiri L. Wagstaff
Learning with pairwise constraints is also related to the semi-supervised learning problem, which attempts to leverage a large amount of unlabeled data to boost a classifier built from a small number of labeled examples. Work by Nigam et al. [18] handled unlabeled data by combining the EM algorithm [8] with a naive Bayes classifier to augment text classifiers, and demonstrated that unlabeled data can be used to improve the accuracy of text classification. Co-training [5] is one of the best-known multi-view semi-supervised learning algorithms. The idea of co-training is to incrementally update the classifiers of multiple views, allowing the redundant information across views to improve learning performance. Lafferty et al. [38] represented the labeled and unlabeled data as vertices in a weighted graph, where the edge weights encode the similarity between instances. For learning part-based appearance models, Xie et al. [31] extended the GMM model to the semi-supervised case, where most of the positive examples are corrupted with clutter but a small fraction are uncorrupted. Compared with these semi-supervised learning algorithms, algorithms leveraging pairwise constraints can utilize additional information about the relationship between pairs of examples beyond the unlabeled data itself. Recently, Zhang and Yan [36] proposed a transformation-based learning method for learning with pairwise constraints and showed that the optimal decision boundary can be consistently found as the number of pairwise constraints approaches infinity.
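To make the EM-plus-naive-Bayes idea of Nigam et al. [18] concrete, the following is a minimal sketch, assuming sparse bag-of-words matrices X_labeled and X_unlabeled (e.g., from a CountVectorizer), a label vector y_labeled, and scikit-learn. It runs a simplified soft-EM loop in which each unlabeled document counts fractionally toward every class via sample weights; it is an illustration of the general scheme, not the paper's exact weighting.

import numpy as np
from scipy.sparse import vstack
from sklearn.naive_bayes import MultinomialNB

def em_naive_bayes(X_labeled, y_labeled, X_unlabeled, n_iter=10):
    classes = np.unique(y_labeled)  # predict_proba columns follow this order
    nb = MultinomialNB().fit(X_labeled, y_labeled)
    for _ in range(n_iter):
        # E-step: posterior class probabilities for the unlabeled documents.
        posteriors = nb.predict_proba(X_unlabeled)
        # M-step: retrain on the labeled data plus each unlabeled document
        # replicated once per class, weighted by its posterior probability.
        X_aug = vstack([X_labeled] + [X_unlabeled] * len(classes))
        y_aug = np.concatenate(
            [y_labeled] + [np.full(X_unlabeled.shape[0], c) for c in classes])
        w_aug = np.concatenate(
            [np.ones(X_labeled.shape[0])]
            + [posteriors[:, i] for i in range(len(classes))])
        nb = MultinomialNB().fit(X_aug, y_aug, sample_weight=w_aug)
    return nb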
Machine Learning Basics
Published in Fei Hu, Qi Hao, Intelligent Sensor Networks, 2012
Krasimira Kapitanova, Sang H. Son
A single classifier can also be self-trained. Similar to the ensemble of classifiers, the single classifier is first trained on all labeled data and then applied to the unlabeled instances. Only those instances that meet a selection criterion are added to the labeled set and used for retraining.

Co-training: Co-training requires two or more views of the data, i.e., disjoint feature sets that provide different, complementary information about the instances (Blum and Mitchell 1998). Ideally, the two feature sets for each instance are conditionally independent, and each feature set should be sufficient to accurately assign each instance to its respective class. The first step in co-training is to use all labeled data to train a separate classifier for each view. Then, the most confident predictions of each classifier on the unlabeled data are used to construct additional labeled training instances. Co-training is a suitable algorithm when the features of the dataset naturally split into two sets; a minimal sketch follows this list.

Transductive SVMs: Transductive SVMs extend general SVMs in that they can also use partially labeled data for semi-supervised learning by following the principles of transduction (Gammerman et al. 1998). In inductive learning, the algorithm is trained on specific training instances, but the goal is to learn general rules that are then applied to the test cases. By contrast, transductive learning reasons from specific training cases to specific testing cases.

Graph-based methods: These algorithms utilize the graph structure obtained by capturing pairwise similarities between the labeled and unlabeled instances (Zhu 2007). They define a graph in which the nodes are the labeled and unlabeled instances, and the edges, which may be weighted, represent the similarity of the nodes they connect.
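The sketch below illustrates the generic co-training loop described above, assuming two NumPy feature matrices X1 and X2 (one per view), a label vector y in which -1 marks unlabeled instances, and scikit-learn's GaussianNB as the per-view classifier. It is a minimal illustration of the scheme, not Blum and Mitchell's exact algorithm.

import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_train(X1, X2, y, n_rounds=10, n_add=5):
    """Grow the labeled pool by alternating confident predictions
    between two views; -1 in y marks unlabeled instances."""
    y = y.copy()
    for _ in range(n_rounds):
        labeled = y != -1
        # Train one classifier per view on the current labeled pool.
        clf1 = GaussianNB().fit(X1[labeled], y[labeled])
        clf2 = GaussianNB().fit(X2[labeled], y[labeled])
        for clf, X in ((clf1, X1), (clf2, X2)):
            unlabeled = np.where(y == -1)[0]
            if unlabeled.size == 0:
                return clf1, clf2
            # Label the unlabeled points this view is most confident about.
            conf = clf.predict_proba(X[unlabeled]).max(axis=1)
            top = unlabeled[np.argsort(conf)[-n_add:]]
            y[top] = clf.predict(X[top])
    return clf1, clf2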
A Multi-View SVM Approach for Seizure Detection from Single Channel EEG Signals
Published in IETE Journal of Research, 2023
Gopal Chandra Jana, Mogullapally Sai Praneeth, Anupam Agrawal
Our Contributions: In this study, we propose a multi-view SVM model that utilizes information from two views of the dataset for seizure detection. In multi-view learning, an ML model learns features from multiple views of the same dataset. Multi-view learning algorithms can be categorized into (1) co-training, (2) co-regularization, and (3) margin-consistency techniques [13]. Co-training is a type of semi-supervised learning algorithm in which two classifiers are trained separately on two views of the dataset; it uses features of both labeled and unlabeled data and incrementally builds the two classifiers over the two views. Co-regularization adds a regularization term to the main cost function to ensure that the data from different views are consistent and that the predictions from the different views are close to each other. Margin-consistency techniques constrain the margin variables from the different views to be consistent by requiring the product of the output variables to be greater than every margin variable. In this paper, we use a modified co-regularization technique to build SVM-2K [14]. Two views of the dataset were created in the time and frequency domains using independent component analysis (ICA) and power spectral densities (PSD), respectively. Finally, the extracted time- and frequency-domain features were fed into the proposed multi-view SVM; a sketch of this two-view pipeline is given below. The performance of the proposed model is compared with single-view SVMs (time- and frequency-domain features individually) as well as with other relevant SVM-based state-of-the-art seizure detection models.
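As a rough illustration of the two-view pipeline (not the authors' exact SVM-2K coupling of [14]), the sketch below assumes raw single-channel EEG epochs in an array epochs of shape (n_epochs, n_samples), a label vector y, and SciPy/scikit-learn. It builds the ICA and PSD views and trains a baseline SVM per view, combining them by averaging the two decision scores rather than through a co-regularized objective.

import numpy as np
from scipy.signal import welch
from sklearn.decomposition import FastICA
from sklearn.svm import SVC

def make_views(epochs, fs=256):
    # View 1: time-domain features via ICA over the epoch matrix.
    view_time = FastICA(n_components=8, random_state=0).fit_transform(epochs)
    # View 2: frequency-domain features via Welch power spectral density.
    _, view_freq = welch(epochs, fs=fs, nperseg=min(256, epochs.shape[1]))
    return view_time, view_freq

def fit_two_view_svm(view_time, view_freq, y):
    svm_t = SVC(kernel="rbf").fit(view_time, y)
    svm_f = SVC(kernel="rbf").fit(view_freq, y)
    return svm_t, svm_f

def predict_two_view(svm_t, svm_f, view_time, view_freq):
    # Average the signed decision scores from the two views and threshold.
    scores = (svm_t.decision_function(view_time)
              + svm_f.decision_function(view_freq))
    return (scores > 0).astype(int)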