Prediction Models for Accurate Data Analysis: Innovations in Data Science
Published in Kavita Taneja, Harmunish Taneja, Kuldeep Kumar, Arvind Selwal, Eng Lieh Ouh, Data Science and Innovations for Intelligent Systems, 2021
Balwinder Kaur, Anu Gupta, R. K. Singla
Random Subspace Method: It is similar to Bagging. Rather than drawing samples from the training instances, the Random Subspace Method (RSM) draws samples from the feature space (Ho, 1998). In RSM, N features are selected at random from the M-dimensional feature vector, and the training subsets are built by replacing each instance in the training set with its N-dimensional vector of selected features. The base learners are then trained on these training subsets to create classifiers. Finally, all the classifiers are integrated using some combination method. RSM is a good choice when the training data set is small and when the data has many redundant features (Yang et al., 2004).
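As a rough illustration of the procedure just described (a minimal sketch, not code from the chapter), the class below selects N features at random for each base learner, trains a decision tree on the reduced training set, and integrates the members by majority vote; the name RandomSubspaceEnsemble, the decision-tree base learner, and the integer-coded class labels are illustrative assumptions.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

class RandomSubspaceEnsemble:
    """Minimal Random Subspace Method sketch: each base learner sees the
    full training set, but only N of the M available features."""

    def __init__(self, n_estimators=10, n_features=5, random_state=0):
        self.n_estimators = n_estimators
        self.n_features = n_features          # N features per base learner
        self.rng = np.random.default_rng(random_state)
        self.members = []                     # (feature_indices, fitted_tree) pairs

    def fit(self, X, y):
        n_total = X.shape[1]                  # M, dimensionality of the feature vector
        for _ in range(self.n_estimators):
            idx = self.rng.choice(n_total, size=self.n_features, replace=False)
            tree = DecisionTreeClassifier().fit(X[:, idx], y)
            self.members.append((idx, tree))
        return self

    def predict(self, X):
        # Integrate the classifiers with a simple majority vote
        # (assumes integer-coded class labels).
        votes = np.stack([tree.predict(X[:, idx]) for idx, tree in self.members])
        return np.apply_along_axis(
            lambda column: np.bincount(column.astype(int)).argmax(), 0, votes)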
Dynamic Fuzzy Rule-based Source Selection in Distributed Decision Fusion Systems
Published in Fuzzy Information and Engineering, 2018
F. Fatemipour, M. R. Akbarzadeh-T
In this test, we use different sets of features for each local classifier, hence heterogeneous. For this purpose, we use the Random Subspace method [45] to create the local data, i.e. a different set of features for each local classifier. Here we use ten local sources, each trained with 50% of the features, selected at random. The rule base is trained with the whole set of features of the validation set. Table 4 shows the average of the fivefold cross-validation results. For heterogeneous sources, we observe that the proposed approach works better than the others for six data sets. The average difference between FDSS and the best of the other approaches over all data sets is , and for top-down FDSS it is . FDSS also works better than top-down FDSS in eight data sets. The p-values of the test are also shown in Table 4. As the table shows, the two proposed algorithms maintain similar performance.
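A hedged sketch of how such heterogeneous local sources might be built is shown below; the function name, the decision-tree classifier, and the random seed are illustrative assumptions, not the local classifiers actually used in the paper.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def build_local_sources(X_train, y_train, n_sources=10, feature_fraction=0.5, seed=0):
    # Each local source is trained on the full training set but sees only
    # a randomly drawn 50% of the features, as in the setup described above.
    rng = np.random.default_rng(seed)
    n_features = X_train.shape[1]
    n_selected = max(1, int(feature_fraction * n_features))
    sources = []
    for _ in range(n_sources):
        idx = rng.choice(n_features, size=n_selected, replace=False)
        clf = DecisionTreeClassifier().fit(X_train[:, idx], y_train)
        sources.append((idx, clf))    # remember which features each source uses
    return sources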
A stacked ensemble learning method for traffic speed forecasting using empirical mode decomposition
Published in Journal of the Chinese Institute of Engineers, 2022
Mohammad-Ali Kianifar, Hassan Motallebi, Vahid Khatibi Bardsiri
In the first step of ensemble learning, we need to generate a diverse pool of base learners. We use the random subspace method to select different training samples and different feature subsets of the data. The random subspace method, a.k.a. attribute bagging or feature bagging, is an ensemble learning method that attempts to reduce the correlation between the base learners by training them on random subsets of features instead of the entire feature set (Ho 1998).
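One common way to realize feature bagging of this kind in practice is scikit-learn's BaggingRegressor with instance bootstrapping disabled and feature sampling enabled; the sketch below illustrates the general technique rather than the authors' implementation, and the decision-tree base learner and parameter values are placeholders.

from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

# Random subspace / feature bagging: keep every training sample, but give
# each base learner a random half of the features.
random_subspace = BaggingRegressor(
    DecisionTreeRegressor(),   # placeholder base learner
    n_estimators=20,
    max_samples=1.0,           # use all training samples ...
    bootstrap=False,           # ... without bootstrap resampling of instances
    max_features=0.5,          # each learner sees a random 50% of the features
    bootstrap_features=False,  # sample features without replacement
    random_state=0,
)
# Usage: random_subspace.fit(X_train, y_train); random_subspace.predict(X_test)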
Analysis of crash injury severity on two trans-European transport network corridors in Spain using discrete-choice models and random forests
Published in Traffic Injury Prevention, 2020
Bahar Dadashova, Blanca Arenas-Ramires, Jose Mira-McWillaims, Karen Dixon, Dominique Lord
The random forests method was proposed by Breiman (2001) and is considered to be one of the most efficient classification methods. The RF method has garnered mostly favorable reviews when compared to logistic regression, quadratic discriminant analysis, support vector machines, classification and regression trees, and others (Verikas et al. 2011). The random forests method is based on the bagging principle and the random subspace method (Breiman 2001) and relies on constructing a collection of decision trees with random predictors. The general architecture of random forests using decision trees is described below (Verikas et al. 2011):
i. Generate a bootstrap sample from the overall data to grow a tree by randomly selecting predictors (we will call this bootstrap sample a cluster).
ii. Use the randomly selected predictors at each node of the tree to vote for the class label at that node; at each node, only the one predictor providing the best split is selected.
iii. Run the out-of-bag data down the tree to obtain the misclassification rate, i.e. the out-of-bag error rate.
iv. Repeat i.-iii. for a large number of trees until the minimum out-of-bag error rate is obtained.
v. Assign each observation to a final class by a majority vote over the set of trees.
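For reference, steps i.-v. map onto off-the-shelf implementations such as scikit-learn's RandomForestClassifier; the snippet below is an illustrative sketch on synthetic data (not the crash-injury data analyzed in the paper), with the out-of-bag error corresponding to step iii.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

rf = RandomForestClassifier(
    n_estimators=500,      # grow a large number of trees (step iv)
    max_features="sqrt",   # predictors selected at random at each node (steps i-ii)
    bootstrap=True,        # each tree is grown on a bootstrap sample (step i)
    oob_score=True,        # out-of-bag error estimate (step iii)
    random_state=0,
).fit(X, y)

print("Out-of-bag error rate:", 1.0 - rf.oob_score_)
# Final class labels are obtained by a majority vote across the trees (step v).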