Caret – Knowledge and References

Unmanned Aircraft System (UAS) for Wetland Species Mapping

Published in Caiyun Zhang, Multi-sensor System Applications in the Everglades Ecosystem, 2020

Several machine-learning classifiers were examined for the classification procedure, but random forest (RF) produced the best results and was therefore selected and presented in this chapter. RF is a machine-learning classifier that combines an ensemble of decision trees (Breiman, 2001). RF is well suited to this task because it has been shown to be robust to parameter settings, small training samples, and uncertain data quality (Maxwell et al., 2018). RF has also proven successful for species classification using UAS imagery in previous studies (Feng et al., 2015; Lu and He, 2017). Parameter tuning and RF implementation were carried out in the free statistical software R (www.r-project.org/). Specifically, the caret package, which provides functions for machine-learning classification and regression, was used within R (Kuhn et al., 2016).
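
The idea of combining an ensemble of decision trees can be illustrated with a minimal sketch. This is not the chapter's code and not Breiman's full algorithm: it bags one-split trees (decision stumps) on a single feature and takes a majority vote, whereas a real random forest grows deeper trees and also samples a random subset of features at each split. The function names and the toy data are hypothetical and serve only as illustration.

```python
import random
from collections import Counter

def fit_stump(points):
    """Fit a one-split tree on (x, label) pairs by minimising training errors."""
    best = None
    for threshold in sorted({x for x, _ in points}):
        # Try both orientations: (label left of split, label right of split).
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= threshold else right) != y for x, y in points)
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right

def fit_ensemble(points, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(points) for _ in points]
        trees.append(fit_stump(sample))
    return trees

def predict(trees, x):
    """Return the label that receives the most votes across the ensemble."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (feature value, class label).
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (6.0, 1), (6.5, 1), (7.0, 1), (7.5, 1)]
forest = fit_ensemble(data)
print(predict(forest, 0.2), predict(forest, 8.0))
```

Because each stump sees a different bootstrap sample, individual stumps can disagree; the vote averages out those differences, which is one intuition for the robustness to small training samples noted above.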

Statistical learning and predictive analytics

Published in Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton, Texts in Statistical Science, 2017

Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton

The partykit::ctree() function builds a recursive partitioning model using conditional inference trees. The functionality is similar to rpart() but uses different criteria to determine the splits. The partykit package also includes a cforest() function. The caret package provides a number of useful functions for training and plotting classification and regression models. The glmnet and lars packages include support for regularization methods. The RWeka package provides an R interface to the comprehensive Weka machine learning library, which is written in Java.

Supervised learning

Published in Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton, Modern Data Science with R, 2021

Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton

The ctree() function from the partykit package builds a recursive partitioning model using conditional inference trees. The functionality is similar to rpart() but uses different criteria to determine the splits. The partykit package also includes a cforest() function. The caret package provides a number of useful functions for training and plotting classification and regression models. The glmnet and lars packages include support for regularization methods. The RWeka package provides an R interface to the comprehensive Weka machine learning library, which is written in Java.

Re-Analysis of Non-Small Cell Lung Cancer and Drug Resistance Microarray Datasets with Machine Learning

Published in Cybernetics and Systems, 2023

Çiğdem Erol, Tchare Adnaane Bawa, Yalçın Özkan

As seen in Figure 1, after each dataset was downloaded and the annotation files were obtained, the microarray datasets were normalized with the affy package (Gautier et al. 2004). Based on the annotation file and the nsFilter function of the genefilter package, each dataset was filtered with a cutoff value of 0.95 (Gentleman et al. 2021). Then, the expression dataset was prepared for analysis using the exprs function of GEOquery (Davis and Meltzer 2007). Support vector machine (svmRadial), k-nearest neighbor (knn), naïve Bayes, random forest (rf), C5.0 decision tree, multilayer perceptron (mlp), and artificial neural network with a principal component step (pcaNNet) algorithms were applied to the expression dataset. The 10 most important features (genes) for each model were determined through the caret package (Kuhn 2008, Kuhn 2012) (Table 2). Depending on the algorithm used, the caret package generates a default search grid for automatic hyperparameter tuning, and the model that performs best on the data is then selected automatically. The genes obtained were compared across the 6 datasets, and the common genes detected in different datasets are presented in Table 3 and Figure 2.
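
The last two steps (keeping the most important features per dataset, then looking for genes shared between datasets) can be sketched as follows. This is an illustrative sketch rather than the authors' R code: the function names, dataset labels, gene names, and importance scores are all hypothetical placeholders, and k is set to 2 only to keep the toy example short.

```python
from collections import Counter

def top_k(importance, k=10):
    """Return the k feature names with the highest importance scores."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]

def common_features(top_lists, min_datasets=2):
    """Return features that appear in the top list of at least min_datasets datasets."""
    counts = Counter(name for names in top_lists.values() for name in set(names))
    return sorted(name for name, n in counts.items() if n >= min_datasets)

# Hypothetical importance scores per dataset (not taken from the study).
importances = {
    "dataset_A": {"GENE1": 0.9, "GENE2": 0.7, "GENE3": 0.1},
    "dataset_B": {"GENE2": 0.8, "GENE4": 0.6, "GENE1": 0.2},
}

top = {name: top_k(scores, k=2) for name, scores in importances.items()}
shared = common_features(top)
print(top)
print(shared)
```

In this toy input only GENE2 ranks in the top two of both datasets, so it is the only shared feature reported.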

Sensitivity analysis of driving event classification using smartphone motion data: case of classifier type, sensor bundling, and data acquisition rate

Published in Journal of Intelligent Transportation Systems, 2022

Iman Taheri Sarteshnizi, Farbod Tavakkoli Khomeini, Borna Khedri, Amir Samimi

All computations were done using R 4.0.0 (R Core Team, 2013). As the performance of ML classifiers is affected by their hyperparameters, the caret package (Kuhn, 2015) was used to tune them. Caret applies a grid search method to find the best set of hyperparameters. In this method, all possible combinations in the search space are tested, and the best one is selected. Additionally, for the deep learning models, the Keras (Chollet, 2015) and TensorFlow (Abadi et al., 2016) packages were employed, along with the Adam optimizer. Other packages used for the rest of the classifiers are glmnet, mboost, class, e1071, kohonen, partykit, MASS, rrlda, gbm, randomForest, xgboost, cforest, and adabag. Interested readers are referred to Boehmke and Greenwell (2019) for more information in this regard.
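
The grid search described above can be sketched in a few lines. This is a language-agnostic illustration written in Python, not caret's API and not the authors' code: the scoring function stands in for a cross-validated performance estimate, and the hyperparameter names and candidate values are hypothetical.

```python
from itertools import product

def grid_search(score_fn, search_space):
    """Score every combination in search_space and return the best one."""
    names = list(search_space)
    best_params, best_score = None, float("-inf")
    # product() enumerates the full Cartesian grid of candidate values.
    for values in product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(mtry, ntree):
    """Hypothetical stand-in for a cross-validated accuracy estimate."""
    return 1.0 - 0.01 * (mtry - 4) ** 2 - 0.000001 * (ntree - 500) ** 2

# Hypothetical search space: 4 x 3 = 12 combinations are evaluated.
space = {"mtry": [2, 4, 6, 8], "ntree": [100, 500, 1000]}
best_params, best_score = grid_search(toy_score, space)
print(best_params, best_score)
```

The cost grows with the product of the number of candidate values per hyperparameter, which is why exhaustive grids are usually kept small.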

Multilevel weather detection based on images: a machine learning approach with histogram of oriented gradient and local binary pattern-based features

Published in Journal of Intelligent Transportation Systems, 2021

Md Nasim Khan, Anik Das, Mohamed M. Ahmed, Shaun S. Wulff

Detailed descriptions of the findings from this study are presented in the following section. First, preliminary investigations of the extracted features were conducted to examine significant differences among the image groups. Next, the results from the hyper-parameter tuning are discussed. Afterward, the performance of the trained machine learning models is described in terms of several performance indices. Subsequently, a comprehensive comparison of the computational cost of the different models is provided. Finally, the effect of the number of features on multilevel weather detection is investigated. It is worth mentioning that the HOG and LBP features were extracted using the Computer Vision Toolbox™ in MATLAB® version 9.8 (R2020a). Once the features were extracted, all analyses were conducted in the R® programming language, version 3.6.3. R® is open-source software for statistical computation and machine learning modeling. Most recent developments in statistics and machine learning are available in R® through different packages and can be used on most operating platforms, including Windows, macOS, and Linux. All the machine learning models were trained, validated, and tested using the “caret” package in R®.