Bootstrap Methods and Their Deployment in SAS and R
Published in Tanya Kolosova, Samuel Berestizhevsky, Supervised Machine Learning, 2020
Tanya Kolosova, Samuel Berestizhevsky
When using the m-out-of-n bootstrap method to generate multiple training and testing datasets, we strive to increase the variability among sample datasets by relying on the random selection of observations. There is always a chance that two randomly selected samples are composed of very similar subsets of observations, and the larger the sample size, the higher the chance of such similarity. Applying the same machine learning method to two very similar training datasets will most likely produce very similar classifiers. This is not a problem in itself, but our interest is in creating different classifiers, and the estimation process is computationally expensive. We suggest applying the Jaccard similarity coefficient, also called the Jaccard index, to identify highly similar datasets and to exclude or regenerate one of them.
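The screening step described in this excerpt can be sketched in a few lines. This is a minimal illustration and not the authors' SAS or R implementation: the function names (`jaccard`, `draw_sample`, `diverse_samples`), the similarity threshold of 0.5, and the retry limit are all assumptions made for the example. Each sample is compared through the set of distinct observation indices it contains.

```python
import random


def jaccard(a, b):
    """Jaccard index of two collections, treated as sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention assumed here: two empty sets count as identical
    return len(a & b) / len(union)


def draw_sample(n, m, rng):
    """Draw m indices from range(n) with replacement (m-out-of-n bootstrap)."""
    return [rng.randrange(n) for _ in range(m)]


def diverse_samples(n, m, k, threshold=0.5, max_tries=1000, seed=0):
    """Collect k samples whose pairwise Jaccard index stays at or below threshold.

    A candidate that is too similar to an already accepted sample is
    discarded and a fresh one is drawn (the "regenerate" option).
    """
    rng = random.Random(seed)
    accepted = []
    tries = 0
    while len(accepted) < k:
        if tries >= max_tries:
            raise RuntimeError("could not find enough dissimilar samples")
        tries += 1
        candidate = draw_sample(n, m, rng)
        if all(jaccard(candidate, s) <= threshold for s in accepted):
            accepted.append(candidate)
    return accepted


# Hypothetical sizes: 5 samples of 200 draws each from 1000 observations.
samples = diverse_samples(n=1000, m=200, k=5, threshold=0.5)
```

Comparing index sets is cheap relative to fitting a classifier, so the check can run before any training starts. A lower threshold enforces more diversity but rejects more candidates, which is why the sketch caps the number of attempts.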
Similarity Principle—The Fundamental Principle of All Sciences
Published in Mark Chang, Artificial Intelligence for Drug Development, Precision Medicine, and Healthcare, 2020
The Jaccard index (Tanimoto index), also known as intersection over union, is defined as the size of the intersection divided by the size of the union of the sample sets: $J(A,B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}$.
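A small worked example may make the two equivalent forms of the definition concrete. The sets `A` and `B` and the function names below are made up for illustration, and the sketch assumes at least one of the sets is non-empty, since the ratio is undefined otherwise.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b)


def jaccard_inclusion_exclusion(a: set, b: set) -> float:
    """Same quantity, using |A ∪ B| = |A| + |B| - |A ∩ B|."""
    inter = len(a & b)
    return inter / (len(a) + len(b) - inter)


A = {1, 2, 3, 4}
B = {3, 4, 5, 6, 7}

# |A ∩ B| = 2 and |A ∪ B| = 4 + 5 - 2 = 7, so both forms give 2/7.
print(jaccard(A, B))
print(jaccard_inclusion_exclusion(A, B))
```

The second form avoids building the union explicitly, which can matter when only the set sizes and the overlap count are available.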
Automatic Feature Selection for Coronary Stenosis Detection in X-Ray Angiograms Using Quantum Genetic Algorithm
Published in Siddhartha Bhattacharyya, Mario Köppen, Elizabeth Behrman, Ivan Cruz-Aceves, Hybrid Quantum Metaheuristics, 2022
Siddhartha Bhattacharyya, Mario Köppen, Elizabeth Behrman, Ivan Cruz-Aceves
As a second accuracy metric, the Jaccard index is used. The Jaccard index measures how similar two sets of elements are. It is useful for measuring the efficiency of a classifier by comparing the obtained results with those expected. It is computed as follows: $J(A,B) = \frac{|A \cap B|}{|A \cup B|}$.
Semantic segmentation of high-resolution remote sensing images using fully convolutional network with adaptive threshold
Published in Connection Science, 2019
Zhihuan Wu, Yongming Gao, Lei Li, Junshi Xue, Yuntao Li
To evaluate the performance of the algorithm, the results of the model should be compared with the ground truth. In this paper, the Average Jaccard Index is used to compare the predicted value with the true value. The Jaccard index, also known as Intersection over Union and the Jaccard similarity coefficient, is a statistic used for comparing the similarity and diversity of sample sets. The Jaccard coefficient measures similarity between finite sample sets and is defined as the size of the intersection divided by the size of the union of the sample sets. The quantities that enter the computation are the number of classes and, for each class, the numbers of true positives, false positives, and false negatives.
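The excerpt does not reproduce the formula itself, so the following is a sketch of one common formulation, assumed here and not confirmed as the paper's exact definition: for each class, the Jaccard index is TP / (TP + FP + FN), and the average is taken over the classes. The function name `mean_jaccard`, the label encoding as integers, and the choice to skip classes absent from both label lists are assumptions for the example.

```python
def mean_jaccard(y_true, y_pred, num_classes):
    """Average the per-class Jaccard index TP / (TP + FP + FN) over classes."""
    scores = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = tp + fp + fn
        if denom == 0:
            continue  # assumption: skip classes absent from both label lists
        scores.append(tp / denom)
    if not scores:
        raise ValueError("no class occurs in either label list")
    return sum(scores) / len(scores)


# Hypothetical per-pixel labels for a three-class segmentation.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

# Per-class values: class 0 -> 1/3, class 1 -> 2/3, class 2 -> 1/2.
score = mean_jaccard(y_true, y_pred, num_classes=3)
```

For each class, TP + FP + FN is the size of the union of the predicted and true pixel sets, and TP is the size of their intersection, so the per-class ratio matches the set definition given in the text.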