Reliable Biomedical Applications Using AI Models
Published in Punit Gupta, Dinesh Kumar Saini, Rohit Verma, Healthcare Solutions Using Machine Learning and Informatics, 2023
Shambhavi Mishra, Tanveer Ahmed, Vipul Mishra
The distance measure used to locate the K-nearest neighbour query points has a significant impact on the KNN classifier’s performance [16]; in practice, the standard Euclidean distance is most frequently used. [17] makes use of a large amount of historical data to make diagnoses, focusing on a dedicated algorithm to estimate the probability that a specific condition occurs; applying the KNN algorithm improves the accuracy of such diagnoses. The KNN method can also improve automated diagnostics, such as systems that must distinguish between diseases with similar symptoms. In the biomedical domain, KNN is applied to common chronic diseases such as diabetes, cancer, and heart disease. Figure 8.2 describes different algorithms used in the healthcare sector.
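To make the effect of the distance measure concrete, here is a minimal KNN sketch in Python in which the metric is passed in, so Euclidean and Manhattan distance can be swapped; the feature vectors and disease labels are toy values for illustration only, not clinical data:

```python
from collections import Counter

def knn_predict(train, query, k, dist):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda sample: dist(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

# Toy "symptom" feature vectors with hypothetical disease labels.
train = [((1.0, 2.0), "diabetes"), ((1.2, 1.8), "diabetes"),
         ((4.0, 5.0), "heart disease"), ((4.2, 4.8), "heart disease")]

print(knn_predict(train, (1.1, 2.1), k=3, dist=euclidean))  # diabetes
print(knn_predict(train, (4.1, 4.9), k=3, dist=manhattan))  # heart disease
```

Because `dist` is a parameter, the same classifier can be evaluated under different metrics, which is exactly the design choice the cited work identifies as performance-critical.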
An Optimal Diabetic Features-Based Intelligent System to Predict Diabetic Retinal Disease
Published in Ayodeji Olalekan Salau, Shruti Jain, Meenakshi Sood, Computational Intelligence and Data Sciences, 2022
M. Shanmuga Eswari, S. Balamurali
KNN is a straightforward procedure used for both classification and regression. It is a non-parametric lazy learner: it makes no parametric assumptions about the data and builds no explicit model from the training history. It works as a supervised lazy learner based on a similarity (distance) measure: it stores all the data and classifies new instances according to their similarity to the stored ones. Its accuracy depends on the choice of k and the Euclidean distances to the k nearest neighbours. The following steps were implemented on the diabetic dataset to obtain results:
- assign the number of neighbours k
- calculate the Euclidean distance to each stored data point
- count the data points in each category among the k nearest
- assign the new data point to the category with the maximum number of neighbours
- create the model
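The steps above can be sketched in pure Python; the glucose/BMI values and outcome labels below are hypothetical stand-ins, not records from the actual diabetic dataset:

```python
import math
from collections import Counter

def classify_new_point(dataset, new_point, k=5):
    # Step 1: the number of neighbours k is fixed by the caller (default k = 5).
    # Step 2: Euclidean distance from the new point to every stored sample.
    distances = [(math.dist(features, new_point), label)
                 for features, label in dataset]
    # Step 3: count how many of the k closest samples fall in each category.
    k_nearest = sorted(distances)[:k]
    counts = Counter(label for _, label in k_nearest)
    # Step 4: assign the new point to the category with the most neighbours.
    return counts.most_common(1)[0][0]

# Hypothetical diabetic dataset: (glucose, BMI) -> outcome label.
data = [((148, 33.6), "diabetic"), ((85, 26.6), "non-diabetic"),
        ((183, 23.3), "diabetic"), ((89, 28.1), "non-diabetic"),
        ((137, 43.1), "diabetic"), ((116, 25.6), "non-diabetic")]

print(classify_new_point(data, (150, 34.0), k=3))  # diabetic
```

Note that "creating the model" (the final step) amounts to nothing more than storing the data: as a lazy learner, KNN defers all computation to prediction time.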
Breast Cancer Detection Using Machine Learning and Its Classification
Published in Meenu Gupta, Rachna Jain, Arun Solanki, Fadi Al-Turjman, Cancer Prediction for Industrial IoT 4.0: A Machine Learning Perspective, 2021
Ashish Kumar, Ruchir Ahluwalia
In this chapter, we briefly analyzed various machine learning-based techniques for breast cancer detection. We deduced that breast cancer and its implications can be treated effectively if detected at an early stage. Work from various researchers was extensively reviewed, and we concluded that, among machine learning methods, SVM, which separates cancerous data from normal data, can be considered the best classifier. ANN is a classifier that gives SVM tough competition in terms of accuracy and performance; in certain test cases ANN even outperformed SVM, but usually SVM marginally overshadowed ANN. We determined that the two perform roughly on par, yet SVM is still preferred owing to its ease of implementation and slightly better performance, so ANN ranks second among potential classifiers for breast cancer detection. In the next line of research, KNN was also utilized as a classifier; however, the results show that KNN falls short of SVM and ANN in accuracy. Nevertheless, owing to its ease of implementation, KNN is also exploited as a classifier: it ranks below both SVM and ANN in accuracy, but in ease of implementation and speed of results it ranks above ANN, though still below SVM. Finally, the WHAVE technique belongs to a less commonly used family of classifiers; it provided a high accuracy of 99.8%, although it is highly complex to implement.
Phylogenetic analyses of 41 Y-STRs and machine learning-based haplogroup prediction in the Qingdao Han population from Shandong province, Eastern China
Published in Annals of Human Biology, 2023
Guang-Yao Fan, De-Zhi Jiang, Yao-Heng Jiang, Wei Song, Ying-Yun He, Nixon Austin Wuo
The k-nearest neighbour (kNN) is a non-parametric supervised learning method that is helpful for both regression and classification (Altman 1992). Many prior studies noted its potential for assigning a haplogroup to each Y-STR haplotype (Song et al. 2019b; Yin et al. 2022), and its applicability was validated by a recent study (Fan 2022). To further enhance the predictive performance of the kNN model, a substantial training dataset was analysed using the “knn” package in the R statistical environment (Zhang 2016). The developed kNN predictor includes 23 common Y-STR loci and the corresponding Y haplogroups from 3,248 Han males (Lang et al. 2019; Song et al. 2019a; Yin et al. 2020, 2022; Zhang et al. 2020). The algorithms were implemented using the R script available on GitHub (https://github.com/fanyoyo1983/knn-Y-haplogroup.git). Multi-copy loci and copy number variation (CNV) markers were excluded from the machine learning analysis. The specificity and sensitivity of the kNN predictor were measured for each predicted haplogroup, and the performance was also presented as a confusion matrix.
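As a rough sketch of how per-class sensitivity and specificity are derived from a confusion matrix in the one-vs-rest sense, here is a small Python example; the haplogroup labels and calls below are made up for illustration and are not data from the study:

```python
def confusion_matrix(y_true, y_pred, labels):
    """Counts of (true class -> predicted class)."""
    cm = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def sensitivity_specificity(cm, labels, cls):
    """One-vs-rest sensitivity and specificity for a single class."""
    tp = cm[cls][cls]
    fn = sum(cm[cls][p] for p in labels if p != cls)
    fp = sum(cm[t][cls] for t in labels if t != cls)
    tn = sum(cm[t][p] for t in labels for p in labels
             if t != cls and p != cls)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up haplogroup calls, purely to illustrate the metric calculation.
true = ["O2", "O2", "C2", "O1b", "C2", "O2"]
pred = ["O2", "C2", "C2", "O1b", "C2", "O2"]
labels = ["O2", "C2", "O1b"]

cm = confusion_matrix(true, pred, labels)
sens, spec = sensitivity_specificity(cm, labels, "O2")
print(round(sens, 3), round(spec, 3))  # 0.667 1.0
```

Repeating the second function over every label gives the per-haplogroup performance table that a confusion matrix summarises.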
Prediction and associated factors of hypothyroidism in systemic lupus erythematosus: a cross-sectional study based on multiple machine learning algorithms
Published in Current Medical Research and Opinion, 2022
Ting Huang, Siyang Liu, Jian Huang, Jiarong Li, Guixiong Liu, Weiru Zhang, Xuan Wang
Each machine learning model works differently. Regression is a fitting process (essentially a function estimation problem) that determines the relationship between the dependent and independent variables. The main idea of logistic regression is to establish a regression formula for the classification boundary based on the existing data18. SVM is based on the Vapnik–Chervonenkis dimension theory of statistical learning and the principle of structural risk minimization. Given limited sample information, SVM aims to find the best compromise between the complexity of the model (the learning accuracy on a specific training sample) and the learning ability (the ability to classify unseen samples without error) in order to obtain the best generalization ability19,20. Decision trees are structured such that each branch corresponds to a judgement condition: starting from the root node of the tree, decisions are judged through layers of branches, finally reaching a leaf node (the result). Random forest, an ensemble learning method based on decision trees, randomly selects certain features to build each independent decision tree and combines multiple decision tree models21. Its working principle is to generate multiple decision tree models, whose features, learning, and predictions are independently selected, and finally to combine the models into a single prediction result. Finally, KNN is a supervised classification method that assigns a sample to the class most common among its nearest neighbours, as measured by the distance between feature vectors.
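The random forest principle described here, independent trees grown on bootstrap samples with randomly chosen features and combined by majority vote, can be sketched in Python; depth-1 "stump" trees stand in for full decision trees, and the feature values and hypothyroidism labels are hypothetical:

```python
import random
from collections import Counter

def bootstrap(data, rng):
    """Random sample with replacement; retry until both classes appear."""
    while True:
        sample = [rng.choice(data) for _ in data]
        if len({y for _, y in sample}) > 1:
            return sample

def train_stump(sample, feature):
    """A depth-1 'tree': threshold one feature midway between class means."""
    pos = [x[feature] for x, y in sample if y == 1]
    neg = [x[feature] for x, y in sample if y == 0]
    thr = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x[feature] >= thr else 0

def forest_predict(trees, x):
    """Combine the independent trees by majority vote."""
    return Counter(tree(x) for tree in trees).most_common(1)[0][0]

rng = random.Random(0)
# Hypothetical records: (TSH level, antibody titre) -> hypothyroidism label.
data = [((5.1, 2.0), 1), ((6.3, 2.4), 1), ((4.9, 1.9), 1),
        ((1.2, 0.3), 0), ((0.9, 0.5), 0), ((1.5, 0.4), 0)]

# Each tree sees its own bootstrap sample and a randomly chosen feature.
trees = [train_stump(bootstrap(data, rng), rng.randrange(2))
         for _ in range(25)]
print(forest_predict(trees, (5.5, 2.1)))  # 1 (predicted hypothyroid)
```

Each tree is weak on its own, but because the trees are trained on different resamples and features, their combined vote is more stable than any single tree.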
Noninvasive detection of COPD and Lung Cancer through breath analysis using MOS Sensor array based e-nose
Published in Expert Review of Molecular Diagnostics, 2021
Binson V A, M. Subramoniam, Luke Mathew
KNN is a supervised machine learning algorithm that classifies an input according to the classes of its nearest neighbours for a given value of k [51]. In the k-NN algorithm, a vector is classified using vectors of known class: each unknown sample is compared, one by one, with every sample in the training set. Support vector machines and logistic regression are especially suitable for the classification of two classes (binary classifiers). To apply logistic regression and SVM to a multiclass classification scheme, the multiclass problem is decomposed into a sequence of binary problems using the one-vs-all (OvA) method [51–53]. Bayes’ theorem is the basis of the Naive Bayes classification algorithm: the probability that a new sample belongs to each of the existing classes is computed from the sample data of the currently classified set, and the attributes are assumed to be independent of each other [51,52]. Sir Ronald Fisher proposed linear discriminant analysis (LDA) primarily for dimensionality reduction, but it is also used as a supervised machine learning classification algorithm. In this method, the ratio of between-class variance to within-class variance is maximized in the dataset, thereby guaranteeing maximum discrimination [52,53].
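The one-vs-all decomposition can be sketched in Python as follows; a toy centroid-based scorer stands in for a real SVM or logistic regression model, and the two-channel sensor readings and class labels are hypothetical, not measurements from the study:

```python
import math

def train_centroid(binary_data):
    """Toy binary scorer: negative distance to the positive-class mean."""
    pos = [x for x, y in binary_data if y == 1]
    centre = tuple(sum(c) / len(pos) for c in zip(*pos))
    return lambda x: -math.dist(x, centre)

def train_ova(data, labels, train_binary):
    """Fit one binary model per class: that class versus all the others."""
    return {cls: train_binary([(x, 1 if y == cls else 0) for x, y in data])
            for cls in labels}

def predict_ova(models, x):
    """Assign the class whose one-vs-all model scores highest."""
    return max(models, key=lambda cls: models[cls](x))

# Hypothetical e-nose sensor readings (two MOS channels) per subject class.
data = [((0.9, 0.2), "healthy"), ((1.0, 0.3), "healthy"),
        ((2.1, 1.8), "COPD"),    ((2.0, 1.9), "COPD"),
        ((3.2, 0.4), "lung cancer"), ((3.1, 0.5), "lung cancer")]
labels = ["healthy", "COPD", "lung cancer"]

models = train_ova(data, labels, train_centroid)
print(predict_ova(models, (2.05, 1.85)))  # COPD
```

Substituting any binary classifier with a real-valued decision score for `train_centroid` yields the OvA scheme described above: one binary problem per class, with the highest-scoring model winning.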