Kernel Methods
Published in Mark Chang, Artificial Intelligence for Drug Development, Precision Medicine, and Healthcare, 2020
In order to use the kernel trick, a kernel has to satisfy the following positive-definiteness criterion. A real-valued function $k(x_i, x_j)$ of two objects $x_i$ and $x_j$ is called a positive-definite kernel if and only if it is symmetric, that is, $k(x_i, x_j) = k(x_j, x_i)$, and positive definite: $\sum_{i=1}^{n}\sum_{j=1}^{n} c_i c_j\, k(x_i, x_j) \ge 0$ for every $n$, every choice of objects $x_1, \ldots, x_n$, and every choice of real coefficients $c_1, \ldots, c_n$.
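As a quick numerical illustration (not taken from the chapter), the sketch below builds the Gram matrix of a Gaussian (RBF) kernel, chosen here purely for illustration, and checks both the quadratic form above and the eigenvalues of the matrix; either check reflects the positive-definiteness condition.

```r
# Minimal sketch (assumptions: Gaussian/RBF kernel with length scale l, random 2-D objects)
set.seed(1)
rbf_kernel <- function(xi, xj, l = 1) exp(-sum((xi - xj)^2) / (2 * l^2))
X <- matrix(rnorm(20), ncol = 2)           # 10 objects x_1, ..., x_10 in R^2
n <- nrow(X)
K <- matrix(0, n, n)
for (i in 1:n) for (j in 1:n) K[i, j] <- rbf_kernel(X[i, ], X[j, ])
c_vec <- rnorm(n)                          # arbitrary real coefficients c_1, ..., c_n
drop(t(c_vec) %*% K %*% c_vec) >= 0        # the double sum is non-negative
min(eigen(K, symmetric = TRUE)$values)     # equivalently, no negative eigenvalues
                                           # (up to floating-point error)
```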
Spatio-temporal Models
Published in Yu Ding, Data Science for Wind Energy, 2019
Given a set of N locations, s1, …, sN, we can compute the corresponding covariance matrix C, whose (i, j)-th entry is Cij = C(si, sj). The covariance matrix is positive definite if all its eigenvalues are strictly positive, and positive semidefinite if some of its eigenvalues are zero while the rest are positive. It is not difficult to see that the covariance function is related to the kernel function mentioned in Section 2.5.2 and that the covariance matrix is related to the Gram matrix (or kernel matrix). In the general machine learning context a covariance function is referred to as a covariance kernel, and it can be shown that a positive definite kernel can be obtained as a covariance kernel in which the underlying distribution has a particular form [94].
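The following sketch, again only illustrative, evaluates a hypothetical exponential covariance function C(si, sj) = σ² exp(−‖si − sj‖/ρ) at a handful of random locations and inspects the eigenvalues of the resulting covariance (Gram) matrix; the values of σ² and ρ are arbitrary.

```r
# Minimal sketch (assumptions: exponential covariance, N random 2-D locations,
# illustrative variance sigma2 and range rho)
set.seed(2)
N <- 8
S <- matrix(runif(2 * N), ncol = 2)        # N spatial locations s_1, ..., s_N
D <- as.matrix(dist(S))                    # pairwise distances ||s_i - s_j||
sigma2 <- 1; rho <- 0.5
C <- sigma2 * exp(-D / rho)                # C_ij = C(s_i, s_j): covariance (Gram) matrix
ev <- eigen(C, symmetric = TRUE)$values
all(ev > 0)                                # strictly positive eigenvalues: positive definite
```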
Gaussian Process Damage Prognosis under Random and Flight Profile Fatigue Loading
Published in Ashok N. Srivastava, Jiawei Han, Machine Learning and Knowledge Discovery for Engineering Systems Health Management, 2016
Aditi Chattopadhyay, Subhasish Mohanty
There are many possible choices for the kernel functions [14]. From a modeling viewpoint, the objective is to select, a priori, a kernel that agrees with the assumptions and mathematically represents the structure of the process being modeled. Formally, it is required to specify a function that will generate a positive definite kernel matrix for any set of inputs. In more general terms, the high-dimensional transformation through the assumed kernel function should satisfy Mercer’s theorem of functional analysis [23]. For any two input vectors $x_i$ and $x_j$, the kernel function in Equations 6.10 through 6.13 has the form $k(x_i, x_j, \Theta) = k_a(x_i, x_j, \Theta) + k_{\mathrm{scatter}}(x_i, x_j, \Theta)$.
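The actual forms of $k_a$ and $k_{\mathrm{scatter}}$ are those of Equations 6.10 through 6.13 in the chapter; the sketch below simply assumes a squared-exponential term for $k_a$ and a white-noise term for $k_{\mathrm{scatter}}$ to show how a composite kernel of this additive form can be assembled and checked for positive definiteness.

```r
# Minimal sketch (assumption: k_a is squared-exponential and k_scatter is a white-noise
# term; the forms in Equations 6.10-6.13 may differ)
k_a <- function(xi, xj, theta) theta[1]^2 * exp(-sum((xi - xj)^2) / (2 * theta[2]^2))
k_scatter <- function(xi, xj, theta) if (isTRUE(all.equal(xi, xj))) theta[3]^2 else 0
k_total <- function(xi, xj, theta) k_a(xi, xj, theta) + k_scatter(xi, xj, theta)

theta <- c(1.0, 0.3, 0.1)                                # hypothetical hyperparameters Theta
X <- matrix(seq(0, 1, length.out = 6), ncol = 1)         # six scalar inputs
K <- outer(1:nrow(X), 1:nrow(X),
           Vectorize(function(i, j) k_total(X[i, ], X[j, ], theta)))
min(eigen(K, symmetric = TRUE)$values)                   # positive: a sum of positive
                                                         # definite kernels stays positive definite
```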
Multivariate statistical algorithms for landslide susceptibility assessment in Kailash Sacred landscape, Western Himalaya
Published in Geomatics, Natural Hazards and Risk, 2023
Arvind Pandey, Mriganka Shekhar Sarkar, Sarita Palni, Deepanshu Parashar, Gajendra Singh, Saurabh Kaushik, Naveen Chandra, Romulus Costache, Ajit Pratap Singh, Arun Pratap Mishra, Hussein Almohamad, Motrih Al-Mutiry, Hazem Ghassan Abdo
The Support Vector Machine (SVM) algorithm is a theory-based universal constructive learning technique. It can handle classification and nonlinear regression problems by mapping the input variables into a high-dimensional space through an inner product derived from positive definite kernel functions (Yao and Dai 2006). Its performance on several learning problems, including classification, regression, and novelty detection, has enhanced the popularity of the SVM (Karatzoglou et al. 2006). The model was first used in species distribution modelling by Guo et al. (2014) and later became popular in several other sectors. It has been used in a number of studies for landslide hazard modelling (Yao and Dai 2006; Yilmaz 2012; Pradhan 2013), of which Yao and Dai (2006) explained the use of the SVM algorithm in landslide susceptibility modelling. Here, ten categorical variables were used to predict landslide-susceptible conditions with the SVM. The function ‘ksvm’ in the R package ‘kernlab’ was used to construct the present SVM model, with a Radial Basis Function (RBF) kernel. The objective of the SVM classifier is to determine the best-separating hyperplane from the given training sets, one that can discriminate between two classes: regions where landslides occur and regions where they do not. In the case of linearly separable data, the hyperplane can be defined by Eq. (8), where ‘w’ is the coefficient vector that determines the hyperplane’s orientation in the feature space, ‘δi’ are the positive slack variables, and ‘b’ is the offset of the hyperplane from the origin (Cortes and Vapnik 1995). The optimal hyperplane is then obtained by solving with Lagrangian multipliers, as given in Eqs. (9) and (10).
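A minimal sketch of the model-fitting step described above, using ‘ksvm’ from ‘kernlab’ with an RBF kernel, is given below; the synthetic two-class data and the hyperparameter values are stand-ins for the study’s ten predictor variables and its tuned settings.

```r
# Minimal sketch (assumptions: a synthetic two-class data set replaces the landslide /
# non-landslide samples; sigma and C are illustrative, not the study's tuned values)
library(kernlab)
set.seed(3)
n <- 100
X <- matrix(rnorm(2 * n), ncol = 2)                      # two stand-in predictor variables
y <- factor(ifelse(X[, 1] + X[, 2] + rnorm(n, sd = 0.3) > 0,
                   "landslide", "non_landslide"))
fit <- ksvm(X, y, type = "C-svc", kernel = "rbfdot",
            kpar = list(sigma = 0.5), C = 1, cross = 5)
fit                                                      # training and cross-validation error
predict(fit, X[1:5, , drop = FALSE])                     # predicted class for five samples
```

The `cross = 5` argument requests five-fold cross-validation, which gives a rough check of generalization alongside the training error reported by the fitted object.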