Classification task
Published in Benny Raphael, Construction and Building Automation, 2023
An example in one dimension is taken for simplicity of illustration. For the data in Table 9.2:
(a) Compute the Gram matrix. Comment on the relative magnitudes of the elements in this matrix.
(b) Write down the Lagrangian in dual form. What is the nature of the objective function?
(c) Write down the constraints in the dual form. Interpret the meaning of the final expression.
(d) Calculate the solution to the dual problem by guessing the values $\alpha_2 = 0$, $\alpha_4 = 0$. Check whether all the constraints in the primal problem are satisfied with this solution.
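Since Table 9.2 is not reproduced in this excerpt, the sketch below uses placeholder one-dimensional data; with a linear kernel, the Gram matrix is simply the outer product of the data vector with itself.

```python
import numpy as np

# Hypothetical stand-in for the 1-D data of Table 9.2 (not reproduced here).
x = np.array([1.0, 2.0, 3.0, 4.0])

# Linear-kernel Gram matrix: G[i, j] = x_i * x_j, i.e. the outer product.
G = np.outer(x, x)
print(G)
# Entries involving the largest-magnitude points dominate the matrix,
# which is the kind of observation part (a) asks for.
```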
Video Modeling and Retrieval
Published in Ling Guan, Yifeng He, Sun-Yuan Kung, Multimedia Image and Video Processing, 2012
Zheng-Jun Zha, Jin Yuan, Yan-Tao Zheng, Tat-Seng Chua
The dual problem is $\max_\alpha \mathbf{1}^T \alpha - \frac{1}{2} \alpha^T Q \alpha$, where $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_N]^T$ is a vector containing the Lagrange multipliers $\alpha_i$, $\mathbf{1}$ is a vector of all ones, and $Q$ is an $N \times N$ positive semidefinite matrix with $Q(i, j) = y_i y_j K(x_i, x_j)$. Here $K(\cdot, \cdot)$ is a kernel function with $K(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$. The matrix $K$ containing the values of the kernel function for all pairs of training samples is called the Gram matrix. The kernel function is important because it implicitly maps the feature vectors into a higher-dimensional space, in which the separating hyperplane and its support vectors are obtained. After the optimal $\alpha$ is found, the classification confidence score of a given test sample $x$ follows from the representer theorem: $f(x) = \sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b$.
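To make these definitions concrete, the following sketch builds the Gram matrix and $Q$ with an RBF kernel (one possible choice of $K$) and scores a test point; the multipliers $\alpha$ and bias $b$ are hypothetical stand-ins for the output of a dual QP solver, not values from the excerpt.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    # K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2); one common kernel choice.
    return np.exp(-gamma * np.sum((a - b) ** 2))

X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]])  # toy training samples
y = np.array([-1.0, 1.0, 1.0])                      # labels in {-1, +1}

N = len(X)
# Gram matrix over all training-sample pairs.
K = np.array([[rbf_kernel(X[i], X[j]) for j in range(N)] for i in range(N)])
Q = (y[:, None] * y[None, :]) * K  # Q(i, j) = y_i y_j K(x_i, x_j)

# Hypothetical optimal multipliers and bias (in practice, solve the dual QP).
alpha, b = np.array([0.3, 0.2, 0.1]), 0.05

def score(x_test):
    # f(x) = sum_i alpha_i y_i K(x_i, x) + b
    return sum(alpha[i] * y[i] * rbf_kernel(X[i], x_test) for i in range(N)) + b

print(score(np.array([1.0, 0.5])))
```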
Regression
Published in A. C. Faul, A Concise Introduction to Machine Learning, 2019
The covariance matrices arising in a Gaussian process are completely specified if the $(i, j)$ entry is given by evaluating a kernel function at $x_i$ and $x_j$, i.e. $k(x_i, x_j)$. The covariance matrix is then the Gram matrix, and it is positive definite. The above are two specific choices of kernel. The choice of kernel function depends on the application. For possible choices of kernels, see Section 5.2 in Chapter 5.
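A minimal sketch, assuming the squared-exponential kernel (one possible choice; the two kernels referenced above are not shown in this excerpt): it builds the Gram matrix as a GP covariance and draws a sample from the prior, which relies on the matrix being positive (semi-)definite.

```python
import numpy as np

def sq_exp_kernel(xi, xj, length_scale=1.0):
    # Squared-exponential kernel k(x_i, x_j); assumed here for illustration.
    return np.exp(-0.5 * (xi - xj) ** 2 / length_scale ** 2)

x = np.linspace(0.0, 5.0, 50)

# Gram matrix: entry (i, j) is the kernel evaluated at x_i and x_j.
# It serves as the covariance matrix of the Gaussian process.
K = sq_exp_kernel(x[:, None], x[None, :])

# Positive (semi-)definiteness allows a Cholesky factorization; the small
# jitter term guards against numerical round-off on the diagonal.
L = np.linalg.cholesky(K + 1e-9 * np.eye(len(x)))
sample = L @ np.random.randn(len(x))  # one function drawn from the GP prior
```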
Quantum-behaved particle swarm optimization of convolutional neural network for fault diagnosis
Published in Journal of Experimental & Theoretical Artificial Intelligence, 2022
Jie Chen, QingShan Xu, Xiaofeng Xue, Yingchao Guo, Runfeng Chen
Before inputting a one-dimensional time-series signal into the CNN, the common approach is to rearrange and combine the signal sampling points in a simple way, converting them into a two-dimensional matrix. To preserve the temporal relationships within the signal, this article uses the GAF algorithm (Wang & Oates, 2015) to pre-process the data set. The idea of the GAF originates from the Gram matrix in linear algebra. The Gram matrix is often used to measure the linear correlation of a set of vectors, and it retains the time dependence between them: time increases as the position moves from the upper left corner to the lower right corner, so the time dimension is encoded in the geometry of the matrix. The GAF proceeds as follows. First, the time series is normalised: the values in the sequence $X$ are scaled to the interval $[-1, 1]$ or $[0, 1]$, and the scaled sequence is denoted $\tilde{X}$. Then, the scaled time series is converted to polar coordinates. Finally, the angular perspective is exploited via the trigonometric sum/difference between each pair of points to identify the temporal correlation within different time intervals; a sketch of this encoding follows below. Figure 1 illustrates the encoding process of the time series.
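A minimal sketch of the summation variant of this encoding (GASF), assuming the $[-1, 1]$ scaling; the difference variant is analogous.

```python
import numpy as np

def gramian_angular_field(series):
    # 1. Rescale the series to [-1, 1].
    x = np.asarray(series, dtype=float)
    x_tilde = 2 * (x - x.min()) / (x.max() - x.min()) - 1

    # 2. Map to polar coordinates: each value becomes an angle phi.
    phi = np.arccos(np.clip(x_tilde, -1.0, 1.0))

    # 3. Summation field: entry (i, j) = cos(phi_i + phi_j); time runs
    #    from the upper-left to the lower-right corner of the matrix.
    return np.cos(phi[:, None] + phi[None, :])

gaf = gramian_angular_field(np.sin(np.linspace(0, 4 * np.pi, 64)))
print(gaf.shape)  # (64, 64) image that can be fed to a CNN
```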
A comprehensive survey on machine learning approaches for dynamic spectrum access in cognitive radio networks
Published in Journal of Experimental & Theoretical Artificial Intelligence, 2022
The Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification and regression analysis. It is based on statistical learning theory with structural risk minimisation (Awe et al., 2013) and is typically preferred for the classification of complex problems. For a given training set, each input training vector belongs to one of two classes. The two classes are separated by an optimal hyperplane that has the largest distance to the nearest training samples of either class (the maximal-margin classifier), which achieves good separation. An SVM is easier to train when the classes are linearly separable. For non-separable classification, a non-linear SVM is obtained by introducing a kernel function. A function is a valid kernel if it satisfies Mercer's theorem (Hofmann, 2006), which provides a necessary and sufficient characterisation of a function as a kernel. A kernel represents a similarity measure in the form of a similarity matrix (the Gram matrix) between its input objects; the Gram matrix collects all the information the learning algorithm needs in the form of inner products. For more details, refer to Burges (1998). SVM can be used to develop a real-time approach to sensing the primary user (PU), as shown in Figure 10 and sketched below. The input composite signal includes signal and noise, which are independent of each other in the time domain. The sampled data is classified as PU or not based on training and testing of the SVM classification model.
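A minimal sketch of such a detector, using scikit-learn's SVC with an RBF kernel on synthetic two-dimensional features; the feature construction and parameter values are illustrative assumptions, not taken from the surveyed paper.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the composite received signal: energy-like features
# for "noise only" (label 0) and "PU signal + noise" (label 1).
noise = rng.normal(0.0, 1.0, size=(200, 2))
signal = rng.normal(1.5, 1.0, size=(200, 2))
X = np.vstack([noise, signal])
y = np.array([0] * 200 + [1] * 200)

# Non-linear SVM via an RBF (Mercer) kernel; internally the fit works with
# the Gram matrix of pairwise similarities between training samples.
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

# Classify a newly sampled observation as PU present (1) or absent (0).
print(clf.predict([[1.2, 0.8]]))
```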