Hierarchical Spatial Features Learning for Image Classification
Published in Guoqing Zhou, Urban High-Resolution Remote Sensing, 2020
A large convolutional window, covering abundant contextual information, is necessary for capturing long-distance correlations and making correct decisions. However, with a large kernel window, a CNN model becomes unmanageably large (Farabet et al. 2013). To address this problem, multiscale methods can be considered; they are popular in RS image analysis (Santos et al. 2012, 2013; Tilton et al. 2012; Valero et al. 2013). Many researchers have integrated multiscale analysis with classification methods to simplify the training procedure and increase classification accuracy. Farabet et al. (2013) proposed multiscale CNNs, which used a Laplacian pyramid to transform the raw input image into three scales, followed by upsampling. Although the multiscale CNNs achieved strong scene parsing and pixelwise prediction results, they did not effectively use the original spectral information of the images. As is well known, spectral information is very important for RS image classification because it is associated with the spectral absorption features of various materials (Zhang et al. 2018).
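For intuition, here is a minimal sketch of the multiscale idea: decompose the input with a Laplacian pyramid, upsample each band back to full resolution, and run one weight-shared convolution over every scale. This is not Farabet et al.'s implementation; the three-scale setting and the tiny PyTorch stack are illustrative assumptions.

```python
# Minimal sketch of multiscale CNN input via a Laplacian pyramid.
# Assumptions: 3 scales, bilinear up/downsampling, one shared conv layer.
import torch
import torch.nn.functional as F

def laplacian_pyramid(img, levels=3):
    """img: (N, C, H, W). Returns `levels` images, all upsampled to full size."""
    pyr, cur = [], img
    for _ in range(levels - 1):
        down = F.avg_pool2d(cur, 2)
        up = F.interpolate(down, size=cur.shape[-2:], mode="bilinear",
                           align_corners=False)
        pyr.append(cur - up)   # band-pass detail at this scale
        cur = down
    pyr.append(cur)            # coarsest low-pass residual
    # Upsample every level back to the input resolution so one CNN
    # with shared weights can process all scales.
    return [F.interpolate(p, size=img.shape[-2:], mode="bilinear",
                          align_corners=False) for p in pyr]

img = torch.randn(1, 3, 64, 64)
scales = laplacian_pyramid(img)
conv = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)  # shared across scales
features = torch.cat([conv(s) for s in scales], dim=1)
print(features.shape)  # torch.Size([1, 48, 64, 64])
```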
GP Fidelity and Scale
Published in Robert B. Gramacy, Surrogates, 2020
To summarize this segment on CSKs, consider the following notes. Sparse covariance matrices decompose faster compared to their dense analogs, but the gap in execution time is only impressive when matrices are very sparse. In that context, intervention is essential to mop up long-range structure left unattended by all those zeros. A solution entails hybridization between processes targeting long- and short-distance correlation. Kaufman et al. utilize a rich mean structure; Sang and Huang’s FSA stays covariance-centric. Either way, both agree that Bayesian posterior sampling is essential to average over competing explanations. We have seen that the MCMC required can be cumbersome: long chains “eat up” computational savings offered by sparsity. Nevertheless, both camps offer dramatic success stories. For example, Kaufman fit a surrogate to more than twenty thousand runs of a photometric redshift simulation – a cosmology example – in four input dimensions, and predicted with full UQ at more than eighty thousand sites. Results are highly accurate, and computation time is reasonable.
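To make the sparsity point concrete, here is a minimal timing sketch, not Kaufman et al.'s or Sang and Huang's code: a 1-D grid with a compactly supported "tent" kernel, so the covariance is exactly zero beyond a small radius. The grid size n, the support radius theta, and the use of sparse LU (standing in for a sparse Cholesky) are all assumptions for illustration.

```python
# Sketch: factorization time, dense vs. sparse, for a very sparse covariance.
import time
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import splu

n, theta = 2000, 0.01                      # assumed grid size and support radius
x = np.linspace(0, 1, n)
D = np.abs(x[:, None] - x[None, :])

# Tent kernel (positive definite in 1-D), zero beyond theta, plus a nugget.
K = np.maximum(1.0 - D / theta, 0.0) + 1e-8 * np.eye(n)

t0 = time.perf_counter()
np.linalg.cholesky(K)                      # dense decomposition
t_dense = time.perf_counter() - t0

Ks = sparse.csc_matrix(K)
t0 = time.perf_counter()
splu(Ks)                                   # sparse factorization
t_sparse = time.perf_counter() - t0

print(f"nonzero fraction {Ks.nnz / n**2:.2%}; "
      f"dense {t_dense:.3f}s vs sparse {t_sparse:.3f}s")
```

With roughly 2% of entries nonzero, the sparse factorization wins easily; as theta grows and zeros disappear, the gap closes, which is the caveat noted above.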
Multivariate Analysis and Techniques
Published in N.C. Basantia, Leo M.L. Nollet, Mohammed Kamruzzaman, Hyperspectral Imaging Analysis and Applications for Food Quality, 2018
Multivariate classification, also called pattern recognition, can be unsupervised or supervised. Supervised pattern recognition aims to establish a classification model in order to assign new unknown samples to previously defined classes on the basis of their patterns of measurements. On the other hand, unsupervised classification does not require prior knowledge of the group structure in the data; instead, the data are classified according to their natural groupings. Hence samples are grouped according to a similarity metric, which can be distance, correlation, or some combination of both. This type of analysis is often very useful for preliminary evaluation of the information content of the spectral dataset. PCA is the most frequently used unsupervised technique for qualitative classification; K-means and fuzzy clustering are other common unsupervised approaches. Commonly used supervised methods are soft independent modeling of class analogy (SIMCA), linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), Fisher discriminant analysis (FDA) and artificial neural networks (ANN).
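The contrast between the two modes can be seen in a few lines of code. Below is a minimal sketch using scikit-learn on synthetic "spectra"; the data shapes, class offset, and choice of PCA versus LDA are assumptions for illustration, not a recommended food-quality workflow.

```python
# Sketch: unsupervised (PCA) vs. supervised (LDA) treatment of spectral data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# 60 samples x 50 wavelengths; two classes differing in mean spectrum.
X = rng.normal(size=(60, 50))
X[30:] += 0.8
y = np.repeat([0, 1], 30)

# Unsupervised: PCA never sees y; it only finds directions of variance,
# useful for a preliminary look at the information content.
scores = PCA(n_components=2).fit_transform(X)

# Supervised: LDA uses the known class labels to build a classifier.
lda = LinearDiscriminantAnalysis().fit(X, y)
print("LDA training accuracy:", lda.score(X, y))
```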
Fast Robust Correlation for High-Dimensional Data
Published in Technometrics, 2021
Jakob Raymaekers, Peter J. Rousseeuw
By itself, distance correlation is not robust to outliers in the data. In fact, we illustrate in Section A.9 of the supplementary material that the distance correlation of independent variables can be made to approach 1 by a single outlier among 100,000 data points, and the distance correlation of perfectly dependent variables can be made to approach zero. On the other hand, we could first transform the data by the function g of (26) with a sigmoid, and then compute the distance covariance. This combined method does not require the first moments of the original variables to exist, and the population version is again zero if and only if the original variables are independent (since g is invertible). Figure 8 illustrates the robustness of this combined statistic.
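A small numerical sketch of both the failure and the remedy follows. It uses a hand-rolled V-statistic version of distance correlation and tanh as a stand-in bounded transform; the paper's actual g of (26) and its sigmoid are not reproduced here, so treat the transform as an assumption.

```python
# Sketch: one outlier breaks plain dCor; a bounded transform restores it.
import numpy as np

def dcor(x, y):
    """Sample distance correlation of two 1-D arrays (biased V-statistic)."""
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()  # double centering
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    return np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))

rng = np.random.default_rng(1)
x, y = rng.normal(size=1000), rng.normal(size=1000)  # independent variables
x[0], y[0] = 1e6, 1e6                                # one huge outlier

print("plain dCor:      ", dcor(x, y))                       # pulled toward 1
print("transformed dCor:", dcor(np.tanh(x), np.tanh(y)))     # back near 0
```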
Ranking Features to Promote Diversity: An Approach Based on Sparse Distance Correlation
Published in Technometrics, 2022
Andi Wang, Juan Du, Xi Zhang, Jianjun Shi
Our feature ranking procedure is based on distance correlation (Székely and Rizzo 2004). Distance correlation is an energy statistic (Székely and Rizzo 2017), which is a function of distances between all pairs of samples. As introduced in Section 2, it is a general dependency measure and can identify linear and nonlinear dependency relationships.
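As a quick illustration of that last point, the sketch below (assuming the third-party Python package dcor is installed) contrasts Pearson correlation with distance correlation on a noiseless quadratic relationship, which is linearly uncorrelated yet fully dependent.

```python
# Sketch: distance correlation detects nonlinear dependence Pearson's r misses.
import numpy as np
import dcor  # third-party package: pip install dcor

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 5000)
y = x ** 2  # deterministic (fully dependent), but linearly uncorrelated

print("Pearson r:           ", np.corrcoef(x, y)[0, 1])          # approx. 0
print("distance correlation:", dcor.distance_correlation(x, y))  # clearly > 0
```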