A Survey of Stream Clustering Algorithms
Published in Charu C. Aggarwal, Chandan K. Reddy, Data Clustering, 2018
Density-based methods [18, 30] construct a density profile of the data for clustering purposes. Typically, kernel density estimation methods [58] are used to construct a smooth density profile of the underlying data. The data are then separated into density-connected regions, which may be of different shapes and sizes. One advantage of density-based algorithms is that no implicit shape is assumed for the clusters. For example, when Euclidean distance functions are used, it is implicitly assumed that the clusters have spherical shapes; similarly, the Manhattan metric assumes that the clusters are diamond-shaped. In density-based clustering, connected regions of high density may have arbitrary shapes. Another aspect of density-based clustering is that the number of clusters does not need to be decided in advance. Rather, a threshold on the density is used to determine the connected regions. Of course, this changes the nature of the parameter supplied to the algorithm (a density threshold instead of a number of clusters), but it does not necessarily make the approach parameter-free.
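To make the contrast concrete, the sketch below (not taken from the chapter; the dataset and parameter values are illustrative) uses scikit-learn's DBSCAN, a classical density-based method: no number of clusters is supplied, the density threshold is controlled by eps and min_samples, and the two non-spherical half-moon clusters are recovered.

```python
# Minimal sketch of density-based clustering with DBSCAN.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaved half-moons: non-spherical clusters, so a Euclidean
# spherical-cluster assumption (as in k-means) would fail here.
X, _ = make_moons(n_samples=500, noise=0.05, random_state=0)

# eps and min_samples together play the role of the density threshold;
# note that no number of clusters is passed in.
labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)  # -1 marks noise
print(f"clusters found: {n_clusters}")  # typically 2 for these parameters
```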
Advances in kernel density estimation on directional supports
Published in Christophe Ley, Thomas Verdebout, Modern Directional Statistics, 2017
Christophe Ley, Thomas Verdebout
Kernel density estimation is the classical way to produce non-parametric density estimates on the real line. Its roots can be found in the seminal works of Rosenblatt (1956) and Parzen (1962). Letting $Z_1, \ldots, Z_n$ be iid observations from a population with unknown density $f$ on $\mathbb{R}$, the kernel density estimator (KDE) at some point $z \in \mathbb{R}$ is defined as
$$ \hat{f}(z) = \frac{1}{ng} \sum_{i=1}^{n} K_{\ell}\!\left( \frac{z - Z_i}{g} \right). $$
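The estimator above translates directly into a few lines of code. The following is a minimal sketch, assuming a Gaussian kernel for $K_{\ell}$ and a hand-picked bandwidth $g$; neither choice is prescribed by the excerpt.

```python
# Minimal sketch of the kernel density estimator f_hat above.
import numpy as np

def gaussian_kernel(u):
    # Standard normal density, one common choice for the kernel K.
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(z, sample, g, kernel=gaussian_kernel):
    """f_hat(z) = (1 / (n*g)) * sum_i K((z - Z_i) / g)."""
    sample = np.asarray(sample)
    return kernel((z - sample) / g).sum() / (len(sample) * g)

rng = np.random.default_rng(0)
Z = rng.normal(size=200)      # stand-in for iid draws from an unknown f
print(kde(0.0, Z, g=0.3))     # estimate of f at z = 0
```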
Cooperative Regression-Based Forecasting in Distributed Traffic Networks
Published in Qurban A. Memon, Distributed Networks, 2017
Jelena Fiosina, Maksims Fiosins
Kernel density estimation is a nonparametric approach for estimating the probability density function of a random variable. It is a fundamental data-smoothing technique in which inferences about the population are made on the basis of a finite data sample. A kernel is a weighting function used in nonparametric estimation techniques.
Bivariate kernel density estimation for environmental contours at two offshore sites
Published in Ships and Offshore Structures, 2022
In statistics, kernel density estimation is a non-parametric way to estimate the probability density function of a random variable. Kernel density estimation is a fundamental data-smoothing problem in which inferences about the population are made based on a finite data sample. We begin with the formulations for the univariate kernel density estimator. Suppose we have a random sample $X_1, \ldots, X_n$ taken from a continuous, univariate, unknown density $f$. We are interested in estimating the shape of this function $f$. Its univariate kernel density estimator is (Silverman (1986)):
$$ \hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left( \frac{x - X_i}{h} \right), $$
where $K$ is a non-negative function satisfying $\int_{-\infty}^{\infty} K(u)\,\mathrm{d}u = 1$, which we call the kernel function, and $h > 0$ is a smoothing parameter called the bandwidth parameter. A range of kernel functions are commonly used: uniform, triangular, biweight, triweight, Epanechnikov, normal, and others. The Epanechnikov kernel is optimal in a mean square error sense, and therefore it is utilised in this study. The mathematical formulation of the univariate Epanechnikov kernel is as follows:
$$ K(u) = \frac{3}{4}\left(1 - u^2\right) \ \text{ for } |u| \le 1, \qquad K(u) = 0 \ \text{ otherwise.} $$
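As a minimal illustration (not from the article; the sample and bandwidth are invented), the Epanechnikov kernel above can be plugged into the univariate estimator like this:

```python
# Minimal sketch: univariate KDE with the Epanechnikov kernel.
import numpy as np

def epanechnikov(u):
    # K(u) = 0.75 * (1 - u^2) on |u| <= 1, and 0 elsewhere.
    u = np.asarray(u)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def kde_epanechnikov(x, sample, h):
    sample = np.asarray(sample)
    return epanechnikov((x - sample) / h).sum() / (len(sample) * h)

rng = np.random.default_rng(1)
X = rng.normal(size=300)              # stand-in sample
print(kde_epanechnikov(0.0, X, h=0.4))
```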
Vulnerability of transmission towers under intense wind loads
Published in Structure and Infrastructure Engineering, 2022
Edgar Tapia-Hernández, David De-León-Escobedo
At this point, three probability distributions were computed before selecting a log-normal distribution: beta, kernel, and log-normal, as depicted in Figure 7. In general, probability distributions describe the dispersion of the values of a random variable. The beta distribution is applied to model the behavior of random variables limited to intervals of finite length in a wide variety of disciplines. Kernel density estimation is a non-parametric way to estimate the probability density function of a random variable, whereas in a log-normal distribution the random variable takes only positive real values. A χ2 goodness-of-fit test was performed (see Table 1); according to the results, the log-normal distribution represents the most reasonable estimate of the considered values and is therefore, as discussed above, the distribution selected for the study. In Table 1, OF stands for observed frequency and EF refers to expected frequency. Note that OF Actual, EF beta, EF LN, and EF Kernel have units of percentage of total occurrences, and χ2 beta, χ2 LN, and χ2 Kernel are non-dimensional.
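The kind of comparison described can be sketched as follows, with an invented sample standing in for the wind-load data and a fitted log-normal as one of the candidates; this illustrates the procedure, not the paper's actual computation.

```python
# Hedged sketch: chi-square goodness of fit of a log-normal candidate,
# comparing observed bin frequencies (OF) with expected ones (EF LN).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.lognormal(mean=3.0, sigma=0.4, size=500)  # stand-in sample

# Observed frequencies over equal-width bins.
edges = np.linspace(data.min(), data.max(), 11)
of, _ = np.histogram(data, bins=edges)

# Expected frequencies under a fitted log-normal.
shape, loc, scale = stats.lognorm.fit(data, floc=0)
cdf = stats.lognorm.cdf(edges, shape, loc=loc, scale=scale)
ef = len(data) * np.diff(cdf)

# Chi-square statistic; smaller values indicate a better fit. The same
# computation would be repeated for the beta and kernel candidates.
chi2 = np.sum((of - ef) ** 2 / ef)
print(f"chi-square (log-normal): {chi2:.2f}")
```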
Estimating Level of Engagement from Ocular Landmarks
Published in International Journal of Human–Computer Interaction, 2020
Zeynep Yücel, Serina Koyama, Akito Monden, Mariko Sasakura
In addition to bandwidth selection, KDE must also be handled carefully with respect to the curse of dimensionality. Namely, the principles explained using Equation (5) for a single variable can in theory be extended easily to the multivariate case. In practice, however, multivariate kernel density estimation is usually restricted to 2-D because of the curse of dimensionality. As in most other applications, in our case operating in the full (4-D) variable space potentially yields an overwhelmingly large number of bins, so the space is sparsely populated by data points. Therefore, we prefer using a set of 1-D variable spaces, as sketched below. However, this choice requires a justification of the conditional independence of the observations, which is elaborated on in Section 4.5.
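The factorization described, one 1-D KDE per variable multiplied together under the independence assumption, can be sketched as follows (synthetic data and scipy's gaussian_kde; this is not the authors' code):

```python
# Minimal sketch: replace one 4-D KDE over a sparsely populated space
# with four univariate KDEs whose product approximates the joint density.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)
X = rng.normal(size=(4, 1000))  # 4 variables x 1000 observations (stand-in)

# One univariate KDE per variable.
kdes = [gaussian_kde(X[d]) for d in range(4)]

def joint_density(x):
    # Valid as a joint density only if the variables are independent,
    # which is the justification the authors elaborate in Section 4.5.
    return np.prod([kdes[d](x[d])[0] for d in range(4)])

print(joint_density([0.0, 0.0, 0.0, 0.0]))
```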