Appliance of Machine Learning Algorithms in Prudent Clinical Decision-Making Systems in the Healthcare Industry
Published in Ashish Mishra, G. Suseendran, Trung-Nghia Phung, Soft Computing Applications and Techniques in Healthcare, 2020
T. Venkat Narayana Rao, G. Akhila
Density estimation is one of the major areas of statistics; it estimates a probability density from observed data and is a building block of machine learning. In medicine, medical image segmentation is performed on the basis of the estimated information. Segmentation is the process of dividing data into groups of similar objects. Before density estimation, the first step is to plot a histogram. A histogram is a graphical representation of continuous numerical data: the data are first grouped into bins, and the number of objects falling in each bin is counted. These counts are the frequencies of observation, plotted on the y-axis, while the bins are plotted on the x-axis. The number of bins in a histogram plays an essential role in controlling the coarseness of the distribution [6].
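The binning step described above can be sketched in a few lines. This is an illustrative example rather than code from the chapter; it assumes NumPy and synthetic data, and shows how the number of bins controls the coarseness of the counts.

```python
# Illustrative sketch (not from the chapter): histogram binning with NumPy.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=500)   # synthetic stand-in for observed data

for n_bins in (5, 20, 80):
    counts, edges = np.histogram(data, bins=n_bins)
    # counts[i] = number of observations falling in bin [edges[i], edges[i+1])
    print(f"{n_bins:3d} bins -> bin width {edges[1] - edges[0]:.3f}, max count {counts.max()}")
```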
Introduction to Visual Computing
Published in Ragav Venkatesan, Baoxin Li, Convolutional Neural Networks in Visual Computing, 2017
The Bayesian decision process is applicable whenever we are able to model the data in a feature space and the distributions (the class-conditionals) of the classes and the priors can somehow be obtained. In that case, optimal decision boundaries can be derived as above. In practice, both the priors and the class-conditionals need to be estimated from some training data. The priors are scalars and thus may be easily estimated by relative frequencies of the samples from each class. There are two general types of density estimation techniques: parametric and nonparametric. In the earlier example, we essentially assumed the PDFs of the image means were Gaussian (i.e., a parametric approach). A reader interested in density estimation may refer to standard textbooks like Duda et al. (2012).
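As a hedged illustration of the parametric route described above (not the authors' code), the sketch below estimates the priors by relative class frequencies, fits a one-dimensional Gaussian class-conditional to each class, and applies the Bayes decision rule; the feature values and class sizes are synthetic.

```python
# Sketch of parametric (Gaussian) density estimation plus a Bayes decision rule.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x0 = rng.normal(0.3, 0.05, size=300)   # synthetic feature values for class 0
x1 = rng.normal(0.6, 0.08, size=100)   # synthetic feature values for class 1

# Priors estimated by relative frequencies of the training samples.
n = len(x0) + len(x1)
prior = np.array([len(x0) / n, len(x1) / n])

# Class-conditionals p(x | class) assumed Gaussian; estimate mean and std per class.
params = [(x0.mean(), x0.std(ddof=1)), (x1.mean(), x1.std(ddof=1))]

def classify(x):
    # Bayes decision: choose the class maximising p(x | class) * p(class).
    scores = [norm.pdf(x, mu, sd) * p for (mu, sd), p in zip(params, prior)]
    return int(np.argmax(scores))

print(classify(0.35), classify(0.55))
```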
Advances in kernel density estimation on directional supports
Published in Christophe Ley, Thomas Verdebout, Modern Directional Statistics, 2017
Christophe Ley, Thomas Verdebout
Kernel density estimation is the classical way to produce non-parametric density estimates on the real line. Its roots can be found in the seminal works of Rosenblatt (1956) and Parzen (1962). Letting $Z_1, \ldots, Z_n$ be iid observations from a population with unknown density $f$ on $\mathbb{R}$, the kernel density estimator (KDE) at some point $z \in \mathbb{R}$ is defined as
$$ \hat{f}(z) = \frac{1}{ng} \sum_{i=1}^{n} K_{\ell}\!\left( \frac{z - Z_i}{g} \right). $$
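The estimator above can be implemented directly. The sketch below is illustrative rather than taken from the book: it uses a Gaussian kernel in place of $K_{\ell}$ (an assumption, as the excerpt does not fix the kernel) and an arbitrary bandwidth $g$.

```python
# Illustrative implementation of f_hat(z) = (1/(n*g)) * sum_i K((z - Z_i) / g).
import numpy as np

def kde(z, sample, g):
    u = (z - sample[:, None]) / g                   # shape (n, len(z))
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)    # Gaussian kernel (assumed choice)
    return K.sum(axis=0) / (len(sample) * g)

rng = np.random.default_rng(2)
Z = rng.normal(size=200)            # iid observations Z_1, ..., Z_n
grid = np.linspace(-3, 3, 7)
print(np.round(kde(grid, Z, g=0.4), 3))
```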
Bivariate kernel density estimation for environmental contours at two offshore sites
Published in Ships and Offshore Structures, 2022
In statistics, kernel density estimation is a non-parametric way to estimate the probability density function of a random variable. Kernel density estimation is a fundamental data smoothing problem in which inferences about the population are made based on a finite data sample. We begin with the formulation of the univariate kernel density estimator. Suppose we have a random sample $X_1, \ldots, X_n$ taken from a continuous, univariate, unknown density $f$. We are interested in estimating the shape of this function $f$. Its univariate kernel density estimator is (Silverman (1986)):
$$ \hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left( \frac{x - X_i}{h} \right), $$
where $K$ is a non-negative function satisfying $\int K(u)\,\mathrm{d}u = 1$, which we call the kernel function, and $h > 0$ is a smoothing parameter called the bandwidth parameter. A range of kernel functions is commonly used: uniform, triangular, biweight, triweight, Epanechnikov, normal, and others. The Epanechnikov kernel is optimal in a mean square error sense and is therefore utilised in this study. The mathematical formulation of the univariate Epanechnikov kernel is as follows:
$$ K(u) = \tfrac{3}{4}\left(1 - u^2\right) \quad \text{for } |u| \le 1, \qquad K(u) = 0 \text{ otherwise}. $$
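For concreteness, the sketch below evaluates the univariate estimator with the Epanechnikov kernel; it is an illustration only, with synthetic data and an arbitrary bandwidth, not the configuration used in the paper.

```python
# Illustrative univariate KDE with the Epanechnikov kernel K(u) = 0.75*(1 - u^2), |u| <= 1.
import numpy as np

def epanechnikov(u):
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def kde_epanechnikov(x, sample, h):
    u = (x - sample[:, None]) / h
    return epanechnikov(u).sum(axis=0) / (len(sample) * h)

rng = np.random.default_rng(3)
X = rng.normal(size=300)            # synthetic sample X_1, ..., X_n
grid = np.linspace(-3, 3, 7)
print(np.round(kde_epanechnikov(grid, X, h=0.5), 3))
```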
Vulnerability of transmission towers under intense wind loads
Published in Structure and Infrastructure Engineering, 2022
Edgar Tapia-Hernández, David De-León-Escobedo
At this point, three probability distributions were computed before selecting a log-normal distribution: beta, kernel, and log-normal, as depicted in Figure 7. In general, probability distributions describe the dispersion of the values of a random variable. The beta distribution is used to model random variables limited to intervals of finite length in a wide variety of disciplines. Kernel density estimation is a non-parametric way to estimate the probability density function of a random variable. In a log-normal distribution, by contrast, the random variable takes only positive real values. A χ² goodness-of-fit test was performed (see Table 1); according to the results, the log-normal distribution provides the most reasonable fit to the considered values and is therefore, as discussed above, the distribution selected for this study. In Table 1, OF stands for observed frequency and EF for expected frequency. Note that OF Actual, EF beta, EF LN, and EF Kernel have units of percentage of total occurrences, while χ² beta, χ² LN, and χ² Kernel are non-dimensional.
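A chi-square goodness-of-fit check of the kind described above can be sketched as follows; the data, binning, and fit settings are placeholders and do not reproduce the study's Table 1.

```python
# Illustrative chi-square goodness-of-fit check for a log-normal fit (SciPy).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
sample = rng.lognormal(mean=3.0, sigma=0.25, size=400)   # synthetic stand-in data

# Fit a log-normal (location fixed at 0) and bin the observations.
shape, loc, scale = stats.lognorm.fit(sample, floc=0)
edges = np.quantile(sample, np.linspace(0.0, 1.0, 9))    # 8 bins with roughly equal counts
observed, _ = np.histogram(sample, bins=edges)

# Expected frequencies per bin under the fitted distribution.
cdf = stats.lognorm.cdf(edges, shape, loc=loc, scale=scale)
expected = len(sample) * np.diff(cdf)

chi2 = np.sum((observed - expected) ** 2 / expected)
print(f"chi-square statistic: {chi2:.2f}")
```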
Investor domicile and second-hand ship sale prices
Published in Maritime Policy & Management, 2021
Wen Hao Peng, Roar Adland, Tsz Leung Yip
The non-parametric model adopted in the first stage of this study is based on kernel density estimation. Kernel density estimation aims to explain the relationship among factors directly from the data. By characterising the joint distribution of transaction price versus market freight rate and ship age, the functional relationship can be estimated by minimizing the weighted residual pricing errors. The weight function incorporates both a multivariate probability density function (the kernel) and a bandwidth matrix. For kernel density estimation, the selection of the bandwidth determines the performance of the kernel density estimator and is more important than the choice of kernel (Heidenreich, Schindler, and Sperlich 2013). Adland and Koekebakker (2007) stated that the selection of bandwidth is a trade-off between the bias and the increased variance of the estimator. Too small a bandwidth can over-fit the data and increase the volatility, whereas too large a bandwidth smooths the estimator at the cost of possible bias. Here, we set the bandwidth at 10% of the historically observed range of the variables, equivalent to 4 years for vessel age and 3,600 USD/day for the freight rate.
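The 10%-of-range bandwidth rule quoted above can be written down directly; the sketch below uses placeholder samples for vessel age and freight rate rather than the paper's dataset.

```python
# Illustrative bandwidth rule: 10% of the historically observed range of each variable.
import numpy as np

rng = np.random.default_rng(5)
age = rng.uniform(0, 40, size=500)             # placeholder vessel ages (years)
rate = rng.uniform(5_000, 41_000, size=500)    # placeholder freight rates (USD/day)

def bandwidth_10pct(x):
    return 0.1 * (x.max() - x.min())

print(f"age bandwidth:  {bandwidth_10pct(age):.1f} years")
print(f"rate bandwidth: {bandwidth_10pct(rate):.0f} USD/day")
```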