Pearson’s Chi-Square
Published in Thomas S. Ferguson, A Course in Large Sample Theory, 2017
This is known as the Hellinger χ2 because of its relation to Hellinger distance. [The Hellinger distance between two densities, f(x) and g(x), is d(f, g), where d(f, g)² = ∫ (√f(x) − √g(x))² dx.]
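The distance in brackets can be approximated numerically on a grid. The sketch below assumes the unit-factor convention d² = ∫(√f − √g)² dx given above; the normal densities and the grid are illustrative choices:

```python
import numpy as np

def _trapz(y, x):
    """Trapezoidal rule (written out to avoid NumPy-version differences in np.trapz)."""
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))

def hellinger_distance(f, g, grid):
    """Approximate d(f, g), where d(f, g)^2 = ∫ (√f(x) − √g(x))^2 dx."""
    integrand = (np.sqrt(f(grid)) - np.sqrt(g(grid))) ** 2
    return np.sqrt(_trapz(integrand, grid))

def normal_pdf(mu, sigma):
    return lambda x: np.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-10.0, 10.0, 20001)
d_same = hellinger_distance(normal_pdf(0, 1), normal_pdf(0, 1), x)  # identical densities: 0
d_diff = hellinger_distance(normal_pdf(0, 1), normal_pdf(3, 1), x)  # shifted normal: strictly positive
```

For two unit-variance normals a closed form is available (d² = 2 − 2·exp(−(μ₁ − μ₂)²/8)), which the grid approximation matches closely; note that under this convention d is bounded by √2.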
Regularized robust estimation in binary regression models
Published in Journal of Applied Statistics, 2022
Qingguo Tang, Rohana J. Karunamuni, Boxiao Liu
A robust methodology is vital in data analysis because outliers and model misspecifications are common in practical applications. Moreover, efficient methods are essential in practice. These considerations have motivated our research. We propose a regularized estimation method based on the minimum-distance approach. Minimum-distance estimators possess a certain degree of automatic robustness to model misspecification [11]. Furthermore, certain minimum-distance estimators achieve efficiency under the model. In particular, minimum Hellinger distance (MHD) estimators for parametric models attain efficiency under the model and have excellent robustness properties in the presence of outliers and/or model misspecification [3,4,39]. Moreover, Lindsay [30] has shown that the maximum likelihood and MHD estimators are members of a larger class of efficient estimators with various second-order efficiency properties. For discrete data, Simpson [39] has shown that the breakdown point of MHD estimators is 1/2; that is, they achieve maximum robustness in the presence of outliers (see also [20]). Another distance measure that is intimately related to the Hellinger distance is the symmetric chi-squared distance introduced by Lindsay [29]. Lindsay [29,30] studied non-regularized estimators using several distance measures and showed that the minimum symmetric chi-squared distance (MSCD) generates highly efficient and robust estimators.
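As a concrete illustration of the robustness for discrete data noted above, the sketch below fits a Poisson model by minimizing the Hellinger distance between the empirical and model probability mass functions via a simple grid search; the Poisson family, the grid, the support truncation, and the toy data are illustrative assumptions, not part of the cited work:

```python
import numpy as np
from collections import Counter
from math import exp, factorial

def poisson_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

def mhd_poisson(data, lam_grid=None):
    """Minimum Hellinger distance estimate of the Poisson rate:
    minimize sum_k (sqrt(d_k) - sqrt(p_k(lam)))^2, where d_k is the
    empirical pmf. A grid-search sketch, not an efficient optimizer."""
    if lam_grid is None:
        lam_grid = np.linspace(0.1, 20.0, 2000)
    n = len(data)
    counts = Counter(data)
    kmax = max(data) + 30  # truncate the (infinite) support for the sum
    emp = np.array([counts.get(k, 0) / n for k in range(kmax)])
    best = None
    for lam in lam_grid:
        model = np.array([poisson_pmf(k, lam) for k in range(kmax)])
        h2 = np.sum((np.sqrt(emp) - np.sqrt(model)) ** 2)
        if best is None or h2 < best[0]:
            best = (h2, lam)
    return best[1]

# Clean sample with mean 3, and the same sample with one gross outlier
clean = [1] * 10 + [2] * 20 + [3] * 20 + [4] * 20 + [5] * 10
contaminated = clean + [50]

lam_clean = mhd_poisson(clean)
lam_cont = mhd_poisson(contaminated)
```

On the contaminated sample the MHD estimate barely moves, because the outlier carries only mass 1/n whose square root enters the objective additively, whereas the MLE (the sample mean) is pulled upward by the outlier.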
Robust estimation for longitudinal data based upon minimum Hellinger distance
Published in Journal of Applied Statistics, 2020
We consider a model for a continuous random variable Y having a parametric density. Let P and Q denote two probability measures absolutely continuous with respect to a third probability measure λ. We define the squared Hellinger distance between P and Q as the quantity H²(P, Q) = ∫ (√(dP/dλ) − √(dQ/dλ))² dλ [16]. Censored data may be regarded as a form of missing observation, and this method was shown to be robust and efficient for such data. Moreover, the minimum Hellinger distance approach utilizes a nonparametric kernel regression estimator; this kernel-type estimator is known to attain robustness through an appropriate choice of bandwidth in data with missing values [17]. We discuss nonparametric kernel regression estimation for longitudinal data in more detail in the next section.
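The kernel regression step mentioned above can be illustrated with a Nadaraya–Watson estimator, in which the bandwidth plays the smoothing role discussed in [17]; the Gaussian kernel and the sinusoidal test function below are illustrative assumptions, not the estimator used in the cited work:

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson kernel regression with a Gaussian kernel:
    m_hat(x) = sum_i K_h(x - x_i) y_i / sum_i K_h(x - x_i)."""
    x_train = np.asarray(x_train, float)
    y_train = np.asarray(y_train, float)
    out = np.empty(len(x_eval))
    for j, x0 in enumerate(np.asarray(x_eval, float)):
        w = np.exp(-0.5 * ((x0 - x_train) / bandwidth) ** 2)  # kernel weights
        out[j] = np.sum(w * y_train) / np.sum(w)
    return out

# Noisy observations of m(x) = sin(x)
rng = np.random.default_rng(0)
xs = np.sort(rng.uniform(0.0, 2 * np.pi, 200))
ys = np.sin(xs) + rng.normal(0.0, 0.2, 200)
m_hat = nadaraya_watson(xs, ys, [np.pi / 2], bandwidth=0.3)  # estimate near sin(pi/2) = 1
```

A smaller bandwidth tracks the data more closely (less bias, more variance); a larger one averages over more neighbors, which is the trade-off that governs robustness in the presence of missing values.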
Privacy protection measures for randomized response surveys on stigmatizing continuous variables
Published in Journal of Applied Statistics, 2018
Anderson [2] discusses the privacy aspects of the randomization scheme in Section 4 of his paper. There, he considers the variance of the ‘revealing’ density and proposes privacy measures given by either this variance by itself, the ratio of the variances of the ‘revealing’ and ‘true’ densities, or the minimum of this variance over all r. Thus, as overall measures of the protection provided by the randomization scheme, he proposes alternative measures that compare f and g through one specific feature of the densities, namely the variance. We extend his results by taking a more comprehensive approach in developing our measures: we focus on measuring the discrepancies between f and g in their entirety, as opposed to a specific feature of the densities. For this, we use two well-accepted methods for comparing densities, namely the Kullback–Leibler and the Hellinger distance measures: divergence measures that have seen extensive applications in areas such as machine learning and information theory for quantifying how one probability distribution differs from another. We propose measures of respondent privacy protection based on these two distance functions. We give the measures for models 4.1 and 4.2, followed by the measures under model 4.3.
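The two discrepancy measures can be sketched numerically. The additive-scrambling set-up below, with a normal ‘true’ density f and a ‘revealing’ density g of inflated variance, is purely an illustrative assumption (it is not Anderson's scheme): larger scrambling variance yields a larger divergence between f and g, i.e. more privacy protection.

```python
import numpy as np

def _trapz(y, x):
    """Trapezoidal rule (written out to avoid NumPy-version differences in np.trapz)."""
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))

def kl_divergence(f, g, grid):
    """KL(f || g) = ∫ f log(f/g) dx, approximated on a grid."""
    fx, gx = f(grid), g(grid)
    m = (fx > 0) & (gx > 0)  # guard against log(0) from tail underflow
    return _trapz(fx[m] * np.log(fx[m] / gx[m]), grid[m])

def hellinger(f, g, grid):
    """d(f, g), where d^2 = ∫ (√f − √g)^2 dx."""
    return np.sqrt(_trapz((np.sqrt(f(grid)) - np.sqrt(g(grid))) ** 2, grid))

def normal_pdf(mu, sigma):
    return lambda x: np.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# 'True' density f; 'revealing' density g = f plus scrambling noise of
# variance s^2, so g is N(0, 1 + s^2) under this illustrative model
grid = np.linspace(-15.0, 15.0, 30001)
f = normal_pdf(0, 1)
kl_small = kl_divergence(f, normal_pdf(0, np.sqrt(1 + 1.0 ** 2)), grid)
kl_large = kl_divergence(f, normal_pdf(0, np.sqrt(1 + 2.0 ** 2)), grid)
h_small = hellinger(f, normal_pdf(0, np.sqrt(1 + 1.0 ** 2)), grid)
h_large = hellinger(f, normal_pdf(0, np.sqrt(1 + 2.0 ** 2)), grid)
```

Both measures grow with the scrambling variance, which is what makes them usable as overall protection measures: the further g departs from f, the less the reported value reveals about the true one.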