Diagnostics
Published in Julian J. Faraway, Linear Models with Python, 2021
The most useful diagnostic is a plot of ϵ̂ against ŷ. If all is well, you should see constant symmetrical variation (known as homoscedasticity) in the vertical (ϵ̂) direction. Nonconstant variance is also called heteroscedasticity. Nonlinearity in the structural part of the model can also be detected in this plot. In Figure 6.1, three distinct cases are illustrated. We have generated these from known models and here is how we did it. First we need to load the packages for this chapter:
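The book's own code is not reproduced in this excerpt. As a rough stand-in, the sketch below simulates two of the cases described (constant and nonconstant variance) from made-up models and summarizes what the residual-versus-fitted plot would show by comparing the residual spread in the upper and lower halves of the fitted values. The model coefficients, the sample size, and the helper names fit_line, residuals_and_fitted and spread_ratio are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import random
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def residuals_and_fitted(x, y):
    """Return (residuals, fitted values) from the least squares line."""
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return resid, fitted

def spread_ratio(resid, fitted):
    """SD of residuals in the upper half of fitted values over the lower half."""
    pairs = sorted(zip(fitted, resid))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.stdev(high) / statistics.stdev(low)

rng = random.Random(1)  # fixed seed so the run is repeatable
x = [rng.uniform(1, 10) for _ in range(2000)]

# Hypothetical known models: same mean, different error structure.
y_const = [1 + 2 * xi + rng.gauss(0, 1) for xi in x]       # constant SD
y_fan = [1 + 2 * xi + rng.gauss(0, 0.5 * xi) for xi in x]  # SD grows with x

ratio_const = spread_ratio(*residuals_and_fitted(x, y_const))
ratio_fan = spread_ratio(*residuals_and_fitted(x, y_fan))
print(f"spread ratio, constant variance:    {ratio_const:.2f}")
print(f"spread ratio, nonconstant variance: {ratio_fan:.2f}")
```

A ratio near 1 corresponds to the even vertical band expected under homoscedasticity; a ratio well above 1 corresponds to the fan shape. In practice the residual and fitted pairs would be plotted rather than reduced to a single number.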
Linear Regression
Published in Simon Washington, Matthew Karlaftis, Fred Mannering, Panagiotis Anastasopoulos, Statistical and Econometric Methods for Transportation Data Analysis, 2020
Remedial measures for dealing with heteroscedasticity include transformations of the response variable Y, weighted least squares (WLS), ridge regression, and generalized least squares. Only the first of these, transforming Y, is accomplished within the OLS regression framework. Care must be taken not to improve one situation (heteroscedasticity) at the expense of creating another, such as nonlinearity. Fortunately, fixing heteroscedasticity in many applications also tends to reduce nonlinearity. WLS regression is a method used to increase the precision of beta parameter estimates and requires a slight modification to OLS regression. Ridge regression is a technique used to produce biased but efficient estimates of beta parameters. Generalized least squares is presented in Chapter 5.
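To make the precision claim for WLS concrete, here is a minimal simulation sketch using only the Python standard library. It assumes a hypothetical model in which the error standard deviation is proportional to x, so the weights 1/x² are proportional to the inverse error variance; in real data the variance function is rarely known and would have to be estimated. The function name wls_line and all numeric settings are illustrative assumptions, not taken from the book.

```python
import random
import statistics

def wls_line(x, y, w):
    """Weighted least squares for y = a + b*x with weights w; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

rng = random.Random(7)  # fixed seed so the run is repeatable
ols_slopes, wls_slopes = [], []
for _ in range(300):
    x = [rng.uniform(1, 10) for _ in range(100)]
    # Hypothetical heteroscedastic model: true slope 2, error SD = 0.5 * x.
    y = [1 + 2 * xi + rng.gauss(0, 0.5 * xi) for xi in x]
    # OLS is the special case of equal weights.
    ols_slopes.append(wls_line(x, y, [1.0] * len(x))[1])
    # WLS with weights proportional to 1 / Var(error) = 1 / (0.25 * x^2).
    wls_slopes.append(wls_line(x, y, [1 / xi ** 2 for xi in x])[1])

var_ols = statistics.variance(ols_slopes)
var_wls = statistics.variance(wls_slopes)
print(f"variance of OLS slope estimates: {var_ols:.4f}")
print(f"variance of WLS slope estimates: {var_wls:.4f}")
```

Both estimators are centred on the true slope in this setup; the difference is that the WLS estimates vary less from sample to sample, which is the gain in precision the passage refers to.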
The transform-both-sides methodology
Published in Raymond J. Carroll, David Ruppert, Transformation and Weighting in Regression, 2017
The idea of transforming both sides of a regression model has been around for a long time. Its traditional use has been to linearize otherwise nonlinear models; there are many examples where a simple transformation of f(x, β), say the inverse or logarithm, is linear in the parameters. Modern nonlinear software makes such transformation unnecessary. Moreover, linearizing transformations may induce asymmetry or heteroscedasticity, making ordinary least squares very inefficient. Box and Hill (1974) give an example where the effect of linearization is severe induced heteroscedasticity and a physically impossible estimate of a parameter. They suggest retaining the transformation and correcting the induced heteroscedasticity by weighting. Carroll and Ruppert (1984a) show that an equally satisfactory fit results from using a transformation to homoscedasticity.
Social media use, loneliness and psychological distress in emerging adults
Published in Behaviour & Information Technology, 2023
Zoe Taylor, Ala Yankouskaya, Constantina Panourgia
We tested our hypothesis that the relationship between different types of SMU and psychological distress is mediated by loneliness. To do so, we examined a single-mediator model with types of media use (Active Social, Active non-Social and Passive) as predictors and three dimensions of psychological distress (Depression, Anxiety, Stress) as outcome variables. Before testing the mediation model, we assessed our variables to determine if mediation was appropriate. First, we tested whether the relationship between the variables is linear (Hayes 2013) by plotting residuals against predicted values for four regressions: (i) type of SMU predicting psychological distress (direct effect, c); (ii) type of SMU predicting loneliness (path a); (iii) loneliness predicting type of SMU (path b); (iv) type of SMU and loneliness predicting psychological distress (combined collinearity of b and c’). Second, we evaluated whether estimation error is relatively equal across all predicted Y values. Large variability of the estimation error may result in heteroscedasticity, which may affect the standard error of the regression coefficients (Hayes 2013). Third, we assessed the normality of estimation error using a Q-Q plot for multiple regression.
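The third check, a Q-Q plot of the estimation error, can be sketched numerically with the Python standard library. The example below is an illustration only: it uses simulated residuals rather than the study's data, and the helper names pearson and qq_correlation are hypothetical. It pairs the ordered residuals with standard normal quantiles, as a Q-Q plot does, and reports the correlation of those pairs.

```python
import random
import statistics

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = (sum((ai - ma) ** 2 for ai in a)
           * sum((bi - mb) ** 2 for bi in b)) ** 0.5
    return num / den

def qq_correlation(resid):
    """Correlation between ordered residuals and standard normal quantiles."""
    n = len(resid)
    normal = statistics.NormalDist()
    # Plotting positions (i + 0.5) / n are one common convention.
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return pearson(theoretical, sorted(resid))

rng = random.Random(3)  # fixed seed so the run is repeatable
normal_resid = [rng.gauss(0, 1) for _ in range(500)]
skewed_resid = [rng.expovariate(1.0) - 1.0 for _ in range(500)]

r_normal = qq_correlation(normal_resid)
r_skewed = qq_correlation(skewed_resid)
print(f"Q-Q correlation, normal residuals: {r_normal:.3f}")
print(f"Q-Q correlation, skewed residuals: {r_skewed:.3f}")
```

A correlation close to 1 means the Q-Q points fall near a straight line, which is what approximately normal residuals produce; skewed residuals bend away from the line and give a lower value.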
Analysing the power of deep learning techniques over the traditional methods using medicare utilisation and provider data
Published in Journal of Experimental & Theoretical Artificial Intelligence, 2019
Varadraj P. Gurupur, Shrirang A. Kulkarni, Xinliang Liu, Usha Desai, Ayan Nasir
The scatter plot depicted in Figure 7(b) using multiple LR indicates heteroscedasticity of data values. Heteroscedasticity has a major impact on regression analysis. The presence of heteroscedasticity can invalidate the significance of the results. Thus we further plan to investigate the more accurate modelling of our independent variable Total Medicare Standardized Payment Value using a DLT algorithm. The simulation gave an R² of 0.5159, which indicates that the model accounts for roughly 51% of the variance.
An adaptive two-stage dual metamodeling approach for stochastic simulation experiments
Published in IISE Transactions, 2018
Consider the simulation outputs obtained at design point x_i from running n_i simulation replications at x_i, for i = 1, 2, …, k. Suppose that a random output obtained at x_i can be regarded as being generated by the following model: Y(x_i) = m(x_i) + ϵ(x_i), where the Y(x_i)s are random observations of the unknown function m at x_i, and the ϵ(x_i)s are observation errors that are assumed to be independent, but not identically distributed, random variables with expectation zero. Heteroscedasticity means nonconstant error variance, i.e., Var[ϵ(x)] = σ²(x), which changes systematically with x rather than staying constant. Data transformation is a popular method of stabilizing the variances, and an analysis can be applied to the set of transformed data. However, such a transformation permits building a metamodel for the transformed simulation output instead of the original one, so this data transformation may not solve the original problem. Furthermore, such an attempt may neither adequately tackle the variance heterogeneity nor shed any light on the behavior of the variance function. Another line of work focuses on simultaneous estimation of the underlying mean and variance response surfaces/functions, which has been known as joint or dual modeling in the statistics literature (Carroll and Ruppert, 1988; McCullagh and Nelder, 1989; Robinson, 1997; Fan and Yao, 1998; Zabalza et al., 1998; Iooss and Ribatet, 2009; Robinson et al., 2010). Such work also relates to robust parameter design under the name of the Taguchi approach in the context of response surface methodology (Myers et al., 2009).
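The raw ingredients of dual modeling can be sketched as follows: at each design point, the replications give a sample mean (an estimate of the mean function m) and a sample variance (an estimate of the variance function). This is not the article's adaptive two-stage method, only an illustration of why replicated simulation output lets both quantities be estimated. The functions true_mean and true_sd, the design points, and the replication count are hypothetical choices made for the example.

```python
import random
import statistics

def true_mean(x):
    """Hypothetical mean function m(x), assumed for illustration."""
    return 2.0 + x

def true_sd(x):
    """Hypothetical error SD, so the error variance is true_sd(x) ** 2."""
    return 0.2 + 0.5 * x

rng = random.Random(11)  # fixed seed so the run is repeatable
design_points = [0.0, 1.0, 2.0, 3.0, 4.0]  # k = 5 design points
n_reps = 2000                               # replications per design point

mean_est, var_est = [], []
for xi in design_points:
    # Simulated outputs at xi: mean function plus zero-mean, x-dependent noise.
    outputs = [true_mean(xi) + rng.gauss(0, true_sd(xi)) for _ in range(n_reps)]
    mean_est.append(statistics.fmean(outputs))    # estimate of m(xi)
    var_est.append(statistics.variance(outputs))  # estimate of the error variance

for xi, mu, v in zip(design_points, mean_est, var_est):
    print(f"x = {xi:.1f}  mean estimate = {mu:.3f}  variance estimate = {v:.3f}")
```

The pointwise estimates would then serve as inputs to two fitted surfaces, one for the mean and one for the variance, which is the step the joint or dual modeling literature addresses.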