Multiple Linear Regression
Published in Jhareswar Maiti, Multivariate Statistical Modeling in Engineering and Management, 2023
The important diagnostic issues in MLR are identifying influential observations and detecting multicollinearity. Hair et al. (1998) stated that influential observations are of three types, namely outliers, leverage points, and influential observations. Belsley et al. (2004) stated that influential observations are those that do not conform to the pattern set by the other data points or that strongly influence the results of the regression. They also pointed out that influential observations may contain important information about the sample collected, which the analyst must not overlook. Therefore, they cannot always be treated as bad data points. Multicollinearity refers to the presence of collinear relationships among the independent variables, which may grossly affect the least squares estimation. Detecting and removing multicollinearity is therefore extremely important before using MLR results for practical purposes. We discuss these issues in the subsequent sections.
Linear Regression
Published in Simon Washington, Matthew Karlaftis, Fred Mannering, Panagiotis Anastasopoulos, Statistical and Econometric Methods for Transportation Data Analysis, 2020
Identifying influential cases is important because influential observations may have a disproportionate effect on the fit of the regression line. If an influential observation is outlying with respect to the underlying relationship, for some known or unknown reason, then it misrepresents the true relationship between the variables. The risk is that an “errant” observation dominates the fitted regression line and thus influences the inferences drawn. It is recommended that outlying influential observations be systematically checked to make sure they are legitimate observations.
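One common way to quantify how strongly a single observation pulls the fitted line is Cook's distance, which combines the size of the residual with the leverage of the point. The following is a minimal sketch under stated assumptions, not the authors' procedure: the function name `cooks_distance` and the toy data are hypothetical, and the fit is plain OLS with an added intercept.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each observation of an OLS fit (illustrative helper).

    X: predictors without a constant column (an intercept is added here).
    y: response vector.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])       # design matrix with intercept
    p = A.shape[1]                             # number of fitted parameters
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    # Leverage values: diagonal of the hat matrix H = A (A'A)^-1 A'
    h = np.einsum("ij,ji->i", A, np.linalg.pinv(A))
    mse = resid @ resid / (n - p)              # residual variance estimate
    return resid**2 / (p * mse) * h / (1.0 - h) ** 2

# Toy data (assumed): ten points close to y = 1 + 2x plus one errant point.
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20], dtype=float)
y = 1 + 2 * x
y[:10] += np.where(np.arange(10) % 2 == 0, 0.1, -0.1)
y[-1] = 10.0                                   # far from the line, far out in x
d = cooks_distance(x, y)
print(np.argmax(d))                            # index of the most influential point
```

A frequently quoted rule of thumb flags observations whose distance is large relative to the rest (for example above 1 or above 4/n), but any such cutoff is a convention rather than a test, and flagged points should be checked rather than dropped automatically.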
It’s also about timing! When do pedestrians want to receive navigation instructions
Published in Spatial Cognition & Computation, 2022
Antonia Golab, Markus Kattenbeck, Georgios Sarlas, Ioannis Giannopoulos
In addition, the absence of multicollinearity is verified by calculating the corresponding variance inflation factors (VIFs), which is required because multicollinearity could invalidate the statistical tests and parameter estimates employed. It should be noted that the VIF calculation is done on the ordinary least squares (OLS) counterpart of the employed model, with the addition of a constant term, as the required measure cannot be calculated for AFT models. Having said this, no multicollinearity issues () are detected. Finally, we used the OLS counterpart to calculate Cook’s distance (Cook, 1977) in order to detect highly influential observations (leverage 5%), which resulted in two observations being eliminated from the sample. While the variables used are explained in Table 2, descriptive statistics of the employed sample are given in Table 3, and the results of the parameter estimation and the accompanying goodness-of-fit measures are presented in Table 4.
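The VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors plus a constant. A minimal sketch of that calculation follows; the function name `variance_inflation_factors` and the simulated predictors are assumptions for illustration, and this is not the code used in the study.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF of each column of X (illustrative helper).

    X: (n, k) array of predictors without a constant column.
    Each predictor is regressed by OLS on the remaining ones plus a constant.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        ss_res = resid @ resid
        ss_tot = np.sum((target - target.mean()) ** 2)
        # 1 / (1 - R^2) with R^2 = 1 - ss_res / ss_tot simplifies to this ratio
        vifs[j] = ss_tot / ss_res
    return vifs

# Simulated predictors (assumed): x3 is nearly a copy of x1, x2 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)
print(variance_inflation_factors(np.column_stack([x1, x2, x3])))
```

In this toy setup the VIFs of `x1` and `x3` come out large while that of `x2` stays close to 1. Thresholds such as 5 or 10 are common conventions, not values taken from the excerpt above.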
Robust AFT-based monitoring procedures for reliability data
Published in Quality Technology & Quantitative Management, 2020
Shervin Asadzadeh, Arash Baghaei
Although more attention has been given to controlling multistage processes in recent years, few studies have dealt with monitoring quality characteristics in the presence of outliers, and this issue has been largely neglected in cascade processes. In general, least squares or maximum likelihood estimators are used to estimate the regression parameters from historical data. However, a problem arises when the historical data contain outlying observations. Contaminated data (outliers) can result from various causes, such as computational mistakes or the failure of machinery and equipment. Such points have detrimental effects on the estimators, and the presence of even one outlier can lead to biased estimation of the parameters (Jajo, 2005). Robust regression methods, which lessen the negative effect of highly influential observations, are used to solve this problem. Robust control charts have been thoroughly analyzed for monitoring a single quality characteristic (Rocke, 1989, 1992). Moreover, robust regression methods have been investigated with respect to three properties: breakdown point, efficiency, and bounded influence (Huber, 1973; Jajo, 2005; Rousseeuw & Leroy, 1987). Simpson and Montgomery (1998) presented a robust regression model called the compound estimator, which performs better than its predecessors on different types of data sets. Ampanthong and Suwattee (2010) focused on the weight structure used to estimate regression coefficients in multiple linear regression in the presence of outliers. Cetin and Toka (2011) compared the S-estimator with other robust estimators and the least squares estimator. Susanti, Pratiwi, Handanjani and Liana (2014) presented the M-estimator, S-estimator and MM-estimator in robust regression for determining a regression model, along with the algorithms of these methods.
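To make the idea of down-weighting outliers concrete, the sketch below fits a Huber-type M-estimator by iteratively reweighted least squares (IRLS). It is a generic textbook construction and not the procedure of any of the papers cited above. The function name `huber_irls`, the tuning constant 1.345, the MAD-based scale estimate, and the toy data are all assumptions.

```python
import numpy as np

def huber_irls(X, y, c=1.345, max_iter=100, tol=1e-8):
    """Huber M-estimate of regression coefficients via IRLS (illustrative helper).

    Returns the coefficient vector with the intercept first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # start from the OLS fit
    for _ in range(max_iter):
        resid = y - A @ beta
        # Robust scale: median absolute deviation, rescaled for normal errors
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        if scale == 0:
            break                                  # perfect fit, nothing to reweight
        u = np.abs(resid) / scale
        w = c / np.maximum(u, c)                   # Huber weights: 1 if u <= c, else c/u
        sw = np.sqrt(w)
        new_beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

# Toy data (assumed): points near y = 1 + 2x with one gross outlier in y.
x = np.arange(20.0)
y = 1 + 2 * x + np.where(np.arange(20) % 2 == 0, 0.1, -0.1)
y[18] += 40.0
ols, *_ = np.linalg.lstsq(np.column_stack([np.ones(20), x]), y, rcond=None)
print("OLS slope:", ols[1], "Huber slope:", huber_irls(x, y)[1])
```

Huber weights bound the influence of large residuals but not of high-leverage points, which is one reason the literature also considers S- and MM-estimators with higher breakdown points.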
High-pruning of silver birch (Betula pendula Roth): work efficiency as a function of pruning method, pole saw type, slash removal, operator, pruning height and branch characteristics
Published in International Journal of Forest Engineering, 2018
Jens Peter Skovsgaard, Clémentine Ols, Rebecka Mc Carthy
Model performance was evaluated primarily on the basis of extensive analyses of residual plots. To reveal possible trends in model predictions and to evaluate the assumption of variance homogeneity, studentized residuals were plotted against predicted values and against predictor variables, on both transformed and untransformed scales. Possible influential observations were screened for using Cook’s D statistic (none were identified).
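A residual check of this kind can be sketched as follows. The code computes internally studentized residuals for an OLS fit and pairs them with the fitted values, which are the quantities one would plot. The excerpt does not say whether internal or external studentization was used, so the internal form is an assumption here, as are the function name `studentized_residuals` and the toy data.

```python
import numpy as np

def studentized_residuals(X, y):
    """Fitted values and internally studentized residuals of an OLS fit.

    X: predictors without a constant column (an intercept is added here).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])       # design matrix with intercept
    p = A.shape[1]
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ beta
    resid = y - fitted
    # Leverage values: diagonal of the hat matrix
    h = np.einsum("ij,ji->i", A, np.linalg.pinv(A))
    s2 = resid @ resid / (n - p)               # residual variance estimate
    return fitted, resid / np.sqrt(s2 * (1.0 - h))

# Toy data (assumed): a linear trend with one unusually large response.
x = np.arange(15.0)
y = 3 + 0.5 * x + np.where(np.arange(15) % 2 == 0, 0.2, -0.2)
y[7] += 5.0
fitted, r = studentized_residuals(x, y)
for f, ri in zip(fitted, r):
    print(f"{f:8.3f} {ri:8.3f}")               # pairs to plot: fitted vs residual
```

A roughly constant vertical spread of these residuals across the fitted values is consistent with homogeneous variance, whereas a funnel shape or a trend suggests otherwise.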