Analysis, Programming
Published in Zaven A. Karian, Edward J. Dudewicz, Modern Statistical, Systems, and GPSS Simulation, 2020
Zaven A. Karian, Edward J. Dudewicz
with R² = 0.773 and Cp = 6.00. (For some details on statistical programming, see Dudewicz, Chen, and Taneja (1989), especially Chapters 16-18.) A model without the C2 term could achieve R² = 0.772 and a better Cp = 4.01, but we will not reduce the model at this time. To check whether any observations are “outliers,” the predictions and residuals (differences between actual and predicted values), as well as Studentized residuals (residuals divided by the estimated standard error at the design point), are computed as summarized in Table 8.3-2.
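The computation of predictions, residuals, and Studentized residuals described above can be sketched as follows. This is a minimal illustration and not the authors' program: it assumes an ordinary least-squares fit with an intercept and the "internal" form of Studentization, and the function name `studentized_residuals` is hypothetical.

```python
import numpy as np

def studentized_residuals(X, y):
    """Return (fitted, residuals, internally studentized residuals) for OLS.

    Assumption: an intercept column is added to the predictors X here.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])    # design matrix with intercept
    n, p = A.shape

    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    fitted = A @ beta
    resid = y - fitted                            # actual minus predicted

    # Leverages h_ii are the diagonal of the hat matrix A (A'A)^-1 A'.
    # With a reduced QR factorization A = QR, they are the row sums of Q**2.
    Q, _ = np.linalg.qr(A)
    h = np.sum(Q**2, axis=1)

    s2 = resid @ resid / (n - p)                  # residual variance estimate
    # The estimated standard error of residual i is sqrt(s2 * (1 - h_ii)).
    t = resid / np.sqrt(s2 * (1.0 - h))
    return fitted, resid, t
```

Observations whose Studentized residual is large in absolute value are candidates for closer inspection; the cutoff is a judgment call that the excerpt does not specify.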
Data Analysis
Published in Marian Muste (Editor-in-Chief), Dennis A. Lyn, David M. Admiraal, Robert Ettema, Vladimir Nikora, Marcelo H. Garcia, Experimental Hydraulics: Methods, Instrumentation, Data Processing and Management, 2017
Marian Muste (Editor-in-Chief), Dennis A. Lyn, David M. Admiraal, Robert Ettema, Vladimir Nikora, Marcelo H. Garcia
The assessment of a regression model should include a study of the residuals (or quantities derived from them), of their variation with the predicted response variable and possibly other predictors, and of their deviation from any assumed distribution. For non-linear models, the raw residuals, e_i, are used. For linear models, standardized (or so-called Studentized) residuals are preferred because they facilitate comparisons. The basic residual diagnostics are graphical and hence mainly qualitative: plots of residuals against either predictions or individual regressors are searched for systematic variations that would indicate gross violations of the inference assumptions of linearity, independence, and homogeneity. A normal-distribution assumption can be examined through quantile-quantile plots (see Section 6.3.1, Volume I), in which the relevant residual quantiles are plotted against the ideal normal-distribution quantiles, so that large deviations from a straight line would be interpreted as non-normal behavior.
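The quantile-quantile check described above can be sketched numerically. The example below is an illustration under stated assumptions, not the book's procedure: it uses the plotting positions (i − 0.5)/n, which is only one of several common conventions, and the helper names `normal_qq_points` and `qq_correlation` are hypothetical. The correlation between the two sets of quantiles is offered as a rough numeric companion to the visual straight-line check.

```python
from statistics import NormalDist
import numpy as np

def normal_qq_points(residuals):
    """Return (theoretical normal quantiles, sorted residuals) for a Q-Q plot."""
    r = np.sort(np.asarray(residuals, dtype=float))
    n = len(r)
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    probs = (np.arange(1, n + 1) - 0.5) / n
    std_normal = NormalDist()                     # mean 0, standard deviation 1
    theo = np.array([std_normal.inv_cdf(p) for p in probs])
    return theo, r

def qq_correlation(residuals):
    """Correlation of the Q-Q points; values near 1 mean a near-straight line."""
    theo, r = normal_qq_points(residuals)
    return float(np.corrcoef(theo, r)[0, 1])
```

Plotting `theo` against `r` (for example with a scatter plot) gives the usual Q-Q display; strong curvature or a few points far from the line suggests non-normal residuals.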
Regression
Published in Richard L. Shell, Ernest L. Hall, Handbook of Industrial Automation, 2000
Outliers are often difficult to detect because, in particular, they may be obscured by the presence of high leverage points. Generally speaking, an outlier has a high absolute value for its “Studentized” residual. Hoaglin and Welsch [4] suggest several alternative methods for detecting outliers. Many of these methods consider what happens to estimated coefficients and residuals when a suspected outlier is deleted. Instead of discussing these methods we will use a principal-component method for detecting outliers in the next section.
Recursive pseudo fatigue cracking damage model for asphalt pavements
Published in International Journal of Pavement Engineering, 2021
Kenneth A. Tutu, David H. Timm
Influential outliers over-determine the equation fit; therefore, plots of studentized residuals (residuals divided by their standard deviation) against predicted BETA were examined to detect outliers in the observed BETA values. Observations whose studentized residuals lie more than 3 standard deviations from zero are considered outliers (Chatterjee and Hadi 2012). Leverage was used to identify outliers in the predictor variables. Leverages greater than 0.5 are very high, while those between 0.2 and 0.5 are moderate (Kutner et al. 2008). Cook’s distance, which measures the difference between the regression coefficients obtained from the full dataset and those obtained after deleting the i-th observation, quantifies the influence of outliers. Observations with Cook’s distance greater than one are influential (Chatterjee and Hadi 2012). If a fitted equation satisfied the regression assumptions, R² was a valid statistic for measuring the goodness-of-fit and for assessing predictive capability. Adjusted R², which accounts for the number of predictor variables, evaluated goodness-of-fit, while predicted R² assessed model predictive power.
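The diagnostics named above (leverage, Cook's distance, adjusted R², predicted R²) can be computed together from a single least-squares fit. The sketch below is illustrative only and is not the authors' code: it assumes an OLS model with an intercept, uses the standard closed-form expression for Cook's distance rather than refitting with each observation deleted, and computes predicted R² from the PRESS statistic. The function name `influence_summary` is hypothetical.

```python
import numpy as np

def influence_summary(X, y):
    """Leverage, Cook's distance, R2, adjusted R2 and predicted R2 for OLS."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])     # intercept + predictors
    n, p = A.shape                                # p counts the intercept

    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    e = y - A @ beta                              # raw residuals

    Q, _ = np.linalg.qr(A)
    h = np.sum(Q**2, axis=1)                      # leverages (hat-matrix diagonal)

    s2 = e @ e / (n - p)                          # residual variance estimate
    r = e / np.sqrt(s2 * (1.0 - h))               # internally studentized residuals
    cooks = r**2 * h / (p * (1.0 - h))            # Cook's distance, closed form

    sst = np.sum((y - y.mean())**2)               # total sum of squares
    r2 = 1.0 - (e @ e) / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    press = np.sum((e / (1.0 - h))**2)            # leave-one-out prediction errors
    pred_r2 = 1.0 - press / sst

    return {"leverage": h, "studentized": r, "cooks_distance": cooks,
            "r2": r2, "adjusted_r2": adj_r2, "predicted_r2": pred_r2}
```

The cutoffs quoted in the excerpt (leverage above 0.5, Cook's distance above one) can then be applied directly to the returned arrays, for example `out["cooks_distance"] > 1`.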
Single-/triple-stage biotrickling filter treating a H2S-rich biogas stream: Statistical analysis of the effect of empty bed retention time and liquid recirculation velocity
Published in Journal of the Air & Waste Management Association, 2019
Reza Salehi, Sumate Chaiprapat
It is evident from Figure 1 that the data points on the plot were dispersed close to the 45-degree line (100% correlation line), with R² values of 0.988 and 0.987 for H2S removal efficiency in SBTF and TBTF, respectively. This implies that the models could satisfactorily explain the variability in the responses. The models failed to explain only about 1.2% and 1.3% (= 100 × (1 − R²)) of the total variability, respectively. In addition, diagnostic plots, including the normal probability plot (NPP) (Figure 2) and the plot of studentized residuals versus the predicted values for the responses (Figure 3), were constructed to reveal any problems in the experimental data. A studentized residual is the residual divided by its standard deviation, where the residual is the difference between an actual value for the response and its corresponding predicted value.
Method for Developing a Side Impact Upper Neck Injury Criteria Which Compensates for Biomechanical Differences Between ATDs and Humans
Published in IISE Transactions on Occupational Ergonomics and Human Factors, 2018
Stephen J. Satava II, Lt Col Jeffrey C. Parr, Michael E. Miller
Multiple regression was used to identify the relationships between the dependent variable (MANIC response) and its explanatory variables (Peak G, helmet weight, gender, subject weight, etc.). All regression models were constructed using JMP Pro (v. 12.0.1, SAS Institute Inc.). We evaluated whether there was a significant regression relationship (H0: there is not a regression relationship); potential lack of fit for the model (H0: model fit is reasonable, i.e., no lack of fit); and significance of each model factor (H0: the factor is not significant, i.e., the factor’s contribution to the model’s slope or intercept is zero). Studentized residuals were used to identify potential outliers (observations with studentized residuals >3 were deemed outliers and excluded from the model). Finally, R² was used to quantify the proportion of variance explained by the model. All statistical tests controlled the Type I error rate at 0.05.
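The outlier-exclusion step described above can be sketched as a single screen-and-refit pass. This is a hedged illustration, not a reproduction of the JMP Pro analysis: it assumes internally studentized residuals, applies the threshold to the absolute value (the excerpt writes only ">3"), and performs one pass rather than iterating. The names `_studentized` and `drop_outliers_and_refit` are hypothetical.

```python
import numpy as np

def _studentized(A, y):
    """Internally studentized residuals for the design matrix A (assumed form)."""
    n, p = A.shape
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    e = y - A @ beta
    Q, _ = np.linalg.qr(A)
    h = np.sum(Q**2, axis=1)                      # leverages
    s2 = e @ e / (n - p)
    return e / np.sqrt(s2 * (1.0 - h))

def drop_outliers_and_refit(X, y, threshold=3.0):
    """Flag |studentized residual| > threshold, refit once without those rows.

    Returns (indices of excluded observations, refitted coefficients),
    with the intercept as the first coefficient.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])

    t = _studentized(A, y)
    keep = np.abs(t) <= threshold                 # rows retained in the model
    beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
    return np.flatnonzero(~keep), beta
```

One caveat on this design choice: an internally studentized residual cannot exceed the square root of the residual degrees of freedom in absolute value, so with very small samples a cutoff of 3 may never be reached, and an externally studentized (deleted) residual may be the more useful screen.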