Overdispersion – Knowledge and References

Statistical Techniques and Stochastic Modeling in Public Health Surveillance Systems Engineering

Published in Mangey Ram, Recent Advances in Mathematics for Engineering, 2020

Emmanouil-Nektarios Kalligeris, Alex Karagrigoriou, Christina Parpoula, Angeliki Lambrou

Some of the factors that may affect any assessment of the relative merits of available methods are (i) the scope and field of application of the public health surveillance system, e.g., the number (from one to a few thousand) of parallel data series to be monitored; (ii) the quality of the data, which is related to the method of data collection as well as possible delays between the time of occurrence and the time of reporting; (iii) the spatiotemporal data features, which may include the frequency, the trend and the seasonality structure, the epidemicity, and finally the time step and spatial resolution; (iv) the nonstationarity and the possible existence of correlations in the frequency distribution of the data; (v) the possible existence of overdispersion; (vi) the outbreak-specific characteristics such as explosive or gradual onset, brief or long duration, severity, or any mixture of the above; (vii) the use for which the system is intended, including the post-signal processing protocols; (viii) the support of the system in terms of processing power and human resources; and (ix) the choice of metrics and measures for performance evaluation.

Mixture Modelling of Discrete Data

Published in Sylvia Frühwirth-Schnatter, Gilles Celeux, Christian P. Robert, Handbook of Mixture Analysis, 2019

Dimitris Karlis

The Poisson distribution is perhaps the simplest choice for modelling such data. However, since the mean of the Poisson distribution is equal to its variance, this choice may impose too restrictive an assumption in practice. A typical solution to deal with overdispersion (i.e. to allow the variance to be larger than the mean) is to consider mixtures of Poisson distributions; see Karlis & Xekalaki (2005) for a comprehensive review. In this section we consider finite mixtures of Poisson and related discrete distributions.
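
Why a mixture of Poisson distributions is overdispersed can be checked with a short calculation. The sketch below is illustrative only and is not taken from the chapter: the function names, the two-component weights and the rates are all assumptions. It uses the law of total variance, under which the mixture variance equals the mixture mean plus the spread of the component rates around that mean.

```python
import math
import random


def poisson_mixture_moments(weights, rates):
    """Mean and variance of a finite mixture sum_k w_k * Poisson(rate_k)."""
    mean = sum(w * lam for w, lam in zip(weights, rates))
    # Law of total variance: the within-component part equals the mean,
    # and the between-component part is the weighted spread of the rates.
    between = sum(w * (lam - mean) ** 2 for w, lam in zip(weights, rates))
    return mean, mean + between


def sample_poisson(rng, lam):
    """Draw one Poisson variate with Knuth's method (fine for small rates)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_mixture(rng, weights, rates, n):
    """Draw n counts: pick a component by weight, then draw from its Poisson."""
    components = rng.choices(range(len(rates)), weights=weights, k=n)
    return [sample_poisson(rng, rates[c]) for c in components]


if __name__ == "__main__":
    # Hypothetical two-component mixture, chosen only for illustration.
    weights, rates = [0.5, 0.5], [2.0, 10.0]
    mean, var = poisson_mixture_moments(weights, rates)
    print(f"theoretical mean={mean:.2f}, variance={var:.2f}")

    draws = sample_mixture(random.Random(0), weights, rates, 5000)
    m = sum(draws) / len(draws)
    v = sum((d - m) ** 2 for d in draws) / (len(draws) - 1)
    print(f"simulated   mean={m:.2f}, variance={v:.2f}")
```

With a single component the between-component term vanishes and the variance equals the mean, which recovers the plain Poisson case.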

Count Data Models

Published in Simon Washington, Matthew Karlaftis, Fred Mannering, Panagiotis Anastasopoulos, Statistical and Econometric Methods for Transportation Data Analysis, 2020

Simon Washington, Matthew Karlaftis, Fred Mannering, Panagiotis Anastasopoulos

When the data are overdispersed, the estimated variance term is larger than one would expect under a true Poisson process. As overdispersion gets larger, so does the estimated variance, and consequently all of the standard errors of parameter estimates become inflated.
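
One common way to quantify how much larger the variance is than a Poisson process would imply is the Pearson dispersion statistic. The sketch below is a generic diagnostic, not necessarily the procedure used in the chapter; the function names and the toy data are assumptions. A value near 1 is consistent with a Poisson model, and values well above 1 suggest overdispersion. Under a quasi-Poisson adjustment, the Poisson standard errors are multiplied by the square root of this statistic.

```python
import math


def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    pearson = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return pearson / (len(counts) - n_params)


def quasi_poisson_se(poisson_se, dispersion):
    """Scale a Poisson standard error by the square root of the dispersion."""
    return poisson_se * math.sqrt(dispersion)


if __name__ == "__main__":
    # Hypothetical counts; the intercept-only fit is simply the sample mean.
    counts = [0, 0, 1, 2, 12]
    mu_hat = sum(counts) / len(counts)
    phi = pearson_dispersion(counts, [mu_hat] * len(counts), n_params=1)
    print(f"dispersion estimate: {phi:.3f}")
    print(f"adjusted SE for a Poisson SE of 0.10: {quasi_poisson_se(0.10, phi):.3f}")
```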

Coupling shared E-scooters and public transit: a spatial and temporal analysis

Published in Transportation Letters, 2023

Mohammadjavad Javadiansr, Amir Davatgari, Ehsan Rahimi, Motahare Mohammadi, Abolfazl (Kouros) Mohammadian, Joshua Auld

Since our dependent variables consist of nonnegative integer values, we employed a negative binomial count modeling approach for our econometric analysis to characterize the factors affecting the frequency of using shared e-scooters with public transit. The negative binomial model is a type of generalized linear model in which the dependent variable is a count of the number of times an event occurs. It is particularly useful when dealing with overdispersion, a situation in which the observed variance of the data is larger than its theoretical variance. Overdispersion often exists in real-world data, and failing to account for it can lead to underestimated standard errors. The negative binomial model has an extra parameter that allows the variance to be greater than the mean, which makes it a more flexible and robust approach for modeling count data that show overdispersion.
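
The role of the extra parameter can be illustrated with a method-of-moments estimate. The sketch below assumes the common NB2 parameterization, in which the variance is mean + alpha * mean^2; the excerpt does not state which parameterization the authors used, and the function names and data are hypothetical. It is a rough diagnostic, not a substitute for fitting the regression model by maximum likelihood.

```python
import statistics


def nb2_variance(mean, alpha):
    """Variance implied by the NB2 form: mean + alpha * mean**2 (assumed)."""
    return mean + alpha * mean ** 2


def nb2_moment_alpha(counts):
    """Method-of-moments estimate of alpha, truncated at zero.

    Solves s^2 = m + alpha * m^2 for alpha, where m is the sample mean
    and s^2 the sample variance. Returns 0.0 when the data are not
    overdispersed (sample variance at or below the mean).
    """
    m = statistics.mean(counts)
    s2 = statistics.variance(counts)  # sample variance, n - 1 denominator
    return max(0.0, (s2 - m) / m ** 2)


if __name__ == "__main__":
    counts = [0, 0, 1, 2, 12]  # hypothetical trip counts
    alpha_hat = nb2_moment_alpha(counts)
    m = statistics.mean(counts)
    print(f"alpha estimate: {alpha_hat:.3f}")
    print(f"implied variance: {nb2_variance(m, alpha_hat):.2f} vs mean {m:.2f}")
```

When alpha is zero the implied variance collapses to the mean, so the Poisson model appears as a limiting case under this assumed form.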

Assessing the safety impacts of raising the speed limit on Michigan freeways using the multilevel mixed-effects negative binomial model

Published in Traffic Injury Prevention, 2020

Keneth Morgan Kwayu, Valerian Kwigizile, Jun-Seok Oh

Count data are usually modeled with the Poisson distribution, which assumes that the mean and variance are equal. However, this assumption rarely holds in observational data. In most cases, the observed variance is greater than the theoretical variance specified by the model, a condition known as overdispersion (Lee et al. 2012). Overdispersion can be accommodated with a Poisson-gamma mixture, commonly known as the negative binomial model. The probability mass function of the negative binomial regression (Equation 1 of the article) is expressed in terms of the count data, an overdispersion parameter and the gamma function. Integrating over the gamma-distributed component of the mixture shows that the variance of the count exceeds its mean by an amount governed by the overdispersion parameter.
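
The symbols of the article's Equation 1 did not survive in this excerpt. For orientation, one standard way to write the Poisson-gamma (NB2) probability mass function is given below. The notation is an assumption and may differ from the article's: $y_i$ is the observed count, $\mu_i$ its mean, $\alpha > 0$ the overdispersion parameter, and $\Gamma(\cdot)$ the gamma function.

```latex
P(Y_i = y_i) =
  \frac{\Gamma\!\left(y_i + \tfrac{1}{\alpha}\right)}
       {\Gamma\!\left(\tfrac{1}{\alpha}\right)\, y_i!}
  \left(\frac{1/\alpha}{1/\alpha + \mu_i}\right)^{1/\alpha}
  \left(\frac{\mu_i}{1/\alpha + \mu_i}\right)^{y_i},
  \qquad y_i = 0, 1, 2, \ldots

\operatorname{E}[Y_i] = \mu_i,
\qquad
\operatorname{Var}[Y_i] = \mu_i + \alpha\,\mu_i^{2}
```

Under this form the variance approaches the mean as $\alpha \to 0$, so the Poisson model is recovered as a limiting case.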

Analyzing truck accident data on the interurban road Ankara–Aksaray–Eregli in Turkey: Comparing the performances of negative binomial regression and the artificial neural networks models

Published in Journal of Transportation Safety & Security, 2019

Funda Ture Kibar, Fazil Celik, Fred Wegman

Several statistical methods are used for analyzing road accidents, particularly truck accidents (Amarishinga & Dissanayake, 2013; Dong, Burton, Nambisan, & Sun, 2016; Islam & Hernandez, 2016; Miaou, 1994; Ramirez, Izquierdo, Fernández, & Mendez, 2009; Sharma & Lange, 2013). These methods aim to establish the relationship (correlation) between truck accidents and explanatory variables. Poisson regression is a common methodology in accident analysis; however, it has the limitation that the mean and the variance of the dependent variable must be equal (Abdel-Aty & Radwan, 2000; Chengye & Ranjitkar, 2013; Dean & Lawless, 1989). If the variance in a given data set is greater than the mean, the data are overdispersed (Miaou, Hu, Wright, Davis, & Rathi, 1993). The negative binomial (NB) model allows for overdispersion in the data and is widely used in accident analysis (Caliendo, Guida, & Parisi, 2007; Chang, 2005; Dong et al., 2016; Fancello, Soddu, & Fadda, 2016; Poch & Mannering, 1996; Van Petegem & Wegman, 2014). Road safety research faces another problem: because accidents occur relatively seldom, the data to be analyzed show an excess of zeros. In that case, zero-inflated models are generally used for predicting accident involvement (Ayati & Abbasi, 2014; Dong et al., 2016; Qin, Ivan, & Ravishankar, 2004; Shankar, Milton, & Mannering, 1997; Tehrani, Falls, & Mesher, 2016).
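
How a zero-inflated model produces an excess of zeros can be seen from its probability mass function. The sketch below uses a zero-inflated Poisson as the simplest case; it is not the specification used in the article, and the function names and parameter values are assumptions. With probability `pi_zero` an observation is a structural zero, and otherwise it is drawn from a Poisson distribution with rate `lam`.

```python
import math


def zip_pmf(y, pi_zero, lam):
    """Zero-inflated Poisson probability of observing the count y (lam > 0)."""
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    if y == 0:
        # Zeros come from both the structural-zero state and the Poisson state.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


def zip_moments(pi_zero, lam):
    """Mean and variance of the zero-inflated Poisson distribution."""
    mean = (1.0 - pi_zero) * lam
    variance = mean * (1.0 + pi_zero * lam)
    return mean, variance


if __name__ == "__main__":
    pi_zero, lam = 0.3, 2.0  # hypothetical values for illustration
    print(f"P(0) under plain Poisson: {math.exp(-lam):.3f}")
    print(f"P(0) under ZIP:           {zip_pmf(0, pi_zero, lam):.3f}")
    mean, variance = zip_moments(pi_zero, lam)
    print(f"ZIP mean={mean:.2f}, variance={variance:.2f}")
```

The variance formula shows that zero inflation by itself makes the variance exceed the mean whenever `pi_zero` is positive, which is one reason excess zeros and overdispersion often appear together in count data.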