Monsters and Mixtures
Published in Richard McElreath, Statistical Rethinking, 2020
This chapter introduced several new types of regression, all of which are generalizations of generalized linear models (GLMs). Ordered logistic models are useful for categorical outcomes with a strict ordering. They are built by attaching a cumulative link function to a categorical outcome distribution. Zero-inflated models mix together two different outcome distributions, allowing us to model outcomes with an excess of zeros. Models for over-dispersion, such as beta-binomial and gamma-Poisson, draw the expected value of each observation from a distribution that changes shape as a function of a linear model. The next chapter further generalizes these model types by introducing multilevel models.
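As a minimal sketch of what "mixing two outcome distributions" means, the zero-inflated Poisson case can be written as a point mass at zero combined with an ordinary Poisson. The function names and the parameter values `P_ZERO` and `LAM` below are illustrative assumptions, not taken from the chapter.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of observing count k with rate lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def zip_pmf(k: int, p_zero: float, lam: float) -> float:
    """Zero-inflated Poisson: a point mass at zero mixed with a Poisson."""
    count_part = (1.0 - p_zero) * poisson_pmf(k, lam)
    # Zeros can come from either component; positive counts only from the Poisson.
    return p_zero + count_part if k == 0 else count_part


# Hypothetical parameter values, chosen only for illustration.
P_ZERO, LAM = 0.3, 2.0
zip_probs = [zip_pmf(k, P_ZERO, LAM) for k in range(60)]

print(f"P(Y=0) under plain Poisson: {poisson_pmf(0, LAM):.3f}")
print(f"P(Y=0) under ZIP:           {zip_probs[0]:.3f}")
```

The mixture assigns more probability to zero than the Poisson alone would, which is the "excess of zeros" the summary refers to.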
Measuring infrastructure and community recovery rate using Bayesian methods: A case study of power systems resilience
Published in Stein Haugen, Anne Barros, Coen van Gulijk, Trond Kongsvik, Jan Erik Vinnem, Safety and Reliability – Safe Societies in a Changing World, 2018
Generalized linear models (GLMs) are widely used for regression when count data are present. Within this class of models, the Poisson density function is often used with a log-link function; if the variance of the counts is higher than their mean, it is common to use a negative binomial GLM instead. In certain special cases, extensions of these models can accommodate specific situations. For example, zero-truncated models can be used when zero counts cannot be observed, and zero-inflated models when there are excess zero counts (Shankar & Mannering, 1997); both build on an underlying Poisson distribution.
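The variance-versus-mean comparison mentioned here can be sketched with simulated counts. This is an illustrative check under assumed values, not the procedure used in the cited study: `dispersion_index`, the sample size, the mean count, and the gamma shape are all hypothetical choices.

```python
import math
import random
import statistics


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson count with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def dispersion_index(counts) -> float:
    """Sample variance divided by sample mean; close to 1 for Poisson data."""
    return statistics.variance(counts) / statistics.mean(counts)


rng = random.Random(1)
n, mean_count, shape = 5000, 3.0, 1.0  # hypothetical values

poisson_counts = [sample_poisson(mean_count, rng) for _ in range(n)]
# Gamma-Poisson (negative binomial): each observation gets its own gamma-distributed rate.
nb_counts = [
    sample_poisson(rng.gammavariate(shape, mean_count / shape), rng)
    for _ in range(n)
]

print(f"Poisson dispersion index:       {dispersion_index(poisson_counts):.2f}")
print(f"Gamma-Poisson dispersion index: {dispersion_index(nb_counts):.2f}")
```

An index well above 1 suggests that a plain Poisson GLM understates the variability, which is the situation in which a negative binomial GLM is usually considered.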
Count Data Models
Published in Simon Washington, Matthew Karlaftis, Fred Mannering, Panagiotis Anastasopoulos, Statistical and Econometric Methods for Transportation Data Analysis, 2020
Zero-inflated models imply that the underlying data-generating process has a splitting regime that provides for two types of zeros. The splitting process can be assumed to follow a logit (logistic) or probit (normal) probability process, or other probability processes. A point to remember is that there must be underlying justification to believe the splitting process exists (resulting in two distinct states) prior to fitting this type of statistical model. There should be a basis for believing that part of the process is in a zero-count state.
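The "two types of zeros" can be made concrete with a small sketch. Assuming a logit splitting process and a Poisson count state (one of the options the passage lists), Bayes' rule gives the probability that an observed zero came from the zero-count state rather than from the count process. The function names and the example inputs are hypothetical.

```python
import math


def inv_logit(x: float) -> float:
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def prob_zero_is_structural(split_linear_predictor: float, lam: float) -> float:
    """P(zero-count state | observed count is 0) under a zero-inflated Poisson."""
    p_zero_state = inv_logit(split_linear_predictor)  # logit splitting process
    # A zero can also arise from the Poisson state purely by chance.
    p_sampling_zero = (1.0 - p_zero_state) * math.exp(-lam)
    return p_zero_state / (p_zero_state + p_sampling_zero)


# Hypothetical inputs: an even split (linear predictor 0) and three Poisson means.
for lam in (0.5, 2.0, 5.0):
    share = prob_zero_is_structural(0.0, lam)
    print(f"lambda={lam}: share of zeros from the zero-count state = {share:.3f}")
```

When the Poisson mean is small, many zeros come from the count state itself, so the data carry little information about the split; this is one reason a prior justification for the zero-count state matters.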
Eye-tracking of Facial Emotions in Relation to Self-criticism and Self-reassurance
Published in Applied Artificial Intelligence, 2019
Bronislava Strnádelová, Júlia Halamová, Martin Kanovský
The dependent variable was a continuous, bounded variable defined on the interval (0, n). Eye-tracking measures, including fixation, duration, and most saccade measures, are not normally distributed; they tend to have a skewed (typically right-skewed) distribution (Holmqvist et al., 2011). Our data (see Figure 1) clearly exhibited this typical distribution. Holmqvist et al. (2011) recommended log-normal or gamma distributions. Since our data contained many zeros, we had to use one of the two common methods for dealing with zero-inflated data: (1) modelling a zero-inflation parameter that represents the probability that a given zero comes from the main distribution (zero-inflated models), or (2) modelling whether an observation is zero or non-zero with one model (the Bernoulli model) and then modelling the non-zero data with another (a log-normal or gamma model). Models of the second kind are called hurdle models. Zero-inflated models are clearly not applicable here: the zeros cannot come from the main distribution (log-normal or gamma) because these distributions do not allow zero values. We therefore fitted log-normal hurdle and gamma hurdle models.
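The two-part structure of a hurdle model can be sketched as a log-likelihood. This is an illustrative gamma hurdle under assumed parameter values, not the authors' fitted model; the function names, the toy observations, and the parameters `p_zero`, `shape`, and `rate` are hypothetical.

```python
import math


def gamma_logpdf(y: float, shape: float, rate: float) -> float:
    """Log density of a gamma distribution (shape/rate form); requires y > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(y) - rate * y)


def hurdle_gamma_logpdf(y: float, p_zero: float, shape: float, rate: float) -> float:
    """Bernoulli part decides zero vs. non-zero; gamma part models positive values only."""
    if y == 0:
        return math.log(p_zero)
    return math.log1p(-p_zero) + gamma_logpdf(y, shape, rate)


def hurdle_gamma_loglik(data, p_zero: float, shape: float, rate: float) -> float:
    """Total log-likelihood of independent observations."""
    return sum(hurdle_gamma_logpdf(y, p_zero, shape, rate) for y in data)


# Made-up observations with several exact zeros, for illustration only.
toy_data = [0.0, 0.0, 0.8, 1.5, 0.0, 2.3, 0.4]
print(hurdle_gamma_loglik(toy_data, p_zero=3 / 7, shape=1.5, rate=1.2))
```

Because the gamma density is never evaluated at zero, every zero is attributed to the Bernoulli part, which is what distinguishes a hurdle model from a zero-inflated one.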
A discrete spatial model for wafer yield prediction
Published in Quality Engineering, 2018
Hao Wang, Bo Li, Seung Hoon Tong, In-Kap Chang, Kaibo Wang
Zero-inflated models are another commonly used alternative to the NB model. These models are designed to address dominant zeros in datasets, which can be regarded as a coupled problem of clustered defects. Since Lambert (1992) introduced zero-inflated Poisson (ZIP) regression, various zero-inflated models have been proposed to fit data with many zero values, such as the ZIP model, the zero-inflated binomial (ZIB) model, and the zero-inflated negative binomial (ZINB) model (Yau and Lee 2001; Fatahi et al. 2012; He et al. 2012). Zero-inflated models are constructed on the assumption that a random shock leads to a Poisson or NB process and that this random shock occurs independently with probability p.
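For the Poisson case, the random-shock construction can be written out as follows. This is a sketch under the reading that p is the probability that the shock occurs, so that a count is zero with probability 1 - p when there is no shock; the symbol λ for the Poisson mean is introduced here and does not appear in the excerpt.

```latex
\begin{aligned}
P(Y = 0) &= (1 - p) + p\,e^{-\lambda},\\
P(Y = k) &= p\,\frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 1, 2, \dots,\\
\mathbb{E}[Y] &= p\lambda, \qquad
\operatorname{Var}(Y) = p\lambda\bigl(1 + (1 - p)\lambda\bigr).
\end{aligned}
```

Since the variance exceeds the mean whenever p < 1, zero inflation is itself a source of over-dispersion relative to a plain Poisson model.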
User characteristics of shared-mobility: a comparative analysis of car-sharing and ride-hailing services
Published in Transportation Planning and Technology, 2021
Kate (Kyung) Hyun, Farah Naz, Courtney Cronley, Sarah Leat
The ZINB model was developed using R statistical software. As previously mentioned, the ZINB model provides two modelling outcomes: a negative binomial count model and a zero-inflated model. The count model focuses on the number of shared-mobility services used, while the zero-inflated model describes membership of the excess-zero (non-user) group. Researchers are free to choose different independent variables for each model; this study used the same independent variables for both models to understand how they affected shared-mobility use behaviours. All variables were checked for multicollinearity and showed Variance Inflation Factor (VIF) values close to one.
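The multicollinearity check can be illustrated with a small sketch. It covers only the special case of two predictors, where the VIF reduces to 1 / (1 - r²) with r the Pearson correlation between them; the study itself used R and more variables. The variable names and values below are made up for illustration.

```python
import math


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2) -> float:
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values, not data from the study.
age = [23, 35, 41, 52, 29, 60]
income = [40, 38, 55, 47, 62, 45]
print(f"VIF for age and income: {vif_two_predictors(age, income):.2f}")
```

A VIF near one means a predictor is nearly uncorrelated with the others, so its coefficient is not inflated by redundancy among the predictors.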