Epidemiology
Published in Samuel C. Morris, Cancer Risk Assessment, 2020
Ecological studies, also called aggregate or correlational studies, compare aggregated disease data with spatial data on exposure. Most ecological studies are cross-sectional in design, but they can be done longitudinally, comparing aggregated disease data in the same location in different time periods. In a classic example, Lave and Seskin (1977) compared mortality rates in U.S. Standard Metropolitan Statistical Areas (SMSAs) with air pollution levels in those cities in a multiple regression analysis that also included population density and factors accounting for age, race, and income. Significant correlations were found between air pollution and both total mortality and total cancers. The primary weakness of such a study design has been described as the ecological fallacy, which results from drawing causal inferences about individual phenomena from observations of groups. The study tells us that mortality rates were correlated with air pollution levels, but the air pollution levels were based on central city monitoring stations. We do not know whether the individuals who died were exposed to those levels. Put another way, “we do not know the joint distribution of the study factor(s) and the disease within each group” (Kleinbaum et al., 1982). The correlation between two ecologic variables can be markedly different from the corresponding correlation using individual data from the same populations.
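The point about the unknown joint distribution can be made concrete with a small sketch. It assumes a single hypothetical group for which only the totals are known (group size, number exposed, number of deaths). The figures and the function names `joint_cell_bounds` and `risk_difference` are illustrative assumptions, not taken from Lave and Seskin or Kleinbaum et al.

```python
def joint_cell_bounds(n, exposed, diseased):
    """Range of possible counts of people who are BOTH exposed and diseased,
    given only the group totals (the margins of the 2x2 table)."""
    lowest = max(0, exposed + diseased - n)
    highest = min(exposed, diseased)
    return lowest, highest


def risk_difference(n, exposed, diseased, both):
    """Risk among the exposed minus risk among the unexposed for one 2x2 table."""
    risk_exposed = both / exposed
    risk_unexposed = (diseased - both) / (n - exposed)
    return risk_exposed - risk_unexposed


# Hypothetical group: 1000 people, 400 exposed, 100 deaths (made-up numbers).
n, exposed, diseased = 1000, 400, 100

lo, hi = joint_cell_bounds(n, exposed, diseased)
rd_lo = risk_difference(n, exposed, diseased, lo)  # no death occurred among the exposed
rd_hi = risk_difference(n, exposed, diseased, hi)  # every death occurred among the exposed

print(f"exposed-and-diseased count can be anywhere from {lo} to {hi}")
print(f"individual-level risk difference ranges from {rd_lo:.3f} to {rd_hi:.3f}")
```

The same group totals are compatible with individual-level associations of opposite sign, which is why aggregate data alone cannot settle the individual-level question.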
Insurance Redlining — A Complete Example
Published in Julian J. Faraway, Linear Models with Python, 2021
When data are collected at the group level, we may observe a correlation between two variables. The ecological fallacy is concluding that the same correlation holds at the individual level. For example, in countries with higher fat intakes in the diet, higher rates of breast cancer have been observed. Does this imply that individuals with high fat intakes are at a higher risk of breast cancer? Not necessarily. Relationships seen in observational data are subject to confounding, but even after allowing for confounding, aggregating the data introduces bias.
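A minimal simulation can illustrate how a group-level correlation may differ from the individual-level one. This is a sketch on synthetic data, not the analysis from the book. The 20 groups, the coefficients, and the names `pearson`, `ecological_r`, and `within_r` are all assumptions chosen so that group averages rise together while the relationship inside each group runs the other way.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible

groups = []
for g in range(20):
    center = float(g)  # group-level baseline raises both x and y
    members = []
    for _ in range(50):
        dx = rng.gauss(0, 3)
        x = center + dx
        # Within a group, a higher x goes with a LOWER y (assumed for illustration).
        y = center - 0.8 * dx + rng.gauss(0, 1)
        members.append((x, y))
    groups.append(members)

# Correlation computed on group means: what an ecological study sees.
group_mean_x = [sum(x for x, _ in m) / len(m) for m in groups]
group_mean_y = [sum(y for _, y in m) / len(m) for m in groups]
ecological_r = pearson(group_mean_x, group_mean_y)

# Correlation computed on individuals inside each group.
within_r = [pearson([x for x, _ in m], [y for _, y in m]) for m in groups]
mean_within_r = sum(within_r) / len(within_r)

print(f"correlation of group means:       {ecological_r:.2f}")
print(f"average within-group correlation: {mean_within_r:.2f}")
```

In this constructed case the group means are strongly positively correlated while every within-group correlation is negative, so reading the aggregate result as a statement about individuals would get the direction wrong.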
Can transportation network companies replace the bus? An evaluation of shared mobility operating costs
Published in Transportation Planning and Technology, 2022
Like averaged data, aggregated data have several limitations. One of the primary limitations is the ecological fallacy, where inferences about the nature of individuals are made using information about the group to which those individuals belong (Freedman 2015). A related limitation is the loss of individual information as data are aggregated. The complexity of a transit agency network is lost when costs are aggregated at the agency level, for example. Third, as with averages, aggregated data hide variance. Aggregated data can still be useful when individual-level analysis is not required or when the focus is on trends or patterns, as in this paper. This paper primarily focuses on the agency level, so the loss of individual information is less important. Other transit research has previously demonstrated the utility of analyzing transit at the system level. For example, in the transit equity literature, Gini coefficients have been used to summarize transit system equity in a single value (Delbosc and Currie 2011). An agency-level value allows a transit agency to be evaluated over time or against other transit agencies. Thus, a system-level evaluation can facilitate comparisons, the foremost purpose of this paper.
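As an illustration of how a Gini coefficient collapses a distribution into a single value, the following sketch uses the standard rank-based formula for equally weighted observations. It is not necessarily the formulation used by Delbosc and Currie (2011), and the function name `gini` and the service-level figures are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means perfect equality,
    and the maximum for n values is (n - 1) / n."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted sum over the sorted values, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical transit service levels for four neighborhoods.
evenly_served = [5, 5, 5, 5]
one_area_served = [0, 0, 0, 20]

print(gini(evenly_served))
print(gini(one_area_served))
```

The single number makes systems easy to compare over time or against each other, but, as the passage notes for aggregated data generally, it says nothing about which neighborhoods are underserved.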
Driving while impaired by alcohol: An analysis of drink-drivers involved in UK collisions
Published in Traffic Injury Prevention, 2019
Richard Owen, George Ursachi, Tanya Fosdick, Adrian V. Horodnic
One limitation of this study is the imperfect matching of age with the composition of the communities (e.g., although valid for the majority, not all drivers whose postal code is classified as pocket pensions are over 65 years old). Further research, focusing on better age matching, could provide more clarity on the phenomenon. Another important limitation is that the study draws conclusions about individuals based on group characteristics, an error in reasoning known as the ecological fallacy. Our study attempts to limit this error by characterizing communities that would encourage drink-driving behaviors, not only individuals. A limitation suggested by the literature concerns the likelihood of police bias toward assigning drink-drive-related CFs to cases where certain factors occur, such as increasing blood alcohol levels; a prior record of impaired driving; involvement in a single-vehicle collision; involvement in a nighttime collision; and traffic violations or unsafe driving actions recorded by police (Brubacher et al. 2013). Because it is customary practice in the UK to breath test every driver involved in a collision attended by a police officer, bias toward these variables is less likely to occur.
Escherichia coli contamination of rural well water in Alberta, Canada is associated with soil properties, density of livestock and precipitation
Published in Canadian Water Resources Journal / Revue canadienne des ressources hydriques, 2019
Jesse Invik, Herman W. Barkema, Alessandro Massolo, Norman F. Neumann, Edwin Cey, Sylvia Checkley
The quality of the agricultural variables was a limiting factor in this study. While the Canadian Agricultural Census (Government of Canada 2012) provides a wealth of information, for confidentiality reasons it is aggregated to large regions. Using geographically aggregated data leads to two problems: the ecological fallacy and the modifiable areal unit problem (MAUP). The ecological fallacy occurs when results based on aggregated data are erroneously applied to specific members of the aggregated area. If an aggregated area has a high average animal density, and animal density is associated with contamination in a study, it is incorrect to infer from that study that well X within that aggregated area will have a high rate of contamination. It is quite possible that the high animal density is based on animal numbers from only one half of the polygon and that well X is in the other half (Dark and Bram 2007). The two key concerns about MAUP are that the choice of boundaries for the polygons is arbitrary and that different boundary choices will produce different statistical outcomes. In addition, the aggregation of smaller units into larger areas means a decrease in the variation in the data while the mean remains unchanged (Dark and Bram 2007). Data aggregated to smaller regions, or point data such as animal numbers at specific locations, would have improved the quality of the study considerably.
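Both MAUP concerns can be sketched on a toy grid. The 4 x 4 grid of animal counts, the two zoning schemes, and the helper `zone_mean` below are hypothetical and only illustrate the mechanism. They are not data from the Canadian Agricultural Census.

```python
from statistics import mean, pvariance

# Hypothetical animal counts per cell; animals are concentrated in the west.
grid = [
    [12.0, 8.0, 2.0, 0.0],
    [10.0, 6.0, 1.0, 1.0],
    [9.0, 9.0, 0.0, 2.0],
    [11.0, 7.0, 3.0, 1.0],
]


def zone_mean(rows, cols):
    """Average count over the cells in the given row and column index ranges."""
    return mean(grid[r][c] for r in rows for c in cols)


cells = [value for row in grid for value in row]
overall_mean = mean(cells)

# Zoning A: split the region into a west half and an east half.
west_east = [zone_mean(range(4), range(0, 2)), zone_mean(range(4), range(2, 4))]
# Zoning B: split the same region into a north half and a south half.
north_south = [zone_mean(range(0, 2), range(4)), zone_mean(range(2, 4), range(4))]

print("overall mean:", overall_mean)
print("cell-level variance:", pvariance(cells))
print("west/east zone means:", west_east, "variance:", pvariance(west_east))
print("north/south zone means:", north_south, "variance:", pvariance(north_south))

# A well in the north-east corner sits in a zone with a moderate average,
# yet its own cell holds no animals at all.
well_x_cell = grid[0][3]
print("north zone mean:", north_south[0], "count at well X's cell:", well_x_cell)
```

With equal-sized zones the mean is preserved under either zoning, the variance shrinks relative to the cell-level data, and the two boundary choices give very different pictures of how animal density varies across the region.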