Measurement Bias, Multiple Indicator Multiple Cause Modeling and Multiple Group Modeling
Published in Douglas D. Gunzler, Adam T. Perzynski, Adam C. Carle, Structural Equation Modeling for Health and Medicine, 2021
Douglas D. Gunzler, Adam T. Perzynski, Adam C. Carle
A form of systematic measurement error, often labeled measurement bias or differential item functioning (DIF), occurs when persons with an identical underlying level on the outcome nevertheless respond differently to individual questions about their health because of their different backgrounds.
Rasch Modeling Applied: Rating Scale Design
Published in Trevor G. Bond, Zi Yan, Moritz Heene, Applying the Rasch Model, 2020
Trevor G. Bond, Zi Yan, Moritz Heene
A final step in investigating the quality of the new measure is to compare the estimates across two or more distinct groups of interest (e.g., male/female, Christian/Jewish, employed/unemployed, married/divorced/never married) to examine whether the items have significantly different meanings for the different groups. This is called differential item functioning (DIF). We take the same example as before, the reported frequency of pedagogic strategies among elementary science teachers. Suppose we want to compare the frequency of use of those different pedagogic strategies with that of a sample of elementary mathematics teachers. We can use the data from both groups in the Excel spreadsheet for common-item linking (see Chapter 5), plotting item estimates (difficulties and errors) for science teachers against those for mathematics teachers to examine whether the frequency of usage is measurably different for science and mathematics teachers. Any difference in the frequency of pedagogic strategies between the groups can be examined more closely to see why any particular strategy was not rated the same for both groups (Figure 11.5).
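The chapter carries out this comparison with the Excel spreadsheet for common-item linking; the sketch below reproduces the same idea in Python, with invented item names, difficulty estimates, and standard errors purely for illustration. Items whose between-group difference exceeds roughly 1.96 joint standard errors fall outside the 95% control band and are candidates for DIF.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical Rasch difficulty estimates (logits) and standard errors for the same
# pedagogic-strategy items, calibrated separately in each teacher group.
items = ["open_inquiry", "demonstration", "group_work", "worksheets", "field_trips"]
d_science = np.array([-0.85, -0.10, 0.25, 0.60, 1.10])   # science teachers
se_science = np.array([0.12, 0.10, 0.11, 0.13, 0.15])
d_math = np.array([-0.80, 0.35, 0.20, 0.55, 1.05])        # mathematics teachers
se_math = np.array([0.13, 0.11, 0.10, 0.12, 0.16])

# Put both calibrations on a common scale by removing the mean difference
# (a simple mean-shift equating, as in common-item linking).
d_math_linked = d_math - (d_math.mean() - d_science.mean())

# Joint standard error of each item's difference; differences beyond ~1.96 joint SEs
# fall outside the 95% control band and suggest possible DIF.
joint_se = np.sqrt(se_science**2 + se_math**2)
diff = d_science - d_math_linked
flagged = np.abs(diff) > 1.96 * joint_se

fig, ax = plt.subplots()
ax.errorbar(d_science, d_math_linked, xerr=se_science, yerr=se_math, fmt="o")
line = np.linspace(-1.5, 1.5, 100)
band = 1.96 * joint_se.mean()                 # approximate (average) control band
ax.plot(line, line, "k-", lw=0.8)             # identity line
ax.plot(line, line + band, "k--", lw=0.8)
ax.plot(line, line - band, "k--", lw=0.8)
for name, x, y, f in zip(items, d_science, d_math_linked, flagged):
    ax.annotate(name + (" *" if f else ""), (x, y))
ax.set_xlabel("Item estimate: science teachers (logits)")
ax.set_ylabel("Item estimate: mathematics teachers (logits)")
ax.set_title("Common-item linking plot; * = outside 95% control band (possible DIF)")
plt.show()
```

In this made-up example, only the "demonstration" item falls outside the control band, which is the kind of strategy one would then examine more closely to understand why it is not rated the same in the two groups.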
Fairness in Rater-Mediated Assessment
Published in George Engelhard, Stefanie A. Wind, Invariant Measurement with Raters and Rating Scales, 2017
George Engelhard, Stefanie A. Wind
Soon after the presentation of these definitions of bias within the context of selection procedures, concerns with fairness in testing began to shift toward methods for detecting bias at the individual item level. Further, rather than describing any observed difference in achievement between subgroups as bias, researchers began to distinguish between item bias and item impact. Whereas item bias indicates differences in performance across subgroups related to construct-irrelevant components, item impact reflects true differences among subgroups in their locations on the latent variable. The term differential item functioning (DIF) describes situations in which test takers from different subgroups have differing probabilities of success on an item after they have been matched on the construct being measured. Evidence of DIF can alert researchers to potential bias.
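One common way to operationalize this matching idea, though it is not named in this excerpt, is the Mantel-Haenszel procedure, which stratifies test takers on total score and compares the odds of success for the reference and focal groups within each stratum. The following Python sketch is illustrative only; the function name and the simulated data are our own.

```python
import numpy as np
from scipy.stats import chi2

def mantel_haenszel_dif(responses, item, group):
    """Mantel-Haenszel DIF statistics for one dichotomous item.

    responses : (n_persons, n_items) array of 0/1 item scores
    item      : column index of the studied item
    group     : 0 = reference group, 1 = focal group
    Persons are matched on total test score, the usual proxy for the construct.
    """
    group = np.asarray(group)
    total = responses.sum(axis=1)
    or_num = or_den = 0.0
    sum_a = sum_ea = sum_var = 0.0
    for s in np.unique(total):                     # one stratum per total score
        m = total == s
        y, g = responses[m, item], group[m]
        a = np.sum((g == 0) & (y == 1))            # reference, correct
        b = np.sum((g == 0) & (y == 0))            # reference, incorrect
        c = np.sum((g == 1) & (y == 1))            # focal, correct
        d = np.sum((g == 1) & (y == 0))            # focal, incorrect
        t = a + b + c + d
        if t < 2:
            continue
        or_num += a * d / t
        or_den += b * c / t
        sum_a += a
        sum_ea += (a + b) * (a + c) / t            # expected count under no DIF
        sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (t**2 * (t - 1))
    odds_ratio = or_num / or_den if or_den > 0 else np.nan
    chi_sq = (abs(sum_a - sum_ea) - 0.5) ** 2 / sum_var
    return odds_ratio, chi_sq, chi2.sf(chi_sq, df=1)

# Tiny illustration on random data (no DIF built in, so the odds ratio should be near 1).
rng = np.random.default_rng(1)
resp = (rng.random((400, 10)) < 0.6).astype(int)
grp = rng.integers(0, 2, 400)
print(mantel_haenszel_dif(resp, item=0, group=grp))
```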
Using Rasch and factor analysis to develop a Proxy-Reported health state classification (descriptive) system for Cerebral Palsy
Published in Disability and Rehabilitation, 2021
Mina Bahrampour, Martin Downes, Roslyn N. Boyd, Paul A. Scuffham, Joshua Byrnes
Differential item functioning (DIF) was also considered, as it can affect model fit and the generalisability of the classification system to all patients. Differential item functioning occurs when different groups in the sample (between genders, for example) systematically answer an item differently. This can be assessed by comparing item responses across subgroups. In this study, DIF was estimated with respect to gender, age (school aged or under), and condition severity (Gross Motor Functioning Classification System, GMFCS; and Manual Ability Classification System, MACS). The Gross Motor Functioning Classification System and the Manual Ability Classification System both have five levels, in which level one indicates the lowest severity and level five the highest [50]. Person separation reliability was also estimated, which is a reliability index similar to Cronbach’s alpha (α).
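Person separation reliability is conventionally computed as the ratio of "true" person variance (observed variance minus average error variance) to observed variance of the person measures. The following is a minimal sketch of that calculation, assuming Rasch person measures and their standard errors are available; the numbers are invented and not taken from the study.

```python
import numpy as np

def person_separation_reliability(theta, se):
    """Person separation reliability from Rasch person measures.

    theta : array of person measures (logits)
    se    : array of their standard errors
    Reliability = (observed variance - mean error variance) / observed variance,
    analogous in interpretation to Cronbach's alpha.
    """
    observed_var = np.var(theta, ddof=1)         # variance of person estimates
    error_var = np.mean(np.asarray(se) ** 2)     # average measurement error variance
    true_var = max(observed_var - error_var, 0.0)
    return true_var / observed_var

# Hypothetical person measures and standard errors
theta = np.array([-1.2, -0.4, 0.1, 0.6, 1.3, 1.8])
se = np.array([0.45, 0.40, 0.38, 0.39, 0.42, 0.50])
print(round(person_separation_reliability(theta, se), 2))   # about 0.85
```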
Translating Item Response Theory Findings for Clinical Practice
Published in Journal of Personality Assessment, 2019
Douglas B. Samuel, Meredith A. Bucher
Runge et al. (2019) used IRT methods to address this latter question by evaluating the OMT to determine how its measurement precision varies across cultures. Specifically, Runge et al. analyzed the possibility of differential item functioning (DIF) across three widely varying cultures. DIF refers to the possibility that an item might connote significantly different meaning in one group than in another. Typically, DIF is evaluated with respect to differences across demographic groups such as gender, race, ethnicity, or age. For Runge et al., the focus was on a comparison across cultures in three countries (Costa Rica, Germany, and Cameroon). The basic underlying question of DIF is whether scores on an item, or instrument, have the same meaning across groups. Without knowledge of whether an item shows DIF across groups, one cannot know whether observed differences in mean scores across groups reflect true differences or some bias in measurement.
Refinement of the Child Amblyopia Treatment Questionnaire (CAT-QoL) using Rasch analysis
Published in Strabismus, 2019
Differential item functioning (DIF) is a form of item bias across groups of respondents. It occurs when different groups within the same sample, despite equal levels of the underlying characteristic, respond in a different manner to an individual item. There are two types of DIF: uniform DIF, where one group shows a consistent, systematic difference in its responses to an item across the whole range of the attribute being measured, and non-uniform DIF, which occurs when the differences between groups vary across levels of the attribute. There are different methodological approaches that can be taken if DIF is found. In the case of instrument development, the presence of DIF may influence the removal of that item from the instrument. DIF can be detected both statistically and graphically. An ANOVA is performed for each of the items, comparing the scores across each level of the “person factor” and across different levels of the trait (class intervals). Uniform DIF is indicated by a significant main effect for the person factor (e.g., gender). Non-uniform DIF is indicated by a significant interaction effect (person factor × class interval).
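The ANOVA-based check described above can be sketched as a two-way ANOVA of item scores on the person factor and the class interval. In Rasch software this is typically run on standardized item residuals; the Python sketch below uses simulated item scores instead, and all variable names and data are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)

# Hypothetical data for one item: person ability, gender (person factor), item score.
n = 300
ability = rng.normal(0, 1, n)                    # person measure (logits)
gender = rng.integers(0, 2, n)                   # 0 = female, 1 = male
# Simulate uniform DIF: one group scores slightly higher at every ability level.
item_score = ability + 0.4 * gender + rng.normal(0, 1, n)

df = pd.DataFrame({
    "score": item_score,
    "gender": pd.Categorical(gender),
    # Class intervals: persons grouped into strata along the measured trait.
    "interval": pd.qcut(ability, q=4, labels=["I1", "I2", "I3", "I4"]),
})

# Two-way ANOVA: a significant main effect of the person factor suggests uniform DIF;
# a significant person factor x class interval interaction suggests non-uniform DIF.
model = smf.ols("score ~ C(gender) * C(interval)", data=df).fit()
print(anova_lm(model, typ=2))
```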