Designing and Implementing Research on the Development Editions of the Test
Published in Lucy Jane Miller, Developing Norm-Referenced Standardized Tests, 2020
Item analysis is performed by examining the responses to each item. Its functions are to select the items that best fit the purpose of the test and to identify items with poor psychometric characteristics [8]. The process of item analysis can be tedious, particularly if the test has many items and is administered to large numbers of subjects; it is unlikely that the task will be completed without the use of a computer.
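As a concrete illustration of the computerized pass the passage alludes to, here is a minimal item-analysis sketch in Python. The 0/1 scoring matrix, the simulated data, and the flagging thresholds are illustrative assumptions, not values from the book.

```python
# A minimal sketch of a computerized item-analysis pass, assuming a
# subjects x items matrix of 0/1 item scores. Data and thresholds are
# illustrative only.
import numpy as np

def item_analysis(scores: np.ndarray):
    """Return per-item difficulty and corrected item-total discrimination."""
    _, n_items = scores.shape
    difficulty = scores.mean(axis=0)          # proportion correct per item
    total = scores.sum(axis=1)
    discrimination = np.empty(n_items)
    for j in range(n_items):
        rest = total - scores[:, j]           # total score excluding item j
        discrimination[j] = np.corrcoef(scores[:, j], rest)[0, 1]
    return difficulty, discrimination

rng = np.random.default_rng(0)
scores = (rng.random((200, 10)) > 0.4).astype(int)   # fake 0/1 responses
p, d = item_analysis(scores)
# Flag items that are too easy, too hard, or poorly discriminating.
flagged = [j for j in range(10) if not 0.2 <= p[j] <= 0.9 or d[j] < 0.2]
print("difficulty:", p.round(2), "flagged items:", flagged)
```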
Evaluating tests and assessments: Item analyses
Published in Claudio Violato, Assessing Competence in Medicine and Other Health Professions, 2018
Advanced Organizers

A complete analysis of a test requires an item analysis together with descriptive statistics and reliability. There are three essential features of an item analysis for multiple-choice questions (MCQs): (1) difficulty of the item, (2) item discrimination, and (3) distractor effectiveness. All of these criteria apply to every other test or assessment format (e.g., OSCE, restricted essay, extended essay, survey) except distractor effectiveness, since those formats have no distractors.

The difficulty of an item is the percentage or proportion of people who got the item correct. If everyone gets the item correct, it is an easy item; if very few test-takers get it correct, it is a very difficult item. Item difficulty (P) is usually expressed as a proportion, such as P = 0.72 (72% got it correct).

Item discrimination has to do with the extent to which an item distinguishes, or “discriminates,” between high test scorers and low test scorers. A positive D indicates discrimination in the correct direction; the point-biserial correlation is commonly used for the item discrimination index D. Distractor effectiveness refers to the ability of the distractors to attract responses.

The other important criteria for evaluating a test are its descriptive statistics. The internal consistency reliability (Cronbach’s α) and the mean discrimination index are also helpful in evaluating test quality. The item analysis, pass/fail rate, MPL, and reliability all help interpret the value of a test.
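The three statistics above can be computed directly from raw responses. The following sketch shows item difficulty P, a point-biserial discrimination D via the standard formula r_pb = (M1 - M0)/s · sqrt(pq), and simple distractor counts; the response data and the keyed answer are invented for illustration.

```python
# A hedged sketch of the three MCQ statistics described above. The data
# layout (one chosen option per examinee, "A" keyed correct) is an
# assumption for illustration.
from collections import Counter
import math
import statistics

responses = ["A", "B", "A", "A", "C", "A", "D", "A", "A", "B"]  # one item
totals    = [18, 9, 15, 17, 8, 16, 7, 19, 14, 10]               # test totals
key = "A"

correct = [1 if r == key else 0 for r in responses]
P = sum(correct) / len(correct)                 # difficulty, here 0.60

# Point-biserial D: correlation of the 0/1 item score with the total score.
mean_1 = statistics.mean(t for t, c in zip(totals, correct) if c == 1)
mean_0 = statistics.mean(t for t, c in zip(totals, correct) if c == 0)
sd = statistics.pstdev(totals)
D = (mean_1 - mean_0) / sd * math.sqrt(P * (1 - P))

# Distractor effectiveness: a distractor drawing no responses adds nothing.
distractor_counts = Counter(r for r in responses if r != key)
print(f"P={P:.2f}  D={D:.2f}  distractors={dict(distractor_counts)}")
```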
The neural basis of semantic memory
Published in Lars-Göran Nilsson, Nobuo Ohta, Dementia and Memory, 2013
Michael F. Bonner, Murray Grossman
We have found that non-aphasic patients with parietal disease due to corticobasal syndrome (CBS) and posterior cortical atrophy (PCA) are significantly impaired in their comprehension of quantifiers. Thus, these patients were impaired at judging whether a small array of objects was accurately described by a phrase containing a quantifier. Regression analyses using quantitative assessment of volumetric MRI showed that this impairment was related to significant cortical atrophy in the parietal lobe, which overlapped with fMRI activation of parietal cortex in healthy adults performing the same task (Troiani et al., 2009).

To demonstrate that this deficit was not due to problems processing the visuospatial information in an object array, we also asked patients to evaluate the truth-value of brief statements about familiar temporal, distance, and monetary concepts. For example, patients were asked to judge statements such as “there are at least 9 pennies in a dime,” “there are more than 10 inches in a foot,” or “there are fewer than 6 days in a week.” We performed an item-by-item analysis in each patient to remove specific stimuli for which the patient did not respond correctly to a probe of the facts involved in the statement. Thus, if a patient did not know that there are 7 days in a week, we removed from consideration the stimuli containing statements about the number of days in a week. We found that these non-aphasic patients were significantly impaired across the different categories of quantifier comprehension (Troiani, Clark, & Grossman, 2011).

More recently, we asked patients with non-aphasic CBS to judge the truth-value of simple quantifier statements (e.g., “most of the cows are in the barn”) for simple, naturalistic scenes using quantities smaller than five. We also administered the same pictures with statements containing a precise number (e.g., “three cows are in the barn”). As expected, the patients were impaired at judging the statements containing precise numbers. We also found that they were significantly impaired at judging the quantifier statements, and that there was a high correlation between their judgments of statements containing precise numbers and their judgments of quantified statements. They had no difficulty, however, judging statements about pictures that contained neither a number nor a quantifier (e.g., “there are cows in the barn”). Volumetric MRI analyses demonstrated that the impaired performance on quantifier comprehension in CBS was related to atrophy in parietal cortex (Morgan et al., submitted).
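For concreteness, the item-by-item exclusion step described above might look like the following sketch; the trial records, field names, and probe results are illustrative assumptions, not the authors' actual data format.

```python
# A minimal sketch of the item-by-item exclusion step: quantifier trials
# are scored only when the patient passed a separate probe of the
# underlying fact. All names and values here are invented for illustration.
trials = [
    {"fact": "days_in_week", "quantifier_correct": True},
    {"fact": "inches_in_foot", "quantifier_correct": False},
    {"fact": "days_in_week", "quantifier_correct": False},
]
fact_probe_passed = {"days_in_week": False, "inches_in_foot": True}

# Drop every trial whose underlying fact the patient did not know.
scored = [t for t in trials if fact_probe_passed[t["fact"]]]
accuracy = sum(t["quantifier_correct"] for t in scored) / len(scored)
print(f"{len(scored)} of {len(trials)} trials retained, accuracy={accuracy:.2f}")
```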
Program evaluation of in-patient treatment units for adults with acquired brain injury and challenging behavior
Published in Brain Injury, 2022
Alison D. Cox, Madeline Pontone, Karl F. Gunnarsson
A detailed examination of item analysis patterns revealed two interesting trends. First, participant charts with some Evidence of a Formal BSP (45% of participants) showed narrower variability in overall PET scores (range 18% to 63%) and a substantially higher mean PET score (44%) than the overall group PET mean (33%). However, six of the eight participant charts that received a maximum score (3) on Formal Evidence of BSP (category 3) scored zero on the BSP Quality Index (category 4; Figure 4); the remaining two scored only one on the BSP Quality Index. This could suggest that although some aspects of formal BSPs are largely present, they are relatively poor in quality. This may not be entirely surprising given how few BCBAs are available to be hired as core team members of neurobehavioral rehabilitation units. Another notable pattern was that the overall PET mean score for the eight participant charts with a score of one on the BSP Quality Index was substantially higher (45%) than the overall group PET mean (33%), with a narrower range of scores (20% to 60%). This suggests there may be a relationship between BSP quality and the other checklist items across participants; that is, when BSP quality is better, other programming features may also be better.
Development and initial validation of the verbal and nonverbal sexual communication questionnaire in Canada and Spain
Published in Sexual and Relationship Therapy, 2020
Pablo Santos-Iglesias, E. Sandra Byers
The results of this study demonstrate that the Verbal and Nonverbal Sexual Communication Questionnaire is a reliable measure consisting of three subscales that can be used with English speakers and with Spanish speakers from Spain: Verbal Sexual Communication, Nonverbal Sexual Initiation and Pleasure, and Nonverbal Sexual Refusal. The item analysis showed that the items have good discrimination. All of the subscales showed good internal consistency in both samples, and our validity hypotheses were supported for two of the subscales, Verbal Sexual Communication and Nonverbal Sexual Initiation and Pleasure, demonstrating their construct and concurrent validity in both the English and Spanish versions. However, as discussed further below, the results suggest that nonverbal sexual refusal may play a different role in sexual relationships than does positive verbal and nonverbal communication.
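The internal consistency reported for such subscales is typically Cronbach's α. Below is a minimal sketch of its computation on a simulated subscale; the item count, sample size, noise level, and the common ≥ .80 rule of thumb are illustrative assumptions, not the authors' values.

```python
# A hedged sketch of Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item
# variances) / variance(total score)). The simulated subscale below is
# invented for illustration.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of scale scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 1))                        # shared trait
subscale = latent + rng.normal(scale=0.8, size=(300, 6))  # 6 correlated items
print(f"alpha = {cronbach_alpha(subscale):.2f}")          # "good" if >= .80
```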
Knowledge, application and how about competence? Qualitative assessment of multiple-choice questions for dental students
Published in Medical Education Online, 2020
Mesküre Capan Melser, Verena Steiner-Hofbauer, Bledar Lilaj, Hermann Agis, Anna Knaus, Anita Holzinger
We determined the role of item difficulty, an important parameter for increasing MCQ quality, in the distribution of the cognitive level of our MCQs. The difficulty index (DIFI) is one of the most commonly used statistical parameters for item analysis; it is obtained by dividing the number of students who answered an item correctly by the total number of students who answered that item, and thus ranges between 0.0 and 1.0 [17–19]. 82% of the MCQs from UCD had an adequate level of difficulty, between 0.4 and 0.9 (moderately difficult to moderately easy). There was no significant difference between old and new MCQs with regard to item difficulty (χ2 = 7.278; p = 0.534), and the data did not show an impact of item difficulty on the distribution of the cognitive level of the MCQs (Table 6).
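Since the passage defines DIFI explicitly, a short sketch of the computation and the banding used here follows; the counts are invented for illustration, and the band edges simply mirror the 0.4 to 0.9 "adequate" range reported in the study.

```python
# A minimal sketch of the difficulty index (DIFI) as defined above:
# correct responses divided by total responses to the item, then banded.
def difi(n_correct: int, n_answered: int) -> float:
    return n_correct / n_answered

def band(p: float) -> str:
    if p < 0.4:
        return "difficult"
    if p <= 0.9:
        return "adequate (moderately difficult to moderately easy)"
    return "easy"

for n_correct, n_answered in [(30, 100), (72, 100), (95, 100)]:
    p = difi(n_correct, n_answered)
    print(f"DIFI = {p:.2f} -> {band(p)}")
```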