A comparison of multiple-choice test response modes and scoring methods

Date: 1982
Publisher: Virginia Polytechnic Institute and State University
Abstract

This study compares seven response/scoring methods for multiple-choice tests in an academic setting, using students enrolled in a required undergraduate art appreciation course at a medium-sized state university. On a preliminary test in each section of 35 to 40 students, the score was the number of correct responses, based on a single choice per item. A distinct response/scoring method was then randomly assigned to each section for two additional tests: one section continued to use number-right scoring, and six sections used methods requiring or permitting multiple marks per item. All test scores counted toward course grades, and examinees were so informed. Responses were used to compute estimates of internal consistency reliability and of validity against the previous quarter's grade-point average and Scholastic Aptitude Test subscores. An Evaluation Questionnaire, administered in each section, obtained self-report information for each response/scoring method about examinee study habits, testing preferences and experience, and responding behaviors. Subjective observations of examinee behavior helped explain or confirm various empirical findings from this study or from the literature.
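The abstract does not detail the particular scoring rules or reliability formula used, but number-right scoring and a standard internal-consistency estimate of the kind reported (Kuder-Richardson Formula 20 for dichotomously scored items) can be illustrated with a brief sketch. The response data, answer key, and function names below are hypothetical and are not drawn from the study.

```python
# Illustrative sketch only: number-right (NR) scoring and a KR-20
# internal-consistency estimate. Data and names are hypothetical.

def number_right_scores(responses, key):
    """Score each examinee as the count of items answered correctly."""
    return [sum(1 for ans, correct in zip(row, key) if ans == correct)
            for row in responses]

def kr20(responses, key):
    """Kuder-Richardson Formula 20 for dichotomously scored items."""
    n_items = len(key)
    n_people = len(responses)
    # Item difficulty p_i = proportion of examinees answering item i correctly.
    p = [sum(1 for row in responses if row[i] == key[i]) / n_people
         for i in range(n_items)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    scores = number_right_scores(responses, key)
    mean = sum(scores) / n_people
    variance = sum((s - mean) ** 2 for s in scores) / n_people
    return (n_items / (n_items - 1)) * (1 - pq_sum / variance)

# Hypothetical data: 4 examinees, 5 single-choice items keyed 'A'-'D'.
key = ["A", "C", "B", "D", "A"]
responses = [
    ["A", "C", "B", "D", "A"],
    ["A", "C", "B", "D", "B"],
    ["A", "C", "A", "B", "C"],
    ["B", "D", "B", "A", "C"],
]
print(number_right_scores(responses, key))  # NR score per examinee: [5, 4, 2, 1]
print(round(kr20(responses, key), 3))       # internal-consistency estimate: 0.75
```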

Based on the findings from this study, it was concluded that:

  1. Observed differences in estimates of reliability or validity were not sufficient to justify the effort expended in administering and hand-scoring multiple-mark tests.

  2. Examinees experienced substantial difficulty becoming familiar with response/scoring methods that permit multiple marks. These methods require more than casual explanation and practice before examinees become adept in their use.

  3. Item mark totals from methods that score all levels of information can provide more feedback to instructors about examinee performance on specific test items than the number-right method yields (a brief sketch of such an item-level tally follows this list). This feedback can help the instructor identify content areas needing revision or test items needing rewording. When examinees receive this feedback, they can observe, score, and learn from their own item performance, which was Pressey's initial consideration in 1950.
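The form such item mark totals might take is sketched below: a tally of how often each option was marked on each item, the kind of item-level summary a multiple-mark method makes available to the instructor. The response records and function name are hypothetical and do not reproduce the study's data or procedures.

```python
# Illustrative sketch only: per-item tallies of marked options.
# The multiple-mark response records below are hypothetical.

from collections import Counter

def item_mark_totals(marked_responses, options="ABCD"):
    """For each item, count how often each option was marked.

    marked_responses: one list per examinee; each entry is the set of
    options that examinee marked on that item (multiple marks allowed).
    """
    n_items = len(marked_responses[0])
    totals = []
    for i in range(n_items):
        counts = Counter()
        for examinee in marked_responses:
            counts.update(examinee[i])
        totals.append({opt: counts.get(opt, 0) for opt in options})
    return totals

# Hypothetical multiple-mark responses: 3 examinees, 2 items.
marked = [
    [{"A"}, {"B", "C"}],
    [{"A", "D"}, {"C"}],
    [{"B"}, {"C"}],
]
for i, counts in enumerate(item_mark_totals(marked), start=1):
    print(f"Item {i}: {counts}")
# Heavily marked distractors can flag items that may need rewording.
```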
