A comparison of the effects of conventional testing and two-stage testing procedures on item bias as defined by three statistical techniques



Virginia Tech


The purpose of the study was to compare the effects on item bias of conventional testing procedures to the effects of two-stage testing procedures. It is conjectured that much of the measurement error identified as bias can be explained by factors, such as guessing or carelessness, attributable to inappropriate matching of test difficulty level and examinee ability level.

Methods for detecting bias based on the traditional definition of item difficulty fail to separate test characteristics from the ability distribution of the respondent sample. The separation of item and ability parameters, however, is an essential ingredient for an objective definition of bias. Such objectivity in measurement is provided by the Rasch latent trait model, which consequently was selected as the basis for this study. Three definitions of bias were considered, two of which were based on the Rasch model.
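
For reference, the separation of item and ability parameters can be written out using the standard dichotomous Rasch model. The notation is an assumption, since the abstract does not give one: \theta_v is the ability of examinee v, b_i is the difficulty of item i, and P_{vi} is the probability of a correct response.

```latex
\[
P_{vi} = P(X_{vi} = 1 \mid \theta_v, b_i)
       = \frac{\exp(\theta_v - b_i)}{1 + \exp(\theta_v - b_i)}
\]
\[
\ln\frac{P_{vi}}{1 - P_{vi}} - \ln\frac{P_{vj}}{1 - P_{vj}}
  = (\theta_v - b_i) - (\theta_v - b_j)
  = b_j - b_i
\]
```

The second line shows that the log-odds comparison of two items does not depend on \theta_v, which is the sense in which item difficulty is separated from the ability distribution of the sample.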

The analyses were conducted using the scores of random subsamples (n=400 each) of black and white students on items selected from three reading subtests. The two-stage testing procedure was simulated using the real data set by "routing" students to one of three difficulty levels of the subtests based on their Rasch ability estimates as determined by a ten-item routing test. Results for the two-stage testing procedure were compared with those from the conventional testing procedure at the subtest level.
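
The routing step described above can be sketched roughly as follows. This is a minimal illustration and not the study's actual procedure: the routing-item difficulties, the cut points, the level names, and the half-point adjustment for zero and perfect scores are all hypothetical assumptions. Ability is estimated by Newton-Raphson maximum likelihood under the Rasch model.

```python
import math

# Hypothetical difficulties (in logits) for a ten-item routing test.
ROUTING_DIFFICULTIES = [-2.0, -1.5, -1.0, -0.5, -0.25, 0.25, 0.5, 1.0, 1.5, 2.0]


def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(responses, difficulties, max_iter=50, tol=1e-6):
    """Maximum-likelihood ability estimate from 0/1 responses."""
    n = len(responses)
    raw = float(sum(responses))
    # The ML estimate is infinite for zero or perfect scores, so the raw
    # score is nudged by half a point (an assumed, common workaround).
    if raw == 0:
        raw = 0.5
    elif raw == n:
        raw = n - 0.5
    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_prob(theta, b) for b in difficulties]
        expected = sum(probs)                   # expected raw score at theta
        info = sum(p * (1 - p) for p in probs)  # test information at theta
        step = (raw - expected) / info
        theta += step
        if abs(step) < tol:
            break
    return theta


def route(theta, low_cut=-0.5, high_cut=0.5):
    """Assign one of three difficulty levels (cut points are hypothetical)."""
    if theta < low_cut:
        return "easy"
    if theta > high_cut:
        return "hard"
    return "medium"


if __name__ == "__main__":
    answers = [1, 1, 1, 1, 1, 1, 1, 0, 1, 1]  # one examinee's routing responses
    theta_hat = estimate_ability(answers, ROUTING_DIFFICULTIES)
    print(round(theta_hat, 2), route(theta_hat))
```

Because the raw score is a sufficient statistic for ability in the Rasch model, examinees with the same routing-test score receive the same estimate and are routed to the same level, regardless of which items they answered correctly.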

These analyses indicated a reduction in the number of items identified as biased when examinee ability levels and test difficulty levels were appropriately matched. Although the results are not conclusive, individualizing testing according to the examinee's ability level appears to offer promise for reducing differential cultural measurement error.



testing bias