Examining the relationship between performance measures and user evaluations in a transfer of training paradigm
User evaluations that generate detailed information can identify problematic aspects of software interfaces. In a preliminary study (Coleman, Wixon, and Williges, 1984), a methodology was developed for the systematic collection of detailed subjective evaluations of software interfaces. This methodology created a taxonomy of editing functions for users to evaluate and a set of bipolar scales on which they could make their evaluations. The present research investigated the utility of this methodology while comparing two text editors within the context of a benchmark editing task. In addition, the detailed subjective measures collected were compared with more traditional objective measures.
The results of this research revealed that global subjective evaluations were insensitive to differences between the two editors that were indicated by the detailed evaluations. Examination of the detailed subjective evaluations indicated that the differences between editors could be attributed to specific editing functions. The objective measures also indicated very specific differences between the two evaluated editors. Examination of the relationship between the objective and subjective measures indicated that the measures differed in both the magnitude and the location of effects. Closer inspection of the data revealed that insensitivity on the part of the subjective measure could not account for all disagreement between the measures. On several occasions, the objective and subjective measures seemed to measure qualitatively different effects. Given that the measures were not completely redundant, it was concluded that both objective and subjective measures should be collected during interface evaluation.