Rachel Kachchaf and Guillermo Solano-Flores

Applied Measurement in Education (2012)


We examined how rater language background affects the scoring of short-answer, open-ended test items in the assessment of English language learners (ELLs). Four native English-speaking and four native Spanish-speaking certified bilingual teachers scored 107 responses of fourth- and fifth-grade Spanish-speaking ELLs to mathematics items administered in English and Spanish. We examined the mean scores given by raters of different language backgrounds. Also, using generalizability theory, we examined the amount of score variation due to student (the object of measurement) and four sources of measurement error: item, language of testing, rater language background, and rater nested in rater language background. We observed a small, statistically significant difference between mean scores given by raters of different language backgrounds and negligible score variation due to the main and interaction effects of rater. Provided that they are certified bilingual teachers, and regardless of language of testing, raters of different language backgrounds can score ELL responses to short-answer, open-ended items with comparable reliability. (Contains 4 tables, 1 figure, and 5 footnotes.)

