The Journal of AsiaTEFL

Volume 14 Number 4, Winter 2017, Pages 587-836
    Investigating the Effect of Training on Raters' Bias toward Test Takers in Oral Proficiency Assessment: A FACETS Analysis
    Houman Bijani & Mona Khabiri

Variability among raters in scoring, and their bias, is typically addressed through rater training. However, questions remain about whether training can affect raters' severity or leniency. Furthermore, few studies have examined the differences between trained and untrained raters in oral assessment. In this study, the oral test scores of 200 test takers, rated by 20 raters, were analyzed before and after a training program using multifaceted Rasch measurement (MFRM). The results demonstrated the constructive impact of training programs in reducing raters' biases and increasing their consistency measures. Inexperienced raters benefited more from the training program than experienced raters and thus achieved higher measures of consistency afterward. The analysis also showed greater bias interaction for test takers at the extreme ends of the oral ability continuum. The findings indicated that it is almost impossible to eradicate rater variability completely, even through rater training. Therefore, rater training should be viewed as a procedure for establishing within-rater consistency rather than between-rater consistency. Since this study showed that inexperienced raters can rate even more reliably than experienced ones after training, there is no evidence on which decision makers can exclude inexperienced raters solely because of their lack of experience. Consequently, decision makers should instead direct their budgets toward establishing rater training programs for inexperienced raters.

Keywords: bias, feedback, interrater reliability, intra-rater reliability, multifaceted Rasch measurement (MFRM), rater severity/leniency, rater training
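
For readers unfamiliar with MFRM, the model is commonly written in the following general form. This is a sketch of the standard formulation only; the abstract does not state which facets or rating-scale structure the authors specified, so the symbols below (test taker ability, task difficulty, rater severity, and category threshold) are assumptions chosen for illustration.

```latex
% General many-facet Rasch model (assumed standard form, not taken from the article)
% P_{nijk}     : probability that test taker n receives category k from rater j on task i
% P_{nij(k-1)} : probability of the adjacent lower category k-1
% B_n : ability of test taker n
% D_i : difficulty of task i
% C_j : severity of rater j
% F_k : difficulty of the step from category k-1 to category k
\[
  \ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
\]
```

Under this form, a more severe rater (larger C_j) lowers the log-odds of a higher score category for every test taker. Rater bias toward particular test takers appears as systematic departures from these expected values for specific rater and test taker pairings.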