The Journal of Asia TEFL
Volume 14 Number 4, Winter 2017, Pages 587-836
Investigating the Effect of Training on Raters' Bias toward Test Takers in Oral Proficiency Assessment: A FACETS Analysis
Houman Bijani & Mona Khabiri
Variability among raters in scoring, and rater bias, are typically addressed through rater training. However, questions remain about whether training can affect raters' severity or leniency. Furthermore, few studies have looked at the differences between trained and untrained raters in oral assessment. In this study, the oral test scores of 200 test takers, rated by 20 raters, were analyzed before and after a training program using multifaceted Rasch measurement (MFRM). The results demonstrated the constructive impact of training programs in reducing raters' biases and increasing their consistency measures. Inexperienced raters benefited more from the training program than experienced raters and thus achieved higher measures of consistency afterward. The analysis also showed more biased interactions for test takers at the extreme ends of the oral ability continuum. The findings demonstrated that it is almost impossible to completely eradicate rater variability, even through rater training. Therefore, rater training should be viewed as a procedure to establish within-rater consistency rather than between-rater consistency. Since this study showed that inexperienced raters can rate even more reliably than experienced ones after training, there is no basis on which decision makers can exclude inexperienced raters solely because of their lack of experience. Instead, decision makers should use their budgets to establish rater training programs for inexperienced raters.
Keywords: bias, feedback, interrater reliability, intra-rater reliability, multifaceted Rasch measurement (MFRM), rater severity/leniency, rater training