The reliability of composition assessment based on an analysis of two independent raters' judgments
Abstract
The purpose of the present study is to examine the operation of our scoring system, which was developed to assess the quality of schoolchildren’s texts, and to analyse two independent raters’ evaluations. Rater performance was analysed using classical test theory and item response theory. The sample included 429 Hungarian children in Year 8. They were asked to produce a narrative text, and their compositions were scored by two independent raters using our scoring system, which comprises one holistic criterion and nine analytic ones (content, genre, tone, organization and structure, style, readability, lexical and grammatical conventions, spelling and orthographic conventions, handwriting and neatness). The analyses showed high reliability (Cronbach’s α = 0.95) for both raters. There are strong, significant correlations between the two sets of ratings (r = .85–.93, p < .01) and only a small difference (0.56 logits) between the two raters’ severity parameters. The partial credit model analyses, however, revealed that the raters used the scales differently: the scores given by the first rater show poor model fit on most scales, and the δ parameters of the scales’ characteristic curves likewise indicate divergent scale use. The results call attention to the problem of defining scales for written composition scoring systems: even a precise definition of the scales could not guarantee objective and consistent assessment, as the two raters still interpreted them differently. The findings indicate the need to re-examine the scoring system and to provide training for raters.
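The two headline statistics in the abstract, Cronbach's alpha for internal consistency and Pearson's r for inter-rater agreement, can be illustrated with a short sketch. The data below are hypothetical (four compositions, three analytic criteria per rater); the functions follow the standard textbook formulas, not the study's actual analysis pipeline.

```python
import statistics as st

def cronbach_alpha(item_columns):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).
    item_columns is a list of columns, one per scoring criterion."""
    k = len(item_columns)
    item_vars = [st.variance(col) for col in item_columns]
    totals = [sum(scores) for scores in zip(*item_columns)]
    return k / (k - 1) * (1 - sum(item_vars) / st.variance(totals))

def pearson_r(x, y):
    """Pearson correlation between two raters' total scores."""
    mx, my = st.mean(x), st.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores: each inner list is one criterion scored across 4 texts.
rater1_items = [[3, 4, 2, 5], [2, 4, 3, 5], [3, 5, 2, 4]]
rater1_totals = [sum(t) for t in zip(*rater1_items)]   # [8, 13, 7, 14]
rater2_totals = [9, 12, 8, 13]                          # second rater, same texts

alpha = cronbach_alpha(rater1_items)
r = pearson_r(rater1_totals, rater2_totals)
```

With consistent toy data like this, alpha lands near the high values reported in the study, while r shows the strong between-rater correlation; note that high agreement on totals can still coexist with the differing scale use that the partial credit model uncovers.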