EUROCALL 2014


Using the Standard Settings from A Manual for the Validation of a Language Proficiency Test on Reading and Listening

This paper focuses on applying two methods of analyzing empirical data to the standard setting part of the validation process of a language proficiency test on reading and listening. These tests are designed to determine the level of English reading and listening proficiency at general academic level (CEFR levels B1-C2). The Council of Europe Pilot Manual (2003) and A Manual (2009) provided the professional support needed to ensure good practice in the development of a high-stakes proficiency test. The format of the tests is multiple choice. In the first phase of test construction we collected data from test takers of the reading and listening proficiency test. The tests were administered both by paper and pencil and by computer.
To determine the level of the items and to improve their quality, a graphical item analysis (GIA), designed by Van Batenburg & Laros (2002), was used first. This analysis takes into account not only the correct answer but also the distractors, and gives an indication of the language proficiency level of the test item. The scores of the test takers were categorized using the cut-off scores for the language proficiency levels (B1-C2) determined in the first phase of test construction. For each question and each cut-off score, a plot was generated of the proportion of test takers choosing each alternative. The language proficiency level of each question was obtained by interpolating the point at which the proportion correct equals 0.5.
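A minimal sketch of this interpolation step, in Python; the group labels and proportions below are illustrative assumptions, not data from the study, and the actual GIA procedure of Van Batenburg & Laros (2002) may differ in detail.

    import numpy as np

    # Hypothetical proportions correct for one item, per proficiency group
    # (ordered below-B1, B1, B2, C1, C2); in the study these proportions
    # would come from the administered reading and listening tests.
    group_index = np.array([0, 1, 2, 3, 4])               # below-B1 .. C2
    p_correct = np.array([0.15, 0.35, 0.62, 0.84, 0.93])

    # GIA-style reading: the item level is the point on the proficiency
    # scale where the proportion correct crosses 0.5 (here between B1 and B2).
    item_level = np.interp(0.5, p_correct, group_index)
    print(f"interpolated item level: {item_level:.2f}")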
Item Response Theory (IRT) analysis was included to obtain a more sample-independent measure of the level of the test items. For each question the difficulty, the response probability at borderline mastery (a 50% chance of answering correctly) and the response probability at full mastery (an 80% chance of answering correctly) were calculated. In a plot, the items are ranked according to difficulty, and for each item a horizontal line is drawn from its RP50 value (borderline mastery) at the left end to its RP80 value (full mastery) at the right end.
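The abstract does not state which IRT model was fitted; the sketch below assumes a Rasch (one-parameter logistic) model, under which the RP50 and RP80 ability points follow directly from the item difficulty. The difficulty values are hypothetical.

    import numpy as np

    def ability_at_rp(b, rp):
        """Ability at which the response probability for an item with
        difficulty b equals rp, under P = 1 / (1 + exp(-(theta - b)))."""
        return b + np.log(rp / (1.0 - rp))

    difficulties = np.array([-1.2, -0.4, 0.3, 1.1])   # hypothetical difficulties
    for b in np.sort(difficulties):                   # items ranked by difficulty
        rp50 = ability_at_rp(b, 0.5)                  # borderline mastery
        rp80 = ability_at_rp(b, 0.8)                  # full mastery
        print(f"difficulty {b:+.2f}: RP50 = {rp50:+.2f}, RP80 = {rp80:+.2f}")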
In the first standard setting method mentioned in A Manual (the Item-Descriptor Matching Method), items are ranked according to IRT difficulty. The language levels of the items determined with GIA can then be converted to cut-off scores for the CEFR levels with the help of the threshold region. In the second standard setting method mentioned in A Manual (a Cito variation on the Bookmark Method), the cut-off scores for the CEFR levels can be determined by means of a vertical line in the IRT plot.
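As an illustration of the second step: once a cut point on the ability scale has been chosen (the vertical line in the IRT plot), one common way to translate it into a raw cut-off score is through the expected test score at that ability. Whether the Cito variation uses exactly this mapping, and which judgements fix the cut points, is not described in the abstract; all numbers below are hypothetical.

    import numpy as np

    def prob_correct(theta, b):
        """Rasch probability of a correct response at ability theta."""
        return 1.0 / (1.0 + np.exp(-(theta - b)))

    difficulties = np.array([-1.5, -0.8, -0.2, 0.4, 0.9, 1.6])   # hypothetical
    theta_cuts = {"B1/B2": -0.5, "B2/C1": 0.5, "C1/C2": 1.3}     # hypothetical

    for label, theta in theta_cuts.items():
        # Expected raw score at the cut point (test characteristic function).
        raw_cut = prob_correct(theta, difficulties).sum()
        print(f"{label}: cut-off raw score of about {raw_cut:.1f} "
              f"out of {len(difficulties)}")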
The cut-off scores determined by the standard setting methods are compared to the cut-off scores for the CEFR levels determined by the test creators, and are discussed in terms of their useful contribution to determining the levels of the tests according to the CEFR.

Author(s):

Yta Beetsma    
CIT / Educational Support and Innovation
University of Groningen
Netherlands

Educational measurement specialist

Engelien De Jong    
CIT / Educational Support and Innovation
University of Groningen
Netherlands

Educational measurement specialist

Estelle Meima    
Language Centre
University of Groningen
Netherlands

Angela Ashworth    
Language Centre
University of Groningen
Netherlands

 
