We used an app to develop a new way of finding out what examiners notice when they give grades to students in speaking tests of English. This means we can better understand how examiners give marks, which can help with training examiners.
The research investigated which specific features of candidate talk IELTS Speaking Test (IST) examiners orient to when taking scoring decisions. We also examined whether using a customised scoring scheme and app could add value to IST examiner development.
The study was enabled by the development of an IST scoring scheme for an app (VEO), which records exactly when in the test examiners notice specific features of candidate talk and take specific scoring decisions as a result. Four IELTS examiners each independently viewed two test videos in the app and scored them using the scoring tags. We then conducted individual stimulated recall interviews with the examiners involved.
We found that the use of the customised IELTS scoring scheme and VEO app illuminates the IST rating process and potentially adds significant value to IST examiner development, specifically to the re-certification process. When examiners assign higher scores, they focus on positive evidence, whereas when they assign lower scores, they focus on negative evidence. Fluency & coherence scores are mostly assigned cumulatively, while grammar scores can be more easily tagged in relation to specific features. Examiners notice idioms and reward their use with a high mark, even if they are not delivered perfectly. Examiners form hypotheses as they listen to the candidate talk, then look for evidence that will confirm or reject these hypotheses. Examiners make scoring decisions in a cumulative way, rather than orientating towards single instances. Pronunciation issues may influence examiner decisions in relation to other criteria.