E-assessment - Film studies
Words Steve Smethurst Illustration Marina Caruso
New research indicates that the marking quality from electronic marking is as good as, if not better than, the present route
Electronic marking has had some bad press of late. It's been linked with sweatshop conditions as "traffic lights" warn supervisors of slow productivity, there have been claims of software glitches preventing examiners from going back to correct an error and, according to one disgruntled teacher on the TES web forum, it means "no more taking the scripts away with me on holiday and marking them on a sunny balcony".
So it's a change to hear some good news: analysis by DRS and the National Foundation for Educational Research (NFER) has shown that the accuracy of electronic marking is "very high".
Graham Hudson, national business development manager for education for DRS in the UK, has always been conscious that there are areas of examining where electronic marking would improve the quality of marking.
"I'm not saying that, in a national context, marking quality is poor," he stresses. "It's just that there are ways to reduce the variability in marking quality that does occur, because it is a judgmental process."
A former Qualifications and Curriculum Authority employee, Hudson ran the external marking and data collection of the Key Stages 2 and 3 tests in England and established a government-funded programme for implementing new technologies in exams and assessments.
He says: "I became convinced that one way to reduce variability was electronic marking. It stems from the fact that examiners don't always add marks up correctly, for example, resulting in clerical errors."
Since joining DRS, he has helped to put in place the electronic mark capture and marking of tests for a number of awarding bodies in the UK and internationally, with DRS e-Marker systems capturing 80 million marks for individual questions last year.
DRS worked with AQA on the 2007 summer series and, using its e-Marker CMI+ system, marked 1.8 million scripts electronically - a process that involved around 2,500 markers across 105 subjects at GCSE. The subjects did not include English or History, owing to the length of answers, although markers did input the marks on a computer to speed up mark submission.
Larger scale
The purpose of the research, based on 2006 data, was to provide empirical information to support the view that marking reliability is improved by electronic marking. Although some small-scale research had been done before, there wasn't anything on a larger scale that was available publicly.
"What the research shows," says Hudson, "is that the degree of 'exact' agreement between markers and the standard items is extremely high. One of our tables shows that it's 98 per cent in one subject. Even in a subject with larger mark ranges, it's still 79 per cent. And if you move out of 'exact agreement with standard marks', into 'one mark difference either side', it goes up further. The degree of agreement is very high - so using this approach is therefore very accurate."
Hudson says the questions chosen for electronic marking were carefully constructed ones that lent themselves to the process. "This brings with it the benefits of focused attention of markers or automated mechanisms on certain items. It reduces marking variability as markers are focusing on one question at a time and are well versed in it," he says.
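The two agreement figures Hudson quotes, "exact" and "one mark difference either side", can be illustrated with a short calculation. What follows is a minimal sketch in Python, not the method used in the DRS and NFER analysis: the function name and the sample marks are hypothetical, and it simply assumes that each marker's mark for an item is compared with that item's standard mark.

```python
from typing import Sequence, Tuple


def agreement_rates(awarded: Sequence[int], standard: Sequence[int]) -> Tuple[float, float]:
    """Return (exact agreement, agreement within one mark) as proportions.

    `awarded` holds the marks a marker gave; `standard` holds the agreed
    standard marks for the same items, in the same order.
    """
    if len(awarded) != len(standard):
        raise ValueError("awarded and standard must be the same length")
    if not awarded:
        raise ValueError("at least one item is required")

    # Absolute difference between each awarded mark and its standard mark.
    diffs = [abs(a - s) for a, s in zip(awarded, standard)]
    exact = sum(d == 0 for d in diffs) / len(diffs)
    within_one = sum(d <= 1 for d in diffs) / len(diffs)
    return exact, within_one


# Hypothetical marks for five items, for illustration only.
awarded = [2, 3, 1, 4, 0]
standard = [2, 3, 2, 4, 2]
exact, within_one = agreement_rates(awarded, standard)
print(f"exact: {exact:.0%}, within one mark: {within_one:.0%}")
```

Because every exact match also counts as being within one mark, the second figure can never be lower than the first, which is why agreement "goes up further" once a one-mark tolerance is allowed.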
Another benefit of marking electronically is real-time monitoring of marking quality. Hudson says that one way to do this is to include "seed" items, which carry known marks, so that each marker's marks can be compared with them. The same check exists in conventional marking, but there it requires paper scripts to be passed between a marker and their supervisor, which takes more time and means the sampling is less frequent.
"The markers know that it's happening, they just don't know when," he explains. "It also means we can spot markers whose quality on a specific question is not up to scratch. They might be perfectly fine at all the other questions, so we can let them carry on with those."
Remaining variability
The research paper, Is electronic marking just about efficiency?, co-written by Hudson with Barbara Donahue, Simon Rutt and Ian Schagen, attempts to look at what might cause the remaining variability. However, with such a high level of agreement, it's looking at a very small area.
"The upshot of our studies," says Hudson, "is that the effects of markers - the time of day they mark, for example - don't have a great bearing on marking quality."
He says that the one area he wants to look at further is how seeds are compiled and used. "That's not to say they're compiled badly - there is already a high degree of agreement. It's just the area we want to do some work on. We're talking about fractions of mark difference."
Hudson insists that the research should provide some badly needed confidence that the marking quality from electronic marking is as good as, if not better than, the present route. It also provides some indicators for where to look next, to improve it further.
"Where I see this going," he says, "is into the territory of longer answers. However, the boards won't go down that route unless they are certain that marking quality is as good as, if not better than, the present route.
"The power of electronic marking is that it allows us not only to quantify the accuracy of the marking system, but also to collect detailed information that can be analysed in such a way as to provide clues for improving reliability even further. The work in this paper is a first step towards this goal."
Drawing out generalisations from the research is quite difficult, insists Hudson, but what he can say with certainty is that marking accuracy is very high and that there are areas that can be improved upon in terms of how items are seeded. So the outlook for electronic marking is a little sunnier after all.
About the author
Steve Smethurst is a freelance journalist. He writes regularly for the Times and is managing editor of Leadership Focus, the magazine of the National Association of Head Teachers.
Evidence of accuracy
Further encouragement for the electronic marking community comes from Dr Michelle Meadows, principal research manager in the research and policy analysis department at AQA.
Research by Rachel Taylor, a research assistant in the same department, shows that although enquiries about results continue to rise, even with electronic marking, they lead to fewer mark changes. The study therefore indicates that electronic marking appears to be associated with an improvement in reliability.
Between June 2005 and June 2006, AQA went from having 27 GCSE exams marked electronically to 83.
Currently, 18 per cent of AQA examination papers are marked using CMI+ across all qualification types, although electronic marking is mostly restricted to answers that take up less than half a side of A4.
"Of course, these results are what we would expect," says Meadows, "but there is very little empirical evidence in this area, which is why Rachel's work is of interest.
"If you take the components that were paper-based in 2005 and electronically marked in 2006, there were fewer changes to marks done with our electronic marking tool. It's a small, but significant, change. And as it is quite rare to get any empirical evidence to show that electronic marking does improve reliability, it's a genuinely good news story."