Our exam system is hopeless and teachers are wrong
Two major points follow. First, nothing about this should surprise us – teachers are often wrong about their pupils and biased to believe those pupils are better than they are. Second, the exam system is hopelessly ill-adapted to this crisis and could have done better.
On the first point the Royal Statistical Society has aggregated concerns with the approach used by OFQUAL to moderate results. In brief, the regulator expected results to rise by 2% on the previous year, noted that teacher assessments yielded a 12% rise, and moderated downwards to reflect this bias.
They have done so generically, meaning that schools with high year-to-year variability in outcomes may have been penalised (or rewarded) unfairly compared with more stable performers. There is some evidence this affects middle-tier schools, mostly state comprehensives, more than private schools. That is because there is less variability at the extremes of performance, and more testing, both of which mean better predictions by teachers.
Underlying the concern with the methodology more generally is the evidence that teachers are simply bad at guessing results. The study OFQUAL used to claim otherwise does not support their contention. It does suggest a 0.76 to 0.85 correlation between predictions and outcomes, but the evidence is partial, based on six mostly data-driven subjects like maths, where answers are generally right or wrong. Earlier studies cited in the same paper, covering a broader spectrum of subjects, note a range of 0.45 to 0.82, the lower end of which leaves roughly four-fifths of the variation in outcomes unexplained.
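To put those correlations on a more intuitive footing, here is a minimal sketch in Python that squares each one. It assumes the standard reading of a squared correlation as the share of variation in outcomes that a straight-line fit on predictions accounts for. The labels and function names are illustrative only and do not come from OFQUAL or the studies cited.

```python
# Correlations quoted in the text above. The dictionary labels are
# illustrative names, not terms used by OFQUAL or the cited studies.
QUOTED_CORRELATIONS = {
    "OFQUAL study, low": 0.76,
    "OFQUAL study, high": 0.85,
    "earlier studies, low": 0.45,
    "earlier studies, high": 0.82,
}


def shared_variance(r: float) -> float:
    """Return r squared: the share of variance in outcomes that a
    linear fit on predictions accounts for."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


def summarise(correlations: dict) -> dict:
    """Map each label to its squared correlation, rounded to 2 places."""
    return {
        label: round(shared_variance(r), 2)
        for label, r in correlations.items()
    }


if __name__ == "__main__":
    for label, share in summarise(QUOTED_CORRELATIONS).items():
        print(f"{label}: r^2 = {share:.2f}")
```

On this reading a correlation of 0.45 corresponds to a shared variance of about 0.20, while 0.85 corresponds to about 0.72. This is only one way of interpreting a correlation, and it says nothing about the direction of the errors.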
That grades are being overstated (as opposed to the errors falling evenly above and below the mean) probably has psychological rather than nefarious causes. Teachers spend a year or more building a relationship with their charges and have every reason to want the best for them – not just from familiarity and decency but because their own assessments and rewards are based on demonstrating added value.
In that regard, there are potential penalties attached to under-estimating performance and potential gains to be made from over-estimating. Further, there are no penalties attached to being caught in overstatement. No teacher or head teacher is going to be fired for a decision by OFQUAL to reduce their grades. Some might be for being too honest or for uncorrected pessimism.
Nor will they feel under pressure. Governing bodies and teaching unions will see fault with the regulator, not their employees and members. The public will sympathise with the teachers, with whom many have a relationship, not the faceless bureaucrats apparently responsible for the desolation of hope in their children, who are no longer leaping for joy on the front of every newspaper.
OFQUAL may be wrong too, though in being too generous rather than too harsh. If exams are objective assessments of ability, which is contestable but a reasonable assumption, why are they expecting a 2% rise in performance during a pandemic?
Several weeks of schooling at the most crucial time for exams have been lost to lockdowns. They have been replaced by variable home-schooling provision, something we assume is second best to the real thing. Most other major indicators of performance in the economy are falling. There are concerns about the psychological impact of isolation on impressionable young minds. We are social creatures and much of our learning is driven by peer-to-peer interaction, not just teacher to pupil. From that perspective a 2% rise sounds both heroic and deeply implausible.
But whether there’s a fall or rise may not matter very much. The fundamental purpose of academic exams is signalling, and generally within a single cohort at a single moment in time, not between years. They are not objective qualifications so much as evidence you know stuff and can think. Grades measure how much you know relative to those taking the papers at the same time, not objective competence. By the time you reach your 20s no employer cares much about how you did in your GCSEs, only what you can do. They seek evidence for this from a range of sources, not just your exam results, which end up forming one or two lines on a CV filled with achievement.
A qualification, for example a certificate in electrical safety, conversely denotes a degree of mastery: that you are competent to rewire a fuse-board – vital if you want to be an electrician and prove it to a potential customer. An A-level in maths is principally useful if you wish to do further study in areas that require higher-level numeracy and theory. It is not so much a qualification in itself as a useful and sometimes essential underpinning to a wide range of further study, qualifications and professions.
It is then principally a signal and a very important one for those going on to university, which is generally a moment-in-time decision. Some of those making that decision now will have their access challenged by OFQUAL’s downgrade putting them below their offer grades.
On this, pupils should be partially reassured that offer thresholds are likely to fall as a result of two other pandemic trends: deferrals and a fall in applications from foreign students. Universities need bodies to stay open and many are in dire trouble. The downgrade then may have no impact whatsoever on the process for which A-levels are the primary signal.
But some will still feel cheated, and it is not clear this can be finessed by better statistical modelling. If it were ever possible to accurately assess individual competence by design there would be no exams. A prediction model would be massively cheaper than a nationwide programme of formulation, testing, supervision, marking, appeals and re-sits. And such ideas are common in dystopian science fiction – the population being sorted at birth (or at some age of maturity) into leadership roles for the brainiacs, craft shops, warfare and mines for the rest.
Happily you can escape the circumstances of your birth in a free society and exams help signal your intent to try. All we can currently predict is which traits and circumstances are more likely to influence your path in life, not what that path will be.
OFQUAL’s approach then is a (very) second-best solution to the cancellation of the exams that would otherwise have been set.
But was this the only solution? Certainly it was not possible to run the same exam system as in normal times. The schools were not fully open and any teacher or pupil with a long commute could not attend. Certainly the lockdown happened very late and was largely unexpected.
But what of online testing?
Most of us are bombarded with requests to complete web-surveys daily. Many of the tools to do this are free, or extremely cheap. It is very easy to set up surveys to collect both quantitative and qualitative data, with instant results for the former, and no more work to collate the latter than paper surveys (perhaps less, given the removal of the handwriting barrier and semantic analysis tools that mirror key-phrase scoring by conventional markers).
It is not a perfect substitute – there are still digital access issues for many pupils, particularly from poorer backgrounds, whether cost, connections or reliability. There are supervision issues: how do you know it is actually the student if you can’t see them, and how do you address claims of losing signal and demands for more time? But for each of these there are also solutions that reduce the risk of cheating, and no system can be cheat-free.
Further, perfection was not required. What was required was a better approach to assessing this year’s unfortunate school leavers than expert guessing, and it may still be possible with a bit of imagination. For example, pupils contesting their grades could be invited to sit a shorter sample test while connected to a cheap webcam showing their presence. That would be a partial examination approach, to offset guesswork downgraded by modelling. It would not be as good a test as a full exam, but it would be a stronger signal of competence than a moderated guess.
Done widely, it would also provide evidence on whether there has been any difference in performance between the lucky 2% of children whose parents work in the NHS, which meant they continued to attend school throughout the lockdown, and those working from home. It would draw out differences between those schools offering comprehensive home schooling and those that left parents to fend for themselves.
By not doing anything objective, an enormous and rare opportunity to test the value of school and home schooling has also been wasted – which is a double disappointment if, as is currently being debated, a second wave means this happens again.
There exists then the possibility to offer something between the no-hope appeals process being offered now and full re-sits, if not nationwide, then at the very least in trial form such that this unsatisfactory state of affairs is never repeated.