Pass or Fail

The debate over how to overhaul the GCSE examination has produced a marvellously large number of new proposals. The outline seems clear enough: harder tests sat at the end of the course, featuring longer essay questions, and the abolition of, or at least a reduction in, coursework. The strangest feature of the debate is that far more pixels have been spilled (wasted? blackened? addressed?) upon the nature of the examinations than upon the curriculum (the skills and knowledge to be taught before the exam is taken) or the teaching methods to be used. History is an exception: a lively and entertainingly ill-tempered argument over how and what to teach is flourishing (http://www.bbc.co.uk/news/education-21600298). This may partly be a function of the nature of the historical community, which is characterised by healthy and impassioned debate. It is curious, however, that the measurement of progress has become of greater significance than the progress itself.

The uncharitable may say that it is easier to reform and discuss examinations than it is to raise educational standards by improving lessons, thinking about what should be taught to different groups of students or trying to improve students’ motivation. A more likely reason is the large number of purposes the examination is required to serve. GCSE grades must test students’ skills and knowledge, provide them with feedback on which subjects to take for A level, allow them to be identified as high flyers by prestigious universities, provide an accountability measure for schools and teachers, and allow measurement of the success or failure of educational reform. Compare this with the driving test which is, at least in assessment terms, much simpler. There is general agreement upon the driving curriculum (which skills should be mastered and which facts memorised), so the assessment problem reduces to deciding what the pass standard should be and then maintaining this standard over time and across many test centres and examiners. No grades are needed to compare candidates and there is little evidence of pass rates being used to compile league tables of different driving schools. A failure tends to be seen as evidence of the candidate’s limitations rather than the school’s, and the test can be retaken as often as necessary. No annual accusations of dumbing down appear in the media if the number of people passing shows an increase on the previous year, and there seems little interest in comparing driving standards or pass rates with those of other countries beyond the ritual parading of stereotypes of different European drivers.

A particularly intriguing proposition at present is the notion that setting a harder examination will, of itself, raise educational standards. The old adage ‘weighing a pig doesn’t make it any heavier’ springs to mind. Will a harder exam cause students and teachers to apply themselves with greater industry and innovation, allowing the students to complete their education with a greater level of skills and knowledge? Or would such an approach further demotivate the students who already find study at this level challenging?

The use of examination results to assess the effectiveness of schools is now routine, although still contentious, with a large number of increasingly arcane statistical measures being deployed to measure students’ progress and to try to take account of their varying abilities and social backgrounds. Interestingly, a careful study of GCSE performances in Wales, where league tables were abolished in 2001, has revealed that their removal reduced students’ examination grades (http://www.bristol.ac.uk/cmpo/publications/bulletin/summer11/burgess.pdf). This may not mean that league tables ensure a better education, but they certainly seem to boost grades, an unfortunate truth for Welsh students seeking admission to higher education. I have no particular objection to league tables per se, especially if Birkdale comes top, although the way in which the varied opportunities, characters and achievements, and the rich complexities of the life stories, of a whole year-group of students are reduced to one stark, average statistic seems a bit dehumanising. In any case, the standard measurement criteria are largely irrelevant to Birkdale; virtually all of our students gain C grades or above in all subjects, so the efforts of staff and students to gain A* rather than A grades, for example, go unmeasured and unnoticed. Of course, a change in the criteria will produce corresponding changes in the behaviour of schools and teachers; the trick is to ensure that these changes actually improve the standard of education rather than some aspect of examination preparation or performance.

How precisely should we assess the students? Cambridge Assessment recently suggested that scores should be reported on a 900 point scale instead of using grades of A* to G. This would allow candidates to be compared more carefully and prevent two candidates with similar marks from ending up on different sides of a grade boundary. It might also prevent some schools from concentrating their efforts upon students close to the C/D borderline and instead provide incentives for them to focus on raising achievement for students across the ability range. Schools could be measured upon some sort of average score rather than upon the percentage of students gaining at least five grades of C or above. However, with many head teachers sceptical that students are always placed in the correct grade, measuring achievement on such a fine scale seems beyond the examination boards, particularly given the recent debacles over English GCSE grading (http://www.bbc.co.uk/news/education-22397739). A variation of 1 part in 900 is surely less than the natural variability of students over time; if the same student took two similar exams on successive days it would be surprising if they achieved the same score out of 900, meaning that there is little point to such a granular measurement. Michael Gove has recently suggested that GCSE grades will be replaced with the numbers 1 to 10, with 1 and 2 roughly corresponding to a current A* and 3 and 4 to an A. Given that the new exams are likely to be harder, this may be a sensible way of preventing direct comparison between the last year of the current letter grade system, to be awarded in 2016, and the new number grades, first awarded in 2017. The prospect of moving to a driving test model with pass or fail as the only outcomes seems remote.
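
For the numerically inclined, the point about granularity can be sketched in a few lines of Python. This is an illustration under invented assumptions and nothing more: the function name same_score_rate, the "true" score of 600 and the day-to-day wobble of 15 marks (noise_sd) are hypothetical figures chosen for the example, not measurements from any examination board.

```python
import random


def same_score_rate(true_score=600, noise_sd=15.0, trials=10_000, seed=0):
    """Estimate how often one student gets an identical mark out of 900
    on two sittings of similar exams.

    All default values are hypothetical and chosen purely for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    def sit_exam():
        # Observed mark = underlying ability + random day-to-day variation,
        # rounded to a whole mark and kept within the 0..900 scale.
        raw = round(rng.gauss(true_score, noise_sd))
        return max(0, min(900, raw))

    # Count the trials in which both sittings give exactly the same mark.
    matches = sum(sit_exam() == sit_exam() for _ in range(trials))
    return matches / trials


if __name__ == "__main__":
    print(f"Identical marks on both days: {same_score_rate():.1%} of trials")
```

With any appreciable day-to-day variation the two marks rarely coincide, which is the sense in which a 900 point scale promises more precision than the measurement can deliver. The exact proportion depends entirely on the assumed size of the wobble.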

Two things are clear. Firstly, the focus on assessment will not pass quickly, given the importance of the examinations. Secondly, speaking as a physicist, I am reminded of the observer effect from Quantum Physics, which is often loosely attributed to the Heisenberg Uncertainty Principle: any attempt to measure something inevitably changes what you are trying to measure.

