Background: A script concordance test (SCT) is a modality for assessing clinical reasoning. Concerns have been raised that SCT scores face a plausible validity threat if students deliberately avoid the extreme answer options to obtain higher scores. The aims of the study were, first, to investigate whether students’ avoidance of the extreme answer options could result in higher scores and, second, to determine whether a ‘balanced approach’, in which SCT items are carefully constructed to include extreme as well as median options as model responses, would improve the validity of an SCT.

Methods: Using the paired-samples t-test, the actual average student scores for 10 SCT papers from 2012–2016 were compared with simulated scores. The simulated scores were generated by recoding all ‘−2’ responses to ‘−1’ and all ‘+2’ responses to ‘+1’ for the whole cohort and for the bottom 10% of the cohort (simulation 1), and by scoring as if all students had chosen ‘0’ for every response (simulation 2). The actual and simulated average scores in 2012 (before the ‘balanced approach’) were compared with those from 2013–2016, when papers had a good balance of modal responses from the expert reference panel.
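
The two simulations can be illustrated with a minimal sketch. It assumes the commonly used aggregate scoring scheme for SCTs (credit for an option equals the number of panellists who chose it divided by the number who chose the modal option), which this abstract does not itself specify. The function names and the three-item panel data are hypothetical and are not taken from the study.

```python
from collections import Counter


def item_credits(panel_answers):
    """Assumed aggregate scoring: votes for an option / votes for the modal option."""
    counts = Counter(panel_answers)
    modal_votes = max(counts.values())
    return {option: n / modal_votes for option, n in counts.items()}


def percent_score(responses, panels):
    """Mean credit across items, expressed as a percentage."""
    total = sum(
        item_credits(panel).get(response, 0.0)
        for response, panel in zip(responses, panels)
    )
    return 100.0 * total / len(panels)


def recode_extremes(responses):
    """Simulation 1: recode -2 to -1 and +2 to +1, leaving other answers unchanged."""
    return [max(-1, min(1, r)) for r in responses]


def all_zero(responses):
    """Simulation 2: score as if '0' had been chosen on every item."""
    return [0] * len(responses)


# Hypothetical expert panel answers for three items (five panellists each).
panels = [
    [-2, -2, -2, -1, -1],  # modal response is the extreme option -2
    [0, 0, 0, 1, -1],      # modal response is the median option 0
    [2, 2, 2, 2, 1],       # modal response is the extreme option +2
]
student = [-2, 0, 2]  # hypothetical student who matches every modal response

actual = percent_score(student, panels)
simulation_1 = percent_score(recode_extremes(student), panels)
simulation_2 = percent_score(all_zero(student), panels)
print(f"actual={actual:.1f}%  sim1={simulation_1:.1f}%  sim2={simulation_2:.1f}%")
```

In this toy paper, two of the three modal responses are extreme options, so avoiding the extremes lowers the score. The actual and simulated averages across papers could then be compared with a paired-samples t-test, as described above.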

Results: In 2012, a score increase was seen in simulation 1 in the third-year cohort, from 50.2% to 55.6% (t(10) = 4.818; p = 0.001). From 2013 onwards, with the ‘balanced approach’, the actual SCT scores (57.4%) were significantly higher than the scores in both simulation 1 and simulation 2 (46.7% and 23.9% respectively).

Conclusions: When constructing SCT examinations, in addition to rigorous pre-examination optimisation, it is desirable to achieve a balance between items that attract extreme response options and those that attract median response options. This could mitigate the validity threat to SCT scores, especially for low-performing students, who have previously been shown to select only median responses and avoid the extreme responses.


Keywords: Script concordance test (SCT), Assessing clinical reasoning, Medical students, Validity

