Sheffield Assessment Instrument for Letters (SAIL)
Letters to the Editor
Facing the challenges of competency-based assessment of postgraduate dental training
Editor – I read with interest the recent
paper on longitudinal evaluation of per-
formance in competency-based
assessment.1 To develop and introduce a
valid and reliable system of assessment,
which accurately measures the all-round
competence of trainees, is indeed a
daunting task and the authors have
obviously put an enormous amount of
effort into producing their evaluation
form. The emphasis on increasing for-
mative and minimizing summative
assessment is certainly to be applauded.
I am, however, concerned that such
assessments are being developed with-
out having preset criteria against which
judgements of performance are made.
How can these judgements be reliable
and objective without clear criteria? To
imply that the reliability (and validity)
of such assessments is improved if a
large number of assessments are car-
ried out is dubious. Reputations are
easily made and hard to change and,
once an opinion is formed on a train-
ee, word spreads quickly around a
department. Assessments are thus
easily influenced, consciously or sub-
consciously, and any errors compoun-
ded. The ‘halo’ effect (and the oppo-
sing ‘horns’ effect) are best countered
by establishing objective criteria in
advance2 – not by the reinforcement of
subjective judgement.
I do accept that criteria for such wide-
ranging assessments may have to be
broad for reasons of feasibility, but
please let us not do away with them
altogether. If we do, we are in danger of
spending an enormous amount of time
and effort on assessments that are no
more reliable than the intuitive judge-
ments made in the past.
A W Evans
London, UK
References
1 Prescott LE, Norcini JJ, McKinlay P,
Rennie JS. Facing the challenges of
competency-based assessment of post-
graduate dental training. Longitudinal
evaluation of performance (LEP). Med
Educ 2002;36:92–7.
2 Fletcher S. Competence-Based Assessment
Techniques. London: Kogan Page; 2000.
Students benefit from experience of hospitalization
Editor – We all learn in a variety of
ways. Although textbooks, lectures and
discussion groups are important con-
stituents of medical student learning, we
wouldn’t want anyone performing sur-
gery based only on a written description
of the procedure. Similarly, true
empathy towards patients involves more
than acquired knowledge and skills; it
also requires understanding something
about what patients experience.
Thus, unlike Professor Downie, we
believe there can be great value in hav-
ing students go through the hospital-
ization experience we describe in this
issue – despite the fact that it cannot
precisely replicate the exact circum-
stances faced by patients with acute
and/or chronic disease. Students who
participated in this project were able to
personally experience many of the
dehumanizing aspects of being in a
hospital, from wearing a flimsy hospital
gown and having to undress in front of
others, to being treated as an object
rather than as a human being.
Professor Downie also feels this edu-
cational project was unethical, for three
reasons. Firstly, he worries that it could
have resulted in harm to the students.
As we pointed out, the safeguards we
instituted meant that there was only
minimal risk. Although the institution of
safeguards itself creates additional differ-
ences between the students’ experience
and that which ‘real patients’ undergo,
it does not eliminate many important
aspects of the experience of hospital-
ization, and is a reasonable and indeed
necessary compromise. Furthermore, of
the many student volunteers who gave
fully informed consent, only a few were
able to participate in the project.
Correspondence: AW Evans, Department of
Oral & Maxillofacial Surgery, Eastman
Dental Institute for Oral Health Care
Sciences, 256 Grays Inn Road, London
WC1X 8LD, UK. E-mail: a.evans@eastman.ucl.ac.uk
Correspondence: Dr Michael Wilkes, 39630
Larkspur Place, Davis, California 95616,
USA. E-mail: [email protected]
© Blackwell Science Ltd MEDICAL EDUCATION 2002;36:586–590
Secondly, Professor Downie worries
about inappropriate use of resources.
No patients were denied care because a
handful of students took up otherwise
empty beds at a time when we knew our
hospital would not be full. The
increased demand on the time of phy-
sicians and nurses was minimal, and the
financial cost of the exercise was trivial.
As we note in the manuscript, the cost
of the exercise was dwarfed by the
amount spent on other medical educa-
tional activities that, we would argue,
are of far less value. We would never
advocate taking up hospital beds with
healthy students if those beds were
needed for real patients. In the UK or
elsewhere this concern might mean that
an alternative experience would need to
be provided. However, this was cer-
tainly not the case at our teaching hos-
pital.
We agree with Professor Downie’s
comment that ‘one of the saddest parts
of the experience’ related to one of the
students being turned away for lack of
health insurance. We are vocal critics of
this aspect of American health care.
Although this aspect of the experience
was a powerful and educational one for
the student involved, it does not miti-
gate the ethical shortcomings of a health
care system that treats people unequally
based on their ability to pay. However,
this unconscionable aspect of American
medicine was not the subject of our
paper.
Finally, Professor Downie raises
concerns about the intrinsic deception
involved. He opines that the only
justification for deception might be
‘important research’. We do not
understand this concept. Deception in
and of itself is of course undesirable,
but we believe that most ethical issues
are complex, and, typically, involve the
balancing of competing values. The
benefits of a project like this – whether
it is carried out for research, for edu-
cation, or for some other purpose – can
outweigh the harm associated with this
degree of deception. It goes without
saying that the project could not have
been accomplished in any meaningful
sense had the caregivers known what
was taking place. Given the specific
nature of the project, we are comfort-
able that its educational value, and
potential for positively changing future
behaviour, justified the degree and type
of deception involved. (Frankly, we are
astonished by the assertion that our
willingness to conduct this project
proves that we ‘lack humane qualit-
ies’.)
Professor Downie concludes that we
have addressed the wrong issue because
‘what is needed is to give permission for
the deployment of humane qualities that
students already possess’. He believes
this might be best accomplished by
offering courses in the humanities and
‘encouraging a broader perspective on
life’. We have no objection to this pro-
posal and believe it reflects one positive
aspect of American medical education,
in that our medical students have first
obtained an undergraduate degree,
where they are far more likely than their
UK counterparts to have been exposed
to broad perspectives.
Moreover, we believe the experiences
our students had during this exercise
confirm our prior observations that,
despite the many other ways in which
we attempt to introduce humanism into
the curriculum at our medical school, a
great deal more needs to be done. We
do not discourage the use of other tools
as well, but suggest that this particular
tool can add greatly to students’
understanding, not only of the patient’s
experience of hospitalization, but also of
the critical importance of sensitivity and
humanism, or their absence, on the part
of physicians.
Michael Wilkes
J Hoffman
Davis, California, USA
Training of Doctors project
Editor – I read with interest the valuable
discussion paper by Bleakley1 in your
Journal. I have, however, a number of
comments concerning our group’s
quoted work.2,3,4
The Training of Doctors project was
a funded research and development
project, which was multidisciplinary in
nature and practical in approach. As an
action research developmental project,
it aimed to illuminate important ques-
tions and themes in an under-studied
area, rather than to primarily develop
new theory. The mixture of doctors,
educationalists and a trained anthro-
pologist facilitated this approach. The
emphasis was on producing outcomes
and possible solutions which would
enhance the educational experience or
‘training’ of junior doctors, not just
pre-registration house officers. These
outcomes were evaluated and found to
work in a variety of hospital settings and
departments.
Using a language that would allow
us to communicate with doctors was
essential, and a psychological turn of
phrase was almost inevitable as this
vernacular is predominant in the
medical paradigm. In addition, this
psychological approach facilitated ac-
cess to those well-described aspects of
the junior doctor’s life that concern
stress and emotion. This does not
mean that the authors do not believe
in the social constructivist worldview,
socialization or the recent seminal
work on communities of practice.5,6
But a practice-based ‘how to do it on
the shop floor’ approach was what
appeared to be most valued by train-
ees.7 This may be because being a
doctor is a practical job. This situated
approach then led to further theory
development, particularly of commu-
nities of practice, which extended our
preliminary work.7
There is undoubtedly more ‘cultural
complexity’ to be unraveled, but the
actual educational value of many junior
doctors’ work experience and their
knowledge of optimal learning strategies
is still often only moderate, and so more
sophisticated approaches may well be
helpful in the future. For example, a
critical theory approach may have been
useful, but access to some institutions
was sometimes vulnerable where sensi-
tivities and confidentiality could easily
be disturbed, so some theoretical com-
promise was necessary in order to
maintain access. Similarly, although the
statement that ‘there is no generic
pedagogic formula’ is probably true,
particularly in the present unbounded,
contested and hybrid postmodern
world, some straightforward starting
point was pragmatically necessary for
the project. If, as with Bleakley’s paper,
this work also stimulates a discourse, it
can only help junior doctors and their
training in the future.
S J Ward
London, UK
Correspondence: Dr S J Ward, 32 Dovercourt
Road, London SE22 8ST, UK. E-mail:
References
1 Bleakley A. Pre-registration house
officers and ward-based learning: a ‘new
apprenticeship’ model. Med Educ
2002;36:9–15.
2 Hargreaves DH, Bowditch MG, Griffin
DR. On-the-Job Training for Surgeons: a
Practical Guide. Edinburgh: The Royal
Society of Medicine Press 1997.
3 Hargreaves DH, Southworth GW,
Stanley P, Ward SJ. On-the-Job Training
for Physicians: a Practical Guide. Edin-
burgh: The Royal Society of Medicine
Press 1997.
4 Stanley P. Structuring ward rounds for
learning: can opportunities be created?
Med Educ 1998;32:239–43.
5 Lave J, Wenger E. Situated Learning:
Legitimate Peripheral Participation. Cam-
bridge: Cambridge University Press 1991.
6 Wenger E. Communities of Practice:
Learning, Meaning and Identity. Cam-
bridge: Cambridge University Press 1998.
7 Ward SJ. Enhancing the capacity of
junior doctors’ training. Unpublished
Thesis. Cambridge School of Education
1997.
Sheffield Assessment Instrument for Letters (SAIL)
Editor – We were interested and
impressed by the simplicity of the
Sheffield Assessment Instrument for
Letters (SAIL).1 Specialist Registrars
(SpRs) in Paediatrics in the South-west
Region are recommended to include
example anonymous clinic letters and
discharge summaries within their port-
folios. These are then used to inform the
Record of In Training Assessment
(RITA). Our experience is that due to
time pressures within the allocated 1
hour for the RITA interview, the
opportunity for more than a superficial
review is severely limited. Thus, a
validated objective scoring system
would seem to be the ideal way forward.
However, we were surprised that the
authors felt that SAIL was ‘highly feas-
ible to carry out’ as the time incurred by
the ‘judges’ would be enormous if this
scoring system was to be conducted for
each and every SpR. As an example,
using the mathematics from within the
paper, for a reliability coefficient of 0.80,
6 judges would need to score 10 letters
from each SpR. In the South-west
region there are more than 70 SpRs,
resulting in 700 clinic letters to be ana-
lysed. If each letter took the 6 minutes,
as suggested by the authors, then this
would amount to 70 hours for each of
the 6 judges. We are not sure that this is
a productive use of time for Regional
Advisors or RITA assessors.
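The workload estimate in this letter can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `judge_hours` is hypothetical, and it takes 70 SpRs as the cohort size although the letter says "more than 70". The other figures (10 letters per SpR, 6 minutes per letter, 6 judges) are those quoted above.

```python
def judge_hours(n_trainees, letters_per_trainee, minutes_per_letter):
    """Hours a single judge needs to score every sampled letter."""
    total_letters = n_trainees * letters_per_trainee
    return total_letters * minutes_per_letter / 60


# Figures quoted in the letter; 70 is a lower bound ("more than 70 SpRs").
N_SPRS = 70
LETTERS_PER_SPR = 10
MINUTES_PER_LETTER = 6
N_JUDGES = 6

total_letters = N_SPRS * LETTERS_PER_SPR
hours_per_judge = judge_hours(N_SPRS, LETTERS_PER_SPR, MINUTES_PER_LETTER)

print(f"letters to score: {total_letters}")
print(f"hours per judge: {hours_per_judge}")
print(f"hours across all {N_JUDGES} judges: {N_JUDGES * hours_per_judge}")
```

Under these assumptions the figures of 700 letters and 70 hours per judge are reproduced. The total across all six judges is a derived number, not one stated in the letter.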
Perhaps the SAIL system would be
more practical if used as a formal com-
ponent of the assessment undertaken at
the end of the first 2 core years of
training or when assessing poor per-
formance of trainees in difficulty.
Sarah J Bridges
Huw Thomas
Bristol, UK
Reference
1 Crossley JGM, Howe A, Newble D,
Jolly B, Davies HA. Sheffield Assess-
ment Instrument for Letters (SAIL):
performance assessment using outpa-
tient letters. Med Educ 2001;35:1115–
24.
Authors’ reply
Editor – We are grateful to Dr Bridges
and Dr Thomas for their thoughtful
response to our paper. In particular it is
reassuring that others have recognized
the face validity, feasibility and at-
tractiveness of using clinic letters to
inform RITA and other assessment
processes.
They raise a very important point in
relation to the balance between reliability
and feasibility in assessing letters. Based
on our data they have calculated (cor-
rectly) that it would take the slowest
judge 70 hours to achieve a set of results
with a reliability coefficient of 0.8 on all
the SpRs in a large Higher Specialist
Training Programme at one point in
time. Even the quicker judges would still
take 35 hours. There are 3 important
points to make about this conclusion that
will illustrate the richness of generaliz-
ability data and some important princi-
ples of performance assessment. Unlike
other reliability techniques, generaliz-
ability allows modelling of reliability for a
range of assessment strategies from
which the one best suited to a given pur-
pose and circumstances can be chosen.
A reliability coefficient of 0.8 is quo-
ted because it is the accepted threshold
for high-stakes assessment such as
revalidation. There is no commonly
held threshold for in-training assess-
ment processes, but most authors agree
that validity and feedback potential are
more important in this setting and that
the threshold for reliability is much
lower.1 The data show that a threshold
of 0.7 (still better than an hour-long
MCQ2 or a 3-hour OSCE)3 would be
reached if 6 judges each marked 5 letters
or 3 judges each marked 8 letters. For a
training programme of 30 SpRs this
would take each judge 7.5–24 hours
depending upon marking speed and
how many judges took part.
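The 7.5–24 hour range can be reproduced from the figures in this reply. The sketch below rests on one assumption: the quicker judges take 3 minutes per letter. That rate is not stated directly. It is inferred from the 35 hours quoted above for 700 letters, against 6 minutes per letter for the slowest judge. The names `judge_hours`, `strategies` and `hours` are hypothetical.

```python
def judge_hours(n_trainees, letters_per_trainee, minutes_per_letter):
    """Hours one judge needs to mark the sampled letters for every trainee."""
    return n_trainees * letters_per_trainee * minutes_per_letter / 60


N_SPRS = 30        # programme size used in the reply
SLOW_MINUTES = 6   # slowest judge: 700 letters in 70 hours
FAST_MINUTES = 3   # assumption inferred from 700 letters in 35 hours

# Letters each judge marks per SpR under the two strategies said to reach 0.7.
strategies = {
    "6 judges x 5 letters": 5,
    "3 judges x 8 letters": 8,
}

hours = {
    name: (
        judge_hours(N_SPRS, letters, FAST_MINUTES),
        judge_hours(N_SPRS, letters, SLOW_MINUTES),
    )
    for name, letters in strategies.items()
}

for name, (fastest, slowest) in hours.items():
    print(f"{name}: {fastest} to {slowest} hours per judge")
```

With these inputs the lightest case is a quick judge marking 5 letters per SpR and the heaviest is a slow judge marking 8, which brackets the quoted range.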
It is rarely necessary to produce a
high-reliability result on every doctor at
regular fixed time points. The marking
that contributes to a regular assessment
process could be distributed throughout
the year. Bridges and Thomas them-
selves suggest that high-reliability, high-
investment assessment could be
reserved for specific points in training
but will have been preceded by lower
reliability, high feasibility formative
assessment to inform the development
of letter writing skills. Similarly a less
discriminating assessment strategy
could be used to ‘screen’ for borderline
trainees who would then be subjected to
a more rigorous and resource expensive
strategy before any definitive decision
about their subsequent progress was
made. We have developed a simple
global rating scale for this purpose that
correlates well with SAIL but has a
slightly lower reliability. It takes only
1–2 minutes per letter but cannot pro-
duce such rich formative feedback.
Correspondence: Sarah Bridges, Paediatric
Unit, Southmead Hospital, Westbury-on-
Trym, Bristol, UK. E-mail: sarahbridges
Correspondence: Helena Davies, Consultant in
Medical Education, Sheffield Children’s
Hospital NHS Trust, Western Bank,
Sheffield S10 2TH, UK. Tel.: (44) 114 271
7108; Fax: (44) 114 271 7185; E-mail:
Bridges and Thomas are rightly
concerned that the busiest and most
expensive clinicians should not be
spending their time in lengthy assessment
procedures. We included a consultant, a
GP and a trainee in the original study
since they are the main stakeholders in
good letter writing. This has enabled us to
show that the differences between them
are not significantly related to their des-
ignation. It follows that 3 trainees acting
as judges will produce a similar result that
is equally reliable when their marking is
guided by SAIL. Whilst most programme
directors would probably feel more
comfortable with a mix of judges for
higher-stakes assessment there is no rea-
son why trainees couldn’t mark most of
the letters most of the time. This in itself
has provided valuable instruction in letter
writing for markers of every grade in our
experience.
Using the results appropriately, it is
easy to re-evaluate the reliability in any
of these circumstances to check that the
assessment tool has performed as pre-
dicted.
Helena Davies
Jim Crossley
Amanda Howe
Brian Jolly
David Newble
Sheffield, UK
References
1 van der Vleuten C. The assessment of
professional competence: developments,
research and practical implications. Adv
Health Sci Education 1996;1:41–67.
2 Norcini JJ, Swanson DB, Grosso LJ,
Webster GD. Reliability, validity and
efficiency of multiple choice question
and patient management problem item
formats in assessment of clinical com-
petence. Med Educ 1985;19:238–47.
3 Newble DI, Swanson DB. Psychometric
characteristics of the objective struc-
tured clinical examination. Med Educ
1988;22:325–34.
Holding on to the philosophy and keeping the faith
Editor – Norman1 corrects Norman’s2
attribution of an argument to do with
the existence of God, and refers this to
‘Norman’s inadequate educational pre-
paration in the liberal arts.’ However,
this correction itself needs correcting.
Moreover this reveals a further deficit in
Norman’s educational preparation – an
understanding of philosophy that is
‘inadequate’ to PBL and the article he is
writing about.3
Norman’s correction is misleading on
two counts: first he is wrong about the
argument itself, and secondly he is
mistaken about the very notion of the
attribution of arguments. It is the latter
that is revealing.
This is the argument as given: ‘if you
believe in God and there is none, you
have lost nothing, if you don’t believe
and there is one, you have lost every-
thing’. Norman attributes this first to
Galton (1822–1911) and then to Spi-
noza (1632–1677). However the argu-
ment is better known as the Wager of
Pascal (1623–1662).4
It is not important that Norman’s
correction calls this a ‘logical proof of
the existence of God’ (although it isn’t –
it’s an argument for believing in God’s
existence). Nor is it important that
Norman’s reference to an online
encyclopedia provides no evidence that
Spinoza ever used this argument. Nor is
it really so important that versions of it
can be found, no doubt, that predate
Pascal. All these are relatively trivial
points.
Norman’s real mistake, I would sug-
gest, is to look for someone to whom to
‘credit’ the argument at all. Historians of
philosophy can, and do, argue over
issues of priority and attribution – who
said what and when – just as historians
of the discovery of, say, oxygen, or
America, do. But doing the history of
philosophy is not the same thing as
doing philosophy. Philosophers, by
contrast, are interested in the arguments
themselves, for these are all any philo-
sopher has. An argument in philosophy,
in short, ‘belongs’ to whoever asserts it.
Or, to put it another way, we are all
responsible for what we think. Try
running Pascal’s Wager yourself, as
paraphrased above: why are you not
persuaded?
The reasons for this responsibility
are the same as those which motivate
PBL: Galton can’t do your thinking,
nor your teacher your learning, for
you. This, in turn, suggests that put-
ting ‘the responsibility for learning in
the hands of the learner, not the tea-
cher’ is more than simply an ‘assump-
tion’, more than an ‘unfounded belief’,
but an idea that has substantial philo-
sophical warrant.
Norman’s original commentary was
sceptical of – or at least ‘agnostic’
about – Dolmans’ paper, portraying it
as an attempt at ‘keeping the faith’ in
PBL. Norman’s correction of his
commentary suggests that there may
be good philosophical reasons for us to
be sceptical of his agnosticism, and
indeed to continue ‘holding on to the
philosophy.’
Simon Harrison
University of Bristol
References
1 Norman GR. Erratum. Med Educ
2002;36:102.
2 Norman GR. Holding on to the philos-
ophy and keeping the faith. Med Educ
2001;35:820–1.
3 Dolmans D, Wolfhagen I, van der Vle-
uten C, Wijnen W. Solving problems
with group work in problem based
learning: hold on to the philosophy. Med
Educ 2001;35:884–9.
4 Pascal B. Pensées. Translated by AJ
Krailsheimer. London: Penguin; 1963: pp.
149–52.
The assessment tool is only as good as the assessors
Editor – We read with interest the art-
icle on videoconferencing to assess
neonatal resuscitation skills1 and were
impressed by the low levels of interob-
server variability found between the
two instructors in the 18 megacodes
evaluated.
Correspondence: Simon Harrison, 2 Rodney
Place, Clifton, Bristol BS8 4HY, UK. E-mail:
Correspondence: Gavin D Perkins, Research
Fellow Intensive Care Medicine,
Birmingham Heartlands Hospital, Bordesley
Green East, Birmingham B9 5SS, UK. Tel.:
(44) 121 424 3562; Fax: (44) 121 424 1108;
E-mail: [email protected]
We have recently assessed inter-
observer variability in an adult resusci-
tation skills course (The Resuscitation
Council UK Advanced Life Support
Provider Program).2 We used video-
recorded scenarios to test a group of 25
examiners from 15 different assessment
centres in order to assess the levels of
interobserver variability. Our study
differed from the present study by
using a diverse group of examiners and
by using scenarios that were staged to
include a number of commonly ob-
served errors. Unlike Cronin’s study we
observed significant interobserver vari-
ability ranging from agreement of only
50% to a maximum of 100% for a
single scenario where the candidate
made multiple mistakes. Intraobserver
variability was also tested by showing
the instructors one of the videos twice
and found to be similarly poor (kappa
0.43). The marked difference in inter-
observer consistency between our
findings and those of Cronin et al.
suggests that extrapolation of their
findings per se to a larger group of in-
structors would not guarantee a similar
level of consistency as that demonstra-
ted in their study.
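For readers unfamiliar with the kappa statistic quoted above, the following is a minimal sketch of Cohen's kappa for one examiner who marks the same set of scenarios on two occasions. It is not the authors' analysis: the letter does not say how kappa was computed across the 25 examiners. The pass/fail lists are invented purely for illustration, and the function name `cohens_kappa` is hypothetical.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Chance-corrected agreement between two equal-length lists of labels."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)
    # Proportion of items given the same label on both occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal label frequencies.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(counts_first[k] * counts_second[k] for k in counts_first) / (n * n)
    if expected == 1:
        return 1.0  # degenerate case: only one label was ever used
    return (observed - expected) / (1 - expected)


# Hypothetical data: one examiner's verdicts on ten videos, viewed twice.
first_viewing = ["pass", "pass", "fail", "pass", "fail",
                 "fail", "pass", "pass", "fail", "pass"]
second_viewing = ["pass", "fail", "fail", "pass", "pass",
                  "fail", "pass", "pass", "fail", "fail"]

kappa = cohens_kappa(first_viewing, second_viewing)
print(f"kappa = {kappa:.2f}")
```

In this invented example the examiner repeats 7 of 10 verdicts, yet kappa is only about 0.4 because half of that agreement would be expected by chance. This is why a kappa of 0.43 can be described as poor even when raw agreement looks respectable.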
However, as Cronin identifies, the
development of this model for training
new and re-certifying instructors may
be a valuable tool to improve consis-
tency in marking and continuing edu-
cation.
Gavin D Perkins
Birmingham, UK
Michael J Tweed
Leicester, UK
References1 Cronin C, Cheang S, Hlynka D, Adair
E, Roberts S. Videoconferencing can be
used to assess neonatal resuscitation
skills. Med Educ 2001;35:1013–23.
2 Perkins GD, Hulme J, Tweed MJ.
Variability in the assessment of
advanced life support skills. Resuscitation
2001;50:281–6.